Communications in Mathematical Physics - Volume 305

Commun. Math. Phys. 305, 1–21 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1255-4

Author: M. Aizenman (Chief Editor)



Oseledets Regularity Functions for Anosov Flows

Slobodan N. Simić

Department of Mathematics, San José State University, San José, CA 95192-0103, USA. E-mail: [email protected]

Received: 6 August 2009 / Accepted: 13 January 2011
Published online: 7 May 2011 – © Springer-Verlag 2011

Abstract: Oseledets regularity functions quantify the deviation of the growth associated with a dynamical system along its Lyapunov bundles from the corresponding uniform exponential growth. The precise degree of regularity of these functions is unknown. We show that for every invariant Lyapunov bundle of a volume preserving Anosov flow on a closed smooth Riemannian manifold, the corresponding Oseledets regularity functions are in $L^p(m)$, for some $p > 0$, where $m$ is the probability measure defined by the volume form. We prove an analogous result for essentially bounded cocycles over volume preserving Anosov flows.

1. Introduction

This paper is concerned with the so-called Oseledets regularity functions, which naturally arise from the Oseledets Multiplicative Ergodic Theorem and the Birkhoff Ergodic Theorem. We restrict our attention to flows; a similar analysis could be done for diffeomorphisms. For a smooth flow $\Phi = \{f_t\}$ on a closed Riemannian manifold $M$ and a nonzero vector $v \in T_x M$, the Lyapunov exponent of $v$ is defined by

$$\chi(v) = \lim_{|t| \to \infty} \frac{1}{t} \log \|T_x f_t(v)\|,$$

if the limit exists. Vectors $v$ with the same Lyapunov exponent $\chi$ (plus the zero vector) form a linear subspace $E^\chi(x)$ of $T_x M$, called the Lyapunov space of $\chi$. By construction, these spaces form an invariant bundle in the sense that $T_x f_t(E^\chi(x)) = E^\chi(f_t x)$, for all $t \in \mathbb{R}$. The fundamental properties of Lyapunov spaces are described by the following seminal result (originally stated for diffeomorphisms).


Oseledets Multiplicative Ergodic Theorem ([Ose68,Rue79,BP01]). Let $\Phi = \{f_t\}$ be a $C^1$ flow on a closed Riemannian manifold $M$. There exists a $\Phi$-invariant set $\mathcal{R} \subset M$ of full measure with respect to any $\Phi$-invariant Borel probability measure $\mu$, such that for every $x \in \mathcal{R}$ there exists a splitting (called the Oseledets splitting)

$$T_x M = \bigoplus_{i=1}^{\ell(x)} E_i(x),$$

and numbers $\chi_1(x) < \cdots < \chi_{\ell(x)}(x)$ with the following properties:

(a) The bundles $E_i$ are $\Phi$-invariant, $T_x f_t(E_i(x)) = E_i(f_t x)$, and depend Borel measurably on $x$.

(b) For all $v \in E_i(x) \setminus \{0\}$,

$$\lim_{|t| \to \infty} \frac{1}{t} \log \|T_x f_t(v)\| = \chi_i(x).$$

The convergence is uniform on the unit sphere in $E_i(x)$.

(c) For any $I, J \subset \{1, \ldots, \ell(x)\}$ with $I \cap J = \emptyset$, the angle function is tempered, i.e.,

$$\lim_{|t| \to \infty} \frac{1}{t} \log \angle\big(T_x f_t(E_I(x)), T_x f_t(E_J(x))\big) = 0,$$

where $E_I = \bigoplus_{i \in I} E_i$.

(d) For every $x \in \mathcal{R}$,

$$\lim_{|t| \to \infty} \frac{1}{t} \log |\det T_x f_t| = \sum_{i=1}^{\ell(x)} \chi_i(x) \dim E_i(x).$$

(e) If $\Phi$ is ergodic with respect to $\mu$, then the functions $\ell$ and $\chi_i$ are $\mu$-almost everywhere constant.

Points $x \in \mathcal{R}$ are called regular. Assume $\Phi$ is ergodic with respect to some measure $\mu$, fix $i \in \{1, \ldots, \ell\}$, and set $\chi = \chi_i$ and $E = E_i$. Denote the restriction of $T f_t$ to $E$ by $T^E f_t$. Since

$$\chi = \lim_{t \to \infty} \frac{1}{t} \log \|T_x^E f_t\|,$$

for each $x \in \mathcal{R}$, it follows that for every $\varepsilon > 0$,

$$\lim_{t \to \infty} \frac{\|T_x^E f_t\|}{e^{(\chi + \varepsilon)t}} = 0.$$

Therefore, there exists a constant $C > 0$, depending on $x$ and $\varepsilon$, such that $\|T_x^E f_t\| \le C e^{(\chi + \varepsilon)t}$, for all $t \ge 0$. It is natural to consider the best such $C$ as a function of $x$ and $\varepsilon$:

Definition 1.1. For a fixed Lyapunov bundle $E$ of $\Phi$ and every $\varepsilon > 0$, the $(E, \varepsilon)$-Oseledets regularity function $R_\varepsilon : \mathcal{R} \to \mathbb{R}$ is defined by

$$R_\varepsilon(x) = \sup_{t \ge 0} \frac{\|T_x^E f_t\|}{e^{(\chi + \varepsilon)t}}.$$


The family $\{R_\varepsilon\}_{\varepsilon > 0}$ is the main focus of this paper. It is not hard to see that each $R_\varepsilon$ is Borel measurable (see [BP01] for the case of diffeomorphisms) and that $R_\varepsilon \ge 1$. What more can be said about the $R_\varepsilon$? For instance:

Question 1. Does $R_\varepsilon$ lie in some $L^p$-space? What is the best value of $p$?

A related question can be posed for cocycles. Recall that a map $\Lambda : M \times \mathbb{R} \to \mathbb{R}$ is called a (multiplicative real-valued) cocycle over a flow $\{f_t\}$ if $\Lambda(x, s + t) = \Lambda(x, s)\,\Lambda(f_s x, t)$, for all $x \in M$ and $s, t \in \mathbb{R}$. If for every $x \in M$ the map $t \mapsto \Lambda(x, t)$ is absolutely continuous, then

$$\Lambda(x, t) = \exp\left\{\int_0^t u(f_s x)\, ds\right\}, \tag{1.1}$$

where $u(x) = \dot{\Lambda}(x, 0) = \frac{d}{dt}\big|_{t=0} \log \Lambda(x, t)$. When $u$ is essentially bounded with respect to some measure, we will call such a cocycle essentially bounded.

Assume $\mu$ is an invariant Borel probability measure, $u \in L^\infty(\mu)$, and set $\chi = \int_M u\, d\mu$. If $\mu$ is ergodic, then by the Birkhoff Ergodic Theorem,

$$\lim_{t \to \infty} \frac{1}{t} \log \Lambda(x, t) = \chi,$$

for $\mu$-a.e. $x$. Denote the set of Birkhoff regular points (at which the above limit exists and equals $\chi$) by $\mathcal{R}$ as well. Then for every $x \in \mathcal{R}$ and $\varepsilon > 0$, $\Lambda(x, t)/\exp\{(\chi + \varepsilon)t\} \to 0$, as $t \to \infty$, so there exists a constant $B > 0$ (depending on $x$ and $\varepsilon$) such that $\Lambda(x, t) \le B \exp\{(\chi + \varepsilon)t\}$, for all $t \ge 0$. It makes sense to consider the best such $B$ as a function of $x$ and $\varepsilon$:

Definition 1.2. For each $\varepsilon > 0$, the $(u, \varepsilon)$-regularity function $D_\varepsilon^u : M \to \mathbb{R}$ is defined by

$$D_\varepsilon^u(x) = \sup_{t \ge 0} \frac{\Lambda(x, t)}{e^{(\chi + \varepsilon)t}}. \tag{1.2}$$
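The supremum in (1.2) can be approximated numerically along a sampled orbit. The following is only a minimal sketch: the function name `regularity_function`, the left Riemann sum used for the integral in (1.1), and the toy samples in the usage note are assumptions made for illustration and are not part of the paper.

```python
import math

def regularity_function(u_samples, dt, chi, eps):
    """Approximate D_eps(x) = sup_{t>=0} exp(int_0^t u(f_s x) ds - (chi+eps) t).

    u_samples[k] is assumed to approximate u(f_{k*dt} x); the integral is a
    left Riemann sum. Returns (D, T), where T is the first grid time at which
    the discrete supremum is attained.
    """
    log_best, t_best = 0.0, 0.0      # t = 0 contributes exp(0) = 1, so D >= 1
    integral = 0.0
    for k, u in enumerate(u_samples):
        integral += u * dt           # running value of int_0^t u(f_s x) ds
        t = (k + 1) * dt
        value = integral - (chi + eps) * t
        if value > log_best:         # strict inequality keeps the smallest maximizer
            log_best, t_best = value, t
    return math.exp(log_best), t_best
```

For instance, with the hypothetical samples `[2.0] * 10 + [0.0] * 10`, `dt = 0.1`, `chi = 0.5` and `eps = 0.5`, the exponent grows like $t$ up to $t = 1$ and decreases afterwards, so the sketch returns approximately $(e, 1)$.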

When $u$ is clear from the context, we will write just $D_\varepsilon$. It is clear that each $D_\varepsilon$ is Borel measurable and $D_\varepsilon \ge 1$. It is also easy to see that if $\varepsilon \ge \|u\|_\infty - \chi$, then $D_\varepsilon = 1$, $\mu$-a.e. We are therefore interested only in the values of $\varepsilon$ less than $\|u\|_\infty - \chi$. What more can be said about the $D_\varepsilon$? For instance:

Question 2. Does $D_\varepsilon$ belong to some $L^p$-space? What is the best value of $p$?

A word of caution is needed here. Even for “best” (interesting) dynamical systems, namely globally uniformly hyperbolic ones, neither the Oseledets nor the Birkhoff theorem guarantees any particularly good properties of the set $\mathcal{R}$ of regular points and, as a consequence, of the regularity functions. Although $\mathcal{R}$ has full measure with respect to any invariant probability measure, its complement is not only non-empty, but can be topologically very large. This follows from the work of Barreira and Schmeling [BS00b], who showed that for an Anosov diffeomorphism of the 2-torus, the complement of the set of regular points can have the full Hausdorff dimension (i.e., two). In the continuous time case, using multifractal analysis, Barreira and Saussol [BS00a] showed that for


hyperbolic flows the set of non-regular points is similarly topologically large, namely, dense and of full Hausdorff dimension, for a generic function $u$. Similar results were obtained by Pesin and Sadovskaya [PS01].

We will soon see that Questions 1 and 2 are closely related, at least in the case of Anosov flows (see § 2.1), to which we now restrict ourselves. Namely, given a volume preserving Anosov flow $\Phi$ and a Lyapunov bundle $E$ of $\Phi$, it turns out that each regularity function $R_\varepsilon$ of $E$ can be related to a regularity function $D_\eta^u$, for some $\eta > 0$ and some essentially bounded function $u$ dependent only on $E$. See Theorem B.

We now state our main results. Throughout, $m$ will denote the Borel probability measure defined by the Riemannian volume on $M$.

Theorem A. Let $\Phi = \{f_t\}$ be a $C^2$ volume preserving Anosov flow on a closed Riemannian manifold $M$ and let $\Lambda : M \times \mathbb{R} \to \mathbb{R}$ be a multiplicative cocycle over $\Phi$, as in (1.1). If $u \in L^\infty(m)$, then for every $\varepsilon > 0$, the corresponding $(u, \varepsilon)$-regularity function $D_\varepsilon$ belongs to $L^p(m)$, for some $p > 0$. If $u$ is Hölder continuous, let $H$ be the entropy function of $u$ (as defined in § 2.1). Then $D_\varepsilon \in L^p(m)$, provided that

$$p \le \left( \int_\varepsilon^{\|u\|_\infty - \chi} \frac{ds}{H(\chi + s)} \right)^{-1}. \tag{1.3}$$

Here is a sketch of the proof of Theorem A. If $u$ is Hölder, then for all $x \in \mathcal{R}$, $t \mapsto \Lambda(x, t)$ is continuous, so we define $T_\varepsilon : \mathcal{R} \to \mathbb{R}$ ($0 < \varepsilon < \|u\|_\infty - \chi$) to be the smallest $t \ge 0$ at which the supremum in (1.2) is attained. Then $T_\varepsilon$ is Borel measurable and $D_\varepsilon \le \exp\{(\|u\|_\infty - \chi - \varepsilon) T_\varepsilon\}$, so we study the question of integrability of $\exp(T_\varepsilon)$. Using a large deviations result of Waddington [Wad96] (see § 2.1 for details), we show that if $p < H(\chi + \varepsilon)$, then $\exp(T_\varepsilon) \in L^p(m)$, where $H$ is the entropy function of $u$. Next, we show that if $\eta < \varepsilon$, then $D_\eta \le D_\varepsilon \exp\{(\varepsilon - \eta) T_\eta\}$ $m$-a.e., which for any natural number $N$ by induction extends to

$$D_\varepsilon \le \prod_{i=0}^{N-1} \exp(\delta T_{\varepsilon_i}),$$

where $\varepsilon = \varepsilon_0 < \varepsilon_1 < \cdots < \varepsilon_N = \|u\|_\infty - \chi$ is a partition of $[\varepsilon, \|u\|_\infty - \chi]$ with $\delta = \varepsilon_{i+1} - \varepsilon_i = (\|u\|_\infty - \chi - \varepsilon)/N$, for all $i$. Using the generalized Hölder inequality and the fact that $\exp(\delta T_{\varepsilon_i}) \in L^{p_i}(m)$, where $p_i < H(\chi + \varepsilon_i)/\delta$, we obtain $D_\varepsilon \in L^p(m)$, where $p^{-1} = \sum_i p_i^{-1} > \sum_i \delta / H(\chi + \varepsilon_i)$. Passing to the limit as $N \to \infty$ in the last sum, we obtain (1.3).

If $u$ is only essentially bounded, we show that it is possible to suitably approximate $u$ in the $L^1$-sense by a larger smooth function (Lemma 3.3). Namely, for every $\delta > 0$ there exists a $C^\infty$ function $\tilde{u} : M \to \mathbb{R}$ such that $u \le \tilde{u}$ and $\int_M (\tilde{u} - u)\, dm < \delta$. It then easily follows that for any $0 < \delta < \varepsilon < \|u\|_\infty - \chi$, $D_\varepsilon^u \le D_{\varepsilon - \delta}^{\tilde{u}}$, $m$-a.e., which implies that $D_\varepsilon^u$ lies in some $L^p$-space.

A bridge between the two different types of regularity functions is given by the following result.

Theorem B. Let $\Phi = \{f_t\}$ be a $C^2$ volume preserving Anosov flow on a closed $C^\infty$ Riemannian manifold $M$ and let $E$ be a Lyapunov bundle for $\Phi$ associated with a Lyapunov exponent $\chi$. For every $\delta > 0$ there exists a constant $C_\delta > 0$ such that

$$\|T_x^E f_t\| \le C_\delta\, e^{\delta t} \exp\left\{\int_0^t u(f_s x)\, ds\right\},$$

for every $x \in \mathcal{R}$ and $t \ge 0$, where $u \in L^\infty(m)$ is independent of $\delta$ and $\int_M u\, dm = \chi$.

The proof of Theorem B goes as follows. First, we trivialize $E$ by using a measurable orthonormal frame. This transforms the second variational equation for the restriction of the flow to $E$ into a family of non-autonomous differential equations $\dot{X} = A_x(t) X$ on $\mathbb{R}^k$ ($k = \dim E$), parametrized by $x \in \mathcal{R}$. Following [BP01], we use a lemma of Perron to construct a family $U_x(t)$ of orthogonal matrices such that if $v(t)$ is a solution to $\dot{v} = A_x(t) v$, then $z(t) = U_x(t)^{-1} v(t)$ is a solution to $\dot{z} = B_x(t) z$, where $B_x(t)$ are upper triangular matrices whose off-diagonal entries are bounded in $x$ and $t$. We show that for every $\delta > 0$ there exists a norm $\|\cdot\|_\delta$ on $\mathbb{R}^k$ such that $\|B_x(t)\|_\delta < r(B_x(t)) + \delta$, where $r$ denotes the spectral radius. Moreover, $r(B_x(s + t)) = r(B_{f_s x}(t))$, for all $t$ and a.e. $x$. Furthermore, if $X(t)$ is the unique solution to the matrix differential equation $\dot{X} = A_x(t) X$ satisfying $X(0) = I$, then

$$\|X(t)\|_\delta \le K_\delta\, e^{\delta t} \exp\left\{\int_0^t r(B_x(s))\, ds\right\},$$

for some constant $K_\delta > 0$. We therefore define $u : M \to \mathbb{R}$ by $u(x) = r(B_x(0))$. It is not hard to prove that $u$ is essentially bounded. The desired inequality for $T_x^E f_t$ is now obtained by pulling the norms $\|\cdot\|_\delta$ back to $E$ and observing that each new Finsler structure is globally uniformly equivalent to the original one.

The last result is a straightforward corollary of Theorem B.

Theorem C. Let $\Phi = \{f_t\}$ be a $C^2$ volume preserving Anosov flow on a closed $C^\infty$ Riemannian manifold $M$. Let $E$ be a Lyapunov bundle in the Oseledets splitting for $\Phi$. Then for every $\varepsilon > 0$, the corresponding $(E, \varepsilon)$-regularity function $R_\varepsilon$ belongs to the space $L^p(m)$, for some $p > 0$.

To prove Theorem C, denote the Lyapunov exponent corresponding to $E$ by $\chi$ and let $\varepsilon > 0$ and $0 < \delta < \varepsilon$ be arbitrary. Then by Theorem B,

$$\frac{\|T_x^E f_t\|}{e^{(\chi + \varepsilon)t}} \le C_\delta\, \frac{\exp\left\{\int_0^t u(f_s x)\, ds\right\}}{e^{(\chi + \varepsilon - \delta)t}} \le C_\delta\, D_{\varepsilon - \delta}(x),$$

for $m$-a.e. $x \in \mathcal{R}$ and $t \ge 0$. This implies that $R_\varepsilon \le C_\delta D_{\varepsilon - \delta}$, which yields Theorem C.

Remark. The question of the best $p = p(\varepsilon)$ such that $D_\varepsilon \in L^p(m)$ (and the analogous question for $R_\varepsilon$) remains open. It is likely that the answer can be found by a more careful analysis of the set $\mathcal{L} = \{(\varepsilon, p) : D_\varepsilon \in L^p(m)\}$, which possesses a number of interesting properties such as:

(a) The set $\mathcal{L}' = \{(\varepsilon, p^{-1}) : (\varepsilon, p) \in \mathcal{L}\}$ is convex. To see this, first observe that $D_{(1-\alpha)\varepsilon_0 + \alpha\varepsilon_1} \le D_{\varepsilon_0}^{1-\alpha} D_{\varepsilon_1}^{\alpha}$, for all $\varepsilon_0, \varepsilon_1 > 0$ and $0 \le \alpha \le 1$ (the proof is straightforward). If $(\varepsilon_i, p_i^{-1}) \in \mathcal{L}'$, $i = 0, 1$, and $0 < \alpha < 1$, then $D_{\varepsilon_i} \in L^{p_i}$ ($i = 0, 1$), so $D_{\varepsilon_0}^{1-\alpha} \in L^{p_0/(1-\alpha)}$ and $D_{\varepsilon_1}^{\alpha} \in L^{p_1/\alpha}$. The above inequality and Hölder’s inequality yield $D_\varepsilon \in L^p$, where $\varepsilon = (1-\alpha)\varepsilon_0 + \alpha\varepsilon_1$ and $p^{-1} = (1-\alpha) p_0^{-1} + \alpha p_1^{-1}$. Thus $\mathcal{L}'$ contains the line segment connecting $(\varepsilon_0, p_0^{-1})$ and $(\varepsilon_1, p_1^{-1})$.


(b) If $\nu$ is a Borel probability measure on an interval $I \subset (0, \|u\|_\infty - \chi)$ and $\varphi : I \to \mathbb{R}$ a positive Borel function whose graph is contained in $\mathcal{L}$, then $(\varepsilon, p) \in \mathcal{L}$, where $\varepsilon = \int_I t\, d\nu(t)$ and $p = \left(\int_I d\nu/\varphi\right)^{-1}$.

The following example shows that even for “simple” systems there is a definite cut-off value for $p$ beyond which the regularity function is not in $L^p$.

Example. Let $M = T^2$ be the 2-torus and $f : T^2 \to T^2$ an area preserving Anosov diffeomorphism. Denote its a.e. Lyapunov exponents by $\chi^- < 0 < \chi^+$. It is not hard to construct $f$ so that it possesses two periodic points $x, y$, whose corresponding Lyapunov exponents are different, i.e., $\chi_x^- \ne \chi_y^-$ and $\chi_x^+ \ne \chi_y^+$ (with obvious notation). Clearly, the Lyapunov exponents at $x$ or $y$ (or both) differ from the a.e. Lyapunov exponents $\chi^-, \chi^+$. Assume, for instance, that $\chi_x^+ > \chi^+$ and denote by $\lambda^+$ the unstable cocycle of $f$, that is, the determinant of the derivative of $f$ restricted to the unstable bundle of $f$. Then for $0 < \varepsilon < \chi_x^+ - \chi^+$, the regularity function $D_\varepsilon$ of $u = \log \lambda^+$ is infinite at $x$. (Using the fact that the homoclinic points of $x$ are dense in $T^2$ and all have the same Lyapunov exponents as $x$, it is not hard to see that $D_\varepsilon$ is in fact infinite on a dense subset of $T^2$.)

We claim that there exists $p_0 > 0$ such that $D_\varepsilon \notin L^p(m)$, for all $p \ge p_0$. The main idea for showing this is the following. Since $\lambda^+$ is continuous, each $D_\varepsilon$ is lower semicontinuous, so the sets $E_\alpha = \{D_\varepsilon > \alpha\}$ are open, for all $\alpha$. Since $D_\varepsilon(x) = \infty$ (where $x$ is the periodic point as above), $x \in E_\alpha$, for all $\alpha$, so there exists a ball $B(x, r)$, for some $r = r(\alpha)$, contained in $E_\alpha$. Hölder continuity of $u$ allows us to control the size of $r$ and show that $\int_0^\infty \alpha^{p-1} m(E_\alpha)\, d\alpha$ diverges for large enough $p$. The details follow.

Denote by $C > 0$ and $0 < \theta < 1$ the Hölder constant and exponent of $u$, so that for all $x_1, x_2 \in T^2$, $|u(x_1) - u(x_2)| \le C d(x_1, x_2)^\theta$. Fix $0 < \varepsilon < \chi_x^+ - \chi^+$ and define $\sigma = \chi_x^+ - \chi^+ - \varepsilon$ and

$$S_\varepsilon(z, N) = \sum_{i=0}^{N-1} u(f^i z) - (\chi^+ + \varepsilon) N,$$

so that $D_\varepsilon(z) = \sup_{N \ge 1} \exp S_\varepsilon(z, N)$. Consider the periodic point $x$ as above, at which $D_\varepsilon(x) = \infty$. Suppose its prime period is $\ell$. It is not hard to verify that $S_\varepsilon(x, n\ell) = \sigma n \ell$, for all $n \ge 1$, that is, $S_\varepsilon(x, N)$ grows linearly along the subsequence $N = n\ell$. Moreover, for all $z \in T^2$, we have

$$|S_\varepsilon(x, N) - S_\varepsilon(z, N)| \le \sum_{i=0}^{N-1} \left| u(f^i x) - u(f^i z) \right| \le C\, \frac{\lambda^{\theta N} - 1}{\lambda^\theta - 1}\, d(x, z)^\theta, \tag{1.4}$$

where $\lambda > 1$ is the Lipschitz constant of $f$. For $\alpha > 0$, set $E_\alpha = \{z \in T^2 : D_\varepsilon(z) > \alpha\}$, as above. Clearly, $x \in E_\alpha$ for all $\alpha > 0$. Since $S_\varepsilon(x, n\ell) = \sigma n \ell$, it follows that $S_\varepsilon(x, n\ell) > \log \alpha$, where we take

$$n = \left\lceil \frac{\log \alpha}{\sigma \ell} \right\rceil + 1.$$

Fix $N = n\ell$ and observe that

$$\frac{\log \alpha}{\sigma} + \ell \le N < \frac{\log \alpha}{\sigma} + 2\ell. \tag{1.5}$$

It easily follows from (1.4) that if the right-hand side in (1.4) is $< S_\varepsilon(x, N) - \log \alpha$, then $S_\varepsilon(z, N) > \log \alpha$, hence $z \in E_\alpha$. Thus the ball $B(x, r(\alpha))$ in $T^2$ of radius

$$r(\alpha) = \left[ \frac{1}{C}\, \frac{\lambda^\theta - 1}{\lambda^{N\theta} - 1}\, \big[ S_\varepsilon(x, N) - \log \alpha \big] \right]^{1/\theta}$$

is contained in $E_\alpha$. It follows from (1.5) that $\lambda^{N\theta}$ [...]

If $\chi$ is finite, we can define regularity functions of $A$ as above by

$$R_\varepsilon(x) = \sup_{t > 0} \frac{A(x, t)}{e^{(\chi + \varepsilon)t}}.$$

The goal is to show that for every $\varepsilon > 0$, $R_\varepsilon \in L^p(m)$, for some $p > 0$. We briefly outline how this could be done.


The key is to obtain the asymptotics of $m\{R_\varepsilon > e^\alpha\}$ with respect to $\alpha$. Choose $T_0 > 0$ large enough so that

$$\frac{1}{T_0} \int_M a(x, T_0)\, dm(x) < \chi + \frac{\varepsilon}{2}.$$

If $R_\varepsilon(x) > e^\alpha$, then $a(x, T) - (\chi + \varepsilon) T > \alpha$, for some $T > 0$. Write $T = k T_0 + \tau$, for some positive integer $k$ and $0 \le \tau < T_0$, and observe that $a(x, T)$ is bounded above by the sum of $a(f_{T_0}^k x, \tau)$ (which is bounded a.e.) and the $k$th Birkhoff sum of the function $g(x) = a(x, T_0)$. Thus the set $\{R_\varepsilon > e^\alpha\}$ is contained in a set of the form $\left\{\sum_{i=0}^{k-1} g \circ f_{T_0}^i > c\right\}$, for some $c$ depending on $\alpha$, so one can apply to $f_{T_0}$ the large deviations result for time-$t$ maps of Anosov flows proved by Dolgopyat in [Dol04], yielding the desired asymptotics. This approach (which we will not pursue here) could be used to prove both Theorems A and C without the Pesin-Lyapunov theory, although it does not provide the more precise bound on $p$ as a function of $\varepsilon$ given in Theorem A in the Hölder case. We thank the referee for pointing out the possibility of this extension.

2. Preliminaries

2.1. Large deviations for Anosov flows. A non-singular $C^1$ flow $\Phi = \{f_t\}$ on a closed Riemannian manifold $M$ is called an Anosov flow if there exists a $T f_t$-invariant continuous splitting of the tangent bundle, $TM = E^{uu} \oplus E^c \oplus E^{ss}$, and constants $C, \lambda > 0$ such that for all $t \ge 0$,

$$\left\| T f_t|_{E^{ss}} \right\| \le C e^{-\lambda t} \quad \text{and} \quad \left\| T f_{-t}|_{E^{uu}} \right\| \le C e^{-\lambda t},$$

where the center bundle $E^c$ is one dimensional and generated by the infinitesimal generator of the flow. The bundles $E^{uu}$, $E^{ss}$ are called the strong unstable and strong stable bundle of the flow. If the flow is of class $C^2$, $E^{ss}$, $E^{uu}$ are known to be Hölder continuous (cf. [Has94,Has97,HPS77]). If an Anosov flow preserves a volume form, it is automatically ergodic with respect to the Lebesgue measure defined by the volume (see [Ano67]).

Recall that a flow is called (topologically) transitive if it has a dense orbit. An equilibrium state of a function $\varphi : M \to \mathbb{R}$ is an invariant Borel probability measure $\mu$ at which the quantity

$$h(\mu) + \int_M \varphi\, d\mu$$

attains its supremum, where $h(\mu)$ denotes the measure-theoretic entropy of $\Phi$ with respect to $\mu$. This supremum $P(\varphi)$ is called the pressure of $\varphi$. If $\varphi$ is Hölder continuous, there exists a unique equilibrium state of $\varphi$, denoted by $\mu_\varphi$. Given a transitive Anosov flow $\Phi = \{f_t\}$, one defines a function $\varphi^u : M \to \mathbb{R}$ by

$$\varphi^u(x) = \frac{d}{dt}\bigg|_{t=0} \log \det T_x f_t|_{E^{uu}}.$$



If $\Phi$ is $C^2$, $\varphi^u$ is known to be Hölder continuous. The unique equilibrium state of $-\varphi^u$ is called the Sinai-Ruelle-Bowen (SRB) measure $\mu_{\mathrm{SRB}}$ of the flow. By the Bowen-Ruelle theorem [BR75], for every continuous $\varphi : M \to \mathbb{R}$,

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T \varphi(f_t x)\, dt = \int_M \varphi\, d\mu_{\mathrm{SRB}},$$

for $m$-a.e. $x \in M$, where $m$ is the Lebesgue measure defined by the volume form and $\Phi$ is $C^2$. Thus $\mu_{\mathrm{SRB}}$ is an ergodic measure for $\Phi$. If the flow admits a smooth invariant Borel probability measure $\mu$ (i.e., a measure that is absolutely continuous with respect to the volume measure $m$), then by the Birkhoff ergodic theorem, $\mu = \mu_{\mathrm{SRB}}$. In particular, if $\Phi$ is volume preserving, then $\mu_{\mathrm{SRB}} = m$.

For an arbitrary flow $f_t : M \to M$ and function $\psi : M \to \mathbb{R}$, we can define a skew product flow $S_t^\psi : S^1 \times M \to S^1 \times M$ by

$$S_t^\psi(\exp(2\pi i \theta), x) = \big( \exp\{2\pi i (\theta + \psi^t(x))\},\ f_t(x) \big),$$

where

$$\psi^t(x) = \int_0^t \psi(f_s x)\, ds.$$

Definition 2.1 ([Wad96]). A Hölder continuous function $\varphi : M \to \mathbb{R}$ and a flow $\Phi = \{f_t\}$ on $M$ are called flow independent if they satisfy the following property: for every two numbers $a, b \in \mathbb{R}$, if the skew product flow $S_t^{a + b\varphi}$ is not topologically transitive,¹ then $a = 0 = b$.

Large deviation asymptotics for transitive Anosov flows were established by Waddington in [Wad96]. In particular:

Theorem 2.2 (Coro. 2, [Wad96]). Let $\Phi = \{f_t\}$ be a transitive $C^2$ Anosov flow on $M$ and let $\varphi : M \to \mathbb{R}$ be a Hölder continuous function such that $\varphi$ and $\Phi$ are flow independent. There exist analytic real-valued functions $\beta, \gamma, \rho$ defined on an interval in $\mathbb{R}$, such that if $\rho(a) > 0$, then

$$\mu_{\mathrm{SRB}} \left\{ x : \int_0^T \varphi(f_t x)\, dt \ge T a \right\} \sim \frac{C(a)}{\rho(a)}\, \frac{e^{\gamma(a) T}}{\sqrt{2\pi \beta''(\rho(a))\, T}},$$

as $T \to \infty$, where $C(a)$ is a constant depending on $a$. Here, $a(t) \sim b(t)$ as $t \to \infty$ means $a(t)/b(t) \to 1$.

The function $\beta : \mathbb{R} \to \mathbb{R}$ is defined by $\beta(t) = P(\psi + t\varphi) - P(\psi)$, for a Hölder continuous $\psi$. For our purposes, we will take $\psi = 0$. Some properties of $\beta$ (see [Wad96] for details), with $\psi = 0$, are:

$$\beta'(t) = \int_M \varphi\, d\mu_{t\varphi} \quad \text{and} \quad \beta''(t) = \sigma^2_{\mu_{t\varphi}}(\varphi), \tag{2.1}$$

1 Waddington uses the term topologically ergodic, which has the same meaning, see [Pet89], Prop. 2.4.


where for a measure $\mu$ with $\int_M \varphi\, d\mu = \chi$, the variance of $\varphi$ is defined by

$$\sigma_\mu^2(\varphi) = \lim_{T \to \infty} \frac{1}{T} \int_M \left( \int_0^T \varphi \circ f_t\, dt - \chi T \right)^2 d\mu.$$

Furthermore, $\sigma_\mu^2(\varphi) = 0$ if and only if $\varphi$ is cohomologous to a constant. If $\varphi$ is not cohomologous to a constant, the map $t \mapsto \beta'(t)$ is strictly increasing. Denote its range by $I_\varphi$; it follows from (2.1) that $I_\varphi \subset (\min \varphi, \max \varphi)$. On $I_\varphi$, set $\rho = (\beta')^{-1}$. Then $\rho : I_\varphi \to \mathbb{R}$ is strictly increasing, surjective, and real analytic, with $\rho(\chi) = 0$. Finally, $\gamma : I_\varphi \to \mathbb{R}$ is defined as minus one times the Legendre transform of $\beta$, i.e.,

$$\gamma(s) = -\sup_{t \in \mathbb{R}} \{ s t - \beta(t) \}.$$

It can be shown that $\gamma$ is a strictly concave, non-positive function with a unique maximum at $\chi = \int_M \varphi\, dm$ (where we still take $\psi = 0$). Furthermore, $\gamma''(s) = -1/\beta''(\rho(s))$, so in particular, $\gamma''(\chi) = -1/\sigma_m^2(\varphi)$ (cf. [Wad96]). In the large deviations literature the function $H = -\gamma$ is called the entropy function of $\varphi$. It is easily seen that $H$ has the following properties (see [Wad96]): it is strictly convex on $I_\varphi$,

$$H(\chi) = H'(\chi) = 0, \qquad H''(\chi) = \frac{1}{\sigma_m^2(\varphi)}, \qquad \text{and} \quad H(a) = \infty \ \text{for } a \notin I_\varphi,$$

where $\chi = \int_M \varphi\, dm$. The following lemma will be needed later in the paper.

Lemma 2.3. Let $\Phi$ be a volume preserving Anosov flow and $\varphi : M \to \mathbb{R}$ a Hölder continuous function. If $\varphi$ and $\Phi$ are not flow independent, then $\varphi$ is cohomologous to a constant.

Proof. Suppose $\varphi$ and $\Phi$ are not flow independent. Then there exist numbers $a, b$, not both zero, such that the skew product $S_t^{a + b\varphi}$ is not topologically transitive, hence not ergodic with respect to the measure $m_1 \times m$, where $m_1$ is the Haar-Lebesgue measure on $S^1$. Since the volume measure is an equilibrium state of $\Phi$, Prop. 4.2 in [Wal99] implies the existence of a nonzero integer $\ell$ and a Hölder function $w : M \to \mathbb{R}$ such that

$$\ell \int_0^t (a + b\varphi)(f_s x)\, ds = w(f_t x) - w(x),$$

for all $x \in M$. If $b = 0$, then $a \ne 0$ and $w(f_t x) - w(x) = \ell a t$ everywhere, which is impossible. Therefore, $b \ne 0$. Differentiating the above identity, we obtain

$$\varphi + \frac{a}{b} = \frac{1}{\ell b}\, X w,$$

which means that $\varphi$ is cohomologous to $-a/b$.
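As an aside on the entropy function $H = -\gamma$ of § 2.1, the Legendre transform can be evaluated numerically by a grid search. The sketch below is illustrative only: it assumes a hypothetical quadratic $\beta(t) = \chi t + \sigma^2 t^2/2$ (a Gaussian-type model, not the pressure function of an actual Anosov flow), for which $H(s) = (s - \chi)^2/(2\sigma^2)$ in closed form. The names `quadratic_pressure` and `entropy_function` and the search window are likewise assumptions.

```python
def quadratic_pressure(chi, sigma2):
    """Hypothetical model beta(t) = chi*t + sigma2*t^2/2 (an assumption, not from the paper)."""
    return lambda t: chi * t + 0.5 * sigma2 * t * t

def entropy_function(beta, s, t_min=-50.0, t_max=50.0, steps=100001):
    """H(s) = sup_t { s*t - beta(t) }, approximated on a uniform grid in [t_min, t_max].

    The grid value is a lower bound for the true supremum; it is accurate only
    when the maximizer lies inside the search window.
    """
    h = (t_max - t_min) / (steps - 1)
    return max(s * (t_min + j * h) - beta(t_min + j * h) for j in range(steps))
```

With `beta = quadratic_pressure(1.0, 2.0)`, the sketch gives `entropy_function(beta, 1.0)` close to $0$ and `entropy_function(beta, 3.0)` close to $1$, consistent with $H(\chi) = 0$ and $H''(\chi) = 1/\sigma^2$ for this toy model.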



2.2. Pesin-Lyapunov theory. In this section we follow Barreira-Pesin [BP01] and briefly review some elements of Pesin-Lyapunov theory for linear differential equations

$$\dot{v} = A(t) v, \tag{2.2}$$

where $A(t)$ is a $k \times k$ bounded matrix function, i.e.,

$$\sup_{t \in \mathbb{R}} \|A(t)\| < \infty.$$

We concentrate on real matrices $A(t)$ ([BP01] deals with complex matrices). The Lyapunov exponent of $v \in \mathbb{R}^k$ is the number

$$\chi(v) = \limsup_{t \to \infty} \frac{1}{t} \log \|v(t)\|,$$

where $v(t)$ is the unique solution to (2.2) satisfying the initial condition $v(0) = v$. The function $\chi : \mathbb{R}^k \to \mathbb{R} \cup \{-\infty\}$ attains only finitely many values $\chi_1 < \cdots < \chi_\ell$, where $\ell \le k$. For each $1 \le i \le \ell$, define $V_i = \{v \in \mathbb{R}^k : \chi(v) \le \chi_i\}$. This defines a linear filtration of $\mathbb{R}^k$:

$$\{0\} = V_0 \subsetneq V_1 \subsetneq \cdots \subsetneq V_\ell = \mathbb{R}^k.$$

An ordered basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$ is called normal with respect to the filtration $\mathcal{V} = \{V_i\}$ if for every $1 \le i \le \ell$, the vectors $v_1, \ldots, v_{k_i}$ form a basis for $V_i$, where $k_i = \dim V_i$. In particular, if $\chi$ is a constant function, every basis of $\mathbb{R}^k$ is normal.

Given a basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$ and $1 \le m \le k$, denote by $\Gamma_m^{\mathbf{v}}(t)$ the volume of the parallelepiped defined by $v_1(t), \ldots, v_m(t)$, where $v_i(t)$ is the unique solution to (2.2) satisfying $v_i(0) = v_i$. Recall that the Lyapunov exponent $\chi$ is regular (together with the Lyapunov exponent $\tilde{\chi}$ associated with the dual equation $\dot{w} = -A(t)^* w$) if and only if (see [BP01], Theorem 1.3.1) for any normal ordered basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$ and $1 \le m \le k$, we have

$$\lim_{t \to \infty} \frac{1}{t} \log \Gamma_m^{\mathbf{v}}(t) = \sum_{i=1}^m \chi(v_i).$$

In particular, if $\chi$ is constant, then for any basis $\mathbf{v} = (v_1, \ldots, v_k)$ and $1 \le m \le k$,

$$\lim_{t \to \infty} \frac{1}{t} \log \Gamma_m^{\mathbf{v}}(t) = m\chi. \tag{2.3}$$

We now recall how one converts (as in [BP01]) by a linear change of coordinates Eq. (2.2) into $\dot{z} = B(t) z$, where the matrix $B(t)$ is upper triangular. We seek a differentiable family of orthogonal matrices $U(t)$ for the job. Set $z(t) = U(t)^{-1} v(t)$, where $v(t)$ is a solution to (2.2); then

$$\dot{v}(t) = \dot{U}(t) z(t) + U(t) \dot{z}(t) = A(t) U(t) z(t),$$

which implies $\dot{z}(t) = B(t) z(t)$, where

$$B(t) = U(t)^{-1} A(t) U(t) - U(t)^{-1} \dot{U}(t). \tag{2.4}$$

The following lemma of Perron guarantees the existence of $U(t)$ so that $B(t)$ is upper triangular, for all $t$.


Lemma 2.4 (Lemma 1.3.3, [BP01]). There exists a differentiable matrix function $t \mapsto U(t)$ such that for each $t \ge 0$:

(a) $U(t)$ is orthogonal.

(b) $B(t) = [b_{ij}(t)]$ is upper triangular.

(c) For all $1 \le i < j \le k$,

$$\sup_{t \ge 0} |b_{ij}(t)| \le 2 \sup_{t \ge 0} \|A(t)\| < \infty.$$

(d) For any basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$ and all $1 \le i \le k$,

$$b_{ii}(t) = \frac{d}{dt} \log \frac{\Gamma_i^{\mathbf{v}}(t)}{\Gamma_{i-1}^{\mathbf{v}}(t)}.$$

Here is how the families $U(t)$ and $B(t)$ are constructed. Denote by $\mathcal{G} : GL(k, \mathbb{R}) \to O(k)$ the Gram-Schmidt orthogonalization operator that sends a basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$ to an orthonormal basis $\mathbf{u} = (u_1, \ldots, u_k)$. We can think of $\mathbf{v}$ and $\mathbf{u}$ as matrices with columns $v_1, \ldots, v_k$ and $u_1, \ldots, u_k$, respectively. Then $\mathbf{v} \in GL(k, \mathbb{R})$ and $\mathbf{u} \in O(k)$. Observe that $\mathcal{G} = \mathcal{N} \circ \mathcal{L}$, where $\mathcal{L}[v_1, \ldots, v_k] = [w_1, \ldots, w_k]$ is a linear operator defined by

$$w_{i+1} = v_{i+1} - \mathrm{proj}_{W_i} v_{i+1}, \qquad W_i = \mathrm{span}\{w_1, \ldots, w_i\},$$

and $\mathcal{N}[w_1, \ldots, w_k] = [u_1, \ldots, u_k]$ is the normalization operator

$$u_i = \frac{w_i}{\|w_i\|}.$$

Since $\mathcal{L}$ is linear, differentiating $\mathcal{G}$ at $\mathbf{v}$ yields $T_{\mathbf{v}} \mathcal{G} = T_{\mathcal{L}\mathbf{v}} \mathcal{N} \circ \mathcal{L}$.

In the proof of Perron’s Lemma 1.3.1 in [BP01], for an arbitrary but fixed basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$, one defines $U(t)$ by

$$U(t) = \mathcal{G}[v_1(t), \ldots, v_k(t)],$$

where $v_i(t)$ is the unique solution to the equation $\dot{v} = A(t) v$ satisfying the initial condition $v(0) = v_i$. The family $B(t)$ is defined as in (2.4). Thus both $t \mapsto U(t)$ and $t \mapsto B(t)$ depend on the choice of a basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$. When it is important to emphasize this, we will write $U^{\mathbf{v}}(t)$ and $B^{\mathbf{v}}(t)$.

It is clear that the eigenvalues of $B(t)$ are its diagonal entries $b_{ii}(t)$. Denote the spectral radius of a matrix $M$ by $r(M)$.

Corollary 2.5. If $\chi$ is constant, then

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t r(B(s))\, ds = \chi.$$

Proof. Follows directly from (2.3) and part (d) of Lemma 2.4.
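The Gram-Schmidt construction and the volume formula in Lemma 2.4(d) can be checked on a toy case. The sketch below is an illustration under stated assumptions, not the construction of [BP01]: it takes the hypothetical constant diagonal system $A = \mathrm{diag}(a_1, a_2)$ with the initial basis $v_1 = (1, 0)$, $v_2 = (1, 1)$, computes the volumes as products of the norms $\|w_i\|$ of the orthogonalized vectors, and estimates $b_{ii}$ by a central finite difference. All function names are invented for the example.

```python
import math

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: returns the orthonormal u_i and the norms ||w_i||."""
    us, norms = [], []
    for v in vectors:
        w = list(v)
        for u in us:                                  # subtract the projection onto span{u_1..u_i}
            c = sum(a * b for a, b in zip(v, u))
            w = [wi - c * ui for wi, ui in zip(w, u)]
        n = math.sqrt(sum(wi * wi for wi in w))
        us.append([wi / n for wi in w])
        norms.append(n)
    return us, norms

def volumes(vectors):
    """Volumes of the m-parallelepipeds, m = 1..k, as ||w_1|| * ... * ||w_m||."""
    _, norms = gram_schmidt(vectors)
    out, prod = [], 1.0
    for n in norms:
        prod *= n
        out.append(prod)
    return out

def solution_basis(t, a1, a2):
    """Solutions of v' = diag(a1, a2) v started at v_1 = (1, 0), v_2 = (1, 1) (toy assumption)."""
    return [[math.exp(a1 * t), 0.0], [math.exp(a1 * t), math.exp(a2 * t)]]

def b_diag(i, t, a1, a2, h=1e-6):
    """Finite-difference estimate of d/dt log(Gamma_i / Gamma_{i-1}), with Gamma_0 = 1."""
    def log_ratio(s):
        g = [1.0] + volumes(solution_basis(s, a1, a2))
        return math.log(g[i] / g[i - 1])
    return (log_ratio(t + h) - log_ratio(t - h)) / (2 * h)
```

In this toy case the volumes are $e^{a_1 t}$ and $e^{(a_1 + a_2)t}$, so `b_diag(1, ...)` and `b_diag(2, ...)` recover $a_1$ and $a_2$, the diagonal entries of $A$.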



Lemma 2.6. The spectral radius $r(B(t))$ of the matrix $B(t) = B^{\mathbf{v}}(t)$ is independent of the choice of the basis $\mathbf{v} = (v_1, \ldots, v_k)$ of $\mathbb{R}^k$.

Proof. Let $\mathbf{v} = (v_1, \ldots, v_k)$ and $\mathbf{w} = (w_1, \ldots, w_k)$ be two bases of $\mathbb{R}^k$ and let $B^{\mathbf{v}}(t)$ and $B^{\mathbf{w}}(t)$ be the corresponding matrices constructed as above. Denote the solutions to (2.2) with initial values $(v_1, \ldots, v_k)$, $(w_1, \ldots, w_k)$ by $(v_1(t), \ldots, v_k(t))$ and $(w_1(t), \ldots, w_k(t))$, respectively. Both $k$-tuples are bases of $\mathbb{R}^k$. Therefore, there exists a family of invertible matrices $P(t)$ such that $P(t) v_i(t) = w_i(t)$, for all $1 \le i \le k$. It follows that $\Gamma_i^{\mathbf{w}}(t) = |\det P(t)|\, \Gamma_i^{\mathbf{v}}(t)$, for all $1 \le i \le k$, and thus

$$\frac{\Gamma_i^{\mathbf{w}}(t)}{\Gamma_{i-1}^{\mathbf{w}}(t)} = \frac{\Gamma_i^{\mathbf{v}}(t)}{\Gamma_{i-1}^{\mathbf{v}}(t)},$$

for all $t \ge 0$. Lemma 2.4(d) implies that the corresponding diagonal entries of $B^{\mathbf{v}}(t)$ and $B^{\mathbf{w}}(t)$ are the same, which yields the conclusion of the lemma.

Define a function $\rho_B : \mathbb{R} \to \mathbb{R}$ by $\rho_B(t) = r(B(t))$.

Lemma 2.7. There exists a universal constant $K > 0$, depending only on $k$, such that $|\rho_B(0)| \le K \|A(0)\|$.

Proof. Let $\mathbf{v} = \mathbf{e}$ be the standard basis $(e_1, \ldots, e_k)$ of $\mathbb{R}^k$ and let $U(t) = U^{\mathbf{e}}(t)$ be the corresponding orthogonal matrix function defined as above. Then:

$$|\rho_B(0)| = |r(B(0))| = r\big(A(0) - U(0)^{-1} \dot{U}(0)\big) \le \big\|A(0) - U(0)^{-1} \dot{U}(0)\big\| \le \|A(0)\| + \big\|U(0)^{-1} \dot{U}(0)\big\| = \|A(0)\| + \|\dot{U}(0)\|.$$

Denote the solution to (2.2) with initial value $e_i$ by $e_i(t)$. Then:

$$\dot{U}(0) = \frac{d}{dt}\bigg|_{t=0} \mathcal{G}[e_1(t), \ldots, e_k(t)] = T_I \mathcal{G}[\dot{e}_1(0), \ldots, \dot{e}_k(0)] = T_I \mathcal{G}[A(0) e_1, \ldots, A(0) e_k] = T_I \mathcal{G}(A(0)),$$

where $I$ is the $k \times k$ identity matrix. Let $K = 1 + \|T_I \mathcal{G}\|$, where $T_I \mathcal{G}$ is regarded as a map between the Lie algebras $\mathfrak{gl}_k$ and $\mathfrak{o}_k$. It follows that $|\rho_B(0)| \le K \|A(0)\|$, completing the proof of the lemma.


3. Proof of Theorem A

We split the proof of Theorem A into two cases: in the first case, we deal with Hölder continuous functions $u$. The general case of essentially bounded $u$ is reduced to the first case in a suitable way. In either case, without loss of generality, we assume that $u$ is a positive function. Otherwise, apply the analysis below to the function $u + C$, for a sufficiently large positive constant $C$. It is easy to see that the regularity functions of $u$ and $u + C$ are the same.

Case 1. $u$ is Hölder continuous. First of all, we may assume that $u$ and $\Phi$ are flow independent. Otherwise, by Lemma 2.3, $u$ is cohomologous to a constant, which is necessarily equal to $\chi = \int_M u\, dm$, that is, $u = Xw + \chi$, for some Hölder function $w$. This implies that

$$\exp\left\{\int_0^t u(f_s x)\, ds\right\} = e^{\chi t} e^{w(f_t x) - w(x)} \le e^{2\|w\|_\infty} e^{\chi t},$$

so the corresponding regularity functions $D_\varepsilon$ are all constant (in fact, $D_\varepsilon = 1$ $m$-a.e., for all $\varepsilon > 0$).

Recall that we are only interested in the values $0 < \varepsilon < \|u\|_\infty - \chi$, since $D_\varepsilon = 1$ for $\varepsilon \ge \|u\|_\infty - \chi$. Denote the set of Birkhoff regular points by $\mathcal{R}$. For each $x \in \mathcal{R}$ and $0 < \varepsilon < \|u\|_\infty - \chi$, define $T_\varepsilon(x)$ to be the smallest $T \ge 0$ at which the supremum defining $D_\varepsilon = D_\varepsilon^u$ in (1.2) is attained. That is,

$$T_\varepsilon(x) = \min\left\{ T \ge 0 : \int_0^T u(f_s x)\, ds - (\chi + \varepsilon) T = \log D_\varepsilon(x) \right\}.$$

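By the Birkhoff Ergodic Theorem, $T_\varepsilon : \mathcal{R} \to [0, \infty)$ is well-defined. It is clear that $T_\varepsilon$ is Borel measurable. As in § 2.1, let $H = -\gamma$ be the entropy function of $u$.

Lemma 3.1. If $p < H(\chi + \varepsilon)$, then $e^{T_\varepsilon} \in L^p(m)$.

Proof. Fix an $\varepsilon \in (0, \|u\|_\infty - \chi)$ and let $\zeta > 1$ be arbitrary. Define $B_n = \{x : \zeta^n < T_\varepsilon(x) \le \zeta^{n+1}\}$. Suppose $x \in B_n$. Since $u$ is assumed to be positive, we have

$$\int_0^{\zeta^{n+1}} u(f_s x)\, ds \ge \int_0^{T_\varepsilon(x)} u(f_s x)\, ds = (\chi + \varepsilon) T_\varepsilon(x) + \log D_\varepsilon(x) \ge (\chi + \varepsilon) T_\varepsilon(x) \ge (\chi + \varepsilon) \zeta^n = \frac{\chi + \varepsilon}{\zeta}\, \zeta^{n+1}.$$

By Theorem 2.2 there exists a constant $L$ depending on $\varepsilon$ and $\zeta$ such that

$$m(B_n) \le L \exp\left\{ -H\!\left(\frac{\chi + \varepsilon}{\zeta}\right) \zeta^{n+1} \right\}.$$

It follows that

$$\int_M \exp(p T_\varepsilon)\, dm = \sum_{n=0}^\infty \int_{B_n} \exp(p T_\varepsilon)\, dm \le \sum_n L \exp\left\{ p\, \zeta^{n+1} - H\!\left(\frac{\chi + \varepsilon}{\zeta}\right) \zeta^{n+1} \right\},$$

which is finite for $p < H\big((\chi + \varepsilon)/\zeta\big)$. Since $\zeta > 1$ was arbitrary, letting $\zeta \to 1^+$ yields the claim.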

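Lemma 3.2. If $0 < \eta < \varepsilon$ and $x \in \mathcal{R}$, then $D_\eta(x) \le D_\varepsilon(x)\, e^{(\varepsilon - \eta) T_\eta(x)}$.

Proof. Set $u_\varepsilon = u - \chi - \varepsilon$. Then for each $x \in \mathcal{R}$:

$$\log D_\varepsilon(x) = \max_{t \ge 0} \int_0^t u_\varepsilon(f_s x)\, ds \ge \int_0^{T_\eta(x)} u_\varepsilon(f_s x)\, ds = \int_0^{T_\eta(x)} u_\eta(f_s x)\, ds + (\eta - \varepsilon) T_\eta(x) = \log D_\eta(x) - (\varepsilon - \eta) T_\eta(x),$$

which proves the claim.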
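The inequality of Lemma 3.2 also holds for a discretized orbit, where the supremum is taken over grid times only, and it can be checked directly. The sketch below is a sanity check under stated assumptions rather than part of the proof: the names `log_sup_and_time` and `check_lemma_3_2`, the Riemann-sum discretization, and the synthetic samples in the usage note are all hypothetical.

```python
import math

def log_sup_and_time(u_samples, dt, shift):
    """Return (max over grid t of S(t), first maximizing t) for
    S(t) = int_0^t u - shift * t, with the integral taken as a Riemann sum.
    The time t = 0 is included, so the maximum is always >= 0."""
    best, t_best, integral = 0.0, 0.0, 0.0
    for k, u in enumerate(u_samples):
        integral += u * dt
        t = (k + 1) * dt
        s = integral - shift * t
        if s > best:
            best, t_best = s, t
    return best, t_best

def check_lemma_3_2(u_samples, dt, chi, eta, eps):
    """Check log D_eta <= log D_eps + (eps - eta) * T_eta on the sampled orbit."""
    log_d_eta, t_eta = log_sup_and_time(u_samples, dt, chi + eta)
    log_d_eps, _ = log_sup_and_time(u_samples, dt, chi + eps)
    return log_d_eta <= log_d_eps + (eps - eta) * t_eta + 1e-12  # rounding slack
```

For example, `check_lemma_3_2([1.0 + math.sin(0.37 * k) for k in range(200)], 0.05, 1.0, 0.1, 0.3)` evaluates the two sides on a synthetic sample path; the check succeeds for any input with $\eta < \varepsilon$, because the discrete argument mirrors the proof above line by line.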

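Now let $0 < \varepsilon < \|u\|_\infty - \chi$ be arbitrary and fix a natural number $N \ge 1$. Let $\varepsilon = \varepsilon_0 < \varepsilon_1 < \cdots < \varepsilon_N = \|u\|_\infty - \chi$ be a partition of the interval $[\varepsilon, \|u\|_\infty - \chi]$ with $\delta = \varepsilon_{i+1} - \varepsilon_i = (\|u\|_\infty - \chi - \varepsilon)/N$, for all $0 \le i \le N - 1$. Applying Lemma 3.2 repeatedly and using $D_{\|u\|_\infty - \chi} = 1$ a.e., we obtain

$$D_\varepsilon \le \prod_{i=0}^{N-1} \exp(\delta T_{\varepsilon_i}).$$

Since $\exp(\delta T_{\varepsilon_i}) \in L^{p_i}(m)$, for $p_i < H(\chi + \varepsilon_i)/\delta$ (Lemma 3.1), the generalized Hölder inequality yields $D_\varepsilon \in L^p(m)$, where

$$\frac{1}{p} = \sum_{i=0}^{N-1} \frac{1}{p_i} > \sum_{i=0}^{N-1} \frac{\delta}{H(\chi + \varepsilon_i)} \to \int_\varepsilon^{\|u\|_\infty - \chi} \frac{ds}{H(\chi + s)},$$

as $N \to \infty$. This proves the second conclusion of Theorem A.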
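The passage from the Riemann sum to the integral in (1.3) can be illustrated numerically. The sketch below is only an illustration under a loud assumption: it replaces the true entropy function by the hypothetical Gaussian-type model $H(a) = (a - \chi)^2/(2\sigma^2)$, which is finite everywhere, unlike the actual $H$. For this model the integral equals $2\sigma^2(1/\varepsilon - 1/b)$ with $b = \|u\|_\infty - \chi$. The names `gaussian_entropy` and `riemann_bound` are invented for the example.

```python
def gaussian_entropy(chi, sigma2):
    """Hypothetical model H(a) = (a - chi)^2 / (2*sigma2); an assumption, not the true H."""
    return lambda a: (a - chi) ** 2 / (2.0 * sigma2)

def riemann_bound(H, chi, eps, u_sup, N):
    """Left-endpoint sum  sum_{i<N} delta / H(chi + eps_i)  over the uniform
    partition eps = eps_0 < ... < eps_N = u_sup - chi used in the proof."""
    delta = (u_sup - chi - eps) / N
    return sum(delta / H(chi + eps + i * delta) for i in range(N))
```

With `H = gaussian_entropy(0.0, 1.0)`, `eps = 0.5` and `u_sup = 2.0`, the sums decrease toward the exact integral $3$ as $N$ grows, so the admissible exponents in this toy model approach $p = 1/3$ from below.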


Remark. It is possible to show that, in fact,

$$D_\varepsilon(x) = \exp\left\{ \int_\varepsilon^{\|u\|_\infty - \chi} T_\eta(x)\, d\eta \right\},$$

for $m$-almost every $x \in M$.

Case 2. $u \in L^\infty(m)$. We need to show that for every $\varepsilon > 0$ there exists $p > 0$ such that $D_\varepsilon \in L^p(m)$. The following lemma asserts that we may as well work with a smooth $u$.

Lemma 3.3. For every $\delta > 0$ there exists a $C^\infty$ function $\tilde{u} : M \to \mathbb{R}$ such that $u \le \tilde{u}$ and

$$\int_M (\tilde{u} - u)\, dm < \delta.$$

Proof. We will first find a continuous function $w : M \to \mathbb{R}$ such that $u \le w$ and $\int (w - u) < \delta$ and then regularize $w$. Let $\eta > 0$ be arbitrary. By Luzin’s theorem, there exists a continuous function $g : M \to [0, \infty)$ such that $\|g\|_\infty \le \|u\|_\infty$ and $m(A) < \eta$, where $A = \{x \in M : u(x) \ne g(x)\}$. The set $A$ is Borel measurable, so there exists an open set $U$ such that $A \subset U$ and $m(U \setminus A) < \eta$. Let $V$ be an open set such that $A \subset V \subset \overline{V} \subset U$. By Urysohn’s lemma, there exists a continuous function $h : M \to [0, 1]$ such that $h = 0$ on the complement of $U$ and $h = 1$ on $\overline{V}$. Let $k \in (\|u\|_\infty, 2\|u\|_\infty)$ and define $w = g + kh$.

On $A$, $w \ge kh = k > u$. On the complement of $A$, $g = u$, so $w = u + kh \ge u$. Observe that $w = u$ on the complement of $U$ and that $g + kh \le 3\|u\|_\infty$ on $U$. Therefore,

$$\int_M (w - u)\, dm = \int_U (g + kh - u)\, dm \le \int_U (g + kh)\, dm \le m(U) \cdot 3\|u\|_\infty \le 6\eta \|u\|_\infty.$$

Let $w_a = w + a$, where $a > 0$ is a small constant, so that $w_a - u \ge a$. Finally, let $\tilde{u}$ be a $C^\infty$ regularization of $w_a$ with $\|\tilde{u} - w_a\|_\infty$ sufficiently small so that $\tilde{u} \ge u$. It is easy to see that the integrals of $\tilde{u}$ and $w_a$ are the same. Since

$$\int_M (\tilde{u} - u)\, dm = \int_M (w_a - u)\, dm \le 6\eta \|u\|_\infty + a,$$

by choosing $\eta$ and $a$ sufficiently small, we obtain a desired function $\tilde{u}$.



Now fix an $\varepsilon \in (0, \|u\|_\infty - \chi)$ and $0 < \delta < \varepsilon$. Let $\tilde u$ be a $C^\infty$ function on $M$ supplied by Lemma 3.3 such that $u \le \tilde u$ and $\tilde\chi - \chi < \delta$, where $\tilde\chi = \int_M \tilde u\, dm$. Denote by $\tilde D_\eta$ the $(\tilde u, \eta)$-regularity function. Then:
\[
\begin{aligned}
D_\varepsilon(x) &\le \sup_{t \ge 0} \frac{\exp \int_0^t \tilde u(f_s x)\, ds}{e^{(\chi + \varepsilon)t}} \\
&\le \sup_{t \ge 0} \frac{\exp \int_0^t \tilde u(f_s x)\, ds}{e^{(\tilde\chi + \varepsilon - \delta)t}} \\
&= \tilde D_{\varepsilon - \delta}(x),
\end{aligned}
\]
which lies in $L^p(m)$, for some $p > 0$, by Case 1. Therefore $D_\varepsilon \in L^p(m)$, completing the proof of Theorem A.

4. Proof of Theorem B

Let $\Phi = \{f_t\}$ be a $C^2$ volume preserving Anosov flow. Fix a Lyapunov bundle $E$ corresponding to a Lyapunov exponent $\chi$ and denote the set of Lyapunov regular points by $\mathcal{R}$. Let $x \in \mathcal{R}$ and consider the Second Variational Equation for the flow on $E$:
\[ \frac{d}{dt}\, T_x^E f_t = (T^E_{f_t x} X)\, T_x^E f_t, \tag{4.1} \]
where $X$ is the Anosov vector field. Choose a measurable orthonormal frame $F = \{F_1, \ldots, F_k\}$ for $E$ and define a vector bundle map $\mathcal{T} : E \to \mathcal{R} \times \mathbb{R}^k$ by $\mathcal{T}(F_i(x)) = (x, e_i)$, where $e_i$ is the $i$-th element of the standard basis of $\mathbb{R}^k$; extend $\mathcal{T}$ linearly over each fiber. Then $\mathcal{T}$ trivializes $E$, transforming (4.1) into a family of differential equations parametrized by $x \in \mathcal{R}$:
\[ \dot X = A_x(t) X, \]

where $A_x(t)$ is the matrix of $T^E_{f_t x} X$ relative to the frame $F$. As in § 2.2, for each $x \in \mathcal{R}$ we obtain an orthogonal matrix $U_x(t)$ and an upper triangular matrix $B_x(t)$ whose properties are described by Lemma 2.4. Observe that since
\[ \sup_{x \in \mathcal{R}} \|T_x^E X\| \le \sup_{x \in M} \|T_x X\| < \infty, \]
it follows that $\alpha = \sup\{\|A_x(t)\| : x \in \mathcal{R},\ t \in \mathbb{R}\} < \infty$. Furthermore, by Corollary 2.5,
\[ \lim_{t \to \infty} \frac{1}{t} \int_0^t r(B_x(s))\, ds = \chi. \tag{4.2} \]
It is well-known that for every matrix $M$ and $\delta > 0$ there exists a norm such that $\|M\| < r(M) + \delta$. The following lemma is a slight generalization of this result.


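Lemma 4.1. Let $\beta > 0$ be fixed and denote by $\mathcal{B}$ the set of all upper triangular $k \times k$ matrices such that for all $B = [b_{ij}] \in \mathcal{B}$,
\[ \max_{i < j} |b_{ij}| \le \beta. \]
Then for every $\delta > 0$ there exists a norm $\|\cdot\|_\delta$ on $\mathbb{R}^k$ such that for all $B \in \mathcal{B}$, the induced operator norm of $B$ satisfies $\|B\|_\delta < r(B) + \delta$.

Proof. The proof is an adaptation of one of the standard proofs (see, e.g., [Kre98], Theorem 3.32). Define
\[ \varepsilon = \min\left\{1, \frac{\delta}{(k-1)\beta}\right\} \]
and $D = \operatorname{diag}(1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{k-1})$. Then for any $B = [b_{ij}] \in \mathcal{B}$,
\[
C = D^{-1} B D =
\begin{bmatrix}
b_{11} & \varepsilon b_{12} & \varepsilon^2 b_{13} & \cdots & \varepsilon^{k-1} b_{1k} \\
0 & b_{22} & \varepsilon b_{23} & \cdots & \varepsilon^{k-2} b_{2k} \\
0 & 0 & b_{33} & \cdots & \varepsilon^{k-3} b_{3k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & b_{kk}
\end{bmatrix}.
\]
For a $k \times k$ matrix $A = [a_{ij}]$, write
\[ \|A\|_\infty = \max_{1 \le i \le k} \sum_{j=1}^k |a_{ij}|. \]
Then, for all $B = [b_{ij}] \in \mathcal{B}$:
\[ \|C\|_\infty \le \max_{1 \le i \le k} |b_{ii}| + (k-1)\varepsilon\beta \le r(B) + \delta. \]
We define a norm on $\mathbb{R}^k$ by
\[ \|v\|_\delta = \|D^{-1} v\|_\infty, \]
where $\|(w_1, \ldots, w_k)\|_\infty = \max_i |w_i|$. It follows that for all $B \in \mathcal{B}$,
\[
\begin{aligned}
\|Bv\|_\delta &= \|D^{-1} B v\|_\infty \\
&= \|C D^{-1} v\|_\infty \\
&\le \|C\|_\infty \|D^{-1} v\|_\infty \\
&= \|C\|_\infty \|v\|_\delta \\
&\le (r(B) + \delta)\, \|v\|_\delta.
\end{aligned}
\]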
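The scaling argument in the proof of Lemma 4.1 can be checked numerically. The following is a minimal sketch, not part of the paper: it assumes that $r(B)$ dominates $\max_i |b_{ii}|$ (as the proof uses), and the function name `scaled_inf_norm_bound` and the sample matrix are purely illustrative.

```python
def scaled_inf_norm_bound(B, delta):
    """Return (||D^{-1} B D||_inf, max|b_ii| + delta) for an upper triangular B.

    D = diag(1, eps, ..., eps^(k-1)) with eps = min{1, delta / ((k-1) beta)},
    where beta is the largest off-diagonal entry of B in absolute value.
    """
    k = len(B)
    # beta bounds the strictly upper triangular entries, as in the lemma.
    beta = max((abs(B[i][j]) for i in range(k) for j in range(i + 1, k)),
               default=0.0)
    # When there are no off-diagonal entries, any eps <= 1 works.
    eps = 1.0 if beta == 0 or k == 1 else min(1.0, delta / ((k - 1) * beta))
    # Entry (i, j) of C = D^{-1} B D equals eps^(j - i) * b_ij for j >= i.
    C = [[eps ** (j - i) * B[i][j] if j >= i else 0.0 for j in range(k)]
         for i in range(k)]
    # Maximum absolute row sum, i.e. the operator norm induced by max-norm.
    row_sum_norm = max(sum(abs(c) for c in row) for row in C)
    diag_bound = max(abs(B[i][i]) for i in range(k)) + delta
    return row_sum_norm, diag_bound


# Illustrative matrix with large off-diagonal entries.
B = [[0.5, 3.0, -2.0],
     [0.0, -0.2, 4.0],
     [0.0, 0.0, 0.1]]
norm_C, bound = scaled_inf_norm_bound(B, delta=0.01)
print(norm_C <= bound)
```

The point of the sketch is that the off-diagonal entries, however large, are damped by powers of $\varepsilon$, so only the diagonal contributes to the norm up to an error of at most $\delta$.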



By Lemma 2.4 and (4.2), $\|B_x(t)\| \le 2\|A_x(t)\| \le 2\alpha$, for all $x \in \mathcal{R}$ and $t \in \mathbb{R}$. Thus we can apply Lemma 4.1 to the family of matrices $\mathcal{B} = \{B_x(t) : x \in \mathcal{R},\ t \in \mathbb{R}\}$. For each $\delta > 0$, we obtain a norm $\|\cdot\|_\delta$ on $\mathbb{R}^k$, which induces an operator matrix norm we also denote by $\|\cdot\|_\delta$. This yields $\|B_x(t)\|_\delta \le r(B_x(t)) + \delta$, for all $t \in \mathbb{R}$ and $x \in \mathcal{R}$.

Now consider the unique solution $X(t)$ to $\dot X = A_x(t) X$ satisfying the initial condition $X(0) = I$ and the corresponding solution $Z(t) = U_x(t)^{-1} X(t)$ to $\dot Z = B_x(t) Z$. Since $F$ is orthonormal, $U_x(0) = I$, so $Z(0) = I$. Thus
\[ Z(t) = I + \int_0^t B_x(s) Z(s)\, ds. \]
It follows that
\[ \|Z(t)\|_\delta \le 1 + \int_0^t \|B_x(s)\|_\delta\, \|Z(s)\|_\delta\, ds, \]
so by Grönwall's inequality and Lemma 4.1,
\[ \|Z(t)\|_\delta \le \exp \int_0^t \|B_x(s)\|_\delta\, ds \le e^{\delta t} \exp \int_0^t r(B_x(s))\, ds. \tag{4.3} \]
Since $U_x(t)$ is orthogonal, its operator norm with respect to the original norm on $\mathbb{R}^k$ equals one. The old norm and the new norm on $\mathbb{R}^k$ are uniformly equivalent, so there exists a uniform constant $K_\delta > 0$ such that $\|U_x(t)\|_\delta \le K_\delta \|U_x(t)\| = K_\delta$. Therefore,
\[ \|X(t)\|_\delta = \|U_x(t) Z(t)\|_\delta \le K_\delta \|Z(t)\|_\delta. \tag{4.4} \]

Now let $\Theta(x, t) = r(B_x(t))$, for any choice of the matrices $B_x(t)$ as above. Note that by Lemma 2.6, $\Theta$ is well-defined.

Lemma 4.2. For all $x \in \mathcal{R}$ and $s, t \in \mathbb{R}$, we have $\Theta(x, s + t) = \Theta(f_s x, t)$.

Proof. Fix $x \in \mathcal{R}$. Recall how the matrices $B_x(t)$ are constructed (cf. 2.4): first, we choose a basis $v = (v_1, \ldots, v_k)$ of $\{x\} \times \mathbb{R}^k$, and apply the Gram-Schmidt procedure to the matrix $[v_1(t), \ldots, v_k(t)]$, where $\dot v_i(t) = A_x(t) v_i(t)$ and $v_i(0) = v_i$, which yields a family of orthogonal matrices $U_x^v(t)$. Then we define
\[ B_x^v(t) = \{U_x^v(t)\}^{-1} A_x(t)\, U_x^v(t) - \{U_x^v(t)\}^{-1}\, \dot U_x^v(t). \]
Now fix $s \in \mathbb{R}$. We define suitable families of matrices $B_x^v(t)$ and $B^w_{f_s x}(t)$ by appropriately choosing bases $v = (v_1, \ldots, v_k)$ of $\mathcal{T}(E(x)) = \{x\} \times \mathbb{R}^k$ and $w = (w_1, \ldots, w_k)$ of $\mathcal{T}(E(f_s x)) = \{f_s x\} \times \mathbb{R}^k$, respectively. This can be done as follows. Define $v$ by $(x, v_i) = \mathcal{T}(F_i(x))$ $(1 \le i \le k)$. This gives rise to a family of orthogonal matrices $U_x^v(t)$ and the corresponding family $B_x^v(t)$. Define $w$ by $(f_s x, w_i) = \mathcal{T}(T_x f_s(F_i(x)))$ $(1 \le i \le k)$. This gives rise to the matrices $U^w_{f_s x}(t)$ and the corresponding family $B^w_{f_s x}(t)$.


Let $v_i(t)$ and $w_i(t)$ be the solutions of the differential equations $\dot v = A_x(t) v$ and $\dot w = A_{f_s x}(t) w$ with initial conditions $v_i$ and $w_i$, respectively. Then:
\[ w_i(t) = \mathcal{T}(T_{f_s x} f_t (T_x f_s(F_i(x)))) = \mathcal{T}(T_x f_{s+t}(F_i(x))) = v_i(s + t). \]
This implies that $U_x^v(s + t) = U^w_{f_s x}(t)$. Furthermore, since $A_x(t)$ is the matrix of $T^E_{f_t x} X$ in the frame $F$, $A_x(t) = [T^E_{f_t x} X]_F$, it follows that
\[ A_x(s + t) = [T^E_{f_{s+t} x} X]_F = [T^E_{f_t(f_s x)} X]_F = A_{f_s x}(t). \]
Therefore,
\[
\begin{aligned}
B^w_{f_s x}(t) &= U^w_{f_s x}(t)^{-1} A_{f_s x}(t)\, U^w_{f_s x}(t) - U^w_{f_s x}(t)^{-1}\, \dot U^w_{f_s x}(t) \\
&= U_x^v(s + t)^{-1} A_x(s + t)\, U_x^v(s + t) - U_x^v(s + t)^{-1}\, \dot U_x^v(s + t) \\
&= B_x^v(s + t).
\end{aligned}
\]
It follows that $\Theta(x, s + t) = r(B_x^v(s + t)) = r(B^w_{f_s x}(t)) = \Theta(f_s x, t)$, as claimed.

Define a function $u : \mathcal{R} \to \mathbb{R}$ by $u(x) = r(B_x(0)) = \Theta(x, 0)$. By Lemma 2.7, $|u(x)| \le K \|A_x(0)\| \le K\alpha$, for $m$-a.e. $x$, hence $u \in L^\infty(m)$. Furthermore, by Lemma 4.2,
\[ u(f_t x) = r(B_{f_t x}(0)) = r(B_x(t)). \tag{4.5} \]
Combining (4.3), (4.4), and (4.5), we obtain
\[ \|X(t)\|_\delta \le K_\delta\, e^{\delta t} \exp \int_0^t u(f_s x)\, ds. \]
For each $\delta > 0$, we abuse the notation and denote the pullback of the norms $\|\cdot\|_\delta$ to $E$ via $\mathcal{T}$ by the same symbol. That is, for each $v \in E(x)$ $(x \in \mathcal{R})$, we set $\|v\|_\delta = \|\mathcal{T}(v)\|_\delta$. This defines a family of Finsler structures on $E$ with respect to which
\[ \|T_x^E f_t\|_\delta \le K_\delta\, e^{\delta t} \exp \int_0^t u(f_s x)\, ds, \]
for all $x \in \mathcal{R}$ and $t \ge 0$. Since any two norms on $\mathbb{R}^k$ are uniformly equivalent, for each $\delta > 0$ there exists a constant $A_\delta > 0$ such that $\|v\| \le A_\delta \|v\|_\delta$, for all $v \in E$, where $\|v\|$ denotes the original norm of $v$ defined by the Riemann structure on $M$. It follows that the norm of $T_x^E f_t$ with respect to the original norm on $E$ satisfies
\[ \|T_x^E f_t\| \le A_\delta\, \|T_x^E f_t\|_\delta \le A_\delta K_\delta\, e^{\delta t} \exp \int_0^t u(f_s x)\, ds. \]
This completes the proof of Theorem B.



References

[Ano67] Anosov, D.V.: Geodesic flows on closed Riemannian manifolds of negative curvature. Proc. Steklov Math. Inst. 90 (1967); AMS Translations, Providence, RI: Amer. Math. Soc., 1969
[BP01] Barreira, L., Pesin, Y.B.: Lyapunov exponents and smooth ergodic theory. University Lecture Series, Vol. 23, Providence, RI: Amer. Math. Soc., 2001
[BR75] Bowen, R., Ruelle, D.: The ergodic theory of Axiom A flows. Invent. Math. 29(3), 181–202 (1975)
[BS00a] Barreira, L., Saussol, B.: Multifractal analysis of hyperbolic flows. Comm. Math. Phys. 214(2), 339–371 (2000)
[BS00b] Barreira, L., Schmeling, J.: Sets of "non-typical" points have full topological entropy and full Hausdorff dimension. Israel J. Math. 116, 29–70 (2000)
[Dol04] Dolgopyat, D.: Limit theorems for partially hyperbolic systems. Trans. Amer. Math. Soc. 356(4), 1637–1689 (2004)
[Has94] Hasselblatt, B.: Regularity of the Anosov splitting and of horospheric foliations. Erg. Th. Dyn. Sys. 14(4), 645–666 (1994)
[Has97] Hasselblatt, B.: Regularity of the Anosov splitting II. Erg. Th. Dyn. Sys. 17(1), 169–172 (1997)
[HPS77] Hirsch, M.W., Pugh, C.C., Shub, M.: Invariant manifolds. Lecture Notes in Mathematics, Vol. 583, Berlin-New York: Springer-Verlag, 1977
[Kin68] Kingman, J.F.C.: The ergodic theory of subadditive stochastic processes. J. Royal Stat. Soc. B 30, 499–510 (1968)
[Kre98] Kress, R.: Numerical analysis. Grad. Text in Math., Vol. 181, Berlin-Heidelberg-New York: Springer, 1998
[Ose68] Oseledets, V.I.: A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197–221 (1968)
[Pet89] Petersen, K.E.: Ergodic theory. Cambridge Studies in Advanced Mathematics, Cambridge: Cambridge University Press, 1989
[PS01] Pesin, Ya.B., Sadovskaya, V.: Multifractal analysis of conformal Axiom A flows. Comm. Math. Phys. 216(2), 277–312 (2001)
[Rue79] Ruelle, D.: Ergodic theory of differentiable dynamical systems. Publications Math. de l'IHES 50, 27–58 (1979)
[Wad96] Waddington, S.: Large deviation asymptotics for Anosov flows. Annales de l'I.H.P., Section C 13(4), 445–484 (1996)
[Wal99] Walkden, C.P.: Stable ergodic properties of cocycles over hyperbolic attractors. Comm. Math. Phys. 205(2), 263–281 (1999)

Communicated by G. Gallavotti




Spectral Dimension and Random Walks on the Two Dimensional Uniform Spanning Tree

Martin T. Barlow, Robert Masson

Department of Mathematics, University of British Columbia, #121-1984 Mathematics Road, Vancouver, BC V6T 1Z2, Canada. E-mail: [email protected]

Received: 25 March 2010 / Accepted: 28 October 2010
Published online: 30 April 2011 – © Springer-Verlag 2011

Abstract: We study the simple random walk on the uniform spanning tree on $\mathbb{Z}^2$. We obtain estimates for the transition probabilities of the random walk, the distance of the walk from its starting point after $n$ steps, and exit times of both Euclidean balls and balls in the intrinsic graph metric. In particular, we prove that the spectral dimension of the uniform spanning tree on $\mathbb{Z}^2$ is 16/13 almost surely.

Research partially supported by NSERC (Canada) and by the Peter Wall Institute of Advanced Studies (UBC).
Research partially supported by NSERC (Canada).

1. Introduction

A spanning tree on a finite graph $G = (V, E)$ is a connected subgraph of $G$ which is a tree and has vertex set $V$. A uniform spanning tree in $G$ is a random spanning tree chosen uniformly from the set of all spanning trees. Let $Q_n = [-n, n]^d \subset \mathbb{Z}^d$, and write $\mathcal{U}_{Q_n}$ for a uniform spanning tree on $Q_n$. Pemantle [Pem91] showed that the weak limit of $\mathcal{U}_{Q_n}$ exists and is connected if and only if $d \le 4$. (He also showed that the limit does not depend on the particular sequence of sets $Q_n$ chosen, and that 'free' or 'wired' boundary conditions give rise to the same limit.) We will be interested in the case $d = 2$, and will call the limit the uniform spanning tree (UST) on $\mathbb{Z}^2$ and denote it by $\mathcal{U}$. For further information on USTs, see for example [BKPS04,BLPS01,Lyo98]. The UST can also be obtained as a limit as $p, q \to 0$ of the random cluster model; see [Häg95].

A loop erased random walk (LERW) on a graph is a process obtained by chronologically erasing the loops of a random walk on the graph. There is a close connection between the UST and the LERW. Pemantle [Pem91] showed that the unique path between any two vertices $v$ and $w$ in a UST on a finite graph $G$ has the same distribution as the loop-erasure of a simple random walk on $G$ from $v$ to $w$. Wilson [Wil96] then proved that a UST could be generated by a sequence of LERWs by the following algorithm. Pick an arbitrary vertex $v \in G$ and let $T_0 = \{v\}$. Now suppose that we have generated the tree $T_k$ and that $T_k$ does not span. Pick any point $w \in G \setminus T_k$ and let $T_{k+1}$ be the union of $T_k$ and the loop-erasure of a random walk started at $w$ and run until it hits $T_k$. We continue this process until we generate a spanning tree $T_m$. Then $T_m$ has the distribution of the UST on $G$.

We now fix our attention on $\mathbb{Z}^2$. By letting the root $v$ in Wilson's algorithm go to infinity, one sees that one can obtain the UST $\mathcal{U}$ on $\mathbb{Z}^2$ by first running an infinite LERW from a point $x_0$ (see Sect. 2 for the precise definition) to create the first path in $\mathcal{U}$, and then using Wilson's algorithm to generate the rest of $\mathcal{U}$. This construction makes it clear that $\mathcal{U}$ is a 1-sided tree: from each point $x$ there is a unique infinite (self-avoiding) path in $\mathcal{U}$.

Both the LERW and the UST on $\mathbb{Z}^2$ have conformally invariant scaling limits. Lawler, Schramm and Werner [LSW04] proved that the LERW in simply connected domains scales to $\mathrm{SLE}_2$ (Schramm-Loewner evolution with parameter 2). Using the relation between LERW and UST, this implies that the UST has a conformally invariant scaling limit in the sense of [Sch00], where the UST is regarded as a measure on the set of triples $(a, b, \gamma)$, where $a, b \in \mathbb{R}^2 \cup \{\infty\}$ and $\gamma$ is a path between $a$ and $b$. In addition [LSW04] proves that the UST Peano curve (the interface between the UST and the dual UST) has a conformally invariant scaling limit, which is $\mathrm{SLE}_8$.

In this paper we will study properties of the UST $\mathcal{U}$ on $\mathbb{Z}^2$. We have two natural metrics on $\mathcal{U}$: the intrinsic metric given by the shortest path in $\mathcal{U}$ between two points, and the Euclidean metric. For $x, y \in \mathbb{Z}^2$, let $\gamma(x, y)$ be the unique path in $\mathcal{U}$ between $x$ and $y$, and let $d(x, y) = |\gamma(x, y)|$ be its length. If $U_0$ is a connected subset of $\mathcal{U}$ then we write $\gamma(x, U_0)$ for the unique path from $x$ to $U_0$. Write $\gamma(x, \infty)$ for the path from $x$ to infinity. We define balls in the intrinsic metric by
\[ B_d(x, r) = \{y : d(x, y) \le r\} \]
and let $|B_d(x, r)|$ be the number of points in $B_d(x, r)$ (the volume of $B_d(x, r)$).
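Wilson's algorithm as described above can be sketched on a finite box. This is an illustrative sketch only: the function name `wilson_ust`, the choice of root and the vertex ordering are assumptions, and it samples a spanning tree of the finite box $[-n, n]^2$, not the infinite-volume UST on $\mathbb{Z}^2$. It uses the standard observation that remembering only the last exit from each visited vertex is equivalent to chronological loop-erasure.

```python
import random


def wilson_ust(n, seed=0):
    """Sample a spanning tree of the box [-n, n]^2 via Wilson's algorithm.

    Returns a dict mapping every non-root vertex to its parent, i.e. the
    next vertex on its tree path towards the root (0, 0).
    """
    rng = random.Random(seed)
    vertices = [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)]

    def neighbours(v):
        x, y = v
        cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        return [(a, b) for a, b in cand if abs(a) <= n and abs(b) <= n]

    root = (0, 0)
    in_tree = {root}          # T_0 = {root}
    parent = {}
    for start in vertices:
        # Random walk from `start` until it hits the current tree T_k.
        # Overwriting parent[v] at each visit keeps only the last exit
        # from v, which performs the loop-erasure implicitly.
        v = start
        while v not in in_tree:
            parent[v] = rng.choice(neighbours(v))
            v = parent[v]
        # Retrace the loop-erased path and attach it: T_{k+1}.
        v = start
        while v not in in_tree:
            in_tree.add(v)
            v = parent[v]
    return parent


tree = wilson_ust(3)
print(len(tree))  # one parent pointer per non-root vertex
```

Following the `parent` pointers from any vertex gives its tree path to the root, so the intrinsic distance $d(0, x)$ in the sampled tree is the number of pointers followed.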
We write
\[ B(x, r) = \{y \in \mathbb{Z}^d : |x - y| \le r\} \]
for balls in the Euclidean metric, and let $B_R = B(R) = B(0, R)$, $B_d(R) = B_d(0, R)$. Our goals in this paper are to study the volume of balls in the $d$ metric, to obtain estimates of the degree of 'metric distortion' between the intrinsic and Euclidean metrics, and to study the behaviour of a simple random walk (SRW) on $\mathcal{U}$.

To state our results we need some further notation. Let $G(n)$ be the expected number of steps of an infinite LERW started at 0 until it leaves $B(0, n)$. Clearly $G(n)$ is strictly increasing; extend $G$ to a continuous strictly increasing function from $[1, \infty)$ to $[1, \infty)$, with $G(1) = 1$. Let $g(t)$ be the inverse of $G$, so that $G(g(t)) = t = g(G(t))$ for all $t \in [1, \infty)$. By [Ken00,Mas09] we have
\[ \lim_{n \to \infty} \frac{\log G(n)}{\log n} = \frac{5}{4}. \tag{1.1} \]
Our first result is on the relation between balls in the two metrics.

Theorem 1.1. (a) There exist constants $c, C > 0$ such that for all $r \ge 1$, $\lambda \ge 1$,
\[ \mathbf{P}\big(B_d(0, \lambda^{-1} G(r)) \not\subset B(0, r)\big) \le C e^{-c\lambda^{2/3}}. \tag{1.2} \]
(b) For all $\varepsilon > 0$, there exist $c(\varepsilon), C(\varepsilon) > 0$ and $\lambda_0(\varepsilon) \ge 1$ such that for all $r \ge 1$ and $\lambda \ge 1$,
\[ \mathbf{P}\big(B(0, r) \not\subset B_d(0, \lambda G(r))\big) \le C \lambda^{-4/15 + \varepsilon}, \tag{1.3} \]
and for all $r \ge 1$ and all $\lambda \ge \lambda_0(\varepsilon)$,
\[ \mathbf{P}\big(B(0, r) \not\subset B_d(0, \lambda G(r))\big) \ge c \lambda^{-4/5 - \varepsilon}. \tag{1.4} \]

We do not expect any of these bounds to be optimal. In fact, we could improve the exponent in the bound (1.2), but to simplify our proofs we have not tried to find the best exponent that our arguments yield when we have exponential bounds. However, we will usually attempt to find the best exponent given by our arguments when we have polynomial bounds, as in (1.3) and (1.4). The reason we have a polynomial lower bound in (1.4) is that if we have a point $w$ such that $|w| = r$, then the probability that $\gamma(0, w)$ leaves the ball $B(0, \lambda r)$ is bounded below by $\lambda^{-1}$ (see Lemma 2.6). This in turn implies that the probability that $w \notin B_d(0, \lambda G(r))$ is bounded from below by $c\lambda^{-4/5 - \varepsilon}$ (Proposition 2.7).

Theorem 1.1 leads immediately to bounds on the tails of $|B_d(0, R)|$. However, while (1.2) gives a good bound on the upper tail, (1.3) only gives polynomial control on the lower tail. By working harder (see Theorem 3.4) we can obtain the following stronger bound.

Theorem 1.2. Let $R \ge 1$, $\lambda \ge 1$. Then
\[ \mathbf{P}\big(|B_d(0, R)| \ge \lambda g(R)^2\big) \le C e^{-c\lambda^{1/3}}, \tag{1.5} \]
\[ \mathbf{P}\big(|B_d(0, R)| \le \lambda^{-1} g(R)^2\big) \le C e^{-c\lambda^{1/9}}. \tag{1.6} \]
So in particular there exists $C$ such that for all $R \ge 1$,
\[ C^{-1} g(R)^2 \le \mathbf{E}|B_d(0, R)| \le C g(R)^2. \tag{1.7} \]

Remark 1.3. (1) The two main ingredients for the proof of these theorems are Wilson's algorithm, plus good control of the LERWs. For the LERW we need bounds on the length of a LERW run from 0 until it leaves a domain $D \subset \mathbb{Z}^d$; see Theorem 2.2. We remark that although we do not need the stretched exponential tails given there, we do need quite rapid decay of the tail probabilities. The UST is constructed from a large number of LERW paths, and a single 'bad path' (i.e. one which is much too short or much too long) could mean the inclusions in Theorem 1.1 fail. In addition, to control the way that Wilson's algorithm 'fills in' the tree, we use the discrete Beurling estimate, which states that the probability a random walk started at 0 will hit a path $\gamma$ connecting $B(0, n)$ with $B(0, 2n)^c$ is bounded below by a constant. It is possible that a similar strategy would work on other graphs (such as $\mathbb{Z}^3$), conditional on having an analogue to Theorem 2.2, and also a Beurling estimate giving lower bounds on the probability a SRW hits a LERW path $\alpha$ inside a ball $B(0, n)$. Since however such bounds could not be uniform in $\alpha$ and $n$, some extensions of the arguments here would certainly be needed.

(2) We do not use directly the fact that the LERW and UST have SLE limits. Indeed, there is only one point in the whole argument leading to the results of this paper where this connection is needed. That is in [Mas09], where it is used to show that the function $\mathrm{Es}(n)$ (the probability a LERW and independent SRW do not hit inside $B(0, n)$) satisfies $\mathrm{Es}(n) \approx n^{-3/4}$.



We now discuss the simple random walk on the UST $\mathcal{U}$. To help distinguish between the various probability laws, we will use the following notation. For LERW and a simple random walk in $\mathbb{Z}^2$ we will write $P^z$ for the law of the process started at $z$. The probability law of the UST will be denoted by $\mathbf{P}$, and the UST will be defined on a probability space $(\Omega, \mathbf{P})$; we let $\omega$ denote elements of $\Omega$. For the tree $\mathcal{U}(\omega)$ write $x \sim y$ if $x$ and $y$ are connected by an edge in $\mathcal{U}$, and for $x \in \mathbb{Z}^2$ let
\[ \mu_x = \mu_x(\omega) = |\{y : x \sim y\}| \]
be the degree of the vertex $x$. The random walk on $\mathcal{U}(\omega)$ is defined on a second space $\mathcal{D} = (\mathbb{Z}^2)^{\mathbb{Z}_+}$. Let $X_n$ be the coordinate maps on $\mathcal{D}$, and for each $\omega \in \Omega$ let $P_\omega^x$ be the probability on $\mathcal{D}$ which makes $X = (X_n, n \ge 0)$ a simple random walk on $\mathcal{U}(\omega)$ started at $x$. Thus we have $P_\omega^x(X_0 = x) = 1$, and
\[ P_\omega^x(X_{n+1} = y \mid X_n = x) = \frac{1}{\mu_x(\omega)} \quad \text{if } y \sim x. \]
We remark that since the UST $\mathcal{U}$ is a subgraph of $\mathbb{Z}^2$ the SRW $X$ is recurrent. We define the heat kernel (transition density) with respect to $\mu$ by
\[ p_n^\omega(x, y) = \mu_y^{-1} P_\omega^x(X_n = y). \tag{1.8} \]
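To make the normalisation in (1.8) concrete, the following sketch computes the heat kernel exactly on a small fixed tree. It is an illustration under stated assumptions, not the paper's method: the adjacency dictionary `tree` is a hypothetical five-vertex example, and `heat_kernel` is an illustrative name. Dividing by the degree $\mu_y$ is what makes the kernel symmetric, $p_n(x, y) = p_n(y, x)$.

```python
from fractions import Fraction


def heat_kernel(adj, x, n):
    """Return {y: p_n(x, y)} with p_n(x, y) = P^x(X_n = y) / mu_y.

    `adj` maps each vertex to the list of its neighbours; mu_y = len(adj[y]).
    """
    dist = {x: Fraction(1)}                 # law of X_0
    for _ in range(n):
        nxt = {}
        for v, p in dist.items():
            for w in adj[v]:
                # Each neighbour is chosen with probability 1 / mu_v.
                nxt[w] = nxt.get(w, Fraction(0)) + p / len(adj[v])
        dist = nxt
    return {y: p / len(adj[y]) for y, p in dist.items()}


# Hypothetical small tree: edges 0-1, 1-2, 1-3, 3-4.
tree = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3]}
print(heat_kernel(tree, 0, 2)[3] == heat_kernel(tree, 3, 2)[0])
```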

Define the stopping times
\[ \tau_R = \min\{n \ge 0 : d(0, X_n) > R\}, \tag{1.9} \]
\[ \widetilde{\tau}_r = \min\{n \ge 0 : |X_n| > r\}. \tag{1.10} \]
Given functions $f$ and $g$ we write $f \approx g$ to mean
\[ \lim_{n \to \infty} \frac{\log f(n)}{\log g(n)} = 1, \]
and $f \asymp g$ to mean that there exists $C \ge 1$ such that
\[ C^{-1} f(n) \le g(n) \le C f(n), \quad n \ge 1. \]
The following summarizes our main results on the behaviour of $X$. Some more precise estimates, including heat kernel estimates, can be found in Theorems 4.3–4.7 in Sect. 4.

Theorem 1.4. We have for $\mathbf{P}$-a.a. $\omega$, $P_\omega^0$-a.s.,
\[ p_{2n}^\omega(0, 0) \approx n^{-8/13}, \tag{1.11} \]
\[ \tau_R \approx R^{13/5}, \tag{1.12} \]
\[ \widetilde{\tau}_r \approx r^{13/4}, \tag{1.13} \]
\[ \max_{0 \le k \le n} d(0, X_k) \approx n^{5/13}. \tag{1.14} \]



We now explain why these exponents arise. If $G$ is a connected graph, with graph metric $d$, we can define the volume growth exponent (called by physicists the fractal dimension of $G$) by
\[ d_f = d_f(G) = \lim_{R \to \infty} \frac{\log |B_d(0, R)|}{\log R}, \]
if this limit exists. Using this notation, Theorem 1.2 and (1.1) imply that
\[ d_f(\mathcal{U}) = 8/5, \quad \mathbf{P}\text{-a.s.} \]
Following work by mathematical physicists in the early 1980s, random walks on graphs with fractal growth of this kind have been studied in the mathematical literature. (Much of the initial mathematical work was done on diffusions on fractal sets, but many of the same results carry over to the graph case.) This work showed that the behaviour of SRW on a (sufficiently regular) graph $G$ can be summarized by two exponents. The first of these is the volume growth exponent $d_f$, while the second, denoted $d_w$, and called the walk dimension, can be defined by
\[ d_w = d_w(G) = \lim_{R \to \infty} \frac{\log E^0 \tau_R}{\log R} \quad \text{(if this limit exists).} \]
Here 0 is a base point in the graph, and $\tau_R$ is as defined in (1.9); it is easy to see that if $G$ is connected then the limit is independent of the base point. One finds that $d_f \ge 1$, $2 \le d_w \le 1 + d_f$, and that all these values can arise; see [Bar04].

Many of the early papers required quite precise knowledge of the structure of the graph in order to calculate $d_f$ and $d_w$. However, [BCK05] showed that in some cases it is sufficient to know two facts: the volume growth of balls, and the growth of effective resistance between points in the graph. Write $R_{\mathrm{eff}}(x, y)$ for the effective resistance between points $x$ and $y$ in a graph $G$; see Sect. 3 for a precise definition. The results of [BCK05] imply that if $G$ has uniformly bounded vertex degree, and there exist $\alpha > 0$, $\zeta > 0$ such that
\[ c_1 R^\alpha \le |B_d(x, R)| \le c_2 R^\alpha, \quad x \in G,\ R \ge 1, \tag{1.15} \]
\[ c_1 d(x, y)^\zeta \le R_{\mathrm{eff}}(x, y) \le c_2 d(x, y)^\zeta, \quad x, y \in G, \tag{1.16} \]
then writing $\tau_R^x = \min\{n : d(x, X_n) > R\}$,
\[ p_{2n}(x, x) \asymp n^{-\alpha/(\alpha + \zeta)}, \quad x \in G,\ n \ge 1, \tag{1.17} \]
\[ E^x \tau_R^x \asymp R^{\alpha + \zeta}, \quad x \in G,\ R \ge 1. \tag{1.18} \]
(They also obtained good estimates on the transition probabilities $P^x(X_n = y)$; see [BCK05, Theorem 1.3].) From (1.17) and (1.18) one sees that if $G$ satisfies (1.15) and (1.16) then
\[ d_f = \alpha, \quad d_w = \alpha + \zeta. \]
The decay $n^{-d_f/d_w}$ for the transition probabilities in (1.17) can be explained as follows. If $R \ge 1$ and $2n = R^{d_w}$ then with high probability $X_{2n}$ will be in the ball $B(x, cR)$. This ball has $cR^{d_f} \approx c n^{d_f/d_w}$ points, and so the average value of $p_{2n}(x, y)$ on this ball will be $n^{-d_f/d_w}$. Given enough regularity on $G$, this average value will then be close to the actual value of $p_{2n}(x, x)$.



In the physics literature a third exponent, called the spectral dimension, was introduced; this can be defined by
\[ d_s(G) = -2 \lim_{n \to \infty} \frac{\log P_\omega^x(X_{2n} = x)}{\log 2n} \quad \text{(if this limit exists).} \tag{1.19} \]
This gives the rate of decay of the transition probabilities; one has $d_s(\mathbb{Z}^d) = d$. The discussion above indicates that the three indices $d_f$, $d_w$ and $d_s$ are not independent, and that given enough regularity in the graph $G$ one expects that
\[ d_s = \frac{2 d_f}{d_w}. \]
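As a sanity check on definition (1.19), the statement $d_s(\mathbb{Z}^d) = d$ can be examined numerically in the simplest case $d = 1$, where the return probability has the closed form $P(X_{2n} = 0) = \binom{2n}{n} 4^{-n}$. The sketch below is illustrative only (the function name is an assumption); note that the convergence is logarithmically slow, so the finite-$n$ ratio is only close to 1, not equal to it.

```python
from math import lgamma, log


def spectral_dimension_estimate_Z(n):
    """Finite-n value of -2 log P(X_{2n} = 0) / log(2n) for SRW on Z."""
    # log C(2n, n) - 2n log 2, using log-gamma to avoid huge integers.
    log_p = lgamma(2 * n + 1) - 2 * lgamma(n + 1) - 2 * n * log(2)
    return -2 * log_p / log(2 * n)


for n in (10**2, 10**4, 10**6):
    print(n, spectral_dimension_estimate_Z(n))
```

Since $P(X_{2n} = 0) \sim (\pi n)^{-1/2}$, the printed ratios decrease towards 1 as $n$ grows, consistent with $d_s(\mathbb{Z}) = 1$.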

For graphs satisfying (1.15) and (1.16) one has $d_s = 2\alpha/(\alpha + \zeta)$. Note that if $G$ is a tree and satisfies (1.15) then $R_{\mathrm{eff}}(x, y) = d(x, y)$ and so (1.16) holds with $\zeta = 1$. Thus
\[ d_f = \alpha, \quad d_w = \alpha + 1, \quad d_s = \frac{2\alpha}{\alpha + 1}. \tag{1.20} \]
For random graphs arising from models in statistical physics, such as critical percolation clusters or the UST, random fluctuations will mean that one cannot expect (1.15) and (1.16) to hold uniformly. Nevertheless, providing similar estimates hold with high enough probability, it was shown in [BJKS08] and [KM08] that one can obtain enough control on the properties of the random walk $X$ to calculate $d_f$, $d_w$ and $d_s$. An additional contribution of [BJKS08] was to show that it is sufficient to estimate the volume and resistance growth for balls from one base point. In Sect. 4, we will use these methods to show that (1.20) holds for the UST, namely that

Theorem 1.5. We have for $\mathbf{P}$-a.a. $\omega$,
\[ d_f(\mathcal{U}) = \frac{8}{5}, \quad d_w(\mathcal{U}) = \frac{13}{5}, \quad d_s(\mathcal{U}) = \frac{16}{13}. \tag{1.21} \]
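The arithmetic linking (1.1), (1.20) and the exponents in Theorems 1.4 and 1.5 can be traced with exact fractions. The following is a bookkeeping sketch only (the function and variable names are illustrative): it takes the LERW growth exponent 5/4 from (1.1), uses $|B_d(0, R)| \approx g(R)^2$ from Theorem 1.2 to get $\alpha = 2/(5/4)$, and then applies (1.20).

```python
from fractions import Fraction


def tree_exponents(alpha):
    """Exponents (d_f, d_w, d_s) from (1.20) for a tree with growth alpha."""
    d_f = Fraction(alpha)
    d_w = d_f + 1                 # zeta = 1 on a tree
    d_s = 2 * d_f / d_w
    return d_f, d_w, d_s


growth = Fraction(5, 4)           # G(n) ~ n^{5/4}, from (1.1)
alpha = 2 / growth                # |B_d(0, R)| ~ g(R)^2 ~ R^{2 * 4/5}
d_f, d_w, d_s = tree_exponents(alpha)

print(d_f, d_w, d_s)              # exponents of Theorem 1.5
print(d_f / d_w)                  # heat kernel decay exponent in (1.11)
print(d_w * growth)               # Euclidean exit time exponent in (1.13)
print(1 / d_w)                    # displacement exponent in (1.14)
```

The Euclidean exit time exponent arises because leaving the Euclidean ball of radius $r$ corresponds to intrinsic distance of order $G(r) \approx r^{5/4}$, giving $r^{(5/4) d_w}$.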

The methods of [BJKS08] and [KM08] were also used in [BJKS08] to study the incipient infinite cluster (IIC) for high dimensional oriented percolation, and in [KN09] to show the IIC for standard percolation in high dimensions has spectral dimension 4/3. These critical percolation clusters are close to trees and have $d_f = 2$ in their graph metric. Our results for the UST are the first time these exponents have been calculated for a two-dimensional model arising from the random cluster model. It is natural to ask about critical percolation in two dimensions, but in spite of what is known via SLE, the values of $d_w$ and $d_s$ appear at present to be out of reach.

The rest of this paper is laid out as follows. In Sect. 2, we define the LERW on $\mathbb{Z}^2$ and recall the results from [BM10,Mas09] which we will need. The paper [BM10] gives bounds on $M_D$, the length of the loop-erasure of a random walk run up to the first exit of a simply connected domain $D$. However, in addition to these bounds, we require estimates on $d(0, w)$, which by Wilson's algorithm is the length of the loop-erasure of a random walk started at 0 and run up to the first time it hits $w$; we obtain these bounds in Proposition 2.7.

In Sect. 3, we study the geometry of the two dimensional UST $\mathcal{U}$, and prove Theorems 1.1 and 1.2. In addition (see Proposition 3.6) we show that with high probability the electrical resistance in the network $\mathcal{U}$ between 0 and $B_d(0, R)^c$ is greater than $R/\lambda$. The proofs of all of these results involve constructing the UST $\mathcal{U}$ in a particular way using Wilson's algorithm and then applying the bounds on the lengths of LERW paths from Sect. 2.

In Sect. 4, we use the techniques from [BJKS08,KM08] and our results on the volume and effective resistance of $\mathcal{U}$ from Sect. 3 to prove Theorems 1.4 and 1.5.

Throughout the paper, we use $c$, $c'$, $C$, $C'$ to denote positive constants which may change between each appearance, but do not depend on any variable. If we wish to fix a constant, we will denote it with a subscript, e.g. $c_0$.

2. Loop Erased Random Walks

In this section, we look at LERW on $\mathbb{Z}^2$. We let $S$ be a simple random walk on $\mathbb{Z}^2$, and given a set $D \subset \mathbb{Z}^2$, let
\[ \sigma_D = \min\{j \ge 1 : S_j \in \mathbb{Z}^2 \setminus D\} \]
be the first exit time of the set $D$, and
\[ \xi_D = \min\{j \ge 1 : S_j \in D\} \]
be the first hitting time of the set $D$. If $w \in \mathbb{Z}^2$, we write $\xi_w$ for $\xi_{\{w\}}$. We also let $\sigma_R = \sigma_{B(R)}$ and use a similar convention for $\xi_R$. The outer boundary of a set $D \subset \mathbb{Z}^2$ is
\[ \partial D = \{x \in \mathbb{Z}^2 \setminus D : \text{there exists } y \in D \text{ such that } |x - y| = 1\}, \]
and its inner boundary is
\[ \partial_i D = \{x \in D : \text{there exists } y \in \mathbb{Z}^2 \setminus D \text{ such that } |x - y| = 1\}. \]
A path $\lambda$ in $\mathbb{Z}^2$ is a sequence (finite or infinite) $\lambda_0, \lambda_1, \ldots$ with $|\lambda_{j+1} - \lambda_j| = 1$. We occasionally will write $\lambda(j)$ rather than $\lambda_j$ if $j$ is a complicated expression.

Given a path $\lambda = [\lambda_0, \ldots, \lambda_m]$ in $\mathbb{Z}^2$, let $L(\lambda)$ denote its chronological loop-erasure. More precisely, we let
\[ s_0 = \max\{j : \lambda_j = \lambda_0\}, \]
and for $i > 0$,
\[ s_i = \max\{j : \lambda_j = \lambda(s_{i-1} + 1)\}. \]
Let $n = \min\{i : s_i = m\}$. Then
\[ L(\lambda) = [\lambda(s_0), \lambda(s_1), \ldots, \lambda(s_n)]. \]
We note that by Wilson's algorithm, $L(S[0, \xi_w])$ has the same distribution as $\gamma(0, w)$, the unique path from 0 to $w$ in the UST $\mathcal{U}$. We will therefore use $\gamma(0, w)$ to denote $L(S[0, \xi_w])$ even when we make no mention of the UST $\mathcal{U}$.

For positive integers $l$, let $\Omega_l$ be the set of paths $\omega = [0, \omega_1, \ldots, \omega_k] \subset \mathbb{Z}^2$ such that $\omega_j \in B(0, l)$, $j = 1, \ldots, k - 1$ and $\omega_k \in \partial B(0, l)$. For $n \ge l$, define the measure $\mu_{l,n}$ on $\Omega_l$ to be the distribution on $\Omega_l$ obtained by restricting $L(S[0, \sigma_n])$ to the part of the path from 0 to the first exit of $B(0, l)$. For a fixed $l$ and $\omega \in \Omega_l$, it was shown in [Law91] that the sequence $\mu_{l,n}(\omega)$ is Cauchy. Therefore, there exists a limiting measure $\mu_l$ such that
\[ \lim_{n \to \infty} \mu_{l,n}(\omega) = \mu_l(\omega). \]
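The definition of the chronological loop-erasure $L(\lambda)$ through the last-visit indices $s_0, s_1, \ldots, s_n$ translates directly into code. The sketch below is a minimal illustration (the function name `loop_erase` is an assumption); it works for any finite sequence of hashable points, such as lattice points given as tuples.

```python
def loop_erase(path):
    """Chronological loop-erasure L(path) via the last-visit indices s_i."""
    # last[v] is the largest index j with path[j] == v.
    last = {v: j for j, v in enumerate(path)}
    m = len(path) - 1
    erased = []
    j = 0
    while True:
        s = last[path[j]]      # s_i = max{ j' : path[j'] == path[j] }
        erased.append(path[s])
        if s == m:             # n = min{ i : s_i = m } reached
            return erased
        j = s + 1              # next point is lambda(s_i + 1)


walk = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0), (0, -1)]
print(loop_erase(walk))
```

In this example the walk returns to the origin at step 4, so the loop through $(1, 0)$, $(1, 1)$, $(0, 1)$ is erased and only the origin and the final point remain.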

The $\mu_l$ are consistent and therefore there exists a measure $\mu$ on infinite self-avoiding paths. We call the associated process the infinite LERW and denote it by $\widehat{S}$. We denote the exit time of a set $D$ for $\widehat{S}$ by $\widehat{\sigma}_D$. By Wilson's algorithm, $\widehat{S}[0, \infty)$ has the same distribution as $\gamma(0, \infty)$, the unique infinite path in $\mathcal{U}$ starting at 0. Depending on the context, either notation will be used.

For a set $D$ containing 0, we let $M_D$ be the number of steps of $L(S[0, \sigma_D])$. Notice that if $D = \mathbb{Z}^2 \setminus \{w\}$ and $S$ is a random walk started at $x$, then $M_D = d(x, w)$. In addition, if $D' \subset D$ then we let $M_{D',D}$ be the number of steps of $L(S[0, \sigma_D])$ while it is in $D'$, or equivalently the number of points in $D'$ that are on the path $L(S[0, \sigma_D])$. We let $\widehat{M}_n$ be the number of steps of $\widehat{S}[0, \widehat{\sigma}_n]$.

As in the Introduction, we set $G(n) = \mathbf{E}[\widehat{M}_n]$, extend $G$ to a continuous strictly increasing function from $[1, \infty)$ to $[1, \infty)$ with $G(1) = 1$, and let $g$ be the inverse of $G$. It was shown [Ken00,Mas09] that $G(n) \approx n^{5/4}$. In fact, the following is true.

Lemma 2.1. Let $\varepsilon > 0$. Then there exist positive constants $c(\varepsilon)$ and $C(\varepsilon)$ such that if $r \ge 1$ and $\lambda \ge 1$, then
\[ c\lambda^{5/4 - \varepsilon} G(r) \le G(\lambda r) \le C\lambda^{5/4 + \varepsilon} G(r), \tag{2.1} \]
\[ c\lambda^{4/5 - \varepsilon} g(r) \le g(\lambda r) \le C\lambda^{4/5 + \varepsilon} g(r). \tag{2.2} \]

Proof. The first equation follows from [BM10, Lemma 6.5]. Note that while the statement there holds only for all $r \ge R(\varepsilon)$, by choosing different values of $c$ and $C$, one can easily extend it to all $r \ge 1$. The second statement follows from the first since $g = G^{-1}$ and $G$ is increasing.

The following result from [BM10] gives bounds on the tails of $\widehat{M}_n$ and of $M_{D',D}$ for a broad class of sets $D$ and subsets $D' \subset D$. We call a subset of $\mathbb{Z}^2$ simply connected if all connected components of its complement are infinite.

Theorem 2.2 [BM10, Theorems 5.8 and 6.7]. There exist positive global constants $C$ and $c$, and given $\varepsilon > 0$, there exist positive constants $C(\varepsilon)$ and $c(\varepsilon)$ such that for all $\lambda > 0$ and all $n$, the following holds:

1. Suppose that $D \subset \mathbb{Z}^2$ contains 0, and $D' \subset D$ is such that for all $z \in D'$, there exists a path in $D^c$ connecting $B(z, n + 1)$ and $B(z, 2n)^c$ (in particular this will hold if $D$ is simply connected and $\operatorname{dist}(z, D^c) \le n$ for all $z \in D'$). Then
\[ \mathbf{P}\big(M_{D',D} > \lambda G(n)\big) \le 2e^{-c\lambda}. \tag{2.3} \]
2. For all $D \supset B(0, n)$,
\[ \mathbf{P}\big(M_D < \lambda^{-1} G(n)\big) \le C(\varepsilon)\, e^{-c(\varepsilon)\lambda^{4/5 - \varepsilon}}. \tag{2.4} \]
3.
\[ \mathbf{P}\big(\widehat{M}_n > \lambda G(n)\big) \le C e^{-c\lambda}. \tag{2.5} \]
4.
\[ \mathbf{P}\big(\widehat{M}_n < \lambda^{-1} G(n)\big) \le C(\varepsilon)\, e^{-c(\varepsilon)\lambda^{4/5 - \varepsilon}}. \tag{2.6} \]

We would like to use (2.3) in the case where $D = \mathbb{Z}^2 \setminus \{w\}$ and $D' = B(0, n) \setminus \{w\}$. However these choices of $D$ and $D'$ do not satisfy the hypotheses in (2.3), so we cannot use Theorem 2.2 directly. The idea behind the proof of the following proposition is to get the distribution on $\gamma(0, w)$ using Wilson's algorithm by first running an infinite LERW $\gamma$ (whose complement is simply connected) and then running a LERW from $w$ to $\gamma$.

Proposition 2.3. There exist positive constants $C$ and $c$ such that the following holds. Let $n \ge 1$ and $w \in B(0, n)$. Let $Y_w = w$ if $\gamma(0, w) \subset B(0, n)$; otherwise let $Y_w$ be the first point on the path $\gamma(0, w)$ which lies outside $B(0, n)$. Then,
\[ \mathbf{P}\big(d(0, Y_w) > \lambda G(n)\big) \le C e^{-c\lambda}. \tag{2.7} \]

Proof. Let $\gamma$ be any infinite path starting from 0, and let $\widetilde{D} = \mathbb{Z}^2 \setminus \gamma$. Then $\widetilde{D}$ is the union of disjoint simply connected subsets $D_i$ of $\mathbb{Z}^2$; we can assume $w \in D_1$ and let $D_1 = D$. By (2.3) (taking $D' = B_n \cap D$), there exist $C < \infty$ and $c > 0$ such that
\[ P^w\big(M_{D',D} > \lambda G(n)\big) \le C e^{-c\lambda}. \tag{2.8} \]
Now suppose that $\gamma$ has the distribution of an infinite LERW started at 0. By Wilson's algorithm, if $S^w$ is an independent random walk started at $w$, then $\gamma(0, w)$ has the same distribution as the path from 0 to $w$ in $\gamma \cup L(S^w[0, \sigma_D])$. Therefore,
\[ d(0, Y_w) = |\gamma(0, Y_w)| \le \widehat{M}_n + M_{D',D}, \]
and so,
\[ \mathbf{P}\big(d(0, Y_w) > \lambda G(n)\big) \le \mathbf{P}\big(\widehat{M}_n > (\lambda/2) G(n)\big) + \max_D P^w\big(M_{D',D} > (\lambda/2) G(n)\big). \]
The result then follows from (2.5) and (2.8).

Lemma 2.4. There exists a positive constant $C$ such that for all $k \ge 2$, $n \ge 1$, and $K \subset \mathbb{Z}^2 \setminus B(0, 4kn)$, the following holds. The probability that $L(S[0, \xi_K])$ reenters $B(0, n)$ after leaving $B(0, kn)$ is less than $Ck^{-1}$. This also holds for infinite LERWs, namely
\[ \mathbf{P}\big(\widehat{S}[\widehat{\sigma}_{kn}, \infty) \cap B(0, n) \ne \emptyset\big) \le C k^{-1}. \tag{2.9} \]

Proof. The result for infinite LERWs follows immediately by taking $K = \mathbb{Z}^2 \setminus B(0, m)$ and letting $m$ tend to $\infty$. We now prove the result for $L(S[0, \xi_K])$. Let $\gamma$ be the part of the path $L(S[0, \xi_K])$ from 0 up to the first point $z$ where it exits $B(0, kn)$. Then by the domain Markov property for LERW [Law91], conditioned on $\gamma$, the rest of $L(S[0, \xi_K])$ has the same distribution as the loop-erasure of a random walk started at $z$, conditioned on the event $\{\xi_K < \xi_\gamma\}$. Therefore, it is sufficient to show that for any fixed path $\alpha$ from 0 to $\partial B(0, kn)$ and $z \in \partial B(0, kn)$,
\[ P^z(\xi_n < \xi_K \mid \xi_K < \xi_\alpha) = \frac{P^z(\xi_n < \xi_K;\ \xi_K < \xi_\alpha)}{P^z(\xi_K < \xi_\alpha)} \le C k^{-1}. \tag{2.10} \]



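On the one hand,
\[
\begin{aligned}
P^z(\xi_n < \xi_K;\ \xi_K < \xi_\alpha) \le{} & P^z(\xi_{kn/2} < \xi_\alpha) \max_{x \in \partial_i B(0, kn/2)} P^x(\xi_n < \xi_\alpha) \\
& \times \max_{w \in \partial B(0, n)} P^w(\sigma_{2kn} < \xi_\alpha) \max_{y \in \partial B(0, 2kn)} P^y(\xi_K < \xi_\alpha).
\end{aligned}
\]
However, by the discrete Beurling estimates (see [LL, Theorem 6.8.1]), for any $x \in \partial_i B(0, kn/2)$ and $w \in \partial B(0, n)$,
\[ P^x(\xi_n < \xi_\alpha) \le C k^{-1/2}; \qquad P^w(\sigma_{2kn} < \xi_\alpha) \le C k^{-1/2}. \]
Therefore,
\[ P^z(\xi_n < \xi_K;\ \xi_K < \xi_\alpha) \le C k^{-1}\, P^z(\xi_{kn/2} < \xi_\alpha) \max_{y \in \partial B(0, 2kn)} P^y(\xi_K < \xi_\alpha). \]
On the other hand,
\[ P^z(\xi_K < \xi_\alpha) \ge P^z(\sigma_{2kn} < \xi_\alpha) \min_{y \in \partial B(0, 2kn)} P^y(\xi_K < \xi_\alpha). \]
By the discrete Harnack inequality (see [LL, Theorem 6.3.9]),
\[ \max_{y \in \partial B(0, 2kn)} P^y(\xi_K < \xi_\alpha) \le C \min_{y \in \partial B(0, 2kn)} P^y(\xi_K < \xi_\alpha). \]
Therefore, in order to prove (2.10), it suffices to show that
\[ P^z(\sigma_{2kn} < \xi_\alpha) \ge c\, P^z(\xi_{kn/2} < \xi_\alpha). \]
Let $B = B(z; kn/2)$. By [Mas09, Prop. 3.5], there exists $c > 0$ such that
\[ P^z\big(|\arg(S(\sigma_B) - z)| \le \pi/3 \mid \sigma_B < \xi_\alpha\big) > c. \]
Therefore,
\[
\begin{aligned}
P^z(\sigma_{2kn} < \xi_\alpha) &\ge \sum_{\substack{y \in \partial B \\ |\arg(y - z)| \le \pi/3}} P^y(\sigma_{2kn} < \xi_\alpha)\, P^z(\sigma_B < \xi_\alpha;\ S(\sigma_B) = y) \\
&\ge c\, P^z\big(\sigma_B < \xi_\alpha;\ |\arg(S(\sigma_B) - z)| \le \pi/3\big) \\
&\ge c\, P^z(\sigma_B < \xi_\alpha) \\
&\ge c\, P^z(\xi_{kn/2} < \xi_\alpha).
\end{aligned}
\]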
Remark 2.5. One can also show that there exists $\delta > 0$ such that
$$P\big(\widehat{S}[\widehat{\sigma}_{kn}, \infty) \cap B(0, n) \neq \emptyset\big) \ge c k^{-\delta}. \tag{2.11}$$

As we will not need this bound we only give a sketch of the proof. Further, we will not try to find the value of δ that the argument yields, since it would be far from optimal.



First, we have
$$P\big(\widehat{S}[\widehat{\sigma}_{kn}, \infty) \cap B(0, n) \neq \emptyset\big) \ge P\big(\widehat{S}[\widehat{\sigma}_{kn}, \widehat{\sigma}_{4kn}) \cap B(0, n) \neq \emptyset\big).$$
However, by [Mas09, Cor. 4.5], the latter probability is comparable to the probability that $L(S[0, \sigma_{16kn}])$ leaves $B_{kn}$ and then reenters $B(0, n)$ before leaving $B(0, 4kn)$. Call the latter event $F$. Partition $\mathbb{Z}^2$ into the three cones $A_1 = \{z \in \mathbb{Z}^2 : 0 \le \arg(z) < 2\pi/3\}$, $A_2 = \{z \in \mathbb{Z}^2 : 2\pi/3 \le \arg(z) < 4\pi/3\}$ and $A_3 = \{z \in \mathbb{Z}^2 : 4\pi/3 \le \arg(z) < 2\pi\}$. Then the event $F$ contains the event that a random walk started at 0
(1) leaves $B(0, 2kn)$ before leaving $A_1 \cup B(0, n/2)$,
(2) then enters $A_2$ while staying in $B(0, 4kn) \setminus B(0, kn)$,
(3) then enters $B(0, n)$ while staying in $A_2 \cap B(0, 4kn)$,
(4) then enters $A_3$ while staying in $A_2 \cap B(0, n) \setminus B(0, n/2)$,
(5) then leaves $B(0, 16kn)$ while staying in $A_3 \setminus B(0, n/2)$.

One can bound the probabilities of the events in steps (1), (3) and (5) from below by $ck^{-\beta}$ for some $\beta > 0$. The other steps contribute terms that can be bounded from below by a constant; combining these bounds gives (2.11).

Lemma 2.6. There exists a positive constant $C$ such that for all $k \ge 1$ and $w \in \mathbb{Z}^2$,
$$\tfrac{1}{8} k^{-1} \le P\big(\gamma(0, w) \not\subset B(0, k|w|)\big) \le C k^{-1/3}. \tag{2.12}$$

Proof. We first prove the upper bound. By adjusting the value of $C$ we may assume that $k \ge 4$. In order to obtain $\gamma(0, w)$, we first run an infinite LERW $\gamma$ started at 0 and then run an independent random walk started at $w$ until it hits $\gamma$ and then erase its loops. By Wilson’s algorithm, the resulting path from 0 to $w$ has the same distribution as $\gamma(0, w)$. By Lemma 2.4, the probability that $\gamma$ reenters $B_{k^{2/3}|w|}$ after leaving $B_{k|w|}$ is less than $Ck^{-1/3}$. Furthermore, by the discrete Beurling estimates [LL, Prop. 6.8.1],
$$P^w\big(\sigma_{k^{2/3}|w|} < \xi_\gamma\big) \le C (k^{2/3})^{-1/2} = C k^{-1/3}.$$
Therefore,
$$P\big(\gamma(0, w) \not\subset B_{k|w|}\big) \le C k^{-1/3}.$$

To prove the lower bound, we follow the method of proof of [BLPS01, Theorem 14.3] where it was shown that if $v$ and $w$ are nearest neighbors then
$$P(\operatorname{diam} \gamma(v, w) \ge n) \ge \frac{1}{8n}.$$
If $w = (w_1, w_2)$, let $u = (w_1 - w_2, w_1 + w_2)$ and $v = (-w_2, w_1)$ so that $\{0, w, u, v\}$ form the four vertices of a square of side length $|w|$. Now consider the sets
$$Q_1 = \{jw : j = 0, \dots, 2k\}, \quad Q_2 = \{2kw + j(u - w) : j = 0, \dots, 2k\},$$
$$Q_3 = \{jv : j = 0, \dots, 2k\}, \quad Q_4 = \{2kv + j(u - v) : j = 0, \dots, 2k\},$$
and let $Q = \bigcup_{i=1}^{4} Q_i$. Then $Q$ consists of $8k$ lattice points on the perimeter of a square of side length $2k|w|$. Let $x_1, \dots, x_{8k}$ be the ordering of these points obtained by letting $x_1 = 0$ and then travelling along the perimeter of the square clockwise. Thus



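$|x_{i+1} - x_i| = |w|$. Now consider any spanning tree $U$ on $\mathbb{Z}^2$. If for all $i$, $\gamma(x_i, x_{i+1})$ stayed in the ball $B(x_i, k|w|)$ then the concatenation of these paths would be a closed loop, which contradicts the fact that $U$ is a tree. Therefore,
$$1 = P\big(\exists i : \gamma(x_i, x_{i+1}) \not\subset B(x_i, k|w|)\big) \le \sum_{i=1}^{8k} P\big(\gamma(x_i, x_{i+1}) \not\subset B(x_i, k|w|)\big).$$
Finally, using the fact that $\mathbb{Z}^2$ is transitive and is invariant under rotations by 90 degrees, all the probabilities on the right-hand side are equal to $P(\gamma(0, w) \not\subset B(0, k|w|))$, which is therefore at least $1/(8k)$. This proves the lower bound.

Proposition 2.7. For all $\varepsilon > 0$, there exist $c(\varepsilon), C(\varepsilon) > 0$ and $\lambda_0(\varepsilon) \ge 1$ such that for all $w \in \mathbb{Z}^2$ and all $\lambda \ge 1$,
$$P(d(0, w) > \lambda G(|w|)) \le C(\varepsilon) \lambda^{-4/15 + \varepsilon}, \tag{2.13}$$
and for all $w \in \mathbb{Z}^2$ and all $\lambda \ge \lambda_0(\varepsilon)$,
$$P(d(0, w) > \lambda G(|w|)) \ge c(\varepsilon) \lambda^{-4/5 - \varepsilon}. \tag{2.14}$$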

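Proof. To prove the upper bound, let $k = \lambda^{4/5 - 3\varepsilon}$. Then by Lemma 2.1, there exists $C(\varepsilon) < \infty$ such that
$$G(k|w|) \le C(\varepsilon) k^{5/4 + \varepsilon} G(|w|) \le C(\varepsilon) \lambda^{1 - \varepsilon} G(|w|). \tag{2.15}$$
Then,
$$P(d(0, w) > \lambda G(|w|)) \le P\big(\gamma(0, w) \not\subset B_{k|w|}\big) + P\big(d(0, w) > \lambda G(|w|);\ \gamma(0, w) \subset B_{k|w|}\big).$$
However, by Lemma 2.6,
$$P\big(\gamma(0, w) \not\subset B_{k|w|}\big) \le C k^{-1/3} = C \lambda^{-4/15 + \varepsilon}, \tag{2.16}$$
while by Corollary 2.3 and (2.15),
$$P\big(d(0, w) > \lambda G(|w|);\ \gamma(0, w) \subset B_{k|w|}\big) \le P\big(d(0, w) > c(\varepsilon) \lambda^{\varepsilon} G(k|w|);\ \gamma(0, w) \subset B_{k|w|}\big) \le C \exp(-c(\varepsilon) \lambda^{\varepsilon}).$$
Therefore,
$$P(d(0, w) > \lambda G(|w|)) \le C \exp(-c(\varepsilon) \lambda^{\varepsilon}) + C \lambda^{-4/15 + \varepsilon} \le C(\varepsilon) \lambda^{-4/15 + \varepsilon}. \tag{2.17}$$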

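To prove the lower bound we fix $k = \lambda^{4/5 + \varepsilon}$ and assume $k \ge 2$ and $\varepsilon < 1/4$. Then by Lemma 2.1, there exists $C(\varepsilon) < \infty$ such that
$$G((k - 1)|w|) \ge C(\varepsilon)^{-1} k^{5/4 - \varepsilon} G(|w|) \ge C(\varepsilon)^{-1} \lambda^{1 + \varepsilon/3} G(|w|).$$
Hence,
$$P(d(0, w) > \lambda G(|w|)) \ge P\big(d(0, w) > C(\varepsilon) \lambda^{-\varepsilon/3} G((k - 1)|w|)\big).$$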



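Now consider the UST on $\mathbb{Z}^2$ and recall that $\gamma(0, \infty)$ and $\gamma(w, \infty)$ denote the infinite paths starting at 0 and $w$. We write $Z_{0w}$ for the unique point where these meet: thus $\gamma(Z_{0w}, \infty) = \gamma(0, \infty) \cap \gamma(w, \infty)$. Then $\gamma(0, w)$ is the concatenation of $\gamma(0, Z_{0w})$ and $\gamma(w, Z_{0w})$. By Lemma 2.6,
$$P\big(\gamma(0, w) \not\subset B_{k|w|}\big) \ge \frac{1}{8k}.$$
Therefore,
$$P\big(\gamma(0, Z_{0w}) \not\subset B_{k|w|} \text{ or } \gamma(w, Z_{0w}) \not\subset B_{k|w|}\big) \ge \frac{1}{8k}.$$
By the transitivity of $\mathbb{Z}^2$, the paths $\gamma(0, Z_{0,-w})$ and $\gamma(w, Z_{0w}) - w$ have the same distribution, and therefore
$$P\big(\gamma(0, Z_{0w}) \not\subset B_{(k-1)|w|}\big) \ge \frac{1}{16k}.$$
Since $Z_{0w}$ is on the path $\gamma(0, \infty)$, by (2.6),
$$\begin{aligned}
P\big(d(0, w) > C(\varepsilon) \lambda^{-\varepsilon/3} G((k - 1)|w|)\big)
&\ge P\big(d(0, Z_{0w}) > C(\varepsilon) \lambda^{-\varepsilon/3} G((k - 1)|w|)\big) \\
&\ge P\big(\widehat{M}_{(k-1)|w|} > C(\varepsilon) \lambda^{-\varepsilon/3} G((k - 1)|w|);\ \gamma(0, Z_{0w}) \not\subset B_{(k-1)|w|}\big) \\
&\ge P\big(\gamma(0, Z_{0w}) \not\subset B_{(k-1)|w|}\big) - P\big(\widehat{M}_{(k-1)|w|} < C(\varepsilon) \lambda^{-\varepsilon/3} G((k - 1)|w|)\big) \\
&\ge \frac{1}{16k} - C \exp\{-c \lambda^{\varepsilon/4}\}.
\end{aligned}$$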

Finally, since $k = \lambda^{4/5 + \varepsilon}$, the previous quantity can be made greater than $c(\varepsilon) \lambda^{-4/5 - \varepsilon}$ for $\lambda$ sufficiently large.

3. Uniform Spanning Trees

We recall that $U$ denotes the UST in $\mathbb{Z}^2$, and we write $x \sim y$ if $x$ and $y$ are joined by an edge in $U$. Let $\mathcal{E}$ be the quadratic form given by
$$\mathcal{E}(f, g) = \frac{1}{2} \sum_{x \sim y} (f(x) - f(y))(g(x) - g(y)). \tag{3.1}$$
If we regard $U$ as an electrical network with a unit resistor on each edge, then $\mathcal{E}(f, f)$ is the energy dissipation when the vertices of $\mathbb{Z}^2$ are at a potential $f$. Set $H^2 = \{f : \mathbb{Z}^2 \to \mathbb{R} : \mathcal{E}(f, f) < \infty\}$. Let $A$, $B$ be disjoint subsets of $U$. The effective resistance between $A$ and $B$ is defined by:
$$R_{\mathrm{eff}}(A, B)^{-1} = \inf\{\mathcal{E}(f, f) : f \in H^2,\ f|_A = 1,\ f|_B = 0\}. \tag{3.2}$$

Let Reff (x, y) = Reff ({x}, {y}), and Reff (x, x) = 0. For general facts on effective resistance and its connection with random walks see [AF,DS84,LP09]. In this section, we establish the volume and effective resistance estimates for the UST U that will be used in the next section to study random walks on U.
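As a small illustration of the variational definition (3.2), the following sketch computes the effective resistance between two vertices of a finite graph with unit resistors by solving for the energy-minimising (harmonic) potential in exact rational arithmetic. The finite-graph setting, the edge-list input and the name `effective_resistance` are assumptions made for the example only. On a tree the answer coincides with the graph distance, which is the fact that makes resistance and the intrinsic metric $d$ interchangeable on the UST; a 4-cycle is included for contrast.

```python
from fractions import Fraction

def effective_resistance(edges, a, b):
    """R_eff(a, b) on a finite connected graph with a unit resistor on each edge.

    The minimiser in the variational problem is the function f that is harmonic
    off {a, b} with f(a) = 1, f(b) = 0; R_eff is the reciprocal of its energy.
    """
    vertices = sorted({v for e in edges for v in e})
    adj = {v: [] for v in vertices}
    for x, y in edges:
        adj[x].append(y)
        adj[y].append(x)
    free = [v for v in vertices if v not in (a, b)]
    idx = {v: i for i, v in enumerate(free)}
    m = len(free)
    # Augmented system: deg(v) f(v) - sum_{u ~ v} f(u) = 0 at each free vertex v
    A = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for v in free:
        i = idx[v]
        A[i][i] = Fraction(len(adj[v]))
        for u in adj[v]:
            if u == a:
                A[i][m] += 1          # boundary value f(a) = 1 moves to the right side
            elif u != b:
                A[i][idx[u]] -= 1     # f(b) = 0 contributes nothing
    # Gauss-Jordan elimination over the rationals
    for col in range(m):
        piv = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        pv = A[col][col]
        A[col] = [x / pv for x in A[col]]
        for r in range(m):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    f = {a: Fraction(1), b: Fraction(0)}
    for v in free:
        f[v] = A[idx[v]][m]
    # Each edge is listed once, matching the factor 1/2 over ordered pairs in (3.1)
    energy = sum((f[x] - f[y]) ** 2 for x, y in edges)
    return 1 / energy

tree_edges = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5)]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

For the tree above, `effective_resistance(tree_edges, 0, 3)` equals the graph distance 3, whereas on the cycle the two parallel routes reduce the resistance below the distance.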



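Theorem 3.1. There exist positive constants $C$ and $c$ such that for all $r \ge 1$ and $\lambda > 0$,
(a)
$$P\big(B_d(0, \lambda^{-1} G(r)) \not\subset B(0, r)\big) \le C e^{-c\lambda^{2/3}}. \tag{3.3}$$
(b)
$$P\big(R_{\mathrm{eff}}(0, B(0, r)^c) < \lambda^{-3} G(r)\big) \le C e^{-c\lambda^{2/3}}. \tag{3.4}$$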

Proof. By adjusting the constants $c$ and $C$ we can assume $\lambda \ge 4$. For $k \ge 1$, let $\delta_k = \lambda^{-1} 2^{-k}$, and $\eta_k = (2k)^{-1}$. Let $k_0$ be the smallest integer such that $r\delta_{k_0} < 1$. Set
$$A_k = B(0, r) - B(0, (1 - \eta_k) r), \quad k \ge 1.$$
Let $D_k$ be a finite collection of points in $A_k$ such that $|D_k| \le C\delta_k^{-2}$ and
$$A_k \subset \bigcup_{z \in D_k} B(z, \delta_k r).$$

Write $U_1, U_2, \dots$ for the random trees obtained by running Wilson’s algorithm (with root 0) with walks first starting at all points in $D_1$, then adding those points in $D_2$, and so on. So $U_k$ is a finite tree which contains $\bigcup_{i=1}^{k} D_i \cup \{0\}$, and the sequence $(U_k)$ is increasing. Since $r\delta_{k_0} < 1$ we have $\partial_i B(0, r) \subset A_{k_0} \subset U_{k_0}$. We then complete a UST $U$ on $\mathbb{Z}^2$ by applying Wilson’s algorithm to the remaining points in $\mathbb{Z}^2$. For $z \in D_1$, let $N_z$ be the length of the path $\gamma(0, z)$ until it first exits from $B(0, r/8)$. By first applying [Mas09, Prop. 4.4] and then (2.6),
$$P(N_z < \lambda^{-1} G(r)) \le C\, P(\widehat{M}_{r/8} < \lambda^{-1} G(r)) \le C e^{-c\lambda^{2/3}},$$
so if
$$\widetilde{F}_1 = \{N_z < \lambda^{-1} G(r) \text{ for some } z \in D_1\} = \bigcup_{z \in D_1} \{N_z < \lambda^{-1} G(r)\},$$
then
$$P(\widetilde{F}_1) \le |D_1|\, C e^{-c\lambda^{2/3}} \le C \delta_1^{-2} e^{-c\lambda^{2/3}} \le C \lambda^2 e^{-c\lambda^{2/3}}. \tag{3.5}$$

For $z \in A_{k+1}$, let $H_z$ be the event that the path $\gamma(z, 0)$ enters $B(0, (1 - \eta_k) r)$ before it hits $U_k$. For $k \ge 1$, let
$$F_{k+1} = \bigcup_{z \in D_{k+1}} H_z.$$
Let $z \in D_{k+1}$ and $S^z$ be a simple random walk started at $z$ and run until it hits $U_k$. Then by Wilson’s algorithm, for the event $H_z$ to occur, $S^z$ must enter $B(0, (1 - \eta_k) r)$ before it hits $U_k$. To bound $P(H_z)$, note that $z$ is a distance at least $(\eta_k - \eta_{k+1}) r$ from $B(0, (1 - \eta_k) r)$, and that each point in $A_k$ is within a distance $\delta_k r$ of $U_k$. Let $m = (\eta_k - \eta_{k+1})/\delta_k$. For $H_z$ to occur, the random walk must move through at least $\frac{1}{2} m$ balls of radius $2\delta_k r$ without hitting $U_k$. Since $U_k$ is connected, in each of these balls $B(w_i, 2\delta_k r)$ there is a path in $U_k$ from $B(w_i, \delta_k r)$ to $B(w_i, 2\delta_k r)^c$. So, by the discrete Beurling estimate,
$$P(H_z) \le \exp\big(-c' \tfrac{1}{2} m\big) \le \exp\big(-c (\eta_k - \eta_{k+1})/\delta_k\big).$$
Hence
$$P(F_{k+1}) \le C \delta_{k+1}^{-2} \exp\big(-c(\eta_k - \eta_{k+1})/\delta_k\big) \le C \lambda^2 4^k \exp(-c \lambda 2^k k^{-2}). \tag{3.6}$$

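Now define $G$ by
$$G^c = \widetilde{F}_1 \cup \bigcup_{k=2}^{k_0} F_k,$$
so that
$$P(G^c) \le C \lambda^2 e^{-c\lambda^{2/3}} + \sum_{k=2}^{\infty} C \lambda^2 4^k \exp(-c \lambda 2^k k^{-2}) \le C e^{-c\lambda^{2/3}}. \tag{3.7}$$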

Now suppose that $\omega \in G$. Then we claim that:
(1) For every $z \in D_1$ the part of the path $\gamma(0, z)$ until its first exit from $B(0, r/2)$ is of length greater than $\lambda^{-1} G(r)$,
(2) If $z \in D_k$ for any $k \ge 2$ then the path $\gamma(z, 0)$ hits $U_1$ before it enters $B(0, r/2)$.
Of these, (1) is immediate since $\omega \notin \widetilde{F}_1$, while (2) follows by induction on $k$ using the fact that $\omega \notin F_k$ for any $k$. Hence, if $\omega \in G$, then $|\gamma(0, z)| \ge \lambda^{-1} G(r)$ for every $z \in \partial_i B(0, r)$, which proves (a).

To prove (b) we use the Nash-Williams bound for resistance [NW59]. For $1 \le k \le \lambda^{-1} G(r)$ let $\Gamma_k$ be the set of $z$ such that $d(0, z) = k$ and $z$ is connected to $B(0, r)^c$ by a path in $\{z\} \cup (U - \gamma(0, z))$. Assume now that the event $G$ holds. Then the $\Gamma_k$ are disjoint sets disconnecting 0 and $B(0, r)^c$, and so
$$R_{\mathrm{eff}}(0, B(0, r)^c) \ge \sum_{k=1}^{\lambda^{-1} G(r)} |\Gamma_k|^{-1}.$$

Furthermore, each $z \in \Gamma_k$ is on a path from 0 to a point in $D_1$, and so $|\Gamma_k| \le |D_1| \le C\delta_1^{-2} \le C\lambda^2$. Hence on $G$ we have $R_{\mathrm{eff}}(0, B(0, r)^c) \ge c\lambda^{-3} G(r)$, which proves (b).

A similar argument will give a (much weaker) bound in the opposite direction. We begin with a result we will use to control the way the UST fills in a region once we have constructed some initial paths.

Proposition 3.2. There exist positive constants $c$ and $C$ such that for each $\delta_0 \le 1$ the following holds. Let $r \ge 1$, and $U_0$ be a fixed tree in $\mathbb{Z}^2$ connecting 0 to $B(0, 2r)^c$ with the property that $\operatorname{dist}(x, U_0) \le \delta_0 r$ for each $x \in B(0, r)$ (here dist refers to the Euclidean distance). Let $U$ be the random spanning tree in $\mathbb{Z}^2$ obtained by running Wilson’s algorithm with root $U_0$. Then there exists an event $G$ such that
$$P(G^c) \le C e^{-c\delta_0^{-1/3}}, \tag{3.8}$$
and on $G$ we have that for all $x \in B(0, r/2)$,
$$d(x, U_0) \le G(\delta_0^{1/2} r), \tag{3.9}$$
$$\gamma(x, U_0) \subset B(0, r). \tag{3.10}$$

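Proof. We follow a similar strategy to that in Theorem 3.1. Define sequences $(\delta_k)$ and $(\lambda_k)$ by $\delta_k = 2^{-k}\delta_0$, $\lambda_k = 2^{k/2}\lambda_0$, where $\lambda_0 = 5^{-1}\delta_0^{-1/2}$. For $k \ge 0$, let
$$A_k = B\big(0, \tfrac{1}{2}(1 + (1 + k)^{-1}) r\big),$$
and let $D_k \subset A_k$ be such that for $k \ge 1$,
$$|D_k| \le C\delta_k^{-2}, \qquad A_k \subset \bigcup_{z \in D_k} B(z, \delta_k r).$$
Let $\mathcal{U}_0 = U_0$ and as before let $\mathcal{U}_1, \mathcal{U}_2, \dots$ be the random trees obtained by performing Wilson’s algorithm with root $U_0$ and starting first at points in $D_1$, then in $D_2$, etc. Set
$$M_z = d(z, \mathcal{U}_{k-1}), \quad z \in D_k, \qquad F_z = \{\gamma(z, \mathcal{U}_{k-1}) \not\subset A_{k-1}\}, \quad z \in D_k,$$
$$M_k = \max_{z \in D_k} M_z, \qquad F_k = \bigcup_{z \in D_k} F_z.$$
For $z \in D_k$,
$$P(M_z > \lambda_k G(\delta_{k-1} r)) \le P(F_z) + P(M_z > \lambda_k G(\delta_{k-1} r);\ F_z^c). \tag{3.11}$$
Since $z$ is a distance at least $\frac{1}{2} r (k^{-1} - (k + 1)^{-1})$ from $A_{k-1}^c$, and each point in $A_{k-1}$ is within a distance $\delta_{k-1} r$ of $\mathcal{U}_{k-1}$,
$$P(F_z) \le C \exp\big(-c\delta_{k-1}^{-1}(k^{-1} - (k + 1)^{-1})\big) \le C \exp(-c\delta_{k-1}^{-1} k^{-2}). \tag{3.12}$$
By (2.3), again using the fact that each point in $A_{k-1}$ is within distance $\delta_{k-1} r$ of $\mathcal{U}_{k-1}$,
$$P(M_z > \lambda_k G(\delta_{k-1} r);\ F_z^c) \le C \exp(-c\lambda_k^{2/3}). \tag{3.13}$$
So, combining (3.11)–(3.13), for $k \ge 1$,
$$P(M_k > \lambda_k G(\delta_{k-1} r)) + P(F_k) \le C |D_k| \big[\exp(-c\delta_{k-1}^{-1} k^{-2}) + \exp(-c\lambda_k^{2/3})\big]. \tag{3.14}$$
Now let
$$G = \bigcap_{k=1}^{\infty} F_k^c \cap \{M_k \le \lambda_k G(\delta_{k-1} r)\}. \tag{3.15}$$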


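Summing the series given by (3.14), and using the bound $|D_k| \le c\delta_k^{-2}$, we have
$$P(G^c) \le C\delta_0^{-2} \sum_k 2^{2k} \big[\exp(-c\delta_0^{-1} 2^k k^{-2}) + \exp(-c 2^{k/3} \delta_0^{-1/3})\big] \le C\delta_0^{-2} e^{-c\delta_0^{-1/3}} \le C e^{-c'\delta_0^{-1/3}}.$$
Using Lemma 2.1 with $\varepsilon = \frac{1}{4}$ gives
$$\lambda_k G(\delta_{k-1} r) \le \lambda_k \delta_0^{1/2} 2^{-(k-1)} G(\delta_0^{1/2} r) = 2\lambda_0 \delta_0^{1/2} 2^{-k/2} G(\delta_0^{1/2} r).$$
So
$$\sum_{k=1}^{\infty} \lambda_k G(\delta_{k-1} r) \le 5\lambda_0 \delta_0^{1/2} G(\delta_0^{1/2} r) = G(\delta_0^{1/2} r).$$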

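Since $B(0, r/2) \subset \bigcap_k A_k$, we have $B(0, r/2) \subset \bigcup_k \mathcal{U}_k$. Therefore on the event $G$, for any $x \in B(0, r/2)$, $d(x, U_0) \le G(\delta_0^{1/2} r)$. Further, on $G$, for each $z \in D_k$, we have $\gamma(z, \mathcal{U}_{k-1}) \subset A_{k-1}$. Therefore if $x \in B(0, r/2)$ the connected component of $U - U_0$ containing $x$ is contained in $B(0, r)$, which proves (3.10).

Theorem 3.3. For all $\varepsilon > 0$, there exist $c(\varepsilon), C(\varepsilon) > 0$ and $\lambda_0(\varepsilon) \ge 1$ such that for all $r \ge 1$ and $\lambda \ge 1$,
$$P\big(B(0, r) \not\subset B_d(0, \lambda G(r))\big) \le C\lambda^{-4/15 + \varepsilon}, \tag{3.16}$$
and for all $r \ge 1$ and all $\lambda \ge \lambda_0(\varepsilon)$,
$$P\big(B(0, r) \not\subset B_d(0, \lambda G(r))\big) \ge c\lambda^{-4/5 - \varepsilon}.$$

Proof. The lower bound follows immediately from the lower bound in Proposition 2.7. To prove the upper bound, let $E \subset B(0, 4r)$ be such that $|E| \le C\lambda^{\varepsilon/2}$ and
$$B(0, 4r) \subset \bigcup_{z \in E} B(z, \lambda^{-\varepsilon/4} r).$$
We now let $U_0$ be the random tree obtained by applying Wilson’s algorithm with points in $E$ and root 0. Therefore, by Proposition 2.7, for any $z \in E$,
$$P(d(0, z) > \lambda G(r)/2) \le P(d(0, z) > c\lambda G(|z|)/2) \le C(\varepsilon)\lambda^{-4/15 + \varepsilon/2}.$$
Let $F = \{d(0, z) \le \lambda G(r)/2 \text{ for all } z \in E\}$; then
$$P(F^c) \le |E|\, C(\varepsilon)\lambda^{-4/15 + \varepsilon/2} \le C(\varepsilon)\lambda^{-4/15 + \varepsilon}.$$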



We have now constructed a tree $U_0$ connecting 0 to $B(0, 4r)^c$ and by the definition of the set $E$, for all $z \in B(0, 2r)$, $\operatorname{dist}(z, U_0) \le \lambda^{-\varepsilon/4} r$. We now use Wilson’s algorithm to produce the UST $U$ on $\mathbb{Z}^2$ with root $U_0$. Let $G$ be the event given by applying Proposition 3.2 (with $r$ replaced by $2r$), so that
$$P(G^c) \le C e^{-c\lambda^{\varepsilon/12}}.$$
On the event $G$ we have $d(x, U_0) \le G(\lambda^{-\varepsilon/2} r) \le \lambda G(r)/2$ for all $x \in B(0, r)$. Therefore, on the event $F \cap G$ we have $d(x, 0) \le \lambda G(r)$ for all $x \in B(0, r)$. Thus,
$$P\Big(\max_{x \in B(0, r)} d(x, 0) > \lambda G(r)\Big) \le C(\varepsilon)\lambda^{-4/15 + \varepsilon} + C e^{-c\lambda^{\varepsilon/12}} \le C(\varepsilon)\lambda^{-4/15 + \varepsilon}.$$

Theorem 1.1 is now immediate from Theorem 3.1 and Theorem 3.3. While Theorem 3.1 immediately gives the exponential bound (1.5) on the upper tail of $|B_d(0, r)|$ in Theorem 1.2, it only gives a polynomial bound for the lower tail. The following theorem gives an exponential bound on the lower tail of $|B_d(0, r)|$ and consequently proves Theorem 1.2.

Theorem 3.4. There exist constants $c$ and $C$ such that if $R \ge 1$, $\lambda \ge 1$ then
$$P\big(|B_d(0, R)| \le \lambda^{-1} g(R)^2\big) \le C e^{-c\lambda^{1/9}}. \tag{3.17}$$

Proof. Let $k \ge 1$ and let $r = g(R/k^{1/2})$, so that $R = k^{1/2} G(r)$. Fix a constant $\delta_0 < 1$ such that the right side of (3.8) is less than 1/4. Fix a further constant $\theta < 1$, to be chosen later but which will depend only on $\delta_0$. We begin the construction of $U$ with an infinite LERW $\widehat{S}$ started at 0 which gives the path $\gamma_0 = U_0 = \gamma(0, \infty)$. Let $z_j$, $j = 1, \dots, k$ be points on $\widehat{S}[0, \widehat{\sigma}_r]$ chosen such that $B_j = B(z_j, r/k)$ are disjoint. (We choose these according to some fixed algorithm so that they depend only on the path $\widehat{S}[0, \widehat{\sigma}_r]$.) Let
$$F_1 = \{\widehat{S}[\widehat{\sigma}_{2r}, \infty) \text{ hits more than } k/2 \text{ of } B_1, \dots, B_k\}, \tag{3.18}$$
$$F_2 = \{|\widehat{S}[0, \widehat{\sigma}_{2r}]| \ge \tfrac{1}{2} k^{1/2} G(r)\}. \tag{3.19}$$
We have
$$P(F_1) \le C e^{-ck^{1/3}}, \tag{3.20}$$
$$P(F_2) \le C e^{-ck^{1/2}}. \tag{3.21}$$
Of these, (3.21) is immediate from (2.5) while (3.20) will be proved in Lemma 3.7 below. If either $F_1$ or $F_2$ occurs, we terminate the algorithm with a ‘Type 1’ or ‘Type 2’ failure. Otherwise, we continue as follows to construct $U$ using Wilson’s algorithm. We define
$$B'_j = B(z_j, \theta r/k), \qquad B''_j = B(z_j, \theta^2 r/k).$$
The algorithm is at two ‘levels’ which we call ‘ball steps’ and ‘point steps’. We begin with a list J0 of good balls. These are the balls B j such that B j ∩ S[ σ2r , ∞) = ∅. The n th ball step starts by selecting a good ball B j from the list Jn−1 of remaining good balls.



We then run Wilson’s algorithm with paths starting in $B_j$. The ball step will end either with success, in which case the whole algorithm terminates, or with one of three kinds of failure. In the event of failure the ball $B_j$, and possibly a number of other balls also, will be labelled ‘bad’, and $J_n$ is defined to be the remaining set of good balls. If more than $k^{1/2}/4$ balls are labelled bad at any one ball step, we terminate the whole algorithm with a ‘Type 3 failure’. Otherwise, we proceed until, if we have tried $k^{1/2}$ ball steps without a success, we terminate the algorithm with a ‘Type 4 failure’. We write $U_n$ for the tree obtained after $n$ ball steps. After ball step $n$, any ball $B_j$ in $J_n$ will have the property that $B_j \cap U_n = B_j \cap U_0$.

We now describe in detail the second level of the algorithm, which works with a fixed (initially good) ball $B_j$. We assume that this is the $n$th ball step (where $n \ge 1$), so that we have already built the tree $U_{n-1}$. Let $D' \subset B(0, \theta^2 r/k)$ satisfy
$$|D'| \le c\delta_0^{-2}, \qquad B(0, \theta^2 r/k) \subset \bigcup_{x \in D'} B(x, \delta_0 \theta^2 r/k).$$
Let $D_j = z_j + D'$, so that $D_j \subset B''_j$. We now proceed to use Wilson’s algorithm to build the paths $\gamma(w, U_{n-1})$ for $w \in D_j$. For $w \in D_j$ let $S^w$ be a random walk started at $w$. For each $w \in D_j$ let $G_w$ be the event that $\gamma(w, U_{n-1}) \subset B_j$. If $F_w$ is the event that $S^w$ exits from $B_j$ before it hits $U_0$, then
$$P(G_w^c) \le P(F_w) \le c\theta^{1/2}. \tag{3.22}$$
Here the first inequality follows from Wilson’s algorithm, while the second is by the discrete Beurling estimates ([LL, Prop. 6.8.1]). Let $M_w = d(w, U_{n-1})$, and $T_w$ be the first time $S^w$ hits $U_{n-1}$. Then by Wilson’s algorithm and (2.3),
$$P(M_w \ge \theta^{-1} G(\theta r/k);\ G_w) = P\big(M_w \ge \theta^{-1} G(\theta r/k);\ L(S^w[0, T_w]) \subset B_j\big) \le c e^{-c\theta^{-1}}. \tag{3.23}$$

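We now define sets corresponding to three possible outcomes to this procedure:
$$H_{1,n} = \bigcup_{w \in D_j} G_w^c,$$
$$H_{2,n} = \Big\{\max_{w \in D_j} M_w \ge \theta^{-1} G(\theta r/k)\Big\} \cap \bigcap_{w \in D_j} G_w,$$
$$H_{3,n} = \Big\{\max_{w \in D_j} M_w < \theta^{-1} G(\theta r/k)\Big\} \cap \bigcap_{w \in D_j} G_w.$$
By (3.22),
$$P(H_{1,n}) \le \sum_{w \in D_j} P(G_w^c) \le c\delta_0^{-2}\theta^{1/2}, \tag{3.24}$$
and by (3.23),
$$P(H_{2,n}) \le \sum_{w \in D_j} P(M_w \ge \theta^{-1} G(\theta r/k);\ G_w) \le c\delta_0^{-2} e^{-c\theta^{-1}}. \tag{3.25}$$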



We now choose the constant $\theta$ small enough so that each of $P(H_{i,n}) \le \frac{1}{4}$ for $i = 1, 2$, and therefore
$$P(H_{3,n}) \ge \frac{1}{2}. \tag{3.26}$$
If $H_{3,n}$ occurs then we have constructed a tree $U_n$ which contains $U_{n-1}$ and $D_j$. Further, we have that for each point $w \in D_j$, the path $\gamma(w, 0)$ hits $U_0$ before it leaves $B_j$. Hence,
$$d(w, 0) \le M_w + \max_{z \in U_0 \cap B_j} d(0, z) \le \tfrac{1}{2} k^{1/2} G(r) + \theta^{-1} G(\theta r/k).$$
We now use Wilson’s algorithm to fill in the remainder of $B_j$. Let $G_n$ be the event given by applying Proposition 3.2 to the ball $B''_j$ with $U_0 = U_n$. Then
$$P(G_n^c) \le c e^{-c\delta_0^{-1/3}} \le \frac{1}{4}$$
by the choice of $\delta_0$, and therefore $P(H_{3,n} \cap G_n) \ge \frac{1}{4}$. If this event occurs, then all points in $B(z_j, \theta^2 r/2k)$ are within distance $G(\delta_0^{1/2} \theta^2 r/k)$ of $U_n$ in the graph metric $d$; in this case we label ball step $n$ as successful, and we terminate the whole algorithm. Then for all $z \in B(z_j, \theta^2 r/2k)$,
$$\begin{aligned}
d(0, z) &\le d(z, U_n) + \max_{w \in U_n} d(w, 0) \\
&\le G(\delta_0^{1/2} \theta^2 r/k) + \tfrac{1}{2} k^{1/2} G(r) + \theta^{-1} G(\theta r/k) \\
&\le k^{1/2} G(r),
\end{aligned}$$
provided that $k$ is large enough. So there exists $k_0 \ge 1$ such that, provided that $k \ge k_0$, if $H_{3,n} \cap G_n$ occurs then $B(z_j, \theta^2 r/2k) \subset B_d(0, k^{1/2} G(r))$. Since $R = k^{1/2} G(r) \le G(k^{1/2} r)$ we have $g(R) \le k^{1/2} r$, and therefore
$$|B_d(0, R)| \ge |B(z_j, \theta^2 r/2k)| \ge c k^{-2} r^2 \ge c\, g(R)^2/k^3. \tag{3.27}$$

If $H_{1,n} \cup H_{2,n} \cup (H_{3,n} \cap G_n^c)$ occurs then as soon as we have a random walk $S^w$ that ‘misbehaves’ (either by leaving $B_j$ before hitting $U_0$, or by having $M_w$ too large), then we terminate the ball step and mark the ball $B_j$ as ‘bad’. If $\omega \in H_{2,n}$ only the ball $B_j$ becomes bad, but if $\omega \in H_{1,n} \cup (H_{3,n} \cap G_n^c)$ then $S^w$ may hit several other balls $B_i$ before it hits $U_{n-1}$. Let $N_w^B$ denote the number of such balls hit by $S^w$. By Beurling’s estimate, the probability that $S^w$ enters a ball $B_i$ and then exits $B_i$ without hitting $U_0$ is less than $c\theta^{1/2}$. Since the balls $B_i$ are disjoint,
$$P(N_w^B \ge m) \le (c\theta^{1/2})^m \le e^{-c' m}. \tag{3.28}$$
A Type 3 failure occurs if $N_w^B \ge k^{1/2}/4$; using (3.28) we see that the probability that a ball step ends with a Type 3 failure is bounded by $\exp(-ck^{1/2})$. If we write $F_3$ for the event that some ball step ends with a Type 3 failure, then since there are at most $k^{1/2}$ ball steps,
$$P(F_3) \le k^{1/2} \exp(-ck^{1/2}) \le C \exp(-c' k^{1/2}). \tag{3.29}$$



The final possibility is that $k^{1/2}$ ball steps all end in failure; write $F_4$ for this event. Since each ball step has a probability at least 1/4 of success (conditional on the previous steps of the algorithm), we have
$$P(F_4) \le (3/4)^{k^{1/2}} \le e^{-ck^{1/2}}. \tag{3.30}$$

Thus either the algorithm is successful, or it ends with one of four types of failure, corresponding to the events $F_i$, $i = 1, \dots, 4$. By Lemma 3.7 and (3.21), (3.29), (3.30) we have $P(F_i) \le C \exp(-ck^{1/3})$ for each $i$. Therefore, provided $k \ge k_0$, (3.27) holds except on an event of probability $C \exp(-ck^{1/3})$. Taking $k = c\lambda^{1/3}$ for a suitable constant $c$, and adjusting the constant $C$ so that (3.17) holds for all $\lambda$, completes the proof.

The reason why we can only get a polynomial bound in Theorem 3.3 is that one cannot get exponential estimates for the probability that $\gamma(0, w)$ leaves $B(0, k|w|)$ (see Lemma 2.6). However, if we let $U_r$ be the connected component of 0 in $U \cap B(0, r)$, then the following proposition enables us to get exponential control on the length of $\gamma(0, w)$ for $w \in U_r$. This will allow us to obtain an exponential bound on the lower tail of $R_{\mathrm{eff}}(0, B_d(0, R)^c)$ in Proposition 3.6.

Proposition 3.5. There exist positive constants $c$ and $C$ such that for all $\lambda \ge 1$ and $r \ge 1$,
$$P\big(U_r \not\subset B_d(0, \lambda G(r))\big) \le C e^{-c\lambda}. \tag{3.31}$$

Proof. This proof is similar to that of Theorem 3.3. Let $E \subset B(0, 2r)$ be such that $|E| \le C\lambda^6$ and
$$B(0, 2r) \subset \bigcup_{z \in E} B(z, \lambda^{-3} r),$$
and let $U_0$ be the random tree obtained by applying Wilson’s algorithm with points in $E$ and root 0. For each $z \in E$, let $Y_z$ be defined as in Corollary 2.3, so that $Y_z = z$ if $\gamma(0, z) \subset B(0, 2r)$, and otherwise $Y_z$ is the first point on $\gamma(0, z)$ which is outside $B(0, 2r)$. Let
$$G_1 = \{d(Y_z, 0) \le \tfrac{1}{2}\lambda G(r) \text{ for all } z \in E\}.$$
Then by Corollary 2.3,
$$P(G_1^c) \le \sum_{z \in E} P\big(d(Y_z, 0) > \tfrac{1}{2}\lambda G(2r)\big) \le |E|\, C e^{-c\lambda} \le C\lambda^6 e^{-c\lambda}. \tag{3.32}$$
We now complete the construction of $U$ by using Wilson’s algorithm. Then Proposition 3.2 with $\delta_0 = \lambda^{-3}$ implies that there exists an event $G_2$ with
$$P(G_2^c) \le e^{-c\delta_0^{-1/3}} = e^{-c\lambda},$$
and on $G_2$,
$$\max_{x \in B(0, r)} d(x, U_0) \le G(\lambda^{-3/2} r). \tag{3.33}$$



Suppose $G_1 \cap G_2$ occurs, and let $x \in U_r$. Write $Z_x$ for the point where $\gamma(x, 0)$ meets $U_0$. Since $x \in U_r$, we must have $Z_x \in B(0, r)$, and $\gamma(Z_x, 0) \subset B(0, r)$. As $Z_x \in U_0$, there exists $z \in E$ such that $Z_x \in \gamma(0, z)$. Since $G_1$ occurs, $d(0, Z_x) \le d(0, Y_z) \le \frac{1}{2}\lambda G(r)$, while since $G_2$ occurs $d(x, Z_x) \le G(\lambda^{-3/2} r)$. So, provided $\lambda$ is large enough,
$$d(0, x) \le d(0, Z_x) + d(Z_x, x) \le \tfrac{1}{2}\lambda G(r) + G(\lambda^{-3/2} r) \le \lambda G(r).$$

Using (3.32) and (3.33), and adjusting the constant $C$ to handle the case of small $\lambda$, completes the proof.

Proposition 3.6. There exist positive constants $c$ and $C$ such that for all $R \ge 1$ and $\lambda \ge 1$,
(a)
$$P\big(R_{\mathrm{eff}}(0, B_d(0, R)^c) < \lambda^{-1} R\big) \le C e^{-c\lambda^{2/11}}, \tag{3.34}$$
(b)
$$E\big(R_{\mathrm{eff}}(0, B_d(0, R)^c)\, |B_d(0, R)|\big) \le C R\, g(R)^2. \tag{3.35}$$

Proof. (a) Recall the definition of $U_r$ given before Proposition 3.5, and note that for all $r \ge 1$, $R_{\mathrm{eff}}(0, B(0, r)^c) = R_{\mathrm{eff}}(0, U_r^c)$. Given $R$ and $\lambda$, let $r$ be such that $R = \lambda^{2/11} G(r)$. By monotonicity of resistance we have that if $U_r \subset B_d(0, R)$, then $R_{\mathrm{eff}}(0, B_d(0, R)^c) \ge R_{\mathrm{eff}}(0, U_r^c)$. So, writing $B_d = B_d(0, R)$,
$$\begin{aligned}
P\big(R_{\mathrm{eff}}(0, B_d^c) < \lambda^{-1} R\big) &= P\big(R_{\mathrm{eff}}(0, B_d^c) < \lambda^{-1} R;\ U_r \not\subset B_d\big) + P\big(R_{\mathrm{eff}}(0, B_d^c) < \lambda^{-1} R;\ U_r \subset B_d\big) \\
&\le P\big(U_r \not\subset B_d(0, \lambda^{2/11} G(r))\big) + P\big(R_{\mathrm{eff}}(0, U_r^c) < \lambda^{-9/11} G(r)\big).
\end{aligned}$$
By Proposition 3.5,
$$P\big(U_r \not\subset B_d(0, \lambda^{2/11} G(r))\big) \le C e^{-c\lambda^{2/11}},$$
while by (3.4),
$$P\big(R_{\mathrm{eff}}(0, U_r^c) < \lambda^{-9/11} G(r)\big) \le C e^{-c\lambda^{2/11}}.$$
This proves (a).
(b) Since $R_{\mathrm{eff}}(0, B_d(0, R)^c) \le R$, this is immediate from Theorem 1.2.

We conclude this section by proving the following technical lemma that was used in the proof of Theorem 3.4.

Lemma 3.7. Let $F_1$ be the event defined by (3.18). Then
$$P(F_1) \le C e^{-ck^{1/3}}. \tag{3.36}$$

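Proof. Let $b = e^{k^{1/3}}$. Then by Lemma 2.4,
$$P\big(\widehat{S}[\widehat{\sigma}_{br}, \infty) \cap B_r \neq \emptyset\big) \le C b^{-1} \le C e^{-k^{1/3}}. \tag{3.37}$$
If $\widehat{S}[\widehat{\sigma}_{2r}, \infty)$ hits more than $k/2$ balls (from the family $B_1, \dots, B_k$) then either $\widehat{S}$ hits $B_r$ after time $\widehat{\sigma}_{br}$, or $\widehat{S}[\widehat{\sigma}_{2r}, \widehat{\sigma}_{br}]$ hits more than $k/2$ balls. Given (3.37), it is therefore sufficient to prove that
$$P\big(\widehat{S}[\widehat{\sigma}_{2r}, \widehat{\sigma}_{br}] \text{ hits more than } k/2 \text{ balls}\big) \le C e^{-ck^{1/3}}. \tag{3.38}$$
Let $S$ be a simple random walk started at 0, and let $L = L(S[0, \sigma_{4br}])$. Then by [Mas09, Cor. 4.5], in order to prove (3.38), it is sufficient to prove that
$$P(L \text{ hits more than } k/2 \text{ balls}) \le C e^{-ck^{1/3}}. \tag{3.39}$$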

Define stopping times for $S$ by letting $T_0 = \sigma_{2r}$ and for $j \ge 1$,
$$R_j = \min\{n \ge T_{j-1} : S_n \in B(0, r)\}, \qquad T_j = \min\{n \ge R_j : S_n \notin B(0, 2r)\}.$$
Note that the balls $B_j$ can only be hit by $S$ in the intervals $[R_j, T_j]$ for $j \ge 1$. Let $M = \min\{j : R_j \ge \sigma_{4br}\}$. Then
$$P(M = j + 1 \mid M > j) = \frac{\log(2r) - \log(r)}{\log(4br) - \log r} = \frac{\log 2}{\log(4b)} \ge c k^{-1/3}.$$
Hence $P(M \ge k^{2/3}) \le C \exp(-ck^{1/3})$. For each $j \ge 1$ let $L_j = L(S[0, T_j])$, let $\alpha_j$ be the first exit by $L_j$ from $B(0, 2r)$, and $\beta_j$ be the number of steps of $L_j$. If $L$ hits more than $k/2$ balls then there must exist some $j \le M$ such that $L_j[\alpha_j, \beta_j]$ hits more than $k/2$ balls $B_i$. (We remark that since the balls $B_i$ are defined in terms of the loop erased walk path, they will depend on $L_j[0, \alpha_j]$. However, they will be fixed in each of the intervals $[R_j, T_j]$.) Hence, if $M \le k^{2/3}$ and $L$ hits more than $k/2$ balls then $S$ must hit more than $ck^{1/3}$ balls in one of the intervals $[R_j, T_j]$, without hitting the path $L_j[0, \alpha_j]$. However, by Beurling’s estimate the probability of this event is less than $C \exp(-ck^{1/3})$. Combining these estimates concludes the proof.

4. Random Walk Estimates

We recall the notation of random walks on the UST given in the Introduction. We write $\omega$ for elements of $D$. Finally, we recall the definitions of the stopping times $\tau_R$ and $\widetilde{\tau}_r$ from (1.9) and (1.10) and the transition densities $p_n^\omega(x, y)$ from (1.8). To avoid difficulties due to $U$ being bipartite, we also define
$$\widetilde{p}_n^{\,\omega}(x, y) = p_n^\omega(x, y) + p_{n+1}^\omega(x, y). \tag{4.1}$$

Throughout this section, we will write $C(\lambda)$ to denote expressions of the form $C\lambda^p$ and $c(\lambda)$ to denote expressions of the form $c\lambda^{-p}$, where $c$, $C$ and $p$ are positive constants.



As in [BJKS08,KM08] we define a (random) set $J(\lambda)$:

Definition 4.1. Let $U$ be the UST. For $\lambda \ge 1$ and $x \in \mathbb{Z}^2$, let $J(x, \lambda)$ be the set of those $R \in [1, \infty]$ such that the following all hold:
(1) $|B_d(x, R)| \le \lambda g(R)^2$,
(2) $\lambda^{-1} g(R)^2 \le |B_d(x, R)|$,
(3) $R_{\mathrm{eff}}(x, B_d(x, R)^c) \ge \lambda^{-1} R$.

Proposition 4.2. For $R \ge 1$, $\lambda \ge 1$ and $x \in \mathbb{Z}^2$,
(a)
$$P(R \in J(x, \lambda)) \ge 1 - C e^{-c\lambda^{1/9}}, \tag{4.2}$$
(b)
$$E\big(R_{\mathrm{eff}}(0, B_d(0, R)^c)\, |B_d(0, R)|\big) \le C R\, g(R)^2.$$
Therefore conditions (1), (2) and (4) of [KM08, Assumption 1.2] hold with $v(R) = g(R)^2$ and $r(R) = R$.

Proof. (a) is immediate from Theorem 1.2 and Proposition 3.6(a), while (b) is exactly Proposition 3.6(b).

We note that since $r(R) = R$, the condition $R_{\mathrm{eff}}(x, y) \le \lambda r(d(x, y))$ in [KM08, Def. 1.1] always holds for $\lambda \ge 1$, so that our definition of $J(\lambda)$ agrees with that in [KM08]. We will see that the time taken by the random walk $X$ to move a distance $R$ is of order $R g(R)^2$. We therefore define
$$F(R) = R\, g(R)^2, \tag{4.3}$$

and let $f$ be the inverse of $F$. We will prove that the heat kernel $\widetilde{p}_T(x, y)$ is of order $g(f(T))^{-2}$, and so we let
$$k(t) = g(f(t))^2, \quad t \ge 1. \tag{4.4}$$
Note that we have $f(t)k(t) = f(t)\, g(f(t))^2 = F(f(t)) = t$, so
$$\frac{1}{k(t)} = \frac{1}{g(f(t))^2} = \frac{f(t)}{t}. \tag{4.5}$$

Furthermore, since $G(R) \approx R^{5/4}$, we have
$$G(R) \approx R^{5/4}, \quad g(R) \approx R^{4/5}, \quad F(R) \approx R^{13/5}, \tag{4.6}$$
$$f(R) \approx R^{5/13}, \quad k(R) \approx R^{8/13}, \quad R^2 G(R) \approx R^{13/4}. \tag{4.7}$$
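The exponents in (4.6) and (4.7) all follow from the single input $G(R) \approx R^{5/4}$ together with the definitions of $g$, $F$, $f$ and $k$. The following sketch redoes this bookkeeping in exact rational arithmetic; the function name `exponents` and the dictionary keys are hypothetical, and the last entry uses the reading of the spectral dimension as twice the decay exponent of the return probability $k(n)^{-1}$.

```python
from fractions import Fraction

def exponents(growth=Fraction(5, 4)):
    """Exponent bookkeeping, assuming G(R) behaves like R**growth."""
    g_exp = 1 / growth                 # g is the inverse of G
    F_exp = 1 + 2 * g_exp              # F(R) = R g(R)^2
    f_exp = 1 / F_exp                  # f is the inverse of F
    k_exp = 2 * g_exp * f_exp          # k(t) = g(f(t))^2
    return {
        "g": g_exp,
        "F": F_exp,
        "f": f_exp,
        "k": k_exp,
        "r^2 G(r)": 2 + growth,        # exit time from a Euclidean ball
        "d_f": 2 * g_exp,              # volume growth |B_d(0, R)| ~ g(R)^2
        "d_s": 2 * k_exp,              # return probability ~ n^(-d_s / 2)
    }
```

With the default value 5/4 this returns 4/5, 13/5, 5/13, 8/13 and 13/4 for the first five entries, in agreement with (4.6) and (4.7).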

We now state our results for the SRW X on U, giving the asymptotic behaviour of d(0, X n ), the transition densities pnω (x, y), and the exit times τ R and τr . We begin with three theorems which follow directly from Proposition 4.2 and [KM08]. The first theorem gives tightness for some of these quantities, the second theorem gives expectations with respect to P, and the third theorem gives ‘quenched’ limits which hold P-a.s. In various ways these results make precise the intuition that the time taken by X to escape from a ball of radius R is of order F(R), that X moves a distance of order f (n) in time n, and that the probability of X returning to its initial point after 2n steps is the same order as 1/|B(0, f (n))|, that is g( f (n))−2 = k(n)−1 . We define the averaged measure P ∗ on × D by setting P ∗ (A × B) = E[1 A Pω0 (B)] and extending this to a probability measure.



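Theorem 4.3. Uniformly with respect to $n \ge 1$, $R \ge 1$ and $r \ge 1$,
$$P\Big(\theta^{-1} \le \frac{E_\omega^0 \tau_R}{F(R)} \le \theta\Big) \to 1 \quad \text{as } \theta \to \infty, \tag{4.8}$$
$$P\Big(\theta^{-1} \le \frac{E_\omega^0 \widetilde{\tau}_r}{r^2 G(r)} \le \theta\Big) \to 1 \quad \text{as } \theta \to \infty, \tag{4.9}$$
$$P\big(\theta^{-1} \le k(n)\, \widetilde{p}_{2n}^{\,\omega}(0, 0) \le \theta\big) \to 1 \quad \text{as } \theta \to \infty, \tag{4.10}$$
$$P^*\Big(\theta^{-1} < \frac{1 + d(0, X_n)}{f(n)} < \theta\Big) \to 1 \quad \text{as } \theta \to \infty. \tag{4.11}$$

Theorem 4.4. There exist positive constants $c$ and $C$ such that for all $n \ge 1$, $R \ge 1$, $r \ge 1$,
$$cF(R) \le E(E_\omega^0 \tau_R) \le CF(R), \tag{4.12}$$
$$c r^2 G(r) \le E(E_\omega^0 \widetilde{\tau}_r) \le C r^2 G(r), \tag{4.13}$$
$$c k(n)^{-1} \le E(\widetilde{p}_{2n}^{\,\omega}(0, 0)) \le C k(n)^{-1}, \tag{4.14}$$
$$c f(n) \le E(E_\omega^0 d(0, X_n)). \tag{4.15}$$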

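Theorem 4.5. There exist $\alpha_i < \infty$, and a subset $\Omega_0$ with $P(\Omega_0) = 1$ such that the following statements hold:
(a) For each $\omega \in \Omega_0$ and $x \in \mathbb{Z}^2$ there exists $N_x(\omega) < \infty$ such that
$$(\log\log n)^{-\alpha_1} k(n)^{-1} \le \widetilde{p}_{2n}^{\,\omega}(x, x) \le (\log\log n)^{\alpha_1} k(n)^{-1}, \quad n \ge N_x(\omega). \tag{4.16}$$
In particular, $d_s(U) = 16/13$, $P$-a.s.
(b) For each $\omega \in \Omega_0$ and $x \in \mathbb{Z}^2$ there exists $R_x(\omega) < \infty$ such that
$$(\log\log R)^{-\alpha_2} F(R) \le E_\omega^x \tau_R \le (\log\log R)^{\alpha_2} F(R), \quad R \ge R_x(\omega), \tag{4.17}$$
$$(\log\log r)^{-\alpha_3} r^2 G(r) \le E_\omega^x \widetilde{\tau}_r \le (\log\log r)^{\alpha_3} r^2 G(r), \quad r \ge R_x(\omega). \tag{4.18}$$
Hence
$$d_w(U) = \lim_{R \to \infty} \frac{\log E_\omega^x \tau_R}{\log R} = \frac{13}{5}, \qquad \lim_{r \to \infty} \frac{\log E_\omega^x \widetilde{\tau}_r}{\log r} = \frac{13}{4}. \tag{4.19}$$
(c) Let $Y_n = \max_{0 \le k \le n} d(0, X_k)$. For each $\omega \in \Omega_0$ and $x \in \mathbb{Z}^2$ there exist $N_x(\omega)$, $R_x(\omega)$ such that $P_\omega^x(N_x < \infty) = P_\omega^x(R_x < \infty) = 1$, and such that
$$(\log\log n)^{-\alpha_4} f(n) \le Y_n(\omega) \le (\log\log n)^{\alpha_4} f(n), \quad n \ge N_x(\omega), \tag{4.20}$$
$$(\log\log R)^{-\alpha_4} F(R) \le \tau_R(\omega) \le (\log\log R)^{\alpha_4} F(R), \quad R \ge R_x(\omega), \tag{4.21}$$
$$(\log\log r)^{-\alpha_4} r^2 G(r) \le \widetilde{\tau}_r(\omega) \le (\log\log r)^{\alpha_4} r^2 G(r), \quad r \ge R_x(\omega). \tag{4.22}$$
(d) Let $W_n = \{X_0, X_1, \dots, X_n\}$ and let $|W_n|$ denote its cardinality. For each $\omega \in \Omega_0$ and $x \in \mathbb{Z}^2$,
$$\lim_{n \to \infty} \frac{\log |W_n|}{\log n} = \frac{8}{13}, \quad P_\omega^x\text{-a.s.} \tag{4.23}$$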



The papers [BJKS08,KM08] studied random graphs for which information on ball volumes and resistances was only available from one point. These conditions were not strong enough to bound $E_\omega^0 d(0, X_n)$ or $\widetilde{p}_T^{\,\omega}(x, y)$ (see [BJKS08, Ex. 2.6]). Since the UST is stationary, we have the same estimates available from every point $x$, and this means that stronger conclusions are possible.

Theorem 4.6. There exist $N_0(\omega)$ with $P(N_0 < \infty) = 1$, $\alpha > 0$ and for all $q > 0$, $C_q$ such that
$$E_\omega^0 d(0, X_n)^q \le C_q f(n)^q (\log n)^{\alpha q} \quad \text{for } n \ge N_0(\omega). \tag{4.24}$$
Further, for all $n \ge 1$,
$$E\big(E_\omega^0 d(0, X_n)^q\big) \le C_q f(n)^q (\log n)^{\alpha q}. \tag{4.25}$$

Write $\Phi(T, x, x) = 0$, and for $x \neq y$ let
$$\Phi(T, x, y) = \frac{d(x, y)}{G\big((T/d(x, y))^{1/2}\big)}. \tag{4.26}$$

Theorem 4.7. There exists a constant $\alpha > 0$ and r.v. $N_x(\omega)$ with
$$P(N_x \ge n) \le C e^{-c(\log n)^2} \tag{4.27}$$
such that provided $F(T) \vee |x - y| \ge N_x(\omega)$ and $T \ge d(x, y)$, then writing $A = A(x, y, T) = C(\log(|x - y| \vee F(T)))^{\alpha}$,
$$\frac{1}{A k(T)} \exp\big(-A \Phi(T, x, y)\big) \le \widetilde{p}_T(x, y) \le \frac{A}{k(T)} \exp\big(-A^{-1} \Phi(T, x, y)\big). \tag{4.28}$$
(4.28)

Remark 4.8. If we had G(n) n 5/4 then since d f = 8/5 and dw = 1 + d f , we would have 1/(dw −1) d(x, y)dw (T, x, y) , (4.29) T so that, except for the logarithmic term A, the bounds in (4.28) would be of the same form as those obtained in the diffusions on fractals literature. Before we prove Theorems 4.3–4.7, we summarize some properties of the exit times τ R . Proposition 4.9. Let λ ≥ 1 and x ∈ Z2 . (a) If R, R/(4λ) ∈ J (x, λ), then c1 (λ)F(R) ≤ E ωx τ (x, R) ≤ C2 (λ)F(R).

(4.30)

(b) Let 0 < ε ≤ c3 (λ). Suppose that R, ε R, c4 (λ)ε R ∈ J (x, λ). Then Pωx (τ (x, R) < c5 (λ)F(ε R)) ≤ C6 (λ)ε.

(4.31)

Proof. This follows directly from [BJKS08, Prop. 2.1] and [KM08, Prop. 3.2, 3.5].



Proof of Theorems 4.3, 4.4, and 4.5. All these statements, except those relating to $\tau_r$, follow immediately from Proposition 4.2 and Propositions 1.3 and 1.4 and Theorem 1.5 of [KM08]. Thus it remains to prove (4.9), (4.13), (4.18) and (4.22). By the stationarity of $\mathcal{U}$ it is enough to consider the case $x = 0$. Recall that $U_r$ denotes the connected component of 0 in $\mathcal{U} \cap B(0, r)$, and therefore $\tau_r = \min\{n \ge 0 : X_n \notin U_r\}$. Let
\[ H_1(r, \lambda) = \{B_d(0, \lambda^{-1}G(r)) \subset U_r \subset B_d(0, \lambda G(r))\}. \]
On $H_1(r, \lambda)$ we have
\[ \tau_{\lambda^{-1}G(r)} \le \tau_r \le \tau_{\lambda G(r)}, \tag{4.32} \]
while by Theorem 3.1 and Proposition 3.5 we have, for $r \ge 1$, $\lambda \ge 1$,
\[ P(H_1(r, \lambda)^c) \le e^{-c\lambda^{2/3}}. \]

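The upper bound in (4.9) will follow from (4.13). For the lower bound, on $H_1(r, \lambda)$ we have, writing $R = \lambda^{-1}G(r)$,
\[ \frac{E^0_\omega \tau_r}{r^2 G(r)} \ge \frac{E^0_\omega \tau_R}{F(R)} \cdot \frac{F(R)}{r^2 G(r)}, \]
while $F(R)/r^2 G(r) \ge \lambda^{-3}$ by Lemma 2.1. So
\[ P\Big(\frac{E^0_\omega \tau_r}{r^2 G(r)} < \lambda^{-4}\Big) \le P(H_1(r, \lambda)^c) + P\Big(\frac{E^0_\omega \tau_R}{F(R)} < \lambda^{-1}\Big), \]
[...] $P(\xi_i \le t_0 \mid \xi_1, \dots, \xi_{i-1}) \le \tfrac12$. Then
\[ P\Big(\sum_{i=1}^n \xi_i < T\Big) \le \exp(-c_0 n + T/t_0). \tag{4.35} \]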

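Proof. Write $\mathcal{F}_i = \sigma(\xi_1, \dots, \xi_i)$. Let $\theta = 1/t_0$, and let $e^{-c_0} = \tfrac12(1 + e^{-1})$. Then
\[ E(e^{-\theta\xi_i} \mid \mathcal{F}_{i-1}) \le P(\xi_i < t_0 \mid \mathcal{F}_{i-1}) + e^{-\theta t_0} P(\xi_i \ge t_0 \mid \mathcal{F}_{i-1}) = P(\xi_i < t_0 \mid \mathcal{F}_{i-1})(1 - e^{-\theta t_0}) + e^{-\theta t_0} \le \tfrac12(1 + e^{-\theta t_0}) = e^{-c_0}. \]
Then
\[ P\Big(\sum_{i=1}^n \xi_i < T\Big) = P\big(e^{-\theta\sum_{i=1}^n \xi_i} > e^{-\theta T}\big) \le e^{\theta T} E\big(e^{-\theta\sum_{i=1}^n \xi_i}\big) \le e^{\theta T} e^{-n c_0}. \]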
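The exponential Chebyshev argument above can be sanity-checked on a toy example. The sketch below is only an illustration: it assumes i.i.d. variables $\xi_i$ taking the values $0$ and $2t_0$ with probability $1/2$ each (a hypothetical choice that satisfies $P(\xi_i \le t_0) \le 1/2$), computes the left-hand side of (4.35) exactly from the binomial distribution, and compares it with $\exp(-c_0 n + T/t_0)$. The function names are ours, not the paper's.

```python
import math

def c0():
    # e^{-c0} = (1 + e^{-1}) / 2, the constant from the proof of the lemma
    return -math.log(0.5 * (1.0 + math.exp(-1.0)))

def exact_tail(n, T, t0):
    """P(xi_1 + ... + xi_n < T) for i.i.d. xi_i in {0, 2*t0}, each with probability 1/2.

    The sum equals 2*t0*B with B ~ Binomial(n, 1/2) (assumed toy distribution).
    """
    count = sum(math.comb(n, j) for j in range(n + 1) if 2 * t0 * j < T)
    return count / 2 ** n

def chernoff_bound(n, T, t0):
    # Right-hand side of (4.35)
    return math.exp(-c0() * n + T / t0)

def worst_ratio(t0=1.0, ns=range(1, 41), Ts=(0.5, 1.0, 3.0, 7.5, 20.0)):
    # Largest value of (exact probability) / (bound) over a small grid
    return max(exact_tail(n, T, t0) / chernoff_bound(n, T, t0)
               for n in ns for T in Ts)
```

A ratio of at most 1 on the whole grid is what the lemma predicts; the bound is far from sharp for small $n$, which is harmless for its use in the proof of Proposition 4.12.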



We also require the following lemma, which is an immediate consequence of the definitions of the functions $F$ and $G$.

Lemma 4.11. Let $R \ge 1$, $T \ge 1$, and
\[ b_0 = \frac{R}{G((T/R)^{1/2})}. \tag{4.36} \]
Then,
\[ R/b_0 = G((T/R)^{1/2}) = f(T/b_0), \tag{4.37} \]
\[ b \le b_0 \iff T/b \le F(R/b) \iff f(T/b) \le R/b. \tag{4.38} \]
Also, if $\theta < 1$ and $\theta R \ge 1$, then
\[ c_7\theta^3 F(R) \le F(\theta R) \le C_8\theta^2 F(R), \tag{4.39} \]
\[ c_7\theta^{1/2} f(R) \le f(\theta R) \le C_8\theta^{1/3} f(R). \tag{4.40} \]
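To see how (4.36)–(4.38) fit together, it can help to evaluate them in the idealised pure power-law case. The sketch below assumes $G(r) = r^{5/4}$ exactly (the hypothetical situation of Remark 4.8, not something proved in the paper), with $g = G^{-1}$, $F(R) = R\,g(R)^2$ and $f = F^{-1}$; all function names are ours.

```python
import math

# Idealised power-law model (assumption): G(r) = r^{5/4}
def G(r): return r ** 1.25
def g(s): return s ** 0.8                 # inverse of G
def F(R): return R * g(R) ** 2            # = R^{13/5}
def f(t): return t ** (5.0 / 13.0)        # inverse of F

def b0(R, T):
    # (4.36)
    return R / G((T / R) ** 0.5)

def identity_437(R, T):
    # The three quantities that (4.37) asserts to be equal
    b = b0(R, T)
    return R / b, G((T / R) ** 0.5), f(T / b)

def condition_438(R, T, b):
    # The two equivalent conditions on the right of (4.38)
    return T / b <= F(R / b), f(T / b) <= R / b

def phi(T, d):
    # Phi(T, x, y) from (4.26), written in terms of d = d(x, y)
    return d / G((T / d) ** 0.5)

def phi_fractal_form(T, d, d_w=13.0 / 5.0):
    # Right-hand side of (4.29)
    return (d ** d_w / T) ** (1.0 / (d_w - 1.0))
```

In this model `phi` and `phi_fractal_form` agree exactly, which is the exponent arithmetic behind (4.29): $R\,(T/R)^{-5/8} = R^{13/8}/T^{5/8}$.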

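For $x \in \mathbb{Z}^2$, let
\[ A_x(\lambda, n) = \{\omega : R \in J(y, \lambda) \text{ for all } y \in B(x, n^2),\ 1 \le R \le n^2\}, \]
and let $A(\lambda, n) = A_0(\lambda, n)$.

Proposition 4.12. Let $\lambda \ge 1$ and suppose that
\[ 1 \le R \le n, \quad T \ge C_9(\lambda) R, \tag{4.41} \]
and $A(\lambda, n)$ occurs. Then,
\[ P^0_\omega(\tau_R < T) \le C_{10}(\lambda) \exp\Big(-c_{11}(\lambda)\frac{R}{G((T/R)^{1/2})}\Big). \tag{4.42} \]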

Proof. In this proof, the constants $c_i(\lambda)$, $C_i(\lambda)$ for $1 \le i \le 8$ will be as in Proposition 4.9 and Lemma 4.11, and $c_0$ will be as in Lemma 4.10. We work with the probability $P^0_\omega$, so that $X_0 = 0$. Let $b_0 = R/G((T/R)^{1/2})$ be as in (4.36), and define the quantities
\[ \varepsilon = (2C_6(\lambda))^{-1}, \quad \theta = \tfrac14 C_8^{-1} c_0 c_5(\lambda)\varepsilon^2, \quad C^*(\lambda) = 2\theta^{-1}, \]
\[ m = \lfloor\theta b_0\rfloor, \quad R' = R/m, \quad t_0 = c_5(\lambda) F(\varepsilon R'). \]
We now establish the key facts that we will need about the quantities defined above. We can assume that $b_0 \ge C^*(\lambda)$, for if $b_0 \le C^*(\lambda)$, then by adjusting the constants $C_{10}(\lambda)$ and $c_{11}(\lambda)$ we will still obtain (4.42). Therefore,
\[ 1 \le \tfrac12\theta b_0 \le m \le \theta b_0. \tag{4.43} \]
Furthermore, since $m/\theta \le b_0$, $\theta R/m \ge G((T/R)^{1/2}) \ge 1$ and $\theta/\varepsilon < 1$, we have by Lemma 4.11 that
\[ T/m \le \theta^{-1} F(\theta R/m) \le C_8\theta\varepsilon^{-2} F(\varepsilon R/m) \le \tfrac14 c_0 c_5(\lambda) F(\varepsilon R/m) = \tfrac14 c_0 t_0. \]



Therefore, T /t0
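[...] $p > 0$ be such that $C_i(\lambda) \le C\lambda^p$, $i = 9, 10$, and $c_{11}(\lambda) \ge c\lambda^{-p}$. We have
\[ E^0_\omega d(0, X_T)^q \le R^q + E^0_\omega \sum_{k=1}^{\infty} 1_{(e^{k-1}R \le d(0, X_T) < e^k R)}\, d(0, X_T)^q \le R^q + R^q \sum_{k=1}^{\infty} e^{kq} P^0_\omega(e^{k-1}R \le d(0, X_T) \le e^k R). \tag{4.46} \]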



By (4.2) we have
\[ P(A(\lambda, n)^c) \le 4n^3 e^{-c\lambda^{1/9}} \le \exp(-c\lambda^{1/9} + C\log n). \tag{4.47} \]
Let $\lambda_k = k^{10}$. Then $\sum_k P(A(\lambda_k, e^k)^c) < \infty$, and so by Borel-Cantelli there exists $K_0(\omega)$ such that $A(\lambda_k, e^k)$ holds for all $k \ge K_0$. Furthermore, we have
\[ P(K_0 \ge n) \le C e^{-cn^{10/9}}. \tag{4.48} \]

Suppose now that $k \ge K_0$. To bound the sum (4.46), we consider two ranges of $k$. If $C_9(\lambda_k)e^{k-1}R > T$, then we let $A_k = B_d(0, e^k R) - B_d(0, e^{k-1}R)$, and by the Carne-Varopoulos bound (see [Car85]),
\[ \begin{aligned} e^{kq} P^0_\omega(e^{k-1}R \le d(0, X_T) \le e^k R) &\le e^{kq} \sum_{y \in A_k} P^0_\omega(X_T = y) \\ &\le e^{kq} \sum_{y \in A_k} C\exp(-d(0, y)^2/2T) \\ &\le C e^{kq} (e^k R)^2 \exp(-(e^{k-1}R)^2/2T) \\ &\le C\exp(-C_9(\lambda_k)^{-1}e^k R + 2\log(e^k R) + kq) \\ &\le C\exp(-ck^{-10p}e^k + C_q k). \end{aligned} \tag{4.49} \]

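On the other hand, if $C_9(\lambda_k)e^{k-1}R \le T$, then we let $m = \lceil k + \log R\rceil$, so that $e^k R \le e^m < e^{k+1}R$. Then by Proposition 4.12,
\[ \begin{aligned} e^{kq} P^0_\omega(e^{k-1}R \le d(0, X_T) \le e^k R) &\le e^{kq} P^0_\omega(\tau_{e^{k-1}R} < T) \\ &\le e^{kq} C_{10}(\lambda_m) \exp\Big(-c_{11}(\lambda_m)\frac{e^{k-1}R}{G((e^{-k+1}T/R)^{1/2})}\Big) \\ &\le e^{kq} C m^{10p} \exp\Big(-cm^{-10p}e^k\frac{R}{G((T/R)^{1/2})}\Big) \\ &\le C(k + \log R)^{10p} \exp\big(-c(k + \log R)^{-10p}e^k + kq\big). \end{aligned} \tag{4.50} \]
Let $k_1 = 20p\log\log R$. Then if $k \ge k_1$,
\[ (k + \log R)^{10p} \le (k + e^{k/(20p)})^{10p} \le C e^{k/2}. \]
Hence for $k \ge k_1$,
\[ e^{kq} P^0_\omega(e^{k-1}R \le d(0, X_T) \le e^k R) \le C\exp(-ce^{k/2} + C_q k). \tag{4.51} \]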

Let $K = K_0 \vee k_1$. Then since the series given by (4.49) and (4.51) both converge,
\[ \sum_{k=1}^{\infty} e^{kq} P^0_\omega(e^{k-1}R \le d(0, X_T) \le e^k R) \le \sum_{k=1}^{K-1} e^{kq} + C_q \le e^{Kq} + C_q \le e^{K_0 q} + (\log R)^{20pq} + C_q. \]


M. T. Barlow, R. Masson

Hence since $R \le T$, we have that for all $T \ge N_0 = e^{e^{K_0}}$,
\[ E^0_\omega d(0, X_T)^q \le C_q R^q\big((\log T)^q + (\log T)^{20pq}\big), \tag{4.52} \]
so that (4.24) holds. Taking expectations in (4.52) and using (4.48) gives (4.25).

Remark 4.13. It is natural to ask if (4.25) holds without the term in $\log T$, as with the averaged estimates in Theorem 4.4. It seems likely that this is the case; such an averaged estimate was proved for the incipient infinite cluster on regular trees in [BK06, Th. 1.4(a)]. The key to obtaining such a bound is to control the exit times $\tau_{e^k R}$; this was done above using the events $A(\lambda, n)$, but this approach is far from optimal. The argument of Proposition 4.12 goes through if only a positive proportion of the points $X_{T_k}$ are at places where the estimate (4.31) can be applied. This idea was used in [BK06]; see the definition of the event $G_2(N, R)$ on p. 48. Suppose we say that $B_d(x, R)$ is $\lambda$-bad if $R \notin J(x, \lambda)$. Then it is natural to conjecture that there exists $\lambda_c$ such that for $\lambda > \lambda_c$ the bad balls fail to percolate on $\mathcal{U}$. Given such a result (and suitable control on the size of the clusters of bad balls) it seems plausible that the methods of this paper and [BK06] would then lead to a bound of the form $E(E^0_\omega d(0, X_T)^q) \le C_q f(T)^q$.

We now use the arguments in [BCK05] to obtain full heat kernel bounds for $p_T(x, y)$ and thereby prove Theorem 4.7. Since the techniques are fairly standard, we only give full details for the less familiar steps.

Lemma 4.14. Suppose $A(\lambda, n)$ holds. Let $x, y \in B(0, n)$. Then
(a)
\[ p_T(x, y) \le C_{12}(\lambda)k(T)^{-1}, \quad \text{if } 1 \le T \le F(n), \tag{4.53} \]
(b)
\[ p_T(x, y) \ge c_{13}(\lambda)k(T)^{-1}, \quad \text{if } 1 \le T \le F(n) \text{ and } d(x, y) \le c_{14}(\lambda)f(T). \tag{4.54} \]

Proof. (a) If $x = y$ then (a) is immediate from [KM08, Prop. 3.1]. Since $p_T(x, y)^2 \le p_T(x, x)p_T(y, y)$, the general case then follows.
(b) The bound when $x = y$ is given by [KM08, Prop. 3.3(2)]. We also have, by [KM08, Prop. 3.1],
\[ |p_T(x, y) - p_T(x, z)|^2 \le \frac{c}{T}\, d(y, z)\, p_{2\lfloor T/2\rfloor}(x, x). \]
Therefore using (a),
\[ \begin{aligned} p_T(x, y) &\ge p_T(x, x) - |p_T(x, x) - p_T(x, y)| \\ &\ge c(\lambda)k(T)^{-1} - \big(C(\lambda)d(x, y)T^{-1}k(T)^{-1}\big)^{1/2} \\ &= c(\lambda)k(T)^{-1}\Big(1 - \big(C(\lambda)d(x, y)T^{-1}k(T)\big)^{1/2}\Big). \end{aligned} \]
Since $k(T)/T = f(T)^{-1}$, (4.54) follows.



Recall that $\Phi(T, x, x) = 0$, and for $x \ne y$,
\[ \Phi(T, x, y) = \frac{d(x, y)}{G((T/d(x, y))^{1/2})}. \]

Proposition 4.15. Suppose that $A(\lambda, n)$ holds. Let $x, y \in B(0, n)$. If $d(x, y) \le T \le F(n)$, then
\[ \frac{c(\lambda)}{k(T)} \exp\big(-C(\lambda)\Phi(T, x, y)\big) \le p_T(x, y) \le \frac{C(\lambda)}{k(T)} \exp\big(-c(\lambda)\Phi(T, x, y)\big). \tag{4.55} \]

Proof. Let $R = d(x, y)$. In this proof we take $c_{13}(\lambda)$ and $c_{14}(\lambda)$ to be as in (4.54). We will choose a constant $C^*(\lambda) \ge 2$ later. Suppose first that $R \le T \le C^*(\lambda)R$. Then the upper bound in (4.55) is immediate from the Carne-Varopoulos bound. If $R + T$ is even then we have $p_T(x, y) \ge 4^{-T}$, and this gives the lower bound. We can therefore assume that $T \ge C^*(\lambda)R$. The upper bound follows from the bounds (4.53) and (4.42) by the same argument as in [BCK05, Prop. 3.8].

It remains to prove the lower bound in the case when $T \ge C^*(\lambda)R$, and for this we use a standard chaining technique which derives (4.55) from the 'near diagonal lower bound' (4.54). For its use in a discrete setting see for example [BCK05, Sect. 3.3]. As in Lemma 4.11, we set
\[ b_0 = \frac{R}{G((T/R)^{1/2})}. \tag{4.56} \]
If $b_0 < 1$ then we have from Lemma 4.11 that $R \le C_8 b_0^{2/3} f(T)$. If $C_8 b_0^{2/3} \le c_{14}(\lambda)$ then $R \le c_{14}(\lambda)f(T)$ and the lower bound in (4.55) follows from (4.54). We can therefore assume that $C_8 b_0^{2/3} > c_{14}(\lambda)$. We will choose $\theta > 2(c_{14}/C_8)^{-3/2}$ later; this then implies that $\theta b_0 \ge 2$. Let $m = \lfloor\theta b_0\rfloor$; we have $\tfrac12\theta b_0 \le m \le \theta b_0$. Let $r = R/m$, $t = T/m$; we will require that both $r$ and $t$ are greater than 4. Choose integers $t_1, \dots, t_m$ so that $|t_i - t| \le 2$ and $\sum t_i = T$. Choose a chain $x = z_0, z_1, \dots, z_m = y$ of points so that $d(z_{i-1}, z_i) \le 2r$, and let $B_i = B(z_i, r)$. If $x_i \in B_i$ for $1 \le i \le m$ then $d(x_{i-1}, x_i) \le 4r$. We choose $\theta$ so that we have
\[ p_{t_i}(x_{i-1}, x_i) \ge c_{13}(\lambda)k(t)^{-1} \quad \text{whenever } x_{i-1} \in B_{i-1},\ x_i \in B_i. \tag{4.57} \]
By (4.54) it is sufficient for this that
\[ 4R/m = 4r \le c_{14}(\lambda)f(t/2) = c_{14}(\lambda)f(T/2m). \tag{4.58} \]
Since $2m/\theta \ge b_0$, Lemma 4.11 implies that $f(\theta T/(2m)) \ge \theta R/(2m)$, and therefore
\[ 4R/m \le 8\theta^{-1} f(\theta T/(2m)) \le C\theta^{-1/3} f(T/2m), \tag{4.59} \]
and so taking $\theta = \max(2(c_{14}/C_8)^{-3/2}, (C/c_3(\lambda))^3)$ gives (4.58). The condition $T \ge C^*(\lambda)R$ implies that $f(T/b_0) = R/b_0 \ge G(C^*(\lambda))$, so taking $C^*$ large enough ensures that both $r$ and $t$ are greater than 4. The Chapman-Kolmogorov equations give
\[ p_T(x, y) \ge \sum_{x_1 \in B_1} \cdots \sum_{x_{m-1} \in B_{m-1}} p_{t_1}(x_0, x_1)\mu_{x_1}\, p_{t_2}(x_1, x_2)\mu_{x_2} \cdots p_{t_{m-1}}(x_{m-2}, x_{m-1})\mu_{x_{m-1}}\, p_{t_m}(x_{m-1}, y). \tag{4.60} \]
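The chaining step (4.60) rests on two elementary facts: the Chapman-Kolmogorov identity for the heat kernel density with respect to the degree measure $\mu$, and the observation that discarding intermediate points outside the balls $B_i$ can only decrease the sum. A minimal sketch on a small, hypothetical tree (our own example, with exact rational arithmetic, and with $p_n(x, y) = P^x(X_n = y)/\mu_y$ as the assumed normalisation):

```python
from fractions import Fraction

# Hypothetical 7-vertex tree; mu_v is the degree of v.
EDGES = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5), (4, 6)]
N = 7

def build_neighbours():
    nbrs = {v: [] for v in range(N)}
    for a, b in EDGES:
        nbrs[a].append(b)
        nbrs[b].append(a)
    return nbrs

NBRS = build_neighbours()
MU = {v: len(NBRS[v]) for v in range(N)}

def step(dist):
    # One step of the simple random walk applied to a probability vector
    new = {v: Fraction(0) for v in range(N)}
    for v, mass in dist.items():
        for w in NBRS[v]:
            new[w] += mass / MU[v]
    return new

def heat_kernel(n, x):
    # p_n(x, .) = P^x(X_n = .) / mu_. (density with respect to mu)
    dist = {v: Fraction(int(v == x)) for v in range(N)}
    for _ in range(n):
        dist = step(dist)
    return {y: dist[y] / MU[y] for y in range(N)}

def chained(s, t, x, y, ball):
    # sum over z in ball of p_s(x, z) mu_z p_t(z, y); equals p_{s+t}(x, y)
    # when ball is the whole vertex set, and is a lower bound otherwise
    ps = heat_kernel(s, x)
    return sum(ps[z] * MU[z] * heat_kernel(t, z)[y] for z in ball)
```

With `ball = range(N)` the function reproduces $p_{s+t}(x, y)$ exactly; restricting to a single intermediate vertex gives a strictly smaller but still positive lower bound, which is the mechanism used one factor at a time in the proof.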



Since $x_{m-1} \in B_{m-1}$ we have $p_{t_m}(x_{m-1}, y) \ge c_{13}(\lambda)k(t)^{-1} \ge c_{13}(\lambda)k(T)^{-1}$. Note that exactly one of $p_t(x, y)$ and $p_{t+1}(x, y)$ can be non-zero. Using this, and (4.57), we deduce that for $1 \le i \le m - 1$,
\[ \sum_{x_i \in B_i} p_{t_i}(x_{i-1}, x_i)\mu_{x_i} \ge c(\lambda)k(t)^{-1}g(r)^2. \tag{4.61} \]
The choice of $m$ implies that $c'(\lambda)f(t) \le r \le c(\lambda)f(t)$, and therefore $k(t)^{-1}g(r)^2 = g(r)^2/g(f(t))^2 \ge c(\lambda)$. So we obtain
\[ p_T(x, y) \ge k(T)^{-1}c(\lambda)^m \ge k(T)^{-1}\exp\big(-c(\lambda)R/G((T/R)^{1/2})\big). \tag{4.62} \]

Proof of Theorem 4.7. As in the proof of Theorem 4.6, we have that by (4.2),
\[ P(A(\lambda, n)^c) \le 4n^3 e^{-c\lambda^{1/9}} \le \exp(-c\lambda^{1/9} + C\log n). \]
Therefore if we let $\lambda_n = (\log n)^{18}$, then by Borel-Cantelli, for each $x \in \mathbb{Z}^2$ there exists $N_x$ such that $A_x(\lambda_n, n)$ holds for all $n \ge N_x$. Further we have that
\[ P(N_x \ge n) \le C e^{-c(\log n)^2}. \]
Let $x, y \in \mathbb{Z}^2$ and $T \ge 1$. To apply the bound in Proposition 4.15 we need to find $n$ such that $T \le F(n)$, $y \in B(x, n)$ and $n \ge N_x$. Hence if $F(T) \vee |x - y| \ge N_x$ we can take $n = F(T) \vee |x - y|$, to obtain (4.55) with constants $c(\lambda_n) = c(\log n)^p$. Choosing $\alpha$ suitably then gives (4.28).

Remark 4.16. If both $d(x, y) = R$ and $T$ are large, then since $d_w = 13/5$,
\[ \Phi(T, x, y) \asymp R\big((T/R)^{1/2}\big)^{-5/4} = \frac{R^{13/8}}{T^{5/8}} = (R^{d_w}/T)^{1/(d_w - 1)}. \]
Thus the term in the exponent takes the usual form one expects for heat kernel bounds on a regular graph with fractal growth; see the conditions UHK($\beta$) and LHK($\beta$) on p. 1644 of [BCK05].

Acknowledgements. The first author would like to thank Adam Timar for some valuable discussions on stationary trees in $\mathbb{Z}^d$. The second author would like to thank Greg Lawler for help in proving Lemma 2.4. We would also like to thank two referees for a careful reading of the paper, and for raising some interesting questions, which we have discussed briefly in Remark 1.3.



References

[AF] Aldous, D., Fill, J.: Reversible Markov Chains and Random Walks on Graphs. Book in preparation. http://www.stat.berkeley.edu/~aldous/RWG/book.html, 2002
[Bar04] Barlow, M.T.: Which values of the volume growth and escape time exponent are possible for a graph? Rev. Mat. Iberoam. 20(1), 1–31 (2004)
[BB89] Barlow, M.T., Bass, R.F.: The construction of Brownian motion on the Sierpinski carpet. Ann. Inst. H. Poincaré 25(1), 225–257 (1989)
[BCK05] Barlow, M.T., Coulhon, T., Kumagai, T.: Characterization of sub-Gaussian heat kernel estimates on strongly recurrent graphs. Comm. Pure Appl. Math. 58, 1642–1677 (2005)
[BJKS08] Barlow, M.T., Járai, A., Kumagai, T., Slade, G.: Random walk on the incipient infinite cluster for oriented percolation in high dimensions. Commun. Math. Phys. 278(2), 385–431 (2008)
[BK06] Barlow, M.T., Kumagai, T.: Random walk on the incipient infinite cluster on trees. Illinois J. Math. 50(1–4), 33–65 (2006)
[BM10] Barlow, M.T., Masson, R.: Exponential tail bounds for loop-erased random walk in two dimensions. Ann. Probab. 38(6), 2379–2417 (2010)
[BKPS04] Benjamini, I., Kesten, H., Peres, Y., Schramm, O.: Geometry of the uniform spanning forest: transitions in dimensions 4, 8, 12, .... Ann. of Math. 160(2), 465–491 (2004)
[BLPS01] Benjamini, I., Lyons, R., Peres, Y., Schramm, O.: Uniform spanning forests. Ann. Probab. 29(1), 1–65 (2001)
[Car85] Carne, T.K.: A transmutation formula for Markov chains. Bull. Sci. Math. 109, 399–405 (1985)
[DS84] Doyle, P.G., Snell, J.L.: Random Walks and Electric Networks. Washington DC: Mathematical Association of America, 1984, available at http://xxx.lanl.gov/abs/math/0001057v1 [math.PR], 2000
[Häg95] Häggström, O.: Random-cluster measures and uniform spanning trees. Stoch. Proc. App. 59, 267–275 (1995)
[Ken00] Kenyon, R.: The asymptotic determinant of the discrete Laplacian. Acta Math. 185(2), 239–286 (2000)
[KN09] Kozma, G., Nachmias, A.: The Alexander-Orbach conjecture holds in high dimensions. Invent. Math. 178(3), 635–654 (2009)
[KM08] Kumagai, T., Misumi, J.: Heat kernel estimates for strongly recurrent random walk on random media. J. Theor. Prob. 21(4), 910–935 (2008)
[Law91] Lawler, G.F.: Intersections of random walks. In: Probability and its Applications. Boston, MA: Birkhäuser Boston Inc., 1991
[LL] Lawler, G.F., Limic, V.: Random walk: a modern introduction. Cambridge: Cambridge Univ. Press, 2010
[LSW04] Lawler, G.F., Schramm, O., Werner, W.: Conformal invariance of planar loop-erased random walks and uniform spanning trees. Ann. Probab. 32(1B), 939–995 (2004)
[Lyo98] Lyons, R.: A bird's-eye view of uniform spanning trees and forests. In: Microsurveys in discrete probability (Princeton, NJ, 1997), DIMACS Ser. Discrete Math. Theoret. Comput. Sci., 41, Providence, RI: Amer. Math. Soc., 1998, pp. 135–162
[LP09] Lyons, R., Peres, Y.: Probability on Trees and Networks. Book in preparation. http://mypage.iu.edu/~rdlyons/prbtree/prbtree.html, 2011
[Mas09] Masson, R.: The growth exponent for planar loop-erased random walk. Electron. J. Probab. 14(36), 1012–1073 (2009)
[NW59] Nash-Williams, C.St J.A.: Random walks and electric currents in networks. Proc. Camb. Phil. Soc. 55, 181–194 (1959)
[Pem91] Pemantle, R.: Choosing a spanning tree for the integer lattice uniformly. Ann. Probab. 19(4), 1559–1574 (1991)
[PR04] Peres, Y., Revelle, D.: Scaling limits of the uniform spanning tree and loop-erased random walk on finite graphs. Preprint, available at http://front.math.ucdavis.edu/0410.5430, 2005
[Sch00] Schramm, O.: Scaling limits of loop-erased random walks and uniform spanning trees. Israel J. Math. 118, 221–288 (2000)
[Sch08] Schweinsberg, J.: Loop-erased random walk on finite graphs and the Rayleigh process. J. Theor. Prob. 21(2), 378–396 (2008)
[Sch09] Schweinsberg, J.: The loop-erased random walk and the uniform spanning tree on the four-dimensional discrete torus. Probab. Theory Rel. Fields 144(3–4), 319–370 (2009)
[Wil96] Wilson, D.B.: Generating random spanning trees more quickly than the cover time. In: Proceedings of the Twenty-eighth Annual ACM Symposium on the Theory of Computing (Philadelphia, PA, 1996), New York: ACM, 1996, pp. 296–303

Communicated by F.L. Toninelli




Ancient Dynamics in Bianchi Models: Approach to Periodic Cycles

S. Liebscher1, J. Härterich2, K. Webster3, M. Georgi1

1 Institut für Mathematik, Freie Universität Berlin, Arnimallee 3, 14195 Berlin, Germany. E-mail: [email protected]; [email protected]
2 Fakultät für Mathematik, Ruhr-Universität, Universitätsstr. 150, 44780 Bochum, Germany. E-mail: [email protected]
3 Department of Mathematics, Imperial College London, South Kensington Campus, London SW7 2AZ, UK. E-mail: [email protected]

Received: 3 May 2010 / Accepted: 1 November 2010
Published online: 30 April 2011 – © Springer-Verlag 2011

Abstract: We consider cosmological models of Bianchi type. In particular, we are interested in the α-limit dynamics near the Kasner circle of equilibria for Bianchi classes VIII and IX. They correspond to cosmological models close to the big-bang singularity. We prove the existence of a codimension-one family of solutions that limit, for $t \to -\infty$, onto a heteroclinic 3-cycle to the Kasner circle of equilibria. The theory extends to arbitrary heteroclinic chains that are uniformly bounded away from the three critical Taub points on the Kasner circle, in particular to all closed heteroclinic cycles of the Kasner map.

1. Introduction

The Einstein field equations
\[ R_{\alpha\beta} - \tfrac12 R g_{\alpha\beta} = T_{\alpha\beta} \]
restricted to spatially homogeneous, non-isotropic space-times $g_{\alpha\beta}$ with an ideal non-tilted fluid yield a system of ordinary differential equations, the Bianchi class-A model (2.1) [WH89,WE05]. The α-limit, $t \to -\infty$, of this system corresponds to the dynamics near the big-bang singularity. The dynamics in this limit, however, is not yet fully understood. It has been conjectured [Mis69,BdSR86] that the dynamics follows the (formal) Kasner map defined on the Kasner circle of equilibria and given by heteroclinic connections to the equilibria on the Kasner circle. Equilibria on the Kasner circle represent self-similarly expanding space-times. A trajectory close to a formal heteroclinic sequence thus corresponds to a space-time that is close to different self-similar space-times as it approaches the singularity in backward time-direction: a tumbling universe. At least for Bianchi class-IX solutions the Bianchi attractor formed by the union of the Kasner circle and its heteroclinic orbits has been proven to indeed be a (global) attractor for trajectories to generic initial data under the time-reversed flow [Rin01].


Rigorous results on the correspondence of the dynamics close to the attractor and the formal sequences of heteroclinic orbits to the Kasner circle are still missing. They would facilitate the discussion of the Belinskii-Khalatnikov-Lifshitz (BKL) conjecture of spatial decoupling close to singularities as well as the Misner hypothesis on the development of (spatially) homogeneous/isotropic space-times due to mixing near the initial singularity. See also [HU09] for a recent survey. In this paper we make the first step towards a rigorous description of the α-limit dynamics of the Bianchi system. We describe the set of initial conditions near the Bianchi attractor that follow the (up to equivariance) unique period-3 heteroclinic cycle. In fact we prove that this set forms a codimension-one Lipschitz manifold, see Theorem 4.2. We note that this result does not depend on monotonicity arguments and therefore covers both class-VIII and class-IX solutions approaching a period-3 heteroclinic cycle. We will start by reviewing the Bianchi system in Sect. 2. In Sect. 3 we study the passage near a line of equilibria in a generalized context. This yields a local map between sections to the period-3 heteroclinic cycle near the Kasner circle. In Sect. 4 this local map is combined with the global excursion given by the heteroclinic cycle. We obtain a return map with a fixed point representing the heteroclinic cycle. The stable manifold of this fixed point, i.e. the set of all solutions converging to the fixed point under iterations of the return map, represents all solutions following the heteroclinic cycle in the Bianchi system. We construct this (local) stable manifold as the limit object of a graph transformation. This yields the claimed Lipschitz set of trajectories with α-limit dynamics following the period-3 heteroclinic cycle, Theorem 4.2. Finally, we discuss generalizations to this result as well as open problems in Sect. 5. 
Generalizations include heteroclinic cycles of arbitrary period as well as non-periodic heteroclinic sequences that do not approach the singular Taub points. Matter models between dust and radiation can be included. Shortly after the submission of this article two more results on the same problem have appeared. While Béguin [Beg10] shows existence of solutions to non-periodic trajectories of the Kasner map that remain bounded away from the Taub points like in our treatment, Reiterer & Trubowitz [RT10] construct solutions of the full system near trajectories of the Kasner map that pass arbitrarily close to the Taub points. However, their analysis seems to be restricted to the vacuum case.

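2. The Bianchi Model

We consider the Bianchi class-A model on $(N_1, N_2, N_3, \Sigma_+, \Sigma_-) \in \mathbb{R}^5$ with time derivative $' = d/d\tau$. In terms of the spatial curvature variables $N_i$ and the shear variables $\Sigma_\pm$ it reads
\[ \begin{aligned} N_1' &= (q - 4\Sigma_+)N_1, \\ N_2' &= (q + 2\Sigma_+ + 2\sqrt3\,\Sigma_-)N_2, \\ N_3' &= (q + 2\Sigma_+ - 2\sqrt3\,\Sigma_-)N_3, \\ \Sigma_+' &= -(2 - q)\Sigma_+ - 3S_+, \\ \Sigma_-' &= -(2 - q)\Sigma_- - 3S_-. \end{aligned} \tag{2.1} \]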


Table 1. Bianchi classes given by the signs of the spatial curvature variables $N_i$. Remaining cases are related by equivariance

Bianchi class   I    II   VI0   VII0   VIII   IX
N1              0    +    0     0      −      +
N2              0    0    +     +      +      +
N3              0    0    −     +      +      +

The abbreviations
\[ \begin{aligned} q &= 2\big(\Sigma_+^2 + \Sigma_-^2\big) + \tfrac12(3\gamma - 2)\Omega, \\ \Omega &= 1 - \Sigma_+^2 - \Sigma_-^2 - K, \\ K &= \tfrac34\big(N_1^2 + N_2^2 + N_3^2 - 2(N_1N_2 + N_2N_3 + N_3N_1)\big), \\ S_+ &= \tfrac12\big((N_2 - N_3)^2 - N_1(2N_1 - N_2 - N_3)\big), \\ S_- &= \tfrac12\sqrt3\,(N_3 - N_2)(N_1 - N_2 - N_3) \end{aligned} \tag{2.2} \]
include the deceleration parameter $q$, the density parameter $\Omega$, and the curvature parameter $K$. The fixed parameter $\tfrac23 < \gamma \le 2$, given by the equation of state of an ideal fluid, describes the uniformly distributed matter. For example, a value $\gamma = 1$ represents dust, whereas $\gamma = 4/3$ represents radiation. For a derivation of these equations from the Einstein field equations see [WH89, WE05], the Appendix to [Rin01], or [Rin09].^1 The resulting flow in $\Omega$ yields
\[ \Omega' = (2q - (3\gamma - 2))\Omega. \tag{2.3} \]
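The evolution equation (2.3) is an algebraic consequence of (2.1) and (2.2), and this can be checked numerically. The sketch below is our own illustration, not code from the paper: it implements the right-hand side of (2.1) with the abbreviations (2.2), differentiates $\Omega$ along the flow by the chain rule, and compares the result with $(2q - (3\gamma - 2))\Omega$ at an arbitrary state. The tuple ordering `(N1, N2, N3, sp, sm)` with `sp`, `sm` standing for $\Sigma_+$, $\Sigma_-$ is an assumption of this sketch.

```python
import math

SQRT3 = math.sqrt(3.0)

def curvature(N1, N2, N3):
    # K, S_+ and S_- from (2.2)
    K = 0.75 * (N1**2 + N2**2 + N3**2 - 2.0 * (N1*N2 + N2*N3 + N3*N1))
    S_plus = 0.5 * ((N2 - N3)**2 - N1 * (2.0*N1 - N2 - N3))
    S_minus = 0.5 * SQRT3 * (N3 - N2) * (N1 - N2 - N3)
    return K, S_plus, S_minus

def rhs(state, gamma):
    """Right-hand side of (2.1); also returns q and Omega."""
    N1, N2, N3, sp, sm = state
    K, S_plus, S_minus = curvature(N1, N2, N3)
    Om = 1.0 - sp**2 - sm**2 - K
    q = 2.0 * (sp**2 + sm**2) + 0.5 * (3.0*gamma - 2.0) * Om
    deriv = (
        (q - 4.0*sp) * N1,
        (q + 2.0*sp + 2.0*SQRT3*sm) * N2,
        (q + 2.0*sp - 2.0*SQRT3*sm) * N3,
        -(2.0 - q)*sp - 3.0*S_plus,
        -(2.0 - q)*sm - 3.0*S_minus,
    )
    return deriv, q, Om

def omega_dot(state, gamma):
    # Chain rule for Omega = 1 - sp^2 - sm^2 - K along the vector field
    N1, N2, N3, sp, sm = state
    (dN1, dN2, dN3, dsp, dsm), _, _ = rhs(state, gamma)
    dK = 1.5 * (N1*dN1 + N2*dN2 + N3*dN3) - 1.5 * (
        dN1*N2 + N1*dN2 + dN2*N3 + N2*dN3 + dN3*N1 + N3*dN1)
    return -2.0*sp*dsp - 2.0*sm*dsm - dK
```

The same function `rhs` also shows directly that every point with $N_1 = N_2 = N_3 = 0$ and $\Sigma_+^2 + \Sigma_-^2 = 1$ is an equilibrium, since there $\Omega = 0$, $q = 2$ and $S_\pm = 0$.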

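The invariant set $\{\Omega = 0\}$ corresponds to the 4-dimensional vacuum model
\[ \begin{aligned} N_1' &= 2(1 - K - 2\Sigma_+)N_1, \\ N_2' &= 2(1 - K + \Sigma_+ + \sqrt3\,\Sigma_-)N_2, \\ N_3' &= 2(1 - K + \Sigma_+ - \sqrt3\,\Sigma_-)N_3, \\ \Sigma_+' &= -2K\Sigma_+ - 3S_+, \\ \Sigma_-' &= -2K\Sigma_- - 3S_-, \quad \text{resp. } \Sigma_- = \pm\sqrt{1 - K - \Sigma_+^2}. \end{aligned} \tag{2.4} \]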

Symmetries are given by permutations of $\{N_1, N_2, N_3\}$ together with appropriate linear transformations of $\Sigma_+, \Sigma_-$ corresponding to a representation of $S_3$ on $\mathbb{R}^2$. Together with the reflection $(N_1, N_2, N_3) \mapsto (-N_1, -N_2, -N_3)$, the system has an $S_3 \times \mathbb{Z}_2$ symmetry group. Note the classification of restrictions of the dynamical system to the various invariant regions, see Table 1.

^1 Note that [WE05] uses a slightly different scaling from [WH89] and [Rin01], $N_i^{[WE]} = 3N_i^{[WH]}$, with all other variables being the same. We use the scaling of [WH89].


Fig. 1. Heteroclinic caps of vacuum Bianchi II solutions to the Kasner circle

The Kasner circle $\mathcal{K} = \{N_1 = N_2 = N_3 = 0,\ \Omega = 0\}$, Bianchi class I, consists of equilibria. The attached half ellipsoids $\mathcal{H}_k = \{N_k \ne 0,\ N_l = N_m = 0,\ \Omega = 0\}$, $\{k, l, m\} = \{1, 2, 3\}$, Bianchi class II, consist of heteroclinic orbits to equilibria on the Kasner circle, see Fig. 1. The projections of the trajectories of Bianchi class-II vacuum solutions onto the $\Sigma_\pm$-plane yield straight lines through the point $(\Sigma_+, \Sigma_-) = (2, 0)$ in the cap $\{N_1 \ne 0,\ N_2 = N_3 = 0\}$. The projections of the other caps are given by the equivariance. Away from the singular points, $T_k$, $k = 1, 2, 3$, the Kasner circle $\mathcal{K}$ is normally hyperbolic with 2-dimensional center-stable manifold given by the family of incoming heteroclinic orbits.

The Kasner map $\Phi : \mathcal{K} \to \mathcal{K}$ is defined as follows: for each point $q_+ \in \mathcal{K} \setminus \{T_1, T_2, T_3\}$ there exists a Bianchi class-II vacuum heteroclinic orbit $q(t)$ converging to $q_+$ as $t \to \infty$. This orbit is unique up to reflection $(N_1, N_2, N_3) \mapsto (-N_1, -N_2, -N_3)$. Its unique α-limit $q_-$ defines the image of $q_+$ under the Kasner map
\[ \Phi(q_+) := q_-. \tag{2.5} \]
Including the three fixed points, $\Phi(T_k) := T_k$, this construction yields a continuous map, $\Phi : \mathcal{K} \to \mathcal{K}$. In fact $\Phi$ is a non-uniformly expanding map and its image $\Phi(\mathcal{K})$ is a double cover of $\mathcal{K}$. The main goal is to obtain rigorous results on the correspondence of iterations of the Kasner map to the dynamics of nearby trajectories to the Bianchi system (2.1) with reversed time, i.e. in the α-limit $t \to -\infty$.

There exists a 3-cycle of heteroclinic orbits, i.e. a fixed point of $\Phi^3$, unique up to equivariance. In the following we choose the cycle given by a heteroclinic orbit $q(t)$ in the Bianchi class-II vacuum cap $\{N_1 > 0,\ N_2 = N_3 = 0\}$ with $\Sigma_- > 0$ and its images under the equivariances given by cyclic permutation of $N_i$ and rotation by $2\pi/3$ in $(\Sigma_+, \Sigma_-)$, see Fig. 2. The α-limit of $q(t)$ is given by
\[ (\Sigma_+, \Sigma_-) = q_- = \tfrac18\big(1 - 3\sqrt5,\ \sqrt3 + \sqrt{15}\big). \tag{2.6} \]
After factoring out the equivariance, the heteroclinic orbit $q(t)$ becomes a homoclinic orbit on the orbit space of the equivariance. We will therefore discuss the dynamics of nearby trajectories by studying the return map to a transverse cross section to $h(t)$. This is done in Sect. 4.

Fig. 2. Kasner circle with sketch of the phase portrait of all Bianchi-II-ellipsoids and the resulting heteroclinic 3-cycle

The linearization of (2.1) at the Kasner circle yields
\[ \begin{pmatrix} 2 - 4\Sigma_+ & 0 & 0 & 0 & 0 \\ 0 & 2 + 2\Sigma_+ + 2\sqrt3\,\Sigma_- & 0 & 0 & 0 \\ 0 & 0 & 2 + 2\Sigma_+ - 2\sqrt3\,\Sigma_- & 0 & 0 \\ 0 & 0 & 0 & 3(2 - \gamma)\Sigma_+^2 & 3(2 - \gamma)\Sigma_+\Sigma_- \\ 0 & 0 & 0 & 3(2 - \gamma)\Sigma_+\Sigma_- & 3(2 - \gamma)\Sigma_-^2 \end{pmatrix}. \tag{2.7} \]
We find eigenvalues
\[ \mu_1 = 2 - 4\Sigma_+, \quad \mu_2 = 2 + 2\Sigma_+ + 2\sqrt3\,\Sigma_-, \quad \mu_3 = 2 + 2\Sigma_+ - 2\sqrt3\,\Sigma_- \]
to eigenvectors $\partial_{N_1}, \partial_{N_2}, \partial_{N_3}$ tangential to the Bianchi class-II vacuum heteroclinics. Additionally, there is the trivial eigenvalue zero to the eigenvector $-\Sigma_-\partial_{\Sigma_+} + \Sigma_+\partial_{\Sigma_-}$ tangential to the Kasner circle $\mathcal{K}$. The fifth eigenvalue $\mu_\Omega = 3(2 - \gamma) > 0$ corresponds to the eigenvector $\Sigma_+\partial_{\Sigma_+} + \Sigma_-\partial_{\Sigma_-}$ transverse to the vacuum boundary $\{\Omega = 0\}$. At the Kasner equilibrium $q_-$ of the 3-cycle (2.6) we have
\[ \mu_1 = \tfrac32(1 + \sqrt5), \quad \mu_2 = 3, \quad \mu_3 = \tfrac32(1 - \sqrt5). \]
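The geometric description of the Kasner map lends itself to a short numerical experiment. The sketch below is our own illustration under stated assumptions: a point on the Kasner circle is represented by its shear coordinates `(sp, sm)`, the cap through which the class-II orbit arrives is taken to be the one whose eigenvalue $\mu_k$ is negative, and the image is the second intersection of the unit circle with the straight line through the projection centre of that cap. The centres are assumed to be $(2, 0)$ and its rotations by $\pm 2\pi/3$, as suggested by the equivariance.

```python
import math

SQRT3 = math.sqrt(3.0)
SQRT5 = math.sqrt(5.0)

# Projection centres of the caps N_1, N_2, N_3 (assumed: (2, 0) and its rotations)
CENTRES = ((2.0, 0.0), (-1.0, -SQRT3), (-1.0, SQRT3))

def eigenvalues(sp, sm):
    # mu_1, mu_2, mu_3 of the linearization at a point of the Kasner circle
    return (2.0 - 4.0*sp,
            2.0 + 2.0*sp + 2.0*SQRT3*sm,
            2.0 + 2.0*sp - 2.0*SQRT3*sm)

def kasner_map(sp, sm):
    """Map the omega-limit q_+ of a class-II orbit to its alpha-limit q_-."""
    mus = eigenvalues(sp, sm)
    k = min(range(3), key=lambda i: mus[i])
    if mus[k] >= 0.0:
        return sp, sm                       # Taub point: kept fixed
    ax, ay = CENTRES[k]
    dx, dy = sp - ax, sm - ay
    # The centre has distance 2 from the origin, so the product of the two
    # distances along any secant of the unit circle equals 2^2 - 1 = 3.
    scale = 3.0 / (dx*dx + dy*dy)
    return ax + scale*dx, ay + scale*dy

# The alpha-limit (2.6) of the chosen 3-cycle
Q_MINUS = ((1.0 - 3.0*SQRT5) / 8.0, (SQRT3 + math.sqrt(15.0)) / 8.0)
```

Evaluating `eigenvalues(*Q_MINUS)` reproduces the three values listed above, and three applications of `kasner_map` starting from `Q_MINUS` return to `Q_MINUS` up to rounding, while a single application does not.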

Note that both unstable eigenvalues are stronger than the stable one, $0 < -\mu_3 < \mu_2 < \mu_1$, and that the heteroclinic orbit belonging to the 3-cycle is tangent to the strong unstable direction $\partial_{N_1}$. In fact at every point on $\mathcal{K} \setminus \{T_1, T_2, T_3\}$ there is one negative eigenvalue and it is weaker than the other two positive eigenvalues among $\mu_1, \mu_2, \mu_3$. At the singular Taub points $\{T_1, T_2, T_3\}$ two eigenvalues change their signs simultaneously. We therefore study the local passage near such nonsingular points in general systems in the following section, before combining it with the global excursion map and applying the results to the particular Bianchi system in Sect. 4.

3. Local Map

In this section we study the passage of trajectories under a general flow near a line of equilibria with eigenvalue constraints (3.4) consistent with the Kasner circle in the Bianchi system. We will collect estimates on expansion and contraction rates to establish Lipschitz properties of the local map between sections to a reference orbit given by the passage near the line of equilibria, see Theorem 3.9 at the end of this section.

Consider a $C^k$ vector field, $k \ge 4$,
\[ x' = f(x), \quad x \in \mathbb{R}^4, \tag{3.1} \]
near a point (w.l.o.g. the origin) on an equilibrium line
\[ f\big((0, 0, 0, x_c)^T\big) \equiv 0, \quad \forall x_c \in \mathbb{R}. \tag{3.2} \]
Assume that the linearization at the origin (and then by continuity locally all along the line) has the form
\[ A(x_c) = Df\big((0, 0, 0, x_c)^T\big) = \begin{pmatrix} \mu_u(x_c) & 0 & 0 & 0 \\ 0 & -\mu_s(x_c) & 0 & 0 \\ 0 & 0 & -\mu_{ss}(x_c) & 0 \\ * & * & * & 0 \end{pmatrix} \tag{3.3} \]
with
\[ 0 < \mu_u < \mu_s < \mu_{ss}. \tag{3.4} \]

We denote $x = (x_u, x_s, x_{ss}, x_c)^T$ according to the above splitting. We also abbreviate the vector of stable components as $x_{s,ss} = (x_s, x_{ss})^T$. The aim is to study a local map from an in-section $\Sigma^{\mathrm{in}} = \{x_{ss} = \varepsilon\}$ to an out-section $\Sigma^{\mathrm{out}} = \{x_u = \varepsilon\}$ for $x_c, x_s \approx 0$, see Fig. 3. This corresponds to the passage near the Kasner circle in the Bianchi system in backwards time direction. (We reversed the time direction to obtain a well defined local map.)

Fig. 3. Local passage $\Psi^{\mathrm{loc}} : \Sigma^{\mathrm{in}} \to \Sigma^{\mathrm{out}}$

Rescaling coordinates yields a system
\[ x' = A(x_c)x + \varepsilon g(x), \tag{3.5} \]
with $\varepsilon$ arbitrarily fixed and $g$ quadratic in $(x_u, x_s, x_{ss})$. The local map
\[ (x_u^{\mathrm{in}}, x_s^{\mathrm{in}}, x_c^{\mathrm{in}}) \longmapsto (x_s^{\mathrm{out}}, x_{ss}^{\mathrm{out}}, x_c^{\mathrm{out}}) = \Psi^{\mathrm{loc}}(x_u^{\mathrm{in}}, x_s^{\mathrm{in}}, x_c^{\mathrm{in}}) \tag{3.6} \]
is given by the first intersection of the solution of (3.5) to the initial value $(x_u^{\mathrm{in}}, x_s^{\mathrm{in}}, x_{ss}^{\mathrm{in}} = 1, x_c^{\mathrm{in}})$ with the out-section $\{x_u = 1\}$. The singular points $x^{\mathrm{in}}$ in the intersection of the stable manifold of the equilibrium line with the in-section are mapped to the respective points in the intersection of the unstable manifold of the equilibrium line with the out-section. Time at the in-section is usually set to zero. Time at the out-section depends on the initial condition and is denoted by $t^{\mathrm{loc}}$.

For any $x_c$ fixed and any $k \in \mathbb{N}$, the equilibrium $(0, 0, 0, x_c)^T$ possesses a one-dimensional unstable manifold $W^u(x_c)$ and a stable manifold $W^{s,ss}(x_c)$. Since the stable and unstable manifolds as well as the strong stable foliation of the stable manifold are $C^k$ they can be flattened, see e.g. [SSTC98], Theorem 5.8. By a $C^k$ change of coordinates the stable / strong stable / unstable manifolds to the equilibria locally coincide with the respective eigenspaces; in particular the following subspaces become invariant:
\[ \begin{aligned} W^u(x_c) &= \{x_s = x_{ss} = 0,\ x_c \text{ fixed}\}, \\ W^{s,ss}(x_c) &= \{x_u = 0,\ x_c \text{ fixed}\}, \\ W^{ss}(x_c) &= \{x_u = x_s = 0,\ x_c \text{ fixed}\}. \end{aligned} \tag{3.7} \]
In fact, in the Bianchi system, $W^u(x_c)$ and $W^{ss}(x_c)$ coincide with the class-II caps formed by families of incoming and outgoing heteroclinic orbits. The local map $\Psi^{\mathrm{loc}}$ is well-defined on the in-section
\[ \Sigma^{\mathrm{in}} = \{(x_u^{\mathrm{in}}, x_s^{\mathrm{in}}, x_{ss}^{\mathrm{in}}, x_c^{\mathrm{in}}) \mid x_{ss}^{\mathrm{in}} = 1,\ 0 < x_u^{\mathrm{in}} < 1,\ |x_s^{\mathrm{in}}| < 1,\ |x_c^{\mathrm{in}}| < 1\}, \tag{3.8} \]
see Lemma 3.6 below. The singular points of the local map thus become the set $\{x_u^{\mathrm{in}} = 0\}$ and we define:
\[ \Psi^{\mathrm{loc}}(x_u^{\mathrm{in}} = 0, x_s^{\mathrm{in}}, x_c^{\mathrm{in}}) = (x_s^{\mathrm{out}}, x_{ss}^{\mathrm{out}}, x_c^{\mathrm{out}}) := (0, 0, x_c^{\mathrm{in}}). \tag{3.9} \]
(3.9)

Let $P_u$, $P_s$, $P_{ss}$, $P_{s,ss} := P_s + P_{ss}$, $P_c$ be the eigenprojections with respect to $A(x_c)$. Then (3.3) yields
\[ A(x_c) = Df\big((0, 0, 0, x_c)^T\big) = \begin{pmatrix} \mu_u(x_c) & 0 & 0 & 0 \\ 0 & -\mu_s(x_c) & 0 & 0 \\ 0 & 0 & -\mu_{ss}(x_c) & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \tag{3.10} \]
and due to (3.7) the higher order terms have the form
\[ \begin{aligned} P_u g(x) &= x_u\tilde g_u(x) \in \mathbb{R}, \\ P_{s,ss} g(x) &= \tilde g_{s,ss}(x)x_{s,ss} = \begin{pmatrix} \tilde g_{s1}(x) & \tilde g_{s2}(x) \\ \tilde g_{ss1}(x) & \tilde g_{ss2}(x) \end{pmatrix}\begin{pmatrix} x_s \\ x_{ss} \end{pmatrix} \in \mathbb{R}^2, \\ P_c g(x) &= x_u\tilde g_c(x)x_{s,ss} = x_u\begin{pmatrix} \tilde g_{c1}(x) & \tilde g_{c2}(x) \end{pmatrix}\begin{pmatrix} x_s \\ x_{ss} \end{pmatrix} \in \mathbb{R}, \end{aligned} \tag{3.11} \]
with $C^{k-1}$-functions $\tilde g_u$, $\tilde g_{s,ss}$, vanishing along the line of equilibria, and $C^{k-2}$-function $\tilde g_c$. In particular
\[ \begin{aligned} |\tilde g_{c1}(x)|, |\tilde g_{c2}(x)| &< C, \\ |\tilde g_u(x)|, |\tilde g_{s1}(x)|, |\tilde g_{s2}(x)|, |\tilde g_{ss1}(x)|, |\tilde g_{ss2}(x)| &< C\max(|x_u|, |x_s|, |x_{ss}|) \end{aligned} \tag{3.12} \]
for some constant $C > 0$ independent of $\varepsilon$ and $x \in U$, where $U$ is some local neighborhood of the origin. Similarly, $\tilde g$ satisfies Lipschitz bounds
\[ |\tilde g_c(x) - \tilde g_c(\tilde x)|,\ |\tilde g_u(x) - \tilde g_u(\tilde x)|,\ \|\tilde g_{s,ss}(x) - \tilde g_{s,ss}(\tilde x)\| < C\|x - \tilde x\|. \tag{3.13} \]
Norms of vectors are always taken as $\ell^1$-norms, e.g. $\|x\| = |x_c| + |x_u| + \|x_{s,ss}\| = |x_c| + |x_u| + |x_s| + |x_{ss}|$. We choose
\[ U = (-2, 2)^4. \tag{3.14} \]
All further estimates will use this rescaled system (3.5) with flattened invariant manifolds (3.7) in the local neighborhood $U$. They will be valid for all $\varepsilon < \varepsilon_0$ and suitably chosen $\varepsilon_0$. In the original system (3.1) $\varepsilon_0$ bounds the size of the neighborhood of the origin in which this local analysis is valid.

Remark 3.1. The invariance of $W^{ss}(x_c)$ implies that $\tilde g_{s2}(x_u = 0, x_s, x_{ss}, x_c) = 0$. We do however not exploit this fact in our analysis below.

Proposition 3.2. Let $\mu_u := \mu_u(0)$, $-\mu_s := -\mu_s(0)$, $-\mu_{ss} := -\mu_{ss}(0)$ be the eigenvalues of (3.10) at the origin. Then for all $0 < \alpha < 1$ there exists an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ in (3.5) and $x \in U$,
\[ \alpha \le \frac{\mu_u(x_c)}{\mu_u},\ \frac{\mu_s(x_c)}{\mu_s},\ \frac{\mu_{ss}(x_c)}{\mu_{ss}} \le \alpha^{-1}. \]

Proof. The distinct eigenvalues in the linearization of the original system (3.3) depend differentiably on $x_c$, as long as (3.4) holds. For the rescaled system (3.5) with small $\varepsilon_0$ this provides bounds in $U$: indeed, there exists a constant $C > 0$ independent of $\varepsilon_0$, $\varepsilon$, such that
\[
\left|\frac{d}{dx_c}\mu_u(x_c)\right|,\ \left|\frac{d}{dx_c}\mu_s(x_c)\right|,\ \left|\frac{d}{dx_c}\mu_{ss}(x_c)\right| < \varepsilon C. \tag{3.15}
\]

The scalar function $\theta(x) := \mu_u\,(\mu_u(x_c) + \varepsilon \tilde g_u(x))^{-1}$ is therefore $C^{k-1}$ and close to 1. The vector field
\[
\dot x = \theta(x) f(x) = \frac{\mu_u\, f(x)}{\mu_u(x_c) + \varepsilon \tilde g_u(x)}
\]
has the same trajectories as the original vector field and all previous considerations remain valid. Thus we can assume, without loss of generality, that $\theta(x) \equiv 1$ in $U$, i.e.
\[
\mu_u(x_c) \equiv \mu_u, \qquad \tilde g_u(x) \equiv 0. \tag{3.16}
\]

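At this step we have made use of the fact that the origin possesses exactly one unstable eigenvalue. The vector field to consider then has the form
\[
\begin{aligned}
\dot x_u &= \mu_u x_u,\\
\dot x_{s,ss} &= -\mu_{s,ss}(x_c)\, x_{s,ss} + \varepsilon\, \tilde g_{s,ss}(x)\, x_{s,ss},\\
\dot x_c &= \varepsilon\, x_u\, \tilde g_c(x)\, x_{s,ss}.
\end{aligned}
\tag{3.17}
\]
Note again the abbreviations
\[
x_{s,ss} = \begin{pmatrix} x_s \\ x_{ss} \end{pmatrix},\quad
\tilde g_{s,ss}(x) = \begin{pmatrix} \tilde g_{s1}(x) & \tilde g_{s2}(x) \\ \tilde g_{ss1}(x) & \tilde g_{ss2}(x) \end{pmatrix},\quad
\mu_{s,ss}(x_c) = \begin{pmatrix} \mu_s(x_c) & 0 \\ 0 & \mu_{ss}(x_c) \end{pmatrix},\quad
\tilde g_c(x) = \begin{pmatrix} \tilde g_{c1}(x) & \tilde g_{c2}(x) \end{pmatrix},
\]
to facilitate later generalizations to higher dimensions of the stable component $x_{s,ss}$. In particular note the diagonal form of the linear part $\mu_{s,ss}$.

Lemma 3.3. Define
\[
\chi := |x_u|\,\|x_{s,ss}\| = |x_u|\,|x_s| + |x_u|\,|x_{ss}|.
\]
Then for all $0 < \alpha < 1$ there exists an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ and $x \in U$,
\[
\dot\chi \le -\alpha(\mu_s - \mu_u)\,\chi
\]
under the dynamics (3.17). In particular, with $x(0) = x^{in}$ on the in-section, as long as the trajectory remains inside $U$,
\[
\chi(t) \le \exp(-\alpha(\mu_s - \mu_u)t)\,|x_u^{in}|\,\|x_{s,ss}^{in}\| \le 2\exp(-\alpha(\mu_s - \mu_u)t)\,|x_u^{in}|.
\]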

Remark 3.4. The quantity $\chi$ describes the “distance” from the critical heteroclinic cycle. More precisely, $\chi$ is related to the distance from the set $\bigcup_{x_c} (W^{s,ss}(x_c) \cup W^u(x_c))$. Close to the critical heteroclinic cycle, however, this set is squeezed onto the critical cycle under the return map, so for large times $\chi$ in fact measures a “distance” to the critical heteroclinic cycle.
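The decay estimate of Lemma 3.3 can be illustrated numerically. The following is a minimal sketch, not the Bianchi system: the eigenvalues, the value of ε, and the bounded perturbation terms are hypothetical choices made only so that a system of the form (3.17) can be integrated with a hand-written Runge-Kutta step. It tracks χ along one local passage and compares it with the bound exp(−α(μs − μu)t) χ(0), and it records the drift of the center component.

```python
import math

# Toy data (assumptions for illustration only, not taken from the Bianchi system).
MU_U, MU_S, MU_SS = 1.0, 2.0, 3.0   # mu_u < mu_s < mu_ss
EPS = 0.01                           # small rescaling parameter epsilon
ALPHA = 0.9                          # chosen so that 2*EPS < (1 - ALPHA)*(MU_S - MU_U)


def vector_field(x):
    """Right-hand side of a toy system of the form (3.17); x = (xu, xs, xss, xc)."""
    xu, xs, xss, xc = x
    # Hypothetical bounded perturbations: every entry has modulus <= 1.
    g_s1, g_s2 = math.sin(xc), math.cos(xs)
    g_ss1, g_ss2 = math.cos(xc), math.sin(xss)
    g_c1, g_c2 = math.cos(xu), math.sin(xc)
    return (
        MU_U * xu,
        -MU_S * xs + EPS * (g_s1 * xs + g_s2 * xss),
        -MU_SS * xss + EPS * (g_ss1 * xs + g_ss2 * xss),
        EPS * xu * (g_c1 * xs + g_c2 * xss),
    )


def rk4_step(x, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = vector_field(x)
    k2 = vector_field(tuple(a + 0.5 * h * b for a, b in zip(x, k1)))
    k3 = vector_field(tuple(a + 0.5 * h * b for a, b in zip(x, k2)))
    k4 = vector_field(tuple(a + h * b for a, b in zip(x, k3)))
    return tuple(
        a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
        for a, b, c, d, e in zip(x, k1, k2, k3, k4)
    )


def chi(x):
    """chi = |xu| * (|xs| + |xss|), as in Lemma 3.3."""
    return abs(x[0]) * (abs(x[1]) + abs(x[2]))


def check_local_passage(xu_in=0.01, xs_in=0.5, xc_in=0.1, steps=4000):
    """Integrate from the in-section {xss = 1} until xu reaches 1.

    Returns (largest value of chi(t) - bound(t), final state, largest center drift).
    """
    x = (xu_in, xs_in, 1.0, xc_in)
    t_loc = math.log(1.0 / xu_in) / MU_U
    h = t_loc / steps
    chi0 = chi(x)
    worst = -math.inf
    drift = 0.0
    for n in range(1, steps + 1):
        x = rk4_step(x, h)
        bound = chi0 * math.exp(-ALPHA * (MU_S - MU_U) * n * h)
        worst = max(worst, chi(x) - bound)
        drift = max(drift, abs(x[3] - xc_in))
    return worst, x, drift
```

Calling `check_local_passage()` returns a non-positive first entry when the bound holds along the whole passage. The third entry can be compared with the drift bound 2εC|x_u^in| / (α(μs − μu)), where C = 1 for the toy perturbations above.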



Proof. Vector field (3.17) yields
\[
\dot\chi \le -(\mu_s(x_c) - \mu_u)\,\chi + 2\varepsilon\,\|\tilde g_{s,ss}(x)\|\,\chi.
\]
The function $\tilde g|_U$ is bounded uniformly in $\varepsilon$, see (3.12). Additionally, Proposition 3.2 provides bounds $-\mu_s(x_c) \le -\tilde\alpha \mu_s$ for arbitrary $0 < \tilde\alpha < 1$ and $\varepsilon_0$ small enough. By a suitable choice of $\tilde\alpha$ between $\alpha$ and 1 we estimate
\[
\dot\chi \le -\alpha(\mu_s - \mu_u)\,\chi.
\]
Integrating this differential inequality and using the initial value yields the second claim.

We need estimates for the passage time $t^{loc}$ as well as bounds on $|x_c|$, $|x_u|$, and $\|x_{s,ss}\|$. These are obtained in the following lemmata and summarized in Corollary 3.7.

Lemma 3.5. For all $0 < \alpha < 1$ there exists an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ and $x(0) = x^{in}$ in the in-section, see (3.8), as long as $x(t)$ remains in $U$ under the flow to the vector field (3.17),
\[
x_u(t) = \exp(\mu_u t)\, x_u^{in}, \tag{3.18}
\]
\[
\|x_{s,ss}(t)\| = |x_s(t)| + |x_{ss}(t)| \le \exp(-\alpha\mu_s t)\,\|x_{s,ss}^{in}\|, \tag{3.19}
\]
\[
|x_c(t) - x_c^{in}| \le \frac{2\varepsilon C}{\alpha(\mu_s - \mu_u)}\,|x_u^{in}|. \tag{3.20}
\]
Here, $C$ is the uniform (in $x$ and $\varepsilon$) bound on $\tilde g|_U$ from (3.12).

Proof. The unstable component (3.18) is given directly by the vector field. On the stable component, we obtain the estimate
\[
\frac{d}{dt}\,(|x_s| + |x_{ss}|) \le \left(-\mu_s(x_c) + 2\varepsilon\,\|\tilde g_{s,ss}(x)\|\right)(|x_s| + |x_{ss}|).
\]
The function $\tilde g|_U$ is bounded uniformly in $\varepsilon$, see (3.12). Together with the estimate on the eigenvalues of Proposition 3.2, this yields the bounds (3.19) on the stable component. The center component of (3.17) is estimated by
\[
|x_c(t) - x_c^{in}| = \varepsilon \left|\int_0^t P_c\, g(x(s))\, ds\right| \le \varepsilon \int_0^t |P_c\, g(x(s))|\, ds \le \varepsilon C \int_0^t |\chi(s)|\, ds \le \varepsilon C\, \frac{2}{\alpha(\mu_s - \mu_u)}\,|x_u^{in}|.
\]
Here we used the estimate of Lemma 3.3 on $\chi$.

Corollary 3.6. There exists an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ and $x(0) = x^{in}$ in the in-section, see (3.8), the trajectory $x(t)$ under the flow to the vector field (3.17) remains in $U$ as long as $|x_u| \le 1$, i.e. all along the passage defining the local map. The passage time $t^{loc}$ is given by
\[
t^{loc} = \frac{1}{\mu_u} \ln \frac{1}{|x_u^{in}|}.
\]



Proof. We choose $\varepsilon_0$ smaller than $\frac{1}{2C}\,\alpha(\mu_s - \mu_u)$, see Lemma 3.5. Then trajectories starting in the in-section cannot leave $U$ unless $x_u$ becomes larger than 1, see (3.20), (3.19). Furthermore, (3.18) ensures that $x_u$ must grow beyond 1. Thus every trajectory starting in the in-section intersects the out-section $\{x_u = 1\}$ before leaving $U$. Setting $x_u(t^{loc}) = 1$ in (3.18) determines the passage time $t^{loc}$.
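As a small worked illustration of the passage time, the sketch below evaluates t^loc from Corollary 3.6 and the contraction factor exp(−αμs t^loc) that (3.19) yields at the out-section. Substituting t^loc shows that this factor equals |x_u^in|^(αμs/μu), which is the power of |x_u^in| appearing in the subsequent estimates. The function names and the sample parameter values are illustrative assumptions.

```python
import math


def passage_time(xu_in, mu_u):
    """t_loc = (1/mu_u) * ln(1/|xu_in|), obtained from xu(t_loc) = 1 in (3.18)."""
    if not 0.0 < abs(xu_in) <= 1.0:
        raise ValueError("expected 0 < |xu_in| <= 1 on the in-section")
    return math.log(1.0 / abs(xu_in)) / mu_u


def stable_contraction(xu_in, mu_u, mu_s, alpha):
    """Factor exp(-alpha * mu_s * t_loc) from (3.19), evaluated at t = t_loc."""
    return math.exp(-alpha * mu_s * passage_time(xu_in, mu_u))


def power_form(xu_in, mu_u, mu_s, alpha):
    """The same factor written as |xu_in| ** (alpha * mu_s / mu_u)."""
    return abs(xu_in) ** (alpha * mu_s / mu_u)
```

For any admissible input the two functions `stable_contraction` and `power_form` agree up to rounding, and the factor tends to zero as `xu_in` tends to zero.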

Corollary 3.7. The local map (3.6, 3.9), i.e. the local passage on the closed in-section including the singular line $\{x_u^{in} = 0\}$, is continuous. For all $0 < \alpha < 1$ there exists an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ the following estimates hold:
\[
|x_c^{out} - x_c^{in}| \le \frac{2\varepsilon C}{\alpha(\mu_s - \mu_u)}\,|x_u^{in}|,
\]
\[
\|x_{s,ss}^{out}\| = |x_s^{out}| + |x_{ss}^{out}| \le |x_u^{in}|^{\alpha\mu_s/\mu_u - 1}\,|x_u^{in}|\,\|x_{s,ss}^{in}\| \le |x_u^{in}|^{\alpha\mu_s/\mu_u - 1}\,(|x_u^{in}| + |x_s^{in}|).
\]
Thus the drift along the line of equilibria is arbitrarily small and the distance from the orbit to the union of the stable and unstable manifolds shrinks arbitrarily fast, close to the critical orbit.

Proof. The estimates follow directly from Lemma 3.5 applied to the local passage time given by Corollary 3.6. They also establish continuity of the local map at the singular line $\{x_u^{in} = 0\}$. Note $0 < x_u^{in} < 1$ and $x_{ss}^{in} = 1$ on the in-section. Note further that $\beta = \alpha\mu_s/\mu_u - 1 > 0$ for $\alpha$ chosen close enough to 1.

To obtain Lipschitz bounds for the local map, we have to improve the estimates considerably. Consider two trajectories, $x(t)$ and $\tilde x(t)$, starting in the in-section. If $\tilde x_u^{in} = 0$ then $\tilde x_{s,ss}^{out} = 0$, $\tilde x_c^{out} = \tilde x_c^{in}$, and the Lipschitz estimates follow from Corollary 3.7. Choose, without loss of generality, $0 < \tilde x_u^{in} \le x_u^{in}$. In order to compare the two trajectories, we want $\tilde x_u(0) = x_u^{in} = x_u(0)$ and therefore start the trajectory $\tilde x$ at negative time
\[
\tilde x(-\tilde t_0) = \tilde x^{in}, \qquad \tilde t_0 := \frac{1}{\mu_u} \ln \frac{x_u^{in}}{\tilde x_u^{in}}. \tag{3.21}
\]

Then $\tilde x_u(t) = x_u(t)$ for all $t \ge 0$. In particular, both trajectories meet the out-section at the same time $t^{loc} = \frac{1}{\mu_u} \ln \frac{1}{|x_u^{in}|}$, see Corollary 3.6. The next lemma gives some estimates for the initial segment of the trajectory $\tilde x$ up to $t = 0$.

Lemma 3.8. Consider (3.17) with initial values $x(0) = x^{in}$ and $\tilde x(-\tilde t_0) = \tilde x^{in}$ in the in-section. Assume without loss of generality $0 < \tilde x_u^{in} \le x_u^{in}$. There exists an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ the following estimates hold:
\[
|\tilde x_c(0) - \tilde x_c^{in}| \le \varepsilon L_c\, \frac{\tilde x_u^{in}}{x_u^{in}}\, |x_u^{in} - \tilde x_u^{in}|,
\]
\[
\|\tilde x_{s,ss}(0) - \tilde x_{s,ss}^{in}\| = |\tilde x_s(0) - \tilde x_s^{in}| + |\tilde x_{ss}(0) - \tilde x_{ss}^{in}| \le L_s\, \frac{1}{x_u^{in}}\, |x_u^{in} - \tilde x_u^{in}|,
\]
with suitably chosen constants $L_s > 0$, $L_c > 0$, uniform in $x^{in}$, $\tilde x^{in}$ and $\varepsilon$.



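Proof. The $x_c$-component is estimated similarly to Lemma 3.5:
\[
|\tilde x_c(0) - \tilde x_c^{in}| \le \varepsilon C \int_{-\tilde t_0}^{0} |\tilde\chi(s)|\, ds \le \varepsilon C \int_0^{\tilde t_0} \exp(-\alpha(\mu_s - \mu_u)s)\,|\tilde x_u^{in}|\,\|\tilde x_{s,ss}^{in}\|\, ds.
\]
Again $C$ is the uniform bound on $\tilde g|_U$. The last inequality uses the estimate of Lemma 3.3. Plugging in (3.21) and $\|\tilde x_{s,ss}^{in}\| \le 2$ we obtain
\[
\begin{aligned}
|\tilde x_c(0) - \tilde x_c^{in}| &\le \frac{2\varepsilon C}{\alpha(\mu_s - \mu_u)}\,|\tilde x_u^{in}| \left(1 - \exp(-\alpha(\mu_s - \mu_u)\tilde t_0)\right)\\
&\le \frac{2\varepsilon C}{\alpha(\mu_s - \mu_u)}\,|\tilde x_u^{in}| \left(1 - \left(\frac{x_u^{in}}{\tilde x_u^{in}}\right)^{-(\mu_s - \mu_u)/\mu_u}\right)\\
&\le \frac{2\varepsilon C L}{\alpha(\mu_s - \mu_u)}\,|\tilde x_u^{in}| \left(1 - \frac{\tilde x_u^{in}}{x_u^{in}}\right).
\end{aligned}
\]
The last inequality with $L := \max\{(\mu_s - \mu_u)/\mu_u, 1\}$ uses the general Lipschitz estimate $1 - \xi^\vartheta \le \max\{\vartheta, 1\}(1 - \xi)$ for arbitrary $0 < \xi < 1$ and $\vartheta > 0$. Finally, the choice $L_c := 2CL\,\alpha^{-1}(\mu_s - \mu_u)^{-1}$ yields the claim on the $x_c$-component.

The $x_{s,ss}$-component is contracted at most with the rate of the strong stable eigenvalue $\mu_{ss}$, perturbed by small higher order terms. Indeed, due to the diagonal form of the linear part, the variation-of-constants formula reads
\[
\begin{aligned}
\tilde x_{s,ss}(0) ={}& \begin{pmatrix} \exp\left(\int_{-\tilde t_0}^{0} -\mu_s(\tilde x_c(s))\, ds\right) & 0 \\ 0 & \exp\left(\int_{-\tilde t_0}^{0} -\mu_{ss}(\tilde x_c(s))\, ds\right) \end{pmatrix} \tilde x_{s,ss}^{in}\\
&+ \varepsilon \int_{-\tilde t_0}^{0} \begin{pmatrix} \exp\left(\int_s^0 -\mu_s(\tilde x_c(\sigma))\, d\sigma\right) & 0 \\ 0 & \exp\left(\int_s^0 -\mu_{ss}(\tilde x_c(\sigma))\, d\sigma\right) \end{pmatrix} \tilde g_{s,ss}(\tilde x(s))\, \tilde x_{s,ss}(s)\, ds.
\end{aligned}
\]
We have already established uniform bounds on the nonlinearity, $|g| < C$, and the contraction rates, $\alpha\mu_s < \mu_s(\tilde x_c) < \mu_{ss}(\tilde x_c) < \frac{1}{\alpha}\mu_{ss}$, with $0 < \alpha < 1$ arbitrarily close to 1. Thus
\[
\begin{aligned}
\|\tilde x_{s,ss}(0) - \tilde x_{s,ss}^{in}\| &< \left(1 - \exp(-\tfrac{1}{\alpha}\mu_{ss}\tilde t_0)\right)\|\tilde x_{s,ss}^{in}\| + \varepsilon C \int_0^{\tilde t_0} \exp(-\alpha\mu_s s)\, ds\\
&\le 2\left(1 - \exp(-\tfrac{1}{\alpha}\mu_{ss}\tilde t_0)\right) + \frac{\varepsilon C}{\alpha\mu_s}\left(1 - \exp(-\alpha\mu_s\tilde t_0)\right)\\
&= 2\left(1 - \left(\frac{\tilde x_u^{in}}{x_u^{in}}\right)^{\mu_{ss}/(\alpha\mu_u)}\right) + \frac{\varepsilon C}{\alpha\mu_s}\left(1 - \left(\frac{\tilde x_u^{in}}{x_u^{in}}\right)^{\alpha\mu_s/\mu_u}\right)\\
&\le 2L_1\left(1 - \frac{\tilde x_u^{in}}{x_u^{in}}\right) + \frac{2\varepsilon C L_2}{\alpha\mu_s}\left(1 - \frac{\tilde x_u^{in}}{x_u^{in}}\right).
\end{aligned}
\]
The last inequality again uses the general estimate $1 - \xi^\vartheta \le \max\{\vartheta, 1\}(1 - \xi)$ for arbitrary $0 < \xi < 1$ and $\vartheta > 0$. The constants $L_1$ and $L_2$ are independent of $\varepsilon$ and therefore yield the claim.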
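The elementary estimate 1 − ξ^ϑ ≤ max{ϑ, 1}(1 − ξ) used twice in the proof above can be checked on a grid. The sketch below is only a numerical sanity check, with grid size and sample exponents chosen arbitrarily; it is not a substitute for the one-line argument (mean value theorem for ϑ ≥ 1, monotonicity of ξ^ϑ in ϑ for ϑ < 1).

```python
def lipschitz_gap(xi, theta):
    """Right-hand side minus left-hand side of 1 - xi**theta <= max(theta, 1) * (1 - xi)."""
    return max(theta, 1.0) * (1.0 - xi) - (1.0 - xi ** theta)


def min_gap(n=200, thetas=(0.1, 0.5, 1.0, 1.5, 3.0, 10.0)):
    """Smallest gap over xi = 1/n, ..., (n-1)/n and the sample exponents."""
    return min(lipschitz_gap(i / n, th) for th in thetas for i in range(1, n))
```

A non-negative value of `min_gap()`, up to rounding, is consistent with the estimate on the sampled points. For ϑ = 1 the gap vanishes identically.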


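Theorem 3.9. The local map (3.6, 3.9), i.e. the local passage on the in-section including the singular line $x_u^{in} = 0$, is Lipschitz continuous. Let
\[
\|\tilde x^{in} - x^{in}\| = |\tilde x_u^{in} - x_u^{in}| + |\tilde x_c^{in} - x_c^{in}| + |\tilde x_s^{in} - x_s^{in}|
\]
be the norm in the in-section. Then there exists an $\varepsilon_0 > 0$ such that for all $\varepsilon < \varepsilon_0$ and $0 \le \tilde x_u^{in} \le x_u^{in}$ the following estimates hold:
\[
|(\tilde x_c^{out} - x_c^{out}) - (\tilde x_c^{in} - x_c^{in})| \le \varepsilon L_c\, \|\tilde x^{in} - x^{in}\|,
\]
\[
\|\tilde x_{s,ss}^{out} - x_{s,ss}^{out}\| = |\tilde x_s^{out} - x_s^{out}| + |\tilde x_{ss}^{out} - x_{ss}^{out}| \le L_s\, |x_u^{in}|^\beta\, \|\tilde x^{in} - x^{in}\|,
\]
with $L_c > 0$, $L_s > 0$, $\beta > 0$ uniform in $x$, $\varepsilon$. In particular, $L_s |x_u^{in}|^\beta$ is arbitrarily small for $x_u^{in}$ small enough. The drift in the center direction can be made arbitrarily small by choosing a sufficiently small local neighborhood. The contraction in the transverse directions is arbitrarily strong by restricting the in-section to the part close to the primary object, i.e. the stable manifold of the origin.

Proof. If $\tilde x_u^{in} = 0$ then $\tilde x_{s,ss}^{out} = 0$, $\tilde x_c^{out} = \tilde x_c^{in}$, and the Lipschitz estimates follow from Corollary 3.7. Therefore let $0 < \tilde x_u^{in} \le x_u^{in}$. Choose $\tilde x(-\tilde t_0) = \tilde x^{in}$ according to (3.21). Then $\tilde x(0)$ is estimated by Lemma 3.8, and $x(0) = x^{in}$. We abbreviate
\[
d_c(t) := \int_0^t \left|\frac{d}{ds}\left(\tilde x_c(s) - x_c(s)\right)\right| ds, \quad d_c(0) = 0, \qquad d_s(t) := \|\tilde x_{s,ss}(t) - x_{s,ss}(t)\|. \tag{3.22}
\]
In particular, using Lemma 3.8, we estimate the initial value,
\[
d_s(0) \le L_0\, \frac{1}{x_u^{in}}\, |x_u^{in} - \tilde x_u^{in}| + \|\tilde x_{s,ss}^{in} - x_{s,ss}^{in}\| \le L_1\, \frac{1}{x_u^{in}}\, \|\tilde x^{in} - x^{in}\|, \tag{3.23}
\]
as well as the distance in the center direction,
\[
|\tilde x_c(t) - x_c(t)| \le d_c(t) + |\tilde x_c(0) - \tilde x_c^{in}| + |\tilde x_c^{in} - x_c^{in}| \le d_c(t) + L_2\, \|\tilde x^{in} - x^{in}\|. \tag{3.24}
\]
The constants $L_1, L_2 > 0$ are uniform in $x$, $\tilde x$, $\varepsilon$. Vector field (3.17) yields the estimate
\[
\frac{d}{dt}\left(\tilde x_{s,ss} - x_{s,ss}\right) = -\mu_{s,ss}(\tilde x_c)\,(\tilde x_{s,ss} - x_{s,ss}) + R
\]
with remainder term
\[
\begin{aligned}
\|R\| &\le \|\mu_{s,ss}(\tilde x_c) - \mu_{s,ss}(x_c)\|\,\|x_{s,ss}\| + \varepsilon\,\|\tilde g_{s,ss}(\tilde x)\|\,\|\tilde x_{s,ss} - x_{s,ss}\| + \varepsilon\,\|\tilde g_{s,ss}(\tilde x) - \tilde g_{s,ss}(x)\|\,\|x_{s,ss}\|\\
&\le \varepsilon C\,|\tilde x_c - x_c|\,\|x_{s,ss}\| + \varepsilon C\,\|\tilde x_{s,ss} - x_{s,ss}\| + \varepsilon C\,\|\tilde x - x\|\,\|x_{s,ss}\|\\
&= \varepsilon C\left(|\tilde x_c - x_c|\,\|x_{s,ss}\| + \|\tilde x_{s,ss} - x_{s,ss}\|\right).
\end{aligned}
\]
Here, $C$ is the uniform bound on $\tilde g$ and uniform Lipschitz bound on $\tilde g$ and $\mu_{s,ss}$, see (3.12), (3.13) and (3.15).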



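In the last step we have used $\|\tilde x - x\| = |\tilde x_c - x_c| + \|\tilde x_{s,ss} - x_{s,ss}\|$, since $|\tilde x_u - x_u| = 0$ due to our choice (3.21). For arbitrary $0 < \alpha < 1$ we have $\alpha\mu_s < \mu_s(\tilde x_c) < \mu_{ss}(\tilde x_c)$, provided $\varepsilon_0$ is chosen accordingly. Hence
\[
\dot d_s(t) = \frac{d}{dt}\,\|\tilde x_{s,ss}(t) - x_{s,ss}(t)\| \le -\alpha\mu_s\, d_s(t) + \varepsilon C\,|\tilde x_c(t) - x_c(t)|\,\|x_{s,ss}(t)\|.
\]
Analogous Lipschitz estimates on $\tilde g_c$ yield
\[
\begin{aligned}
\dot d_c(t) = \left|\dot{\tilde x}_c(t) - \dot x_c(t)\right| &\le \varepsilon\,|x_u|\,\|\tilde g_c(\tilde x)\|\,\|\tilde x_{s,ss} - x_{s,ss}\| + \varepsilon\,|x_u|\,\|\tilde g_c(\tilde x) - \tilde g_c(x)\|\,\|x_{s,ss}\|\\
&\le \varepsilon C\,|x_u(t)|\left(\|\tilde x_{s,ss}(t) - x_{s,ss}(t)\| + |\tilde x_c(t) - x_c(t)|\,\|x_{s,ss}(t)\|\right)\\
&= \varepsilon C \exp(\mu_u t)\,|x_u^{in}|\left(d_s(t) + |\tilde x_c(t) - x_c(t)|\,\|x_{s,ss}(t)\|\right).
\end{aligned}
\]
Using estimate (3.24), the exponential bounds on $x_{s,ss}$ of Lemma 3.5, and $\|x_{s,ss}^{in}\| < 2$, we obtain
\[
\begin{aligned}
\dot d_s(t) &\le -\alpha\mu_s\, d_s(t) + 2\varepsilon C \exp(-\alpha\mu_s t)\left(d_c(t) + L_2\,\|\tilde x^{in} - x^{in}\|\right),\\
\dot d_c(t) &\le \varepsilon C \exp(\mu_u t)\,|x_u^{in}|\, d_s(t) + 2\varepsilon C \exp(-\alpha(\mu_s - \mu_u)t)\,|x_u^{in}|\left(d_c(t) + L_2\,\|\tilde x^{in} - x^{in}\|\right).
\end{aligned}
\tag{3.25}
\]
The Gronwall estimate of $\exp(\alpha\mu_s t)\, d_s(t)$,
\[
d_s(t) \le \exp(-\alpha\mu_s t)\left(d_s(0) + 2\varepsilon C t L_2\,\|\tilde x^{in} - x^{in}\| + 2\varepsilon C \int_0^t d_c(s)\, ds\right), \tag{3.26}
\]
can be plugged into the estimate of $\dot d_c$:
\[
\dot d_c(t) \le 2\varepsilon C \exp(-\alpha(\mu_s - \mu_u)t)\,|x_u^{in}|\left(d_c(t) + \varepsilon C \int_0^t d_c(s)\, ds + d_s(0) + (\varepsilon C t + 1) L_2\,\|\tilde x^{in} - x^{in}\|\right).
\]
Let $\|\tilde x^{in} - x^{in}\|$ be an upper bound on $d_c$, i.e. assume
\[
\|\tilde x^{in} - x^{in}\| > \sup_{0 \le s \le t} d_c(s). \tag{3.27}
\]
As $d_c(0) = 0$, this assumption is valid on an initial segment. Then
\[
\dot d_c(t) \le 2\varepsilon C \exp(-\alpha(\mu_s - \mu_u)t)\,|x_u^{in}|\left(d_c(t) + d_s(0) + (\varepsilon C t + 1) L_3\,\|\tilde x^{in} - x^{in}\|\right),
\]
with $L_3 = L_2 + 1$. The Gronwall inequality yields
\[
d_c(t) \le \xi(t) + \int_0^t \xi(s)\,\eta(s) \exp\left(\int_s^t \eta(\sigma)\, d\sigma\right) ds
\]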


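with
\[
\eta(t) = 2\varepsilon C \exp(-\alpha(\mu_s - \mu_u)t)\,|x_u^{in}| \le 2\varepsilon C \exp(-\alpha(\mu_s - \mu_u)t), \qquad \int_s^t \eta(\sigma)\, d\sigma \le \frac{2\varepsilon C}{\alpha(\mu_s - \mu_u)}, \quad 0 \le s \le t,
\]
\[
\begin{aligned}
\xi(t) &= 2\varepsilon C\,|x_u^{in}| \int_0^t \exp(-\alpha(\mu_s - \mu_u)s)\left(d_s(0) + (\varepsilon C s + 1) L_3\,\|\tilde x^{in} - x^{in}\|\right) ds\\
&\le \varepsilon\,\frac{2C}{\alpha(\mu_s - \mu_u)}\,|x_u^{in}|\left(d_s(0) + L_3\,\|\tilde x^{in} - x^{in}\|\right) + \varepsilon^2\,\frac{2C^2}{\alpha^2(\mu_s - \mu_u)^2}\,|x_u^{in}|\, L_3\,\|\tilde x^{in} - x^{in}\|\\
&\le \varepsilon L_4\,|x_u^{in}|\left(d_s(0) + \|\tilde x^{in} - x^{in}\|\right)\\
&\le \varepsilon L_4 (L_1 + 1)\,\|\tilde x^{in} - x^{in}\|.
\end{aligned}
\]
The second to last inequality needs $\varepsilon_0$ chosen small enough; $L_4$ is a uniform constant. The last inequality uses the initial estimate (3.23) and $|x_u^{in}| \le 1$. Finally
\[
\begin{aligned}
d_c(t) &\le \xi(t) + \int_0^t \xi(s)\,\eta(s) \exp\left(\int_s^t \eta(\sigma)\, d\sigma\right) ds\\
&\le \varepsilon L_4 (L_1 + 1)\,\|\tilde x^{in} - x^{in}\|\left(1 + 2\varepsilon C \exp\left(\frac{2\varepsilon C}{\alpha(\mu_s - \mu_u)}\right) \int_0^t \exp(-\alpha(\mu_s - \mu_u)s)\, ds\right)\\
&\le \varepsilon L_5\,\|\tilde x^{in} - x^{in}\|,
\end{aligned}
\tag{3.28}
\]
with $L_5$ independent of $\varepsilon$. Choose $\varepsilon_0$ small enough. Then $\varepsilon L_5 \ll 1$ and assumption (3.27) remains valid up to the out-section. This yields the first claim of the theorem. Estimates (3.28) and (3.26) combine to
\[
d_s(t) \le \exp(-\alpha\mu_s t)\left(d_s(0) + \varepsilon t L_6\,\|\tilde x^{in} - x^{in}\|\right).
\]
Thus
\[
\begin{aligned}
d_s^{out} = d_s\left(\frac{1}{\mu_u} \ln \frac{1}{|x_u^{in}|}\right) &\le |x_u^{in}|^{\alpha\mu_s/\mu_u}\left(L_1\,\frac{\|\tilde x^{in} - x^{in}\|}{|x_u^{in}|} + \frac{\varepsilon L_6}{\mu_u}\,\|\tilde x^{in} - x^{in}\| \ln \frac{1}{|x_u^{in}|}\right)\\
&\le |x_u^{in}|^{\alpha\mu_s/\mu_u - 1}\, L_s\,\|\tilde x^{in} - x^{in}\|.
\end{aligned}
\]
Note that $\beta = \alpha\mu_s/\mu_u - 1 > 0$ for $\alpha$ chosen close enough to 1. This finishes the proof.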
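The Gronwall step in the proof above bounds a function d satisfying d(t) ≤ ξ(t) + ∫₀ᵗ η(s) d(s) ds by ξ(t) + ∫₀ᵗ ξ(s) η(s) exp(∫ₛᵗ η(σ) dσ) ds. The sketch below evaluates that right-hand side with the trapezoid rule. As a hedged sanity check it can be compared with the equality case ξ(t) = t, η ≡ a, where d solves d′ = 1 + a·d with d(0) = 0, so d(t) = (exp(at) − 1)/a. The function name, the quadrature, and the sample data are illustrative assumptions, not part of the proof.

```python
import math


def gronwall_rhs(xi, eta, t, n=2000):
    """Approximate xi(t) + int_0^t xi(s) * eta(s) * exp(int_s^t eta) ds (trapezoid rule)."""
    h = t / n
    grid = [i * h for i in range(n + 1)]
    # cumulative[i] approximates the integral of eta over [0, grid[i]].
    cumulative = [0.0]
    for i in range(n):
        cumulative.append(cumulative[-1] + 0.5 * h * (eta(grid[i]) + eta(grid[i + 1])))
    # Integrand xi(s) * eta(s) * exp(int_s^t eta) on the grid.
    f = [xi(s) * eta(s) * math.exp(cumulative[-1] - cumulative[i]) for i, s in enumerate(grid)]
    integral = h * (sum(f) - 0.5 * (f[0] + f[-1]))
    return xi(t) + integral


def equality_case(a, t):
    """Solution of d' = 1 + a*d, d(0) = 0, i.e. d = xi + int eta*d with xi(t) = t, eta = a."""
    return (math.exp(a * t) - 1.0) / a
```

With `xi = lambda s: s` and `eta = lambda s: a`, the value `gronwall_rhs(xi, eta, t)` agrees with `equality_case(a, t)` up to quadrature error, which shows that the bound is attained in this case.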



Fig. 4. The return map $\Psi = \Psi^{glob} \circ \Psi^{loc}$

4. The Return Map

In this section we define a return map for trajectories near the primary heteroclinic orbit $q(t)$, considered as a homoclinic orbit relative to the equivariance. In this way a map on the in-section is defined which induces a transformation of Lipschitz curves on this section. We first prove Lipschitz and cone properties of the return map. The homoclinic orbit $q(t)$ corresponds to a fixed point of this map. The stable set of the equilibrium under the return map is then constructed as the fixed point of the associated graph transformation. This yields the set of trajectories with α-limit dynamics following the period-3 heteroclinic cycle, respectively the relative homoclinic orbit $q(t)$.

Locally near $q_-$ we can use the general result on the local passage of the previous section and apply it to the Bianchi system. The time-reversed Bianchi flow satisfies assumptions (3.2, 3.3). The out-section is then diffeomorphically mapped back to the in-section by the time-reversed (global) Bianchi flow after reduction to the orbit space of the equivariance group,
\[
(x_s^{out}, x_{ss}^{out}, x_c^{out}) \longmapsto (x_u^{in}, x_s^{in}, x_c^{in}) = \Psi^{glob}(x_s^{out}, x_{ss}^{out}, x_c^{out}). \tag{4.1}
\]
In particular the line $\{x_{ss}^{out} = x_s^{out} = 0\}$ is mapped to the line $\{x_u^{in} = x_s^{in} = 0\}$ by the Kasner map. Thus $\Psi^{glob}$ expands the $x_c$-coordinate uniformly in a neighborhood of the 3-periodic cycle $q(t)$. Combining both maps (the local passage and the global excursion) we obtain a return map
\[
\Psi := \Psi^{glob} \circ \Psi^{loc} : \Sigma^{in} \longrightarrow \Sigma^{in}, \tag{4.2}
\]
see also Fig. 4. Note that due to expansion of the center component $x_c$ under the return map, the domain of definition of this map is in fact a subset of $\Sigma^{in}$. The heteroclinic cycle $q(t)$ yields a homoclinic orbit after reduction to the orbit space of the equivariance group. It constitutes the unique fixed point $(0, 0, 0)$ of the return map $\Psi$. The Kasner circle and the attached heteroclinic cap is represented by the



set $\{x_u^{in} = x_s^{in} = 0\}$. The return map becomes singular on the stable manifold of the Kasner circle. In particular, we define on this singular set
\[
\Psi(0, x_s^{in}, x_c^{in}) = (0, 0, \Phi(x_c^{in})), \tag{4.3}
\]

where is the Kasner map. Note that the Bianchi system (2.1) preserves the signs of N1 , N2 , N3 , . Although the local analysis of the previous section did not use this special structure, it is compatible with it. To study Bianchi-IX solutions close to the 3-cycle of heteroclinics we therefore consider only the part of the in-section with xu ≥ 0, xs ≥ 0. This part is mapped to the in-section, again with xu ≥ 0, xs ≥ 0. In complete analogy one can treat Bianchi-VIII solutions close to a 3-cycle, although this 3-cycle is then composed of orbits inside heteroclinic caps of different sign. Indeed, each heteroclinic orbit to the Kasner circle has a mirror image under sign reversal of Nk . Each of the 3 heteroclinic orbits of the cycle can be given a sign, the choice of signs determines whether the cycle lies in the boundary of Bianchi-VIII or Bianchi-IX domains. In fact, looking at a section to a given heteroclinic orbit of the cycle, thus fixing the sign of the first orbit, we have 4 quadrants in this section corresponding to the 4 remaining choices of signs of the 2 remaining orbits. In each of the quadrants, Theorem 4.2 will yield a local Lipschitz manifold. Although the analysis of this section also holds for a 3-cycle in the boundary of Bianchi-VIII domains, we keep the notation corresponding to Bianchi-IX, i.e. all components xs , xss , xu are and remain non-negative. These components correspond to the variables N1 , N2 , N3 in the Bianchi system, the identification permutes along the cycle. The component xc corresponds to the variables ± . In the following, we first prove suitable cone conditions for the return map and then apply almost standard graph transformation to obtain the claimed Lipschitz manifold. The nonstandard part of that graph transformation includes the singular line xuin = 0 of infinite contraction. Note the order of fixing the rescaling parameters: First ε0 resp. 
ε is fixed small enough to yield our estimates of the local passage $\Psi^{loc}$ with small Lipschitz constants, in particular $\varepsilon L_c \ll 1$. This amounts to a choice of the sections $\Sigma^{in}$ and $\Sigma^{out}$ in the original (unscaled) coordinates and also fixes the global excursion map $\Psi^{glob}$. Then a sufficiently small upper bound for $x_u^{in}$ is chosen, i.e. $\Psi$ is restricted to a smaller section
\[
\tilde\Sigma^{in} = \{x \in \Sigma^{in} \mid 0 \le x_u \le \delta,\ 0 \le x_s \le \delta,\ |x_c| < \delta\}. \tag{4.4}
\]

This makes the contraction of the local passage as strong as we like without changing $\Psi^{loc}$, $\Psi^{glob}$. It also ensures that trajectories of interest stay close to the Kasner caps of heteroclinic orbits; thus the global excursion $\Psi^{glob}$ on the domain of interest is as close to the Kasner map as we like. It also ensures that all non-singular trajectories in this domain indeed return to the in-section $\Sigma^{in}$.

Lemma 4.1. The return map $\Psi$ (4.2) is Lipschitz continuous. Furthermore, there exist $\varepsilon > 0$, $\delta > 0$, $0 < \sigma < 1$, $K_{u,s} > 1$, and $K_c > (1 - \sigma^2)^{-1} > 1$, such that the following cone conditions hold for $\Psi = \Psi^{glob} \circ \Psi^{loc} : \tilde\Sigma^{in} \to \Sigma^{in}$. Here $\Sigma^{in}$ is the closed in-section (3.8) corresponding to the choice of $\varepsilon$, and $\tilde\Sigma^{in}$ a suitable subset (4.4).



Fig. 5. Cone properties of the return map

˜ in as The cones are defined for x ∈ ˜ in | x˜u,s − xu,s ≤ σ |x˜c − xc |}, C xc = {x˜ ∈ ˜ in | |x˜c − xc | ≤ σ x˜u,s − xu,s }. C xu,s = {x˜ ∈

(4.5)

The cone conditions are ˜ in ⊂ (int C c )∪{ x} and −1 (C u,s )∩ ˜ in ⊂ (int C xu,s )∪ (i) Invariance: (C xc )∩

x

x {x}; (ii) Contraction & Expansion: For all x˜ ∈ C xc we have expansion in the center direcu,s tion: |( x) ˜ c − ( x)c | ≥ K c |x˜c − xc | and for all x˜ ∈ C x we have contraction in the transverse directions: x˜u,s − xu,s ≥ K u,s ( x) ˜ u,s − ( x)u,s . ˜ in . See also Fig. 5. They hold for all, x, x, ˜ x, x˜ ∈ Proof. Lipschitz continuity of the return map follows directly from Lipschitz continuity of the local passage loc , see Theorem 3.9, as the global excursion glob is smooth. The cone conditions require the expansion in xc -direction along the Kasner circle induced by the Kasner map. On the singular line, the global excursion is given by the Kasner map , out

glob (xsout = 0, xss = 0, xcout ) = (xuin = 0, xsin = 0, (xcout )).

Indeed, the local passage does not change xc on the singular line, so this notation is consistent with (4.3). Therefore we can write ˜ glob (xs , xss , xc )xs,ss ,

glob (xs , xss , xc ) = (0, 0, (xc )) +

˜ glob (xs , xss , xc ) and vector xs,ss . We obtain the following with a smooth (3×2)-matrix

Lipschitz estimates glob (x) ˜ − glob (x) − (0, 0, (x˜c ) − (xc )) glob

˜ glob ˜ glob (x) xs,ss ˜ =

(x) ˜ x˜s,ss − xs,ss +

(x) ˜ −

≤ C glob x˜s,ss − xs,ss + x˜ − xxs,ss ,



˜ glob , D

˜ glob on the sup-norm of the higher order with uniform bound C glob >

terms of the global excursion map and their derivatives. Note that x, ˜ x lie on the out-section and glob is a diffeomorphism from the out-section to its image on the in-section. Using loc (x) ˜ and loc (x) instead of x˜ and x we get a similar estimate for the return map (x) = glob ( loc (x)): (x) ˜ − (x) − (0, 0, ( loc (x) ˜ c ) − ( loc (x)c )) = glob ( loc (x)) ˜ − glob ( loc (x)) − (0, 0, ( loc (x) ˜ c ) − ( loc (x)c )) ≤ C glob loc (x) ˜ s,ss − loc (x)s,ss + loc (x) ˜ − loc (x) loc (x)s,ss

≤ C glob L s |xu |β x˜ − x + (εL c + 1 + L s |xu |β )x˜ − x|xu |β x ≤ C return |xu |β x˜ − x.

(4.6)

The second to last inequality uses the estimates of the local passage of Corollary 3.7 and Theorem 3.9 for the choice (w.l.o.g.) 0 ≤ x˜u ≤ xu . Note that the estimates of Theorem 3.9 are used in the form | loc (x) ˜ c − loc (x)c | ≤ εL c x˜ − x + |x˜c − xc | ≤ (εL c + 1)x˜ − x, loc (x) ˜ s,ss − loc (x)s,ss ≤ L s |xu |β x˜ − x. The constant C return is uniform in x, x˜ in the in-section. Because β > 0, we have an ˜ in , i.e. for xu < δ, if we choose δ small enough. arbitrarily strong contraction on The Kasner map in the original Bianchi system is expanding everywhere except at the Taub points T1 , T2 , T3 . In particular it is expanding on our in-section: |(a) − (b)| ≥ K˜ c |a − b|, for some uniform constant K˜ c > 1. Now choose K c with 1 < K c < K˜c , and σ with 0 < σ < 1 such that K c (1−σ 2 ) > 1. (The last relation is needed to obtain a contraction in Theorem 4.2.) Consider the cone in center direction with opening ϑ > 0, i.e. x˜u,s − xu,s ≤ ϑ|x˜c − xc |. Then (4.6) using the local Lipschitz estimate of Theorem 3.9 yield |( (x) ˜ − (x))c | ≥ |( loc (x) ˜ c ) − ( loc (x)c )| − C return |xu |β x˜ − x loc ≥ K˜ c | (x) ˜ c − loc (x)c | − C return |xu |β x˜ − x ≥ K˜ c |x˜c − xc | − εL c K˜ c x˜ − x − C return |xu |β x˜ − x ≥ K˜ c − εL c K˜ c + C return |xu |β (1 + ϑ) |x˜c − xc |. (4.7) For ε and δ chosen small enough, using xu ≤ δ, we can achieve K c < K˜ c − εL c K˜ c + C return |xu |β (1 + 1/σ ), yielding the expansion not only in the cone C xc , with ϑ = σ < 1, but also outside the cone C xu,s , with ϑ = 1/σ . Furthermore, using again (4.6), we see the invariance of the cones. Indeed, assume again x˜u,s − xu,s ≤ ϑ|x˜c − xc |, then we have (x) ˜ u,s − (x)u,s ≤ C return |xu |β x˜ − x ≤ C return |xu |β (1 + ϑ)|x˜c − xc | ˜ c − (x)c |. ≤ C return |xu |β (1 + ϑ)K c−1 | (x)



Fig. 6. Lipschitz graph $x_c = x_c(x_s, x_u)$ with fixed line $x_c(x_s, 0) \equiv 0$. The graph transformation given by the inverse return map $\Psi^{-1}$ expands $x_{u,s}$ and contracts $x_c$

The last inequality uses the expansion in xc , thus it is valid for ϑ ≤ 1/σ . We choose δ small enough such that C return |xu |β K c−1 < σ/(1 + σ ). Due to the monotone increase of ϑ/(1 + ϑ) we also have C return |xu |β K c−1 < ϑ/(1 + ϑ) for all ϑ ≥ σ . Thus we obtain the cone invariance (x) ˜ u,s − (x)u,s ≤ ϑ| (x) ˜ c − (x)c |

(4.8)

for all σ ≤ ϑ ≤ 1/σ . The choice ϑ = σ yields (forward) invariance of the cone C xc and the choice ϑ = 1/σ u,s yields (backward) invariance of the cone C x . Note that the cone invariances are in fact strict as claimed in the lemma. The above estimates are strict inequalities for x = x. ˜ u,s Now consider the cone in transverse direction, that is (x) ˜ ∈ C x , which amounts to | (x) ˜ c − (x)c | ≤ σ (x) ˜ u,s − (x)u,s . We have already established invariance. Thus |x˜c − xc | ≤ σ x˜u,s − xu,s and estimate (4.6) yields (x) ˜ u,s − (x)u,s ≤ C return |xu |β x˜ − x ≤ C return |xu |β (1 + σ )x˜u,s − xu,s . −1 = C return δ β (1 + σ ) < 1, for δ small enough. This is the claimed contraction, K u,s

Theorem 4.2. The (local) stable set of $(0, 0, 0)$ under the return map $\Psi$ is given by $\{(x_u, x_s, x_c) \mid x_c = x_c(x_u, x_s)\}$ with a Lipschitz continuous function $x_c$. In particular, $x_c(0, x_s) = 0$.

Proof. The idea of the proof is to define a graph transformation on the space of Lipschitz graphs $\{(x_u, x_s, x_c(x_u, x_s))\}$, Fig. 6, by the inverse return map $\Psi^{-1}$. The cone invariance of the previous lemma will ensure that the Lipschitz property of the graph is preserved. Due to the expansion/contraction conditions of the previous lemma, the graph transformation turns out to be a contraction on the space of graphs. The fixed point of this contraction then yields the claim.
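The graph-transformation idea can be tried out on a toy model. The sketch below does not use the Bianchi return map: it assumes a hypothetical linear map (u, c) ↦ (λu, Kc + au) with 0 < λ < 1 < K, which contracts the transverse coordinate u and expands the center coordinate c. Pulling a graph c = ζ(u) back by the inverse of this map gives (Gζ)(u) = (ζ(λu) − au)/K, a contraction in the sup-norm with factor 1/K. Its fixed point is the linear graph with slope a/(λ − K), which is the stable set of the origin in this toy model. Unlike the return map of the text, the toy map has no singular line of infinite contraction.

```python
# Hypothetical toy parameters: contraction LAM in u, expansion K in c, coupling A.
LAM, K, A = 0.5, 2.0, 0.3

# Slope of the invariant graph c = SLOPE * u of the toy map (u, c) -> (LAM*u, K*c + A*u).
SLOPE = A / (LAM - K)


def toy_map(u, c):
    """Toy return map: contracts u, expands c."""
    return LAM * u, K * c + A * u


def graph_transform(zeta):
    """Pull the graph of zeta back by the inverse toy map.

    A point (u, c) is mapped onto graph(zeta) exactly when K*c + A*u = zeta(LAM*u),
    so the new graph is c = (zeta(LAM*u) - A*u) / K.
    """
    return lambda u: (zeta(LAM * u) - A * u) / K


def iterate_graph_transform(n):
    """Apply the graph transformation n times, starting from the zero graph."""
    zeta = lambda u: 0.0
    for _ in range(n):
        zeta = graph_transform(zeta)
    return zeta
```

After a few dozen iterations `iterate_graph_transform(n)(u)` is numerically indistinguishable from `SLOPE * u`, and `toy_map(u, SLOPE * u)` again lies on the same line, as expected for an invariant graph.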



To make this idea precise, consider the Banach space of Lipschitz functions X = {ζ : [0, δ]2 → [−δ, δ], xu,s = (xu , xs ) → xc = ζ (xu,s ) such that ζ (xs , 0) ≡ 0, Lip(ζ ) ≤ σ } with sup-norm. The parameters δ, σ < 1 correspond to those of Lemma 4.1. Define a map G : X → X as graph transformation, i.e. graph(Gζ ) := −1 graph(ζ ). More precisely

Gζ −1 (xu,s , ζ (xu,s )) u,s := −1 (xu,s , ζ (xu,s )) c , for xu,s = 0, for arbitrary xs . Gζ (0, xs ) := 0, The first equation implicitly assumes that (xu,s , ζ (xu,s )) has a pre-image under

and that it lies in the domain. The second equation just gives the pre-image of the origin under . Note the restriction to non-negative xu , xs consistent with the structure of the Bianchi system that preserves the signs of the spatial curvature variables Nk . In fact due to these constraints the boundary xs = 0 becomes singular after one excursion,

(xu , xs = 0, xc )u = 0. This fact is not explicitly used in the following but justifies the choice of the domain. We will prove the following claims (i) domain of definition: for all ζ ∈ X and xu,s ∈ (0, δ] × [0, δ] there exists x˜u,s ∈ [0, δ]2 \{0}, such that −1 (x˜u,s , ζ (x˜u,s )) u,s = xu,s . (ii) well-definedness: for all ζ ∈ X and xu,s , x˜u,s ∈ [0, δ]2 \{0} the following holds. If

−1

(xu,s , ζ (xu,s )) u,s = −1 (x˜u,s , ζ (x˜u,s )) u,s ∈ [0, δ]2 then already xu,s = x˜u,s . Conditions (i) and (ii) yield a well defined function Gζ for every ζ ∈ X . (iii) Lipschitz property: for all ζ ∈ X the function Gζ is again Lipschitz continuous with Lip(Gζ ) ≤ σ . (iv) contraction: There exists a constant 0 < κ < 1 such that for all ζ, ζ˜ ∈ X the estimate G ζ˜ − Gζ sup ≤ κζ˜ − ζ sup holds. Conditions (i)–(iii) prove that the graph transformation G indeed maps Lipschitz functions in X to Lipschitz functions in X . Condition (iv) provides a contraction. If all four conditions hold, then by contraction-mapping theorem there is a unique fixed point, i.e. a Lipschitz function ζ ∗ ∈ X with Gζ ∗ = ζ ∗ . Its graph is a locally invariant manifold under . It is also the stable set of the origin due to the cone conditions of Lemma 4.1. This yields the claim of the theorem. Therefore it remains to prove (i)–(iv): ˜ in (i) Let ζ ∈ X and xu,s ∈ (0, δ]×[0, δ] be given. The straight line {xu,s }×[−δ, δ] ⊂ c ˜ is contained in the cone C(xu,s ,0) . We use Lemma 4.1: (xu,s , 0) ∈ by invariance c and contraction of the cone C0u,s . Thus, by invariance and Expansion of C(x , the u,s ,0) c image of the straight line {xu,s } × [−δ, δ] under contains a curve in C (xu,s ,0) connecting the extremal plains {xc = ±δ}. By intermediate-value theorem this curve must intersect the graph of ζ .

(ii) Let ζ ∈ X and xu,s , x˜u,s ∈ [0, δ]2 \{0} be given with −1 (xu,s , ζ (xu,s )) u,s =

−1

(x˜u,s , ζ (x˜u,s )) u,s ∈ [0, δ]2 . Note in particular that −1 (xu,s , ζ (xu,s )) u = 0, as the singular line is mapped onto the origin.



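Thus $\Psi^{-1}(\tilde x_{u,s}, \zeta(\tilde x_{u,s})) \in C^c_{\Psi^{-1}(x_{u,s}, \zeta(x_{u,s}))}$ and by cone invariance $(\tilde x_{u,s}, \zeta(\tilde x_{u,s})) \in C^c_{(x_{u,s}, \zeta(x_{u,s}))}$. The Lipschitz bound on $\zeta \in X$ on the other hand implies $(\tilde x_{u,s}, \zeta(\tilde x_{u,s})) \in C^{u,s}_{(x_{u,s}, \zeta(x_{u,s}))}$, thus $(\tilde x_{u,s}, \zeta(\tilde x_{u,s})) = (x_{u,s}, \zeta(x_{u,s}))$.

(iii) Again, the Lipschitz bound on $\zeta \in X$ translates to $(\tilde x_{u,s}, \zeta(\tilde x_{u,s})) \in C^{u,s}_{(x_{u,s}, \zeta(x_{u,s}))}$ for all $x$, $\tilde x$. Cone invariance, Lemma 4.1, immediately yields the Lipschitz bound on $G\zeta$.

(iv) The singular line is fixed by construction, thus we only have to estimate the distance of the nonsingular part. Let $\zeta, \tilde\zeta \in X$ and $x_{u,s}, \tilde x_{u,s} \in [0, \delta]^2 \setminus \{0\}$ be given with
\[
\left(\Psi^{-1}(x_{u,s}, \zeta(x_{u,s}))\right)_{u,s} = \left(\Psi^{-1}(\tilde x_{u,s}, \tilde\zeta(\tilde x_{u,s}))\right)_{u,s} \in [0, \delta]^2.
\]
Again, this implies $\Psi^{-1}(\tilde x_{u,s}, \tilde\zeta(\tilde x_{u,s})) \in C^c_{\Psi^{-1}(x_{u,s}, \zeta(x_{u,s}))}$ and by cone invariance we have $(\tilde x_{u,s}, \tilde\zeta(\tilde x_{u,s})) \in C^c_{(x_{u,s}, \zeta(x_{u,s}))}$. Thus we can estimate
\[
|\tilde\zeta(\tilde x_{u,s}) - \zeta(x_{u,s})| \le \|\tilde\zeta - \zeta\|_{\sup} + \sigma\,\|\tilde x_{u,s} - x_{u,s}\| \le \|\tilde\zeta - \zeta\|_{\sup} + \sigma^2\,|\tilde\zeta(\tilde x_{u,s}) - \zeta(x_{u,s})|.
\]
The first inequality uses the Lipschitz bound on $\zeta \in X$ whereas the second one uses the cone $C^c_{(x_{u,s}, \zeta(x_{u,s}))}$. We obtain
\[
|\tilde\zeta(\tilde x_{u,s}) - \zeta(x_{u,s})| \le \frac{1}{1 - \sigma^2}\,\|\tilde\zeta - \zeta\|_{\sup}.
\]
On the other hand, the expansion of $C^c_{\Psi^{-1}(x_{u,s}, \zeta(x_{u,s}))}$ under $\Psi$ yields
\[
\begin{aligned}
\left|(G\tilde\zeta - G\zeta)\left(\left(\Psi^{-1}(x_{u,s}, \zeta(x_{u,s}))\right)_{u,s}\right)\right| &= \left|\left(\Psi^{-1}(x_{u,s}, \zeta(x_{u,s}))\right)_c - \left(\Psi^{-1}(\tilde x_{u,s}, \tilde\zeta(\tilde x_{u,s}))\right)_c\right|\\
&\le \frac{1}{K_c}\,\left|\zeta(x_{u,s}) - \tilde\zeta(\tilde x_{u,s})\right|\\
&\le \frac{1}{K_c(1 - \sigma^2)}\,\|\tilde\zeta - \zeta\|_{\sup}.
\end{aligned}
\]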

Lemma 4.1 provides constants $K_c$, $\sigma$ with $K_c(1 - \sigma^2) > 1$. Therefore the last estimates yield the claimed contraction, $\kappa = 1/(K_c(1 - \sigma^2))$, which finishes the proof.

The stable set given by the theorem determines the initial conditions of trajectories that possess the same α-limit dynamics as the 3-periodic heteroclinic cycle: there exists a codimension-one set of trajectories that converge to the heteroclinic cycle as they approach the big-bang singularity in backwards time.

5. Discussion

We have constructed the set of trajectories that start close to the period-3 heteroclinic cycle and follow it in the α-limit $t \to -\infty$, within the vacuum Bianchi system. It forms a codimension-one Lipschitz manifold.



This holds true in Bianchi-IX as well as Bianchi-VIII domains. In fact, the period-3 heteroclinic cycle is a cycle of three pairs of heteroclinic orbits and the domain depends on the choice of representatives. The same technique is applicable to non-vacuum solutions provided the additional eigenvalue $\mu$ of the Kasner equilibrium is stronger than the unstable eigenvalue, see (2.7) in reversed time direction. In fact the analysis of the local map in Sect. 3 remains valid for arbitrary stable dimension and order of stable eigenvalues, provided all stable eigenvalues are stronger than the unstable one; just the dimension of the component $x_{s,ss}$ changes. (Note also that the linearization (2.7) is diagonalizable even at points of the Kasner circle at which some of the eigenvalues coincide.) The crucial assumption is a positive gap, $|\mu_s| - |\mu_u| > 0$, between the weakest stable and the unstable eigenvalue of the diagonalizable linearization at the Kasner circle. The result on the stable manifold of the 3-periodic heteroclinic chain therefore extends to the non-vacuum Bianchi model provided $3(2 - \gamma) > \tfrac{3}{2}(\sqrt{5} - 1)$, i.e.
\[
\gamma < \frac{5 - \sqrt{5}}{2},
\]
see (2.7) and note the reversed time.

The result also extends to arbitrary periodic heteroclinic chains, as they keep a uniform distance from the Taub points $T_1$, $T_2$, $T_3$. To this end it is important to note that all our estimates are uniform on compact pieces of the Kasner circle which do not contain the Taub points. Similarly, the Kasner map is uniformly expanding on such compacta. Therefore, the return map associated to an arbitrary periodic chain is composed of finitely many local and global maps as studied in this paper. This yields uniform estimates for each of the local and global maps. Hence, the graph transformation of Theorem 4.2 can be applied to the concatenation of these local return and global excursion maps.
As above we get a codimension-one invariant Lipschitz manifold of points that approach the periodic chain of heteroclinic orbits. Note that these chains may even include heteroclinic orbits that approach the Kasner circle in the direction corresponding to the weak eigenvalue. The results on the local passage remain valid for in-sections {xs = 1}. We used notation consistent with the non-principal direction of the period-3 chain; however, the order of stable eigenvalues was not used in any of the proofs. The unstable eigenvalue along the Kasner circle K in reversed time direction is bounded by |μu| < 2, see (2.7) in reversed time direction. The extension of the result on arbitrary periodic chains to the non-vacuum Bianchi model is therefore valid as long as the eigenvalue μ to the eigenvector transverse to the vacuum boundary is stronger than 2, i.e. γ
θ, where θ ∈ (0, 1) is arbitrary but fixed in advance” (note that, in particular, the threshold value of $k$ could then depend on θ). However, much stronger assertions are in fact true; for example, the probability of the exceptional set in Theorem 1 can be majorized by $\exp(-cm)$. Another comment: one only uses in the proof that $m$ and $d$ are comparable, and larger than $ck^2$.

The proof will be based on separately majorizing $S_{\min}(\Phi \otimes \bar\Phi)$, which is done via a well-known and relatively simple trick, and on minorizing $S_{\min}(\Phi) = S_{\min}(\bar\Phi)$, which is the main point of the argument.

A question analogous to (3) can be asked for the minimal output $p$-Rényi entropy ($p > 1$). For the additivity of Rényi entropy, random counterexamples were constructed earlier by Hayden–Winter [8]. It was shown in [9] that the Hayden–Winter analysis can also be simplified (at least conceptually) by appealing to Dvoretzky’s theorem. Working with the von Neumann entropy, however, requires more effort. First, while [9] relied on a straightforward instance of Milman’s “tangible” version [10,11] of Dvoretzky’s theorem for Schatten classes that was documented in the literature already in the 1970s, we now need a more subtle, sharp version (which appears in the literature only implicitly). Second, this sharp version is not applied in the most direct way and requires additional preparatory work (for which we mostly follow the approach of Brandao–Horodecki [6]).

5. Lower Bound for $S_{\min}(\Phi)$: the Approach

Since we are going to consider channels with near-maximal minimal output entropy, the following simple inequality (Lemma III.1 in [6], or formula (40) in [4]) will allow us to replace the analysis of the von Neumann entropy $S$ by that of a smoother quantity.

G. Aubrun, S. Szarek, E. Werner

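Lemma 2. For every state $\sigma \in D(\mathbb{C}^k)$,
$$S(\sigma) \ge S\left(\frac{\mathrm{Id}}{k}\right) - k \left\| \sigma - \frac{\mathrm{Id}}{k} \right\|_{HS}^2.$$
Consequently, for every quantum channel $\Phi : M_m \to M_k$,
$$S_{\min}(\Phi) \ge \log(k) - k \cdot \max_{\rho \in D(\mathbb{C}^m)} \left\| \Phi(\rho) - \frac{\mathrm{Id}}{k} \right\|_{HS}^2. \tag{4}$$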

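It will be convenient to identify $\mathbb{C}^k \otimes \mathbb{C}^d$ (or, to be more precise, $\mathbb{C}^k \otimes \overline{\mathbb{C}^d}$, a distinction we will ignore) with $M_{k,d}$ via the canonical map induced by $u \otimes v \mapsto |u\rangle\langle v|$. If $x \in \mathbb{C}^k \otimes \mathbb{C}^d$ is so identified with a matrix $M \in M_{k,d}$, then
$$\mathrm{Tr}_{\mathbb{C}^d}\, |x\rangle\langle x| = M M^\dagger. \tag{5}$$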
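As a small numerical illustration of identity (5), the sketch below builds the full density matrix $|x\rangle\langle x|$ for a random unit vector, traces out the second factor, and compares the result with $MM^\dagger$. The helper names and the index convention `x[i*d + j]` for the coordinates of $x$ in the product basis are assumptions made for this example only; they do not come from the paper.

```python
import random

def random_unit_vector(n, rng):
    """Return a random unit vector in C^n (normalized complex Gaussian)."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = sum(abs(c) ** 2 for c in v) ** 0.5
    return [c / norm for c in v]

def partial_trace_second(x, k, d):
    """Tr_{C^d} |x><x| computed from the full (kd x kd) density matrix."""
    n = k * d
    rho = [[x[p] * x[q].conjugate() for q in range(n)] for p in range(n)]
    # Entry (a, b) sums rho over matching second indices (a, j) and (b, j).
    return [[sum(rho[a * d + j][b * d + j] for j in range(d))
             for b in range(k)] for a in range(k)]

def m_m_dagger(x, k, d):
    """M M^dagger, where M[i][j] = x[i*d + j] is the k x d matrix of x."""
    M = [x[i * d:(i + 1) * d] for i in range(k)]
    return [[sum(M[a][j] * M[b][j].conjugate() for j in range(d))
             for b in range(k)] for a in range(k)]

rng = random.Random(0)
k, d = 3, 5
x = random_unit_vector(k * d, rng)
lhs = partial_trace_second(x, k, d)
rhs = m_m_dagger(x, k, d)
max_gap = max(abs(lhs[a][b] - rhs[a][b]) for a in range(k) for b in range(k))
trace = sum(rhs[a][a] for a in range(k)).real
print(max_gap, trace)
```

The printed gap should vanish up to rounding, and the trace equals 1 because $x$ is a unit vector, that is, $\|M\|_{HS} = 1$.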

Via this identification, Schmidt coefficients of $|x\rangle$ coincide with singular values of $M$. While the tensor and matrix formalisms are equivalent, the matrix formalism is arguably more transparent, which sometimes leads to simpler arguments. Denote by $W \subset \mathbb{C}^k \otimes \mathbb{C}^d$ the subspace inducing $\Phi$. Note that the maximum in (4) is necessarily attained on pure states which, in this identification, correspond to unit vectors $x \in W$. For such states the action of $\Phi$ is given, in the matrix formalism, by (5), and so the inequality (4) can be rewritten as
$$S_{\min}(\Phi) \ge \log(k) - k \cdot \max_{M \in W,\ \|M\|_{HS} = 1} \left\| M M^\dagger - \frac{\mathrm{Id}}{k} \right\|_{HS}^2. \tag{6}$$

The idea will be to show that, for a random subspace $W$, the maximum on the right is very small; this will be formalized in the next proposition.

6. The Main Proposition and the Derivation of the Main Theorem

The heart of the argument is the following proposition.

Proposition 3. There are absolute constants $c, C, C' > 0$ so that for every $k$, for $d = Ck^2$ and $m = cd$, a random Haar-distributed subspace $W$ of dimension $m$ in $M_{k,d}$ satisfies
$$\max_{M \in W,\ \|M\|_{HS} = 1} \left\| M M^\dagger - \frac{\mathrm{Id}}{k} \right\|_{HS} \le \frac{C'}{k} \tag{7}$$
with large probability (tending to 1 when $k$ tends to $\infty$).

From the proposition one quickly deduces that the pair $(\Phi, \bar\Phi)$ is a counterexample to the additivity of minimum output von Neumann entropy. Indeed, a straightforward calculation shows that applying $\Phi \otimes \bar\Phi$ to the maximally entangled state yields an output state with one eigenvalue greater than or equal to $\frac{\dim W}{\dim M_{k,d}} = \frac{m}{kd} = \frac{c}{k}$ ([8], Lemma III.3; see also Sect. 6 in [12]). Then, a simple argument using just concavity of $S(\cdot)$ reduces

Hastings’s Additivity Counterexample via Dvoretzky’s Theorem

the problem to calculating the entropy of the state with one eigenvalue equal to $\frac{c}{k}$ and all the remaining ones identical, which yields
$$S_{\min}(\Phi \otimes \bar\Phi) \le 2 \log k - \frac{c \log k}{k} + \frac{1}{k}.$$
On the other hand, Eq. (6) together with Proposition 3 implies
$$S_{\min}(\Phi) \ge \log(k) - \frac{C'^2}{k}.$$
Since $S_{\min}(\bar\Phi) = S_{\min}(\Phi)$, the inequality of Theorem 1 follows if $k$ is large enough, as required.

7. Dvoretzky’s Theorem: Take One

We wish to point out that while Proposition 3 will be derived from a Dvoretzky-like theorem for Lipschitz functions (Theorem 4 below), it can be rephrased in the language of the standard Dvoretzky’s theorem. Indeed, its assertion says that for every $M \in W$ with $\|M\|_{HS} = 1$ we have
$$\frac{C'^2}{k^2} \ge \left\| M M^\dagger - \frac{\mathrm{Id}}{k} \right\|_{HS}^2 = \mathrm{Tr}\,|M|^4 - \frac{2}{k}\,\mathrm{Tr}\, M M^\dagger + \frac{1}{k^2}\,\mathrm{Tr}\,\mathrm{Id} = \mathrm{Tr}\,|M|^4 - \frac{1}{k} \ge 0. \tag{8}$$
Consequently,
$$k^{-1/4}\|M\|_{HS} \le \|M\|_4 \le k^{-1/4}\left(1 + \frac{C'^2}{k}\right)^{1/4}\|M\|_{HS} \le k^{-1/4}\left(1 + \frac{C'^2}{4k}\right)\|M\|_{HS} \tag{9}$$
for all $M \in W$. In other words, $W$ is $(1 + \delta)$-Euclidean, with $\delta = \frac{C'^2}{4k}$, when considered as a subspace of the normed space $(M_{k,d}, \|\cdot\|_4)$, the Schatten 4-class. In our prior work [9] we similarly observed that the crucial technical step of the Hayden–Winter proof of non-additivity of $p$-Rényi entropy for $p > 1$ can be restated as an instance of Dvoretzky’s theorem for the Schatten $2p$-class. There is an important difference, however. While in the case of $p$-Rényi entropy the needed Dvoretzky-type statement was known since the 1970s, for the statement of the type (9) needed in the present context, the “off the shelf” methods seem to yield only $\delta = O(k^{-1/4})$ as opposed to $\delta = O(k^{-1})$ above. This also suggests that while for the $p$-Rényi entropy derandomization of the example (i.e., supplying explicit channels for which the additivity fails) may be a feasible project (see Sect. IX in [9] and references therein), the analogous task for the von Neumann entropy is likely to be much harder.

8. Dvoretzky’s Theorem: Take Two

We use the following definitions: if $f$ is a function from a metric space $(X, d)$ to $\mathbb{R}$, and $\mu \in \mathbb{R}$, the oscillation of $f$ around $\mu$ on a subset $A \subset X$ is
$$\mathrm{osc}(f, A, \mu) = \sup_A |f - \mu|.$$


A function $f$ defined on the unit sphere $S_{\mathbb{C}^n}$ is called circled if $f(e^{i\theta}x) = f(x)$ for any $x \in S_{\mathbb{C}^n}$, $\theta \in [0, 2\pi]$. If $X$ is a real random variable, we will say that $\mu$ is a central value of $X$ if $\mu$ is either the mean of $X$, or any number between the 1st and the 3rd quartile of $X$ (i.e., if $\min\{P(X \ge \mu), P(X \le \mu)\} \ge \frac{1}{4}$; this happens in particular if $\mu$ is the median of $X$). We will need the following variant of Milman’s “tangible” version of Dvoretzky’s theorem.

Theorem 4 (Dvoretzky’s theorem for Lipschitz functions). If $f : S_{\mathbb{C}^n} \to \mathbb{R}$ is a 1-Lipschitz circled function, then for every $\varepsilon > 0$, if $E \subset \mathbb{C}^n$ is a random subspace (Haar-distributed) of dimension $c_0 n \varepsilon^2$, we have with large probability
$$\mathrm{osc}(f, S_{\mathbb{C}^n} \cap E, \mu) \le \varepsilon,$$
where $\mu$ is any central value of $f$ (with respect to the normalized Lebesgue measure on $S_{\mathbb{C}^n}$) and $c_0$ is an absolute constant. If the function is $L$-Lipschitz, the dimension changes to $c_0 n (\varepsilon/L)^2$.

A striking application of the theorem above is to the case when $f$ is the gauge function of a convex body, or a norm: it leads to the fact that any high-dimensional convex body has almost spherical sections. At the heart of Dvoretzky-like phenomena lies the concentration of measure, which in our framework is expressed by

Lemma 5 (Lévy’s lemma [13]). If $f : S^{n-1} \to \mathbb{R}$ is a 1-Lipschitz function, then for every $\varepsilon > 0$,
$$P(|f(x) - \mu| > \varepsilon) \le C_1 \exp(-c_1 n \varepsilon^2),$$
where $x$ is uniformly distributed on $S^{n-1}$, $\mu$ is any central value of $f$, and $C_1, c_1 > 0$ are absolute constants.

Results such as Theorem 4 or Lévy’s lemma are usually stated with $\mu$ equal to the median or the mean of $f$. However, once we know that the result is true for some central value (or, for that matter, for any $\mu \in \mathbb{R}$), it holds a posteriori for any such value (up to changes in the constants) as, for 1-Lipschitz functions, all central values differ by at most $C/\sqrt{n}$.
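The concentration phenomenon behind Lévy’s lemma can be observed empirically. The following is a minimal sketch, not part of the original argument: it samples uniform points on the real sphere $S^{n-1}$ and evaluates the test function $f(x) = \|x\|_1/\sqrt{n}$, which is 1-Lipschitz for the Euclidean metric. The choice of $f$, the sample size and all function names are assumptions made for illustration.

```python
import math
import random
import statistics

def sphere_point(n, rng):
    """Uniform point on the real unit sphere S^{n-1} (normalized Gaussian)."""
    g = [rng.gauss(0, 1) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def scaled_l1(x):
    """f(x) = ||x||_1 / sqrt(n); 1-Lipschitz for the Euclidean metric."""
    return sum(abs(c) for c in x) / math.sqrt(len(x))

def central_values(n, samples, rng):
    """Empirical mean, median and standard deviation of f on S^{n-1}."""
    values = [scaled_l1(sphere_point(n, rng)) for _ in range(samples)]
    return (statistics.fmean(values), statistics.median(values),
            statistics.pstdev(values))

rng = random.Random(1)
n = 200
mean, median, spread = central_values(n, 500, rng)
print(mean, median, spread, 1 / math.sqrt(n))
```

In such an experiment the empirical mean and median should differ by much less than $1/\sqrt{n}$, and the spread of $f$ should itself be of order $1/\sqrt{n}$, in line with the remark that all central values of a 1-Lipschitz function are close to each other.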
The obvious idea to prove Theorem 4 is to use Lévy’s lemma and an ε-net argument, using the fact that an ε-net in $S_{\mathbb{C}^n} = S^{2n-1}$ can be chosen to have cardinality $\le (1 + 2/\varepsilon)^{2n}$ (see [14], Lemma 4.10). Indeed, this was essentially Milman’s original argument in [10]. However, one only obtains in this way a subspace $E$ of dimension $cn\varepsilon^2/\log(1/\varepsilon)$. For many applications (including our previous paper [9]), this extra logarithmic factor is not an issue. However, in the present case, having the optimal dependence on ε is crucial. The classical framework of convex geometry is the real case (with or without the assumption “circled,” which in that context just means that the function is even). In that setting, Theorem 4 was proved by Gordon [15], who used comparison inequalities for Gaussian processes. A proof based on concentration of measure was later given by Schechtman [16]. The complex case does not seem to appear in the literature. Actually, on the face of it, Gordon’s proof does not extend to the complex setting, while Schechtman’s proof does. We sketch Schechtman’s proof of Theorem 4 in Appendix A. It is not clear whether the assumption “$f$ circled” in Theorem 4 can be completely removed; we do know that it is needed at most for very small values of ε.


9. Proof of the Main Proposition

Let $S_{HS}$ be the Hilbert–Schmidt sphere in $M_{k,d}$ and let $M$ be a random matrix uniformly distributed on $S_{HS}$. Let $\tilde g(\cdot)$ be the function defined on $S_{HS}$ by
$$\tilde g(M) = \left\| M M^\dagger - \frac{\mathrm{Id}}{k} \right\|_{HS}.$$
The next well-known lemma asserts that the singular values of a very rectangular random matrix are very concentrated. This is a familiar phenomenon in random matrix theory that goes back to [17]. Versions of this lemma appeared in the QIT literature under the tensor formalism (see for example Lemma III.4 in [18]). However, these versions typically introduce an unnecessary logarithmic factor which would imply that the main proposition holds with $d = Ck^2 \log k$ instead of $d = Ck^2$. For completeness, we include a proof of Lemma 6 in Appendix B.

Lemma 6. There exist absolute constants $C, c > 0$ such that, if $M$ is uniformly distributed on the Hilbert–Schmidt sphere in $M_{k,d}$ ($d \ge C^2 k$), then with probability larger than $1 - \exp(-ck)$,
$$\mathrm{spec}(M M^\dagger) \subset \left[ \left( \frac{1}{\sqrt{k}} - \frac{C}{\sqrt{d}} \right)^2, \left( \frac{1}{\sqrt{k}} + \frac{C}{\sqrt{d}} \right)^2 \right]. \tag{10}$$
We note that inclusion (10) can be reformulated as follows: all singular values of $M$ differ from $1/\sqrt{k}$ by less than $C/\sqrt{d}$. (Recall that the singular values of $M$ correspond to the Schmidt coefficients of a random pure state in $\mathbb{C}^k \otimes \mathbb{C}^d$.) We will use in the sequel the following immediate corollary of Lemma 6.

Corollary 7. Under the hypotheses of Lemma 6 and denoting $C_0 = 3C$:
(a) with probability larger than $1 - \exp(-ck)$, all eigenvalues of $M M^\dagger$ differ from $1/k$ by less than $C_0/\sqrt{kd}$; consequently, the median (or any fixed quantile) of $\tilde g$ is bounded by $C_0/\sqrt{d}$ for $k$ large enough.
(b) if $d \ge C^2 k$, the median (or any fixed quantile) of $\|M\|_\infty$ is bounded by $2/\sqrt{k}$ for $k$ large enough.
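To make the size of $\tilde g$ concrete, the sketch below samples one matrix uniformly from the Hilbert–Schmidt sphere, computes $\tilde g(M)^2$ directly, and compares it with $\mathrm{Tr}\,|M|^4 - 1/k$ as in (8). The dimensions $k = 4$, $d = 200$, the seed and the helper names are illustrative assumptions; the comparison with $1/d$ is only a heuristic check of the order of magnitude, not a statement about the constants in Corollary 7.

```python
import math
import random

def random_hs_unit_matrix(k, d, rng):
    """k x d complex matrix uniform on the Hilbert-Schmidt unit sphere."""
    M = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
         for _ in range(k)]
    norm = math.sqrt(sum(abs(e) ** 2 for row in M for e in row))
    return [[e / norm for e in row] for row in M]

def gram(M):
    """Return the k x k matrix M M^dagger."""
    k = len(M)
    return [[sum(a * b.conjugate() for a, b in zip(M[i], M[j]))
             for j in range(k)] for i in range(k)]

def g_tilde_squared(M):
    """|| M M^dagger - Id/k ||_HS^2, computed entry by entry."""
    k = len(M)
    G = gram(M)
    return sum(abs(G[i][j] - (1 / k if i == j else 0)) ** 2
               for i in range(k) for j in range(k))

def trace_fourth_power(M):
    """Tr |M|^4 = Tr (M M^dagger)^2 = sum of |G_ij|^2 (G is Hermitian)."""
    return sum(abs(e) ** 2 for row in gram(M) for e in row)

rng = random.Random(2)
k, d = 4, 200
M = random_hs_unit_matrix(k, d, rng)
direct = g_tilde_squared(M)
via_trace = trace_fourth_power(M) - 1 / k
print(direct, via_trace, 1 / d)
```

The two computed values agree up to rounding, and for $d$ much larger than $k$ they are of order $1/d$, so $\tilde g(M)$ itself is of order $1/\sqrt{d}$.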

We point out that while we chose to present statements (a) and (b) above as consequences of Lemma 6 for clarity and for “cultural” reasons (the lemma being familiar to the QIT community), more precise versions of these statements are available in (or can be readily deduced from) the random matrix literature. Re (a), the study of the distribution of $\tilde g$ is, by (8), equivalent to that of $\mathrm{Tr}\,|M|^4$, and a closed formula for the expected value of the latter is known (up to terms of smaller order, its value is $1/k + 1/d$); see, e.g., [19] (Sect. 8) and its references. Re (b), sharp estimates on the tail of $\|M\|_\infty$ can also be found in [19] (proof of Lemma 7.3); in particular every fixed quantile is $1/\sqrt{k} + 1/\sqrt{d}$ up to terms of smaller order. This result can also be retrieved via methods of earlier papers [20,21], which focused on the real case.

The function $\tilde g$ is 2-Lipschitz on $S_{HS}$, and Corollary 7(a) implies that the median of $\tilde g$ is as small as we want for large $d$. However, a direct application of Theorem 4 yields only a bound of order $1/\sqrt{k}$ in (7). The trick (already present in the previous approaches) is to exploit the fact that $\tilde g$ has a much smaller Lipschitz constant when


restricted to a certain large subset of $S_{HS}$. As we will see, this bootstrapping argument is equivalent to applying Theorem 4 twice. The following lemma appears in [6] with a rather long proof, but using the matrix formalism completely demystifies it.

Lemma 8. The function $\tilde g$ is $6/\sqrt{k}$-Lipschitz when restricted to the set
$$\Omega = \{ M \in S_{HS} \ \text{s.t.}\ \|M\|_\infty \le 3/\sqrt{k} \}.$$

Proof. The lemma is a consequence of the following chain of matrix inequalities:
$$\begin{aligned}
\left| \left\| M M^\dagger - \tfrac{\mathrm{Id}}{k} \right\|_{HS} - \left\| N N^\dagger - \tfrac{\mathrm{Id}}{k} \right\|_{HS} \right|
&\le \| M M^\dagger - N N^\dagger \|_{HS} \\
&\le \| M (M^\dagger - N^\dagger) + (M - N) N^\dagger \|_{HS} \\
&\le \|M\|_\infty \| M^\dagger - N^\dagger \|_{HS} + \| M - N \|_{HS} \| N^\dagger \|_\infty \\
&\le (\|M\|_\infty + \|N\|_\infty) \| M - N \|_{HS}.
\end{aligned}$$

The function $\|\cdot\|_\infty$ is 1-Lipschitz on $S_{HS}$. By Corollary 7(b), its median is bounded by $2/\sqrt{k}$ for $d \ge C^2 k$. (Note that Lévy’s lemma shows that the measure of the complement of $\Omega$ is very small.) An application of the standard Dvoretzky’s theorem (i.e., Theorem 4 for norms) to $f = \|\cdot\|_\infty$ with $\mu$ equal to the median of $\|\cdot\|_\infty$ and with $\varepsilon = 1/\sqrt{k}$ (note that the dimension of the ambient space is $n = kd$) shows that the intersection of $S_{HS}$ with a random subspace of dimension $cd$ in $M_{k,d}$ is contained in $\Omega$ with large probability.

Let $g$ be a $6k^{-1/2}$-Lipschitz extension of $\tilde g|_\Omega$ to $S_{HS}$; in any metric space $X$, it is possible to extend any $L$-Lipschitz function $\tilde h$ defined on a subset $Y$ without increasing the Lipschitz constant; use, e.g., the formula
$$h(x) = \inf_{y \in Y} \left[ \tilde h(y) + L \,\mathrm{dist}(x, y) \right].$$
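The extension formula can be tried on a toy example. The sketch below is an illustration under simplifying assumptions: the subset $Y$ is a finite set of points on the real line, so the infimum is a minimum, and the data `h_tilde` and the helper `lipschitz_extension` are hypothetical names introduced here.

```python
def lipschitz_extension(h_tilde, L, dist):
    """Extend h_tilde (a dict point -> value on Y) to the whole space.

    Implements h(x) = min over y in Y of h_tilde[y] + L * dist(x, y);
    the infimum is a minimum here because Y is finite.
    """
    def h(x):
        return min(value + L * dist(x, y) for y, value in h_tilde.items())
    return h

# Hypothetical toy data: Y = {0, 1, 3} on the real line, with L = 1.
h_tilde = {0.0: 0.0, 1.0: 1.0, 3.0: 0.0}
h = lipschitz_extension(h_tilde, 1.0, lambda x, y: abs(x - y))

grid = [i / 10 for i in range(-20, 51)]
agrees_on_Y = all(h(y) == v for y, v in h_tilde.items())
worst_slope = max(abs(h(a) - h(b)) / abs(a - b)
                  for a in grid for b in grid if a != b)
print(agrees_on_Y, worst_slope)
```

Because the toy data are 1-Lipschitz on $Y$, the extension reproduces them on $Y$, and the largest difference quotient on the grid does not exceed $L = 1$ (up to rounding).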

This formula also guarantees that the extended function $g$ is circled. Since $g = \tilde g$ on most of $S_{HS}$, the median of $g$ (resp., $\tilde g$) is a central value of $\tilde g$ (resp., $g$). We apply Theorem 4 to $g$ with $\varepsilon = 1/k$ and $L = 6k^{-1/2}$ to get ($\mu$ being the median of $\tilde g$)
$$\mathrm{osc}(g, S_{HS} \cap E, \mu) \le 1/k$$
on a random subspace $E \subset M_{k,d}$ of dimension $m = c_0 \cdot kd \cdot (k^{-1}/(6k^{-1/2}))^2 = cd$. Using Corollary 7(a), we obtain that $\mu \le 1/k$ for $d \ge (C_0 k)^2$. We then have $\mathrm{osc}(g, S_{HS} \cap E, 0) \le 2/k$. If $S_{HS} \cap E \subset \Omega$ (which, as noticed before, holds with large probability), $g$ and $\tilde g$ coincide on $S_{HS} \cap E$ and therefore $\mathrm{osc}(\tilde g, S_{HS} \cap E, 0) \le 2/k$. This completes the proof of Proposition 3 and hence that of Theorem 1.

Acknowledgements. The research of the first named author was partially supported by the Agence Nationale de la Recherche grant ANR-08-BLAN-0311-03. The research of the second and third named authors was partially supported by their respective grants from the National Science Foundation (U.S.A.) and from the U.S.-Israel Binational Science Foundation. The authors would like to thank M. B. Hastings and M. Horodecki for valuable comments, and MF Oberwolfach, where insights crucial to this project were crystallized, for their hospitality.


Appendix A. Proof of Theorem 4 (après Schechtman)

We sketch here a proof of Theorem 4, essentially following Schechtman [16]. As we already mentioned, a simple use of an ε-net argument gives a parasitic factor $\log(1/\varepsilon)$. This can be improved by a chaining argument, which goes back (at least) to Kolmogorov: a way to use η-nets for all values of η simultaneously. Consider the canonical inclusion $\mathbb{C}^m \subset \mathbb{C}^n$, and let $U \in U(n)$ be a random Haar-distributed unitary matrix. Then $F := U(\mathbb{C}^m)$ is distributed according to the Haar measure on the Grassmann manifold of $m$-dimensional subspaces. If $f : S_{\mathbb{C}^n} \to \mathbb{R}$ is a 1-Lipschitz circled function with mean $\mu$, we need to show that $\mathrm{osc}(f \circ U, S_{\mathbb{C}^m}, \mu) \le \varepsilon$ with large probability provided $m \le c_0 n \varepsilon^2$. We first prove a lemma.

Lemma 9. Let $f : S_{\mathbb{C}^n} \to \mathbb{R}$ be a 1-Lipschitz circled function and $U \in U(n)$ be a Haar-distributed random unitary matrix. Then for any $x, y \in S_{\mathbb{C}^n}$ with $x \ne y$ and for any $\lambda > 0$,
$$P(|f(Ux) - f(Uy)| > \lambda) \le C \exp\left( -cn \frac{\lambda^2}{|x - y|^2} \right).$$

Proof. Fix $x, y \in S_{\mathbb{C}^n}$. Since $f$ is circled (and $U$ is $\mathbb{C}$-linear), we may replace $y$ by $e^{i\theta} y$ and choose θ so that $\langle x | y \rangle$ is real nonnegative; note that this choice of θ minimizes $|x - y|$ and assures that $x + y$ and $y - x$ are orthogonal. (This is the only really new point needed to accommodate the complex setting.) Set $z = \frac{x + y}{2}$ and $w = \frac{x - y}{2}$; then $x = z + w$ and $y = z - w$. Further, set $\beta = |w| = \frac{1}{2}|x - y|$ (we may assume that $\beta \ne 0$) and $w' = \beta^{-1} w$. Then, conditionally on $u = U(z)$, $U(w')$ is distributed uniformly on the sphere $S_{u^\perp} := S_{\mathbb{C}^n} \cap u^\perp$. Since $U(x) = u + \beta U(w')$ and $U(y) = u - \beta U(w')$, it follows that the conditional (on $u = U(z)$) distribution of $f(Ux) - f(Uy)$ is the same as that of $f_u : S_{u^\perp} \to \mathbb{R}$ defined by $f_u(v) = f(u + \beta v) - f(u - \beta v)$. As is readily seen, $f_u$ is $2\beta$-Lipschitz and its mean is 0.
From Lévy’s lemma, applied to $f_u$ and to the $(2n - 3)$-dimensional sphere $S_{u^\perp}$, we deduce that, conditionally on $u = U(z)$,
$$P(|f(Ux) - f(Uy)| > \lambda) \le C_1 \exp(-c_1 (2n - 2) \lambda^2 / |x - y|^2),$$
and hence the same inequality holds also without the conditioning.

The end of the proof (the actual chaining argument) is identical to that in Schechtman’s paper, so, rather than copying it, we present the general principle on which it is based. Let $(S, \rho)$ be a compact metric space and let $(X_s)_{s \in S}$ be a family of mean 0 random variables (a stochastic process indexed by $S$). We say that $(X_s)$ is subgaussian if there are $A, \alpha > 0$ such that, for all $s, t \in S$ with $s \ne t$ and for all $\lambda \ge 0$,
$$P(|X_s - X_t| \ge \lambda) \le A \exp\left( -\alpha \frac{\lambda^2}{\rho(s, t)^2} \right). \tag{11}$$

Proposition 10 (Dudley’s inequality). If $(X_s)_{s \in S}$ satisfies (11) and some mild regularity conditions, then
$$E \sup_{s, t \in S} |X_s - X_t| \le C A \alpha^{-1/2} \int_0^\infty \sqrt{\log N(S, \eta)} \, d\eta,$$
where $N(S, \eta)$ is the minimal cardinality of an η-net of $S$ (in particular the integrand is 0 if η is larger than the radius of $S$).

See [22] for the original article, [23] for a generalization to the subgaussian case that is relevant here, and [24] for a book exposition; we also sketch a proof further below for the reader’s convenience. In our case we choose $S = S_{\mathbb{C}^m} \cup \{0\}$ (with the usual Euclidean metric), $X_s = f(Us) - \mu$ if $s \in S_{\mathbb{C}^m}$ and $X_0 = 0$; then
$$\mathrm{osc}(f \circ U, S_{\mathbb{C}^m}, \mu) = \sup_{s \in S} |X_s|.$$

The underlying probability space is $U(n)$, and the subgaussian property is given by Lemma 9 if $s, t \in S_{\mathbb{C}^m}$ and directly by Lévy’s lemma if $s$ or $t$ equals 0. Next, the bound $N(S_{\mathbb{C}^m}, \eta) = N(S^{2m-1}, \eta) \le (1 + 2/\eta)^{2m}$ mentioned in the comments following Lemma 5 leads to an estimate $2\sqrt{m}$ for the integral and to the bound
$$E := E \sup_{s \in S} |X_s| \le E \sup_{s, t \in S} |X_s - X_t| \le C' C (cn)^{-1/2} \cdot 2\sqrt{m} = C'' \sqrt{\frac{m}{n}}.$$
(For readers confused by different quantities appearing on the left side in different forms of Dudley’s inequality, we point out that the first inequality above uses the fact that one of the variables $X_t$ equals 0, and that we always have $\sup_{s,t} |X_s - X_t| = \sup_s X_s + \sup_t (-X_t)$.) The assertion of Theorem 4 follows now from Markov’s inequality if ε is sufficiently larger than $E$, which is assured by choosing $c_0$ small enough. A slightly more careful argument (such as that given in [16], or see [24]) or an application of the appropriate concentration inequality (for functions on $U(n)$) yields a bound of the form $\exp(-c' \varepsilon^2 n)$ on the probability of the exceptional set $\sup_{s \in S} |X_s| > C'' \sqrt{m/n} + \varepsilon$ (hence for the exceptional set from Theorem 4). Let us comment here that the value of the constant $c_0$ given by the proof of Theorem 4 is probably the single most important obstacle to showing Theorem 1 for “reasonable” values of $k, m$. An adaptation of the proof from [15] (which yields good constants) to the complex case could be helpful here.

Proof of Dudley’s inequality. For every $k \in \mathbb{Z}$, let $\mathcal{N}_k$ be a $2^{-k}$-net of minimal cardinality for $(S, \rho)$. Let $k_0 \in \mathbb{Z}$ be such that the radius of $S$ lies between $2^{-(k_0 + 1)}$ and $2^{-k_0}$; the net $\mathcal{N}_{k_0}$ consists of a single element $s_0$. For every $s \in S$ and $k \in \mathbb{Z}$, let $\pi_k(s)$ be an element of $\mathcal{N}_k$ satisfying $\rho(s, \pi_k(s)) \le 2^{-k}$. The chaining equation reads, for every $s \in S$,
$$X_s = X_{s_0} + \sum_{k \ge k_0} \left( X_{\pi_{k+1}(s)} - X_{\pi_k(s)} \right). \tag{12}$$
(It is here where some regularity of $(X_s)$, namely path continuity, is used.) It follows that
$$\sup_{s, t \in S} |X_s - X_t| \le 2 \sum_{k \ge k_0} \sup_{s \in S} |X_{\pi_{k+1}(s)} - X_{\pi_k(s)}| \le 2 \sum_{k \ge k_0} \sup_{u, u'} |X_u - X_{u'}|, \tag{13}$$
where the last supremum is taken over couples $(u, u') \in \mathcal{N}_{k+1} \times \mathcal{N}_k$ satisfying $\rho(u, u') \le 2^{-k} + 2^{-(k+1)} < 2^{-k+1}$. It remains to bound the expectation of each term in the sum, using the following fact.


Fact 11. If $N \ge 2$ and $Y_1, \ldots, Y_N$ are nonnegative random variables satisfying the tail estimate $P(Y_i \ge t) \le A \exp(-t^2 / 2\beta^2)$ for all $t \ge 0$, then $E \max Y_i \le C A \beta \sqrt{\log N}$.

To bound $E \sup |X_u - X_{u'}|$, we apply the above fact with $\beta = 2^{-k+1} \alpha^{-1/2}$ and $N = \mathrm{card}(\mathcal{N}_k) \cdot \mathrm{card}(\mathcal{N}_{k+1}) \le N(S, 2^{-(k+1)})^2$. This gives
$$E \sup_{s, t \in S} |X_s - X_t| \le C A \alpha^{-1/2} \sum_{k \ge k_0} 2^{-k} \sqrt{\log N(S, 2^{-(k+1)})}.$$

The result now follows by relating the last series to the integral in Proposition 10 (a version of the integral test from calculus).

Proof of Fact 11. We may assume $\beta = 1$ by working with $Y_i / \beta$. Then simply write
$$E \max Y_i = \int_0^\infty P(\max Y_i \ge t) \, dt \le \sqrt{2 \log N} + A N \int_{\sqrt{2 \log N}}^\infty \exp(-t^2/2) \, dt \le \sqrt{2 \log N} + A.$$
The last inequality follows from
$$\int_{\sqrt{2 \log N}}^\infty \exp(-t^2/2) \, dt \le \int_{\sqrt{2 \log N}}^\infty t \exp(-t^2/2) \, dt = \frac{1}{N}.$$
Note that the hypotheses force $A \ge 1$.
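The tail estimate used in the last step can be checked numerically. The following sketch evaluates the Gaussian tail integral in closed form through the complementary error function and verifies the bound by $1/N$ for a range of $N$; the function names and the range of $N$ tested are choices made for this illustration.

```python
import math

def gaussian_tail_integral(a):
    """Integral of exp(-t^2/2) over [a, infinity), via the erfc function."""
    return math.sqrt(math.pi / 2) * math.erfc(a / math.sqrt(2))

def tail_bound_holds(N):
    """Check that the integral from sqrt(2 log N) is at most 1/N."""
    a = math.sqrt(2 * math.log(N))
    return gaussian_tail_integral(a) <= 1 / N

checked = all(tail_bound_holds(N) for N in range(2, 1001))
print(checked)
```

The comparison with $t \exp(-t^2/2)$ works because the lower limit $\sqrt{2 \log N}$ is at least 1 as soon as $N \ge 2$.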

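Appendix B. Proof of Lemma 6

The lemma will follow if we show that with large probability,
$$\|\Delta\|_\infty \le \frac{C}{\sqrt{kd}},$$
where $\Delta = M M^\dagger - \mathrm{Id}/k \in M_k$ and $\|\cdot\|_\infty$ is the operator (or spectral) norm. Let $\mathcal{N}$ be a $\frac{1}{4}$-net of $S_{\mathbb{C}^k}$ with cardinality bounded by $(C_0)^k$. One checks that if $x \in S_{\mathbb{C}^k}$ and $\bar x \in \mathcal{N}$ satisfy $|x - \bar x| \le 1/4$, then
$$|\langle x | \Delta | x \rangle| \le |\langle \bar x | \Delta | \bar x \rangle| + |\langle x - \bar x | \Delta | \bar x \rangle| + |\langle x | \Delta | x - \bar x \rangle| \le |\langle \bar x | \Delta | \bar x \rangle| + 2 \cdot \frac{1}{4} \|\Delta\|_\infty,$$
so that taking supremum over $x \in S_{\mathbb{C}^k}$, we get
$$\|\Delta\|_\infty \le 2 \sup_{\bar x \in \mathcal{N}} |\langle \bar x | \Delta | \bar x \rangle|.$$
An application of the union bound gives
$$\begin{aligned}
P\left( \|\Delta\|_\infty \ge \frac{C}{\sqrt{kd}} \right)
&\le (C_0)^k \cdot P\left( \langle x_0 | \Delta | x_0 \rangle \ge \frac{C}{2\sqrt{kd}} \right) \\
&= (C_0)^k \cdot P\left( |M^\dagger x_0|^2 \ge \frac{1}{k} + \frac{C}{2\sqrt{kd}} \right) \\
&\le (C_0)^k \cdot P\left( |M^\dagger x_0| \ge \frac{1}{\sqrt{k}} + \frac{C}{5\sqrt{d}} \right),
\end{aligned}$$
where $x_0 \in \mathbb{C}^k$ is any fixed unit vector (remember that $d \ge C^2 k$). The probabilities above can be expressed in terms of Beta-type integrals, but it is easier to estimate them using Lévy’s lemma. The function $M \mapsto |M^\dagger x_0|$ is 1-Lipschitz on the Hilbert–Schmidt sphere (if $x_0$ is the first vector of the canonical basis, then $M^\dagger x_0$ is essentially the first row of $M$) and
$$E |M^\dagger x_0| \le \left( E |M^\dagger x_0|^2 \right)^{1/2} = 1/\sqrt{k}.$$
Hence, by Lévy’s lemma (with $n = 2kd$ and $\varepsilon = \frac{C}{5\sqrt{d}}$), we get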

$$P\left( \|\Delta\|_\infty \ge \frac{C}{\sqrt{kd}} \right) \le \exp(-ck)$$
for some choice of the constants $C, c > 0$, as required.

References

1. Nielsen, M.A., Chuang, I.L.: Quantum computation and quantum information. Cambridge: Cambridge University Press, 2000
2. Holevo, A.S.: The additivity problem in quantum information theory. In: “Proceedings of the International Congress of Mathematicians (Madrid, 2006),” Vol. III, Zürich: Eur. Math. Soc., 2006, pp. 999–1018
3. Shor, P.W.: Equivalence of additivity questions in quantum information theory. Commun. Math. Phys. 246(3), 453–472 (2004)
4. Hastings, M.B.: Superadditivity of communication capacity using entangled inputs. Nature Phys. 5, 255 (2009)
5. Dvoretzky, A.: Some Results on Convex Bodies and Banach Spaces. In: “Proc. Internat. Sympos. Linear Spaces (Jerusalem, 1960),” Jerusalem: Jerusalem Academic Press, Oxford: Pergamon, 1961, pp. 123–160
6. Brandao, F., Horodecki, M.: On Hastings’ counterexamples to the minimum output entropy additivity conjecture. Open Syst. Inf. Dyn. 17, 31 (2010)
7. Fukuda, M., King, C., Moser, D.: Comments on Hastings’ Additivity Counterexamples. Commun. Math. Phys. 296, 111 (2010)
8. Hayden, P., Winter, A.: Counterexamples to the maximal p-norm multiplicativity conjecture for all p > 1. Commun. Math. Phys. 284, 263–280 (2008)
9. Aubrun, G., Szarek, S., Werner, E.: Non-additivity of Rényi entropy and Dvoretzky’s theorem. J. Math. Phys. 51, 022102 (2010)
10. Milman, V.: A new proof of the theorem of A. Dvoretzky on sections of convex bodies. Funct. Anal. Appl. 5, 28–37 (1971) (English translation)
11. Figiel, T., Lindenstrauss, J., Milman, V.D.: The dimension of almost spherical sections of convex bodies. Acta Math. 139(1-2), 53–94 (1977)
12. Collins, B., Nechita, I.: Gaussianization and eigenvalue statistics for random quantum channels (III). Ann. Appl. Probab., to appear; http://arxiv.org/abs/0910.1768v2 [quant-ph], 2009
13. Lévy, P.: Problèmes concrets d’analyse fonctionnelle, 2nd ed. Paris: Gauthier-Villars, 1951
14. Pisier, G.: The volume of convex bodies and Banach space geometry. Cambridge Tracts in Mathematics, 94. Cambridge: Cambridge University Press, 1989
15. Gordon, Y.: On Milman’s inequality and random subspaces which escape through a mesh in $\mathbb{R}^n$. In: “Geometric aspects of functional analysis (1986/87),” Lecture Notes in Math., 1317, Berlin: Springer, 1988, pp. 84–106
16. Schechtman, G.: A remark concerning the dependence on ε in Dvoretzky’s theorem. In: “Geometric aspects of functional analysis (1987–88),” Lecture Notes in Math., 1376, Berlin: Springer, 1989, pp. 274–277
17. Marchenko, V.A., Pastur, L.A.: The distribution of eigenvalues in certain sets of random matrices. Mat. Sb. 72, 507–536 (1967)
18. Hayden, P., Leung, D., Winter, A.: Aspects of generic entanglement. Commun. Math. Phys. 265, 95–117 (2006)
19. Haagerup, U., Thorbjørnsen, S.: Random matrices with complex Gaussian entries. Expos. Math. 21, 293–337 (2003)
20. Geman, S.: A limit theorem for the norm of random matrices. Ann. Probab. 8, 252–261 (1980)
21. Silverstein, J.W.: The smallest eigenvalue of a large-dimensional Wishart matrix. Ann. Probab. 13, 1364–1368 (1985)
22. Dudley, R.M.: The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. J. Funct. Anal. 1, 290–330 (1967)
23. Jain, N.C., Marcus, M.B.: Continuity of subgaussian processes. In: “Probability on Banach Spaces,” Advances in Probability, Vol. 4, New York: Dekker, 1978, pp. 81–196
24. Talagrand, M.: The generic chaining. Upper and Lower bounds of Stochastic Processes. Berlin-Heidelberg-New York: Springer, 2005

Communicated by M.B. Ruskai


Communications in


Deformations of Quantum Field Theories on Spacetimes with Killing Vector Fields

Claudio Dappiaggi¹, Gandalf Lechner², Eric Morfa-Morales³

¹ II. Institut für Theoretische Physik, D-22763 Hamburg, Deutschland.


2 Faculty of Physics, University of Vienna, A-1090 Vienna, Austria.


3 Erwin Schrödinger Institute for Mathematical Physics, Vienna, A-1090 Vienna, Austria.

E-mail: [email protected] Received: 22 June 2010 / Accepted: 8 September 2010 Published online: 20 February 2011 – © Springer-Verlag 2011

Abstract: The recent construction and analysis of deformations of quantum field theories by warped convolutions is extended to a class of curved spacetimes. These spacetimes carry a family of wedge-like regions which share the essential causal properties of the Poincaré transforms of the Rindler wedge in Minkowski space. In the setting of deformed quantum field theories, they play the role of typical localization regions of quantum fields and observables. As a concrete example of such a procedure, the deformation of the free Dirac field is studied.

1. Introduction

Deformations of quantum field theories arise in different contexts and have been studied from different points of view in recent years. One motivation for considering such models is a possible noncommutative structure of spacetime at small scales, as suggested by combining classical gravity and the uncertainty principle of quantum physics [DFR95]. Quantum field theories on such noncommutative spaces can then be seen as deformations of usual quantum field theories, and it is hoped that they might capture some aspects of a still elusive theory of quantum gravity (cf. [Sza03] for a review). By now there exist several different types of deformed quantum field theories; see [GW05,BPQV08,Sol08,GL08,BGK+08,BDFP10] for some recent papers, and references cited therein. Certain deformation techniques arising from such considerations can also be used as a device for the construction of new models in the framework of usual quantum field theory on commutative spaces [GL07,BS08,GL08,BLS10,LW10], independent of their connection to the idea of noncommutative spaces. From this point of view, the deformation parameter plays the role of a coupling constant which changes the interaction of the model under consideration, but leaves the classical structure of spacetime unchanged.
Deformations designed for either describing noncommutative spacetimes or for constructing new models on ordinary spacetimes have been studied mostly in the case of a flat manifold, either with a Euclidean or Lorentzian signature. In fact, many approaches

rely on a preferred choice of Cartesian coordinates in their very formulation, and do not generalize directly to curved spacetimes. The analysis of the interplay between spacetime curvature and deformations involving noncommutative structures thus presents a challenging problem. As a first step in this direction, we study in the present paper how certain deformed quantum field theories can be formulated in the presence of external gravitational fields (i.e., on curved spacetime manifolds); see also [ABD+05,OS09] for other approaches to this question. We will not address here the fundamental question of dynamically coupling the matter fields with a possible noncommutative geometry of spacetime [PV04,Ste07], but rather consider as an intermediate step deformed quantum field theories on a fixed Lorentzian background manifold $M$. A deformation technique which is well suited for our purposes is that of warped convolutions, see [BS08] and [GL07,GL08] for precursors and related work. Starting from a Hilbert space $\mathcal{H}$ carrying a representation $U$ of $\mathbb{R}^n$, the warped convolution $A_Q$ of an operator $A$ on $\mathcal{H}$ is defined as
$$A_Q = (2\pi)^{-n} \int d^n x \, d^n y \; e^{-ixy} \, U(Qx) \, A \, U(y - Qx). \tag{1.1}$$
Here $Q$ is an antisymmetric $(n \times n)$-matrix playing the role of deformation parameter, and the integral can be defined in an oscillatory sense if $A$ and $U$ meet certain regularity requirements. For deformations of a single algebra, the mapping $A \mapsto A_Q$ has many features in common with deformation quantization and the Weyl-Moyal product, and in fact was recently shown [BLS10] to be equivalent to specific representations of Rieffel’s deformed $C^*$-algebras with $\mathbb{R}^n$-action [Rie92]. In application to field theory models, however, one has to deform a whole family of algebras, corresponding to subsystems localized in spacetime, and the parameter $Q$ has to be replaced by a family of matrices $\{Q\}$ adapted to the geometry of the underlying spacetime.
To apply this scheme to quantum field theories on curved manifolds, we will consider spacetimes with a sufficiently large isometry group containing two commuting Killing fields, which give rise to a representation of $\mathbb{R}^2$ as required in (1.1). This setting is wide enough to encompass a number of cosmologically relevant manifolds such as Friedmann-Robertson-Walker spacetimes, or Bianchi models. Making use of the algebraic framework of quantum field theory [Haa96,Ara99], we can then formulate quantum field theories in an operator-algebraic language and study their deformations. Despite the fact that the warped convolution was invented for the deformation of Minkowski space quantum field theories, it turns out that all reference to the particular structure of flat spacetime, such as Poincaré transformations and a Poincaré invariant vacuum state, can be avoided. We are interested in understanding to what extent the familiar structure of quantum field theories on curved spacetimes is preserved under such deformations, and investigate in particular covariance and localization properties. Concerning locality, it is known that in warped models on Minkowski space, point-like localization is weakened to localization in certain infinitely extended, wedge-shaped regions [GL07,BS08,GL08,BLS10]. These regions are defined as Poincaré transforms of the Rindler wedge
$$W_R := \{ (x_0, x_1, x_2, x_3) \in \mathbb{R}^4 : x_1 > |x_0| \}. \tag{1.2}$$
Because of their intimate relation to the Poincaré symmetry of Minkowski spacetime, it is not obvious what a good replacement for such a collection of regions is in the presence of non-vanishing curvature. In fact, different definitions are possible, and wedges

on special manifolds have been studied by many authors in the literature [Kay85,BB99,Reh00,BMS01,GLRV01,BS04,LR07,Str08,Bor09]. In Sect. 2, the first main part of our investigation, we show that on those four-dimensional curved spacetimes which allow for the application of the deformation methods in [BLS10], and thus carry two commuting Killing fields, there also exists a family of wedges with causal properties analogous to the Minkowski space wedges. Because of the prominent role wedges play in many areas of Minkowski space quantum field theory [BW75,Bor92,Bor00,BDFS00,BGL02], this geometric and manifestly covariant construction is also of interest independently of its relation to deformations. In Sect. 3, we then consider quantum field theories on curved spacetimes, and deform them by warped convolution. We first show that these deformations can be carried through in a model-independent, operator-algebraic framework, and that the emerging models share many structural properties with deformations of field theories on flat spacetime (Sect. 3.1). In particular, deformed quantum fields are localized in the wedges of the considered spacetime. These and further aspects of deformed quantum field theories are also discussed in the concrete example of a Dirac field in Sect. 3.2. Sect. 4 contains our conclusions.

2. Geometric Setup

To prepare the ground for our discussion of deformations of quantum field theories on curved backgrounds, we introduce in this section a suitable class of spacetimes and study their geometrical properties. In particular, we show how the concept of wedges, known from Minkowski space, generalizes to these manifolds. Recall in preparation that a wedge in four-dimensional Minkowski space is a region which is bounded by two non-parallel characteristic hyperplanes [TW97], or, equivalently, a region which is a connected component of the causal complement of a two-dimensional spacelike plane.
The latter definition has a natural analogue in the curved setting. Making use of this observation, we construct corresponding wedge regions in Sect. 2.1, and analyse their covariance, causality and inclusion properties. At the end of that section, we compare our notion of wedges to other definitions which have been made in the literature [BB99,BMS01,BS04,LR07,Bor09], and point out the similarities and differences. In Sect. 2.2, the abstract analysis of wedge regions is complemented by a number of concrete examples of spacetimes fulfilling our assumptions.

2.1. Edges and wedges in curved spacetimes. In the following, a spacetime $(M, g)$ is understood to be a four-dimensional, Hausdorff, (arcwise) connected, smooth manifold $M$ endowed with a smooth, Lorentzian metric $g$ whose signature is $(+, -, -, -)$. Notice that it is automatically guaranteed that $M$ is also paracompact and second countable [Ger68,Ger70]. The (open) causal complement of a set $O \subset M$ is defined as

\[ O' := M \setminus \overline{J^+(O) \cup J^-(O)}, \tag{2.1} \]

where $J^\pm(O)$ is the causal future respectively past of $O$ in $M$ [Wal84, Sect. 8.1]. To avoid pathological geometric situations such as closed causal curves, and also to define a full-fledged Cauchy problem for a free field theory whose dynamics is determined by a second order hyperbolic partial differential equation, we will restrict ourselves to globally hyperbolic spacetimes. So in particular, $M$ is orientable and time-orientable, and we fix both orientations. While this setting is standard in quantum field



theory on curved backgrounds, we will make additional assumptions regarding the structure of the isometry group $\mathrm{Iso}(M, g)$ of $(M, g)$, motivated by our desire to define wedges in $M$ which resemble those in Minkowski space. Our most important assumption on the structure of $(M, g)$ is that it admits two linearly independent, spacelike, complete, commuting smooth Killing fields $\xi_1, \xi_2$, which will later be essential in the context of deformed quantum field theories. We refer here and in the following always to pointwise linear independence, which entails in particular that these vector fields have no zeros. Denoting the flows of $\xi_1, \xi_2$ by $\varphi_{\xi_1}, \varphi_{\xi_2}$, the orbit of a point $p \in M$ is a smooth two-dimensional spacelike embedded submanifold of $M$,

\[ E := \{\varphi_{\xi_1,s_1}(\varphi_{\xi_2,s_2}(p)) \in M : s_1, s_2 \in \mathbb{R}\}, \tag{2.2} \]

where $s_1, s_2$ are the flow parameters of $\xi_1, \xi_2$. Since $M$ is globally hyperbolic, it is isometric to a smooth product manifold $\mathbb{R} \times \Sigma$, where $\Sigma$ is a smooth, three-dimensional embedded Cauchy hypersurface. It is known that the metric splits according to $g = \beta\, dT^2 - h$ with a temporal function $T : \mathbb{R} \times \Sigma \to \mathbb{R}$ and a positive function $\beta \in C^\infty(\mathbb{R} \times \Sigma, (0, \infty))$, while $h$ induces a smooth Riemannian metric on $\Sigma$ [BS05, Thm. 2.1]. We assume that, with $E$ as in (2.2), the Cauchy surface $\Sigma$ is smoothly homeomorphic to a product manifold $I \times E$, where $I$ is an open interval or the full real line. Thus $M \cong \mathbb{R} \times I \times E$, and we require in addition that there exists a smooth embedding $\iota : \mathbb{R} \times I \to M$. By our assumption on the topology of $I$, it follows that $(\mathbb{R} \times I, \iota^* g)$ is a globally hyperbolic spacetime without null focal points, a feature that we will need in the subsequent construction of wedge regions.

Definition 2.1. A spacetime $(M, g)$ is called admissible if it admits two linearly independent, spacelike, complete, commuting, smooth Killing fields $\xi_1, \xi_2$ and the corresponding split $M \cong \mathbb{R} \times I \times E$, with $E$ defined in (2.2), has the properties described above. The set of all ordered pairs $\xi := (\xi_1, \xi_2)$ satisfying these conditions for a given admissible spacetime $(M, g)$ is denoted $\Xi(M, g)$. The elements of $\Xi(M, g)$ will be referred to as Killing pairs.

For the remainder of this section, we will work with an arbitrary but fixed admissible spacetime $(M, g)$, and usually drop the $(M, g)$-dependence of various objects in our notation, e.g., write $\Xi$ instead of $\Xi(M, g)$ for the set of Killing pairs, and $\mathrm{Iso}$ in place of $\mathrm{Iso}(M, g)$ for the isometry group. Concrete examples of admissible spacetimes, such as Friedmann-Robertson-Walker-, Kasner- and Bianchi-spacetimes, will be discussed in Sect. 2.2.
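The flow of a Killing pair $\xi \in \Xi$ is written as

\[ \varphi_{\xi,s} := \varphi_{\xi_1,s_1} \circ \varphi_{\xi_2,s_2} = \varphi_{\xi_2,s_2} \circ \varphi_{\xi_1,s_1}, \qquad \xi = (\xi_1, \xi_2) \in \Xi,\ s = (s_1, s_2) \in \mathbb{R}^2, \tag{2.3} \]

where $s_1, s_2 \in \mathbb{R}$ are the parameters of the (complete) flows $\varphi_{\xi_1}, \varphi_{\xi_2}$ of $\xi_1, \xi_2$. By construction, each $\varphi_\xi$ is an isometric $\mathbb{R}^2$-action by diffeomorphisms on $(M, g)$, i.e., $\varphi_{\xi,s} \in \mathrm{Iso}$ and $\varphi_{\xi,s}\varphi_{\xi,u} = \varphi_{\xi,s+u}$ for all $s, u \in \mathbb{R}^2$. On the set $\Xi$, the isometry group $\mathrm{Iso}$ and the general linear group $\mathrm{GL}(2, \mathbb{R})$ act in a natural manner.

Lemma 2.2. Let $h \in \mathrm{Iso}$, $N \in \mathrm{GL}(2, \mathbb{R})$, and define, for $\xi = (\xi_1, \xi_2) \in \Xi$,

\[ h_*\xi := (h_*\xi_1, h_*\xi_2), \tag{2.4} \]
\[ (N\xi)(p) := N(\xi_1(p), \xi_2(p)), \qquad p \in M. \tag{2.5} \]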



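These operations are commuting group actions of $\mathrm{Iso}$ and $\mathrm{GL}(2, \mathbb{R})$ on $\Xi$, respectively. The $\mathrm{GL}(2, \mathbb{R})$-action transforms the flow of $\xi \in \Xi$ according to, $s \in \mathbb{R}^2$,

\[ \varphi_{N\xi,s} = \varphi_{\xi,N^T s}. \tag{2.6} \]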

If $h_*\xi = N\xi$ for some $\xi \in \Xi$, $h \in \mathrm{Iso}$, $N \in \mathrm{GL}(2, \mathbb{R})$, then $\det N = \pm 1$.

Proof. Due to the standard properties of isometries, $\mathrm{Iso}$ acts on the Lie algebra of Killing fields by the push-forward isomorphisms $\xi_1 \mapsto h_*\xi_1$ [O’N83]. Therefore, for any $(\xi_1, \xi_2) \in \Xi$, also the vector fields $h_*\xi_1, h_*\xi_2$ are spacelike, complete, commuting, linearly independent, smooth Killing fields. The demanded properties of the splitting $M \cong \mathbb{R} \times I \times E$ directly carry over to the corresponding split with respect to $h_*\xi$. So $h_*$ maps $\Xi$ onto $\Xi$, and since $h_*(k_*\xi_1) = (hk)_*\xi_1$ for $h, k \in \mathrm{Iso}$, we have an action of $\mathrm{Iso}$. The second map, $\xi \mapsto N\xi$, amounts to taking linear combinations of the Killing fields $\xi_1, \xi_2$. The relation (2.6) holds because $\xi_1, \xi_2$ commute and are complete, which entails that the respective flows can be constructed via the exponential map. Since $\det N \neq 0$, the two components of $N\xi$ are still linearly independent, and since $E$ (2.2) is invariant under $\xi \mapsto N\xi$, the splitting $M \cong \mathbb{R} \times I \times E$ is the same for $\xi$ and $N\xi$. Hence $N\xi \in \Xi$, i.e., $\xi \mapsto N\xi$ is a $\mathrm{GL}(2, \mathbb{R})$-action on $\Xi$, and since the push-forward is linear, it is clear that the two actions commute.

To prove the last statement, we consider the submanifold $E$ (2.2) together with its induced metric. Since the Killing fields $\xi_1, \xi_2$ are tangent to $E$, their flows are isometries of $E$. Since $h_*\xi = N\xi$ and $E$ is two-dimensional, it follows that $N$ acts as an isometry on the tangent space $T_pE$, $p \in E$. But as $E$ is spacelike and two-dimensional, we can assume without loss of generality that the metric of $T_pE$ is the Euclidean metric, and therefore has the two-dimensional Euclidean group $E(2)$ as its isometry group. Thus $N \in \mathrm{GL}(2, \mathbb{R}) \cap E(2) = O(2)$, i.e., $\det N = \pm 1$.

The $\mathrm{GL}(2, \mathbb{R})$-transformation given by the flip matrix $\Pi := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ will play a special role later on. We therefore reserve the name inverted Killing pair of $\xi = (\xi_1, \xi_2) \in \Xi$ for

\[ \xi' := \Pi\xi = (\xi_2, \xi_1). \tag{2.7} \]

Note that since we consider ordered tuples $(\xi_1, \xi_2)$, the Killing pairs $\xi$ and $\xi'$ are not identical. Clearly, the map $\xi \mapsto \xi'$ is an involution on $\Xi$, i.e., $(\xi')' = \xi$. After these preparations, we turn to the construction of wedge regions in $M$, and begin by specifying their edges.

Definition 2.3. An edge is a subset of $M$ which has the form

\[ E_{\xi,p} := \{\varphi_{\xi,s}(p) \in M : s \in \mathbb{R}^2\} \tag{2.8} \]

for some $\xi \in \Xi$, $p \in M$. Any spacelike vector $n_{\xi,p} \in T_pM$ which completes the gradient of the chosen temporal function and the Killing vectors $\xi_1(p), \xi_2(p)$ to a positively oriented basis $(\nabla T(p), \xi_1(p), \xi_2(p), n_{\xi,p})$ of $T_pM$ is called an oriented normal of $E_{\xi,p}$. It is clear from this definition that each edge is a two-dimensional, spacelike, smooth submanifold of $M$. Our definition of admissible spacetimes $M \cong \mathbb{R} \times I \times E$ explicitly restricts the topology of $I$, but not of the edge (2.2), which can be homeomorphic to a plane, cylinder, or torus.
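As a brief check of the transformation law (2.6) from Lemma 2.2, the following sketch writes the flow of a Killing pair as the time-one flow of a linear combination of the commuting, complete fields. The index notation $N_{ij}$ for the entries of $N$ and the shorthand $\exp(\cdot)$ for the time-one flow of a complete vector field are notational assumptions made here for illustration.

```latex
% Assumed notation: (N\xi)_i = \sum_j N_{ij}\,\xi_j, and \exp(X) denotes the
% time-one flow of a complete vector field X. Because [\xi_1,\xi_2] = 0, the
% flow of s_1\xi_1 + s_2\xi_2 at time one equals
% \varphi_{\xi_1,s_1}\circ\varphi_{\xi_2,s_2} = \varphi_{\xi,s}.
\begin{align*}
\varphi_{N\xi,s}
  &= \exp\Big(\sum_{i=1}^{2} s_i\,(N\xi)_i\Big)
   = \exp\Big(\sum_{i,j=1}^{2} s_i\,N_{ij}\,\xi_j\Big) \\
  &= \exp\Big(\sum_{j=1}^{2} (N^{T}s)_j\,\xi_j\Big)
   = \varphi_{\xi,\,N^{T}s}.
\end{align*}
% Since s \mapsto N^{T}s is a bijection of \mathbb{R}^2, the orbits coincide:
% \{\varphi_{N\xi,s}(p) : s \in \mathbb{R}^2\} = \{\varphi_{\xi,s}(p) : s \in \mathbb{R}^2\}.
```

The last comment line is the reason why passing from $\xi$ to $N\xi$ does not change the two-dimensional orbit through a given point.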



Fig. 1. Three-dimensional sketch of the wedge $W_{\xi,p}$ and its edge $E_{\xi,p}$

Note also that the description of the edge $E_{\xi,p}$ in terms of $\xi$ and $p$ is somewhat redundant: Replacing the Killing fields $\xi_1, \xi_2$ by linear combinations $\tilde\xi := N\xi$, $N \in \mathrm{GL}(2, \mathbb{R})$, or replacing $p$ by $\tilde p := \varphi_{\xi,u}(p)$ with some $u \in \mathbb{R}^2$, results in the same manifold $E_{\tilde\xi,\tilde p} = E_{\xi,p}$. Before we define wedges as connected components of causal complements of edges, we have to prove the following key lemma, from which the relevant properties of wedges follow. For its proof, it might be helpful to visualize the geometrical situation as sketched in Fig. 1.

Lemma 2.4. The causal complement $E'_{\xi,p}$ of an edge $E_{\xi,p}$ is the disjoint union of two connected components, which are causal complements of each other.

Proof. We first show that any point $q \in E'_{\xi,p}$ is connected to the base point $p$ by a smooth, spacelike curve. Since $M$ is globally hyperbolic, there exist Cauchy surfaces $\Sigma_p, \Sigma_q$ passing through $p$ and $q$, respectively. We pick two compact subsets $K_q \subset \Sigma_q$, containing $q$, and $K_p \subset \Sigma_p$, containing $p$. If $K_p, K_q$ are chosen sufficiently small, their union $K_p \cup K_q$ is an acausal, compact, codimension one submanifold of $M$. It thus fulfils the hypothesis of Thm. 1.1 in [BS06], which guarantees that there exists a spacelike Cauchy surface $\Sigma$ containing the said union. In particular, there exists a smooth, spacelike curve $\gamma$ connecting $p = \gamma(0)$ and $q = \gamma(1)$. Picking spacelike vectors $v \in T_p\Sigma$ and $w \in T_q\Sigma$, we have the freedom of choosing $\gamma$ in such a way that $\dot\gamma(0) = v$ and $\dot\gamma(1) = w$. If $v$ and $w$ are chosen linearly independent from $\xi_1(p), \xi_2(p)$ and $\xi_1(q), \xi_2(q)$, respectively, these vectors are oriented normals of $E_{\xi,p}$ respectively $E_{\xi,q}$, and we can select $\gamma$ such that it intersects the edge $E_{\xi,p}$ only in $p$. Let us define the region

\[ W_{\xi,p} := \{ q \in E'_{\xi,p} : \exists\, \gamma \in C^1([0,1], M) \text{ with } \gamma(0) = p,\ \gamma(1) = q,\ E_{\xi,p} \cap \gamma = \{p\},\ \dot\gamma(0) \text{ is an oriented normal of } E_{\xi,p},\ \dot\gamma(1) \text{ is an oriented normal of } E_{\xi,q} \}, \tag{2.9} \]

and, exchanging $\xi$ with the inverted Killing pair $\xi'$, we correspondingly define the region $W_{\xi',p}$. It is clear from the above argument that $W_{\xi,p} \cup W_{\xi',p} = E'_{\xi,p}$, and that we can



prescribe arbitrary normals $n, m$ of $E_{\xi,p}, E_{\xi,q}$ as initial respectively final tangent vectors of the curve $\gamma$ connecting $p$ to $q \in W_{\xi,p}$. The proof of the lemma consists in establishing that $W_{\xi,p}$ and $W_{\xi',p}$ are disjoint, and causal complements of each other.

To prove disjointness of $W_{\xi,p}, W_{\xi',p}$, assume there exists a point $q \in W_{\xi,p} \cap W_{\xi',p}$. Then $q$ can be connected with the base point $p$ by two spacelike curves, whose tangent vectors satisfy the conditions in (2.9) with $\xi$ respectively $\xi'$. By joining these two curves, we have identified a continuous loop $\lambda$ in $E'_{\xi,p}$. As an oriented normal, the tangent vector $\dot\lambda(0)$ at $p$ is linearly independent of $\xi_1(p), \xi_2(p)$, so that $\lambda$ intersects $E_{\xi,p}$ only in $p$. Recall that according to Definition 2.1, $M$ splits as the product $M \cong \mathbb{R} \times I \times E_{\xi,p}$, with an open interval $I$ which is smoothly embedded in $M$. Hence we can consider the projection $\pi(\lambda)$ of the loop $\lambda$ onto $I$, which is a closed interval $\pi(\lambda) \subset I$ because the simple connectedness of $I$ rules out the possibility that $\pi(\lambda)$ forms a loop, and on account of the linear independence of $\{\xi_1(p), \xi_2(p), n_{\xi,p}\}$, the projection cannot be just a single point. Yet, as $\lambda$ is a loop, there exists $p' \in \lambda$ such that $\pi(p') = \pi(p)$. We also know that $\pi^{-1}(\{\pi(p)\}) = \mathbb{R} \times \{\pi(p)\} \times E_{\xi,p}$ is contained in $J^+(E_{\xi,p}) \cup E_{\xi,p} \cup J^-(E_{\xi,p})$ and, since $p$ and $p'$ are causally separated, the only possibility left is that they both lie on the same edge. Yet, per construction, we know that the loop intersects the edge only once at $p$ and, thus, $p$ and $p'$ must coincide, which is the sought contradiction.

To verify the claim about causal complements, assume there exist points $q \in W_{\xi,p}$, $q' \in W_{\xi',p}$ and a causal curve $\gamma$ connecting them, $\gamma(0) = q$, $\gamma(1) = q'$. By definition of the causal complement, it is clear that $\gamma$ does not intersect $E_{\xi,p}$. In view of our restriction on the topology of $M$, it follows that $\gamma$ intersects either $J^+(E_{\xi,p})$ or $J^-(E_{\xi,p})$.
These two cases are completely analogous, and we consider the latter one, where there exists a point $q'' \in \gamma \cap J^-(E_{\xi,p})$. In this situation, we have a causal curve connecting $q \in W_{\xi,p}$ with $q'' \in J^-(E_{\xi,p})$, and since $q \notin J^-(q'') \subset J^-(E_{\xi,p})$, it follows that $\gamma$ must be past directed. As the time orientation of $\gamma$ is the same for the whole curve, it follows that also the part of $\gamma$ connecting $q''$ and $q'$ is past directed. Hence $q' \in J^-(q'') \subset J^-(E_{\xi,p})$, which is a contradiction to $q' \in W_{\xi',p}$. Thus $W_{\xi',p} \subset W_{\xi,p}'$. To show that $W_{\xi',p}$ coincides with $W_{\xi,p}'$, let $q \in W_{\xi,p}' = \overline{W_{\xi,p}}' \subset E'_{\xi,p} = W_{\xi,p} \sqcup W_{\xi',p}$. Yet $q \in W_{\xi,p}$ is not possible since $q \in W_{\xi,p}'$ and $W_{\xi,p}$ is open. So $q \in W_{\xi',p}$, i.e., we have shown $W_{\xi,p}' \subset W_{\xi',p}$, and the claimed identity $W_{\xi',p} = W_{\xi,p}'$ follows.
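For orientation, the statement of Lemma 2.4 can be checked by hand in flat Minkowski space. The sketch below assumes Cartesian coordinates $(x_0, x_1, x_2, x_3)$ with metric signature $(+,-,-,-)$, the Killing pair $\xi = (\partial_{x_2}, \partial_{x_3})$ and the base point $p = 0$; the names $W_R$ and $W_L$ are introduced only for this illustration. Which of the two components is labelled $W_{\xi,p}$ and which $W_{\xi',p}$ depends on the chosen orientations and is not fixed here.

```latex
% Illustrative assumptions: Minkowski space, \xi = (\partial_{x_2},\partial_{x_3}), p = 0.
\begin{align*}
E_{\xi,0} &= \{x \in \mathbb{R}^4 : x_0 = x_1 = 0\}, \\
J^{\pm}(E_{\xi,0}) &= \{x \in \mathbb{R}^4 : \pm x_0 \geq |x_1|\}, \\
E_{\xi,0}' &= \{x \in \mathbb{R}^4 : |x_0| < |x_1|\} = W_R \sqcup W_L, \\
W_R &:= \{x : x_1 > |x_0|\}, \qquad W_L := \{x : x_1 < -|x_0|\}.
\end{align*}
% Causal complement of W_R:
\begin{align*}
J^{+}(W_R) &= \{x : x_0 + x_1 > 0\}, \qquad
J^{-}(W_R) = \{x : x_1 - x_0 > 0\}, \\
W_R' &= \mathbb{R}^4 \setminus \overline{\{x : x_1 > -|x_0|\}}
      = \{x : x_1 < -|x_0|\} = W_L .
\end{align*}
```

By the reflection $x_1 \mapsto -x_1$ one obtains $W_L' = W_R$ in the same way, so the two components are causal complements of each other, as the lemma asserts.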

Lemma 2.4 does not hold if the topological requirements on $M$ are dropped. As an example, consider a cylinder universe $\mathbb{R} \times S^1 \times \mathbb{R}^2$, the product of the Lorentz cylinder



$\mathbb{R} \times S^1$ [O’N83] and the Euclidean plane $\mathbb{R}^2$. The translations in the last factor $\mathbb{R}^2$ define spacelike, complete, commuting, linearly independent Killing fields $\xi$. Yet the causal complement of the edge $E_{\xi,p} = \{0\} \times \{1\} \times \mathbb{R}^2$ has only a single connected component, which has empty causal complement. In this situation, wedges lose many of the useful properties which we establish below for admissible spacetimes. In view of Lemma 2.4, wedges in $M$ can be defined as follows.

Definition 2.5 (Wedges). A wedge is a subset of $M$ which is a connected component of the causal complement of an edge in $M$. Given $\xi \in \Xi$, $p \in M$, we denote by $W_{\xi,p}$ the component of $E'_{\xi,p}$ which intersects the curves $\gamma(t) := \exp_p(t\, n_{\xi,p})$, $t > 0$, for any oriented normal $n_{\xi,p}$ of $E_{\xi,p}$. The family of all wedges is denoted

\[ \mathcal{W} := \{W_{\xi,p} : \xi \in \Xi,\ p \in M\}. \tag{2.10} \]

As explained in the proof of Lemma 2.4, the condition that the curve $t \mapsto \exp_p(t\, n_{\xi,p})$ intersects a connected component of $E'_{\xi,p}$ is independent of the chosen normal $n_{\xi,p}$, and each such curve intersects precisely one of the two components of $E'_{\xi,p}$. Some properties of wedges which immediately follow from the construction carried out in the proof of Lemma 2.4 are listed in the following proposition.

Proposition 2.6 (Properties of wedges). Let $W = W_{\xi,p}$ be a wedge. Then

a) $W$ is causally complete, i.e., $W'' = W$, and hence globally hyperbolic.

b) The causal complement of a wedge is given by inverting its Killing pair,

\[ (W_{\xi,p})' = W_{\xi',p}. \tag{2.11} \]

c) A wedge is invariant under the Killing flow generating its edge,

\[ \varphi_{\xi,s}(W_{\xi,p}) = W_{\xi,p}, \qquad s \in \mathbb{R}^2. \tag{2.12} \]

Proof. a) By Lemma 2.4, $W$ is the causal complement of another wedge $V$, and therefore causally complete: $W'' = V''' = V' = W$. Since $M$ is globally hyperbolic, this implies that $W$ is globally hyperbolic, too [Key96, Prop. 12.5]. b) This statement has already been checked in the proof of Lemma 2.4. c) By definition of the edge $E_{\xi,p}$ (2.8), we have $\varphi_{\xi,s}(E_{\xi,p}) = E_{\xi,p}$ for any $s \in \mathbb{R}^2$, and since the $\varphi_{\xi,s}$ are isometries, it follows that $\varphi_{\xi,s}(E'_{\xi,p}) = E'_{\xi,p}$. Continuity of the flow implies that also the two connected components of this set are invariant.

Corollary 2.7 (Properties of the family of wedges). The family $\mathcal{W}$ of wedges is invariant under the isometry group $\mathrm{Iso}$ and under taking causal complements. For $h \in \mathrm{Iso}$, it holds

\[ h(W_{\xi,p}) = W_{h_*\xi,h(p)}. \tag{2.13} \]

Proof. Since isometries preserve the causal structure of a spacetime, we only need to look at the action of isometries on edges. We find

\[ h E_{\xi,p} = \{h \circ \varphi_{\xi,s} \circ h^{-1}(h(p)) : s \in \mathbb{R}^2\} = \{\varphi_{h_*\xi,s}(h(p)) : s \in \mathbb{R}^2\} = E_{h_*\xi,h(p)} \tag{2.14} \]

by using the well-known fact that conjugation of flows by isometries amounts to the push-forward by the isometry of the associated vector field. Since $h_*\xi \in \Xi$ for any $\xi \in \Xi$, $h \in \mathrm{Iso}$ (Lemma 2.2), the family $\mathcal{W}$ is invariant under the action of the isometry group. Closedness of $\mathcal{W}$ under causal complementation is clear from Prop. 2.6 b).
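The conjugation identity used in (2.14) can be spelled out in a few lines. The sketch below treats one Killing field $\xi_i$ at a time; the auxiliary name $\psi_s$ and the notation $dh$ for the differential of $h$ are introduced here only for illustration.

```latex
% Assumed auxiliary notation: \psi_s := h \circ \varphi_{\xi_i,s} \circ h^{-1}, i = 1,2.
\begin{align*}
\psi_0 &= \mathrm{id}, \qquad
\psi_s \circ \psi_u
  = h \circ \varphi_{\xi_i,s} \circ \varphi_{\xi_i,u} \circ h^{-1}
  = \psi_{s+u}, \\
\frac{d}{ds}\Big|_{s=0} \psi_s(q)
  &= dh_{h^{-1}(q)}\big(\xi_i(h^{-1}(q))\big)
   = (h_*\xi_i)(q), \qquad q \in M .
\end{align*}
% A one-parameter group of diffeomorphisms is determined by its generator, hence
% \psi_s = \varphi_{h_*\xi_i,s}, and composing the cases i = 1,2 gives
% h \circ \varphi_{\xi,s} \circ h^{-1} = \varphi_{h_*\xi,s}.
```

Applying the last identity to the point $h(p)$ gives the middle equality in (2.14).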



In contrast to the situation in flat spacetime, the isometry group $\mathrm{Iso}$ does not act transitively on $\mathcal{W}(M, g)$ for generic admissible $M$, and there is no isometry mapping a given wedge onto its causal complement. This can be seen explicitly in the examples discussed in Sect. 2.2. To keep track of this structure of $\mathcal{W}(M, g)$, we decompose $\Xi(M, g)$ into orbits under the $\mathrm{Iso}$- and $\mathrm{GL}(2, \mathbb{R})$-actions.

Definition 2.8. Two Killing pairs $\xi, \tilde\xi \in \Xi$ are equivalent, written $\xi \sim \tilde\xi$, if there exist $h \in \mathrm{Iso}$ and $N \in \mathrm{GL}(2, \mathbb{R})$ such that $\tilde\xi = N h_*\xi$.

As $\xi \mapsto N\xi$ and $\xi \mapsto h_*\xi$ are commuting group actions, $\sim$ is an equivalence relation. According to Lemma 2.2 and Prop. 2.6 b), c), acting with $N \in \mathrm{GL}(2, \mathbb{R})$ on $\xi$ either leaves $W_{N\xi,p} = W_{\xi,p}$ invariant (if $\det N > 0$) or exchanges this wedge with its causal complement, $W_{N\xi,p} = W_{\xi,p}'$ (if $\det N < 0$). Therefore the “coherent”¹ subfamilies arising in the decomposition of the family of all wedges along the equivalence classes $[\xi] \in \Xi/\!\sim$,

\[ \mathcal{W} = \bigsqcup_{[\xi]} \mathcal{W}_{[\xi]}, \qquad \mathcal{W}_{[\xi]} := \{W_{\tilde\xi,p} : \tilde\xi \sim \xi,\ p \in M\}, \tag{2.15} \]

take the form

\[ \mathcal{W}_{[\xi]} = \{W_{h_*\xi,p},\ W_{h_*\xi,p}' : h \in \mathrm{Iso},\ p \in M\}. \tag{2.16} \]

In particular, each subfamily $\mathcal{W}_{[\xi]}$ is invariant under the action of the isometry group and causal complementation. In our later applications to quantum field theory, it will be important to have control over causal configurations $W_1 \subset W_2'$ and inclusions $W_1 \subset W_2$ of wedges $W_1, W_2 \in \mathcal{W}$. Since $\mathcal{W}$ is closed under taking causal complements, it is sufficient to consider inclusions. Note that the following proposition states in particular that inclusions can only occur between wedges from the same coherent subfamily $\mathcal{W}_{[\xi]}$.

Proposition 2.9 (Inclusions of wedges). Let $\xi, \tilde\xi \in \Xi$, $p, \tilde p \in M$. The wedges $W_{\xi,p}$ and $W_{\tilde\xi,\tilde p}$ form an inclusion, $W_{\xi,p} \subset W_{\tilde\xi,\tilde p}$, if and only if $p \in \overline{W_{\tilde\xi,\tilde p}}$ and there exists $N \in \mathrm{GL}(2, \mathbb{R})$ with $\det N > 0$, such that $\tilde\xi = N\xi$.

Proof. (⇐) Let us assume that $\tilde\xi = N\xi$ holds for some $N \in \mathrm{GL}(2, \mathbb{R})$ with $\det N > 0$, and $p \in \overline{W_{\tilde\xi,\tilde p}}$. In this case, the Killing fields in $\tilde\xi$ are linear combinations of those in $\xi$, and consequently, the edges $E_{\xi,p}$ and $E_{\tilde\xi,\tilde p}$ intersect if and only if they coincide, i.e., if $\tilde p \in E_{\xi,p}$. If the edges coincide, we clearly have $W_{\tilde\xi,\tilde p} = W_{\xi,p}$. If they do not coincide, it follows from $p \in \overline{W_{\tilde\xi,\tilde p}}$ that $E_{\xi,p}$ and $E_{\tilde\xi,\tilde p}$ are either spacelike separated or they can be connected by a null geodesic.

1 See [BS07] for a related notion on Minkowski space.



Consider now the case that $E_{\xi,p}$ and $E_{\tilde\xi,\tilde p}$ are spacelike separated, i.e., $p \in W_{\tilde\xi,\tilde p}$. Pick a point $q \in W_{\xi,p}$ and recall that $W_{\xi,p}$ can be characterized by equation (2.9). Since $p \in W_{\tilde\xi,\tilde p}$ and $q \in W_{\xi,p}$, there exist curves $\gamma_p$ and $\gamma_q$, which connect the pairs of points $(\tilde p, p)$ and $(p, q)$, respectively, and comply with the conditions in (2.9). By joining $\gamma_p$ and $\gamma_q$ we obtain a curve which connects $\tilde p$ and $q$. The tangent vectors $\dot\gamma_p(1)$ and $\dot\gamma_q(0)$ are oriented normals of $E_{\xi,p}$ and we choose $\gamma_p$ and $\gamma_q$ in such a way that these tangent vectors coincide. Due to the properties of $\gamma_p$ and $\gamma_q$, the joint curve also complies with the conditions in (2.9), from which we conclude $q \in W_{\tilde\xi,\tilde p}$, and thus $W_{\xi,p} \subset W_{\tilde\xi,\tilde p}$.

Consider now the case that $E_{\tilde\xi,\tilde p}$ and $E_{\xi,p}$ are connected by null geodesics, i.e., $p \in \partial W_{\tilde\xi,\tilde p}$. Let $r$ be the point in $E_{\xi,p}$ which can be connected by a null geodesic with $\tilde p$ and pick a point $q \in W_{\xi,p}$. The intersection $J^-(r) \cap \partial W_{\xi,p}$ yields another null curve, say $\mu$, and the intersection $\mu \cap J^-(q) =: p'$ is non-empty since $r$ and $q$ are spacelike separated and $q \in W_{\xi,p}$. The null curve $\mu$ is chosen future directed and parametrized in such a way that $\mu(0) = p'$ and $\mu(1) = r$. By taking $\varepsilon \in (0, 1)$ we find $q \in W_{\xi,\mu(\varepsilon)}$ and $\mu(\varepsilon) \in W_{\tilde\xi,\tilde p}$, which entails $q \in W_{\tilde\xi,\tilde p}$.

(⇒) Let us assume that we have an inclusion of wedges $W_{\xi,p} \subset W_{\tilde\xi,\tilde p}$. Then clearly $p \in \overline{W_{\tilde\xi,\tilde p}}$. Since $M$ is four-dimensional and $\xi_1, \xi_2, \tilde\xi_1, \tilde\xi_2$ are all spacelike, they cannot be linearly independent. Let us first assume that three of them are linearly independent, and without loss of generality, let $\xi = (\xi_1, \xi_2)$ and $\tilde\xi = (\xi_2, \xi_3)$ with three linearly independent spacelike Killing fields $\xi_1, \xi_2, \xi_3$. Picking points $q \in E_{\xi,p}$, $\tilde q \in E_{\tilde\xi,\tilde p}$, these can be written as $q = (t, x_1, x_2, x_3)$ and $\tilde q = (\tilde t, \tilde x_1, \tilde x_2, \tilde x_3)$ in the global coordinate system of flow parameters constructed from $\xi_1, \xi_2, \xi_3$ and the gradient of the temporal function.
For suitable flow parameters $s_1, s_2, s_3$, we have $\varphi_{\xi_1,s_1}(q) = (t, \tilde x_1, x_2, x_3) =: q' \in E_{\xi,p}$ and $\varphi_{(\xi_2,\xi_3),(s_2,s_3)}(\tilde q) = (\tilde t, \tilde x_1, x_2, x_3) =: \tilde q' \in E_{\tilde\xi,\tilde p}$. Clearly, the points $q'$ and $\tilde q'$ are connected by a timelike curve, e.g. the curve whose tangent vector field is given by the gradient of the temporal function. But a timelike curve connecting the edges of $W_{\xi,p}, W_{\tilde\xi,\tilde p}$ is a contradiction to these wedges forming an inclusion. So no three of the vector fields $\xi_1, \xi_2, \tilde\xi_1, \tilde\xi_2$ can be linearly independent. Hence $\tilde\xi = N\xi$ with an invertible matrix $N$. It remains to establish the correct sign of $\det N$, and to this end, we assume $\det N < 0$. Then we have $(W_{\xi,p})' = W_{\xi',p} \subset W_{\tilde\xi,\tilde p}$, by (Prop. 2.6 b)) and the (⇐) statement in this proof, since $\tilde\xi$ and $\xi'$ are related by a positive determinant transformation and $p \in \overline{W_{\tilde\xi,\tilde p}}$. This yields that both $W_{\xi,p}$ and its causal complement must be contained in $W_{\tilde\xi,\tilde p}$, a contradiction. Hence $\det N > 0$, and the proof is finished.

Having derived the structural properties of the set $\mathcal{W}$ of wedges needed later, we now compare our wedge regions to the Minkowski wedges and to other definitions proposed in the literature.

The flat Minkowski spacetime $(\mathbb{R}^4, \eta)$ clearly belongs to the class of admissible spacetimes, with translations along spacelike directions and rotations in the standard time zero Cauchy surface as its complete spacelike Killing fields. However, as Killing pairs consist of non-vanishing vector fields, and each rotation leaves its rotation axis invariant, the set $\Xi(\mathbb{R}^4, \eta)$ consists precisely of all pairs $(\xi_1, \xi_2)$ such that the flows $\varphi_{\xi_1}, \varphi_{\xi_2}$ are translations along two linearly independent spacelike directions. Hence the set of all edges in Minkowski space coincides with the set of all two-dimensional spacelike planes. Consequently, each wedge $W \in \mathcal{W}(\mathbb{R}^4, \eta)$ is bounded by two non-parallel



characteristic three-dimensional planes. This is precisely the family of wedges usually considered in Minkowski space² (see, for example, [TW97]). Besides the features we established above in the general admissible setting, the family of Minkowski wedges has the following well-known properties:

a) Each wedge $W \in \mathcal{W}(\mathbb{R}^4, \eta)$ is the causal completion of the world line of a uniformly accelerated observer.

b) Each wedge $W \in \mathcal{W}(\mathbb{R}^4, \eta)$ is the union of a family of double cones whose tips lie on two fixed lightrays.

c) The isometry group (the Poincaré group) acts transitively on $\mathcal{W}(\mathbb{R}^4, \eta)$.

d) $\mathcal{W}(\mathbb{R}^4, \eta)$ is causally separating in the sense that given any two spacelike separated double cones $O_1, O_2 \subset \mathbb{R}^4$, there exists a wedge $W$ such that $O_1 \subset W \subset O_2'$ [TW97]. $\mathcal{W}(\mathbb{R}^4, \eta)$ is a subbase for the topology of $\mathbb{R}^4$.

These properties a)–d) do not hold in general for the class $\mathcal{W}(M, g)$ of wedges on an admissible spacetime, but some hold for certain subclasses, as can be seen from the explicit examples in the subsequent section. There exist a number of different constructions for wedges in curved spacetimes in the literature, mostly for special manifolds. On de Sitter respectively anti de Sitter space, Borchers and Buchholz [BB99] respectively Buchholz and Summers [BS04] construct wedges by taking property a) as their defining feature, see also the generalization by Strich [Str08]. In the de Sitter case, this definition is equivalent to our definition of a wedge as a connected component of the causal complement of an edge [BMS01]. But as two-dimensional spheres, the de Sitter edges do not admit two linearly independent commuting Killing fields. Apart from this difference due to our restriction to commuting, linearly independent Killing fields, the de Sitter wedges can be constructed in the same way as presented here. Thanks to the maximal symmetry of the de Sitter and anti de Sitter spacetimes, the respective isometry groups act transitively on the corresponding wedge families (c), and these families are causally separating in the sense of d). A definition related to the previous examples has been given by Lauridsen-Ribeiro for wedges in asymptotically anti de Sitter spacetimes (see Def. 1.5 in [LR07]). Note that these spacetimes are not admissible in our sense since anti de Sitter space is not globally hyperbolic. Property b) has recently been taken by Borchers [Bor09] as a definition of wedges in a quite general class of curved spacetimes which is closely related to the structure of double cones.
In that setting, wedges do not exhibit in general all of the features we derived in our framework, and can for example have compact closure. Wedges in a class of Friedmann-Robertson-Walker spacetimes with spherical spatial sections have been constructed with the help of conformal embeddings into de Sitter space [BMS01]. This construction also yields wedges defined as connected components of causal complements of edges. Here a) does not, but c) and d) do hold, see also our discussion of Friedmann-Robertson-Walker spacetimes with flat spatial sections in the next section. The idea of constructing wedges as connected components of causal complements of specific two-dimensional submanifolds has also been used in the context of globally

2 Note that we would get a “too large” family of wedges in Minkowski space if we dropped the requirement that the vector fields generating edges are Killing. However, the assumption that edges are generated by commuting Killing fields is motivated by the application to deformations of quantum field theories, and one could generalize our framework to spacetimes with edges generated by complete, linearly independent smooth Killing fields.



hyperbolic spacetimes with a bifurcate Killing horizon [GLRV01], building on earlier work in [Kay85]. Here the edge is given as the fixed point manifold of the Killing flow associated with the horizon.
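As an illustration of the inclusion criterion of Proposition 2.9 in the Minkowski case, the following sketch takes $\tilde\xi = \xi = (\partial_{x_2}, \partial_{x_3})$, so that $N$ is the identity and $\det N = 1 > 0$. It assumes Cartesian coordinates $(x_0, x_1, x_2, x_3)$ and orientations chosen such that $W_{\xi,0} = W_R := \{x : x_1 > |x_0|\}$; the name $W_R$ and the vector $a$ are introduced only for this example.

```latex
% Illustrative assumptions: W_{\xi,p} = W_R + p, W_{\xi,\tilde p} = W_R + \tilde p,
% a := p - \tilde p.
\begin{align*}
p \in \overline{W_{\xi,\tilde p}}
  \;&\Longleftrightarrow\; a \in \overline{W_R}
  \;\Longleftrightarrow\; a_1 \geq |a_0| . \\
\intertext{For $x = p + y$ with $y \in W_R$, i.e.\ $y_1 > |y_0|$, this gives}
(x - \tilde p)_1 = a_1 + y_1
  \;&>\; |a_0| + |y_0|
  \;\geq\; |a_0 + y_0| = |(x - \tilde p)_0| ,
\end{align*}
% hence x \in W_R + \tilde p, i.e. W_{\xi,p} \subset W_{\xi,\tilde p}.
```

The case $a_1 = |a_0|$ with $(a_2, a_3) = 0$ corresponds to edges connected by a null geodesic, the case $a_1 > |a_0|$ to spacelike separated edges.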

2.2. Concrete examples. In the previous section we provided a complete but abstract characterization of the geometric structures of the class of spacetimes we are interested in. This analysis is now complemented by presenting a number of explicit examples of admissible spacetimes. The easiest way to construct an admissible spacetime is to take the warped product [O’N83, Chap. 7] of an edge with another manifold. Let $(E, g_E)$ be a two-dimensional Riemannian manifold endowed with two complete, commuting, linearly independent, smooth Killing fields, and let $(X, g_X)$ be a two-dimensional, globally hyperbolic spacetime diffeomorphic to $\mathbb{R} \times I$, with $I$ an open interval or the full real line. Then, given a positive smooth function $f$ on $X$, consider the warped product $M := X \times_f E$, i.e., the product manifold $X \times E$ endowed with the metric tensor field $g := \pi_X^*(g_X) + (f \circ \pi_X) \cdot \pi_E^*(g_E)$, where $\pi_X : M \to X$ and $\pi_E : M \to E$ are the projections on $X$ and $E$. It readily follows that $(M, g)$ is admissible in the sense of Definition 2.1. The following proposition describes an explicit class of admissible spacetimes in terms of their metrics.

Proposition 2.10. Let $(M, g)$ be a spacetime diffeomorphic to $\mathbb{R} \times I \times \mathbb{R}^2$, where $I \subseteq \mathbb{R}$ is open and simply connected, endowed with a global coordinate system $(t, x, y, z)$ according to which the metric reads

\[ ds^2 = e^{2f_0}\,dt^2 - e^{2f_1}\,dx^2 - e^{2f_2}\,dy^2 - e^{2f_3}\,(dz - q\,dy)^2. \tag{2.17} \]

Here $t$ runs over the whole $\mathbb{R}$, $f_i, q \in C^\infty(M)$ for $i = 0, \ldots, 3$ and $f_i, q$ do not depend on $y$ and $z$. Then $(M, g)$ is an admissible spacetime in the sense of Definition 2.1.

Proof. Per direct inspection of (2.17), $M$ is isometric to $\mathbb{R} \times \Sigma$ with $\Sigma \cong I \times \mathbb{R}^2$ with $ds^2 = \beta\,dt^2 - h_{ij}\,dx^i dx^j$, where $\beta$ is smooth and positive, and $h$ is a metric which depends smoothly on $t$. Furthermore, on the hypersurfaces at constant $t$, $\det h = e^{2(f_1 + f_2 + f_3)} > 0$ and $h$ is block-diagonal. If we consider the sub-matrix with $i, j = y, z$, this has a positive determinant and a positive trace. Hence we can conclude that the induced metric on $\Sigma$ is Riemannian, or, in other words, $\Sigma$ is a spacelike, smooth, three-dimensional Riemannian hypersurface. Therefore we can apply Theorem 1.1 in [BS05] to conclude that $M$ is globally hyperbolic. Since the metric coefficients are independent of $y$ and $z$, the vector fields $\xi_1 = \partial_y$ and $\xi_2 = \partial_z$ are smooth Killing fields which commute and, as they lie tangent to the Riemannian hypersurfaces at constant time, they are also spacelike. Furthermore, since per definition of spacetime, $M$ and thus also $\Sigma$ is connected, we can invoke the Hopf-Rinow-Theorem [O’N83, Sect. 5, Thm. 21] to conclude that $\Sigma$ is complete and, thus, all its Killing fields are complete. As $I$ is simply connected by assumption, it follows that $(M, g)$ is admissible.

Under an additional assumption, also a partial converse of Proposition 2.10 is true. Namely, let $(M, g)$ be a globally hyperbolic spacetime with two complete, spacelike,



commuting, smooth Killing fields, and pick a local coordinate system $(t, x, y, z)$, where $y$ and $z$ are the flow parameters of the Killing fields. Then, if the reflection map $r : M \to M$, $r(t, x, y, z) = (t, x, -y, -z)$, is an isometry, the metric is locally of the form (2.17), as was proven in [Cha83,CF84]. The reflection $r$ is used to guarantee the vanishing of the unwanted off-diagonal metric coefficients, namely those associated to “$dx\,dy$” and “$dx\,dz$”. Notice that the cited papers allow only to establish a result on the local structure of $M$ and no a priori condition is imposed on the topology of $I$, in distinction to Proposition 2.10.

Some of the metrics (2.17) are used in cosmology. For the description of a spatially homogeneous but in general anisotropic universe $M \cong J \times \mathbb{R}^3$, where $J \subseteq \mathbb{R}$ (see §5 in [Wal84] and [FPH74]), one puts $f_0 = q = 0$ in (2.17) and takes $f_1, f_2, f_3$ to depend only on $t$. This yields the metric of Kasner spacetimes respectively Bianchi I models³

\[ ds^2 = dt^2 - e^{2f_1}\,dx^2 - e^{2f_2}\,dy^2 - e^{2f_3}\,dz^2. \tag{2.18} \]

Clearly here the isometry group contains three smooth Killing fields, locally given by $\partial_x, \partial_y, \partial_z$, which are everywhere linearly independent, complete and commuting. In particular, $(\partial_x, \partial_y)$, $(\partial_x, \partial_z)$ and $(\partial_y, \partial_z)$ are Killing pairs. A case of great physical relevance arises when specializing the metric further by taking all the functions $f_i$ in (2.18) to coincide. In this case, the metric assumes the so-called Friedmann-Robertson-Walker form

\[ ds^2 = dt^2 - a(t)^2 \left[ dx^2 + dy^2 + dz^2 \right] = a(t(\tau))^2 \left[ d\tau^2 - dx^2 - dy^2 - dz^2 \right]. \tag{2.19} \]

Here the scale factor $a(t) := e^{f_1(t)}$ is defined on some interval $J \subseteq \mathbb{R}$, and in the second equality, we have introduced the conformal time $\tau$, which is implicitly defined by $d\tau = a^{-1}(t)\,dt$. Notice that, as in the Bianchi I model, the manifold is $M \cong J \times \mathbb{R}^3$, i.e., the variable $t$ does not need to range over the whole real axis. (This does not affect the property of global hyperbolicity.) By inspection of (2.19), it is clear that the isometry group of this spacetime contains the three-dimensional Euclidean group $E(3) = O(3) \ltimes \mathbb{R}^3$. Disregarding the Minkowski case, where $J = \mathbb{R}$ and $a$ is constant, the isometry group in fact coincides with $E(3)$. Edges in such a Friedmann-Robertson-Walker universe are of the form $\{\tau\} \times S$, where $S$ is a two-dimensional plane in $\mathbb{R}^3$ and $t(\tau) \in J$. Here $\mathcal{W}$ consists of a single coherent family, and the $\mathrm{Iso}$-orbits in $\mathcal{W}$ are labelled by the time parameter $\tau$ for the corresponding edges. Also note that the family of Friedmann-Robertson-Walker wedges is causally separating in the sense discussed near the end of Sect. 2.1.
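The second equality in (2.19) follows directly from the definition of the conformal time. The short computation below spells this out and then evaluates $\tau$ for one sample scale factor; the exponential choice $a(t) = e^{Ht}$ with a constant $H > 0$ and $J = \mathbb{R}$ is an assumption made here purely for illustration and is not singled out in the text.

```latex
% General step: d\tau = a(t)^{-1} dt implies dt^2 = a(t(\tau))^2 d\tau^2, hence
\begin{align*}
ds^2 &= dt^2 - a(t)^2\,\big[dx^2 + dy^2 + dz^2\big]
      = a(t(\tau))^2\,\big[d\tau^2 - dx^2 - dy^2 - dz^2\big].
\end{align*}
% Illustrative assumption: a(t) = e^{Ht}, H > 0 constant, t \in \mathbb{R}.
\begin{align*}
\tau(t) &= \int e^{-Ht}\,dt = -\frac{1}{H}\,e^{-Ht} \in (-\infty, 0),
\qquad
a(t(\tau)) = -\frac{1}{H\tau}, \\
ds^2 &= \frac{1}{H^2\tau^2}\,\big[d\tau^2 - dx^2 - dy^2 - dz^2\big].
\end{align*}
```

In this sample case the conformal time ranges only over the negative half-line, even though $t$ runs over all of $\mathbb{R}$.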

3 The Bianchi models I–IX [Ell06] arise from the classification of three-dimensional real Lie algebras, thought of as Lie subalgebras of the Lie algebra of Killing fields. Only the cases Bianchi I–VII, in which the three-dimensional Lie algebra contains $\mathbb{R}^2$ as a subalgebra, are of direct interest here, since only in these cases Killing pairs exist.



The second form of the metric in (2.19) is manifestly a conformal rescaling of the flat Minkowski metric. Interpreting the coordinates $(\tau, x, y, z)$ as coordinates of a point in $\mathbb{R}^4$ therefore gives rise to a conformal embedding $\iota : M \to \mathbb{R}^4$.

In this situation, it is interesting to note that the set of all images $\iota(E)$ of edges $E$ in the Friedmann-Robertson-Walker spacetime coincides with the set of all Minkowski space edges which lie completely in $\iota(M) = J \times \mathbb{R}^3$, provided that $J$ does not coincide with $\mathbb{R}$. These are just the edges parallel to the standard Cauchy surfaces of constant $\tau$ in $\mathbb{R}^4$. So Friedmann-Robertson-Walker edges can also be characterized in terms of Minkowski space edges and the conformal embedding $\iota$, analogous to the construction of wedges in Friedmann-Robertson-Walker spacetimes with spherical spatial sections in [BMS01].

3. Quantum Field Theories on Admissible Spacetimes

Having discussed the relevant geometric structures, we now fix an admissible spacetime $(M, g)$ and discuss warped convolution deformations of quantum field theories on it. For models on flat Minkowski space, it is known that this deformation procedure weakens point-like localization to localization in wedges [BLS10], and we will show here that the same holds true for admissible curved spacetimes. For a convenient description of this weakened form of localization, and for a straightforward application of the warped convolution technique, we will work in the framework of local quantum physics [Haa96]. In this setting, a model theory is defined by a net of field algebras, and here we consider algebras $\mathfrak{F}(W)$ of quantum fields supported in wedges $W \in \mathcal{W}(M, g)$. The main idea underlying the deformation is to apply the formalism developed in [BS08,BLS10], but with the global translation symmetries of Minkowski space replaced by the Killing flow $\varphi_\xi$ corresponding to the wedge $W = W_{\xi,p}$ under consideration. In the case of Minkowski spacetime, these deformations reduce to the familiar structure of a noncommutative Minkowski space with commuting time. The details of the model under consideration will not be important in Sect. 3.1, since our construction relies only on a few structural properties satisfied in any well-behaved quantum field theory. In Sect. 3.2, the deformed Dirac quantum field is presented as a particular example.

3.1. Deformations of nets with Killing symmetries.

Proceeding to the standard mathematical formalism [Haa96,Ara99], we consider a $C^*$-algebra $\mathfrak{F}$, whose elements are interpreted as (bounded functions of) quantum fields on the spacetime $M$. The field algebra $\mathfrak{F}$ has a local structure, and in the present context, we focus on localization in wedges $W \in \mathcal{W}$, since this form of localization turns out to be stable under the deformation. Therefore, corresponding to each wedge $W \in \mathcal{W}$, we consider the $C^*$-subalgebra $\mathfrak{F}(W) \subset \mathfrak{F}$ of fields supported in $W$. Furthermore, we assume a strongly continuous action $\alpha$ of the isometry group Iso of $(M, g)$ on $\mathfrak{F}$, and a Bose/Fermi automorphism $\gamma$ whose square is the identity automorphism, and which commutes with $\alpha$. This automorphism will be used to separate the Bose/Fermi parts of fields $F \in \mathfrak{F}$; in the model theory of a free Dirac field discussed later, it can be chosen as a rotation by $2\pi$ in the Dirac bundle.

To allow for a straightforward application of the results of [BLS10], we will also assume in the following that the field algebra is concretely realized on a separable Hilbert space $\mathcal{H}$, which carries a unitary representation $U$ of Iso implementing the action $\alpha$, i.e.,

$$U(h) F U(h)^{-1} = \alpha_h(F), \qquad h \in \mathrm{Iso},\ F \in \mathfrak{F}.$$

We emphasize that despite working on a Hilbert space, we do not select a state, since we do not make any assumptions regarding $U$-invariant vectors in $\mathcal{H}$ or the spectrum of subgroups of the representation $U$.$^4$ The subsequent analysis will be carried out in a $C^*$-setting, without using the weak closures of the field algebras $\mathfrak{F}(W)$ in $\mathcal{B}(\mathcal{H})$. For convenience, we also require the Bose/Fermi automorphism $\gamma$ to be unitarily implemented on $\mathcal{H}$, i.e., there exists a unitary $V = V^* = V^{-1} \in \mathcal{B}(\mathcal{H})$ such that $\gamma(F) = V F V$. We will also use the associated unitary twist operator

$$Z := \frac{1}{\sqrt{2}}\,(1 - iV). \tag{3.1}$$
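As a small illustration of how the twist (3.1) acts, one can check on a toy example that $Z$ is unitary, that it leaves even (Bose) operators unchanged, and that it maps an odd (Fermi) operator $F_-$ to $iF_-V$. The sketch below assumes a two-dimensional toy space with $V = \mathrm{diag}(1, -1)$; the matrices `even` and `odd` and all helper names are hypothetical choices for illustration only.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two square matrices up to a tolerance."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


IDENTITY = [[1, 0], [0, 1]]
V = [[1, 0], [0, -1]]  # toy grading operator with V = V* = V^{-1}
Z = [[(IDENTITY[i][j] - 1j * V[i][j]) / math.sqrt(2) for j in range(2)] for i in range(2)]


def twist(f):
    """Return Z f Z* for a 2x2 matrix f."""
    return matmul(matmul(Z, f), dagger(Z))


even = [[2, 0], [0, 3]]        # commutes with V (toy Bose part)
odd = [[0, 1 + 2j], [5, 0]]    # anticommutes with V (toy Fermi part)
i_odd_v = [[1j * x for x in row] for row in matmul(odd, V)]
```

In this toy setting, `twist(even)` reproduces `even` and `twist(odd)` reproduces `i_odd_v`, which is the mechanism by which a commutator with a twisted odd operator turns into an anticommutator.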

Clearly, the unitarily implemented $\alpha$ and $\gamma$ can be continued to all of $\mathcal{B}(\mathcal{H})$. By a slight abuse of notation, these extensions will be denoted by the same symbols. In terms of the data $\{\mathfrak{F}(W)\}_{W \in \mathcal{W}}, \alpha, \gamma$, the structural properties of a quantum field theory on $M$ can be summarized as follows [Haa96,Ara99]:

a) Isotony: $\mathfrak{F}(W) \subset \mathfrak{F}(\tilde{W})$ whenever $W \subset \tilde{W}$.

b) Covariance under Iso:

$$\alpha_h(\mathfrak{F}(W)) = \mathfrak{F}(hW), \qquad h \in \mathrm{Iso},\ W \in \mathcal{W}. \tag{3.2}$$

c) Twisted Locality: With the unitary $Z$ (3.1), there holds

$$[Z F Z^*, G] = 0 \quad \text{for } F \in \mathfrak{F}(W),\ G \in \mathfrak{F}(W'),\ W \in \mathcal{W}. \tag{3.3}$$

The twisted locality condition (3.3) is equivalent to normal commutation relations between the Bose/Fermi parts $F_\pm := \frac{1}{2}(F \pm \gamma(F))$ of fields in spacelike separated wedges, $[F_+, G_\pm] = [F_\pm, G_+] = \{F_-, G_-\} = 0$ for $F \in \mathfrak{F}(W)$, $G \in \mathfrak{F}(W')$ [DHR69]. The covariance requirement (3.2) entails that for any Killing pair $\xi \in \Xi$, the algebra $\mathfrak{F}$ carries a corresponding $\mathbb{R}^2$-action $\tau_\xi$, defined by

$$\tau_{\xi,s} := \alpha_{\varphi_{\xi,s}} = \mathrm{ad}\, U_\xi(s), \qquad s \in \mathbb{R}^2,$$

where $U_\xi(s)$ is shorthand for $U(\varphi_{\xi,s})$. Since a wedge of the form $W_{\xi,p}$ with some $p \in M$ is invariant under the flows $\varphi_{N\xi,s}$ for any $N \in GL(2, \mathbb{R})$ (see Prop. 2.6 c) and Lemma 2.2), we have in view of isotony

$$\tau_{N\xi,s}(\mathfrak{F}(W_{\xi,p})) = \mathfrak{F}(W_{\xi,p}), \qquad N \in GL(2, \mathbb{R}),\ s \in \mathbb{R}^2.$$

4 Note that every $C^*$-dynamical system $(\mathcal{A}, G, \alpha)$, where $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ is a concrete $C^*$-algebra on a separable Hilbert space $\mathcal{H}$ and $\alpha : G \to \mathrm{Aut}(\mathcal{A})$ is a strongly continuous representation of the locally compact group $G$, has a covariant representation [Ped79, Prop. 7.4.7, Lemma 7.4.9], built out of the left-regular representation on the Hilbert space $L^2(G) \otimes \mathcal{H}$.

In this setting, all structural elements necessary for the application of warped convolution deformations [BLS10] are present, and we will use this technique to define a deformed net $W \longmapsto \mathfrak{F}(W)_\lambda$ of $C^*$-algebras on $M$, depending on a deformation parameter $\lambda \in \mathbb{R}$. For $\lambda = 0$, we will recover the original theory, $\mathfrak{F}(W)_0 = \mathfrak{F}(W)$, and for each $\lambda \in \mathbb{R}$, the three basic properties a)–c) listed above will remain valid. To achieve this, the elements of $\mathfrak{F}(W)$ will be deformed with the help of the Killing flow leaving $W$ invariant.

We begin by recalling some definitions and results from [BLS10], adapted to the situation at hand. Similar to the Weyl product appearing in the quantization of classical systems, the warped convolution deformation is defined in terms of oscillatory integrals of $\mathfrak{F}$-valued functions, and we have to introduce the appropriate smooth elements first. The action $\alpha$ is a strongly continuous representation of the Lie group Iso, which acts automorphically and thus isometrically on the $C^*$-algebra $\mathfrak{F}$. In view of these properties, the smooth elements $\mathfrak{F}^\infty := \{F \in \mathfrak{F} : \mathrm{Iso} \ni h \mapsto \alpha_h(F) \text{ is } \|\cdot\|_{\mathfrak{F}}\text{-smooth}\}$ form a norm-dense $*$-subalgebra $\mathfrak{F}^\infty \subset \mathfrak{F}$ (see, for example, [Tay86]). However, the subalgebras $\mathfrak{F}(W_{\xi,p}) \subset \mathfrak{F}$ are in general only invariant under the $\mathbb{R}^2$-action $\tau_\xi$, and we therefore also introduce a weakened form of smoothness. An operator $F \in \mathfrak{F}$ will be called $\xi$-smooth if

$$\mathbb{R}^2 \ni s \mapsto \tau_{\xi,s}(F) \in \mathfrak{F} \tag{3.4}$$

is smooth in the norm topology of $\mathfrak{F}$. On the Hilbert space level, we have a dense domain $\mathcal{H}^\infty := \{\Psi \in \mathcal{H} : \mathrm{Iso} \ni h \mapsto U(h)\Psi \text{ is } \|\cdot\|_{\mathcal{H}}\text{-smooth}\}$ of smooth vectors in $\mathcal{H}$.

As further ingredients for the definition of the oscillatory integrals, we pick a smooth, compactly supported “cutoff” function $\chi \in C_0^\infty(\mathbb{R}^2 \times \mathbb{R}^2)$ with $\chi(0, 0) = 1$, and the standard antisymmetric $(2 \times 2)$-matrix

$$Q := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{3.5}$$

With these data, we associate to a $\xi$-smooth $F \in \mathfrak{F}$ the deformed operator (warped convolution) [BLS10],

$$F_{\xi,\lambda} := \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, U_\xi(\lambda Q s)\, F\, U_\xi(s' - \lambda Q s), \tag{3.6}$$

where $\lambda$ is a real parameter, and $ss'$ denotes the standard Euclidean inner product of $s, s' \in \mathbb{R}^2$. The above limit exists in the strong operator topology of $\mathcal{B}(\mathcal{H})$ on the dense subspace $\mathcal{H}^\infty$, and is independent of the chosen cutoff function $\chi$ within the specified class. The thus (densely) defined operator $F_{\xi,\lambda}$ can be shown to extend to a bounded $\xi$-smooth operator on all of $\mathcal{H}$, which we denote by the same symbol [BLS10]. As can be seen from the above formula, setting $\lambda = 0$ yields the undeformed operator $F_{\xi,0} = F$, for any $\xi \in \Xi$.

The deformation $F \to F_{\xi,\lambda}$ is closely related to Rieffel’s deformation of $C^*$-algebras [Rie92], where one introduces the deformed product

$$F \times_{\xi,\lambda} G := \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, \tau_{\xi,\lambda Q s}(F)\, \tau_{\xi,s'}(G). \tag{3.7}$$



This limit exists in the norm topology of $\mathfrak{F}$ for any $\xi$-smooth $F, G \in \mathfrak{F}$, and $F \times_{\xi,\lambda} G$ is $\xi$-smooth as well.

As is well known, this procedure applies in particular to the deformation of classical theories in terms of star products. As field algebra, one would then take a suitable commutative $*$-algebra of functions on $M$, endowed with the usual pointwise operations. The isometry group acts on this algebra automorphically by pullback, and in particular, the flow $\varphi_\xi$ of any Killing pair $\xi \in \Xi$ induces automorphisms. The Rieffel product therefore defines a star product on the subalgebra of smooth elements $f, g$ for this action,

$$(f \star_{\xi,\lambda} g)(p) = \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int d^2 s\, d^2 s'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, f(\varphi_{\xi,\lambda Q s}(p))\, g(\varphi_{\xi,s'}(p)). \tag{3.8}$$

The function algebra endowed with this star product can be interpreted as a noncommutative version of the manifold $M$, similar to the flat case [GGBI+04]. Note that since we are using a two-dimensional spacelike flow on a four-dimensional spacetime, the deformation corresponds to a noncommutative Minkowski space with “commuting time” in the flat case.

The properties of the deformation map $F \to F_{\xi,\lambda}$ which will be relevant here are the following:

Lemma 3.1 [BLS10]. Let $\xi \in \Xi$, $\lambda \in \mathbb{R}$, and consider $\xi$-smooth operators $F, G \in \mathfrak{F}$. Then

a) $(F_{\xi,\lambda})^* = (F^*)_{\xi,\lambda}$.

b) $F_{\xi,\lambda} G_{\xi,\lambda} = (F \times_{\xi,\lambda} G)_{\xi,\lambda}$.

c) If$^5$ $[\tau_{\xi,s}(F), G] = 0$ for all $s \in \mathbb{R}^2$, then $[F_{\xi,\lambda}, G_{\xi,-\lambda}] = 0$.

d) If a unitary $Y \in \mathcal{B}(\mathcal{H})$ commutes with $U_\xi(s)$, $s \in \mathbb{R}^2$, then $Y F_{\xi,\lambda} Y^{-1} = (Y F Y^{-1})_{\xi,\lambda}$, and $Y F_{\xi,\lambda} Y^{-1}$ is $\xi$-smooth.

Since we are dealing here with a field algebra obeying twisted locality, we also point out that statement c) of the above lemma carries over to the twisted local case.

Lemma 3.2. Let $\xi \in \Xi$ and $F, G \in \mathfrak{F}$ be $\xi$-smooth such that $[Z \tau_{\xi,s}(F) Z^*, G] = 0$ for all $s \in \mathbb{R}^2$. Then

$$[Z F_{\xi,\lambda} Z^*, G_{\xi,-\lambda}] = 0. \tag{3.9}$$

Proof. The Bose/Fermi operator $V$ commutes with the representation of the isometry group, and thus the same holds true for the twist $Z$ (3.1). So in view of Lemma 3.1 d), the assumption implies that $Z F Z^*$ is $\xi$-smooth, and $[\tau_{\xi,s}(Z F Z^*), G] = 0$ for all $s \in \mathbb{R}^2$. In view of Lemma 3.1 c), we thus have $[(Z F Z^*)_{\xi,\lambda}, G_{\xi,-\lambda}] = 0$. But as $Z$ and $U_\xi(s)$ commute, $(Z F Z^*)_{\xi,\lambda} = Z F_{\xi,\lambda} Z^*$, and the claim follows.

The results summarized in Lemma 3.1 and Lemma 3.2 will be essential for establishing the isotony and twisted locality properties of the deformed quantum field theory. To also control the covariance properties relating different Killing pairs, we need an additional lemma, closely related to [BLS10, Prop. 2.9].

5 In [BS08,BLS10], this statement is shown to hold under the weaker assumption that the commutator $[\tau_{\xi,s}(F), G]$ vanishes only for all $s \in S + S$, where $S$ is the joint spectrum of the generators of the $\mathbb{R}^2$ representation $U_\xi$ implementing $\tau_\xi$. But since usually $S = \mathbb{R}^2$ in the present setting, we refer here only to the weaker statement, where $S + S \subset \mathbb{R}^2$ has been replaced by $\mathbb{R}^2$.



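Lemma 3.3. Let $\xi \in \Xi$, $\lambda \in \mathbb{R}$, and $F \in \mathfrak{F}$ be $\xi$-smooth.

a) Let $h \in \mathrm{Iso}$. Then $\alpha_h(F)$ is $h_*\xi$-smooth, and

$$\alpha_h(F_{\xi,\lambda}) = \alpha_h(F)_{h_*\xi,\lambda}. \tag{3.10}$$

b) For $N \in GL(2, \mathbb{R})$,

$$F_{N\xi,\lambda} = F_{\xi,\det N \cdot \lambda}. \tag{3.11}$$

In particular,

$$F_{\xi',\lambda} = F_{\xi,-\lambda}. \tag{3.12}$$

Proof. a) The flow of $\xi$ transforms under $h$ according to $h \varphi_{\xi,s} = \varphi_{h_*\xi,s} h$, so that $\alpha_h(\tau_{\xi,s}(F)) = \tau_{h_*\xi,s}(\alpha_h(F))$. Since $F$ is $\xi$-smooth, and $\alpha_h$ is isometric, the smoothness of $s \mapsto \tau_{h_*\xi,s}(\alpha_h(F))$ follows. Using the strong convergence of the oscillatory integrals (3.6), we compute on a smooth vector $\Psi \in \mathcal{H}^\infty$,

$$\begin{aligned}
\alpha_h(F_{\xi,\lambda})\Psi &= \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, U(h \varphi_{\xi,\lambda Q s} h^{-1})\, \alpha_h(F)\, U(h \varphi_{\xi,s' - \lambda Q s} h^{-1})\Psi \\
&= \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, U(\varphi_{h_*\xi,\lambda Q s})\, \alpha_h(F)\, U(\varphi_{h_*\xi,s' - \lambda Q s})\Psi \\
&= \alpha_h(F)_{h_*\xi,\lambda}\Psi,
\end{aligned}$$

which entails (3.10) since $\mathcal{H}^\infty \subset \mathcal{H}$ is dense.

b) In view of the transformation law $\varphi_{N\xi,s} = \varphi_{\xi,N^T s}$ (2.6), we get, $\Psi \in \mathcal{H}^\infty$,

$$\begin{aligned}
F_{N\xi,\lambda}\Psi &= \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, U(\varphi_{N\xi,\lambda Q s})\, F\, U(\varphi_{N\xi,s' - \lambda Q s})\Psi \\
&= \frac{1}{4\pi^2 |\det N|} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-i(N^{-1}s, s')}\, \chi\big(\varepsilon s, \varepsilon (N^T)^{-1} s'\big)\, U_\xi(\lambda N^T Q s)\, F\, U_\xi(s' - \lambda N^T Q s)\Psi \\
&= \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi\big(\varepsilon N s, \varepsilon (N^T)^{-1} s'\big)\, U_\xi(\lambda N^T Q N s)\, F\, U_\xi(s' - \lambda N^T Q N s)\Psi \\
&= F_{\xi,\det N \cdot \lambda}\Psi.
\end{aligned}$$

In the last line, we used the fact that the value of the oscillatory integral does not depend on the choice of cutoff function $\chi$ or $\chi_N(s, s') := \chi(N s, (N^T)^{-1} s')$, and the equation $N^T Q N = \det N \cdot Q$, which holds for any $(2 \times 2)$-matrix $N$. This proves (3.11), and since $\xi' = \Pi \xi$, with the flip matrix $\Pi = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ which has $\det \Pi = -1$, also (3.12) follows.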
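The matrix identity $N^T Q N = \det N \cdot Q$ used in the last step is easy to confirm on examples. The following sketch, with hypothetical helper names and arbitrarily chosen integer matrices, checks it for a matrix of determinant $1$, one of determinant $-11$, and the flip matrix, for which the right hand side is $-Q$ as needed for (3.12).

```python
# The standard antisymmetric matrix Q from (3.5).
Q = [[0, 1], [-1, 0]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def det(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def conjugate_q(n):
    """Return N^T Q N for a 2x2 matrix N."""
    return matmul(matmul(transpose(n), Q), n)


def scaled_q(factor):
    """Return factor * Q."""
    return [[factor * entry for entry in row] for row in Q]


FLIP = [[0, 1], [1, 0]]  # flip matrix with determinant -1

# Illustrative sample matrices (integer entries keep the comparison exact).
for n in ([[2, 1], [5, 3]], [[1, 4], [2, -3]], FLIP):
    assert conjugate_q(n) == scaled_q(det(n))
```

Since both sides are polynomial in the entries of $N$, these spot checks only illustrate the identity; the general statement follows from a direct two-line computation.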



Having established these properties of individual deformed operators, we now set out to deform the net $W \mapsto \mathfrak{F}(W)$ of wedge algebras. In contrast to the Minkowski space setting [BLS10], we are here in a situation where the set $\Xi$ of all Killing pairs is not a single orbit of one reference pair under the isometry group. Whereas the deformation of a net of wedge algebras on Minkowski space amounts to deforming a single algebra associated with a fixed reference wedge (Borchers triple), we have to specify here more data, related to the coherent subfamilies $\mathcal{W}_{[\xi]}$ in the decomposition $\mathcal{W} = \bigsqcup_{[\xi]} \mathcal{W}_{[\xi]}$ of $\mathcal{W}$ (2.15). For each equivalence class $[\xi]$, we choose a representative $\xi$. In case there exists only a single equivalence class, this simply amounts to fixing a reference wedge together with a length scale for the Killing flow. With this choice of representatives $\xi \in [\xi]$ made, we introduce the sets, $p \in M$,

$$\mathfrak{F}(W_{\xi,p})_\lambda := \overline{\{F_{\xi,\lambda} : F \in \mathfrak{F}(W_{\xi,p})\ \xi\text{-smooth}\}}^{\|\cdot\|}, \tag{3.13}$$

$$\mathfrak{F}(W'_{\xi,p})_\lambda := \overline{\{F_{\xi',\lambda} : F \in \mathfrak{F}(W'_{\xi,p})\ \xi\text{-smooth}\}}^{\|\cdot\|}. \tag{3.14}$$

Here $\lambda \in \mathbb{R}$ is the deformation parameter, and the superscript denotes norm closure in $\mathcal{B}(\mathcal{H})$. Note that the deformed operators in $\mathfrak{F}(W'_{\xi,p})_\lambda$ have the form $F_{\xi',\lambda} = F_{\xi,-\lambda}$ (3.12), i.e., the sign of the deformation parameter depends on the choice of reference Killing pair. The definitions (3.13, 3.14) are extended to arbitrary wedges by setting

$$\mathfrak{F}(h W_{\xi,p})_\lambda := \alpha_h(\mathfrak{F}(W_{\xi,p})_\lambda), \qquad \mathfrak{F}(h W'_{\xi,p})_\lambda := \alpha_h(\mathfrak{F}(W'_{\xi,p})_\lambda). \tag{3.15}$$

Recall that as $h$, $p$ and $[\xi]$ vary over Iso, $M$ and $\Xi/\!\sim$, respectively, this defines $\mathfrak{F}(W)_\lambda$ for all $W \in \mathcal{W}$ (cf. (2.16)). It has to be proven that this assignment is well-defined, e.g. that (3.15) is independent of the way the wedge $h W_{\xi,p} = \tilde{h} W_{\xi,\tilde{p}}$ is represented. This will be done below. However, note that the definition of $\mathfrak{F}(W)_\lambda$ does depend on our choice of representatives $\xi \in [\xi]$, since rescaling $\xi$ amounts to rescaling the deformation parameter (Lemma 3.3 b)).

Before establishing the main properties of the assignment $W \to \mathfrak{F}(W)_\lambda$, we check that the sets (3.13, 3.14) are $C^*$-algebras. As the $C^*$-algebra $\mathfrak{F}(W_{\xi,p})$ is $\tau_\xi$-invariant and $\tau_\xi$ acts strongly continuously, the $\xi$-smooth operators in $\mathfrak{F}(W_{\xi,p})$ which appear in the definition (3.13) form a norm-dense $*$-subalgebra. Now the deformation $F \to F_{\xi,\lambda}$ is evidently linear and commutes with taking adjoints (Lemma 3.1 a)); so the sets $\mathfrak{F}(W_{\xi,p})_\lambda$ are $*$-invariant norm-closed subspaces of $\mathcal{B}(\mathcal{H})$. To check that these spaces are also closed under taking products, we again use the invariance of $\mathfrak{F}(W_{\xi,p})$ under $\tau_\xi$: By inspection of the Rieffel product (3.7), it follows that for any two $\xi$-smooth $F, G \in \mathfrak{F}(W_{\xi,p})$, also the product $F \times_{\xi,\lambda} G$ lies in this algebra (and is $\xi$-smooth, see [Rie92]). Hence the multiplication formula from Lemma 3.1 b) entails that the above defined $\mathfrak{F}(W)_\lambda$ are actually $C^*$-algebras in $\mathcal{B}(\mathcal{H})$.

The map $W \to \mathfrak{F}(W)_\lambda$ defines the wedge-local field algebras of the deformed quantum field theory. Their basic properties are collected in the following theorem.

Theorem 3.4. The above constructed map $W \longmapsto \mathfrak{F}(W)_\lambda$, $W \in \mathcal{W}$, is a well-defined, isotonous, twisted wedge-local, Iso-covariant net of $C^*$-algebras on $\mathcal{H}$, i.e., for $W, \tilde{W} \in \mathcal{W}$,

$$\mathfrak{F}(W)_\lambda \subset \mathfrak{F}(\tilde{W})_\lambda \quad \text{for } W \subset \tilde{W}, \tag{3.16}$$

$$[Z F_\lambda Z^*, G_\lambda] = 0 \quad \text{for } F_\lambda \in \mathfrak{F}(W)_\lambda,\ G_\lambda \in \mathfrak{F}(W')_\lambda, \tag{3.17}$$

$$\alpha_h(\mathfrak{F}(W)_\lambda) = \mathfrak{F}(hW)_\lambda, \quad h \in \mathrm{Iso}. \tag{3.18}$$

For $\lambda = 0$, this net coincides with the original net, $\mathfrak{F}(W)_0 = \mathfrak{F}(W)$, $W \in \mathcal{W}$.



Proof. It is important to note from the beginning that all claimed properties relate only wedges in the same coherent subfamily $\mathcal{W}_{[\xi]}$. This can be seen from the form (2.16) of $\mathcal{W}_{[\xi]}$, which is manifestly invariant under isometries and causal complementation, and the structure of the inclusions (Proposition 2.9). So in the following proof, it is sufficient to consider a fixed but arbitrary equivalence class $[\xi]$, with selected representative $\xi$.

We begin with establishing the isotony of the deformed net, and therefore consider inclusions of wedges of the form $h W_{\xi,p}$, $h W'_{\xi,p}$, with $h \in \mathrm{Iso}$, $p \in M$ arbitrary, and $\xi \in \Xi$ fixed. Starting with the inclusions $h W_{\xi,p} \subseteq \tilde{h} W_{\xi,\tilde{p}}$, we note that according to (2.13) and Prop. 2.9, there exists $N \in GL(2, \mathbb{R})$ with positive determinant such that $h_*\xi = N \tilde{h}_*\xi$. Equivalently, $(\tilde{h}^{-1} h)_*\xi = N \xi$, which by Lemma 2.2 implies $\det N = 1$. By definition, a generic $\xi$-smooth element of $\mathfrak{F}(h W_{\xi,p})_\lambda$ is of the form $\alpha_h(F_{\xi,\lambda}) = \alpha_h(F)_{h_*\xi,\lambda}$ with some $\xi$-smooth $F \in \mathfrak{F}(W_{\xi,p})$. But according to the above observation, this can be rewritten as

$$\alpha_h(F_{\xi,\lambda}) = \alpha_h(F)_{h_*\xi,\lambda} = \alpha_h(F)_{N \tilde{h}_*\xi,\lambda} = \alpha_h(F)_{\tilde{h}_*\xi,\lambda}, \tag{3.19}$$

where in the last equation we used $\det N = 1$ and Lemma 3.3 b). Taking into account that $h W_{\xi,p} \subseteq \tilde{h} W_{\xi,\tilde{p}}$, and that the undeformed net is covariant and isotonous, we have $\alpha_h(F) \in \mathfrak{F}(h W_{\xi,p}) \subset \mathfrak{F}(\tilde{h} W_{\xi,\tilde{p}})$, and so the very right hand side of (3.19) is an element of $\mathfrak{F}(\tilde{h} W_{\xi,\tilde{p}})_\lambda$. Going to the norm closures, the inclusion $\mathfrak{F}(h W_{\xi,p})_\lambda \subset \mathfrak{F}(\tilde{h} W_{\xi,\tilde{p}})_\lambda$ of $C^*$-algebras follows.

Analogously, an inclusion of causal complements, $h W'_{\xi,p} \subseteq \tilde{h} W'_{\xi,\tilde{p}}$, leads to the inclusion of $C^*$-algebras $\mathfrak{F}(h W'_{\xi,p})_\lambda \subset \mathfrak{F}(\tilde{h} W'_{\xi,\tilde{p}})_\lambda$, the only difference to the previous argument consisting in an exchange of $h, \tilde{h}$ and $p, \tilde{p}$.

To complete the investigation of inclusions of wedges in $\mathcal{W}_{[\xi]}$, we must also consider the case $h W_{\xi,p} \subseteq \tilde{h} W'_{\xi,\tilde{p}} = W_{\tilde{h}_*\xi',\tilde{p}}$. By the same reasoning as before, there exists a matrix $N$ with $\det N = 1$ such that $(\tilde{h}^{-1} h)_*\xi = N \xi' = N \Pi \xi$ with the flip matrix $\Pi$. So $N' := N \Pi$ has determinant $\det N' = -1$, and $h_*\xi = N' \tilde{h}_*\xi$. Using (3.12), we find for $\xi$-smooth $F \in \mathfrak{F}(W_{\xi,p})$,

$$\alpha_h(F_{\xi,\lambda}) = \alpha_h(F)_{h_*\xi,\lambda} = \alpha_h(F)_{N' \tilde{h}_*\xi,\lambda} = \alpha_h(F)_{\tilde{h}_*\xi,-\lambda}. \tag{3.20}$$

By isotony and covariance of the undeformed net, this deformed operator is an element of $\mathfrak{F}(\tilde{h} W'_{\xi,\tilde{p}})_\lambda$ (3.14), and taking the norm closure in (3.14) yields $\mathfrak{F}(h W_{\xi,p})_\lambda \subset \mathfrak{F}(\tilde{h} W'_{\xi,\tilde{p}})_\lambda$. So the isotony (3.16) of the net is established. This implies in particular that the net $\mathfrak{F}_\lambda$ is well-defined, since in case $h W_{\xi,p}$ equals $\tilde{h} W_{\xi,\tilde{p}}$ or its causal complement, the same arguments yield the equality of $\mathfrak{F}(h W_{\xi,p})_\lambda$ and $\mathfrak{F}(\tilde{h} W_{\xi,\tilde{p}})_\lambda$ respectively $\mathfrak{F}(\tilde{h} W'_{\xi,\tilde{p}})_\lambda$.

The covariance of $W \to \mathfrak{F}(W)_\lambda$ is evident from the definition. To check twisted locality, it is thus sufficient to consider the pair of wedges $W_{\xi,p}$, $W'_{\xi,p}$. In view of the definition of the $C^*$-algebras $\mathfrak{F}(W)_\lambda$ (3.13) as norm closures of algebras of deformed smooth operators, it suffices to show that any $\xi$-smooth $F \in \mathfrak{F}(W_{\xi,p})$, $G \in \mathfrak{F}(W'_{\xi,p})$ fulfill the commutation relation

$$[Z F_{\xi,\lambda} Z^*, G_{\xi',\lambda}] = 0. \tag{3.21}$$



But $G_{\xi',\lambda} = G_{\xi,-\lambda}$ (3.12), and since the undeformed net is twisted local and covariant, we have $[Z \tau_{\xi,s}(F) Z^*, G] = 0$ for all $s \in \mathbb{R}^2$, which implies $[Z F_{\xi,\lambda} Z^*, G_{\xi,-\lambda}] = 0$ by Lemma 3.2. The fact that setting $\lambda = 0$ reproduces the undeformed net is a straightforward consequence of $F_{\xi,0} = F$ for any $\xi$-smooth operator, $\xi \in \Xi$.

Theorem 3.4 is our main result concerning the structure of deformed quantum field theories on admissible spacetimes: It states that the same covariance and localization properties as on flat spacetime can be maintained in the curved setting. Whereas the action of the isometry group and the chosen representation space of $\mathfrak{F}$ are the same for all values of the deformation parameter $\lambda$, the concrete $C^*$-algebras $\mathfrak{F}(W)_\lambda$ depend in a non-trivial and continuous way on $\lambda$: For a fixed wedge $W$, the collection $\{\mathfrak{F}(W)_\lambda : \lambda \in \mathbb{R}\}$ forms a continuous field of $C^*$-algebras [Dix77]; this follows from Rieffel’s results [Rie92] and the fact that $\mathfrak{F}(W)_\lambda$ forms a faithful representation of Rieffel’s deformed $C^*$-algebra $(\mathfrak{F}(W), \times_\lambda)$ [BLS10].

For deformed nets on Minkowski space, there also exist proofs showing that the net $W \to \mathfrak{F}(W)_\lambda$ depends on $\lambda$, for example by working in a vacuum representation and calculating the corresponding collision operators [BS08]. There one finds as a striking effect of the deformation that the interaction depends on $\lambda$, i.e. that deformations of interaction-free models have non-trivial S-matrices. However, on generic curved spacetimes, a distinguished invariant state like the vacuum state with its positive energy representation of the translations does not exist. Consequently, the result concerning scattering theory cannot be reproduced here. Instead we will establish the non-equivalence of the undeformed net $W \to \mathfrak{F}(W)$ and the deformed net $W \to \mathfrak{F}(W)_\lambda$, $\lambda \neq 0$, in a concrete example model in Sect. 3.2.

As mentioned earlier, the family of wedge regions $\mathcal{W}(M, g)$ is causally separating in a subclass of admissible spacetimes, including the Friedmann-Robertson-Walker universes. In this case, the extension of the net $\mathfrak{F}_\lambda$ to double cones or similar regions $O \subset M$ via

$$\mathfrak{F}(O)_\lambda := \bigcap_{W \supset O} \mathfrak{F}(W)_\lambda \tag{3.22}$$

is still twisted local. These algebras contain all operators localized in the region $O$ in the deformed theory. On other spacetimes $(M, g)$, such an extension is possible only for special regions, intersections of wedges, whose shape and size depend on the structure of $\mathcal{W}(M, g)$. Because of the relation of warped convolution to noncommutative spaces, where sharp localization is impossible, it is expected that $\mathfrak{F}(O)$ contains only multiples of the identity if $O$ has compact closure. A partial result in this direction will be presented in the context of the deformed Dirac field in Sect. 3.2.

We conclude this section with a remark concerning the relation between the field and observable net structure of deformed quantum field theories. The field net $\mathfrak{F}$ is composed of Bose and Fermi fields, and therefore contains observable as well as unobservable quantities. The former give rise to the observable net $\mathfrak{A}$ which consists of the subalgebras invariant under the grading automorphism $\gamma$. In terms of the projection $v(F) := \frac{1}{2}(F + \gamma(F))$, the observable wedge algebras are

$$\mathfrak{A}(W) := \{F \in \mathfrak{F}(W) : F = \gamma(F)\} = v(\mathfrak{F}(W)), \qquad W \in \mathcal{W}, \tag{3.23}$$

so that $\mathfrak{A}(W)$, $\mathfrak{A}(\tilde{W})$ commute (without twist) if $W$ and $\tilde{W}$ are spacelike separated.



Since the observables are the physically relevant objects, we could have considered a deformation $\mathfrak{A}(W) \to \mathfrak{A}(W)_\lambda$ of the observable wedge algebras along the same lines as we did for the field algebras. This approach would have resulted precisely in the $\gamma$-invariant subalgebras of the deformed field algebras $\mathfrak{F}(W)_\lambda$, i.e., the diagram

$$\begin{array}{ccc}
\mathfrak{F}(W) & \xrightarrow{\ \text{deformation}\ } & \mathfrak{F}(W)_\lambda \\
\downarrow{\scriptstyle v} & & \downarrow{\scriptstyle v} \\
\mathfrak{A}(W) & \xrightarrow{\ \text{deformation}\ } & \mathfrak{A}(W)_\lambda
\end{array}$$

commutes. This claim can quickly be verified by noting that the projection $v$ commutes with the deformation map $F \to F_{\xi,\lambda}$.

3.2. The Dirac field and its deformation.

After the model-independent description of deformed quantum field theories carried out in the previous section, we now consider the theory of a free Dirac quantum field as a concrete example model. We first briefly recall the notion of Dirac (co)spinors and the classical Dirac equation, following largely [DHP09,San08] and partly [Dim82,FV02], where the proofs of all the statements below are presented and an extensive description of the relevant concepts is available. Afterwards, we consider the quantum Dirac field using Araki’s self-dual CAR algebra formulation [Ara71].

As before, we work on a fixed but arbitrary admissible spacetime $(M, g)$ in the sense of Definition 2.1 and we fix its orientation. Therefore, as a four-dimensional, time oriented and oriented, globally hyperbolic spacetime, $M$ admits a spin structure $(SM, \rho)$, consisting of a principal bundle $SM$ over $M$ with structure group $SL(2, \mathbb{C})$, and a smooth bundle homomorphism $\rho$ projecting $SM$ onto the frame bundle $FM$, which is a principal bundle over $M$ with $SO_0(3, 1)$ as structure group. The map $\rho$ preserves base points and is equivariant in the sense that it intertwines the natural right actions $R$ of $SL(2, \mathbb{C})$ on $SM$ and of $SO_0(3, 1)$ on $FM$, respectively,

$$\rho \circ R_{\tilde{\Lambda}} = R_{\Lambda} \circ \rho, \qquad \Lambda \in SO_0(3, 1), \tag{3.24}$$

with the covering homomorphism $\tilde{\Lambda} \mapsto \Lambda$ from $SL(2, \mathbb{C})$ to $SO_0(3, 1)$.

Although each spacetime of the type considered here has a spin structure [DHP09, Thm. 2.1, Lemma 2.1], this is only unique if the underlying manifold is simply connected [Ger68,Ger70], i.e., if all edges are simply connected in the case of an admissible spacetime. In the following, it is understood that a fixed choice of spin structure has been made.

The main object we shall be interested in is the Dirac bundle, that is the associated vector bundle

$$DM := SM \times_T \mathbb{C}^4 \tag{3.25}$$

with the representation $T := D^{(\frac{1}{2}, 0)} \oplus D^{(0, \frac{1}{2})}$ of $SL(2, \mathbb{C})$ on $\mathbb{C}^4$. Dirac spinors $\psi$ are smooth global sections of $DM$, and the space they span will be denoted $\mathcal{E}(DM)$. The dual bundle $D^*M$ is called the dual Dirac bundle, and its smooth global sections $\psi' \in \mathcal{E}(D^*M)$ are referred to as Dirac cospinors.

For the formulation of the Dirac equation, we need two more ingredients. The first are the so-called gamma-matrices, which are the coefficients of a global tensor $\gamma \in$



$\mathcal{E}(T^*M \otimes DM \otimes D^*M)$ such that $\gamma = \gamma^{B}_{aA}\, e^a \otimes E^A \otimes E_B$. Here $E_A$ and $E^B$ with $A, B = 1, \dots, 4$ are four global sections of $DM$ and $D^*M$ respectively, such that $(E^B, E_A) = \delta^B_A$, with $(\,,)$ the natural pairing between dual elements. Notice that $E_A$ descends also from a global section $E$ of $SM$ since we can define $E_A(x) := [(E(x), z_A)]$, where $z_A$ is the standard basis of $\mathbb{C}^4$. At the same time, out of $E$, we can construct $e_a$, with $a = 1, \dots, 4$, as a set of four global sections of $TM$ once we define $e := \rho \circ E$ as a global section of the frame bundle, which, in turn, can be read as a vector bundle over $TM$ with $\mathbb{R}^4$ as typical fibre. The set of all $e_a$ is often referred to as the non-holonomic basis of the base manifold. In this case upper indices are defined via the natural pairing over $\mathbb{R}^4$, that is $(e^b, e_a) = \delta^b_a$. Furthermore we choose the gamma-matrices to be of the following form:

$$\gamma_0 = \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}, \qquad \gamma_a = \begin{pmatrix} 0 & \sigma_a \\ -\sigma_a & 0 \end{pmatrix}, \quad a = 1, 2, 3, \tag{3.26}$$

where the $\sigma_a$ are the Pauli matrices, and $I_n$ will henceforth denote the $(n \times n)$ identity matrix. These matrices fulfill the anticommutation relation $\{\gamma_a, \gamma_b\} = 2\eta_{ab} I_4$, with the flat Minkowski metric $\eta$. They therefore depend on the sign convention in the metric signature, and differ from those introduced in [DHP09], where a different convention was used.

The last ingredient we need to specify is the covariant derivative (spin connection) on the space of smooth sections of the Dirac bundle, that is

$$\nabla : \mathcal{E}(DM) \to \mathcal{E}(T^*M \otimes DM), \tag{3.27}$$

whose action on the sections $E_A$ is given as $\nabla E_A = \sigma^{B}_{aA}\, e^a E_B$. The connection coefficients can be expressed as $\sigma^{B}_{aA} = \frac{1}{4} \Gamma^{b}_{ad}\, \gamma^{B}_{bC}\, \gamma^{dC}_{A}$, where $\Gamma^{b}_{ad}$ are the coefficients arising from the Levi-Civita connection expressed in terms of non-holonomic basis [DHP09, Lemma 2.2].

We can now introduce the Dirac equation for spinors $\psi \in \mathcal{E}(DM)$ and cospinors $\psi' \in \mathcal{E}(D^*M)$ as

$$D\psi := \left(-i\gamma^\mu \nabla_\mu + m I\right)\psi = 0, \tag{3.28}$$

$$D'\psi' := \left(+i\gamma^\mu \nabla_\mu + m I\right)\psi' = 0, \tag{3.29}$$

where $m \geq 0$ is a constant while $I$ is the identity on the respective space.

The Dirac equation has unique advanced and retarded fundamental solutions [DHP09]: Denoting the smooth and compactly supported sections of the Dirac bundle by $\mathcal{D}(DM)$, there exist two continuous linear maps

$$S^\pm : \mathcal{D}(DM) \to \mathcal{E}(DM),$$

such that $S^\pm D = D S^\pm = I$ and $\mathrm{supp}\,(S^\pm f) \subseteq J^\pm(\mathrm{supp}\,(f))$ for all $f \in \mathcal{D}(DM)$, where $J^\pm$ stands for the causal future/past. In the case of cospinors, we shall instead talk about $S_*^\pm : \mathcal{D}(D^*M) \to \mathcal{E}(D^*M)$ and they have the same properties of $S^\pm$, except that $S_*^\pm D' = D' S_*^\pm = I$. In analogy with the theory of real scalar fields, the causal propagators for Dirac spinors and cospinors are defined as $S := S^+ - S^-$ and $S_* := S_*^+ - S_*^-$, respectively.

For the formulation of a quantized Dirac field, it is advantageous to collect spinors and cospinors in a single object. We therefore introduce the space

$$\mathcal{D} := \mathcal{D}(DM \oplus D^*M), \tag{3.30}$$



on which we have the conjugation

$$\Gamma(f_1 \oplus f_2) := f_1^* \beta \oplus \beta^{-1} f_2^*, \tag{3.31}$$

defined in terms of the adjoint $f \mapsto f^*$ on $\mathbb{C}^4$ and the Dirac conjugation matrix $\beta$. This matrix is the unique selfadjoint element of $SL(4, \mathbb{C})$ with the properties that $\gamma_a^* = -\beta \gamma_a \beta^{-1}$, $a = 0, \dots, 3$, and $i\beta n^a \gamma_a$ is a positive definite matrix, for any timelike future-directed vector $n$. Due to these properties, the sesquilinear form on $\mathcal{D}$ defined as

$$\langle f_1 \oplus f_2, h_1 \oplus h_2 \rangle := -i \langle f_1^* \beta, S h_1 \rangle + i \langle S_* h_2, \beta^{-1} f_2^* \rangle, \tag{3.32}$$

where $\langle\,,\rangle$ is the global pairing between $\mathcal{E}(DM)$ and $\mathcal{D}(D^*M)$ or between $\mathcal{E}(D^*M)$ and $\mathcal{D}(DM)$, is positive semi-definite. Therefore the quotient

$$\mathcal{K} := \mathcal{D} / (\ker S \oplus \ker S_*) \tag{3.33}$$

has the structure of a pre-Hilbert space, and we denote the corresponding scalar product and norm by $\langle\,.\,,.\,\rangle_S$ and $\|f\|_S := \langle f, f \rangle_S^{1/2}$. The conjugation $\Gamma$ descends to the quotient $\mathcal{K}$, and we denote its action on $\mathcal{K}$ by the same symbol. Moreover, $\Gamma$ is compatible with the sesquilinear form (3.32) in such a way that it extends to an antiunitary involution $\Gamma = \Gamma^* = \Gamma^{-1}$ on the Hilbert space $\overline{\mathcal{K}}$ [San08, Lemma 4.2.4].

Regarding covariance, the isometry group Iso of $(M, g)$ naturally acts on the sections in $\mathcal{D}$ by pullback. In view of the geometrical nature of the causal propagator, this action descends to the quotient $\mathcal{K}$ and extends to a unitary representation $u$ of Iso on the Hilbert space $\overline{\mathcal{K}}$.

Given the pre-Hilbert space $\mathcal{K}$, the conjugation $\Gamma$, and the representation $u$ as above, the quantized Dirac field can be conveniently described as follows [Ara71]. We consider the $C^*$-algebra $\mathrm{CAR}(\mathcal{K}, \Gamma)$, that is, the unique unital $C^*$-algebra generated by the symbols $B(f)$, $f \in \mathcal{K}$, such that, for all $f, h \in \mathcal{K}$,

a) $f \longmapsto B(f)$ is complex linear,

b) $B(f)^* = B(\Gamma f)$,

c) $\{B(\Gamma f), B(h)\} = \langle f, h \rangle_S \cdot 1$.

The field equation is implicit here since we took the quotient with respect to the kernels of $S$, $S_*$. The standard picture of spinors and cospinors can be recovered via the identifications $\psi(h) = B(0 \oplus h)$ and $\psi^\dagger(f) = B(f \oplus 0)$.

As is well known, the Dirac field satisfies the standard assumptions of quantum field theory, and we briefly point out how this model fits into the general framework used in Sect. 3.1. The global field algebra $\mathfrak{F} := \mathrm{CAR}(\mathcal{K}, \Gamma)$ carries a natural Iso-action $\alpha$ by Bogoliubov transformations: Since the unitaries $u(h)$, $h \in \mathrm{Iso}$, commute with the conjugation $\Gamma$, the maps

$$\alpha_h(B(f)) := B(u(h) f), \qquad f \in \mathcal{K},$$

extend to automorphisms of $\mathfrak{F}$. Similarly, the grading automorphism $\gamma$ is fixed by

$$\gamma(B(f)) := -B(f),$$



and clearly commutes with $\alpha_h$. The field algebra is faithfully represented on the Fermi Fock space $\mathcal{H}$ over $\mathcal{K}$, where the field operators take the form

$$B(f) = \frac{1}{\sqrt{2}}\left(a(f)^* + a(\Gamma f)\right), \tag{3.34}$$

with the usual Fermi Fock space creation/annihilation operators $a^\#(f)$, $f \in \mathcal{K}$. In this representation, the second quantization $U$ of $u$ implements the action $\alpha$. The Bose/Fermi-grading can be implemented by $V = (-1)^N$, where $N$ is the Fock space number operator [Foi83].

Regarding the regularity assumptions on the symmetries, recall that the anticommutation relations of the CAR algebra imply that $\|a(f)\| = \|f\|_S$, and thus $f \mapsto B(f)$ is a linear continuous map from $\mathcal{K}$ to $\mathfrak{F}$. As $s \mapsto u_\xi(s) f$ is smooth in the norm topology of $\mathcal{K}$ for any $\xi \in \Xi$, $f \in \mathcal{K}$, this implies that the field operators $B(f)$ transform smoothly under the action $\alpha$. Furthermore, the unitarity of $u$ yields strong continuity of $\alpha$ on all of $\mathfrak{F}$, as required in Sect. 3.1.

The field algebra $\mathfrak{F}(W) \subset \mathfrak{F}$ associated with a wedge $W \in \mathcal{W}$ is defined as the unital $C^*$-algebra generated by all $B(f)$, $f \in \mathcal{K}(W)$, where $\mathcal{K}(W)$ is the set of (equivalence classes of) smooth and compactly supported sections of $DM \oplus D^*M$ with support in $W$. Since $\langle f, g \rangle_S = 0$ for $f \in \mathcal{K}(W)$, $g \in \mathcal{K}(W')$, we have $\{B(f), B(g)\} = 0$ for $f \in \mathcal{K}(W)$, $g \in \mathcal{K}(W')$, which implies the twisted locality condition (3.3). Isotony is clear from the definition, and covariance under the isometry group follows from $u(h)\mathcal{K}(W) = \mathcal{K}(hW)$.

The model of the Dirac field therefore fits into the framework of Sect. 3.1, and the warped convolution deformation defines a one-parameter family of deformed nets $\mathfrak{F}_\lambda$. Besides the properties which were established in the general setting in Sect. 3.1, we can here consider the explicit deformed field $B(f)_{\xi,\lambda}$. A nice characterization of these operators can be given in terms of their $n$-point functions associated with a quasifree state $\omega$ on $\mathfrak{F}$.

Let $\omega$ be an $\alpha$-invariant quasifree state on $\mathfrak{F}$, and let $(\mathcal{H}^\omega, \pi^\omega, \Omega^\omega)$ denote the associated GNS triple. As a consequence of invariance of $\omega$, the GNS space $\mathcal{H}^\omega$ carries a unitary representation $U^\omega$ of Iso which leaves $\Omega^\omega$ invariant.
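In this situation, the warping map

$$F_{\xi,\lambda} \mapsto F^{\omega}_{\xi,\lambda} := \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, U^{\omega}_{\xi}(\lambda Q s)\, \pi^{\omega}(F)\, U^{\omega}_{\xi}(s' - \lambda Q s), \tag{3.35}$$

defined for $\xi$-smooth $F \in F$ as before, extends continuously to a representation of the Rieffel-deformed $C^*$-algebra $(F, \times_{\xi,\lambda})$ on $H^\omega$ [BLS10, Thm. 2.8]. Moreover, the $U^\omega$-invariance of $\Omega^\omega$ implies

$$F^{\omega}_{\xi,\lambda}\, \Omega^{\omega} = \pi^{\omega}(F)\, \Omega^{\omega}, \qquad \xi \in \Xi, \quad \lambda \in \mathbb{R}, \quad F \in F \ \ \xi\text{-smooth}. \tag{3.36}$$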

Since the CAR algebra is simple, all its representations are faithful [BR97]. We will therefore identify $F$ with its representation $\pi^\omega(F)$ in the following, and drop the $\omega$-dependence of $H^\omega$, $\Omega^\omega$, $U^\omega$ and the warped convolutions $F^\omega_{\xi,\lambda}$ from our notation.

To characterize the deformed field operators $B(f)_{\xi,\lambda}$, we will consider the n-point functions

$$\omega_n(f_1, \ldots, f_n) := \omega(B(f_1) \cdots B(f_n)) = \langle \Omega, B(f_1) \cdots B(f_n)\, \Omega \rangle, \qquad f_1, \ldots, f_n \in K,$$

and the corresponding deformed expectation values of the deformed fields, called deformed n-point functions,

$$\omega_n^{\xi,\lambda}(f_1, \ldots, f_n) := \langle \Omega, B(f_1)_{\xi,\lambda} \cdots B(f_n)_{\xi,\lambda}\, \Omega \rangle, \qquad f_1, \ldots, f_n \in K.$$

Of particular interest are the quasifree states, where $\omega_n$ vanishes if n is odd, and $\omega_n$ is a linear combination of products of two-point functions $\omega_2$ if n is even. In particular, the undeformed four-point function of a quasifree state reads

$$\omega_4(f_1, f_2, f_3, f_4) = \omega_2(f_1, f_2)\,\omega_2(f_3, f_4) + \omega_2(f_1, f_4)\,\omega_2(f_2, f_3) - \omega_2(f_1, f_3)\,\omega_2(f_2, f_4). \tag{3.37}$$
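As an illustration of the combinatorics in (3.37), the following minimal Python sketch evaluates the four-point function of a quasifree state from a given two-point function. The names `quasifree_four_point` and `toy_w2` are hypothetical and do not appear in the text; `toy_w2` is only a toy stand-in (a plain complex inner product on tuples), not the two-point function of a physical state.

```python
def toy_w2(f, g):
    """Toy two-point function (assumption): complex inner product of tuples."""
    return sum(a.conjugate() * b for a, b in zip(f, g))


def quasifree_four_point(w2, f1, f2, f3, f4):
    """Four-point function built from two-point functions as in (3.37).

    The pairing (f1, f3)(f2, f4) crosses the other contraction lines and
    therefore carries the fermionic minus sign.
    """
    return (w2(f1, f2) * w2(f3, f4)
            + w2(f1, f4) * w2(f2, f3)
            - w2(f1, f3) * w2(f2, f4))


if __name__ == "__main__":
    e1, e2 = (1 + 0j, 0j), (0j, 1 + 0j)
    # Only the (f1, f2)(f3, f4) contraction contributes here.
    print(quasifree_four_point(toy_w2, e1, e1, e2, e2))
    # Only the crossed (f1, f3)(f2, f4) contraction contributes here.
    print(quasifree_four_point(toy_w2, e1, e2, e1, e2))
```

In the second call only the crossed contraction survives, so the result carries the minus sign; this is the term that becomes deformation-dependent in the warped setting.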

Proposition 3.5. The deformed n-point functions of a quasifree and Iso-invariant state vanish for odd n. The lowest deformed even n-point functions are, for $f_1, \ldots, f_4 \in K$,

$$\omega_2^{\xi,\lambda}(f_1, f_2) = \omega_2(f_1, f_2), \tag{3.38}$$

$$\omega_4^{\xi,\lambda}(f_1, f_2, f_3, f_4) = \omega_2(f_1, f_2)\,\omega_2(f_3, f_4) + \omega_2(f_1, f_4)\,\omega_2(f_2, f_3) - \frac{1}{4\pi^2} \lim_{\varepsilon \to 0} \int ds\, ds'\, e^{-iss'}\, \chi(\varepsilon s, \varepsilon s')\, \omega_2(f_1, u_\xi(s) f_3) \cdot \omega_2(f_2, u_\xi(2\lambda Q s') f_4). \tag{3.39}$$

Proof. The covariant transformation behaviour of the Dirac field, $U_\xi(s) B(f) U_\xi(s)^{-1} = B(u_\xi(s) f)$, the invariance of $\Omega$ and the form (3.35) of the warped convolution imply that any deformed n-point function can be written as an integral over undeformed n-point functions with transformed arguments. As the latter functions vanish for odd n, we also have $\omega_n^{\xi,\lambda} = 0$ for odd n. Taking into account (3.36), we obtain for the deformed two-point function

$$\omega_2^{\xi,\lambda}(f_1, f_2) = \langle \Omega, B(f_1)_{\xi,\lambda} B(f_2)_{\xi,\lambda}\, \Omega \rangle = \langle (B(f_1)^*)_{\xi,\lambda}\, \Omega, B(f_2)_{\xi,\lambda}\, \Omega \rangle = \langle B(f_1)^* \Omega, B(f_2)\, \Omega \rangle = \omega_2(f_1, f_2),$$

proving (3.38). To compute the four-point function, we use again $B(f)_{\xi,\lambda} \Omega = B(f) \Omega$ and $U_\xi(s) \Omega = \Omega$. Inserting the definition of the warped convolution (3.6), and using the transformation law $U_\xi(s) B(f) U_\xi(s)^{-1} = B(u_\xi(s) f)$ and the shorthand $f(s) := u_\xi(s) f$, we obtain

$$\omega_4^{\xi,\lambda}(f_1, f_2, f_3, f_4) = \langle \Omega, B(f_1) B(f_2)_{\xi,\lambda} B(f_3)_{\xi,\lambda} B(f_4)\, \Omega \rangle = (2\pi)^{-4} \lim_{\varepsilon_1, \varepsilon_2 \to 0} \int d\mathbf{s}\; e^{-i(s_1 s_1' + s_2 s_2')}\, \chi_\varepsilon(\mathbf{s})\; \omega_4\bigl(f_1, f_2(\lambda Q s_1), f_3(\lambda Q s_2 + s_1'), f_4(s_1' + s_2')\bigr),$$

where $d\mathbf{s} = ds_1\, ds_1'\, ds_2\, ds_2'$ and $\chi_\varepsilon(\mathbf{s}) = \chi(\varepsilon_1 s_1, \varepsilon_1 s_1')\, \chi(\varepsilon_2 s_2, \varepsilon_2 s_2')$. After the substitutions $s_2' \to s_2' - s_1'$ and $s_1' \to s_1' - \lambda Q s_2$, the integrations over $s_2, s_2'$ and the limit $\varepsilon_2 \to 0$ can be carried out. The result is

$$(2\pi)^{-2} \lim_{\varepsilon_1 \to 0} \int ds_1\, ds_1'\, e^{-i s_1 s_1'}\, \hat{\chi}(\varepsilon_1 s_1, \varepsilon_1 s_1')\; \omega_4\bigl(f_1, f_2(\lambda Q s_1), f_3(s_1'), f_4(s_1' - \lambda Q s_1)\bigr),$$

with a smooth, compactly supported cutoff function $\hat{\chi}$ with $\hat{\chi}(0, 0) = 1$.


We now use the fact that $\omega$ is quasifree and write $\omega_4$ as a sum of products of two-point functions (3.37). Considering the term where $f_1, f_2$ and $f_3, f_4$ are contracted, in the second factor $\omega_2(f_3(s_1'), f_4(s_1' - \lambda Q s_1))$ the $s_1'$-dependence drops out because $\omega$ is invariant under isometries. So the integral over $s_1'$ can be performed, and yields a factor $\delta(s_1)$ in the limit $\varepsilon_1 \to 0$. Hence all $\lambda$-dependence drops out in this term, as claimed in (3.39). Similarly, in the term where $f_1, f_4$ and $f_2, f_3$ are contracted, all integrations disappear after using the invariance of $\omega$ and making the substitution $s_1' \to s_1' + \lambda Q s_1$. Also this term does not depend on $\lambda$. Finally, we compute the term containing the contractions $f_1, f_3$ and $f_2, f_4$,

$$(2\pi)^{-2} \lim_{\varepsilon_1 \to 0} \int ds_1\, ds_1'\, e^{-i s_1 s_1'}\, \hat{\chi}(\varepsilon_1 s_1, \varepsilon_1 s_1')\; \omega_2(f_1, f_3(s_1')) \cdot \omega_2(f_2(\lambda Q s_1), f_4(s_1' - \lambda Q s_1))$$

$$= (2\pi)^{-2} \lim_{\varepsilon_1 \to 0} \int ds_1\, ds_1'\, e^{-i s_1 s_1'}\, \hat{\chi}(\varepsilon_1 s_1, \varepsilon_1 s_1')\; \omega_2(f_1, f_3(s_1')) \cdot \omega_2(f_2, f_4(s_1' - 2\lambda Q s_1))$$

$$= (2\pi)^{-2} \lim_{\varepsilon_1 \to 0} \int ds_1\, ds_1'\, e^{-i s_1 s_1'}\, \tilde{\chi}(\varepsilon_1 s_1, \varepsilon_1 s_1')\; \omega_2(f_1, f_3(s_1)) \cdot \omega_2(f_2, f_4(2\lambda Q s_1')).$$

In the last step, we substituted $s_1 \to s_1 + \frac{1}{2\lambda} Q^{-1} s_1'$, used the antisymmetry of $Q$ and absorbed the new variables in $\hat{\chi}$ in a redefinition of this function. Since the oscillatory integrals are independent of the particular choice of cutoff function, comparison with (3.39) shows that the proof is finished.

The structure of the deformed n-point functions built from a quasifree state $\omega$ is quite different from the undeformed case. In particular, the two-point function is undeformed, but the four-point function depends on the deformation parameter. For even n > 4, a structure similar to the n-point functions on noncommutative Minkowski space [GL08] is expected, which are all $\lambda$-dependent. These features clearly show that the deformed field $B(f)_{\xi,\lambda}$, $\lambda \neq 0$, differs from the undeformed field. Considering the commutation relations of the deformed field operators, it is also straightforward to check that the deformed field is not unitarily equivalent to the undeformed one.

However, this structure does not yet imply that the deformed and undeformed Dirac quantum field theories are inequivalent. For there could exist a unitary $V$ on $H$ satisfying $V U(h) V^* = U(h)$, $h \in \mathrm{Iso}$, $V \Omega = \Omega$, which does not interpolate the deformed and undeformed fields, but the $C^*$-algebras according to $V F(W)_\lambda V^* = F(W)$, $W \in \mathcal{W}$. If such a unitary exists, the two theories would be physically indistinguishable.

On flat spacetime, an indirect way of ruling out the existence of such an intertwiner $V$, and thus establishing the non-equivalence of deformed and undeformed theories, is to compute their S-matrices and show that these depend on $\lambda$ in a non-trivial manner. However, on curved spacetimes, collision theory is not available and we will therefore follow the more direct non-equivalence proof of [BLS10, Lemma 4.6], adapted to our setting. This proof aims at showing that the local observable content of warped theories is restricted in comparison to the undeformed setting, as one would expect because of the connection to noncommutative spacetime. However, the argument requires a certain amount of symmetry, and we therefore restrict here to the case of a Friedmann-Robertson-Walker spacetime $M$.

As discussed in Sect. 2.2, $M$ can then be viewed as $J \times \mathbb{R}^3 \subset \mathbb{R}^4$ via a conformal embedding, where $J \subset \mathbb{R}$ is an interval. Recall that in this case, we have the Euclidean group E(3) contained in $\mathrm{Iso}(M, g)$, and can work in global coordinates $(\tau, x, y, z)$, where $\tau \in J$ and $x, y, z \in \mathbb{R}$ are the flow parameters of Killing fields. As reference Killing pair, we pick $\zeta := (\partial_y, \partial_z)$, and as reference wedge, the “right wedge” $W^0 := W_{\zeta,0} = \{(\tau, x, y, z) : \tau \in J,\ x > |\tau|\}$. In this geometric context, consider the rotation $r^\varphi$ by the angle $\varphi$ in the x-y-plane, and the cone

$$C := r^{\varphi} W^0 \cap r^{-\varphi} W^0, \tag{3.40}$$

with some fixed angle $|\varphi| < \frac{\pi}{2}$. Clearly $C \subset W^0$, and the reflected cone $j_x C$, where $j_x(t, x, y, z) = (t, -x, y, z)$, lies spacelike to $W^0$ and $r^\varphi W^0$. Moreover, we will work in the GNS-representation of a particular state $\omega$ on $F$ for the subsequent proposition, which besides the properties mentioned above also has the Reeh-Schlieder property. That is, the von Neumann algebra $F(C)'' \subset B(H^\omega)$ has $\Omega^\omega$ as a cyclic vector.

Since the Dirac field theory is a locally covariant quantum field theory satisfying the time slice axiom [San10], the existence of such states can be deduced by spacetime deformation arguments [San09, Thm. 4.1]. As $M$ and the Minkowski space have unique spin structures, and $M$ can be deformed to Minkowski spacetime in such a way that its E(3) symmetry is preserved, the state obtained from deforming the Poincaré invariant vacuum state on $\mathbb{R}^4$ is still invariant under the action of the Euclidean group. In the GNS representation of such a Reeh-Schlieder state on a Friedmann-Robertson-Walker spacetime, we find the following non-equivalence result.

Proposition 3.6. Consider the net $F_\lambda$ generated by the deformed Dirac field on a Friedmann-Robertson-Walker spacetime with flat spatial sections in the GNS representation of an invariant pure state with the Reeh-Schlieder property. Then the implementing vector $\Omega$ is cyclic for the field algebra $F(C)_\lambda$ associated with the cone (3.40) if and only if $\lambda = 0$. In particular, the nets $F_0$ and $F_\lambda$ are inequivalent for $\lambda \neq 0$.

Proof. Let $f \in K(C)$, so that $f, u(r^{-\varphi}) f \in K(W^0)$, and both field operators, $B(f)_{\zeta,\lambda}$ and $B(u(r^{-\varphi}) f)_{\zeta,\lambda}$, are contained in $F(W^0)_\lambda$. Taking into account the covariance of the deformed net, it follows that $U(r^\varphi) B(u(r^{-\varphi}) f)_{\zeta,\lambda} U(r^{-\varphi}) = B(f)_{r^\varphi_* \zeta,\lambda}$ lies in $F(r^\varphi W^0)_\lambda$.

Now the cone $C$ is defined in such a way that the two wedges $W^0$ and $r^\varphi W^0$ lie spacelike to $j_x C$. Let us assume that $\Omega$ is cyclic for $F(C)_\lambda$, which by the unitarity of $U(j_x)$ is equivalent to $\Omega$ being cyclic for $F(j_x C)_\lambda$. Hence $\Omega$ is separating for the commutant $F(j_x C)_\lambda'$, which by locality contains $F(W^0)_\lambda$ and $F(r^\varphi W^0)_\lambda$. But in view of (3.36), $B(f)_{\zeta,\lambda}$ and $B(f)_{r^\varphi_* \zeta,\lambda}$ coincide on $\Omega$,

$$B(f)_{r^\varphi_* \zeta,\lambda}\, \Omega = B(f)\, \Omega = B(f)_{\zeta,\lambda}\, \Omega,$$

so that the separation property implies $B(f)_{\zeta,\lambda} = B(f)_{r^\varphi_* \zeta,\lambda}$.

To produce a contradiction, we now show that these two operators are actually not equal. To this end, we consider a difference of four-point functions (3.39), with smooth vectors $f_1$, $f_2 := f$, $f_3 := f$, $f_4$, and Killing pairs $\zeta$ respectively $r^\varphi_* \zeta$. With the abbreviations $w^\varphi_{ij}(s) := \omega_2\bigl(f_i, u_{r^\varphi_* \zeta}(s) f_j\bigr)$, we obtain

$$\bigl\langle \Omega, B(f_1) \bigl( B(f)_{\zeta,\lambda} B(f)_{\zeta,\lambda} - B(f)_{r^\varphi_* \zeta,\lambda} B(f)_{r^\varphi_* \zeta,\lambda} \bigr) B(f_4)\, \Omega \bigr\rangle = \omega_4^{\zeta,\lambda}(f_1, f, f, f_4) - \omega_4^{r^\varphi_* \zeta,\lambda}(f_1, f, f, f_4) = (w^\varphi_{13} \star_\lambda w^\varphi_{24})(0) - (w^0_{13} \star_\lambda w^0_{24})(0),$$

where $\star_\lambda$ denotes the Weyl-Moyal star product on smooth bounded functions on $\mathbb{R}^2$, with the standard Poisson bracket given by the matrix (3.5) in the basis $\{\zeta_1, \zeta_2\}$. Now the asymptotic expansion of this expression for $\lambda \to 0$ gives in first order the difference of Poisson brackets [EGBV89]

$$\{w^\varphi_{13}, w^\varphi_{24}\}(0) - \{w^0_{13}, w^0_{24}\}(0) = \langle f_1, P_1^{\zeta} f \rangle \langle f, P_2^{\zeta} f_4 \rangle - \langle f_1, P_2^{\zeta} f \rangle \langle f, P_1^{\zeta} f_4 \rangle - \langle f_1, P_1^{r^\varphi_* \zeta} f \rangle \langle f, P_2^{r^\varphi_* \zeta} f_4 \rangle + \langle f_1, P_2^{r^\varphi_* \zeta} f \rangle \langle f, P_1^{r^\varphi_* \zeta} f_4 \rangle,$$

where all scalar products are in $K$ and $P_1^{r^\varphi_* \zeta}$, $P_2^{r^\varphi_* \zeta}$ denote the generators of $s \mapsto u_{r^\varphi_* \zeta}(s)$. By considering $f_4$ orthogonal to $P_1^{\zeta} f$ and $P_1^{r^\varphi_* \zeta} f$, we see that for $B(f)_{\zeta,\lambda} = B(f)_{r^\varphi_* \zeta,\lambda}$ it is necessary that $\langle f_1, (P_j^{\zeta} - P_j^{r^\varphi_* \zeta}) f \rangle = 0$. But varying $f, f_1$ within the specified limitations gives dense subspaces in $K$, i.e. we must have $P_j^{\zeta} = P_j^{r^\varphi_* \zeta}$. This implies that translations in a spacelike direction are represented trivially on the Dirac field, which is not compatible with its locality and covariance properties. So we conclude that the deformed field operator $B(f)_{r^\varphi_* \zeta,\lambda}$ is not independent of $\varphi$ for $\lambda \neq 0$, and hence the cyclicity assumption is not valid for $\lambda \neq 0$. Since on the other hand $\Omega$ is cyclic for $F(C)_0$ by the Reeh-Schlieder property of $\omega$, and a unitary $V$ leaving $\Omega$ invariant and mapping $F(C)_0$ onto $F(C)_\lambda$ would preserve this property, we have established that the nets $F_0$ and $F_\lambda$, $\lambda \neq 0$, are not equivalent.

4. Conclusions

In this paper we have shown how to apply the warped convolution deformation to quantum field theories on a large class of globally hyperbolic spacetimes. Under the requirement that the group of isometries contains a two-dimensional Abelian subgroup generated by two commuting spacelike Killing fields, many results known for Minkowski space theories were shown to carry over to the curved setting by formulating concepts like edges, wedges and translations in a geometrical language. In particular, it has been demonstrated in a model-independent framework that the basic covariance and wedge-localization properties we started from are preserved for all values of the deformation parameter $\lambda$. As a concrete example we considered a warped Dirac field on a Friedmann-Robertson-Walker spacetime in the GNS representation of a quasifree and Iso-invariant state with Reeh-Schlieder property. It was shown that the deformed models depend in a non-trivial way on $\lambda$, and violate the Reeh-Schlieder property for regions smaller than wedges. In view of the picture that the deformed models can be regarded as effective theories on a noncommutative spacetime, where strictly local quantities do
not exist because of space-time uncertainty relations, it is actually expected that they do not contain operators sharply localized in bounded spacetime regions for $\lambda \neq 0$.

At the current stage, it is difficult to give a clear-cut physical interpretation to the models constructed here since scattering theory is not available for quantum field theories on generic curved spacetimes. Nonetheless, it is interesting to note that in a field theoretic context the deformation often leaves invariant the two-point function (Prop. 3.5), the quantity which is most frequently used for deriving observable quantum effects in cosmology (for the example of quantised cosmological perturbations, see [MFB92]). So when searching for concrete scenarios where deformed quantum field theories can be matched to measurable effects, one has to look for phenomena involving the higher n-point functions.

There exist a number of interesting directions in which this research could be extended. As far as our geometrical construction of wedge regions in curved spacetimes is concerned, we limited ourselves here to edges generated by commuting Killing fields. This assumption rules out many physically interesting spacetimes, such as de Sitter, Kerr or Friedmann-Robertson-Walker spacetimes with compact spatial sections. An extension of the geometric construction of edges and wedges to such spaces seems to be straightforward and is expected to coincide with the notions which are already available. For example, in the case of four-dimensional de Sitter spacetime, edges have the topology of a two-sphere. Viewing the two-sphere as an SO(3) orbit, a generalization of the warped convolution deformation formula could involve an integration over this group instead of $\mathbb{R}^2$. A deformation scheme of $C^*$-algebras which is based on an action of this group is currently not yet available (see however [Bie07] for certain non-Abelian group actions).

In the de Sitter case, a different possibility is to base the deformation formula on the flow of a spacelike and a commuting timelike Killing field leaving a wedge invariant. The covariance properties of the nets which arise from such a procedure are somewhat different and require an adaptation of the kernel in the warped convolution [Mo10]. It is desirable to find more examples of adequate deformation formulas compatible with the basic covariance and (wedge-) localization properties of quantum field theory. A systematic exploration of the space of all deformations is expected to yield many new models and a better understanding of the structure of interacting quantum field theories.

Acknowledgements. C.D. gratefully acknowledges financial support from the Junior Fellowship Programme of the Erwin Schrödinger Institute and from the German Research Foundation DFG through the Emmy Noether Fellowship WO 1447/1-1. E.M.M. would like to thank the EU-network MRTN-CT-2006-031962, EU-NCG for financial support. We are grateful to Valter Moretti and Helmut Rumpf for helpful discussions.

References [ABD+ 05] Aschieri, P., Blohmann, C., Dimitrijevic, M., Meyer, F., Schupp, P., Wess, J.: A gravity theory on noncommutative spaces. Class. Quant. Grav. 22, 3511–3532 (2005) [Ara71] Araki, H.: On quasifree states of CAR and Bogoliubov automorphisms. Publ. Res. Inst. Math. Sci. Kyoto 6, 385–442 (1971) [Ara99] Araki, H.: Mathematical Theory of Quantum Fields. Int. Series of Monographs on Physics. Oxford: Oxford University Press, 1999 [BB99] Borchers, H.-J., Buchholz, D.: Global properties of vacuum states in de Sitter space. Annales Poincare Phys. Theor. A70, 23–40 (1999) [BDFP10] Bahns, D., Doplicher, S., Fredenhagen, K., Piacitelli, G.: Quantum Geometry on Quantum Spacetime: Distance, Area and Volume Operators. http://arxiv.org/abs/1005.2130v1 [hepth], 2010 [BDFS00] Buchholz, D., Dreyer, O., Florig, M., Summers, S.J.: Geometric modular action and spacetime symmetry groups. Rev. Math. Phys. 12, 475–560 (2000)



[BGK+ 08] Blaschke, D., Gieres, F., Kronberger, E., Schweda, M., Wohlgenannt, M.: Translation-invariant models for non-commutative gauge fields. J.Phys.A 41, 252002 (2008) [BGL02] Brunetti, R., Guido, D., Longo, R.: Modular localization and Wigner particles. Rev. Math. Phys. 14,759–786 (2002) [Bie07] Bieliavsky, P.: Deformation quantization for actions of the affine group. http://arxiv.org/abs/ 1011.2370v1 [math.Q4], 2010, [BLS10] Buchholz, D., Lechner, G., Summers, S.J.: Warped Convolutions, Rieffel Deformations and the Construction of Quantum Field Theories. Commun. Math. Phys. (2010). doi:10.1007/s00220010-1137-1 [BMS01] Buchholz, D., Mund, J., Summers, S.J.: Transplantation of Local Nets and Geometric Modular Action on Robertson-Walker Space-Times. Fields Inst. Commun. 30, 65–81 (2001) [Bor92] Borchers, H.-J.: The CPT theorem in two-dimensional theories of local observables. Commun. Math. Phys. 143, 315–332 (1992) [Bor00] Borchers, H.-J.: On revolutionizing quantum field theory with tomita’s modular theory. J. Math. Phys. 41, 3604–3673 (2000) [Bor09] Borchers, H.-J.: On the Net of von Neumann algebras associated with a Wedge and Wedgecausal Manifolds. Preprint (2009), available at http://www.theorie.physik.uni-goettingen.de/ forschung/qft/publications/ 2009 [BPQV08] Balachandran, A.P., Pinzul, A., Qureshi, B.A., Vaidya, S.: S-Matrix on the Moyal Plane: Locality versus Lorentz Invariance. Phys. Rev. D 77, 025020 (2008) [BR97] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics II. Berlin, Heidelberg-New York: Springer, 1997 [BS04] Buchholz D., Summers, S.J.: Stable quantum systems in anti-de Sitter space: Causality, independence and spectral properties. J. Math. Phys. 45, 4810–4831 (2004) [BS05] Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 
257, 43–50 (2005) [BS06] Bernal, A.N., Sánchez, M.: Further results on the smoothability of Cauchy hypersurfaces and Cauchy time functions. Lett. Math. Phys. 77, 183–197 (2006) [BS07] Buchholz, D., Summers, S.J.: String- and brane-localized fields in a strongly nonlocal model. J. Phys. A40, 2147–2163 (2007) [BS08] Buchholz, D., Summers, S.J.: Warped Convolutions: A Novel Tool in the Construction of Quantum Field Theories. In: Seiler, E., Sibold, K. (eds.) Quantum Field Theory and Beyond: Essays in Honor of Wolfhart Zimmermann, River Edge, NJ: World Scientific, 2008, pp. 107–121 [BW75] Bisognano, J.J., Wichmann, E.H.: On the duality condition for a hermitian scalar field. J. Math. Phys. 16, 985–1007 (1975) [CF84] Chandrasekhar, S., Ferrari, V.: On the nutku-halil solution for colliding impulsive gravitational waves. Proc. Roy. Soc. Lond. A396, 55 (1984) [Cha83] Chandrasekhar, S.: The mathematical theory of black holes. Oxford: Oxford University Press, 1983 [DFR95] Doplicher, S., Fredenhagen, K., Roberts, J.E.: The Quantum structure of space-time at the Planck scale and quantum fields. Commun. Math. Phys. 172, 187–220 (1995) [DHP09] Dappiaggi, C., Hack, T.-P., Pinamonti, N.: The extended algebra of observables for Dirac fields and the trace anomaly of their stress-energy tensor. Rev. Math. Phys. 21, 1241–1312 (2009) [DHR69] Doplicher, S., Haag, R., Roberts, J.E.: Fields, observables and gauge transformations. I. Commun. Math. Phys. 13, 1–23 (1969) [Dim82] Dimock, J.: Dirac quantum fields on a manifold. Trans. Amer. Math. Soc. 269(1), 133–147 (1982) [Dix77] Dixmier, J.: C*-Algebras. Amsterdam-New York-Oxford: North-Holland-Publishing Company, 1977 [EGBV89] Estrada, R., Gracia-Bondia, J.M., Varilly, J.C.: On asymptotic expansions of twisted products. J. Math. Phys. 30, 2789–2796 (1989) [Ell06] Ellis, G.F.R.: The bianchi models: then and now. Gen. Rel. Grav. 38, 1003–1015 (2006) [Foi83] Foit, J.J.: Abstract Twisted Duality for Quantum Free Fermi Fields. Publ. Res. Inst. 
Math. Sci. Kyoto 19, 729–741 (1983) [FPH74] Fulling, S.A., Parker, L., Hu, B.L.: Conformal energy-momentum tensor in curved spacetime: adiabatic regularization and renormalization. Phys. Rev. D10, 3905–3924 (1974) [FV02] Fewster, C.J., Verch, R.: A quantum weak energy inequality for Dirac fields in curved spacetime. Commun. Math. Phys. 225, 331–359 (2002) [Ger68] Geroch, R.P.: Spinor structure of space-times in general relativity. I. J. Math. Phys. 9, 1739–1744 (1968) [Ger70] Geroch, R.P.: Spinor structure of space-times in general relativity II. J. Math. Phys. 11, 343– 348 (1970)



[GGBI+ 04] Gayral, V., Gracia-Bondia, J.M., Iochum, B., Schucker, T., Varilly, J.C.: Moyal planes are spectral triples. Commun. Math. Phys. 246, 569–623 (2004) [GL07] Grosse, H., Lechner, G.: Wedge-Local Quantum Fields and Noncommutative Minkowski Space. JHEP 11, 012 (2007) [GL08] Grosse, H., Lechner, G.: Noncommutative Deformations of Wightman Quantum Field Theories. JHEP 09, 131 (2008) [GLRV01] Guido, D., Longo, R., Roberts, J.E., Verch, R.: Charged sectors, spin and statistics in quantum field theory on curved spacetimes. Rev. Math. Phys. 13, 125–198 (2001) [GW05] Grosse, H., Wulkenhaar, R.: Renormalisation of phi 4 theory on noncommutative R4 in the matrix base. Commun. Math. Phys. 256, 305–374 (2005) [Haa96] Haag, R.: Local Quantum Physics - Fields, Particles, Algebras. Berlin, Heidelberg-New York: Springer, 2nd -edition, 1996 [Kay85] Kay, B.S.: The double-wedge algebra for quantum fields on Schwarzschild and Minkowski space-times. Commun. Math. Phys. 100, 57 (1985) [Key96] Keyl, M.: Causal spaces, causal complements and their relations to quantum field theory. Rev. Math. Phys. 8, 229–270 (1996) [LR07] Lauridsen-Ribeiro, P.: Structural and Dynamical Aspects of the AdS/CFT Correspondence: a Rigorous Approach. PhD thesis, Sao Paulo, 2007, available at http://arxiv.org/abs/0712.0401v3 [math-ph], 2008 [LW10] Longo, R., Witten, E.: An Algebraic Construction of Boundary Quantum Field Theory. Commun. Math. Phys. (2010). doi:10.1007/s00220-010-1133-5 [MFB92] Mukhanov, V.F., Feldman, H.A., Brandenberger, R.H.: Theory of cosmological perturbations. part 1. classical perturbations. part 2. quantum theory of perturbations. part 3. extensions. Phys. Rept. 215, 203–333 (1992) [Mo10] Morfa-Morales, E.: Work in progress [O’N83] O’Neill, B.: Semi-Riemannian Geometry. London-New York: Academic Press, 1983 [OS09] Ohl, T., Schenkel, A.: Algebraic approach to quantum field theory on a class of noncommutative curved spacetimes. Gen. Rel. Grav. 42, 2785-2798 (2010) [Ped79] G.K. 
Pedersen. C*-Algebras and their Automorphism Groups. Academic Press, 1979 [PV04] Paschke, M., Verch, R.: Local covariant quantum field theory over spectral geometries. Class. Quant. Grav. 21, 5299–5316 (2004) [Reh00] Rehren, K.-H.: Algebraic Holography. Ann. Henri Poincaré 1, 607–623 (2000) [Rie92] Rieffel, M.A.: Deformation Quantization for Actions of R d . Volume 106 of Memoirs of the Amerian Mathematical Society. Providence, RI: Amer. Math. Soc., 1992 [San08] Sanders, K.: Aspects of locally covariant quantum field theory. PhD thesis, University of York, September, 2008, available at http://arxiv.org/abs/0809.4828v1 [math-ph], 2008 [San09] Sanders, K.: On the Reeh-Schlieder Property in Curved Spacetime. Commun. Math. Phys. 288, 271–285 (2009) [San10] Sanders, K.: The locally covariant Dirac field. Rev. Math. Phys. 22(4), 381–430 (2010) [Sol08] Soloviev, M.A.: On the failure of microcausality in noncommutative field theories. Phys. Rev. D77, 125013 (2008) [Ste07] Steinacker H. (2007) Emergent Gravity from Noncommutative Gauge Theory. JHEP 0712, 049 [Str08] Strich R.: Passive States for Essential Observers. J. Math. Phys. 49(022301), (2008) [Sza03] Szabo, R.J.: Quantum field theory on noncommutative spaces. Phys. Rept. 378, 207–299 (2003) [Tay86] Taylor, M.E.: Noncommutative Harmonic Analysis. Providence, RI: Amer. Math. Soc., 1986 [TW97] Thomas, L.J., Wichmann, E.H.: On the causal structure of minkowski space-time. J. Math. Phys. 38, 5044–5086 (1997) [Wal84] Wald, R.M.: General Relativity. Chicago, IL: University of Chicago Press, 1984 Communicated by Y. Kawahigashi

Commun. Math. Phys. 305, 131–151 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1215-z



The Vortex-Wave Equation with a Single Vortex as the Limit of the Euler Equation

Clayton Bjorland

Department of Mathematics, University of Texas, Austin, TX 78712-0257, USA. E-mail: [email protected]

Received: 25 June 2010 / Accepted: 4 September 2010
Published online: 20 February 2011 – © Springer-Verlag 2011

Abstract: In this article we consider the physical justification of the Vortex-Wave equation introduced by Marchioro and Pulvirenti (Mechanics, analysis and geometry: 200 years after Lagrange, North-Holland Delta Ser., Amsterdam, North-Holland, pp. 79–95, 1991), in the case of a single point vortex moving in an ambient vorticity. We consider a sequence of solutions for the Euler equation in the plane corresponding to initial data consisting of an ambient vorticity in $L^1 \cap L^\infty$ and a sequence of concentrated blobs which approach the Dirac distribution. We introduce a notion of a weak solution of the Vortex-Wave equation in terms of velocity (or primitive variables) and then show, for a subsequence of the blobs, the solutions of the Euler equation converge in velocity to a weak solution of the Vortex-Wave equation.

1. Introduction

The classical evolution of an incompressible, inviscid fluid is governed by the Euler equations. In $\mathbb{R}^2$ the vorticity formulation of the Euler equations is:

$$\partial_t w + v \cdot \nabla w = 0, \qquad v = K * w, \qquad w(\cdot, 0) = w_0. \tag{1}$$

Here, $w : \mathbb{R}^2 \times \mathbb{R}^+ \to \mathbb{R}$ represents the vorticity of the fluid and $v : \mathbb{R}^2 \times \mathbb{R}^+ \to \mathbb{R}^2$ the velocity. They are related through the curl operator $w = \nabla \times v$ and the Biot-Savart Law which inverts the curl operator. In notation we keep throughout, $K$ is the Biot-Savart kernel in two dimensions given by

$$K(x) = \frac{1}{2\pi} \frac{x^\perp}{|x|^2}.$$
In this formulation the vorticity is transported by the velocity v. The velocity is divergence free reflecting the incompressibility of the fluid. A classic existence theorem (originally due to Yudovich [15] in bounded domains) states:


Theorem 1. Given initial vorticity $w_0 \in L^1 \cap L^\infty(\mathbb{R}^2)$ there exists a unique weak solution $w \in L^\infty([0, T]; L^1 \cap L^\infty(\mathbb{R}^2))$ for the vorticity formulation of the Euler equations (1).

Many proofs of this theorem can be found in the literature; for example, see [8, Ch. 8]. In fact the literature relating to the Euler equation is quite large and instead of listing numerous results we point the interested reader to [8] and [12] and the references therein. In some situations, if the vorticity is reasonable except for N concentrated regions where it is very large, it may be better to consider the Vortex-Wave system given by:

$$\partial_t \omega + \Bigl( u + \sum_{i=1}^{N} h_i \Bigr) \cdot \nabla \omega = 0,$$
$$u = K * \omega, \qquad \omega(\cdot, 0) = \omega_0,$$
$$h_i = a_i K(x - z_i(t)), \qquad z_i(0) = z_0,$$
$$\frac{d}{dt} z_i(t) = u(z_i, t) + \sum_{j \neq i} h_j(z_i, t). \tag{2}$$

Here, $h_i$ represents the velocity of a point vortex with strength $a_i$ moving along the path $z_i$. Each point vortex is moved by the ambient velocity associated with $\omega$ as well as the velocity associated with the other point vortices. In turn, the point vortices influence the flow of the ambient vorticity. A solution of this system consists of the ambient vorticity $\omega$ along with the paths of point vortices given by $z_i(t)$. The Vortex-Wave equations were introduced by Marchioro and Pulvirenti [11] where an existence theorem was given by constructing Lagrangian paths. Concerning this system the authors of [11] point out two outstanding questions:

I. Is the solution of the Vortex-Wave equations unique?

II. Consider solutions of the Euler equations starting with a sequence of initial data which is the sum of an ambient vorticity $\omega_0$ and concentrated vorticity “blobs.” Do these solutions converge to a solution of the Vortex-Wave equations as the blobs converge to Dirac distributions?

Herein we consider the second question with N = 1 but first point the interested reader to [5] where the first question is studied under the assumption that the initial ambient vorticity $\omega_0$ is constant near the point vortices. The second question relates to the physical justification of the model as it is intended to model a sharply concentrated vorticity which is not necessarily a point vortex. If the initial ambient vorticity $\omega_0$ is assumed zero, a satisfactory answer for the second question has been given in [9] and [10] (see also the references therein). When $\omega_0 \neq 0$, “mixing” of the ambient vorticity due to the concentrated vorticity complicates the issue significantly. We consider the case of one vortex blob and ambient vorticity initially in $L^1 \cap L^\infty$, then take the limit as the blob tends towards the Dirac distribution.

To start making this idea precise, consider the Euler equation with initial data $w_0^\epsilon = \omega_0 + \delta_0^\epsilon$. We assume $\omega_0 \in L^\infty \cap L^1(\mathbb{R}^2)$ and $\delta_0^\epsilon = \epsilon^{-2} \chi_{\Omega_\epsilon}(x)$, where $\Omega_\epsilon$ is an open region such that $|\Omega_\epsilon| = \epsilon^2$. Here $\omega_0$ generates the ambient velocity field and $\delta_0^\epsilon$ is the “blob” which approximates a point vortex. Such initial data belongs to $L^\infty \cap L^1(\mathbb{R}^2)$ (although not uniformly in $\epsilon$) so Theorem 1 implies the existence of a solution $w^\epsilon(x, t) \in L^1 \cap L^\infty(\mathbb{R}^2)$ for all $t \geq 0$. The question we attack: Do these solutions tend to a solution of the Vortex-Wave system as $\epsilon \to 0$?


To answer this question we decompose the solution $w^\epsilon$ into a part which corresponds to $\delta_0^\epsilon$ and another part which corresponds to the ambient vorticity $\omega_0$. This is done by considering, a posteriori, a linear transport equation which transports the vorticity along paths with velocity $v^\epsilon = K * w^\epsilon$. This decomposition is the main topic of Subsect. 2.1 and yields $\omega^\epsilon(x, t)$ and $\delta^\epsilon(x, t)$ which correspond to $\omega_0$ and $\delta_0^\epsilon$ respectively. From here the main convergence theorem of this paper is argued in two parts: First, the solutions of the Euler equations are shown to be approximations of the Vortex-Wave equation in a precise sense; this is recorded in Lemma 1 below. Second, these approximations are shown to converge to a weak solution of the Vortex-Wave equation. This is Theorem 2 below.

One aspect of the analysis we would like to point out is that the convergence arguments are made for the velocity of the solutions (sometimes called primitive variables). In contrast, Eqs. (1) and (2) are written as transport equations for the vorticity. One natural approach with transport equations is to consider the paths along which the vorticity moves, finding uniform bounds and applying compactness theorems to the paths. In this situation it is difficult to obtain bounds on such paths because the velocity associated with the concentrated blob becomes unbounded in the limit. To circumvent this difficulty we return to primitive variables and our convergence arguments follow the program of study initiated by DiPerna and Majda in [2–4].

We have “$L^1$ and $L^\infty$ control” over the ambient vorticity $\omega^\epsilon$ but we do not have this control over the whole system because the vorticity associated with the concentrated blob approximates a Dirac distribution and therefore becomes unbounded in $L^p$ for $p > 1$. On the other hand, generalizing the arguments in [10] we are able to deduce considerable information about the sequence of vortex blobs by analyzing the moments:

$$M^\epsilon(t) = \int_{\mathbb{R}^2} x\, \delta^\epsilon(x, t)\, dx, \tag{3}$$

$$I^\epsilon(t) = \int_{\mathbb{R}^2} (x - M^\epsilon(t))^2\, \delta^\epsilon(x, t)\, dx. \tag{4}$$

In some sense the extra information about $M_\epsilon$ and $I_\epsilon$ allows us to treat the system as having "$L^1$ and $L^p$ control" for $p \in (1,\infty)$. The uniform bounds on $M_\epsilon$ and $I_\epsilon$ as $\epsilon \to 0$ are the key estimates used to show the approximate solutions converge to weak solutions of the Vortex-Wave equation. In [10] the authors assume the blobs are moving in an ambient velocity field which is Lipschitz, but that is too strong for velocity fields generated by vorticity in $L^\infty \cap L^1$. In the latter case we only have a "log-Lipschitz" sense of continuity (see Definition 2 below). Nevertheless the analysis in [10] is quite robust and the general ideas can be pushed through in this more general setting; this is detailed in Sect. 3.

Before we state the main theorem of this paper we need two more assumptions concerning the initial structure of the vortex blobs. These assumptions are not very restrictive and merely force the blob to initially behave like a reasonable approximation to the Dirac distribution.

Assumption.

$$\delta_0^\epsilon = \frac{1}{\epsilon^2}\,\chi_{\Lambda_\epsilon}. \tag{5}$$

Here $\chi_{\Lambda_\epsilon}$ is the indicator function of the set $\Lambda_\epsilon$, and the set $\Lambda_\epsilon$ has measure $\epsilon^2$.


C. Bjorland

Assumption. For any continuous bounded function $f : \mathbb{R}^2 \to \mathbb{R}$,

$$\lim_{\epsilon \to 0} \int_{\mathbb{R}^2} f(y)\,\delta_0^\epsilon(y)\,dy = f(z_0). \tag{6}$$

This assumption indicates that $\delta_0^\epsilon$ approximates the Dirac distribution at the point $z_0$ and will be assumed throughout. We reserve the label $z_0$ specifically for this point. The next assumption will only be used at the very end of our analysis when proving the convergence associated with the term $h^\epsilon \cdot \nabla h^\epsilon$ in Sect. 5.

Assumption.

$$I_\epsilon(0) \le C\epsilon^2. \tag{7}$$
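As an illustration of the assumptions on the initial blob, the following sketch computes the first and second moments of a uniform blob numerically. It assumes, purely for the example, that the blob is a disc of area $\epsilon^2$ centered at $z_0$ (the text only requires a set of measure $\epsilon^2$); the function name `blob_moments` and the grid quadrature are hypothetical choices, not part of the paper. For a disc the second moment is $\epsilon^2/(2\pi)$, so the ratio $I_\epsilon(0)/\epsilon^2$ should stay bounded, in line with the bound $I_\epsilon(0) \le C\epsilon^2$.

```python
import math


def blob_moments(center, eps, n=200):
    """Moments of the density eps**-2 * indicator(disc of area eps**2).

    The disc shape is an assumption made for this example only.
    Returns (M, I): the center of vorticity and the second moment about it.
    """
    radius = eps / math.sqrt(math.pi)  # disc of area eps**2
    h = 2.0 * radius / n               # grid spacing on the bounding square
    pts = []
    for i in range(n):
        for j in range(n):
            dx = (i + 0.5) * h - radius
            dy = (j + 0.5) * h - radius
            if dx * dx + dy * dy <= radius * radius:
                pts.append((center[0] + dx, center[1] + dy))
    w = 1.0 / len(pts)                 # equal weights, total mass 1
    mx = sum(p[0] for p in pts) * w
    my = sum(p[1] for p in pts) * w
    second = sum((p[0] - mx) ** 2 + (p[1] - my) ** 2 for p in pts) * w
    return (mx, my), second


z0 = (0.3, -0.2)
for eps in (0.1, 0.01):
    M, I = blob_moments(z0, eps)
    # M should sit at z0 and I / eps**2 should stay near 1 / (2 * pi)
    print(eps, M, I / eps ** 2)
```

In this example the first moment coincides with $z_0$ for every $\epsilon$, and the second moment shrinks like $\epsilon^2$, which is the behavior the two moment assumptions are meant to capture.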

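We now state a lemma describing properties of the approximate sequence. These results follow quickly from the fact that our approximate sequence is derived from solutions of the Euler equation. The solutions stated in the lemma are exactly those sketched earlier in the Introduction but made precise in Subsect. 2.1.

Lemma 1. Let $w^\epsilon$ be the solution of (1) given by Theorem 1 with initial data $w_0^\epsilon = \omega_0 + \delta_0^\epsilon$, where $\omega_0 \in L^1 \cap L^\infty(\mathbb{R}^2)$ and $\delta_0^\epsilon$ satisfies (5)-(7). Write $v^\epsilon = K * w^\epsilon$ and let $\omega^\epsilon$ and $\delta^\epsilon$ solve, respectively:

$$\partial_t \omega^\epsilon + v^\epsilon \cdot \nabla \omega^\epsilon = 0, \qquad \omega^\epsilon(\cdot, 0) = \omega_0,$$

$$\partial_t \delta^\epsilon + v^\epsilon \cdot \nabla \delta^\epsilon = 0, \qquad \delta^\epsilon(\cdot, 0) = \delta_0^\epsilon.$$

Denote by $u^\epsilon = K * \omega^\epsilon$ and $h^\epsilon = K * \delta^\epsilon$ the corresponding velocities. Then the following properties hold:

(i) The sequence $u^\epsilon$ is uniformly bounded in $L^\infty(\mathbb{R}^2)$, that is

$$\sup_{t \in [0,T]} \|u^\epsilon(t)\|_{L^\infty(\mathbb{R}^2)} < C.$$

(ii) $\nabla \cdot u^\epsilon = 0$ and $\nabla \cdot h^\epsilon = 0$, where the divergence is understood in the weak sense.

(iii) Given any test function $\Phi \in C_\sigma^\infty(\mathbb{R}^2 \times [0,T])$:

$$0 = \int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot u^\epsilon + \nabla\Phi : u^\epsilon \otimes u^\epsilon + \nabla\Phi : h^\epsilon \otimes u^\epsilon \, dx\, dt + \int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot h^\epsilon + \nabla\Phi : u^\epsilon \otimes h^\epsilon + \nabla\Phi : h^\epsilon \otimes h^\epsilon \, dx\, dt. \tag{8}$$

(iv) The sequence $\omega^\epsilon$ is uniformly bounded in $L^1 \cap L^\infty(\mathbb{R}^2)$, that is

$$\sup_{t \in [0,T]} \|\omega^\epsilon(t)\|_{L^\infty \cap L^1(\mathbb{R}^2)} < C.$$

(v) The sequence $u^\epsilon$ is uniformly bounded in $\mathrm{Lip}([0,T], H^{-L}_{loc}(\mathbb{R}^2))$ for some $L > 0$, that is

$$\|\rho u^\epsilon(t_1) - \rho u^\epsilon(t_2)\|_{H^{-L}(\mathbb{R}^2)} \le C|t_1 - t_2|, \qquad 0 \le t_1, t_2 \le T, \quad \forall \rho \in C_c^\infty(\mathbb{R}^2).$$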



In the above lemma we have used the notation $A : w \otimes v = \sum_{i,j} A_{ij} w_i v_j$ and $(\nabla\Phi)_{ij} = \partial_i \Phi_j$. $C_\sigma^\infty(\Omega)$ denotes the space of smooth divergence free vector fields with compact support in $\Omega$. We will also use $C_c^\infty(\Omega)$ to denote the set of smooth scalar valued functions with compact support in $\Omega$. Condition (v) is a technical consideration introduced in [2]. It describes how the solution attains the initial data and also plays a role in compactness arguments. Before stating the main convergence theorem we give the definition of a weak solution for the Vortex-Wave equation in primitive variables.

Definition 1. A weak solution in primitive variables to the Vortex-Wave equation on $[0,T]$ with a single vortex consists of a path $z(t) : \mathbb{R}_+ \to \mathbb{R}^2$ and a velocity $u \in L^\infty([0,T], L^2_{loc}(\mathbb{R}^2))$ such that the following hold:

(i) $\nabla \cdot u = 0$ in the weak sense.

(ii) Given any test function $\Phi \in C_\sigma^\infty(\mathbb{R}^2 \times [0,T] \setminus \{(z(t),t)\}_{t \in [0,T]})$,

$$0 = \int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot u + \nabla\Phi : u \otimes u + \nabla\Phi : h \otimes u \, dx\, dt + \int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot h + \nabla\Phi : u \otimes h \, dx\, dt, \tag{9}$$

where $h(x,t) = K(x - z(t))$.

(iii) The velocity $u$ belongs to $\mathrm{Lip}([0,T], H^{-L}_{loc}(\mathbb{R}^2))$ for some $L > 0$ and $u(0) = u_0 \in H^{-L}_{loc}$.

The support of the test functions in the above definition is not allowed to contain the path $(z(t), t)$. This allows us to pass the limits through the approximate sequence, avoiding some of the complications involved with a point concentration of vorticity. The statement $u \in L^\infty([0,T], L^2_{loc}(\mathbb{R}^2))$ is understood to mean

$$\max_{0 \le t \le T} \int_{B_R} |u|^2 \, dx \le C_{R,T}$$

for any $R > 0$. Here (and throughout) $B_R$ denotes the ball of radius $R$. The first integral in (9) corresponds to the first line in (2) written in primitive variables while the second corresponds to the ODE governing the flow of the vortex. To interpret the solution of (2) constructed through Lagrangian methods in this setting one should add and subtract the cross term indicated by (14) and (15) to separate these two parts. Now we state the main theorem of the paper, whose proof is given in Sect. 5.

Theorem 2. Let $u^\epsilon(x,t)$ be the sequence of solutions given by Lemma 1. Let the approximate point vortex paths be given by

$$z^\epsilon(t) = z_0 + \int_0^t u^\epsilon(z^\epsilon(s), s)\, ds. \tag{10}$$

Then there exists a velocity $u \in L^\infty([0,T], L^2_{loc}(\mathbb{R}^2))$, a path $z(t) \in C([0,T]; \mathbb{R}^2)$, and a subsequence $\epsilon \to 0$ (which we do not relabel) such that for any $R > 0$,

$$\int_0^T \int_{B_R} |u^\epsilon - u| \, dx\, dt \to 0, \qquad \int_0^T \int_{B_R} |u^\epsilon - u|^2 \, dx\, dt \to 0, \qquad \sup_{t \in [0,T]} |z^\epsilon(t) - z(t)| \to 0.$$


Moreover, $u$ and $z$ are a weak solution of the Vortex-Wave equation on $[0,T]$ in the sense of Definition 1.

Two final remarks: The convergence proved herein is global in the following sense. One fixes the interval $[0,T]$, where $T > 0$ is arbitrary; then, following the arguments below, the convergence will hold on $[0,T]$. Concerning the terms involving $h^\epsilon$ we point out that the convergence is given in primitive variables but our control over the blobs is in terms of the vorticity. To bridge this gap we use a technique which is based on changing the order of integration to "give" the kernel $K$ to the other terms in the limit. Heuristically, to show

$$\int_{\mathbb{R}^2} \phi(x) u^\epsilon(x) h^\epsilon(x)\, dx - \int_{\mathbb{R}^2} \phi(x) u^\epsilon(x) h(x)\, dx \to 0,$$

we would consider instead

$$\int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \phi(x) u^\epsilon(x) K(y - x)\, dx \right) \delta^\epsilon(y)\, dy - \int_{\mathbb{R}^2} \phi(x) u^\epsilon(x) K(z(t) - x)\, dx.$$

As indicated by Definition 1 the test functions $\phi$ will be supported away from $z(t)$ so that $K(z(t) - x)$ is smooth in the domain of integration for the second integral. The kernel $K$ has a smoothing effect on the term $\phi u^\epsilon$ recorded in Lemma 5. For reasonable functions this smoothing effect combined with the moment bounds is enough to imply convergence; this general statement is recorded in Lemma 6. The smoothing effect of the Biot-Savart kernel has been used before to prove existence of vortex sheets, see [1] (also [6, 7, 13]). In those situations a sequence of initial data approximates a bounded measure and estimates are obtained to show the vorticity does not concentrate to a point in the limit. In our situation the vorticity is necessarily concentrated to a point on the path of the vortex $z(t)$, but this path is not in the support of test functions allowed in Definition 1, so we sidestep some of the complications associated with vorticity concentrating to a point. In trade we are required to establish the movement of the vortex through (9).

It is evident from the statements of Lemma 1 and Theorem 2 that our analysis relies on classical theorems for the Euler equation and the work of DiPerna and Majda. When similar claims are found in the literature we only sketch proofs or point the reader to the literature, and focus our efforts on the properties which are particular to this problem or not readily available in the literature. Specifically, many aspects of Sect. 2 are direct applications of the existing literature relating to the Euler equation so we outline the proofs. Sections 3 through 5 are particular to this problem so we give more detail therein.

2. Construction of the Approximate Sequence and Solution Candidate

2.1. Decomposition of solutions for the Euler equation. Consider the vorticity formulation of the Euler equation with initial data as described in the Introduction: $w_0^\epsilon = \omega_0 + \delta_0^\epsilon$. We take $\omega_0 \in L^\infty \cap L^1(\mathbb{R}^2)$ and $\delta_0^\epsilon$ to satisfy (5)-(7). We take the vortex to have unit "strength" so

$$\int_{\mathbb{R}^2} \delta_0^\epsilon(x)\, dx = 1 \qquad \forall \epsilon > 0.$$



This is not a restriction as the analysis herein can be carried out replacing $\delta_0^\epsilon$ with $a\delta_0^\epsilon$, where $a \in \mathbb{R}$. The initial data $w_0^\epsilon$ belongs to $L^\infty \cap L^2(\mathbb{R}^2)$ (though not uniformly in $\epsilon$) so Theorem 1 implies the existence of a solution $w^\epsilon(x,t) \in L^\infty([0,T]; L^1 \cap L^\infty(\mathbb{R}^2))$. Let $v^\epsilon = K * w^\epsilon$ be the associated velocity. Now consider a related linear transport equation:

$$\partial_t f + v^\epsilon \cdot \nabla f = 0. \tag{11}$$

Solve this equation separately with initial data $\omega_0$ and $\delta_0^\epsilon$. Denote the solution of this equation corresponding to initial data $\omega_0(x)$ by $\omega^\epsilon(x,t)$ and the solution corresponding to $\delta_0^\epsilon(x)$ by $\delta^\epsilon(x,t)$. Furthermore, let $u^\epsilon(x,t) = K * \omega^\epsilon$ and $h^\epsilon = K * \delta^\epsilon$ denote the associated velocities respectively. Using the linearity of Eq. (11) and the uniqueness of solutions for the Euler equation given by Theorem 1 we can conclude

$$w^\epsilon(x,t) = \omega^\epsilon(x,t) + \delta^\epsilon(x,t), \qquad v^\epsilon(x,t) = u^\epsilon(x,t) + h^\epsilon(x,t).$$

Equation (11) is a transport equation so the divergence free property of $v^\epsilon$ implies solutions will conserve their $L^p$ norms. We record this property here because we will use it later:

$$\|\omega^\epsilon(t)\|_{L^p(\mathbb{R}^2)} \le \|\omega_0\|_{L^p(\mathbb{R}^2)} \qquad \forall p \in [1,\infty], \tag{12}$$

$$\|\delta^\epsilon(t)\|_{L^1(\mathbb{R}^2)} \le \|\delta_0^\epsilon\|_{L^1(\mathbb{R}^2)} = 1. \tag{13}$$

2.2. Solution candidate. Returning to primitive variables we rewrite the solutions constructed in the previous subsection:

$$\partial_t u^\epsilon + u^\epsilon \cdot \nabla u^\epsilon + h^\epsilon \cdot \nabla u^\epsilon - \nabla u^\epsilon \cdot h^\epsilon + \nabla p_u = 0, \tag{14}$$

$$\partial_t h^\epsilon + u^\epsilon \cdot \nabla h^\epsilon + h^\epsilon \cdot \nabla h^\epsilon + \nabla u^\epsilon \cdot h^\epsilon + \nabla p_h = 0. \tag{15}$$

Here $u^\epsilon$ and $h^\epsilon$ are divergence-free functions and $p_u$ and $p_h$ are the associated pressures. The term $\nabla u^\epsilon \cdot h^\epsilon = \sum_i h_i^\epsilon \nabla u_i^\epsilon$ balances the other terms involving $u^\epsilon$ and $h^\epsilon$ so that when the curl is taken one recovers the transport equations governing $\omega^\epsilon$ and $\delta^\epsilon$. These equations should be interpreted in a weak sense. We now sketch the proof of Lemma 1.

Proof of Lemma 1. These statements follow directly from well known properties of the Biot-Savart kernel and solutions of the Euler equation, so we only indicate parts of the proof here. Properties (ii) and (iii) are just the weak interpretation of (14) and (15) added together. Property (iv) is a direct consequence of (12). By a well known estimate of the Biot-Savart kernel (see [8, p. 313]) Property (iv) implies Property (i). Checking Property (v) is straightforward using (14) and following Appendix A in [2] (or [8, p. 395]). The main difference in this setting is the terms with $h^\epsilon$, which are handled in the following way. First, (13) implies the sequence $\delta^\epsilon$ is uniformly bounded in $L^\infty([0,T]; L^1(\mathbb{R}^2))$. Also, the Biot-Savart kernel $K \in L^p_{loc}$ for any $p \in [1,2)$, so

$$\sup_{t \in [0,T]} \|\rho h^\epsilon(t)\|_{L^p(\mathbb{R}^2)} < C \qquad \forall \rho \in C_c^\infty(\mathbb{R}^2),\ p \in [1,2). \tag{16}$$

In this bound the constant $C$ depends on the support of $\rho$ but not on $\epsilon$. Combining this with Item (i) we deduce

$$\sup_{t \in [0,T]} \|\rho (h^\epsilon(t) \otimes u^\epsilon(t))\|_{L^p(\mathbb{R}^2)} < C \qquad \forall \rho \in C_c^\infty(\mathbb{R}^2),\ p \in [1,2).$$


Writing $\nabla u^\epsilon$ in terms of $\omega^\epsilon$ using a singular integral operator, then using the Calderón-Zygmund inequality (see [8, p. 73] and [14]) one finds $\|\nabla u^\epsilon(t)\|_{L^p(\mathbb{R}^2)} \le C_p \|\omega^\epsilon(t)\|_{L^p(\mathbb{R}^2)}$ for any $p \in (1,\infty)$. Combining this with Property (iv) and (16) establishes

$$\sup_{t \in [0,T]} \|\rho (\nabla u^\epsilon(t) \cdot h^\epsilon(t))\|_{L^p(\mathbb{R}^2)} < C \qquad \forall \rho \in C_c^\infty(\mathbb{R}^2),\ p \in (1,2).$$

This is enough to establish (v) following the arguments in Appendix A of [2].

With the properties established in Lemma 1 we can use Theorems 10.1 and 10.2 in [8] to find the solution candidate. This next lemma establishes many of the properties stated in Theorem 2.

Lemma 2. There exists a function $u \in L^\infty([0,T], L^2_{loc}(\mathbb{R}^2))$ and a subsequence $u^\epsilon$ (which we do not relabel) such that for any $R > 0$,

$$\int_0^T \int_{B_R} |u^\epsilon - u| \, dx\, dt \to 0, \tag{17}$$

$$\int_0^T \int_{B_R} |u^\epsilon - u|^2 \, dx\, dt \to 0. \tag{18}$$

Proof. This is a direct application of the work done in proving Theorems 10.1 and 10.2 in [8]. Those theorems in turn rely on Properties (i), (iv), and (v) in Lemma 1 in combination with the Lions-Aubin Compactness Lemma, the Calderón-Zygmund Inequality, and the Sobolev Inequality.

This lemma rules out "oscillations" in the limiting process. In [8], Theorems 10.1 and 10.2, the authors assume an additional comparability with the Euler equation and deduce existence of a solution for the Euler equation. Here Property (iii) of Lemma 1 holds instead, and we prove later that the function $u$ given above is a weak solution for the Vortex-Wave equation in the sense of Definition 1.

2.3. Path of the point vortex. Now we are in a position to construct the path of the point vortex.

Lemma 3. Let $u^\epsilon(x,t)$ be the sequence of solutions given by Lemma 1. Furthermore, let $z^\epsilon(t)$ be defined by (10) for $\epsilon > 0$. Then there exists a subsequence $z^\epsilon$ and a continuous function $z(t)$ such that

$$\sup_{t \in [0,T]} |z^\epsilon(t) - z(t)| \to 0.$$

Proof. Property (i) of Lemma 1 implies $u^\epsilon \in L^\infty(\mathbb{R}^2)$ uniformly in time and uniformly in $\epsilon$. Therefore $z^\epsilon(t)$ is a sequence of equicontinuous and uniformly bounded functions on some set $B_R(z_0) \times [0,T]$. The Arzelà-Ascoli theorem gives uniform convergence of a subsequence to a continuous function $z(t)$.

In the above lemma we do not prove $z(t)$ solves $\dot{z}(t) = u(z(t), t)$ although that is what we expect. In the final section we will show $h(x,t) = K(z(t) - x)$ satisfies (9), which contains some of the information from this ODE.



3. Estimates for the Vorticity Moments

In this section we work in a slightly simplified situation and prove a theorem which is a generalization of Theorem 2.1 in [10]. Later it will be applied to the sequence constructed in the preceding section. We consider a single vortex blob moving in an external field $F$.

Definition 2. We say a vector field $v(x)$ is "log-Lipschitz" if $|v(x) - v(y)| \le C\phi(|x - y|)$ for all $x, y$, where $\phi$ is given by:

$$\phi(|x|) = \begin{cases} |x|(1 - \log|x|) & \text{if } |x| < 1, \\ 1 & \text{otherwise.} \end{cases}$$

Let $F(x,t) : \mathbb{R}^2 \times \mathbb{R}_+ \to \mathbb{R}^2$ be a divergence free "uniformly log-Lipschitz" vector field, that is $|F(x,t) - F(y,t)| \le C_F \phi(|x - y|)$ for all $t \in [0,T]$. In the subsequent sections we will choose $F = u^\epsilon$ to apply the analysis of this section to the general situation. It is well known that $\omega \in L^\infty([0,T]; L^1 \cap L^\infty(\mathbb{R}^2))$ implies $u = K * \omega$ is uniformly log-Lipschitz with constant depending only on the $L^1 \cap L^\infty$ norm of $\omega$ (see [8, p. 315]). Therefore we are careful throughout to bound only using $C_F$ as described above, so that when we replace $F$ by $u^\epsilon$ we will have uniform bounds. Within this section $\delta^\epsilon(x,t)$ denotes the solution of

$$\partial_t \delta^\epsilon(x,t) + (h^\epsilon + F) \cdot \nabla \delta^\epsilon = 0, \qquad \delta^\epsilon(x,0) = \epsilon^{-2} \chi_{\Lambda_\epsilon}(x), \qquad h^\epsilon = K * \delta^\epsilon.$$

Associated with this system are the equations for the trajectory $x^\epsilon(t, x_0)$ of a particle initially at position $x_0$ and moved by the velocity field $h^\epsilon + F$:

$$\frac{d}{dt} x^\epsilon(t, x_0) = h^\epsilon(x^\epsilon(t, x_0), t) + F(x^\epsilon(t, x_0), t), \qquad x^\epsilon(0, x_0) = x_0.$$

The first system above is the situation viewed in Eulerian coordinates while the second is the situation viewed in Lagrangian coordinates. We will use both. For the trajectory starting at $z_0$ (corresponding to the initial position of the blob) we reserve the notation $z^\epsilon(t)$.

3.1. Uniform moment bounds. Define:

$$M_\epsilon(t) = \int_{\mathbb{R}^2} x\, \delta^\epsilon(x,t)\, dx = \epsilon^{-2} \int_{\Lambda_\epsilon} x^\epsilon(t, x)\, dx,$$

$$I_\epsilon(t) = \int_{\mathbb{R}^2} (x - M_\epsilon(t))^2 \delta^\epsilon(x,t)\, dx = \epsilon^{-2} \int_{\Lambda_\epsilon} (x^\epsilon(t, x) - M_\epsilon(t))^2\, dx.$$

The first goal of this section is to investigate how these quantities change through motion in the ambient fluid $F$. We note that in the absence of the ambient vector field ($F = 0$) both quantities are conserved. Let $\Lambda_t^\epsilon$ be the image of $\Lambda_\epsilon$ under the mapping $x_0 \to x^\epsilon(t, x_0)$. Then,

$$\frac{d}{dt} M_\epsilon(t) = \epsilon^{-2} \int_{\Lambda_t^\epsilon} F(x,t)\, dx. \tag{19}$$


The above relation is exactly Eq. (2.10) in [10] so we offer it without proof. Also,

$$\frac{d}{dt} I_\epsilon(t) = 2\epsilon^{-2} \int_{\Lambda_\epsilon} (x^\epsilon(t, x) - M_\epsilon(t)) \cdot F(x^\epsilon(t, x), t)\, dx.$$

Using the log-Lipschitz property of $F$ yields

$$\left| \frac{d}{dt} I_\epsilon(t) \right| \le 2 C_F \phi(I_\epsilon(t)).$$

The above two relations are exactly (2.12) and (2.13) in [10] but using the log-Lipschitz property instead of the Lipschitz property. If $I_\epsilon(0) \le \exp(-\exp C_F T)$ this implies (see for example [8, p. 319]):

$$I_\epsilon(t) \le e\, I_\epsilon(0)^{\exp(-C_F t)} \qquad \forall t \in [0,T]. \tag{20}$$

Assumption (6) assures this condition on $I_\epsilon(0)$ can be enforced by taking $\epsilon$ sufficiently small once $T$ is fixed.

3.2. Convergence of first moments.

Theorem 3. Let $\delta_0^\epsilon$ satisfy assumption (6) and fix $T > 0$. Then,

$$\lim_{\epsilon \to 0} \sup_{t \in [0,T]} |M_\epsilon(t) - z^\epsilon(t)| = 0.$$

In fact, if $I_\epsilon(0) \le \exp(-\exp C_F T)$, then

$$|M_\epsilon(t) - z^\epsilon(t)| \le e \left( |z_0 - M_\epsilon(0)| + p T C_F \left( e\, I_\epsilon(0)^{\exp(-C_F t)} \right)^{\frac{1}{2q}} \right)^{\exp(-C_F t)} \tag{21}$$

for all $t \in [0,T]$, where $\frac{1}{p} + \frac{1}{q} = 1$, and $p, q \in (1,\infty)$.

The bound (21) is recorded because it will be used to show how $h^\epsilon \otimes h^\epsilon \to h \otimes h$ in the final section.

Proof. This proof is argued essentially as Theorem 2.1 in [10] but using the weaker assumption on $F$. Assumption (6) implies $M_\epsilon(0) \to z_0$ and $I_\epsilon(0) \to 0$, so we may choose $\epsilon$ small enough that $I_\epsilon(0) \le \exp(-\exp C_F T)$ and (20) holds. We begin with a consequence of (19):

$$|z^\epsilon(t) - M_\epsilon(t)| \le |z_0 - M_\epsilon(0)| + \int_0^t |F(z^\epsilon(s), s) - F(M_\epsilon(s), s)|\, ds + \int_0^t \left| F(M_\epsilon(s), s) - \epsilon^{-2} \int_{\Lambda_s^\epsilon} F(x, s)\, dx \right| ds. \tag{22}$$

To prove the theorem we estimate the two integrals on the right hand side and use a Gronwall argument to achieve the desired bound. The following estimate is used to bound the second integral on the right hand side of (22); it holds for any $p \in (1,\infty)$:

$$\phi(t) \le p\, t^{1 - \frac{1}{p}}. \tag{23}$$


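Now,

$$F(M_\epsilon(s), s) = \epsilon^{-2} \int_{\Lambda_s^\epsilon} F(M_\epsilon(s), s)\, dx,$$

so that the log-Lipschitz continuity along with (23) (let $p \in (1,\infty)$ be arbitrary, then set $\frac{1}{p} + \frac{1}{q} = 1$) and Hölder's inequality imply

$$\left| F(M_\epsilon(s), s) - \epsilon^{-2} \int_{\Lambda_s^\epsilon} F(x, s)\, dx \right| \le \epsilon^{-2} C_F \int_{\Lambda_s^\epsilon} \phi(|M_\epsilon(s) - x|)\, dx \le C_F\, \epsilon^{-2}\, p \int_{\Lambda_s^\epsilon} |M_\epsilon(s) - x|^{1 - \frac{1}{p}}\, dx \le C_F\, \epsilon^{\frac{1}{q} - 2}\, p\, |\Lambda_s^\epsilon|^{\frac{1+p}{2p}}\, I_\epsilon(s)^{\frac{1}{2q}}.$$

The external vector field $F$ is divergence free so $|\Lambda_s^\epsilon| = |\Lambda_\epsilon| = \epsilon^2$ and

$$\left| F(M_\epsilon(s), s) - \epsilon^{-2} \int_{\Lambda_s^\epsilon} F(x, s)\, dx \right| \le p\, C_F\, I_\epsilon(s)^{\frac{1}{2q}}.$$

Combining this bound with (20) allows

$$\int_0^t \left| F(M_\epsilon(s), s) - \epsilon^{-2} \int_{\Lambda_s^\epsilon} F(x, s)\, dx \right| ds \le p T C_F \left( e\, I_\epsilon(0)^{\exp(-C_F t)} \right)^{\frac{1}{2q}} \qquad \forall t \in [0,T].$$

The first integral on the right hand side of (22) is estimated using log-Lipschitz continuity:

$$\int_0^t |F(z^\epsilon(s), s) - F(M_\epsilon(s), s)|\, ds \le C_F \int_0^t \phi(|z^\epsilon(s) - M_\epsilon(s)|)\, ds.$$

With these two estimates in hand (22) becomes

$$|z^\epsilon(t) - M_\epsilon(t)| \le |z_0 - M_\epsilon(0)| + p T C_F \left( e\, I_\epsilon(0)^{\exp(-C_F t)} \right)^{\frac{1}{2q}} + C_F \int_0^t \phi(|z^\epsilon(s) - M_\epsilon(s)|)\, ds. \tag{24}$$

To handle this we use a generalized Gronwall inequality which we prove after the conclusion of this proof.

Lemma 4. Let $C_0, C_1 > 0$ satisfy $C_0 \le \exp(-\exp C_1 T)$. If $f : [0,T] \to \mathbb{R}_+$ satisfies the bound

$$f(t) \le C_0 + \int_0^t C_1 \phi(f(s))\, ds,$$

then $f(t) \le e\, C_0^{\exp(-C_1 t)}$ for all $t \in [0,T]$.

The first two terms in (24) tend to zero as $\epsilon \to 0$, so after applying Lemma 4 we have finished the proof. We have not used any quantitative property of $F$ aside from the constant $C_F$.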
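The elementary estimate (23) used in the proof above can be spot-checked numerically. The sketch below is only an illustration, not part of the argument: the helper names `phi` and `worst_ratio` are hypothetical, the value `phi(0) = 0` is the limiting value adopted here by convention, and the check covers only a finite log-spaced grid of sample points.

```python
import math


def phi(t):
    """Log-Lipschitz modulus of Definition 2 (phi(0) = 0 taken as the limit)."""
    if t <= 0.0:
        return 0.0
    return t * (1.0 - math.log(t)) if t < 1.0 else 1.0


def worst_ratio(p, samples=2000):
    """Largest sampled value of phi(t) / (p * t**(1 - 1/p)) for t in [1e-12, 10]."""
    worst = 0.0
    for k in range(samples):
        t = 10.0 ** (-12.0 + 13.0 * k / (samples - 1))
        worst = max(worst, phi(t) / (p * t ** (1.0 - 1.0 / p)))
    return worst


for p in (1.1, 2.0, 5.0, 50.0):
    # a ratio of at most 1 at every sample is consistent with estimate (23)
    print(p, worst_ratio(p))
```

For $t < 1$ the ratio equals $s(1 - p\log s)/p$ with $s = t^{1/p}$, whose maximum over $(0,1)$ is $e^{(1-p)/p} < 1$, which is what the sampled values approach.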


Proof of Lemma 4. Since $\phi$ is an increasing function, if $g \ge 0$ satisfies

$$g(t) = C_0 + C_1 \int_0^t \phi(g(s))\, ds$$

we may conclude $f(t) \le g(t)$. Such a $g$ would also satisfy

$$g'(t) \le C_1 \phi(g), \qquad g(0) = C_0,$$

which can be solved to yield $g(t) \le e\, C_0^{\exp(-C_1 t)}$. See for example [8, p. 319].
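The comparison function in this proof can be examined numerically. The following sketch integrates the extremal equation $g' = C_1\phi(g)$, $g(0) = C_0$, with a classical fourth-order Runge-Kutta scheme and compares the result with the bound $e\,C_0^{\exp(-C_1 t)}$. The constants $C_0 = 0.05$, $C_1 = 1$, $T = 1$, the step count, and the function names are illustrative assumptions; the closed form used in the comment, $g(t) = \exp(1 - e^{-C_1 t})\,C_0^{\exp(-C_1 t)}$, is obtained by solving the equation on the branch $g < 1$.

```python
import math


def phi(t):
    """Log-Lipschitz modulus of Definition 2 (phi(0) = 0 taken as the limit)."""
    if t <= 0.0:
        return 0.0
    return t * (1.0 - math.log(t)) if t < 1.0 else 1.0


def solve_extremal(c0, c1, T, steps=1000):
    """RK4 integration of g' = c1 * phi(g), g(0) = c0; returns a list of (t, g)."""
    dt = T / steps
    g = c0
    path = [(0.0, g)]
    for k in range(steps):
        k1 = c1 * phi(g)
        k2 = c1 * phi(g + 0.5 * dt * k1)
        k3 = c1 * phi(g + 0.5 * dt * k2)
        k4 = c1 * phi(g + dt * k3)
        g += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path.append(((k + 1) * dt, g))
    return path


c1, T = 1.0, 1.0
c0 = 0.05  # illustrative; satisfies c0 <= exp(-exp(c1 * T)), roughly 0.066
path = solve_extremal(c0, c1, T)

# smallest gap between the Gronwall bound and the computed solution
gap = min(math.e * c0 ** math.exp(-c1 * t) - g for t, g in path)
exact_T = math.exp(1.0 - math.exp(-c1 * T)) * c0 ** math.exp(-c1 * T)
print(gap, path[-1][1], exact_T)
```

The smallness condition on $C_0$ is what keeps the solution below 1 on $[0,T]$, so that only the branch $\phi(g) = g(1 - \log g)$ is ever used.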

4. Technical Limit Lemmas

To compute the limits involving $h^\epsilon$ we will make use of technical lemmas which describe how the convergence of $\delta^\epsilon$ can be used to imply the convergence of $h^\epsilon$.

Definition 3. We say a sequence of functions $\{f^\alpha(y,s)\}_{\alpha \in A}$ is "uniformly equicontinuous, uniformly on $[0,T]$" if given any $\epsilon > 0$ there exists $\eta > 0$ such that if $|y - \bar{y}| < \eta$ then $|f^\alpha(y,s) - f^\alpha(\bar{y},s)| < \epsilon$ for all $\alpha \in A$, $y, \bar{y} \in \mathbb{R}^2$ and $s \in [0,T]$.

To give an idea of how such functions will arise in our analysis, in the next lemma we prove that a reasonable sequence of functions in convolution with the Biot-Savart kernel satisfies the above definition.

Lemma 5. Let $g^\alpha : \mathbb{R}_+ \times \mathbb{R}^2 \to \mathbb{R}$ satisfy $\|g^\alpha\|_{L^\infty([0,T]; L^p(\mathbb{R}^2))} < C < \infty$ for some $p \in (2,\infty)$. The constant is assumed independent of $\alpha$. Then each component of $K * g^\alpha$ is uniformly equicontinuous, uniformly on $[0,T]$ in the sense of Definition 3.

Proof. Let $x^1, x^2 \in \mathbb{R}^2$ satisfy $d = |x^1 - x^2| < 1$. We will show $|K * g^\alpha(x^1, t) - K * g^\alpha(x^2, t)|$ is uniformly bounded by some power of $d$. This actually proves some type of Hölder continuity but we only use the weaker condition. The proof follows typical arguments to show log-Lipschitz continuity (see for example [8], Lemma 8.1) and is included here because the exact result is not readily available in the literature. To begin we split the integral into three parts:

$$|K * g^\alpha(x^1, t) - K * g^\alpha(x^2, t)| \le \left( \int_{\mathbb{R}^2 \setminus B_2(x^1)} + \int_{B_2(x^1) \setminus B_{2d}(x^1)} + \int_{B_{2d}(x^1)} \right) |K(x^1 - y) - K(x^2 - y)|\, |g^\alpha(y, t)|\, dy \le A_1 + A_2 + A_3.$$

To bound $A_1$ we use the estimate ([8, p. 317])

$$|K(x) - K(y)| \le \frac{1}{\pi} \frac{|x - y|}{|x||y|}.$$

Hölder's inequality with $\frac{1}{p} + \frac{1}{q} = 1$ implies

$$A_1 \le \frac{d}{\pi} \int_{\mathbb{R}^2 \setminus B_2(x^1)} \frac{|g^\alpha(y)|}{|x^1 - y||x^2 - y|}\, dy \le \frac{d}{\pi} \|g^\alpha\|_{L^p(\mathbb{R}^2)} \left( \int_1^\infty \frac{dr}{r^{2q-1}} \right)^{\frac{1}{q}}.$$


The integral on the right hand side is finite when $q \in (1,\infty)$ and in turn $p \in (1,\infty)$. This implies

$$A_1 \le c\, d\, \|g^\alpha\|_{L^p(\mathbb{R}^2)}. \tag{25}$$

For $y \in B_2(x^1) \setminus B_{2d}(x^1)$ the mean-value theorem implies ([8, p. 318])

$$|K(x^1 - y) - K(x^2 - y)| \le c\, \frac{|x^1 - x^2|}{|x^1 - y|^2}.$$

Again using Hölder's inequality,

$$A_2 \le c\, d\, \|g^\alpha\|_{L^p(\mathbb{R}^2)} \left( \int_{2d}^2 \frac{dr}{r^{2q-1}} \right)^{\frac{1}{q}}.$$

The integral on the right hand side is finite when $q \in (1,\infty)$ but we require $q \in (1,2)$ (and hence $p \in (2,\infty)$) to retain positive powers of $d$ in our estimate and establish continuity. Indeed,

$$A_2 \le c_p\, d^{\frac{2}{q} - 1}\, \|g^\alpha\|_{L^p(\mathbb{R}^2)}. \tag{26}$$

We bound the third term using Hölder's inequality and the estimate $|K(x)| \le C|x|^{-1}$. So,

$$\left( \int_{B_{2d}(x^1)} |K(x^1 - y) - K(x^2 - y)|^q\, dy \right)^{\frac{1}{q}} \le \left( \int_{B_{2d}(x^1)} \frac{dy}{|x^1 - y|^q} \right)^{\frac{1}{q}} + \left( \int_{B_{2d}(x^1)} \frac{dy}{|x^2 - y|^q} \right)^{\frac{1}{q}} \le 2 \left( \int_0^{2d} \frac{dr}{r^{q-1}} \right)^{\frac{1}{q}}.$$

This integral is finite if $q \in (1,2)$, so if $p \in (2,\infty)$ we can bound

$$A_3 \le c_p\, d^{\frac{2}{q} - 1}\, \|g^\alpha\|_{L^p(\mathbb{R}^2)}. \tag{27}$$

Considering (25)-(27) we see $g^\alpha * K$ satisfies the conclusion of this lemma.

When we do not need to deal with a sequence of functions we have the following corollary and definition.

Definition 4. We say a function $f(x,t)$ is "uniformly continuous, uniformly on $[0,T]$" if given any $\epsilon > 0$ there exists $\eta > 0$ such that if $|y - \bar{y}| < \eta$ then $|f(y,s) - f(\bar{y},s)| < \epsilon$ for all $y, \bar{y} \in \mathbb{R}^2$ and $s \in [0,T]$.

Corollary 1. If $g \in L^\infty([0,T]; L^p(\mathbb{R}^2))$, then $g * K$ is uniformly continuous, uniformly on $[0,T]$.

Proof. Choose $g^\alpha = g$ for all $\alpha$ in Lemma 5.

Now we state the main technical device for proving convergence.


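Lemma 6. Let $f^\alpha(y,s)$ be a sequence of functions, uniformly equicontinuous, uniformly on $[0,T]$, satisfying the bound

$$\sup_{s \in [0,T]} \|f^\alpha(\cdot, s)\|_{L^\infty(\mathbb{R}^2)} \le C. \tag{28}$$

Let $\delta^\alpha(x,s)$ be an approximation to the Dirac distribution which satisfies the following properties:

$$\lim_{\alpha \to 0} \sup_{s \in [0,T]} |M_\alpha(s) - z(s)| = \lim_{\alpha \to 0} \sup_{s \in [0,T]} \left| \int_{\mathbb{R}^2} y\, \delta^\alpha(y,s)\, dy - z(s) \right| = 0, \tag{29}$$

$$\lim_{\alpha \to 0} \sup_{s \in [0,T]} I_\alpha(s) = \lim_{\alpha \to 0} \sup_{s \in [0,T]} \int_{\mathbb{R}^2} (y - M_\alpha(s))^2\, \delta^\alpha(y,s)\, dy = 0, \tag{30}$$

$$\int_{\mathbb{R}^2} \delta^\alpha(x,s)\, dx = 1 \quad \text{and} \quad \delta^\alpha(x,s) \ge 0. \tag{31}$$

Then,

$$\lim_{\alpha \to 0} \sup_{s \in [0,T]} \left| f^\alpha(z(s), s) - \int_{\mathbb{R}^2} f^\alpha(y,s)\, \delta^\alpha(y,s)\, dy \right| = 0.$$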
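A small numerical experiment can illustrate the statement of Lemma 6 in a much simplified setting. The sketch below assumes a single, time-independent function $f(y) = \sin y_1 + \cos y_2$ (bounded and uniformly continuous), and takes $\delta^\alpha$ to be the normalized indicator of a disc of radius $\alpha$ centered exactly at $z$, so that the unit mass, nonnegativity, and vanishing second moment hypotheses hold trivially. These choices, the point $z$, and the function name `disc_average` are assumptions made for the illustration only.

```python
import math


def disc_average(f, center, radius, n=120):
    """Grid approximation of the average of f over the disc B_radius(center).

    This equals the integral of f against the unit-mass uniform blob on the disc.
    """
    h = 2.0 * radius / n
    total, count = 0.0, 0
    for i in range(n):
        for j in range(n):
            dx = (i + 0.5) * h - radius
            dy = (j + 0.5) * h - radius
            if dx * dx + dy * dy <= radius * radius:
                total += f(center[0] + dx, center[1] + dy)
                count += 1
    return total / count


def f(y1, y2):
    """Bounded, uniformly continuous test function (illustrative choice)."""
    return math.sin(y1) + math.cos(y2)


z = (0.3, -0.2)
errors = []
for radius in (0.5, 0.25, 0.125, 0.0625):
    err = abs(f(*z) - disc_average(f, z, radius))
    errors.append(err)
    # the error should shrink as the blob concentrates at z
    print(radius, err)
```

Because this particular $f$ is smooth, the error decays roughly like the square of the radius; the lemma itself needs only uniform equicontinuity and gives convergence without a rate.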

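Proof. Let $\epsilon > 0$ be fixed and let $\eta > 0$ be given by Definition 3. Using (29) we may choose $\alpha_1 > 0$ small enough so that if $\alpha < \alpha_1$,

$$\sup_{s \in [0,T]} |M_\alpha(s) - z(s)| < \eta.$$

Now,

$$\left| f^\alpha(z(s), s) - \int_{\mathbb{R}^2} f^\alpha(y,s)\, \delta^\alpha(y,s)\, dy \right| \le |f^\alpha(z(s), s) - f^\alpha(M_\alpha(s), s)| + \left| f^\alpha(M_\alpha(s), s) - \int_{\mathbb{R}^2} f^\alpha(y,s)\, \delta^\alpha(y,s)\, dy \right|. \tag{32}$$

If $\alpha < \alpha_1$, the first term on the right hand side is bounded uniformly in $s$ by $\epsilon$. To attack the second term, first note that (31) implies

$$f^\alpha(M_\alpha(s), s) - \int_{\mathbb{R}^2} f^\alpha(y,s)\, \delta^\alpha(y,s)\, dy = -\int_{\mathbb{R}^2} [f^\alpha(y,s) - f^\alpha(M_\alpha(s), s)]\, \delta^\alpha(y,s)\, dy.$$

We may then decompose

$$\int_{\mathbb{R}^2} [f^\alpha(y,s) - f^\alpha(M_\alpha(s), s)]\, \delta^\alpha(y,s)\, dy = \int_{B_\eta(M_\alpha(s))} [f^\alpha(y,s) - f^\alpha(M_\alpha(s), s)]\, \delta^\alpha(y,s)\, dy + \int_{B_\eta^C(M_\alpha(s))} [f^\alpha(y,s) - f^\alpha(M_\alpha(s), s)]\, \delta^\alpha(y,s)\, dy.$$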


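Relying on the uniform equicontinuity and using again (31) we see

$$\left| \int_{B_\eta(M_\alpha(s))} [f^\alpha(y,s) - f^\alpha(M_\alpha(s), s)]\, \delta^\alpha(y,s)\, dy \right| \le \epsilon \int_{B_\eta(M_\alpha(s))} \delta^\alpha(y,s)\, dy \le \epsilon. \tag{33}$$

This bound is independent of $s \in [0,T]$. To handle the other term note

$$\left| \int_{B_\eta^C(M_\alpha(s))} [f^\alpha(y,s) - f^\alpha(M_\alpha(s), s)]\, \delta^\alpha(y,s)\, dy \right| \le \frac{C}{\eta^2} \int_{\mathbb{R}^2} |y - M_\alpha(s)|^2\, \delta^\alpha(y,s)\, dy \le \frac{C}{\eta^2} I_\alpha(s).$$

Above, $C$ is again as in (28). In view of (30), there exists an $\alpha_2 \le \alpha_1$ such that for all $\alpha \le \alpha_2$, $I_\alpha(s) \le \frac{\epsilon \eta^2}{C}$, and

$$\left| \int_{B_\eta^C(M_\alpha(s))} [f^\alpha(y,s) - f^\alpha(M_\alpha(s), s)]\, \delta^\alpha(y,s)\, dy \right| \le \epsilon. \tag{34}$$

This bound is independent of $s \in [0,T]$. Combining (32)-(34) shows that if $\alpha < \alpha_2$,

$$\left| f^\alpha(z(s), s) - \int_{\mathbb{R}^2} f^\alpha(y,s)\, \delta^\alpha(y,s)\, dy \right| < 3\epsilon.$$

As $\epsilon$ was picked arbitrarily, the lemma is proved.

Corollary 2. Let $\delta^\alpha$ satisfy the assumptions of Lemma 6. If $f : \mathbb{R}_+ \times \mathbb{R}^2 \to \mathbb{R}$ is uniformly continuous, uniformly on $[0,T]$ and $f \in L^\infty([0,T] \times \mathbb{R}^2)$ then

$$\lim_{\alpha \to 0} \sup_{s \in [0,T]} \left| f(z(s), s) - \int_{\mathbb{R}^2} f(y,s)\, \delta^\alpha(y,s)\, dy \right| = 0.$$

Proof. Choose $f^\alpha = f$ for all $\alpha$ in Lemma 6.

The final point to address here is the assumption (28) in Lemma 6. To show how this will be satisfied we present a well known estimate.

Lemma 7. Let $g \in L^p \cap L^q(\mathbb{R}^2)$, where $1 \le p < 2 < q \le \infty$. Then

$$\|g * K\|_{L^\infty(\mathbb{R}^2)} \le C\left( \|g\|_{L^p(\mathbb{R}^2)} + \|g\|_{L^q(\mathbb{R}^2)} \right).$$

Proof. The case $p = 1$ and $q = \infty$ is usually what is presented in the literature so we do not give a proof of that case (see for example [8, Prop. 8.2]). Let $B_1$ be a ball of unit radius centered at the origin. Using the estimate $|K(x)| \le C|x|^{-1}$ we bound

$$\|g * K\|_{L^\infty(\mathbb{R}^2)} \le \int_{B_1} |g(y) K(x - y)|\, dy + \int_{B_1^c} |g(y) K(x - y)|\, dy \le C \|g\|_{L^q(\mathbb{R}^2)} \left( \int_{B_1} |y|^{-q^*}\, dy \right)^{\frac{1}{q^*}} + C \|g\|_{L^p(\mathbb{R}^2)} \left( \int_{B_1^c} |y|^{-p^*}\, dy \right)^{\frac{1}{p^*}}.$$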


Here $q^*$ satisfies the usual relation $\frac{1}{q} + \frac{1}{q^*} = 1$ and $p^*$ is the same with respect to $p$. The first integral is finite if $q^* < 2$, that is $q > 2$, and likewise the second integral is finite if $p^* > 2$, that is $p < 2$.

5. Existence of the Weak Solution

This section contains the proof of Theorem 2.

5.1. Moment properties of the approximate sequence. Let $\delta_0^\epsilon$ be a sequence of vortex blobs satisfying assumptions (6) and (7). Let $\omega_0 \in L^1 \cap L^\infty(\mathbb{R}^2)$ be the given initial ambient vorticity and construct $\omega^\epsilon$, $\delta^\epsilon$, $u^\epsilon$, and $h^\epsilon$ as in Subsect. 2.1. Lemma 1 shows these functions are approximate solutions of the Vortex-Wave equation. Applying Lemmas 2 and 3 we find a subsequence $\epsilon \to 0$ along with a function $u(x,t)$ and a path $z(t)$ which satisfy all of the conclusions of the theorem with the exception of the fact that $u$ and $z$ form a solution of the Vortex-Wave equation. We only need to take the weak limits to show (8) converges to (9) to finish the proof. These limits are demonstrated in the following subsection using the technical lemmas established in Sect. 4. From here on we are considering the specific subsequence of $\epsilon$ given by Lemma 2.

We record some consequences of Sect. 3. Recall $M_\epsilon$ and $I_\epsilon$ are defined as in (3)-(4) and $z^\epsilon(t)$ is the path under the flow $u^\epsilon$ starting at $z_0$; it is defined by (10). Property (iv) in Lemma 1, along with a well known property of the Biot-Savart kernel (see [8, p. 315]), implies the sequence $u^\epsilon$ is uniformly log-Lipschitz on $[0,T]$. That is, $\forall s \in [0,T]$,

$$|u^\epsilon(x,s) - u^\epsilon(y,s)| \le C_{\omega_0} \phi(|x - y|), \qquad C_{\omega_0} = \|\omega^\epsilon(s)\|_{L^\infty \cap L^1(\mathbb{R}^2)} \le \|\omega_0\|_{L^\infty \cap L^1(\mathbb{R}^2)}.$$

By choosing $F = u^\epsilon$ and applying Theorem 3 (while keeping this uniform estimate in mind) we see

$$\lim_{\epsilon \to 0} \sup_{t \in [0,T]} |M_\epsilon(t) - z^\epsilon(t)| = 0. \tag{35}$$

In fact this statement follows directly from (21). If $I_\epsilon(0) \le \exp(-\exp C_{\omega_0} T)$, (20) implies

$$I_\epsilon(t) \le e\, I_\epsilon(0)^{\exp(-C_{\omega_0} t)} \qquad \forall t \in [0,T]. \tag{36}$$

Note (35), (36), and Lemma 3 imply the sequence $\delta^\epsilon$ satisfies the assumptions of Lemma 6.

5.2. Weak limit of approximate sequence. Keep in mind the remark at the end of the previous subsection: the approximate blobs $\delta^\epsilon$ satisfy the assumptions of Lemma 6 and $z(t)$ is a continuous path in $\mathbb{R}^2$. Let $\Phi \in C_\sigma^\infty(\mathbb{R}^2 \times [0,T] \setminus \{(z(t),t)\}_{t \in [0,T]})$. We now show how (8) converges to (9) with $h(x,t) = K(x - z(t))$. The first two limits we consider are well known for the Euler equation. Since our velocity sequence $u^\epsilon$ corresponds to vorticity uniformly in $L^1 \cap L^\infty$ the following claims are identical to those in the literature.



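Claim.

$$\int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot u^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot u \, dx\, dt. \tag{37}$$

This claim follows immediately from the fact that $\Phi_t$ has compact support and (17).

Claim.

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : u^\epsilon \otimes u^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : u \otimes u \, dx\, dt. \tag{38}$$

This is observed by adding and subtracting the cross terms, then applying (18). See [8, p. 402] for more details. Now we work on the terms which involve $h^\epsilon$ using the lemmas in the previous section.

Claim.

$$\int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot h^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \Phi_t \cdot h \, dx\, dt. \tag{39}$$

The first step in evaluating this limit is to rewrite it using the Biot-Savart kernel:

$$\int_0^T \int_{\mathbb{R}^2} \Phi_t(x) \cdot h^\epsilon(x)\, dx\, dt - \int_0^T \int_{\mathbb{R}^2} \Phi_t(x) \cdot h(x)\, dx\, dt$$

$$= \int_0^T \left[ \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \Phi_t(x) \cdot K(x - y)\, \delta^\epsilon(y)\, dy\, dx - \int_{\mathbb{R}^2} \Phi_t(x) \cdot K(x - z(t))\, dx \right] dt$$

$$= \int_0^T \left[ \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \Phi_t(x) \cdot K(x - y)\, dx \right) \delta^\epsilon(y)\, dy - \int_{\mathbb{R}^2} \Phi_t(x) \cdot K(x - z(t))\, dx \right] dt.$$

The support of $\Phi \in C_\sigma^\infty(\mathbb{R}^2 \times [0,T] \setminus \{(z(t),t)\}_{t \in [0,T]})$ and (12) justify changing the order of integration between the last two lines. The integrand $\Phi_t(x) \cdot K(x - z(t))$ is smooth on its support and the final integral makes sense. Corollary 1 applied to $\Phi_t$ shows $K * \Phi_t$ is uniformly continuous, uniformly on $[0,T]$. $\Phi$ is smooth with compact support so Lemma 7 implies $K * \Phi_t \in L^\infty([0,T] \times \mathbb{R}^2)$. Corollary 2 applied to $K * \Phi_t$ then shows

$$\int_0^T \left[ \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \Phi_t(x) \cdot K(x - y)\, dx \right) \delta^\epsilon(y)\, dy - \int_{\mathbb{R}^2} \Phi_t(x) \cdot K(x - z(t))\, dx \right] dt \to 0,$$

and the proof of (39) is complete.

Claim.

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h^\epsilon \otimes u^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes u \, dx\, dt. \tag{40}$$

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : u^\epsilon \otimes h^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : u \otimes h \, dx\, dt. \tag{41}$$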


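Since these two are nearly identical we will only prove (40). By adding and subtracting the cross term $\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes u^\epsilon \, dx\, dt$ we break this case into two parts. We will show

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h^\epsilon \otimes u^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes u^\epsilon \, dx\, dt, \tag{42}$$

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes u^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes u \, dx\, dt. \tag{43}$$

Consider first (42):

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h^\epsilon \otimes u^\epsilon \, dx\, dt - \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes u^\epsilon \, dx\, dt$$

$$= \int_0^T \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \nabla\Phi(x) : K(x - y)\, \delta^\epsilon(y) \otimes u^\epsilon(x)\, dy\, dx\, dt - \int_0^T \int_{\mathbb{R}^2} \nabla\Phi(x) : K(x - z(t)) \otimes u^\epsilon(x)\, dx\, dt$$

$$= \int_0^T \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \nabla\Phi(x) : K(x - y) \otimes u^\epsilon(x)\, dx \right) \delta^\epsilon(y)\, dy\, dt - \int_0^T \int_{\mathbb{R}^2} \nabla\Phi(x) : K(x - z(t)) \otimes u^\epsilon(x)\, dx\, dt.$$

To justify the above sequence we rely on the differentiability of $\Phi$ and its compact support combined with (13) and $u^\epsilon \in L^\infty(\mathbb{R}^2)$, which follows from (12) and Lemma 7. Indeed,

$$\|u^\epsilon(t)\|_{L^\infty(\mathbb{R}^2)} \le C \|\omega^\epsilon(t)\|_{L^1 \cap L^\infty(\mathbb{R}^2)} \le C \|\omega(0)\|_{L^1 \cap L^\infty(\mathbb{R}^2)}.$$

Combined with the compact support of $\Phi$ we see that for any $p \ge 1$, $\partial_i \Phi_j u_j^\epsilon \in L^\infty([0,T]; L^p(\mathbb{R}^2))$ uniformly. Hence Lemmas 5 and 7 show

$$f^\epsilon(y,t) = \int_{\mathbb{R}^2} \nabla\Phi(x,t) : K(x - y) \otimes u^\epsilon(x,t)\, dx$$

is uniformly equicontinuous, uniformly on $[0,T]$, and satisfies (28). Lemma 6 then implies (42). To obtain (43) recall that $\Phi$ has compact support separate from $z(t)$ so $h$ is a bounded function in the support of $\Phi$. The strong convergence (17) now implies (43).

Claim.

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h^\epsilon \otimes h^\epsilon \, dx\, dt \to 0. \tag{44}$$

To prove this we recall

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes h \, dx\, dt = 0, \tag{45}$$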


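then add and subtract a cross term so the claim is reduced to proving

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h^\epsilon \otimes h^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes h^\epsilon \, dx\, dt, \tag{46}$$

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes h^\epsilon \, dx\, dt \to \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes h \, dx\, dt. \tag{47}$$

To check (45) one can compute directly, when $x \ne z(t)$,

$$h(x,t) \cdot \nabla h(x,t) = K(x - z(t)) \cdot \nabla K(x - z(t)) = \nabla \left( \frac{1}{4\pi} \frac{1}{|x - z(t)|^2} \right).$$

That is, in the support of $\Phi$, $h(x,t) \cdot \nabla h(x,t)$ is the gradient of a scalar. The convergence (47) is argued similarly to (42) or (39) once one notes $h$ is smooth and bounded in the support of $\Phi$. The convergence (46) is more subtle and we use an argument inspired by the proof of Theorem 3.2 in [10]. We proceed as before and rewrite the quantity:

$$\int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h^\epsilon \otimes h^\epsilon \, dx\, dt - \int_0^T \int_{\mathbb{R}^2} \nabla\Phi : h \otimes h^\epsilon \, dx\, dt$$

$$= \int_0^T \int_{\mathbb{R}^2} \left( \int_{\mathbb{R}^2} \nabla\Phi(x) : K(x - y) \otimes h^\epsilon(x)\, dx \right) \delta^\epsilon(y)\, dy\, dt - \int_0^T \int_{\mathbb{R}^2} \nabla\Phi(x) : K(x - z(t)) \otimes h^\epsilon(x)\, dx\, dt.$$

To apply Lemmas 5 and 6, and hence prove the desired convergence, we need to establish that $\partial_i \Phi_j h_j^\epsilon$ is in $L^\infty([0,T]; L^p(\mathbb{R}^2))$ uniformly for some $p \in (2,\infty)$. Since $\nabla\Phi$ is smooth and has compact support we will need to focus our efforts on $h^\epsilon$. For this we decompose $h^\epsilon$ into two parts, one representing the contribution of $\delta^\epsilon$ near $z(t)$ and the other representing the remaining part. Let

$$r_0 = \frac{1}{4} \inf\{ |x - z(t)| : (x,t) \in \mathrm{Supp}(\Phi) \}.$$

By assuming $\epsilon$ is sufficiently small and relying on (35) we may use $|M_\epsilon(t) - z(t)| < r_0$. Now consider

$$h^\epsilon(x,t) = \int_{B_{r_0}(z(t))} K(x - y)\, \delta^\epsilon(y)\, dy + \int_{B_{r_0}^c(z(t))} K(x - y)\, \delta^\epsilon(y)\, dy = h^{1,\epsilon} + h^{2,\epsilon}.$$

For any $(x,t) \in \mathrm{Supp}(\Phi)$ and $y \in B_{2r_0}(M_\epsilon(t))$ we have $|K(x - y)| < \frac{1}{2\pi r_0}$, so the first integral may be bounded by

$$|h^{1,\epsilon}(x,t)| \le \int_{B_{2r_0}(M_\epsilon(t))} |K(x - y)|\, \delta^\epsilon(y)\, dy \le \frac{1}{2\pi r_0} \int_{B_{2r_0}(M_\epsilon(t))} \delta^\epsilon(y)\, dy \le \frac{1}{2\pi r_0}. \tag{48}$$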


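This bound is uniform in $t$ and $x$ and the support of $\Phi$ is compact, so we conclude $\partial_i \Phi_j h_j^{1,\epsilon} \in L^\infty([0,T]; L^p(\mathbb{R}^2))$ for all $p \in [1,\infty]$. Now we handle $h^{2,\epsilon}$. The first step is to bound the measure of the support of $\delta^\epsilon$ outside of $B_{r_0}(z(t))$, which we label $\Sigma_\epsilon(t)$:

$$\Sigma_\epsilon(t) = \{ x \in B_{r_0}^C(z(t)) \mid \delta^\epsilon(x,t) \ne 0 \}.$$

Recall (5) and denote (as in Sect. 3) by $\Lambda_t^\epsilon$ the image of $\Lambda_\epsilon$ transported by $u^\epsilon$. Then,

$$|\Sigma_\epsilon(t)| \le \epsilon^2 \int_{B^c_{r_0/2}(M_\epsilon(t))} \delta^\epsilon(x,t)\, dx \le \frac{\epsilon^2}{9 r_0^2} \int_{B^c_{r_0/2}(M_\epsilon(t))} (x - M_\epsilon(t))^2\, \delta^\epsilon(x,t)\, dx \le \frac{\epsilon^2}{9 r_0^2} I_\epsilon(t), \tag{49}$$

where $I_\epsilon$ is the second moment controlled by (36). The next piece is the well known estimate of the Biot-Savart law considered as a singular integral operator. Combining the Calderón-Zygmund inequality with the Sobolev inequality (see [8, p. 322]) shows

$$\|h^{2,\epsilon}(t)\|_{L^p(\mathbb{R}^2)} \le C \|\chi_{\Sigma_\epsilon(t)} \delta^\epsilon(t)\|_{L^q(\mathbb{R}^2)}, \tag{50}$$

where $p$ and $q$ satisfy the relation $\frac{1}{p} = \frac{1}{q} - \frac{1}{2}$. Therefore if we are able to prove $\chi_{\Sigma_\epsilon(t)} \delta^\epsilon(t) \in L^\infty([0,T]; L^q(\mathbb{R}^2))$ for some $q \in (1,2)$ we will have shown $\partial_i \Phi_j h_i^{2,\epsilon}(t) \in L^\infty([0,T]; L^p(\mathbb{R}^2))$ for some $p \in (2,\infty)$. To show this we first claim that if $I_\epsilon(0)$ satisfies assumption (7) then $\|\chi_{\Sigma_\epsilon(t)} \delta^\epsilon(t)\|_{L^q(\mathbb{R}^2)}$ is bounded uniformly for $q \in (1, 1 + \frac{1}{2}\exp(-C_{\omega_0} T))$, where $C_{\omega_0}$ is the same as in (36). Indeed, for such a $q$,

$$\|\chi_{\Sigma_\epsilon(t)} \delta^\epsilon(t)\|_{L^q(\mathbb{R}^2)} \le \frac{1}{\epsilon^2} |\Sigma_\epsilon(t)|^{\frac{1}{q}} \le \epsilon^{\frac{2}{q} - 2} (9 r_0^2)^{-\frac{1}{q}} I_\epsilon(t)^{\frac{1}{q}} \le C r_0^{-\frac{2}{q}}\, \epsilon^{\frac{2}{q} - 2 + \frac{2}{q}\exp(-C_{\omega_0} T)}.$$

Moving from the first to the second expression above we used (49). The move from the second to the third is exactly (36). For $q \in (1, 1 + \frac{1}{2}\exp(-C_{\omega_0} T))$ the exponent of $\epsilon$ is positive and we may conclude $\|\chi_{\Sigma_\epsilon(t)} \delta^\epsilon(t)\|_{L^q(\mathbb{R}^2)}$ is bounded uniformly in time (in fact it tends to zero as $\epsilon \to 0$). Then (50) implies $h^{2,\epsilon} \in L^\infty([0,T]; L^p(\mathbb{R}^2))$ uniformly for $p \in (2, \tilde{p})$, where $\tilde{p} = (2 + \exp(-C_{\omega_0} T))/(1 - \frac{1}{2}\exp(-C_{\omega_0} T))$. When combined with (48) we see $h^\epsilon \in L^\infty([0,T], L^{\tilde{p}}(\mathbb{R}^2))$ for some $\tilde{p} > 2$, and with the compact support of $\Phi$ we have $\partial_i \Phi_j h_j^\epsilon \in L^\infty([0,T], L^p(\mathbb{R}^2))$ for all $p \in [1, \tilde{p}]$. Now we can apply Lemmas 5, 7, and 6 similarly to the previous terms and obtain (46). Therefore we have proven (44).

Taking all of the claims together, (37)-(44), we conclude the proof of Theorem 2.

Acknowledgements. The author would like to thank an anonymous referee for suggesting several ways to improve this article.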



References

1. Delort, J.M.: Existence de nappes de tourbillon en dimension deux. J. Amer. Math. Soc. 4(3), 553–586 (1991)
2. DiPerna, R.J., Majda, A.J.: Concentrations in regularizations for 2-D incompressible flow. Commun. Pure Appl. Math. 40(3), 301–345 (1987)
3. DiPerna, R.J., Majda, A.J.: Oscillations and concentrations in weak solutions of the incompressible fluid equations. Commun. Math. Phys. 108(4), 667–689 (1987)
4. DiPerna, R.J., Majda, A.J.: Reduced Hausdorff dimension and concentration-cancellation for two-dimensional incompressible flow. J. Amer. Math. Soc. 1(1), 59–95 (1988)
5. Lacave, C., Miot, E.: Uniqueness for the vortex-wave system when the vorticity is constant near the point vortex. SIAM J. Math. Anal. 41(3), 1138–1163 (2009)
6. Lopes Filho, M.C., Nussenzveig Lopes, H.J., Xin, Z.: Existence of vortex sheets with reflection symmetry in two space dimensions. Arch. Rat. Mech. Anal. 158(3), 235–257 (2001)
7. Majda, A.J.: Remarks on weak solutions for vortex sheets with a distinguished sign. Indiana Univ. Math. J. 42(3), 921–939 (1993)
8. Majda, A.J., Bertozzi, A.L.: Vorticity and incompressible flow. Volume 27 of Cambridge Texts in Applied Mathematics. Cambridge: Cambridge University Press, 2002
9. Marchioro, C.: Euler evolution for singular initial data and vortex theory: a global solution. Commun. Math. Phys. 116(1), 45–55 (1988)
10. Marchioro, C., Pulvirenti, M.: Euler evolution for singular initial data and vortex theory. Commun. Math. Phys. 91(4), 563–572 (1983)
11. Marchioro, C., Pulvirenti, M.: On the vortex-wave system. In: Mechanics, analysis and geometry: 200 years after Lagrange, North-Holland Delta Ser., Amsterdam: North-Holland, 1991, pp. 79–95
12. Marchioro, C., Pulvirenti, M.: Mathematical theory of incompressible nonviscous fluids. Volume 96 of Applied Mathematical Sciences. New York: Springer-Verlag, 1994
13. Schochet, S.: The weak vorticity formulation of the 2-D Euler equations and concentration-cancellation. Commun. Part. Diff. Eqs. 20(5–6), 1077–1104 (1995)
14. Stein, E.M.: Singular integrals and differentiability properties of functions. Princeton Mathematical Series, No. 30. Princeton, N.J.: Princeton University Press, 1970
15. Yudovich, V.I.: Non-stationary flow of an incompressible liquid. Zh. Vychisl. Mat. Mat. Fiz. 3, 1032–1066 (1963)

Communicated by P. Constantin




On the Constructions of Holomorphic Vertex Operator Algebras of Central Charge 24

Ching Hung Lam

Institute of Mathematics, Academia Sinica, Taipei 10617, Taiwan. E-mail: [email protected]

Received: 29 June 2010 / Accepted: 4 October 2010
Published online: 26 February 2011 – © Springer-Verlag 2011

Abstract: In this article, we construct explicitly several holomorphic vertex operator algebras of central charge 24 using Virasoro frames. The Lie algebras associated to their weight one subspaces are of the types $A_{1,2}A_{3,4}^3$, $A_{1,2}D_{5,8}$, $A_{1,1}^3A_{7,4}$, $A_{1,1}^2C_{3,2}D_{5,4}$, $A_{2,1}^2A_{5,2}^2C_{2,1}$, $A_{3,1}A_{7,2}C_{3,1}^2$, $A_{3,1}C_{7,2}$, $A_{4,1}A_{9,2}B_{3,1}$, $B_{4,1}C_{6,1}^2$ and $B_{6,1}C_{10,1}$. These vertex operator algebras correspond to numbers 7, 10, 18, 19, 26, 33, 35, 40, 48 and 56 in Schellekens' list (Schellekens, Commun Math Phys 153:159–185, 1993).

Contents

1. Introduction
2. Framed Vertex Operator Algebras
   2.1 Structure codes
   2.2 Miyamoto involutions and $\mathbb{Z}_2$-orbifold construction
3. $\mathbb{Z}_4$-Codes and Framed VOAs
4. An Exceptional [48, 9] Triply Even Code
   4.1 Triangular graph
5. Constructions of VOA
   5.1 Subcodes of $D^{ex}$
   5.2 Lie algebra structure for $V(D^{ex})_1$
   5.3 Lie algebra structure for $[V(D_1)]_1$
6. Vertex Operator Algebras Associated to Other Subcodes
   6.1 8-dimensional subcodes
   6.2 7-dimensional subcodes
   6.3 6-dimensional subcodes
   6.4 5-dimensional subcodes
Appendix A. Code Vertex Operator Algebras Associated to $\mathcal{E}_n$
Appendix B. Type II Self-Dual $\mathbb{Z}_4$ Code
References


1. Introduction

A vertex operator algebra (VOA) $V = \bigoplus_{n\in\mathbb{Z}} V_n$ is said to be of CFT-type if $V_n = 0$ for $n < 0$ and $\dim V_0 = 1$.

A VOA is rational if all of its admissible modules are completely reducible [DLM1], and it is $C_2$-cofinite if $\dim V/C_2(V) < \infty$, where $C_2(V) = \mathrm{span}\{u_{-2}v \mid u, v \in V\}$. A rational vertex operator algebra $V$ is said to be holomorphic if, up to equivalence, it has only one irreducible module, namely $V$ itself. It is well known (cf. [Z]) that if $V$ is a $C_2$-cofinite holomorphic VOA of CFT type and $c$ is the central charge of $V$, then the character $\mathrm{ch}\,V = \sum_{n=0}^{\infty} \dim V_n\, q^{-c/24+n}$ of $V$ is a modular function over the full modular group $SL_2(\mathbb{Z})$. In particular, the central charge $c$ must be divisible by 8. If $V$ is of CFT type, it is also known that the weight one subspace $V_1$ has a Lie algebra structure and possesses an invariant bilinear form [FLM].

When $c = 8$ and $16$, it is easy to determine $\dim V_1$ by using the theory of modular forms. The corresponding Lie algebra structure for $V_1$ can also be determined. In fact, it is not difficult to show that the lattice VOA $V_{E_8}$ associated to the root lattice of type $E_8$ is the only holomorphic VOA of $c = 8$ and that there are only two holomorphic VOAs of $c = 16$. They are isomorphic to the lattice VOAs $V_{E_8\oplus E_8}$ and $V_{D_{16}^+}$ (see for example [DM3]).

The classification of holomorphic vertex operator algebras of central charge 24, on the other hand, is much more complicated. In 1993, Schellekens [Sch] obtained a partial classification by determining the possible Lie algebra structures for the weight one subspaces. However, only 39 of the 71 cases in his list have been constructed explicitly. Besides, it is still an open question whether the Lie algebra structure of $V_1$ determines the VOA structure of $V$ uniquely when $c = 24$. The most difficult case is probably the case when $V_1 = 0$. Frenkel-Lepowsky-Meurman [FLM] conjectured that such a VOA is isomorphic to the famous moonshine VOA $V^\natural$. This conjecture is one of the most difficult problems in VOA theory, and very little progress has been made on it in the last 20 years.
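The Lie algebra types quoted in the abstract and in Table 1 can be sanity-checked numerically. The following Python sketch rests on an assumption that is not stated in the text: that the affine subalgebras attached to $V_1$ contribute the standard central charges $k\,\dim\mathfrak{g}/(k+h^\vee)$ and that these add up to $c = 24$. The helper names are hypothetical, and the dimensions and dual Coxeter numbers are the usual values for the classical series. It checks only arithmetic, not the existence of the VOAs.

```python
from fractions import Fraction

def dim_and_dual_coxeter(series, rank):
    """Return (dim g, dual Coxeter number) for a classical simple Lie algebra."""
    if series == "A":
        return rank * (rank + 2), rank + 1
    if series == "B":
        return rank * (2 * rank + 1), 2 * rank - 1
    if series == "C":
        return rank * (2 * rank + 1), rank + 1
    if series == "D":
        return rank * (2 * rank - 1), 2 * rank - 2
    raise ValueError(f"unsupported series {series!r}")

def central_charge_and_dim(factors):
    """factors: list of (series, rank, level, multiplicity)."""
    c, total_dim = Fraction(0), 0
    for series, rank, level, multiplicity in factors:
        dim, h_dual = dim_and_dual_coxeter(series, rank)
        c += multiplicity * Fraction(level * dim, level + h_dual)
        total_dim += multiplicity * dim
    return c, total_dim

# Lie algebra type -> (factors, dim V_1 as listed in Table 1)
TABLE_1 = {
    "A_{1,2} D_{5,8}": ([("A", 1, 2, 1), ("D", 5, 8, 1)], 48),
    "A_{1,2} A_{3,4}^3": ([("A", 1, 2, 1), ("A", 3, 4, 3)], 48),
    "A_{1,1}^3 A_{7,4}": ([("A", 1, 1, 3), ("A", 7, 4, 1)], 72),
    "A_{3,1} C_{7,2}": ([("A", 3, 1, 1), ("C", 7, 2, 1)], 120),
    "A_{2,1}^2 A_{5,2}^2 C_{2,1}": ([("A", 2, 1, 2), ("A", 5, 2, 2), ("C", 2, 1, 1)], 96),
    "A_{1,1}^2 C_{3,2} D_{5,4}": ([("A", 1, 1, 2), ("C", 3, 2, 1), ("D", 5, 4, 1)], 72),
    "A_{1,1}^4 A_{3,2}^4": ([("A", 1, 1, 4), ("A", 3, 2, 4)], 72),
    "B_{4,1} C_{6,1}^2": ([("B", 4, 1, 1), ("C", 6, 1, 2)], 192),
    "A_{4,1} A_{9,2} B_{3,1}": ([("A", 4, 1, 1), ("A", 9, 2, 1), ("B", 3, 1, 1)], 144),
    "A_{3,1} A_{7,2} C_{3,1}^2": ([("A", 3, 1, 1), ("A", 7, 2, 1), ("C", 3, 1, 2)], 120),
    "B_{6,1} C_{10,1}": ([("B", 6, 1, 1), ("C", 10, 1, 1)], 288),
}

def check_table():
    """Map each type to True if c == 24 and dim V_1 matches the listed value."""
    return {
        name: central_charge_and_dim(factors) == (24, listed_dim)
        for name, (factors, listed_dim) in TABLE_1.items()
    }

if __name__ == "__main__":
    for name, ok in check_table().items():
        print(f"{name:32s} {'ok' if ok else 'MISMATCH'}")
```

Exact rational arithmetic (`Fraction`) is used so that a central charge such as $3/2 + 3\cdot 15/2$ is compared with 24 without rounding.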
Motivated by the work of Schellekens, Montague [Mon] proposed some constructions for 70 of the 71 theories by using the so-called $\mathbb{Z}_2$ and $\mathbb{Z}_3$ orbifold constructions. However, his method is not rigorous. In fact, orbifold constructions have only been established rigorously in a few special cases (cf. [FLM,M4]).

In this article, we shall construct explicitly several holomorphic vertex operator algebras of central charge 24 by using Virasoro frames. Our main method is successive $\mathbb{Z}_2$-orbifolding on certain lattice type framed VOAs. In general, it is difficult to determine whether the Fock space obtained from a $\mathbb{Z}_2$-orbifold construction has a VOA structure; however, for a framed VOA, if an involution fixes the Virasoro frame pointwise, then the corresponding $\mathbb{Z}_2$-orbifold construction always gives a VOA (see [LY] and Sect. 2.2). By this method, we shall construct several vertex operator algebras (VOAs) in Schellekens' list [Sch]. They correspond to numbers 7, 10, 18, 19, 26, 33, 35, 40, 48 and 56 in his list, and the Lie algebras associated to their weight one subspaces are of the types $A_{1,2}A_{3,4}^3$, $A_{1,2}D_{5,8}$, $A_{1,1}^3A_{7,4}$, $A_{1,1}^2C_{3,2}D_{5,4}$, $A_{2,1}^2A_{5,2}^2C_{2,1}$, $A_{3,1}A_{7,2}C_{3,1}^2$, $A_{3,1}C_{7,2}$, $A_{4,1}A_{9,2}B_{3,1}$, $B_{4,1}C_{6,1}^2$ and $B_{6,1}C_{10,1}$. We believe that these VOAs have not been constructed explicitly before in the literature.

The organization of this article is as follows. In Sect. 2, we recall the notion of framed vertex operator algebras and some basic facts. We also review the $\mathbb{Z}_2$-orbifold construction associated to certain special types of involutions. In Sect. 3, we recall a construction of framed VOAs from type II self-dual $\mathbb{Z}_4$-codes. In Sect. 4, we review the definition of a certain special triply even code $D^{ex}$ of length 48 from [BM]. The properties of $D^{ex}$ and its subcodes will also be studied. In Sects. 5 and 6, we construct explicitly some holomorphic framed VOAs using $D^{ex}$ and its subcodes. The Lie algebra structures of the corresponding weight one spaces are also determined (Table 1).

Table 1. Exceptional Framed VOAs

Number in Schellekens' list   Dimension of 1/16-code   T.E. Code    dim V_1   Lie algebra
10                            9                        [10]         48        $A_{1,2}D_{5,8}$
7                             8                        [5, 4]       48        $A_{1,2}A_{3,4}^3$
18                            8                        [8]          72        $A_{1,1}^3A_{7,4}$
35                            7                        [7]          120       $A_{3,1}C_{7,2}$
26                            7                        [6, 2]       96        $A_{2,1}^2A_{5,2}^2C_{2,1}$
19                            7                        [5, 3]       72        $A_{1,1}^2C_{3,2}D_{5,4}$
16*                           7                        [4, 4]       72        $A_{1,1}^4A_{3,2}^4$
7                             7                        [3, 3, 3]    48        $A_{1,2}A_{3,4}^3$
48                            6                        [6]          192       $B_{4,1}C_{6,1}^2$
40                            6                        [5, 2]       144       $A_{4,1}A_{9,2}B_{3,1}$
33                            6                        [4, 3]       120       $A_{3,1}A_{7,2}C_{3,1}^2$
19                            6                        [3, 3, 2]    96        $A_{2,1}^2A_{5,2}^2C_{2,1}$
56                            5                        [5]          288       $B_{6,1}C_{10,1}$
48                            5                        [3, 3]       192       $B_{4,1}C_{6,1}^2$

Here $X_{n,k}$ denotes a simple Lie subalgebra of type $X_n$ whose affine algebra has level $k$ on $V$.
* A holomorphic VOA with the Lie algebra $A_{1,1}^4A_{3,2}^4$ was already known (cf. Table 2 of [DGM]).

2. Framed Vertex Operator Algebras

In this section, we review the notion of framed VOAs from [DGH,M3]. For the details of VOAs, see [FHL,FLM]. Let $\mathrm{Vir} = \bigoplus_{n\in\mathbb{Z}} \mathbb{C}L_n \oplus \mathbb{C}c$ be the Virasoro algebra. That means the $L_n$ satisfy the commutator relations
\[ [L_m, L_n] = (m-n)L_{m+n} + \frac{1}{12}(m^3-m)\,\delta_{m+n,0}\,c \quad\text{and}\quad [L_m, c] = 0. \]

For any $c, h \in \mathbb{C}$, we shall denote by $L(c,h)$ the irreducible highest weight module of Vir with central charge $c$ and highest weight $h$. It is shown in [FZ] that $L(c,0)$ has a natural VOA structure. We call it the simple Virasoro VOA with central charge $c$.

Definition 2.1. Let $V = \bigoplus_{n=0}^{\infty} V_n$ be a VOA. An element $e \in V_2$ is called an Ising vector or a Virasoro vector of central charge 1/2 if the subalgebra $\mathrm{Vir}(e)$ generated by $e$ is isomorphic to $L(1/2, 0)$. Two Ising vectors $u, v \in V$ are said to be orthogonal if $[Y(u,z_1), Y(v,z_2)] = 0$.

Remark 2.2. It is well known that $L(1/2,0)$ is rational, i.e., all $L(1/2,0)$-modules are completely reducible, and has only three inequivalent irreducible modules $L(1/2,0)$, $L(1/2,1/2)$ and $L(1/2,1/16)$. The fusion rules of $L(1/2,0)$-modules are computed in [DMZ]:
\[
\begin{aligned}
L(1/2,1/2)\boxtimes L(1/2,1/2) &= L(1/2,0),\\
L(1/2,1/2)\boxtimes L(1/2,1/16) &= L(1/2,1/16),\\
L(1/2,1/16)\boxtimes L(1/2,1/16) &= L(1/2,0)\oplus L(1/2,1/2).
\end{aligned}
\tag{2.1}
\]
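The fusion rules (2.1) are small enough to encode as a lookup table. The Python sketch below uses hypothetical helper names and labels each module $L(1/2,h)$ by its highest weight $h$. It additionally assumes that $L(1/2,0)$ acts as the identity for fusion, which is standard but not listed in (2.1). It then checks two consequences: the product is associative on direct sums, and the number of $1/16$ labels is additive modulo 2.

```python
from collections import Counter
from itertools import product

VAC, HALF, SIXTEENTH = "0", "1/2", "1/16"
LABELS = (VAC, HALF, SIXTEENTH)

# Non-trivial products from (2.1), keyed by the unordered pair of labels.
_RULES = {
    frozenset([HALF]): [VAC],                    # 1/2  x 1/2
    frozenset([HALF, SIXTEENTH]): [SIXTEENTH],   # 1/2  x 1/16
    frozenset([SIXTEENTH]): [VAC, HALF],         # 1/16 x 1/16
}

def fuse(a, b):
    """Fusion product of two irreducibles, as a multiset of labels."""
    if a == VAC:           # assumption: L(1/2, 0) is the identity for fusion
        return Counter([b])
    if b == VAC:
        return Counter([a])
    return Counter(_RULES[frozenset([a, b])])

def fuse_sum_right(x, b):
    """Fuse a direct sum x (a Counter of labels) with an irreducible b."""
    out = Counter()
    for a, mult in x.items():
        for c, k in fuse(a, b).items():
            out[c] += mult * k
    return out

def fuse_sum_left(a, y):
    """Fuse an irreducible a with a direct sum y (a Counter of labels)."""
    out = Counter()
    for b, mult in y.items():
        for c, k in fuse(a, b).items():
            out[c] += mult * k
    return out

def is_associative():
    return all(
        fuse_sum_right(fuse(a, b), c) == fuse_sum_left(a, fuse(b, c))
        for a, b, c in product(LABELS, repeat=3)
    )

def parity(h):
    """1 if the label is 1/16, else 0."""
    return 1 if h == SIXTEENTH else 0

def respects_parity():
    return all(
        parity(c) == (parity(a) + parity(b)) % 2
        for a, b in product(LABELS, repeat=2)
        for c in fuse(a, b)
    )

if __name__ == "__main__":
    print(dict(fuse(SIXTEENTH, SIXTEENTH)))
    print("associative:", is_associative())
    print("1/16-parity additive:", respects_parity())
```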


Definition 2.3 ([DGH]). A simple VOA $V$ is said to be framed if there exists a set $\{e_1, \dots, e_n\}$ of mutually orthogonal Ising vectors of $V$ such that their sum $e_1 + \cdots + e_n$ is equal to the conformal element of $V$. The sub VOA $T_n$ generated by $e_1, \dots, e_n$ is thus isomorphic to $L(1/2,0)^{\otimes n}$ and is called a Virasoro frame or simply a frame of $V$. By abuse of notation, we sometimes also call the set of Ising vectors $\{e_1, \dots, e_n\}$ a frame.

2.1. Structure codes. Given a framed VOA $V$ with a frame $T_n$, one can associate two binary codes $C$ and $D$ of length $n$ to $V$ and $T_n$ as follows. Since $T_n \cong L(1/2,0)^{\otimes n}$ is rational, $V$ is a completely reducible $T_n$-module. That is,
\[ V \cong \bigoplus_{h^i \in \{0, \frac12, \frac1{16}\}} m_{h^1,\dots,h^n}\, L(1/2,h^1)\otimes\cdots\otimes L(1/2,h^n), \]
where the nonnegative integer $m_{h^1,\dots,h^n}$ is the multiplicity of $L(1/2,h^1)\otimes\cdots\otimes L(1/2,h^n)$ in $V$. In particular, all the multiplicities are finite. It was also shown in [DMZ] that $m_{h^1,\dots,h^n}$ is at most 1 if all $h^i$ are different from $1/16$.

Definition 2.4. Let $U \cong L(1/2,h^1)\otimes\cdots\otimes L(1/2,h^n)$ be an irreducible module for $T_n$. We define the $1/16$-word (or $\tau$-word) $\tau(U)$ of $U$ as the binary word $\beta = (\beta_1, \dots, \beta_n) \in \mathbb{Z}_2^n$ such that
\[ \beta_i = \begin{cases} 0 & \text{if } h^i = 0 \text{ or } 1/2,\\ 1 & \text{if } h^i = 1/16. \end{cases} \tag{2.2} \]
For any $\beta \in \mathbb{Z}_2^n$, denote by $V^\beta$ the sum of all irreducible submodules $U$ of $V$ such that $\tau(U) = \beta$.

Definition 2.5. Define $D := \{\beta \in \mathbb{Z}_2^n \mid V^\beta \neq 0\}$. Then $D$ becomes a binary code of length $n$ and $V$ can be written as a sum
\[ V = \bigoplus_{\beta\in D} V^\beta. \]

For any $c = (c_1, \dots, c_n) \in \mathbb{Z}_2^n$, denote $M_c = m_{h^1,\dots,h^n}\, L(1/2,h^1)\otimes\cdots\otimes L(1/2,h^n)$, where $h^i = 1/2$ if $c_i = 1$ and $h^i = 0$ elsewhere. Note that $m_{h^1,\dots,h^n} \le 1$ since $h^i \neq 1/16$.

Definition 2.6. Define $C := \{c \in \mathbb{Z}_2^n \mid M_c \neq 0\}$. Then $C$ also forms a binary code and $V^0 = \bigoplus_{c\in C} M_c$.

Remark 2.7. The VOA $V^0$ is often called the code VOA associated to $C$ and is denoted by $M_C$ ([M1]). For a coset $\delta + C$ of $C$ in $\mathbb{Z}_2^n$, we also denote
\[ M_{\delta+C} = \bigoplus_{c\in\delta+C} M_c. \]
In this case, $M_{\delta+C}$ is an irreducible $M_C$-module (cf. [M2]).

Summarizing, there exists a pair of even linear codes $(C, D)$ such that
\[ V = \bigoplus_{\beta\in D} V^\beta \quad\text{and}\quad V^0 = \bigoplus_{c\in C} M_c. \]



The codes $(C, D)$ are called the structure codes of a framed VOA $V$ associated to the frame $T_n$. We also call the code $D$ the $\frac1{16}$-code and the code $C$ the $\frac12$-code of $V$ with respect to $T_n$. Note also that all $V^\beta$, $\beta \in D$, are irreducible $V^0$-modules. Since $V$ is a VOA, its weights are integers and we have the following lemma.

Lemma 2.8.
1. $D$ is triply even, i.e., $\mathrm{wt}(\alpha) \equiv 0 \bmod 8$ for all $\alpha \in D$.
2. $C$ is even.

The following theorem is also well known (cf. [DGH, Theorem 2.9] and [M3, Theorem 6.1]):

Theorem 2.9. Let $V$ be a framed VOA with structure codes $(C, D)$. Then $V$ is holomorphic if and only if $C = D^\perp$.

In [LY], the structure of a general framed VOA has been studied in detail. It was shown that the structure codes $(C, D)$ satisfy some duality conditions. In particular, the following are established (see Remark 6 and Theorem 10 of [LY]).

Lemma 2.10. Let $D$ be a triply even binary code of length $16k$, $k \in \mathbb{Z}^+$, with $\mathbf{1} \in D$ and let $C = D^\perp$ be its dual code. Then for any $\beta \in D$, the subcode $C_\beta = \{\alpha \in C \mid \mathrm{supp}\,\alpha \subset \mathrm{supp}\,\beta\}$ contains a doubly even self-dual subcode with respect to the support of $\beta$.

Proposition 2.11. Let $D$ be a triply even binary code of length $16k$, $k \in \mathbb{Z}^+$, with $\mathbf{1} \in D$. Let $C = D^\perp$ be its dual code. Then there exists a set of irreducible $M_C$-modules $\{V^\beta \mid \beta \in D\}$ with $\tau(V^\beta) = \beta$ such that
(1) $V^0 = M_C$ and all $V^\beta$, $\beta \in D$, have integral weights;
(2) $V^\beta \boxtimes_{M_C} V^{\beta'} = V^{\beta+\beta'}$ for all $\beta, \beta' \in D$;
(3) all $V^\beta$, $\beta \in D$, are simple current modules.
In particular, the space $V = \bigoplus_{\beta\in D} V^\beta$ is a holomorphic VOA and the $\frac1{16}$-code of $V$ is $D$.

Theorem 2.12. Let $D$ be a linear binary code of length $16k$, $k \in \mathbb{Z}^+$. Then $D$ can be realized as the $\frac1{16}$-code of a holomorphic framed VOA of central charge $8k$ if and only if (1) $D$ is triply even and (2) the all-one vector $\mathbf{1} \in D$.

By the theorem above, the classification of the $\frac1{16}$-codes for holomorphic framed VOAs is equivalent to the classification of triply even codes of length $16k$.

It turns out that most triply even codes can be constructed by certain doubling processes or are contained in some doublings [BM,DGH] (see also Sect. 3). However, in [BM], a very special triply even code $D^{ex}$ of length 48 is constructed. It has dimension 9 and minimal weight 16, but it is not contained in any doubling. In this article, we shall construct explicitly a holomorphic framed VOA which realizes $D^{ex}$ as the $\frac1{16}$-code. We shall also construct several other VOAs using the subcodes of $D^{ex}$.

2.2. Miyamoto involutions and $\mathbb{Z}_2$-orbifold construction. Next, we shall review the definition of Miyamoto involutions [M1] and the notion of $\mathbb{Z}_2$-orbifold construction.

Definition 2.13. Let $V = \bigoplus_{\beta\in D} V^\beta$ be a framed VOA with the structure codes $(C, D)$, where $C, D \subset \mathbb{Z}_2^n$. For a binary word $\beta \in \mathbb{Z}_2^n$, we define
\[ \tau_\beta(u) := (-1)^{(\alpha,\beta)}\, u \quad \text{for } u \in V^\alpha. \tag{2.3} \]
Then by the fusion rules, $\tau_\beta$ defines an automorphism on $V$ [M1].


Similarly, we can define an automorphism on $V^0$ by
\[ \sigma_\beta(u) := (-1)^{(\alpha,\beta)}\, u \quad \text{for } u \in M_\alpha, \]
where $V^0 = \bigoplus_{\alpha\in C} M_\alpha$. Note that $\sigma_\beta$ is just an automorphism of $V^0$. It does not necessarily lift to an automorphism of $V$. Nevertheless, the following holds.

Theorem 2.14 (cf. Theorem 12 of [LY]). Let $V$ be a framed VOA with the structure codes $(C, D)$. Let $\xi\cdot\beta = (\xi_1\beta_1, \dots, \xi_n\beta_n)$ be the coordinatewise product of $\xi$ and $\beta$. For a binary word $\xi \in \mathbb{Z}_2^n$, there exists $g \in \mathrm{Aut}(V)$ such that $g|_{V^0} = \sigma_\xi$ if and only if $\xi\cdot\beta \in C$ for all $\beta \in D$. Moreover, $g$ has order 2 if $\mathrm{wt}(\xi\cdot\beta) \equiv 0 \bmod 4$ for all $\beta \in D$; otherwise, $g$ has order 4.

Remark 2.15. Note that $\sigma_\xi|_T = \mathrm{id}_T$, where $T$ is the Virasoro frame of $V$. Hence the automorphism $g$ fixes $T$ pointwise. As remarked in [LY], the automorphism $g$ is not unique. In fact, if $g'$ is another automorphism such that $g'|_{V^0} = \sigma_\xi$, then $g' = g\tau_\alpha$ for some $\alpha \in \mathbb{Z}_2^n$.

2.2.1. Irreducible $M_C$-modules. We shall review the structure of irreducible $M_C$-modules from [M2].

Notation 2.16. For any $\alpha, \beta \in \mathbb{Z}_2^n$, we denote
\[
\begin{aligned}
L_{\frac12}(\alpha) &:= L(\tfrac12, \tfrac12\alpha_1)\otimes\cdots\otimes L(\tfrac12, \tfrac12\alpha_n),\\
L_{\frac1{16}}(\beta) &:= L(\tfrac12, \tfrac1{16}\beta_1)\otimes\cdots\otimes L(\tfrac12, \tfrac1{16}\beta_n).
\end{aligned}
\]
In this notation, every irreducible $T$-module with $\frac1{16}$-word $\beta$ can be written as
\[ L_{\frac1{16}}(\beta) \boxtimes_T L_{\frac12}(\gamma\cdot(\mathbf{1}+\beta)) \quad \text{for some } \gamma \in C. \]

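Next we shall recall a notion of induced modules from [M2]. Let $\beta \in C^\perp := \{\alpha \in \mathbb{Z}_2^n \mid (\alpha,\gamma) = 0 \text{ for all } \gamma \in C\}$ and $C_\beta := \{\alpha \in C \mid \mathrm{supp}\,\alpha \subseteq \mathrm{supp}\,\beta\}$. Let the group $\hat{C} = \{\pm e^\alpha \mid \alpha \in C\}$ be a central extension of $C$ by $\{\pm 1\}$ such that
\[ e^a e^b = (-1)^{(a,b)}\, e^b e^a \quad \text{for any } a, b \in C, \]
and denote $\hat{C}_\beta := \{\pm e^a \mid a \in C_\beta\} \subset \hat{C}$. Let $H$ be a maximal self-orthogonal subcode of $C_\beta$. Then $\hat{H} = \{\pm e^\alpha \mid \alpha \in H\}$ is a maximal abelian subgroup of $\hat{C}_\beta$ (it is automatically normal). Take a linear irreducible character $\chi$ of $\hat{H}$ with $\chi(-e^0) = -1$ and define a 1-dimensional $\hat{H}$-module $F_\chi$ by the action
\[ e^\alpha p = \chi(e^\alpha)\, p \quad \text{for } p \in F_\chi,\ \alpha \in H. \]
For any $h^i \in \{0, \frac12, \frac1{16}\}$, $i = 1, \dots, n$, with $\tau\big(\bigotimes_{i=1}^n L(\frac12, h^i)\big) = \beta$, we define
\[ W = \Big(\bigotimes_{i=1}^n L(\tfrac12, h^i)\Big) \otimes F_\chi. \]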



Then W becomes an M H -module with the vertex operator defined by

ai i n n Y ⊗i=1 u i ⊗ eα , z = ⊗i=1 I 2 ,h u i , z ⊗ χ eα , ai

where u i ∈ L( 21 , a2i ), (a1 , . . . , an ) ∈ H , and I 2 ,h is an intertwining operator of type L( 21 , a2i ) L( 1 ,0) L( 21 , h i ) 2 . L( 21 , h i ) L( 21 , a2i ) We shall denote this M H -module by L h 1 , . . . , h n ⊗ Fχ . Let {β j }sj=1 be a transversal of H in C and define

L(h 1 , . . . , h n ) L 1 (β j ) ⊗ eβ j ⊗ Hˆ Fχ . IndCH L(h 1 , . . . , h n ) ⊗ Fχ = i

T

β j ∈C/H

2

Note that the M H -module structure of IndCH L(h 1 , . . . , h n ) ⊗ Fχ does not depend on the choice of the transversal of H in C. The following result can be found in Miyamoto [M2]. Theorem 2.17. The induced module IndCH L(h 1 , . . . , h n ) ⊗ Fχ is an MC -module. Theorem 2.18. Let U be an irreducible MC -module with τ (U ) = β and let H be any maximal self-orthogonal subcode of Cβ = {α ∈ C| supp α ⊆ supp β}. Then there exist 1 h , . . . , h n ∈ {0, 1/2, 1/16}n and a linear character χ of Hˆ such that

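\[ U \cong \mathrm{Ind}_H^C\, L(h^1, \dots, h^n) \otimes F_\chi. \]
Moreover, the $M_C$-module structure of $U$ is uniquely determined by an irreducible $M_H$-submodule.

2.2.2. $\mathbb{Z}_2$-orbifold construction. Next, we shall recall the $\mathbb{Z}_2$-orbifold construction for holomorphic framed VOAs. Let $V = \bigoplus_{\beta\in D} V^\beta$ be a holomorphic framed VOA with the structure codes $(C, D)$. For any $\delta \in \mathbb{Z}_2^n \setminus C$, denote
\[ D^0 = \{\beta \in D \mid (\beta,\delta) = 0\} \quad\text{and}\quad D^1 = \{\beta \in D \mid (\beta,\delta) \neq 0\}. \]
Then the Miyamoto involution $\tau_\delta$ has order 2 and its fixed point subspace is $V^{\tau_\delta} = \bigoplus_{\beta\in D^0} V^\beta$. Define
\[
\tilde V(\tau_\delta) =
\begin{cases}
\Big(\bigoplus_{\beta\in D^0} V^\beta\Big) \oplus \Big(M_{\delta+C} \boxtimes_{M_C} \bigoplus_{\beta\in D^1} V^\beta\Big) & \text{if } \mathrm{wt}(\delta) \text{ is odd},\\[6pt]
\Big(\bigoplus_{\beta\in D^0} V^\beta\Big) \oplus \Big(M_{\delta+C} \boxtimes_{M_C} \bigoplus_{\beta\in D^0} V^\beta\Big) & \text{if } \mathrm{wt}(\delta) \text{ is even},
\end{cases}
\]
where $\boxtimes_{M_C}$ denotes the fusion product over $M_C$.

Theorem 2.19 (cf. [LY]). $\tilde V(\tau_\delta)$ is a holomorphic framed VOA. Moreover, the structure codes of $\tilde V(\tau_\delta)$ are given by $(C, D)$ if $\mathrm{wt}(\delta)$ is odd and $(C \cup (\delta + C), D^0)$ if $\mathrm{wt}(\delta)$ is even.

We call the VOA $\tilde V(\tau_\delta)$ the $\tau_\delta$-orbifold of $V$.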


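2.2.3. $\sigma$-type involutions. Next, let us consider another kind of $\mathbb{Z}_2$-orbifold construction. Let $\xi \in \mathbb{Z}_2^n \setminus D$ be such that $\tilde D = \mathrm{span}_{\mathbb{Z}_2}\{D, \xi\}$ is still triply even. Set $C^0 = \{\alpha \in C \mid (\alpha,\xi) = 0\}$ and $C^1 = \{\alpha \in C \mid (\alpha,\xi) = 1\}$. Then $[C : C^0] = 2$ and $C = C^0 \cup (\delta + C^0)$ for some $\delta \in C^1$. Moreover, $(C^0)^\perp = \tilde D$.

Since $\tilde D = \mathrm{span}_{\mathbb{Z}_2}\{D, \xi\}$ is triply even, it is clear that $\mathrm{wt}(\xi\cdot\beta) \equiv 0 \bmod 4$ and $\mathrm{wt}(\xi\cdot\beta\cdot\gamma) \equiv 0 \bmod 2$, i.e., $\xi\cdot\beta \in D^\perp = C$, for any $\beta, \gamma \in D$. By the proof of [LY, Theorem 12], for every $\beta \in D$, the subcode $(C^0)_\beta = \{\alpha \in C^0 \mid \mathrm{supp}\,\alpha \subset \mathrm{supp}\,\beta\}$ contains a doubly even self-dual subcode $H_\beta$ (cf. Lemma 2.10). Clearly, $C_\beta \supset (C^0)_\beta \supset H_\beta$ and $H_\beta$ is also a maximal self-orthogonal subcode of $C_\beta$. Thus, for any $\beta \in D$, there exist $\gamma \in \mathbb{Z}_2^n$ and $\chi_\beta \in \mathrm{Irr}\,\hat H_\beta$ such that
\[
\begin{aligned}
V^\beta &= \mathrm{Ind}_{H_\beta}^{C}\Big( \big(L_{\frac1{16}}(\beta) \boxtimes_T L_{\frac12}(\gamma\cdot(\mathbf{1}+\beta))\big) \otimes F_{\chi_\beta} \Big)\\
&= \mathrm{Ind}_{H_\beta}^{C^0}\Big( \big(L_{\frac1{16}}(\beta) \boxtimes_T L_{\frac12}(\gamma\cdot(\mathbf{1}+\beta))\big) \otimes F_{\chi_\beta} \Big)\\
&\quad \oplus\ \mathrm{Ind}_{H_\beta}^{C^0}\Big( \big(L_{\frac1{16}}(\beta) \boxtimes_T L_{\frac12}((\gamma+\delta)\cdot(\mathbf{1}+\beta))\big) \otimes (e^\delta \otimes F_{\chi_\beta}) \Big).
\end{aligned}
\]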

Therefore, V β is a direct sum of two irreducible MC 0 -modules, say V β,1 and V β,2 . Note also that V β,2 = Mδ+C 0 MC 0 V β,1 and V β,1 and V β,2 are simple current modules of MC 0 by [LY, Cor. 4]. In particular, V is a simple current extension of MC 0 . Since D˜ is triply even, ξ ∈ D˜ and C 0 = D˜ ⊥ , (C 0 )ξ also contains a doubly even selfdual subcode (cf. Lemma 2.10 and [LY, Remark 6]). Let H be a doubly even self-dual subcode of (C 0 )ξ and let χ be an irreducible character of Hˆ such that χ (−e0 ) = −1. Then 0

U = IndCH L 1 (ξ ) ⊗ Fχ 16

is an irreducible MC 0 -module. Moreover, U MC 0 U = MC 0 by [LY, Prop. 6 and Corollary 4]. Since V β,2 = Mδ+C 0 MC 0 V β,1 , we have

V

β,2

U = Mδ+C 0

MC 0

MC 0

V

β,1

U .

MC 0

Note that the 1/16-word of V β,1 MC 0 U is β + ξ and (δ, β + ξ ) = (δ, β) + (δ, ξ ) = 1. Thus the weights of V β,1 MC 0 U and V β,2 MC 0 U are differed by 1/2 + Z. Therefore, one of V β,1 MC 0 U and V β,2 MC 0 U has integral weights and the other one has weights in 21 + Z. Notation 2.20. Let V β,+ be the irreducible MC 0 -module of V β such that V β,+ MC 0 U has the same weights as U modulo Z and let V β,− be the irreducible MC 0 -module of V β such that the weights of V β,− MC 0 U and U differ by 1/2 + Z.



Lemma 2.21 (cf. Theorem 12 of [LY]). Let g ∈ End(V ) such that g|V β,+ = 1 and g|V β,− = −1 for all β ∈ D. Then g is an automorphism of order 2 in Aut (V ) and g| MC = σξ . Remark 2.22. Note that the automorphism g, in general, depends on the choice of the MC 0 -module U but it is uniquely determined by U . 0

Since U = IndCH L 1 (ξ ) ⊗ Fχ has the minimal weight 16

V β,−

1 16 wt(ξ ),

MC 0 U are integral if wt(ξ ) ≡ 8 mod 16 and are in Now let U β = V β,+ and β,− V MC 0 U if wt(ξ ) ≡ 8 mod 16, ξ +β = U β,+ V MC 0 U if wt(ξ ) ≡ 0 mod 16,

the weights of

1 2 +Z if wt(ξ )

≡ 0 mod 16.

for any $\beta \in D$. Then all $U^\beta$'s have integral weights and are simple current $M_{C^0}$-modules. Moreover, the set $\{U^\beta \mid \beta \in \tilde D\}$ is closed under the fusion rules since $U \boxtimes_{M_{C^0}} U = M_{C^0}$. The following theorem now follows immediately from [LY, Theorem 8].

Theorem 2.23 ([LY]). Let $\tilde D = \mathrm{span}_{\mathbb{Z}_2}\{D, \xi\}$. Then $\tilde V(g) = \bigoplus_{\beta\in\tilde D} U^\beta$ is a holomorphic framed VOA. Moreover, the structure codes associated to the frame $T$ for $\tilde V(g)$ are given by $(C^0, \tilde D)$.

Remark 2.24. By [LY, Theorem 1], $V^T = \bigoplus_{\beta\in D} V^\beta \boxtimes_{M_{C^0}} U$ is actually an irreducible $g$-twisted module of $V$.

Remark 2.25. Let $V = \bigoplus_{\beta\in D} V^\beta$ be a holomorphic framed VOA with the $1/16$-code $D$. By Theorem 2.19, for any subcode $\mathbf{1} \in D_0 < D$, we can construct a holomorphic framed VOA $V'$ with the $1/16$-code $D_0$ by successive $\mathbb{Z}_2$-orbifoldings using $\tau$-involutions. Moreover, $\bigoplus_{\beta\in D_0} V^\beta \subset V'$. Conversely, for any triply even code $\tilde D > D$, we may also construct a holomorphic framed VOA $\tilde V$ with the $1/16$-code $\tilde D$ by using lifts of $\sigma$-involutions (cf. Theorem 2.23).

3. $\mathbb{Z}_4$-Codes and Framed VOAs

In this section, we shall recall a basic construction of framed VOAs from codes over $\mathbb{Z}_4$ (cf. [DGH]). Most known examples of framed VOAs, including the famous moonshine VOA, are constructed by this method.

Let $\mathcal{C}$ be a (linear) $\mathbb{Z}_4$-code. The Euclidean weight of a codeword $x = (x_1, \dots, x_n)$ of $\mathcal{C}$ is $\sum_{i=1}^n \min\{x_i^2, (4-x_i)^2\}$. Suppose $\mathcal{C}$ is a self-orthogonal $\mathbb{Z}_4$-code and the Euclidean weights of all elements in $\mathcal{C}$ are divisible by 8. Define
\[ A_4(\mathcal{C}) = \frac12 \big\{ (x_1, \dots, x_n) \in \mathbb{Z}^n \mid (x_1, \dots, x_n) \in \mathcal{C} \bmod 4 \big\}. \]
Then $A_4(\mathcal{C})$ is an even lattice. It is also well known that $A_4(\mathcal{C})$ is unimodular if and only if $\mathcal{C}$ is self-dual. Note that if $\mathcal{C} = 0$, then $A_4(\mathcal{C}) = (2\mathbb{Z})^n \cong (\sqrt2 A_1)^n$.


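Definition 3.1. A self-dual $\mathbb{Z}_4$-code $\mathcal{C}$ is of type II if the Euclidean weights of all elements in $\mathcal{C}$ are divisible by 8.

Note that a type II self-dual $\mathbb{Z}_4$-code exists if and only if the length $n \in 8\mathbb{Z}$ (cf. [CS]). In [DMZ] and [M2], it was shown that the lattice VOA $V_{\sqrt2 A_1}$ is framed and
\[ V_{\sqrt2 A_1} \cong L(\tfrac12, 0)\otimes L(\tfrac12, 0) \ \oplus\ L(\tfrac12, \tfrac12)\otimes L(\tfrac12, \tfrac12) \]
as an $L(\frac12,0)\otimes L(\frac12,0)$-module. Hence the lattice VOA $V_{A_4(\mathcal{C})} \supset V_{A_4(0)} \cong (V_{\sqrt2 A_1})^{\otimes n}$ is also framed for any self-orthogonal $\mathbb{Z}_4$-code $\mathcal{C}$.

Now let us study the structure codes for the lattice VOA $V_{A_4(\mathcal{C})}$. Let $\mathcal{C}$ be a self-dual $\mathbb{Z}_4$-code. Denote
\[
\begin{aligned}
\mathcal{C}_{tor} &= \{(\alpha_1, \dots, \alpha_n) \in \{0,1\}^n \mid (2\alpha_1, \dots, 2\alpha_n) \in \mathcal{C}\},\\
\mathcal{C}_{res} &= \{\alpha \in \{0,1\}^n \mid \alpha \equiv \beta \bmod 2 \text{ for some } \beta \in \mathcal{C}\}.
\end{aligned}
\]
These codes $\mathcal{C}_{tor}$ and $\mathcal{C}_{res}$ are called the torsion and residue codes, respectively. Both $\mathcal{C}_{tor}$ and $\mathcal{C}_{res}$ are even binary codes. Moreover, $\mathcal{C}_{res}$ is doubly even and $\mathcal{C}_{tor} = \mathcal{C}_{res}^\perp$.

Now let us define three linear maps $d : \mathbb{Z}_2^n \to \mathbb{Z}_2^{2n}$, $\ell : \mathbb{Z}_2^n \to \mathbb{Z}_2^{2n}$ and $r : \mathbb{Z}_2^n \to \mathbb{Z}_2^{2n}$ such that
\[
\begin{aligned}
d(a_1, a_2, \dots, a_n) &= (a_1, a_1, a_2, a_2, \dots, a_n, a_n),\\
\ell(a_1, a_2, \dots, a_n) &= (a_1, 0, a_2, 0, \dots, a_n, 0),\\
r(a_1, a_2, \dots, a_n) &= (0, a_1, 0, a_2, \dots, 0, a_n),
\end{aligned}
\tag{3.1}
\]
for any $(a_1, a_2, \dots, a_n) \in \mathbb{Z}_2^n$.

Proposition 3.2 (cf. [DGH]). Let $\mathcal{C}$ be a type II self-dual $\mathbb{Z}_4$-code and let $\mathcal{C}_{tor}$ and $\mathcal{C}_{res}$ be defined as above. Then the structure codes of the lattice VOA $V_{A_4(\mathcal{C})}$ are given by
\[ D = d(\mathcal{C}_{res}) \quad\text{and}\quad C = D^\perp = \mathrm{span}_{\mathbb{Z}_2}\{d(\mathbb{Z}_2^n), \ell(\mathcal{C}_{tor})\}. \]

Now set $\xi = (1010\ldots10)$. Then for any $\beta = d(\alpha) \in d(\mathcal{C}_{res}) = D$, we have
\[ \xi\cdot\beta = \ell(\alpha) \in \ell(\mathcal{C}_{res}) \subset \ell(\mathcal{C}_{tor}) \subset C \quad\text{and}\quad \mathrm{wt}(\xi\cdot\beta) \equiv 0 \bmod 4. \]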

By Theorem 2.23, we can construct a $g$-orbifold VOA $\tilde V_{A_4(\mathcal{C})}(g) = \bigoplus_{\beta\in\tilde D} U^\beta$, for some order 2 automorphism $g \in \mathrm{Aut}(V_{A_4(\mathcal{C})})$ such that $g|_{M_C} = \sigma_\xi$. Since $\sigma_\xi$ acts as $-1$ on the weight one subspace of $M_{d(\mathbb{Z}_2^n)}$, which generates the Heisenberg sub VOA in $V_{A_4(\mathcal{C})}$, $g$ is conjugate to the automorphism $\theta$, a lift of the $(-1)$-map on the lattice $A_4(\mathcal{C})$ (cf. [DGH,FLM]).

Notation 3.3. Since $g$ is conjugate to $\theta$, we have $\tilde V_{A_4(\mathcal{C})}(g) \cong \tilde V_{A_4(\mathcal{C})}(\theta)$. We shall denote $\tilde V_{A_4(\mathcal{C})}(g) \cong \tilde V_{A_4(\mathcal{C})}(\theta)$ by $\tilde V_{A_4(\mathcal{C})}$ in this case.



Theorem 3.4 ([LY]). The VOA $\tilde V_{A_4(\mathcal{C})}$ is a holomorphic framed VOA. Moreover, the structure codes associated to the frame $T$ for $\tilde V_{A_4(\mathcal{C})}$ are given by $(C^0, \tilde D)$, where $\tilde D = \mathrm{span}_{\mathbb{Z}_2}\{D, (10)^n\}$.

Definition 3.5. Let $E$ be a binary code of length $n$. We define
\[ \mathcal{D}(E) = \mathrm{span}_{\mathbb{Z}_2}\{d(E), (10)^n\} \]
to be the code generated by $d(E)$ and $(10)^n$. We call the code $\mathcal{D}(E)$ the extended doubling (or simply the doubling) of $E$.

Lemma 3.6. If $E$ is a doubly even $[8n, k]$ code, then the doubling $\mathcal{D}(E)$ is a triply even $[16n, k+1]$ code.

Remark 3.7. By the discussion above, for a given type II $\mathbb{Z}_4$-code $\mathcal{C}$, one can construct a framed VOA $\tilde V_{A_4(\mathcal{C})}$ whose $\frac1{16}$-code is given by $\mathcal{D}(\mathcal{C}_{tor})$.

Notation 3.8. For any positive integer $n$, let $\mathcal{E}_n$ be the subcode of $\mathbb{Z}_2^n$ consisting of all even codewords. Note that $\mathcal{D}(E)^\perp > d(\mathcal{E}_n)$ for any binary code $E$.

The following is a partial converse of Theorem 3.4.

Theorem 3.9. Let $V = \bigoplus_{\beta\in D} V^\beta$ be a holomorphic framed VOA with the $\frac1{16}$-code $D$. Suppose that $D$ can be embedded into a doubling $\mathcal{D}(E)$ for some doubly even code $E$. Then there is an even unimodular lattice $N$ such that $V \cong V_N$ or $\tilde V_N$.

Proof. Let $E$ be a doubly even code of length $n$ and assume that $D < \mathcal{D}(E)$. Then $D^\perp > \mathcal{D}(E)^\perp > d(\mathcal{E}_n)$. If $D^\perp$ contains $d(\mathbb{Z}_2^n)$, then $V$ contains a subVOA isomorphic to $V_{\sqrt2 A_1}^{\otimes n}$ and hence $V$ is isomorphic to a lattice VOA $V_N$, where $N$ is an even unimodular lattice since $V$ is holomorphic.

Otherwise, let $\alpha = (1100\ldots0)$. Then $\alpha \notin D^\perp$ and the structure code of the $\tau_\alpha$-orbifold $\tilde V(\tau_\alpha)$ of $V$ contains $d(\mathbb{Z}_2^n)$. Hence $\tilde V(\tau_\alpha)$ is isomorphic to a lattice VOA $V_N$ for some even unimodular lattice $N$. Since $(\sqrt2 A_1)^*/\sqrt2 A_1 \cong \mathbb{Z}_4$, the quotient $N/(\sqrt2 A_1)^n$ defines a type II self-dual $\mathbb{Z}_4$-code $\mathcal{C}$ and $N \cong A_4(\mathcal{C})$. In this case, $D^0 = \{\beta \in D \mid (\beta,\alpha) = 0\} \cong d(\mathcal{C}_{res})$. Let $\xi \in D \setminus D^0$; then $\sigma_\xi$ acts as $-1$ on $(M_{d(\mathbb{Z}_2^n)})_1$. Then by the same argument as in Theorem 3.4, $V$ is isomorphic to $\tilde V_N$.

4. An Exceptional [48, 9] Triply Even Code

In this section, we shall recall the properties of the exceptional [48, 9] code constructed in [BM].
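For contrast with the exceptional code, the extended doubling of Definition 3.5 is easy to experiment with. The Python sketch below uses hypothetical helper names. It builds $\mathcal{D}(E)$ for one doubly even $[8,4]$ code $E$, whose generator matrix is chosen here purely for illustration and serves as a model of the extended Hamming code. It then checks the conclusion of Lemma 3.6 in this single case: the doubling is a $[16,5]$ code in which every weight is divisible by 8.

```python
from collections import Counter

def span(generators):
    """All Z_2-linear combinations of the given words (tuples of 0/1)."""
    n = len(generators[0])
    words = {tuple([0] * n)}
    for g in generators:
        words |= {tuple((a + b) % 2 for a, b in zip(w, g)) for w in words}
    return words

def d(word):
    """The doubling map d(a_1, ..., a_n) = (a_1, a_1, ..., a_n, a_n)."""
    return tuple(bit for a in word for bit in (a, a))

def extended_doubling(generators):
    """Span of d(E) together with the word (10)^n, as in Definition 3.5."""
    n = len(generators[0])
    xi = (1, 0) * n
    return span([d(g) for g in generators] + [xi])

# Illustrative generators of a doubly even [8, 4] code (an assumption of this sketch).
E_GENERATORS = [
    (1, 1, 1, 1, 0, 0, 0, 0),
    (0, 0, 1, 1, 1, 1, 0, 0),
    (0, 0, 0, 0, 1, 1, 1, 1),
    (1, 0, 1, 0, 1, 0, 1, 0),
]

E = span(E_GENERATORS)
DOUBLING = extended_doubling(E_GENERATORS)

def weight_distribution(code):
    return Counter(sum(word) for word in code)

if __name__ == "__main__":
    print("|E| =", len(E), "weights:", dict(weight_distribution(E)))
    print("|D(E)| =", len(DOUBLING), "weights:", dict(weight_distribution(DOUBLING)))
```

Every word $d(\alpha) + (10)^n$ has exactly one nonzero entry in each coordinate pair, which is why all words outside $d(E)$ have weight $n = 8$ in this example.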


4.1. Triangular graph. Let $X = \{1, 2, \dots, 10\}$ be a set of 10 elements and let
\[ \Omega := \binom{X}{2} = \big\{ \{i,j\} \mid \{i,j\} \subset X \big\} \]
be the set of all 2-element subsets of $X$. Then $|\Omega| = \binom{10}{2} = 45$.

The triangular graph on $X$ is a graph whose vertex set is $\Omega$ and two vertices $S, S' \in \Omega$ are joined by an edge if and only if $|S \cap S'| = 1$. We shall denote by $\mathcal{T}_{10}$ the binary code generated by the row vectors of the incidence matrix of the triangular graph on $X$.

Remark 4.1. Note that the entries of an incidence matrix are either 0 or 1 and we shall view 0 and 1 as integers modulo 2.

For $\{i,j\} \in \Omega$, let $\gamma_{\{i,j\}}$ be the binary word supported at $\{\{k,\ell\} \mid |\{i,j\}\cap\{k,\ell\}| = 1\}$, i.e., the set of all vertices joined to $\{i,j\}$. Note that
\[ \mathrm{supp}(\gamma_{\{i,j\}}) = \{\{i,k\} \mid k \in X\setminus\{i,j\}\} \cup \{\{j,k\} \mid k \in X\setminus\{i,j\}\} \tag{4.1} \]
and $\mathrm{wt}(\gamma_{\{i,j\}}) = 16$. For convenience, we often identify $\gamma_{\{i,j\}}$ with its support.

Lemma 4.2. For any $i, j, k, \ell \in X$, we have
(1) $\gamma_{\{i,j\}} + \gamma_{\{i,k\}} = \gamma_{\{j,k\}}$ if $j \neq k$, and
(2) $\mathrm{wt}(\gamma_{\{i,j\}} + \gamma_{\{k,\ell\}}) = 24$ if $\{i,j\}\cap\{k,\ell\} = \emptyset$.

Proof. (1) By (4.1), we have
\[
\begin{aligned}
\mathrm{supp}(\gamma_{\{i,j\}}) &= \{\{i,\ell\} \mid \ell \in X\setminus\{i,j\}\} \cup \{\{j,\ell\} \mid \ell \in X\setminus\{i,j\}\},\\
\mathrm{supp}(\gamma_{\{i,k\}}) &= \{\{i,\ell\} \mid \ell \in X\setminus\{i,k\}\} \cup \{\{k,\ell\} \mid \ell \in X\setminus\{i,k\}\},
\end{aligned}
\]
and hence
\[ \mathrm{supp}(\gamma_{\{i,j\}}) \cap \mathrm{supp}(\gamma_{\{i,k\}}) = \{\{i,\ell\} \mid \ell \in X\setminus\{i,j,k\}\} \cup \{\{j,k\}\}. \]
Therefore, we have
\[ \mathrm{supp}(\gamma_{\{i,j\}} + \gamma_{\{i,k\}}) = \{\{j,\ell\} \mid \ell \in X\setminus\{j,k\}\} \cup \{\{k,\ell\} \mid \ell \in X\setminus\{j,k\}\} = \mathrm{supp}(\gamma_{\{j,k\}}). \]
Hence we have (1).

For (2), we note that
\[ \mathrm{supp}(\gamma_{\{i,j\}}) \cap \mathrm{supp}(\gamma_{\{k,\ell\}}) = \{\{i,k\}, \{j,k\}, \{i,\ell\}, \{j,\ell\}\} \]
if $\{i,j\}\cap\{k,\ell\} = \emptyset$. Thus, $\mathrm{wt}(\gamma_{\{i,j\}} + \gamma_{\{k,\ell\}}) = 16 + 16 - 2\times 4 = 24$.

Lemma 4.3. Let $X = \{i_1,j_1\} \cup \{i_2,j_2\} \cup \cdots \cup \{i_5,j_5\}$ be a partition of $X$. Then we have
\[ \gamma_{\{i_1,j_1\}} + \gamma_{\{i_2,j_2\}} + \cdots + \gamma_{\{i_5,j_5\}} = 0. \]

Proof. By (4.1), each $\{i,j\} \neq \{i_1,j_1\}, \dots, \{i_5,j_5\}$ is contained in the support of exactly 2 of $\gamma_{\{i_1,j_1\}}, \gamma_{\{i_2,j_2\}}, \dots, \gamma_{\{i_5,j_5\}}$, while each $\{i_s,j_s\}$ is contained in none of them. Hence, we have $\gamma_{\{i_1,j_1\}} + \gamma_{\{i_2,j_2\}} + \cdots + \gamma_{\{i_5,j_5\}} = 0$.
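Lemmas 4.2 and 4.3 are finite statements about 2-element subsets of a 10-element set, so they can be confirmed by brute force. The Python sketch below uses hypothetical helper names. It represents each word $\gamma_{\{i,j\}}$ by its support and uses the symmetric difference of supports as addition modulo 2.

```python
from itertools import combinations

X = range(1, 11)
PAIRS = [frozenset(p) for p in combinations(X, 2)]   # the 45 two-element subsets

def gamma(i, j):
    """Support of gamma_{i,j}: the pairs meeting {i, j} in exactly one point."""
    edge = frozenset((i, j))
    return frozenset(s for s in PAIRS if len(s & edge) == 1)

def add(*supports):
    """Sum of binary words = symmetric difference of their supports."""
    total = frozenset()
    for s in supports:
        total = total ^ s
    return total

def all_weights_are_16():
    return all(len(gamma(i, j)) == 16 for i, j in combinations(X, 2))

def lemma_4_2_part_1():
    """gamma_{i,j} + gamma_{i,k} = gamma_{j,k} for distinct i, j, k."""
    return all(
        add(gamma(i, j), gamma(i, k)) == gamma(j, k)
        for i in X for j in X for k in X
        if len({i, j, k}) == 3
    )

def lemma_4_2_part_2():
    """wt(gamma_{i,j} + gamma_{k,l}) = 24 for disjoint pairs."""
    return all(
        len(add(gamma(*p), gamma(*q))) == 24
        for p in combinations(X, 2) for q in combinations(X, 2)
        if not set(p) & set(q)
    )

def lemma_4_3_for(partition):
    """The gammas of a partition of X into five pairs sum to zero."""
    return add(*(gamma(i, j) for i, j in partition)) == frozenset()

if __name__ == "__main__":
    print(all_weights_are_16(), lemma_4_2_part_1(), lemma_4_2_part_2())
    print(lemma_4_3_for([(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]))
```

Only two sample partitions are exercised for Lemma 4.3 in the checks; looping over all partitions into five pairs would be a straightforward extension.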



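Lemma 4.4 (Lemma 3.5 of [KMR]). For any $\{i,j\} \in \Omega$, the set $\{\gamma_{\{j,k\}} \mid k \notin \{i,j\}\}$ is a basis of $\mathcal{T}_{10}$. In particular, $\dim \mathcal{T}_{10} = 8$.

Proof. By Lemma 4.2 (1) and Lemma 4.3, we know that
\[ \gamma_{\{p,q\}} \in \mathrm{span}\{\gamma_{\{j,k\}} \mid k \notin \{i,j\}\} \quad \text{for any } \{p,q\} \subset X. \]
Thus, $\{\gamma_{\{j,k\}} \mid k \notin \{i,j\}\}$ spans $\mathcal{T}_{10}$ since $\mathcal{T}_{10}$ is generated by $\gamma_{\{p,q\}}$, $\{p,q\} \subset X$. Now suppose
\[ \sum_{k\in X\setminus\{i,j\}} \lambda_k \gamma_{\{j,k\}} = 0. \]
Again by (4.1), we know that $\{i,k\} \in \mathrm{supp}(\gamma_{\{j,k\}})$ for any $k \in X\setminus\{i,j\}$ but $\{i,k\} \notin \mathrm{supp}(\gamma_{\{j,k'\}})$ if $k' \neq k$. Thus, we have $\lambda_k = 0$ for all $k$ and $\{\gamma_{\{j,k\}} \mid k \notin \{i,j\}\}$ is linearly independent.

Now let $\iota : \mathbb{Z}_2^{45} \to \mathbb{Z}_2^{48}$ be defined by $\iota(\alpha) = (\alpha, 0, 0, 0)$. Then we can embed $\mathcal{T}_{10}$ into $\mathbb{Z}_2^{48}$ using $\iota$.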

Definition 4.5. We shall denote by D ex the binary code generated by ι(T10 ) and the ex = 9. all-one vector 1 in Z48 2 . Clearly, dim D Notation 4.6. We label the indices of the last 3 coordinates by a, b and c and denote ˜ = ∪ {a, b, c}. Then | | ˜ = 48.

Theorem 4.7 (cf. [BM]). The binary code $D^{ex}$ is a maximal triply even code of length 48, that is, it is not properly contained in any triply even code of length 48. Moreover, the weight enumerator of $D^{ex}$ is given by

\[ 1 + 45x^{16} + 420x^{24} + 45x^{32} + x^{48}. \]

Remark 4.8. Note that $\{\gamma_{\{i,j\}} \mid \{i,j\} \in \Omega\}$ is exactly the set of all weight 16 vectors in $D^{ex}$.

Notation 4.9. Let $D^{ex}$ be the triply even code defined in Definition 4.5 and let $C^{ex} = (D^{ex})^{\perp}$ be the dual code. Then $\dim C^{ex} = 39$. For convenience, we shall often identify a codeword with its support.

Notation 4.10. For $\alpha = (\alpha_1, \ldots, \alpha_n)$, $\beta = (\beta_1, \ldots, \beta_n) \in \mathbb{Z}_2^n$, we shall denote by $\alpha \cdot \beta$ the coordinatewise product of $\alpha$ and $\beta$, i.e., $\alpha \cdot \beta = (\alpha_1\beta_1, \ldots, \alpha_n\beta_n)$. We shall also use $p_\beta$ to denote the natural projection of $\mathbb{Z}_2^{48}$ to the support of $\beta$, where $\beta \in \mathbb{Z}_2^{48}$.

Notation 4.11. We denote the extended Hamming $[8, 4, 4]$ code by $H_8$ and denote by $d_{16}^{+}$ the doubly even self-dual code of length 16 generated by

\[
\begin{pmatrix}
1111 & 0000 & 0000 & 0000 \\
0011 & 1100 & 0000 & 0000 \\
0000 & 1111 & 0000 & 0000 \\
0000 & 0011 & 1100 & 0000 \\
0000 & 0000 & 1111 & 0000 \\
0000 & 0000 & 0011 & 1100 \\
0000 & 0000 & 0000 & 1111 \\
1010 & 1010 & 1010 & 1010
\end{pmatrix}.
\]
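The weight enumerator in Theorem 4.7 can be reproduced by listing all $2^9 = 512$ codewords of $D^{ex}$. The sketch below is only a numerical sanity check under an assumed coordinate ordering (bits 0 to 44 for $\Omega$, bits 45 to 47 for $a$, $b$, $c$); the helper names are illustrative, and the maximality claim is not tested.

```python
from collections import Counter
from itertools import combinations

OMEGA = list(combinations(range(1, 11), 2))   # coordinates 0..44
INDEX = {s: n for n, s in enumerate(OMEGA)}
ALL_ONE = (1 << 48) - 1                       # bits 45, 46, 47 stand for a, b, c

def gamma(i, j):
    """Bitmask of gamma_{i,j}, embedded in length 48 with zeros at a, b, c."""
    word = 0
    for s in OMEGA:
        if len({i, j} & set(s)) == 1:
            word |= 1 << INDEX[s]
    return word

def wt(word):
    return bin(word).count("1")

def span(generators):
    """All GF(2)-linear combinations of the generators, as a set of bitmasks."""
    words = {0}
    for g in generators:
        words |= {w ^ g for w in words}
    return words

D_ex = span([gamma(i, j) for i, j in OMEGA] + [ALL_ONE])
enumerator = Counter(wt(w) for w in D_ex)
triply_even = all(wt(w) % 8 == 0 for w in D_ex)
```

Here `enumerator` maps each weight to the number of codewords of that weight, so it can be compared directly with the coefficients of the polynomial above.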


C. H. Lam

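Lemma 4.12. Let $\beta \in D^{ex}$ with $\mathrm{wt}(\beta) = 16$ and let $C_\beta = \{\alpha \in C^{ex} \mid \mathrm{supp}(\alpha) \subset \mathrm{supp}(\beta)\}$. Then we have
1. $p_\beta(C_\beta) \cong d_{16}^{+}$. In particular, it is a doubly even self-dual code.
2. $p_{\mathbf{1}+\beta}(C^{ex}) \cong E_{32}$, the code consisting of all even words in $\mathbb{Z}_2^{32}$.

Proof. Let $\beta \in D^{ex}$ and $\mathrm{wt}(\beta) = 16$. Then $\beta = \gamma_{\{i,j\}}$ for some $\{i,j\} \subset \{1, \ldots, 10\}$. Since $D^{ex}$ is triply even, it is clear that $p_\beta(D^{ex})$ is doubly even. Let $\{\gamma_{\{j,k\}} \mid k \notin \{i,j\}\}$ be a basis of $T_{10}$. Then

\[ \mathrm{supp}\, p_\beta(\gamma_{\{j,k\}}) = \{\{j,\ell\} \mid \ell \in X \setminus \{i,j,k\}\} \cup \{\{i,k\}\}. \]

Since $\{i,k\}$ only appears in $\mathrm{supp}\, p_\beta(\gamma_{\{j,k\}})$, the set $\{p_\beta(\gamma_{\{j,k\}}) \mid k \in X \setminus \{i,j\}\}$ is linearly independent and hence $\dim(p_\beta(T_{10})) = 8$. Note that $p_\beta(D^{ex}) = p_\beta(T_{10})$ and thus $p_\beta(D^{ex})$ is doubly even self-dual. It is also clear that the subcode $C_\beta = \{\alpha \in C^{ex} \mid \mathrm{supp}(\alpha) \subset \mathrm{supp}(\beta)\}$ contains $p_\beta(D^{ex})$ and is orthogonal to $p_\beta(D^{ex})$. Thus, $C_\beta = p_\beta(D^{ex})$ and $C_\beta$ is self-dual. By Lemma 4.2, it is easy to verify that $\{p_\beta(\gamma_{\{k,\ell\}}) \mid \{k,\ell\} \cap \{i,j\} = \emptyset\}$ generates a subcode isomorphic to $d(E_8)$ and hence $C_\beta \cong d_{16}^{+}$.

Now consider the projection of $C^{ex}$ to $\mathrm{supp}(\mathbf{1} + \beta)$. Then

\[ \ker p_{\mathbf{1}+\beta}|_{C^{ex}} = \{\alpha \in C^{ex} \mid \mathrm{supp}(\alpha) \subset \mathrm{supp}(\beta)\} = C_\beta, \]

and thus $\dim(p_{\mathbf{1}+\beta}(C^{ex})) = \dim C^{ex} - \dim C_\beta = 39 - 8 = 31$. Since $p_{\mathbf{1}+\beta}(C^{ex})$ is even, we have $p_{\mathbf{1}+\beta}(C^{ex}) = E_{32}$.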
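The first steps of this proof (that $p_\beta(D^{ex})$ has dimension 8, is doubly even and is self-orthogonal, hence self-dual of length 16) can be checked numerically for a sample $\beta$. The sketch below does this for $\beta = \gamma_{\{1,2\}}$. It assumes an arbitrary coordinate ordering, uses illustrative helper names, and does not test the isomorphism with $d_{16}^{+}$ itself.

```python
from itertools import combinations

OMEGA = list(combinations(range(1, 11), 2))
INDEX = {s: n for n, s in enumerate(OMEGA)}
ALL_ONE = (1 << 48) - 1   # bits 45..47 stand for a, b, c

def gamma(i, j):
    """Bitmask of gamma_{i,j} in length 48."""
    word = 0
    for s in OMEGA:
        if len({i, j} & set(s)) == 1:
            word |= 1 << INDEX[s]
    return word

def wt(word):
    return bin(word).count("1")

def span(generators):
    """All GF(2)-linear combinations of the generators."""
    words = {0}
    for g in generators:
        words |= {w ^ g for w in words}
    return words

D_ex = span([gamma(i, j) for i, j in OMEGA] + [ALL_ONE])

beta = gamma(1, 2)
# Projection to supp(beta): keep only the 16 coordinates where beta is 1.
projected = {w & beta for w in D_ex}

projected_size = len(projected)                      # 2**8 if dim is 8
doubly_even = all(wt(w) % 4 == 0 for w in projected)
self_orthogonal = all(
    wt(u & v) % 2 == 0 for u in projected for v in projected
)
```

A self-orthogonal binary code of length 16 with $2^8$ words is self-dual, which is the property used in the proof.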

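Lemma 4.13. Let $\beta \in D^{ex}$ and $\mathrm{wt}(\beta) = 16$. Let $U$ be any irreducible $M_{C^{ex}}$-module with integral weights and $\tau(U) = \beta$. Then we have

\[ U = \bigoplus_{\gamma \in C^{ex}/C_\beta} L_{\frac{1}{16}}(\beta) \boxtimes_T L_{\frac{1}{2}}(\gamma \cdot (\mathbf{1} + \beta)) \]

as a $T$-submodule. In particular, $\dim U_1 = 1$ and $U_1 \neq 0$.

Proof. Let $W = L_{\frac{1}{16}}(\beta) \boxtimes_T L_{\frac{1}{2}}(\alpha)$ be an irreducible $T$-submodule of $U$ with $\alpha$ supported at $\mathbf{1} + \beta$. Since $\mathrm{wt}(\beta) = 16$ and $U$ has integral weights, $\alpha$ is even. Since $p_{\mathbf{1}+\beta}(C^{ex}) = E_{32}$, there exists $\gamma \in C^{ex}$ such that $\gamma \cdot (\mathbf{1} + \beta) = \alpha$ and thus

\[ L_{\frac{1}{16}}(\beta) \cong \left( L_{\frac{1}{16}}(\beta) \boxtimes_T L_{\frac{1}{2}}(\alpha) \right) \boxtimes_T M_\gamma \]

is also contained in $U$. Hence, we have

\[ U = \bigoplus_{\gamma \in C^{ex}/C_\beta} L_{\frac{1}{16}}(\beta) \boxtimes_T M_\gamma = \bigoplus_{\gamma \in C^{ex}/C_\beta} L_{\frac{1}{16}}(\beta) \boxtimes_T L_{\frac{1}{2}}(\gamma \cdot (\mathbf{1} + \beta)) \]

as a $T$-module. Therefore, $\dim U_1 = \dim(L_{\frac{1}{16}}(\beta))_1 = 1$ and $U_1 \neq 0$.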



5. Constructions of VOA

In this section, we shall give an explicit construction of a VOA $V^{ex}$ whose $\frac{1}{16}$-code is isomorphic to $D^{ex}$. The method is by successive orbifoldings from certain lattice VOAs.

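5.1. Subcodes of $D^{ex}$. In this subsection, we shall study some subcodes of $D^{ex}$. In particular, we shall find a subcode of $D^{ex}$ which is isomorphic to $d(E)$ for some doubly even code $E$. As in Sect. 4, we denote $X = \{1, 2, \ldots, 10\}$, $\Omega = \{\{i,j\} \mid \{i,j\} \subset X\}$ and $\tilde{\Omega} = \Omega \cup \{a, b, c\}$.

Notation 5.1. For any binary code $C$ and a positive integer $n$, we denote $C(n) = \{\alpha \in C \mid \mathrm{wt}(\alpha) = n\}$.

Notation 5.2. Let $\lambda = (\lambda_1, \ldots, \lambda_m)$ be a partition of 10. Let $X_1, \ldots, X_m$ be subsets of $X$ such that $X = \cup_{i=1}^{m} X_i$ and $|X_i| = \lambda_i$ for $1 \le i \le m$. Set $W_\lambda = \{\gamma_{\{i,j\}} \mid \{i,j\} \subset X_k, 1 \le k \le m\}$. Then $W_\lambda$ and the all-one vector $\mathbf{1} \in \mathbb{Z}_2^{48}$ generate a subcode of $D^{ex}$. We shall denote this code by $D_{[\lambda_1,\ldots,\lambda_m]}$ or simply by $[\lambda_1, \ldots, \lambda_m]$. Note that $D^{ex} = D_{[10]} = [10]$.

Remark 5.3. It is clear by definition that the code $D_{[\lambda_1,\ldots,\lambda_m]}$ is uniquely determined by the shape of the partition $(\lambda_1, \ldots, \lambda_m)$ up to the action of $\mathrm{Sym}_{10}$.

Notation 5.4. For a partition $(\lambda_1, \ldots, \lambda_m)$, we define $C_{[\lambda_1,\ldots,\lambda_m]} := (D_{[\lambda_1,\ldots,\lambda_m]})^{\perp}$ and denote

\[ K_{[\lambda_1,\ldots,\lambda_m]} = \mathrm{supp}(D_{[\lambda_1,\ldots,\lambda_m]}(16)) \quad \text{and} \quad K'_{[\lambda_1,\ldots,\lambda_m]} = \tilde{\Omega} \setminus K_{[\lambda_1,\ldots,\lambda_m]}. \]

In addition, we define

\[ C_{[\lambda_1,\ldots,\lambda_m],K} = \{\alpha \in C_{[\lambda_1,\ldots,\lambda_m]} \mid \mathrm{supp}\,\alpha \subset K_{[\lambda_1,\ldots,\lambda_m]}\}, \]
\[ C'_{[\lambda_1,\ldots,\lambda_m]} = \{\alpha \in C_{[\lambda_1,\ldots,\lambda_m]} \mid \mathrm{supp}\,\alpha \subset K'_{[\lambda_1,\ldots,\lambda_m]}\}, \quad \text{and} \]
\[ C_{[\lambda_1,\ldots,\lambda_m],\beta} = \{\alpha \in C_{[\lambda_1,\ldots,\lambda_m]} \mid \mathrm{supp}\,\alpha \subset \mathrm{supp}\,\beta\} \quad \text{for } \beta \in D_{[\lambda_1,\ldots,\lambda_m]}. \]

Lemma 5.5. Let $X = X_1 \cup \cdots \cup X_m$ be a partition of the shape $(\lambda_1, \ldots, \lambda_m)$. Then the point $\{i,j\} \in K_{[\lambda_1,\ldots,\lambda_m]}$ if and only if there exist $k \in \{1, \ldots, m\}$ and $\{p,q\} \subset X_k$ such that $|\{i,j\} \cap \{p,q\}| = 1$.

Proof. Let $X = X_1 \cup \cdots \cup X_m$ be a partition of the shape $(\lambda_1, \ldots, \lambda_m)$. Then the elements of $D_{[\lambda_1,\ldots,\lambda_m]}(16)$ are given by

\[ \gamma_{\{p,q\}}, \quad \{p,q\} \subset X_k \text{ for some } k = 1, \ldots, m. \]

Since $\mathrm{supp}(\gamma_{\{p,q\}}) = \{\{i,j\} \mid |\{i,j\} \cap \{p,q\}| = 1\}$, we have

\[ K_{[\lambda_1,\ldots,\lambda_m]} = \bigcup_{k=1}^{m} \bigcup_{\{p,q\} \subset X_k} \mathrm{supp}(\gamma_{\{p,q\}}) = \bigcup_{k=1}^{m} \bigcup_{\{p,q\} \subset X_k} \{\{i,j\} \mid |\{i,j\} \cap \{p,q\}| = 1\} \]

and we have the desired conclusion.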


Lemma 5.6. Let $D$ be a binary code of length $2n$. Then $D \cong d(E)$ for some binary code $E$ if and only if there is an involution $g \in \mathrm{Sym}_{2n}$ which acts fixed point free on $\{1, 2, \ldots, 2n\}$ but fixes $D$ pointwise.

Proof. The "only if" part is trivial. Now let $g \in \mathrm{Sym}_{2n}$ be an involution which acts fixed point free on $\{1, \ldots, 2n\}$ and fixes $D$ pointwise. Then $g$ must be a product of $n$ transpositions. Without loss of generality, we may assume $g = (1, n+1)(2, n+2) \cdots (n, 2n)$. Let $p$ be the projection of $\mathbb{Z}_2^{2n}$ to the first $n$ coordinates and denote $E = p(D)$. Then we must have $D \cong d(E)$ since $g$ fixes $D$ pointwise.

Lemma 5.7. The subcode $[4, 2, 2, 2]$ is isomorphic to $d(E)$ for some doubly even code $E$. Moreover, the weight enumerator of $E$ is $1 + 9x^{8} + 44x^{12} + 9x^{16} + x^{24}$.

Proof. Let $X = \{1, 2, 3, 4\} \cup \{5, 6\} \cup \{7, 8\} \cup \{9, 10\}$ be a partition of type $(4, 2, 2, 2)$. Then

\[ D_{[4,2,2,2]}(16) = \{\gamma_{\{1,2\}}, \gamma_{\{1,3\}}, \gamma_{\{1,4\}}, \gamma_{\{2,3\}}, \gamma_{\{2,4\}}, \gamma_{\{3,4\}}\} \cup \{\gamma_{\{5,6\}}, \gamma_{\{7,8\}}, \gamma_{\{9,10\}}\}. \]

Recall that $D_{[4,2,2,2]}$ is generated by $D_{[4,2,2,2]}(16)$ and $(1^{48})$. By Lemma 5.5, we have

\[ K'_{[4,2,2,2]} = \{\{5,6\}, \{7,8\}, \{9,10\}, a, b, c\}. \]

Define

\[ \sigma_1 := (\{1,2\},\{3,4\})(\{1,3\},\{2,4\})(\{1,4\},\{2,3\})(\{5,6\},a)(\{7,8\},b)(\{9,10\},c) \]

as a permutation on $\tilde{\Omega}$. By direct verification, it is easy to show that $\sigma_1$ fixes $D_{[4,2,2,2]}(16)$ and $(1^{48})$ pointwise. Moreover,

\[ \mathrm{Fix}(\sigma_1) = \{x \in \tilde{\Omega} \mid \sigma_1 x = x\} = \tilde{\Omega} \setminus Y_1, \]

where

\[ Y_1 = \{\{1,2\}, \{3,4\}, \{1,3\}, \{2,4\}, \{1,4\}, \{2,3\}, \{5,6\}, \{7,8\}, \{9,10\}, a, b, c\}. \]

Let $Y_2 = \{\{i,j\} \mid i \in \{1,2,3,4\}, j \in \{5,6,7,8,9,10\}\}$ and $Y_3 = \tilde{\Omega} - (Y_1 \cup Y_2)$. Note that $Y_1, Y_2, Y_3$ are pairwise disjoint and

\[ Y_3 = \{\{5,7\}, \{5,8\}, \{5,9\}, \{5,10\}, \{6,7\}, \{6,8\}, \{6,9\}, \{6,10\}, \{7,9\}, \{8,9\}, \{7,10\}, \{8,10\}\}. \]

Let $t = (5,6)(7,8)(9,10) \in \mathrm{Sym}_{10}$. Define a permutation $\sigma_2$ on $\tilde{\Omega}$ by

\[ \sigma_2(\{i,j\}) = \begin{cases} \{ti, tj\} & \text{if } \{i,j\} \in Y_2, \\ \{i,j\} & \text{if } \{i,j\} \notin Y_2 \end{cases} \]

and $\sigma_2 a = a$, $\sigma_2 b = b$, $\sigma_2 c = c$. We also define

\[ \sigma_3 = (\{5,7\},\{6,7\})(\{5,8\},\{6,8\})(\{5,9\},\{6,9\}) \cdot (\{5,10\},\{6,10\})(\{7,9\},\{8,9\})(\{7,10\},\{8,10\}) \]

as a permutation on $\tilde{\Omega}$.

It is easy to verify that $\sigma_2$ and $\sigma_3$ fix $[4, 2, 2, 2]$ pointwise and are fixed point free on $Y_2$ and $Y_3$, respectively. Now let $\tau = \sigma_1\sigma_2\sigma_3$. Then $\tau$ is fixed point free on $\tilde{\Omega}$ and fixes $[4, 2, 2, 2]$ pointwise. Thus, $[4, 2, 2, 2] \cong d(E)$ by Lemma 5.6. The weight enumerator can be computed directly from the facts that $\dim [4, 2, 2, 2] = 6$ and that $[4, 2, 2, 2]$ contains the all-one vector and exactly 9 elements of weight 16.

Notation 5.8. We shall use $C$ to denote the doubly even code $E$ obtained in Lemma 5.7.

Now let $\mathcal{C}$ be a type II self-dual $\mathbb{Z}_4$-code such that the residue code $\mathcal{C}_{res} = C$. Note that such a code $\mathcal{C}$ exists (cf. App. B).

Lemma 5.9. For any $\alpha \in C(8)$, we have

\[ |\{\beta \in \mathcal{C} \mid \beta \equiv \alpha \bmod 2, \ \mathrm{Ewt}(\beta) = 8\}| = |\{\gamma \in C^{\perp} \mid \mathrm{supp}(\gamma) \subset \mathrm{supp}(\alpha)\}| = 8, \]

where $\mathrm{Ewt}(\beta)$ denotes the Euclidean weight of $\beta$.

Proof. Let $S = \{\beta \in \mathcal{C} \mid \beta \equiv \alpha \bmod 2, \ \mathrm{Ewt}(\beta) = 8\}$. Then for any $\beta, \beta' \in S$, $\beta - \beta' \equiv 0 \bmod 2$. Thus, $\beta - \beta' = 2\gamma$ for some $\gamma \in \mathcal{C}_{tor} = \mathcal{C}_{res}^{\perp} = C^{\perp}$. Since $\beta, \beta'$ both have Euclidean weight 8 and $\beta \equiv \beta' \equiv \alpha \bmod 2$, both $\beta$ and $\beta'$ are supported at $\mathrm{supp}(\alpha)$. Hence $S = \beta_0 + \{2\gamma \mid \gamma \in (C^{\perp})_\alpha\}$ and we have $|S| = |C^{\perp}_\alpha| = 2^3$.

Proposition 5.10. Let $\mathcal{C}$ be a type II self-dual $\mathbb{Z}_4$-code such that the residue code $\mathcal{C}_{res} = C$. Let

\[ N = A_4(\mathcal{C}) = \frac{1}{2}\{(x_1, \ldots, x_{24}) \in \mathbb{Z}^{24} \mid (x_1, \ldots, x_{24}) \in \mathcal{C} \bmod 4\}. \]

Then the kissing number of $N$ is 96 and thus $N$ is isometric to the Niemeier lattice of type $A_3^8$.

Proof. The kissing number of $N$ is equal to the number of vectors of norm 2, which is determined by codewords with Euclidean weight 8. By Lemma 5.9, the kissing number is equal to

\[ |C(8)| \cdot 8 + |C^{\perp}(2)| \cdot 4 = 9 \times 8 + 6 \times 4 = 96. \]

Since $N(A_3^8)$ is the only Niemeier lattice that has the kissing number 96, $N$ is isometric to the Niemeier lattice of type $A_3^8$.

Theorem 5.11. There is a Virasoro frame $T$ of $V_{N(A_3^8)}$ such that the $1/16$-code of $V_{N(A_3^8)}$ is isomorphic to $[4, 2, 2, 2]$.

Let $X = \{1, 2, 3, 4\} \cup \{5, 6\} \cup \{7, 8\} \cup \{9, 10\}$ be a partition and let $\tau$ be the permutation of $\tilde{\Omega}$ as defined in Lemma 5.7.

Lemma 5.12. Let $\xi_0 = \gamma_{\{4,5\}} + \gamma_{\{8,9\}}$. Then $\mathrm{wt}(\xi_0) = 24$ and $\tau(\xi_0) = \mathbf{1} + \xi_0$. In particular, $\mathrm{span}_{\mathbb{Z}_2}\{D_{[4,2,2,2]}, \xi_0\} \cong D(C)$.


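Proof. By Lemma 4.2 (2), we have $\mathrm{wt}(\xi_0) = 24$ and

\[
\mathrm{supp}(\xi_0) = \left\{
\begin{array}{llllll}
\{1,4\}, & \{2,4\}, & \{3,4\}, & \{4,6\}, & \{4,7\}, & \{4,10\}, \\
\{1,5\}, & \{2,5\}, & \{3,5\}, & \{5,6\}, & \{5,7\}, & \{5,10\}, \\
\{1,8\}, & \{2,8\}, & \{3,8\}, & \{6,8\}, & \{7,8\}, & \{8,10\}, \\
\{1,9\}, & \{2,9\}, & \{3,9\}, & \{6,9\}, & \{7,9\}, & \{9,10\}
\end{array}
\right\}.
\]

Then by the definition of $\tau$, we have

\[
\mathrm{supp}(\tau(\xi_0)) = \left\{
\begin{array}{llllll}
\{2,3\}, & \{1,3\}, & \{1,2\}, & \{4,5\}, & \{4,8\}, & \{4,9\}, \\
\{1,6\}, & \{2,6\}, & \{3,6\}, & a, & \{6,7\}, & \{6,10\}, \\
\{1,7\}, & \{2,7\}, & \{3,7\}, & \{5,8\}, & b, & \{7,10\}, \\
\{1,10\}, & \{2,10\}, & \{3,10\}, & \{5,9\}, & \{8,9\}, & c
\end{array}
\right\},
\]

where the entries are the images under $\tau$ of the corresponding entries of $\mathrm{supp}(\xi_0)$ (for instance, $\tau(\{6,9\}) = \{5,9\}$). The two supports are disjoint and together cover $\tilde{\Omega}$, and hence $\xi_0 + \tau(\xi_0) = \mathbf{1}$ and $\tau(\xi_0) = \mathbf{1} + \xi_0$.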
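The claims about $\tau$ in Lemma 5.7 and Lemma 5.12 can be checked mechanically. The sketch below builds $\tau = \sigma_1\sigma_2\sigma_3$ as a dictionary on the 48 points of $\tilde{\Omega}$ and tests that it is a fixed-point-free involution, that it fixes the weight 16 generators of $[4,2,2,2]$, and that it sends $\xi_0$ to its complement. Words are represented as frozensets of points; the helper names `P`, `swap` and `act` are illustrative assumptions, not notation from the text.

```python
from itertools import combinations

PAIRS = [frozenset(s) for s in combinations(range(1, 11), 2)]
POINTS = PAIRS + ["a", "b", "c"]          # the 48 coordinates

def P(i, j):
    return frozenset((i, j))

def gamma(i, j):
    """Support of gamma_{i,j} as a frozenset of pairs."""
    return frozenset(s for s in PAIRS if len(s & {i, j}) == 1)

tau = {p: p for p in POINTS}

def swap(x, y):
    tau[x], tau[y] = y, x

# sigma_1: transpositions on Y_1
for x, y in [(P(1, 2), P(3, 4)), (P(1, 3), P(2, 4)), (P(1, 4), P(2, 3)),
             (P(5, 6), "a"), (P(7, 8), "b"), (P(9, 10), "c")]:
    swap(x, y)

# sigma_2: apply t = (5 6)(7 8)(9 10) to the pairs {i, j} in Y_2
t = {5: 6, 6: 5, 7: 8, 8: 7, 9: 10, 10: 9}
for i in range(1, 5):
    for j in range(5, 11):
        tau[P(i, j)] = P(i, t[j])

# sigma_3: transpositions on Y_3
for x, y in [(P(5, 7), P(6, 7)), (P(5, 8), P(6, 8)), (P(5, 9), P(6, 9)),
             (P(5, 10), P(6, 10)), (P(7, 9), P(8, 9)), (P(7, 10), P(8, 10))]:
    swap(x, y)

def act(word):
    """Image of a word (set of points) under tau."""
    return frozenset(tau[p] for p in word)

ALL_ONE = frozenset(POINTS)
blocks = [(1, 2, 3, 4), (5, 6), (7, 8), (9, 10)]
weight16 = [gamma(i, j) for b in blocks for i, j in combinations(b, 2)]

fixed_point_free = all(tau[p] != p for p in POINTS)
is_involution = all(tau[tau[p]] == p for p in POINTS)
fixes_generators = all(act(g) == g for g in weight16)

xi0 = gamma(4, 5) ^ gamma(8, 9)           # symmetric difference = sum mod 2
maps_to_complement = act(xi0) == ALL_ONE - xi0
```

Since $\tau$ fixes every generator and the all-one vector, it fixes every codeword of $[4,2,2,2]$, which is the hypothesis needed in Lemma 5.6.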

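Let $\xi_0 = \gamma_{\{4,5\}} + \gamma_{\{8,9\}}$, $\xi_1 = \gamma_{\{4,5\}}$ and $\xi_2 = \gamma_{\{6,7\}}$ and define

\[ D^0 = \mathrm{span}_{\mathbb{Z}_2}\{[4,2,2,2], \xi_0\}, \quad D^1 = \mathrm{span}_{\mathbb{Z}_2}\{D^0, \xi_1\}, \quad D^2 = \mathrm{span}_{\mathbb{Z}_2}\{D^1, \xi_2\}. \]

Note that $D^0 \cong D(C)$, $D^1 \cong [6, 4]$ and $D^2 \cong [10] \cong D^{ex}$.

Notation 5.13. Denote $C^0 = (D^0)^{\perp}$, $C^1 = (D^1)^{\perp}$, $C^2 = (D^2)^{\perp} = C^{ex}$. Note that $[C^0 : C^1] = [C^1 : C^2] = 2$. Thus there exist $\delta_1, \delta_2 \in \mathbb{Z}_2^{48}$ such that $C^0 = C^1 \cup (\delta_1 + C^1)$ and $C^1 = C^2 \cup (\delta_2 + C^2)$.

By Theorem 2.23, there exist holomorphic framed VOAs

\[ V(D^0) = \bigoplus_{\beta \in D^0} W^\beta \cong \tilde{V}_{N(A_3^8)}, \]
\[ V(D_{[6,4]}) = V(D^1) = \bigoplus_{\beta \in D^1} U^\beta_{[6,4]}, \]
\[ V(D^{ex}) = V(D^2) = \bigoplus_{\beta \in D^2} V^\beta, \]

such that

\[ U^\beta_{[6,4]} = V^\beta \oplus V^\beta \boxtimes_{M_{C^2}} M_{\delta_2 + C^2} \quad \text{for } \beta \in D^1 \]

and

\[ W^\beta = U^\beta_{[6,4]} \oplus U^\beta_{[6,4]} \boxtimes_{M_{C^1}} M_{\delta_1 + C^1} \quad \text{for } \beta \in D^0. \]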



5.2. Lie algebra structure for $V(D^{ex})_1$. Next we shall determine the Lie algebra structure for the weight one subspaces. First we shall recall several important results by Dong and Mason.

Notation 5.14. Let $\mathfrak{g}$ be a simple Lie algebra of type $X$. Let $\hat{\mathfrak{g}} = \mathfrak{g} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}$ be the corresponding affine Lie algebra. We denote the irreducible highest weight $\hat{\mathfrak{g}}$-module of level $k$ and of weight 0 by $L_{\mathfrak{g}}(k, 0)$ or $L_X(k, 0)$. It is known (cf. [FZ]) that $L_X(k, 0)$ is a simple VOA with central charge $c = k \dim \mathfrak{g}/(k + h^{\vee})$ if $k \neq -h^{\vee}$, where $h^{\vee}$ is the dual Coxeter number of $\mathfrak{g}$.

Theorem 5.15 (Dong-Mason [DM3]). Let $V$ be a $C_2$-cofinite holomorphic VOA of CFT type. Suppose the central charge $c$ of $V$ is 24. Then the Lie algebra $V_1$ has Lie rank less than or equal to 24 and is either abelian (including 0) or semi-simple.

Now suppose $V_1$ is semisimple. Let $V_1$ be a direct sum

\[ V_1 = \mathfrak{g}_{1,k_1} \oplus \mathfrak{g}_{2,k_2} \oplus \cdots \oplus \mathfrak{g}_{n,k_n} \]

of simple Lie algebras $\mathfrak{g}_i$ whose affine Lie algebra has level $k_i$ on $V$. The following result is also proved in [DM3] (see also [Sch]).

Theorem 5.16 (Dong-Mason). Let $\mathfrak{g}_i$ be a simple component of $V_1$ whose affine algebra has level $k_i$. Let $h_i^{\vee}$ be the dual Coxeter number of $\mathfrak{g}_i$. Then

\[ \frac{h_i^{\vee}}{k_i} = \frac{\dim V_1 - 24}{24}. \tag{5.1} \]

In particular, the ratio $h_i^{\vee}/k_i$ is independent of $\mathfrak{g}_i$.

The next theorem shows that the level $k_i$ is a positive integer and the subVOA generated by $\mathfrak{g}_i$ is simple.

Theorem 5.17 (Dong-Mason [DM4]). Let $V$ be a simple vertex operator algebra which is $C_2$-cofinite of CFT type such that the adjoint module of $V$ is isomorphic to its dual. Let $\mathfrak{g}$ be a simple Lie subalgebra of $V_1$, $k$ the level of $V$ as $\hat{\mathfrak{g}}$-module, and $U$ the vertex operator subalgebra of $V$ generated by $\mathfrak{g}$. Then the following hold:
(a) The restriction of $(\ ,\ )$ to $\mathfrak{g}$ is nondegenerate, where $(\ ,\ )$ is defined by $(u, v)\mathbf{1} = u_1 v$ for $u, v \in V_1$ and $\mathbf{1}$ is the vacuum element;
(b) $U = L_{\mathfrak{g}}(k, 0)$;
(c) $k$ is a positive integer;
(d) $V$ is an integrable $\hat{\mathfrak{g}}$-module.

Now denote

\[ V^{ex} = V(D^{ex}) = \bigoplus_{\beta \in D^{ex}} V^\beta. \]

For any $\beta \in D^{ex}$ with $\mathrm{wt}(\beta) = 16$, let $v_\beta$ be a highest weight vector of $V^\beta$ such that $(v_\beta, v_\beta) = 1$. Notice that $(V^\beta)_1 = (L_{\frac{1}{16}}(\beta))_1$ and hence $v_\beta$ is actually a highest weight vector of $L_{\frac{1}{16}}(\beta)$. We also note that $v_\beta$ is unique up to a multiplication of $\pm 1$ since $\dim(V^\beta_1) = \dim L_{\frac{1}{16}}(\beta)_1 = 1$ (cf. Lemma 4.13).


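Notation 5.18. For $\alpha \in C^{ex}(2)$, let $q_\alpha$ be a highest weight vector of $L_{\frac{1}{2}}(\alpha)$ with $(q_\alpha, q_\alpha) = 1$. Again, $q_\alpha$ is unique up to a multiplication of $\pm 1$.

Proposition 5.19. The set $\{v_\beta \mid \beta \in D^{ex}(16)\} \cup \{q_\alpha \mid \alpha \in C^{ex}(2)\}$ forms a basis for $V_1^{ex}$. In particular, $\dim V_1^{ex} = 48$.

Proof. First we note that $(V^\beta)_1 = 0$ for $\mathrm{wt}(\beta) > 16$ and that $(V^\beta)_1$ is spanned by $v_\beta$ if $\mathrm{wt}(\beta) = 16$. Moreover, $V^0_1 = (M_{C^{ex}})_1$ is spanned by $\{q_\alpha \mid \alpha \in C^{ex}(2)\}$. Since $V_1^{ex} = \oplus_{\beta \in D^{ex}} (V^\beta)_1$, we have the desired result.

By the fusion rules, it is easy to verify the following two lemmas.

Lemma 5.20. For $\alpha, \alpha' \in C^{ex}(2)$, we have

\[ (q_\alpha)_0 (q_{\alpha'}) = \begin{cases} \lambda(\alpha, \alpha')\, q_{\alpha+\alpha'} & \text{if } \mathrm{wt}(\alpha + \alpha') = 2, \\ 0 & \text{if } \mathrm{wt}(\alpha + \alpha') \neq 2 \end{cases} \]

for some non-zero constant $\lambda(\alpha, \alpha')$.

Lemma 5.21. Let $\beta, \beta' \in D^{ex}(16)$. Then we have

\[ (v_\beta)_0 (v_{\beta'}) = \begin{cases} \mu(\beta, \beta')\, v_{\beta+\beta'} & \text{if } |\beta \cap \beta'| = 8, \\ 0 & \text{if } |\beta \cap \beta'| = 4 \text{ or } 16, \end{cases} \]

for some nonzero constant $\mu(\beta, \beta')$. In particular, $\mathrm{span}\{v_\beta \mid \beta \in D^{ex}(16)\}$ forms a Lie subalgebra of $V_1$.

Note that $|\mathrm{supp}(D^{ex}(16))| = \binom{10}{2} = 45$, $|\mathrm{supp}(C^{ex}(2))| = 3$, and $\mathrm{supp}(D^{ex}(16)) \cap \mathrm{supp}(C^{ex}(2)) = \emptyset$. Therefore, for any $\beta \in D^{ex}(16)$ and $\alpha \in C^{ex}(2)$, we have $(q_\alpha)_0 v_\beta = 0$, since $M_\alpha \boxtimes_T L_{\frac{1}{16}}(\beta)$ has minimal weight greater than 1.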

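Now let $\mathfrak{g}_1 = \mathrm{span}\{v_\beta \mid \beta \in D^{ex}(16)\}$ and $\mathfrak{g}_2 = \mathrm{span}\{q_\alpha \mid \alpha \in C^{ex}(2)\}$.

Lemma 5.22. The Lie algebra $\mathfrak{g}_1$ commutes with $\mathfrak{g}_2$ and hence $V_1^{ex} \cong \mathfrak{g}_1 \oplus \mathfrak{g}_2$.

Lemma 5.23. The Lie subalgebra $\mathfrak{g}_2$ generated by $\{q_\alpha \mid \alpha \in C^{ex}(2)\}$ is isomorphic to $sl_2(\mathbb{C})$. Moreover, the subVOA generated by $\mathfrak{g}_2$ is isomorphic to $L_{A_1}(2, 0)$.

Proof. Since $\{q_\alpha \mid \alpha \in C^{ex}(2)\}$ generates a subVOA isomorphic to $M_{E_3}$, the conclusion follows from Theorem A.2.

Lemma 5.24. For any $\beta \in D^{ex}(16)$, $(v_\beta)_0$ acts semisimply on $\mathfrak{g}_1$.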



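Proof. Let $B = \{\alpha \in D^{ex}(16) \mid |\alpha \cap \beta| = 8\}$ and $B' = \{\alpha \in D^{ex}(16) \mid |\alpha \cap \beta| \neq 8\}$. Then by Lemma 5.21, $(v_\beta)_0$ acts as 0 on $\mathrm{span}\{v_\alpha \mid \alpha \in B'\}$. Let $W = \mathrm{span}\{v_\alpha \mid \alpha \in B\}$ and let $A$ be the matrix of $(v_\beta)_0|_W$ with respect to the basis $\{v_\alpha \mid \alpha \in B\}$. Then by Lemma 5.21, we have $A = DP$, where $P$ is a permutation matrix and $D$ is a diagonal matrix with non-zero diagonal entries. Since $(v_\beta)_0 (v_\beta)_0 v_\alpha \in \mathrm{span}\{v_\alpha\}$, we also have $P^2 = \mathrm{id}_W$. By rearranging the basis if necessary, $A$ can be written as

\[
A = \begin{pmatrix}
0 & a_1 & & & & & \\
b_1 & 0 & & & & & \\
& & 0 & a_2 & & & \\
& & b_2 & 0 & & & \\
& & & & \ddots & & \\
& & & & & 0 & a_n \\
& & & & & b_n & 0
\end{pmatrix},
\]

where $a_i, b_i$ are non-zero constants. Since $x^2 - a_i b_i$ has two distinct roots for all $i$, $A$ is diagonalizable. Therefore, $(v_\beta)_0$ acts semisimply on $\mathfrak{g}_1 = \mathrm{span}\{v_\beta \mid \beta \in D^{ex}(16)\}$.

Lemma 5.25. Let $\beta_1 = \gamma_{\{1,6\}}$, $\beta_2 = \gamma_{\{2,7\}}$, $\beta_3 = \gamma_{\{3,8\}}$, $\beta_4 = \gamma_{\{4,9\}}$, $\beta_5 = \gamma_{\{5,10\}}$ and let $\mathfrak{h}_1 = \mathrm{span}\{v_{\beta_1}, \ldots, v_{\beta_5}\}$. Then $\mathfrak{h}_1$ is a maximal abelian subalgebra of $\mathfrak{g}_1 = \mathrm{span}\{v_\beta \mid \beta \in D^{ex}(16)\}$. In particular, $\mathfrak{h}_1$ is a maximal toral subalgebra of $\mathfrak{g}_1$.

Proof. By Lemma 5.21, it is clear that $\mathfrak{h}_1$ is abelian. Let $u = \sum_\alpha a_\alpha v_\alpha \in \mathfrak{g}_1$ be such that $[\mathfrak{h}_1, u] = 0$. Then

\[ (v_{\beta_i})_0 u = \sum_{|\alpha \cap \beta_i| = 8} \mu(\beta_i, \alpha)\, a_\alpha v_{\beta_i + \alpha} = 0 \quad \text{for all } i = 1, 2, 3, 4, 5. \]

It implies that $a_\alpha = 0$ for all $\alpha$ such that $|\alpha \cap \beta_i| = 8$ for some $i = 1, 2, 3, 4, 5$. Since all weight 16 elements of $D^{ex}$ have the form $\gamma_{\{i,j\}}$, $\{i,j\} \subset X$, and

\[ |\gamma_{\{i,j\}} \cap \gamma_{\{k,\ell\}}| = \begin{cases} 4 & \text{if } \{i,j\} \cap \{k,\ell\} = \emptyset, \\ 8 & \text{if } |\{i,j\} \cap \{k,\ell\}| = 1, \\ 16 & \text{if } \{i,j\} = \{k,\ell\}, \end{cases} \]

we have $a_\alpha = 0$ unless $\alpha = \beta_1, \ldots, \beta_5$, and hence $u \in \mathfrak{h}_1$.

Theorem 5.26. The Lie algebra $\mathfrak{g}_1$ spanned by $\{v_\beta \mid \beta \in D^{ex}(16)\}$ is isomorphic to $o_{10}(\mathbb{C})$ and is of the type $D_{5,8}$.

Proof. Since $\mathfrak{g}_1$ is semi-simple, has rank 5 and $\dim \mathfrak{g}_1 = 45$, the only possibility is $o_{10}(\mathbb{C})$, whose dual Coxeter number is 8. Let $k$ be the level of the affine Lie algebra $\hat{\mathfrak{g}}_1$. Then by (5.1), we have

\[ \frac{8}{k} = \frac{48 - 24}{24} = 1, \]

and thus $k = 8$.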

Theorem 5.27. The Lie algebra $V_1^{ex}$ is of the type $A_{1,2} D_{5,8}$.
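The levels in Theorem 5.27 follow from (5.1) by simple arithmetic, which the sketch below reproduces with exact fractions. The table `LIE_DATA` holds the standard dimensions and dual Coxeter numbers of $A_1$ and $D_5$; the function names are illustrative assumptions. As a side check, not claimed in the text above, the central charges $k \dim \mathfrak{g}/(k + h^{\vee})$ from Notation 5.14 add up to 24 for this example.

```python
from fractions import Fraction

# (dimension, dual Coxeter number) for the two simple types in Theorem 5.27
LIE_DATA = {"A1": (3, 2), "D5": (45, 8)}

def level(lie_type, dim_V1):
    """Level k forced by (5.1): h^vee / k = (dim V_1 - 24) / 24."""
    _, h_dual = LIE_DATA[lie_type]
    return Fraction(24 * h_dual, dim_V1 - 24)

def central_charge(lie_type, k):
    """Central charge k * dim(g) / (k + h^vee) of L_X(k, 0)."""
    dim, h_dual = LIE_DATA[lie_type]
    return Fraction(k * dim, 1) / (k + h_dual)

dim_V1 = 3 + 45                       # dim V_1^ex from Proposition 5.19
k_A1 = level("A1", dim_V1)
k_D5 = level("D5", dim_V1)
total_c = central_charge("A1", k_A1) + central_charge("D5", k_D5)
```

With $\dim V_1^{ex} = 48$ the right-hand side of (5.1) equals 1, so each level coincides with the dual Coxeter number of its component.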


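5.3. Lie algebra structure for $[V(D^1)]_1$. Recall that $D^1 \cong D_{[6,4]}$ and

\[ V(D^1) = V(D_{[6,4]}) = \bigoplus_{\beta \in D_{[6,4]}} U^\beta_{[6,4]}, \]

where $U^\beta_{[6,4]} = V^\beta \oplus V^\beta \boxtimes_{M_{C^2}} M_{\delta_2 + C^2}$. Note that the $1/16$-words of $V^\beta$ and $V^\beta \boxtimes_{M_{C^2}} M_{\delta_2 + C^2}$ are both $\beta$. For $\beta \in D^{ex}(16)$,

\[ \dim(V^\beta)_1 = \dim(V^\beta \boxtimes_{M_{C^2}} M_{\delta_2 + C^2})_1 = 1 \]

by Lemma 4.13 and hence $\dim(U^\beta_{[6,4]})_1 = 2$. For any $\beta \in D_{[6,4]}(16)$, let $u^0_\beta$ and $u^1_\beta$ be highest weight vectors of $V^\beta$ and $V^\beta \boxtimes_{M_{C^2}} M_{\delta_2 + C^2}$, respectively. We may also assume $(u^0_\beta, u^0_\beta) = (u^1_\beta, u^1_\beta) = 1$.

Note that $|K_{[6,4]}| = 45$, $|K'_{[6,4]}| = 3$ and the set $C_{[6,4],K}(2)$ consists of

\[ \{\{i,j\}, \{k,\ell\}\}, \quad \text{where } \{i,j,k,\ell\} = \{7, 8, 9, 10\}. \]

Thus, $|C_{[6,4],K}(2)| = 3$ and for $\beta = \gamma_{\{i,j\}}$ with $\{i,j\} \subset \{1, 2, 3, 4, 5, 6\}$, the subcode $C_{[6,4],\beta} = \{\alpha \in C_{[6,4]} \mid \mathrm{supp}(\alpha) \subset \mathrm{supp}(\beta)\}$ has no elements of weight 2.

Lemma 5.28. The dimension of $V(D_{[6,4]})_1$ is 48.

Proof. Since $\dim (U^\beta_{[6,4]})_1 = 2$ for all $\beta \in D_{[6,4]}(16)$, we have

\[
\begin{aligned}
\dim V(D_{[6,4]})_1 &= 2 \cdot |D_{[6,4]}(16)| + |C_{[6,4]}(2)| \\
&= 2 \cdot |D_{[6,4]}(16)| + |C_{[6,4],K}(2)| + |C'_{[6,4]}(2)| \\
&= 2 \times \left( \binom{6}{2} + \binom{4}{2} \right) + 3 + \binom{3}{2} = 48
\end{aligned}
\]

as desired.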

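Lemma 5.29. Let $\mathfrak{g}^0_{[6,4]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[6,4]}(2)\}$. Then $\mathfrak{g}^0_{[6,4]} \cong sl_2(\mathbb{C})$. Moreover, the subVOA generated by $\mathfrak{g}^0_{[6,4]}$ is isomorphic to $L_{A_1}(2, 0)$.

Proof. Since $C'_{[6,4]} \cong E_3$, the result follows from Theorem A.2.

Lemma 5.30. For $\beta = \gamma_{\{i,j\}}$ with $\{i,j\} \subset \{1, 2, 3, 4, 5, 6\}$, we have $(u^0_\beta)_0 u^1_\beta = 0$.

Proof. Since $(u^0_\beta)_0 u^1_\beta \in (M_{C_{[6,4],\beta}})_1 = 0$, we have $(u^0_\beta)_0 u^1_\beta = 0$ as desired.

Proposition 5.31. Let

\[ \mathfrak{g}^1_{[6,4]} = \mathrm{span}\{u^0_\beta, u^1_\beta \mid \beta = \gamma_{\{i,j\}}, \{i,j\} \subset \{1, 2, 3, 4, 5, 6\}\}. \]

Then $\mathfrak{g}^1_{[6,4]} \cong sl_4(\mathbb{C}) \oplus sl_4(\mathbb{C})$.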



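Proof. Let $\beta_1 = \gamma_{\{1,2\}}$, $\beta_2 = \gamma_{\{3,4\}}$, $\beta_3 = \gamma_{\{5,6\}}$. Then by the same argument as in Lemma 5.25, $\mathfrak{h}^1_{[6,4]} = \mathrm{span}\{u^0_{\beta_1}, u^0_{\beta_2}, u^0_{\beta_3}, u^1_{\beta_1}, u^1_{\beta_2}, u^1_{\beta_3}\}$ is a maximal abelian subalgebra of $\mathfrak{g}^1_{[6,4]}$. Thus, $\mathfrak{g}^1_{[6,4]} \cong sl_4(\mathbb{C}) \oplus sl_4(\mathbb{C})$ since it has rank 6 and $\dim \mathfrak{g}^1_{[6,4]} = \binom{6}{2} \times 2 = 30$.

For $\beta = \gamma_{\{i,j\}}$, $\{i,j\} \subset \{7, 8, 9, 10\}$, we have $|C_{[6,4],\beta}(2)| = 3$.

Notation 5.32. For $\beta = \gamma_{\{i,j\}}$, $\{i,j\} \subset \{7, 8, 9, 10\}$, we have $0 \neq (u^0_\beta)_0 u^1_\beta \in M_{\delta_2 + C^2}$. We shall define $h_\beta := (u^0_\beta)_0 u^1_\beta$.

Lemma 5.33. For $\beta = \gamma_{\{i,j\}}$, $\beta' = \gamma_{\{k,\ell\}}$ with $\{i,j\}, \{k,\ell\} \subset \{7, 8, 9, 10\}$, we have $h_{\beta+\beta'} = \epsilon_\beta h_\beta + \epsilon_{\beta'} h_{\beta'}$ for some constants $\epsilon_\beta, \epsilon_{\beta'}$ when $\mathrm{wt}(\beta + \beta') = 16$. In particular, $\dim \mathrm{span}\{h_{\gamma_{\{i,j\}}} \mid \{i,j\} \subset \{7, 8, 9, 10\}\} = 3$.

Proof. By the fusion rules, there exist non-zero constants $\lambda, \mu, \nu$ such that

\[ (u^0_\beta)_0 u^0_{\beta'} = \lambda u^0_{\beta+\beta'}, \quad (u^0_{\beta'})_0 u^1_{\beta+\beta'} = \mu u^1_\beta, \quad (u^0_\beta)_0 u^1_{\beta+\beta'} = \nu u^1_{\beta'}. \]

Then by definition, we have

\[
\begin{aligned}
h_{\beta+\beta'} &= (u^0_{\beta+\beta'})_0 u^1_{\beta+\beta'} \\
&= \frac{1}{\lambda} ((u^0_\beta)_0 u^0_{\beta'})_0 u^1_{\beta+\beta'} \\
&= \frac{1}{\lambda} \left( (u^0_\beta)_0 (u^0_{\beta'})_0 - (u^0_{\beta'})_0 (u^0_\beta)_0 \right) u^1_{\beta+\beta'} \\
&= \frac{1}{\lambda} (\mu h_\beta - \nu h_{\beta'})
\end{aligned}
\]

as desired.

Lemma 5.34. The Lie algebra

\[ \mathfrak{g}^2_{[6,4]} = \mathrm{span}\{u^0_\beta, u^1_\beta, h_\beta \mid \beta = \gamma_{\{i,j\}}, \{i,j\} \subset \{7, 8, 9, 10\}\} \]

is isomorphic to $sl_4(\mathbb{C})$.

Proof. Since $h_\beta \in (M_{\delta_2 + C^2})_1$, it is clear that

\[ (h_\beta)_0 u^0_\beta = \lambda_\beta u^1_\beta \quad \text{and} \quad (h_\beta)_0 u^1_\beta = \mu_\beta u^0_\beta \]

for some non-zero constants $\lambda_\beta$ and $\mu_\beta$. Thus $\mathrm{span}\{u^0_\beta, u^1_\beta, h_\beta\}$ is closed under the 0th product and $\mathrm{span}\{u^0_\beta, u^1_\beta, h_\beta\} \cong sl_2(\mathbb{C})$ for any $\beta = \gamma_{\{i,j\}}$, $\{i,j\} \subset \{7, 8, 9, 10\}$. Moreover, $|\{\gamma_{\{i,j\}} \mid \{i,j\} \subset \{7, 8, 9, 10\}\}| = 6$ and $\dim \mathfrak{g}^2_{[6,4]} = 6 \times 2 + 3 = 15$. Recall that $\mathrm{wt}(\gamma_{\{i,j\}} + \gamma_{\{i',j'\}}) = 16$ if and only if $|\{i,j\} \cap \{i',j'\}| = 1$ and $\gamma_{\{i,j\}} + \gamma_{\{j,j'\}} = \gamma_{\{i,j'\}}$. Thus, we can identify the set $\{\gamma_{\{i,j\}} \mid \{i,j\} \subset \{7, 8, 9, 10\}\}$ with the positive roots of a Lie algebra of type $A_3$ and we have $\mathfrak{g}^2_{[6,4]} \cong sl_4(\mathbb{C})$.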


Theorem 5.35. The Lie algebra $V(D_{[6,4]})_1$ is isomorphic to $sl_2(\mathbb{C}) \oplus sl_4(\mathbb{C})^{\oplus 3}$ and is of the type $A_{1,2} A_{3,4}^3$.

Proof. That $V(D_{[6,4]})_1 \cong sl_2(\mathbb{C}) \oplus sl_4(\mathbb{C})^{\oplus 3}$ follows directly from Lemma 5.29, Proposition 5.31 and Lemma 5.34. Let $\mathfrak{g}_i$ be a simple component of $V(D_{[6,4]})_1$ whose affine algebra has level $k_i$. Let $h_i^{\vee}$ be the dual Coxeter number of $\mathfrak{g}_i$. Since $d = \dim V(D_{[6,4]})_1 = 48$, we have $(d - 24)/24 = 1$, and thus $h_i^{\vee} = k_i$ by (5.1). Therefore, $k_i = 2$ if $\mathfrak{g}_i \cong sl_2(\mathbb{C})$ and $k_i = 4$ if $\mathfrak{g}_i \cong sl_4(\mathbb{C})$.

6. Vertex Operator Algebras Associated to Other Subcodes

Next we shall study the structures of the VOAs associated to other subcodes of $D^{ex}$. By computer enumeration, one can verify that $D^{ex}$ has exactly 14 subcodes which are not subcodes of any extended doubling $D(E)$. Such subcodes are listed in Table 2. Details can be found on the webpage http://www.st.hirosaki-u.ac.jp/~betsumi/triply-even/

In this section, we shall study the framed VOAs that realize these codes as their $\frac{1}{16}$-codes. We shall use the same notation as in Sec. 5.1. In particular, $C_{[\lambda_1,\ldots,\lambda_m]} = (D_{[\lambda_1,\ldots,\lambda_m]})^{\perp}$, $K_{[\lambda_1,\ldots,\lambda_m]} = \mathrm{supp}(D_{[\lambda_1,\ldots,\lambda_m]}(16))$ and $K'_{[\lambda_1,\ldots,\lambda_m]} = \tilde{\Omega} \setminus K_{[\lambda_1,\ldots,\lambda_m]}$.

Notation 6.1. For any partition $(\lambda_1, \ldots, \lambda_m)$, $V(D_{[\lambda_1,\ldots,\lambda_m]})$ denotes a holomorphic framed VOA whose $1/16$-code is $D_{[\lambda_1,\ldots,\lambda_m]}$. We shall denote the Lie algebra of $V(D_{[\lambda_1,\ldots,\lambda_m]})_1$ by $\mathfrak{g}_{[\lambda_1,\ldots,\lambda_m]}$ and use $\mathfrak{g}^i_{[\lambda_1,\ldots,\lambda_m]}$, $i = 0, 1, \ldots$, to denote (semisimple) Lie subalgebras of $\mathfrak{g}_{[\lambda_1,\ldots,\lambda_m]}$.

Notation 6.2. Recall the definition of $q_\alpha$ from Notation 5.18. For any partition $(\lambda_1, \ldots, \lambda_m)$, let $\mathcal{K}_{[\lambda_1,\ldots,\lambda_m]}$ be the Lie subalgebra of $\mathfrak{g}_{[\lambda_1,\ldots,\lambda_m]}$ generated by $\{q_\alpha \mid \alpha \in C_{[\lambda_1,\ldots,\lambda_m],K}(2)\}$. We also use $\mathcal{K}^i_{[\lambda_1,\ldots,\lambda_m]}$, $i = 1, 2, \ldots$, to denote the subalgebras of $\mathcal{K}_{[\lambda_1,\ldots,\lambda_m]}$. In the following sections, we often use $\mathcal{K}_{[\lambda_1,\ldots,\lambda_m]}$ and $\mathcal{K}^i_{[\lambda_1,\ldots,\lambda_m]}$ to determine the Lie rank of the semisimple components of $V(D_{[\lambda_1,\ldots,\lambda_m]})_1$.

Notation 6.3. We shall use $\mathrm{rank}(\mathfrak{g})$ to denote the Lie rank of a reductive Lie algebra $\mathfrak{g}$.

Notation 6.4. For $n \ge 1$, denote by $\mathcal{A}_n$ an $n$-dimensional abelian Lie algebra.

Table 2. Subcodes of $D^{ex}$ not contained in a doubling

dim   subcodes
9     [10]
8     [8]=[8,2]   [6,4]=[6,3]=[5,4]
7     [7]   [6,2]=[6,2,2]   [5,3]   [4,4,2]=[4,3,2]   [3,3,3]
6     [6]   [5,2]   [4,3]   [3,3,2]
5     [5]   [3,3]
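The dimensions in Table 2 can be recomputed as ranks over GF(2). The sketch below assumes the blocks of a shape are taken as consecutive intervals of $X$, and that a shape summing to less than 10 (such as [7]) leaves the remaining points in no block. Both are conventions of this sketch, and the helper names are illustrative. It checks dimensions only, not the equalities of codes or the statement about doublings.

```python
from itertools import combinations

OMEGA = list(combinations(range(1, 11), 2))
INDEX = {s: n for n, s in enumerate(OMEGA)}
ALL_ONE = (1 << 48) - 1   # bits 45..47 stand for a, b, c

def gamma(i, j):
    """Bitmask of gamma_{i,j} in length 48."""
    word = 0
    for s in OMEGA:
        if len({i, j} & set(s)) == 1:
            word |= 1 << INDEX[s]
    return word

def rank_gf2(words):
    """Rank over GF(2) of a list of bitmask vectors (XOR elimination)."""
    basis = {}                      # leading bit -> reduced vector
    for w in words:
        while w:
            top = w.bit_length() - 1
            if top not in basis:
                basis[top] = w
                break
            w ^= basis[top]
    return len(basis)

def blocks_of(shape):
    """Consecutive blocks of X = {1..10} with the given sizes."""
    blocks, start = [], 1
    for size in shape:
        blocks.append(range(start, start + size))
        start += size
    return blocks

def code_dim(shape):
    """Dimension of D_[shape]: gammas inside blocks plus the all-one vector."""
    gens = [gamma(i, j) for b in blocks_of(shape) for i, j in combinations(b, 2)]
    return rank_gf2(gens + [ALL_ONE])

TABLE_2 = {
    9: [(10,)],
    8: [(8,), (8, 2), (6, 4), (6, 3), (5, 4)],
    7: [(7,), (6, 2), (6, 2, 2), (5, 3), (4, 4, 2), (4, 3, 2), (3, 3, 3)],
    6: [(6,), (5, 2), (4, 3), (3, 3, 2)],
    5: [(5,), (3, 3)],
}
computed = {shape: code_dim(shape) for shapes in TABLE_2.values() for shape in shapes}
```

Each entry of `computed` can then be compared with the row of Table 2 in which the shape appears.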



6.1. 8-dimensional subcodes. There are only two subcodes of $D^{ex}$ which have dimension 8, namely $D_{[6,4]}$ and $D_{[8]}$. Since $D^1 = D_{[6,4]}$ has been studied in the previous section, we shall only study $D_{[8]}$.

Subcode: $D_{[8]} = D_{[8,2]}$. Let $X_1 = \{1, 2, \ldots, 8\}$ and $X_2 = \{9, 10\}$. Then $D_{[8]}$ is spanned by $\gamma_{\{i,j\}}$, $\{i,j\} \subset X_1$, $\gamma_{\{9,10\}}$ and the all-one vector. In this case,

\[ |D_{[8]}(16)| = \binom{8}{2} + 1 = 29 \quad \text{and} \quad |C_{[8]}(2)| = 14. \]

Note that $|K_{[8]}| = 44$ and $C_{[8],K}(2)$ consists of

\[ \{\{i,9\}, \{i,10\}\}, \quad i = 1, \ldots, 8. \]

In particular, there are 8 elements in $C_{[8],K}(2)$. Let $\eta_{[8]} = \{\{1,9\}, \{1,10\}\}$. Then $C_{[8]} = C^{ex} \cup (\eta_{[8]} + C^{ex})$. By Theorem 2.19, there exists a holomorphic framed VOA

\[ V(D_{[8]}) = \bigoplus_{\beta \in D_{[8]}} U^\beta_{[8]}, \]

such that

\[ U^0_{[8]} = M_{C_{[8]}} \quad \text{and} \quad U^\beta_{[8]} = V^\beta \oplus M_{\eta_{[8]} + C^{ex}} \boxtimes_{M_{C^{ex}}} V^\beta. \]

As in the case of $V(D_{[6,4]})$, we denote $u^0_\beta = v_\beta$ and let $u^1_\beta$ be a highest weight vector of $V^\beta \boxtimes_{M_{C^{ex}}} M_{\eta_{[8]} + C^{ex}}$. Since $D_{[8]}$ is generated by $D_{[8]}(16)$ and $\mathbf{1}$ and $|K_{[8]}| = 44$, the code

\[ C'_{[8]} = \{\alpha \in C_{[8]} \mid \mathrm{supp}(\alpha) \subset \tilde{\Omega} \setminus K_{[8]}\} \cong E_4. \]

Recall that $M_{E_4} \cong V_{A_1 \oplus A_1} \cong V_{A_1} \otimes V_{A_1}$ and hence we have the following lemma.

Lemma 6.5. Let $\mathfrak{g}^0_{[8]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[8]}(2)\}$. Then $\dim \mathfrak{g}^0_{[8]} = 6$ and $\mathfrak{g}^0_{[8]} \cong sl_2(\mathbb{C})^{\oplus 2}$.

Lemma 6.6. The dimension of $V(D_{[8]})_1$ is 72.

Proof. Since $\dim(U^\beta_{[8]})_1 = 2$ for all $\beta \in D_{[8]}(16)$, we have

\[ \dim V(D_{[8]})_1 = 2|D_{[8]}(16)| + |C_{[8]}(2)| = 2(29) + 14 = 72 \]

as desired.

Let $\mathfrak{g} = \mathrm{span}\{u^0_\beta, u^1_\beta \mid \beta \in D_{[8]}(16)\} \cup \{q_\alpha \mid \alpha \in C_{[8],K}(2)\}$. Then $\mathfrak{g}$ forms a Lie subalgebra of $V(D_{[8]})_1$.

Lemma 6.7. Let $\mathcal{K}_{[8]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_{[8],K}(2)\}$. Then $\mathcal{K}_{[8]} \cong \mathcal{A}_8$, an abelian Lie algebra of dimension 8.

Proof. Since $\alpha \cap \alpha' = \emptyset$ for any $\alpha, \alpha' \in C_{[8],K}(2)$ with $\alpha \neq \alpha'$, the result follows from Lemma 5.20.


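Remark 6.8. For any $\beta \in D_{[8]}(16)$, we have $C_{[8],\beta}(2) \neq \emptyset$ and thus $(u^0_\beta)_0 u^1_\beta \neq 0$ and $(q_\alpha)_0 u^i_\beta \neq 0$ for any $\alpha \in C_{[8],\beta}(2)$ and $i = 0, 1$.

Lemma 6.9. The Lie subalgebra $\mathcal{K}_{[8]}$ is a maximal toral subalgebra of $\mathfrak{g}$.

Proof. It is clear that $\mathcal{K}_{[8]}$ is a toral subalgebra since it is abelian and generated by semisimple elements. Let $w \in \mathfrak{g}$ be such that $w$ commutes with $\mathcal{K}_{[8]}$. Then by Remark 6.8, we must have $w \in \mathcal{K}_{[8]}$. Thus, $\mathcal{K}_{[8]}$ is a maximal toral subalgebra.

Notation 6.10. For $\beta \in D_{[8]}(16)$, we define $h_\beta := (u^0_\beta)_0 u^1_\beta \in [M_{\eta_{[8]} + C^{ex}}]_1$. By the same proof as in Lemma 5.33, we also have $h_{\beta+\beta'} = \epsilon_\beta h_\beta + \epsilon_{\beta'} h_{\beta'}$ for some constants $\epsilon_\beta, \epsilon_{\beta'}$ if $\mathrm{wt}(\beta + \beta') = 16$. Set

\[ \mathfrak{g}^1_{[8]} = \mathrm{span}\{u^0_\beta, u^1_\beta, h_\beta \mid \beta = \gamma_{\{i,j\}}, \{i,j\} \subset X_1\} \quad \text{and} \quad \mathfrak{g}^2_{[8]} = \mathrm{span}\{u^0_{\gamma_{\{9,10\}}}, u^1_{\gamma_{\{9,10\}}}, h_{\gamma_{\{9,10\}}}\}. \]

Lemma 6.11. The Lie algebras $\mathfrak{g}^1_{[8]}$ and $\mathfrak{g}^2_{[8]}$ commute with each other, i.e., $[\mathfrak{g}^1_{[8]}, \mathfrak{g}^2_{[8]}] = 0$.

Proof. For $\{i,j\} \cap \{k,\ell\} = \emptyset$, we have $\mathrm{wt}(\gamma_{\{i,j\}} + \gamma_{\{k,\ell\}}) = 24 > 16$. Thus $(u^a_{\gamma_{\{i,j\}}})_0 u^b_{\gamma_{\{k,\ell\}}} = 0$ for any $\{i,j\} \cap \{k,\ell\} = \emptyset$ and $a, b \in \{0, 1\}$, and we have $[\mathfrak{g}^1_{[8]}, \mathfrak{g}^2_{[8]}] = 0$.

Lemma 6.12. The Lie algebra $\mathfrak{g}^2_{[8]}$ is isomorphic to $sl_2(\mathbb{C})$.

Proof. It is clear that $\dim \mathfrak{g}^2_{[8]} = 3$ and $\mathfrak{g}^2_{[8]} \cong sl_2(\mathbb{C})$ by the definition of $h_\beta$.

Lemma 6.13. The Lie algebra $\mathfrak{g}^1_{[8]}$ is isomorphic to $sl_8(\mathbb{C})$.

Proof. By Lemma 5.33, we have

\[ \dim \mathrm{span}_{\mathbb{C}}\{h_\beta \mid \beta = \gamma_{\{i,j\}}, \{i,j\} \subset X_1\} = \dim \mathrm{span}_{\mathbb{Z}_2}\{\gamma_{\{i,j\}} \mid \{i,j\} \subset X_1\} = 7. \]

Moreover, $\mathrm{span}\{u^0_\beta, u^1_\beta, h_\beta\} \cong sl_2(\mathbb{C})$ for each $\beta \in \{\gamma_{\{i,j\}} \mid \{i,j\} \subset X_1\}$ and $|\{\gamma_{\{i,j\}} \mid \{i,j\} \subset X_1\}| = 28$. Therefore, $\mathfrak{g}^1_{[8]}$ has rank 7 and 28 positive roots. By the same argument as in Lemma 5.34, $\mathfrak{g}^1_{[8]}$ is of type $A_7$ and isomorphic to $sl_8(\mathbb{C})$.

Theorem 6.14. The Lie algebra $V(D_{[8]})_1$ is of the type $A_{1,1}^3 A_{7,4}$.

Proof. Let $h^{\vee}$ and $k$ be the dual Coxeter number and level of a simple component $\mathfrak{g}$ of $V(D_{[8]})_1$. Then

\[ \frac{h^{\vee}}{k} = \frac{\dim V(D_{[8]})_1 - 24}{24} = \frac{72 - 24}{24} = 2 \]

by (5.1). Thus, $k = 1$ if $\mathfrak{g}$ is of type $A_1$ and $k = 4$ if $\mathfrak{g}$ is of type $A_7$.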


6.2. 7-dimensional subcodes. There are six 7-dimensional subcodes of $D^{ex}$. Five of them are not contained in any extended doublings.

Subcode: $D_{[7]}$. Let $X_1 = \{1, \ldots, 7\}$. Then $D_{[7]}$ is generated by $\gamma_{\{i,j\}}$, $\{i,j\} \subset X_1$, and the all-one vector. In this case, $|D_{[7]}(16)| = \binom{7}{2} = 21$. Set $K_{[7]} = \mathrm{supp}(D_{[7]}(16))$. Then $|K_{[7]}| = 42$ and

\[ C'_{[7]} = \{\alpha \in C_{[7]} \mid \mathrm{supp}(\alpha) \cap K_{[7]} = \emptyset\} \cong E_6. \]

Thus, $|C'_{[7]}(2)| = \binom{6}{2} = 15$. Moreover, $C_{[7],K}(2)$ consists of elements of the form

\[ \{\{i,j\}, \{i,k\}\}, \quad \text{where } i = 1, \ldots, 7, \ \{j,k\} \subset \{8, 9, 10\}. \]

Therefore, there are 21 such vectors. Let $\eta_{[7]} := \{\{1,7\}, \{1,8\}\}$. Then we have $C_{[7]} = C_{[8]} \cup (\eta_{[7]} + C_{[8]})$. By Theorem 2.19, there is a holomorphic framed VOA

\[ V(D_{[7]}) = \bigoplus_{\beta \in D_{[7]}} U^\beta_{[7]} \]

such that

\[ U^0_{[7]} = M_{C_{[7]}} \quad \text{and} \quad U^\beta_{[7]} = U^\beta_{[8]} \oplus M_{\eta_{[7]} + C_{[8]}} \boxtimes_{M_{C_{[8]}}} U^\beta_{[8]}. \]

In this case, $\dim(U^\beta_{[7]})_1 = 4$ for $\beta \in D_{[7]}(16)$.

Lemma 6.15. The dimension of $V(D_{[7]})_1$ is 120.

Proof. Since $\dim(U^\beta_{[7]})_1 = 4$ for $\beta \in D_{[7]}(16)$, we have

\[ \dim V(D_{[7]})_1 = 4 \cdot |D_{[7]}(16)| + |C_{[7]}(2)| = 4(21) + 21 + 15 = 120 \]

as desired.

Set $\mathfrak{g}^0_{[7]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[7]}(2)\}$ and

\[ \mathfrak{g}^1_{[7]} = \mathrm{span}\{(U^\beta_{[7]})_1 \mid \beta \in D_{[7]}(16)\} \cup \{q_\alpha \mid \alpha \in C_{[7],K}(2)\}. \]

Recall that $M_{E_6} \cong V_{D_3} = V_{A_3}$ (see Appendix A).

Lemma 6.16. The subspace $\mathfrak{g}^0_{[7]}$ forms a Lie subalgebra of the type $A_{3,1}$.

Now let $\beta = \gamma_{\{i,j\}} \in D_{[7]}(16)$ and set $C_{[7],\beta} = \{\alpha \in C_{[7]} \mid \mathrm{supp}(\alpha) \subset \mathrm{supp}(\beta)\}$. Then

\[ C_{[7],\beta}(2) = \{\{\{i,k\}, \{i,\ell\}\}, \{\{j,k\}, \{j,\ell\}\} \mid k, \ell = 8, 9, 10, \ k \neq \ell\}. \]

Therefore, $|C_{[7],\beta}(2)| = 6$ and $C_{[7],\beta}(2)$ also generates a subcode isomorphic to $E_3 \oplus E_3$.

Lemma 6.17. Let $\alpha_i = \{\{i,8\}, \{i,9\}\}$ for $i = 1, \ldots, 7$. Then the Lie subalgebra $\mathrm{span}\{q_{\alpha_i} \mid i = 1, \ldots, 7\}$ is a maximal abelian subalgebra of $\mathfrak{g}^1_{[7]}$.


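Proof. Since $\alpha_i \cap \alpha_j = \emptyset$, we have $[q_{\alpha_i}, q_{\alpha_j}] = (q_{\alpha_i})_0 q_{\alpha_j} \in (M_{\alpha_i + \alpha_j})_1 = 0$. Therefore, $\mathrm{span}\{q_{\alpha_i} \mid i = 1, \ldots, 7\}$ is abelian. Now let

\[ u = \sum_{\beta \in D_{[7]}(16)} a_\beta x_\beta + \sum_{\alpha \in C_{[7],K}(2)} b_\alpha q_\alpha \in \mathfrak{g}^1_{[7]}, \]

where $0 \neq x_\beta \in (U^\beta_{[7]})_1$. Suppose $u$ commutes with $q_{\alpha_i}$ for all $i = 1, \ldots, 7$. Then we have $a_\beta = 0$ for all $\beta \in D_{[7]}(16)$ and $b_\alpha = 0$ for all $\alpha$ with $|\alpha \cap \alpha_i| = 1$, since $[q_{\alpha_i}, x_\beta] \neq 0$ and $[q_\alpha, q_{\alpha_i}] = \pm q_{\alpha + \alpha_i}$ if $|\alpha \cap \alpha_i| = 1$. Thus, $u \in \mathrm{span}\{q_{\alpha_i} \mid i = 1, \ldots, 7\}$ and $\mathrm{span}\{q_{\alpha_i} \mid i = 1, \ldots, 7\}$ is a maximal abelian subalgebra.

Remark 6.18. Note that $(q_\alpha)_0$ also acts semisimply on $V(D_{[7]})_1$ for any $\alpha \in C_{[7],K}(2)$. Therefore, $\mathrm{span}\{q_{\alpha_i} \mid i = 1, \ldots, 7\}$ is, in fact, a maximal toral subalgebra of $\mathfrak{g}^1_{[7]}$.

Lemma 6.19. The Lie subalgebra $\mathfrak{g}^1_{[7]}$ is isomorphic to $sp_{14}(\mathbb{C})$ and has the type $C_{7,2}$.

Proof. First, we note that $\dim \mathfrak{g}^1_{[7]} = 21 \times 4 + 21 = 105$. In addition, $\mathfrak{g}^1_{[7]}$ has Lie rank 7. Thus, $\mathfrak{g}^1_{[7]}$ is of the type $B_7$ or $C_7$, whose dual Coxeter number $h^{\vee}$ is 13 and 8, respectively. Now suppose the affine Lie algebra of $\mathfrak{g}^1_{[7]}$ has level $k$. Then by (5.1),

\[ \frac{h^{\vee}}{k} = \frac{120 - 24}{24} = 4. \]

Since $k$ is a positive integer, $h^{\vee}$ must be divisible by 4 and thus $h^{\vee} = 8$ and $\mathfrak{g}^1_{[7]}$ is isomorphic to $sp_{14}(\mathbb{C})$. In this case, the level is $k = 2$ and $\mathfrak{g}^1_{[7]}$ is of the type $C_{7,2}$.

Theorem 6.20. The Lie algebra $V(D_{[7]})_1$ is of the type $A_{3,1} C_{7,2}$.

Subcode: $D_{[6,2]} = D_{[6,2,2]}$. In this case, $|D_{[6,2,2]}(16)| = 17$ and $|K_{[6,2,2]}| = 43$. Set $C'_{[6,2,2]} = \{\alpha \in C_{[6,2,2]} \mid \mathrm{supp}(\alpha) \cap K_{[6,2,2]} = \emptyset\}$. Then $C'_{[6,2,2]} \cong E_5$ and the set $C_{[6,2,2],K}(2)$ consists of

\[ \{\{i,k\}, \{j,\ell\}\}, \quad i, j \in \{7, 8\}, \ k, \ell \in \{9, 10\}, \ \{i,k\} \neq \{j,\ell\}, \]
\[ \{\{i,k\}, \{i,\ell\}\}, \quad i = 1, \ldots, 6, \ \{k,\ell\} = \{7, 8\} \text{ or } \{9, 10\}. \]

In particular, there are 18 elements in $C_{[6,2,2],K}(2)$. Let $\delta = \eta_{[6,4]} := \{\{7,9\}, \{8,10\}\}$ and $\delta' = \eta_{[8]} = \{\{1,9\}, \{1,10\}\}$. Then we have $C_{[6,4]} = C_{[10]} \cup (\delta + C_{[10]})$, $C_{[8]} = C_{[10]} \cup (\delta' + C_{[10]})$ and

\[
\begin{aligned}
C_{[6,2,2]} &= C_{[6,4]} \cup (\delta' + C_{[6,4]}) = C_{[8]} \cup (\delta + C_{[8]}) \\
&= C_{[10]} \cup (\delta + C_{[10]}) \cup (\delta' + C_{[10]}) \cup (\delta + \delta' + C_{[10]}).
\end{aligned}
\]

By Theorem 2.19, there is a holomorphic framed VOA

\[ V(D_{[6,2,2]}) = \bigoplus_{\beta \in D_{[6,2,2]}} U^\beta_{[6,2,2]}, \]

such that $U^\beta_{[6,2,2]} = U^\beta_{[6,4]} \oplus U^\beta_{[6,4]} \boxtimes M_{\delta' + C_{[6,4]}}$. In this case,

\[ U^\beta_{[6,2,2]} = V^\beta \oplus V^\beta \boxtimes M_{\delta + C_{[10]}} \oplus V^\beta \boxtimes M_{\delta' + C_{[10]}} \oplus V^\beta \boxtimes M_{\delta + \delta' + C_{[10]}}, \]

and $\dim(U^\beta_{[6,2,2]})_1 = 4$ for $\beta \in D_{[6,2,2]}(16)$.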


Lemma 6.21. The dimension of the Lie algebra $V(D_{[6,2,2]})_1$ is 96.

Proof. Since $\dim(U^\beta_{[6,2,2]})_1 = 4$ for all $\beta \in D_{[6,2,2]}(16)$, we have $\dim V(D_{[6,2,2]})_1 = 17 \times 4 + 10 + 18 = 96$.

Lemma 6.22. Let $\mathfrak{g}^0_{[6,2,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[6,2,2]}(2)\}$. Then $\mathfrak{g}^0_{[6,2,2]}$ generates a subVOA isomorphic to $L_{C_2}(1, 0)$ and $\mathfrak{g}^0_{[6,2,2]}$ is of the type $C_{2,1}$.

Proof. Use Theorem A.2 and $C'_{[6,2,2]} \cong E_5$. Note also that a Lie algebra of type $B_2$ is isomorphic to a Lie algebra of type $C_2$.
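The dimension counts in Proposition 5.19 and Lemmas 5.28, 6.6, 6.15 and 6.21 all have the form (multiplicity) $\times |D(16)| + |C(2)|$, and the two cardinalities can be obtained by enumeration. The following sketch does this for the five codes treated so far. The multiplicities $\dim(U^\beta)_1 = 1, 2, 2, 4, 4$ are taken from the text as inputs rather than derived, the coordinate ordering is arbitrary, and all helper names are illustrative.

```python
from itertools import combinations

OMEGA = list(combinations(range(1, 11), 2))
INDEX = {s: n for n, s in enumerate(OMEGA)}
ALL_ONE = (1 << 48) - 1   # bits 45..47 stand for a, b, c

def gamma(i, j):
    """Bitmask of gamma_{i,j} in length 48."""
    word = 0
    for s in OMEGA:
        if len({i, j} & set(s)) == 1:
            word |= 1 << INDEX[s]
    return word

def wt(word):
    return bin(word).count("1")

def span(generators):
    """All GF(2)-linear combinations of the generators."""
    words = {0}
    for g in generators:
        words |= {w ^ g for w in words}
    return words

def generators(blocks):
    return [gamma(i, j) for b in blocks for i, j in combinations(b, 2)] + [ALL_ONE]

def count_weight16(blocks):
    """|D(16)|: number of weight 16 codewords."""
    return sum(1 for w in span(generators(blocks)) if wt(w) == 16)

def count_dual_weight2(blocks):
    """|C(2)|: coordinate pairs on which every generator takes equal values."""
    gens = generators(blocks)
    return sum(
        1
        for x, y in combinations(range(48), 2)
        if all(((g >> x) ^ (g >> y)) & 1 == 0 for g in gens)
    )

# name -> (blocks, dim (U^beta)_1 as stated in the text)
CASES = {
    "[10]":    ([range(1, 11)], 1),
    "[8]":     ([range(1, 9)], 2),
    "[6,4]":   ([range(1, 7), range(7, 11)], 2),
    "[7]":     ([range(1, 8)], 4),
    "[6,2,2]": ([range(1, 7), (7, 8), (9, 10)], 4),
}
counts = {name: (count_weight16(b), count_dual_weight2(b)) for name, (b, _) in CASES.items()}
dims = {name: CASES[name][1] * n16 + n2 for name, (n16, n2) in counts.items()}
```

A weight 2 word lies in the dual code exactly when every generator agrees on its two coordinates, which is what `count_dual_weight2` tests.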

Notation 6.23. Let X 1 = {1, 2, 3, 4, 5, 6}, X 2 = {7, 8} and X 3 = {9, 10}. Let B1 = {γ{i, j} | {i, j} ⊂ X 1 }, B2 = {γ{7,8} } and B3 = {γ{9,10} }. = vβ be a highest weight vector of V β defined as in Sect. 5.2. Let Let u 0,0 β

0,1 1,1 β β u 1,0 β , u β , u β be the highest weight vectors of V Mδ+C[10] , V Mδ +C[10] and V β Mδ+δ +C[10] , respectively. The following is easy to verify.

Lemma 6.24. For any β, β ∈ B1 with wt(β + β ) = 16 and a, a , b, b ∈ {0, 1}, we have

a ,b a+a ,b+b [u a,b β , u β ] = λu β+β

for some non-zero constant λ. Note that the values of λ depends on β, β ∈ B1 as well as a, a , b, b ∈ {0, 1}. For any β = γ{i, j} ∈ B1 , we have C[6,2,2],β (2) = {{ p, k}, { p, }}| p ∈ {i, j}, {k, } = {7, 8} or {9, 10} . Note that C[6,2,2],β (2) ⊂ δ + C[6,4] = (δ + C[10] ) ∪ (δ + δ + C[10] ).

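Lemma 6.25. For any $\beta \in B_1$, $(u^{a,b}_\beta)_0 u^{a',b'}_\beta \neq 0$ iff $b \neq b'$.

Proof. First, we note that for $(a, b), (a', b') \in \mathbb{Z}_2 \times \mathbb{Z}_2$, we have
$$(u^{a,b}_\beta)_0 u^{a',b'}_\beta \in M_{(a+a')\delta + (b+b')\delta' + C_{[10]}}.$$
Since $C_{[6,2,2],\beta}(2) \subset \delta' + C_{[6,4]}$, it is clear that $(u^{a,b}_\beta)_0 u^{a',b'}_\beta \neq 0$ iff $b + b' = 1$, i.e., $b \neq b'$.

Notation 6.26. Define $h^{0,1}_\beta := [u^{0,0}_\beta, u^{0,1}_\beta]$ and $h^{1,1}_\beta := [u^{1,0}_\beta, u^{0,1}_\beta]$. Note that $h^{0,1}_\beta \in M_{\delta' + C_{[10]}}$ and $h^{1,1}_\beta \in M_{\delta + \delta' + C_{[10]}}$.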

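Lemma 6.27. For any $\beta \in B_1$, we have $[u^{1,0}_\beta, u^{1,1}_\beta] = \nu h^{0,1}_\beta$ and $[u^{0,0}_\beta, u^{1,1}_\beta] = \mu h^{1,1}_\beta$ for some constants $\nu$ and $\mu$.

Proof. Since $h^{0,1}_\beta \in M_{\delta' + C_{[10]}}$ and $h^{1,1}_\beta \in M_{\delta + \delta' + C_{[10]}}$, we have
$$[h^{0,1}_\beta, u^{0,0}_\beta] = p\, u^{0,1}_\beta, \quad [h^{0,1}_\beta, u^{1,0}_\beta] = q\, u^{1,1}_\beta,$$
$$[h^{1,1}_\beta, u^{0,0}_\beta] = r\, u^{1,1}_\beta, \quad [h^{1,1}_\beta, u^{1,0}_\beta] = s\, u^{0,1}_\beta,$$
for some non-zero constants $p, q, r, s$. Thus, by the Jacobi identity and Lemma 6.25 (which gives $[u^{1,0}_\beta, u^{0,0}_\beta] = 0$),
$$\begin{aligned}
[u^{1,0}_\beta, u^{1,1}_\beta] &= \frac{1}{r}\,[u^{1,0}_\beta, [h^{1,1}_\beta, u^{0,0}_\beta]] \\
&= \frac{1}{r}\left([[u^{1,0}_\beta, h^{1,1}_\beta], u^{0,0}_\beta] + [h^{1,1}_\beta, [u^{1,0}_\beta, u^{0,0}_\beta]]\right) \\
&= -\frac{s}{r}\,[u^{0,1}_\beta, u^{0,0}_\beta] = \frac{s}{r}\, h^{0,1}_\beta.
\end{aligned}$$
Similarly, we also have
$$\begin{aligned}
[u^{0,0}_\beta, u^{1,1}_\beta] &= \frac{1}{q}\,[u^{0,0}_\beta, [h^{0,1}_\beta, u^{1,0}_\beta]] \\
&= \frac{p}{q}\,[u^{1,0}_\beta, u^{0,1}_\beta] = \frac{p}{q}\, h^{1,1}_\beta
\end{aligned}$$
as desired.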

Lemma 6.28. Let $\beta, \beta' \in B_1$.
(i) Suppose $\mathrm{wt}(\beta + \beta') = 16$. Then for $(i, j) = (0, 1)$ or $(1, 1)$, we have
$$h^{i,j}_{\beta+\beta'} = \varepsilon_\beta h^{i,j}_\beta + \varepsilon_{\beta'} h^{i,j}_{\beta'}$$
for some constants $\varepsilon_\beta$ and $\varepsilon_{\beta'}$; and
(ii) $[h^{i,j}_\beta, h^{k,\ell}_{\beta'}] = 0$ for any $(i, j)$, $(k, \ell)$.

Proof. (i) can be proved by an argument similar to that of Lemma 5.33. (ii) is trivial since $[h^{i,j}_\beta, h^{k,\ell}_{\beta'}] \in M_{C_{[6,4]}}$ and $C_{[6,4],\beta+\beta'}(2) = \emptyset$.

Lemma 6.29. For each $\beta \in B_1$, $\mathrm{span}_{\mathbb{C}}\{h^{0,1}_\beta, h^{1,1}_\beta, u^{0,0}_\beta, u^{1,0}_\beta, u^{0,1}_\beta, u^{1,1}_\beta\}$ is closed under the Lie bracket and it is isomorphic to $sl_2(\mathbb{C})^2$.

Proof. It is clear by definition that $\mathrm{span}_{\mathbb{C}}\{h^{0,1}_\beta, h^{1,1}_\beta, u^{0,0}_\beta, u^{1,0}_\beta, u^{0,1}_\beta, u^{1,1}_\beta\}$ is closed under the Lie bracket. Since this span is semisimple and has dimension 6, it must be isomorphic to $sl_2(\mathbb{C})^2$.

Lemma 6.30. The Lie subalgebra
$$\mathfrak{g}^1_{[6,2,2]} = \mathrm{span}_{\mathbb{C}}\{h^{0,1}_\beta, h^{1,1}_\beta, u^{0,0}_\beta, u^{1,0}_\beta, u^{0,1}_\beta, u^{1,1}_\beta \mid \beta \in B_1\}$$
is closed under the Lie bracket and it is isomorphic to $sl_6(\mathbb{C})^2$, i.e., of the type $A_5^2$.

Proof. First we note that
$$\dim \mathfrak{g}^1_{[6,2,2]} = 4 \cdot \binom{6}{2} + 2 \cdot 5 = 70.$$
By Lemma 6.28, it is easy to see that $\mathfrak{h}^1_{[6,2,2]} = \mathrm{span}_{\mathbb{C}}\{h^{0,1}_\beta, h^{1,1}_\beta \mid \beta \in B_1\}$ is a maximal abelian subalgebra of $\mathfrak{g}^1_{[6,2,2]}$ and $\dim \mathfrak{h}^1_{[6,2,2]} = 2 \times 5 = 10$. Therefore, $\mathfrak{g}^1_{[6,2,2]}$ has Lie rank 10 and dimension 70. The only possibility is $sl_6(\mathbb{C})^2$, i.e., of the type $A_5^2$.

For $\beta = \gamma_{\{k,\ell\}}$, where $\{k, \ell\} = \{7, 8\}$ or $\{9, 10\}$, we have
$$C_{[6,2,2],\beta}(2) = \{\{\{i,k\},\{i,\ell\}\} \mid i \neq k, \ell\} \cup \{\{\{k,p\},\{\ell,q\}\} \mid \{p, q, k, \ell\} = \{7, 8, 9, 10\}\}.$$
Note that $\{\{\{k,p\},\{\ell,q\}\} \mid \{p, q, k, \ell\} = \{7, 8, 9, 10\}\} \subset \delta + C_{[10]}$, that $\{\{\{i,k\},\{i,\ell\}\} \mid i \neq k, \ell\} \subset \delta' + C_{[10]}$ if $\{k, \ell\} = \{9, 10\}$, and that $\{\{\{i,k\},\{i,\ell\}\} \mid i \neq k, \ell\} \subset \delta + \delta' + C_{[10]}$ if $\{k, \ell\} = \{7, 8\}$.

Let $S = \mathrm{span}\{q_\alpha \mid \alpha = \{\{i,k\},\{i,\ell\}\},\ i \in \{1, \dots, 6\},\ \{k,\ell\} = \{7,8\}, \{9,10\}\}$. Define a linear map $\varphi : S \to \mathrm{End}(\mathfrak{g}^1_{[6,2,2]})$ by $(\varphi(a))(w) = a_0 w$.

Lemma 6.31. The dimension of $\ker \varphi$ is 2.

Proof. By Lemma 6.30, we know that $S \cap \mathfrak{g}^1_{[6,2,2]}$ has dimension 10. Moreover, $\dim S = 12$. Thus, $\ker \varphi = (\mathfrak{g}^1_{[6,2,2]})^\perp \cap S$ has dimension 2, where $(\mathfrak{g}^1_{[6,2,2]})^\perp$ denotes the orthogonal complement of $\mathfrak{g}^1_{[6,2,2]}$ with respect to the Killing form.

Set $\mathfrak{g}^2_{[6,2,2]} = \mathrm{span}(\ker \varphi \cup U \cup Q)$, where
$$U = \{u^{(a,b)}_\beta \mid \beta \in \{\gamma_{\{7,8\}}, \gamma_{\{9,10\}}\}\}$$
and
$$Q = \{q_\alpha \mid \alpha = \{\{k,p\},\{\ell,q\}\},\ k, \ell \in \{7,8\},\ p, q \in \{9,10\},\ \{k,p\} \neq \{\ell,q\}\}.$$
In this case, $\dim \mathfrak{g}^2_{[6,2,2]} = 2 + 4 \cdot 2 + 6 = 16$.

Lemma 6.32. The Lie algebra $\mathfrak{g}^2_{[6,2,2]}$ is isomorphic to $sl_3(\mathbb{C})^2$ and commutes with $\mathfrak{g}^1_{[6,2,2]}$.

Proof. That $[\mathfrak{g}^1_{[6,2,2]}, \mathfrak{g}^2_{[6,2,2]}] = 0$ is clear by the definition of $\varphi$. Now note that $\mathrm{span}(\ker \varphi \cup \{q_{\{\{7,9\},\{8,10\}\}}, q_{\{\{7,10\},\{8,9\}\}}\})$ is a maximal abelian subalgebra of $\mathfrak{g}^2_{[6,2,2]}$. Since $\mathfrak{g}^2_{[6,2,2]}$ has Lie rank 4 and $\dim \mathfrak{g}^2_{[6,2,2]} = 16$, the only possibility is $sl_3(\mathbb{C})^2$, i.e., of type $A_2^2$.

Theorem 6.33. The Lie algebra structure of $V(D_{[6,2,2]})_1$ is of the type $A_{2,1}^2 A_{5,2}^2 C_{2,1}$.

Proof. Again, by (5.1), we have
$$\frac{h^\vee}{k} = \frac{\dim V(D_{[6,2,2]})_1 - 24}{24} = \frac{96 - 24}{24} = 3$$
for any simple component $\mathfrak{g}$ of $V(D_{[6,2,2]})_1$ whose dual Coxeter number is $h^\vee$ and whose affine algebra has level $k$. Thus $k = 1$ if $\mathfrak{g}$ is of type $A_2$ or $C_2$, and $k = 2$ if $\mathfrak{g}$ is of type $A_5$.
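As a consistency check on Theorem 6.33, one can confirm that the stated type has total dimension 96, that every simple component has $h^\vee / k = 3$, and that the central charges add up to 24. The sketch below is not taken from the paper: it assumes the standard formulas for the dimension and dual Coxeter number of the classical types and the standard Sugawara central charge $k \dim \mathfrak{g} / (k + h^\vee)$, and the helper names `simple_data` and `summary` are hypothetical.

```python
from fractions import Fraction


def simple_data(series, n):
    """Return (dimension, dual Coxeter number) of the classical type X_n."""
    if series == "A":
        return n * (n + 2), n + 1
    if series == "B":
        return n * (2 * n + 1), 2 * n - 1
    if series == "C":
        return n * (2 * n + 1), n + 1
    if series == "D":
        return n * (2 * n - 1), 2 * n - 2
    raise ValueError(f"unsupported series {series!r}")


def summary(components):
    """components: iterable of (series, rank, level, multiplicity)."""
    total_dim = 0
    total_c = Fraction(0)
    ratios = set()
    for series, n, k, mult in components:
        dim, h_dual = simple_data(series, n)
        total_dim += mult * dim
        # Sugawara central charge of the level-k affine VOA (assumed formula).
        total_c += mult * Fraction(k * dim, k + h_dual)
        ratios.add(Fraction(h_dual, k))
    return total_dim, total_c, ratios


# Type A_{2,1}^2 A_{5,2}^2 C_{2,1} from Theorem 6.33.
THEOREM_6_33 = [("A", 2, 1, 2), ("A", 5, 2, 2), ("C", 2, 1, 1)]
total_dim, total_c, ratios = summary(THEOREM_6_33)
print(total_dim, total_c, ratios)
```

The same helper can be pointed at the other types in this section by changing the component list.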


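Subcode: $D_{[5,3]}$. In this case, $|D_{[5,3]}(16)| = \binom{5}{2} + 3 = 13$ and $|K_{[5,3]}| = 44$. Thus, $C'_{[5,3]} \cong E_4$. Moreover, the set $C_{[5,3],K}(2)$ consists of
$$\{\{i,9\},\{i,10\}\}, \quad i = 1, \dots, 8,$$
$$\{\{i,j\},\{k,\ell\}\}, \quad \{i, j, k\} = \{6, 7, 8\},\ \ell \in \{9, 10\}.$$
Therefore, there are 14 elements in $C_{[5,3],K}(2)$. Recall that $D_{[10]} \supset D_{[8]} \supset D_{[5,3]}$. Thus, we have $C_{[10]} \subset C_{[8]} \subset C_{[5,3]}$. Let $\eta = \eta_{[8]} = \{\{1,9\},\{1,10\}\}$ and $\eta' = \eta_{[5,3]} = \{\{6,7\},\{8,9\}\}$. Then we have $C_{[8]} = C_{[10]} \cup (\eta + C_{[10]})$, $C_{[5,3]} = C_{[8]} \cup (\eta' + C_{[8]})$ and
$$C_{[5,3]} = C_{[10]} \cup (\eta + C_{[10]}) \cup (\eta' + C_{[10]}) \cup (\eta + \eta' + C_{[10]}).$$
By Theorem 2.19, there exists a holomorphic VOA
$$V(D_{[5,3]}) = \bigoplus_{\beta \in D_{[5,3]}} U^\beta_{[5,3]}$$
such that $U^\beta_{[5,3]} = U^\beta_{[8]} \oplus U^\beta_{[8]} \boxtimes M_{\eta' + C_{[8]}}$. Since $C_{[8]} = C_{[10]} \cup (\eta + C_{[10]})$, we also have
$$U^\beta_{[5,3]} = V^\beta \oplus V^\beta \boxtimes M_{\eta + C_{[10]}} \oplus V^\beta \boxtimes M_{\eta' + C_{[10]}} \oplus V^\beta \boxtimes M_{\eta + \eta' + C_{[10]}}.$$
Let $u^{0,0}_\beta = v_\beta$ be a highest weight vector of $V^\beta$. Let $u^{1,0}_\beta$, $u^{0,1}_\beta$, $u^{1,1}_\beta$ be the highest weight vectors of $V^\beta \boxtimes M_{\eta + C_{[10]}}$, $V^\beta \boxtimes M_{\eta' + C_{[10]}}$ and $V^\beta \boxtimes M_{\eta + \eta' + C_{[10]}}$, respectively.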

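Lemma 6.34. The dimension of the Lie algebra $V(D_{[5,3]})_1$ is 72.

Proof. Since $\dim (U^\beta_{[5,3]})_1 = 4$ for any $\beta \in D_{[5,3]}(16)$, we have $\dim V(D_{[5,3]})_1 = 6 + 14 + 4 \times 13 = 72$.

Set
$$\begin{aligned}
B_1 &= \{\beta = \gamma_{\{i,j\}} \mid \{i, j\} \subset \{1, \dots, 5\}\}, \\
B_2 &= \{\beta = \gamma_{\{i,j\}} \mid \{i, j\} \subset \{6, 7, 8\}\}, \\
C_1 &= \{\{\{i,9\},\{i,10\}\} \mid i = 1, \dots, 5\}, \text{ and} \\
C_2 &= \{\{\{i,9\},\{i,10\}\} \mid i = 6, 7, 8\} \cup \{\{\{i,j\},\{k,\ell\}\} \mid \{i, j, k\} = \{6, 7, 8\},\ \ell \in \{9, 10\}\}.
\end{aligned}$$
Let
$$\begin{aligned}
\mathfrak{g}^0_{[5,3]} &= \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[5,3]}(2)\}, \\
\mathfrak{g}^1_{[5,3]} &= \mathrm{span}\left(\{q_\alpha \mid \alpha \in C_1\} \cup \{u^{a,b}_\beta \mid \beta \in B_1,\ (a, b) \in \mathbb{Z}_2 \times \mathbb{Z}_2\}\right), \\
\mathfrak{g}^2_{[5,3]} &= \mathrm{span}\left(\{q_\alpha \mid \alpha \in C_2\} \cup \{u^{a,b}_\beta \mid \beta \in B_2,\ (a, b) \in \mathbb{Z}_2 \times \mathbb{Z}_2\}\right).
\end{aligned}$$
The following lemma is clear from the definition.

Lemma 6.35. The subspaces $\mathfrak{g}^0_{[5,3]}$, $\mathfrak{g}^1_{[5,3]}$ and $\mathfrak{g}^2_{[5,3]}$ are Lie subalgebras of $V(D_{[5,3]})_1$ and they are mutually commutative.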



Lemma 6.36. The Lie subalgebra $\mathfrak{g}^0_{[5,3]}$ is isomorphic to $sl_2(\mathbb{C})^2$.

Proof. Since $C'_{[5,3]} \cong E_4$, it is clear from Theorem A.1 that $\mathfrak{g}^0_{[5,3]} \cong sl_2(\mathbb{C})^2$.

For $\beta = \gamma_{\{i,j\}} \in B_1$, we have $C_{[5,3],\beta}(2) = \{\{\{i,9\},\{i,10\}\}, \{\{j,9\},\{j,10\}\}\} \subset C_{[8],\beta}(2) \subset \eta + C_{[10]}$.

Lemma 6.37. The Lie subalgebra $\mathfrak{g}^1_{[5,3]}$ is isomorphic to $o_{10}(\mathbb{C})$, i.e., of type $D_5$.

Proof. First, we note that $\dim \mathfrak{g}^1_{[5,3]} = 5 + 4 \times \binom{5}{2} = 45$. Moreover, $\mathrm{span}\{q_\alpha \mid \alpha \in C_1\}$ is a maximal abelian subalgebra of $\mathfrak{g}^1_{[5,3]}$. Therefore, $\mathfrak{g}^1_{[5,3]}$ has Lie rank 5 since $(q_\alpha)_0$ acts semisimply on $\mathfrak{g}^1_{[5,3]}$. Thus, $\mathfrak{g}^1_{[5,3]}$ must be isomorphic to $o_{10}(\mathbb{C})$ and of type $D_5$.

Lemma 6.38. The Lie subalgebra $\mathfrak{g}^2_{[5,3]}$ is of type $C_{3,2}$.

Proof. First, we note that $\mathfrak{g}^2_{[5,3]}$ has Lie rank 3 and $\mathrm{span}\{q_\alpha \mid \alpha \in C_2\}$ is closed under the 0th product and is isomorphic to $sl_2(\mathbb{C})^3$. Moreover, $\dim \mathfrak{g}^2_{[5,3]} = 3 + 6 + 4 \times 3 = 21$. By comparing ranks and dimensions, $\mathfrak{g}^2_{[5,3]}$ is either of type $B_3$ or of type $C_3$. Let $h^\vee$ be the dual Coxeter number of $\mathfrak{g}^2_{[5,3]}$. Suppose the affine algebra of $\mathfrak{g}^2_{[5,3]}$ has level $k$. Then by (5.1),
$$\frac{h^\vee}{k} = \frac{72 - 24}{24} = 2. \qquad (6.1)$$
This implies $h^\vee$ is even and thus $h^\vee = 4$ and $\mathfrak{g}^2_{[5,3]}$ is of type $C_{3,2}$. Note that the dual Coxeter numbers of $B_3$ and $C_3$ are 5 and 4, respectively.

Theorem 6.39. The Lie algebra structure of $V(D_{[5,3]})_1$ is of the type $A_{1,1}^2 C_{3,2} D_{5,4}$.

Proof. It follows easily from Lemmas 6.36–6.38 and (6.1).
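The integrality argument in the proof of Lemma 6.38 can be phrased as a small search: given $\dim V_1$ and the candidate types, keep those for which $k = h^\vee / ((\dim V_1 - 24)/24)$ is a positive integer. This is only an illustrative sketch; the function name `admissible_levels` is hypothetical, and the dual Coxeter numbers are the ones quoted in the proof.

```python
from fractions import Fraction

# Dual Coxeter numbers quoted in the proof of Lemma 6.38.
CANDIDATES = {"B3": 5, "C3": 4}


def admissible_levels(dim_v1, candidates):
    """Return {type: level} for the candidates compatible with (5.1)."""
    ratio = Fraction(dim_v1 - 24, 24)      # h^vee / k according to (5.1)
    result = {}
    for name, h_dual in candidates.items():
        level = Fraction(h_dual) / ratio
        if level > 0 and level.denominator == 1:
            result[name] = int(level)
    return result


levels = admissible_levels(72, CANDIDATES)
print(levels)
```

With $\dim V(D_{[5,3]})_1 = 72$ only $C_3$ survives, at level 2, which is the conclusion of the lemma.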

Subcode: $D_{[4,4]} = D_{[4,4,2]}$. In this case, $|D_{[4,4,2]}(16)| = 6 + 6 + 1 = 13$ and $|K_{[4,4,2]}| = 44$. Thus, $C'_{[4,4,2]} \cong E_4$. Moreover, $C_{[4,4,2],K}(2)$ consists of
$$\{\{i,9\},\{i,10\}\}, \quad i = 1, \dots, 8,$$
$$\{\{i,j\},\{k,\ell\}\}, \quad \{i, j, k, \ell\} = \{1, 2, 3, 4\} \text{ or } \{5, 6, 7, 8\}.$$
Therefore, there are 14 elements in $C_{[4,4,2],K}(2)$.

Lemma 6.40. $\dim V(D_{[4,4,2]})_1 = 13 \times 4 + 8 + 6 + 6 = 72$.

Note that $D_{[4,4,2]}$ is a common subcode of $D_{[8]}$ and $D_{[6,4]}$. Let $\eta = \eta_{[8]} = \{\{1,9\},\{1,10\}\}$ and $\delta = \eta_{[6,4]} = \{\{7,9\},\{8,10\}\}$. Then
$$C_{[8]} = C_{[10]} \cup (\eta + C_{[10]}) \quad \text{and} \quad C_{[6,4]} = C_{[10]} \cup (\delta + C_{[10]}).$$

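Again, by Theorem 2.19, there is a holomorphic VOA,
$$V(D_{[4,4,2]}) = \bigoplus_{\beta \in D_{[4,4,2]}} U^\beta_{[4,4,2]},$$
such that $U^\beta_{[4,4,2]} = U^\beta_{[8]} \oplus U^\beta_{[8]} \boxtimes M_{\delta + C_{[8]}}$. In this case, we also have
$$\begin{aligned}
U^\beta_{[4,4,2]} &= V^\beta \oplus V^\beta \boxtimes M_{\eta + C_{[10]}} \oplus V^\beta \boxtimes M_{\delta + C_{[10]}} \oplus V^\beta \boxtimes M_{\delta + \eta + C_{[10]}} \\
&= U^\beta_{[6,4]} \oplus U^\beta_{[6,4]} \boxtimes M_{\eta + C_{[6,4]}}.
\end{aligned}$$
Let $X_1 = \{1, 2, 3, 4\}$, $X_2 = \{5, 6, 7, 8\}$, $X_3 = \{9, 10\}$ and denote $B_k = \{\gamma_{\{i,j\}} \mid \{i, j\} \subset X_k\}$ for $k = 1, 2, 3$.

Lemma 6.41. Let $U^i = \bigoplus_{\beta \in B_i} (U^\beta_{[4,4,2]})_1$ and let $\mathfrak{g}^i_{[4,4]}$ be the Lie subalgebra generated by $U^i$ for $i = 1, 2, 3$. Then $\mathfrak{g}^1_{[4,4]} \cong \mathfrak{g}^2_{[4,4]} \cong sl_4(\mathbb{C})^2$ and $\mathfrak{g}^3_{[4,4]} \cong sl_2(\mathbb{C})^2$.

Proof. By symmetry, it is clear that $\mathfrak{g}^1_{[4,4]} \cong \mathfrak{g}^2_{[4,4]}$. For $k = 1, 2$, the subcode generated by $B_k$ has dimension 3. Thus, by the same argument as in Lemma 6.30, $\mathfrak{g}^k_{[4,4]}$ has Lie rank $6\ (= 2 \times 3)$. Moreover, $\dim \mathfrak{g}^k_{[4,4]} = 6 \times 4 + 6 = 30$. Thus, $\mathfrak{g}^k_{[4,4]} \cong sl_4(\mathbb{C})^2$. For $k = 3$, $\dim \mathfrak{g}^3_{[4,4]} = 4 + 2 \times 1$ and $\mathfrak{g}^3_{[4,4]}$ has Lie rank 2. Thus, $\mathfrak{g}^3_{[4,4]} \cong sl_2(\mathbb{C})^2$.

Theorem 6.42. The Lie algebra $V(D_{[4,4,2]})_1$ is of the type $A_{1,1}^4 A_{3,2}^4$.

Proof. Let $\mathfrak{g}$ be a simple component whose dual Coxeter number is $h^\vee$ and whose affine algebra has level $k$. Then
$$\frac{h^\vee}{k} = \frac{72 - 24}{24} = 2$$
and thus $k = h^\vee / 2$. Therefore, $k = 1$ if $\mathfrak{g}$ is of type $A_1$ and $k = 2$ if $\mathfrak{g}$ is of type $A_3$.

Remark 6.43. The Lie algebra $A_{1,1}^4 A_{3,2}^4$ is also isomorphic to the Lie algebra of the weight one subspace of the $\theta$-orbifold VOA $\tilde{V}_{N(A_5^4 D_4)}$ associated to the Niemeier lattice $N(A_5^4 D_4)$ of type $A_5^4 D_4$ (cf. [DGM]).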

Subcode: $D_{[3,3,3]}$. In this case, $|D_{[3,3,3]}(16)| = 9$ and $|K_{[3,3,3]}| = 45$. Thus, $C'_{[3,3,3]} \cong E_3$. Moreover, $C_{[3,3,3],K}(2)$ consists of
$$\{\{i,j\},\{k,10\}\}, \quad \{i, j, k\} = \{1, 2, 3\},\ \{4, 5, 6\} \text{ or } \{7, 8, 9\}.$$
Therefore, there are 9 elements in $C_{[3,3,3],K}(2)$. Let $\eta_{[3,3,3]} = \{\{1,2\},\{3,10\}\}$. Then $C_{[3,3,3]} = C_{[6,4]} \cup (\eta_{[3,3,3]} + C_{[6,4]})$. By Theorem 2.19, there is a holomorphic VOA
$$V(D_{[3,3,3]}) = \bigoplus_{\beta \in D_{[3,3,3]}} U^\beta_{[3,3,3]},$$
such that $U^\beta_{[3,3,3]} = U^\beta_{[6,4]} \oplus M_{\eta_{[3,3,3]} + C_{[6,4]}} \boxtimes U^\beta_{[6,4]}$. In particular, $\dim (U^\beta_{[3,3,3]})_1 = 4$ for any $\beta \in D_{[3,3,3]}(16)$.



Lemma 6.44. The dimension of $V(D_{[3,3,3]})_1$ is 48.

Proof. Since $\dim (U^\beta_{[3,3,3]})_1 = 4$ for $\beta \in D_{[3,3,3]}(16)$, we have
$$\dim V(D_{[3,3,3]})_1 = 9 \times 4 + \binom{3}{2} + 9 = 48$$
as desired.

Lemma 6.45. Let $\mathfrak{g}^0_{[3,3,3]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[3,3,3]}(2)\}$. Then $\mathfrak{g}^0_{[3,3,3]}$ is of type $A_{1,2}$.

Proof. Since $C'_{[3,3,3]} \cong E_3$, the subVOA generated by $\mathfrak{g}^0_{[3,3,3]}$ is isomorphic to $L_{A_1}(2, 0)$ and $\mathfrak{g}^0_{[3,3,3]} \cong sl_2(\mathbb{C})$ by Theorem A.2.

Let $X_1 = \{1, 2, 3\}$, $X_2 = \{4, 5, 6\}$ and $X_3 = \{7, 8, 9\}$. Denote $B_i = \{\gamma_{\{k,\ell\}} \mid \{k, \ell\} \subset X_i\}$ and $C_i = \{\{\{j,k\},\{\ell,10\}\} \mid \{j, k, \ell\} = X_i\}$ for $i = 1, 2, 3$.

Lemma 6.46. Let $\mathfrak{g}^i_{[3,3,3]} = \mathrm{span}\left(\{q_\alpha \mid \alpha \in C_i\} \cup \{u^{a,b}_\beta \mid \beta \in B_i\}\right)$ for $i = 1, 2, 3$. Then $\mathfrak{g}^i_{[3,3,3]} \cong sl_4(\mathbb{C})$.

Proof. First, we note that $\dim \mathfrak{g}^i_{[3,3,3]} = 3 \times 4 + 3 = 15$. It is also clear that $\mathfrak{h}^i = \mathrm{span}\{q_\alpha \mid \alpha \in C_i\}$ is a maximal abelian subalgebra of $\mathfrak{g}^i_{[3,3,3]}$. Thus, $\mathfrak{g}^i_{[3,3,3]}$ has Lie rank 3 and must be isomorphic to $sl_4(\mathbb{C})$.

Theorem 6.47. The Lie algebra structure of $V(D_{[3,3,3]})_1$ is of the type $A_{1,2} A_{3,4}^3$.

Proof. It follows directly from Lemmas 6.45, 6.46 and (5.1).

6.3. 6-dimensional subcodes. There are four 6-dimensional subcodes of $D^{ex}$ which are not contained in extended doublings.

Subcode: $D_{[6]}$. In this case, $|D_{[6]}(16)| = \binom{6}{2} = 15$ and $|K_{[6]}| = 39$, while the complement of $K_{[6]}$ has $\binom{4}{2} + 3 = 9$ elements. Thus, $C'_{[6]} \cong E_9$. Moreover, $C_{[6],K}(2)$ consists of
$$\{\{i,k\},\{i,\ell\}\}, \quad i = 1, \dots, 6,\ \{k, \ell\} \subset \{7, 8, 9, 10\}.$$
Therefore, $|C_{[6],K}(2)| = 6 \times \binom{4}{2} = 36$. Note that $D_{[6]} \subset D_{[6,2,2]}$. Set $\eta_{[6]} = \{\{1,7\},\{1,8\}\}$. Then $C_{[6]} = C_{[6,2,2]} \cup (\eta_{[6]} + C_{[6,2,2]})$. By Theorem 2.19, there is a holomorphic VOA,
$$V(D_{[6]}) = \bigoplus_{\beta \in D_{[6]}} U^\beta_{[6]},$$
such that $U^\beta_{[6]} = U^\beta_{[6,2,2]} \oplus M_{\eta_{[6]} + C_{[6,2,2]}} \boxtimes U^\beta_{[6,2,2]}$. In particular, $\dim (U^\beta_{[6]})_1 = 8$ for each $\beta \in D_{[6]}(16)$.


Lemma 6.48. The dimension of $V(D_{[6]})_1$ is 192.

Proof. Since the dimension of the top space of $U^\beta_{[6]}$, $\beta \in D_{[6]}(16)$, is 8, we have
$$\dim (V(D_{[6]})_1) = 8 \times |D_{[6]}(16)| + |E_9(2)| + |C_{[6],K}(2)| = 8 \times 15 + \binom{9}{2} + 36 = 192$$
as desired.

Lemma 6.49. Let $\mathfrak{g}^0_{[6]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[6]}(2)\}$. Then $\mathfrak{g}^0_{[6]} \cong o_9(\mathbb{C})$ and the subVOA generated by $\mathfrak{g}^0_{[6]}$ is isomorphic to $L_{B_4}(1, 0)$.

Proof. Since $C'_{[6]} \cong E_9$, the subVOA generated by $\mathfrak{g}^0_{[6]}$ is isomorphic to $L_{B_4}(1, 0)$ by Theorem A.2 and thus $\mathfrak{g}^0_{[6]} \cong o_9(\mathbb{C})$.

Lemma 6.50. Let $\mathcal{K}_{[6]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_{[6],K}(2)\}$. Then $\mathcal{K}_{[6]} \cong sl_2(\mathbb{C})^{12}$.

Proof. For $i = 1, \dots, 6$, let
$$\mathcal{K}^i_{[6]} = \mathrm{span}\{q_\alpha \mid \alpha = \{\{i,k\},\{i,\ell\}\},\ \{k, \ell\} \subset \{7, 8, 9, 10\}\}.$$
Then $\mathcal{K}^i_{[6]}$ is clearly closed under the 0th product and $\dim \mathcal{K}^i_{[6]} = 6$. Thus, $\mathcal{K}^i_{[6]} \cong sl_2(\mathbb{C})^2$ and we have the desired conclusion.

Lemma 6.51. Let $U = \bigoplus_{\beta \in D_{[6]}(16)} (U^\beta_{[6]})_1$ and let
$$\mathfrak{g}^1_{[6]} = \mathrm{span}\left(\{q_\alpha \mid \alpha \in C_{[6],K}(2)\} \cup U\right).$$
Then $\dim \mathfrak{g}^1_{[6]} = 156$ and $\mathfrak{g}^1_{[6]}$ is of type $C_{6,1}^2$.

Proof. First, we calculate the dimension of $\mathfrak{g}^1_{[6]}$. Since $\dim (U^\beta_{[6]})_1 = 8$ for each $\beta \in D_{[6]}(16)$, we have
$$\dim \mathfrak{g}^1_{[6]} = 8 \times \binom{6}{2} + 36 = 156.$$
It is also clear that $\mathfrak{g}^1_{[6]}$ contains $sl_2(\mathbb{C})^{12}$ as a subalgebra and has Lie rank 12. Thus, $\mathfrak{g}^1_{[6]}$ is of type $B_6^2$, $C_6^2$ or $B_6 C_6$. Now let $h^\vee$ be the dual Coxeter number of a simple component and let $k$ be the level of the corresponding affine Lie algebra. Then
$$\frac{h^\vee}{k} = \frac{192 - 24}{24} = 7.$$
Since the dual Coxeter numbers of $B_6$ and $C_6$ are 11 and 7, respectively, we must have $h^\vee = 7$ and $k = 1$. Thus, $\mathfrak{g}^1_{[6]}$ is of type $C_{6,1}^2$ as desired.

Theorem 6.52. The Lie algebra structure of $V(D_{[6]})_1$ is of the type $B_{4,1} C_{6,1}^2$.



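Subcode: $D_{[5,2]}$. In this case, $|D_{[5,2]}(16)| = \binom{5}{2} + 1 = 11$ and $|K_{[5,2]}| = 41$, while the complement of $K_{[5,2]}$ has $3 + 1 + 3 = 7$ elements. Thus, $C'_{[5,2]} \cong E_7$. Moreover, $C_{[5,2],K}(2)$ consists of
$$\{\{i,k\},\{i,\ell\}\}, \quad i = 1, \dots, 7,\ \{k, \ell\} \subset \{8, 9, 10\},$$
$$\{\{i,6\},\{i,7\}\}, \quad i \neq 6, 7,$$
$$\{\{6,k\},\{7,\ell\}\}, \quad \{k, \ell\} \subset \{8, 9, 10\}.$$
Therefore, there are $35\ (= 21 + 8 + 6)$ elements in $C_{[5,2],K}(2)$. Set $\eta_{[5,2]} = \{\{1,6\},\{1,7\}\}$. Then we have $C_{[5,2]} = C_{[7]} \cup (\eta_{[5,2]} + C_{[7]})$. By Theorem 2.19, there is a holomorphic VOA,
$$V(D_{[5,2]}) = \bigoplus_{\beta \in D_{[5,2]}} U^\beta_{[5,2]},$$
such that $U^\beta_{[5,2]} = U^\beta_{[7]} \oplus M_{\eta_{[5,2]} + C_{[7]}} \boxtimes U^\beta_{[7]}$.

Lemma 6.53. The dimension of $V(D_{[5,2]})_1$ is 144.

Proof. Since $\dim (U^\beta_{[5,2]})_1 = 8$ for each $\beta \in D_{[5,2]}(16)$, we have
$$\dim V(D_{[5,2]})_1 = 11 \times 8 + \binom{7}{2} + 35 = 144$$
as desired.

Lemma 6.54. Let $\mathfrak{g}^0_{[5,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[5,2]}(2)\}$. Then $\mathfrak{g}^0_{[5,2]} \cong o_7(\mathbb{C})$ and the subVOA generated by $\mathfrak{g}^0_{[5,2]}$ is isomorphic to $L_{B_3}(1, 0)$.

Proof. Use Theorem A.2.

Lemma 6.55. Let $\mathcal{K}_{[5,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_{[5,2],K}(2)\}$. Then we have $\mathcal{K}_{[5,2]} \cong sl_2(\mathbb{C})^5 \oplus sl_4(\mathbb{C}) \oplus A^5$.

Proof. Let
$$\begin{aligned}
C_1 &= \{\{\{i,k\},\{i,\ell\}\} \mid i = 1, \dots, 5,\ \{k, \ell\} \subset \{8, 9, 10\}\}, \\
C_2 &= \{\{\{i,k\},\{i,\ell\}\} \mid i = 6, 7,\ \{k, \ell\} \subset \{8, 9, 10\}\} \cup \{\{\{i,6\},\{i,7\}\} \mid i = 8, 9, 10\} \\
&\qquad \cup \{\{\{6,k\},\{7,\ell\}\} \mid \{k, \ell\} \subset \{8, 9, 10\}\}, \\
C_3 &= \{\{\{i,6\},\{i,7\}\} \mid i = 1, \dots, 5\}.
\end{aligned}$$
It is clear that $\mathrm{span}\{q_\alpha \mid \alpha = \{\{i,k\},\{i,\ell\}\},\ \{k, \ell\} \subset \{8, 9, 10\}\}$ is a Lie subalgebra isomorphic to $sl_2(\mathbb{C})$ for each $i = 1, \dots, 5$. Thus, $\mathcal{K}^1_{[5,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_1\} \cong sl_2(\mathbb{C})^5$. Moreover, $\mathcal{K}^3_{[5,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_3\}$ is abelian since $\{\{i,6\},\{i,7\}\} \cap \{\{j,6\},\{j,7\}\} = \emptyset$ if $i \neq j$. Thus, $\mathcal{K}^3_{[5,2]} \cong A^5$. Let $\mathcal{K}^2_{[5,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_2\}$. Then $\mathcal{K}^2_{[5,2]}$ has dimension 15 and Lie rank 3. Thus, it must be isomorphic to $sl_4(\mathbb{C})$.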


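Let $B^1 = \{\gamma_{\{i,j\}} \mid \{i, j\} \subset \{1, 2, 3, 4, 5\}\}$ and denote $U^1 = \bigoplus_{\beta \in B^1} (U^\beta_{[5,2]})_1$. We also denote $U^2 = (U^{\gamma_{\{6,7\}}}_{[5,2]})_1$.

Lemma 6.56. Let $\mathfrak{g}^i_{[5,2]}$, $i = 1, 2$, be the Lie subalgebra generated by $U^i$. Then we have $\mathfrak{g}^1_{[5,2]} \cong sl_{10}(\mathbb{C})$ and $\mathfrak{g}^2_{[5,2]} \cong sl_5(\mathbb{C})$.

Proof. Since $C_{[5,2],\beta}(2) \neq \emptyset$ for any $\beta \in D_{[5,2]}(16)$, we have $\mathcal{K}_{[5,2]} \subset \mathfrak{g}^1_{[5,2]} \oplus \mathfrak{g}^2_{[5,2]}$ and
$$\mathrm{rank}(\mathfrak{g}^1_{[5,2]} \oplus \mathfrak{g}^2_{[5,2]}) = \mathrm{rank}(\mathcal{K}_{[5,2]}) = 13.$$
Notice that $\mathcal{K}^i_{[5,2]} \subset \mathfrak{g}^i_{[5,2]}$ for $i = 1, 2$. Let $k = \dim(\mathfrak{g}^1_{[5,2]} \cap \mathcal{K}^3_{[5,2]})$. Then $\mathfrak{g}^1_{[5,2]}$ has Lie rank $5 + k$ and $\mathfrak{g}^2_{[5,2]}$ has Lie rank $3 + (5 - k)$. Moreover,
$$\dim \mathfrak{g}^1_{[5,2]} = 8 \times 10 + 5 \times 3 + k = 95 + k, \quad \text{and} \quad \dim \mathfrak{g}^2_{[5,2]} = 8 + 15 + (5 - k) = 23 + (5 - k).$$
Notice that $1 \le k \le 4$ and $\mathfrak{g}^2_{[5,2]} \supset sl_4(\mathbb{C}) \oplus A^k$ as a Lie subalgebra. Thus, the only solution is $k = 4$ and $\mathfrak{g}^2_{[5,2]} \cong sl_5(\mathbb{C})$. Then $\mathfrak{g}^1_{[5,2]}$ has dimension 99 and Lie rank 9. Thus, $\mathfrak{g}^1_{[5,2]} \cong sl_{10}(\mathbb{C})$.

Theorem 6.57. The Lie algebra structure of $V(D_{[5,2]})_1$ is of the type $B_{3,1} A_{4,1} A_{9,2}$.

Proof. Since $\dim V(D_{[5,2]})_1 = 144$, we have $\frac{1}{24}(\dim V(D_{[5,2]})_1 - 24) = 5$. Thus, the level is equal to $1/5$ of the dual Coxeter number and we have the desired result.

Subcode: $D_{[4,3]}$. In this case, $|D_{[4,3]}(16)| = \binom{4}{2} + 3 = 9$ and $|K_{[4,3]}| = 42$, while the complement of $K_{[4,3]}$ has $3 + 3 = 6$ elements. Thus, $C'_{[4,3]} \cong E_6$. Moreover, $C_{[4,3],K}(2)$ consists of
$$\{\{i,k\},\{i,\ell\}\}, \quad i = 1, \dots, 7,\ \{k, \ell\} \subset \{8, 9, 10\},$$
$$\{\{i,j\},\{k,\ell\}\}, \quad \{i, j, k\} = \{5, 6, 7\} \text{ and } \ell \in \{8, 9, 10\},$$
$$\{\{i,j\},\{k,\ell\}\}, \quad \{i, j, k, \ell\} = \{1, 2, 3, 4\}.$$
Therefore, there are 33 elements in $C_{[4,3],K}(2)$. Let $\eta_{[4,3]} = \{\{1,2\},\{3,4\}\}$. Then $C_{[4,3]} = C_{[7]} \cup (\eta_{[4,3]} + C_{[7]})$. Again, by Theorem 2.19, there is a holomorphic VOA,
$$V(D_{[4,3]}) = \bigoplus_{\beta \in D_{[4,3]}} U^\beta_{[4,3]}$$
such that $U^\beta_{[4,3]} = U^\beta_{[7]} \oplus M_{\eta_{[4,3]} + C_{[7]}} \boxtimes U^\beta_{[7]}$. Moreover, $\dim (U^\beta_{[4,3]})_1 = 8$ for $\beta \in D_{[4,3]}(16)$.

Lemma 6.58. The Lie algebra $V(D_{[4,3]})_1$ has dimension 120.

Proof. Since $\dim (U^\beta_{[4,3]})_1 = 8$ for each $\beta \in D_{[4,3]}(16)$, we have
$$\dim (V(D_{[4,3]})_1) = 8 \times 9 + \binom{6}{2} + 33 = 120$$
as desired.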



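Let $B^1 = \{\gamma_{\{i,j\}} \mid \{i, j\} \subset \{1, 2, 3, 4\}\}$ and $B^2 = \{\gamma_{\{i,j\}} \mid \{i, j\} \subset \{5, 6, 7\}\}$ and let
$$\begin{aligned}
C_1 &= \{\{\{i,k\},\{i,\ell\}\} \mid i \in \{1, 2, 3, 4\},\ \{k, \ell\} \subset \{8, 9, 10\}\} \cup \{\{\{i,j\},\{k,\ell\}\} \mid \{i, j, k, \ell\} = \{1, 2, 3, 4\}\}, \\
C_2 &= \{\{\{i,k\},\{i,\ell\}\} \mid i \in \{5, 6, 7\},\ \{k, \ell\} \subset \{8, 9, 10\}\} \cup \{\{\{i,j\},\{k,\ell\}\} \mid \{i, j, k\} = \{5, 6, 7\} \text{ and } \ell \in \{8, 9, 10\}\}.
\end{aligned}$$
Denote $U^1 = \bigoplus_{\beta \in B^1} (U^\beta_{[4,3]})_1$ and $U^2 = \bigoplus_{\beta \in B^2} (U^\beta_{[4,3]})_1$.

Lemma 6.59. Let $\mathfrak{g}^0_{[4,3]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[4,3]}(2)\}$. Then $\mathfrak{g}^0_{[4,3]}$ generates a subVOA isomorphic to $L_{A_3}(1, 0)$.

Proof. Use Theorem A.1.

Lemma 6.60. Let $\mathfrak{g}^i_{[4,3]} = \mathrm{span}\left(\{q_\alpha \mid \alpha \in C_i\} \cup U^i\right)$ for $i = 1, 2$. Then $\mathfrak{g}^i_{[4,3]}$ are Lie subalgebras of $V(D_{[4,3]})_1$. Moreover, $\mathfrak{g}^1_{[4,3]}$ is of the type $A_{7,2}$ and $\mathfrak{g}^2_{[4,3]}$ is of the type $C_{3,1}^2$.

Proof. It is clear that $\mathfrak{g}^i_{[4,3]}$, $i = 1, 2$, are closed under the 0th product. Now note that $\dim \mathfrak{g}^1_{[4,3]} = 4 \times 3 + 3 + 6 \times 8 = 63$ and the subalgebra
$$\mathcal{K}^1_{[4,3]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_1\} \cong sl_2(\mathbb{C})^4 \oplus A^3.$$
Thus $\mathfrak{g}^1_{[4,3]}$ has Lie rank 7 and dimension 63. The only possibility is $\mathfrak{g}^1_{[4,3]} \cong sl_8(\mathbb{C})$. For $\mathfrak{g}^2_{[4,3]}$, the subalgebra
$$\mathcal{K}^2_{[4,3]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_2\} \cong sl_2(\mathbb{C})^6.$$
Moreover, $\dim \mathfrak{g}^2_{[4,3]} = 3 \times 3 + 3 \times 3 + 3 \times 8 = 42$. Thus, $\mathfrak{g}^2_{[4,3]}$ must be of the type $C_3^2$, $B_3^2$ or $B_3 C_3$. Let $h^\vee$ be the dual Coxeter number of a simple component of $\mathfrak{g}^2_{[4,3]}$ and let $k$ be the level of the corresponding affine Lie algebra. Then
$$\frac{h^\vee}{k} = \frac{120 - 24}{24} = 4,$$
and hence $h^\vee = 4$ and $k = 1$. That means $\mathfrak{g}^2_{[4,3]}$ is of the type $C_{3,1}^2$. Similarly, we can show that the affine algebra of $\mathfrak{g}^1_{[4,3]}$ has level 2.

Theorem 6.61. The Lie algebra structure of $V(D_{[4,3]})_1$ is of the type $A_{3,1} A_{7,2} C_{3,1}^2$.

Subcode: $D_{[3,3,2]}$. In this case, $|D_{[3,3,2]}(16)| = 3 + 3 + 1 = 7$ and $|K_{[3,3,2]}| = 43$, while the complement of $K_{[3,3,2]}$ has $1 + 1 + 3 = 5$ elements. Thus, $C'_{[3,3,2]} \cong E_5$. Moreover, the elements of $C_{[3,3,2],K}(2)$ are given by
$$\{\{i,9\},\{i,10\}\}, \quad i = 1, \dots, 8,$$
$$\{\{i,7\},\{i,8\}\}, \quad i \neq 7, 8,$$
$$\{\{i,j\},\{k,\ell\}\}, \quad \{i, j, k\} = \{1, 2, 3\} \text{ or } \{4, 5, 6\},\ \ell \in \{9, 10\},$$
$$\{\{7,k\},\{8,\ell\}\}, \quad \{k, \ell\} = \{9, 10\}.$$
Therefore, there are $30\ (= 8 + 8 + 12 + 2)$ elements in $C_{[3,3,2],K}(2)$.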


Let $\eta\ (= \eta_{[8]}) = \{\{1,9\},\{1,10\}\}$. Then $C_{[3,3,2]} = C_{[3,3,3]} \cup (\eta + C_{[3,3,3]})$. By Theorem 2.19, there is a holomorphic VOA,
$$V(D_{[3,3,2]}) = \bigoplus_{\beta \in D_{[3,3,2]}} U^\beta_{[3,3,2]},$$
such that $U^\beta_{[3,3,2]} = U^\beta_{[3,3,3]} \oplus M_{\eta + C_{[3,3,3]}} \boxtimes U^\beta_{[3,3,3]}$. Moreover, $\dim (U^\beta_{[3,3,2]})_1 = 8$ for $\beta \in D_{[3,3,2]}(16)$.

Lemma 6.62. The Lie algebra $V(D_{[3,3,2]})_1$ has dimension 96.

Proof. Since $\dim (U^\beta_{[3,3,2]})_1 = 8$ for $\beta \in D_{[3,3,2]}(16)$, we have
$$\dim (V(D_{[3,3,2]})_1) = 8 \times 7 + \binom{5}{2} + 30 = 96$$
as required.

Lemma 6.63. Let $V(D_{[3,3,2]})^0 = \bigoplus_{\beta \in D_{[3,3,2]}} U^\beta_{[3,3,3]}$. Then we have $(V(D_{[3,3,2]})^0)_1 \cong sl_4(\mathbb{C})^2 \oplus sl_2(\mathbb{C})^2$.

Proof. It follows directly from Lemma 6.46.

Lemma 6.64. Let $\mathfrak{g}^0_{[3,3,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[3,3,2]}(2)\}$. Then $\mathfrak{g}^0_{[3,3,2]}$ is a Lie algebra of type $C_{2,1}$.

Proof. Use Theorem A.2 and $C'_{[3,3,2]} \cong E_5$.

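Lemma 6.65. Let $\mathcal{K}_{[3,3,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_{[3,3,2],K}(2)\}$. Then $\mathcal{K}_{[3,3,2]} \cong sl_2(\mathbb{C})^8 \oplus A^6$.

Proof. Let
$$\begin{aligned}
C_1 &= \{\{\{i,9\},\{i,10\}\} \mid i = 1, 2, 3\} \cup \{\{\{i,j\},\{k,\ell\}\} \mid \{i, j, k\} = \{1, 2, 3\},\ \ell \in \{9, 10\}\}, \\
C_2 &= \{\{\{i,9\},\{i,10\}\} \mid i = 4, 5, 6\} \cup \{\{\{i,j\},\{k,\ell\}\} \mid \{i, j, k\} = \{4, 5, 6\},\ \ell \in \{9, 10\}\}, \\
C_3 &= \{\{\{i,9\},\{i,10\}\} \mid i = 7, 8\} \cup \{\{\{7,k\},\{8,\ell\}\} \mid \{k, \ell\} = \{9, 10\}\} \cup \{\{\{i,7\},\{i,8\}\} \mid i = 9, 10\}, \\
C_4 &= \{\{\{i,7\},\{i,8\}\} \mid i = 1, \dots, 6\}.
\end{aligned}$$
Denote $\mathcal{K}^i_{[3,3,2]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_i\}$ for $i = 1, \dots, 4$. Then
$$\dim \mathcal{K}^1_{[3,3,2]} = \dim \mathcal{K}^2_{[3,3,2]} = 9 \quad \text{and} \quad \dim \mathcal{K}^3_{[3,3,2]} = \dim \mathcal{K}^4_{[3,3,2]} = 6.$$
For each $i \in \{1, 2, 3\}$, let $\{j, k\} = \{1, 2, 3\} - \{i\}$. Denote
$$S_i = \mathrm{span}\{q_\alpha \mid \alpha = \{\{i,9\},\{i,10\}\},\ \{\{i,9\},\{j,k\}\} \text{ or } \{\{i,10\},\{j,k\}\}\}.$$
Then it is easy to verify that $S_i \cong sl_2(\mathbb{C})$ and $\mathcal{K}^1_{[3,3,2]} = \bigoplus_{i=1}^{3} S_i$. Thus $\mathcal{K}^1_{[3,3,2]} \cong sl_2(\mathbb{C})^3$ and similarly, we also have $\mathcal{K}^2_{[3,3,2]} \cong sl_2(\mathbb{C})^3$. For $\mathcal{K}^3_{[3,3,2]}$, it is easy to show that $\mathcal{K}^3_{[3,3,2]}$ has Lie rank 2. Thus, $\mathcal{K}^3_{[3,3,2]} \cong sl_2(\mathbb{C})^2$ as $\dim \mathcal{K}^3_{[3,3,2]} = 6$. Finally, we note that $\{\{i,7\},\{i,8\}\} \cap \alpha = \emptyset$ for any $\alpha \in C_{[3,3,2],K}(2) \setminus \{\{\{i,7\},\{i,8\}\}\}$. Thus, $\mathcal{K}^4_{[3,3,2]}$ is abelian and is isomorphic to $A^6$.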



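Let $B^1 = \{\gamma_{\{i,j\}} \mid \{i, j\} \subset \{1, 2, 3\}\}$, $B^2 = \{\gamma_{\{i,j\}} \mid \{i, j\} \subset \{4, 5, 6\}\}$, and $B^3 = \{\gamma_{\{7,8\}}\}$.

Lemma 6.66. Let $U^i = \bigoplus_{\beta \in B^i} (U^\beta_{[3,3,2]})_1$ and let $\mathfrak{g}^i_{[3,3,2]}$ be the Lie subalgebra generated by $U^i$ for $i = 1, 2, 3$. Then $\mathfrak{g}^1_{[3,3,2]} \cong \mathfrak{g}^2_{[3,3,2]} \cong sl_6(\mathbb{C})$ and $\mathfrak{g}^3_{[3,3,2]} \cong sl_3(\mathbb{C})^2$.

Proof. By symmetry, it is clear that $\mathfrak{g}^1_{[3,3,2]} \cong \mathfrak{g}^2_{[3,3,2]}$. Next we note that $\mathcal{K}^i_{[3,3,2]} \subset \mathfrak{g}^i_{[3,3,2]}$ for each $i = 1, 2, 3$. Let $k = \dim(\mathfrak{g}^1_{[3,3,2]} \cap \mathcal{K}^4_{[3,3,2]})$. Then we have $\mathrm{rank}(\mathfrak{g}^1_{[3,3,2]}) = \mathrm{rank}(\mathfrak{g}^2_{[3,3,2]}) = 3 + k$ and $\mathrm{rank}(\mathfrak{g}^3_{[3,3,2]}) = 2 + (6 - 2k)$. Moreover,
$$\dim(\mathfrak{g}^1_{[3,3,2]}) = \dim(\mathfrak{g}^2_{[3,3,2]}) = 3 \times 8 + 9 + k = 33 + k, \quad \dim(\mathfrak{g}^3_{[3,3,2]}) = 8 + 6 + (6 - 2k).$$
Notice that $k > 0$ and $(6 - 2k) > 0$. Thus, $k = 1$ or $2$ and we have $\mathrm{rank}(\mathfrak{g}^1_{[3,3,2]}) = 4$, $\dim(\mathfrak{g}^1_{[3,3,2]}) = 34$ or $\mathrm{rank}(\mathfrak{g}^1_{[3,3,2]}) = 5$, $\dim(\mathfrak{g}^1_{[3,3,2]}) = 35$. Thus, $\mathfrak{g}^1_{[3,3,2]}$ is of type $B_4$, $C_4$ or $A_5$. However, by Lemma 6.63, we also know that $\mathfrak{g}^1_{[3,3,2]}$ contains a subalgebra isomorphic to $sl_4(\mathbb{C})$. Therefore, $\mathfrak{g}^1_{[3,3,2]}$ is of type $A_5$ and isomorphic to $sl_6(\mathbb{C})$. In this case, $\mathfrak{g}^3_{[3,3,2]}$ has rank 4 and dimension 16. Thus, $\mathfrak{g}^3_{[3,3,2]}$ is of type $A_2^2$ and isomorphic to $sl_3(\mathbb{C})^2$.

Theorem 6.67. The Lie algebra $V(D_{[3,3,2]})_1$ is of the type $A_{2,1}^2 A_{5,2}^2 C_{2,1}$.

Proof. It follows directly from Lemma 6.66, Theorem A.2 and (5.1).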

6.4. 5-dimensional subcodes. There are only two 5-dimensional subcodes which are not contained in extended doublings.

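Subcode: $D_{[5]}$. In this case, $|D_{[5]}(16)| = \binom{5}{2} = 10$ and $|K_{[5]}| = 35$, while the complement of $K_{[5]}$ has $\binom{5}{2} + 3 = 13$ elements. Thus, $C'_{[5]} \cong E_{13}$. Moreover, the elements of $C_{[5],K}(2)$ are given by
$$\{\{i,k\},\{i,\ell\}\}, \quad i = 1, \dots, 5,\ \{k, \ell\} \subset \{6, 7, 8, 9, 10\}.$$
Therefore, there are $50\ (= 5 \times \binom{5}{2})$ elements in $C_{[5],K}(2)$. Let $\eta_{[5]} = \{\{1,6\},\{1,7\}\}$. Then $C_{[5]} = C_{[6]} \cup (\eta_{[5]} + C_{[6]})$. By Theorem 2.19, there is a holomorphic VOA,
$$V(D_{[5]}) = \bigoplus_{\beta \in D_{[5]}} U^\beta_{[5]},$$
such that $U^\beta_{[5]} = U^\beta_{[6]} \oplus M_{\eta_{[5]} + C_{[6]}} \boxtimes U^\beta_{[6]}$. Moreover, $\dim (U^\beta_{[5]})_1 = 16$ for $\beta \in D_{[5]}(16)$.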


Lemma 6.68. The Lie algebra $V(D_{[5]})_1$ has dimension 288.

Proof. Since $\dim (U^\beta_{[5]})_1 = 16$ for $\beta \in D_{[5]}(16)$, we have
$$\dim (V(D_{[5]})_1) = 10 \times 16 + \binom{13}{2} + 50 = 288$$
as required.

Lemma 6.69. Let $\mathfrak{g}^0_{[5]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[5]}(2)\}$. Then $\mathfrak{g}^0_{[5]}$ generates a subVOA isomorphic to $L_{B_6}(1, 0)$.

Proof. Since $C'_{[5]} \cong E_{13}$, we have the desired conclusion by Theorem A.2.

Lemma 6.70. Let $\mathcal{K}^i_{[5]} = \mathrm{span}\{q_\alpha \mid \alpha = \{\{i,k\},\{i,\ell\}\},\ \{k, \ell\} \subset \{6, 7, 8, 9, 10\}\}$ for $i = 1, \dots, 5$. Then $\mathcal{K}^i_{[5]} \cong sp_4(\mathbb{C})$.

Proof. It is clear that $\mathcal{K}^i_{[5]}$ is closed under the 0th product. Moreover, we have $\dim \mathcal{K}^i_{[5]} = \binom{5}{2} = 10$ and $\mathrm{span}\{q_{\{\{i,6\},\{i,7\}\}}, q_{\{\{i,8\},\{i,9\}\}}\}$ is a maximal abelian subalgebra of $\mathcal{K}^i_{[5]}$. Therefore, $\mathcal{K}^i_{[5]}$ has Lie rank 2 and dimension 10 and hence must be of type $C_2$, i.e., $\mathcal{K}^i_{[5]} \cong sp_4(\mathbb{C})$. Note that Lie algebras of type $B_2$ and $C_2$ are isomorphic.

Lemma 6.71. Let $U = \bigoplus_{\beta \in D_{[5]}(16)} (U^\beta_{[5]})_1$ and let $\mathfrak{g}^1_{[5]} = \mathrm{span}\left(\{q_\alpha \mid \alpha \in C_{[5],K}(2)\} \cup U\right)$. Then $\mathfrak{g}^1_{[5]}$ generates a subVOA isomorphic to $L_{C_{10}}(1, 0)$.

Proof. By Lemma 6.70, $\mathfrak{g}^1_{[5]}$ contains $sp_4(\mathbb{C})^5$ as a subalgebra and has Lie rank 10. Moreover, $\dim \mathfrak{g}^1_{[5]} = 16 \times 10 + 50 = 210$. Therefore, $\mathfrak{g}^1_{[5]}$ is of type $B_{10}$ or $C_{10}$. Let $h^\vee$ be the dual Coxeter number of $\mathfrak{g}^1_{[5]}$ and let $k$ be the level of the corresponding affine Lie algebra. Then
$$\frac{h^\vee}{k} = \frac{288 - 24}{24} = 11.$$
Therefore, $h^\vee$ is divisible by 11 and thus $h^\vee = 11$ and $k = 1$. That implies $\mathfrak{g}^1_{[5]}$ generates a subVOA isomorphic to $L_{C_{10}}(1, 0)$.

Theorem 6.72. The Lie algebra $V(D_{[5]})_1$ is of the type $B_{6,1} C_{10,1}$.

Subcode: $D_{[3,3]}$. In this case, $|D_{[3,3]}(16)| = 3 + 3 = 6$ and $|K_{[3,3]}| = 39$, while the complement of $K_{[3,3]}$ has $\binom{4}{2} + 3 = 9$ elements. Thus, $C'_{[3,3]} \cong E_9$. Moreover, $C_{[3,3],K}(2)$ consists of
$$\{\{i,k\},\{i,\ell\}\}, \quad i = 1, \dots, 6,\ \{k, \ell\} \subset \{7, 8, 9, 10\},$$
$$\{\{i,j\},\{k,\ell\}\}, \quad \{i, j, k\} = \{1, 2, 3\} \text{ or } \{4, 5, 6\},\ \ell \in \{7, 8, 9, 10\}.$$
Therefore, there are $60\ (= 6 \times \binom{4}{2} + 2 \times 3 \times 4)$ elements in $C_{[3,3],K}(2)$. Let $\eta_{[3,3]} = \{\{1,7\},\{1,9\}\}$. Then $C_{[3,3]} = C_{[3,3,2]} \cup (\eta_{[3,3]} + C_{[3,3,2]})$. By Theorem 2.19, there is a holomorphic VOA,
$$V(D_{[3,3]}) = \bigoplus_{\beta \in D_{[3,3]}} U^\beta_{[3,3]},$$
such that $U^\beta_{[3,3]} = U^\beta_{[3,3,2]} \oplus M_{\eta_{[3,3]} + C_{[3,3,2]}} \boxtimes U^\beta_{[3,3,2]}$. In this case, $\dim (U^\beta_{[3,3]})_1 = 16$ for $\beta \in D_{[3,3]}(16)$.



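Lemma 6.73. The Lie algebra $V(D_{[3,3]})_1$ has dimension 192.

Proof. Since $\dim (U^\beta_{[3,3]})_1 = 16$ for $\beta \in D_{[3,3]}(16)$, we have
$$\dim (V(D_{[3,3]})_1) = 6 \times 16 + \binom{9}{2} + 60 = 192$$
as desired.

Lemma 6.74. Let $\mathfrak{g}^0_{[3,3]} = \mathrm{span}\{q_\alpha \mid \alpha \in C'_{[3,3]}(2)\}$. Then $\mathfrak{g}^0_{[3,3]}$ generates a subVOA isomorphic to $L_{B_4}(1, 0)$.

Lemma 6.75. Let $\mathcal{K}_{[3,3]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_{[3,3],K}(2)\}$. Then $\mathcal{K}_{[3,3]} \cong sp_4(\mathbb{C})^6$, i.e., of type $C_2^6$.

Proof. For a fixed $i = 1, \dots, 6$, choose $j, k \in \{1, \dots, 6\}$ such that $\{i, j, k\} = \{1, 2, 3\}$ or $\{4, 5, 6\}$. Let
$$C_i = \{\{\{i,k\},\{i,\ell\}\} \mid k \neq \ell,\ k, \ell \in \{7, 8, 9, 10\}\} \cup \{\{\{j,k\},\{i,\ell\}\} \mid \ell \in \{7, 8, 9, 10\}\}$$
and let $\mathcal{K}^i_{[3,3]} = \mathrm{span}\{q_\alpha \mid \alpha \in C_i\}$. Since $C_i \cap C_j = \emptyset$ for $i \neq j$, we have $\mathcal{K}_{[3,3]} = \bigoplus_{i=1}^{6} \mathcal{K}^i_{[3,3]}$. Note that $\dim \mathcal{K}^i_{[3,3]} = 10$ and $\mathrm{rank}(\mathcal{K}^i_{[3,3]}) = 2$. Thus $\mathcal{K}^i_{[3,3]}$ is of type $C_2$ and thus $\mathcal{K}_{[3,3]} \cong sp_4(\mathbb{C})^6$ and is of type $C_2^6$.

Let $X_1 = \{1, 2, 3\}$ and $X_2 = \{4, 5, 6\}$ and let $B^i = \{\gamma_{\{j,k\}} \mid \{j, k\} \subset X_i\}$ for $i = 1, 2$.

Lemma 6.76. Let $U^i = \bigoplus_{\beta \in B^i} (U^\beta_{[3,3]})_1$ and $\mathfrak{g}^i_{[3,3]} = \mathrm{span}\left(\{q_\alpha \mid \alpha \in C_i\} \cup U^i\right)$ for $i = 1, 2$. Then $\mathfrak{g}^1_{[3,3]}$ and $\mathfrak{g}^2_{[3,3]}$ are of the type $C_{6,1}$.

Proof. Since $\mathrm{rank}(\mathfrak{g}^i_{[3,3]}) = 6$ and $\dim(\mathfrak{g}^i_{[3,3]}) = 78$, $\mathfrak{g}^i_{[3,3]}$ must be of type $C_6$ or $B_6$. Let $h_i^\vee$ be the dual Coxeter number of $\mathfrak{g}^i_{[3,3]}$ and let $k_i$ be the level of the corresponding affine Lie algebra. Then
$$\frac{h_i^\vee}{k_i} = \frac{192 - 24}{24} = 7,$$
and thus $h_i^\vee$ is divisible by 7. Therefore, we have $h_i^\vee = 7$ and $k_i = 1$. That means $\mathfrak{g}^i_{[3,3]}$ is of type $C_{6,1}$ for $i = 1, 2$.

Theorem 6.77. The Lie algebra $V(D_{[3,3]})_1$ is of the type $B_{4,1} C_{6,1}^2$.

Finally, we remark that our calculations on the Lie algebra structure of $V(D_{[\lambda_1, \dots, \lambda_m]})_1$ used only the information about
(1) $\dim (U^\beta_{[\lambda_1, \dots, \lambda_m]})_1$ for $\beta \in D_{[\lambda_1, \dots, \lambda_m]}(16)$;
(2) $C_{[\lambda_1, \dots, \lambda_m]}(2)$ ($C_{[\lambda_1, \dots, \lambda_m],K}(2)$, $C'_{[\lambda_1, \dots, \lambda_m]}(2)$ and $C_{[\lambda_1, \dots, \lambda_m],\beta}(2)$, etc.).

Thanks to Lemma 4.13, we have $\dim (U^\beta_{[\lambda_1, \dots, \lambda_m]})_1 = [D^{ex} : D_{[\lambda_1, \dots, \lambda_m]}]$ for any $\beta \in D_{[\lambda_1, \dots, \lambda_m]}(16)$. Thus all the necessary data are uniquely determined by $D_{[\lambda_1, \dots, \lambda_m]}$ and we have the theorem.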
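Since the dimension counts in this section all follow the same pattern, they can be tabulated and rechecked together. The sketch below simply collects, for each subcode, the four numbers quoted in the corresponding proofs and recomputes $\dim V(D)_1 = |D(16)| \cdot \dim (U^\beta)_1 + \binom{n}{2} + |C_K(2)|$, where $n$ is the length of the even code $E_n$ that appears for that subcode. The table layout and the name `weight_one_dimension` are illustrative choices, not notation from the paper.

```python
from math import comb

# label: (|D(16)|, dim (U^beta)_1, n with E_n, |C_K(2)|, dimension stated in the text)
SUBCODE_DATA = {
    "[6,2,2]": (17, 4, 5, 18, 96),
    "[5,3]": (13, 4, 4, 14, 72),
    "[4,4,2]": (13, 4, 4, 14, 72),
    "[3,3,3]": (9, 4, 3, 9, 48),
    "[6]": (15, 8, 9, 36, 192),
    "[5,2]": (11, 8, 7, 35, 144),
    "[4,3]": (9, 8, 6, 33, 120),
    "[3,3,2]": (7, 8, 5, 30, 96),
    "[5]": (10, 16, 13, 50, 288),
    "[3,3]": (6, 16, 9, 60, 192),
}


def weight_one_dimension(n_weight16, top_dim, n, n_ck):
    """Recompute dim V(D)_1 from the three contributions used in the proofs."""
    return n_weight16 * top_dim + comb(n, 2) + n_ck


computed = {
    label: weight_one_dimension(*row[:4]) for label, row in SUBCODE_DATA.items()
}
for label, value in computed.items():
    print(label, value, value == SUBCODE_DATA[label][4])
```

Every recomputed value agrees with the dimension stated in the corresponding lemma.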


Theorem 6.78. Let $D = D_{[\lambda_1, \dots, \lambda_m]}$ be a triply even code listed in Table 2. Let $V$ and $V'$ be holomorphic framed VOAs with the structure codes $(D^\perp, D)$. Then $V_1 \cong V'_1$ as a Lie algebra.

Remark 6.79. In general, there is more than one framed VOA (up to isomorphism) whose $1/16$-code is $D$. For example, most structure codes of the moonshine VOA can also be realized as the structure codes of the Leech lattice VOA (cf. [M3]). Nevertheless, because of Theorem 6.78, we believe that the VOA structure of $V(D)$ is uniquely determined by $D$ for all triply even codes $D_{[\lambda_1, \dots, \lambda_m]}$ considered in this article.

Acknowledgements. The author thanks K. Betsumiya, M. Harada and A. Munemasa for discussions on binary codes and M. Miyamoto, H. Shimakura and H. Yamauchi for stimulating discussions and their helpful comments. He also thanks the National Science Council (NSC 97-2115-M-006-015-MY3) and the National Center for Theoretical Sciences of Taiwan for financial support.

Appendix A. Code Vertex Operator Algebras Associated to $E_n$

In this Appendix, we shall identify the code VOA $M_{E_n}$ with certain affine VOAs. The results are well-known (see for example [M1], [GNOS] and [BT]).

Theorem A.1 (cf. [M1]). Let $n \ge 2$ be a positive integer. Then $M_{E_{2n}} \cong V_{D_n} \cong L_{D_n}(1, 0)$, where $D_n$ is the root lattice of type $D_n$. Here we identify $D_2$ with $A_1 \oplus A_1$ and $D_3$ with $A_3$.

Theorem A.2 (cf. [BT,GNOS]). For $n \ge 2$, we have $M_{E_{2n+1}} \cong L_{B_n}(1, 0)$ as VOA. Moreover, $M_{E_3} \cong L_{A_1}(2, 0)$.

Proof. Suppose $n \ge 2$. Let $\{e^1, \dots, e^{2n+1}, e^{2n+2}\}$ be the Ising frame of $M_{E_{2n+2}}$. Let $\sigma_i$ be the $\sigma$-involution associated to the Ising vector $e^i$, $i = 1, \dots, 2n + 2$. Now take $i = 2n + 2$. Then $\sigma_{2n+2}$ defines an involution on $M_{E_{2n+2}}$ and the fixed point subVOA satisfies
$$\left(M_{E_{2n+2}}\right)^{\sigma_{2n+2}} \cong M_{E_{2n+1}} \otimes L(\tfrac{1}{2}, 0).$$
Notice that $\sigma_{2n+2}$ also defines an automorphism on the Lie algebra $(M_{E_{2n+2}})_1$, which is of type $D_{n+1}$. Moreover, $\dim (M_{E_{2n+2}})_1^{\sigma_{2n+2}} = n(2n + 1)$. Since $E_{2n}$ can be embedded into $E_{2n+1}$, $(M_{E_{2n+2}})_1^{\sigma_{2n+2}}$ also contains a Lie subalgebra of type $D_n$. Thus, $\sigma_{2n+2}$ is an outer involution and $(M_{E_{2n+2}})_1^{\sigma_{2n+2}}$ must be isomorphic to $o_{2n+1}(\mathbb{C})$ and is of type $B_n$. Since $M_{E_{2n+1}}$ is generated by $(M_{E_{2n+2}})_1^{\sigma_{2n+2}}$ and the central charge of $M_{E_{2n+1}}$ is $\tfrac{1}{2}(2n + 1)$, we have $M_{E_{2n+1}} \cong L_{B_n}(1, 0)$. The proof for $n = 1$ is similar. Note that $\dim (M_{E_3})_1 = 3$ and $M_{E_3}$ has central charge $3/2$.


Appendix B. Type II Self-Dual Z4 Code Let C be the Z4 -code generated by ⎞ ⎛ 100000011010113310003132 ⎜0 1 0 0 0 0 0 0 0 1 1 1 1 1 3 0 1 1 1 0 0 2 1 3 ⎟ ⎜0 0 1 0 0 0 0 1 1 0 1 0 0 1 0 2 1 0 0 1 2 0 1 0 ⎟ ⎟ ⎜ ⎜0 0 0 1 0 0 1 0 0 1 1 0 0 0 2 1 1 0 1 0 2 3 0 0 ⎟ ⎟ ⎜ ⎜0 0 0 0 1 0 1 1 0 1 1 1 0 1 3 0 1 1 1 1 3 1 1 3 ⎟ ⎟ ⎜ ⎜0 0 0 0 0 1 1 0 1 0 0 1 1 1 0 3 0 1 0 1 3 2 1 1 ⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 2 0 0 0 0 0 0 0 2 0 0 0 0 0 0 2 0 2⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 2 0 0 0 0 0 0 2 0 0 0 0 0 2 0 2 0⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 0 2 0 0 0 0 0 2 2 0 0 0 0 2 2 2 0⎟ ⎜0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 2 2 0 2⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 2 2 2 0⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 2⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 0 0 0 0 0 2 0 2 2 0 0 0 0 0 2 0 0⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 2 0⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 2 2 0⎟ ⎟ ⎜ ⎜0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2⎟ ⎠ ⎝ 000000000000000000202202 000000000000000000022022 The following can be checked easily by computer. Proposition B.1. The Z4 -code C is a type II self-dual code over Z4 and d(Cr es ) ∼ = [4, 2, 2, 2]. Moreover, the symmetric weight enumerator of C is given by x 24 + z 24 + 6(x 22 z 2 + x 2 z 22 ) + 72(x 16 y 8 + y 8 z 16 ) + 342(x 20 z 4 + x 4 z 20 ) + 8640(x 14 y 8 z 2 + x 2 y 8 z 14 ) + 18432(x 8 y 16 + y 16 z 8 ) + 67584(x 11 y 12 z + x y 12 z 11 ) + 4110(x 18 z 6 + x 6 z 18 ) + 516096(x 6 y 16 z 2 +x 2 y 16 z 6 )+131040(x 12 y 8 z 4 +x 4 y 8 z 12 )+1239040(x 9 y 12 z 3 +x 3 y 12 z 9 )+ 262144y 24 + 23391(x 16 z 8 + x 8 z 16 ) + 576576(x 10 y 8 z 6 + x 6 y 8 z 10 ) + 4460544(x 7 y 12 z 5 + x 5 y 12 z 7 ) + 1290240x 4 y 16 z 4 + 60396(x 14 z 10 + x 10 z 14 ) + 926640x 8 y 8 z 8 + 85652x 12 z 12 . References [BM]

Betsumiya, K., Munemasa, A.: On triply even binary codes. http://arxiv.org/abs/1012.4134v1 [math.CO], 2010 [BT] Bernard, D., Thierry-Mieg, J.: Level one representations of the simple affine kac-moody algebras in their homogeneous gradations. Commun. Math. Phys. 111, 181–246 (1987) [CS] Conway, J.H., Sloane, N.J.A.: Self-dual codes over the integers modulo 4. J. Comb. Theory Ser. A 62, 30–45 (1993) [DGH] Dong, C., Griess, R.L., Höhn, G.: Framed vertex operator algebras, codes and the moonshine module. Commun. Math. Phys. 193, 407–448 (1998) [DGM] Dolan, L., Goddard, P., Montague, P.: Conformal field theories, representations and lattice constructions. Commun. Math. Phys. 179(1), 61–120 (1996) [DL] Dong, C., Lepowsky, J.: Generalized vertex algebras and relative vertex operators. Progress in Math. 112, Boston: Birkhäuser, 1993 [DLM1] Dong, C., Li, H., Mason, G.: Regularity of rational vertex operator algebra. Adv. Math. 312, 148– 166 (1997) [DLM3] Dong, C., Li, H., Mason, G.: Modular-invariance of trace functions in orbifold theory and generalized moonshine. Commun. Math. Phys. 214, 1–56 (2000) [DLM4] Dong, C., Li, H., Mason, G.: Simple currents and extensions of vertex operator algebras. Commun. Math. Phys. 180, 671–707 (1996) [DM1] Dong, C., Mason, G.: On quantum galois theory. Duke Math. J. 86, 305–321 (1997)

[DM2]

Dong, C., Mason, G.: Rational vertex operator algebras and the effective central charge. Internat. Math. Res. Notices 56, 2989–3008 (2004) [DM3] Dong, C., Mason, G.: Holomorphic vertex operator algebras of small central charge. Pacific J. Math. 213(2), 253–266 (2004) [DM4] Dong, C., Mason, G.: Integrability of C2 -cofinite vertex operator algebras. Int. Math. Res. Not. 2006, Art. ID 80468 (2006) [DMZ] Dong, C., Mason, G., Zhu, Y.: Discrete series of the virasoro algebra and the moonshine module. Proc. Symp. Pure. Math., American Math. Soc. 56(II), 295–316 (1994) [FHL] Frenkel, I., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Memoirs Amer. Math. Soc. 104 (1993) [FLM] Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. New York: Academic Press (1988) [FZ] Frenkel, I., Zhu, Y.: Vertex operator algebras associated to representations of affine and virasoro algebras. Duke Math. J. 66, 123–168 (1992) [GKO] Goddard, P., Kent, A., Olive, D.: Unitary representations of the virasoro and super-virasoro algebras. Commun. Math. Phys. 103, 105–119 (1986) [GNOS] Goddard, P., Nahm, W., Olive, D., Schwimmer, A.: Vertex operators for non- simply-laced algebras. Commun. Math. Phys. 107, 179–212 (1986) [H1] Huang, Y.-Z.: Virasoro vertex operator algebras, (non-meromorphic) operator product expansion and the tensor product theory. J. Algebra 182, 201–234 (1996) [KMR] Key, J.D., Moori, J., Rodrigues, B.G.: Permutation decoding for the binary codes from triangular graphs. European J. Combin. 25(1), 113–123 (2004) [L1] Lam, C.H.: Twisted representations of code vertex operator algebras. J. Algebra 217, 275–299 (1999) [L2] Lam, C.H.: Some twisted module for framed vertex operator algebras. J. Algebra 231, 331–341 (2000) [LY] Lam, C.H., Yamauchi, H.: On the structure of framed vertex operator algebras and their pointwise frame stabilizers. Commun. Math. Phys. 
277, 237–285 (2008) [M1] Miyamoto, M.: Griess algebras and conformal vectors in vertex operator algebras. J. Algebra 179, 528–548 (1996) [M2] Miyamoto, M.: Representation theory of code vertex operator algebras. J. Algebra 201, 115–150 (1998) [M3] Miyamoto, M.: New construction of the moonshine vertex operator algebra over the real number field. Ann. of Math 159, 535–596 (2004) [M4] Miyamoto, M.: A Z3 -orbifold theory of lattice vertex operator algebra and Z3 -orbifold constructions, http://arxiv.org/abs/1003.0237v1 [math.QA], 2010 [Mon] Montague, P.: Conjectured Z2 -orbifold constructions of self-dual conformal field theories at central charge 24—the neighborhood graph. Lett. Math. Phys. 44(2), 105–120 (1998) [Sch] Schellekens, A.N.: Meromorphic c = 24 conformal field theories. Commun. Math. Phys. 153(1), 159–185 (1993) [Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996) Communicated by Y. Kawahigashi




Energy Cascades and Flux Locality in Physical Scales of the 3D Navier-Stokes Equations

R. Dascaliuc, Z. Grujić

Department of Mathematics, University of Virginia, Charlottesville, VA 22904, USA. E-mail: [email protected]

Received: 8 July 2010 / Accepted: 19 October 2010
Published online: 16 March 2011 – © Springer-Verlag 2011

Abstract: Rigorous estimates for the total – (kinetic) energy plus pressure – flux in R3 are obtained from the three dimensional Navier-Stokes equations. The bounds are used to establish a condition – involving Taylor length scale and the size of the domain – sufficient for existence of the inertial range and the energy cascade in decaying turbulence (zero driving force, non-increasing global energy). Several manifestations of the locality of the flux under this condition are obtained. All the scales involved are actual physical scales in R3 and no regularity or homogeneity/scaling assumptions are made.

1. Introduction One of the main features of Kolmogorov’s empirical turbulence theory [10–12] is existence of energy cascade over a wide range of length scales, called the inertial range, where the dissipation effects are dominated by the transport of energy from higher to lower scales. Energy cascades have been observed in physical experiments, but theoretical justification of this phenomenon using equations of fluid motion, and in particular, the Navier-Stokes equations (NSE), remains far from being settled. The technical complexity of the NSE makes it difficult to establish conditions under which such cascades can occur. A particular problem is the possible lack of regularity of the solutions to the NSE, and thus choosing the right setting becomes crucial. (For an overview of various mathematical models of turbulence and the theory of the NSE, see, e.g., [6,7,9] and [4,14,18], respectively.) The first studies in this direction were made in [8], where infinite-time averages of the Leray-Hopf solutions in the Fourier setting were used to establish a sufficient condition for the energy cascade. This condition, involving Taylor length scale, provided an inspiration for the sufficient condition (4.13) obtained in Sect. 4. In contrast to [8], our goal was to work in physical space, dealing with actual length scales in R3 rather than the Fourier wave numbers.


In studying a PDE model, a natural way of introducing a concept of scale is to measure oscillations, i.e., (distributional) derivatives of a quantity with respect to the scale. Considering an $L^1_{loc}$ function $f$ on a ball of radius $2R$, $B(x_0, 2R)$, the physical scale $R$ is introduced via bounds on the distributional derivatives of $f$, where a test function $\psi$ is a refined – smooth, non-negative, equal to 1 on $B(x_0, R)$ and featuring optimal bounds on the derivatives over the outer $R$-layer – cut-off function on $B(x_0, 2R)$. (Uniformity in all scales dictates linearity of the length of the outer layer in $R$; hence $B(x_0, R + R)$.) More explicitly,

\[
|(D^\alpha f, \psi)| \le \int_{B(x_0,2R)} |f|\,|D^\alpha \psi| \le \Bigl( c(\alpha)\,\frac{1}{R^{|\alpha|}}\,|f|,\ \psi^{\rho(\alpha)} \Bigr) \tag{1.1}
\]

for some $c(\alpha) > 0$ and $\rho(\alpha)$ in $(0, 1)$. (An attempt to introduce a concept of scale via characteristic functions in place of smooth cut-off functions would lead to infinite concentration – delta functions – invalidating much of the desired calculus.) This approach has a similar flavor as introducing the Fourier scale $|\xi|$ via

\[
\widehat{D^\alpha f}(\xi) = i^{|\alpha|}\,\xi^\alpha \hat f(\xi)
\]

(in the Schwartz space, and then by duality in the space of tempered distributions).

Let $x_0$ be in $B(0, R_0)$ ($R_0$ being the integral scale, $B(0, 2R_0) \subset \Omega$, where $\Omega$ is the global spatial domain) and $0 < R \le R_0$. Define local – per unit of mass – kinetic energy, $e$, and enstrophy, $E$, at time $t$, associated with the ball $B(x_0, R)$ by

\[
e_{x_0,R}(t) = \int \frac{1}{2}|u|^2 \phi^{2\delta-1}\,dx, \qquad E_{x_0,R}(t) = \int |\nabla \otimes u|^2 \phi\,dx,
\]

where $\phi = \eta\psi$ and $\eta$ and $\psi$ are refined cut-off functions in time and space, respectively (for some $\frac{1}{2} < \delta < 1$). A total flux – (kinetic) energy plus pressure – through the boundary of a region $D$ is given by

\[
\int_{\partial D} \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot n\,ds = \int_{D} [(u \cdot \nabla)u + \nabla p] \cdot u\,dx,
\]

where $n$ is an outward normal. Considering the NSE localized to $B(x_0, 2R)$ – and utilizing $\nabla \cdot u = 0$ – leads to a localized flux,

\[
\Phi_{x_0,R}(t) = \int \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot \nabla\phi\,dx = -\int [(u \cdot \nabla)u + \nabla p] \cdot u\,\phi\,dx.
\]

Since $\psi$ can be constructed such that $\nabla\phi = \eta\nabla\psi$ is oriented along the radial directions of $B(x_0, 2R)$ toward the center of the ball, $\Phi_{x_0,R}$ represents the flux into $B(x_0, R)$ through the layer between the spheres $S(x_0, 2R)$ and $S(x_0, R)$ ($\nabla\phi \equiv 0$ on $B(x_0, R)$). A more dynamic physical significance of the sign of $\Phi_{x_0,R}$ can be seen from the equations: multiplying the NSE by $\psi u$ and integrating over $B(x_0, 2R)$ (formally, assuming smoothness) leads to

\[
\frac{d}{dt}\int \frac{1}{2}|u|^2 \psi\,dx = \Phi_{x_0,R} + \nu \int \Delta u \cdot u\,\psi\,dx.
\]


Plainly, the positivity of $\Phi_{x_0,R}$ contributes to the increase of the kinetic energy around the point $x_0$ at scale $R$. Since the flux consists of both the kinetic and the pressure parts, a natural question is whether there is a transfer of the kinetic energy from larger scales into $B(x_0, R)$, or perhaps the increase is mainly due to the change in pressure. In general, it is possible that the increase of the kinetic energy around $x_0$ is due solely to the pressure part; a simple example being $u = (c, c, c)\,t$, $p = c x_1 + c x_2 + c x_3$. However, in physical situations where the kinetic energy on the (global) spatial domain is non-increasing, e.g., a bounded domain with no-slip boundary conditions, or the whole space with either decay at infinity or periodic boundary conditions (here, we are concerned with the case of decaying turbulence, setting the driving force to zero), the increase of the kinetic energy in $B(x_0, R)$ – and consequently, the positivity of $\Phi_{x_0,R}$ – implies local transfer of the kinetic energy from larger scales simply because the local kinetic energy is increasing while the global kinetic energy is non-increasing, resulting in a decrease of the kinetic energy in the complement. This is also consistent with the fact that in the aforementioned scenarios one can project the NSE – in an appropriate functional space – to the subspace of divergence-free functions, effectively eliminating the pressure and revealing that the local flux $\Phi_{x_0,R}$ is indeed driven by transport/inertial effects rather than the change in the pressure. (The $u = (c, c, c)\,t$, $p = c x_1 + c x_2 + c x_3$ example pertains to a completely opposite situation; the kinetic energy is simply uniformly growing over the whole spatial domain.)
Henceforth, following the discussion in the preceding paragraphs – in the setting of decaying turbulence (zero driving force, non-increasing global energy) – the positivity and the negativity of $\Phi_{x_0,R}$ will be interpreted as transfer of (kinetic) energy around the point $x_0$ at scale $R$ toward smaller scales and transfer of (kinetic) energy around the point $x_0$ at scale $R$ toward larger scales, respectively. We also consider finite time averages for each of the aforementioned quantities. Completely analogous definitions hold for shells of radii $R$ and $2R$.

Our goal is to obtain a manifestation of the (kinetic) energy cascade in physical space, i.e., formulate a condition on $B(0, R_0)$ that would imply that the time-averaged energy transfers/cascades to smaller scales across a range of scales (the existence of the inertial range). A key point here is that we do not assume any homogeneity of the flow; hence one cannot expect to show that the local fluxes are positive for each individual ball $B(x, R)$. The best one can hope for is to prove the positivity of the flux over some spatial average. We choose to work with a very straightforward spatial average: the arithmetic mean of the local fluxes – time-averaged, per unit mass – computed over a family of coverings of $B(0, R_0)$, the so-called optimal coverings.

Let $K_1$ and $K_2$ be two positive integers. A covering $\{B(x_i, R)\}_{i=1}^{n}$ of $B(0, R_0)$ is an optimal covering (with parameters $K_1$ and $K_2$) if

\[
\Bigl(\frac{R_0}{R}\Bigr)^3 \le n \le K_1 \Bigl(\frac{R_0}{R}\Bigr)^3,
\]

and any point $x$ in $B(0, R_0)$ is covered by at most $K_2$ balls $B(x_i, 2R)$. (Optimal coverings exist for all large enough $K_1$ and $K_2$, the critical values depending only on the dimension of the space. In $\mathbb{R}^3$, we can take $K_1 = K_2 = 8$.) Let $f$ be a sign-varying quantity (e.g., the flux density $-[(u \cdot \nabla)u + \nabla p] \cdot u$), and consider the arithmetic mean of the quantity locally averaged over the (optimal) covering elements $B(x_i, R)$,

\[
F_R = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{R^3} \int_{B(x_i,2R)} f\,\psi_i^{\rho}\,dx
\]

(for some $0 < \rho \le 1$). A revealing observation is that $F_R \sim \mathrm{const}(R)$ for all optimal coverings at scale $R$ ($K_1$ and $K_2$ fixed) indicates there are no significant fluctuations of the sign of $f$ at scales comparable to or greater than $R$. In other words, if there are significant fluctuations of the sign of $f$ at scale $R^*$, $F_R$ will run over a wide range of values while the average is being run over all permissible optimal coverings (determined by $K_1$ and $K_2$), for any $R$ comparable to or less than $R^*$. When there is no change of the sign at all, i.e., in the case of a signed quantity (e.g., the energy density $f = \frac{1}{2}|u|^2$, $\rho = 2\delta - 1$, or the enstrophy density $f = |\nabla \otimes u|^2$, $\rho = 1$), one would then expect that for any scale $R$, $0 < R \le R_0$, the averages $F_R$ are all comparable to each other. This is in fact true (an easy proof). Utilizing the NSE via the local energy inequality – in the mathematical setting of suitable weak solutions [1,17] – we establish the positivity and near-constancy (comparable to $\nu E$, where $\nu$ is the viscosity and $E$ is the average enstrophy over $B(0, 2R_0) \times (0, 2T)$) of the averaged flux across a range of scales under a very simple and natural (in the sense of turbulence phenomenology) condition; namely, that the Taylor micro-scale $\tau_0$ associated with $B(0, R_0)$ is smaller than the integral spatial scale $R_0$ (cf. (4.13)). The larger the gap, the deeper the inertial range. This condition is reminiscent of the Poincaré inequality on a domain of the corresponding size (see Remark 4.2); moreover, the condition in hand would be easy to check in physical experiments as the averages involved are very straightforward. In addition, the length of the time interval $T$ is consistent with the intrinsic scaling of the model (cf. (4.2)). It is interesting to interpret the cascade in the light of the above observation regarding the meaning of the near-constancy of optimal cover averages.
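The role of the gap between the Taylor scale and the integral scale can be illustrated numerically with the quantities defined later in the paper: the Taylor length scale of (3.11) and the lower bound (4.6) of Proposition 4.1 on the averaged modified flux. The following is only a minimal sketch under stated assumptions, not the authors' procedure: the function names, the value $C_0 = 1$, and the sample values of $e$, $E$ and $\nu$ are hypothetical, while $K_1 = K_2 = 8$ is the choice the paper mentions for $\mathbb{R}^3$.

```python
import math


def taylor_scale(e, E):
    """Taylor length scale tau_0 = (e / E)**0.5, cf. (3.11)."""
    if e <= 0 or E <= 0:
        raise ValueError("e and E must be positive")
    return math.sqrt(e / E)


def flux_lower_bound(nu, e, E, R, K1=8, K2=8, C0=1.0):
    """Right-hand side of (4.6): c1 * nu * E * (1 - c2 * tau_0^2 / R^2),
    with c1 = 1/K1 and c2 = C0*K1*K2 as in Proposition 4.1.
    C0 = 1.0 is a placeholder value, not taken from the paper."""
    c1 = 1.0 / K1
    c2 = C0 * K1 * K2
    tau0 = taylor_scale(e, E)
    return c1 * nu * E * (1.0 - c2 * tau0 ** 2 / R ** 2)


def positivity_threshold(e, E, K1=8, K2=8, C0=1.0):
    """Scale R = sqrt(c2) * tau_0 above which the bound (4.6) is positive."""
    return math.sqrt(C0 * K1 * K2) * taylor_scale(e, E)


if __name__ == "__main__":
    # Hypothetical sample values: e = 1, E = 4 gives tau_0 = 0.5.
    print(taylor_scale(1.0, 4.0))
    print(positivity_threshold(1.0, 4.0))
    print(flux_lower_bound(1.0, 1.0, 4.0, R=8.0))
```

In this sketch the bound is positive only for scales $R$ above $\sqrt{c_2}\,\tau_0$, so a nonempty range of such scales below $R_0$ requires $\tau_0$ to be sufficiently small relative to $R_0$, which is the qualitative content of the condition described above.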
Essentially, for any $\tau_0 \le R \le R_0$ (within the inertial range), the flux density does not experience significant fluctuations of sign at scale $R$; the significant fluctuations of sign are only possible at the scales substantially smaller than $\tau_0$, i.e., inside the dissipation range.

The second part of the paper concerns the locality of the flux. It is believed (see [16]) that the energy flux inside the inertial range of turbulent flows depends strongly on the flow in nearby scales, its dependence on the lower and much higher scales being weak. The theoretical proof of this conjecture remained elusive. The first quantitative results on fluxes were obtained by the early 70's (see [13]). Much later, the authors in [15] used the NSE in the Fourier setting to explore locality of scale interactions for statistical averages, while the investigation in [5] revealed the locality of filtered energy flux under an assumption that solutions to the vanishing viscosity Euler's equations saturate a defining inequality of a suitable Besov space, i.e., under a (weak) scaling assumption. A more recent work [2] provided a proof of the locality of the energy flux in the setting of the Littlewood-Paley decomposition. In the last section we prove the locality of the energy cascade – in decaying turbulence – in the physical space throughout the inertial range established in Theorem 4.1. In particular, considering dyadic shells at the scales $2^k R$ ($k$ an integer) in the physical space, we show that both ultraviolet and infrared locality propagate exponentially in the shell number $k$. To the best of our knowledge, the condition (4.13) is presently the only condition (in any solution setting) implying both the existence of the inertial range and the locality of


the energy flux. Moreover, it does not involve any additional regularity or homogeneity/scaling assumptions on the solutions to the NSE.

2. Preliminaries

We consider the three-dimensional incompressible Navier-Stokes equations (NSE)

\[
\frac{\partial}{\partial t} u(t, x) - \nu \Delta u(t, x) + (u(t, x) \cdot \nabla) u(t, x) + \nabla p(t, x) = 0, \qquad \nabla \cdot u(t, x) = 0, \tag{2.1}
\]

where the space variable $x$ is in $\mathbb{R}^3$ and the time variable $t$ is in $(0, \infty)$. The vector-valued function $u$ and the scalar-valued function $p$ represent the fluid velocity and the pressure, respectively, while the constant $\nu$ is the viscosity of the fluid. Since our goal is to investigate local fluxes in the physical space, the class of suitable weak solutions (see [1,14]) will provide an appropriate mathematical framework.

Definition 2.1. Let $\Omega$ be an open connected set in $\mathbb{R}^3$. We say that $(u, p)$ is a suitable weak solution on $(0, \infty) \times \Omega$ if
(a) $u \in L^\infty((0, \infty), L^2(\Omega)^3) \cap L^2((0, \infty), H^1(\Omega)^3)$ and $p \in L^{3/2}((0, \infty) \times \Omega)$;
(b) the NSE (2.1) are satisfied in the weak (distributional) sense;
(c) the local energy inequality is satisfied: for any $\phi \in \mathcal{D}((0, \infty) \times \Omega)$, $\phi \ge 0$, we have

\[
2\nu \iint |\nabla \otimes u|^2 \phi\,dx\,dt \le \iint |u|^2 (\partial_t \phi + \nu \Delta \phi)\,dx\,dt + \iint (|u|^2 + 2p)\,u \cdot \nabla \phi\,dx\,dt, \tag{2.2}
\]

where $\mathcal{D}((0, \infty) \times \Omega)$ denotes the space of infinitely differentiable functions with compact support in $(0, \infty) \times \Omega$.

The existence of the suitable weak solutions in the case where $\Omega = \mathbb{R}^3$ and the external force is zero, given a divergence-free initial condition in $L^2$, was first established in [17]. See also [1,14] for more general results related to existence and regularity properties of the suitable weak solutions. A solution to the NSE on $(0, \infty) \times \Omega$ is called regular if its $H^1$ norm is bounded on $(0, T)$ for any $T$ positive. Given appropriate boundary conditions, this implies that the solution is infinitely differentiable (in fact, analytic) in both space and time and so it is a classical physical solution. In particular, the local energy equality holds ((2.2) becomes an equality). The smoothness of the suitable weak solutions to the NSE is still an open problem, and the best result in this direction reads that the one-dimensional (parabolic) Hausdorff measure of the singular set in $(0, T) \times \Omega$ is zero [1] (outside the singular set, a suitable weak solution is infinitely differentiable in the spatial variables).
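In what follows, we consider $R_0 > 0$ such that

\[
B(0, 3R_0) \subset \Omega, \tag{2.3}
\]

where $B(0, 3R_0)$ denotes the ball in $\mathbb{R}^3$ centered at the origin and with the radius $3R_0$. Let $1/2 \le \delta < 1$. Choose $\psi_0 \in \mathcal{D}(B(0, 2R_0))$ satisfying

\[
0 \le \psi_0 \le 1, \quad \psi_0 = 1 \text{ on } B(0, R_0), \quad \frac{|\nabla \psi_0|}{\psi_0^{\delta}} \le \frac{C_0}{R_0}, \quad \frac{|\Delta \psi_0|}{\psi_0^{2\delta-1}} \le \frac{C_0}{R_0^2}. \tag{2.4}
\]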



For a $T > 0$ (to be chosen later), $x_0 \in B(0, R_0)$ and $0 < R \le R_0$ define $\phi = \phi_{x_0,T,R}(t, x) = \eta(t)\psi(x)$ to be used in the local energy inequality (2.2), where $\eta = \eta_T(t)$ and $\psi = \psi_{x_0,R}(x)$ are refined cut-off functions satisfying the following conditions:

\[
\eta \in \mathcal{D}(0, 2T), \quad 0 \le \eta \le 1, \quad \eta = 1 \text{ on } (T/4, 5T/4), \quad \frac{|\eta'|}{\eta^{\delta}} \le \frac{C_0}{T}; \tag{2.5}
\]

if $B(x_0, R) \subset B(0, R_0)$, then $\psi \in \mathcal{D}(B(x_0, 2R))$ with

\[
0 \le \psi \le \psi_0, \quad \psi = 1 \text{ on } B(x_0, R) \cap B(0, R_0), \quad \frac{|\nabla \psi|}{\psi^{\delta}} \le \frac{C_0}{R}, \quad \frac{|\Delta \psi|}{\psi^{2\delta-1}} \le \frac{C_0}{R^2}, \tag{2.6}
\]

and if $B(x_0, R) \not\subset B(0, R_0)$, then $\psi \in \mathcal{D}(B(0, 2R_0))$ with $\psi = 1$ on $B(x_0, R) \cap B(0, R_0)$ satisfying, in addition to (2.6), the following:

$\psi = \psi_0$ on the part of the cone in $\mathbb{R}^3$ centered at zero and passing through $S(0, R_0) \cap B(x_0, R)$ between $S(0, R_0)$ and $S(0, 2R_0)$ (2.7)

and

$\psi = 0$ on $B(0, R_0) \setminus B(x_0, 2R)$ and outside the part of the cone in $\mathbb{R}^3$ centered at zero and passing through $S(0, R_0) \cap B(x_0, 2R)$ between $S(0, R_0)$ and $S(0, 2R_0)$. (2.8)

Figure 1 illustrates the definition of $\psi$ in the case $B(x_0, R)$ is not entirely contained in $B(0, R_0)$.

Remark 2.1. The additional conditions on the boundary elements (2.7) and (2.8) are necessary to obtain the lower bound on the fluxes in terms of the same version of the localized enstrophy $E$ in Theorems 4.1 and 5.2 (see Remarks 4.4 and 5.5).

3. Localized Energy, Enstrophy and Flux; Ensemble Averages

Let $x_0 \in B(0, R_0)$ and $0 < R \le R_0$. Define localized energy, $e$, and enstrophy, $E$, at time $t$ – all per unit of mass – associated with $B(x_0, R)$ by

\[
e_{x_0,R}(t) = \int \frac{1}{2}|u|^2 \phi^{2\delta-1}\,dx, \tag{3.1}
\]

\[
E_{x_0,R}(t) = \int |\nabla \otimes u|^2 \phi\,dx \tag{3.2}
\]

(for some $\frac{1}{2} < \delta < 1$). The total – (kinetic) energy plus pressure – flux through the sphere $S(x_0, R)$ is given by

\[
\int_{S(x_0,R)} \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot n\,ds = \int_{B(x_0,R)} [(u \cdot \nabla)u + \nabla p] \cdot u\,dx,
\]

where $n$ is an outward normal. Considering the NSE localized to $B(x_0, R)$ leads to a localized version of the flux,

\[
\Phi_{x_0,R}(t) = \int \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot \nabla\phi\,dx = -\int [(u \cdot \nabla)u + \nabla p] \cdot u\,\phi\,dx, \tag{3.3}
\]


Fig. 1. Regions of $\mathrm{supp}(\psi)$ in the case $B(x_0, R) \not\subset B(0, R_0)$, cross-section

where $\phi = \eta\psi$ with $\eta$ and $\psi$ as in (2.5–2.6). Since $\psi$ can be constructed such that $\nabla\phi = \eta\nabla\psi$ is oriented along the radial directions of $B(x_0, R)$ towards the center of the ball $x_0$, $\Phi_{x_0,R}$ represents the flux into $B(x_0, R)$ through the layer between the spheres $S(x_0, 2R)$ and $S(x_0, R)$ (in the case of the boundary elements satisfying the additional hypotheses (2.7) and (2.8), $\psi$ is almost radial and the gradient still points inward).

For a quantity $\Theta_{x,R}(t)$, $t \in [0, 2T]$, and a covering $\{B(x_i, R)\}_{i=1,n}$ of $B(0, R_0)$ define a time-space ensemble average

\[
\langle \Theta \rangle_R = \frac{1}{T} \int \frac{1}{n} \sum_{i=1}^{n} \frac{1}{R^3}\, \Theta_{x_i,R}(t)\,dt. \tag{3.4}
\]

Denote by

\[
e_R = \langle e_{x,R}(t) \rangle_R, \tag{3.5}
\]

\[
E_R = \langle E_{x,R}(t) \rangle_R, \tag{3.6}
\]

\[
\Phi_R = \langle \Phi_{x,R}(t) \rangle_R, \tag{3.7}
\]

the averaged localized energy, enstrophy and inward-directed flux over balls of radius $R$ covering $B(0, R_0)$. Also, introduce the time-space average of the localized energy on $B(0, R_0)$,

\[
e = \frac{1}{T} \int \frac{1}{R_0^3}\, e_{0,R_0}(t)\,dt = \frac{1}{T} \frac{1}{R_0^3} \iint \frac{1}{2}|u|^2 \phi_0^{2\delta-1}\,dx\,dt \tag{3.8}
\]



and the time-space average of the localized enstrophy on $B(0, R_0)$,

\[
E = \frac{1}{T} \int \frac{1}{R_0^3}\, E_{0,R_0}(t)\,dt = \frac{1}{T} \frac{1}{R_0^3} \iint |\nabla \otimes u|^2 \phi_0\,dx\,dt, \tag{3.9}
\]

where

\[
\phi_0(t, x) = \eta(t)\psi_0(x) \tag{3.10}
\]

with $\psi_0$ defined in (2.4). Finally, define the Taylor length scale associated with $B(0, R_0)$ by

\[
\tau_0 = \Bigl(\frac{e}{E}\Bigr)^{1/2}. \tag{3.11}
\]

Note that the possible lack of regularity may produce additional loss of energy, resulting in anomalous energy dissipation and the loss of flux leading to the strict inequality in (2.2). Let us mention here that in the turbulence literature the term 'anomalous dissipation' is usually utilized in the context of the possible energy dissipation due to the (possible) singularities in the 3D Euler equations (the observation originally made by Onsager); for rigorous results on Onsager's conjecture on the energy conservation in the Euler equations see, e.g., [3], and a recent work [2]. Denote by $\Phi^\infty_{x_0,R}$ the loss of flux due to possible singularities in $[0, 2T] \times B(x_0, 2R)$,

\[
\iint \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot \nabla\phi\,dx\,dt - \Phi^\infty_{x_0,R} = \nu \iint |\nabla \otimes u|^2 \phi\,dx\,dt - \iint \frac{1}{2}|u|^2 (\partial_t \phi + \nu \Delta \phi)\,dx\,dt, \tag{3.12}
\]

where $\phi = \eta\psi$ with $\eta$ and $\psi$ as in (2.5) and (2.6–2.8). In particular, denote by $\Phi^\infty = \Phi^\infty_{0,R_0}$ the loss of flux due to singularities in $[0, 2T] \times B(0, 2R_0)$. We will also consider the time-space ensemble averages of these anomalous fluxes,

\[
\Phi^\infty_R = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T} \frac{1}{R^3}\, \Phi^\infty_{x_i,R}. \tag{3.13}
\]

Note that due to (2.2), all the anomalous fluxes are nonnegative,

\[
\Phi^\infty_{x_0,R} \ge 0, \qquad \Phi^\infty \ge 0, \qquad \Phi^\infty_R \ge 0; \tag{3.14}
\]

they are all zero provided the equality holds in (2.2) inside $[0, 2T] \times B(0, 2R_0)$. In particular, the anomalous fluxes are all zero provided the solution in view is regular on $[0, 2T] \times B(0, 2R_0)$. Consequently, the total localized flux into $B(x_0, R)$ over the interval $[0, 2T]$, including the (loss of) flux due to the possible loss of regularity, is

\[
\Psi_{x_0,R} = \int \Phi_{x_0,R}(t)\,dt - \Phi^\infty_{x_0,R}, \tag{3.15}
\]

and the time-space ensemble average of this flux at scale $R$ is

\[
\Psi_R = \Phi_R - \Phi^\infty_R. \tag{3.16}
\]

We will refer to $\Psi_{x_0,R}$ and $\Psi_R$ as the modified flux over $[0, 2T]$ into $B(x_0, R)$ and the (time-space ensemble) averaged modified flux at the scale $R$, respectively. Let $K_1, K_2 > 1$ be two positive integers (independent of $R$, $R_0$, and any of the parameters of the NSE).


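Definition 3.1. We say that a covering of $B(0, R_0)$ by $n$ balls of radius $R$ is optimal if

\[
\Bigl(\frac{R_0}{R}\Bigr)^3 \le n \le K_1 \Bigl(\frac{R_0}{R}\Bigr)^3; \tag{3.17}
\]

\[
\text{any } x \in B(0, R_0) \text{ is covered by at most } K_2 \text{ balls } B(x_i, 2R). \tag{3.18}
\]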
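The two requirements of Definition 3.1 can be tested numerically for a proposed set of centers. The sketch below is a hedged illustration, not part of the paper: the function name `is_optimal_covering` and the use of a finite list of sample points are assumptions, and checking finitely many points can only refute, never prove, that a family of balls is an optimal covering.

```python
import math

ORIGIN = (0.0, 0.0, 0.0)


def is_optimal_covering(centers, R, R0, K1, K2, sample_points):
    """Necessary check of (3.17)-(3.18) on a finite set of sample points.

    centers       -- list of ball centers x_i (3-tuples)
    sample_points -- points of R^3; only those lying in B(0, R0) are tested
    """
    n = len(centers)
    ratio = (R0 / R) ** 3
    # (3.17): (R0/R)^3 <= n <= K1 * (R0/R)^3
    if not (ratio <= n <= K1 * ratio):
        return False
    for p in sample_points:
        if math.dist(p, ORIGIN) > R0:
            continue  # points outside B(0, R0) are irrelevant
        in_balls = sum(1 for c in centers if math.dist(p, c) <= R)
        in_doubled = sum(1 for c in centers if math.dist(p, c) <= 2 * R)
        # covering property, and (3.18): at most K2 doubled balls contain p
        if in_balls == 0 or in_doubled > K2:
            return False
    return True


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.0, 1.0)]
    # A single ball of radius R = R0 centered at the origin.
    print(is_optimal_covering([ORIGIN], 1.0, 1.0, K1=8, K2=8, sample_points=pts))
```

A denser set of sample points (for instance a fine grid in $B(0, R_0)$) makes the check more informative, at a cost proportional to the number of points times the number of balls.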

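Note that optimal coverings exist for any $0 < R \le R_0$ provided $K_1$ and $K_2$ are large enough. In fact, the choice of $K_1$ and $K_2$ depends only on the dimension of the space; in $\mathbb{R}^3$ we can choose $K_1 = K_2 = 8$. Henceforth, we assume that the averages $\langle \cdot \rangle_R$ are taken with respect to optimal coverings.

Lemma 3.1. If the covering $\{B(x_i, R)\}_{i=1,n}$ of $B(0, R_0)$ is optimal then

\[
\Phi^\infty_R \le K\, \frac{1}{T} \frac{1}{R_0^3}\, \Phi^\infty, \tag{3.19}
\]

where $K > 0$ is a constant depending only on $K_2$ and the dimension of the space $\mathbb{R}^3$.

Proof. Let $\{x_{i_j}\}$ be a subset of $\{x_i\}_{i=1,n}$ such that the interiors of the balls $B(x_{i_j}, 2R)$ are pairwise disjoint. Using (3.12), we obtain

\[
\iint \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot \nabla\phi_0\,dx\,dt - \Phi^\infty = \nu \iint |\nabla \otimes u|^2 \phi_0\,dx\,dt - \iint \frac{1}{2}|u|^2 (\partial_t \phi_0 + \nu \Delta \phi_0)\,dx\,dt \tag{3.20}
\]

and

\[
\iint \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot \nabla\Bigl(\sum_j \phi_{i_j}\Bigr)\,dx\,dt - \sum_j \Phi^\infty_{x_{i_j},R} = \nu \iint |\nabla \otimes u|^2 \Bigl(\sum_j \phi_{i_j}\Bigr)\,dx\,dt - \iint \frac{1}{2}|u|^2 \Bigl[\partial_t \Bigl(\sum_j \phi_{i_j}\Bigr) + \nu \Delta \Bigl(\sum_j \phi_{i_j}\Bigr)\Bigr]\,dx\,dt, \tag{3.21}
\]

where $\phi_0 = \eta\psi_0$ and $\phi_{i_j} = \eta\psi_{i_j}$ with $\eta$ as in (2.5), $\psi_0$ as in (2.4) and $\psi_{i_j}$ a test function corresponding to $B(x_{i_j}, R)$ satisfying (2.6–2.8). Note that the definitions of $\phi_0$ and $\phi_{i_j}$ imply

\[
\tilde\phi = \phi_0 - \sum_j \phi_{i_j} \ge 0;
\]

hence, by the local energy inequality (2.2),

\[
\iint \Bigl(\frac{1}{2}|u|^2 + p\Bigr) u \cdot \nabla\tilde\phi\,dx\,dt \ge \nu \iint |\nabla \otimes u|^2 \tilde\phi\,dx\,dt - \iint \frac{1}{2}|u|^2 (\partial_t \tilde\phi + \nu \Delta \tilde\phi)\,dx\,dt. \tag{3.22}
\]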



If we add relations (3.21) and (3.22) and then subtract (3.20) we obtain

\[
\Phi^\infty \ge \sum_j \Phi^\infty_{x_{i_j},R}. \tag{3.23}
\]

Let $L$ be a cubic lattice inside $B(0, R_0)$ with the points situated at the vertices of cubes of side $R/2$. (Note that this lattice can be chosen such that the number of points in it is between $2^3 (R_0/R)^3$ and $(4\pi/3)\,2^3 (R_0/R)^3$.) Since the covering $\{B(x_i, R)\}$ is optimal, each point in $L$ is contained in at most $K_2$ balls. Moreover, any ball in the covering will contain at least one point from the lattice. If $L'$ is a sub-lattice of $L$ with points at vertices of cubes of side $4R$, then the interiors of balls of radius $2R$ containing different points of $L'$ are pairwise disjoint, and thus if we denote by $B(x_{i_p}, R)$ a ball from the covering $\{B(x_i, R)\}$ containing the point $p \in L'$, by (3.23),

\[
\Phi^\infty \ge \sum_{p \in L'} \Phi^\infty_{x_{i_p},R}.
\]

Note that for each point $p \in L'$ there are at most $K_2$ choices for $B(x_{i_p}, R)$. So

\[
K_2\, \Phi^\infty \ge \sum_{i:\, B(x_i,R) \cap L' \ne \emptyset} \Phi^\infty_{x_i,R}.
\]

Clearly $L$ can be written as a union of $8^3 = 512$ sub-lattices $L'_k$, $k = 1, \ldots, 512$, each $L'_k$ having the same properties as $L'$. Thus,

\[
8^3 K_2\, \Phi^\infty \ge \sum_{i=1}^{n} \Phi^\infty_{x_i,R}.
\]

Consequently,

\[
\Phi^\infty_R = \frac{1}{T} \frac{1}{R^3} \frac{1}{n} \sum_{i=1}^{n} \Phi^\infty_{x_i,R} \le 8^3 K_2\, \frac{1}{T} \frac{1}{R^3} \frac{1}{n}\, \Phi^\infty \le 8^3 K_2\, \frac{1}{T} \frac{1}{R_0^3}\, \Phi^\infty,
\]

where the last inequality is due to $n$ satisfying (3.17).

According to the lemma, the time-space ensemble averages $\Phi^\infty_R$ taken over the optimal coverings at the scale $R$ are bounded, independently of $R$, by the average loss of flux due to possible singularities inside $B(0, 2R_0)$.

4. Energy Cascade

Let $\{B(x_i, R)\}_{i=1,n}$ be an optimal covering of $B(0, R_0)$. Note that the local energy equality (3.12) and the definitions of $E_R$, $\Phi_R$ and $\Phi^\infty_R$ ((3.6), (3.7) and (3.13)) imply

\[
\Psi_R = \Phi_R - \Phi^\infty_R = \nu E_R - \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T} \frac{1}{R^3} \iint \frac{1}{2}|u|^2 (\partial_t \phi_i + \nu \Delta \phi_i)\,dx\,dt, \tag{4.1}
\]

where $\phi_i = \eta\psi_i$ and $\psi_i$ is the spatial cut-off on $B(x_i, 2R)$ satisfying (2.5–2.8). If

\[
T \ge \frac{R_0^2}{\nu}, \tag{4.2}
\]

then for any $0 < R \le R_0$,

\[
|(\phi_i)_t| = |\eta_t \psi_i| \le C_0 \frac{1}{T}\, \eta^{\delta} \psi_i \le \nu \frac{C_0}{R^2}\, \phi_i^{2\delta-1},
\]

\[
\nu |\Delta \phi_i| = \nu |\eta \Delta \psi_i| \le C_0 \frac{\nu}{R^2}\, \eta\, \psi_i^{2\delta-1} \le \nu \frac{C_0}{R^2}\, \phi_i^{2\delta-1};
\]

hence,

\[
\Psi_R \ge \nu E_R - \nu \frac{C_0}{R^2}\, e_R.
\]

The optimality conditions (3.17) and (3.18) paired with (2.6–2.8) imply

\[
E_R \ge \frac{1}{K_1}\, E \tag{4.3}
\]

and

\[
e_R \le K_2\, e. \tag{4.4}
\]

Consequently,

\[
\Psi_R \ge \nu \frac{1}{K_1}\, E - \nu \frac{C_0 K_2}{R^2}\, e, \tag{4.5}
\]

leading to the following proposition.

Proposition 4.1.

\[
\Psi_R \ge c_1\, \nu E \Bigl(1 - c_2 \frac{\tau_0^2}{R^2}\Bigr) \tag{4.6}
\]

with $c_1 = 1/K_1$ and $c_2 = C_0 K_1 K_2$ (provided conditions (3.17–3.18) are satisfied). Suppose that τ0
0 sufficiently small (i.e., the small coupling counterpart to Casdagli’s result at large coupling). A consequence of this is the equality of Hausdorff dimension and upper and lower box counting dimensions of $\Sigma_V$ in this coupling constant regime. Our first result shows that the dimension of the spectrum indeed extends continuously to $V = 0$.


D. Damanik, A. Gorodetski

Theorem 1.1. We have

\[
\lim_{V \to 0} \dim \Sigma_V = 1.
\]

More precisely, there are constants $C_1, C_2 > 0$ such that

\[
1 - C_1 V \le \dim \Sigma_V \le 1 - C_2 V
\]

for $V > 0$ sufficiently small.

We get Theorem 1.1 as a consequence of a connection between the Hausdorff dimension of a Cantor set and its denseness and thickness, along with estimates for the latter quantities. Since these notions and connections may be less familiar to at least a part of our intended audience, let us recall the definitions and some of the main results; an excellent general reference in this context is [PT]. Let $C \subset \mathbb{R}$ be a Cantor set and denote by $I$ its convex hull. Any connected component of $I \setminus C$ is called a gap of $C$. A presentation of $C$ is given by an ordering $\mathcal{U} = \{U_n\}_{n \ge 1}$ of the gaps of $C$. If $u \in C$ is a boundary point of a gap $U$ of $C$, we denote by $K$ the connected component of $I \setminus (U_1 \cup U_2 \cup \ldots \cup U_n)$ (with $n$ chosen so that $U_n = U$) that contains $u$ and write

\[
\tau(C, \mathcal{U}, u) = \frac{|K|}{|U|}.
\]

With this notation, the thickness $\tau(C)$ and the denseness $\theta(C)$ of $C$ are given by

\[
\tau(C) = \sup_{\mathcal{U}} \inf_{u} \tau(C, \mathcal{U}, u), \qquad \theta(C) = \inf_{\mathcal{U}} \sup_{u} \tau(C, \mathcal{U}, u),
\]

and they are related to the Hausdorff dimension of $C$ by the following inequalities (cf. [PT, Sect. 4.2]),

\[
\frac{\log 2}{\log\bigl(2 + \frac{1}{\tau(C)}\bigr)} \le \dim_H C \le \frac{\log 2}{\log\bigl(2 + \frac{1}{\theta(C)}\bigr)}.
\]
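To see how these inequalities translate thickness estimates into dimension estimates, the bounds can be evaluated directly. The following is a small illustrative sketch, not taken from the paper; the function names are hypothetical. It uses the middle-third Cantor set, whose natural presentation gives the ratio $|K|/|U| = 1$ at every gap endpoint, and then a large thickness of order $1/V$ of the kind that appears in the estimates for small coupling.

```python
import math


def dimension_bounds(thickness, denseness):
    """Return (lower, upper) with
    lower = log 2 / log(2 + 1/thickness),
    upper = log 2 / log(2 + 1/denseness),
    the two sides of the inequality relating thickness, denseness
    and Hausdorff dimension."""
    if thickness <= 0 or denseness < thickness:
        raise ValueError("need 0 < thickness <= denseness")
    lower = math.log(2) / math.log(2 + 1 / thickness)
    upper = math.log(2) / math.log(2 + 1 / denseness)
    return lower, upper


def middle_gap_ratio(gap_fraction):
    """Ratio |K|/|U| for the self-similar Cantor set obtained by repeatedly
    removing the open middle `gap_fraction` of each interval:
    bridge length (1 - g)/2 divided by gap length g."""
    if not 0 < gap_fraction < 1:
        raise ValueError("gap_fraction must lie in (0, 1)")
    return (1 - gap_fraction) / (2 * gap_fraction)


if __name__ == "__main__":
    t = middle_gap_ratio(1 / 3)      # approximately 1 for the middle-third set
    print(dimension_bounds(t, t))    # both close to log 2 / log 3
    # Thickness and denseness of order 1/V with V = 0.01 (constants set to 1):
    print(dimension_bounds(100.0, 100.0))
```

For thickness $C/V$ the lower bound behaves like $1 - V/(2C\log 2)$ as $V \to 0$, which is the linear-in-$V$ form of the estimates in Theorem 1.1.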

Due to these inequalities, Theorem 1.1 is a consequence of the following result:

Theorem 1.2. We have

\[
\lim_{V \to 0} \tau(\Sigma_V) = \infty.
\]

More precisely, there are constants $C_3, C_4 > 0$ such that

\[
C_3 V^{-1} \le \tau(\Sigma_V) \le \theta(\Sigma_V) \le C_4 V^{-1}
\]

for $V > 0$ sufficiently small.

Bovier and Ghez described in their 1995 paper [BG] the then-state of the art concerning mathematically rigorous results for Schrödinger operators in $\ell^2(\mathbb{Z})$ with potentials generated by primitive substitutions. The Fibonacci Hamiltonian belongs to this class; more precisely, it is in many ways the most important example within this class of models. One of the most spectacular discoveries is that, in this class of models, the spectrum jumps from being an interval for coupling $V = 0$ to being a zero-measure Cantor set for coupling $V > 0$. That is, as the potential is turned on, a dense set of gaps opens

The Weakly Coupled Fibonacci Hamiltonian


Fig. 1. The set $\{(E, V) : E \in \Sigma_V,\ 0 \le V \le 2\}$

immediately (and the complement of these gaps has zero Lebesgue measure). It is natural to ask about the size of these gaps, which can in fact be parametrized by a canonical countable set of gap labels; see [BBG92]. These gap openings were studied in [B] for a Thue-Morse potential and in [BBG91] for period doubling potential. However, for the important Fibonacci case, the problem remained open. In fact, Bovier and Ghez write on p. 2321 of [BG]: It is a quite perplexing feature that even in the simplest case of all, the golden Fibonacci sequence, the opening of the gaps at small coupling is not known!1 Our next result resolves this issue completely and shows that, in the Fibonacci case, all gaps open linearly: Theorem 1.3. For V > 0 sufficiently small, the boundary points of a gap in the spectrum V depend smoothly on the coupling constant V . Moreover, given any one-parameter continuous family {UV }V >0 of gaps of V , we have that lim

V →0

|UV | |V |

exists and belongs to $(0, \infty)$.

Figure 1 shows a plot of the spectrum for small coupling. The plot illustrates the results contained in Theorems 1.1–1.3. It also suggests that the limit $\lim_{V \to 0} |U_V|/|V|$ depends on the chosen family of gaps. We have more to say about the value of the limit in Theorem 1.6 below. It will turn out that the size of the limit is related to the label assigned to the gap by the gap labeling theorem.

Footnote 1: There is a perturbative approach to this problem for a class of models that includes the Fibonacci Hamiltonian by Sire and Mosseri; see [SM89] and [OK,Si89,SM90,SMS] for related work. While their work is non-rigorous, it gives quite convincing arguments in favor of linear gap opening; see especially [SM89, Sect. 5]. It would be interesting to make their approach mathematically rigorous.

Our next result concerns the sum set $\Sigma_V + \Sigma_V = \{E_1 + E_2 : E_1, E_2 \in \Sigma_V\}$. This set is equal to the spectrum of the so-called square Fibonacci Hamiltonian. Here, one considers the Schrödinger operator

$$[H_V^{(2)} \psi](m, n) = \psi(m+1, n) + \psi(m-1, n) + \psi(m, n+1) + \psi(m, n-1) + V \left( \chi_{[1-\alpha,1)}(m\alpha \bmod 1) + \chi_{[1-\alpha,1)}(n\alpha \bmod 1) \right) \psi(m, n)$$

in $\ell^2(\mathbb{Z}^2)$. The theory of tensor products of Hilbert spaces and operators then implies that $\sigma(H_V^{(2)}) = \Sigma_V + \Sigma_V$, see Sect. 6. This operator and its spectrum have been studied numerically and heuristically by Even-Dar Mandel and Lifshitz in a series of papers [EL06,EL07,EL08] (a similar model was studied by Sire in [Si89]). Their study suggested that at small coupling, the spectrum $\Sigma_V + \Sigma_V$ is not a Cantor set; quite on the contrary, it has no gaps at all. Our next theorem confirms this observation:

Theorem 1.4. For $V > 0$ sufficiently small, we have that $\sigma(H_V^{(2)}) = \Sigma_V + \Sigma_V$ is an interval.

Certainly, the same statement holds for the cubic Fibonacci Hamiltonian (i.e., the analogously defined Schrödinger operator in $\ell^2(\mathbb{Z}^3)$ with spectrum $\Sigma_V + \Sigma_V + \Sigma_V$). Notice that Theorem 1.4 is a consequence of Theorem 1.2 and the famous Gap Lemma, which was used by Newhouse to construct persistent tangencies and generic diffeomorphisms with an infinite number of attractors (the so-called “Newhouse phenomenon”), see Subsect. 6.2 for details:

Gap Lemma (Newhouse [N79,N70]). If $C_1, C_2 \subset \mathbb{R}^1$ are Cantor sets such that $\tau(C_1) \cdot \tau(C_2) > 1$, then either one of these sets is contained entirely in a gap (see footnote 2) of the other set, or $C_1 \cap C_2 \neq \emptyset$.

Let us turn to the formulation of results involving the integrated density of states, which is a quantity of fundamental importance associated with an ergodic family of Schrödinger operators. We first recall the definition of the integrated density of states. Denote the restriction of $H_{V,\omega}$ to some finite interval $\Lambda \subset \mathbb{Z}$ with Dirichlet boundary conditions by $H_{V,\omega}^{\Lambda}$. We denote by $N(E, \omega, V, \Lambda)$ the number of eigenvalues of $H_{V,\omega}^{\Lambda}$ that are less than or equal to $E$. The integrated density of states is given by

$$N(E, V) = \lim_{n \to \infty} \frac{1}{n} N(E, \omega, V, [1, n]). \tag{1}$$
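The finite-volume quantity in (1) can be explored numerically. The following is a minimal sketch, not taken from the paper: it assumes the frequency $\alpha = (\sqrt{5}-1)/2$, uses the potential $V\chi_{[1-\alpha,1)}(m\alpha + \omega \bmod 1)$ that appears in the operators above, and counts eigenvalues of the Dirichlet restriction to $[1, n]$ with a Sturm sequence. The function names are hypothetical.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumption: inverse golden mean frequency


def potential(m, V, omega=0.0, alpha=ALPHA):
    """Potential V * chi_[1-alpha, 1)(m*alpha + omega mod 1) at site m."""
    phase = (m * alpha + omega) % 1.0
    return V if phase >= 1.0 - alpha else 0.0


def count_eigenvalues_below(E, V, n, omega=0.0):
    """Count eigenvalues below E of the Dirichlet restriction to [1, n].

    The restriction is a tridiagonal matrix with off-diagonal entries 1.
    The number of negative pivots in the LDL^T factorization of (H - E)
    equals the number of eigenvalues below E (Sturm sequence count).
    """
    count = 0
    d = 1.0  # placeholder; the first site has no coupling term
    for m in range(1, n + 1):
        coupling = 0.0 if m == 1 else 1.0 / d
        d = potential(m, V, omega) - E - coupling
        if d == 0.0:
            d = -1e-300  # E hits an eigenvalue of a leading block; count it
        if d < 0.0:
            count += 1
    return count


def ids_approx(E, V, n, omega=0.0):
    """Finite-volume approximation (1/n) * N(E, omega, V, [1, n])."""
    return count_eigenvalues_below(E, V, n, omega) / n
```

For $V = 0$ the restriction is the free Dirichlet Laplacian, whose eigenvalues are $2\cos(k\pi/(n+1))$, which gives a simple sanity check of the counting routine.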

Footnote 2 (to the Gap Lemma): For the purpose of this lemma, we also consider the two unbounded gaps in addition to the bounded gaps considered above, that is, the connected components of the complement of the convex hull of the Cantor set in question.

We will comment on the existence of the limit and some of its basic properties in Sect. 4. One of the most important applications of the integrated density of states is the so-called gap labeling. That is, one can identify a canonical set of gap labels, which depends only on the underlying dynamics (in this case, an irrational rotation of the circle or the shift transformation on a substitution-generated subshift over two symbols), in such a way that the value of $N(E, V)$ for $E \in \mathbb{R} \setminus \Sigma_V$ must belong to this canonical set. In the Fibonacci case, this set is well known (see, e.g., [BBG92, Eq. (6.7)]) and the general gap labeling theorem specializes to the following statement:

$$\{N(E, V) : E \in \mathbb{R} \setminus \Sigma_V\} \subseteq \{\{m\alpha\} : m \in \mathbb{Z}\} \cup \{1\} \tag{2}$$

for every $V \neq 0$. Here $\{m\alpha\}$ denotes the fractional part of $m\alpha$, that is, $\{m\alpha\} = m\alpha - \lfloor m\alpha \rfloor$. Notice that the set of gap labels is indeed $V$-independent and only depends on the value of $\alpha$ from the underlying circle rotation. Since $\alpha$ is irrational, the set of gap labels is dense. In general, a dense set of gap labels is indicative of a Cantor spectrum and hence a common (and attractive) stronger version of proving Cantor spectrum is to show that the operator “has all its gaps open.” For example, the Ten Martini Problem for the almost Mathieu operator is to show Cantor spectrum, while the Dry Ten Martini Problem is to show that all labels correspond to gaps in the spectrum. The former problem has been completely solved [AJ], while the latter has not yet been completely settled. Indeed, it is in general a hard problem to show that all labels given by the gap labeling theorem correspond to gaps and there are only a few results of this kind. Here we show the stronger (or “dry”) form of Cantor spectrum for the weakly coupled Fibonacci Hamiltonian and establish complete gap labeling:

Theorem 1.5. There is $V_0 > 0$ such that for every $V \in (0, V_0]$, all gaps allowed by the gap labeling theorem are open. That is,

$$\{N(E, V) : E \in \mathbb{R} \setminus \Sigma_V\} = \{\{m\alpha\} : m \in \mathbb{Z}\} \cup \{1\}. \tag{3}$$
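To make the label set in (2) and (3) concrete, here is a small sketch that lists the first few gap labels $\{m\alpha\}$. It assumes $\alpha = (\sqrt{5}-1)/2$, which is not stated in this part of the text, and the helper names are hypothetical.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumption: inverse golden mean


def gap_label(m, alpha=ALPHA):
    """Fractional part {m*alpha} = m*alpha - floor(m*alpha)."""
    return m * alpha - math.floor(m * alpha)


def gap_labels(max_abs_m, alpha=ALPHA):
    """Sorted pairs (label, m) for all integers m with 0 < |m| <= max_abs_m."""
    ms = [m for m in range(-max_abs_m, max_abs_m + 1) if m != 0]
    return sorted((gap_label(m, alpha), m) for m in ms)
```

Calling `gap_labels(5)` returns ten distinct values in $(0, 1)$; increasing the bound on $|m|$ fills the interval more and more densely, which is the density of labels mentioned above.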

Complete gap labeling for the strongly coupled Fibonacci Hamiltonian was shown by Raymond in [Ra], where he proves (3) for $V > 4$. We conjecture that (3) holds for every $V > 0$.

Let us return to the existence of the limit in Theorem 1.3. As was pointed out there, the value of the limit will depend on the family of gaps chosen. Now that the gap labeling has been introduced, we can refine the statement. For $m \in \mathbb{Z} \setminus \{0\}$, denote by $U_m(V)$ the gap of $\Sigma_V$ where the integrated density of states takes the value $\{m\alpha\}$.

Theorem 1.6. There is a finite constant $C^*$ such that for every $m \in \mathbb{Z} \setminus \{0\}$, we have

$$\lim_{V \to 0} \frac{|U_m(V)|}{|V|} = \frac{C_m}{|m|}$$

for a suitable $C_m \in (0, C^*)$.

Our final set of results concerns the spectral measures and transport exponents associated with the operator family. We will give precise definitions and statements of our results in Sect. 5 and limit ourselves to a brief description here. The ultimate goal of any analysis of a given Schrödinger operator is always an understanding of the associated unitary group, which then allows one to understand the dynamics of the associated time-dependent Schrödinger equation. The standard transport exponents capture the spreading of the quantum state in space. Most approaches to a study of these transport exponents proceed via (time-independent) spectral theory and link continuity properties of the spectral measure, associated to the initial state of the time evolution via the spectral theorem, to lower bounds for the transport exponents. In one space dimension, these continuity properties can in turn be investigated via an analysis of the solutions of the time-independent Schrödinger equation. Our goal is to carry this out for the weakly coupled Fibonacci Hamiltonian. Indeed, results of this kind are known, but the dependence of the quantities entering the estimates on the coupling constant had not been optimized. We revisit these approaches here and improve them to yield the best possible quantitative estimates at small coupling that can be obtained with current technology. Our results in Sect. 5 are likely not optimal and in particular do not approach the (known) zero-coupling values. We regard it as an interesting open problem to either prove or disprove that the dimension estimates for the spectral measures and the lower bounds for the transport exponents approach the values in the free (zero coupling) case as the coupling approaches zero.

Some of the results of this paper were announced in [DG09b].



1.3. Overview of the paper. Let us outline the remaining parts of this paper. In Sect. 2 we give some necessary background information and recall how the trace map arises in the context of the Fibonacci Hamiltonian and some of its basic properties. Moreover, since our aim is an understanding of weak coupling phenomena, we discuss the case of zero coupling. Section 3 is the heart of the paper. Here we regard the weak coupling scenario as a perturbation of zero coupling and study the dynamics of the trace map and consequences thereof for the structure of the spectrum as a set. Section 4 considers the integrated density of states and proves complete gap labeling at weak coupling and our quantitative version of the linear gap opening result. Spectral measures and transport exponents are studied by means of solution estimates in Sect. 5. Higher-dimensional models generated by a product construction are discussed in Sect. 6, where we confirm some predictions of Even-Dar Mandel and Lifshitz. Since their model is based on the off-diagonal Fibonacci Hamiltonian and the main body of this paper (and most of the other mathematical works on the Fibonacci Hamiltonian) considers the diagonal Fibonacci Hamiltonian, we develop all the basic results for the off-diagonal model in Appendix A and explain there how our work indeed confirms the predictions for the original Even-Dar Mandel-Lifshitz product model.

To assist the reader in locating the proofs of the theorems from the previous subsection, here is where they may be found: We prove Theorem 1.2 (which implies Theorem 1.1) in Subsect. 3.4, Theorem 1.3 in Subsect. 3.1, Theorem 1.4 in Subsect. 6.3, Theorem 1.5 in Subsect. 4.2, and finally Theorem 1.6 in Subsect. 4.3.

2. Preliminaries

2.1. Description of the trace map and previous results. The main tool that we are using here is the so-called trace map. It was originally introduced in [Ka,KKT]; further useful references include [BGJ,BR,HM,Ro]. Let us quickly recall how it arises from the substitution invariance of the Fibonacci potential; see [S87] for detailed proofs of some of the statements below.

The one step transfer matrices associated with the difference equation $H_{V,\omega} u = E u$ are given by

$$T_{V,\omega}(m, E) = \begin{pmatrix} E - V \chi_{[1-\alpha,1)}(m\alpha + \omega \bmod 1) & -1 \\ 1 & 0 \end{pmatrix}.$$

Denote the Fibonacci numbers by $\{F_k\}$, that is, $F_0 = F_1 = 1$ and $F_{k+1} = F_k + F_{k-1}$ for $k \ge 1$. Then, one can show that the matrices

$$M_{-1}(E) = \begin{pmatrix} 1 & -V \\ 0 & 1 \end{pmatrix}, \quad M_0(E) = \begin{pmatrix} E & -1 \\ 1 & 0 \end{pmatrix}$$

and

$$M_k(E) = T_{V,0}(F_k, E) \times \cdots \times T_{V,0}(1, E) \quad \text{for } k \ge 1$$

obey the recursive relations

$$M_{k+1}(E) = M_{k-1}(E) M_k(E)$$

for $k \ge 0$. Passing to the variables

$$x_k(E) = \frac{1}{2} \mathrm{Tr}\, M_k(E),$$

this in turn implies

$$x_{k+1}(E) = 2 x_k(E) x_{k-1}(E) - x_{k-2}(E).$$

These recursion relations exhibit a conserved quantity; namely, we have

$$x_{k+1}(E)^2 + x_k(E)^2 + x_{k-1}(E)^2 - 2 x_{k+1}(E) x_k(E) x_{k-1}(E) - 1 = \frac{V^2}{4}$$

for every $k \ge 0$.

Given these observations, it is then convenient to introduce the trace map

$$T : \mathbb{R}^3 \to \mathbb{R}^3, \quad T(x, y, z) = (2xy - z, x, y).$$

The following function (see footnote 3)

$$G(x, y, z) = x^2 + y^2 + z^2 - 2xyz - 1$$

is invariant under the action of $T$, and hence $T$ preserves the family of cubic surfaces (see footnote 4)

$$S_V = \left\{ (x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 - 2xyz = 1 + \frac{V^2}{4} \right\}.$$

Plots of the surfaces $S_{0.01}$ and $S_{0.5}$ are given in Figs. 2 and 3, respectively.

Footnote 3: The function $G(x, y, z)$ is called the Fricke character, or sometimes the Fricke-Vogt invariant.

Footnote 4: The surface $S_0$ is called the Cayley cubic.

Fig. 2. The surface $S_{0.01}$
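The invariance of $G$ under $T$ is easy to check numerically. The sketch below is an illustration only: it iterates the trace map from the half-traces $(x_1, x_0, x_{-1}) = ((E-V)/2, E/2, 1)$ of $M_1 = M_{-1}M_0$, $M_0$ and $M_{-1}$, and reports how far $G$ drifts from $V^2/4$ in floating point. The function names are hypothetical.

```python
def trace_map(p):
    """Trace map T(x, y, z) = (2xy - z, x, y)."""
    x, y, z = p
    return (2.0 * x * y - z, x, y)


def fricke_vogt(p):
    """G(x, y, z) = x^2 + y^2 + z^2 - 2xyz - 1."""
    x, y, z = p
    return x * x + y * y + z * z - 2.0 * x * y * z - 1.0


def initial_point(E, V):
    """Half-traces (x_1, x_0, x_{-1}) = ((E - V)/2, E/2, 1)."""
    return ((E - V) / 2.0, E / 2.0, 1.0)


def invariant_drift(E, V, steps):
    """Largest |G - V^2/4| seen along the first `steps` iterates."""
    p = initial_point(E, V)
    target = V * V / 4.0
    worst = 0.0
    for _ in range(steps):
        worst = max(worst, abs(fricke_vogt(p) - target))
        p = trace_map(p)
    return worst
```

For energies whose orbit escapes to infinity the iterates grow very quickly, so the floating-point drift is only meaningful for a moderate number of steps.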



Fig. 3. The surface S0.5

It is of course natural to consider the restriction $T_V$ of the trace map $T$ to the invariant surface $S_V$. That is, $T_V : S_V \to S_V$, $T_V = T|_{S_V}$. Denote by $\Omega_V$ the set of points in $S_V$ whose full orbits under $T_V$ are bounded. A priori the set of bounded orbits of $T_V$ could be different from the non-wandering set (see footnote 5) of $T_V$, but our construction of the Markov partition and our analysis of the behavior of $T_V$ near singularities show that here these two sets do coincide. Notice that this is parallel to the construction of the symbolic coding in [Cas].

Let us recall that an invariant closed set $\Lambda$ of a diffeomorphism $f : M \to M$ is hyperbolic if there exists a splitting of the tangent space $T_x M = E^s_x \oplus E^u_x$ at every point $x \in \Lambda$ such that this splitting is invariant under $Df$, the differential $Df$ exponentially contracts vectors from the stable subspaces $\{E^s_x\}$, and the differential of the inverse, $Df^{-1}$, exponentially contracts vectors from the unstable subspaces $\{E^u_x\}$. A hyperbolic set $\Lambda$ of a diffeomorphism $f : M \to M$ is locally maximal if there exists a neighborhood $U$ of $\Lambda$ such that

$$\Lambda = \bigcap_{n \in \mathbb{Z}} f^n(U).$$

We want to recall the following central result.

Theorem 2.1 ([Cas,DG09a,Can]). For $V \neq 0$, the set $\Omega_V$ is a locally maximal hyperbolic set of $T_V : S_V \to S_V$. It is homeomorphic to a Cantor set.

Footnote 5: A point $p \in M$ of a diffeomorphism $f : M \to M$ is wandering if there exists a neighborhood $O(p) \subset M$ such that $f^k(O) \cap O = \emptyset$ for any $k \in \mathbb{Z} \setminus \{0\}$. The non-wandering set of $f$ is the set of points that are not wandering.


Denote by $\ell_V$ the line

$$\ell_V = \left\{ \left( \frac{E - V}{2}, \frac{E}{2}, 1 \right) : E \in \mathbb{R} \right\}.$$

It is easy to check that $\ell_V \subset S_V$. The second central result about the trace map we wish to recall was proven by Sütő in [S87].

Theorem 2.2 (Sütő 1987). An energy $E$ belongs to the spectrum of $H_{V,\omega}$ if and only if the positive semiorbit of the point $\left( \frac{E-V}{2}, \frac{E}{2}, 1 \right)$ under iterates of the trace map $T$ is bounded.

In fact, as also shown by Sütő in [S87], the trace map can be used to generate canonical approximations of the spectrum, $\Sigma_V$. Namely, consider the following sets:

$$\Sigma_V^{(n)} = \left\{ E \in \mathbb{R} : \text{for } (x_n, y_n, z_n) = T^n\left( \tfrac{E-V}{2}, \tfrac{E}{2}, 1 \right), \text{ we have } \min\{|x_n|, |y_n|\} \le 1 \right\}.$$

Then, we have $\Sigma_V^{(n)} \supseteq \Sigma_V^{(n+1)} \to \Sigma_V$, that is,

$$\Sigma_V = \bigcap_{n \in \mathbb{Z}_+} \Sigma_V^{(n)}.$$
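The approximating sets $\Sigma_V^{(n)}$ translate directly into a membership test. The following sketch is a hedged illustration of that definition rather than the code behind the figures; the function name is hypothetical, and for large $n$ escaping orbits overflow to infinity in floating point, which the test then reports as "not in the set".

```python
def trace_map(p):
    """Trace map T(x, y, z) = (2xy - z, x, y)."""
    x, y, z = p
    return (2.0 * x * y - z, x, y)


def in_spectrum_approximation(E, V, n):
    """Return True if E lies in the n-th approximating set of the spectrum.

    The test applies T n times to ((E - V)/2, E/2, 1) and checks
    min(|x_n|, |y_n|) <= 1, as in the definition above.
    """
    p = ((E - V) / 2.0, E / 2.0, 1.0)
    for _ in range(n):
        p = trace_map(p)
    x, y, _ = p
    return min(abs(x), abs(y)) <= 1.0


def approximate_spectrum(V, n, energies):
    """Filter a list of sample energies down to those in the n-th set."""
    return [E for E in energies if in_spectrum_approximation(E, V, n)]
```

Scanning a grid of energies for several values of $V$ with a fixed $n$ produces the kind of picture described for Figs. 1 and 4.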

Figure 4 shows a plot of the sets $\{(E, V) : E \in \Sigma_V^{(n)}, 0 \le V \le \frac{3}{4}\}$ for values of $n$ up to 20 (see footnote 6). These plots illustrate nicely both the linear gap opening and the fact that the size of a gap depends on its label; compare Theorems 1.3 and 1.6. To further document linear gap opening through numerics, Fig. 5 zooms into a portion of Fig. 4 near a point $(E, 0)$ for an energy $E$ where a gap opens; we have chosen $E \approx 0.7248$.

2.2. Properties of the trace map for $V = 0$. We will regard the case of small $V$ as a small perturbation of the case $V = 0$. This subsection is devoted to the study of this “unperturbed case.”

Denote by $S$ the part of the surface $S_0$ inside the cube $\{|x| \le 1, |y| \le 1, |z| \le 1\}$. The surface $S$ is homeomorphic to $S^2$, invariant, smooth everywhere except at the four points $P_1 = (1, 1, 1)$, $P_2 = (-1, -1, 1)$, $P_3 = (1, -1, -1)$, and $P_4 = (-1, 1, -1)$, where $S$ has conic singularities, and the trace map $T$ restricted to $S$ is a factor of the hyperbolic automorphism of $\mathbb{T}^2 = \mathbb{R}^2 / \mathbb{Z}^2$ given by

$$A(\theta, \varphi) = (\theta + \varphi, \theta) \pmod 1.$$

The semiconjugacy is given by the map

$$F : (\theta, \varphi) \mapsto (\cos 2\pi(\theta + \varphi), \cos 2\pi\theta, \cos 2\pi\varphi).$$

The map $A$ is hyperbolic, and is given by the matrix $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, which has eigenvalues

$$\mu = \frac{1 + \sqrt{5}}{2} \quad \text{and} \quad -\mu^{-1} = \frac{1 - \sqrt{5}}{2}.$$

Footnote 6: This is also how the plot in Fig. 1 was obtained. In fact, what is shown there is the set $\{(E, V) : E \in \Sigma_V^{(20)}, 0 \le V \le 2\}$.
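The semiconjugacy relation $F \circ A = T \circ F$ follows from the identity $2\cos a \cos b = \cos(a+b) + \cos(a-b)$. As a quick numerical illustration (a sketch with hypothetical function names, not part of the original text), one can compare both sides at sample points and confirm that $\mu$ and $-\mu^{-1}$ are roots of the characteristic polynomial $t^2 - t - 1$ of the matrix $A$:

```python
import math


def torus_map(theta, phi):
    """A(theta, phi) = (theta + phi, theta) mod 1."""
    return ((theta + phi) % 1.0, theta % 1.0)


def semiconjugacy(theta, phi):
    """F(theta, phi) = (cos 2pi(theta+phi), cos 2pi theta, cos 2pi phi)."""
    two_pi = 2.0 * math.pi
    return (math.cos(two_pi * (theta + phi)),
            math.cos(two_pi * theta),
            math.cos(two_pi * phi))


def trace_map(p):
    """Trace map T(x, y, z) = (2xy - z, x, y)."""
    x, y, z = p
    return (2.0 * x * y - z, x, y)


def semiconjugacy_defect(theta, phi):
    """Largest coordinate difference between F(A(.)) and T(F(.))."""
    lhs = semiconjugacy(*torus_map(theta, phi))
    rhs = trace_map(semiconjugacy(theta, phi))
    return max(abs(a - b) for a, b in zip(lhs, rhs))


MU = (1.0 + math.sqrt(5.0)) / 2.0  # golden mean, the unstable eigenvalue
```

The point $(\theta, \varphi) = (0, 0)$ is sent by $F$ to the singularity $P_1 = (1, 1, 1)$, and $(1/2, 1/2)$ is sent to $P_3 = (1, -1, -1)$.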

Fig. 4. The sets $\{(E, V) : E \in \Sigma_V^{(n)}, 0 \le V \le \frac{3}{4}\}$ through $n = 20$

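Let us denote by $\mathbf{v}_u, \mathbf{v}_s \in \mathbb{R}^2$ the unstable and stable eigenvectors of $A$:

$$A\mathbf{v}_u = \mu \mathbf{v}_u, \quad A\mathbf{v}_s = -\mu^{-1} \mathbf{v}_s, \quad \|\mathbf{v}_u\| = \|\mathbf{v}_s\| = 1.$$

Fix some small $\zeta > 0$ and define the stable (resp., unstable) cone fields on $\mathbb{R}^2$ in the following way:

$$K^s_p = \{v \in T_p\mathbb{R}^2 : v = v^u \mathbf{v}_u + v^s \mathbf{v}_s, \ |v^s| > \zeta^{-1} |v^u|\},$$
$$K^u_p = \{v \in T_p\mathbb{R}^2 : v = v^u \mathbf{v}_u + v^s \mathbf{v}_s, \ |v^u| > \zeta^{-1} |v^s|\}.$$

These cone fields are invariant:

$$\forall\, v \in K^u_p : \ Av \in K^u_{A(p)}, \qquad \forall\, v \in K^s_p : \ A^{-1}v \in K^s_{A^{-1}(p)}. \tag{4}$$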

Fig. 5. The sets $\{(E, V) : E \in \Sigma_V^{(n)}\}$ near $(0.7248, 0)$ for $n = 15, 20, 25, 30$

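Also, the iterates of the map $A$ expand vectors from the unstable cones, and the iterates of the map $A^{-1}$ expand vectors from the stable cones:

$$\forall\, v \in K^u_p \ \forall\, n \in \mathbb{N} : \ |A^n v| > \frac{1}{\sqrt{1 + \zeta^2}}\, \mu^n |v|,$$
$$\forall\, v \in K^s_p \ \forall\, n \in \mathbb{N} : \ |A^{-n} v| > \frac{1}{\sqrt{1 + \zeta^2}}\, \mu^n |v|.$$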
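Since $A$ is symmetric, its unit eigenvectors are orthogonal, and the expansion estimate reduces to $|A^n v|^2 \ge \mu^{2n} (v^u)^2 > \mu^{2n} |v|^2 / (1 + \zeta^2)$ for $v$ in the unstable cone. The following sketch checks cone membership and the expansion ratio numerically; the function names and the sample values of $\zeta$ are illustrative assumptions.

```python
import math

MU = (1.0 + math.sqrt(5.0)) / 2.0
_NORM = math.sqrt(1.0 + MU * MU)
V_U = (MU / _NORM, 1.0 / _NORM)    # unit vector with A v_u = mu * v_u
V_S = (-1.0 / _NORM, MU / _NORM)   # unit vector with A v_s = -(1/mu) * v_s


def apply_A(v):
    """Apply the matrix A = [[1, 1], [1, 0]] to a vector v."""
    return (v[0] + v[1], v[0])


def eigen_coordinates(v):
    """Coefficients (v^u, v^s) of v in the orthonormal eigenbasis."""
    return (v[0] * V_U[0] + v[1] * V_U[1],
            v[0] * V_S[0] + v[1] * V_S[1])


def in_unstable_cone(v, zeta):
    """Cone condition |v^u| > |v^s| / zeta."""
    cu, cs = eigen_coordinates(v)
    return abs(cu) > abs(cs) / zeta


def expansion_ratio(v, n):
    """|A^n v| / |v|."""
    w = v
    for _ in range(n):
        w = apply_A(w)
    return math.hypot(w[0], w[1]) / math.hypot(v[0], v[1])
```

For example, with $\zeta = 0.1$ the vector $\mathbf{v}_u + 0.05\,\mathbf{v}_s$ lies in the unstable cone, stays there under $A$, and its expansion ratio after $n$ steps exceeds $\mu^n / \sqrt{1 + \zeta^2}$.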

The families of cones $\{K^s\}$ and $\{K^u\}$ invariant under $A$ can also be considered on $\mathbb{T}^2$. The differential of the semiconjugacy $F$ sends these cone families to stable and unstable cone families on $S \setminus \{P_1, P_2, P_3, P_4\}$. Let us denote these images by $\{\mathcal{K}^s\}$ and $\{\mathcal{K}^u\}$.

Lemma 2.3 (Lemma 3.1 from [DG09a]). The differential of the semiconjugacy $DF$ induces a map of the unit bundle of $\mathbb{T}^2$ to the unit bundle of $S \setminus \{P_1, P_2, P_3, P_4\}$. The derivatives of the restrictions of this map to fibers are uniformly bounded. In particular, the sizes of the cones in the families $\{\mathcal{K}^s\}$ and $\{\mathcal{K}^u\}$ are uniformly bounded away from zero.

Finally, consider the Markov partition for the map $A : \mathbb{T}^2 \to \mathbb{T}^2$ that is shown in Fig. 6 (and which had already appeared in [Cas]; for more details on Markov partitions for two-dimensional hyperbolic maps see [PT, App. 2]). Its image under the map $F : \mathbb{T}^2 \to S$ is a Markov partition for the pseudo-Anosov map $T : S \to S$.

Fig. 6. The Markov partition for the map $A$

3. The Spectrum as a Set

It is not hard to see that the line $\ell_V$ is transversal to the stable manifolds of the hyperbolic set $\Omega_V$ for small values of $V$ (see, e.g., [DG09a, Lemma 5.5]). Therefore the intersection of $\ell_V$ and $W^s(\Omega_V)$ (and, hence, $\Sigma_V$) is a dynamically defined Cantor set (see, for example, [T]). In this section we study the properties of this one-parameter family of Cantor sets. Namely, in Subsect. 3.1 we prove that the size of a given gap in the Cantor set tends to zero linearly as the coupling constant (the parameter) tends to zero. In Subsect. 3.2 we use normally hyperbolic theory to introduce a normalizing coordinate system in a neighborhood of a singularity. Then in Subsect. 3.3 the order of the gaps that



Fig. 7. S0.1 and Per2 (T ) near (1, 1, 1)

is related to the dynamics is chosen. Roughly speaking, the longer it takes for a gap to leave the union of the elements of the Markov partition, the higher is the order of the gap. Next, in Subsect. 3.5 this normalizing coordinate system is used to study the distortion properties of the transitions through a neighborhood of a singularity. Finally, in the last three subsections we bring all the pieces together and prove the distortion property that immediately implies Theorem 1.2.

3.1. Linear gap opening as the potential is turned on. Here we prove Theorem 1.3. Consider the dynamics of $T$ in a neighborhood of $P_1 = (1, 1, 1)$. Due to the symmetries of the trace map this will also provide information on the dynamics near the other singularities. Take $r_0 > 0$ small and let $O_{r_0}(P_1)$ be an $r_0$-neighborhood of the point $P_1 = (1, 1, 1)$ in $\mathbb{R}^3$. Let us consider the set $\mathrm{Per}_2(T)$ of periodic points of $T$ of period 2; compare Figs. 7 and 8.

Lemma 3.1. We have

$$\mathrm{Per}_2(T) = \left\{ (x, y, z) : x \in \left(-\infty, \tfrac{1}{2}\right) \cup \left(\tfrac{1}{2}, \infty\right), \ y = \frac{x}{2x - 1}, \ z = x \right\}.$$

Proof. Direct calculation.

Notice that in a neighborhood $U_1$ of $P_1$, the intersection $I \equiv \mathrm{Per}_2(T) \cap U_1$ is a smooth curve that is normally hyperbolic with respect to $T$ (see, e.g., App. 1 in [PT] for the formal definition of normal hyperbolicity). Therefore, the local center-stable manifold $W^{cs}_{loc}(I)$ and the local center-unstable manifold $W^{cu}_{loc}(I)$ defined by

$$W^{cs}_{loc}(I) = \left\{ p \in O_{r_0}(P_1) : T^n(p) \in O_{r_0}(P_1) \text{ for all } n \in \mathbb{N} \right\},$$
$$W^{cu}_{loc}(I) = \left\{ p \in O_{r_0}(P_1) : T^{-n}(p) \in O_{r_0}(P_1) \text{ for all } n \in \mathbb{N} \right\}$$

are smooth two-dimensional surfaces. Also, the local strong stable manifold $W^{ss}_{loc}(P_1)$ and the local strong unstable manifold $W^{uu}_{loc}(P_1)$ of the fixed point $P_1$, defined by

$$W^{ss}_{loc}(P_1) = \left\{ p \in W^{cs}_{loc}(I) : T^n(p) \to P_1 \text{ as } n \to \infty \right\},$$
$$W^{uu}_{loc}(P_1) = \left\{ p \in W^{cu}_{loc}(I) : T^{-n}(p) \to P_1 \text{ as } n \to \infty \right\},$$

are smooth curves.

Fig. 8. $S_{0.2}$ and $\mathrm{Per}_2(T)$ near $(1, 1, 1)$

The Markov partition for the pseudo-Anosov map $T : S \to S$ can be extended to a Markov partition for the map $T_V : S_V \to S_V$ for small values of $V$. Namely, there are four singular points $P_1 = (1, 1, 1)$, $P_2 = (-1, -1, 1)$, $P_3 = (1, -1, -1)$, and $P_4 = (-1, 1, -1)$ of $S$. The point $P_1$ is a fixed point of $T$, and the points $P_2, P_3, P_4$ form a periodic orbit of period 3. For small $V$, on the surface $S_V$ near $P_1$ there is a hyperbolic orbit of the map $T_V = T|_{S_V}$ of period 2, and near the orbit $\{P_2, P_3, P_4\}$ there is a hyperbolic periodic orbit of period 6. Pieces of stable and unstable manifolds of these 8 periodic points form a Markov partition for $T_V : S_V \to S_V$. For $V \neq 0$, the elements of this Markov partition are disjoint. Let us denote these six rectangles (the elements of the Markov partition) by $R^1_V, R^2_V, \ldots, R^6_V$. Let us also denote $R_V = \bigcup_{i=1}^{6} R^i_V$.

It is convenient now to consider $T_V^6 : S_V \to S_V$ since for $T_V^6$, each of the eight periodic points that were born from singularities becomes a fixed point. Due to the symmetries of the trace map, the dynamics of $T^6$ is the same in a neighborhood of each of the singularities $P_1$, $P_2$, $P_3$, and $P_4$. The set of fixed points of $T^6$ in a neighborhood $U_1$ of $P_1$ is a smooth curve $\mathrm{Fix}(T^6, O_{r_0}(P_1)) = \mathrm{Per}_2(T) \cap O_{r_0}(P_1)$; see Lemma 3.1 above. Each of the fixed points has one of the eigenvalues equal to 1, one greater than 1, and one smaller than 1 in absolute value. Therefore the curve $\mathrm{Fix}(T^6, O_{r_0}(P_1))$ is a normally hyperbolic manifold, and its stable set $W^s(\mathrm{Fix}(T^6, O_{r_0}(P_1)))$ is a smooth two-dimensional surface; see [HPS]. The strong stable manifolds form a $C^1$-foliation of $W^s(\mathrm{Fix}(T^6, O_{r_0}(P_1)))$; see [PSW, Theorem B].

If $\mathbf{p}_V$ and $\mathbf{q}_V$ are two fixed points of $T_V^6$ in $O_{r_0}(P_1)$, then these points form the intersection of the curve $\mathrm{Fix}(T^6, O_{r_0}(P_1))$ with $S_V$ and can be found from the system

$$\begin{cases} y = \dfrac{x}{2x - 1}, \\ z = x, \\ x^2 + y^2 + z^2 - 2xyz = 1 + \dfrac{V^2}{4}. \end{cases}$$

If we parameterize $\mathrm{Fix}(T^6, O_{r_0}(P_1))$ as $\left\{ x = t + 1, \ z = t + 1, \ y = \frac{t+1}{2t+1} \right\}$, we get

$$(t + 1)^2 + \left( \frac{t + 1}{2t + 1} \right)^2 + (t + 1)^2 - 2(t + 1)^2\, \frac{t + 1}{2t + 1} = 1 + \frac{V^2}{4},$$

or

$$\frac{4t^4 + 10t^3 + 9t^2 + 4t + 1}{(2t + 1)^2} = 1 + \frac{V^2}{4}.$$

Since for the function $f(t) = \frac{4t^4 + 10t^3 + 9t^2 + 4t + 1}{(2t+1)^2}$, we have $f(0) = 1$, $f'(0) = 0$, $f''(0) > 0$, the distance between the points $\mathbf{p}_V$ and $\mathbf{q}_V$ is of order $|V|$ for small values of $V$.
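The claim that the two fixed points are at distance of order $|V|$ can be illustrated numerically. The sketch below (hypothetical helper names, bisection chosen only for simplicity) solves $f(t) = 1 + V^2/4$ on either side of $t = 0$ and measures the distance between the corresponding points of the curve $\{x = z = t + 1, \ y = (t+1)/(2t+1)\}$. Since $f(t) - 1 \approx 5t^2$ near $0$ and the curve has tangent vector $(1, -1, 1)$ at $t = 0$, the ratio of this distance to $V$ should be close to $\sqrt{3/5}$ for small $V$.

```python
import math


def trace_map(p):
    """Trace map T(x, y, z) = (2xy - z, x, y)."""
    x, y, z = p
    return (2.0 * x * y - z, x, y)


def curve_point(t):
    """Point of the period-two curve with x = z = t + 1, y = x / (2x - 1)."""
    x = t + 1.0
    return (x, x / (2.0 * x - 1.0), x)


def f(t):
    """Value of x^2 + y^2 + z^2 - 2xyz along the curve."""
    return (4 * t**4 + 10 * t**3 + 9 * t**2 + 4 * t + 1) / (2 * t + 1) ** 2


def solve(target, lo, hi, iters=200):
    """Bisection for f(t) = target on [lo, hi]; assumes a sign change."""
    g_lo = f(lo) - target
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = f(mid) - target
        if (g_mid > 0.0) == (g_lo > 0.0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fixed_point_distance(V):
    """Distance between the two points of the curve lying on S_V near P_1."""
    target = 1.0 + V * V / 4.0
    p = curve_point(solve(target, 0.0, 0.2))
    q = curve_point(solve(target, -0.2, 0.0))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

The brackets $[0, 0.2]$ and $[-0.2, 0]$ are an assumption that is adequate for small $V$ (say $V \le 0.5$).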

Lemma 3.2. Let $W \subset \mathbb{R}^3$ be a smooth surface with a $C^1$-foliation on it. Let $\xi \subset W$ be a smooth curve transversal to the foliation. Fix a leaf $L \subset W$ of the foliation, and denote $P = L \cap \xi$. Take a point $Q \neq P$, $Q \in L$, and a line $\ell_0 \subset \mathbb{R}^3$, $Q \in \ell_0$, tangent to $W$ at $Q$, but not tangent to the leaf $L$. Suppose that a family of lines $\{\ell_V\}_{V \in (0, V_0)}$ is given such that $\ell_V \to \ell_0$ as $V \to 0$, each line $\ell_V$, $V > 0$, intersects $W$ at two points $p_V$ and $q_V$, and $p_V \to Q$, $q_V \to Q$ as $V \to 0$. Denote by $L_{p_V}$ and $L_{q_V}$ the leaves of the foliation that contain $p_V$ and $q_V$, respectively. Denote $\mathbf{p}_V = L_{p_V} \cap \xi$ and $\mathbf{q}_V = L_{q_V} \cap \xi$. Then there exists a finite non-zero limit

$$\lim_{V \to 0} \frac{\mathrm{dist}(p_V, q_V)}{\mathrm{dist}(\mathbf{p}_V, \mathbf{q}_V)}.$$

Proof. Since $\ell_0$ is not tangent to the leaf $L$, there exists a curve $\tilde{\xi} \subset W$ tangent to $\ell_0$, transversal to the foliation, and such that $Q \in \tilde{\xi}$. Set $\tilde{p}_V = L_{p_V} \cap \tilde{\xi}$, $\tilde{q}_V = L_{q_V} \cap \tilde{\xi}$. Since the foliation is $C^1$, there exists a finite non-zero limit

$$\lim_{V \to 0} \frac{\mathrm{dist}(\mathbf{p}_V, \mathbf{q}_V)}{\mathrm{dist}(\tilde{p}_V, \tilde{q}_V)} \neq 0. \tag{5}$$

Let us consider a plane $\Pi$ tangent to $W$ at $Q$, and let $\pi : W \to \Pi$ be an orthogonal projection (well defined and smooth in a neighborhood of $Q$). It is clear that

$$\lim_{V \to 0} \frac{\mathrm{dist}(\pi(\tilde{p}_V), \pi(\tilde{q}_V))}{\mathrm{dist}(\tilde{p}_V, \tilde{q}_V)} = 1. \tag{6}$$

Also, since $\ell_V \to \ell_0$ as $V \to 0$, we have

$$\lim_{V \to 0} \frac{\mathrm{dist}(\pi(p_V), \pi(q_V))}{\mathrm{dist}(p_V, q_V)} = 1. \tag{7}$$

Finally, since $\pi$ sends the $C^1$-foliation of $W$ to a $C^1$-foliation on $\Pi$, the projection along this foliation from $\pi(\ell_V)$ to $\ell_0$ is $C^1$-close to isometry. In particular,

$$\lim_{V \to 0} \frac{\mathrm{dist}(\pi(p_V), \pi(q_V))}{\mathrm{dist}(\pi(\tilde{p}_V), \pi(\tilde{q}_V))} = 1. \tag{8}$$

The statement of Lemma 3.2 follows now from (5)–(8).

Proof of Theorem 1.3. Pick any bounded gap in the spectrum and the corresponding gap $O$ in $\ell_V \setminus W^s(\Omega_V)$. The boundary points $p_V$ and $q_V$ of $O$ belong to stable manifolds of two fixed points near one of the singularities $P_1$, $P_2$, $P_3$, or $P_4$. Without loss of generality assume that those fixed points are $\mathbf{p}_V$ and $\mathbf{q}_V$ in a neighborhood of $P_1$. The surface $W = W^s(\mathrm{Fix}(T^6, O_{r_0}(P_1)))$ is $C^1$-foliated by strong stable manifolds of fixed points from $\mathrm{Fix}(T^6, O_{r_0}(P_1))$, and the curve $\xi = \mathrm{Fix}(T^6, O_{r_0}(P_1))$ is transversal to this foliation. Therefore Lemma 3.2 implies that there exists a finite non-zero limit

$$\lim_{V \to 0} \frac{\mathrm{dist}(p_V, q_V)}{\mathrm{dist}(\mathbf{p}_V, \mathbf{q}_V)},$$

and, since $\mathrm{dist}(\mathbf{p}_V, \mathbf{q}_V)$ is of order $|V|$ for small values of $V$, this implies Theorem 1.3.

3.2. Choice of a coordinate system in a neighborhood of a singular point. Due to the smoothness of the invariant manifolds of the curve of periodic points of period two described in Sect. 3.1, there exists a smooth change of coordinates $\Phi : O_{r_0}(P_1) \to \mathbb{R}^3$ such that $\Phi(P_1) = (0, 0, 0)$ and

• $\Phi(I)$ is a part of the line $\{x = 0, z = 0\}$;
• $\Phi(W^{cs}_{loc}(I))$ is a part of the plane $\{z = 0\}$;
• $\Phi(W^{cu}_{loc}(I))$ is a part of the plane $\{x = 0\}$;
• $\Phi(W^{ss}_{loc}(P_1))$ is a part of the line $\{y = 0, z = 0\}$;
• $\Phi(W^{uu}_{loc}(P_1))$ is a part of the line $\{x = 0, y = 0\}$.

Denote $f = \Phi \circ T \circ \Phi^{-1}$. In this case,

$$A \equiv Df(0, 0, 0) = D(\Phi \circ T \circ \Phi^{-1})(0, 0, 0) = \begin{pmatrix} \lambda^{-1} & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & \lambda \end{pmatrix},$$

where $\lambda$ is the largest eigenvalue of the differential $DT(P_1) : T_{P_1}\mathbb{R}^3 \to T_{P_1}\mathbb{R}^3$,

$$DT(P_1) = \begin{pmatrix} 2 & 2 & -1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad \lambda = \frac{3 + \sqrt{5}}{2} = \mu^2.$$

Let us denote $\mathfrak{S}_V = \Phi(S_V)$. Then, away from $(0, 0, 0)$, the family $\{\mathfrak{S}_V\}$ is a smooth family of surfaces, $\mathfrak{S}_0$ is diffeomorphic to a cone, contains the lines $\{y = 0, z = 0\}$ and $\{x = 0, y = 0\}$, and at each non-zero point on these lines, it has a quadratic tangency with a horizontal or vertical plane.

Due to the symmetries of the trace map, similar changes of coordinates exist in a neighborhood of each of the other singularities. Denote $O_{r_0} = O_{r_0}(P_1) \cup O_{r_0}(P_2) \cup O_{r_0}(P_3) \cup O_{r_0}(P_4)$.
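The eigenvalues of the linearization $DT(P_1)$ quoted above can be confirmed by a short computation: the characteristic polynomial of the matrix is $t^3 - 2t^2 - 2t + 1 = (t + 1)(t^2 - 3t + 1)$, with roots $-1$ and $(3 \pm \sqrt{5})/2$. A minimal sketch (hypothetical function names, plain Python without a linear algebra library):

```python
import math


def jacobian_trace_map(x, y, z):
    """Jacobian matrix of T(x, y, z) = (2xy - z, x, y) at the point (x, y, z)."""
    return [[2.0 * y, 2.0 * x, -1.0],
            [1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0]]


def char_poly_3x3(M, t):
    """det(t*I - M) for a 3x3 matrix M, by cofactor expansion."""
    a = [[(t if i == j else 0.0) - M[i][j] for j in range(3)] for i in range(3)]
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


LAMBDA = (3.0 + math.sqrt(5.0)) / 2.0  # largest eigenvalue of DT(P_1)
MU = (1.0 + math.sqrt(5.0)) / 2.0      # golden mean; LAMBDA equals MU**2
```

Evaluating `char_poly_3x3(jacobian_trace_map(1, 1, 1), t)` at `LAMBDA`, `1 / LAMBDA` and `-1` gives zero up to rounding, in agreement with the diagonal form of $A$.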



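Fix a small constant $C > 0$ and introduce the following cone fields in $\mathbb{R}^3$:

$$\mathbf{K}^u_p = \{v \in T_p\mathbb{R}^3, \ v = v_{xy} + v_z : |v_z| > C \sqrt{|z_p|}\, |v_{xy}|\}, \tag{9}$$
$$\widetilde{\mathbf{K}}^u_p = \{v \in T_p\mathbb{R}^3, \ v = v_{xy} + v_z : |v_z| > C^{-1} |v_{xy}|\}, \tag{10}$$
$$\mathbf{K}^s_p = \{v \in T_p\mathbb{R}^3, \ v = v_x + v_{yz} : |v_x| > C \sqrt{|x_p|}\, |v_{yz}|\}, \tag{11}$$
$$\widetilde{\mathbf{K}}^s_p = \{v \in T_p\mathbb{R}^3, \ v = v_x + v_{yz} : |v_x| > C^{-1} |v_{yz}|\}. \tag{12}$$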

Lemma 3.3. There are $r_1 \in (0, r_0)$ and $m_0 \in \mathbb{N}$ such that the following holds.

1. $T^{m_0}(\ell_0) \cap O_{r_1}$ is a union of two connected curves $\gamma_1$ and $\gamma_2$, and $\Phi(\gamma_i)$, $i = 1, 2$, is tangent to the cone field $\widetilde{\mathbf{K}}^u$;
2. $F^{-1}(T^{m_0}(\ell_0 \cap S))$ is tangent to the cone field $K^u$ (defined by (4)).

Definition 3.4. We will call the rectangle $R^6$ (the element of the Markov partition) the opposite to singularities $P_1$ and $P_3$, and we will call the rectangle $R^5$ the opposite to singularities $P_2$ and $P_4$.

Notice that $T^{m_0}(\ell_0 \cap S)$ consists of a curve that connects $P_1$ with a stable boundary of $R^6$, a finite number of curves $\xi_i$ that connect two stable sides of an element of the Markov partition and such that $F^{-1}(\xi_i)$ is tangent to the cone field $K^u$, and a curve that connects some other singularity with a stable boundary of its opposite rectangle.

Now let us take $r_2 \in (0, r_1)$ so small that $\bigcup_{i=-3}^{3} T^i(O_{r_2}) \subset O_{r_1}$. For small $V$, denote by $S_{V, O_{r_2}}$ the bounded component of $S_V \setminus O_{r_2}$. The family $\{S_{V, O_{r_2}}\}_{V \in [0, V_0)}$ of surfaces with boundary depends smoothly on the parameter and has uniformly bounded curvature. For small $V$, a projection $\pi_V : S_{V, O_{r_2}} \to S$ is defined. The map $\pi_V$ is smooth, and if $p \in S$, $q \in S_{V, O_{r_2}}$, and $\pi_V(q) = p$, then $T_p S$ and $T_q S_V$ are close. Denote by $\mathcal{K}^u_V$ (resp., $\mathcal{K}^s_V$) the image of the cone $\mathcal{K}^u$ (resp., $\mathcal{K}^s$) under the differential of $\pi_V^{-1}$. Denote $F_V = \pi_V^{-1} \circ F$, $F_V : F^{-1}(S \setminus O_{r_2}) \to S_V$. Denote also $\Psi_V = \Phi \circ F_V$.

Compactness and Mean Value Theorem type arguments imply the following statement.

Lemma 3.5. There are $V_0 > 0$ and $\widetilde{C} > 0$ such that the following holds. Suppose that $a, b \in \mathbb{T}^2 \setminus F^{-1}(O_{r_2})$, $v_a \in T_a\mathbb{T}^2$, $v_b \in T_b\mathbb{T}^2$. Then the following inequalities hold for all $V \in [0, V_0]$:

$$\mathrm{dist}(F_V(a), F_V(b)) \le \widetilde{C}\, \mathrm{dist}(a, b),$$
$$\mathrm{dist}(a, b) \le \widetilde{C}\, \mathrm{dist}(F_V(a), F_V(b)),$$
$$\angle(DF_V(v_a), DF_V(v_b)) \le \widetilde{C} \left( \angle(v_a, v_b) + \mathrm{dist}(a, b) \right),$$
$$\angle(v_a, v_b) \le \widetilde{C} \left( \angle(DF_V(v_a), DF_V(v_b)) + \mathrm{dist}(F_V(a), F_V(b)) \right).$$

Moreover, if $\Psi_V(a)$ and $\Psi_V(b)$ are defined, then

$$\mathrm{dist}(\Psi_V(a), \Psi_V(b)) \le \widetilde{C}\, \mathrm{dist}(a, b),$$
$$\mathrm{dist}(a, b) \le \widetilde{C}\, \mathrm{dist}(\Psi_V(a), \Psi_V(b)),$$
$$\angle(D\Psi_V(v_a), D\Psi_V(v_b)) \le \widetilde{C} \left( \angle(v_a, v_b) + \mathrm{dist}(a, b) \right),$$
$$\angle(v_a, v_b) \le \widetilde{C} \left( \angle(D\Psi_V(v_a), D\Psi_V(v_b)) + \mathrm{dist}(\Psi_V(a), \Psi_V(b)) \right).$$



Finally, notice that if $C$ and $V_0$ are taken sufficiently small, then the cone fields $K^u$ on $\mathbb{T}^2$ and $\mathbf{K}^u$, $\widetilde{\mathbf{K}}^u$ respect each other in the following sense. Suppose that $a \in \mathbb{T}^2 \setminus F^{-1}(O_{r_2})$ is such that $\Psi_V(a)$ is defined for $V \in [0, V_0]$, and $v_a \in K^u_a$. Lemma 2.3 implies that

$$D\Psi_V(v_a) \in \mathbf{K}^u_{\Psi_V(a)}.$$

On the other hand, if $b \in \mathbb{T}^2 \setminus F^{-1}(O_{r_2})$ is such that $\Psi_V(b)$ is defined and $D\Psi_V(v_b) \in \widetilde{\mathbf{K}}^u_{\Psi_V(b)}$, then $v_b \in K^u_b$.

3.3. Ordering of the gaps. To estimate the thickness of a Cantor set from below (or the denseness from above), it is enough to consider one particular ordering of its gaps. Here we choose a convenient ordering of gaps in $\ell_V \cap W^s(\Omega_V)$ (which is affine equivalent to $\Sigma_V$).

The trace map $T_V$, $V \neq 0$, has two periodic points of period 2, denote them by $P_1(V)$ and $P_1'(V)$, and six periodic points of period 6, denote them by $P_2(V)$, $P_2'(V)$, $P_3(V)$, $P_3'(V)$, $P_4(V)$, and $P_4'(V)$. In Sect. 3.1 we showed that the distance between $P_i(V)$ and $P_i'(V)$ is of order $|V|$. We can choose the notation (swapping the notation for $P_1(V)$ and $P_1'(V)$, and/or for $P_2(V)$ and $P_2'(V)$ if necessary) in such a way that the following lemma holds.

Lemma 3.6. If $V$ is small enough, the line $\ell_V$ contains points $B_1(V) \in \ell_V \cap W^{ss}(P_1(V))$ and $B_2(V) \in \ell_V \cap W^{ss}(P_2(V))$ such that every point of the line which is not between $B_1(V)$ and $B_2(V)$ tends to infinity under iterates of $T_V$.

Denote by $l_V$ the closed interval on $\ell_V$ between the points $B_1(V)$ and $B_2(V)$. It is known that the set of points on $l_V$ with bounded positive semiorbits is a dynamically defined Cantor set; see [DG09a,Can]. We would like to estimate the thickness of this Cantor set.

Lemma 3.7. There are $m_0 \in \mathbb{N}$, $0 < C_1 < C_2$ and $V_0 > 0$ such that for all $V \in [0, V_0]$ the following holds:

1. $T_V^{m_0}(l_V) \cap O_{r_1}$ is a union of two connected curves $\gamma_1(V)$ and $\gamma_2(V)$, and $\Phi(\gamma_i(V))$, $i = 1, 2$, is tangent to the cone field $\widetilde{\mathbf{K}}^u$;
2. $F_V^{-1}(T^{m_0}(l_V) \setminus O_{r_2})$ is tangent to the cone field $K^u$;
3. $T_V^{m_0}(l_V)$ consists of a curve that connects $W^{ss}_{loc}(P_1)$ with a stable boundary of $R^6$, a finite number of curves $\xi_i(V)$, each of which connects two stable sides of an element of the Markov partition and is such that $F_V^{-1}(\xi_i)$ is tangent to the cone field $K^u$, a curve that connects $W^{ss}_{loc}(P_i)$, $i \in \{2, 3, 4\}$, with a stable boundary of an opposite rectangle to $P_i$, and some “gaps” between the curves mentioned above. The length of these “gaps” is between $C_1 V$ and $C_2 V$ for all small enough $V$.

Proof. The statement holds for $V = 0$ (see Lemma 3.3). For $V$ positive but small enough, Properties 1 and 2 hold by continuity, and Property 3 follows from the fact that the distance between finite pieces of strong stable manifolds of the points $\mathbf{p}_V$ and $\mathbf{q}_V$ is of order $V$, and these strong stable manifolds form the stable parts of the boundary of the Markov partition for $T_V$.



We will call the preimages (under $T_V^{m_0}$) of the gaps defined in Lemma 3.7 the gaps of order 1.

Definition 3.8. A smooth curve $\gamma \subset S_V$ is tangent to an unstable cone field if $F_V^{-1}(\gamma \setminus O_{r_2})$ is tangent to $K^u$, and $\Phi(\gamma \cap O_{r_1})$ is tangent to $\mathbf{K}^u$.

Definition 3.9. A curve is of type one if it is tangent to the unstable cone field and connects opposite sides of stable boundaries of some $R^i$. A curve is of type two if it is tangent to the unstable cone field and connects a point from $W^{ss}_{loc}(P_i)$, $i \in \{1, 2, 3, 4\}$, with a stable boundary of an opposite element of the Markov partition.

In this terminology, Lemma 3.7 claims that $T_V^{m_0}(l_V)$ consists of two curves of type two, some curves of type one, and some gaps between them of size of order $V$.

Lemma 3.10. An image of a curve of type one under $T^6$ is a union of a finite number of curves of type one, and of a finite number of gaps of length between $C_1 V$ and $C_2 V$. An image of a curve of type two under $T^6$ is a union of a curve of type two, a finite number of curves of type one, and a finite number of gaps of length between $C_1 V$ and $C_2 V$.

Proof. The first part follows from the properties of the Markov partition and the fact that the distance between strong stable (strong unstable) manifolds that form the Markov partition is of order $V$, see Subsect. 3.1. An image of a curve tangent to an unstable cone field is a curve tangent to an unstable cone field. Also, $T^6(W^{ss}_{loc}(P_i)) \subset W^{ss}_{loc}(P_i)$. Therefore the image of a curve of type two under $T^6$ is a curve which is close to a finite piece of a strong unstable manifold of $P_i$, so the second part follows.

Suppose that the gaps of order $k$ have already been defined. Consider the complement of all gaps of order not greater than $k$ on $l_V$. It consists of a finite number of closed intervals. Let $J$ be one of them. Consider the curve $T_V^{m_0 + 6(k-1)}(J)$. By construction, it is either a curve of type one, or of type two. In either case, the image $T_V^6(T_V^{m_0 + 6(k-1)}(J)) = T_V^{m_0 + 6k}(J)$ consists of some curves of type one or two, and some gaps of size $\sim V$. Let us say that the preimages of these gaps (under $T_V^{m_0 + 6k}$) are gaps of order $k + 1$. It is clear that every gap in $l_V \cap W^s(\Omega_V)$ has some finite order. Therefore we have ordered all the gaps.

3.4. Distortion property: estimate of the gap sizes. Let us consider $l_V$ and some gap $\gamma_G \subset l_V$ of order $n$. A bridge that corresponds to this gap is a connected component of the complement of the union of all gaps of order $\le n$ next to the gap. There are two bridges that correspond to the chosen gap; take one of them, and denote it by $\gamma_B$. Denote also $\gamma = \gamma_G \cup \gamma_B$. Now let us consider $\Gamma_G \equiv T_V^{m_0 + 6(n-1)}(\gamma_G)$ and $\Gamma_B \equiv T_V^{m_0 + 6(n-1)}(\gamma_B)$. By definition of the order $n$ of the gap we know that

$$C_3 V \le \frac{|\Gamma_G|}{|\Gamma_B|} \le C_4 V$$

for some constants $C_3$ and $C_4$ independent of $V$.

Proposition 3.11. There is a constant $K > 1$ independent of the choice of the gap and of $V$ such that

$$K^{-1} \frac{|\gamma_G|}{|\gamma_B|} \le \frac{|\Gamma_G|}{|\Gamma_B|} \le K \frac{|\gamma_G|}{|\gamma_B|}.$$



Notice that Theorem 1.2 immediately follows from Proposition 3.11. The rest of this section is devoted to the proof of Proposition 3.11, which is completed in Subsect. 3.7.

3.5. Dynamics near singularities. Here we prove several technical propositions on the properties of the trace map in the coordinate system constructed in Subsect. 3.2. The first two propositions are reformulations of [DG09a, Prop. 1]. The first one claims that a certain unstable cone field is invariant. We will use the variables (x, y, z) for coordinates in R3 . For a point p ∈ R3 , we will denote its coordinates by (x p , y p , z p ). Proposition 3.12. Given C1 > 0, C2 > 0, λ > 1, there exists δ0 = δ0 (C1 , C2 , λ) such that for any δ ∈ (0, δ0 ), the following holds. Let f : R3 → R3 be a C 2 -diffeomorphism such that (i) f C 2 ≤ C1 ; (ii) The plane {z = 0} is invariant under iterates of f ; (iii) D f ( p) − A < δ for every p ∈ R3 , where ⎛

⎞ λ−1 0 0 A = ⎝ 0 1 0⎠ 0 0 λ is a constant matrix. Introduce the following cone field in R3 : K up = {v ∈ T p R3 , v = vx y + vz : |vz | ≥ C2 |z p ||vx y |}.

(13)

Then for any point p = (x p , y p , z p ), |z p | ≤ 1 we have D f (K p ) ⊆ K uf ( p) . Notice that the choice of the cone field K up here (in (13)) and below (in (18)) corresponds to the choice of the cone field Kup in (9). The next proposition establishes expansion of vectors from the introduced unstable cones under the differential of the map. Proposition 3.13. Given C1 > 0, C2 > 0, λ > 1, ε ∈ (0, 41 ), η > 0 there exists δ0 = δ0 (C1 , C2 , λ, ε), N0 ∈ N, N0 = N0 (C1 , C2 , λ, ε, δ0 ), and C = C(η) > 0 such that for any δ ∈ (0, δ0 ), the following holds: Under the conditions of and with the notation from Proposition 3.12, suppose that for the points p = (x p , y p , z p ) and q = (xq , yq , z q ), the following holds: 1. 0 < z p < 1 and 0 < z q < 1; 2. For some N ≥ N0 both f N ( p) and f N (q) have z-coordinates larger than 1, and both f N −1 ( p) and f N −1 (q) have z-coordinates not greater than 1; 3. There is a smooth curve γ : [0, 1] → R3 such that γ (0) = p, γ (1) = q, and for each t ∈ [0, 1] we have γ (t) ∈ K γu (t) ;



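If $N \ge N_0$ (i.e., if $z_p$ is small enough), then

$$|Df^N(v)| \ge \lambda^{\frac{N}{2}(1-4\varepsilon)} |v| \quad \text{for any } v \in K^u_p, \tag{14}$$

and if $Df^N(v) = u = u_{xy} + u_z$, then

$$|u_{xy}| < 2\delta^{1/2} |u_z|. \tag{15}$$

Moreover, if $|v_z| \ge \eta |v_{xy}|$, then

$$|Df^k(v)| \ge C \lambda^{\frac{k}{2}(1-4\varepsilon)} |v| \quad \text{for each } k = 1, 2, \ldots, N. \tag{16}$$

In particular,

$$\mathrm{length}(f^N(\gamma)) \ge \lambda^{\frac{N}{2}(1-4\varepsilon)}\, \mathrm{length}(\gamma). \tag{17}$$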

In order to establish the distortion property we need better control over the expansion rates. In Proposition 3.15 we improve the estimates given by (14) and (17). As a first step we show that, roughly speaking, if a point stays for $N$ iterates in a neighborhood where normalizing coordinates are defined, then it must be $\lambda^{-N}$-close to the center-stable manifold of a curve of fixed points.

Proposition 3.14. Given $C_1 > 0$, $C_2 > 0$, $\lambda > 1$, there exist $\delta_0 = \delta_0(C_1, C_2, \lambda)$, $N_0 = N_0(C_1, C_2, \lambda, \delta_0) \in \mathbb{N}$, and $C^{**} > C^* > 0$ such that for any $\delta \in (0, \delta_0)$, the following holds. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be a $C^2$-diffeomorphism such that

(i) $\|f\|_{C^2} \le C_1$;
(ii) the planes $\{z = 0\}$ and $\{x = 0\}$ are invariant under iterates of $f$;
(iii) every point of the line $\{z = 0, x = 0\}$ is a fixed point of $f$;
(iv) at a point $Q \in \{z = 0, x = 0\}$ we have

$$Df(Q) = \begin{pmatrix} \lambda^{-1} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \lambda \end{pmatrix};$$

(v) $\|Df(p) - A\| < \delta$ for every $p \in \mathbb{R}^3$, where

$$A = Df(Q) = \begin{pmatrix} \lambda^{-1} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \lambda \end{pmatrix}.$$

Introduce the following cone fields in $\mathbb{R}^3$:

$$K^u_p = \{ v \in T_p\mathbb{R}^3,\ v = v_{xy} + v_z : |v_z| \ge C_2 \sqrt{|z_p|}\, |v_{xy}| \}, \tag{18}$$

$$K^{cu}_p = \{ v \in T_p\mathbb{R}^3,\ v = v_x + v_{yz} : |v_x| < 0.01 \lambda^{-1} |v_{yz}| \}, \tag{19}$$

$$K^s_p = \{ v \in T_p\mathbb{R}^3,\ v = v_x + v_{yz} : |v_x| \ge C_2 \sqrt{|x_p|}\, |v_{yz}| \}, \tag{20}$$

$$K^{cs}_p = \{ v \in T_p\mathbb{R}^3,\ v = v_z + v_{xy} : |v_z| < 0.01 \lambda^{-1} |v_{xy}| \}. \tag{21}$$

Suppose that for a finite orbit $p_0, p_1, p_2, \ldots, p_N$ we have

$$(p_0)_x \ge 1, \quad (p_1)_x < 1, \quad (p_N)_z \ge 1, \quad (p_{N-1})_z < 1,$$



and there are curves $\gamma_0$ and $\gamma_N$ such that $\gamma_0$ connects $p_0$ with $W^{ss}(Q)$ and is tangent to both cone fields $K^u$ and $K^{cu}$, and $\gamma_N$ connects $p_N$ with $W^{uu}(Q)$ and is tangent to both cone fields $K^s$ and $K^{cs}$. Then

$$C^* \lambda^{-N} \le |(p_0)_z| \le C^{**} \lambda^{-N} \quad \text{and} \quad C^* \lambda^{-N} \le |(p_N)_x| \le C^{**} \lambda^{-N}.$$

Proof. Drop a perpendicular from $p_0$ to the plane $\{z = 0\}$, and denote its foot by $p_0^*$. There is a unique point $Q_0$ on the line $\{z = 0, x = 0\}$ such that $p_0^* \in W^{ss}(Q_0)$. Denote the line segment connecting $p_0$ and $p_0^*$ by $\sigma_0$ and set $\sigma_i = f^i(\sigma_0)$, $i = 1, 2, \ldots, N$. Similarly, drop a perpendicular from $p_N$ to the plane $\{x = 0\}$, and denote its foot by $p_N^*$. There is a unique point $Q_N$ on the line $\{z = 0, x = 0\}$ such that $p_N^* \in W^{uu}(Q_N)$. Denote the line segment connecting $p_N$ and $p_N^*$ by $\rho_N$, and set $\rho_i = f^{-N+i}(\rho_N)$, $i = 0, 1, 2, \ldots, N - 1$. We have

$$0 < |\sigma_0| < |\sigma_1| < \cdots < |\sigma_{N-1}| < |\sigma_N|, \qquad 1 \le |\sigma_N| \le \lambda(1 + \delta),$$

$$0 < |\rho_N| < |\rho_{N-1}| < \cdots < |\rho_1| < |\rho_0|, \qquad 1 \le |\rho_0| \le \lambda(1 + \delta).$$

Denote $b_k = \mathrm{dist}(p_k, Q)$. Then we have

$$(\lambda^{-1} - \min(\delta, C_1 b_k)) |\sigma_k| \le |\sigma_{k-1}| \le (\lambda^{-1} + \min(\delta, C_1 b_k)) |\sigma_k|, \quad k = 1, 2, \ldots, N,$$

$$(\lambda^{-1} - \min(\delta, C_1 b_k)) |\rho_k| \le |\rho_{k+1}| \le (\lambda^{-1} + \min(\delta, C_1 b_k)) |\rho_k|, \quad k = 0, 1, 2, \ldots, N - 1.$$

Now we have

$$b_k = \mathrm{dist}(p_k, Q) \le \mathrm{dist}(Q, Q_0) + |\rho_k| + \mathrm{dist}(Q, Q_N) + |\sigma_k| \le |\sigma_k| + |\rho_k| + C_3 C_2 \left( \sqrt{|\rho_N|} + \sqrt{|\sigma_0|} \right),$$

where $C_3$ does not depend on $N$. Indeed, the distance between $p_0^*$ and $W^{ss}(Q)$ is bounded above by the length of the curve $\gamma_0$, and since $\gamma_0$ is tangent to the cone fields $K^u$ and $K^{cu}$, its length is not greater than $C_2 \sqrt{|\sigma_0|}$. On the other hand, $\mathrm{dist}(Q, Q_0)$ is of the same order as that distance since the strong stable manifolds of fixed points form a $C^1$-foliation of the plane $\{z = 0\}$. In the same way one gets the estimate $\mathrm{dist}(Q, Q_N) \le C_3 C_2 \sqrt{|\rho_N|}$.

Since we have the a priori estimates $|\sigma_k| \le (1 + \delta)(\lambda - \delta)^{-N+k}$ and $|\rho_k| \le (1 + \delta)(\lambda - \delta)^{-k}$, we also have

$$b_k \le |\sigma_k| + |\rho_k| + 2C_2 (1 + \delta)(\lambda - \delta)^{-N/2}.$$

If $k < N/2$, then

$$b_k \le (\lambda - \delta)^{-k} \left( 1 + (\lambda - \delta)^{-N+2k} + 2C(\lambda - \delta)^{-N/2+k} \right) \le C (\lambda - \delta)^{-k}.$$

If $k \ge N/2$, then

$$b_k \le (\lambda - \delta)^{-N+k} \left( 1 + (\lambda - \delta)^{N-2k} + 2C(\lambda - \delta)^{N/2-k} \right) \le C (\lambda - \delta)^{-N+k}.$$



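Therefore we have

$$|\sigma_0| \le |\sigma_N| \prod_{k=1}^{N} (\lambda^{-1} + \min(\delta, C_1 b_k)) \le \lambda(1 + \delta)\lambda^{-N} \prod_{k=1}^{[N/2]} (1 + C_1 C (\lambda - \delta)^{-k}) \cdot \prod_{k=[N/2]+1}^{N} (1 + C_1 C (\lambda - \delta)^{-N+k}) \le C^{**} \lambda^{-N}.$$

Also,

$$|\sigma_0| \ge |\sigma_N| \prod_{k=1}^{N} (\lambda^{-1} - \min(\delta, C_1 b_k)) \ge \lambda^{-N} \prod_{k=1}^{[N/2]} (1 - C_1 C (\lambda - \delta)^{-k}) \cdot \prod_{k=[N/2]+1}^{N} (1 - C_1 C (\lambda - \delta)^{-N+k}) \ge C^* \lambda^{-N}.$$

In the same way we get estimates for $\rho_N$.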

Proposition 3.15. Given $C_1 > 0$, $C_2 > 0$, $\lambda > 1$, there exist $\delta_0 = \delta_0(C_1, C_2, \lambda)$, $N_0 = N_0(C_1, C_2, \lambda, \delta_0) \in \mathbb{N}$, and $C > 0$ such that for any $\delta \in (0, \delta_0)$, the following holds: Under the conditions of and with the notation from Proposition 3.14, suppose that $v \in T_{p_0}\mathbb{R}^3$, $v \in K^u_{p_0}$. Then

$$|Df^N_{p_0}(v)| \ge C \lambda^{N/2} |v|.$$

Proof. We will use the notation from Proposition 3.14 and its proof. Let us denote $v_k = Df^k(v)$, $k = 0, 1, \ldots, N$, and $D_k = |(v_k)_z|$, $d_k = |(v_k)_{xy}|$. Let us normalize $v$ in such a way that $d_0 = 1$. Since $v \in K^u_{p_0}$ and $|\sigma_0| \ge C^* \lambda^{-N}$, we have $D_0 \ge C_5 \lambda^{-N/2}$, where $C_5$ is independent of $N$. Denote

$$Df(p) = \begin{pmatrix} \nu(p) & m_1(p) & t_1(p) \\ m_2(p) & e(p) & t_2(p) \\ s_1(p) & s_2(p) & \lambda(p) \end{pmatrix}.$$

We have

$$Df(p)(v) = \begin{pmatrix} \nu(p) & m_1(p) & t_1(p) \\ m_2(p) & e(p) & t_2(p) \\ s_1(p) & s_2(p) & \lambda(p) \end{pmatrix} \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix} = \begin{pmatrix} \nu(p) v_x + m_1(p) v_y + t_1(p) v_z \\ m_2(p) v_x + e(p) v_y + t_2(p) v_z \\ s_1(p) v_x + s_2(p) v_y + \lambda(p) v_z \end{pmatrix}.$$

Since $\|f\|_{C^2} \le C_1$, we also have $|\nu(p)| \le \lambda^{-1} + C_1\, \mathrm{dist}(Q, p)$, $|m_1(p)|, |m_2(p)|, |t_1(p)|, |t_2(p)| \le C_1\, \mathrm{dist}(Q, p)$, and $|\lambda(p)| \ge \lambda - C_1\, \mathrm{dist}(Q, p)$. Furthermore, if $p$ belongs to the plane $\{z = 0\}$, then $s_1(p) = s_2(p) = 0$. Therefore, for arbitrary $p$, we have $|s_1(p)|, |s_2(p)| \le C_1 z_p$. This implies that we have the following estimates:

$$\begin{aligned} d_{k+1} &\le (1 + \min(\delta, C_1 b_k)) d_k + \min(\delta, C_1 b_k) D_k, \\ D_{k+1} &\ge (\lambda - \min(\delta, C_1 b_k)) D_k - \min(\delta, C_1 |\sigma_k|) d_k. \end{aligned} \tag{22}$$



Lemma 3.16. There exists k ∗ such that dk ≥ Dk for all k ≤ k ∗ , and dk < Dk for all k > k∗. Proof. Indeed, if Dk > dk , then dk+1 ≤ (1 + δ)dk + δ Dk ≤ (1 + 2δ)Dk and Dk+1 ≥ (λ − δ)Dk − δdk ≥ (λ − 2δ)Dk . Since λ − 2δ > 1 + 2δ, we have Dk+1 > dk+1 .
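As a rough numerical illustration of Lemma 3.16, the following Python sketch iterates the extremal case of the two estimates used in its proof, with both inequalities replaced by equalities and the uniform bound $\delta$ in place of the minima. The function names and the sample values $\lambda = 2$, $\delta = 0.01$, $d_0 = 1$, $D_0 = 0.05$ are illustrative assumptions, not taken from the text.

```python
def extremal_step(d, D, lam, delta):
    # Assumption: treat both estimates as equalities, which is the worst case
    # for D_k overtaking d_k (d grows as fast, D as slowly, as allowed).
    d_next = (1.0 + delta) * d + delta * D
    D_next = (lam - delta) * D - delta * d
    return d_next, D_next


def iterate(d0, D0, lam, delta, steps):
    """Return the list [(d_0, D_0), ..., (d_steps, D_steps)]."""
    history = [(d0, D0)]
    for _ in range(steps):
        d, D = history[-1]
        history.append(extremal_step(d, D, lam, delta))
    return history


def crossing_index(history):
    """Largest k with d_k >= D_k, or -1 if there is no such k."""
    return max((k for k, (d, D) in enumerate(history) if d >= D), default=-1)


# Hypothetical sample parameters; note that lam - 2*delta > 1 + 2*delta holds.
history = iterate(d0=1.0, D0=0.05, lam=2.0, delta=0.01, steps=20)
k_star = crossing_index(history)
```

In this sample run the ordering of $d_k$ and $D_k$ switches exactly once, which is the behavior the lemma asserts.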

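We have the following preliminary estimates. If $k < N/2$, then $b_k \le C (\lambda - \delta)^{-k}$; if $k \ge N/2$, then $b_k \le C (\lambda - \delta)^{-N+k}$. Also, $|\sigma_k| \le (\lambda - \delta)^{-N+k}$ for each $k = 0, 1, \ldots, N$. Notice that this implies that $\prod_{i=1}^{N} (1 + C_1 b_i)$ is bounded by a constant that is independent of $N$. And, finally, $D_k \ge C_5 \lambda^{-N/2} (\lambda - \delta)^k$; see [DG09a, Lemma 6.1].

If $D_k \le d_k$ (i.e., $k \le k^*$), then

$$d_{k+1} \le (1 + C_1 b_k) d_k + C_1 b_k D_k \le (1 + 2C_1 b_k) d_k \le \prod_{i=1}^{k} (1 + 2C_1 b_i)\, d_0 \le C_6,$$

where $C_6$ does not depend on $k$ or $N$. Moreover, we have

$$\begin{aligned}
D_{k+1} &\ge (\lambda - C_1 b_k) D_k - C_1 |\sigma_k| d_k \\
&\ge (\lambda - C_1 b_k) D_k - C_1 C_6 (\lambda - \delta)^{-N+k} \\
&\ge (\lambda - C_1 b_k) D_k \left( 1 - \frac{C_1 C_6 (\lambda - \delta)^{-N+k}}{\lambda - C_1 b_k} \cdot \frac{1}{D_k} \right) \\
&\ge (\lambda - C_1 b_k) D_k \left( 1 - \frac{C_1 C_6 (\lambda - \delta)^{-N+k}}{\lambda - C_1 b_k} \cdot \frac{1}{C_5 \lambda^{-N/2} (\lambda - \delta)^{k}} \right) \\
&\ge (\lambda - C_1 b_k) D_k \left( 1 - \frac{C_1 C_6}{C_5 (\lambda - C_1 b_k)} \left( \lambda^{1/2} (\lambda - \delta)^{-1} \right)^{N} \right) \\
&\ge \lambda^{k+1} D_0 \prod_{i=0}^{k} \left( 1 - (C_1 \lambda^{-1}) b_i \right) \cdot \left( 1 - \frac{C_1 C_6}{C_5 (\lambda - C_1 b_k)} \left( \lambda^{1/2} (\lambda - \delta)^{-1} \right)^{N} \right)^{k} \\
&\ge \lambda^{k+1} D_0 \prod_{i=0}^{k} \left( 1 - (C_1 \lambda^{-1}) b_i \right) \cdot \left( 1 - \frac{C_1 C_6}{C_5 (\lambda - C_1 b_k)} \left( \lambda^{1/2} (\lambda - \delta)^{-1} \right)^{N} \right)^{N} \\
&\ge C_7 \lambda^{k+1} D_0,
\end{aligned}$$

since for any $C > 0$ and $\xi \in (0, 1)$, one has $\lim_{N \to \infty} (1 - C \xi^N)^N = 1$.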
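The limit invoked in the last step can be checked numerically. The short Python sketch below evaluates $(1 - C\xi^N)^N$ for a few values of $N$; the function name and the sample values $C = 5$, $\xi = 0.9$ are hypothetical choices made only for illustration.

```python
def correction_factor(C, xi, N):
    """Value of (1 - C * xi**N) ** N for C > 0 and 0 < xi < 1."""
    return (1.0 - C * xi ** N) ** N


# Hypothetical sample values; N is taken large enough that 1 - C * xi**N > 0.
samples = {N: correction_factor(5.0, 0.9, N) for N in (50, 100, 200, 400)}
```

The factor is far from 1 for moderate $N$ but approaches 1 quickly, because $\xi^N$ decays exponentially while the exponent grows only linearly.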



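If $d_k < D_k$ (i.e., $k > k^*$), then

$$\begin{aligned}
D_{k+1} &\ge (\lambda - C_1 b_k) D_k - C_1 |\sigma_k| d_k \\
&\ge (\lambda - C_1 b_k - C_1 |\sigma_k|) D_k \\
&\ge \lambda D_k (1 - \lambda^{-1} C_1 b_k - \lambda^{-1} C_1 |\sigma_k|) \\
&\ge \lambda^{k+1} C_7 D_0 \prod_{i=k^*}^{k} (1 - \lambda^{-1} C_1 b_i - \lambda^{-1} C_1 |\sigma_i|) \\
&\ge C_8 \lambda^{k+1} D_0,
\end{aligned}$$

where $C_8$ does not depend on $N$ or $k$.

Finally, $|Df^N_{p_0}(v)| \ge D_N \ge C_8 \lambda^N D_0 \ge C \lambda^{N/2} |v|$.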

Below (in the proof of Proposition 3.18) we will also need an estimate on $k^*$ provided by Lemma 3.16. Namely, we claim that $k^*$ cannot be much larger than $N/2$. The formal statement is the following.

Lemma 3.17. There is a constant $C_9$ independent of $N$ such that

$$\lambda^{k^*} \le C_9 \lambda^{N/2}.$$

Proof. We know that $D_{k^*-1} \le d_{k^*-1} \le C_6$. Therefore $C_6 \ge D_{k^*-1} \ge C_7 \lambda^{k^*} D_0 \ge C_7 \lambda^{k^*} \cdot C_5 \lambda^{-N/2}$, so $\lambda^{k^*} \le (C_6 C_7^{-1} C_5^{-1}) \lambda^{N/2}$.

Now we are ready to formulate the statement that will be used to check the distortion property of the trace map.

Proposition 3.18. Given $C_1 > 0$, $C_2 > 0$, $C_3 > 0$, $\lambda > 1$, there exist $\delta_0 = \delta_0(C_1, C_2, C_3, \lambda)$, $N_0 = N_0(C_1, C_2, C_3, \lambda, \delta_0) \in \mathbb{N}$, and $C > 0$ such that for any $\delta \in (0, \delta_0)$ and any > 0, the following holds: Under the conditions of and with the notation from Proposition 3.14, suppose that the curve $\gamma_0$ has a curvature bounded by $C_3$. Suppose also that for the points $p = (x_p, y_p, z_p)$ and $q = (x_q, y_q, z_q)$, the following holds:

1. $p, q \in \gamma_0$;
2. For some $N \ge N_0$ both $f^N(p)$ and $f^N(q)$ have z-coordinates larger than 1, and both $f^{N-1}(p)$ and $f^{N-1}(q)$ have z-coordinates not greater than 1;
3. $\mathrm{dist}(f^N(p), f^N(q))$ = .

Denote $p_k = f^k(p)$, $q_k = f^k(q)$, $k = 0, \ldots, N$. Let $v \in T_p\mathbb{R}^3$ and $w \in T_q\mathbb{R}^3$ be vectors tangent to $\gamma_0$. Denote $v_k = Df^k(v)$ and $w_k = Df^k(w)$, $k = 0, \ldots, N$. Let $\alpha_k$ be the angle between $v_k$ and $w_k$. Then

$$\sum_{k=0}^{N} \alpha_k < C, \quad \text{and} \quad \sum_{k=0}^{N} \mathrm{dist}(p_k, q_k) < C. \tag{23}$$



Proof. First of all, notice that it is enough to prove Proposition 3.18 in the case when the points p and q are arbitrarily close to each other. Indeed, otherwise split the piece of the curve γ0 between the points p and q into a large number of extremely small pieces. If for each of them the statement of Proposition 3.18 holds, then by subadditivity of the inequalities (23) it holds in general. Denote by the piece of the curve γ0 between the points p0 and q0 , and set k = f k (), k = 0, 1, 2, . . . , N . Denote μk = |k |. Due to the remark above we can assume that for any vector tangent to , the value of k ∗ is the same. From the proof of Proposition 3.15 we see that for k = 0, 1, . . . , k ∗ , we have μk ≤ C6 μ0 , and ≈ μ N ≥ N /2 μ0 ≥ Cλ N /2 C −1 μk ∗ , so μk ∗ ≤ (C −1 C6 )λ−N /2 μ N ≤ C11 λ−N /2 . Cλ 6 ∗ On the other hand, if k > k , then 1 ≈ μ N ≥ D N ≥ C7 λ N −k Dk ∗ ≥ C7 λ N −k μk , 2 where we denote by Dk the length of the projection of k to the z-axis (slightly abusing the notation). Therefore, μk ≤ (2C7−1 )λ−N +k . It follows that we have N

dist( pk , qk ) ≤

k=0

N

μk

k=0 ∗

=

k

N

μk +

k=0

μk

k=k∗+1 N

≤ k ∗ · C6 μ0 +

k=k ∗ +1

(2C7−1 )λ−N +k

−1 λ−N /2 C6 + C12 ≤ k ·C ≤ C. ∗

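Notice that for any two vectors $v, w \in K^{cu}$,

$$\angle(Av, Aw) \le \lambda \angle(v, w), \quad \text{where } A = \begin{pmatrix} \lambda^{-1} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \lambda \end{pmatrix},$$

and if a linear operator $B$ is $\xi$-close to $A$, then

$$\angle(Bv, Bw) \le (\lambda + \xi)(1 + \xi) \angle(v, w) = (\lambda + \xi(\lambda + 1 + \xi)) \angle(v, w) < \lambda(1 + 2\xi) \angle(v, w).$$

Therefore we have $\alpha_0 \le C_3 \mu_0$ and

$$\alpha_{k+1} \le \lambda(1 + 2C_1 b_k) \alpha_k + C_1 \mu_k, \quad k = 0, 1, \ldots, k^*.$$

Since $\prod_{k=0}^{k^*} (1 + 2C_1 b_k) \le C_{13}$ for some $C_{13}$ that is independent of $k^*$ and $N$, we have

$$\alpha_k \le (\lambda^k + \lambda^{k-1} + \cdots + \lambda + 1) \cdot (C_{13} C_1 C_6 C_3) \mu_0 \le C_{14} \lambda^k \mu_0,$$

where $C_{14}$ is also independent of $k$ and $N$. In particular,

$$\alpha_{k^*} \le C_{14} \lambda^{k^*} \mu_0 \le (C_{14} C_9) \lambda^{N/2} C^{-1} \lambda^{-N/2} \le C_{15}$$

and

$$\sum_{k=0}^{k^*} \alpha_k \le \sum_{k=0}^{k^*} C_{14} \lambda^k \mu_0 \le C_{16} \lambda^{k^*} \mu_0 \le C_{17}.$$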

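Now denote

$$K^{uu}_p = \{ v \in T_p\mathbb{R}^3 : |v_z| > 100 \lambda |v_{xy}| \}.$$

If $v, w \in K^{uu}$, then $\angle(Av, Aw) \le \lambda^{-1/2} \angle(v, w)$, and the same holds for any linear operator $B$ which is $\delta$-close to $A$. There exists $m \in \mathbb{N}$ independent of $N$ such that if for a vector $v$ we have $|v_z| > |v_{xy}|$, then $Df^m(v) \in K^{uu}$. Also

$$\alpha_{k^*+m} = \angle(Df^m(v_{k^*}), Df^m(w_{k^*})) \le C_{15} (\lambda + 2\delta)^m = C_{16},$$

and for $k \ge k^* + m$, we have

$$\alpha_{k+1} \le \lambda^{-1/2} \alpha_k + C_1 \mu_k.$$

Denote $\nu = \lambda^{-1/3} - \lambda^{-1/2}$. If $C_1 \mu_k < \nu \alpha_k$, then

$$\alpha_{k+1} \le \lambda^{-1/2} \alpha_k + \nu \alpha_k = (\lambda^{-1/2} + \nu) \alpha_k = \lambda^{-1/3} \alpha_k.$$

If $C_1 \mu_k \ge \nu \alpha_k$, then

$$\alpha_{k+1} \le \lambda^{-1/2} \alpha_k + C_1 \mu_k \le (\lambda^{-1/2} \nu^{-1} + 1) C_1 \mu_k.$$

Since $\sum_{k=k^*+m}^{N} \mu_k \le \sum_{k=k^*+m}^{N} (2C_7^{-1}) \lambda^{-N+k} \le C_{12}$, this implies that

$$\sum_{k=k^*+m}^{N} \alpha_k \le C_{18}$$

and

$$\sum_{k=0}^{N} \alpha_k \le \sum_{k=0}^{k^*} \alpha_k + (\alpha_{k^*+1} + \cdots + \alpha_{k^*+m-1}) + \sum_{k=k^*+m}^{N} \alpha_k \le C,$$

concluding the proof.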



3.6. Distortion property: preliminary estimates. The main result of this subsection is the following statement:

Lemma 3.19. There are constants $R > 0$, $V_0 > 0$, and $\kappa > 0$ such that for any $V \in (0, V_0)$ and $N \in \mathbb{N}$, the following holds: Suppose that $\gamma \subset T^N(\ell_V) \setminus O_{r_1}$ is a connected curve of length not greater than $\kappa$. Let the points $p, q \in \ell_V$ be such that $T_V^N(p) \in \gamma$ and $T_V^N(q) \in \gamma$, and $v_p$ and $v_q$ be unit vectors tangent to $\gamma$ at points $p$ and $q$. Then

$$\sum_{i=0}^{N} \left( \angle(DT_V^i(v_p), DT_V^i(v_q)) + \mathrm{dist}(T_V^i(p), T_V^i(q)) \right) < R.$$

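Notice that $F^{-1}(S \setminus O_{r_2})$ is a torus without small neighborhoods of the preimages of the singularities, and we can define the following map:

$$\widetilde{T}_V : F^{-1}(S \setminus O_{r_2}) \to \mathbb{T}^2, \qquad \widetilde{T}_V = F^{-1} \circ \pi_V \circ T_V \circ \pi_V^{-1} \circ F \equiv F_V^{-1} \circ T_V \circ F_V. \tag{24}$$

If $V$ is small, $\widetilde{T}_V$ is $C^2$-close to the linear automorphism $A$ on its domain.

Lemma 3.20. For $V_0 > 0$ small enough, there exists $t \in (0, 1)$ such that for $V \in [0, V_0]$, $p, q \in \mathbb{T}^2 \setminus F^{-1}(O_{r_2})$ and unit vectors $v_p \in K^u_p$, $v_q \in K^u_q$, we have

$$\angle(D\widetilde{T}_{V,p}(v_p), D\widetilde{T}_{V,q}(v_q)) \le t \angle(v_p, v_q) + 2 \|\widetilde{T}_V\|_{C^2}\, \mathrm{dist}(p, q).$$

Proof. If $V_0$ is small, then $\widetilde{T}_V$ is $C^2$-close to the linear automorphism $A$. In particular, for any point $p \in \mathbb{T}^2 \setminus F^{-1}(O_{r_2})$ and any vectors $v_1, v_2 \in K^u_p$,

$$\angle(D\widetilde{T}_{V,p}(v_1), D\widetilde{T}_{V,p}(v_2)) \le t \angle(v_1, v_2),$$

where $t \in (0, 1)$ can be chosen uniformly for all $V \in [0, V_0]$ and $p \in \mathbb{T}^2 \setminus F^{-1}(O_{r_2})$. Therefore we have

$$\begin{aligned}
\angle(D\widetilde{T}_{V,p}(v_p), D\widetilde{T}_{V,q}(v_q)) &\le \angle(D\widetilde{T}_{V,p}(v_p), D\widetilde{T}_{V,p}(v_q)) + \angle(D\widetilde{T}_{V,p}(v_q), D\widetilde{T}_{V,q}(v_q)) \\
&\le t \angle(v_p, v_q) + 2 \|D\widetilde{T}_{V,p}(v_q) - D\widetilde{T}_{V,q}(v_q)\| \\
&\le t \angle(v_p, v_q) + 2 \|\widetilde{T}_V\|_{C^2}\, \mathrm{dist}(p, q),
\end{aligned}$$

as claimed.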

Definition 3.21. For any points $p, q$ and any vectors $v_p, v_q$ define

$$F(p, q, v_p, v_q) \equiv \frac{\angle(v_p, v_q)}{\mathrm{dist}(p, q)}. \tag{25}$$

Lemma 3.22. For $p, q \in \mathbb{T}^2 \setminus F^{-1}(O_{r_2})$, $p \ne q$, and vectors $v_p \in K^u_p$, $v_q \in K^u_q$, consider the function $F(p, q, v_p, v_q)$ defined by (25). Suppose that $p$ and $q$ belong to a curve that is tangent to the unstable cone field. Then

$$F(\widetilde{T}_V(p), \widetilde{T}_V(q), D\widetilde{T}_{V,p}(v_p), D\widetilde{T}_{V,q}(v_q)) \le t F(p, q, v_p, v_q) + 2 \|\widetilde{T}_V\|_{C^2}.$$

In particular, if $F(p, q, v_p, v_q) > \dfrac{4 \|\widetilde{T}_V\|_{C^2}}{1 - t}$, then

$$F(\widetilde{T}_V(p), \widetilde{T}_V(q), D\widetilde{T}_{V,p}(v_p), D\widetilde{T}_{V,q}(v_q)) \le \frac{1 + t}{2} F(p, q, v_p, v_q).$$



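Proof. We have

$$\begin{aligned}
F(\widetilde{T}_V(p), \widetilde{T}_V(q), D\widetilde{T}_{V,p}(v_p), D\widetilde{T}_{V,q}(v_q)) &= \frac{\angle(D\widetilde{T}_{V,p}(v_p), D\widetilde{T}_{V,q}(v_q))}{\mathrm{dist}(\widetilde{T}_V(p), \widetilde{T}_V(q))} \\
&\le \frac{t \angle(v_p, v_q) + 2 \|\widetilde{T}_V\|_{C^2}\, \mathrm{dist}(p, q)}{\mathrm{dist}(p, q)} \\
&= t F(p, q, v_p, v_q) + 2 \|\widetilde{T}_V\|_{C^2}.
\end{aligned}$$

If we also have $F(p, q, v_p, v_q) > \dfrac{4 \|\widetilde{T}_V\|_{C^2}}{1 - t}$, then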

V C 2 ≤ tF( p, q, v p , vq ) + 1 − t F( p, q, v p , vq ) tF( p, q, v p , vq ) + 2T 2 1+t = F( p, q, v p , vq ). 2 Lemma 3.5 immediately implies the following statement. Lemma 3.23. Fix a small V ≥ 0. Suppose that a, b ∈ T2 \F −1 (Or2 ) are such that V (a) and V (b) are defined, and va ∈ Ta T2 \F −1 (Or2 ), vb ∈ Tb T2 \F −1 (Or2 ). Then 2 (F(V (a), V (b), DV (va ), DV (vb )) + 1) F(a, b, va , vb ) ≤ C and 2 (F(a, b, va , vb ) + 1). F(V (a), V (b), DV (va ), DV (vb )) ≤ C V is C 2 -close to the linear automorphism A on its domain for small V ≥ 0, Since T V C 2 ≤ 10. we can assume that for all V ∈ [0, V0 ], we have T 40 2 was taken. Let C be the constant from Proposition 3.18, where C3 = ( 1−t + 1)C ∗ Fix a small τ > 0. Take n ∈ N such that ∗ τ 1 + t n 2 40 2 μn ∗ ≥ μn ∗ (1− 8 ) . , and C (26) C (C + 1) ≤ 2 1−t Now we are going to choose a neighborhood U of the set of singularities {P1 , P2 , P3 , P4 } in such a way that if an orbit of a point leaves U , then it does not enter U for the next n ∗ iterates. Also, we will choose a smaller neighborhood U ∗ ⊂ U such that if a small curve is tangent to an unstable cone field (see Definition 3.8) and intersects U ∗ , then either it is entirely inside of U , or its iterates will continue to intersect U until they reach the opposite rectangle. Here is how we do that. For small r3 ∈ (0, r2 ), we denote Or03 = Or3 , Or13 = T (Or3 )∩Or1 , Ori3 = T (Ori−1 )∩ 3 ∞ i ). We ) ∩ O for each i < 0, and U = ∪ (O Or1 for each i > 1, Ori3 = T −1 (Ori+1 r1 r3 i=−∞ 3 will take r3 so small that the following property holds. If p ∈ SV is such that T −1 ( p) ∈ U but p ∈ U , then T n ( p) ∈ U for every n ∈ N with n ≤ n ∗ . For small r4 ∈ (0, r3 ), we denote Or04 = Or4 , Or14 = T (Or4 )∩Or1 , Ori4 = T (Ori−1 )∩ 4 i −1 i+1 Or1 for each i > 1, Or4 = T (Or4 ) ∩ Or1 for each i < 0, and U∗ =

∞ i=−∞

(Ori4 ).

(27)



We will take r4 so small that the following property holds. Suppose γ is a curve on T2 such that V (γ ) is defined, FV (γ ) ⊂ Or1 , γ is tangent to the unstable cone field, FV (γ ) ∩ Or1 \U = ∅, and FV (γ ) ∩ U ∗ = ∅. Then there is k ∈ N such that TVn (FV (γ )) ∩ U = ∅ for all natural n ≤ k, and TVk (FV (γ )) intersects the opposite rectangle of the Markov partition. Lemma 3.24. There are V0 > 0 and R1 > 0 such that the following holds. Suppose that v is a non-zero vector tangent to the line lV at some point p ∈ lV . Let N ∈ N be such that TV ( p) belongs to the bounded component of SV \Or1 . Then N

DTVi (v) ≤ R1 DTVN (v).

i=0

Proof. If $V$ is small enough, the vector $DT^{m_0}(v)$ is tangent to the unstable cone field. Let us split the orbit $\{T^{m_0}(p), T^{m_0+1}(p), \ldots, T^N(p)\}$ into several intervals

$$\{T^{m_0}(p), T^{m_0+1}(p), \ldots, T^{k_1-1}(p)\},\ \{T^{k_1}(p), \ldots, T^{k_2-1}(p)\},\ \ldots,\ \{T^{k_s}(p), \ldots, T^N(p)\}$$

in such a way that the following properties hold:

(1) for each $i = 1, 2, \ldots, s$, the points $T^{k_i-1}(p)$ and $T^{k_i}(p)$ are outside of $O_{r_2}$;
(2) if $\{T^{k_i}(p), \ldots, T^{k_{i+1}-1}(p)\} \cap U^* \ne \emptyset$, then $\{T^{k_i}(p), \ldots, T^{k_{i+1}-1}(p)\} \subset U^*$;
(3) for each $i = 1, 2, \ldots, s - 1$, we have either $k_{i+1} - k_i \ge n^*$ (where $n^*$ is chosen due to (26)) or $\{T^{k_i}(p), \ldots, T^{k_{i+1}-1}(p)\} \cap U^* \ne \emptyset$.

Such a splitting exists due to the choice of $U^* \subset U$ in (27) above. Apply Proposition 3.18 to those intervals in the splitting that are contained in $U^*$. For the intervals that do not intersect $U^*$, the choice of $n^*$ in (26) above guarantees uniform expansion of the vector. The first and the last interval may have length greater than $n^*$, in which case we have uniform expansion that “kills” the distortion added by the change of coordinates, or smaller than $n^*$, in which case they do not add more than a constant to the sum. As a result, the required sum is bounded above by a geometric progression.

Lemma 3.24 implies the following statement.

Lemma 3.25. There are constants $R_1 > 0$, $V_0 > 0$, and $\kappa_1 > 0$ such that for any $V \in (0, V_0)$ and $N \in \mathbb{N}$, the following holds: Suppose that $\gamma \subset T^N(\ell_V) \setminus O_{r_1}$ is a connected curve of length not greater than $\kappa_1$. Let the points $p, q \in \ell_V$ be such that $T_V^N(p) \in \gamma$ and $T_V^N(q) \in \gamma$. Then

$$\sum_{i=0}^{N} \mathrm{dist}(T_V^i(p), T_V^i(q)) < R_1.$$

Finally, the choice of n ∗ and Proposition 3.18 imply that the function F(TVi ( p), TVi (q), DTVi (v p ), DTVi (vq )) is uniformly bounded, and together with Lemma 3.25 this proves Lemma 3.19.



3.7. Proof of the distortion property. Proof of Proposition 3.11. Notice that we need to prove that log |G ||γ B | | B ||γG | is bounded by some constant independent of the choice of the gap and of V . There are points pG ∈ γG and p B ∈ γ B such that if vG is a unit vector tangent to the curve γG at pG , and v B is a unit vector tangent to the curve γ B at p B , then n+2 | ||γ | G B = log |TV (γG )||γ B | log | B ||γG | |TVn+2 (γ B )||γG | |DTVn+2 (vG )| = log |DTVn+2 (v B )| n+1 i i log |DTV | DT i (vG ) (TV ( pG ))| − log |DTV | DT i (v B ) (TV ( p B ))| = V V i=0

n+1 ≤ log |DTV | DT i (vG ) (TVi ( pG ))| − log |DTV | DT i (v B ) (TVi ( p B ))| V

i=0

≤

V

n+1 |DTV | DT i (vG ) (TVi ( pG ))| − |DTV | DT i (v B ) (TVi ( p B ))| . i=0

V

V

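We estimate each of the terms in this sum using the following statement.

Lemma 3.26. Suppose $f : \mathbb{R}^n \to \mathbb{R}^n$ is a smooth map, $a, b \in \mathbb{R}^n$, and $v_a \in T_a\mathbb{R}^n$, $v_b \in T_b\mathbb{R}^n$ are unit vectors. Then

$$\Bigl|\, \bigl| Df|_{v_a}(a) \bigr| - \bigl| Df|_{v_b}(b) \bigr| \,\Bigr| \le \|f\|_{C^2} \left( \angle(v_a, v_b) + |a - b| \right).$$

Proof of Lemma 3.26.

$$\begin{aligned}
\Bigl|\, \bigl| Df|_{v_a}(a) \bigr| - \bigl| Df|_{v_b}(b) \bigr| \,\Bigr| &\le \Bigl|\, \bigl| Df|_{v_a}(a) \bigr| - \bigl| Df|_{v_b}(a) \bigr| \,\Bigr| + \Bigl|\, \bigl| Df|_{v_b}(a) \bigr| - \bigl| Df|_{v_b}(b) \bigr| \,\Bigr| \\
&\le \|Df(a)\| \cdot |v_a - v_b| + \|f\|_{C^2} \cdot |a - b| \\
&\le \|f\|_{C^2} \left( |v_a - v_b| + |a - b| \right) \\
&\le \|f\|_{C^2} \left( \angle(v_a, v_b) + |a - b| \right).
\end{aligned}$$

Now, Proposition 3.11 follows from Lemma 3.19.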

4. The Integrated Density of States

4.1. Definition and basic properties. Recall the definition of $N(E, V)$ given in (1),

$$N(E, V) = \lim_{n \to \infty} \frac{1}{n} N(E, \omega, V, [1, n]).$$



Proposition 4.1 (Hof, see [Ho]). For every $(E, V) \in \mathbb{R}^2$, the limit in (1) exists for every $\omega \in \mathbb{T}$ and its value does not depend on $\omega$.

The following proposition collects some well-known properties of the integrated density of states.

Proposition 4.2. (a) The map $\mathbb{R} \times \mathbb{R} \ni (E, V) \mapsto N(E, V) \in [0, 1]$ is continuous.

(b) For every $V \in \mathbb{R}$, there is a Borel measure on $\mathbb{R}$, called the density of states measure and denoted by $dN_V$, such that

$$N(E, V) = \int_{\mathbb{R}} \chi_{(-\infty, E]} \, dN_V.$$

(c) The topological support of the measure $dN_V$ is equal to $\Sigma_V$.

(d) The density of states measure is the $\omega$-average of the spectral measure associated with $H_{V,\omega}$ and the vector $\delta_0 \in \ell^2(\mathbb{Z})$. That is, for every $V \in \mathbb{R}$ and every bounded measurable $g : \mathbb{R} \to \mathbb{R}$,

$$\int_{\mathbb{R}} g \, dN_V = \int_{\omega \in \mathbb{T}} \langle \delta_0, g(H_{V,\omega}) \delta_0 \rangle \, d\omega. \tag{28}$$

(e) We have

$$N(E, 0) = \begin{cases} 0 & E \le -2, \\ \frac{1}{\pi} \arccos\left( -\frac{E}{2} \right) & -2 < E < 2, \\ 1 & E \ge 2. \end{cases}$$

Proof. (a) This follows from (the proof of) Lemma 3.1 and Theorem 3.2 in [AS]. (b) For every $V \in \mathbb{R}$, the map $\mathbb{R} \ni E \mapsto N(E, V) \in [0, 1]$ is continuous by (a) and non-decreasing by construction, and hence it is the distribution function of a Borel measure on $\mathbb{R}$. (c) and (d) See [CFKS, Sect. 9.2]. (e) This is folklore; see, for example, [LS, Theorem 1.1] and the discussion there for a simple derivation.

4.2. Complete gap labeling. Here we prove the following result, which implies Theorem 1.5 since the transversality assumption holds for $V_0 > 0$ sufficiently small.

Theorem 4.3. Suppose $V_0 > 0$ is such that for every $V \in (0, V_0]$ and every point in V , its stable manifold intersects $\ell_V$ transversally. Then, for every $V \in (0, V_0]$, all gaps allowed by the gap labeling theorem are open. That is,

$$\{ N(E, V) : E \in \mathbb{R} \setminus \Sigma_V \} = \{ \{k\alpha\} : k \in \mathbb{Z} \} \cup \{1\}.$$



Proof. Consider the preimages of the singularities of the trace map $F^{-1}(P_i)$, $i = 1, 2, 3, 4$, on the torus. They form a set of 4 periodic points of the hyperbolic automorphism $A : \mathbb{T}^2 \to \mathbb{T}^2$, and the stable manifolds of those periodic points intersect the line $\{\phi = 0\}$ transversally at the points $\{k\alpha \pmod 1\}$, $\{k\alpha + \frac{1}{2} \pmod 1\}$, $\{k\alpha + \frac{\alpha}{2} \pmod 1\}$, and $\{k\alpha + \frac{1}{2} + \frac{\alpha}{2} \pmod 1\}$. The images of these points under the semiconjugacy $F$ form the set of points on $\ell_0$ of the form $(\pm\cos(\pi m \alpha), \pm\cos(\pi m \alpha), 1)$, and they correspond to the energies $E \in \{\pm 2\cos(\pi m \alpha),\ m \in \mathbb{Z}\}$. The integrated density of states for the free Laplacian, $N(E, 0)$, takes the values $\{\{m\alpha\} : m \in \mathbb{Z}\}$ at these energies.

After we increase the value of the coupling constant, each singularity splits into two periodic points, and each of the stable manifolds of the singularities splits into two strong stable manifolds of the periodic points. Every point between the stable manifolds has an unbounded positive semiorbit, and therefore the interval that those manifolds cut in the line $\ell_V$ corresponds to a gap in the spectrum. Due to the continuous dependence of $N(E, V)$ on the coupling constant and the local constancy of $N(\cdot, V)$ in the complement of $\Sigma_V$, the integrated density of states takes the same value in the formed gap as at the energy that corresponds to the initial point of intersection of the stable manifold of the singularity with $\ell_0$.
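The free-case labels used in this proof can be checked numerically from the formula in Proposition 4.2(e). The Python sketch below is only an illustration: the function names are hypothetical, and it assumes that $\alpha$ is the inverse golden mean $(\sqrt{5} - 1)/2$, which is not stated in this section. It evaluates $N(E, 0)$ at the energy $E = -2\cos(\pi\{m\alpha\})$, which is one of $\pm 2\cos(\pi m \alpha)$, and compares the result with $\{m\alpha\}$.

```python
import math

# Assumption: alpha is the inverse golden mean; this section does not define it.
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0


def ids_free(E):
    """N(E, 0) for the free Laplacian, as in Proposition 4.2(e)."""
    if E <= -2.0:
        return 0.0
    if E >= 2.0:
        return 1.0
    return math.acos(-E / 2.0) / math.pi


def gap_label(m, alpha=ALPHA):
    """Fractional part {m * alpha}."""
    return (m * alpha) % 1.0


def energy_at_label(m, alpha=ALPHA):
    """Energy -2 cos(pi {m alpha}); its absolute value is 2 |cos(pi m alpha)|."""
    return -2.0 * math.cos(math.pi * gap_label(m, alpha))


labels = [(m, energy_at_label(m), ids_free(energy_at_label(m))) for m in range(-5, 6)]
```

Each entry of `labels` pairs an index $m$ with an energy and the value of $N(\cdot, 0)$ there, which agrees with $\{m\alpha\}$ up to rounding.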

4.3. More on the asymptotic gap lengths. Proof of Theorem 1.6. Fix any m ∈ Z\{0}. The integrated density of states of the free Laplacian takes the values {±mα} at the energies {±2 cos π mα}. If m = 2k, then these energies correspond to points with θ -coordinates kα(mod 1) and kα + 21 (mod 1) on F −1 (0 ). If m = 2k + 1, then these energies correspond to points with θ -coordinates kα + α α 1 −1 2 (mod 1) and kα + 2 + 2 (mod 1) on F (0 ). Take one of these points, Q k ∈ F −1 (0 ). Let P ∗ be the singularity such that Q k ∈ F −1 (W ss (P ∗ )). Denote = F −1 (W1ss (P ∗ )). Let M ∈ N be the smallest number such that A−M () contains Q k . Then A M (F −1 (0 )) intersects at some point Z , and the distance from Z to the set of singularities is uniformly (in |m|) bounded from zero. Denote by P (V ) and P (V ) the periodic points on SV near P ∗ (of period 2 or of period 6, depending on P ∗ ). Denote by = FV−1 (W1ss (P (V )) and = FV−1 (W1ss (P (V )). M (F −1 (V )) (recall that T V was defined by (24)) intersects and transThen T V V M (F −1 (l V )) an interval I V versally near the point Z and the curves and cut in T V V whose length is of order V . In other words, lim V →0 |I|VV|| exists and is uniformly bounded from zero and from above. We also have lim

V →0

−1 −M (I V ) T V = DA M | F −1 (l0 ) = α M · C(Q k ), |I V |

where C(Q k ) is bounded from zero and from above. Notice that if A−M () intersects F −1 (0 ) at Q k , then A−M () must be of length at least |k|α −1 . On the other hand, |A−M ()| = α −M ||. Therefore, α −M || ≥ |k|α −1 .



Hence −M (I V )| −M (I V )| |I V | |T |T |I V | 1 V = lim V = α M C(Q k ) lim ≤ C(Q k ) . V →0 V →0 V →0 |V | |V | |I V | |V | |k| lim

On the other hand, lim

V →0

|Um (V )| −M (I V )| |T V

= |D F| F −1 (l0 ) (Q k )|

is bounded from above (since D F is bounded). Therefore we have −M (I V )| |Um (V )| |T |Um V | 1 ≤ C(Q k ) , = lim V −M (I V )| V →0 |V | V →0 |V | |k| |T V lim

where C(Q k ) is uniformly bounded from above.

Remark 4.4. Notice that one can actually claim a bit more. Namely, for those gaps that appear away from the endpoints of the free spectrum, the corresponding constant $C_m$ is uniformly bounded away from zero. This follows from the fact that away from the singularities, the differential of $F$ has norm which is bounded away from zero.

5. Spectral Measures and Transport Exponents

5.1. Solution estimates. In this subsection we study solutions to the difference equation and prove upper and lower bounds for them. Results of this kind are known (we provide the references below), but our purpose here is to obtain explicit quantitative estimates as functions of the coupling constant. These estimates enter explicitly in the bounds on fractal dimensions of spectral measures and transport exponents, and we wish to prove the best possible bounds for the latter quantities in order to get an idea about their behavior at weak coupling. These applications will be discussed in the next two subsections.

Denote the largest root of the polynomial $x^3 - (2 + V)x - 1$ by $a_V$. For small $V > 0$, we have $a_V \approx \frac{\sqrt{5}+1}{2} + cV$ with a suitable constant $c$. Our goal is to prove the following pair of theorems (recall that $\sigma(H_{V,\omega}) = \Sigma_V$ for every $\omega$). Let $M(n, m, \omega, E)$ be the standard transfer matrix, that is, the $\mathrm{SL}(2, \mathbb{R})$ matrix that maps $(u(m + 1), u(m))^T$ to $(u(n + 1), u(n))^T$ for every solution $u$ of the difference equation $H_{V,\omega} u = Eu$.

Theorem 5.1. For every $V > 0$ and every

$$\zeta > \frac{\log[(5 + 2V)^{1/2} (3 + V) a_V]}{\log \frac{\sqrt{5}+1}{2}}, \tag{29}$$

there is a constant $C > 0$ such that for every $\omega$ and every $E \in \Sigma_V$, we have

$$\max_{0 \le |n - m| \le N} \|M(n, m, \omega, E)\| \le C N^{\zeta}. \tag{30}$$


Theorem 5.2. For every $V > 0$,

$$\gamma_\ell < \frac{\log\left( 1 + \frac{1}{(2 + 2V)^2} \right)}{16 \cdot \log \frac{\sqrt{5}+1}{2}}, \quad \text{and} \quad \gamma_u > 1 + \frac{\log[(5 + 2V)^{1/2} (3 + V) a_V]}{\log \frac{\sqrt{5}+1}{2}},$$

there are constants $C_\ell, C_u > 0$ such that for every $\omega$, $L \ge 1$, and $E \in \Sigma_V$, we have that every solution to the difference equation $H_{V,\omega} u = Eu$, which is normalized by $|u(0)|^2 + |u(1)|^2 = 1$, obeys the estimates

$$C_\ell L^{\gamma_\ell} \le \|u\|_L \le C_u L^{\gamma_u}, \tag{31}$$

where the local $\ell^2$ norm $\|\cdot\|_L$ is defined by

$$\|u\|_L^2 = \sum_{n=1}^{[L]} |u(n)|^2 + (L - [L]) |u([L] + 1)|^2$$

for $L \ge 1$.

Theorem 5.1 is a quantitative version of [DL99a, Theorem 3], which in turn was a generalization of [IT, Theorem 1]. Theorem 5.2, on the other hand, is a quantitative version of [DKL, Props. 5.1 and 5.2]. The upper bound in Theorem 5.2 is a consequence of Theorem 5.1, while the lower bound in Theorem 5.2 will be extracted from the details of the proof of [DKL, Prop. 5.1].

Let us recall some notation from [DL99a]. For $V > 0$, $E \in \mathbb{R}$, and $a \in \{0, 1\}$, we write

$$T(V, E, a) = \begin{pmatrix} E - Va & -1 \\ 1 & 0 \end{pmatrix}.$$

For a word $w = w_1 \ldots w_n \in \{0, 1\}^*$, we then set

$$M(V, E, w) = T(V, E, w_n) \times \cdots \times T(V, E, w_1).$$

Next, define the words $\{s_n\}_{n \ge 0}$ inductively by

$$s_0 = 0, \quad s_1 = 1, \quad s_n = s_{n-1} s_{n-2} \quad \text{for } n \ge 2.$$

By the definition of these words, there is a unique $u \in \{0, 1\}^{\mathbb{Z}_+}$ (the fixed point of the Fibonacci substitution) that has $s_n$ as a prefix for every $n \ge 1$; namely $u = 1011010110110\ldots$. We denote the set of finite subwords of $u$ by $W_u$, that is,

$$W_u = \{0, 1, 01, 10, 11, 010, 011, 101, 110, \ldots\}.$$

By uniform recurrence of $u$ it suffices to consider the matrices $M(V, E, w)$ for $E \in \Sigma_V$ and $w \in W_u$ when proving Theorems 5.1 and 5.2. With these quantities, we have $M_n = M_n(V, E) = M(V, E, s_n)$ and consequently

$$x_n = x_n(V, E) = \frac{1}{2} \mathrm{Tr}\, M(V, E, s_n)$$

for $n \ge 1$.
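The notation just recalled can be made concrete with a short Python sketch. It builds the words $s_n$, the matrices $M(V, E, w)$ and the half-traces $x_n$; the helper names and the representation of $2 \times 2$ matrices as nested tuples are illustrative assumptions and not part of the original text.

```python
def fibonacci_word(n):
    """Return s_n as a string, where s_0 = '0', s_1 = '1', s_n = s_{n-1} s_{n-2}."""
    prev, cur = "0", "1"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )


def one_step(V, E, a):
    """The matrix T(V, E, a)."""
    return ((E - V * a, -1.0), (1.0, 0.0))


def transfer_matrix(V, E, word):
    """M(V, E, w) = T(V, E, w_n) ... T(V, E, w_1): the first letter acts first."""
    M = ((1.0, 0.0), (0.0, 1.0))
    for letter in word:
        M = mat_mul(one_step(V, E, int(letter)), M)
    return M


def half_trace(M):
    """x = Tr(M) / 2."""
    return 0.5 * (M[0][0] + M[1][1])
```

For example, `transfer_matrix(V, E, fibonacci_word(n))` gives $M_n$, and because the factors are multiplied in reverse order of the letters, the word recursion $s_n = s_{n-1} s_{n-2}$ turns into $M_n = M_{n-2} M_{n-1}$.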



Finding upper bounds for the norm of transfer matrices consists of three steps. The first step is to bound the special matrices $M_n$, that is, $M(F_n, 0, \omega = 0, E)$. This is the objective of Lemma 5.3. Then, in Lemma 5.5, we use interpolation to bound the norm of the matrices $M(n, 0, \omega = 0, E)$. Finally, the case of general matrices $M(n, m, \omega, E)$ is treated using partition. This will then complete the proof of Theorem 5.1.

We begin with the first step. Part (a) of the following lemma is due to Sütő (see [S87, Lemma 2]) and part (b) is a relative of [IT, Lemma 4.(ii)]:

Lemma 5.3. (a) We have $E \in \Sigma_V$ if and only if the sequence $\{x_n\}$ is bounded. Moreover, we have

$$|x_n| \le 1 + \frac{V}{2}$$

for every $E \in \Sigma_V$ and $n \ge 1$.

(b) With some positive $V$-dependent constant $C$, we have

$$\|M_n\| \le C a_V^n$$

for every $E \in \Sigma_V$ and $n \ge 1$.

Proof. As pointed out above, part (a) is known. Let us prove part (b). By the Cayley-Hamilton Theorem, we have that $M_n^2 - 2x_n M_n + I = 0$, that is,

$$M_n = 2x_n I - M_n^{-1}.$$

The recursion for the matrices $M_n$, $M_n = M_{n-2} M_{n-1}$, gives

$$M_{n-2}^{-1} = M_{n-1} M_n^{-1}.$$

Putting these things together, we find

$$M_n = M_{n-2} M_{n-1} = M_{n-2} (2x_{n-1} I - M_{n-1}^{-1}) = 2x_{n-1} M_{n-2} - M_{n-3}^{-1}.$$

Since we also have $\|M_n\| = \|M_n^{-1}\|$, we obtain from this identity along with part (a) the estimate

$$\|M_n\| \le (2 + V) \|M_{n-2}\| + \|M_{n-3}\|$$

for every $E \in \Sigma_V$. Consider for comparison the recursion

$$m_n = (2 + V) m_{n-2} + m_{n-3}.$$

It is clear that we have $\|M_n\| \le m_n$ if we consider the sequence $\{m_n\}$ generated by the recursion and the initial conditions $m_1 = \|M_1\|$, $m_2 = \|M_2\|$, $m_3 = \|M_3\|$. Any solution of the recursion is of the form

$$m_n = c_1 x_1^n + c_2 x_2^n + c_3 x_3^n,$$

where the $x_j$ are the roots of the characteristic polynomial $x^3 - (2 + V)x - 1$. Thus, the definition of $a_V$ implies that the estimate in (b) holds.
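The growth rate of the comparison recursion can be observed numerically. The Python sketch below iterates $m_n = (2 + V) m_{n-2} + m_{n-3}$ and returns the ratio of the last two terms; the function names, the initial values $m_1 = m_2 = m_3 = 1$ and the cutoff `n_max` are hypothetical choices made for illustration.

```python
import math


def comparison_sequence(V, m1, m2, m3, n_max):
    """Return [None, m_1, ..., m_{n_max}] with m_n = (2 + V) m_{n-2} + m_{n-3}."""
    m = [None, m1, m2, m3]
    for n in range(4, n_max + 1):
        m.append((2.0 + V) * m[n - 2] + m[n - 3])
    return m


def growth_ratio(V, n_max=200):
    """Ratio m_{n_max} / m_{n_max - 1} for positive (illustrative) initial data."""
    m = comparison_sequence(V, 1.0, 1.0, 1.0, n_max)
    return m[n_max] / m[n_max - 1]


ratio_free = growth_ratio(0.0)   # close to the golden mean
ratio_half = growth_ratio(0.5)   # close to the largest root of x^3 - 2.5 x - 1
```

For positive initial data the ratio converges to the largest root of the characteristic polynomial, that is, to $a_V$, which is the mechanism behind part (b).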



Remark. Since $F_n \sim \left( \frac{\sqrt{5}+1}{2} \right)^n$, we can infer that $\|M_n\| \le C F_n^{1+\delta}$ with $\delta \to 0$ as $V \to 0$. Since for $V = 0$, the norm of the transfer matrices grows linearly at the energies $E = \pm 2$, we cannot expect a better result in general. However, since the transfer matrices are bounded when $V = 0$ and $E \in (-2, 2)$, it is reasonable to expect that there is a better bound for small values of $V$ for most energies in the spectrum.

Let us now turn to the second step, which is the interpolation of the estimates from the previous lemma to non-Fibonacci sites in the $\omega = 0$ sequence. The following lemma is a relative of [IT, Lemma 5]. Even though the proof is closely related to that in [IT], we give the details to clearly show where the improved estimate comes from.

Lemma 5.4. For $n \ge 1$ and $k \ge 0$, we may write

$$M_n M_{n+k} = P_k^{(1)} M_{n+k} + P_k^{(2)} M_{n+k-1} + P_k^{(3)} M_{n+k-2} + P_k^{(4)} I$$

with coefficients $P_k^{(j)}$ that also depend on $n$ and $E$. Moreover, for every $n \ge 1$ and $E \in \Sigma_V$, we have that

$$\sum_{j=1}^{4} |P_k^{(j)}| \le (5 + 2V)(3 + V)^{k/2}.$$

Proof. Consider the case $k = 0$. Then, $M_n M_n = M_n^2 = 2x_n M_n - I$ by Cayley-Hamilton and hence we can set

$$P_0^{(1)} = 2x_n, \quad P_0^{(2)} = P_0^{(3)} = 0, \quad P_0^{(4)} = -1.$$

It follows that for $E \in \Sigma_V$, we have the estimate

$$\sum_{j=1}^{4} |P_0^{(j)}| \le 3 + V.$$

Consider now the case $k = 1$. Then, $M_n M_{n+1} = M_{n+2} = 2x_{n+1} M_n + M_{n-1} - 2x_{n-1} I$ by [IT, Lemma 6] and hence we can set

$$P_1^{(1)} = 0, \quad P_1^{(2)} = 2x_{n+1}, \quad P_1^{(3)} = 1, \quad P_1^{(4)} = -2x_{n-1}.$$

For $E \in \Sigma_V$, we therefore have

$$\sum_{j=1}^{4} |P_1^{(j)}| \le 5 + 2V.$$



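Next, assume that the lemma holds for $k \in \{1, \ldots, \ell\}$ and consider the case $k = \ell + 1$. Then, using [IT, Lemma 6] one more time, we find

\begin{align*}
M_n M_{n+\ell+1} &= (M_n M_{n+\ell-1}) M_{n+\ell} \\
&= \left( P_{\ell-1}^{(1)} M_{n+\ell-1} + P_{\ell-1}^{(2)} M_{n+\ell-2} + P_{\ell-1}^{(3)} M_{n+\ell-3} + P_{\ell-1}^{(4)} I \right) M_{n+\ell} \\
&= P_{\ell-1}^{(1)} M_{n+\ell+1} + P_{\ell-1}^{(2)} \left( 2x_{n+\ell-2} M_{n+\ell} - M_{n+\ell-1} \right) \\
&\qquad + P_{\ell-1}^{(3)} \left( 2x_{n+\ell-1} M_{n+\ell-1} - I \right) + P_{\ell-1}^{(4)} M_{n+\ell} \\
&= P_{\ell+1}^{(1)} M_{n+\ell+1} + P_{\ell+1}^{(2)} M_{n+\ell} + P_{\ell+1}^{(3)} M_{n+\ell-1} + P_{\ell+1}^{(4)} I,
\end{align*}

where we set

\begin{align*}
P_{\ell+1}^{(1)} &= P_{\ell-1}^{(1)}, \\
P_{\ell+1}^{(2)} &= 2x_{n+\ell-2} P_{\ell-1}^{(2)} + P_{\ell-1}^{(4)}, \\
P_{\ell+1}^{(3)} &= -P_{\ell-1}^{(2)} + 2x_{n+\ell-1} P_{\ell-1}^{(3)}, \\
P_{\ell+1}^{(4)} &= -P_{\ell-1}^{(3)}.
\end{align*}

It follows that

\[ \sum_{j=1}^{4} |P_{\ell+1}^{(j)}| \le (3 + V) \sum_{j=1}^{4} |P_{\ell-1}^{(j)}|. \]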

By induction hypothesis, we obtain the desired estimate.
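The algebra in this proof can be checked exactly on a toy example. The following is a minimal sketch, not part of the original argument: it assumes the convention $M_{k+1} = M_{k-1} M_k$ (which is what the identity $M_n M_{n+1} = M_{n+2}$ above suggests), uses two arbitrary integer matrices of determinant one as seeds, and works with the traces $t_k = \operatorname{Tr} M_k = 2x_k$. The helper names `chain`, `coefficients` and `check` are hypothetical. Only the four-term decomposition is verified; the bound on $\sum_j |P_k^{(j)}|$ relies on $E \in \Sigma_V$ and is not tested here.

```python
# Exact check of the decomposition in Lemma 5.4 for a toy chain of
# integer matrices with determinant one (assumption: M_{k+1} = M_{k-1} M_k).

IDENTITY = ((1, 0), (0, 1))


def mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )


def comb(terms):
    """Linear combination sum(c * A) over pairs (c, A) of scalars and 2x2 matrices."""
    return tuple(
        tuple(sum(c * A[i][j] for c, A in terms) for j in range(2))
        for i in range(2)
    )


def chain(M0, M1, length):
    """Return [M_0, ..., M_{length-1}] with M_{k+1} = M_{k-1} M_k."""
    Ms = [M0, M1]
    while len(Ms) < length:
        Ms.append(mul(Ms[-2], Ms[-1]))
    return Ms


def coefficients(t, n, k):
    """Coefficients (P1, P2, P3, P4) of M_n M_{n+k}, where t[j] = Tr M_j = 2 x_j."""
    prev = (t[n], 0, 0, -1)               # case k = 0 (Cayley-Hamilton)
    if k == 0:
        return prev
    cur = (0, t[n + 1], 1, -t[n - 1])     # case k = 1
    for step in range(2, k + 1):
        ell = step - 1                    # step = ell + 1 is built from P_{ell-1} = prev
        p1, p2, p3, p4 = prev
        new = (p1,
               t[n + ell - 2] * p2 + p4,
               -p2 + t[n + ell - 1] * p3,
               -p3)
        prev, cur = cur, new
    return cur


def check(Ms, n, k):
    """True if M_n M_{n+k} equals the four-term combination of Lemma 5.4."""
    t = [M[0][0] + M[1][1] for M in Ms]
    p1, p2, p3, p4 = coefficients(t, n, k)
    terms = [(p1, Ms[n + k]), (p4, IDENTITY)]
    if p2:                                # skip vanishing terms (avoids index n - 1 < 0)
        terms.append((p2, Ms[n + k - 1]))
    if p3:
        terms.append((p3, Ms[n + k - 2]))
    return mul(Ms[n], Ms[n + k]) == comb(terms)


Ms = chain(((1, 1), (0, 1)), ((2, 1), (1, 1)), 10)
print(all(check(Ms, n, k) for n in range(1, 5) for k in range(6)))
```

Integer seeds keep every comparison exact, so a failure would point to an indexing error rather than to rounding.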

Lemma 5.5. For every $V > 0$ and every $\zeta$ obeying (29), there is a constant $\tilde{C}$ so that

\[ \|M(n, 0, \omega = 0, E)\| \le \tilde{C} n^{\zeta} \]

for every $n \ge 1$, $E \in \Sigma_V$.

Before we give the proof of Lemma 5.5, let us recall some facts about the minimal representation of positive integers in terms of Fibonacci numbers; compare, for example, [HCB,Le,Z]. Given $n \ge 1$, there is a unique representation

\[ n = \sum_{k=0}^{K} F_{n_k} \]

such that $n_k \in \mathbb{Z}_+$ and $n_{k+1} - n_k \ge 2$. This is the so-called Zeckendorf representation of $n$ (which, incidentally, was published by Lekkerkerker some twenty years before Zeckendorf). It is found by the greedy algorithm, that is, $F_{n_K}$ is set to be the largest Fibonacci number less than or equal to $n$, $n$ is replaced by $n - F_{n_K}$, and the process is repeated until we have a zero remainder. The property $n_{k+1} - n_k \ge 2$ follows from the way the algorithm works together with the recursion for the Fibonacci numbers. Uniqueness of this representation in turn follows from $n_{k+1} - n_k \ge 2$. By construction, we have

\[ F_{n_K} \le n < F_{n_K + 1} \le 2F_{n_K}. \tag{32} \]


We will also need a relation between $n$ and $K$, that is, the number of terms in the Zeckendorf representation of $n$. The algorithm to compute $K = K(n)$ is the following: Start with two words over the positive integers, 1 and 1. The next word is obtained by writing down the previous word, followed by the word preceding the previous word, but this time with every symbol increased by one. Now iterate this procedure:

1, 1, 12, 122, 12223, 12223233, 1222323323334, . . .

Concatenate all the words to obtain a semi-infinite word over the positive integers:

111212212223122232331222323323334 . . .

The $n$-th symbol in this semi-infinite sequence is $K(n)$. Notice that the lengths of these words follow the Fibonacci sequence. Moreover, only every other word has an increase in the maximum value of its entries, relative to the previous one, and the increase is by one whenever it happens. This shows that

\[ F_{2(K-1)} \le n. \tag{33} \]
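The greedy algorithm and the word construction above can be compared directly. The sketch below is illustrative only: it assumes the normalization $F_0 = F_1 = 1$, so that the distinct Fibonacci values are 1, 2, 3, 5, 8, ..., and it counts $K(n)$ as the number of terms, which is how the word above reads. The function names are hypothetical.

```python
def fibonacci_upto(n):
    """Distinct Fibonacci values 1, 2, 3, 5, ... that do not exceed n."""
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    return [f for f in fibs if f <= n]


def zeckendorf(n):
    """Greedy Zeckendorf representation of n >= 1, largest term first."""
    terms = []
    for f in reversed(fibonacci_upto(n)):
        if f <= n:          # take the largest Fibonacci value that still fits
            terms.append(f)
            n -= f
    return terms


def term_count_word(num_words):
    """Concatenate the first num_words words 1, 1, 12, 122, 12223, ..."""
    prev, cur = [1], [1]
    words = [prev, cur]
    while len(words) < num_words:
        # next word: previous word, then the word before it with symbols + 1
        prev, cur = cur, cur + [s + 1 for s in prev]
        words.append(cur)
    return [s for w in words for s in w]


def fib(k):
    """F_k with F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a


word = term_count_word(7)                      # 1+1+2+3+5+8+13 = 33 symbols
counts = [len(zeckendorf(n)) for n in range(1, len(word) + 1)]
print(word == counts)
print(zeckendorf(33))
```

For instance, `zeckendorf(33)` gives the four terms 21, 8, 3, 1, in agreement with the final symbol 4 of the seventh word.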

Proof of Lemma 5.5. From the Zeckendorf representation of $n \ge 1$, one finds that

\[ M(n, 0, \omega = 0, E) = M_{n_0} M_{n_1} \cdots M_{n_K}; \]

compare the beginning of the proof of [IT, Theorem 1]. Let us prove by induction that with $b_V = (3 + V)^{1/2}$, we have for every $k \ge 1$,

\[ \|M_{n_0} M_{n_1} \cdots M_{n_k}\| \le C (5 + 2V)^k \, b_V^{-n_0 + 2(k-1)} (a_V b_V)^{n_k}. \tag{34} \]

Consider first the case $k = 1$. From Lemmas 5.3 and 5.4, we may infer that

\begin{align*}
\|M_{n_0} M_{n_1}\| &\le |P_{n_1 - n_0}^{(1)}| \cdot \|M_{n_1}\| + |P_{n_1 - n_0}^{(2)}| \cdot \|M_{n_1 - 1}\| + |P_{n_1 - n_0}^{(3)}| \cdot \|M_{n_1 - 2}\| + |P_{n_1 - n_0}^{(4)}| \\
&\le \sum_{j=1}^{4} |P_{n_1 - n_0}^{(j)}| \, C a_V^{n_1} \\
&\le C (5 + 2V)(3 + V)^{(n_1 - n_0)/2} a_V^{n_1} \\
&\le C (5 + 2V) \, b_V^{-n_0} (a_V b_V)^{n_1}.
\end{align*}

Thus, the estimate (34) holds for $k = 1$.



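Now assume that (34) holds for $k \in \{1, \ldots, \ell\}$ and consider the case $k = \ell + 1$. Using Lemma 5.4 again, we see that

\begin{align*}
\|M_{n_0} M_{n_1} \cdots M_{n_\ell} M_{n_{\ell+1}}\| &= \|(M_{n_0} M_{n_1}) M_{n_2} \cdots M_{n_\ell} M_{n_{\ell+1}}\| \\
&\le |P_{n_1 - n_0}^{(1)}| \cdot \|M_{n_1} M_{n_2} \cdots M_{n_\ell} M_{n_{\ell+1}}\| \\
&\quad + |P_{n_1 - n_0}^{(2)}| \cdot \|M_{n_1 - 1} M_{n_2} \cdots M_{n_\ell} M_{n_{\ell+1}}\| \\
&\quad + |P_{n_1 - n_0}^{(3)}| \cdot \|M_{n_1 - 2} M_{n_2} \cdots M_{n_\ell} M_{n_{\ell+1}}\| \\
&\quad + |P_{n_1 - n_0}^{(4)}| \cdot \|M_{n_2} \cdots M_{n_\ell} M_{n_{\ell+1}}\| \\
&\le |P_{n_1 - n_0}^{(1)}| \cdot C (5 + 2V)^{\ell} \, b_V^{-n_1 + 2(\ell - 1)} (a_V b_V)^{n_{\ell+1}} \\
&\quad + |P_{n_1 - n_0}^{(2)}| \cdot C (5 + 2V)^{\ell} \, b_V^{-n_1 + 1 + 2(\ell - 1)} (a_V b_V)^{n_{\ell+1}} \\
&\quad + |P_{n_1 - n_0}^{(3)}| \cdot C (5 + 2V)^{\ell} \, b_V^{-n_1 + 2 + 2(\ell - 1)} (a_V b_V)^{n_{\ell+1}} \\
&\quad + |P_{n_1 - n_0}^{(4)}| \cdot C (5 + 2V)^{\ell} \, b_V^{-n_2 + 2(\ell - 2)} (a_V b_V)^{n_{\ell+1}} \\
&\le \left( \sum_{j=1}^{4} |P_{n_1 - n_0}^{(j)}| \right) C (5 + 2V)^{\ell} \, b_V^{-n_1 + 2\ell} (a_V b_V)^{n_{\ell+1}} \\
&\le (5 + 2V) \, b_V^{n_1 - n_0} \, C (5 + 2V)^{\ell} \, b_V^{-n_1 + 2\ell} (a_V b_V)^{n_{\ell+1}} \\
&= C (5 + 2V)^{\ell + 1} \, b_V^{-n_0 + 2((\ell + 1) - 1)} (a_V b_V)^{n_{\ell+1}}.
\end{align*}

We conclude that (34) holds for $k = \ell + 1$. Therefore,

\[ \|M(n, 0, \omega = 0, E)\| \le C (5 + 2V)^K \, b_V^{-n_0 + 2(K - 1)} (a_V b_V)^{n_K}. \]

It follows that

\begin{align*}
\limsup_{n \to \infty} \frac{\log \|M(n, 0, \omega = 0, E)\|}{\log n}
&\le \limsup_{n \to \infty} \frac{\log \left[ C (5 + 2V)^K \, b_V^{-n_0 + 2(K - 1)} (a_V b_V)^{n_K} \right]}{\log n} \\
&\le \limsup_{n \to \infty} \frac{K \log(5 + 2V) + 2K \log b_V + n_K \log(a_V b_V)}{\log n} \\
&\le \limsup_{n \to \infty} \frac{\frac{\log n}{2 \log \frac{\sqrt{5}+1}{2}} \log(5 + 2V) + 2 \, \frac{\log n}{2 \log \frac{\sqrt{5}+1}{2}} \log b_V + \frac{\log n}{\log \frac{\sqrt{5}+1}{2}} \log(a_V b_V)}{\log n} \\
&= \frac{\log (5 + 2V)^{1/2} + \log b_V + \log(a_V b_V)}{\log \frac{\sqrt{5}+1}{2}} \\
&= \frac{\log \left[ (5 + 2V)^{1/2} (3 + V) a_V \right]}{\log \frac{\sqrt{5}+1}{2}},
\end{align*}

where we used (32) and (33) in the third step. Since these estimates are uniform in $E \in \Sigma_V$, the result follows.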



In the third step, we can now turn to the proof of Theorem 5.1.

Proof of Theorem 5.1. Suppose $V > 0$ and $\zeta$ obeys (29). We will prove the following estimate:

\[ \|M(V, E, w)\| \le C |w|^{\zeta} \quad \text{for every } V > 0, \ E \in \Sigma_V, \text{ and } w \in W_u, \tag{35} \]

from which (30) immediately follows. Let $V, E, w$ as in (35) be given. As explained in [DL99a, Proof of Lemma 5.2], we can write $w = xyz$, where $y \in \{a, b\}^*$ is a word of length 2 and $x^R$ (the reversal of $x$) and $z$ are prefixes of $u$. Thus,

\[ \|M(V, E, w)\| \le \|M(V, E, x)\| \cdot \|M(V, E, y)\| \cdot \|M(V, E, z)\|. \tag{36} \]

It follows from [DL99a, Lemma 5.1] that

\[ \|M(V, E, x^R)\| = \|M(V, E, x)\|. \tag{37} \]

Moreover, Lemma 5.5 yields

\[ \|M(V, E, x^R)\| \le \tilde{C} |x|^{\zeta} \tag{38} \]

and

\[ \|M(V, E, z)\| \le \tilde{C} |z|^{\zeta}. \tag{39} \]

Since $\zeta > 1$ and $y$ has length 2, (35) follows from (36)–(39).

Proof of Theorem 5.2. The upper bound in (31) follows from Theorem 5.1. The lower bound in (31) will be extracted here from [DKL] since it is not made explicit there. Denote $U(n) = (u(n + 1), u(n))^T$ and consider the associated local $\ell^2$-norms $\|U\|_L$. Power-law bounds for $\|U\|_L$ correspond in a natural way to power-law bounds for $\|u\|_L$, so let us discuss the former object. By considering squares adjacent to the origin and Gordon's two-block method, Damanik, Killip, and Lenz showed that

\[ \|U\|_{F_{8n}} \ge \left( 1 + \frac{1}{(2 + 2V)^2} \right)^{n/2}; \]

see [DKL, Lemma 4.1]. We find

\[ \liminf_{n \to \infty} \frac{\log \|U\|_{F_{8(n-1)}}}{\log F_{8n}} \ge \frac{\log \left( 1 + \frac{1}{(2 + 2V)^2} \right)}{16 \cdot \log \frac{\sqrt{5}+1}{2}}, \]

uniformly in $\omega \in \mathbb{T}$ and $E \in \Sigma_V$. This, together with monotonicity, yields the asserted lower bound.



5.2. Spectral measures. Given $V > 0$ and $\omega \in \mathbb{T}$, let us consider the operator $H_{V,\omega}$. Since this operator is self-adjoint, the spectral theorem associates with each $\psi \in \ell^2(\mathbb{Z})$ a Borel measure $\mu_\psi$ that is supported by $\Sigma_V$ and obeys

\[ \langle \psi, g(H_{V,\omega}) \psi \rangle = \int g(E) \, d\mu_\psi(E) \]

for every bounded measurable function $g$ on $\Sigma_V$. It follows from Theorem 5.2 that each of these spectral measures gives zero weight to sets of small Hausdorff dimension. More precisely, we have the following result:

Corollary 1. For every $V > 0$, every $\omega$, and every

\[ \alpha < \frac{2 \log \left( 1 + \frac{1}{(2 + 2V)^2} \right)}{\log \left( 1 + \frac{1}{(2 + 2V)^2} \right) + 16 \cdot \log \frac{\sqrt{5}+1}{2} + 16 \log \left[ (5 + 2V)^{1/2} (3 + V) a_V \right]}, \]

we have that $\mu_\psi(S) = 0$ for every $\psi \in \ell^2(\mathbb{Z})$ and every Borel set $S$ with $h^{\alpha}(S) = 0$.

Proof. This is an immediate consequence of Theorem 5.2 and [DKL, Theorem 1].

5.3. Transport exponents. Absolute continuity of spectral measures with respect to Hausdorff measures is important because it implies lower bounds for transport exponents via the Guarneri-Combes-Last Theorem. Let us recall this connection briefly. Given $H_{V,\omega}$ and $\psi$ as above, consider the Schrödinger equation $i \partial_t \psi(t) = H_{V,\omega} \psi(t)$ with initial condition $\psi(0) = \psi$. Then, the unique solution of this initial-value problem is given by $\psi(t) = e^{-itH_{V,\omega}} \psi$. To measure the spreading of a wavepacket, one considers time-averaged moments of the position operator. This is of course mainly of interest when $\psi(0) = \psi$ is well-localized and in fact, in $\ell^2(\mathbb{Z})$ one usually considers the canonical initial state $\psi = \delta_0$.

One is interested in the growth of $\langle |X|_\psi^p \rangle (T)$ as $T \to \infty$ for $p > 0$, where

\[ |X|_\psi^p (t) = \sum_{n \in \mathbb{Z}} |n|^p \, |\langle \delta_n, e^{-itH_{V,\omega}} \psi \rangle|^2 \]

and the time average of a $t$-dependent function $f$ is given by either

\[ \langle f \rangle (T) = \frac{1}{T} \int_0^T f(t) \, dt \tag{40} \]

or

\[ \langle f \rangle (T) = \frac{2}{T} \int_0^\infty e^{-2t/T} f(t) \, dt. \tag{41} \]

We will indicate in the results below which time-average is involved. However, for compactly supported (or fast-decaying) $\psi$, all the results mentioned below hold for both types of time-average. To measure the power-law growth of $\langle |X|_\psi^p \rangle (T)$, define

\[ \beta_\psi^+ (p) = \limsup_{T \to \infty} \frac{\log \langle |X|_\psi^p \rangle (T)}{p \log T} \]

and

\[ \beta_\psi^- (p) = \liminf_{T \to \infty} \frac{\log \langle |X|_\psi^p \rangle (T)}{p \log T}. \]

The Guarneri-Combes-Last Theorem (see [La, Theorem 6.1]) states that if $\mu_\psi$ is not supported by a Borel set $S$ with $h^{\alpha}(S) = 0$, then $\beta_\psi^{\pm}(p) \ge \alpha$ for every $p > 0$, where the time-average is given by (40).

Corollary 2. For every $V > 0$, $\omega \in \mathbb{T}$, $0 \neq \psi \in \ell^2(\mathbb{Z})$, and $p > 0$, we have

\[ \beta_\psi^{\pm}(p) \ge \frac{2 \log \left( 1 + \frac{1}{(2 + 2V)^2} \right)}{\log \left( 1 + \frac{1}{(2 + 2V)^2} \right) + 16 \cdot \log \frac{\sqrt{5}+1}{2} + 16 \log \left[ (5 + 2V)^{1/2} (3 + V) a_V \right]}, \]

where the time-average is given by (40).

Proof. Immediate from the Guarneri-Combes-Last Theorem and Corollary 1.

We also have the following estimate, which holds for a special initial state but which is better when $p$ is large:

Corollary 3. For every $V > 0$, $\omega \in \mathbb{T}$, and $p > 0$, we have

\[ \beta_{\delta_0}^{\pm}(p) \ge \frac{\log \frac{\sqrt{5}+1}{2}}{\log \frac{\sqrt{5}+1}{2} + \log \left[ (5 + 2V)^{1/2} (3 + V) a_V \right]} - \frac{3 \log \left[ (5 + 2V)^{1/2} (3 + V) a_V \right]}{p \left( \log \frac{\sqrt{5}+1}{2} + \log \left[ (5 + 2V)^{1/2} (3 + V) a_V \right] \right)}, \]

where the time-average is given by (41).

Proof. This is a consequence of Theorem 5.1 and results of Damanik, Sütő, and Tcheremchantsev (see [DST, Theorem 1] and compare also [DT08, Theorem 1]).

It can be shown that for compactly supported (or fast-decaying) $\psi$, the transport exponents $\beta_\psi^{\pm}(p)$ are non-decreasing functions of $p$ taking values in $[0, 1]$. Thus, in this case it is natural to consider the limits of these quantities as $p \downarrow 0$ and $p \uparrow \infty$ and denote them by $\beta_\psi^{\pm}(0)$ and $\beta_\psi^{\pm}(\infty)$, respectively. The bounds in the two previous corollaries imply estimates for $\beta_\psi^{\pm}(0)$ and $\beta_\psi^{\pm}(\infty)$ as well. However, at small coupling, a better estimate for $\beta_{\delta_0}^{\pm}(\infty)$ is obtained via a different route:

Corollary 4. We have

\[ \lim_{V \downarrow 0} \beta_{\delta_0}^{\pm}(\infty) = \lim_{V \downarrow 0} \lim_{p \uparrow \infty} \beta_{\delta_0}^{\pm}(p) = 1, \]

uniformly in $\omega \in \mathbb{T}$. Here, the time-average is given by (41).

Proof. Damanik, Embree, Gorodetski, and Tcheremchantsev showed for the Fibonacci Hamiltonian that $\beta_{\delta_0}^{\pm}(\infty) \ge \dim_B^{\pm}(\Sigma_V)$ for every $V > 0$ and $\omega \in \mathbb{T}$; see [DEGT, Theorem 3]. Since the right-hand side converges to one as $V \downarrow 0$, the result follows.



Remarks. (a) As pointed out above, while the papers [DEGT,DST,DT03] work with the time-average (41), the results above carry over to transport exponents defined using (40).

(b) We find that $\beta_{\delta_0}^{\pm}(\infty)$, considered as a function of $V$, extends continuously to zero. While we would expect this also for other initial states $\psi$, it does not follow from Corollary 2. It would be interesting to extend this continuity result to all fast-decaying initial states or to exhibit one for which it fails.

6. Consequences for Higher-Dimensional Models

6.1. Lattice Schrödinger operators with separable potentials. Let $d \ge 1$ be an integer and assume that for $1 \le j \le d$, we have bounded maps $V_j : \mathbb{Z} \to \mathbb{R}$. Consider the associated Schrödinger operators on $\ell^2(\mathbb{Z})$,

\[ [H_j \psi](n) = \psi(n + 1) + \psi(n - 1) + V_j(n) \psi(n). \tag{42} \]

Furthermore, we let $V : \mathbb{Z}^d \to \mathbb{R}$ be given by

\[ V(n) = V_1(n_1) + \cdots + V_d(n_d), \tag{43} \]

where we express an element $n$ of $\mathbb{Z}^d$ as $n = (n_1, \ldots, n_d)$ with $n_j \in \mathbb{Z}$. Finally, we introduce the Schrödinger operator on $\ell^2(\mathbb{Z}^d)$ with potential $V$, that is,

\[ [H \psi](n) = \left( \sum_{j=1}^{d} \psi(n + e_j) + \psi(n - e_j) \right) + V(n) \psi(n). \tag{44} \]

Here, $e_j$ denotes the element $n$ of $\mathbb{Z}^d$ that has $n_j = 1$ and $n_k = 0$ for $k \neq j$. Potentials of the form (43) and Schrödinger operators of the form (44) are called separable. Let us first state some known results for separable Schrödinger operators.

Proposition 6.1. (a) The spectrum of $H$ is given by

\[ \sigma(H) = \sigma(H_1) + \cdots + \sigma(H_d). \]

(b) Given $\psi_1, \ldots, \psi_d \in \ell^2(\mathbb{Z})$, denote by $\mu_j$ the spectral measure corresponding to $H_j$ and $\psi_j$. Furthermore, denote by $\mu$ the spectral measure corresponding to $H$ and the element $\psi$ of $\ell^2(\mathbb{Z}^d)$ given by $\psi(n) = \psi_1(n_1) \cdots \psi_d(n_d)$. Then,

\[ \mu = \mu_1 * \cdots * \mu_d. \]

Proof. Recall the definition and properties of tensor products of Hilbert spaces and operators on these spaces; see, for example, [RS, Sects. II.4 and VIII.10]. It follows from [RS, Theorem II.10] that there is a unique unitary map $U$ from $\ell^2(\mathbb{Z}) \otimes \cdots \otimes \ell^2(\mathbb{Z})$ ($d$ factors) to $\ell^2(\mathbb{Z}^d)$ so that for $\psi_j \in \ell^2(\mathbb{Z})$, the elementary tensor $\psi_1 \otimes \cdots \otimes \psi_d$ is mapped to the element $\psi$ of $\ell^2(\mathbb{Z}^d)$ given by $\psi(n) = \psi_1(n_1) \cdots \psi_d(n_d)$. With this unitary map $U$, we have

\[ U^* H U = \sum_{j=1}^{d} \mathrm{Id} \otimes \cdots \otimes \mathrm{Id} \otimes H_j \otimes \mathrm{Id} \otimes \cdots \otimes \mathrm{Id}, \]

with $H_j$ being the $j$-th factor. Given this representation, part (a) now follows from [RS, Theorem VIII.33] (see also the example on [RS, p. 302]). Part (b) follows from the proof of [RS, Theorem VIII.33]; see also [BS] and [S].
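A finite-dimensional analogue may help to see why the tensor representation yields sums of spectral values. The sketch below is only an illustration with $d = 2$ and two hand-picked symmetric $2 \times 2$ matrices (they are not truncations of any operator in this paper): if $A v = \lambda v$ and $B w = \mu w$, then $A \otimes \mathrm{Id} + \mathrm{Id} \otimes B$ maps $v \otimes w$ to $(\lambda + \mu)\, v \otimes w$. The names `kron_sum_apply` and `outer` are hypothetical.

```python
def kron_sum_apply(A, B, psi):
    """Apply A (x) Id + Id (x) B to psi, where psi[i][j] plays the role of psi(n_1, n_2)."""
    n, m = len(A), len(B)
    return [
        [
            sum(A[i][k] * psi[k][j] for k in range(n))     # A acts on the first index
            + sum(B[j][l] * psi[i][l] for l in range(m))   # B acts on the second index
            for j in range(m)
        ]
        for i in range(n)
    ]


def outer(v, w):
    """Elementary tensor: psi(n_1, n_2) = v(n_1) w(n_2)."""
    return [[vi * wj for wj in w] for vi in v]


A = [[2, 1], [1, 2]]          # eigenpairs: (3, (1, 1)) and (1, (1, -1))
B = [[0, 1], [1, 0]]          # eigenpairs: (1, (1, 1)) and (-1, (1, -1))
pairs_A = [(3, [1, 1]), (1, [1, -1])]
pairs_B = [(1, [1, 1]), (-1, [1, -1])]

sums = []
for lam, v in pairs_A:
    for mu, w in pairs_B:
        psi = outer(v, w)
        image = kron_sum_apply(A, B, psi)
        expected = [[(lam + mu) * x for x in row] for row in psi]
        assert image == expected      # v (x) w is an eigenvector for lam + mu
        sums.append(lam + mu)

print(sorted(sums))
```

In this toy case the four eigenvalues of the sum operator are exactly the pairwise sums of the eigenvalues of $A$ and $B$, which mirrors part (a).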



6.2. A consequence of the Newhouse Gap Lemma. We have seen above that the spectrum of a product model is given by the sum of the individual spectra. If these individual spectra are Cantor sets and we want to show that their sum is not a Cantor set, the following consequence of the Gap Lemma is useful.

Lemma 6.2. Suppose $C, K \subset \mathbb{R}^1$ are Cantor sets with $\tau(C) \cdot \tau(K) > 1$. Assume also that the size of the largest gap of $C$ is not greater than the diameter of $K$, and the size of the largest gap of $K$ is not greater than the diameter of $C$. Then the sum $C + K$ is a closed interval.

Proof. Denote $\min C = c_1$, $\max C = c_2$, $\min K = k_1$, $\max K = k_2$. Let us prove that $C + K = [c_1 + k_1, c_2 + k_2]$. The inclusion "$\subseteq$" is obvious, so let us prove the inclusion "$\supseteq$." Take an arbitrary point $x \in [c_1 + k_1, c_2 + k_2]$. Then, $x \in C + K$ if and only if $0 \in C + K - x = C - (x - K)$. Therefore, $x \in C + K \Leftrightarrow C \cap (x - K) \neq \emptyset$. Since $\tau(C) \cdot \tau(x - K) = \tau(C) \cdot \tau(K) > 1$, the Gap Lemma implies that a priori there are only four possibilities:

(1) the intervals $[c_1, c_2]$ and $[x - k_2, x - k_1]$ are disjoint;
(2) the set $C$ is contained in a finite gap of the set $(x - K)$;
(3) the set $(x - K)$ is contained in a finite gap of the set $C$;
(4) $C \cap (x - K) \neq \emptyset$.

But the case (1) contradicts the assumption $x \in [c_1 + k_1, c_2 + k_2]$, and the cases (2) and (3) are impossible due to our assumption on the sizes of gaps and diameters of $C$ and $K$. Therefore, we must have $C \cap (x - K) \neq \emptyset$ and hence $x \in C + K$.

6.3. The square Fibonacci Hamiltonian. Let us now discuss the (diagonal version of the) model studied by Even-Dar Mandel and Lifshitz, namely the operator

\begin{align*}
[H_V^{(2)} \psi](m, n) &= \psi(m + 1, n) + \psi(m - 1, n) + \psi(m, n + 1) + \psi(m, n - 1) \\
&\quad + V \left( \chi_{[1-\alpha,1)}(m\alpha \bmod 1) + \chi_{[1-\alpha,1)}(n\alpha \bmod 1) \right) \psi(m, n)
\end{align*}

in $\ell^2(\mathbb{Z}^2)$. By Proposition 6.1, we have

\[ \sigma(H_V^{(2)}) = \Sigma_V + \Sigma_V. \tag{45} \]

Proof of Theorem 1.4. By (45), it suffices to show that $\Sigma_V + \Sigma_V$ is an interval for $V > 0$ sufficiently small. By Theorem 1.2, there is $V_0 > 0$ such that $\tau(\Sigma_V) > 1$ for $V \in (0, V_0)$. For such $V$'s, Lemma 6.2 applies with $C = K = \Sigma_V$ and yields $\Sigma_V + \Sigma_V = [\min \Sigma_V + \min \Sigma_V, \max \Sigma_V + \max \Sigma_V]$, as desired.
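The role of the thickness condition in Lemma 6.2 can be illustrated on self-similar model sets, which are only stand-ins for $\Sigma_V$. The sketch below assumes the standard fact that the Cantor set obtained by keeping the two end pieces of relative length $r$ at every step has thickness $r/(1 - 2r)$; all function names are hypothetical. It forms the sum of the level-$n$ covers, which contains the sum of the Cantor sets. For $r = 3/8$ (thickness $3/2$) the hypotheses of Lemma 6.2 hold with $C = K$, and the cover sum is the full interval $[0, 2]$, as it must be. For $r = 1/4$ (thickness $1/2$) gaps survive already in the cover sum, so the sum of the Cantor sets is not an interval.

```python
from fractions import Fraction


def cantor_intervals(ratio, depth):
    """Level-`depth` intervals when each interval keeps two end pieces of relative length `ratio`."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(depth):
        refined = []
        for lo, hi in intervals:
            piece = (hi - lo) * ratio
            refined += [(lo, lo + piece), (hi - piece, hi)]
        intervals = refined
    return intervals


def merge(intervals):
    """Union of closed intervals as a sorted list of disjoint closed intervals."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def sumset(first, second):
    """Arithmetic sum of two finite unions of closed intervals."""
    return merge([(a0 + b0, a1 + b1) for a0, a1 in first for b0, b1 in second])


def thickness(ratio):
    """Thickness of the self-similar Cantor set with contraction `ratio` (bridge / gap)."""
    return ratio / (1 - 2 * ratio)


thick = cantor_intervals(Fraction(3, 8), 4)
thin = cantor_intervals(Fraction(1, 4), 4)
print(thickness(Fraction(3, 8)), sumset(thick, thick))
print(thickness(Fraction(1, 4)), len(sumset(thin, thin)))
```

Exact rational arithmetic is used so that touching intervals are merged reliably.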

Acknowledgement. We are grateful to Mark Embree for generating the plots shown in Figs. 4 and 5. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.



Appendix A. The Off-Diagonal Fibonacci Hamiltonian

The purpose of this appendix is to give complete proofs of the basic spectral properties of the off-diagonal Fibonacci Hamiltonian. This operator has been considered in many physics papers and is the basic building block for the higher-dimensional product models studied by Even-Dar Mandel and Lifshitz. The mathematics literature on the Fibonacci model provides an exhaustive study of the diagonal model, and it was always understood that "analogous results hold for the off-diagonal model." For the reader's convenience, we make these analogous results explicit here and hence provide in this paper a complete treatment of the spectrum of the Even-Dar Mandel-Lifshitz product model at weak coupling.

A.1. Model and results. Let $a, b$ be two positive real numbers and consider the Fibonacci substitution,

\[ S(a) = ab, \quad S(b) = a. \]

This substitution rule extends to finite and one-sided infinite words by concatenation. For example, $S(aba) = abaab$. Since $S(a)$ begins with $a$, one obtains a one-sided infinite sequence that is invariant under $S$ by iterating the substitution rule on $a$ and taking a limit. Indeed, we have

\[ S^k(a) = S^{k-1}(S(a)) = S^{k-1}(ab) = S^{k-1}(a) S^{k-1}(b) = S^{k-1}(a) S^{k-2}(a). \tag{46} \]

In particular, $S^k(a)$ starts with $S^{k-1}(a)$ and hence there is a unique one-sided infinite sequence $u$, the so-called Fibonacci substitution sequence, that starts with $S^k(a)$ for every $k$. The hull $\Omega_{a,b}$ is then obtained by considering all two-sided infinite sequences that locally look like $u$,

\[ \Omega_{a,b} = \{ \omega \in \{a, b\}^{\mathbb{Z}} : \text{every subword of } \omega \text{ is a subword of } u \}. \]

It is known that, conversely, every subword of $u$ is a subword of every $\omega \in \Omega_{a,b}$. In this sense, $u$ and all elements $\omega$ of the hull look exactly the same locally.

We wish to single out a special element of $\Omega_{a,b}$. Notice that $ba$ occurs in $u$ and that $S^2(a) = aba$ begins with $a$ and $S^2(b) = ab$ ends with $b$. Thus, iterating $S^2$ on $b|a$, where $|$ denotes the eventual origin, we obtain as a limit a two-sided infinite sequence which belongs to $\Omega_{a,b}$ and coincides with $u$ to the right of the origin. This element of $\Omega_{a,b}$ will be denoted by $\omega_s$.

Each $\omega \in \Omega_{a,b}$ generates a Jacobi matrix $H_\omega$ acting in $\ell^2(\mathbb{Z})$,

\[ (H_\omega \psi)_n = \omega_{n+1} \psi_{n+1} + \omega_n \psi_{n-1}. \]

With respect to the standard orthonormal basis $\{\delta_n\}_{n \in \mathbb{Z}}$ of $\ell^2(\mathbb{Z})$, where $\delta_n$ is one at $n$ and vanishes otherwise, this operator has the matrix

\[ \begin{pmatrix}
\ddots & \ddots & \ddots & & & \\
\ddots & 0 & \omega_{-1} & 0 & & \\
\ddots & \omega_{-1} & 0 & \omega_0 & 0 & \\
 & 0 & \omega_0 & 0 & \omega_1 & \ddots \\
 & & 0 & \omega_1 & 0 & \ddots \\
 & & & \ddots & \ddots & \ddots
\end{pmatrix} \]

and it is clearly self-adjoint.



This family of operators, $\{H_\omega\}_{\omega \in \Omega_{a,b}}$, is called the off-diagonal Fibonacci model. Of course, the structure of the Fibonacci sequence disappears when $a = b$. In this case, the hull consists of a single element, the constant two-sided infinite sequence taking the value $a = b$, and the spectrum and the spectral measures of the associated operator $H_\omega$ are well understood. For this reason, we will below always assume that $a \neq b$. Nevertheless, the limiting case, where we fix $a$, say, and let $b$ tend to $a$ is of definite interest.

Our first result concerns general properties of the spectrum of $H_\omega$. For $S \subset \mathbb{R}$, we denote by $\dim_H S$ the Hausdorff dimension of $S$ and by $\dim_B S$ the box counting dimension of $S$ (which is then implicitly claimed to exist).

Theorem A.1. Suppose $a, b > 0$ and $a \neq b$. Then, there exists a compact set $\Sigma_{a,b} \subset \mathbb{R}$ such that $\sigma(H_\omega) = \Sigma_{a,b}$ for every $\omega \in \Omega_{a,b}$, and

(i) $\Sigma_{a,b}$ has zero Lebesgue measure.
(ii) The Hausdorff dimension $\dim_H \Sigma_{a,b}$ is an analytic function of $a$ and $b$.
(iii) $0 < \dim_H \Sigma_{a,b} < 1$.

More can be said about the spectrum when $a$ and $b$ are close to each other:

Theorem A.2. There exists $\varepsilon_0 > 0$ such that if $a, b > 0$, $a \neq b$, and $\frac{a^2 + b^2}{2ab} < 1 + \varepsilon_0$ (in other words, if $a$ and $b$ are close enough), then

(iv) The spectrum $\Sigma_{a,b}$ is a Cantor set that depends continuously on $a$ and $b$ in the Hausdorff metric.
(v) For every small $\delta > 0$ and every $E \in \Sigma_{a,b}$, we have
\[ \dim_H \left( (E - \delta, E + \delta) \cap \Sigma_{a,b} \right) = \dim_B \left( (E - \delta, E + \delta) \cap \Sigma_{a,b} \right) = \dim_H \Sigma_{a,b} = \dim_B \Sigma_{a,b}. \]
(vi) Denote $\alpha = \dim_H \Sigma_{a,b}$, then the Hausdorff $\alpha$-measure of $\Sigma_{a,b}$ is positive and finite.
(vii) We have that $\Sigma_{a,b} + \Sigma_{a,b} = [\min \Sigma_{a,b} + \min \Sigma_{a,b}, \max \Sigma_{a,b} + \max \Sigma_{a,b}]$.

Given these results, and especially Theorem A.2.(vii), we can confirm rigorously the observation made by Even-Dar Mandel and Lifshitz in [EL06,EL07] that the square Fibonacci Hamiltonian (based on the off-diagonal one-dimensional model) has no gaps in its spectrum for sufficiently small coupling.

Next, we turn to the spectral type of $H_\omega$.

Theorem A.3. Suppose $a, b > 0$ and $a \neq b$. Then, for every $\omega \in \Omega_{a,b}$, $H_\omega$ has purely singular continuous spectrum.

Throughout the rest of this appendix we will only consider $a, b > 0$ with $a \neq b$. Theorems A.1 and A.2 are proved in Subsect. A.2 and Theorem A.3 is proved in Subsect. A.3.



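A.2. The trace map and its relation to the spectrum. The spectral properties of $H_\omega$ are closely related to the behavior of the solutions to the difference equation

\[ \omega_{n+1} u_{n+1} + \omega_n u_{n-1} = E u_n. \tag{47} \]

Denote

\[ U_n = \begin{pmatrix} u_n \\ \omega_n u_{n-1} \end{pmatrix}. \]

Then $u$ solves (47) (for every $n \in \mathbb{Z}$) if and only if $U$ solves

\[ U_n = T_\omega(n, E) U_{n-1} \tag{48} \]

(for every $n \in \mathbb{Z}$), where

\[ T_\omega(n, E) = \frac{1}{\omega_n} \begin{pmatrix} E & -1 \\ \omega_n^2 & 0 \end{pmatrix}. \]

Note that $\det T_\omega(n, E) = 1$. Iterating (48), we find

\[ U_n = M_\omega(n, E) U_0, \]

where $M_\omega(n, E) = T_\omega(n, E) \times \cdots \times T_\omega(1, E)$. With the Fibonacci numbers $\{F_k\}$, generated by $F_0 = F_1 = 1$, $F_{k+1} = F_k + F_{k-1}$ for $k \ge 1$, we define

\[ x_k = x_k(E) = \frac{1}{2} \operatorname{Tr} M_{\omega_s}(F_k, E). \]

For example, we have

\begin{align*}
M_{\omega_s}(F_1, E) &= \frac{1}{a} \begin{pmatrix} E & -1 \\ a^2 & 0 \end{pmatrix}, \\
M_{\omega_s}(F_2, E) &= \frac{1}{b} \begin{pmatrix} E & -1 \\ b^2 & 0 \end{pmatrix} \frac{1}{a} \begin{pmatrix} E & -1 \\ a^2 & 0 \end{pmatrix} = \frac{1}{ab} \begin{pmatrix} E^2 - a^2 & -E \\ E b^2 & -b^2 \end{pmatrix}, \\
M_{\omega_s}(F_3, E) &= \frac{1}{a} \begin{pmatrix} E & -1 \\ a^2 & 0 \end{pmatrix} \frac{1}{b} \begin{pmatrix} E & -1 \\ b^2 & 0 \end{pmatrix} \frac{1}{a} \begin{pmatrix} E & -1 \\ a^2 & 0 \end{pmatrix} \\
&= \frac{1}{a^2 b} \begin{pmatrix} E^3 - E a^2 - E b^2 & -E^2 + b^2 \\ E^2 a^2 - a^4 & -E a^2 \end{pmatrix},
\end{align*}

and hence

\[ x_1 = \frac{E}{2a}, \quad x_2 = \frac{E^2 - a^2 - b^2}{2ab}, \quad x_3 = \frac{E^3 - 2E a^2 - E b^2}{2a^2 b}. \tag{49} \]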


Lemma A.4. We have

\[ x_{k+1} = 2x_k x_{k-1} - x_{k-2} \tag{50} \]

for $k \ge 2$. Moreover, the quantity

\[ I_k = x_{k+1}^2 + x_k^2 + x_{k-1}^2 - 2x_{k+1} x_k x_{k-1} - 1 \tag{51} \]

is independent of both $k$ and $E$ and it is given by

\[ I = \frac{(a^2 + b^2)^2}{4a^2 b^2} - 1. \]

Proof. Since $\omega_s$ restricted to $\{n \ge 1\}$ coincides with $u$ and the prefixes $s_k$ of $u$ of length $F_k$ obey $s_{k+1} = s_k s_{k-1}$ for $k \ge 2$ by construction (cf. (46)), the recursion (50) follows as in the diagonal case; compare [D00,D07a,S87]. This recursion in turn implies readily that (51) is $k$-independent. In particular, the $x_k$'s are again generated by the trace map

\[ T(x, y, z) = (2xy - z, x, y) \]

and the preserved quantity is again

\[ I(x, y, z) = x^2 + y^2 + z^2 - 2xyz - 1. \]

The only difference between the diagonal and the off-diagonal model can be found in the initial conditions. How are $x_1, x_0, x_{-1}$ obtained? Observe that the trace map is invertible and hence we can apply its inverse twice to the already defined quantity $(x_3, x_2, x_1)$. We have

\[ T^{-1}(x, y, z) = (y, z, 2yz - x) \]

and hence, using (49),

\begin{align*}
(x_1, x_0, x_{-1}) &= T^{-2}(x_3, x_2, x_1) \\
&= T^{-2} \left( \frac{E^3 - 2E a^2 - E b^2}{2a^2 b}, \frac{E^2 - a^2 - b^2}{2ab}, \frac{E}{2a} \right) \\
&= T^{-1} \left( \frac{E^2 - a^2 - b^2}{2ab}, \frac{E}{2a}, 2 \, \frac{(E^2 - a^2 - b^2) E}{4a^2 b} - \frac{E^3 - 2E a^2 - E b^2}{2a^2 b} \right) \\
&= T^{-1} \left( \frac{E^2 - a^2 - b^2}{2ab}, \frac{E}{2a}, \frac{E}{2b} \right) \\
&= \left( \frac{E}{2a}, \frac{E}{2b}, 2 \, \frac{E^2}{4ab} - \frac{E^2 - a^2 - b^2}{2ab} \right) \\
&= \left( \frac{E}{2a}, \frac{E}{2b}, \frac{a^2 + b^2}{2ab} \right).
\end{align*}

It follows that

\begin{align*}
I(x_{k+1}, x_k, x_{k-1}) &= I(x_1, x_0, x_{-1}) \\
&= \frac{E^2}{4a^2} + \frac{E^2}{4b^2} + \frac{(a^2 + b^2)^2}{4a^2 b^2} - 2 \, \frac{E^2 (a^2 + b^2)}{8a^2 b^2} - 1 \\
&= \frac{(a^2 + b^2)^2}{4a^2 b^2} - 1
\end{align*}

for every $k \ge 0$.
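The identities in Lemma A.4 lend themselves to an exact check with rational arithmetic. The following sketch is not part of the proof: it picks sample values $a = 1$, $b = 2$, $E = 1/3$ (an arbitrary choice), builds $M_{\omega_s}(F_k, E)$ from the prefixes $s_k$ of $u$, and compares the half-traces with (49), the recursion (50), the two inverse steps of the trace map, and the invariant. All function names are hypothetical.

```python
from fractions import Fraction


def transfer(w, E):
    """T = (1/w) [[E, -1], [w^2, 0]] for hopping weight w and energy E."""
    return ((E / w, -1 / w), (w, Fraction(0)))


def mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )


def fibonacci_word(k):
    """Prefix s_k of u for k >= 1, with s_1 = a, s_2 = ab, s_{k+1} = s_k s_{k-1}."""
    prev, cur = "a", "ab"
    if k == 1:
        return prev
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur


def half_trace(k, a, b, E):
    """x_k = (1/2) Tr [T(u_{F_k}) ... T(u_1)], the rightmost factor acting first."""
    M = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
    for letter in fibonacci_word(k):
        M = mul(transfer(a if letter == "a" else b, E), M)
    return (M[0][0] + M[1][1]) / 2


def trace_map(x, y, z):
    return (2 * x * y - z, x, y)


def trace_map_inverse(x, y, z):
    return (y, z, 2 * y * z - x)


def invariant(x, y, z):
    return x * x + y * y + z * z - 2 * x * y * z - 1


a, b, E = Fraction(1), Fraction(2), Fraction(1, 3)
x = {k: half_trace(k, a, b, E) for k in range(1, 10)}
start = trace_map_inverse(*trace_map_inverse(x[3], x[2], x[1]))
print(start)                          # (x_1, x_0, x_{-1})
print(invariant(x[3], x[2], x[1]))    # (a^2 + b^2)^2 / (4 a^2 b^2) - 1
```

With these sample values the invariant evaluates to $9/16$, independently of $E$, as Lemma A.4 predicts.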



It is of crucial importance for the spectral analysis that, as in the diagonal case, the invariant is energy-independent and strictly positive when $a \neq b$!

Lemma A.5. The spectrum of $H_\omega$ is independent of $\omega$ and may be denoted by $\Sigma_{a,b}$. With

\[ \sigma_k = \{ E \in \mathbb{R} : |x_k| \le 1 \}, \]

we have

\[ \Sigma_{a,b} = \bigcap_{k \ge 1} \left( \sigma_k \cup \sigma_{k+1} \right). \tag{52} \]

Moreover, for every $E \in \Sigma_{a,b}$ and $k \ge 2$,

\[ |x_k| \le 1 + \left( \frac{(a^2 + b^2)^2}{4a^2 b^2} - 1 \right)^{1/2} \tag{53} \]

and for $E \notin \Sigma_{a,b}$, $|x_k|$ diverges super-exponentially.

Proof. It is well known that the hull $\Omega_{a,b}$ together with the standard shift transformation is minimal. In particular, every $\omega \in \Omega_{a,b}$ may be approximated pointwise by a sequence of shifts of any other $\tilde{\omega} \in \Omega_{a,b}$. The associated operators then converge strongly and we get $\sigma(H_\omega) \subseteq \sigma(H_{\tilde{\omega}})$. Reversing the roles of $\omega$ and $\tilde{\omega}$, the first claim follows. So let $\Sigma_{a,b}$ denote the common spectrum of the operators $H_\omega$, $\omega \in \Omega_{a,b}$. We have $\|H_\omega\| \le \max\{2a, 2b\}$. Thus, $\Sigma_{a,b} \subseteq [-\max\{2a, 2b\}, \max\{2a, 2b\}] =: I_{a,b}$. For $E \in I_{a,b}$, we have that at least one of $x_1, x_0$ belongs to $[-1, 1]$. This observation allows us to use the exact same arguments Sütő used to prove (52) for the diagonal model in [S87].

The only point where care needs to be taken is the claim that $\sigma_k$ is the spectrum of the periodic Jacobi matrix obtained by repeating the values $\omega_s$ takes on $\{1 \le n \le F_k\}$ periodically on the off-diagonals. This, however, follows from the general theory of periodic Jacobi matrices, which relies on the diagonalization of the monodromy matrix (which is $M_{\omega_s}(F_k, E)$ in this case) in order to obtain Floquet solutions and in particular discriminate between those energies that permit exponentially growing solutions and those that do not. This distinction works just as well here, but one needs to use that the $\omega_n$'s that enter in the $U_n$'s are uniformly bounded away from zero and infinity. Thus, after paying attention to this fact, we may now proceed along the lines of Sütő.

Let us describe the main steps of the argument. Since at least one of $x_1, x_0$ belongs to $[-1, 1]$, we have a result analogous to [S87, Lemma 2] with the same proof as given there. Namely, the sequence $\{x_k\}_{k \ge 0}$ is unbounded if and only if there exists $k$ such that $|x_k| > 1$ and $|x_{k+1}| > 1$. Moreover, we then have $|x_{k+l}| > c^{F_l}$ for some $c > 1$ and all $l \ge 0$. This shows

\[ \sigma_k \cup \sigma_{k+1} = \bigcup_{l \ge 0} \sigma_{k+l}. \]

Using now the fact that the $F_k$ periodic Jacobi matrices with spectrum $\sigma_k$ converge strongly to $H_{\omega_s}$, we obtain

\[ \Sigma_{a,b} \subseteq \bigcap_{k \ge 1} \overline{\bigcup_{l \ge 0} \sigma_{k+l}} = \bigcap_{k \ge 1} \overline{\sigma_k \cup \sigma_{k+1}} = \bigcap_{k \ge 1} \left( \sigma_k \cup \sigma_{k+1} \right), \]

since the spectra $\sigma_k$ and $\sigma_{k+1}$ are closed sets. Thus, we have one inclusion in (52).


Next, suppose $E \in \bigcap_{k \ge 1} \left( \sigma_k \cup \sigma_{k+1} \right)$. If $k \ge 1$ is such that $|x_k| > 1$, then $|x_{k-1}| \le 1$ and $|x_{k+1}| \le 1$. Since we have

\[ x_{k+1}^2 + x_k^2 + x_{k-1}^2 - 2x_{k+1} x_k x_{k-1} - 1 = \frac{(a^2 + b^2)^2}{4a^2 b^2} - 1, \]

this implies

\[ x_k = x_{k+1} x_{k-1} \pm \left( 1 - x_{k+1}^2 - x_{k-1}^2 + x_{k+1}^2 x_{k-1}^2 + \frac{(a^2 + b^2)^2}{4a^2 b^2} - 1 \right)^{1/2}, \]

and hence

\[ |x_k| \le |x_{k+1} x_{k-1}| + \left( (1 - x_{k+1}^2)(1 - x_{k-1}^2) + \frac{(a^2 + b^2)^2}{4a^2 b^2} - 1 \right)^{1/2}, \]

which, using $|x_{k-1}| \le 1$ and $|x_{k+1}| \le 1$ again, implies the estimate (53) for $E \in \bigcap_{k \ge 1} \left( \sigma_k \cup \sigma_{k+1} \right)$.

We will show in the next subsection that the boundedness of the sequence $\{x_k\}_{k \ge 0}$ implies that, for arbitrary $\omega \in \Omega_{a,b}$, no solution of the difference equation (47) is square-summable at $+\infty$. Consequently, such $E$'s belong to $\Sigma_{a,b}$.^7 This shows the other inclusion in (52) and hence establishes it. Moreover, it follows that (53) holds for every $E \in \Sigma_{a,b}$.

Finally, from the representation (52) of $\Sigma_{a,b}$ and our observation above about unbounded sequences $\{x_k\}_{k \ge 0}$, we find that $|x_k|$ diverges super-exponentially for $E \notin \Sigma_{a,b}$. This concludes the proof of the lemma.

Lemma A.6. For every $E \in \mathbb{R}$, there is $\gamma(E) \ge 0$ such that

\[ \lim_{n \to \infty} \frac{1}{n} \log \|M_\omega(n, E)\| = \gamma(E), \]

uniformly in $\omega \in \Omega_{a,b}$.

Proof. This follows directly from the uniform subadditive ergodic theorem; compare [DL99b,Ho].

Lemma A.7. The set $Z_{a,b} := \{ E \in \mathbb{R} : \gamma(E) = 0 \}$ has zero Lebesgue measure.

Proof. This is one of the central results of Kotani theory; see [Ko] and also [D07b]. Note that these papers only discuss the diagonal model. Kotani theory for Jacobi matrices is discussed in Carmona-Lacroix [CL] and the result needed can be deduced from what is presented there. For a recent reference that states a result sufficient for our purpose explicitly, see Remling [Re].

Lemma A.8. We have $\Sigma_{a,b} = Z_{a,b}$.

^7 This follows by a standard argument: If $E \notin \Sigma_{a,b}$, then $(H_\omega - E)^{-1}$ exists and hence $(H_\omega - E)^{-1} \delta_0$ is an $\ell^2(\mathbb{Z})$ vector that solves (47) away from the origin. Choosing its values for $n \ge 1$, say, and then using (47) to extend it to all of $\mathbb{Z}$, we obtain a solution that is square-summable at $+\infty$.



Proof. The inclusion $\Sigma_{a,b} \supseteq Z_{a,b}$ holds by general principles. For example, one can construct Weyl sequences by truncation when $\gamma(E) = 0$. The inclusion $\Sigma_{a,b} \subseteq Z_{a,b}$ can be proved in two ways. Either one uses the boundedness of $x_k$ for energies $E \in \Sigma_{a,b}$ to prove explicit polynomial upper bounds for $\|M_\omega(n, E)\|$ (as in [IT] for $\omega = \omega_s$ or in [DL99b] for general $\omega \in \Omega_{a,b}$), or one combines the proof of the absence of decaying solutions at $+\infty$ for $E \in \Sigma_{a,b}$ given in the next subsection with Oseledec's Theorem, which states that $\gamma(E) > 0$ would imply the existence of an exponentially decaying solution at $+\infty$. Here we use one more time that $\|U_n\|$ is comparable in norm to $\|(u_n, u_{n-1})^T\|$.

Proof of Theorems A.1 and A.2. The existence of the uniform spectrum $\Sigma_{a,b}$ was shown in Lemma A.5 and the fact that $\Sigma_{a,b}$ has zero Lebesgue measure follows from Lemmas A.7 and A.8. The set of bounded orbits of the restriction of the trace map $T : \mathbb{R}^3 \to \mathbb{R}^3$ to the invariant surface $I(x, y, z) = C \equiv \frac{(a^2 + b^2)^2}{4a^2 b^2} - 1$, $C > 0$, is hyperbolic; see [Can] (and also [DG09a] for $C$ sufficiently small and [Cas] for $C$ sufficiently large). Due to Lemma A.5, the points of the spectrum correspond to the points of the intersection of the line of the initial conditions

\[ \ell_{a,b} \equiv \left\{ \left( \frac{E}{2a}, \frac{E}{2b}, \frac{a^2 + b^2}{2ab} \right) : E \in \mathbb{R} \right\} \]

with the stable manifolds of the hyperbolic set of bounded orbits. Properties (ii) and (iii) can be proved in exactly the same way as Theorem 6.5 in [Can]. The line $\ell_{a,b}$ intersects the stable lamination of the hyperbolic set transversally for sufficiently small $C > 0$, as can be shown in the same way as for the diagonal Fibonacci Hamiltonian with a small coupling constant; see [DG09a]. Therefore the spectrum $\Sigma_{a,b}$ for close enough $a$ and $b$ is a dynamically defined Cantor set, and the properties (iv)–(vi) follow; see [DEGT,DG09a,Ma,MM,P,PT] and references therein. The statement (vii) follows as in the diagonal case since the thickness of $\Sigma_{a,b}$ tends to infinity as $\frac{a^2 + b^2}{2ab}$ approaches 1.

Notice that a proof of the transversality of the line $\ell_{a,b}$ to the stable lamination of the hyperbolic set of bounded orbits for arbitrary $a \neq b$ would imply the properties (iv)–(vi) for these values of $a$ and $b$.

A.3. Singular continuous spectrum. In this subsection we prove Theorem A.3. Given the results from the previous subsection, we can follow the proofs from the diagonal case quite closely.

Proof of Theorem A.3. Since the absence of absolutely continuous spectrum follows from zero measure spectrum, we only need to show the absence of point spectrum. It was shown by Damanik and Lenz [DL99a] that, given any $\omega \in \Omega_{a,b}$ and $k \ge 1$, the restriction of $\omega$ to $\{n \ge 1\}$ begins with a square

\[ \omega_1 \ldots \omega_{2F_k} \ldots = \omega_1 \ldots \omega_{F_k} \omega_1 \ldots \omega_{F_k} \ldots \]

such that $\omega_1 \ldots \omega_{F_k}$ is a cyclic permutation of $S^k(a)$. By cyclic invariance of the trace, it follows that

\[ \operatorname{Tr} M_\omega(F_k, E) = 2x_k(E) \]

for every $E$. The Cayley-Hamilton Theorem, applied to $M_\omega(F_k, E)$, says that

\[ M_\omega(F_k, E)^2 - \left( \operatorname{Tr} M_\omega(F_k, E) \right) M_\omega(F_k, E) + I = 0, \]


which, by the observations above, translates to

\[ M_\omega(2F_k, E) - 2x_k M_\omega(F_k, E) + I = 0. \]

If $E \in \Sigma_{a,b}$ and $u$ is a solution of the difference equation (47), it therefore follows that

\[ U(2F_k + 1) - 2x_k U(F_k + 1) + U(1) = 0. \]

If $u$ does not vanish identically, this shows that $u_n \not\to 0$ as $n \to \infty$ since the $x_k$'s are bounded above and the $\omega_n$'s are bounded below away from zero. In particular, if $E \in \Sigma_{a,b}$, then no non-trivial solution of (47) is square-summable at $+\infty$ and hence $E$ is not an eigenvalue. It follows that the point spectrum of $H_\omega$ is empty.

References

[As00] Astels, S.: Cantor sets and numbers with restricted partial quotients. Trans. Amer. Math. Soc. 352, 133–170 (2000)
[As01] Astels, S.: Sums of numbers with small partial quotients. II. J. Number Theory 91, 187–205 (2001)
[As02] Astels, S.: Sums of numbers with small partial quotients. Proc. Amer. Math. Soc. 130, 637–642 (2002)
[AJ] Avila, A., Jitomirskaya, S.: The ten martini problem. Ann. of Math. 170, 303–342 (2009)
[AS] Avron, J., Simon, B.: Almost periodic Schrödinger operators. II. The integrated density of states. Duke Math. J. 50, 369–391 (1983)
[BGJ] Baake, M., Grimm, U., Joseph, D.: Trace maps, invariants, and some of their applications. Internat. J. Mod. Phys. B 7, 1527–1550 (1993)
[BR] Baake, M., Roberts, J.: The dynamics of trace maps. In: Hamiltonian Mechanics (Toruń, 1993), NATO Adv. Sci. Inst. Ser. B Phys. 331, New York: Plenum, 1994, pp. 275–285
[B] Bellissard, J.: Spectral properties of Schrödinger's operator with a Thue-Morse potential. In: Number Theory and Physics (Les Houches, 1989), Springer Proc. Phys. 47, Berlin: Springer, 1990, pp. 140–150
[BBG91] Bellissard, J., Bovier, A., Ghez, J.-M.: Spectral properties of a tight binding Hamiltonian with period doubling potential. Commun. Math. Phys. 135, 379–399 (1991)
[BBG92] Bellissard, J., Bovier, A., Ghez, J.-M.: Gap labelling theorems for one-dimensional discrete Schrödinger operators. Rev. Math. Phys. 4, 1–37 (1992)
[BS] Bellissard, J., Schulz-Baldes, H.: Subdiffusive quantum transport for 3D Hamiltonians with absolutely continuous spectra. J. Stat. Phys. 99, 587–594 (2000)
[BG] Bovier, A., Ghez, J.-M.: Remarks on the spectral properties of tight-binding and Kronig-Penney models with substitution sequences. J. Phys. A 28, 2313–2324 (1995)
[Can] Cantat, S.: Bers and Hénon, Painlevé and Schrödinger. Duke Math. J. 149, 411–460 (2009)
[CL] Carmona, R., Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston, MA: Birkhäuser, 1990
[Cas] Casdagli, M.: Symbolic dynamics for the renormalization map of a quasiperiodic Schrödinger equation. Commun. Math. Phys. 107, 295–318 (1986)
[Cus] Cusick, T.: On M. Hall's continued fraction theorem. Proc. Amer. Math. Soc. 38, 253–254 (1973)
[CFKS] Cycon, H., Froese, R., Kirsch, W., Simon, B.: Schrödinger Operators with Application to Quantum Mechanics and Global Geometry. Texts and Monographs in Physics, Berlin: Springer-Verlag, 1987
[D00] Damanik, D.: Gordon-type arguments in the spectral theory of one-dimensional quasicrystals. In: Directions in Mathematical Quasicrystals. CRM Monogr. Ser. 13, Providence, RI: Amer. Math. Soc., 2000, pp. 277–305
[D07a] Damanik, D.: Strictly ergodic subshifts and associated operators. In: Spectral Theory and Mathematical Physics: a Festschrift in Honor of Barry Simon's 60th Birthday. Proc. Sympos. Pure Math. 76, Part 2, Providence, RI: Amer. Math. Soc., 2007, pp. 505–538
[D07b] Damanik, D.: Lyapunov exponents and spectral analysis of ergodic Schrödinger operators: a survey of Kotani theory and its applications. In: Spectral Theory and Mathematical Physics: a Festschrift in Honor of Barry Simon's 60th Birthday, Proc. Sympos. Pure Math. 76, Part 2, Providence, RI: Amer. Math. Soc., 2007, pp. 539–563
[DEGT] Damanik, D., Embree, M., Gorodetski, A., Tcheremchantsev, S.: The fractal dimension of the spectrum of the Fibonacci Hamiltonian. Commun. Math. Phys. 280, 499–516 (2008)

276


[DG09a] Damanik, D., Gorodetski, A.: Hyperbolicity of the trace map for the weakly coupled fibonacci hamiltonian. Nonlinearity 22, 123–143 (2009) [DG09b] Damanik, D., Gorodetski, A.: The spectrum of the weakly coupled fibonacci hamiltonian. Electronic Research Announcements in Mathematical Sciences 16, 23–29 (2009) [DKL] Damanik, D., Killip, R., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals. iii. α-continuity. Commun. Math. Phys. 212, 191–204 (2000) [DL99a] Damanik, D., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals, i. absence of eigenvalues. Commun. Math. Phys. 207, 687–696 (1999) [DL99b] Damanik, D., Lenz, D.: Uniform spectral properties of one-dimensional quasicrystals. ii. the lyapunov exponent. Lett. Math. Phys. 50, 245–257 (1999) [DST] Damanik, D., Süt˝o, A., Tcheremchantsev, S.: Power-law bounds on transfer matrices and quantum dynamics in one dimension ii. J. Funct. Anal. 216, 362–387 (2004) [DT03] Damanik, D., Tcheremchantsev, S.: Power-law bounds on transfer matrices and quantum dynamics in one dimension. Commun. Math. Phys. 236, 513–534 (2003) [DT07] Damanik, D., Tcheremchantsev, S.: Upper bounds in quantum dynamics. J. Amer. Math. Soc. 20, 799–827 (2007) [DT08] Damanik, D., Tcheremchantsev, S.: Quantum dynamics via complex analysis methods: general upper bounds without time-averaging and tight lower bounds for the strongly coupled fibonacci hamiltonian. J. Funct. Anal. 255, 2872–2887 (2008) [EL06] Even-Dar Mandel, S., Lifshitz, R.: Electronic energy spectra and wave functions on the square fibonacci tiling. Phil. Mag. 86, 759–764 (2006) [EL07] Even-Dar Mandel, S., Lifshitz, R.: Electronic energy spectra of square and cubic fibonacci quasicrystals. Phil. Mag. 88, 2261–2273 (2008) [EL08] Even-Dar Mandel, S., Lifshitz, R.: Bloch-like electronic wave functions in two-dimensional quasicrystals. 
http://arXiv.org/abs/0808.3659IIVL[cond-mat.mtrl.sci], 2008 [Ha] Hall, M.: On the sum and product of continued fractions. Ann. of Math. 48, 966–993 (1947) [Hl] Hlavka, J.: Results on sums of continued fractions. Trans. Amer. Math. Soc. 211, 123–134 (1975) [HPS] Hirsch, M., Pugh, C., Shub, M.: Invariant Manifolds. Lecture Notes in Mathematics 583, BerlinNew York: Springer-Verlag, 1977 [Ho] Hof, A.: Some remarks on discrete aperiodic schrödinger operators. J. Stat. Phys. 72, 1353–1374 (1993) [HCB] Hoggatt, V., Cox, N., Bicknell, M.: A primer for the fibonacci numbers. xii. Fibonacci Quart. 11, 317–331 (1973) [HM] Humphries, S., Manning, A.: Curves of fixed points of trace maps. Ergod. Th. & Dynam. Sys. 27, 1167–1198 (2007) [IT] Iochum, B., Testard, D.: Power law growth for the resistance in the fibonacci model. J. Stat. Phys. 65, 715–723 (1991) [Ka] Kadanoff, L.P.: Analysis of cycles for a volume preserving map, unpublished manuscript [KKT] Kohmoto, M., Kadanoff, L.P., Tang, C.: Localization problem in one dimension: mapping and escape. Phys. Rev. Lett. 50, 1870–1872 (1983) [Ko] Kotani, S.: Jacobi matrices with random potentials taking finitely many values. Rev. Math. Phys. 1, 129–133 (1989) [La] Last, Y.: Quantum dynamics and decompositions of singular continuous spectra. J. Funct. Anal. 142, 406–445 (1996) [LS] Last, Y., Simon, B.: Fine structure of the zeros of orthogonal polynomials. iv. a priori bounds and clock behavior. Comm. Pure Appl. Math. 61, 486–538 (2008) [Le] Lekkerkerker, C.: Representation of natural numbers as a sum of fibonacci numbers. Simon Stevin 29, 190–195 (1952) [LW] Liu, Q.-H., Wen, Z.-Y.: Hausdorff dimension of spectrum of one-dimensional schrödinger operator with sturmian potentials. Pot. Anal. 20, 33–59 (2004) [Ma] Mané, R.: The hausdorff dimension of horseshoes of diffeomorphisms of surfaces. Bol. Soc. Brasil. Mat. (N.S.) 20, 1–24 (1990) [Me] de Melo, W.: Structural stability of diffeomorphisms on two-manifolds. Invent. Math. 
21, 233–246 (1973) [MM] Manning, A., McCluskey, H.: Hausdorff dimension for horseshoes. Erg. Theory Dynam. Sys. 3, 251–261 (1983) [N79] Newhouse, S.: The abundance of wild hyperbolic sets and nonsmooth stable sets for diffeomorphisms. Inst. Hautes Études Sci. Publ. Math. 50, 101–151 (1979) [N70] Newhouse, S.: Nondensity of axiom A(a) on S 2 . Global Analysis (Proc. Sympos. Pure Math., Vol. XIV, Berkeley, Calif., 1968), Providence, RI: Amer. Math. Soc., 1970, pp. 191–202 [OK] Ostlund, S., Kim, S.-H.: Renormalization of quasiperiodic mappings. Physica Scripta T9, 193–198 (1985)


[PT] [P] [PSW] [Ra] [RS] [Re] [Ro] [S] [Si89] [SM89] [SM90] [SMS] [S87] [S89] [S95] [T] [Z]

277

Palis, J., Takens, F.: Hyperbolicity and Sensitive Chaotic Dynamics at Homoclinic Bifurcations. Cambridge: Cambridge University Press, 1993 Pesin, Ya.: Dimension Theory in Dynamical Systems. Chicago Lectures in Mathematics Series, Chicago, IL: Univ. Chicago Press, 1997 Pugh, C., Shub, M., Wilkinson, A.: Hölder foliations. Duke Math. J. 86, 517–546 (1997) Raymond, L.: A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain. Preprint, 1997 Reed, M., Simon, B.: Methods of Modern Mathematical Physics. I. Functional Analysis, 2nd edition, New York: Academic Press, 1980 Remling, C.: The absolutely continuous spectrum of Jacobi matrices. Preprint, 2007 Roberts, J.: Escaping orbits in trace maps. Phys. A 228, 295–325 (1996) Simon, B.: Operators with singular continuous spectrum. vii. examples with borderline time decay. Commun. Math. Phys. 176, 713–722 (1996) Sire, C.: Electronic spectrum of a 2d quasi-crystal related to the octagonal quasi-periodic tiling. Europhys. Lett. 10, 483–488 (1989) Sire, C., Mosseri, R.: Spectrum of 1d quasicrystals near the periodic chain. J. Phys. France 50, 3447– 3461 (1989) Sire, C., Mosseri, R.: Excitation spectrum, extended states, gap closing: some exact results for codimension one quasicrystals. J. Phys. France 51, 1569–1583 (1990) Sire, C., Mosseri, R., Sadoc, J.-F.: Geometric study of a 2d tiling related to the octagonal quasiperiodic tiling. J. Phys. France 55, 3463–3476 (1989) Süt˝o, A.: The spectrum of a quasiperiodic schrödinger operator. Commun. Math. Phys. 111, 409– 415 (1987) Süt˝o, A.: Singular continuous spectrum on a cantor set of zero lebesgue measure for the fibonacci hamiltonian. J. Stat. Phys. 56, 525–531 (1989) Süt˝o, A.: Schrödinger difference equation with deterministic ergodic potentials. In: Beyond Quasicrystals (Les Houches, 1994), Berlin: Springer, 1995, pp. 
481–549 Takens, F.: Limit capacity and Hausdorff dimension of dynamically defined Cantor sets, Dynamical Systems, Lecture Notes in Mathematics 1331, Berlin: Springer, 1988, pp. 196–212 Zeckendorf, E.: A generalized fibonacci numeration. Fibonacci Quart. 10, 365–372 (1972)

Communicated by G. Gallavotti




The Hamiltonian Structure of the Nonlinear Schrödinger Equation and the Asymptotic Stability of its Ground States

Scipio Cuccagna

Department of Mathematics and Computer Sciences, University of Trieste, via Valerio 12/1, Trieste 34127, Italy. E-mail: [email protected]

Received: 19 October 2009 / Accepted: 30 January 2011
Published online: 19 May 2011 – © Springer-Verlag 2011

Abstract: In this paper we prove that ground states of the NLS which satisfy the sufficient conditions for orbital stability of M. Weinstein are also asymptotically stable, for seemingly generic equations. The key issue is to prove that a certain coefficient is non-negative because it is a square power. We assume that the NLS has a smooth short range nonlinearity. We also assume the presence of a very short range and smooth linear potential, to avoid translation invariance. The basic idea is to perform a Birkhoff normal form argument on the hamiltonian, as in a paper by Bambusi and Cuccagna on the stability of the 0 solution for NLKG. But in our case, the natural coordinates arising from the linearization are not canonical. So we also need to apply the Darboux Theorem, with some care though, in order not to destroy some nice features of the initial hamiltonian.

1. Introduction

We consider the nonlinear Schrödinger equation (NLS)

$$ i u_t = -\Delta u + V u + \beta(|u|^2) u, \quad u(0, x) = u_0(x), \quad (t, x) \in \mathbb{R} \times \mathbb{R}^3 \tag{1.1} $$

with $-\Delta + V(x)$ a selfadjoint Schrödinger operator. Here $V(x) \neq 0$ to exclude translation invariance. We assume that both $V(x)$ and $\beta(|u|^2)u$ are short range and smooth. We assume that (1.1) has a smooth family of ground states. We then prove that the sufficient conditions for orbital stability by Weinstein [W1] (which, essentially, represent the correct definition of linear stability, see [Cu3]) imply for a generic (1.1) that the ground states are not only orbitally stable, as proved in [W1] (under less restrictive hypotheses), but that their orbits are also asymptotically stable. That is, a solution u(t) of (1.1) starting sufficiently close to ground states is asymptotically of the form $e^{i\theta(t)} \phi_{\omega_+}(x) + e^{it\Delta} h_+$, for $\omega_+$ a fixed number and for $h_+ \in H^1(\mathbb{R}^3)$ a small energy function. The problem of stability of ground states has a long history. Orbital stability has been well understood since the 80’s, see in sequence [CL,W1,GSS1,GSS2], and has been a very active


field afterwards. Asymptotic stability is a more recent, and less explored, field. In the context of the NLS, the first results are in the pioneering works [SW1,SW2,BP1,BP2]. Almost all references on asymptotic stability of ground states of the NLS tackle the problem by first linearizing at ground states, and by attempting to deal with the resulting nonlinear problem for the error term. An apparent problem in the linear theory is that the linearization is not a symmetric operator. However the linearization is covered by the scattering theory of non-selfadjoint operators developed by T. Kato in the 60’s, see his classical [K], see also [CPV,S]. Dispersive and Strichartz estimates for the linearization, analogous to the theory for short range scalar Schrödinger operators elaborated in [JSS,Y1,Y2], to name only a few of many papers, can be proved using similar ideas, see for example [Cu1,S,KS]. It is fair to say that anything that can be proved for short range scalar Schrödinger operators can also be proved for the linearizations. The only notable exception is the problem of “positive signature” embedded eigenvalues, see [Cu3], which we conjecture not to exist (in analogy to the absence of embedded eigenvalues for short range Schrödinger operators), and which in any case are unstable, see [CPV]. Hence it is reasonable to focus on NLS’s where these positive signature embedded eigenvalues do not exist (in the case of ground states, all positive eigenvalues are of positive signature). While linear theory is not a problem in understanding asymptotic stability, the real trouble lies in the difficult NLS-like equation one obtains for the error term. Specifically, the linearization has discrete spectrum which, at the level of linear theory, tends not to decay and potentially could yield quasiperiodic solutions.
A good analogy with more standard problems is that the continuous spectrum of the linearization corresponds to stable spectrum while the discrete spectrum corresponds to central directions. Stability cannot be established by linear theory alone. The first intuition on how nonlinear interactions are responsible for loss of energy of the discrete modes is in a paper by Sigal [Si]. His ideas, inspired by the classical Fermi golden rule in linear theory, are later elaborated in [SW3] to study asymptotic stability of vacuum for the nonlinear Klein Gordon equations with a potential with non-empty discrete spectrum. This problem, easier than the one treated in the present paper, to a large extent is solved in [BC]. In reality, the main ideas in [SW3] had already been sketched, for the problem of stability of ground states of NLS, in a deep paper by Buslaev and Perelman [BP2], see also the expanded version [BS]. In the case when the linearization has just one positive eigenvalue close to the continuous spectrum, [SW3,BP2], or [Si] in a different context, identify the mechanisms for loss of energy of the discrete modes in the nonlinear coupling of continuous and discrete spectral components. Specifically, in the discrete mode equation there is a key coefficient of the form $\langle DF, F\rangle$ for D a positive operator and F a function. Assuming the generic condition $\langle DF, F\rangle \neq 0$, this gives rise to dissipative effects leading to leaking of energy from the discrete mode to the continuous modes, where energy disperses because of linear dispersion, and to the ground state. After [BP2] there is strong evidence that, generically, linearly stable ground states, in the sense of [W1], should be asymptotically stable. Still, it is a seemingly technically difficult problem to solve rigorously. After [BP2,SW3], a number of papers analyze the same ideas in various situations, [TY1,TY2,TY3,T,SW4,Cu2].
In the meantime, a useful series of papers [GNT,M1,M2] shows how to use endpoint Strichartz and smoothing estimates to prove in energy space the result of [SW2,PiW], generalizing the result and simplifying the argument. The next important breakthrough is due to Zhou and Sigal [GS]. They tackle for the first time the case of one positive eigenvalue arbitrarily close to 0, developing further the normal forms analysis of [BP2] and obtaining the rate of leaking conjectured in [SW3]


p.69. The argument is improved in [CM]. The crucial coefficient is now of the form $\langle DF, G\rangle$, with F and G not obviously related. In [CM] it is noticed that $\langle DF, G\rangle < 0$ is incompatible with orbital stability (an argument along these lines is suggested in [SW3] p.69). So, for orbitally stable ground states, the generic condition $\langle DF, G\rangle \neq 0$ implies positivity, and hence leaking of energy out of the discrete modes. This yields a result similar to [Si,BP2,SW3] and in particular is a partially positive answer to a conjecture on p.69 in [SW3]. The case with more than one positive eigenvalue is harder. In this case, due to possible cancellations, [CM] is not able to draw conclusions on the sign of the coefficients under the assumption of orbital stability. But, apart from the issue of positivity of the coefficients, [CM] shows that the rest of the proof does not depend on the number of positive eigenvalues. Moreover, [T,GW1,Cu3] show that if there are many positive eigenvalues, all close to the continuous spectrum, then the important coefficients are again of the form $\langle DF, F\rangle$. The reason for this lies in the hamiltonian nature of the NLS. The above papers contain normal forms arguments. The hamiltonian structure is somewhat lost in the above papers. When the eigenvalues are close to the continuous spectrum, the normal form argument consists of just one step. This single step does not change the crucial coefficients. Then, the hamiltonian nature of the initial system yields information on these coefficients (this is emphasized in [Cu3]). In the case treated in [GS,CM] though, there are many steps in the normal form. The important coefficients are changed in ways which look very complicated, see [Gz] which deals with the next two easiest cases after the easiest. The correct way to look at this problem is introduced in [BC], which deals with the problem introduced in [SW3]. Basically, the positivity can be seen by doing the normal form directly on the hamiltonian.
We give a preliminary and heuristic justification on why the hamiltonian structure is crucial at the end of Sect. 3. [BC] consists of a mixture of a Birkhoff normal forms argument with the arguments in [CM]. For asymptotic stability of ground states of NLS though, [BC] is still not enough. Indeed in [BC] something peculiar happens: the natural coordinates arising by the spectral decomposition of the linearization at the vacuum solution are also canonical coordinates for the symplectic structure. This is no longer true if instead of vacuum we consider ground states. So we need an extra step, which consists in the search of canonical coordinates, through the Darboux theorem. This step requires care, because we must make sure that our problem remains similar to a semilinear NLS also in the new system of coordinates. In the setting of [GW1], Zhou and Weinstein [GW2] track how much of the energy of the discrete modes goes to the ground state and how much is dispersed. For another result on asymptotic stability, that is asymptotic stability of the blow up profile, we refer to [MR]. In some respects the situation in [MR] is harder than here, since there the additional discrete modes are concentrated in the kernel of the linearization. There is important work on asymptotic stability for KdV equations due to Martel and Merle, see [MM1] and further references therein, which solve a problem initiated by Pego and Weinstein [PW], the latter closer in spirit to our approach to NLS. It is an interesting question to see if elaboration of ideas in [MM1,MMT] can be used for alternative solutions of the problem which we consider here. Our result does not cover important cases, like the pure power NLS, with $\beta(|u|^2) = -|u|^{p-1}$ and $V = 0$, where our result is probably false. Indeed it is well known that in 3D ground states are stable for $p < 7/3$ and unstable for $p \ge 7/3$. In the $p < 7/3$ case there are ground states of arbitrarily small $H^1$ norm.
They are counterexamples to the asymptotic stability in $H^1$ of the 0 solution. Then for $p > 5/3$ the 0 solution is asymptotically stable in a smaller space usually denoted by $\Sigma$, which involves also the $\|xu\|_{L^2_x}$ norm, see in [St] the comments after Theorem 6 p. 55. In $\Sigma$ there are no small ground states for $p \in (5/3, 7/3)$. Presumably one should be able to prove asymptotic stability of ground states in $\Sigma$. To our knowledge even the following (presumably easier) problem is not solved yet: the asymptotic stability of 0 in $\Sigma$ when $V \neq 0$, $\sigma_p(-\Delta + V) = \emptyset$ and $\beta(|u|^2) = -|u|^{p-1}$ with $p \in (5/3, 7/3)$. In the literature on asymptotic stability of ground states like [BP2,BS,GS,CM], the case of moving solitons is left aside, because in that set up it appears substantially more complex. We do not treat moving solitons here either, but it is possible that our approach might help also with moving solitons. In the step when we perform the Darboux Theorem, the velocity should freeze and we should reduce to the same situation considered from Sect. 8 on. The extra difficulty with moving solitons is that there are more obstructions to the fact that after Darboux we have a semilinear NLS. But it would be surprising if this difficulty had a really deep nature. In any case, the main conceptual problem stemming from [Si,BP2,SW3], which we solve here, is the issue of the positive semidefiniteness of the critical coefficients. There is a growing literature on interaction between solitons, see for example [MM2,HW,M3], and we expect our result to be relevant. We do not reference all the literature on asymptotic stability of ground states, see [CT] for more. We conclude by observing that Sigal [Si], Buslaev and Perelman [BP2] and Soffer and Weinstein [SW3] had identified with great precision the right mechanism of leaking of energy away from the discrete modes.

2. Statement of the Main Result

We will assume the following hypotheses.

(H1) $\beta(0) = 0$, $\beta \in C^\infty(\mathbb{R}, \mathbb{R})$.

(H2) There exists a $p \in (1, 5)$ such that for every $k \ge 0$ there is a fixed $C_k$ with
$$ \left| \frac{d^k}{dv^k} \beta(v^2) \right| \le C_k |v|^{p-k-1} \quad \text{if } |v| \ge 1. $$

(H3) $V(x)$ is smooth and for any multi index $\alpha$ there are $C_\alpha > 0$ and $a_\alpha > 0$ such that $|\partial_x^\alpha V(x)| \le C_\alpha e^{-a_\alpha |x|}$.

(H4) There exists an open interval $\mathcal{O}$ such that
$$ \Delta u - V u - \omega u + \beta(|u|^2) u = 0 \quad \text{for } x \in \mathbb{R}^3, \tag{2.1} $$
admits a $C^1$-family of ground states $\phi_\omega(x)$ for $\omega \in \mathcal{O}$.

(H5)
$$ \frac{d}{d\omega} \|\phi_\omega\|^2_{L^2(\mathbb{R}^3)} > 0 \quad \text{for } \omega \in \mathcal{O}. \tag{2.2} $$

(H6) Let $L_+ = -\Delta + V + \omega - \beta(\phi_\omega^2) - 2\beta'(\phi_\omega^2)\phi_\omega^2$ be the operator whose domain is $H^2(\mathbb{R}^3)$. Then $L_+$ has exactly one negative eigenvalue and does not have kernel.

(H7) Let $H_\omega$ be the linearized operator around $e^{it\omega}\phi_\omega$ (see Sect. 3 for the precise definition). There is a fixed $m \ge 0$ such that $H_\omega$ has m positive eigenvalues $\lambda_1(\omega) \le \lambda_2(\omega) \le \cdots \le \lambda_m(\omega)$. We assume there are fixed integers $m_0 = 0 < m_1 < \cdots < m_{l_0} = m$ such that $\lambda_j(\omega) = \lambda_i(\omega)$ exactly for i and j both in $(m_l, m_{l+1}]$ for some $l \le l_0$. In this case $\dim \ker(H_\omega - \lambda_j(\omega)) = m_{l+1} - m_l$. We assume there exist $N_j \in \mathbb{N}$ such that $0 < N_j \lambda_j(\omega) < \omega < (N_j + 1)\lambda_j(\omega)$ with $N_j \ge 1$. We set $N = N_1$.

(H8) There is no multi-index $\mu \in \mathbb{Z}^m$ with $|\mu| := |\mu_1| + \cdots + |\mu_m| \le 2N_1 + 3$ such that $\mu \cdot \lambda = \omega$.

(H9) If $\lambda_{j_1} < \cdots < \lambda_{j_k}$ are k distinct $\lambda$’s, and $\mu \in \mathbb{Z}^k$ satisfies $|\mu| \le 2N_1 + 3$, then we have
$$ \mu_1 \lambda_{j_1} + \cdots + \mu_k \lambda_{j_k} = 0 \iff \mu = 0. $$

(H10) $H_\omega$ has no other eigenvalues except for 0 and the $\pm\lambda_j(\omega)$. The points $\pm\omega$ are not resonances.

(H11) The Fermi golden rule Hypothesis (H11) in Subsect. 10.1, see (10.24), holds.

Remark 2.1. The novelty of this paper with respect to [CM] is that we prove that some crucial coefficients are of a specific form, see (10.24). As a consequence, see Lemma 10.5, these coefficients are positive semidefinite. In the analogue of (10.24) in [CM], see Hypothesis 5.2 p.72 [CM], except for the special case n = 1 of just one eigenvalue (or of possibly many eigenvalues but all with $N_j = 1$), there is no clue on the sign of the term on the right-hand side of the key inequality, and the fact that it is positive is a hypothesis.

Theorem 2.2. Let $\omega_1 \in \mathcal{O}$ and $\phi_{\omega_1}(x)$ be a ground state of (1.1). Let u(t, x) be a solution to (1.1). Assume (H1)–(H10). Then, there exist an $\epsilon_0 > 0$ and a $C > 0$ such that if $\epsilon := \inf_{\gamma \in [0, 2\pi]} \|u_0 - e^{i\gamma}\phi_{\omega_1}\|_{H^1} < \epsilon_0$, there exist $\omega_\pm \in \mathcal{O}$, $\theta \in C^1(\mathbb{R}; \mathbb{R})$ and $h_\pm \in H^1$ with $\|h_\pm\|_{H^1} + |\omega_\pm - \omega_1| \le C\epsilon$ such that
$$ \lim_{t \to \pm\infty} \|u(t, \cdot) - e^{i\theta(t)}\phi_{\omega_\pm} - e^{it\Delta}h_\pm\|_{H^1} = 0. \tag{2.3} $$

It is possible to write $u(t, x) = e^{i\theta(t)}\phi_{\omega(t)} + A(t, x) + \tilde u(t, x)$ with $|A(t, x)| \le C_N(t)\langle x\rangle^{-N}$ for any N, with $\lim_{|t| \to \infty} C_N(t) = 0$, with $\lim_{t \to \pm\infty} \omega(t) = \omega_\pm$, and such that for any pair (r, p) which is admissible, by which we mean that
$$ 2/r + 3/p = 3/2, \quad 6 \ge p \ge 2, \quad r \ge 2, \tag{2.4} $$
we have
$$ \|\tilde u\|_{L^r_t(\mathbb{R}, W^{1,p}_x)} \le C\epsilon. \tag{2.5} $$
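As a quick sanity check on the admissibility condition (2.4), a few pairs can be tested by direct substitution. The three pairs below are chosen here only for illustration and are not singled out in the text; the endpoint $r = \infty$ is read with the usual convention $2/\infty = 0$.

```latex
% Admissible pairs (r,p) satisfy 2/r + 3/p = 3/2 with 2 <= p <= 6 and r >= 2.
\begin{align*}
(r,p) = (\infty, 2) &: \quad \tfrac{2}{\infty} + \tfrac{3}{2} = \tfrac{3}{2}, \\
(r,p) = (4, 3)      &: \quad \tfrac{2}{4} + \tfrac{3}{3} = \tfrac{1}{2} + 1 = \tfrac{3}{2}, \\
(r,p) = (2, 6)      &: \quad \tfrac{2}{2} + \tfrac{3}{6} = 1 + \tfrac{1}{2} = \tfrac{3}{2}.
\end{align*}
% Solving (2.4) for r gives r = 4p / (3(p-2)), so p ranging over [2,6]
% corresponds to r ranging over [2, \infty].
```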

We end the Introduction with some notation. Given two functions $f, g : \mathbb{R}^3 \to \mathbb{C}$ we set $\langle f, g\rangle = \int_{\mathbb{R}^3} f(x) g(x)\,dx$. Given a matrix A, we denote by $A^*$, or by ${}^tA$, its transpose. Given two vectors A and B, we denote by $A^*B = \sum_j A_j B_j$ their inner product. Sometimes we omit the summation symbol, and we use the convention on sum over repeated indexes. Given two functions $f, g : \mathbb{R}^3 \to \mathbb{C}^2$ we set $\langle f, g\rangle = \int_{\mathbb{R}^3} f^*(x) g(x)\,dx$. For any $k, s \in \mathbb{R}$ and any Banach space K with field $\mathbb{C}$,
$$ H^{k,s}(\mathbb{R}^3, K) = \{ f : \mathbb{R}^3 \to K \text{ s.t. } \|f\|_{H^{k,s}} := \| \langle x\rangle^s \|(-\Delta + 1)^k f\|_K \|_{L^2} < \infty \}, $$
$$ (-\Delta + 1)^k f(x) = (2\pi)^{-\frac{3}{2}} \int e^{ix\cdot\xi}(\xi^2 + 1)^k \hat f(\xi)\,d\xi, \quad \hat f(\xi) = (2\pi)^{-\frac{3}{2}} \int e^{-ix\cdot\xi} f(x)\,dx. $$
In particular we set $L^{2,s} = H^{0,s}$, $L^2 = L^{2,0}$, $H^k = H^{k/2,0}$. Sometimes, to emphasize that these spaces refer to spatial variables, we will denote them by $W^{k,p}_x$, $L^p_x$, $H^k_x$, $H^{k,s}_x$ and $L^{2,s}_x$. For I an interval and $Y_x$ any of these spaces, we will consider Banach spaces $L^p_t(I, Y_x)$ with mixed norm $\|f\|_{L^p_t(I, Y_x)} := \| \|f\|_{Y_x} \|_{L^p_t(I)}$. Given an operator A, we will denote by $R_A(z) = (A - z)^{-1}$ its resolvent. We set $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. We will consider multi-indexes $\mu \in \mathbb{N}_0^n$. For $\mu \in \mathbb{Z}^n$ with $\mu = (\mu_1, \dots, \mu_n)$ we set $|\mu| = \sum_{j=1}^n |\mu_j|$. For X and Y two Banach spaces, we will denote by $B(X, Y)$ the Banach space of bounded linear operators from X to Y and by $B^\ell(X, Y) = B(\prod_{j=1}^\ell X, Y)$. We denote by $a^{\otimes \ell}$ the element $\otimes_{j=1}^\ell a$ of $\otimes_{j=1}^\ell X$ for some $a \in X$. Given a differential form $\alpha$, we denote by $d\alpha$ its exterior differential.

3. Linearization and Set Up

Let $U = {}^t(u, \bar u)$. We introduce now energy E(u) and mass Q(u). We set
$$ E(U) = E_K(U) + E_P(U), \quad E_K(U) = \int_{\mathbb{R}^3} \nabla u \cdot \nabla \bar u\,dx + \int_{\mathbb{R}^3} V u \bar u\,dx, \quad E_P(U) = \int_{\mathbb{R}^3} B(u\bar u)\,dx, \tag{3.1} $$

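with $B(0) = 0$ and $\partial_{\bar u} B(|u|^2) = \beta(|u|^2)u$. We will consider the matrices
$$ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. $$
We introduce the mass
$$ Q(U) = \int_{\mathbb{R}^3} u \bar u\,dx = \frac{1}{2}\langle U, \sigma_1 U\rangle. \tag{3.2} $$
Let
$$ \Phi_\omega = \begin{pmatrix} \phi_\omega \\ \phi_\omega \end{pmatrix}, \quad q(\omega) = Q(\Phi_\omega), \quad e(\omega) = E(\Phi_\omega), \quad d(\omega) = e(\omega) + \omega q(\omega). \tag{3.3} $$
Often we will denote $\Phi_\omega$ simply by $\Phi$. Equation (1.1) can be written as
$$ i\dot U = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} \partial_u E \\ \partial_{\bar u} E \end{pmatrix} \tag{3.4} $$
$$ \phantom{i\dot U} = \sigma_3 \sigma_1 \nabla E(U), \tag{3.5} $$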
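To see that (3.5) is just (1.1) in vector form, one can compute the matrix product and read off the first component. The sketch below assumes the conjugation bars that the notation implies, namely $U = {}^t(u, \bar u)$ and $\nabla E = {}^t(\partial_u E, \partial_{\bar u} E)$, and treats $u$ and $\bar u$ as independent variables; it is an illustration, not a quotation from the paper.

```latex
\begin{align*}
\sigma_3 \sigma_1
  &= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
     \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
   = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \\
% Integrate by parts in E_K and use \partial_{\bar u} B(|u|^2) = \beta(|u|^2) u:
\partial_{\bar u} E(U)
  &= -\Delta u + V u + \beta(|u|^2) u, \\
% First component of i \dot U = \sigma_3 \sigma_1 \nabla E(U):
i \dot u
  &= \partial_{\bar u} E(U) = -\Delta u + V u + \beta(|u|^2) u .
\end{align*}
% The second component, i \dot{\bar u} = -\partial_u E(U), is the complex
% conjugate of the first when V and \beta are real valued.
```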

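with $\nabla E(U)$ defined by (3.5). We have for $\vartheta \in \mathbb{R}$,
$$ E(e^{-i\sigma_3\vartheta}U) = E(U) \quad \text{and} \quad \nabla E(e^{-i\sigma_3\vartheta}U) = e^{i\sigma_3\vartheta}\nabla E(U). \tag{3.6} $$
Write for $\omega \in \mathcal{O}$, $U = e^{i\sigma_3\vartheta}(\Phi_\omega + R)$. Then
$$ i\dot U = -\sigma_3\dot\vartheta e^{i\sigma_3\vartheta}(\Phi_\omega + R) + i\dot\omega e^{i\sigma_3\vartheta}\partial_\omega\Phi_\omega + ie^{i\sigma_3\vartheta}\dot R $$
and
$$ -\sigma_3\dot\vartheta e^{i\sigma_3\vartheta}(\Phi_\omega + R) + i\dot\omega e^{i\sigma_3\vartheta}\partial_\omega\Phi_\omega + ie^{i\sigma_3\vartheta}\dot R = \sigma_3\sigma_1 e^{-i\sigma_3\vartheta}\nabla E(\Phi_\omega + R). \tag{3.7} $$
Equivalently we get
$$ -\sigma_3(\dot\vartheta - \omega)(\Phi_\omega + R) + i\dot\omega\partial_\omega\Phi_\omega + i\dot R = \sigma_3\sigma_1\left(\nabla E(\Phi_\omega + R) + \omega\nabla Q(\Phi_\omega + R)\right). \tag{3.8} $$
We have $\frac{d}{dt}\sigma_3\sigma_1\left(\nabla E(\Phi_\omega + tR) + \omega\nabla Q(\Phi_\omega + tR)\right)\big|_{t=0} = H_\omega R$ with
$$ H_\omega = \sigma_3(-\Delta + V + \omega) + \sigma_3\left[\beta(\phi_\omega^2) + \beta'(\phi_\omega^2)\phi_\omega^2\right] - i\sigma_2\beta'(\phi_\omega^2)\phi_\omega^2. \tag{3.9} $$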
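The passage from (3.7) to (3.8) uses only the anticommutation $\sigma_1\sigma_3 = -\sigma_3\sigma_1$ and the identity $\nabla Q(U) = \sigma_1 U$, which follows from the definition of the mass $Q(U) = \frac12\langle U, \sigma_1 U\rangle$. A short sketch of the two algebraic facts involved, writing $\Phi_\omega$ for the vector ground state:

```latex
\begin{align*}
% \sigma_1 \sigma_3 \sigma_1 = -\sigma_3, so conjugation by \sigma_1 flips the sign of \vartheta:
\sigma_1 e^{-i\sigma_3\vartheta} = e^{i\sigma_3\vartheta}\sigma_1
  &\;\Longrightarrow\;
  e^{-i\sigma_3\vartheta}\,\sigma_3\sigma_1\,e^{-i\sigma_3\vartheta} = \sigma_3\sigma_1, \\
% \sigma_1^2 = 1 and \nabla Q(U) = \sigma_1 U:
\omega\,\sigma_3\sigma_1\nabla Q(\Phi_\omega + R)
  &= \omega\,\sigma_3\sigma_1\sigma_1(\Phi_\omega + R)
   = \omega\,\sigma_3(\Phi_\omega + R).
\end{align*}
% Multiplying (3.7) on the left by e^{-i\sigma_3\vartheta} and then adding
% \omega\sigma_3(\Phi_\omega + R) to both sides therefore gives (3.8).
```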

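The essential spectrum of $H_\omega$ consists of $(-\infty, -\omega] \cup [\omega, +\infty)$. It is well known (see [W2]) that by (H5) 0 is an isolated eigenvalue of $H_\omega$ with $\dim N_g(H_\omega) = 2$ and
$$ H_\omega\sigma_3\Phi_\omega = 0, \quad H_\omega\partial_\omega\Phi_\omega = -\Phi_\omega. \tag{3.10} $$
Since $H_\omega^* = \sigma_3 H_\omega \sigma_3$, we have $N_g(H_\omega^*) = \mathrm{span}\{\Phi_\omega, \sigma_3\partial_\omega\Phi_\omega\}$. We consider eigenfunctions $\xi_j(\omega)$ with eigenvalue $\lambda_j(\omega)$: $H_\omega\xi_j(\omega) = \lambda_j(\omega)\xi_j(\omega)$, $H_\omega\sigma_1\xi_j(\omega) = -\lambda_j(\omega)\sigma_1\xi_j(\omega)$. They can be normalized so that $\langle\sigma_3\xi_j(\omega), \xi_\ell(\omega)\rangle = \delta_{j\ell}$; this is based on Proposition 2.4 [Cu3]. Furthermore, they can be chosen to be real, that is with real entries, so $\xi_j = \bar\xi_j$ for all j. Both $\phi_\omega$ and $\xi_j(\omega, x)$ are smooth in $\omega \in \mathcal{O}$ and $x \in \mathbb{R}^3$ and satisfy
$$ \sup_{\omega \in K,\, x \in \mathbb{R}^3} e^{a|x|}\Big(|\partial_x^\alpha\phi_\omega(x)| + \sum_{j=1}^m |\partial_x^\alpha\xi_j(\omega, x)|\Big) < \infty $$
for every $a \in (0, \inf_{\omega \in K}\sqrt{\omega - \lambda_m(\omega)})$ and every compact subset K of $\mathcal{O}$. For $\omega \in \mathcal{O}$, we have the $H_\omega$-invariant Jordan block decomposition
$$ L^2(\mathbb{R}^3, \mathbb{C}^2) = N_g(H_\omega) \oplus \Big(\oplus_\pm \oplus_{j=1}^m \ker(H_\omega \mp \lambda_j(\omega))\Big) \oplus L^2_c(H_\omega), \tag{3.11} $$
$L^2_c(H_\omega) := \Big\{N_g(H_\omega^*) \oplus \big(\oplus_{\lambda \in \sigma_d \setminus \{0\}} \ker(H_\omega^* - \lambda)\big)\Big\}^\perp$ with $\sigma_d = \sigma_d(H_\omega)$. We also set $L^2_d(H_\omega) := N_g(H_\omega) \oplus \big(\oplus_{\lambda \in \sigma_d \setminus \{0\}} \ker(H_\omega - \lambda(\omega))\big)$. By $P_c(H_\omega)$ (resp. $P_d(H_\omega)$), or simply by $P_c(\omega)$ (resp. $P_d(\omega)$), we denote the projection on $L^2_c(H_\omega)$ (resp. $L^2_d(H_\omega)$) associated to the above direct sum. The space $L^2_c(H_\omega)$ depends continuously on $\omega$. We specify the ansatz imposing that
$$ U = e^{i\sigma_3\vartheta}(\Phi_\omega + R) \quad \text{with } \omega \in \mathcal{O},\ \vartheta \in \mathbb{R} \text{ and } R \in N_g^\perp(H_\omega^*). \tag{3.12} $$
We consider coordinates
$$ U = e^{i\sigma_3\vartheta}\left(\Phi_\omega + z \cdot \xi(\omega) + \bar z \cdot \sigma_1\xi(\omega) + P_c(H_\omega)f\right), \tag{3.13} $$
where $\omega \in \mathcal{O}$, $z \in \mathbb{C}^m$ and $f \in L^2_c(H_{\omega_0})$, where we fixed $\omega_0 \in \mathcal{O}$ such that $q(\omega_0) = \|u_0\|_2^2$. Equation (3.13) is a system of coordinates if we use the notation $\mathcal{O}$ to denote a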


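small neighborhood of $\omega_1$ in Theorem 2.2. Indeed by Lemma 3.1 below, the map $P_c(H_\omega)$ is then an isomorphism from $L^2_c(H_{\omega_0})$ to $L^2_c(H_\omega)$. In particular
$$ R = \sum_{j=1}^m z_j\xi_j(\omega) + \sum_{j=1}^m \bar z_j\sigma_1\xi_j(\omega) + P_c(H_\omega)f, \tag{3.14} $$
$R \in N_g^\perp(H_\omega^*)$ and $f \in L^2_c(H_{\omega_0})$. We also set $z \cdot \xi = \sum_j z_j\xi_j$ and $\bar z \cdot \sigma_1\xi = \sum_j \bar z_j\sigma_1\xi_j$. In the sequel we set
$$ \partial_\omega R = \sum_{j=1}^m z_j\partial_\omega\xi_j(\omega) + \sum_{j=1}^m \bar z_j\sigma_1\partial_\omega\xi_j(\omega) + \partial_\omega P_c(H_\omega)f. \tag{3.15} $$
We have:

Lemma 3.1. We have
$$ P_c(H_\omega)^* = P_c(H_\omega^*) \quad \text{for all } \omega \in \mathcal{O}. \tag{3.16} $$
For all $\omega, \tilde\omega \in \mathcal{O}$ the following operators are bounded from $H^{-k,-s}$ to $H^{k',s'}$ for all exponents:
$$ \begin{aligned} &\partial_\omega^\ell P_c(H_\omega) \text{ for any } \ell > 0; \\ &P_c(H_\omega) - P_c(H_\omega^*); \\ &P_c(H_\omega) - P_c(H_{\tilde\omega}). \end{aligned} \tag{3.17} $$
Consider $\omega_1$ of Theorem 2.2. There exists $\varepsilon_1 > 0$ such that $(\omega_1 - \varepsilon_1, \omega_1 + \varepsilon_1) \subset \mathcal{O}$, and for any pair $\omega, \tilde\omega \in (\omega_1 - \varepsilon_1, \omega_1 + \varepsilon_1)$ we have
$$ P_c(\omega)P_c(\tilde\omega) : L^2_c(H_{\tilde\omega}) \to L^2_c(H_\omega) \text{ is an isomorphism.} \tag{3.18} $$
Furthermore, the following operator is bounded from $H^{-k,-s}$ to $H^{k',s'}$ for all exponents:
$$ P_c(H_{\tilde\omega})\left(1 - \left(P_c(H_\omega)P_c(H_{\tilde\omega})\right)^{-1}P_c(H_\omega)\right), \tag{3.19} $$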

where in the last line $(P_c(\omega)P_c(\tilde\omega))^{-1}$ is the inverse of the operator in (3.18). Finally, for $\epsilon_0$ in Theorem 2.2 sufficiently small, we have $|\omega_0 - \omega_1| < \varepsilon_1$, with $\omega_0$ defined under (3.13).

Proof. The first statement follows from the definition. We have $P_c(H_\omega) = 1 - P_d(H_\omega)$, where $P_d(H_\omega)$ are finite linear combinations of rank 1 operators $\Psi\langle\Psi', \cdot\rangle$, with $\Psi, \Psi' \in H^{K,S}$ for any (K, S). This implies the statement for the second line of (3.17). $\partial_\omega^\ell P_c(H_\omega)$ is well defined by the fact that in (H4) the dependence on $\omega$ is in fact smooth (this is seen iterating the argument in Theorem 18 [ShS]). Assuming (3.18), and for $P_c = P_c(\omega)$, $\tilde P_c = P_c(\tilde\omega)$, $P_d = P_d(\omega)$, $\tilde P_d = P_d(\tilde\omega)$, we have
$$ \tilde P_c\left(1 - (P_c\tilde P_c)^{-1}P_c\right) = (\tilde P_c - P_c) - (\tilde P_c - P_c)(P_c\tilde P_c)^{-1}P_c, $$
which yields (3.19). We prove (3.18). First of all the map is 1–1. Indeed if $P_c\tilde P_c f = 0$, then we have $f = P_d f = (P_d - \tilde P_d)f$. Then $\|f\|_2 \le C|\omega - \tilde\omega|\|f\|_2$ for some fixed $C > 0$. This, for $2C\varepsilon_1 < 1$, is compatible only with $f = 0$. If we knew that the map in (3.18) is onto, then (3.18) would hold by the open mapping theorem. So suppose the map is not onto. Let $R(P_c\tilde P_c)$ be the range of $P_c\tilde P_c$. If there exists $\tilde g \in L^2_c(H_\omega^*)$ such that $\tilde g \neq 0$ and $\langle\tilde g, P_c\tilde P_c f\rangle = 0$ for all $f \in L^2_c(H_{\tilde\omega})$, then since $\tilde g = \sigma_3 g$ for a $g \in L^2_c(H_\omega)$, we get $0 = \langle\tilde g, P_c\tilde P_c f\rangle = \langle\tilde P_c P_c g, \sigma_3 f\rangle$ for all $f \in L^2_c(H_{\tilde\omega})$. This implies $\tilde P_c P_c g = 0$, and since $g \in L^2_c(H_\omega)$, by the 1–1 argument this implies $g = 0$. So if the map in (3.18) is not onto, then $R(P_c\tilde P_c)$ is dense in $L^2_c(H_\omega)$. We will see in a moment that $R(P_c\tilde P_c)$ is closed in $L^2_c(H_\omega)$, hence concluding that the map in (3.18) is also onto. To see that $R(P_c\tilde P_c)$ is closed in $L^2_c(H_\omega)$, let $f_n \in L^2_c(H_{\tilde\omega})$ be a sequence such that $\|P_c f_n - f\|_2 \to 0$ for some $f \in L^2_c(H_\omega)$. By $\|f_n\|_2 \le \|P_c f_n\|_2 + C|\omega - \tilde\omega|\|f_n\|_2$ for some fixed C, it follows that for $2C\varepsilon_1 < 1$ the sequence $\|f_n\|_2$ is bounded. Then by weak compactness there is a subsequence $f_{n_j}$ weakly convergent to a $\tilde f \in L^2_c(H_{\tilde\omega})$. Since $P_c\tilde P_c$ is also weakly continuous, $P_c\tilde P_c\tilde f = f$. The final statement is elementary by (2.2).

Using the system of coordinates (3.13) we rewrite system (3.8) as
$$ \begin{aligned} &-\sigma_3(\dot\vartheta - \omega)(\Phi_\omega + z \cdot \xi + \bar z \cdot \sigma_1\xi + P_c(H_\omega)f) \\ &+ i\dot\omega(\partial_\omega\Phi_\omega + z \cdot \partial_\omega\xi + \bar z \cdot \sigma_1\partial_\omega\xi + \partial_\omega P_c(H_\omega)f) + i\dot z \cdot \xi + i\dot{\bar z} \cdot \sigma_1\xi + iP_c(H_\omega)\dot f \\ &= \sigma_3\sigma_1\nabla E(\Phi_\omega + z \cdot \xi + \bar z \cdot \sigma_1\xi + P_c(H_\omega)f) \\ &+ \omega\sigma_3\sigma_1\nabla Q(\Phi_\omega + z \cdot \xi + \bar z \cdot \sigma_1\xi + P_c(H_\omega)f), \end{aligned} \tag{3.20} $$
where $z \cdot \xi = \sum_j z_j\xi_j$ and $\bar z \cdot \sigma_1\xi = \sum_j \bar z_j\sigma_1\xi_j$, where $\xi = \xi(\omega)$. Notice for future reference, that for any fixed $\omega_0$ we also have
$$ \begin{aligned} &-\sigma_3(\dot\vartheta - \omega_0)(\Phi_\omega + z \cdot \xi + \bar z \cdot \sigma_1\xi + P_c(H_\omega)f) \\ &+ i\dot\omega(\partial_\omega\Phi_\omega + z \cdot \partial_\omega\xi + \bar z \cdot \sigma_1\partial_\omega\xi + \partial_\omega P_c(H_\omega)f) + i\dot z \cdot \xi + i\dot{\bar z} \cdot \sigma_1\xi + iP_c(H_\omega)\dot f \\ &= \sigma_3\sigma_1\nabla E(\Phi_\omega + z \cdot \xi + \bar z \cdot \sigma_1\xi + P_c(H_\omega)f) \\ &+ \omega_0\sigma_3\sigma_1\nabla Q(\Phi_\omega + z \cdot \xi + \bar z \cdot \sigma_1\xi + P_c(H_\omega)f), \end{aligned} \tag{3.21} $$
where (3.21) is the same as (3.20) except for $\omega_0$ replacing $\omega$ in the first spot where they appear in the first and last line.

We end this section with a short heuristic description about why the crucial property needed to prove asymptotic stability of ground states is the hamiltonian nature of (1.1). In terms of (3.13), and oversimplifying, (3.7) splits as
$$ \begin{aligned} i\dot z - \lambda z &= \sum_{\mu\nu} a_{\mu\nu} z^\mu \bar z^\nu + \sum_{\mu\nu} z^\mu \bar z^\nu \langle f(t, x), G_{\mu\nu}(x, \omega)\rangle_{L^2_x} + \cdots, \\ i\dot f - H_\omega f &= \sum_{\mu\nu} z^\mu \bar z^\nu M_{\mu\nu}(x, \omega) + \cdots. \end{aligned} $$

Here we are assuming m = 1. We focus on positive times t ≥ 0 only. After changes of variables, see [CM], we obtain i˙z − λz = P(|z|2 )z + z N f (t, x), G μν (x, ω) L 2x + · · · , i f˙ − Hω f = z N +1 M(x, ω) + · · · . The next step is to write, for g an error term, + ((N + 1) λ)M + g, f = −z N +1 RH ω + ((N + 1) λ)M, G L 2x + · · · . i˙z − λz = P(|z|2 )z − |z|2N zRH ω

(3.22)


S. Cuccagna

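Then, ignoring error terms, by
\[
R^+_{H_\omega}((N+1)\lambda) = \mathrm{P.V.}\,\frac{1}{H_\omega - (N+1)\lambda} + i\pi\,\delta(H_\omega - (N+1)\lambda),
\]
the equation for $z$ has solutions such that
\[
\frac{d}{dt}|z|^2 = -\Gamma |z|^{2N+2}, \qquad |z(t)| = \frac{|z(0)|}{\bigl(|z(0)|^{2N} N\Gamma t + 1\bigr)^{\frac{1}{2N}}},
\]
with (the Fourier transforms are associated to $H_\omega$)
\[
\Gamma = 2\pi\langle \delta(H_\omega - (N+1)\lambda)M, G\rangle = \frac{1}{\sqrt{(N+1)\lambda - \omega}} \int_{|\xi| = \sqrt{(N+1)\lambda - \omega}} \widehat M(\xi)\cdot\widehat G(\xi)\,d\sigma .
\]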
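The decay law quoted above follows from a separable ODE. The following worked step is only a sketch: it writes $y = |z|^2$ (a symbol introduced here for the computation), lets $\Gamma$ denote the coefficient in front of $|z|^{2N+2}$, and assumes $\Gamma$ is constant in time.

```latex
\begin{align*}
\frac{dy}{dt} &= -\Gamma\, y^{N+1}, \qquad y := |z|^2,\\
\frac{d}{dt}\, y^{-N} &= -N\, y^{-N-1}\,\frac{dy}{dt} = N\Gamma,\\
y(t)^{-N} &= y(0)^{-N} + N\Gamma t,\\
|z(t)| &= y(t)^{1/2} = \frac{|z(0)|}{\bigl(|z(0)|^{2N} N\Gamma t + 1\bigr)^{\frac{1}{2N}}}.
\end{align*}
```

In particular, for $\Gamma > 0$ and $z(0) \neq 0$ one gets $|z(t)| \sim (N\Gamma t)^{-1/(2N)}$ as $t \to \infty$, an algebraic rather than exponential rate.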

If $\Gamma > 0$, we see that $z(t)$ decays. Notice that $\Gamma < 0$ is incompatible with orbital stability, which requires $z$ to remain small, see Corollary 4.6 [CM]. The latter indirect argument to prove positive semidefiniteness of $\Gamma$ does not seem to work when in (3.7) there are further discrete components. So we need another way to prove that $\Gamma \ge 0$. This is provided by the hamiltonian structure. Indeed, if (3.22) is of the form
\[
i\dot z = \partial_{\bar z}K, \qquad i\dot f = \nabla_{\bar f}K,
\tag{3.23}
\]
then by the Schwarz lemma $(N+1)!\,M = \partial_z^{N+1}\nabla_{\bar f}K = \partial_z^{N}\nabla_{\bar f}\partial_z K = N!\,\overline{G}$ at $z = 0$ and $f = 0$. So $\Gamma$ is positive semidefinite. This very simple idea on system (3.23) inspired [BC] and inspires the present paper.

4. Gradient of the Coordinates

We focus on ansatz (3.12) and on the coordinates (3.13). In particular we compute the gradient of the coordinates. Here we recall that given a scalar valued function $F$, the relation between exterior differential and gradient is $dF = \langle\nabla F, \cdot\,\rangle$. Consider the following two functions:
\[
F(U,\omega,\vartheta) := \langle e^{-i\sigma_3\vartheta}U - \Phi_\omega, \Phi_\omega\rangle
\quad\text{and}\quad
G(U,\omega,\vartheta) := \langle e^{-i\sigma_3\vartheta}U, \sigma_3\partial_\omega\Phi_\omega\rangle .
\]
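Then ansatz (3.12) is obtained by choosing $(\omega,\vartheta)$ s.t. $R := e^{-i\sigma_3\vartheta}U - \Phi_\omega$ satisfies $R \in N_g^\perp(H_\omega^*)$ by means of the implicit function theorem. In particular:
\[
\begin{aligned}
F_\vartheta &= \langle -i\sigma_3 e^{-i\sigma_3\vartheta}U, \Phi_\omega\rangle = -i\langle\sigma_3 R, \Phi_\omega\rangle;\\
F_\omega &= -2q'(\omega) + \langle e^{-i\sigma_3\vartheta}U, \partial_\omega\Phi_\omega\rangle = -q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle;\\
\nabla_U F &= e^{-i\sigma_3\vartheta}\Phi_\omega, \qquad \nabla_U G = e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi_\omega;\\
G_\vartheta &= -i\langle e^{-i\sigma_3\vartheta}U, \partial_\omega\Phi_\omega\rangle = -i\bigl(q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle\bigr);\\
G_\omega &= \langle e^{-i\sigma_3\vartheta}U, \sigma_3\partial_\omega^2\Phi_\omega\rangle = \langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle.
\end{aligned}
\]
By $F(U,\omega(U),\vartheta(U)) = G(U,\omega(U),\vartheta(U)) = 0$ we get $W_\omega\nabla\omega + W_\vartheta\nabla\vartheta = -\nabla_U W$ for $W = F, G$. By the above formulas, if we set
\[
A = \begin{pmatrix}
-q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle & -i\langle\sigma_3 R, \Phi_\omega\rangle\\
\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle & -i\bigl(q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle\bigr)
\end{pmatrix},
\tag{4.1}
\]
we have
\[
A\begin{pmatrix}\nabla\omega\\ \nabla\vartheta\end{pmatrix}
= \begin{pmatrix} -e^{-i\sigma_3\vartheta}\Phi_\omega\\ -e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi_\omega\end{pmatrix}.
\tag{4.2}
\]
So
\[
\begin{aligned}
\nabla\omega &= \frac{\bigl(q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle\bigr)e^{-i\sigma_3\vartheta}\Phi_\omega - \langle\sigma_3 R, \Phi_\omega\rangle e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi_\omega}{(q'(\omega))^2 - \langle R, \partial_\omega\Phi_\omega\rangle^2 + \langle\sigma_3 R, \Phi_\omega\rangle\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle},\\
\nabla\vartheta &= \frac{\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle e^{-i\sigma_3\vartheta}\Phi_\omega + \bigl(q'(\omega) - \langle R, \partial_\omega\Phi_\omega\rangle\bigr)e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi_\omega}{i\bigl[(q'(\omega))^2 - \langle R, \partial_\omega\Phi_\omega\rangle^2 + \langle\sigma_3 R, \Phi_\omega\rangle\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle\bigr]}.
\end{aligned}
\tag{4.3}
\]
Notice that along with the decomposition (3.11) we have
\[
L^2(\mathbb{R}^3, \mathbb{C}^2) = N_g(H_\omega^*) \oplus \Bigl(\oplus_{\lambda\in\sigma_d\setminus\{0\}} \ker(H_\omega^* - \lambda(\omega))\Bigr) \oplus L^2_c(H_\omega^*),
\tag{4.4}
\]
with $L^2_c(H_\omega^*) := \bigl\{N_g(H_\omega) \oplus \bigl(\oplus_{\lambda\in\sigma_d\setminus\{0\}}\ker(H_\omega - \lambda(\omega))\bigr)\bigr\}^\perp$. We also set $L^2_d(H_\omega^*) := N_g(H_\omega^*) \oplus \bigl(\oplus_{\lambda\in\sigma_d\setminus\{0\}}\ker(H_\omega^* - \lambda(\omega))\bigr)$. Notice that $N_g(H_\omega^*) = \sigma_3 N_g(H_\omega)$, $\ker(H_\omega^* - \lambda) = \sigma_3\ker(H_\omega - \lambda)$, $L^2_c(H_\omega^*) = \sigma_3 L^2_c(H_\omega)$ and $L^2_d(H_\omega^*) = \sigma_3 L^2_d(H_\omega)$, so that (4.4) is obtained applying $\sigma_3$ to decomposition (3.11). We can decompose gradients as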

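\[
\begin{aligned}
\nabla F(U) &= e^{-i\sigma_3\vartheta}\Bigl[P_{N_g(H_\omega^*)} + \sum_j\bigl(P_{\ker(H_\omega^* - \lambda_j)} + P_{\ker(H_\omega^* + \lambda_j)}\bigr) + P_c(H_\omega^*)\Bigr]e^{i\sigma_3\vartheta}\nabla F(U)\\
&= \frac{\langle\nabla F(U), e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle}{q'(\omega)}\,e^{-i\sigma_3\vartheta}\Phi
 + \frac{\langle\nabla F(U), e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle}{q'(\omega)}\,e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi\\
&\quad + \sum_j\langle\nabla F(U), e^{i\sigma_3\vartheta}\xi_j\rangle\,e^{-i\sigma_3\vartheta}\sigma_3\xi_j
 + \sum_j\langle\nabla F(U), e^{i\sigma_3\vartheta}\sigma_1\xi_j\rangle\,e^{-i\sigma_3\vartheta}\sigma_1\sigma_3\xi_j\\
&\quad + e^{-i\sigma_3\vartheta}P_c(H_\omega^*)e^{i\sigma_3\vartheta}\nabla F(U).
\end{aligned}
\tag{4.5}
\]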
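The passage from (4.2) to (4.3) earlier in this section is a $2\times 2$ inversion. As a sketch, abbreviate $a := \langle R, \partial_\omega\Phi_\omega\rangle$, $b := \langle\sigma_3 R, \Phi_\omega\rangle$, $c := \langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle$, $q' := q'(\omega)$ and $e := e^{-i\sigma_3\vartheta}$, where $\Phi_\omega$ is the ground-state vector entering ansatz (3.12). These abbreviations are introduced here only for the computation and are not the paper's notation.

```latex
\begin{align*}
A &= \begin{pmatrix} -q' + a & -i\,b \\ c & -i\,(q' + a) \end{pmatrix},
\qquad
\det A = i\bigl[(q')^2 - a^2 + b\,c\bigr],\\
A^{-1} &= \frac{1}{\det A}
\begin{pmatrix} -i\,(q' + a) & i\,b \\ -c & -q' + a \end{pmatrix},\\
\nabla\omega &= \frac{(q' + a)\,e\,\Phi_\omega - b\,e\,\sigma_3\partial_\omega\Phi_\omega}{(q')^2 - a^2 + b\,c},\\
\nabla\vartheta &= \frac{c\,e\,\Phi_\omega + (q' - a)\,e\,\sigma_3\partial_\omega\Phi_\omega}{i\bigl[(q')^2 - a^2 + b\,c\bigr]}.
\end{align*}
```

At $R = 0$ this reduces to $\nabla\omega = e\,\Phi_\omega/q'$ and $\nabla\vartheta = -i\,e\,\sigma_3\partial_\omega\Phi_\omega/q'$, so the inversion is well defined for small $R$ as long as $q'(\omega) \neq 0$.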

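Using coordinates (3.13) and notation (3.16), at $U$ we have the following formulas for the vectorfields
\[
\frac{\partial}{\partial\omega} = e^{i\sigma_3\vartheta}\partial_\omega(\Phi + R), \quad
\frac{\partial}{\partial\vartheta} = i e^{i\sigma_3\vartheta}\sigma_3(\Phi + R), \quad
\frac{\partial}{\partial z_j} = e^{i\sigma_3\vartheta}\xi_j, \quad
\frac{\partial}{\partial\bar z_j} = e^{i\sigma_3\vartheta}\sigma_1\xi_j .
\tag{4.6}
\]
Hence, by $\partial_\omega F = dF(\frac{\partial}{\partial\omega}) = \langle\nabla F, \frac{\partial}{\partial\omega}\rangle$ etc., we have
\[
\begin{aligned}
\partial_\omega F &= \langle\nabla F, e^{i\sigma_3\vartheta}\partial_\omega(\Phi + R)\rangle, &
\partial_\vartheta F &= i\langle\nabla F, e^{i\sigma_3\vartheta}\sigma_3(\Phi + R)\rangle,\\
\partial_{z_j} F &= \langle\nabla F, e^{i\sigma_3\vartheta}\xi_j\rangle, &
\partial_{\bar z_j} F &= \langle\nabla F, e^{i\sigma_3\vartheta}\sigma_1\xi_j\rangle.
\end{aligned}
\tag{4.7}
\]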

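Lemma 4.1. We have the following formulas:
\[
\nabla z_j = -\langle\sigma_3\xi_j, \partial_\omega R\rangle\nabla\omega - i\langle\sigma_3\xi_j, \sigma_3 R\rangle\nabla\vartheta + e^{-i\sigma_3\vartheta}\sigma_3\xi_j,
\tag{4.8}
\]
\[
\nabla\bar z_j = -\langle\sigma_1\sigma_3\xi_j, \partial_\omega R\rangle\nabla\omega - i\langle\sigma_1\sigma_3\xi_j, \sigma_3 R\rangle\nabla\vartheta + e^{-i\sigma_3\vartheta}\sigma_1\sigma_3\xi_j.
\tag{4.9}
\]

Proof. Equalities $\frac{\partial z_j}{\partial z_\ell} = \delta_{j\ell}$, $\frac{\partial z_j}{\partial\bar z_\ell} = \frac{\partial z_j}{\partial\omega} = \frac{\partial z_j}{\partial\vartheta} = 0$ and $\nabla_f z_j = 0$ are equivalent to
\[
\begin{aligned}
&\langle\nabla z_j, e^{i\sigma_3\vartheta}\xi_\ell\rangle = \delta_{j\ell}, \qquad
\langle\nabla z_j, e^{i\sigma_3\vartheta}\sigma_1\xi_\ell\rangle \equiv 0 = \langle\nabla z_j, e^{i\sigma_3\vartheta}\sigma_3(\Phi + R)\rangle,\\
&\langle\nabla z_j, e^{i\sigma_3\vartheta}\partial_\omega(\Phi + R)\rangle = 0 \equiv \langle\nabla z_j, e^{i\sigma_3\vartheta}P_c(\omega)P_c(\omega_0)g\rangle \quad \forall g \in L^2_c(H_{\omega_0}).
\end{aligned}
\tag{4.10}
\]
Notice that the last identity implies $P_c(H_{\omega_0}^*)P_c(H_\omega^*)e^{i\sigma_3\vartheta}\nabla z_j = 0$ which in turn implies $P_c(H_\omega^*)e^{i\sigma_3\vartheta}\nabla z_j = 0$. Then, applying (4.5) and using the product row column, we get for some pair of numbers $(a, b)$,
\[
\begin{aligned}
\nabla z_j &= a\,e^{-i\sigma_3\vartheta}\Phi + b\,e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi + e^{-i\sigma_3\vartheta}\sigma_3\xi_j\\
&= (a, b)\begin{pmatrix} e^{-i\sigma_3\vartheta}\Phi\\ e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi\end{pmatrix} + e^{-i\sigma_3\vartheta}\sigma_3\xi_j
= -(a, b)\,A\begin{pmatrix}\nabla\omega\\ \nabla\vartheta\end{pmatrix} + e^{-i\sigma_3\vartheta}\sigma_3\xi_j,
\end{aligned}
\]
where in the last line we used (4.2). Equating the two extreme sides and applying to the formula $\langle\,\cdot\,, \frac{\partial}{\partial\omega}\rangle$ and $\langle\,\cdot\,, \frac{\partial}{\partial\vartheta}\rangle$, by $\langle\nabla z_j, \frac{\partial}{\partial\omega}\rangle = \langle\nabla z_j, \frac{\partial}{\partial\vartheta}\rangle = \langle\nabla\vartheta, \frac{\partial}{\partial\omega}\rangle = \langle\nabla\omega, \frac{\partial}{\partial\vartheta}\rangle = 0$, by $\langle\nabla\vartheta, \frac{\partial}{\partial\vartheta}\rangle = \langle\nabla\omega, \frac{\partial}{\partial\omega}\rangle = 1$ and by (4.6) and (4.10), we get
\[
A^*\begin{pmatrix} a\\ b\end{pmatrix} = \begin{pmatrix}\langle\sigma_3\xi_j, \partial_\omega R\rangle\\ i\langle\sigma_3\xi_j, \sigma_3 R\rangle\end{pmatrix}.
\]
This implies
\[
\nabla z_j = -\bigl(\langle\sigma_3\xi_j, \partial_\omega R\rangle,\ i\langle\sigma_3\xi_j, \sigma_3 R\rangle\bigr)\begin{pmatrix}\nabla\omega\\ \nabla\vartheta\end{pmatrix} + e^{-i\sigma_3\vartheta}\sigma_3\xi_j.
\]
This yields (4.8). Similarly $\nabla\bar z_j = a\,e^{-i\sigma_3\vartheta}\Phi + b\,e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi + e^{-i\sigma_3\vartheta}\sigma_1\sigma_3\xi_j$, where
\[
A^*\begin{pmatrix} a\\ b\end{pmatrix} = \begin{pmatrix}\langle\sigma_1\sigma_3\xi_j, \partial_\omega R\rangle\\ i\langle\sigma_1\sigma_3\xi_j, \sigma_3 R\rangle\end{pmatrix}.
\]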

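Lemma 4.2. Consider the map $f(U) = f$ for $U$ and $f$ as in (3.13). Denote by $f'(U)$ the Fréchet derivative of this map. Then
\[
f'(U) = (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\bigl[-\partial_\omega R\,d\omega - i\sigma_3 R\,d\vartheta + e^{-i\sigma_3\vartheta}\mathbb{1}\bigr].
\]

Proof. We have $f'(U)e^{i\sigma_3\vartheta}\xi_\ell \equiv f'(U)e^{i\sigma_3\vartheta}\sigma_1\xi_\ell \equiv 0 = f'(U)e^{i\sigma_3\vartheta}\sigma_3(\Phi + R) = f'(U)e^{i\sigma_3\vartheta}\partial_\omega(\Phi + R)$ and
\[
f'(U)e^{i\sigma_3\vartheta}P_c(\omega)g = g \quad \forall g \in L^2_c(H_{\omega_0}).
\tag{4.11}
\]
This implies that for a pair of vector valued functions $\mathbf{A}$ and $\mathbf{B}$ and with the inverse of $P_c(H_\omega)P_c(H_{\omega_0}) : L^2_c(H_{\omega_0}) \to L^2_c(H_\omega)$,
\[
\begin{aligned}
f' &= (\mathbf{A}, \mathbf{B})\begin{pmatrix}\langle e^{-i\sigma_3\vartheta}\Phi, \cdot\,\rangle\\ \langle e^{-i\sigma_3\vartheta}\sigma_3\partial_\omega\Phi, \cdot\,\rangle\end{pmatrix} + (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)e^{-i\sigma_3\vartheta}\\
&= -(\mathbf{A}, \mathbf{B})\,A\begin{pmatrix} d\omega\\ d\vartheta\end{pmatrix} + (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)e^{-i\sigma_3\vartheta}.
\end{aligned}
\]
By (4.11) we obtain that $\mathbf{A}$ and $\mathbf{B}$ are identified by the following equations (treating the last $(P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)$ like a scalar):
\[
A^*\begin{pmatrix}\mathbf{A}\\ \mathbf{B}\end{pmatrix} = (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\begin{pmatrix}\partial_\omega R\\ i\sigma_3 R\end{pmatrix}.
\]

5. Symplectic Structure

Our ambient space is $H^1(\mathbb{R}^3, \mathbb{C}) \times H^1(\mathbb{R}^3, \mathbb{C})$. We focus only on points with $\sigma_1 U = \overline{U}$. The natural symplectic structure for our problem is
\[
\Omega(X, Y) = \langle X, \sigma_3\sigma_1 Y\rangle.
\tag{5.1}
\]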

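We will see that the coordinates we introduced in (3.13), which arise naturally from the linearization, are not canonical for (5.1). This is the main difference with [BC]. In this section we exploit the work in Sect. 4 to compute the Poisson brackets for pairs of coordinates. We end the section with a crucial property for $Q$, Lemma 5.4.

The hamiltonian vector field $X_G$ of a scalar function $G$ is defined by the equation $\langle X_G, \sigma_3\sigma_1 Y\rangle = -i\langle\nabla G, Y\rangle$ for any vector $Y$ and is $X_G = -i\sigma_3\sigma_1\nabla G$. At $U = e^{i\sigma_3\vartheta}(\Phi_\omega + R)$ as in (3.12) we have by (4.5),
\[
\begin{aligned}
X_G(U) &= i\,\frac{\langle\nabla G(U), e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle}{q'(\omega)}\,e^{i\sigma_3\vartheta}\partial_\omega\Phi
 - i\,\frac{\langle\nabla G(U), e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle}{q'(\omega)}\,e^{i\sigma_3\vartheta}\sigma_3\Phi\\
&\quad + i\sum_j\partial_{z_j}G(U)\,e^{i\sigma_3\vartheta}\sigma_1\xi_j - i\sum_j\partial_{\bar z_j}G(U)\,e^{i\sigma_3\vartheta}\xi_j\\
&\quad - i\,e^{i\sigma_3\vartheta}\sigma_3\sigma_1 P_c(H_\omega^*)e^{i\sigma_3\vartheta}\nabla G(U).
\end{aligned}
\tag{5.2}
\]
We call the Poisson bracket of a pair of scalar valued functions $F$ and $G$ the scalar valued function
\[
\{F, G\} = \langle\nabla F, X_G\rangle = -i\langle\nabla F, \sigma_3\sigma_1\nabla G\rangle = i\,\Omega(X_F, X_G).
\tag{5.3}
\]
By $0 = i\frac{d}{dt}Q(U(t)) = \langle\nabla Q(U(t)), \sigma_3\sigma_1\nabla E(U(t))\rangle$ we have the commutation
\[
\{Q, E\} = 0.
\tag{5.4}
\]
In terms of spectral components we have
\[
\begin{aligned}
i\{F, G\}(U) &= \langle\nabla F(U), \sigma_3\sigma_1\nabla G(U)\rangle\\
&= (q')^{-1}\bigl[\langle\nabla F, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\langle\nabla G, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle - \langle\nabla F, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle\langle\nabla G, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\bigr]\\
&\quad + \sum_j\bigl(\partial_{z_j}F\,\partial_{\bar z_j}G - \partial_{\bar z_j}F\,\partial_{z_j}G\bigr)\\
&\quad + \langle\sigma_3 e^{-i\sigma_3\vartheta}P_c(H_\omega^*)e^{i\sigma_3\vartheta}\nabla F,\ \sigma_1 e^{-i\sigma_3\vartheta}P_c(H_\omega^*)e^{i\sigma_3\vartheta}\nabla G\rangle.
\end{aligned}
\tag{5.5}
\]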
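The closed form $X_G = -i\sigma_3\sigma_1\nabla G$ used throughout this section can be checked with two matrix identities. The sketch below assumes that $\langle\cdot,\cdot\rangle$ is a bilinear pairing for which $\sigma_1$ and $\sigma_3$ are symmetric (a convention fixed earlier in the paper and not restated here) and that $\sigma_1$, $\sigma_3$ are the standard Pauli matrices.

```latex
\begin{align*}
\sigma_3\sigma_1 &= \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\qquad
(\sigma_3\sigma_1)^{T} = \sigma_1\sigma_3 = -\sigma_3\sigma_1,
\qquad
(\sigma_3\sigma_1)^2 = -1,\\
\langle X_G, \sigma_3\sigma_1 Y\rangle
 &= \langle \sigma_1\sigma_3 X_G, Y\rangle
 = -\langle \sigma_3\sigma_1 X_G, Y\rangle
 \quad\text{for all } Y,\\
-\sigma_3\sigma_1 X_G &= -i\,\nabla G
\;\Longrightarrow\;
X_G = i\,(\sigma_3\sigma_1)^{-1}\nabla G = -i\,\sigma_3\sigma_1\nabla G .
\end{align*}
```

Under the same assumption, the antisymmetry of $\sigma_3\sigma_1$ gives $\{F, G\} = -\{G, F\}$ for scalar $F$ and $G$.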


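Lemma 5.1. Let $F(U)$ be a scalar function. We have the following equalities:
\[
\{\omega, \vartheta\} = \frac{q'}{(q')^2 - \langle R, \partial_\omega\Phi\rangle^2 + \langle\sigma_3 R, \Phi\rangle\langle R, \sigma_3\partial_\omega^2\Phi\rangle};
\tag{5.6}
\]
\[
\{z_j, F\} = \langle\sigma_3\xi_j, \partial_\omega R\rangle\{F, \omega\} + i\langle\sigma_3\xi_j, \sigma_3 R\rangle\{F, \vartheta\} - i\partial_{\bar z_j}F;
\tag{5.7}
\]
\[
\{\bar z_j, F\} = \langle\sigma_1\sigma_3\xi_j, \partial_\omega R\rangle\{F, \omega\} + i\langle\sigma_1\sigma_3\xi_j, \sigma_3 R\rangle\{F, \vartheta\} + i\partial_{z_j}F.
\tag{5.8}
\]
In particular we have:
\[
\begin{aligned}
\{z_j, \omega\} &= i\langle\sigma_3\xi_j, \sigma_3 R\rangle\{\omega, \vartheta\}; \qquad
\{\bar z_j, \omega\} = i\langle\sigma_1\sigma_3\xi_j, \sigma_3 R\rangle\{\omega, \vartheta\};\\
\{z_j, \vartheta\} &= \langle\sigma_3\xi_j, \partial_\omega R\rangle\{\vartheta, \omega\}; \qquad
\{\bar z_j, \vartheta\} = \langle\sigma_1\sigma_3\xi_j, \partial_\omega R\rangle\{\vartheta, \omega\};\\
\{z_k, z_j\} &= i\bigl(\langle\sigma_3\xi_k, \partial_\omega R\rangle\langle\sigma_3\xi_j, \sigma_3 R\rangle - \langle\sigma_3\xi_j, \partial_\omega R\rangle\langle\sigma_3\xi_k, \sigma_3 R\rangle\bigr)\{\omega, \vartheta\};\\
\{\bar z_k, \bar z_j\} &= i\bigl(\langle\sigma_1\sigma_3\xi_k, \partial_\omega R\rangle\langle\sigma_1\sigma_3\xi_j, \sigma_3 R\rangle - \langle\sigma_1\sigma_3\xi_j, \partial_\omega R\rangle\langle\sigma_1\sigma_3\xi_k, \sigma_3 R\rangle\bigr)\{\omega, \vartheta\};\\
\{z_k, \bar z_j\} &= -i\delta_{jk} + i\bigl(\langle\sigma_3\xi_k, \partial_\omega R\rangle\langle\sigma_1\sigma_3\xi_j, \sigma_3 R\rangle - \langle\sigma_1\sigma_3\xi_j, \partial_\omega R\rangle\langle\xi_k, R\rangle\bigr)\{\omega, \vartheta\}.
\end{aligned}
\]

Proof. By (4.3) and (5.5) we have
\[
\begin{aligned}
i\{\omega, \vartheta\} &= (q')^{-1}\bigl[\langle\nabla\omega, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\langle\nabla\vartheta, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle - \langle\nabla\omega, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle\langle\nabla\vartheta, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\bigr]\\
&= \frac{1}{q'}\,\frac{-\langle\sigma_3 R, \Phi_\omega\rangle q'\,\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle q' - \bigl[(q'(\omega))^2 - \langle R, \partial_\omega\Phi_\omega\rangle^2\bigr](q')^2}{i\bigl[(q'(\omega))^2 - \langle R, \partial_\omega\Phi_\omega\rangle^2 + \langle\sigma_3 R, \Phi_\omega\rangle\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle\bigr]^2}.
\end{aligned}
\]
This yields (5.6). For (5.7), substituting (4.8) in (5.3), we get
\[
\{z_j, F\} = \langle\nabla z_j, X_F\rangle = -\langle\sigma_3\xi_j, \partial_\omega R\rangle\{\omega, F\} - i\langle\sigma_3\xi_j, \sigma_3 R\rangle\{\vartheta, F\} + \langle e^{-i\sigma_3\vartheta}\sigma_3\xi_j, X_F\rangle.
\]
When we substitute $X_F$ with the decomposition in (5.2), the last term in the above sum becomes
\[
\langle e^{-i\sigma_3\vartheta}\sigma_3\xi_j, X_F\rangle = -i\partial_{\bar z_j}F\,\langle e^{-i\sigma_3\vartheta}\sigma_3\xi_j, e^{i\sigma_3\vartheta}\xi_j\rangle = -i\partial_{\bar z_j}F.
\]
This yields (5.7). Equation (5.8) can be derived by first replacing $F$ with $\overline{F}$ in (5.7) and by taking the complex conjugate of the resulting equation:
\[
\{\bar z_j, F\} = \langle\sigma_3\xi_j, \partial_\omega\overline{R}\rangle\{F, \omega\} - i\langle\sigma_3\xi_j, \sigma_3\overline{R}\rangle\{F, \vartheta\} + i\partial_{z_j}F.
\]
Then (5.8) follows by using that $\overline{R} = \sigma_1 R$ and $\sigma_1\sigma_3 = -\sigma_3\sigma_1$. The remaining formulas in the statement follow from (5.7)–(5.8).

Definition 5.2. Given a function $G(U)$ with values in $L^2_c(H_{\omega_0})$, a symplectic form $\Omega$ and a scalar function $F(U)$, we define
\[
\{G, F\} := G'(U)X_F(U)
\tag{5.9}
\]
with $X_F$ the hamiltonian vector field associated to $F$. We set $\{F, G\} := -\{G, F\}$.

We have:

Lemma 5.3. For $f(U)$ the functional in Lemma 4.2, we have:
\[
\{f, F\} = (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\bigl[\{F, \omega\}\partial_\omega R + i\{F, \vartheta\}\sigma_3 R - i e^{-i\sigma_3\vartheta}\sigma_3\sigma_1\nabla F\bigr].
\tag{5.10}
\]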


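In particular we have:
\[
\begin{aligned}
\{f, \omega\} &= i\{\omega, \vartheta\}(P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\sigma_3 R;\\
\{f, \vartheta\} &= \{\vartheta, \omega\}(P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\partial_\omega R;\\
\{f, z_j\} &= (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\bigl[\{z_j, \omega\}\partial_\omega R + i\{z_j, \vartheta\}\sigma_3 R\bigr];\\
\{f, \bar z_j\} &= (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\bigl[\{\bar z_j, \omega\}\partial_\omega R + i\{\bar z_j, \vartheta\}\sigma_3 R\bigr].
\end{aligned}
\tag{5.11}
\]

Proof. Using Lemma 4.2 and by (4.2),
\[
f'\sigma_3\sigma_1\nabla F = -(\mathbf{A}, \mathbf{B})\,A\begin{pmatrix}\langle\nabla\omega, \sigma_3\sigma_1\nabla F\rangle\\ \langle\nabla\vartheta, \sigma_3\sigma_1\nabla F\rangle\end{pmatrix} + (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)e^{-i\sigma_3\vartheta}\sigma_3\sigma_1\nabla F.
\]
By Lemma 4.2 we have
\[
(\mathbf{A}, \mathbf{B})\,A\begin{pmatrix}\{\omega, F\}\\ \{\vartheta, F\}\end{pmatrix} = (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\,(\partial_\omega R,\ i\sigma_3 R)\begin{pmatrix}\{\omega, F\}\\ \{\vartheta, F\}\end{pmatrix}.
\]

The following result is important in the sequel.

Lemma 5.4. Let $Q$ be the function defined in (3.3). Then, we have the following formulas:
\[
\{Q, \omega\} = 0;
\tag{5.12}
\]
\[
\{Q, \vartheta\} = 1;
\tag{5.13}
\]
\[
\{Q, z_j\} = \{Q, \bar z_j\} = 0;
\tag{5.14}
\]
\[
\{Q, f\} = 0.
\tag{5.15}
\]
Denote by $X_Q$ the hamiltonian vectorfield of $Q$. Then
\[
X_Q = -\frac{\partial}{\partial\vartheta}.
\tag{5.16}
\]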
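Formula (5.16) can be read off from (5.12)–(5.15) one coordinate at a time. As a sketch, use $dF(X_G) = \langle\nabla F, X_G\rangle = \{F, G\}$ from (5.3), the rule (5.9) for $f$, the antisymmetry of the bracket, and the fact that a vector field is determined by its components in the coordinates $(\vartheta, \omega, z, \bar z, f)$.

```latex
\begin{align*}
d\vartheta(X_Q) &= \{\vartheta, Q\} = -\{Q, \vartheta\} = -1, &
d\omega(X_Q) &= \{\omega, Q\} = 0,\\
dz_j(X_Q) &= \{z_j, Q\} = 0, &
d\bar z_j(X_Q) &= \{\bar z_j, Q\} = 0,\\
f'(U)X_Q &= \{f, Q\} = 0, &
&\Longrightarrow\quad X_Q = -\frac{\partial}{\partial\vartheta}.
\end{align*}
```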

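Proof. We have by (5.5), (4.3) and $\nabla Q(U) = \sigma_1 U$,
\[
\begin{aligned}
i q'\{Q, \omega\} &= \langle\nabla Q, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\langle\nabla\omega, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle - \langle\nabla Q, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle\langle\nabla\omega, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\\
&= q'\,\frac{-\langle R, \sigma_3\Phi\rangle\bigl(q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle\bigr) - \bigl(q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle\bigr)(-1)\langle R, \sigma_3\Phi\rangle}{(q'(\omega))^2 - \langle R, \partial_\omega\Phi_\omega\rangle^2 + \langle\sigma_3 R, \Phi_\omega\rangle\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle} = 0.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
i q'\{Q, \vartheta\} &= \langle\nabla Q, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\langle\nabla\vartheta, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle - \langle\nabla Q, e^{i\sigma_3\vartheta}\partial_\omega\Phi\rangle\langle\nabla\vartheta, e^{i\sigma_3\vartheta}\sigma_3\Phi\rangle\\
&= q'\,\frac{-\langle R, \sigma_3\Phi\rangle\langle R, \sigma_3\partial_\omega^2\Phi\rangle - \bigl(q'(\omega) + \langle R, \partial_\omega\Phi_\omega\rangle\bigr)\bigl(q'(\omega) - \langle R, \partial_\omega\Phi_\omega\rangle\bigr)}{i\bigl[(q'(\omega))^2 - \langle R, \partial_\omega\Phi_\omega\rangle^2 + \langle\sigma_3 R, \Phi_\omega\rangle\langle R, \sigma_3\partial_\omega^2\Phi_\omega\rangle\bigr]} = q'\,i.
\end{aligned}
\]
By (5.7), (5.12) and (5.13) we have
\[
i\{z_j, Q\} = -\langle\xi_j, R\rangle + \partial_{\bar z_j}Q, \qquad
i\{\bar z_j, Q\} = \langle\xi_j, \sigma_1 R\rangle - \partial_{z_j}Q.
\tag{5.17}
\]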


By
\[
Q(U) = q + \frac{1}{2}\langle z\cdot\xi + \bar z\cdot\sigma_1\xi + P_c(\omega)f,\ \sigma_1(z\cdot\xi + \bar z\cdot\sigma_1\xi + P_c(\omega)f)\rangle,
\tag{5.18}
\]
we have
\[
\partial_{z_j}Q = \langle\xi_j, \sigma_1 R\rangle, \qquad \partial_{\bar z_j}Q = \langle\xi_j, R\rangle.
\tag{5.19}
\]
So both lines in (5.17) are 0 and yield (5.14). Finally (5.15) follows by (5.9), Lemma 5.3, (5.12), (5.13) and by
\[
\{f, Q\} = (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\bigl[i\{Q, \vartheta\}\sigma_3 R - i e^{-i\sigma_3\vartheta}\sigma_3\sigma_1\nabla Q\bigr]
= (P_c(\omega)P_c(\omega_0))^{-1}P_c(\omega)\bigl[i\sigma_3 R - i\sigma_3\Phi - i\sigma_3 R\bigr] = 0.
\]
Equation (5.16) is an immediate consequence of the definition of $X_Q$ and of (5.12)–(5.15).

6. Hamiltonian Reformulation of the System

Equation (3.20) is how the problem is framed in the literature. Yet (3.20) hides the crucial hamiltonian nature of the problem. In the coordinate system (3.13) the system can be written as follows:
\[
\dot\omega = \{\omega, E\}, \quad \dot f = \{f, E\}, \quad \dot z_j = \{z_j, E\}, \quad \dot{\bar z}_j = \{\bar z_j, E\}, \quad \dot\vartheta = \{\vartheta, E\}.
\tag{6.1}
\]
For the scalar coordinates the equations in (6.1) are due to the hamiltonian nature of (3.5). Exactly for the same reasons we have the equation $\dot f = \{f, E\}$, which we now derive in the following standard way. Multiplying (3.20) by $e^{i\vartheta\sigma_3}$ one can rewrite (3.20) by (4.6) and (3.6), as
\[
-i\dot\vartheta\frac{\partial}{\partial\vartheta} + i\dot\omega\frac{\partial}{\partial\omega} + i\sum_j\dot z_j\frac{\partial}{\partial z_j} + i\sum_j\dot{\bar z}_j\frac{\partial}{\partial\bar z_j} + i e^{i\vartheta\sigma_3}P_c(H_\omega)\dot f = \sigma_3\sigma_1\nabla E(U).
\tag{6.2}
\]
When we apply the derivative $f'(U)$ to (6.2) the first line cancels, so that
\[
\dot f = f'(U)e^{i\vartheta\sigma_3}P_c(H_\omega)\dot f = -f'(U)\,i\sigma_3\sigma_1\nabla E(U) = f'(U)X_E(U) = \{f, E\},
\]
where the first equality is (4.11), the third is the definition of hamiltonian field two lines above (5.2) and the last equality is Definition 5.2.

We now introduce a new hamiltonian. For $u_0$ the initial datum in (1.1), set
\[
K(U) = E(U) + \omega(U)Q(U) - \omega(U)\|u_0\|^2_{L^2_x}.
\tag{6.3}
\]
By Lemma 5.4 the solution of the initial value problem in (1.1) solves also
\[
\dot\omega = \{\omega, K\}, \quad \dot f = \{f, K\}, \quad \dot z_j = \{z_j, K\}, \quad \dot{\bar z}_j = \{\bar z_j, K\}, \quad \dot\vartheta - \omega = \{\vartheta, K\}.
\tag{6.4}
\]
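The step from (6.1) to (6.4) is a Leibniz-rule computation. As a sketch, let $\zeta$ stand for any of the coordinates $\omega, \vartheta, z_j, \bar z_j, f$, and assume that $Q(U(t)) = \|u_0\|^2_{L^2_x}$ along the solution (conservation of $Q$ together with the normalization of $Q$ in (3.3), which is not restated in this section).

```latex
\begin{align*}
\{\zeta, K\} &= \{\zeta, E\} + \omega\,\{\zeta, Q\}
  + \bigl(Q - \|u_0\|_{L^2_x}^2\bigr)\{\zeta, \omega\}
  = \{\zeta, E\} + \omega\,\{\zeta, Q\},\\
\{\omega, K\} &= \{\omega, E\}, \quad
\{f, K\} = \{f, E\}, \quad
\{z_j, K\} = \{z_j, E\}, \quad
\{\bar z_j, K\} = \{\bar z_j, E\},\\
\{\vartheta, K\} &= \{\vartheta, E\} - \omega\,\{Q, \vartheta\} = \dot\vartheta - \omega .
\end{align*}
```

The second line uses (5.12), (5.14) and (5.15), and the third uses (5.13), which is why only the $\vartheta$ equation picks up the shift by $\omega$.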



By $\frac{\partial}{\partial\vartheta}K = 0$ the right hand sides in Eqs. (6.4) do not depend on $\vartheta$. Hence, if we look at the new system
\[
\dot\omega = \{\omega, K\}, \quad \dot f = \{f, K\}, \quad \dot z_j = \{z_j, K\}, \quad \dot{\bar z}_j = \{\bar z_j, K\}, \quad \dot\vartheta = \{\vartheta, K\},
\tag{6.5}
\]
the evolution of the crucial variables $(\omega, z, \bar z, f)$ in (6.1) and (6.5) is the same. Therefore, to prove Theorem 2.2 it is sufficient to consider system (6.5).

7. Application of the Darboux Theorem

Since the main obstacle at reproducing the Birkhoff normal forms argument of [BC] for (6.5) is that the coordinates (3.13) are not canonical, we change coordinates. That is, we apply the Darboux Theorem. We warn the reader not to confuse the variable $t \in [0, 1]$ of this section with the time of the evolution equation of the other sections. We introduce the 2-form, for $q = q(\omega) = \|\varphi_\omega\|^2_{L^2_x}$ and summing on repeated indexes,
\[
\Omega_0 = i\,d\vartheta\wedge dq + dz_j\wedge d\bar z_j + \langle f'(U)\,\cdot\,, \sigma_3\sigma_1 f'(U)\,\cdot\,\rangle,
\tag{7.1}
\]
with $f(U)$ the function in Lemma 4.2, $f'(U)$ its Fréchet derivative and the last term in (7.1) acting on pairs $(X, Y)$ like $\langle f'(U)X, \sigma_3\sigma_1 f'(U)Y\rangle$. It is an elementary exercise to show that $\Omega_0$ is a closed and non degenerate 2 form. In Lemma 7.1 we check that $\Omega_0(U) = \Omega(U)$ at $U = e^{i\sigma_3\vartheta}\Phi_{\omega_0}$. Then the proof of the Darboux Theorem goes as follows. One first considers
\[
\Omega_t = (1 - t)\Omega_0 + t\,\Omega = \Omega_0 + t\,\widetilde\Omega \quad\text{with}\quad \widetilde\Omega := \Omega - \Omega_0.
\tag{7.2}
\]
Then one considers a 1-differential form $\gamma(t, U)$ such that (external differentiation will always be on the $U$ variable only) $i\,d\gamma(t, U) = \widetilde\Omega$ with $\gamma(U) = 0$ at $U = e^{i\sigma_3\vartheta}\Phi_{\omega_0}$. Finally one considers the vector field $Y^t$ such that $i_{Y^t}\Omega_t = -i\gamma$ (here for $\Omega$ a 2 form and $Y$ a vector field, $i_Y\Omega$ is the 1 form defined by $i_Y\Omega(X) := \Omega(Y, X)$) and the flow $\mathfrak{F}_t$ generated by $Y^t$, which near the points $e^{i\sigma_3\vartheta}\Phi_{\omega_0}$ is defined up to time 1, and shows that $\mathfrak{F}_1^*\Omega = \Omega_0$ by
\[
\frac{d}{dt}\bigl(\mathfrak{F}_t^*\Omega_t\bigr) = \mathfrak{F}_t^*\bigl(L_{Y^t}\Omega_t\bigr) + \mathfrak{F}_t^*\frac{d}{dt}\Omega_t
= \mathfrak{F}_t^*\,d\bigl(i_{Y^t}\Omega_t\bigr) + \mathfrak{F}_t^*\widetilde\Omega
= \mathfrak{F}_t^*\bigl(-i\,d\gamma + \widetilde\Omega\bigr) = 0.
\tag{7.3}
\]
For $\Omega_0$, the coordinates (3.13) are canonical. But if one does not choose the 1 form $\gamma$ carefully, then the new hamiltonian $\widetilde K = K\circ\mathfrak{F}_1$ will not yield a semilinear NLS for coordinates (3.13), which is what we need to perform the argument of [BC,CM]. In the sequel of this section all the work is aimed at the correct choice of $\gamma$. In Lemma 7.2 we compute explicitly a differential form $\alpha$ and we make the preliminary choice $\gamma = -i\alpha$. This is not yet the right choice. By the computations in Lemma 7.3 and Remark 7.4, we find the obstruction to the fact that $\widetilde K$ is of the desired type. Lemmas 7.5–7.7 are necessary to find an appropriate solution $F$ of a differential equation in Lemma 7.8. Then $\gamma = -i\alpha + dF$ is the right choice of $\gamma$. In Lemma 7.10 we collect a number of useful estimates for $\mathfrak{F}_1$. Finally, Lemma 7.11 is valid independently of the precise $\gamma$ chosen and contains information necessary for (8.1)–(8.2).


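For any vector $Y \in T_U L^2$ we set
\[
Y = Y_\vartheta\frac{\partial}{\partial\vartheta} + Y_\omega\frac{\partial}{\partial\omega} + Y_j\frac{\partial}{\partial z_j} + Y_{\bar j}\frac{\partial}{\partial\bar z_j} + e^{i\sigma_3\vartheta}P_c(\omega)Y_f
\tag{7.4}
\]
for
\[
Y_\vartheta = d\vartheta(Y), \quad Y_\omega = d\omega(Y), \quad Y_j = dz_j(Y), \quad Y_{\bar j} = d\bar z_j(Y), \quad Y_f = f'(U)Y.
\tag{7.5}
\]
Similarly, a differential 1-form $\gamma$ decomposes as
\[
\gamma = \gamma^\vartheta d\vartheta + \gamma^\omega d\omega + \gamma^j dz_j + \gamma^{\bar j}d\bar z_j + \langle\gamma^f, f'\rangle,
\tag{7.6}
\]
where: $\langle\gamma^f, f'\rangle$ acts on a vector $Y$ as $\langle\gamma^f, f'Y\rangle$, with here $\gamma^f \in L^2_c(H^*_{\omega_0})$; $\gamma^\vartheta$, $\gamma^\omega$, $\gamma^j$ and $\gamma^{\bar j}$ are in $\mathbb{C}$. Notice that we are reversing the standard notation on super and subscripts for forms and vector fields. In the sequel, given a differential 1 form $\gamma$ and a point $U$, we will denote by $\gamma_U$ the value of $\gamma$ at $U$. Given a function $\chi$, denote its hamiltonian vector field with respect to $\Omega_t$ by $X^t_\chi$: $i_{X^t_\chi}\Omega_t = -i\,d\chi$. By (7.1) we have the following hamiltonian vectorfield associated to $q(\omega)$ (this is important in Lemma 7.11 later):
\[
X^0_{q(\omega)} = -\frac{\partial}{\partial\vartheta}.
\tag{7.7}
\]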
(7.7)

We have the following preliminary observation: Lemma 7.1. At U = eiσ3 ϑ ω0 , for any ϑ, we have 0 (U ) = (U ). Proof. Using the following partition of the identity:

1 = eiσ3 ϑ [PNg (Hω ) + Pker(Hω −λ) + Pc (Hω )]e−iσ3 ϑ ,

(7.8)

λ∈σ (Hω )\{0}

we get, summing on repeated indexes, (X, Y ) = X, σ3 σ1 Y 1 = X, e−iσ3 ϑ σ3 ∂ω Y, e−iσ3 ϑ − X, e−iσ3 ϑ Y, e−iσ3 ϑ σ3 ∂ω q + X, e−iσ3 ϑ σ3 ξ j Y, e−iσ3 ϑ σ1 σ3 ξ j −X, e−iσ3 ϑ σ1 σ3 ξ j Y, e−iσ3 ϑ σ3 ξ j +Pc (Hω )e−iσ3 ϑ X, σ3 σ1 Pc (Hω )e−iσ3 ϑ Y .

(7.9)

By (4.2) we have , e−iσ3 ϑ σ3 ∂ω ∧ , e−iσ3 ϑ = det A dω ∧ dϑ.

(7.10)

Substituting (4.8)–(4.9) we get , e−iσ3 ϑ σ3 ξ j ∧ , e−iσ3 ϑ σ1 σ3 ξ j = (dz j + σ3 ξ j , ∂ω Rdω + iσ3 ξ j , σ3 Rdϑ) ∧(dz j + σ1 σ3 ξ j , ∂ω Rdω + iσ1 σ3 ξ j , σ3 Rdϑ).

(7.11)



By Lemma 4.2 we have Pc (Hω )e−iσ3 ϑ , σ3 σ1 Pc (Hω )e−iσ3 ϑ = Pc (ω)Pc (ω0 ) f + Pc (ω)∂ω R dω + iPc (ω)σ3 R dϑ, σ3 σ1 (Pc (ω)Pc (ω0 ) f + Pc (ω)∂ω R dω + iPc (ω)σ3 R dϑ).

(7.12)

Then by (7.9)–(7.12) we have = (iq + a1 )dϑ ∧ dω + dz j ∧ dz j + dz j ∧ σ1 σ3 ξ j , ∂ω R dω + iσ1 σ3 ξ j , σ3 R dϑ − dz j ∧ σ3 ξ j , ∂ω R dω + iσ3 ξ j , σ3 R dϑ + Pc (ω)Pc (ω0 ) f , σ3 σ1 Pc (ω)Pc (ω0 ) f + Pc (ω)Pc (ω0 ) f , σ3 σ1 Pc (ω)∂ω R ∧ dω + iPc (ω)Pc (ω0 ) f , σ3 σ1 Pc (ω)σ3 R ∧ dϑ,

(7.13)

where iq + a1 =

det A + Pc (ω)∂ω R, σ3 σ1 Pc (ω)σ3 iR q + σ3 ξ j , ∂ω Rσ1 σ3 ξ j , iσ3 R − σ1 σ3 ξ j , ∂ω Rσ3 ξ j , iσ3 R. (7.14)

In particular we have a1 := −iq +

det A + PNg⊥ (Hω∗ ) iσ3 R, σ3 σ1 ∂ω R. q

(7.15)

Notice that a1 = a1 (ω, z, f ) is smooth in the arguments ω ∈ O, z ∈ Cn and f ∈ H −K ,−S for any pair (K , S ) with, for (z, f ) near 0, |a1 | ≤ C(K , S )(|z| + f H −K ,−S )2 .

(7.16)

At points U = eiσ3 ϑ ω , that is for R = 0, we have = idϑ ∧ dq + dz j ∧ dz j + Pc (ω)Pc (ω0 ) f , σ3 σ1 Pc (ω)Pc (ω0 ) f . At ω = ω0 we get = 0 .

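Lemma 7.2. Consider the following forms:
\[
\beta(U)Y := \frac{1}{2}\langle\sigma_1\sigma_3 U, Y\rangle; \qquad
\beta_0(U) := -i q\,d\vartheta - \sum_j\frac{\bar z_j\,dz_j - z_j\,d\bar z_j}{2} + \frac{1}{2}\langle f(U), \sigma_3\sigma_1 f'(U)\,\cdot\,\rangle.
\tag{7.17}
\]
Then
\[
d\beta_0 = \Omega_0, \qquad d\beta = \Omega.
\tag{7.18}
\]
Set
\[
\alpha(U) = \beta(U) - \beta_0(U) + d\psi(U) \quad\text{where}\quad \psi(U) := \frac{1}{2}\langle\sigma_3\Phi, R\rangle.
\tag{7.19}
\]
We have $\alpha = \alpha^\vartheta d\vartheta + \alpha^\omega d\omega + \langle\alpha^f, f'\rangle$ with:
\[
\begin{aligned}
\alpha^\vartheta + \frac{i}{2}\|f\|_2^2 &= -\frac{i}{2}\|z\cdot\xi + \bar z\cdot\sigma_1\xi\|_2^2 - i\langle z\cdot\xi + \bar z\cdot\sigma_1\xi, \sigma_1 P_c(\omega)f\rangle\\
&\quad - \frac{i}{2}\langle(P_c(\omega) - P_c(\omega_0))f, \sigma_1(P_c(\omega) + P_c(\omega_0))f\rangle;\\
\alpha^\omega &= -\frac{1}{2}\langle\sigma_1 R, \sigma_3\partial_\omega R\rangle;\\
\alpha^f &= \frac{1}{2}\sigma_1\sigma_3 P_c(\omega_0)(P_c(\omega) - P_c(\omega_0))f.
\end{aligned}
\tag{7.20}
\]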

Proof. Everything is straightforward except for (7.20), which we now prove. We will sum over repeated indexes. We substitute U using (3.13) getting β=

1 1 −iσ3 ϑ e σ1 σ3 , + e−iσ3 ϑ σ1 σ3 Pc (ω) f, 2 2 1 −iσ3 ϑ + z j e σ1 σ3 ξ j , − z j e−iσ3 ϑ σ3 ξ j , . 2

(7.21)

When we decompose 21 e−iσ3 ϑ σ1 σ3 like ∇ F in (4.5), we obtain q 1 −iσ3 ϑ σ1 σ3 , = − e−iσ3 ϑ σ3 ∂ω , e 2 q 1 − σ3 , ξ j e−iσ3 ϑ σ3 ξ j , − e−iσ3 ϑ σ1 σ3 ξ j , 2 1 (7.22) − e−iσ3 ϑ Pc (Hω∗ )σ3 , 2 with by (4.2) −

q −iσ3 ϑ q q e σ3 ∂ω , = R, σ3 ∂ω2 dω − i (q + R, ∂ω ) dϑ. (7.23) q q q

Substituting slightly manipulated versions of the formulas in Lemmas 4.1–4.2, in particular using σ3 Pc (ω) = Pc (ω)∗ σ3 , σ1 Pc (ω) = Pc (ω)σ1 and σ1 σ3 = −σ1 σ3 , and summing over repeated indexes, we get 1 β0 = −iq dϑ + z j (σ1 σ1 ξ j , σ3 ∂ω Rdω + iξ j , Rdϑ − e−iσ3 ϑ σ3 ξ j , ) 2 1 + z j (σ1 ξ j , σ3 ∂ω Rdω + iξ j , σ1 Rdϑ + e−iσ3 ϑ σ1 σ3 ξ j , ) 2 1 1 + f, σ3 σ1 (1 − Pc (ω)Pc (ω0 )) f + f, σ3 σ1 Pc (ω)e−iσ3 ϑ 2 2 1 i + σ1 Pc (ω) f, σ3 ∂ω Rdω + Pc (ω) f, σ1 Rdϑ. 2 2



Hence

1 1 β0 = i −q + R, σ1 R dϑ + σ1 R, σ3 ∂ω R dω 2 2 1 + σ1 σ3 (1 − Pc (ω0 )Pc (ω)) f, f 2 1 + z j e−iσ3 ϑ σ1 σ3 ξ j , − z j e−iσ3 ϑ σ3 ξ j , 2 1 + e−iσ3 ϑ σ1 σ3 Pc (ω) f, . 2

(7.24)

By (3.13) we have dψ =

1 1 1 σ3 , ∂ω Rdω + σ3 , ξ j dz j − dz j + σ3 , Pc (ω) f . (7.25) 2 2 2

Applying to (7.25) Lemmas 4.1–4.2, the fact that, in particular, we have Pc (ω) f (U ) = Pc (ω)Pc (ω0 ) f (U ) = Pc (ω) −∂ω R dω − iσ3 R dϑ + e−iσ3 ϑ 1 , and the identities (7.27)–(7.28) below, we get dψ = =

1 σ3 , ξ j e−iσ3 ϑ σ3 ξ j , − e−iσ3 ϑ σ1 σ3 ξ j , 2 1 + e−iσ3 ϑ Pc (Hω∗ )σ3 , 2 q i + σ3 ∂ω , ∂ω Rdω − σ3 , PNg⊥ (Hω∗ ) σ3 Rdϑ. q 2

(7.26)

To get the last line of (7.26) we have used: 1 1 σ3 , ∂ω R − σ3 , ξ j σ3 ξ j , ∂ω R 2 2 1 1 1 − σ3 , σ1 ξ j σ1 σ3 ξ j , ∂ω R − σ3 , Pc (ω)∂ω R = σ3 , ∂ω R 2 2 2 1 2q 1 − σ3 , ∂ω R − σ3 , σ3 σ3 ∂ω , ∂ω R = σ3 ∂ω , ∂ω R; 2 q 2q i i − σ3 , ξ j σ3 ξ j , σ3 R − σ3 , σ1 ξ j σ1 σ3 ξ j , σ3 R 2 2 i i − σ3 , Pc (ω)σ3 R = − σ3 , PNg⊥ (Hω∗ ) σ3 R. 2 2

(7.27)

(7.28)

Let us consider the sum (7.19). There are various cancelations. The first and second (resp. the first term of the third) line of (7.26) cancel with the second and third lines of (7.22) (resp. the first term of the rhs of (7.23)). The last three terms in rhs (7.21) cancel with the last two lines of (7.24). The −iqdϑ term in the rhs of (7.24) cancels with the


−iqdϑ term in (7.23). Adding the second term of the third line of (7.26) with the last term in the rhs of (7.23) we get the product of i times the following quantities: 1 q − σ3 , PNg⊥ (Hω∗ ) σ3 R − R, ∂ω 2 q 1 q 1 = − , R + σ3 , PNg (Hω ) σ3 R − R, ∂ω 2 2 q 1 1 = − , R + σ3 R, σ3 , ∂ω 2 2q 1 q + σ3 R, σ3 ∂ω σ3 , σ3 − R, ∂ω = 0, 2q q

(7.29)

where for the second equality we have used PNg (Hω ) =

1 1 σ3 σ3 ∂ω , + ∂ω , . q q

The last equality in (7.29) can be seen as follows. The two terms in the third line in (7.29) are both equal to 0. Indeed, σ3 , ∂ω = 0 and, by R ∈ N g⊥ (Hω∗ ) and ∈ N g (Hω∗ ), R, = 0. The two terms in the fourth line in (7.29) cancel each other. Then we get formulas for α ω and α f . We get α ϑ also by Pc (ω) f 22 = f 22 + (Pc (ω) − Pc (ω0 )) f, σ1 (Pc (ω) + Pc (ω0 )) f . We have, summing over repeated indexes (also on j and j): Lemma 7.3. We have i Y 0 = iq Yϑ dω − iq Yω dϑ + (Y j dz j − Y j dz j ) + σ1 σ3 Y f , f .

(7.30)

, we have For a1 given by (7.15), and for = i Y ω = a1 Yϑ + σ1 σ3 ξ j , ∂ω RY j − σ3 ξ j , ∂ω RY j + Y f , σ3 σ1 Pc (ω)∂ω R; −ϑ = a1 Yω − i σ1 σ3 ξ j , σ3 RY j + i σ3 ξ j , σ3 RY j − i Y f , σ3 σ1 Pc (ω)σ3 R; − j = σ1 σ3 ξ j , ∂ω RYω + i σ1 σ3 ξ j , σ3 RYϑ ; j = σ3 ξ j , ∂ω RYω + i σ3 ξ j , σ3 RYϑ ; σ3 σ1 f = (Pc (ω0 )Pc (ω) − 1)Y f + Yω Pc (ω0 )Pc (ω)∂ω R + i Yϑ Pc (ω0 )Pc (ω)σ3 R.

(7.31)



we have In particular, for γ = i Y t t = i Y t 0 + t i Y t γω = (iq + ta1 )(Y t )ϑ + tσ1 σ3 ξ j , ∂ω R(Y t ) j − tσ3 ξ j , ∂ω R(Y t ) j + t(Y t ) f , σ3 σ1 Pc (ω0 )Pc (ω)∂ω R; −γϑ = (iq + ta1 )(Y t )ω − i t σ1 σ3 ξ j , σ3 R(Y t ) j + i t σ3 ξ j , σ3 R(Y t ) j − i t (Y t ) f , σ3 σ1 Pc (ω0 )Pc (ω)σ3 R; −γ j = (Y t ) j + tσ1 σ3 ξ j , ∂ω R(Y t )ω + i t σ1 σ3 ξ j , σ3 R(Y t )ϑ ;

(7.32)

γ j = (Y t ) j + tσ3 ξ j , ∂ω R(Y t )ω + i t σ3 ξ j , σ3 R(Y t )ϑ ; σ3 σ1 γ f = (Y t ) f + t (Pc (ω0 )Pc (ω) − 1)(Y t ) f + + t (Y t )ω Pc (ω0 )Pc (ω)∂ω R + t i (Y t )ϑ Pc (ω0 )Pc (ω)σ3 R. Proof. Equation (7.30) is trivial. Equation (7.32) follows immediately from (7.30)– (7.31). In the following formulas we denote Pc = Pc (ω), Pc0 = Pc (ω0 ) and we sum on = + 1 with, see (7.13), repeated indexes. We can split 1 = (Pc0 Pc − 1) f , σ3 σ1 f , = a1 dϑ ∧ dω + dz j ∧ (σ1 σ3 ξ j , ∂ω Rdω + iσ1 σ3 ξ j , σ3 Rdϑ) − dz j ∧ (σ3 ξ j , ∂ω Rdω + iσ3 ξ j , σ3 Rdϑ) +Pc Pc0 f , σ3 σ1 Pc ∂ω R ∧ dω + iPc Pc0 f , σ3 σ1 Pc σ3 R ∧ dϑ. Then 1 = σ1 σ3 (Pc0 Pc − 1)Y f , f iY and = a1 Yϑ + Y j σ1 σ3 ξ j , ∂ω R − Y σ3 ξ j , ∂ω R + Y f , σ3 σ1 Pc ∂ω R dω iY j + −a1 Yω + iY j σ1 σ3 ξ j , σ3 R − iY j σ3 ξ j , σ3 R + iY f , σ3 σ1 Pc σ3 R dϑ − (σ1 σ3 ξ j , ∂ω RYω + iσ1 σ3 ξ j , σ3 RYϑ )dz j + (σ3 ξ j , ∂ω RYω + iσ3 ξ j , σ3 RYϑ )dz j − f , Yω σ3 σ1 Pc0 Pc ∂ω R + iYϑ σ3 σ1 Pc0 Pc σ3 R. Remark 7.4. If we choose γ = −iα in Lemma 7.3 with the α of (7.19), and if Ft is the flow of Y t , then (Y t )ϑ = 0 is an obstruction to the fact that, for 0 < t ≤ 1, K ◦ Ft is the hamiltonian of the sort of semilinear NLS that (6.1) is. Indeed (Y t ) f = −ti(Y t )ϑ Pc (ω0 )Pc (ω)σ3 R + S(R3 , C2 ). Then if we substitute f with f − i(Y 1 )ϑ Pc (ω0 )Pc (ω)σ3 R + · · · in Hω f, σ3 σ1 f we obtain a term of the form (Y 1 )2ϑ Hω f, σ3 σ1 f . To avoid terms like this, we want flows defined from fields with (Y t )ϑ = 0. To this effect we add a correction to α. We first consider the hamiltonian fields of ϑ and ω.


Lemma 7.5. Consider the vectorfield X ϑt (resp. X ωt ) defined by i X t t = −idϑ (resp. ϑ i X ωt t = −idω). Then we have (here Pc = Pc (Hω ) and Pc0 = Pc (Hω0 )): ∂ ∂ ∂ t t X ϑ = (X ϑ )ω − tσ1 σ3 ξ j , ∂ω R − tσ3 ξ j , ∂ω R ∂ω ∂z j ∂z j −t Pc0 (1 + t Pc − t Pc0 )−1 Pc0 Pc ∂ω R , (7.33) ∂ ∂ ∂ t t − itξ j , R + itσ1 ξ j , R X ω = (X ω )ϑ ∂ϑ ∂z j ∂z j −it Pc0 (1 + t Pc − t Pc0 )−1 Pc0 Pc σ3 R , where, for the a1 of (7.15), we have i = −(X ωt )ϑ , + ta1 + ta2 a2 := itσ3 ξ j , ∂ω Rσ1 ξ j , R − itσ1 σ3 ξ j , ∂ω Rξ j , R

(X ϑt )ω =

iq

(7.34)

+ itPc0 (1 + t Pc − t Pc0 )−1 Pc0 Pc ∂ω R, σ3 σ1 Pc σ3 R.

(7.35)

Proof. By (7.32) for γ = −i dϑ, X ϑt satisfies (X ϑt )ϑ = 0;

i = (iq + ta1 )(X ϑt )ω − itσ1 σ3 ξ j , σ3 R(X ϑt ) j + itσ3 ξ j , σ3 R(X ϑt ) j − it(X ϑt ) f , σ3 σ1 Pc σ3 R;

(X ϑt ) f (X ϑt ) j

= =

(7.36)

Pc0 Pc )(X ϑt ) f

t (1 − − t (X ϑt )ω Pc0 Pc ∂ω R; t −t (X ϑ )ω σ1 σ3 ξ j , ∂ω R; (X ϑt ) j = −t (X ϑt )ω σ3 ξ j , ∂ω R.

This yields (7.33) for X ϑt and the first equality in (7.34). By (7.32) for γ = −i dω, X ωt satisfies (X ωt )ω = 0;

−i − i q (X ωt )ϑ = ta1 (X ωt )ϑ + tσ1 σ3 ξ j , ∂ω R(X ωt ) j − tσ1 σ3 ξ j , ∂ω R(X ωt ) j + t(X ωt ) f , σ3 σ1 Pc ∂ω R; (X ωt ) f (X ωt ) j

= =

(7.37)

t (1 − Pc0 Pc )(X ωt ) f − i t (X ωt )ϑ Pc0 Pc σ3 R; −i t (X ωt )ϑ σ1 σ3 ξ j , σ3 R; (X ωt ) j = −i t (X ωt )ϑ σ3 ξ j , σ3 R.

This yields the rest of (7.33)–(7.34).

The following lemma is an immediate consequence of the formulas in Lemma 7.5 and of (7.16). Lemma 7.6. For any (K , S , K , S) we have |1 − (X ϑt )ω q | R2H −K ,−S |(X ϑt ) j | + |(X ϑt ) j | + (X ϑt ) f H K ,S R H −K ,−S ,

(7.38)



and |1 + (X ωt )ϑ q | R2H −K ,−S , |(X ωt ) j | + |(X ωt ) j | + (X ωt ) f H −K ,−S R H −K ,−S .

(7.39)

Set HcK ,S (ω) = Pc (ω)H K ,S and denote K ,S = Cm × HcK ,S (ω0 ) , P K ,S = R2 × P K ,S P

(7.40)

K ,S . with elements (ϑ, ω, z, f ) ∈ P K ,S and (z, f ) ∈ P Lemma 7.7. We consider ∀ t ∈ [0, 1] the hamiltonian field X ϑt and the flow d s (t, U ) = X ϑt (s (t, U )) , 0 (t, U ) = U. ds

(7.41)

(1) For any (K , S ) there is a s0 > 0 and a neighborhood U of R × {(ω0 , 0, 0)} in P −K ,−S such that the map (s, t, U ) → s (t, U ) is smooth, (−s0 , s0 ) × [0, 1] × (U ∩ {ω = ω0 }) → P −K

,−S

.

(7.42)

(2) U can be chosen so that for any t ∈ [0, 1] there is another neighborhood Vt of R × {(ω0 , 0, 0)} in P −K ,−S s.t. the above map establishes a diffeomorphism (−s0 , s0 ) × (U ∩ {ω = ω0 }) → Vt .

(7.43)

(3) f (s (t, U )) − f (U ) = G(t, s, z, f ) is a smooth map for all (K , S), (−s0 , s0 ) × [0, 1] × (U ∩ {ω = ω0 }) → H K ,S with G(t, s, z, f ) H K ,S ≤ C|s|(|z| + f H −K ,−S ). Proof. Claims (1)–(2) follow by Lemma 7.5 which implies X ϑt ∈ C ∞ (U, P K ,S ) for all (K , S). Let ζ be any coordinate z j or f . Then, for ζ a scalar coordinate, we have s |ζ (s (t, U )) − ζ (U )| ≤ |(X ϑt )ζ (s (t, U ))|ds −s

≤ C|s| sup (|z(s (t, U ))| + f (s (t, U )) H −K ,−S ). |s |≤s

(7.44)

For ζ = f we have f (s (t, U )) − f (U ) H K ,S ≤

s

−s

(X ϑt ) f (s (t, U )) H K ,S ds ≤ rhs(7.44).

The above two formulas imply the following, which yields claim (3): f (s (t, U )) − f (U ) H K ,S ≤ C|s|(|z| + f H −K ,−S ), |z(s (t, U )) − z(U )| ≤ C|s|(|z| + f H −K ,−S ).

(7.45)


Lemma 7.8. We consider a scalar function F(t, U ) defined as follows: s F(t, s (t, U )) = i αs (t,U ) X ϑt (s (t, U )) ds , where ω(U ) = ω0 .

(7.46)

0

We have F ∈ C ∞ ([0, 1] × U, R) for a neighborhood U of R × {(ω0 , 0, 0)} in P −K ,−S . We have 2 |F(t, U )| ≤ C(K , S )|ω − ω0 | |z| + f H −K ,−S . (7.47) We have (exterior differentiation only in U ) (α + i d F)(X ϑt ) = 0.

(7.48)

Proof. F is smooth by (7.20) and Lemma 7.7. Equation (7.48) follows by (7.41) and by d F(t, s (t, U )) = 0. αU X ϑt (U ) + i ds |s=0

(7.49)

By (7.20) and (7.38) we have 2 |α(X ϑt )| ≤ |α ω | |(X ϑt )ω | + |α f , (X ϑt ) f | |z| + f H −K ,−S . Then (7.47) follows by |s| ≈ |ω(s (t, U )) − ω0 |.

(7.50)

Lemma 7.9. Denote by X t the vector field which solves i X t t = −α − i d F(t).

(7.51)

Then the following properties hold: (1) There is a neighborhood U of R × {(ω0 , 0, 0)} in P 1,0 such that X t (U ) ∈ C ∞ ([0, 1] × U, P 1,0 ). (2) We have (X t )ϑ ≡ 0. (3) For constants C(K , S, K , S ) we have f 22 t (X )ω + (|z| + f H −K ,−S )2 ; 2q (ω) |(X t ) j | + |(X t ) j | + (X t ) f H K ,S (|z| + f H −K ,−S )

(7.52)

×(|ω − ω0 | + |z| + f H −K ,−S + f 2L 2 ). (4) We have LX t

∂ ∂ = 0. := X t , ∂ϑ ∂ϑ

(7.53)

Proof. Claim (1) follows from the regularity properties of α, F and t and from Eqs. (7.54) and (7.56) below. Equation (7.48) implies (2) by i(X t )ϑ = idϑ(X t ) = −i X t t (X t ) = i X t t (X ϑt ) = −(α + i d F)(X ϑt ) = 0. ϑ



We have i(X t )ω = idω(X t ) = −i X ωt t (X t ), so by (7.51) and (7.33) we get i(X t )ω = i X t t (X ωt ) = −(X ωt )ϑ α ϑ + t∂ j F ξ j , R − t∂ j Fσ1 ξ j , R

+ t∇ f F + iα f , Pc0 (1 + t Pc − t Pc0 )−1 Pc0 Pc σ3 R . (7.54)

Then by (7.20), (7.34), (7.15) and (7.35), we get the first inequality in (7.52): 2 f 22 t (X )ω + ≤ C |z| + f H −K ,−S . 2q (ω)

(7.55)

By (7.32) we have the following equations: i ∂ j F = (X t ) j + tσ1 σ3 ξ j , ∂ω R(X t )ω , −i ∂ j F = (X t ) j + tσ3 ξ j , ∂ω R(X t )ω ,

(7.56)

σ3 σ1 (α f + i ∇ f F) = −(X t ) f − t (Pc0 Pc − 1)(X t ) f − t (X t )ω Pc0 Pc ∂ω R. Formulas (7.56) imply |(Xωt ) j | ≤ |∂ j F| + C |z| + f H −K ,−S |(X t )ω |, |(Xωt ) j | ≤ |∂ j F| + C |z| + f H −K ,−S |(X t )ω |, (Xωt ) f H K ,S ≤ α f H K ,S + ∇ f F H K ,S + C |z| + f H −K ,−S |(X t )ω |,

which with (7.55), (7.20) and Lemma (7.47) imply (7.52). Equation (7.53) is a consequence of the following equalities, which we will justify below: 0=L

∂ ∂ϑ

(i X t t ) = i [

∂ t ∂ϑ ,X ]

t + i X t L

The first equality is a consequence of (7.51) and L sequence of L

∂ ∂ϑ

α = 0 and

∂ ∂ϑ

F = 0. Notice that

∂ ∂ϑ

∂ ∂ϑ

∂ ∂ϑ

t = i [

∂ t ∂ϑ ,X ]

t .

(7.57)

(α + id F) = 0. The latter is a conF = 0 can be proved observing that

∂ ∂ F = 0 and that on ω = ω0 we have ∂ϑ F = 0. (7.48), (7.20) and Lemma 7.5 imply X ϑt ∂ϑ L ∂ α = 0 is a consequence of the Cartan “magic” formula L X γ = (i X d + di X )γ , of ∂ϑ the definition (7.19) and of following equalities:

L L L

∂ ∂ϑ

∂ ∂ϑ

∂ ∂ϑ

β = di

β0 = di

dψ = di

∂ ∂ϑ ∂ ∂ϑ ∂ ∂ϑ

i dβ = − dσ1 U, U + iσ1 U, = 0; 2 β0 + i ∂ dβ0 = −idq − ii ∂ (dq ∧ dϑ) = −idq + idq ∧ i β +i

∂ ∂ϑ

∂ϑ

dψ =

∂ϑ

1 ∂ d σ3 , R = 0. 2 ∂ϑ

∂ ∂ϑ

dϑ = 0; (7.58)

The second equality in (7.57) follows by the product rule for the Lie derivative. Finally, the third equality in (7.57) follows by L ∂ t = (1 − t)L ∂ 0 + t L ∂ = 0, the con∂ϑ ∂ϑ ∂ϑ sequence of L ∂ = 0 (resp. L ∂ 0 = 0), in turn the consequence of the first (resp. ∂ϑ ∂ϑ second) line in (7.58) and of the identity L X dγ = d L X γ .


We have: Lemma 7.10. Consider the vectorfield X t in Lemma 7.8 and denote by Ft (U ) the corresponding flow. Then the flow Ft (U ) for U near eiσ3 ϑ ω0 is defined for all t ∈ [0, 1]. We have ϑ ◦ F1 = ϑ. We have for = j, j, f 22 + Eω (U ), 2 z (F1 (U )) = z (U ) + E (U ), f (F1 (U )) = f (U ) + E f (U ),

q (ω(F1 (U ))) = q (ω(U )) −

(7.59)

with |Eω (U )| (|ω − ω0 | + |z| + f H −K ,−S )2 ,

(7.60)

|E (U )| + E f (U ) H K ,S (|ω − ω0 | + |z| + f H −K ,−S ×(|ω − ω0 | + |z| + f H −K ,−S ).

(7.61)

+ f 2L 2 ),

For each ζ = ω, z , f we have Eζ (U ) = Eζ ( f 2L 2 , ω, z, f ) with, for a neighborhood U −K a0 > 0,

,−S

(7.62)

of R × {(ω0 , 0, 0)} in P −K

Eζ (, ω, z, f ) ∈ C ∞ ((−a0 , a0 ) × U −K

,−S

,−S

and for some fixed

, C)

(7.63)

, H K ,S ).

(7.64)

for ζ = ω, z and with E f (, ω, z, f ) ∈ C ∞ ((−a0 , a0 ) × U −K

,−S

Proof. We add a new variable . We define a new field by f 22 − ρ + t∂ j F ξ j , R − t∂ j Fσ1 ξ j , R 2 + t∇ f F + iα f , Pc0 (1 + t Pc − t Pc0 )−1 Pc0 Pc σ3 R],

i(Y t )ω = −(X ωt )ϑ [α ϑ + i

(7.65)

by i ∂ j F = (Y t ) j + tσ1 σ3 ξ j , ∂ω R(Y t )ω , −i ∂ j F = (Y t ) j + tσ3 ξ j , ∂ω R(Y t )ω , σ3 σ1 (α f + i ∇ f F) = (Y t ) f + t (Pc0 Pc − 1)(Y t ) f

(7.66)

− t (Y t )ω Pc0 Pc ∂ω R, and by Yρt = 2(Y t ) f , σ1 f . Then Y t = Y t (ω, ρ, z, f ) defines a new flow Gt (ρ, U ), which reduces to Ft (U ) in the invariant manifold defined by ρ = f 22 . Notice that by t ρ(t) = ρ(0) + 0 Yρs ds it is easy to conclude ρ(G1 (ρ, U )) = (U ) + O(rhs (7.60)). Using (7.39), (7.20) and (7.65) it is then easy to get t t ρ(s) ds + O(rhs (7.60)). q(ω(t)) = q(ω(0)) + q (ω(s))Yωs ds = q(ω(0)) − 2 0 0



By standard arguments, see for example the proof of Lemma 4.3 [BC], we get

$$\begin{aligned}
q(\omega(G_1(\rho,U))) &= q(\omega(U)) - \frac{\rho}{2} + E_\omega(\rho,U),\\
z_\ell(G_1(\rho,U)) &= z_\ell(U) + E_\ell(\rho,U),\\
f(G_1(\rho,U)) &= f(U) + E_f(\rho,U),
\end{aligned}\tag{7.67}$$

with $E_\zeta(\rho,U)$ satisfying (7.63) for $\zeta=\omega,z_\ell$ and (7.64) for $\zeta=f$. We have $E_\zeta(U) = E_\zeta(\|f\|_2^2,U)$ satisfying (7.60) for $\zeta=\omega$ and (7.61) for $\zeta = z_\ell, f$.

We have:

Lemma 7.11. Consider the flow $F_t$ of Lemma 7.10. Then we have

$$F_t^*\Omega_t = \Omega_0. \tag{7.68}$$

We have

$$Q\circ F_1 = q. \tag{7.69}$$

If χ is a function with ∂ϑ χ ≡ 0, then ∂ϑ (χ ◦ Ft ) ≡ 0. Proof. Equation (7.68) is Darboux Theorem, see (7.3). Let Gt = (Ft )−1 . Then Gt∗ 0 = t 0 = X q(ω)◦ t . We have Gt∗ X q(ω) Gt by t . i G ∗ X 0 t = i Gt∗ X q0 (ω) Gt∗ 0 = Gt∗ i X q0 (ω) 0 = −id(q(ω) ◦ Gt ) = i X t t q(ω) q(ω)◦Gt ∂ Then by X t , ∂ϑ = 0 for all t,

d t d ∗ 0 d ∗ ∂ ∗ 1−t ∂ X = −Gt X , = 0. = Gt X q(ω) = − Gt dt q(ω)◦Gt dt dt ∂ϑ ∂ϑ

1 0 So X q(ω)◦ G1 = X q(ω) . Since by (5.16) and (7.7) this implies d(q ◦ G1 ) = d Q and since there are points with q ◦ G1 (U ) = Q(U ), we obtain (7.69). Finally, the last statement of Lemma 7.11 follows by (7.53) and by

∗ ∂ ∂ ∗ ∗ ∂ ∗ F χ = Ft Ft χ = Ft χ . ∂ϑ t ∂ϑ ∂ϑ

8. Reformulation of (6.5) in the New Coordinates

We set

$$H = K\circ F_1. \tag{8.1}$$

In the new coordinates (6.5) becomes

$$q'\dot\omega = \frac{\partial H}{\partial\vartheta}\equiv 0,\qquad q'\dot\vartheta = -\frac{\partial H}{\partial\omega} \tag{8.2}$$

and

$$i\dot z_j = \frac{\partial H}{\partial\bar z_j},\qquad i\dot{\bar z}_j = -\frac{\partial H}{\partial z_j},\qquad i\dot f = \sigma_3\sigma_1\nabla_f H. \tag{8.3}$$

Recall that we are solving the initial value problem (1.1) and that we have chosen $\omega_0$ with $q(\omega_0) = \|u_0\|_{L^2}^2$. Correspondingly it is enough to focus on (8.3) with $\omega=\omega_0$. For system (8.3) we prove:

Theorem 8.1. There exist $\varepsilon>0$ and $C>0$ such that for $|z(0)| + \|f(0)\|_{H^1}\le\epsilon<\varepsilon$ the corresponding solution of (8.3) is globally defined and there are $f_\pm\in H^1$ with $\|f_\pm\|_{H^1}\le C\epsilon$ such that

$$\lim_{t\to\pm\infty}\big\|e^{i\vartheta(t)\sigma_3}f(t) - e^{it\Delta\sigma_3}f_\pm\big\|_{H^1} = 0, \tag{8.4}$$

where $\vartheta(t)$ is the variable associated to $U^T(t) = (u(t),\bar u(t))$ in (3.12) and (3.13). We also have

$$\lim_{t\to\infty} z(t) = 0. \tag{8.5}$$

In particular, it is possible to write $R(t,x) = A(t,x) + f(t,x)$ with $|A(t,x)|\le C_N(t)\langle x\rangle^{-N}$ for any $N$, with $\lim_{t\to\infty}C_N(t) = 0$, and such that for any admissible pair $(r,p)$, i.e. (2.4), we have

$$\|f\|_{L^r_t(\mathbb R,W^{1,p}_x)}\le C\epsilon. \tag{8.6}$$

By Lemma 7.10, Theorem 8.1 implies Theorem 2.2. Indeed, if we denote by $(\omega,z',f')$ the initial coordinates, and by $(\omega_0,z,f)$ the coordinates in (8.3), we have $z' = z + O(|z| + \|f\|_{L^{2,-2}_x})$ and $f' = f + O(|z| + \|f\|_{L^{2,-2}_x})$. The two error terms $O$ converge to 0 as $t\to\infty$. Hence the asymptotic behavior of $(z',f')$ and of $(z,f)$ is the same. We also have

$$q(\omega(t)) = q(\omega_0) - \frac{\|f(t)\|_2^2}{2} + O\big(|z(t)| + \|f(t)\|_{L^{2,-2}_x}\big),$$

which implies, say at $+\infty$,

$$\lim_{t\to+\infty}q(\omega(t)) = \lim_{t\to+\infty}\Big(q(\omega_0) - \frac{\|e^{it\Delta\sigma_3}f_+\|_2^2}{2}\Big) = q(\omega_0) - \frac{\|f_+\|_2^2}{2} = q(\omega_+),$$

where $\omega_+$ is the unique element near $\omega_0$ for which the last equality holds. So $\lim_{t\to+\infty}\omega(t) = \omega_+$.

In the rest of the paper we focus on Theorem 8.1. The main idea is that (8.3) is basically like the system considered in [BC]. Therefore Theorem 8.1 follows by the Birkhoff normal forms argument of [BC], supplemented with the various dispersive estimates in [CM].



8.1. Taylor expansions. Consider U = eiσ3 ϑ (ω + R) as in (3.12). Decompose R as in (3.14). Set u = ϕ + u c with t (Pc (ω) f ) = (u c , u c ). We have 1 ∂ ∂ 2 2 2 2 B(|u| ) = B |u c | + B(|u| )|u=u c +tϕ ϕ + B(|u| )|u=u c +tϕ ϕ dt ∂u ∂u 0 1

1 j ∂ui+1 ∂u B |u|2 = B |u c |2 + dt u ic u c j ϕ |u=tϕ i! j! 0 i+ j≤4 1

1 j+1 ∂ui ∂u B |u|2 dt u ic u c j ϕ + |u=tϕ i! j! 0 i+ j≤4

1 j +5 dtds(1 − s)4 u ic u c j ϕ ∂ui+1 ∂u B |u|2 |u=tϕ+su c i! j! [0,1]2 i+ j=5

1 j+1 ∂ui ∂u B |u|2 +5 dtds(1 − s)4 u ic u c j ϕ. (8.7) |u=tϕ+su c i! j! [0,1]2 i+ j=5

Lemma 8.2. The following statements hold: K = d(ω) − ωu 0 22 + K 2 + K P ,

1 K2 = λ j (ω)|z j |2 + σ3 Hω f, σ1 f , 2 j

KP = aμν (ω, z), 1z μ z ν + z μ z ν G μν (ω, z), σ3 σ1 Pc (ω) f |μ+ν|=3

|μ+ν|=2

4

⊗d + Bd (ω, z), (Pc (ω) f ) + B6 (ω, f ), 1 +

R3

d=2

for B6 (x, ω, f ) = B

|Pc (ω) f (x)|2 2

B5 (x, ω, z, f (x)) f ⊗5 (x)d x,

, where we have what follows:

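(1) $a_{\mu\nu}(\cdot,\omega,z)\in C^\infty(\mathcal U,H_x^{K,S}(\mathbb R^3,\mathbb C))$ for any pair $(K,S)$ and a small neighborhood $\mathcal U$ of $(\omega_0,0)$ in $\mathcal O\times\mathbb C^m$.

(2) $G_{\mu\nu}(\cdot,\omega,z)\in C^\infty(\mathcal U,H_x^{K,S}(\mathbb R^3,\mathbb C^2))$, for $\mathcal U$ like in (1), possibly smaller.

(3) $B_d(\cdot,\omega,z)\in C^\infty(\mathcal U,H_x^{K,S}(\mathbb R^3,B((\mathbb C^2)^{\otimes d},\mathbb C)))$, for $2\le d\le 4$, for $\mathcal U$ possibly smaller.

(4) Let ${}^t\eta = (\zeta,\bar\zeta)$ for $\zeta\in\mathbb C$. Then for $B_5(\cdot,\omega,z,\eta)$ we have, for any $l$,
$$\big\|\nabla^l_{\omega,z,\bar z,\zeta,\bar\zeta}B_5(\omega,z,\eta)\big\|_{H_x^{K,S}(\mathbb R^3,B((\mathbb C^2)^{\otimes 5},\mathbb C))}\le C_l.$$

(5) We have $a_{\mu\nu} = \bar a_{\nu\mu}$, $G_{\mu\nu} = -\sigma_1\bar G_{\nu\mu}$.

Proof. The expansion for $K$ is a consequence of well known cancellations. (1)–(4) follow from (8.7) and elementary calculus. (5) follows from the fact that $K(U)$ is real valued for $\bar U = \sigma_1U$.

We set $\delta_j$ for $j\in\{1,\dots,m\}$ the multi index $\delta_j = (\delta_{1j},\dots,\delta_{mj})$. Let $\lambda_j^0 = \lambda_j(\omega_0)$ and $\lambda^0 = (\lambda_1^0,\dots,\lambda_m^0)$.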
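To make the multi-index notation concrete, here is a small worked illustration. It assumes the standard conventions $z^\mu = z_1^{\mu_1}\cdots z_m^{\mu_m}$ and $|\mu| = \sum_j\mu_j$, which this section uses but does not restate; the two sample pairs $(\mu,\nu)$ are chosen only as examples.

```latex
\begin{align*}
z^{\mu}\bar z^{\nu} &= \prod_{j=1}^{m} z_j^{\mu_j}\,\bar z_j^{\nu_j}, \qquad
\lambda^0\cdot(\mu-\nu) = \sum_{j=1}^{m}\lambda^0_j\,(\mu_j-\nu_j),\\
\mu=\nu=\delta_1 &:\quad z^{\mu}\bar z^{\nu} = |z_1|^2,\qquad \lambda^0\cdot(\mu-\nu)=0,\\
\mu=2\delta_1,\ \nu=0 &:\quad z^{\mu}\bar z^{\nu} = z_1^2,\qquad\ \ \lambda^0\cdot(\mu-\nu)=2\lambda^0_1 .
\end{align*}
```

Both monomials have $|\mu+\nu| = 2$, but only the first satisfies the condition $\lambda^0\cdot(\mu-\nu)=0$ (provided $\lambda_1^0\neq 0$).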


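Lemma 8.3. Let $H = K\circ F_1$. Then, at $e^{i\sigma_3\vartheta}\Phi_{\omega_0}$ we have the expansion

$$H = d(\omega_0) - \omega_0\|u_0\|_2^2 + \psi(\|f\|_2^2) + H_2^{(1)} + \mathcal R^{(1)} \tag{8.8}$$

for $\omega = \omega_0$, where the following holds:

(1) We have for $r=1$,

$$H_2^{(r)} = \sum_{\substack{|\mu+\nu|=2\\ \lambda^0\cdot(\mu-\nu)=0}} a^{(r)}_{\mu\nu}(\|f\|_2^2)\,z^\mu\bar z^\nu + \frac12\langle\sigma_3H_{\omega_0}f,\sigma_1f\rangle. \tag{8.9}$$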

(2) We have R(1) = R(1) + R(2) , with

(1) aμν ( f 22 )z μ z ν + z μ z ν σ1 σ3 G μν ( f 22 ), f , R(1) = |μ+ν|=2

|μ+ν|=1

λ0 ·(μ−ν) =0

R(2) =

zμ zν

R3

|μ+ν|=3

+

+

5

(1) Rj

R3

+

j=2

with

zμ zν

|μ+ν|=2

(1) Rj

aμν (x, z, f, f (x), f 22 )d x

R3

(1) (z, f, f 22 ) B(| f (x)|2 /2)d x + R 2

=

∗ σ1 σ3 G μν (x, z, f, f (x), f 22 ) f (x)d x

R3

F j (x, z, f, f (x), f 22 ) f ⊗ j (x)d x.

(8.10)

(3) We have F2 (x, 0, 0, 0, 0) = 0. (4) ψ(s) is smooth with ψ(0) = ψ (0) = 0. (5) At f 2 = 0 with r = 1, (r ) aμν (0) = 0 for |μ + ν| = 2 with (μ, ν) = (δ j , δ j ) for all j, (r )

aδ j δ j (0) = λ j (ω0 ),

where δ j = (δ1 j , . . . , δm j ),

(8.11)

G μν (0) = 0 for |μ + ν| = 1. (r ) These aμν () and G μν (x, ) are smooth in all variables with G μν (·, ) ∈ K ,S ∞ C (R, Hx (R3 , C2 )) for all (K , S). (6) We have for all indexes and for r = 1, (r ) ) = a (r aμν νμ , aμν = a νμ , G μν = −σ1 G νμ .

(8.12)

(7) Let t η = (ζ, ζ ) for ζ ∈ C. For all (K , S, K , S ) there is a neighborhood U −K ,−S −K ,−S , see (7.40), such that we have, for aμν (x, z, f, η, ) with of {(0, 0)} in P (z, f, ζ, ) ∈ U −K ,−S × C × R, l ∇z,z,ζ,ζ a K ,S (R3 ,C) ≤ Cl , f, μν H x

for all l.

(8.13)


(8) Possibly restricting U −K

,−S

, we have also, for G μν (x, z, f, g, ),

l ∇z,z,ζ,ζ G K ,S (R3 ,C2 ) ≤ Cl , f, μν H x

(9) Restricting U −K

,−S

for all l.

x

,−S

(8.14)

further, we have also, for F j (x, z, f, g, ),

l F K ,S (R3 ,B((C2 )⊗ j ,C)) ≤ Cl ∇z,z,ζ,ζ , f, j H

(10) Restricting U −K

(1)

for all l.

(z, f, ) ∈ C ∞ (U −K further, we have R 2

,−S

× R, R) with

(1)

(z, f, )| ≤ C(|z| + || + f −K ,−S ) f 2 −K ,−S . |R 2 H H Proof. By F1 (ω0 ) = ω0 , K (ω0 ) = 0 and F1 (U ) − U P K ,S R2L 2 we conclude H (ω0 ) = 0 and H (ω0 ) = K (ω0 ). In particular, this yields the formula for H2(1) for f 2 = 0. The other terms are obtained by substituting in (8.8) the formulas 2 , where (7.59). By σ3 f, σ1 f = 0 we have σ3 Hω0 +δω f, σ1 f = σ3 Hω0 f, σ1 f + F 2 F2 can be absorbed in j = 2 in (8.10). ψ( f 2 ) arises from d(ω ◦ F1 ) − ω ◦ F1 u 0 22 . Other terms coming from the latter end up in (8.10): in particular there are no monomials j f 2 z μ z ν G, f i with |μ + ν| + i = 1, because of (7.60) (applied for ω = ω0 ). 9. Canonical Transformations Our goal in this section is to prove the following result. 1,0 , Theorem 9.1. For any integer r ≥ 2 there are a neighborhood U 1,0 of {(0, 0)} in P 1,0 1,0 see (7.40), and a smooth canonical transformation Tr : U → P s.t. H (r ) := H ◦ Tr = d(ω0 ) − ω0 u 0 22 + ψ( f 22 ) + H2(r ) + Z (r ) + R(r ) ,

(9.1)

where: (r )

(2)

(r )

(i) H2 = H2 for r ≥ 2, is of the form (8.9) where aμν ( f 2 ) satisfy (8.11)–(8.12); (ii) Z (r ) is in normal form, in the sense of Definition 9.3 below, with monomials of degree ≤ r whose coefficients satisfy (8.12); (iii) the transformation Tr is of the form (see below) (9.9)– (9.10) and satisfies (9.12)– (9.13) for M0 = 1; (r ) (iv) we have R(r ) = 6d=0 Rd with the following properties: (iv.0) for all (K , S, K , S ) there is a neighborhood U −K ,−S of {(0, 0)} in ,−S −K P such that

) μ ν (r ) = z z aμν (x, z, f, f (x), f 22 )d x R(r 0 |μ+ν|=r +1

R3

(r )

and for aμν (z, f, η, ) with t η = (ζ, ζ ), ζ ∈ C we have for (z, f ) ∈ U −K ,−S and || ≤ 1, l ∇z,z,ζ,ζ a (r ) (·, z, f, η, ) H K ,S (R3 ,C) ≤ Cl for all l; , f, μν

(9.2)


(iv.1) possibly taking U −K ,−S smaller, we have ∗

(r ) ) 2 R1 = σ1 σ3 G (r zμ zν (x, z, f, f (x), f ) f (x)d x μν 2 R3

|μ+ν|=r

l G (r ) (·, z, f, η, ) H K ,S (R3 ,C2 ) ≤ Cl for all l; (9.3) with ∇z,z,ζ,ζ , f, μν

(iv.2–5) possibly taking U −K ,−S smaller, we have for 2 ≤ d ≤ 5, (r ) (r ) (r ) , Rd = Fd (x, z, f, f (x), f 22 ) f ⊗d (x)d x + R d R3

with for any l, (r )

l F (·, z, f, η, ) H K ,S (R3 ,B((C2 )⊗d ,C) ≤ Cl , ∇z,z,ζ,ζ , f, d

(9.4)

(r ) (r ) (z, f, f 2 ) s.t. with F2 (x, 0, 0, 0, 0) = 0 and with R 2 d

(r ) (z, f, ) ∈ C ∞ (U −K ,−S × R, R), R d (r )

(z, f, )| ≤ C f d −K ,−S , |R d H (r ) (z, |R 2

f, )| ≤

(9.5)

C(|z| + || + f H −K ,−S ) f 2H −K ,−S ;

(r ) (iv.6) R6 = R3 B(| f (x)|2 /2)d x.

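We develop the proof in the following subsections. The basic ideas are classical. However we need to develop a number of tools, along the lines of [BC]. The situation here is more complicated than in [BC] because of the dependence of the coefficients on $\|f\|_2$.

9.1. Lie transform. We consider functions

$$\chi = \sum_{|\mu+\nu|=M_0+1} b_{\mu\nu}(\|f\|_2^2)\,z^\mu\bar z^\nu + \sum_{|\mu+\nu|=M_0} z^\mu\bar z^\nu\langle\sigma_1\sigma_3B_{\mu\nu}(\|f\|_2^2),f\rangle, \tag{9.6}$$

where $b_{\mu\nu}(\varrho)\in C^\infty(\mathbb R_\varrho,\mathbb C)$ and $B_{\mu\nu}(x,\varrho)\in C^\infty(\mathbb R_\varrho,P_c(\omega_0)H_x^{k,s}(\mathbb R^3,\mathbb C^2))$ for all $k$ and $s$. Assume

$$\bar b_{\mu\nu} = b_{\nu\mu}\quad\text{and}\quad \sigma_1B_{\mu\nu} = -\bar B_{\nu\mu}\quad\text{for all indexes.} \tag{9.7}$$

For $K>0$ and $S>0$ fixed and large we set

$$\|\chi\| = \|\chi(\|f\|_2^2)\| = \sum|b_{\mu\nu}(\|f\|_2^2)| + \sum\|B_{\mu\nu}(\|f\|_2^2)\|_{H^{K,S}}. \tag{9.8}$$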

Denote by φ t the flow of the Hamiltonian vector field X χ ( from now on with respect to 0 and only in (z, f )). The Lie transform φ = φ t t=1 is defined in a sufficiently small neighborhood of the origin and is a canonical transformation. Lemma 9.2. Consider the χ in (9.6) and its Lie transform φ. Set (z , f ) = φ(z, f ). Then there are G(z, f, ), (z, f, ), 0 (z, f, ρ) and 1 (z, f, ρ) with the following properties:



(1) ∈ C ∞ (U −K ,−S , Cm ), 0 , 1 ∈ C ∞ (U −K ,−S , R), with U −K ,−S ⊂ Cm × Hc−K ,−S (ω0 ) × R an appropriately small neighborhood of the origin. (2) G ∈ C ∞ (U −K ,−S , HcK ,S (ω0 )) for any K , S. (3) The transformation φ is of the following form: z = z + (z, f, f 22 ),

f =e

i0 (z, f, f 22 )Pc (ω0 )σ3

(9.9) f + G(z,

f, f 22 ).

(9.10)

(4) We have f 22 = f 22 + 1 (z, f, f 22 ), (9.11) 1 (z, f, f 22 ) ≤ C|z| M0 −1 (|z| M0 +2 + |z|2 f H −K ,−S + f 3H −K ,−S ). (9.12) (5) There are constants c K ,S and c K ,S,K ,S such that (z, f, f 22 )| ≤ c K ,S (χ + (9.12))|z| M0 −1 (|z| + f H −K ,−S ), G(z, f, f 22 ) H K ,S ≤ c K ,S,K ,S (χ + (9.12))|z| M0 ,

(9.13)

|0 (z, f, f 22 )| ≤ c K ,S |z| M0 −1 (|z| + f H −K ,−S )2 . (6) We have ei0 Pc (ω0 )σ3 = ei0 σ3 + T (0 ),

(9.14)

B(H −K ,−S , H K ,S )) for all (K , S, K , S ), with norm ≤ C(K , S, K , S )|r |. More specifically, the range of T (r )

C ∞ (R,

where T (r ) ∈ T (r ) B(H −K ,−S ,H K ,S ) is R(T (r )) ⊆ L 2d (H) + L 2d (H∗ ), L 2d defined two lines after (3.11).

Proof. Claim (6) can be proved independently of the properties of 0 . Recall that Pc (ω) = 1 − Pd (ω), see below (3.11), with Pd (ω) smoothing and of finite rank. Exploiting σ3 Pd (ω) = Pd∗ (ω)σ3 it is elementary to prove ei0 Pc (ω0 )σ3 = ei0 σ3 + T (0 ) with T (0 ) = −i sin (0 ) Pd (ω0 )σ3 n ∞ 2 n

(i0 )n 2 K j (Pc (ω0 )σ3 )ε(n) , + j n! n=2

(9.15)

j=1

n

with K = Pd (ω0 )Pd∗ (ω0 ) − Pd (ω0 ) − Pd∗ (ω0 ) and ε(n) = 1−(−1) . T (0 ) has the 2 properties of Claim (6). and B derivatives In the sequel we prove Claims (1)–(5). Set = f 22 . For bμν μν with respect to , summing on repeated indexes, consider γ (z, f, ) := 2(bμν ()z μ z ν + σ1 σ3 Bμν (), f z μ z ν ).

For σ1 f = f , then γ (z, f, ) ∈ R by (9.7). We set up the following system:

zμ zν zμ zν i˙z j = νj bμν () + νj σ1 σ3 Bμν (), f , zj zj |μ+ν|=M0 +1 |μ+ν|=M0

i f˙ = z μ z ν Bμν () + γ (z, f, )Pc (ω0 )σ3 f, |μ+ν|=M0

˙ = −2i

|μ+ν|=M0

z μ z ν Bμν () + γ (z, f, )(Pc (ω0 ) − Pc∗ (ω0 ))σ3 f, σ1 f ,

(9.16)

(9.17)


where in the last equation we exploited $\langle\sigma_3f,\sigma_1f\rangle = 0$. By (9.7) the flow leaves the set with $\sigma_1f = \bar f$ and $\varrho\in\mathbb R$ invariant. In particular, the set where $\varrho = \|f\|_2^2$ is invariant under the flow of (9.17). In a neighborhood of 0 the lifespan of the solutions is larger than 1. Equation (9.9) can always be written. For $\gamma$ defined in (9.16), we have

f (t) = e−i 0 γ ds Pc (ω0 )σ3 f − i z μ z ν ei s γ ds Pc (ω0 )σ3 Bμν ds. |μ+ν|=M0

0

This yields (9.10). We can always write = + 1 (z, f, ).

(9.18)

This yields (9.11). Claims (1)–(2) follow from the regularity of the flow of (9.17) on the initial data. By (9.17) we get |(t) − | ≤ C sup |z(t )| M0 −1 (|z(t )| M0 +2 0≤t ≤t 2

+|z(t )| f (t ) H −K ,−S + f (t )3H −K ,−S ).

(9.19)

Similarly we have |z(t) − z| ≤ C sup |z(t )| M0 −1 χ ((t ))(|z(t )| + f (t ) H −K ,−S ), (9.20) 0≤t ≤t

t

z μ z ν ei

t s

γ ds Pc (ω0 )σ3

Bμν ds H K ,S ≤ C sup |z(t )| M0 χ ((t )). (9.21) 0≤t ≤t

0

Then |z(t)| ≈ |z| + f H −K ,−S with in particular |z(t)| ≈ |z| when M0 > 1. By Claim (6) and by the fact that the exponent 0 (z, f, ) in (9.10) is a uniformly bounded function, we get f (t) H −K ,−S ≈ |z| + f H −K ,−S . Then χ ((t )) − χ () ≤ (9.12). (9.22) This implies that the right hand sides of (9.19)–(9.21) are bounded by the bounds of 1 , and G in the statement. This yields the desired bounds on 1 , and G. The bound on 0 follows from t | γ (t )dt | ≤ C sup |z(t )| M0 (|z(t )| + f (t ) H −K ,−S ) 0

≤

0≤t ≤t C1 |z| M0 −1 (|z| + f H −K ,−S )2 .

(9.23)

9.2. Normal form. Recall the notation λ0j = λ j (ω0 ) and δ j = (δ1 j , . . . , δm j ), see before Lemma 8.3. Let H = Hω0 Pc (Hω0 ). For r ≥ 1, using the coefficients in (8.9) of the (r ) H2 in Theorem 9.1, let (r )

(r )

(r )

(r )

) λ j = λ j ( f 22 ) = λ j (ω0 ) + aδ j δ j ( f 22 ), λ(r ) = (λ1 , · · · , λ(r m ).

(9.24)



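Definition 9.3. A function $Z(z,f)$ is in normal form if it is of the form

$$Z = Z_0 + Z_1, \tag{9.25}$$

where we have finite sums of the following types:

$$Z_1 = \sum_{|\lambda^0\cdot(\nu-\mu)|>\omega_0} z^\mu\bar z^\nu\langle\sigma_1\sigma_3G_{\mu\nu}(\|f\|_2^2),f\rangle \tag{9.26}$$

with $G_{\mu\nu}(x,\varrho)\in C^\infty(\mathbb R_\varrho,H_x^{K,S})$ for all $K,S$;

$$Z_0 = \sum_{\lambda^0\cdot(\mu-\nu)=0} a_{\mu,\nu}(\|f\|_2^2)\,z^\mu\bar z^\nu \tag{9.27}$$

and $a_{\mu,\nu}(\varrho)\in C^\infty(\mathbb R_\varrho,\mathbb C)$. We will always assume the symmetries (8.12).

For an $H_2^{(r)}$ as in (8.9) let $H_2^{(r)} = D_2^{(r)} + (H_2^{(r)} - D_2^{(r)})$, where

$$D_2^{(r)} = \sum_{j=1}^m\lambda_j^{(r)}(\|f\|_2^2)\,|z_j|^2 + \frac12\langle\sigma_3H_{\omega_0}f,\sigma_1f\rangle. \tag{9.28}$$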

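In the following formulas we set $\lambda_j = \lambda_j^{(r)}$, $\lambda = \lambda^{(r)}$ and $D_2 = D_2^{(r)}$. We recall ($\lambda_j'(\varrho)$ is the derivative in $\varrho$) that by (5.3), summing on repeated indexes,

$$\begin{aligned}
\{D_2,F\} := dD_2(X_F) &= \partial_jD_2\,(X_F)_j + \partial_{\bar j}D_2\,(X_F)_{\bar j} + \langle\nabla_fD_2,(X_F)_f\rangle\\
&= -i\partial_jD_2\,\partial_{\bar j}F + i\partial_{\bar j}D_2\,\partial_jF - i\langle\nabla_fD_2,\sigma_3\sigma_1\nabla_fF\rangle\\
&= i\lambda_jz_j\partial_jF - i\lambda_j\bar z_j\partial_{\bar j}F + i\langle\mathcal Hf,\nabla_fF\rangle + 2i\lambda_j'(\|f\|_2^2)|z_j|^2\langle f,\sigma_3\nabla_fF\rangle.
\end{aligned}\tag{9.29}$$

In particular, we have, for $G = G(x)$ (we use $\sigma_1i\sigma_2 = \sigma_3$),

$$\begin{aligned}
\{D_2,z^\mu\bar z^\nu\} &= i\lambda\cdot(\mu-\nu)\,z^\mu\bar z^\nu,\\
\{D_2,\langle\sigma_1\sigma_3G,f\rangle\} &= -i\langle f,\sigma_1\sigma_3\mathcal HG\rangle - 2i\sum_{j=1}^m\lambda_j'|z_j|^2\langle\sigma_1f,G\rangle,\\
\{D_2,\tfrac12\|f\|_2^2\} &= \{D_2,\tfrac12\langle f,\sigma_1f\rangle\} = i\langle\mathcal Hf,\sigma_1f\rangle = -i\langle\beta'(\phi^2)\phi^2\sigma_3f,f\rangle.
\end{aligned}\tag{9.30}$$

In the sequel we will assume (and prove) that $\|f\|_2$ is small. We will consider only $|\mu+\nu|\le 2N+3$. Then, $\lambda^0\cdot(\mu-\nu)\neq 0$ implies $|\lambda^0\cdot(\mu-\nu)|\ge c>0$ for some fixed $c$, and so we can assume also $|\lambda\cdot(\mu-\nu)|\ge c/2$. Similarly $|\lambda^0\cdot(\mu-\nu)|<\omega_0$ (resp. $|\lambda^0\cdot(\mu-\nu)|>\omega_0$) will be assumed equivalent to $|\lambda\cdot(\mu-\nu)|<\omega_0$ (resp. $|\lambda\cdot(\mu-\nu)|>\omega_0$).

Lemma 9.4. Consider

$$K = \sum_{|\mu+\nu|=M_0+1}k_{\mu\nu}(\|f\|_2^2)\,z^\mu\bar z^\nu + \sum_{|\mu+\nu|=M_0}z^\mu\bar z^\nu\langle\sigma_1\sigma_3K_{\mu\nu}(\|f\|_2^2),f\rangle. \tag{9.31}$$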


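Suppose that all the terms in (9.31) are not in normal form and that the symmetries (8.12) hold. Consider

$$\chi = \sum_{|\mu+\nu|=M_0+1}\frac{k_{\mu\nu}(\|f\|_2^2)}{i\lambda\cdot(\nu-\mu)}\,z^\mu\bar z^\nu - \sum_{|\mu+\nu|=M_0}z^\mu\bar z^\nu\Big\langle\sigma_1\sigma_3\frac{1}{i(\lambda\cdot(\mu-\nu)-\mathcal H)}K_{\mu\nu}(\|f\|_2^2),f\Big\rangle. \tag{9.32}$$

Then we have

$$\{\chi,D_2\} = K + L \tag{9.33}$$

with, summing on repeated indexes,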

with, summing on repeated indexes, kμν

z μ z ν β (φ 2 )φ 2 σ3 f, f 1 μ ν 2 K μν + 2λ j z z |z j | σ1 f, (μ − ν) · λ − H 1 β (φ 2 )φ 2 σ3 f, f − 2λ · (μ − ν)z μ z ν |z j |2 σ1 f, K 2 μν − ν) · λ − H) ((μ 1 μ ν β (φ 2 )φ 2 σ3 f, f . + 2z z f, σ3 σ1 K (9.34) (μ − ν) · λ − H μν

L=2

(μ − ν) · λ

The coefficients in (9.32) satisfy (8.12). Proof. The proof follows by the tables (9.30), by the product rule for the derivative and by the symmetry properties of H. We split the proof of Theorem 9.1 in two stages. We first prove step r = 2 of Theorem 9.1. We subsequently prove the case r > 2. 9.3. Proof of Theorem 9.1: the step r = 2. At this step, our goal is to obtain a hamiltonian similar to H , but with R(1) = 0. We will need to solve a nonlinear homological equation. We consider a χ like in (9.6) with M0 = 1 satisfying (9.7). We write H ◦ φ = d(ω0 ) − ω0 u 0 22 + ψ( f 22 ) + (H2(1) + R(1) + R(2) ) ◦ φ,

(9.35)

for φ the Lie transform of χ . We write (9.9)–(9.10) as follows, where we sum on repeated indexes and ∇ f does not act on f 22 : z j − z j = ∂k j (0, 0, f 22 )z k + ∂k j (0, 0, f 22 )z k + + ∇ f j (0, 0, f 22 ), f + r j ,

f −e

i0 (z, f, f 22 )Pc (ω0 )σ3

f =

∂k G(0, 0, f 22 )z k

+ ∂k G(0, 0, f 22 )z k

(9.36) +rf.

By (9.13) the terms in rhs (9.36) satisfy (see (9.8) for definition of χ ) |∂k j | + · · · ∂k G H K ,S ≤ Cχ , |r j | + r f H K ,S ≤ C(|z|2 + f 2H −K ,−S ).

(9.37)



We write the f ⊗2 in (8.10) schematically as

f 2 (x) = Aμν (x, f 22 )z μ z ν + z μ z ν Aμν ( f 22 )(x) f (x) |μ+ν|=2

+ (e

i0 σ3

|μ+ν|=1

f + T (0 ) f ) (x) + ϕ(x)r j f (x), 2

+ r f (x) f (x) + ϕ(x)r 2j + r 2f (x),

(9.38)

where ϕ(x) represents an exponentially decreasing smooth function. Equation (9.37) implies

Aμν (x, f 22 ) H K ,S + Aμν ( f 22 ) H K ,S ≤ Cχ . (9.39) μ,ν

μ,ν

We consider H2 ◦ φ + R(1) ◦ φ + (1)

R3

F2 (x, 0, 0, 0, f 22 ) f ⊗2 (x)d x

(1) (0, 0, f 22 ), f ⊗2 . +∇ 2f R 2

(9.40)

We will assume for the moment Lemma 9.5: Lemma 9.5. The following difference is formed by terms which satisfy the properties stated for R(2) in Theorem 9.1: (1) R(1) + R(2) ) ◦ φ − ψ( f 22 ) − (9.40). ψ( f 22 ) + (H2 +

(9.41)

We postpone the proof of Lemma 9.5 and focus on (9.40) and on the choice of χ . (2)

Lemma 9.6. It is possible to choose χ such that there exists H2 as in (i) Theorem 9.1 (2) such that the difference (9.40)−H2 is formed by terms which satisfy the properties (2) stated for R in Theorem 9.1. Proof. We have by (9.6) and using Definition 5.2, 1 {H2(1) , χ } ◦ φt dt H2(1) ◦ φ = H2(1) + 0

(1)

= H2 +

bμν ( f 22 )

|μ+ν|=2

+

σ3 σ1 Bμν ( f 22 ),

|μ+ν|=1

1 0

1 0

(1)

{H2 , z μ z ν } ◦ φt dt (1) (9.42) {H2 , z μ z ν f } ◦ φt dt + R

≤ C(|z| + f −K ,−S )3 , (9.30), Lemma 9.2. Then, by (9.30) for λ = λ(1) , with | R| H defined in (9.24), and substituting H2(1) = D2(1) + (H2(1) − D2(1) ) in the last two lines of (9.42), we get

H2(1) ◦ φ = H2(1) + i bμν λ · (μ − ν)z μ z ν −i

|μ+ν|=2

|μ+ν|=1

z μ z ν f, σ1 σ3 (H − λ · (μ − ν))Bμν + R,

(1)

with D2 defined in (9.28) and with, by (9.30), (8.11) and Lemma 9.2, ≤ C(|z| + f −K ,−S )3 + Cχ (χ + f 22 )(|z| + f −K ,−S )2 . (9.43) | R| H H Similarly

R(1) + R(1) ◦ φ =

lμν z μ z ν +

|μ+ν|=2

z μ z ν f, σ1 σ3 L μν + R,

(9.44)

|μ+ν|=1

L μν H K ,S ≤ Cχ R(1) with | lμν | + |R| ≤ rhs (9.43) + rhs (9.45). In (9.40) we substitute f ⊗2 using (9.38). Then F2 (x, 0, 0, 0, f 22 ) f ⊗2 (x)d x = χ + R R3

(9.45) (9.46)

(9.47)

χ ≤ C f 22 χ by claims (4) with: χ a polynomial like (9.6) with M0 = 1 such that and (9) in Lemma 8.3 and by (9.38); χ satisfies (9.7) by the fact that the rhs (9.47) is real valued; R formed by terms with the properties stated for R(2) in Theorem 9.1, see second line of (9.38). By an argument similar to the one for (9.47), we have (1) (0, 0, f 22 ), f ⊗2 = χ ∇ 2f R + R, 2

(9.48)

with χ and R different from the ones in (9.47) but with the same properties. Then we have

(1) (2) + χ (9.40) = H2 + R + i bμν λ · (μ − ν)z μ z ν −i

|μ+ν|=2

μ ν

z z f, σ1 σ3 (H − λ · (μ − ν))Bμν + R,

(9.49)

|μ+ν|=1

where R satisfies the properties stated for R(2) and χ is a polynomial like (9.6)–(9.7) will be defined in two lines) with M0 = 1 and ( Z and K (2) ). ≤ Cχ ( f 22 + χ + R χ = Z + K (9.50) , where in Here χ = Z+K Z= bμν ( f 22 )z μ z ν we sum over |μ + ν| = 2, λ0 · μ = 0 λ · ν, i.e. in Z we collect the null form terms of χ . We set H2(2) = H2(1) + Z.

(9.51)

Up to now χ is undetermined. We choose χ with coefficients bμν and Bμν such that bμν = 0 for λ0 · μ = λ0 · ν and such that the following system is satisfied: +i R(1) + K −i

|μ+ν|=1

bμν λ · (μ − ν)z μ z ν

|μ+ν|=2

z μ z ν f, σ1 σ3 (H − λ · (μ − ν))Bμν = 0.

(9.52)



In coordinates, (9.52) is (1) aμν + kμν + ibμν λ · (μ − ν) = 0, |μ + ν| = 2, λ0 · μ = λ0 · ν, μν − i(H − λ · (μ − ν))Bμν = 0, |μ + ν| = 1, G μν + K

(9.53)

(1) where: aμν and G μν are coefficients of R(1) , they are smooth functions of = f 22 , μν ∈ H K ,S are coefficients of K , and and are equal to 0 for = 0; kμν ∈ C and K 2 are smooth functions of = f 2 and of the coefficients of χ , where bμν ∈ C and Bμν ∈ H K ,S . By (9.50),

μν H K ,S ≤ Cχ ( f 22 + χ + | kμν | + K R(1) ).

(9.54)

Then by the implicit function theorem we can solve the nonlinear system (9.53) with unknown χ obtaining (we consider bμν only for λ0 · μ = λ0 · ν) (1)

aμν | + Bμν + iRH (λ · (μ − ν))G μν H K ,S iλ · (μ − ν) ≤ C R(1) ( R(1) + f 2 ).

|bμν +

2

(9.55)

Notice that with the above choice of χ and with (9.51), (9.49) yields (2)

(9.40) = H2 + R,

(9.56)

where R has the properties stated for R(2) in Theorem 9.1. Hence Lemma 9.5 is proved. Proof of Lemma 9.5. By (9.11)–(9.12), and with the big O smooth in z ∈ Cm , f ∈ Hc−K ,−S , (9.57) ψ( f 22 ) = ψ( f 22 ) + O |z|2 f H −K ,−S + f 3H −K ,−S . The error term in (9.57) has the properties stated for R(2) in Theorem 9.1. We consider the terms R(2) ◦ φ. Terms, for |μ + ν| = 3, like μ ν a(x, z , f , f (x), f 22 )d x, (9.58) z z R3

by (9.9) and (9.13) can be written as (z μ z ν + O((|z| + f H −K ,−S )3 ))

R3

a(x, z , f , f (x), f 22 )d x.

(9.59)

In the notation of Lemma 9.2 we have

a(x, z , f , f (x), f 22 ) = a x, z + , ei0 Pc (ω0 )σ3 f + G, ei0 σ3 f (x) + [T (0 ) f ](x) + G(z, f, f 22 )(x), f 22 + 1 = a(x, z, f, f (x), f 22 ) + O(|z| + f H −K ,−S ).

(9.60)


The big O’s in (9.59)–(9.60) are smooth in z ∈ Cm , f ∈ Hc−K ,−S . Then (9.58) has the properties stated for R(2) in Theorem 9.1. Similar formulas can be used for ∗

σ1 σ3 G μν (x, z , f , f (x), f 22 ) f (x)d x z μ z ν R3

|μ+ν|=2

+

5

j=3

R3

F j (x, z , f , f (x), f 22 ) f ⊗ j (x)d x +

R3

B(| f (x)|2 /2)d x.

(9.61)

We treat with some detail these terms in the step r > 2, Subsect. 9.4. Next we consider the term with F2 f ⊗2 (x)d x. First of all, we can apply to F2 an analogue of (9.60) to obtain for d = 2, Fd (x, z , f , f (x), f 22 ) = Fd (x, z + , ei0 Pc (ω0 )σ3 f + G, ei0 σ3 f (x) + [T (0 ) f ](x) + G(z, f, f 22 )(x), f 22 + 1 ) = Fd (x, 0, 0,

f (x), f 22 ) +

(9.62)

O(|z| + f H −K ,−S ).

Then, modulo terms with the properties stated for R(2) in Theorem 9.1, we get F2 (x, 0, 0, f (x), f 22 ) f ⊗2 (x)d x = F2 (x, 0, 0, 0, f 22 ) f ⊗2 (x)d x R3 R3 + G 2 (x, f (x), f 22 ) f (x) ⊗ f ⊗2 (x)d x, (9.63) R3

where first term in rhs has been treated in Lemma 9.6 and second term has the properties (2) stated for R3 in Theorem 9.1. By a similar argument, 2 ⊗2 (1) (1) (z , f , f 2 ) − ∇ 2 R R f 2 (0, 0, f 2 ), f 2

has the properties stated for R(2) in Theorem 9.1.

(9.64)

(2)

We denote: χ2 = χ , T2 the Lie transformation of χ2 , Z (2) = 0. H2 has been defined in (9.51). We denote R(2) = (9.41) + (9.40) − H2(2) . This R(2) satisfies the conditions in Theorem 9.1. This ends the proof of case r = 2 in Theorem 9.1. 9.4. Proof of Theorem 9.1: the step r > 2. Case r = 2 has been treated in Subsect. 9.3. (2) We have defined H2 in (9.51). We proceed by induction to complete the proof of The(r ) (2) orem 9.1. From the argument below one can see that H2 = H2 for all r ≥ 2. For r ≥ 2, write Taylor expansions,

(r ) (r ) μ ν (r ) z z aμν (x, 0, 0, 0, f 22 )d x, (9.65) R0 − R02 = |μ+ν|=r +1

(r )

(r )

R1 − R12 =

|μ+ν|=r

zμ zν

R3

R3

∗ ) 2 σ1 σ3 G (r f (x)d x. μν (x, 0, 0, 0, f 2 )

(9.66)


We have (r )

(r )

R02 + R12 =

zμ zν

R3

|μ+ν|=r +2

+

zμ zν

|μ+ν|=r +1

+

zμ zν

|μ+ν|=r

(r ) aμν (x, z, f, 0, f 22 )d x

R3

R3

∗ ) 2 (r σ1 σ3 G (x, z, f, f (x), f ) f (x)d x μν 2

(r ) (x, z, f, f (x), f 22 ) · ( f (x))⊗2 d x, (9.67) F 2

(r ) ) (r (r ) (9.4). Since H (r ) = H ◦ Tr is real valued with aμν satisfying (9.2), G μν (9.3) and F 2 (because H is real valued), then both sides of Eqs. (9.65)–(9.67) are real valued. In (r ) (r ) particular, aμν and G μν satisfy (8.12). Set

r +1 := rhs (9.65) + rhs (9.66). K

(9.68)

r +1 in null form. The r +1 = K r +1 + Z r +1 collecting inside Z r +1 all the terms of K Split K coefficients of K r +1 and of Z r +1 satisfy (8.12), by the argument just before (9.68). We consider a (momentarily unknown) polynomial χ like (9.6)–(9.7), M0 = r . Denote by φ its Lie transformation. Let (z , f ) = φ(z, f ). For d = 2, in the notation of Lemma 9.2 we have (r )

(r )

(r )

)(z , f ) = F (z , f , f (·), f 22 ), (ei0 Pc (ω0 )σ3 f + G)⊗d . (Rd − R d d

(9.69)

Then rhs (9.69) = d

d (r ) Fd (z , f , f (·), f 22 ), G ⊗(d− j) ⊗ [ei0 Pc (ω0 )σ3 f ]⊗ j = j j=0

j

d

d j (r ) Fd (· · · ), G ⊗(d− j) ⊗ [T (0 ) f ]⊗( j−) ⊗ [ei0 σ3 f ]⊗ . (9.70) = j =0

j=0

In the notation of Lemma 9.2 we have for d = 2, Fd(r ) (z , f , f (x), f 22 )(x) (r ) = Fd z + , ei0 Pc (ω0 )σ3 f +G, ei0 σ3 f (x)+[T (0 ) f ](x), f 22 +1 (x). (9.71) Then F2(r ) (z , f , f (x), f 22 )(x) (r )

= F2 (0, 0, f (x), f 22 )(x) + O(|z| + f H −K ,−S ) (r )

= F2 (0, 0, 0, f 22 )(x) + G(0, 0, f (x), f 22 )(x) f (x) + O(|z| + f H −K ,−S ), (9.72)

where the big O are smooth in z ∈ Cm and f ∈ H −K ,−S with values in H K ,S (R3 , B((C2 )⊗2 , C), and where G has values in H K ,S (R3 , B((C2 )⊗3 , C) and satisfies estimates (9.4). So the last line of (9.72) when plugged in (9.70) for d = 2 yields


(r +1) terms with the properties of 3d=0 Rd . We focus now on the first term in the rhs of (9.72). Schematically, in analogy to (9.38) we write

f 2 (x) = z μ z ν Aμν ( f 22 )(x) f (x) |μ+ν|=r

+

Aμν (x, f 22 )z μ z ν + (ei0 σ3 f + T (0 ) f )2 (x)

|μ+ν|=2r

+ ϕ(x)r j f (x) + r f (x) f (x) + ϕ(x)r 2j + r 2f (x),

(9.73)

where we have (9.39) and |r j | + r f H K ,S ≤ C(|z| + f H −K ,−S )r +1 . Then (r ) F2 (x, 0, 0, 0, f 22 ) f ⊗2 (x)d x = χ 1 + R1 R3

(9.74)

with: R1 formed by terms obtained by the last two lines of (9.73) has the properties stated for R(r +1) in Theorem 9.1; χ 1 a polynomial like (9.6) with M0 = r arising from the first line of rhs of (9.73) is such that χ1 ≤ C f 22 χ by the inductive hypothesis (r ) 1 satisfies (9.7) because F2 (x, 0, 0, 0, 0) = 0 in (iv.2-5) Theorem 9.1 and by (9.39); χ each side in (9.74) is real valued. We have 2 ⊗2 (1) (r ) (z , f , f 22 ) = ∇ 2 R R + f 2 (0, 0, f 2 ), f 2 2 ⊗2 (r ) (z , f , f 22 ) − ∇ 2 R (r ) + (R ), (9.75) f 2 (0, 0, f 2 ), f 2

where the second line on rhs of (9.75) yields terms which have the properties of elements of R(r +1) . We have (r )

(0, 0, f 22 ), f ⊗2 = χ 2 + R2 , ∇ 2f R 2

(9.76) (r )

1 and R1 in (9.76). Split H2 where χ 2 and R2 have the same properties of χ (r ) (r ) (r ) (H2 − D2 ) for D2 in (9.28). Then

3 + R3 , χ , H2(r ) − D2(r ) = χ

(r )

= D2 +

(9.77)

where χ 3 and R3 have the same properties of χ 1 and R1 in (9.76). Set χ = 3j=1 χ j . Split now χ = Z + K collecting in Z the null form terms in χ . Then we choose the yet unknown χ such that its coefficients bμν and Bμν satisfy the system +i r +1 + K K −i

bμν λ · (μ − ν)z μ z ν

|μ+ν|=2

z μ z ν f, σ1 σ3 (H − λ · (μ − ν))Bμν = 0.

(9.78)

|μ+ν|=1

≡ 0 system (9.78) would be linear and admit exactly one solution. Notice that for K ≤ C f 2 χ . So by the implicit function theorem By χ ≤ C f 22 χ we get K 2 there exists exactly one solution of (9.78). This solution is close to the solution of system



≡ 0. Furthermore, this system has solution χr +1 = χ which satisfies (9.78) when K (8.12), or what is the same, (9.7). For L r +1 of type (9.34), χr +1 satisfies (r ) r +1 + K + L r +1 . χr +1 , H2 =K (9.79) Call φr +1 = φ the Lie transform of χr +1 . For Tr +1 = Tr ◦ φr +1 set H (r +1) := H (r ) ◦ φr +1 = H ◦ (Tr ◦ φr +1 ) = H ◦ Tr +1 . Since χr +1 satisfies (8.12),

H (r +1)

is well defined and real valued. Split

(r )

Z H (r ) ◦ φr +1 = H2 + Z (r ) + Z r +1 +

(9.81)

Z ◦ φr +1 − Z) + (Z ◦ φr +1 − Z ) + ( + ( K r +1 + K ) ◦ φr +1 − K r +1 − K + H2(r ) ◦ φr +1 − H2(r ) + H2(r ) , χr +1 (r )

(r )

) (r ) + (R(r 02 + R12 ) ◦ φr +1

+

5

) ◦ φr +1 − Z−K

(r ) ◦ φr +1 + R6

(9.82) (9.83) (9.84) (9.85)

(r ) (r ) ) ◦ φr +1 + R (r ) ◦ φr +1 (Rd − R d d

d=3 ) + (R(r 2

+ψ

(9.80)

◦ φr +1 .

(9.86) (9.87) (9.88)

(r +1) (r ) (r ) (2) = H2 (this proves H2 = H2 ) and Z (r +1) := Z (r ) + Z r +1 + Z . Its coefDefine H2 ficients satisfy (8.12) (because H (r +1) is real valued) and it is a normal form. We have already discussed that (9.87) has the properties stated for R(r +1) . By expansions (9.69)– (9.71) we get that the first summation in (9.86) has the properties stated for R(r +1) . By an (r ) (z , f ) have the properties stated for R(r +1) . We have, analogous argument, terms R d for T = T (0 ),

| f (x)|2 = | f (x)|2 + E(x) with E(x) := 2(T (0 ) f (x))∗ σ1 ei0 σ3 f (x) +|T (0 ) f (x)|2 + 2G ∗ (x)σ1 ei0 σ3 f (x) + 2G ∗ (x)σ1 T (0 ) f (x) + |G(x)|2 . Then ) R(r 6

◦ φr +1 =

R3

+

(9.89)

B(| f (x)| /2)d x = 2

R3

B(| f (x)|2 /2)d x

1 1 d x E(x) B (| f (x)|2 /2 + s E(x)/2)ds. 2 R3 0

(9.90)

The last line in (9.90) has the properties stated for R(r +1) − R6(r +1) by Lemma 9.2. (r ) ) (r (r ) (9.4), the terms By (9.67) and by the fact that aμν satisfies (9.2), G μν (9.3) and F 2 (r ) (r ) (r +1) R02 + R12 has the properties stated for 2d=0 Rd . The same conclusion holds for , (9.85). By Lemma 9.2 and by an analogue of (9.57), we have that ψ ◦ φr = ψ + ψ has the properties stated for 3d=1 R(r +1) by (9.12). We have where ψ d 1 {Z (r ) , χr +1 } ◦ φrt +1 dt. (9.91) Z (r ) ◦ φr +1 − Z (r ) = 0


We have

{χr +1 , Z (r ) } ≤ C(|z|r +2 + |z|r +1 f H −K ,−S ).

(9.92)

By (9.92) we conclude that (9.91) has the properties stated for R(r +1) . The same is true for the other terms in (9.82)–(9.83). We have, for H2 = H2(r ) ,

1 t2

H2 ◦ φr +1 − (H2 + {H2 , χr +1 }) = 0

2!

{{H2 , χr +1 } , χr +1 } ◦ φrt +1 dt

1 t2

=− 0

2!

+ L r +1 , χr +1 ◦ φrt +1 dt. K r +1 + K

(9.93) + L r +1 , χr +1 } ≤ rhs(9.92) implies that (9.93) has the properties stated Then {K r +1 + K

for R(r +1) .
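The integral identities of the type (9.42) and (9.91) used in this proof are instances of the fundamental theorem of calculus along the flow $\phi^t$ of $X_\chi$. The following is a brief sketch, assuming the convention $\{F,G\} = dF(X_G)$ of (9.29); here $F$ is a generic smooth function (a placeholder, not a specific term of the expansion):

```latex
\begin{align*}
\frac{d}{dt}\,\bigl(F\circ\phi^{t}\bigr) &= dF(X_\chi)\circ\phi^{t} = \{F,\chi\}\circ\phi^{t},\\
F\circ\phi^{1} - F &= \int_0^1 \{F,\chi\}\circ\phi^{t}\,dt .
\end{align*}
```

Applied with $F = Z^{(r)}$ and $\chi = \chi_{r+1}$ this gives (9.91).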

10. Dispersion

We apply Theorem 9.1 for $r = 2N+1$ (recall $N = N_1$, where $N_j\lambda_j<\omega_0<(N_j+1)\lambda_j$). In the rest of the paper we work with the hamiltonian $H^{(r)}$. We will drop the upper index. So we will set $H = H^{(r)}$, $H_2 = H_2^{(r)}$, $\lambda_j = \lambda_j^{(r)}$, $\lambda = \lambda^{(r)}$, $Z_a = Z_a^{(r)}$ for $a = 0,1$ and $\mathcal R = \mathcal R^{(r)}$. In particular we will denote by $G_{\mu\nu}$ the coefficients $G_{\mu\nu}^{(r)}$ of $Z_1^{(r)}$. We will show:

Theorem 10.1. There is a fixed $C>0$ such that for $\varepsilon_0>0$ sufficiently small and for $\epsilon\in(0,\varepsilon_0)$ we have

$$\|f\|_{L^r_t([0,\infty),W^{1,p}_x)}\le C\epsilon\quad\text{for all admissible pairs }(r,p), \tag{10.1}$$

$$\|z^\mu\|_{L^2_t([0,\infty))}\le C\epsilon\quad\text{for all multi indexes }\mu\text{ with }\lambda\cdot\mu>\omega_0, \tag{10.2}$$

$$\|z_j\|_{W^{1,\infty}_t([0,\infty))}\le C\epsilon\quad\text{for all }j\in\{1,\dots,m\}. \tag{10.3}$$

Estimate (10.3) is a consequence of the classical proof of orbital stability in Weinstein [W1]. Notice that (1.1) is time reversible, so in particular (10.1)–(10.3) are true over the whole real line. The proof, though, exploits that $t \ge 0$, specifically when for $\lambda \in \sigma_c(H)$ we choose $R_H^+(\lambda) = R_H(\lambda + i0)$ rather than $R_H^-(\lambda) = R_H(\lambda - i0)$ in formula (10.11). See the discussion on p. 18 of [SW3]. The proof of Theorem 10.1 involves a standard continuation argument. We assume

\[ \|f\|_{L^r_t([0,T], W^{1,p}_x)} \le C_1\epsilon \quad \text{for all admissible pairs } (r,p), \tag{10.4} \]
\[ \|z^\mu\|_{L^2_t([0,T])} \le C_2\epsilon \quad \text{for all multi indexes } \mu \text{ with } \lambda\cdot\mu > \omega_0 \tag{10.5} \]

for fixed sufficiently large constants $C_1$, $C_2$, and then we prove that for $\epsilon$ sufficiently small, (10.4) and (10.5) imply the same estimates but with $C_1$, $C_2$ replaced by $C_1/2$, $C_2/2$. Then (10.4) and (10.5) hold with $[0, T]$ replaced by $[0, \infty)$. The proof consists of three main steps.

(i) Estimate $f$ in terms of $z$.
(ii) Substitute the variable $f$ with a new "smaller" variable $g$ and find smoothing estimates for $g$.
(iii) Reduce the system for $z$ to a closed system involving only the $z$ variables, by insulating the part of $f$ which interacts with $z$, and by decoupling the rest (this remainder is $g$). Then clarify the nonlinear Fermi golden rule.

The first two steps are the same as in [CM]. The only novelty of the proof with respect to [CM] is step (iii), specifically the part on the Fermi golden rule. At issue is the nonnegativity of some crucial coefficients in the equations of $z$. This point is solved using the same ideas as in Lemma 5.2 of [BC]. The fact that they are not 0 is assumed by hypothesis (H11). The fact that, if not 0, they are positive is proved here. Step (i) is encapsulated by the following proposition:

Proposition 10.2. Assume (10.4) and (10.5). Then there exist constants $C = C(C_1, C_2)$, $K_1$, with $K_1$ independent of $C_1$, such that, if $C(C_1, C_2)\,\epsilon$ is sufficiently small, then we have

\[ \|f\|_{L^r_t([0,T], W^{1,p}_x)} \le K_1\epsilon \quad \text{for all admissible pairs } (r,p). \tag{10.6} \]

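Consider $Z_1$ of the form (9.26). Set:

\[ G^0_{\mu\nu} = G_{\mu\nu}(\|f\|_2^2) \ \text{ for } \|f\|_2^2 = 0; \qquad \lambda^0_j = \lambda_j(\omega_0). \tag{10.7} \]

Then we have (with finite sums and with the derivative in the variable $\|f\|_2^2$ performed w.r.t. the $\|f\|_2^2$ arguments explicitly emphasized in Theorem 9.1)

\[
\begin{aligned}
i\dot f - H f - 2(\partial_{\|f\|_2^2} H)\,P_c(\omega_0)\sigma_3 f
&= \sum_{|\lambda^0\cdot(\nu-\mu)|>\omega_0} z^\mu \bar z^\nu G^0_{\mu\nu} \\
&\quad + \sum_{|\lambda^0\cdot(\nu-\mu)|>\omega_0} z^\mu \bar z^\nu \,(G_{\mu\nu} - G^0_{\mu\nu}) + \sigma_3\sigma_1\nabla_f R - 2(\partial_{\|f\|_2^2} R)\,P_c(\omega_0)\sigma_3 f.
\end{aligned}
\tag{10.8}
\]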

The proof of Proposition 10.2 is standard and is an easier version of the arguments in §4 in [CM]. The dominating term in the rhs of (10.8) is the one on the first line, with contribution to f bounded by C(C2 ) by the endpoint Strichartz estimate and by (10.5) (we recall that the third term in the lhs, in part becomes a phase through an integrating factor, in part goes on the rhs: see [CM]; this trick is due to [BP2]). Notice also, that Theorem 10.1 implies by the arguments on pp. 67–68 in [CM], lim eiθ(t)σ3 f (t) − eitσ3 f + 1 = 0 (10.9) t→+∞

H

t for a f + ∈ H 1 with f + H 1 ≤ C and for θ (t) = tω0 + 2 0 (∂ f 2 H )(t )dt . We claim 2 that θ (t) = ϑ(t) − ϑ(0). This claim, Theorem 9.1, Theorem 10.1 and (10.9) imply Theorem 8.1. To prove the claim we substitute the last system of coordinates in (3.21) to obtain i f˙ − H f − (ϑ˙ − ω0 )Pc (ω0 )σ3 f = G,

(10.10)

where $G$ is a functional with values in $C(\mathbb{R}, L^1_x)$. The two equations are equivalent. This implies $G = \mathrm{rhs}(10.8)$ and $\dot\vartheta - \omega_0 = 2\,\partial_{\|f\|_2^2} H$. This yields the claim $\theta(t) = \vartheta(t) - \vartheta(0)$.


Step (ii) in the proof of Theorem 10.1 consists in introducing the variable

\[ g = f + \sum_{|\lambda^0\cdot(\mu-\nu)|>\omega_0} z^\mu \bar z^\nu\, R_H^+\big(\lambda^0\cdot(\mu-\nu)\big)\, G^0_{\mu\nu}. \tag{10.11} \]

Substituting the new variable $g$ in (10.8), the first line on the rhs of (10.8) cancels out. By an easier version of Lemma 4.3 in [CM] we have:

Lemma 10.3. For $\epsilon$ sufficiently small and for $C_0 = C_0(H)$ a fixed constant, we have

\[ \|g\|_{L^2_t L^{2,-S}_x} \le C_0\,\epsilon + O(\epsilon^2). \tag{10.12} \]

As in [CM], the part of $f$ which couples nontrivially with $z$ comes from the polynomial in $z$ contained in (10.11). $g$ and $z$ are decoupled.

10.1. The Fermi golden rule. We proceed as in the related parts in [BC,CM]. The only difference with [CM] is that the preparatory work in Theorem 9.1 makes transparent the positive semidefiniteness of the crucial coefficients. Set $R^+_{\mu\nu} = R_H^+(\lambda^0\cdot(\mu-\nu))$. We will have $\lambda^0_j = \lambda_j(\omega_0)$ and $\lambda_j = \lambda_j(\|f\|_2^2)$ as in Sect. 9.2. $|\lambda^0_j - \lambda_j| \lesssim C_1^2\epsilon^2$ by (10.4), so in the sequel we can assume that $\lambda^0$ satisfies the same inequalities as $\lambda$. We substitute (10.11) in $i\dot z_j = \partial_{\bar z_j} H^{(r)}$ obtaining

i˙z j = ∂z j (H2 + Z 0 ) +

νj

|λ·(μ−ν)|>ω0

−

νj

|λ·(α−β)|>ω0

zμ zν g, σ1 σ3 G μν + ∂z j R zj

z μ+α z ν+β + 0 Rαβ G αβ , σ1 σ3 G μν . zj

(10.13)

|λ·(μ−ν)|>ω0

We rewrite this as i˙z j = ∂z j (H2 + Z 0 ) + E j ,

z ν+β + 0 − νj R0β G 0β , σ1 σ3 G 00ν , zj

(10.14) (10.15)

λ·β>ω0

λ·ν>ω0 λ·β−λk ω0 λ·α−λk ω0 λ·β−λk ω0

λ0

νj z ν+β + 0 R0β G 0β , σ1 σ3 G 00ν λ0 · (β + ν) z j νj zα zν + 0 Rα0 G α0 , σ1 σ3 G 00ν . · (α − ν) z j

(10.17)

λ0 ·α =λ0 ·ν λ·α−λk 1. Then by (10.5),

z α L 2 ≤ CC2 M 2 , ζ − z L 2 ≤ C t

t

λ·α>ω0 λ·α−λk ω0

ζj

ν + Rα0 G 0α0 , σ1 σ3 G 00ν .

(10.19)

λ·α−λk 1/2 and r > ω . Let standard theory, RH 0 α0 α G = λ0 ·α=r ζ G 0α0 and F = λ0 ·α=r ζ α Fα . Let t F = (F1 , F2 ). Then, see Lemma 4.1 [Cu2], " ! " ! + (r )G, σ3 G = lim Im RH (r + iε)G, σ3 G Im RH ε#0 ! " = lim Im Rσ3 (−+ω0 ) (r + iε)F, σ3 F ε#0 " ! = lim Im R− (r − ω0 + iε)F1 , F1 ε#0 ε = lim | F1 (ξ )|2 dξ ≥ 0. (10.23) ε#0 R3 (ξ 2 − (r − ω0 ))2 + ε 2 Now we will assume the following hypothesis.

(H11) We assume that for some fixed constants for any vector ζ ∈ Cn we have:

ν + λ0 · ν Im ζ α ζ Rα0 G 0α0 , σ1 σ3 G 00ν λ0 ·α=λ0 ·ν>ω0 λ·α−λk 5/2, we know that there exists T = T (v0 ) < ∞ such that the solution can be constructed uniquely in C([0, T ); H m (R3 )) (see e.g. [27]). The question of global regularity/finite time blow-up of such local classical solution, however, is a wide-open problem(see e.g. [8,29] for the general backgrounds on the problem). For this problem a


D. Chae

celebrated criterion of finite time blow-up due to Beale, Kato and Majda ([1]) is established, which states that the blow-up at $T$ happens if and only if $\int_0^T \|\omega(t)\|_{L^\infty}\,dt = \infty$, where $\omega = \operatorname{curl} v$ is the vorticity. Later Constantin, Fefferman and Majda derived another type of criterion, taking into account the dynamics of the vorticity ([10], see also [6,20] for the later refinements in this direction). One plausible direction of study of the problem is to investigate the possibility of scenarios of finite time blow-up. For example, in [17] the scenario of vortex tube collapse is studied, and excluded under the milder condition $\int_0^T \|v(t)\|_{L^\infty}\,dt < \infty$. In this paper we continue the study of [3,4] on the possibility of self-similar blow-up and related problems. It is originally motivated by Leray's question on the existence of self-similar blow-up in the 3D Navier-Stokes equations, which was raised in [28], and was negatively answered in [31], which was later refined in [30,32]. The (backward) self-similar solution of the system is the solution $v(x,t)$ of the form

\[ v(x,t) = \frac{1}{(T-t)^{\frac{\alpha}{\alpha+1}}}\, V\!\left(\frac{x}{(T-t)^{\frac{1}{\alpha+1}}}\right) \tag{1.1} \]

for a vector field $V$ on $\mathbb{R}^3$, where $\alpha \neq -1$. If we substitute $v$ given by (1.1) into the first equation of (E), then we obtain

\[ (SSE)\quad \begin{cases} \dfrac{\alpha}{\alpha+1}\,V + \dfrac{1}{\alpha+1}\,(y\cdot\nabla)V + (V\cdot\nabla)V = -\nabla P, \\[4pt] \operatorname{div} V = 0, \end{cases} \tag{1.2} \]

which is similar to the Leray system, satisfied by the self-similar solution for the Navier-Stokes equations, corresponding to (1.1) with $\alpha = 1$. Now the existence of a self-similar solution of the form (1.1) is equivalent to the existence of a solution of the stationary system (1.2). Note that due to the absence of the laplacian term in (SSE) we could not expect a maximum principle similar to that of the Leray system, which was crucial in the proofs of [31,32]. In [3] the author showed, by a completely different argument, that if $V$ satisfies

\[ \Omega := \operatorname{curl} V \in \bigcap_{0<p<p_1} L^p(\mathbb{R}^3) \quad \text{for some } p_1 > 0, \tag{1.3} \]

then the only solution to (1.2) satisfies $\Omega = \operatorname{curl} V = 0$, which, together with the divergence free condition, implies that $V = \nabla h$ for a harmonic function $h$ on $\mathbb{R}^3$. Note that the representation (1.1) implies

\[ \omega(x,t) = \frac{1}{T-t}\,\Omega\!\left(\frac{x}{(T-t)^{\frac{1}{\alpha+1}}}\right). \tag{1.4} \]

Actually a much more general result than the one stated above is proved in [3] (Theorem 1.2). Namely, if

\[ \omega(x,t) = G(t)\,\Omega(F(t)x) \tag{1.5} \]

with a scalar function $G(t)$ and a $3\times 3$ matrix valued function $F(t)$ on $[0,T)$ is a vorticity of the solution of (E), then necessarily either $\det(F^{-1}(0)F(t)) = 1$ for all $t \in [0,T)$, or $\Omega = 0$. The nonexistence of a solution of the form (1.1) is an immediate corollary of this restriction theorem for the solution representation, since we can easily see that

\[ \det\big(F^{-1}(0)F(t)\big) = \left(1 - \frac{t}{T}\right)^{-\frac{3}{\alpha+1}} \neq 1 \quad \forall t \in (0,T), \]

Self-Similar Solutions of 3D Euler and Related Equations


and therefore we should have $\Omega = 0$. One of our main purposes of this paper is to generalize/localize the main results of [3,4]. The following theorem generalizes Theorem 1.2 of [3] in the case of the Euler system in a bounded domain in $\mathbb{R}^3$.

Theorem 1.1. Let $D$ be a bounded domain in $\mathbb{R}^3$, and $v \in C(D\times[0,T)) \cap C^1(D\times(0,T))$ be a classical solution to (E) with the initial data $v_0$. For each $t \in [0,T)$ let $G_t(\cdot) : D \to \mathbb{R}^{3\times 3}$ be a non-singular matrix-valued function and $F_t(\cdot) : D \to D$ be a diffeomorphism. Suppose there exists a continuous and bounded vector field $\Omega$ such that

\[ \omega(x,t) = G_t(x)\,\Omega(F_t(x)) \quad \forall (x,t) \in D\times[0,T). \tag{1.6} \]

Then, either $\Omega = 0$, or for each $t \in [0,T)$ we have

\[ \{x \in \bar D \mid \det(\nabla(F_0^{-1}\circ F_t(x))) = 1\} \neq \emptyset. \tag{1.7} \]

Remark 1.1. The removal of the condition (1.3) in the above theorem is an immediate consequence of the assumption on the boundedness of the domain $D$. The main novelty in the generalization of Theorem 1.2 of [3] is that the matrix valued function $F_t(x)$ is a non-constant function of the spatial variable in general.

Next, in order to localize the results on asymptotically self-similar singularity of [4], we define the notion of regular point. Let $v \in C([0,t_*); H^m(\mathbb{R}^3))$ be a local classical solution of (E).

Definition 1.1. We say that $z_* = (x_*, t_*)$ is a regular point of $v$ if the following conditions are satisfied:
(i) There exists $a \in \mathbb{R}^3$ such that $x_* = \lim_{t\to t_*} X_t(a)$, where $X_t(\cdot)$ is the particle trajectory generated by $v$.
(ii) For $a \in \mathbb{R}^3$ chosen in (i) we have $\int_0^{t_*} |\omega(X_t(a), t)|\,dt < \infty$.

Let us now recall the definition of the quantity $\hat\alpha$ introduced by Constantin in [9],

\[ \hat\alpha(x,t) = \begin{cases} \xi(x,t)\cdot D(x,t)\cdot \xi(x,t), \quad \xi = \omega/|\omega|, & \text{if } \omega(x,t) \neq 0, \\ 0, & \text{if } \omega(x,t) = 0, \end{cases} \tag{1.8} \]

where D is the symmetric part of the velocity gradient matrix ∇v. Theorem 1.2. Let v ∈ C([0, t∗ ); H m (R3 )) be a local in time classical solution of (E), and X t (·) be the particle trajectory generated by v. Let x∗ = limt→t∗ X t (a) for some a ∈ R3 . Suppose there exists α > −1 such that lim sup r α+1 r →0

sup t∗ −r α+1 0 and ∂i η∂l F i j = ∂l q j .

(16)

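Thus, any Lipschitz solution of (15) satisfies the identity $\partial_t(\eta(u)) + \operatorname{div}_x(q(u)) = 0$.

Definition 2. A bounded admissible measure-valued solution $\nu$ of (15) with initial data $U_0 \in L^\infty$ is a parametrized family of probability measures $\nu \in P([0,T]\times\mathbb{R}^n; \mathbb{R}^k)$ such that

• $t \mapsto \langle \nu_{\cdot,t}, \xi\rangle$ is a weakly∗ continuous map, taking values in $L^\infty(\mathbb{R}^n)$;
• the identity
\[ \begin{cases} \partial_t \langle\nu, \xi\rangle + \operatorname{div}_x \langle\nu, F(\xi)\rangle = 0 \\ \langle\nu_{x,0}, \xi\rangle = U_0(x) \end{cases} \tag{17} \]
holds in the sense of distributions;
• the inequality
\[ \begin{cases} \partial_t \langle\nu, \eta(\xi)\rangle + \operatorname{div}_x \langle\nu, q(\xi)\rangle \le 0 \\ \langle\nu_{x,0}, \eta(\xi)\rangle = \eta(U_0(x)) \end{cases} \tag{18} \]
holds in the sense of distributions.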

Weak-Strong Uniqueness for Measure-Valued Solutions


Theorem 3. Assume $U : \mathbb{R}^n\times[0,T] \to \mathbb{R}^k$ is a bounded Lipschitz solution of (15) and $\nu$ a bounded admissible measure-valued solution of (15) with initial data $U_0 = U(\cdot, 0)$. Then $\nu_{x,t} = \delta_{U(x,t)}$ for a.e. $(x,t) \in \mathbb{R}^n\times[0,T]$.

The proof follows essentially the computations of pp. 98–100 in [6].

Proof. We start by defining the following functions of $t$ and $x$:

\[ h := \langle\nu, \eta(\xi)\rangle - \eta(U) - D\eta(U)\cdot\big[\langle\nu,\xi\rangle - U\big], \tag{19} \]
\[ Y^\alpha := \langle\nu, q^\alpha(\xi)\rangle - q^\alpha(U) - \partial_l\eta(U)\big[\langle\nu, F^{l\alpha}(\xi)\rangle - F^{l\alpha}(U)\big], \tag{20} \]
\[ Z^\alpha_\beta := \partial_{\beta j}\eta(U)\big[\langle\nu, F^{j\alpha}(\xi)\rangle - F^{j\alpha}(U) - \partial_l F^{j\alpha}(U)\big(\langle\nu, \xi^l\rangle - U^l\big)\big]. \tag{21} \]

Recall that $\operatorname{supp}(\nu_{x,t})$ and $|U(x,t)|$ are both uniformly bounded and that $\eta$, $q$ and $F$ are $C^2$ functions. So, there exists a constant $C$ such that the following inequalities hold for every $\xi \in \operatorname{supp}(\nu_{x,t})$:

\[ \big|q^\alpha(U(x,t)) - q^\alpha(\xi) - \partial_i q^\alpha(U(x,t))\,(U^i(x,t) - \xi^i)\big| \le C\,|U(x,t) - \xi|^2, \]
\[ \big|F^{j\alpha}(U(x,t)) - F^{j\alpha}(\xi) - \partial_i F^{j\alpha}(U(x,t))\,(U^i(x,t) - \xi^i)\big| \le C\,|U(x,t) - \xi|^2 \]

(we underline that $C$ is a constant independent of $x$, $t$ and $\xi$). Plugging these last inequalities into (20) and recalling (16), we conclude

\[ |Y(x,t)| \le C \int |U(x,t) - \xi|^2\, d\nu_{x,t}(\xi). \tag{22} \]

On the other hand, using that $D^2\eta \ge c_0\,\mathrm{Id}$, we easily infer

\[ |h(x,t)| \ge \frac{c_0}{2} \int |U(x,t) - \xi|^2\, d\nu_{x,t}(\xi) \tag{23} \]

and hence that

\[ |Y(x,t)| \le C_0\,|h(x,t)|. \tag{24} \]

A similar computation yields

\[ |Z(x,t)| \le C_1\,|h(x,t)|. \tag{25} \]

Next recall that

\[ \partial_t(\eta(U)) + \operatorname{div}_x(q(U)) = 0 \tag{26} \]

(because $U$ is Lipschitz). Fix a test function $\psi \in C_c^\infty(\mathbb{R}^n\times\,]{-T}, T[)$. Combining (26) with (18), we conclude

\[ \int_0^T\!\!\int \partial_t\psi\, h + \partial_{x_\alpha}\psi\, Y^\alpha \ \ge\ -\int_0^T\!\!\int \partial_t\psi\,\partial_i\eta(U)\big(U^i - \langle\nu,\xi^i\rangle\big) + \partial_{x_\alpha}\psi\,\partial_i\eta(U)\big(F^{i\alpha}(U) - \langle\nu, F^{i\alpha}(\xi)\rangle\big) \tag{27} \]

(no boundary term appears because the initial condition is the same for both $\langle\nu, \eta(\xi)\rangle$ and $\eta(U)$).


Y. Brenier, C. De Lellis, L. Székelyhidi Jr.

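In fact, by an easy approximation argument, (27) holds for any test function which is just Lipschitz continuous. Similarly, we can use the test function $\psi\, D\eta(U)$ (which is Lipschitz and compactly supported) on the identity (17) to get

\[ \int\!\!\int \partial_t\big(\psi\,\partial_i\eta(U)\big)\big(U^i - \langle\nu,\xi^i\rangle\big) + \partial_{x_\alpha}\big(\psi\,\partial_i\eta(U)\big)\big(F^{i\alpha}(U) - \langle\nu, F^{i\alpha}(\xi)\rangle\big) = 0. \tag{28} \]

Since $U$ is Lipschitz, we can use the chain rule and (15) to compute

\[ \partial_t\big(\partial_i\eta(U)\big)\big(U^i - \langle\nu,\xi^i\rangle\big) + \partial_{x_\alpha}\big(\partial_i\eta(U)\big)\big(F^{i\alpha}(U) - \langle\nu, F^{i\alpha}(\xi)\rangle\big) = \partial_{x_\alpha}U^i\, Z^\alpha_i. \tag{29} \]

Combining (27), (28) and (29) we infer

\[ \int\!\!\int \partial_t\psi\, h + \partial_{x_\alpha}\psi\, Y^\alpha \ \ge\ \int\!\!\int \psi\,\partial_{x_\alpha}U^i\, Z^\alpha_i. \tag{30} \]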

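Next, fix any point $\tau < T$, any radius $R > 0$ and $\varepsilon \in\, ]0, T - \tau[$. Consider the test function $\psi(x,t) = \omega(t)\chi(x,t)$, where

\[ \omega(t) := \begin{cases} 1 & \text{for } 0 \le t < \tau - \varepsilon \\ 1 - \varepsilon^{-1}(t - \tau + \varepsilon) & \text{for } \tau - \varepsilon \le t \le \tau \\ 0 & \text{for } t \ge \tau, \end{cases} \]

\[ \chi(x,t) := \begin{cases} 1 & \text{if } |x| \le R + C_0(\tau - t) \\ 1 - \varepsilon^{-1}\big(|x| - R - C_0(\tau - t)\big) & \text{if } 0 \le |x| - (R + C_0(\tau - t)) \le \varepsilon \\ 0 & \text{otherwise,} \end{cases} \]

where $C_0$ is the constant appearing in (24). Note that:

• $0 \le \psi \le 1$;
• $\psi(x,t) = 0$ if $t \ge \tau$ or $|x| \ge \varepsilon + R + C_0(\tau - t)$;
• $\partial_t\psi = -\varepsilon^{-1}$ on $B_R(0)\times\,]\tau - \varepsilon, \tau[$;
• $|\nabla_x\psi| \le -C_0^{-1}\,\partial_t\psi$.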
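The four properties of the cutoff $\psi = \omega\chi$ listed above can be checked numerically. The following is a minimal sketch, not part of the original proof: it works in the radial variable $r = |x|$, and the parameter values ($\tau = 1$, $\varepsilon = 0.1$, $R = 1$, $C_0 = 2$) and all function names are illustrative assumptions. The time step is chosen as `dt = dr / C0`, so that the difference quotients compare the same shifts of the moving cone boundary.

```python
def omega(t, tau, eps):
    """Time cutoff: 1 before tau - eps, linear down to 0 at tau, then 0."""
    if t < tau - eps:
        return 1.0
    if t <= tau:
        return 1.0 - (t - tau + eps) / eps
    return 0.0


def chi(r, t, tau, eps, R, C0):
    """Space cutoff in r = |x|: 1 inside the cone, linear layer of width eps."""
    s = r - (R + C0 * (tau - t))  # signed distance to the cone boundary
    if s <= 0.0:
        return 1.0
    if s <= eps:
        return 1.0 - s / eps
    return 0.0


def psi(r, t, tau=1.0, eps=0.1, R=1.0, C0=2.0):
    """Test function psi = omega * chi (illustrative default parameters)."""
    return omega(t, tau, eps) * chi(r, t, tau, eps, R, C0)


def worst_cone_violation(C0=2.0, dr=1e-4, n=60):
    """Largest value of |d_r psi| + d_t psi / C0 over a sample grid.

    The property |grad psi| <= -d_t psi / C0 means this is never positive
    (up to rounding). Derivatives are central difference quotients.
    """
    dt = dr / C0
    worst = float("-inf")
    for i in range(n):
        for j in range(n):
            r = 0.05 + 4.0 * i / n
            t = 0.05 + 1.1 * j / n
            d_r = (psi(r + dr, t, C0=C0) - psi(r - dr, t, C0=C0)) / (2 * dr)
            d_t = (psi(r, t + dt, C0=C0) - psi(r, t - dt, C0=C0)) / (2 * dt)
            worst = max(worst, abs(d_r) + d_t / C0)
    return worst
```

The reason the last property holds is that $\chi$ depends on $(x,t)$ only through $|x| + C_0 t$ and is nonincreasing in that quantity, while $\omega$ is nonincreasing in $t$; this is what makes the support of $\psi$ shrink at speed $C_0$ and lets the flux term $Y$ be absorbed using (24).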

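Combining these pieces of information with (24), from (30) we easily conclude

\[ \frac{1}{\varepsilon}\int_{\tau-\varepsilon}^{\tau}\!\int_{|x|\le R} h\,dx\,dt \ \le\ \int_0^{\tau}\!\int_{|x|\le R+\varepsilon+C_0(\tau-t)} |\nabla U|\,|Z|\,dx\,dt. \tag{31} \]

Recalling (25) and the Lipschitz regularity of $U$ we conclude

\[ \frac{1}{\varepsilon}\int_{\tau-\varepsilon}^{\tau}\!\int_{|x|\le R} h\,dx\,dt \ \le\ C\int_0^{\tau}\!\int_{|x|\le R+\varepsilon+C_0(\tau-t)} h\,dx\,dt. \tag{32} \]

Finally, letting $\varepsilon \downarrow 0$ and using the fact that $h$ is integrable, we conclude

\[ \int_{|x|\le R} h(x,\tau)\,dx \ \le\ C\int_0^{\tau}\!\int_{|x|\le R+C_0(\tau-t)} h(x,t)\,dx\,dt \quad \text{for a.e. } \tau. \tag{33} \]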


Note, moreover, that the set of measure zero where (33) fails can be chosen independently of $R$. Therefore, having fixed any $s < T$, we infer

\[ \int_{|x|\le R+C_0(s-\tau)} h(x,\tau)\,dx \ \le\ C\int_0^{\tau}\!\int_{|x|\le R+C_0(s-t)} h(x,t)\,dx\,dt \quad \text{for a.e. } \tau \in [0,s]. \tag{34} \]

If we set

\[ g(\tau) := \int_{|x|\le R+C_0(s-\tau)} h(x,\tau)\,dx, \]

then (34) becomes Gronwall's inequality $g(\tau) \le C\int_0^\tau g(t)\,dt$, which leads to the conclusion $g \equiv 0$. By the arbitrariness of $R > 0$ and $s < T$ we conclude that $h \equiv 0$ on $[0,T]\times\mathbb{R}^n$. Recalling (23), we infer $\nu_{x,t} = \delta_{U(x,t)}$ for a.e. $(x,t)$, which is the desired conclusion.

References

1. Alibert, J.J., Bouchitté, G.: Non-uniform integrability and generalized Young measures. J. Convex Anal. 4(1), 129–147 (1997)
2. Bellout, H., Cornea, E., Nečas, J.: On the concept of very weak L² solutions to Euler's equations. SIAM J. Math. Anal. 33(5), 995–1006 (2002) (electronic)
3. Brenier, Y.: Convergence of the Vlasov-Poisson system to the incompressible Euler equations. Comm. Part. Diff. Eqs. 25(3-4), 737–754 (2000)
4. Brenier, Y., Grenier, E.: Limite singulière du système de Vlasov-Poisson dans le régime de quasi neutralité: le cas indépendant du temps. C. R. Acad. Sci. Paris Sér. I Math. 318(2), 121–124 (1994)
5. Constantin, P.: Note on loss of regularity for solutions of the 3-D incompressible Euler and related equations. Commun. Math. Phys. 104(2), 311–326 (1986)
6. Dafermos, C.M.: Hyperbolic conservation laws in continuum physics, Second ed., Vol. 325 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 2005
7. De Lellis, C., Székelyhidi, L. Jr.: The Euler equations as a differential inclusion. Ann. Math. (2) 170(3), 1417–1436 (2009)
8. De Lellis, C., Székelyhidi, L. Jr.: On admissibility criteria for weak solutions of the Euler equations. Arch. Rat. Mech. Anal. 195(1), 225–260 (2010)
9. DiPerna, R.J.: Measure-valued solutions to conservation laws. Arch. Rat. Mech. Anal. 88(3), 223–270 (1985)
10. DiPerna, R.J., Majda, A.J.: Oscillations and concentrations in weak solutions of the incompressible fluid equations. Commun. Math. Phys. 108(4), 667–689 (1987)
11. Lions, P.-L.: Mathematical topics in fluid mechanics. Vol. 1, Vol. 3 of Oxford Lecture Series in Mathematics and its Applications. New York: The Clarendon Press, Oxford University Press, 1996
12. Masmoudi, N.: Remarks about the inviscid limit of the Navier-Stokes system. Commun. Math. Phys. 270(3), 777–788 (2007)
13. Scheffer, V.: An inviscid flow with compact support in space-time. J. Geom. Anal. 3(4), 343–401 (1993)
14. Shnirelman, A.: On the nonuniqueness of weak solution of the Euler equation. Comm. Pure Appl. Math. 50(12), 1261–1286 (1997)

Communicated by P. Constantin


Communications in


Effective Dynamics of Double Solitons for Perturbed mKdV

Justin Holmer1, Galina Perelman2, Maciej Zworski3

1 Department of Mathematics, Brown University, 151 Thayer Street, Providence, RI 02912, USA.
2 Ecole Polytechnique, CMLS, 91128 Palaiseau, France. E-mail: [email protected]
3 Department of Mathematics, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 22 June 2010 / Accepted: 17 November 2010
Published online: 17 May 2011 – © Springer-Verlag 2011

Abstract: We consider the perturbed mKdV equation $\partial_t u = -\partial_x(\partial_x^2 u + 2u^3 - b(x,t)u)$, where the potential $b(x,t) = b_0(hx, ht)$, $0 < h \ll 1$, is slowly varying with a double soliton initial data. On a dynamically interesting time scale the solution is $O(h^2)$ close in $H^2$ to a double soliton whose position and scale parameters follow an effective dynamics, a simple system of ordinary differential equations. These equations are formally obtained as Hamilton's equations for the restriction of the mKdV Hamiltonian to the submanifold of solitons. The interplay between algebraic aspects of complete integrability of the unperturbed equation and the analytic ideas related to soliton stability is central in the proof.

1. Introduction

We consider 2-soliton solutions to the modified Korteweg-de Vries (mKdV) equation with a slowly varying external potential

\[ \partial_t u = \partial_x(-\partial_x^2 u + bu - 2u^3), \quad b = b(x,t) = b_0(hx, ht), \quad 0 < h \ll 1, \quad \partial^\alpha b_0 \in L^\infty(\mathbb{R}^2). \tag{1.1} \]

The purpose of the paper is to find minimal exact effective dynamics valid for a long time in the semiclassical sense and describing non-perturbative 2-soliton interaction. The semiclassical parameter, h, quantifies the slowly varying nature of the potential. Equation (1.1) is chosen as a mathematically and numerically simpler model than the nonlinear Schrödinger equation (NLS) with a slowly varying potential, ∂t u = −i(−∂x2 u + bu − 2|u|2 u).

(1.2)

Like (1.1), Eq. (1.2) is a perturbation of a completely integrable nonlinear equation which possesses multisoliton solutions. The dynamics of such multisoliton solutions


in the presence of an external potential is in fact physically relevant to experiments involving Bose-Einstein condensation – see [32]. For (1.2), one can draw an analogy to the semiclassical dynamics of coherent states – well-localized solutions of the linear Schrödinger equation with slowly varying potential – see for instance [7] or the Introduction to [30]. In this one-particle quantum mechanical setting, the natural long time for which the semiclassical approximation is valid is the Ehrenfest time, log(1/ h)/ h. For the nonlinear equation (1.2), Fröhlich-GustafsonJonsson-Sigal [13] obtained effective dynamics of 1-soliton propagation valid up to the Ehrenfest time. However, unlike the corresponding linear case, the exact effective dynamics of 1-soliton solutions to the nonlinear equation (1.2) valid for such a long time requires h 2 -size corrections.1 Those corrections appeared as unspecified O(h 2 ) additions to Newton’s equations (which give the usual semiclassical approximation) in the aforementioned work [13]. That paper and its symplectic point of view were the starting point for our recent work [18,19] in which the exact dynamics for 1-soliton solutions to (1.2) were obtained. Following the 1-soliton analysis of [18,19] for (1.2), the semiclassical dynamics for 2-solitons considered here for (1.1) is obtained by restricting the Hamiltonian to the symplectic manifold of 2-solitons and considering the finite dimensional dynamics there. The numerical experiments [17] show a remarkable agreement with the theorem below. However, they also reveal an interesting scenario not covered by our theorem: the velocities of the solitons can almost cross within an exponentially small width in h and the effective dynamics remains valid. Any long time analysis involving multiple interactions of solitons has to explain this avoided crossing which perhaps could be replaced by a direct crossing in a different parametrization. 
This seems the most immediate open problem of phenomenological interest. In a recent numerical study Potter [30] showed that the same effective dynamics applies very well to N -solitons both in the case of perturbed mKdV (1.1) and perturbed NLS (1.2). The soliton matter-wave trains created for Bose-Einstein condensates [32] were a good testing ground and our effective dynamics gives an alternative explanation of the observed phenomena. At the moment it is not clear how to analytically obtain rigorous exact effective dynamics for the perturbed NLS (1.2), while a result yielding inexact equations in the spirit of [13] seems comparatively accessible. Our method of analysis follows a long tradition of the use of modulation parameters in problems of orbital and asymptotic stability of single solitons, and the perturbative interaction of multiple solitons for nonintegrable equations – see for instance [8,24,25,29,31,34] and the numerous references given there. For nonlinear dispersive equations with non-constant coefficients one can consult, in addition, [3,13–15], and references given there. Here we avoid generality and, as described above, the aims are more modest: for an equation with an underlying completely integrable structure, using classical methods, we can give a remarkably accurate and phenomenologically relevant description of 2-soliton interaction with an external field, while allowing for nonperturbative self-interaction of the 2-soliton. The Lyapunov functional needed in our proof was previously employed by Maddocks-Sach [23] to prove the stability of KdV multisolitons and relies on the existence of higher-order conserved energies stemming from the completely integrable structure. 
To state the exact result we recall that the unperturbed case of (1.1), $b \equiv 0$, is completely integrable and has $N$-soliton solutions expressed in terms of profiles $q_N(x, a, c)$ dependent upon a positional parameter $a \in \mathbb{R}^N$ and a scale parameter $c \in \mathbb{R}^N$ – see § 1.1 and § 3 below for detailed discussion. For $N = 2$ we obtain

1 A compensation for that comes however at having the semiclassical propagation accurate for larger values of h.

Theorem. Let $\delta_0 > 0$ and $\bar a, \bar c \in \mathbb{R}^n$. Suppose that $u(x,t)$ solves (1.1) with

\[ u(x,0) = q_2(x, \bar a, \bar c), \quad |\bar c_1 \pm \bar c_2| > 2\delta_0 > 0, \quad 2\delta_0 < |\bar c_j| < (2\delta_0)^{-1}. \tag{1.3} \]

Then, for $t < T(h)/h$,

\[ \|u(\cdot,t) - q_2(\cdot, a(t), c(t))\|_{H^2} \le C h^2 e^{Cht}, \quad C = C(\delta_0, b_0) > 0, \tag{1.4} \]

where $a(t)$ and $c(t)$ evolve according to the effective equations of motion,

\[ \dot a_j = c_j^2 - \operatorname{sgn}(c_j)\,\partial_{c_j} B(a,c,t), \quad \dot c_j = \operatorname{sgn}(c_j)\,\partial_{a_j} B(a,c,t), \quad B(a,c,t) \overset{\mathrm{def}}{=} \frac12 \int b(x,t)\, q_2(x,a,c)^2\,dx. \tag{1.5} \]

The upper bound $T(h)/h$ for the validity of (1.4) is given in terms of

\[ T(h) = \min\big(\delta \log(1/h),\, T_0(h)\big), \quad \delta = \delta(\delta_0, b_0) > 0, \tag{1.6} \]

where for $t < T_0(h)/h$, $|c_1(t) \pm c_2(t)| > \delta_0 > 0$ and $\delta_0 < |c_j(t)| < \delta_0^{-1}$. Under the assumption (1.3) on $\bar c$, $T_0(h) > \delta_2$, where $\delta_2 = \delta_2(\delta_0, b_0) > 0$ is independent of $h$ – see (1.13).

Remarks.

1. As shown by the top two plots in Fig. 1 the agreement of the approximations given by (1.5) and numerical solutions of (1.1) is remarkable. The codes are available at [17], see also § 1.4.

2. For external potentials with nondegenerate maxima, the limiting Ehrenfest time $\log(1/h)/h$ appears to be optimal as the errors behave like $O(h^2\exp(Cht))$ – see [30, §4.3] – provided we insist on the agreement with classical equations of motion (1.5). We expect that the solution is close to a soliton profile $q_2(x,a,c)$ for much longer times ($h^{-\infty}$?) but with a modified evolution for the parameters. One difficulty is the lack of a good description of the long time behaviour of time dependent linearized evolution with $b$ present – see § 8. However, the modified equations would lack the transparency of (1.5) and would be harder to implement. The numerical study [30] suggests that for the minimal exact dynamics the error bound $O(h^2)$ in (1.4) is optimal.

3. The condition that $|c_1(t) \pm c_2(t)| > \delta_1$, that is, that the perturbed effective dynamics avoids the lines shown in Fig. 2, could most likely be relaxed. Allowing that provides more interesting dynamics as then the solitons can interact multiple times. As discussed in § 1.2 and Appendix B, we expect avoided crossing after $\pm c_j(t)$'s get within $\exp(-c/h)$ of each other – see Fig. 3. Examples of such evolution, and the comparisons with effective dynamics, are shown in the lower two plots in Fig. 1. On closer inspection the agreement between the solutions and solitons moving according to effective dynamics is not as dramatic as in the case when $\pm c_j$'s stay away from each other but for smaller values of $h$ the result should still hold. We concentrated on the simpler case at this early stage.



Fig. 1. A gallery of numerical experiments showing agreement with the results of the main theorem (clockwise from the left hand corner) for the external fields listed in (1.16) with the indicated initial data. The continuous lines are the numerically computed solutions and the dotted lines follow the evolution given by (1.5). The main theorem does not apply to the bottom two figures on the whole interval of time due to the crossing of c j ’s – see Fig. 3. In the first figure in the second line, (1.5) still apply directly, but in the second one further modification is needed to account for the signs

4. Studies of single solitons for perturbed KdV, mKdV, and their generalizations were conducted by Dejak-Jonsson [10] and Dejak-Sigal [11]. The mKdV results of [10] are improved by following [19]. For KdV one does not expect the same behaviour as for mKdV and the $O(h^2)$-approximation similar to (1.4) is not valid – see the recent work by Muñoz [27] and the first author [16] for finer analysis of that case.

5. We expect the same result to be true for $N$-solitons for all $N$ with the function space $H^2$ replaced by $H^N$. For $N = 1$ it follows directly from the arguments of [19]. That case is also implicit in this paper: single soliton dynamics describes the propagation away from the interaction region.

6. The condition that $u(x,0) = q_2(x, \bar a, \bar c)$ can be relaxed by allowing a small perturbation in $H^2$ – see [9] for the adaptation of [19] to that case. Similar statements are possible here but we prefer the simpler formulation both in the statement of the theorem and in the proofs.

7. Equation (1.1) is globally well-posed in $H^k$, $k \ge 1$, under even milder regularity hypotheses on $b$. This can be shown by modifying the techniques of Kenig-Ponce-Vega [21] – see Appendix A. Although for $k \ge 2$ more classical methods are available, we opt for a self-contained treatment dealing with all $H^k$'s at once.
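To see why the logarithmic time scale in (1.6) is the natural limit for the bound (1.4), one can evaluate $Ch^2e^{Cht}$ at $t = \delta\log(1/h)/h$: the exponential becomes $h^{-C\delta}$, so the bound equals $Ch^{2-C\delta}$ and stays small only when $C\delta < 2$. The sketch below illustrates this arithmetic; the numerical values of `C` and `delta` are purely illustrative assumptions, since the theorem only asserts that such constants exist.

```python
import math


def error_bound(h, t, C):
    """Right-hand side of (1.4): C * h^2 * exp(C * h * t)."""
    return C * h ** 2 * math.exp(C * h * t)


def bound_at_log_time(h, delta, C):
    """The same bound evaluated at the time t = delta * log(1/h) / h."""
    t = delta * math.log(1.0 / h) / h
    return error_bound(h, t, C)


def closed_form(h, delta, C):
    """Equivalent closed form C * h^(2 - C * delta)."""
    return C * h ** (2.0 - C * delta)
```

For instance, with the hypothetical values `C = 1.5` and `delta = 0.5` the bound at the final time scales like $h^{1.25}$, which still tends to zero as $h \to 0$, whereas any choice with $C\delta \ge 2$ would make the estimate uninformative.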



Fig. 2. On the left we show R2 \ C and on the right examples of double solitons corresponding to (c1 , c2 ) indicated on the left (with a1 = a2 = 0 in the first figure and a2 = −a1 = 1, in the other two). At the coordinate axes the double soliton degenerates into a single soliton. As one approaches the lines c1 = ±c2 the solitons escape to infinities in the opposite direction

In the remainder of the Introduction we will explain the origins of the effective dynamics (1.5), outline the proof, and comment on numerical experiments.

1.1. Double solitons for mKdV. The single soliton solutions to mKdV, (1.1) with $b \equiv 0$, are described in terms of the profile $\eta(x, a, c)$ as follows. Let $\eta(x) = \operatorname{sech} x$ so that $-\eta'' + \eta - 2\eta^3 = 0$, and let $\eta(x, a, c) = c\,\eta(c(x - a))$ for $a \in \mathbb{R}$, $c \in \mathbb{R}\setminus\{0\}$. Then a single soliton defined by $u(x,t) = \eta(x, a + c^2 t, c)$ is easily verified to be an exact solution to mKdV. Such solitary wave solutions are available for many nonlinear evolution equations. However, mKdV has richer structure – it is completely integrable and can be studied using the inverse scattering method (Miura [26], Wadati [34]). One of the consequences is the availability of larger families of explicit solutions. In the case of mKdV, we have $N$-solitons and breathers. In this paper we confine our attention to the 2-soliton (or double soliton), which is described by the profile $q_2(x, a, c)$ defined in (3.2) below. The four real parameters, $a \in \mathbb{R}^2$ and $c \in \mathbb{R}^2\setminus\mathcal{C}$,

\[ \mathcal{C} \overset{\mathrm{def}}{=} \{(c_1, c_2) : c_1 = \pm c_2\} \cup \mathbb{R}\times\{0\} \cup \{0\}\times\mathbb{R}, \]

describe the position ($a$) and scale ($c$) of the double soliton. At the diagonal lines the parametrization degenerates: for $c_1 = \pm c_2$, $q_2 \equiv 0$. At the coordinate axes in the $c$ space, we recover single solitons: $q_2(x, a, (c_1, 0)) = -c_1\,\eta(x, a_1, c_1)$, $q_2(x, a, (0, c_2)) = c_2\,\eta(x, a_2, c_2)$. Figure 2 shows a few examples. Solving mKdV with $u(x,0) = q_2(x, a, c)$ gives the solution $u(x,t) = q_2(x, a_1 + tc_1^2, a_2 + tc_2^2, c)$, that is, the double soliton solution.
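The claim that the single soliton $u(x,t) = c\,\operatorname{sech}(c(x - a - c^2t))$ solves mKdV with $b \equiv 0$ is easy to confirm numerically. The following is a rough finite-difference sketch, not taken from the paper: the function names, step sizes and sample parameters are assumptions made for illustration, and the residuals are only expected to vanish up to discretization error.

```python
import math


def eta(x):
    """Base profile eta(x) = sech(x)."""
    return 1.0 / math.cosh(x)


def profile_residual(x, d=1e-3):
    """-eta'' + eta - 2 eta^3, with eta'' from a central second difference."""
    d2 = (eta(x + d) - 2.0 * eta(x) + eta(x - d)) / (d * d)
    return -d2 + eta(x) - 2.0 * eta(x) ** 3


def soliton(x, t, a, c):
    """u(x, t) = eta(x, a + c^2 t, c) = c * sech(c * (x - a - c^2 t))."""
    return c * eta(c * (x - a - c * c * t))


def mkdv_residual(x, t, a, c, d=1e-3):
    """u_t + d/dx (u_xx + 2 u^3) by central differences (case b = 0)."""

    def flux(xx):
        u_xx = (soliton(xx + d, t, a, c) - 2.0 * soliton(xx, t, a, c)
                + soliton(xx - d, t, a, c)) / (d * d)
        return u_xx + 2.0 * soliton(xx, t, a, c) ** 3

    u_t = (soliton(x, t + d, a, c) - soliton(x, t - d, a, c)) / (2.0 * d)
    return u_t + (flux(x + d) - flux(x - d)) / (2.0 * d)
```

Both residuals should be small at every sample point, for either sign of $c$; the second one checks the travelling-wave speed $c^2$ as well as the profile equation.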



Fig. 3. The plots of c and a for the external potential given by the last b(x, t) in (1.16), and c¯ = (6, 10), a¯ = (−1, −2). We see the avoided crossings near times at which the decoupled dynamics (1.12) would give a crossing of c j ’s (see also Fig. 6). The crossings are avoided with exp(−1/Ch) width and a1 = a2 at the crossings. These cases are not yet covered by our theory. Of the five crossings of a j ’s in the bottom figure, three do not involve crossings of c j ’s, and hence the description by effective dynamics there is covered by our theorem. However, in the absence of avoided crossing of c j ’s the solitons can interact only once

If, say, 0 < c1 < c2 , then for |a1 − a2 | large, q(x, a, c) ≈ η(x, a1 + α1 , c1 ) + η(x, a2 + α2 , c2 ), where α j are shifts defined in terms of c, see Lemma 3.2 for the precise statement. This means that for large positive and negative times the evolving double soliton is effectively a sum of single solitons. The decomposition can be made exact preserving the particle-like nature of single solitons even during the interaction – see (3.11) and Fig. 4. We consider the set of 2-solitons as a submanifold of H 2 (R; R) with 8 open components corresponding to the components of R2 \C: M = { q(·, a, c) | a = (a1 , a2 ) ∈ R2 , c = (c1 , c2 ) ∈ R2 \C }.

(1.7)

As in the case of single solitons this submanifold is symplectic with respect to the natural structure recalled in the next subsection.



Fig. 4. A depiction of the double soliton solution given by (3.3). The top figure shows the evolution of a double soliton. The bottom two figures show the evolution of its two components defined using (3.11). One possible “particle-like” interpretation of the two soliton interaction [4] is that the slower soliton, shown in the left bottom plot is hit by the fast soliton shown in the right bottom plot. Just like billiard balls, the slower one picks up speed, and the fast one slows down. But unlike billiard balls, the solitons simply switch velocities

1.2. Dynamical structure and effective equations of motion. The equation (1.1) is a Hamiltonian equation of evolution for

\[ H_b(u) = \frac12 \int (u_x^2 - u^4 + b u^2)\,dx, \tag{1.8} \]

on the Schwartz space, $\mathcal{S}(\mathbb{R}; \mathbb{R})$, equipped with the symplectic form

\[ \omega(u, v) = \frac12 \int_{-\infty}^{+\infty} \int_{-\infty}^{x} (u(x)v(y) - u(y)v(x))\,dy\,dx. \tag{1.9} \]

In other words, (1.1) is equivalent to

\[ u_t = \partial_x H_b'(u), \qquad \langle H_b'(u), \varphi \rangle \stackrel{\mathrm{def}}{=} \frac{d}{ds} H_b(u + s\varphi)\Big|_{s=0}, \tag{1.10} \]

and $\partial_x H_b'(u)$ is the Hamilton vector field of $H_b$, $\Xi_{H_b}$, with respect to $\omega$: $\omega(\varphi, \Xi_{H_b}(u)) = \langle H_b'(u), \varphi \rangle$. For $b = 0$, $\Xi_{H_0}$ is tangent to the manifold of solitons (1.7). Also, $M$ is symplectic with respect to $\omega$, that is, $\omega$ is nondegenerate on $T_u M$, $u \in M$. Using the stability theory for 2-solitons based on the work of Maddocks-Sachs [23], and energy methods



(enhanced and simplified using algebraic identities coming from complete integrability of mKdV) we will show that the solution to (1.1) with initial data on $M$ stays close to $M$ for $t \le \log(1/h)/h$. A basic intuition coming from symplectic geometry then indicates that $u(t)$ stays close to an integral curve on $M$ of the Hamilton vector field (defined using $\omega|_M$) of $H_b$ restricted to $M$:

\[ \begin{aligned}
H_{\mathrm{eff}}(a, c) &\stackrel{\mathrm{def}}{=} H_b|_M(a, c) = H_0|_M(a, c) + \frac12 \int b(x)\, q_2(x, a, c)^2\,dx, \\
H_0|_M(a, c) &= -\frac13 (|c_1|^3 + |c_2|^3), \\
\omega|_M &= da_1 \wedge d|c_1| + da_2 \wedge d|c_2|, \\
\Xi_{H_{\mathrm{eff}}} &= \sum_{j=1}^{2} \operatorname{sgn}(c_j)\,(\partial_{a_j} H_{\mathrm{eff}}\,\partial_{c_j} - \partial_{c_j} H_{\mathrm{eff}}\,\partial_{a_j}).
\end{aligned} \tag{1.11} \]
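As a hedged sketch of what the vector field in (1.11) encodes, assume $c_1, c_2 > 0$ (so that $\operatorname{sgn}(c_j) = 1$ and $|c_j| = c_j$) and write $V$ for the potential term; the name $V$ is introduced here only for illustration. The integral curves then satisfy the system below. Whether it agrees term by term with (1.5) depends on conventions that are not restated in this section.

```latex
% Assumption: c_1, c_2 > 0.  V is shorthand introduced for this sketch only.
V(a,c,t) = \frac12 \int b(x,t)\, q_2(x,a,c)^2 \, dx, \qquad
H_{\mathrm{eff}} = -\tfrac13 \,(c_1^3 + c_2^3) + V,
\\[4pt]
\dot a_j = -\,\partial_{c_j} H_{\mathrm{eff}} = c_j^2 - \partial_{c_j} V, \qquad
\dot c_j = \partial_{a_j} H_{\mathrm{eff}} = \partial_{a_j} V, \qquad j = 1, 2.
```

For $b \equiv 0$ this reduces to $\dot a_j = c_j^2$, $\dot c_j = 0$, which is the free motion $a_j + c_j^2 t$ of the double soliton.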

The effective equations of motion (1.5) follow. This simple but crucial observation was made in [18,19] and it did not seem to be present in earlier mathematical work on solitons in external fields [13]. The condition made in the theorem, that $|c_1(t) \pm c_2(t)|$ and $|c_j(t)|$ are bounded away from zero for $t < T_0(h)/h$ (where $T_0(h)$ could be $\infty$), follows from a condition involving a simpler system of decoupled $h$-independent ODEs; see Appendix B. Here we state a condition which gives an $h$-independent $T_0$ appearing in (1.6). Suppose we are given $b(x, t) = b_0(hx, ht)$ in (1.1) and the initial condition is given by $q_2(x, \bar a, \bar c)$, $\bar a = (\bar a_1, \bar a_2)$, $\bar c = (\bar c_1, \bar c_2)$, $|\bar c_1 \pm \bar c_2| > \delta_0$, $|\bar c_j| > \delta_0$. We consider an $h$-independent system of two decoupled differential equations for $A(T) = (A_1(T), A_2(T))$, $C(T) = (C_1(T), C_2(T))$, given by

\[ \begin{cases} \partial_T A_j = C_j^2 - b_0(A_j, T), \\ \partial_T C_j = C_j\, \partial_x b_0(A_j, T), \end{cases} \qquad A(0) = \bar a h, \quad C(0) = \bar c, \quad j = 1, 2. \tag{1.12} \]

Then, for a given $\delta_1 < \delta_0$, $T_0(h)$ in (1.6) can be replaced by

\[ T_0 \stackrel{\mathrm{def}}{=} \sup\{T : |C_1(T) \pm C_2(T)| > \delta_1,\ |C_j(T)| > \delta_1,\ j = 1, 2\}. \tag{1.13} \]
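The decoupled system (1.12) is simple enough to integrate directly. The following is a minimal sketch, not the authors' code: the function name `integrate_decoupled`, its arguments, and the reading of (1.13) as the first time at which one of the separation conditions fails are assumptions made for illustration. The callables `b0` and `dx_b0` stand for $b_0$ and $\partial_x b_0$ and must be supplied by the user.

```python
def integrate_decoupled(a_bar, c_bar, h, b0, dx_b0, delta1, T_max=5.0, dT=1e-3):
    """RK4 integration of the decoupled system (1.12).

    b0(x, T) and dx_b0(x, T) are user-supplied callables (hypothetical names).
    Returns (T_exit, A, C): T_exit is the first grid time at which one of the
    conditions in (1.13) fails, or the final time if none fails on [0, T_max].
    """
    def rhs(A, C, T):
        # (1.12): dA/dT = C^2 - b0(A, T),  dC/dT = C * d_x b0(A, T)
        return C * C - b0(A, T), C * dx_b0(A, T)

    def separated(C):
        # the four conditions appearing in (1.13)
        return (abs(C[0] - C[1]) > delta1 and abs(C[0] + C[1]) > delta1
                and abs(C[0]) > delta1 and abs(C[1]) > delta1)

    A = [a * h for a in a_bar]   # A(0) = a_bar * h
    C = list(c_bar)              # C(0) = c_bar
    T = 0.0
    for _ in range(int(round(T_max / dT))):
        if not separated(C):
            break
        for j in range(2):       # the equations for j = 1, 2 do not interact
            k1 = rhs(A[j], C[j], T)
            k2 = rhs(A[j] + 0.5 * dT * k1[0], C[j] + 0.5 * dT * k1[1], T + 0.5 * dT)
            k3 = rhs(A[j] + 0.5 * dT * k2[0], C[j] + 0.5 * dT * k2[1], T + 0.5 * dT)
            k4 = rhs(A[j] + dT * k3[0], C[j] + dT * k3[1], T + dT)
            A[j] += dT * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
            C[j] += dT * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        T += dT
    return T, A, C
```

With $b_0 \equiv 0$ the scales $C_j$ stay constant and $A_j(T) = \bar a_j h + \bar c_j^2 T$, which gives a quick consistency check of the integrator.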

1.3. Outline of the proof. To obtain the effective dynamics we follow a long tradition (see [13] and references given there) and define the modulation parameters a(t) = (a1 (t), a2 (t)), c(t) = (c1 (t), c2 (t)), by demanding that v(x, t) = u(x, t) − q(x, a(t), c(t)), q = q2 , satisfies symplectic orthogonality conditions: ω(v, ∂a1 q) = 0, ω(v, ∂a2 q) = 0, ω(v, ∂c1 q) = 0, ω(v, ∂c2 q) = 0.



These can be arranged by the implicit function theorem thanks to the nondegeneracy of $\omega|_M$. This makes $q$ the symplectic orthogonal projection of $u$ onto the manifold of solitons $M$. Since $u = q + v$ and $u$ solves mKdV, we have

\[ \partial_t v = \partial_x(L_{c,a} v - 6qv^2 - 2v^3 + bv) - F_0, \tag{1.14} \]

where $L_{c,a} v = -\partial_x^2 v - 6q(x, a, c)^2 v$, and $F_0$ results from the perturbation and $\partial_t$ landing on the parameters:

\[ F_0 = \sum_{j=1}^{2} (\dot a_j - c_j^2)\,\partial_{a_j} q + \sum_{j=1}^{2} \dot c_j\,\partial_{c_j} q - \partial_x(bq). \]

We decompose $F_0 = F_\parallel + F_\perp$, where $F_\parallel$ is the symplectic projection of $F_0$ onto $T_q M$, and $F_\perp$ is the symplectic projection onto its symplectic orthogonal $(T_q M)^\perp$. As seen in (5.4), $F_\parallel \equiv 0$ is equivalent to the equations of motion (1.5) (we assume in the proof that $c_2 > c_1 > 0$). Using the properties of $q$, we show that $F_\perp$ is $O(h^2)$. In fact it is important to obtain a specific form for the $O(h^2)$ term so that it is amenable to finding a certain correction term later; see § 6. The estimates for $F_\parallel$ are obtained using the symplectic orthogonality properties of $v$. For example, $0 = \langle v, \partial_x^{-1}\partial_{a_j} q \rangle$ implies

\[ 0 = \partial_t \langle v, \partial_x^{-1}\partial_{a_j} q \rangle = \langle \underbrace{\partial_t v}_{\text{substitute equation (1.14)}}, \partial_x^{-1}\partial_{a_j} q \rangle + \langle v, \partial_t \partial_x^{-1}\partial_{a_j} q \rangle, \]

which can be used to show that

\[ |F_\parallel| \le C h^2 \|v\|_{H^2} + \|v\|_{H^2}^2, \tag{1.15} \]

see § 7. The next step is to estimate $v$ satisfying (1.14) with $v(0) = O(h^2)$ (in the theorem $v(0) = 0$, but we need this relaxed assumption for the bootstrap argument). We want to show that on a time interval of length $h^{-1}$, $v$ at most doubles. The Lyapunov functional $E(t)$ that we use to achieve this comes from the variational characterization of the double soliton (see [22, §2] and Lemma 4.1 below): if

\[ H_c(u) = I_5(u) + (c_1^2 + c_2^2) I_3(u) + c_1^2 c_2^2 I_1(u), \]

then

\[ H_c'(q(\cdot, a, c)) = 0 \quad \forall a \in \mathbb{R}^2, \qquad \text{and} \qquad H_c''(q(\cdot, a, c)) = K_{c,a}, \]



where $K_{c,a}$ is a fourth order operator given in (4.11) below. Hence

\[ E(t) \stackrel{\mathrm{def}}{=} H_{c(t)}(q(\cdot, a(t), c(t)) + v(t)) - H_{c(t)}(q(\cdot, a(t), c(t))) \]

satisfies $E(t) \approx \langle K_{c,a} v, v \rangle$, and, as in Maddocks-Sachs [23] for KdV, $K_{c,a}$ has a two dimensional kernel and one negative eigenvalue. However, the symplectic orthogonality conditions on $v$ imply that we project far enough away from these eigenspaces and hence we have the coercivity $\delta \|v\|_{H^2}^2 \le E(t)$. To get the upper bound on $E(t)$, we compute

\[ \frac{d}{dt} E(t) = O(h)\|v(t)\|_{H^2}^2 + \langle K_{c,a} v, F_\parallel \rangle + \langle K_{c,a} v, F_\perp \rangle, \]

see § 9. Using (1.15) we can estimate the second term on the right-hand side but $|F_\perp| = O(h^2)$ only. We improve this to $h^3$ using a correction term to $v$; see § 8, and the comment at the end of this section. All of this combined gives, on $[0, T]$,

\[ \|v\|_{H^2}^2 \lesssim \|v(0)\|_{H^2}^2 + T\,(|F_\parallel|\,\|v\|_{H^2} + h^2 \|v\|_{H^2} + \|v\|_{H^2}^2), \qquad |F_\parallel| \le C h^2 \|v\|_{H^2} + \|v\|_{H^2}^2, \]

which implies

\[ \|v\|_{H^2} \lesssim h^2, \qquad |F_\parallel| \lesssim h^4, \qquad \text{on } [0, h^{-1}]. \]

Iterating the argument $\delta \log(1/h)$ times gives a slightly weaker bound for longer times. The $O(h^4)$ errors in the ODEs can be removed without affecting the bound on $v$, proving the theorem. In the proofs various facts due to complete integrability (such as the miraculous Lemma 2.1) simplify the arguments, in particular in the above energy estimate. We conclude with the remark about the correction term added to $v$ in order to improve the bound on $F_\perp$ from $h^2$ to $h^3$. A similar correction term was used in [19] for NLS 1-solitons. Together with the symplectic projection interpretation, it was the key to sharpening the results in earlier works. Implementing the same idea in the setting of 2-solitons is more subtle. The 2-soliton is treated as if it were the sum of two decoupled 1-solitons, the corrections are introduced for each piece, and the result is that $F_\perp$ is corrected so that

\[ \|F_\perp\|_{H^2} \lesssim h^3 + h^2 e^{-\gamma |a_1 - a_2|}. \]

That is, when $|a_1 - a_2| = O(1)$, there is no improvement. However, this happens only on an $O(1)$ time scale and hence does not spoil the long time estimate.



1.4. Numerical experiments. The mKdV equation is very friendly from the numerical point of view and MATLAB is sufficient for producing good results. We first describe the simple codes on which our experiments are based. Instead of considering (1.1) on the line, we consider it on the circle identified with $[-\pi, \pi)$. To solve it numerically we adapt the code given in [33, Chap. 10], which is based on the Fast Fourier Transform in $x$, the method of integrating factor for the $-u_{xxx} \mapsto -ik^3 \hat u(k)$ term, and the fourth-order Runge-Kutta formula for the resulting ODE in time. Unless the amplitude of the solution gets large (which results in large terms in the equation due to the $u^3$ term) it suffices to take $2^N$, $N = 8$, discretization points in $x$. For $X \in [-\pi, \pi)$ we consider $B(X, T)$ periodic in $X$, and compute $U(X, T)$ satisfying

\[ \partial_T U = -\partial_X(\partial_X^2 U + 2U^3 - B(X, T) U), \qquad U(\pi, T) = U(-\pi, T). \]

A simple rescaling,

\[ u(x, t) = \alpha U(\alpha x, \alpha^3 t), \qquad b(x, t) = \alpha^2 B(\alpha x, \alpha^3 t), \]

gives a solution of (1.1) on $[-\pi/\alpha, \pi/\alpha]$ with periodic boundary conditions. When $\alpha$ is small this is a good approximation of the equation on the line. If we use $U(X, T)$ in our numerical calculations with the initial data $q_2(X, A, C)$, $A \in \mathbb{R}^2$, $C \in \mathbb{R}^2 \setminus \mathcal{C}$, the initial condition for $u(x, t)$ is given by $u(x, 0) = q_2(x, A/\alpha, \alpha C)$. If we want $\bar c = \alpha C$ to satisfy the assumptions (1.3), the effective small constant $h$ becomes $h = \alpha$ and $b_0$ in (1.1) becomes $b_0(x, t) = h^2 B(x, h^2 t)$. In principle we have three scales: size of $B$, size of $\partial_x B$, and size of $\partial_t B$, which should correspond to three small parameters $h$. For simplicity we just use one scale $h$ in the theorem. Figure 1 shows four examples of evolution and comparison with effective dynamics computed using the MATLAB codes available at [17]. The external potentials used are given by

\[ \begin{aligned}
B(x, t) &= 100\cos^2(x - 10^3 t) - 50\sin(2x + 10^3 t), \\
B(x, t) &= 100\cos^2(x - 10^3 t) + 50\sin(2x + 10^3 t), \\
B(x, t) &= 60\cos^2(x + 1 - 10^2 t) + 40\sin(2x + 2 + 10^2 t), \\
B(x, t) &= 40\cos(2x + 3 - 10^2 t) + 30\sin(x + 1 + 10^2 t).
\end{aligned} \tag{1.16} \]

The rescaling of the fixed size potential used in the theorem, $b_0(x, t) = h^2 B(x, h^2 t)$, means that our $h$ satisfies $h \sim 1/5$ in the last two examples. In the first two examples the scales in $x$ are different from the ones in $t$: the potential is not slowly varying in $t$ if $h \sim 1/10$. The agreement with the main theorem is very good in all cases. However, the theorem in the current version does not apply to the two bottom figures since the condition in (1.13) is not satisfied for the full time of the experiment. See also Fig. 3 and Appendix B. We have not exploited numerical experiments in a fully systematic way but the following conclusions can be deduced:
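Before listing them, here is a minimal sketch of the scheme described at the start of this subsection (FFT in $X$, integrating factor for the third-derivative term, fourth-order Runge-Kutta in $T$). It is written in Python with NumPy rather than MATLAB and is not the code of [17] or [33]; the function name `solve_mkdv_periodic` and its arguments are assumptions made for illustration, and NumPy's Fourier sign convention is used, so the linear term appears as $+ik^3\hat U$.

```python
import numpy as np

def solve_mkdv_periodic(U0, T_final, dt, B=None):
    """Integrate U_T = -d_X(U_XX + 2 U^3 - B U) on [-pi, pi), periodic in X.

    U0: samples of the initial data on X_j = -pi + 2*pi*j/N, j = 0..N-1.
    B:  optional callable B(X, T) returning an array; None means B = 0.
    Returns the grid X and the solution samples at time T_final.
    """
    N = U0.size
    X = -np.pi + 2 * np.pi * np.arange(N) / N
    k = np.fft.fftfreq(N, d=1.0 / N)        # integer wave numbers
    if N % 2 == 0:
        k[N // 2] = 0.0                      # drop the unmatched Nyquist mode
    E = np.exp(0.5j * dt * k**3)             # integrating factor over dt/2
    E2 = E * E
    g = -1j * dt * k                         # dt times the Fourier symbol of -d_X

    def nonlin(v_hat, T):
        # Fourier transform of 2 U^3 - B(X, T) U
        U = np.real(np.fft.ifft(v_hat))
        pot = B(X, T) * U if B is not None else 0.0
        return np.fft.fft(2 * U**3 - pot)

    v = np.fft.fft(U0)
    T = 0.0
    for _ in range(int(round(T_final / dt))):
        # classical RK4 applied in the frame rotating with exp(i k^3 T)
        a = g * nonlin(v, T)
        b = g * nonlin(E * (v + a / 2), T + dt / 2)
        c = g * nonlin(E * v + b / 2, T + dt / 2)
        d = g * nonlin(E2 * v + E * c, T + dt)
        v = E2 * v + (E2 * a + 2 * E * (b + c) + d) / 6
        T += dt
    return X, np.real(np.fft.ifft(v))
```

A simple check of such a solver is to take $B = 0$ and a narrow single soliton $c\,\operatorname{sech}(c(X - a))$, which should translate to $a + c^2 T$; a potential from (1.16) can be passed as, for example, `B=lambda X, T: 40*np.cos(2*X + 3 - 100*T) + 30*np.sin(X + 1 + 100*T)`. The time step has to be small enough for the explicit treatment of the cubic term to remain stable.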



• For the case covered by our theorem the agreement with the numerical solution is remarkably close; the same thing is true for times longer than $T_0/h$, with $T_0$ defined by (1.13), despite the crossings of $C_j$'s (resulting in the avoided crossing of $c_j$'s). The agreement is weaker but the experiments involve only a relatively large value of $h$.
• The soliton profile persists for long times but we see a deviation from the effective dynamics. This suggests the optimality of the bound $\log(1/h)/h$ in (1.4).
• The slow variation in $t$ required in the theorem can probably be relaxed. For instance, in the top plots in Fig. 1, $\max|\partial_t b_0| / \max|\partial_x b_0| \sim 10$, while the agreement with the effective dynamics is excellent. For longer times it does break down, as can be seen using the Bmovie.m code presented in [17, §3]. An indication that slow variation in time might be removable also comes from [2].
• When the decoupled equations (1.12) predict crossing of $C_j$'s, we observe an avoided crossing of $c_j$'s (see Fig. 3 and Fig. 6) with exponentially small width, $\exp(-1/Ch)$. At such times we also see the crossing of $a_j$'s, though it really corresponds to solitons changing their scale constants; see Fig. 7. To have multiple interactions of a pair of solitons, this type of crossing has to occur, and it needs to be investigated further.

2. Hamiltonian Structure and Conserved Quantities

The symplectic form, at first defined on $\mathcal{S}(\mathbb{R}; \mathbb{R})$, is given by

\[ \omega(u, v) \stackrel{\mathrm{def}}{=} \langle u, \partial_x^{-1} v \rangle, \qquad \langle f, g \rangle = \int f g, \qquad \text{where} \qquad \partial_x^{-1} f(x) \stackrel{\mathrm{def}}{=} \frac12 \left( \int_{-\infty}^{x} - \int_{x}^{+\infty} \right) f(y)\,dy. \tag{2.1} \]

Then the mKdV (Eq. (1.1) with $b \equiv 0$) is the Hamiltonian flow $\partial_t u = \partial_x H_0'(u)$ and (1.1) is the Hamiltonian flow $\partial_t u = \partial_x H_b'(u)$, where

\[ H_0 = \frac12 \int (u_x^2 - u^4), \qquad H_b = \frac12 \int (u_x^2 - u^4 + bu^2). \]

Solutions to mKdV have infinitely many conserved integrals and the first four are given by

\[ I_0(u) = \int u\,dx, \quad I_1(u) = \int u^2\,dx, \quad I_3(u) = \int (u_x^2 - u^4)\,dx, \quad I_5(u) = \int (u_{xx}^2 - 10 u_x^2 u^2 + 2u^6)\,dx, \]

which are the mass, momentum, energy, and second energy, respectively. In this paper we will only use these particular conserved quantities.



We write $I_j(u) = \int A_j(u)$, which means that $A_j(u)$ denotes the $j$th Hamiltonian density. For future reference, we record the expressions appearing in the Taylor expansions of these densities,

\[ A_j(q + v) = A_j(q) + A_j'(q)(v) + \frac12 A_j''(q)(v, v) + O(v^3), \tag{2.2} \]

\[ \begin{aligned}
A_1'(q)(v) &= 2qv, \\
A_3'(q)(v) &= 2q_x v_x - 4q^3 v, \\
A_5'(q)(v) &= 2q_{xx} v_{xx} - 20 q_x q^2 v_x - 20 q_x^2 q v + 12 q^5 v,
\end{aligned} \]

and

\[ \begin{aligned}
A_1''(q)(v, v) &= 2v^2, \\
A_3''(q)(v, v) &= 2v_x^2 - 12 q^2 v^2, \\
A_5''(q)(v, v) &= 2v_{xx}^2 - 20 q^2 v_x^2 - 20 q_x^2 v^2 - 80 q q_x v v_x + 60 q^4 v^2.
\end{aligned} \]

The differentials, $I_j'(q)$, are identified with functions by writing

\[ \langle I_j'(q), v \rangle = \int A_j'(q)(v). \]

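It is useful to record a formal expression for $I_j'(q)$'s valid when $A_j(q)$'s are polynomials in $\partial_x^\ell q$:

\[ I_j'(q) = \sum_{\ell \ge 0} (-\partial_x)^\ell \frac{\partial A_j(q)}{\partial q_x^{(\ell)}}, \qquad q_x^{(\ell)} = \partial_x^\ell q. \tag{2.3} \]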

The Hessians, $I_j''(q)$, are the (self-adjoint) operators given by $\langle I_j''(q)v, v \rangle = \int A_j''(q)(v, v)$. One way to generate the mKdV energies is as follows (see Olver [28]). Let us put

\[ \Lambda(u) = -\partial_x^2 - 4u^2 - 4u_x \partial_x^{-1} u, \]

and recall that $\Lambda(u)\partial_x$ is skew-adjoint:

\[ \Lambda(u)\partial_x = -\partial_x^3 - 4u^2\partial_x - 4u_x \partial_x^{-1} u \partial_x = -\partial_x^3 - 4u^2\partial_x - 4u_x u + 4u_x \partial_x^{-1} u_x, \]

where we used the formal integration by parts $\partial_x^{-1}(u f_x) = -\partial_x^{-1}(u_x f) + u f$. With this notation we have the fundamental recursive identity:

\[ \partial_x I_{2k+1}'(u) = \Lambda(u)\partial_x I_{2k-1}'(u), \]

which together with skew-adjointness of $\Lambda(u)\partial_x$ shows that

\[ \langle I_j'(u), \partial_x I_k'(u) \rangle = \langle I_{j-2}'(u), \partial_x I_{k+2}'(u) \rangle, \tag{2.4} \]



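for $j$ and $k$ odd (if we use (2.4) with $m$ even the choice $I_{2m}'(u) = 0$, for $m > 0$, is consistent). By iteration this shows that

\[ \langle I_j'(u), \partial_x I_k'(u) \rangle = 0, \quad \forall j, k. \tag{2.5} \]

In fact, since $j$ and $k$ are odd we can iterate all the way down to $j = 1$ and apply (2.3), using $I_1'(u) = 2u$:

\[ \langle I_1'(u), \partial_x I_{k+j-1}'(u) \rangle = -2\sum_{\ell \ge 0} \Big\langle \partial_x u_x^{(\ell)}, \frac{\partial A_{j+k-1}(u)}{\partial u_x^{(\ell)}} \Big\rangle = -2\int \partial_x (A_{j+k-1}(u))\,dx = 0. \]

If $u$ solves mKdV, then $\partial_t u = \frac12 \partial_x I_3'(u)$ and hence by (2.5) we obtain

\[ \partial_t I_j(u) = \langle I_j'(u), \partial_t u \rangle = \frac12 \langle I_j'(u), \partial_x I_3'(u) \rangle = 0. \]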
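As a check of the recursive identity in its first instance ($k = 1$), one can compute directly. Here $\Lambda(u)$ denotes the operator $-\partial_x^2 - 4u^2 - 4u_x\partial_x^{-1}u$ introduced above, and the only inputs are $I_1'(u) = 2u$, $I_3'(u) = -2u_{xx} - 4u^3$ (both read off from (2.2)) and $\partial_x^{-1}(2uu_x) = u^2$ for Schwartz $u$. This is a worked verification added for illustration, not part of the original argument.

```latex
% Check of \partial_x I_3'(u) = \Lambda(u)\,\partial_x I_1'(u), with I_1'(u) = 2u.
\Lambda(u)\,\partial_x (2u)
  = -2u_{xxx} - 8u^2 u_x - 4u_x\,\partial_x^{-1}(2 u u_x)
  = -2u_{xxx} - 8u^2 u_x - 4u_x u^2
  = \partial_x\bigl(-2u_{xx} - 4u^3\bigr)
  = \partial_x I_3'(u).
```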

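The following identities related to the conservation laws will be needed in § 9. Recalling the definition (2.2) of $A_j$, we have:

Lemma 2.1. For any function $u \in \mathcal{S}$, and for $b \in C^\infty \cap \mathcal{S}'$, we have

\[ \begin{aligned}
\langle I_1'(u), (bu)_x \rangle &= \langle b_x, A_1(u) \rangle, \\
\langle I_3'(u), (bu)_x \rangle &= 3\langle b_x, A_3(u) \rangle - \langle b_{xxx}, A_1(u) \rangle, \\
\langle I_5'(u), (bu)_x \rangle &= 5\langle b_x, A_5(u) \rangle - 5\langle b_{xxx}, A_3(u) \rangle + \langle b_{xxxxx}, A_1(u) \rangle.
\end{aligned} \]

Proof. By taking arbitrary $b \in \mathcal{S}$, we see that the claimed formulae are equivalent to

\[ \begin{aligned}
u\,\partial_x I_1'(u) &= \partial_x A_1(u), \\
u\,\partial_x I_3'(u) &= 3\partial_x A_3(u) - \partial_x^3 A_1(u), \\
u\,\partial_x I_5'(u) &= 5\partial_x A_5(u) - 5\partial_x^3 A_3(u) + \partial_x^5 A_1(u),
\end{aligned} \]

and these can be checked by direct computation.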
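For instance, the first two pointwise identities in the proof of Lemma 2.1 can be verified as follows, using $A_1(u) = u^2$, $A_3(u) = u_x^2 - u^4$, $I_1'(u) = 2u$ and $I_3'(u) = -2u_{xx} - 4u^3$. This is a worked check added for illustration; the fifth-order identity is handled in the same way but is longer.

```latex
u\,\partial_x I_1'(u) = 2 u u_x = \partial_x (u^2) = \partial_x A_1(u),
\\[4pt]
u\,\partial_x I_3'(u) = -2 u u_{xxx} - 12 u^3 u_x,
\\[4pt]
3\,\partial_x A_3(u) - \partial_x^3 A_1(u)
  = \bigl(6 u_x u_{xx} - 12 u^3 u_x\bigr) - \bigl(6 u_x u_{xx} + 2 u u_{xxx}\bigr)
  = -2 u u_{xxx} - 12 u^3 u_x .
```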

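Lemma 2.2. For any functions $v, q \in \mathcal{S}$, and for $b \in C^\infty \cap \mathcal{S}'$, we have

\[ \begin{aligned}
\langle I_1''(q)v, (bq)_x \rangle - \langle \partial_x I_1'(q), bv \rangle &= \langle b_x, A_1'(q)(v) \rangle, \\
\langle I_3''(q)v, (bq)_x \rangle - \langle \partial_x I_3'(q), bv \rangle &= 3\langle b_x, A_3'(q)(v) \rangle - \langle b_{xxx}, A_1'(q)(v) \rangle, \\
\langle I_5''(q)v, (bq)_x \rangle - \langle \partial_x I_5'(q), bv \rangle &= 5\langle b_x, A_5'(q)(v) \rangle - 5\langle b_{xxx}, A_3'(q)(v) \rangle + \langle b_{xxxxx}, A_1'(q)(v) \rangle.
\end{aligned} \]

Proof. Differentiate the formulae in Lemma 2.1 with respect to $u$ at $q$ in the direction of $v$.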



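3. Double Soliton Profile and Properties

Here we record some properties of mKdV and its double soliton solutions. The parametrization of the family of double solitons follows the presentation for NLS in Faddeev-Takhtajan [12]. The double-soliton is defined in terms of the profile $q(x, a, c)$, where $a = (a_1, a_2) \in \mathbb{R}^2$, $c = (c_1, c_2) \in \mathbb{R}^2 \setminus \mathcal{C}$,

\[ \mathcal{C} \stackrel{\mathrm{def}}{=} \{(c_1, c_2) : c_1 = \pm c_2\} \cup \mathbb{R} \times \{0\} \cup \{0\} \times \mathbb{R}. \tag{3.1} \]

The profile $q = q_2$ (from now on we drop the subscript 2) is defined by

\[ q(x, a, c) = \frac{\det M_1}{\det M}, \tag{3.2} \]

where

\[ M = [M_{ij}]_{1 \le i, j \le 2}, \qquad M_{ij} = \frac{1 + \gamma_i \gamma_j}{c_i + c_j}, \qquad M_1 = \begin{bmatrix} M & \begin{matrix} \gamma_1 \\ \gamma_2 \end{matrix} \\ \begin{matrix} 1 & 1 \end{matrix} & 0 \end{bmatrix}, \]

and $\gamma_j = (-1)^{j-1} \exp(-c_j(x - a_j))$, $j = 1, 2$.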
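The determinant formula (3.2) is easy to evaluate directly. The sketch below is an illustration under stated assumptions, not the authors' code: the names `double_soliton` and `integrate` are hypothetical, the $3 \times 3$ determinant is expanded by hand along its last row, and plain floating point is used, so very large values of $|c_j (x - a_j)|$ would overflow.

```python
from math import exp, pi

def double_soliton(x, a, c):
    """Evaluate q(x, a, c) = det M1 / det M from (3.2) at a single point x."""
    g1 = exp(-c[0] * (x - a[0]))            # gamma_1
    g2 = -exp(-c[1] * (x - a[1]))           # gamma_2 carries the sign (-1)^(j-1)
    m11 = (1 + g1 * g1) / (2 * c[0])
    m22 = (1 + g2 * g2) / (2 * c[1])
    m12 = (1 + g1 * g2) / (c[0] + c[1])
    det_m = m11 * m22 - m12 * m12
    # cofactor expansion of the 3x3 determinant along its last row (1, 1, 0)
    det_m1 = (m12 - m11) * g2 + (m12 - m22) * g1
    return det_m1 / det_m

def integrate(f, lo, hi, n):
    """Composite trapezoidal rule with n subintervals."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step
```

Two quick uses: the rescaling $q(tx, ta, c/t) = q(x, a, c)/t$ can be checked pointwise, since it leaves each $\gamma_j$ unchanged and multiplies $M$ by $t$; and for well separated positions, `integrate(lambda x: double_soliton(x, a, c)**2, lo, hi, n)` should be close to $2(c_1 + c_2)$, the value of $\int \eta^2$ summed over the two single solitons.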

For convenience we will consider the $0 < c_1 < c_2$ connected component of $\mathbb{R}^2 \setminus \mathcal{C}$ throughout the paper. Since

\[ q(x, a_1, a_2, c_1, c_2) = -q(x, a_2, a_1, c_2, c_1), \qquad q(x, a_1, a_2, -c_1, -c_2) = -q(-x, -a_1, -a_2, c_1, c_2), \]

the only other component to consider would be, say, $0 < -c_1 < c_2$ (see Fig. 2), and the analysis is similar. We should however mention that in numerical experiments it is more useful to introduce a phase parameter $\epsilon = (\epsilon_1, \epsilon_2)$, $\epsilon_j = \pm 1$, and define $\tilde q(x, a, c, \epsilon)$ by (3.2) but with $\gamma_j$'s replaced by

\[ \tilde\gamma_j = (-1)^{j-1} \epsilon_j \exp(-c_j(x - a_j)), \qquad j = 1, 2. \]

We can then check that $\tilde q(x, a, c, \epsilon) = q(x, a, (\epsilon_1 c_1, \epsilon_2 c_2))$, but $\tilde q$ seems more stable in numerical calculations. The corresponding double-soliton

\[ u(x, t) = q(x, a_1 + c_1^2 t, a_2 + c_2^2 t, c_1, c_2) \tag{3.3} \]

is an exact solution to mKdV. For the double soliton this can be checked by an explicit calculation, but it is a consequence of the inverse scattering method. This is the only



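place in this paper where we appeal directly to the inverse scattering method. Figure 4 illustrates some aspects of this evolution. The scaling properties of mKdV imply that

\[ q(x + t, a + (t, t), c) = q(x, a, c), \qquad q(tx, ta, c/t) = q(x, a, c)/t. \tag{3.4} \]

Both properties also follow from the formula for $q$, with the second one being slightly less obvious:

\[ \begin{aligned}
q(tx, ta, c/t) &= \frac{1}{\det tM} \det \begin{bmatrix} tM & \begin{matrix} \gamma_1 \\ \gamma_2 \end{matrix} \\ \begin{matrix} 1 & 1 \end{matrix} & 0 \end{bmatrix} \\
&= \frac{1}{\det tM} \det \left( \begin{bmatrix} t & 0 & 0 \\ 0 & t & 0 \\ 0 & 0 & 1 \end{bmatrix} M_1 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1/t \end{bmatrix} \right) \\
&= q(x, a, c)/t.
\end{aligned} \]

Now we discuss in more detail the properties of the profile $q$. Recalling that we suppose that $c_2 > c_1 > 0$, let

\[ \alpha_1 \stackrel{\mathrm{def}}{=} \frac{1}{c_1} \log \frac{c_1 + c_2}{c_2 - c_1}, \qquad \alpha_2 \stackrel{\mathrm{def}}{=} \frac{1}{c_2} \log \frac{c_2 - c_1}{c_1 + c_2}, \tag{3.5} \]

noting that for $c_2 > c_1 > 0$, $\alpha_1 > 0$ and $\alpha_2 < 0$. Fix a smooth function, $\theta \in C^\infty(\mathbb{R}, [-1, 1])$, such that

\[ \theta(s) = \begin{cases} 1 & \text{for } s \le -1, \\ -1 & \text{for } s \ge 1. \end{cases} \tag{3.6} \]

Define the shifted positions as

\[ \hat a_j \stackrel{\mathrm{def}}{=} a_j + \alpha_j \theta(a_2 - a_1), \tag{3.7} \]

that is,

\[ \hat a_j = \begin{cases} a_j + \alpha_j, & a_2 \ll a_1, \\ a_j - \alpha_j, & a_2 \gg a_1, \end{cases} \]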
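see Fig. 5. We note that $\hat a_j = \hat a_j(a_j, c_1, c_2)$. Let $\mathcal{S}$ denote the Schwartz space. We will next introduce function classes $\mathcal{S}_{\mathrm{sol}}$ and $\mathcal{S}_{\mathrm{err}}$, and then show that $q \in \mathcal{S}_{\mathrm{sol}}$ and give an approximate expression for $q$ with error in $\mathcal{S}_{\mathrm{err}}$.

Definition 3.1. Let $\mathcal{S}_{\mathrm{err}}$ denote the class of functions, $\varphi = \varphi(x, a, c)$, $x \in \mathbb{R}$, $a \in \mathbb{R}^2$, $0 < \delta < c_1 < c_2 - \delta < 1/\delta$ (for any fixed $\delta$), satisfying

\[ \bigl| \partial_x^\ell \partial_c^k \partial_a^p \varphi \bigr| \le C_2 \exp(-(|x - a_1| + |x - a_2|)/C_1), \]

where $C_j$ depend on $\delta$, $\ell$, $k$, and $p$ only. Let $\mathcal{S}_{\mathrm{sol}}$ denote the class of functions of $(x, a, c)$ of the form

\[ p_1(c_1, c_2)\varphi_1(c_1(x - \hat a_1)) + p_2(c_1, c_2)\varphi_2(c_2(x - \hat a_2)) + \varphi(x, a, c), \]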



Fig. 5. The top plots show q(x, 3, 5, ∓3, ±3), the corresponding η(x, aˆ j , c j ) given by Lemma 3.2. The bottom plots show the post-interaction pictures at times ∓0.75. Since the sign of a2 − a1 changes after the interaction we see the shift compared to the evolution of η(x, aˆ j , c j )’s.

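where
(1) $|\partial_k^\ell \varphi_j(k)| \le C_\ell \exp(-|k|/C)$, for some $C$,
(2) $p_j \in C^\infty(\mathbb{R}^2 \setminus \mathcal{C})$,
(3) $\varphi \in \mathcal{S}_{\mathrm{err}}$.

Some elementary properties of $\mathcal{S}_{\mathrm{sol}}$ and $\mathcal{S}_{\mathrm{err}}$ are given in the following.

Lemma 3.1 (properties of $\mathcal{S}_{\mathrm{err}}$).
(1) $\partial_x \mathcal{S}_{\mathrm{err}} \subset \mathcal{S}_{\mathrm{err}}$, $\partial_{a_j} \mathcal{S}_{\mathrm{err}} \subset \mathcal{S}_{\mathrm{err}}$, $\partial_{c_j} \mathcal{S}_{\mathrm{err}} \subset \mathcal{S}_{\mathrm{err}}$.
(2) $(x - a_j)\mathcal{S}_{\mathrm{err}} \subset \mathcal{S}_{\mathrm{err}}$ and $(x - \hat a_j)\mathcal{S}_{\mathrm{err}} \subset \mathcal{S}_{\mathrm{err}}$.
(3) If $f \in \mathcal{S}_{\mathrm{err}}$ and $\int_{-\infty}^{+\infty} f = 0$, then $\partial_x^{-1} f \in \mathcal{S}_{\mathrm{err}}$.

The class $\mathcal{S}_{\mathrm{err}}$ allows us to formulate the following.

Lemma 3.2 (Asymptotics for $q$). Suppose that $0 < c_1 < c_2 < c_1/\epsilon < 1/\epsilon^2$, for $\epsilon > 0$. Then for $|a_2 - a_1| \ge C_0/(c_1 + c_2)$,

\[ \Bigl| \partial_x^\ell \partial_c^k \partial_a^p \Bigl( q(x, a, c) - \sum_{j=1}^{2} \eta(x, \hat a_j, c_j) \Bigr) \Bigr| \le C_2 \exp(-(|x - a_1| + |x - a_2|)/C_1), \tag{3.8} \]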



where $C_2$ depends on $k$, $\ell$, $p$ and $\epsilon$, and $C_0$, $C_1$ on $\epsilon$ only. In other words,

\[ q(x, a, c) - \sum_{j=1}^{2} \eta(x, \hat a_j, c_j) \in \mathcal{S}_{\mathrm{err}}. \]

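Corollary 3.3. $\partial_x^{-1}\partial_{a_j} q,\ \partial_x^{-1}\partial_{c_j} q \in \mathcal{S}_{\mathrm{sol}}$.

Proof. By Lemma 3.2, we have

\[ \partial_{c_j} q = \partial_{c_j} \sum_{k=1}^{2} \eta(\cdot, \hat a_k, c_k) + f, \]

where $f \in \mathcal{S}_{\mathrm{err}}$. By direct computation with the $\eta$ terms, we find that

\[ \int_{-\infty}^{+\infty} \partial_{c_j} \sum_{k=1}^{2} \eta(\cdot, \hat a_k, c_k) = 0. \]

By the remark in Lemma 3.5, we have $\int_{-\infty}^{+\infty} \partial_{c_j} q = 0$. Hence $\int_{-\infty}^{+\infty} f = 0$. By Lemma 3.1(3), we have $\partial_x^{-1} f \in \mathcal{S}_{\mathrm{err}}$. Hence

\[ \partial_x^{-1}\partial_{c_j} q = \partial_x^{-1}\partial_{c_j} \sum_{k=1}^{2} \eta(\cdot, \hat a_k, c_k) + \mathcal{S}_{\mathrm{err}}, \]

and the right side is clearly in $\mathcal{S}_{\mathrm{sol}}$.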

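Proof of Lemma 3.2. We define

\[ Q(x, \alpha, \delta) \stackrel{\mathrm{def}}{=} q(x, -\alpha, \alpha, 1 - \delta, 1 + \delta), \tag{3.9} \]

so that, using (3.4),

\[ q(x, a_1, a_2, c_1, c_2) = \frac{c_1 + c_2}{2}\, Q\Bigl( \frac{c_1 + c_2}{2}\Bigl( x - \frac{a_1 + a_2}{2} \Bigr), \alpha, \delta \Bigr), \qquad \alpha = \frac{c_1 + c_2}{2} \cdot \frac{a_2 - a_1}{2}, \quad \delta = \frac{c_2 - c_1}{c_2 + c_1}. \tag{3.10} \]

Hence it is enough to study the more symmetric expression (3.9). We decompose it in the same spirit as the decomposition of double solitons for KdV was performed in [4]:

\[ Q(x, \alpha, \delta) = \tau(x, \alpha, \delta) + \tau(-x, -\alpha, \delta), \tag{3.11} \]

where

\[ \tau(x, \alpha, \delta) = \frac12\, \frac{(1 + \delta)\exp((1 - \delta)(x + \alpha)) + (1 - \delta)\exp((1 + \delta)(x - \alpha))}{\delta \sinh^2(x - \delta\alpha) + \delta^{-1}\cosh^2(\delta x - \alpha)}. \tag{3.12} \]

This follows from a straightforward but tedious calculation which we omit. Thus, to show (3.8) we have to show that

\[ \bigl| \partial_x^\ell \partial_\alpha^p \partial_\delta^k \bigl( \tau(x, \alpha, \delta) - \eta(x - |\alpha| - \log(1/\delta)/(1 \pm \delta),\, 1 \pm \delta) \bigr) \bigr| \le C_2 \exp(-(|x| + |\alpha|)/C_1), \qquad \pm\alpha \gg 1, \tag{3.13} \]

uniformly for $0 < \delta \le 1 - \epsilon$.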



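To see this put $\gamma = (1 - \delta)/(1 + \delta)$, and multiply the numerator and denominator of (3.12) by $e^{-(1+\delta)(x-\alpha)}$:

\[ \tau(x, \alpha, \delta) = \frac{2(1 - \delta)\bigl(1 + \gamma^{-1} e^{2\alpha - 2\delta x}\bigr)}{\delta e^{(1-\delta)(x+\alpha)}(1 - e^{-2x + 2\delta\alpha})^2 + \delta^{-1} e^{-(1-\delta)(x+\alpha)}(1 + e^{-2\delta x + 2\alpha})^2}. \tag{3.14} \]

Similarly, the multiplication by $e^{-(1-\delta)(x+\alpha)}$ gives

\[ \begin{aligned}
\tau(x, \alpha, \delta) &= \frac{2(1 + \delta)\bigl(1 + \gamma e^{-2\alpha + 2\delta x}\bigr)}{\delta e^{(1+\delta)(x-\alpha)}(1 - e^{-2x + 2\delta\alpha})^2 + \delta^{-1} e^{-(1+\delta)(x-\alpha)}(1 + e^{2\delta x - 2\alpha})^2} \\
&= \frac{2(1 + \delta)\bigl(1 + \gamma e^{-2\alpha + 2\delta x}\bigr)(1 + e^{2\delta x - 2\alpha})^{-2}}{\delta e^{(1+\delta)(x-\alpha)}\bigl((1 - e^{-2x + 2\delta\alpha})/(1 + e^{2\delta x - 2\alpha})\bigr)^2 + \delta^{-1} e^{-(1+\delta)(x-\alpha)}}.
\end{aligned} \tag{3.15} \]

This shows that for negative values of $x$, $\tau$ is negligible: multiplying the numerator and denominator by $\delta$ and using (3.14) for $\alpha \le 0$ and (3.15) for $\alpha \ge 0$, gives

\[ \tau(x, \alpha, \delta) \le \begin{cases} \delta(1 + \delta)\bigl(1 + e^{-2(|\alpha| + \delta|x|)}\bigr)\, e^{-(1+\delta)(|x| + |\alpha|)}, & \alpha \ge 0, \\ \delta(1 + \delta)\bigl(1 + e^{2\delta|x| - 2|\alpha|}\bigr)^{-1} e^{-(1-\delta)(|x| + |\alpha|)}, & \alpha \le 0, \end{cases} \tag{3.16} \]

and in fact this is valid uniformly for $0 \le \delta \le 1$. Similar estimates hold also for derivatives. For $x \ge 0$, $0 \le \delta \le 1 - \epsilon$, and for $\alpha \ll -1$, we use (3.14) to obtain

\[ \tau(x, \alpha, \delta) = (1 - \delta)\operatorname{sech}\Bigl( (1 - \delta)\Bigl( x - |\alpha| - \frac{1}{1 - \delta}\log\frac{1}{\delta} \Bigr) \Bigr) + \varepsilon_-(x, \alpha, \delta), \]

and for $\alpha \gg 1$, (3.15):

\[ \tau(x, \alpha, \delta) = (1 + \delta)\operatorname{sech}\Bigl( (1 + \delta)\Bigl( x - |\alpha| - \frac{1}{1 + \delta}\log\frac{1}{\delta} \Bigr) \Bigr) + \varepsilon_+(x, \alpha, \delta), \]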

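where $|\partial_x^k \varepsilon_\pm| \le C_k \exp(-(|x| + |\alpha|)/c)$, $c > 0$, uniformly in $\delta$, $0 < \delta < 1 - \epsilon$. Inserting the resulting decomposition into (3.10) completes the proof.

Lemma 3.4 (Fundamental identities for $q$). With $q = q(\cdot, a, c)$, we have

\[ \partial_x I_3'(q) = 2\partial_x(-\partial_x^2 q - 2q^3) = 2\sum_{j=1}^{2} c_j^2\, \partial_{a_j} q, \tag{3.17} \]

\[ \partial_x I_1'(q) = 2\partial_x q = -2\sum_{j=1}^{2} \partial_{a_j} q, \tag{3.18} \]

\[ q = \sum_{j=1}^{2} (x - a_j)\,\partial_{a_j} q + \sum_{j=1}^{2} c_j\, \partial_{c_j} q. \tag{3.19} \]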
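The identity (3.19) can be obtained from the two relations in (3.4) by differentiating in $t$. The following worked steps are added for illustration and use only (3.4).

```latex
% Differentiate q(tx, ta, c/t) = q(x, a, c)/t in t and set t = 1:
x\,\partial_x q + \sum_{j=1}^{2} a_j\,\partial_{a_j} q - \sum_{j=1}^{2} c_j\,\partial_{c_j} q = -q .
\\[4pt]
% Differentiate q(x + t, a + (t,t), c) = q(x, a, c) in t and set t = 0; this is (3.18) divided by 2:
\partial_x q = -\sum_{j=1}^{2} \partial_{a_j} q .
\\[4pt]
% Substitute the second relation for \partial_x q in the first and rearrange:
q = \sum_{j=1}^{2} (x - a_j)\,\partial_{a_j} q + \sum_{j=1}^{2} c_j\,\partial_{c_j} q .
```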



These three identities are analogues of the following three identities for the single-soliton $\eta = \eta(\cdot, a, c)$, which are fairly easily verified by direct inspection (recall that $I_1'(\eta) = 2\eta$ and $I_3'(\eta) = -2\partial_x^2\eta - 4\eta^3$):

\[ \partial_x I_1'(\eta) = 2\partial_x \eta = -2\partial_a \eta, \qquad \partial_x I_3'(\eta) = 2\partial_x(-\partial_x^2 \eta - 2\eta^3) = 2c^2 \partial_a \eta, \qquad \eta = (x - a)\partial_a \eta + c\,\partial_c \eta. \]

Proof. The first identity is just the statement that (3.3) solves mKdV and we take it on faith from the inverse scattering method (or verify it by a computation). To see (3.18) and (3.19) we differentiate (3.4) with respect to $t$.

The value of $I_j(q)$ for all $j$ is recorded in the next lemma.

Lemma 3.5 (Values of $I_j(q)$).

\[ I_0(q) = 2\pi. \tag{3.20} \]

For $j = 1, 3, 5$, we have

\[ I_j(q) = 2(-1)^{\frac{j-1}{2}}\, \frac{c_1^j + c_2^j}{j}. \tag{3.21} \]

Also,

\[ \int x\, q(x, a, c)^2\,dx = 2a_1 c_1 + 2a_2 c_2. \tag{3.22} \]

Note that by (3.20),

\[ \int_{-\infty}^{+\infty} \partial_{a_j} q = 0, \qquad \int_{-\infty}^{+\infty} \partial_{c_j} q = 0, \qquad j = 1, 2, \]

from which it follows that $\partial_x^{-1}(\partial_{a_j} q)$ and $\partial_x^{-1}(\partial_{c_j} q)$ are Schwartz class functions.

Proof. We prove (3.21), (3.20) by reduction to the 1-soliton case. Let $u(t) = q(\cdot, a_1 + tc_1^2, a_2 + tc_2^2, c_1, c_2)$. Then by the asymptotics in Lemma 3.2,

\[ I_j(q) = I_j(u(0)) = I_j(u(t)) = \sum_{k=1}^{2} I_j\bigl(\eta(\cdot, \widehat{a_k + c_k^2 t}, c_k)\bigr) + \omega(t), \]

where $|\omega(t)| \lesssim \langle c_2((a_1 + tc_1^2) - (a_2 + tc_2^2)) \rangle^{-2}$. But note that by scaling,

\[ I_j\bigl(\eta(\cdot, \widehat{a_k + c_k^2 t}, c_k)\bigr) = c_k^j\, I_j(\eta). \]

By sending $t \to +\infty$, we find that

\[ I_j(q) = (c_1^j + c_2^j)\, I_j(\eta). \]


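To compute $I_j(\eta)$, we let $\eta_c(x) = c\eta(cx)$. By scaling $I_j(\eta_c) = c^j I_j(\eta)$. Hence

\[ j I_j(\eta) = \partial_c|_{c=1} I_j(\eta_c) = \langle I_j'(\eta), \partial_c|_{c=1}\eta_c \rangle = \langle I_j'(\eta), (x\eta)_x \rangle = 2(-1)^{\frac{j-1}{2}} \langle \eta, (x\eta)_x \rangle = 2(-1)^{\frac{j-1}{2}}, \]

where we have used the identity

\[ I_j'(\eta) = 2(-1)^{\frac{j-1}{2}} \eta, \tag{3.23} \]

which follows from the energy hierarchy. In fact, $I_1'(\eta) = 2\eta$ is just the definition of $I_1$. Assuming that $I_j'(\eta) = 2(-1)^{\frac{j-1}{2}} \eta$, we compute

\[ \begin{aligned}
\partial_x I_{j+2}'(\eta) = \Lambda(\eta)\partial_x I_j'(\eta) &= 2(-1)(-1)^{\frac{j-1}{2}} (\partial_x^2 + 4\eta^2 + 4\eta_x \partial_x^{-1}\eta)\eta_x \\
&= 2(-1)^{\frac{j+1}{2}} \partial_x(\eta_{xx} + 2\eta^3) \\
&= 2(-1)^{\frac{j+1}{2}} \partial_x \eta.
\end{aligned} \]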
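The resulting values $I_j(\eta) = 2(-1)^{\frac{j-1}{2}}/j$, that is $2$, $-2/3$ and $2/5$ for $j = 1, 3, 5$, can be confirmed numerically. The sketch below is an independent check added for illustration (the function name `soliton_integrals` is hypothetical); it applies the trapezoidal rule to the densities $A_1$, $A_3$, $A_5$ of Section 2 evaluated on $\eta(x) = \operatorname{sech} x$.

```python
from math import cosh, tanh

def soliton_integrals(L=30.0, n=60000):
    """Trapezoidal approximations of I_1, I_3, I_5 for eta(x) = sech x on [-L, L]."""
    step = 2 * L / n
    I1 = I3 = I5 = 0.0
    for i in range(n + 1):
        x = -L + i * step
        w = 0.5 if i in (0, n) else 1.0        # trapezoid end-point weights
        s = 1.0 / cosh(x)                      # eta
        sx = -s * tanh(x)                      # eta_x
        sxx = s - 2 * s**3                     # eta_xx, from -eta + eta'' + 2 eta^3 = 0
        I1 += w * s**2
        I3 += w * (sx**2 - s**4)
        I5 += w * (sxx**2 - 10 * sx**2 * s**2 + 2 * s**6)
    return I1 * step, I3 * step, I5 * step
```

Since the integrands decay exponentially, truncating to $[-30, 30]$ changes the integrals by far less than the rounding error of the sum.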

We now prove (3.22). By direct computation, if $u(t)$ solves mKdV, then $\partial_t \int x u^2 = -3I_3(u)$. Again let $u(t) = q(\cdot, a_1 + tc_1^2, a_2 + tc_2^2, c_1, c_2)$. By (3.21) with $j = 3$, we have

\[ \int x\,q(x, a, c)^2\,dx = \int x\,u(0, x)^2\,dx = \int x\,u(t, x)^2\,dx - 2(c_1^3 + c_2^3)t. \]

By the asymptotics in Lemma 3.2,

\[ \int x\,u(t, x)^2 = \sum_{j=1}^{2} \int x\,\eta(x, \widehat{a_j + tc_j^2}, c_j)^2 + \omega(t), \]

where $|\omega(t)| \le \langle a_1 + tc_1^2 \rangle \langle c_2((a_1 + c_1^2 t) - (a_2 + tc_2^2)) \rangle^{-2}$. But

\[ \int x\,\eta(x, \hat a_j, c_j)^2 = 2c_j \hat a_j. \]

Combining, and using that $c_1 \hat a_1 + c_2 \hat a_2 = c_1 a_1 + c_2 a_2$, we obtain

\[ \int x\,q(x, a, c)^2\,dx = 2(c_1 a_1 + c_2 a_2) + \omega(t). \]

Send $t \to +\infty$ to obtain the result.

We define the four-dimensional manifold of 2-solitons M as M = { q(·, a, c) | a = (a1 , a2 ) ∈ R2 , c = (c1 , c2 ) ∈ (R)2 \C }.



Lemma 3.6. The symplectic form (2.1) restricted to the manifold of 2-solitons is given by

\[ \omega|_M = \sum_{j=1}^{2} da_j \wedge dc_j. \]

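In particular, it is nondegenerate and $M$ is a symplectic manifold.

Proof. By (3.21) with $j = 1$ and (3.18),

\[ 0 = \frac12 \partial_{a_1} I_1(q) = \frac12 \langle I_1'(q), \partial_{a_1} q \rangle = \langle \partial_{a_1} q, \partial_x^{-1}\partial_{a_1} q \rangle + \langle \partial_{a_2} q, \partial_x^{-1}\partial_{a_1} q \rangle = \langle \partial_{a_2} q, \partial_x^{-1}\partial_{a_1} q \rangle. \]

Again by (3.21) with $j = 1$ and (3.18),

\[ 1 = \frac12 \partial_{c_1} I_1(q) = \frac12 \langle I_1'(q), \partial_{c_1} q \rangle = \langle \partial_{a_1} q, \partial_x^{-1}\partial_{c_1} q \rangle + \langle \partial_{a_2} q, \partial_x^{-1}\partial_{c_1} q \rangle. \tag{3.24} \]

By (3.21) with $j = 3$ and (3.17),

\[ -c_1^2 = \frac12 \partial_{c_1} I_3(q) = \frac12 \langle I_3'(q), \partial_{c_1} q \rangle = -c_1^2 \langle \partial_{a_1} q, \partial_x^{-1}\partial_{c_1} q \rangle - c_2^2 \langle \partial_{a_2} q, \partial_x^{-1}\partial_{c_1} q \rangle. \tag{3.25} \]

Solving (3.24) and (3.25), we obtain that $\langle \partial_{a_1} q, \partial_x^{-1}\partial_{c_1} q \rangle = 1$ and $\langle \partial_{a_2} q, \partial_x^{-1}\partial_{c_1} q \rangle = 0$. We similarly obtain that $\langle \partial_{a_2} q, \partial_x^{-1}\partial_{c_2} q \rangle = 1$ and $\langle \partial_{a_1} q, \partial_x^{-1}\partial_{c_2} q \rangle = 0$. It remains to show that $\langle \partial_{c_1} q, \partial_x^{-1}\partial_{c_2} q \rangle = 0$:

\[ \begin{aligned}
\langle \partial_{c_1} q, \partial_x^{-1}\partial_{c_2} q \rangle &= \frac{1}{c_1} \sum_{j=1}^{2} \langle c_j \partial_{c_j} q, \partial_x^{-1}\partial_{c_2} q \rangle \\
&= \frac{1}{c_1} \Bigl\langle q - \sum_{j=1}^{2} (x - a_j)\partial_{a_j} q,\ \partial_x^{-1}\partial_{c_2} q \Bigr\rangle && \text{by (3.19)} \\
&= \frac{1}{c_1} \langle q + x q_x, \partial_x^{-1}\partial_{c_2} q \rangle + \frac{1}{c_1} \sum_{j=1}^{2} a_j \langle \partial_{a_j} q, \partial_x^{-1}\partial_{c_2} q \rangle && \text{by (3.18)} \\
&= -\frac{1}{2c_1} \partial_{c_2} \int x q^2 + \frac{a_2}{c_1} && \text{by (3.22)} \\
&= 0.
\end{aligned} \]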

Remark. If |a1 − a2 | 2, and c1 < c2 then, in the notation of (3.7), da j ∧ dc j = d aˆ j ∧ dc j , j=1,2

j=1,2

that is the map (a, c) → (a, ˆ c) is symplectic.



The nondegeneracy of the symplectic form (2.1) restricted to the manifold of 2-solitons, $M$, shows that $H^2$ functions close to $M$ can be uniquely decomposed into an element $q$ of $M$ and a function symplectically orthogonal to $T_q M$. We recall this standard fact in the following

Lemma 3.7 (Symplectic orthogonal decomposition). Given $\tilde c$, there exist constants $\delta > 0$, $C > 0$ such that the following holds. If $u = q(\cdot, \tilde a, \tilde c) + \tilde v$ with $\|\tilde v\|_{H^2} \le \delta$, then there exist unique $a$, $c$ such that

\[ |a - \tilde a| \le C\|\tilde v\|_{H^2}, \qquad |c - \tilde c| \le C\|\tilde v\|_{H^2}, \]

and $v \stackrel{\mathrm{def}}{=} u - q(\cdot, a, c)$ satisfies

\[ \langle v, \partial_x^{-1}\partial_{a_j} q \rangle = 0 \quad \text{and} \quad \langle v, \partial_x^{-1}\partial_{c_j} q \rangle = 0, \qquad j = 1, 2. \tag{3.26} \]

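Proof. Let $\varphi : H^2 \times \mathbb{R}^2 \times (\mathbb{R}_+)^2 \to \mathbb{R}^4$ be defined by

\[ \varphi(u, a, c) = \begin{bmatrix} \langle u - q(\cdot, a, c), \partial_x^{-1}\partial_{a_1} q \rangle \\ \langle u - q(\cdot, a, c), \partial_x^{-1}\partial_{a_2} q \rangle \\ \langle u - q(\cdot, a, c), \partial_x^{-1}\partial_{c_1} q \rangle \\ \langle u - q(\cdot, a, c), \partial_x^{-1}\partial_{c_2} q \rangle \end{bmatrix}. \]

Using that $\omega|_M = da_1 \wedge dc_1 + da_2 \wedge dc_2$, we compute the Jacobian matrix of $\varphi$ with respect to $(a, c)$ at $(q(\cdot, \tilde a, \tilde c), \tilde a, \tilde c)$ to be

\[ D_{a,c}\varphi(q(\cdot, \tilde a, \tilde c), \tilde a, \tilde c) = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}. \]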

By the implicit function theorem, the equation $\varphi(u, a, c) = 0$ can be solved for $(a, c)$ in terms of $u$ in a neighbourhood of $q(\cdot, \tilde a, \tilde c)$.

We also record the following lemma which will be useful in the next section:

Lemma 3.8. Suppose $v$ solves a linearized equation

\[ \partial_t v = \frac12 \partial_x I_3''(q(t))v = \partial_x(-\partial_x^2 - 6q(t)^2)v, \qquad q(x, t) = q(x, a_j + tc_j^2, c_j). \]

Then

\[ \partial_t \langle v(t), \partial_x^{-1}(\partial_{c_j} q)(t) \rangle = \partial_t \langle v(t), \partial_x^{-1}(\partial_{a_j} q)(t) \rangle = 0, \]

where $(\partial_{c_j} q)(t) = (\partial_{c_j} q)(x, a_j + tc_j^2, c_j)$ (and not $\partial_{c_j}(q(x, a_j + tc_j^2, c_j))$). In addition, for $v(0) = \partial_{a_j} q$, $v(t) = (\partial_{a_j} q)(t)$, and for $v(0) = \partial_{c_j} q$, $v(t) = (\partial_{c_j} q)(t) + 2c_j t\,(\partial_{a_j} q)(t)$.



4. Lyapunov Functional and Coercivity

In this section we introduce the function $H_c$ adapted from the KdV theory of Maddocks-Sachs [23]. We will build our Lyapunov functional $E$ from $H_c$. Thus let

\[ H_c(u) \stackrel{\mathrm{def}}{=} I_5(u) + (c_1^2 + c_2^2) I_3(u) + c_1^2 c_2^2 I_1(u). \]

We give a direct proof that $q(\cdot, a, c)$ is a critical point of $H_c$:

Lemma 4.1 ($q$ is a critical point of $H$). We have

\[ H_c'(q(\cdot, a, c)) = 0, \tag{4.1} \]

that is, $I_5'(q) + (c_1^2 + c_2^2) I_3'(q) + c_1^2 c_2^2 I_1'(q) = 0$.

Proof. We follow Lax [22, §2]: we want to find $A = A(q)$ and $B = B(q)$ such that

\[ H'(q) \stackrel{\mathrm{def}}{=} I_5'(q) + A I_3'(q) + B I_1'(q) = 0 \]

for all $q = q(x, a, c) \in M$. If we consider the mKdV evolution of $q$ given by (3.3), then Lemma 3.2 shows that as $t \to \pm\infty$ we can express $H'(q)$ asymptotically using $H'(\eta_{c_1})$ and $H'(\eta_{c_2})$. From (3.23) we see that

\[ H'(\eta_c) = I_5'(\eta_c) + A I_3'(\eta_c) + B I_1'(\eta_c) = 2(c^4 - Ac^2 + B)\eta_c. \]

Two parameters $c_1$ and $c_2$ are roots of this equation if $A = c_1^2 + c_2^2$ and $B = c_1^2 c_2^2$, and this choice gives

\[ H'(q(t)) = r(t), \qquad \|r(t)\|_{L^2} \le C\exp(-|t|/C), \qquad q(t) \stackrel{\mathrm{def}}{=} q(x, a_1 + c_1^2 t, a_2 + c_2^2 t, c_1, c_2), \tag{4.2} \]

where the exponential decay of $r(t)$ comes from Lemma 3.2 and the fact that $c_1 \ne c_2$. To prove (4.1) we need to show that $r(0) \equiv 0$. For the reader's convenience we provide a direct proof of this widely accepted fact. Since it suffices to prove that $\langle r(0), w \rangle = 0$ for all $w \in \mathcal{S}$, we consider the mKdV linearized equation at $q(t)$,

\[ v_t = \frac12 \partial_x I_3''(q(t))v, \qquad v(0) = w \in \mathcal{S}, \tag{4.3} \]

and will show that

\[ \partial_t \langle r(t), v(t) \rangle = \partial_t \langle H'(q(t)), v(t) \rangle = 0. \tag{4.4} \]

The conclusion $\langle r(0), w \rangle = 0$ will then follow from showing that $\langle r(t), v(t) \rangle \to 0$, $t \to \infty$. We first claim that

\[ \partial_t \langle I_k'(q), v \rangle = 0, \quad \forall k. \tag{4.5} \]


In fact, from (2.5) we have Ik (ϕ), ∂x I3 (ϕ) = 0 for all ϕ ∈ S. Differentiating with respect to ϕ in the direction of v, we obtain Ik (ϕ)v, ∂x I3 (ϕ) = − Ik (ϕ), ∂x I3 (ϕ)v . Applying this with v = v(t) and ϕ = q(t) we conclude that 1 ∂t Ik (q), v = Ik (q)∂t q, v + Ik (q), ∂x I3 (q)v 2 1 1 = Ik (q)∂x I3 (q), v + Ik (q), ∂x I3 (q)v , 2 2 = 0. Since H is a linear combination of Ik ’s, k = 1, 3, 5, this gives (4.4). We now want to use the exponential decay of r (t) L 2 in (4.2), and (4.4) to show (4.5). Clearly, all we need is a subexponential estimate on v(t), that is ∀ > 0 ∃ t0 , v(t) L 2 ≤ e t , t > t0 .

(4.6)

Let ψ be a smooth function such that ψ(x) = 1 for all |x| ≤ 1 and ψ(x) ∼ e−2|x| for |x| ≥ 1. With the notation of Lemma 3.2 define ψ j (x, t) = ψ(δ(x − (a j + c2j t))), for 0 < δ 1 to be selected below and j = 1, 2. We now establish that

2 ∂t v2 2 + vx 2 2 + 6 q 2 v 2 ψ j v2L 2 . L L

(4.7)

j=1

To prove (4.7), apply ∂x−1 to (4.3) and pair with vt to obtain 0 = ∂x−1 vt , vt + vx x , vt + 6q 2 v, vt , which implies ∂t

1 vx 2L 2 + 3 2

q v

2 2

=6

qqt v 2 .

(4.8)

Next, pair (4.3) with v to obtain 0 = vt , v + vx x x , v + 6 ∂x (q 2 v), v , which implies

∂t v2L 2

= −12

qqx v 2 .

(4.9)

Summing (4.8) and (4.9) gives (4.7). The inequality (4.7) shows that we need to control ψ j v(t), j = 1, 2. For t large ψ j provides a localization to the region where q decomposes into an approximate sum of decoupled solitons (see Lemma 3.2). Hence we define L j = c2j − ∂x2 − 6η2 (x, (a j + tc2j ), c j )



(see also § 8 below for a use of similar operators). A calculation shows that t ≥ T (δ) !⇒ ∂t L j ψ j v, ψ j v = O(δ)v2H 1 ,

(4.10)

where $T(\delta)$ is large enough to ensure that the supports of the $\psi_j$'s are separated. It suffices to assume that $v(0) = w$ satisfies $\langle w, \partial_x^{-1}\partial_{a_j} q \rangle = 0$ and $\langle w, \partial_x^{-1}\partial_{c_j} q \rangle = 0$, since Lemma 3.8 already showed that the evolutions of $\partial_{a_j} q$ and $\partial_{c_j} q$ are linearly bounded in $t$. Under this assumption, we have by Lemma 3.8 that $\langle v(t), \partial_x^{-1}\partial_{a_j} q(t) \rangle = 0$ and $\langle v(t), \partial_x^{-1}\partial_{c_j} q(t) \rangle = 0$. We now want to invoke the well known coercivity estimates for the operators $L_j$; see for instance [18, §4] for a self-contained presentation. For that we need to check that

$$|\langle \psi_j v, \partial_x^{-1}\partial_a \eta(\hat a_j + t c_j^2, c_j) \rangle| \ll 1, \qquad |\langle \psi_j v, \partial_x^{-1}\partial_c \eta(\hat a_j + t c_j^2, c_j) \rangle| \ll 1.$$

This follows from the fact that $v$ is symplectically orthogonal to $(\partial_{c_j} q)(t)$ and $\partial_{a_j} q(t)$ (Lemma 3.8 again), the fact that $q$ decouples into two solitons for $t$ large, and from the remark after the proof of Lemma 3.6. Hence,

$$\langle L_j \psi_j v, \psi_j v \rangle \gtrsim \|\psi_j v\|_{H^1}^2.$$

We now sum (4.7) and (4.10) multiplied by $\delta^{-1/2}$ to obtain, for $t$ sufficiently large (depending on $\delta$),

$$F'(t) \le C\delta^{1/2} F(t), \qquad F(t) \stackrel{\rm def}{=} \|v(t)\|_{H^1}^2 + 6\int q^2(t) v(t)^2 + \delta^{-1/2}\sum_{j=1}^{2} \langle L_j(t)\psi_j(t)v(t), \psi_j(t)v(t) \rangle$$

(where we added the additional $q^2 v^2$ term to the right hand side at no cost). Consequently, $F(t) \le \exp(C\delta^{1/2} t)$ for $t > T_1(\delta)$. We recall that this implies (4.6), and going back to (4.4) shows that $r(0) = 0$, and hence $H_c'(q) = 0$.

We denote the Hessian of $H_c$ at $q(\bullet, a, c)$ by $\mathcal{K}_{c,a}$:

$$\mathcal{K}_{c,a} = I_5''(q) + (c_1^2 + c_2^2) I_3''(q) + c_1^2 c_2^2 I_1''(q).$$

It is a fourth order self-adjoint operator on $L^2(\mathbb{R})$ and a calculation shows that

$$\tfrac{1}{2}\mathcal{K}_{c,a} = (-\partial_x^2 + c_1^2)(-\partial_x^2 + c_2^2) + 10\,\partial_x q^2 \partial_x + 10\big(-q_x^2 + (q^2)_{xx} + 3q^4\big) - 6(c_1^2 + c_2^2) q^2. \tag{4.11}$$

Lemma 4.2 (Mapping properties of $\mathcal{K}$). The kernel of $\mathcal{K}_{c,a}$ in $L^2(\mathbb{R})$ is spanned by $\partial_{a_j} q$:

$$\mathcal{K}_{c,a}\,\partial_{a_j} q = 0, \tag{4.12}$$

and

$$\mathcal{K}_{c,a}\,\partial_{c_j} q = 4(-1)^j c_j (c_1^2 - c_2^2)\,\partial_x^{-1}\partial_{a_j} q. \tag{4.13}$$

Proof. Equations (4.12) follow from differentiation of (4.1) with respect to $a_j$. As $x \to \infty$, the leading part of $\mathcal{K}_{c,a}$ is given by $(-\partial_x^2 + c_1^2)(-\partial_x^2 + c_2^2)$, and hence the kernel in $L^2$ is at most two dimensional. To see (4.13) recall that

$$I_1'(q) = 2q = -2\partial_x^{-1}(\partial_{a_1} q + \partial_{a_2} q), \qquad I_3'(q) = -2q'' - 4q^3 = 2\partial_x^{-1}(c_1^2 \partial_{a_1} q + c_2^2 \partial_{a_2} q),$$

where we used Lemma 3.4. By differentiating $H_c'(q) = I_5'(q) + (c_1^2 + c_2^2) I_3'(q) + c_1^2 c_2^2 I_1'(q) = 0$ with respect to $c_j$, we obtain

$$\mathcal{K}(\partial_{c_1} q) = -2c_1\big(I_3'(q) + c_2^2 I_1'(q)\big), \qquad \mathcal{K}(\partial_{c_2} q) = -2c_2\big(I_3'(q) + c_1^2 I_1'(q)\big). \tag{4.14}$$

Inserting the above formulæ for $I_1'(q)$ and $I_3'(q)$ gives (4.13).

The main result of this section is the following coercivity result:

Proposition 4.3 (Coercivity of $\mathcal{K}$). There exists $\delta = \delta(c) > 0$ such that for all $v \in H^2$ satisfying the symplectic orthogonality conditions $\langle v, \partial_x^{-1}\partial_{a_j} q \rangle = 0$ and $\langle v, \partial_x^{-1}\partial_{c_j} q \rangle = 0$, $j = 1, 2$, we have

$$\delta \|v\|_{H^2}^2 \le \langle \mathcal{K}_{c,a} v, v \rangle. \tag{4.15}$$

The proposition is proved in a few steps. In Lemma 4.2 we already described the kernel of $\mathcal{K}_{c,a}$ and now we investigate the negative eigenvalues:

Proposition 4.4 (Spectrum of $\mathcal{K}$). The operator $\mathcal{K}_{c,a}$ has a single negative eigenvalue, $h \in L^2(\mathbb{R})$:

$$\mathcal{K}_{c,a} h = -\mu h, \qquad \mu > 0. \tag{4.16}$$

In addition, for $0 < \delta < c_1 < c_2 - \delta < 1/\delta$, there exists a constant $\rho$, depending only on $\delta$, such that

$$\min\{\lambda > 0 : \lambda \in \sigma(\mathcal{K}_{c,a})\} > \rho, \qquad a \in \mathbb{R}^2. \tag{4.17}$$

Proof. As always we assume $0 < c_1 < c_2$. We know the continuous spectrum of $\mathcal{K}_{c,a}$, $\sigma_{\rm ac}(\mathcal{K}_{c,a}) = [2c_1^2 c_2^2, +\infty)$, and that for all $a$, $c$ there is a two-dimensional kernel given by ${\rm span}\{\partial_{a_1} q, \partial_{a_2} q\}$. The eigenvalues depend continuously on $a$, $c$, and hence the constant dimension of the kernel shows that the number of negative eigenvalues is constant (since the creation or annihilation of a negative eigenvalue would increase the dimension of $\ker \mathcal{K}_{c,a}$). Hence it suffices to determine the number of negative eigenvalues of $\mathcal{K}$ for any convenient values of $a$, $c$. To do that we use the following fact:

Lemma 4.5 (Maddocks-Sachs [23, Lemma 2.2]). Suppose that $K$ is a self-adjoint, 4th order operator of the form $K = 2(-\partial_x^2 + c_1^2)(-\partial_x^2 + c_2^2) + p_0(x) - \partial_x p_1(x) \partial_x$, where the coefficients $p_j(x)$ are smooth, real, and rapidly decaying as $x \to \pm\infty$. Let $r_1(x)$, $r_2(x)$ be two linearly independent solutions of $K r_j = 0$ such that $r_j \to 0$ as $x \to -\infty$. Then the number of negative eigenvalues of $K$ is equal to

$$\sum_{x \in \mathbb{R}} \dim \ker \begin{pmatrix} r_1(x) & r_1'(x) \\ r_2(x) & r_2'(x) \end{pmatrix}. \tag{4.18}$$

We apply this lemma with $K = \mathcal{K}_{c,a}$, in which case $p_1 = 20q^2$, $p_0 = 40 q_{xx} q + 20 q_x^2 + 60 q^4 - 12(c_1^2 + c_2^2) q^2$, $q = q(\bullet, a, c)$. Convenient values of $a$ and $c$ are provided by $a_1 = a_2 = 0$ and $c_1 = 0.5$, $c_2 = 1.5$. In the notation of (3.9) we then have $q(x, a, c) = Q(x, 0, 0.5)$, and since $\partial_x Q = -\partial_{a_1} q - \partial_{a_2} q$, $\partial_\alpha Q = -\partial_{a_1} q + \partial_{a_2} q$, we can take $r_1 = \partial_x Q$ and $r_2 = \partial_\alpha Q$. A computation based on (3.11) and (3.12) shows that

$$Q(x, 0.5, 0) = \operatorname{sech}(x/2), \qquad \partial_x Q(x, 0.5, 0) = -\frac{\sinh(x/2)}{2\cosh^2(x/2)},$$

$$\partial_\alpha Q(x, 0.5, 0) = \frac{\sinh(x/2)\,(9 - 2\cosh^2(x/2))}{4\cosh^4(x/2)} = \frac{9 \sinh(x/2)}{4\cosh^4(x/2)} + \partial_x Q(x, 0.5, 0). \tag{4.19}$$

Since $x \mapsto y = \sinh(x/2)$ is invertible, we only need to check the dimension of the kernel of the Wronskian matrix of

$$\tilde r_1(y) = \frac{y}{1 + y^2}, \qquad \tilde r_2(y) = \frac{y}{(1 + y^2)^2},$$

and that is equal to 1 at $y = 0$ and 0 on $\mathbb{R} \setminus \{0\}$. In view of (4.18) this completes the proof of (4.16).

To prove (4.17) we first note that by rescaling (3.10) we only need to prove the estimate for

$$K(c, \alpha) \stackrel{\rm def}{=} \mathcal{K}_{((c,1),(-\alpha,\alpha))}, \qquad c \in [\delta, 1 - \delta], \quad 0 < \delta < 1/2.$$

For that we introduce another operator

$$P(c) \stackrel{\rm def}{=} (-\partial_x^2 + 1)(-\partial_x^2 + c^2) + 10\,\partial_x \eta^2 \partial_x + 10(3\eta^2 - 2\eta^4) - 6(1 + c^2)\eta^2, \tag{4.20}$$

where $\eta = \operatorname{sech} x$, $c \in \mathbb{R}_+ \setminus \{1\}$.
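Returning to the Wronskian criterion used for (4.16): the claim that the kernel of the Wronskian matrix of $\tilde r_1$, $\tilde r_2$ has dimension 1 at $y = 0$ and 0 elsewhere can be checked by hand, since the determinant works out to $-2y^3/(1+y^2)^4$. The following is a minimal numerical sketch of that check, not part of the original argument. The function names (`wronskian_det`, `kernel_dim`) and the tolerance are illustrative choices, and the derivatives are entered in closed form.

```python
# Sketch: kernel dimension of the 2x2 Wronskian matrix
#   [[r1(y), r1'(y)],
#    [r2(y), r2'(y)]]
# for r1 = y/(1+y^2), r2 = y/(1+y^2)^2 (hypothetical helper names).

def r1(y):
    return y / (1 + y * y)

def dr1(y):
    # d/dy [ y/(1+y^2) ] = (1 - y^2)/(1+y^2)^2
    return (1 - y * y) / (1 + y * y) ** 2

def r2(y):
    return y / (1 + y * y) ** 2

def dr2(y):
    # d/dy [ y/(1+y^2)^2 ] = (1 - 3y^2)/(1+y^2)^3
    return (1 - 3 * y * y) / (1 + y * y) ** 3

def wronskian_det(y):
    """Determinant r1*r2' - r1'*r2, which simplifies to -2y^3/(1+y^2)^4."""
    return r1(y) * dr2(y) - dr1(y) * r2(y)

def kernel_dim(y, tol=1e-12):
    """Dimension of the kernel of the 2x2 Wronskian matrix at the point y."""
    entries = (r1(y), dr1(y), r2(y), dr2(y))
    if all(abs(e) <= tol for e in entries):
        return 2  # zero matrix
    if abs(wronskian_det(y)) <= tol:
        return 1  # rank one
    return 0      # invertible

def total_kernel_dim(points):
    """Sum of kernel dimensions over a finite sample of points."""
    return sum(kernel_dim(y) for y in points)
```

On a sample grid containing $y = 0$ the sum is 1, matching the single negative eigenvalue asserted in (4.16). Sampling cannot replace the closed-form determinant, which is what shows there are no other zeros.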


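The operator $P(c)$ is the Hessian of $H_{(c,1)}$ at $\eta$, which is also a critical point for $H_{(c,1)}$. In particular, $P(c)\partial_x \eta = 0$. Putting

$$U_\alpha f(x) \stackrel{\rm def}{=} f\big(x + \alpha + \log((1 + c)/(1 - c))\big), \qquad P_+(c, \alpha) \stackrel{\rm def}{=} U_\alpha^* P(c) U_\alpha,$$

we see that

$$K(c, \alpha) = 2P_+(c, \alpha) + O(e^{-(\alpha + |x|)/C})\partial_x^2 + O(e^{-(\alpha + |x|)/C}), \qquad x \ge 0.$$

Similarly, if

$$T_c f(x) \stackrel{\rm def}{=} \sqrt{c}\, f(cx), \qquad P_-(c, \alpha) \stackrel{\rm def}{=} c^2\, U_\alpha T_c P(1/c) T_c^* U_\alpha^*,$$

then

$$K(c, \alpha) = 2P_-(c, \alpha) + O(e^{-(\alpha + |x|)/C})\partial_x^2 + O(e^{-(\alpha + |x|)/C}), \qquad x \le 0.$$

We reduce the estimate (4.17) to a spectral fact about the operators $P(c)$ and $P(1/c)$:

Lemma 4.6. Suppose that there exists $\alpha \mapsto \lambda(c, \alpha) \in \mathbb{R} \setminus \{0\}$ such that $\lambda(c, \alpha) \in \sigma(K(c, \alpha))$, $\lambda(c, \alpha) \to 0$, $\alpha \to \infty$. Then we have

$$\dim \ker_{L^2} P(c) + \dim \ker_{L^2} P(1/c) > 2, \tag{4.21}$$

where $\ker_{L^2}$ means the kernel in $L^2$.

Proof. The assumption that $0 \ne \lambda(c, \alpha) \to 0$ as $\alpha \to \infty$ implies that there exists a family of quasimodes $f_\alpha$,

$$\|f_\alpha\|_{L^2} = 1, \qquad \|K(c, \alpha) f_\alpha\|_{L^2} = o(1), \ \alpha \to \infty, \qquad f_\alpha \perp \ker_{L^2} K(c, \alpha). \tag{4.22}$$

Since we know that the kernel of $K(c, \alpha)$ is spanned by $U_\alpha^* \partial_x \eta + O(e^{-(|x| + \alpha)/C})$ and $U_\alpha T_c \partial_x \eta + O(e^{-(|x| + \alpha)/C})$, we can modify $f_\alpha$ and replace the orthogonality condition by $f_\alpha \perp {\rm span}(U_\alpha^* \partial_x \eta, U_\alpha T_c \partial_x \eta)$.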

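The estimate in (4.22), and $\|f_\alpha\|_{L^2} = O(1)$, imply that

$$\|f_\alpha\|_{H^2} = O(1), \qquad \alpha \to \infty. \tag{4.23}$$

We first claim that

$$\int_{-1}^{1} |f_\alpha(x)|^2\, dx = o(1), \qquad \alpha \to \infty. \tag{4.24}$$

In fact, on $[-\alpha/2, \alpha/2]$, $K(c, \alpha) = (-\partial_x^2 + c^2)(-\partial_x^2 + 1) + O(e^{-\alpha/C})\partial_x^2 + O(e^{-\alpha/C})$, and hence, using (4.23),

$$(-\partial_x^2 + c^2)(-\partial_x^2 + 1) f_\alpha = r_\alpha, \qquad \|r_\alpha\|_{L^2([-\alpha/2, \alpha/2])} = o(1).$$

Putting

$$e_\alpha \stackrel{\rm def}{=} [(-\partial_x^2 + c^2)(-\partial_x^2 + 1)]^{-1}\, r_\alpha \mathbf{1}_{[-\alpha/2, \alpha/2]}, \qquad \|e_\alpha\|_{H^2} = o(1),$$

we see that $f_\alpha = g_\alpha + e_\alpha$, where

$$(-\partial_x^2 + c^2)(-\partial_x^2 + 1) g_\alpha(x) = 0, \qquad |x| < \alpha/2. \tag{4.25}$$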

Suppose now that (4.24) were not valid. Then the same would be true for gα , and there would exist a constant c0 > 0, and a sequence α j → ∞, for which gα j L 2 ([−1,1]) > c0 . In view of (4.25) this implies that ± ±cx ± ±x a± , |x| < α/2, |a ± e + b e gα j (x) = j j j |, |b j | = O(1), ±

and for at least one choice of sign, ± 2 2 |a ± j | + |b j | > c1 > 0.

We can choose a subsequence so that this is true for a fixed sign, say, +, for all j. In that case, a simple calculation shows that for M j → ∞, M j ≤ α j /2, Mj 1 1 2 |a + ||b+ |e(c+1)M j |gα j (x)|2 d x ≥ |a +j |2 e2M j + |b+j |2 e2cM j − 2 2c c+1 j j 0 2 |a + ||b− |e(1−c)M j − O(1) − 1−c j j

1 + 2 2M j c 1 1−c 2 + 2 2M j |a j | e + |b j | e ≥ 2 1+c c 4 |a + |2 e2(1−c)M j − O(1), − (1 − c)2 j where we used the fact that 0 < δ < c < 1 − δ. Hence Mj Mj fα j L 2 ≥ | f α j (x)|2 d x ≥ |gα j (x)|2 d x − o(1) 0

≥

1 2

1−c 1+c

2

0

γ e2M j c − O(1) −→ ∞, j → ∞.

Since f α L 2 = 1 we obtain a contradiction proving (4.24).


Now let $\chi_\pm \in C^\infty(\mathbb{R})$ be supported in $\pm[-1, \infty)$, and satisfy $\chi_+^2 + \chi_-^2 = 1$. Then (4.24) (and the corresponding estimates for derivatives obtained from (4.22)) shows that

$$\|P_\pm(c, \alpha)(\chi_\pm f_\alpha)\|_{L^2} = o(1), \qquad \alpha \to \infty.$$

For at least one of the signs we must have $\|\chi_\pm f_\alpha\|_{L^2} > 1/3$ (if $\alpha$ is large enough), and hence we obtain a quasimode for $P_\pm(c, \alpha)$, orthogonal to the known element of the kernel of $P_\pm(c, \alpha)$. This means that $P_\pm(c, \alpha)$, for at least one of the signs, has an additional eigenvalue approaching 0 as $\alpha \to \infty$. Since the spectrum of $P_\pm(c, \alpha)$ is independent of $\alpha$ it follows that for at least one sign the kernel is two dimensional. This proves (4.21).

The next lemma shows that (4.21) is impossible:

Lemma 4.7. For $c \in \mathbb{R}_+ \setminus \{1\}$,

$$\ker_{L^2} P(c) = \mathbb{C} \cdot \partial_x \eta. \tag{4.26}$$

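Proof. Let $L \stackrel{\rm def}{=} (I_3''(\eta) + I_1''(\eta))/2$:

$$Lv = -v_{xx} - 6\eta^2 v + v, \qquad \eta(x) = \operatorname{sech}(x).$$

We recall (see the comment after (4.20)) that

$$P(c) = \tfrac{1}{2} H_{(c,1)}''(\eta) = \tfrac{1}{2}\big(I_5''(\eta) + (1 + c^2) I_3''(\eta) + c^2 I_1''(\eta)\big).$$

We already noted that $L(\partial_x \eta) = P(c)\partial_x \eta = 0$, and proceeding as in (4.14) we also have

$$L(\partial_x(x\eta)) = -2\eta, \qquad P(c)(\partial_x(x\eta)) = 2(1 - c^2)\eta. \tag{4.27}$$

We claim that

$$P(c)\,\partial_x L = L\,\partial_x P(c). \tag{4.28}$$

Since $I_j'(\eta + tv) = t I_j''(\eta) v + O(t^2)$, $v \in \mathcal{S}$, Eq. (2.5) implies that $\langle I_j''(\eta) v, \partial_x I_k''(\eta) v \rangle = 0$, $\forall j, k$, $v \in \mathcal{S}$. From this we see that $\langle P(c) v, \partial_x L v \rangle = 0$, $\forall v \in \mathcal{S}$, and hence by polarization,

$$\langle P(c) v, \partial_x L w \rangle = -\langle P(c) w, \partial_x L v \rangle = \langle \partial_x P(c) w, L v \rangle,$$

which implies (4.28).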
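The first identity in (4.27), $L(\partial_x(x\eta)) = -2\eta$ with $Lv = -v_{xx} - 6\eta^2 v + v$ and $\eta = \operatorname{sech} x$, together with $L(\partial_x \eta) = 0$, lends itself to a quick numerical sanity check. The sketch below is illustrative only: it uses a central finite difference for $v_{xx}$ (an approximation chosen here, with a hypothetical step `h`), and the helper names are not from the paper.

```python
import math

def eta(x):
    """eta(x) = sech(x)."""
    return 1.0 / math.cosh(x)

def d_eta(x):
    """d/dx sech(x) = -sech(x) tanh(x); expected to lie in the kernel of L."""
    return -eta(x) * math.tanh(x)

def d_x_eta(x):
    """d/dx (x sech(x)) = sech(x) (1 - x tanh(x))."""
    return eta(x) * (1.0 - x * math.tanh(x))

def second_derivative(g, x, h=1e-3):
    """Central finite-difference approximation of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)

def L_apply(g, x):
    """Apply L g = -g'' - 6 eta^2 g + g at the point x (approximately)."""
    return -second_derivative(g, x) - 6.0 * eta(x) ** 2 * g(x) + g(x)

def residual_427(x):
    """Residual of L(d/dx(x eta)) + 2 eta; should vanish up to discretization error."""
    return L_apply(d_x_eta, x) + 2.0 * eta(x)
```

Evaluating `residual_427` and `L_apply(d_eta, x)` on a grid gives values at the level of the finite-difference error, which is consistent with (4.27) but of course does not prove it.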

Suppose now that $\dim \ker_{L^2} P(c) = 2$ for some $c \ne 1$, and let $\eta_x$ and $\psi$ be the basis of this kernel. Since $P(c)$ is symmetric with respect to the reflection $x \mapsto -x$, $\psi$ can be chosen to be either even or odd. Applying (4.28) to $\psi$ we get $P(c)\partial_x L\psi = 0$ and hence $\partial_x L\psi = \alpha\eta_x + \beta\psi$, for some $\alpha, \beta \in \mathbb{R}$. If $\psi$ is odd then $\partial_x L\psi$ is even, and therefore $\alpha = \beta = 0$. But then $\psi \in \ker_{L^2} L = \mathbb{C} \cdot \eta_x$, giving a contradiction. If $\psi$ is even then $\partial_x L\psi$ is odd, $\beta = 0$ and $L\psi = \alpha\eta$. We have $\alpha \ne 0$ since $\psi$ is orthogonal to the kernel of $L$, spanned by $\partial_x \eta$. From (4.27) we obtain

$$\psi = -\frac{\alpha}{2}\,\partial_x(x\eta).$$

Applying the second equation in (4.27) we then obtain $P(c)\psi = -\alpha(1 - c^2)\eta$, contradicting $\psi \in \ker_{L^2} P(c)$.

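With this lemma we complete the proof of Proposition 4.4.

To obtain the coercivity statement in Proposition 4.3 we first obtain coercivity under a different orthogonality condition:

Lemma 4.8. There exists a constant $\rho > 0$ depending only on $c_1$, $c_2$, such that the following holds: if $\langle u, \partial_x^{-1}\partial_{a_1} q \rangle = 0$, $\langle u, \partial_x^{-1}\partial_{a_2} q \rangle = 0$, $\langle u, \partial_{a_1} q \rangle = 0$, $\langle u, \partial_{a_2} q \rangle = 0$, then $\langle \mathcal{K}_{c,a} u, u \rangle \ge \rho \|u\|_{L^2}^2$.

Proof. To simplify notation we put $\mathcal{K} = \mathcal{K}_{c,a}$ in the proof. Using (4.13) and the expression for the symplectic form, $\omega|_M = da_1 \wedge dc_1 + da_2 \wedge dc_2$, we have

$$\langle \mathcal{K}\partial_{c_1} q, \partial_{c_1} q \rangle = -4c_1(c_1^2 - c_2^2)\,\langle \partial_x^{-1}\partial_{a_1} q, \partial_{c_1} q \rangle = 4c_1(c_1^2 - c_2^2),$$

and similarly

$$\langle \mathcal{K}\partial_{c_2} q, \partial_{c_2} q \rangle = -4c_2(c_1^2 - c_2^2). \tag{4.29}$$

Since we assumed that $c_1 < c_2$, $\langle \mathcal{K}\partial_{c_1} q, \partial_{c_1} q \rangle < 0$. Let $\widetilde{\partial_{c_1} q}$ be the orthogonal projection of $\partial_{c_1} q$ on $(\ker \mathcal{K})^\perp$. We first claim that there exists a constant $\alpha$ such that $u = \tilde u + \alpha\,\widetilde{\partial_{c_1} q}$ with $\langle \tilde u, h \rangle = 0$, where $\mu$ and $h$ are defined in Proposition 4.4. To prove this, decompose $\partial_{c_1} q$ as $\partial_{c_1} q = \xi + \beta h$ with $\langle \xi, h \rangle = 0$. Then by (4.29),

$$0 > \langle \mathcal{K}\partial_{c_1} q, \partial_{c_1} q \rangle = \langle \mathcal{K}\xi, \xi \rangle + 2\beta \langle \mathcal{K} h, \xi \rangle + \beta^2 \langle \mathcal{K} h, h \rangle = \langle \mathcal{K}\xi, \xi \rangle - \mu\beta^2.$$

Since $\langle \mathcal{K}\xi, \xi \rangle \ge 0$, we must have that $\beta \ne 0$. Hence there exist $u'$ and $\alpha$ such that $u = u' + \alpha\,\partial_{c_1} q$ with $\langle u', h \rangle = 0$. Now take $\tilde u$ to be the projection of $u'$ away from the kernel of $\mathcal{K}$. This completes the proof of the claim.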


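We have that $\langle u, \mathcal{K}\partial_{c_1} q \rangle = 4c_1(c_2^2 - c_1^2)\,\langle u, \partial_x^{-1}\partial_{a_1} q \rangle = 0$ by (4.13) and hypothesis. Substituting $u = \tilde u + \alpha\,\widetilde{\partial_{c_1} q}$, we obtain

$$\langle \tilde u, \mathcal{K}\partial_{c_1} q \rangle = -\alpha \langle \widetilde{\partial_{c_1} q}, \mathcal{K}\partial_{c_1} q \rangle = -\alpha \langle \partial_{c_1} q, \mathcal{K}\partial_{c_1} q \rangle. \tag{4.30}$$

Now let $\tilde\rho$ denote the bottom of the positive spectrum of $\mathcal{K}$. We have

$$\begin{aligned}
\langle \mathcal{K} u, u \rangle &= \langle \mathcal{K}(\tilde u + \alpha\,\widetilde{\partial_{c_1} q}), \tilde u + \alpha\,\widetilde{\partial_{c_1} q} \rangle \\
&= \langle \mathcal{K}\tilde u, \tilde u \rangle + 2\alpha \langle \mathcal{K}\tilde u, \widetilde{\partial_{c_1} q} \rangle + \alpha^2 \langle \mathcal{K}\widetilde{\partial_{c_1} q}, \widetilde{\partial_{c_1} q} \rangle \\
&= \langle \mathcal{K}\tilde u, \tilde u \rangle - \alpha^2 \langle \mathcal{K}\partial_{c_1} q, \partial_{c_1} q \rangle \qquad \text{by (4.30)} \\
&\ge \tilde\rho\, \|\tilde u\|_{L^2}^2 + 4c_1(c_2^2 - c_1^2)\alpha^2 \\
&\ge \tilde C\big(\|\tilde u\|_{L^2}^2 + \alpha^2\big),
\end{aligned}$$

where $\tilde C$ depends on $c_1$, $c_2$ and $\tilde\rho$. However, since $u = \tilde u + \alpha\,\widetilde{\partial_{c_1} q}$, we have $\|u\|_{L^2}^2 \le C(\|\tilde u\|_{L^2}^2 + \alpha^2)$, where $C$ depends on $c_1$, $c_2$, which completes the proof.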

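We now put

$$E = E_{a,c} = \ker \mathcal{K} = {\rm span}\{\partial_{a_1} q, \partial_{a_2} q\}, \quad F = F_{a,c} = {\rm span}\{\partial_x^{-1}\partial_{c_1} q, \partial_x^{-1}\partial_{c_2} q\}, \quad G = G_{a,c} = {\rm span}\{\partial_x^{-1}\partial_{a_1} q, \partial_x^{-1}\partial_{a_2} q\}. \tag{4.31}$$

In this notation Lemma 4.8 states that

$$u \perp (E + G) \Longrightarrow \langle \mathcal{K} u, u \rangle \ge \theta \|u\|_{L^2}^2,$$

while to establish Proposition 4.3 we need

$$u \perp (F + G) \Longrightarrow \langle \mathcal{K} u, u \rangle \ge \tilde\theta \|u\|_{L^2}^2.$$

That is, we would like to replace orthogonality with the kernel $E$ by orthogonality with a "nearby" subspace $F$. For this, we apply the following analysis with $D = F^\perp$.

Definition 4.1. Suppose that $D$ and $E$ are two closed subspaces in a Hilbert space. Then $\alpha(D, E)$, the angle between $D$ and $E$, is

$$\alpha(D, E) \stackrel{\rm def}{=} \cos^{-1} \sup_{\substack{\|d\| = 1,\ d \in D \\ \|e\| = 1,\ e \in E}} \langle d, e \rangle.$$

It is clear that $0 \le \alpha(D, E) \le \pi/2$, $\alpha(D, E) = \alpha(E, D)$, and that $\alpha(E, D) = \pi/2$ if and only if $E \perp D$. We will need slightly more subtle properties stated in the following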

Lemma 4.9. Suppose that $D$ and $E$ are two closed subspaces in a Hilbert space. Then

$$\alpha(D, E) = \cos^{-1} \sup_{\|d\| = 1,\ d \in D} \|P_E d\|, \qquad \alpha(D, E) = \sin^{-1} \inf_{\|d\| = 1,\ d \in D} \|P_{E^\perp} d\|. \tag{4.32}$$

In addition if $E$ is finite dimensional then

$$\alpha(D, E) = 0 \iff D \cap E \ne \{0\}. \tag{4.33}$$

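Proof. To see (4.32) let $d \in D$, with $\|d\| = 1$. By the definition of the projection operator,

$$\begin{aligned}
1 - \|P_E d\|^2 = \|d - P_E d\|^2 &= \inf_{e \in E} \|d - e\|^2 = \inf_{\substack{e \in E \\ \|e\| = 1}} \inf_{\alpha \in \mathbb{R}} \|d - \alpha e\|^2 \\
&= \inf_{\substack{e \in E \\ \|e\| = 1}} \inf_{\alpha \in \mathbb{R}} \big(1 - 2\alpha \langle d, e \rangle + \alpha^2\big) = \inf_{\substack{e \in E \\ \|e\| = 1}} \big(1 - \langle d, e \rangle^2\big) \\
&= 1 - \sup_{\substack{e \in E \\ \|e\| = 1}} \langle d, e \rangle^2,
\end{aligned}$$

and consequently,

$$\|P_E d\| = \sup_{\substack{e \in E \\ \|e\| = 1}} \langle d, e \rangle,$$

from which the first formula in (4.32) follows. The second one is a consequence of the first one as $1 = \|P_E d\|^2 + \|P_{E^\perp} d\|^2$. The $\Leftarrow$ implication in (4.33) is clear. To see the other implication, we observe that if $D \cap E = \{0\}$ and $E$ is finite dimensional then

$$\inf_{\substack{y \in E \\ \|y\| = 1}} d(y, D) > 0,$$

where $d(y, D) = \inf_{z \in D} \|y - z\|$ is the distance from $y$ to $D$. This implies that

$$\begin{aligned}
0 < \inf_{\substack{y \in E \\ \|y\| = 1}} \inf_{z \in D} \|y - z\|^2 &= \inf_{\substack{y \in E \\ \|y\| = 1}} \inf_{z \in D} \big(1 - 2\langle y, z \rangle + \|z\|^2\big) \\
&\le \inf_{\substack{y \in E \\ \|y\| = 1}} \inf_{\substack{z \in D \\ \|z\| = 1}} \big(2 - 2\langle y, z \rangle\big) = 2\Big(1 - \sup_{\substack{y \in E \\ \|y\| = 1}} \sup_{\substack{z \in D \\ \|z\| = 1}} \langle y, z \rangle\Big) \\
&= 2\big(1 - \cos \alpha(D, E)\big).
\end{aligned}$$

Thus if $D \cap E = \{0\}$ then $\alpha(D, E) > 0$. But that is the $\Rightarrow$ implication in (4.33).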
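The two formulas in (4.32) can be illustrated in the simplest finite-dimensional setting, where $D$ and $E$ are lines in $\mathbb{R}^n$ spanned by single vectors. The sketch below is only an illustration under that simplifying assumption (for lines, the supremum and infimum over unit vectors in $D$ are attained at either unit vector of $D$); the function names are hypothetical and not from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def normalize(u):
    n = norm(u)
    return [a / n for a in u]

def project_onto_line(d, e):
    """Orthogonal projection P_E d, where E = span{e}."""
    e_unit = normalize(e)
    coeff = dot(d, e_unit)
    return [coeff * a for a in e_unit]

def angle_via_cos(d, e):
    """alpha(D, E) = arccos ||P_E d|| for unit d spanning D (first formula in (4.32))."""
    d_unit = normalize(d)
    return math.acos(min(1.0, norm(project_onto_line(d_unit, e))))

def angle_via_sin(d, e):
    """alpha(D, E) = arcsin ||P_{E^perp} d|| for unit d spanning D (second formula in (4.32))."""
    d_unit = normalize(d)
    p = project_onto_line(d_unit, e)
    perp = [a - b for a, b in zip(d_unit, p)]
    return math.asin(min(1.0, norm(perp)))
```

For example, with $D = {\rm span}\{(1,0)\}$ and $E = {\rm span}\{(1,1)\}$ both formulas return $\pi/4$, and the angle degenerates to 0 exactly when the two lines coincide, in line with (4.33).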

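In the notation of (4.31), the translation symmetry gives

$$\alpha(E_{a,c}, F_{a,c}^\perp) = F(c_1, c_2, a_1 - a_2),$$

where $F$ is a continuous function in $C \times \mathbb{R}$. We claim that

$$F(c_1, c_2, \alpha) \ge \kappa_\delta > 0 \quad \text{for} \quad \delta \le c_1 \le c_1 + \delta \le c_2 \le \delta^{-1}. \tag{4.34}$$

Consider now the case $|a_1 - a_2| \le A$ (where $A$ is chosen large below), and hence $c_1$, $c_2$, and $a_1 - a_2$ vary within a compact set. Thus it suffices to check that $\alpha(E_{a,c}, F_{a,c}^\perp)$ is nowhere zero and this amounts to checking $E \cap F^\perp = \{0\}$. Suppose the contrary, that is, that there exists $u = z_1 \partial_{a_1} q + z_2 \partial_{a_2} q \in F^\perp$. Since $\omega|_M = da_1 \wedge dc_1 + da_2 \wedge dc_2$, $z_j = \langle u, \partial_x^{-1}\partial_{c_j} q \rangle = 0$. This proves (4.34). To complete the argument in the case $|a_1 - a_2| \le A$, we need:

Lemma 4.10. Let $E = \ker \mathcal{K}$, and suppose that $G$ is a subspace such that $E \perp G$ and the following holds:

$$u \perp (E + G) \Longrightarrow \langle \mathcal{K} u, u \rangle \ge \theta \|u\|_{L^2}^2.$$

Then, for any other subspace $F$ we have

$$u \perp (F + G) \Longrightarrow \langle \mathcal{K} u, u \rangle \ge \theta \sin^2 \alpha(E, F^\perp)\, \|u\|_{L^2}^2.$$

Proof. Suppose $u \perp (F + G)$ and consider its orthogonal decomposition, $u = P_E u + \tilde u$. Since $E \perp G$ and $u \perp G$, we have $\tilde u \perp (E + G)$. Hence, by the hypothesis we have

$$\langle \mathcal{K} u, u \rangle = \langle \mathcal{K}\tilde u, \tilde u \rangle \ge \theta \|\tilde u\|_{L^2}^2 = \theta \|P_{E^\perp} u\|_{L^2}^2.$$

An application of (4.32),

$$\sin \alpha(E, F^\perp) = \inf_{\substack{\|d\| = 1 \\ d \in F^\perp}} \|P_{E^\perp} d\|_{L^2} \le \frac{\|P_{E^\perp} u\|_{L^2}}{\|u\|_{L^2}},$$

concludes the proof.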

5. Set-up of the Proof

Recall the definition of $T_0$ (for given $\delta_0 > 0$ and $\bar a$, $\bar c$) stated in the Introduction. Recall

$$B(a, c, t) \stackrel{\rm def}{=} \int b(x, t)\, q^2(x, a, c)\, dx.$$

In the next several sections, we establish the key estimates required for the proof of the main theorem. Let us assume that on some time interval $[0, T]$, there are $C^1$ parameters $a(t) \in \mathbb{R}^2$, $c(t) \in \mathbb{R}^2$ such that, if we set

$$v(\cdot, t) \stackrel{\rm def}{=} u(\cdot, t) - q(\cdot, a(t), c(t)) \tag{5.1}$$

then the symplectic orthogonality conditions (3.26) hold. Since $u$ solves (1.1), $v(t)$ satisfies

$$\partial_t v = \partial_x\big(-\partial_x^2 v - 6q^2 v - 6q v^2 - 2v^3 + bv\big) - F_0, \tag{5.2}$$

where $F_0$ results from the perturbation and $\partial_t$ landing on the parameters:

$$F_0 \stackrel{\rm def}{=} \sum_{j=1}^{2} (\dot a_j - c_j^2)\,\partial_{a_j} q + \sum_{j=1}^{2} \dot c_j\, \partial_{c_j} q - \partial_x(bq). \tag{5.3}$$

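Now decompose $F_0 = F_\parallel + F_\perp$, where $F_\parallel$ is symplectically parallel to $M$ and $F_\perp$ is symplectically orthogonal to $M$. Explicitly,

$$F_\parallel = \sum_{j=1}^{2} \big(\dot a_j - c_j^2 + \tfrac{1}{2}\partial_{c_j} B\big)\,\partial_{a_j} q + \sum_{j=1}^{2} \big(\dot c_j - \tfrac{1}{2}\partial_{a_j} B\big)\,\partial_{c_j} q, \tag{5.4}$$

$$F_\perp = -\partial_x(bq) + \frac{1}{2}\sum_{j=1}^{2} \big[-(\partial_{c_j} B)\,\partial_{a_j} q + (\partial_{a_j} B)\,\partial_{c_j} q\big]. \tag{5.5}$$

All implicit constants will depend upon $\delta_0 > 0$ and $L^\infty$ norms of $b_0(x, t)$ and its derivatives. We further assume that

$$\delta_0 \le c_1(t) \le c_2(t) - \delta_0 \le \delta_0^{-1} \tag{5.6}$$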

holds on all of $[0, T]$. In § 6 we will estimate $F_\perp$ using the properties of $q$ recalled in § 3. We note that $F_\parallel \equiv 0$ would mean that the parameters solve the effective equations of motion (1.5). Hence the estimates on $F_\parallel$ are related to the quality of our effective dynamics and they are provided in § 7. In § 8 we then construct a correction term which removes the leading non-homogeneous terms from the equation for $v$. Finally energy estimates in § 9 based on the coercivity of $\mathcal{K}$ lead to the final bootstrap argument in § 10.

6. Estimates on $F_\perp$

Using the identities in Lemma 3.4, we will prove that $F_\perp$ is $O(h^2)$; in fact, we obtain more precise information. For notational convenience, we will drop the $t$ dependence in $b(x, t)$, and will write $b'$, $b''$, $b'''$ to represent $x$-derivatives. We will use the following consequences of Lemma 3.2:

$$\partial_{a_j} q = -\partial_x \eta(\cdot, \hat a_j, c_j) + \mathcal{S}_{\rm err} \tag{6.1}$$

and

$$c_j\, \partial_{c_j} q = \partial_x\big[(x - a_j)\,\eta(x, \hat a_j, c_j)\big] + \frac{2c_{3-j}\,\theta(a_2 - a_1)}{(c_1 + c_2)(c_1 - c_2)}\,\partial_x \eta(x, \hat a_j, c_j) - \frac{2c_j\,\theta(a_2 - a_1)}{(c_1 + c_2)(c_1 - c_2)}\,\partial_x \eta(x, \hat a_{3-j}, c_{3-j}) + \mathcal{S}_{\rm err}, \tag{6.2}$$

where $\theta$ is given by (3.6). Importantly, as the last formula shows, $\partial_{c_j} q$ is not localized around $\hat a_j$ due to the $c_j$-dependence of $\hat a_{3-j}$. Also note that it is $(x - a_j)$ and not $(x - \hat a_j)$ in the first term inside the brackets.


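Definition 6.1. Let $\mathcal{A}$ denote the class of functions of $a$, $c$ that are of the form

$$h^2 \varphi(a_1 - a_2, a, c) + \nu(a, c)\, h^3, \qquad a = (a_1, a_2) \in \mathbb{R}^2, \quad 0 < \delta < c_1 < c_2 - \delta < 1/\delta,$$

where

$$|\partial_\alpha^\ell \partial_c^k \partial_a^p \varphi(\alpha, a, c)| \le C \langle \alpha \rangle^{-N}, \qquad |\partial_c^k \partial_a^p \nu(a, c)| \le C,$$

where $C$ depends on $\delta$, $N$, $\ell$, $k$, and $p$ only. We note that if $f \in \mathcal{S}_{\rm err}$, then $\int f(x)\, dx$ has the form $\varphi(a_1 - a_2, a, c)$, $\varphi \in \mathcal{A}$. The most important feature of the class $\mathcal{A}$ is that for $f \in \mathcal{A}$, $|\partial_{a_j}^k \partial_{c_j}^\ell f| \lesssim h^2 \langle a_1 - a_2 \rangle^{-N} + h^3$ with implicit constant depending on $c_1$, $c_2$.

Lemma 6.1. We have

$$\partial_{a_j} B(a, c, \cdot) = 2c_j\, b'(\hat a_j) + \mathcal{A}, \tag{6.3}$$

$$\partial_{c_j} B(a, c, \cdot) = 2b(\hat a_j) + 2b'(\hat a_j)(a_j - \hat a_j) - \frac{\pi^2}{12}\, b''(\hat a_j)\, c_j^{-2} - \frac{2(-1)^j c_{3-j}\,\big(b'(\hat a_2) - b'(\hat a_1)\big)\,\theta}{(c_1 + c_2)(c_1 - c_2)} + \mathcal{A}. \tag{6.4}$$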

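Proof. First we compute $\partial_{a_j} B(a, c, t)$. We have that $\partial_{a_j} q^2$ is exponentially localized around $\hat a_j$. Substituting the Taylor expansion of $b$ around $\hat a_j$, we obtain

$$\partial_{a_j} B(a, c, t) = b(\hat a_j) \int \partial_{a_j} q^2 + b'(\hat a_j) \int (x - \hat a_j)\,\partial_{a_j} q^2 + \frac{1}{2} b''(\hat a_j) \int (x - \hat a_j)^2\, \partial_{a_j} q^2 + O(h^3) = {\rm I} + {\rm II} + {\rm III} + O(h^3).$$

Terms I and II are straightforward. Using (3.21) and (3.22),

$${\rm I} = b(\hat a_j)\, \partial_{a_j}\! \int q^2 = 0, \qquad {\rm II} = b'(\hat a_j) \Big( \partial_{a_j}\! \int x q^2 - \hat a_j\, \partial_{a_j}\! \int q^2 \Big) = 2c_j\, b'(\hat a_j).$$

For III, we will substitute (6.1) and hence pick up $O(h^2) \langle a_1 - a_2 \rangle^{-N}$ errors,

$${\rm III} = -\frac{1}{2} b''(\hat a_j) \int (x - \hat a_j)^2\, \partial_x \eta^2(x, \hat a_j, c_j)\, dx + \mathcal{A} = \mathcal{A}.$$

Thus, we obtain (6.3). Next, we compute $\partial_{c_j} B(a, c, t)$. Note that $\partial_{c_j} q$ is not localized around $\hat a_j$. Begin by rewriting $\partial_{c_j} B$ as

$$\partial_{c_j} B = b(\hat a_j)\, \partial_{c_j}\! \int q^2 + b'(\hat a_j) \int (x - \hat a_j)\, \partial_{c_j} q^2 + \int \tilde b_j\, \partial_{c_j} q^2,$$

where

$$\tilde b_j(x) \stackrel{\rm def}{=} b(x) - b(\hat a_j) - b'(\hat a_j)(x - \hat a_j).$$

Now substitute (6.2) into the last term and note that the $\mathcal{S}_{\rm err}$ term in (6.2) produces an $\mathcal{A}$ term here,

$$\begin{aligned}
\partial_{c_j} B &= b(\hat a_j)\, \partial_{c_j}\! \int q^2 + b'(\hat a_j) \int (x - \hat a_j)\, \partial_{c_j} q^2 + \frac{2}{c_j} \int \tilde b_j(x)\, \partial_x\big[(x - a_j)\eta(x, \hat a_j, c_j)\big]\, \eta(x, \hat a_j, c_j) \\
&\quad + \frac{c_{3-j}\,\theta}{c_j (c_1 + c_2)(c_1 - c_2)} \int \tilde b_j(x)\, \partial_x \eta^2(x, \hat a_j, c_j) - \frac{\theta}{(c_1 + c_2)(c_1 - c_2)} \int \tilde b_j(x)\, \partial_x \eta^2(x, \hat a_{3-j}, c_{3-j}) + \mathcal{A} \\
&= {\rm I} + {\rm II} + {\rm III} + {\rm IV} + {\rm V} + \mathcal{A},
\end{aligned}$$

where terms I-V are studied separately below.

$${\rm I} = b(\hat a_j)\, \partial_{c_j}\! \int q^2 = 2b(\hat a_j), \qquad {\rm II} = b'(\hat a_j) \Big( \partial_{c_j}\! \int x q^2 - \hat a_j\, \partial_{c_j}\! \int q^2 \Big) = 2b'(\hat a_j)(a_j - \hat a_j).$$

Term III is localized around $\hat a_j$, and thus we integrate by parts in $x$ and Taylor expand $\tilde b_j$ around $\hat a_j$ to obtain

$$\begin{aligned}
{\rm III} &= \frac{1}{c_j} \int \big( -\tilde b_j'(x)(x - a_j) + \tilde b_j(x) \big)\, \eta^2(x, \hat a_j, c_j) \\
&= -\frac{1}{2c_j}\, b''(\hat a_j) \int (x - \hat a_j)^2\, \eta^2(x, \hat a_j, c_j) - b''(\hat a_j)(\hat a_j - a_j) \int (x - \hat a_j)\, \eta^2(x, \hat a_j, c_j) + O(h^3) \\
&= -\frac{\pi^2}{12}\, b''(\hat a_j)\, c_j^{-2} + O(h^3).
\end{aligned}$$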
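The constant $\pi^2/12$ in Term III comes from the second moment $\int x^2 \operatorname{sech}^2 x\, dx = \pi^2/6$ together with the soliton scaling. The sketch below checks this numerically with composite Simpson quadrature. It assumes the scaling convention $\eta(x, \hat a, c) = c\,\operatorname{sech}(c(x - \hat a))$, which is not restated in this section, and all function names, the truncation width, and the panel count are illustrative choices.

```python
import math

def simpson(f, a, b, n=40000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sech(x):
    return 1.0 / math.cosh(x)

def second_moment(width=40.0):
    """Approximates the integral of x^2 sech^2(x) over the real line (tails are negligible)."""
    return simpson(lambda x: x * x * sech(x) ** 2, -width, width)

def scaled_second_moment(c, width=40.0):
    """Integral of y^2 eta^2 with eta = c sech(c y), centered at the soliton position."""
    return simpson(lambda y: y * y * (c * sech(c * y)) ** 2, -width, width)

def term_three_coefficient(c):
    """Coefficient of b''(a_hat) in Term III: -(1/(2c)) * scaled second moment."""
    return -scaled_second_moment(c) / (2.0 * c)
```

With these conventions `scaled_second_moment(c)` is approximately $\pi^2/(6c)$, so `term_three_coefficient(c)` is approximately $-\pi^2/(12 c^2)$, in agreement with the last line of the display for III.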

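Term IV is localized around $\hat a_j$, and thus we integrate by parts in $x$ and Taylor expand $\tilde b_j$ around $\hat a_j$ to obtain

$$\int \tilde b_j(x)\, \partial_x \eta^2(x, \hat a_j, c_j) = -\int \big(b'(x) - b'(\hat a_j)\big)\, \eta^2(x, \hat a_j, c_j) = -b''(\hat a_j) \int (x - \hat a_j)\, \eta^2(x, \hat a_j, c_j) + O(h^3) = O(h^3).$$

Term V is localized around $\hat a_{3-j}$, and thus we integrate by parts in $x$ and Taylor expand $\tilde b_j$ around $\hat a_{3-j}$:

$$\begin{aligned}
\int \tilde b_j(x)\, \partial_x \eta^2(x, \hat a_{3-j}, c_{3-j}) &= -\int \big(b'(x) - b'(\hat a_j)\big)\, \eta^2(x, \hat a_{3-j}, c_{3-j}) \\
&= -\big(b'(\hat a_{3-j}) - b'(\hat a_j)\big) \int \eta^2(x, \hat a_{3-j}, c_{3-j}) - b''(\hat a_{3-j}) \int (x - \hat a_{3-j})\, \eta^2(x, \hat a_{3-j}, c_{3-j}) + O(h^3) \\
&= -2c_{3-j}\big(b'(\hat a_{3-j}) - b'(\hat a_j)\big) + O(h^3).
\end{aligned}$$

Lemma 6.2 (Estimates on $F_\perp$).

$$\partial_x^{-1}\partial_{a_j} F_\perp = O(h^2) \cdot \mathcal{S}_{\rm sol}, \qquad \partial_x^{-1}\partial_{c_j} F_\perp = O(h^2) \cdot \mathcal{S}_{\rm sol}, \qquad j = 1, 2, \tag{6.5}$$

$$F_\perp = -\frac{1}{2} \sum_{j=1}^{2} \frac{b''(\hat a_j)}{c_j^2}\, \partial_x \tau(\cdot, \hat a_j, c_j) + \mathcal{A} \cdot \mathcal{S}_{\rm sol}, \tag{6.6}$$

where

$$\tau \stackrel{\rm def}{=} \Big( \frac{\pi^2}{12} + x^2 \Big) \eta(x), \qquad \tau(x, \hat a_j, c_j) \stackrel{\rm def}{=} c_j\, \tau(c_j(x - \hat a_j)). \tag{6.7}$$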

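In light of the above lemma, we introduce the notation $F_\perp = (F_\perp)_0 + \tilde F_\perp$, where

$$(F_\perp)_0 = -\frac{1}{2} \sum_{j=1}^{2} \frac{b''(\hat a_j)}{c_j^2}\, \partial_x \tau(\cdot, \hat a_j, c_j) \tag{6.8}$$

and $\tilde F_\perp \in \mathcal{A} \cdot \mathcal{S}_{\rm sol}$. We make use of (6.5) in §7 and (6.6) in §8–9.

Proof. We begin by proving (6.6). By (3.18), (3.19),

$$\begin{aligned}
\partial_x(bq) = (\partial_x b) q + b (\partial_x q) &= (\partial_x b) \sum_{j=1}^{2} \big( (x - a_j)\,\partial_{a_j} q + c_j\, \partial_{c_j} q \big) - b \sum_{j=1}^{2} \partial_{a_j} q \\
&= \sum_{j=1}^{2} \big( -b + (\partial_x b)(x - a_j) \big)\, \partial_{a_j} q + \sum_{j=1}^{2} (\partial_x b)\, c_j\, \partial_{c_j} q + O(h^3) \cdot \mathcal{S}_{\rm sol}.
\end{aligned} \tag{6.9}$$

The $\partial_{a_j} q$ term is well localized around $\hat a_j$, and thus we can Taylor expand the coefficients around $\hat a_j$. The $\partial_{c_j} q$ term we leave alone for the moment. We have

$$\partial_x(bq) = \sum_{j=1}^{2} \Big( -b(\hat a_j) + b'(\hat a_j)(\hat a_j - a_j) + b''(\hat a_j)(\hat a_j - a_j)(x - \hat a_j) + \frac{1}{2} b''(\hat a_j)(x - \hat a_j)^2 \Big)\, \partial_{a_j} q + \sum_{j=1}^{2} b'(x)\, c_j\, \partial_{c_j} q + \mathcal{A} \cdot \mathcal{S}_{\rm sol}.$$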

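Substituting the above together with (6.3) and (6.4) into (5.5), we obtain

$$\begin{aligned}
F_\perp &= \frac{1}{2} \sum_{j=1}^{2} b''(\hat a_j) \Big( \frac{\pi^2}{12} c_j^{-2} - 2(\hat a_j - a_j)(x - \hat a_j) - (x - \hat a_j)^2 \Big)\, \partial_{a_j} q \\
&\quad + \frac{\big(b'(\hat a_2) - b'(\hat a_1)\big)\theta}{(c_1 + c_2)(c_1 - c_2)} \sum_{j=1}^{2} (-1)^j c_{3-j}\, \partial_{a_j} q - \sum_{j=1}^{2} \big(b'(x) - b'(\hat a_j)\big)\, c_j\, \partial_{c_j} q + \mathcal{A} \cdot \mathcal{S}_{\rm sol}.
\end{aligned}$$

We now substitute (6.1) and (6.2), recognizing that this will only generate errors of type $\mathcal{A}$ times a Schwartz class function. We also Taylor expand around $\hat a_j$ or $\hat a_{3-j}$ depending upon the localization. This gives $F_\perp = {\rm I} + {\rm II} + {\rm III} + {\rm IV} + {\rm V} + {\rm VI} + \mathcal{A} \cdot \mathcal{S}_{\rm sol}$, where

$$\begin{aligned}
{\rm I} &= \frac{1}{2} \sum_{j=1}^{2} b''(\hat a_j) \Big( -\frac{\pi^2}{12} c_j^{-2} + 2(\hat a_j - a_j)(x - \hat a_j) + (x - \hat a_j)^2 \Big)\, \partial_x \eta(x, \hat a_j, c_j), \\
{\rm II} &= -\frac{\big(b'(\hat a_2) - b'(\hat a_1)\big)\theta}{(c_1 + c_2)(c_1 - c_2)} \sum_{j=1}^{2} (-1)^j c_{3-j}\, \partial_x \eta(x, \hat a_j, c_j), \\
{\rm III} &= -\sum_{j=1}^{2} b''(\hat a_j)(x - \hat a_j)\, \partial_x\big[(x - a_j)\eta(x, \hat a_j, c_j)\big], \\
{\rm IV} &= -\sum_{j=1}^{2} \frac{c_{3-j}\,\theta}{(c_1 + c_2)(c_1 - c_2)}\, b''(\hat a_j)(x - \hat a_j)\, \partial_x \eta(x, \hat a_j, c_j), \\
{\rm V} &= \sum_{j=1}^{2} \frac{c_j\,\theta}{(c_1 + c_2)(c_1 - c_2)}\, b''(\hat a_{3-j})(x - \hat a_{3-j})\, \partial_x \eta(x, \hat a_{3-j}, c_{3-j}), \\
{\rm VI} &= \sum_{j=1}^{2} \frac{c_j\,\theta}{(c_1 + c_2)(c_1 - c_2)}\, \big(b'(\hat a_{3-j}) - b'(\hat a_j)\big)\, \partial_x \eta(x, \hat a_{3-j}, c_{3-j}).
\end{aligned}$$

We have that ${\rm IV} + {\rm V} = 0$ and ${\rm II} + {\rm VI} = 0$. Hence

$$F_\perp = {\rm I} + {\rm III} + \mathcal{A} \cdot \mathcal{S}_{\rm sol} = -\frac{1}{2} \sum_{j=1}^{2} b''(\hat a_j)\, \partial_x \Big[ \Big( \frac{\pi^2}{12} c_j^{-2} + (x - \hat a_j)^2 \Big) \eta(x, \hat a_j, c_j) \Big] + \mathcal{A} \cdot \mathcal{S}_{\rm sol}.$$

This completes the proof of (6.6). To obtain (6.5), we note that a consequence of (6.6) is $F_\perp = O(h^2) f$, where $f \in \mathcal{S}_{\rm sol}$. By the definition (5.5) of $F_\perp$ and Corollary 3.3, we have $\partial_x^{-1} F_\perp \in \mathcal{S}_{\rm sol}$, and hence $f \in \mathcal{S}_{\rm sol}$.

7. Estimates on the Parameters

The equations of motion are recovered (in approximate form) using the symplectic orthogonality properties (3.26) of $v$ and Eq. (5.2) for $v$. For a function $G$ of the form

$$G = g_1\, \partial_{a_1} q + g_2\, \partial_{a_2} q + g_3\, \partial_{c_1} q + g_4\, \partial_{c_2} q$$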


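with $g_j = g_j(a, c)$, define ${\rm coef}(G) = (g_1, g_2, g_3, g_4)$.

Lemma 7.1. Suppose we are given $\delta_0 > 0$ and $b_0(x, t)$, and parameters $a(t)$, $c(t)$ such that $v$ defined by (5.1) satisfies the symplectic orthogonality conditions (3.26). Suppose, moreover, that the amplitude separation condition (5.6) holds. Then (with implicit constants depending upon $\delta_0 > 0$ and $L^\infty$ norms of $b_0$ and its derivatives), if $\|v\|_{H^2} \lesssim 1$, then we have

$$|{\rm coef}(F_\parallel)| \lesssim h^2 \|v\|_{H^1} + \|v\|_{H^1}^2. \tag{7.1}$$

Proof. Since $\langle v, \partial_x^{-1}\partial_{a_j} q \rangle = 0$, we have upon substituting (5.2)

$$\begin{aligned}
0 = \partial_t \langle v, \partial_x^{-1}\partial_{a_j} q \rangle &= \langle \partial_t v, \partial_x^{-1}\partial_{a_j} q \rangle + \langle v, \partial_t \partial_x^{-1}\partial_{a_j} q \rangle \\
&= \langle (\partial_x^2 + 6q^2) v, \partial_{a_j} q \rangle + \langle 6q v^2 + 2v^3, \partial_{a_j} q \rangle && \leftarrow {\rm I} + {\rm II} \\
&\quad - \langle bv, \partial_{a_j} q \rangle - \langle F_\parallel, \partial_x^{-1}\partial_{a_j} q \rangle - \langle F_\perp, \partial_x^{-1}\partial_{a_j} q \rangle && \leftarrow {\rm III} + {\rm IV} + {\rm V} \\
&\quad + \Big\langle v, \partial_x^{-1}\partial_{a_j} \Big( \sum_{k=1}^{2} \partial_{a_k} q\, \dot a_k + \sum_{k=1}^{2} \partial_{c_k} q\, \dot c_k \Big) \Big\rangle && \leftarrow {\rm VI}.
\end{aligned}$$

We have, by (3.17),

$${\rm I} = \langle v, \partial_{a_j}(\partial_x^2 q + 2q^3) \rangle = -\frac{1}{2} \langle v, \partial_x^{-1}\partial_{a_j} \partial_x I_3'(q) \rangle = -\Big\langle v, \partial_x^{-1}\partial_{a_j} \sum_{k=1}^{2} c_k^2\, \partial_{a_k} q \Big\rangle.$$

Also, by (5.5),

$${\rm III} = -\langle bv, \partial_{a_j} q \rangle = -\langle v, \partial_{a_j}(bq) \rangle = -\langle v, \partial_x^{-1}\partial_{a_j} \partial_x(bq) \rangle = -\Big\langle v, \partial_x^{-1}\partial_{a_j} \Big( -F_\perp - \frac{1}{2} \sum_{k=1}^{2} (\partial_{c_k} B)\, \partial_{a_k} q + \frac{1}{2} \sum_{k=1}^{2} (\partial_{a_k} B)\, \partial_{c_k} q \Big) \Big\rangle.$$

Thus

$$|{\rm I} + {\rm III} + {\rm VI}| = |\langle v, \partial_x^{-1}\partial_{a_j} F_\perp \rangle + \langle v, \partial_x^{-1}\partial_{a_j} F_\parallel \rangle| \le \|v\|_{L^2} \big( \|\partial_x^{-1}\partial_{a_j} F_\perp\|_{L^2} + \|\partial_x^{-1}\partial_{a_j} F_\parallel\|_{L^2} \big) \lesssim \|v\|_{L^2} \big( h^2 + |{\rm coef}(F_\parallel)| \big).$$

Next, we note that by Cauchy-Schwarz, $|{\rm II}| \lesssim \|v\|_{H^1}^2$. Next, observe from (5.4) and Lemma 3.6 that

$${\rm IV} = \langle F_\parallel, \partial_x^{-1}\partial_{a_j} q \rangle = -\big(\dot c_j - \tfrac{1}{2}\partial_{a_j} B\big).$$

Of course, we have ${\rm V} = \langle F_\perp, \partial_x^{-1}\partial_{a_j} q \rangle = 0$. Combining, we obtain

$$\big|\dot c_j - \tfrac{1}{2}\partial_{a_j} B\big| \lesssim \|v\|_{H^1} \big( h^2 + |{\rm coef}(F_\parallel)| \big) + \|v\|_{H^1}^2. \tag{7.2}$$

A similar calculation, applying $\partial_t$ to the identity $0 = \langle v, \partial_x^{-1}\partial_{c_j} q \rangle$, yields

$$\big|\dot a_j - c_j^2 + \tfrac{1}{2}\partial_{c_j} B\big| \lesssim \|v\|_{H^1} \big( h^2 + |{\rm coef}(F_\parallel)| \big) + \|v\|_{H^1}^2. \tag{7.3}$$

Combining (7.2) and (7.3) gives (7.1).

8. Correction Term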

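Recall the definition (6.7) of $\tau$. Let $\rho$ be the unique function solving

$$(1 - \partial_x^2 - 6\eta^2)\rho = \tau;$$

see [19, Prop. 4.2] for the properties of this equation. The function $\rho$ is smooth, exponentially decaying at $\infty$, and satisfies the symplectic orthogonality conditions

$$\langle \rho, \eta \rangle = 0, \qquad \langle \rho, x\eta \rangle = 0. \tag{8.1}$$

Set

$$\rho(x, \hat a_j, c_j) \stackrel{\rm def}{=} c_j^{-1} \rho(c_j(x - \hat a_j)),$$

and note that

$$\big(c_j^2 - \partial_x^2 - 6\eta^2(\cdot, \hat a_j, c_j)\big)\, \rho(\cdot, \hat a_j, c_j) = \tau(\cdot, \hat a_j, c_j).$$

Define the symplectic projection operator

$$Pf \stackrel{\rm def}{=} \sum_{j=1}^{2} \langle f, \partial_x^{-1}\partial_{c_j} q \rangle\, \partial_{a_j} q + \sum_{j=1}^{2} \langle f, \partial_x^{-1}\partial_{a_j} q \rangle\, \partial_{c_j} q.$$

Define

$$w \stackrel{\rm def}{=} -\frac{1}{2} (I - P) \sum_{j=1}^{2} \frac{b''(\hat a_j)}{c_j^2}\, \rho(\cdot, \hat a_j, c_j). \tag{8.2}$$

Note that $w = O(h^2)$ and clearly now $w$ satisfies

$$\langle w, \partial_x^{-1}\partial_{a_j} q \rangle = 0, \qquad \langle w, \partial_x^{-1}\partial_{c_j} q \rangle = 0. \tag{8.3}$$

Recall the definition (6.8) of $(F_\perp)_0$.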
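The scaling identity for $\rho(\cdot, \hat a_j, c_j)$ noted above can be verified in two lines. The following is a short verification sketch rather than part of the original text. It assumes the soliton scaling convention $\eta(x, \hat a_j, c_j) = c_j \eta(c_j(x - \hat a_j))$, which is taken from the earlier sections and not restated here, and it uses the shorthand $y = c_j(x - \hat a_j)$, introduced only for this computation.

```latex
\begin{align*}
\partial_x^2\big[c_j^{-1}\rho(y)\big] &= c_j\,\rho''(y),
  \qquad
  6\eta^2(x,\hat a_j,c_j)\,c_j^{-1}\rho(y) = 6c_j\,\eta^2(y)\,\rho(y), \\
\big(c_j^2-\partial_x^2-6\eta^2(\cdot,\hat a_j,c_j)\big)\rho(\cdot,\hat a_j,c_j)
  &= c_j\big(\rho-\rho''-6\eta^2\rho\big)(y)
   = c_j\,\tau(y)
   = \tau(x,\hat a_j,c_j).
\end{align*}
```

The last equality is the definition (6.7) of $\tau(x, \hat a_j, c_j)$, and the middle one is the defining equation $(1 - \partial_x^2 - 6\eta^2)\rho = \tau$ evaluated at $y$.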


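Lemma 8.1. If $\dot c_j = O(h)$, and $\dot a_j = c_j^2 - b(\hat a_j) + O(h)$, then

$$\partial_t w + \partial_x\big(\partial_x^2 w + 6q^2 w - bw\big) = -(F_\perp)_0 - G + \mathcal{A} \cdot \mathcal{S}_{\rm sol}, \tag{8.4}$$

where $G$ is an $O(h^2)$ term that is symplectically parallel to $M$, i.e. $G \in {\rm span}\{\partial_x^{-1}\partial_{a_1} q, \partial_x^{-1}\partial_{a_2} q, \partial_x^{-1}\partial_{c_1} q, \partial_x^{-1}\partial_{c_2} q\}$.

Proof. Let

$$w_j = \frac{b''(\hat a_j)}{c_j^2}\, \rho(\cdot, \hat a_j, c_j).$$

Then

$$\begin{aligned}
\partial_t w_j &= b'''(\hat a_j)\, \dot{\hat a}_j\, c_j^{-2} \rho(\cdot, \hat a_j, c_j) - 2b''(\hat a_j)\, c_j^{-3} \dot c_j\, \rho(\cdot, \hat a_j, c_j) + b''(\hat a_j)\, c_j^{-2}\, \dot{\hat a}_j\, \partial_a \rho(\cdot, \hat a_j, c_j) \\
&\quad + b''(\hat a_j)\, c_j^{-2}\, \dot c_j\, \partial_c \rho(\cdot, \hat a_j, c_j) + \partial_t b''(\hat a_j)\, c_j^{-2} \rho(\cdot, \hat a_j, c_j) \\
&= -\dot a_j\, \partial_x w_j + \mathcal{A} \cdot \mathcal{S}_{\rm sol}.
\end{aligned}$$

Also, we have

$$(\partial_x^2 + 6q^2) w_j = \big(\partial_x^2 + 6\eta^2(\cdot, \hat a_j, c_j)\big) w_j + \mathcal{A} \cdot \mathcal{S}_{\rm sol} = c_j^2 w_j - b''(\hat a_j)\, c_j^{-2}\, \tau(\cdot, \hat a_j, c_j) + \mathcal{A} \cdot \mathcal{S}_{\rm sol}.$$

Also, $b w_j = b(\hat a_j) w_j + \mathcal{A} \cdot \mathcal{S}_{\rm sol}$. Combining, we obtain

$$\begin{aligned}
\partial_t w_j + \partial_x\big(\partial_x^2 w_j + 6q^2 w_j - b w_j\big) &= -b''(\hat a_j)\, c_j^{-2}\, \partial_x \tau(\cdot, \hat a_j, c_j) + \big(-\dot a_j + c_j^2 - b(\hat a_j)\big)\, \partial_x w_j + \mathcal{A} \cdot \mathcal{S}_{\rm sol} \\
&= -b''(\hat a_j)\, c_j^{-2}\, \partial_x \tau(\cdot, \hat a_j, c_j) + \mathcal{A} \cdot \mathcal{S}_{\rm sol}.
\end{aligned}$$

Now we discuss $\partial_t P w_j$:

$$\begin{aligned}
\partial_t P w_j &= \langle \partial_t w_j, \partial_x^{-1}\partial_{a_1} q \rangle\, \partial_{c_1} q + \langle w_j, \partial_t \partial_x^{-1}\partial_{a_1} q \rangle\, \partial_{c_1} q + \text{similar} \\
&\quad + \langle w_j, \partial_x^{-1}\partial_{a_1} q \rangle\, \partial_t \partial_{c_1} q + \text{similar}.
\end{aligned}$$

The first line of terms is symplectically parallel to $M$. For the second line, note that by (8.1), we have $\langle w_j, \partial_x^{-1}\partial_{a_1} q \rangle = \mathcal{A}$. Consequently, $\partial_t P w_j = T_q M + \mathcal{A} \cdot \mathcal{S}_{\rm sol}$.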

Define $\tilde u$ and $\tilde v$ by

$$u = \tilde u + w, \qquad v = \tilde v + w. \tag{8.5}$$

Of course, it follows that $\tilde u = q + \tilde v$. Note that by (3.26) and (8.3), we have

$$\langle \tilde v, \partial_x^{-1}\partial_{a_j} q \rangle = 0 \quad \text{and} \quad \langle \tilde v, \partial_x^{-1}\partial_{c_j} q \rangle = 0, \qquad j = 1, 2. \tag{8.6}$$

Note that $\tilde u$ solves

$$\partial_t \tilde u = -\partial_x\big(\partial_x^2 \tilde u + 2\tilde u^3 - b\tilde u\big) - \partial_t w - \partial_x\big(\partial_x^2 w + 6\tilde u^2 w - bw\big) + O(h^4), \tag{8.7}$$

where the $O(h^4)$ terms arise from $w^2$ and $w^3$. Moreover, if we make the mild assumption that $\tilde v = O(h)$, then $\tilde u^2 w = q^2 w + O(h^3)$. By (8.7) and (8.4), we have

$$\partial_t \tilde u = -\partial_x\big(\partial_x^2 \tilde u + 2\tilde u^3 - b\tilde u\big) + (F_\perp)_0 + G + \mathcal{A} \cdot \mathcal{S}_{\rm sol}. \tag{8.8}$$

Since $\tilde u = q + \tilde v$, we have (in analogy with (5.2))

$$\partial_t \tilde v = \partial_x\big(-\partial_x^2 \tilde v - 6q^2 \tilde v + b\tilde v\big) - F_\parallel - \tilde F_\perp + G + \mathcal{A} \cdot \mathcal{S}_{\rm sol} + O(h^3)_{H^1}, \tag{8.9}$$

where we have made the assumption that $\tilde v = O(h^{3/2})$ in order to discard the $\tilde v^2$ and $\tilde v^3$ terms. We thus see that, in comparison to $v$, the equation for $\tilde v$ has a lower-order inhomogeneity, but still satisfies the symplectic orthogonality conditions (8.6) and $v = \tilde v + O(h^2)$.

9. Energy Estimate

Since $w = O(h^2)$, to obtain the desired bound on $v$ it will suffice to obtain a bound for $\tilde v$. This will be achieved by the "energy method."

Lemma 9.1. Suppose we are given $\delta_0 > 0$ and $b_0(x, t)$, and parameters $a(t)$, $c(t)$ such that $v$ defined by (5.1) satisfies the symplectic orthogonality conditions (3.26) on $[0, T]$. Suppose, moreover, that the amplitude separation condition (5.6) holds on $[0, T]$. Then (with implicit constants depending upon $\delta_0 > 0$ and $L^\infty$ norms of $b_0$ and its derivatives), if $\|v\|_{H^2} \lesssim 1$ and $T \ll h^{-1}$, then

$$\|v\|_{L^\infty_{[0,T]} H^2}^2 \lesssim \|v(0)\|_{H^2}^2 + h^4 \Big( 1 + \int_0^T \langle a_1 - a_2 \rangle^{-N}\, dt \Big)^2.$$

Proof. Recall that we have defined Hc (u) = I5 (u) + (c12 + c22 )I3 (u) + c12 c22 I1 (u). With w given by (8.2) and u˜ given by (8.5), let E(t) = Hc (u) ˜ − Hc (q). Then ∂t E = Hc (u), ˜ ∂t u ˜ − Hc (q), ∂t q + 2(c1 c˙1 + c2 c˙2 )(I3 (u) ˜ − I3 (q)) +2c1 c2 (c1 c˙2 + c˙1 c2 )(I1 (u) ˜ − I1 (q)) = I + II + III + IV.



Note that II = 0 since Lemma 4.1 showed that Hc (q) = 0. For III, we have by (3.17) and the orthogonality conditions (8.6), ˜ + O(v ˜ 2H 1 )) III = 2(c1 c˙1 + c2 c˙2 )( I3 (q), v 2 = 4(c1 c˙1 + c2 c˙2 ) c2j ∂x−1 ∂a j q, v ˜ + O((|c˙1 | + |c˙2 |)v ˜ 2H 1 ) j=1

= O((|c˙1 | + |c˙2 |)v ˜ 2H 1 ). Term IV is bounded similarly. It remains to study Term I. Writing (8.8) as ∂t u˜ = 1 ˜ + ∂ (bu) ˜ + (F⊥ )0 + G + A · Ssol and appealing to (2.5), we have by Lemma 2.1 x 2 ∂x I3 (u) (with u replaced by u˜ in that lemma) that I = Hc (u), ˜ ∂x (bu) ˜ + Hc (u), ˜ (F⊥ )0 + A · Ssol = 5 bx , A5 (u) ˜ − 5 bx x x , A3 (u) ˜ + bx x x x x , A1 (u) ˜ +(c12 + c22 )(3 bx , A3 (u) ˜ − bx x x , A1 (u) ) ˜ + c12 c22 bx , A1 (u) ˜ + Hc (u), ˜ (F⊥ )0 + A · Ssol . Expand A j (u) ˜ = A j (q + v) ˜ = A j (q) + A j (q)(v) ˜ + O(v˜ 2 ) and Hc (u) ˜ = Hc (q) + Kc,a v˜ + O(v˜ 2 ) = Kc,a v˜ + O(v˜ 2 ) to obtain I = IA + IB + IC, where IA = 5 bx , A5 (q) − 5 bx x x , A3 (q) + bx x x x x , A1 (q) +(c12 + c22 )(3 bx , A3 (q) − bx x x , A1 (q) ) + c12 c22 bx , A1 (q) , IB = 5 bx , A 5 (q)(v) ˜ − 5 bx x x , A 3 (q)(v) ˜ + bx x x x x , A 1 (q)(v) ˜ +(c12 + c22 )(3 bx , A 3 (q)(v) ˜ − bx x x , A 1 (q)(v) ) ˜ + c12 c22 bx , A 1 (q)(v) , ˜ IC = Kc,a v, ˜ (F⊥ )0 + O(hv ˜ 2H 2 ) + O(A · v ˜ H 2 ). Then reapply Lemma 2.1 (with u replaced by q in that lemma) to obtain that IA = − Hc (q), ∂x (bq) = 0. Applying Lemma 2.2, ˜ (bq)x − ∂x Hc (q), bv ˜ IB = Kc,a v, = Kc,a v, ˜ (bq)x . In summary thus far, we have obtained that ˜ (bq)x + (F⊥ )0 + O(hv ˜ 2H 2 ) + O(Av ˜ H 2 ). ∂t E = Kc,a v, By (4.12), (4.13), and (8.6) (recalling the definition (5.3) of F0 ), we obtain ˜ ∂x (bq) = − Kc,a v, ˜ F0 = − Kc,a v, ˜ F + F⊥ . Kc,a v, Hence ˜ F + F˜⊥ + O(hv ˜ 2H 2 ) + O(Av ˜ H 2 ). ∂t E = − Kc,a v, It follows from Lemma 7.1 and F˜⊥ ∈ A · Ssol (see (6.6), (6.8)) that ˜ H 2 + hv ˜ 2H 2 . |∂t E| (h 2 a1 − a2 −N + h 3 )v



If T = δh −1 , E(T ) = E(0) + h 1 + 2

T

−N

a1 − a2

v ˜ L∞

2 [0,T ] Hx

0

+ hv ˜ 2L ∞

[0,T ] H

2

.

˜ we have By Lemma 4.1, the definition of E and Kc,a , and the fact that u˜ = q + v, |E − Kc,a v, ˜ v | ˜ v ˜ 3H 2 . Applying this at time 0 and T , together with the coercivity of K (Prop. 4.3),

T 2 2 2 −N v ˜ L ∞ Hx2 + hv v(T ˜ ) H 2 v(0) ˜ +h 1+ a1 − a2 ˜ 2L ∞ H2

[0,T ] H

[0,T ]

0

2

.

Replacing T by T such that 0 ≤ T ≤ T , and taking the supremum in T over 0 ≤ T ≤ T , we obtain

T 2 2 −N 1 + v ˜ L ∞ Hx2 + hv ˜ + h a − a

˜ 2L ∞ H 2 . v ˜ 2L ∞ H 2 v(0) 1 2 H2 [0,T ]

[0,T ]

0

[0,T ]

By selecting δ small enough, we obtain v ˜ 2L ∞

[0,T ] H

2

2 4 1 + v(0) ˜ + h H2

T

a1 − a2 −N dt

2 .

0

Finally, using that w H 2 ∼ h 2 , and v = v˜ + w, we obtained the claimed estimate.

10. Proof of the Main Theorem

We start with the proposition which links the ODE analysis with the estimates on the error term $v$:

Proposition 10.1. Suppose we are given $b_0 \in C_b^\infty(\mathbb{R}^2)$ and $\delta_0 > 0$. (Implicit constants below depend only on $b_0$ and $\delta_0$.) Suppose that we are further given $\bar a \in \mathbb{R}^2$, $\bar c \in \mathbb{R}^2 \setminus C$, $\kappa \ge 1$, $h > 0$, and $v_0$ satisfying (3.26), such that

$$0 < h \lesssim \kappa^{-1}, \qquad \|v_0\|_{H_x^2} \le \kappa h^2.$$

Let $u(t)$ be the solution to (1.1) with $b(x, t) = b_0(hx, ht)$ and initial data $\eta(\cdot, \bar a, \bar c) + v_0$. Then there exist a time $T > 0$ and trajectories $a(t)$ and $c(t)$ defined on $[0, T]$ such that $a(0) = \bar a$, $c(0) = \bar c$ and the following holds, with $v \stackrel{\rm def}{=} u - \eta(\cdot, a, c)$:

(1) On $[0, T]$, the orthogonality conditions (3.26) hold.
(2) Either $c_1(T) = \delta_0$, $c_1(T) = c_2(T) - \delta_0$, $c_2(T) = \delta_0^{-1}$, or $T = \omega h^{-1}$, where $\omega \ll 1$.
(3) $|\dot a_j - c_j^2 + b(a_j, t)| \lesssim h$.
(4) $|\dot c_j - c_j b'(a_j)| \lesssim h^2$.
(5) $\|v\|_{L^\infty_{[0,T]} H_x^2} \le \alpha\kappa h^2$, where $\alpha \gg 1$.

Here $\alpha$ and $\omega$ are constants depending only on $b_0$ and $\delta_0$ (independent of $\kappa$, etc.).


Proof. Recall our convention that implicit constants depend only on $b_0$ and $\delta_0$. By Lemma 3.7 and the continuity of the flow $u(t)$ in $H^2$, there exists some $T'' > 0$ on which $a(t)$, $c(t)$ can be defined so that (3.26) hold. Now take $T''$ to be the maximal time on which $a(t)$, $c(t)$ can be defined so that (3.26) holds. Let $T'$ be the first time $0 \le T' \le T''$ such that $c_1(T') = \delta_0$, $c_1(T') = c_2(T') - \delta_0$, $c_2(T') = \delta_0^{-1}$, $T' = T''$, or $T' = \omega h^{-1}$ (whichever comes first). Here, $0 < \omega \ll 1$ is a constant that will be chosen suitably small at the end of the proof (depending only upon implicit constants in the estimates, and hence only on $b_0$ and $\delta_0$).

Remark 10.2. We will show that on $[0, T']$, we have $\|v(t)\|_{H_x^2} \lesssim \kappa h^2$, and hence by Lemma 3.7 and the continuity of the $u(t)$ flow, it must be the case that either $c_1(T') = \delta_0$, $c_1(T') = c_2(T') - \delta_0$, $c_2(T') = \delta_0^{-1}$, or $T' = \omega h^{-1}$ (i.e. the case $T' = T''$ does not arise).

Let $T$, $0 < T \le T'$, be the maximal time such that

$$\|v\|_{L^\infty_{[0,T]} H_x^2} \le \alpha\kappa h^2, \tag{10.1}$$

where $\alpha$ is a suitably large constant related to the implicit constants in the estimates (and thus dependent only upon $b_0$ and $\delta_0 > 0$).

Remark 10.3. We will show, assuming that (10.1) holds, that $\|v\|_{L^\infty_{[0,T]} H_x^2} \le \frac{1}{2}\alpha\kappa h^2$, and thus by continuity we must have $T = T'$.

In the remainder of the proof, we work on the time interval $[0, T]$, and we are able to assume that the orthogonality conditions (3.26) hold, $\delta_0 \le c_1(t) \le c_2(t) - \delta_0 \le \delta_0^{-1}$, and that (10.1) holds. By Lemma 7.1 and Taylor expansion, we have (since $\kappa^2 h^4 \lesssim h^2$)

$$\dot a_j = c_j^2 - b(a_j, t) + O(h), \qquad \dot c_j = c_j\,\partial_x b(a_j, t) + O(h^2), \qquad (10.2)$$

with initial data $a_j(0) = \bar a_j$, $c_j(0) = \bar c_j$. Let

$$\xi(t) \stackrel{\mathrm{def}}{=} \frac{b(a_1(t), t) - b(a_2(t), t)}{a_1(t) - a_2(t)},$$

and let $\Xi(t)$ denote an antiderivative. By the mean-value theorem $|\xi| \lesssim h$, and since $T \le \omega h^{-1}$, we have $e^{\Xi} \sim 1$. We then have

$$\frac{d}{dt}\bigl[e^{\Xi}(a_2 - a_1)\bigr] = e^{\Xi}(c_2^2 - c_1^2) + O(h).$$

Since $\delta_0^2 \le c_2^2 - c_1^2$, we see that $e^{\Xi}(a_2 - a_1)$ is strictly increasing. Let $0 \le t_1 \le T$ denote the unique time at which $e^{\Xi}(a_2 - a_1) = 0$ (if the quantity is always positive, take $t_1 = 0$, and if the quantity is always negative, take $t_1 = T$, and make straightforward modifications to the argument below). If $t < t_1$, integrating from $t$ to $t_1$ we obtain

$$\delta_0^2 (t_1 - t) \lesssim -e^{\Xi(t)}(a_2(t) - a_1(t)) = e^{\Xi(t)}|a_2(t) - a_1(t)|.$$

If $t > t_1$, integrating from $t_1$ to $t$ we obtain

$$\delta_0^2 (t - t_1) \lesssim e^{\Xi(t)}(a_2(t) - a_1(t)).$$



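Hence,

$$\int_0^T \langle a_2(t) - a_1(t) \rangle^{-2}\,dt \lesssim 1.$$

By Lemma 9.1, we conclude that

$$\|v\|_{L^\infty_T H^2_x} \le \frac{\alpha}{4}\bigl(\|v(0)\|_{H^2} + h^2\bigr) \le \frac{\alpha}{4}(\kappa h^2 + h^2) \le \frac12 \alpha \kappa h^2 .$$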

We can now complete the proof of the Main Theorem.

Proof of the Main Theorem. Suppose that $\|v_0\|_{H^2} \le h^2$. Iterate Prop. 10.1, as long as the condition

$$\delta_0 \le c_1 \le c_2 - \delta_0 \le \delta_0^{-1} \qquad (10.3)$$

remains true, as follows: for the $k$-th iterate, put $\kappa = \alpha^k$ in Prop. 10.1 and advance from time $t_k = k\omega h^{-1}$ to time $t_{k+1} = (k+1)\omega h^{-1}$. At time $t_k$, we have $\|v(t_k)\|_{H^2} \le \alpha^k h^2$, and we find from Prop. 10.1 that $\|v\|_{L^\infty_{[t_k, t_{k+1}]} H^2_x} \le \alpha^{k+1} h^2$. Provided (10.3) holds on all of $[0, t_K]$, we can continue until $\kappa^{-1} \sim h$, i.e. $K \sim \log h^{-1}$.

Recall (1.6), and $A_j(T)$, $C_j(T)$ defined by (1.12). Let $\hat a_j(t) = h^{-1} A_j(ht)$, $\hat c_j(t) = C_j(ht)$. Then $\hat a_j$, $\hat c_j$ solve

$$\dot{\hat a}_j = \hat c_j^2 - b(\hat a_j, t), \qquad \dot{\hat c}_j = \hat c_j\,\partial_x b(\hat a_j, t),$$

with initial data $\hat a_j(0) = \bar a_j$, $\hat c_j(0) = \bar c_j$. We know that (10.3) holds for $\hat c_j$ on $[0, h^{-1} T_0]$. Let $\tilde a_j = a_j - \hat a_j$, $\tilde c_j = c_j - \hat c_j$ denote the differences. Let

$$\gamma(t) \stackrel{\mathrm{def}}{=} \frac{b(a_j, t) - b(\hat a_j, t)}{a_j - \hat a_j}, \qquad \sigma(t) \stackrel{\mathrm{def}}{=} \frac{\partial_x b(a_j, t) - \partial_x b(\hat a_j, t)}{a_j - \hat a_j}.$$

By the mean-value theorem, $|\gamma(t)| \lesssim h$ and $|\sigma(t)| \lesssim h^2$. We have

$$\dot{\tilde a}_j = \tilde c_j^2 + 2\hat c_j \tilde c_j - \gamma \tilde a_j + O(h), \qquad \dot{\tilde c}_j = \tilde c_j\,(\partial_x b)(a_j, t) + \hat c_j \sigma \tilde a_j + O(h^2). \qquad (10.4)$$

We conclude that $|\tilde a_j| \lesssim e^{Cht}$ and $|\tilde c_j| \lesssim h e^{Cht}$. This is proved by Gronwall's method and a bootstrap argument. Since (10.3) holds for $\hat c_j$ on $[0, h^{-1} T_0]$, it holds for $c_j$ on the same time scale if $T_0 < \infty$, and up to the maximum time allowable by the above iteration argument, $h^{-1} \log h^{-1}$, if $T_0 = +\infty$.

Acknowledgements. The authors gratefully acknowledge the following sources of funding: J.H. was supported in part by a Sloan Fellowship and the NSF grant DMS-0901582, G.P.'s visit to Berkeley in November of 2008 was supported in part by the France-Berkeley Fund, and M.Z. was supported in part by the NSF grant DMS-0654436.



Appendix A. Local and Global Well-Posedness

In this appendix, we will prove that (1.1) is globally well-posed in $H^k$, $k \ge 1$, provided

$$M(T) \stackrel{\mathrm{def}}{=} \sum_{j=0}^{k+1} \|\partial_x^j b(x,t)\|_{L^\infty_{[0,T]} L^\infty_x} < \infty \qquad (A.1)$$

for all $T > 0$. This is proved for $k = 1$ under the additional assumption that $\|b\|_{L^2_x L^\infty_T} < \infty$ in the appendix of Dejak-Sigal [11] (see footnote 2). The removal of the assumption $\|b\|_{L^2_x L^\infty_T} < \infty$ is convenient since it allows us to consider potentials that asymptotically in $x$ converge to a nonzero number, rather than decay. Moreover, our argument is self-contained.

Well-posedness for KdV (nonlinearity $\partial_x u^2$) with $b \equiv 0$ was obtained by Bona-Smith [5] via the energy method, using the vanishing viscosity technique for construction and a regularization argument for uniqueness. Although their argument adapts to include $b \ne 0$ and to mKdV (1.1), it applies only for $k > \frac32$ due to the derivative in the nonlinearity. Kenig-Ponce-Vega [20,21] reduced the regularity requirements (for $b \equiv 0$) below $k = 1$ by introducing new local smoothing and maximal function estimates and applying the contraction method. These estimates were obtained by Fourier analysis (Plancherel's Theorem, van der Corput Lemma). At the $H^1$ level of regularity (and above) for mKdV, the full strength of the maximal function estimate in [20,21] is not needed. Here, we prove a local smoothing estimate and a (weak) maximal function estimate (see (A.2) and (A.3) in Lemma A.1 below) instead by the integrating factor method, which easily accommodates the inclusion of a potential term since integration by parts can be applied. The estimates proved by Kenig-Ponce-Vega were directly applied by Dejak-Sigal, treating the potential term as a perturbation, which required introducing the norm $\|b\|_{L^2_x L^\infty_T}$. Our argument does not apply directly to KdV since we are lacking the (strong) maximal function estimate used by [20,21].

Let $Q_n = [n - \frac12, n + \frac12]$ so that $\mathbb{R} = \cup Q_n$. Let $\tilde Q_n = [n - 1, n + 1]$. An example of our notation is:

$$\|u\|_{\ell^\infty_n L^2_T L^2_{Q_n}} = \sup_n \|u\|_{L^2_{(0,T)} L^2_{Q_n}} .$$

We will use variants like $\ell^2_n L^\infty_T L^2_{Q_n}$, etc. Note that due to the finite incidence of overlap, we have

$$\|u\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \sim \|u\|_{\ell^\infty_n L^2_T L^2_{\tilde Q_n}} .$$

Theorem A.1 (Local well-posedness). Take $k \in \mathbb{Z}$, $k \ge 1$. Suppose that

$$M \stackrel{\mathrm{def}}{=} \sum_{j=0}^{k+1} \|\partial_x^j b(x,t)\|_{L^\infty_{[0,1]} L^\infty_x} < \infty .$$

For any $R \ge 1$, take $T \lesssim \min(M^{-1}, R^{-4})$.

Footnote 2: It is further assumed in [11] that $\|b\|_{L^\infty_T L^\infty_x}$ is small, although this appears to be unnecessary in their argument.



(1) If $\|u_0\|_{H^k} \le R$, there exists a solution $u(t) \in C([0,T]; H^k_x)$ to (1.1) on $[0,T]$ with initial data $u_0(x)$ satisfying

$$\|u\|_{L^\infty_T H^k_x} + \|\partial_x^{k+1} u\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \lesssim R .$$

(2) This solution $u(t)$ is unique among all solutions in $C([0,T]; H^1_x)$.
(3) The data-to-solution map $u_0 \mapsto u(t)$ is continuous as a mapping $H^k \to C([0,T]; H^k_x)$.

The main tool in the proof of Theorem A.1 is the local smoothing estimate (A.2) below.

Lemma A.1. Suppose that $v_t + v_{xxx} - (bv)_x = f$. We have, for

$$T \lesssim \bigl(1 + \|b_x\|_{L^\infty_T L^\infty_x} + \|b\|_{L^\infty_T L^\infty_x}\bigr)^{-1},$$

the energy and local smoothing estimates

$$\|v\|_{L^\infty_T L^2_x} + \|v_x\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \lesssim \|v_0\|_{L^2_x} + \begin{cases} \|\partial_x^{-1} f\|_{\ell^1_n L^2_T L^2_{Q_n}} \\ \|f\|_{L^1_T L^2_x} \end{cases} \qquad (A.2)$$

and the maximal function estimate

$$\|v\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \lesssim \|v_0\|_{L^2_x} + T^{1/2} \|v\|_{L^2_T H^1_x} + T^{1/2} \|f\|_{L^2_T L^2_x} . \qquad (A.3)$$

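The implicit constants are independent of $b$.

Proof. Let $\varphi(x) = -\tan^{-1}(x - n)$, and set $w(x,t) = e^{\varphi(x)} v(x,t)$. Note that $0 < e^{-\pi/2} \le e^{\varphi(x)} \le e^{\pi/2} < \infty$, so the inclusion of this factor is harmless in the estimates, although it has the benefit of generating the "local smoothing" term in (A.2). We have

$$\partial_t w + w_{xxx} - 3\varphi' w_{xx} + 3\bigl(-\varphi'' + (\varphi')^2\bigr) w_x + \bigl(-\varphi''' + 3\varphi''\varphi' - (\varphi')^3\bigr) w - (bw)_x + \varphi' b w = e^{\varphi} f .$$

This equation and manipulations based on integration by parts show that

$$\partial_t \|w\|_{L^2_x}^2 = 6\langle \varphi', w_x^2 \rangle - 3\langle (-\varphi'' + (\varphi')^2)', w^2 \rangle + 2\langle -\varphi''' + 3\varphi''\varphi' - (\varphi')^3, w^2 \rangle - \langle b_x, w^2 \rangle + 2\langle b\varphi', w^2 \rangle + 2\langle w, e^{\varphi} f \rangle .$$

We integrate the above identity over $[0,T]$, move the smoothing term $6\int_0^T \langle \varphi', w_x^2 \rangle_x\,dt$ over to the left side, and estimate the remaining terms to obtain:

$$\|w(T)\|_{L^2_x}^2 + 6\|\langle x - n \rangle^{-1} w_x\|_{L^2_T L^2_x}^2 \le \|w_0\|_{L^2_x}^2 + CT\bigl(1 + \|b_x\|_{L^\infty_T L^\infty_x} + \|b\|_{L^\infty_T L^\infty_x}\bigr)\|w\|_{L^\infty_T L^2_x}^2 + C\int_0^T\!\!\int e^{\varphi} f w\,dx\,dt .$$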


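Replacing $T$ by $T'$, and taking the supremum over $T' \in [0,T]$, we obtain, for $T \lesssim \bigl(1 + \|b_x\|_{L^\infty_{[0,T]} L^\infty_x} + \|b\|_{L^\infty_{[0,T]} L^\infty_x}\bigr)^{-1}$, the estimate

$$\|w\|_{L^\infty_T L^2_x}^2 + \|\langle x - n \rangle^{-1} w_x\|_{L^2_T L^2_x}^2 \lesssim \|w_0\|_{L^2_x}^2 + \int_0^T\!\!\int e^{\varphi} f w\,dx\,dt .$$

Using that $0 < e^{-\pi/2} \le e^{\varphi} \le e^{\pi/2} < \infty$, this estimate can be converted back to an estimate for $v$:

$$\|v\|_{L^\infty_T L^2_x}^2 + \|v_x\|_{L^2_T L^2_{Q_n}}^2 \lesssim \|v_0\|_{L^2_x}^2 + \int_0^T\!\!\int e^{2\varphi} f v\,dx\,dt .$$

Estimating as

$$\int_0^T\!\!\int e^{2\varphi} f v\,dx\,dt \lesssim \|f\|_{L^1_T L^2_x} \|v\|_{L^\infty_T L^2_x},$$

and then taking the supremum in $n$ yields the second bound in (A.2). Estimating instead as

$$\int_0^T\!\!\int e^{2\varphi} f v\,dx\,dt = \int_0^T\!\!\int e^{2\varphi} (\partial_x \partial_x^{-1} f) v\,dx\,dt \le \left| \int_0^T\!\!\int (\partial_x^{-1} f)\,\partial_x(e^{2\varphi} v)\,dx\,dt \right| \le \sum_m \|\partial_x^{-1} f\|_{L^2_T L^2_{Q_m}} \|\langle \partial_x \rangle v\|_{L^2_T L^2_{Q_m}} \le \|\partial_x^{-1} f\|_{\ell^1_m L^2_T L^2_{Q_m}} \|\langle \partial_x \rangle v\|_{\ell^\infty_m L^2_T L^2_{Q_m}},$$

and taking the supremum in $n$ yields the first bound in (A.2).

For the estimate (A.3), we take $\psi(x) = 1$ on $[n - \frac12, n + \frac12]$ and $0$ outside $[n - 1, n + 1]$, set $w = \psi v$, and compute, similarly to the above,

$$\|v\|_{L^\infty_T L^2_{Q_n}}^2 \lesssim \|v_0\|_{L^2_{\tilde Q_n}}^2 + T\|v_x\|_{L^2_T L^2_{\tilde Q_n}}^2 + T\|f\|_{L^2_T L^2_{\tilde Q_n}}^2 .$$

The proof is completed by summing in $n$.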

Proof of Theorem A.1. We prove the existence by contraction in the space $X$, where

$$X = \Bigl\{ u \;\Big|\; \|u\|_{C([0,T]; H^k_x)} + \|\partial_x^{k+1} u\|_{\ell^\infty_n L^2_T L^2_{Q_n}} + \sup_{\alpha \le k-1} \|\partial_x^\alpha u\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \le CR \Bigr\} .$$

Here $C$ is just chosen large enough to exceed the implicit constant in (A.2). Given $u \in X$, let $\varphi(u)$ denote the solution to

$$\partial_t \varphi(u) + \partial_x^3 \varphi(u) - \partial_x(b\varphi(u)) = -2\partial_x(u^3), \qquad (A.4)$$

with initial condition $\varphi(u)(0) = u_0$. A fixed point $\varphi(u) = u$ in $X$ will solve (1.1). We separately treat the case $k = 1$ for clarity of exposition.

Case $k = 1$. Applying $\partial_x$ to (A.4) gives, with $v = \varphi(u)_x$,

$$v_t + v_{xxx} - (bv)_x = -2(u^3)_{xx} + (b_x \varphi(u))_x .$$



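Now, (A.2) gives

$$\|\varphi(u)_x\|_{L^\infty_T L^2_x} + \|\varphi(u)_{xx}\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \lesssim \|u_0\|_{H^1_x} + \|(u^3)_x\|_{\ell^1_n L^2_T L^2_{Q_n}} + \|(b_x \varphi(u))_x\|_{L^1_T L^2_x} . \qquad (A.5)$$

Using that $\|u\|_{L^\infty_Q}^2 \lesssim (\|u\|_{L^2_{\tilde Q}} + \|u_x\|_{L^2_{\tilde Q}}) \|u\|_{L^2_{\tilde Q}}$, we also have

$$\|(u^3)_x\|_{L^2_Q} \lesssim \|u_x\|_{L^2_Q} \|u\|_{L^\infty_Q}^2 \lesssim \|u_x\|_{L^2_Q} \|u\|_{L^2_{\tilde Q}} (\|u\|_{L^2_{\tilde Q}} + \|u_x\|_{L^2_{\tilde Q}}) .$$

Taking the $L^2_T$ norm and applying the Hölder inequality, we obtain

$$\|(u^3)_x\|_{L^2_T L^2_Q} \lesssim \|u_x\|_{L^\infty_T L^2_Q} \|u\|_{L^\infty_T L^2_{\tilde Q}} (\|u\|_{L^2_T L^2_{\tilde Q}} + \|u_x\|_{L^2_T L^2_{\tilde Q}}) .$$

Taking the $\ell^1_n$ norm and applying the Hölder inequality again yields

$$\|(u^3)_x\|_{\ell^1_n L^2_T L^2_{Q_n}} \lesssim \|u_x\|_{\ell^\infty_n L^\infty_T L^2_{Q_n}} \|u\|_{\ell^2_n L^\infty_T L^2_{\tilde Q_n}} (\|u\|_{\ell^2_n L^2_T L^2_{\tilde Q_n}} + \|u_x\|_{\ell^2_n L^2_T L^2_{\tilde Q_n}}) .$$

Using the straightforward bounds

$$\|u_x\|_{\ell^\infty_n L^\infty_T L^2_{Q_n}} \lesssim \|u_x\|_{L^\infty_T L^2_x}, \qquad \|u\|_{\ell^2_n L^2_T L^2_{\tilde Q_n}} \lesssim \|u\|_{L^2_T L^2_x} \lesssim T^{1/2} \|u\|_{L^\infty_T L^2_x},$$

and

$$\|u_x\|_{\ell^2_n L^2_T L^2_{\tilde Q_n}} \lesssim \|u_x\|_{L^2_T L^2_x} \lesssim T^{1/2} \|u_x\|_{L^\infty_T L^2_x},$$

we obtain

$$\|(u^3)_x\|_{\ell^1_n L^2_T L^2_{Q_n}} \lesssim T^{1/2} \|u\|_{L^\infty_T H^1_x}^2 \|u\|_{\ell^2_n L^\infty_T L^2_{Q_n}} .$$

Inserting these bounds into (A.5),

$$\|\varphi(u)_x\|_{L^\infty_T L^2_x} + \|\varphi(u)_{xx}\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \lesssim \|u_0\|_{H^1_x} + T^{1/2} \|u\|_{L^\infty_T H^1_x}^2 \|u\|_{\ell^2_n L^\infty_T L^2_{Q_n}} + T(\|b_x\|_{L^\infty_x} + \|b_{xx}\|_{L^\infty_x}) \|\varphi(u)\|_{H^1_x} . \qquad (A.6)$$

The local smoothing estimate (A.2) applied to $v = \varphi(u)$ (not $v = \varphi(u)_x$ as above), and the estimate

$$\|(u^3)_x\|_{L^1_T L^2_x} \lesssim T \|u\|_{L^\infty_T H^1_x}^3,$$

provides the estimate

$$\|\varphi(u)\|_{L^\infty_T L^2_x} \lesssim T \|u\|_{L^\infty_T H^1_x}^3 . \qquad (A.7)$$

The maximal function estimate (A.3) applied to $v = \varphi(u)$ and the estimate

$$\|(u^3)_x\|_{L^2_T L^2_x} \lesssim T^{1/2} \|u\|_{L^\infty_T H^1_x}^3$$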



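give the estimate

$$\|\varphi(u)\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \lesssim \|u_0\|_{L^2_x} + T \|\varphi(u)\|_{L^\infty_T H^1_x} + T \|u\|_{L^\infty_T H^1_x}^3 . \qquad (A.8)$$

Summing (A.6), (A.7), (A.8), we obtain that $\|\varphi(u)\|_X \le CR$ if $\|u\|_X \le CR$ provided $T$ is as stated above. Thus $\varphi : X \to X$. A similar argument establishes that $\varphi$ is a contraction on $X$.

Case $k \ge 2$. Differentiating (A.4) $k$ times with respect to $x$ we obtain, with $v = \partial_x^k \varphi(u)$,

$$\partial_t v + \partial_x^3 v - \partial_x(bv) = -2\partial_x^{k+1}(u^3) - 2\partial_x \sum_{\substack{\alpha + \beta \le k+1 \\ \beta \le k-1}} \partial_x^\alpha b\,\partial_x^\beta \varphi(u) .$$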

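Using (A.2) gives

$$\|\partial_x^k \varphi(u)\|_{L^\infty_T L^2_x} + \|\partial_x^{k+1} \varphi(u)\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \lesssim \|\partial_x^k u^3\|_{\ell^1_n L^2_T L^2_{Q_n}} + \sup_{\substack{\alpha + \beta \le k+1 \\ \beta \le k-1}} \|\partial_x(\partial_x^\alpha b\,\partial_x^\beta \varphi(u))\|_{L^1_T L^2_x} .$$

Expanding, and applying the Leibniz rule gives

$$\partial_x^k u^3 = \sum_{\substack{\alpha + \beta + \gamma = k \\ \alpha \le \beta \le \gamma}} c_{\alpha\beta\gamma}\,\partial_x^\alpha u\,\partial_x^\beta u\,\partial_x^\gamma u,$$

which is then estimated as follows:

$$\|\partial_x^k u^3\|_{\ell^1_n L^2_T L^2_{Q_n}} \lesssim \sum_{\substack{\alpha + \beta + \gamma = k \\ \alpha \le \beta \le \gamma}} \|\partial_x^\alpha u\|_{\ell^2_n L^\infty_T L^\infty_{Q_n}} \|\partial_x^\beta u\|_{\ell^2_n L^2_T L^\infty_{Q_n}} \|\partial_x^\gamma u\|_{\ell^\infty_n L^\infty_T L^2_{Q_n}} .$$

By the Sobolev embedding theorem (as in the $k = 1$ case) we obtain

$$\|\partial_x^k u^3\|_{\ell^1_n L^2_T L^2_{Q_n}} \lesssim \sum_{\substack{\alpha + \beta + \gamma = k \\ \alpha \le \beta \le \gamma}} \Bigl( \sup_{\sigma \le \alpha + 1} \|\partial_x^\sigma u\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \Bigr) \Bigl( \sup_{\sigma \le \beta + 1} \|\partial_x^\sigma u\|_{\ell^2_n L^2_T L^2_{Q_n}} \Bigr) \|\partial_x^\gamma u\|_{L^\infty_T L^2_x} .$$

When $k \ge 2$, we have $\alpha \le [[\tfrac13 k]] \le k - 2$ and $\beta \le [[\tfrac12 k]] \le k - 1$, and therefore

$$\|\partial_x^k u^3\|_{\ell^1_n L^2_T L^2_{Q_n}} \lesssim T^{1/2} \Bigl( \sup_{\alpha \le k-1} \|\partial_x^\alpha u\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \Bigr) \|u\|_{L^\infty_T H^k_x}^2 .$$

Also,

$$\|\partial_x(\partial_x^\alpha b\,\partial_x^\beta \varphi(u))\|_{L^1_T L^2_x} \le T \Bigl( \sup_{\alpha \le k+1} \|\partial_x^\alpha b\|_{L^\infty_T L^\infty_x} \Bigr) \|\varphi(u)\|_{L^\infty_T H^k_x} .$$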



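Combining these estimates, we obtain

$$\|\partial_x^k \varphi(u)\|_{L^\infty_T L^2_x} + \|\partial_x^{k+1} \varphi(u)\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \lesssim \|u_0\|_{H^k_x} + T^{1/2} \Bigl( \sup_{\alpha \le k-1} \|\partial_x^\alpha u\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \Bigr) \|u\|_{L^\infty_T H^k_x}^2 + T \Bigl( \sup_{\alpha \le k+1} \|\partial_x^\alpha b\|_{L^\infty_T L^\infty_x} \Bigr) \|\varphi(u)\|_{L^\infty_T H^k_x} . \qquad (A.9)$$

As in the case $k = 1$, the local smoothing estimate (A.2) applied to $v = \varphi(u)$ is combined with $\|(u^3)_x\|_{L^1_T L^2_x} \lesssim T \|u\|_{L^\infty_T H^1_x}^3$ to obtain

$$\|\varphi(u)\|_{L^\infty_T L^2_x} \lesssim T \|u\|_{L^\infty_T H^1_x}^3 . \qquad (A.10)$$

We apply the maximal function estimate (A.3) to $v = \partial_x^\alpha \varphi(u)$ for $\alpha \le k - 1$ and use that $\|\partial_x^{\alpha+1} u^3\|_{L^1_T L^2_x} \le T \|u\|_{L^\infty_T H^k_x}^3$ and

$$\|\partial_x^{\alpha+1}(b\varphi(u))\|_{L^1_T L^2_x} \le T \Bigl( \sup_{\beta \le k} \|\partial_x^\beta b\|_{L^\infty_T L^\infty_x} \Bigr) \|\varphi(u)\|_{L^\infty_T H^k_x}$$

to obtain

$$\|\partial_x^\alpha \varphi(u)\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \lesssim \|u_0\|_{H^{k-1}_x} + T \|\varphi(u)\|_{L^\infty_T H^k_x} + T \|u\|_{L^\infty_T H^k_x}^3 + T \Bigl( \sup_{\beta \le k} \|\partial_x^\beta b\|_{L^\infty_T L^\infty_x} \Bigr) \|\varphi(u)\|_{L^\infty_T H^k_x} . \qquad (A.11)$$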

Summing (A.9), (A.10), (A.11), we obtain that $\varphi : X \to X$, and a similar argument shows that $\varphi$ is a contraction. This concludes the case $k \ge 2$.

To establish uniqueness within the broader class of solutions belonging merely to $C([0,T]; H^1_x)$, we argue as follows. Suppose $u, v \in C([0,T]; H^1_x)$ solve (1.1). By (A.3),

$$\|v\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \lesssim \|v_0\|_{L^2} + T \|v\|_{L^\infty_T H^1_x} + T \|v\|_{L^\infty_T H^1_x}^3 .$$

By taking $T$ small enough in terms of $\|v\|_{L^\infty_T H^1_x}$, we have that

$$\|v\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \lesssim \|v\|_{L^\infty_T H^1_x} . \qquad (A.12)$$

Similarly,

$$\|u\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \lesssim \|u\|_{L^\infty_T H^1_x} . \qquad (A.13)$$

Set $w = u - v$. Then, with $g = (u^3 - v^3)/(u - v) = u^2 + uv + v^2$, we have

$$w_t + w_{xxx} - (bw)_x \pm (gw)_x = 0 .$$

Apply (A.2) to $v = w_x$ to obtain

$$\|w_x\|_{L^\infty_T L^2_x} + \|w_{xx}\|_{\ell^\infty_n L^2_T L^2_{Q_n}} \lesssim \|(gw)_x\|_{\ell^1_n L^2_T L^2_{Q_n}} + \|(b_x w)_x\|_{L^1_T L^2_x} . \qquad (A.14)$$


The terms of $\|(gw)_x\|_{\ell^1_n L^2_T L^2_{Q_n}}$ are bounded following the method used above:

$$\|u_x v w\|_{\ell^1_n L^2_T L^2_{Q_n}} \lesssim \|u_x\|_{\ell^\infty_n L^\infty_T L^2_{Q_n}} \|vw\|_{\ell^1_n L^2_T L^\infty_{Q_n}} \lesssim \|u_x\|_{\ell^\infty_n L^\infty_T L^2_{Q_n}} \bigl( \|vw\|_{\ell^1_n L^2_T L^1_{Q_n}} + \|(vw)_x\|_{\ell^1_n L^2_T L^1_{Q_n}} \bigr) .$$

The term in parentheses is bounded by

$$\|v\|_{\ell^2_n L^2_T L^2_{Q_n}} \|w\|_{\ell^2_n L^\infty_T L^2_{Q_n}} + \|v_x\|_{\ell^2_n L^2_T L^2_{Q_n}} \|w\|_{\ell^2_n L^\infty_T L^2_{Q_n}} + \|v\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \|w_x\|_{\ell^2_n L^2_T L^2_{Q_n}},$$

which leads to the bound

$$\|u_x v w\|_{\ell^1_n L^2_T L^2_{Q_n}} \lesssim T^{1/2} \|u\|_{L^\infty_T H^1_x} \bigl( \|v\|_{L^\infty_T H^1_x} \|w\|_{\ell^2_n L^\infty_T L^2_{Q_n}} + \|v\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \|w\|_{L^\infty_T H^1_x} \bigr) . \qquad (A.15)$$

We now allow implicit constants to depend upon $\|u\|_{L^\infty_T H^1_x}$ and $\|v\|_{L^\infty_T H^1_x}$. Appealing to (A.14), (A.15) (and analogous estimates for other terms in $gw$), (A.12), (A.13), we obtain

$$\|w\|_{L^\infty_T H^1_x} \lesssim T^{1/2} \bigl( \|w\|_{\ell^2_n L^\infty_T L^2_{Q_n}} + \|w\|_{L^\infty_T H^1_x} \bigr) .$$

Combining this estimate with the maximal function estimate (A.3) applied to $w$ yields

$$\|w\|_{\ell^2_n L^\infty_T L^2_{Q_n}} \lesssim T^{1/2} \|w\|_{L^\infty_T H^1_x} + T \|g\|_{L^\infty_T H^1_x} \|w\|_{L^\infty_T H^1_x} .$$

This gives $w \equiv 0$ for $T$ sufficiently small. The continuity of the data-to-solution map is proved using similar arguments.

Next, we prove global well-posedness in $H^k$ by proving a priori bounds. Theorem A.1 shows that this suffices for global well-posedness.

Theorem A.2 (Global well-posedness). Fix $k \ge 1$ and suppose $M(T) < \infty$ for all $T \ge 0$, where $M(T)$ is defined in (A.1). For $u_0 \in H^k$, there is a unique global solution $u \in C_{\mathrm{loc}}([0, +\infty); H^k_x)$ to (1.1) with $\|u\|_{L^\infty_T H^k_x}$ controlled by $\|u_0\|_{H^k}$, $T$, and $M(T)$.

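Proof. Before beginning, we note that by the Gagliardo-Nirenberg inequality, $\|u\|_{L^4}^4 \lesssim \|u\|_{L^2}^3 \|u_x\|_{L^2}$, we have (in the focusing case)

$$\|u_x\|_{L^2}^2 - \|u_x\|_{L^2} \|u\|_{L^2}^3 \le I_3(u) \le \|u_x\|_{L^2}^2 .$$

With $\alpha = \|u_x\|_{L^2}^2 / \|u\|_{L^2}^6$ and $\beta = I_3(u) / \|u\|_{L^2}^6$, this is $\alpha - \alpha^{1/2} \le \beta \le \alpha$, which implies that $\langle \alpha \rangle \sim \langle \beta \rangle$, i.e.

$$\|u_x\|_{L^2}^2 + \|u\|_{L^2}^6 \sim I_3(u) + \|u\|_{L^2}^6 .$$

The same statement holds in the defocusing case. Another fact we need is based on the identity

$$\frac{d}{dt} I_j(u) = \langle I_j'(u), \partial_t u \rangle = \langle I_j'(u), -u_{xxx} - 2(u^3)_x + (bu)_x \rangle = \langle I_j'(u), \partial_x I_3'(u) \rangle + \langle I_j'(u), (bu)_x \rangle = \langle I_j'(u), (bu)_x \rangle .$$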
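The step from $\alpha - \alpha^{1/2} \le \beta \le \alpha$ to the two-sided comparison stated at the start of this proof can be made explicit. The following is only a sketch of one elementary route, taking the inequality with the constants exactly as displayed (the implicit Gagliardo-Nirenberg constant would change the numerical factors but not the conclusion):

```latex
% Young's inequality gives \alpha^{1/2} \le \tfrac12 \alpha + \tfrac12, hence
\alpha - \alpha^{1/2} \le \beta \le \alpha
\quad\Longrightarrow\quad
\tfrac12 \alpha - \tfrac12 \le \beta \le \alpha
\quad\Longrightarrow\quad
\tfrac12 (1 + \alpha) \le 1 + \beta \le 1 + \alpha .
% Multiplying through by \|u\|_{L^2}^6 and recalling the definitions of \alpha, \beta:
\tfrac12 \bigl( \|u_x\|_{L^2}^2 + \|u\|_{L^2}^6 \bigr)
\;\le\; I_3(u) + \|u\|_{L^2}^6
\;\le\; \|u_x\|_{L^2}^2 + \|u\|_{L^2}^6 .
```

In particular $1 + \beta > 0$, so the quantity $I_3(u) + \|u\|_{L^2}^6$ that enters the Gronwall argument is positive and comparable to $\|u_x\|_{L^2}^2 + \|u\|_{L^2}^6$.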



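For $u(t) \in L^2$, we compute near conservation of momentum and energy from Lemma 2.1:

$$\frac{d}{dt} I_1(u) = \langle b_x, A_1(u) \rangle .$$

Estimate $|\langle b_x, A_1(u) \rangle| \le \|b_x\|_{L^\infty} I_1(u)$, and apply Gronwall to obtain a bound on $\|u\|_{L^\infty_T L^2_x}$ in terms of $\|b_x\|_{L^\infty_T L^\infty}$ and $\|u_0\|_{L^2}$.

For $u(t) \in H^1$, we compute near conservation of energy from Lemma 2.1:

$$\frac{d}{dt} I_3(u) = 3\langle b_x, A_3(u) \rangle - \langle b_{xxx}, A_1(u) \rangle .$$

We have

$$|\langle b_x, A_3(u) \rangle| \lesssim \|b_x\|_{L^\infty} (\|u_x\|_{L^2}^2 + \|u\|_{L^4}^4) \lesssim \|b_x\|_{L^\infty} (\|u_x\|_{L^2}^2 + \|u_x\|_{L^2} \|u\|_{L^2}^3) \lesssim \|b_x\|_{L^\infty} (\|u_x\|_{L^2}^2 + \|u\|_{L^2}^6) \lesssim \|b_x\|_{L^\infty} (I_3(u) + \|u\|_{L^2}^6)$$

and $|\langle b_{xxx}, A_1(u) \rangle| \lesssim \|b_{xxx}\|_{L^\infty} \|u\|_{L^2}^2$. Combining these gives

$$\left| \frac{d}{dt} I_3(u) \right| \lesssim \|b_x\|_{L^\infty} I_3(u) + \|b_x\|_{L^\infty} \|u\|_{L^2}^6 + \|b_{xxx}\|_{L^\infty} \|u\|_{L^2}^2 .$$

Gronwall's inequality, combined with the previous bound on $\|u\|_{L^2}$, gives the bound on $I_3(u)$ and hence $\|u\|_{H^1}$.

For $u(t) \in H^2$, we apply Lemma 2.1 to obtain

$$\frac{d}{dt} I_5(u) = \langle I_5'(u), (bu)_x \rangle = 5\langle b_x, A_5(u) \rangle - 5\langle b_{xxx}, A_3(u) \rangle + \langle b_{xxxxx}, A_1(u) \rangle .$$

We have

$$|\langle b_x, A_5(u) \rangle| \lesssim \|b_x\|_{L^\infty} (\|u_{xx}\|_{L^2}^2 + \|u\|_{H^1}^4 + \|u\|_{H^1}^6) \lesssim \|b_x\|_{L^\infty} I_5(u) + \|b_x\|_{L^\infty} (\|u\|_{H^1}^4 + \|u\|_{H^1}^6) .$$

Also,

$$|\langle b_{xxx}, A_3(u) \rangle| \lesssim \|b_{xxx}\|_{L^\infty} (\|u\|_{H^1}^2 + \|u\|_{H^1}^4)$$

and

$$|\langle b_{xxxxx}, A_1(u) \rangle| \lesssim \|b_{xxx}\|_{L^\infty} \|(u^2)_{xx}\|_{L^2} \lesssim \|b_{xxx}\|_{L^\infty} \|u\|_{H^2} \|u\|_{L^2} .$$

Combining, applying Gronwall's inequality, and appealing to the bound on $\|u\|_{H^1}$ obtained previously, we obtain the claimed a priori bound in the case $k = 2$.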



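Bounds on $H^k$ for $k \ge 3$ can be obtained by the above method appealing to higher-order analogues of the identities in Lemma 2.1. However, starting with $k = 3$, we do not need such refined information. By direct computation from (1.1),

$$\frac{d}{dt} \|\partial_x^k u\|_{L^2}^2 = -\int \partial_x^{k+1}(bu)\,\partial_x^k u + 2\int \partial_x^{k+1} u^3\,\partial_x^k u .$$

In the Leibniz expansion of $\partial_x^{k+1} u^3$, we isolate two cases:

$$\partial_x^{k+1} u^3 = 3u^2\,\partial_x^{k+1} u + \sum_{\substack{\alpha + \beta + \gamma = k+1 \\ \alpha \le \beta \le \gamma \le k}} c_{\alpha\beta\gamma}\,\partial_x^\alpha u\,\partial_x^\beta u\,\partial_x^\gamma u .$$

For the first term,

$$\left| \int u^2\,\partial_x^{k+1} u\,\partial_x^k u \right| = \frac12 \left| \int (u^2)_x (\partial_x^k u)^2 \right| \lesssim \|u\|_{H^2}^2 \|u\|_{H^k}^2 .$$

By Hölder's inequality and interpolation, if $\alpha + \beta + \gamma = k + 1$ and $\gamma \le k$,

$$\|\partial_x^\alpha u\,\partial_x^\beta u\,\partial_x^\gamma u\|_{L^2} \lesssim \|u\|_{H^2}^2 \|u\|_{H^k} .$$

Thus we have

$$\left| \int \partial_x^{k+1} u^3\,\partial_x^k u \right| \lesssim \|u\|_{H^2}^2 \|u\|_{H^k}^2 .$$

Similarly, we can bound

$$\left| \int \partial_x^{k+1}(bu)\,\partial_x^k u \right| \lesssim M(t) \|u\|_{H^k}^2$$

by separately considering the term $\int b\,\partial_x^{k+1} u\,\partial_x^k u$ and integrating by parts. We obtain

$$\left| \frac{d}{dt} \|\partial_x^k u\|_{L^2}^2 \right| \lesssim (M + \|u\|_{H^2}^2) \|u\|_{H^k}^2$$

and can apply the Gronwall inequality to obtain the desired a priori bound.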

Appendix B. Comments about the Effective ODEs

Here we make some comments about the differential equations for the parameters $a$ and $c$.

B.1. Conditions on $T_0$. First we give a reason for replacing $T_0(h)$ in the definition of $T(h)$ (1.6) by $T_0$ defined by (1.13). In (10.2) we have seen that the $a$ and $c$ solving the system (1.5) give the following equations for $\tilde A = ha$, $\tilde C = c$, $T = ht$:

$$\partial_T \tilde A_j = \tilde C_j^2 - b_0(\tilde A_j, T) + O(h), \qquad \partial_T \tilde C_j = \tilde C_j\,\partial_x b_0(\tilde A_j, T) + O(h), \qquad \tilde A(0) = \bar a h, \quad \tilde C(0) = \bar c, \quad j = 1, 2.$$

This can also be seen by analysing (B.6) using Lemma 3.2.



As in (10.4) we can write the equations for $\tilde A_j - A_j$ and $\tilde C_j - C_j$:

$$\begin{cases} \partial_T(\tilde A_j - A_j) = (\tilde C_j - C_j)^2 + 2C_j(\tilde C_j - C_j) + \gamma_0(\tilde A_j - A_j) + O(h), \\ \partial_T(\tilde C_j - C_j) = (\tilde C_j - C_j)(\partial_x b_0)(A_j, T) + C_j \sigma_0 (\tilde A_j - A_j) + O(h), \\ \tilde A_j(0) - A_j(0) = 0, \quad \tilde C_j(0) - C_j(0) = 0, \end{cases}$$

where $\gamma_0, \sigma_0 = O(1)$. This implies that

$$\tilde A_j(T) - A_j(T) = O(h) e^{CT}, \qquad \tilde C_j(T) - C_j(T) = O(h) e^{CT} .$$

This means that for $T < \delta \log(1/h)$, we have $C_j(T) = \tilde C_j(T) + O(h^{1 - \delta C})$. Hence, if $\delta$ is small enough, then for small $h$ we have that $T_0(h)$ defined in (1.6) and $T_0$ in (1.13) can be interchanged.

B.2. Examples with $C_j$ going to 0. In the decoupled equations (1.12) we can have $C_j(T) \to 0$, $T \to \infty$, which implies that $T_0 < \infty$ in the definition (1.13). That prevents the $\log(1/h)/h$ lifespan of the approximation (1.4). Let us put $a = A_j$, $c = C_j$, so that the system (1.12) becomes

$$a_T = c^2(T) - b_0(a, T), \qquad c_T = c\,\partial_a b_0(a, T). \qquad (B.1)$$

For simplicity we consider the case of $b_0(a, T) = b_0(a)$. In that case the Hamiltonian

$$E(a, c) = -\tfrac13 c^3 + c\,b_0(a)$$

is conserved in the evolution and we have

$$\exp(T \min \partial_a b) \le |c(T)| \le \exp(T \max \partial_a b). \qquad (B.2)$$
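The conservation of $E$ along (B.1) can be checked numerically. The sketch below is only an illustration and not part of the original argument: it assumes the time-independent potential $b_0(a) = \cos^2 a$ (the choice used for Fig. 6), normalizes $c(0) = 1$ so that (B.2) reads as displayed, and integrates (B.1) with a classical Runge-Kutta step. The helper names `rhs`, `energy` and `integrate` are hypothetical.

```python
import math


def b0(a):
    # Illustrative potential b0(a) = cos^2(a), the choice used for Fig. 6.
    return math.cos(a) ** 2


def db0(a):
    # Derivative of b0 with respect to a: d/da cos^2(a) = -sin(2a).
    return -math.sin(2.0 * a)


def rhs(a, c):
    # Right-hand side of (B.1) for a time-independent potential.
    return c * c - b0(a), c * db0(a)


def energy(a, c):
    # Hamiltonian E(a, c) = -c^3/3 + c * b0(a).
    return -c ** 3 / 3.0 + c * b0(a)


def integrate(a, c, T, n_steps):
    # Classical fourth-order Runge-Kutta integration of (B.1) up to time T.
    dt = T / n_steps
    for _ in range(n_steps):
        k1a, k1c = rhs(a, c)
        k2a, k2c = rhs(a + 0.5 * dt * k1a, c + 0.5 * dt * k1c)
        k3a, k3c = rhs(a + 0.5 * dt * k2a, c + 0.5 * dt * k2c)
        k4a, k4c = rhs(a + dt * k3a, c + dt * k3c)
        a += dt * (k1a + 2.0 * k2a + 2.0 * k3a + k4a) / 6.0
        c += dt * (k1c + 2.0 * k2c + 2.0 * k3c + k4c) / 6.0
    return a, c


if __name__ == "__main__":
    a0, c0 = 0.3, 1.0
    a1, c1 = integrate(a0, c0, 2.0, 2000)
    print("energy drift:", abs(energy(a1, c1) - energy(a0, c0)))
    print("c(T):", c1)
```

With this potential $|\partial_a b_0| \le 1$, so for $c(0) = 1$ the computed $|c(T)|$ should stay between $e^{-T}$ and $e^{T}$. Initial data with $c^2 = 3\cos^2 a$, as in Fig. 6, lie on the level set $E = 0$.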

In particular this means that $c > \delta > 0$ if $T < T_1(\delta)$. We cannot improve on (B.2), and in general we may have $|c(T)| \le e^{-\gamma T}$, $T \to \infty$, but this behaviour is rare. First we note that the conservation of $E$ shows that if $c(T_j) \to 0$ for some sequence $T_j \to \infty$, then $E = 0$. We can then solve for $c$, and the equation reduces to $da/dT = 2b_0(a)$, $c^2 = 3b_0(a)$, that is to

$$\frac12 \int_{a_0}^{a} \frac{d\tilde a}{b_0(\tilde a)} = T, \qquad b_0(a(0)) > 0. \qquad (B.3)$$

If $b_0(a) > 0$ in this set of values $a$ then

$$a(T) \to \infty, \quad T \to \infty, \qquad \text{and} \qquad c(T) = (3b_0(a(T)))^{1/2}. \qquad (B.4)$$



If $b_0(a) = 0$ for some $a > a(0)$ ($a_T = 2b_0 > 0$), then we denote by $a_1$ the smallest such $a$ and assume that the order of vanishing of $b_0$ there is $\ell_1$. The analysis of (B.3) shows that

$$a(T) = a_1 + O(1) \begin{cases} K e^{-\gamma T}, & \ell_1 = 1, \\ K T^{-1/(\ell_1 - 1)}, & \ell_1 > 1, \end{cases}$$

which gives the rate of decay of $c(T)$. Hence we have shown the following statement which is almost as long to state as to prove:

Lemma B.1. Suppose that in (B.1) $b_0 = b_0(a)$. Then

$$E \ne 0, \ |c(0)| > \delta_0 > 0 \implies \exists\, \delta > 0 \ \forall\, T > 0, \ |c(T)| > \delta .$$

If $E = 0$, let $a_1 = \min\{a : a > a(0),\ b_0(a) = 0\}$, with $a_1$ not defined if the set is empty (note that $c(0) \ne 0$ and $E = 0$ imply that $b_0(a(0)) > 0$). Now suppose that $a_1$ exists, and that $\partial^\ell b_0(a_1) = 0$, $\ell < \ell_1$, $\partial^{\ell_1} b_0(a_1) \ne 0$. Then as $T \to \infty$,

$$|c(T)| \le \begin{cases} K e^{-\gamma T}, & \ell_1 = 1, \\ K T^{-\ell_1/(\ell_1 - 1)}, & \ell_1 > 1, \end{cases}$$

for some constants $\gamma$ and $K$, and $a(T) \to a_1$. If $a_1$ does not exist then $c(T) = (3b_0(a(T)))^{1/2}$, $a(T) \to \infty$, $T \to \infty$.

We excluded the case of infinite order of vanishing since it is very special from our point of view. The lemma suggests that $c \to 0$ is highly nongeneric but it can occur for our system. Since for the original time $t$ in (1.1) we would like to go up to time $\delta \log(1/h)/h$, we cannot do it in some cases as then

$$c(t)|_{t = \delta \log(1/h)/h} \sim \begin{cases} h^{\gamma\delta/2}, & \ell_1 = 1, \\ \log^{-\frac12 \ell_1/(\ell_1 - 1)}(1/h), & \ell_1 > 1. \end{cases}$$



Fig. 6. The plots of $(A_j, c_j)$, $j = 1, 2$, solving (B.6) for $b_0(x,t) = \cos^2 x$ and initial data $A_1(0) = -\pi/3$, $A_2(0) = \pi/6$, and $c_1(0) = \sqrt3 \cos(\pi/3)$, $c_2(0) = \sqrt3 \cos(\pi/6)$. The "decoupled" curve corresponds to solving (1.12). Because of the choice of initial conditions, $(A_j, c_j)$, $j = 1, 2$, lie on the same curve

B.3. Avoided crossing for the effective equations of motion. Here we make some comments about the puzzling avoided crossing which needs further investigation. For the decoupled equations it is easy to find examples in which

$$c_1(T_0) = c_2(T_0). \qquad (B.5)$$

One is shown in Fig. 6. We take $b_0$ independent of $T$ and equal to $\cos^2 x$. If we choose the initial conditions so that $c_j^2 = 3\cos^2 A_j$, $A_j = h a_j$ as in (1.12), and $-\pi/2 < A_1 < -A_2 < 0$, then when $A_1(T_0) = -A_2(T_0)$ we have (B.5) (this also provides an example of $c_2(T) \to 0$ as $T \to \infty$). The decoupled equations (1.12) should be compared to the rescaled version of (1.5):

$$\partial_T c_j = \partial_{x_j} B_0(c, A, h), \qquad \partial_T A_j = c_j^2 - \partial_{c_j} B_0(c, A, h), \qquad B_0(c, A, h) \stackrel{\mathrm{def}}{=} \frac12 \int q_2(x/h, c, A/h)\,b_0(x)\,dx . \qquad (B.6)$$

For the example above the comparison between the solutions of the decoupled $h$-independent equations and solutions to Eq. (B.6) is shown in Fig. 6 (the solutions of (1.12) are shown as a single curve which both solutions with these initial data follow). The dramatic avoided crossings shown in Fig. 6 (and also, for a different, time dependent $b_0$, in Fig. 3) are not seen in the behaviour of $q_2(x, c, A/h)$, which is the approximation of the solution to (1.1); see Fig. 7. The masses of the right and left solitons are switched and that corresponds to the switch of positions of $A_1$ and $A_2$. It is possible that a different parametrization of double solitons would resolve this problem. Another possibility is to study the decomposition (3.11) in the proof of Lemma 3.2 uniformly as $\alpha \to 0$ (corresponding to $a_2 - a_1 \to 0$). We conclude with two heuristic observations. If the decoupled equations lead to (B.5) and $|A_1 - A_2| > \epsilon > 0$ (which is the case when we approach the crossing in Fig. 6)



Fig. 7. The plots of $q_2(x, c, A/h)$ for $(A_j, c_j)$, $j = 1, 2$, solving (B.6) for $b_0(x,t) = \cos^2 x$ and initial data $A_1(0) = -\pi/3$, $A_2(0) = \pi/6$, and $c_1(0) = \sqrt3 \cos(\pi/3)$, $c_2(0) = \sqrt3 \cos(\pi/6)$. On the left $h = 0.1$ and on the right $h = 0.3$

then Eqs. (B.6) differ from (1.12) by terms of size

$$h \log \frac{c_2 - c_1}{c_1 + c_2},$$

see Lemma 3.2. For this to affect the motion of trajectories on finite time scales in $T$ we need

$$c_2 - c_1 \lesssim \exp\left( -\frac{\gamma}{h} \right). \qquad (B.7)$$

This means that the $c_j$'s have to get exponentially close to each other (but does not explain avoided crossing). On the other hand, if $|a_1 - a_2| > \epsilon > 0$, where the $a_j$'s are the original variables in (1.5), $A_j(0) = h a_j(0)$, then we can use the decomposition in Lemma 3.2 and variables $\hat a_j$ defined by (3.7). The remark after the proof of Lemma 3.6 shows that the equations of motion take essentially the same form written in terms of the $a_j$'s and $c_j$'s and hence $\hat a_j$ has to stay bounded. And that means that $c_2 - c_1$ is bounded away from 0. Hence, when $c_2 - c_1 \to 0$ we must also have $a_2 - a_1 \to 0$ as seen in Fig. 3 and Fig. 6.

Appendix C. Alternative Proof of Lemma 4.7 (with Bernd Sturmfels)

We note that the standard substitution reduces the equation $P(c)u = 0$, where $P(c)$ is defined in (4.20), to an equation with rational coefficients:

$$z = \tanh x, \qquad \partial_x = (1 - z^2)\partial_z, \qquad \eta^2 = 1 - z^2 .$$

This means that $P(c)u = 0$ is equivalent to $Q(c)v = 0$, $u(x) = v(\tanh x)$, where

$$Q(c) = (L^2 + 1)(L^2 + c^2) - 10 L R(z) L + 10(3R(z) - 2R(z)^2) - 6(1 + c^2) R(z),$$

and

$$L = \frac{1}{i}(1 - z^2)\partial_z, \qquad R(z) = 1 - z^2, \qquad -1 < z < 1.$$
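The substitution $z = \tanh x$ can be sanity-checked numerically. The sketch below is an illustration only: it assumes $\eta(x) = \operatorname{sech} x$ (the positive root of $\eta^2 = 1 - z^2$), and the function name `check_substitution` and the test function $f(z) = z^3$ are hypothetical choices. It compares central finite differences in $x$ with the corresponding expressions in $z$.

```python
import math


def sech(x):
    return 1.0 / math.cosh(x)


def check_substitution(x, eps=1e-6):
    """Return the errors in three identities behind the substitution z = tanh x."""
    z = math.tanh(x)

    # (i) eta^2 = 1 - z^2, with eta = sech x (assumed profile).
    err_eta = abs(sech(x) ** 2 - (1.0 - z * z))

    # (ii) d/dx sech x = -z * sqrt(1 - z^2), i.e. the derivative of eta
    #      corresponds to z (1 - z^2)^{1/2} up to sign.
    d_eta = (sech(x + eps) - sech(x - eps)) / (2.0 * eps)
    err_deriv = abs(d_eta - (-z * math.sqrt(1.0 - z * z)))

    # (iii) chain rule d/dx f(tanh x) = (1 - z^2) f'(z), tested on f(z) = z^3.
    lhs = (math.tanh(x + eps) ** 3 - math.tanh(x - eps) ** 3) / (2.0 * eps)
    rhs = (1.0 - z * z) * 3.0 * z * z
    err_chain = abs(lhs - rhs)

    return err_eta, err_deriv, err_chain


if __name__ == "__main__":
    for x in (-1.5, -0.2, 0.7, 2.0):
        print(x, check_substitution(x))
```

Identity (ii) explains why the element of the kernel coming from $\partial_x \eta$ appears, in the $z$ variable, as a multiple of $z(1 - z^2)^{1/2}$.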



Lemma 4.7 will follow from finding a basis of solutions of $Q(c)v = 0$ and from seeing that the only bounded solution is the one corresponding to $\partial_x \eta$, that is, to

$$v(z) = z(1 - z^2)^{1/2} .$$

Remarkably, and no doubt because of some deeper underlying structure due to complete integrability, this can be achieved using the MAPLE package DEtools. First, the operator $Q(c)$ is brought to a convenient form

$$Q = (z-1)^4 (z+1)^4 \frac{d^4}{dz^4} f(z) + 12 z (z-1)^3 (z+1)^3 \frac{d^3}{dz^3} f(z) + (z-1)^2 (z+1)^2 (26 z^2 - c^2 + 1) \frac{d^2}{dz^2} f(z) - 2 z (z-1)(z+1)(8 z^2 - 11 + c^2) \frac{d}{dz} f(z) + (4 - 20 z^4 + 6 c^2 z^2 - 5 c^2 + 16 z^2) f(z) .$$
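As a consistency check on the fourth-order form of $Q$, one can verify numerically that $v(z) = z(1 - z^2)^{1/2}$ is annihilated by it. The sketch below is not from the paper: it assumes the exponents in the display read $(z-1)^3(z+1)^3$, $(z-1)^2(z+1)^2$, $8z^2$, $c^2$, $20z^4$ and $16z^2$, and it uses derivatives of $v$ computed by hand, with $s = 1 - z^2$. The function names are hypothetical.

```python
import math


def v_and_derivatives(z):
    # v(z) = z * s^(1/2) with s = 1 - z^2, and its first four derivatives
    # (computed by hand; these closed forms are part of the sketch's assumptions).
    s = 1.0 - z * z
    v0 = z * math.sqrt(s)
    v1 = (1.0 - 2.0 * z * z) / math.sqrt(s)
    v2 = z * (2.0 * z * z - 3.0) * s ** -1.5
    v3 = -3.0 * s ** -2.5
    v4 = -15.0 * z * s ** -3.5
    return v0, v1, v2, v3, v4


def apply_Q(z, c):
    # Evaluate the fourth-order operator Q, with the coefficients as read
    # from the display, on v(z) = z (1 - z^2)^(1/2).
    f0, f1, f2, f3, f4 = v_and_derivatives(z)
    zm, zp = z - 1.0, z + 1.0
    return (
        zm ** 4 * zp ** 4 * f4
        + 12.0 * z * zm ** 3 * zp ** 3 * f3
        + zm ** 2 * zp ** 2 * (26.0 * z * z - c * c + 1.0) * f2
        - 2.0 * z * zm * zp * (8.0 * z * z - 11.0 + c * c) * f1
        + (4.0 - 20.0 * z ** 4 + 6.0 * c * c * z * z - 5.0 * c * c + 16.0 * z * z) * f0
    )


if __name__ == "__main__":
    for z in (-0.9, -0.3, 0.2, 0.7):
        for c in (0.5, 2.0, 3.0):
            print(z, c, apply_Q(z, c))
```

The printed residuals should be at the level of rounding error for every value of $c$, which is consistent with this $v$ being a solution independently of $c$.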

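Applying the MAPLE command DFactorsols(Q,f(z)) gives the following explicit basis of solutions to $Q(c)v = 0$, $c \ne 1$:

$$v_1(z) = (1 - z^2)^{1/2} z,$$
$$v_2(z) = (1 + z)^{-c/2} (1 - z)^{c/2} \bigl( (c + z)^2 + z^2 - 1 \bigr),$$
$$v_3(z) = v_2(-z) = (1 + z)^{c/2} (z - 1)^{-c/2} \bigl( (c - z)^2 + z^2 - 1 \bigr),$$
$$v_4(z) = (1 - z^2)^{-1/2} \bigl( -3zc^2 + 3z^3c^2 - 7z^3 + 7z \bigr) \log \frac{z+1}{z-1} + (1 - z^2)^{-1/2} \bigl( 4c^2 - 6c^2 z^2 + 14 z^2 - 12 \bigr).$$

For $c \ne 1$ these solutions are linearly independent and only $v_1$ vanishes at $z = \pm 1$ (or is bounded). Hence $\ker_{L^2} P(c)$ is one dimensional, proving Lemma 4.7.

References

1. Ablowitz, M., Kaup, D., Newell, A., Segur, H.: Nonlinear evolution equations of physical significance. Phys. Rev. Lett. 31, 125–127 (1973)
2. Abou Salem, W.K.: Solitary wave dynamics in time-dependent potentials. J. Math. Phys. 49, 032101 (2008), 29 pp.
3. Abou-Salem, W.K., Fröhlich, J., Sigal, I.M.: Colliding solitons for the nonlinear Schrödinger equation. Commun. Math. Phys. 291, 151–176 (2009)
4. Benes, N., Kasman, A., Young, K.: On decompositions of the KdV 2-Soliton. J. Nonlin. Sci. 2, 179–200 (2006)
5. Bona, J.L., Smith, R.: The initial-value problem for the Korteweg-de Vries equation. Philos. Trans. Roy. Soc. London Ser. A 278(1287), 555–601 (1975)
6. Bona, J.L., Souganidis, P.E., Strauss, W.A.: Stability and instability of solitary waves of Korteweg de Vries type. Proc. Roy. Soc. London Ser. A 411(1841), 395–412 (1987)
7. Bouzouina, A., Robert, D.: Uniform semiclassical estimates for the propagation of quantum observables. Duke Math. J. 111, 223–252 (2002)
8. Buslaev, V., Perelman, G.: On the stability of solitary waves for nonlinear Schrödinger equations. In: Nonlinear evolution equations, editor N.N. Uraltseva, Transl. Ser. 2, 164, Providence RI: Amer. Math. Soc., (1995), pp. 75–98
9. Datchev, K., Ventura, I.: Solitary waves for the Hartree equation with a slowly varying potential. Pacific. J. Math. 248(1), 63–90 (2010)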

v1 (z) = (1 − z 2 ) 2 z, c

c

v2 (z) = (1 + z)− 2 (1 − z) 2 ((c + z)2 + z 2 − 1), c

c

v3 (z) = v2 (−z) = (1 + z) 2 (z − 1)− 2 ((c − z)2 + z 2 − 1), 1 z+1 v4 (z) = (1 − z 2 )− 2 −3zc2 + 3z 3 c2 − 7z 3 + 7z log z−1 1 + (1 − z 2 )− 2 4c2 − 6c2 z 2 + 14z 2 − 12 . For c = 1 these solutions are linearly independent and only v1 vanishes at z = ±1 (or is bounded). Hence ker L 2 P(c) is one dimensional proving Lemma 4.7. References 1. Ablowitz, M., Kaup, D., Newell, A., Segur, H.: Nonlinear evolution equations of physical significance. Phys. Rev. Lett. 31, 125–127 (1973) 2. Abou Salem, W.K.: Solitary wave dynamics in time-dependent potentials. J. Math. Phys. 49, 032101 (2008), 29 pp. 3. Abou-Salem, W.K., Fröhlich, J., Sigal, I.M.: Colliding solitons for the nonlinear Schrödinger equation. Commun. Math. Phys. 291, 151–176 (2009) 4. Benes, N., Kasman, A., Young, K.: On decompositions of the KdV 2-Soliton. J. Nonlin. Sci. 2, 179–200 (2006) 5. Bona, J.L., Smith, R.: The initial-value problem for the Korteweg-de Vries equation. Philos. Trans. Roy. Soc. London Ser. A 278(1287), 555–601 (1975) 6. Bona, J.L., Souganidis, P.E., Strauss, W.A.: Stability and instability of solitary waves of Korteweg de Vries type. Proc. Roy. Soc. London Ser. A 411(1841), 395–412 (1987) 7. Bouzouina, A., Robert, D.: Uniform semiclassical estimates for the propagation of quantum observables. Duke Math. J. 111, 223–252 (2002) 8. Buslaev, V., Perelman, G.: On the stability of solitary waves for nonlinear Schrödinger equations. In: Nonlinear evolution equations, editor N.N. Uraltseva, Transl. Ser. 2, 164, Providence RI: Amer. Math. Soc., (1995), pp. 75–98 9. Datchev, K., Ventura, I.: Solitary waves for the Hartree equation with a slowly varying potential. Pacific. J. Math. 248(1), 63–90 (2010)



10. Dejak, S.I., Jonsson, B.L.G.: Long-time dynamics of variable coefficient modified Korteweg-de Vries solitary waves. J. Math. Phys. 47(7), 072703 (2006), 16 pp.
11. Dejak, S.I., Sigal, I.M.: Long-time dynamics of KdV solitary waves over a variable bottom. Comm. Pure Appl. Math. 59, 869–905 (2006)
12. Faddeev, L.D., Takhtajan, L.A.: Hamiltonian methods in the theory of solitons. Translated from the Russian by A.G. Reyman. Reprint of the 1987 English edition. Classics in Mathematics. Berlin: Springer, 2007
13. Fröhlich, J., Gustafson, S., Jonsson, B.L.G., Sigal, I.M.: Solitary wave dynamics in an external potential. Commun. Math. Phys. 250, 613–642 (2004)
14. Gang, Z., Sigal, I.M.: On soliton dynamics in nonlinear Schrödinger equations. Geom. Funct. Anal. 16(6), 1377–1390 (2006)
15. Gang, Z., Weinstein, M.I.: Dynamics of nonlinear Schrödinger/Gross-Pitaevskii equations: mass transfer in systems with solitons and degenerate neutral modes. Analysis & PDE 1, 267–322 (2008)
16. Holmer, J.: Dynamics of KdV solitons in the presence of a slowly varying potential. http://arxiv.org/abs/1001.1583v3 [math.AP], 2010, to appear in IMRN Internat. Math. Res. Notices
17. Holmer, J., Perelman, G., Zworski, M.: 2-solitons in external fields. On-line presentation with MATLAB codes, http://math.berkeley.edu/~zworski/hpzweb.html, Nov. 2009
18. Holmer, J., Zworski, M.: Slow soliton interaction with delta impurities. J. Mod. Dyn. 1, 689–718 (2007)
19. Holmer, J., Zworski, M.: Soliton interaction with slowly varying potentials. IMRN Internat. Math. Res. Notices 2008, Art. ID runn026, 36 pp. (2008)
20. Kenig, C.E., Ponce, G., Vega, L.: Well-posedness and scattering results for the generalized Korteweg-de Vries equation via the contraction principle. Comm. Pure Appl. Math. 46, 527–620 (1993)
21. Kenig, C.E., Ponce, G., Vega, L.: Well-posedness of the initial value problem for the Korteweg-de Vries equation. J. Amer. Math. Soc. 4(2), 323–347 (1991)
22. Lax, P.: Integrals of nonlinear equations of evolution and solitary waves. Comm. Pure Appl. Math. 21, 467–490 (1968)
23. Maddocks, J., Sachs, R.: On the stability of KdV multi-solitons. Comm. Pure Appl. Math. 46(6), 867–901 (1993)
24. Martel, Y., Merle, F.: Description of two soliton collision for the quartic gKdV equation. http://arxiv.org/abs/0709.2677v1 [math.AP], 2007
25. Martel, Y., Merle, F., Tsai, T.-P.: Stability in $H^1$ of the sum of K solitary waves for some nonlinear Schrödinger equations. Duke Math. J. 133(3), 405–466 (2006)
26. Miura, R.M.: Korteweg-de Vries equation and generalizations. I. A remarkable explicit nonlinear transformation. J. Math. Phys. 9, 1202–1204 (1968)
27. Muñoz, C.: On the soliton dynamics under a slowly varying medium for generalized KdV equations. http://arxiv.org/abs/0912.4725v2 [math.AP], 2010, to appear in Analysis & PDE
28. Olver, P.J.: Applications of Lie groups to differential equations. Graduate Texts in Mathematics, 107. New York: Springer-Verlag, 1986
29. Perelman, G.: Asymptotic stability of multi-soliton solutions for nonlinear Schrödinger equations. Comm. Part. Diff. Eqs. 29(7-8), 1051–1095 (2004)
30. Potter, T.: Effective dynamics for N-solitons of the Gross-Pitaevskii equation. http://arxiv.org/abs/1009.4910v1, 2010
31. Rodnianski, I., Schlag, W., Soffer, A.: Asymptotic stability of N-soliton states of NLS. http://arxiv.org/abs/math/0309114v1 [math.AP], 2003
32. Strecker, K.E., et al.: Formation and propagation of matter wave soliton trains. Nature 417, 150–154 (2002)
33. Trefethen, L.N.: Spectral methods in MATLAB. Software, Environments, and Tools, 10. Philadelphia, PA: Soc. for Industrial and Applied Mathematics (SIAM), 2000
34. Wadati, M.: The modified Korteweg-de Vries equation. J. Phys. Soc. Jpn. 34, 1289–1296 (1973)
35. Weinstein, M.I.: Lyapunov stability of ground states of nonlinear dispersive evolution equations. Comm. Pure Appl. Math. 39(1), 51–67 (1986)

Communicated by I.M. Sigal

Commun. Math. Phys. 305, 427–440 (2011) Digital Object Identifier (DOI) 10.1007/s00220-010-1173-x



Asymptotic Completeness in a Class of Massless Relativistic Quantum Field Theories

Wojciech Dybalski1, Yoh Tanimoto2

1 Zentrum Mathematik, Technische Universität München, D-85747 Garching, Germany.
2 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 1, I–00133 Roma, Italy. E-mail: [email protected]

Received: 29 June 2010 / Accepted: 16 August 2010 Published online: 23 November 2010 – © Springer-Verlag 2010

Abstract: This paper presents the first examples of massless relativistic quantum field theories which are interacting and asymptotically complete. These two-dimensional theories are obtained by an application of a deformation procedure, introduced recently by Grosse and Lechner, to chiral conformal quantum field theories. The resulting models may not be strictly local, but they contain observables localized in spacelike wedges. It is shown that the scattering theory for waves in two dimensions, due to Buchholz, is still valid under these weaker assumptions. The concepts of interaction and asymptotic completeness, provided by this theory, are adopted in the present investigation.

1. Introduction

The interpretation of quantum field theories in terms of particles is a long-standing fundamental problem [Ha58,Ru62,LSZ55,CD82]. The last two decades witnessed significant progress on this issue, both on the side of structural analysis [Bu90,Po04.1,Po04.2,Dy05,Dy09] and in the study of concrete models [Sp97,DG99,FGS04,Le08]. By combining methods of algebraic quantum field theory [Ha] with insights from the form-factor program [SW00,BFK06], the first examples of local, relativistic quantum field theories, which are interacting and asymptotically complete, have been constructed in [Le08]. As this class contains only massive models, the question of asymptotic completeness in the presence of massless particles is open to date in the local, relativistic framework. This can be partly attributed to the infamous infrared problem, which hinders rigorous construction and analysis of interacting massless theories by traditional methods (see however [CRW85,BFM04,ZZ92]). It is therefore remarkable that more recent constructive tools, developed in [BLS10], give rise to massless models which are asymptotically complete and interacting. We exhibit such theories in the present work.

Supported by the DFG grant SP181/25-1.

Supported in part by the ERC Advanced Grant 227458 OACFT “Operator Algebras and Conformal Field Theory”.


We recall that a new class of relativistic quantum field theories, including both massive and massless models, has been obtained recently by a certain deformation procedure akin to the Rieffel deformation [GL08,BS08,BLS10,DLM10]. These theories are wedge-local, i.e. observables can be localized in (unbounded) wedge-shaped regions extending in spacelike directions. In the massive case this remnant of locality suffices for a canonical construction of the two-body scattering matrix, as shown in [BBS01]. Exploiting this fact, it was demonstrated in [GL08,BS08] that the deformed theory is interacting even if the original theory is not. As in general only two-body scattering states are available, it may seem that the problem of asymptotic completeness cannot be addressed in the framework of wedge-local theories. However, in the case of two-dimensional massless theories such a conclusion would be premature, as we demonstrate in this paper.

Our first task is to provide a scattering theory for such models. We recall that for local two-dimensional theories of massless excitations a scattering theory was developed in [Bu75]. The basic building blocks of this construction are the subspaces $\mathcal{H}_+$ and $\mathcal{H}_-$ in the physical Hilbert space $\mathcal{H}$, corresponding to the right and left branch of the lightcone in momentum space. These subspaces carry representations of the Poincaré group which are in general highly reducible. Thus vectors $\Psi_\pm \in \mathcal{H}_\pm$ do not describe particles in the Wigner sense, but rather composite objects, called in [Bu75] 'waves'. In view of their dispersionless motion, a composition of several waves traveling in the same direction (say elements of $\mathcal{H}_+$) gives rise to another wave from $\mathcal{H}_+$. Thus it suffices to consider scattering states $\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-$ (resp. $\Psi_+ \overset{\mathrm{in}}{\times} \Psi_-$) which describe two waves traveling in the opposite directions in the remote future (resp. past). They span the subspaces $\mathcal{H}^{\mathrm{out}}$ (resp. $\mathcal{H}^{\mathrm{in}}$) of the outgoing (resp. incoming) states. The scattering operator $S : \mathcal{H}^{\mathrm{out}} \to \mathcal{H}^{\mathrm{in}}$ can be defined as an isometry mapping $\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-$ into $\Psi_+ \overset{\mathrm{in}}{\times} \Psi_-$. If $\mathcal{H}^{\mathrm{out}} = \mathcal{H}^{\mathrm{in}} = \mathcal{H}$, then we say that the theory is asymptotically complete. As we will show, there exists a large class of non-interacting massless theories in two-dimensional spacetime which have this property: it includes all chiral conformal quantum field theories.

In the light of the above discussion, it is not surprising that the scattering theory from [Bu75] can be generalized to the wedge-local context. Indeed, observables localized in two opposite spacelike wedges suffice to separate two waves traveling in opposite directions. We demonstrate this fact in Sect. 2 after some introductory remarks on wedge-local quantum field theories. In Sect. 3 we express the scattering matrix $S_\kappa$ of the deformed theory (with a deformation parameter $\kappa$) by the scattering matrix $S$ of the original one. We obtain

$$S_\kappa = e^{i\kappa M^2} S, \tag{1}$$

where $M$ is the mass operator. Hence, similarly as in the massive case, the deformed theory is interacting, even if the original theory is not. Moreover, the property of asymptotic completeness is preserved by the deformation procedure. Thus, as we show in Sect. 4, deformations of chiral conformal field theories give rise to wedge-local theories which are interacting and asymptotically complete. We summarize our results in Sect. 5, where also some open questions are discussed.

2. Scattering Theory

A convenient framework for the study of wedge-local theories is provided by the concept of a Borchers triple [Bo92]. We recall that a Borchers triple $(\mathcal{R}, U, \Omega)$ (relative to the wedge $\mathcal{W} = \{\, x = (x^0, x^1) \in \mathbb{R}^2 \mid x^1 \geq |x^0| \,\}$) consists of:


(a) a von Neumann algebra $\mathcal{R} \subset B(\mathcal{H})$,
(b) a strongly continuous unitary representation $U$ of $\mathbb{R}^2$ on $\mathcal{H}$, whose spectrum $\mathrm{sp}\, U$ is contained in the closed forward lightcone $V_+ = \{\, p = (p^0, p^1) \in \mathbb{R}^2 \mid p^0 \geq |p^1| \,\}$ and which satisfies $\alpha_x(\mathcal{R}) \subset \mathcal{R}$ for $x \in \mathcal{W}$, where $\alpha_x(\,\cdot\,) = U(x) \,\cdot\, U(x)^{-1}$,
(c) a unit vector $\Omega \in \mathcal{H}$ which is invariant under the action of $U$ and is cyclic and separating for $\mathcal{R}$. It will be called the vacuum vector.

One interprets $\mathcal{A}(\mathcal{W}) := \mathcal{R}$ as the algebra of all observables localized in the wedge $\mathcal{W}$. In view of (c), one can apply to $(\mathcal{R}, \Omega)$ the Tomita-Takesaki theory and we denote by $(\Delta, J)$ the modular operator and the conjugation. As shown in [Bo92], with the help of the modular objects one can construct an (anti-)unitary representation $\lambda \mapsto \tilde U(\lambda)$ of the proper Poincaré group $\mathcal{P}_+$ which extends the original representation of translations. In particular, $J$ implements the spacetime reflection, i.e.

$$J U(x) J = U(-x), \quad x \in \mathbb{R}^2. \tag{2}$$

Thus with any wedge $\lambda\mathcal{W}$ one can associate the algebra of observables $\mathcal{A}(\lambda\mathcal{W}) = \tilde U(\lambda) \mathcal{R}\, \tilde U(\lambda)^{-1}$. Since, by the Tomita-Takesaki theory, $J \mathcal{R} J = \mathcal{R}'$, the resulting net is wedge-local, i.e. $\mathcal{A}((\lambda\mathcal{W})') = \mathcal{A}(\lambda\mathcal{W})'$, where $(\lambda\mathcal{W})'$ is the causal complement of $\lambda\mathcal{W}$ and a prime over an algebra denotes the commutant. Hence this net gives rise to a (two-dimensional) wedge-local, relativistic quantum field theory. See [Bo92,Fl98] for proofs of the above statements and [BLS10] for a more detailed overview.

Let $(H, P)$ be the generators of $U$, i.e. $U(x^0, x^1) = e^{i H x^0 - i P x^1}$. We set $\mathcal{H}_\pm = \ker(H \mp P)$ and denote by $P_\pm$ the corresponding projections. We assume that $\mathcal{H}_+ \cap \mathcal{H}_- = [c\,\Omega]$, i.e. $\Omega$ is the unique (up to a phase) unit vector which is invariant under translations. This implies that $\mathcal{H}_+ \cap [c\,\Omega]^\perp$ is orthogonal to $\mathcal{H}_- \cap [c\,\Omega]^\perp$. We assume that the latter two subspaces are non-trivial, to ensure that the theory contains massless excitations.

Let us now describe briefly their collision theory. The construction follows closely [Bu75]. For any $F \in B(\mathcal{H})$ and $x \in \mathbb{R}^2$ we denote $F(x) := \alpha_x(F)$ and define the sequences of operators

$$F_\pm(h_T) = \int dt\, h_T(t)\, F(t_\pm) \quad \text{with} \quad t_\pm = (t, \pm t), \tag{3}$$

where $h_T(t) = |T|^{-\varepsilon} h(|T|^{-\varepsilon}(t - T))$, $0 < \varepsilon < 1$ and $h \in C_0^\infty(\mathbb{R})$ is a non-negative, symmetric function s.t. $\int dt\, h(t) = 1$. With the help of these approximating sequences we construct the asymptotic fields corresponding to the wedge $\mathcal{W}$.

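Lemma 2.1. Let $F \in \mathcal{R}$. Then the limits

$$\Phi_+^{\mathrm{out}}(F) := \mathop{\mathrm{s\text{-}lim}}_{T \to \infty} F_+(h_T), \qquad \Phi_-^{\mathrm{in}}(F) := \mathop{\mathrm{s\text{-}lim}}_{T \to -\infty} F_-(h_T) \tag{4}$$

exist and are elements of $\mathcal{R}$. They depend only on the respective vectors $\Phi_+^{\mathrm{out}}(F)\Omega = P_+ F\Omega$, $\Phi_-^{\mathrm{in}}(F)\Omega = P_- F\Omega$ and satisfy
(a) $\Phi_+^{\mathrm{out}}(F)\mathcal{H}_+ \subset \mathcal{H}_+$, $\Phi_-^{\mathrm{in}}(F)\mathcal{H}_- \subset \mathcal{H}_-$,
(b) $\alpha_x(\Phi_+^{\mathrm{out}}(F)) = \Phi_+^{\mathrm{out}}(\alpha_x(F))$, $\alpha_x(\Phi_-^{\mathrm{in}}(F)) = \Phi_-^{\mathrm{in}}(\alpha_x(F))$ for $x \in \mathcal{W}$.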

Proof. Let us consider the first limit in (4). Since there holds the estimate $\|F_+(h_T)\| \leq \|F\| \int dt\, |h(t)|$, it suffices to show the convergence on the dense set of vectors $\mathcal{R}'\Omega$. First, using the mean ergodic theorem, one checks that

$$\mathop{\mathrm{s\text{-}lim}}_{T \to \infty} F_+(h_T)\Omega = P_+ F\Omega. \tag{5}$$

In view of part (b) of the definition of the Borchers triple and the fact that $t_+ \in \mathcal{W}$, we obtain that $F_+(h_T) \in \mathcal{R}$ for $T$ sufficiently large. Hence, for any $F' \in \mathcal{R}'$,

$$\mathop{\mathrm{s\text{-}lim}}_{T \to \infty} F_+(h_T) F'\Omega = F' P_+ F\Omega, \tag{6}$$

which proves the convergence. Since $\mathcal{R}$ is a von Neumann algebra, the limit $\Phi_+^{\mathrm{out}}(F)$ is an element of $\mathcal{R}$. Since $\Omega$ is separating for $\mathcal{R}$, this operator depends only on $\Phi_+^{\mathrm{out}}(F)\Omega = P_+ F\Omega$. The second part of (4) is proven analogously. Property (a) follows by an application of the mean ergodic theorem, similarly as in (5). Property (b) is obvious from the definitions of $\Phi_+^{\mathrm{out}}$, $\Phi_-^{\mathrm{in}}$.

Let us now define the asymptotic fields corresponding to the wedge $\mathcal{W}'$. Keeping in mind that $J \mathcal{R}' J = \mathcal{R}$, we set for any $F' \in \mathcal{R}'$,

$$\Phi_+^{\mathrm{in}}(F') := J\, \Phi_+^{\mathrm{out}}(J F' J)\, J, \qquad \Phi_-^{\mathrm{out}}(F') := J\, \Phi_-^{\mathrm{in}}(J F' J)\, J. \tag{7}$$

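Making use of formula (2), we easily obtain the following counterpart of Lemma 2.1.

Lemma 2.2. Let $F' \in \mathcal{R}'$. Then

$$\Phi_+^{\mathrm{in}}(F') = \mathop{\mathrm{s\text{-}lim}}_{T \to -\infty} F'_+(h_T), \qquad \Phi_-^{\mathrm{out}}(F') = \mathop{\mathrm{s\text{-}lim}}_{T \to \infty} F'_-(h_T). \tag{8}$$

These operators depend only on the respective vectors $\Phi_+^{\mathrm{in}}(F')\Omega = P_+ F'\Omega$ and $\Phi_-^{\mathrm{out}}(F')\Omega = P_- F'\Omega$ and satisfy
(a) $\Phi_+^{\mathrm{in}}(F')\mathcal{H}_+ \subset \mathcal{H}_+$, $\Phi_-^{\mathrm{out}}(F')\mathcal{H}_- \subset \mathcal{H}_-$,
(b) $\alpha_x(\Phi_+^{\mathrm{in}}(F')) = \Phi_+^{\mathrm{in}}(\alpha_x(F'))$, $\alpha_x(\Phi_-^{\mathrm{out}}(F')) = \Phi_-^{\mathrm{out}}(\alpha_x(F'))$ for $x \in \mathcal{W}'$.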

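Let us now proceed to the construction of scattering states. Clustering properties of the asymptotic fields are of importance here. Proceeding as in [Bu75], we note that for any $F, G \in \mathcal{R}$, $F', G' \in \mathcal{R}'$ there holds

$$\begin{aligned}
\big(\Phi_+^{\mathrm{out}}(F)\Phi_-^{\mathrm{out}}(F')\Omega \,\big|\, \Phi_+^{\mathrm{out}}(G)\Phi_-^{\mathrm{out}}(G')\Omega\big)
&= \big(\Phi_+^{\mathrm{out}}(G)^*\Phi_+^{\mathrm{out}}(F)\Omega \,\big|\, \Phi_-^{\mathrm{out}}(F')^*\Phi_-^{\mathrm{out}}(G')\Omega\big) \\
&= \big(\Phi_+^{\mathrm{out}}(F)\Omega \,\big|\, \Phi_+^{\mathrm{out}}(G)\Omega\big)\,\big(\Phi_-^{\mathrm{out}}(F')\Omega \,\big|\, \Phi_-^{\mathrm{out}}(G')\Omega\big),
\end{aligned} \tag{9}$$

where in the last step we made use of Lemma 2.1 (a), Lemma 2.2 (a) and of the fact that $\mathcal{H}_+ \cap [c\,\Omega]^\perp$ is orthogonal to $\mathcal{H}_- \cap [c\,\Omega]^\perp$. Now for any $\Psi_+ \in \mathcal{H}_+$ (resp. $\Psi_- \in \mathcal{H}_-$) we choose, with the help of property (c) of the Borchers triple, a sequence $\{F_n\}_{n \in \mathbb{N}}$ of elements of $\mathcal{R}$ (resp. a sequence $\{F'_n\}_{n \in \mathbb{N}}$ of elements of $\mathcal{R}'$) s.t. $\mathop{\mathrm{s\text{-}lim}}_{n \to \infty} P_+ F_n\Omega = \Psi_+$ (resp. $\mathop{\mathrm{s\text{-}lim}}_{n \to \infty} P_- F'_n\Omega = \Psi_-$). By relation (9), the limit

$$\Psi_+ \overset{\mathrm{out}}{\times} \Psi_- := \mathop{\mathrm{s\text{-}lim}}_{n \to \infty} \Phi_+^{\mathrm{out}}(F_n)\, \Phi_-^{\mathrm{out}}(F'_n)\Omega \tag{10}$$

exists and does not depend on the choice of the sequences within the above restrictions. We will call it the outgoing scattering state. Next, we define the incoming scattering states as follows:

$$\Psi_+ \overset{\mathrm{in}}{\times} \Psi_- := J\big((J\Psi_+) \overset{\mathrm{out}}{\times} (J\Psi_-)\big). \tag{11}$$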



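This definition is meaningful, since relation (2) gives $J\mathcal{H}_+ \subset \mathcal{H}_+$ and $J\mathcal{H}_- \subset \mathcal{H}_-$. It is easily seen that for suitable sequences $\{G_n\}_{n \in \mathbb{N}}$ (resp. $\{G'_n\}_{n \in \mathbb{N}}$) of elements of $\mathcal{R}$ (resp. of $\mathcal{R}'$), there holds

$$\Psi_+ \overset{\mathrm{in}}{\times} \Psi_- = \mathop{\mathrm{s\text{-}lim}}_{n \to \infty} \Phi_+^{\mathrm{in}}(G'_n)\, \Phi_-^{\mathrm{in}}(G_n)\Omega, \tag{12}$$

similarly as in (10). The states constructed above have the following basic properties which justify their interpretation as scattering states:

Lemma 2.3. For any $\Psi_\pm, \Psi'_\pm \in \mathcal{H}_\pm$,
(a) $(\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-, \Psi'_+ \overset{\mathrm{out}}{\times} \Psi'_-) = (\Psi_+, \Psi'_+)(\Psi_-, \Psi'_-)$,
(b) $U(x)(\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-) = (U(x)\Psi_+) \overset{\mathrm{out}}{\times} (U(x)\Psi_-)$, for $x \in \mathbb{R}^2$.
Analogous relations hold for the incoming scattering states.

Proof. Part (a) follows immediately from relation (9). As for part (b), for any $x \in \mathbb{R}^2$ we can choose such $y \in \mathcal{W}$ and $y' \in \mathcal{W}'$ that $x + y \in \mathcal{W}$ and $x + y' \in \mathcal{W}'$. We choose a sequence $\{F_n\}_{n \in \mathbb{N}}$ of elements of $\mathcal{R}$ and $\{F'_n\}_{n \in \mathbb{N}}$ of elements of $\mathcal{R}'$ s.t. $\mathop{\mathrm{s\text{-}lim}}_{n \to \infty} P_+ F_n(y)\Omega = \Psi_+$ and $\mathop{\mathrm{s\text{-}lim}}_{n \to \infty} P_- F'_n(y')\Omega = \Psi_-$. Then

$$\begin{aligned}
U(x)(\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-)
&= \mathop{\mathrm{s\text{-}lim}}_{n \to \infty} \alpha_x(\Phi_+^{\mathrm{out}}(F_n(y)))\, \alpha_x(\Phi_-^{\mathrm{out}}(F'_n(y')))\Omega \\
&= \mathop{\mathrm{s\text{-}lim}}_{n \to \infty} \alpha_{x+y}(\Phi_+^{\mathrm{out}}(F_n))\, \alpha_{x+y'}(\Phi_-^{\mathrm{out}}(F'_n))\Omega \\
&= \mathop{\mathrm{s\text{-}lim}}_{n \to \infty} \Phi_+^{\mathrm{out}}(F_n(x+y))\, \Phi_-^{\mathrm{out}}(F'_n(x+y'))\Omega,
\end{aligned} \tag{13}$$

where we applied Lemma 2.1 (b) and Lemma 2.2 (b) in the second and third step. We note that the last state on the r.h.s. above is just $(U(x)\Psi_+) \overset{\mathrm{out}}{\times} (U(x)\Psi_-)$, completing the proof of (b). The statement concerning the incoming states follows immediately from the properties of the outgoing states and from definition (11).

After this preparation, we introduce the scattering subspaces

$$\mathcal{H}^{\mathrm{in}} = \mathcal{H}_+ \overset{\mathrm{in}}{\times} \mathcal{H}_- \quad \text{and} \quad \mathcal{H}^{\mathrm{out}} = \mathcal{H}_+ \overset{\mathrm{out}}{\times} \mathcal{H}_-, \tag{14}$$

which are spanned by the respective scattering states. In view of Lemma 2.3, they are canonically isomorphic to the tensor product $\mathcal{H}_+ \otimes \mathcal{H}_-$. Similarly as in [Bu75], we define the scattering operator $S : \mathcal{H}^{\mathrm{out}} \to \mathcal{H}^{\mathrm{in}}$, extending by linearity the following relation:

$$S(\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-) = \Psi_+ \overset{\mathrm{in}}{\times} \Psi_-. \tag{15}$$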

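By Lemma 2.3 this map is an isometry and it is invariant under translations. If $S$ differs from (a constant multiple of) the identity transformation on $\mathcal{H}^{\mathrm{in}}$, then we say that the theory is interacting. If $\mathcal{H}^{\mathrm{in}} = \mathcal{H}^{\mathrm{out}} = \mathcal{H}$, then we say that the theory is asymptotically complete. In the next two sections we exhibit a class of theories which satisfy these two properties.

To conclude this section, we point out that the asymptotic fields form new Borchers triples, which are non-interacting and asymptotically complete. In view of Lemma 2.1 (a) and Lemma 2.2 (a), we can define the following von Neumann algebras acting on $\mathcal{H}^{\mathrm{as}} := \mathcal{H}_+ \otimes \mathcal{H}_-$:

$$\mathcal{R}^{\mathrm{as}} := \{\, \Phi_+^{\mathrm{out}}(F)|_{\mathcal{H}_+} \otimes \Phi_-^{\mathrm{in}}(G)|_{\mathcal{H}_-} \mid F, G \in \mathcal{R} \,\}'', \tag{16}$$

$$(\mathcal{R}')^{\mathrm{as}} := \{\, \Phi_+^{\mathrm{in}}(F')|_{\mathcal{H}_+} \otimes \Phi_-^{\mathrm{out}}(G')|_{\mathcal{H}_-} \mid F', G' \in \mathcal{R}' \,\}''. \tag{17}$$

Moreover, we set $U^{\mathrm{as}}(x) = U(x)|_{\mathcal{H}_+} \otimes U(x)|_{\mathcal{H}_-}$ and $\Omega^{\mathrm{as}} = \Omega \otimes \Omega$. Clearly, there holds

$$\mathrm{sp}\, U^{\mathrm{as}} = \mathrm{sp}\, U|_{\mathcal{H}_+} + \mathrm{sp}\, U|_{\mathcal{H}_-} \subset V_+, \tag{18}$$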

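and $\Omega^{\mathrm{as}}$ is the unique (up to a phase) unit vector which is invariant under the action of $U^{\mathrm{as}}$. Since $\Omega^{\mathrm{as}}$ is cyclic for $\mathcal{R}^{\mathrm{as}}$ and $(\mathcal{R}')^{\mathrm{as}}$, and $(\mathcal{R}')^{\mathrm{as}} \subset (\mathcal{R}^{\mathrm{as}})'$, we obtain that $(\mathcal{R}^{\mathrm{as}}, U^{\mathrm{as}}, \Omega^{\mathrm{as}})$ is a Borchers triple w.r.t. $\mathcal{W}$. We call it the asymptotic Borchers triple of $(\mathcal{R}, U, \Omega)$. It has the following properties:

Proposition 2.4. The Borchers triple $(\mathcal{R}^{\mathrm{as}}, U^{\mathrm{as}}, \Omega^{\mathrm{as}})$ defined above gives rise to an asymptotically complete and non-interacting wedge-local quantum field theory. Moreover, $\mathrm{sp}\, U^{\mathrm{as}} = V_+$.

Proof. Making use of the fact that $(\mathcal{R}')^{\mathrm{as}} \subset (\mathcal{R}^{\mathrm{as}})'$, we obtain the equalities

$$\Phi_+^{\mathrm{as,out}}\big(\Phi_+^{\mathrm{out}}(F)|_{\mathcal{H}_+} \otimes I\big)\Omega^{\mathrm{as}} = P_+ F\Omega \otimes \Omega, \tag{19}$$

$$\Phi_-^{\mathrm{as,out}}\big(I \otimes \Phi_-^{\mathrm{out}}(F')|_{\mathcal{H}_-}\big)\Omega^{\mathrm{as}} = \Omega \otimes P_- F'\Omega, \tag{20}$$

valid for any $F \in \mathcal{R}$, $F' \in \mathcal{R}'$. Thus we conclude that $\mathcal{H}_+^{\mathrm{as}} \supset \mathcal{H}_+ \otimes [c\,\Omega]$ and $\mathcal{H}_-^{\mathrm{as}} \supset [c\,\Omega] \otimes \mathcal{H}_-$. Let $\Psi_\pm \in \mathcal{H}_\pm$ and let $\{F_n\}_{n \in \mathbb{N}}$ (resp. $\{F'_n\}_{n \in \mathbb{N}}$) be a sequence of elements of $\mathcal{R}$ (resp. $\mathcal{R}'$) s.t. $\mathop{\mathrm{s\text{-}lim}}_{n \to \infty} P_+ F_n\Omega = \Psi_+$ (resp. $\mathop{\mathrm{s\text{-}lim}}_{n \to \infty} P_- F'_n\Omega = \Psi_-$). Then we get

$$\begin{aligned}
(\Psi_+ \otimes \Omega) \overset{\mathrm{out}}{\times} (\Omega \otimes \Psi_-)
&= \mathop{\mathrm{s\text{-}lim}}_{n} \Phi_+^{\mathrm{as,out}}\big(\Phi_+^{\mathrm{out}}(F_n)|_{\mathcal{H}_+} \otimes I\big)\, \Phi_-^{\mathrm{as,out}}\big(I \otimes \Phi_-^{\mathrm{out}}(F'_n)|_{\mathcal{H}_-}\big)\Omega^{\mathrm{as}} \\
&= \Psi_+ \otimes \Psi_-.
\end{aligned} \tag{21}$$

By an analogous argument, we verify that

$$(\Psi_+ \otimes \Omega) \overset{\mathrm{in}}{\times} (\Omega \otimes \Psi_-) = \Psi_+ \otimes \Psi_-. \tag{22}$$

We infer from equalities (21) and (22) that $(\mathcal{H}^{\mathrm{as}})^{\mathrm{out}} = (\mathcal{H}^{\mathrm{as}})^{\mathrm{in}} = \mathcal{H}^{\mathrm{as}}$ (i.e. asymptotic completeness holds) and $S = I$ (i.e. the theory is non-interacting). To justify the statement concerning the spectrum of $U^{\mathrm{as}}$, we recall that $\mathcal{H}_+ \cap [c\,\Omega]^\perp$ and $\mathcal{H}_- \cap [c\,\Omega]^\perp$ are assumed to be non-trivial. Consequently, $\mathrm{sp}\, U|_{\mathcal{H}_+}$ and $\mathrm{sp}\, U|_{\mathcal{H}_-}$ have some non-zero elements. From the existence of the unitary representation of the Poincaré group $\tilde U$, associated with the triple $(\mathcal{R}, U, \Omega)$, we conclude that these two spectra coincide with the right and left branch of the lightcone, respectively. Since $\mathrm{sp}\,(U|_{\mathcal{H}_+} \otimes U|_{\mathcal{H}_-}) = \mathrm{sp}\, U|_{\mathcal{H}_+} + \mathrm{sp}\, U|_{\mathcal{H}_-}$, the statement follows.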
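The last step of the proof rests on an elementary fact: every point of the closed forward lightcone is the sum of one point on the right branch and one point on the left branch. The following minimal Python sketch spells this out. The function names `split_into_branches` and `mass_squared` are illustrative only and do not appear in the paper.

```python
def split_into_branches(p0, p1):
    """Write p = (p0, p1) with p0 >= |p1| as a sum of a point on the right
    branch, w_plus * (1, 1), and a point on the left branch, w_minus * (1, -1)."""
    if p0 < abs(p1):
        raise ValueError("p lies outside the closed forward lightcone")
    w_plus = 0.5 * (p0 + p1)   # light-cone energy of the right-moving part
    w_minus = 0.5 * (p0 - p1)  # light-cone energy of the left-moving part
    return (w_plus, w_plus), (w_minus, -w_minus)


def mass_squared(p0, p1):
    """Spectral value of H^2 - P^2 at the point p = (p0, p1)."""
    return p0 * p0 - p1 * p1


right, left = split_into_branches(3.0, 1.0)
print(right, left)             # the two branch points
print(mass_squared(3.0, 1.0))  # equals 4 * w_plus * w_minus
```

For p = (3, 1) the branch points are (2, 2) and (1, -1), and the value of p0^2 - p1^2 is 8, which is four times the product of the two light-cone energies. The same quantity reappears as the exponent of the deformed scattering operator in Sect. 3.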



3. Deformations and Interaction

In the previous section we showed that for any Borchers triple in two-dimensional spacetime (with a unique vacuum state) we can canonically construct the scattering operator $S$ which describes collisions of massless particles (or rather 'waves'). In this section we consider a class of deformations of Borchers triples, introduced in [BLS10], and study their effect on the scattering operator. Similarly as in the massive case [GL08,BS08], the deformed theory turns out to be interacting, even if the original one is not. Moreover, we show that the property of asymptotic completeness is preserved under these deformations.

Let us recall briefly the deformation procedure of [BLS10]. Let $(\mathcal{R}, U, \Omega)$ be a Borchers triple w.r.t. the wedge $\mathcal{W}$. We denote by $\mathcal{R}^\infty$ the subset of elements of $\mathcal{R}$ which are smooth under the action of $\alpha$ in the norm topology. (It is easy to see that $\mathcal{R}^\infty$ is a dense subalgebra of $\mathcal{R}$ in the strong operator topology.) Let $\mathcal{D}$ be the dense domain of vectors which are smooth w.r.t. the action of $U$. Then, as shown in [BLS10], one can define for any $F \in \mathcal{R}^\infty$, and a matrix $Q$, antisymmetric w.r.t. the Minkowski scalar product $(x, y) \mapsto xy$, the warped convolution

$$F_Q = \int dE(x)\, \alpha_{Qx}(F) := \lim_{\varepsilon \searrow 0} (2\pi)^{-2} \int d^2x\, d^2y\, f(\varepsilon x, \varepsilon y)\, e^{-ixy}\, \alpha_{Qx}(F)\, U(y), \tag{23}$$

where $dE$ is the spectral measure of $U$ and $f \in S(\mathbb{R}^2 \times \mathbb{R}^2)$ is s.t. $f(0, 0) = 1$. The limit exists in the strong sense on vectors from $\mathcal{D}$ and is independent of the function $f$ within the above restrictions. We set

$$\mathcal{R}_Q := \{\, F_Q \mid F \in \mathcal{R}^\infty \,\}''. \tag{24}$$

Let us now restrict attention to the following family of matrices:

$$Q_\kappa = \begin{pmatrix} 0 & \kappa \\ \kappa & 0 \end{pmatrix}, \tag{25}$$

where $\kappa > 0$, and recall a result from [BLS10]:

Theorem 3.1. If $(\mathcal{R}, U, \Omega)$ is a Borchers triple w.r.t. $\mathcal{W}$, then $(\mathcal{R}_{Q_\kappa}, U, \Omega)$ is also a Borchers triple w.r.t. $\mathcal{W}$. Moreover, $(\mathcal{R}')_{-Q_\kappa} \subset (\mathcal{R}_{Q_\kappa})'$.

Our goal is to express the scattering operator $S_\kappa$ of the deformed theory $(\mathcal{R}_{Q_\kappa}, U, \Omega)$ by the scattering operator $S$ of the original theory $(\mathcal{R}, U, \Omega)$. To this end, we prove the following fact.

Theorem 3.2. For any $\Psi_\pm \in \mathcal{H}_\pm$ the following relations hold:

$$\Psi_+ \overset{\mathrm{out}}{\times}_\kappa \Psi_- = e^{-\frac{i}{2}\kappa(H^2 - P^2)}\, (\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-), \tag{26}$$

$$\Psi_+ \overset{\mathrm{in}}{\times}_\kappa \Psi_- = e^{\frac{i}{2}\kappa(H^2 - P^2)}\, (\Psi_+ \overset{\mathrm{in}}{\times} \Psi_-), \tag{27}$$

where on the l.h.s. (resp. r.h.s.) there appear the scattering states of the deformed (resp. undeformed) theory.



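Proof. Let us first prove relation (26). To this end, we pick $F \in \mathcal{R}^\infty$, $F' \in (\mathcal{R}')^\infty$. We set $\Psi_+ = P_+ F\Omega = P_+ F_{Q_\kappa}\Omega$ and $\Psi_- = P_- F'\Omega = P_- F'_{-Q_\kappa}\Omega$, where we exploited the translational invariance of the state $\Omega$. Since $F_{Q_\kappa} \in \mathcal{R}_{Q_\kappa}$ and, by Theorem 3.1, $F'_{-Q_\kappa} \in (\mathcal{R}_{Q_\kappa})'$, the outgoing state of the deformed theory is given by

$$\begin{aligned}
\Psi_+ \overset{\mathrm{out}}{\times}_\kappa \Psi_-
&= \lim_{T \to \infty} F_{Q_\kappa,+}(h_T)\, F'_{-Q_\kappa,-}(h_T)\Omega = \lim_{T \to \infty} F_{Q_\kappa,+}(h_T)\, F'_-(h_T)\Omega \\
&= \lim_{T \to \infty} \lim_{\varepsilon \searrow 0} (2\pi)^{-2} \int d^2x\, d^2y\, f(\varepsilon x, \varepsilon y)\, e^{-ixy}\, \alpha_{Q_\kappa x}(F_+(h_T))\, F'_-(h_T)(y)\Omega,
\end{aligned} \tag{28}$$

where in the last step we made use of the fact that $F'_-(h_T)\Omega \in \mathcal{D}$, and that $\Omega$ is invariant under translations. To exchange the order of the limits, we use methods from the proof of Lemma 2.1 of [BLS10]: We note that for each polynomial $(x, y) \mapsto L(x, y)$, there exists a polynomial $(x, y) \mapsto K(x, y)$ s.t.

$$L(x, y)\, e^{-ixy} = K(-\partial_x, -\partial_y)\, e^{-ixy}. \tag{29}$$

We choose $L$ so that $L^{-1}$ and its derivatives are absolutely integrable. Denoting temporarily $\Psi_T(x, y) := \alpha_{Q_\kappa x}(F_+(h_T))\, F'_-(h_T)(y)\Omega$, we obtain

$$\begin{aligned}
&\lim_{\varepsilon \searrow 0} (2\pi)^{-2} \int d^2x\, d^2y\, f(\varepsilon x, \varepsilon y)\, e^{-ixy}\, \Psi_T(x, y) \\
&\quad = \lim_{\varepsilon \searrow 0} (2\pi)^{-2} \int d^2x\, d^2y\, e^{-ixy}\, K(\partial_x, \partial_y)\, f(\varepsilon x, \varepsilon y)\, L(x, y)^{-1}\, \Psi_T(x, y) \\
&\quad = (2\pi)^{-2} \int d^2x\, d^2y\, e^{-ixy}\, K(\partial_x, \partial_y)\, L(x, y)^{-1}\, \Psi_T(x, y),
\end{aligned} \tag{30}$$

where in the first step we integrated by parts and in the second step we applied the dominated convergence theorem. To obtain the last expression, we used the fact that derivatives of $(x, y) \mapsto f(\varepsilon x, \varepsilon y)$ contain powers of $\varepsilon$ and thus vanish in the limit. Substituting this expression to formula (28) and making use again of the dominated convergence theorem, we arrive at

$$\Psi_+ \overset{\mathrm{out}}{\times}_\kappa \Psi_- = (2\pi)^{-2} \int d^2x\, d^2y\, e^{-ixy}\, K(\partial_x, \partial_y)\, L(x, y)^{-1}\, (U(Q_\kappa x)\Psi_+) \overset{\mathrm{out}}{\times} (U(y)\Psi_-). \tag{31}$$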

To interchange the limit $T \to \infty$ with the action of the derivatives, we exploited the fact that for any $F_1 \in \mathcal{R}^\infty$, $\mu \in \{0, 1\}$, the derivative $\partial_{x^\mu} F_1 := (\partial_{x^\mu} F_1(x))|_{x=0}$ is an element of $\mathcal{R}^\infty$ and $\Phi_+^{\mathrm{out}}(\partial_{x^\mu} F_1)(x) = \partial_{x^\mu} \Phi_+^{\mathrm{out}}(F_1)(x)$. This equality (as well as its counterpart for $\Phi_-^{\mathrm{out}}$) follows immediately from the norm continuity of the respective map.

To analyze expression (31), we will exploit some special features of massless theories in two dimensions. First, we recall that $(H - P)\Psi_+ = 0$, and therefore

$$U(Q_\kappa x)\Psi_+ = e^{i\kappa(H x^1 - P x^0)}\Psi_+ = e^{-\frac{i}{2}\kappa(H + P)(x^0 - x^1)}\Psi_+. \tag{32}$$

Similarly, since $(H + P)\Psi_- = 0$, we obtain

$$U(y)\Psi_- = e^{\frac{i}{2}(H - P)(y^0 + y^1)}\Psi_-. \tag{33}$$


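Hence, exploiting the equalities $(H \pm P)\Psi_\mp = 0$ and Lemma 2.3 (b), we get

$$\begin{aligned}
(U(Q_\kappa x)\Psi_+) \overset{\mathrm{out}}{\times} (U(y)\Psi_-)
&= e^{-\frac{i}{2}\kappa(H + P)(x^0 - x^1)}\, e^{\frac{i}{2}(H - P)(y^0 + y^1)}\, (\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-) \\
&= U(v(x, y))\, (\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-),
\end{aligned} \tag{34}$$

where $v(x, y) = \big(\tfrac{1}{2}(y^0 + y^1 - \kappa x^0 + \kappa x^1),\ \tfrac{1}{2}(y^0 + y^1 + \kappa x^0 - \kappa x^1)\big)$. We substitute this expression to formula (31), obtaining

$$\begin{aligned}
\Psi_+ \overset{\mathrm{out}}{\times}_\kappa \Psi_-
&= (2\pi)^{-2} \int d^2x\, d^2y\, e^{-ixy}\, K(\partial_x, \partial_y)\, L(x, y)^{-1}\, U(v(x, y))\, (\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-) \\
&= \int \Big\{ \lim_{\varepsilon \searrow 0} (2\pi)^{-2} \int d^2x\, d^2y\, e^{-ixy}\, f(\varepsilon x, \varepsilon y)\, e^{ipv(x, y)} \Big\}\, dE(p)\, (\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-).
\end{aligned} \tag{35}$$

Here in the second step we expressed $U(v(\cdot, \cdot))$ as a spectral integral and used the Fubini theorem to exchange the order of integration. Then we reversed the steps which led to formula (30). Now we analyze the function in the bracket above. Setting $p^\pm = \tfrac{1}{2}(p^0 \pm p^1)$, we get

$$\begin{aligned}
&(2\pi)^{-2} \int d^2x\, d^2y\, e^{-ixy}\, f(\varepsilon x, \varepsilon y)\, e^{ipv(x, y)} \\
&\quad = (2\pi)^{-2} \int d^2x\, d^2y\, f(\varepsilon(x^0, x^1), \varepsilon(y^0, y^1))\, e^{-i(\kappa p^+ + y^0)x^0}\, e^{i(\kappa p^+ + y^1)x^1}\, e^{ip^-(y^0 + y^1)} \\
&\quad = (2\pi)^{-1} \int d^2y\, \varepsilon^{-2}\, \hat f\big(-\varepsilon^{-1}(\kappa p^+ + y^0, \kappa p^+ + y^1),\ \varepsilon(y^0, y^1)\big)\, e^{ip^-(y^0 + y^1)} \\
&\quad = (2\pi)^{-1} \int d^2y\, \hat f\big(-(y^0, y^1),\ \varepsilon(\varepsilon y^0 - \kappa p^+, \varepsilon y^1 - \kappa p^+)\big)\, e^{ip^-((y^0 + y^1)\varepsilon - 2\kappa p^+)}.
\end{aligned} \tag{36}$$

Here $\hat f$ denotes the Fourier transform of $f$ w.r.t. the $x$ variable and in the last step we exploited the change of variables $(y^0, y^1) \mapsto (\varepsilon y^0 - \kappa p^+, \varepsilon y^1 - \kappa p^+)$. Making use of the dominated convergence theorem, we perform the limit $\varepsilon \searrow 0$, obtaining

$$\lim_{\varepsilon \searrow 0} (2\pi)^{-2} \int d^2x\, d^2y\, e^{-ixy}\, f(\varepsilon x, \varepsilon y)\, e^{ipv(x, y)} = e^{-\frac{i}{2}\kappa((p^0)^2 - (p^1)^2)}. \tag{37}$$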

In view of formula (35), this completes the proof of (26) for dense sets of vectors $\Psi_\pm \in \mathcal{H}_\pm$. For arbitrary $\Psi_\pm$ the statement follows by the limiting procedure (10). The statement (27) concerning the incoming states can be shown using formula (12) and an obvious modification of the above argument. We obtain it however more directly, using formula (26) and definition (11):

$$\begin{aligned}
\Psi_+ \overset{\mathrm{in}}{\times}_\kappa \Psi_-
&= J\big((J\Psi_+) \overset{\mathrm{out}}{\times}_\kappa (J\Psi_-)\big) = J\, e^{-\frac{i}{2}\kappa(H^2 - P^2)}\big((J\Psi_+) \overset{\mathrm{out}}{\times} (J\Psi_-)\big) \\
&= e^{\frac{i}{2}\kappa(H^2 - P^2)}\, (\Psi_+ \overset{\mathrm{in}}{\times} \Psi_-).
\end{aligned} \tag{38}$$

Here in the last step we made use of the fact, shown in [BLS10], that the modular objects of the deformed and undeformed theory coincide. We also exploited the relation $J g(H, P) J = g(H, P)^*$, valid for any bounded, measurable function $g$, which follows from formula (2).
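Relations (32)-(34) and the phase appearing in (26) can be checked at the level of spectral values. The Python sketch below is an idealization rather than part of the proof: it assumes joint spectral values of (H, P) of the form (a, a) on the right branch and (b, -b) on the left branch, which do not correspond to normalizable vectors in the theory. All function names are hypothetical.

```python
import cmath


def translation_phase(p, x):
    """Spectral value of U(x) = exp(i(H x^0 - P x^1)) at (H, P) = p."""
    return cmath.exp(1j * (p[0] * x[0] - p[1] * x[1]))


def apply_Q(kappa, x):
    """Action of Q_kappa = [[0, kappa], [kappa, 0]] on x = (x^0, x^1)."""
    return (kappa * x[1], kappa * x[0])


def v(kappa, x, y):
    """The translation v(x, y) defined below relation (34)."""
    s = y[0] + y[1]
    d = kappa * (x[0] - x[1])
    return (0.5 * (s - d), 0.5 * (s + d))


def relation_34_defect(kappa, a, b, x, y):
    """|lhs - rhs| of (34) for a right-mover (a, a) and a left-mover (b, -b)."""
    p_plus, p_minus, p_total = (a, a), (b, -b), (a + b, a - b)
    lhs = translation_phase(p_plus, apply_Q(kappa, x)) * translation_phase(p_minus, y)
    rhs = translation_phase(p_total, v(kappa, x, y))
    return abs(lhs - rhs)


def deformation_phase(kappa, p):
    """Spectral value of exp(-(i/2) kappa (H^2 - P^2)) at (H, P) = p, cf. (26)."""
    return cmath.exp(-0.5j * kappa * (p[0] ** 2 - p[1] ** 2))


print(relation_34_defect(0.7, 1.3, 2.1, (0.4, -1.1), (2.0, 0.5)))
print(deformation_phase(0.5, (3.0, -1.0)))
```

For a = 1, b = 2 the total momentum is (3, -1), so H^2 - P^2 takes the value 8 = 4ab and the phase in (26) reduces to exp(-2i kappa a b). The dependence on the product ab is what prevents the phase from factorizing into separate functions of the two waves.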



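We immediately obtain the following corollary:

Corollary 3.3. Let $S$ be the scattering operator of $(\mathcal{R}, U, \Omega)$ and let $S_\kappa$ be the scattering operator of $(\mathcal{R}_{Q_\kappa}, U, \Omega)$. Then

$$S_\kappa = e^{i\kappa(H^2 - P^2)} S. \tag{39}$$

In particular, if the original theory is asymptotically complete and non-interacting, and $\mathrm{sp}\, U = V_+$, then the deformed theory is asymptotically complete and interacting.

Proof. Making use of Theorem 3.2 and of the invariance of the scattering operator under translations, we obtain

$$\begin{aligned}
S_\kappa(\Psi_+ \overset{\mathrm{out}}{\times}_\kappa \Psi_-)
&= \Psi_+ \overset{\mathrm{in}}{\times}_\kappa \Psi_- \\
&= e^{\frac{i}{2}\kappa(H^2 - P^2)}\, (\Psi_+ \overset{\mathrm{in}}{\times} \Psi_-) \\
&= e^{\frac{i}{2}\kappa(H^2 - P^2)}\, S(\Psi_+ \overset{\mathrm{out}}{\times} \Psi_-) \\
&= e^{i\kappa(H^2 - P^2)}\, S(\Psi_+ \overset{\mathrm{out}}{\times}_\kappa \Psi_-).
\end{aligned} \tag{40}$$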

This proves formula (39). The property of asymptotic completeness is preserved under the deformation, since $e^{i\kappa(H^2 - P^2)}$ is a unitary. If $\mathcal{H}^{\mathrm{out}} = \mathcal{H}$, $S = I$ and $\mathrm{sp}\, U = V_+$, then $e^{i\kappa(H^2 - P^2)}$ is not a constant multiple of identity on $\mathcal{H}^{\mathrm{out}}$, i.e. the deformed theory is interacting.

We have shown in Proposition 2.4 that any Borchers triple $(\mathcal{R}, U, \Omega)$ with a unique vacuum vector and non-trivial single-particle subspaces $\mathcal{H}_+ \cap [c\,\Omega]^\perp$, $\mathcal{H}_- \cap [c\,\Omega]^\perp$ gives rise to an asymptotic Borchers triple $(\mathcal{R}^{\mathrm{as}}, U^{\mathrm{as}}, \Omega^{\mathrm{as}})$ which is asymptotically complete, non-interacting and s.t. $\mathrm{sp}\, U^{\mathrm{as}} = V_+$. Hence, in view of the above corollary, the deformation of $(\mathcal{R}^{\mathrm{as}}, U^{\mathrm{as}}, \Omega^{\mathrm{as}})$ gives rise to an interacting, asymptotically complete theory. Interestingly, there exists a large class of Borchers triples which are unitarily equivalent to their asymptotic Borchers triples (in the sense of [BLS10]). They give rise to interacting theories with a complete particle interpretation by a direct application of the deformation procedure. In the next section we show that the Borchers triples associated with chiral conformal field theories belong to this class.

4. Asymptotic Completeness of Chiral Nets

In this section we consider a specific class of Borchers triples resulting from chiral nets. We will show that such triples are asymptotically complete, which is at first sight surprising in view of the rich family of superselection sectors in chiral conformal field theory [GF93]. We recall, however, that in the present case particles (or rather 'waves') are composite objects, i.e. they may transform reducibly under the action of the Poincaré group. Consequently, they may contain (pairs of) excitations from other sectors.

We start from the definition of a local net on $\mathbb{R}$, denoted by $(\hat{\mathcal{A}}, \hat U, \hat\Omega)$. It consists of
(a) a map $\mathbb{R} \supset I \mapsto \hat{\mathcal{A}}(I) \subset B(\hat{\mathcal{H}})$, from open, bounded intervals to von Neumann algebras on $\hat{\mathcal{H}}$ s.t.

$$\hat{\mathcal{A}}(I) \subset \hat{\mathcal{A}}(J) \quad \text{for } I \subset J, \tag{41}$$

$$[\hat{\mathcal{A}}(I), \hat{\mathcal{A}}(J)] = 0 \quad \text{for } I \cap J = \emptyset; \tag{42}$$

(b) a unitary representation $\mathbb{R} \ni s \mapsto \hat U(s)$ s.t.

$$\mathrm{sp}\, \hat U \subset \mathbb{R}_+, \tag{43}$$

$$\hat U(s)\, \hat{\mathcal{A}}(I)\, \hat U(s)^{-1} = \hat{\mathcal{A}}(I + s) \quad \text{for } s \in \mathbb{R}; \tag{44}$$

(c) a unique (up to a phase) unit vector $\hat\Omega \in \hat{\mathcal{H}}$, invariant under the action of $\hat U$, which is cyclic for any local algebra $\hat{\mathcal{A}}(I)$.

We remark that there are many examples of local nets on $\mathbb{R}$. They arise, in particular, from conformal field theories on $S^1$ (see e.g. [BMT88,KL04] for concrete examples). Given a theory on $S^1$ one obtains a net on the compactified real line by means of the Cayley transform. Its restriction to the real line gives rise to a local net on $\mathbb{R}$ with properties specified above.

Let $(\hat{\mathcal{A}}_1, \hat U_1, \hat\Omega_1)$ and $(\hat{\mathcal{A}}_2, \hat U_2, \hat\Omega_2)$ be two local nets on $\mathbb{R}$, and let $\hat{\mathcal{H}}_1$, $\hat{\mathcal{H}}_2$ be the respective Hilbert spaces. We identify the two real lines with the lightlines $x + t = 0$ and $x - t = 0$ in $\mathbb{R}^2$. To construct a local net on $\mathbb{R}^2$, acting on the tensor product space $\mathcal{H} = \hat{\mathcal{H}}_1 \otimes \hat{\mathcal{H}}_2$, we first specify the unitary representation of translations:

$$U(x) := \hat U_1\Big(\tfrac{1}{\sqrt{2}}(x^0 - x^1)\Big) \otimes \hat U_2\Big(\tfrac{1}{\sqrt{2}}(x^0 + x^1)\Big). \tag{45}$$

Let $\alpha_x(\,\cdot\,) := U(x) \,\cdot\, U(x)^*$ be the corresponding group of translation automorphisms and let $\alpha^{(1/2)}_s(\,\cdot\,) := \hat U_{1/2}(s) \,\cdot\, \hat U_{1/2}(s)^*$ for $s \in \mathbb{R}$. Then there holds

$$\alpha_x(A_1 \otimes A_2) = \alpha^{(1)}_{\frac{1}{\sqrt{2}}(x^0 - x^1)}(A_1) \otimes \alpha^{(2)}_{\frac{1}{\sqrt{2}}(x^0 + x^1)}(A_2), \quad A_1 \in \hat{\mathcal{A}}_1,\ A_2 \in \hat{\mathcal{A}}_2. \tag{46}$$
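Formula (45) fixes how the two chiral generators combine into the energy and momentum of the two-dimensional theory. The sketch below illustrates this at the level of spectral values. It assumes the sign convention that the chiral translations act as exp(i K_j s) with K_j >= 0, which is one reading of the spectrum condition (43) and is not spelled out in the text. The names `chiral_phase` and `energy_momentum` are illustrative.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)


def chiral_phase(k1, k2, x):
    """Spectral value of U(x) from (45) at chiral spectral values k1, k2 >= 0,
    assuming the chiral translations act as exp(i * k_j * s)."""
    s1 = (x[0] - x[1]) / SQRT2
    s2 = (x[0] + x[1]) / SQRT2
    return cmath.exp(1j * (k1 * s1 + k2 * s2))


def energy_momentum(k1, k2):
    """Values (H, P) such that U(x) = exp(i(H x^0 - P x^1)) under the same assumption."""
    return ((k1 + k2) / SQRT2, (k1 - k2) / SQRT2)


H, P = energy_momentum(1.5, 0.5)
x = (0.3, -1.2)
print(abs(chiral_phase(1.5, 0.5, x) - cmath.exp(1j * (H * x[0] - P * x[1]))))
print(H >= abs(P), H * H - P * P)
```

Under this convention H >= |P| holds automatically, H^2 - P^2 equals 2 k1 k2, and H = P exactly when k2 = 0, i.e. when the second chiral component is in its translation-invariant state.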

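Any double cone in $\mathbb{R}^2$ can be written as a product of two intervals on lightlines $I_1 \times I_2$. We define the corresponding local algebra by $\mathcal{A}(I_1 \times I_2) := \hat{\mathcal{A}}_1(I_1) \otimes \hat{\mathcal{A}}_2(I_2)$. Setting $\Omega = \hat\Omega_1 \otimes \hat\Omega_2$, we obtain a triple $(\mathcal{A}, U, \Omega)$, which we call a chiral net on $\mathbb{R}^2$. Defining

$$\mathcal{R} := \bigvee_{I_1 \times I_2 \subset \mathcal{W}} \mathcal{A}(I_1 \times I_2), \tag{47}$$

we arrive at a Borchers triple $(\mathcal{R}, U, \Omega)$ associated with $(\mathcal{A}, U, \Omega)$. We will show that this Borchers triple is unitarily equivalent to its asymptotic Borchers triple $(\mathcal{R}^{\mathrm{as}}, U^{\mathrm{as}}, \Omega^{\mathrm{as}})$ and therefore, by Proposition 2.4, asymptotically complete and non-interacting. To this end, we determine the asymptotic fields in the following proposition:

Proposition 4.1. For any $A_1 \in \hat{\mathcal{A}}_1(I_1)$, $A_2 \in \hat{\mathcal{A}}_2(I_2)$ there holds

$$\Phi_+^{\mathrm{out/in}}(A_1 \otimes A_2) = A_1 \otimes (\hat\Omega_2 | A_2 \hat\Omega_2)\, I, \tag{48}$$

$$\Phi_-^{\mathrm{out/in}}(A_1 \otimes A_2) = (\hat\Omega_1 | A_1 \hat\Omega_1)\, I \otimes A_2. \tag{49}$$

(In the case of $\Phi_+^{\mathrm{out}}$ and $\Phi_-^{\mathrm{in}}$ it is assumed that $I_1 \times I_2 \subset \mathcal{W}$. In the remaining cases $I_1 \times I_2 \subset \mathcal{W}'$.)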



Proof. We consider only $\Phi_+^{\mathrm{out}}$, as the remaining cases are analogous. From its definition and formula (46), we obtain

$$\Phi_+^{\mathrm{out}}(A_1 \otimes A_2) = \mathop{\mathrm{s\text{-}lim}}_{T \to \infty} A_1 \otimes \int dt\, h_T(t)\, \alpha^{(2)}_{\sqrt{2}t}(A_2). \tag{50}$$

We denote $A_2(h_T) := \int dt\, h_T(t)\, \alpha^{(2)}_{\sqrt{2}t}(A_2)$. This sequence has the following properties:

$$\mathop{\mathrm{s\text{-}lim}}_{T \to \infty} A_2(h_T)\hat\Omega_2 = (\hat\Omega_2 | A_2 \hat\Omega_2)\, \hat\Omega_2, \tag{51}$$

$$\lim_{T \to \infty} [A_2(h_T), A] = 0, \quad \text{for any } A \in \hat{\mathcal{A}}_2(I), \tag{52}$$

where $I$ is an arbitrary open, bounded interval. The first identity above follows from the mean ergodic theorem and the fact that $\hat\Omega_2$ is the only vector invariant under the action of $\hat U_2$. The second equality is a consequence of the locality property (42). Now since $[\hat{\mathcal{A}}_2(I)\hat\Omega_2] = \hat{\mathcal{H}}_2$, we obtain from relations (51), (52),

$$\mathop{\mathrm{s\text{-}lim}}_{T \to \infty} A_2(h_T) = (\hat\Omega_2 | A_2 \hat\Omega_2)\, I. \tag{53}$$

This completes the proof.
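The mean ergodic theorem enters (5) and (51) through the averaging functions h_T: a spectral component with value omega contributes the factor given by the integral of h_T(t) exp(i omega t), which tends to zero for omega != 0 and equals one for omega = 0. The Python sketch below evaluates this factor numerically. It is only an illustration: the particular bump function, the choice eps = 0.5, the restriction to T > 0 and the trapezoidal quadrature are assumptions made here.

```python
import cmath
import math


def bump(s):
    """Smooth, symmetric, non-negative function supported in (-1, 1) (unnormalized)."""
    d = 1.0 - s * s
    return math.exp(-1.0 / d) if d > 0.0 else 0.0


def averaged_phase(omega, T, eps=0.5, n=20000):
    """Trapezoidal approximation of  int dt h_T(t) exp(i*omega*t)  for T > 0,
    where h_T(t) = T**(-eps) * h(T**(-eps) * (t - T)) and h is the normalized bump."""
    scale = T ** eps
    ds = 2.0 / n
    numerator = 0j
    normalization = 0.0
    for k in range(n + 1):
        s = -1.0 + k * ds  # substitution t = T + scale * s
        w = bump(s)
        normalization += w
        numerator += w * cmath.exp(1j * omega * (T + scale * s))
    return numerator / normalization


for T in (1.0, 1.0e2, 1.0e4):
    print(T, abs(averaged_phase(0.0, T)), abs(averaged_phase(1.0, T)))
```

The modulus stays equal to one at omega = 0 and decays as T grows for omega = 1, because the width of h_T grows like T**eps. Only the translation-invariant component survives the limit, which is the mechanism behind (51).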

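Now we can easily prove the main result of this section:

Theorem 4.2. Any Borchers triple $(\mathcal{R}, U, \Omega)$ associated with a chiral net on $\mathbb{R}^2$ is unitarily equivalent to its asymptotic Borchers triple $(\mathcal{R}^{\mathrm{as}}, U^{\mathrm{as}}, \Omega^{\mathrm{as}})$. More precisely, there exists a unitary map $W : \mathcal{H}^{\mathrm{as}} \to \mathcal{H}$ s.t. $W \mathcal{R}^{\mathrm{as}} = \mathcal{R} W$, $W U^{\mathrm{as}}(x) = U(x) W$ and $W \Omega^{\mathrm{as}} = \Omega$.

Proof. By cyclicity of $\Omega$ under the action of $\mathcal{R}$ and the mean ergodic theorem (cf. formula (5)), there holds

$$\mathcal{H}_\pm = [\, \Phi_\pm^{\mathrm{out}}(F)\Omega \mid F \in \mathcal{R} \,]. \tag{54}$$

Thus, applying Proposition 4.1, and exploiting the cyclicity of $\hat\Omega_1$, $\hat\Omega_2$ under the action of the respective local algebras, we obtain

$$\mathcal{H}_+ = \hat{\mathcal{H}}_1 \otimes [c\,\hat\Omega_2], \tag{55}$$

$$\mathcal{H}_- = [c\,\hat\Omega_1] \otimes \hat{\mathcal{H}}_2. \tag{56}$$

Recalling that $\mathcal{H}^{\mathrm{as}} = \mathcal{H}_+ \otimes \mathcal{H}_-$ and $\mathcal{H} = \hat{\mathcal{H}}_1 \otimes \hat{\mathcal{H}}_2$, we define a unitary map $W : \mathcal{H}^{\mathrm{as}} \to \mathcal{H}$, extending by linearity the relation

$$W\big((\Psi_1 \otimes \hat\Omega_2) \otimes (\hat\Omega_1 \otimes \Psi_2)\big) = \Psi_1 \otimes \Psi_2, \quad \Psi_1 \in \hat{\mathcal{H}}_1,\ \Psi_2 \in \hat{\mathcal{H}}_2. \tag{57}$$

It is readily verified that

$$W U^{\mathrm{as}}(x) = U(x) W, \tag{58}$$

$$W \Omega^{\mathrm{as}} = \Omega, \tag{59}$$

$$W \{\, \Phi_+^{\mathrm{out}}(A_1 \otimes A_2)|_{\mathcal{H}_+} \otimes \Phi_-^{\mathrm{in}}(B_1 \otimes B_2)|_{\mathcal{H}_-} \,\} = (\hat\Omega_1 | B_1 \hat\Omega_1)(\hat\Omega_2 | A_2 \hat\Omega_2)\, \{\, A_1 \otimes B_2 \,\}\, W, \tag{60}$$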


where $A_1 \otimes A_2$, $B_1 \otimes B_2$ comply with the assumptions of Proposition 4.1. By definition (47), the elements in the curly bracket on the r.h.s. of (60) generate $\mathcal{R}$. Making use of this fact and of the identities $\Phi_+^{\mathrm{out}}(F)|_{\mathcal{H}_+} = P_+ F|_{\mathcal{H}_+}$, $\Phi_-^{\mathrm{in}}(F)|_{\mathcal{H}_-} = P_- F|_{\mathcal{H}_-}$, where $F \in \mathcal{R}$, we obtain that the double commutant of the set of elements in the curly bracket on the l.h.s. of (60) coincides with $\mathcal{R}^{\mathrm{as}}$. Hence $W \mathcal{R}^{\mathrm{as}} = \mathcal{R} W$, which concludes the proof.

In view of the above theorem, we obtain from Proposition 2.4:

Corollary 4.3. Any Borchers triple $(\mathcal{R}, U, \Omega)$, associated with a chiral net, gives rise to an asymptotically complete, non-interacting theory. Moreover, $\mathrm{sp}\, U = V_+$.

Hence, by Corollary 3.3, deformations of such Borchers triples give rise to asymptotically complete, interacting theories.

5. Concluding Remarks

In this paper we applied the deformation method, developed in [BLS10], to two-dimensional massless theories. We have shown that the deformation procedure not only introduces interaction, as expected from the massive case [GL08,BS08], but also preserves the property of asymptotic completeness. By deforming chiral conformal field theories, we obtained a large class of wedge-local theories, which are interacting and asymptotically complete. As the resulting scattering matrices are Lorentz invariant, one can hope for the existence of local observables in these models. We recall that negative results concerning this issue have so far been established only in spacetimes of dimension larger than two [BLS10].

A large part of our investigation was devoted to scattering of massless particles ('waves') in two-dimensional wedge-local theories. It turned out that the scattering theory developed in [Bu75] for local nets of observables generalizes naturally to the wedge-local framework: To construct the two-body scattering matrix, it suffices to know the Borchers triple. It is an interesting open problem whether this fact remains true for scattering of massless particles in spacetimes of higher dimension. We recall that for local nets of observables scattering theory of massless excitations is well understood in the physical spacetime [Bu77].

Acknowledgements. The authors would like to thank Prof. D. Buchholz, Dr. G. Lechner and D. Cadamuro for interesting discussions. Moreover, W.D. gratefully acknowledges hospitality of the Erwin Schrödinger International Institute for Mathematical Physics in Vienna during the final stages of this work.

References

[BFK06] Babujian, H., Foerster, A., Karowski, M.: The form factor program: a review and new results - the nested SU(N) off-shell Bethe Ansatz. In: J. Balog, L. Fehér (eds.), Proceedings of the O’Raifeartaigh Symposium on Non-Perturbative and Symmetry Methods in Field Theory, Budapest, 2006
[BFM04] Benfatto, G., Falco, P., Mastropietro, V.: Functional integral construction of the massive Thirring model: Verification of axioms and massless limit. Commun. Math. Phys. 273, 67–118 (2007)
[Bo92] Borchers, H.-J.: The CPT-theorem in two-dimensional theories of local observables. Commun. Math. Phys. 143, 315–332 (1992)
[BBS01] Borchers, H.-J., Buchholz, D., Schroer, B.: Polarization-free generators and the S-matrix. Commun. Math. Phys. 219, 125–140 (2001)
[Bu75] Buchholz, D.: Collision theory for waves in two dimensions and a characterization of models with trivial S-matrix. Commun. Math. Phys. 45, 1–8 (1975)
[Bu77] Buchholz, D.: Collision theory for massless bosons. Commun. Math. Phys. 52, 147–173 (1977)
[BMT88] Buchholz, D., Mack, G., Todorov, I.: The current algebra on the circle as a germ of local field theories. Nuclear Phys. B Proc. Suppl. 5B, 20–56 (1988)
[Bu90] Buchholz, D.: Harmonic analysis of local operators. Commun. Math. Phys. 129, 631–641 (1990)
[BS08] Buchholz, D., Summers, S.: Warped convolutions: A novel tool in the construction of quantum field theories. http://arxiv.org/abs/0806.0349v1 [math-ph], 2008
[BLS10] Buchholz, D., Lechner, G., Summers, S.: Warped convolutions, Rieffel deformations and the construction of quantum field theories. http://arxiv.org/abs/1005.2656v1 [math-ph], 2010
[CRW85] Carey, A.L., Ruijsenaars, S.N.M., Wright, J.D.: The massless Thirring model: Positivity of Klaiber’s n-point functions. Commun. Math. Phys. 99, 347–364 (1985)
[CD82] Combescure, M., Dunlop, F.: Three-body asymptotic completeness for P(φ)2 models. Commun. Math. Phys. 85, 381–418 (1982)
[DLM10] Dappiaggi, C., Lechner, G., Morfa-Morales, E.: Deformations of quantum field theories on spacetimes with Killing vector fields. http://arxiv.org/abs/1006.3548v1 [math-ph], 2010
[DG99] Dereziński, J., Gerard, C.: Asymptotic completeness in quantum field theory. Massive Pauli-Fierz Hamiltonians. Rev. Math. Phys. 11, 383–450 (1999)
[Dy05] Dybalski, W.: Haag-Ruelle scattering theory in presence of massless particles. Lett. Math. Phys. 72, 27–38 (2005)
[Dy09] Dybalski, W.: Continuous spectrum of automorphism groups and the infraparticle problem. Commun. Math. Phys. 300(1), 273–299 (2010)
[Fl98] Florig, M.: On Borchers’ theorem. Lett. Math. Phys. 46, 289–293 (1998)
[FGS04] Fröhlich, J., Griesemer, M., Schlein, B.: Asymptotic completeness for Compton scattering. Commun. Math. Phys. 252, 415–476 (2004)
[GF93] Gabbiani, F., Fröhlich, J.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
[GL08] Grosse, H., Lechner, G.: Wedge-local quantum fields and non-commutative Minkowski space. JHEP 0711, 12–39 (2007)
[Ha] Haag, R.: Local Quantum Physics. Berlin: Springer, 1992
[Ha58] Haag, R.: Quantum field theories with composite particles and asymptotic conditions. Phys. Rev. 112, 669–673 (1958)
[KL04] Kawahigashi, Y., Longo, R.: Classification of two-dimensional local conformal nets with c < 1 and 2-cohomology vanishing for tensor categories. Commun. Math. Phys. 244, 63–97 (2004)
[Le08] Lechner, G.: Construction of quantum field theories with factorizing S-matrices. Commun. Math. Phys. 277, 821–860 (2008)
[LSZ55] Lehmann, H., Symanzik, K., Zimmermann, W.: Zur Formulierung quantisierter Feldtheorien. Nuovo Cim. 1, 205–225 (1955)
[Po04.1] Porrmann, M.: Particle weights and their disintegration I. Commun. Math. Phys. 248, 269–304 (2004)
[Po04.2] Porrmann, M.: Particle weights and their disintegration II. Commun. Math. Phys. 248, 305–333 (2004)
[Ru62] Ruelle, D.: On the asymptotic condition in quantum field theory. Helv. Phys. Acta 35, 147–163 (1962)
[SW00] Schroer, B., Wiesbrock, H.W.: Modular constructions of quantum field theories with interactions. Rev. Math. Phys. 12, 301–326 (2000)
[Sp97] Spohn, H.: Asymptotic completeness for Rayleigh scattering. J. Math. Phys. 38, 2281–2296 (1997)
[ZZ92] Zamolodchikov, A.B., Zamolodchikov, Al.B.: Massless factorized scattering and sigma models with topological terms. Nucl. Phys. B 379, 602–623 (1992)

Communicated by Y. Kawahigashi




Symplectic Geometry of Entanglement

Adam Sawicki¹, Alan Huckleberry², Marek Kuś¹

¹ Center for Theoretical Physics, Polish Academy of Sciences, Al. Lotników 32/46, 02-668 Warszawa, Poland. E-mail: [email protected]; [email protected]

² Fakultät für Mathematik, Ruhr-Universität Bochum, 44780 Bochum, Germany

Received: 19 July 2010 / Accepted: 10 December 2010 Published online: 13 May 2011 – © The Author(s) 2011. This article is published with open access at Springerlink.com

Abstract: We present a description of entanglement in composite quantum systems in terms of symplectic geometry. We provide a symplectic characterization of sets of equally entangled states as orbits of group actions in the space of states. In particular, using the Kostant-Sternberg theorem, we show that separable states form a unique symplectic orbit, whereas orbits of entangled states are characterized by different degrees of degeneracy of the canonical symplectic form on the complex projective space. The degree of degeneracy may thus be used as a new geometric measure of entanglement. The above statements remain true for systems with an arbitrary number of components. Moreover, the presented method is general and can also be applied under additional symmetry conditions stemming, e.g., from the indistinguishability of particles. We show how to calculate the degeneracy for various multiparticle systems, providing also simple criteria of separability.

1. Introduction

Quantum entanglement, a direct consequence of the linearity of quantum mechanics and the superposition principle, is one of the most intriguing phenomena distinguishing the quantum and classical descriptions of physical systems. Quantum states which are entangled possess features unknown in the classical world, like the seemingly paradoxical nonlocal properties exhibited by the famous Einstein-Podolsky-Rosen analysis of completeness of the quantum theory. Recently, with the development of quantum information theory, they came to prominence as the main resource for several applications aimed at speeding up information transfer and making it more secure (see, e.g., [1]). Pure states which are not entangled are called separable, and for systems of $L$ distinguishable particles they are, by definition, described by simple tensors in the Hilbert space of the whole system, $H = H_1 \otimes \cdots \otimes H_L$, where $H_l$ are the single-particle spaces.
For indistinguishable particles such a definition makes no sense: indistinguishability enforces symmetrization or antisymmetrization of the state vectors. In effect nearly all


states are not simple tensors, in fact the relevant Hilbert spaces of such systems are no longer full tensor products, but rather their symmetric or antisymmetric subspaces. In these cases one modifies the original definition of separability and adapts it according to symmetry (see below). The concept of separability (or equivalently nonentanglement) can be in a natural way extended to mixed states by first identifying pure states with projections on their directions (i.e. rank-one orthogonal projections) and then defining mixed separable states as convex combinations of pure separable ones. Mixed states which are not separable are, consequently, called entangled. Separability of a state remains unaffected under a particular class of transformations allowed by quantum mechanics. Thus, for example, a separable state of distinguishable particles remains separable when we act on it by a unitary operator U = U1 ⊗ · · · ⊗ U L , where Ul are unitaries acting in the single-particle spaces. One can find appropriate classes of unitary operators preserving separability also in the cases of indistinguishable particles. Going one step further one may analyze how actions of separability-preserving unitaries stratify into their orbits the whole space of states (pure or mixed) of a composite quantum system. To treat all the cases in a unified way we may consider a general situation in which a compact group K acts on some manifold M. The manifold in question will then depend on the considered system. For pure states it will be the projectivisation P(H) of the Hilbert space H in the case of distinguishable particles or the projectivisation of an appropriate symmetrization (for bosons) or antisymmetrization (for fermions) of H. In all cases the manifold M is naturally equipped with some additional structure. In our investigations it will be a symplectic structure inherited from the natural one existing on every complex Hilbert space. 
Orbits of $K$, being submanifolds of $M$, might also, under special circumstances, inherit the symplectic structure or, in addition, respect the underlying complex structure of $H$ and become Kählerian. From this point of view we want to consider several problems.

1. How do symplectic and nonsymplectic orbits of the $K$ action on $P(V)$ stratify the set of pure states?
2. What is the meaning (for the entanglement properties) of the fact that the orbit through a particular pure state is or is not symplectic?

In the next section we start with the relevant definitions of separability and entanglement for distinguishable as well as indistinguishable particles. When giving definitions we concentrate on $L = 2$, i.e. on two-partite systems, but the reasoning for larger $L$ remains very similar. To make the paper reasonably self-contained we devote a few further sections and the Appendix to a presentation of those tools from Lie-group representation theory and symplectic geometry that are most important in our investigations.

2. Separable and Entangled States

Let $H$ be an $N$-dimensional Hilbert space. By choosing an orthonormal basis in $H$ we will identify it with $\mathbb{C}^N$ equipped with the standard Hermitian product. A state is a positive, trace-one linear operator on $H$,

$$\rho : H \to H, \quad \forall x \in H \;\; \langle x|\rho|x\rangle \ge 0, \quad \mathrm{Tr}\,\rho = 1. \tag{1}$$

We use the standard Dirac notation: $|x\rangle$ is an element of $H$, and $\langle x|$ is the element of the dual space $H^*$ corresponding to $|x\rangle$ via the scalar product $\langle \cdot | \cdot \rangle$ on $H$. A state is, by definition, pure if it is a rank-one projection,

$$\rho = \rho^2, \tag{2}$$

otherwise it is called mixed. A pure state can thus be written in the form $\rho = |x\rangle\langle x| / \langle x|x\rangle := P_x$ for some $x \in H$, hence it can be identified with a point in the projective space $P(H)$.
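As a quick numerical illustration of this identification, the following minimal sketch builds $P_x$ from an arbitrary nonzero vector and checks that it is idempotent, has unit trace, and does not change when $x$ is rescaled. The helper names (`projector`, `matmul`, `trace`, `close`) and the sample vector are illustrative assumptions, not part of the original text.

```python
# Sketch: the pure state P_x = |x><x| / <x|x> as a matrix (list of rows).

def projector(x):
    """Return P_x for a nonzero vector x given as a list of complex amplitudes."""
    norm2 = sum(abs(a) ** 2 for a in x)                    # <x|x>
    return [[a * b.conjugate() / norm2 for b in x] for a in x]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

x = [1 + 2j, 0.5j, -3 + 0j]            # sample vector, deliberately not normalized
P = projector(x)

is_projection = close(matmul(P, P), P)                     # rho = rho^2, Eq. (2)
has_unit_trace = abs(trace(P) - 1) < 1e-12                 # Tr rho = 1, Eq. (1)
# Rescaling x by a nonzero complex number gives the same point of P(H).
same_point = close(projector([(2 - 1j) * a for a in x]), P)
```

The last check is the reason pure states live in the projective space rather than in $H$ itself.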

2.1. Separable and entangled states of two distinguishable particles. The Hilbert space for a composite system of two distinguishable particles is the tensor product of the Hilbert spaces of the subsystems,

$$H = H_1 \otimes H_2, \quad H_1 \cong \mathbb{C}^{N_1}, \quad H_2 \cong \mathbb{C}^{N_2}. \tag{3}$$

A pure state $\rho$ is called separable or, equivalently, nonentangled if and only if it is a tensor product of pure states of the subsystems,

$$\rho = P_x \otimes P_y, \quad |x\rangle \in H_1, \; |y\rangle \in H_2, \tag{4}$$

otherwise it is called entangled. A mixed state is, by definition, separable if it is a convex combination of pure separable states [2],

$$\rho = \sum_i p_i \, P_{x_i} \otimes P_{y_i}, \quad |x_i\rangle \in H_1, \; |y_i\rangle \in H_2, \quad p_i > 0, \quad \sum_i p_i = 1. \tag{5}$$

From the physical point of view it is often desirable to quantify how strongly entangled a particular state $\rho$ is. Such a quantification of entanglement is not universal, especially for systems with more than two constituents, and it can be constructed on the basis of different properties of entangled states (those measured in an actual experiment). It should, however, always ascribe the same amount of entanglement to states differing by local quantum operations, i.e. by a conjugation by an element of the direct product of the unitary groups $U(H_1) \times U(H_2)$,

$$\rho \mapsto (U_1 \otimes U_2)\, \rho \, (U_1^\dagger \otimes U_2^\dagger). \tag{6}$$
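The invariance requirement (6) can be illustrated for two qubits. The sketch below is not taken from the text: it assumes the standard fact that a pure two-qubit state with coefficients $C_{ij}$ in the basis $e_i \otimes e_j$ is a product state exactly when $\det C = 0$, and it checks that $|\det C|$ is unchanged by $U_1 \otimes U_2$ with $U_1, U_2 \in SU(2)$. All function names and the sample angles are hypothetical.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def det2(C):
    return C[0][0] * C[1][1] - C[0][1] * C[1][0]

def su2(theta, phi, psi):
    """A sample element of SU(2): unitary, determinant one."""
    a = cmath.exp(1j * phi) * math.cos(theta)
    b = cmath.exp(1j * psi) * math.sin(theta)
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def act(U1, U2, C):
    """Coefficient matrix of (U1 (x) U2)|psi>, i.e. C -> U1 C U2^T."""
    return matmul(matmul(U1, C), transpose(U2))

s = 1 / math.sqrt(2)
C_product = [[1, 0], [0, 0]]      # e1 (x) e1, separable
C_bell = [[0, s], [s, 0]]         # (e1 (x) e2 + e2 (x) e1)/sqrt(2), entangled

U1, U2 = su2(0.3, 1.1, -0.4), su2(1.2, -0.7, 2.0)   # arbitrary sample angles

before = (abs(det2(C_product)), abs(det2(C_bell)))
after = (abs(det2(act(U1, U2, C_product))), abs(det2(act(U1, U2, C_bell))))
```

Any function of $|\det C|$ is therefore constant along the orbits of the local unitary group, which is the minimal property demanded of an entanglement measure above.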

2.2. Separable and entangled states of two indistinguishable particles. For indistinguishable particles the Hilbert space of a composite two-partite system is no longer the tensor product of the Hilbert spaces of the subsystems but

1. the antisymmetric part of the tensor product in the case of fermions,

$$H_F = \textstyle\bigwedge^2 (H_1), \tag{7}$$

2. the symmetric part of the tensor product in the case of bosons,

$$H_B = \mathrm{Sym}^2 (H_1), \tag{8}$$

where $H_1 \cong \mathbb{C}^N$ is the so-called one-particle Hilbert space, i.e. the Hilbert space of a single particle. In the fermionic case there is a natural way of defining pure nonentangled states: a state $\rho$ is nonentangled if and only if it is an orthogonal projection on an antisymmetric part of the tensor product of two vectors from $H_1$ [3,4]. Otherwise $\rho$ is called entangled. This definition, which can be extended in an obvious way to multipartite systems, is equivalent to the one proposed in [3] and [4]. Interestingly, a completely analogous definition for two bosons, identifying nonentangled pure states with orthogonal projections on symmetrized tensor products of two vectors, leads to some unexpected consequences: there are two classes of unentangled states that are inequivalent from the point of view of the action of the appropriate unitary group in the single-particle spaces. Namely, to keep the indistinguishability intact, the unitary action of $U(N) \times U(N)$ on the Hilbert space of the composite system $H_B$ given by Eq. (8) must be restricted to the diagonal one, i.e., one involving the same unitary operator in both copies of the single-particle space $H_1$. Clearly, nonentangled states that are products of two copies of the same state then lie on a different orbit of the unitary action than states which are (projections of) symmetrized tensor products of distinct (say, orthogonal) vectors from $H_1$. As a consequence, two definitions supported by physical arguments have been employed in the literature. In [5] (see also [6]) a concept of 'complete system of properties' of a subsystem was used to identify both classes of states described above as nonentangled, whereas in [7] it was pointed out that only the restriction of the definition of nonentanglement to products of two copies of the same vector ensures that nonentangled states cannot be used to perform a clearly 'nonclassical' task such as teleportation, which is in accordance with the basic intuition connecting nonentanglement with the classical world. The second definition of nonentangled bosonic states was also proposed in [4], based on slightly different arguments. We will return to the problem in Sect. 10, provisionally adopting the more restrictive definition of nonentanglement in the bosonic case. Mixed nonentangled states for fermions and bosons are defined, as in the case of distinguishable particles, as convex combinations of projections on pure nonentangled states. As in the case of distinguishable particles, the physically interesting amount of entanglement is invariant under the action of $U(H_1)$ acting in the one-particle space $H_1$.

3. Pure Nonentangled States as Coherent States

In all three cases of distinguishable particles, fermions, and bosons, the pure nonentangled states, treated as points in appropriate projective spaces, form a set invariant under the action of an appropriate compact, semisimple group $K$ irreducibly represented on some Hilbert space $H$ [8–10]. This observation is in accordance with the intuition that entanglement properties of a state should not change under 'local' transformations allowed by quantum mechanics and the symmetries of a system. Thus, for example, for two distinguishable particles in two distant laboratories, local transformations can consist of independent quantum evolutions of each particle. This paradigm does not apply to indistinguishable particles where, in order to keep the exchange symmetry untouched, both particles must undergo the same evolution. Thus,

1. For distinguishable particles,

$$K = SU(N_1) \times SU(N_2), \quad H = \mathbb{C}^{N_1} \otimes \mathbb{C}^{N_2}. \tag{9}$$

2. For fermions,

$$K = SU(N), \quad H = \textstyle\bigwedge^2 \mathbb{C}^N. \tag{10}$$

3. For bosons,

$$K = SU(N), \quad H = \mathrm{Sym}^2\, \mathbb{C}^N. \tag{11}$$

In all cases the nonentangled pure states are distinguished as forming a unique orbit of the underlying group action [11,12]. The orbit in question appears in the literature in several contexts, and customarily its points are called coherent states, or the coherent states 'closest to classical states' [13]. A precise characterization of the orbit, as well as its distinguished features from the point of view of entanglement theory, will be discussed below.

4. A Short Review of the Representation Theory

Let us recall some fundamentals of the representation theory of semisimple Lie groups and algebras that are useful in the next sections [14]. In the following we denote by $K$ a simply connected compact Lie group and by $\mathfrak{k}$ its Lie algebra. It is a standard fact that representations of $K$ are in one-to-one correspondence with representations of $\mathfrak{k}$. Both possess the complete reducibility property, i.e., they decompose as direct sums of irreducible ones, and can be made unitary by an appropriate choice of the scalar product in the carrier space. Let $\mathfrak{k}^{\mathbb{C}}$ be the complexification of $\mathfrak{k}$. It is also well known that irreducible representations of $\mathfrak{k}$ and $\mathfrak{k}^{\mathbb{C}}$ are in one-to-one correspondence and that $\mathfrak{k}^{\mathbb{C}}$ is a semisimple complex Lie algebra.

Example 1. Consider $K = SU(N)$, which is simply connected and compact. Then $\mathfrak{k} = \mathfrak{su}_N$ and $\mathfrak{k}^{\mathbb{C}} = \mathfrak{sl}_N(\mathbb{C})$.

4.1. Adjoint representation of $\mathfrak{k}^{\mathbb{C}}$. The adjoint representation of $\mathfrak{k}^{\mathbb{C}}$ is defined as

$$\mathrm{ad} : \mathfrak{k}^{\mathbb{C}} \to \mathfrak{gl}(\mathfrak{k}^{\mathbb{C}}), \tag{12}$$

$$\mathrm{ad}_X(Y) = [X, Y]. \tag{13}$$

This representation plays a key role in understanding all other representations of $\mathfrak{k}^{\mathbb{C}}$. Let us fix a maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$; then $\mathfrak{h} = \mathfrak{t}^{\mathbb{C}} = \mathfrak{t} + i\mathfrak{t}$ is a Cartan subalgebra of $\mathfrak{k}^{\mathbb{C}}$. Since $\mathfrak{h}$ is the maximal commutative subalgebra of $\mathfrak{k}^{\mathbb{C}}$ with the property that for every $H \in \mathfrak{h}$ the operator $\mathrm{ad}_H$ is diagonalizable (this is a consequence of the assumed semisimplicity of $K$), we can decompose $\mathfrak{k}^{\mathbb{C}}$ as a direct sum of root spaces with respect to $\mathfrak{h}$,

$$\mathfrak{k}^{\mathbb{C}} = \mathfrak{h} \oplus \bigoplus_{\alpha} \mathfrak{g}_\alpha, \tag{14}$$

where $\alpha : \mathfrak{h} \to \mathbb{C}$ ranges over the nonzero linear functionals (called roots) for which there exists a nonzero $X \in \mathfrak{k}^{\mathbb{C}}$ such that

$$\mathrm{ad}_H(X) = \alpha(H) X \quad \forall H \in \mathfrak{h}. \tag{15}$$
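For the defining representation of $\mathfrak{sl}_N(\mathbb{C})$ the relation (15) can be checked by hand: with $H = \mathrm{diag}(h_1, \dots, h_N)$ traceless, each matrix unit $E_{ij}$ with $i \neq j$ is a root vector with $\alpha(H) = h_i - h_j$. A minimal sketch of this check for $N = 3$ follows; the helper names and the sample $H$ are illustrative assumptions.

```python
# Sketch: matrix units E_ij (i != j) are root vectors of sl_N(C), Eq. (15).

def elementary(i, j, n):
    """Matrix unit E_ij: one in row i, column j, zeros elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def diag(h):
    n = len(h)
    return [[h[r] if r == c else 0 for c in range(n)] for r in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(n)] for r in range(n)]

def scale(s, A):
    return [[s * a for a in row] for row in A]

n = 3
h = [2, -3, 1]              # sample traceless diagonal: an element H of the Cartan subalgebra
H = diag(h)

# ad_H(E_ij) = (h_i - h_j) E_ij for every off-diagonal matrix unit.
roots_ok = all(
    commutator(H, elementary(i, j, n)) == scale(h[i] - h[j], elementary(i, j, n))
    for i in range(n) for j in range(n) if i != j
)
# The number of roots equals dim sl_3 - rank = 8 - 2 = 6.
number_of_roots = sum(1 for i in range(n) for j in range(n) if i != j)
```

With the usual choice of positive roots ($i < j$) the corresponding root vectors are the strictly upper triangular matrix units.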

The space $\mathfrak{g}_\alpha$ consists of the elements $X$ satisfying (15). It is a standard fact that if $\alpha$ is a root then $-\alpha$ is also a root, and that $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] = 0$ or $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] = \mathfrak{g}_{\alpha+\beta}$. Moreover, all $\mathfrak{g}_\alpha$ are one-dimensional. We may introduce the notion of a positive root by first choosing an arbitrary basis consisting of roots in the space spanned by them, and then defining positive roots as those with only positive coefficients in the decomposition in the chosen basis. The weight space decomposition of $\mathfrak{k}^{\mathbb{C}}$ can then be written as

$$\mathfrak{k}^{\mathbb{C}} = \mathfrak{n}_- \oplus \mathfrak{h} \oplus \mathfrak{n}_+, \tag{16}$$

where the direct sums of the negative and positive root spaces, $\mathfrak{n}_-$ and $\mathfrak{n}_+$, are nilpotent Lie algebras. In the defining representation of $\mathfrak{sl}_N(\mathbb{C})$ as $N \times N$ complex traceless matrices, the most natural choice of positive roots is the one that leads to $\mathfrak{n}_-$ and $\mathfrak{n}_+$ being, respectively, strictly lower and strictly upper triangular matrices. It is a key fact that we can choose bases $E_\alpha$ of the root spaces $\mathfrak{g}_\alpha$ and define $H_\alpha = [E_\alpha, E_{-\alpha}]$ so that $\{E_{-\alpha}, H_\alpha, E_\alpha\}$ is the standard basis for $\mathfrak{sl}_2(\mathbb{C})$. We will denote it by $\mathfrak{sl}_2(\alpha)$, by $\mathfrak{su}_2(\alpha)$ the corresponding $\mathfrak{su}_2$-triple $\{E_{-\alpha} - E_\alpha,\; iH_\alpha,\; i(E_{-\alpha} + E_\alpha)\}$, and by $SU_2(\alpha)$ the corresponding group.

4.2. General case. It is enough to restrict our attention to irreducible representations, as $K$ is a compact and simply connected Lie group. Given any representation of $\mathfrak{h}$ on the complex vector space $V$ one decomposes $V$ as a direct sum,

$$V = \bigoplus_{\lambda} V_\lambda, \tag{17}$$

where the isotypical components $V_\lambda$ are weight spaces. In other words,

$$\xi.v = \lambda(\xi) v \quad \forall \xi \in \mathfrak{h} \text{ and } v \in V_\lambda, \tag{18}$$

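where the linear functionals $\lambda$ are called weights and the vectors $v$ the corresponding weight vectors. Every irreducible representation of $\mathfrak{k}^{\mathbb{C}}$ is a so-called highest weight cyclic representation. The most important facts we will use in the next sections are:

– $E_\alpha.V_\lambda \subset V_{\lambda+\alpha}$,
– $[E_\alpha, E_\beta].V_\lambda \subset V_{\lambda+\alpha+\beta}$,

where $E_\alpha \in \mathfrak{g}_\alpha$ and $E_\beta \in \mathfrak{g}_\beta$.

5. Symplectic Orbits of Group Actions

In the following we will need a couple of facts about actions of Lie groups on symplectic manifolds (see e.g. [12]). Let us denote by $(M, \omega)$ a symplectic manifold, i.e. $M$ is a manifold and $\omega$ is a nondegenerate, closed ($d\omega = 0$) two-form. Let a compact semisimple group $K$ act on $M$ via symplectomorphisms, $K \times M \ni (g, x) \mapsto \Phi_g(x) \in M$, $\Phi_g^* \omega = \omega$. We denote by $\mathfrak{k}^*$ the space dual to $\mathfrak{k} = \mathrm{Lie}(K)$. Let $\xi \in \mathfrak{k}$. We define a vector field $\hat\xi$,

$$\hat\xi(x) = \left.\frac{d}{dt}\right|_{t=0} \Phi_{\exp t\xi}(x). \tag{19}$$

Since the action of the group is Hamiltonian (which is true for a semisimple $K$), for each $\xi \in \mathfrak{k}$ there exists a Hamilton function $\mu_\xi : M \to \mathbb{R}$ for $\hat\xi$, i.e.

$$d\mu_\xi = \imath_{\hat\xi}\, \omega := \omega(\hat\xi, \cdot). \tag{20}$$

The function can be chosen to be linear in $\xi$, i.e.

$$\mu_\xi(x) = \langle \mu(x), \xi \rangle, \quad \mu(x) \in \mathfrak{k}^*, \tag{21}$$

where $\langle\,,\rangle$ is the pairing between $\mathfrak{k}$ and its dual $\mathfrak{k}^*$. The map $\mu_\xi$ thus defines, by (21), a map $\mu : M \to \mathfrak{k}^*$. We can choose $\mu$ to be equivariant with respect to the coadjoint action of $K$ [15], i.e.

$$\mu(\Phi_g(x)) = \mathrm{Ad}^*_g\, \mu(x), \tag{22}$$

where the coadjoint action $\mathrm{Ad}^*_g$ on $\mathfrak{k}^*$ is defined via

$$\langle \mathrm{Ad}^*_g \alpha, h \rangle = \langle \alpha, \mathrm{Ad}_{g^{-1}} h \rangle = \langle \alpha, g^{-1} h g \rangle, \quad g \in K, \; h \in \mathfrak{k}, \; \alpha \in \mathfrak{k}^*, \tag{23}$$

and $\mathrm{Ad}$ is the adjoint representation of $K$,

$$\mathrm{Ad} : K \to GL(\mathfrak{k}), \quad \mathrm{Ad}(g) X = g X g^{-1} := \left.\frac{d}{dt}\right|_{t=0} g \exp(tX)\, g^{-1}. \tag{24}$$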

The map $\mu$ constructed above is called the momentum map. The goal now is to describe the criterion for a $K$-orbit to be symplectic. Let $O = K.x$ be the orbit through a point $x \in M$. Denote by $\omega_O$ the restriction of the symplectic form $\omega$ to $O$. This form may, and in fact usually does, have a certain degree of degeneracy. Denote by $D_x$ the subspace of tangent vectors which are $\omega_O$-orthogonal to the full space $T_x O$. Since the $K$ action is symplectic we have $(\Phi_g)_*(D_x) = D_{\Phi_g(x)}$, where $\Phi_g$ denotes the action of $g \in K$ on $M$, which means that the degree of degeneracy is constant on the orbit $O$. This fact will turn out to be very important in the context of entanglement measures. Now, because of (22), $\mu(O) = \Omega$ is a coadjoint orbit in $\mathfrak{k}^*$ and thus is symplectic with respect to the canonical form $\omega_\Omega$ on $\Omega$ (see the Appendix). We also have $(\mu|_O)^*(\omega_\Omega) = \omega_O$, which means that the tangent spaces of the fibers of $\mu|_O$ are exactly the degeneracy spaces $D_x$. Indeed, if $u \in D_x$ then for an arbitrary $v \in T_x O$,

$$0 = \omega_O(u, v) = \omega_\Omega\big((\mu|_O)_* u, (\mu|_O)_* v\big). \tag{25}$$

But $\omega_\Omega$ is nondegenerate and $T_{\mu(x)}\Omega = (\mu|_O)_* T_x O$. Thus $(\mu|_O)_* u = 0$ whenever $u \in D_x$. As a conclusion we get

Theorem 1. A $K$-orbit $O = K.x$ in $M$ is symplectic if and only if the restriction of the moment map $\mu|_O$ is a diffeomorphism onto a coadjoint orbit $\Omega$.

Suppose now that $O$ defined as above is symplectic. This means that the $K$-action on $O$ is the same as the coadjoint $K$-action on its $\mu$-image (because $\mu$ is a diffeomorphism). Since $K$ is compact there exists an $\mathrm{Ad}$-invariant scalar product $(\,\cdot\,|\,\cdot\,)$ on the carrier space $\mathfrak{k}$,

$$(\mathrm{Ad}(g)X \,|\, \mathrm{Ad}(g)Y) = (X \,|\, Y), \quad \forall X, Y \in \mathfrak{k}, \; g \in K. \tag{26}$$

In particular every operator $\mathrm{Ad}(g)$ is unitary, the operators $\mathrm{ad}_X$ are anti-Hermitian ($\mathrm{ad}_X^* = -\mathrm{ad}_X$), and

$$(\mathrm{ad}_X Y \,|\, Z) = -(Y \,|\, \mathrm{ad}_X Z). \tag{27}$$
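For $K = SU(2)$ one concrete Ad-invariant scalar product is $(X|Y) = -\mathrm{tr}(XY)$. This particular choice is an assumption made here for illustration; the text only asserts that some invariant product exists. The sketch below checks (26) and (27) numerically for sample elements; all names and sample coefficients are hypothetical.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def inner(X, Y):
    """Assumed invariant product on su(2): (X|Y) = -tr(XY)."""
    P = matmul(X, Y)
    return -(P[0][0] + P[1][1])

def su2_algebra(a, b, c):
    """a*(i sigma_x) + b*(i sigma_y) + c*(i sigma_z): traceless, anti-Hermitian."""
    return [[1j * c, 1j * a + b], [1j * a - b, -1j * c]]

def su2_group(theta, phi, psi):
    """A sample element of SU(2)."""
    p = cmath.exp(1j * phi) * math.cos(theta)
    q = cmath.exp(1j * psi) * math.sin(theta)
    return [[p, q], [-q.conjugate(), p.conjugate()]]

def Ad(g, X):
    """Adjoint action g X g^{-1}; for unitary g the inverse is the dagger."""
    return matmul(matmul(g, X), dagger(g))

X = su2_algebra(0.3, -1.2, 0.7)
Y = su2_algebra(1.5, 0.4, -0.9)
Z = su2_algebra(-0.2, 0.8, 1.1)
g = su2_group(0.4, 1.1, -0.6)

# Eq. (26): (Ad(g)X | Ad(g)Y) = (X | Y)
invariance_defect = abs(inner(Ad(g, X), Ad(g, Y)) - inner(X, Y))
# Eq. (27): (ad_X Y | Z) = -(Y | ad_X Z)
antisymmetry_defect = abs(inner(commutator(X, Y), Z) + inner(Y, commutator(X, Z)))
```

Both defects vanish up to rounding because of the cyclicity of the trace.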

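We may use the invariant scalar product (26) to identify $\mathfrak{k}$ with $\mathfrak{k}^*$. More specifically, we know that for any $\alpha \in \mathfrak{k}^*$ there exists $X \in \mathfrak{k}$ such that $\alpha = (X \,|\, \cdot\,)$. Upon such identification coadjoint orbits are exactly adjoint ones. To see this consider $\alpha \in \mathfrak{k}^*$. We know that

$$\alpha = (X \,|\, \cdot\,) \equiv \alpha_X, \tag{28}$$

for some $X \in \mathfrak{k}$. We need to show that $\mathrm{Ad}^*_g \alpha$ (defined by (23)) is equal to $\alpha_{\mathrm{Ad}(g)X}$. We have

$$\langle \mathrm{Ad}^*_g \alpha_X, Y \rangle = \langle \alpha_X, \mathrm{Ad}(g^{-1}) Y \rangle = (X \,|\, \mathrm{Ad}(g^{-1}) Y) = (\mathrm{Ad}(g) X \,|\, Y) = \langle \alpha_{\mathrm{Ad}(g)X}, Y \rangle, \quad \forall g \in K, \tag{29}$$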
which is exactly what we wanted to show. Now we have the important fact that the adjoint action of $K$ on $\mathfrak{t}$ (the maximal commutative subalgebra of $\mathfrak{k}$) gives the whole of $\mathfrak{k}$. This observation is true for any compact group, but in the following we will need only its illustration by a familiar example.

Example 2. Let $K = SU(n)$ with the Lie algebra $\mathfrak{k} = \mathfrak{su}_n$ of traceless anti-Hermitian matrices. A maximal commutative subalgebra $\mathfrak{t}$ of $\mathfrak{k}$ consists of the traceless diagonal matrices $\mathrm{diag}(it_1, \dots, it_n)$, where $t_k \in \mathbb{R}$. It is a well-known fact that every anti-Hermitian matrix has a purely imaginary spectrum and can be diagonalized by a unitary operator. Therefore, taking any $X \in \mathfrak{k}$ we can find $U \in U(n)$ such that

$$U X U^{-1} = \mathrm{diag}(it_1, \dots, it_n), \quad t_k \in \mathbb{R} \;\; \forall k. \tag{30}$$

Moreover, we can choose $SU(n) \ni U_1 = \det(U)^{-1/n}\, U$ so that $\det(U_1) = 1$ and

$$X = U_1^{-1}\, \mathrm{diag}(it_1, \dots, it_n)\, U_1, \quad t_k \in \mathbb{R} \;\; \forall k. \tag{31}$$

Hence indeed, every matrix $X \in \mathfrak{k}$ can be obtained from $\mathfrak{t}$ by the adjoint action.

As a consequence, every adjoint orbit contains an element of the maximal commutative subalgebra $\mathfrak{t}$, which is fixed by the adjoint action of the maximal torus $T \subset K$, where the torus $T$ is obtained by exponentiating $\mathfrak{t}$ ($T = \{e^{t} : t \in \mathfrak{t}\}$). Indeed, since $T$ is Abelian it fixes its elements by conjugation, $t\, t'\, t^{-1} = t'$ for $t, t' \in T$. By differentiation this translates to fixing the elements of $\mathfrak{t}$ (and, consequently, of $\mathfrak{t}^*$) by the adjoint (coadjoint) action of $T$. Combining this observation with Theorem 1, which establishes the diffeomorphism of a symplectic orbit with some coadjoint one, we arrive at the following conclusion.

Theorem 2. If an orbit $O$ of $K$ through $x \in M$ is symplectic then the set of points on $O$ fixed by the action of $T$ is nonempty, $\mathrm{Fix}_O(T) \neq \emptyset$.

If the point $x \in M$ is fixed by an element $g \in K$ then, by the equivariance of the moment map, its $\mu$-image is also fixed by the adjoint action $\mathrm{Ad}(g)$. The degeneracy subspaces $D_x$ originate from the nontrivial action of those symplectomorphisms $g$ for which the corresponding $\mathrm{Ad}(g)$-action on $\mu(x)$ is trivial. Thus we have the following theorem [11,12].


Theorem 3. The orbit of $K$ through $x \in M$ is symplectic if and only if the stabilizer subgroup (of the $K$-action) of $x$ is the same as the stabilizer subgroup (of the $\mathrm{Ad}^*$-action) of $\mu(x)$.

It is always true that $\mathrm{Stab}(x) \subset \mathrm{Stab}(\mu(x))$, hence:

Corollary 1. The dimension of the degeneracy subspace $D_x$ for an orbit $O = K.x$ does not depend on $x \in O$ and can be computed as

$$D(x) = \dim(D_x) = \dim \mathrm{Stab}(\mu(x)) - \dim \mathrm{Stab}(x) = \dim(K.x) - \dim(K.\mu(x)). \tag{32}$$

This means we can associate with every orbit of the $K$-action a non-negative integer $D(x)$ which measures the degree of its nonsymplecticity.

6. Symplectic Orbits in the Space of States

In the case of pure states $M = P(V)$. The canonical symplectic form on $P(V)$, the moment map, and the symplectic orbits of a unitary $K$ action can be calculated as follows [11,12]. For $A \in \mathfrak{u}(V)$ let $A_x \in T_x P(V)$ be the vector tangent at $t = 0$ to the curve $t \mapsto \pi(\exp(tA)v)$, where $x = \pi(v)$, $v \in V$, $\|v\| = 1$ and $\pi : V \to P(V)$ is the canonical projection. When $A$ runs through the whole Lie algebra $\mathfrak{u}(V)$ the corresponding $A_x$ span $T_x P(V)$, and for $A, B \in \mathfrak{u}(V)$ we obtain

$$\omega_x(A_x, B_x) = -\mathrm{Im}\,\langle Av | Bv \rangle = \frac{i}{2} \langle [A, B] v | v \rangle. \tag{33}$$
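The equality of the two expressions in (33) relies only on $A$ and $B$ being anti-Hermitian. A minimal numerical sketch for $V = \mathbb{C}^2$ follows, assuming the physicists' convention that $\langle u | w \rangle$ is conjugate-linear in $u$; the sample matrices, the vector, and the helper names are illustrative assumptions.

```python
# Sketch: check  -Im<Av|Bv> = (i/2) <[A,B]v|v>  for anti-Hermitian A, B.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def inner(u, w):
    """<u|w>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, w))

def is_anti_hermitian(A, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] + complex(A[j][i]).conjugate()) < tol
               for i in range(n) for j in range(n))

# Sample elements of u(2); the second one is deliberately not traceless.
A = [[0.5j, -1.2 + 0.3j], [1.2 + 0.3j, -0.5j]]
B = [[0.2j, 0.7 - 0.1j], [-0.7 - 0.1j, 0.9j]]
v = [0.6, 0.8j]                                   # unit vector in C^2

lhs = -inner(matvec(A, v), matvec(B, v)).imag
rhs = 0.5j * inner(matvec(commutator(A, B), v), v)
```

Here `rhs` comes out real because $[A, B]$ is again anti-Hermitian, so $\langle [A,B]v | v \rangle$ is purely imaginary.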

The equivariant moment map $\mu : P(V) \to \mathfrak{u}^*(V)$ for the action of $U(V)$ on $P(V)$ is given by

$$\mu_A(x) = \frac{1}{2} \langle v | A v \rangle. \tag{34}$$

The group $K$ acts on $V$ via a unitary representation $K \to U(V)$. The restriction of $\omega$ to $K.x$ can be calculated as above, but now $A$ and $B$ are restricted to elements of $\mathfrak{k}$. From Sect. 5 we know that a necessary condition for an orbit to be symplectic is that it possesses a point fixed by the maximal torus $T$ of $K$. From the definition of weights and weight vectors (18) it easily follows that, in the case of a $K$-action via a unitary representation on a projective space $P(V)$, the fixed points of the $T$ action are exactly the weight vectors. Hence Theorem 2 can be reformulated as:

Fact 1. Let $K$ act on $P(V)$ by a unitary representation on a Hilbert space $V$. If $O = K.x$ is a symplectic $K$-orbit then $O$ contains a point $x = \pi(v)$, where $v$ is a $T$-weight vector, i.e., $v \in V_\lambda$ for some weight $\lambda$.

A sufficient condition for $O = K.x$ to be symplectic is, of course, given in Theorem 3, but we want to have it in a more useful form. It is enough to restrict our attention to orbits passing through the weight vectors. For $v \in V_\lambda$ we consider the tangent space $T_x O$ equipped with the 2-form $\omega_x$. Let $\alpha$ be a positive root and define $O_\alpha$ to be the orbit of $SU_2(\alpha)$ of the associated $SU_2$-triple. Let $P_\alpha$ denote the tangent space to $O_\alpha$ at the point $x$. The tangent space $T_x O$ can of course be considered as the collection of the $P_\alpha$, where $\alpha$ ranges over all positive roots. Let $\mathfrak{k}^{\mathbb{C}}$ be the complexification of $\mathfrak{k}$, the Lie algebra of $K$. It has the root-space decomposition

$$\mathfrak{k}^{\mathbb{C}} = \mathfrak{t}^{\mathbb{C}} \oplus \bigoplus_{\alpha} \mathbb{C} E_\alpha, \tag{35}$$

where $E_\alpha$ is a root vector corresponding to the root $\alpha$, hence $[E_\alpha, E_{-\alpha}] = H_\alpha \in \mathfrak{t}^{\mathbb{C}}$. The corresponding decomposition of $\mathfrak{k}$ reads

$$\mathfrak{k} = \mathfrak{t} \oplus \bigoplus_{\alpha} \mathbb{R}\,(E_\alpha - E_{-\alpha}) \oplus \bigoplus_{\alpha} \mathbb{R}\, i\,(E_\alpha + E_{-\alpha}), \tag{36}$$

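where $\alpha$ ranges over all positive roots.

Fact 2. If $\alpha$ and $\beta$ are different positive roots, then the tangent planes $P_\alpha$ and $P_\beta$ are $\omega$-orthogonal.

Proof. The symplectic form $\omega_x$ is given as

$$\omega_x(A_x, B_x) = \frac{i}{2} \langle [A, B] v | v \rangle. \tag{37}$$

We have assumed that $v \in V_\lambda$. If $[A, B]v$ is in some other weight space, the right-hand side of (37) vanishes since two different weight spaces are orthogonal. We know that $P_\alpha = \mathrm{Span}\{(E_\alpha - E_{-\alpha}).v,\; i(E_\alpha + E_{-\alpha}).v\}$ and $P_\beta = \mathrm{Span}\{(E_\beta - E_{-\beta}).v,\; i(E_\beta + E_{-\beta}).v\}$. We also know that $[E_\alpha, E_\beta].v \in V_{\lambda+\alpha+\beta}$ or is equal to zero. But $V_{\lambda+\alpha+\beta}$ is orthogonal to $V_\lambda$. The same holds for the remaining commutators $[E_{\pm\alpha}, E_{\pm\beta}].v \in V_{\lambda\pm\alpha\pm\beta}$, since $\pm\alpha \pm\beta \neq 0$ for different positive roots. Consequently, if $A_x \in P_\alpha$ and $B_x \in P_\beta$, then

$$\omega_x(A_x, B_x) = \frac{i}{2} \langle [A, B] v | v \rangle = 0, \tag{38}$$

which is what we wanted to prove.

Summing up, we know that $T_x O = \bigoplus_\alpha P_\alpha$ and that the spaces $P_\alpha$ are $\omega$-orthogonal. Hence $T_x O$ is a symplectic vector space if and only if all $P_\alpha$ are symplectic.

Fact 3. The space $P_\alpha$ is symplectic if and only if $\langle [E_\alpha, E_{-\alpha}] v | v \rangle \neq 0$.

Proof. Let $A_x \in P_\alpha$ and $B_x \in P_\alpha$. Computing

$$\omega_x(A_x, B_x) = \frac{i}{2} \langle [A, B] v | v \rangle, \tag{39}$$

we see that only the term $\langle [E_\alpha, E_{-\alpha}] v | v \rangle$ can give a nonzero result, and when it indeed does not vanish then $P_\alpha$ is symplectic, which is what we wanted to prove.

Thus $T_x O$ is symplectic when the following implication is true:

$$\langle [E_\alpha, E_{-\alpha}] v | v \rangle = 0 \;\Rightarrow\; P_\alpha = 0. \tag{40}$$

The left-hand side of (40) can be rewritten using

$$[E_\alpha, E_{-\alpha}] v = H_\alpha.v = \lambda(H_\alpha) v, \tag{41}$$

where $\lambda$ is the weight of $v$. For the right-hand side of (40) recall that $P_\alpha = \mathrm{Span}\{(E_\alpha - E_{-\alpha}).v,\; i(E_\alpha + E_{-\alpha}).v\}$. So $P_\alpha = 0$ means $E_\alpha v = 0 = E_{-\alpha} v$. Hence [11],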

Theorem 4 (Kostant-Sternberg). The orbit $O = K.x$, $x = \pi(v)$, $v \in V_\lambda$ for some weight $\lambda$, is symplectic if and only if for every positive root $\alpha$ with $\lambda(H_\alpha) = 0$ it follows that $E_\alpha v = 0 = E_{-\alpha} v$.

To demonstrate how this theorem works we will prove that the orbit through the highest weight vector is always symplectic. As mentioned in Subsect. 4.2, every unitary irreducible representation of the compact semisimple group $K$ is a highest weight representation. The highest weight vector is defined as follows.

Definition 1. Let $K$ be a compact semisimple Lie group and denote by $\mathfrak{k}$ its Lie algebra and by $\mathfrak{k}^{\mathbb{C}}$ its complexification. Then $\mathfrak{k}^{\mathbb{C}}$ admits the decomposition (35). The weight vector $v \in V_\lambda$ of an irreducible representation of $K$ (respectively $\mathfrak{k}$ or $\mathfrak{k}^{\mathbb{C}}$) is a highest weight vector if and only if

$$E_\alpha.v = 0, \tag{42}$$

where $\alpha$ ranges over all positive roots.

Let us take $v \in V_\lambda$, the highest weight vector of an irreducible representation of $\mathfrak{k}^{\mathbb{C}}$, and consider the corresponding orbit $O = K.x$, where $x = \pi(v)$. It is easy to see that, according to Definition 1, $v$ is also the highest weight vector for all $\mathfrak{sl}_2(\alpha)$-triples. From representation theory we know that the weights of an irreducible representation of $\mathfrak{sl}_2(\alpha)$ are $W = \{-n, -n+2, \dots, n-2, n\}$, where $n \ge 0$. This means

$$H_\alpha.v = \lambda(H_\alpha) v = n v. \tag{43}$$

So $\lambda(H_\alpha) = n$. The only interesting $\alpha$ is the one for which $n = 0$. But then we have a one-dimensional, hence trivial, representation of $\mathfrak{sl}_2(\alpha)$. This, of course, means $E_{-\alpha}.v = 0 = E_\alpha.v$. Making use of Theorem 4 we see that $O$ is symplectic. In fact this orbit is not only symplectic but also Kähler (see the Appendix). Indeed, from Theorem 6 we know that to prove this it is enough to check that this orbit is a complex manifold, since $P(V)$ is a positive Kähler manifold (see the Appendix). But if $v$ is the highest weight vector then the tangent space is $T_x O = \bigoplus_\alpha P_\alpha$, where $\alpha$ ranges over the positive roots and $P_\alpha = \mathrm{Span}\{E_{-\alpha}.v,\; iE_{-\alpha}.v\}$. So $T_x O$ is stable under multiplication by $i$, hence $O$ is complex.

7. Distinguishable Particles

7.1. Two qubits case. In the simplest case of two qubits we may directly use the Kostant-Sternberg theorem from the last section. The Hilbert space is then $H = \mathbb{C}^2 \otimes \mathbb{C}^2$ and the direct product $K = SU(2) \times SU(2)$ acts on $H$ in a natural way,

$$(g_1, g_2)\, v_1 \otimes v_2 = g_1 v_1 \otimes g_2 v_2, \tag{44}$$

where $g_1, g_2 \in SU(2)$ and $v_1, v_2 \in \mathbb{C}^2$. Our first goal is to identify the symplectic orbits of $K$. To apply the theorems and facts established in the previous sections we start with the root-space decomposition of the Lie algebra $\mathfrak{k}^{\mathbb{C}} = \mathfrak{sl}_2(\mathbb{C}) \oplus \mathfrak{sl}_2(\mathbb{C})$, i.e., the complexification of $\mathfrak{k} = \mathfrak{su}_2 \oplus \mathfrak{su}_2$, the Lie algebra of $K$. The algebra $\mathfrak{k}^{\mathbb{C}}$ is semisimple as a direct sum of the simple algebras $\mathfrak{sl}_2(\mathbb{C})$. Let us recall that

$$\mathfrak{sl}_2(\mathbb{C}) = \mathrm{Span}\{X, H, Y\}, \quad [H, X] = 2X, \quad [H, Y] = -2Y, \quad [X, Y] = H. \tag{45}$$

The Cartan subalgebra of $\mathfrak{sl}_2(\mathbb{C})$ is spanned by $H$, whereas $\mathrm{Span}\{X\}$ and $\mathrm{Span}\{Y\}$ are the positive and negative root spaces. An element of $\mathfrak{k}^{\mathbb{C}}$ can be written as $(Z_1, Z_2)$, where $Z_1, Z_2 \in \mathfrak{sl}_2(\mathbb{C})$. We also have

$$[(Z_1, Z_2), (W_1, W_2)] = ([Z_1, W_1], [Z_2, W_2]). \tag{46}$$

Knowing this we find that the Cartan subalgebra of $\mathfrak{k}^{\mathbb{C}}$ is $\mathfrak{t} = \mathrm{Span}\{(H, 0), (0, H)\}$. The commutation relations read

$$\begin{aligned}
{[(H,0),(X,0)]} &= 2(X,0), & [(H,0),(Y,0)] &= -2(Y,0), & [(X,0),(Y,0)] &= (H,0), \\
[(0,H),(0,X)] &= 2(0,X), & [(0,H),(0,Y)] &= -2(0,Y), & [(0,X),(0,Y)] &= (0,H), \\
[(0,W),(Z,0)] &= 0.
\end{aligned} \tag{47}$$

Since $\mathfrak{k}^{\mathbb{C}}$ is semisimple, its root spaces are one-dimensional. We have the following roots (computed in the basis $\{(H, 0), (0, H)\}$ of the Cartan subalgebra $\mathfrak{t}$) and the corresponding root spaces,

$$\alpha = (2, 0), \quad V_\alpha = \mathrm{Span}\{(X, 0)\}, \tag{48}$$
$$-\alpha = (-2, 0), \quad V_{-\alpha} = \mathrm{Span}\{(Y, 0)\}, \tag{49}$$
$$\beta = (0, 2), \quad V_\beta = \mathrm{Span}\{(0, X)\}, \tag{50}$$
$$-\beta = (0, -2), \quad V_{-\beta} = \mathrm{Span}\{(0, Y)\}. \tag{51}$$

Thus we have the following decomposition of $\mathfrak{k}^{\mathbb{C}}$,

$$\mathfrak{k}^{\mathbb{C}} = \mathfrak{n}_- \oplus \mathfrak{t} \oplus \mathfrak{n}_+, \tag{52}$$
$$\mathfrak{n}_- = \mathrm{Span}\{(Y, 0), (0, Y)\}, \tag{53}$$
$$\mathfrak{n}_+ = \mathrm{Span}\{(X, 0), (0, X)\}, \tag{54}$$
$$\mathfrak{t} = \mathrm{Span}\{(H, 0), (0, H)\}, \tag{55}$$

where $\mathfrak{n}_-$ and $\mathfrak{n}_+$ are the sums of the negative and positive root spaces, respectively. Let

$$e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{56}$$

be the standard basis of $\mathbb{C}^2$. The Lie algebra $\mathfrak{sl}_2(\mathbb{C})$ then acts via the defining representation,

$$H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{57}$$

The highest weight vector equals $e_1$ and there are just two weight spaces, one spanned by $e_1$ and the other by $e_2$. The corresponding weights are $1$ and $-1$. The action of $(Z_1, Z_2) \in \mathfrak{k}^{\mathbb{C}}$ on $H$ is given by

$$(Z_1, Z_2)\, v_1 \otimes v_2 = Z_1 v_1 \otimes v_2 + v_1 \otimes Z_2 v_2. \tag{58}$$

It is easy to guess that the highest weight vector for the above representation equals $e_1 \otimes e_1$. Indeed, it is an eigenvector of the Cartan subalgebra and is annihilated by all elements of $\mathfrak{n}_+$. The weight spaces are obtained by successive action of $\mathfrak{n}_-$ on $e_1 \otimes e_1$. In the basis $\{(H, 0), (0, H)\}$ the weights and weight vectors read

$$\lambda_1 = (1, 1), \quad v_1 = e_1 \otimes e_1, \tag{59}$$
$$\lambda_2 = (1, -1), \quad v_2 = e_1 \otimes e_2, \tag{60}$$
$$\lambda_3 = (-1, 1), \quad v_3 = e_2 \otimes e_1, \tag{61}$$
$$\lambda_4 = (-1, -1), \quad v_4 = e_2 \otimes e_2. \tag{62}$$
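The weights listed in (59)–(62) can be recomputed directly from the action (58). The sketch below stores a vector $\sum_{ij} C_{ij}\, e_i \otimes e_j$ as its $2 \times 2$ coefficient matrix $C$, in which case $(Z_1, Z_2)$ acts as $C \mapsto Z_1 C + C Z_2^T$; this encoding and all helper names are illustrative assumptions.

```python
# Sketch: weights of e_i (x) e_j under the Cartan elements (H,0) and (0,H).

H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
ZERO = [[0, 0], [0, 0]]

def act(Z1, Z2, C):
    """Eq. (58) on coefficient matrices: (Z1, Z2).C = Z1 C + C Z2^T."""
    return [[sum(Z1[i][k] * C[k][j] for k in range(2)) +
             sum(C[i][k] * Z2[j][k] for k in range(2))
             for j in range(2)] for i in range(2)]

def basis(i, j):
    """Coefficient matrix of e_{i+1} (x) e_{j+1} (zero-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(2)] for r in range(2)]

def eigenvalue(Z1, Z2, C):
    """Return s if (Z1, Z2).C = s*C, and None if C is not an eigenvector."""
    D = act(Z1, Z2, C)
    ratios = {D[i][j] / C[i][j] for i in range(2) for j in range(2) if C[i][j] != 0}
    zero_ok = all(D[i][j] == 0 for i in range(2) for j in range(2) if C[i][j] == 0)
    return ratios.pop() if len(ratios) == 1 and zero_ok else None

# Weight of e_{i+1} (x) e_{j+1}, keyed by the one-based index pair.
weights = {
    (i + 1, j + 1): (eigenvalue(H, ZERO, basis(i, j)), eigenvalue(ZERO, H, basis(i, j)))
    for i in range(2) for j in range(2)
}

# e1 (x) e1 is annihilated by n_+ = Span{(X,0), (0,X)}: it is the highest weight vector.
annihilated = all(
    entry == 0
    for Z1, Z2 in ((X, ZERO), (ZERO, X))
    for row in act(Z1, Z2, basis(0, 0))
    for entry in row
)
# (Y,0) lowers e1 (x) e1 to e2 (x) e1.
lowered = act(Y, ZERO, basis(0, 0))
```

No weight has a vanishing component, which is the property used in the argument that follows.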

The sl2 (C) triples corresponding to the positive roots of kC are {(X, 0), (H, 0), (Y, 0)} and {(0, X ), (0, H ), (0, Y )}. To decide if an orbit through a weight vector is symplectic it is enough to check if λ((0, H )) = 0 and λ((H, 0))) = 0, where λ is one of the weights from the list above. Since weights are given by two nonzero numbers (n 1 , n 2 ), we find Fact 4. In the case of two qubits, only the orbits through weight vectors are symplectic in the projective space P(C2 ⊗ C2 ). In fact all weight vectors lie on the same orbit which is Kähler and contains all separable states. Orbits through entangled states are not symplectic Let us now consider the states e1 ⊗ e2 ± e2 ⊗ e1 which are not weight vectors and are not separable. The orbits through them are not symplectic and we can ask what is the dimension of the degeneracy subspace for them. We need to examine which vectors from the tangent space to the orbit of SU (2) × SU (2) are tangent to the fibers of the corresponding moment map μ. We already know that 1 (63) μ A (x) = v|Av, x = π(v). 2 In our case we have, 1 μ(Z 1 ,Z 2 ) (x) = e1 ⊗ e2 ± e2 ⊗ e1 |(Z 1 , Z 2 )(e1 ⊗ e2 ± e2 ⊗ e1 ) 2 = tr(Z 1 ) + tr(Z 2 ) = 0, (Z 1 , Z 2 ) ∈ k, (64) where the first equality was obtained by a direct computation and the second one is a consequence of the zero-trace property of the matrices from su2 . Thus the degeneracy space is the whole tangent space to the orbit through state e1 ⊗ e2 ± e2 ⊗ e1 . This space can be directly computed, as it is spanned by the projection of vectors given by (Z 1 , Z 2 )(e1 ⊗ e2 ± e2 ⊗ e1 ) = Z 1 e1 ⊗ e2 + e1 ⊗ Z 2 e2 ± Z 1 e2 ⊗ e1 ± e2 ⊗ Z 2 e2 . (65) Using the Pauli matrices multiplied by the imaginary unit i as a basis for su2 and the formula (57) we obtain that in both cases the tangent space is three dimensional. 
In the case of the state e1 ⊗ e2 + e2 ⊗ e1 it is spanned by {i(e1 ⊗ e1 + e2 ⊗ e2 ), (e2 ⊗ e2 − e1 ⊗ e1 ), i(e1 ⊗ e2 − e2 ⊗ e1 )}, whereas for the state e1 ⊗ e2 − e2 ⊗ e1 it is spanned by {i(e2 ⊗ e2 − e1 ⊗ e1 ), (e1 ⊗ e1 + e2 ⊗ e2 ), i(e1 ⊗ e2 + e2 ⊗ e1 )}. The conclusion is that we can use the dimension of the degeneracy space as a measure of entanglement. In principle a similar reasoning directly using the Kostant-Sternberg theorem can be applied in cases of larger dimensions of subsystems and/or for many-partite systems involving multiple tensor products of spaces with arbitrary dimensions, but explicit calculations become prohibitively complicated. In the next section we present a method allowing for finding the degeneracy spaces for bipartite systems of arbitrary dimensions based on the Singular Value Decomposition (SVD) of a matrix, and in the following one we show how to extend the reasoning to a multipartite case where the direct application of SVD is not possible.
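The dimension count above can be checked numerically. The following is a minimal sketch, not part of the original computation: it applies the six generators $i\sigma_k \otimes I$, $I \otimes i\sigma_k$ to a state, projects out the component along the state itself, and takes the real rank of the resulting vectors. The function name `orbit_tangent_dim` and the tolerance are illustrative assumptions.

```python
import numpy as np

# Pauli matrices; the matrices i*sigma_k span su(2).
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)


def orbit_tangent_dim(psi, tol=1e-10):
    """Real dimension of the tangent space at [psi] to the SU(2) x SU(2)
    orbit in the projective space P(C^2 (x) C^2)."""
    psi = psi / np.linalg.norm(psi)
    generators = [np.kron(1j * s, I2) for s in SIGMA]
    generators += [np.kron(I2, 1j * s) for s in SIGMA]
    rows = []
    for Z in generators:
        w = Z @ psi
        w = w - psi * np.vdot(psi, w)  # project out the complex line C*psi
        rows.append(np.concatenate([w.real, w.imag]))  # view C^4 as R^8
    return np.linalg.matrix_rank(np.array(rows), tol=tol)


e1, e2 = np.eye(2, dtype=complex)
bell_plus = np.kron(e1, e2) + np.kron(e2, e1)
bell_minus = np.kron(e1, e2) - np.kron(e2, e1)
separable = np.kron(e1, e1)

dims = {
    "e1e2 + e2e1": orbit_tangent_dim(bell_plus),
    "e1e2 - e2e1": orbit_tangent_dim(bell_minus),
    "e1e1": orbit_tangent_dim(separable),
}
```

For the two entangled states the rank is 3, in agreement with the text, while for the separable state it is 4, the real dimension of $\mathbb{CP}^1 \times \mathbb{CP}^1$.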



8. Degeneracy Subspaces and SVD

The method of determining the dimension of the degeneracy space presented in the previous section can be extended to a more general case of two distinguishable particles, but in this case one can achieve the goal in a less cumbersome manner by invoking the Singular Value Decomposition of an arbitrary complex matrix [16]. We will present the solution for two distinguishable but otherwise identical particles (i.e., living in spaces of the same dimension $N$). A generalization to unequal dimensions of the spaces needs only a little more effort. The Hilbert space is thus now $\mathcal{H} = \mathbb{C}^N \otimes \mathbb{C}^N$. Let us fix an orthonormal basis $\{e_i : i = 1, \dots, N\}$ of $\mathbb{C}^N$ (e.g., the standard one, where $e_i$ is a column vector with one in the $i$th position and zeros elsewhere). Any state $|\Psi\rangle \in \mathcal{H}$ can be decomposed as
\[ |\Psi\rangle = \sum_{i,j=1}^{N} C_{ij}\, e_i \otimes e_j. \tag{66} \]

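The action of $U \otimes V \in SU(N) \times SU(N)$ gives
\[ U \otimes V |\Psi\rangle = \sum_{i,j=1}^{N} C_{ij}\, U e_i \otimes V e_j = \sum_{i,j,k,l=1}^{N} C_{ij} U_{ki}\, e_k \otimes V_{lj}\, e_l = \sum_{k,l=1}^{N} (U C V^T)_{kl}\, e_k \otimes e_l. \tag{67} \]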
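Formula (67) is easy to confirm numerically. The sketch below is an illustration under stated assumptions: the state vector is the row-major flattening of the coefficient matrix $C$, which matches the ordering of `np.kron`, and the helper `random_unitary` is a hypothetical name. The identity is purely algebraic, so the matrices are only taken unitary, not special unitary.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_unitary(n, rng):
    """Unitary matrix obtained from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q


N = 3
C = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
U, V = random_unitary(N, rng), random_unitary(N, rng)

# Row-major flattening stores C_ij at position i*N + j,
# which is the coefficient of e_i (x) e_j in the np.kron ordering.
lhs = np.kron(U, V) @ C.reshape(-1)   # (U (x) V) |Psi>
rhs = (U @ C @ V.T).reshape(-1)       # coefficients (U C V^T)_kl
identity_67_holds = np.allclose(lhs, rhs)
```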

It is a well known fact that any complex matrix can be brought to a diagonal form by the simultaneous left and right action of the unitary group, achieving the SVD, i.e., there exist unitary $\tilde U$, $\tilde V$ such that
\[ \tilde U C \tilde V = \mathrm{diag}(0, \dots, 0, \nu_1, \dots, \nu_2, \dots, \nu_K), \tag{68} \]

where $\nu_i > 0$ and $\{0, \dots, 0, \nu_1^2, \dots, \nu_2^2, \dots, \nu_K^2\}$ constitute the spectrum of $C^\dagger C$ (and, equivalently, the spectrum of $C C^\dagger$). Taking $U = \tilde U$ and $V = \tilde V^T$ in (67) we conclude that the orbit of $SU(N) \times SU(N)$ through any state $|\Psi\rangle$ contains a point which can be written as
\[ |\Psi'\rangle = \sum_{i=1}^{N} p_i\, e_i \otimes e_i, \tag{69} \]
with $p_i \geq 0$ and $\sum_{i=1}^{N} p_i^2 = 1$. We denote by $m_i$ the multiplicity of $\nu_i$ and by $m_0$ the dimension of the kernel of $C$, hence $m_0 + \sum_{n=1}^{K} m_n = N$. We can use the state $|\Psi'\rangle$ to compute the dimension of the orbit through $|\Psi\rangle$. Crucial for this is the observation [17] that $|\Psi'\rangle$ is stabilized by the action of $U \otimes V$, where
\[ U = \begin{pmatrix} u_0 & & & \\ & u_1 & & \\ & & \ddots & \\ & & & u_K \end{pmatrix}, \qquad V = e^{i\phi} \begin{pmatrix} v_0 & & & \\ & u_1 & & \\ & & \ddots & \\ & & & u_K \end{pmatrix}, \tag{70} \]
and $u_1, \dots, u_K$ are arbitrary unitary operators from, respectively, $U(m_1), \dots, U(m_K)$. Both $u_0$ and $v_0$ belong to $U(m_0)$, and $\det(u_0)$ and $\det(v_0)$ are fixed by the determinants of $u_1, \dots, u_K$ in a way ensuring that the matrices $U$, $V$ are special unitary. Knowing this we can compute the dimension of the orbit $\mathcal{O}$ of $G = SU(N) \times SU(N)$ through $|\Psi\rangle$ in the projective space $P(\mathcal{H})$ as
\[ \dim(\mathcal{O}) = (2N^2 - 2) - \Big[ (2m_0^2 - 2) + 1 + \sum_{n=1}^{K} m_n^2 \Big] = 2N^2 - 2m_0^2 - \sum_{n=1}^{K} m_n^2 - 1, \tag{71} \]
where we used $\dim(U(n)) = n^2 = \dim(SU(n)) + 1$. The dimensions of the two $U(m_0)$ blocks are diminished by one due to the determinant fixing condition stated above, and an additional one is subtracted due to the projection on $P(\mathcal{H})$. To compute the dimension of the coadjoint orbit in the dual space to $\mathfrak{k} = \mathfrak{su}_N \oplus \mathfrak{su}_N$ associated with $|\Psi'\rangle$ via the moment map $\mu$, let us calculate
\[ \mu_{(A,B)}\Big( \sum_{i=1}^{N} p_i\, e_i \otimes e_i \Big) = \Big\langle \sum_{i=1}^{N} p_i\, e_i \otimes e_i \,\Big|\, (A \otimes I + I \otimes B) \sum_{j=1}^{N} p_j\, e_j \otimes e_j \Big\rangle = \sum_{i,j=1}^{N} \langle p_i\, e_i \otimes e_i \,|\, p_j A e_j \otimes e_j + p_j e_j \otimes B e_j \rangle = \sum_{i=1}^{N} p_i^2 \big( \langle e_i | A e_i \rangle + \langle e_i | B e_i \rangle \big) \tag{72} \]

with $(A, B) \in \mathfrak{su}_N \oplus \mathfrak{su}_N$. It is easy to see (using the standard basis of $\mathfrak{su}_N$) that in fact the SVD transfers our state $|\Psi\rangle$ into a state $|\Psi'\rangle$ such that $\mu(|\Psi'\rangle) \in \mathfrak{t}^*$, where $\mathfrak{t}^*$ is the dual space to the Cartan subalgebra $\mathfrak{t}$ of $\mathfrak{k}$. Of course every coadjoint orbit passes through at least one point of $\mathfrak{t}^*$ (usually a coadjoint orbit contains more than one point of $\mathfrak{t}^*$; all these points lie on an orbit of the Weyl group). This fact can be seen as a geometrical interpretation of the SVD but, in contrast to the SVD itself, it remains true in multipartite cases. We will use it in the following sections.

Going back to our considerations, we know that $\mu(|\Psi'\rangle)$ is determined by its action on $\mathfrak{t}$, which is generated by $I \otimes H$, $H \otimes I$, where $H$ belongs to the Cartan subalgebra of $\mathfrak{su}_N$. Using the invariant scalar product on $\mathfrak{k}$ given by $\mathrm{Tr}\,AB$, we can find an element $(X, Y) \in \mathfrak{t}$ such that $\mu(|\Psi'\rangle) = \alpha_{(X,Y)}$. To compute it we use the standard basis of $\mathfrak{t}$ given by $H_1 = \mathrm{diag}(1, -1, 0, \dots, 0), \dots, H_{N-1} = \mathrm{diag}(0, 0, \dots, 1, -1)$. We have
\[ \mu_{(H_k, 0)}(|\Psi'\rangle) = \mathrm{tr}\big((X \otimes I + I \otimes Y)(H_k \otimes I)\big) = \mathrm{tr}(X H_k), \qquad \mu_{(0, H_k)}(|\Psi'\rangle) = \mathrm{tr}\big((X \otimes I + I \otimes Y)(I \otimes H_k)\big) = \mathrm{tr}(Y H_k), \tag{73} \]
but we also know that
\[ \mu_{(H_k, 0)}(|\Psi'\rangle) = \sum_{i=1}^{N} p_i^2 \langle e_i | H_k e_i \rangle = p_k^2 - p_{k+1}^2 = \mu_{(0, H_k)}(|\Psi'\rangle). \tag{74} \]
It is easy to see now that $X = Y$ and
\[ X = Y = \mathrm{diag}\Big( -\frac{1}{N} + p_1^2,\; -\frac{1}{N} + p_2^2,\; \dots,\; -\frac{1}{N} + p_N^2 \Big). \tag{75} \]
To compute the dimension of the coadjoint orbit through $\mu(|\Psi'\rangle)$ notice that if $U_1 \otimes U_2 \in G$ then
\[ (U_1 \otimes U_2)(X \otimes I + I \otimes X)(U_1^\dagger \otimes U_2^\dagger) = U_1 X U_1^\dagger \otimes I + I \otimes U_2 X U_2^\dagger. \tag{76} \]

Hence, to obtain the dimension of the coadjoint orbit we need to compute the dimension of the stabilizer subgroup of $X$ under the adjoint action. It is easy to see that $X$ is stabilized by any matrix of the form
\[ U = \begin{pmatrix} u_0 & & & \\ & u_1 & & \\ & & \ddots & \\ & & & u_K \end{pmatrix}, \tag{77} \]
where $u_0, u_1, \dots, u_K$ are arbitrary unitary operators from $U(m_0), U(m_1), \dots, U(m_K)$ and the value of $\det(u_0)$ is fixed by demanding that $U$ is special unitary. The dimension of the coadjoint orbit through $\mu(|\Psi'\rangle)$ is thus
\[ \dim(\mu(\mathcal{O})) = (2N^2 - 2) - \Big( 2\sum_{n=0}^{K} m_n^2 - 2 \Big) = 2N^2 - 2\sum_{n=0}^{K} m_n^2. \tag{78} \]
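The two dimension formulas (71) and (78) depend only on the multiplicities of the singular values of $C$, so they can be evaluated directly from a numerical SVD. The sketch below is illustrative only: the function names and the tolerance used to decide when two singular values coincide are assumptions, not part of the original text. The third returned number is the difference of the two dimensions, i.e., the dimension of the fiber of the moment map.

```python
import numpy as np


def multiplicities(C, tol=1e-9):
    """Return (m0, [m_1, ..., m_K]): the kernel dimension of C and the
    multiplicities of its distinct nonzero singular values."""
    s = np.linalg.svd(np.asarray(C, dtype=complex), compute_uv=False)
    s = s / np.linalg.norm(s)              # normalise the state
    m0 = int(np.sum(s <= tol))
    groups = []                            # [value, multiplicity] pairs
    for value in np.sort(s[s > tol]):
        if groups and abs(value - groups[-1][0]) <= tol:
            groups[-1][1] += 1
        else:
            groups.append([value, 1])
    return m0, [m for _, m in groups]


def orbit_dimensions(C, tol=1e-9):
    """Return (dim O, dim mu(O), dim O - dim mu(O)) following (71) and (78)."""
    C = np.asarray(C, dtype=complex)
    N = C.shape[0]
    m0, ms = multiplicities(C, tol)
    sum_sq = sum(m * m for m in ms)
    dim_orbit = 2 * N * N - 2 * m0 * m0 - sum_sq - 1      # formula (71)
    dim_coadjoint = 2 * N * N - 2 * (m0 * m0 + sum_sq)    # formula (78)
    return dim_orbit, dim_coadjoint, dim_orbit - dim_coadjoint


separable = np.array([[1.0, 0.0], [0.0, 0.0]])            # e1 (x) e1
bell = np.array([[0.0, 1.0], [1.0, 0.0]]) / np.sqrt(2)    # e1 (x) e2 + e2 (x) e1
max_entangled_3 = np.eye(3) / np.sqrt(3)
```

For two qubits this gives `(4, 4, 0)` for the separable state and `(3, 0, 3)` for the state $e_1 \otimes e_2 + e_2 \otimes e_1$, consistent with the three-dimensional degeneracy space found in the previous section.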

Now we are able to compute the dimension $D(|\Psi\rangle)$ of the degeneracy subspaces (fibers of the moment map),
\[ D(|\Psi\rangle) = \dim(\mathcal{O}) - \dim(\mu(\mathcal{O})) = \sum_{n=1}^{K} m_n^2 - 1, \tag{79} \]
and we see that the orbit through $|\Psi\rangle$ is symplectic if and only if the SVD yields a diagonal matrix with only one nonzero entry.

Fact 5. In the case of two identical but distinguishable particles there is only one symplectic orbit in the projective space $P(\mathbb{C}^N \otimes \mathbb{C}^N)$. This orbit contains all separable states and is Kähler. Orbits through entangled states are not symplectic.

Knowing this and making use of Corollary 1 we arrive at

Fact 6. In the case of two identical but distinguishable particles the dimension of the degeneracy space $D(|\Psi\rangle) = \sum_{n=1}^{K} m_n^2 - 1$ gives a well defined entanglement measure.

9. Three Particle Case

As already mentioned, the SVD has no generalization to multiple tensor products corresponding to multiparticle cases. Nevertheless we may apply some methods from the previous section if we look at the SVD from a slightly different point of view. Let us namely ask about necessary conditions for a state $|\Psi\rangle$ of the form (66) to be sent by the moment map $\mu$ to an element of $\mathfrak{t}^*$ represented by $X \otimes I + I \otimes Y \in \mathfrak{t}$ upon the identification of $\mathfrak{t}^*$ and $\mathfrak{t}$ through the invariant scalar product on $\mathfrak{k}$. We have
\[ \begin{aligned} \mu_{(A,B)}(|\Psi\rangle) &= \Big\langle \sum_{i,j=1}^{N} C_{ij}\, e_i \otimes e_j \,\Big|\, (A \otimes I + I \otimes B) \sum_{m,n=1}^{N} C_{mn}\, e_m \otimes e_n \Big\rangle \\ &= \sum_{m,n=1}^{N} \sum_{i,j=1}^{N} \bar C_{ij} C_{mn} \big( \delta_{nj} \langle e_i | A e_m \rangle + \delta_{im} \langle e_j | B e_n \rangle \big) \\ &= \sum_{i,j,m=1}^{N} \bar C_{ij} C_{mj} \langle e_i | A e_m \rangle + \sum_{i,j,n=1}^{N} \bar C_{ij} C_{in} \langle e_j | B e_n \rangle \\ &= \sum_{i,m=1}^{N} (C C^\dagger)_{mi} \langle e_i | A e_m \rangle + \sum_{j,n=1}^{N} (C^\dagger C)_{jn} \langle e_j | B e_n \rangle \\ &= \sum_{i,j=1}^{N} \big[ (C C^\dagger)_{ji} \langle e_i | A e_j \rangle + (C^\dagger C)_{ji} \langle e_j | B e_i \rangle \big]. \end{aligned} \]
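The last line of this derivation says that the pairing with $(A, B)$ only sees the matrices $C C^\dagger$ and $C^\dagger C$. A minimal numerical sketch of that statement follows; the helper `random_su` and the random test data are illustrative assumptions, and the expectation value is computed without any overall normalization factor, exactly as in the display above.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 3


def random_su(n, rng):
    """Random traceless anti-Hermitian matrix, i.e. an element of su(n)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    a = z - z.conj().T
    return a - np.trace(a) / n * np.eye(n)


C = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
C /= np.linalg.norm(C)
A, B = random_su(N, rng), random_su(N, rng)

# Direct evaluation of <Psi | (A (x) I + I (x) B) Psi>.
psi = C.reshape(-1)
generator = np.kron(A, np.eye(N)) + np.kron(np.eye(N), B)
direct = np.vdot(psi, generator @ psi)

# Evaluation through C C^dagger and C^dagger C, with <e_i|A e_j> = A_ij:
# sum_ij (CC^dagger)_ji A_ij + (C^dagger C)_ji B_ji.
CCd = C @ C.conj().T
CdC = C.conj().T @ C
reduced = np.einsum('ji,ij->', CCd, A) + np.einsum('ji,ji->', CdC, B)
```

Both numbers agree and are purely imaginary, as expected for anti-Hermitian $A$ and $B$.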

In the following we will denote by $E_{ij}$ the matrix with zero entries everywhere except for 1 in the $(i, j)$ position. The matrices $i(E_{ij} + E_{ji})$ and $E_{ij} - E_{ji}$, supplemented by the previously defined standard basis elements of $\mathfrak{t}$, constitute a standard basis of $\mathfrak{k}$. Taking now $A$ and $B$ of the form $i(E_{kl} + E_{lk})$ and $E_{kl} - E_{lk}$, which do not belong to $\mathfrak{t}$ but are from $\mathfrak{k}$, we must have
\[ \sum_{i,j=1}^{N} (C C^\dagger)_{ji} \langle e_i | (E_{kl} + E_{lk}) e_j \rangle = 0, \qquad \sum_{i,j=1}^{N} (C C^\dagger)_{ji} \langle e_i | (E_{kl} - E_{lk}) e_j \rangle = 0, \]
and the same for $C^\dagger C$. Notice that
\[ \langle e_i | (E_{kl} \pm E_{lk}) e_j \rangle = \delta_{lj} \langle e_i | e_k \rangle \pm \delta_{kj} \langle e_i | e_l \rangle = \delta_{lj} \delta_{ik} \pm \delta_{kj} \delta_{il}, \tag{80} \]
hence
\[ (C C^\dagger)_{lk} + (C C^\dagger)_{kl} = 0, \qquad (C C^\dagger)_{lk} - (C C^\dagger)_{kl} = 0, \]
and the same equations are fulfilled by $C^\dagger C$. It means that both $C C^\dagger$ and $C^\dagger C$ are diagonal. From linear algebra we know that the spectra of $C C^\dagger$ and $C^\dagger C$ are the same. Using this property and the additional freedom of a unitary action which permutes elements on the diagonal, we notice that it is always possible to have $C C^\dagger = C^\dagger C$ (i.e., $C$ is a normal operator) and thus $X = Y$ in $X \otimes I + I \otimes Y$ for the corresponding image of the moment map. Let thus
\[ C^\dagger C = \begin{pmatrix} 0\, I_{m_0} & & & \\ & v_1 I_{m_1} & & \\ & & \ddots & \\ & & & v_K I_{m_K} \end{pmatrix}, \]
where $I_n$ is the unit $n \times n$ matrix. Then
\[ C = \begin{pmatrix} 0 & & & \\ & \sqrt{v_1}\, u_1 & & \\ & & \ddots & \\ & & & \sqrt{v_K}\, u_K \end{pmatrix}, \tag{81} \]
where $m_1, \dots, m_K$ are the dimensions of degeneracy of $v_1, \dots, v_K$, respectively, and $u_n$ are $m_n \times m_n$ unitary matrices. Among all matrices (81) there is one which is diagonal and corresponds to the SVD. In this way we proved the existence of the SVD for any state using the fact that each adjoint orbit intersects the Cartan subalgebra $\mathfrak{t}$. The second important observation is that all states (81) are sent by the moment map into the same point $X \otimes I + I \otimes X$, and therefore constitute the fiber of the moment map. The dimension of this fiber is $\sum_{n=1}^{K} m_n^2 - 1$, which is exactly $D(|\Psi\rangle)$ from the previous section.

To a certain point we may repeat the reasoning in a multipartite case. Thus, e.g., for a general three-particle state,
\[ |\Psi\rangle = \sum_{i,j,k=1}^{N} C_{ijk}\, e_i \otimes e_j \otimes e_k, \tag{82} \]
the action of the moment map on $|\Psi\rangle$ gives
\[ \begin{aligned} \mu_{(A,B,D)}(|\Psi\rangle) &= \Big\langle \sum_{i,j,k=1}^{N} C_{ijk}\, e_i \otimes e_j \otimes e_k \,\Big|\, (A \otimes I \otimes I + I \otimes B \otimes I + I \otimes I \otimes D) \sum_{m,n,l=1}^{N} C_{mnl}\, e_m \otimes e_n \otimes e_l \Big\rangle \\ &= \sum_{m,n,l=1}^{N} \sum_{i,j,k=1}^{N} \bar C_{ijk} C_{mnl} \langle e_i \otimes e_j \otimes e_k \,|\, A e_m \otimes e_n \otimes e_l + e_m \otimes B e_n \otimes e_l + e_m \otimes e_n \otimes D e_l \rangle \\ &= \sum_{m,n,l=1}^{N} \sum_{i,j,k=1}^{N} \bar C_{ijk} C_{mnl} \big( \delta_{jn} \delta_{kl} \langle e_i | A e_m \rangle + \delta_{im} \delta_{kl} \langle e_j | B e_n \rangle + \delta_{im} \delta_{jn} \langle e_k | D e_l \rangle \big) \\ &= \sum_{i,j,k,m=1}^{N} \big( \bar C_{ijk} C_{mjk} \langle e_i | A e_m \rangle + \bar C_{ijk} C_{imk} \langle e_j | B e_m \rangle + \bar C_{ijk} C_{ijm} \langle e_k | D e_m \rangle \big). \end{aligned} \]

Again, we want to find conditions on $C_{ijk}$ under which $\mu_{(A,B,D)}(|\Psi\rangle)$ belongs to $\mathfrak{t}^*$. Substituting for $A$, $B$, and $D$ basis elements from $\mathfrak{k} - \mathfrak{t}$ and again using (80) we get
\[ \sum_{j,k=1}^{N} \bar C_{njk} C_{ljk} + \sum_{j,k=1}^{N} \bar C_{ljk} C_{njk} = 0, \qquad \sum_{j,k=1}^{N} \bar C_{njk} C_{ljk} - \sum_{j,k=1}^{N} \bar C_{ljk} C_{njk} = 0, \]
and similarly two pairs of equations for the other combinations of indices. If we now define
\[ (C^1)_{nl} = \sum_{j,k=1}^{N} \bar C_{njk} C_{ljk}, \qquad (C^2)_{nl} = \sum_{j,k=1}^{N} \bar C_{jnk} C_{jlk}, \qquad (C^3)_{nl} = \sum_{j,k=1}^{N} \bar C_{jkn} C_{jkl}, \tag{83} \]
the obtained conditions mean that the matrices $C^1$, $C^2$, $C^3$ are diagonal. In this case it is not generally true that $C^1 = C^2 = C^3$, so the corresponding state in $\mathfrak{t}$ is $X \otimes I \otimes I + I \otimes Y \otimes I + I \otimes I \otimes Z$ where, in general, $X \neq Y \neq Z$.

Up to now we know that any state $|\tilde\Psi\rangle = \sum_{i,j,k=1}^{N} \tilde C_{ijk}\, e_i \otimes e_j \otimes e_k$ can be taken by a local unitary transformation $U_1 \otimes U_2 \otimes U_3$ to the state $|\Psi\rangle = \sum_{i,j,k=1}^{N} C_{ijk}\, e_i \otimes e_j \otimes e_k$, where the coefficients $C_{ijk}$ fulfill (83). This statement has a deeper physical meaning. The diagonal elements of $C^1$, $C^2$, $C^3$ constitute probabilities to obtain the basis vectors $\{e_i\}$ in some local measurements performed on the state $|\Psi\rangle$. The conditions (83) say that any state can be transformed by a local unitary transformation to a state which is determined by these local measurements. It is natural to ask now how to find such a local unitary transformation. Let us consider an arbitrary state $|\tilde\Psi\rangle$. The action of $U \otimes V \otimes W$ gives
\[ U \otimes V \otimes W |\tilde\Psi\rangle = \sum_{i,j,k=1}^{N} \tilde C_{ijk}\, U e_i \otimes V e_j \otimes W e_k = \sum_{i,j,k=1}^{N} \sum_{\alpha,\beta,\gamma=1}^{N} \tilde C_{ijk}\, U_{\alpha i} V_{\beta j} W_{\gamma k}\, e_\alpha \otimes e_\beta \otimes e_\gamma. \tag{84} \]

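The matrices $\tilde C^1$, $\tilde C^2$, $\tilde C^3$ are generally not diagonal but by definition they are positive, hence Hermitian. This means there are unitary operators $U$, $V$, $W$ such that $U^\dagger \tilde C^1 U$, $V^\dagger \tilde C^2 V$, $W^\dagger \tilde C^3 W$ are diagonal. If we now take
\[ C_{ijk} = \sum_{n,l,m=1}^{N} \tilde C_{nlm}\, U^T_{in} V^T_{jl} W^T_{km}, \]
then (with summation over repeated indices understood)
\[ \begin{aligned} (C^1)_{nl} &= \sum_{j,k=1}^{N} \bar C_{njk} C_{ljk} = \sum_{j,k=1}^{N} \bar{\tilde C}_{\alpha\beta\gamma}\, \overline{U^T}_{n\alpha} \overline{V^T}_{j\beta} \overline{W^T}_{k\gamma}\; \tilde C_{abc}\, U^T_{la} V^T_{jb} W^T_{kc} \\ &= \sum_{j,k=1}^{N} \bar{\tilde C}_{\alpha\beta\gamma} \tilde C_{abc}\, U^\dagger_{n\alpha} U_{al}\, V_{bj} V^\dagger_{j\beta}\, W_{ck} W^\dagger_{k\gamma} = U^\dagger_{n\alpha}\, \bar{\tilde C}_{\alpha\beta\gamma} \tilde C_{a\beta\gamma}\, U_{al} \\ &= U^\dagger_{n\alpha} (\tilde C^1)_{\alpha a} U_{al} = (U^\dagger \tilde C^1 U)_{nl}, \end{aligned} \]
which is diagonal as we wanted. Similarly we show that $C^2$ and $C^3$ are diagonal as well.

Now, to compute the dimension of the fiber over $\mu(|\Psi\rangle)$ we need to find the dimension of the submanifold of states which are sent to $\mu(|\Psi\rangle)$. First we look at the coadjoint orbit through $\mu(|\Psi\rangle)$. As we know, $\mu(|\Psi\rangle)$ can be represented by an element $X \otimes I \otimes I + I \otimes Y \otimes I + I \otimes I \otimes Z \in \mathfrak{t}$. Using similar reasoning as in the case of two particles we obtain
\[ \begin{aligned} X &= \mathrm{diag}\Big( -\frac{1}{N} + p_{11}^2,\; -\frac{1}{N} + p_{12}^2,\; \dots,\; -\frac{1}{N} + p_{1N}^2 \Big), \\ Y &= \mathrm{diag}\Big( -\frac{1}{N} + p_{21}^2,\; -\frac{1}{N} + p_{22}^2,\; \dots,\; -\frac{1}{N} + p_{2N}^2 \Big), \\ Z &= \mathrm{diag}\Big( -\frac{1}{N} + p_{31}^2,\; -\frac{1}{N} + p_{32}^2,\; \dots,\; -\frac{1}{N} + p_{3N}^2 \Big), \end{aligned} \tag{85} \]
where $\{p_{11}^2, p_{12}^2, \dots, p_{1N}^2\}$, $\{p_{21}^2, p_{22}^2, \dots, p_{2N}^2\}$ and $\{p_{31}^2, p_{32}^2, \dots, p_{3N}^2\}$ constitute the spectra of $C^1$, $C^2$ and $C^3$, respectively. The dimension of this orbit can be easily computed knowing that $\mathrm{Stab}(\mu(|\Psi\rangle))$ consists of matrices $U \otimes V \otimes W$ with
\[ U = \begin{pmatrix} u_{1,0} & & & \\ & u_{1,1} & & \\ & & \ddots & \\ & & & u_{1,K_1} \end{pmatrix}, \quad V = \begin{pmatrix} v_{2,0} & & & \\ & u_{2,1} & & \\ & & \ddots & \\ & & & u_{2,K_2} \end{pmatrix}, \quad W = \begin{pmatrix} w_{3,0} & & & \\ & w_{3,1} & & \\ & & \ddots & \\ & & & w_{3,K_3} \end{pmatrix}, \tag{86} \]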

where $K_i$ is the number of eigenspaces of $C^i$ corresponding to different eigenvalues, $m_{i,n}$ are their dimensions and $u_{i,n} \in U(m_{i,n})$. The stabilizer of this orbit has the dimension
\[ \dim(\mathrm{Stab}(\mu(|\Psi\rangle))) = \sum_{n=0}^{K_1} m_{1,n}^2 + \sum_{n=0}^{K_2} m_{2,n}^2 + \sum_{n=0}^{K_3} m_{3,n}^2 - 3. \tag{87} \]
Hence,
\[ \dim(\mu(\mathcal{O})) = (3N^2 - 3) - \Big( \sum_{n=0}^{K_1} m_{1,n}^2 + \sum_{n=0}^{K_2} m_{2,n}^2 + \sum_{n=0}^{K_3} m_{3,n}^2 - 3 \Big). \tag{88} \]
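The construction of this section can be followed numerically. The sketch below is an illustration under stated assumptions (function names, the tolerance for grouping equal eigenvalues, and the example states are not from the original text): it forms the matrices $C^1$, $C^2$, $C^3$ of (83) with `np.einsum`, takes $U$, $V$, $W$ to be the eigenvector matrices returned by `np.linalg.eigh`, transforms the coefficient tensor, and evaluates the coadjoint orbit dimension (88) from the multiplicities of the eigenvalues.

```python
import numpy as np


def one_particle_matrices(C):
    """Matrices C^1, C^2, C^3 of (83) for a coefficient tensor C[i, j, k]."""
    Cc = C.conj()
    C1 = np.einsum('njk,ljk->nl', Cc, C)
    C2 = np.einsum('jnk,jlk->nl', Cc, C)
    C3 = np.einsum('jkn,jkl->nl', Cc, C)
    return C1, C2, C3


def to_canonical_form(C_tilde):
    """Local unitary change of basis after which C^1, C^2, C^3 are diagonal.

    U, V, W are eigenvector matrices, so that U^dagger C~^1 U etc. are diagonal;
    the new tensor is C_ijk = sum_nlm C~_nlm U_ni V_lj W_mk."""
    U, V, W = (np.linalg.eigh(M)[1] for M in one_particle_matrices(C_tilde))
    return np.einsum('ni,lj,mk,nlm->ijk', U, V, W, C_tilde)


def multiplicities(eigenvalues, tol=1e-9):
    """Multiplicities of the distinct eigenvalues (zero included)."""
    groups = []
    for x in np.sort(eigenvalues):
        if groups and abs(x - groups[-1][0]) <= tol:
            groups[-1][1] += 1
        else:
            groups.append([x, 1])
    return [m for _, m in groups]


def coadjoint_orbit_dim(C, tol=1e-9):
    """dim mu(O) according to (88)."""
    C = C / np.linalg.norm(C)
    N = C.shape[0]
    total = 0
    for M in one_particle_matrices(C):
        total += sum(m * m for m in multiplicities(np.linalg.eigvalsh(M), tol))
    return (3 * N * N - 3) - (total - 3)


# Example states of three qubits (illustrative choices).
ghz = np.zeros((2, 2, 2), dtype=complex)
ghz[0, 0, 0] = ghz[1, 1, 1] = 1.0
product = np.zeros((2, 2, 2), dtype=complex)
product[0, 0, 0] = 1.0

rng = np.random.default_rng(4)
C_random = rng.normal(size=(3, 3, 3)) + 1j * rng.normal(size=(3, 3, 3))
C_random /= np.linalg.norm(C_random)
C_canonical = to_canonical_form(C_random)
```

For the product state the orbit dimension from (88) is 6, the real dimension of $(\mathbb{CP}^1)^3$, while for the GHZ-type state it is 0, since all three matrices $C^i$ are proportional to the identity.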

The dimension of the fiber can be computed as
\[ \dim(D(|\Psi\rangle)) = \dim(\mathrm{Stab}(\mu(|\Psi\rangle))) - \dim(\mathrm{Stab}(|\Psi\rangle)). \tag{89} \]
Notice that if $C^1$, $C^2$, $C^3$ have nontrivial kernels then in the decomposition of $|\Psi\rangle$ there are no elements $e_i \otimes e_j \otimes e_k$ where $e_i \in \mathrm{Ker}(C^1)$ or $e_j \in \mathrm{Ker}(C^2)$ or $e_k \in \mathrm{Ker}(C^3)$. This means that acting on $|\Psi\rangle$ by unitary operators from $\mathrm{Stab}(\mu(|\Psi\rangle))$ which can be restricted to the kernels of $C^1$, $C^2$, $C^3$ we do not change the state $|\Psi\rangle$. We thus find an upper bound for the dimension of the degeneracy space as
\[ \sum_{n=1}^{K_1} m_{1,n}^2 + \sum_{n=1}^{K_2} m_{2,n}^2 + \sum_{n=1}^{K_3} m_{3,n}^2 - 3. \tag{90} \]

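The dimension of a fiber is at least
\[ \max\Big\{ \sum_{n=1}^{K_1} m_{1,n}^2,\; \sum_{n=1}^{K_2} m_{2,n}^2,\; \sum_{n=1}^{K_3} m_{3,n}^2 \Big\} - 1. \tag{91} \]
Indeed, the conditions (83) allow us to write the state $|\Psi\rangle$ as
\[ |\Psi\rangle = \sum_{i}^{N} p_{1i}\, e_i \otimes v_i, \tag{92} \]
where
\[ v_i = \frac{1}{p_{1i}} \sum_{j,k} C_{ijk}\, e_j \otimes e_k, \qquad i = 1, \dots, N, \tag{93} \]
constitute a set of orthonormal vectors. We can treat (92) as a bipartite decomposition of $|\Psi\rangle$ in the orthonormal bases $\{e_i\}$ and $\{v_i\}$. In these bases $|\Psi\rangle$ is thus represented by the matrix $\check C$,
\[ \check C = \begin{pmatrix} 0\, I_{m_{1,0}} & & & \\ & \check p_1 I_{m_{1,1}} & & \\ & & \ddots & \\ & & & \check p_{K_1} I_{m_{1,K_1}} \end{pmatrix}, \]
where $\check p_i$ are the different eigenvalues $p_{1i}$. Application of $U \otimes I \otimes I \in \mathrm{Stab}(\mu(|\Psi\rangle))$ yields
\[ \check C = \begin{pmatrix} 0\, u_{1,0} & & & \\ & \check p_1 u_{1,1} & & \\ & & \ddots & \\ & & & \check p_{K_1} u_{1,K_1} \end{pmatrix}. \tag{94} \]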
Clearly, the matrices Cˇ of the above form constitute a manifold of dimension

K1 2 n=1 m 1,n − 1. In the case of two particles this is the whole fiber because acting with U ⊗ I and I ⊗ V we get exactly the same manifold. For multipartite systems, like the three particle case we consider, we have to take into account that acting with I ⊗ V ⊗ I and I ⊗ I ⊗ W may produce manifolds of larger dimensionalities which leads thus to the estimate (91). Summing up we have max{

K1 n=1

m 21,n ,

K2

m 22,n ,

n=1

K3

m 23,n }−1 ≤ D(| ) ≤

n=1

K1

m 21,n +

n=1

K2 n=1

m 22,n +

K3

m 23,n −3.

n=1

(95) Thus an orbit is symplectic if and only if K1 n=1

m 21,n = 1,

K2 n=1

m 22,n = 1,

K3 n=1

m 23,n = 1.
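The estimate (95) and the symplecticity condition above depend only on the multiplicities of the nonzero eigenvalues of $C^1$, $C^2$, $C^3$. The following sketch evaluates them for a few three-qubit states; the function names, the tolerance, and the choice of example states (a product state, a GHZ-type state, a W-type state) are illustrative assumptions and do not appear in the original text.

```python
import numpy as np


def nonzero_multiplicity_sums(C, tol=1e-9):
    """For each of the three parties, the sum of m_{i,n}^2 over the distinct
    nonzero eigenvalues of the matrix C^i defined in (83)."""
    C = C / np.linalg.norm(C)
    Cc = C.conj()
    matrices = [
        np.einsum('njk,ljk->nl', Cc, C),
        np.einsum('jnk,jlk->nl', Cc, C),
        np.einsum('jkn,jkl->nl', Cc, C),
    ]
    sums = []
    for M in matrices:
        ev = np.sort(np.linalg.eigvalsh(M))
        groups = []                      # [value, multiplicity] pairs
        for x in ev[ev > tol]:
            if groups and abs(x - groups[-1][0]) <= tol:
                groups[-1][1] += 1
            else:
                groups.append([x, 1])
        sums.append(sum(m * m for _, m in groups))
    return sums


def degeneracy_bounds(C, tol=1e-9):
    """Lower and upper bound of (95) for the dimension of the degeneracy space."""
    s = nonzero_multiplicity_sums(C, tol)
    return max(s) - 1, sum(s) - 3


def orbit_is_symplectic(C, tol=1e-9):
    """Condition stated after (95): all three sums are equal to one."""
    return all(x == 1 for x in nonzero_multiplicity_sums(C, tol))


def basis_state(*indices):
    t = np.zeros((2, 2, 2), dtype=complex)
    t[indices] = 1.0
    return t


product = basis_state(0, 0, 0)
ghz = basis_state(0, 0, 0) + basis_state(1, 1, 1)
w_state = basis_state(0, 0, 1) + basis_state(0, 1, 0) + basis_state(1, 0, 0)

# A product state in an arbitrary local basis (random single-particle vectors).
rng = np.random.default_rng(5)
a, b, c = (rng.normal(size=2) + 1j * rng.normal(size=2) for _ in range(3))
rotated_product = np.einsum('i,j,k->ijk', a, b, c)
```

For the product states both bounds vanish and the condition is satisfied; for the GHZ-type state the bounds are 3 and 9, and for the W-type state 1 and 3, so neither orbit is symplectic.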



But this means that the state $|\Psi\rangle$ is separable, because it reduces to one of the states $e_i \otimes e_j \otimes e_k$. Conversely, if the state is separable then by local operations we can transform it to the state $e_1 \otimes e_1 \otimes e_1$, and then the above condition is fulfilled. In this way we found an easy way to check if a state is separable and showed that separability is equivalent to the fact that the associated orbit is symplectic. We also have an estimate for the dimensions of the degeneracy spaces for entangled states. A generalization to cases of more than three particles is straightforward. So we have the following

Theorem 5. In the case of $L$ identical but distinguishable particles there is only one symplectic orbit in the projective space $P\big(\bigotimes_{n=1}^{L} \mathbb{C}^N\big)$. This orbit contains all separable states and is Kähler. Orbits through entangled states are not symplectic.

Knowing this and making use of Corollary 1 we arrive at

Fact 7. In the case of $L$ identical but distinguishable particles the dimension of the degeneracy space $D(|\Psi\rangle)$ gives a well defined entanglement measure. For any state $|\Psi\rangle$ the estimate for this measure is given by a formula analogous to (95).

10. Indistinguishable Particles

The Kostant-Sternberg theorem can be directly applied also to indistinguishable particles, i.e., bosons and fermions. For $L$ bosons the relevant group is $K = SU(N)$ represented in $V = \mathrm{Sym}^L \mathbb{C}^N$. As above we want to check which orbits of the $K$-action are symplectic in the projective space $P(V)$. The best way to understand the problem is to work out a nontrivial example and then generalize the obtained result. To this end let us consider the simplest case of $L = 2$ and $N = 3$, i.e., the representation of $SU(3)$ in $\mathrm{Sym}^2 \mathbb{C}^3$. First we notice that the representation of $SU(N)$ in $\mathrm{Sym}^2 \mathbb{C}^N$ is irreducible [18]. From the Kostant-Sternberg theorem it follows that it is enough to investigate the structure of the $\mathfrak{sl}_3(\mathbb{C})$ representation on $\mathrm{Sym}^2 \mathbb{C}^3$.
The algebra $\mathfrak{g} = \mathfrak{sl}_3(\mathbb{C})$ is eight-dimensional and can be decomposed as $\mathfrak{g} = \mathfrak{n}_- \oplus \mathfrak{h} \oplus \mathfrak{n}_+$, where $\mathfrak{h}$ is the Cartan subalgebra consisting of traceless diagonal matrices and $\mathfrak{n}_+ = \mathrm{Span}(E_{12}, E_{13}, E_{23})$, $\mathfrak{n}_- = \mathrm{Span}(E_{21}, E_{31}, E_{32})$. We define three linear functionals $L_i : \mathfrak{h} \to \mathbb{C}$,
\[ L_i(\mathrm{diag}(a_1, a_2, a_3)) = a_i, \qquad i \in \{1, 2, 3\}. \tag{96} \]
Let us choose a basis $B$ in $\mathrm{Sym}^2 \mathbb{C}^3$, $B = \{e_1 \otimes e_1,\; e_2 \otimes e_2,\; e_3 \otimes e_3,\; e_1 \otimes e_2 + e_2 \otimes e_1,\; e_1 \otimes e_3 + e_3 \otimes e_1,\; e_2 \otimes e_3 + e_3 \otimes e_2\}$, where $e_i \in \mathbb{C}^3$ are the standard basis vectors. The action of the Lie algebra $\mathfrak{g}$ on $\mathrm{Sym}^2 \mathbb{C}^3$ is the usual action of the tensor product of representations. The construction of the representation of $\mathfrak{sl}_3(\mathbb{C})$ on $\mathrm{Sym}^2 \mathbb{C}^3$ is straightforward: we take the vector $e_1 \otimes e_1$, which is the highest weight vector (it is an eigenvector of all elements of $\mathfrak{h}$ and it is annihilated by $\mathfrak{n}_+$), and we act on it with operators from $\mathfrak{n}_-$. As a result we obtain a decomposition of $V = \mathrm{Sym}^2 \mathbb{C}^3$ into the direct sum $V = \bigoplus_\lambda V_\lambda$, where the one-dimensional weight spaces $V_\lambda$ are spanned by the basis vectors of $B$. The weights $\lambda \in \mathfrak{h}^*$ can now be calculated as¹
\[ H(e_i \otimes e_i) = 2L_i(H)\, e_i \otimes e_i, \quad i = 1, 2, 3, \qquad H(e_i \otimes e_j + e_j \otimes e_i) = (L_i + L_j)(H)\, (e_i \otimes e_j + e_j \otimes e_i). \tag{97} \]

¹ Remember that we represent $\mathfrak{k}$ in the symmetric tensor product, hence $H(e_i \otimes e_i)$ has the meaning of $(H \otimes I + I \otimes H)(e_i \otimes e_i) = H e_i \otimes e_i + e_i \otimes H e_i$, etc.


We know that only orbits passing through weight vectors might be symplectic. We have the following $\mathfrak{sl}_2(\mathbb{C})$ triples in $\mathfrak{sl}_3(\mathbb{C})$: $(E_{ij}, E_{ji}, H_{ij} = [E_{ij}, E_{ji}])$. The orbit through a weight vector $v$ with a weight $\lambda$ is symplectic if and only if for every operator from $\mathfrak{n}_+$ the following implication is true: $\lambda(H_{ij}) = 0 \Rightarrow E_{ij}(v) = 0 = E_{ji}(v)$. There are two cases to consider.

– Vectors of the form $e_i \otimes e_i$, $i = 1, 2, 3$. The weight of the state $e_i \otimes e_i$ is $2L_i$, so $2L_i(H_{kj}) = 0$ only when $k \neq i$ and $j \neq i$. In this case $E_{kj}(e_i \otimes e_i) = 0$, because to give a nonzero result the matrix $E_{kj}$ must have a one in the $i$th column. The corresponding orbit is thus symplectic. Obviously, all these vectors lie on the orbit through the highest weight vector $e_1 \otimes e_1$.
– Vectors of the form $e_i \otimes e_j + e_j \otimes e_i$, where $i \neq j$. The weight of this vector is $L_i + L_j$, so $(L_i + L_j)(H_{kl}) = 0$ only if $k = i$ and $j = l$. In this case $E_{ij}(e_i \otimes e_j + e_j \otimes e_i) \neq 0$ because $E_{ij} e_j \neq 0$. In conclusion, the orbit through $e_i \otimes e_j + e_j \otimes e_i$ is not symplectic.

Let us now return to the problem mentioned in Sect. 2.2. If we define nonentangled bosonic states as symmetrizations of simple tensors (or, more precisely, as the corresponding points in the projective space) then we clearly have two types of nonentanglement which are inequivalent from the geometric point of view. Nonentangled states of two different types are not connected by local unitary transformations, in contrast to the familiar situation of distinguishable particles and to the intuition built upon the fact that all separable states of distinguishable particles can be obtained from a single one by local transformations. Although this is obviously acceptable, it remains an open problem what the physical meaning of two different types of nonentanglement is.
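The criterion $\lambda(H_{ij}) = 0 \Rightarrow E_{ij}(v) = 0 = E_{ji}(v)$ can be tested mechanically for the weight vectors of $\mathrm{Sym}^2 \mathbb{C}^3$. The sketch below is illustrative: it realizes the action of a matrix $X$ on $\mathbb{C}^3 \otimes \mathbb{C}^3$ as $X \otimes I + I \otimes X$, and the function names are assumptions. It presupposes that the input is a weight vector, i.e., a common eigenvector of all diagonal $H_{ij}$.

```python
import numpy as np

N = 3
I = np.eye(N)


def E(i, j):
    """Matrix unit E_ij (indices counted from 1, as in the text)."""
    m = np.zeros((N, N))
    m[i - 1, j - 1] = 1.0
    return m


def e(i):
    """Standard basis vector e_i of C^3."""
    return I[i - 1]


def act(X, v):
    """Action of X on C^3 (x) C^3 as X (x) I + I (x) X."""
    return (np.kron(X, I) + np.kron(I, X)) @ v


def is_symplectic_weight_vector(v, tol=1e-12):
    """Check lambda(H_ij) = 0  =>  E_ij v = 0 = E_ji v for all i < j.

    Assumes v is a weight vector, so that H_ij v = lambda(H_ij) v."""
    for i in range(1, N + 1):
        for j in range(i + 1, N + 1):
            H = E(i, i) - E(j, j)                  # H_ij = [E_ij, E_ji]
            lam = np.vdot(v, act(H, v)) / np.vdot(v, v)
            if abs(lam) < tol:
                raised = np.linalg.norm(act(E(i, j), v))
                lowered = np.linalg.norm(act(E(j, i), v))
                if raised > tol or lowered > tol:
                    return False
    return True


results = {
    "e1e1": is_symplectic_weight_vector(np.kron(e(1), e(1))),
    "e3e3": is_symplectic_weight_vector(np.kron(e(3), e(3))),
    "e1e2+e2e1": is_symplectic_weight_vector(np.kron(e(1), e(2)) + np.kron(e(2), e(1))),
    "e2e3+e3e2": is_symplectic_weight_vector(np.kron(e(2), e(3)) + np.kron(e(3), e(2))),
}
```

The vectors $e_i \otimes e_i$ pass the test, while the vectors $e_i \otimes e_j + e_j \otimes e_i$ with $i \neq j$ fail it, in agreement with the two cases discussed above.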
If, instead, we adopt the second definition, identifying nonentangled bosonic states as points in the projective space corresponding to tensor products of the same vector, we encounter the same situation as in the case of distinguishable particles: the nonentangled states form a unique symplectic orbit, and the degeneracy of the symplectic form can be used as a measure of entanglement for entangled states.

The case of fermions does not lead to any ambiguities of the above type. Calculations similar to those made for bosons lead to the conclusion that nonentangled states form the unique symplectic orbit. Indeed, let us consider as an example $K = SU(N)$ and $V = \bigwedge^2 \mathbb{C}^N$, corresponding to two fermions of spin $(N - 1)/2$ with the single-particle space $\mathcal{H}_1 = \mathbb{C}^N$. In terms of the previously introduced standard bases $e_i$ and $E_{ij}$ adapted to $N$ dimensions, $V$ is spanned by $e_{kl} = e_k \otimes e_l - e_l \otimes e_k$ with $k < l$, the highest weight vector is $e_{12}$, and
\[ E_{ij} e_{kl} = \delta_{jl}\, e_{ki} + \delta_{jk}\, e_{il}, \tag{98} \]
where we denote $e_{kl} = -e_{lk}$ for $k > l$. Acting by $E_{ij}$ with $i > j$ on $e_{12}$ we obtain the remaining weight vectors, which according to (98) are all of the form $e_{kl}$ (in fact with $l = 1$ or $2$), with weights $L_i + L_j$ (we extended in an obvious way the definition (96) to $N$ dimensions). As remarked, $(L_i + L_j)(H_{kl}) = 0$ implies $k = i$, $j = l$, but then $E_{ij} e_{kl} = 0 = E_{ji} e_{kl}$.

11. Summary and Outlook

We presented a geometric description of the set of pure states of composite quantum systems in terms of the natural symplectic structure in the space of states. Nonentangled states form a unique symplectic orbit through the highest weight vector of the appropriate representation of the group of local transformations, whereas entangled states are characterized by the degeneracy of the symplectic form. The degeneracy can thus be used as a kind of geometric measure of entanglement. We were able to calculate the degeneracy in many relevant cases and to give some estimates for the most general system of an arbitrary number of constituents with an arbitrary dimension of the single particle space. Let us remark that there exists a useful characterization of the highest weight vector orbits which allows one to generalize and effectively estimate some other entanglement measures [19]. An obvious question is whether the method can be adapted to the case of mixed states. This problem, as well as applications of the obtained results to identifying so-called locally unitary equivalent multiparticle states [20] and finding "canonical" forms of them, we postpone to forthcoming publications.

Acknowledgements. The support by the SFB/TR12 Symmetries and Universality in Mesoscopic Systems program of the Deutsche Forschungsgemeinschaft and the Polish MNiSW grant DFG-SFB/38/2007 is gratefully acknowledged.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

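Appendix

A.1 Symplectic structure on coadjoint orbits. Let $K$ be a semisimple compact Lie group, $\mathfrak{k}$ its Lie algebra, and $\mathfrak{k}^*$ the dual space to $\mathfrak{k}$. The coadjoint action of $K$ on $\mathfrak{k}^*$ is given by
\[ \mathrm{Ad}^*_g : \mathfrak{k}^* \to \mathfrak{k}^*, \qquad \langle \mathrm{Ad}^*_g \alpha, Y \rangle = \langle \alpha, \mathrm{Ad}_{g^{-1}} Y \rangle = \langle \alpha, g^{-1} Y g \rangle, \qquad g \in K,\; Y \in \mathfrak{k},\; \alpha \in \mathfrak{k}^*, \tag{99} \]
where $\mathrm{Ad}$ is the adjoint action of $K$. It can be easily checked that (99) is well defined. The coadjoint orbit $\Omega_\alpha$ passing through $\alpha \in \mathfrak{k}^*$ is the orbit of the coadjoint action of $K$ on $\alpha$,
\[ \Omega_\alpha = \{ \mathrm{Ad}^*_g \alpha : g \in K \}. \tag{100} \]
Our goal now is to define a $K$-invariant symplectic form $\omega$ on $\Omega_\alpha$. Such a form acts on tangent vectors, so it is reasonable to look first at their structure. Since $\mathfrak{k}^*$ is a vector space, its tangent space at any point is again $\mathfrak{k}^*$. For any $X \in \mathfrak{k}$ let $\tilde X \in T_\alpha \Omega_\alpha$ be the vector tangent to the curve $t \mapsto \mathrm{Ad}^*_{\exp(tX)} \alpha$. We have then
\[ \langle \tilde X, Y \rangle = \frac{d}{dt}\Big|_{t=0} \langle \mathrm{Ad}^*_{\exp(tX)} \alpha, Y \rangle = \frac{d}{dt}\Big|_{t=0} \langle \alpha, \mathrm{Ad}_{\exp(-tX)} Y \rangle = \langle \alpha, [Y, X] \rangle, \tag{101} \]
where $Y \in \mathfrak{k}$. Thus $\tilde X$ is an element of $\mathfrak{k}^*$ given by
\[ \tilde X = \langle \alpha, [\,\cdot\,, X] \rangle. \tag{102} \]
It is now interesting to ask how this tangent vector transforms when pushed by an element $g \in K$, i.e., to consider the vector $g\tilde X \in T_{\mathrm{Ad}^*_g \alpha} \Omega_\alpha$ tangent to the curve $t \mapsto \mathrm{Ad}^*_g \mathrm{Ad}^*_{\exp(tX)} \alpha$. We have
\[ \begin{aligned} \langle g\tilde X, Y \rangle &= \frac{d}{dt}\Big|_{t=0} \langle \mathrm{Ad}^*_g \mathrm{Ad}^*_{\exp(tX)} \alpha, Y \rangle = \frac{d}{dt}\Big|_{t=0} \langle \alpha, \mathrm{Ad}_{\exp(-tX)} \mathrm{Ad}_{g^{-1}} Y \rangle \\ &= \langle \alpha, [\mathrm{Ad}_{g^{-1}} Y, X] \rangle = \langle \alpha, [\mathrm{Ad}_{g^{-1}} Y, \mathrm{Ad}_{g^{-1}} \mathrm{Ad}_g X] \rangle \\ &= \langle \alpha, \mathrm{Ad}_{g^{-1}} [Y, \mathrm{Ad}_g X] \rangle = \langle \mathrm{Ad}^*_g \alpha, [Y, \mathrm{Ad}_g X] \rangle. \end{aligned} \tag{103} \]
Hence $g\tilde X$ is an element of $\mathfrak{k}^*$ given by
\[ g\tilde X = \langle \mathrm{Ad}^*_g \alpha, [\,\cdot\,, \mathrm{Ad}_g X] \rangle. \tag{104} \]
We may now define our symplectic form at any $\beta \in \Omega_\alpha$ as
\[ \omega_\beta(\tilde X, \tilde Y) = \langle \beta, [X, Y] \rangle. \tag{105} \]
The form is nondegenerate on $\Omega_\alpha$ because
\[ \forall\, Y \in \mathfrak{k} \quad \langle \beta, [X, Y] \rangle = 0 \;\Leftrightarrow\; \forall\, Y \in \mathfrak{k} \quad \langle \tilde X, Y \rangle = 0 \;\Leftrightarrow\; \tilde X = 0. \tag{106} \]
It is also $K$-invariant because
\[ \omega_{\mathrm{Ad}^*_g \beta}(g\tilde X, g\tilde Y) = \langle \mathrm{Ad}^*_g \beta, [\mathrm{Ad}_g X, \mathrm{Ad}_g Y] \rangle = \langle \beta, \mathrm{Ad}_{g^{-1}} \mathrm{Ad}_g [X, Y] \rangle = \langle \beta, [X, Y] \rangle = \omega_\beta(\tilde X, \tilde Y). \tag{107} \]
Thus we need only to check if $\omega$ is closed. But we know that for any differential 2-form the following is true:
\[ d\omega(\tilde X, \tilde Y, \tilde Z) = \tilde X \omega(\tilde Y, \tilde Z) + \tilde Y \omega(\tilde Z, \tilde X) + \tilde Z \omega(\tilde X, \tilde Y) - \big( \omega([\tilde X, \tilde Y], \tilde Z) + \omega([\tilde Y, \tilde Z], \tilde X) + \omega([\tilde Z, \tilde X], \tilde Y) \big). \tag{108} \]
Taking $\tilde X = \langle \mathrm{Ad}^*_g \beta, [\,\cdot\,, \mathrm{Ad}(g) X] \rangle$, and similarly for $\tilde Y$ and $\tilde Z$, we see that the first three terms in (108) vanish because $\omega$ is $K$-invariant. The sum of the next three terms is also zero due to the Jacobi identity. This way we arrive at a well defined symplectic form on $\Omega_\alpha$.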
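The formulas (101), (105) and (107) can be illustrated numerically for $K = SU(n)$. In the sketch below, which is an illustration and not part of the original argument, $\mathfrak{k}^*$ is identified with $\mathfrak{k}$ through the invariant form $\langle \alpha, Y \rangle = -\mathrm{tr}(\alpha Y)$, so that $\mathrm{Ad}^*_g \alpha$ becomes $g \alpha g^{-1}$; this identification, the function names, and the finite-difference step are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3


def random_su(n, rng):
    """Random traceless anti-Hermitian matrix, i.e. an element of su(n)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    a = z - z.conj().T
    return a - np.trace(a) / n * np.eye(n)


def pairing(alpha, Y):
    """<alpha, Y>, with k* identified with k via the invariant form -tr(alpha Y)."""
    return -np.trace(alpha @ Y).real


def exp_antihermitian(X, t=1.0):
    """exp(tX) for anti-Hermitian X, from the eigen-decomposition of the
    Hermitian matrix -iX."""
    w, V = np.linalg.eigh(-1j * X)
    return (V * np.exp(1j * t * w)) @ V.conj().T


def comm(A, B):
    return A @ B - B @ A


def coadjoint(g, alpha):
    """Ad*_g alpha; under the identification above this is g alpha g^{-1}."""
    return g @ alpha @ g.conj().T


def omega(beta, X, Y):
    """Symplectic form (105): omega_beta(X~, Y~) = <beta, [X, Y]>."""
    return pairing(beta, comm(X, Y))


alpha, X, Y = (random_su(n, rng) for _ in range(3))
g = exp_antihermitian(random_su(n, rng))

# (101): derivative of <Ad*_{exp(tX)} alpha, Y> at t = 0, by central differences.
step = 1e-5
plus = pairing(coadjoint(exp_antihermitian(X, step), alpha), Y)
minus = pairing(coadjoint(exp_antihermitian(X, -step), alpha), Y)
derivative = (plus - minus) / (2 * step)
expected = pairing(alpha, comm(Y, X))

# (107): K-invariance of omega, using Ad_g X = g X g^{-1}.
Ad = lambda h, Z: h @ Z @ h.conj().T
omega_moved = omega(coadjoint(g, alpha), Ad(g, X), Ad(g, Y))
omega_original = omega(alpha, X, Y)
```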

A.2 Kähler structure. Let us start with the definition of a Kähler manifold.

Definition 2. Let $M$ be a complex manifold, $\dim_{\mathbb{C}} M = n$, and let $\omega$ be a symplectic form on $M$ treated as a $2n$-dimensional real manifold. Then $M$ is called a Kähler manifold if at every $p \in M$ the complex structure $i$ on $T_p M$ (multiplication by the imaginary unit) fulfills
\[ \omega(iv, iw) = \omega(v, w), \tag{109} \]
that is, $i \in Sp(T_p M)$ (the symplectic group of $T_p M$).


Assume M is a Kähler manifold. Then we can define a symmetric nondegenerate form b on M, b(v, w) = ω(v, iw).

(110)

Indeed we have b(v, w) = ω(v, iw) = ω(iv, i 2 w) = −ω(iv, w) = ω(w, iv) = b(w, v).

(111)

The form b is nondegenerate because ∀w ∈ T p M b(v, w) = 0 ⇔ ∀w ∈ T p M ω(w, iv) = 0 ⇔ v = 0.

(112)

It is also i-invariant b(iv, iw) = ω(iv, i 2 w) = ω(v, iw) = b(v, w).

(113)

We have now the following definition.

Definition 3. A Kähler structure on $M$ is called positive if and only if the corresponding symmetric form $b$ is positive.

It is straightforward to check that having such an $i$-invariant nondegenerate symmetric form $b$ on $M$ we can define the nondegenerate $i$-invariant antisymmetric 2-form by $\omega(v, w) = b(iv, w)$. We will need only one more theorem (see [12] for a detailed proof).

Theorem 6. Let $M$ be a positive Kähler manifold. Then any complex submanifold $S \subset M$ is also a Kähler manifold.

The assumption that $M$ is a positive Kähler manifold (not just a Kähler manifold) is very important, because the restriction of the symmetric form $b$ to $S$ is then nondegenerate and $i$-invariant. Now, using $b|_{\mathcal{O}}$, we can define a nondegenerate $i$-invariant antisymmetric 2-form on $\mathcal{O}$ which is a restriction of $\omega$ defined on the whole $M$, and hence is symplectic.

A.2.1 Kähler structure on $P(V)$. Consider a complex vector space $V$, $\dim_{\mathbb{C}} V = n$, with a Hermitian scalar product $(\cdot|\cdot)$. The complex projective space $P(V)$ of $V$ is defined as
\[ P(V) = V / \sim, \tag{114} \]
where the equivalence $\sim$ is defined by
\[ v \sim w \;\Leftrightarrow\; w = \alpha v, \quad \alpha \in \mathbb{C}^*, \tag{115} \]
where $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$. The standard way to realize $P(V)$ is in two steps:
\[ V \xrightarrow{\;a\;} S(V) \xrightarrow{\;b\;} P(V), \tag{116} \]
where step $a$ is a quotient by the dilations $v \sim \alpha v$ for $\alpha \in \mathbb{R}^*$ and step $b$ is a quotient by the rotations $v \sim e^{i\phi} v$. The result of the quotient $a$ is the real sphere $S(V) = \{ v \in V : \|v\| = 1 \}$. The quotient $b$ gives $S(V)/S^1 = S^{2n-1}/S^1$, where $S^1$ represents the group of rotations. It is a well known fact that the complex projective space $P(V)$ is a complex manifold and $\dim_{\mathbb{C}} P(V) = n - 1$. Let $\pi : V \setminus \{0\} \to P(V)$ be the projection defined by the equivalence $\sim$. The tangent space to $P(V)$ at $z = \pi(v)$


is $T_z P(V) = \pi_v(T_v V)$. It is a good question to ask what $\pi_v(\xi)$ is, where $\xi \in T_v V$. Consider a curve $t \mapsto v(t) \in V$, $v(0) = v$. Then
\[ \xi = \frac{d}{dt}\Big|_{t=0} v(t). \tag{117} \]
We first apply the map $a$ to $v(t)$. As a result we get the curve $t \mapsto \frac{v(t)}{\|v(t)\|} \in S(V)$. Applying the map $b$ amounts to getting rid of the rotation $e^{i\phi} v$, hence finally the curve in $P(V)$ is given as
\[ t \mapsto \frac{v(t)}{\|v(t)\|} - \frac{v}{\|v\|} \Big( \frac{v}{\|v\|} \,\Big|\, \frac{v(t)}{\|v(t)\|} \Big). \tag{118} \]
The tangent vector to this curve is thus
\[ \frac{d}{dt}\Big|_{t=0} \frac{v(t)}{\|v(t)\|} - \frac{v}{\|v\|} \Big( \frac{v}{\|v\|} \,\Big|\, \frac{d}{dt}\Big|_{t=0} \frac{v(t)}{\|v(t)\|} \Big) = \frac{\xi}{\|v\|} - \frac{v}{\|v\|} \Big( \frac{v}{\|v\|} \,\Big|\, \frac{\xi}{\|v\|} \Big). \tag{119} \]
Hence for any vector $\xi \in T_v V$ we obtain the corresponding vector $\pi_v(\xi) \in T_{\pi(v)} P(V)$ as the orthogonal complement of $\frac{\xi}{\|v\|}$ to the subspace $\mathbb{C}v$ in the Hermitian scalar product $(\cdot|\cdot)$. Let us introduce a Hermitian scalar product on $P(V)$ by
\[ h(\pi_v(\xi), \pi_v(\eta)) = \Big( \frac{1}{\|v\|} \frac{\xi (v|v) - (v|\xi) v}{(v|v)} \,\Big|\, \frac{1}{\|v\|} \frac{\eta (v|v) - (v|\eta) v}{(v|v)} \Big), \tag{120} \]
which, after some calculations, reduces to
\[ h(\pi_v(\xi), \pi_v(\eta)) = \frac{(\xi|\eta)(v|v) - (\xi|v)(v|\eta)}{(v|v)^2}. \tag{121} \]
Of course, $h$ is a well defined, nondegenerate, positive Hermitian form on $P(V)$. Indeed, for any $z = \pi(v)$ we have $T_z P(V) = (\mathbb{C}v)^\perp$. Knowing this we can introduce a nondegenerate antisymmetric 2-form $\omega$ on $P(V)$ as the imaginary part of $h(\cdot, \cdot)$,
\[ \omega(\pi_v(\xi), \pi_v(\eta)) = -\mathrm{Im}\, h(\pi_v(\xi), \pi_v(\eta)). \tag{122} \]
It is straightforward to check that $\omega$ is not only $i$-invariant but also $U(V)$-invariant. To check that it is also closed notice that $U(V)$ is acting transitively on $V$, which means the vectors $Av$, where $A \in \mathfrak{u}(V)$, span $T_v V$. Hence,
\[ \omega(\pi_v(Av), \pi_v(Bv)) = -\mathrm{Im}\, \frac{(Av|Bv)(v|v) - (Av|v)(v|Bv)}{(v|v)^2}, \qquad A, B \in \mathfrak{u}(V). \tag{123} \]
But $(Av|v)(v|Bv)$ is real since $(Av|v)$ and $(v|Bv)$ are imaginary (this is because $A^\dagger = -A$ for all $A \in \mathfrak{u}(V)$). Hence,
\[ \omega(\pi_v(Av), \pi_v(Bv)) = -\mathrm{Im}\, \frac{(Av|Bv)}{(v|v)} = \frac{i([A, B]v|v)}{2(v|v)}, \qquad A, B \in \mathfrak{u}(V). \tag{124} \]
Now making use of Eq. (108) for the vector fields $\pi_{Uv}(U A U^* U v)$, and similarly for $B$ and $C$, we find, by the same argument as for coadjoint orbits, that $d\omega = 0$. This means $P(V)$ is a Kähler manifold. It is also a positive Kähler manifold because the corresponding symmetric form $b$ given by
\[ b(\pi_v(\xi), \pi_v(\eta)) = -\mathrm{Re}\, h(\pi_v(\xi), \pi_v(\eta)) \tag{125} \]
is clearly positive.
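The passage from (123) to (124) can be checked numerically. The following sketch is an illustration only: it takes the scalar product $(\cdot|\cdot)$ to be antilinear in its first argument (the convention of `np.vdot`), and the random vectors, matrices, and function names are assumptions introduced for the example. It also checks that the expression (121) does not change when a multiple of $v$ is added to $\xi$, as required for a form defined on $T_{\pi(v)} P(V)$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4


def random_u(n, rng):
    """Random anti-Hermitian matrix, i.e. an element of u(n)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return z - z.conj().T


def inner(a, b):
    """Hermitian scalar product (a|b), antilinear in the first argument."""
    return np.vdot(a, b)


def h(v, xi, eta):
    """Hermitian form (121) evaluated on pi_v(xi), pi_v(eta)."""
    vv = inner(v, v)
    return (inner(xi, eta) * vv - inner(xi, v) * inner(v, eta)) / vv**2


def omega(v, xi, eta):
    """Antisymmetric form (122): minus the imaginary part of h."""
    return -h(v, xi, eta).imag


v = rng.normal(size=n) + 1j * rng.normal(size=n)
xi = rng.normal(size=n) + 1j * rng.normal(size=n)
eta = rng.normal(size=n) + 1j * rng.normal(size=n)
A, B = random_u(n, rng), random_u(n, rng)

# Left-hand side: (123); right-hand side: the commutator expression in (124).
lhs = omega(v, A @ v, B @ v)
rhs = (1j * inner((A @ B - B @ A) @ v, v) / (2 * inner(v, v))).real

# h only depends on the components of xi orthogonal to v.
shifted = h(v, xi + (2 - 1j) * v, eta)
unshifted = h(v, xi, eta)
```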



References

1. Horodecki, R., Horodecki, P., Horodecki, M., Horodecki, K.: Quantum entanglement. Rev. Mod. Phys. 81, 865–942 (2009)
2. Werner, R.: Quantum states with Einstein-Podolsky-Rosen correlation admitting a hidden-variable model. Phys. Rev. A 40, 4277–4281 (1989)
3. Schliemann, J., Cirac, J.I., Kuś, M., Lewenstein, M., Loss, D.: Quantum correlations in two-fermion systems. Phys. Rev. A 64, 022303 (2001)
4. Eckert, K., Schliemann, J., Bruß, D., Lewenstein, M.: Quantum correlations in systems of identical particles. Ann. Phys. 299, 88–127 (2002)
5. Ghirardi, G., Marinatto, L., Weber, T.: Entanglement and properties of composite quantum systems, a conceptual and mathematical analysis. J. Stat. Phys. 108, 49–122 (2002)
6. Li, Y.S., Zeng, B., Liu, X.S., Long, G.L.: Entanglement in a two-identical-particle system. Phys. Rev. A 64, 054302 (2001)
7. Paškauskas, R., You, L.: Quantum correlations in two-boson wave functions. Phys. Rev. A 64, 042310 (2001)
8. Klyachko, A.: Dynamic symmetry approach to entanglement. http://arxiv.org/abs/0802.4008v1 [quant-ph], 2008
9. Bengtsson, I.: A curious geometrical fact about entanglement. AIP Conf. Proc. 962(1), 34–38 (2007)
10. Kuś, M., Bengtsson, I.: Classical quantum states. Phys. Rev. A 80, 022319 (2009)
11. Kostant, B., Sternberg, S.: Symplectic projective orbits. In: New directions in applied mathematics, papers presented April 25/26, 1980, on the occasion of the Case Centennial Celebration (Hilton, P.J., Young, G.S., eds.). New York: Springer, 1982, pp. 81–84
12. Guillemin, V., Sternberg, S.: Symplectic techniques in physics. Cambridge: Cambridge University Press, 1984
13. Perelomov, A.: Generalized coherent states and their applications. Heidelberg: Springer, 1986
14. Hall, B.C.: Lie groups, Lie algebras, and representations, an elementary introduction. New York: Springer, 2003
15. Kirillov, A.A.: Lectures on the orbit method. Graduate Studies in Mathematics Vol. 64. Providence, RI: Amer. Math. Soc., 2004
16. Horn, R.A., Johnson, C.R.: Matrix analysis. Cambridge: Cambridge University Press, 1985
17. Sinołęcka, M.M., Życzkowski, K., Kuś, M.: Manifolds of equal entanglement for composite quantum system. Acta Phys. Pol. 33, 2081–2095 (2002)
18. Fulton, W., Harris, J.: Representation theory. A first course. New York: Springer, 1991
19. Kotowski, M., Kotowski, M., Kuś, M.: Universal nonlinear entanglement witnesses. Phys. Rev. A 81, 062318 (2010)
20. Kraus, B.: Local unitary equivalence of multipartite pure states. Phys. Rev. Lett. 104, 020504 (2010)

Communicated by M.B. Ruskai




An Algebraic Version of Haag’s Theorem

Mihály Weiner1,2

1 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Rome, Italy
2 Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences, POB 127, H-1364 Budapest, Hungary. E-mail: [email protected]

Received: 20 July 2010 / Accepted: 17 September 2010
Published online: 18 May 2011 – © Springer-Verlag 2011

Abstract: Under natural conditions (such as the split property and geometric modular action of wedge algebras) it is shown that the unitary equivalence class of the net of local (von Neumann) algebras in the vacuum sector associated to double cones with bases on a fixed space-like hyperplane completely determines an algebraic QFT model. More precisely, if for two models there is a unitary connecting all of these algebras, then (without assuming that this unitary also connects their respective vacuum states or spacetime symmetry representations) it follows that the two models are equivalent. This result might be viewed as an algebraic version of the celebrated theorem of Rudolf Haag about problems regarding the so-called “interaction-picture” in QFT. The author's original motivation for finding such an algebraic version came from conformal chiral QFT. Both the chiral case and a related conjecture about standard half-sided modular inclusions will also be discussed.

1. Introduction

1.1. Haag’s theorem and its algebraic version. If we “freeze” a classical, nonrelativistic physical system (say, a mechanical system of n point masses) at a certain time-instant, we cannot see whether the system was an “interactive” or a “free” one. A certain configuration with given velocities may correspond either to a free or to an interactive system. Interaction becomes visible only when one looks at how things change. This is the basic idea behind the so-called “interaction-picture” in quantum field theory (QFT). Within the framework of the Wightman axioms [16], free models can be well described in terms of Wightman fields (i.e. operator-valued distributions on spacetime). Then, to give an interactive model, one should consider the restriction of the same free fields to a certain spacelike hyperplane but then extend it to spacetime with a

On leave from the Alfréd Rényi Institute of Mathematics, Budapest, Hungary.

Supported by the ERC Advanced Grant 227458 OACFT “Operator Algebras and Conformal Field Theory”.


different time-evolution. (So the interactive and free fields would coincide on our fixed spacelike hyperplane but possibly nowhere else.) However, Haag’s theorem (see the book [16] for a detailed account) has ruled out the existence of such a description. Suppose two QFT models are given: one in terms of the Wightman fields $\Phi_r$ ($r = 1, \dots, n$) and another one in terms of the Wightman fields $\tilde\Phi_r$ ($r = 1, \dots, n$). Assuming some relatively mild conditions (such as the existence of well-behaved restrictions of the fields to spacelike hyperplanes), if there exists a spacelike hyperplane $H$ and a unitary operator $V$ such that

$$V\,\Phi_r(x)\,V^* = \tilde\Phi_r(x) \qquad (x \in H,\ r = 1, \dots, n), \tag{1.1}$$

then it also follows that, up to a possible phase factor, $V\Omega = \tilde\Omega$ (where $\Omega$ and $\tilde\Omega$ are the respective vacuum vectors), that $V\Phi_r(x)V^* = \tilde\Phi_r(x)$ for all spacetime points $x$ and $r = 1, \dots, n$, and finally that $VU(g)V^* = \tilde U(g)$ (where $U$ and $\tilde U$ are the respective representations of spacetime symmetries) for all elements $g$ of the connected Poincaré group. Thus $V$ establishes an equivalence between the two models: if one was free, so is the other; we cannot make an interacting model out of a free one in this way.

So what would be an algebraic version of Haag’s theorem? Fix a spacelike hyperplane $H$. We shall say that two nets of von Neumann algebras $\mathcal{A}$ and $\tilde{\mathcal{A}}$ are equivalent along $H$ iff there is a unitary operator $V$ such that

$$V\,\mathcal{A}(\mathcal{O}_K)\,V^* = \tilde{\mathcal{A}}(\mathcal{O}_K) \tag{1.2}$$

for every double-cone $\mathcal{O}_K$ with base $K \subset H$. (See Sect. 2 on the notions of double-cones and algebraic QFT.) In this paper, under certain additional assumptions, it will be proved that if $(\mathcal{A}, U)$ and $(\tilde{\mathcal{A}}, \tilde U)$ are two algebraic QFT models and $\mathcal{A}$ and $\tilde{\mathcal{A}}$ are equivalent along a spacelike hyperplane, then there exists a unitary $W$ such that $W\mathcal{A}(\mathcal{O})W^* = \tilde{\mathcal{A}}(\mathcal{O})$ for all double-cones $\mathcal{O}$ and $WU(g)W^* = \tilde U(g)$ for all elements $g$ of the connected Poincaré group: thus the two models are equivalent. Actually, it will suffice to assume equivalence along a “half-hyperplane” $H^+$; see the details in Sect. 4. Nevertheless, $H^+$ still has infinite space-volume. In fact, it is well known that two inequivalent models, when restricted to a compact region, may give rise to unitarily equivalent nets of algebras; see [6,10] for examples.

1.2. Algebraic vs. original version. Of course, in a strict sense the two versions of Haag’s theorem cannot be compared. They are statements made in two different frameworks and, despite numerous attempts, the passage between the two frameworks (albeit clear in actual examples) is in general still unresolved. Nonetheless, as we shall see now, one may say in some sense that the algebraic version is stronger than the original one, and that the new version is not a simple reformulation of the old one. Let us see why. In case we deal with algebraic QFT models associated to Wightman field theories, our additional assumptions are known to hold, with the exception of the split property, which however could probably be avoided (see the comments at the end of Sect. 2). To appreciate the differences, one must look not at the assumptions regarding frameworks but at the respective notions of equivalence and the ways in which equivalence is established. The natural notion of equivalence of Wightman field theories (i.e. the existence of a unitary operator connecting the defining fields and representations), and hence also the condition of equivalence along a spacelike hyperplane appearing in Haag’s original


version, is too narrow, and does not coincide with physical equivalence. (In a sense this was exactly the original motivation [13] for considering the local algebras generated by the fields rather than the fields themselves: the algebras already contain all physical information, whereas the fields also depend on the choices made in our description.) But there is more to this. In the original version, the unitary operator $V$ appearing in Eq. (1.1) actually also turns out to be the unitary operator establishing the equivalence between the two models. This clearly does not hold in the algebraic case. For example, let both models be the same scalar free field model. Since the adjoint action of a Weyl operator $W(f)$ preserves every local algebra, $V := W(f)$ satisfies the requirement (1.2) made in the algebraic version. However, in general $W(f)\Omega \neq \lambda\Omega$, so $V = W(f)$ does not establish an equivalence between the model and itself. To put it another way: a unitary operator whose adjoint action leaves the fields along a hyperplane invariant must be a multiple of the identity and hence must preserve every local algebra. By contrast, a unitary operator whose adjoint action preserves every local algebra does not necessarily preserve the vacuum, and hence may not take a Poincaré-covariant field into a Poincaré-covariant field. So even if the passage between Wightman field theory and algebraic QFT were clear, the algebraic version of Haag’s theorem introduced here would not become a simple consequence of the original one. Rather, it is the other way around.

1.3. Conformal QFT and half-sided modular inclusions. Though it is always nice to strengthen a theorem, this was not why the author considered an algebraic version of Haag’s theorem. As will be explained now, the original motivation came from conformal chiral QFT and in particular its relation to half-sided modular inclusions. Möbius covariant nets on $S^1$ have remarkable properties.
Many things that in “ordinary” algebraic QFT often appear as additional assumptions (for example additivity, the Bisognano-Wichmann property and factoriality of local algebras) can in fact be derived; see [4,8,9,11,12] on the general structure of such nets. For simplicity of notation, let us consider such a net $\mathcal{A}$ with vacuum vector $\Omega$ on the real line $\mathbb{R}$ (see the last section for details of what exactly this means). Setting $\mathcal{M} := \mathcal{A}(0, \infty)$ and $\mathcal{N} := \mathcal{A}(1, \infty)$, by an application of the Bisognano-Wichmann property (which, as was mentioned, is automatic in the conformal case) we have that $(\Omega, \mathcal{N} \subset \mathcal{M})$ is a standard half-sided modular inclusion of von Neumann factors. That is,
• $\Omega$ is a standard vector of the inclusion $\mathcal{N} \subset \mathcal{M}$: it is cyclic and separating for $\mathcal{N}$, $\mathcal{M}$ and $\mathcal{N}' \cap \mathcal{M}$,
• $\Delta^{it}_{\mathcal{M},\Omega}\,\mathcal{N}\,\Delta^{-it}_{\mathcal{M},\Omega} \subset \mathcal{N}$ for all $t \le 0$.
This also works the other way around. Namely, it is shown in [2,12,18,19] that if $(\Omega, \mathcal{N} \subset \mathcal{M})$ is a standard half-sided modular inclusion of factors, then one can construct a unique strongly additive Möbius covariant net $\mathcal{A}$ with vacuum vector $\Omega$ such that $\mathcal{A}(0, \infty) = \mathcal{M}$ and $\mathcal{A}(1, \infty) = \mathcal{N}$. At first sight, this seems to give a great opportunity for constructing new conformal chiral QFT models. Indeed, instead of an entire net of algebras (together with a representation of the Möbius group), all we need is to present a certain standard inclusion of von Neumann factors. Sadly, in practice it is the other way around. As far as the author knows, (nontrivial) standard half-sided modular inclusions have been constructed only with the help of Möbius covariant nets. However, there were hopes to find a more or less direct way to


construct a new half-sided modular inclusion out of an existing one. R. Longo proposed (see footnote 1) to consider the following “perturbation” of a half-sided modular inclusion of factors $(\Omega, \mathcal{N} \subset \mathcal{M})$. For a vector $\Psi$ which is cyclic and separating for $\mathcal{M}$, let us denote by $J_\Psi$ and $\Delta_\Psi$ the modular objects associated to $(\Psi, \mathcal{M})$. By [1], for each $X \in \mathcal{M}$, $X^* = X$, there exists a vector $\Omega_X$ cyclic and separating for $\mathcal{M}$ such that
• $\Omega_X$ is in the natural cone of $(\Omega, \mathcal{M})$ and hence $J_{\Omega_X} = J_\Omega =: J$,
• $\ln(\Delta_{\Omega_X}) = \ln(\Delta_\Omega) + X + JXJ$.
In particular, if $X \in \mathcal{N}$ then by applying the Trotter product formula one can easily check that $(\Omega_X, \mathcal{N} \subset \mathcal{M})$ is still a half-sided modular inclusion. If $\Omega_X$ is also a standard vector for $\mathcal{N} \subset \mathcal{M}$, then we can go on and generate a new strongly additive net $\mathcal{A}_X$. But are the original net $\mathcal{A}_0$ (from which we took our half-sided modular inclusion) and $\mathcal{A}_X$ really different? Using the mentioned product formula one can also easily show that with $X \in \mathcal{N}$ many local algebras remain the same; not only do we have $\mathcal{A}_0(0, \infty) = \mathcal{A}_X(0, \infty) = \mathcal{M}$ and $\mathcal{A}_0(1, \infty) = \mathcal{A}_X(1, \infty) = \mathcal{N}$, but actually

$$\mathcal{A}_0(I) = \mathcal{A}_X(I) \qquad \text{for all } I \subset (0, 1). \tag{1.3}$$
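The Trotter-product argument mentioned above can be spelled out as follows. This is only a sketch under stated assumptions: the symbols $\Omega_X$, $\Delta_\Omega$, $\Delta_{\Omega_X}$ and $J$ are as in the two bullet points above, the standard form of the Trotter product formula (a self-adjoint operator plus a bounded self-adjoint perturbation) is assumed, and the grouping of factors is chosen here for illustration rather than taken from the text.

```latex
% Assumptions: X = X^* \in \mathcal{N} \subset \mathcal{M}, so JXJ \in \mathcal{M}'
% commutes with X and e^{is(X+JXJ)} = e^{isX} e^{isJXJ}.
\begin{align*}
\Delta_{\Omega_X}^{it}
  &= e^{it(\ln\Delta_\Omega + X + JXJ)}
   = \operatorname*{s-lim}_{m\to\infty}
     \Bigl( \Delta_\Omega^{it/m}\, e^{itX/m}\, e^{itJXJ/m} \Bigr)^{m}, \\[4pt]
e^{itJXJ/m}\, A\, e^{-itJXJ/m} &= A
  && (A \in \mathcal{N};\ JXJ \in \mathcal{M}' \subset \mathcal{N}'), \\
e^{itX/m}\, \mathcal{N}\, e^{-itX/m} &= \mathcal{N}
  && (X \in \mathcal{N}), \\
\Delta_\Omega^{it/m}\, \mathcal{N}\, \Delta_\Omega^{-it/m} &\subset \mathcal{N}
  && (t \le 0).
\end{align*}
% Hence, for t \le 0, the adjoint action of each finite product maps \mathcal{N}
% into \mathcal{N}. The products are unitaries converging strongly to a unitary,
% so their adjoints converge strongly as well, and since \mathcal{N} is strongly closed,
% \Delta_{\Omega_X}^{it}\, \mathcal{N}\, \Delta_{\Omega_X}^{-it} \subset \mathcal{N} \quad (t \le 0).
```

The only place where the sign of $t$ enters is the last displayed line, which is exactly the half-sided modular condition for the unperturbed inclusion.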

On the other hand, by an easy reformulation (see Sect. 5) of the main result of the present paper, if $\mathcal{A}_0$ and hence also $\mathcal{A}_X$ satisfy the split property, then the above equality implies that $\mathcal{A}_0$ and $\mathcal{A}_X$, as Möbius covariant nets, are equivalent. Thus, in this way we cannot obtain new models. Of course one may try to improve the situation. Instead of a self-adjoint $X \in \mathcal{N}$, more generally we could take any $X \in \mathcal{M}$, $X^* = X$, for which $e^{iXt}\mathcal{N}e^{-iXt} \subset \mathcal{N}$ for all $t \le 0$. For example, $X$ may be a self-adjoint operator of the form $X = X_1 + X_2$ with $X_1 \in \mathcal{N}$ and $X_2 \in \mathcal{M} \cap \mathcal{N}'$, and in concrete examples we may find further choices. Nevertheless, in light of Haag’s theorem, it seems unlikely to the author that by retaining the same inclusion $\mathcal{N} \subset \mathcal{M}$ and changing only the “dynamics” one could obtain something really new. Actually, in Sect. 5 we shall observe two important facts regarding this question. Let $(\Omega, \mathcal{N} \subset \mathcal{M})$ and $(\tilde\Omega, \tilde{\mathcal{N}} \subset \tilde{\mathcal{M}})$ be two standard half-sided modular inclusions of factors and denote the two corresponding strongly additive Möbius covariant nets by $(\mathcal{A}, U)$ and $(\tilde{\mathcal{A}}, \tilde U)$, respectively.

I. If there exists a unitary operator $V$ such that $V\mathcal{N}V^* = \tilde{\mathcal{N}}$ and $V\mathcal{M}V^* = \tilde{\mathcal{M}}$, then for each $n \in \mathbb{N}$ there exists a unitary operator $V_n$ such that $V_n\mathcal{A}(j, k)V_n^* = \tilde{\mathcal{A}}(j, k)$ for every pair of integers $j, k \in \{0, 1, \dots, n\}$.

In particular, this implies that if $\mathcal{A}$ is split, so is $\tilde{\mathcal{A}}$, and in fact $\mathcal{A}$ and $\tilde{\mathcal{A}}$ have unitarily equivalent 2-interval inclusions. Now this inclusion is a rich source of information; in the completely rational case it essentially contains [14] the entire representation theory of the net. So this already suggests that perhaps $\mathcal{A}$ and $\tilde{\mathcal{A}}$ are equivalent. As a matter of fact, the conformal version of our algebraic Haag’s theorem says that just a slightly stronger condition indeed implies equivalence.

1 This idea has never been published; the author learned about it through oral communication.



II. Let $(\mathcal{A}, U)$ and $(\tilde{\mathcal{A}}, \tilde U)$ be two Möbius covariant nets and assume that at least one of them is split. If there exists a unitary operator $V$ such that $V\mathcal{A}(j, k)V^* = \tilde{\mathcal{A}}(j, k)$ for every pair of natural numbers $j, k \in \mathbb{N}$, then $(\mathcal{A}, U)$ and $(\tilde{\mathcal{A}}, \tilde U)$ are equivalent.

Again, as was mentioned already and will be explained at the end of the next section, the author thinks that it should be possible to remove the split condition. Now I and II, together with the remarks made after stating them, seem to indicate (though they do not actually prove) the following:

Conjecture 1.1. The unitary equivalence class of a standard half-sided modular inclusion of factors $(\Omega, \mathcal{N} \subset \mathcal{M})$ is completely determined (up to a possible normalization of $\Omega$) by the unitary equivalence class of the inclusion $\mathcal{N} \subset \mathcal{M}$. That is, for another half-sided modular inclusion $(\tilde\Omega, \tilde{\mathcal{N}} \subset \tilde{\mathcal{M}})$ with equal normalization $\|\tilde\Omega\| = \|\Omega\|$, if there exists a unitary operator $V$ such that $V\mathcal{N}V^* = \tilde{\mathcal{N}}$ and $V\mathcal{M}V^* = \tilde{\mathcal{M}}$, then there exists a unitary operator $W$ such that not only $W\mathcal{N}W^* = \tilde{\mathcal{N}}$ and $W\mathcal{M}W^* = \tilde{\mathcal{M}}$, but also $W\Omega = \tilde\Omega$.

2. Preliminaries: Axioms of Algebraic QFT

In this paper we shall consider an algebraic version of Haag’s theorem. An algebraic QFT, rather than being given by quantum fields, is given in terms of a net of local algebras $\mathcal{O} \mapsto \mathcal{A}(\mathcal{O})$. We shall work directly on the so-called “vacuum Hilbert space” and consider $\mathcal{O} \mapsto \mathcal{A}(\mathcal{O}) = \mathcal{A}(\mathcal{O})''$ to be a net of von Neumann algebras. For a spacelike hyperplane $H$, and a bounded, connected and simply connected open subset $K$ of $H$ we set

$$\mathcal{O}_K := \mathrm{Int}(K^c), \tag{2.1}$$

where $K^c$ is the (closed) causal completion of $K$ and “Int” stands for the (open) interior. We say that $\mathcal{O}_K$ is a double-cone with base on $H$; that is, with base $K \subset H$. For physical purposes (e.g. for determining the S-matrix or the structure of charged sectors) it is enough to work with special spacetime regions like double-cones. So, considering only what is absolutely necessary, here we define an algebraic QFT to be a map associating to each double-cone $\mathcal{O}$ a von Neumann algebra $\mathcal{A}(\mathcal{O})$ on a fixed Hilbert space $\mathcal{H}$, together with a strongly continuous unitary representation $U$ of the connected Poincaré group, satisfying the following “minimal” conditions. (Some further properties will be considered later.)
(1) Isotony: $\mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$ whenever $\mathcal{O}_1 \subset \mathcal{O}_2$.
(2) Locality: $[\mathcal{A}(\mathcal{O}_1), \mathcal{A}(\mathcal{O}_2)] = 0$ whenever $\mathcal{O}_1$ and $\mathcal{O}_2$ are spacelike separated.
(3) Covariance: $U(g)\mathcal{A}(\mathcal{O})U(g)^* = \mathcal{A}(g(\mathcal{O}))$ for all regions $\mathcal{O}$ and elements $g$ of the connected Poincaré group.
(4) Positivity of energy: $P_x \ge 0$ whenever $x$ is future-like. Here $P_x$ is defined by the equation $U(\tau_{tx}) = e^{itP_x}$ ($t \in \mathbb{R}$), in which $\tau_z$ is the translation by $z$.
(5) Existence, uniqueness and cyclicity of the vacuum: up to phase there exists a unique unit vector $\Omega$ invariant for $U(\tau)$ for all spacetime translations $\tau$. Moreover, $\Omega$ is cyclic for $\bigvee_{\mathcal{O}} \mathcal{A}(\mathcal{O})$.


Note that from a physical point of view one should assume $U$ to be a projective representation rather than a true one. However, it is easy to see that if $U$ is a projective representation of a group $G$ and $N \subset G$ is a normal subgroup such that there exists a unique one-dimensional invariant subspace for $U(N)$, then this subspace is actually invariant for the action of the full group, and hence one can arrange the “phase factors” in such a way that $U$ becomes a true representation. So, without loss of generality and for clarity, we have stated the axioms with $U$ being a true representation rather than just a projective one. Although so far we have only associated algebras to bounded regions of double-cone type, by setting

$$\mathcal{A}(\mathcal{O}) := \bigvee_{\mathcal{O}_K \subset \mathcal{O}} \mathcal{A}(\mathcal{O}_K), \tag{2.2}$$

we may talk about the algebra $\mathcal{A}(\mathcal{O})$ associated to any open region $\mathcal{O}$. Note that isotony implies that the new definition does not change the algebra associated to a double-cone and that properties (1, 2, 3, 4, 5) remain valid. The standard Reeh-Schlieder argument combined with locality shows that $\Omega$ is cyclic and separating for $\mathcal{A}(\mathcal{W})$ whenever $\mathcal{W}$ is a wedge region. (See e.g. the book [13] for the precise definition of wedge regions.) Actually, by [15, Thm. 3] it even follows that for a wedge region $\mathcal{W}$ the algebra $\mathcal{A}(\mathcal{W})$ is a type III$_1$ factor. Another well-known consequence of (1, 2, 3, 4, 5) is irreducibility:

$$\mathcal{A}(M)' \equiv \Bigl\{\bigvee_{\mathcal{O}} \mathcal{A}(\mathcal{O})\Bigr\}' = \bigcap_{\mathcal{O}} \mathcal{A}(\mathcal{O})' = \mathbb{C}1. \tag{2.3}$$

Here $M$ stands for the full spacetime. However, as was mentioned, (1, 2, 3, 4, 5) is only a “minimal set” of conditions; they still allow many pathological examples. In particular, while $\Omega$ turns out to be cyclic for $\mathcal{A}(\mathcal{W})$ whenever $\mathcal{W}$ is a wedge, it may not be so for a double-cone (see footnote 2). Sometimes instead of isotony the stronger additivity property is required; namely, that $\mathcal{A}(\mathcal{O}) \subset \bigvee_{k=1}^{n} \mathcal{A}(\mathcal{O}_k)$ whenever $\mathcal{O} \subset \bigcup_{k=1}^{n} \mathcal{O}_k$. (Note that in the conformal case additivity is not needed as a further assumption since it can actually be proved, as will be discussed in Sect. 5.) Given additivity, one can use the argument of Reeh and Schlieder to show that $\Omega$ is cyclic for every local algebra $\mathcal{A}(\mathcal{O})$ associated to a nonempty open region $\mathcal{O}$.

Local von Neumann algebras were originally introduced to replace the unbounded polynomial algebra of local fields. From a physical point of view it seems reasonable to assume that our local von Neumann algebras are in fact generated by unbounded (Wightman) fields (i.e. that there is an “underlying” Wightman field theory and $\mathcal{A}(\mathcal{O})$ is the smallest von Neumann algebra to which the closures of all fields smeared with test functions supported in $\mathcal{O}$ are affiliated). Now for the algebra of fields additivity is evident. However, the passage from unbounded operators to von Neumann algebras is a delicate issue. In particular, at least to the author’s knowledge, additivity could so far not be proved even assuming an underlying Wightman field theory. On the other hand, it is easy to see that the cyclicity guaranteed (at the level of Wightman fields) by the Reeh-Schlieder theorem passes without problems to the level of local von Neumann algebras. For this reason we shall here assume this cyclicity directly rather than making the stronger assumption of additivity.

2 For an example, let us fix a bounded open set of spacetime and call a region “small” if it can be moved into this set by a Poincaré transformation. Now take a “nice” model and reset all local algebras that are associated to “small” regions to be equal to the trivial algebra $\mathbb{C}1$. It is easy to see that all listed properties remain valid, but now, starting from a “nice” model, we have produced one with the mentioned pathological property.



(6) Reeh-Schlieder property: $\Omega$ is cyclic for $\mathcal{A}(\mathcal{O})$ for every nonempty open region $\mathcal{O}$.

Let $\mathcal{W}$ be a wedge region and consider the modular operator $\Delta_{\mathcal{A}(\mathcal{W}),\Omega}$ and modular conjugation $J_{\mathcal{A}(\mathcal{W}),\Omega}$ associated to $(\mathcal{A}(\mathcal{W}), \Omega)$. (As was mentioned, $\Omega$ is cyclic and separating for $\mathcal{A}(\mathcal{W})$, so these objects are well defined.) Assuming the existence of an “underlying” Wightman field theory, it is known [3] that these objects have a “geometrical meaning”. Though attempts were made, so far it has not been proved that in general the geometrical nature of these modular objects is a consequence of (1, 2, 3, 4, 5). So we shall simply assume it.

(7) Bisognano-Wichmann property: If $\mathcal{W}$ is a wedge region, then $\Delta^{it}_{\mathcal{A}(\mathcal{W}),\Omega} = U(\beta_t)$ ($t \in \mathbb{R}$), where $t \mapsto \beta_t$ is the one-parameter group of boosts associated to $\mathcal{W}$ with a certain parametrization.

For the definition of the one-parameter group of boosts associated to a wedge and details on the parametrization we refer to the book [13]. Note that, as will be explained in Sect. 5, in the conformal case not only (6) but also this property can be derived, eliminating the need to assume it additionally. The properties (1, 2, 3, 4, 5, 6, 7) discussed so far are essential for our argument. However, though for a somewhat technical reason, we shall actually need one more property:

(8) Split property: The inclusion $\mathcal{A}(\mathcal{O}_{K_1}) \subset \mathcal{A}(\mathcal{O}_{K_2})$ is split whenever $K_1 \subset K_2$.

(Actually the distant split property would suffice for us, but for simplicity here we only talk about the split property.) For the physical significance of the split property we again refer to the book [13]. Here we briefly comment only on the difference between how (8) and the other properties will be used. In the course of the proof of our main theorem, we shall construct a sequence of unitary operators $n \mapsto W_n$. The equivalence between the two models is then to be established by the strong limit of this sequence. But though the author is convinced that this limit exists, he could not show this.
So instead, the split property is used to obtain a compactness condition by which at least the existence of a convergent subsequence can be established. Now the way in which the split property can be turned into the right compactness condition is not simple; in fact the whole next section will be dedicated to this question. Nevertheless, the author feels that the split property should not play an essential role in the algebraic Haag’s theorem.

3. On Split Inclusions

An inclusion of von Neumann algebras $\mathcal{N} \subset \mathcal{M}$ for which there exists a type I factor $\mathcal{R}$ “in between”, $\mathcal{N} \subset \mathcal{R} \subset \mathcal{M}$, is said to be a split inclusion. Let $\mathcal{N} \subset \mathcal{M}$ be a split inclusion and $\Omega$ a standard vector for the inclusion in question; i.e. we suppose that $\Omega$ is cyclic and separating for $\mathcal{N}$, $\mathcal{M}$ and the relative commutant $\mathcal{N}' \cap \mathcal{M}$. Denoting the modular conjugation associated to $\mathcal{N}' \cap \mathcal{M}$ and the vector $\Omega$ by $J_\Omega$, if $\mathcal{N}$ is a factor, we shall set

$$\mathcal{R}_\Omega := \mathcal{N} \vee J_\Omega \mathcal{N} J_\Omega. \tag{3.1}$$

Alternatively, if $\mathcal{M}$ is a factor, we shall set

$$\mathcal{R}_\Omega := \mathcal{M} \cap J_\Omega \mathcal{M} J_\Omega. \tag{3.2}$$


By [7], under the assumptions made our notation is unambiguous: if both $\mathcal{N}$ and $\mathcal{M}$ are factors, then $\mathcal{N} \vee J_\Omega \mathcal{N} J_\Omega = \mathcal{M} \cap J_\Omega \mathcal{M} J_\Omega$. Moreover, the von Neumann algebra $\mathcal{R}_\Omega$ thus defined is a type I factor between $\mathcal{N}$ and $\mathcal{M}$; we shall say that $\mathcal{R}_\Omega$ is the canonical type I factor of the inclusion. If $(\Omega, \mathcal{N} \subset \mathcal{M})$ is a standard split inclusion in which one of the algebras is a factor, and $W$ is a unitary operator that preserves the vector $\Omega$ and whose adjoint action preserves the algebras $\mathcal{N}$ and $\mathcal{M}$, then the adjoint action of $W$ must also preserve the canonical type I factor $\mathcal{R}_\Omega$ of the inclusion. Using this fact, it was proved in [7] that the group of such unitary operators is compact and metrizable (with respect to the strong operator topology). In particular, if $n \mapsto W_n$ is a sequence of such unitary operators, then one can always find a subsequence $s$ such that $n \mapsto W_{s(n)}$ converges strongly to a unitary operator.

In this section we assume that $(\Omega, \mathcal{N} \subset \mathcal{M})$ is a standard split inclusion in which at least one of the algebras is a factor, and that $n \mapsto W_n$ is a sequence of unitaries such that the adjoint action of $W_n$ preserves the algebras $\mathcal{N}$ and $\mathcal{M}$ for all $n \in \mathbb{N}$. We set $\Psi_n := W_n\Omega$ and assume that $n \mapsto \Psi_n$ is convergent; more precisely, that there is a standard vector $\Psi$ for our inclusion such that $\|\Psi_n - \Psi\| \to 0$ as $n \to \infty$. Our aim is to find a suitable modification of the proof of [7] in order to show the existence of a subsequence of $n \mapsto W_n$ converging strongly to a unitary operator. We shall proceed in several intermediate steps, beginning with an important observation which generalizes [7, Lemma 3.2].

Lemma 3.1. Let $\mathcal{H}$ be a Hilbert space, $n \mapsto U_n$ a sequence of unitary operators on $\mathcal{H}$ and $\varphi$ a faithful normal state on $B(\mathcal{H})$. If $n \mapsto \varphi_n := \varphi \circ \mathrm{Ad}(U_n)$ converges in norm, then there exists a subsequence $s$ such that $n \mapsto U_{s(n)}$ converges strongly. Moreover, if the norm limit of $n \mapsto \varphi_n$ is faithful, then the strong limit of $n \mapsto U_{s(n)}$ is unitary.

Proof. If $\dim(\mathcal{H}) < \infty$, then the statement is trivially true (the unitary group is then compact). On the other hand, as $B(\mathcal{H})$ was assumed to have a faithful normal state, $\mathcal{H}$ must be separable. So we may assume that $\mathcal{H}$ is the (up to unitary equivalence) unique infinite dimensional separable Hilbert space. For each normal state $\eta$ there exists a unique positive trace-class operator $D_\eta \in B(\mathcal{H})$ such that

$$\eta(A) = \mathrm{Tr}(D_\eta A) \tag{3.3}$$

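for all $A \in B(\mathcal{H})$. Since $\varphi_n = \varphi \circ \mathrm{Ad}(U_n)$, we have that $D_n := D_{\varphi_n} = U_n D_\varphi U_n^*$. Now let $\tilde\varphi$ be the assumed (norm) limit of $n \mapsto \varphi_n$, and consider the operator $D_{\tilde\varphi}$. We know that $\varphi_n \to \tilde\varphi$ in norm as $n \to \infty$. What does this tell us about $D_n$ ($n \in \mathbb{N}$) and $D_{\tilde\varphi}$? Since $\|D_n - D_{\tilde\varphi}\| \le 2$, as $n \to \infty$ we have

$$0 \le \mathrm{Tr}\bigl((D_{\tilde\varphi} - D_n)^2\bigr) = (\tilde\varphi - \varphi_n)(D_{\tilde\varphi} - D_n) \le 2\,\|\tilde\varphi - \varphi_n\| \to 0. \tag{3.4}$$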
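For the reader's convenience, the chain of (in)equalities in (3.4) can be unpacked as below. This is a sketch that relies only on (3.3) and on standard facts about density operators; the norm subscripts "op" and "HS" (operator and Hilbert-Schmidt norm) are notation introduced here for illustration.

```latex
% Assumptions: D_{\tilde\varphi}, D_n are density operators (positive, trace one),
% hence \|D_{\tilde\varphi}\|_{op} \le 1 and \|D_n\|_{op} \le 1.
\begin{align*}
\mathrm{Tr}\bigl((D_{\tilde\varphi} - D_n)^2\bigr)
  &= \mathrm{Tr}\bigl(D_{\tilde\varphi}(D_{\tilde\varphi} - D_n)\bigr)
   - \mathrm{Tr}\bigl(D_n(D_{\tilde\varphi} - D_n)\bigr) \\
  &= \tilde\varphi(D_{\tilde\varphi} - D_n) - \varphi_n(D_{\tilde\varphi} - D_n)
   = (\tilde\varphi - \varphi_n)(D_{\tilde\varphi} - D_n) \\
  &\le \|\tilde\varphi - \varphi_n\| \, \|D_{\tilde\varphi} - D_n\|_{op}
   \le \|\tilde\varphi - \varphi_n\| \bigl( \|D_{\tilde\varphi}\|_{op} + \|D_n\|_{op} \bigr)
   \le 2\,\|\tilde\varphi - \varphi_n\| , \\[4pt]
\|D_{\tilde\varphi} - D_n\|_{op}^2
  &\le \|D_{\tilde\varphi} - D_n\|_{HS}^2
   = \mathrm{Tr}\bigl((D_{\tilde\varphi} - D_n)^2\bigr) .
\end{align*}
```

The last line is what turns convergence of the trace in (3.4) into convergence of $D_n$ to $D_{\tilde\varphi}$ in operator norm.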

In particular, $D_n \to D_{\tilde\varphi}$ in norm. Let us now see what we can say about the convergence of spectra and spectral projections. Let $f$ be a continuous real function on $[0, 1]$ and $\varepsilon > 0$. Then by the Stone-Weierstrass theorem there is a real polynomial $p$ such that $|f(x) - p(x)| < \varepsilon/3$ for all $x \in [0, 1]$. As $\mathrm{Sp}(D_n), \mathrm{Sp}(D_{\tilde\varphi}) \subset [0, 1]$, we have that both $\|f(D_{\tilde\varphi}) - p(D_{\tilde\varphi})\| < \varepsilon/3$ and $\|f(D_n) - p(D_n)\| < \varepsilon/3$ for all $n \in \mathbb{N}$. On the other hand, as $p$ is a polynomial and $D_n \to D_{\tilde\varphi}$ in norm, there exists an $N \in \mathbb{N}$ such that $\|p(D_n) - p(D_{\tilde\varphi})\| < \varepsilon/3$ for all $n > N$. Thus for $n > N$ we have that

$$\|f(D_n) - f(D_{\tilde\varphi})\| \le \|f(D_n) - p(D_n)\| + \|p(D_n) - p(D_{\tilde\varphi})\| + \|p(D_{\tilde\varphi}) - f(D_{\tilde\varphi})\| < \varepsilon, \tag{3.5}$$

showing that $\|f(D_n) - f(D_{\tilde\varphi})\| \to 0$ as $n \to \infty$.

As was already noted, $\mathrm{Sp}(D_n) = \mathrm{Sp}(D_\varphi)$ for every $n \in \mathbb{N}$ because of unitary equivalence. Now $D_\varphi$ and $D_{\tilde\varphi}$ are density operators, so their spectra are contained in $[0, 1]$ and have at most one point of accumulation, namely zero. Moreover, each positive point of their spectra must be an eigenvalue corresponding to a finite dimensional eigenspace. Since the spectrum is compact, if $x \notin \mathrm{Sp}(D_\varphi)$, then there exists a continuous function $f$ on $[0, 1]$ such that $f|_{\mathrm{Sp}(D_\varphi)} = 0$ but $f(x) = 1$. Thus $f(D_\varphi) = 0 = f(D_n)$, so by the established convergence property $f(D_{\tilde\varphi}) = 0$, showing that $x$ cannot be an eigenvalue of $D_{\tilde\varphi}$. On the other hand, let us fix an eigenvalue $\lambda \in \mathrm{Sp}(D_\varphi) \setminus \{0\}$ and choose a continuous function $f$ on $[0, 1]$ such that $f(x) = 0$ for all $x \in \mathrm{Sp}(D_\varphi) \setminus \{\lambda\}$ and $f(x) = 1$ if and only if $x = \lambda$. Then $f(D_n)$ is exactly the spectral projection associated to the eigenvalue $\lambda$ of $D_n$; in particular $\|f(D_n)\| = 1$ and $f(D_n)^2 = f(D_n)^* = f(D_n)$. This shows that $f(D_{\tilde\varphi})$, which is the norm limit of $f(D_n)$, is also a nonzero projection. Now $0 \in \mathrm{Sp}(D_\varphi) \cap \mathrm{Sp}(D_{\tilde\varphi})$ since we are dealing with density operators on an infinite-dimensional space. Moreover, 0 is not an eigenvalue of $D_\varphi$ since $\varphi$ was assumed to be faithful.

Let us sum up what we have obtained so far. We have shown that $\mathrm{Sp}(D_\varphi) = \mathrm{Sp}(D_n) = \mathrm{Sp}(D_{\tilde\varphi})$ and that for each eigenvalue $\lambda$ of $D_\varphi$, the spectral projections $E_{n,\lambda}$ of $D_n$ corresponding to the eigenvalue $\lambda$ converge in norm to the spectral projection $E_{\tilde\varphi,\lambda}$ of $D_{\tilde\varphi}$ corresponding to the same eigenvalue $\lambda$. Let $\Phi$ be an eigenvector of $D_\varphi$ with eigenvalue $\lambda$. Then $\Phi_n := U_n\Phi$ is an eigenvector of $D_n = U_n D_\varphi U_n^*$ with the same eigenvalue. To put it another way, $(E_{n,\lambda} - 1)\Phi_n = 0$, implying that $(E_{\tilde\varphi,\lambda} - 1)\Phi_n \to 0$ as $n \to \infty$, and so $n \mapsto \Phi_n$ (or a subsequence of it) converges if and only if $n \mapsto E_{\tilde\varphi,\lambda}\Phi_n$ (or the subsequence in question) does so. But $n \mapsto E_{\tilde\varphi,\lambda}\Phi_n$ “runs” in the unit ball of the finite dimensional space $\mathrm{Im}(E_{\tilde\varphi,\lambda})$, so it admits a convergent subsequence. Since $D_\varphi$ is a density operator, there exists a complete orthonormal system consisting only of eigenvectors of $D_\varphi$. Moreover, since $\mathcal{H}$ is separable, this system is countable. Thus, by what was established and a diagonal argument, we may conclude the existence of a subsequence $s$ such that $n \mapsto U_{s(n)}$ converges strongly on each vector of this system, and hence (as we are dealing with a sequence of unitary operators) on every vector of $\mathcal{H}$.

We are almost finished: we have proved the existence of a convergent subsequence. However, the limit of a strongly convergent sequence of unitary operators need not again be a unitary operator (in general, it is only an isometry). To show the existence of a unitary limit, we have to check the strong convergence of the adjoints. To conclude our proof, all we have to note is that if $\tilde\varphi$ is faithful, then we may repeat our argument with the unitary sequence $n \mapsto U_{s(n)}^*$ and with the roles of $\varphi$ and $\tilde\varphi$ exchanged.

Recall that in this section we are dealing with a standard split inclusion $(\Omega, \mathcal{N} \subset \mathcal{M})$ in which at least one of the algebras is a factor, and a certain sequence of unitary operators $n \mapsto W_n$. In our case the adjoint action of $W_n$ does not necessarily preserve the canonical type I factor $\mathcal{R}_\Omega$. Rather, we have that $W_n \mathcal{R}_\Omega W_n^* = \mathcal{R}_{\Psi_n}$, where $\mathcal{R}_{\Psi_n}$ is the canonical type I factor given by the vector $\Psi_n$. (Note that $\Psi_n = W_n\Omega$ is automatically a standard vector for the inclusion $\mathcal{N} \subset \mathcal{M}$.) All we can hope now is that since $\Psi_n \to \Psi$, the type I factors $\mathcal{R}_{\Psi_n}$ will get “closer and closer” to the type I factor $\mathcal{R}_\Psi$. At this point,


our previous lemma resolves only the rather particular case when the adjoint action of $W_n$ actually does preserve $\mathcal{R}_\Omega$. However, this in turn will serve to prove the general case.

Proposition 3.2. Suppose that for all $n \in \mathbb{N}$, the adjoint action of $W_n$ also preserves the canonical type I factor $\mathcal{R}_\Omega$. Then there exists a subsequence $s$ such that $n \mapsto W_{s(n)}$ converges strongly to a unitary operator.

Proof. First let us note that if $n \mapsto A_n$ is a sequence of uniformly bounded operators converging strongly to a bounded operator $A$, then also $A_n \otimes 1 \to A \otimes 1$ strongly as $n \to \infty$. Indeed, convergence is clear on vectors of tensor product form, and hence on every vector, as our sequence was assumed to be uniformly bounded. In particular, if we identify $\mathcal{R}_\Omega$ with $B(\mathcal{K})$ (via an isomorphism), where $\mathcal{K}$ is some Hilbert space, then a sequence of unitary operators $n \mapsto U_n \in \mathcal{R}_\Omega$ converges strongly to a unitary operator $U$ of $\mathcal{R}_\Omega$ if and only if we have convergence in the topology given by the strong operator topology of $B(\mathcal{K})$. So let now $\omega$ and $\psi$ be the normal states on $\mathcal{R}_\Omega$ given by the vectors $\Omega$ and $\Psi$, respectively. These states are faithful since the vectors in question are separating for $\mathcal{M}$, which contains $\mathcal{R}_\Omega$. Since $\mathcal{R}_\Omega$ is a type I factor, the adjoint action of $W_n$ on $\mathcal{R}_\Omega$ can be implemented by a unitary $U_n \in \mathcal{R}_\Omega$. We have that $\omega \circ \mathrm{Ad}(U_n) \to \psi$ in norm, since $\|W_n\Omega - \Psi\| \to 0$. Thus our previous lemma can be applied and, by what was noted at the beginning of our proof, it shows that there exists a unitary $U \in \mathcal{R}_\Omega$ and a subsequence $s$ such that $U_{s(n)} \to U$ strongly (on our original Hilbert space, not only on $\mathcal{K}$) as $n \to \infty$. Then for an $A \in \mathcal{R}_\Omega$ we have that as $n \to \infty$,

$$W_{s(n)} A\,\Omega = (U_{s(n)} A U_{s(n)}^*)\,W_{s(n)}\Omega \to U A U^*\,\Psi, \tag{3.6}$$

since $\|W_{s(n)}\Omega - \Psi\| \to 0$ and since the strong limit of a product of strongly convergent, uniformly bounded sequences is simply the product of the limits. Thus $n \mapsto W_{s(n)}$ is strongly convergent on $\mathcal{R}_\Omega\Omega$, and similarly $n \mapsto W_{s(n)}^*$ is strongly convergent on $\mathcal{R}_\Omega\Psi$. Now both $\Omega$ and $\Psi$ are cyclic for $\mathcal{N}$ and hence for $\mathcal{R}_\Omega$, too; so actually we have shown that $n \mapsto W_{s(n)}$ converges strongly to a unitary operator.

In our previous proposition we assumed $\mathcal{R}_n := \mathcal{R}_{\Psi_n}$ to coincide with $\mathcal{R}_\Omega$. It is rather clear that this assumption is too strong; it will not hold in general. So now we shall see how we can “correct” $W_n$ by another unitary in order to have this property.

Lemma 3.3. Let $\Psi$, $\Psi_n$ ($n \in \mathbb{N}$) be standard vectors for the split inclusion $\mathcal{N} \subset \mathcal{M}$ in which at least one of the algebras is a factor. If $\|\Psi_n - \Psi\| \to 0$ as $n \to \infty$, then there exists a sequence of unitaries $n \mapsto U_n \in \mathcal{N}' \cap \mathcal{M}$ converging strongly to the operator 1 such that $U_n \mathcal{R}_{\Psi_n} U_n^* = \mathcal{R}_\Psi$ for all $n \in \mathbb{N}$.

Proof. We may assume that the smaller algebra $\mathcal{N}$ is a factor. (If only $\mathcal{M}$ is a factor, then instead of the original inclusion we may consider $(\Psi, \mathcal{M}' \subset \mathcal{N}')$, in which it is again the smaller algebra which is a factor.) Let us denote by $\psi$, $\psi_n$ ($n \in \mathbb{N}$) the faithful normal states on $\mathcal{N}' \cap \mathcal{M}$ given by the vectors $\Psi$, $\Psi_n$ ($n \in \mathbb{N}$), respectively. The state $\psi_n$ has a unique vector representative $\tilde\Psi_n$ in the natural cone of $(\Psi, \mathcal{N}' \cap \mathcal{M})$. Note that by construction, the modular conjugation $J_{\tilde\Psi_n}$ associated to $(\tilde\Psi_n, \mathcal{N}' \cap \mathcal{M})$ coincides with $J_\Psi$. As both cyclic and separating vectors $\Psi_n$ and $\tilde\Psi_n$ implement the same state on $\mathcal{N}' \cap \mathcal{M}$, there exists a unitary $U_n' \in (\mathcal{N}' \cap \mathcal{M})'$ such that $U_n'\Psi_n = \tilde\Psi_n$, or equivalently,



˜ n = n . As the adjoint action of Un preserves both N , M and N ∩ M, we that Un∗ have that Un∗ J Un = Un∗ J˜ n Un = JU ∗ ˜ n = Jn . n

(3.7)

Moreover, as is rather evident, J Un∗ N Un J ⊂ N ∩ M, we also have that Un is in the commutant of J Un∗ N Un J and Jn N Jn = Un∗ J Un N Un∗ J Un = J Un N Un∗ J = (J Un J )J N J (J Un J )∗ = Un Jn N Jn Un∗ ,

(3.8)

where Un = J Un J is a unitary in the relative commutant N ∩ M. Now the sequence of states n → ψn clearly converges to ψ in norm (since n → as n → ∞). It follows ˜ n and , both elements of the natural cone of that the distance between the vectors (, N ∩ M), also goes to zero as n → ∞. Now Un = J Un J = J Un and J Un − J Un n = − n → 0, so n → Un is convergent as in fact ˜ n ) = J = . lim(Un ) = lim(J Un n ) = lim(J n

n

n

(3.9)

Since Un ∈ N ∩ M ⊂ M, the above shows that n → Un strongly converges to the identity on the closure of M and hence everywhere (as is cyclic and separating for M and so for M , too). Corollary 3.4. Under the assumptions explained in the beginning of this section, it follows that there exists a subsequence s such that n → Ws(n) strongly converges to a unitary operator.

4. Equivalence of Models Fix a space-like hyperplane H , and let further τ be a nonzero translation such that τ (H ) = H . Fix a plane N in H which is orthogonal to the direction of the translation τ . Then H \N is the disjoint union of two open “half-spaces” H + and H − . Here the “+” and “−” signs are given in such a way that τ (H + ) ⊂ H + while τ (H − ) ⊃ H − . Note that W ± := (H ± ) are wedge-regions such that the causal complement of any of them is exactly (the closure of) the other. Moreover, we have τ (W + ) ⊂ W + and τ (W − ) ⊃ W − , and that ∪n∈N τ n (W − ) is the full spacetime. Hence if (A, UA ) is an algebraic QFT given on the Hilbert space H satisfying axioms (1, 2, 3, 4, 5) discussed in the Introduction, then n → A(τ n (W − )) is an increasing sequence of von Neumann algebras such that its union is dense (w.r.t. the strong op. topology) in B(H). Then for the decreasing sequence n → A(τ n (W + )), by locality we have that ∩n∈N A(τ n (W + )) = C1. In our main theorem — apart from many other things — we shall also use a rather well-known fact concerning a decreasing sequence of von Neumann algebras and distances of restrictions of states. However, for reasons of self-containment we shall outline the proof of this fact (which is anyway short). Lemma 4.1. Let M1 ⊃ M2 ⊃ M3 ⊃ . . . be a decreasing sequence of von Neumann algebras on a Hilbert space H with ∩n∈N Mn = C1 (or equivalently: with


M. Weiner

{∪n∈N Mn } = B(H)). Let further ψ, ψ˜ be two normal states on M1 . Then for the ˜ Mn we have restriction of states ψn := ψ|Mn , ψ˜ n := ψ| ψn − ψ˜ n → 0 as n → ∞ Proof. Clearly, the validity of the statement does not depend on the “underlying” Hilbert space. So we may assume that both states on M1 can be represented by vectors in H (e.g. we may work on the direct sum of the two GNS representations given by the two ˜ is a representative vector for ψ. ˜ states); say is a representative vector for ψ and Any two unit-vectors can be connected by a unitary operator, so let V be a unitary ˜ We may write V in the form V = ei A , where A is selfoperator such that V = . adjoint operator with spectrum Sp(A) ⊂ [−π, π ], and hence A ≤ π . Now ∪n∈N Mn is dense in B(H) in the strong operator topology. Thus by an application of Kaplansky’s density theorem there exists a sequence of self-adjoints n → An ∈ Mn strongly converging to A such that An ≤ π for all n ∈ N. Then n → Un := ei An ∈ Mn is ˜ as a sequence of unitary operators strongly converging to V ; in particular Un → n → ∞. For the von Neumann algebra Mn the vectors and Un represent the same state. Hence as n → ∞, ˜ → 0, ψn − ψ˜ n ≤ 2Un − which is exactly what we have claimed.

(4.1)

For what follows, recall our definition of a double-cone K with base K . Recall also that in the beginning of this section we have fixed a spacelike hyperplane H and some further objects related to H . ˜ U˜ ) be two algebraic QFT models on the d + 1 dimenTheorem 4.2. Let (A, U ) and (A, sional Minkowskian spacetime satisfying the basic requirements (1, 2, 3, 4, 5, 6) as well as the Bisognano-Wichmann (7) and split (8) properties. If there exists a unitary V such that ˜ ) V A(K )V ∗ = A(K for every double-cone K with base K ⊂ H + , then the two models are equivalent. That ˜ for all double-cones is, there exists a unitary operator W such that W A(O)W ∗ = A(O) ∗ ˜ O and W U (g)W = U (g) for all elements g of the connected part of the Poincaré group. ˜ be the (up to phase unique, normalized) vacuum vectors for U Proof. Let and and U˜ , respectively. We may assume that the two models are given on the same Hilbert ˜ ) for every space H and that V is the identity operator so that actually A(K ) = A(K + double-cone K with base K ⊂ H . Remember we defined A(W + ) to be the von Neumann algebra generated by all local algebras A(O) with O ⊂ W + . That is, theoretically we should take account of all double-cones included in W + and not only those with bases on H + . However, it is easy to see that one can take an increasing sequence of double-cones n → K n with bases on H + such that not only ∪n∈N K n = W + , but actually every bounded region O ⊂ W +



is included in K n for some n ∈ N. Then by isotony A(W + ) = {∪n∈N A(K n )} . So the assumed equality implies that A and A˜ coincide on W + , too. ˜ is in the natural cone of (, A(W + )). Indeed, suppose origWe may assume that ˜ This state inally it was not so, and consider the state on A(W + ) given by the vector . ˜ ˜ has a unique representative vector in the cone in question. Since is cyclic and ˜ is also cyclic and separatseparating for A(W + ), the corresponding state is faithful, + + ˜ . Then we ing for A(W ) and there exists unitary V ∈ A(W ) such that V = ∗ ∗ ˜ ˜ ˜ ˜ ˜ by (V AV , V U V ) with vacuum vector may replace (A, U ) with vacuum vector ˜ . For this latter choice we have the desired property that its vacuum vector is in the required cone, and since V commutes with all algebras A(O) ⊂ A(W + ) (O ⊂ W + ), ˜ )V ∗ for every double-cone K with ˜ ) = V A(K we still have that A(K ) = A(K + base K ⊂ H . Let γn be the adjoint action of the product U (τ n )∗ U˜ (τ n ). By what was assumed, we have that for every n ∈ N, γn (A(K )) = A(K ) for every double cone K with base K ⊂ H + .

(4.2)

Now let ω and ω˜ be the faithful normal states on A(W + ) given by the vectors ˜ respectively. Then ω ◦ γn is nothing else than the state given by the vecand , tor U˜ (τ n )∗ U (τ n ) = U˜ (τ n )∗ ; so ω ◦ γn = ω ◦ Ad(U˜ (τ n )). On the other hand, ˜ is an invariant vector for U˜ (τ ). Putting it together, and ω˜ ◦ Ad(U˜ (τ n )) = ω˜ since applying our previous lemma we have that ω ◦ γn − ω ˜ = (ω − ω) ˜ ◦ Ad(U˜ (τ n )) = (ω|A(τ n (W + )) − ω| ˜ A(τ n (W + )) → 0 (4.3) as n → ∞, since on τ n (W + ) ⊂ W + (by a similar argument to that used for W + ) the nets A and A˜ coincide and hence U˜ (τ n )A(W + )U˜ (τ n )∗ = A(τ n (W + )) for every n ∈ N. Since is cyclic and separating for A(W + ), we can find a unitary Wn implementing ˜ γn on A(W + ) such that Wn is in the natural cone of (, A(W + )). Then Wn and are exactly the vector representatives in the specified natural cone of the states ω ◦ γn and ω, ˜ respectively. Thus by the established norm convergence of states we have that ˜ → 0 as n → ∞. Wn − Let K be a nonempty double-cone with base K ⊂ τ (H + ) ⊂ H + . Then the split property together with the isotony and Reeh-Schlieder property imply that A(K ) ⊂ A(W + ) ˜ are standard vectors. Moreover, is a split inclusion for which the vacuum vectors , as was discussed in Sect. 2, A(W + ) is a factor. Hence by Corollary 3.4 there exists a unitary operator W and a subsequence s such that n → Ws(n) converges strongly to W . ˜ ) It is evident that for the limit W we still have that W A(K )W ∗ = A(K ) = A(K for every double-cone K with base K ⊂ H + (and so also for regions like W + and ˜ By the Bisognano-Wichmann property it τ n (W + )), but now we also have that W = . ∗ ˜ immediately follows that W U W and U coincide on both the boosts associated to W + and to τ (W + ). Now a quick check shows that the subgroup generated by such boosts contains τ so actually we also have that W U (τ )W ∗ = U˜ (τ ). 
Since every double-cone with base on H can be shifted into H + by a repeated use of τ , this further implies that W A(K )W ∗ = ˜ ) for every double-cone K with base K ⊂ H , and hence also for infinite regions A(K


like wedges whose “edges” are included in H. Then in turn, again by the Bisognano-Wichmann property, we have that $W U W^*$ and $\tilde U$ coincide on every boost that is associated to some wedge with edge in H. But elementary geometric arguments show that such boosts generate the entire connected Poincaré group, so at this point we have that $W U(g) W^* = \tilde U(g)$ for every element g. Moreover, since every double-cone can be moved by a suitable Poincaré transformation so that its base will be on H, we now have that $W A(K) W^* = \tilde A(K)$ for every double-cone. Thus W establishes an equivalence between the two models, which is exactly what we wanted to prove.

5. The Conformal Case

The conformal chiral QFT, though originally defined on a lightline, can be naturally extended to the compactified lightline which is customarily identified with the circle $S^1 \equiv \{z \in \mathbb{C} \,|\, |z| = 1\}$. On the circle the theory becomes Möbius covariant; that is, it will carry a symmetry action of the group of diffeomorphisms of $S^1$ of the form $z \mapsto \frac{az+b}{\bar b z + \bar a}$, which is called the Möbius group and is isomorphic to PSL(2, R). The connection between the “circle picture” and the “line picture” (here “line” ≡ R) is made by puncturing the circle at $-1 \in S^1$ and using a Cayley-transformation:
\[ x = i\,\frac{1-z}{1+z} \in \mathbb{R} \iff z = \frac{i-x}{i+x} \in S^1 \setminus \{-1\}. \tag{5.1} \]
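The correspondence (5.1) can be checked numerically. The following is a minimal Python sketch, not part of the original argument; the function names `line_to_circle` and `circle_to_line` are hypothetical and chosen only for illustration.

```python
def line_to_circle(x: float) -> complex:
    """Map a point x of the real line to the circle via z = (i - x)/(i + x)."""
    return (1j - x) / (1j + x)


def circle_to_line(z: complex) -> complex:
    """Inverse map x = i(1 - z)/(1 + z); undefined at the puncture z = -1."""
    return 1j * (1 - z) / (1 + z)


if __name__ == "__main__":
    for x in (-3.0, 0.0, 0.5, 10.0):
        z = line_to_circle(x)
        # |z| should be 1, and mapping back should recover x up to rounding.
        print(x, abs(z), circle_to_line(z))
```

In this picture the point at infinity of the line corresponds to the puncture $-1$, which is why the inverse map is left undefined there.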

Via the line picture one can view translations and dilations as diffeomorphisms of $S^1$ and in this sense they are elements of the Möbius group.

A Möbius covariant net of von Neumann algebras on $S^1$ is a map A which assigns to every nonempty, nondense open “arc” (or simply interval) $I \subset S^1$ a von Neumann algebra A(I) acting on a fixed Hilbert space H, together with a given strongly continuous representation U of the Möbius group satisfying certain properties. Here we shall not dwell much either on the defining properties of a Möbius covariant net of von Neumann algebras on $S^1$, or on their known consequences. We only assert that the defining properties are adapted versions of (1, 2, 3, 4, 5) whereas (the adapted versions of) properties (6, 7), that is, the Reeh-Schlieder and Bisognano-Wichmann properties, are consequences. One also has irreducibility, factoriality of local algebras and moreover additivity even for an infinite set of intervals: $\bigvee_\alpha A(I_\alpha) \supset A(I)$ whenever $\bigcup_\alpha I_\alpha \supset I$ for any collection $\{I_\alpha\}$. For details we refer to [4,8,9,11,12]. Note however that one cannot derive the split property (i.e. that A(K) ⊂ A(I) is a split inclusion whenever K ⊂ I), since by taking infinite tensorial products it is easy to construct non-split Möbius covariant nets. Nevertheless, it is known to hold in the majority of “interesting” models.

Theorem 5.1. Let $(A, U)$ and $(\tilde A, \tilde U)$ be Möbius covariant nets of von Neumann algebras on $S^1$ with at least one of them being split. Then any of the following 4 conditions:
• ∃ a unitary W s.t. $W A(I) W^* = \tilde A(I)$ and $W U(g) W^* = \tilde U(g)$ for all I, g,
• ∃ a unitary V s.t. $V A(I) V^* = \tilde A(I)$ for all I,
• ∃ an (open, nonempty) I and a unitary V s.t. $V A(K) V^* = \tilde A(K)$ for all K ⊂ I,
• ∃ a unitary V s.t. with R-picture notations $V A(j,k) V^* = \tilde A(j,k)$ for all j, k ∈ N,
implies the remaining three.



Proof. It is clear that any of the conditions implies that if one of the nets is split then so is the other and that each condition implies the next one. All we have to show is that the last one implies the first one, which can be done by simply copying the argument of the proof of the main theorem of the previous section. Note that by (the infinite version of) additivity the last condition implies that for the ˜ ∞) for every unitary V appearing in the condition we also have V A(k, ∞)V ∗ = A(k, k ∈ N. So we may replace the wedge W + in our former proof by the half-line (0, ∞). We have to be careful to use a translation τ by an integer length; say we let τ be the unit translation x → x + 1. For a split inclusion we can choose A(1, 2) ⊂ A(0, ∞). Then the argument of the mentioned proof shows that there exists a unitary W such that ˜ j, k) for all j, k ∈ N and moreover W = , ˜ where and ˜ are W A( j, k)W ∗ = A( ˜ vacuum vectors for U and U , respectively. From here on the proof is actually even simpler than in the “normal” case. Indeed, whereas there the respective modular unitaries did not generate the Poincaré group and so we needed to consider further regions, here we do not need any further argument. It is easy to see that the “dilations” associated to intervals of the form ( j, k)( j, k ∈ N) generate the entire Möbius group. Moreover, the Möbius group acts transitively on the set of (open, nondense, nonempty) intervals. So we immediately have that W U (g)W ∗ = U˜ (g) ˜ ) for all I . for all g and W A(I )W ∗ = A(I In [5] it was shown3 that the Möbius symmetry of a Möbius covariant net (A, U ) admits at most one extension to the group Diff + (S 1 ) which is “compatible” with the net A. For details and precise notations regarding the diffeomorphism covariance we refer to [5]. Here we only note and state that the mentioned uniqueness result implies that our algebraic Haag’s theorem can be adjusted for the diffeomorphism covariant case, too. Proposition 5.2. 
One may replace “Möbius covariance” by “diffeomorphism covariance” in Theorem 5.1.

Let us now discuss the implications of our result regarding half-sided modular inclusions. A net A on the circle is strongly additive iff $A(I_1) \vee A(I_2) = A(I)$ whenever $I$, $I_1$ and $I_2$ are intervals with the last two being obtained from I by the removal of a point. As was explained in the Introduction, there is a one-to-one correspondence between strongly additive Möbius covariant nets and standard half-sided modular inclusions of factors. For any inclusion of von Neumann algebras N ⊂ M with a common cyclic vector $\Omega$, consider the tunnel introduced by R. Longo:
\[ N_0 \supset N_1 \supset N_2 \supset N_3 \supset \dots, \tag{5.2} \]
where $N_0 = M$, $N_1 = N$ and $N_{k+1} = J_k N_{k-1}' J_k$ (k = 1, 2, . . .) and $J_k$ is the modular conjugation associated to $(N_k, \Omega)$. It is easy to see that the tunnel is well-defined (i.e. that $\Omega$ remains cyclic and separating at each step of the induction and hence the modular conjugation can indeed be considered). But how does it depend on the choice of the common cyclic and separating vector $\Omega$? In some sense not much. The following statement is included for reasons of self-containment; it is well-known to experts of the field.

3 Actually the statement in [5] is slightly less general. There a certain regularity condition was also used which was however later removed by the author of this article; see [17, Sect. 6.1] for details.


Lemma 5.3. Let both $\Omega$ and $\tilde\Omega$ be common cyclic vectors for the inclusion of von Neumann algebras N ⊂ M, and denote by $N_0 \supset N_1 \supset N_2 \supset \dots$ and $\tilde N_0 \supset \tilde N_1 \supset \tilde N_2 \supset \dots$ the respective tunnels defined after Eq. (5.2). Then for each n ∈ N there exists a unitary operator $V_n$ such that
\[ V_n N_k V_n^* = \tilde N_k \quad \text{for all } k \in \{0, 1, \dots, n\}. \]
That is, up to any finite level, the two tunnels are unitarily equivalent.

Proof. We set $V_1 = 1$ and define $V_n$ inductively. For n = 1 the condition is satisfied since by assumption $N_0 = \tilde N_0 = M$ and $N_1 = \tilde N_1 = N$. So assume $V_k$ is already defined in a way satisfying the requirement made in the statement. Then $V_k\Omega$ is cyclic and separating for $V_k N_k V_k^* = \tilde N_k$, so there is a unitary $U_k \in \tilde N_k$ such that $U_k V_k \Omega$ is in the natural cone of $(\tilde\Omega, \tilde N_k)$. Set $V_{k+1} := U_k V_k$; it is then evident that $V_{k+1} N_j V_{k+1}^* = \tilde N_j$ for all j = 0, 1, . . . , k. Moreover, as $V_{k+1}\Omega = U_k V_k \Omega$ is in the natural cone of $(\tilde\Omega, \tilde N_k)$ and $V_{k+1} N_k V_{k+1}^* = \tilde N_k$, we have that the adjoint action of $V_{k+1}$ takes the modular conjugation $J_k$ associated to $(\Omega, N_k)$ into the modular conjugation $\tilde J_k$ associated to $(\tilde\Omega, \tilde N_k)$. Thus
\[ V_{k+1} N_{k+1} V_{k+1}^* = V_{k+1} J_k N_{k-1}' J_k V_{k+1}^* = \tilde J_k \big(V_{k+1} N_{k-1}' V_{k+1}^*\big) \tilde J_k = \tilde J_k \tilde N_{k-1}' \tilde J_k = \tilde N_{k+1} \tag{5.3} \]

and hence the statement is proved by induction.

Let (A, U) be a Möbius covariant net with vacuum vector $\Omega$ and denote the modular objects associated to $(\Omega, A(k, \infty))$ by $J_k$ and $\Delta_k$. Using the Bisognano-Wichmann property and the main theorem [2, Thm. 2.1] of half-sided modular inclusions, the product $J_k J_{k-1}$ can be expressed with the modular unitaries, which in turn can be expressed by U, resulting in $J_k J_{k-1} = U(\tau)^2$, where τ is the unit-translation defined in the R-picture by the map $x \mapsto x + 1$. Hence
\[ J_k A(k-1, \infty)' J_k = J_k J_{k-1} A(k-1, \infty) J_{k-1} J_k = U(\tau)^2 A(k-1, \infty) U(\tau)^{-2} = A(k+1, \infty), \tag{5.4} \]
and so the tunnel (5.2) associated to $(\Omega, A(1, \infty) \subset A(0, \infty))$ is nothing else than the sequence of inclusions
\[ A(0, \infty) \supset A(1, \infty) \supset A(2, \infty) \supset \dots. \tag{5.5} \]

Note that in case we have strong additivity, by taking relative commutants, this sequence determines all algebras of the form A(j, k) with j, k ∈ N. Vice versa, if we know A(j, k) for all j, k ∈ N, then by (the infinite version of) additivity we can compute all algebras of the form A(k, ∞) with k ∈ N. So by what was explained we can draw the following conclusion.

Corollary 5.4. Suppose $(\Omega, N \subset M)$ and $(\tilde\Omega, \tilde N \subset \tilde M)$ are two standard half-sided modular inclusions of factors and denote the two corresponding strongly additive Möbius covariant nets by $(A, U)$ and $(\tilde A, \tilde U)$, respectively. Then the conditions:
• ∃ a unitary V s.t. $V M V^* = \tilde M$, $V N V^* = \tilde N$,
• ∀n ∈ N : ∃ unitary $V_n$ s.t. $V_n A(j,k) V_n^* = \tilde A(j,k)$ for all j, k ∈ {0, 1, . . . , n},
are equivalent.



The relevance of this statement in light of the conformal version of our algebraic Haag’s theorem has been already discussed in the Introduction. Acknowledgement. The author would like to thank Roberto Longo, Sebastiano Carpi and Yoh Tanimoto for useful discussions.

References
1. Araki, H.: Relative Hamiltonian for faithful normal states of a von Neumann algebra. Publ. Res. Inst. Math. Sci. 9, 165–209 (1973/74)
2. Araki, H., Zsidó, L.: Extension of the structure theorem of Borchers and its application to half-sided modular inclusions. Rev. Math. Phys. 17, 495–543 (2005)
3. Bisognano, J., Wichmann, E.H.: On the duality for quantum fields. J. Math. Phys. 16, 985–1007 (1975)
4. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal quantum field theory. Commun. Math. Phys. 156, 201–219 (1993)
5. Carpi, S., Weiner, M.: On the uniqueness of diffeomorphism symmetry in Conformal Field Theory. Commun. Math. Phys. 258, 203–221 (2005)
6. Eckmann, J.P., Fröhlich, J.: Unitary equivalence of local algebras in the quasifree representation. Ann. Inst. H. Poincaré Sect. A (N.S.) 20, 201–209 (1974)
7. Doplicher, S., Longo, R.: Standard and split inclusions of von Neumann algebras. Invent. Math. 75, 493–536 (1984)
8. Fredenhagen, K., Jörß, M.: Conformal Haag-Kastler nets, pointlike localized fields and the existence of operator product expansions. Commun. Math. Phys. 176, 541–554 (1996)
9. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
10. Glimm, J., Jaffe, A.: The λ(φ^4)_2 quantum field theory without cutoffs II: The field operators and the approximate vacuum. Ann. Math. 91, 362–401 (1970)
11. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
12. Guido, D., Longo, R., Wiesbrock, H.-W.: Extensions of conformal nets and superselection structures. Commun. Math. Phys. 192, 217–244 (1998)
13. Haag, R.: Local Quantum Physics. 2nd ed. Berlin-Heidelberg-New York: Springer-Verlag, 1996
14. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001)
15. Longo, R.: Notes on algebraic invariants for noncommutative dynamical systems. Commun. Math. Phys. 69, 195–207 (1979)
16. Streater, R., Wightman, A.S.: PCT, Spin and Statistics, and all that. New York-Amsterdam: W.A. Benjamin, 1964
17. Weiner, M.: Conformal covariance and related properties of chiral QFT. PhD thesis, Department of Mathematics, University of Rome “Tor Vergata”, 2005
18. Wiesbrock, H.W.: Half-Sided Modular Inclusions of von Neumann algebras. Commun. Math. Phys. 157, 83–92 (1993)
19. Wiesbrock, H.W.: A note on strongly additive conformal field theory and half-sided modular conormal standard inclusions. Lett. Math. Phys. 31, 303–307 (1994)

Communicated by Y. Kawahigashi


Communications in


Unique Continuation for Schrödinger Evolutions, with Applications to Profiles of Concentration and Traveling Waves

L. Escauriaza1, C. E. Kenig2, G. Ponce3, L. Vega1

1 Dpto. de Matemáticas, UPV/EHU, Apto. 644, 48080 Bilbao, Spain.


2 Department of Mathematics, University of Chicago, Chicago, IL 60637, USA.


3 Department of Mathematics, University of California, Santa Barbara, CA 93106, USA.

E-mail: [email protected] Received: 25 July 2010 / Accepted: 22 November 2010 Published online: 17 May 2011 – © Springer-Verlag 2011

Abstract: We prove unique continuation properties for solutions of the evolution Schrödinger equation with time dependent potentials. As an application of our method we also obtain results concerning the possible concentration profiles of blow up solutions and the possible profiles of the traveling waves solutions of semi-linear Schrödinger equations.

1. Introduction

In this paper we continue our study initiated in [8–11] on unique continuation properties of solutions of Schrödinger equations. To begin with we consider the linear equation
\[ \partial_t u = i(\Delta u + V(x,t)u), \qquad (x,t) \in \mathbb{R}^n \times [0,\infty). \tag{1.1} \]

We shall be interested in finding the strongest possible space decay of global solutions of (1.1). In this direction our first results are the following ones:

Theorem 1. Let $u \in C([0,\infty) : L^2(\mathbb{R}^n))$ be a solution of Eq. (1.1) with a real potential $V \in L^\infty(\mathbb{R}^n \times [0,\infty))$ satisfying that
\[ V(x,t) = V_1(x,t) + V_2(x,t), \tag{1.2} \]

with $V_j$, $j = 1, 2$ real valued,
\[ |V_1(x,t)| \le \frac{c_1}{\langle x\rangle^{\alpha}} = \frac{c_1}{(1+|x|^2)^{\alpha/2}}, \qquad 0 \le \alpha < 1/2, \tag{1.3} \]

The first and fourth authors are supported by MEC grant, MTM2004-03029.
The second and third authors by NSF grants DMS-0456583 and DMS-0456833 respectively.


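and $V_2$ supported in $\{(x,t) : |x| \ge 1\}$ such that
\[ -(\partial_r V_2(x,t))^- \le \frac{c_2}{|x|^{2\alpha}}, \qquad a^- = \min\{a; 0\}. \tag{1.4} \]
Then there exists a constant $\lambda_0 = \lambda_0(\|V\|_{L^\infty(\mathbb{R}^n\times[0,\infty))}; c_1; c_2; \alpha) > 0$ such that if
\[ \sup_{t\ge 0} \int_{\mathbb{R}^n} e^{\lambda_0 |x|^p}\, |u(x,t)|^2\,dx < \infty, \quad \text{with } p = (4-2\alpha)/3, \tag{1.5} \]
then
\[ u \equiv 0. \tag{1.6} \]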
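To see how the decay exponent in (1.5) varies with the decay rate α of $V_1$, here is a minimal Python sketch. The function name `decay_exponent` is hypothetical; the formula $p = (4-2\alpha)/3$ and the range $0 \le \alpha < 1/2$ are taken from the statement of Theorem 1.

```python
def decay_exponent(alpha: float) -> float:
    """Exponent p = (4 - 2*alpha)/3 appearing in hypothesis (1.5).

    Theorem 1 is stated for 0 <= alpha < 1/2, so other values are rejected.
    """
    if not 0.0 <= alpha < 0.5:
        raise ValueError("Theorem 1 assumes 0 <= alpha < 1/2")
    return (4.0 - 2.0 * alpha) / 3.0


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.49):
        print(alpha, decay_exponent(alpha))
```

For α = 0 (a merely bounded $V_1$) this gives p = 4/3, and p decreases towards 1 as α approaches 1/2.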

As an immediate consequence of Theorem 1 we have:

Corollary 1. Let $u \in C([0,\infty) : L^2(\mathbb{R}^n))$ be a solution of Eq. (1.1) with a real potential $V \in L^\infty(\mathbb{R}^n \times [0,\infty))$. If
\[ |V(x,t)| \le \frac{c_1}{\langle x\rangle^{1/2}} = \frac{c_1}{(1+|x|^2)^{1/4}}, \tag{1.7} \]
and for some $p > 1$ and $\lambda_0 > 0$,
\[ \sup_{t\ge 0} \int_{\mathbb{R}^n} e^{\lambda_0 |x|^p}\, |u(x,t)|^2\,dx < \infty, \tag{1.8} \]
then $u \equiv 0$.

Theorem 2. Let $u \in C([0,\infty) : L^2(\mathbb{R}^n))$ be a solution of Eq. (1.1) with a real potential $V \in L^\infty(\mathbb{R}^n \times [0,\infty))$ satisfying that
\[ V(x,t) = V_1(x,t) + V_2(x,t), \tag{1.9} \]

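with $V_j$, $j = 1, 2$ real valued,
\[ |V_1(x,t)| \le \frac{c_1}{\langle x\rangle^{1/2+\epsilon_0}} = \frac{c_1}{(1+|x|^2)^{1/4+\epsilon_0/2}}, \qquad \epsilon_0 > 0, \tag{1.10} \]
and $V_2$ supported in $\{(x,t) : |x| \ge 1\}$ such that
\[ -(\partial_r V_2(x,t))^- \le \frac{c_2}{|x|^{1+\epsilon_0}}, \qquad a^- = \min\{a; 0\}. \tag{1.11} \]
Then there exists a constant $\lambda_0 = \lambda_0(\|V\|_{L^\infty(\mathbb{R}^n\times[0,\infty))}; c_1; c_2; \epsilon_0) > 0$ such that if
\[ \sup_{t\ge 0} \int_{\mathbb{R}^n} e^{\lambda_0 |x|}\, |u(x,t)|^2\,dx < \infty, \tag{1.12} \]
then
\[ u \equiv 0. \tag{1.13} \]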


Using the results in [9] and [10] one sees that it suffices to assume that the hypotheses (1.5) and (1.12) in Theorem 1 and Theorem 2 respectively hold for a sequence of times $\{T_j = T_0 + jL : j \in \mathbb{Z}^+\}$ for some $T_0 \ge 0$ and $L > 0$.

The hypothesis on the real character of the potential in these theorems is used to guarantee that the $L^2$-norm of the solution of Eq. (1.1) is time independent. However, it suffices to have the $L^2$-norm of the solution bounded below for all time $t \in [0,\infty)$ by a positive constant, provided that $u(0) \ne 0$. Therefore, Theorem 1 still holds for potentials $V(x,t)$ which can be written as $V(x,t) = V_1(x,t) + V_2(x,t) + V_3(x,t)$, with $V_1$ and $V_2$ as before and $V_3$ complex valued satisfying (1.3) and such that
\[ \|V_3\|_{L^1([0,\infty):L^\infty(\mathbb{R}^n))} = \int_0^\infty \|V_3(\cdot,t)\|_\infty\,dt < \infty. \]

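A similar remark applies to Theorem 2.

Next, we define the “hyperbolic” or “ultra-hyperbolic” operator
\[ \mathcal{L}_k = \partial_{x_1}^2 + \cdots + \partial_{x_k}^2 - \partial_{x_{k+1}}^2 - \cdots - \partial_{x_n}^2, \qquad k \in \{2, \dots, n-1\}, \tag{1.14} \]
and study the linear dispersive equation
\[ \partial_t u = i(\mathcal{L}_k u + V(x,t)u), \qquad (x,t) \in \mathbb{R}^n \times \mathbb{R}. \tag{1.15} \]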

Nonlinear models with a non-degenerate non-elliptic operator $\mathcal{L}_k$ describing the dispersive relation arise in several mathematical and physical contexts. For example, the Davey-Stewartson system [5]
\[ \begin{cases} i\partial_t u \pm \partial_x^2 u + \partial_y^2 u = c_1 |u|^2 u + c_2\, u\, \partial_x \varphi, \\ \partial_x^2 \varphi \pm \partial_y^2 \varphi = \partial_x |u|^2, \end{cases} \qquad t, x, y \in \mathbb{R}, \tag{1.16} \]
with $u = u(x,y,t)$ a complex-valued function, $\varphi = \varphi(x,y,t)$ a real-valued function and $c_1$, $c_2$ real parameters. The system (1.16) appears as a model in wave propagation [5] and independently as a two dimensional completely integrable system which generalizes the integrable cubic 1-dimensional Schrödinger equation [1]. Also one has the Ishimori system [13]
\[ \begin{cases} \partial_t S = S \wedge (\partial_x^2 S \pm \partial_y^2 S) + b(\partial_x \phi\, \partial_y S + \partial_y \phi\, \partial_x S), \\ \partial_x^2 \phi \mp \partial_y^2 \phi = \mp 2\, S \cdot (\partial_x S \wedge \partial_y S), \end{cases} \qquad t, x, y \in \mathbb{R}, \tag{1.17} \]
where $S(\cdot,t) : \mathbb{R}^2 \to \mathbb{R}^3$ with $\|S\| = 1$, $S \to (0,0,1)$ as $\|(x,y)\| \to \infty$, and $\wedge$ denotes the wedge product in $\mathbb{R}^3$. This model was first proposed as a two dimensional generalization of the Heisenberg equation in ferromagnetism. For b = 1 the system (1.17) has been shown to be completely integrable (see [1] and references therein).

The arguments used in the proofs of Theorems 1–2 do not rely on the elliptic character of the laplacian in (1.1), so we have:

Theorem 3. Theorems 1–2 and Corollary 1 still hold for solutions $u \in C([0,\infty) : L^2(\mathbb{R}^n))$ of Eq. (1.15) with a potential V verifying the same hypotheses.

Remarks (i) It is interesting to relate our results with those due to V. Z. Meshkov in [14]:



Theorem. Let $w \in H^2_{loc}(\mathbb{R}^n)$ be a solution of
\[ \Delta w + \widetilde V(x) w = 0, \quad x \in \mathbb{R}^n, \quad \text{with } \widetilde V \in L^\infty(\mathbb{R}^n). \tag{1.18} \]
If
\[ \int e^{2a|x|^{4/3}} |w|^2\,dx < \infty, \quad \forall\, a > 0, \tag{1.19} \]
then $w \equiv 0$.

It was also proved in [14] that for complex valued potentials $\widetilde V$ the exponent 4/3 in (1.19) is optimal. We observe that if the potential $V(x,t)$ in (1.1) is time independent, $V = \widetilde V(x)$, then a solution $w(x)$ of (1.18) is a stationary solution of the IVP (1.1). Also for the time independent potential $V(x,t) = \widetilde V(x)$, if $w(x)$ is an $H^1$-solution of the eigenvalue problem
\[ \Delta w + \widetilde V(x) w = \zeta w, \tag{1.20} \]
then one has that for $\zeta \in \mathbb{R}$,
\[ v(x,t) = e^{i\zeta t} w(x) \tag{1.21} \]

is a solution of the IVP (1.1) for which Theorems 1–2 apply. As was mentioned above, the assumption on the real character of the potential in these theorems is only required to guarantee that the $L^2$-norm of the solution of Eq. (1.1) is time independent. In the case described in (1.20)–(1.21) the solution $v(x,t)$ preserves the $L^2$-norm and so the proof of Theorems 1–2 can be carried out. Hence, taking $V_2 \equiv 0$ one has the following result, which recovers that in [14] mentioned above, and improves and generalizes those in [4]:

Theorem 4. Let $w \in H^1(\mathbb{R}^n)$ be a solution of Eq. (1.20) with a complex potential $\widetilde V \in L^\infty(\mathbb{R}^n)$ satisfying
\[ \widetilde V(x) = \widetilde V_1(x) + \widetilde V_2(x), \tag{1.22} \]
such that
\[ |\widetilde V_1(x)| \le \frac{c_1}{\langle x\rangle^{\alpha}} = \frac{c_1}{(1+|x|^2)^{\alpha/2}}, \qquad 0 \le \alpha < 1/2, \tag{1.23} \]
and $\widetilde V_2$ real valued and supported in $\{x \in \mathbb{R}^n : |x| \ge 1\}$ such that
\[ -(\partial_r \widetilde V_2(x))^- \le \frac{c_2}{|x|^{2\alpha}}, \qquad a^- = \min\{a; 0\}. \tag{1.24} \]
Then there exists a constant $\lambda_0 = \lambda_0(\|\widetilde V\|_{L^\infty(\mathbb{R}^n)}; c_1; c_2; \alpha) > 0$ such that if
\[ \int_{\mathbb{R}^n} e^{\lambda_0 |x|^p}\, |w(x)|^2\,dx < \infty, \quad \text{with } p = (4-2\alpha)/3, \tag{1.25} \]
then
\[ w \equiv 0. \tag{1.26} \]
Moreover, if (1.23) and (1.24) hold with $\alpha > 1/2$ and (1.25) holds with $p = 1$ and large $\lambda_0 = \lambda_0(\|\widetilde V\|_{L^\infty(\mathbb{R}^n)}; c_1; \alpha) > 0$, then $w \equiv 0$.
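The observation made before Theorem 4, that the function in (1.21) solves (1.1) whenever $w$ solves (1.20) and that it preserves the $L^2$-norm for real ζ, can be verified in one line. This is a routine check written out for convenience, with $\widetilde V$ denoting the time independent potential:

```latex
\[
\partial_t v = i\zeta\, e^{i\zeta t} w
             = i\, e^{i\zeta t}\bigl(\Delta w + \widetilde V w\bigr)
             = i\bigl(\Delta v + \widetilde V v\bigr),
\qquad
\|v(\cdot,t)\|_2 = |e^{i\zeta t}|\,\|w\|_2 = \|w\|_2
\quad (\zeta \in \mathbb{R}).
\]
```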


In [4], under the hypotheses $\widetilde V_2 = 0$, (1.23) and (1.25), but for all $\lambda_0 > 0$, on the complex potential in Theorem 4, it was shown that the eigenfunction $w(x)$ solution of (1.20) corresponding to the real eigenvalue ζ satisfies $w \equiv 0$. We observe that the conclusion of Corollary 1 applies, i.e. if $\widetilde V_2 = 0$, $\alpha = 1/2$ in (1.23), and (1.25) holds for some $p > 1$ and $\lambda_0 > 0$, then $w \equiv 0$. In this direction we have the following improvement of the result in Theorem 4 concerning the case $\alpha = 1/2$ in (1.23) and (1.24).

Theorem 5. Let $w \in H^1(\mathbb{R}^n)$ be a solution of Eq. (1.20) with a potential $\widetilde V \in L^\infty(\mathbb{R}^n)$ satisfying
\[ \widetilde V(x) = \widetilde V_1(x) + \widetilde V_2(x), \tag{1.27} \]
such that $\widetilde V_1$ is complex valued with
\[ |\widetilde V_1(x)| \le \frac{c_1}{\langle x\rangle^{1/2}} = \frac{c_1}{(1+|x|^2)^{1/4}}, \tag{1.28} \]
and $\widetilde V_2$ is real valued and supported in $\{x \in \mathbb{R}^n : |x| \ge 1\}$ such that
\[ -(\partial_r \widetilde V_2(x))^- \le \frac{c_2}{|x|}, \qquad a^- = \min\{a; 0\}. \tag{1.29} \]
Then there exists a constant $\lambda_0 = \lambda_0(\|\widetilde V\|_{L^\infty(\mathbb{R}^n)}; c_1; c_2) > 0$ such that if
\[ \int_{\mathbb{R}^n} e^{\lambda_0 |x|}\, |w(x)|^2\,dx < \infty, \tag{1.30} \]
then
\[ w \equiv 0. \tag{1.31} \]

We observe that Theorem 5 is a stationary result (not a consequence of the time evolution results in Theorems 1 and 2) in which the ellipticity of the laplacian in (1.20) plays an essential role. The proof of Theorem 5 will be based on the following Carleman estimate:

Theorem 6. Let $\rho \in (0,1]$ and $\widetilde V$ as in Theorem 5. Then there exists $\tau_0 = \tau_0(\rho; \|\widetilde V\|_\infty; c_1; c_2) > 0$ such that the inequality
\[ \tau^{3/2}\, \bigl\|\, |x|^{-1/2} e^{\tau|x|} g \,\bigr\|_2 \le \bigl\| e^{\tau|x|} (\Delta g + \widetilde V g) \bigr\|_2 \tag{1.32} \]
holds for any $\tau \ge \tau_0$ and any $g \in C_0^\infty(\mathbb{R}^n - B_\rho(0))$.

We return to the consequences of our time evolution results. Thus, combining Theorem 3 and the comments before the statement of Theorem 4 one has that Theorem 4 also applies to the solutions of the non-elliptic eigenvalue problem
\[ \mathcal{L}_k w + \widetilde V(x) w = \zeta w, \tag{1.33} \]
with $\mathcal{L}_k$ as in (1.14), with complex potential $\widetilde V$ and $\zeta \in \mathbb{R}$.

We shall employ the above results to study the possible profile of the concentration blow up phenomenon in solutions of the initial value problem (IVP) associated to the non-linear Schrödinger equation
\[ \begin{cases} i\partial_t u + \Delta u \pm |u|^a u = 0, & x \in \mathbb{R}^n,\ t \in \mathbb{R},\ a > 0, \\ u(x,0) = u_0(x). \end{cases} \tag{1.34} \]



We observe that if $u(x,t)$ is a solution of (1.34) then for all $\sigma > 0$,
\[ u_\sigma(x,t) = \sigma^{2/a} u(\sigma x, \sigma^2 t), \tag{1.35} \]
is also a solution of (1.34) with data $u_\sigma(x,0) = \sigma^{2/a} u_0(\sigma x)$, so
\[ \|D^s u_\sigma(\cdot,0)\|_2 = \sigma^{2/a - n/2 + s}\, \|D^s u_0\|_2, \tag{1.36} \]
where $D^s f(x) = (|\xi|^s \hat f)^\vee(x)$, $s \in \mathbb{R}$. Thus, if $s_a = n/2 - 2/a$ the size of the data does not change by the scaling and one says that
\[ \dot H^{n/2-2/a}(\mathbb{R}^n) = D^{-(n/2-2/a)} L^2(\mathbb{R}^n), \tag{1.37} \]
is a critical space for the IVP (1.34). The following result concerning the local well-posedness of the IVP (1.34) in the critical cases was established in [3].

Theorem. Let $s_a = n/2 - 2/a$, $s_a \ge 0$ with $[s_a] \le a - 1$ if a is not an odd integer, then for each $u_0 \in H^{s_a}(\mathbb{R}^n)$ there exist $T = T(u_0) > 0$ and a unique solution $u = u(x,t)$ of the IVP (1.34) with
\[ u \in C([-T,T] : H^{s_a}(\mathbb{R}^n)) \cap L^q([-T,T] : L^p_{s_a}(\mathbb{R}^n)) = Z_T^{s_a}. \tag{1.38} \]

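Moreover, the map data → solution is locally continuous from $H^{s_a}(\mathbb{R}^n)$ into $Z_T^{s_a}$.

Above we have introduced the notations:
(a) for $1 < p < \infty$ and $s \in \mathbb{R}$,
\[ L^p_s(\mathbb{R}^n) \equiv (1-\Delta)^{-s/2} L^p(\mathbb{R}^n), \qquad \|\cdot\|_{s,p} \equiv \|(1-\Delta)^{s/2}\,\cdot\,\|_p, \tag{1.39} \]
with $L^2_s(\mathbb{R}^n) = H^s(\mathbb{R}^n)$,
(b) the indices (q, p) in (1.38) are given by the Strichartz estimate [7,16]:
\[ \Bigl( \int_{-\infty}^{\infty} \|e^{it\Delta} u_0\|_p^q\,dt \Bigr)^{1/q} \le c\,\|u_0\|_2, \tag{1.40} \]
where
\[ \frac{n}{2} = \frac{2}{q} + \frac{n}{p}, \qquad 2 \le p \le \infty \ \text{if } n = 1, \qquad 2 \le p < 2n/(n-2) \ \text{if } n \ge 2. \]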
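The two index relations used above, the critical index $s_a = n/2 - 2/a$ and the Strichartz relation $n/2 = 2/q + n/p$, are easy to tabulate. The following is a minimal Python sketch; the function names `critical_index` and `strichartz_q` are hypothetical, and the upper restriction on p from the text is deliberately not enforced.

```python
import math


def critical_index(n: int, a: float) -> float:
    """Critical Sobolev index s_a = n/2 - 2/a for the nonlinearity |u|^a u."""
    return n / 2 - 2 / a


def strichartz_q(n: int, p: float) -> float:
    """Solve n/2 = 2/q + n/p for q; returns math.inf when p = 2.

    Only p >= 2 is checked; the upper limit on p stated in the text
    is left to the caller.
    """
    if p < 2:
        raise ValueError("the relation is used with p >= 2")
    gap = n / 2 - n / p  # this quantity equals 2/q
    return math.inf if gap == 0 else 2 / gap


if __name__ == "__main__":
    print(critical_index(2, 2.0))       # L^2-critical case a = 4/n
    print(strichartz_q(1, math.inf))    # endpoint p = infinity in dimension 1
    print(strichartz_q(2, 4.0))
```

In particular the $L^2$-critical power $a = 4/n$ corresponds to $s_a = 0$.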

The pseudo-conformal transformation deduced in [7] shows that if $u = u(x,t)$ is a solution of (1.34), then
\[ v(x,t) = \frac{e^{i\omega|x|^2/4(\nu+\omega t)}}{(\nu+\omega t)^{n/2}}\; u\Bigl( \frac{x}{\nu+\omega t}, \frac{\gamma+\theta t}{\nu+\omega t} \Bigr), \qquad \nu\theta - \omega\gamma = 1, \tag{1.41} \]
satisfies the equation
\[ i\partial_t v + \Delta v \pm (\nu+\omega t)^{an/2-2}\, |v|^a v = 0. \tag{1.42} \]
Hence, in the $L^2$-critical case $a = 4/n$ Eqs. (1.34) and (1.42) are the same. Also in this case $a = 4/n$ the pseudo-conformal transformation preserves both the space $L^2(\mathbb{R}^n)$ and the space $H^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n : |x|^2 dx)$. In particular, if we take $u(x,t) = e^{it}\varphi(x)$

Unique Continuation

493

the standing wave solution, i.e. ϕ(x) being the unique positive solution (ground state) of the non-linear elliptic equation − ϕ + ϕ + |ϕ|4/n ϕ = 0,

x ∈ Rn ,

(1.43)

it follows that eit/(1−t) e−i|x| /4(1−t) ϕ v(x, t) = (1 − t)n/2 2

x 1−t

,

(1.44)

is a solution of (1.34) with a = 4/n and + sign in the nonlinear term (focussing case) which blows up at time t = 1, i.e. lim ∇ v(·, t)2 = ∞, t↑1

and lim |v(·, t)|2 = c δ(·), in the distribution sense. t↑1

Since it is known that positive solutions of the elliptic problem (1.43) (in particular the ground state) have exponential decay (see [2,15]), i.e.

\[ \varphi(x) \sim b_1 e^{-b_2|x|}, \qquad b_1, b_2 > 0, \]

the blow-up solution $v(x,t)$ in (1.44) satisfies

\[ |v(x,t)| \le \frac{1}{(1-t)^{n/2}}\; Q\Big( \frac{x}{1-t} \Big), \qquad t \in (-1,1), \tag{1.45} \]

with $Q(x) = b_1 e^{-b_2|x|}$. One may ask if it is possible to have a faster "concentration profile" in a solution of (1.34) with $a = 4/n$ than the one described in (1.45). In other words, whether or not (1.45) can hold with

\[ Q(x) = b_1 e^{-b_2|x|^p}, \qquad b_1, b_2 > 0, \quad p > 1, \tag{1.46} \]

or

\[ Q(x) = b_1 e^{-b_3|x|}, \tag{1.47} \]

with $b_3$ sufficiently large. More generally, for $a \ge 4/n$ one may ask if a blow-up solution $v(x,t)$ of (1.34) can satisfy

\[ |v(x,t)| \le \frac{1}{(1-t)^{2/a}}\; Q\Big( \frac{x}{1-t} \Big), \qquad t \in (-1,1), \tag{1.48} \]

with $Q(\cdot)$ as in (1.46) or as in (1.47). Our next result shows that this is not the case.

Theorem 7. Let $a \ge 4/n$. Let $v \in C((-1,1) : H^{n/2-2/a}(\mathbb{R}^n))$ be a solution of the equation (1.34). If (1.48) holds with $Q(\cdot)$ as in (1.46) for some $p > 1$ and $b_2 > 0$, or as in (1.47), then $v \equiv 0$.



In [12] we establish the result in Theorem 7 for $a = 4/n$ and $p > 4/3$. Now we consider the equation in (1.34) with the operator describing the dispersive relation $\mathcal{L}_k$ as in (1.15) being non-degenerate but not elliptic,

\[ i\partial_t u + \mathcal{L}_k u \pm |u|^a u = 0, \qquad a > 0. \tag{1.49} \]

In this case, the local well-posedness theory is similar to that described above for the IVP (1.34). This follows from the fact that the local theory is based on the Strichartz estimates in (1.40), which do not require the ellipticity of the Laplacian, i.e. (1.40) holds with $\mathcal{L}_k$ instead of $\Delta$. Hence the results in [3] still hold for the IVP associated to the equation in (1.49). In addition, in this case the pseudo-conformal transformation tells us that if $u = u(x,t)$ is a solution of (1.49), then

\[ v(x,t) = \frac{e^{i\omega Q_k(x)/4(\nu+\omega t)}}{(\nu+\omega t)^{n/2}}\; u\Big( \frac{x}{\nu+\omega t}, \frac{\gamma+\theta t}{\nu+\omega t} \Big), \qquad \nu\theta - \omega\gamma = 1, \tag{1.50} \]

with

\[ Q_k(x) = x_1^2 + \cdots + x_k^2 - x_{k+1}^2 - \cdots - x_n^2, \tag{1.51} \]

verifies the equation

\[ i\partial_t v + \mathcal{L}_k v \pm (\nu+\omega t)^{an/2-2} |v|^a v = 0. \tag{1.52} \]

Hence, as in Theorem 7 we have:

Theorem 8. Let $a \ge 4/n$. Let $v \in C((-1,1) : H^{n/2-2/a}(\mathbb{R}^n))$ be a solution of the equation in (1.49). If $v$ satisfies (1.48) with $Q(\cdot)$ as in (1.46) or as in (1.47), then $v \equiv 0$.

It should be remarked that the result in Theorem 8 is a conditional one. It assumes that the local solution of the IVP associated to Eq. (1.49) blows up (see (1.48)), which is an open problem.

We will adapt our results in Theorems 1 and 2 to study the possible profile of "generalized traveling wave" solutions of a class of equations containing those in (1.34) and (1.49) (see (1.59) and (1.60) below). Roughly, these are solutions $u(x,t)$ for which there exist $\mu \in \mathbb{R}$ and $e \in S^{n-1}$ such that the $L^2(\mathbb{R}^n)$-norm of $u(x - \mu t e, t)$ remains highly concentrated at the origin for all time $t \ge 0$, see (1.54) and (1.57) below.

Corollary 2. Let $u \in C([0,\infty) : L^2(\mathbb{R}^n))$ be a solution of Eq. (1.1) or Eq. (1.15) with a real potential $V \in L^\infty(\mathbb{R}^n \times [0,\infty))$.

(a) Suppose there exist $\mu \in \mathbb{R}$ and $e \in S^{n-1}$ such that

\[ |V(x,t)| \le \frac{c_1}{(1 + |x + \mu t e|^2)^{\alpha/2}}, \tag{1.53} \]

for some constants $c_1 > 0$ and $\alpha \in [0, 1/2)$. Then there exists $\lambda_0 = \lambda_0(\|V\|_{L^\infty(\mathbb{R}^n\times[0,\infty))}; c_1; \alpha) > 0$ such that if

\[ \sup_{t \ge 0} \int_{\mathbb{R}^n} e^{\lambda_0 |x + \mu t e|^p}\, |u(x,t)|^2\, dx < \infty, \qquad \text{with } p = (4-2\alpha)/3, \tag{1.54} \]

then

\[ u \equiv 0. \tag{1.55} \]

(b) Suppose there exist $\mu \in \mathbb{R}$ and $e \in S^{n-1}$ such that

\[ |V(x,t)| \le \frac{c_1}{(1 + |x + \mu t e|^2)^{1/4 + \epsilon_0/2}}, \qquad \epsilon_0 > 0, \tag{1.56} \]

for some constant $c_1 > 0$. Then there exists $\lambda_0 = \lambda_0(\|V\|_{L^\infty(\mathbb{R}^n\times[0,\infty))}; c_1; \alpha) > 0$ such that if

\[ \sup_{t \ge 0} \int_{\mathbb{R}^n} e^{\lambda_0 |x + \mu t e|}\, |u(x,t)|^2\, dx < \infty, \tag{1.57} \]

then

\[ u \equiv 0. \tag{1.58} \]

As in Corollary 1 we remark that if (1.53) holds with $\alpha = 1/2$ and (1.54) holds for some $p > 1$ and $\lambda_0 > 0$, then $u \equiv 0$.

Finally we shall consider the semi-linear equations of the form

\[ \partial_t u = i(\Delta u + F(u, \bar u)u), \tag{1.59} \]

and

\[ \partial_t u = i(\mathcal{L}_k u + F(u, \bar u)u), \tag{1.60} \]

with $\mathcal{L}_k$ as in (1.14) and $F : \mathbb{C}^2 \to \mathbb{R}$ (real valued), $F(0,0) = 0$, and such that there exist $M > 0$ and $j \in \mathbb{Z}^+$ such that

\[ |F(z, \bar z)| \le M(|z| + |z|^j). \tag{1.61} \]

As a direct consequence of Theorems 1 and 2, Corollary 2, and an appropriate version of the Galilean invariance property for solutions of Eqs. (1.59) and (1.60), we shall establish the following result:

Corollary 3. Let $u \in C([0,\infty) : L^2(\mathbb{R}^n))$ be a solution of Eq. (1.59) or Eq. (1.60). If there exist $\mu \in \mathbb{R}$ and $e \in S^{n-1}$ such that

\[ |u(x,t)| \le Q(x + \mu t e), \qquad \forall\, x \in \mathbb{R}^n,\ t > 0, \tag{1.62} \]

with $Q(\cdot)$ as in (1.46) for some $p > 1$ or as in (1.47), then $u \equiv 0$.

In [6] it was proved that Eq. (1.60) with $\mathcal{L}_k$ a non-elliptic operator does not have nontrivial (travelling wave) solutions of the form

\[ u(x,t) = e^{i\omega t}\varphi(x + \mu t e), \qquad \mu \in \mathbb{R},\ e \in S^{n-1}, \]

with $\varphi \in H^1(\mathbb{R}^n) \cap H^2_{loc}(\mathbb{R}^n)$.

The rest of this paper is organized as follows. Section 2 contains the details of the proof of Theorem 1 in the case $V_2 \equiv 0$ (the proofs of Theorems 3, 4, 7, and 8, and Corollaries 2 and 3 follow this approach) and the modifications needed in this proof to obtain the general case. The modifications of this argument required to establish Theorem 2 will be given in Sect. 3. Section 3 also contains some remarks on the proof of Theorem 3. Theorem 7 will be proved in Sect. 4, and the proofs of Corollaries 2–3 will be outlined in Sect. 5. Finally, Theorems 5 and 6 will be proven in Sect. 6. The Appendix is concerned with the existence of the functions $\varphi$ used in the proofs of Theorem 1 and Theorem 2.



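2. Proof of Theorem 1

We begin with two preliminary results. Let $S$ be a symmetric operator independent of $t$. Let $A$ be a skew-symmetric one.

Proposition 1. For any $T_0, T_1 \in \mathbb{R}$, $T_0 < T_1$, and any suitable function $f(x,t)$ one has

\[ \int_{T_0}^{T_1}\!\!\int [S;A]f\,\bar f\,dx\,dt + \int_{T_0}^{T_1}\!\!\int |Sf|^2\,dx\,dt \le \int_{T_0}^{T_1}\!\!\int |\partial_t f - (S+A)f|^2\,dx\,dt + \Big| \int Sf(T_1)\,\bar f(T_1)\,dx \Big| + \Big| \int Sf(T_0)\,\bar f(T_0)\,dx \Big|. \tag{2.1} \]

Proof. Since $S$ is independent of $t$ one has

\[ \begin{aligned} \partial_t \langle Sf, f\rangle &= \langle \partial_t f, Sf\rangle + \langle Sf, \partial_t f\rangle \\ &= \langle \partial_t f - (S+A)f, Sf\rangle + \langle Sf, \partial_t f - (S+A)f\rangle + \langle (S+A)f, Sf\rangle + \langle Sf, (S+A)f\rangle \\ &= 2\,\mathrm{Re}\,\langle \partial_t f - (S+A)f, Sf\rangle + 2\langle Sf, Sf\rangle + \langle [SA - AS]f, f\rangle. \end{aligned} \tag{2.2} \]

Thus, integrating in the time interval $[T_0, T_1]$ it follows that

\[ \int_{T_0}^{T_1} \langle [S;A]f, f\rangle\,dt + 2\int_{T_0}^{T_1} \langle Sf, Sf\rangle\,dt = -2\,\mathrm{Re} \int_{T_0}^{T_1} \langle \partial_t f - (S+A)f, Sf\rangle\,dt + \langle Sf, f\rangle \Big|_{T_0}^{T_1}. \]

Then, using that $2ab \le a^2 + b^2$ we obtain (2.1).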
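For completeness, the last step of the proof of Proposition 1 can be written out as follows. This is a sketch in which $g = \partial_t f - (S+A)f$ is a shorthand introduced only here, and $\langle\cdot,\cdot\rangle$, $\|\cdot\|$ denote the $L^2(\mathbb{R}^n)$ inner product and norm in $x$.

```latex
\begin{align*}
\int_{T_0}^{T_1}\langle [S;A]f,f\rangle\,dt
  + 2\int_{T_0}^{T_1}\|Sf\|^2\,dt
 &= -2\,\mathrm{Re}\int_{T_0}^{T_1}\langle g,Sf\rangle\,dt
    + \langle Sf,f\rangle\Big|_{T_0}^{T_1} \\
 &\le \int_{T_0}^{T_1}\big(\|g\|^2+\|Sf\|^2\big)\,dt
    + |\langle Sf(T_1),f(T_1)\rangle|
    + |\langle Sf(T_0),f(T_0)\rangle| .
\end{align*}
% Here -2 Re<g,Sf> <= 2 ||g|| ||Sf|| <= ||g||^2 + ||Sf||^2
% (Cauchy-Schwarz, then 2ab <= a^2 + b^2).
% Subtracting \int ||Sf||^2 dt from both sides gives (2.1).
```

One of the two copies of $\int \|Sf\|^2\,dt$ on the left absorbs the copy produced on the right, which is why (2.1) keeps only a single $|Sf|^2$ term.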

Next, for a fixed $T \in \mathbb{R}$ we define $\eta : [T - 1/2, T + 1/2] \to \mathbb{R}$ as $\eta(t) = (t - (T - 1/2))((T + 1/2) - t)$, so $\eta(T - 1/2) = \eta(T + 1/2) = 0$ and for any $t \in [T - 1/2, T + 1/2]$,

\[ 0 \le \eta(t) \le 1/4, \qquad |\eta'(t)| \le 1, \qquad \eta''(t) = -2. \]
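The stated properties of $\eta$ follow from completing the square. The short computation below is a sketch with the shorthand $s = t - T$ (introduced only here); it also records the lower bound on the middle half of the interval that is used later in Step 1.

```latex
\begin{align*}
\eta(t) &= \big(\tfrac12 + s\big)\big(\tfrac12 - s\big)
         = \tfrac14 - s^2, \qquad s = t - T \in [-\tfrac12,\tfrac12], \\
\eta'(t) &= -2s, \qquad |\eta'(t)| \le 1, \qquad \eta''(t) = -2, \\
\eta(t) &\ge \tfrac14 - \tfrac1{16} = \tfrac3{16}
         \qquad \text{for } |s| \le \tfrac14 .
\end{align*}
```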

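Proposition 2. For any $T > 1/2$ one has

\[ \int_{T-1/2}^{T+1/2}\!\!\int \eta(t)\big( |Sf|^2 + [S;A]f\,\bar f \big)\,dx\,dt + \int_{T-1/2}^{T+1/2}\!\!\int |f|^2\,dx\,dt \le 8 \int_{T-1/2}^{T+1/2}\!\!\int |\partial_t f - (S+A)f|^2\,dx\,dt + 8\, \Big| \int |f|^2\,dx \Big|_{T-1/2}^{T+1/2} \Big|. \tag{2.3} \]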


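Proof. Since

\[ \begin{aligned} \partial_t \langle f, f\rangle &= \langle \partial_t f, f\rangle + \langle f, \partial_t f\rangle \\ &= \langle \partial_t f - (S+A)f, f\rangle + \langle f, \partial_t f - (S+A)f\rangle + \langle (S+A)f, f\rangle + \langle f, (S+A)f\rangle \\ &= 2\,\mathrm{Re}\,\langle \partial_t f - (S+A)f, f\rangle + 2\langle Sf, f\rangle, \end{aligned} \tag{2.4} \]

multiplying by $\eta'(t)$ and integrating in the time interval $[T - 1/2, T + 1/2]$ one gets

\[ -2 \int_{T-1/2}^{T+1/2} \eta'(t) \langle Sf, f\rangle\,dt = 2\,\mathrm{Re} \int_{T-1/2}^{T+1/2} \langle \partial_t f - (S+A)f, f\rangle\, \eta'(t)\,dt - \int_{T-1/2}^{T+1/2} \partial_t \langle f, f\rangle\, \eta'(t)\,dt. \tag{2.5} \]

Integration by parts gives

\[ -\int_{T-1/2}^{T+1/2} \partial_t \langle f, f\rangle\, \eta'(t)\,dt = -\langle f, f\rangle\, \eta'(t) \Big|_{T-1/2}^{T+1/2} + \int_{T-1/2}^{T+1/2} \langle f, f\rangle\, \eta''(t)\,dt \tag{2.6} \]

and

\[ -2 \int_{T-1/2}^{T+1/2} \eta'(t) \langle Sf, f\rangle\,dt = 2 \int_{T-1/2}^{T+1/2} \eta(t)\, \partial_t \langle Sf, f\rangle\,dt. \tag{2.7} \]

We recall that from (2.2) one has

\[ \partial_t \langle Sf, f\rangle = 2\,\mathrm{Re}\,\langle \partial_t f - (S+A)f, Sf\rangle + 2\langle Sf, Sf\rangle + \langle [S;A]f, f\rangle, \tag{2.8} \]

so inserting (2.8) into (2.7), and the result together with (2.6) into (2.5), it follows that

\[ \begin{aligned} 4 \int_{T-1/2}^{T+1/2} \eta(t) \langle Sf, Sf\rangle\,dt + 2 \int_{T-1/2}^{T+1/2} \eta(t) \langle [S;A]f, f\rangle\,dt = {} & -4\,\mathrm{Re} \int_{T-1/2}^{T+1/2} \eta(t) \langle \partial_t f - (S+A)f, Sf\rangle\,dt \\ & + 2\,\mathrm{Re} \int_{T-1/2}^{T+1/2} \langle \partial_t f - (S+A)f, f\rangle\, \eta'(t)\,dt \\ & - \langle f, f\rangle\, \eta'(t) \Big|_{T-1/2}^{T+1/2} + \int_{T-1/2}^{T+1/2} \langle f, f\rangle\, \eta''(t)\,dt, \end{aligned} \tag{2.9} \]

which combined with the properties of the function $\eta$ and Cauchy-Schwarz yields the estimate (2.3).

Proof of Theorem 1: case $V_2 \equiv 0$. We fix $\alpha \in [0, 1/2)$ and $p = (4 - 2\alpha)/3 \in (1, 4/3]$. Let $\varphi = \varphi_p$ be a $C^4$, radial function, strictly convex on compact sets of $\mathbb{R}^n$, such that

\[ \varphi(r) = r^p + \beta \ \text{ for } r = |x| \ge 1, \qquad \varphi(0) = 0, \qquad \varphi(r) > 0 \ \text{ for } r > 0, \qquad \exists\, M > 0 \ \text{s.t.}\ \varphi(r) \le M r^p \ \ \forall\, r \in [0, \infty). \tag{2.10} \]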



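The existence of such a function $\varphi = \varphi_p$ will be discussed in the Appendix, part (a). We recall that

\[ D^2\varphi = \partial_r^2\varphi\, \Big( \frac{x_j x_k}{r^2} \Big) + \frac{\partial_r\varphi}{r} \Big( \delta_{jk} - \frac{x_j x_k}{r^2} \Big). \tag{2.11} \]

Therefore,

\[ \nabla\varphi\, D^2\varphi\, \nabla\varphi = \partial_r^2\varphi\, (\partial_r\varphi)^2 = \frac{c}{|x|^{4-3p}}, \qquad \text{for } r = |x| \ge 1, \tag{2.12} \]

and

\[ D^2\varphi \ge p(p-1)\, r^{p-2}\, I, \qquad \text{for } r = |x| \ge 1. \tag{2.13} \]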
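The constant in (2.12) and the bound (2.13) can be made explicit for $\varphi(r) = r^p + \beta$, $r \ge 1$. The computation below is a sketch; the comparison with the decay of the potential assumes a bound of the type $|V| \lesssim \langle x\rangle^{-\alpha}$ as in (1.53), since the statement of Theorem 1 is not reproduced in this section.

```latex
\begin{align*}
\partial_r\varphi &= p\,r^{p-1}, \qquad
\partial_r^2\varphi = p(p-1)\,r^{p-2}, \\
\partial_r^2\varphi\,(\partial_r\varphi)^2
  &= p^3(p-1)\,r^{3p-4} = \frac{p^3(p-1)}{r^{4-3p}}, \\
\frac{\partial_r\varphi}{r} &= p\,r^{p-2} \;\ge\; p(p-1)\,r^{p-2}
  \qquad (\text{since } 0 < p-1 \le \tfrac13).
\end{align*}
% By (2.11) the eigenvalues of D^2\varphi are \partial_r^2\varphi (radial)
% and \partial_r\varphi / r (tangential), which gives (2.13).
% With p = (4-2\alpha)/3 one has 4 - 3p = 2\alpha, so the weight
% r^{-(4-3p)} decays at the same rate as |V|^2 under |V| <~ <x>^{-\alpha}.
```

This matching of exponents is what allows the term $\lambda^2 \nabla\varphi D^2\varphi \nabla\varphi$ to dominate $|V|^2$ for large $\lambda$ in Step 2.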

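Let $f(x,t) = e^{\lambda\varphi(x)} u(x,t)$, where $u(x,t)$ is a solution of the IVP (1.1), so

\[ e^{\lambda\varphi}(\partial_t - i\Delta)u = e^{\lambda\varphi}(\partial_t - i\Delta)(e^{-\lambda\varphi} f) = \partial_t f - Sf - Af, \tag{2.14} \]

where $S$ is symmetric and $A$ skew-symmetric, both independent of $t$, with

\[ A = i(\Delta + \lambda^2 |\nabla\varphi|^2), \qquad S = -i\lambda(2\nabla\varphi \cdot \nabla + \Delta\varphi), \tag{2.15} \]

so that

\[ [S;A] = -\lambda\big( (4\nabla \cdot D^2\varphi \nabla) - 4\lambda^2\, \nabla\varphi\, D^2\varphi\, \nabla\varphi + \Delta^2\varphi \big). \tag{2.16} \]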
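The splitting (2.14)–(2.15) can be verified by conjugating the Laplacian with the weight. The following worked computation is a sketch for smooth $f$ and real-valued $\varphi$.

```latex
\begin{align*}
\nabla\big(e^{-\lambda\varphi}f\big)
  &= e^{-\lambda\varphi}\big(\nabla f - \lambda\,\nabla\varphi\,f\big), \\
e^{\lambda\varphi}\,\Delta\big(e^{-\lambda\varphi}f\big)
  &= \Delta f - 2\lambda\,\nabla\varphi\cdot\nabla f
     - \lambda\,\Delta\varphi\,f + \lambda^2|\nabla\varphi|^2 f, \\
e^{\lambda\varphi}(\partial_t - i\Delta)\big(e^{-\lambda\varphi}f\big)
  &= \partial_t f
     - \underbrace{i\big(\Delta + \lambda^2|\nabla\varphi|^2\big)f}_{Af}
     - \underbrace{\big(-i\lambda\big)\big(2\nabla\varphi\cdot\nabla
        + \Delta\varphi\big)f}_{Sf}.
\end{align*}
% 2\nabla\varphi\cdot\nabla + \Delta\varphi is a real first-order operator
% that is skew-adjoint on L^2, so multiplying it by -i\lambda gives a
% symmetric S; \Delta + \lambda^2|\nabla\varphi|^2 is symmetric, so
% A = i(\cdot) is skew-symmetric.
```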

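We divide the proof into three steps:

Step 1. If

\[ \sup_{t>0} \int e^{\lambda|x|^p}\, |u(x,t)|^2\,dx \le c_\lambda, \qquad p = (4 - 2\alpha)/3, \tag{2.17} \]

then there exists $\{T_j : j \in \mathbb{Z}^+\}$ with $T_j \uparrow \infty$ as $j \uparrow \infty$ such that

\[ \sup_{j \in \mathbb{Z}^+} \int |Sf(x, T_j)|^2\,dx \le \tilde c_\lambda, \tag{2.18} \]

where $f = e^{\lambda\varphi(x)} u(x,t)$, $S$ is as in (2.15), and $\tilde c_\lambda$ denotes a constant depending on $c_\lambda$ in (2.17), $\lambda$, $\|V\|_\infty$ and $p$.

Proof of Step 1. We combine Proposition 2 with (2.16), passing the term involving $\Delta^2\varphi$ to the right-hand side and using that the rest of the commutator in (2.16) is positive, to obtain

\[ \int_{T-1/2}^{T+1/2}\!\!\int |Sf|^2\, \eta(t)\,dx\,dt \le 8 \int_{T-1/2}^{T+1/2}\!\!\int |\partial_t f - Sf - Af|^2\,dx\,dt + \lambda \|\Delta^2\varphi\|_\infty \int_{T-1/2}^{T+1/2}\!\!\int |f|^2\,dx\,dt + \Big| \int |f|^2\,dx \Big|_{T-1/2}^{T+1/2} \Big| \equiv B. \tag{2.19} \]

We use that

\[ e^{\lambda\varphi}(\partial_t - i\Delta)u = \partial_t f - Sf - Af, \qquad (\partial_t - i\Delta)u = iVu, \]

to bound the right-hand side of (2.19) as

\[ B \le c\, \big( \lambda \|\Delta^2\varphi\|_\infty + \sup_{t>0} \|V(\cdot,t)\|_\infty^2 \big)\, \sup_{t>0} \int e^{2\lambda\varphi}\, |u(x,t)|^2\,dx \le \tilde c_\lambda. \tag{2.20} \]

Inserting this in (2.19) and using that $\eta(t) \ge 3/16$ for $t \in [T - 1/4, T + 1/4]$ one gets that

\[ \tilde c_\lambda \ge \int_{T-1/2}^{T+1/2}\!\!\int |Sf|^2\, \eta\,dx\,dt \ge \int_{T-1/4}^{T+1/4}\!\!\int |Sf|^2\, \eta\,dx\,dt \ge \frac{3}{16} \int_{T-1/4}^{T+1/4}\!\!\int |Sf|^2\,dx\,dt \ge \frac{3}{32} \int |Sf(x, T^*)|^2\,dx, \]

for some $T^* \in [T - 1/4, T + 1/4]$. Hence, we can find a sequence $\{T_j : j \in \mathbb{Z}^+\}$ with $T_j \uparrow \infty$ as $j \uparrow \infty$ such that

\[ \sup_{j \in \mathbb{Z}^+} \int |Sf(x, T_j)|^2\,dx \le \tilde c_\lambda. \tag{2.21} \]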

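Step 2. There exists $\lambda_0 > 0$ such that if $\lambda \ge \lambda_0$, then for any $j \in \mathbb{Z}^+$,

\[ \int_{T_1}^{T_j}\!\!\int e^{2\lambda\varphi(x)}\, \frac{|u(x,t)|^2}{\langle x\rangle^{4-3p}}\,dx\,dt \le \tilde c_\lambda \qquad \text{uniformly in } j \in \mathbb{Z}^+. \tag{2.22} \]

Proof of Step 2. A combination of Proposition 1, the conclusion of Step 1, and our hypothesis leads to

\[ \int_{T_1}^{T_j}\!\!\int [S;A]f\,\bar f\,dx\,dt \le \int_{T_1}^{T_j}\!\!\int |e^{\lambda\varphi} V u|^2\,dx\,dt + \tilde c_\lambda. \tag{2.23} \]

From our hypothesis (2.10) on $\varphi$ one has that

\[ \nabla\varphi\, D^2\varphi\, \nabla\varphi \ge \frac{c}{|x|^{4-3p}}, \quad |x| \ge 1, \qquad |\Delta^2\varphi(x)| \le \frac{c}{\langle x\rangle^2}, \quad \forall\, x \in \mathbb{R}^n. \tag{2.24} \]

Thus, from our decay hypothesis on the potential (1.3) it follows that there exists $\tilde\lambda > 0$ such that if $\lambda \ge \tilde\lambda$ and $|x| \ge 1$, then

\[ 2\lambda^2\, \nabla\varphi\, D^2\varphi\, \nabla\varphi + \Delta^2\varphi - |V|^2 \ge \frac{\lambda}{\langle x\rangle^{4-3p}}. \tag{2.25} \]

Next, for any $\epsilon \in (0,1)$ we consider the domain $\{x : \epsilon \le |x| \le 1\}$. In this set we have that

\[ \nabla\varphi\, D^2\varphi\, \nabla\varphi \ge c_{\varphi,\epsilon}, \qquad \text{for } \epsilon \le |x| \le 1. \tag{2.26} \]

Therefore, for large enough $\lambda \ge \lambda_\epsilon$,

\[ \lambda^2\, \nabla\varphi\, D^2\varphi\, \nabla\varphi + \Delta^2\varphi - |V|^2 \ge \lambda, \qquad \text{for } \epsilon \le |x| \le 1. \tag{2.27} \]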



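Hence from (2.23),

\[ 4\lambda \int_{T_1}^{T_j}\!\!\int \nabla f\, D^2\varphi\, \nabla \bar f\,dx\,dt + 2\lambda^3 \int_{T_1}^{T_j}\!\!\int \nabla\varphi\, D^2\varphi\, \nabla\varphi\, |f|^2\,dx\,dt \le \tilde c_\lambda + c\, (\lambda \|\Delta^2\varphi\|_\infty + \|V\|_\infty^2) \int_{T_1}^{T_j}\!\!\int_{|x| \le \epsilon} |f|^2\,dx\,dt. \tag{2.28} \]

In the domain $\{x : |x| \le \epsilon\}$ we shall use that $\varphi$ is strictly convex in $r = |x| \le 2$ to get from (2.28) that

\[ 4 c_\varphi \lambda \int_{T_1}^{T_j}\!\!\int_{|x| \le 2} |\nabla f|^2\,dx\,dt + 2\lambda^3 \int_{T_1}^{T_j}\!\!\int \nabla\varphi\, D^2\varphi\, \nabla\varphi\, |f|^2\,dx\,dt \le \tilde c_\lambda + c\, (\lambda \|\Delta^2\varphi\|_\infty + \|V\|_\infty^2) \int_{T_1}^{T_j}\!\!\int_{|x| \le \epsilon} |f|^2\,dx\,dt, \tag{2.29} \]

with $c_\varphi$ and $c$ independent of $\epsilon \in (0,1]$. Now, we pick $\theta \in C^\infty(\mathbb{R}^n)$ such that $\theta(x) \equiv 1$ for $|x| \le \epsilon$ with $\mathrm{supp}\,\theta \subset \{x : |x| \le 2\epsilon\}$ and use Poincaré's inequality to get that for each $t \in [T_1, T_j]$,

\[ \int_{|x| \le \epsilon} |f|^2\,dx \le \int_{|x| \le 2\epsilon} |\theta f|^2\,dx \le c_\varphi\, \epsilon^2 \int_{|x| \le 2\epsilon} |\nabla(\theta f)|^2\,dx \le c_\varphi\, \epsilon^2 \int_{|x| \le 2\epsilon} |\nabla f|^2\,dx + c_\varphi \int_{\epsilon \le |x| \le 2\epsilon} |f|^2\,dx. \tag{2.30} \]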

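Fixing $\epsilon$ sufficiently small and then $\lambda$ large enough, it follows from this that

\[ \lambda \int_{T_1}^{T_j}\!\!\int \nabla f\, D^2\varphi\, \nabla \bar f\,dx\,dt + \lambda^2 \int_{T_1}^{T_j}\!\!\int_{|x| \le 1} |f|^2\,dx\,dt + \lambda^3 \int_{T_1}^{T_j}\!\!\int \nabla\varphi\, D^2\varphi\, \nabla\varphi\, |f|^2\,dx\,dt \le \tilde c_\lambda. \tag{2.31} \]

In particular, for $\lambda_0 \ge \tilde\lambda$ sufficiently large we have

\[ \int_{T_1}^{T_j}\!\!\int \frac{|f|^2}{\langle x\rangle^{4-3p}}\,dx\,dt \le c_\lambda, \qquad \text{uniformly in } j \in \mathbb{Z}^+, \tag{2.32} \]

which completes the proof of this step. We fix $\lambda = \lambda_0$ above for the rest of the proof.

Step 3. $u(x,t) \equiv 0$.

Proof of Step 3. On the one hand, since the potential $V = V(x,t)$ is real, the $L^2$-norm of the solution $u(x,t)$ of (1.1) is preserved, i.e. for all $t \in \mathbb{R}$, $\|u(\cdot,t)\|_2 = \|u_0\|_2$. On the other hand, from Step 2, inequality (2.22), one has

\[ \begin{aligned} (T_j - T_1) \|u_0\|_2^2 &= \int_{T_1}^{T_j}\!\!\int |u(x,t)|^2\,dx\,dt = \int_{T_1}^{T_j}\!\!\int |u(x,t)|^2\, \frac{e^{2\lambda\varphi}}{\langle x\rangle^{4-3p}}\, \langle x\rangle^{4-3p} e^{-2\lambda\varphi}\,dx\,dt \\ &\le \sup_{x \in \mathbb{R}^n} \big( \langle x\rangle^{4-3p} e^{-2\lambda\varphi} \big) \int_{T_1}^{T_j}\!\!\int |u(x,t)|^2\, \frac{e^{2\lambda\varphi}}{\langle x\rangle^{4-3p}}\,dx\,dt \le c_{\lambda_0}. \end{aligned} \]

Since the right-hand side is independent of $j$ and $T_j \uparrow \infty$, letting $j \uparrow \infty$ forces $u_0 \equiv 0$,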

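which completes the proof of Theorem 1 in the case $V_2 \equiv 0$.

Proof of Theorem 1: general case. The argument is similar to that presented above in the case $V_2 \equiv 0$, so we only sketch it. Step 1 is similar, so it will be omitted. In Step 2 we divide the potential $V(x,t)$ as in (1.2), $V(x,t) = V_1(x,t) + V_2(x,t)$, and define

\[ S = -i\lambda(2\nabla\varphi \cdot \nabla + \Delta\varphi), \qquad A = i(\Delta + V_2 + \lambda^2 |\nabla\varphi|^2), \tag{2.33} \]

so that

\[ [S;A] = -\lambda\big( (4\nabla \cdot D^2\varphi \nabla) - 4\lambda^2\, \nabla\varphi\, D^2\varphi\, \nabla\varphi + \Delta^2\varphi \big) + 2\lambda \nabla\varphi \cdot \nabla V_2 = D_1 + D_2. \tag{2.34} \]

We notice that $D_1$ is similar to the term handled in the proof of Theorem 1 in the case $V_2 \equiv 0$, and that since $\varphi$ is radial and convex one has

\[ D_2 = 2\lambda \nabla\varphi \cdot \nabla V_2 = 2\lambda\, \partial_r\varphi\, \partial_r V_2 \ge 2\lambda\, \partial_r\varphi\, (\partial_r V_2)^-. \tag{2.35} \]

Thus, from our decay hypothesis on the potential it follows that there exists $\lambda_0 > 0$ such that if $\lambda \ge \lambda_0$ and $|x| \ge 1$, then

\[ 2\lambda^2\, \nabla\varphi\, D^2\varphi\, \nabla\varphi + \Delta^2\varphi - |V_1|^2 + 2\, \partial_r\varphi\, (\partial_r V_2)^- \ge \frac{\lambda}{\langle x\rangle^{4-3p}}. \tag{2.36} \]

For $|x| \le 1$ we apply the argument in the proof of Theorem 1 in the case $V_2 \equiv 0$. Therefore, combining these estimates we obtain the proof of Step 2: there exists $\lambda_0 > 0$ such that if $\lambda \ge \lambda_0$, then for any $j \in \mathbb{Z}^+$,

\[ \int_{T_1}^{T_j}\!\!\int e^{2\lambda\varphi(x)}\, \frac{|u(x,t)|^2}{\langle x\rangle^{4-3p}}\,dx\,dt \le \tilde c_\lambda \qquad \text{independent of } j \in \mathbb{Z}^+. \tag{2.37} \]

Once (2.37) has been established, the rest of the proof follows the same argument given in Step 3 of the proof of Theorem 1 in the case $V_2 \equiv 0$.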



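3. Proofs of Theorem 2 and Theorem 3

Proof of Theorem 2: case $V_2 \equiv 0$. We shall follow the argument provided in the proof of Theorem 1. A main difference is the choice of the function $\varphi$ in (2.10). In this case we take $\varphi \in C^4$ to be a radial function, strictly convex on compact sets of $\mathbb{R}^n$, such that

\[ \varphi(r) = 3r - \int_1^r \frac{dt}{1 + \log t} + \beta, \qquad r = |x| \ge 1, \tag{3.1} \]

so

\[ \partial_r\varphi(x) = 3 - \frac{1}{1 + \log r}, \qquad \partial_r^2\varphi(x) = \frac{1}{r(1 + \log r)^2}, \qquad r = |x| \ge 1, \tag{3.2} \]

and

\[ \varphi(0) = 0, \qquad \varphi(r) > 0 \ \text{ for } r > 0, \qquad \exists\, M > 0 \ \text{s.t.}\ \varphi(r) \le M r \ \ \forall\, r \in [0, \infty). \tag{3.3} \]

The existence of such a function $\varphi$ will be proven in the Appendix, part (b). Since

\[ D^2\varphi = \partial_r^2\varphi\, \Big( \frac{x_j x_k}{r^2} \Big) + \frac{\partial_r\varphi}{r} \Big( \delta_{jk} - \frac{x_j x_k}{r^2} \Big), \tag{3.4} \]

for $|x| \ge 1$ one has

\[ \nabla\varphi\, D^2\varphi\, \nabla\varphi = \partial_r^2\varphi\, (\partial_r\varphi)^2 > \frac{1}{r(1 + \log r)^2}, \tag{3.5} \]

and

\[ D^2\varphi \ge \partial_r^2\varphi(x)\, I. \tag{3.6} \]

Step 1 is similar to that in the proof of Theorem 1, with the appropriate modifications, hence we shall start with Step 2.

Step 2. There exists $\lambda_0 > 0$ such that if $\lambda \ge \lambda_0$, then for any $j \in \mathbb{Z}^+$,

\[ \int_{T_1}^{T_j}\!\!\int e^{2\lambda\varphi(x)}\, \frac{|u(x,t)|^2}{\langle x\rangle (\log\langle x\rangle)^2}\,dx\,dt \le \tilde c_\lambda \qquad \text{independent of } j \in \mathbb{Z}^+. \tag{3.7} \]

Proof of Step 2. A combination of Proposition 1, the conclusion of Step 1, and our hypothesis leads to

\[ \int_{T_1}^{T_j}\!\!\int [S;A]f\,\bar f\,dx\,dt \le \int_{T_1}^{T_j}\!\!\int |e^{\lambda\varphi} V u|^2\,dx\,dt + \tilde c_\lambda. \tag{3.8} \]

From our assumptions on $\varphi$ it follows that

\[ |\Delta^2\varphi(x)| \le \frac{c}{\langle x\rangle^2}, \qquad \forall\, x \in \mathbb{R}^n. \tag{3.9} \]

Using the decay hypothesis on the potential (1.10) one has that there exists $\tilde\lambda > 0$ such that if $\lambda \ge \tilde\lambda$ and $|x| \ge 1$, then

\[ 2\lambda^2\, \nabla\varphi\, D^2\varphi\, \nabla\varphi + \Delta^2\varphi - |V|^2 \ge \frac{\lambda}{r(1 + \log r)^2}. \tag{3.10} \]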


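Thus, from (3.8) and $\lambda \gg 1$,

\[ 4\lambda \int_{T_1}^{T_j}\!\!\int \nabla f\, D^2\varphi\, \nabla \bar f\,dx\,dt + 2\lambda^3 \int_{T_1}^{T_j}\!\!\int \nabla\varphi\, D^2\varphi\, \nabla\varphi\, |f|^2\,dx\,dt \le \tilde c_\lambda + c\, (\lambda \|\Delta^2\varphi\|_\infty + \|V\|_\infty^2) \int_{T_1}^{T_j}\!\!\int_{|x| \le 1} |f|^2\,dx\,dt \le \tilde c_\lambda + c\,\lambda \int_{T_1}^{T_j}\!\!\int_{|x| \le 1} |f|^2\,dx\,dt. \tag{3.11} \]

Next, for a fixed $\epsilon \in (0,1)$ we consider the domain $\{x : \epsilon \le |x| \le 1\}$. In this region

\[ \nabla\varphi\, D^2\varphi\, \nabla\varphi \ge c_{\varphi,\epsilon}, \qquad \text{for } \epsilon \le |x| \le 1. \tag{3.12} \]

Therefore, for large enough $\lambda \ge \lambda_\epsilon$,

\[ \lambda^2\, \nabla\varphi\, D^2\varphi\, \nabla\varphi + \Delta^2\varphi - |V|^2 \ge \lambda, \qquad \text{for } \epsilon \le |x| \le 1. \tag{3.13} \]

Hence

\[ 4\lambda \int_{T_1}^{T_j}\!\!\int \nabla f\, D^2\varphi\, \nabla \bar f\,dx\,dt + 2\lambda^3 \int_{T_1}^{T_j}\!\!\int \nabla\varphi\, D^2\varphi\, \nabla\varphi\, |f|^2\,dx\,dt \le \tilde c_\lambda + c\,\lambda \int_{T_1}^{T_j}\!\!\int_{|x| \le \epsilon} |f|^2\,dx\,dt, \tag{3.14} \]

with $c$ independent of $\epsilon \in (0,1]$. In the domain $\{x : |x| \le \epsilon\}$ we shall use that $\varphi$ is strictly convex in $r = |x| \le 2$ to get from (3.11) that

\[ 2 c_\varphi \lambda \int_{T_1}^{T_j}\!\!\int_{|x| \le 2} |\nabla f|^2\,dx\,dt + \lambda^3 \int_{T_1}^{T_j}\!\!\int \nabla\varphi\, D^2\varphi\, \nabla\varphi\, |f|^2\,dx\,dt \le \tilde c_\lambda + c\,\lambda \int_{T_1}^{T_j}\!\!\int_{|x| \le \epsilon} |f|^2\,dx\,dt, \tag{3.15} \]

with $c_\varphi$ and $c$ independent of $\epsilon \in (0,1]$. Choosing $\theta \in C^\infty(\mathbb{R}^n)$ such that $\theta(x) \equiv 1$ for $|x| \le \epsilon$ with $\mathrm{supp}\,\theta \subset \{x : |x| \le 2\epsilon\}$ and using Poincaré's inequality, it follows that for each $t \in [T_1, T_j]$,

\[ \int_{|x| \le \epsilon} |f|^2\,dx \le \int_{|x| \le 2\epsilon} |\theta f|^2\,dx \le c_\varphi\, \epsilon^2 \int_{|x| \le 2\epsilon} |\nabla(\theta f)|^2\,dx \le c_\varphi\, \epsilon^2 \int_{|x| \le 2\epsilon} |\nabla f|^2\,dx + c_\varphi \int_{\epsilon \le |x| \le 2\epsilon} |f|^2\,dx. \tag{3.16} \]

Gathering the above estimates by fixing $\epsilon$ sufficiently small and then $\lambda > \tilde\lambda$ large enough, one concludes that

\[ \lambda \int_{T_1}^{T_j}\!\!\int \nabla f\, D^2\varphi\, \nabla \bar f\,dx\,dt + \lambda^2 \int_{T_1}^{T_j}\!\!\int_{|x| \le 1} |f|^2\,dx\,dt + \lambda^3 \int_{T_1}^{T_j}\!\!\int \nabla\varphi\, D^2\varphi\, \nabla\varphi\, |f|^2\,dx\,dt \le \tilde c_\lambda. \tag{3.17} \]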



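In particular

\[ \int_{T_1}^{T_j}\!\!\int \frac{|f|^2}{\langle x\rangle (\log\langle x\rangle)^2}\,dx\,dt \le c_\lambda, \qquad \text{independent of } j \in \mathbb{Z}^+, \tag{3.18} \]

which completes the proof of Step 2. We fix $\lambda = \lambda_0$ above for the rest of the proof.

Step 3. $u(x,t) \equiv 0$.

Proof of Step 3. On one hand, since the potential $V = V(x,t)$ is real, the $L^2$-norm of the solution $u(x,t)$ of (1.1) is preserved, i.e. for all $t \in \mathbb{R}$, $\|u(\cdot,t)\|_2 = \|u_0\|_2$. On the other hand, from Step 2, (3.7),

\[ \begin{aligned} (T_j - T_1) \|u_0\|_2^2 &= \int_{T_1}^{T_j}\!\!\int |u(x,t)|^2\,dx\,dt = \int_{T_1}^{T_j}\!\!\int |u(x,t)|^2\, \frac{e^{2\lambda\varphi}}{\langle x\rangle (\log\langle x\rangle)^2}\, \langle x\rangle (\log\langle x\rangle)^2 e^{-2\lambda\varphi}\,dx\,dt \\ &\le \sup_{x \in \mathbb{R}^n} \big( \langle x\rangle (\log\langle x\rangle)^2 e^{-2\lambda\varphi} \big) \int_{T_1}^{T_j}\!\!\int |u(x,t)|^2\, \frac{e^{2\lambda\varphi}}{\langle x\rangle (\log\langle x\rangle)^2}\,dx\,dt \le c_{\lambda_0}, \end{aligned} \]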

which completes the proof of Theorem 2 in the case $V_2 \equiv 0$. The proof in the general case follows the same argument already explained in the proof of Theorem 1, so it will be omitted.

Proof of Theorem 3. The only differences with the previous cases are the following computations:

\[ S = -i\lambda(2\tilde\nabla\varphi \cdot \nabla + \mathcal{L}_k\varphi), \qquad \tilde\nabla = (\partial_{x_1}, \dots, \partial_{x_k}, -\partial_{x_{k+1}}, \dots, -\partial_{x_n}), \]

\[ A = i\big( \mathcal{L}_k + \lambda^2 \big( (\partial_{x_1}\varphi)^2 + \cdots + (\partial_{x_k}\varphi)^2 - (\partial_{x_{k+1}}\varphi)^2 - \cdots - (\partial_{x_n}\varphi)^2 \big) \big), \]

so

\[ [S;A] = -\lambda\big( (4\tilde\nabla \cdot D^2\varphi\, \tilde\nabla) - 4\lambda^2\, \tilde\nabla\varphi\, D^2\varphi\, \tilde\nabla\varphi + \mathcal{L}_k \mathcal{L}_k \varphi \big). \]

Hence, the method of proof used in Theorems 1–2 for the elliptic case $\mathcal{L}_k = \Delta$ can be applied to obtain the same results in this non-degenerate case.

4. Proof of Theorem 7

The conformal transformation (1.41) with $\nu = \omega = \theta = 1$ and $\gamma = 0$ tells us that

\[ w(x,t) = \frac{1}{(1+t)^{n/2}}\, e^{i|x|^2/4(1+t)}\, v\Big( \frac{x}{1+t}, \frac{t}{1+t} \Big) \tag{4.1} \]

solves the equation

\[ i\partial_t w + \Delta w \pm (1+t)^{an/2-2} |w|^a w = 0 \tag{4.2} \]

in the time interval $t \in [0, \infty)$. Thus, from the hypothesis (1.48) it follows that the solution $w(x,t)$ satisfies

\[ |w(x,t)| = \frac{1}{(1+t)^{n/2}} \Big| v\Big( \frac{x}{1+t}, \frac{t}{1+t} \Big) \Big| \le \frac{1}{(1+t)^{n/2}}\, \frac{1}{\big(1 - \frac{t}{1+t}\big)^{2/a}}\; Q\Big( \frac{x/(1+t)}{1 - \frac{t}{1+t}} \Big) = \frac{1}{(1+t)^{n/2-2/a}}\, Q(x). \tag{4.3} \]

Since the potential $V(x,t)$ has the form $V(x,t) = \pm(1+t)^{an/2-2} |w(x,t)|^a$, from (4.3) one sees that it verifies

\[ |V(x,t)| \le (1+t)^{an/2-2} \Big( \frac{1}{(1+t)^{n/2-2/a}} \Big)^a Q^a(x) = Q^a(x). \tag{4.4} \]
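The exponent bookkeeping behind (4.3) and (4.4) is elementary but easy to misread, so it is spelled out here as a short sketch, using only the identity $1 - t/(1+t) = 1/(1+t)$ for $t \ge 0$.

```latex
\begin{align*}
\Big(1-\frac{t}{1+t}\Big)^{-2/a} &= (1+t)^{2/a},
\qquad
\frac{x/(1+t)}{1-\frac{t}{1+t}} = x, \\
\frac{1}{(1+t)^{n/2}}\,(1+t)^{2/a}
  &= \frac{1}{(1+t)^{n/2-2/a}}, \\
(1+t)^{an/2-2}\Big(\frac{1}{(1+t)^{n/2-2/a}}\Big)^{a}
  &= (1+t)^{an/2-2}\,(1+t)^{-(an/2-2)} = 1 .
\end{align*}
% For a >= 4/n one has n/2 - 2/a >= 0, so (4.3) also gives
% |w(x,t)| <= Q(x) for all t >= 0.
```

In other words, the time-dependent factor in the nonlinearity of (4.2) is exactly cancelled by the decay of $|w|^a$, leaving a potential bounded by $Q^a(x)$ uniformly in $t \ge 0$.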

Therefore, since $a \ge 4/n > 0$, from our hypothesis (1.46) or (1.47) it follows that the potential in (4.2) satisfies the hypotheses in Theorem 1 and Theorem 2 with $V_2 \equiv 0$. Since the $L^2$-norm of the solution $w(x,t)$ is preserved for all $t \ge 0$, Theorem 1 and Theorem 2 yield the desired result.

5. Proofs of Corollary 2 and Corollary 3

Proof of Corollary 2. We observe that if $u(x,t)$ solves the equation in (1.1), then

\[ w(x,t) = u(x - \mu t e, t)\, e^{i\left( \frac{\mu}{2} x \cdot e - \frac{\mu^2 t}{4} \right)} \tag{5.1} \]

is a solution of the equation

\[ \partial_t w = i(\Delta w + V(x - \mu t e, t)\, w). \tag{5.2} \]
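The passage from (5.1) to (5.2) is the usual Galilean invariance. The following sketch verifies it, assuming that Eq. (1.1) has the form $\partial_t u = i(\Delta u + Vu)$, as used in the proof of Step 1 of Theorem 1. The shorthand $\theta(x,t) = \frac{\mu}{2} x\cdot e - \frac{\mu^2 t}{4}$ is introduced only here, and $u$, $V$ and their derivatives are evaluated at $(x - \mu t e, t)$.

```latex
\begin{align*}
\partial_t w &= e^{i\theta}\Big(\partial_t u - \mu\, e\cdot\nabla u
                 - i\,\tfrac{\mu^2}{4}\,u\Big), \\
\Delta w &= e^{i\theta}\Big(\Delta u + i\mu\, e\cdot\nabla u
                 - \tfrac{\mu^2}{4}\,|e|^2\,u\Big), \qquad |e| = 1, \\
\partial_t w - i\Delta w
  &= e^{i\theta}\big(\partial_t u - i\Delta u\big)
   = e^{i\theta}\, i\,V\,u
   = i\,V(x-\mu t e,t)\,w .
\end{align*}
% The transport terms -\mu e.\nabla u and the phase terms \mp i\mu^2 u/4
% cancel between \partial_t w and -i\Delta w.
```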

Thus, from hypotheses (1.53) and (1.56) the potential in (5.2),

\[ W(x,t) \equiv V(x - \mu t e, t), \tag{5.3} \]

satisfies the conditions in Theorems 1 and 2, respectively. Therefore, they can be applied to Eq. (5.2) to obtain the result. In the case of Eq. (1.15) the transformation (5.1) reads

\[ w(x,t) = u(x - \mu t e, t)\, e^{i\left( \frac{\mu}{2} x \cdot e(k) - \frac{\mu^2 t\, Q_k(e)}{4} \right)}, \tag{5.4} \]

with

\[ e(k) = (e_1, \dots, e_k, -e_{k+1}, \dots, -e_n) \qquad \text{if} \qquad e = (e_1, \dots, e_n), \tag{5.5} \]

and $Q_k$ as in (1.51). The function $w(x,t)$ satisfies the equation

\[ \partial_t w = i(\mathcal{L}_k w + V(x - \mu t e, t)\, w). \tag{5.6} \]

Hence, the potential

\[ W(x,t) \equiv V(x - \mu t e, t) \tag{5.7} \]

and the solution $w(x,t)$ of (5.6) satisfy the requirements in Theorem 3.



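Proof of Corollary 3. If $u(x,t)$ is a solution of Eq. (1.59), $\partial_t u = i(\Delta u + F(u, \bar u)\, u)$, then

\[ v(x,t) = u(x - \mu t e, t)\, e^{i\left( \frac{\mu}{2} x \cdot e - \frac{\mu^2 t}{4} \right)} \tag{5.8} \]

satisfies the equation

\[ \partial_t v = i\Big( \Delta v + F\big( e^{-i\left( \frac{\mu}{2} x \cdot e - \frac{\mu^2 t}{4} \right)} v,\ e^{i\left( \frac{\mu}{2} x \cdot e - \frac{\mu^2 t}{4} \right)} \bar v \big)\, v \Big). \tag{5.9} \]

So in this case, from the hypothesis on $F(z, \bar z)$, the potential

\[ W(x,t) \equiv F\big( e^{-i\left( \frac{\mu}{2} x \cdot e - \frac{\mu^2 t}{4} \right)} v,\ e^{i\left( \frac{\mu}{2} x \cdot e - \frac{\mu^2 t}{4} \right)} \bar v \big) \tag{5.10} \]

verifies that

\[ |W(x,t)| \le M(|v(x,t)| + |v(x,t)|^j) = M(|u(x - \mu t e, t)| + |u(x - \mu t e, t)|^j). \]

Thus, the assumption (1.62) guarantees that we can use Corollary 1 and Theorem 2 to achieve the result. In the case of Eq. (1.60), $\partial_t u = i(\mathcal{L}_k u + F(u, \bar u)\, u)$, one just needs to define $v(x,t)$ as

\[ v(x,t) = u(x - \mu t e, t)\, e^{i\left( \frac{\mu}{2} x \cdot e(k) - \frac{\mu^2 t\, Q_k(e)}{4} \right)}, \tag{5.11} \]

with $e(k)$ as in (5.5) and $Q_k$ as in (1.51). Since $v(x,t)$ solves the equation

\[ \partial_t v = i\Big( \mathcal{L}_k v + F\big( e^{-i\left( \frac{\mu}{2} x \cdot e(k) - \frac{\mu^2 t\, Q_k(e)}{4} \right)} v,\ e^{i\left( \frac{\mu}{2} x \cdot e(k) - \frac{\mu^2 t\, Q_k(e)}{4} \right)} \bar v \big)\, v \Big), \tag{5.12} \]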

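one just needs to follow the argument given in the case of Eq. (1.59) to obtain the desired result.

6. Proofs of Theorem 5 and Theorem 6

Proof of Theorem 6. We have

\[ e^{\tau|x|} (\Delta + \tilde V_2)\, e^{-\tau|x|} = S + A, \]

where

\[ S = \Delta + \tilde V_2 + \tau^2, \qquad A = -\frac{\tau}{|x|} (2x \cdot \nabla + n - 1). \]

Hence, the commutator of $S$ and $A$ is

\[ [S;A] = -4\tau\, \partial_j \cdot \Big( \Big( \frac{\delta_{jk}}{|x|} - \frac{x_j x_k}{|x|^3} \Big) \partial_k \Big) + \frac{\tau(n-1)(n-3)}{|x|^3} + \tau\, \partial_r \tilde V_2. \tag{6.1} \]

Let $g \in C_0^\infty(\mathbb{R}^n \setminus B_\rho)$ and set $f = e^{\tau|x|} g$. Then,

\[ \begin{aligned} \| e^{\tau|x|} (\Delta + \tilde V_2) g \|_2^2 &= \|Sf\|_2^2 + \|Af\|_2^2 + \int_{\mathbb{R}^n} [S;A]f\,\bar f\,dx \\ &= \|Sf\|_2^2 + \|Af\|_2^2 + 4\tau \int_{\mathbb{R}^n} \frac{1}{|x|} \big( |\nabla f|^2 - |\partial_r f|^2 \big)\,dx + \tau \int_{\mathbb{R}^n} \Big( \frac{(n-1)(n-3)}{|x|^3} + \partial_r \tilde V_2 \Big) |f|^2\,dx, \end{aligned} \tag{6.2} \]

with $\partial_r f = \frac{x}{|x|} \cdot \nabla f$ and

\[ \begin{aligned} \|Af\|_2 &= \tau\, \Big\| 2\partial_r f + \frac{n-1}{|x|} f \Big\|_2 \ge \sqrt{\tau}\, \Big\| 2\partial_r f + \frac{n-1}{|x|} f \Big\|_2 \\ &\ge 2\sqrt{\tau}\, \|\partial_r f\|_2 - \sqrt{\tau}\,(n-1)\, \| |x|^{-1} f \|_2 \\ &\ge \sqrt{\tau\rho}\; \| |x|^{-1/2} \partial_r f \|_2 - \sqrt{\tau/\rho}\; \| |x|^{-1/2} f \|_2 \end{aligned} \tag{6.3} \]

for $\tau \ge 1$. Combining our hypotheses on the potential (1.27)–(1.29), (6.2) and (6.3) one gets that

\[ \|Sf\|_2 + \sqrt{\tau\rho}\; \| |x|^{-1/2} \nabla f \|_2 \le \| e^{\tau|x|} (\Delta + \tilde V_2) g \|_2 + \sqrt{\tau/\rho}\; \| |x|^{-1/2} f \|_2. \tag{6.4} \]

Thus, using (6.1) it follows that

\[ \begin{aligned} \tau^3 \int_{\mathbb{R}^n} \frac{|f|^2}{|x|}\,dx &= \tau \int_{\mathbb{R}^n} \frac{1}{|x|} \big( Sf\,\bar f - \Delta f\,\bar f - \tilde V_2 |f|^2 \big)\,dx \\ &= \tau \int_{\mathbb{R}^n} \frac{1}{|x|}\, Sf\,\bar f\,dx - \tau \int_{\mathbb{R}^n} \Big( \frac{1}{2} \Delta\Big( \frac{1}{|x|} \Big) |f|^2 - \frac{1}{|x|} |\nabla f|^2 + \frac{1}{|x|} \tilde V_2 |f|^2 \Big)\,dx \\ &= \tau \int_{\mathbb{R}^n} \frac{1}{|x|} \Big( Sf\,\bar f + |\nabla f|^2 + \frac{(n-3)}{2} \frac{|f|^2}{|x|^2} - \tilde V_2 |f|^2 \Big)\,dx. \end{aligned} \tag{6.5} \]

The last identity, our hypotheses on the potential (1.27)–(1.29), (6.4) and the Cauchy-Schwarz inequality show that Theorem 6 holds for $\tau \ge \tau_0$ with $\tau_0 = \tau_0(n, \|\tilde V\|_\infty; c_1; c_2; \rho)$.

Proof of Theorem 5. We fix $\phi \in C_0^\infty(\mathbb{R}^n)$ such that $\phi$ is positive, with $\phi(x) = 1$ for $|x| \le 1$ and $\mathrm{supp}\,\phi \subset \{x : |x| \le 2\}$, and rewrite Eq. (1.20) as

\[ \Delta u + V(x) u = \Delta u + \tilde V(x) u - \zeta u = \Delta u + \tilde V_1(x) u + \tilde V_2(x) u = 0, \tag{6.6} \]

with

\[ \tilde V_1(x) = \tilde V(x) - \zeta \phi(x), \qquad \tilde V_2(x) = -\zeta (1 - \phi(x)). \]

Thus, $\tilde V_1$, $\tilde V_2$ satisfy the hypotheses of Theorems 5 and 6. We shall define $\phi_L$ as

\[ \phi_L(x) = \phi(x/L), \qquad L > 0. \tag{6.7} \]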



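Claim. There exist $\rho_0 \in [0,1)$ and $M = M(n)$ such that

\[ \|u\|_{L^2(B_{4\rho_0})}^2 = \int_{|x| \le 4\rho_0} |u(x)|^2\,dx \le M\, \|u\|_{L^2(B_{10\rho_0} - B_{5\rho_0})}^2 = M \int_{5\rho_0 \le |x| \le 10\rho_0} |u(x)|^2\,dx. \tag{6.8} \]

Proof of the claim. Multiplying Eq. (6.6) by $\bar u\, \phi_{5\rho}^2$, with $\rho$ to be determined, and integrating the result one gets

\[ -\int |\nabla u|^2 \phi_{5\rho}^2\,dx + \int |u|^2 \big( 2|\nabla\phi_{5\rho}|^2 + \phi_{5\rho} \Delta\phi_{5\rho} \big)\,dx + \int \mathrm{Re}(V)\, |u|^2 \phi_{5\rho}^2\,dx = 0. \tag{6.9} \]

Combining (6.9) and the Poincaré inequality one has that

\[ \begin{aligned} \int |u\phi_{5\rho}|^2\,dx &\le (10\rho)^2 \int |\nabla(u\phi_{5\rho})|^2\,dx \\ &\le (10\rho)^2 \int |\nabla u|^2 \phi_{5\rho}^2\,dx + c_n \int |u|^2 \phi_{5\rho} |\nabla\phi_{5\rho}|\,dx \\ &\le (10\rho)^2 \Big( c_n \int_{B_{10\rho} - B_{5\rho}} |u|^2\,dx + \|V\|_\infty \int |u\phi_{5\rho}|^2\,dx \Big) + c_n \int_{B_{10\rho} - B_{5\rho}} |u|^2\,dx. \end{aligned} \tag{6.10} \]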

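Fixing $\rho_0$ small enough, depending on $\|V\|_\infty$, we establish the claim (6.8).

Next, we apply Theorem 2a to $u\theta = u\,\theta_{\rho,R}$, where $\theta \in C_0^\infty(\mathbb{R}^n)$ with $\theta(x) = 1$ for $4\rho \le |x| \le R$, $\theta(x) = 0$ for $|x| \ge 2R$, and $\theta(x) = 0$ for $|x| \le 2\rho$, with $R > 10$ and $\rho \in (0,1)$, to get that

\[ \begin{aligned} \tau^3\, \| |x|^{-1/2} e^{\tau|x|} (u\theta) \|_2^2 &\le \| e^{\tau|x|} (\Delta + V)(u\theta) \|_2^2 \le 4\, \| e^{\tau|x|} \nabla u \cdot \nabla\theta \|_2^2 + 2\, \| e^{\tau|x|} u\, \Delta\theta \|_2^2 \\ &\le 4\, \| e^{\tau|x|} \nabla u \cdot \nabla\theta \|_2^2 + 2 c_n\, \| e^{\tau|x|} u \|_{L^2((B_{2R} - B_R) \cup (B_{4\rho} - B_{2\rho}))}^2. \end{aligned} \tag{6.11} \]

Using integration by parts and Eq. (6.6) one gets that

\[ \| e^{\tau|x|} \nabla u \cdot \nabla\theta \|_2^2 \le c_n \Big( \|V\|_\infty + \tau^2 + \frac{\tau}{\rho} \Big)\, \| e^{\tau|x|} u\, \nabla\theta \|_2^2. \]

Therefore

\[ A_1 \equiv \tau^3\, \| |x|^{-1/2} e^{\tau|x|} (u\theta) \|_2^2 \le c_n \Big( \|V\|_\infty + \tau^2 + \frac{\tau}{\rho} \Big)\, \| e^{\tau|x|} u \|_{L^2((B_{2R} - B_R) \cup (B_{4\rho} - B_{2\rho}))}^2 \equiv A_2. \]

On one hand one has that

\[ A_1 \ge \tau^3\, \Big\| \frac{e^{\tau|x|} u}{|x|^{1/2}} \Big\|_{L^2(B_R - B_{2\rho})}^2 \ge c_n \frac{\tau^3}{\rho}\, \| e^{\tau|x|} u \|_{L^2(B_{10\rho} - B_{5\rho})}^2 \ge c_n \frac{\tau^3}{\rho}\, e^{10\tau\rho}\, \|u\|_{L^2(B_{10\rho} - B_{5\rho})}^2. \tag{6.12} \]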


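On the other hand,

\[ A_2 \le c_n \Big( \|V\|_\infty + \tau^2 + \frac{\tau}{\rho} \Big) e^{8\tau\rho}\, \|u\|_{L^2(B_{4\rho})}^2 + c_n \Big( \|V\|_\infty + \tau^2 + \frac{\tau}{\rho} \Big) e^{4\tau R}\, \|u\|_{L^2(B_{2R} - B_R)}^2. \]

Therefore, fixing $\rho = \rho_0$ as in the claim, it follows that

\[ \begin{aligned} \frac{\tau^3}{M \rho_0}\, e^{10\tau\rho_0}\, \|u\|_{L^2(B_{4\rho_0})}^2 &\le \frac{\tau^3}{\rho_0}\, e^{10\tau\rho_0}\, \|u\|_{L^2(B_{10\rho_0} - B_{5\rho_0})}^2 \\ &\le c_n \Big( \|V\|_\infty + \tau^2 + \frac{\tau}{\rho_0} \Big) e^{8\tau\rho_0}\, \|u\|_{L^2(B_{4\rho_0})}^2 + c_n \Big( \|V\|_\infty + \tau^2 + \frac{\tau}{\rho_0} \Big) e^{4\tau R}\, \|u\|_{L^2(B_{2R} - B_R)}^2. \end{aligned} \tag{6.13} \]

Therefore, for $\tau$ sufficiently large but independent of $R > 10$, it follows that

\[ \|u\|_{L^2(B_{2R} - B_R)}^2 \ge c_n\, e^{10\tau\rho_0}\, e^{-4\tau R}\, \|u\|_{L^2(B_{4\rho_0})}^2. \]

Finally, taking $\lambda_0 > 2\tau$ one has

\[ \begin{aligned} \infty > \int e^{2\lambda_0|x|}\, |u(x)|^2\,dx &\ge \sum_{k=1}^{\infty} \int_{2^{k-1}R \le |x| \le 2^k R} e^{2\lambda_0|x|}\, |u(x)|^2\,dx \ge \sum_{k=1}^{\infty} e^{2^k \lambda_0 R} \int_{2^{k-1}R \le |x| \le 2^k R} |u(x)|^2\,dx \\ &\ge \sum_{k=1}^{\infty} e^{2^k R \lambda_0}\, e^{-2^{k+1} \tau R}\, e^{8\tau\rho_0}\, \|u\|_{L^2(B_{4\rho_0})}^2, \end{aligned} \tag{6.14} \]

which gives a contradiction except if $\|u\|_{L^2(B_{4\rho_0})}^2 = 0$.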
Acknowledgement. The authors would like to thank J. C. Saut for fruitful conversations concerning this work.

7. Appendix

Part (a). We recall that $p \in (1, 4/3]$. The aim is to find

\[ \varphi(r) = a_0 + a_1 r^2 + a_2 r^4 + a_3 r^6 + a_4 r^8, \qquad r \in [0,1], \tag{7.1} \]

such that

\[ \varphi(1) = d_0, \quad \varphi'(1) = d_1, \quad \varphi^{(2)}(1) = d_2 > 0, \quad \varphi^{(3)}(1) = d_3 < 0, \quad \varphi^{(4)}(1) = d_4 > 0, \tag{7.2} \]

for prescribed values $d_0, \dots, d_4$, such that $\varphi(0) = 0$ and $\varphi$ is strictly convex for $r \in [0,1]$. Since in Theorem 1 $\varphi(r) = r^p + \beta$, $r \ge 1$, one has

\[ d_0 = 1 + \beta, \quad d_1 = p > 0, \quad d_2 = p(p-1) > 0, \quad d_3 = p(p-1)(p-2) < 0, \quad d_4 = p(p-1)(p-2)(p-3) > 0. \tag{7.3} \]



So we solve the system

\[ \begin{cases} a_0 + a_1 + a_2 + a_3 + a_4 = d_0 = 1 + \beta, \\ 2a_1 + 4a_2 + 6a_3 + 8a_4 = d_1 = p, \\ 2a_1 + 12a_2 + 30a_3 + 56a_4 = d_2 = p(p-1), \\ 24a_2 + 120a_3 + 336a_4 = d_3 = p(p-1)(p-2), \\ 24a_2 + 360a_3 + 1680a_4 = d_4 = p(p-1)(p-2)(p-3). \end{cases} \tag{7.4} \]

After some computations one sees that

\[ a_1 = \frac{p}{6 \cdot 16} (192 - 104p + 18p^2 - p^3) > \frac{p}{2}, \qquad a_2 = \frac{p(p-2)}{4 \cdot 16} (p-6)(p-8), \qquad a_3 = \frac{-p(p-2)}{6 \cdot 16} (p-4)(p-8), \qquad a_4 = \frac{p(p-2)}{24 \cdot 16} (p-4)(p-6). \tag{7.5} \]

Next, we shall see that this $\varphi$ is convex in $r \in [0,1]$. From (7.4) and (7.5) one has

\[ \varphi^{(2)}(1) = p(p-1) > 0, \qquad \varphi^{(2)}(0) = 2a_1 > p, \tag{7.6} \]

so it will suffice to show that

\[ \varphi^{(3)}(r) = 24r(a_2 + 5a_3 r^2 + 14a_4 r^4) = 24r\, \frac{p(p-2)}{12 \cdot 16} \Big( 3(p-6)(p-8) - 10(p-4)(p-8) r^2 + 7(p-4)(p-6) r^4 \Big) \tag{7.7} \]

does not vanish in $(0,1)$, i.e. that $\varphi^{(2)}$ has no critical points there. After some computations one finds that the discriminant $D$ of the quadratic equation (in $r^2$) in (7.7) is

\[ D = (p-4)(p-8) \big( 10^2 (p-4)(p-8) - 84(p-6)^2 \big) = 16(p-1)(p-4)(p-8)(p-11) < 0, \tag{7.8} \]
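The algebra behind (7.8) can be written out as follows. This is a sketch in which $q(s)$, with $s = r^2$, denotes the quadratic in (7.7); by (7.5) its leading coefficient is $14a_4 \cdot \frac{12\cdot 16}{p(p-2)} = 7(p-4)(p-6)$.

```latex
\begin{align*}
q(s) &= 3(p-6)(p-8) - 10(p-4)(p-8)\,s + 7(p-4)(p-6)\,s^2, \\
D &= 100\,(p-4)^2(p-8)^2 - 4\cdot 3\cdot 7\,(p-4)(p-6)^2(p-8) \\
  &= (p-4)(p-8)\big[\,100\,(p^2-12p+32) - 84\,(p^2-12p+36)\,\big] \\
  &= (p-4)(p-8)\cdot 16\,(p^2-12p+11)
   = 16\,(p-1)(p-4)(p-8)(p-11).
\end{align*}
% For 1 < p <= 4/3: p-1 > 0 while p-4, p-8, p-11 < 0,
% so D is a product of one positive and three negative factors: D < 0.
```

A negative discriminant means $q$ has no real roots, so $\varphi^{(3)}$ keeps a fixed sign on $(0,1)$ and $\varphi^{(2)}$ is monotone between its positive endpoint values in (7.6).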

because $p \in (1, 4/3]$. Since $\varphi^{(3)}$ does not vanish in $(0,1)$, $\varphi^{(2)}$ is monotone on $[0,1]$, and (7.6) tells us that $\varphi$ is strictly convex in $[0,1]$. Taking $\beta$ in (7.4) as $\beta = a_1 + a_2 + a_3 + a_4 - 1$, it follows that $\varphi(0) = a_0 = 0$. Finally, if $\phi(r) = r^p$, then $\varphi(0) = \varphi'(0) = \phi(0) = \phi'(0) = 0$ and $\phi^{(2)}(r) = p(p-1) r^{p-2} \ge p(p-1)$ for $r \in (0,1)$. Thus, there exists $M_0 > 0$ such that

\[ M_0\, p(p-1) \ge \sup_{0 \le r \le 1} |\varphi^{(2)}(r)|. \]

Finally, taking $M = \max\{M_0; \beta\}$ one gets that

\[ \varphi(r) \le M r^p, \qquad \forall\, r \ge 0, \]

which completes the proof.


Part (b). As in the proof of Theorem 2 we choose

\[ \varphi(r) = 3r - \int_1^r \frac{dt}{1 + \log t} + \beta, \]

so in this case we have

\[ d_0 = 3 + \beta, \qquad d_1 = 2, \qquad d_2 = 1, \qquad d_3 = -3, \qquad d_4 = 14. \tag{7.9} \]
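The values in (7.9) come from differentiating $\varphi$ for $r \ge 1$ and evaluating at $r = 1$, where $\log r = 0$. The computation below is a sketch with the shorthand $\ell = 1 + \log r$ (introduced only here), so that $d\ell/dr = 1/r$.

```latex
\begin{align*}
\varphi'(r) &= 3 - \ell^{-1}, &
\varphi'(1) &= 2, \\
\varphi^{(2)}(r) &= r^{-1}\ell^{-2}, &
\varphi^{(2)}(1) &= 1, \\
\varphi^{(3)}(r) &= -r^{-2}\big(\ell^{-2} + 2\ell^{-3}\big), &
\varphi^{(3)}(1) &= -3, \\
\varphi^{(4)}(r) &= r^{-3}\big(2\ell^{-2} + 6\ell^{-3} + 6\ell^{-4}\big), &
\varphi^{(4)}(1) &= 14 .
\end{align*}
% Also \varphi(1) = 3 + \beta, since the integral from 1 to 1 vanishes.
```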

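Solving the system (7.4) with these values of $(d_0, d_1, \dots, d_4)$ one gets

\[ \varphi(r) = a_0 + \frac{103}{96} r^2 + \frac{9}{64} r^4 - \frac{17}{96} r^6 + \frac{17}{24 \cdot 16} r^8, \qquad r \in [0,1]. \tag{7.10} \]

To show that $\varphi$ is convex in $[0,1]$, we consider

\[ \varphi^{(2)}(r) = \frac{1}{48} (103 + 81 r^2 - 255 r^4 + 119 r^6), \qquad r \in [0,1], \tag{7.11} \]

where the coefficient of $r^4$ is $30 a_3 = -255/48$, so that

\[ \varphi^{(2)}(0) = 103/48, \tag{7.12} \]

and recall that $\varphi^{(2)}(1) = 1$.

We look for critical points of $\varphi^{(2)}$, i.e. zeros of

\[ \varphi^{(3)}(r) = \frac{r}{8} (27 - 170 r^2 + 119 r^4), \qquad r \in (0,1). \tag{7.13} \]

There is only one, the point $r_0 \in (0,1]$ with

\[ r_0^2 = \frac{170 - \sqrt{(170)^2 - 4 \cdot 119 \cdot 27}}{2 \cdot 119} = \frac{170 - \sqrt{16048}}{238} \in (0,1). \tag{7.14} \]

Since

\[ \varphi^{(2)}(r_0) \ge 110/48, \tag{7.15} \]

and $\varphi^{(2)}$ is monotone on $[0, r_0]$ and on $[r_0, 1]$, its minimum over $[0,1]$ is attained at an endpoint or at $r_0$. Combining (7.15) and (7.12) it follows that $\varphi^{(2)} \ge 1 > 0$, so $\varphi$ is convex in $[0,1]$. Taking $\beta$ in (7.9) such that $\beta = a_1 + a_2 + a_3 + a_4 - 3$, it follows that $\varphi(0) = a_0 = 0$. Finally, an argument similar to that at the end of part (a) shows

\[ \varphi(r) \le M r, \qquad \forall\, r \ge 0, \]

which provides the desired result.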



References

1. Ablowitz, M.J., Haberman, R.: Nonlinear evolution equations in two and three dimensions. Phys. Rev. Lett. 35, 1185–1188 (1975)
2. Berestycki, H., Lions, P.-L.: Nonlinear scalar field equations. Arch. Rat. Mech. Anal. 82, 313–375 (1983)
3. Cazenave, T., Weissler, F.: The Cauchy problem for the critical nonlinear Schrödinger equation in H^s. Nonlinear Analysis TMA 14, 807–836 (1990)
4. Cruz-Sampedro, J.: Unique continuation at infinity of solutions to Schrödinger equations with complex potentials. Proc. Edinburgh Math. Soc. 42, 143–153 (1999)
5. Davey, A., Stewartson, K.: On three-dimensional packets of surface waves. Proc. Royal London Soc. A 338, 101–110 (1974)
6. Ghidaglia, J.M., Saut, J.C.: Nonexistence of travelling wave solutions to nonelliptic nonlinear Schrödinger equations. J. Nonlinear Sci. 6, 139–145 (1996)
7. Ginibre, J., Velo, G.: On a class of nonlinear Schrödinger equations. J. Funct. Anal. 32, 1–71 (1979)
8. Escauriaza, L., Kenig, C.E., Ponce, G., Vega, L.: On Uniqueness Properties of Solutions of Schrödinger Equations. Comm. PDE. 31(12), 1811–1823 (2006)
9. Escauriaza, L., Kenig, C.E., Ponce, G., Vega, L.: Convexity of Free Solutions of Schrödinger Equations with Gaussian Decay. Math. Res. Lett. 15, 957–972 (2008)
10. Escauriaza, L., Kenig, C.E., Ponce, G., Vega, L.: Hardy's uncertainty principle, convexity and Schrödinger equations. J. Eur. Math. Soc. 10, 882–907 (2008)
11. Escauriaza, L., Kenig, C.E., Ponce, G., Vega, L.: The sharp Hardy Uncertainty Principle for Schrödinger evolutions. Duke Math. J. 155, 163–187 (2010)
12. Escauriaza, L., Kenig, C.E., Ponce, G., Vega, L.: Uncertainty Principle of Morgan type and Schrödinger Evolutions. J. Lond. Math. Soc. 83, 187–207 (2011)
13. Ishimori, Y.: Multi vortex solutions of a two dimensional nonlinear wave equation. Progr. Theor. Phys. 72, 33–37 (1984)
14. Meshkov, V.Z.: On the possible rate of decay at infinity of solutions of second-order partial differential equations. Math. USSR Sbornik 72, 343–361 (1992)
15. Strauss, W.A.: Existence of solitary waves in higher dimensions. Commun. Math. Phys. 55, 149–162 (1977)
16. Strichartz, R.S.: Restriction of Fourier transforms to quadratic surface and decay of solutions of wave equations. Duke Math. J. 44, 705–714 (1977)

Communicated by P. Constantin




Constructing Self-Dual Strings

Christian Sämann 1,2

1 Department of Mathematics, Heriot-Watt University, Colin Maclaurin Building, Riccarton, Edinburgh EH14 4AS, UK. E-mail: [email protected]
2 Maxwell Institute for Mathematical Sciences, Edinburgh, UK

Received: 6 August 2010 / Accepted: 11 November 2010
Published online: 15 May 2011 – © Springer-Verlag 2011

Abstract: We present an ADHMN-like construction which generates self-dual string solutions to the effective M5-brane worldvolume theory from solutions to the Basu-Harvey equation. Our construction finds a natural interpretation in terms of gerbes, which we develop in some detail. We also comment on a possible extension to stacks of multiple M5-branes.

1. Introduction

There is a close link between certain supersymmetric D-brane configurations and classical integrability as found in the self-dual Yang-Mills equation and its dimensional reductions. For example, the Atiyah-Drinfeld-Hitchin-Manin (ADHM) construction of instantons [1] finds a full interpretation within the gauge theoretic description of D0-D4-brane bound states, see e.g. [2] for a review. Similarly, Nahm's extension [3–5] of the ADHM construction to the case of monopoles, the ADHMN construction, is reflected in the effective description of D1-branes ending on D3-branes [6,7]. In both constructions, one starts from a Dirac operator and constructs the instanton and monopole solutions from its zero modes in a straightforward manner. The Dirac operator in turn is fully determined by solutions to a matrix equation in the ADHM case and by solutions to the Nahm equation, i.e. the dimensional reduction of the self-dual Yang-Mills equation to one dimension, in the ADHMN case.

It is clearly interesting to look for such a link between integrable field theories and brane configurations also in M-theory. The obvious starting point here is a configuration of M2-branes ending on an M5-brane, which is obtained from the D1-D3-brane configuration describing monopoles via T-duality and taking the M-theory limit. For this configuration, Basu and Harvey [8] suggested a new Nahm-type equation, in which the Lie algebra structure is replaced by a 3-Lie algebra (footnote 1). This Basu-Harvey equation passed many consistency tests and, in particular, it led to the remarkably successful Bagger-Lambert-Gustavsson (BLG) model [10,11] and its close relative, the Aharony-Bergman-Jafferis-Maldacena (ABJM) model [12]. We therefore assume that the Basu-Harvey equation is indeed the appropriate substitute for the Nahm equation.

In this paper, we propose a completion of the picture: We present a Dirac operator built from solutions to the Basu-Harvey equation and demonstrate how its zero modes can be used to construct solutions to the self-dual string equation, which is the analogue of the Bogomolny monopole equation in the case of the M2-M5-brane configuration. We hope that this construction can help to understand the structures which underlie an effective description of multiple M5-branes.

Note that an earlier approach to an ADHMN construction for self-dual strings was suggested by Gustavsson [13]. There, the author worked on an auxiliary space obtained from the loop space $L\mathbb{R}^4$, described by the coordinates $\sigma^{\mu\nu}(\tau) := x^\mu(\tau)\dot{x}^\nu(\tau) - x^\nu(\tau)\dot{x}^\mu(\tau)$, where $x^\mu(\tau) \in L\mathbb{R}^4$. He then split the auxiliary space into two by grouping these six coordinates into two sets of three using the 't Hooft tensors. Eventually, he proposed to identify self-dual strings with a pair of monopoles living in each of these two new auxiliary spaces. Our construction, however, is different: We start from a gerbe over $S^3$, which we transgress, together with the self-dual string equation, to a principal U(1)-bundle over the loop space $LS^3$. The resulting self-dual string equation is different from the pair of Nahm equations used in [13].

Throughout the paper, we shall try to be mostly self-contained and to give detailed motivation for our constructions. In Sect. 2, we present some mathematical background on gerbes before we begin our discussion in Sect. 3 by reviewing the standard ADHMN construction.
Section 4 contains the discussion of our extension of this construction to self-dual strings and we conclude in Sect. 5. An Appendix reviews metric 3-Lie algebras and introduces the notion of compatible representations of their associated Lie algebras.

2. Gerbes with Connective Structure

In the following, we review the necessary definitions of a Hitchin-Chatterjee gerbe with connection; readers familiar with this notion can skip this section. We will focus on gerbes over $S^3$, as these will motivate our generalization of the ADHMN construction. Our discussion follows mainly [14] and [15], see also [16] for a standard reference on gerbes as well as [17]. We also found the webpages of nLab (footnote 2) to be helpful.

2.1. From U(1)-bundles to gerbes. Consider a principal U(1)-bundle $P$ over a manifold $M$ and let $\mathcal{U} = \{U_i\}$ be a cover of $M$. The bundle $P$ can be defined by transition functions which are Čech cocycles (footnote 3) $[g_{ij}] \in \check{H}^1(\mathcal{U}, C^\infty(S^1))$. We can now introduce a connection $\nabla$ on $P$ by specifying a Čech 1-cochain of $\mathfrak{u}(1)$-valued one-forms $A_i$ such that on overlaps $U_i \cap U_j$, we have $A_i - A_j = d\log g_{ij}$. As $dA_i - dA_j = 0$, the curvature $F := dA$ of the connection $\nabla$ is globally defined. It forms a representative of the Chern class of $P$ and we have $F \in H^2(M, \mathbb{Z})$. Conversely, given a curvature two-form $F \in H^2(M, \mathbb{Z})$, repeated application of the Poincaré Lemma yields a transition function on any overlap $U_i \cap U_j$ representing the class $[g_{ij}]$. Note also that $\check{H}^1(M, C^\infty(S^1)) \cong H^2(M, \mathbb{Z})$, which follows from the long exact cohomology sequence of the short exact sequence

$$ 0 \longrightarrow \mathbb{Z} \longrightarrow C^\infty(\mathbb{R}) \xrightarrow{\; e^{2\pi \mathrm{i} x} \;} C^\infty(S^1) \longrightarrow 1, \tag{2.1} $$

the fact that $C^\infty(\mathbb{R})$ is a fine sheaf (implying $H^i(M, C^\infty(\mathbb{R})) = 0$ for $i \geq 1$), and the standard isomorphism between de Rham and Čech cohomology groups. Gerbes over manifolds are generalizations of U(1)-bundles in that they correspond to Čech cocycles $[g_{ijk}] \in \check{H}^2(M, C^\infty(S^1))$ or equivalently to elements of the cohomology group $H^3(M, \mathbb{Z})$.

Footnote 1: Basu and Harvey actually suggested a matrix algebra closely related to the 3-Lie algebra $A_4$. The usefulness of general 3-Lie algebras in this context was first observed in [9].
Footnote 2: http://ncatlab.org/nlab/show/HomePage.
Footnote 3: The dependence on the cover is removed as usual by taking direct limits.

2.2. Local gerbes. It will be sufficient for us to work with surjective submersions obtained from open covers (footnote 4) of manifolds. That is, if $\mathcal{U} = (U_i)$ is an open cover of a manifold, we consider the associated disjoint union of patches,

$$ Y_{\mathcal{U}} := \{(i, x) \,|\, x \in U_i\}, \tag{2.2} $$

together with the surjective submersion $\pi : Y_{\mathcal{U}} \twoheadrightarrow M$ with $\pi(i, x) := x$. We will also need the ordered, $p$-fold intersections of (not necessarily distinct) patches

$$ Y_{\mathcal{U}}^{[p]} := \{(y_1, \ldots, y_p) \,|\, \pi(y_1) = \cdots = \pi(y_p)\} \subset Y_{\mathcal{U}}^{p}. \tag{2.3} $$

We denote by $\pi_i : Y_{\mathcal{U}}^{[p]} \to Y_{\mathcal{U}}^{[p-1]}$ the obvious projection by dropping the $i$-th patch. Note that on the $Y_{\mathcal{U}}^{[p]}$, we have the usual Čech-de Rham double complex

$$ 0 \to \Omega^q(M) \xrightarrow{\;\pi^*\;} \Omega^q(Y_{\mathcal{U}}) \xrightarrow{\;\delta\;} \Omega^q(Y_{\mathcal{U}}^{[2]}) \xrightarrow{\;\delta\;} \cdots, \tag{2.4} $$

where $\delta = \sum_{i=1}^{p} (-1)^{i-1} \pi_i^*$ is the standard Čech differential (in additive notation). Although we are working with the abelian sheaf $C^\infty(S^1)$, we switch to multiplicative notation in the following. A local bundle gerbe over a manifold $M$ is given [14] by a pair $(P, Y_{\mathcal{U}})$, where $Y_{\mathcal{U}} \twoheadrightarrow M$ is a surjective submersion obtained from a cover $\mathcal{U}$ and $P \to Y_{\mathcal{U}}^{[2]}$ is a U(1)-bundle, together with a compatible bundle gerbe multiplication. The latter can be seen as the analogue of the tensor product of U(1)-bundles. The bundle gerbe multiplication $\mu$ yields the following smooth isomorphism of U(1)-bundles over $Y^{[3]}$:

$$ \mu : \pi_3^*(P) \otimes \pi_1^*(P) \to \pi_2^*(P). \tag{2.5} $$

Writing an element of $Y_{\mathcal{U}}^{[2]}$ as $(i, j, x)$, where $x \in U_i \cap U_j$, we have explicitly

$$ \mu : ((i, j, x), z_1) \otimes ((j, k, x), z_2) \mapsto ((i, k, x), z_1 z_2\, g_{ijk}(x)) \tag{2.6} $$

Footnote 4: This means that we restrict ourselves to local or Hitchin-Chatterjee gerbes.

for some $g_{ijk} \in \check{C}^2(\mathcal{U}, C^\infty(S^1))$. We now demand that this multiplication is associative in the following sense: Denote by $P_{y_1,y_2}$ the fibre of $P$ over the point $(y_1, y_2)$. Then the diagram

$$ \begin{array}{ccc} P_{y_1,y_2} \otimes P_{y_2,y_3} \otimes P_{y_3,y_4} & \longrightarrow & P_{y_1,y_3} \otimes P_{y_3,y_4} \\ \downarrow & & \downarrow \\ P_{y_1,y_2} \otimes P_{y_2,y_4} & \longrightarrow & P_{y_1,y_4} \end{array} \tag{2.7} $$

commutes for any $(y_1, y_2, y_3, y_4) \in Y^{[4]}$. This is the case exactly if $\delta(g) = 1$, i.e. $g$ is a cocycle with $g_{ijk}\, g_{ikl} = g_{jkl}\, g_{ijl}$ on quadruple overlaps, as one readily verifies. The rôle the first Chern class plays for U(1)-bundles is taken over by the Dixmier-Douady class for gerbes, which corresponds to an element of $\check{H}^2(M, C^\infty(S^1)) \cong H^3(M, \mathbb{Z})$.

2.3. Connective structures on gerbes. Consider a principal U(1)-bundle with connection over a manifold $M$ with open cover $\mathcal{U} = (U_i)$. Its curvature $F$ is a globally defined 2-form, the connection corresponds to a $\mathfrak{u}(1)$-valued one-form $A_i$ on each patch $U_i$ and on overlaps $U_i \cap U_j$, we have transition functions $f_{ij}$. The relations between these are

$$ F = dA_i \ \text{on } U_i \quad \text{and} \quad A_i - A_j = d\log f_{ij} \ \text{on } U_i \cap U_j. \tag{2.8} $$

On gerbes, we shift these objects by one form degree or, equivalently, by one degree in their Čech cohomology. That is, we have a global curvature 3-form $H$, two-forms $B_i$ on the patches $U_i$, one-forms $A_{ij}$ on the intersections $U_i \cap U_j$ and functions $h_{ijk}$ on triple intersections $U_i \cap U_j \cap U_k$, all taking values in $\mathfrak{u}(1)$. Up to obvious equivalences, we can start from $H$ and construct the other objects by the Poincaré Lemma, trading form degree for Čech degree. Explicitly, the relations between these are

$$ H = dB_i \ \text{on } U_i, \quad B_i - B_j = dA_{ij} \ \text{on } U_i \cap U_j, \quad A_{ij} - A_{ik} + A_{jk} = dh_{ijk} \ \text{on } U_i \cap U_j \cap U_k. \tag{2.9} $$
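The descent from $H$ to the lower-degree data in (2.9) can be spelled out in two steps. The following is only a sketch of the standard Čech-de Rham argument; it assumes that all patches and their multiple intersections are contractible (a good cover), which is an extra assumption not stated in the text:

```latex
% Step 1: H is globally defined, so two local potentials differ by a closed 2-form,
% which is exact on a contractible overlap.
d(B_i - B_j) = H - H = 0 \ \text{on } U_i \cap U_j
\quad\Longrightarrow\quad B_i - B_j = dA_{ij} .

% Step 2: the Cech coboundary of A is closed on triple overlaps, hence exact there.
d(A_{ij} - A_{ik} + A_{jk})
  = (B_i - B_j) - (B_i - B_k) + (B_j - B_k) = 0
\quad\Longrightarrow\quad A_{ij} - A_{ik} + A_{jk} = dh_{ijk} .
```

Each application of the Poincaré Lemma lowers the form degree by one and raises the Čech degree by one, which is the trade described above.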

If $H \in H^3(M, \mathbb{Z})$, then $g_{ijk} := \exp(\mathrm{i} h_{ijk})$ is a Čech cocycle. The one-forms $A_{ij}$ are interpreted as giving rise to connections on the U(1)-bundle over $Y_{\mathcal{U}}^{[2]}$ defining the gerbe and $A$ is called a bundle gerbe connection. The two-forms $B_i$ are called a curving for $A$, and the combined data $(A, B)$ yields a connective structure on the gerbe $(P, Y_{\mathcal{U}})$.

2.4. Gerbes over $S^3$. Consider now $M = S^3$ and let the sphere be covered by two patches $\mathcal{U} = \{U_0, U_1\}$ containing the north and the south pole, respectively. The intersection $U_0 \cap U_1$ can be identified with the space $S^2 \times (-1, 1)$. As a gerbe is primarily defined by a U(1)-bundle over $Y_{\mathcal{U}}^{[2]}$, we see that gerbes over $S^3$ correspond to U(1)-bundles over $S^2$. At cohomological level, this is reflected in $H^2(S^2, \mathbb{Z}) \cong H^3(S^3, \mathbb{Z}) \cong \mathbb{Z}$. We can pull back the whole U(1)-bundle over $S^2$ including its connection $A$ and curvature $F$. As there are no triple intersections of elements of $\mathcal{U}$, there are no further constraints. Together with a partition of unity $\psi_{0,1}$ on $U_{0,1}$, $\psi_0 + \psi_1 = 1$ on $U_0 \cap U_1$, we can define two-forms $B_0 = \psi_0 F$ on $U_0$ and $B_1 = -\psi_1 F$ on $U_1$ with $B_0 - B_1 = F$ on $U_0 \cap U_1$ and $B$ is thus a curving for $A$. The corresponding curvature $H$ is globally defined and it is given by $H = d\psi_0 \wedge F$ and $H = -d\psi_1 \wedge F$ on $U_0$ and $U_1$, respectively. Using Stokes' Theorem, the integral of $H$ over $S^3$ reduces to the integral of $F$ over $S^2$, confirming $H \in H^3(S^3, \mathbb{Z})$.
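The Stokes argument at the end of Sect. 2.4 can be made explicit. The sketch below assumes a collar coordinate $t \in (-1, 1)$ on $U_0 \cap U_1 \cong S^2 \times (-1, 1)$ and a function $\psi_0$ that depends only on $t$ and interpolates between the values 0 and 1 at the two ends of the collar. Both are illustrative assumptions, and the overall sign depends on orientation conventions:

```latex
% H = d\psi_0 \wedge F is supported on the collar S^2 x (-1,1), where dF = 0.
\int_{S^3} H
  = \int_{S^2 \times (-1,1)} d\psi_0 \wedge F
  = \pm \left( \int_{-1}^{1} \frac{d\psi_0}{dt}\, dt \right) \int_{S^2} F
  = \pm \int_{S^2} F .
```

Integrality of the Chern class of the U(1)-bundle over $S^2$ therefore carries over to the class of $H$ on $S^3$.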



2.5. Transgression and regression. A transgression map is a map between cohomology classes on different spaces, which are the base spaces of a common correspondence space in a double fibration. We will be interested in a transgression $T$ from a smooth manifold $M$ to its loop space $LM := \mathrm{Map}(S^1, M)$, which actually extends from cohomology to differential forms. The transgression map $T : \Omega^k(M) \to \Omega^{k-1}(LM)$ is based on the following double fibration:

$$ M \xleftarrow{\; ev \;} LM \times S^1 \xrightarrow{\; pr \;} LM. \tag{2.10} $$

Here, $ev$ is the obvious evaluation map of the loop at the given angle and $pr$ describes the projection from $LM \times S^1$ onto $LM$. The transgression map amounts to the pull-back along $ev$ and the push-forward along $pr$, that is $T = pr_! \circ ev^*$, cf. [16]. Identify now the tangent space to the loop space of $M$, $TLM$, with the loop space of the tangent space to $M$, $LTM$. Note that there is a natural element $\dot{x}(\tau) \in LTM$, which appears in the transgression map as follows: Given $k$ vector fields $(v_1(\tau), \ldots, v_k(\tau))$, $v_i(\tau) \in TLM$, and $\omega \in \Omega^{k+1}(M)$, we have

$$ (T\omega)_x(v_1(\tau), \ldots, v_k(\tau)) := \int_{S^1} d\tau\, \omega(v_1(\tau), \ldots, v_k(\tau), \dot{x}(\tau)), \quad x \in LM. \tag{2.11} $$
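To make (2.11) concrete, it helps to write out the two lowest-degree cases in components. This is only an illustration of the definition, with the component notation $\omega_\mu$, $B_{\mu\nu}$ introduced here for convenience:

```latex
% k = 0: a one-form \omega = \omega_\mu dx^\mu on M becomes a function on LM,
% namely the line integral of \omega along the loop x.
(T\omega)(x) = \int_{S^1} d\tau\, \omega_\mu(x(\tau))\, \dot{x}^\mu(\tau)
             = \oint_{x} \omega ,

% k = 1: a two-form B on M becomes a one-form on LM, evaluated on v \in T_x LM.
(TB)_x(v) = \int_{S^1} d\tau\, B_{\mu\nu}(x(\tau))\, v^\mu(\tau)\, \dot{x}^\nu(\tau) .
```

The second case is the one that turns the two-form potential of a gerbe into a gauge potential on loop space.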

We now embed the loop space $LM$ into path space $PM$. Here, composable paths induce the notion of composable vector fields and, consequently, a notion of composability for differential forms, cf. [18]. If we restrict ourselves to functorial forms, which are the forms respecting composability, then there is an obvious inverse map, sometimes called regression:

$$ (T^{-1}\omega)_{x(0)}(v_0, v_1, \ldots, v_k) := \lim_{\tau \to 0} \frac{1}{\tau}\, \omega_{x|_{[0,\tau]}}(\tilde{v}_1, \ldots, \tilde{v}_k), \tag{2.12} $$

where $v_i \in TM$ and $x \in PM$ such that $\dot{x}(0) = v_0$ and $\tilde{v}_i \in LTM$ are extensions of the $v_i$, which are constant along the path. More details on transgressions can be found e.g. in [16] and [19]. This map will allow us to translate the result of an ADHMN-like construction on loop space $LS^3$ back to $S^3$.

3. The ADHMN Construction

In the following, we give a lightning review of the ADHMN construction [3–5] and its D-brane interpretation [6,7]. This will fix our notation and provide reference points for Sect. 4.

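3.1. Monopoles from D1-branes ending on D3-branes. Consider a stack of $k$ D1-branes ending on a stack of $N$ D3-branes in type IIB string theory where the worldvolumes of the branes extend into $\mathbb{R}^{1,9}$ in the following way:

$$ \begin{array}{c|cccccccc} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \ldots \\ \hline \mathrm{D1} & \times & & & & & & \times & \\ \mathrm{D3} & \times & \times & \times & \times & & & & \end{array} \tag{3.1} $$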


Adopting the perspective of the D3-branes, the ground state of this system is effectively described by time-independent BPS configurations of $\mathcal{N} = 4$ super Yang-Mills theory with gauge group U($N$). The bosonic fields of this theory consist of a gauge potential $A_\mu$ and six scalar fields $\Phi^I$, $I = 4, \ldots, 9$. We concentrate now on configurations in the gauge $A_0 = 0$ for which all fields but $A_i$, $i = 1, \ldots, 3$, and $\Phi := \Phi^6$ vanish. The BPS equation for these fields is given by the Bogomolny monopole equation, which is the dimensional reduction of the self-dual Yang-Mills equation $F = *F$ on $\mathbb{R}^4$ (the equation describing an instanton) to $\mathbb{R}^3$:

$$ F = *\nabla\Phi \quad \text{or} \quad F_{ij} = \varepsilon_{ijk} \nabla_k \Phi, \quad i, j, k = 1, \ldots, 3. \tag{3.2} $$

Note that $\Phi$ is a harmonic function on its domain $D \subseteq \mathbb{R}^3$ due to the Bianchi identity. Solutions to the Bogomolny equation with appropriate boundary conditions (footnote 5) carry a topological charge which is called the monopole number. From the perspective of the D1-branes, the system is described by the Nahm equation

$$ \frac{d}{ds} X^i = \frac{1}{2} \varepsilon_{ijk} [X^j, X^k], \tag{3.3} $$

where the $X^i = -\bar{X}^i$ (footnote 6) are functions on $\mathcal{I} = \mathbb{R}^{>0}$ with values in $\mathfrak{u}(k)$. This equation is the dimensional reduction of the self-duality equation with gauge group U($k$) to one dimension after choosing the gauge $\nabla_s = \frac{\partial}{\partial s}$. The transition between Eqs. (3.2) and (3.3), or rather between the spaces of solutions to these two equations, is done by a certain Fourier-Mukai transform known as the Nahm transform. The latter maps self-dual gauge potentials on a four-torus to self-dual gauge potentials on the dual torus. In our special case, we are, roughly speaking, dealing with tori with one and three radii being infinite, respectively, and the remaining radii zero, making the two tori dual to each other. In the following, we will focus on the construction of solutions to the Bogomolny monopole equation from solutions to the Nahm equation. This construction is also known as the ADHMN construction.

3.2. Constructing monopoles. We will be mostly interested in Dirac monopoles, for which the stack of D1-branes ends on a stack of D3-branes, which are all at the same position $x^6 = 0$. The D1-branes' worldvolume is thus $\mathcal{I} = \mathbb{R}^{>0}$. More commonly, one studies non-singular SU(2)-monopoles, for which the D1-branes are suspended between two D3-branes at $x^6 = -1$ and $x^6 = +1$ and thus $\mathcal{I} = (-1, +1)$. Let $W^{n,2}(\mathcal{I}) \subset L^2(\mathcal{I})$ be the Sobolev space (footnote 7) of functions on the interval $\mathcal{I}$, which are square integrable up to their $n$-th derivative. Let furthermore $X^i \in C^\infty(\mathcal{I}, \mathfrak{u}(k))$, $i = 1, \ldots, 3$, be a solution to the Nahm equation (3.3) satisfying certain boundary conditions, on which we will comment below. From these, we derive a Dirac operator $\slashed{\nabla}_s : W^{1,2}(\mathcal{I}) \otimes \mathbb{C}^2 \otimes \mathbb{C}^k \to W^{0,2}(\mathcal{I}) \otimes \mathbb{C}^2 \otimes \mathbb{C}^k$ together with its adjoint $\bar{\slashed{\nabla}}_s : W^{0,2}(\mathcal{I}) \otimes \mathbb{C}^2 \otimes \mathbb{C}^k \to (W^{1,2}(\mathcal{I}) \otimes \mathbb{C}^2 \otimes \mathbb{C}^k)^*$, which read as (footnote 8)

$$ \slashed{\nabla}_s = -\mathbb{1} \frac{d}{ds} + \sigma^i \otimes \mathrm{i} X^i \quad \text{and} \quad \bar{\slashed{\nabla}}_s := \mathbb{1} \frac{d}{ds} + \sigma^i \otimes \mathrm{i} X^i. \tag{3.4} $$

Footnote 5: amounting to the fact that the value of the Yang-Mills-Higgs action functional evaluated at the solution is finite, cf. [20].
Footnote 6: Throughout the paper, we will use the notation $\bar{X} := X^\dagger$ to avoid overdecorating symbols.
Footnote 7: One often finds the notation $H^n = W^{n,2}$, which we avoid here as we label cohomology groups by $H^n$.
Footnote 8: We identify $W^{1,2}(\mathcal{I}) \otimes \mathbb{C}^2 \otimes \mathbb{C}^k$ with its dual $(W^{1,2}(\mathcal{I}) \otimes \mathbb{C}^2 \otimes \mathbb{C}^k)^*$.


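Here, the $\sigma^i$ are the usual Pauli matrices with $\sigma^i = \bar{\sigma}^i$ and $X^i \in \mathfrak{u}(k)$ with $\bar{X}^i = -X^i$. Consider now the differential operator $\Delta_s := \bar{\slashed{\nabla}}_s \slashed{\nabla}_s$. One easily checks that the conditions $[\Delta_s, \sigma^i \otimes \mathbb{1}_k] = 0$, $i = 1, \ldots, 3$, and $\Delta_s > 0$ as well as $\bar{\Delta}_s = \Delta_s$ are equivalent to the $X^i$ forming a solution to the Nahm equation. Introducing the space $\mathbb{R}^3$ with euclidean coordinates $x^i$, we can shift the $\mathrm{i} X^i$ by $x^i \mathbb{1}_k$, while preserving the properties of $\Delta_s$:

$$ \begin{aligned} &\slashed{\nabla}_{s,x} = -\mathbb{1} \frac{d}{ds} + \sigma^i \otimes (\mathrm{i} X^i + x^i \mathbb{1}_k) \quad \text{with} \quad \bar{\slashed{\nabla}}_{s,x} := \mathbb{1} \frac{d}{ds} + \sigma^i \otimes (\mathrm{i} X^i + x^i \mathbb{1}_k), \\ &\Delta_{s,x} := \bar{\slashed{\nabla}}_{s,x} \slashed{\nabla}_{s,x} > 0 \quad \text{and} \quad [\Delta_{s,x}, \sigma^i \otimes \mathbb{1}_k] = 0. \end{aligned} \tag{3.5} $$

This shift is usually called a twist of the Dirac operator, as it reflects the twisting of the original gauge bundle by the Poincaré line bundle in the underlying Nahm transform. Consider now the orthonormalized zero modes $\psi_{s,x,\alpha} \in W^{0,2}(\mathcal{I}) \otimes \mathbb{C}^2 \otimes \mathbb{C}^k$ of the operator $\bar{\slashed{\nabla}}_{s,x}$:

$$ \bar{\slashed{\nabla}}_{s,x} \psi_{s,x,\alpha} = 0, \quad \alpha = 1, \ldots, N, \quad N = \dim_{\mathbb{C}}(\ker \bar{\slashed{\nabla}}_{s,x}). \tag{3.6} $$

We arrange them into a $k \times N$-dimensional matrix $\psi_{s,x}$, which satisfies

$$ \mathbb{1}_N = \int_{\mathcal{I}} ds\, \bar{\psi}_{s,x} \psi_{s,x}. \tag{3.7} $$

Using this matrix, we define a gauge potential and a Higgs field on a subset of $\mathbb{R}^3$ by

$$ A_\mu := \int_{\mathcal{I}} ds\, \bar{\psi}_{s,x} \frac{\partial}{\partial x^\mu} \psi_{s,x} \quad \text{and} \quad \Phi := -\mathrm{i} \int_{\mathcal{I}} ds\, \bar{\psi}_{s,x}\, s\, \psi_{s,x}. \tag{3.8} $$

Note that the Green's function $G_x(s, t)$ of $\Delta_{s,x}$ with $\Delta_{s,x} G_x(s, t) = -\delta(s - t)$ is well-defined because of the positivity of $\Delta_{s,x}$. We therefore have a projector $\mathcal{P}_x$ onto the orthogonal space of $\ker \bar{\slashed{\nabla}}_{s,x}$,

$$ \mathcal{P}_x(s, t) := \slashed{\nabla}_{s,x} G_x(s, t) \bar{\slashed{\nabla}}_{t,x} = -\delta(s - t) + \psi_{s,x} \bar{\psi}_{t,x}, \tag{3.9} $$

satisfying $\mathcal{P}_x(s, t) \psi_{t,x} = 0$ and $\mathcal{P}_x^2(s, t) = \int dr\, \mathcal{P}_x(s, r) \mathcal{P}_x(r, t) = \mathcal{P}_x(s, t)$. One can now plug the fields (3.8) into the Bogomolny equation (3.2). After using (3.9) to rewrite the result as double integrals over $s$ and $t$, one readily verifies that (3.8) forms a solution to the Bogomolny equation by direct computation, cf. e.g. [21]. An analogous calculation is presented in detail in Sect. 4.4. The boundary conditions we mentioned above have to guarantee that $\dim_{\mathbb{C}}(\ker \bar{\slashed{\nabla}}_{s,x}) = N$, the number of D3-branes. This is the case if the solution $X^i$ has a simple pole at the finite boundaries of $\mathcal{I}$ with its residue forming an irreducible representation of SU(2), see e.g. [20] for more details.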

3.3. Abelian monopoles from the ADHMN construction. Let us now come to the case of Dirac monopoles. The corresponding ADHMN construction is standard and found, e.g., in [22]. First, consider the simplest case $N = k = 1$ and $\mathcal{I} = \mathbb{R}^{>0}$, i.e. one D1-brane ending on a single D3-brane at $x^6 = s = 0$ (footnote 9). Solutions to the Nahm equation are just constants specifying the position of the monopole in the D3-brane, and we thus put $X^i = 0$. The adjoint of the Dirac operator $\bar{\slashed{\nabla}}_{s,x} = \mathbb{1} \frac{d}{ds} + x^i \sigma^i$ has the following two normalizable zero modes:

$$ \psi_+ = e^{-sR}\, \frac{\sqrt{R + x^3}}{x^1 - \mathrm{i} x^2} \begin{pmatrix} x^1 - \mathrm{i} x^2 \\ R - x^3 \end{pmatrix} \quad \text{and} \quad \psi_- = e^{-sR}\, \frac{\sqrt{R - x^3}}{x^1 + \mathrm{i} x^2} \begin{pmatrix} x^3 + R \\ x^1 + \mathrm{i} x^2 \end{pmatrix}, \tag{3.10} $$

where $R^2 = x^i x^i$. Recall that the Dirac monopole should actually be regarded as a gauge configuration on a sphere encircling the position of the monopole. The two solutions $\psi_+$ and $\psi_-$ correspond to the description of this configuration on the two standard patches of $S^2$ containing each one of the two poles at $x^3 = R$ and $x^3 = -R$. For $\psi_+$, we obtain the following fields from (3.8):

$$ \Phi = -\frac{\mathrm{i}}{2R} \quad \text{and} \quad A_i^+ = \frac{\mathrm{i}}{2\big((x^1)^2 + (x^2)^2\big)} \left( x^2 \Big(1 - \frac{x^3}{R}\Big),\ -x^1 \Big(1 - \frac{x^3}{R}\Big),\ 0 \right). \tag{3.11} $$

Footnote 9: One readily verifies that one can easily accommodate a translation $s \to s + s_0$ in the construction.
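The zero mode $\psi_+$ in (3.10) and the Higgs field in (3.11) can be sanity-checked numerically. The snippet below is a minimal sketch in plain Python and not part of the original construction: the sample point, the helper `trapezoid` and the cut-off of the $s$-integration are arbitrary choices made for illustration. It checks that the spinor part of $\psi_+$ is an eigenvector of $x^i \sigma^i$ with eigenvalue $R$ (so that $e^{-sR}$ times it is annihilated by $\frac{d}{ds} + x^i \sigma^i$), that $\psi_+$ has unit norm on $(0, \infty)$, and that $\Phi = -\mathrm{i} \int ds\, \bar{\psi}_+ s\, \psi_+ = -\mathrm{i}/(2R)$.

```python
import math

# Hypothetical sample point in R^3, chosen so that R = 1.3.
x1, x2, x3 = 0.3, -0.4, 1.2
R = math.sqrt(x1**2 + x2**2 + x3**2)

# Spinor part of psi_+ from (3.10): sqrt(R + x3)/(x1 - i x2) * (x1 - i x2, R - x3).
pref = math.sqrt(R + x3) / complex(x1, -x2)
v = [pref * complex(x1, -x2), pref * (R - x3)]

# x^i sigma^i written out as a 2x2 matrix.
x_sigma = [[x3, complex(x1, -x2)],
           [complex(x1, x2), -x3]]

# Residual of the eigenvalue equation (x.sigma) v = R v.
eig_residual = max(
    abs(sum(x_sigma[a][b] * v[b] for b in range(2)) - R * v[a])
    for a in range(2)
)

# |v|^2 should equal 2R, which normalizes psi_+ because
# the integral of e^{-2sR} over (0, infinity) is 1/(2R).
v_norm_sq = sum(abs(c) ** 2 for c in v)


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for f on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# The s-integrals are cut off at s = 30, where the integrand is negligible.
norm = trapezoid(lambda s: v_norm_sq * math.exp(-2 * s * R), 0.0, 30.0, 20000)
higgs = -1j * trapezoid(lambda s: v_norm_sq * s * math.exp(-2 * s * R), 0.0, 30.0, 20000)

print(eig_residual, norm, higgs, -1j / (2 * R))
```

The gauge potential is deliberately left out of this check, since its overall sign depends on conventions for the connection that the snippet does not attempt to fix.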

The fields (3.11) satisfy the Bogomolny equation, as one easily verifies. The zero mode $\psi_-$ yields an analogous solution which for $|x^3| \neq R$ is related to the above solution by a gauge transformation. Next consider $k = 2$, $N = 1$, i.e. a stack of two D1-branes ending on a single D3-brane. A solution $X^i$ to the Nahm equation is found from the ansatz $X^i = f(s) T^i$ with $f \in C^\infty(\mathbb{R}^{>0})$ and $T^i \in \mathfrak{su}(2)$, and we obtain

$$ X^i = -\frac{1}{s} T^i \quad \text{with} \quad T^i = \frac{\sigma^i}{2\mathrm{i}} = -\bar{T}^i. \tag{3.12} $$
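As a cross-check of (3.12), the Nahm equation (3.3) can be verified numerically. The following is an illustrative sketch in plain Python; helper names such as `nahm_residual` are made up for this example, and the $s$-derivative is approximated by a central finite difference. It confirms $[T^j, T^k] = \varepsilon_{jkl} T^l$ for $T^i = \sigma^i/(2\mathrm{i})$ and that $X^i(s) = -T^i/s$ satisfies (3.3) up to discretization error.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def comm(a, b):
    return sub(mul(a, b), mul(b, a))


def max_abs(a):
    return max(abs(a[i][j]) for i in range(2) for j in range(2))


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


# Anti-hermitian generators T^i = sigma^i / (2i) from (3.12).
T = [scale(1 / 2j, s) for s in SIGMA]


def X(i, s):
    """Nahm data X^i(s) = -T^i / s."""
    return scale(-1.0 / s, T[i])


def algebra_residual():
    """Largest entry of [T^j, T^k] - eps_{jkl} T^l over all j, k."""
    worst = 0.0
    for j in range(3):
        for k in range(3):
            rhs = [[0, 0], [0, 0]]
            for l in range(3):
                rhs = add(rhs, scale(eps(j, k, l), T[l]))
            worst = max(worst, max_abs(sub(comm(T[j], T[k]), rhs)))
    return worst


def nahm_residual(s, h=1e-6):
    """Largest entry of dX^i/ds - (1/2) eps_{ijk} [X^j, X^k] at the point s."""
    worst = 0.0
    for i in range(3):
        lhs = scale(1 / (2 * h), sub(X(i, s + h), X(i, s - h)))
        rhs = [[0, 0], [0, 0]]
        for j in range(3):
            for k in range(3):
                rhs = add(rhs, scale(0.5 * eps(i, j, k), comm(X(j, s), X(k, s))))
        worst = max(worst, max_abs(sub(lhs, rhs)))
    return worst


print(algebra_residual(), nahm_residual(0.7))
```

Both residuals should be zero up to rounding and finite-difference error, reflecting $\frac{d}{ds}(-T^i/s) = T^i/s^2 = \frac{1}{2}\varepsilon_{ijk}[T^j, T^k]/s^2$.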

Note that $X^i$ has indeed a simple pole at $s = 0$, and the Pauli matrices $\sigma^i$ belong to an irreducible representation of $\mathfrak{su}(2)$. Therefore this solution can be interpreted as describing a polarization of the points of the worldvolume of the two D1-branes into fuzzy two-spheres with radius $f(s)$ and prequantum line bundle $\mathcal{O}(1)$ (footnote 10). The dual of the Dirac operator corresponding to the solution (3.12) reads as

$$ \bar{\slashed{\nabla}}_{s,x} = \mathbb{1} \frac{d}{ds} + \sigma^i \otimes \Big( -\frac{\sigma^i}{2s} + x^i \mathbb{1}_2 \Big). \tag{3.13} $$

For simplicity, let us restrict ourselves to the point $x^3 = R$ and compute only the Higgs field $\Phi$:

$$ \psi_+ = \sqrt{s}\, e^{-Rs}\, (0, 0, 0, 1)^T, \quad \Phi = -\frac{\mathrm{i}}{R}. \tag{3.14} $$

We get the right radial behavior and, because we started from two monopoles, the Higgs field has twice the magnitude of that of a single Dirac monopole.

Footnote 10: For $k$ D1-branes, the prequantum line bundle of the fuzzy sphere is $\mathcal{O}(k - 1)$.
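The value of $\Phi$ in (3.14) follows from two elementary integrals. The short computation below is a sketch that assumes $\psi_+$ is rescaled to unit norm by a constant $c$; this normalization constant is not written out in (3.14) and is supplied here as an assumption:

```latex
% Normalization of c \sqrt{s} e^{-Rs} on (0, \infty):
c^2 \int_0^\infty ds\, s\, e^{-2Rs} = \frac{c^2}{4R^2} = 1
\quad\Longrightarrow\quad c^2 = 4R^2 ,

% Higgs field from (3.8) for k = 2:
\Phi = -\mathrm{i}\, c^2 \int_0^\infty ds\, s^2\, e^{-2Rs}
     = -\mathrm{i}\, 4R^2 \cdot \frac{2}{(2R)^3}
     = -\frac{\mathrm{i}}{R} ,

% for comparison, the single D1-brane case with \psi_+ \propto e^{-sR} and norm factor 2R:
\Phi_{k=1} = -\mathrm{i}\, 2R \int_0^\infty ds\, s\, e^{-2sR} = -\frac{\mathrm{i}}{2R} .
```

The extra factor $\sqrt{s}$ in the zero mode is what doubles the Higgs field relative to the single-brane result in (3.11).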


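4. An ADHMN Construction for Self-Dual Strings

4.1. The self-dual string or M2-branes ending on M5-branes. We now want to lift the D1-D3-brane configuration (3.1) to a configuration of a stack of M2-branes ending on M5-branes in M-theory. For this, we first have to T-dualize along one direction transverse to both D1- and D3-branes, yielding a D2-D4-brane configuration. Subsequently, we choose an M-theory direction along which we can wrap the M5-brane corresponding to the D4-brane:

$$ \begin{array}{c|ccccccc} \mathrm{M} & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline \mathrm{M2} & \times & & & & & \times & \times \\ \mathrm{M5} & \times & \times & \times & \times & \times & \times & \end{array} \;\xrightarrow{\; S^1_{\mathrm{M}} \;}\; \begin{array}{c|ccccccc} \mathrm{IIA} & 0 & 1 & 2 & 3 & (4) & 5 & 6 \\ \hline \mathrm{D2} & \times & & & & & \times & \times \\ \mathrm{D4} & \times & \times & \times & \times & & \times & \end{array} \;\xleftrightarrow{\; T_5 \;}\; \begin{array}{c|ccccccc} \mathrm{IIB} & 0 & 1 & 2 & 3 & (4) & 5 & 6 \\ \hline \mathrm{D1} & \times & & & & & & \times \\ \mathrm{D3} & \times & \times & \times & \times & & & \end{array} \tag{4.1} $$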

Note that here, we embedded the target space of string theory, $\mathbb{R}^{1,9}$, into that of M-theory, $\mathbb{R}^{1,10}$, as the hyperplane $x^4 = 0$. In the following, we will restrict our discussion to the cases for which we have reasonable worldvolume theories, i.e. to the cases of a single M2-brane ending on a single M5-brane, $N = k = 1$, and the case of two M2-branes ending on a single M5-brane, $N = 1$, $k = 2$. The equations of motion of a single M5-brane can be derived in a number of ways, e.g. by the doubly supersymmetric approach to super $p$-branes or by analyzing the dynamics of the Goldstone modes arising from the symmetry breaking of the 11d supergravity action by the presence of the M5-brane, see e.g. [23] for a review. From analyzing the Goldstone modes, we learn that the fields on the M5-brane are given by five scalars $\Phi^I$ together with a self-dual two-form potential $B$, such that $H := dB = *H$. The doubly supersymmetric approach was pursued for example in [24], where also the appropriate BPS equation corresponding to a stack of M2-branes ending on a single M5-brane as in (4.1) was given:

$$ H_{05\mu} = \frac{1}{4} \partial_\mu \Phi, \quad H_{\mu\nu\rho} = \frac{1}{4} \varepsilon_{\mu\nu\rho\sigma} \partial_\sigma \Phi, \quad \mu, \nu, \rho, \sigma = 1, \ldots, 4. \tag{4.2} $$
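The radial profile of the Higgs field differs between the D1-D3 and the M2-M5 system because harmonic functions depend on the dimension: the basic singular harmonic function is $1/R$ on $\mathbb{R}^3$ but $1/R^2$ on $\mathbb{R}^4$. The snippet below is an illustrative finite-difference check in plain Python; the function names, sample points and step size are arbitrary choices and are not taken from the text.

```python
import math


def laplacian(f, point, h=1e-3):
    """Second-order central-difference Laplacian of f at the given point."""
    total = 0.0
    for axis in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        total += (f(forward) - 2.0 * f(point) + f(backward)) / h**2
    return total


def radius(p):
    return math.sqrt(sum(c * c for c in p))


def inv_r(p):
    return 1.0 / radius(p)


def inv_r2(p):
    return 1.0 / radius(p) ** 2


p3 = [0.3, -0.4, 1.2]        # sample point in R^3
p4 = [1.0, 0.5, -0.3, 0.8]   # sample point in R^4

lap_3d = laplacian(inv_r, p3)          # 1/R is harmonic on R^3 away from 0
lap_4d = laplacian(inv_r2, p4)         # 1/R^2 is harmonic on R^4 away from 0
lap_4d_wrong = laplacian(inv_r, p4)    # 1/R is not harmonic on R^4

print(lap_3d, lap_4d, lap_4d_wrong)
```

The first two values should vanish up to discretization error, while the third stays clearly away from zero, since $\Delta r^{-1} = -r^{-3}$ in four dimensions.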

Here, $\Phi = \Phi^6$ is a function which is harmonic on its domain $D \subseteq \mathbb{R}^4$ due to the Bianchi identity. We therefore have $\Phi = \Phi_0 + \sum_p \frac{1}{|x - y_p|^2}$ with $D = \mathbb{R}^4 \setminus \{y_p\}$, where $y_p$ are the singular points of $\Phi$ corresponding to positions of M2-brane boundaries. Because of the self-duality of $H$ and the string-like shape of the naïve boundary of the M2-brane on the M5-brane, this configuration is known as the self-dual string soliton. Note that $\Phi \sim \frac{1}{R^2}$, contrary to the case of the D1-D3-brane system, where we had $\Phi \sim \frac{1}{R}$. From the perspective of the M2-branes, an effective description was not available until Basu and Harvey proposed an extension of the Nahm equation [8]. They suggested that the field content is given by four transverse scalars $X^\mu$, which take values in a 3-Lie algebra $\mathcal{A}$ (footnote 11). Basu and Harvey then postulated essentially the following extension of the Nahm equation (3.3):

$$ \frac{d}{ds} X^\mu = \frac{1}{3!} \varepsilon^{\mu\nu\rho\sigma} [X^\nu, X^\rho, X^\sigma], \quad X^\mu \in \mathcal{A}, \tag{4.3} $$

which can be justified by SO(4)-invariance and dimensional analysis: Recall that $s$ and $X^\mu$ correspond to $\Phi \sim \frac{1}{r^2}$ and $x^\mu$, respectively, in the self-dual string equation (4.2).

Footnote 11: The definition of a metric 3-Lie algebra is found in Appendix A.

4.2. The Basu-Harvey equation from a twisted Dirac operator. Let us now motivate a Dirac operator suitable for an extended ADHMN construction for the M2-M5-brane system by following the Dirac operator of the D1-D3-brane system through T-duality and the lift to M-theory. Our starting point is the Dirac operator $\slashed{\nabla}^{\mathrm{IIB}}_{s,x} = -\mathbb{1} \frac{d}{ds} + \sigma^i \otimes (\mathrm{i} X^i + x^i \mathbb{1}_k)$ in type IIB string theory. Going from a chiral theory to a non-chiral one (type IIA), it is only natural to replace the Pauli matrices with the generators $\gamma^\mu$ of the Clifford algebra $Cl(\mathbb{R}^4)$ (footnote 12). Such a Dirac operator was suggested e.g. in [25], and here we choose:

$$ \slashed{\nabla}^{\mathrm{IIA}}_{s,x} = -\gamma_5 \otimes \mathbb{1}_k \frac{d}{ds} + \gamma^4 \gamma^i \otimes (X^i - \mathrm{i} x^i), \quad \bar{\slashed{\nabla}}^{\mathrm{IIA}}_{s,x} = \gamma_5 \otimes \mathbb{1}_k \frac{d}{ds} + \gamma^4 \gamma^i \otimes (X^i - \mathrm{i} x^i), \tag{4.4} $$

where $\gamma_5 := \gamma^1 \gamma^2 \gamma^3 \gamma^4$. One readily verifies that $\Delta^{\mathrm{IIA}}_{s,x} := \bar{\slashed{\nabla}}^{\mathrm{IIA}}_{s,x} \slashed{\nabla}^{\mathrm{IIA}}_{s,x} > 0$ and $[\Delta^{\mathrm{IIA}}_{s,x}, \gamma^4 \gamma^i] = 0$ amounts again to the $X^i$ satisfying the Nahm equation (3.3). Note, however, that we have $\dim(\ker \bar{\slashed{\nabla}}^{\mathrm{IIA}}_{s,x}) = 2 \dim(\ker \bar{\slashed{\nabla}}^{\mathrm{IIB}}_{s,x})$, as the Dirac operator acts reducibly:

$$ \slashed{\nabla}^{\mathrm{IIA}}_{s,x} = \begin{pmatrix} \slashed{\nabla}^{\mathrm{IIB}}_{s,x} & 0 \\ 0 & -\slashed{\nabla}^{\mathrm{IIB}}_{s,x} \end{pmatrix}. \tag{4.5} $$

To lift (4.4) to M-theory, recall that solutions $X^\mu$ to the Basu-Harvey equation (4.3), which will play the rôle of our Nahm data, take values in a 3-Lie algebra $\mathcal{A}$. We therefore expect the M-theory Dirac operator $\slashed{\nabla}^{\mathrm{M}}_s$ to act on elements of a space $\mathbb{C}^4 \otimes E \otimes W^{1,2}(\mathbb{R}^{>0})$, where $E$ is a space related to the 3-Lie algebra $\mathcal{A}$. There are now essentially two possibilities for the choice of $E$: an analogue of the carrier space of the adjoint representation, i.e. $E = \mathcal{A}$, and an analogue of the fundamental representation, i.e. $E = \mathbb{C}^d$. Both possibilities, when followed through, yield construction mechanisms for self-dual strings. However, a future extension to the non-abelian case seems to work only with the latter possibility, and we will therefore choose $E = \mathbb{C}^d$, following closely the original ADHMN construction. The 3-Lie algebra $\mathcal{A}$ comes with an associated Lie algebra of inner derivations, which we denote by $\mathfrak{g}_{\mathcal{A}}$. Let $\rho$ be a faithful representation of $\mathfrak{g}_{\mathcal{A}}$ that is compatible with the 3-Lie algebra structure as described in Appendix A and let $\mathbb{C}^d$ be its carrier space. For example, $\mathfrak{g}_{A_4}$ admits the following compatible representation with $d = 4$:

$$ D^{(\rho)}(e_\alpha, e_\beta) \zeta = \frac{1}{2} \gamma^{\alpha\beta} \gamma_5 \zeta, \quad \zeta \in \mathbb{C}^4, \tag{4.6} $$

where the $e_\alpha$, $\alpha = 1, \ldots, 4$, form an orthonormal basis of the 3-Lie algebra $A_4$. Dimensional analysis as well as SO(4)-invariance then naturally lead us to the (untwisted) Dirac operator

$$ \slashed{\nabla}^{\mathrm{M}}_s = -\gamma_5 \frac{d}{ds} + \frac{1}{2} \gamma^{\mu\nu} D^{(\rho)}(X^\mu, X^\nu) \quad \text{with} \quad \bar{\slashed{\nabla}}^{\mathrm{M}}_s = \gamma_5 \frac{d}{ds} + \frac{1}{2} \gamma^{\mu\nu} D^{(\rho)}(X^\mu, X^\nu). \tag{4.7} $$

Footnote 12: Our conventions are $\{\gamma^\mu, \gamma^\nu\} = +2\delta^{\mu\nu}$.


Here, $\gamma^{\mu\nu} := \frac{1}{2} [\gamma^\mu, \gamma^\nu]$, $X^\mu \in \mathcal{A}$ and $D^{(\rho)}(X^\mu, X^\nu)$ is the inner derivation $D(X^\mu, X^\nu) \in \mathfrak{g}_{\mathcal{A}}$ in the representation $\rho$. The Dirac operator $\slashed{\nabla}^{\mathrm{M}}_s$ thus acts on elements of $\mathbb{C}^4 \otimes \mathbb{C}^d \otimes W^{1,2}(\mathbb{R}^{>0})$. The differential operator $\Delta^{\mathrm{M}}_s := \bar{\slashed{\nabla}}^{\mathrm{M}}_s \slashed{\nabla}^{\mathrm{M}}_s$ reads explicitly as

$$ \Delta^{\mathrm{M}}_s = -\mathbb{1} \frac{d^2}{ds^2} + \frac{1}{2} \gamma_5 \gamma^{\mu\nu} \Big( \frac{d}{ds} D^{(\rho)}(X^\mu, X^\nu) \Big) + \frac{1}{2^2} \gamma^{\mu\nu} \gamma^{\kappa\lambda} D^{(\rho)}(X^\mu, X^\nu) D^{(\rho)}(X^\kappa, X^\lambda), \tag{4.8} $$

and a straightforward calculation shows that the condition $[\Delta^{\mathrm{M}}_s, \gamma^{\mu\nu}] = 0$ for all $\mu, \nu = 1, \ldots, 4$ amounts to the Basu-Harvey equation (4.3). The key issue in extending the ADHMN construction to self-dual strings is the appropriate twist of the Dirac operator. It is clear that twisting has to amount to adding a term $\gamma^{\mu\nu} a^\mu b^\nu$ to $\slashed{\nabla}^{\mathrm{M}}_s$, where the vectors $a$ and $b$ must have the same dimension as $x$. Recall that the two-form potential $B$, which we want to construct and which describes together with the Higgs field $\Phi$ the self-dual string, actually belongs to the connective structure of a gerbe. Using a transgression map, we can switch from the gerbe to a U(1)-bundle over loop space and perform our construction there. A regression map can take us back to the gerbe picture afterwards. Similar to a Dirac monopole corresponding to a vector bundle over $S^2$ instead of $\mathbb{R}^3$, we expect here a gerbe over $S^3$ instead of $\mathbb{R}^4$. The transgression map therefore takes us to loops on $S^3$, which we describe as embedded in $\mathbb{R}^4$ by the cartesian coordinates $x^\mu$. We thus have loops $x^\mu(\tau)$ satisfying $x^\mu(\tau) x^\mu(\tau) = R^2$ and $x^\mu(\tau) \dot{x}^\mu(\tau) = 0$, where $R$ is the radius of $S^3 \subset \mathbb{R}^4$ and $\dot{x}^\mu(\tau) := \frac{dx^\mu(\tau)}{d\tau}$. We also impose the following (gauge) condition on the parameterization of our loops: $\dot{x}^\mu(\tau) \dot{x}^\mu(\tau) = R^2$. Now there is an obvious twist of the Dirac operator:

$$ \slashed{\nabla}^{\mathrm{M}}_{s,x(\tau)} = -\gamma_5 \frac{d}{ds} + \gamma^{\mu\nu} \Big( \frac{1}{2} D^{(\rho)}(X^\mu, X^\nu) - \mathrm{i}\, x^\mu(\tau) \dot{x}^\nu(\tau) \Big). \tag{4.9} $$

Note that the Nahm data $(X^\mu)$ is not extended to Nahm data on the circle parameterized by $\tau$. Moreover, note that the twist naturally reflects the fact that $x(\tau) \in LS^3$: The antisymmetrization of $x^\mu(\tau) \dot{x}^\nu(\tau)$ eliminates a possible component of $\dot{x}^\nu(\tau)$ parallel to $x^\mu(\tau)$. The condition that $\Delta^{\mathrm{M}}_{s,x(\tau)} := \bar{\slashed{\nabla}}^{\mathrm{M}}_{s,x(\tau)} \slashed{\nabla}^{\mathrm{M}}_{s,x(\tau)}$ commutes with all the $\gamma^{\mu\nu}$ is again equivalent to the $X^\mu$ satisfying the Basu-Harvey equation (4.3).

4.3. The self-dual string on loop space. Now that we have a Dirac operator connected to loop space, we need to use a transgression map to translate the self-dual string equation $H = *d\Phi$ to loop space, as well. Although there is no Hodge star operation due to the loop space having infinite dimension, one can use the transgression map and its inverse to lift the Hodge star on $\mathbb{R}^4$ to an operation on $L\mathbb{R}^4$. The self-dual string equation in (4.2), as an equation on three-forms, reads as:

$$ H = \varepsilon_{\mu\nu\rho\sigma} \frac{\partial \Phi}{\partial x^\sigma}\, dx^\mu \wedge dx^\nu \wedge dx^\rho. \tag{4.10} $$

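The transgression map $T$ now maps $H \in \Omega^3(\mathbb{R}^4)$ to $F := T(H) \in \Omega^2(L\mathbb{R}^4)$ according to

$$ F(V_1, V_2) = \int_{S^1} d\tau\, F_{\mu\nu}(x(\tau))\, V_1^\mu(x(\tau))\, V_2^\nu(x(\tau)) := \int_{S^1} d\tau\, H_{\mu\nu\rho}(x(\tau))\, V_1^\mu(x(\tau))\, V_2^\nu(x(\tau))\, \dot{x}^\rho(\tau), \tag{4.11} $$

where $V_1, V_2$ are elements of $TL\mathbb{R}^4$. Equation (4.10) translates correspondingly into

$$ \int_{S^1} d\tau\, H_{\mu\nu\rho}(x(\tau))\, V_1^\mu(x(\tau))\, V_2^\nu(x(\tau))\, \dot{x}^\rho(\tau) = \int_{S^1} d\tau\, \varepsilon_{\mu\nu\rho\sigma}\, V_1^\mu(x(\tau))\, V_2^\nu(x(\tau))\, \dot{x}^\rho(\tau) \Big( \int d\varrho\, \frac{\delta}{\delta x^\sigma(\varrho)} \Big) \Phi(x(\tau)), \tag{4.12} $$

where $\frac{\delta}{\delta x^\nu(\varrho)} x^\mu(\sigma) = \delta^\mu_\nu\, \delta(\varrho - \sigma)$. The transgression map $T$ also takes the two-form potential $B \in \Omega^2(\mathbb{R}^4)$ to a gauge potential $A \in \Omega^1(L\mathbb{R}^4)$, which acts on $V \in TL\mathbb{R}^4$ according to

$$ A(V) = \int_{S^1} d\tau\, A_\mu(x(\tau))\, V^\mu(x(\tau)) := \int_{S^1} d\tau\, B_{\mu\nu}\, V^\mu(x(\tau))\, \dot{x}^\nu(\tau). \tag{4.13} $$

Note that since $\partial S^1 = \emptyset$, transgression is in our case a chain map:

$$ F := \delta A = \delta T(B) = T(dB) = T(H), \tag{4.14} $$

where $\delta f := \int d\tau\, \frac{\delta f}{\delta x^\mu(\tau)}\, \delta x^\mu(\tau)$ is the loop space differential (footnote 13). Here, we will consider a different gauge potential $A$ such that

$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{4.15} $$

where $\partial_\mu = \int d\rho\, \frac{\delta}{\delta x^\mu(\rho)}$. This guarantees that we obtain a field strength $F$ corresponding to a transgressed 3-form $H$. Putting everything together, we arrive at the following equation on loop space:

$$ F_{\mu\nu}(x(\tau)) := \frac{\partial}{\partial x^\mu} A_\nu(x(\tau)) - \frac{\partial}{\partial x^\nu} A_\mu(x(\tau)) = \varepsilon_{\mu\nu\rho\sigma}\, \dot{x}^\rho(\tau)\, \frac{\partial}{\partial x^\sigma} \Phi(x(\tau)). \tag{4.16} $$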

We call (4.16) the loop space self-dual string equation and we will find solutions to this equation via a generalization of the ADHMN construction involving the Dirac operator we constructed above. In the following, we suppress the integral over $S^1$, as we have done in (4.16); all our equations already hold in this form.

Footnote 13: Differential calculus on loop spaces can be made precise within the context of diffeological spaces or as done by K.-T. Chen, cf. e.g. the references in [26].

4.4. Constructing self-dual strings. Our construction proceeds now in the obvious way, cf. Sect. 3.2. Consider the Dirac operator defined above, where we use again the compatible representation $\rho$ of $\mathfrak{g}_{\mathcal{A}}$ with $d$-dimensional carrier space:

$$ \slashed{\nabla}^{\mathrm{M}}_{s,x(\tau)} : \mathbb{C}^4 \otimes \mathbb{C}^d \otimes W^{1,2}(\mathcal{I}) \to \mathbb{C}^4 \otimes \mathbb{C}^d \otimes W^{0,2}(\mathcal{I}), \quad \mathcal{I} = (0, \infty). \tag{4.17} $$

We pick one of the normalizable zero modes of $\bar{\slashed{\nabla}}^{\mathrm{M}}_{s,x(\tau)}$, denote it by $\psi_{s,x(\tau)}$,

$$ 1 = \int_{\mathcal{I}} ds\, \bar{\psi}_{s,x(\tau)} \psi_{s,x(\tau)}, \tag{4.18} $$

and introduce the following fields on loop space:

$$ A_\mu(x(\tau)) = \int ds\, \bar{\psi}_{s,x(\tau)} \frac{\partial}{\partial x^\mu} \psi_{s,x(\tau)} \quad \text{and} \quad \Phi(x(\tau)) = -\mathrm{i} \int ds\, \bar{\psi}_{s,x(\tau)}\, s\, \psi_{s,x(\tau)}. \tag{4.19} $$

We will now show that these fields solve indeed the loop space self-dual string equation (4.16) if the $X^\mu$ which determine the Dirac operator $\slashed{\nabla}^{\mathrm{M}}_{s,x(\tau)}$ satisfy the Basu-Harvey equation. Recall that in this situation, the differential operator $\Delta^{\mathrm{M}}_{s,x(\tau)} := \bar{\slashed{\nabla}}^{\mathrm{M}}_{s,x(\tau)} \slashed{\nabla}^{\mathrm{M}}_{s,x(\tau)}$ is a linear combination of $\mathbb{1}$ and $\gamma_5$ (i.e. it commutes with any $\gamma^{\mu\nu}$). In order to have a Green's function $G^{\mathrm{M}}_{x(\tau)}(s, t)$ with $\Delta^{\mathrm{M}}_{s,x(\tau)} G^{\mathrm{M}}_{x(\tau)}(s, t) = -\delta(s - t)$, we also require this differential operator to be invertible. To verify this, consider an arbitrary $\psi \in \mathbb{C}^4 \otimes \mathbb{C}^d \otimes W^{1,2}(\mathcal{I})$ with $(\psi, \psi) > 0$, where $(\cdot, \cdot)$ is the obvious combination of hermitian scalar product on the complex vector spaces and the $L^2$-norm over $W^{1,2}(\mathcal{I})$. Taking into account that the Nahm data satisfies the Basu-Harvey equation, it remains to show that

$$ \Big( \psi, \Big( -\frac{d^2}{ds^2} + \frac{1}{2} \{\gamma^{\mu\nu}, \gamma^{\rho\sigma}\} \otimes T^{\mu\nu} T^{\rho\sigma} \Big) \psi \Big) > 0, \tag{4.20} $$

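where $T^{\mu\nu} := \frac{1}{2} D^{(\rho)}(X^\mu, X^\nu) - \mathrm{i}\, x^{[\mu}(\tau) \dot{x}^{\nu]}(\tau)$. Note that $\bar{T}^{\mu\nu} = -T^{\mu\nu}$ and that there is no $\psi$ such that $T^{\mu\nu} \psi = 0$ for all $\mu, \nu$ (footnote 14).

Footnote 14: The representation $\rho$ is faithful and $x(\tau) \in LS^3$.

We can simplify (4.20) further to

$$ 2 \big( T^{\mu\nu} \psi, T^{\mu\nu} \psi \big) + 2 \Big( \frac{1}{2} \varepsilon^{\mu\nu\rho\sigma} T^{\rho\sigma} \psi, \frac{1}{2} \varepsilon^{\mu\nu\kappa\lambda} T^{\kappa\lambda} \psi \Big) - 4 \Big( T^{\mu\nu} \psi, \gamma_5 \frac{1}{2} \varepsilon^{\mu\nu\rho\sigma} T^{\rho\sigma} \psi \Big) > 0. \tag{4.21} $$

Next, we decompose $\psi$ into eigenvectors $\psi_\pm$ of $\gamma_5 \otimes \mathbb{1}_d$ with eigenvalues $\pm 1$. These eigenspaces are obviously orthogonal. Because of $[\gamma_5 \otimes \mathbb{1}_d, T^{\mu\nu}] = 0$, $T^{\mu\nu} \psi_\pm$ belongs to the same eigenspace as $\psi_\pm$. Relation (4.21) now reduces to

$$ \Big( T^{\mu\nu} \psi_+ - \frac{1}{2} \varepsilon^{\mu\nu\rho\sigma} T^{\rho\sigma} \psi_+,\ T^{\mu\nu} \psi_+ - \frac{1}{2} \varepsilon^{\mu\nu\kappa\lambda} T^{\kappa\lambda} \psi_+ \Big) + \Big( T^{\mu\nu} \psi_- + \frac{1}{2} \varepsilon^{\mu\nu\rho\sigma} T^{\rho\sigma} \psi_-,\ T^{\mu\nu} \psi_- + \frac{1}{2} \varepsilon^{\mu\nu\kappa\lambda} T^{\kappa\lambda} \psi_- \Big) > 0, \tag{4.22} $$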

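which holds true if $(\psi, \psi) > 0$, because the $T^{\mu\nu}$ do not have a common kernel. Therefore $\Delta^{\mathrm{M}}_{s,x(\tau)}$ is invertible and has a Green's function, which leads again to a projector, cf. (3.9):

$$ \mathcal{P}^{\mathrm{M}}_{x(\tau)}(s, t) := \slashed{\nabla}^{\mathrm{M}}_{s,x(\tau)} G^{\mathrm{M}}_{x(\tau)}(s, t) \bar{\slashed{\nabla}}^{\mathrm{M}}_{t,x(\tau)} = -\delta(s - t) + \psi_{s,x(\tau)} \bar{\psi}_{t,x(\tau)}, \tag{4.23} $$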

where we switched to the notation $(\psi, A\psi) = \bar\psi A\psi$ for convenience. For the same reason, let us also temporarily drop the explicit $x(\tau)$-dependence and write $x$ for $x(\tau)$ as well as $\partial_\mu$ for $\int d\tau\, \frac{\delta}{\delta x^\mu(\tau)}$. We have

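\[
\begin{aligned}
F_{\mu\nu} &= \int ds\, (\partial_{[\mu}\bar\psi_s)\,\partial_{\nu]}\psi_s \\
&= \int ds \int dt\, (\partial_{[\mu}\bar\psi_s)\big(\psi_s\bar\psi_t - \slashed{\nabla}^M_s G^M(s,t)\bar{\slashed{\nabla}}^M_t\big)\partial_{\nu]}\psi_t \\
&= \int ds \int dt\, \bar\psi_s\big(\gamma^{\mu\kappa}\dot x^\kappa\, G^M(s,t)\,\gamma^{\nu\lambda}\dot x^\lambda - \gamma^{\nu\kappa}\dot x^\kappa\, G^M(s,t)\,\gamma^{\mu\lambda}\dot x^\lambda\big)\psi_t.
\end{aligned} \tag{4.24}
\]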

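Recall that the Green's function commutes with the $\gamma^{\mu\nu}$ and $\gamma_5$. Together with the identity
\[ [\gamma^{\mu\kappa},\gamma^{\nu\lambda}]\,\dot x^\kappa\dot x^\lambda = -2\varepsilon_{\mu\nu\rho\sigma}\,\gamma^{\sigma\kappa}\gamma_5\,\dot x^\rho\dot x^\kappa, \tag{4.25} \]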

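we thus arrive at
\[
\begin{aligned}
F_{\mu\nu} &= -\varepsilon_{\mu\nu\rho\sigma}\int ds \int dt\, \bar\psi_s\, 2\gamma^{\sigma\kappa}\gamma_5\, G^M(s,t)\,\dot x^\rho\dot x^\kappa\,\psi_t \\
&= -i\varepsilon_{\mu\nu\rho\sigma}\dot x^\rho \int ds \int dt\, \Big( (\partial_\sigma\bar\psi_s)\big(\psi_s\bar\psi_t - \slashed{\nabla}^M_s G^M(s,t)\bar{\slashed{\nabla}}^M_t\big)\, t\,\psi_t + \bar\psi_s\, s\,\big(\psi_s\bar\psi_t - \slashed{\nabla}^M_s G^M(s,t)\bar{\slashed{\nabla}}^M_t\big)\partial_\sigma\psi_t \Big) \\
&= -i\varepsilon_{\mu\nu\rho\sigma}\dot x^\rho \int ds\, \big( (\partial_\sigma\bar\psi_s)\, s\,\psi_s + \bar\psi_s\, s\,\partial_\sigma\psi_s \big) \\
&= \varepsilon_{\mu\nu\rho\sigma}\dot x^\rho\,\partial_\sigma\Phi,
\end{aligned} \tag{4.26}
\]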

thus verifying that the fields (4.19) indeed solve the loop space self-dual string equation (4.16).

4.5. Explicit solutions. Let us now come to explicit solutions. We restrict ourselves again to the cases $N = k = 1$ and $N = 1$, $k = 2$, as we did for monopoles. For $k = 1$, the Nahm data reduces to constants and we therefore put $X^\mu = 0$. The equation $\bar{\slashed{\nabla}}^M_{s,x(\tau)}\psi_{s,x(\tau)} = 0$ and the normalization condition (4.18) are enough to fix $\psi_{s,x(\tau)}$ completely. One finds altogether eight zero modes. Half of them are normalizable on $(0,\infty)$, the other half is normalizable on $(-\infty,0)$. The remaining four split into two pairs, each giving the description on one of the two standard patches of $S^3$. Recalling that there was a doubling of the zero modes due to switching from Pauli matrices to generators of the Clifford algebra $Cl(\mathbb{R}^4)$, we restrict ourselves to the zero mode $\psi_{s,x(\tau)}$ with $\gamma_5\psi_{s,x(\tau)} = \psi_{s,x(\tau)}$. Up to normalization, we have:$^{15}$
\[
\psi_{s,x(\tau)} \sim e^{-R^2 s}
\begin{pmatrix}
i\big(R^2 + x^2\dot x^1 - x^1\dot x^2 - x^4\dot x^3 + x^3\dot x^4\big) \\
x^3(\dot x^1 + i\dot x^2) + x^4(\dot x^2 - i\dot x^1) - (x^1 + i x^2)(\dot x^3 - i\dot x^4) \\
0 \\
0
\end{pmatrix}, \tag{4.27}
\]
and formulas (4.19) yield for the normalized zero mode:
\[
\Phi(x) = \frac{i}{2R^2}, \qquad
A(x) = \frac{1}{n(x)}
\begin{pmatrix}
-x^3(\dot x^2\dot x^3 + \dot x^1\dot x^4) + x^4(\dot x^1\dot x^3 - \dot x^2\dot x^4) + x^2\big((\dot x^3)^2 + (\dot x^4)^2\big) \\
x^4(\dot x^2\dot x^3 + \dot x^1\dot x^4) + x^3(\dot x^1\dot x^3 - \dot x^2\dot x^4) - x^1\big((\dot x^3)^2 + (\dot x^4)^2\big) \\
-R^2 x^4 + \dot x^3\big(x^1\dot x^2 - x^2\dot x^1 + x^4\dot x^3 - x^3\dot x^4\big) \\
R^2 x^3 + \dot x^4\big(x^1\dot x^2 - x^2\dot x^1 + x^4\dot x^3 - x^3\dot x^4\big)
\end{pmatrix}, \tag{4.28}
\]

where $n(x) = -2iR^2\big(R^2 - x^2\dot x^1 + x^1\dot x^2 + x^4\dot x^3 - x^3\dot x^4\big)$. These fields indeed solve the self-dual string equation on loop space (4.16). Let us now switch to spherical coordinates $(\theta^1,\theta^2,\phi)$ describing $S^3 \subset \mathbb{R}^4$ according to
\[ x^1 = R\sin\theta^1\sin\theta^2\cos\phi, \quad x^2 = R\sin\theta^1\sin\theta^2\sin\phi, \quad x^3 = R\sin\theta^2\cos\theta^1, \quad x^4 = R\cos\theta^2. \]
In these coordinates, the field strength $F$ of the above gauge potential $A$ reads as
\[ F = \frac{2i\sin\theta^1\sin^2\theta^2\,\big(\dot\theta^2\, d\phi\wedge d\theta^1 - \dot\theta^1\, d\phi\wedge d\theta^2 + \dot\phi\, d\theta^1\wedge d\theta^2\big)}{\dot\phi^2 + 2(\dot\theta^1)^2 + 4(\dot\theta^2)^2 - \big(\dot\phi^2 + 2(\dot\theta^1)^2\big)\cos(2\theta^2) - 2\dot\phi^2\cos(2\theta^1)\sin^2\theta^2}, \tag{4.29} \]

and the induced metric on $S^3$ is given by
\[ ds^2 = R^2\sin^2\theta^2\, d\theta^1\otimes d\theta^1 + R^2\, d\theta^2\otimes d\theta^2 + R^2\sin^2\theta^1\sin^2\theta^2\, d\phi\otimes d\phi. \tag{4.30} \]
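The metric (4.30) can be cross-checked numerically by pulling back the Euclidean metric of $\mathbb{R}^4$ along the embedding in spherical coordinates given above. The following Python sketch is only an illustration: the function names, the sample point and the finite-difference step are choices made here and are not part of the construction.

```python
import math


def embed(theta1, theta2, phi, R=1.0):
    """Embedding of S^3 into R^4 in the spherical coordinates of the text."""
    return (
        R * math.sin(theta1) * math.sin(theta2) * math.cos(phi),
        R * math.sin(theta1) * math.sin(theta2) * math.sin(phi),
        R * math.sin(theta2) * math.cos(theta1),
        R * math.cos(theta2),
    )


def induced_metric(theta1, theta2, phi, R=1.0, h=1e-6):
    """Pull back the flat metric of R^4, using central differences for the tangents."""
    point = [theta1, theta2, phi]
    tangents = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        xp, xm = embed(*plus, R=R), embed(*minus, R=R)
        tangents.append([(a - b) / (2 * h) for a, b in zip(xp, xm)])
    return [
        [sum(u * v for u, v in zip(tangents[i], tangents[j])) for j in range(3)]
        for i in range(3)
    ]


def expected_metric(theta1, theta2, phi, R=1.0):
    """Components of (4.30) in the coordinate order (theta1, theta2, phi)."""
    s1, s2 = math.sin(theta1), math.sin(theta2)
    return [
        [R * R * s2 * s2, 0.0, 0.0],
        [0.0, R * R, 0.0],
        [0.0, 0.0, R * R * s1 * s1 * s2 * s2],
    ]
```

Comparing `induced_metric(0.7, 1.1, 0.3, R=2.0)` with `expected_metric(0.7, 1.1, 0.3, R=2.0)` entry by entry should give agreement up to the finite-difference error.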

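We can now compute the regression $H = \mathcal{T}^{-1}F$ of (4.29) back to $S^3$ using formula (2.12):
\[
\begin{aligned}
H &= F|_{\dot\theta^1=1,\,\dot\theta^2=0,\,\dot\phi=0}\wedge \sin\theta^2\, d\theta^1 - F|_{\dot\theta^1=0,\,\dot\theta^2=1,\,\dot\phi=0}\wedge d\theta^2 + F|_{\dot\theta^1=0,\,\dot\theta^2=0,\,\dot\phi=1}\wedge \sin\theta^1\sin\theta^2\, d\phi \\
&= 6i\sin\theta^1\sin^2\theta^2\, d\theta^1\wedge d\theta^2\wedge d\phi,
\end{aligned} \tag{4.31}
\]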

and one recovers$^{16}$ the 3-form field strength of the self-dual string on $S^3 \subset \mathbb{R}^4$.

$^{15}$ For brevity, we write $x^\mu$ for $x^\mu(\tau)$ and $\dot x^\mu$ for $\dot x^\mu(\tau)$.
$^{16}$ To understand the above regression, it might help to perform the inverse operation and to compute the transgression $F = \mathcal{T}(H)$. Recall the normalization condition $|\dot x|^2 = R^2$ we imposed on the parameterization of the loops.



For the case $k = 2$, we start from the following solution to the Basu-Harvey equation:
\[ X^\mu = \frac{e_\mu}{\sqrt{2s}}, \tag{4.32} \]
where the $e_\mu$ are the orthonormalized generators of $A_4$. This solution can be interpreted as each point of the worldvolume of the M2-branes polarizing into a fuzzy 3-sphere with radius $r \sim \frac{1}{\sqrt{2s}}$. Plugging this solution into the Dirac operator, we can again calculate the corresponding zero modes. As the computation is rather cumbersome in practice, we restrict ourselves to the value of the Higgs field at $x^1(\tau) = \dot x^2(\tau) = R$, which we then extend by SO(4)-invariance. The corresponding zero mode reads as
\[ \psi_+(x) = 4R^2\sqrt{s}\, e^{-2R^2 s}\,(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)^T, \tag{4.33} \]
and formulas (4.19) yield a Higgs field with the right behavior,
\[ \Phi(x) = \frac{i}{R^2}. \tag{4.34} \]

4.6. Reduction to the Nahm equation. To anchor our construction deeper within the context of the BLG model and the ADHMN construction, let us describe how it reduces to the corresponding construction for the D2-D4-brane system. For this, recall the reduction from the BLG model with 3-Lie algebra $A_4$ to $N = 8$ super Yang-Mills theory in three dimensions with gauge group SU(2) as developed in [27]. The key here was to give the scalar corresponding to the M-theory direction $x$ a vacuum expectation value:
\[ X = \frac{r}{\ell_p^{3/2}}\, e_4 = g_{YM}\, e_4, \tag{4.35} \]
where $r$ is the radius of the circle on which $x$ is compactified, $\ell_p$ is the Planck length, $g_{YM}$ the Yang-Mills coupling and $e_4$ a chosen generator of $A_4$. After a duality transform of the gauge potential, the terms to leading order in $g_{YM}$ yield the action of three-dimensional maximally supersymmetric Yang-Mills theory.

In our case, we start by demanding that $X = X^4 = -R e_4$, in close analogy with (4.35). We then pick a loop $x(\tau)$ and a base point $x(\tau_0)$ such that $\dot x^4(\tau_0) = -R$. From our normalization of $\dot x(\tau)$, it follows that $\dot x^i(\tau_0) = 0$ for $i = 1,\dots,3$ and from the orthogonality $\dot x^\mu(\tau)x^\mu(\tau) = 0$, we conclude that $x^4(\tau_0) = 0$. That is, at $\tau_0$ the loop agrees with a circle around the M-theory direction. We consider now the Dirac operator $\slashed{\nabla}^M_{s,x(\tau)}$ at this loop with base point, thereby going from $LS^3$ back to the correspondence space $LS^3 \times S^1$ in the double fibration (2.10) with $M = S^3$. To reduce it further to $S^3$, we have to evaluate it at the base point of the loop:
\[
\begin{aligned}
\slashed{\nabla}^M_{s,x(\tau)} &= -\gamma_5\frac{d}{ds} + \gamma^{\mu\nu}\Big(\tfrac{1}{2}D^{(\rho)}(X^\mu,X^\nu) - i x^\mu(\tau)\dot x^\nu(\tau)\Big) \\
&\to -\gamma_5\frac{d}{ds} + \gamma^{\mu\nu}\Big(\tfrac{1}{2}D^{(\rho)}(X^\mu,X^\nu) - i x^\mu(\tau_0)\dot x^\nu(\tau_0)\Big) \\
&= -\gamma_5\frac{d}{ds} + R\gamma^{4i}\Big(X^{i\alpha}D^{(\rho)}(e_\alpha,e_4) - i x^i(\tau_0)\Big).
\end{aligned} \tag{4.36}
\]



Note that $D(e_4,e_4) = 0$ and the inner derivations $D(e_\alpha,e_4)$ in the representation $\rho$ of $\mathfrak{g}_{A_4}$ with carrier space $\mathbb{C}^4$ given in Appendix A form a (reducible) representation of su(2):
\[ [D^{(\rho)}(e_\alpha,e_4), D^{(\rho)}(e_\beta,e_4)] = \varepsilon_{\alpha\beta\gamma}D^{(\rho)}(e_\gamma,e_4), \qquad \alpha,\beta,\gamma = 1,\dots,3. \tag{4.37} \]
Also recall that $s^{-1/2} \sim R$ for the self-dual string but $s^{-1} \sim R$ for the monopole. We thus recognize the Dirac operator $\slashed{\nabla}^{IIA}_{s,x}$ in (4.36). On the level of equations of motion, the Basu-Harvey equation reduces to the Nahm equation according to
\[ \frac{d}{ds}X^\mu = \frac{1}{3!}\varepsilon_{\mu\nu\rho\sigma}[X^\nu,X^\rho,X^\sigma] \;\to\; \frac{d}{ds}X^i = \frac{1}{2}\varepsilon_{ijk}R\,[X^j,X^k]. \tag{4.38} \]
Eventually, note that $A_\mu(x(\tau)) = B_{\mu\nu}\dot x^\nu \to R A_i(\tau_0)$ and therefore the loop space self-dual string equation reduces as follows:
\[ F_{\mu\nu}(x(\tau)) = \varepsilon_{\mu\nu\rho\sigma}\dot x^\rho(\tau)\frac{\partial}{\partial x^\sigma}\Phi(x(\tau)) \;\to\; F_{ij}(x(\tau_0)) = \varepsilon_{ijk}\frac{\partial}{\partial x^k}R\,\Phi(x(\tau_0)). \tag{4.39} \]

As the Higgs-fields for monopoles $\Phi_m$ and self-dual strings $\Phi_{sds}$ are related via $\Phi_m = R\,\Phi_{sds}$, we recover the Bogomolny monopole equation.

4.7. Comment on non-abelian self-dual strings. One can easily extend the loop space version of the self-dual string equation to the situation in which gauge and Higgs fields take values in the adjoint representation of u(N):
\[ F_{\mu\nu}(x(\tau)) := [\nabla_\mu,\nabla_\nu] = \varepsilon_{\mu\nu\rho\sigma}\dot x^\rho(\tau)\nabla_\sigma\Phi(x(\tau)) \quad\text{with}\quad \nabla_\mu := \frac{\partial}{\partial x^\mu} + A_\mu(x(\tau)). \tag{4.40} \]
One readily verifies that the fields (4.19), when constructed from appropriate matrices of zero modes $\psi_{s,x(\tau)}$, satisfy (4.40). Also the reduction procedure given in the previous section works in the non-abelian case. We therefore postulate that the non-abelian version of the loop space self-dual string equation (4.40) is the appropriate description of a stack of M2-branes ending on N M5-branes. It might be possible to translate this equation back to $S^3$ and thus to $\mathbb{R}^4 \supset S^3$ via a regression and one might thus approach an effective field theory for stacks of M5-branes. All this should be put within the context of non-abelian gerbes and further consistency checks from a physical perspective are in order, too. This, however, is beyond our scope here and we leave it to an upcoming paper.

5. Discussion and Future Directions

We developed an ADHMN-like construction of solutions to the world-volume theory of M5-branes corresponding to self-dual strings. This construction contains all the ingredients of the usual ADHMN construction: One derives a gauge potential and a Higgs field from the normalizable zero modes of a Dirac operator which is constructed from solutions to certain BPS equations.



The two-form B-field of the self-dual string belongs to a connective structure of a gerbe over S 3 ⊂ R4 , which can be mapped via a transgression to a connection on a principal fibre bundle over the loop space LS 3 . We could therefore perform our construction on LS 3 by translating the self-dual string equation to this loop space. The resulting equation is an ordinary gauge field equation and can therefore be rendered non-abelian. We suggested that this is in fact a suitable BPS equation for an effective description of stacks of multiple M5-branes. We demonstrated how our construction is linked to the ordinary ADHMN construction by a reduction process and T-duality. We also constructed explicit self-dual string solutions on LS 3 using our algorithm and verified that the inverse of the transgression map takes them back to self-dual strings on S 3 ⊂ R4 . It should be stressed that we therefore agree with the conclusion of [11]: The ability of giving an extension of the ADHMN construction for self-dual strings (together with the lack of sufficiently many finite-dimensional 3-Lie algebras to describe stacks of arbitrarily many M2-branes) suggests that one should seek descriptions involving loop spaces, or even better, gerbes. There is now a wealth of directions for future study. First of all, the non-abelian extension of our ADHMN construction should be interpreted in the context of non-abelian gerbes and one should try to make sense out of the non-abelian equation on LS 3 in terms of fields on S 3 . Many details remain to be worked out in this case, as e.g. the exact boundary conditions and the reduction of the number of zero modes of the Dirac operator. Second, a full generalization of the Nahm transform, possibly working on special loop spaces of T 5 , should be developed. Simpler aims are to extend our construction to the ABJM case17 , the case of the N = 2 models constructed in [30] as well as to the case of the higher Nahm-type equations discussed in [31]. 
Studying noncommutative deformations in this context would also be interesting. In the long term, it should be possible to mimic the link of the ADHMN methods via monads to the twistor formalism (see e.g. [20]) also for our construction. This would lead to twistor spaces for the description of self-dual strings. The fact that our construction fits nicely into both the mathematical and physical contexts is exciting. It is therefore reasonable to hope that following the lines of research proposed above will help to shed new light on the effective description of M5-branes.

Acknowledgements. I would like to thank Werner Nahm for encouraging comments on ideas leading to this work. I also thank Sam Palmer and Martin Wolf for helpful comments. I am particularly grateful to Sergey Cherkis for many helpful and enjoyable discussions on M2-branes in the past and for detailed comments on a first draft of this paper. This work was supported by a Career Acceleration Fellowship from the UK Engineering and Physical Sciences Research Council.

Appendix A. 3-Lie algebras and representations.

A metric 3-Lie algebra is an inner product space $(A, (\cdot,\cdot))$ endowed with a trilinear, totally antisymmetric bracket $[\cdot,\cdot,\cdot] : A^{\wedge 3} \to A$, which satisfies the fundamental identity [32]
\[ [a,b,[c,d,e]] = [[a,b,c],d,e] + [c,[a,b,d],e] + [c,d,[a,b,e]], \tag{A.1} \]
$^{17}$ The corresponding Basu-Harvey equation was given in [28,29].



and which is compatible with the inner product
\[ ([a,b,c],d) + (c,[a,b,d]) = 0 \tag{A.2} \]
for all $a,b,c,d,e \in A$. The inner derivations Der(A) are the linear extensions of the maps $D(a,b) := [a,b,\cdot\,]$ with $a,b \in A$. Due to the fundamental identity, they form a Lie algebra, the associated Lie algebra $\mathfrak{g}_A$,
\[
\begin{aligned}
[D(a,b),D(c,d)]\,e &= [a,b,[c,d,e]] - [c,d,[a,b,e]] \\
&= [[a,b,c],d,e] + [c,[a,b,d],e] \\
&= \big(D([a,b,c],d) + D(c,[a,b,d])\big)\,e.
\end{aligned} \tag{A.3}
\]

The most prominent example of a 3-Lie algebra is $A_4$, which corresponds to $\mathbb{R}^4$ with standard basis $e_\alpha$, $\alpha = 1,\dots,4$, together with the 3-bracket
\[ [e_\alpha,e_\beta,e_\gamma] = \varepsilon_{\alpha\beta\gamma\delta}\,e_\delta. \tag{A.4} \]
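That the bracket (A.4) satisfies the fundamental identity (A.1) and the compatibility condition (A.2) with the Euclidean scalar product can be confirmed by brute force on basis vectors, since both conditions are multilinear. The Python sketch below is an illustration only; indices run over 0..3 instead of 1..4, and all function names are chosen here.

```python
from itertools import permutations, product


def parity(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inv = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inv % 2 else 1


# Nonzero components of the Levi-Civita symbol on four indices.
EPS = {p: parity(p) for p in permutations(range(4))}

# Standard basis e_0, ..., e_3 of R^4.
E = [tuple(int(i == k) for i in range(4)) for k in range(4)]


def bracket(a, b, c):
    """3-bracket of A_4: [a, b, c]_d = eps_{ijkd} a_i b_j c_k."""
    out = [0, 0, 0, 0]
    for (i, j, k, d), sign in EPS.items():
        out[d] += sign * a[i] * b[j] * c[k]
    return tuple(out)


def add(*vectors):
    """Componentwise sum of vectors in R^4."""
    return tuple(sum(components) for components in zip(*vectors))


def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))


def fundamental_identity_holds():
    """Check (A.1) on all 5-tuples of basis vectors."""
    for a, b, c, d, e in product(E, repeat=5):
        lhs = bracket(a, b, bracket(c, d, e))
        rhs = add(
            bracket(bracket(a, b, c), d, e),
            bracket(c, bracket(a, b, d), e),
            bracket(c, d, bracket(a, b, e)),
        )
        if lhs != rhs:
            return False
    return True


def metric_compatible():
    """Check (A.2) on all 4-tuples of basis vectors."""
    return all(
        dot(bracket(a, b, c), d) + dot(c, bracket(a, b, d)) == 0
        for a, b, c, d in product(E, repeat=4)
    )
```

Because all arithmetic is over the integers, both checks are exact rather than approximate.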

The euclidean scalar product, $(e_\alpha,e_\beta) = \delta_{\alpha\beta}$, is compatible with this 3-bracket and thus $A_4$ is a metric 3-Lie algebra. Its associated Lie algebra is $\mathfrak{g}_{A_4} = su(2) \times su(2)$. For an exhaustive review on n-ary Lie algebras, see [33]. In our discussion, we need a representation $\rho$ of the associated Lie algebra compatible with the 3-Lie algebra structure in the sense that
\[ D^{(\rho)}(a,b)\cdot D^{(\rho)}(c,d) - D^{(\rho)}(c,d)\cdot D^{(\rho)}(a,b) = D^{(\rho)}([a,b,c],d) + D^{(\rho)}(c,[a,b,d]), \tag{A.5} \]
where $D^{(\rho)}(a,b)$ denotes the inner derivation $D(a,b)$ in the representation $\rho$. As an example of such a representation, consider the spinor representation $\rho$ of $\mathfrak{g}_{A_4}$ on $\mathbb{C}^4$,
\[ D^{(\rho)}(e_\alpha,e_\beta)\,\zeta := \tfrac{1}{2}\gamma^{\alpha\beta}\gamma_5\,\zeta, \tag{A.6} \]
where $\zeta \in \mathbb{C}^4$ and $\gamma^{\alpha\beta} = \tfrac{1}{2}[\gamma^\alpha,\gamma^\beta]$ with $\gamma^\alpha$ being the generators of the Clifford algebra $Cl(\mathbb{R}^4)$ satisfying $\{\gamma^\alpha,\gamma^\beta\} = +2\delta^{\alpha\beta}$. Using this representation, we also find the 3-bracket of $A_4$ as described in [11]:
\[ [e_\alpha,e_\beta,e_\gamma] = [D^{(\rho)}(e_\alpha,e_\beta),\gamma_\gamma] = \varepsilon_{\alpha\beta\gamma\delta}\,e_\delta. \tag{A.7} \]

References
1. Atiyah, M.F., Hitchin, N.J., Drinfeld, V.G., Manin, Y.I.: Construction of instantons. Phys. Lett. A 65, 185 (1978)
2. Tong, D.: TASI lectures on solitons. http://arxiv.org/abs/hep-th/0509216v5, 2005
3. Nahm, W.: A simple formalism for the BPS monopole. Phys. Lett. B 90, 413 (1980)
4. Nahm, W.: All selfdual multi-monopoles for arbitrary gauge groups. Presented at Int. Summer Inst. on Theoretical Physics, Freiburg, West Germany, Aug 31–Sep 11, 1981
5. Nahm, W.: The construction of all selfdual multi-monopoles by the ADHM method. Talk at the Meeting on Monopoles in Quantum Field Theory, ICTP, Trieste, 1981
6. Diaconescu, D.-E.: D-branes, monopoles and Nahm equations. Nucl. Phys. B 503, 220 (1997)
7. Tsimpis, D.: Nahm equations and boundary conditions. Phys. Lett. B 433, 287 (1998)
8. Basu, A., Harvey, J.A.: The M2-M5 brane system and a generalized Nahm's equation. Nucl. Phys. B 713, 136 (2005)
9. Bagger, J., Lambert, N.: Modeling multiple M2's. Phys. Rev. D 75, 045020 (2007)
10. Bagger, J., Lambert, N.: Gauge symmetry and supersymmetry of multiple M2-branes. Phys. Rev. D 77, 065008 (2008)
11. Gustavsson, A.: Algebraic structures on parallel M2-branes. Nucl. Phys. B 811, 66 (2009)
12. Aharony, O., Bergman, O., Jafferis, D.L., Maldacena, J.: N = 6 superconformal Chern-Simons-matter theories, M2-branes and their gravity duals. JHEP 10, 091 (2008)
13. Gustavsson, A.: Selfdual strings and loop space Nahm equations. JHEP 04, 083 (2008)
14. Murray, M.K.: An introduction to bundle gerbes. http://arxiv.org/abs/0712.1651v3 [math.DG], 2008
15. Hitchin, N.: Lectures on special lagrangian submanifolds. http://arxiv.org/abs/math/9907034v1 [math.DG], 1999
16. Brylinski, J.-L.: Loop spaces, characteristic classes and geometric quantization. Boston: Birkhäuser, 2007
17. Chatterjee, D.S.: On gerbs. PhD thesis, Trinity College, Cambridge, 1998
18. Noonan, M.: Calculus on categories. Preprint, available at http://www.math.cornell.edu/noonan/preprints/habitat.pdf
19. Gomi, K., Terashima, Y.: Higher-dimensional parallel transports. Math. Research Lett. 8, 25 (2001)
20. Hitchin, N.J.: On the construction of monopoles. Commun. Math. Phys. 89, 145 (1983)
21. Schenk, H.: On a generalized Fourier transform of instantons over flat tori. Commun. Math. Phys. 116, 177 (1988)
22. Gross, D.J., Nekrasov, N.A.: Monopoles and strings in noncommutative gauge theory. JHEP 07, 34 (2000)
23. Berman, D.S.: M-theory branes and their interactions. Phys. Rept. 456, 89 (2008)
24. Howe, P.S., Lambert, N.D., West, P.C.: The self-dual string soliton. Nucl. Phys. B 515, 203 (1998)
25. Campos, V.L., Ferretti, G., Salomonson, P.: The non-abelian self dual string on the light cone. JHEP 12, 011 (2000)
26. Stacey, A.: Comparative smootheology. http://arxiv.org/abs/0802.2225v2 [math.DG], 2010
27. Mukhi, S., Papageorgakis, C.: M2 to D2. JHEP 05, 085 (2008)
28. Terashima, S.: On M5-branes in N = 6 membrane action. JHEP 08, 080 (2008)
29. Hanaki, K., Lin, H.: M2-M5 systems in N = 6 Chern-Simons theory. JHEP 09, 067 (2008)
30. Cherkis, S., Saemann, C.: Multiple M2-branes and generalized 3-Lie algebras. Phys. Rev. D 78, 066019 (2008)
31. Lazaroiu, C.I., McNamee, D., Saemann, C., Zejak, A.: Strong homotopy Lie algebras, generalized Nahm equations and multiple M2-branes. http://arxiv.org/abs/0901.3905v1 [hep-th], 2009
32. Filippov, V.T.: n-Lie algebras. Sib. Mat. Zh. 26, 126 (1985)
33. de Azcarraga, J.A., Izquierdo, J.M.: n-ary algebras: a review with applications. J. Phys. A 43, 293001 (2010)

Communicated by A. Kapustin




Hypercontractivity on the q-Araki-Woods Algebras

Hun Hee Lee$^1$, Éric Ricard$^2$

$^1$ Department of Mathematics, Chungbuk National University, 410 Sungbong-Ro, Heungduk-Gu, Cheongju 361-763, Korea. E-mail: [email protected]
$^2$ Laboratoire de Mathématiques, Université de Franche-Comté, 16 Route de Gray, 25030 Besançon, France. E-mail: [email protected]

Received: 1 September 2010 / Accepted: 15 October 2010
Published online: 15 March 2011 – © Springer-Verlag 2011

Abstract: Extending a work of Carlen and Lieb, Biane has obtained the optimal hypercontractivity of the q-Ornstein-Uhlenbeck semigroup on the q-deformation of the free group algebra. In this note, we look for an extension of this result to the type III situation, that is for the q-Araki-Woods algebras. We show that hypercontractivity from $L^p$ to $L^2$ can occur if and only if the generator of the deformation is bounded.

1. Introduction

In [14] Nelson proved the following famous hypercontractivity result for the classical Ornstein-Uhlenbeck semigroup $P_t^1$ acting on $L^p(\mathbb{R}^n, d\gamma)$, where $d\gamma$ is the n-dimensional gaussian measure on $\mathbb{R}^n$.

Theorem 1.1 (Nelson, 1973). For $1 < p < r < \infty$ we have
\[ \|P_t^1\|_{L^p \to L^r} \le 1 \quad\text{if and only if}\quad e^{-2t} \le \frac{p-1}{r-1}. \]
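The condition in Theorem 1.1 can be read as a threshold on the time $t$: solving $e^{-2t} \le (p-1)/(r-1)$ for $t$ gives $t \ge \frac{1}{2}\log\frac{r-1}{p-1}$. The following short Python sketch only evaluates this criterion; the function names are chosen here for illustration.

```python
import math


def nelson_time(p, r):
    """Smallest t with exp(-2t) <= (p - 1)/(r - 1), for 1 < p < r < infinity."""
    if not (1 < p < r):
        raise ValueError("expected 1 < p < r")
    return 0.5 * math.log((r - 1) / (p - 1))


def is_contractive(t, p, r):
    """Evaluate the criterion of Theorem 1.1 for the time t."""
    return math.exp(-2 * t) <= (p - 1) / (r - 1)
```

For instance, for $p = 2$ and $r = 4$ the threshold is $\frac{1}{2}\log 3$, so `is_contractive(0.1, 2.0, 4.0)` is false while any time slightly above `nelson_time(2.0, 4.0)` satisfies the criterion.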

Since then, there have been several analogous results in the context of non-commutative probability. A fermionic counterpart of Nelson's result has been clarified by Carlen/Lieb in [7], and including the fermionic case Biane proved in [2] the following hypercontractivity result for the q-Ornstein-Uhlenbeck semigroup $P_t^q$ acting on $L^p(\Gamma_q, \tau_q)$, where $\Gamma_q$ is the von Neumann algebra generated by q-gaussians by Bożejko and Speicher ([5,6]) and $\tau_q$ is the vacuum state.

Research supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-0015222).
Research partially supported by ANR grant 06-BLAN-0015.



Theorem 1.2 (Biane, 1997). Let $-1 \le q < 1$. For $1 < p < r < \infty$ we have
\[ \|P_t^q\|_{L^p \to L^r} \le 1 \quad\text{if and only if}\quad e^{-2t} \le \frac{p-1}{r-1}. \]

The above result has been further extended to the case of gaussians satisfying more general commutation relations by Krolak ([13]), and the holomorphic version has been proved by Kemp ([11]). Biane's generalized result concerns only von Neumann algebras with a normal tracial state. Thus, it is natural to be interested in their non-tracial relatives, namely, q-Araki-Woods algebras $\Gamma_q(H_{\mathbb{R}}, (U_t))$ ([8] or see Sect. 2), which are a generalization of Araki-Woods factors ([1]) depending on the deformation group of orthogonal transformations $(U_t)$ on some real Hilbert space $H_{\mathbb{R}}$. As usual, we denote by $A$ the (unbounded) generator of $(U_t)$ on the complexification of $H_{\mathbb{R}}$. Fortunately, we were able to prove the following hypercontractivity result for the q-Ornstein-Uhlenbeck semigroup $P_t^{q,(U_t)}$ (simply $P_t^q$ again) acting on $L^p(\Gamma_q(H_{\mathbb{R}}, (U_t)), \tau_q)$, where $\tau_q$ is the vacuum state.

Theorem 1.3. Let $-1 \le q < 1$. For $1 < p < 2$ we have
\[ \|P_t^q\|_{L^p \to L^2} \le 1 \quad\text{if}\quad e^{-2t} \le C\,\|A\|^{1-\frac{2}{p}}\,(p-1), \]
where $C$ is a universal constant. If $A$ is unbounded, then $P_t^q$ is not bounded from $L^p$ to $L^2$.

Following the idea of Carlen/Lieb and Biane we will use the baby Fock model and Speicher's central limit procedure as key ingredients for the proof. Although we were unable to determine "the optimal time" for the contractivity, the constant in the above shows the same optimal order $p - 1$ as in the tracial case.

This paper is organized as follows. In Sect. 2 we will review q-generalized gaussians, q-Araki-Woods algebras, and the q-Ornstein-Uhlenbeck semigroup. We will also briefly review an extension procedure of a map on non-tracial von Neumann algebras to the associated non-commutative $L^p$ spaces. In Sect. 3 we will introduce the twisted baby Fock model by Nou ([15]), and we will prove the first main result, hypercontractivity of the ε-Ornstein-Uhlenbeck semigroup, which is an analog of the q-Ornstein-Uhlenbeck semigroup in the baby Fock model. In Sect. 4 we will use Speicher's central limit procedure and hypercontractivity of the ε-Ornstein-Uhlenbeck semigroup to prove hypercontractivity of the q-Ornstein-Uhlenbeck semigroup. Finally in Sect. 5 we will use a 1-dimensional estimate to get the converse direction of the main result from the previous section.

2. q-Araki-Woods Algebras and q-Ornstein-Uhlenbeck Semigroup

We begin with the most general definition of q-Araki-Woods algebra (see [1,8,18]). The construction starts with a (separable) real Hilbert space $H_{\mathbb{R}}$ and $(U_t)_{t\in\mathbb{R}}$ a group of orthogonal transformations on $H_{\mathbb{R}}$. Let $H_{\mathbb{C}} = H_{\mathbb{R}} + iH_{\mathbb{R}}$ be the complexification of $H_{\mathbb{R}}$. The group $(U_t)$ extends to a group of unitaries on $H_{\mathbb{C}}$. By Stone's theorem, there is a self-adjoint (unbounded) operator $A$ so that $U_t = A^{it}$. One then defines a new scalar product on $H_{\mathbb{C}}$ by
\[ \langle x, y\rangle_U = \Big\langle \frac{2A}{1+A}\,x, y\Big\rangle_{H_{\mathbb{C}}}. \]



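Its completion is then denoted by $H$. We consider the operator of symmetrization $P_n$ on $H^{\otimes n}$ defined by
\[ P_0\Omega = \Omega, \qquad P_n(f_1\otimes\cdots\otimes f_n) = \sum_{\pi\in S_n} q^{i(\pi)}\, f_{\pi(1)}\otimes\cdots\otimes f_{\pi(n)}, \]
where $S_n$ denotes the symmetric group of permutations of $n$ elements and $i(\pi) = \#\{(i,j) \mid 1 \le i < j \le n,\ \pi(i) > \pi(j)\}$ is the number of inversions of $\pi \in S_n$. Now we define the q-inner product $\langle\cdot,\cdot\rangle_q$ on $H^{\otimes n}$ by
\[ \langle\xi,\eta\rangle_q = \delta_{n,m}\,\langle\xi, P_n\eta\rangle \quad\text{for } \xi\in H^{\otimes n},\ \eta\in H^{\otimes m}, \]
where $\langle\cdot,\cdot\rangle$ is the inner product in $H^{\otimes n}$. We denote by $H^{\otimes_q n}$ the resulting Hilbert space. Since the $P_n$'s are strictly positive for $-1 < q < 1$ ([5]), $\langle\cdot,\cdot\rangle_q$ is actually an inner product. Then one can associate a q-Fock space $\mathcal{F}_q(H)$,
\[ \mathcal{F}_q(H) = \mathbb{C}\Omega \oplus \bigoplus_{n\ge 1} H^{\otimes_q n}, \]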

where $\Omega$ is a unit vector called vacuum. When $q = 0$ we recover the classical full Fock space over $H$. In the extreme cases $q = \pm 1$, $\mathcal{F}_1(H)$ and $\mathcal{F}_{-1}(H)$ refer to Bosonic and Fermionic Fock spaces, respectively. In the whole paper we are interested in $-1 \le q < 1$. For $h \in H_{\mathbb{R}}$, we can define a generalized q-semi-circular random variable on $\mathcal{F}_q(H)$ by
\[ s(h) = \ell_q(h) + \ell_q^*(h), \]
where $\ell_q(h)$ is the left creation operator by $h \in H$ and $\ell_q^*(h)$ is the adjoint of $\ell_q(h)$. By definition
\[ \Gamma_q(H_{\mathbb{R}}, (U_t)) = \{s(h) : h \in H_{\mathbb{R}}\}''. \]
The vacuum state $\tau_q$ defined by $\tau_q(\cdot) = \langle\,\cdot\,\Omega, \Omega\rangle_q$ is a normal faithful state on $\Gamma_q(H_{\mathbb{R}}, (U_t))$, and $(\Gamma_q(H_{\mathbb{R}}, (U_t)), \tau_q, \Omega)$ is in GNS position. $\Gamma_q(H_{\mathbb{R}}, (U_t))$ is known to be a type $II_1$ algebra if and only if $(U_t)$ is trivial; in general, it is a type III von Neumann algebra, whose modular theory relative to $\tau_q$ is well understood ([8,18]).

Now we consider the q-Ornstein-Uhlenbeck semigroup $P_t^{q,(U_t)}$ (simply $P_t^q$). Since $\Omega$ is a separating vector for $\Gamma_q(H_{\mathbb{R}}, (U_t))$ we can use the second quantization ([8, Propo. 1.1], [4, Theorem 2.11]) to get the semigroup $P_t^q : \Gamma_q \to \Gamma_q$ given by
\[ P_t^q(X)\,\Omega = \mathcal{F}(e^{-t}\,\mathrm{id}_H)\,X\Omega, \]
where $\mathcal{F}(e^{-t}\,\mathrm{id}_H) : \mathcal{F}_q(H) \to \mathcal{F}_q(H)$ is defined by $\mathcal{F}(e^{-t}\,\mathrm{id}_H)|_{H^{\otimes n}} = e^{-nt}\,\mathrm{id}_{H^{\otimes n}}$, $n \ge 0$. Note that second quantizations commute with the modular group of $\tau_q$. For later use we record the properties of the semigroup as follows.

Proposition 2.1. $P_t^q$, $t \ge 0$, is a completely positive, normal, $\tau_q$-preserving contraction that commutes with the modular group of $\tau_q$.
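To make the symmetrization operator $P_n$ from the construction of the q-Fock space more concrete, note that for a single vector $f$ every permuted tensor $f\otimes\cdots\otimes f$ coincides, so $\langle f^{\otimes n}, P_n f^{\otimes n}\rangle = \big(\sum_{\pi\in S_n} q^{i(\pi)}\big)\|f\|^{2n}$. The sum over permutations equals the q-factorial $\prod_{k=1}^n (1 + q + \cdots + q^{k-1})$, a standard combinatorial identity. The Python sketch below checks this numerically for small $n$; the function names are chosen here for illustration.

```python
from itertools import permutations


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def symmetrizer_weight(n, q):
    """Sum of q**i(pi) over all permutations pi of n elements."""
    return sum(q ** inversions(p) for p in permutations(range(n)))


def q_factorial(n, q):
    """Product over k = 1..n of (1 + q + ... + q**(k-1))."""
    result = 1
    for k in range(1, n + 1):
        result *= sum(q ** m for m in range(k))
    return result
```

For $q = 1$ the weight is $n!$ (the Bosonic case), for $q = 0$ it is $1$ (the full Fock space), and for $q = -1$ and $n \ge 2$ it vanishes, which reflects the fact that $f\otimes f$ has zero norm in the Fermionic case.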



Recall that $A$ is said to be almost periodic when it has an orthonormal basis of eigenvectors. In this situation, there is a more tractable model for $\Gamma_q(H_{\mathbb{R}}, (U_t))$ that we will use (see Sect. 2.2 in [15]). Assume that we are given a sequence $\mu = (\mu_i)_{i\ge 1} \subseteq [1,\infty)$. The construction starts with a separable complex Hilbert space $H$ equipped with an orthonormal basis $(e_{\pm k})_{k\ge 1}$. We denote as above by $\mathcal{F}_q(H)$ the associated q-Fock space. We define q-(generalized) gaussian variables (or q-generalized circular variables) by
\[ g_{q,i} = \mu_i^{-1}\,\ell_q(e_i) + \mu_i\,\ell_q^*(e_{-i}). \]
Let $\Gamma_q^\mu$ ($-1 \le q < 1$) be the von Neumann algebra generated by $\{g_{q,k}\}_{k\ge 1}$, and let $\tau_q$ still denote the vacuum state. If $A$ is almost periodic and $(\mu_i^4)_{i\ge 1} \subseteq [1,\infty)$ is the sequence of eigenvalues of $A$ that are bigger than 1, then there is a spatial isomorphism
\[ (\Gamma_q(H_{\mathbb{R}}, (U_t)), \tau_q, \Omega) \cong (\Gamma_q^\mu, \tau_q, \Omega). \]
Of course, one has $\|A\| = \sup_i \mu_i^4$. In this model, the q-Ornstein-Uhlenbeck semigroup $P_t^{q,\mu}$ (simply $P_t^q$) is also obtained by the second quantization procedure with the same formula. All the properties we will be looking at are stable under taking ultraproducts, so that the discretization procedure in Sect. 6.1 of [15] allows us to work only in the almost periodic situation.

In this paper we will often need an extension of a map defined on the algebra level to the $L^p$ setting. Let $M$, $N$ be von Neumann algebras with distinguished normal faithful states $\varphi$ and $\psi$, respectively. Then, one can define Haagerup's $L^p$ space $L^p(M,\varphi)$, $1 \le p < \infty$. For the details about $L^p(M,\varphi)$ (simply $L^p(M)$) we refer to [9,12,16]. Let $T : M \to N$ be a completely positive contraction. We say $T$ is state-preserving if $\psi\circ T = \varphi$ and it intertwines the modular groups: $\sigma_t^\psi\circ T = T\circ\sigma_t^\varphi$ for all $t\in\mathbb{R}$. Let $\mathrm{tr}_M$ be the trace on $L^1(M)$ and $D_\varphi$ be the density operator associated to $\varphi$. Similarly, we consider $\mathrm{tr}_N$ and $D_\psi$. Then $M D_\varphi^{1/p}$ is norm-dense in $L^p(M)$, and for $x\in M$ the elements $x D_\varphi^{1/p} \in L^p(M)$ ($1 \le p < \infty$) are identified in the sense of complex interpolation. Now we consider an extension of $T$ to the $L^p$ setting given by $T_p : M D_\varphi^{1/p} \to N D_\psi^{1/p}$, $x D_\varphi^{1/p} \mapsto (Tx) D_\psi^{1/p}$. It is well-known that $T_p$ can be extended to a contraction on $L^p(M)$ ([10, Lemma 1.1]). For later use we record the extension procedure as follows.

Proposition 2.2. Let $M$, $N$ be von Neumann algebras with distinguished normal faithful states $\varphi$ and $\psi$, respectively, and $T : M \to N$ be a state-preserving completely positive contraction. Then
\[ T_p : M D_\varphi^{1/p} \to N D_\psi^{1/p}, \qquad x D_\varphi^{1/p} \mapsto (Tx) D_\psi^{1/p} \tag{2.1} \]
extends to a contraction on $L^p(M)$. In particular, if $T$ is an onto isometry, then so is $T_p$ between $L^p(M)$ and $L^p(N)$.



Inspired by the above we can consider the following further extension of $T$. Let $1 \le p, r < \infty$,
\[ T_{p,r} : M D_\varphi^{1/p} \to N D_\psi^{1/r} \subseteq L^r(N), \qquad x D_\varphi^{1/p} \mapsto (Tx) D_\psi^{1/r}. \tag{2.2} \]
In general, there is no guarantee that $T_{p,r}$ can be extended to a bounded map from $L^p(M)$ into $L^r(N)$ when $r > p$.

If we apply the above extension (2.1) to the q-Ornstein-Uhlenbeck semigroup $P_t^q$, we obtain a contractive semigroup $P_t^{q,p}$ on $L^p(\Gamma_q)$,
\[ P_t^{q,p} : L^p(\Gamma_q) \to L^p(\Gamma_q), \qquad x D_q^{1/p} \mapsto P_t^q(x) D_q^{1/p}, \]
where $D_q$ is the density operator associated to the vacuum state $\tau_q$. Now the question is for which $1 < p < r < \infty$ and $t > 0$ we can extend $P_t^q$ to a contraction from $L^p(\Gamma_q)$ into $L^r(\Gamma_q)$. More precisely, when can the map
\[ P_t^{q,p,r} : \Gamma_q D_q^{1/p} \to \Gamma_q D_q^{1/r}, \qquad x D_q^{1/p} \mapsto (P_t^q x) D_q^{1/r} \tag{2.3} \]
be extended to a contraction from $L^p(\Gamma_q)$ into $L^r(\Gamma_q)$? Here comes a partial answer to this question, which is one of our main results. Note that we will simply denote $P_t^{q,p,r}$ again by $P_t^q$.

Theorem 2.3. Let 1 0, where $\alpha_\mu = \sup_{n\ge 1}\mu_n$.

We close this section with some precise results on the modular theory for $\Gamma_q$ and $\tau_q$ that we need. The modular group $\sigma_t^{\tau_q}$ with respect to $\tau_q$ satisfies the following:
\[ \sigma_t^{\tau_q}(g_{q,k}) = \mu_k^{4it}\, g_{q,k}, \qquad k \ge 1. \]
Thus, $g_{q,k}$ is an analytic element satisfying
\[ D_q^{\frac{1}{2p}}\, g_{q,k} = \mu_k^{\frac{2}{p}}\, g_{q,k}\, D_q^{\frac{1}{2p}}, \qquad k \ge 1. \tag{2.4} \]

3. Twisted Baby Fock and Hypercontractivity of ε-Ornstein-Uhlenbeck Semigroup

3.1. The baby Fock model. We will briefly describe the twisted baby Fock introduced by A. Nou ([15]). Let $I = \{\pm 1, \pm 2, \dots, \pm n\}$ be a fixed index set and $\varepsilon : I\times I \to \{\pm 1\}$ be a "choice of sign" function satisfying
\[ \varepsilon(i,j) = \varepsilon(j,i), \qquad \varepsilon(i,i) = -1, \qquad \varepsilon(i,j) = \varepsilon(|i|,|j|), \qquad \forall i,j\in I. \]
Now, we consider the unital algebra $A(I,\varepsilon)$ with generators $(x_i)_{i\in I}$ satisfying
\[ x_i x_j - \varepsilon(i,j)\,x_j x_i = 2\delta_{i,j}, \qquad i,j\in I. \]



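In particular, we have $x_i^2 = 1$, $i\in I$, where 1 refers to the unit of the algebra. $A(I,\varepsilon)$ can be endowed with the involution given by $x_i^* = x_i$. We will use the following notations for the elements in $A(I,\varepsilon)$:
\[ x_\emptyset := 1 \quad\text{and}\quad x_A := x_{i_1}\cdots x_{i_k}, \qquad A = \{i_1 < \cdots < i_k\} \subseteq I. \]
Then, $\{x_A : A\subseteq I\}$ is a basis for $A(I,\varepsilon)$. Let $\varphi^\varepsilon : A(I,\varepsilon)\to\mathbb{C}$ be the tracial state given by $\varphi^\varepsilon(x_A) = \delta_{A,\emptyset}$. $\varphi^\varepsilon$ gives rise to a natural inner product on $A(I,\varepsilon)$ as follows:
\[ \langle x, y\rangle := \varphi^\varepsilon(y^* x), \qquad x,y\in A(I,\varepsilon). \]
Let $H = L^2(A(I,\varepsilon),\varphi^\varepsilon)$ be the corresponding $L^2$-space, then clearly $\{x_A : A\subseteq I\}$ is an orthonormal basis for $H$. Now we consider left creations $\beta_i^*$ and left annihilations $\beta_i$ in $B(H)$, $i\in I$, in this context:
\[
\beta_i^*(x_A) = \begin{cases} x_i x_A & \text{if } i\notin A \\ 0 & \text{if } i\in A \end{cases},
\qquad
\beta_i(x_A) = \begin{cases} x_i x_A & \text{if } i\in A \\ 0 & \text{if } i\notin A \end{cases},
\qquad i\in I,\ A\subseteq I.
\]
Then using the same parameter $\mu = (\mu_i)_{i\ge 1}$ we define our generalized gaussians
\[ \gamma_i := \mu_i^{-1}\beta_i^* + \mu_i\beta_{-i}, \qquad 1\le i\le n. \]
The following relations are known to be satisfied by the $\gamma_i$'s ([15, Lemma 5.2]):
\[
\begin{cases}
\gamma_i\gamma_j - \varepsilon(i,j)\,\gamma_j\gamma_i = 0, & i\ne j\in I \\
\gamma_i^*\gamma_j - \varepsilon(i,j)\,\gamma_j\gamma_i^* = 0, & i\ne j\in I \\
\gamma_i^2 = (\gamma_i^*)^2 = 0, & i\in I \\
\gamma_i^*\gamma_i + \gamma_i\gamma_i^* = (\mu_i^{-2} + \mu_i^2)\,\mathrm{id}, & i\in I
\end{cases} \tag{3.1}
\]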
n in B(H ) while Let 1,...,n be the von Neumann algebra generated by {γi }i=1 n refers to the von Neumann algebra generated by γn . It is also known that ([15, Lemma 5.2]) 1 is a cyclic and separating vector for 1,...,n , and the above is the faithful GNS representation of (1,...,n , τnε ), where τnε is the vacuum state on 1,...,n given by τnε (·) = · 1, 1. With this definition, 1,...,k is not a subalgebra of 1,...,n , but by the above facts we can indeed identify it with the subalgebra generated by γ1 , . . . , γk in 1,...,n and τkε is then the restriction of τnε . We may sometimes simply write τ ε without any reference to n. Finally, we remark that with the classical identification x → x1 between 1,...,n and H, τ ε corresponds to φ ε . We collect some results about γi ’s which we need in the sequel.

Proposition 3.1. We have: ε

(1) For all n ≥ 1, and i ≤ n, σtτ (γi ) = μ4it γi and τ ε (γi∗ γi ) = μi−2 .


539

(2) Let D1,...,n and D1,...,n−1 be the densities of τ ε restricted to 1,...,n and 1,...,n−1 , respectively. Then we have a natural isometric embedding 1

1

p p L p (1,...,n−1 ) → L p (1,...,n ), x D1,...,n−1 → x D1,...,n .

A similar statement holds for L p (n ). (3) For any a ∈ n and b, c ∈ 1,...,n−1 we have τ ε (cab) = τ ε (a)τ ε (cb). (4) γn∗ γn commutes with 1,...,n−1 and D1,...,n . Proof. The computations for the first point can be found in [15, Prop. 5.3]. Thus, the natural inclusion of 1,...,n−1 to 1,...,n is state preserving and one can use Proposition 2.2. For the third point, it suffices to do it with c = 1 using the modular theory, but then it is [15, Lemma 5.4]. Finally for i < n, γn∗ γn γi = γn∗ (ε(n, i)γi γn ) = ε(n, i)2 γi γn∗ γn = γi γn∗ γn . ε

ε

ε

And σtτ (γn∗ γn ) = σtτ (γn )∗ σtτ (γn ) = γn∗ γn , so γn∗ γn is in the centralizer of τ ε which exactly means that it commutes with D1,...,n . Remark 3.2. Note that we are dealing with finite dimensional algebras, so we know that the density D1,...,n belongs to 1,...,n . Moreover all associated L p -spaces can be identified with a p-Schatten classes provided that we fix a faithful trace on 1,...,n (instead ∗ of looking at the trace on the dual space 1,...,n ). The main estimate relies on the position of 1,...,n−1 inside 1,...,n , which will be clarified in the following proposition. Now we set yi = γi∗ γi − μi−2 id, 1 ≤ i ≤ n.

(3.2)

In this paper we will consider the basis {id, γi , γi∗ , yi } of i which is more suitable than {γi γi∗ , γi , γi∗ , γi∗ γi }, 1 ≤ i ≤ n, as their corresponding L 2 vectors are orthogonal and have length 0, 1, 1 and 2, respectively. Then, for a fixed 1 ≤ i ≤ n, any element X ∈ 1,...,n can be uniquely expressed in the form X = a + γi b + γi∗ c + yi d

(3.3)

for some a, b, c, d ∈ 1,...,n\i , the von Neumann algebra generated by {γk : 1 ≤ k = i ≤ n}. Proposition 3.3. There are a ∗-isomorphism : 1,...,n → M2 ⊗ 1,...,n−1 and a unitary u n ∈ 1,...,n−1 satisfying the following.

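(1) For $a \in \Gamma_{1,\dots,n-1}$ we have $\Phi(a) = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$.

(2) $\Phi(\gamma_n) = (\mu_n^2 + \mu_n^{-2})^{1/2} \begin{pmatrix} 0 & 0 \\ u_n & 0 \end{pmatrix}$.

(3) $\tau_n^\varepsilon = (\psi \otimes \tau_{n-1}^\varepsilon) \circ \Phi$, where $\psi\begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix} = \lambda x_{11} + (1-\lambda)x_{22}$ with $\lambda = \frac{1}{1+\mu_n^4}$.

Proof. Relations (3.1) give us a $*$-isomorphism $\sigma : \Gamma_n \to M_2$ with $\sigma(\gamma_n) = (\mu_n^2 + \mu_n^{-2})^{1/2} e_{21}$. Let $C_n$ be the relative commutant of $\Gamma_n$ in $\Gamma_{1,\dots,n}$. Then the multiplication map $\psi : \Gamma_n \otimes C_n \to \Gamma_{1,\dots,n}$ is a $*$-homomorphism. Actually, $\psi$ is a $*$-isomorphism. Indeed, we can easily check that $\psi$ is an onto map. For example, we have
\[ \psi\big((\varepsilon(1,n)\gamma_n^*\gamma_n + \gamma_n\gamma_n^*) \otimes \gamma_1(\varepsilon(1,n)\gamma_n^*\gamma_n + \gamma_n\gamma_n^*)\big) = (\mu_n^2 + \mu_n^{-2})^2\,\gamma_1 . \]
The fact that $\gamma_1(\varepsilon(1,n)\gamma_n^*\gamma_n + \gamma_n\gamma_n^*) \in C_n$ is a straightforward calculation. Moreover, $\psi$ must be a 1-1 map, since $\dim C_n = 2^{2n-2}$, which can be checked by a simple induction and the following observation. Let $X = a + \gamma_1 b + \gamma_1^* c + y_1 d$, $a, b, c, d \in \Gamma_{2,\dots,n}$. Then by the uniqueness of the expression (3.3) we have
\[ X \in C_n \iff a, d \in \Gamma_n' \cap \Gamma_{2,\dots,n} \ \text{and}\ \begin{cases} \varepsilon(1,n)\gamma_n b = b\gamma_n, & \varepsilon(1,n)\gamma_n^* b = b\gamma_n^*, \\ \varepsilon(1,n)\gamma_n c = c\gamma_n, & \varepsilon(1,n)\gamma_n^* c = c\gamma_n^*. \end{cases} \]
Now we consider another $*$-isomorphism $\pi = (\sigma \otimes \mathrm{Id}) \circ \psi^{-1} : \Gamma_{1,\dots,n} \to M_2 \otimes C_n$. Since $\gamma_n^*\gamma_n$ commutes with $\Gamma_{1,\dots,n-1}$ there are $*$-homomorphisms
\[ \pi_i : \Gamma_{1,\dots,n-1} \to e_{ii} \otimes C_n, \qquad a \mapsto (e_{ii} \otimes 1)\pi(a)(e_{ii} \otimes 1), \qquad i = 1, 2. \]
Actually, the $\pi_i$'s are $*$-isomorphisms since $\dim \Gamma_{1,\dots,n-1} = 2^{2n-2} = \dim C_n$ ([15, Lemma 5.2]) and they are injective. Indeed, we have
\[ \pi^{-1}\big((e_{ii} \otimes \mathrm{id})\pi(a)(e_{ii} \otimes \mathrm{id})\big) = (\mu_n^2 + \mu_n^{-2})^{-2}\gamma_n^*\gamma_n a \gamma_n^*\gamma_n = (\mu_n^2 + \mu_n^{-2})^{-1}\gamma_n^*\gamma_n a . \]
Then, the injectivity comes from the uniqueness of the expression (3.3). Now by identifying $e_{ii} \otimes C_n$ and $C_n$ we get two $*$-isomorphisms
\[ \rho_1, \rho_2 : \Gamma_{1,\dots,n-1} \to C_n \quad \text{such that} \quad \pi(a) = \begin{pmatrix} \rho_1(a) & 0 \\ 0 & \rho_2(a) \end{pmatrix}. \]
The above $*$-isomorphisms enable us to conclude that $\Gamma_{1,\dots,k} \cong M_{2^k}$, $k \ge 1$, by a simple induction. Thus, any automorphism of $\Gamma_{1,\dots,n-1}$ is inner, so that there is a unitary $u_n \in \Gamma_{1,\dots,n-1}$ such that $\rho_1^{-1} \circ \rho_2(a) = u_n^* a u_n$, $a \in \Gamma_{1,\dots,n-1}$. Finally we define $\Phi : \Gamma_{1,\dots,n} \to M_2 \otimes \Gamma_{1,\dots,n-1}$ by
\[ \Phi(x) = \begin{pmatrix} 1 & 0 \\ 0 & u_n \end{pmatrix} (I \otimes \rho_1^{-1}) \circ \pi(x) \begin{pmatrix} 1 & 0 \\ 0 & u_n^* \end{pmatrix}. \]
Then, the first two assertions easily follow from the construction of $\Phi$ and $u_n$. The formula for $\tau_n^\varepsilon$ is a consequence of Proposition 3.1 (3) and the definition of $\Phi$. For example, we have
\[ \tau_n^\varepsilon(\gamma_n b) = 0 = (\psi \otimes \tau_{n-1}^\varepsilon)\left( (\mu_n^2 + \mu_n^{-2})^{1/2} \begin{pmatrix} 0 & 0 \\ u_n b & 0 \end{pmatrix} \right) \quad \text{for } b \in \Gamma_{1,\dots,n-1}, \]
and
\[ \tau_n^\varepsilon(y_n d) = 0 = (\psi \otimes \tau_{n-1}^\varepsilon) \begin{pmatrix} \mu_n^2 d & 0 \\ 0 & -\mu_n^{-2} d \end{pmatrix} \quad \text{for } d \in \Gamma_{1,\dots,n-1}. \]

Now we consider the number operator $N_\varepsilon$ on $H$ given by $N_\varepsilon = \sum_{i \in I} \beta_i^*\beta_i$. Then, for any $A \subseteq I$ we have $N_\varepsilon x_A = |A|\, x_A$. Since $1 \in H$ is separating and cyclic for $\Gamma_{1,\dots,n}$ we define the $\varepsilon$-Ornstein-Uhlenbeck semigroup $P_t^\varepsilon : \Gamma_{1,\dots,n} \to \Gamma_{1,\dots,n}$ by
\[ P_t^\varepsilon(X)1 = e^{-tN_\varepsilon}(X1), \qquad X \in \Gamma_{1,\dots,n}. \tag{3.4} \]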
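The matrix model of Proposition 3.3 can be sanity-checked numerically in the simplest situation. The sketch below assumes a single generator ($n = 1$, so that $\Gamma_{1,\dots,n-1}$ is the scalars and $u_1 = 1$); the helper names are hypothetical and only illustrate that $\psi$ reproduces $\tau^\varepsilon(\gamma^*\gamma) = \mu^{-2}$, $\tau^\varepsilon(\gamma\gamma^*) = \mu^{2}$, $\tau^\varepsilon(y) = 0$ and $\tau^\varepsilon(y^2) = 1$.

```python
import math


def single_generator_model(mu):
    """2x2 model of one generator, assuming n = 1 and u_1 = 1 (illustrative only)."""
    scale = math.sqrt(mu**2 + mu**-2)
    gamma = [[0.0, 0.0], [scale, 0.0]]  # gamma = (mu^2 + mu^-2)^(1/2) * e_21
    lam = 1.0 / (1.0 + mu**4)

    def psi(x):
        # psi(x) = lam * x_11 + (1 - lam) * x_22
        return lam * x[0][0] + (1.0 - lam) * x[1][1]

    return gamma, psi


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(a):
    """Adjoint of a real 2x2 matrix (plain transpose)."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def centered_square(gamma, mu):
    """y = gamma^* gamma - mu^-2 * id."""
    gs_g = matmul(adjoint(gamma), gamma)
    return [[gs_g[0][0] - mu**-2, gs_g[0][1]],
            [gs_g[1][0], gs_g[1][1] - mu**-2]]


mu = 1.7
gamma, psi = single_generator_model(mu)
y = centered_square(gamma, mu)
print("psi(gamma^* gamma) =", psi(matmul(adjoint(gamma), gamma)))
print("psi(gamma gamma^*) =", psi(matmul(gamma, adjoint(gamma))))
print("psi(y), psi(y^2)   =", psi(y), psi(matmul(y, y)))
```

In this model $y$ is the diagonal matrix with entries $\mu^2$ and $-\mu^{-2}$, which is the $2 \times 2$ block form used for $\Phi(y_n d)$ in the proof.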


To make this definition more explicit, any element in $\Gamma_{1,\dots,n}$ can be written as a linear combination of products $w_1 \cdots w_k$, where $w_i \in \{\mathrm{id}, \gamma_i, \gamma_i^*, y_i\}$, $1 \le i \le k$. The number operator counts 0 for $\mathrm{id}$, 1 for $\gamma_i$, $\gamma_i^*$ and 2 for $y_i$; for instance $P_t^\varepsilon(y_4\gamma_2^*\gamma_1) = e^{-4t} y_4\gamma_2^*\gamma_1$. This can be checked by a straightforward induction. In comparison to the $q$-Fock space setting, it is not easy to see that this defines a completely positive semigroup. There is no general second quantization in the baby Fock model. Nevertheless, such a procedure exists for some diagonal contractions. To do so, we define similarly the $i$-number operator $N_i$ on $H$ and $T_i^t$ on $\Gamma_{1,\dots,n}$ by
\[ N_i = \beta_i^*\beta_i + \beta_{-i}^*\beta_{-i}, \quad 1 \le i \le n, \qquad \text{and} \qquad T_i^t(X)1 = e^{-tN_i}(X1). \]

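It counts only the letter $i$ as explained above.

Proposition 3.4. For any $t \ge 0$, the operators $T_i^t$ ($1 \le i \le n$) are completely positive and state preserving on $\Gamma_{1,\dots,n}$, and so is $P_t^\varepsilon$.

Proof. The second assertion follows from the first one as $P_t^\varepsilon = T_1^t \cdots T_n^t$. For simplicity we only check the case $i = n$. Let $a, b, c, d \in \Gamma_{1,\dots,n-1}$; we have
\[ T_n^t(a) = a, \quad T_n^t(\gamma_n b) = e^{-t}\gamma_n b, \quad T_n^t(\gamma_n^* c) = e^{-t}\gamma_n^* c \quad \text{and} \quad T_n^t(y_n d) = e^{-2t} y_n d. \]
We use Proposition 3.3 to transfer $T_n^t$ to $\tilde T_n^t = \Phi \circ T_n^t \circ \Phi^{-1}$ on $M_2 \otimes \Gamma_{1,\dots,n-1}$. From the formula for $\Phi(\gamma_n)$ it follows that
\[ \tilde T_n^t \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix} = e^{-t} \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}, \]
and since $\Phi(a + y_n d) = \begin{pmatrix} a + \mu_n^2 d & 0 \\ 0 & a - \mu_n^{-2} d \end{pmatrix}$ for $a, d \in \Gamma_{1,\dots,n-1}$ we have
\[ \tilde T_n^t \begin{pmatrix} a + \mu_n^2 d & 0 \\ 0 & a - \mu_n^{-2} d \end{pmatrix} = \begin{pmatrix} a + \mu_n^2 e^{-2t} d & 0 \\ 0 & a - \mu_n^{-2} e^{-2t} d \end{pmatrix}. \]
Thus, we obtain that $\tilde T_n^t = T \otimes \mathrm{Id}$, where $T(e_{12}) = e^{-t}e_{12}$, $T(e_{21}) = e^{-t}e_{21}$, $T(1) = 1$ and $T(\mu_n^2 e_{11} - \mu_n^{-2} e_{22}) = e^{-2t}(\mu_n^2 e_{11} - \mu_n^{-2} e_{22})$. With $\lambda = \frac{1}{1+\mu_n^4}$ we get
\[ T(e_{11}) = \lambda(1 + e^{-2t}\mu_n^4)e_{11} + \lambda(1 - e^{-2t})e_{22}, \qquad T(e_{22}) = (1-\lambda)(1 - e^{-2t})e_{11} + (1-\lambda)(1 + e^{-2t}\mu_n^{-4})e_{22}. \]
The Choi matrix $C = (T(e_{i,j}))_{i,j}$ associated to $T$ is
\[ \begin{pmatrix} \lambda(1 + e^{-2t}\mu_n^4) & 0 & 0 & e^{-t} \\ 0 & \lambda(1 - e^{-2t}) & 0 & 0 \\ 0 & 0 & (1-\lambda)(1 - e^{-2t}) & 0 \\ e^{-t} & 0 & 0 & (1-\lambda)(1 + e^{-2t}\mu_n^{-4}) \end{pmatrix}. \]
Since $\mu_n^4 = \frac{1-\lambda}{\lambda}$ and
\[ \lambda(1-\lambda)(1 + e^{-2t}\mu_n^4)(1 + e^{-2t}\mu_n^{-4}) - e^{-2t} = \lambda(1-\lambda)(1 - e^{-2t})^2 \ge 0, \]
$C$ is positive and $T$ is completely positive. The state preserving property follows from Proposition 3.1.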
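The positivity argument at the end of the proof of Proposition 3.4 can be checked numerically. The following sketch (function names are illustrative, not from the paper) builds the $4 \times 4$ Choi matrix for given $\mu$ and $t$, compares the determinant of its corner block with $\lambda(1-\lambda)(1-e^{-2t})^2$, and tests positivity using the block structure: two isolated diagonal entries plus one $2 \times 2$ corner block.

```python
import math


def choi_matrix(mu, t):
    """Choi matrix of T for parameter mu >= 1 and time t >= 0."""
    lam = 1.0 / (1.0 + mu**4)
    s = math.exp(-2.0 * t)
    r = math.exp(-t)
    return [
        [lam * (1 + s * mu**4), 0.0, 0.0, r],
        [0.0, lam * (1 - s), 0.0, 0.0],
        [0.0, 0.0, (1 - lam) * (1 - s), 0.0],
        [r, 0.0, 0.0, (1 - lam) * (1 + s * mu**-4)],
    ]


def corner_determinant(mu, t):
    """Determinant of the 2x2 block formed by rows/columns 1 and 4."""
    c = choi_matrix(mu, t)
    return c[0][0] * c[3][3] - c[0][3] * c[3][0]


def claimed_determinant(mu, t):
    """Closed form lam * (1 - lam) * (1 - e^{-2t})^2 from the proof."""
    lam = 1.0 / (1.0 + mu**4)
    return lam * (1 - lam) * (1 - math.exp(-2.0 * t)) ** 2


def is_positive(mu, t, tol=1e-12):
    """Positive semidefiniteness via the block structure of the Choi matrix."""
    c = choi_matrix(mu, t)
    isolated = c[1][1] >= -tol and c[2][2] >= -tol
    corner = (c[0][0] >= -tol and c[3][3] >= -tol
              and corner_determinant(mu, t) >= -tol)
    return isolated and corner


for mu, t in [(1.0, 0.0), (1.5, 0.3), (3.0, 2.0)]:
    print(mu, t, corner_determinant(mu, t), claimed_determinant(mu, t),
          is_positive(mu, t))
```

At $t = 0$ the corner determinant vanishes, so the matrix is positive but singular there; the tolerance only absorbs rounding.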

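3.2. Main estimates. We start with the main statement.

Theorem 3.5. Let $1 < p < 2$. Then $\|P_t^\varepsilon\|_{L^p \to L^2} \le 1$ whenever $e^{-2t} \le C\alpha_\mu^{4-\frac{8}{p}}(p-1)$ for some constant $C > 0$, where $\alpha_\mu = \sup_{n \ge 1}\mu_n$.

Before proceeding to the proof, we collect some lemmas. The first and the most crucial one is an asymmetric version of the optimal convexity inequality ([3,7]).

Lemma 3.6. Let $1 < p \le 2$, $\mu \ge 1$ and $\lambda = \frac{1}{1+\mu^4}$. For any $A, B \in M_n$, $n \ge 1$, we have
\[ \Big( \lambda\,\|A + \mu^2 B\|_p^p + (1-\lambda)\,\|A - \mu^{-2}B\|_p^p \Big)^{2/p} \ge \|A\|_p^2 + C(p-1)\,\|B\|_p^2, \]
where $C = C(\mu) = \frac13\mu^{-4}$ for $1 < p \le \frac43$ and $C = \frac13\mu^{8-\frac{16}{p}}$ for $\frac43 < p \le 2$.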

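Proof. The above inequality is nothing but the contractivity of a fixed linear map from an $L^p$ space to an $L^p$-valued $\ell^2$ space. By a careful examination of the adjoint map we can observe that the above inequality is equivalent to the following. Let $\frac1p + \frac1q = 1$. Then for any $X, Y \in M_n$ we have
\[ \Big( \lambda\,\|X + Y\|_q^q + (1-\lambda)\,\Big\|X - \tfrac{\lambda}{1-\lambda}Y\Big\|_q^q \Big)^{2/q} \le \|X\|_q^2 + \frac{q-1}{\mu^4 C}\,\|Y\|_q^2 . \tag{3.5} \]
Since we care less about the best constant we will use the following standard argument (see remark (ii) after Theorem 5.6 in [16]). Let $C_q$ be the best constant such that (3.5) is true if we replace $\frac{q-1}{\mu^4 C}$ by $C_q$. Then we have
\[ \begin{aligned}
& \Big( \lambda\,\|X + Y\|_{2q}^{2q} + (1-\lambda)\,\Big\|X - \tfrac{\lambda}{1-\lambda}Y\Big\|_{2q}^{2q} \Big)^{1/q} \\
&\quad = \Big( \lambda\,\big\| |X|^2 + |Y|^2 + X^*Y + Y^*X \big\|_q^q + (1-\lambda)\,\Big\| |X|^2 + \big(\tfrac{\lambda}{1-\lambda}\big)^2 |Y|^2 - \tfrac{\lambda}{1-\lambda}(X^*Y + Y^*X) \Big\|_q^q \Big)^{1/q} \\
&\quad \le \Big( \lambda\,\big\| |X|^2 + |Y|^2 + X^*Y + Y^*X \big\|_q^q + (1-\lambda)\,\Big\| |X|^2 + |Y|^2 - \tfrac{\lambda}{1-\lambda}(X^*Y + Y^*X) \Big\|_q^q \Big)^{1/q} \\
&\quad \le \Big( \big\| |X|^2 + |Y|^2 \big\|_q^2 + C_q\,\big\| X^*Y + Y^*X \big\|_q^2 \Big)^{1/2} \\
&\quad \le \Big( \big[ \|X\|_{2q}^2 + \|Y\|_{2q}^2 \big]^2 + 4C_q\,\|X\|_{2q}^2\|Y\|_{2q}^2 \Big)^{1/2} \\
&\quad \le \|X\|_{2q}^2 + (2C_q + 1)\,\|Y\|_{2q}^2 .
\end{aligned} \]
The first inequality is by monotonicity of the $L^p$ norm on positive elements, as $\frac{\lambda}{1-\lambda} \le 1$.

Thus, we can conclude that $C_{2q} \le 2C_q + 1$. Since we have $C_2 = \mu^{-4} \le 1$, a standard interpolation argument leads us to $C_q \le (q-1)^{1-\theta}(2q-1)^{\theta}$, where $2^n \le q < 2^{n+1}$ and $\frac1q = \frac{1-\theta}{2^n} + \frac{\theta}{2^{n+1}}$. Thus, we simply get $C_q \le 3(q-1)$, which implies that the original inequality is true for
\[ C = \frac13\mu^{-4}. \]
When $n = 1$, i.e. $2 \le q < 4$, we can get a sharper estimate. Since $C_2 = \mu^{-4}$ and $C_4 \le 2\mu^{-4} + 1$, for $\frac1q = \frac{1-\theta}{2} + \frac{\theta}{4}$ we have
\[ C_q \le (\mu^{-4})^{1-\theta}(2\mu^{-4} + 1)^{\theta} \le 3\mu^{4-\frac{16}{q}}, \]
which implies that the original inequality is true for
\[ C = \frac13\mu^{8-\frac{16}{p}}. \]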
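As a plausibility check of Lemma 3.6, one can test the inequality in the scalar case $n = 1$, where the Schatten norms reduce to absolute values. The sketch below is only an illustration on a finite grid of real numbers (the function names are made up for this example) and of course proves nothing about general matrices.

```python
def convexity_constant(mu, p):
    """C(mu) from Lemma 3.6: mu^-4 / 3 for p <= 4/3, mu^(8 - 16/p) / 3 otherwise."""
    if p <= 4.0 / 3.0:
        return mu**-4 / 3.0
    return mu ** (8.0 - 16.0 / p) / 3.0


def lemma_gap(a, b, mu, p):
    """Left side minus right side of the inequality for scalars a, b."""
    lam = 1.0 / (1.0 + mu**4)
    lhs = (lam * abs(a + mu**2 * b) ** p
           + (1.0 - lam) * abs(a - mu**-2 * b) ** p) ** (2.0 / p)
    rhs = abs(a) ** 2 + convexity_constant(mu, p) * (p - 1.0) * abs(b) ** 2
    return lhs - rhs


def smallest_gap(values, mus, ps):
    """Minimum of lemma_gap over a finite grid of test points."""
    return min(lemma_gap(a, b, mu, p)
               for a in values for b in values for mu in mus for p in ps)


grid = [-2.0, -1.0, 0.0, 0.5, 1.0, 3.0]
print(smallest_gap(grid, [1.0, 1.5, 2.5], [1.05, 1.2, 4.0 / 3.0, 1.5, 1.8, 2.0]))
```

For $p = 2$ the cross terms cancel because $\lambda\mu^2 = (1-\lambda)\mu^{-2}$, and the left side equals $a^2 + b^2$ exactly, so the gap is $\frac23 b^2$.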

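Lemma 3.7. For any $a, d \in \Gamma_{1,\dots,n-1}$, with $\lambda = \frac{1}{\mu_n^4 + 1}$, we have
\[ \begin{aligned}
\big\| (a + y_n d) D_{1,\dots,n}^{1/p} \big\|_p^2
&= \Big( \lambda\,\big\| (a + \mu_n^2 d) D_{1,\dots,n-1}^{1/p} \big\|_p^p + (1-\lambda)\,\big\| (a - \mu_n^{-2} d) D_{1,\dots,n-1}^{1/p} \big\|_p^p \Big)^{2/p} \\
&\ge \big\| a D_{1,\dots,n-1}^{1/p} \big\|_p^2 + C(\mu_n)(p-1)\,\big\| d D_{1,\dots,n-1}^{1/p} \big\|_p^2 ,
\end{aligned} \tag{3.6} \]
where $C(\mu_n)$ is the constant in Lemma 3.6 for $\mu = \mu_n$.

Proof. It is a direct application of Proposition 3.3 and Lemma 3.6, if one notices that for any $a, d \in \Gamma_{1,\dots,n-1}$,
\[ \Phi(a + y_n d) = \begin{pmatrix} a + \mu_n^2 d & 0 \\ 0 & a - \mu_n^{-2} d \end{pmatrix}. \]

Lemma 3.8. Let $b, c \in \Gamma_{1,\dots,n-1}$. Then, we have
\[ \big\| \gamma_n b D_{1,\dots,n}^{1/p} \big\|_p \ge \lambda^{1/p}(\mu_n^2 + \mu_n^{-2})^{1/2}\,\big\| b D_{1,\dots,n-1}^{1/p} \big\|_p \]
and
\[ \big\| \gamma_n^* c D_{1,\dots,n}^{1/p} \big\|_p \ge (1-\lambda)^{1/p}(\mu_n^2 + \mu_n^{-2})^{1/2}\,\big\| c D_{1,\dots,n-1}^{1/p} \big\|_p . \]

Proof. By (3.6) we get
\[ \begin{aligned}
\big\| \gamma_n b D_{1,\dots,n}^{1/p} \big\|_p
&\ge \frac{1}{\|\gamma_n^*\|_\infty}\,\big\| \gamma_n^*\gamma_n b D_{1,\dots,n}^{1/p} \big\|_p
= \frac{1}{\|\gamma_n^*\|_\infty}\,\big\| (\mu_n^{-2} b + y_n b) D_{1,\dots,n}^{1/p} \big\|_p \\
&\ge \frac{1}{\|\gamma_n^*\|_\infty}\,\lambda^{1/p}(\mu_n^2 + \mu_n^{-2})\,\big\| b D_{1,\dots,n-1}^{1/p} \big\|_p
= \lambda^{1/p}(\mu_n^2 + \mu_n^{-2})^{1/2}\,\big\| b D_{1,\dots,n-1}^{1/p} \big\|_p .
\end{aligned} \]
Note that the equality in the last line holds by the fact $\|\gamma_n\|_\infty = \|\gamma_n^*\|_\infty = (\mu_n^2 + \mu_n^{-2})^{1/2}$, which is a direct application of Proposition 3.3. The estimate for $\big\| \gamma_n^* c D_{1,\dots,n}^{1/p} \big\|_p$ is similar.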

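Proof of Theorem 3.5. We follow the idea of Carlen/Lieb and Biane to use induction on $n$, where $I = \{\pm1, \dots, \pm n\}$. We assume that we have the conclusion for $n-1$ and consider the case $n$. Every element in $\Gamma_{1,\dots,n}$ can be uniquely expressed as
\[ X = a + \gamma_n b + \gamma_n^* c + y_n d, \]
where $a, b, c, d \in \Gamma_{1,\dots,n-1}$. Note that $\{D_n^{1/2}, \gamma_n D_n^{1/2}, \gamma_n^* D_n^{1/2}, y_n D_n^{1/2}\}$ is an orthogonal set in $L^2(\Gamma_n)$ with
\[ \big\| D_n^{1/2} \big\|_2 = \big\| y_n D_n^{1/2} \big\|_2 = 1, \qquad \big\| \gamma_n D_n^{1/2} \big\|_2 = \mu_n^{-1} \qquad \text{and} \qquad \big\| \gamma_n^* D_n^{1/2} \big\|_2 = \mu_n . \]
For example, $\big\| y_n D_n^{1/2} \big\|_2^2 = \mathrm{tr}_n(D_n^{1/2} y_n^2 D_n^{1/2}) = \tau^\varepsilon(y_n^2) = \langle y_n 1, y_n 1 \rangle = 1$. Moreover, $y_n 1 = x_{-n}x_n$ so that $P_t^\varepsilon(y_n) = e^{-2t}y_n$. Thus, by applying (3) of Proposition 3.1, we get that the four terms in $X$ are orthogonal and
\[ \begin{aligned}
\big\| P_t^\varepsilon(X) D_{1,\dots,n}^{1/2} \big\|_2^2
&= \big\| P_t^\varepsilon(a) D_{1,\dots,n-1}^{1/2} \big\|_2^2 + \mu_n^{-2}e^{-2t}\,\big\| P_t^\varepsilon(b) D_{1,\dots,n-1}^{1/2} \big\|_2^2 \\
&\quad + \mu_n^{2}e^{-2t}\,\big\| P_t^\varepsilon(c) D_{1,\dots,n-1}^{1/2} \big\|_2^2 + e^{-4t}\,\big\| P_t^\varepsilon(d) D_{1,\dots,n-1}^{1/2} \big\|_2^2 .
\end{aligned} \tag{3.7} \]
Now we estimate $\|X\|_p$. Since the map replacing $\gamma_n$ by $-\gamma_n$ is a $\tau^\varepsilon$-preserving $*$-isomorphism of $\Gamma_{1,\dots,n}$, Proposition 2.2 implies that
\[ \big\| (a + \gamma_n b + \gamma_n^* c + y_n d) D_{1,\dots,n}^{1/p} \big\|_p = \big\| (a - \gamma_n b - \gamma_n^* c + y_n d) D_{1,\dots,n}^{1/p} \big\|_p . \]
By the optimal convexity inequality ([3] or [7]) we have
\[ \big\| X D_{1,\dots,n}^{1/p} \big\|_p^2 \ge \big\| (a + y_n d) D_{1,\dots,n}^{1/p} \big\|_p^2 + (p-1)\,\big\| (\gamma_n b + \gamma_n^* c) D_{1,\dots,n}^{1/p} \big\|_p^2 = I + (p-1)\,II. \tag{3.8} \]
The estimate for $I$ is Lemma 3.7. For $II$, note that $\gamma_n b D_{1,\dots,n}^{1/p}$ and $\gamma_n^* c D_{1,\dots,n}^{1/p}$ have disjoint support. Indeed, we have
\[ \big( \gamma_n b D_{1,\dots,n}^{2/p} b^*\gamma_n^* \big)\big( \gamma_n^* c D_{1,\dots,n}^{2/p} c^*\gamma_n \big) = 0 \]
and
\[ \big( D_{1,\dots,n}^{1/p} b^*\gamma_n^*\gamma_n b D_{1,\dots,n}^{1/p} \big)\big( D_{1,\dots,n}^{1/p} c^*\gamma_n\gamma_n^* c D_{1,\dots,n}^{1/p} \big) = D_{1,\dots,n}^{1/p} b^* b D_{1,\dots,n}^{1/p}\,\gamma_n^*\gamma_n\gamma_n\gamma_n^*\,D_{1,\dots,n}^{1/p} c^* c D_{1,\dots,n}^{1/p} = 0 \]
by (4) of Proposition 3.1. Thus, by orthogonality and Lemma 3.8 we have
\[ \begin{aligned}
II &= \Big( \big\| \gamma_n b D_{1,\dots,n}^{1/p} \big\|_p^p + \big\| \gamma_n^* c D_{1,\dots,n}^{1/p} \big\|_p^p \Big)^{2/p}
\ge \big\| \gamma_n b D_{1,\dots,n}^{1/p} \big\|_p^2 + \big\| \gamma_n^* c D_{1,\dots,n}^{1/p} \big\|_p^2 \\
&\ge \lambda^{2/p}(\mu_n^2 + \mu_n^{-2})\,\big\| b D_{1,\dots,n-1}^{1/p} \big\|_p^2 + (1-\lambda)^{2/p}(\mu_n^2 + \mu_n^{-2})\,\big\| c D_{1,\dots,n-1}^{1/p} \big\|_p^2 .
\end{aligned} \tag{3.9} \]
By combining (3.7), (3.8), (3.6) and (3.9) we get
\[ \big\| P_t^\varepsilon(X) D_{1,\dots,n}^{1/2} \big\|_2^2 \le \big\| X D_{1,\dots,n}^{1/p} \big\|_p^2 , \]
provided that
\[ e^{-2t} \le \min\Big\{ (\mu_n^4 + 1)^{1-\frac{2}{p}}(p-1),\ \sqrt{C(\mu_n)(p-1)} \Big\}, \]
where $C(\mu_n)$ is the constant in Lemma 3.6 for $\mu = \mu_n$.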
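The threshold at the end of the proof comes from comparing (3.7) with (3.8), (3.6) and (3.9) term by term, using the induction hypothesis on each of $a, b, c, d$. The sketch below is one reading of that bookkeeping (function names are illustrative): with $s = e^{-2t}$, the $b$-term needs $\mu_n^{-2}s \le (p-1)\lambda^{2/p}(\mu_n^2+\mu_n^{-2})$, the $c$-term needs $\mu_n^{2}s \le (p-1)(1-\lambda)^{2/p}(\mu_n^2+\mu_n^{-2})$, and the $d$-term needs $s^2 \le C(\mu_n)(p-1)$.

```python
import math


def lemma_constant(mu, p):
    """C(mu) from Lemma 3.6."""
    if p <= 4.0 / 3.0:
        return mu**-4 / 3.0
    return mu ** (8.0 - 16.0 / p) / 3.0


def rate_threshold(mu, p):
    """Largest s = e^{-2t} allowed by the condition in the proof of Theorem 3.5."""
    first = (mu**4 + 1.0) ** (1.0 - 2.0 / p) * (p - 1.0)
    second = math.sqrt(lemma_constant(mu, p) * (p - 1.0))
    return min(first, second)


def coefficient_checks(mu, p, s, rel_tol=1e-12):
    """Term-by-term coefficient comparisons for the b-, c- and d-terms."""
    lam = 1.0 / (1.0 + mu**4)
    k = mu**2 + mu**-2
    slack = 1.0 + rel_tol  # absorbs rounding when a comparison is an equality
    b_ok = mu**-2 * s <= (p - 1.0) * lam ** (2.0 / p) * k * slack
    c_ok = mu**2 * s <= (p - 1.0) * (1.0 - lam) ** (2.0 / p) * k * slack
    d_ok = s * s <= lemma_constant(mu, p) * (p - 1.0) * slack
    return b_ok, c_ok, d_ok


for mu, p in [(1.0, 1.5), (2.0, 1.2), (3.0, 1.9)]:
    s = rate_threshold(mu, p)
    print(mu, p, s, coefficient_checks(mu, p, s))
```

For $\mu_n \ge 1$ and $p \le 2$ the $c$-term condition is implied by the $b$-term condition, which is why only two quantities appear in the minimum.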

4. Approximation by Central Limit Procedure

The aim of this section is to use a standard approximation procedure to go from the baby Fock model to the $q$-Araki-Woods algebras. Most of the arguments are easy adaptations of [15], so we will simply sketch them. In Sect. 3 we constructed generalized baby gaussians $\gamma_i$ associated with the parameters $\mu_i$ for $1 \le i \le n$ by starting with the index set $I = \{\pm1, \pm2, \dots, \pm n\}$. In this section we apply the same construction using the increased index set
\[ I = \{(i,j) : 1 \le i \le n,\ 1 \le j \le m\} \cup \{(-i,-j) : 1 \le i \le n,\ 1 \le j \le m\}, \]
so that we can get generalized baby gaussians $\gamma_{i,j}$ associated with the parameter $\mu_i$ for $1 \le i \le n$, $1 \le j \le m$, and the von Neumann algebra $\Gamma_{n,m}$ generated by $\{\gamma_{i,j} : 1 \le i \le n,\ 1 \le j \le m\}$. Note that the "choice of sign" function $\varepsilon$ in this case would be $\varepsilon : I \times I \to \{\pm1\}$, satisfying
\[ \begin{aligned}
\varepsilon((i_1,i_2),(j_1,j_2)) &= \varepsilon((j_1,j_2),(i_1,i_2)), \\
\varepsilon((i_1,i_2),(i_1,i_2)) &= -1, \\
\varepsilon((i_1,i_2),(j_1,j_2)) &= \varepsilon((|i_1|,|i_2|),(|j_1|,|j_2|)), \qquad \forall (i_1,i_2), (j_1,j_2) \in I.
\end{aligned} \]
Now we replace $\varepsilon((i_1,i_2),(j_1,j_2))$, $(i_1,i_2) \prec (j_1,j_2) \in I$, with a family of i.i.d. random variables with
\[ P\big(\varepsilon((i_1,i_2),(j_1,j_2)) = -1\big) = \frac{1-q}{2}, \qquad P\big(\varepsilon((i_1,i_2),(j_1,j_2)) = 1\big) = \frac{1+q}{2}, \]
where $(i_1,i_2) \prec (j_1,j_2)$ means $i_1 < j_1$, or $i_1 = j_1$ and $i_2 < j_2$. We set
\[ s_{i,m} = \frac{1}{\sqrt{m}} \sum_{j=1}^{m} \gamma_{i,j}. \]

Then, Speicher's central limit procedure ([15,19]) tells us the following.

Proposition 4.1. For any $*$-polynomial $Q$ in $n$ non-commuting variables we have
\[ \lim_{m \to \infty} \tau^\varepsilon\big(Q(s_{1,m}, \dots, s_{n,m})\big) = \tau_q\big(Q(g_{q,1}, \dots, g_{q,n})\big) \]
for almost every $\varepsilon$.

Since the set of all non-commuting $*$-monomials is countable, we can find a choice of sign $\varepsilon$ such that the above is true for any $Q$. In the sequel we fix such an $\varepsilon$. Now we would like to transfer this convergence in distribution into $L^p$-norm convergence using Nou's ultraproduct approach ([15, Theorem 4.3, Sect. 5.2]). If we set $g_{i,m} = \mathrm{Re}(s_{i,m})$, $g_{-i,m} = \mathrm{Im}(s_{i,m})$ and $G_i = \mathrm{Re}(g_{q,i})$, $G_{-i} = \mathrm{Im}(g_{q,i})$, $1 \le i \le n$, then by Proposition 4.1 for any polynomial $P$ in $2n$ non-commuting variables we have
\[ \lim_{m \to \infty} \tau^\varepsilon\big(P(g_{-n,m}, \dots, g_{n,m})\big) = \tau_q\big(P(G_{-n}, \dots, G_n)\big). \tag{4.1} \]
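The random "choice of sign" used in this section can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the signs are sampled on the positive index pairs $(i,j)$ and extended to all of $I$ through absolute values, and Python's lexicographic tuple order plays the role of $\prec$.

```python
import random
from itertools import product


def sample_signs(n, m, q, rng):
    """Sample eps on pairs (i, j), 1 <= i <= n, 1 <= j <= m.

    Off-diagonal values are i.i.d. with P(+1) = (1 + q) / 2, P(-1) = (1 - q) / 2;
    the table is symmetric and equals -1 on the diagonal.
    """
    indices = sorted(product(range(1, n + 1), range(1, m + 1)))
    eps = {}
    for a in indices:
        eps[(a, a)] = -1
        for b in indices:
            if a < b:  # lexicographic order on tuples, i.e. a "precedes" b
                value = 1 if rng.random() < (1.0 + q) / 2.0 else -1
                eps[(a, b)] = value
                eps[(b, a)] = value
    return eps


def sign(eps, a, b):
    """Extend eps to negative indices via eps(a, b) = eps(|a|, |b|)."""
    abs_a = (abs(a[0]), abs(a[1]))
    abs_b = (abs(b[0]), abs(b[1]))
    return eps[(abs_a, abs_b)]


eps = sample_signs(2, 3, 0.3, random.Random(0))
print(sign(eps, (1, 2), (2, 1)), sign(eps, (-1, -2), (2, 1)))
```

With this sampling rule each off-diagonal sign has expectation $q$, which is the quantity that enters Speicher's central limit argument.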

We need to truncate $g_{j,m}$ to get a uniform control on the operator norms. Let $C > 0$ be a constant satisfying $\|G_j\| < C$ for any $|j| \le n$. We consider the function $h$ on $\mathbb{R}$ with $h(x) = 1_{(-C,C)}(x)\,x$, $x \in \mathbb{R}$, and set $\tilde g_{i,m} = h(g_{i,m})$, $1 \le |i| \le n$. From [15, Lemma 5.7] and the discussion after it, we have

Proposition 4.2. Let $\mathcal{U}$ be a fixed free ultrafilter on $\mathbb{N}$, $(A, \tau) = \prod_{m,\mathcal{U}}(\Gamma_{n,m}, \tau^\varepsilon)$, and let $p \in A$ be the support of $\tau$. Then we have the following normal state-preserving $*$-isomorphism:
\[ \Phi : (\Gamma_q, \tau_q) \to (A, \tau), \qquad P(G_{-n}, \dots, G_n) \mapsto p \cdot \big(P(\tilde g_{-n,m}, \dots, \tilde g_{n,m})\big)_{m,\mathcal{U}} \cdot p, \]
where $P$ is any polynomial in $2n$ non-commuting variables.

Then by Proposition 4.2, Proposition 2.2 and [17, Theorem 3.6], for any polynomial $P$ in $2n$ non-commuting variables we have
\[ \lim_{m,\mathcal{U}} \big\| P(\tilde g_{-n,m}, \dots, \tilde g_{n,m}) D_m^{1/p} \big\|_p = \big\| P(G_{-n}, \dots, G_n) D_q^{1/p} \big\|_p , \]
where $D_m$ is the density of $\tau^\varepsilon$ restricted to $\Gamma_{n,m}$. Now we need to replace $\tilde g_{i,m}$ back with $g_{i,m}$.

Lemma 4.3. Let $\mathcal{U}$ be a fixed free ultrafilter on $\mathbb{N}$ and $1 \le p \le 2$. For any polynomial $P$ in $2n$ non-commuting variables we have
\[ \lim_{m,\mathcal{U}} \big\| P(g_{-n,m}, \dots, g_{n,m}) D_m^{1/p} \big\|_p = \big\| P(G_{-n}, \dots, G_n) D_q^{1/p} \big\|_p . \]
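Proof. In the proof of [15, Lemma 5.7] it is shown that
\[ \lim_{m \to \infty} \tau^\varepsilon\big( \tilde g_{n,j_1} \cdots \tilde g_{n,j_{k-1}} (g_{n,j_k} - \tilde g_{n,j_k})\, g_{n,j_{k+1}} \cdots g_{n,j_l} \big) = 0 \]
for any indices $j_1, \dots, j_l$ and $1 \le k \le l$. By taking involution inside the functional $\tau^\varepsilon$ we also get
\[ \lim_{m \to \infty} \tau^\varepsilon\big( g_{n,j_1} \cdots g_{n,j_{k-1}} (g_{n,j_k} - \tilde g_{n,j_k})\, \tilde g_{n,j_{k+1}} \cdots \tilde g_{n,j_l} \big) = 0. \]
If we apply the above limits repeatedly, then we have
\[ \lim_{m \to \infty} \Big( \tau^\varepsilon\big( \tilde g_{n,j_1} \cdots \tilde g_{n,j_{k-1}}\, g_{n,j_k} \cdots g_{n,j_l} \big) - \tau^\varepsilon\big( g_{n,j_1} \cdots g_{n,j_{k-1}}\, g_{n,j_k} \cdots g_{n,j_l} \big) \Big) = 0 \tag{4.2} \]
and
\[ \lim_{m \to \infty} \Big( \tau^\varepsilon\big( g_{n,j_1} \cdots g_{n,j_{k-1}}\, \tilde g_{n,j_k} \cdots \tilde g_{n,j_l} \big) - \tau^\varepsilon\big( \tilde g_{n,j_1} \cdots \tilde g_{n,j_{k-1}}\, \tilde g_{n,j_k} \cdots \tilde g_{n,j_l} \big) \Big) = 0. \tag{4.3} \]
Now we consider any polynomial $P$ in $2n$ non-commuting variables; then we have
\[ \begin{aligned}
\big\| P(g_{-n,m}, \dots, g_{n,m}) D_m^{1/2} - P(\tilde g_{-n,m}, \dots, \tilde g_{n,m}) D_m^{1/2} \big\|_2^2
&= \tau^\varepsilon\big( P^*P - \tilde P^* P - P^*\tilde P + \tilde P^*\tilde P \big) \\
&\le \big| \tau^\varepsilon( P^*P - \tilde P^* P ) \big| + \big| \tau^\varepsilon( P^*\tilde P - \tilde P^*\tilde P ) \big| ,
\end{aligned} \]
where $P$ and $\tilde P$ denote $P(g_{-n,m}, \dots, g_{n,m})$ and $P(\tilde g_{-n,m}, \dots, \tilde g_{n,m})$, respectively. Since $P^*P - \tilde P^* P$ and $P^*\tilde P - \tilde P^*\tilde P$ are linear combinations of terms of the forms
\[ \tilde g_{n,j_1} \cdots \tilde g_{n,j_{k-1}}\, g_{n,j_k} \cdots g_{n,j_l} - g_{n,j_1} \cdots g_{n,j_{k-1}}\, g_{n,j_k} \cdots g_{n,j_l} \]
and
\[ g_{n,j_1} \cdots g_{n,j_{k-1}}\, \tilde g_{n,j_k} \cdots \tilde g_{n,j_l} - \tilde g_{n,j_1} \cdots \tilde g_{n,j_{k-1}}\, \tilde g_{n,j_k} \cdots \tilde g_{n,j_l}, \]
respectively, (4.2) and (4.3) imply that
\[ \lim_{m \to \infty} \big\| P(g_{-n,m}, \dots, g_{n,m}) D_m^{1/2} - P(\tilde g_{-n,m}, \dots, \tilde g_{n,m}) D_m^{1/2} \big\|_2 = 0. \]
Since $L^2(\tau^\varepsilon)$ embeds into $L^p(\tau^\varepsilon)$ contractively we get
\[ \lim_{m \to \infty} \big\| P(g_{-n,m}, \dots, g_{n,m}) D_m^{1/p} - P(\tilde g_{-n,m}, \dots, \tilde g_{n,m}) D_m^{1/p} \big\|_p = 0, \]
so that
\[ \lim_{m,\mathcal{U}} \big\| P(g_{-n,m}, \dots, g_{n,m}) D_m^{1/p} \big\|_p = \lim_{m,\mathcal{U}} \big\| P(\tilde g_{-n,m}, \dots, \tilde g_{n,m}) D_m^{1/p} \big\|_p = \big\| P(G_{-n}, \dots, G_n) D_q^{1/p} \big\|_p . \]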

Remark 4.4. We can extend Lemma 4.3 to the case $2 < p < \infty$.

The following lemma is a non-tracial version of [2, Lemma 5].

Lemma 4.5. For any $*$-polynomial $Q$ in $n$ non-commuting variables and $1 \le p \le 2$ we have
\[ \lim_{m \to \infty} \big\| P_t^\varepsilon\big(Q(s_{1,m}, \dots, s_{n,m})\big) D_m^{1/p} \big\|_p = \big\| P_t^q\big(Q(g_{q,1}, \dots, g_{q,n})\big) D_q^{1/p} \big\|_p . \]

Proof. The proof is essentially the same as that of [2, Lemma 5], so we omit it. Note that we need Lemma 4.3 for the conclusion.

Proof of Theorem 2.3. By a standard density argument it is enough to consider the case $\dim H = n$. Then for any $*$-polynomial $Q$ in $n$ non-commuting variables and a fixed free ultrafilter $\mathcal{U}$ on $\mathbb{N}$ we have
\[ \big\| P_t^q\big(Q(g_{q,1}, \dots, g_{q,n})\big) D_q^{1/2} \big\|_2 = \lim_{m,\mathcal{U}} \big\| P_t^\varepsilon\big(Q(s_{1,m}, \dots, s_{n,m})\big) D_m^{1/2} \big\|_2 \]
by Lemma 4.5. Theorem 3.5 implies that
\[ \big\| P_t^\varepsilon\big(Q(s_{1,m}, \dots, s_{n,m})\big) D_m^{1/2} \big\|_2 \le \big\| Q(s_{1,m}, \dots, s_{n,m}) D_m^{1/p} \big\|_p \]
if $e^{-2t} \le C\alpha_\mu^{4-\frac{8}{p}}(p-1)$, where $C$ is the constant in Theorem 3.5. Applying Lemma 4.3 we get
\[ \big\| P_t^q\big(Q(g_{q,1}, \dots, g_{q,n})\big) D_q^{1/2} \big\|_2 \le \big\| Q(g_{q,1}, \dots, g_{q,n}) D_q^{1/p} \big\|_p . \]

Remark 4.6. For the most general case of $\Gamma_q(H_{\mathbb{R}}, (U_t))$ we use the discretization argument in [15, Sect. 6], where the following embedding has been established. For a fixed free ultrafilter $\mathcal{U}$ on $\mathbb{N}$ we have the following normal state-preserving $*$-isomorphism:
\[ \Phi : (\Gamma_q(H_{\mathbb{R}}, (U_t)), \tau_q) \to \prod_{n,\mathcal{U}} (\Gamma_n, \tau_n), \qquad G(e_i) \mapsto p \cdot (G_n(e_i))_{n,\mathcal{U}} \cdot p, \]
where the $(\Gamma_n, \tau_n)$'s are almost periodic $q$-Araki-Woods algebras, $G(e_i)$ and the $G_n(e_i)$'s are the corresponding gaussians, and $p \in \prod_{n,\mathcal{U}} \Gamma_n$ is the support of $\prod_{n,\mathcal{U}} \tau_n$. Then, the same ultraproduct argument as above proves Theorem 1.3.

5. 1-Dimensional Estimate

We consider the "only if" direction by examining the 1-dimensional behavior as usual. We start with an estimate of the $L^p$-norm of $g_i$, the $q$-gaussian with the parameter $\mu_i$. As $g_i^* g_i$ is in the centralizer of $\varphi$,
\[ \big\| g_i D_i^{1/p} \big\|_p = \big\| D_i^{1/p} g_i^* g_i D_i^{1/p} \big\|_{p/2}^{1/2} = \varphi\big( (g_i^* g_i)^{p/2} \big)^{1/p} . \]
The self-adjoint element $y = g_i^* g_i$ can be seen as a commutative random variable in some probability space with measure induced by $\varphi$. It is well known that $q$-creations are bounded for $-1 \le q < 1$ ([5, Lemma 4]), so $\|y\|_\infty \sim \mu_i^2$ with constants depending only on $q$. Moreover, we have already seen that $\|y\|_1 = \frac{1}{\mu_i^2}$ and $\|y\|_2 \sim 1$. It follows from the Hölder inequality that $\|y\|_p = \varphi\big( (g_i^* g_i)^p \big)^{1/p} \sim \mu_i^{2-4/p}$ with constants depending only on $q$. So we conclude that for $p \ge 2$,
\[ \big\| g_i D_i^{1/p} \big\|_p \sim \mu_i^{1-\frac{4}{p}} . \]
By duality, if $P_t^q$ can be extended to a contraction from $L^p(\Gamma_q)$ into $L^2(\Gamma_q)$, then it can also be extended from $L^2(\Gamma_q)$ into $L^{p'}(\Gamma_q)$, where $\frac1p + \frac1{p'} = 1$, so
\[ e^{-t} \lesssim \mu_i^{2-\frac{4}{p}} . \]
That is,

Theorem 5.1. Suppose that $\alpha_\mu = \sup_n \mu_n = \infty$. Then $P_t^q$ cannot be extended to a contraction from $L^p(\Gamma_q)$ into $L^2(\Gamma_q)$ for any $1 \le p < 2$.

We give a more precise estimate for $p \to 1$. Let $n \in \mathbb{N}$, and set $a(\varepsilon) = (1 + \varepsilon g_i) D_i^{\frac{1}{2n}}$, $\varepsilon > 0$. Then we have
\[ |a(\varepsilon)|^{2n} = \Big( D_i^{\frac{1}{2n}} \big( 1 + \varepsilon g_i + \varepsilon g_i^* + \varepsilon^2 g_i^* g_i \big) D_i^{\frac{1}{2n}} \Big)^n . \]


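If we expand the right hand side, then we get
\[ \begin{aligned}
& D_i + \varepsilon \Big( D_i^{\frac{1}{2n}} g_i D_i^{\frac{1}{2n}} D_i^{\frac{n-1}{n}} + \cdots + D_i^{\frac{n-1}{n}} D_i^{\frac{1}{2n}} g_i D_i^{\frac{1}{2n}} \Big) \\
& + \varepsilon \Big( D_i^{\frac{1}{2n}} g_i^* D_i^{\frac{1}{2n}} D_i^{\frac{n-1}{n}} + \cdots + D_i^{\frac{n-1}{n}} D_i^{\frac{1}{2n}} g_i^* D_i^{\frac{1}{2n}} \Big) + n\varepsilon^2 D_i g_i^* g_i \\
& + \varepsilon^2 \Big( D_i^{\frac{1}{2n}} g_i^* D_i^{\frac{1}{n}} g_i D_i^{\frac{1}{2n}} D_i^{\frac{n-2}{n}} + \cdots + D_i^{\frac{1}{2n}} g_i D_i^{\frac{1}{n}} g_i^* D_i^{\frac{1}{2n}} D_i^{\frac{n-2}{n}} + \cdots \\
& \qquad\quad + D_i^{\frac{1}{2n}} g_i^* D_i^{\frac{2}{n}} g_i D_i^{\frac{1}{2n}} D_i^{\frac{n-3}{n}} + \cdots + D_i^{\frac{1}{2n}} g_i D_i^{\frac{2}{n}} g_i^* D_i^{\frac{1}{2n}} D_i^{\frac{n-3}{n}} + \cdots \Big) + o(\varepsilon^2) D_i .
\end{aligned} \]
Thus, we have
\[ \begin{aligned}
\mathrm{tr}\big( |a(\varepsilon)|^{2n} \big)
&= 1 + n\varepsilon^2\mu_i^{-2} + \varepsilon^2 \left( \sum_{k=1}^{n-1} (n-k)\,\mu_i^{\frac{4}{n}k-2} + \sum_{k=1}^{n-1} k\,\mu_i^{-\frac{4}{n}(n-k)+2} \right) + o(\varepsilon^2) \\
&= 1 + n\varepsilon^2 \sum_{k=0}^{n-1} \mu_i^{\frac{4}{n}k-2} + o(\varepsilon^2)
= 1 + n\varepsilon^2\mu_i^{-2}\,\frac{\mu_i^4 - 1}{\mu_i^{\frac{4}{n}} - 1} + o(\varepsilon^2),
\end{aligned} \]
so that
\[ \begin{aligned}
\big\| P_t^q(a(\varepsilon)) \big\|_{2n} = \big\| a(e^{-t}\varepsilon) \big\|_{2n}
&= \left( 1 + n e^{-2t}\varepsilon^2\mu_i^{-2}\,\frac{\mu_i^4 - 1}{\mu_i^{\frac{4}{n}} - 1} + o(\varepsilon^2) \right)^{\frac{1}{2n}} \\
&= 1 + \frac{e^{-2t}\varepsilon^2}{2}\,\mu_i^{-2}\,\frac{\mu_i^4 - 1}{\mu_i^{\frac{4}{n}} - 1} + o(\varepsilon^2) \\
&\ge 1 + \frac{n}{2}\,e^{-2t}\varepsilon^2\mu_i^{2-\frac{4}{n}} + o(\varepsilon^2).
\end{aligned} \]
Consequently, $\big\| P_t^q(a(\varepsilon)) \big\|_{2n} \le \| a(\varepsilon) \|_2$ implies that
\[ 1 + \frac{n}{2}\,e^{-2t}\varepsilon^2\mu_i^{2-\frac{4}{n}} + o(\varepsilon^2) \le 1 + \frac{\varepsilon^2}{2}\,\mu_i^{-2} + o(\varepsilon^2), \]
which means
\[ e^{-2t} \le \frac{1}{n}\,\mu_i^{-4+\frac{4}{n}} \le 2\mu_i^{-4+\frac{8}{2n}}\,\frac{1}{2n-1} \tag{5.1} \]
by taking $\varepsilon \to 0$. By duality we get the following.

Theorem 5.2. Let $\frac1p = 1 - \frac{1}{2n}$, $n\,(\ge 2) \in \mathbb{N}$. Then $\big\| P_t^q \big\|_{L^p \to L^2} \le 1$ implies that
\[ e^{-2t} \le 2\alpha_\mu^{4-\frac{8}{p}}(p-1). \]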
If we turn back to the baby Fock model, this one dimensional estimate can be extended for all 1 1. Then, for any p > 2, p p + c p,λ Tr d p g ∗ g + c p, 1 Tr d p gg ∗ + O(ε2 ), (1 + εg)d p = Tr d p + ε2 λ 2 where c p,λ =

λp − 1 2(λ2 − 1)(1 −

1 ) λ2

−

pλ2 . 4(λ2 − 1)

Proof. We use the well known fact that on self-adjoint invertible matrices, the map f : x → x p/2 is C ∞ . Moreover, its derivative at x can be expressed easily in terms of the spectral decomposition of x and divided differences of f ; if x = sps is the spectral decomposition of x, then for h ∈ Msa : n f 1 (s, t) ps hpt , diffx f.h = s,t

diff2x

f.(h, h) =

f 2 (s, t, u) ps hpt hpu ,

s,t,u

where

% f 1 (a, b) = % f 2 (a, b, c) =

f (a)− f (b) a−b f (a)

ifa = b if a = b

,

f 1 (a,c)− f 1 (b,c) a−b f 1 (a,c) limh→0 f1 (a+h,c)− h

if a = b if a = b

.

Under the trace, for our choice of f : p Tr x p/2−1 h, 2 2 f 2 (s, t, s) ps hpt h . Tr (diffx f.(h, h)) = Tr Tr (diffx f.h) =

s,t

We want the expansion at the second order in ε of (1 + εg)d p = Tr (d 2 + εd(g + g ∗ )d + ε2 dg ∗ gd) p/2 . p

By the above formula, with x = d 2 , the first order term is of the commutation relation as λ = 1.

p 2

Tr d p (g + g ∗ ) = 0 because



The second order term has two contributions, one from the first derivative, the other from the second one. The first is given by 2p Tr d p g ∗ g. The second is more involved; let d = α∈σ (d) αpα be its spectral decomposition, we get ⎛ A = Tr ⎝

⎞ f 2 (α 2 , β 2 , α 2 ) pα d(g + g ∗ )dpβ d(g + g ∗ )d ⎠

α,β∈σ (d)

⎛ = Tr ⎝

⎞ (αβ)2 f 2 (α 2 , β 2 , α 2 ) pα (g + g ∗ ) pβ (g + g ∗ ) pα ⎠ .

α,β∈σ (d)

The relation dg = λgd gives that P(d)g = g P(λd) for any polynomial P. It yields pα g = gp αλ , where p αλ is zero if αλ is not a eigenvalue of d. In particular, pα (g + g ∗ ) pβ (g + g ∗ ) pα = g 2 p α2 p β pα + gg ∗ pα pβλ pα + g ∗ gpα p β pα λ

λ

λ

+ g ∗2 pαλ2 pα pβλ pα . Thus

⎞ ⎞ ⎛ 2 α4 α A = Tr ⎝ f 2 (α 2 , 2 , α 2 ) pα gg ∗ ⎠ + Tr ⎝ α 4 λ2 f 2 (α 2 , α 2 λ2 , α 2 ) pα g ∗ g ⎠ . λ2 λ ⎛

α∈σ (d)

α∈σ (d)

Then, 2 α4 2 α f (α , , α2 ) = α p 2 λ2 λ2

1 − λ1p p − 2(λ2 − 1) (λ2 − 1)(1 −

= 2α p c p, 1 .

1 ) λ2

λ

Finally, A = 2c p, 1 Tr d p gg ∗ + 2c p,λ Tr d p g ∗ g. λ

1

p To conclude, using the notation of Sect. 3, we apply it with d = D1,...,n , g = γn , 4

p ∗ 2 λ = μnp . Recall that Tr d p g ∗ g = μ−2 n and Tr d g g = μn ,

⎛ 1 p

p p (1 + εγn )D1,...,n p = 1 + ε2 ⎝ 2 − 2μn

⎞

8

pμnp 8 p

+

4μ2n (μn − 1) 8 p

= 1 + ε2 ·

pμ2n 4(μn − 1)

p μ4n + μn − 2 · + O(ε2 ). 8 4μ2n p μn − 1

Then, the conclusion about the optimality follows as above.

8 p

⎠ + O(ε2 )


References

1. Araki, H., Woods, E.J.: A classification of factors. Publ. Res. Inst. Math. Sci. Ser. A 4, 51–130 (1968/1969)
2. Biane, P.: Free hypercontractivity. Commun. Math. Phys. 184(2), 457–474 (1997)
3. Ball, K., Carlen, E.A., Lieb, E.H.: Sharp uniform convexity and smoothness inequalities for trace norms. Invent. Math. 115(3), 463–482 (1994)
4. Bożejko, M., Kümmerer, B., Speicher, R.: q-Gaussian processes: non-commutative and classical aspects. Commun. Math. Phys. 185(1), 129–154 (1997)
5. Bożejko, M., Speicher, R.: An example of a generalized Brownian motion. Commun. Math. Phys. 137(3), 519–531 (1991)
6. Bożejko, M., Speicher, R.: Completely positive maps on Coxeter groups, deformed commutation relations, and operator spaces. Math. Ann. 300(1), 97–120 (1994)
7. Carlen, E.A., Lieb, E.H.: Optimal hypercontractivity for Fermi fields and related noncommutative integration inequalities. Commun. Math. Phys. 155(1), 27–46 (1993)
8. Hiai, F.: q-deformed Araki-Woods algebras. In: Operator algebras and mathematical physics (Constanţa, 2001), Bucharest: Theta, 2003, pp. 169–202
9. Junge, M., Xu, Q.: Noncommutative Burkholder/Rosenthal inequalities. Ann. Probab. 31, 948–995 (2003)
10. Junge, M., Xu, Q.: Noncommutative maximal ergodic theorems. J. Amer. Math. Soc. 20(3), 385–439 (2007)
11. Kemp, T.: Hypercontractivity in non-commutative holomorphic spaces. Commun. Math. Phys. 259(3), 615–637 (2005)
12. Kosaki, H.: Applications of the complex interpolation method to a von Neumann algebra: noncommutative $L^p$-spaces. J. Funct. Anal. 56(1), 29–78 (1984)
13. Królak, I.: Contractivity properties of Ornstein-Uhlenbeck semigroup for general commutation relations. Math. Z. 250(4), 915–937 (2005)
14. Nelson, E.: The free Markoff field. J. Funct. Anal. 12, 211–227 (1973)
15. Nou, A.: Asymptotic matricial models and QWEP property for q-Araki–Woods algebras. J. Funct. Anal. 232(2), 295–327 (2006)
16. Pisier, G., Xu, Q.: Non-commutative $L^p$-spaces. In: Handbook of the geometry of Banach spaces, Vol. 2, Amsterdam: North-Holland, 2003, pp. 1459–1517
17. Raynaud, Y.: On ultrapowers of non commutative $L^p$ spaces. J. Operator Theory 48(1), 41–68 (2002)
18. Shlyakhtenko, D.: Free quasi-free states. Pacific J. Math. 177(2), 329–368 (1997)
19. Speicher, R.: A noncommutative central limit theorem. Math. Z. 209(1), 55–66 (1992)

Communicated by Y. Kawahigashi




Second Eigenvalue of Paneitz Operators and Mean Curvature Daguang Chen, Haizhong Li Department of Mathematical Sciences, Tsinghua University, Beijing 100084, P.R. China. E-mail: [email protected]; [email protected] Received: 28 November 2009 / Accepted: 3 May 2011 Published online: 9 June 2011 – © Springer-Verlag 2011

Abstract: For $n \ge 7$, we give the optimal estimate for the second eigenvalue of Paneitz operators for compact $n$-dimensional submanifolds in an $(n+p)$-dimensional space form in terms of the mean curvature and the Q-curvature.

1. Introduction

Assume that $M^n$ is a compact Riemannian manifold immersed into Euclidean space $\mathbb{R}^{n+p}$. In [9], Reilly obtained the estimate for the first eigenvalue $\lambda_1$ of the Laplacian
\[ \lambda_1 \le \frac{n}{V(M^n)} \int_{M^n} |H|^2, \tag{1.1} \]
where $H$ is the mean curvature vector of the immersion of $M^n$ in $\mathbb{R}^{n+p}$ and $V(M^n)$ is the volume of $M^n$. In [11], El Soufi and Ilias obtained the corresponding estimates for submanifolds in the unit sphere $S^{n+p}(1)$, hyperbolic space $H^{n+p}(-1)$ and some other ambient spaces. Motivated by certain variational stability issues, El Soufi and Ilias [12] obtained the sharp estimates for the second eigenvalue of the Schrödinger operator for compact submanifolds $M^n$ in the space forms $\mathbb{R}^{n+p}$, $S^{n+p}(1)$ and $H^{n+p}(-1)$.

Given a smooth 4-dimensional Riemannian manifold $(M^4, g)$, the Paneitz operator, discovered in [8], is the fourth-order operator defined by
\[ P^4 f = \Delta^2 f - \mathrm{div}\Big( \frac23 R\,\mathrm{Id} - 2\,\mathrm{Ric} \Big) df, \qquad \text{for } f \in C^\infty(M^4), \]
where $\Delta$ is the scalar Laplacian defined by $\Delta = \mathrm{div}\,d$, $\mathrm{div}$ is the divergence with respect to $g$, and $R$, $\mathrm{Ric}$ are the scalar curvature and Ricci curvature, respectively. The Paneitz operator was generalized to higher dimensions by Branson [1]. Given a smooth compact Riemannian $n$-manifold $(M^n, g)$, $n \ge 5$, let $P$ be the operator defined by (see also [2])
\[ P f = \Delta^2 f - \mathrm{div}\big( a_n R\,\mathrm{Id} + b_n\,\mathrm{Ric} \big) df + \frac{n-4}{2}\,Q f, \tag{1.2} \]
where
\[ \begin{aligned}
Q &= c_n |\mathrm{Ric}|^2 + d_n R^2 - \frac{1}{2(n-1)}\Delta R \\
&= \frac{n^2-4}{8n(n-1)^2} R^2 - \frac{2}{(n-2)^2}|E|^2 - \frac{1}{2(n-1)}\Delta R, \qquad E = \mathrm{Ric} - \frac{R}{n}\,g,
\end{aligned} \tag{1.3} \]
and
\[ a_n = \frac{(n-2)^2+4}{2(n-1)(n-2)}, \quad b_n = -\frac{4}{n-2}, \quad c_n = -\frac{2}{(n-2)^2}, \quad d_n = \frac{n^3 - 4n^2 + 16n - 16}{8(n-1)^2(n-2)^2}. \tag{1.4} \]

The second author was partly supported by NSFC grant No. 10971110.
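The two expressions for $Q$ in (1.3) are related through $|\mathrm{Ric}|^2 = |E|^2 + R^2/n$, so the coefficient of $R^2$ in the second form should equal $d_n + c_n/n$. The sketch below checks this with exact rational arithmetic; the function names are illustrative only.

```python
from fractions import Fraction


def paneitz_constants(n):
    """The constants a_n, b_n, c_n, d_n of (1.4) as exact fractions."""
    a = Fraction((n - 2) ** 2 + 4, 2 * (n - 1) * (n - 2))
    b = Fraction(-4, n - 2)
    c = Fraction(-2, (n - 2) ** 2)
    d = Fraction(n**3 - 4 * n**2 + 16 * n - 16, 8 * (n - 1) ** 2 * (n - 2) ** 2)
    return a, b, c, d


def r_squared_coefficient(n):
    """Coefficient of R^2 after substituting |Ric|^2 = |E|^2 + R^2 / n."""
    _, _, c, d = paneitz_constants(n)
    return d + c / n


for n in (5, 7, 10):
    print(n, r_squared_coefficient(n), Fraction(n * n - 4, 8 * n * (n - 1) ** 2))
```

The coefficient of $|E|^2$ is simply $c_n = -2/(n-2)^2$, which matches the second line of (1.3) directly.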

The operator $P$ is also called the Paneitz operator (or Branson-Paneitz operator). In [4,5,13], the authors investigated positivity of the Paneitz operator. In analogy with the conformal volume in [7], Xu and Yang [14] defined the $N$-conformal energy for a compact 4-dimensional Riemannian manifold immersed in the $N$-dimensional sphere $S^N(1)$. In the same paper [14], an upper bound for the first eigenvalue of the Paneitz operator was given in terms of the $N$-conformal energy. In [3], we obtained the sharp estimates for the first eigenvalue of the Paneitz operator for compact 4-dimensional submanifolds in Euclidean space and the unit sphere.

The aim of this paper is to obtain the optimal estimates for the second eigenvalue of the Paneitz operator in terms of the extrinsic geometry of the compact submanifold $M^n$ in the space form $R^{n+p}(c)$ (the Euclidean space $\mathbb{R}^{n+p}$ for $c = 0$, the Euclidean unit sphere $S^{n+p}(1)$ for $c = 1$ and the hyperbolic space $H^{n+p}(-1)$ for $c = -1$). Considering the first eigenvalue $\Lambda_1$ of $P$, it is easy to see that it is bounded by the mean value of the Q-curvature on $M^n$,
\[ \Lambda_1 \le \frac{n-4}{2V(M^n)} \int_{M^n} Q\,dv_g. \tag{1.5} \]
Moreover, the inequality (1.5) is strict unless $Q$ is constant. For the second eigenvalue we have the following:

Theorem 1.1. Let $\phi : M^n \to R^{n+p}(c)$ be an $n$-dimensional ($n \ge 7$) compact submanifold. Then the second eigenvalue $\Lambda_2$ of the Paneitz operator satisfies
\[ \Lambda_2 V(M^n) \le \frac12 n(n^2-4) \int_{M^n} \big( |H|^2 + c \big)^2 dv_g + \frac{n-4}{2} \int_{M^n} Q\,dv_g. \tag{1.6} \]
Moreover, the equality holds if and only if $\phi(M^n)$ is an $n$-dimensional geodesic sphere $S^n(r_c)$ in $R^{n+p}(c)$, where
\[ r_0 = \frac12 \left( \frac{n(n+4)(n^2-4)}{\Lambda_2} \right)^{1/4}, \qquad r_1 = \arcsin r_0, \qquad r_{-1} = \sinh^{-1} r_0. \tag{1.7} \]

Remark 1.1. For the $n$-dimensional geodesic sphere $S^n(r_c)$ in $R^{n+p}(c)$, we have
\[ \Lambda_1 = \frac{1}{16} n(n-4)(n^2-4)\big( |H|^2 + c \big)^2, \qquad \Lambda_2 = \frac{1}{16} n(n+4)(n^2-4)\big( |H|^2 + c \big)^2, \qquad Q = \frac18 n(n^2-4)\big( |H|^2 + c \big)^2 . \]
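The values in Remark 1.1 can be used to confirm that a geodesic sphere gives equality in (1.6). Since $|H|^2 + c$ and $Q$ are constant there, both sides can be divided by the volume. The following sketch does this with exact rational arithmetic; the function names and the symbol `k` for $|H|^2 + c$ are illustrative choices.

```python
from fractions import Fraction


def sphere_data(n, k):
    """Lambda_1, Lambda_2 and Q on a geodesic sphere, with k = |H|^2 + c."""
    q = Fraction(n * (n * n - 4), 8) * k**2
    lam1 = Fraction(n * (n - 4) * (n * n - 4), 16) * k**2
    lam2 = Fraction(n * (n + 4) * (n * n - 4), 16) * k**2
    return lam1, lam2, q


def rhs_per_volume(n, k, q):
    """Right side of (1.6) divided by the volume, for constant integrands."""
    return Fraction(n * (n * n - 4), 2) * k**2 + Fraction(n - 4, 2) * q


for n in (7, 8, 12):
    k = Fraction(3, 2)
    lam1, lam2, q = sphere_data(n, k)
    print(n, lam2 == rhs_per_volume(n, k, q), lam1 == Fraction(n - 4, 2) * q)
```

The same data also give equality in (1.5), since $\Lambda_1 = \frac{n-4}{2} Q$ on the sphere.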

From (1.3) and (1.6), we obtain

Corollary 1.2. Under the same assumptions as in Theorem 1.1, we have
\[ \Lambda_2 V(M^n) \le \frac12 n(n^2-4) \int_{M^n} \big( |H|^2 + c \big)^2 dv_g + \frac{(n-4)(n^2-4)}{16n(n-1)^2} \int_{M^n} R^2\,dv_g. \tag{1.8} \]
Moreover, the equality holds if and only if $M^n$ is an $n$-dimensional geodesic sphere.

Remark 1.2. We note that our technique in the proof of Theorem 1.1 does not work for $3 \le n \le 6$, so it is interesting to know whether Theorem 1.1 is true or not for $3 \le n \le 6$.

Remark 1.3. The authors in [11] and [12] applied the estimate for the second eigenvalue of the Schrödinger operator to the stability of constant mean curvature hypersurfaces and the stability of the interfaces in the Allen-Cahn reaction diffusion model. We expect that the estimate of the second eigenvalue of the Paneitz operator here may be related to some variational stability problems. We also mention that the proof of Theorem 1.1 follows a strategy similar to that in [11] and [12].

2. Some Lemmas

Assume that $\phi : M^n \to R^{n+p}(c)$ is an $n$-dimensional compact submanifold in an $(n+p)$-dimensional space form $R^{n+p}(c)$. From [7] (see also [12]), it is known that

Lemma 2.1. Let $w$ be the first eigenfunction of the Paneitz operator $P$ on $M^n$. Then there exists a regular conformal map
\[ \Gamma : R^{n+p}(c) \to S^{n+p}(1) \subset \mathbb{R}^{n+p+1} \tag{2.1} \]
such that for all $1 \le \alpha \le n+p+1$, the immersion $X = \Gamma \circ \phi = (X^1, \dots, X^{n+p+1})$ satisfies
\[ \int_{M^n} X^\alpha w\,dv_g = 0, \]
where $g$ is the induced metric of $\phi : M^n \to R^{n+p}(c)$.

Assume that $\tilde g = e^{2u} g$ is a conformal transformation for $u \in C^\infty(M^n)$; then the scalar curvature obeys [10]
\[ e^{2u}\tilde R = R - 2(n-1)\Delta u - (n-1)(n-2)|\nabla u|^2, \tag{2.2} \]

558

D. Chen, H. Li

the gradient operator and the Laplacian follows ˜ f = e−2u [ f + (n − 2)∇u · ∇ f ] , e2u |∇˜ f |2 = |∇ f |2 , ∇˜ f = e−u ∇ f,

(2.3) (2.4)

˜ are the Levi-Civita connection and Laplacian with where ∇ and (resp. ∇˜ and ) respect to g (resp. g). ˜ We have the following relation under conformal transformation g˜ = e2u g (see p. 766 in [12]): Lemma 2.2. Let φ : M n → R n+ p (c) be an n-dimensional submanifold and X = ◦ φ as before. Then we have ˜ 2 − n| H˜ |2 ) = |h|2 − n|H |2 , e2u (|h|

(2.5)

where h, h˜ are the second fundamental form of the immersion φ and X respectively, H = n1 tr h and H˜ = n1 tr h˜ are the mean curvature vectors, u is defined by 1 |∇( ◦ φ)|2 . n We need also the following result (see [11,12]): e2u =

(2.6)

Lemma 2.3. Let $\phi : M^n \to R^{n+p}(c)$ be an $n$-dimensional submanifold and $X = \Gamma \circ \phi$ as before. Then we have
\[ e^{2u}\left(|\tilde H|^2 + 1\right) = |H|^2 + c - \frac{2}{n}\Delta u - \frac{n-2}{n}|\nabla u|^2. \tag{2.7} \]

The following lemma is crucial in the proof of our Theorem 1.1.

Lemma 2.4. Let $\phi : M^n \to R^{n+p}(c)$ be an $n$-dimensional compact submanifold, $X = \Gamma \circ \phi$ as before and $u$ be defined by (2.6). Then
\[ \int_{M^n} e^{2u}(|H|^2 + c) \le \int_{M^n} (|H|^2 + c)^2 - \frac{n-6}{n}\int_{M^n} e^{2u}|\nabla u|^2. \tag{2.8} \]

Proof. Multiplying both sides of (2.7) by $e^{2u}$, we have
\[ e^{4u}\left(|\tilde H|^2 + 1\right) = e^{2u}(|H|^2 + c) - \frac{2}{n}e^{2u}\Delta u - \frac{n-2}{n}e^{2u}|\nabla u|^2. \tag{2.9} \]
Integrating (2.9) over $M^n$ and noting
\[ \int_{M^n} e^{2u}\Delta u = -2\int_{M^n} e^{2u}|\nabla u|^2, \]
we get
\[ \int_{M^n} e^{4u} \le \int_{M^n} e^{2u}(|H|^2 + c) - \frac{n-6}{n}\int_{M^n} e^{2u}|\nabla u|^2. \tag{2.10} \]
From the Cauchy-Schwarz inequality and (2.10), we have
\[
\begin{aligned}
2\int_{M^n} e^{2u}(|H|^2 + c) &\le \int_{M^n} e^{4u} + \int_{M^n} (|H|^2 + c)^2 \\
&\le \int_{M^n} e^{2u}(|H|^2 + c) - \frac{n-6}{n}\int_{M^n} e^{2u}|\nabla u|^2 + \int_{M^n} (|H|^2 + c)^2.
\end{aligned}
\]
This inequality implies (2.8).
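The integration by parts used in the proof, $\int e^{2u}\Delta u = -2\int e^{2u}|\nabla u|^2$, can be illustrated numerically in the simplest compact case. The sketch below is only an illustration under stated assumptions: it takes $M$ to be the unit circle (so $\Delta u = u''$), and the test function `u` and the helper `integrate_periodic` are hypothetical choices, not objects from the paper.

```python
import math

def integrate_periodic(f, n=4000):
    """Uniform Riemann sum over [0, 2*pi); highly accurate for smooth periodic f."""
    h = 2.0 * math.pi / n
    return h * sum(f(i * h) for i in range(n))

# Hypothetical smooth test function on the circle and its first two derivatives.
def u(t):
    return 0.3 * math.sin(t) + 0.1 * math.cos(2 * t)

def du(t):
    return 0.3 * math.cos(t) - 0.2 * math.sin(2 * t)

def d2u(t):
    return -0.3 * math.sin(t) - 0.4 * math.cos(2 * t)

# Left side: integral of e^{2u} * Laplacian(u); right side: -2 * integral of e^{2u} |grad u|^2.
lhs = integrate_periodic(lambda t: math.exp(2 * u(t)) * d2u(t))
rhs = -2.0 * integrate_periodic(lambda t: math.exp(2 * u(t)) * du(t) ** 2)
print(lhs, rhs)
```

The two printed numbers should agree to within rounding error, because $e^{2u}u'' + 2e^{2u}(u')^2$ is the derivative of the periodic function $e^{2u}u'$.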


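3. Proof of Theorem 1.1

Assume that $\phi : M^n \to R^{n+p}(c)$ is an $n$-dimensional compact submanifold in an $(n+p)$-dimensional space form $R^{n+p}(c)$. From Lemma 2.1, there exists a regular conformal map $\Gamma : R^{n+p}(c) \to S^{n+p}(1) \subset \mathbb{R}^{n+p+1}$ such that the immersion $X = \Gamma \circ \phi = (X^1, \dots, X^{n+p+1})$ satisfies
\[ \int_{M^n} X^\alpha w \, dv_g = 0, \quad \text{for all } 1 \le \alpha \le n+p+1, \]
where $w$ is the first eigenfunction of the Paneitz operator on $M^n$. Let $\Lambda_2$ be the second eigenvalue of the Paneitz operator $P$. From the max-min principle for the Paneitz operator, we have
\[ \Lambda_2 \int_{M^n} (X^\alpha)^2 \, dv_g \le \int_{M^n} P(X^\alpha) \cdot X^\alpha \, dv_g, \quad 1 \le \alpha \le n+p+1. \tag{3.1} \]
Summing (3.1) over $\alpha$ from $1$ to $n+p+1$, and using the fact that $\sum_{\alpha=1}^{n+p+1} (X^\alpha)^2 = 1$ together with (1.2), we obtain
\[
\begin{aligned}
\Lambda_2 V(M) &\le \sum_{\alpha=1}^{n+p+1} \int_{M^n} P(X^\alpha) \cdot X^\alpha \, dv_g \\
&= \int_{M^n} \Big[ \sum_{\alpha=1}^{n+p+1} \Delta^2 X^\alpha \cdot X^\alpha - \sum_{j,k=1}^{n} \langle \{(a_n R \delta_{jk} + b_n R_{jk}) X_j\}_k, X \rangle + \frac{n-4}{2} Q |X|^2 \Big] dv_g \\
&= \int_{M^n} \langle \Delta X, \Delta X \rangle \, dv_g + \int_{M^n} \sum_{j,k=1}^{n} \langle (a_n R \delta_{jk} + b_n R_{jk}) X_j, X_k \rangle \, dv_g + \frac{n-4}{2} \int_{M^n} Q |X|^2 \, dv_g,
\end{aligned}
\tag{3.2}
\]
where we use Stokes' formula in the second equality. By (2.3) and (2.4), we have the following calculation:
\[
\begin{aligned}
\langle \Delta X, \Delta X \rangle &= e^{4u} \langle \tilde\Delta X - (n-2)\tilde\nabla u \cdot \tilde\nabla X, \ \tilde\Delta X - (n-2)\tilde\nabla u \cdot \tilde\nabla X \rangle \\
&= e^{4u} \langle n\tilde H - nX - (n-2)\tilde\nabla u \cdot \tilde\nabla X, \ n\tilde H - nX - (n-2)\tilde\nabla u \cdot \tilde\nabla X \rangle \\
&= e^{4u} \left[ n^2 |\tilde H|^2 + n^2 + (n-2)^2 |\tilde\nabla u|^2 \right] \\
&= e^{2u} \left[ n^2 e^{2u} |\tilde H|^2 + n^2 e^{2u} + (n-2)^2 |\nabla u|^2 \right],
\end{aligned}
\tag{3.3}
\]
where $\tilde H$ is the mean curvature vector of $X = \Gamma \circ \phi : M^n \to S^{n+p}(1)$; here we used in the second equality the well-known formula $\tilde\Delta X = n\tilde H - nX$. Noting
\[ \langle X_j, X_k \rangle = e^{2u} \delta_{jk}, \tag{3.4} \]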

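and putting (3.3) into (3.2), we have
\[
\Lambda_2 V(M) \le \int_{M^n} e^{2u} \left[ n^2 e^{2u} \left( |\tilde H|^2 + 1 \right) + (n-2)^2 |\nabla u|^2 \right] dv_g + (n a_n + b_n) \int_{M^n} R e^{2u} \, dv_g + \frac{n-4}{2} \int_{M^n} Q \, dv_g.
\tag{3.5}
\]
Putting (2.7) into (3.5) and using the definitions of $a_n$, $b_n$ in (1.4), we obtain
\[
\begin{aligned}
\Lambda_2 V(M) &\le \int_{M^n} e^{2u} \Big[ n^2 \Big( |H|^2 + c - \frac{2}{n} \Delta u - \frac{n-2}{n} |\nabla u|^2 \Big) + (n-2)^2 |\nabla u|^2 \Big] dv_g \\
&\quad + (n a_n + b_n) \int_{M^n} R e^{2u} \, dv_g + \frac{n-4}{2} \int_{M^n} Q \, dv_g \\
&= n^2 \int_{M^n} e^{2u} (|H|^2 + c) \, dv_g + (n a_n + b_n) \int_{M^n} R e^{2u} \, dv_g + \frac{n-4}{2} \int_{M^n} Q \, dv_g \\
&\quad + \int_{M^n} e^{2u} \left[ (n-2)^2 - (n-2)n + 4n \right] |\nabla u|^2 \, dv_g \\
&= n^2 \int_{M^n} e^{2u} (|H|^2 + c) \, dv_g + 2(n+2) \int_{M^n} e^{2u} |\nabla u|^2 \, dv_g \\
&\quad + \frac{n^2 - 2n - 4}{2(n-1)} \int_{M^n} R e^{2u} \, dv_g + \frac{n-4}{2} \int_{M^n} Q \, dv_g.
\end{aligned}
\tag{3.6}
\]
From the Gauss equation of $\phi : M^n \to R^{n+p}(c)$,
\[ R = n(n-1)c + n^2 |H|^2 - |h|^2, \]
and $|h|^2 \ge n|H|^2$, we have
\[ R \le n(n-1)(|H|^2 + c). \tag{3.7} \]
The equality holds in (3.7) if and only if $\phi : M^n \to R^{n+p}(c)$ is a totally umbilical submanifold (see [6]). By (3.6) and (3.7), we have
\[
\Lambda_2 V(M) \le \frac{1}{2} n(n^2 - 4) \int_{M^n} e^{2u} (|H|^2 + c) \, dv_g + \frac{n-4}{2} \int_{M^n} Q \, dv_g + 2(n+2) \int_{M^n} e^{2u} |\nabla u|^2 \, dv_g.
\tag{3.8}
\]
From (2.8), we have
\[
\begin{aligned}
\Lambda_2 V(M^n) &\le \frac{1}{2} n(n^2 - 4) \int_{M^n} \left( |H|^2 + c \right)^2 dv_g + \frac{n-4}{2} \int_{M^n} Q \, dv_g \\
&\quad - \Big[ \frac{1}{2} (n-6)(n^2 - 4) - 2(n+2) \Big] \int_{M^n} e^{2u} |\nabla u|^2 \, dv_g \\
&= \frac{1}{2} n(n^2 - 4) \int_{M^n} \left( |H|^2 + c \right)^2 dv_g + \frac{n-4}{2} \int_{M^n} Q \, dv_g \\
&\quad - \frac{1}{2} (n+2)(n^2 - 8n + 8) \int_{M^n} e^{2u} |\nabla u|^2 \, dv_g.
\end{aligned}
\tag{3.9}
\]
Therefore, the inequality (1.6) follows immediately from inequality (3.9) if $n \ge 7$, since then $n^2 - 8n + 8 > 0$ and the last term can be dropped.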
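The dimension restriction can be checked with exact rational arithmetic. The sketch below (function names are ours, chosen for illustration) recomputes the coefficient of $\int e^{2u}|\nabla u|^2$ obtained by inserting (2.8) into (3.8), compares it with the factored form written in (3.9), and reports its sign for small $n$.

```python
from fractions import Fraction

def gradient_coefficient(n):
    """Coefficient subtracted in (3.9): (n-6)/n * n(n^2-4)/2 - 2(n+2)."""
    return Fraction(n - 6, n) * Fraction(n * (n * n - 4), 2) - 2 * (n + 2)

def factored_form(n):
    """The same coefficient in the factored form (n+2)(n^2-8n+8)/2."""
    return Fraction((n + 2) * (n * n - 8 * n + 8), 2)

for n in range(3, 13):
    # The two expressions must coincide for every n.
    assert gradient_coefficient(n) == factored_form(n)
    sign = "nonnegative" if factored_form(n) >= 0 else "negative"
    print(n, factored_form(n), sign)
```

The coefficient is negative for $3 \le n \le 6$ and nonnegative from $n = 7$ on, which is why the gradient term can be discarded only for $n \ge 7$ and why the method does not cover the lower dimensions mentioned in Remark 1.2.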


If the equality holds in (1.6), all the inequalities from (3.1) to (3.9) become equalities. From (3.9), we get $\nabla u = 0$, i.e. $u = \text{constant}$. In this case, (2.10) becomes an equality, and then we can infer $\tilde H = 0$. Equation (2.7) implies
\[ |H|^2 + c = e^{2u} = \text{constant}. \tag{3.10} \]
The equality case in (3.7) gives us $|h|^2 = n|H|^2$, that is,
\[ h^\alpha_{ij} = H^\alpha \delta_{ij}, \tag{3.11} \]
i.e., $\phi(M^n)$ is a totally umbilical submanifold in $R^{n+p}(c)$ (in [12], also called a geodesic sphere). From (3.11) and the Gauss equation of $\phi$, we have
\[
\begin{aligned}
R_{ijkl} &= c(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}) + h^\alpha_{ik} h^\alpha_{jl} - h^\alpha_{il} h^\alpha_{jk} \\
&= c(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}) + H^\alpha H^\alpha \delta_{ik}\delta_{jl} - H^\alpha H^\alpha \delta_{il}\delta_{jk} \\
&= (|H|^2 + c)(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}), \\
R_{ij} &= (n-1)(|H|^2 + c)\delta_{ij}, \\
R &= n(n-1)(|H|^2 + c),
\end{aligned}
\tag{3.12}
\]
where $R_{ijkl}$, $R_{ij}$ and $R$ are the components of the Riemannian curvature tensor, the Ricci tensor and the scalar curvature of $M^n$, respectively. By the definition of the Q-curvature in (1.3), we have by (1.4),
\[
\begin{aligned}
Q &= -\frac{2}{(n-2)^2} |\mathrm{Ric}|^2 + \frac{n^3 - 4n^2 + 16n - 16}{8(n-1)^2 (n-2)^2} R^2 \\
&= -\frac{2}{(n-2)^2} (n-1)^2 (|H|^2 + c)^2 \delta_{ik}\delta_{ik} + \frac{n^3 - 4n^2 + 16n - 16}{8(n-1)^2 (n-2)^2} n^2 (n-1)^2 (|H|^2 + c)^2 \\
&= \frac{1}{8} n(n^2 - 4) \left( |H|^2 + c \right)^2.
\end{aligned}
\tag{3.13}
\]
Therefore, for the equality case in (1.6), we have
\[ \Lambda_2 = \frac{1}{16} n(n+4)(n^2 - 4)(|H|^2 + c)^2. \tag{3.14} \]
From (3.14), we have
\[ |H|^2 + c = 4 \sqrt{\frac{\Lambda_2}{n(n+4)(n^2 - 4)}}. \tag{3.15} \]
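The constants in (3.13) and (3.14) can be verified with exact arithmetic. The sketch below is a check under stated assumptions: it normalizes $|H|^2 + c = 1$, uses $|\mathrm{Ric}|^2 = n(n-1)^2$ and $R^2 = n^2(n-1)^2$ from (3.12), and assumes that the equality case of (1.6) is (3.9) with the gradient term removed. The function names are ours.

```python
from fractions import Fraction

def q_curvature_coefficient(n):
    """Coefficient k in Q = k (|H|^2 + c)^2 on a totally umbilical M^n, as in (3.13)."""
    ric_sq = n * (n - 1) ** 2          # |Ric|^2 with |H|^2 + c normalized to 1
    scal_sq = n ** 2 * (n - 1) ** 2    # R^2 with the same normalization
    ric_term = -Fraction(2, (n - 2) ** 2) * ric_sq
    scal_term = Fraction(n**3 - 4 * n**2 + 16 * n - 16,
                         8 * (n - 1) ** 2 * (n - 2) ** 2) * scal_sq
    return ric_term + scal_term

def eigenvalue_coefficient(n):
    """Coefficient of (|H|^2 + c)^2 in Lambda_2 for the equality case, as in (3.14)."""
    return Fraction(n * (n * n - 4), 2) + Fraction(n - 4, 2) * q_curvature_coefficient(n)

for n in range(3, 10):
    print(n, q_curvature_coefficient(n), eigenvalue_coefficient(n))
```

For each $n \ge 3$ the printed values should equal $n(n^2-4)/8$ and $n(n+4)(n^2-4)/16$ respectively.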

Therefore, from (3.12) and (3.15), we deduce that $\phi(M^n)$ is a geodesic sphere $S^n(r_c)$ with radius $r_c$ defined by (1.7).

Conversely, suppose that $\phi(M^n)$ is a geodesic sphere $S^n(r_c)$ with radius $r_c$ defined by (1.7) in the space form $R^{n+p}(c)$. It is easily deduced that the sectional curvature is
\[ R_{ijij} = 4 \sqrt{\frac{\Lambda_2}{n(n+4)(n^2 - 4)}}, \quad i \ne j. \tag{3.16} \]
From (3.12), we obtain (3.15). Therefore the equality holds in (1.6). This completes the proof of Theorem 1.1.

Remark 3.1. If we assume that the scalar curvature $R$ is nonnegative, from (3.7) we have
\[ R^2 \le n^2 (n-1)^2 (|H|^2 + c)^2. \tag{3.17} \]
Inserting (3.17) into (1.8), we have, under $R \ge 0$ and the same assumptions as in Theorem 1.1,
\[ \Lambda_2 V(M) \le \frac{1}{16} n(n+4)(n^2 - 4) \int_{M^n} (|H|^2 + c)^2. \tag{3.18} \]
Moreover, the equality holds if and only if $M^n$ is an $n$-dimensional geodesic sphere.

Acknowledgements. The authors would like to thank the referees for some helpful comments which made this paper more readable.

References

1. Branson, T.P.: Group representations arising from Lorentz conformal geometry. J. Funct. Anal. 74, 199–291 (1987)
2. Chang, S.-Y.A., Hang, F., Yang, P.: On a class of locally conformally flat manifolds. Int. Math. Res. Not. 2004(4), 185–209 (2004)
3. Chen, D.G., Li, H.: The sharp estimates for the first eigenvalue of Paneitz operator in 4-manifold. arXiv:1010.3104v1
4. Gursky, M.: The principal eigenvalue of a conformally invariant differential operator, with an application to semilinear elliptic PDE. Commun. Math. Phys. 207, 131–143 (1999)
5. Hebey, E., Robert, F.: Coercivity and Struwe's compactness for Paneitz type operators with constant coefficients. Calc. Var. and PDE 13(4), 491–517 (2001)
6. Li, H.: Willmore submanifolds in a sphere. Math. Res. Lett. 9, 771–790 (2002)
7. Li, P., Yau, S.-T.: A new conformal invariant and its application to the Willmore conjecture and the first eigenvalue of compact surfaces. Invent. Math. 69, 269–291 (1982)
8. Paneitz, S.: A quartic conformally covariant differential operator for arbitrary pseudo-Riemannian manifolds. Preprint 1983, available at http://www.emis.de/journal/SIGMA/2008/036/sigma08-036.pdf
9. Reilly, R.: On the first eigenvalue of the Laplacian for compact submanifolds of Euclidean space. Comment. Math. Helv. 52, 525–533 (1977)
10. Schoen, R., Yau, S.-T.: Lectures on Differential Geometry. Boston, MA: International Press, 1994
11. El Soufi, A., Ilias, S.: Une inégalité du type Reilly pour les sous-variétés de l'espace hyperbolique. Comment. Math. Helvetici 67, 167–181 (1992)
12. El Soufi, A., Ilias, S.: Second eigenvalue of Schrödinger operators and mean curvature. Commun. Math. Phys. 208, 761–770 (2000)
13. Yang, P.C., Xu, X.W.: Positivity of Paneitz operators. Discrete and Continuous Dynamical Systems 7(2), 329–342 (2001)
14. Yang, P.C., Xu, X.W.: Conformal energy in four dimension. Math. Ann. 324(4), 731–742 (2002)

Communicated by M. Aizenman


On the Initial Conditions and Solutions of the Semiclassical Einstein Equations in a Cosmological Scenario

Nicola Pinamonti 1,2

1 II. Institut für Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy

Received: 23 January 2010 / Accepted: 3 March 2011 Published online: 26 May 2011 – © Springer-Verlag 2011

Abstract: In this paper we shall discuss the backreaction of a massive quantum scalar field on the curvature, the latter treated as a classical field. Furthermore, we shall deal with this problem in the realm of cosmological spacetimes by analyzing the Einstein equations in a semiclassical fashion. More precisely, we shall show that, at least on small intervals of time, solutions for this interacting system exist. This result will be achieved providing an iteration scheme and showing that the series, obtained starting from the massless solution, converges in the appropriate Banach space. The quantum states with good ultraviolet behavior (Hadamard property), used in order to obtain the backreaction, will be completely determined by their form on the initial surface if chosen to be lightlike. Furthermore, on small intervals of time, they do not influence the behavior of the exact solution. On large intervals of time the situation is more complicated but, if the spacetime is expanding, we shall show that the end-point of the evolution does not depend strongly on the quantum state, because, in this limit, the expectation values of the matter fields responsible for the backreaction do not depend on the particular homogeneous Hadamard state at all. Finally, we shall comment on the interpretation of the semiclassical Einstein equations for this kind of problem. Although the fluctuations of the expectation values of pointlike fields diverge, if the spacetime and the quantum state have a large spatial symmetry and if we consider the smeared fields on regions of large spatial volume, they tend to vanish. Assuming this point of view the semiclassical Einstein equations become more reliable.

Contents

1. Introduction
2. Geometry of the Spacetime, Classical and Quantum Matter
   2.1 Geometry
   2.2 Scalar fields, quantization and the stress tensor
3. Homogeneous States and Hadamard Property in Flat FRW
   3.1 Characterization of pure, homogeneous states in terms of solutions of the wave equation
4. Semiclassical Treatment of Backreaction and Existence of Solutions
   4.1 Semiclassical approximation
   4.2 Initial conditions at the beginning of the universe
   4.3 Expectation value of $\phi^2$ on the state $\omega_{1,0}$, and further renormalization freedom
   4.4 Recursive constructions and bounds satisfied by $\chi_k$ and by the auxiliary functional
   4.5 Existence of exact solutions of the semiclassical Einstein equations out of initial data given at the beginning of the universe
5. Further Considerations on the Semiclassical Solutions
   5.1 Expectation value of $\phi^2$ at late times
   5.2 Variance of $T$ when smeared on large volume regions
6. Summary of the Results and Final Comments
A. Appendix
   A.1 Functional derivatives of $a$ and $\tau$
   A.2 Bounds satisfied by $\chi_k$ and its functional derivatives
   A.3 Proof of Proposition 4.5

1. Introduction

Despite many attempts, no widely accepted description of the quantization procedure necessary to treat quantum gravity is available so far; however, we now know what such a theory should look like, at least in some regimes. For example, treating fields as quantum objects over a fixed curved classical background, many interesting effects have been discovered, such as Hawking radiation in the case of black hole physics [Ha75,FH90] or particle creation in the context of cosmological spacetimes [Pa68,Pa69]. Another interesting regime, where some progress could be made towards an eventual quantum gravity, is the analysis of the backreaction of matter quantum fields on curvature. In the present paper we would like to analyze some aspects of this problem in the context of cosmological spacetimes. In this respect the recent developments in the theory of quantum fields on curved spacetimes provided by the algebraic approach to quantization [Ha92] have been really fruitful. Here we think for example of the work [BFV03], where it has been shown that the quantization is nothing but a functor that associates to every spacetime its corresponding algebra of fields/observables. It is thus possible to quantize simultaneously on every spacetime and, as a byproduct, the renormalization ambiguities, present in the definition of pointwise product of fields, acquire a clearer interpretation [HW01]. In order to use similar ideas, a good control on the form of the quantum states has to be available and, in this respect, an important contribution is the one of Radzikowski [Ra96] (see also [BFK95]). In that paper he has shown that, considering a free quantum theory and employing techniques proper of Hörmander's microlocal analysis [Ho89], it is sufficient to prescribe the form of the wave front set of the two-point function of a state in order to fix the form of its singularity (Hadamard form).
Using similar ideas and similar methods it is also possible to show that many of the features of quantum fields on flat spacetime can be translated in a conceptually clear way to the curved case. More precisely, the Wick polynomials, their time ordered products and the stress tensor have been addressed in [HW01,HW02,HW05,Mo03], whereas the study of the perturbative construction of interacting field theories can be found in [BF00,HW03,BF09,BDF09]. Coming back to the analysis of the backreaction of quantum matter on classical curvature, a milestone is provided by the seminal paper of Wald [Wa77]. In that work, he has given a set of five axioms that a renormalization scheme should satisfy in order to produce a stress tensor whose expectation values can be safely used in the semiclassical Einstein equations
\[ G_{\mu\nu} = 8\pi \langle T_{\mu\nu} \rangle \tag{1} \]
in such a way that a physically reasonable backreaction emerges. Nowadays, the first four requirements on the expectation values of the stress tensor (consistency, covariance, conservation of $T_{\mu\nu}$ and standard flat spacetime limit) are automatically obtained in every locally covariant theory constructed as in [BFV03] (see also the nice lectures [BF09]). However, the fifth one, which requires that the expectation values of the stress tensor depend on the derivatives of the coefficients of the metric up to the second order, is more problematic. Nevertheless, even if it cannot be satisfied in general, it is possible to fix part of the renormalization freedom in order to have that at least the expectation values of the trace $T := T_\mu{}^\mu$ meet this requirement [Wa78b]. Making use of these methods, in the case of cosmological spacetimes, the backreaction of a massless conformally coupled scalar field has been studied by Wald in [Wa78a] and in this case an exact solution can be found. Later on, another important contribution in this direction is the one due to Starobinsky [St80] (see also [Vi85]), where analyzing the trace of the stress tensor, and its anomalous term [Wa78b], the backreaction on curvature was computed. Following similar ideas the backreaction of massive scalar fields has been analyzed in [SS02] and more recently in [DFP08]. Unfortunately, while in the case of massless conformally coupled fields the backreaction can be analytically computed, essentially because the quantum state does not influence the dynamics of the spacetime, when the fields have a mass the situation is more complicated and the details of the quantum state have to be taken into account. A nice study of the semiclassical problem, also in terms of numerical integration, can be found in the series of works of Anderson [An85,An86]. The analysis of the problem of definition of the expectation values of the stress tensor in relation with the Casimir effect is presented in [GNW09].
The semiclassical Einstein equations have also been employed by Flanagan and Wald in [FW96] in order to study the validity of the averaged null energy condition on semiclassical solutions. Other interesting works at the interface of classical gravity and quantum matter are [GM86,RV99,PS93,PR99,NO99,BO00,AE00,Sh08]. Recently the dynamical system associated with the semiclassical backreaction has been studied in [EG10], where similar equations, as the one studied in the present paper, have been derived also for more general kinds of matter fields (with different curvature coupling). In this paper we shall follow the discussion presented in [DFP08] where some approximated solutions of the semiclassical Einstein equations have been displayed. In the latter work the matter has been described by a free scalar field whose coupling constant to the curvature is equal to 1/6 and with a non-vanishing mass m. Notice that, despite the simplicity of the treated model, the solutions presented in that paper have very nice physical properties. On the one hand they show, at late times, a phase of exponential expansion (de Sitter), whose acceleration corresponds to a renormalization freedom of such a model, while on the other hand, at the beginning of the model a phase typical of the power law inflationary scenario emerges. In a certain sense that model provides an effective cosmological constant that varies in time and that could be used in order to model the present acceleration of our universe.

Here, we shall show that exact solutions of that model exist and that many of the previously discussed physical properties shown by the approximated solutions persist. To this end, since we are here interested in the cosmological scenario, we shall consider the category$^1$ F of spacetimes with a large spatial symmetry, namely the flat Friedmann-Robertson-Walker (FRW) spacetimes. Thanks to the results stated in [BFV03] and [HW01], the fields realizing the components of the stress tensor can be unambiguously found on every spacetime, hence also on every element of F. However, in order to have a well defined system of equations governing the backreaction, we must be able to unambiguously select a class of states (one for every element of F) with good ultraviolet behavior in a spacetime independent way. Eventually, the validity of the semiclassical Einstein equations will be employed as a constraint on the class F. The difficult part in this program is to give, for every element of F, a state which is not based on the particular spacetime. Here we would like to give a prescription for obtaining those natural states. Furthermore, in order to give meaning to observables like the stress tensor, the states must have good ultraviolet behavior, namely the singular part of their two-point function must be as close as possible to the singularity shown by the vacuum in Minkowski spacetime. Unfortunately this is not always possible and in many cases a state determined by some initial values on a spacelike Cauchy surface does not satisfy the necessary regularity conditions. From the physical point of view, a natural candidate for a quantum state would be the choice of a vacuum, but unfortunately such a concept is not available in curved spacetime. In any case we could ask for the validity of some physical properties in order to select a class of states that behave like the vacuum in Minkowski spacetime, or at least that are close to it.
To this end, Parker [Pa69] introduced the adiabatic states, namely states that are close to the vacuum. Afterwards, Lüders and Roberts [LR90] made their definition precise employing the construction of Parker to select certain initial conditions at finite time in a FRW spacetime. Unfortunately, in general, the states obtained in that way do not satisfy the Hadamard condition, hence they do not have the same ultraviolet singularity as the vacuum in Minkowski spacetime, as shown by Junker and Schrohe [JS02]; nevertheless they satisfy a condition weaker than the microlocal spectral condition given in [BFK95]. Another possibility would be to consider states that minimize the smeared energy density. Olbermann [Ol07] has constructed such states and has shown that they are of Hadamard form. Unfortunately, since this procedure depends upon the smearing over a finite interval of time, it is not possible to get those states only out of the initial conditions. In the present work we shall follow another procedure close to the one presented in [DMP06,Mo06,Mo08] in the context of asymptotically flat spacetimes and already employed in cosmological scenarios in [DMP09a,DMP09b]. We shall see that, if the initial surface is chosen to be lightlike (coinciding with the null past infinity in the conformally related spacetime), we can give a class of states (that do not rely on the particular spacetime) that can be used in order to solve the coupled semiclassical system. Eventually, we shall comment on the late time behavior of the solutions showing that, if a de Sitter phase of exponentially accelerated expansion has emerged, the value of that acceleration does not depend on the particular homogeneous Hadamard state. Finally we shall give a new interpretation of the semiclassical Einstein equations when considered as a non local equation. 
We shall actually show that, when smeared on large spatial volume regions, while the expectation value of the trace, $T$, does not differ from the pointlike one, its fluctuations tend to vanish.

$^1$ The elements of F are the oriented and time oriented flat FRW spacetimes and the morphisms can be chosen to be the orientation preserving conformal transformations.


In the subsequent section we shall introduce the geometric setup and the basic facts about the quantization we shall use. This section serves to fix the notation and to recollect some results used throughout the subsequent part of the work. The third section discusses the regularity and the Hadamard property shown by a class of states; the states constructed in a spacetime independent way are presented there. The fourth section contains the main theorems about the existence of solutions of the semiclassical Einstein equations in a cosmological scenario. In the fifth section we shall discuss the behavior of the semiclassical solutions at late times and we shall see that the variance of the trace vanishes when smeared on large volume regions. Finally, some comments and remarks are collected in the sixth section. The Appendix contains a discussion about deformation quantization, the technical estimates and a technical proof of a proposition used in the main text.

2. Geometry of the Spacetime, Classical and Quantum Matter

2.1. Geometry. Since we are dealing with problems in cosmology we shall assume homogeneity and isotropy. Hence, every element $(M, g)$ in the class of spacetimes under investigation is such that the smooth manifold $M$ is of the form $M = I \times \Sigma$, where $I \subset \mathbb{R}$ is an interval representing the cosmological time and $\Sigma$ is a three dimensional hypersurface that can be closed, open or flat. The metric $g$ takes the well known Friedmann-Robertson-Walker form:
\[ g = -dt^2 + a^2(t) \left[ \frac{dr^2}{1 - \kappa r^2} + r^2 \, dS^2(\theta, \varphi) \right], \tag{2} \]
where $dS^2(\theta, \varphi)$ is the standard metric of the unit sphere in spherical coordinates and $\kappa \in \{+1, 0, -1\}$ distinguishes between the topologies of $\Sigma$, in particular between closed, flat or open spacelike hypersurfaces respectively. Furthermore, $a(t)$ is the scaling factor representing the “history” of the universe and it is the single dynamical degree of freedom present in the system.

Next, also supported by recent observations that seem to confirm this hypothesis, we shall restrict our attention to the flat case, $\kappa = 0$. With this assumption the metric (2) becomes conformally flat. This last fact, when $a(t)$ is given, can be manifestly seen passing from the cosmological time $t$ to the conformal time $\tau$. The latter is determined by the diffeomorphism
\[ \tau(t) := \tau_0 - \int_t^{t_0} \frac{1}{a(t')} \, dt', \]
where $\tau_0$ and $t_0$ are two fixed constants. Since this transformation is a diffeomorphism and $a$ is a field on the manifold $M$, it is clear that $a$ can be seen either as a function of $t$ or as a function of $\tau$. For this reason, next, we shall indicate $a \circ \tau^{-1}(\tau)$ by $a(\tau)$, as is the custom for scalar fields over manifolds. We would like to stress that, choosing $a(t)$ as the single degree of freedom, we are at the same time selecting a fixed (free) background on which $a(t)$ evolves. The matter evolution could be analyzed on that background too; however, since we restrict our attention to conformally flat spacetimes, another natural choice for the background would be Minkowski spacetime $(\mathbb{M}, g_{\mathbb{M}})$ taken with the standard Minkowski metric
\[ g_{\mathbb{M}} := -d\tau^2 + dr^2 + r^2 \, dS^2(\theta, \varphi), \]

where $(r, \theta, \varphi)$ represents a point in $\mathbb{R}^3$ in the standard spherical coordinates. In order to fix the notation, next, we shall indicate the generic element $x$ of $\mathbb{M}$ by $(\tau, \mathbf{x})$; notice that this defines a coordinate system also on $M$. Later it will also be useful to consider as a background the static Einstein universe $(M_e, g_e)$. The advantage of this last selection is in the compactness of its spatial sections. All these spacetimes are connected by a chain of conformal embeddings [Pi09] $\eta$ and $\eta'$ such that
\[ (M, g) \xrightarrow{\ \eta\ } (\mathbb{M}, g_{\mathbb{M}}) \xrightarrow{\ \eta'\ } (M_e, g_e), \tag{3} \]
where the conformal factor of $\eta$ is $\Omega^2 := a^{-2}$, while the one of $\eta'$ is
\[ \Omega'^2 := \left[ \frac{(1 + (\tau + r)^2)(1 + (\tau - r)^2)}{4} \right]^{-1}. \]
We would like to stress that in every spacetime under investigation, namely for every $a(t)$, there is a conformal Killing vector, $\partial_\tau$, generating conformal translations $\tau \to \tau + \lambda$. We shall use these conformal symmetries in order to define a suitable quantum state used for the computation of backreaction.
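As an illustration of the change of variables from cosmological to conformal time, the following sketch evaluates $\tau(t) = \tau_0 - \int_t^{t_0} dt'/a(t')$ by a composite Simpson rule. The function `conformal_time` and the power-law scale factor $a(t) = \sqrt{t}$ are hypothetical choices made only for this example; they are not taken from the paper.

```python
import math

def conformal_time(a, t, t0, tau0=0.0, steps=2000):
    """tau(t) = tau0 - integral from t to t0 of dt'/a(t'), composite Simpson rule.

    `steps` must be even; `a` is any positive scale factor on the interval.
    """
    h = (t0 - t) / steps
    total = 1.0 / a(t) + 1.0 / a(t0)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / a(t + i * h)
    return tau0 - total * h / 3.0

# Hypothetical scale factor, used only as a test case: a(t) = sqrt(t).
def scale_factor(t):
    return math.sqrt(t)

t0 = 4.0
tau_numeric = conformal_time(scale_factor, 1.0, t0)
# Closed form for this choice: tau(t) = tau0 + 2 sqrt(t) - 2 sqrt(t0).
tau_exact = 2.0 * math.sqrt(1.0) - 2.0 * math.sqrt(t0)
print(tau_numeric, tau_exact)
```

Since $d\tau/dt = 1/a > 0$, the map $t \mapsto \tau(t)$ is strictly increasing, which is what allows $a$ to be regarded as a function of $\tau$.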

2.2. Scalar fields, quantization and the stress tensor. As a model for matter we shall consider a real scalar field, namely a field satisfying
\[ P\varphi = 0, \qquad P = -\Box_g + \xi R_g + m^2. \tag{4} \]
For simplicity, we shall choose the coupling to the metric to be $\xi = 1/6$, corresponding to the so-called conformal coupling, whereas we shall allow the mass to be non vanishing. In this respect $m$ plays the role of the “coupling constant” for the matter gravity interaction, or the interaction between the field $\varphi$ and the scaling factor $a$. We shall employ the quantization prescription proper of the algebraic approach ([BFV03,HW01]). First of all we select the $*$-algebra of local observables/fields $\mathcal{A}(M)$ and afterwards, choosing a suitable state $\omega$ (a normalized, positive linear functional over $\mathcal{A}(M)$), the expectation values for the elements of $\mathcal{A}(M)$ are obtained.$^2$ For our purpose it is important to stress that, $M$ being a globally hyperbolic spacetime, $\mathcal{A}(M)$ is uniquely determined by the equation of motion and the unique causal propagator$^3$ $E$. The whole procedure is hence functorial as discussed in [BFV03]. In order to compute the backreaction of matter on gravity by means of the semiclassical Einstein equations (1), the expectation values of the components of the stress tensor need to be computed. Hence, since the pointwise product of fields is singular, the quantum state has to be selected in a physically reasonable way. In other words a certain regularization procedure has to be implemented and it must be meaningful on that state. We proceed now to discuss this aspect in detail. On the field algebra $\mathcal{A}(M)$ (constructed as discussed for example in [BF09]), the functional $\omega$ representing the state can be thought of as being completely described by a suitable set of distributions $\omega_n \in \mathcal{D}'(M^n)$, the so-called $n$-point functions. In this paper we shall restrict our attention to states that are quasi-free, namely states whose $n$-point functions vanish for odd $n$, while for even $n$ they can be given in terms of the two-point function $\omega := \omega_2$ only by means of the formula
\[ \omega_n(x_1, \dots, x_n) = \sum_{\pi \in P_n} \omega(x_{\pi(1)}, x_{\pi(2)}) \cdots \omega(x_{\pi(n-1)}, x_{\pi(n)}), \tag{5} \]
where the sum is taken over the set $P_n$ of permutations of the first $n$ natural numbers with the constraints
\[ \pi(2i - 1) < \pi(2i), \quad 1 \le i \le n/2, \qquad \text{and} \qquad \pi(2i - 1) < \pi(2i + 1), \quad 1 \le i < n/2. \]

$^2$ An explicit construction of $\mathcal{A}(M)$ in terms of techniques proper of the deformation quantization can be found in [BF09,BDF09].
$^3$ The construction, and the proof of existence and uniqueness, can be found for example in [BGP96].
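The combinatorics of formula (5) can be made concrete with a short enumeration. In the sketch below the permutations in $P_n$ are represented as pairings of the indices $0, \dots, n-1$; the names `pairings` and `n_point_function` are ours, and `two_point` is a toy numerical kernel standing in for the distribution $\omega$, so this is an illustration of the sum only, not of the distributional content.

```python
def pairings(indices):
    """Yield every splitting of a sorted, even-length index list into ordered pairs.

    Each pairing lists its pairs with increasing first entries, and each pair has
    its smaller index first, which matches the two constraints defining P_n.
    """
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for j, partner in enumerate(rest):
        remaining = rest[:j] + rest[j + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def n_point_function(two_point, xs):
    """Quasi-free n-point function built from a two-point kernel via formula (5)."""
    n = len(xs)
    if n % 2:
        return 0.0  # odd n-point functions vanish for quasi-free states
    total = 0.0
    for pairing in pairings(list(range(n))):
        term = 1.0
        for i, j in pairing:
            term *= two_point(xs[i], xs[j])
        total += term
    return total

# Toy kernel (an assumption for illustration): two_point(x, y) = x * y.
print(len(list(pairings([0, 1, 2, 3]))))
print(n_point_function(lambda x, y: x * y, [1.0, 2.0, 3.0, 4.0]))
```

The number of terms in (5) is $(n-1)!! = (n-1)(n-3)\cdots 1$, so the four-point function is a sum of three products of two-point functions.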

Let us proceed to discuss the regularization prescription we shall use to give meaning to observables like the components of $T_{\mu\nu}$ or the field square. In order to implement it in practice, we must have control of the ultraviolet behavior of our states. For this reason, we shall impose a constraint on the states, asking that their two-point functions have singularities that look, as much as possible, similar to the ultraviolet singularity displayed by the Minkowski vacuum. This is achieved requiring that the two-point function $\omega$ satisfies the so-called microlocal spectrum condition [Ra96,BFK95].

Definition 2.1. A state $\omega$ on $\mathcal{A}(M)$ satisfies the microlocal spectrum condition ($\mu$SC) if its two-point function $\omega$ is a distribution on $C_0^\infty(M^2)$ whose wave front set has the form
\[ WF(\omega) = \{ (x, y, k_x, k_y) \in T^*M^2 \setminus \{0\}, \ (x, k_x) \sim (y, -k_y), \ k_x \triangleright 0 \}. \]
The relation $(x, k_x) \sim (y, -k_y)$ is satisfied in $M$ if the points $x$ and $y$ are connected by a null geodesic whose cotangent vector in $x$ is $k_x$ and in $y$ is $-k_y$. Furthermore $k_x \triangleright 0$ if $k_x$ is future directed.

Notice that, thanks to the work of Sanders [Sa09], it suffices to give a condition only on the two-point function. In the above mentioned paper it is actually shown that every state of a free scalar field algebra, whose two-point function satisfies the microlocal spectrum condition, has $n$-point functions that enjoy the microlocal spectrum condition too (for the definition in that case we refer to the work [BFK95]). Notice that the two-point function of a quasi-free state for fields satisfying the equation of motion (4) and the microlocal spectrum condition has an ultraviolet singularity that is similar to the one of the Minkowski vacuum. In other words, its integral kernel can be expanded in the following way
\[ \omega(x, y) := \lim_{\epsilon \to 0^+} \left[ \frac{U(x, y)}{\sigma_\epsilon(x, y)} + V(x, y) \log \frac{\sigma_\epsilon(x, y)}{\lambda^2} \right] + W(x, y), \tag{6} \]
where $U$, $V$ and $W$ are smooth functions on $M^2$, $\lambda$ is a length scale used in order to have the logarithm with a dimensionless argument and $\sigma_\epsilon(x, y) = \sigma(x, y) + i\epsilon(T(x) - T(y)) + \epsilon^2$, where $T$ is a generic time function and $\sigma$ is the Synge world function, namely half of the squared geodesic distance taken with sign [DB60,KW91]. Furthermore, $\epsilon$ is a regularization parameter that eventually tends to $0^+$ (weak limit). A two-point function whose singular part is of the form (6) is said to be of Hadamard form. It is important to notice that the coefficients $U$ and $V$ are completely determined by the geometry and the equation of motion, while the coefficient $W$ really characterizes the state. The distribution constructed out of the first two terms in (6) is called the Hadamard singularity and it is indicated by $\mathcal{H}$; its integral kernel is
\[ \mathcal{H}(x, y) := \lim_{\epsilon \to 0^+} \left[ \frac{U(x, y)}{\sigma_\epsilon(x, y)} + V(x, y) \log \frac{\sigma_\epsilon(x, y)}{\lambda^2} \right]. \tag{7} \]

Hence, in order to regularize the states satisfying the microlocal spectrum condition we can simply subtract the singular part $\mathcal{H}$ from $\omega$; what remains is a smooth function whose coinciding point limit can be safely computed. As a side remark, we notice that, since we are interested in local fields, for our purpose it is enough to have an explicit form of $\mathcal{H}$ on small domains. The question about global existence of $\mathcal{H}$ is hence not an important one in our approach. Coming back to the main point, it is hence possible to obtain a well posed stress tensor on states satisfying the $\mu$SC. In practice this is achieved by operating on the regularized two-point function $\omega - \mathcal{H}$ with a certain differential operator before computing the well posed coinciding point limit. This procedure is called point splitting regularization. Furthermore, we notice that on those states the fluctuations of the observables are always finite. Let us discuss this point in some detail. On the class of states that satisfy the $\mu$SC, we could extend the algebra of fields $\mathcal{A}(M)$ in order to encompass also the pointwise product of fields. Here we shall indicate by $\mathcal{F}(M)$ the extended algebra of fields, namely the $*$-algebra that contains also the Wick powers as constructed in [HW01], or more recently by means of deformation quantization methods in [BDF09,BF09]. We are now ready to discuss the form of the stress tensor as an element of $\mathcal{F}(M)$. To this end, following the discussion introduced in [Mo03], we consider the following operation on smooth field configurations $\varphi$:
\[ T_{ab}(\varphi) := \partial_a \varphi \partial_b \varphi - \frac{1}{6} g_{ab} \left( \partial_c \varphi \partial^c \varphi + m^2 \varphi^2 \right) - \xi \nabla_a \partial_b \varphi^2 + \xi \left( R_{ab} - \frac{R}{6} g_{ab} \right) \varphi^2 + \left( \xi - \frac{1}{6} \right) g_{ab} \Box \varphi^2. \]
Notice that for every compactly supported smooth function $f$,
\[ T_{ab}(\varphi)(f) := \frac{1}{2} \langle f, T_{ab}(\varphi) \rangle \]
is an element of $\mathcal{F}(M)$, and we shall use it in order to compute the backreaction in (1). Another element of $\mathcal{F}(M)$ that will be useful is $\phi^2(f)$. This is nothing but $\langle f, \varphi^2 \rangle / 2$. We shall indicate by $\langle \phi^2 \rangle_\omega$ and $\langle T_{ab} \rangle_\omega$ the expectation values of the two observables introduced above computed on the state $\omega$, regularized according to the previous discussion. Hence $\langle \phi^2 \rangle_\omega$ and $\langle T_{ab} \rangle_\omega$ are obtained replacing the smooth field configuration $\varphi(x)\varphi(y)$ with the smooth function $\omega - \mathcal{H}$ in $T_{ab}(\varphi)(f)$ and $\phi^2(f)$ before performing the coinciding point limit (contracting with the delta function). In the procedure presented above, there is an ambiguity in the definition of $\mathcal{H}$, whose form is described in [HW01]; we call it renormalization freedom. Later on we shall give more details on this crucial aspect.

3. Homogeneous States and Hadamard Property in Flat FRW

3.1. Characterization of pure, homogeneous states in terms of solutions of the wave equation. In this subsection we shall discuss the form of the states we are interested in. In particular, we shall make use of some previous works in order to get the form of the two-point function of the pure homogeneous states on FRW spacetimes. We start with the definition of the class $\mathcal{C}(M)$ of pure and homogeneous quasi-free states we shall employ in the present paper. Although the set $\mathcal{C}(M)$ does not contain all the possible pure and homogeneous states for $\mathcal{A}(M)$, that definition is motivated by the



work of Lüders and Roberts [LR90] (Theorem 2.3 of [LR90] in particular) and that of Olbermann [Ol07]; see also the nice paper of Degner and Verch [DV09] for a review. Furthermore, we shall restrict our attention to that class.

Definition 3.1. We shall indicate by $\mathcal{C}(M)$ the set of pure, homogeneous and quasi-free states $\omega$ for $\mathcal{A}(M)$ whose two-point function $\omega \in \mathcal{D}'(M \times M)$, using the standard Minkowskian coordinates, takes the following form, in the distributional sense,$^4$
\[ \omega((\tau,\mathbf{x});(\tau',\mathbf{x}')) := \frac{1}{(2\pi)^3} \int_{\mathbb{R}^3} \frac{\overline{S_k}(\tau)}{a(\tau)}\, \frac{S_k(\tau')}{a(\tau')}\, e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\, d\mathbf{k}, \tag{8} \]
where $(\tau,\mathbf{x})$ is a generic element of $I \times \mathbb{R}^3$ and $k = |\mathbf{k}|$. For every $k \ge 0$, $S_k$ is a smooth function satisfying
\[ \frac{d^2}{d\tau^2} S_k(\tau) + \left( m^2 a(\tau)^2 + k^2 \right) S_k(\tau) = 0, \tag{9} \]
and
\[ S_k(\tau)\, \frac{d}{d\tau}\overline{S_k}(\tau) - \overline{S_k}(\tau)\, \frac{d}{d\tau} S_k(\tau) = i. \tag{10} \]

Furthermore, for every $\tau$, the map $k \mapsto S_k(\tau)$ and its first time derivative are measurable functions that are polynomially bounded at large $k$. When restricted to a generic interval $[0,k_0] \subset \mathbb{R}^+$ they are in $L^2([0,k_0], k^2 dk)$.

It is not guaranteed that the state constructed out of a generic solution $S_k$ of Eq. (9) (also with the constraint on the Wronskian) is of Hadamard type. It is also usually not easy to impose initial conditions for $S_k(\tau)$ on a spacelike hypersurface in order to have a Hadamard state. This problem, in connection with the concept of adiabatic states, has been discussed in the paper of Junker and Schrohe [JS02], who showed that it is possible to get Hadamard states only if the recursive construction of adiabatic vacua does not break down. However, as shown by Lüders and Roberts [LR90], one cannot exclude such a possibility. Another possible explicit construction of Hadamard states is the one presented by Olbermann [Ol07], but this construction requires smearing over a finite time interval. It is hence very difficult to use similar ideas in treating backreaction problems. Here we would like to give an alternative construction employing ideas similar to those presented in [Mo08,DMP09a,DMP09b], adjusting them in order to be suitable for the purpose of studying the gravity-matter interacting system. First of all, when the spacetime $M$ admits the limit $\tau \to -\infty$, we select a particular solution $\chi_k(\tau)$ of (9), namely the one that satisfies the initial conditions
\[ \lim_{\tau\to-\infty} e^{ik\tau} \chi_k(\tau) = \frac{1}{\sqrt{2k}}, \qquad \lim_{\tau\to-\infty} e^{ik\tau} \frac{d}{d\tau}\chi_k(\tau) = -i\sqrt{\frac{k}{2}}. \tag{11} \]
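As a small sanity check of the normalizations in (10) and (11), the sketch below takes the massless mode $e^{-ik\tau}/\sqrt{2k}$ (the solution of (9) when $m = 0$) and verifies numerically that its Wronskian equals $i$ and that the two quantities whose limits appear in (11) are already constant in $\tau$. The function names and the sample values of $k$ and $\tau$ are illustrative assumptions, not part of the construction in the text.

```python
import cmath
import math

def chi0(k, tau):
    """Massless mode e^{-i k tau} / sqrt(2k); solves (9) when m = 0."""
    return cmath.exp(-1j * k * tau) / math.sqrt(2.0 * k)

def dchi0(k, tau):
    """Derivative of the massless mode with respect to conformal time."""
    return -1j * k * chi0(k, tau)

def wronskian(s, ds):
    """Left-hand side of (10): S * d(conj S)/dtau - conj(S) * dS/dtau."""
    return s * ds.conjugate() - s.conjugate() * ds

# Arbitrary sample point (hypothetical values, chosen only for illustration).
k, tau = 1.7, -12.3

w = wronskian(chi0(k, tau), dchi0(k, tau))
# For the massless mode these products do not depend on tau,
# so they coincide with the limits required in (11).
lim_chi = cmath.exp(1j * k * tau) * chi0(k, tau)
lim_dchi = cmath.exp(1j * k * tau) * dchi0(k, tau)

print(w, lim_chi, lim_dchi)
```

For a massive field the mode is no longer a pure exponential, and the conditions (11) are imposed only asymptotically.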

We can now introduce the following proposition, which summarizes some results obtained in [LR90,Ol07,DV09].
4 Here we mean that $\omega$ has to be constructed as the weak limit $\epsilon \to 0^+$ of the expression on the right-hand side of (8), regularized by multiplying the integrand by an extra factor $e^{-\epsilon|\mathbf{k}|}$.


Proposition 3.1. If the limits in (11) are well posed, every solution $S_k$ of (9) which determines a state $\omega \in \mathcal{C}(M)$ according to Definition 3.1 can be realized as a linear combination of the $\chi_k$ (the solution of Eq. (9) satisfying the asymptotic conditions (11)) and its complex conjugate:
\[ S_k(\tau) := A(k)\chi_k(\tau) + B(k)\overline{\chi_k}(\tau), \tag{12} \]
and the coefficients $A(k)$ and $B(k)$ are such that $|A(k)|^2 - |B(k)|^2 = 1$. Furthermore, for every $\tau$, the functions on $\mathbb{R}^+$ given by $k \mapsto A(k)\chi_k(\tau)$ and $k \mapsto B(k)\chi_k(\tau)$, together with $k \mapsto A(k)\frac{d}{d\tau}\chi_k(\tau)$ and $k \mapsto B(k)\frac{d}{d\tau}\chi_k(\tau)$, are at most of polynomial growth and locally square integrable.

Notice that, even if we restrict our attention to the class of pure states,$^5$ we still have some freedom in the definition of the state, as expressed in the preceding proposition by the choice of $A(k)$ and $B(k)$ in the $S_k$ of (12) used in finding the two-point function (8). Motivated by this expression, we shall indicate by $\omega_{A,B}$ the quasi-free state whose two-point function is
\[ \omega_{A,B}((\tau,\mathbf{x});(\tau',\mathbf{x}')) := \frac{1}{(2\pi)^3} \int_{\mathbb{R}^3} d\mathbf{k}\; e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\, \frac{\overline{S_k}(\tau)}{a(\tau)}\, \frac{S_k(\tau')}{a(\tau')}, \tag{13} \]
where we have highlighted the dependence of the state upon $A(k)$ and $B(k)$. We are now going to find sufficient conditions that have to be fulfilled by $A(k)$ and $B(k)$ in order to ensure that $\omega_{A,B} \in \mathcal{C}(M)$ satisfies the Hadamard condition. The strategy we shall pursue is the following. We prove that $\omega_{1,0}$ is of Hadamard form provided some regularity is shown by $a(t)$ in (2), and then that $\omega_{A,B}$ satisfies the μSC if and only if $B(k)$ is rapidly decreasing for large $k$. Notice that, in order to prove that a certain distribution satisfies the microlocal spectrum condition, it is not enough to test it on some Cauchy surface, because no remnant of causality survives the projection of any vector on a spacelike hypersurface. This is not true for projections on lightlike hypersurfaces, and this is the reason why the arguments used in [Mo08,Ho00] and [DMP09b] work. We would like to give sufficient conditions on the form of the spacetime, and on its past asymptotic form in particular, in order to guarantee that $\omega_{1,0}$ is of Hadamard form.
In the case of asymptotically de Sitter spacetimes, it was already shown in [DMP09b] that the state $\omega_{1,0}$ satisfies the microlocal spectrum condition, and that if the spacetime is precisely the de Sitter one, $\omega_{1,0}$ becomes the well known Bunch-Davies state [BD78]. Furthermore, in the case of asymptotically flat spacetimes, Moretti has proven the Hadamard property for an analogous state [Mo08]; a similar proof can also be found in [Ho00]. We are now going to generalize these results in order to encompass the case considered here. The idea we would like to use is to consider $\omega_{1,0}$ as the pullback $(\eta \circ \eta')^*(\tilde\omega)$ associated with the conformal embedding $\eta \circ \eta'$ introduced in (3), where $\tilde\omega$ is the two-point function of a certain state in a hyperbolic subset of the Einstein universe that contains the image of $M$ under $\eta \circ \eta'$ together with its past boundary $\Im^- \cup i^-$. Notice that on the latter, the field theory looks like a massless Klein-Gordon field perturbed with an external potential of the form $m^2 a(\tau)^2 (1+(\tau+r)^2)(1+(\tau-r)^2)/4$, and hence $\tilde\omega$ is a state for this particular theory. It happens that, if that potential is regular on the extended spacetime in a neighborhood of $\Im^- \cup i^-$, then the state $\omega_{1,0}$, defined in such a way, turns out to be of Hadamard form.
5 More details in this respect can be found in [LR90,Ol07].



Theorem 3.1. Suppose that $M = I \times \mathbb{R}^3$, where $I = (0,t_0)$, and that $a(t)$ is such that the region $t \to 0$ corresponds to the past null infinity of $(M, g_M)$ in $(M_e, g_e)$ and, furthermore, such that the smooth function
\[ (1+(\tau+r)^2)(1+(\tau-r)^2)\, m^2 a(\tau)^2 \tag{14} \]
defined on $M$, seen as embedded in the Einstein universe $(M_e, g_e)$, can be smoothly extended to a neighborhood $O \subset M_e$ containing the past boundary of $M$, here indicated as $\Im^- \cup i^-$. Under these hypotheses the state $\omega_{1,0}$, constructed out of the solutions $\chi_k$ that satisfy the initial condition on $\Im^-$, satisfies the μSC.

Proof. The proof can be obtained following ideas similar to those presented in [Mo08,Ho00] or, alternatively, in [DMP09b]. We shall summarize here the main strategy and discuss only the key point in detail. We would like to show that the two-point function $\omega_{1,0}$ satisfies the microlocal spectrum condition, hence we have to analyze its wave front set and show that
\[ WF(\omega_{1,0}) = \left\{ (x,y,k_x,k_y) \in T^*M^2 \setminus \{0\},\ (x,k_x) \sim (y,-k_y),\ k_x \triangleright 0 \right\}. \tag{15} \]
Notice that, in showing the preceding equality, the difficult part is to prove the inclusion $\subset$. If it holds true, the other inclusion descends from the fact that $\omega_{1,0}$ is a state [SV01,SVW02] and from an application of Hörmander's theorem on the propagation of singularities [DH72]. In order to show the validity of the inclusion $\subset$ we can rewrite $\omega_{1,0}(f,g)$ as $w(Ef|_{\Im^-}, Eg|_{\Im^-})$, where $w$ is a certain distribution on $\Im^- \times \Im^-$ defined employing the Fourier-Plancherel transform$^6$ along the $\tau$-coordinate on $\Im^-$:
\[ w(\psi_1,\psi_2) := \int_{S^2} \int_0^\infty 2k\, \overline{\hat\psi_1(k,\theta)}\, \hat\psi_2(k,\theta)\, dk\, dS^2, \]
where $dS^2$ is the measure on the two dimensional unit sphere $S^2$ and $\theta$ is shorthand notation for the standard spherical coordinates. A discussion of such a distribution can be found in Theorem 3.1 of [Mo08]. Furthermore, $Ef|_{\Im^-}$ is the restriction to $\Im^-$ of the wave function $E(f)$ constructed by means of the causal propagator $E$, multiplied by a certain factor $\Omega$. That restriction can be safely computed in an Einstein universe, and the multiplication by the factor $\Omega$ corresponds to the conformal transformation that maps solutions of $P$ in $M$ to solutions of $P_e$ in $M_e$. We have indicated by $P_e$ the operator $-\Box_e + \frac{1}{6}R_e + m^2 a^2 \Omega^2$, where $\Box_e$ is the d'Alembert operator associated with the Einstein metric. Hence, the two-point function $\omega_{1,0}$ can be seen as a composition of distributions, $\omega_{1,0} := w \circ (E|_{\Im^-} \otimes E|_{\Im^-})$. Because of the non compact nature of the restriction of $E$ to $\Im^-$, which persists also if the other entry is localized on a compact set of $M$, the previous composition of distributions cannot be straightforwardly justified by an application of Theorem 8.2.13 of [Ho89]. On the other hand, in Proposition 4.3 of [Mo08] it is proven that a similar composition is well defined and that the result has the desired wave front set provided the following two facts hold. First, considering a compact set $O$ inside $M$, there exists a neighborhood $O_{i^-}$ of $i^-$ in $M_e$ such that there are no lightlike geodesics connecting any point in $O$ with those in $O_{i^-}$. Second, if one entry of the causal propagator is restricted to $O$ and the other to $\Im^-$, the causal propagator itself must fall off sufficiently fast towards $i^-$ in $M$.
6 This operation is introduced and discussed in the Appendix of [Mo08].


Both requirements hold under the hypotheses of the present theorem; the first descends from the fact that $M$ is conformally flat, and the latter is a consequence of the fact that $m^2 a(\tau)^2 (1+(\tau+r)^2)(1+(\tau-r)^2)$ can be smoothly extended to a neighborhood of $i^-$ in $M_e$, hence the same result as the one stated in Lemma 4.2 of [Mo08] holds in the present case too.

Notice that the constraints given above are only sufficient but not necessary; in fact, in [DMP09b] it has been proven that $\omega_{1,0}$ is a well defined Hadamard state also when the spacetime is asymptotically de Sitter in the past, and in such a case the potential that appears in the equation of motion on the Einstein universe is singular on $i^-$. Hence the hypothesis of the smooth extension of the function (14) over $\Im^- \cup i^-$ in $M_e$ can be relaxed. Finally, notice that, employing techniques similar to those presented in [DMP09a] and in [DMP09b], based on a careful analysis of the modes $\chi_k(\tau)$, it is actually possible to obtain the Hadamard property for the state $\omega_{1,0}$ also when (14) can be extended over $\Im^- \cup i^-$ only with the regularity of $C^1(M_e)$. We would like to conclude this section by introducing a theorem that gives a necessary and sufficient condition for the state $\omega_{A,B}$ to be of Hadamard form.

Theorem 3.2. The quasi-free state $\omega_{A,B} \in \mathcal{C}(M)$ satisfies the microlocal spectrum condition (μSC) if and only if $|B(k)|$ decreases rapidly for large $k$; in other words, iff for every $n$ there are constants $c_n$ and $\bar k$ such that
\[ |k|^n |B(k)| \le c_n, \qquad \forall\, |k| > \bar k. \]

Proof. The proof of this fact can be easily done using the formulation of the state on a given Cauchy surface; see [LR90] for a discussion. Theorem 3.1 implies that $\omega_{1,0}$ is of Hadamard form, hence it is sufficient to consider the difference $\omega_{A,B} - \omega_{1,0}$. Notice that, when restricted to a Cauchy surface, the difference between the two-point functions $\omega_{A,B} - \omega_{1,0}$ becomes smooth if and only if $B(k)$ decreases rapidly (and this holds for the time derivative of one or both of its entries too). This proves that the rapid decrease of $|B(k)|$ is a necessary condition for the validity of the μSC. Finally, in order to prove that it is also sufficient, we can make use of a formula like the one given in Eq. 3.60 of [KW91] and reconstruct $\omega_{A,B} - \omega_{1,0}$ out of its form on a Cauchy surface. We can extend the result about the smoothness to the whole $M^2$ by noticing that each of the four terms appearing in formula 3.60, cited above, can be obtained by composing $\omega_{A,B} - \omega_{1,0}$ (sometimes with a time derivative applied to one or both of its entries) with the tensor product $E \otimes E$ (sometimes taken with a time derivative applied to one or both of its entries). The proof can be easily finished by noticing that $E$ maps compactly supported smooth functions on the Cauchy surface to smooth functions in the spacetime.

4. Semiclassical Treatment of Backreaction and Existence of Solutions

4.1. Semiclassical approximation. We shall now proceed to treat the backreaction of quantum matter on gravity and, as discussed in the Introduction, we shall do it for flat FRW spacetimes. With such a large symmetry, the Einstein equations take the well known Friedmann-Robertson-Walker form. Namely, if the stress tensor of matter is presented in diagonal form $\mathrm{diag}(-\rho, P, P, P)$, and if we indicate by $\dot a$ the derivative of $a$ with respect to the cosmological time $t$ and $H := a^{-1}\dot a$, they become
\[ 3H^2 = 8\pi\rho - \frac{3\kappa}{a^2}, \qquad 3\left( \dot H + H^2 \right) = -4\pi(\rho + 3P). \]



Because of the conservation of the stress tensor we can instead solve the equation for the trace of $G_{\mu\nu}$ and of $T_{\mu\nu}$, $-R = 8\pi T$, and then employ the first FRW equation as a constraint that needs to be imposed only at a fixed time, in order to fix an initial condition. We shall assume this point of view and, in the semiclassical scenario, we simply have to substitute the classical $T$ with the expectation value of the corresponding observable $\langle T \rangle_\omega$ computed in the chosen quantum state $\omega$. More precisely, as discussed in the Introduction, we have to select a class of states (one for every $a(t)$), and then the semiclassical equation $-R = 8\pi\langle T \rangle_\omega$ acts as a constraint on this class. Notice that, by employing the initial values on the null boundary, we have already disentangled the problem of the definition of the class of states from the geometric properties of the spacetime. Hence the backreaction of such a system can be treated simply by employing the equation for the trace. But in the trace, taking the conformal coupling and a non-vanishing mass, the explicit form of the state becomes important. More precisely, the expectation value of the trace of the stress tensor in a state $\omega$ is
\[ \langle T \rangle_\omega := \frac{2[V_1]}{8\pi^2} + \left( -3\left( \frac{1}{6} - \xi \right)\Box - m^2 \right) \langle\phi^2\rangle_\omega . \]
Above, $[V_1]$ is the coinciding point limit of the second Hadamard coefficient (see [SV01,Mo03,DFP08] for more details). This contribution to the trace coincides with the well known trace anomaly displayed by the stress tensor [Wa78b] in the massless case. Coming back to the dynamical equations, in the case of a flat FRW metric, the equation of motion for the scaling factor is
\[ -6\left( \dot H + 2H^2 \right) = -8\pi m^2 \langle\phi^2\rangle_\omega - \frac{1}{30\pi}\left( \dot H H^2 + H^4 \right). \tag{16} \]
In writing the preceding equation we have chosen the renormalization parameters in such a way that the higher derivatives of the coefficients of the metric disappear. There is an extra renormalization freedom that we have not considered here, but it can always be reabsorbed in the freedom present in $\langle\phi^2\rangle_\omega$.
It seems important to notice that, apart from renormalization freedom, the trace anomaly is fixed and it is really a c-number. For the discussion of these and other points we refer to [DFP08]. Hence, in the case of vanishing $m$, the state does not enter anymore in (16) and an exact solution can be easily found.$^7$ This solution can be written in an implicit form as
\[ e^{4tH_+} = e^{2H_+/H} \left| \frac{H + H_+}{H - H_+} \right|, \tag{17} \]
where the time of the big bang is fixed to be $t = 0$ and $H_+^2 = 360\pi$ in natural units, that is $G = \hbar = c = 1$. The massless solution (17) presents three branches for $H > 0$ (the first $0 < H^2 < H_+^2/2$, the second $H_+^2/2 < H^2 < H_+^2$ and the last $H^2 > H_+^2$); we concentrate on the upper one and discuss some of its properties. Notice that it corresponds to a universe that shows, at the beginning, a phase typical of the power law inflationary scenario, while it ends up in a phase which corresponds to a de Sitter universe. It is a remarkable fact that the form of the initial singularity is lightlike; hence, for example, the horizon problem is not present in this simple model. The Hubble constant $H$ of the de Sitter phase shown in the asymptotic future is equal to $H_+$, and it is many orders of magnitude too big to explain the present acceleration of the universe. Despite this fact, in [DFP08] it has been shown that, in the case of massive fields, this parameter is not fixed and needs to be interpreted as an extra renormalization freedom. Next we shall make use of the lightlike nature of the initial singularity to see if it is possible to obtain an exact solution also in the case of massive fields, and we shall look for solutions that are similar to the massless one at the beginning of the universe. In this respect, we shall treat $H_+$ as a renormalization constant. Furthermore, since in the massless case only the anomaly appears in the trace, and it is a c-number, it has a vanishing variance; hence, such an equation should hold also for large $H$ (as is the case close to the initial singularity $t = 0$ in (17)). Of course, the situation changes when the mass is different from zero, and in this case the result provided by the semiclassical Einstein equations needs to be interpreted more carefully. We shall comment on these points in the next section.
7 See [Wa78a,DFP08] for more details.
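The claim that (17) solves the massless form of (16) can be checked numerically. The sketch below assumes the reading of (16) in which, for $m = 0$, solving for the derivative gives $\dot H = H^2 (H_+^2 - H^2)/(H^2 - H_+^2/2)$; it then compares $1/\dot H$ with a finite-difference derivative of $t(H)$ obtained from (17) on the upper branch $H > H_+$. The function names, the sample values of $H$ and the step size are illustrative choices.

```python
import math

H_PLUS = math.sqrt(360.0 * math.pi)  # H_+^2 = 360 pi in natural units

def t_of_H(H):
    """Cosmological time from (17), solved for t on the upper branch H > H_+."""
    return (2.0 * H_PLUS / H + math.log((H + H_PLUS) / (H - H_PLUS))) / (4.0 * H_PLUS)

def H_dot(H):
    """dH/dt from the massless form of (16), solved for the derivative (assumed reading)."""
    return H * H * (H_PLUS**2 - H * H) / (H * H - 0.5 * H_PLUS**2)

def residual(H, rel_step=1e-6):
    """|dt/dH * dH/dt - 1|, with dt/dH taken by a central finite difference."""
    h = rel_step * H
    dt_dH = (t_of_H(H + h) - t_of_H(H - h)) / (2.0 * h)
    return abs(dt_dH * H_dot(H) - 1.0)

samples = [1.05 * H_PLUS, 2.0 * H_PLUS, 10.0 * H_PLUS, 100.0 * H_PLUS]
worst = max(residual(H) for H in samples)
print("largest residual:", worst)

# Large H corresponds to early times (t -> 0); H -> H_+ corresponds to late times.
print(t_of_H(1e6 * H_PLUS), t_of_H(1.0001 * H_PLUS))
```

On this branch $H$ decreases monotonically from arbitrarily large values at $t = 0$ towards $H_+$, which is the de Sitter phase described above.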

4.2. Initial conditions at the beginning of the universe. In this subsection we would like to discuss the existence of solutions of the semiclassical Einstein equations (16) that are as close as possible to (17), at least on small intervals of proper time just after the initial singularity. Hence, we shall assume the form of the singularity to be lightlike, just as in the massless case (17). As previously discussed, this task can be accomplished only after a careful analysis of the class of states we have to use in order to obtain the expectation values of $\phi^2$ in the different spacetimes we are considering. In any case, the requirement of having a lightlike initial singularity is essential in order to apply the results of the preceding sections and to have Hadamard states. The states we would like to choose are those constructed in the preceding part of this paper and denoted by $\omega_{A,B}$ with $B$ of rapid decrease. We shall start by considering only the state $\omega_{1,0}$, and we notice that this state is defined by the requirement of being a vacuum state with respect to the conformal time on the initial singularity (this procedure works only for lightlike initial singularities). Before proceeding with the discussion of Eq. (16) we have to discuss the form of the expectation value of $\phi^2$ on these states and to fix once and for all the renormalization freedom, according to the picture presented in [BFV03]. Since we would like to impose initial conditions on $\Im^-$, or better at $\tau \to -\infty$, where the Hubble constant $H$ usually diverges, it is useful to rewrite the dynamical equation for its inverse, which we shall call
\[ X := \frac{1}{H}. \]
With this in mind, we can rewrite Eq. (16) as follows:
\[ \frac{dX}{dt} = 1 - \frac{H_c^2 X^2}{1 - H_c^2 X^2} + \frac{C_m^2 X^4}{1 - H_c^2 X^2}\, \langle\phi^2\rangle_{\omega_{1,0}}, \tag{18} \]
where $H_c^2 := H_+^2/2 = 180\pi$ and $C_m^2 := 240\pi^2 m^2$ are two constants. Notice that, since $\langle\phi^2\rangle_{\omega_{1,0}}$ is a functional of $a(t)$ and of $\tau$, the proof of the existence of solutions of (16) needs some effort, and in particular the functional dependence of $\langle\phi^2\rangle_{\omega_{1,0}}$ upon $X = H^{-1}$ needs to be carefully analyzed.



We intend to prove the existence of solutions of (16) (or better of (18)) in a certain compact set $B_c$ of a Banach space $B$. To this end let us start by introducing these spaces of functions.

Definition 4.1. Let $t_0 > 0$. Consider the set $B$ as the subset of elements of $C^1([0,t_0])$ that vanish at $t = 0$ and whose first derivative is bounded on $[0,t_0]$. $B$ is a Banach space when equipped with the norm
\[ \| f \|_B := \sup_{[0,t_0]} \left| \dot f(t) \right| . \]
For every $\frac{2}{3} < c < 1$ we define $B_c$ as the closed ball in $B$ of radius $1/2 - c/2$ centered in
\[ f_0(t) := \frac{1+c}{2}\, t \qquad (t \in [0,t_0]). \tag{19} \]

The strategy of the proof of existence of solutions of (18) is the following. We define a map $T : B_c \to B$ by
\[ T(X)(t) := \int_0^t \left( \frac{1 - 2H_c^2 X^2}{1 - H_c^2 X^2} + \frac{C_m^2 X^4}{1 - H_c^2 X^2}\, \langle\phi^2\rangle_{\omega_{1,0}}[X] \right) dt', \tag{20} \]
and we shall show that $T$ acts as a contraction on $B_c$. Hence existence and uniqueness descend from an easy application of the Banach fixed point theorem. Thus, if we succeed in showing that $T$ is a contraction on $B_c$, we will have that the solution of the semiclassical Einstein equations enjoys a phase of power law inflation, like every element of $B_c$. We shall now discuss how it is possible to associate a spacetime to an $X \in B_c$ in such a way that the initial conditions are implemented.

Proposition 4.1. Fix the three constants $t_0$, $a_0$ and $\tau_0$. Then every element $X$ of $B_c$ determines a spacetime $(M, g[X])$, where $M = [0,t_0] \times \mathbb{R}^3$ and where $g[X]$ is the metric (2) constructed out of the following scaling factor:
\[ a[X](t) := a_0 \exp\left( -\int_t^{t_0} X(t')^{-1}\, dt' \right). \]
The Hubble parameter shown by $(M, g[X])$ is then $H(t) = X(t)^{-1}$, and the corresponding conformal time
\[ \tau[X](t) := \tau_0 - \int_t^{t_0} a[X](t')^{-1}\, dt' \]
is a functional of $X$ too. Finally, the spacetime enjoys, by construction, the following initial conditions:
\[ X(0) = 0, \qquad a(t_0) = a_0, \qquad \tau(t_0) = \tau_0 . \tag{21} \]
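The fixed point strategy can be illustrated in the simplest situation, the massless case, where the $\langle\phi^2\rangle$ term in (20) is absent and the exact answer (17) is available for comparison. The following is only a sketch under stated assumptions: the map $T$ is discretized on a uniform grid with the trapezoid rule, and the values $c = 0.7$, $t_0 = 0.01$ and the grid size are illustrative choices (with $t_0$ small enough that $H_c^2 X^2$ stays well below 1). It does not address the massive case treated in the text.

```python
import math

H_C2 = 180.0 * math.pi           # H_c^2 = H_+^2 / 2
H_PLUS = math.sqrt(2.0 * H_C2)   # H_+ = sqrt(360 pi)

def rhs(x):
    """Massless integrand of (20): (1 - 2 H_c^2 X^2) / (1 - H_c^2 X^2)."""
    u = H_C2 * x * x
    return (1.0 - 2.0 * u) / (1.0 - u)

def apply_T(xs, ts):
    """Discretized map T: cumulative trapezoid integral of rhs(X) from 0 to t."""
    out = [0.0]
    for i in range(1, len(ts)):
        h = ts[i] - ts[i - 1]
        out.append(out[-1] + 0.5 * h * (rhs(xs[i]) + rhs(xs[i - 1])))
    return out

def t_exact(x):
    """Implicit solution (17) solved for t, with X = 1/H on the upper branch."""
    return 0.5 * x + math.log((1.0 + H_PLUS * x) / (1.0 - H_PLUS * x)) / (4.0 * H_PLUS)

c, t0, n = 0.7, 0.01, 2000                  # illustrative parameters (assumptions)
ts = [t0 * i / n for i in range(n + 1)]
xs = [0.5 * (1.0 + c) * t for t in ts]      # start from the centre f_0 of B_c, Eq. (19)

dists = []
for _ in range(30):
    new = apply_T(xs, ts)
    # sup over the grid of |rhs(X_{n+1}) - rhs(X_n)|, i.e. the B-norm of
    # T(X_{n+1}) - T(X_n), since the derivative of T(X) is the integrand.
    dists.append(max(abs(rhs(a) - rhs(b)) for a, b in zip(new, xs)))
    xs = new

# Compare the numerical fixed point with the implicit solution (17).
residual = max(abs(t_exact(x) - t) for x, t in zip(xs, ts))
# Discrete slopes of the fixed point; membership in B_c requires c <= slope <= 1.
slopes = [(xs[i] - xs[i - 1]) / (ts[i] - ts[i - 1]) for i in range(1, n + 1)]

print("first distances:", dists[:4])
print("residual against (17):", residual)
print("slope range:", min(slopes), max(slopes))
```

In this sketch the distances between successive iterates shrink geometrically, the limit reproduces (17) up to discretization error, and its slope stays between $c$ and 1, so the iterates remain in the discretized analogue of $B_c$.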


For every $X$ in $B_c$ we have immediately, from the definition, that the initial conditions (21) given above are satisfied, and the functionals $a[X]$ and $\tau[X]$ are, moreover, well defined on $B_c$. Notice that, on $B_c$,
\[ \lim_{t\to 0} \int_t^{t_0} X^{-1}(t')\, dt' = \infty . \]
From this it follows that $a(0) = 0$, with $a$ vanishing sufficiently fast to ensure that $\tau[X] \to -\infty$ for $t \to 0$. This last requirement is necessary in order for the resulting spacetime to have a lightlike past boundary or, in other words, for the initial surface $t = 0$ to be of null type. As already discussed, this is a necessary prerequisite for the construction of Hadamard states on $(M, g)$. We have in fact the following two propositions.

Proposition 4.2. For every spacetime characterized by a metric of FRW type whose scaling factor $a(t)$ is such that $X(t) = \frac{a(t)}{\dot a(t)}$ is in $B_c \cap C^\infty([0,t_0])$, we can construct $\omega_{1,0}$ as in (13), and it determines a quasi-free state $\omega_{1,0}$ which, moreover, satisfies the microlocal spectrum condition.

Proof. If the analogue of function (14) can be smoothly extended over $\Im^- \cup i^-$, the proof descends from Theorem 3.1; otherwise, also according to the discussion given after that theorem, we can adapt the proof of Theorem 3.1 given in [DMP09b] to the present case.

Notice that if we drop the requirement of smoothness and replace it only with $X \in B_c$, the construction of the state is still valid and, even if the resulting two-point function is not of Hadamard form, we can still apply the procedure to compute expectation values of $\phi^2$. Unfortunately, for fields that contain derivatives, it is no longer guaranteed that the point splitting regularization works for these states (they are adiabatic states of suitable order, in the sense given in [JS02]). We shall conclude this subsection by noticing that, since $a[X]$ is positive, $\tau[X](t)$ can always be inverted; furthermore, on $B_c$ some useful inequalities hold. We shall summarize them in the next proposition.

Proposition 4.3. The following inequalities hold for every $X$ in $B_c$ with $2/3 < c < 1$ and for $t \in [0,t_0]$:
\[ c \le \frac{X(t)}{t} \le 1, \qquad 1 \le \frac{t}{X(t)} \le \frac{1}{c}, \qquad \left( \frac{t}{t_0} \right)^{1/c} \le \frac{a[X](t)}{a_0} \le \frac{t}{t_0} . \]
The proof descends from a straightforward application of the definition of $a$ and $\tau$ in terms of $H$ and from the bounds satisfied by the elements of $B_c$.

4.3. Expectation value of $\phi^2$ on the state $\omega_{1,0}$, and further renormalization freedom. We are now in a position to discuss the form of $\langle\phi^2\rangle_{\omega_{1,0}}$ constructed by point splitting regularization on the quasi-free state $\omega_{1,0}$ introduced above in (13); namely, we can construct it by
\[ \langle\phi^2(x)\rangle_{\omega_{1,0}} = \lim_{y\to x} \left[ \omega_{1,0}(x,y) - \mathcal{H}(x,y) \right] + \alpha R(x) + \beta m^2, \tag{22} \]



where $\mathcal{H}$ is the Hadamard parametrix (7) constructed in the spacetime $(M, g)$, and $\alpha$ and $\beta$ are renormalization constants. At this point it is important to stress that $\alpha$ and $\beta$ are really constants, in the sense that they do not depend on the form of the spacetime. In other words, if they can be fixed in a specific spacetime, they are fixed on all of them. This is a consequence of the general covariance presented in [BFV03], which uses the hypothesis that the fields transform covariantly under isometries in combination with spacetime deformation arguments. According to the discussion presented in [DFP08], we would like to have Minkowski spacetime as a solution of the semiclassical equation (16) when the Minkowski vacuum is employed. This is tantamount to requiring that the expectation value of $\phi^2$ on the Minkowski vacuum in Minkowski spacetime vanishes, and this fixes the renormalization constant $\beta$ to a certain value. We notice also that the other renormalization constant $\alpha$ can be used in order to adjust the parameter $H_c$ present in Eq. (18); hence from now on we shall consider $H_c$ as a renormalization constant [DFP08]. Once we have a prescription to fix the renormalization freedom we can start to discuss the Hadamard regularization; to this end we need to know the Hadamard parametrix $\mathcal{H}$ in some detail. For our purposes, since the modes $\chi$ are easily treated when analyzed on the conformal Minkowski background, it would be desirable to perform the subtraction $\omega_{1,0} - \mathcal{H}$ directly on Minkowski spacetime. To this end we have to analyze how $\omega_{1,0} - \mathcal{H}$ transforms under the push forward $\eta_*$ associated with the conformal transformation $\eta : (M, g) \to (M, g_M)$ introduced in (3). Notice that $\eta_*$ maps $\omega_{1,0}$ to $a(t_1)a(t_2)\omega_{1,0}$, seen as a distribution on Minkowski spacetime, and moreover, since the conformal transformation preserves the Hadamard property [Pi09], $a(t_1)a(t_2)\omega_{1,0}$ also satisfies the μSC in $M$.
We can hence regularize it by means of the Hadamard parametrix $\mathcal{H}_M$ constructed in Minkowski spacetime for the field theory $P_M\phi = 0$ with $P_M := -\Box_M + a^2 m^2$. Notice that, under the conformal transformation $\eta$, the mass term on $M$ becomes a position dependent potential $a(t)^2 m^2$. Let us analyze the difference of the two regularization schemes. In this respect, we already know that the microlocal spectrum condition, together with the causal propagator, is preserved by a conformal transformation [Pi09], and this holds also if the conformally coupled quantum field has a mass. Of course, under a conformal transformation, the symmetric part of the Hadamard function is not preserved, hence the coinciding point limit of the difference does not vanish in general. Actually, taking the same scale $\lambda$ in the logarithmic divergence, we obtain that
\[ \lim_{x_1\to x_2} \left[ \mathcal{H}(x_1,x_2) - \frac{1}{a(\tau_1)a(\tau_2)}\, \mathcal{H}_M(x_1,x_2) \right] = \frac{m^2}{8\pi^2} \log a(\tau_2) + c\, R(x_2), \tag{23} \]
where $c$ is a fixed constant. Notice that an unexpected logarithmic dependence on the scaling factor, $\log(a(\tau))$, appears and we have to treat it carefully. Actually, we shall see that in the final formula of the present subsection this contribution is cancelled. On the other hand, the term proportional to $R$, which appears when the two different renormalization schemes are used, falls into the class of the standard regularization freedom [HW01] and, thanks to the previous discussion, we can incorporate it in the definition of $H_c$ and forget it. Furthermore, since we are only interested in computing the expectation value of $\phi^2$, it is possible to regularize by subtracting an approximated version of the Hadamard parametrix constructed just out of the first Hadamard coefficient (see [SV01,Mo03] for more details)


\[ \mathcal{H}^0_M := \lim_{\epsilon\to 0} \left( \frac{1}{8\pi^2}\, \frac{1}{\sigma_\epsilon} + V_0 \log\frac{\sigma_\epsilon}{\lambda^2} \right). \]
According to the analysis performed in Chap. 4 of [Fr75], $V_0(x,x')$ for spacelike separated $x$ and $x'$ can be found as the result of the following integral:
\[ V_0(x,x') := \frac{1}{2}\, U(x,x') \int_0^1 \frac{P_M U}{U}(x,\gamma(s))\, ds, \]
where the integration is done in the affine parameter $s$ of the geodesic $\gamma$ connecting $x$ and $x'$, with $\gamma(0) = x$ and $\gamma(1) = x'$. The previous integral can be easily performed on Minkowski spacetime, where $U = 1/(8\pi^2)$, $P_M = -\Box_M + m^2 a^2$, and the geodesics are simply the straight lines connecting $x$ and $x'$. Furthermore, if we choose $x$ and $x'$ as $(\tau,\mathbf{x})$ and $(\tau,\mathbf{x}')$ in the Minkowskian coordinates, we obtain
\[ V_0(x,x') = \frac{1}{(4\pi)^2}\, m^2 a(\tau)^2 . \]
We can now write the Hadamard parametrix at first order, for $x$ and $x'$ lying on the same Cauchy surface $\Sigma_\tau$, as
\[ \mathcal{H}^0_M(x,x') := \frac{1}{(4\pi)^2} \left( \frac{2}{\sigma} + m^2 a(\tau)^2 \log\frac{\sigma}{\lambda^2} \right). \]
Notice that the first two contributions are the same as those of a massive Klein-Gordon field propagating in Minkowski spacetime with squared mass equal to $m^2 a^2(\tau)$ (where here $\tau$ is fixed by the choice of $\Sigma_\tau$). Furthermore, we know that in this case the coinciding point limit of the following expression yields
\[ \lim_{x'\to x} \left[ \mathcal{H}^0_M(x,x') - \frac{1}{(2\pi)^3} \int \frac{e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}}{2\sqrt{k^2 + m^2 a(\tau)^2}}\, d\mathbf{k} \right] = \frac{m^2 a^2}{(4\pi)^2} \left( -\log\left( \frac{ma}{\lambda} \right)^2 + c \right), \]
where $x = (\tau,\mathbf{x})$ and $x' = (\tau,\mathbf{x}')$, while $c$ is a constant. The previous expression has to be understood in the sense of distributions. Hence, as a side remark, it is clear that the Hadamard regularization and the first order adiabatic regularization [Pa69] for the computation of the expectation values of $\phi^2$ coincide up to the choice of a renormalization constant. Furthermore, a direct computation shows that, for every $x$ and $x'$ lying on the same Cauchy surface $\Sigma_\tau$,
\[ \lim_{x'\to x} \left[ \mathcal{H}^0_M(x,x') - \frac{1}{(2\pi)^3} \int e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')} \left( \frac{1}{2|\mathbf{k}|} - \frac{m^2 a(\tau)^2}{4|\mathbf{k}|^3}\, \Theta(|\mathbf{k}| - ma(\tau)) \right) d\mathbf{k} \right] = \frac{m^2 a^2}{8\pi^2} \left( -\log\frac{ma}{\lambda} + c \right), \]
and hence we could use the right hand side to regularize the states in order to obtain the expectation value of $\phi^2$. Once again we incorporate the constant $c$ in the renormalization constant $\beta$, which we shall fix a posteriori according to the discussion given above. Notice that the $\log(a)$ term present in the preceding expression will be cancelled by the other $\log(a)$ term that arises from the conformal transformation of the Hadamard function



in (23). Thus, these terms are no longer manifestly present in the expectation value of $\phi^2$. Eventually, also in the computation of the expectation value of $\phi^2$, we are free to restrict both $x$ and $x'$ to the same Cauchy surface, determined by $\tau_1 = \tau_2 = \tau$, and, after the subtraction, we can perform the coinciding point limit. Hence, making use of the previous equation and of (23), Eq. (22) can be rewritten as
\[ \langle\phi^2(x)\rangle_{\omega_{1,0}} := \lim_{x'\to x} \frac{1}{(2\pi)^3 a(\tau)^2} \int d\mathbf{k}\; e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')} \left[ \overline{\chi_k}(\tau)\chi_k(\tau) - \left( \frac{1}{2|\mathbf{k}|} - \frac{m^2 a(\tau)^2}{4|\mathbf{k}|^3}\, \Theta(|\mathbf{k}| - ma(\tau)) \right) \right] + \beta m^2 + \frac{m^2}{8\pi^2} \log\frac{m}{\lambda}, \tag{24} \]
where the integrals present above are to be understood in the distributional sense, $\beta$ is the residual renormalization freedom and $\lambda$ is the length scale present in the Hadamard parametrix. From now on, we shall fix $\beta$ to the value $-1/(8\pi^2)$. We stress that other choices of $\beta$ would not alter the final result. The requirement that $\langle\phi^2\rangle_{\omega_{1,0}}$ vanishes in the limit of flat spacetime would also fix the scale $\lambda$ equal to $m$; however, in order for $\langle\phi^2(x)\rangle_{\omega_{1,0}}$ to be analytic in the mass we shall not make this choice until the very end. It will turn out that the choice of $\lambda$ does not alter the results presented below. We conclude this section with a remark. Since the map $t \mapsto \tau$ is a diffeomorphism, the previous expression can be easily given in cosmological coordinates, just employing $a(t)$ and $\chi_k(t)$ in place of $a(\tau)$ and $\chi_k(\tau)$, where, as discussed above, $a(\tau) = (a \circ \tau^{-1})(\tau)$ and $\chi_k(t) := (\chi_k \circ \tau)(t)$.

4.4. Recursive constructions and bounds satisfied by $\chi_k$ and by the auxiliary functional. The aim of the present section is to find suitable bounds satisfied by $\omega_{1,0}(\phi^2)$ for small values of $a$ and close to $t = 0$. In order to accomplish this task we need better control on the explicit form of the $\chi_k$ appearing in (24). Hence, here we shall give a recursive construction for the $\chi_k$ used in the definition of $\omega_{1,0}$, viewing $m$ as a perturbation parameter.
While the analysis of the state, and of the modes $\chi_k$ in particular, is easier if performed in the conformal time $\tau$, the form of the differential equation (16) suggests the use of the cosmological time $t$. Here, as previously said, since the map $\tau$ from the cosmological time to the conformal one is a diffeomorphism, without ambiguity we shall write $\chi_k(t)$ for $(\chi_k\circ\tau)(t)$, where $\chi_k(\tau)$ is a solution of (9) with the initial conditions (11). Following ideas similar to those presented in [FH90], though in another context, we can write $\chi_k(\tau)$ by means of perturbation theory, hence in terms of a Dyson series over the massless solution $\chi_k^0$, treating $m^2a(\tau)^2$ as the perturbation potential (a similar construction can be found in the proof of Theorem 4.5 in [DMP09a]). Finally, since we would like to use the cosmological time instead of the conformal time, we can simply compose $\chi_k$ with $\tau$ (writing $\chi_k(\tau(t))$) and then change the coordinates in every integral appearing in that series from the conformal time to the cosmological one. Following this procedure we obtain

$$\chi_k[a,\tau](t) = \sum_{n=0}^{\infty}\chi_k^n[a,\tau](t), \qquad (25)$$



where

$$\chi_k^n[a,\tau](t) = -\int_0^t \frac{\sin\big(k(\tau(t)-\tau(t'))\big)}{k}\,a(t')\,m^2\,\chi_k^{n-1}[a,\tau](t')\,dt', \qquad (26)$$

and

$$\chi_k^0[a,\tau](t) = \frac{e^{-ik\tau(t)}}{\sqrt{2k}}.$$

The advantage of using the cosmological time lies in the fact that it becomes clear how $\chi_k$ depends on $\tau$. Notice that $\tau$ appears only in $\sin(k(\tau(t)-\tau(t')))$ of (26) and in the form of $\chi_k^0$. This reduces the number of functional derivatives that need to be performed in the computation of the functional derivatives of $\chi_k$ with respect to the small variations $\delta X$ of $X$. The first thing we would like to check is that the perturbative construction is well defined, namely that the sum in (25) converges uniformly in time. We state the result about convergence in the following proposition and, although the proof is almost straightforward, we give it since it contains some useful remarks that we shall also use later.

Proposition 4.4. Let $(M,g)$ be the flat FRW spacetime determined by $X \in B_c$ according to the statement of Proposition 4.1, where $a$ is its scale factor and $\tau$ the corresponding conformal time. Then the series (25), constructed over $(M,g)$, converges absolutely to $\chi_k$ on $[0,t_0]$, where $\chi_k$ is a solution of Eq. (9), when read in conformal time. Moreover, it satisfies the initial condition (11). Furthermore $\chi_k^n$, given in (25), and $\chi_k$ satisfy the following bounds:

$$|\chi_k^n(t)| \le \frac{1}{\sqrt{2k}}\,\frac{(mt)^{2n}}{n!}, \qquad |\chi_k(t)| \le \frac{1}{\sqrt{2k}}\exp\left(\frac{m^2a(t)\,t}{k}\right), \qquad |\chi_k(t)| \le \frac{1}{\sqrt{2k}}\exp\left(m^2t^2\right).$$

Introducing the following notation

$$\chi_k^{>n}[a,\tau](t) = \sum_{i=n+1}^{\infty}\chi_k^i[a,\tau](t),$$

we also have that

$$\left|\chi_k^{>n}(t)\right| \le \frac{C_n(t)}{k^{n+\frac{3}{2}}},$$

which is valid for the positive functions $C_n(t) = (m^2a(t)t)^{n+1}\exp(m^2t^2)$.

Proof. By direct inspection, under the hypotheses stated in the proposition, if the series converges, it has to tend to a solution of the equation of motion (9) which satisfies the wanted initial conditions (11). More precisely, notice that

$$\left(\frac{\partial^2}{\partial\tau^2} + k^2\right)\chi_k^n(\tau) + m^2a^2\chi_k^{n-1}(\tau) = 0 \qquad\text{and}\qquad \left(\frac{\partial^2}{\partial\tau^2} + k^2\right)\chi_k^0(\tau) = 0.$$

Hence, let us apply the operator realizing the equation of motion (9) to the truncated series $(\chi_k - \chi_k^{>n})$. Using the inequalities given in the proposition, we obtain that the absolute value of the remainder of the latter operation is smaller than $a_0^2m^2(mt_0)^{2n}/(n!\sqrt{2k})$,



which tends uniformly to zero over $[0,t_0]$ for large $n$. At the same time, for every $n > 0$, $\lim_{\tau\to-\infty}e^{ik\tau}\chi_k^n(\tau) = 0$, and the same holds for the first $\tau$-derivative of $\chi_k^n$. What remains to be shown is the convergence and the validity of the estimates given in the statement of the proposition. We proceed by analyzing the $n$-th element of the sum. Looking at the recursion relations (26), it is a straightforward task to obtain

$$|\chi_k^n(t)| \le \int_0^t \frac{a(t')}{k}\,m^2\,|\chi_k^{n-1}(t')|\,dt', \qquad |\chi_k^n(t)| \le \int_0^t (\tau(t)-\tau(t'))\,a(t')\,m^2\,|\chi_k^{n-1}(t')|\,dt'.$$

Together with $|\chi_k^0| = (2k)^{-\frac{1}{2}}$, we obtain the following estimate

$$|\chi_k^n(t)| \le \frac{1}{\sqrt{2k}\,n!}\left(\int_0^t m^2(\tau(t)-\tau(t'))\,a(t')\,dt'\right)^{n},$$

where the $n!$ in the denominator arises, as usual, from the extension of the time ordered domain of integration ($0 \le t_1 \le t_2 \cdots \le t_n \le t$) of a symmetric function over a symmetric domain ($[0,t]^n$). The previous inequality implies both the absolute convergence of the series and the following bounds

$$|\chi_k(t)| \le \frac{1}{\sqrt{2k}}\exp\left(\int_0^t \frac{m^2}{k}\,a(t')\,dt'\right), \qquad |\chi_k(t)| \le \frac{1}{\sqrt{2k}}\exp\left(\int_0^t m^2(\tau(t)-\tau(t'))\,a(t')\,dt'\right).$$

Finally, since $a(t)$ is growing monotonically and since $\tau(t)-\tau(t') = \int_{t'}^{t}a(t'')^{-1}\,dt''$, the estimates presented in the proposition follow straightforwardly.

The previous proposition yields suitable estimates for $\chi_k[a,\tau](t)$. We shall present the derivation of further useful estimates, also for its functional derivative $D\chi_k[a,\tau,\delta X](t)$ with respect to the small variation $\delta X$ of $X$ in $B_c$, in Appendix A.2. We conclude this subsection by discussing some properties of the auxiliary functional on $B_c$ defined as

$$[X] := 2\pi^2a^2\langle\phi^2\rangle_{\omega_{1,0}},$$

(27)

which corresponds to the expression (24) multiplied by 2π 2 a 2 and evaluated in the proper time t instead of the conformal time τ . For that expression we have the following proposition whose long and technical proof can be found in Appendix A.3. Proposition 4.5. The following inequality holds on Bc : |[X ](t)| ≤ C1 (a0 , c, t0 )t. Furthermore, the functional derivative8 of , with respect to the small perturbation δ X of X in Bc , satisfies the following inequality: |D[X, δ X ](t)| ≤ C2 (a0 , c, t0 )t0 δ X B , where C1 (t0 , c, t0 ) and C2 (t0 , c, t0 ) are uniformly bounded for t0 → 0 when and c are fixed. Above · B is the norm of the Banach space B introduced in 4.1. 8 A nice introduction of these concepts can be found in [Ha82].
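The recursion (26) and the first bound of Proposition 4.4 can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not part of the original argument: it assumes the toy scale factor $a(t) = \sqrt{t}$ (so that $\tau(t) = 2\sqrt{t}$), which is monotonically growing as the proof requires, and the name `dyson_iterates`, the grid size and the trapezoidal quadrature are choices made here purely for illustration.

```python
import math
import numpy as np


def dyson_iterates(k, m, t_max, n_terms, n_grid=401):
    """Return the time grid and the iterates chi_k^0, ..., chi_k^{n_terms} of (26).

    Assumption (not from the paper): toy scale factor a(t) = sqrt(t),
    whose conformal time is tau(t) = 2 sqrt(t).
    """
    t = np.linspace(0.0, t_max, n_grid)
    dt = t[1] - t[0]
    a = np.sqrt(t)
    tau = 2.0 * np.sqrt(t)

    # kernel[i, j] = sin(k (tau_i - tau_j)) / k * a_j * m^2 for j <= i, else 0.
    dtau = tau[:, None] - tau[None, :]
    kernel = np.tril(np.sin(k * dtau) / k) * (a * m**2)[None, :]

    # Trapezoidal weights on [0, t_i]. The end point j = i carries a vanishing
    # kernel (sin(0) = 0), so one weight vector serves every row.
    weights = np.full(n_grid, dt)
    weights[0] = dt / 2.0

    chi = [np.exp(-1j * k * tau) / math.sqrt(2.0 * k)]  # massless mode chi_k^0
    for _ in range(n_terms):
        chi.append(-(kernel * weights[None, :]) @ chi[-1])
    return t, chi


if __name__ == "__main__":
    k, m = 1.0, 1.0
    t, chi = dyson_iterates(k=k, m=m, t_max=1.0, n_terms=4)
    for n, chi_n in enumerate(chi):
        bound = (m * t) ** (2 * n) / (math.sqrt(2.0 * k) * math.factorial(n))
        print(n, bool(np.all(np.abs(chi_n) <= bound + 1e-12)))
```

The loop at the bottom compares each iterate with $(mt)^{2n}/(\sqrt{2k}\,n!)$ on the whole grid; a different monotone scale factor can be tried by changing the two lines that define `a` and `tau` consistently.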



4.5. Existence of exact solutions of the semiclassical Einstein equations out of initial data given at the beginning of the universe. In this subsection, we shall present the main result of this paper, namely the existence of solutions of the semiclassical Einstein equations uniquely determined by some initial conditions and by the form of the state ω1,0 in particular. Later on we shall generalize the result to other Hadamard states in C (M). The proof we shall give is very similar to the Picard-Lindelöf proof of existence of solutions of first order differential equations together with the complication that the potential is a functional of the solution and not simply a function. Nonetheless, we shall show that it will be possible to use similar methods thanks to the estimates provided by Proposition 4.5. Hence, let us start by stating the following proposition which we shall use in order to prove that there exist solutions of (16). Proposition 4.6. Fix > 0 and let a0 = t0 , then, if t0 is sufficiently small, the image of Bc under the map T defined in (20), is contained in Bc . Furthermore, on Bc , T is a contraction. Proof. First of all we have to show that T maps Bc to Bc . In this respect, Proposition 4.2 and the subsequent remark ensure that we can compute for every element X of Bc ; hence we have to prove that T (X ) − f 0 B ≤

1−c , 2

where the norm · B is the one introduced in Definition 4.1 of Bc and f 0 is the center of the ball Bc introduced in (19). Fixing the two constants C1 = 120m 2 Hc−2 and C2 = (1 − c)Hc−2 , expanding the norm of B and using the definition of the map T in (20), the previous inequality turns out to be equivalent to −C2 + (2 − c)X 2 (t) ≤ C1

X 4 (t) (t) ≤ X 2 (t) a 2 (t)

∀t ∈ [0, t0 ].

Since X 2 ≤ t 2 ≤ t02 , if t0 < t0 for a sufficiently small t0 , the first expression in the previous chain of inequalities is certainly negative while its modulus is bigger than some ˆ Furthermore, since X 2 < Cˆ holds for a sufficiently small t0 , t0 -independent constant C. in order to verify the previous chain of inequalities it suffices to check that 1 a 2 (t) |(t)| ≤ ∀t ∈ [0, t0 ] C1 X 2 (t) which holds true for a certain t0 thanks both to the bounds, satisfied by the elements of Bc , given in Proposition 4.3 and to the estimates for |[X ](t)| found in Proposition 4.5. The next step consists in showing that T is a contraction on Bc with respect to the norm of B, that is, for every X 1 and X 2 in Bc , T (X 1 ) − T (X 2 )B < C X 1 − X 2 B , with 0 < C < 1. This property descends once more from the results stated in Proposition 4.5 and from the subsequent remark. Notice that for every 0 ≤ λ ≤ 1 we can consider X λ := λX 1 + (1 − λ)X 2 = X 2 + λ [X 1 − X 2 ] ,



where X λ is in Bc . Furthermore, indicating by δ X the difference X 1 − X 2 , we have 1 d d d T (X 1 ) (t) − d T (X 2 ) (t) ≤ T (X λ ) (t)dλ dt dt dλ dt 0 d ≤ sup D T (X λ , δ X ) (t) . dt λ∈[0,1] In order to conclude the proof we have to analyze the functional derivative of ddtT in Bc . In this latter step, the contribution which requires some care is the one arising from the functional derivative of X 4 a −2 , which, using both the estimates (31) derived in the Appendix and the result of Proposition 4.5, can be shown to satisfy the inequality 4 X sup D [X, δ X ] (t) ≤ C3 t0 δ X B , 2 a

t∈[0,t0 ]

for every X in Bc , where C3 is some constant. It is not difficult to show that similar estimates hold also for the other contributions to the functional derivative of ddtT . Hence, there is certainly a sufficiently small t0 for which C < 1. With the latter choice for t0 the map T acts as a contraction on Bc . Theorem 4.1. Fix and let a0 = t0 . If t0 is chosen sufficiently small, there exists a unique solution X = H −1 of (16) in Bc which can be constructed recursively starting from the massless solution X 0 := H0−1 implicitly given in (17). In other words the sequence X n := T (X n−1 ) , converges in Bc to a solution of the semiclassical Einstein equations with ω1,0 taken as reference state. Proof. On account of Proposition 4.6, the proof descends out of a straightforward application of the Banach fixed point theorem to the contraction T on Bc . Before concluding this subsection we would like to briefly comment on the result obtained above. First of all notice that the method to obtain the solution is constructive, in the sense that for every n, X n is closer to the solution than X n−1 . Furthermore, notice that if we start from H0 , which is a smooth function in Bc , every Hn is also smooth, because T maps smooth functions to smooth ones. Unfortunately out of the previous theorem, we cannot conclude that the found solution is smooth. In order to address this problem we should discuss the convergence of the series X n in the Fréchet spaces of smooth functions C ∞ [0, t0 ], but this lies out of the scope of the present paper. For our purpose, on account of the preceding discussion, it is enough to notice that we can find a smooth spacetime X n as close as we want, in the norm of B, to the the exact solution X provided by the previous theorem. Furthermore, the regularity displayed by the generic element of Bc is enough in order for Eq. (16) to be meaningful. At this point we would like to stress that in Eq. 
(16) nothing depends explicitly on a0 and actually, also in φ 2 ω1,0 as given in Eq. (24), a0 does not really appear. The latter statement becomes evident, re-scaling everything with respect to a0 , that is, using



k := a0−1 k, τ := a0 τ, a := a0−1 a and χkr (τ ) = equation

√ a0 χk (τ ) which now satisfy the

2 ∂2 r χk (τ ) + k + m 2 a χkr (τ ) = 0. 2 ∂τ Hence, once a solution of (16) is obtained, we can always find other solutions by choosing a different a0 and thus, a0 can be really considered a free parameter of the theory. As discussed at the beginning of this section we have used only part of the information present in the semiclassical Einstein equations, namely only the trace. In order to be sure that the found H (t) solves the full semiclassical Einstein equations we have to check that the constraint G 00 − 8π T00 ω1,0 = 0 holds at some instant of time. Unfortunately, this is not an easy task. Luckily enough, due to the large symmetry present in the system and to the conservation of the stress tensor, we can integrate the equation for the trace. Operating in this way, we obtain that such constraint holds up to a term of the form Ca(t)−4 . Thus, if the constraint is not satisfied, we can always set it to zero by adding a conformal invariant field, representing quantum radiation, in a suitable quantum state. Notice that if the matter content of the system is formed by a massive field and a conformal invariant one, the equation R + 8π Tμ μ ω1,0 = 0 looks exactly like (16) up to a trivial change of the coefficient −1/(30π ) to −2/(30π ). Such trivial modification does not alter the discussion present above about the recursive construction of the solution and its convergence. We conclude the present section noticing that the same result about the uniqueness and the existence of the solution of (16) holds also for other choices of Hadamard states in C (M) (see Definition 3.1) provided suitable conditions are satisfied. We have in fact the following theorem Theorem 4.2. Let ω A,B ∈ C (M) be constructed out of A and B in C 2 (R+ ), and such that, with respect to the auxiliary function f (k) := A(k)B(k)k 2 , the following conditions are satisfied a) b) c) d)

B is rapidly decreasing, A(k)B(k) and |B(k)|2 are contained in L 1 (R+ , kdk), f (0) = f (0) = 0, f , f ∈ L 1 (R+ , dk).

Let a0 = t0 with > 0; then, if t0 is sufficiently small, a unique solution of the semiclassical Einstein equations with ω A,B taken as reference state exists. Proof. The proof of the present theorem can be performed following the argument of Theorem 4.1. For this reason, here, we shall discuss only the differences. Let us start noticing that what changes in this case is the value of φ 2 . More precisely, instead of dealing with the map T , we have to deal with T which is defined as follows A,B X4 , 2 2 1 − Hc X a 2 − φ 2 ω1,0 which can be rewritten as

T (X ) := T (X ) +

where A,B := 2π 2 a 2 φ 2 ω A,B ∞ 2|B(k)|2 χk χk (t) + A(k)B(k) χk χk (t) + A(k)B(k) χk χk (t) k 2 dk. A,B (t) = 0



Since we already have the estimates for χk − χk0 given in Appendix A.2, it is useful to divide A,B in two parts, namely A,B = 0A,B + +A,B , where 0A,B is constructed as A,B with χk0 in the place of χk . For any d > 0, let us introduce the closed set Bd formed by the elements of B such that d Bd := X ∈ B , 1 − td ≤ X (t) ≤ 1 + td . dt We have that +A,B satisfies on Bd similar inequalities as the ones stated in Proposition 4.5 for . In order to verify the last statement, we can make profitable use of the fact that B(k) is rapidly decreasing and that B(k)χk and A(k)χk are locally square integrable with respect to the measure k 2 dk. Hence the first bound |+A,B (t)| ≤ Ct,

∀t ∈ [0, t0 ]

descends straightforwardly from the inequalities in Proposition 4.4, while the second one, about the variation of + under the small perturbation δ X of X , |D+A,B (t)| ≤ Ct0 δ X B ,

∀t ∈ [0, t0 ]

can be proved using the results stated in Appendix A.2. 0A,B does not satisfy similar inequalities, and it is for such a reason that we have introduced Bd , a closed set different than Bc . For 0A,B , there exists a constant δ1 such that |0A,B (t)| ≤ δ1 ,

∀t ∈ [0, t0 ]

and this inequality descends straightforwardly from the properties a) and b) stated in the hypotheses and from the fact that |χk0 |k is constant. The variation of 0A,B under the small perturbation δ X of X involves only the variation of e2ikτ , present in the second and third term of 0A,B . From properties c) and d), the following uniform decay follows A(k)B(k)k 2 e2ikτ (t) dk ≤ C , ∀t ∈ [0, t0 ]. τ 2 (t) Furthermore, operating in a similar way as in Appendix A.1, since |δ X (t)| ≤ tδ X B , we obtain the uniform estimate 2 Dτ (t) t sup 2 ≤ C sup 2 δ X B τ (t) X (t ) t∈[0,t0 ] t ∈[0,t0 ] valid for sufficiently small t0 in Bd and for some constant C. Combining these arguments we easily obtain |D0A,B (t)| ≤ Cδ X B .

∀t ∈ [0, t0 ].

Finally, let us notice that the results given in Proposition 4.5 hold also on Bd . Although the behavior of 0A,B is worse than that of +A,B or of A,B , the inequalities given above are sufficient in order to obtain the existence and uniqueness of the solution



in $B_d$. This is possible because, on $B_d$, $|Xa^{-1}(t)| \le C$ holds, which is not valid for the generic element of $B_c$. Actually, for a sufficiently large $d$ (depending on $\delta_1$ in particular) and a sufficiently small $t_0$, the map $T$ introduced above is a contraction in $B_d$, as can be seen by adapting the proof of Proposition 4.6. Out of this observation the proof of the present theorem can then be easily concluded by applying the Banach fixed point theorem on $B_d$.

5. Further Considerations on the Semiclassical Solutions

5.1. Expectation value of $\phi^2$ at late times.

By direct inspection we notice that, at late times, the massless solution (17) tends to the de Sitter spacetime. We would like to see if such behavior persists also for massive fields.$^{9}$ To this end notice that in [DFP08] it is argued that, in the case of a massive conformally coupled scalar field, if the expectation value of $\phi^2$ has the form $\beta m^2 + \alpha R$, then the same behavior at late times can be inferred. Such an estimate was obtained by analyzing the adiabatic approximation for the state. It would be desirable to see if the assumption made in [DFP08] for the expectation value of $\phi^2$ is reliable for the exact solution of the semiclassical Einstein equations. We start by noticing that in the case of an exact de Sitter spacetime, the expectation value of $\phi^2$ in the Bunch-Davies state [BD78] has exactly that behavior, and hence there is only one choice of the cosmological constant which solves the semiclassical Einstein equations. Unfortunately, it is very difficult to compute the evolution of the expectation value of $\phi^2$ in the exact solution we have found before. Despite the difficulties present in the analysis of the semiclassical solution at finite times, we would like to stress that if a de Sitter phase exists at late times, it has to be exactly the one found in [DFP08].
Such a remark originates from the fact that the expectation value of $\phi^2$, in an expanding flat universe ($H > 0$), becomes state independent at late times (at least on the class of homogeneous pure Hadamard states), as asserted in the following theorem.

Theorem 5.1. In an expanding ($H > 0$) flat FRW spacetime whose smooth scale factor $a$ diverges for large time, consider a quasi-free state $\omega_{A,B}$ whose two-point function is as in (13) and of Hadamard form. Then

$$\lim_{t\to\infty}\left(\langle\phi^2(t)\rangle_{\omega_{A,B}} - \langle\phi^2(t)\rangle_{\omega_{1,0}}\right) = 0.$$

Proof. The proof can be obtained considering the form of χk . Since both ω A,B and ω1,0 are Hadamard states, it holds for their two-point functions φ 2 (t)ω A,B − φ 2 (t)ω1,0 1 1 2 = χ χ (t)+ A(k)B(k) χ χ (t)+ A(k)B(k) χ χ (t) dk, 2|B(k)| k k k k k k (2π )3 a(t)2 (28) where x = (t, x) in cosmological coordinates and k = |k|. Because of Theorem 3.2, B(k) is rapidly decreasing and A(k) of polynomial growth, hence we can pass to the coinciding point limit under k-integration. We aim to prove that the preceding expression vanishes in the limit a → ∞. To this end we shall show that the previous k integration is bounded in time. To prove this we have to analyze the behavior of χk for large times. To 9 See also the work [PRV08].



this end, let us consider the functions $\chi_k$ in cosmological time ($\chi_k(t) := (\chi_k\circ\tau)(t)$), and let us notice that they have to satisfy the following equation

$$\ddot\chi_k + H\dot\chi_k + \left(\frac{k^2}{a^2} + m^2\right)\chi_k = 0,$$

where $\dot\chi_k$ and $\ddot\chi_k$ indicate respectively the first and second derivative of $\chi_k$ with respect to the cosmological time $t$. The previous equation follows directly from a change of variables in (9). Let us introduce the positive quantity

$$Q(t,k) := \overline{\dot\chi_k}\,\dot\chi_k(t) + \left(\frac{k^2}{a^2} + m^2\right)\overline{\chi_k}\,\chi_k(t).$$

We would like to show that, at fixed $k$, $Q(t,k)$ decreases in time. To this end notice that, since $H \ge 0$,

$$\dot Q(t,k) = -2H(t)\,|\dot\chi_k(t)|^2 - 2\,\frac{k^2}{a(t)^2}\,H(t)\,|\chi_k(t)|^2 \le 0,$$

where we have indicated by $\dot Q(t,k)$ the partial derivative $\frac{\partial}{\partial t}Q(t,k)$. Hence, out of the form of $Q(t,k)$ and of $\dot Q(t,k)$ we obtain that, for every $t > t_0$,

$$|\chi_k(t)|^2 \le \frac{Q(t_0,k)}{m^2}.$$
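The monotonicity of $Q(t,k)$ and the resulting bound on $|\chi_k(t)|^2$ can be checked numerically on a simple background. The sketch below is illustrative only: it assumes a de Sitter-like toy scale factor $a(t) = e^{Ht}$ with constant $H > 0$, positive-frequency-like initial data chosen for convenience, and a hand-written classical Runge-Kutta integrator; the function name `evolve_mode` and all parameter values are hypothetical and not taken from the paper.

```python
import numpy as np


def evolve_mode(k, m, hubble, t_end, dt):
    """Integrate chi'' + H chi' + (k^2/a^2 + m^2) chi = 0 with classical RK4.

    Assumption (not from the paper): a(t) = exp(hubble * t), hubble > 0 constant.
    Returns the time grid, chi, and Q(t, k) = |chi'|^2 + (k^2/a^2 + m^2) |chi|^2.
    """
    def omega2(t):
        return k**2 * np.exp(-2.0 * hubble * t) + m**2

    def rhs(t, y):
        chi, dchi = y
        return np.array([dchi, -hubble * dchi - omega2(t) * chi])

    n_steps = int(round(t_end / dt))
    t = dt * np.arange(n_steps + 1)
    omega0 = np.sqrt(omega2(0.0))
    # Illustrative initial data: chi(0) = 1/sqrt(2 w0), chi'(0) = -i w0 chi(0).
    y = np.array([1.0 / np.sqrt(2.0 * omega0), -1j * np.sqrt(omega0 / 2.0)])

    chi = np.empty(n_steps + 1, dtype=complex)
    q = np.empty(n_steps + 1)
    for i in range(n_steps + 1):
        chi[i] = y[0]
        q[i] = abs(y[1]) ** 2 + omega2(t[i]) * abs(y[0]) ** 2
        if i == n_steps:
            break
        k1 = rhs(t[i], y)
        k2 = rhs(t[i] + dt / 2.0, y + dt / 2.0 * k1)
        k3 = rhs(t[i] + dt / 2.0, y + dt / 2.0 * k2)
        k4 = rhs(t[i] + dt, y + dt * k3)
        y = y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return t, chi, q


if __name__ == "__main__":
    m = 1.0
    t, chi, q = evolve_mode(k=2.0, m=m, hubble=0.5, t_end=5.0, dt=1e-3)
    print("Q non-increasing:", bool(np.all(np.diff(q) <= 1e-12)))
    print("m^2 |chi|^2 <= Q(t0):", bool(np.all(m**2 * np.abs(chi) ** 2 <= q[0] + 1e-12)))
```

Here $t_0 = 0$ plays the role of the initial surface; the second check is the inequality $|\chi_k(t)|^2 \le Q(t_0,k)/m^2$ multiplied through by $m^2$.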

It is now possible to use the preceding inequality to estimate the $k$-integral present in (28). Since both $Q(t_0,k)|B(k)|^2$ and $Q(t_0,k)|A(k)B(k)|$ are integrable in $k$, there exists a positive constant $C(t_0)$ such that, for every $x = (t,\mathbf{x})$ in the future of the surface $\Sigma_{t_0}$,

$$\left|\omega_{A,B}(\phi^2(x)) - \omega_{1,0}(\phi^2(x))\right| \le \frac{C(t_0)}{a(t)^2}.$$

By the preceding remark, we can straightforwardly evaluate the limit $t \to \infty$ and, since $a(t)$ diverges by hypothesis, the difference vanishes.

5.2. Variance of $T$ when smeared on large volume regions.

As discussed in the Introduction, the semiclassical Einstein equations are valid when the quantum fluctuations of the expectation value of the stress tensor are negligible. In this paper, in order to obtain the dynamics of the spacetime, due to the large symmetry present in the model under investigation, we have used only the trace of the stress tensor. Notice that the anomalous terms in the trace are c-numbers; hence, they have vanishing fluctuations, and this already means that, in the case of massless fields, the semiclassical equation we have found and its solution are meaningful at every time (not only when the curvature is small). Furthermore, in the case of a massive field conformally coupled with the curvature, the source for the fluctuations in the trace of the stress tensor can only be the expectation value of $\phi^2(x)$. Unfortunately, point-like fields like $\phi^2(x)$, although their expectation values are represented by smooth functions, could have divergent variance. This is indeed the case even if the Hadamard regularization is employed, as can be seen by a straightforward application of the Wick theorem. On the contrary



the variance of smeared fields on Hadamard states is always finite. Let us analyze this in some detail. We start by recalling that the variance of φ 2 ( f ) on a state ω can be obtained as ω (φ 2 ( f )) := ω(φ 2 ( f )φ 2 ( f )) − ω(φ 2 ( f ))ω(φ 2 ( f )),

(29)

where in the field φ 2 ( f ) the Hadamard regularization is employed.10 Since the state ω1,0 is quasi-free, we can use (5) for the evaluation of the four-point function in the product of fields present above and we can employ the usual definition of the Wick square φ 2 (x) = lim y→x (ϕ(x)ϕ(y) − H(x, y)), in order to show that 2 (x, y) f (x) f (y)dμ(x)dμ(y), ω1,0 (φ 2 ( f )) := 2 ω1,0 2 is the product of distribution ω 2 where ω1,0 1,0 · ω1,0 . Although ω1,0 is a well defined distribution, due to the form of its wave front set, its coinciding point limit is not well defined and actually it diverges. Hence, the expectation value of φ 2 (x) has divergent fluctuations, but, luckily enough, we can overcome this difficulty by a suitable smearing procedure we are going to introduce. To this end, we could use the spatial symmetry once more and consider a smearing function f n 1 ,n 2 such that, for large n 1 and n 2 it tends to have support on the whole Cauchy surface τ , and it is normalized in such a way that its integral on M is equal to one. More precisely, we consider a compactly supported smooth function f ∈ C0∞ (M) centered in xτ := (τ, 0), and such that f (xτ ) = 1, f dμ(g) = 1, 0 ≤ f (x) ≤ 1. M

Notice that, in writing the preceding expression, we are using the conformal coordinates on M. We can then generate the desired f n 1 ,n 2 as |g(n 1 (τ − τ ) + τ, x/n 2 )| n1 x f n 1 ,n 2 (τ , x) = 3 f n 1 (τ − τ ) + τ, (30) n2 n2 |g(τ , x)| and, as anticipated before, for large n 1 , f n 1 ,n 2 tends to have support on τ while, for

large n 2 , even if its spatial support becomes larger and larger, M f n 1 ,n 2 dμ(g) = 1. We can then smear the fields in (16) and subsequently in (29) with f n 1 ,n 2 and analyze the behavior of the smeared quantity for large n 1 and n 2 . In other words, notice that, since the trace of the Einstein tensor G is smooth and equal to −R, in the limit (n 1 , n 2 ) → ∞, lim

(n 1 ,n 2 )→∞

g μν G μν , f nτ1 ,n 2 = −R(τ ),

and, thanks to the continuity properties of R, the result does not depend on the order in which the limits are taken. The analysis of the trace of T follows similarly. In fact, the anomalous part of the trace depends continuously on the coefficients of the metric, and its derivatives, up to the second order, while the expectation value of φ 2 (x) has vanishing wave front set and, hence, it is continuous too. Thus, the equation −R = 8π T ω1,0 , together with its solutions, holds exactly in the same way for the smeared quantities. 10 A more elegant formulation of the extended algebra of observables F (M) as the extension of a deformed algebra of fields can be found in [BF09,BDF09]. The product of fields present in ω (φ 2 ( f )) can be also understood as the product in such an algebra.



We proceed to analyze the variance of these quantities, and we notice immediately that the variances of the geometric entities present in (16) vanish for large n 1 and n 2 , no matter in which order the limits are performed. We have to be more careful when analyzing the variance of the expectation values of φ 2 but, luckily enough, we have the following theorem which shows that the fluctuations of φ 2 vanish if the limits are taken in a suitable order. Theorem 5.2. Let f n 1 ,n 2 be some smearing functions constructed as in (30) out of some f with the properties stated above. Fix n 1 , then the variance of φ 2 ( f n 1 ,n 2 ), computed according to formula (29) and evaluated on the state ω1,0 , vanishes for large n 2 in the weak sense, that is lim ω1,0 (φ 2 ( f n 1 ,n 2 )) = 0.

n 2 →∞

Proof. First of all notice that, ω1,0 is of Hadamard form and hence its wave front set satisfies the microlocal spectrum condition. This implies that the pointwise product of 2 , is a well defined distribution because the ω1,0 with itself, that we shall indicate by ω1,0 Hörmander criterion for multiplication of distributions is fulfilled (the sum of the wave front set of the distribution ω1,0 with itself does not contain the zero section). Further2 has more, Theorem 8.2.10 of [Ho89] yields that the wave front set of the product ω1,0 to be contained in (W F(ω1,0 ) ∪ {0}) + (W F(ω1,0 ) ∪ {0}), where the sum is meant in the 2 can only contain points cotangent space. This implies that the singular support of ω1,0 connected by a lightlike geodesic. Hence, we can divide in two parts the distribution 2 2 2 ω1,0 f n 1 ,n 2 ⊗ f n 1 ,n 2 := ω1,0 u f n 1 ,n 2 ⊗ f n 1 ,n 2 + ω1,0 u f n 1 ,n 2 ⊗ f n 1 ,n 2 . 2 multiplying the test functions by a suitable parAbove we have split the distribution ω1,0 2 . We shall now discuss tition of unit u + u = 1 on M × M and using the linearity of ω1,0 how we have to construct u in order to have the intersection of the singular support of 2 u with the support of f ω1,0 n 1 ,n 2 ⊗ f n 1 ,n 2 is empty for every n 2 and for fixed n 1 . Consider a compactly supported smooth function ρ on R3 equal to one on a neighborhood of the origin. Then we shall indicate by u the smooth function on M × M,

u : ((τ, x); (τ , x )) → ρ(x − x ), 2 u, where we have used standard coordinates introduced above. Let us start analyzing ω1,0 ∞ and suppose we test it on a test function h ∈ C0 (M × M) of the following form:

h((τ1 , x); (τ1 , y)) := f t (τ1 , τ2 ) f + (x − y) f − (x + y), where f t ∈ C0∞ (I × I ) and f + , f − ∈ C0∞ (R3 ). Furthermore, these functions are chosen in such a way that h is positive and, for a fixed n 1 , h ≥ f n 1 ,1 ⊗ f n 1 ,1 . Hence, due to 2 and the multiplication by the cutoff u, the the translation invariance satisfied by ω1,0 following continuity condition holds true: ⎛ 2 |ω1,0 u (h) | ≤ C ⎝

|a|≤q

⎞

⎛

D a f t ∞ ⎠ f + 1 ⎝

|b|≤q

⎞ D b f − ∞ ⎠



for suitable values of C and q. Notice that, because of the localization introduced by u, C does not depend on the support of f − . Out of this observation we have that, for h n ∈ C0∞ (M × M) defined as h n ((τ1 , x); (τ1 , y)) := h((τ1 , x/n); (τ1 , y/n))/n 6 , 2 u (h n ) = 0. lim ω1,0

n→∞

2 u( f The latter limit implies also that limn 2 →∞ ω1,0 n 1 ,n 2 ⊗ f n 1 ,n 2 ) = 0, because, for fixed 2 u, n 1 , thanks to the special choice of f t , f + and f − and to the positivity of ω1,0 2 2 u( f n 1 ,n 2 ⊗ f n 1 ,n 2 ) ≤ ω1,0 u h n2 . ω1,0 2 , namely ω2 u , and we We have now to analyze the second contribution to ω1,0 1,0 2 u has vanishing intersection notice that, per construction, the singular support of ω1,0 2 u has a with the support of f n 1 ,n 2 ⊗ f n 1 ,n 2 . Hence, when tested on f n 1 ,n 2 ⊗ f n 1 ,n 2 , ω1,0 smooth integral kernel, and, for every n 2 and for sufficiently large n 1 , 2 u ( f n 1 ,n 2 ⊗ f n 1 ,n 2 ) ω1,0

is represented by an ordinary integral. We can change the variable of spatial integration in such a way that the preceding operation can be rewritten as 2 lim u )(τ1 , τ2 , n 2 (x − y))n 21 f (n 1 τ, x) f (n 1 τ , y)dμ(τ1 , τ2 )dxdy. (ω1,0 n 2 →∞

2 u ) is bounded, we can pass the limit under Since f has compact support and (ω1,0 the sign of integration. We obtain that the limit for large n 2 vanishes provided that 2 u )(τ , τ , n (x − y)) = 0. limn 2 →∞ (ω1,0 1 2 2 Due to the spatial translational and rotational invariance, it is enough to check this last requirement for ω1,0 (without the square), for a single direction and when one of the points is in the spatial origin of the employed coordinate system. Hence, using the conformal coordinates, let us consider x := (τ1 , 0) and y := (τ2 , r, 0, 0), with r >> 0. We can rewrite the two-point function ω1,0 as ∞ 1 sin(kr ) 2 ω1,0 (x, y) = χk (τ1 )χk (τ2 ) k dk, 2 (2π ) a(τ1 )a(τ2 ) 0 r

where we have made use of the fact that r is chosen to be strictly positive and the fact that (x, y) is not contained in the singular support of ω1,0 . We can now split the k−integral on (0, ∞) in two parts, as similarly done in the proof of Proposition 4.5, namely on the interval (0, 1) and on (1, ∞). Since χk (τ1 )χk (τ2 ) is locally integrable in the measure k dk, due to the estimates given in Proposition 4.4 and, making use of the Riemann Lebesgue theorem, we can conclude that the contribution on (0, 1) vanishes when r diverges. Let us proceed to analyze the k−integral on the remaining interval (1, ∞). Hence, we can now make use of the perturbative construction in m 2 for the χk given in (25) and analyze the different contributions separately. First of all we notice that the sum of the contributions to χk (τ1 )χk (τ2 ) of order equal to or larger than m 4 lie in L 1 (R, kdk), and, once more, due to results stated in Proposition 4.4, by the Riemann Lebesgue lemma, as r diverges this contribution tends to zero. We are left with the analysis of the order 1 and m 2 . The former consists of a well known distribution that vanishes for large r , whereas the latter (order m 2 ) can be estimated, using an argument



similar to the one used to treat (40). Therefore, for large $r$ and for some constant $C$, this contribution can be shown to be bounded by $Cr^{-1}(1 + \log(r))$, which tends to zero when $r$ diverges. This discussion provides a new and stronger physical interpretation of the semiclassical Einstein equations on spacetimes with large spatial symmetry, such as the cosmological ones.

6. Summary of the Results and Final Comments

In this paper we have rigorously constructed some solutions of the semiclassical Einstein equations in cosmological spacetimes. All these solutions display a phase of power law inflation after the initial singularity. We have also seen that, if a stable de Sitter phase is asymptotically present in the future, then the acceleration of the latter does not depend on the particular homogeneous pure state in which the quantum theory is evaluated, and hence it can be considered a universal feature. Finally, we have discussed the interpretation of these equations, showing that, when smeared on large spatial regions, the equation $-R = 8\pi\langle T\rangle$ continues to be valid and, since in this case the fluctuations of $T$ tend to vanish, it acquires an even stronger interpretation. We have been able to derive those results because we have selected a class of spacetimes and we have used a universal quantization scheme for the matter fields on that class, employing ideas typical of local covariance [BFV03]. Furthermore, we have been able to give unambiguously a quantum state on every element of the class of considered spacetimes, using, and even generalizing, the results in [DMP09a,DMP09b]. The employed methods also make it possible to characterize the quantum states in an intrinsic way. In other words, using a lightlike initial surface and giving initial data thereon, we provided a prescription which can be used to construct states with a good ultraviolet behavior.
There are of course many questions that are still open: here we want to quote some of them, which have already been addressed in the approximate case in [DFP08]. We think, for example, about the issue of long-time existence and the long-time behavior of the exact solution. Another important question concerns the stability of such solutions under perturbations. In principle, similar ideas could also be used to treat more complicated problems such as, for example, the backreaction of collapsing matter forming a black hole. Another interesting aspect that should be addressed in the future is the relation between the results obtained here on the vanishing of the fluctuations by smearing the fields on infinite volume regions and the approach typical of the so-called stochastic gravity [HV08].

Acknowledgements. I would like to thank Romeo Brunetti, Claudio Dappiaggi, Klaus Fredenhagen, Thomas-Paul Hack and Valter Moretti for useful discussions, suggestions and comments on the subject of this paper. This work is supported in part by the German DFG Research Program SFB 676 and in part by the ERC Advanced Grant 227458 OACFT “Operator Algebras and Conformal Field Theory”.

A. Appendix

A.1. Functional derivatives of $a$ and $\tau$. In this subsection we shall derive some estimates for the functionals $a$ and $\tau$ needed to establish the proofs of the main theorems. Let us start by recalling the form of the scaling factor $a$ as a functional of $X$. Fixing the


N. Pinamonti

initial condition $a_0$ at the cosmological time $t_0$, we can write $a$ and its first functional derivative as
\[
a[X](t) := a_0 \exp\left(-\int_t^{t_0} X^{-1}(t')\,dt'\right), \qquad
Da[X,\delta X](t) := a[X](t) \int_t^{t_0} X^{-2}(t')\,\delta X(t')\,dt'.
\]

We shall derive an estimate for the functional derivative $Da$ valid for every $X$ and $X + \delta X$ in $B_c$ and $t \in [0, t_0]$. Let us start by multiplying and dividing the integrand $X^{-2}(t')\,\delta X(t')$ in $Da[X,\delta X](t)$ by $t'^2$. We can then bound this expression by extracting the supremum of $X^{-2}(t')\,t'^2\,\delta X(t')$ from the integral. The integral of $t'^{-2}$ over $[t, t_0]$ can now be performed explicitly. Thus we get
\[
|Da[X,\delta X](t)| \le a[X](t)\, \sup_{t'\in[0,t_0]}\left(X^{-2}(t')\,t'^2\right) \left(\frac{1}{t} - \frac{1}{t_0}\right) \|\delta X\|_\infty \le \frac{a_0}{t_0}\,\frac{1}{c^2}\,\|\delta X\|_\infty, \tag{31}
\]
where $\|\cdot\|_\infty$ is the supremum norm over the interval $[0, t_0]$, and in the last inequality we have used the estimate stated in Proposition 4.3. Furthermore, from the preceding inequalities and the definition of the functional derivative as a directional derivative, it follows that
\[
|a_1(t) - a_2(t)| = \left|\int_0^1 Da[X_\lambda,\delta X](t)\,d\lambda\right| \le \frac{a_0}{t_0}\,\frac{1}{c^2}\,\|\delta X\|_\infty, \qquad \forall t\in[0,t_0],
\]

where $\delta X = X_2 - X_1$ and $X_\lambda = X_1 + \lambda\,\delta X$, and the last inequality descends from (31) together with Proposition 4.3. Let us continue by analyzing the form of the conformal time as a functional of $a$ and the first functional derivative of $\tau \circ a[X]$,
\[
\tau[a](t) := \tau_0 - \int_t^{t_0} \frac{1}{a}\,dt', \qquad D(\tau\circ a)[X,\delta X] = D\tau[a, Da[X,\delta X]],
\]
which satisfies the following inequality:
\[
|D(\tau\circ a)[X,\delta X](t)| \le 2\int_t^{t_0} \frac{1}{a}\left(\frac{1}{t'} - \frac{1}{t_0}\right) dt' \; \sup_{t'\in[0,t_0]}\left(X^{-2}(t')\,t'^2\right) \|\delta X\|_\infty .
\]

Hence, multiplying by a sufficiently large power of $t$, we obtain that the inequalities
\[
t^{1/c}\,|D(\tau\circ a)[X,\delta X](t)| \le C(a_0,t_0,c)\, \sup_{t'\in[0,t_0]}\left(X^{-2}(t')\,t'^2\right)\|\delta X\|_\infty \le \frac{2\,t_0^{1/c}}{a_0\,c}\,\|\delta X\|_\infty \tag{32}
\]
hold in the ball $B_c$ constructed out of $c$ as introduced in Definition 4.1.
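As an informal numerical illustration of the formulas for $a[X]$ and $Da[X,\delta X]$ given in this subsection, the following Python sketch compares the closed-form directional derivative with a central finite difference and with the first bound in (31). The choices $X(t) = c\,t$ with $c = 2$, $\delta X(t) = t^2$, $a_0 = t_0 = 1$, and all function names are assumptions made only for this example; whether this $X$ lies in the ball $B_c$ of Definition 4.1 is not checked here.

```python
import math


def trapezoid(f, lo, hi, n=2000):
    """Composite trapezoidal rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def scale_factor(X, t, t0=1.0, a0=1.0):
    """a[X](t) = a0 * exp(-int_t^{t0} dt' / X(t'))."""
    return a0 * math.exp(-trapezoid(lambda s: 1.0 / X(s), t, t0))


def scale_factor_derivative(X, dX, t, t0=1.0, a0=1.0):
    """Da[X, dX](t) = a[X](t) * int_t^{t0} dX(t') / X(t')**2 dt'."""
    integral = trapezoid(lambda s: dX(s) / X(s) ** 2, t, t0)
    return scale_factor(X, t, t0, a0) * integral


def finite_difference(X, dX, t, eps=1e-4, t0=1.0, a0=1.0):
    """Central difference of lam -> a[X + lam * dX](t) at lam = 0."""
    plus = scale_factor(lambda s: X(s) + eps * dX(s), t, t0, a0)
    minus = scale_factor(lambda s: X(s) - eps * dX(s), t, t0, a0)
    return (plus - minus) / (2.0 * eps)


# Hypothetical data for illustration only: X(t) = c t, dX(t) = t^2, a0 = t0 = 1.
C = 2.0
X = lambda s: C * s
dX = lambda s: s * s
t = 0.5

analytic = scale_factor_derivative(X, dX, t)
numeric = finite_difference(X, dX, t)
# For this X: sup X^{-2} t'^2 = 1/c^2 and ||dX||_inf = t0^2 = 1 on [0, 1],
# so the first bound in (31) reads a(t) * (1/c^2) * (1/t - 1/t0).
bound = scale_factor(X, t) * (1.0 / C ** 2) * (1.0 / t - 1.0)
print(analytic, numeric, bound)
```

For this particular choice the integrals can be done by hand, giving $a[X](t) = (t/t_0)^{1/c}$ and $Da[X,\delta X](t) = a[X](t)\,(t_0 - t)/c^2$, which provides an independent check of the two numerical values.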



A.2. Bounds satisfied by $\chi_k$ and its functional derivatives. In the following we shall derive some useful estimates satisfied by the functions $\chi_k$ constructed perturbatively in (25). This appendix is meant as a completion of the estimates presented in Proposition 4.4, where the convergence of (25) was proven. All the forthcoming formulas are meant to be valid for every spacetime that arises from an element $X$ of $B_c$ introduced in Definition 4.1, along the lines of Proposition 4.1. (Notice that they hold also on $B_d$ introduced in the proof of Theorem 4.2.) For the same reason $t \in [0, t_0]$ everywhere. Finally, we stress that the functional derivatives analyzed below are also intended to be taken within $B_c$, which means that both $X$ and $X + \delta X$ are contained in the convex set $B_c$. In order to shorten some formulas, it is also useful to introduce the following two functionals:
\[
F_1[a](t) := \int_0^t a(t')\,\frac{m^2}{k}\,dt', \qquad F_2[a](t) := \int_0^t \left(\tau(t) - \tau(t')\right) a(t')\,m^2\,dt' \tag{33}
\]
that satisfy $|F_1[a](t)| \le \frac{a_0}{t_0 k}\,t^2 m^2$ and $|F_2[a](t)| \le m^2 t^2$. Following a procedure similar to the one presented in the proof of Proposition 4.4, we can derive the following estimates:
\[
\left|\chi_k^{>n}[a](t)\right| \le \frac{1}{\sqrt{2k}}\,\left(F_{1/2}[a](t)\right)^{n+1} \exp F_{1/2}[a](t). \tag{34}
\]
In the remaining part of this appendix we shall analyze the functional derivatives of $\chi_k - \chi_k^0$ with respect to $X \in B_c$; more precisely we shall compute
\[
D(\chi_k - \chi_k^0)[X,\delta X] = D_a(\chi_k - \chi_k^0)[a,\tau,Da[X,\delta X]] + D_\tau(\chi_k - \chi_k^0)[a,\tau,D\tau[X,\delta X]].
\]
Let us start from $D_a(\chi_k - \chi_k^0)[a,\tau,Da[X,\delta X]]$, which, exploiting the properties of the recursive series (25), the form of $Da[X,\delta X]$ and the usual extension of a time-ordered integral to an unordered one, turns out to satisfy the following inequality:
\[
\left|D_a(\chi_k - \chi_k^0)[a,\tau,Da[X,\delta X]](t)\right| \le \frac{e^{F_{1/2}(t)}}{k\sqrt{2k}}\, m^2\, t\, \|Da\|_\infty,
\]
where $\|\cdot\|_\infty$ is the uniform norm on $C[0, t_0]$, while $D_a\chi_k^0 = 0$.
The second contribution is more laborious; let us consider the recursive sum and concentrate on the contribution at $n$-th order, which looks like
\[
\int_{0\le t_n\le t_{n-1}\le\cdots\le t} \frac{e^{ik\tau(t_n)}}{\sqrt{2k}} \prod_{i=1}^{n} \frac{\sin\left(k(\tau(t_{i-1})-\tau(t_i))\right)}{k}\, m^2 a(t_i)\; dt_1\cdots dt_n, \tag{35}
\]
where, only in the previous formula, $t_0$ corresponds to the time $t$. The action of the $\tau$-functional derivatives on (35) is twofold: on the one hand, we have to differentiate every factor of the form $\sin(2k(\tau-\tau'))$ and, on the other hand, we have to differentiate $\tau$ in $\chi_k^0$. Let us first consider the functional derivatives of $\sin(2k(\tau-\tau'))$ and extend the time-ordered integral to an unordered one. We obtain that the $n$-th order is bounded by
\[
\frac{1}{(n-1)!}\,\frac{1}{\sqrt{2k}}\,F_{1/2}^{\,n-1}(t)\cdot\int_0^t \frac{4}{c\,a_0}\,m^2\left(\frac{t_0}{t'}\right)^{1/c} a(t')\,dt'\cdot\|\delta X\|_\infty,
\]


where $\|\cdot\|_\infty$ is the uniform norm on the elements of $B$ and where we have used the following estimate descending from (32):
\[
|D\tau[X,\delta X](t_1) - D\tau[X,\delta X](t_2)| \le \frac{4}{c\,a_0}\left(\frac{t_0}{t_2}\right)^{1/c}\|\delta X\|_\infty, \qquad t_1 > t_2.
\]
The absolute value of the functional derivative of $e^{ik\tau}$ in (35) is also bounded by a similar expression:
\[
\frac{1}{(n-1)!}\,\frac{1}{\sqrt{2k}}\,F_{1/2}^{\,n-1}(t)\cdot\int_0^t \frac{2}{c\,a_0}\,m^2\left(\frac{t_0}{t'}\right)^{1/c} a(t')\,dt'\cdot\|\delta X\|_\infty.
\]

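Both these last two estimates can be summed in order to give an exponential. Hence
\[
\left|D_\tau(\chi_k-\chi_k^0)[X,\delta X](t)\right| \le \frac{6\,m^2}{2c-1}\,\frac{\exp F_{1/2}(t)}{\sqrt{2k}}\,t_0\,\|\delta X\|_\infty,
\]
and, together with the $a$-functional derivative, we obtain
\[
\left|D(\chi_k-\chi_k^0)[X,\delta X](t)\right| \le \left(\frac{1}{k}\,\frac{a_0}{t_0}\,\frac{1}{c^2} + \frac{6\,m^2}{2c-1}\right)\frac{\exp F_{1/2}(t)}{\sqrt{2k}}\,m^2\,t_0\,\|\delta X\|_\infty. \tag{36}
\]
A further inequality we would like to present in this subsection is the following:
\[
\left|(\chi_k-\chi_k^0)[X](t)\cdot D\chi_k^0[X,\delta X](t)\right| \le \frac{\exp F_{1/2}(t)}{2k}\,\frac{2\,m^2}{2c-1}\,t_0\,\|\delta X\|_\infty, \tag{37}
\]
which can be shown to hold in a similar way as the bound found above for the functional derivative of $e^{ik\tau}$. We conclude this subsection with two inequalities, valid for $n \ge 0$, that can be shown to hold in a similar way as before:
\[
\left|D\chi_k^{>n}[X,\delta X](t)\right| \le F_{1/2}^{\,n}(t)\left(\frac{1}{k}\,\frac{a_0}{t_0}\,\frac{1}{c^2} + \frac{6\,m^2}{2c-1}\right)\frac{\exp F_{1/2}(t)}{\sqrt{2k}}\,m^2\,t_0\,\|\delta X\|_\infty \tag{38}
\]
and
\[
\left|\chi_k^{>n}[X](t)\cdot D\chi_k^0[X,\delta X](t)\right| \le F_{1/2}^{\,n}(t)\,\frac{\exp F_{1/2}(t)}{2k}\,\frac{2\,m^2}{2c-1}\,t_0\,\|\delta X\|_\infty. \tag{39}
\]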

A.3. Proof of Proposition 4.5. Proof. The proof is done by dividing the expression obtained by combining (27) with (24) into a finite number of parts and showing that every contribution separately satisfies the desired inequalities with respect to some constants that have the properties stated in Proposition 4.5. In what follows, since most of the computations are rather long but straightforward, we shall concentrate only on the key points, and we shall make extensive use of all the estimates derived in the previous appendix. Throughout this proof we shall make use of Proposition 4.1 in order to associate a spacetime to an element $X$ of $B_c$. According to the same proposition, we shall denote $X^{-1}$ by $H$, and the $t$ which appears in some estimates has to be thought of as being contained in the interval $[0, t_0]$. Furthermore, both $X$ and $X + \delta X$ are assumed to be contained in the convex set $B_c$ and we shall



indicate by · ∞ the uniform norm on C[0, t0 ]. We shall also make use of the inequality f ∞ ≤ t0 f B which holds for every f ∈ B. Let us start dividing the integral over k in (24), also taking into account (27), in two parts 1 ∞ a2m2 [X ] = dkk 2 χk χk − χk0 χk0 + dkk 2 χk χk − χk0 χk0 + 4k 3 0 1 m2a2 log (λa) , 4 where the regularization is needed only for large k, whereas χ , a and τ are functionals of X as discussed previously. Above, the last term proportional to log(am) arises from the change of the region over which the k−integration of a 2 m 2 /(k 3 4) is performed. Notice that the zeroth order in m 2 , present in φ 2 through χ 0k χk0 , vanish due both to the discussion about the regularization freedom and to the requirement that Minkowski spacetime, with respect to the Minkwoski vacuum, is a solution of the problem. For this reason we have subtracted it in (27) and we shall not consider it anymore. Let us start considering the first part of the integral, to which we shall refer as the infrared part and we shall indicate it by I . In this respect we shall write χk χk − χk0 χk0 = χk − χk0 χk − χk0 + χk − χk0 χk0 + χk0 χk − χk0 −

and, hence, using the estimates (34) for χk and the one for the functional derivatives (36) and (37), with F2 as in (33), we obtain 1 2 2 I 1 F2 (t)2 exp (2 · F2 (t)) + 2F2 (t) exp F2 (t) ≤ m 4 t 4 + 2m 2 t 2 e2m t , (t) ≤ 4 4 where F2 (t) ≤ m 2 t 2 , as can be seen by direct inspection of the definition of both F2 (t) and τ , and a0 1 3 m2 I + D [X, δ X ] (t) ≤ F2 (t) exp F2 (t) + 1 t0 c2 2c − 1 1 m2 + exp F2 (t) m 2 t0 δ X ∞ . 2c − 1 This concludes the analysis of the contribution of the lower energies to the value of . We can now proceed to analyze the higher frequencies which we shall indicate as U . We shall split it in powers of m 2 and discuss the first three powers separately. Afterwards, before summing everything together, we shall analyze the remainder. I) Order. Let us start considering the first order in m 2 and its contribution to U , in particular, ∞ 2 2 2 1 χ 0 [a, τ ](t) + χ 0 χ 1 [a, τ ](t) + a (t)m dk U [a, τ ](t) := k χ 1 k k k k 4k 3 1 m 2 a 2 (t) log (λa(t)) , 4 which can be expanded as ∞ m2 t ∂a 2 m 2 a(t)2 (t ) dt − log (λa(t)) , U [a, τ ](t) := dk cos (2k(τ − τ )) 1 4k 0 ∂t 4 1 −


where both a and τ have to be thought of as functionals of X and where we have written the perturbative series (25) using t, instead of τ , as the time variable. Furthermore, τ has to be intended as τ [X ](t) while τ as τ [X ](t ). Let us rewrite it in the following form ∞ m2 t ∂a 2 U ∂ dk 2 sin (2k(τ − τ )) 1 [a, τ ](t) := a (t ) dt 8k 0 ∂t ∂t 1 −

m 2 a(t)2 log (λa(t)) 4

and, using the fact that the t derivative of X is positive, we can obtain the following bound 2 3 −1 2 |U 1 (t)| ≤ m a (t)X (t) + m a(t)t +

m 2 a 2 (t) log (λa(t)). 4

Let us proceed to analyze the functional derivative U U DU 1 [a, τ, δ X ] = Da 0 [a, τ, Da[X, δ X ]] + Dτ 0 [a, τ, Dτ [X, δ X ]],

and let us analyze both terms separately, first of all, setting F := Da[X, δ X ], ∞ 2 t m Da U [a, τ, F](t) := sin (2k(τ − τ )) F(8a 2 H 2 + 4a 2 H˙ ) 1 2 8k 1 0 2 ˙ + 8 Fa H + 2a 2 F¨ (t ) dt dk −

m2 m2 a(t)F log (λa(t)) − a(t)F, 2 4

where as usual H = X −1 and the dots indicate the derivatives with respect to the cosmological time t. Using the estimates present in Proposition 4.3 and (31), it is now an easy task to obtain the desired bounds. Notice in particular that the second order time derivative of F multiplied by a 2 can be estimated by δ X B (the norm of B), as it can be verified directly from the definition of a[X ]. Let us proceed with the second functional derivative, and now setting G := Dτ [X, δ X ] we obtain ∞ 2 t m U Dτ 1 [a, τ, G](t) := cos (2k(τ − τ )) G(t) − G(t ) 4k 1 0 3 2 × 6a H + 2a 3 H˙ (t ) dt dk. We could in principle integrate by parts in order to obtain an extra 1/(2k) factor. Unfortunately this would not help much, since, due to the extra t-derivative the left-hand side would depend on H¨ and we do not have control on the second t-derivative of H in Bc . At the same time we have to remember that coskkx is integrable in k on [1, ∞) for every x > 0 and that the k-integral present above needs to be understood in the distributional sense, hence taken with the following -prescription ∞ 2 t U −k m Dτ 1 [a, τ, G](t ) := lim+ e cos (2k(τ − τ )) G(t) − G(t ) =0 1 4k 0 × (6a 3 H 2 + 2a 3 H˙ )(t ) dt dk.



Since G(t) − G(t ) (6a 3 H 2 + 2a 3 H˙ ) is L 1 (0, t), at fixed , the absolute value of the integrand in dk ∧ dt is integrable. We can thus switch the order of integration to obtain t 2 G(t) − G(t ) (6a 3 H 2 + 2a 3 H˙ )(t ) [a, τ, G](t) := m Dτ U 1 0

× lim+ =0

∞e−k 1

4k

cos (2k(τ − τ )) dk dt .

(40)

We are now ready to perform the integral in k and to notice that the result is a smooth function everywhere in τ − τ but in 0, where a logarithmic divergence in τ − τ appears when the limit = 0+ is computed. Let us call L(τ − τ ) the result of the k-integration in the limit = 0. Notice that L(τ (t)−τ (t ))/a(t ) is absolutely integrable for t in [t −δ, t] and, moreover, we can adjust δ in such a way both that this integral is equal to 1 and that |L(τ (t) − τ (t ))| ≤ 1 for t in (0, t − δ). Hence, since a(t ) G(t) − G(t ) (6a 3 H 2 + 2a 3 H˙ )(t ) is uniformly bounded for t in [0, t], we can split the t -integral in (40) in two parts, namely over [0, t − δ] and over [t − δ, t]. Both can be uniformly estimated and finally the following absolute bound for Dτ U 1 can be established: 6a 3 H 2 + 2a 3 H˙ ∞ Dτ U 1 (t) ≤ sup a(t ) G(t) − G(t ) t ∈[0,t] t

+

0

G(t) − G(t ) (6a 3 H 2 + 2a 3 H˙ ) dt .
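The logarithmic divergence of the function $L$ introduced after (40) can be made explicit. Assuming the $k$-integrand is exactly $e^{-\epsilon k}\cos(2kx)/(4k)$ with $x = \tau - \tau' \neq 0$, the standard identity $\int_1^\infty \cos(bk)/k\,dk = -\mathrm{Ci}(b)$ for the cosine integral suggests $L(x) = -\mathrm{Ci}(2|x|)/4$, which behaves like $-(\gamma + \log(2|x|))/4$ for small $|x|$. The following Python sketch evaluates this through the power series of $\mathrm{Ci}$; the function names are hypothetical and the closed form is an observation added here, not a formula taken from the proof.

```python
import math

EULER_GAMMA = 0.5772156649015329


def cosine_integral(x, terms=40):
    """Ci(x) for x > 0 from its power series (adequate for moderate x)."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = EULER_GAMMA + math.log(x)
    term = 1.0  # holds (-1)^n x^(2n) / (2n)! after the n-th update
    for n in range(1, terms + 1):
        term *= -x * x / ((2 * n - 1) * (2 * n))
        total += term / (2 * n)
    return total


def L(x):
    """Assumed eps -> 0+ limit of int_1^inf exp(-eps k) cos(2 k x)/(4 k) dk."""
    return -cosine_integral(2.0 * abs(x)) / 4.0


def log_part(x):
    """Leading small-|x| behaviour of L: -(gamma + log(2|x|)) / 4."""
    return -(EULER_GAMMA + math.log(2.0 * abs(x))) / 4.0


for x in (1e-1, 1e-2, 1e-3):
    print(x, L(x), log_part(x))
```

The difference between the two printed columns shrinks like $x^2$, which is consistent with the statement that $L$ is smooth away from $0$ and has only a logarithmic, hence integrable, singularity there.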

Combining all the estimates, we obtain the desired behavior, namely that the absolute value of the functional derivatives is controlled by Ct0 δ X . II) Order. The computation at the second order is more involved due to a pair of integrals which needs to be addressed, but at the same time there are no subtleties with the extra derivatives which could necessitate control on the derivatives of the variation higher than the first one. We start by recalling the form of this term ∞ U [a, τ ] := k 2 χk2 χk0 [a, τ ] + χk1 χk1 [a, τ ] + χk0 χk2 [a, τ ] dk, 2 1

which can be more explicitly rewritten as ∞ t 2 ∂a 2 2 1 m4 U 2 [a, τ ](t) := (t )dt sin(k(τ − τ )) dk k 2k 0 ∂t 1 2 t 2 1 ∂a (t )dt + sin(2k(τ − τ )) 8k 0 ∂t ∞ t m4 t ∂a 2 (t ) − dk 2 sin(2k(τ − τ )) a(t )dt dt 4k ∂t 1 0 t ∞ m4 t + dk 2 sin(2k(τ − τ ))a 3 (t )dt . (41) 4k 0 1 Out of such expression, an estimate satisfied by |U 2 | can be easily derived and it looks like 5 2 1 1 U (a H t)2 + a 3 H t 2 + a 3 t . 2 [a, τ ](t) ≤ m 4 4 2 4


Let us proceed to the analysis of the functional derivatives of and let us notice that the analysis of Da 2 can also be easily done yielding m4 6a 3 H 2 t 2 + 3a 2 H t 2 + ta 3 Da∞ Da U 2 [a, τ, Da](t) ≤ 8 m4 2 2 ˙ 2a H t + at 2 a( Da) + . ∞ 8 A little bit more laborious is the handling of the τ -directional derivative, because, operating as above, an extra factor 2k appears in the formula. Furthermore, it could make the k-integral divergent when the trigonometric functions are replaced with one in the maximization procedure of the second, third and fourth factor in (41). In order to avoid this problem, we can integrate by parts after performing the functional derivative, along the following lines: 2 t t 2 ∂a ∂a ∂ Dτ dt = a D(τ − τ ) dt . sin(2k(τ − τ )) sin(2k(τ − τ )) ∂t ∂t ∂t 0 0 The price to be paid is in the extra time derivative we have to deal with. In fact, proceeding in that way we obtain the following estimate t t U 2 d a|D(τ − τ )|dt + C2 a D(τ − τ ) dt . Dτ 2 [a, τ, Dτ ](t) ≤ C1 dt 0 0 It is now a straightforward task to obtain the desired inequalities. Rest2 ). The contribution of the order larger than 2 in m 2 is ∞ U >2 [a, τ ] := dk k 2 χk1 χk2 +χk1 χk2 + χk0 +χk1 χk>2 +χk>2 χk0 + χk1 + χk>1 χk>1 . 1

Hence, employing the inequality (34), and the definition of F1 in (33), we obtain the bounds 3 ∞ m 2 a0 2 a0 1 U 1+ 1+ dk 2 m 2 t 2 t exp (F1 [a]) >2 [a, τ ](t) ≤ k t0 k t0 1 2 m a0 2 + t exp (2F1 [a]) , k t0 where F1 (t) ≤

m 2 a0 2 k t0 t .

More explicitly we can rewrite it as

3 U 2 a0 2 1 + m 2 a0 t0 exp 2m 2 a0 t0 . t >2 [a, τ ](t) ≤ 3 m t0 We do not analyze the functional derivative of this contribution, because, due to the extra k appearing in the functional derivative of the trigonometric functions with respect to τ , we have to further split it. III) Order The contribution at third order in m 2 is ∞ 2 3 χ 0 [a, τ ] + χ 2 χ 1 [a, τ ] + χ 1 χ 2 [a, τ ] + χ 0 χ 3 [a, τ ] , U [a, τ ] := dk k χ 3 k k k k k k k k 1


and its explicit form is U [a, τ ](t) := 3


m6 sin(k(τ − τ1 )) sin(k(τ1 − τ2 )) k2 1 03 1

and, using (39) and (38) together with (34), we can find a constant C (a0 , t0 , c), with the desired properties, i.e., (t) ∀t ∈ [0, t0 ]. DU ≤ C (a0 , t0 , c)δ X ∞ , >3


After a rather long but straightforward computation, we are ready to collect all the estimates presented above, hence obtaining that |(t)| ≤ C(a0 , c, t0 )t,

|D [X, δ X ] (t)| ≤ C(a0 , c, t0 )δ X ∞ ,

∀t ∈ [0, t0 ],

where $C(a_0, c, t_0)$ remains bounded for small $t_0$ and for some fixed > 0. The thesis of the proposition can be obtained by noticing that $\|\delta X\|_\infty \le t_0 \|\delta X\|_B$ in particular.

References

[An85] Anderson, P.R.: Effects of quantum fields on singularities and particle horizons in the early universe. 3. The conformally coupled massive scalar field. Phys. Rev. D 32, 1302 (1985)
[An86] Anderson, P.R.: Effects of quantum fields on singularities and particle horizons in the early universe. 4. Initially empty universes. Phys. Rev. D 33, 1567 (1986)
[AE00] Anderson, P.R., Eaker, W.: Analytic approximation and an improved method for computing the stress-energy of quantized scalar fields in Robertson-Walker spacetimes. Phys. Rev. D 61, 024003 (2000)
[BGP96] Bär, C., Ginoux, N., Pfäffle, F.: “Wave equations on Lorentzian manifolds and quantization”. ESI Lectures in Mathematics and Physics, Zürich: European Math. Soc. Publishing House, 2007
[BO00] Brevik, I., Odintsov, S.D.: Quantum annihilation of anti-de Sitter universe. Phys. Lett. B 475, 247 (2000)
[BDF09] Brunetti, R., Duetsch, M., Fredenhagen, K.: Perturbative algebraic quantum field theory and the renormalization groups. Adv. Theor. Math. Phys. 13, 1541–1599 (2009)
[BF00] Brunetti, R., Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: renormalization on physical backgrounds. Commun. Math. Phys. 208, 623 (2000)
[BF09] Brunetti, R., Fredenhagen, K.: “Quantum Field Theory on Curved Backgrounds.” In: Lecture Notes in Physics 786, Bär, C., Fredenhagen, K., eds. Berlin-Heidelberg-New York: Springer, 2009, pp. 129–156
[BFK95] Brunetti, R., Fredenhagen, K., Köhler, M.: The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes. Commun. Math. Phys. 180, 633 (1996)
[BFV03] Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle: a new paradigm for local quantum physics. Commun. Math. Phys. 237, 31 (2003)
[BD78] Bunch, T.S., Davies, P.C.W.: Quantum fields theory in de Sitter space: renormalization by point-splitting. Proc. R. Soc. Lond. A 360, 117 (1978)
[DFP08] Dappiaggi, C., Fredenhagen, K., Pinamonti, N.: Stable cosmological models driven by a free quantum scalar field. Phys. Rev. D 77, 104015 (2008)
[DMP06] Dappiaggi, C., Moretti, V., Pinamonti, N.: Rigorous steps towards holography in asymptotically flat spacetimes. Rev. Math. Phys. 18, 349 (2006)
[DMP09a] Dappiaggi, C., Moretti, V., Pinamonti, N.: Cosmological horizons and reconstruction of quantum field theories. Commun. Math. Phys. 285, 1129–1163 (2009)
[DMP09b] Dappiaggi, C., Moretti, V., Pinamonti, N.: Distinguished quantum states in a class of cosmological spacetimes and their Hadamard property. J. Math. Phys. 50, 062304 (2009)
[DV09] Degner, A., Verch, R.: Cosmological particle creation in states of low energy. J. Math. Phys. 51, 022302 (2010)
[DB60] DeWitt, B.S., Brehme, R.W.: Radiation damping in a gravitational field. Ann. Phys. 9, 220 (1960)
[Di80] Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77, 219 (1980)
[DH72] Duistermaat, J.J., Hörmander, L.: Fourier integral operators II. Acta Math. 128, 183–269 (1972)
[EG10] Eltzner, B., Gottschalk, H.: “Dynamical Backreaction in Robertson-Walker Spacetime.” http://arXiv.org/abs/1003.3630v2 [math-ph], 2010
[Fe00] Fewster, C.J.: A general worldline quantum inequality. Class. Quant. Grav. 17, 1897–1911 (2000)
[FW96] Flanagan, E.E., Wald, R.M.: Does backreaction enforce the averaged null energy condition in semiclassical gravity? Phys. Rev. D 54, 6233 (1996)
[FH90] Fredenhagen, K., Haag, R.: On the derivation of Hawking radiation associated with the formation of a black hole. Commun. Math. Phys. 127, 273 (1990)
[Fr75] Friedlander, F.G.: “The wave equation on a curved space-time.” Cambridge: Cambridge University Press, 1975
[GNW09] Gazzola, G., Nemes, M.C., Wreszinski, W.F.: On the Casimir energy for a massive quantum scalar field and the cosmological constant. Ann. Phys. 324, 2095–2107 (2009)
[GM86] Gottlöber, S., Müller, V.: Vacuum polarization and scalar field effects in the early universe. Astron. Nachr. 307, 285–287 (1986)
[Ha92] Haag, R.: “Local quantum physics: Fields, particles, algebras”. Second Revised and Enlarged Edition, Berlin-Heidelberg-New York: Springer, 1992
[Ha82] Hamilton, R.: The inverse function theorem of Nash and Moser. Bull. Am. Math. Soc. 7, 65 (1982)
[Ha75] Hawking, S.W.: Particle creation by black holes. Commun. Math. Phys. 43, 199 (1975)
[Ho00] Hollands, S.: “Aspects of Quantum Field Theory in Curved Spacetime”. Ph.D. Thesis, University of York, 2000, advisor B.S. Kay
[HW01] Hollands, S., Wald, R.M.: Local Wick polynomials and time ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 223, 289 (2001)
[HW02] Hollands, S., Wald, R.M.: Existence of local covariant time ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 231, 309 (2002)
[HW03] Hollands, S., Wald, R.M.: On the renormalization group in curved spacetime. Commun. Math. Phys. 237, 123 (2003)
[HW05] Hollands, S., Wald, R.M.: Conservation of the stress tensor in interacting quantum field theory in curved spacetimes. Rev. Math. Phys. 17, 227 (2005)
[Ho89] Hörmander, L.: “The Analysis of Linear Partial Differential Operators I”. Second edition, Berlin: Springer-Verlag, 1989
[HV08] Hu, B.L., Verdaguer, E.: “Stochastic Gravity: Theory and Applications.” Living Rev. Rel. 11, 3 (2008); Living Rev. Rel. 7, 3 (2004)
[JS02] Junker, W., Schrohe, E.: Adiabatic vacuum states on general spacetime manifolds: definition, construction, and physical properties. Ann. Henri Poincaré 3(6), 1113–1181 (2002)
[KW91] Kay, B.S., Wald, R.M.: Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on space-times with a bifurcate Killing horizon. Phys. Rept. 207, 49 (1991)
[LR90] Lüders, C., Roberts, J.E.: Local quasiequivalence and adiabatic vacuum states. Commun. Math. Phys. 134, 29–63 (1990)
[Mo03] Moretti, V.: Comments on the stress-energy tensor operator in curved spacetime. Commun. Math. Phys. 232, 189 (2003)
[Mo06] Moretti, V.: Uniqueness theorem for BMS-invariant states of scalar QFT on the null boundary of asymptotically flat spacetimes and bulk-boundary observable algebra correspondence. Commun. Math. Phys. 268, 727 (2006)
[Mo08] Moretti, V.: Quantum out-states holographically induced by asymptotic flatness: invariance under spacetime symmetries, energy positivity and Hadamard property. Commun. Math. Phys. 279, 3175 (2008)
[NO99] Nojiri, S., Odintsov, S.D.: Effective action for conformal scalars and anti-evaporation of black holes. Int. J. Mod. Phys. A 14, 1293–1304 (1999)
[Ol07] Olbermann, H.: States of low energy on Robertson-Walker spacetimes. Class. Quantum Grav. 24, 5011–5030 (2007)
[Pa68] Parker, L.: Particle creation in expanding universes. Phys. Rev. Lett. 21, 562 (1968)
[Pa69] Parker, L.: Quantized fields and particle creation in expanding universe. I. Phys. Rev. D 183, 1057 (1969)
[PS93] Parker, L., Simon, J.Z.: Einstein equation with quantum corrections reduced to second order. Phys. Rev. D 47, 1339 (1993)
[PR99] Parker, L., Raval, A.: Non-perturbative effects of vacuum energy on the recent expansion of the universe. Phys. Rev. D 60, 063512 (1999) [Erratum-ibid. D 67, 029901 (2003)]
[PRV08] Perez-Nadal, G., Roura, A., Verdaguer, E.: Backreaction from non-conformal quantum fields in de Sitter spacetime. Class. Quant. Grav. 25, 154013 (2008)
[Pi09] Pinamonti, N.: Conformal generally covariant quantum field theory: the scalar field and its Wick products. Commun. Math. Phys. 288, 1117 (2009)
[Ra96] Radzikowski, M.J.: Micro-local approach to the Hadamard condition in quantum field theory on curved space-time. Commun. Math. Phys. 179, 529 (1996)
[RV99] Roura, A., Verdaguer, E.: Mode decomposition and renormalization in semiclassical gravity. Phys. Rev. D 60, 107503 (1999)
[SV01] Sahlmann, H., Verch, R.: Microlocal spectrum condition and Hadamard form for vector valued quantum fields in curved space-time. Rev. Math. Phys. 13, 1203 (2001)
[Sa09] Sanders, K.: Equivalence of the (generalised) Hadamard and microlocal spectrum condition for (generalised) free fields in curved spacetime. Commun. Math. Phys. 295(2), 485–501 (2010)
[Sh08] Shapiro, I.L.: Effective action of vacuum: semiclassical approach. Class. Quant. Grav. 25, 103001 (2008)
[SS02] Shapiro, I.L., Sola, J.: Massive fields temper anomaly-induced inflation. Phys. Lett. B 530, 10 (2002)
[St80] Starobinsky, A.A.: A new type of isotropic cosmological models without singularity. Phys. Lett. B 91, 99 (1980)
[SVW02] Strohmaier, A., Verch, R., Wollenberg, M.: Microlocal analysis of quantum fields on curved space-times: analytic wavefront sets and Reeh-Schlieder theorems. J. Math. Phys. 43, 5514 (2002)
[Vi85] Vilenkin, A.: Classical and quantum cosmology of the Starobinsky inflationary model. Phys. Rev. D 32, 2511 (1985)
[Wa77] Wald, R.M.: The back reaction effect in particle creation in curved spacetime. Commun. Math. Phys. 54, 1–19 (1977)
[Wa78a] Wald, R.M.: Axiomatic renormalization of stress tensor of a conformally invariant field in conformally flat spacetimes. Ann. Phys. 110, 472 (1978)
[Wa78b] Wald, R.M.: Trace anomaly of a conformally invariant quantum field in curved space-time. Phys. Rev. D 17, 1477 (1978)

Communicated by Y. Kawahigashi


Communications in


Ornstein-Zernike Asymptotics for a General “Two-Particle” Lattice Operator

C. Boldrighini (1), R. A. Minlos (2), A. Pellegrinotti (3)

(1) Dipartimento di Matematica, Università di Roma “La Sapienza”, Piazzale Aldo Moro 2, 00185 Rome, Italy
(2) Institute for Problems of Information Transmission, Russian Academy of Sciences, Bolshoy Karetny Per. 19, Moscow 127994, Russia
(3) Dipartimento di Matematica, Università di Roma Tre, Largo S. Leonardo Murialdo 1, 00146 Rome, Italy. E-mail: [email protected]

Received: 3 March 2010 / Accepted: 11 March 2011 Published online: 28 May 2011 – © Springer-Verlag 2011

Abstract: We study the asymptotic behavior of correlations for a general “two-particle” operator $T$ acting on the Hilbert space $\ell^2(\mathbb{Z}^d \times \mathbb{Z}^d)$, for all dimensions $d = 1, 2, \ldots$. $T$ is written as the sum of a “main” term and a small “interacting” term, a form which appears in many problems. If the interacting term is small, we give a complete description of the asymptotics for large $t$ of the correlations $(T^t f^{(1)}, f^{(2)})$, $t = 1, 2, \ldots$, for $f^{(1)}, f^{(2)}$ in some suitable class. The asymptotics is of the Ornstein-Zernike type, i.e., exponential with a power-law factor, which is $t^{-d}$ for $d \ge 3$, but for $d = 1, 2$ it can be “anomalous” and is determined by the interacting term.

1. Introduction. Position of the Problem

Many problems in mathematical physics lead one to consider operators acting on the Hilbert space $\mathcal{H} := \ell^2(\mathbb{Z}^d \times \mathbb{Z}^d)$, where $d = 1, 2, \ldots$ is the lattice dimension, according to the formula
\[
(Tf)(x_1,x_2) = \sum_{y_1,y_2\in\mathbb{Z}^d} \left(a(x_1-y_1, x_2-y_2) + \alpha S(x_1,x_2;y_1,y_2)\right) f(y_1,y_2), \qquad f\in\mathcal{H}. \tag{1.1}
\]
$T$ is usually called a “two-particle” operator (see [1]). $\alpha > 0$ is a small parameter, and $a(\cdot,\cdot)$, $S(\cdot,\cdot;\cdot,\cdot)$ are functions with some suitable decay properties. A frequent problem is that of finding the asymptotics of the scalar products in $\mathcal{H}$,
\[
\left(T^t f^{(1)}, f^{(2)}\right), \tag{1.2}
\]
as the iteration number $t$ becomes large, for some class of functions $f^{(1)}, f^{(2)} \in \mathcal{H}$.

Partially supported by INdAM (G.N.F.M.) and M.U.R.S.T. research funds.

Partially supported by C.N.R. (G.N.F.M.) and M.U.R.S.T. research funds, by RFFI grants n. 05-01-00449, Scientific School grant n. 934.2003.1, and CRDF research funds N RM1-2085.


An example of a general character is the following. Let {ξt : t ∈ Z1 } be a stationary Markov chain with values in some space and invariant measure ν. An important property of such systems is the asymptotics as t → ∞ of correlations of the type (ξ0 ), (ξt ) := (ξ0 )(ξt ) − (ξ0 )2 ,

(1.3)

:= L 2 (, ν) is some functional on and · denotes averaging with where ∈ H respect to the probability distribution of the Markov chain. Such quantities may represent time correlations for evolution processes, such as random walks in a Markov environment, or space correlations for Gibbs states in statistical mechanics and Euclidean quantum field theory, and also other quantities of interest. If T denotes the transfer matrix (or stochastic operator) of the Markov chain and the correlation (1.3) can be written as (·, ·)H is the scalar product in H, t (1.4) (T ( − )), − H . can be represented as a direct (in general non-orthogonal) In many cases (see, e.g. [1]) H sum of invariant (with respect to T) subspaces =H 0 + H 1 + H 2 + H 3 , H

(1.5)

0 is the subspace of the constants (“vacuum”, in the physical terminology), and where H 1 , H 2 describe the so-called “one-particle” and “two-particle” excitathe subspaces H tions of the vacuum. Much attention has been devoted to models with the property that the maximal absolute values κi := maxλ∈σ (T /H i ) |λ| of the spectra of the restrictions of i , i = 0, 1, 2, 3, are decreasing T to H 1 = κ0 > κ1 > κ2 > κ3 .

(1.6)

according to (1.5) In such cases we can expand an element ∈ H = 0 + 1 + 2 + 3 ,

i , i = 0, 1, 2, 3, i ∈ H

(1.7)

and recalling that 0 = , we see, by a heuristic argument, that if 1 = 0, the leading term of the asymptotics (1.4) behaves, as t → ∞, roughly as κ1t , and if 1 = 0, as κ2t . In both cases the exact asymptotics is exponential, with power-law factors, which . Such asymptotics is usually called “Ornsteindepend on the detailed properties of T Zernike” (O.Z.), after the pioneering work [11] in which it was first predicted, by physical arguments, for statistical mechanical models of density correlations. The O.Z. asymptotics for correlations of the type (1.4) has been investigated for concrete models both by rigorous mathematical methods and at the level of theoretical physics. In particular, rigorous results, based on the detailed analysis of the spectral properties of the relevant restriction of T, according to the lines described above, were obtained in the papers [2–7], [9,10]. There are also many papers which follow other approaches. We refer the reader to [12–19]. We are here interested in the two-particle case, i.e., in the correlations for the restric2 (which often go under the physical term “four-point functions”). tion T := T/H In many cases T can be reduced to the form (1.1) by choosing an appropriate basis 2 . The problem of finding the exact asymptotics of the {vx1 ,x2 : x1 , x2 ∈ Zd } in H correlation (1.4) leads then to the problem (1.2).
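The heuristic statement that the correlation (1.4) decays roughly as $\kappa_1^t$ can be checked by hand on the simplest possible example. The following Python sketch uses a two-state stationary Markov chain, which is an assumption made only for illustration and not a model from the paper; the jump probabilities `p`, `q` and the observable (the indicator of state 0, called `phi` in the comments) are hypothetical names. Its transfer matrix has eigenvalues $1$ and $1 - p - q$, so $\kappa_1 = |1 - p - q|$.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def correlation(p, q, t):
    """<phi(xi_0) phi(xi_t)> - <phi>^2 for phi = indicator of state 0."""
    T = [[1.0 - p, p], [q, 1.0 - q]]   # transition (stochastic) matrix
    pi0 = q / (p + q)                  # stationary weight of state 0
    Tt = [[1.0, 0.0], [0.0, 1.0]]      # T^0
    for _ in range(t):
        Tt = mat_mul(Tt, T)
    return pi0 * Tt[0][0] - pi0 ** 2


def predicted(p, q, t):
    """Closed form pi0 * pi1 * kappa_1^t with kappa_1 = 1 - p - q."""
    pi0, pi1 = q / (p + q), p / (p + q)
    return pi0 * pi1 * (1.0 - p - q) ** t


for t in range(6):
    print(t, correlation(0.3, 0.2, t), predicted(0.3, 0.2, t))
```

Because the state space is finite, the decay in this toy example is purely exponential; no power-law factor of the Ornstein-Zernike type appears here.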

Ornstein-Zernike Asymptotics for a General “Two-Particle” Lattice Operator


All results so far obtained, e.g., in the papers [2,3,17], rely on the particular features of the models. The aim of the present paper is to give a complete general analysis of the asymptotics (1.2) for any dimension $d = 1, 2, \dots$, which can be applied to a wide class of systems, including models of random walks in random media, quantum and classical models of matter, models of field theory, etc. The results of the present paper include a general description of the “anomalous” behavior of two-particle correlations in low dimension $d = 1, 2$, which was first discovered by Polyakov [18].

We now pass to the precise statement of our problem. The assumptions concerning the functions $a(\cdot,\cdot)$ and $S(\cdot,\cdot;\cdot,\cdot)$ are the following:

1. $a$ is real, symmetric, and even, i.e.,

$$a(x_1, x_2) = a(x_2, x_1), \qquad a(x_1, x_2) = a(-x_1, -x_2).$$ (1.8)

2. $a$ satisfies, for some positive constants $C$, $q$, with $q \in (0, 1)$, the estimate

$$|a(x_1, x_2)| \le C q^{|x_1| + |x_2|}, \qquad |x| := |x^{(1)}| + \dots + |x^{(d)}|, \quad x = (x^{(1)}, \dots, x^{(d)}).$$ (1.9)

3. $S$ is real, even, and invariant under space shifts, i.e.,

$$S(x_1, x_2; y_1, y_2) = S(-x_1, -x_2; -y_1, -y_2),$$ (1.10)

$$S(x_1, x_2; y_1, y_2) = S(x_1 + v, x_2 + v; y_1 + v, y_2 + v), \qquad v \in \mathbb Z^d.$$ (1.11)

4. $S$ satisfies the following cluster inequality, for some $\bar C > 0$:

$$|S(x_1, x_2; y_1, y_2)| \le \bar C q^{\min_\tau d(\tau)},$$ (1.12)

where $q \in (0, 1)$ is the same as above, $\tau$ is a connected graph with vertices at $x_1, x_2, y_1, y_2$, $d(\tau)$ is its length, and the minimum is taken over all such graphs.

We further assume that the Fourier transform

$$\tilde a(\lambda_1, \lambda_2) = \sum_{y_1, y_2} e^{i(\lambda_1, y_1) + i(\lambda_2, y_2)}\, a(y_1, y_2)$$ (1.13)

has a unique absolute maximum at some point $(\bar\lambda_1, \bar\lambda_2)$, with a negative-definite Hessian matrix. We suppose for definiteness that $\bar\lambda_1 = \bar\lambda_2 = 0$ and the maximum is positive, i.e.,

$$\max_{(\lambda_1, \lambda_2) \in T^d \times T^d} \tilde a(\lambda_1, \lambda_2) = \tilde a(0, 0) := \kappa > 0,$$ (1.14)

and that the parameter $\alpha$ appearing in (1.1) is also positive. We also assume that $|\tilde a(\lambda_1, \lambda_2)| < \kappa$ for all $(\lambda_1, \lambda_2) \ne (0, 0)$.

On the functions $f^{(1)}, f^{(2)}$ in (1.2) we assume exponential fall-off at infinity, i.e.,

$$|f^{(i)}(x_1, x_2)| \le C_i\, q^{|x_1| + |x_2|}, \qquad i = 1, 2,$$ (1.15)

for some constants $C_i$, $i = 1, 2$, and the same $q$ as above. We can now state our main results, which are proved in §4.
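As a quick sanity check of the setting, one can look at the non-interacting case α = 0 in dimension d = 1, where the correlation can be computed in closed form. The sketch below assumes a hypothetical product kernel a(x1, x2) = p(x1) p(x2), with p the lazy simple random walk (p(0) = 1/2, p(±1) = 1/4), and assumes that for α = 0 the operator T of (1.1) acts as convolution with a; neither choice is taken from the paper. This kernel satisfies (1.8), (1.9), and (1.14) with κ = 1. Taking f(1) = f(2) equal to the indicator of the origin, the scalar product reduces to the squared return probability of the walk.

```python
import math

def free_correlation(t):
    """(T^t f, f) for alpha = 0, d = 1 and f the indicator of (0, 0).

    Assumption: T is convolution with a(x1, x2) = p(x1) p(x2), where p is
    the lazy simple random walk. Then (T^t f, f) = (p^{*t}(0))^2, and the
    t-fold convolution p^{*t}(0) equals binom(2t, t) / 4^t.
    """
    one_particle = math.comb(2 * t, t) / 4 ** t
    return one_particle ** 2

if __name__ == "__main__":
    for t in (10, 100, 1000):
        # kappa = 1 and d = 1, so t * (T^t f, f) should approach a constant.
        print(t, t * free_correlation(t))
```

In this toy case the rescaled values settle near 1/π, which is the decay of order κ^t t^{-d} for d = 1.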



Theorem 1.1. Asymptotics for dimension $d \ge 3$. For $d \ge 3$, under the assumptions above, there is a constant $\mathcal M_d(f^{(1)}, f^{(2)})$ such that the following asymptotics of the scalar products (1.2) holds, as $t \to \infty$:

$$\left(T^t f^{(1)}, f^{(2)}\right) = \mathcal M_d(f^{(1)}, f^{(2)})\, \frac{\kappa^t}{t^d}\, \left(1 + r_d(t)\right),$$ (1.16)

where $r_d(t) = O\left(\frac{\ln t}{t}\right)$ for $d = 4$ and $r_d(t) = O\left(\frac{1}{t}\right)$ for $d \ne 4$.

For low dimension, $d = 1, 2$, the asymptotics depends on the sign of the constant $C_d^{(1)}(0)$, defined below by formula (3.15).

Theorem 1.2. Asymptotics for dimension $d = 1$. For $d = 1$, under the assumptions above, the following asymptotics of the scalar products (1.2) hold, as $t \to \infty$:

i) If $C_1^{(1)}(0) > 0$, there is a constant $\mathcal M_1^{(+)}(f^{(1)}, f^{(2)})$ such that

$$\left(T^t f^{(1)}, f^{(2)}\right) = \mathcal M_1^{(+)}(f^{(1)}, f^{(2)})\, \frac{\kappa^t}{t^2}\, \left(1 + O\left(\frac{1}{t}\right)\right);$$ (1.17)

ii) If $C_1^{(1)}(0) < 0$, there are constants $\bar\kappa > \kappa$ and $\mathcal M_1^{(-)}(f^{(1)}, f^{(2)})$ such that

$$\left(T^t f^{(1)}, f^{(2)}\right) = \mathcal M_1^{(-)}(f^{(1)}, f^{(2)})\, \frac{\bar\kappa^t}{\sqrt t}\, \left(1 + O\left(\frac{1}{t}\right)\right);$$ (1.18)

iii) If $C_1^{(1)}(0) = 0$, there is a constant $\mathcal M_1^{(0)}(f^{(1)}, f^{(2)})$ such that

$$\left(T^t f^{(1)}, f^{(2)}\right) = \mathcal M_1^{(0)}(f^{(1)}, f^{(2)})\, \frac{\kappa^t}{t}\, \left(1 + \bar r_1(t)\right),$$ (1.19)

where $\bar r_1(t) = O(1/t)$ if $\frac{d^2}{d\Lambda^2} C_1^{(1)}(\Lambda)\big|_{\Lambda = 0} \ge 0$, and $\bar r_1(t) = O(1/\sqrt t)$ if $\frac{d^2}{d\Lambda^2} C_1^{(1)}(\Lambda)\big|_{\Lambda = 0} < 0$.

Theorem 1.3. Asymptotics for dimension $d = 2$. For $d = 2$, under the assumptions above, the following asymptotics of the scalar products (1.2) hold, as $t \to \infty$:

i) If $C_2^{(1)}(0) > 0$, there is a constant $\mathcal M_2^{(+)}(f^{(1)}, f^{(2)})$ such that

$$\left(T^t f^{(1)}, f^{(2)}\right) = \mathcal M_2^{(+)}(f^{(1)}, f^{(2)})\, \frac{\kappa^t}{t^2 \ln^2 t}\, \left(1 + O\left(\frac{1}{\ln t}\right)\right);$$ (1.20)

ii) If $C_2^{(1)}(0) < 0$, there are constants $\mathcal M_2^{(-)}(f^{(1)}, f^{(2)})$ and $\bar\kappa > \kappa$ such that

$$\left(T^t f^{(1)}, f^{(2)}\right) = \mathcal M_2^{(-)}(f^{(1)}, f^{(2)})\, \frac{\bar\kappa^t}{t}\, \left(1 + O\left(\frac{1}{t}\right)\right);$$ (1.21)

iii) If $C_2^{(1)}(0) = 0$, there is a constant $\mathcal M_2^{(0)}(f^{(1)}, f^{(2)})$ such that

$$\left(T^t f^{(1)}, f^{(2)}\right) = \mathcal M_2^{(0)}(f^{(1)}, f^{(2)})\, \frac{\kappa^t}{t^2}\, \left(1 + O\left(\frac{\ln t}{t}\right)\right).$$ (1.22)



As stated by the next corollary, for small $\alpha$, the coefficients of the leading terms of the asymptotics do not vanish unless the following quantities vanish:

$$M(f^{(i)}) := \sum_{z_1 \in \mathbb Z^d} \sum_{z_2 \in \mathbb Z^d} f^{(i)}(z_1, z_2), \qquad i = 1, 2.$$

Corollary 1.4. For $\alpha$ small enough the coefficients of the leading terms of the asymptotics (1.16), (1.17), (1.18), (1.19), (1.20), (1.21), (1.22) can be written as

$$\mathcal M_d(f^{(1)}, f^{(2)}) = c_d \left(M(f^{(1)}) M(f^{(2)}) + O(\alpha)\right), \qquad d \ge 3,$$ (1.23)

$$\mathcal M_d^{(\epsilon)}(f^{(1)}, f^{(2)}) = c_d^{(\epsilon)} \left(M(f^{(1)}) M(f^{(2)}) + O(\alpha)\right), \qquad d = 1, 2, \quad \epsilon = +, -, 0,$$ (1.24)

for some non-vanishing constants $c_d$, $d \ge 3$, and $c_d^{(\epsilon)}$, $d = 1, 2$, $\epsilon = \pm, 0$.

For an intuitive understanding of our results above a short discussion is in order. For $\alpha = 0$ one can easily see, in analogy with the standard local limit theorem asymptotics in probability theory, that the leading term of the asymptotics is $\kappa^t t^{-d}$ for all $d = 1, 2, \dots$. If $\alpha \ne 0$ is small enough the interaction cannot change the asymptotics for $d \ge 3$, whereas in low dimension $d = 1, 2$ the interacting term always “wins” in the “repulsive” case $C_d^{(1)}(0) > 0$ and in the “attracting” case $C_d^{(1)}(0) < 0$, as shown in more detail in Remark 4.1 below. The asymptotics is again of the type $\kappa^t t^{-d}$ in the “neutral” case $C_d^{(1)}(0) = 0$, which includes the real probabilistic case, when the matrix elements of $T$ are non-negative and $\kappa = 1$.

If on the other hand $\alpha$ is not small, then the above behavior cannot be expected to be true, as some bound states may appear, causing exponential decay with a constant different from $\kappa$. Occurrence of this kind of phenomena was shown to hold in some cases for one-particle operators [9,10].

The plan of the paper is as follows. Sections 2 and 3 contain some preliminary constructions and results. The main proofs are in Sect. 4, and in Sect. 5 we give a proof of some statements of Sect. 3.

2. Preliminary Constructions

By condition (1.11), the operator $T$ is translation invariant, i.e.,

$$T U_v = U_v T,$$ (2.1)

where $\{U_v : v \in \mathbb Z^d\}$ is the unitary group on $\mathcal H$ generated by the shifts

$$(U_v f)(x_1, x_2) = f(x_1 - v, x_2 - v), \qquad f \in \mathcal H.$$

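$\mathcal H$ can then be decomposed as a direct integral,

$$\mathcal H = \int_{T^d}^{\oplus} \mathcal H_\Lambda \, d\Lambda$$ (2.2)

of Hilbert spaces $\mathcal H_\Lambda$, $\Lambda \in T^d$, which reduce the operators $T$ and $U_v$, $v \in \mathbb Z^d$:

$$T = \int_{T^d}^{\oplus} T_\Lambda \, d\Lambda, \qquad U_v = \int_{T^d}^{\oplus} e^{i(\Lambda, v)} E_\Lambda \, d\Lambda.$$ (2.3)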



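$T_\Lambda$ and $E_\Lambda$ are operators on $\mathcal H_\Lambda$, and $E_\Lambda$ is the identity operator. $\mathcal H_\Lambda$ can be identified with the space of the functions $f(x_1 - x_2)\, e^{-i(\Lambda, x_2)}$, with $f \in \ell_2(\mathbb Z^d)$. The Fourier transform on $\mathcal H$ corresponding to the representation (2.2), and its inverse, are defined as

$$f_\Lambda(z) = \sum_{x \in \mathbb Z^d} f(x + z, x)\, e^{i(\Lambda, x)}, \qquad f(x_1, x_2) = \frac{1}{(2\pi)^d} \int_{T^d} f_\Lambda(x_1 - x_2)\, e^{-i(\Lambda, x_2)}\, d\Lambda.$$ (2.4)

By isometry, if $dm(\Lambda) = d\Lambda/(2\pi)^d$ denotes the normalized Haar measure on $T^d$, we have for any $f^{(1)}, f^{(2)} \in \mathcal H$,

$$\left(f^{(1)}, f^{(2)}\right)_{\mathcal H} = \int_{T^d} \left(f_\Lambda^{(1)}, f_\Lambda^{(2)}\right)_{\ell_2(\mathbb Z^d)} dm(\Lambda).$$ (2.5)

For all positive $t \in \mathbb Z^1$ we have

$$\left(T^t f^{(1)}, f^{(2)}\right)_{\mathcal H} = \int_{T^d} \left(T_\Lambda^t f_\Lambda^{(1)}, f_\Lambda^{(2)}\right)_{\ell_2(\mathbb Z^d)} dm(\Lambda),$$ (2.6)

where $T_\Lambda$ acts on the elements of $\mathcal H_\Lambda$ according to the formula

$$(T_\Lambda f_\Lambda)(z) = \sum_{u_1, u_2} \left(a(u_1, u_2) + \alpha S(z, 0; z - u_1, -u_2)\right) e^{i(\Lambda, u_2)}\, f_\Lambda(z + u_2 - u_1).$$ (2.7)

Finally, by applying the usual Fourier transform on $\ell_2(\mathbb Z^d)$,

$$\tilde f_\Lambda(\lambda) = \sum_{z} f_\Lambda(z)\, e^{i(\lambda, z)} \in L_2(T^d, dm),$$ (2.8)

we have, again by isometry,

$$\left(f_\Lambda^{(1)}, f_\Lambda^{(2)}\right)_{\ell_2(\mathbb Z^d)} = \left(\tilde f_\Lambda^{(1)}, \tilde f_\Lambda^{(2)}\right)_{L_2(T^d, dm)}.$$ (2.9)

The operator $T_\Lambda$ goes into a new operator $\tilde T_\Lambda$, equivalent to $T_\Lambda$ under a unitary transformation and acting on $L_2(T^d, dm)$ according to the formula

$$(\tilde T_\Lambda \varphi)(\lambda) = \tilde a_\Lambda(\lambda)\, \varphi(\lambda) + \alpha \int_{T^d} K_\Lambda(\lambda, \mu)\, \varphi(\mu)\, dm(\mu).$$ (2.10)

Here $\tilde a_\Lambda(\lambda) = \tilde a(\lambda, \Lambda - \lambda)$ (see (1.13)), and $K_\Lambda(\lambda, \mu) = \tilde S(\lambda; \mu, \Lambda - \mu)$, with

$$\tilde S(\lambda_1; \lambda_2, \lambda_3) = \sum_{x_1, y_1, y_2} S(x_1, 0; y_1, y_2)\, e^{i(\lambda_1, x_1) - i(\lambda_2, y_1) - i(\lambda_3, y_2)}.$$ (2.11)

By the symmetry conditions (1.8), (1.10), the functions $\tilde a_\Lambda(\lambda)$ and $K_\Lambda(\lambda, \mu)$ are real. We will make use of the resolvent formula

$$\tilde T_\Lambda^t = \frac{1}{2\pi i} \int_\gamma \left(\tilde T_\Lambda - zE\right)^{-1} z^t\, dz,$$ (2.12)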


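where the integration contour $\gamma$ goes around the spectrum of $\tilde T_\Lambda$ clockwise. The resolvent $(\tilde T_\Lambda - zE)^{-1}$ of the operator $\tilde T_\Lambda$ acts (see [9]) according to the formula

$$\left((\tilde T_\Lambda - zE)^{-1} \varphi\right)(\lambda) = \frac{\varphi(\lambda)}{\tilde a_\Lambda(\lambda) - z} - \frac{1}{\Delta_\Lambda(z)} \int_{T^d} \frac{D_\Lambda(\lambda, \mu; z)\, \varphi(\mu)}{(\tilde a_\Lambda(\lambda) - z)(\tilde a_\Lambda(\mu) - z)}\, dm(\mu).$$ (2.13)

The functions $\Delta_\Lambda$ and $D_\Lambda$ are given by a power series of $\alpha$:

$$\Delta_\Lambda(z) = 1 + \sum_{n=1}^{\infty} \frac{\alpha^n}{n!}\, \Delta_\Lambda^{(n)}(z), \qquad D_\Lambda(\lambda, \mu; z) = \sum_{n=1}^{\infty} \frac{\alpha^n}{(n-1)!}\, D_\Lambda^{(n)}(\lambda, \mu; z),$$ (2.14)

with $D_\Lambda^{(1)}(\lambda, \mu; z) = K_\Lambda(\lambda, \mu)$ independent of $z$, and

$$\Delta_\Lambda^{(n)}(z) = \int_{T^d \times \dots \times T^d} \prod_{i=1}^{n} \frac{dm(\lambda_i)}{\tilde a_\Lambda(\lambda_i) - z}\ \mathcal T_{\Lambda, n}(\lambda_1, \dots, \lambda_n), \qquad n \ge 1,$$ (2.15)

$$D_\Lambda^{(n)}(\lambda, \mu; z) = \int_{T^d \times \dots \times T^d} \prod_{i=1}^{n-1} \frac{dm(\lambda_i)}{\tilde a_\Lambda(\lambda_i) - z}\ \mathcal T_{\Lambda, n}(\lambda, \mu; \lambda_1, \dots, \lambda_{n-1}), \qquad n \ge 2.$$ (2.16)

In (2.15), (2.16) we have $\mathcal T_{\Lambda, 1}(\lambda) = K_\Lambda(\lambda, \lambda)$, and for $n \ge 2$,

$$\mathcal T_{\Lambda, n}(\lambda_1, \dots, \lambda_n) = \det\{K_\Lambda(\lambda_i, \lambda_j)\}_{i, j = 1, \dots, n},$$ (2.17)

$$\mathcal T_{\Lambda, n+1}(\lambda, \mu; \lambda_1, \dots, \lambda_n) = \det \begin{pmatrix} K_\Lambda(\lambda, \mu) & K_\Lambda(\lambda, \lambda_1) & \dots & K_\Lambda(\lambda, \lambda_n) \\ K_\Lambda(\lambda_1, \mu) & K_\Lambda(\lambda_1, \lambda_1) & \dots & K_\Lambda(\lambda_1, \lambda_n) \\ \dots & \dots & \dots & \dots \\ K_\Lambda(\lambda_n, \mu) & K_\Lambda(\lambda_n, \lambda_1) & \dots & K_\Lambda(\lambda_n, \lambda_n) \end{pmatrix}.$$ (2.18)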

By (2.13), (2.14), setting

β (z) =

(z) =

Td

Td

(1) (2) f˜ (λ) f˜ (λ) dm(λ), (2.19) a˜ (λ) − z

∞ (1) (2) D (λ, μ; z) f˜ (μ) f˜ (λ) αn (n) dm(λ)dm(μ) = (z), d ( a ˜ (λ) − z)( a ˜ (μ) − z) (n − 1)!

T n=1

(2.20) where D (λ, μ; z) is given in (2.14), and finally F (z) = β (z) −

(z) , (z)

we can apply the resolvent formula to get

1 t ˜(1) ˜(2) ˜ z t F (z)dz. T f , f = 2πi γ By the resolvent formula (2.22), and the isometries (2.5), (2.9) we find

1 t (1) (2) = dm( ) z t F (z)dz. T f ,f H 2πi T d γ

(2.21)

(2.22)

(2.23)



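3. Analysis of the Spectrum of the Operator $\tilde T_\Lambda$

As one can see from formula (2.13), the spectrum of $\tilde T_\Lambda$ consists of the interval of the real axis $I_\Lambda = [\kappa_1(\Lambda), \kappa_0(\Lambda)]$, where

$$\kappa_0(\Lambda) = \max_\lambda \tilde a_\Lambda(\lambda), \qquad \kappa_1(\Lambda) = \min_\lambda \tilde a_\Lambda(\lambda),$$ (3.1)

with the possible addition of the zeroes of $\Delta_\Lambda(z)$ which lie outside $I_\Lambda$. If $\alpha$ is small such zeroes lie close to $I_\Lambda$. In fact, for any $\delta > 0$, consider the region of the complex plane

$$V_\delta^\Lambda = \{z \in \mathbb C : d(z, I_\Lambda) \le \delta\},$$ (3.2)

where $d(\cdot, I_\Lambda)$ is the distance from the interval $I_\Lambda$. Then the following simple lemma holds.

Lemma 3.1. For any $\delta > 0$, there is a value $\alpha_0 = \alpha_0(\delta) > 0$ such that if $\Lambda \in T^d$ and $\alpha < \alpha_0$, the zeroes of $\Delta_\Lambda$ lie inside the region $V_\delta^\Lambda$.

Proof. Observe that for $z \notin V_\delta^\Lambda$, if $\mathcal K := \max_{\Lambda, \lambda_1, \lambda_2} |K_\Lambda(\lambda_1, \lambda_2)|$, we have, by (2.15),

$$\left|\Delta_\Lambda^{(n)}(z)\right| \le \frac{\mathcal K^n\, n^{n/2}}{\delta^n},$$ (3.3)

where we made use of the Hadamard inequality for determinants (see [20]). From (2.14) and (3.3), it follows that, for $\alpha$ small enough, $|\Delta_\Lambda(z) - 1| \le \mathrm{const}\, \frac{\alpha}{\delta} < 1$. Hence $\Delta_\Lambda(z) \ne 0$ for $z \notin V_\delta^\Lambda$.

By our assumptions on $\tilde a$, we can find a neighborhood $V \subset T^d$ of the origin, symmetric with respect to sign change, so small that the following conditions hold:

i) For some $\epsilon > 0$,

$$\max_{\Lambda \notin V} \kappa_0(\Lambda) = \kappa - \epsilon = \inf_{\Lambda \in V} \kappa_0(\Lambda) > 0.$$ (3.4)

ii) For all $\Lambda \in V$, the set $\{\lambda \in T^d : \tilde a_\Lambda(\lambda) > \kappa_0(\Lambda) - 2\epsilon\} \subset T^d$ is connected and does not contain any critical point of $\tilde a_\Lambda(\lambda)$ except the point $\lambda_0 = \frac{\Lambda}{2}$, where $\tilde a_\Lambda$ attains its maximum $\kappa_0(\Lambda) = \tilde a_\Lambda(\frac{\Lambda}{2})$. (We use the representation $T^d = (-\pi, \pi]^d$, and $\frac{\Lambda}{2}$ means the solution of the equation $2\lambda_0 = \Lambda$ such that $\lambda_0 \in (-\frac{\pi}{2}, \frac{\pi}{2})^d$.)

iii) If $\tilde a_\Lambda''$ denotes the Hessian matrix of $\tilde a_\Lambda$, $\inf_{\Lambda \in V} |\det\{\tilde a_\Lambda''(\Lambda/2)\}| > 0$.

Lemma 3.2. For $\Lambda \notin V$ and $\alpha$ small enough we have

$$\left|\int_\gamma F_\Lambda(z)\, z^t\, dz\right| \le \mathrm{const} \left(\kappa - \frac{\epsilon}{2}\right)^t,$$ (3.5)

where the constant does not depend on $t$ and $\Lambda \notin V$.

Proof. The boundary $\partial V_\delta^\Lambda$ of $V_\delta^\Lambda$ goes around the spectrum of $\tilde T_\Lambda$, and can be used as the contour $\gamma$ (see Fig. 1). Let $\delta \le \frac{\epsilon}{2}$ and assume that $\alpha < \alpha_0(\delta)$. If $z \in \gamma$, $F_\Lambda(z)$ is bounded uniformly in $\Lambda \notin V$, and, by (3.4), $|z| \le \kappa_0(\Lambda) + \frac{\epsilon}{2} \le \kappa - \frac{\epsilon}{2}$, so that $|z^t| \le (\kappa - \epsilon/2)^t$.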



Fig. 1. The region V δ and the integration contour
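The contour representation (2.12) can be checked numerically on a finite-dimensional stand-in. The sketch below replaces the operator by a hypothetical 2×2 real symmetric matrix (not taken from the paper) and takes for the contour a circle around its spectrum, traversed clockwise and discretized by the trapezoidal rule. The function names and the values of `radius` and `n` are illustrative assumptions.

```python
import cmath
import math

def resolvent_2x2(a, z):
    """Return (A - z E)^{-1} for a 2x2 matrix A, using the adjugate formula."""
    (p, q), (r, s) = a
    det = (p - z) * (s - z) - q * r
    return [[(s - z) / det, -q / det], [-r / det, (p - z) / det]]

def power_by_contour(a, t, radius=2.0, n=400):
    """Approximate A^t by (1 / 2 pi i) * integral of (A - z E)^{-1} z^t dz.

    The contour is the circle |z| = radius, traversed clockwise; it must
    enclose the spectrum of A.
    """
    total = [[0j, 0j], [0j, 0j]]
    dtheta = 2 * math.pi / n
    for k in range(n):
        z = radius * cmath.exp(-1j * k * dtheta)  # clockwise parametrization
        dz = -1j * z * dtheta                     # dz = z'(theta) * dtheta
        weight = z ** t * dz / (2j * math.pi)
        res = resolvent_2x2(a, z)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * res[i][j]
    return [[entry.real for entry in row] for row in total]

def power_direct(a, t):
    """Compute A^t by repeated matrix multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(t):
        result = [[sum(result[i][k] * a[k][j] for k in range(2))
                   for j in range(2)] for i in range(2)]
    return result

if __name__ == "__main__":
    A = [[0.5, 0.2], [0.2, 0.3]]  # hypothetical example, spectrum inside |z| < 2
    print(power_by_contour(A, 5))
    print(power_direct(A, 5))
```

With the clockwise orientation and the resolvent written as (A - zE)^{-1}, no extra minus sign is needed; reversing the orientation would flip the sign of the result.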

Take now $\Lambda \in V$ and, as above, $\gamma = \partial V_\delta^\Lambda$, with $\delta \le \epsilon/2$, and consider the contribution to the integral on the left of (3.5) of the part of the contour $\gamma_2 := \{z \in \gamma : \mathrm{Re}\, z < \kappa_0(\Lambda) - \delta/2\}$. Such contribution is estimated as follows:

Remark 3.3. For $\alpha$ small enough and $\Lambda \in V$ we have

$$\left|\int_{\gamma_2} F_\Lambda(z)\, z^t\, dz\right| \le c\, (\kappa - \nu)^t,$$ (3.6)

where the positive constants $c$ and $\nu$ do not depend on $t$ and $\Lambda \in V$.

Proof. By our assumptions on $\tilde a$, we have $\max_\Lambda |\kappa_1(\Lambda)| < \kappa$, so that for $\delta$ small enough $\sup_{z \in \gamma_2} |z| \le \sqrt{(\kappa - \delta/2)^2 + \delta^2} \le \kappa - \nu$ for some $\nu > 0$.

The leading contribution to the asymptotics comes from the remaining part of the contour $\gamma_1 := \gamma \setminus \gamma_2$, close to the right edge of the cut $\kappa_0(\Lambda)$, for $\Lambda \in V$. We need a detailed knowledge of the spectrum near $\kappa_0$, and to this aim we use a convenient representation of $\beta_\Lambda$ and of the other functions appearing in the definition (2.21) of $F_\Lambda$, for $z$ in the circle

$$U_\sigma^\Lambda = \{z \in \mathbb C : |z - \kappa_0(\Lambda)| \le \sigma\}, \qquad \sigma > \epsilon > 2\delta,$$ (3.7)

which clearly contains γ1 (see Fig. 1). The representation of the functions β , , that we need is proved in Lemmas 3.4, 3.5, 3.6 below, the proof of which is deferred to Sect. 5. In what follows = C\I is the complex plane with a cut along the interval I , z 1/2 , z ∈ C, is the branch of the square root that is positive for real z > 0, and ln z the branch of the natural logarithm that is real for z > 0. We also set for brevity ζ := z − κ0 ( ), and, if no confusion arises, we will sometimes omit the arguments of the functions, in particular we will often write κ0 for κ0 ( ). Lσ will denote the Banach space of the analytic functions in U σ (3.7), with the usual sup-norm. Lemma 3.4. There is some σ > 0 such that for ∈ V and z ∈ U σ ∩ the following (0) (0) representations hold, for some functions h d (z; ), Hd (z; ) ∈ Lσ :

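i) For odd dimension $d = 2s + 1$, $s = 0, 1, \dots$,

$$\beta_\Lambda(z) = C_d^{(0)}(\Lambda)\, \zeta^{s - \frac12} + h_d^{(0)}(z; \Lambda)\, \zeta^{s + \frac12} + H_d^{(0)}(z; \Lambda);$$ (3.8)

ii) for even dimension $d = 2s + 2$, $s = 0, 1, \dots$,

$$\beta_\Lambda(z) = C_d^{(0)}(\Lambda)\, \zeta^{s} \ln \zeta^{-1} + h_d^{(0)}(z; \Lambda)\, \zeta^{s+1} \ln \zeta^{-1} + H_d^{(0)}(z; \Lambda).$$ (3.9)

The norms $\|h_d^{(0)}(z; \Lambda)\|_{L_\sigma}$, $\|H_d^{(0)}(z; \Lambda)\|_{L_\sigma}$ are uniformly bounded for $\Lambda \in V$, and

$$C_d^{(0)}(\Lambda) = h^{(d)}(\kappa_0; \Lambda)\, \tilde f_\Lambda^{(1)}\left(\tfrac{\Lambda}{2}\right) \tilde f_\Lambda^{(2)}\left(\tfrac{\Lambda}{2}\right),$$ (3.10)

where $h^{(d)}$ is given in (5.13) below.

Lemma 3.5. For all $\Lambda \in V$ and some $\sigma > 0$ the following representations hold for $z \in U_\sigma^\Lambda \cap (\mathbb C \setminus I_\Lambda)$:

i) For odd dimension $d = 2s + 1$, $s = 0, 1, \dots$, there are functions $h_d^{(1)}(z; \Lambda)$, $H_d^{(1)}(z; \Lambda) \in L_\sigma$ such that

$$\Delta_\Lambda(z) = 1 + C_d^{(1)}(\Lambda)\, \zeta^{s - 1/2} + h_d^{(1)}(z; \Lambda)\, \zeta^{s + 1/2} + H_d^{(1)}(z; \Lambda).$$ (3.11)

ii) For even dimension $d = 2s + 2$, $s = 0, 1, 2, \dots$, there are functions $h_{k;d}^{(1)}(z; \Lambda)$, $k = 1, 2, \dots$, and $H_d^{(1)}(z; \Lambda)$ of $L_\sigma$ such that

$$\Delta_\Lambda(z) = 1 + C_d^{(1)}(\Lambda)\, \zeta^{s} \ln \zeta^{-1} + \sum_{k=2}^{\infty} h_{k;d}^{(1)}(z; \Lambda)\, \zeta^{k-1} \left(\zeta^{s} \ln \zeta^{-1}\right)^k + H_d^{(1)}(z; \Lambda).$$ (3.12)

The norms of the functions appearing in (3.11), (3.12) satisfy the estimates

$$\left\|h_d^{(1)}(z; \Lambda)\right\|_{L_\sigma} + \left\|H_d^{(1)}(z; \Lambda)\right\|_{L_\sigma} \le \alpha B_d, \qquad d = 2s + 1, \quad s = 0, 1, \dots,$$ (3.13)

$$\left\|H_d^{(1)}(z; \Lambda)\right\|_{L_\sigma} + \sum_{k=2}^{\infty} \left\|h_{k;d}^{(1)}(z; \Lambda)\right\|_{L_\sigma} \le \alpha B_d, \qquad d = 2s + 2, \quad s = 0, 1, \dots.$$ (3.14)

Moreover in both representations (3.11), (3.12) we have

$$C_d^{(1)}(\Lambda) = \alpha\, h^{(d)}(\kappa_0; \Lambda)\, R(\Lambda),$$ (3.15)

$$R(\Lambda) = \mathcal T_{\Lambda, 1}\left(\tfrac{\Lambda}{2}\right) + \sum_{n=2}^{\infty} \frac{\alpha^{n-1}}{(n-1)!} \int_{T^d} \dots \int_{T^d} \frac{\mathcal T_{\Lambda, n}\left(\tfrac{\Lambda}{2}, \lambda_2, \dots, \lambda_n\right)}{\prod_{i=2}^{n} \left(\tilde a_\Lambda(\lambda_i) - \kappa_0(\Lambda)\right)}\, dm(\lambda_2) \dots dm(\lambda_n),$$ (3.16)

and $R(\Lambda)$ is even in $\Lambda$, and uniformly bounded for $\Lambda \in V$.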



Lemma 3.6. For all ∈ V and some σ > 0, the following representations hold for z ∈ U σ ∩ : (2)

(2)

i) For odd dimension d = 2s+1, s = 0, 1, . . ., there are functions h 1 (z; ), H1 (z; ) ∈ Lσ such that (0)

(1)

s− 12

(2)

(z) = Cd ( )Cd ( )ζ 2s−1 + h d (z; )ζ

(2)

+ Hd (z; ).

(3.17)

(2)

ii) For even dimension d = 2s + 2, s = 0, 1, 2, . . . there are functions h k;d (z; ), k = 1, 2, . . . and Hd(2) (z; ) of Lσ such that (0)

(1)

(z) = Cd ( )Cd ( )(ζ s ln ζ )2 +

2

(2)

h k;d (z; )ζ k−1 (ζ s ln ζ −1 )k

k=1

+

∞

(2)

(2)

h k;d (z; )ζ k−2 (ζ s ln ζ −1 )k + Hd (z; ).

(3.18)

k=3 (0) (1) The constants C1 ( ), C1 ( ) are given by (3.10) (3.15) of all functions in Lσ are uniformly bounded for ∈ V .

above, and the norms

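By Lemmas 3.4, 3.5, 3.6 we are now able to draw conclusions regarding the spectrum in $U_\sigma^\Lambda$, $\Lambda \in V$.

Lemma 3.7. Under the assumptions above, if $\alpha$ is small enough, the following assertions hold for all $\Lambda \in V$:

i) For $d \ge 3$, and also for $d = 1, 2$ if $C_d^{(1)}(\Lambda) \ge 0$, in the region $\mathrm{Re}\, z > \kappa_0(\Lambda) - \delta/2$ there are no zeroes of $\Delta_\Lambda(z)$.

ii) For $d = 1, 2$ and $C_d^{(1)}(\Lambda) < 0$ there is a unique real zero of $\Delta_\Lambda(z)$ with coordinate $\kappa_*(\Lambda) > \kappa_0(\Lambda)$.

Proof. If $d \ge 3$ we have $s > 0$ in the representations (3.11), (3.12), so that $\Delta_\Lambda(z) = 1 + O(\alpha)$, and, if $\alpha$ is small, $\Delta_\Lambda(z) \ne 0$ for $z \in U_\sigma^\Lambda \cap (\mathbb C \setminus I_\Lambda)$. As $V_\delta^\Lambda \cap \{\mathrm{Re}\, z > \kappa_0(\Lambda) - \delta/2\} \subset U_\sigma^\Lambda$, we conclude (see (3.2)) that for $\mathrm{Re}\, z > \kappa_0(\Lambda) - \delta/2$ there are no zeroes of $\Delta_\Lambda$.

For $d = 1, 2$ in (3.11), (3.12) we take $s = 0$. If $C_d^{(1)}(\Lambda) \ge 0$, then for $d = 1$ the sum $1 + C_1^{(1)}(\Lambda)(z - \kappa_0(\Lambda))^{-\frac12}$ in (3.11) is in absolute value larger than 1, and cannot be compensated by the remaining terms, which are of order $\alpha^2$. On the other hand, by our choice of $\sigma$, zeroes of $\Delta_\Lambda$ which are in absolute value larger than $\kappa_0(\Lambda)$ can appear only in $U_\sigma^\Lambda$. Hence there are no such zeroes. A similar argument holds for $d = 2$.

If now $d = 1$ and $C_d^{(1)}(\Lambda) < 0$, we can write (3.11) as $\Delta_\Lambda(z) = M_\Lambda(z) + \alpha f_\Lambda(z)$, where $M_\Lambda(z) = 1 - \bar C(\Lambda)\, \zeta^{-\frac12}$, with $\bar C(\Lambda) = -C_1^{(1)}(\Lambda)$, and $f_\Lambda$ is analytic in $U_\sigma^\Lambda \cap (\mathbb C \setminus I_\Lambda)$, and bounded. Clearly $M_\Lambda(z) = 0$ for $z = \tilde\kappa(\Lambda) := \kappa_0(\Lambda) + \bar C^2(\Lambda)$. Moreover $\sqrt{z - \kappa_0(\Lambda)}$ is purely imaginary on both sides of the cut $I_\Lambda$, so that $|M_\Lambda(z)| > 1$ there, and it is not hard to see that, if $\alpha$ is small enough, on the circle $|z - \kappa_0(\Lambda)| = \sigma$ we have

$$|M_\Lambda(z)| > 1 - \frac{\bar C(\Lambda)}{\sqrt\sigma} > \frac14 > \alpha |f_\Lambda(z)|.$$

It follows, by the Rouché Theorem, that $\Delta_\Lambda$ and $M_\Lambda$ have the same number of zeroes in the region $U_\sigma^\Lambda$, so that $\Delta_\Lambda$ has exactly one zero, at a real point denoted $\kappa_*(\Lambda) > \kappa_0(\Lambda)$. Moreover $\kappa_*(\Lambda)$ is differentiable and satisfies the estimates

$$\left|\frac{d^m}{d\Lambda^m} \left(\kappa_*(\Lambda) - \kappa_0(\Lambda)\right)\right| \le C_m\, \alpha^2, \qquad m = 0, 1, 2, \dots.$$ (3.19)

In fact, setting $\delta_\kappa(\Lambda) := \kappa_*(\Lambda) - \kappa_0(\Lambda)$, as $\Delta_\Lambda(\kappa_*(\Lambda)) = 0$, by formula (3.11) we find

$$\left(\delta_\kappa(\Lambda)\right)^{\frac12} = \bar C(\Lambda) - h_1^{(1)}(\kappa_*(\Lambda); \Lambda)\, \delta_\kappa(\Lambda) - H_1^{(1)}(\kappa_*(\Lambda); \Lambda) \left(\delta_\kappa(\Lambda)\right)^{\frac12}.$$ (3.20)

Observe that $\bar C(\Lambda)$, $h_1^{(1)}$ and $H_1^{(1)}$ are all of order $\alpha$, which gives relation (3.19).

Finally, for $d = 2$, if $C_2^{(1)}(0) < 0$, reasoning as for dimension $d = 1$, we see that $\Delta_\Lambda$ has exactly one zero, at a real point $\kappa_*(\Lambda) > \kappa_0(\Lambda)$. Moreover $\kappa_*(\Lambda)$ is differentiable and satisfies the estimates

$$\left|\frac{\partial^{m_1 + m_2}}{\partial \Lambda_1^{m_1}\, \partial \Lambda_2^{m_2}} \left(\kappa_*(\Lambda) - \kappa_0(\Lambda)\right)\right| \le C_{m_1, m_2}\, e^{-\frac{c_*}{\alpha}}, \qquad m_1, m_2 = 0, 1, 2, \dots,$$ (3.21)

for some constant $c_* > 0$.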
4. Proofs of the Theorems In this section we are concerned with the part of the integral (2.23), which, as we shall see, gives the main contribution to the asymptotics as t → ∞, i.e.,

1 dm( ) z t F (z)dz, (4.1) 2πi γ1 V where γ1 = {z ∈ γ : Re z ≥ κ0 ( ) − δ/2} and, as before, γ = ∂ V δ , for δ small enough. We begin with some general considerations, which hold in all dimensions d ≥ 1. By assertion i) of Lemma 3.7, there are no zeroes of for z ≥ κ0 ( ) − δ/2. Hence the contour γ1 can be shrunk to the cut, i.e., it goes along the upper side of the cut I , from the point κ0 ( ) − 2δ to κ0 ( ), and then backwards along the lower side from κ0 ( ) to κ0 ( ) − 2δ . The point z = κ0 ( ) − δ/2 is then connected by vertical segments to the endpoints of γ2 (see Fig. 1). F (z) in (4.1) is written as F (z) = N (z)/ (z), where N (z) = β (z) (z) − (z). As γ1 ⊂ U σ ∩ , N can be written in terms of the representations in §3 for β , , . We need to distinguish the odd-dimension and the even-dimension case. For odd dimension d = 2s + 1, we have (3)

s−1/2

N (z) = h d (z; )ζ (3)

(3)

+ Hd (z; ),

(4.2)

(3)

where, by (3.8), (3.11), (3.17), Hd , h d ∈ Lσ are given by (3)

(0)

(1)

(1)

(0)

(2)

(0) (1)

(0)

(1)

h d = Cd (1 + Hd ) + Cd Hd − h d + ζ (Hd h d + h d (1 + Hd )), (3) Hd

=

(0) Hd (1 +

(1) Hd ) −

(2) Hd

(0) (1) + ζ 2s (Cd h d

(1) (0) + Cd h d

(0) (1) + ζ h d h d ).

(4.3) (4.4)



Therefore we have (3)

F (z) =

(1)

s−1/2

h d (z; )ζ

(1)

s−1/2

1 + Cd ( )ζ

(3)

+ Hd (z; ) s+1/2

+ h d (z; )ζ

(1)

+ Hd (z; )

.

(4.5)

We take the difference across the cut I , in terms of the variable u = κ0 ( ) − z, which is non-negative on I . Indicating values above (below) the cut by a suffix + (−) we have s−1/2 s−1/2 (ζ )+ = −i(−1)s u s−1/2 and (ζ )− = +i(−1)s u s−1/2 . Setting v = u/κ0 ( ), the internal integral in (4.1) is (−1)s+1 (κ0 ( ))t+s+1/2 π (3)

δ 2κ0

1

Dd (v; )(1 − v)t v s− 2 dv,

(4.6)

0

(3)

(3)

(1)

where, writing for brevity h˜ (v) = h d (κ0 (1−v); ), and similarly for H˜ (v), h˜ (v) (1) and H˜ (v), Dd (v; ) =

(1) ˜ (1) ˜ (3) ˜ (1) h˜ (3)

(v)(1 + H (v)) − H (v)(C d ( ) − κ0 ( )v h (v)) . (4.7) (1) (1) (1) (1 + H˜ (v))2 + (κ0 ( )v)2s−1 (C ( ) − κ0 ( )v h˜ (v))2

d

For even dimension d = 2s + 2 computations are more lengthy, but straightforward. By (3.9), (3.12), (3.18), the expression (4.2), can now be written as (3)

(2)

(z), N (z) = Hd (z; ) + Cd ( )ζ s ln ζ −1 + ζ s+1 ln ζ −1 N (3)

where Hd (2)

(4.8)

∈ Lσ and (1)

(0)

(0)

(1)

(2)

Cd ( ) = Cd ( )Hd (κ0 ; ) + Cd ( )(1 + Hd (κ0 ; )) − h 1;d (κ0 ; ), (z) = N

2

(3)

h k;d (z; )(ζ s ln ζ −1 )k + ζ

k=0

∞

(4.9)

(3)

h k;d (z; )ζ k−3 (ζ s ln ζ −1 )k .

(4.10)

k=3 (3)

The explicit expression of the functions h k;d ∈ Lσ , k = 0, 1, . . ., is not hard to work out. As before, there are no singularities of F outside I , and the jump of F across the cut is 2 [Im(N )/| |2 ]+ . At the point z = κ0 − u we have [ln ζ −1 ]+ = r (u) − iπ on the upper side and [ln ζ −1 ]− = r (u) + iπ on the other side, with r (u) = ln(1/u). Therefore (1)

(2)

[ (κ0 − u)]+ = (u) − i (u),

(1)

(2)

[N (κ0 − u)]+ = N (u) − i N (u),

where, by (3.12), (4.8), (4.9) and (4.10) we can write the real and imaginary parts as

(1) (1) (1) (1) (u) = 1 + Hd (κ0 − u; ) + (−u)s Cd ( )r (u) + (−u)s+1 A (u) , (4.11)

(1) s s+1 (2) C (2) (u) = (−u) ( )π + (−u) A (u) , (4.12)

d

(1) (3) (2) (1) N (u) = Hd (κ0 − u; ) + (−u)s Cd ( )r (u) − u B (u) , (4.13)

(2) (2) (2) N (u) = (−u)s Cd ( )π − u B (u) . (4.14)


C. Boldrighini, R. A. Minlos, A. Pellegrinotti (1)

(2)

It is easy to see that, for all s ≥ 0 we have, for small u, A (u) = O(r 2 (u)), A (u) = (1) (2) (1) (2) O(r (u)). For the functions B , B we find instead B (u) = O(r (u)), and B (u) = (1) (2) 3 2 O(1), if s > 0, and B (u) = O(r (u)), and B (u) = O(r (u)) for s = 0. Taking into account that the terms in Cd(1) Cd(2) cancel, we find, for all s ≥ 0,

(1) (2) (2) (1) (4) (s) [Im(N )]+ = N − N = (−u)s H (u) + u R (u) , (4.15) (s)

(0)

where R (u) = O(1) for s > 0, but R (u) = O(ln3 u). Setting u = κ0 v, the internal integral is now written as

δ (−1)s (κ0 ( ))t+s+1 2κ0 Dd (v; )(1 − v)t v s dv, (4.16) π 0 where Dd (v; ) is given by 2 times the expression in square brackets in (4.15) divided (1) (2) by ( (u))2 + ( (u))2 and computed for u = κ0 ( )v. We now pass to the proof of the theorems. (1)

Proof of Theorem 1.1. Let d = 2s + 1 and s > 0. For α small 1 + H˜ (v) is bounded away from zero, uniformly in v and ∈ V , so that the quantity Dd (v; ), given by (4.7), is uniformly bounded. Setting v = wt we get, for t → ∞,

δt

δ 2κ0 2κ0 1 w 1 w t s− 12 Dd (v; )(1 − v) v du = Dd ( ; )(1 − )t w s− 2 dw 1 s+ t t 0 t 2 0 1 1 1 = (s + ) Dd (0; ) + O( ) , (4.17) 1 2 t t s+ 2 where (·) denotes the Euler -function. We are left with the integral over . As κ0 ( ) is smooth and even in , we have ln κ0 ( ) = ln κ −

d 1 r jk j k 1 + O( 2 ) , 2

(4.18)

j,k=1

∂ κ0 ( ) | =0 is a positive definite matrix. Therefore, setting t∗ := where r jk = − κ1 ∂ j ∂ k √ t + s + 1/2 = t + d/2, and, as usual in Laplace asymptotics, θ j = t j , we have, for large t,

√ 1 κ t∗ − 12 t∗ jk r jk θ j θk D (0; θ/ t )dθ (κ ( )) D (0;

) d = 0 d d ∗ √ e d d tV (2π ) 2 V (2π t∗ ) 2 1 κ t∗ Dd (0; 0) 1 + O( ) . (4.19) = d t det{r jk }(2π t∗ ) 2 2

By (4.17), (4.19), the asymptotics (1.16) is proved for odd d ≥ 3. If now d = 2s + 2, with s ≥ 1, we go back to the expression (4.16), and proceed as for the odd case. Setting v = w/t we easily obtain the asymptotics (1.16) for s > 1, and the coefficient of the leading term is again proportional to Dd (0; 0). (1) (1) (1) For s = 1 (d = 4), ( (u))2 has a term Cd ( )u ln(1/u), which gives, if Cd (0) = 0, (1) a correction r4 (t) = O(ln t/t). If Cd (0) = 0 the correction is again O(1/t). The integration over is done as in the previous case. Theorem 1.1 is proved.
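The limit used in (4.17) can be checked numerically in the simplest situation, where the amplitude D_d is replaced by the constant 1 (an assumption made only for this illustration). The sketch below evaluates the rescaled integral of (1 − w/t)^t w^{s−1/2} over [0, ct] by the midpoint rule, after the substitution w = x², and compares it with Γ(s + 1/2). The cutoff `c` plays the role of δ/(2κ0), and its value, like the function name, is hypothetical.

```python
import math

def cut_integral(t, s, c=0.5, n=20000):
    """Midpoint rule for the integral of (1 - w/t)^t * w^(s - 1/2) over [0, c*t].

    The substitution w = x^2 turns w^(s - 1/2) dw into 2 * x^(2s) dx, which
    removes the integrable singularity at w = 0 when s = 0.
    """
    upper = math.sqrt(c * t)
    step = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * step
        total += (1.0 - x * x / t) ** t * 2.0 * x ** (2 * s)
    return total * step

if __name__ == "__main__":
    for s in (0, 1, 2):
        for t in (50, 400):
            ratio = cut_integral(t, s) / math.gamma(s + 0.5)
            print("s =", s, "t =", t, "ratio to Gamma(s + 1/2):", ratio)
```

The printed ratios approach 1 as t grows, with a deviation that shrinks roughly like 1/t, which is the size of the correction stated in (4.17).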



Proof of Theorem 1.2. The three cases are considered separately. i) C1(1) (0) > 0. We can assume that V is so small that C1(1) ( ) > 0 for all ∈ V . By Lemma 3.7, (z) cannot vanish, if α is small enough, for any ∈ V and z ∈ U σ ∩ . Therefore we need only consider the contribution of the cut (4.6), taken for s = 0. Multiplying numerator and denominator in (4.7) by κ0 v, we get

δ 1 (κ0 ( ))t+3/2 2κ0 D(v; )(1 − v)t v 2 dv, (4.20) π 0 (3) (1) (3) (1) (1) h˜ (v)(1 + H˜ (v)) − H˜ (v)(C1 ( ) − κ0 ( )v h˜ (v)) ) = D(v; . (4.21) (1) (1) (1) κ0 v(1 + H˜ (v))2 + (C1 ( ) − κ0 ( )v h˜ (v))2 The integral (4.20) is treated as before, so that we get

δ 2κ0

0

) + O( 1 ) . )(1 − v)t v 21 dv = 1 ( 3 ) D(0; D(v; 3 t t2 2

(4.22)

The integration over is done as before, and we get the asymptotics (1.17), with 0). the coefficient of the leading term proportional to D(0; (1) (1) ii) C1 (0) < 0. Again V is so small that C1 ( ) < 0 for all ∈ V . According to Lemma 3.7, has exactly one zero, at a real point κ∗ ( ) > κ0 ( ). Moreover κ∗ ( ) is differentiable and it follows from (3.19) that κ∗ ( ) has a unique maximum κ¯ = κ∗ ( ∗ ) at some point ∗ ∈ V , with κ¯ > κ, and negative second derivative κ∗ ( ∗ ) < 0. The contribution of the cut to the asymptotics can be computed exactly as in case t i), and is therefore of order κt 2 . The contribution of the pole of F = N / to the internal integral in (4.1), where N is given by (4.2), is N (κ∗ ( )) (κ∗ ( ))t . (κ∗ ( ))

(4.23)

Integrating over V , we expand the function κ∗ ( ) around ∗ , exactly as it was done for κ0 ( ), and we obtain the asymptotics (1.18). (1) iii) C1 (0) = 0. As R( ) is even in , and so is the other coefficient in (3.15), as a consequence of the conditions (1.8), C1(1) ( ) is even, and all its odd derivatives vanish at the origin. If the first derivative that does not vanish is of order 2k, k = 1, 2 . . . , (1) we write C1 ( ) = 2k g( ), with g(0) = 0. Assume first that g(0) > 0, and take V so small that g( ) > 0 for all ∈ V . As for case i), the expression (4.6), for s = 0 gives the contribution of the whole spectrum of T . Starting from (4.20), by the usual changes of variables v = w/t and √ ˜ ), the t ∗ = θ, t∗ = t + 3/2, for ∈ V = (−V0 , V0 ), setting D(v;

) = v D(v; integral (4.1) becomes κ t∗ πt

√

t V0

√ − t V0

dθ e

−

r0 θ 2 θ2 2 (1+O ( t ))

0

δt 2κ0

√ w t D˜ w/t; θ/ t ∗ w −1/2 1 − dw, t

(4.24)



d2 ˜ where r0 = −[ d 2 ln κ0 ( )] =0 . The function D is uniformly bounded from above in absolute value and, as t → ∞, it is easy to see that

√ ˜ D(w/t; θ/ t∗ ) →

(3)

h 1 (κ; 0) (1)

κ (1 + H1 (κ; 0))

.

(4.25)

√ By symmetry, the first correction in θ/ t gives a vanishing contribution, so that the first correction is of order 1/t. If now g(0) < 0, the contribution of the cut is the same as for the case g(0) > 0, but we must add the contribution of a single pole on the real axis for z > κ0 ( ). Assuming that g( ) < 0 for all ∈ V , and setting g( ) ¯ = −g( ), we see that has the form −1

(1)

1

(1)

2 2 (z) = 1 − 2k g( )ζ ¯

+ h 1 (z; )ζ + H1 (z; ).

For small α there is only one zero in the region U σ , at some point κ∗ ( ). In fact, invoking again the Rouché theorem, we see that (z) has in U σ the same number of zeroes −1

2 ¯ as 1 − 2 g( )ζ

. The difference δκ ( ) = κ∗ ( ) − κ0 ( ) satisfies Eq. (3.20), with ¯ ¯ Setting (δκ ( ))1/2 = 2k x( ), the rescaled equation for x is C( ) = 2k g( ).

(1)

(1)

x( )(1 + H1 (κ∗ ; )) − g( ) ¯ + x 2 ( ) 2k h 1 (κ∗ ; ) = 0, which shows that for small we have x( ) = c0 (1 + O( 2k )) with c0 = g(0)/ ¯ (1) (1 + H1 (κ; 0)). The residue of the pole of the function z t F at z = κ∗ ( ) turns out to be 2 2k (κ∗ ( ))t x 2 ( )

(3) 2k h (3) 1 (κ∗ ; ) + H1 (z ∗ ; ) x( ) . g( )(1 ¯ + O( 2k ))

(4.26)

As δκ = O( 4k ), we have κ∗ (0) = κ0 (0) = κ and all√ derivatives at = 0 also coincide up to the order 4k −1. Therefore, setting again θ = t, the leading term of the integral over V of the expression above is

(3) 2κ t c02 h 1 (κ; 0) ∞ 2k −r0 θ 2 2 dθ. θ e t k+1/2 g(0) ¯ −∞ Hence the contribution of the pole is O(t −2 ) for all k ≥ 2, and is negligible with respect to that of the cut, given by (4.24), (4.25). The asymptotics (1.19) is proved. Proof of Theorem 1.3. We follow the lines of the previous proof. (1)

(1)

i) C2 (0) > 0. Again we assume that C1 ( ) > 0 for all ∈ V . By Lemma 3.7, we only have the contribution of the cut I . Starting from the expression (4.16), which we take for s = 0, by (4.15) we have D2 (v; ) =

(1)

(1)

(4) (0) H˜ (v) + v R˜ (v)

(1 + H˜ (v) + C2 ( ) ln

1 2 κ0 v )

(1)

+ (πC2 ( ))2 + O(v ln3 v)

,

(4.27)

(4) (0) where the tilde in H˜ (v), R˜ (v), as in (4.7), signifies the change of variables 2 u = κ0 v. The expression ln tD2 (w/t; ) is uniformly bounded in v, and tends, as


621

(4) (1) t → ∞, to H˜ (0)/(C2 ( ))2 . By subtracting the limit we see that the correction is O(1/ ln t). Going back to (4.16) we conclude that

tδ w w (κ0 ( ))t+1 2κ0 D2 ( , )(1 − )t dw πt t t 0 (4) t+1 ˜ H (0) 1 (κ0 ( )) 1 + O( ) . (4.28) = ln t π t ln2 t (C2(1) ( ))2

By integrating over we get an additional factor 1/t, and the asymptotics (1.20). (1) (1) ii) C2 (0) < 0. Again let C2 ( ) < 0 for all ∈ V . By Lemma 3.7, has exactly one zero, at a real point κ∗ ( ) > κ0 ( ). Moreover κ∗ ( ), for small α, has a unique maximum κ¯ = κ∗ ( ∗ ) at some point ∗ ∈ V , with κ¯ > κ, and the quadratic form ∂ 2 κ∗ ( ) | = ∗ is negative definite. of the second derivatives j,k=1,2 ∂ j ∂ k The contribution of the pole of F = N / to the internal integral in (4.1), is again given by the expression (4.23). Integrating over V , we expand the function κ∗ ( ) around ∗ , and we see that the leading term is of order 1/t, with a correction of order O(1/t 2 ). The contribution of the cut is, as before O(1/(t ln t)2 ). This completes the proof of the asymptotics (1.21). (1) (1) iii) C2 (0) = 0. By parity, all odd derivatives vanish at = 0, so that C2 ( ) = (1) O( 2 ). If, by taking V small enough, we have C2 ( ) ≥ 0 for all ∈ V , there are no singularities outside I , and going back to (4.27), we see that √ lim D2 (w/t, θ/ t) =

t→∞

(4) H˜ 0 (0)

(1 + H2(1) (κ; 0))2

.

(4.29)

By subtracting the limiting value, we find a correction O( lnt t ). The usual change of variables gives a factor 1/t 2 , and the asymptotics (1.24) is proved. If now C2(1) ( ) takes negative values in any neighborhood of = 0, recalling that (1) (1) (1) C2 = O(α), and setting αg( ) = −C2 ( ) at the points where C2 ( ) < 0, we see again that for such there is a unique pole of F at a real point κ∗ ( ) > κ0 ( ), such that, for small α, − 1+O(α) − 1 δκ ( ) := κ∗ ( ) − κ0 ( ) = e αg( ) 1 + o(e 2αg( ) ) . A straightforward estimate shows that if V is small enough we have, for all ∈ V , and some constant C, N (κ∗ ( )) Cm 2 m m (κ ( )) ≤ Cδκ ( ) ln (1/δκ ( )) ≤ lnm (1/δ ( )) ≤ Cm α (g( )) , κ

∗ for any integer m = 1, 2, . . ., and some constants Cm , Cm . Moreover for the derivatives of δκ we have a bound of the type (3.21), in which the −

1

right side is replaced by Cm 1 ,m 2 e 2αg( ) . Hence the expansion of κ∗ ( ) at = 0 at any finite order is the same as that of κ0 ( ). Integrating over , with the usual change of variables, and recalling that g( ) = O( 2 ), we see that the contribution of the pole falls off faster than κ t /t n for any n = 1, 2, . . ..

For the contribution of the cut, we see from (4.11), (4.12) that the real part $\Delta_\Lambda^{(1)}(u)$ also vanishes at some point $u_* = e^{-\frac{1 + O(\alpha)}{\alpha g(\Lambda)}} \left(1 + o\left(e^{-\frac{1}{2\alpha g(\Lambda)}}\right)\right)$, and the imaginary part can be small for small $\Lambda$, so that $D_2(v; \Lambda)$ is not uniformly bounded in $\Lambda$. However, setting $\bar u(\Lambda) = e^{-\frac{1}{2\alpha g(\Lambda)}}$, we have, for $\alpha$ small, $\bar u(\Lambda) > u_*(\Lambda)$, and, for $0 \le u \le \bar u(\Lambda)$, $(\Delta_\Lambda^{(2)}(u))^2 \ge c / \ln^2 \bar u(\Lambda)$, for some constant $c > 0$. Therefore, by (4.27) we have, for all $m = 1, 2, \dots$,

$$\int_0^{\frac{\bar u(\Lambda)}{\kappa_0(\Lambda)}} D_2(v; \Lambda)(1 - v)^t\, dv \le c_1\, \bar u(\Lambda) \ln^2(\bar u(\Lambda)) \le \frac{\bar c_m}{\left(\ln(1/\bar u(\Lambda))\right)^m} \le \tilde c_m\, (\alpha g(\Lambda))^m.$$

As g( ) = O( 2 ), multiplying this expression by (κ0 ( ))t , we see, by the usual change of variables, that the contribution of this quantity is O(κ t /t n ) for any n = 1, 2, . . .. (1) 2 For v > u( )/κ ¯ 0 ( ) we have ( (u)) = 1 + O(α), and, as t → ∞, the quantity D2

w θ ;√ t t

(4) H˜ θ/√t ( wt ) +

=

(1)

(1 + H ( wt ) − αg( √θ t ) ln

w ˜ (0) w t R √θ ( t ) t

θ w 2 w 2 2 2 2 √ κ0 w ) + π α g ( t ) + O( t ) ln ( t ) t

(4.30) tends to the limit

(4) H˜ 0 (0) (1)

, with a correction O(ln t/t).

(1+H2 (κ;0))2 2 1/t coming from

√ the change of variables = θ/ t, gives the asympThe factor totics (1.22). Theorem 1.3 is proved. Proof of Corollary 1.4. Observe that by (3.10) we have (0)

Cd (0) = h (d) (κ; 0)M( f (1) )M( f (2) ), with h (d) (κ; 0) = 0, as it follows from the explicit expression (5.13). For all d odd, it is easy to see that the expressions (4.3), (4.4) can be written as (3)

(0)

(3)

h d (κ0 ; ) = Cd ( ) + O(α),

(0)

Hd (κ0 ; ) = Hd (κ0 ; ) + O(α). (4.31) (0)

For d ≥ 3 and odd, by (4.7) we have Dd (0; 0) = Cd (0) + O(α), implying (1.23). (3)

If d is even one can check that for Hd (z; ) in (4.8) the same representation (4.31) holds as for d odd. Moreover for the quantities defined by (4.9), (4.15), we have (2)

(0)

Cd ( ) = Cd ( ) + O(α),

(4)

(0)

H (0) = −πCd ( ) + O(α).

(4.32)

For d > 3 and even, as the denominator of Dd in (4.16) is 1 + O(α), we see that (0) Dd (0; 0) = −πCd (0) + O(α), again implying (1.23). (1)

Let now d = 1. For case i) ( = +, or C1 (0) > 0), using the previous consider 0) = ations for the odd case d ≥ 3, we see that in (4.22) we have D(0; which implies the asymptotics (1.24).

(0)

C1 (0)+O (α) (1)

(C1 (0))2

,



(1)

For case ii) ( = −, or C1 (0) < 0), observe that in the leading term (4.23) we have κ∗ ( ) − κ0 ( ) = O(α 2 ), and the asymptotics (1.24) follows from the relation N (κ∗ )

(κ∗ )

(0)

=

−2C1 ( ) + O(α)

(κ∗ ( ) − κ0 ( ))C1(1) ( )(1 + O(α 2 ))

(1)

. (3)

Finally if = 0 (or C1 (0) = 0), by (4.25) and the representation for h 1 in (4.31) we (0) C1 (0)

see that the coefficient of the leading term is proportional to κ (1 + O(α)). The proof for d = 1 is complete. (1) We are left with the case d = 2. If = + (C2 (0) > 0), the asymptotics (1.24) follows by inserting (4.32) in (4.28). For = −, the proof is similar to that for the same case and d = 1. We omit it. Finally, for = 0, the result follows by an inspection of formula (4.29). An inspection of the proofs of the present paragraph leads to the following remark. (0)

Remark 4.1. For α → 0 the constants cd for d ≥ 3, and cd for d = 1, 2 (“neutral () case”), tend to a finite limit, whereas for d = 1, 2 the constants cd diverge in the “repulsive” case = +, and tend to 0 in the “attractive” case = −.

We give a hint on how to check the assertion above. For d ≥ 3, by (4.19), the dependence of cd on α is contained in the factor Dd (0; 0). As we see by looking at the explicit (0) expression (4.7), for odd d, as α → 0, Dd (0; 0) tends to Cd (0). For even d one should look at the expressions (4.11)–(4.14). If now d = 1 we consider the quantity D(0; 0) given by (4.21). In the repulsive case ) = C (0) ( )(1 + C1(1) (0) > 0 by looking at (4.3), (4.4), for small α we see that D(0; 1 (1) (1) O(α))/(C1 ( ))2 , whence the conclusion, as C1 ( ) = O(α). In the attractive case the conclusion comes by looking at (4.23), and estimating using (3.19). The neutral case for d = 1 and the case d = 2 can be treated in a similar way.

5. Proof of Lemmas 3.4, 3.5, 3.6

We first report, for the convenience of the reader, some results from [8]. Let F be a smooth function in a neighborhood U ⊂ Rd of the origin, and δ > 0 be so small that U contains the disk {u ∈ Rd : u 2 ≤ 2δ}. The integral

F(u) J F (z) = du, z ∈ C, (5.1) 2 u∈Rd :u 2 ≤2δ z + u defines an analytic function of z in the cut plane δ = C\Iδ , with the cut along the real interval Iδ = [−2δ, 0]. Introducing polar coordinates (ρ, ωd ), where ρ 2 = u 2 and ωd denotes the angle variables, J F (z) can be written as

√2δ 2) dωd F(ρ 2) = ρ d−1 dρ, F(ρ F(u(ρ, ωd )) , (5.2) J F (z) = d 2 z+ρ d Sd 0 where d is the measure of the unit sphere S d in Rd for d > 1 and 1 = 2. For d > 1 is a function of ρ 2 , as one can see by expanding F in Taylor series the angular average F 2 ) = 1 (F(ρ) + F(−ρ)). at u = 0. For d = 1 we take F(ρ 2



can be extended to an analytic Lemma 5.1. Let Uδ := {|z| < δ}, and suppose that F(v) function in the complex disk {v ∈ C : |v| ≤ δ1 } with δ1 > 2δ. Then under the assumptions above, if δ is small enough, J F (z) can be represented as follows, for z ∈ Uδ ∩ δ : (d) 1 (d) h F (z)z s− 2 + H F (z) d = 2s + 1, J F (z) = (5.3) (d) (d) 1 s h F (z)z log z + H F (z) d = 2s + 2 (d) where s = 0, 1, . . . and h (d) F , H F are analytic functions in Uδ such that 1 π d = 2s + 1 (d) h F (0) = F(0)(−1)s d . 1 d = 2s + 2 2

(5.4)

Moreover the following estimates hold

(d) (d) δ1 , sup |h F (z)| + |H F (z)| ≤ Rd F

(5.5)

z∈Uδ

δ1 = maxv∈C:|v|≤δ | F(v)|. where Rd depends only on d and δ, and F 1 Proof. By adding and subtracting F(−z) in the integral on the left in (5.2) we get √

√2δ

2δ 2) dρ F(ρ ρ d−1 dρ = (−z) + F(−z) ρ d−1 . (5.6) 2 z + ρ z + ρ2 0 0 at the origin is If δ is so small that the radius of convergence of the Taylor series of F larger than 2δ, we see that

√2δ ∞ (k) 2 ) − F(z) F(ρ F (0) dρ = pk−1 (z), ρ d−1 (5.7) (z) = 2 ρ −z k! 0 k=1

where the polynomials $p_k(z)$ are easily computed:
$$p_k(z) = \int_0^{\sqrt{2\delta}} \rho^{d-1}\,\frac{\rho^{2(k+1)} - z^{k+1}}{\rho^2 - z}\,d\rho = \sum_{j=0}^{k} \frac{(2\delta)^{j+\frac{d}{2}}}{2j+d}\,z^{k-j}.$$

k+ d2

k

|z| j j=0 ( 2δ ) , (z) is analytic for |z|

As | pk (z)| ≤ (2δ) all s = 0, 1, . . . , (d) 1

√2δ d−1 (d) h 1 (z)z s− 2 + H1 (z), ρ dρ = (d) 1 s z + ρ2 0 h (d) 1 (z)z log z + H1 (z), (d)

(d)

where h 1 , H1 h (d) F (z)

< 2δ. Moreover (see [8]), for (d)

with h 1 (0) = π(−1) d = 2s + 1 2 with h (d) 1 (0) =

s

(−1)s 2

, d = 2s + 2 (5.8)

are analytic functions for |z| < 2δ. Assertions (5.3) (5.4) follow with

(d) = d F(−z)h 1 (z),

(d) H F(d) (z) = d [ F(−z)H 1 (z) + (−z)].

(5.9)

As for the estimate (5.5), it is enough to observe that √ (z)| ≤ c(δ) F δ1 . max |(z)| ≤ 2δ max | F |z|≤δ

|z|≤δ

as the integral of F along 2 ) − F(z) In fact the first inequality is proved by writing F(ρ the segment connecting ρ 2 and z , and the second one follows by the Cauchy formula.
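As an illustrative numerical check of the representation (5.8) in the simplest case d = 1 (s = 0), the following Python sketch integrates $\int_0^{\sqrt{2\delta}} d\rho/(z+\rho^2)$ for small real z > 0 with Simpson's rule and subtracts the singular term $(\pi/2)\,z^{-1/2}$. The value δ = 0.5, the sample points z and all function names are assumptions chosen for the example; they are not taken from the paper.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def radial_integral(z, delta):
    """Integral of 1/(z + rho^2) over 0 <= rho <= sqrt(2*delta), i.e. d = 1."""
    return simpson(lambda rho: 1.0 / (z + rho * rho), 0.0, math.sqrt(2 * delta))

def regular_part(z, delta):
    """What is left after removing the singular term (pi/2) * z**(-1/2)."""
    return radial_integral(z, delta) - 0.5 * math.pi / math.sqrt(z)

delta = 0.5  # illustrative choice, so that sqrt(2*delta) = 1
for z in (1e-1, 1e-2, 1e-3):
    # elementary antiderivative, used only as a cross-check of the quadrature
    exact = math.atan(math.sqrt(2 * delta / z)) / math.sqrt(z)
    print(z, radial_integral(z, delta), exact, regular_part(z, delta))
```

In this sketch the remainder stays bounded as z decreases and approaches $-1/\sqrt{2\delta}$, which is what one expects if the coefficient of $z^{-1/2}$ is π/2 and the remaining part is regular at z = 0.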


(d)

Remark 5.2. By the representation (5.9) the coefficients of the power series h F (z) = ∞ k (k) ( j) k cj F (0). k=0 ck (F)z depend on F as ck (F) = j=0 In order to apply the previous results to the proof of Lemmas 3.4, 3.5, 3.6, observe first that the functions that appear in the integrals defining β , and can be extended analytically to a complex neighborhood Wr(d) = {z = λ + iμ, λ ∈ T d , μ ∈ Rd : max |μ j | < r },

r > 0.

j=1,...,d

˜ 1 , λ2 ) can be In fact, by (1.9), (1.15), the functions f˜(i) (λ1 , λ2 ), i = 1, 2 and a(λ (d) (d) extended analytically to Wη × Wη , for η = ln q1 . Moreover, by the cluster esti˜ 1 ; λ2 , λ3 ) is analytic in the complex extension (W η(d) )3 . In fact, if τ is mate (1.12), S(λ 3

a connected graph with vertices at x1 , 0, x2 , x3 , its length satisfies the bound d(τ ) ≥ |x1 |+|x2 |+|x3 | . 3 Finally, as a consequence of the previous considerations, for ∈ V , if r is small (i) enough, the functions f˜ (·), i = 1, 2, defined by (2.8), and a˜ (·), can be extended to (d) (d) (d) Wr , while K (λ, μ) can be similarly extended to Wr × Wr . For functions which can be so extended the following result applies. (d)

Corollary 5.3. Let f (λ) be analytic in a complex extension Wr some r > 0. Then for ∈ V , if σ is small enough the integral

J f (z; ) =

Td

f (λ) dλ1 . . . dλd , a˜ (λ) − z

of the torus T d , for

z ∈ C,

(5.10)

can be represented for z ∈ U σ ∩ , and ∈ V as ⎧ ⎨ h (z; ) ζ s− 21 + H (z; ) d = 2s + 1, s = 0, 1, . . . f f

J f (z; ) = . 1 ⎩ h f (z; )ζ s log + H f (z; ) d = 2s + 2, s = 0, 1, . . .

ζ

(5.11)

Here ζ = z − κ0 ( ), and h f (z; ), H f (z; ) are analytic functions for z ∈ U σ and such that max |h f (z; )| + |H f (z; )| ≤ cd,σ ( ) f r ,

σ z∈U

(5.12)

where f r = maxλ∈W (d) | f (λ)|, and the constant cd,σ ( ) depends only on a˜ , σ r and d. Moreover h f (κ0 ; ) = f ( 2 )h (d) (κ0 ; ), where h (d) (z; ) = h f (z; ) for f (λ) ≡ 1 and d

2 2 (−1)s+1 d h (d) (κ0 ( ); ) = 2 | det a˜ ( 2 )|

π d = 2s + 1 . 1 d = 2s + 2

(5.13)



Proof. Setting a 2 (λ) = κ0 ( ) − a˜ (λ) ≥ 0, we write

f (λ) f (λ) dλ dλ1 · · · dλd , . . . dλ + − J f (z; ) = 1 d 2 (λ) 2 2 d 2 ζ + a ζ + a (·)≤η T \{a (·)≤η} a (λ) (5.14) where η > 0 is so small that the region {a 2 (·) ≤ η} is connected and we can apply in it the Morse lemma (see [21]), introducing new variables u 1 (λ), . . . .u d (λ) such that a 2 (λ) = u 21 + . . . + u 2d . The first integral in (5.14) becomes an integral of the type (5.1)

F(u) du 1 . . . du d , F(u) = f (λ(u))J (u), (5.15) 2 u 2 ≤η ζ + u where J is the jacobian of the transformation. As the second integral is an analytic function for z ∈ U σ , by the previous lemma the representation (5.11) holds. For (5.13), observe that near λ = 2 we have κ0 ( ) − a˜ (λ) = dj,k=1 C jk λ j λk + O(|λ|4 ), with a strictly positive definite matrix C = − (det C )

− 21

a˜ 2 .

As λ(0) =

2

and J (0) =

, the relations (5.13) follow from (5.4), and the estimate (5.12) follows from (5.5), with cd,σ ( ) depending on a˜ only by way of J . Remark 5.4. If f ( 2 ) = 0, then the representation (5.11) is changed to ⎧ ⎨ h˜ (z; )ζ s+ 12 + H˜ (z; ) d = 2s + 1, s = 0, 1, . . . f f

, J f (z; ) = ⎩ h˜ (z; )ζ s+1 log 1 + H˜ (z; ) d = 2s + 2, s = 0, 1, . . . f f

ζ

where h˜ f (z; ), H˜ f (z; ) are analytic for z ∈ U σ and such that

maxσ |h˜ f (z; )| + | H˜ f (z; )| ≤ cd,σ ( ) f r . z∈U

Proof. f (0) = 0 implies F(0) = 0 in (5.15), and the proof follows by setting in (5.6) 2 ) = ρ 2 G(ρ 2 ). By the Cauchy formula, maxv∈C:|v|≤σ |G(v)| is again bounded by F(ρ a constant times f r , so that an inequality of the type (5.12) holds. (1) (2) From now on we set for convenience φ1 (λ) = f˜ (λ), φ2 ( ) = f˜ (λ). (d) The functions T ,n (λ1 , . . . , λn ) for n = 1, . . ., can also be extended to (Wr )n , so that we can apply Corollary 5.3. Moreover, as T ,n (λ1 , . . . , λn ) are determinants, they vanish if λi = λ j for j = i, and satisfy the Hadamard inequality, n

sup (d)

|T ,n (λ1 , . . . , λn )| ≤ K rn n 2 ,

(5.16)

(λ1 ,...,λn )∈(Wr )n (d)

where K r =: K (·, ·)W (d) ×W (d) is the sup-norm in (Wr )n . r r ,n (λ, μ; λ1 , . . . , λn−1 ). Similar considerations hold for the functions T Proof of Lemma 3.4. β (z) is an integral of the type (5.10) with f (λ) = φ1 (λ)φ2 (λ), so that we can apply Corollary 5.3. The proof follows by observing that in (5.11) we can write h f (z) = h f (κ0 ; ) + (z − κ0 ( )) h f (z; ), where h f is again analytic in U σ .

Ornstein-Zernike Asymptotics for a General “Two-Particle” Lattice Operator (1)


(2)

Proof of Lemma 3.5. For n ≥ 2, we set T ,n = T ,n + T ,n with (1)

T ,n (λ1 , . . . , λn ) =

n

T ,n (λ1 , . . . , [λ j ], . . . , λn ),

(5.17)

j=1 (1)

(2)

where [λ j ] denotes that we set λ j = /2. Both T ,n and T ,n are symmetric in (n) ˜ (n) + ¯ (n) , where λ1 , . . . , λn . Correspondingly we write =

(1)

n T (λ1 , . . . , λn ) ,n dm(λi ) n ˜ (λi ) − z) (T d )n i=1 (a i=1

n T ,n ( 2 , λ2 , . . . , λn ) n = n J1 (z; ) dm(λi ), ˜ (λi ) − z) (T d )n−1 i=2 (a

˜ (n) (z) =

¯ (n) (z) =

(5.18)

i=2

(2)

(T d )n

n T (λ1 , . . . , λn ) ,n dm(λi ). n ˜ (λi ) − z) i=1 (a

(5.19)

i=1

Here J1 (z; ) is the integral in (5.10) with f ≡ 1, and the last equality in (5.18) (1) (2) comes from the symmetry of T ,n . T ,n ( /2, λ2 , . . . , λn ) and T ,n (λ1 , . . . , λn ) vanish whenever one of the variables takes the value /2. By Remark 5.4 we see that for odd dimension d = 2s + 1 the second integral in (5.18) and the integral in (5.19) are polys+1/2 nomials of order n − 1 and n, respectively, in ζ , and the coefficients are analytic (n) (n) (n) (n) functions for z ∈ U σ . Therefore there are functions h˜ , H˜ , h¯ , H¯ ∈ Lσ such that ˜ (n) (z) = n J1 (z; )(h˜ (n) (z)ζ s+1/2 + H˜ (n) (z)),

¯ (n) (z) = h¯ (n) (z)ζ s+1/2 + H¯ (n) (z).

Taking the limit as z → κ0 of the integral in (5.18) we get (n) H˜ (κ0 ) =

T d−1

n T ,n ( /2, λ2 , . . . , λn ) n dλ j . ˜ (λ j ) − κ0 ( )) j=2 (a

(5.20)

j=2

By Corollary 5.3, the most singular term comes from J1 (z; ) and we find (n)

(n)

s−1/2

(z) = h (z)ζ

(n)

(n)

(n)

+ H (z), h , H ∈ Lσ .

(5.21)

Moreover, by the last assertion of Corollary 5.3 we get (n)

(n)

h (κ0 ( )) = nh (d) (κ0 ( ); ) H˜ (κ0 ( )),

(5.22)

where h (d) (κ0 ( ); ) is defined in (5.13). (1) It is easy to see that (5.21) and (5.22) also hold for n = 1 with H˜ (κ0 ) = K ( /2, /2). The proof that the series over n converges if α is small enough is based on the estimate (5.5), and the Hadamard inequality for determinants, and can easily be done following the lines of [9]. The lemma is thus proved for odd dimension.



For even dimension d = 2s + 2, the second integral in (5.18) and the integral in (5.19) are polynomials of order n − 1 and n, respectively, in the variable ζ s+1 ln ζ −1 , with analytic coefficients. The most singular term (in ζ s ln ζ ) comes again from J1 (z; ), and, after some simple manipulations we get (n) (z)

=

(n) h (z)ζ s ln ζ −1

+

n

(n)

(n)

h ;k (z)ζ k−1 (ζ s ln ζ −1 )k + H (z),

(5.23)

k=2 (n) (n) σ where, as always, h (n)

(z), H (z), h ;k (z) ∈ L , k = 2, . . .. Moreover it is easy to see (n)

that h (κ0 ( )) has the same expression (5.22). The convergence of the series in n is again based on the estimate (5.5) and the Hadamard inequality, and can be done along the lines of [10]. Lemma 3.5 is proved. Proof of Lemma 3.6. By (2.14) and (2.20) we see that for all n ≥ 1, (n) (z)

=

T d ×T d

(n) (1) (2) D (λ, μ; z) f˜ (λ) f˜ (μ) dm(λ)dm(μ). (a˜ (λ) − z)(a˜ (μ) − z)

(5.24)

,n defined by (2.18) as T ,n = T + T , For n > 1 we decompose the function T

,n

,n where ,n ( /2, /2; . . .) + T (1) (λ; . . .) + T (2) (μ; . . .). ,n (λ, μ; . . .) = T T

,n

,n

(5.25)

(1) (λ; . . .) = T ,n (λ, /2; . . .) − Here . . . stands for the variables λ1 , . . . , λn−1 , T

,n (2) is obtained by interchanging the roles of λ, μ. T is ,n ( /2, /2; . . .), and T T

,n

,n (3) (4) +T , with =T also split as T

,n

,n

,n

(3) (λ, μ; λ1 , . . . , λn−1 ) = T

,n

n−1

(λ, μ; λ1 , . . . , [λ j ], . . . , λn−1 ), T

;n

j=1

where [λ j ] means, as above, λ j = /2. The following properties hold: i) Each one of the three terms on the right of (5.25) vanishes if λ j = /2 for some 1 ≤ j ≤ n − 1. = 0 if λ = /2 or μ = /2. ii) T

,n (4) (λ, μ; λ1 , . . . , λn−1 ) = 0 if any one of the variables has the value /2. iii) T

,n

The contribution to the integral (5.24) of the three terms in (5.25) is (1)

(2)

(1)

(2)

n;2 n;1 J (z)J (z)Rn;0

(z) + J (z)R (z) + J (z)R (z).

If the dimension is odd (d = 2s + 1) we have, again by Corollary 5.3,

φi (λ) s−1/2 J (i) (z) = dλ = j (i) (z)ζ + J (i) (z), i = 1, 2 ˜ (λ) − z Td a

(5.26)

(5.27)

(i)

(i)

with j , J ∈ Lσ . The other terms are

Rn;0

(z)

=

(T d )n−1

Rn;i

(z) =

(T d )n

;n ( /2, /2; λ1 , . . . , λn−1 ) n−1 T dλ j , n−1 ˜ (λ j ) − z) j=1 (a j=1

n−1 (i) (λ; λ1 , . . . , λn−1 ) T

;n dλ dλ j , n−1 (a˜ (λ) − z) j=1 (a˜ (λ j ) − z) j=1 s+1/2

By property i) above, all Rn;i

are polynomials in ζ s+1/2

n;i Rn;i

= r (z)ζ

n;i + R (z),

i = 1, 2.

:

i = 0, 1, 2.

(5.28)

(4) is also a polynomial in ζ s+1/2 , and the contriBy property iii), the contribution of T

,n

(3) , in analogy with (5.18), is bution of T

,n

(n − 1)J1 (z; )I n;3 (z), (5.29)

n−1 φ1 (λ) dλ φ2 (μ) dμ T ,n (λ, μ; /2, λ2 , . . . , λn−1 ) dm(λ j ), I n;3 (z) = n−1 a˜ (λ) − z a˜ (μ) − z ˜ (λ j ) − z) j=2 (a j=2 n > 2. ,2 (λ, μ; /2). I n;3 is again a polynomial For n = 2 the last integral is replaced by T

s+1/2 (3) and T (4) to (n) are, respectively, in ζ . Therefore the contributions of T

,n

,n

(n;3)

h

s−1/2

(z)ζ

(n;3)

+ H

(n;4)

(z),

h

s+1/2

(z)ζ

(n;4)

+ H

(z).

(5.30)

Putting together the relations (5.26), (5.27), (5.28), (5.30), we see that if d = 1 (or s = 0) (n) the most singular term at ζ = 0 is a pole, coming from the first term in (5.26), and is represented as −1/2 (n) (n) (z), (z) = Cn ( )ζ −1 + h (n) +H

(z)ζ

(5.31)

n;0 σ ˜ (n) (n) with h (n)

(z), H (z) ∈ L . As by (5.20), R (κ0 ) = H (κ0 ), we have (1) (2) (n) Cn ( ) = f˜ ( /2) f˜ ( /2)(h (1) (κ0 ; ))2 H˜ (κ0 ).

(5.32)

If s > 0 the pole at ζ = 0 disappears and we have (n)

(n)

(z) = h (z)ζ

s−1/2

(n)

(z), +H

d = 2s + 1, s > 0.

Moreover for both (5.31), (5.33), i.e., for all d = 2s + 1, we have (n) (1) (n) ( ) + f˜(2) ( /2) R (n) ( ) + h (n;3) (κ0 ), h (κ0 ) = h (d) (κ0 ; ) f˜ ( /2) R

2 1 (n) ( ) R i

=

(i) (n) J (κ0 ) H˜ (κ0 ) +

n;i R (κ0 )

(5.33)

(5.34)

i = 1, 2. (i)

(i)

If d = 2s + 2 is even, in (5.26) we have, instead of (5.27), J (z) = j (z)ζ s ln ζ −1 + (i) J (z), for i = 1, 2. The integrals Rn;i

(z), for i = 0, 1, 2, are polynomials in



ζ s+1 ln ζ −1 , with analytic coefficients, and so are I n;3 (z) in (5.29) and the contribu(4) . Taking into account the expression of J1 (z; ) for d even, and that I n;3 (z) tion of T

,n

(3) −1 s+1 is a polynomial of order n in ζ ln ζ , we see that the contribution of T ,n in (5.30) is replaced by (n;3) s ζ ln ζ −1

h

+

n+1

(n;3)

(n;3)

h ;k (z)ζ k−1 (ζ s ln ζ −1 )k + H

(z).

(5.35)

k=2

Similar representations hold for the second and third term in (5.26), while the first term of (5.26) can be represented as −1 s h (n;0)

(z)ζ ln ζ +

n+1

(n;0) k−2 s −1 k h (n;0) (z).

;k (z)ζ (ζ ln ζ ) + H

(5.36)

k=2

Adding all terms together we see that, for d = 2s + 2, s ≥ 0, (n) (n) (z) = h (z)ζ s ln ζ −1 +

n+1

(n) (n) (z). h ;k (z)ζ k−2 (ζ s ln ζ −1 )k + H

(5.37)

k=2

For s = 0, i.e., in dimension d = 2, the most singular term behaves as ln2 ζ and its coefficient, h (n)

;2 (κ0 ) =: C n ( ), has the same expression (5.32) as for dimension d = 1: (1) (2) (n) Cn ( ) = f˜ ( /2) f˜ ( /2)(h (2) (κ0 ; ))2 H˜ (κ0 ).

(5.38)

(n)

The coefficient of the first term h (κ0 ) also has the same expression (5.34). The convergence of the series is again proved along the lines of [9] and [10]. Lemma 3.6 is proved.

References

1. Malyshev, V.A., Minlos, R.A.: Linear infinite-particle operators. Translated from the 1994 Russian original by Alan Mason. Translations of Mathematical Monographs, 143. Providence, RI: American Mathematical Society, 1995
2. Minlos, R.A., Zhizhina, E.A.: Asymptotics of the decay of correlations for Gibbs spin fields. (Russian) Teoret. Mat. Fiz. 77(1), 3–12 (1988); translation in Theoret. Math. Phys. 77(1), 1003–1009 (1988)
3. Minlos, R.A., Zhizhina, E.A.: Asymptotics of decay of correlations for lattice spin fields at high temperatures. I. The Ising model. J. Statist. Phys. 84(1–2), 85–118 (1996)
4. Kondratiev, Yu.G., Minlos, R.A.: One-particle subspaces in the stochastic XY model. J. Statist. Phys. 87(3–4), 613–642 (1997)
5. Minlos, R.A.: Spectra of the stochastic operators of some Markov processes, and their asymptotic behavior. (Russian) Algebra i Analiz 8(2), 142–156 (1996); translation in St. Petersburg Math. J. 8(2), 291–301 (1997)
6. Boldrighini, C., Minlos, R.A., Pellegrinotti, A.: Central limit theorem for the random walk of one and two particles in a random environment, with mutual interaction. Probability contributions to statistical mechanics, Adv. Soviet Math. 20, Providence, RI: Amer. Math. Soc., 1994, pp. 21–75
7. Boldrighini, C., Minlos, R.A., Pellegrinotti, A.: Interacting random walk in a dynamical random environment. II. Environment from the point of view of the particle. Ann. Inst. H. Poincaré Probab. Statist. 30(4), 559–605 (1994)
8. Boldrighini, C., Minlos, R.A., Pellegrinotti, A.: Random Walks in Quenched i.i.d. space-time random environment are always a.s. diffusive. Prob. Th. Rel. Fields 129(1), 133–156 (2004)
9. Boldrighini, C., Minlos, R.A., Nardi, F., Pellegrinotti, A.: Asymptotic decay of correlations for a random walk in interaction with a Markov field. Mosc. Math. J. 5(3), 507–522 (2005)
10. Boldrighini, C., Minlos, R.A., Nardi, F.R., Pellegrinotti, A.: Asymptotic decay of correlations for a random walk on the lattice $\mathbb{Z}^d$ in interaction with a Markov field. Mosc. Math. J. 8(3), 419–431 (2008)
11. Ornstein, L.S., Zernike, F.: Accidental Deviations of Density and Opalescence at the Critical Point of a Single Substance. Proc. Acad. Sci. (Amsterdam) 17, 793–806 (1914)
12. Hecht, R.: Correlation Functions for the Two-Dimensional Ising Model. Phys. Rev. 158, 557–561 (1967)
13. Bricmont, J., Fröhlich, J.: Statistical mechanical methods in particle structure analysis of lattice field theories. I. General theory. Nuclear Phys. B 251(4), 517–552 (1985)
14. Bricmont, J., Fröhlich, J.: Statistical mechanical methods in particle structure analysis of lattice field theories. II. Scalar and surface models. Commun. Math. Phys. 98, 553–578 (1985)
15. Campanino, M., Ioffe, D., Velenik, Y.: Ornstein-Zernike theory for finite range Ising models above $T_c$. Prob. Th. Rel. Fields 125, 305–349 (2003)
16. Paes-Leme, P.J.: Ornstein-Zernike and analyticity properties of classical lattice spin systems. Ann. Phys. (NY) 115, 367–387 (1978)
17. Auil, F., Barata, C.A.: Spectral Derivation of the Ornstein-Zernike Decay for Four-Point Functions. Brazilian J. Phys. 35(2B), 554–564 (2005)
18. Polyakov, A.M.: Microscopic Description of Critical Phenomena. Soviet Phys. JETP 28, 533 (1969)
19. Birman, M.Sh., Solomyak, M.Z.: Spektralnaya teoriya samosopryazhennykh operatorov v gilbertovom prostranstve. (Russian) [Spectral theory of selfadjoint operators in Hilbert space] Leningrad: Leningrad. Univ., 1980, 264 pp
20. Lovitt, W.V.: Linear Integral equations. New York: Dover Phoenix Editions, 2005
21. Milnor, J.: Morse Theory. 5th ed., Princeton, NJ: Princeton University Press, 1973

Communicated by M. Aizenman

Commun. Math. Phys. 305, 633–639 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1269-y



Landau-Zener Tunneling for Dephasing Lindblad Evolutions

J. E. Avron$^1$, M. Fraas$^1$, G. M. Graf$^2$, P. Grech$^2$

$^1$ Department of Physics, Technion, 32000 Haifa, Israel. E-mail: [email protected]
$^2$ Theoretische Physik, ETH Zurich, 8093 Zurich, Switzerland

Received: 29 March 2010 / Accepted: 10 February 2011 Published online: 21 May 2011 – © Springer-Verlag 2011

Abstract: We consider a family of time-dependent dephasing Lindblad generators which model the monitoring of the instantaneous Hamiltonian of a system by a Markovian bath. In this family the time dependence of the dephasing operators is (essentially) governed by the instantaneous Hamiltonian. The evolution in the adiabatic limit admits a geometric interpretation and can be solved by quadrature. As an application we derive an analog of the Landau-Zener tunneling formula for this family.

Lindblad generators describe the quantum evolutions of finite systems coupled to a memoryless bath [1,2]. They give a useful phenomenological description of thermalization, decoherence, and measurement [3–5]. We consider a family of time-dependent dephasing Lindblad generators, first introduced in [6], which models the continuous monitoring of the instantaneous energy. This is the case, for example, in the Zeno effect [7]. The family can be defined for arbitrary dephasing rate, however, its physical interpretation in the strong dephasing regime requires some care [8].

The family of dephasing Lindbladians that we consider is defined in such a way that any instantaneous stationary states of the Hamiltonian are also instantaneous stationary states for the Lindbladian. This makes the family special and non-generic. (Generic Lindbladians have a unique stationary state.) In the adiabatic limit, the evolution generated by this family can be solved by quadrature and admits a geometric interpretation. The qualitative features of the dynamics differ from the corresponding dynamics of generic Lindbladians [9–11] reflecting the special character of the family.

In 1932 Landau [12] and independently Zener [13], Majorana [14], and Stückelberg [15] found an explicit formula for the tunneling in a generic near crossing undergoing unitary, adiabatic evolution. Here we describe the corresponding result for the nonunitary case associated with a dephasing Lindbladian. The solution appears to be the simplest generalization of the Landau-Zener problem which is still exactly soluble.

The influence of dissipation and noise on the tunneling of a two-level system has been extensively studied in the physics literature [16–20]. The results presented here differ in one or both of the following aspects: First, in the framework: We assume a Lindbladian,


and do not attempt to derive an effective dynamics from a (unitary) model of a bath or a (unitary) model of stochastic noise source. Second: The Lindbladians are, as stated, of the dephasing type. Our results contribute to the mathematical physics of Lindblad operators. This approach has the virtue that the adiabatic dynamics can be solved by quadrature and does not rely on the assumption of weak dephasing which one normally needs to make when modelling decoherence with a unitary bath or noise. In the limit of weak dephasing, the tunneling formula we derive can be compared with results of [19] for Zener tunneling due to a dephasing noise. The two formulas have the same functional form up to an overall constant which is left undetermined in [19].

Since tunneling is dominated by the near crossing dynamics, the universal aspects of near crossing in a two-level system are captured by a Hamiltonian that depends linearly on time. By an appropriate choice of basis and of the zero of energy the relevant dynamics is governed by the Hamiltonian
$$H(s, g_0) = \frac{1}{2}\begin{pmatrix} s & g_0 \\ g_0 & -s \end{pmatrix}, \qquad (s = \varepsilon t), \tag{1}$$
where ε > 0 is the adiabatic parameter and $g_0 > 0$ is the minimal gap. The tunneling probability T is the probability of a state, which originates asymptotically on one eigenvalue branch, to end up in the other at late times. The formula Landau and Zener found$^1$ for this Hamiltonian is:
$$T = e^{-\pi g_0^2/2\hbar\varepsilon}. \tag{2}$$

The singularity of the limit $\hbar\varepsilon \to 0$ reflects the singularity of the adiabatic and semiclassical limits, and their coincidence in this case.

The universal aspects of tunneling for near crossing in an open system described by a dephasing Lindbladian are described as follows. The adiabatic evolution of the density matrix ρ is governed by
$$\hbar\varepsilon\dot\rho = L_s(\rho), \qquad (\varepsilon > 0), \tag{3}$$
where the slowly varying parameter s = εt, having the physical dimension of an energy, is viewed as the slow clock. $L_s$ is the changing Lindblad operator
$$L(\rho) = -i[H, \rho] - \hbar\gamma\,(P_-\rho P_+ + P_+\rho P_-); \tag{4}$$
H is the Hamiltonian, which for a generic near crossing is given in Eq. (1); $P_\pm = |\pm\rangle\langle\pm|$ are the two spectral projections of H; finally, γ > 0 is the dephasing rate.$^2$ γ = 0 is the case considered by Landau and Zener. In both cases transitions between the ground and the excited state only occur because the generator of the dynamics depends on s. The tunneling probability
$$T = \operatorname{tr}(\rho P_+)(\infty), \qquad (\rho(-\infty) = P_-(-\infty)) \tag{5}$$
is the error in fidelity of the ground state.

$^1$ Landau, who used semiclassical methods, did not actually attempt to calculate the multiplicative overall factor in front of the exponential in Eq. (2). Fortunately, this factor happens to be unity. Zener solved the differential equation exactly in terms of Weber functions and derived Eq. (2) exactly. Zener was aware of Landau’s solution but for some reason, incorrectly, believed that Landau missed a factor of 2π in the exponent.

$^2$ $\gamma^{-1}$ is the dephasing time commonly denoted by $T_2$.


Fig. 1. The function Q(x). The argument is the ratio of dephasing rate to the minimal gap. The function has a maximum at x = 1.13693

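The adiabatic tunneling formula with dephasing, which we shall derive below, is$^3$
$$T = \frac{\varepsilon\hbar}{2g_0^2}\,Q\!\left(\frac{\hbar\gamma}{g_0}\right) + O(\varepsilon^2), \tag{6}$$
where Q is the algebraic function (shown in the figure)
$$Q(x) = \frac{\pi}{2}\,\frac{x\,\bigl(2 + \sqrt{1+x^2}\bigr)}{\sqrt{1+x^2}\,\bigl(\sqrt{1+x^2} + 1\bigr)^2}. \tag{7}$$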
A few remarks about this result are in order:

• The adiabatic limit means that $\sqrt{\hbar\varepsilon}$ is the smallest energy scale in the problem and in particular, $\varepsilon \ll \hbar\gamma^2$. When this fails, the error terms in Eq. (6) need not be small compared to the leading term.

• When the dephasing is weak, $\hbar\gamma \ll g_0$, Eq. (6) reduces to
$$T \approx \frac{3\pi}{16}\cdot\frac{\varepsilon\gamma\hbar^2}{g_0^3}. \tag{8}$$
This term has the same form as one of the tunneling terms found by Shimshoni and Stern [19] in a (different) model where the two-level system is dephased by noise. The method they use cannot give the overall constant 3π/16 [21], nor does it allow investigating the full range of $\hbar\gamma/g_0$.

• When $\hbar\gamma \gg g_0$, Eq. (6) reduces to
$$T \approx \frac{\pi\varepsilon}{4g_0\gamma}. \tag{9}$$
This may be understood as a manifestation of the quantum Zeno effect [7] by the following interpretation (cf. [4, Sect. 5]): Imagine that the energy H is measured with probability γδt after a short waiting time δt. If a measurement happens, its outcome shall not be recorded. In the process the state is changed from ρ first to $\tilde\rho = e^{-iH\delta t/\hbar}\rho\,e^{iH\delta t/\hbar}$ and then to
$$(1 - \gamma\delta t)\tilde\rho + \gamma\delta t\sum_{i=\pm} P_i\tilde\rho P_i = \rho - \frac{i}{\hbar}[H, \rho]\,\delta t + \gamma\Bigl(\sum_{i=\pm} P_i\rho P_i - \rho\Bigr)\delta t + O((\delta t)^2).$$
In the limit δt → 0 the resulting dynamics is generated by Eq. (4). Hence the dephasing term in the Lindblad generator can be viewed as a continuous monitoring of the state of the system at rate γ. Transitions between the states $P_\pm$, which the changing Hamiltonian term potentially induces, are suppressed at high measurement rates, as stated in Eq. (9) and in line with the Zeno effect.

Equation (6) follows from, and is a special case of, a more general and basic formula for the tunneling when the adiabatic evolution takes place on a finite interval of (slow) time $[s_0, s_1]$ and one also allows γ(s) to be time-dependent
$$T = 2\varepsilon\hbar^2\int_{s_0}^{s_1} \gamma(s)\,\frac{\operatorname{tr}(P_+\dot P_-^2 P_+)}{g^2(s) + \hbar^2\gamma^2(s)}\,ds + O(\varepsilon^2). \tag{10}$$
Here g(s) is the instantaneous gap in H(s),
$$g^2(s) = s^2 + g_0^2. \tag{11}$$

$^3$ Since the problem has two dimensionless parameters: $\hbar\varepsilon/(\hbar^2\gamma^2 + g_0^2)$ and $\hbar\gamma/g_0$ it is not obvious what is the correct dimensionless expression corresponding to $O(\varepsilon^2)$. The correct interpretation of O(ε) turns out to be the product $O\!\left(\frac{\hbar\varepsilon}{\hbar^2\gamma^2 + g_0^2}\cdot\frac{\hbar\gamma}{g_0}\right)$ which can be verified by Eq. (10).

The positivity of the integrand in Eq. (10) when γ > 0 makes the tunneling irreversible. This changes the characteristics of the ε dependence of T from exponentially small in Eq. (2) to linear in Eq. (6). The Landau-Zener formula, Eq. (2), is buried in the error terms of Eq. (10). Equation (10) reduces the tunneling problem to integration. In the case where γ is constant and s runs from −∞ to ∞ the numerator in Eq. (10) is simply
$$\operatorname{tr}(P_+\dot P_-^2 P_+) = \frac{g_0^2}{4g^4(s)}. \tag{12}$$
Elementary algebra then leads to Eq. (6) with
$$Q(x) = x\int_{-\infty}^{\infty} (t^2 + 1)^{-2}(t^2 + 1 + x^2)^{-1}\,dt. \tag{13}$$
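Equation (12) can be checked numerically for the Hamiltonian of Eq. (1). The sketch below is a hedged illustration: it builds the projections as $P_\pm = \frac12(1 \pm 2H/g)$, approximates $\dot P_-$ by a central finite difference in s, and compares the trace with $g_0^2/4g^4(s)$. The helper names, the step size h and the sample values of s and $g_0$ are assumptions made for this example.

```python
import math

def p_minus(s, g0):
    """Ground-state projection of H = (1/2) [[s, g0], [g0, -s]] as a 2x2 list."""
    g = math.hypot(s, g0)  # instantaneous gap g(s) = sqrt(s^2 + g0^2)
    return [[0.5 * (1 - s / g), -0.5 * g0 / g],
            [-0.5 * g0 / g, 0.5 * (1 + s / g)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def numerator(s, g0, h=1e-5):
    """tr(P+ Pdot-^2 P+), with Pdot- from a central finite difference in s."""
    pm = p_minus(s, g0)
    pp = [[(1.0 if i == j else 0.0) - pm[i][j] for j in range(2)]
          for i in range(2)]                      # P+ = 1 - P-
    lo, hi = p_minus(s - h, g0), p_minus(s + h, g0)
    pdot = [[(hi[i][j] - lo[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]
    m = matmul(matmul(pp, matmul(pdot, pdot)), pp)
    return m[0][0] + m[1][1]

g0 = 1.3  # illustrative minimal gap
for s in (-2.0, 0.0, 0.7):
    g = math.hypot(s, g0)
    print(s, numerator(s, g0), g0 ** 2 / (4 * g ** 4))
```

Inserting this numerator into Eq. (10) with constant γ and rescaling $s = g_0 t$ is the "elementary algebra" that produces the integral (13) with $x = \hbar\gamma/g_0$.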

The integral can be evaluated explicitly to give Eq. (7). The key idea behind the derivation of the adiabatic tunneling formula, Eq. (10), is a geometric view of spectral projections as adiabatic invariants. The evolution of observables is governed by the adjoint of the Lindblad generator, L∗ , (this is the Heisenberg


637

picture), where the adjoint refers to the Hilbert-Schmidt inner product. In particular, the adjoint of the dephasing Lindblad operator of Eq. (4) acting on the observable A is given by (from now on we set h¯ = 1) L∗ (A) = i[H, A] − γ (P− A P+ + P+ A P− ).

(14)

It differs from Eq. (4) by the replacement of i by −i. As we shall now see, an instantaneously stationary observable $A(s) \in \mathrm{Ker}(\mathcal{L}^*_s)$ that has no motion in $\mathrm{Ker}(\mathcal{L}_s)$ is an adiabatic invariant. More precisely,

Theorem 1. Let A(s) be an observable which lies in the instantaneous kernel of $\mathcal{L}^*_s$, i.e.

$$\mathcal{L}^*_s(A(s)) = 0 \tag{15}$$

and suppose that, in addition, the linear equation

$$\dot A(s) = \mathcal{L}^*_s(X(s)) \tag{16}$$

admits a solution X(s). Then one has

$$\mathrm{tr}(A(s)\rho_\varepsilon(s))\Big|_{s_0}^{s_1} = \varepsilon\,\mathrm{tr}(X(s)\rho_\varepsilon(s))\Big|_{s_0}^{s_1} - \varepsilon \int_{s_0}^{s_1} \mathrm{tr}(\dot X(s)\rho_\varepsilon(s))\,ds, \tag{17}$$

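where $\rho_\varepsilon(s)$ is a solution of the adiabatic Lindblad evolution.

A(s) is an adiabatic invariant in the sense that its expectation is conserved up to a small error, O(ε), given by the right hand side of Eq. (17), whereas the change in a generic observable is $O(\varepsilon^{-1})$ and in the Lindblad generator is O(1). The identity, Eq. (17), readily follows from

$$\begin{aligned}
\frac{d}{ds}\,\mathrm{tr}(A(s)\rho_\varepsilon(s)) &= \mathrm{tr}(\dot A(s)\rho_\varepsilon(s)) + \mathrm{tr}(A(s)\dot\rho_\varepsilon(s)) \\
&= \mathrm{tr}(\mathcal{L}^*_s(X(s))\rho_\varepsilon(s)) + \varepsilon^{-1}\,\mathrm{tr}(A(s)\mathcal{L}_s(\rho_\varepsilon(s))) \\
&= \mathrm{tr}(X(s)\mathcal{L}_s(\rho_\varepsilon(s))) + \varepsilon^{-1}\,\mathrm{tr}(\mathcal{L}^*_s(A(s))\rho_\varepsilon(s)) \\
&= \varepsilon\,\mathrm{tr}(X(s)\dot\rho_\varepsilon(s))
\end{aligned} \tag{18}$$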

and integration by parts.

Equation (16) may be interpreted as a condition that A(s) undergoes parallel transport: the equation has a solution provided $\dot A(s) \in \mathrm{Range}(\mathcal{L}^*_s)$, which is the case if A(s) has no motion in $\mathrm{Ker}(\mathcal{L}_s)$. It is straightforward to verify that the instantaneous spectral projections $P_j(s)$ of a dephasing Lindblad generator are adiabatic invariants in the sense of the theorem. Evidently, $\mathcal{L}^*_s(P_+(s)) = 0$. Moreover, Eq. (16) is solved by

$$X(s) = -i \sum_{k \neq j} \frac{P_k \dot P_+ P_j}{e_k - e_j + i\gamma} \tag{19}$$

with $e_\pm$ the two eigenvalues of H. To see this note first that X(s) is purely off-diagonal (footnote 4) by construction and so is $\dot P_+$, namely

$$\dot P_+ = P_- \dot P_+ P_+ + P_+ \dot P_+ P_-. \tag{20}$$

Footnote 4: The orthogonal complement to the kernel is spanned by the operators $|\mp\rangle\langle\pm|$.

This follows from $P_+ = P_+^2$, which implies $\dot P_+ = \dot P_+ P_+ + P_+ \dot P_+$ and in turn $P_\pm \dot P_+ P_\pm = 0$. The equality of the off-diagonal components of Eq. (16) follows from

$$\mathcal{L}^*(P_k A P_j) = i\,(e_k - e_j + i\gamma)\,P_k A P_j, \qquad (k, j = \pm,\ k \neq j). \tag{21}$$
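Two steps that the text leaves to the reader can be written out as follows: that Eq. (19) solves Eq. (16) for A = P₊, and how Eq. (17) produces the integrand of Eq. (10). This is an editorial sketch with ħ = 1, not the authors' own derivation. The shorthand $\tau$, the relation $\dot P_- = -\dot P_+$ (from $P_+ + P_- = 1$ in a two-level system), and the notation $g = e_+ - e_-$ are assumptions introduced here, and the replacement of $\rho_\varepsilon$ by $P_-$ anticipates the adiabatic theorem quoted in the next paragraph.

```latex
% Requires amsmath.
% (i) X of Eq. (19) solves Eq. (16) with A = P_+: apply Eq. (21), then Eq. (20).
\[
\mathcal{L}^*_s(X)
  = -i\sum_{k\neq j}\frac{\mathcal{L}^*_s(P_k\dot P_+P_j)}{e_k-e_j+i\gamma}
  = -i\sum_{k\neq j} i\,P_k\dot P_+P_j
  = P_-\dot P_+P_+ + P_+\dot P_+P_-
  = \dot P_+ .
\]
% (ii) In Eq. (17) put A = P_+ and rho = P_- + O(eps).
% tr(X P_-) = 0 since X is off-diagonal, hence tr(\dot X P_-) = -tr(X \dot P_-), and
\[
T = \mathrm{tr}(P_+\rho_\varepsilon)\Big|_{s_0}^{s_1}
  = \varepsilon\int_{s_0}^{s_1}\mathrm{tr}(X\dot P_-)\,ds + O(\varepsilon^2).
\]
% Both off-diagonal traces equal tau, because P_+ \dot P_+ P_+ = 0:
\[
\tau := \mathrm{tr}(P_+\dot P_+^2P_+) = \mathrm{tr}(P_+\dot P_-^2P_+)
      = \mathrm{tr}(P_+\dot P_+P_-\dot P_+) = \mathrm{tr}(P_-\dot P_+P_+\dot P_+).
\]
% With g = e_+ - e_- and \dot P_- = -\dot P_+:
\[
\mathrm{tr}(X\dot P_-)
  = i\,\tau\Big[\frac{1}{g+i\gamma}+\frac{1}{-g+i\gamma}\Big]
  = i\,\tau\,\frac{2i\gamma}{-(g^2+\gamma^2)}
  = \frac{2\gamma\,\tau}{g^2+\gamma^2},
\]
% which is the integrand of Eq. (10) at hbar = 1.
```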

The probability of leaking out of the instantaneous ground state is given by Eq. (17) with A(s) = P₊(s). Equation (10) then follows by appealing to the adiabatic theorem [9–11], which allows one to replace the instantaneous state by the instantaneous projection on the right hand side of Eq. (17),

$$\rho_\varepsilon(s) = P_-(s) + O(\varepsilon), \tag{22}$$

uniformly in s0, s. The rest is simple algebra.

For the convenience of the reader we include a proof of Eq. (22). Let $U_\varepsilon(s, s_0)$ be the propagator for the differential equation (3), whence $\rho(s) = U_\varepsilon(s, s_0)P_-(s_0)$. We recall that the solution of its inhomogeneous variant, $\dot x = \varepsilon^{-1}\mathcal{L}_s(x) + y$, is given by the Duhamel formula

$$x(s) = U_\varepsilon(s, s_0)\,x(s_0) + \int_{s_0}^{s} U_\varepsilon(s, s')\,y(s')\,ds'. \tag{23}$$

The remainder to be estimated, $r(s) = \rho(s) - P_-(s)$, satisfies $\dot r = \varepsilon^{-1}\mathcal{L}_s(r) - \dot P_-$, because $\mathcal{L}_s(P_-) = 0$. Before applying Eq. (23), we observe that the equation $\dot P_-(s) = \mathcal{L}_s(X(s))$ admits Eq. (19) as a solution upon replacing $i$, $\dot P_+$ by $-i$, $\dot P_-$. The differential equation thus becomes $(r - \varepsilon X)^{\cdot} = \varepsilon^{-1}\mathcal{L}_s(r - \varepsilon X) - \varepsilon \dot X$, resulting in

$$r(s) - \varepsilon X(s) = U_\varepsilon(s, s_0)\,(r(s_0) - \varepsilon X(s_0)) - \varepsilon \int_{s_0}^{s} U_\varepsilon(s, s')\,\dot X(s')\,ds'.$$

Since $\mathcal{L}_s$ is dissipative, i.e. $\mathrm{tr}(\rho\,\mathcal{L}_s(\rho)) \le 0$, we obtain $\|U_\varepsilon(s, s_0)\| \le 1$ $(s \ge s_0)$. Together with $r(s_0) = 0$ we conclude that $r(s) = O(\varepsilon)$, as claimed. The uniformity follows from decay: $X(s) = O(s^{-3})$, $\dot X(s) = O(s^{-4})$ $(s \to \pm\infty)$.

In conclusion: We have introduced a class of adiabatically changing dephasing Lindblad operators which allowed us to calculate the tunneling in a generic two-level crossing and extend the Landau-Zener tunneling to dephasing Lindbladians with arbitrary dephasing rate. Dephasing makes the tunneling irreversible and so fundamentally different from tunneling in the unitary setting. This irreversibility is responsible for the difference in the functional form of the tunneling formulas.

Acknowledgements. This work is supported by the ISF and the fund for Promotion of research at the Technion. The last two authors are grateful for hospitality at the Physics Department at the Technion, where most of this work was done. Useful discussions with A. Keren and E. Shimshoni are acknowledged.

References

1. Lindblad, G.: On the generators of quantum dynamical semigroups. Commun. Math. Phys. 48, 119–130 (1976)
2. Gorini, V., Kossakowski, A., Sudarshan, E.C.G.: Completely positive semigroups of N-level systems. J. Math. Phys. 17, 821–825 (1976)
3. Gardiner, C.W., Zoller, P.: Quantum noise. Berlin: Springer, 2004
4. Davies, E.B.: Quantum theory of open systems. London: Academic Press, 1976
5. Breuer, H.P., Petruccione, F.: The theory of open quantum systems. Oxford: Oxford University Press, 2002
6. Åberg, J., Kult, D., Sjöqvist, E.: Robustness of the adiabatic quantum search. Phys. Rev. A 71, 060312 (2005)
7. Misra, B., Sudarshan, E.C.G.: The Zeno's paradox in quantum theory. J. Math. Phys. 18(4), 756–763 (1977)
8. Avron, J., Fraas, M., Graf, G.M., Grech, P.: Optimal time schedule for adiabatic evolution. Phys. Rev. A 82, 040304(R) (2010)
9. Nenciu, G., Rasche, G.: On the adiabatic theorem for nonself-adjoint Hamiltonians. J. Phys. A 25, 5741–5751 (1992)
10. Joye, A.: General adiabatic evolution with a gap condition. Commun. Math. Phys. 275, 139–162 (2007)
11. Abou Salem, W.K.: On the quasi-static evolution of nonequilibrium steady states. Ann. H. Poincaré 8, 569–596 (2007)
12. Landau, L.: Zur Theorie der Energieübertragung. II. Phys. Z. Sowjet. 2, 46–51 (1932)
13. Zener, C.: Non-adiabatic crossing of energy levels. Proc. Roy. Soc. London, Series A 137, 692–702 (1932)
14. Majorana, E.: Atomi orientati in campo magnetico variabile. Nuovo Cimento 9, 43–50 (1932)
15. Stückelberg, E.C.G.: Theorie der unelastischen Stösse zwischen Atomen. Helv. Phys. Acta 5, 369 (1932)
16. Leggett, A.J., et al.: Dynamics of the dissipative two-state system. Rev. Mod. Phys. 59(1), 1–85 (1987)
17. Shimshoni, E., Gefen, Y.: Onset of dissipation in Zener dynamics: relaxation versus dephasing. Ann. Phys. 210, 16–80 (1991)
18. Wubs, M., Saito, K., Kohler, S., Hänggi, P., Kayanuma, Y.: Gauging a quantum heat bath with dissipative Landau-Zener transitions. Phys. Rev. Lett. 97, 200404 (2006)
19. Shimshoni, E., Stern, A.: Dephasing of interference in Landau-Zener transitions. Phys. Rev. B 47, 9523–9536 (1993)
20. Pokrovsky, V.L., Sun, D.: Fast quantum noise in the Landau-Zener transition. Phys. Rev. B 76, 024310 (2007)
21. Berry, M.V.: Histories of adiabatic quantum transitions. Proc. Roy. Soc. London, Series A 429, 61–72 (1990)

Communicated by M. Aizenman




The Dirac Operator on Generalized Taub-NUT Spaces

Andrei Moroianu1, Sergiu Moroianu2,3

1 Centre de Mathématiques, École Polytechnique, 91128 Palaiseau Cedex, France.
2 Institutul de Matematică al Academiei Române, P. O. Box 1-764, RO-014700 Bucharest, Romania
3 Şcoala Normală Superioară Bucharest, Calea Griviţei 21, Bucharest, Romania.
E-mail: [email protected]

Received: 1 April 2010 / Accepted: 8 March 2011
Published online: 21 May 2011 – © Springer-Verlag 2011

Abstract: We find sufficient conditions for the absence of harmonic $L^2$ spinors on spin manifolds constructed as cone bundles over a compact Kähler base. These conditions are fulfilled for certain perturbations of the Euclidean metric, and also for the generalized Taub-NUT metrics of Iwai-Katayama, thus proving a conjecture of Vişinescu and the second author.

1. Introduction

The Taub-NUT metrics on $\mathbb{R}^4$ and their generalizations by Iwai-Katayama [9] provide a fruitful framework for the study of classical and quantum anomalies in the presence of conserved quantities, see e.g. [7]. To describe these metrics, consider the sphere $S^3$ as the unit sphere inside the quaternions. There exist then three orthogonal unit vector fields I, J, K given by left translation with the unit quaternions i, j, k. The Berger metrics $g_\lambda$ on $S^3$ are defined by setting the length of I, J to be 1, and that of K to be λ. The Iwai-Katayama metrics on $\mathbb{R}^4\setminus\{0\} \simeq \mathbb{R}_+ \times S^3$ have the form

$$g_{IK} = \gamma^2(t)\,\bigl(dt^2 + 4t^2 g_{\lambda(t)}\bigr), \tag{1.1}$$

where

$$\gamma(t) = \sqrt{\frac{a+bt}{t}}, \qquad \lambda(t) = \frac{1}{\sqrt{1+ct+dt^2}}$$

for positive constants a, b, c, d. The apparent singularity at the origin is removable.

We are interested here in the axial quantum anomaly already studied in [6,16]. It was found in [6] that the axial anomaly, i.e., the difference between the number of null states of positive and of negative chirality on a ball or annular domain, may become non-zero for suitable choices of the parameters of the metric and of the domain when we impose the Atiyah-Patodi-Singer spectral condition at the boundary. Remarkably, when the radius of the ball is sufficiently large the index was always 0. It was further proved in [16] that on the whole space, although the Dirac operator is not Fredholm, it only has a finite number of null states. The method of proving the finiteness of the index in [16] relied on a general index formula due to Vaillant [19], and on a comparison between harmonic spinors for a pair of conformal metrics. On the standard Taub-NUT space, which is hyperkähler and therefore scalar-flat, it is easy to see that there are no harmonic $L^2$ spinors using the Lichnerowicz identity and the infiniteness of the volume. It was therefore natural to conjecture in [16] that the $L^2$ index of the Dirac operator corresponding to the generalized Taub-NUT metric is also zero. The motivation of the present work is to prove the above conjecture:

Theorem 1.1. There do not exist $L^2$ harmonic spinors on $\mathbb{R}^4$ for the generalized Taub-NUT metrics. In particular, the $L^2$ index of the Dirac operator vanishes.

As we just mentioned, for the standard Taub-NUT metric this has been proved in [16]. Our approach here is less analytic, and more geometric, than the previous attempt described above. We exploit the rich symmetries of the metric to decompose the spinors in terms of frequencies along the fibers as in e.g. [18], and then further in terms of eigenvalues of an associated spin$^c$ Dirac operator on $S^2$. We obtain a system of ordinary differential equations which we show does not admit any $L^2$ solutions. There are similarities with [11,15] in the analysis of this system, but the essential difference is that large time behavior is not enough to rule out $L^2$ harmonic spinors and we must use also the behavior near the origin. The method is more general and we can prove our results for a wider class of manifolds, constructed from a circle fibration over a compact Hodge base.
Although the one-point completion of such a manifold will not be in general a topological manifold, we consider it as a singular complete metric space, the appropriate condition on spinors being boundedness in the $L^\infty$ norm near the singular point. Our main result (Theorem 5.1) applies both to the Iwai-Katayama metrics and to Euclidean metrics, and to certain perturbations thereof.

The paper is organized as follows: In Sect. 2 we introduce the class of metrics studied in the rest of the paper. In Sect. 3 we relate geometric objects – like the Levi-Civita connection and the Dirac operator – of a circle-fibered space to the corresponding objects on the base, and we introduce the announced splitting into frequencies along the fibers. Section 4 contains similar computations in the case of warped products, introducing an extra variable corresponding to the radial direction, and computing the corresponding spin$^c$ Dirac operator. The main analytic result is stated and proved in Sect. 5 by reducing the problem to a linear system of ordinary differential equations on the positive real half-line, and a careful analysis both near infinity and 0 to exclude $L^2$ solutions. Finally in Sect. 6 we extend the result in a rather formal way to include the Iwai-Katayama generalized Taub-NUT metrics.

2. Circle Fibered Warped Products

Let $(B, g_B, \Omega)$ be a compact Kähler manifold of real dimension 2m. Let h denote the warped product metric $dt^2 + \alpha^2(t)g_B$ on $N := \mathbb{R}_+ \times B$ and let p denote the projection N → B. Assume that $(B, g_B, \Omega)$ is a Hodge manifold, i.e., $[\Omega] \in 2\pi H^2(B, \mathbb{Z})$. The classical isomorphism of Čech cohomology groups $H^1(B, S^1) \simeq H^2(B, \mathbb{Z})$ shows the existence of a Hermitian line bundle $L_0 \to B$ with first Chern class $c_1(L_0) = -[\Omega]/2\pi$. Let $M_0$ denote the circle bundle of $L_0$. The projection $q : M_0 \to B$ can be viewed as a principal $S^1$-bundle. By Chern-Weil theory (cf. [14], Ch. 16 for instance) there exists an imaginary-valued connection 1-form $i\xi$ on $M_0$ such that $d\xi = q^*\Omega$. We define $L := \mathbb{R}_+ \times L_0$ and $\pi := \mathrm{id} \times p$ the projection of L onto N. Then π : L → N is a Hermitian line bundle over N whose circle bundle, denoted by M, is just $M := \mathbb{R}_+ \times M_0$. We endow M with the Riemannian metric

$$g := dt^2 + \alpha^2(t)\,(p \circ \pi)^* g_B + \beta^2(t)\,\xi \otimes \xi$$

for some positive functions α and β defined on $\mathbb{R}_+$. The Riemannian manifold (M, g) obtained in this way will be referred to as the circle-fibered warped product (CFWP) over the Hodge manifold $(B, g_B, \Omega)$, with warping functions α and β.

Notice that a CFWP can be viewed either as a generalized cylinder of a family of metrics on the $S^1$-bundle $M_0$ over B (cf. Prop. 2.3 below) or as a Riemannian submersion with 1-dimensional fibres over a warped product $\mathbb{R}_+ \times_\alpha B$. It is the latter point of view which will be useful in order to relate spinors on M and B.

Example 2.1. The flat space $\mathbb{C}^{m+1}\setminus\{0\}$ is the CFWP over the complex projective space $(\mathbb{CP}^m, g_{FS}, \Omega_{FS})$ endowed with the Fubini-Study metric, with warping functions $\alpha(t) = t/\sqrt{2}$, $\beta(t) = t$. The normalization $g_{FS}$ of the Fubini-Study metric used here is the one with scalar curvature equal to 2m(m+1) or, equivalently, the one for which the projection $S^{2m+1} \to (\mathbb{CP}^m, \tfrac12 g_{FS})$ is a Riemannian submersion (cf. [14], Ch. 13).

Example 2.2. The Taub-NUT metric on $\mathbb{C}^2$ is conformal to the one-point completion of the CFWP over the standard 2-sphere of radius $1/\sqrt{2}$ with warping functions $\alpha(t) = \sqrt{2}\,t$, $\beta(t) = \frac{2t}{1+bt}$. More generally, the generalized Taub-NUT metrics of Iwai-Katayama on $\mathbb{C}^2$ are conformal to the one-point completion of the CFWP over $(\mathbb{CP}^1, g_{FS})$, i.e., the standard 2-sphere of radius $1/\sqrt{2}$, with warping functions $\alpha(t) = \sqrt{2}\,t$, $\beta(t) = \frac{2t}{\sqrt{1+ct+dt^2}}$ for some positive constants c and d (cf. [16], p. 6576):

$$g_{IK} = \frac{a+bt}{t}\,\bigl(dt^2 + \alpha^2(t)\,\pi^* g_{FS} + \beta^2(t)\,\xi \otimes \xi\bigr). \tag{2.1}$$

By Remark 2.4 below, these are actually examples of CFWP's.

Proposition 2.3. Let (M, g) be the CFWP over a Hodge manifold $(B^{2m}, g_B, \Omega)$ with warping functions α and β and assume that $\lim_{t\to 0}\alpha(t) = \lim_{t\to 0}\beta(t) = 0$. Let d denote the distance on M induced by g. Then the metric completion $(\hat M, \hat d)$ of (M, d) has exactly one extra point. If g extends to a smooth metric on $\hat M$, then $(B, g_B, \Omega)$ is the complex projective space endowed with the Fubini-Study metric, and

$$\lim_{t\to 0}\Bigl(\frac{\alpha(t)}{t} - \frac{1}{\sqrt{2}}\Bigr) = \lim_{t\to 0}\Bigl(\frac{\beta(t)}{t} - 1\Bigr) = 0.$$

Proof. The Riemannian manifold (M, g) will be viewed as a generalized cylinder (cf. [4]) of the family of metrics $g_t := \alpha^2(t)\,q^*g_B + \beta^2(t)\,\xi\otimes\xi$ on the $S^1$-bundle $M_0$ over B (which is a compact manifold). The first statement follows immediately from the fact that for every $x \in M_0$, the rays $\mathbb{R}_+ \times \{x\}$ are geodesics parametrized by arc-length on M.

Assume now that g extends smoothly to $\hat M$. The Gauss Lemma applied to a neighborhood of the origin t = 0 in $\hat M$ shows that the distance spheres $(M_0, g_t)$ (renormalized by a factor $1/t^2$) tend to the standard sphere $S^{2m+1}$ in the Gromov-Hausdorff topology. In other words, there exist non-zero constants $\alpha_0$, $\beta_0$ such that $\lim_{t\to 0}\alpha(t)/t = \alpha_0$, $\lim_{t\to 0}\beta(t)/t = \beta_0$ and $\alpha_0^2\,q^*g_B + \beta_0^2\,\xi\otimes\xi$ is the standard metric on $S^{2m+1}$. On the other hand, this metric is by definition a Riemannian submersion over $(B, \alpha_0^2 g_B)$ with totally geodesic fibres of length $2\pi\beta_0$. Since the length of every closed geodesic on $S^{2m+1}$ is 2π we get $\beta_0 = 1$. The manifold $(B, \alpha_0^2 g_B)$ is then the quotient of the sphere by an isometric $S^1$ action, so $B = \mathbb{CP}^m$ and $\alpha_0^2 g_B = \tfrac12 g_{FS}$ (cf. [14], Ch. 13). The constant $\alpha_0$ is determined by the normalization condition $c_1(L_0) = -[\Omega_B]/2\pi$. Indeed, since $L_0 = \mathbb{R}^{2m+2}\setminus\{0\}$ is clearly the tautological bundle $\mathcal{O}(-1)$ of $\mathbb{CP}^m$, its first Chern class is equal to $-[\Omega_{FS}]/2\pi$, whence $\alpha_0 = 1/\sqrt{2}$.

The converse holds under some extra smoothness assumption on α and β at t = 0 but we will not need this in the sequel.

Remark 2.4. A metric conformal to a CFWP is itself a CFWP provided that the conformal factor is a radial function (i.e., it only depends on t). Indeed, if $g = \gamma(t)^2\,(dt^2 + \alpha^2(t)\,\pi^*g_B + \beta^2(t)\,\xi\otimes\xi)$, in the new coordinate $s := s(t)$ defined by $s = \int_0^t \gamma(u)\,du$, g reads

$$g = ds^2 + (\gamma\alpha)^2(t(s))\,\pi^*g_B + (\gamma\beta)^2(t(s))\,\xi\otimes\xi.$$

The generalized Taub-NUT metrics from Example 2.2 are thus particular cases of CFWP. We will analyze these metrics in more detail in Sect. 6.

Our main goal in this paper will be to study the $L^2$-index of the Dirac operator on a CFWP (M, g) when M is a spin manifold. As we will see below, this is automatically the case when B has a spin$^c$ structure whose auxiliary bundle is some tensor power of $L_0$, i.e., if the second Stiefel-Whitney class of B satisfies $w_2(B) = 0$ or $w_2(B) \equiv c_1(L_0) \bmod 2$.

In the next two sections we will relate spinors on M to spin$^c$ spinors on N and then further to spin$^c$ spinors on B. The results are quite general and can be viewed as a natural extension of the theory of projectable spinors introduced in [12] to the case of submersions with non-totally geodesic fibres.

3. Spinors on Circle Fibrations

Let π : (M, g) → (N, h) be a Riemannian submersion with 1-dimensional fibres of length 2πβ for some function $\beta : N \to \mathbb{R}_+$. The fibers of π are totally geodesic if and only if β is constant, but we will mostly be interested in examples with non-constant β in the sequel.

We can view M as a principal $S^1$-fibration over N. Indeed, the flow $\varphi_t$ of the vertical Killing vector field V on M of length $\pi^*\beta$ closes up at time t = 2π, i.e., $\varphi_{2\pi} = \mathrm{id}_M$, thus it defines a free $S^1$-action on M whose orbit space is N. We denote by $P_{U(1)}N$ this principal $S^1$-bundle with total space M.

The Riemannian metric g can be written as $g = \pi^*h + \beta^2(t)\,\xi\otimes\xi$, where ξ is the 1-form on M defined by ξ(V) = 1 and $\ker\xi = V^\perp$. The 2-form dξ is basic, i.e., there exists some 2-form F on N such that $d\xi = \pi^*F$. This follows immediately from the Cartan formula and the fact that V is Killing, or alternately since $i\xi$ is a connection 1-form in the principal bundle $P_{U(1)}N$ (cf. Sect. 2).

The following result holds without restriction on the dimension of N but we will state it only for the case we will need in the sequel.

Lemma 3.1. Let $P_{U(1)}N \to N$ be the principal $S^1$-bundle over the (2m+1)-dimensional manifold N defined by the Riemannian submersion π : M → N. Let L → N be the complex line bundle associated to $P_{U(1)}N$ with respect to the canonical representation of $S^1$ on $\mathbb{C}$. Then every spin$^c$ structure $P_{\mathrm{Spin}^c(2m+1)}N$ on N with auxiliary bundle $L^{\otimes k}$, $k \in \mathbb{Z}$, induces a spin structure on M and all these spin structures are isomorphic.

Proof. By enlargement of the structure groups, the two-fold covering

$$\theta : P_{\mathrm{Spin}^c(2m+1)}N \to P_{SO(2m+1)}N \times P_{U(1)}N$$

gives a two-fold covering

$$\theta : P_{\mathrm{Spin}^c(2m+2)}N \to P_{SO(2m+2)}N \times P_{U(1)}N,$$

which, by pull-back through π, gives rise to a Spin$^c$ structure on M:

$$\begin{array}{ccc}
P_{\mathrm{Spin}^c(2m+2)}M & \xrightarrow{\ \pi\ } & P_{\mathrm{Spin}^c(2m+2)}N \\
\Big\downarrow{\scriptstyle \pi^*\theta} & & \Big\downarrow{\scriptstyle \theta} \\
P_{SO(2m+2)}M \times P_{U(1)}M & \xrightarrow{\ \pi\ } & P_{SO(2m+2)}N \times P_{U(1)}N \\
\Big\downarrow & & \Big\downarrow \\
M & \xrightarrow{\ \pi\ } & N
\end{array}$$

This construction actually yields a spin structure on M. Indeed, the pull back $P_{U(1)}M$ to M of $P_{U(1)}N$ is trivial since it carries a tautological global section $\sigma(u) = (u, u)$, $\forall u \in M = P_{U(1)}N$. Correspondingly, the pull-back to M of every associated bundle $L^{\otimes k}$ is trivial.

From now on we assume that N carries some spin$^c$ structure with auxiliary bundle $L^{\otimes k}$, and we study M with the spin structure induced by the previous lemma. In particular, we will consider the flat connection on the trivial bundle $P_{U(1)}M$, rather than the pull-back connection from $P_{U(1)}N$, in order to define covariant derivatives of spinors on M.

The following result, first proved in [13], relates an arbitrary connection on a principal bundle $\pi : M = P_{U(1)}N \to N$ and the flat connection on $\pi^*M = P_{U(1)}M \to M$:

$$\begin{array}{ccc}
\pi^*M = P_{U(1)}M \simeq M \times S^1 & \xrightarrow{\ \pi\ } & M = P_{U(1)}N \\
\Big\downarrow{\scriptstyle \pi^*\pi} & & \Big\downarrow{\scriptstyle \pi} \\
M & \xrightarrow{\ \pi\ } & N
\end{array}$$

Lemma 3.2. The connection form $A_0$ of the flat connection on $P_{U(1)}M$ can be related to an arbitrary connection A on $P_{U(1)}N$ by

$$A_0((\pi^*s)_*(U)) = -A(U), \qquad A_0((\pi^*s)_*(X^*)) = A(s_*X),$$

where U is a vertical vector field on M, $X^*$ is the horizontal lift (with respect to A) of a vector field X on N, and s is a local section of M → N.


Proof. The identification $M \times U(1) \simeq \pi^*M$ is given by $(u, a) \mapsto (u, ua)$, for all $(u, a) \in M \times U(1)$. For some fixed $u \in M$, take a path $u_t$ in the fiber over $x := \pi(u)$ such that $u_0 = u$ and $\dot u_0 = U$. We define $a_t \in U(1)$ by $u_t = s(x)a_t$, so via the above identification we have $(\pi^*s)(u_t) = (u_t, s(x)) = (u_t, (a_t)^{-1})$, and thus

$$A_0((\pi^*s)_*(U)) = -a_0^{-1}\dot a_0 = -A(\dot u_0) = -A(U).$$

Similarly, for $x \in N$ and $X \in T_xN$, take a path $x_t$ in N such that $x_0 = x$ and $\dot x_0 = X$. Let $u \in \pi^{-1}(x)$ and $u_t$ be the horizontal lift of $x_t$ such that $u_0 = u$. We define $a_t \in U(1)$ by $s(x_t) = u_t a_t$, which by differentiation gives $s_*(X) = R_{a_0}\dot u_0 + u_0\dot a_0$. Then $(\pi^*s)(u_t) = (u_t, s(x_t)) = (u_t, a_t)$, and thus, using the fact that $\dot u_0$ is horizontal,

$$A_0((\pi^*s)_*(X^*)) = a_0^{-1}\dot a_0 = A(s_*(X)).$$

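Recall that the complex Clifford representation $\Sigma_{2m+2} = \Sigma^+_{2m+2} \oplus \Sigma^-_{2m+2}$ can be identified with $\Sigma_{2m+1} \oplus \Sigma_{2m+1}$ by defining in an orthonormal basis

$$e_j \cdot (\psi, \phi) = \begin{cases} (e_j\cdot\phi,\ e_j\cdot\psi) & \text{for } j \le 2m+1, \\ (-\phi,\ \psi) & \text{for } j = 2m+2. \end{cases}$$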
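That this prescription really defines a Clifford module can be checked directly. The following is an editorial sketch, not part of the source: the symbol $\Sigma$ for the spinor modules and the names $E_j$ for the operators on pairs $(\psi,\phi)$ are notational assumptions, and the only input is the relation $e_j\cdot e_k + e_k\cdot e_j = -2\delta_{jk}$ on $\Sigma_{2m+1}$.

```latex
% Requires amsmath. E_j denotes the action of e_j on pairs (psi, phi).
\[
E_j(\psi,\phi) = (e_j\cdot\phi,\ e_j\cdot\psi)\quad (j\le 2m+1),\qquad
E_{2m+2}(\psi,\phi) = (-\phi,\ \psi).
\]
% Squares:
\[
E_j^2(\psi,\phi) = (e_j\cdot e_j\cdot\psi,\ e_j\cdot e_j\cdot\phi) = -(\psi,\phi),\qquad
E_{2m+2}^2(\psi,\phi) = E_{2m+2}(-\phi,\psi) = -(\psi,\phi).
\]
% Anticommutation for j \neq k \le 2m+1:
\[
(E_jE_k + E_kE_j)(\psi,\phi)
  = \bigl((e_j e_k + e_k e_j)\cdot\psi,\ (e_j e_k + e_k e_j)\cdot\phi\bigr) = 0.
\]
% Anticommutation with the extra generator:
\[
E_jE_{2m+2}(\psi,\phi) = (e_j\cdot\psi,\ -e_j\cdot\phi),\qquad
E_{2m+2}E_j(\psi,\phi) = (-e_j\cdot\psi,\ e_j\cdot\phi),
\]
% so the two sum to zero and E_1, ..., E_{2m+2} satisfy the Clifford relations.
```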

Accordingly, we obtain identifications, denoted by $\pi^\pm$, of the pull back $\pi^*\Sigma N$ with $\Sigma^\pm M$. By a slight abuse of notation we will denote $\pi^\pm$ and $\Sigma^\pm M$ by $\pi^\varepsilon$ and $\Sigma^\varepsilon M$ for ε = ±1. With respect to these identifications, if X is a vector and Ψ is a spinor on N, then

$$X^*\cdot\pi^\varepsilon\Psi = \pi^{-\varepsilon}(X\cdot\Psi), \tag{3.1}$$

$$\frac{1}{\beta}V\cdot(\pi^\varepsilon\Psi) = \varepsilon\,\pi^{-\varepsilon}\Psi, \tag{3.2}$$

where $\frac{1}{\beta}V$ is the unit vertical vector field defined at the beginning of this section, and $X^*$ denotes the horizontal lift to M of a vector field X on N.

We consider now a spin$^c$ structure $P_{\mathrm{Spin}^c(2m+1)}N$ on (N, h) with auxiliary bundle $L^{\otimes k}$ and denote by $\nabla^N$ the covariant derivative induced on $\Sigma N$ by the connection form $i\xi$ of $P_{U(1)}N$. By Lemma 3.1, the pull-back to M of $P_{\mathrm{Spin}^c(2m+1)}N$ induces by enlargement a spin structure on (M, g), where we recall that $g = \pi^*h + \beta^2\,\xi\otimes\xi$.

Proposition 3.3. Let $\nabla^M$ denote the covariant derivative on $\Sigma^\varepsilon M$ induced by the Levi-Civita connection on (M, g) and the flat connection on $\pi^*P_{U(1)}N$. Let $\nabla^N$ denote the spin$^c$ covariant derivative on $\Sigma N$ induced by the Levi-Civita connection on (N, h) and the connection form $A = i\xi$ on $P_{U(1)}N$. Then $\nabla^M$ and $\nabla^N$ are related by

$$\nabla^M_{X^*}(\pi^\varepsilon\Psi) = \pi^\varepsilon\Bigl(\nabla^N_X\Psi - \frac{\varepsilon\beta}{4}\,T(X)\cdot\Psi\Bigr), \qquad \forall X \in TN, \tag{3.3}$$

$$\nabla^M_V(\pi^\varepsilon\Psi) = \pi^\varepsilon\Bigl(\frac{\beta^2}{4}\,F\cdot\Psi + \frac{\varepsilon}{2}\,d\beta\cdot\Psi - \frac{ki}{2}\,\Psi\Bigr), \tag{3.4}$$

where T is the endomorphism of TN defined by $d\xi(X^*, Y^*) = F(X, Y) = h(TX, Y)$.

Proof. If V denotes as before the vertical vector field such that ξ(V) = 1, the Koszul formula and the fact that $[V, X^*] = 0$ for all vector fields X on N yield

$$g(\nabla^M_{X^*}Y^*, Z^*) = h(\nabla^N_X Y, Z), \tag{3.5}$$

$$\begin{aligned}
g(\nabla^M_V X^*, Y^*) = g(\nabla^M_{X^*}V, Y^*) &= -\frac12\,g(V, [X^*, Y^*]) = -\frac{\beta^2}{2}\,\xi([X^*, Y^*]) \\
&= \frac{\beta^2}{2}\,d\xi(X^*, Y^*) = \frac{\beta^2}{2}\,h(TX, Y) = \frac{\beta^2}{2}\,F(X, Y),
\end{aligned} \tag{3.6}$$

and

$$g(\nabla^M_V X^*, V) = g(\nabla^M_{X^*}V, V) = \beta X(\beta), \tag{3.7}$$

for all vector fields X, Y and Z on N.

Consider a spinor field Ψ on N locally expressed as Ψ = [σ, ψ], where $\psi : U \subset N \to \Sigma_{2m+1}$ is a vector-valued function, and σ is a local section of $P_{\mathrm{Spin}^c(2m+1)}N$ whose projection onto $P_{SO(2m+1)}N$ is a local orthonormal frame $(X_1, \dots, X_{2m+1})$ and whose projection onto $P_{U(1)}N$ is a local section s. Then $\pi^\varepsilon\Psi$ can be expressed as $\pi^\varepsilon\Psi = [\pi^*\sigma, \pi^*\psi]$. Moreover, the projection of $\pi^*\sigma$ onto $P_{SO(2m+2)}M$ is the local orthonormal frame $(\frac{1}{\beta}V, X_1^*, \dots, X_{2m+1}^*)$ and its projection onto $P_{U(1)}M$ is just $\pi^*s$. Using the general formula for the covariant derivative on spinors, Lemma 3.2, and the fact that the bundle $L^{\otimes k}$ is associated to $P_{U(1)}N$ via the representation $\rho^k(z) = z^k$ of $S^1$ on $\mathbb{C}$, we obtain

$$\nabla^M_{X^*}\pi^\varepsilon\Psi = [\pi^*\sigma, X^*(\pi^*\psi)] + \frac12\sum_{j\,\cdots} g(\nabla^M_{X^*}X_j^*, X_k^*)\,X_j^*\cdot X_k^*\cdot\pi^\varepsilon\Psi\ \cdots$$

[...] > 0, so it cannot be $L^2$.

The previous lemma together with (5.5) show that λ ≠ 0: indeed, for λ = 0 the system (5.5) uncouples into two first-order linear ODE's, whose nontrivial solutions never vanish by uniqueness, hence they have constant sign and so Lemma 5.2 applies. By changing U to −U if necessary, we can therefore assume that σ(t) > 0 for all t > 0.

Lemma 5.3. If σ(t) > 0 for all t > 0, then we must have (UW)(t) ≤ 0 for all $t \in \mathbb{R}_+$.

Proof. Assume that UW > 0 on some open interval I. From (c) we easily infer

$$\tau + \rho = -\varepsilon(-1)^l\,\frac{\beta}{2\alpha^2} \ge -\frac{1}{\sqrt{2}\,\alpha}, \tag{5.6}$$

so (5.5) yields

$$(UW)' = (\tau + \rho)\,UW + \sigma\,(U^2 + W^2) \ge -\frac{1}{\sqrt{2}\,\alpha}\,UW. \tag{5.7}$$

Consider the maximal interval $J := (x_0, x_1)$ containing I on which UW > 0. For every $x_0 < x \le t < x_1$, (5.7) implies

$$(UW)(t) \ge (UW)(x)\,\exp\Bigl(-\int_x^t \frac{ds}{\sqrt{2}\,\alpha(s)}\Bigr). \tag{5.8}$$

If $x_1 < \infty$ then by continuity

$$(UW)(x_1) \ge (UW)(x)\,\exp\Bigl(-\int_x^{x_1} \frac{ds}{\sqrt{2}\,\alpha(s)}\Bigr) > 0,$$

contradicting the maximality of J. Therefore $x_1 = \infty$, so (UW)(t) > 0 for all $t > x_0$. By integration, (5.8) implies

$$\int_x^\infty (UW)(t)\,dt \ge (UW)(x)\int_x^\infty \exp\Bigl(-\int_x^t \frac{ds}{\sqrt{2}\,\alpha(s)}\Bigr)\,dt.$$

By hypothesis (a), the last integral is infinite; however, $U, W \in L^2(\mathbb{R}_+, dt)$ implies that $\int_x^\infty (UW)(t)\,dt < \infty$, a contradiction.
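The passage from (5.7) to (5.8), and the finiteness of the integral of UW, can be spelled out with an integrating factor and the Cauchy-Schwarz inequality. This is an editorial sketch; the auxiliary function E is a name introduced here and does not appear in the source.

```latex
% Requires amsmath. For fixed x in J, define the integrating factor
\[
E(t) := \exp\Bigl(\int_x^t \frac{ds}{\sqrt{2}\,\alpha(s)}\Bigr),\qquad
E(x) = 1,\qquad E'(t) = \frac{E(t)}{\sqrt{2}\,\alpha(t)} .
\]
% By (5.7), the product (UW)E is non-decreasing on J:
\[
\frac{d}{dt}\bigl[(UW)(t)\,E(t)\bigr]
  = E(t)\Bigl[(UW)'(t) + \frac{(UW)(t)}{\sqrt{2}\,\alpha(t)}\Bigr] \;\ge\; 0 ,
\]
% hence (UW)(t) E(t) >= (UW)(x) E(x) = (UW)(x), which is (5.8).
% Finiteness of the left-hand integral follows from Cauchy-Schwarz:
\[
\int_x^\infty |(UW)(t)|\,dt \;\le\;
\Bigl(\int_x^\infty U^2\,dt\Bigr)^{1/2}\Bigl(\int_x^\infty W^2\,dt\Bigr)^{1/2} < \infty .
\]
```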

Lemma 5.4. (UW)(t) < 0 for all t > 0.

Proof. Assume for instance that $U(x_0) = 0$. The Cauchy-Lipschitz theorem gives $W(x_0) \ne 0$ and the first equation in (5.5) shows that U(x) has the same sign as W(x) for every x in some small interval $(x_0, x_0 + \delta)$, contradicting Lemma 5.3. The same argument works when $W(x_0) = 0$ by considering the second equation in (5.5).

We proved so far that U and W have opposite signs and σ is positive. Condition (c) implies that τ and ρ have constant signs on $\mathbb{R}_+$ since 0 ≤ l ≤ m − 1. If τ ≤ 0, it means that τ and σ have opposite signs, and since also U and W have opposite signs by Lemma 5.4, it follows from the second equation in (5.5) that W has constant sign. By Lemma 5.2 we get a contradiction. This shows that τ > 0 and similarly we prove ρ > 0. By condition (c), this can only happen for k = 0, m = 2l + 1 and $(-1)^l = -\varepsilon$. Assuming this to be the case, the system (5.5) reads

$$U' = \frac{\beta}{4\alpha^2}\,U + \frac{|\lambda|}{\alpha}\,W, \qquad W' = \frac{|\lambda|}{\alpha}\,U + \frac{\beta}{4\alpha^2}\,W. \tag{5.9}$$

The difference D := U − W is thus a non-vanishing function satisfying

$$D' = \Bigl(\frac{\beta}{4\alpha^2} - \frac{|\lambda|}{\alpha}\Bigr)\,D, \tag{5.10}$$


so for every $t_0 > 0$,

$$D(t) = D(t_0)\,\exp\Bigl(\int_{t_0}^t \Bigl(\frac{\beta(s)}{4\alpha(s)^2} - \frac{|\lambda|}{\alpha(s)}\Bigr)\,ds\Bigr).$$

To conclude the proof of the theorem we distinguish two cases. If $|\lambda| \le 2^{-3/2}$ we get

$$|D(t)| > |D(t_0)|\,\exp\Bigl(-\int_{t_0}^t \frac{ds}{2\sqrt{2}\,\alpha(s)}\Bigr),$$

so D cannot be square-integrable because of hypothesis (a). If $|\lambda| \ge 2^{-3/2}$, (5.6) together with (5.10) show that D is decreasing, contradicting Lemma 5.2.
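The first case can be made explicit as follows. This is an editorial sketch that assumes $|\lambda|$ is meant in the case distinction (as in (5.10)) and that hypothesis (a) has the form used in the proof of Lemma 5.3, namely divergence of $\int_x^\infty \exp\bigl(-\int_x^t ds/(\sqrt{2}\,\alpha(s))\bigr)\,dt$ for all x > 0.

```latex
% Requires amsmath. Since beta > 0 and |lambda| <= 2^{-3/2}:
\[
\frac{D'}{D} = \frac{\beta}{4\alpha^2} - \frac{|\lambda|}{\alpha}
  \;>\; -\frac{|\lambda|}{\alpha} \;\ge\; -\frac{1}{2\sqrt{2}\,\alpha}.
\]
% Integrating from t_0 to t gives the stated lower bound on |D(t)|; squaring it,
\[
|D(t)|^2 \;>\; |D(t_0)|^2 \exp\Bigl(-\int_{t_0}^t \frac{ds}{\sqrt{2}\,\alpha(s)}\Bigr),
\]
% and integrating over (t_0, infinity) with hypothesis (a) at x = t_0:
\[
\int_{t_0}^\infty |D(t)|^2\,dt \;\ge\;
|D(t_0)|^2 \int_{t_0}^\infty \exp\Bigl(-\int_{t_0}^t \frac{ds}{\sqrt{2}\,\alpha(s)}\Bigr)\,dt = \infty .
\]
% This contradicts D = U - W in L^2(R_+, dt), which holds because U and W are in L^2.
```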

6. Axial Anomaly for Generalized Taub-NUT Metrics on $\mathbb{R}^4$

6.1. Radial perturbations of the Euclidean metric on $\mathbb{R}^{2m+2}$. Recall from Example 2.1 that the Euclidean space is the metric completion of the CFWP with $\alpha = t/\sqrt{2}$, $\beta = t$ and with basis $B = \mathbb{CP}^m$ endowed with the Fubini-Study metric. Note that by elliptic regularity, bounded spinors which are harmonic on a punctured ball centered at the origin are actually smooth and harmonic on the whole ball, while conversely harmonic spinors on $\mathbb{R}^{2m+2}$ are clearly bounded near 0. Theorem 5.1 applies therefore to the Euclidean metric on $\mathbb{R}^{2m+2}$, for all m ≥ 1. It is of course well-known that there are no harmonic $L^2$ spinors on the Euclidean space. Our results generalize this to metrics which are radial perturbations of the standard Euclidean metric with any α, β satisfying the conditions of Theorem 5.1.

6.2. Generalized Taub-NUT metrics. The main application of Theorem 5.1 that we have in mind is the vanishing of the index for the generalized Taub-NUT metric of Iwai-Katayama. The difficulty of the problem resides in the non-Fredholmness of the Dirac operator as an unbounded operator in $L^2$. Nevertheless in [16] it was proved that the $L^2$ kernel of D is finite-dimensional, and vanishes for the standard Taub-NUT metric.

We cannot apply Theorem 5.1 directly because of the conformal factor γ(t) in (1.1). As in Remark 2.4 we set ds = γ(t)dt. Notice that s = s(t), t = t(s) are diffeomorphisms of $\mathbb{R}_+$ onto itself provided

$$\int_0^1 \gamma(t)\,dt < \infty, \qquad \int_0^\infty \gamma(t)\,dt = \infty. \tag{6.1}$$

This condition clearly holds for the conformal factor in (1.1), which is asymptotically constant near infinity and of order $t^{-1/2}$ near t = 0. Thus we obtain a CFWP metric $\gamma^2(t)g$, where g is itself a CFWP metric.

Lemma 6.1. Let g be a CFWP metric with coefficients α(t), β(t), and γ(t) a conformal factor satisfying (6.1). Then the CFWP metric $\gamma^2 g$ satisfies the hypotheses of Theorem 5.1 if and only if

(a') $\int_x^\infty \gamma(t)\,\exp\bigl(-\int_x^t \frac{du}{\sqrt{2}\,\alpha(u)}\bigr)\,dt = \infty$ for all x > 0;
(b') $\lim_{t\to 0}\gamma(t)\alpha(t) = 0$;
(c') $2\alpha^2(t) \ge \beta^2(t) > \frac{m-1}{m}\,2\alpha^2(t)$ for all t > 0.


Proof. The coefficients of the CFWP metric $\gamma^2 g$ are

$$\tilde\alpha(s) = \gamma(t(s))\,\alpha(t(s)), \qquad \tilde\beta(s) = \gamma(t(s))\,\beta(t(s)),$$

so conditions (b), (c) from Theorem 5.1 for $\tilde\alpha$, $\tilde\beta$ are clearly equivalent to conditions (b'), (c'). Now $\frac{1}{\alpha(t)}\,dt = \frac{1}{\tilde\alpha(s)}\,ds$ and by definition γ(t)dt = ds, so by two changes of variables, condition (a') is equivalent to condition (a) for $\tilde\alpha$.

As a corollary to Theorem 5.1 we deduce that the Iwai-Katayama metrics on $\mathbb{R}^4$ do not admit non-trivial $L^2$ harmonic spinors.

Proof of Theorem 1.1. It is straightforward to check that the conditions of Lemma 6.1 hold for the coefficients of Example 2.2, namely

$$m = 1, \qquad \alpha(t) = \sqrt{2}\,t, \qquad \beta(t) = \frac{2t}{\sqrt{1+ct+dt^2}}, \qquad \gamma(t) = \sqrt{\frac{a+bt}{t}}.$$

It follows from Theorem 5.1 that there do not exist non-trivial $L^2$ harmonic spinors on $(\mathbb{R}^4\setminus\{0\}, g_{IK})$ bounded near the origin. Of course, the metric $g_{IK}$ is smooth at the origin, as can be seen by the change of variable $r^2 = t$. In particular we have proved that there do not exist $L^2$ harmonic spinors on $(\mathbb{R}^4, g_{IK})$.
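The check described as straightforward in the proof of Theorem 1.1 can be written out as follows. This is an editorial sketch, not the authors' computation; it assumes the square-root form $\gamma(t) = \sqrt{(a+bt)/t}$ of the conformal factor (consistent with Eq. (2.1)) and uses only that a, b, c, d are positive.

```latex
% Requires amsmath. Data: m = 1, alpha = sqrt(2) t, beta = 2t / sqrt(1+ct+dt^2),
% gamma = sqrt((a+bt)/t), with a, b, c, d > 0.
% Condition (6.1):
\[
\int_0^1 \gamma(t)\,dt \le \sqrt{a+b}\int_0^1 t^{-1/2}\,dt = 2\sqrt{a+b} < \infty,
\qquad
\int_1^\infty \gamma(t)\,dt \ge \int_1^\infty \sqrt{b}\,dt = \infty .
\]
% Condition (c') with m = 1, where the lower bound (m-1)/m * 2 alpha^2 vanishes:
\[
2\alpha^2(t) = 4t^2 \;\ge\; \frac{4t^2}{1+ct+dt^2} = \beta^2(t) \;>\; 0 .
\]
% Condition (b'):
\[
\gamma(t)\,\alpha(t) = \sqrt{2t\,(a+bt)} \;\longrightarrow\; 0 \quad (t\to 0).
\]
% Condition (a'): since sqrt(2) alpha(u) = 2u,
\[
\exp\Bigl(-\int_x^t \frac{du}{2u}\Bigr) = \sqrt{\frac{x}{t}},
\qquad
\gamma(t)\sqrt{\frac{x}{t}} = \frac{\sqrt{x\,(a+bt)}}{t} \;\ge\; \sqrt{\frac{bx}{t}},
\]
% and the integral of t^{-1/2} over (x, infinity) diverges, so (a') holds for every x > 0.
```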

We could have also used the conformal covariance of the Dirac operator (cf. [8], see also [18]) to relate harmonic spinors for the metrics g and $\gamma^2(t)g = g_{IK}$. We do not give details since this approach is essentially equivalent to the above proof.

Acknowledgements. The authors are indebted to Mihai Vişinescu for his help concerning generalized Taub-NUT metrics and to the anonymous referee for valuable comments. The authors acknowledge the support of the Associated European Laboratory "MathMode". A.M. was partially supported by ANR-10-BLAN 0105. S.M. was partially supported by grant PN-II-ID-PCE 1187/2009.

References

1. Ammann, B.: The Dirac operator on collapsing S¹-bundles. Sémin. Théor. Spec. Géom., Univ. Grenoble 16, 33–42 (1998)
2. Ammann, B.: Spin-Strukturen und das Spektrum des Dirac-Operators. PhD Thesis, Freiburg, 1998
3. Ammann, B., Bär, C.: The Dirac Operator on Nilmanifolds and Collapsing Circle Bundles. Ann. Global Anal. Geom. 16, 221–253 (1998)
4. Bär, C., Gauduchon, P., Moroianu, A.: Generalized Cylinders in Semi-Riemannian and Spin Geometry. Math. Z. 249, 545–580 (2005)
5. Bourguignon, J.-P., Gauduchon, P.: Spineurs, opérateurs de Dirac et variations de métriques. Commun. Math. Phys. 144, 581–599 (1992)
6. Cotăescu, I.I., Moroianu, S., Vişinescu, M.: Gravitational and axial anomalies for generalized Euclidean Taub-NUT metrics. J. Phys. A – Math. Gen. 38, 7005–7019 (2005)
7. Cotăescu, I.I., Vişinescu, M.: Runge-Lenz operator for Dirac field in Taub-NUT background. Phys. Lett. B 502, 229–234 (2001)
8. Hitchin, N.: Harmonic spinors. Adv. in Math. 14, 1–55 (1974)
9. Iwai, T., Katayama, N.: On extended Taub-NUT metrics. J. Geom. Phys. 12, 55–75 (1993)
10. Kirchberg, K.-D.: An estimation for the first eigenvalue of the Dirac operator on closed Kähler manifolds of positive scalar curvature. Ann. Global Anal. Geom. 3, 291–325 (1986)
11. Lott, J.: The Dirac operator and conformal compactification. Internat. Math. Res. Notices 2001(4), 171–178 (2001)
12. Moroianu, A.: La première valeur propre de l'opérateur de Dirac sur les variétés kähleriennes compactes. Commun. Math. Phys. 169, 373–384 (1995)
13. Moroianu, A.: Spinc Manifolds and Complex Contact Structures. Commun. Math. Phys. 193, 661–673 (1998)
14. Moroianu, A.: Lectures on Kähler Geometry. LMS Student Texts 69, Cambridge: Cambridge Univ. Press, 2007
15. Moroianu, A., Moroianu, S.: The Dirac spectrum on manifolds with gradient conformal vector fields. J. Funct. Analysis 253, 207–219 (2007)
16. Moroianu, S., Vişinescu, M.: L²-index of the Dirac operator of generalized Euclidean Taub-NUT metrics. J. Phys. A – Math. Gen. 39, 6575–6581 (2006)
17. O'Neill, B.: Semi-Riemannian geometry. New York: Acad. Press, 1983
18. Nistor, V.: On the kernel of the equivariant Dirac operator. Ann. Global Anal. Geom. 17, 595–613 (1999)
19. Vaillant, B.: Index- and spectral theory for manifolds with generalized fibered cusps. Dissertation, Bonner Math. Schriften 344, Rheinische Friedrich-Wilhelms-Universität Bonn, 2001

Communicated by A. Connes




Ground State at High Density András Sütő Research Institute for Solid State Physics and Optics, Hungarian Academy of Sciences, P. O. B. 49, 1525 Budapest, Hungary. E-mail: [email protected] Received: 29 April 2010 / Accepted: 15 February 2011 Published online: 8 June 2011 – © Springer-Verlag 2011

Abstract: Weak limits as the density tends to infinity of classical ground states of integrable pair potentials are shown to minimize the mean-field energy functional. By studying the latter we derive global properties of high-density ground state configurations in bounded domains and in infinite space. Our main result is a theorem stating that for interactions having a strictly positive Fourier transform the distribution of particles tends to be uniform as the density increases, while high-density ground states show some pattern if the Fourier transform is partially negative. The latter confirms the conclusion of earlier studies by Vlasov (in J. Phys. (USSR) IX:25–40, 1945), Kirzhnits and Nepomnyashchii (in Sov. Phys. JETP 32:1191–1197, 1971), and Likos et al. (in J. Chem. Phys. 126:224502, 2007). Other results include the proof that there is no Bravais lattice among high-density ground states of interactions whose Fourier transform has a negative part and the potential diverges or has a cusp at zero. We also show that in the ground state configurations of the penetrable sphere model particles are superimposed on the sites of a close-packed lattice.

Contents
1. Introduction
2. Best Superstability Constant
3. Infinite-Density Limit of the Ground-State Energy and the Free Energy per Pair
4. Infinite-Density Ground State and the Energy Functional in Finite Volume
   4.1 Bounded interactions
   4.2 Unbounded interactions
   4.3 Fourier representation of the energy functional
5. Stability Conditions for Pair Potentials
6. Stationary Points of the Energy Functional
7. Ground State Configurations in Infinite Space
8. Infinite-Density Ground State in Infinite Space
9. Infinite-Density Ground States for v ≥ 0
10. Uniformity vs Nonuniformity of High-Density Ground States
11. Interactions without Bravais Lattice Ground States at High Density
   11.1 Compensation by higher harmonics
   11.2 Potentials with a cusp at zero
12. The Penetrable Sphere Model
Appendix
References

1. Introduction

The present paper is a continuation of my earlier work on ground states of bounded Fourier-transformable pair potentials [1,2]. Bounded or integrable interactions appear in quantum physics for example in Bogoliubov’s theory of the Bose gas, which is based on models expressed in terms of the Fourier transform of the pair potential. Such interactions also play a central role in classical soft matter physics [3]. Those studied in [1,2] had a nonnegative Fourier transform of compact support. In 2007 Likos and his coworkers published a mean-field study of the case when the Fourier transform of the pair potential has a negative part [4]. If the system discussed in [1,2] already showed very peculiar properties at high densities, according to these authors a partly negative Fourier transform induced an even more curious behavior, the particles having the tendency to form clusters on the sites of a lattice whose lattice constant would be determined by the (negative) minimum of the Fourier transform, and only the population of the clusters would increase with the density. The question then arises whether it is possible to prove this kind of behavior rigorously. When, in the fall of 2008, I presented preliminary results on this problem and on its quantum mechanical counterpart [5] at a Montreal meeting in mathematical physics, Valentin Zagrebnov kindly informed me that, without knowing it, I worked on the theory of coherent crystals, a subject promoted by Russian physicists a long time ago. In effect, maybe the first attempt at a theory of crystallization was due to Vlasov. In his paper [6] Vlasov studied the solutions of what is today called the Vlasov equation

$$-\frac{\partial f}{\partial t} = v\cdot\frac{\partial f}{\partial r} + \frac{1}{m}\,F\cdot\frac{\partial f}{\partial v} \tag{1.1}$$

for the one-particle distribution $f(r,v,t)$. There is no collision term, but the equation contains a self-consistent force field

$$F(r,t) = -\frac{\partial}{\partial r}\int u(r-r')\,f(r',v,t)\,dv\,dr' \tag{1.2}$$

induced by a translation invariant pair potential $u$. Considered on the torus or in the entire space, this equation has spatially homogeneous static solutions $f^0(v)$. Vlasov investigated his equation after linearizing it about $f^0$. Besides many others, he asked the question whether there may exist spatially periodic static solutions. If yes, they could be associated with crystals. In his analysis, however, he arrived at the false conclusion that for the existence of such a solution $\int u(r)\,dr < 0$ must hold. Today we know that there can be no equilibrium (crystalline or other) phase under the effect of such an interaction: the system collapses, the grand-canonical partition function diverges in finite volumes [7]. The possibility of a static periodic solution for a (super)stable $u$


is, nonetheless, there, and to see how such a solution emerges, we can follow Vlasov’s reasoning almost up to the end. Given the velocity profile $f^0$, one substitutes $f(r,v,t) = f^0(v) + \phi(r,v,t)$ into Eq. (1.1) and keeps only the terms linear in $\phi$:

$$\frac{\partial\phi}{\partial t} + v\cdot\frac{\partial\phi}{\partial r} - \frac{1}{m}\,\frac{\partial f^0}{\partial v}\cdot\frac{\partial (u*\Phi)}{\partial r} = 0, \tag{1.3}$$

where

$$\Phi(r,t) = \int \phi(r,v,t)\,dv. \tag{1.4}$$

With the ansatz

$$\phi(r,v,t) = e^{i\omega t - i k\cdot r}\, g_k(v), \tag{1.5}$$

one obtains

$$g_k(v) = \frac{\hat u(k)\; k\cdot\frac{\partial f^0}{\partial v}}{m\,(k\cdot v - \omega)} \int g_k(v')\,dv', \tag{1.6}$$

where $\hat u$ is the Fourier transform of $u$. Integration over $v$ yields

$$1 = \frac{\hat u(k)}{m} \int \frac{k\cdot\frac{\partial f^0}{\partial v}}{k\cdot v - \omega}\,dv \tag{1.7}$$

which implicitly determines the dispersion relation $\omega(k)$. With the choice $f^0(v) = h(v^2)$ this further simplifies, and for a static solution ($\omega = 0$), $k$ must satisfy the equation

$$1 = \frac{2\,\hat u(k)}{m} \int h'(v^2)\,dv. \tag{1.8}$$

In three dimensions, for a bounded and sufficiently fast decaying $h$ this becomes

$$1 = -2\pi\,\frac{\hat u(k)}{m} \int_0^\infty \frac{h(x)}{\sqrt{x}}\,dx. \tag{1.9}$$

Because $h \ge 0$, we see that a solution for $k$ is possible only if $\hat u$ has a negative part! Making the substitutions $f^0(v) = h(v^2)$ and $\omega = 0$ also in $g_k$, we obtain an approximate static solution of Eq. (1.1) in the form

$$f(r,v) = h(v^2) + h'(v^2) \sum_k c_k \cos k\cdot r, \tag{1.10}$$

with the sum running over the vectors $k$ which solve Eq. (1.9), and the real $c_k$ (incorporating the constant $(2/m)\,\hat u(k)\int g_k$) chosen so that $f \ge 0$. To obtain (1.10) we have supposed that $u(x) = u(-x)$, which implies $\hat u(k) = \hat u(-k)$ real and (1.9) holding simultaneously for $\pm k$. If $u$ is integrable, then $\hat u(k)$ is continuous and decays at infinity; thus, for any “natural” $u$ the sum in (1.10) is finite. Moreover, $\hat u(0) > 0$ for a superstable interaction, therefore $\hat u(k) \le 0$ is possible only if $|k| \ge k_0 > 0$. Let now $f^0$ be the Maxwell distribution, i.e.,

$$h(x) = \rho \left(\frac{\beta m}{2\pi}\right)^{3/2} e^{-\beta m x/2}, \tag{1.11}$$


where $\rho$ is the density and $\beta$ is the inverse temperature. Then $h'(x) = -(\beta m/2)\,h(x)$, and (1.8) becomes

$$-\beta\rho\,\hat u(k) = 1. \tag{1.12}$$

This equation has a solution if $\beta\rho \ge (\beta\rho)_c$, where

$$(\beta\rho)_c = -\frac{1}{\min\{\hat u(k)\}}. \tag{1.13}$$
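As a rough numerical illustration of Eq. (1.13), the sketch below evaluates the critical value for a one-dimensional square-core potential, $u(x) = 1$ for $|x| < 1$ and $0$ otherwise, whose Fourier transform is $2\sin k/k$. The choice of potential, the one-dimensional setting, the grid search and the function names are assumptions made only for this example; they are not taken from the paper.

```python
import math

def u_hat(k: float) -> float:
    """Fourier transform of the assumed 1D square core u(x) = 1 for |x| < 1."""
    return 2.0 if k == 0.0 else 2.0 * math.sin(k) / k

def critical_beta_rho(k_max: float = 20.0, steps: int = 200_000):
    """Grid search for the minimum of u_hat on (0, k_max].

    Returns (k_min, -1 / u_hat(k_min)), i.e. the minimizer and the
    critical value of beta * rho in the sense of Eq. (1.13).
    """
    best_k, best_val = 0.0, u_hat(0.0)
    for i in range(1, steps + 1):
        k = k_max * i / steps
        val = u_hat(k)
        if val < best_val:
            best_k, best_val = k, val
    if best_val >= 0.0:
        # Without a negative part, Eq. (1.12) has no solution at all.
        raise ValueError("u_hat has no negative part on the search interval")
    return best_k, -1.0 / best_val

k_min, beta_rho_c = critical_beta_rho()
print(f"minimizer k = {k_min:.4f}, critical beta*rho = {beta_rho_c:.4f}")
```

Below the printed threshold Eq. (1.12) has no solution for this example; at the threshold the only solutions are the wave numbers $\pm k$ at which the transform is minimal.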

If the minimum of $\hat u$ is nondegenerate, at the critical point the resulting distribution is spatially periodic. As $\beta\rho$ increases, the solutions for $k$ shift away from the minimizer of $\hat u$, and the solution of the linearized equation is, in general, almost periodic. With the Maxwell distribution (1.10) becomes

$$f(r,v) = f^0(v)\left[1 - \frac{\beta m}{2} \sum_k c_k \cos k\cdot r\right]. \tag{1.14}$$

In order to keep $f$ nonnegative, as $\beta$ increases, the coefficients $c_k$ must scale as $\beta^{-1}$.

The right condition for the formation of “coherent crystals” appeared only much later, in papers by Kirzhnits and Nepomnyashchii [8,9]. The term referred to (hypothetical) crystals of mobile particles as in Vlasov’s theory, capable of ballistic motion which must be coherent if it preserves long-range order. The context was somewhat different: these authors were interested in the crystallization of a quantum liquid. The starting point was a Hamiltonian the ground state of which was determined in the Hartree approximation. This analysis showed that the lowest-energy solution of the Hartree equation can be periodic only if the Fourier transform of the pair potential is partially negative, and the periodicity is then given by the wave vector at which the Fourier transform is minimal. These authors, just as Vlasov, were fully aware of the extraordinary properties of coherent crystals, even more pronounced in the quantum than in the classical case, and described them almost in the same terms as Likos et al. [4]. The mean-field or density-functional approach of Likos et al., the partial differential equation method of Vlasov, and the Hartree approximation of Kirzhnits and Nepomnyashchii are effective one-particle theories, and all arrive at the same conclusion. One can have little doubt in their truth. The problem is more difficult than the case treated in [1,2], and a new idea is necessary to make progress in the rigorous theory. The idea presented and exploited in this paper is that of an infinite-density ground state (IDGS).
Working in a finite domain (on a torus here), the N -particle ground states are arrangements of N points that minimize the interaction energy. To each N -point subset one can assign a discrete measure, and an IDGS is the weak limit of a sequence of discrete measures associated with N -particle ground states, when N goes to infinity. The ground state energy and energy per particle diverge in this limit but, if the interaction is integrable, the energy per pair is convergent. The limit is proportional to the best superstability constant, the largest number that can multiply N 2 /V in a lower bound on the energy of N particles in a cube of volume V . The two notions, that of an IDGS and of the best superstability constant, can be related via an energy functional written for normalized measures on the torus. It is shown that any IDGS is a minimizer of the energy functional, and the value of the minimum is the best superstability constant. The task is then to find these minimizers, because they can provide information on ground state configurations at high but finite densities. Most results on IDGS’s will be obtained by writing the energy



functional in Fourier representation; an exception is the penetrable sphere model presented at the end of the paper (for another example see Ref. [10]). Our main result is a confirmation of the conclusion of the earlier works [4,6,8,9]: if the Fourier transform of the pair potential has a negative part, the distribution of particles in high-density ground states is nonuniform. In some cases, e.g. when the potential diverges at the origin or has a cusp there, this nonuniform distribution cannot be approached through Bravais lattices. Except for the penetrable sphere model, from this analysis we cannot predict the ground state configurations at high densities with the precision obtained in Refs. [1,2]. The lack of precise information is true also for interactions with a strictly positive Fourier transform, but we can at least assert that, in contrast with the former, the asymptotic distribution of particles as the density increases tends to be uniform. Since there is a continued interest in the Gaussian core model [11–16], even this weak result may be of some value. It is to be emphasized that the limit of infinite particle density is not at all unusual in classical physics. Whenever a continuum theory is applied to describe a system of classical point-like particles, tacitly this limit is used. Theories of classical fluids, the rigorous van der Waals theory of the liquid-vapor phase transition [17–19], the continuum theories of droplet formation [20,21] and liquid-gas interfaces [22] belong to this family. These theories can be derived through scaling (or mean-field) limits in which the particle density tends to infinity (or equivalently, the range of the interactions tends to infinity) while the mass of the particles and the strength of the inter-particle interactions tend to zero in order to keep the mass and energy densities finite. 
In this work we do not have to scale the mass because it enters only the kinetic energy which vanishes in classical ground states (gravitating systems are out of the scope of this study), but we do scale the interaction: considering the energy per pair of particles corresponds to scaling the pair potential by dividing it with N as N tends to infinity in a fixed volume. The resulting functional is just the energy part of the usual free energy functionals. In its minimization we take into account also discrete measures. We will see that if the minimizer (the IDGS) is a discrete measure, it is directly related to finite-density ground state configurations in which particles are superimposed, cf. [4]; if it is a continuous measure then it approximates in a well-defined sense discrete measures associated with high- but finite-density ground state configurations. The paper is organized as follows. Section 2 fixes the conditions on the interaction and the basic notations, and contains the definition of the best superstability constant together with some preliminary results on it. Throughout the paper we deal with both bounded and unbounded interactions. The results will be more complete for bounded interactions, and the reason of this appears already in Sects. 2 and 3. For bounded interactions the self-energy is finite, and the energy per pair with and without the self-energy converges to the best superstability constant as N goes to infinity from above and from below, respectively. This permits to prove that the convergence is uniform in the volume—a fact that we do not know for unbounded interactions. In Sect. 3 we still show that the infinite-density limit of the free energy per pair is independent of the temperature and is the same as that of the ground state energy. In Sect. 
4 we define the infinite-density ground states as weak limits of Dirac combs associated with N -particle ground states, introduce the energy functional and prove two theorems, one for bounded and another for unbounded interactions. They show that IDGS’s minimize the energy functional, and the minimum is the best superstability constant. From the Fourier representation of the energy functional, introduced in Sect. 4.3, we derive some stability conditions in Sect. 5. Section 6 contains a complete description of the stationary points of the energy


functional. Their knowledge is important because there can be minimizers among them. The truly challenging problem is to find the ground state configurations (GSC) in infinite space. The definition, based on local stability, is recalled in Sect. 7. Here we prove two lemmas, the first relating periodic configurations and minimizers of the energy density, the second establishing the relation between periodic GSC’s in infinite space and GSC’s on tori. The analysis of the problem in infinite space is continued in Sect. 8. We introduce the notion of an IDGS in infinite space and formulate as a conjecture a commutative diagram relating four objects: GSC and IDGS in finite domains and in infinite space. The main result of this section is Proposition 8.2 which gives a new expression of the best superstability constant. Sections 9–12 present applications. In Sect. 9 we give a rather complete description of IDGS’s in the case when the Fourier transform of the interaction is nonnegative. Section 10 contains the main theorem proving the asymptotic uniformity or non-uniformity of GSC’s in infinite space for the case of a strictly positive or partially negative Fourier transform, respectively. In Sect. 11 we discuss two special classes of interactions with a partly negative Fourier transform. In the first one for any nonzero $k$ the sum of the Fourier transform over integer multiples of $k$ is nonnegative. In the second case the Fourier transform is ultimately positive and slowly decaying, so that the pair potential either diverges at zero or has a cusp there. In both cases we show that no Bravais lattice can be an IDGS, and high-density GSC’s are different from Bravais lattices. On the contrary, bounded pair potentials that are flat or nesting at the origin or have a dominantly negative Fourier transform at large wave vectors may prefer the accumulation of particles on the sites of a lattice which is independent of the density. In Sect. 12 we demonstrate this property on the so-called penetrable sphere model in which the interaction is a repulsive square core potential. The paper ends with an Appendix.

2. Best Superstability Constant

Consider the problem of the ground state of a system of classical identical particles confined in a fixed bounded domain $\Lambda$ of volume $V$ and interacting via a translation-invariant pair interaction $u(x-y) = u(y-x)$. Assume $u$ to be integrable, bounded outside the origin, strongly tempered, superstable and lower semicontinuous (see below). In the simplest case $\Lambda$ is a $d$-dimensional cube of side length $L$ taken with periodic boundary conditions, and $u$ is also made periodic. The usual way to achieve this is to replace $u(x)$ by

$$u_\Lambda(x) = \sum_{n\in\mathbb{Z}^d} u(x + Ln). \tag{2.1}$$

This series is absolutely convergent for any $L > 0$ and any $x$ (outside the set $L\mathbb{Z}^d$ when $u(0) = \infty$) if $u$ is strongly tempered. The general definition of strong temperedness is

$$|u(x)| < C\,|x|^{-d-\eta} \tag{2.2}$$

for $|x| > r_0$, with some $r_0, C, \eta > 0$. If $u$ has no hard core, $r_0 = 0$ can be taken. Condition (2.2) also guarantees that $u_\Lambda$ converges to $u$ pointwise as $L$ tends to infinity. A ground state of $N$ particles in $\Lambda$ is any $N$-point configuration $(r)_N = (r_1,\dots,r_N)$ minimizing the $N$-particle interaction energy

$$U(r)_N = \sum_{i<j} u_\Lambda(r_i - r_j). \tag{2.3}$$
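The periodization (2.1) and the energy (2.3) can be sketched numerically as follows. This is only a minimal illustration under stated assumptions: dimension $d = 1$, a Gaussian core $u(x) = e^{-x^2}$, a truncated lattice sum, and hypothetical function names; none of these choices comes from the paper.

```python
import math

def u(x: float) -> float:
    """Assumed example potential: Gaussian core u(x) = exp(-x^2)."""
    return math.exp(-x * x)

def u_periodic(x: float, L: float, n_max: int = 5) -> float:
    """Truncation of Eq. (2.1) in d = 1: sum of u(x + L n) over |n| <= n_max.

    For the Gaussian and L = 4 the neglected terms are far below
    double-precision resolution when x lies within a few periods of 0.
    """
    return sum(u(x + L * n) for n in range(-n_max, n_max + 1))

def energy(points, L: float) -> float:
    """Eq. (2.3): sum over pairs i < j of u_periodic(r_i - r_j)."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += u_periodic(points[i] - points[j], L)
    return total

L = 4.0
config = [0.3, 1.1, 2.9, 3.6]
shifted = [(r + 1.7) % L for r in config]   # the same configuration translated on the torus
print(energy(config, L), energy(shifted, L))
```

The two printed energies agree up to rounding, which reflects the shift-invariance on the torus used in the text: any translate of a ground state is again a ground state.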



The minimum energy will be denoted by $E_0(N)$. Because of the shift-invariance of $u$, any translate of a ground state is a ground state. To make sure that (2.3) can be minimized for all $N$, we will ask $u$ to be lower semicontinuous [23]. A way to understand the behavior of soft matter at high density is to study the limit of infinite density. Since the particles have no hard core, this limit can be given a meaningful definition. The limit $\rho = N/V \to \infty$ will be realized by letting $N$ diverge in a fixed volume. To characterize a ground state, a suitable scaling of $U$ is necessary. If $u$ is bounded then

$$\Omega(r)_N = \frac{V}{N(N-1)}\,U(r)_N \tag{2.4}$$

remains bounded in this limit whatever the sequence of $N$-point configurations is. Moreover,

$$\Omega_N = \min_{(r)_N\subset\Lambda} \Omega(r)_N = \frac{V\,E_0(N)}{N(N-1)} \tag{2.5}$$

is bounded as $N$ goes to infinity even if $u$ is unbounded but is integrable and bounded from below. Indeed, divide $\Lambda$ into $N$ cubes of volume $V/N$ and choose $(r)_N$ to be the centers of the cubes. Then the second sum in

$$U(r)_N = \frac{N}{2V} \sum_i \frac{V}{N} \sum_{j\neq i} u_\Lambda(r_i - r_j) \tag{2.6}$$

is a Riemann-sum, therefore

$$\Omega(r)_N = \frac{1}{2}\int u(x)\,dx + o(1) \quad (N\to\infty) \tag{2.7}$$

is an upper bound to $\Omega_N$. We conclude that the right quantity to look at is $\Omega(r)_N$ that, with a slight abuse, we call the pair energy. The search for the ground state in the limit of $N$ going to infinity is closely related to finding the best superstability constant for $u$. The notion of superstability was introduced by Ruelle [7,24]. Stability means the existence of a constant $B$ such that for any $N$ the ground state energy is bounded below by $-BN$; superstability means that the ground state energy density (energy per volume) increases with the particle density at least quadratically. If $u$ is integrable and superstable, to leading order in the density the increase is quadratic.

Definition 2.1. The best superstability constant for a superstable interaction $u$ is the supremum of the positive numbers $C$ such that for any large enough cube $\Lambda$ of volume $V$,

$$U(r)_N \ge C\,N^2/V \tag{2.8}$$

holds for every $N$ above a possibly $\Lambda$-dependent value and every $(r)_N \subset \Lambda$.

Lemma 2.1. The best superstability constant for $u$ is

$$C[u] = \liminf_{V\to\infty} C_\Lambda[u], \tag{2.9}$$

where the limit is taken over cubes of increasing volume $V$ and

$$C_\Lambda[u] = \liminf_{N\to\infty} \Omega_N. \tag{2.10}$$


Proof. Take any $C < C[u]$. From the definition of liminf it follows that $C_\Lambda[u] > C$ for $\Lambda$ large enough, say, $\Lambda \supset \Lambda_C$. It also follows that there exists some $N_\Lambda$ such that

$$\frac{N-1}{N}\,\Omega_N > C \quad\text{if } \Lambda\supset\Lambda_C \text{ and } N > N_\Lambda \tag{2.11}$$

which is (2.8). On the other hand, if $C > C[u]$ then there exist arbitrarily large domains $\Lambda$ such that $C > C_\Lambda[u]$, and an infinite sequence $N_j$ of positive integers (which may depend on $\Lambda$) such that

$$\lim_{j\to\infty} \Omega_{N_j} = C_\Lambda[u]. \tag{2.12}$$

Thus,

$$\Omega_{N_j} < C \tag{2.13}$$

for $j$ large enough, which is the negation of (2.8).

Remarks. (i) In the definition of superstability [7] there is an additional term $-BN$ on the right-hand side of (2.8). We can add such a term with any $B \ge 0$ without violating the inequality; it even allows for the extension of the inequality to every $N$, cf. Eq. (2.31) below. Obviously, no choice of $B$ makes it possible to increase the coefficient of $N^2$ and attain a superstability constant larger than $C[u]$ defined by (2.9). (ii) Later on, $\Lambda$ will be a general parallelepiped. The periodized interaction can be defined, and the analogue of Lemma 2.1 can be proven in the following form. Let $\Lambda_0$ be any nondegenerate parallelepiped, and for $s > 0$ consider $s\Lambda_0 = \{sx : x\in\Lambda_0\}$. Then the best superstability constant is $\liminf_{s\to\infty} C_{s\Lambda_0}[u]$. The limit actually exists,

$$C[u] = \lim_{\Lambda\to\infty} C_\Lambda[u] \tag{2.14}$$

if $\Lambda$ tends to infinity in the Fisher sense [7], and is the same as that one obtains with the use of $u$ instead of the periodized $u_\Lambda$. Based on the strong temperedness of the interaction, a proof in analogy with the proof of existence of the thermodynamic limit of the free energy [7] could be done, but in this paper (2.14) is considered as a hypothesis.

An integrable interaction $u$ has a bounded continuous Fourier transform decaying at infinity [23]. We will denote it by $v$. Thus,

$$v(k) = \int u(r)\,e^{-ik\cdot r}\,dr. \tag{2.15}$$

If $\sum_{k\in\Lambda^*} |v(k)| < \infty$ then $\sum_{k\in\Lambda^*} v(k)\,e^{ik\cdot r}$ is a continuous function and

$$u_\Lambda(r) = \frac{1}{V} \sum_{k\in\Lambda^*} v(k)\,e^{ik\cdot r} \tag{2.16}$$

almost everywhere; if $u$ is continuous, equality holds everywhere. Here $\Lambda^* = (2\pi/L)\mathbb{Z}^d$ if $\Lambda$ is a cube of side length $L$. Superstability implies that $v(0) > 0$ and also $u(0) > 0$ and $u_\Lambda(0) > 0$.

Proposition 2.1. $C_\Lambda[u] \le v(0)/2$ and thus $C[u] \le v(0)/2$.



Proof. This follows from Eq. (2.7). An alternative proof is obtained by computing the average of the potential energy,

$$\frac{1}{V^N}\int_{\Lambda^N} U(r)_N\,d(r)_N = \frac{v(0)\,N(N-1)}{2V}. \tag{2.17}$$

The minimum is smaller than the average,

$$\frac{N(N-1)}{V}\,\Omega_N = E_0(N) \le \frac{v(0)\,N(N-1)}{2V}, \tag{2.18}$$

and hence

$$C_\Lambda[u] \le v(0)/2. \tag{2.19}$$
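The Riemann-sum argument behind Eq. (2.7) and the bound of Proposition 2.1 can be checked numerically in a simple case. The sketch below assumes $d = 1$, a torus of length $L = 4$ and the Gaussian core $u(x) = e^{-x^2}$, for which $v(0) = \sqrt{\pi}$; these choices and the function names are illustrative assumptions, not part of the paper.

```python
import math

def u_periodic(x: float, L: float, n_max: int = 5) -> float:
    """Truncated periodization (2.1) of the assumed Gaussian core exp(-x^2)."""
    return sum(math.exp(-(x + L * n) ** 2) for n in range(-n_max, n_max + 1))

def pair_energy_equally_spaced(N: int, L: float) -> float:
    """V / (N (N - 1)) * U for N equally spaced points on a 1D torus (V = L).

    Every particle sees the same environment, so the pair sum is
    U = (N / 2) * sum_{j=1}^{N-1} u_periodic(j L / N).
    """
    row = sum(u_periodic(j * L / N, L) for j in range(1, N))
    return L / (N * (N - 1)) * (N / 2.0) * row

L = 4.0
half_v0 = math.sqrt(math.pi) / 2.0   # v(0)/2 = (1/2) * integral of exp(-x^2) over the line
for N in (10, 100, 1000):
    print(N, pair_energy_equally_spaced(N, L), half_v0)
```

Since the equally spaced configuration is only one candidate, its pair energy bounds the minimal pair energy from above, and the printed values approach $v(0)/2$ from below as $N$ grows, in line with (2.7) and (2.19).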

The following simple observation helps to see the possibility of superposition of particles in some ground state configurations. Given $(r)_M$, for any integer $n \ge 1$ we define the $nM$-point configuration $(r)^n_M$ by $n$ times repeating $(r)_M$,

$$r_{mM+j} = r_j, \qquad m = 0,\dots,n-1, \qquad j = 1,\dots,M. \tag{2.20}$$

Then, for any bounded complex-valued function $f$ on $\mathbb{R}^d$,

$$\frac{1}{(nM)^2} \sum_{i,j=1}^{nM} f(r_i - r_j) = \frac{1}{M^2} \sum_{i,j=1}^{M} f(r_i - r_j). \tag{2.21}$$

When $u$ is bounded, we apply this identity to $f = V u_\Lambda/2$. Introducing

$$\omega(r)_N = \frac{V}{2N^2} \sum_{i,j=1}^{N} u_\Lambda(r_i - r_j) = \frac{N-1}{N}\,\Omega(r)_N + \frac{V u_\Lambda(0)}{2N} \tag{2.22}$$

and

$$\omega_N = \min_{(r)_N\subset\Lambda} \omega(r)_N = \frac{N-1}{N}\,\Omega_N + \frac{V u_\Lambda(0)}{2N}, \tag{2.23}$$

and recalling that $m|n$ for integers $m, n$ denotes that $m$ is a divisor of $n$, we obtain the following.

Lemma 2.2.

$$\omega(r)^n_M = \omega(r)_M \tag{2.24}$$

and, hence,

$$\omega_{N_1} \ge \omega_{N_2} \ge \omega_{N_3} \ge \cdots \quad\text{if } N_1|N_2|N_3|\cdots. \tag{2.25}$$

Proof. Equation (2.24) repeats (2.21). Let $(r)_{N_j}$ be an $N_j$-particle ground state. Then

$$\omega_{N_{j+1}} \le \omega(r)^{N_{j+1}/N_j}_{N_j} = \omega(r)_{N_j} = \omega_{N_j}. \tag{2.26}$$
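The identity (2.24) and the two expressions in (2.22) are easy to verify numerically. The following sketch does so for an arbitrary three-point configuration on a one-dimensional torus, again assuming a Gaussian core and using hypothetical helper names; it is an illustration, not the paper's construction.

```python
import math

def u_periodic(x: float, L: float, n_max: int = 5) -> float:
    """Truncated periodization (2.1) of the assumed Gaussian core exp(-x^2)."""
    return sum(math.exp(-(x + L * n) ** 2) for n in range(-n_max, n_max + 1))

def pair_energy(points, L: float) -> float:
    """Eq. (2.4) in d = 1 (V = L): V / (N (N - 1)) times the sum over pairs i < j."""
    N = len(points)
    total = sum(u_periodic(points[i] - points[j], L)
                for i in range(N) for j in range(i + 1, N))
    return L * total / (N * (N - 1))

def omega(points, L: float) -> float:
    """Eq. (2.22): V / (2 N^2) times the sum over all i, j, including i = j."""
    N = len(points)
    total = sum(u_periodic(ri - rj, L) for ri in points for rj in points)
    return L * total / (2.0 * N * N)

L = 4.0
config = [0.3, 1.1, 2.9]      # M = 3 points
repeated = config * 3         # the configuration repeated n = 3 times, cf. Eq. (2.20)
print(omega(config, L), omega(repeated, L))
```

The self-energy terms $i = j$ are exactly what makes the normalized double sum insensitive to repetition; the pair energy without them does change when particles are superimposed.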


Proposition 2.2. If $u$ is bounded,

$$C_\Lambda[u] = \inf_N \omega_N. \tag{2.27}$$

Proof.

$$\omega_N - \Omega_N = \frac{1}{N}\left(\frac{V u_\Lambda(0)}{2} - \Omega_N\right) \tag{2.28}$$

goes to zero as $N$ increases, so the statement is

$$\liminf_{N\to\infty} \omega_N = \inf_N \omega_N. \tag{2.29}$$

Now $\inf_N \omega_N \le \liminf_{N\to\infty} \omega_N$; supposing a strict inequality, because of (2.25) there would be a sequence $N_j$ tending to infinity such that

$$\omega_{N_j} \le \omega_{N_1} < \liminf_{N\to\infty} \omega_N, \tag{2.30}$$

contradicting the definition of liminf.

A direct consequence of Eq. (2.27) is that

$$U(r)_N \ge -\frac{u_\Lambda(0)}{2}\,N + C_\Lambda[u]\,\frac{N^2}{V} \tag{2.31}$$

for all $N$. Another consequence is as follows.

Corollary 2.1. If $\omega(r)_M = C_\Lambda[u]$, then $(r)^n_M$ is an $nM$-particle ground state for all $n \ge 1$.

The simplest realization of this situation is provided by the penetrable sphere model and is presented in Sect. 12. For a less trivial example see Ref. [10].

3. Infinite-Density Limit of the Ground-State Energy and the Free Energy per Pair

Proposition 3.1. The sequence $\Omega_N$ is convergent.

Proof. We give two different proofs. (i) The first proof works for $u$ bounded. In this case $\omega_N - \Omega_N$ tends to zero with increasing $N$, therefore the convergence of $\Omega_N$ is equivalent to

$$\lim_{N\to\infty} \omega_N = \inf_N \omega_N = C_\Lambda[u]. \tag{3.1}$$

Now $C_\Lambda[u]$ is the smallest accumulation point of the sequence $\omega_N$, and suppose there is another one, $\omega_0$. Let $\delta = \frac{1}{3}(\omega_0 - C_\Lambda[u])$ and define two subsequences, $K_m$ and $L_n$ via the inequalities $\omega_{K_m} \le C_\Lambda[u] + \delta$ and $\omega_{L_n} \ge \omega_0 - \delta$; thus, $\omega_{L_n} - \omega_{K_m} \ge \delta$ for all $m, n$. Both subsequences are infinite and, because of Eq. (2.25), $\ell K_1 \in \{K_m\}$ for all integers $\ell \ge 1$. Therefore, for any $n$ there is an $m$ such that $0 < L_n - K_m < K_1$ and, thus,

$$E_0(L_n) \le E_0(K_m) + \frac{1}{2}\,K_1 (K_1 - 1 + K_m)\,\|u_\Lambda\|_\infty. \tag{3.2}$$



The right member of this inequality is an upper bound to the energy of an $L_n$-particle configuration that one obtains from a $K_m$-particle ground state by adding $L_n - K_m$ particles. Dividing by $L_n^2/V$ we find

$$\omega_{L_n} - \omega_{K_m} = O(L_n^{-1}), \tag{3.3}$$

contradicting $\omega_{L_n} - \omega_{K_m} \ge \delta$. (ii) The second proof works even if $u$ is unbounded. It is based on the monotonic increase of the ground state energy per pair,

$$\frac{E_0(N+1)}{N(N+1)} \ge \frac{E_0(N)}{N(N-1)}; \tag{3.4}$$

see also Kiessling [25]. If $u$ is integrable then $\Omega_N = V E_0(N)/N(N-1)$ is bounded above by $v(0)/2$, cf. Eq. (2.18), so its limit as $N$ goes to infinity exists,

$$\lim_{N\to\infty} \Omega_N = \sup_N \Omega_N = C_\Lambda[u]. \tag{3.5}$$

As to Eq. (3.4), let $N \ge 2$ and $R$ be any sequence of $N+1$ points of $\Lambda$ (repetition allowed). Then

$$U(R) = \sum_{(x,y)\subset R} u_\Lambda(x-y) = \frac{1}{N-1} \sum_{S\subset R,\,|S|=N}\ \sum_{(x,y)\subset S} u_\Lambda(x-y) = \frac{1}{N-1} \sum_{S\subset R,\,|S|=N} U(S), \tag{3.6}$$

because for any two-point subsequence $(x,y) \subset R$ we can choose $N-1$ different $N$-point subsequences $S \subset R$ containing $(x,y)$. Since the sum over $S$ has $N+1$ terms,

$$\min_{R\subset\Lambda,\,|R|=N+1} U(R) \ge \frac{N+1}{N-1}\ \min_{S\subset\Lambda,\,|S|=N} U(S), \tag{3.7}$$

which is just (3.4).
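The counting identity (3.6) can be verified directly by enumerating all $N$-point subsequences of a small configuration. The sketch below is a minimal check under illustrative assumptions (one dimension, Gaussian core, hypothetical function names) and uses `itertools.combinations` from the Python standard library.

```python
import itertools
import math

def u_periodic(x: float, L: float, n_max: int = 5) -> float:
    """Truncated periodization (2.1) of the assumed Gaussian core exp(-x^2)."""
    return sum(math.exp(-(x + L * n) ** 2) for n in range(-n_max, n_max + 1))

def energy(points, L: float) -> float:
    """Eq. (2.3): sum over unordered pairs of u_periodic(x - y)."""
    return sum(u_periodic(x - y, L) for x, y in itertools.combinations(points, 2))

L = 4.0
R = [0.2, 0.9, 1.5, 2.8, 3.3]          # N + 1 = 5 points
N = len(R) - 1

lhs = energy(R, L)
subset_energies = [energy(S, L) for S in itertools.combinations(R, N)]
rhs = sum(subset_energies) / (N - 1)   # each pair lies in exactly N - 1 of the subsets
print(lhs, rhs)
```

Because the right-hand side is $(N+1)/(N-1)$ times the average of the $N+1$ subset energies, replacing the average by the smallest subset energy gives the inequality used for (3.7).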

We have nowhere used the periodicity of the interaction, therefore Proposition 3.1 applies also to $u$ instead of $u_\Lambda$. It holds true also for unstable interactions, when $C_\Lambda[u] < 0$. If $u$ is bounded then Propositions 2.2 and 3.1 yield

$$\sup_N \Omega_N = C_\Lambda[u] = \inf_N \omega_N. \tag{3.8}$$

Corollary 3.1. If $u$ is bounded then the convergence of $\Omega_{\rho V}$ and $\omega_{\rho V}$ to $C_\Lambda[u]$ as $\rho$ tends to infinity is uniform in $V$.

Proof.

$$0 \le \max\{\omega_{\rho V} - C_\Lambda[u],\ C_\Lambda[u] - \Omega_{\rho V}\} \le \omega_{\rho V} - \Omega_{\rho V} \le \frac{u_\Lambda(0)}{2\rho} \le \frac{u(0)}{\rho} \tag{3.9}$$

if $\rho$ and $V$ are large enough. Here we used Eq. (2.28) and superstability for the third and strong temperedness for the fourth inequality.


The uniform convergence of $\Omega_{\rho V}$ to $C_\Lambda[u]$ probably holds true also if $u$ diverges at the origin. If $\Lambda$ is a cube of side $L$ and $R$ is any $N$-point configuration then there is an $x \in \Lambda$ such that $\mathrm{dist}(x,R) \ge L/(2N^{1/d})$. Using this fact, for a radial $u$ one can easily prove that

$$\Omega_N \le \Omega_{N+1} \le \Omega_N + c\,\rho^{-1}\,u\!\left(\frac{1}{2\rho^{1/d}}\right) \tag{3.10}$$

if $\Lambda$ is large enough. Here $\rho = N/V$ and $c = 1$ if $u(x)$ is a monotonic function of $|x|$ close to zero. The term $\rho^{-1}u(\rho^{-1/d}/2)$ is the analogue of $\rho^{-1}u(0)$. Because $u$ is integrable, it tends to zero as $\rho$ goes to infinity. However, the right member of (3.10) may not be an upper bound to $C_\Lambda[u]$ and, therefore, it cannot play the role of $\omega_N$ in the bounded case.

It is natural to ask how the free energy per pair behaves as the density tends to infinity. Let

$$Z_{\Lambda,N} = \frac{1}{V^N}\int_{\Lambda^N} e^{-\beta U(r)_N}\,d(r)_N \equiv e^{-\beta F_N(\beta)}. \tag{3.11}$$

For the free energy per pair, $f_N(\beta)$, we have

$$\Omega_N \le f_N(\beta) \equiv \frac{V F_N(\beta)}{N(N-1)} \le \frac{v(0)}{2}, \tag{3.12}$$

where the upper bound results from Jensen’s inequality. We are interested in the limit of $f_N(\beta)$ as $\beta$ or $N$ goes to infinity. The usual definition of the partition function is $V^N/N!$ times the expression (3.11). However, computing $-(V/\beta N(N-1)) \ln Z_{\Lambda,N}$ with the modified definition or with (3.11) yields the same result for both limits. At positive temperatures, as $N$ goes to infinity, $f_N(\beta) - \Omega_N$ remains positive only if the entropy is negative and of order $N^2$. This means that the level set $\{(r)_N \in \Lambda^N : U(r)_N \le E_0(N) + N^2/\beta\}$ should be of Lebesgue measure $\sim e^{-cN^2}$ with some $c > 0$. Under some natural conditions on the interaction we shall find the opposite result, an infinite-density limit of the free energy per pair that agrees with the limit of $\Omega_N$, i.e., $C_\Lambda[u]$.
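For $N = 2$ the two bounds in (3.12) can be checked by one-dimensional quadrature, because the double integral in (3.11) then reduces to a single integral over the relative coordinate. The sketch below assumes $d = 1$, $L = 4$ and a Gaussian core, with hypothetical function names; it only illustrates the inequalities and says nothing about the large-$N$ limit treated in the theorem that follows.

```python
import math

def u_periodic(x: float, L: float, n_max: int = 5) -> float:
    """Truncated periodization (2.1) of the assumed Gaussian core exp(-x^2)."""
    return sum(math.exp(-(x + L * n) ** 2) for n in range(-n_max, n_max + 1))

def free_energy_per_pair_n2(beta: float, L: float, grid: int = 4000) -> float:
    """Free energy per pair of Eq. (3.12) for N = 2 on a 1D torus (V = L).

    With N = 2 the partition function (3.11) equals the torus average of
    exp(-beta * u_periodic(x)); it is evaluated here by the midpoint rule.
    """
    h = L / grid
    z = sum(math.exp(-beta * u_periodic((i + 0.5) * h, L)) for i in range(grid)) * h / L
    big_f = -math.log(z) / beta      # F_N(beta) of Eq. (3.11)
    return L * big_f / 2.0           # V F / (N (N - 1)) with N = 2

L = 4.0
lower = L * u_periodic(L / 2.0, L) / 2.0   # pair energy of two particles half a period apart
upper = math.sqrt(math.pi) / 2.0           # v(0)/2 for the Gaussian core
results = {beta: free_energy_per_pair_n2(beta, L) for beta in (0.1, 1.0, 10.0, 100.0)}
for beta, value in results.items():
    print(beta, lower, value, upper)
```

In this example the computed values move from near the Jensen bound at small $\beta$ toward the ground state pair energy at large $\beta$, which is the behavior expressed by the first limit in the theorem below.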

Theorem 3.1. Let $u$ satisfy one of the following conditions: (i) $u$ is bounded and

$$|u(x) - u(y)| \le \big(|u(x)| + |u(y)|\big) \left(\frac{|x-y|}{r_0}\right)^{\alpha} \tag{3.13}$$

with some $r_0 > 0$ and $0 < \alpha \le 1$. (ii) There exist constants $u_0 > 0$, $r_0 > 0$, $c > 0$, $0 < \zeta < d$ and $0 < \alpha \le 1$ such that

$$u(x) \ge u_0 \left(\frac{r_0}{|x|}\right)^{d-\zeta} \quad\text{if } |x| \le r_0 \tag{3.14}$$

and

$$|u(x) - u(y)| \le c\,\big(|u(x)| + |u(y)|\big) \left(\frac{|x-y|}{\min\{|x|,|y|,r_0\}}\right)^{\alpha}. \tag{3.15}$$


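Then

$$\lim_{\beta\to\infty} f_N(\beta) = \Omega_N, \qquad \lim_{N\to\infty} f_N(\beta) = \lim_{N\to\infty} \Omega_N = C_\Lambda[u] \quad (\text{all } \beta > 0). \tag{3.16}$$

Proof. (i) Using (3.13),

$$|u_\Lambda(x) - u_\Lambda(y)| \le 2\,\|u\| \left(\frac{|x-y|}{r_0}\right)^{\alpha}, \tag{3.17}$$

where

$$\|u\| = \sup_x \sum_{n\in\mathbb{Z}^d} |u(x + Ln)|. \tag{3.18}$$

Let $(r)_N$ be a ground state configuration [hence, $U(r)_N = E_0(N)$] and $(r+\delta r)_N = (r_1+\delta r_1,\dots,r_N+\delta r_N)$ an arbitrary perturbation of it. Then

$$U(r+\delta r)_N - E_0(N) \le \|u\| \left(\frac{2\max_i |\delta r_i|}{r_0}\right)^{\alpha} N^2 \le \beta^{-1} \tag{3.19}$$

if for all $i$

$$|\delta r_i| \le \frac{r_0}{2} \left(\frac{1}{\beta\,\|u\|\,N^2}\right)^{1/\alpha} \equiv \frac{r_0}{2}\,\varepsilon_{\beta,N}. \tag{3.20}$$

Using Eqs. (3.11), (3.12), (3.19) and (3.20),

$$1 \ge e^{-\beta N(N-1)[f_N(\beta) - \Omega_N]/V} \ge e^{-1} \left(\frac{c_d\,(r_0\,\varepsilon_{\beta,N}/2)^d}{V}\right)^{N}, \tag{3.21}$$

where $c_d$ is a dimension-dependent constant. Taking the logarithm, dividing by $\beta N(N-1)/V$ and letting either $\beta$ or $N$ go to infinity we find Eq. (3.16). (ii) Let $(r)_N$ be a ground state configuration and let $r_{ij} = r_i - r_j$. Because of Eq. (2.18) and the lower boundedness of $u$, $u(r_{ij} + Ln) \le u_1 N^2$ holds with some $u_1 > 0$ for any $n \in \mathbb{Z}^d$. By decreasing, if necessary, $r_0$, (3.14) is valid also with $u_1$ replacing $u_0$ and implies

$$|r_{ij} + Ln| \ge r_0\,N^{-\frac{2}{d-\zeta}}. \tag{3.22}$$

Let $x_n = r_{ij} + \delta r_{ij} + Ln$ and $y_n = r_{ij} + Ln$, where $\delta r_{ij} = \delta r_i - \delta r_j$, and choose

$$|\delta r_i| \le \frac{r_0}{4} \left(\frac{1}{u_2\,\beta\,N^{4+\frac{2}{d-\zeta}}}\right)^{1/\alpha} \tag{3.23}$$

for every $i$, where $u_2 = 3cu_1$. If $u_2 \beta N^4 \ge 1$, this inequality implies

$$|\delta r_{ij}| \le \frac{r_0}{2}\,N^{-\frac{2}{d-\zeta}}. \tag{3.24}$$

Then

$$\min\{|x_n|, |y_n|, r_0\} \ge \frac{r_0}{2}\,N^{-\frac{2}{d-\zeta}} \tag{3.25}$$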


and by (3.15),

$$|u(x_n) - u(y_n)| \le c\,\big(|u(x_n)| + |u(y_n)|\big) \left(\frac{2\,|\delta r_{ij}|}{r_0\,N^{-\frac{2}{d-\zeta}}}\right)^{\alpha}. \tag{3.26}$$

When passing to $u_\Lambda$ we have to estimate $\sum_n |u(x_n)|$ and $\sum_n |u(y_n)|$. It is easily seen that only a single term can be large, the sum of the rest is of order 1 as $N$ goes to infinity. The possible large term is bounded by $u_1 N^2$. Summation over $i, j$ brings in another factor $N^2$, and for $N$ or $\beta$ large enough we end up with a generous upper bound

$$U(r+\delta r)_N - E_0(N) \le u_2 \left(\frac{4\max_i |\delta r_i|}{r_0}\right)^{\alpha} N^{4+\frac{2}{d-\zeta}} \le \beta^{-1}. \tag{3.27}$$

The rest of the proof is as in the bounded case.

Remarks. (i) For the validity of (3.16) the continuity of $u$ was essential. We shall see that for the penetrable sphere model at least the first of Eqs. (3.16) fails. Condition (3.13) is a sort of strong Hölder-continuity (because $u(x)$ decays as $|x|$ grows) which guarantees the ordinary Hölder-continuity of $u_\Lambda$. In the case of a free boundary condition ordinary Hölder-continuity would suffice. (ii) There is a gap between interactions that are bounded or diverge algebraically at the origin. Interactions with a logarithmic divergence, occurring in some cases between colloidal particles [26], allow particles to be much closer to each other and, thus, the ground state configurations to be strongly inhomogeneous. One cannot exclude that instead of (3.20) one should impose $|\delta r_{ij}| = O(e^{-cN})$. Then, one could still prove the first of Eqs. (3.16) but not the second one, because of an entropy $\propto -N^2$ at positive temperatures.

4. Infinite-Density Ground State and the Energy Functional in Finite Volume

4.1. Bounded interactions. In this section we show that it is possible to obtain $C_\Lambda[u]$ and a good approximation of high-density ground states via the minimization of an energy functional. For bounded interactions we can write $\omega(r)_N$ in the form

$$\omega(r)_N = \frac{1}{2V}\int_{\Lambda^2} u_\Lambda(x-y)\,d\mu_{(r)_N}(x)\,d\mu_{(r)_N}(y), \tag{4.1}$$

where

$$\mu_{(r)_N} = \frac{V}{N} \sum_{j=1}^{N} \delta_{r_j} \tag{4.2}$$

is a measure on $\Lambda$ of total weight $V$ (henceforth, a normalized measure); $\delta_x$ is the Dirac delta at $x$. We call $\mu_{(r)_N}$ the measure associated with $(r)_N$. The discrete measures of the form (4.2), that is, sums of Dirac deltas with equal weights, sometimes called Dirac combs, are dense among the normalized Borel measures relative to the weak topology, cf. Lemma A.1. Therefore, when $N$ goes to infinity and $R_N$ is an $N$-particle configuration, the sequence $\mu_{R_N}$ of associated measures can converge weakly to any normalized measure. We shall use this property to introduce the notion of an infinite-density ground state. Let $M$ denote the set of normalized Borel measures on $\Lambda$. A sequence $\mu_n \in M$



converges vaguely to a μ ∈ M if for every real continuous function f of compact support
\[ \lim_{n\to\infty} \mu_n(f) = \mu(f), \tag{4.3} \]
where $\mu(f) := \int f\, d\mu$. The convergence is in the weak sense if (4.3) holds for every bounded continuous f. Because Λ is compact, every continuous function is bounded and of compact support, so the two types of convergence are the same. Henceforth $\mu_n \rightharpoonup \mu$ will denote that μ is the vague limit of the sequence $\mu_n$. For the reader's convenience, in the Appendix we show that any infinite sequence $\mu_n \in M$ has a weakly convergent subsequence (i.e., M is compact in the weak topology), and any μ ∈ M is the weak limit of a sequence of discrete measures of the form (4.2).

Definition 4.1. μ ∈ M is an infinite-density ground state (IDGS) of u if there is a sequence $\mu_{R_m}$ of measures associated with configurations $R_m$ such that $|R_m|$ goes to infinity with m and $\mu_{R_m} \rightharpoonup \mu$,

lim (Rm ) = C [u].

m→∞

(4.4)

Note that $R_m$ may not be a sequence of ground state configurations.

Proposition 4.1. The set of IDGS's is nonempty.

Proof. Given any sequence $\{R_N\}_{N=1}^{\infty}$ of N-particle ground states, (R N ) = N tends to $C_\Lambda[u]$, cf. Eq. (3.5). Moreover, the sequence $\mu_{R_N}$ of associated measures has at least one weak limit point which is, therefore, an IDGS.

Motivated by Eq. (4.1), we define an energy functional on M by the equation
\[ I[\mu] = \frac{1}{2V} \int_{\Lambda^2} u_\Lambda(x-y)\, d\mu(x)\, d\mu(y) = \frac{1}{2V}\, \mu * \tilde\mu(u_\Lambda). \tag{4.5} \]
Here we use the notations of Ref. [27]: for a complex function f, $\tilde f(x) := \overline{f(-x)}$ and $\tilde\mu(f) := \overline{\mu(\tilde f)}$. The measure $V^{-1} \mu * \tilde\mu$ is the autocorrelation of μ. It is defined on Λ − Λ (= Λ in the case of periodic boundary conditions when Λ is a torus) by
\[ \mu * \tilde\mu(f) = \int_{\Lambda - \Lambda} f(z)\, d(\mu * \tilde\mu)(z) := \int_{\Lambda^2} f(x-y)\, d\mu(x)\, d\mu(y), \tag{4.6} \]
where f is any continuous periodic function of period cell Λ. Thus,
\[ \mu * \tilde\mu = \int_{\Lambda^2} \delta_{x-y}\, d\mu(x)\, d\mu(y) = \int_{\Lambda^2} \delta_{x+y}\, d\mu(x)\, d\tilde\mu(y). \tag{4.7} \]
For general results about the autocorrelation and its Fourier transform see [27,28]. Equation (4.1) can be rewritten as $\omega(r)_N = I[\mu_{(r)_N}]$. I[μ] is meaningful also for unbounded interactions, but it can be finite only if μ is a continuous measure. For example, for the Lebesgue measure λ, I[λ] = v(0)/2. In I[μ] one can recognize the interaction energy common to all mean-field theories, either quantum, as the Hartree approximation [8], or classical, as the mean-field theory of fluids [4,12]. The difference in our use of it is that,


while in mean-field theories μ is tacitly supposed to be absolutely continuous, here no a priori assumption is made on it. Let
\[ I^0 = \inf_{\mu \in M} I[\mu]. \tag{4.8} \]
For bounded interactions
\[ I^0 \le \inf_{N,\,(r)_N} I[\mu_{(r)_N}] = \inf_{N} \omega_N = C_\Lambda[u]. \tag{4.9} \]

Before proving the opposite inequality in Theorem 4.1, let us note that the correspondence between a finite configuration $(r)_N$ and the associated measure (4.2) is not one-to-one. The configuration $(r)^n_M$ defined by Eq. (2.20) is different from $(r)_M$ whose n-times repetition gives rise to it. This is in contrast with
\[ \mu_{(r)^n_M} = \mu_{(r)_M}, \tag{4.10} \]
implied by Eq. (2.24). Thus, any discrete measure is associated with an infinity of finite configurations. This observation leads to the counterpart of Corollary 2.1.

Proposition 4.2. Suppose that $\mu_{(r)_M}$ is a minimizer of the energy functional. Then for any positive integer n, $(r)^n_M$ is an nM-particle ground state of $u_\Lambda$.

Proof. Because of the identities (2.24) and (4.10),
\[ \omega(r)^n_M = I[\mu_{(r)_M}] = I^0 \le I[\mu_{(x)_{nM}}] = \omega(x)_{nM} \tag{4.11} \]
and, hence,
\[ U(r)^n_M \le U(x)_{nM} \tag{4.12} \]
for any nM-particle configuration $(x)_{nM}$.
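As a numerical illustration of Eqs. (4.1), (4.2) and (4.10), the following minimal sketch evaluates the energy functional on Dirac combs in one dimension. The Gaussian pair potential u(x) = exp(−x²), the torus length L = 4, the truncation of the periodization to finitely many images, and all function names are assumptions made for this example only; none of them comes from the text.

```python
import math

def u_periodic(x, L, images=20):
    """Periodized pair potential: sum over n of u(x + n*L), with the
    illustrative choice u(x) = exp(-x**2), truncated to |n| <= images."""
    return sum(math.exp(-(x + n * L) ** 2) for n in range(-images, images + 1))

def energy_functional(points, L):
    """I[mu] for the Dirac comb mu = (V/N) * sum_j delta_{r_j}, with V = L (d = 1).
    Eq. (4.1): I = (1/2V) * (V/N)**2 * sum_{i,j} u_periodic(r_i - r_j),
    the diagonal terms i = j included (u is bounded)."""
    n = len(points)
    total = sum(u_periodic(a - b, L) for a in points for b in points)
    return L * total / (2 * n * n)

L = 4.0
config = [0.0, 1.3, 2.9]      # an arbitrary 3-particle configuration
repeated = config * 3         # the same configuration repeated 3 times
i_single = energy_functional(config, L)
i_repeated = energy_functional(repeated, L)

# Equally spaced points approximate the Lebesgue measure, for which
# I[lambda] = v(0)/2 with v(0) = integral of exp(-x**2) = sqrt(pi).
uniform = [j * L / 40 for j in range(40)]
i_uniform = energy_functional(uniform, L)
```

Under these assumptions the repeated configuration has the same energy as the original one, which is the content of Eq. (4.10), and the equally spaced comb reproduces the Lebesgue value v(0)/2 to numerical precision.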

In this way, if a minimizer of I[μ] happens to be a discrete measure concentrated, e.g., on the vertices of a lattice inside Λ, then the particles pile up on the lattice sites already in finite-density ground states. For a Λ-periodic bounded Borel function f and a μ ∈ M, let
\[ I_f[\mu] = \frac{1}{2V}\, \mu * \tilde\mu(f). \tag{4.13} \]
$I[\mu] \equiv I_{u_\Lambda}[\mu]$ in this notation.

Lemma 4.1. If u is bounded then I[μ] is continuous in the weak topology, i.e., $I[\mu_n] \to I[\mu]$ if $\mu_n \rightharpoonup \mu$.

Proof. Let first $u_\Lambda$ be continuous. If $\mu_n \rightharpoonup \mu$ then $\mu_n * \tilde\mu_n \rightharpoonup \mu * \tilde\mu$, therefore
\[ \mu_n * \tilde\mu_n(f) \to \mu * \tilde\mu(f) \tag{4.14} \]
if f is continuous on Λ. Applying this to $f = u_\Lambda$ the result follows. Suppose now that $u_\Lambda$ is bounded. For any ε > 0 we can choose a real continuous f on Λ and an integer m (depending also on f) such that $\|u_\Lambda - f\|_\infty < \varepsilon$ and $|I_f[\mu] - I_f[\mu_n]| < \varepsilon$ for n ≥ m. Then $|I_{u_\Lambda}[\mu] - I_{u_\Lambda}[\mu_n]| < 3\varepsilon$ for n ≥ m and, hence,
\[ \lim_{n\to\infty} |I_{u_\Lambda}[\mu] - I_{u_\Lambda}[\mu_n]| \le 3\varepsilon. \tag{4.15} \]
This being true for every ε > 0, the above limit actually vanishes.



Theorem 4.1. If u is bounded, then
\[ I^0 = C_\Lambda[u], \tag{4.16} \]
the infimum $I^0$ is attained, and $I[\mu] = I^0$ if and only if μ is an IDGS.

Remark. Because of Eq. (2.27), Eq. (4.16) states that the infimum of I[μ] over all μ ∈ M is the same as the infimum over discrete measures associated with finite-density ground states. This does not mean that the IDGS cannot be a continuous measure; see the forthcoming sections.

Proof. Choose $\mu_m \in M$ such that $I[\mu_m]$ tends to $I^0$ as m goes to infinity ($\mu_m$ may be the same for each m). Because the set of discrete measures (4.2) is dense in M and I is continuous, $\mu_m$ can be chosen to be of the form (4.2); hence, $I[\mu_m] \ge C_\Lambda[u]$. If μ is any weak limit point of $\mu_m$ then by the continuity of I,
\[ C_\Lambda[u] \ge I^0 = \lim_{m\to\infty} I[\mu_m] = I[\mu] \ge C_\Lambda[u]. \tag{4.17} \]

This proves (4.16) and that μ is an IDGS. Starting with an IDGS μ and choosing $\mu_m = \mu_{R_m}$, where $R_m$ is the defining sequence (4.4), Eq. (4.17) shows that μ is a minimizer of I.

Remark. We have found that weak limits of N-particle ground states are minimizers of I, but have not proved that every minimizer (that is, every IDGS) can be obtained as such a limit.

4.2. Unbounded interactions. We restrict the discussion to pair potentials such that u(x) → ∞ as x → 0 and u is bounded otherwise. Definition 4.1 and Proposition 4.1 are valid in this case. Because the notion of an IDGS is independent of the energy functional, the divergence of I[μ] on point measures does not allow one to conclude directly that IDGS's are continuous measures. Below we present a proof which is independent of I.

Proposition 4.3. Let u be integrable and u(x) → ∞ as x → 0. Then any IDGS μ is purely continuous.

Proof. Take any $x_0 \in \operatorname{supp} \mu$. Let $R_m$ be the defining sequence of μ, cf. Definition 4.1. Because $u_\Lambda(0) = \infty$, all points of $R_m$ are distinct. Choose any ε > 0 and let C be an open ball of diameter ε centered at $x_0$. By approximating the characteristic function of C with continuous functions one can see that
\[ \lim_{m\to\infty} \mu_{R_m}(C) = \mu(C). \]
It follows that for m large enough $\mu_{R_m}(C) \ge \frac{1}{2}\mu(C)$ and, because $\mu_{R_m}(\{x\}) = V/|R_m|$ for $x \in R_m$,
\[ |R_m \cap C| \ge \frac{\mu(C)\,|R_m|}{2V}. \tag{4.18} \]

μ(C)|Rm | − 1 inf u (x) |x| 0, introduce I K [μ] − I0 [μ] = I K [μ] + Iu − [μ]. I K+ [μ] =

(4.29)

Because u is bounded below, the second term is finite. Now I K+ [μ] ≥

K −1

n(gn − gn+1 ) =

n=1

for any

K

K K (gn − g K ) ≥ (gn − g K ) n=1

(4.30)

n=1

≤ K , therefore

lim I K+ [μ] ≥

K →∞

K

gn

(4.31)

gn = ∞

(4.32)

n=1

for any K . It follows that lim I K+ [μ] ≥

K →∞

∞ n=1

because gn > c/2n for n large enough. Thus, in this case I [μ] = I [μ] = ∞.

Lemma 4.3. If μ is an IDGS then I [μ] ≤ C [u]. Proof. Let Rm be the defining sequence of μ. By Lemma 4.1 and Theorem 4.1, I K [μ Rm ] ≤ lim I [μ Rm ] = lim (Rm ) = C [u]. I K [μ] = lim m→∞

m→∞

Taking the supremum over K gives the result.

m→∞

(4.33)

Lemma 4.4. $C_\Lambda[u] \le I^0$.

Proof. Define
\[ u_\Lambda^K(x) = \min\{u_\Lambda(x), K\}. \tag{4.34} \]
For any N there exists some $K_N$ such that $R_N$ is an N-particle ground state of U if and only if it is also an N-particle ground state of $U^K$ for any $K \ge K_N$, and
\[ E_0^K(N) = U^K(R_N) = U(R_N) = E_0(N). \tag{4.35} \]
For example, $K_N = E_0(N) + (N^2/2) \sup u_\Lambda^-$ is a possible choice. Indeed, $u_\Lambda^K \le u_\Lambda$, therefore $E_0^K(N) \le E_0(N)$. It follows that if $K \ge K_N$ and $R_N$ is an N-particle ground state of either U or $U^K$, then
\[ u_\Lambda^K(x-y) = u_\Lambda(x-y) \le K \]


for every pair {x, y} ⊂ R N , otherwise U (R N ) ≥ U K (R N ) > E 0 (N ). This means, however, that R N is a common ground state of U and U K , and (4.35) holds true. With the above K N and R N , in an obvious notation, cf. (4.13), I 0K N = C [u K N ] ≥ NK N = N , u

(4.36)

the three relations holding due to Eqs. (4.16), (3.5) and (4.35), respectively. Letting N tend to infinity, or taking the supremum over N,
\[ I^0 \equiv I^0_{u_\Lambda} \ge \lim_{N\to\infty} I^0_{u_\Lambda^{K_N}} \ge C_\Lambda[u], \tag{4.37} \]
where the first inequality follows from Eq. (4.34).

Combining Proposition 4.3 and Lemmas 4.2, 4.3 and 4.4, for any IDGS μ we find I [μ] = C [u] = I 0 = I [μ]

(4.38)

and, therefore, the following result.

Theorem 4.2. If u is integrable, bounded outside the origin, and u(x) → ∞ as x → 0, then $I^0 = C_\Lambda[u]$, the infimum is attained, and any IDGS is continuous and minimizes I.

Remark on Sections 4.1 and 4.2. In the proof of Theorems 4.1 and 4.2 we have not used that $u_\Lambda$ was a periodized interaction. Therefore, the theorems are valid in the case when Λ is any compact Lebesgue-measurable set and the interaction is the original, non-periodic u.

4.3. Fourier representation of the energy functional. Most of the results on IDGS's will be found by writing I[μ] in Fourier representation. This is obtained by substituting the expansion (2.16) into Eq. (4.5) and integrating term by term:
\[ I[\mu] = \frac{1}{2} \Bigl[ v(0) + \sum_{0 \ne k \in \Lambda^*} v(k)\, |\hat\mu(k)|^2 \Bigr] = \frac{1}{2V}\, \widehat{\mu * \tilde\mu}(v). \tag{4.39} \]
Here
\[ \hat\mu(k) = \frac{1}{V} \int_\Lambda e^{-ik\cdot x}\, d\mu(x), \tag{4.40} \]
so that $|\hat\mu(k)| \le \hat\mu(0) = 1$, and
\[ \frac{1}{V}\, \widehat{\mu * \tilde\mu} = \sum_{k \in \Lambda^*} |\hat\mu(k)|^2\, \delta_k. \tag{4.41} \]
We do not need the equality (4.39) to hold for all μ ∈ M. We are interested in IDGS's, and $I[\mu] = C_\Lambda[u] \le v(0)/2$ for any of them. It suffices, therefore, that the Fourier expansion (4.39) converges to I[μ] whenever $I[\mu] \le v(0)/2$. In the case



v ≥ 0, $I[\mu] \le v(0)/2$ holds (with equality) if and only if μ is an IDGS; later on, we will discuss this case in detail. Let
\[ v^+(k) = \max\{0, v(k)\}, \qquad v^-(k) = -\min\{0, v(k)\}. \tag{4.42} \]
If $v^- \ne 0$, then the IDGS's are among those μ making the right member of Eq. (4.39) absolutely convergent and satisfying
\[ \sum_{0 \ne k \in \Lambda^*} v^-(k)\, |\hat\mu(k)|^2 - \sum_{0 \ne k \in \Lambda^*} v^+(k)\, |\hat\mu(k)|^2 \ge 0. \tag{4.43} \]

For a stable interaction the left member of this inequality cannot exceed v(0), cf. Proposition 5.1. The verity of (4.39) in this case can be seen as follows. Given a t > 0, consider the Gaussian
\[ G^t(x) = (4\pi t)^{-d/2}\, e^{-|x|^2/4t} \tag{4.44} \]
and its periodization, called the periodic heat kernel,
\[ G_\Lambda^t(x) = \sum_{n \in \mathbb{Z}^d} G^t(x + Ln) = \frac{1}{V} \sum_{k \in \Lambda^*} e^{-t|k|^2} e^{ik\cdot x}, \tag{4.45} \]
where the right member is obtained by the Poisson summation formula (or the Fourier expansion of the periodic function on the left). If μ ∈ M, then $G_\Lambda^t * \mu \in M$ as well [$f * \mu(x) := \int f(x-y)\, d\mu(y)$]. Moreover, $G_\Lambda^t * \mu$ is absolutely continuous, and its Radon-Nikodym derivative
\[ \phi(x) = \sum_{k \in \Lambda^*} e^{-t|k|^2}\, \hat\mu(k)\, e^{ik\cdot x} \tag{4.46} \]
is a real entire function of each component of x. These properties guarantee that the energy functional evaluated on $G_\Lambda^t * \mu$ satisfies (4.39),
\[ I[G_\Lambda^t * \mu] = \frac{1}{2} \sum_{k \in \Lambda^*} v(k)\, e^{-2t|k|^2}\, |\hat\mu(k)|^2. \tag{4.47} \]

Because the left member is finite and the right member is absolutely convergent at t = 0, the regularity of Abel summability implies that Eq. (4.47) survives the t → 0 limit and yields (4.39). The form (4.39) of I[μ] reveals an important property of the minimizers, hidden in the defining equation (4.5). For stable interactions $0 \le I^0 \le v(0)/2$ but, because Λ* is a set of density $\sim L^d$, for most μ ∈ M one finds I[μ] of the order of some power of L. For example,
\[ I[V\delta_x] = \frac{1}{2} \sum_{k \in \Lambda^*} v(k) = V u_\Lambda(0)/2. \tag{4.48} \]
Thus, one may expect that when Λ increases, the minimizers are among μ's such that for some volume-independent constant c and for any K > 0,
\[ \bigl| \operatorname{supp} \hat\mu \cap \{k \in \Lambda^* : |k| \le K\} \bigr| \le cK^d. \tag{4.49} \]


In Sect. 6 we shall meet some special sets of this kind, namely, additive subgroups of Λ*. These will be shown to support the Fourier transform of the stationary points of I. When v ≥ 0 or $u(x) \ge \mathrm{const} \times e^{-\alpha|x|^2}$, at least the uniform local boundedness as Λ → ∞ of both the autocorrelation and its Fourier transform (4.41) is easily seen. We return to this question in Sect. 8.
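The agreement between the real-space form (4.5) and the Fourier form (4.39) of the energy functional can be checked numerically on a Dirac comb. The sketch below is only an illustration under stated assumptions: d = 1, the pair potential u(x) = exp(−x²) with Fourier transform v(k) = √π exp(−k²/4), a torus of length L = 4, and finite truncations of both the periodization and the sum over k = 2πm/L. The function names are hypothetical.

```python
import cmath
import math

def v(k):
    """Fourier transform of the illustrative potential u(x) = exp(-x**2)."""
    return math.sqrt(math.pi) * math.exp(-k * k / 4)

def u_periodic(x, L, images=20):
    """Periodized potential, truncated to |n| <= images."""
    return sum(math.exp(-(x + n * L) ** 2) for n in range(-images, images + 1))

def energy_real_space(points, L):
    """Eq. (4.5) for mu = (V/N) * sum_j delta_{r_j}, with V = L."""
    n = len(points)
    return L * sum(u_periodic(a - b, L) for a in points for b in points) / (2 * n * n)

def mu_hat(points, k):
    """Eq. (4.40): (1/V) * integral of exp(-i k x) dmu(x) = (1/N) * sum_j exp(-i k r_j)."""
    return sum(cmath.exp(-1j * k * r) for r in points) / len(points)

def energy_fourier(points, L, modes=60):
    """Eq. (4.39), with the sum over k = 2*pi*m/L truncated to |m| <= modes."""
    total = 0.0
    for m in range(-modes, modes + 1):
        k = 2 * math.pi * m / L
        total += v(k) * abs(mu_hat(points, k)) ** 2
    return total / 2

L = 4.0
points = [0.0, 1.3, 2.9]
i_real = energy_real_space(points, L)
i_fourier = energy_fourier(points, L)
# Eq. (4.48): a single point mass V*delta_x has energy V*u_periodic(0)/2.
i_delta = energy_fourier([0.7], L)
```

Because the Gaussian decays rapidly in both x and k, the two truncations are harmless here; for slowly decaying interactions the cutoffs would have to be chosen with more care.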

5. Stability Conditions for Pair Potentials

From Theorems 4.1 and 4.2 we can deduce stability conditions on u. If u is nonnegative then it is obviously stable; if u ≥ 0 and is integrable then its Fourier transform v is a positive definite function (a function of positive type) [29], therefore v(0) − |v(k)| ≥ 0 for all $k \in \mathbb{R}^d$. Due to Theorems 4.1 and 4.2, similar conditions can be obtained without assuming u ≥ 0.

Proposition 5.1. Let u be an integrable pair potential with u(−x) = u(x). Then $I^0 = I^0(\Lambda) \ge 0$ for any cube Λ is sufficient and necessary for stability, and $I^0(\Lambda) > 0$ for all Λ is sufficient and necessary for superstability. Some particular necessary conditions for the stability of u are as follows:

(i)
\[ 2v(0) + v(k) \ge 0, \quad \text{all } k \in \mathbb{R}^d. \tag{5.1} \]
(ii) If u is radial and v(k) ≤ 0 for $|k| \ge k_0$, then
\[ v(0) + n_d \min_{|k| \ge k_0} v(k) \ge 0, \tag{5.2} \]
where $n_d$ is the number of nearest neighbors in a d-dimensional close-packed lattice ($n_d$ = 6, 12, 24, 40, ... for d = 2, 3, 4, 5, ...).
(iii) If u is bounded, radial and u(r) ≤ 0 for $|r| \ge r_0$ then
\[ u(0) + n_d \min_{|r| \ge r_0} u(r) \ge 0. \tag{5.3} \]

Remark. Ruelle's example of a catastrophic potential ([7], Sect. 3.2.3) violates the inequality (5.3).

Proof. Because for u bounded $I^0 = \inf_{N,(r)_N} \omega(r)_N$, the condition $I^0 \ge 0$ is part of Proposition 3.2.2 of Ruelle [7]. The role of the sign of $I^0$ for u unbounded is also clear from $I^0 = C_\Lambda[u]$. Now v(−k) = v(k) is real, and assertions (i)–(ii) are nontrivial if v takes on negative values.

(i) Suppose that v(k) < 0 for some k ≠ 0 and consider the measure
\[ d\mu_k(x) = (1 + \cos k\cdot x)\, dx. \tag{5.4} \]
The form (4.39) of I[μ] shows (cf. also Proposition 9.1) that $I[\mu_k] = v(0)/2 + v(k)/4$, which must be nonnegative if u is stable.



(ii) Let $\min_{|k| \ge k_0} v(k) = v(k_m)$. Choose Λ and a Bravais lattice B (lattice, for mathematicians) such that the reciprocal lattice B* (dual lattice, defined by k · r ∈ 2πZ for k ∈ B* and r ∈ B) is close-packed, B* ⊂ Λ* and the nearest neighbor distance of B* is $|k_m|$. Then, for $(r)_N = B \cap \Lambda$, by a simple computation
\[ I[\mu_B] \le \frac{1}{2} \bigl[ v(0) - n_d |v(k_m)| \bigr], \tag{5.5} \]
and the left member is nonnegative if u is stable. Here $I[\mu_B] = I[\mu_{B \cap \Lambda}]$, which is independent of Λ provided that Λ is a period cell for B, see also Eq. (9.11).

(iii) Condition (5.3) is the dual of (5.2), obtained from the Poisson summation formula
\[ \sum_{r \in B} u(r) = \rho(B) \sum_{k \in B^*} v(k) = 2\rho(B)\, I[\mu_B]. \tag{5.6} \]
Here ρ(B) denotes the density of B. Let $\min_{|r| \ge r_0} u(r) = u(r_m)$. Choose B to be close-packed with nearest-neighbor distance $|r_m|$. Then, in case of stability,
\[ 0 \le I[\mu_B] = \frac{1}{2\rho(B)} \sum_{r \in B} u(r) \le \frac{1}{2\rho(B)} \bigl[ u(0) - n_d |u(r_m)| \bigr], \tag{5.7} \]
which was to be shown.
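The value of $I[\mu_k]$ used in part (i) of the proof can be verified directly from Eqs. (4.39) and (4.40). The short derivation below assumes that k ∈ Λ* with k ≠ 0, so that cos k·x is Λ-periodic; the summation variable q and the Kronecker deltas are notation introduced only for this check.

```latex
\begin{aligned}
\hat\mu_k(q) &= \frac{1}{V}\int_\Lambda e^{-iq\cdot x}\,(1+\cos k\cdot x)\,dx
              = \delta_{q,0} + \tfrac12\bigl(\delta_{q,k} + \delta_{q,-k}\bigr),
              \qquad q \in \Lambda^*, \\
I[\mu_k] &= \frac12\Bigl[v(0) + \sum_{0\neq q\in\Lambda^*} v(q)\,|\hat\mu_k(q)|^2\Bigr]
          = \frac12\Bigl[v(0) + \tfrac14 v(k) + \tfrac14 v(-k)\Bigr]
          = \frac{v(0)}{2} + \frac{v(k)}{4},
\end{aligned}
```

where the last step uses v(−k) = v(k). Requiring $I[\mu_k] \ge 0$ then gives condition (5.1).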

6. Stationary Points of the Energy Functional

The natural candidates for IDGS's are the stationary points of I[μ]. To motivate the definition of stationary points, we first consider absolutely continuous normalized measures,
\[ d\mu(x) = f^2(x)\, dx, \tag{6.1} \]
where f is a real function. We have to minimize I[μ] with respect to f under the condition $\int_\Lambda f^2(x)\, dx = V$. Via functional derivation of I[μ] we obtain
\[ f(x) \Bigl[ \int_\Lambda u_\Lambda(x-y)\, f^2(y)\, dy - c \Bigr] = 0. \tag{6.2} \]
Here c is a Lagrange multiplier. Another way to write this equation is
\[ \int_\Lambda u_\Lambda(x-y)\, f^2(y)\, dy = c \quad \text{if } f(x) \ne 0. \tag{6.3} \]
Any absolutely continuous μ with a density $f^2$ solving Eq. (6.2) is a stationary point of the energy functional. Because with such an f,
\[ I[\mu] = \frac{1}{2V} \int_{\Lambda^2} u_\Lambda(x-y)\, f^2(y)\, f^2(x)\, dy\, dx = \frac{c}{2}, \tag{6.4} \]
among the solutions of (6.2) one has to choose the one giving the smallest constant c. A solution of (6.2) is $f^2(x) \equiv 1$, which defines the Lebesgue measure and yields c = v(0) and, hence, an already familiar upper bound $I^0 \le v(0)/2$, cf. Eqs. (2.19), (4.9). We shall see that the Lebesgue measure is the only absolutely continuous measure among the stationary points of the functional I. By generalizing (6.3) we arrive at the following definition.


Definition 6.1. A μ ∈ M is a stationary point of $I_f$, $\mu \in M^{st}(f)$, if
\[ (f * \mu)(x) = \int f(x-y)\, d\mu(y) = \text{constant for } x \in \operatorname{supp} \mu. \tag{6.5} \]
μ ∈ M is a stationary point (of I), $\mu \in M^{st}$, if it is a stationary point of $I_f$ for every continuous function f.

Remarks. (i) One could ask for f ∗ μ to be constant only μ-almost everywhere, but the above choice is sufficient and more convenient for our purposes.

(ii) The name is justified by the following property. Let $\mu \in M^{st}(f)$ and let ν ∈ M, ν ≪ μ. Then
\[ \int (f * \mu)\, d\mu = \int (f * \mu)\, d\nu, \tag{6.6} \]
and therefore
\[ I_f[(1-\varepsilon)\mu + \varepsilon\nu] - I_f[\mu] = \varepsilon^2 \bigl( I_f[\nu] - I_f[\mu] \bigr) \tag{6.7} \]
for any 0 ≤ ε ≤ 1. So μ may not minimize $I_f$, not even among the measures absolutely continuous with respect to it, but if we perturb μ with a ν ≪ μ in the order of ε, $I_f$ changes only in the order of $\varepsilon^2$. $M^{st} \subset M^{st}(f)$, and we shall give two examples showing that the first can be a proper subset of the second. $M^{st}$ is nonempty: the Lebesgue measure is a stationary point.

Intuitively, it appears to be clear that μ is a stationary point if it is uniform on its support, and supp μ − x ≡ {y − x | y ∈ supp μ} is the same for all x ∈ supp μ. The theorem below provides the proof. For t ∈ Λ and a function g on Λ, define $T_t g$ by
\[ T_t g(x) = g(x - t), \tag{6.8} \]
and for μ ∈ M, $T_t\mu \in M$ by
\[ T_t\mu(g) = \mu(T_{-t}\, g). \tag{6.9} \]
Then for Borel sets D ⊂ Λ,
\[ T_t\mu(D) = T_t\mu(1_D) = \mu(D - t). \tag{6.10} \]

Lemma 6.1. Any translate of a stationary point of $I_f$ is a stationary point of $I_f$.

Proof. Suppose $\mu \in M^{st}(f)$. Let A = supp μ, and (f ∗ μ)(x) ≡ c for x ∈ A. Then $(f * T_t\mu)(x) \equiv c$ for x ∈ A + t and, because A + t = supp $T_t\mu$, we conclude that $T_t\mu \in M^{st}(f)$.

The set $\Lambda = L\mathbb{T}^d$ is an additive group, −Λ = Λ. A closed subset G of Λ is a closed additive subgroup if G − G ⊂ G. A Haar measure μ on G is a nonzero measure of support G which is invariant under shifts with elements of G; that is, $T_t\mu = \mu$ for every t ∈ G, so μ is uniformly distributed on G. The normalized Haar measure is unique.

Theorem 6.1. A μ ∈ M is a stationary point if and only if it is a translate of the normalized Haar measure on a closed additive subgroup of Λ.



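Examples. Some simple closed additive subgroups with their normalized Haar measures are
\[ \{0\} \quad \text{with} \quad V\delta_0, \tag{6.11} \]
\[ \Lambda \quad \text{with} \quad \lambda \equiv \lambda^d, \tag{6.12} \]
and
\[ \bigl( B^{(n)} \cap \Lambda^{(n)} \bigr) \times \Lambda^{(d-n)} \tag{6.13} \]
with
\[ \mu_{B^{(n)} \cap \Lambda^{(n)}} \times \lambda^{d-n} \equiv \frac{\lambda^n(\Lambda^{(n)})}{|B^{(n)} \cap \Lambda^{(n)}|} \sum_{r \in B^{(n)} \cap \Lambda^{(n)}} \delta_r^{(n)} \times \lambda^{d-n}, \tag{6.14} \]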

where n = 1, ..., d, $\lambda^n$ and $\delta_r^{(n)}$ are the n-dimensional Lebesgue and Dirac measures, respectively, $\Lambda^{(n)}$ is the projection of Λ to the first n coordinates and $B^{(n)}$ is an n-dimensional Bravais lattice commensurate with $\Lambda^{(n)}$. Any Bravais lattice can appear as a stationary point if we replace $\Lambda = L\mathbb{T}^d$ by a suitable parallelepiped. In the following lemma we give a full description of closed additive subgroups of Λ and the Haar measures on them. Related results can be found in Ref. [27]. Recall that $\Lambda^* = (2\pi/L)\mathbb{Z}^d$ and $V = L^d$.

Lemma 6.2. Let G be a closed additive subgroup of Λ with normalized Haar measure μ (i.e., μ(G) = V), and let
\[ G^* = \{k \in \Lambda^* \mid k\cdot x \in 2\pi\mathbb{Z}, \text{ all } x \in G\}. \tag{6.15} \]
Then G* is an additive subgroup of Λ*, and $\hat\mu$ is a Haar measure (the counting measure) on G*,
\[ \hat\mu(k) = \sum_{K \in G^*} \delta_{k,K}. \tag{6.16} \]
Conversely, if G* is an additive subgroup of Λ* with the counting measure $\hat\mu$ on it, then
\[ G = \{x \in \Lambda \mid k\cdot x \in 2\pi\mathbb{Z}, \text{ all } k \in G^*\} \tag{6.17} \]
is a closed additive subgroup of Λ, and the measure μ defined by $\hat\mu$ as its Fourier transform is a Haar measure on G with μ(G) = V.

Proof. The (closed) subgroup property is obvious in both cases. Let now G be a closed additive subgroup of Λ, either given initially or constructed as in (6.17), G* the dual of G, and μ the normalized Haar measure on G. We show that $\hat\mu$ is just (6.16). For any k ∈ Λ* and y ∈ G,
\[ \hat\mu(k) = \frac{1}{V} \int_G e^{-ik\cdot x}\, d\mu(x) = \frac{1}{V} \int_G e^{-ik\cdot x}\, d\mu(x+y) = \frac{1}{V} \int_G e^{-ik\cdot(x-y)}\, d\mu(x) = e^{ik\cdot y}\, \hat\mu(k). \tag{6.18} \]
Then either $\hat\mu(k) = 0$ or k · y ∈ 2πZ for all y ∈ G, that is, k ∈ G*. In the second case $\hat\mu(k) = 1$, proving the claim.
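The statement of Lemma 6.2 and the stationarity condition (6.5) can be illustrated numerically in the simplest case. The sketch below assumes d = 1, a torus of length L = 4, and the finite subgroup G = {jL/n : j = 0, ..., n − 1} with n = 5; the test function f and all names are hypothetical choices for the example.

```python
import cmath
import math

def mu_hat(points, m, L):
    """Fourier coefficient (4.40) at k = 2*pi*m/L of the Dirac comb with equal
    weights V/len(points) on `points` (d = 1, V = L)."""
    k = 2 * math.pi * m / L
    return sum(cmath.exp(-1j * k * x) for x in points) / len(points)

def convolution(f, points, x, L):
    """(f * mu)(x) = integral of f(x - y) dmu(y) for the same Dirac comb."""
    return L / len(points) * sum(f(x - y) for y in points)

L = 4.0
n = 5
subgroup = [j * L / n for j in range(n)]   # G = (L/n) Z_n, a closed subgroup of the torus

# Lemma 6.2: mu_hat equals 1 on the dual group G* = (2*pi*n/L) Z and 0 elsewhere.
coefficients = {m: mu_hat(subgroup, m, L) for m in range(-12, 13)}

# Eq. (6.5): f * mu is constant on supp mu for an L-periodic test function f.
def f(x):
    return math.cos(2 * math.pi * x / L) + 0.3 * math.sin(6 * math.pi * x / L) ** 2

values_on_support = [convolution(f, subgroup, x, L) for x in subgroup]
```

The coefficients vanish unless m is a multiple of n, which is the one-dimensional instance of Eq. (6.16), and the convolution takes a single value on the support up to rounding.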


Thus, we have obtained a simple characterization of closed additive subgroups of the torus Λ as the dual sets, in the sense of Eq. (6.17), of intersections of linear subspaces of $\mathbb{R}^d$ with the lattice Λ*. Indeed, each additive subgroup of Λ* is such an intersection. We now see that Λ is the only closed additive subgroup of positive Lebesgue measure, therefore the Lebesgue measure is the only absolutely continuous stationary point of I.

Proof of Theorem 6.1. 1. Let μ be the normalized Haar measure on an additive subgroup G, extended to Λ with μ(Λ\G) = 0. For arbitrary Borel sets $D_1 \subset G$ and $D_2 \subset \Lambda\setminus G$, and any x ∈ G, $\mu(D_1 - x) = \mu(D_1)$, $\mu(D_2 - x) = \mu(D_2) = 0$. Hence, for any Borel set D ⊂ Λ and any x ∈ G, μ(D − x) = μ(D) or, equivalently,
\[ (1_D * \mu)(x) = \mu(D) \quad \text{for any } x \in G. \tag{6.19} \]
Because any continuous function is the limit of a sequence of finite linear combinations of characteristic functions, (6.5) holds for every continuous f; hence, $\mu \in M^{st}$.

2. Suppose that $\mu \in M^{st}$ and let A = supp μ. Although the characteristic functions are discontinuous, Eq. (6.5) extends to them through limits. For example, the characteristic functions of open sets are limits of monotone increasing sequences of continuous functions. For an arbitrary Borel set D ⊂ Λ,
\[ \mu(D + x) \equiv c_D \quad \text{for any } x \in A \tag{6.20} \]
is then obtained by using the upper regularity of μ.

(i) Suppose that 0 ∈ A. Then μ(D + x) = μ(D) for all Borel sets D ⊂ Λ and all x ∈ A; in particular, for D = A − x,
\[ \mu(A - x) = \mu(A) = \mu(A \cap (A - x)) = V \tag{6.21} \]
for all x ∈ A. Since A − x is closed together with A, necessarily A − x = A for all x ∈ A and, hence, A − A = A. Thus, A = supp μ is an additive subgroup of Λ and μ is a shift-invariant measure on it.

(ii) If 0 ∉ A, choose any t ∈ −A and replace A by A + t and μ by $T_t\mu$. Because $T_t\mu \in M^{st}$, the argument (i) yields that $T_t\mu$ is the normalized Haar measure on the additive group A + t, and μ is a translate of this measure.

Corollary 6.1. Let μ, ν ∈ M be two measures with coinciding supports. If both are stationary points then μ = ν.

Proof. supp μ = supp ν is an additive group and μ and ν are normalized Haar measures on it. But the normalized Haar measure is unique, whence μ = ν.

Remarks. (i) Here are two examples showing that $M^{st} \subsetneq M^{st}(u_\Lambda)$.

1. A two-dimensional example is the Dirac comb $\mu_H$ concentrated on the honeycomb lattice. It is not a stationary point because, if $x_1$ and $x_2$ are nearest neighbors, supp $\mu_H - x_2$ = −supp $\mu_H + x_1$. However, $u_\Lambda(-x) = u_\Lambda(x)$ implies $\mu_H \in M^{st}(u_\Lambda)$.



2. If u is radial and $\Lambda = [-L/2, L/2]^d$, $u_\Lambda$ has a cubic symmetry. Then for any 0 < a ≤ L/4,
\[ \mu = \frac{V}{2d} \sum_{j=1}^{d} \bigl( \delta_{-ae_j} + \delta_{ae_j} \bigr) \tag{6.22} \]
is stationary for $u_\Lambda$, but not for functions not having a cubic symmetry.

(ii) If the periodic boundary condition is replaced by a free one, the set of stationary points changes completely. The reason is that now u itself, and not the periodized $u_\Lambda$, appears in the energy functional; in fact, Λ may not be a parallelepiped. The additive groups are replaced by others. For a radial u the stationary points of $I_u$ are uniform distributions on 1, 2, ..., d − 1 dimensional spheres, on the vertices of regular polyhedra, in 3 dimensions also on the 60 vertices of the soccer ball (fullerene), and others. The supports are now sets that are invariant under transformations forming finite point groups or groups of rotation, and the stationary points correspond to Haar measures on those groups. As Λ increases, the dependence of the ground states on the boundary condition should continuously disappear in the bulk. For example, on parallelepipeds there should be a continuous transition between IDGS's for free and periodic boundary conditions. Such a transition cannot take place on stationary points for both types of boundary conditions. Intuition suggests that stationary points have a better chance with periodic boundary conditions.

(iii) If u is integrable but u(x) → ∞ as x → 0, then I[μ] = ∞ for any μ having a point part. However, depending on the rate of divergence of u(x), I[μ] can be finite in some continuous stationary points, and these are to be considered as possible IDGS's. If $u(x) \sim |x|^{-d+\zeta}$ near the origin with 0 < ζ < d then u is integrable in d − n dimensions for n < ζ, so the energy functional is finite on the corresponding measures (6.14). Some interactions not allowing stationary points of I to be IDGS's will be presented in Theorem 11.2. If 0 < ζ < 1, I[μ] can be finite only if the Hausdorff dimension of supp μ is greater than d − ζ > d − 1.
According to our characterization given in Lemma 6.2, the only stationary point providing a finite energy for such an interaction is the Lebesgue measure. In Sect. 9 we will show, however, that the Lebesgue measure does not minimize I if the Fourier transform of u is partly negative. Thus, for $u \in L^1(\mathbb{R}^d)$ diverging at zero as $|x|^{-d+\zeta}$ with 0 < ζ < 1 and having a partly negative Fourier transform, no stationary point is an IDGS. Theorem 11.2 will cover this case.

7. Ground State Configurations in Infinite Space

From Refs. [1,2] let us recall the definition of a ground state configuration in $\mathbb{R}^d$. To begin with, a configuration is an equivalence class of point sequences, two sequences being equivalent if they differ only in a permutation of the entries. We must consider sequences instead of sets because particles can be placed in the same point if the interaction is bounded. For an infinite configuration X and a finite configuration R let
\[ U(R|X) = U(R) + I(R, X) \tag{7.1} \]
with
\[ U(R) = \sum_{\{r,r'\} \subset R} u(r - r'), \qquad I(R, X) = \sum_{r \in R,\, x \in X} u(r - x). \tag{7.2} \]


Definition 7.1. Let u be a strongly tempered interaction. Given a real number μ, an infinite configuration X is a (grand canonical) ground state configuration for chemical potential μ (a μGSC) if for any finite configuration X f ⊂ X, U (X f ) is finite, the sum I (X f , X \X f ) is absolutely convergent, and for any finite configuration R ⊂ Rd , U (R|X \X f ) − μ|R| ≥ U (X f |X \X f ) − μ|X f |.

(7.3)

X is a (canonical) ground state configuration (GSC) if (7.3) holds true for every R such that |R| = |X f |. Thus, a ground state configuration is a locally stable configuration, in the sense that no local modification can decrease its energy. Equation (7.3) must hold also when R or X f is the empty set. Hence, if X is a μGSC then U (X f |X \X f ) − μ|X f | ≤ 0 for any finite X f ⊂ X . If u is not strongly tempered, absolute convergence of I (X f , X \X f ) must be replaced by some weaker condition, e.g. Abel summability, see [2]. The above definition and the two lemmas below apply also to interactions that are not integrable at the origin or have a hard core. The existence of GSC’s and μGSC’s is by no means obvious. For certain interactions they can be constructed explicitly [1,2,10,30,31], for some others only their existence is proven [32,33]. In this paper we assume that for the class of interactions considered, high-density GSC’s do exist; at least, all statements about GSC’s in infinite space are subject to this proviso. Definition 7.2. For x ∈ Rd , let Cx denote the unit ball centered at x. An infinite configuration X is locally uniformly finite (l.u.f.), if there exists an integer m such that for any x ∈ Rd , |X ∩ Cx | ≤ m.

(7.4)

Equivalently, for any compact K ∈ Rd there is an integer m K such that |X ∩ (K + x)| ≤ m K , any x ∈ Rd .

(7.5)

In [2] we proved the absence of metastability in the case of strongly tempered interactions, by showing that a necessary condition for a l.u.f. configuration to be a GSC is to minimize the energy density among l.u.f. configurations of the same density. Here we prove that this condition is also sufficient among periodic arrangements (while it is obviously insufficient among l.u.f. configurations). The general form of a periodic configuration is X=

J !

(B + y j ),

(7.6)

j=1

where B = Za1 + · · · + Zad , ai ∈ Rd are linearly independent, y j are not necessarily " different d-dimensional vectors, and in this paper we use to indicate that in the union coinciding points occur with repetition. The choice of B can be made unique by asking J to be minimal. B thus obtained is the maximal group of period vectors of X , and is denoted by B(X ).



The particle density and the energy per particle of an infinite configuration X are defined, respectively, as ρ(X ) = lim

→Rd

|X ∩ | U (X ∩ ) , e p (X ) = lim , V () →Rd |X ∩ |

(7.7)

provided that the limits exist. Here is a bounded Lebesgue-measurable domain and V () is its volume. The energy density (energy per volume) of X is e(X ) = ρ(X )e p (X ). If the particle density is kept fixed, we can work with any of e p or e. If X is a periodic configuration, then ρ(X ) = |X ∩ |/V ( ), where is any period-parallelepiped. If, moreover, u is strongly tempered, cf. Eq. (2.2), and tends to Rd in the Van Hove sense [7], then e p (X ) =

\[ \frac{U(X_0)}{|X_0|} + \frac{1}{2|X_0|} \sum_{m=1}^{\infty} I(X_0, X_m) = \frac{U(X_0)}{|X_0|} + \frac{I(X_0, X\setminus X_0)}{2|X_0|}. \tag{7.8} \]

Here $X_m$ (m = 0, 1, ...) are non-overlapping translates of any finite $X_0 \subset X$ such that $X = \cup_{m=0}^{\infty} X_m$. The simplest choice is $X_0 = (y_1, \ldots, y_J)$, cf. (7.6). However, $X_0$ can be the content of an arbitrarily large period cell, and then comparison of (7.8) and (7.7) shows that
\[ I(X_0, X\setminus X_0) = o(|X_0|). \tag{7.9} \]
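Formula (7.8) states that the energy per particle of a periodic configuration does not depend on which finite block $X_0$ is used to compute it. The following minimal sketch checks this for the one-dimensional lattice X = aZ. The pair potential u(r) = exp(−r²), the lattice constant a = 1, the cutoff on the infinite sum, and the function names are assumptions made for the illustration.

```python
import math

def u(r):
    """Illustrative strongly tempered pair potential (an assumption)."""
    return math.exp(-r * r)

def energy_per_particle(cells, a=1.0, cutoff=40):
    """Eq. (7.8) for the lattice X = a*Z, with X_0 the first `cells` lattice
    points and the rest of X truncated to `cutoff` further points on each side."""
    x0 = [j * a for j in range(cells)]
    rest = [j * a for j in range(-cutoff, cells + cutoff) if not 0 <= j < cells]
    # U(X_0): sum over unordered pairs inside X_0.
    u_internal = sum(u(p - q) for i, p in enumerate(x0) for q in x0[i + 1:])
    # I(X_0, X minus X_0): sum over pairs with one point inside and one outside.
    i_external = sum(u(p - q) for p in x0 for q in rest)
    return u_internal / cells + i_external / (2 * cells)

e_one = energy_per_particle(1)      # X_0 is a single lattice point
e_three = energy_per_particle(3)    # X_0 contains three lattice points
# For X = Z the exact value is sum_{n >= 1} exp(-n**2); ten terms suffice numerically.
reference = sum(math.exp(-n * n) for n in range(1, 11))
```

Both choices of $X_0$ give the same value because each pair of particles is counted with total weight one half per particle, once through $U(X_0)$ and once through $I(X_0, X\setminus X_0)/2$.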

Since u is superstable, for some b ≥ 0 and ρ(X) > b/C[u] we have
\[ |X_0| \le \frac{U(X_0)}{-b + C[u]\rho(X)}, \tag{7.10} \]
and, consequently,
\[ \lim_{|X_0| \to \infty} \frac{I(X_0, X\setminus X_0)}{U(X_0)} = 0. \tag{7.11} \]

The first part of the following lemma was proven by Sinai for a lattice gas [34] and by Radin in one dimensional continuous space [35]. Lemma 7.1. Let the interaction be strongly tempered, and let X be a periodic configuration. (i) If X minimizes the energy density among periodic configurations of the same density and of period vectors belonging to B(X ), then X is a GSC. (ii) If X is a GSC then it minimizes the energy density among l.u.f. configurations of the same density (provided that the respective densities exist). Proof.

(i) Suppose that X is not a GSC; then there exist X f ⊂ X and R ⊂ Rd finite sequences such that R ∩ X = ∅, |R| = |X f |, and = U (R|X \X f ) − U (X f |X \X f ) < 0.

(7.12)

We construct a periodic Y whose period vectors form a subgroup of those of X , such that ρ(Y ) = ρ(X ) and e p (Y ) < e p (X ). For an odd integer n, let d n−1 n−1 ≤ xi < X 0 = X ∩ n , n = xi ai : − . (7.13) 2 2 i=1


This is the union of the content of n d unit cells of X . Define Y0 = X 0 \X f ∪ R. Choose n so large that X f ∪ R ⊂ n . Let Y be the periodic extension of Y0 , # ! Y = (Y0 + t) = (n B + y), (7.14) t∈n B

y∈Y0

#

!

to be compared with (n B + x).

(7.15)

= U (Y0 |X \X 0 ) − U (X 0 |X \X 0 ).

(7.16)

X=

(X 0 + t) =

t∈n B

x∈X 0

It is straightforward to verify that

Clearly, ρ(X ) = ρ(Y ) and from (7.8) we find |X 0 |[e p (Y ) − e p (X )] = +

1 [I (Y0 , Y \Y0 ) − I (Y0 , X \X 0 )] 2

1 + [I (X 0 , X \X 0 ) − I (Y0 , X \X 0 )] . 2

(7.17)

Now X 0 \X f = Y0 \R implies I (X 0 , X \X 0 ) − I (Y0 , X \X 0 ) = I (X f , X \X 0 ) − I (R, X \X 0 ),

(7.18)

and, using I (A, B + t) = I (B, A − t), I (Y0 , Y \Y0 ) − I (Y0 , X \X 0 ) = I (R, Y \Y0 ) − I (X f , Y \Y0 ).

(7.19)

The four infinite sums in the right member of the last two equations are absolutely convergent and tend to zero as $n$ tends to infinity, because the distances of $R$ and $X_f$ to $X\setminus X_0$ and to $Y\setminus Y_0$ diverge with $n$. Thus, if $n$ is large enough, $e_p(Y) < e_p(X)$.

(ii) Earlier it was shown (Prop. Ref. [2]) that if $X$ and $Y$ are two l.u.f. configurations of equal density and $e_p(X) > e_p(Y)$, then $X$ is not a GSC. Since periodic configurations are l.u.f., the second part of the claim follows.

For a general parallelepiped $\Lambda$ spanned by the vectors $l_i$ ($i = 1, \dots, d$), let

$$u_\Lambda(r) = \sum_{t \in B_\Lambda} u(r + t), \tag{7.20}$$

where

$$B_\Lambda = \Big\{ \sum_{i=1}^{d} n_i l_i : n_1, \dots, n_d \in \mathbb{Z} \Big\}. \tag{7.21}$$
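The periodized potential of (7.20) can be illustrated numerically. The following is a minimal sketch in one dimension, assuming a rapidly decaying potential so that the lattice sum may be truncated; the function names, the Gaussian test potential and the cutoff `n_images` are illustrative choices, not taken from the text.

```python
import math


def periodized_potential(u, r, period, n_images=50):
    """Truncated lattice sum u_Lambda(r) = sum over t in B_Lambda of u(r + t), d = 1.

    B_Lambda is taken to be period * Z. The cutoff |n| <= n_images is an
    assumption of this sketch and is adequate only for rapidly decaying u.
    """
    return sum(u(r + n * period) for n in range(-n_images, n_images + 1))


def gaussian(x):
    # Illustrative pair potential (hypothetical choice for this example).
    return math.exp(-x * x)


L = 3.0
u_at_r = periodized_potential(gaussian, 0.4, L)
u_shifted = periodized_potential(gaussian, 0.4 + L, L)  # shift by one period
```

Up to the truncation error, `u_at_r` and `u_shifted` agree, which reflects the $B_\Lambda$-periodicity of $u_\Lambda$.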

Lemma 7.2. Let $u$ be strongly tempered and let $X$ be a periodic configuration.
(i) If $X$ is a GSC of $u$, then $X \cap \Lambda$ is a ground state of $u_\Lambda$ on any period parallelepiped $\Lambda$ of $X$.
(ii) If $X \cap \Lambda$ is a ground state of $u_\Lambda$ on any large enough period parallelepiped $\Lambda$ of $X$, then $X$ is a GSC of $u$.

Proof.

(i) If $X$ is a GSC, then it minimizes $e_p$ among periodic configurations of density $\rho(X)$. In particular, if $R \subset \Lambda$, $|R| = |X \cap \Lambda|$ and $Y$ is the periodic extension of $R$ from $\Lambda$, then $e_p(X) \le e_p(Y)$. Comparison with Eq. (7.8) written for a periodic configuration $Z$ of period cell $\Lambda$ and $X_0 = Z \cap \Lambda$ shows that

$$e_p(Z \cap \Lambda) := \frac{U_\Lambda(Z \cap \Lambda)}{|Z \cap \Lambda|} = e_p(Z) - \frac12 \sum_{0 \ne r \in B_\Lambda} u(r). \tag{7.22}$$

Applying (7.22) with $Z = X$ and $Y$, one finds

$$e_p(R) - e_p(X \cap \Lambda) = e_p(Y) - e_p(X) \ge 0. \tag{7.23}$$

(ii) We prove that if $Y$ is a periodic configuration, $B(Y) \subset B(X)$ and $\rho(Y) = \rho(X)$, then $e_p(Y) \ge e_p(X)$. Let $\Lambda$ be a period cell of $Y$ (and, hence, of $X$) chosen so large that $X \cap \Lambda$ is a ground state of $u_\Lambda$. Then inequality (7.23) holds true with $R = Y \cap \Lambda$.

8. Infinite-Density Ground State in Infinite Space

Let $M$ denote the family of (positive) Borel measures on $\mathbb{R}^d$.

Definition 8.1. Let $X_n$ be a sequence of infinite configurations with existing density $\rho_n$ and energy per particle $e_p(X_n)$ which tend to infinity with $n$. Suppose that

$$\lim_{n\to\infty} e_p(X_n)/\rho_n = C[u] \tag{8.1}$$

and there is some nonzero $\mu \in M$ such that

$$\mu_{X_n} = \rho_n^{-1} \sum_{x \in X_n} \delta_x \rightharpoonup \mu \quad (n \to \infty). \tag{8.2}$$

Then $\mu$ is called an infinite-density ground state in $\mathbb{R}^d$.

In contrast with IDGS's in finite volumes, the existence of IDGS's in infinite space is not a priori guaranteed. The ultimate goal would be to prove the following commutative diagram:

Conjecture 8.1.

$$\begin{array}{ccc}
\text{GSC in } \Lambda & \xrightarrow{\ \Lambda \to \infty\ } & \text{GSC in } \mathbb{R}^d \\
\Big\downarrow {\scriptstyle \rho\to\infty} & & \Big\downarrow {\scriptstyle \rho\to\infty} \\
\text{IDGS in } \Lambda & \xrightarrow[\ \Lambda \to \infty\ ]{} & \text{IDGS in } \mathbb{R}^d
\end{array} \tag{8.3}$$

If $u$ is superstable and strongly tempered then high-density GSC's and IDGS's in infinite space exist. Any IDGS in $\mathbb{R}^d$ can be obtained as the vague limit of both a sequence of periodic extensions of IDGS's in increasing volumes and a sequence of Dirac combs (8.2) associated with GSC's of increasing density. For any high-density GSC $X$, the associated measure $\mu_X$ is the vague limit of a sequence $\mu_{Y_n}$, where $Y_n$ is the periodic extension of a ground state of $u_{\Lambda_n}$ in some parallelepiped $\Lambda_n$, and $\rho(Y_n) = \rho(X)$.


If the conjecture holds true then information on GSC's in $\mathbb{R}^d$ can be obtained by studying IDGS's in finite volume. In the case when $v \ge 0$ and of compact support the results of Refs. [1,2] and those of Sect. 9 below prove the conjecture but nothing new can be obtained about GSC's. If $v \ge 0$ but its support is noncompact, the existence of IDGS's in infinite space still follows from $C_\Lambda[u] \equiv C[u] = v(0)/2$; the Lebesgue measure is one of the IDGS's (the only one if $v > 0$) and Sect. 9 gives a fairly complete description of them, cf. Corollary 9.1. Concerning the general case, below we present some clarification and partial results.

Let $\Lambda$ be any parallelepiped centered at zero,

$$\Lambda = \Big\{ \sum_{i=1}^{d} x_i l_i : -1/2 \le x_i < 1/2,\ i = 1, \dots, d \Big\}, \tag{8.4}$$

where $l_i$ are linearly independent vectors. If $n$ is a positive integer, then $n\Lambda = \{nx : x \in \Lambda\}$ is the union of $n^d$ translates of $\Lambda$:

$$n\Lambda = \bigcup_{j=1}^{n^d} (\Lambda + t_j), \tag{8.5}$$

where

$$t_j = \frac12 \sum_{i=1}^{d} m_{ji} l_i, \qquad m_{j1}, m_{j2}, \dots, m_{jd} \in \{-n+1, -n+3, \dots, n-3, n-1\}. \tag{8.6}$$

Definition 8.2. Let $n$ be a positive integer. The periodic extension $E_n$ from $M_\Lambda$ to $M_{n\Lambda}$ is a map assigning

$$E_n \mu = \sum_{j=1}^{n^d} 1_{\Lambda + t_j} T_{t_j} \mu \tag{8.7}$$

to $\mu \in M_\Lambda$; that is,

$$E_n \mu(D) = \mu(D - t_j), \qquad j = 1, \dots, n^d \tag{8.8}$$

for Borel sets $D \subset \Lambda + t_j$. The periodic extension $E$ from $M_\Lambda$ to $M$ is a map assigning

$$E\mu = \sum_{t \in B_\Lambda} 1_{\Lambda + t} T_t \mu \tag{8.9}$$

to $\mu \in M_\Lambda$.

Lemma 8.1. $E\mu$ is the vague limit of $E_{2n+1}\mu$ as $n$ tends to infinity.

Proof. Let $f \in C_c(\mathbb{R}^d)$. Comparing (8.9) with (8.7) one obtains

$$E\mu(f) = \lim_{n\to\infty} E_{2n+1}\mu(f). \tag{8.10}$$

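Recall that $\Lambda^* = \{\sum_{i=1}^{d} n_i l_i^* : n_i \in \mathbb{Z},\ i = 1, \dots, d\}$, where $l_i^* \cdot l_j = 2\pi \delta_{ij}$. Clearly,

$$\Lambda^* \subset (n\Lambda)^* = n^{-1} \Lambda^*. \tag{8.11}$$

Lemma 8.2. If $n$ is a positive integer and $k \in \Lambda^*$ then

$$\widehat{E_n\mu}(k) = \begin{cases} \hat\mu(k) & \text{if } n \text{ is odd} \\ s(k)\,\hat\mu(k) & \text{if } n \text{ is even,} \end{cases} \tag{8.12}$$

where $s(k) \in \{-1, 1\}$. If $k \in (n\Lambda)^* \setminus \Lambda^*$ then $\widehat{E_n\mu}(k) = 0$. Furthermore,

$$\widehat{E\mu}(k) = 1_{\Lambda^*}(k)\, \hat\mu(k), \qquad k \in \mathbb{R}^d. \tag{8.13}$$

Proof. Using the definition of $E_n\mu$ and (4.40),

$$\widehat{E_n\mu}(k) = \frac{1}{n^d V(\Lambda)} \int_{n\Lambda} e^{-ik\cdot x}\, E_n\mu(dx) = \frac{1}{n^d V(\Lambda)} \sum_{j=1}^{n^d} \int_{\Lambda} e^{-ik\cdot(x + t_j)}\, \mu(dx) = \frac{1}{n^d} \sum_{j=1}^{n^d} e^{-ik\cdot t_j}\, \hat\mu(k). \tag{8.14}$$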
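The phase average in the last member of (8.14) can be checked numerically. The sketch below treats only $d = 1$, with the translates $t_j$ of (8.6) and wave numbers $k = 2\pi p/(nl)$ in $(n\Lambda)^*$; the function name and the integer label `p` are illustrative assumptions. In this labelling, $k \in \Lambda^*$ exactly when `p` is a multiple of `n`.

```python
import cmath
import math


def phase_average(n, p, cell_length=1.0):
    """Average of exp(-i k t_j) over the n translates of a 1D cell.

    t_j = m_j * cell_length / 2 with m_j in {-n+1, -n+3, ..., n-1}, as in (8.6),
    and k = 2*pi*p / (n*cell_length), an element of the dual of n*Lambda.
    """
    k = 2.0 * math.pi * p / (n * cell_length)
    shifts = [m * cell_length / 2.0 for m in range(-n + 1, n, 2)]
    return sum(cmath.exp(-1j * k * t) for t in shifts) / n


odd_on_dual = phase_average(3, 3)      # n odd, k in Lambda*
even_on_dual = phase_average(4, 4)     # n even, k in Lambda*
off_dual = phase_average(4, 1)         # k in (n Lambda)* but not in Lambda*
```

For these inputs the three averages come out as $1$, $-1$ and $0$ up to rounding, matching the three cases distinguished in Lemma 8.2.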

The average of the complex units is $1$, $\pm 1$, or $0$, depending on the case. Taking the limit $n \to \infty$ gives the result for $\widehat{E\mu}(k)$.

Below we use the notations $I_\Lambda[\mu]$ and $I_\Lambda^0$ for the energy functional (4.5) and its infimum (4.8), and $\epsilon_\Lambda(r)_N$, $\epsilon_{\Lambda,N}$, $\omega_\Lambda(r)_N$ and $\omega_{\Lambda,N}$ for (2.4), (2.5), (2.22) and (2.23), respectively, to indicate the dependence on $\Lambda$. Expanding the sum that defines $u_\Lambda$,

$$I_\Lambda[\mu] = \frac12 \int_{\mathbb{R}^d} u(r)\, dE\gamma(r) = \frac12 E\gamma(u), \tag{8.15}$$

where $\gamma = \mu * \mu / V$ for $\mu \in M_\Lambda$.

Proposition 8.1. For any positive integer $n$, $C_{n^m\Lambda}[u]$ ($m = 1, 2, \dots$) is a monotone decreasing sequence.

Proof. Let $\Lambda_m = n^m \Lambda$. From (8.15) or from Lemma 8.2 and the form (4.39) of the energy functional, it follows that for any $\mu \in M_{\Lambda_m}$,

$$I_{\Lambda_{m+1}}[E_n\mu] = I_{\Lambda_m}[\mu]. \tag{8.16}$$

By Theorems 4.1 and 4.2 then

$$C_{\Lambda_{m+1}}[u] = I^0_{\Lambda_{m+1}} \le I^0_{\Lambda_m} = C_{\Lambda_m}[u]. \tag{8.17}$$

For a stable interaction $C_\Lambda[u] \ge 0$ for all $\Lambda$, therefore the limit of $C_{n^m\Lambda}[u]$ as $m$ tends to infinity exists and is bounded below by the best superstability constant $C[u]$. As noted earlier, the limit must, in fact, be $C[u]$. Now we prove a different form of $C[u]$.


Proposition 8.2. If $u$ is bounded then

$$C[u] = \lim_{\rho\to\infty} \frac{1}{\rho} \inf \{ e_p(X) : X \text{ periodic},\ \rho(X) = \rho \}. \tag{8.18}$$

Remark. Here we have not supposed that there are periodic GSCs for arbitrarily high densities. If this is the case, the infimum in (8.18) is attained, and $C[u] = \lim_{n\to\infty} e_p(X_n)/\rho(X_n)$, where $X_n$ is any sequence of periodic GSCs with $\rho(X_n)$ tending to infinity.

Proof.

(i) First we find a sequence $X_n$ of periodic configurations such that $\rho(X_n)$ diverges and $e_p(X_n)/\rho(X_n)$ tends to $C[u]$. Let $\Lambda_n$ be any sequence of parallelepipeds such that $C_{\Lambda_n}[u]$ converges to $C[u]$. For each $n$ let $R_{nm}$ ($m = 1, 2, \dots$) be a sequence of ground state configurations in $\Lambda_n$, $|R_{nm}|$ going to infinity with $m$. According to Eq. (3.5), $\lim_{m\to\infty} \epsilon_{\Lambda_n}(R_{nm}) = C_{\Lambda_n}[u]$. Therefore, one can choose a subsequence $R_{n m_n} \equiv R_n$ such that $\lim \epsilon_{\Lambda_n}(R_n) = \lim \omega_{\Lambda_n}(R_n) = C[u]$. Let $X_n$ denote the periodic extension of $R_n$. Thus, $R_n = X_n \cap \Lambda_n$ and $|R_n|/V(\Lambda_n) = \rho(X_n) \to \infty$. From Eq. (7.22), for any periodic configuration $X$ of period parallelepiped $\Lambda$,

$$\frac{e_p(X)}{\rho(X)} = \frac{|X \cap \Lambda| - 1}{|X \cap \Lambda|}\, \epsilon_\Lambda(X \cap \Lambda) + \frac{1}{2\rho(X)} \sum_{0 \ne r \in B_\Lambda} u(r) = \omega_\Lambda(X \cap \Lambda) - \frac{u(0)}{2\rho(X)}. \tag{8.19}$$

Applying this formula to $X_n$ and $\Lambda_n$, we obtain $\lim_{n\to\infty} e_p(X_n)/\rho(X_n) = C[u]$.

(ii) Suppose there is a sequence $Y_n$ of periodic configurations with respective period cells $\Lambda_n$ such that $\rho(Y_n)$ tends to infinity and

$$\lim_{n\to\infty} \frac{e_p(Y_n)}{\rho(Y_n)} = C_0 < C[u].$$

We can always choose the sequence $\Lambda_n$ so that it tends to infinity in Fisher's sense. We may also suppose that $Y_n \cap \Lambda_n$ is a ground state of $u_{\Lambda_n}$, otherwise we can replace $Y_n$ by the periodic extension of an $|Y_n \cap \Lambda_n|$-particle ground state of $u_{\Lambda_n}$, cf. Eq. (7.23). For any $\varepsilon > 0$ there is an $N$ such that for any $n \ge N$,

$$\Big| \frac{e_p(Y_n)}{\rho(Y_n)} - C_0 \Big| \le \frac{\varepsilon}{3}, \qquad \Big| \epsilon_{\Lambda_n}(Y_n \cap \Lambda_n) - \frac{e_p(Y_n)}{\rho(Y_n)} \Big| \le \frac{\varepsilon}{3}, \qquad \Big| C_{\Lambda_n}[u] - \epsilon_{\Lambda_n, |Y_n \cap \Lambda_n|} \Big| \le \frac{\varepsilon}{3}.$$

The second and third inequalities use Eq. (8.19) and Corollary 3.1, respectively. By assumption, $\epsilon_{\Lambda_n}(Y_n \cap \Lambda_n) = \epsilon_{\Lambda_n, |Y_n \cap \Lambda_n|}$, therefore

$$|C_{\Lambda_n}[u] - C_0| \le \varepsilon \quad \text{if } n \ge N.$$

Thus, $C_{\Lambda_n}[u]$ converges to $C_0 < C[u]$, but this contradicts the hypothesis (2.14).



Remarks. (i) Without the hypothesis (2.14), from (2.9) we still find that Eq. (8.18) holds true if $\lim_{\rho\to\infty}$ is replaced by $\liminf_{\rho\to\infty}$. (ii) The first equality of Eq. (8.19) applies also to unbounded interactions. Equation (8.18) is presumably valid in this case as well, but in the absence of a proof of Corollary 3.1 for unbounded interactions we cannot prove it.

For the sequence $X_n$ used in the proof of Proposition 8.2, Eq. (8.1) holds true. However, the proof of Eq. (8.2) is missing. We may suppose that $\Lambda_n = (2l_n + 1)Q_0$, where $Q_0$ is the unit cube centered at $0$ and the integers $l_n$ tend to infinity with $n$. $X_n$ can be selected so that $|X_n \cap Q_0| = \max_{t \in \mathbb{R}^d} |(X_n + t) \cap Q_0|$, then $\mu_{X_n}(Q_0) \ge V(Q_0) = 1$, because $\mu_{X_n}(\Lambda_n) = V(\Lambda_n)$. The task would be to prove that $\mu_{X_n}(Q_0)$ is a bounded sequence. If this holds, one could find a vaguely convergent subsequence of $\mu_{X_n}$ tending to a nonzero measure which is by definition an IDGS in $\mathbb{R}^d$. The same IDGS in $\mathbb{R}^d$ could also be obtained as the vague limit of a sequence of IDGS's $\mu_n \in M_{\Lambda_n}$ chosen such that $\mu_n(Q_0) \ge V(Q_0)$.

Let $G$ be a locally compact Abelian group (in this paper $G$ is a torus or $G = \mathbb{R}^d$). A measure $\mu$ on $G$ is called shift-bounded if for any compact $A \subset G$ there is a constant $\alpha_A$ such that for all $a \in G$,

$$\mu(A + a) \le \alpha_A. \tag{8.20}$$

Suppose again that $\Lambda_n$ is a sequence of increasing parallelepipeds such that $C_{\Lambda_n}[u]$ converges to $C[u]$. For any $n$, let $\mu_n$ be an IDGS in $\Lambda_n$. Hence, with the notation $\gamma_n = V(\Lambda_n)^{-1} \mu_n * \mu_n$,

$$\lim_{n\to\infty} \frac12 \gamma_n(u_{\Lambda_n}) = \lim_{n\to\infty} \frac12 E\gamma_n(u) = \lim_{n\to\infty} \frac12 \widehat{E\gamma_n}(v) = \lim_{n\to\infty} \frac12 \hat\gamma_n(v) = C[u]. \tag{8.21}$$

$E\gamma_n$ is periodic and, thus, shift-bounded. $\widehat{E\gamma_n}$ is a positive measure which is the Fourier transform of a measure, therefore it is also shift-bounded, see [27] (Prop. 4.9) or [28] (Prop. 3.3). A necessary condition for (8.2) to hold is that these sequences are uniformly shift-bounded. In a special case the proof is given in the following lemma.

Lemma 8.3. If $u(x) \ge c\, e^{-\alpha |x|^2}$ for some $c, \alpha > 0$, then both sequences $E\gamma_n$ and $\widehat{E\gamma_n}$ are uniformly shift-bounded.

Proof. Let $A$ be a unit cube. If $g(k) = K e^{-|k|^2/4\alpha}$ with $K = \big( \int_0^1 e^{-q^2/4\alpha}\, dq \big)^{-d}$, then $1_A * g \ge 1_A$. With $\check f$ denoting the inverse Fourier transform of $f$,

$$\sum_{k \in \Lambda_n^* \cap (A + a)} |\hat\mu_n(k)|^2 = \widehat{E\gamma_n}(A + a) \le \widehat{E\gamma_n}(1_{A+a} * g) = E\gamma_n(\check 1_{A+a}\, \check g) \le E\gamma_n(|\check 1_{A+a}|\, \check g)$$
$$\le \frac{K}{(2\pi)^d (\pi/\alpha)^{d/2}} \int e^{-\alpha |x|^2}\, dE\gamma_n(x) \le \beta \int u(x)\, dE\gamma_n(x) \le \beta\, C[u] \tag{8.22}$$

for $n$ sufficiently large ($\beta = K / [c\,(2\pi)^d (\pi/\alpha)^{d/2}]$). The passage to any compact $A$ is obvious, so $\widehat{E\gamma_n}$ is uniformly shift-bounded. From (8.22) or with a similar independent argument one can show that for any Schwartz function $f$, $\widehat{E\gamma_n}(f)$ ($n = 1, 2, \dots$) is a bounded sequence. We use this fact to prove that $E\gamma_n$ is uniformly shift-bounded. Let again $A$ be a unit cube and $g(x)$ as before; then there is some constant $c_g$ such that

$$E\gamma_n(A + a) \le E\gamma_n(1_{A+a} * g) = \widehat{E\gamma_n}(\hat 1_{A+a}\, \hat g) \le \widehat{E\gamma_n}(|\hat 1_A|\, \hat g) \le \widehat{E\gamma_n}(\hat g) \le c_g, \quad \text{all } n. \tag{8.23}$$

An example for this lemma is the Gaussian pair potential for which, as mentioned already, the unique IDGS is the Lebesgue measure. Other examples are pair potentials with a Gaussian lower bound and a partly negative Fourier transform; the latter condition holds, for example, if $u(0) = 0$, cf. Eq. (11.18). Unfortunately, uniform shift-boundedness of the sequences $E\gamma_n$ and $\widehat{E\gamma_n}$ does not allow one to conclude that the sequence $\mu_n$ is uniformly shift-bounded. Unbounded but $o(\sqrt{V(\Lambda_n)})$ weight rearrangements of $\mu_n$ do not affect uniform shift-boundedness of $E\gamma_n$.

9. Infinite-Density Ground States for v ≥ 0

The most complete information about IDGS's can be obtained when the Fourier transform $v$ of the interaction is nonnegative. The minimum of $I_\Lambda[\mu]$ is attained on any $\mu$ not contributing to the sum in Eq. (4.39), and its value is $I_\Lambda^0 = v(0)/2$, thus, $C_\Lambda[u] = v(0)/2 = C[u]$ for every $\Lambda$.

Proposition 9.1. The Lebesgue measure $\lambda$ is an IDGS in $\Lambda$ if and only if $v \ge 0$, and then it yields

$$I_\Lambda[\lambda] = C[u] = v(0)/2. \tag{9.1}$$

If $v > 0$, the Lebesgue measure is the unique IDGS.

Proof. For $k \in \Lambda^*$,

$$\hat\lambda(k) = \delta_{k,0}, \tag{9.2}$$

and $\lambda$ is the only element of $M_\Lambda$ with Fourier transform $\delta_{k,0}$. So $\lambda$ is an IDGS, and is the only one if $v$ is strictly positive. It remains to prove that the Lebesgue measure is not an IDGS if $v$ is partly negative. In this case consider

$$d\mu_q(x) = (1 + \cos q \cdot x)\, dx, \tag{9.3}$$

and choose $q \in \Lambda^*$ such that $v(q) = \min_{k \in \Lambda^*} v(k) < 0$. Then

$$\hat\mu_q(k) = \delta_{k,0} + \frac12 \big( \delta_{k,q} + \delta_{k,-q} \big) \tag{9.4}$$

and, recalling that $v(-k) = v(k)$,

$$I_\Lambda^0 \le I_\Lambda[\mu_q] = \frac{v(0)}{2} + \frac{v(q)}{4} < \frac{v(0)}{2} = I_\Lambda[\lambda], \tag{9.5}$$

so $\lambda$ is not an IDGS.
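The comparison (9.5) is easy to reproduce with a few lines of code. The sketch below assumes the Fourier form of the energy functional, $I_\Lambda[\mu] = \frac12 \sum_k v(k) |\hat\mu(k)|^2$, which is the form appearing in (10.7); the mode table for `v` is a hypothetical example with a negative value at the chosen $q$, and modes are labelled by integers for simplicity.

```python
def mean_field_energy(v, mu_hat):
    """I[mu] = 1/2 * sum_k v(k) |mu_hat(k)|^2 over finitely many nonzero modes.

    mu_hat maps an integer mode label to the Fourier coefficient of mu.
    """
    return 0.5 * sum(v(k) * abs(c) ** 2 for k, c in mu_hat.items())


def v(k):
    # Hypothetical even Fourier transform with a negative part at modes +-1.
    table = {0: 2.0, 1: -0.8, -1: -0.8}
    return table.get(k, 0.0)


lebesgue_hat = {0: 1.0}                 # coefficients of lambda, Eq. (9.2)
mu_q_hat = {0: 1.0, 1: 0.5, -1: 0.5}    # coefficients of mu_q, Eq. (9.4)

I_lebesgue = mean_field_energy(v, lebesgue_hat)
I_mu_q = mean_field_energy(v, mu_q_hat)
```

With this table, `I_lebesgue` equals $v(0)/2$ and `I_mu_q` equals $v(0)/2 + v(q)/4$, so the modulated measure has the lower energy, as in (9.5).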



Let us make a small detour here to explain the choice of $\mu_q$. The energy functional $I_\Lambda[\mu]$ is the diagonal part of a quadratic form on $M_\Lambda$. The operator underlying it has nice properties.

Proposition 9.2. Let $u$ be an integrable even pair potential, $u(-x) = u(x)$. On $L^2(\Lambda)$ define an integral operator $A$ by setting

$$(A\varphi)(x) = \int_\Lambda u_\Lambda(x - y)\, \varphi(y)\, dy. \tag{9.6}$$

Then $A$ is a bounded self-adjoint operator with eigenvalues $\{v(k) : k \in \Lambda^*\}$ and eigenvectors $\psi_k(x) = \exp(ik \cdot x)$.

Proof. If $k \in \Lambda^*$ then

$$v(k) = \int_{\mathbb{R}^d} u(x)\, e^{-ik\cdot x}\, dx = \int_\Lambda u_\Lambda(x)\, e^{-ik\cdot x}\, dx.$$

Substituting $\varphi = \psi_k$ into Eq. (9.6) gives

$$A\psi_k = v(k)\, \psi_k. \tag{9.7}$$

Thus, $\mu_q$ is an absolutely continuous measure composed of the lowest-lying real eigenvector of $A$ that we must mix with the constant eigenvector in order to make the sum nonnegative.

If $v \ge 0$ and takes on the zero value, besides $\lambda$ many other IDGS's appear. We consider two specific cases.

Proposition 9.3. Suppose that $v \ge 0$ and $v(k) = 0$ for $|k| \ge K_0$, where $K_0 < \infty$. Let $\Lambda$ be a parallelepiped, $B$ any Bravais lattice such that $\Lambda$ is a period cell for $B$, and let $q_{B^*}$ denote the nearest-neighbor distance of $B^*$, the reciprocal lattice of $B$. If $q_{B^*} \ge K_0$ then

$$\mu_{B \cap \Lambda} = \frac{V}{|B \cap \Lambda|} \sum_{x \in B \cap \Lambda} \delta_x \tag{9.8}$$

is an IDGS.

Remarks. (i) For the finite point configurations associated with $\mu_{B \cap \Lambda}$, the proposition partly repeats the results of [1]. It provides also an example to Proposition 4.2. (ii) $\mu_{B \cap \Lambda}$ extends periodically to $\mathbb{R}^d$ into

$$\mu_B = \rho(B)^{-1} \sum_{x \in B} \delta_x, \tag{9.9}$$

and the energy functional shown below actually depends only on $\mu_B$.

Proof. For $\mu_{B \cap \Lambda}$ given by (9.8) and for $k \in \Lambda^*$,

$$\hat\mu_{B \cap \Lambda}(k) = 1_{B^*}(k) = \sum_{K \in B^*} \delta_{k,K}. \tag{9.10}$$

This implies

$$I_\Lambda[\mu_{B \cap \Lambda}] = \frac12 \Big[ v(0) + \sum_{0 \ne K \in B^*} v(K) \Big] \equiv I[\mu_B]. \tag{9.11}$$

The shortest nonzero vector in $B^*$ has a length $q_{B^*}$, at and above which $v(k)$ vanishes, therefore $I_\Lambda[\mu_{B \cap \Lambda}] = v(0)/2 = I_\Lambda^0$.
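The role of the condition $q_{B^*} \ge K_0$ in Proposition 9.3 can be seen in a small one-dimensional computation based on (9.11). The sketch below is only an illustration: the triangular profile chosen for $v$, the cutoff `n_max` of the reciprocal-lattice sum and the function names are assumptions made for this example.

```python
import math


def lattice_energy(v, spacing, n_max=1000):
    """I[mu_B] = 1/2 * [v(0) + sum over nonzero K in B* of v(K)], Eq. (9.11), d = 1.

    B = spacing * Z, so B* = (2*pi/spacing) * Z; the sum is truncated at |n| <= n_max.
    """
    q = 2.0 * math.pi / spacing
    tail = sum(v(n * q) for n in range(-n_max, n_max + 1) if n != 0)
    return 0.5 * (v(0.0) + tail)


K0 = 4.0


def v(k):
    # Hypothetical nonnegative Fourier transform vanishing for |k| >= K0.
    return max(0.0, 1.0 - abs(k) / K0)


dense = lattice_energy(v, spacing=2.0 * math.pi / 5.0)   # q_{B*} = 5 >= K0
sparse = lattice_energy(v, spacing=2.0 * math.pi / 2.0)  # q_{B*} = 2 <  K0
```

The dense lattice reaches the minimal value $v(0)/2$, while the sparse one picks up positive contributions from reciprocal vectors inside the support of $v$ and has a strictly larger energy.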


The second special case is an interaction with a "soft mode", $v(k) = 0$ for $|k| = k_0$ and $v$ is positive otherwise. One may think of the example

$$v(k) = c_1 e^{-c_2 |k|} (|k| - k_0)^2 \tag{9.12}$$

with some positive constants $c_1, c_2, k_0$, but the precise form of $v$ plays no role. Choose $\Lambda$ so that $\Lambda^*$ contains a vector $q$ of length $k_0$; then there are at least two vectors, $\pm q$, of length $k_0$. The following proposition is an immediate consequence of the discussion above. The measure $\mu_q$ appearing in it is an example for an IDGS that is not a stationary point of $I_\Lambda$.

Proposition 9.4. If $v(k) \ge 0$ and $v(k) = 0$ for $|k| = k_0$, the measure $\mu_q$ of Eq. (9.3) with $|q| = k_0$ is an IDGS.

A particularity of the case $v \ge 0$ is that convex combinations of IDGS's are also IDGS's.

Lemma 9.1. If $v \ge 0$ then $I_\Lambda[\mu]$ is a convex functional on $M_\Lambda$, that is, if $\mu, \nu \in M_\Lambda$ and $0 < \alpha, \beta < 1$, $\alpha + \beta = 1$ then

$$I_\Lambda[\alpha\mu + \beta\nu] \le \alpha I_\Lambda[\mu] + \beta I_\Lambda[\nu]. \tag{9.13}$$

Proof. This immediately follows from the form (4.39) of $I_\Lambda[\mu]$ and the convexity of $|z|^2$ on the complex plane:

$$|\alpha z_1 + \beta z_2|^2 \le (\alpha |z_1| + \beta |z_2|)^2 \le \alpha |z_1|^2 + \beta |z_2|^2. \tag{9.14}$$

Proposition 9.5. If $v \ge 0$, the set of IDGS's is convex.

Proof. Let $\mu, \nu \in M_\Lambda$ be two IDGS's and $0 < \alpha, \beta < 1$, $\alpha + \beta = 1$. Then

$$I_\Lambda^0 \le I_\Lambda[\alpha\mu + \beta\nu] \le \alpha I_\Lambda[\mu] + \beta I_\Lambda[\nu] = v(0)/2 = I_\Lambda^0, \tag{9.15}$$

therefore $\alpha\mu + \beta\nu$ is an IDGS.
Corollary 9.1. If v ≥ 0 then the periodic extension of any IDGS from any parallelepiped is an IDGS in Rd . IDGS’s in Rd form a convex set. If v > 0, the Lebesgue measure is the unique IDGS in Rd . 10. Uniformity vs Nonuniformity of High-Density Ground States Proposition 9.1 has an immediate implication on the distribution of particles in highdensity ground states in . If μ is an IDGS and Rm is a sequence of Nm -particle ground states in such that μ Rm weakly converges to μ, then lim

m→∞

|Rm ∩ D| = μ(D) ρ(Rm )

(10.1)

for any open D ⊂ , where ρ(Rm ) = Nm /V . Equation (10.1) is a direct consequence of the definition of an IDGS. If v > 0 then λ is the unique IDGS. Thus, for any sequence


695

R N of N -particle ground states μ R N converges weakly to the Lebesgue measure λ, resulting in lim

N →∞

|R N ∩ D| = λ(D). ρ(R N )

(10.2)

On the other hand, if $v$ is partly negative, the asymptotic distribution of particles in any ground state is necessarily inhomogeneous. We shall prove an analogous result for ground state configurations in infinite space.

In Refs. [1,2] we gave a presumably complete description of defect-free high-density GSC's of interactions with a nonnegative Fourier transform of compact support. If the Fourier transform is strictly positive, a prominent example of which is the Gaussian pair potential, a rigorous identification of GSC's is still missing and, if the decorrelation conjecture of Torquato and Stillinger [13] is valid, the task is nearly impossible in high dimensions. One is then reduced to some probabilistic description, and such an approach may be helpful already for $d < 8$. According to another conjecture of Torquato and Stillinger [14], for the Gaussian potential at $d \le 8$ the close-packed Bravais lattice and its reciprocal lattice should be the unique GSC at low and high densities, respectively. However, this conjecture was disproved by Cohn and Kumar [15], who found uniform close-packed periodic structures of lower energy at low densities in 5 and 7 dimensions. The result below shows the fundamental difference between interactions with a strictly positive or a partly negative Fourier transform. It may also be helpful in the numerical search for low-energy arrangements at high densities.

Theorem 10.1. For arbitrary $d \ge 1$, let $u$ be a bounded, integrable, strongly tempered pair potential. Suppose that there exists a sequence of periodic GSC's $X_n$ of respective densities $\rho_n$ tending to infinity, and let

$$\mu_{X_n} = \rho_n^{-1} \sum_{x \in X_n} \delta_x. \tag{10.3}$$

Write $X_n$ in the form (7.6),

$$X_n = \bigcup_{j=1}^{J_n} (B_n + y_j), \tag{10.4}$$

where $B_n = B(X_n) = \sum_{i=1}^{d} \mathbb{Z} a_{ni}$.

(i) Let the Fourier transform $v$ of the potential be strictly positive. Suppose that the sequence of lattice constants $|a_{ni}|$ ($i = 1, \dots, d$; $n = 1, 2, \dots$) is bounded. Then $\mu_{X_n}$ converges to the Lebesgue measure in distribution sense, that is, for any Schwartz function $f : \mathbb{R}^d \to \mathbb{C}$,

$$\lim_{n\to\infty} \int f(x)\, d\mu_{X_n}(x) = \int f(x)\, dx. \tag{10.5}$$

(ii) If $v$ is partly negative, then (10.5) does not hold for any subsequence of $\mu_{X_n}$.

Remarks. The restriction to bounded interactions is for technical reasons. The result implies that if $v$ is partly negative, any (vaguely or in distribution sense) convergent subsequence of $\mu_{X_n}$ tends to an IDGS in infinite space which is different from the


Lebesgue measure. The IDGS’s are expected to be periodic or almost periodic in this case. Accordingly, the distribution of particles in high-density GSC’s shows a pattern corresponding to the infinite-density limit. On the other hand, if v > 0 then the distribution of particles is asymptotically uniform. We cannot prove that the set of lattice constants is always bounded, but the opposite, the divergence of at least one of the d lattice constants, leaves only two possibilities. The first is that Jn , the number of particles in the primitive cell (called complexity), tends to infinity. The second is that Jn is bounded, implying that X n falls into the union of lower than d-dimensional substructures of diverging separation and density. In the case of the Gaussian potential, in 5 and 7 dimensions, a new numerical result [16] suggests a uniaxially anisotropic high-density GSC with a stronger compression along the distinguished axis than perpendicular to it. According to part (i) of the theorem, the other lattice constants cannot remain bounded if the anisotropy is to survive the limit of infinite density. Proof.

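(i) Let $\Lambda_n$ be a period parallelepiped of $B_n$ and, hence, of $X_n$. Then, by Lemma 7.2, $X_n \cap \Lambda_n$ is a ground state of $u_{\Lambda_n}$. From Eqs. (2.28), (3.8) and (9.1),

$$0 < \omega_{|X_n \cap \Lambda_n|} - \frac{v(0)}{2} < \frac{u_{\Lambda_n}(0)}{2\rho_n}. \tag{10.6}$$

On the other hand,

$$\omega_{|X_n \cap \Lambda_n|} = I_{\Lambda_n}[\mu_{X_n}] = \frac{v(0)}{2} + \frac12 \sum_{0 \ne k \in B_n^*} v(k)\, |\hat\mu_{X_n}(k)|^2, \tag{10.7}$$

where

$$\hat\mu_{X_n}(k) = J_n^{-1} \sum_{j=1}^{J_n} e^{ik\cdot y_j} \quad \text{if } k \in B_n^*. \tag{10.8}$$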

Thus,

v(k)| μ X n (k)|2
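0 such that $v(k) \ge \varepsilon$ if $k \in \bigcup_{|m| \le M} Q_m$. Write

$$\sum_{0 \ne k \in B_n^*} = \sum_{k \in B_n^* \cap (\cup_{|m| \le M} Q_m)} + \sum_{k \in B_n^* \cap (\cup_{|m| > M} Q_m)}. \tag{10.13}$$

Using (10.10), by Cauchy's inequality

$$\Big| \sum_{k \in B_n^* \cap (\cup_{|m| \le M} Q_m)} \hat f(k)\, \hat\mu_{X_n}(k) \Big|^2 \le \frac{2u(0)}{\rho_n \varepsilon} \sum_{k \in B_n^* \cap (\cup_{|m| \le M} Q_m)} |\hat f(k)|^2 \le \frac{2u(0)\, l\, M^d\, \|\hat f\|_\infty^2}{\rho_n \varepsilon}, \tag{10.14}$$

which tends to zero as $n \to \infty$. On the other hand,

$$\Big| \sum_{k \in B_n^* \cap (\cup_{|m| > M} Q_m)} \hat f(k)\, \hat\mu_{X_n}(k) \Big| \le l \sum_{|m| > M} \max_{k \in Q_m} |\hat f(k)|. \tag{10.15}$$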

The sum on the right-hand side is convergent and tends to zero as $M$ tends to infinity, proving Eq. (10.5).

(ii) Fix $Q = [-L/2, L/2]^d$ and define $N_n = \lfloor \rho_n L^d \rfloor$. Let $Q_n = [-L_n/2, L_n/2]^d$, where $L_n^d = N_n/\rho_n$, then $0 \le L^d - L_n^d < \rho_n^{-1}$. Let $q \in Q^* = (2\pi/L)\mathbb{Z}^d$ be the minimizer of $v(k)$ in $Q^*$, and let $q_n = (L/L_n)\, q \in Q_n^* = (2\pi/L_n)\mathbb{Z}^d$. Because $L_n \to L$, $q_n$ tends to $q$ as $n$ increases. Let $\mu_{R_n} \in M_{Q_n}$ be an $N_n$-point approximation of $\mu_{q_n}$, cf. Eq. (9.3), constructed according to Lemma A.1. Let $Z_n$ be the periodic extension of $R_n$. Then $\rho(Z_n) = \rho_n$, and by part (ii) of Lemma 7.1,

$$\lim_{n\to\infty} \frac{e_p(X_n)}{\rho_n} \le \lim_{n\to\infty} \frac{e_p(Z_n)}{\rho_n} = \frac{v(0)}{2} - \frac{|v(q)|}{4}. \tag{10.16}$$

Applying Eq. (8.19) to $X_n$,

$$\frac{e_p(X_n)}{\rho_n} + \frac{u(0)}{2\rho_n} = \omega_{\Lambda_n}(X_n \cap \Lambda_n) = \frac{1}{2V(\Lambda_n)} \int_{\Lambda_n} \int_{\mathbb{R}^d} u(x - y)\, d\mu_{X_n}(y)\, d\mu_{X_n}(x), \tag{10.17}$$

where $\Lambda_n$ is any period parallelepiped of $X_n$. Suppose that $\mu_{X_n}$ tends to $\lambda$ on Schwartz functions. If $u \in S(\mathbb{R}^d)$, this would yield $v(0)/2$ instead of (10.16). If $u$ is not a Schwartz function, replace it by $u * G_t$, compute the limit of $e_p(X_n)/\rho_n$ and let $t$ tend to zero. However,

$$\int_{\mathbb{R}^d} (u * G_t)(x - y)\, dy = v(0) \tag{10.18}$$

for any $t \ge 0$; thus, again, we obtain a contradiction.


11. Interactions without Bravais Lattice Ground States at High Density

For certain potentials having a partially negative Fourier transform we can complete the information about GSC's given in part (ii) of Theorem 10.1. Namely, we will show that for these potentials no periodic GSC $X$ can be a singly or multiply occupied Bravais lattice if $\rho(X)$ is large enough, that is, $J > 1$ must hold and all $y_j$ cannot coincide [cf. (7.6)]. This result is valid in any dimension. Our method is to exclude Bravais lattices first as IDGS's and then as GSC's on the torus and in infinite space at high but finite densities. We note that one-dimensional examples of interactions having a nondegenerate GSC with two points in the unit cell [30,36] and others with no periodic GSC [37] existed already thirty years ago, see also Radin's review [38].

11.1. Compensation by higher harmonics. In the following example the negative contribution to the energy coming from a wave vector $k$ is compensated by the positive contribution coming from integer multiples of $k$.

Theorem 11.1. Consider a bounded integrable pair potential having a partly negative Fourier transform $v$ with the property that for any $k \ne 0$,

$$\sum_{n=1}^{\infty} v(nk) \ge 0. \tag{11.1}$$

Then no Bravais lattice can be an IDGS: if $B$ is a Bravais lattice and the parallelepiped $\Lambda$ is a period cell for $B$, then $I_\Lambda[\mu_{B \cap \Lambda}] > I_\Lambda^0$. Furthermore, at finite, high enough densities there is no Bravais lattice among the ground state configurations on tori and in infinite space, and no lattice tower of the form

$$B^J = \bigcup_{j=1}^{J} B, \tag{11.2}$$

where $B$ is a Bravais lattice and $J \ge 1$, can be a GSC if $\rho = J\rho(B)$ is high enough. How large $\rho$ must be depends on the details of the interaction. As an example, suppose that (i) there exist $0 < k_1 < k_2 \le 2k_1$ such that $v(k) < 0$ for $k_1 < |k| < k_2$ and $v(k) \ge 0$ otherwise, (ii) $v(2k) \ge |v(k)|$ for $k_1 < |k| < k_2$, and (iii) $v(k) = 0$ for $|k| \ge 3q$ with some $q$ such that $k_1 < q < k_2$. Then there is no Bravais lattice GSC if $\rho \ge (3/2)^{d-1} (q/\pi)^d$.

Proof. Consider Eq. (9.11). If $K \in B^*$ then $nK \in B^*$ as well. Summing over $B^*$ along lattice half-lines, the contribution of each partial sum is nonnegative, which shows that $I_\Lambda[\mu_{B \cap \Lambda}] \ge v(0)/2$. On the other hand, choosing any $q$ such that $v(q) < 0$, $I_\Lambda^0 \le I_\Lambda[\mu_q] = v(0)/2 - |v(q)|/4$, cf. Eq. (9.5), thus, $\mu_{B \cap \Lambda}$ is not an IDGS.

The absence of Bravais lattice ground states at high but finite densities is true both in finite and infinite volume. To see this, suppose that $\Lambda$ is a period cell of $B$. Then

$$I_\Lambda[\mu_{B^J \cap \Lambda}] = \omega_\Lambda(B^J \cap \Lambda) = \omega_\Lambda(B \cap \Lambda) \ge \frac{v(0)}{2}. \tag{11.3}$$

699

On the other hand, if $\rho$ is high enough then according to Lemma A.1 and due to the continuity of $I_\Lambda[\mu]$ one can choose a configuration $R = (r_1, \dots, r_{|B^J \cap \Lambda|}) \subset \Lambda$ such that $|I_\Lambda[\mu_R] - I_\Lambda[\mu_q]| < |v(q)|/8$, and therefore $I_\Lambda[\mu_R] = \omega_\Lambda(R)$
1 and

$$\frac{c}{|k|^{d+2}} \le v(k) \le \frac{c}{|k|^{n+2+\varepsilon}} \quad \text{at } |k| \ge K \tag{11.21}$$

for some positive integer $n < d$, positive $c$ and $0 < \varepsilon < 1$, then IDGS's may be singular continuous measures of the form

$$\mu_0 = \lambda^{d-n} \times \mu_{B^{(n)} \cap \Lambda^{(n)}}, \tag{11.22}$$

cf. Eq. (6.14), in which case high-density ground state configurations in $\Lambda$ are strongly anisotropic, and may or may not be Bravais lattices.

(ii) If $d \ge 1$ and

$$v(k) \ge \frac{c}{|k|^3} \quad \text{for } |k| \ge K, \tag{11.23}$$

then no IDGS is a stationary point of the energy functional. Furthermore, high-density ground state configurations are different from Bravais lattices and their towers (11.2) in $\Lambda$ and, if $u$ is bounded, also in infinite space.

Remark. At the end of Sect. 6 we discussed the case when $u(x) \sim |x|^{-d+\zeta}$ with $0 < \zeta < d$ near the origin. By a Tauberian argument $v(k) \sim |k|^{-\zeta}$ at infinity. If also $v^- \ne 0$, then $d \ge 4$ and $3 < \zeta < d$ is covered by (11.21), and $0 < \zeta < \min\{d, 3\}$ is covered by (11.23). If $0 < \zeta \le 1$ then $\sum_{n=1}^{\infty} v(nk) = \infty$ for all nonzero $k$, which is a special case also of Theorem 11.1.


One-dimensional example of (ii). Let

$$v(k) = \frac{1}{k^2}\, e^{-a/k^2} + (1 - k^2)\, e^{-k^2}. \tag{11.24}$$

For $a \ge 3$ this function takes on negative values on two symmetric bounded intervals and decays as $1/k^2$ at infinity. Thus, $u$ is bounded but has a cusp at zero, and

$$\sum_{k \in \Lambda^*} |\hat\mu(k)|^2 < \infty$$

must hold for an IDGS. Hence, any IDGS is absolutely continuous in $\Lambda = [-L/2, L/2]$ and is different from the Lebesgue measure, because $v^- \ne 0$. High-density GSC's form a non-arithmetic sequence.

Proof. Let us return to the tempered measure $G_t * \mu$ and Eq. (4.47). For any $t > 0$,

$$\frac{d}{dt} I_\Lambda[G_t * \mu] = -\sum_{k \in \Lambda^*} v(k)\, |k|^2 e^{-2t|k|^2} |\hat\mu(k)|^2 = \sum_{k \in \Lambda^*} v^-(k)\, |k|^2 e^{-2t|k|^2} |\hat\mu(k)|^2 - \sum_{k \in \Lambda^*} v^+(k)\, |k|^2 e^{-2t|k|^2} |\hat\mu(k)|^2, \tag{11.25}$$

because both sums are absolutely convergent. Now $\lim_{t\to 0} dI_\Lambda[G_t * \mu]/dt$ exists, it can be $-\infty$ for a general $\mu$, but it must be finite nonnegative if $\mu$ is an IDGS. Thus, for any IDGS $\mu$ in $M_\Lambda$,

$$\sum_{k \in \Lambda^*} v^+(k)\, |k|^2 |\hat\mu(k)|^2 \le \sum_{k \in \Lambda^*} v^-(k)\, |k|^2 |\hat\mu(k)|^2 \le \sum_{k \in \Lambda^*} v^-(k)\, |k|^2 < \infty. \tag{11.26}$$

Let $B$ be a Bravais lattice commensurate with $\Lambda$ and let $\mu_{B \cap \Lambda}$ be the associated measure, cf. Eq. (9.8). Then

$$\sum_{k \in \Lambda^*} v^+(k)\, |k|^2 |\hat\mu_{B \cap \Lambda}(k)|^2 = \sum_{k \in B^*} v^+(k)\, |k|^2 = \infty \tag{11.27}$$
because of (11.20), so $\mu_{B \cap \Lambda}$ is not an IDGS. Combining (11.26) with (11.20), we find that $|\hat\mu(k)|^2$

$$u(x) = \begin{cases} u_0 > 0 & \text{for } |x| < d_0 \\ 0 & \text{for } |x| \ge d_0. \end{cases} \tag{12.1}$$
We define $u$ to be lower semicontinuous. This potential is applied in soft matter physics to model a system of interpenetrating micelles in a solvent [39–42]. The interaction has a partly negative Fourier transform. In three dimensions it reads ($k = |k|$)

$$v(k) = \frac{4\pi u_0 d_0^3}{(k d_0)^2} \Big[ \frac{\sin k d_0}{k d_0} - \cos k d_0 \Big]. \tag{12.2}$$

Note that $v$ is not absolutely integrable. The first and deepest minimum of $v(k)$ is at $k = k_m = 6.12/d_0$. If this was to determine the periodicity of the ground state then the ground state would be the dual of an fcc lattice of lattice constant $k_m$, that is, a bcc lattice of nearest-neighbor distance $\sqrt{6}\,\pi/k_m = (7.70/6.12)\, d_0$, cf. Eq. (15) of Ref. [2]. Instead, we will see that the ground state is an fcc lattice of nearest-neighbor distance $d_0$.

We show that with a proper choice of $\Lambda$, the ground state configurations can be given for every particle number $N$. In $d$ dimensions choose $a_1, \dots, a_d$ such that $|a_j| = d_0$ and the vectors $a_j$ generate a $d$-dimensional close-packed lattice,

$$B = \Big\{ \sum_{j=1}^{d} n_j a_j : n_1, \dots, n_d \in \mathbb{Z} \Big\}. \tag{12.3}$$

Choose $\Lambda$ to be a period cell for $B$ containing $|B \cap \Lambda| = M$ points of $B$ and having a side length $\ge 2d_0$ in every direction. The volume of $\Lambda$ is

$$V = M\, |\det[a_1 \dots a_d]|, \tag{12.4}$$

so the volume of the unit cell of $B$ is

$$\rho_B^{-1} = V/M = |\det[a_1 \dots a_d]| = \begin{cases} \sqrt{3}\, d_0^2/2 & \text{if } d = 2 \\ d_0^3/\sqrt{2} & \text{if } d = 3. \end{cases} \tag{12.5}$$

The property of $B$ is that for the given density $\rho_B$ it has the largest nearest-neighbor distance or for the given nearest-neighbor distance $d_0$ the largest density among Bravais lattices. We make also the hypothesis (trivial in two and proven in three dimensions) that $B$ realizes the densest packing of hard spheres of diameter $d_0$. Note that with the choice of $\Lambda$ we exclude the occurrence of other close-packed (e.g. hcp) structures which can be ground states in suitable domains or in infinite space. If $N < M$, the ground state in $\Lambda$ is continuously degenerate, any configuration with all inter-particle distances $\ge d_0$ is a ground state. Below we describe the ground state configurations (GSC) for $N \ge M$, modulo cyclic translations in $\Lambda$.

Proposition 12.1. The IDGS is $B \cap \Lambda$,

$$C_\Lambda[u] = \omega_\Lambda(B \cap \Lambda). \tag{12.6}$$

For $N$ finite, let $n \ge 1$ be an integer and let $nM \le N \le (n+1)M$. Then there are $\binom{M}{N - nM}$ GSC's in $\Lambda$. In each of them $(n+1)M - N$ points of $B \cap \Lambda$ are occupied by $n$ particles and $N - nM$ points of $B \cap \Lambda$ are occupied by $n + 1$ particles. Moreover,

$$\epsilon_N = u_0\, \frac{nV}{N - 1} \Big[ 1 - \frac{(n+1)M}{2N} \Big], \tag{12.7}$$

so that

$$C_\Lambda[u] = \lim_{N\to\infty} \epsilon_N = \frac{u_0}{2\rho_B} = \frac{u_0}{2} \times \begin{cases} \sqrt{3}\, d_0^2/2 & \text{if } d = 2 \\ d_0^3/\sqrt{2} & \text{if } d = 3. \end{cases} \tag{12.8}$$

Proof. The proof goes by induction over $N$. Because each side of $\Lambda$ has a length $\ge 2d_0$, only a single term of $u_\Lambda$ can be nonvanishing, thus, $u_\Lambda(r) = u_0$ or $0$. If $N = M$, $B \cap \Lambda$ has zero energy, therefore it is a GSC. Any perturbation of $B \cap \Lambda$ creates at least one pair of particles of distance $< d_0$, therefore of energy $u_0$. Thus, $B$ is the unique GSC in $\Lambda$. Suppose we know the result up to $N$, where $nM \le N < (n+1)M$, and want to prove it for $N + 1$. Adding a particle to a ground state of $N$ particles costs the least if it is placed on a point of $B \cap \Lambda$ occupied by $n$ particles. The increase in energy is $n u_0$; at any other place it is at least twice as much. No relaxation of the configuration thus obtained can decrease the energy, so it is a ground state for $N + 1$ particles. The ground state energy for $N$ particles is

$$E_0(N) = (N - nM)\, \frac{n(n+1)}{2}\, u_0 + [(n+1)M - N]\, \frac{n(n-1)}{2}\, u_0 = n u_0 \Big[ N - \frac{n+1}{2} M \Big]. \tag{12.9}$$

Dividing by $N(N-1)/V$ yields $\epsilon_N$.
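The two expressions for $E_0(N)$ in (12.9) and the limit (12.8) can be cross-checked with exact rational arithmetic. The following sketch simply counts pairs sharing a site, as described in Proposition 12.1; the function names and the sample values of $M$ and $V$ are illustrative assumptions.

```python
from fractions import Fraction


def ground_state_energy(N, M, u0=1):
    """E_0(N) for N >= M, counting pairs on shared sites.

    N - n*M sites carry n+1 particles and (n+1)*M - N sites carry n particles;
    a site with m particles contributes m*(m-1)/2 pairs of energy u0 each.
    """
    n = N // M
    heavy_sites = N - n * M
    light_sites = (n + 1) * M - N
    return u0 * (heavy_sites * Fraction(n * (n + 1), 2)
                 + light_sites * Fraction(n * (n - 1), 2))


def closed_form(N, M, u0=1):
    """Right-hand side of (12.9): n*u0*[N - (n+1)*M/2]."""
    n = N // M
    return n * u0 * (N - Fraction((n + 1) * M, 2))


def energy_per_pair(N, M, V, u0=1):
    """E_0(N) divided by N(N-1)/V, the quantity appearing in (12.7)."""
    return ground_state_energy(N, M, u0) * Fraction(V) / (N * (N - 1))


# With M = 4 sites in a cell of volume V = 8, u0/(2*rho_B) = V/(2*M) = 1.
large_N_value = float(energy_per_pair(4 * 10**6, 4, 8))
```

For large $N$ the value `large_N_value` is close to $u_0/(2\rho_B)$, in line with (12.8).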

An unpleasant feature of the penetrable sphere model is the discontinuity of its energy at zero temperature when $N \ge M$. The values of $U(r)_N$ can only be integer multiples of $u_0$ taken from the set $\{E_0(N), \dots, N(N-1)u_0/2\}$. Therefore,

$$e^{-\beta[F_N(\beta) - E_0(N)]} \equiv \frac{1}{V^N} \int_{\Lambda^N} e^{-\beta[U(r)_N - E_0(N)]}\, d(r)_N = \sum_{m = m_0}^{N(N-1)/2} e^{-\beta m u_0}\, \frac{\lambda_{dN}(A_m)}{V^N}. \tag{12.10}$$

Here $\lambda_{dN}$ denotes the $dN$-dimensional Lebesgue measure,

$$A_m = \big\{ (r)_N \in \Lambda^N : U(r)_N = E_0(N) + m u_0 \big\}, \tag{12.11}$$

and

$$m_0 = \min\{ m : \lambda_{dN}(A_m) > 0 \}. \tag{12.12}$$

Now $A_0$ is the set of translates of $B \cap \Lambda$ increased by the finite number of possible assignments of $N$ particles to $M$ sites, so that $\lambda_d(A_0) > 0$, but its higher than $d$-dimensional Lebesgue measures vanish. On the other hand,

$$\lambda_{dN}(A_{N(N-1)/2}) = V\, \lambda_d(B_{d_0/2})^{N-1} > 0, \tag{12.13}$$

where $B_{d_0/2} = \{ x \in \mathbb{R}^d : |x| < d_0/2 \}$. Therefore $0 < m_0 \le N(N-1)/2$. From (12.10) it follows that

$$e^{-\beta m_0 u_0}\, \frac{\lambda_{dN}(A_{m_0})}{V^N} \le e^{-\beta[F_N(\beta) - E_0(N)]} \le e^{-\beta m_0 u_0}. \tag{12.14}$$


Taking the logarithm and dividing by $-\beta$,

$$m_0 u_0 \le F_N(\beta) - E_0(N) \le m_0 u_0 + \frac{1}{\beta} \ln\!\big[ V^N / \lambda_{dN}(A_{m_0}) \big]. \tag{12.15}$$

Letting $\beta$ tend to infinity in (12.15) and in the mean energy

$$E_N(\beta) = -\frac{\partial \ln Z_{\Lambda,N}}{\partial \beta} = E_0(N) + \frac{\sum_{m=m_0}^{N(N-1)/2} m u_0\, e^{-\beta m u_0}\, \lambda_{dN}(A_m)}{\sum_{m=m_0}^{N(N-1)/2} e^{-\beta m u_0}\, \lambda_{dN}(A_m)}, \tag{12.16}$$

we find the following.

Proposition 12.2. For the penetrable sphere model in a finite volume,

$$\lim_{\beta\to\infty} F_N(\beta) = \lim_{\beta\to\infty} E_N(\beta) = E_0(N) + m_0 u_0, \tag{12.17}$$

where m 0 is given by Eq. (12.12). From the point of view of thermodynamics, it is, therefore, more adequate to consider E 0 (N ) + m 0 u 0 as the ground state energy. The limit of f N (β) as N goes to infinity depends on the limit of m 0 /N 2 and on the large N behavior of λd N (Am 0 ), and remains to be answered. Note added. After the submission of this paper I learned that a part of the tools I developed for studying ground state configurations at high densities existed already in abstract potential theory. The first result dates from 1923 and is due to Fekete [43]. In an article about the distribution of the roots of polynomials of integer coefficients, Fekete considered the quantity dn (S), the maximum of the geometric mean of the n(n −1)/2 distances between points in an n-point set, chosen from an infinite compact set S. He proved that dn (S) was monotone decreasing as n increased, and called the limit the transfinite diameter of S. If we take the pair potential u(x − y) = − ln |x − y|, then the ground state energy per pair of n particles in S is − ln dn (S). Thus, Fekete’s result is the proof of the monotone increase of the ground state energy per pair [Eq. (3.4)] and of Proposition 3.1 in this special case. What I called the best superstability constant is minus the logarithm of the transfinite diameter. The first proof of the monotone increase of the ground state energy per pair for a general symmetric kernel u(x, y) was given by Choquet [44], who also proved a theorem corresponding to the present Theorem 4.2. The best superstability constant appears in [44] under the name of Fekete constant. The analogous theorem for a kernel which is bounded on the diagonal (Theorem 4.1 here) can be found in a recent paper by Farkas and Nagy [45]. Early results in abstract potential theory are summarized in Fuglede’s paper [46]. Acknowledgements. I thank Szilárd Révész for drawing my attention to related results in abstract potential theory. 
This work was supported by OTKA Grants K67980 and K77629.

Appendix

In this appendix we collect a few results on the convergence of measures on compact sets. Let $\Lambda \subset \mathbb{R}^d$ be compact and let $M_\Lambda$ be the set of normalized Borel measures on $\Lambda$.

Lemma A.1. Any $\mu \in M_\Lambda$ is the weak limit of a sequence $\{\mu_{R_N}\}_{N=1}^{\infty}$ of discrete measures of the form (4.2).

Proof. We may suppose that $\Lambda$ is a cube; otherwise, we take a cube that covers $\Lambda$ and extend $\mu$ with zero value to a measure on the cube (cf. Lemma A.4). We may also suppose that $\Lambda$ is a unit cube. Given $\mu \in M_\Lambda$ (thus, $\mu(\Lambda) = 1$), divide $\Lambda$ into a disjoint union of $n^d$ semi-open cubes $Q^n_j$ of side length $1/n$, where $n = \lfloor \log_2(N+1) \rfloor$. In $Q^n_j$ choose $N_j$ points $\{r^n_{ji}\}_{i=1}^{N_j}$, where $N_j = \lfloor N\mu(Q^n_j) \rfloor$ or $N_j = \lceil N\mu(Q^n_j) \rceil$, in such a way that $\sum_j N_j = N$. Thus,

$$|\mu(Q^n_j) - N_j/N| \le 1/N. \tag{A.1}$$

Define

$$\mu_{R_N} = \frac{1}{N}\sum_{j=1}^{n^d}\sum_{i=1}^{N_j} \delta_{r^n_{ji}}, \tag{A.2}$$

where $\delta_r$ is the Dirac measure at $r$. For any continuous function $f$, let

$$f_{n,j} = \frac{1}{\mu(Q^n_j)}\int_{Q^n_j} f\,d\mu. \tag{A.3}$$

Because $\Lambda$ is compact, $f$ is uniformly continuous, so there is some $\delta_n$ which goes to zero as $n$ increases, such that for each $j$,

$$\left|\frac{1}{N_j}\sum_{i=1}^{N_j} f(r^n_{ji}) - f_{n,j}\right| \le \delta_n. \tag{A.4}$$

It follows that

$$\left|\int f\,d\mu_{R_N} - \int f\,d\mu\right| \le \delta_n + \sum_{j=1}^{n^d}\left|\frac{N_j}{N} - \mu(Q^n_j)\right| |f_{n,j}|. \tag{A.5}$$

The second term is $O(n^d/N)$, so both terms tend to zero as $N$ goes to infinity. Thus, $\mu_{R_N} \rightharpoonup \mu$.

Next, we show that given any infinite sequence of normalized measures on a compact set, one can select a subsequence converging weakly to a normalized measure. We start with the compact set $\Lambda$ being a unit cube equipped with periodic boundary conditions ($d$-dimensional torus). The Fourier transform (4.40) of a $\mu \in M_\Lambda$ is a function of positive type [29] on $\Lambda^* = 2\pi\mathbb{Z}^d$, meaning that for any integer $l$, any $k_1, \ldots, k_l$ in $\Lambda^*$ and any complex numbers $z_1, \ldots, z_l$,

$$\sum_{i,j=1}^{l} \hat\mu(k_i - k_j)\, z_i \bar z_j \ge 0. \tag{A.6}$$

It follows, among other things, that

$$|\hat\mu(k)| \le \hat\mu(0) = 1. \tag{A.7}$$
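The positive-type property (A.6) and the bound (A.7) can be checked numerically for a discrete probability measure on the one-dimensional torus. The following is a minimal sketch under stated assumptions: the sign convention $\hat\mu(k) = \int e^{-ikx}\,d\mu(x)$ is assumed here (the paper's convention is fixed in Eq. (4.40), which is not reproduced in this section), and the sample points, frequencies and function names are purely illustrative.

```python
import cmath
import math
import random

def fourier_coefficient(points, k):
    """mu_hat(k) for the normalized measure with equal atoms at `points` in [0, 1).
    Assumed convention: mu_hat(k) = integral of exp(-i k x) d mu(x)."""
    return sum(cmath.exp(-1j * k * x) for x in points) / len(points)

def quadratic_form(points, ks, zs):
    """Left-hand side of (A.6): sum over i, j of mu_hat(k_i - k_j) z_i conj(z_j)."""
    total = 0j
    for ki, zi in zip(ks, zs):
        for kj, zj in zip(ks, zs):
            total += fourier_coefficient(points, ki - kj) * zi * zj.conjugate()
    return total

random.seed(0)
points = [random.random() for _ in range(50)]           # illustrative atoms
ks = [2 * math.pi * m for m in (-3, -1, 0, 2, 5)]        # elements of 2*pi*Z
zs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in ks]

q = quadratic_form(points, ks, zs)
print("quadratic form:", q)  # real part nonnegative, imaginary part at rounding level
```

The form is nonnegative because it equals the $\mu$-integral of $|\sum_i z_i e^{-ik_i x}|^2$; taking $l = 2$ in (A.6) gives (A.7).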


Lemma A.2. From any sequence $\mu_n \in M_\Lambda$ one can select a subsequence $\mu_{n_i}$ such that $\hat\mu_{n_i}$ converges pointwise to the Fourier transform $\hat\mu$ of a $\mu \in M_\Lambda$. We say that $\mu_{n_i}$ converges to $\mu$ in Fourier transform.

Proof. Because $\Lambda^*$ is a countable set, we can use the diagonal process [23] to choose a subsequence $\hat\mu_{n_i}$ which converges pointwise to some function $\psi$ on $\Lambda^*$,

$$\lim_{i\to\infty} \hat\mu_{n_i}(k) = \psi(k), \quad \text{every } k \in \Lambda^*. \tag{A.8}$$

Fixing any integer $l$, $k_1, \ldots, k_l$ in $\Lambda^*$ and complex numbers $z_1, \ldots, z_l$, writing down Eq. (A.6) for $\mu_{n_i}$ and taking the limit $i \to \infty$, we find

$$\sum_{i,j=1}^{l} \psi(k_i - k_j)\, z_i \bar z_j \ge 0. \tag{A.9}$$

Thus, $\psi$ is a function of positive type, $\psi(0) = 1$, and by Bochner's theorem [29], $\psi = \hat\mu$, the Fourier transform of a probability measure $\mu$ on $\Lambda$.

Lemma A.3. $\mu_n$ converges to $\mu$ in Fourier transform if and only if it converges weakly to $\mu$.

Proof. If $\mu_n \rightharpoonup \mu$ then $\lim \mu_n(f) = \mu(f)$ for

$$f(x) = \sin k\cdot x,\ \cos k\cdot x, \tag{A.10}$$

therefore $\lim \hat\mu_n(k) = \hat\mu(k)$ for all $k \in \Lambda^*$. Suppose now that $\mu, \mu_n \in M_\Lambda$, $\hat\mu = \lim \hat\mu_n$. Consider the trigonometric polynomials,

$$p(x) = \sum_{k\in\Lambda^*} a_k e^{ik\cdot x}, \tag{A.11}$$

where $a_k \ne 0$ for a finite number of $k$. Any continuous function $f$ on $\Lambda$ is the uniform limit of a sequence $p_m$ of the form (A.11). Given any $\varepsilon > 0$, we fix an $m$ so that $\|f - p_m\|_\infty \le \varepsilon$, and choose $N_\varepsilon$ such that for $n > N_\varepsilon$

$$|\mu(p_m) - \mu_n(p_m)| \le \sum_{k\in\Lambda^*} |a^m_k|\,|\hat\mu(k) - \hat\mu_n(k)| \le \varepsilon. \tag{A.12}$$

Because $|\mu(f - p_m)| \le \varepsilon$ and $|\mu_n(p_m - f)| \le \varepsilon$, we obtain

$$|\mu(f) - \mu_n(f)| \le 3\varepsilon \quad \text{if } n > N_\varepsilon, \tag{A.13}$$

proving that $\mu_n$ converges weakly to $\mu$.

The extension of the above result to the general case, when the compact domain may not be a cube and the boundary condition may not be periodic, is immediate.

Lemma A.4. Let $\Lambda_0 \subset \mathbb{R}^d$ be a compact Borel set and let $M_{\Lambda_0}$ denote the set of normalized Borel measures on $\Lambda_0$. Given any infinite sequence $\mu_n \in M_{\Lambda_0}$, one can select a subsequence converging weakly to some $\mu \in M_{\Lambda_0}$.

Proof. Take a cube $\Lambda \supset \Lambda_0$. Let $\tilde\mu_n \in M_\Lambda$ with $\operatorname{supp} \tilde\mu_n \subset \Lambda_0$ be the extension of $\mu_n$ to $\Lambda$. Constructing a normalized limit measure $\tilde\mu \in M_\Lambda$ according to Lemma A.2, it is clear that $\operatorname{supp} \tilde\mu \subset \Lambda_0$, therefore $\mu = \tilde\mu|_{\Lambda_0} \in M_{\Lambda_0}$. Because $\tilde\mu$ is the weak limit of $\tilde\mu_n$, cf. Lemma A.3, $\mu$ is the weak limit of $\mu_n$.
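The discretization used in the proof of Lemma A.1 can be illustrated in one dimension. The sketch below is only an example under stated assumptions: the measure with density $2x$ on $[0,1)$, the placement of points at equally spaced positions inside each cell, the floor in $n = \lfloor \log_2(N+1) \rfloor$ and all function names are choices made here for illustration, not taken from the paper.

```python
import math

def allocate_counts(cell_masses, n_points):
    """Choose N_j as floor(N*mu(Q_j)) or that value plus one, with sum N_j = N.
    Leftover points go to the cells with the largest fractional parts."""
    raw = [n_points * m for m in cell_masses]
    counts = [math.floor(r) for r in raw]
    leftover = n_points - sum(counts)
    order = sorted(range(len(raw)), key=lambda j: raw[j] - counts[j], reverse=True)
    for j in order[:leftover]:
        counts[j] += 1
    return counts

def discretize(n_points):
    """Discretize the measure with density 2x on [0, 1) into n_points atoms."""
    n = math.floor(math.log2(n_points + 1))          # number of cells
    masses = [((j + 1) / n) ** 2 - (j / n) ** 2 for j in range(n)]
    counts = allocate_counts(masses, n_points)
    points = []
    for j, c in enumerate(counts):
        # c equally spaced points inside the cell [j/n, (j+1)/n)
        points.extend((j + (i + 0.5) / c) / n for i in range(c))
    return n, masses, counts, points

N = 1000
n, masses, counts, points = discretize(N)
discrete_mean = sum(points) / N      # integral of f(x) = x against the discrete measure
exact_mean = 2.0 / 3.0               # integral of x * 2x over [0, 1)
print(n, sum(counts), discrete_mean, exact_mean)
```

For $f(x) = x$ the oscillation of $f$ on a cell is at most $1/n$, so the right-hand side of (A.5) is bounded by $1/n + n/N$ in this example.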



References

1. Sütő, A.: Crystalline ground states for classical particles. Phys. Rev. Lett. 95, 265501 (2005)
2. Sütő, A.: From bcc to fcc: Interplay between oscillating long-range and repulsive short-range forces. Phys. Rev. B 74, 104117 (2006)
3. Likos, C.N.: Going to ground. Nature 440, 433–434 (2006)
4. Likos, C.N., Mladek, B.M., Gottwald, D., Kahl, G.: Why do ultrasoft repulsive particles cluster and crystallize? Analytical results from density-functional theory. J. Chem. Phys. 126, 224502 (2007)
5. Sütő, A.: A possible mechanism of concurring diagonal and off-diagonal long-range order for soft interactions. J. Math. Phys. 50, 032107 (2009)
6. Vlasov, A.: On the kinetic theory of an assembly of particles with collective interaction. J. Phys. (USSR) IX, 25–40 (1945)
7. Ruelle, D.: Statistical Mechanics: Rigorous Results. New York: W. A. Benjamin, 1969
8. Kirzhnits, D.A., Nepomnyashchii, Yu.A.: Coherent crystallization of quantum liquid. Sov. Phys. JETP 32, 1191–1197 (1971)
9. Nepomnyashchii, Yu.A.: Coherent crystals with one-dimensional and cubic lattices. Theor. Math. Phys. 8, 928–938 (1971)
10. Sütő, A.: Superimposed particles in 1D ground states. J. Phys. A: Math. Theor. 44, 035205 (2011)
11. Stillinger, F.H.: Phase transitions in the Gaussian core system. J. Chem. Phys. 65, 3968–3974 (1976)
12. Lang, A., Likos, C.N., Watzlawek, M., Löwen, H.: Fluid and solid phases of the Gaussian core model. J. Phys. Condens. Matter 12, 5087–5108 (2000)
13. Torquato, S., Stillinger, F.H.: New conjectural lower bounds on the optimal density of sphere packings. Exp. Math. 15, 307–331 (2006)
14. Torquato, S., Stillinger, F.H.: New duality relations for classical ground states. Phys. Rev. Lett. 100, 020602 (2008)
15. Cohn, H., Kumar, A.: Counterintuitive ground states in soft-core models. Phys. Rev. E 78, 061113 (2008)
16. Cohn, H., Kumar, A., Schürmann, A.: Ground states and formal duality relations in the Gaussian core model. Phys. Rev. E 80, 061116 (2009)
17. Lebowitz, J.L., Penrose, O.: Rigorous treatment of the van der Waals-Maxwell theory of liquid vapor transition. J. Math. Phys. 7, 98 (1966)
18. Gates, D.J., Penrose, O.: The van der Waals limit for classical systems I. A Variational Principle. Commun. Math. Phys. 15, 255–276 (1969)
19. Benois, O., Bodineau, T., Presutti, E.: Large deviations in the van der Waals limit. Stoch. Proc. Appl. 75, 89–104 (1998)
20. Carlen, E.A., Carvalho, E.C., Esposito, L., Lebowitz, J.L., Marra, R.: Droplet minimizers for the Cahn-Hilliard free energy functional. J. Geom. Anal. 16, 233–264 (2006)
21. Carlen, E.A., Carvalho, E.C., Esposito, L., Lebowitz, J.L., Marra, R.: Droplet minimizers for the Gates-Lebowitz-Penrose free energy functional. Nonlinearity 22, 2919–2952 (2009)
22. Modica, L.: The gradient theory of phase transitions and the minimal interface criterion. Arch. Rat. Mech. Anal. 98, 123–142 (1987)
23. Rudin, W.: Real and complex analysis. New York: McGraw-Hill, 1986
24. Ruelle, D.: Classical statistical mechanics of a system of particles. Helv. Phys. Acta 36, 183–197 (1963)
25. Kiessling, M.K.-H.: A note on classical ground state energies. J. Stat. Phys. 136, 275–284 (2009)
26. Witten, T.A., Pincus, P.A.: Colloid stabilization by long grafted polymers. Macromolecules 19, 2509–2513 (1986)
27. Berg, Ch., Forst, G.: Potential theory on locally compact Abelian groups. Berlin, Heidelberg, New York: Springer-Verlag, 1975
28. Hof, A.: On diffraction by aperiodic structures. Commun. Math. Phys. 169, 25–43 (1995)
29. Reed, M., Simon, B.: Functional Analysis. New York: Academic Press, 1980
30. Ventevogel, W.J.: On the configuration of a one-dimensional system of interacting particles with minimum potential energy per particle. Physica 92, 343–361 (1978)
31. Theil, F.: A proof of crystallisation in two dimensions. Commun. Math. Phys. 262, 209–239 (2006)
32. Radin, C.: Existence of ground state configurations. Math. Phys. Electron. J. 10(6) (2004)
33. Bellissard, J., Radin, C., Shlosman, S.: The characterization of ground states. J. Phys. A: Math. Theor. 43, 305001 (2010)
34. Sinai, Ya.G.: Theory of phase transitions: Rigorous results. Budapest, Akadémiai Kiadó, New York: Pergamon Press, 1982, Lemma 2.1
35. Radin, C.: Classical ground states in one dimension. J. Stat. Phys. 35, 109–117 (1984)
36. Nicolò, F., Radin, C.: A first order transition between crystal phases in the shift model. J. Stat. Phys. 28, 473–478 (1982)
37. Hamrick, G.C., Radin, C.: The symmetry of ground states under perturbation. J. Stat. Phys. 21, 601–607 (1979)
38. Radin, C.: Low temperature and the origin of crystalline symmetry. Int. J. Mod. Phys. B 1, 1157–1191 (1987)
39. Marquest, C., Witten, T.A.: Simple cubic structure in copolymer mesophases. J. Phys. France 50, 1267–1277 (1989)
40. Klein, W., Gould, H., Ramos, R.A., Clejan, I., Mel'cuk, A.I.: Repulsive potentials, clumps and the metastable glass phase. Physica A 205, 738–746 (1994)
41. Likos, C.N., Watzlawek, M., Löwen, H.: Freezing and clustering transitions for penetrable spheres. Phys. Rev. E 58, 3135–3144 (1998)
42. Fernaud, M.-J., Lomba, E., Lee, L.L.: A self-consistent integral equation study of the structure and thermodynamics of the penetrable sphere fluid. J. Chem. Phys. 112, 810–816 (2000)
43. Fekete, M.: Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten. Math. Z. 17, 228–249 (1923)
44. Choquet, G.: Diamètre transfini et comparaison de diverses capacités. Séminaire Brelot-Choquet-Deny. Théorie du potentiel 3(4), 1–7 (1958–1959)
45. Farkas, B., Nagy, B.: Transfinite diameter, Chebyshev constant and energy on locally compact spaces. Potential Anal. 28, 241–260 (2008)
46. Fuglede, B.: On the theory of potentials in locally compact spaces. Acta Math. 103, 139–215 (1960)

Communicated by H. Spohn

Commun. Math. Phys. 305, 711–739 (2011) Digital Object Identifier (DOI) 10.1007/s00220-011-1277-y



Scaling Limits of Random Skew Plane Partitions with Arbitrarily Sloped Back Walls

Sevak Mkrtchyan

Department of Mathematics, Rice University, Houston, TX 77005, USA. E-mail: [email protected]

Received: 3 May 2010 / Accepted: 12 February 2011
Published online: 11 June 2011 – © Springer-Verlag 2011

Abstract: The paper studies scaling limits of random skew plane partitions confined to a box when the inner shapes converge uniformly to a piecewise linear function V of arbitrary slopes in [−1, 1]. It is shown that the correlation kernels in the bulk are given by the incomplete Beta kernel, as expected. As a consequence it is established that the local correlation functions in the scaling limit do not depend on the particular sequence of discrete inner shapes that converge to V. A detailed analysis of the correlation kernels at the top of the limit shape, and of the frozen boundary is given. It is shown that depending on the slope of the linear section of the back wall, the system exhibits behavior observed in either Okounkov and Reshetikhin (Commun Math Phys 269(3):571–609, 2007) or Boutillier et al. (http://arXiv.org/abs/0912.3968v2 [math-ph], 2009).

Contents

1. Introduction
   1.1 Background
   1.2 Main results
   1.3 Outline of the structure of the paper
2. Critical Points of the Asymptotically Leading Term in the Integral Formula Giving the Correlation Kernel
   2.1 The function S(z)
   2.2 Number of complex critical points of S(z)
3. Asymptotics of the Correlation Kernel
   3.1 Correlation kernels on the frozen boundary and when χ → ∞
4. The Boundary of the Limit Shape
   4.1 τ(z) and χ(z) are real for all z ∈ U
   4.2 The frozen boundary: connected components, cusps
Appendix A. An Integral Formula for b_k(z)
References


1. Introduction

1.1. Background.

1.1.1. Skew plane partitions. Given a partition $\lambda$, a skew plane partition with boundary $\lambda$ confined to a $c \times d$ box is an array of nonnegative integers $\pi = \{\pi_{i,j}\}$ defined for all $1 \le i \le c$, $1 \le j \le d$, $(i,j) \notin \lambda$, which are non-increasing in $i$ and $j$. One way to visualize this is to draw a $c \times d$ rectangular grid, remove the partition $\lambda$ from a corner, and stack $\pi_{i,j}$ identical cubes at position $(i,j)$, as in Fig. 1. The number of cubes $|\pi| := \sum \pi_{i,j}$ is the volume of the skew plane partition $\pi$.

From Fig. 1 it is easy to see that skew plane partitions can be identified with tilings of a certain region of $\mathbb{R}^2$ with 3 types of rhombi (see [4] or [6] for details and other correspondences). Scale the axes in such a way that the centers of horizontal tiles are on the lattice $\mathbb{Z} \times \frac12\mathbb{Z}$. The letter $t$ will be used for the horizontal and $h$ for the vertical coordinate axis in this plane.

Fig. 1. A skew plane partition and the corresponding partition λ. Here λ = {4, 4, 3, 3, 3, 1}

Fig. 2. Position of corners for λ = {4, 4, 3, 3, 3, 1}

Fig. 3. The graph of $\frac12 b_\lambda(t)$ when λ = {4, 4, 3, 3, 3, 1}

For a partition $\lambda$, let $b_\lambda(t)$ encode the boundary of $\lambda$. More precisely, let $u_1 < u_2 < \cdots < u_{n-1}$ denote the $t$ coordinates of the corners on the outer boundary of the Young diagram $\lambda$ as shown in Fig. 2. Define $b_\lambda(t)$ as

$$b_\lambda(t) := t + 2\sum_{i=1}^{n-1} (-1)^i (t - u_i)\,\theta(t - u_i),$$

where $\theta(t)$ is the step function

$$\theta(t) = \begin{cases} 1, & t \ge 0 \\ 0, & t < 0 \end{cases}.$$
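The definition of the back wall translates directly into code. The following is a minimal sketch: the corner positions are passed in as a plain list (the rule for reading them off a Young diagram is given by Fig. 2 and is not reproduced here), and the sample corners and the names `theta`, `back_wall` and `slope` are hypothetical, chosen only for illustration.

```python
def theta(t):
    """Step function: 1 for t >= 0, 0 for t < 0."""
    return 1 if t >= 0 else 0

def back_wall(t, corners):
    """b(t) = t + 2 * sum_{i>=1} (-1)^i (t - u_i) theta(t - u_i) for sorted corners u_i."""
    return t + 2 * sum((-1) ** i * (t - u) * theta(t - u)
                       for i, u in enumerate(corners, start=1))

def slope(t, corners, h=0.25):
    """Symmetric difference quotient; exact when no corner lies in [t - h, t + h]."""
    return (back_wall(t + h, corners) - back_wall(t - h, corners)) / (2 * h)

corners = [-3.0, -1.0, 2.0]   # hypothetical corner positions u_1 < u_2 < u_3
for t in (-3.5, -2.0, 0.5, 3.0):
    print(t, back_wall(t, corners), slope(t, corners))
```

Each corner flips the slope between +1 and −1, which is why the resulting function is piecewise linear.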

We will call $b_\lambda(t)$ the back wall. Notice that $b_\lambda(t)$ is a piecewise linear function with slopes in $[-1, 1]$. The graph of $h = \frac12 b_\lambda(t)$ (see Fig. 3) gives the inner shape of the skew plane partition from Fig. 1.

1.1.2. The thermodynamical limit; cases studied before. For $q \in (0,1)$ introduce a probability measure on skew plane partitions with boundary $\lambda$ and confined to a $c \times d$ box by

$$\mathrm{Prob}(\pi) \propto q^{|\pi|}.$$

Given a subset $U = \{(t_1,h_1), \ldots, (t_n,h_n)\} \subset \mathbb{Z} \times \frac12\mathbb{Z}$, define the corresponding local correlation function $\rho_{\lambda,q}(U)$ as the probability for a random tiling taken from the above probability space to have horizontal tiles centered at all positions $(t_i,h_i)_{i=1}^{n}$.

Okounkov and Reshetikhin studied the thermodynamical limit $q = e^{-r} \to 1$ of this system in several cases. In [4] they studied the case when $\lambda$ is the empty partition and the size of the box is infinite (i.e. plane partitions are not confined to a finite $c \times d$ box). In [6] they showed that in general, for arbitrary $\lambda$, the correlation functions are determinants.

Theorem 1.1 (Theorem 2, part 3 [6]). The correlation functions $\rho_{\lambda,q}$ are determinants

$$\rho_{\lambda,q}(U) = \det\big(K_{\lambda,q}((t_i,h_i),(t_j,h_j))\big)_{1\le i,j\le n},$$

where the correlation kernel $K_{\lambda,q}$ is given by the double integral

$$K_{\lambda,q}((t_1,h_1),(t_2,h_2)) = \frac{1}{(2\pi i)^2}\int_{z\in C_z}\int_{w\in C_w} \frac{\Phi_{b_\lambda}(z,t_1)}{\Phi_{b_\lambda}(w,t_2)}\, \frac{\sqrt{zw}}{z-w}\, z^{-h_1+\frac12 b_\lambda(t_1)-\frac12}\, w^{h_2-\frac12 b_\lambda(t_2)+\frac12}\, \frac{dz\,dw}{zw}, \tag{1}$$


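where $b_\lambda(t)$ is the function giving the back wall corresponding to $\lambda$ as in Fig. 3,

$$\Phi_{b_\lambda}(z,t) = \frac{\Phi_{-,b_\lambda}(z,t)}{\Phi_{+,b_\lambda}(z,t)},$$

$$\Phi_{+,b_\lambda}(z,t) = \prod_{m>t,\ m\in D^+,\ m\in\mathbb{Z}+\frac12} (1 - zq^m), \tag{2}$$

$$\Phi_{-,b_\lambda}(z,t) = \prod_{m\,\ldots} (1 - z^{-1}q^{-m}),$$

… 0 such that if $|\zeta - w| < \varepsilon$, then

$$\sum_{i=1}^{m_1-1} x_i\mu_i + w\alpha + \sum_{i=1}^{m_2-1} y_i\nu_i = 0 \tag{9}$$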

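if and only if $\zeta \in \mathbb{R}$ and $\zeta > w$. The same holds if $\alpha$ is replaced by $\tilde\alpha := \mathrm{angle}(\zeta - w, \zeta - y_1)$ and $\zeta > w$ is replaced by $\zeta < w$.

Proof. Let $\varepsilon = y_1 - w$. If $\zeta \in \mathbb{R}$, $\zeta > w$ and $|\zeta - w| < \varepsilon$, then $w < \zeta < y_1$. Thus, all the angles $\mu_i$, $\nu_i$ and $\alpha$ are zero, and (9) is true. Let us prove the converse. Suppose there is no $\varepsilon > 0$ such that (9) implies $\zeta \in \mathbb{R}$, $\zeta > w$. If $\zeta \in \mathbb{R}$ and $x_1 < \zeta < w$, then all the angles $\mu_i$ and $\nu_i$ are zero, but $\alpha = \pi$ and (9) cannot hold. It follows that there must be a sequence $\zeta_i \in \mathbb{C}\setminus\mathbb{R}$ such that

$$\lim_{i\to\infty} \zeta_i = w \tag{10}$$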


Fig. 11. The angles η(ζ ) and γ (ζ )

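and (9) holds for $\zeta = \zeta_i$, $\forall i$. Notice that the angles $\alpha$, $\mu_i$, and $\nu_i$ depend on $\zeta$ and that (10) implies

$$\lim_{i\to\infty} \mu_l(\zeta_i) = 0 \quad\text{and}\quad \lim_{i\to\infty} \nu_l(\zeta_i) = 0 \quad\text{for all } l. \tag{11}$$

These, together with the assumption that (9) holds for $\zeta = \zeta_i$, $\forall i$, imply

$$\lim_{i\to\infty} \alpha(\zeta_i) = 0. \tag{12}$$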

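Define $\eta(\zeta) := \mathrm{angle}(\zeta - w, \zeta - \bar\zeta)$ and $\gamma(\zeta) := \mathrm{angle}(\zeta - \bar\zeta, \zeta - y_1)$ (see Fig. 11). Using this notation,

$$\tan(\alpha(\zeta_i)) = \frac{|w - x_1|\cos(\eta(\zeta_i))}{|\zeta_i - w| + |w - x_1|\sin(\eta(\zeta_i))}.$$

Writing a similar expression for $\alpha(\zeta_i) + \mu_1(\zeta_i)$, and using (11) and (12), obtain

$$\lim_{i\to\infty} \frac{\mu_1(\zeta_i) + \alpha(\zeta_i)}{\alpha(\zeta_i)} = \lim_{i\to\infty} \frac{\tan(\mu_1(\zeta_i) + \alpha(\zeta_i))}{\tan(\alpha(\zeta_i))} = \lim_{i\to\infty} \frac{\dfrac{|w - x_2|\cos(\eta(\zeta_i))}{|\zeta_i - w| + |w - x_2|\sin(\eta(\zeta_i))}}{\dfrac{|w - x_1|\cos(\eta(\zeta_i))}{|\zeta_i - w| + |w - x_1|\sin(\eta(\zeta_i))}} = \lim_{i\to\infty} \frac{\dfrac{|\zeta_i - w|}{|w - x_1|} + \sin(\eta(\zeta_i))}{\dfrac{|\zeta_i - w|}{|w - x_2|} + \sin(\eta(\zeta_i))}. \tag{13}$$

It follows from (10) and (12) that

$$\lim_{i\to\infty} \eta(\zeta_i) = \lim_{i\to\infty} \gamma(\zeta_i) = \frac{\pi}{2}. \tag{14}$$

Thus, (13) gives that $\lim_{i\to\infty} \frac{\mu_1(\zeta_i) + \alpha(\zeta_i)}{\alpha(\zeta_i)} = 1$, and hence that $\lim_{i\to\infty} \frac{\mu_1(\zeta_i)}{\alpha(\zeta_i)} = 0$. It is easy to see that using the same argument it can be shown that $\lim_{i\to\infty} \frac{\mu_l(\zeta_i)}{\alpha(\zeta_i)} = 0$ for all $l = 1, 2, 3, \ldots, m_1$.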
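The expression for $\tan(\alpha)$ used above is elementary plane geometry and can be checked numerically. The sketch below rests on assumptions made here, not statements from the paper: $\alpha$ is read as $\mathrm{angle}(\zeta - x_1, \zeta - w)$ (consistent with $\alpha = 0$ for real $\zeta > w$ and $\alpha = \pi$ for $x_1 < \zeta < w$), $\eta$ is the angle at $\zeta$ between the directions to $w$ and to $\bar\zeta$, and $\zeta$ is taken in the upper half plane with $\operatorname{Re}\zeta > w$. The sample values and function names are illustrative.

```python
import cmath
import math

def angle(u, v):
    """Unsigned angle between the plane vectors u and v, given as complex numbers."""
    return abs(cmath.phase(v / u))

def tan_alpha_both_ways(zeta, w, x1):
    """Return tan(alpha) computed directly and via the closed-form expression."""
    alpha = angle(zeta - x1, zeta - w)
    eta = angle(zeta - w, zeta - zeta.conjugate())
    direct = math.tan(alpha)
    closed_form = (abs(w - x1) * math.cos(eta)
                   / (abs(zeta - w) + abs(w - x1) * math.sin(eta)))
    return direct, closed_form

w, x1 = 1.0, 0.25                      # hypothetical real points with x1 < w
for d, phi in [(0.1, 0.3), (0.01, 1.2), (0.5, 0.05)]:
    zeta = w + d * cmath.exp(1j * phi)  # point at distance d from w
    print(d, phi, tan_alpha_both_ways(zeta, w, x1))
```

In this parametrization $\eta = \pi/2 - \varphi$, so $\eta \to \pi/2$ corresponds to $\zeta$ approaching $w$ almost tangentially to the real axis, the regime in which $\alpha \to 0$.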


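It follows from (10) and (14) that

$$\lim_{i\to\infty} \frac{\nu_1(\zeta_i)}{\alpha(\zeta_i)} = \lim_{i\to\infty} \frac{\tan(\nu_1(\zeta_i))}{\tan(\alpha(\zeta_i))} = \lim_{i\to\infty} \frac{\dfrac{|y_1 - y_2|\cos(\gamma(\zeta_i))}{|\zeta_i - y_1| + |y_1 - y_2|\sin(\gamma(\zeta_i))}}{\dfrac{|w - x_1|\cos(\eta(\zeta_i))}{|\zeta_i - w| + |w - x_1|\sin(\eta(\zeta_i))}}$$

$$= \frac{|y_1 - y_2|}{|w - y_1| + |y_1 - y_2|} \lim_{i\to\infty} \frac{\cos(\gamma(\zeta_i))}{\cos(\eta(\zeta_i))} = \frac{|y_1 - y_2|}{|w - y_1| + |y_1 - y_2|} \lim_{i\to\infty} \frac{\sin(\frac{\pi}{2} - \gamma(\zeta_i))}{\sin(\frac{\pi}{2} - \eta(\zeta_i))} = \frac{|y_1 - y_2|}{|w - y_1| + |y_1 - y_2|} \lim_{i\to\infty} \frac{|\zeta_i - w|}{|\zeta_i - y_1|} = 0.$$

Similarly, $\lim_{i\to\infty} \frac{\nu_l(\zeta_i)}{\alpha(\zeta_i)} = 0$ for all $l = 1, 2, 3, \ldots, m_2$. Combining the results gives

$$\lim_{i\to\infty} \left( \sum_{l=1}^{m_1-1} x_l \frac{\mu_l(\zeta_i)}{\alpha(\zeta_i)} + w + \sum_{l=1}^{m_2-1} y_l \frac{\nu_l(\zeta_i)}{\alpha(\zeta_i)} \right) \ne 0,$$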

which is a contradiction to the assumption that (9) is satisfied for $\zeta = \zeta_i$ for all $i$. This proves the first statement in the lemma. The second statement follows by symmetry.

Proof of Lemma 2.2. Since $z \ne 0$, studying the critical points of $S(z)$ is equivalent to studying the solutions to $z\frac{d}{dz}S(z) = 0$. The real part of the equation, i.e. $0 = \Re(z\frac{d}{dz}S(z))$, implies that if $\chi$ is very large, then $z$ must be very close to $e^{\tau}$ or to $e^{V_l}$ for some $l$. The imaginary part $0 = \Im(z\frac{d}{dz}S(z))$ is equivalent to

$$0 = -\sum_{m=1}^{j-1} \tfrac12(1+\beta_m)\,\mathrm{angle}(z - e^{V_{m-1}}, z - e^{V_m}) - \tfrac12(1+\beta_j)\,\mathrm{angle}(z - e^{V_{j-1}}, z - e^{\tau}) + \tfrac12(1-\beta_j)\,\mathrm{angle}(z - e^{\tau}, z - e^{V_j}) + \sum_{m=j+1}^{n} \tfrac12(1-\beta_m)\,\mathrm{angle}(z - e^{V_{m-1}}, z - e^{V_m}). \tag{15}$$

For arbitrary real numbers $x, a, b$, if $x \notin [a,b]$, it is immediate that

$$\lim_{z\to e^{x}} \mathrm{angle}(z - e^{a}, z - e^{b}) = 0.$$

This implies that if $z$ is near $e^{V_m}$, then all terms in (15) except the two in which $e^{V_m}$ appears are close to zero. The sum of the angles in the remaining two terms is $\pi$, and both have coefficients of the same sign. If neither of those two coefficients is zero, then if $z$ is sufficiently close to $e^{V_m}$, the right-hand side of (15) cannot be zero. If one of the two coefficients is zero, then in order for (15) to hold, $z$ must be real. This follows from Lemma 2.3. For example, in the case $\beta_m \ne -1$, $\beta_{m+1} = -1$ and $m < j - 1$, setting $\zeta = z$,

$$\begin{array}{cccccccccccc} x_{m_1} & x_{m_1-1} & \cdots & x_1 & w & y_1 & \cdots & y_{j-m-1} & y_{j-m} & y_{j-m+1} & \cdots & y_{m_2} \\ \| & \| & \cdots & \| & \| & \| & \cdots & \| & \| & \| & \cdots & \| \\ e^{V_0} & e^{V_1} & \cdots & e^{V_{m-1}} & e^{V_m} & e^{V_{m+1}} & \cdots & e^{V_{j-1}} & e^{\tau} & e^{V_j} & \cdots & e^{V_n} \end{array} \tag{16}$$

and

$$\begin{array}{cccccccccc} x_{m_1-1} & \cdots & x_1 & w & y_1 & \cdots & y_{j-m-1} & y_{j-m} & \cdots & y_{m_2-1} \\ \| & \cdots & \| & \| & \| & \cdots & \| & \| & \cdots & \| \\ -\frac{1+\beta_1}{2} & \cdots & -\frac{1+\beta_{m-1}}{2} & -\frac{1+\beta_m}{2} & -\frac{1+\beta_{m+2}}{2} & \cdots & -\frac{1+\beta_j}{2} & \frac{1-\beta_j}{2} & \cdots & \frac{1-\beta_n}{2} \end{array}, \tag{17}$$

Lemma 2.3 gives that (15) implies $z$ must be real. If $z$ is near $e^{\tau}$ and $\beta_j = \pm 1$, then exactly one of the coefficients of the angles in (15) containing $z - e^{\tau}$ is zero, and again it follows from Lemma 2.3, with the parameters set up similarly to (16) and (17) but this time with $w = e^{\tau}$, that $z$ must be real. If $z$ is near $e^{\tau}$ and $\beta_j \ne \pm 1$, then using the methods from the proof of Proposition 3.1 of [1], it can be shown that there are two possibilities for $z$, and these two complex conjugate critical points of $S(z)$ can be asymptotically calculated. Formulas for the critical points are given in Lemma 2.4.

Lemma 2.4. In the limit $\chi \to \infty$,

(a) If $\tau \in (V_{j-1}, V_j)$ and $\beta_j \ne \pm 1$ is fixed, then the asymptotics of the non-real complex critical points is given by $z_{cr} = e^{\tau - \varepsilon}$, where

$$\varepsilon = e^{-\chi}\, e^{\pm i\pi\frac12(1+\beta_j)} \prod_{i=0}^{n} \left| 2\sinh\frac{\tau - V_i}{2} \right|^{\frac12(\beta_{i+1} - \beta_i)} \left(1 + O(e^{-\chi})\right).$$

(b) If $\tau = V_{j-1} + \delta$, $\chi \to \infty$ and $\delta \to 0$ in such a way that

$$p = e^{\chi - \chi^{(j-1)}}\, |\delta|^{1 - \frac12(\beta_j - \beta_{j-1})}$$

is fixed, with

$$e^{\chi^{(j-1)}} := \prod_{\substack{i=0 \\ i\ne j-1}}^{n} \left| 2\sinh\frac{V_{j-1} - V_i}{2} \right|^{\frac12(\beta_{i+1} - \beta_i)},$$

then the critical points behave as $z_{cr} = e^{\tau - s|\delta|}$, where $s$ is a solution to the equation

$$p = e^{\pm i\pi\frac12(1+\beta_{j-1})}\, \frac{(s - \mathrm{sign}(\delta))^{\frac12(\beta_j - \beta_{j-1})}}{s}. \tag{18}$$

Proof. Since the calculations are very similar to those in [1], in order to avoid repetition, we will omit them here.

2.2.4. Nature of critical points in various limits. Let us analyze the solutions to (18). Assume $\delta > 0$. The case $\delta < 0$ is similar. Consider the two limits $p \to \infty$ and $p \to 0$ in various scenarios depending on the angles $\beta_{j-1}$ and $\beta_j$. In the limit $p \to \infty$ the solution to (18) has the asymptotics

$$s = p^{-1}\, e^{\pm i\pi\frac12(1+\beta_j)} \left(1 + O(p^{-1})\right). \tag{19}$$

When $p \to 0$,

$$s = p^{-\frac{1}{1 - \frac12(\beta_j - \beta_{j-1})}}\; e^{\pm i\pi\frac{\frac12(1+\beta_{j-1})}{1 - \frac12(\beta_j - \beta_{j-1})}} \left(1 + O(p)\right). \tag{20}$$
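The two asymptotic regimes can be sanity-checked by substituting the leading terms of (19) and (20) back into (18). The sketch below is only illustrative and rests on assumptions made here: (18) is read as $p = e^{\pm i\pi\frac12(1+\beta_{j-1})}(s - \mathrm{sign}(\delta))^{\frac12(\beta_j - \beta_{j-1})}/s$, the upper sign is used, the complex power is taken on the principal branch, and the slopes $\beta_{j-1} = -0.5$, $\beta_j = 0.3$ are hypothetical sample values.

```python
import cmath
import math

def rhs_18(s, beta_prev, beta_j, sign_delta=1.0):
    """Right-hand side of (18) with the upper sign and the principal-branch power."""
    a = 0.5 * (beta_j - beta_prev)
    phase = cmath.exp(1j * math.pi * 0.5 * (1 + beta_prev))
    power = cmath.exp(a * cmath.log(s - sign_delta))
    return phase * power / s

def s_large_p(p, beta_j):
    """Leading term of (19), valid as p -> infinity."""
    return cmath.exp(1j * math.pi * 0.5 * (1 + beta_j)) / p

def s_small_p(p, beta_prev, beta_j):
    """Leading term of (20), valid as p -> 0."""
    c = 1 - 0.5 * (beta_j - beta_prev)
    return p ** (-1 / c) * cmath.exp(1j * math.pi * 0.5 * (1 + beta_prev) / c)

beta_prev, beta_j = -0.5, 0.3          # hypothetical slopes in (-1, 1)
p_large, p_small = 1e4, 1e-3
ratio_large = rhs_18(s_large_p(p_large, beta_j), beta_prev, beta_j) / p_large
ratio_small = rhs_18(s_small_p(p_small, beta_prev, beta_j), beta_prev, beta_j) / p_small
print(ratio_large, ratio_small)        # both should be close to 1
```

For other slope values the principal branch may not be the relevant one, so the check should be read as a consistency test of the leading-order algebra rather than a solver for (18).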


To get these asymptotics, it is necessary to show that if $(\beta_j - \beta_{j-1}) < 0$ and $p \to \infty$, or $(\beta_j - \beta_{j-1}) > 0$ and $p \to 0$, then $s \not\to 1$. This is a technicality that has been addressed in [1].

In Lemma 2.3 set

$$\zeta = z, \qquad w = e^{\tau}, \qquad x_i = e^{V_{j-i}},\ i = 1, 2, \ldots, j, \qquad y_i = e^{V_{j-1+i}},\ i = 1, 2, \ldots, n-j+1,$$

with the coefficients in (9) given by

$$x_i = -\tfrac12(1 + \beta_{j-i}),\ i = 1, 2, \ldots, j-1, \qquad y_i = \tfrac12(1 - \beta_{j+i}),\ i = 1, 2, \ldots, n-j.$$

Lemma 2.3 and its proof can be used to show that if $z$ satisfies (15), then in the following situations it must be real. The idea is to show that if $z$ is not real, then one of the angles in (9) with non-zero coefficient is much larger than all the other angles with non-zero coefficients, which is not possible.

Case 1. $p \to \infty$, $s \to 0$, $\beta_j = 1$. In Lemma 2.3 set the coefficient $w = -1$. Then (15) has the form of (9). Also, $|\zeta - w| = |e^{\tau - \varepsilon} - e^{\tau}| = e^{\tau}|\varepsilon| + o(\varepsilon)$ and $|w - x_1| = |e^{V_{j-1}} - e^{\tau}| = e^{\tau}|\delta| + o(\delta)$. Hence, $\frac{|\zeta - w|}{|w - x_1|} \approx \frac{|\varepsilon|}{\delta} = s \to 0$. The proof of Lemma 2.3 implies that $\zeta$ must be real. Since $z = \zeta$, $z$ must be real.

Case 2. $p \to 0$, $s \to \infty$, $\beta_j = 1$. The setup is similar to the first case and $|\zeta - w| \to 0$, $|w - x_1| \to 0$. Unlike the previous case, $\frac{|\zeta - w|}{|w - x_1|} \to \infty$. The proof of Lemma 2.3 gives that in the limit considered $\mu_1 \gg \alpha \gg$ (all the other angles), which implies $\zeta$ must be real.

Case 3. $p \to 0$, $s \to \infty$, $\beta_{j-1} = -1$, $\beta_j \ne 1$. In this case (15) has the form of (9) with the term $w\alpha$ in (9) replaced by $-\frac12(1+\beta_j)\alpha + \frac12(1-\beta_j)\tilde\alpha$. Notice that the coefficient $x_1 = 0$. From the proof of Lemma 2.3 it is easy to see that if $\zeta$ is not real, then $\mu_i \to 0$ for $i > 1$, $\nu_i \to 0$ for $i \ge 1$, $\alpha + \tilde\alpha \gg \mu_i$ for $i > 1$ and $\alpha + \tilde\alpha \gg \nu_i$ for $i \ge 1$. Since $\frac{|\zeta - w|}{|w - x_1|} \approx \frac{|\varepsilon|}{\delta} = s \to \infty$, it follows that $\alpha \to 0$ (see Fig. 10). Thus, $\tilde\alpha \to 0$ as well. A calculation similar to (13) yields

$$\frac{\tan(\alpha) + \tan(\tilde\alpha)}{\tan(\alpha)} \approx \frac{\dfrac{|\zeta - x_1|}{|w - x_1|} + 1}{\dfrac{|\zeta - x_1|}{|y_1 - x_1|} + 1} \approx s \to \infty,$$

which implies $\tilde\alpha \gg \alpha$. Thus, in the considered limit $\tilde\alpha$ is much larger than all the other angles that appear with non-zero coefficients, which is impossible. Hence, $\zeta$ must be real.


Case 4. $p \to \infty$, $s \to 0$, $\beta_j = -1$. Now (15) has the form of (9) if $\alpha$ is replaced by $\tilde\alpha$ and the coefficient $w = 1$. In this case $|\zeta - w| \to 0$, $|x_1 - w| \to 0$, and $\frac{|\zeta - w|}{|x_1 - w|} \to 0$. It is easy to see from the proof of Lemma 2.3 that $\tilde\alpha \gg \mu_i, \nu_i$ and hence that $\zeta$ must be real.

It is easy to see that the critical points $z_{cr}$ of $S(z)$ obtained from (19) and (20) are non-real except in the cases listed above.

2.2.5. Critical points at finite $(\tau, \chi)$. In the previous section the number of non-real complex critical points of $S_{\tau,\chi}(z)$ was identified when $\chi$ is large. This section studies the number of non-real complex critical points for an arbitrary point $(\tau_0, \chi_0)$. Suppose that at this point the number of non-real complex critical points is neither 2 nor 0. It must be even, since they come in conjugate pairs. Assume there are $2m$ such critical points. The number of such critical points depends on the point $(\tau_0, \chi_0)$. However, if $(\tau_0, \chi_0)$ continuously changes in the $\tau, \chi$ plane, the number of non-real complex critical points of $S_{\tau_0,\chi_0}(z)$ will not change, until it reaches a point where the equations

$$S'_{\tau,\chi}(z) = S''_{\tau,\chi}(z) = 0 \tag{21}$$

have a real solution $z \in \mathbb{R}$. In other words, the number of non-real complex critical points can change only near points $(\tau, \chi)$ where $S_{\tau,\chi}(z)$ has double real critical points. From (8) and (7) the condition (21) is equivalent to

$$e^{\tau} = z - \frac{1}{\sum_{i=0}^{n} \frac12(\beta_{i+1} - \beta_i)\frac{1}{z - e^{V_i}}}, \tag{22}$$

and

$$\chi = -\sum_{i=1}^{j-1} \tfrac12(1+\beta_i)\ln\frac{ze^{-V_i} - 1}{ze^{-V_{i-1}} - 1} - \tfrac12(1+\beta_j)\ln\frac{ze^{-\tau} - 1}{ze^{-V_{j-1}} - 1} + \tfrac12(1-\beta_j)\ln\frac{ze^{-V_j} - 1}{ze^{-\tau} - 1} + \sum_{i=j+1}^{n} \tfrac12(1-\beta_i)\ln\frac{ze^{-V_i} - 1}{ze^{-V_{i-1}} - 1} - \tfrac12\tau + \tfrac12\big(V(V_0) + V_0\big). \tag{23}$$

Think of this as a curve $(\tau(z), \chi(z))$ in the $\tau, \chi$ plane parametrized by $z \in \mathbb{R}$. Consider the complement of this curve in the $\tau, \chi$ plane. The number of non-real complex critical points of $S_{\tau,\chi}$ is the same for all points $(\tau, \chi)$ inside the same connected component of this complement. Let $D$ be the connected component which contains $(\tau_0, \chi_0)$ and let $(\tau(z_0), \chi(z_0))$ be a generic point on the boundary of $D$. The complement of $(\tau(z), \chi(z))$ has another connected component whose boundary contains $(\tau(z_0), \chi(z_0))$. Call it $\tilde D$. The number of non-real complex critical points of $S_{\tau,\chi}$ is $2m$ for points $(\tau, \chi) \in D$ and $2m \pm 2$ for points $(\tau, \chi) \in \tilde D$. From Lemma 2.1 it follows that $z_0$ must be in one of the intervals $(e^{V_{i-1}}, e^{V_i})$, $\beta_i = \pm 1$, or in $(-\infty, e^{V_0}) \cup (e^{V_n}, \infty)$. Suppose, for example, $z_0 \in (e^{V_{l-1}}, e^{V_l})$. The portion of the curve $(\tau(z), \chi(z))$ corresponding to $z \in (e^{V_{l-1}}, e^{V_l})$ is contained in $\bar D \cap \bar{\tilde D}$, where $\bar D$ denotes the closure of $D$ (a detailed study of the curve $(\tau(z), \chi(z))$ is carried out in Sect. 4). Now, when $z$ approaches $e^{V_l}$ (or $e^{V_{l-1}}$), $\chi(z)$ will approach $\infty$. However, it was shown in the previous section that when $\chi$ is very large, the number of complex critical points is either 2 or 0. Thus, $2m = 0$ or $2m = 2$.


This establishes the following proposition:

Proposition 2.5. For any $(\tau, \chi)$ the number of non-real complex critical points of $S_{\tau,\chi}(z)$ is 2 or 0.

Divide the $\tau, \chi$ plane into regions according to the number of non-real complex critical points of $S_{\tau,\chi}(z)$. Each connected component where the number of such critical points is 0 has points where $\chi$ is arbitrarily large.

3. Asymptotics of the Correlation Kernel

This section analyzes the asymptotics of the correlation kernel $K_{\lambda_k,r_k}((t_1^k,h_1^k),(t_2^k,h_2^k))$ in the scaling limit when $\lim_{k\to\infty} r_k = 0$, the skew plane partitions are scaled by $r_k$ in all directions,

$$\lim_{k\to\infty} r_k t_1^k = \lim_{k\to\infty} r_k t_2^k = \tau, \qquad \lim_{k\to\infty} r_k h_1^k = \lim_{k\to\infty} r_k h_2^k = \chi,$$

and $\Delta(t) := t_1^k - t_2^k$ and $\Delta(h) := h_1^k - h_2^k$ are constants. A version of the saddle point method is used for calculating the asymptotics of the correlation kernel (1). The arguments used are along the lines of [4,6,1].

Let $\tau, \chi$ be such that $S_{\tau,\chi}$ has two non-real complex critical points. Deform the contours of integration $C_z$ and $C_w$ in the double integral representation of the correlation kernel given in (1) to $C'_z$, $C'_w$ in such a way that the new contours pass transversely through the two critical points of $S(z)$ and

$$\Re(S(z)) \le \Re(S(w)) \quad \forall z \in C'_z,\ \forall w \in C'_w,$$

with equality if and only if $z = w = z_{cr}$. It was shown in the proof of Theorem 4.1 in [1] that this can be done for the function $S(z)$ when $\beta_i = \pm 1$, $\forall i$. The argument used there is general and applies for arbitrary $\beta_i \in [-1, 1]$. During this contour deformation the contours cross each other along a path between the complex critical points of $S(z)$, so the residues from the term $\frac{1}{z-w}$ should be picked up. Thus, the integral (1) can be written as

$$K_{\lambda_k,q_k}((t_1^k,h_1^k),(t_2^k,h_2^k)) = \frac{1}{(2\pi i)^2}\int_{z\in C'_z}\int_{w\in C'_w} \frac{\Phi_{-,b_{\lambda_k}}(z,t_1^k)\,\Phi_{+,b_{\lambda_k}}(w,t_2^k)}{\Phi_{+,b_{\lambda_k}}(z,t_1^k)\,\Phi_{-,b_{\lambda_k}}(w,t_2^k)} \times \frac{\sqrt{zw}}{z-w}\, z^{-h_1^k+\frac12 b_{\lambda_k}(t_1^k)-1/2}\, w^{h_2^k-\frac12 b_{\lambda_k}(t_2^k)+1/2}\, \frac{dz\,dw}{zw}$$

$$+ \frac{1}{2\pi i}\int_{z_{cr,1}}^{z_{cr,2}} \frac{\Phi_{-,b_{\lambda_k}}(z,t_1^k)\,\Phi_{+,b_{\lambda_k}}(z,t_2^k)}{\Phi_{+,b_{\lambda_k}}(z,t_1^k)\,\Phi_{-,b_{\lambda_k}}(z,t_2^k)}\, z^{h_2^k-h_1^k+\frac12 b_{\lambda_k}(t_1^k)-\frac12 b_{\lambda_k}(t_2^k)-1}\, dz.$$

Recall that the first integral in the above formula has the form

$$\frac{1}{(2\pi i)^2}\int_{C'_z}\int_{C'_w} e^{\frac{S_{\tau,\chi}(z) - S_{\tau,\chi}(w)}{r_k} + O(1)}\, \frac{1}{z-w}\, dw\, dz.$$

In the limit $k \to \infty$ the asymptotically leading term of the double integral is $e^{\frac{S(z)-S(w)}{r_k}}$ and $\Re(S(z)) < \Re(S(w))$ along the contours $C'_z$, $C'_w$ except at the critical points. This implies that the main contribution to the integral comes from the critical points, but since the contours of integration cross transversely at the critical points, the integral is zero in the limit. Hence,


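$$\lim_{k\to\infty} K_{\lambda_k,q_k}((t_1^k,h_1^k),(t_2^k,h_2^k)) = \lim_{k\to\infty} \frac{1}{2\pi i}\int_{z_{cr,1}}^{z_{cr,2}} \frac{\Phi_{-}(z,t_1^k)\,\Phi_{+}(z,t_2^k)}{\Phi_{+}(z,t_1^k)\,\Phi_{-}(z,t_2^k)}\, z^{h_2^k-h_1^k+\frac12 b_{\lambda_k}(t_1^k)-\frac12 b_{\lambda_k}(t_2^k)-1}\, dz.$$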

Suppose t1k < t2k . Let Pλk (t1k , t2k ) = #{D − ∩ (t1k , t2k )}. The correlation kernel can be written as follows: lim K λk ,qk ((t1k , h k1 ), (t2k , h k2 ))

k→∞

= lim

k→∞

= lim

k→∞

1 2π i 1 2π i

1 = lim k→∞ 2π i = lim

k→∞

t1k <m z 0 and that

Case 1. $\beta_{l+2} - \beta_{l+1} > 0$. In this case

$$\lim_{a\to\ln(z_0)+} \frac{\partial^{2m}}{\partial z^{2m}}\tilde T(z_0, a, a) = \infty.$$

Thus, $\exists a_0 \in \mathbb{R}$ such that $z_0 < e^{a_0} < e^{V_l}$ and $\frac{\partial^{2m}}{\partial z^{2m}}\tilde T(z_0, a_0, a_0) = 0$. This contradicts the inductive assumption since $\tilde T(z, a_0, a_0)$ is equal to the function $T(z)$ corresponding to a piecewise linear back wall with parameters $V_0, V_1, \ldots, V_{l-1}, a_0, V_{l+2}, \ldots, V_n$ and $\beta_1, \beta_2, \ldots, \beta_l, \beta_{l+2}, \ldots, \beta_n$, which has at most $n - 2$ corners (see Fig. 12).

Case 2. $\beta_{l+2} - \beta_{l+1} < 0$. The argument is similar to the previous case. Now, $\frac{\partial^{2m}}{\partial z^{2m}}\tilde T(z_0, a, b)$ is a decreasing function of $b$ for $b$ such that $e^{b} > z_0$, and thus $\frac{\partial^{2m}}{\partial z^{2m}}\tilde T(z_0, V_l, V_{l+2}) < 0$. $\frac{\partial^{2m}}{\partial z^{2m}}\tilde T(z_0, a, V_{l+2})$ is decreasing as a function of $a$ when $e^{a} > z_0$ and

$$\lim_{a\to\ln(z_0)+} \frac{\partial^{2m}}{\partial z^{2m}}\tilde T(z_0, a, V_{l+2}) = \infty.$$

Thus, $\exists a_0 \in \mathbb{R}$ such that $z_0 < e^{a_0} < e^{V_l}$ and $\frac{\partial^{2m}}{\partial z^{2m}}\tilde T(z_0, a_0, V_{l+2}) = 0$. This contradicts the inductive assumption since $\tilde T(z, a_0, V_{l+2})$ is equal to the function $T(z)$ corresponding to a piecewise linear back wall with parameters $V_0, V_1, \ldots, V_{l-1}, a_0, V_{l+2}, \ldots, V_n$ and $\beta_1, \beta_2, \ldots, \beta_{l+1}, \beta_{l+3}, \ldots, \beta_n$, which has at most $n - 2$ corners.


This concludes the proof when there is at least one corner to the right of the corner at position Vl . When there are corners to the left, the argument can be easily modified to show that if T (2m) (z 0 ) = 0, the number of corners to the left of z 0 can be reduced and still have that T (2m) (z 0 ) = 0. Proof [Proof of Proposition 4.1]. Again, only the case z ∈ (e Vl−1 , e Vl ) with βl = 1 will be presented, as other cases work in the same way. From (26), to show τ is real it is enough to show that z − T 1(z) > 0. Lemma 4.2 gives that T (z) = 0. If T (z) < 0, there is nothing to show, so assume T (z) > 0. Again, do induction on the number of corners on the back wall. When there is only one corner, i.e. when n = 2, V0 < V1 < V2 , l = 1, e V0 < z < e V1 , β1 = 1, and β2 ∈ [−1, 1), the inequality can be rewritten as follows: 1 1 > 0 ⇔ 1 < zT (z) ⇔ 1 < z z− T (z) 2 ⇔

2 β2 − 1 1 − β2 + + V V 0 1 z−e z−e z − e V2

z e V1 − e V2 1 z(1 − β2 ) < − 1. V V 1 1 2 (z − e )(z − e ) z − e V0

The last inequality is true since the left-hand side is less than zero and the right-hand side is greater than zero. Now, assume that $z - \frac{1}{T(z)} > 0$ whenever the back wall has at most $n - 2$ corners. Fix $z$ and let $0 < l < n - 1$ be such that $z < e^{V_l}$. Such an $l$ exists if not all corners are to the left of $z$. The case when all corners are to the left of $z$ can be treated similarly. Consider the function $\tilde{T}(z, V_l, b)$ from the proof of Lemma 4.2. Depending on the sign of $\beta_{l+2} - \beta_{l+1}$, $\tilde{T}(z, V_l, b)$ is either increasing or decreasing in $b$ for $b \geq V_l$. Hence, either $0 < \tilde{T}(z, V_l, V_l) < T(z)$ or $0 < \tilde{T}(z, V_l, V_{l+2}) < T(z)$. However, as in Lemma 4.2, both $\tilde{T}(z, V_l, V_l)$ and $\tilde{T}(z, V_l, V_{l+2})$ are equal to the function $T(z)$ corresponding to back walls with at most $n - 2$ corners. Thus, the induction hypothesis gives that $z - \frac{1}{\tilde{T}(z, V_l, V_{l+2})} > 0$ and $z - \frac{1}{\tilde{T}(z, V_l, V_l)} > 0$. Hence, $z - \frac{1}{T(z)} > 0$.

This establishes that if $z \in U$, then $\tau(z) = \ln\left(z - \frac{1}{T(z)}\right)$ is real.
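The base case of this induction can be sanity-checked numerically. The sketch below is an illustration only, not part of the proof. It assumes the boundary conventions $\beta_0 = -1$ and $\beta_{n+1} = 1$ (read off from the coefficient of $1/(z - e^{V_0})$ in the $n = 2$ formula), uses the expression $T(z) = \sum_{i=0}^{n} \frac{1}{2}(\beta_{i+1} - \beta_i)\frac{1}{z - e^{V_i}}$ recalled in the proof of Theorem 4.5, and picks hypothetical sample values for $V_i$ and $\beta_2$.

```python
import math

def T(z, V, beta):
    """T(z) = sum_i (1/2)(beta_{i+1} - beta_i) / (z - exp(V_i)).

    V = [V_0, ..., V_n]; beta = [beta_0, ..., beta_{n+1}].
    The end values beta_0 = -1 and beta_{n+1} = 1 are an assumed convention.
    """
    return sum(0.5 * (beta[i + 1] - beta[i]) / (z - math.exp(V[i]))
               for i in range(len(V)))

# Hypothetical one-corner wall: n = 2, beta_1 = 1, beta_2 in [-1, 1).
V = [0.0, 1.0, 2.0]
beta = [-1.0, 1.0, 0.3, 1.0]

# Sample the open interval (e^{V_0}, e^{V_1}).
lo, hi = math.exp(V[0]), math.exp(V[1])
zs = [lo + (hi - lo) * k / 200 for k in range(1, 200)]

# gaps[k] = z - 1/T(z); tau(z) = ln(gap) is real exactly when gap > 0.
gaps = [z - 1.0 / T(z, V, beta) for z in zs]
taus = [math.log(g) for g in gaps]
```

For these sample values every entry of `gaps` is positive, in line with the inequality established above.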

Now, let us show that $\chi(z)$ is real. This is obvious if $z \in (-\infty, e^{V_0}) \cup (e^{V_n}, \infty)$. Suppose $z \in (e^{V_{l-1}}, e^{V_l})$ with $\beta_l = \pm 1$. If $\beta_l = 1$ ($\beta_l = -1$), then $\beta_{l+1} - \beta_l < 0$ ($\beta_{l+1} - \beta_l > 0$). If $z = e^{V_l} - \varepsilon$, it follows from (25) that $T(z) > 0$ ($T(z) < 0$). By Lemma 4.2, $T(z) \neq 0$, which implies $T(z) > 0$ ($T(z) < 0$) for all $z \in (e^{V_{l-1}}, e^{V_l})$. Now, (26) implies that $e^{\tau} < z < e^{V_l}$ ($e^{\tau} > z > e^{V_{l-1}}$). Therefore, the coefficient of $\ln\left(\frac{z e^{-V_l} - 1}{z e^{-V_{l-1}} - 1}\right)$ in (23) is $1 - \beta_l = 0$ ($1 + \beta_l = 0$). In conclusion, it was shown that if $z \in (e^{V_{l-1}}, e^{V_l})$ with $\beta_l = \pm 1$, then the coefficient of $\ln\left(\frac{z e^{-V_l} - 1}{z e^{-V_{l-1}} - 1}\right)$ in (23) is always zero, which implies that $\chi(z)$ is real for all $z \in U$.

4.2. The frozen boundary: connected components, cusps.

Proposition 4.3. The frozen boundary consists of connected components, one for each outer corner where at least one of the slopes is $\pm 1$, and one connected component at the bottom (see Fig. 8).

Arbitrary Slopes


Proof. If $(\tau, \chi)$ is on the frozen boundary, then it is easy to see from (23) that
\[
\frac{d\chi}{d\tau} = \frac{z}{z - e^{\tau}} - \frac{1}{2}. \tag{27}
\]

Suppose $z \in (e^{V_{l-1}}, e^{V_l}) \subset U$ for some $l$ for which $\beta_l = \pm 1$. Assume $\beta_l = 1$ (the case $\beta_l = -1$ can be done similarly). It is immediate from the formulas for $\tau(z)$, $\chi(z)$ that $(\tau(z), \chi(z))$ is a continuous curve when $z \in (e^{V_{l-1}}, e^{V_l})$. Consider two cases. First, suppose that $\beta_{l-1} \neq -1$. Then (22) implies
\[
\lim_{z \to e^{V_l}-} \tau(z) = V_l-, \quad \text{and} \quad \lim_{z \to e^{V_{l-1}}+} \tau(z) = V_{l-1}-, \tag{28}
\]
where, for example, by $\lim_{z \to e^{V_l}-} \tau(z) = V_l-$ we mean that $\tau(z)$ converges to $V_l$ from below when $z$ converges to $e^{V_l}$ from below. Looking at (23) it is easy to see that
\[
\lim_{z \to e^{V_l} \text{ or } e^{V_{l-1}}} \chi(z) = +\infty.
\]
These calculations imply that when $z$ ranges over $(e^{V_{l-1}}, e^{V_l})$, it gives a connected component of the curve $(\tau(z), \chi(z))$. If $\beta_{l-1} = -1$, then (22) gives that
\[
\lim_{z \to e^{V_{l-1}}-} \tau(z) = \lim_{z \to e^{V_{l-1}}+} \tau(z) = V_{l-1}.
\]
Consider values of $z$ which range over $(e^{V_{l-2}}, e^{V_{l-1}}) \cup (e^{V_{l-1}}, e^{V_l})$. It follows from (23) that
\[
\chi_0 := \lim_{z \to e^{V_{l-1}}\pm} \chi(z) = -\frac{1}{2} \sum_{i=1}^{l-2} (1 + \beta_i) \ln \frac{z e^{-V_i} - 1}{z e^{-V_{i-1}} - 1} + \frac{1}{2} \sum_{i=l+1}^{n} (1 - \beta_i) \ln \frac{z e^{-V_i} - 1}{z e^{-V_{i-1}} - 1} - \frac{1}{2} V_{l-1} + \frac{1}{2} \left(V(V_0) + V_0\right)
\]
and
\[
\lim_{z \to e^{V_l} \text{ or } e^{V_{l-2}}} \chi(z) = +\infty.
\]

These imply that when $z$ ranges over $(e^{V_{l-2}}, e^{V_{l-1}}) \cup (e^{V_{l-1}}, e^{V_l})$, one connected component of the curve $(\tau(z), \chi(z))$ is obtained. The two pieces of this connected component corresponding to the intervals $z \in (e^{V_{l-2}}, e^{V_{l-1}})$ and $z \in (e^{V_{l-1}}, e^{V_l})$ are connected at the point $(V_{l-1}, \chi_0)$. Values $z \notin [e^{V_0}, e^{V_n}]$ correspond to the component at the bottom.

Next, it will be shown that each component corresponding to an outer corner develops a cusp. The above results, together with the analysis in Sect. 3.1, imply that the frozen boundary at corners looks as described in Fig. 13. A cusp appears if $\frac{d\chi}{dz} = \frac{d\tau}{dz} = 0$. This is equivalent to $S'(z) = S''(z) = S'''(z) = 0$.


Fig. 13. All possible corners

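Lemma 4.4. Suppose $(\tau, \chi)$ is on the frozen boundary. Then
\[
S'''(z) = 0 \;\Leftrightarrow\; \frac{d\tau}{dz} = 0.
\]

Proof. By definition of $T(z)$,
\[
e^{\tau} = z - \frac{1}{T(z)},
\]
so
\[
\frac{d\tau}{dz} = 0 \;\Leftrightarrow\; 1 + \frac{T'}{T^2} = 0 \;\Leftrightarrow\; T' + T^2 = 0 \;\Leftrightarrow\; 0 = T' + \frac{1}{(z - e^{\tau})^2}.
\]
However, notice that $S'''(z) = T' + \frac{1}{(z - e^{\tau})^2}$. Thus, $\frac{d\tau}{dz} = 0 \Leftrightarrow S'''(z) = 0$.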
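The condition $T' + T^2 = 0$ from the proof above can be solved numerically to locate a cusp for a concrete wall. The following is a minimal sketch under stated assumptions: the wall parameters are hypothetical sample values, the end slopes $\beta_0 = -1$, $\beta_{n+1} = 1$ are an assumed convention, $T(z) = \sum_{i=0}^{n} \frac{1}{2}(\beta_{i+1} - \beta_i)\frac{1}{z - e^{V_i}}$ is the formula recalled in the proof of Theorem 4.5, and plain bisection stands in for whatever root finder one prefers.

```python
import math

def T(z, V, beta):
    # T(z) = sum_i (1/2)(beta_{i+1} - beta_i) / (z - exp(V_i))
    return sum(0.5 * (beta[i + 1] - beta[i]) / (z - math.exp(V[i]))
               for i in range(len(V)))

def dT(z, V, beta):
    # Derivative of T with respect to z.
    return -sum(0.5 * (beta[i + 1] - beta[i]) / (z - math.exp(V[i])) ** 2
                for i in range(len(V)))

def g(z, V, beta):
    # d(tau)/dz = 0 is equivalent to g(z) = T'(z) + T(z)^2 = 0.
    return dT(z, V, beta) + T(z, V, beta) ** 2

def bisect(func, a, b, steps=200):
    # Plain bisection; assumes func(a) and func(b) have opposite signs.
    fa = func(a)
    for _ in range(steps):
        m = 0.5 * (a + b)
        fm = func(m)
        if (fa < 0) == (fm < 0):
            a, fa = m, fm
        else:
            b = m
    return 0.5 * (a + b)

# Hypothetical wall with beta_{l-1} > -1 and beta_l = 1 for l = 2.
V = [0.0, 1.0, 2.0, 3.0]
beta = [-1.0, 0.5, 1.0, 0.0, 1.0]
l = 2
left, right = math.exp(V[l - 1]), math.exp(V[l])
eps = 1e-4 * (right - left)  # stay strictly inside the open interval

z_c = bisect(lambda z: g(z, V, beta), left + eps, right - eps)
tau_c = math.log(z_c - 1.0 / T(z_c, V, beta))  # e^tau = z - 1/T(z)
```

Here `z_c` is the parameter value at which $\tau(z)$ is stationary and `tau_c` is the corresponding $\tau$; bisection applies because $g$ tends to $-\infty$ at the left endpoint and to $+\infty$ at the right endpoint for this choice of slopes.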

Theorem 4.5. Each connected component of the frozen boundary corresponding to an outer corner has a cusp on it (see Figs. 8 and 13). In particular, the number of cusps is equal to the number of outer corners where at least one of the slopes is a lattice slope.

Proof. First, let us treat the case $e^{V_{l-1}} < z < e^{V_l}$, $\beta_{l-1} > -1$, $\beta_l = 1$. By Lemma 4.4, cusps correspond to points $z$ such that
\[
e^{\tau} = z - \frac{1}{T(z)} \quad \text{and} \quad \frac{d\tau}{dz} = 0.
\]
This is equivalent to
\[
T(z) = \frac{1}{z - e^{\tau}} \quad \text{and} \quad T'(z) = \left(\frac{1}{z - e^{\tau}}\right)',
\]
which means that to prove the theorem it is enough to show that when $e^{V_{l-1}} < z < e^{V_l}$, $T(z)$ and $\frac{1}{z - e^{\tau}}$ are tangent to each other at a single point. Recall that $T(z)$ is given by
\[
T(z) = \sum_{i=0}^{n} \frac{1}{2} (\beta_{i+1} - \beta_i) \frac{1}{z - e^{V_i}}.
\]
It follows that $\lim_{z \to e^{V_{l-1}}+} T(z) = \infty = \lim_{z \to e^{V_l}-} T(z)$. Lemma 4.2 gives that $T^{(2m)}(z) \neq 0$. In particular, it follows that in the interval $e^{V_{l-1}} < z < e^{V_l}$ the function $T(z)$ is never zero, is always concave up, and has a positive minimum in that interval.


Fig. 14. The graphs of a section of $T(z)$ and of $\frac{1}{z - e^{\tau}}$ for various values of $\tau$

When $\tau > V_l$, $\frac{1}{z - e^{\tau}}$ does not intersect $T(z)$ in the interval $e^{V_{l-1}} < z < e^{V_l}$. When $\tau$ decreases, it intersects at a point. When $\tau = V_{l-1}$, then
\[
\lim_{z \to e^{V_{l-1}}+} \left( \frac{1}{z - e^{V_{l-1}}} - T(z) \right) = \lim_{z \to e^{V_{l-1}}+} \left( \frac{1}{z - e^{V_{l-1}}} - \frac{\frac{1}{2}(1 - \beta_{l-1})}{z - e^{V_{l-1}}} + \text{finite terms} \right) = \infty,
\]
so $\frac{1}{z - e^{\tau}}$ and $T(z)$ intersect. $T(z)$ has a positive minimum in the interval $e^{V_{l-1}} < z < e^{V_l}$, so if $\tau$ is small enough, $\frac{1}{z - e^{\tau}}$ and $T(z)$ will be tangent to each other. This completes the proof in this case (Fig. 14). The case $e^{V_{l-1}} < z < e^{V_l}$, $\beta_{l-1} = -1$, $\beta_l < 1$ is identical. It remains to study the case $e^{V_{l-1}} < z < e^{V_l}$, $\beta_{l-1} = -1$, $\beta_l = 1$. The same thing happens as above, except now the infinite terms in
\[
\lim_{z \to e^{V_{l-1}}+} \left( \frac{1}{z - e^{V_{l-1}}} - T(z) \right)
\]

cancel each other. If the limit is still positive, the argument works the same way as above, and a cusp with $\tau < V_{l-1}$ is obtained. If the limit is negative, then as above it can be shown that a cusp with $\tau > V_{l-1}$ appears when $e^{V_{l-2}} < z < e^{V_{l-1}}$. If the limit is zero, the cusp is at $\tau = V_{l-1}$.

Notice that (28) implies $\frac{d\chi}{d\tau} \to \pm\infty$ when $z \to e^{V_l}$, $\forall l$. Examining (27), it can also be established that $\frac{d\chi}{d\tau} \neq 0$, and since $\frac{d\chi}{d\tau}$ is a continuous function for $z \in (e^{V_{l-1}}, e^{V_l})$, it follows that in the intervals $z \in (e^{V_{l-1}}, e^{V_l})$, $\frac{d\chi}{d\tau}$ does not change its sign. In the $(\tau, \chi)$ plane each connected component except the one at the bottom starts at $\tau = V_{i-1}$, $\chi = \infty$ for some $i$; $\chi$ decreases until it reaches the cusp, then increases to $\tau = V_i$, $\chi = \infty$. See Figs. 8 and 13.

The results of this and previous sections give the following characterization of the frozen boundary:

(1) On the pieces of the back wall where the slope is ±1 the disordered region is bounded above, while at other places it grows infinitely high.
(2) The number of connected components of the frozen boundary is one more than the number of outer corners where at least one of the slopes at the corner is a lattice slope.
(3) The frozen boundary develops a cusp for each such outer corner.

Figures 9, 15 and 16 give several interesting examples of frozen boundaries.

Fig. 15. V = 0, 1, 1.05, 2, 2.05, 3, 3.05, β = 1, −1, 1, 0.7, 1, 0.7

Fig. 16. V = 0.7, 1, 1.05, 1.2, 1.25, 1.5, 1.55, 3, 3.1, β = 1, 0.7, 1, 0.7, 1, 0.7, 1, −1

Acknowledgements. I am very grateful to Peter Tingley for a suggestion on which the proof in Appendix A is based. I am very grateful to Cedric Boutillier and to an anonymous referee for many suggestions to improve the presentation of this paper. I am very grateful to Nicolai Reshetikhin for his guidance. I am also very grateful to Alexei Borodin, Cedric Boutillier and Peter Tingley for many useful discussions on the subject. Lastly, I would like to thank the organizers of the Park City Mathematics Institute Summer School on statistical mechanics, in July 2007, where much of Appendix A was written.


Appendix A. An Integral Formula for $\Phi_{b_k}(z)$

One of the main difficulties in this paper, and in fact also in [4] and [6], is understanding the asymptotics of (1). As in [4], we study the asymptotics of the function
\[
r_k \ln \Phi_{b_{\lambda_k}}(z, t_k) = r_k \ln \frac{\Phi_{-,b_{\lambda_k}}(z, t_k)}{\Phi_{+,b_{\lambda_k}}(z, t_k)}
\]
in the limit when $r_k \to 0$ and $t_k$ is a sequence such that $r_k t_k \to \tau$ (here $\Phi_{-,b_{\lambda_k}}(z, t)$ and $\Phi_{+,b_{\lambda_k}}(z, t)$ are as in (2)). In this Appendix we prove a technical result concerning these asymptotics in a very general setting. We consider an arbitrary continuous function $V(\tau) : \mathbb{R} \to \mathbb{R}$ which is Lipschitz with constant 1. This is a less restrictive assumption on $V$ than in the rest of the paper, where $V(\tau)$ is assumed to be piecewise linear and the domain is restricted to a finite interval. Otherwise, we use notation from the Introduction. In particular, we are considering a sequence of back walls $b_{\lambda_k}(t)$ and $r_k \in \mathbb{R}_{>0}$ such that $r_k$ tends to 0 and the functions $B_{\lambda_k}(\tau)$ defined by
\[
B_{\lambda_k}(\tau) := r_k\, b_{\lambda_k}\!\left(\frac{\tau}{r_k}\right)
\]
converge uniformly to $V(\tau)$.

Lemma A.1. If $B_{\lambda_k}$ converge uniformly to $V$, $\lim_{k\to\infty} r_k = 0$, and $\lim_{k\to\infty} r_k t_k = \tau$, then
\[
\lim_{k\to\infty} r_k \ln \Phi_{b_{\lambda_k}}(z, t_k) = \frac{1}{2}(\tau + V(\tau)) \ln\left(1 - e^{\tau} z^{-1}\right) - \int_{-\infty}^{\tau} \frac{1}{2}(M + V(M)) \frac{-e^{M} z^{-1}}{1 - e^{M} z^{-1}}\, dM
\]
\[
\qquad + \frac{1}{2}(\tau - V(\tau)) \ln\left(1 - e^{-\tau} z\right) + \int_{\tau}^{\infty} \frac{1}{2}(M - V(M)) \frac{e^{-M} z}{1 - e^{-M} z}\, dM.
\]
In particular, $\lim_{k\to\infty} r_k \ln \Phi_{b_{\lambda_k}}(z, t_k)$ is independent of the family $\{b_{\lambda_k}(t)\}_k$. The convergence is uniform in $z$ on compact subsets of $\mathbb{C}$.

Proof. Let us analyze the denominator of $\Phi_{b_{\lambda_k}}(z, t_k)$. The same arguments will work for the numerator. Define
\[
D_k := r_k \ln \frac{1}{\Phi_{+,b_{\lambda_k}}(z, t_k)}.
\]
From (2) we get
\[
D_k = -r_k \ln\Bigg( \prod_{m > t_k,\ m \in D^{+},\ m \in \mathbb{Z} + \frac{1}{2}} (1 - z q_k^{m}) \Bigg) = r_k \Bigg( - \sum_{m > t_k,\ b_k'(m) = -1} \ln\left(1 - e^{-r_k m} z\right) \Bigg).
\]
Notice that
\[
\frac{1}{2}\left(1 - b_k'(m)\right) = \begin{cases} 1, & b_k'(m) = -1 \\ 0, & b_k'(m) = 1. \end{cases} \tag{29}
\]


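Make the change of variable $M = r_k m$ and set $\tau_k = r_k (t_k + \frac{1}{2})$. $D_k$ can be rewritten as
\[
D_k = -\sum_{m > t_k} \frac{1}{2} r_k \left(1 - b_k'(m)\right) \ln\left(1 - e^{-r_k m} z\right) = -\sum_{M \in \{\tau_k, \tau_k + r_k, \tau_k + 2r_k, \dots\}} \frac{1}{2} r_k \left(1 - B_{\lambda_k}'(M)\right) \ln\left(1 - e^{-M} z\right).
\]
To make formulas simpler, define $f(M) := -\ln\left(1 - e^{-M} z\right)$. Let $\sigma > \tau$, and let $\sigma_k$ be a sequence such that $\frac{\sigma_k - \tau_k}{r_k} \in \mathbb{Z}_{+}$ and $\lim_{k\to\infty} \sigma_k = \sigma$. Define
\[
D_k^{\sigma} := \sum_{M \in \{\tau_k, \tau_k + r_k, \tau_k + 2r_k, \dots, \sigma_k\}} \frac{1}{2} r_k \left(1 - B_{\lambda_k}'(M)\right) f(M).
\]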

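Since $B_{\lambda_k}(s)$ has constant slope when $s \in (M, M + r_k)$, we can write $D_k^{\sigma}$ as
\[
D_k^{\sigma} = \sum_{M \in \{\tau_k, \tau_k + r_k, \tau_k + 2r_k, \dots, \sigma_k\}} \frac{1}{2} r_k \left(1 - \frac{B_{\lambda_k}(M + r_k) - B_{\lambda_k}(M)}{r_k}\right) f(M)
\]
\[
\qquad = \sum_{M \in \{\tau_k, \tau_k + r_k, \tau_k + 2r_k, \dots, \sigma_k\}} \frac{1}{2} \left( [M + r_k - B_{\lambda_k}(M + r_k)] - [M - B_{\lambda_k}(M)] \right) f(M).
\]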


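Now, using $(a_{n+1} - a_n) b_n = (a_{n+1} b_{n+1} - a_n b_n) - a_{n+1}(b_{n+1} - b_n)$, rewrite $D_k^{\sigma}$ as
\[
D_k^{\sigma} = \sum_{M \in \{\tau_k, \tau_k + r_k, \dots, \sigma_k\}} \frac{1}{2} \left[ (M + r_k - B_{\lambda_k}(M + r_k)) f(M + r_k) - (M - B_{\lambda_k}(M)) f(M) \right]
\]
\[
\qquad - \sum_{M \in \{\tau_k, \tau_k + r_k, \dots, \sigma_k\}} \frac{1}{2} (M + r_k - B_{\lambda_k}(M + r_k)) \left( f(M + r_k) - f(M) \right).
\]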


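Summing up the telescoping sum and rewriting the second one gives
\[
D_k^{\sigma} = \frac{1}{2} (\sigma_k + r_k - B_{\lambda_k}(\sigma_k + r_k)) f(\sigma_k + r_k) - \frac{1}{2} (\tau_k - B_{\lambda_k}(\tau_k)) f(\tau_k)
\]
\[
\qquad - \sum_{M \in \{\tau_k, \tau_k + r_k, \tau_k + 2r_k, \dots, \sigma_k\}} \frac{1}{2} r_k (M + r_k - B_{\lambda_k}(M + r_k)) \frac{f(M + r_k) - f(M)}{r_k}.
\]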
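The summation-by-parts identity used in this step is easy to check on arbitrary finite sequences. The sketch below is only an illustration with random sample data; the names `lhs` and `rhs` are hypothetical helpers, and `a`, `b` play the roles of $M - B_{\lambda_k}(M)$ and $f(M)$ sampled on the grid.

```python
import random

def lhs(a, b):
    # sum_n (a_{n+1} - a_n) * b_n
    return sum((a[n + 1] - a[n]) * b[n] for n in range(len(a) - 1))

def rhs(a, b):
    # Telescoped boundary term minus the correction sum:
    # (a_N b_N - a_0 b_0) - sum_n a_{n+1} (b_{n+1} - b_n)
    N = len(a) - 1
    boundary = a[N] * b[N] - a[0] * b[0]
    correction = sum(a[n + 1] * (b[n + 1] - b[n]) for n in range(N))
    return boundary - correction

random.seed(0)
a = [random.uniform(-1, 1) for _ in range(50)]
b = [random.uniform(-1, 1) for _ in range(50)]
difference = abs(lhs(a, b) - rhs(a, b))
```

Up to floating-point rounding, `difference` vanishes for any pair of sequences of equal length.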

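Since $B_{\lambda_k}(M)$ converge uniformly to $V(M)$, and $f'(M)$ and $f''(M)$ are bounded when $M \in (\tau_k, \sigma_k)$, we have
\[
\lim_{k\to\infty} \Bigg( D_k^{\sigma} - \Bigg( \frac{1}{2} (\sigma_k - V(\sigma_k)) f(\sigma_k) - \frac{1}{2} (\tau_k - V(\tau_k)) f(\tau_k) - \sum_{M \in \{\tau_k, \tau_k + r_k, \dots, \sigma_k\}} \frac{1}{2} r_k (M - V(M)) f'(M) \Bigg) \Bigg) = 0.
\]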


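The sum above is a Riemann sum for $\frac{1}{2} \int_{\tau}^{\sigma} (M - V(M)) f'(M)\, dM$ and thus in the limit $r_k \to 0$ it converges to the integral, so we have
\[
\lim_{k\to\infty} D_k^{\sigma} = \frac{1}{2} (\sigma - V(\sigma)) f(\sigma) - \frac{1}{2} (\tau - V(\tau)) f(\tau) - \frac{1}{2} \int_{\tau}^{\sigma} (M - V(M)) f'(M)\, dM.
\]
Since $\lim_{\sigma\to\infty} \frac{1}{2} (\sigma - V(\sigma)) f(\sigma) = 0$, we obtain
\[
\lim_{k\to\infty} D_k = -\frac{1}{2} (\tau - V(\tau)) f(\tau) - \frac{1}{2} \int_{\tau}^{\infty} (M - V(M)) f'(M)\, dM.
\]
Similarly, if $N_k$ is defined as
\[
N_k := r_k \ln \Phi_{-,b_{\lambda_k}}(z, t_k),
\]
it can be shown that
\[
\lim_{k\to\infty} N_k = \frac{1}{2} (\tau + V(\tau)) \ln\left(1 - e^{\tau} z^{-1}\right) - \int_{-\infty}^{\tau} \frac{1}{2} (M + V(M)) \frac{-e^{M} z^{-1}}{1 - e^{M} z^{-1}}\, dM.
\]
Combining the results completes the proof.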

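Corollary A.2. If $V(\tau)$ is piecewise differentiable, $\lim_{k\to\infty} r_k = 0$, and $\lim_{k\to\infty} r_k t_k = \tau$, then
\[
\lim_{k\to\infty} r_k \ln \Phi_{b_{\lambda_k}}(z, t_k) = \int_{-\infty}^{\tau} \frac{1}{2} (1 + V'(M)) \ln\left(1 - e^{M} z^{-1}\right) dM - \int_{\tau}^{\infty} \frac{1}{2} (1 - V'(M)) \ln\left(1 - e^{-M} z\right) dM.
\]

Proof. This follows from Lemma A.1 by integration by parts.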
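The integration by parts behind the corollary can be checked numerically for the terms coming from $D_k$. The sketch below makes several illustrative assumptions that are not in the text: the wall profile is the sample function $V(M) = M/2$ (Lipschitz with constant $1/2 \le 1$), the point $z = -2$ is chosen so that the logarithm has a positive real argument, the upper limit $\infty$ is truncated at $\tau + 40$, and a hand-written composite Simpson rule replaces a library integrator.

```python
import math

def simpson(func, a, b, n=4000):
    # Composite Simpson rule on [a, b]; n must be even.
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

z = -2.0          # sample point with 1 - exp(-M) z > 0 for all real M
tau = 0.5         # sample value of tau
cutoff = tau + 40.0  # truncation of the integral over (tau, infinity)

def V(M):
    return 0.5 * M   # hypothetical wall profile

def dV(M):
    return 0.5       # its derivative

def log_term(M):
    return math.log(1.0 - math.exp(-M) * z)

def kernel(M):
    # d/dM of log_term(M)
    return math.exp(-M) * z / (1.0 - math.exp(-M) * z)

# Form appearing in Lemma A.1 (boundary term plus integral against the kernel).
lemma_side = (0.5 * (tau - V(tau)) * log_term(tau)
              + simpson(lambda M: 0.5 * (M - V(M)) * kernel(M), tau, cutoff))

# Form appearing in Corollary A.2 (integral against the logarithm).
corollary_side = -simpson(lambda M: 0.5 * (1.0 - dV(M)) * log_term(M), tau, cutoff)
```

The two quantities agree to within the quadrature error; the analogous check for the terms coming from $N_k$ works the same way on $(-\infty, \tau)$.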

References
[1] Boutillier, C., Mkrtchyan, S., Reshetikhin, N., Tingley, P.: Random skew plane partitions with a piecewise periodic back wall. http://arXiv.org/abs/0912.3968v2 [math-ph], 2009
[2] Boutillier, C.: The bead model & limit behaviors of dimer models. Annals of Prob. 37(1), 107–142 (2009)
[3] Kenyon, R., Okounkov, A.: Limit shapes and the complex Burgers equation. Acta Math. 199(2), 263–302 (2007)
[4] Okounkov, A., Reshetikhin, N.: Correlation function of Schur process with application to local geometry of a random 3-dimensional Young diagram. J. Amer. Math. Soc. 16, 581–603 (2003); (electronic)
[5] Okounkov, A., Reshetikhin, N.: The birth of a random matrix. Moscow Math. J. 6(3), 553–566 (2006)
[6] Okounkov, A., Reshetikhin, N.: Random skew plane partitions and the Pearcey process. Commun. Math. Phys. 269(3), 571–609 (2007)

Communicated by S. Smirnov




Branching of Cantor Manifolds of Elliptic Tori and Applications to PDEs

Massimiliano Berti1, Luca Biasco2

1 Dipartimento di Matematica e Applicazioni “R. Caccioppoli”, Università degli Studi Napoli Federico II, Via Cintia, Monte S. Angelo, 80126 Napoli, Italy. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Roma 3, Largo San Leonardo Murialdo, 00146 Roma, Italy. E-mail: [email protected]

Received: 4 May 2010 / Accepted: 22 November 2010
Published online: 22 May 2011 – © Springer-Verlag 2011

Abstract: We consider infinite dimensional Hamiltonian systems. We prove the existence of “Cantor manifolds” of elliptic tori, of any finite higher dimension, accumulating on a given elliptic KAM torus. Then, close to an elliptic equilibrium, we show the existence of Cantor manifolds of elliptic tori which are “branching” points of other Cantor manifolds of higher dimensional tori. We also answer a conjecture of Bourgain, proving the existence of invariant elliptic tori with tangential frequency along a pre-assigned direction. The proofs are based on an improved KAM theorem. Its main advantages are an explicit characterization of the Cantor set of parameters and weaker smallness conditions on the perturbation. We apply these results to the nonlinear wave equation.

Contents
1. Introduction
2. Cantor Manifolds of Tori Close to an Elliptic Torus
3. Branching of Cantor Manifolds of Elliptic Tori
4. Application to Nonlinear Wave Equation
5. An Improved Basic KAM Theorem
6. Proof of Theorem 2.1
7. Proof of Theorem 3.1
7.1 Measure estimates
8. Proof of the Basic KAM Theorem 5.1
8.1 Technical lemmata
8.2 A class of symplectic transformations
8.3 The KAM step
8.4 KAM Iteration
9. Appendix

Supported by the European Research Council under FP7 “New connections between dynamical systems and Hamiltonian PDEs with small divisors phenomena”.


1. Introduction A central topic in the theory of Hamiltonian partial differential equations (PDEs) concerns the existence of quasi-periodic solutions. In the last twenty years several existence results have been proved using both KAM theory, see e.g. Wayne [30], Kuksin [24], Pöschel [25,27], Eliasson-Kuksin [17] (and references therein), or Newton-Nash-Moser implicit function techniques, see e.g. Craig-Wayne [15], Bourgain [9–11], Berti-Bolle [5] and with Procesi [6]. We mention also the recent approach with Lindstedt series by Gentile-Procesi [19]. An advantage of the KAM approach is to provide not only the existence of an invariant torus but also a normal form around it. This would allow, in principle, to study the dynamics of the PDE in its neighborhood. In the existing literature only quasi-periodic solutions of PDEs in a neighborhood of an elliptic equilibrium (see Kuksin [24], Craig [14] for survey) or perturbations of finite gap solutions of integrable PDEs (see Kuksin [24], Kappeler-Pöschel [22]) are considered. In this paper we study the dynamics of infinite dimensional Hamiltonian systems near an elliptic torus. In particular we develop an abstract KAM theory for proving the existence of “Cantor manifolds” of elliptic invariant tori near a given elliptic torus. For finite dimensional Hamiltonian systems, the dynamics close to a lagrangian KAM torus has been deeply investigated, see for example [20]. On the other hand the existence of lower dimensional tori in a neighborhood of an elliptic torus requires, also in finite dimension, a more refined KAM theorem (it is a corollary of our general results). As is well known, the difficulty comes from the presence of the elliptic directions. 
Our first result states, roughly, the following (see Theorem 2.1 for a precise statement): Given an n-dimensional torus with an elliptic KAM normal form around it, we prove, under the natural non-resonance and non-degeneracy assumptions, the existence of “Cantor manifolds” of elliptic tori–of any finite higher dimension nˆ ≥ n– accumulating on it. This result is based on two main steps. We first perform a Birkhoff normalization (see the “averaging” Proposition 6.1) assuming the natural non-resonance conditions on the tangential and normal frequencies of the torus, see (2.12). These conditions are similar to those used in Bambusi [1], Bambusi-Grébert [4], for an elliptic equilibrium. The next step is to apply KAM theory. Due to the third order monomials on the high mode variables in (6.3)–(6.4), the KAM theorems available in the literature would apply only requiring stronger non-resonance assumptions, see Remark 2.2. Therefore we use the improved KAM Theorem 5.1. Note that these refined estimates are required only for the search of small amplitude solutions and not for perturbations of linear PDEs as considered in [23,24,30] where the size of the perturbation is an external parameter. For finite dimensional systems a similar result has been proved by Jorba-Villanueva [21]. As a second result, we prove an abstract theorem describing a branching phenomenon of Cantor manifolds of elliptic tori of increasing dimension (see Theorem 3.1 for a precise statement): Close to an elliptic equilibrium there exist, under the natural non-resonance and non-degeneracy assumptions, Cantor manifolds of elliptic tori which are “branching points” of other Cantor manifolds of higher dimensional tori.


This result relies on Theorem 2.1. The main difficulty is to check that, after the first application of the KAM theorem close to the equilibrium, the perturbed frequencies of the deformed elliptic torus fulfill the non-resonance conditions required by Theorem 2.1. This is achieved in Sect. 7, thanks to the explicit characterization of the Cantor set of non-resonant parameters provided by the basic KAM Theorem 5.1. Theorem 3.1 can also be seen as a “building block” for constructing small amplitude almost periodic solutions for PDEs without external parameters. Actually, with the present estimates, we can prove the existence of only finitely many branches of finite dimensional elliptic tori. The existence of almost periodic solutions has been proved by a similar scheme in Pöschel [28], for a nonlinear Schrödinger equation with regularizing nonlinearity, using the potential as infinitely many external parameters. We apply these abstract results to the nonlinear wave equation (NLW), which is more difficult, for KAM theory, because of the linear asymptotic growth of the frequencies. From Theorem 3.1 we deduce in Theorem 4.1 the existence of a new kind of quasi-periodic solutions of
\[
\begin{cases} u_{tt} - u_{xx} + m u + f(u) = 0 \\ u(t, 0) = u(t, \pi) = 0 \end{cases} \tag{1.1}
\]
for almost all the masses $m > 0$ and for real analytic, odd nonlinearities of the form
\[
f(u) = \sum_{k \geq 3,\ k \text{ odd}} a_k u^k, \quad a_3 \neq 0. \tag{1.2}
\]

These quasi-periodic solutions are different from the ones obtained in [8,27] since they accumulate to a torus and not to the origin. As already said, a basic tool for proving the above results is the improved KAM Theorem 5.1. Its main advantages are: (i) the KAM smallness conditions are weaker than in [26], see comments after KAM Theorem 5.1. This is achieved modifying the iterative scheme of [24,26], as described in Sect. 5. (ii) The final Cantor set of parameters, satisfying the Melnikov non-resonance conditions at all the KAM iterative steps, is completely explicit in terms of the final frequencies only, see (5.13). A new aspect of Theorem 5.1 is the complete separation between the iterative scheme for the construction of invariant tori and the existence of enough non-resonant frequencies at every step of the iterative process, see [5] for a similar construction in the Nash-Moser setting. In previous KAM theorems the Cantor set of non-resonant parameters is known “a posteriori”, see [25]. The key point here is that the final frequencies are always well defined also if the iterative KAM process stops after finitely many steps (and so there are no invariant tori for any value of the parameters). The present formulation simplifies considerably the necessary measure estimates, see, as applications, Theorems 5.2, 5.3, and Sect. 7.1. The characterization in (5.13) of the Cantor set in terms of the final frequencies only is new also for finite dimensional elliptic tori (for lagrangian tori see [12,13]). It allows also to prove in a simpler way the results of [21] valid in finite dimensions, see Theorem 2.1. It simplifies also the measure estimates of degenerate KAM theory, see for example [3] for an extension to PDEs. In particular it allows to avoid the notions of “links” and “chains” used in [29]. Thanks to the explicit characterization of the Cantor set (5.13) we are also able to answer positively a conjecture by Bourgain in [9]. We prove


- the existence of elliptic invariant KAM tori with tangential frequency constrained to a fixed Diophantine direction, see Theorem 3.2. An application to the NLW equation (1.1) is given in Theorem 4.2.

This kind of result was proved for finite dimensional Hamiltonian systems by Eliasson [16] and Bourgain [9], who raised the question of whether a similar result can be achieved also for infinite dimensional Hamiltonian systems. For a result for NLW in this direction see [18]. We hope that the results and techniques of this paper will be used to develop a more general description of the dynamics of the PDE in a neighborhood of a given elliptic torus, proving, for example, stability results as in Bambusi [1], Bambusi-Grébert [4]. Before presenting our results precisely, we introduce the functional setting and the main notations concerning infinite dimensional Hamiltonian systems.

Functional setting and notations

Phase space. We consider the Hilbert space of complex-valued sequences
\[
\ell^{a,p} := \Big\{ z = (z_1, z_2, \dots) : \|z\|_{a,p}^2 := \sum_{j \geq 1} |z_j|^2 j^{2p} e^{2ja} < +\infty \Big\}
\]
with $a > 0$, $p > 1/2$, and the toroidal phase space
\[
(x, y, w) \in \mathbb{T}^n_s \times \mathbb{C}^n \times \ell_b^{a,p}, \quad w := (z, \bar{z}) \in \ell_b^{a,p} := \ell^{a,p} \times \ell^{a,p},
\]
where $\mathbb{T}^n_s$ is the complex open $s$-neighborhood of the $n$-torus $\mathbb{T}^n := \mathbb{R}^n / (2\pi\mathbb{Z})^n$. Let
\[
D(s, r) := \left\{ |\mathrm{Im}\, x| < s,\ |y| < r^2,\ \|w\|_{a,p} < r \right\} \subset \mathbb{T}^n_s \times \mathbb{C}^n \times \ell_b^{a,p}, \quad 0 < s, r < 1,
\]
where $|y| := \sup_{j=1,\dots,n} |y_j|$.

Hamiltonian system. Given a function $H : D(s, r) \to \mathbb{C}$, we will study the Hamiltonian system
\[
(\dot{x}, \dot{y}, \dot{w}) = X_H(x, y, w), \tag{1.3}
\]
where $X_H$ is the Hamiltonian vector field of $H$,
\[
X_H = (\partial_y H, -\partial_x H, -\mathrm{i} J \partial_w H),
\]
where $\partial_{z_j} H(x, y, z, \bar{z}) := \frac{d}{d\varepsilon}\big|_{\varepsilon = 0} H(x, y, z + \varepsilon e_j, \bar{z})$ with $e_j := (0, \dots, 0, 1, 0, \dots)$ (similarly for $\bar{z}_j$) and
\[
J := \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}.
\]
We also define the Poisson brackets
\[
\{H, F\} := \partial_x H \cdot \partial_y F - \partial_y H \cdot \partial_x F - \mathrm{i} J \partial_w F \cdot \partial_w H, \tag{1.4}
\]
where “ · ” denotes the standard pairing $a \cdot b := \sum_j a_j b_j$.



Analytic functions. Given a complex Banach space E, we consider analytic functions f : D(s, r ) × → E ,

(1.5)

possibly depending on parameters ξ ∈ ⊂ Rm . We define the sup-norm | f |s,r := | f |s,r,,E :=

sup

(x,y,w;ξ )∈D(s,r )×

f (x, y, w; ξ ) E .

(1.6)

We denote simply by | · |s the sup-norm of functions independent of (y, w). Any analytic function P : D(r, s) → C can be developed in a totally convergent power series P(x, y, w; ξ ) = Pi j (x; ξ )y i w j , i, j≥0

where ⎛

⎞ j−times i−times ⎜ ⎟ a, p a, p Pi j (x) := Pi j (x; ξ ) ∈ L ⎝Cn × . . . × Cn × b × . . . × b , C⎠

(1.7)

are multilinear, symmetric, bounded maps. For simplicity of notation, we will often omit the explicit dependence on ξ . We assume the analytic Hamiltonian vector field is regularizing, namely X P : a, p¯ D(r, s) → C2n × b , p¯ ≥ p. We identify P10 (x) ∈ L(Cn , C), resp. P01 (x) ∈ a, p L(b , C), with the vector P10 (x) = ∂ y|y=0,w=0 P ∈ Cn , resp. P01 (x) = a, p¯ ∂w|y=0,w=0 P ∈ b , writing P10 (x)y = P10 (x) · y , resp. P01 (x)w = P01 (x) · w . a, p

a, p

Moreover we identify the bilinear symmetric form P02 (x) ∈ L(b × b , C) with the a, p a, p¯ operator P02 (x) ∈ L(b , b ) defined by a, p

P02 (x)w 2 = P02 (x)w · w , ∀w ∈ b . j

In general we identify the Pi j in (1.7) and ∂ yi ∂w P with vector valued multilinear forms, for j ≥ 1, ⎞ ⎛ ( j−1)−times i−times ⎜ a, p a, p a, p¯ ⎟ Pi j (x) , ∂ yi ∂wj P(x, y, w) ∈ L ⎝Cn × . . . × Cn × b × . . . × b , b ⎠ . (1.8) For j ≥ 1, |Pi j |s = |∂ yi ∂wj P|s,r =

sup

(x;ξ )∈Ts ×

sup

Pi j (x; ξ ) ,

(x,y,w;ξ )∈D(s,r )×

∂ yi ∂wj P(x, y, w; ξ ) ,

(1.9)

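where $\|\cdot\|$ denotes the operatorial norm on $L(\underbrace{\mathbb{C}^n \times \dots \times \mathbb{C}^n}_{i\text{-times}} \times \underbrace{\ell_b^{a,p} \times \dots \times \ell_b^{a,p}}_{(j-1)\text{-times}}, \ell_b^{a,\bar{p}})$. We define
\[
P_{\leq 2} := P_{00} + P_{01} w + P_{10} y + P_{02} w \cdot w. \tag{1.10}
\]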

The [ ]-operator. We define the operator [·] acting on monomials Q := q(x)y i z a z¯ a¯ , i, a, a¯ ∈ N∞ , by [Q] :=

Q = q y i z a z¯ a¯ if a = a¯ , 0 otherwise

(1.11)

where q := (2π )−n Tn q(x)d x denotes the average with respect to the angles. Lipschitz norms. Given a function f as in (1.5) we define the Lipschitz semi-norm lip

| f |s,r =

sup

ξ,ζ ∈,ξ =ζ

| f (·; ξ ) − f (·; ζ )|s,r |ξ − ζ |

(1.12)

and, given λ ≥ 0, the Lipschitz norm λ | · |r,s := | · |r,s + λ| · |r,s . lip

(1.13)

We will always use the symbol “λ” in this role, not to be confused with exponentiation. We denote the Lipschitz norm of functions independent of (y, w) more simply by | · |λs . Miscellanea. Given l ∈ Z∞ we define |l| :=

|l j | , |l| p :=

j≥1

j≥1

⎞ ⎛ p d ⎠ ⎝ j |l j | , l d := max 1, j lj j≥1

and the unit versors e j := (0, . . . , 0, 1, 0, . . .) with zero components except the j th one. We define the space −δ ∞

:= := ( 1 , 2 , . . .), j ∈ R : | ||−δ := sup j

−δ

| j | < +∞

j≥1

and the Lipschitz norm | ||λ−δ := sup | (ξ )||−δ + λ|| ||−δ where | ||−δ := lip

ξ ∈

lip

sup

ξ,ζ ∈,ξ =ζ

| (ξ ) − (ζ )||−δ . |ξ − ζ | (1.14)

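Finally, for $\tau > n - 1$, $\eta > 0$, we define the set of Diophantine vectors
\[
\mathcal{D}_{\eta,\tau} := \left\{ \omega \in \mathbb{R}^n : |\omega \cdot k| \geq \frac{\eta}{1 + |k|^{\tau}},\ \forall\, k \in \mathbb{Z}^n \setminus \{0\} \right\}. \tag{1.15}
\]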
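The Diophantine condition (1.15) quantifies over all nonzero integer vectors, so a computer can only test a finite truncation of it. The following sketch does exactly that and is not a proof of membership. Its assumptions are illustrative: $|k|$ is taken to be the sum of the absolute values of the components, the function name `satisfies_diophantine` and the cutoff `K` are hypothetical, and the sample frequency vectors and the values of $\eta$, $\tau$ are chosen for demonstration.

```python
import itertools
import math

def satisfies_diophantine(omega, eta, tau, K):
    """Check |omega . k| >= eta / (1 + |k|^tau) for all integer k with 0 < |k| <= K.

    |k| is the 1-norm (assumed convention). Passing this test for one K
    does not establish the condition for all k.
    """
    n = len(omega)
    for k in itertools.product(range(-K, K + 1), repeat=n):
        norm = sum(abs(c) for c in k)
        if norm == 0 or norm > K:
            continue
        small_divisor = abs(sum(w * c for w, c in zip(omega, k)))
        if small_divisor < eta / (1 + norm ** tau):
            return False
    return True

golden = (1 + math.sqrt(5)) / 2

# (1, golden ratio): badly approximable, passes the truncated test.
ok_golden = satisfies_diophantine((1.0, golden), eta=0.5, tau=1.5, K=30)

# (1, 2): resonant, since k = (2, -1) gives omega . k = 0.
ok_resonant = satisfies_diophantine((1.0, 2.0), eta=0.5, tau=1.5, K=30)
```

Here $n = 2$ and $\tau = 1.5 > n - 1$, as required in the definition; the resonant vector fails at the first $k$ with $\omega \cdot k = 0$.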



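2. Cantor Manifolds of Tori Close to an Elliptic Torus

The KAM-normal form Hamiltonian
\[
H = H(x, y, z, \bar{z}) = N + P = \omega \cdot y + \Omega \cdot z\bar{z} + \sum_{2i + j \geq 3} P_{ij}(x) y^i w^j \tag{2.1}
\]
possesses the elliptic invariant torus
\[
\mathcal{T}_0 = \mathbb{T}^n \times \{0\} \times \{0\} \times \{0\} \tag{2.2}
\]
with tangential and normal frequencies $\omega := (\omega_1, \dots, \omega_n) \in \mathbb{R}^n$, $\Omega := (\Omega_{n+1}, \dots)$, respectively. In (2.1) the variables are $w = (z, \bar{z})$ with $z = (z_{n+1}, \dots)$. We assume

• Frequency asymptotics. The $\Omega_j \in \mathbb{R}$ and there exists $d \geq 1$ such that
\[
\Omega_j = j^d + \dots, \quad j \geq 1, \tag{2.3}
\]
where the dots stand for lower order terms in $j$. For $d = 1$, let $\kappa$ be a positive constant such that
\[
\frac{\Omega_i - \Omega_j}{i - j} = 1 + O(j^{-\kappa}), \quad \forall\, i > j. \tag{2.4}
\]
We also set
\[
\mu := \begin{cases} 1 & \text{if } d > 1 \\ \kappa/(\kappa + 1) & \text{if } d = 1. \end{cases} \tag{2.5}
\]

• Regularity. The vector field $X_P$ is real analytic and
\[
X_P : D(s, r) \to \mathbb{C}^n \times \mathbb{C}^n \times \ell_b^{a,\bar{p}} \quad \text{with} \quad \begin{cases} \bar{p} \geq p & \text{if } d > 1 \\ \bar{p} > p & \text{if } d = 1. \end{cases} \tag{2.6}
\]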

We aim to prove the existence of finite dimensional elliptic tori of any arbitrary dimension nˆ ≥ n accumulating onto the elliptic torus T0 . We denote the augmented frequencies ˆ := ( n+1 ωˆ := (ω1 , . . . , ωn , n+1 , . . . , nˆ ) ∈ Rnˆ , ˆ , . . .) , the coordinates ˜ w) ˆ , w˜ = (˜z , z¯˜ ) , wˆ = (ˆz , z¯ˆ ) , z = (˜z , zˆ ) , z˜ := (z n+1 , . . . , z nˆ ) , zˆ := (z n+1 ˆ , . . .) , w = (w, and the actions yˆ := (y, y˜ ) ,

y˜ :=

1 (z n+1 z¯ n+1 , . . . , z nˆ z¯ nˆ ) , 2

1 Zˆ = (z n+1 ˆ z¯ n+1 ˆ , . . .) . 2

We decompose any l = (ln+1 , . . .) ∈ Z∞ as ˜ l) ˆ with l˜ := (ln+1 , . . . , lnˆ ) , lˆ := (ln+1 l = (l, ˆ , . . .) .

(2.7)

Given Pi j (see (1.7)) we define the coefficients Pi j˜jˆ , for j˜, jˆ ∈ N with j˜ + jˆ = j, by the relation Pi j y i w j = Pi j˜jˆ y i w˜ j˜ wˆ jˆ . j˜+jˆ= j


We introduce the symmetric n-dimensional ˆ “twist” matrix ⎛ ⎞ 2[P200 ] [P120 ] ⎠, Aˆ ∈ Mat(nˆ × n) ˆ , Aˆ := ⎝ [P120 ] 2[P040 ]

(2.8)

where the matrices [P200 ], [P040 ], [P120 ] are defined by1 [P200 ]y · y := [P200 y 2 ] ,

[P040 ] y˜ · y˜ := [P040 w˜ 4 ] ,

[P120 ]y · y˜ := [P120 y w˜ 2 ] (2.9)

and the [ ] operator in (1.11). We also define [P102 ], [P022 ], by [P102 ]y · Zˆ := [P102 y wˆ 2 ] , [P022 ] y˜ · Zˆ := [P022 w˜ 2 wˆ 2 ] and p− ¯ p Bˆ := ( [P102 ] [P022 ] ) ∈ L(Cnˆ , ∞ ) ,

(2.10)

the last property being valid thanks to the regularizing property (2.6). We set 2(d − 1)−1 + n + 1 if d > 1 τ := −1 (n + 2)(δ∗ − 1)δ∗ + 1 if d = 1

(2.11)

with δ∗ fixed below. Theorem 2.1 (Higher dimensional tori close to an elliptic torus). Consider an Hamiltonian H as in (2.1) satisfying (2.3), (2.6), and, if d = 1, μ > 9/14 (see (2.5)). Fix nˆ ≥ n. Then there exists a constant c > 0 such that, if the following assumptions hold: • (Melnikov conditions) For some α > 0, |ω · k + · l| ≥ α

l d ˜ l) ˆ ∈ n,D , ∀k ∈ Zn , l = (l, , (k, l) = 0 , ˆ 1 + |k|τ

where τ is defined in (2.11) with δ∗ = p − p, ¯ and n,D ˆ

ˆ ≤ 2 ∪ |l| ˜ = D, |l| ˆ =1 , := |l| ≤ D, |l|

• (Twist) Aˆ is invertible. ˆ ≤ 2 there hold • (Non-resonance) ∀ 0 < |l| ˆ − Bˆ Aˆ −1 ωˆ · lˆ = 0 .

D :=

(2.12)

4 if d > 1 6 if d = 1 .

(2.13)

• (Smallness) The third order terms satisfy (|P11 |s + |P03 |s )2 ≤ cα ,

(2.14)

! ! 1 The matrices [P ] ∈ Mat(n × n), [P ] ∈ Mat (nˆ − n) × (nˆ − n) , [P ] ∈ Mat (nˆ − n) × n . 200 040 120 ! Similarly [P102 ] ∈ Mat(∞ × n), [P022 ] ∈ Mat ∞ × (nˆ − n) .



then there exists a 2n-dimensional ˆ Cantor manifold of real analytic, elliptic, diophantine n-dimensional ˆ tori accumulating onto the n-dimensional elliptic torus T0 . The above Cantor manifold has the same geometric structure described in [25]. The ˆ B. ˆ ˆ A, constant c depends on n, τ, s, d, A, B, n, ˆ ω, ˆ , ˆ (2.13) implies Remark 2.1. By (2.3), (2.4) and the regularizing property (2.10) of B, ˆ > 0. ˆ − Bˆ Aˆ −1 ω) inf |( ˆ · l|

ˆ 0 1 and D = 7 for d = 1 and μ = 2/3 (as for NLW, see (4.5)). See also Remarks 6.1 and 6.3. 3. Branching of Cantor Manifolds of Elliptic Tori We consider an Hamiltonian H = + Q + R,

(3.1)

where R is a higher order perturbation of an integrable normal form + Q. In complex coordinates (ζ, ζ¯ ) and, setting I :=

1 (ζ1 ζ¯1 , . . . , ζn ζ¯n ) , 2

Z :=

1 (ζn+1 ζ¯n+1 , . . .) , 2

the normal form consists of the terms := a · I + b · Z ,

Q :=

1 AI · I + BI · Z , 2

(3.2)

where a, b and A, B denote, respectively, vectors and matrices with constant coefficients. Fixed nˆ ≥ n, we assume that: (A) The normal form + Q is non-degenerate in the following sense: Twist. (A1 ) detA = 0


Non- resonance. (A2 ) (A3 )

b · l = 0 , ∀ 1 ≤ |l| ≤ 2 a · k + b · l = 0 or Ak + Bl = 0 , ∀ k ∈ Zn , l ∈ n,D , (k, l) = 0 . ˆ ˜ ˜ Moreover, if d = 1, a · k + b · (l, 0) ± h = 0 or Ak + B (l, 0) = 0 , ˜ ≤ D − 2 , 1 ≤ h ≤ L 0 + n(D ∀ 0 < |k| ≤ K 0 , |l| ˆ − 2) .

The constants K 0 , L 0 depend only on d, D, a, b, A, B, see (7.34). (B) Frequency asymptotics. There is d ≥ 1 and δ∗ < d − 1 such that b j = j d + · · · + O( j δ∗ ). (C) Regularity. The vector fields X Q , X R are real analytic from some neighborhood a, p a, p¯ of the origin of b into b with p¯ ≥ p defined in (2.6). By increasing δ∗ , if necessary, we may also assume p − p¯ ≤ δ∗ < d − 1 .

(3.3)

Concerning the higher order perturbation $R$ we assume

$$|R| = O(\|z\|_{a,p}^4) + O(\|\zeta\|_{a,p}^g), \quad z := (\zeta_{n+1}, \zeta_{n+2}, \ldots), \quad g > 1 + 3\mu^{-1}, \quad \mu \in (9/14, 1], \tag{3.4}$$

where $\mu$ is defined as in (2.5) and, for $d = 1$, $\kappa$ is a positive constant such that

$$\Big| \frac{b_i - b_j}{i - j} - 1 \Big| \le a_*\, j^{-\kappa}, \quad \forall\, i > j, \tag{3.5}$$

for some $a_* > 0$. For $d = 1$, by increasing $\delta_*$, if necessary, we can assume $-\delta_* < \kappa$. Fix $\hat n \ge n$. We define the augmented frequency vectors

$$\hat a := (a, b_{n+1}, \ldots, b_{\hat n}) \in \mathbb{R}^{\hat n}, \qquad \hat b := (b_{\hat n+1}, b_{\hat n+2}, \ldots), \tag{3.6}$$

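the symmetric “twist” matrix

$$\hat A \in \mathrm{Mat}(\hat n \times \hat n), \qquad \hat A_{ij} := \begin{cases} A_{ij} & \text{if } i, j \le n \\ B_{ij} & \text{if } j \le n < i \le \hat n \\ \partial^4_{\zeta_i\bar\zeta_i\zeta_j\bar\zeta_j} R|_{\zeta=\bar\zeta=0} & \text{if } n < i, j \le \hat n \end{cases}$$

and

$$\hat B \in \mathrm{Mat}(\hat n \times \infty), \qquad \hat B_{ij} := \begin{cases} B_{ij} & \text{if } j \le n \\ \partial^4_{\zeta_i\bar\zeta_i\zeta_j\bar\zeta_j} R|_{\zeta=\bar\zeta=0} & \text{if } n < j \le \hat n. \end{cases}$$

We assume

($\hat A$) Twist. ($\hat A_1$) $\det \hat A \neq 0$.

Non-resonance.

($\hat A_2$) … $> 0$.

($\hat A_3$) $(\hat b - \hat B\hat A^{-1}\hat a)\cdot \hat l \neq 0$, $\forall\, \hat l = (l_{\hat n+1}, l_{\hat n+2}, \ldots)$ with $|\hat l| = 1, 2$.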


Clearly ($\hat A_2$) is stronger than ($A_2$).

Theorem 3.1 (Branching of Cantor manifolds of elliptic tori). Fix $\hat n \ge n$. Suppose $H = \Lambda + Q + R$ satisfies assumptions (A), (B), (C), ($\hat A$) and (3.4). Then
• (i) There exists an n-dimensional Cantor manifold of real analytic, elliptic, diophantine, invariant n-dimensional tori.
• (ii) Each of these n-dimensional elliptic tori possesses another Cantor manifold of real analytic, elliptic, diophantine $\hat n$-dimensional tori with asymptotically full density.

The new result is (ii). Part (i) was proved in Kuksin-Pöschel [25]. We prove Theorem 3.1 as follows. After a Birkhoff normal form step, we introduce the actions as parameters, and, applying Theorem 5.1-(H3), we find a Cantor manifold of n-dimensional tori close to the origin with asymptotically full density (part (i)). For proving part (i) we only require ($A_1$), ($A_2$), (B), (C), (3.4) and

$$a\cdot k + b\cdot l \neq 0 \ \text{ or } \ Ak + Bl \neq 0, \quad \forall\, k \in \mathbb{Z}^n, \ |l| \le 2, \ (k,l) \neq 0, \tag{3.9}$$

as in [25]. The key for proving part (ii) is that, thanks to assumptions ($A_3$) and ($\hat A$), there exists a set of parameters with asymptotically full measure, such that the hypotheses of Theorem 2.1 hold. This is verified in Subsect. 7.1. We strongly exploit the explicit form of the Cantor set in (5.13) proved in the basic KAM Theorem 5.1.

Another minor advantage of the KAM Theorem 5.1 is the following. Since condition (H3) is strictly weaker, when $d = 1$, than the KAM condition in [26] (see comments after Theorem 5.1), Theorem 5.1 simultaneously applies to both cases $d > 1$ and $d = 1$.

Actually we can also improve the result of Theorem 3.1-(i) proving the existence of elliptic tori with tangential frequency restricted to a fixed Diophantine direction, extending to infinite dimensional systems the results of Bourgain [9] and Eliasson [16].

Theorem 3.2. Assume ($A_1$), ($A_2$), (B), (C), (3.4), $a \neq 0$ and $(b - BA^{-1}a)\cdot l \neq 0$, $\forall\, 1 \le |l| \le 2$. Then if $\bar\omega \in \mathcal{D}_{\alpha_0,\tau}$ (see (1.15)) with $\alpha_0 := \rho_0^{1+c}$, $\rho_0 := |\bar\omega - a| > 0$, and $c > 0$ is small enough, then

$$|T|\,(2c\rho_0)^{-1} \to 1 \quad \text{as } \rho_0 \to 0, \tag{3.10}$$

where $T \subset [1 - c\rho_0, 1 + c\rho_0]$ are the $t$ such that $t\bar\omega$ is the tangential frequency of an n-dimensional torus found in Theorem 3.1-(i).

Note that the hypotheses of Theorem 3.2 imply (3.9).

4. Application to Nonlinear Wave Equation

We now apply the results of Sect. 3 to the NLW equation (1.1). We first write (1.1) as an infinite dimensional Hamiltonian system introducing coordinates $q, p \in \ell^{a,p}$, $a > 0$, $p > 1/2$, by setting

$$u = \sum_{j\ge1} \frac{q_j}{\sqrt{\lambda_j}}\,\phi_j, \qquad v = u_t = \sum_{j\ge1} p_j\sqrt{\lambda_j}\,\phi_j, \qquad \text{where } \lambda_j := \sqrt{j^2 + m}, \quad \phi_j := \sqrt{2/\pi}\,\sin(jx).$$


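The Hamiltonian of NLW is

$$H_{NLW} = \int_0^\pi \Big( \frac{v^2}{2} + \frac{u_x^2}{2} + m\frac{u^2}{2} + g(u) \Big)\,dx = \Lambda + G = \frac12\sum_{j\ge1} \lambda_j(q_j^2 + p_j^2) + G(q),$$

where

$$g(s) := \int_0^s f(t)\,dt, \qquad G(q) := \int_0^\pi g\Big( \sum_{j\ge1} q_j\lambda_j^{-1/2}\phi_j \Big)\,dx.$$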

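For $1 \le n \le \hat n$ we choose arbitrarily the “tangential sites”

$$\mathcal{I} := \{i_1, \ldots, i_n\} \subseteq \hat{\mathcal{I}} := \{i_1, \ldots, i_n, i_{n+1}, \ldots, i_{\hat n}\} \subset \mathbb{N}^+. \tag{4.1}$$

By [27] there is a symplectic map transforming $H_{NLW}$ in its partial Birkhoff normal form on the $\hat{\mathcal{I}}$-modes, $H = \Lambda + \bar G + \check G + K$, where $X_{\bar G}, X_{\check G}, X_K$ are analytic from some neighborhood of the origin in $\ell^{a,p}$ into $\ell^{a,p+1}$,

$$\bar G = \frac12 \sum_{i \text{ or } j \in \hat{\mathcal{I}}} \bar G_{ij}\, z_i\bar z_i z_j\bar z_j, \qquad \bar G_{ij} := \frac{6}{\pi}\,\frac{4 - \delta_{ij}}{\lambda_i\lambda_j}, \qquad z_j = \frac{1}{\sqrt2}(q_j + \mathrm{i}p_j), \quad \bar z_j = \frac{1}{\sqrt2}(q_j - \mathrm{i}p_j),$$

$\check G$ is of order four and depends only on $z_i$, $i \notin \hat{\mathcal{I}}$, $K$ is of order six and depends on all the variables $z_i$, $i \in \mathbb{N}$ (for more details we refer to [27] or [7]).

In order to write $H$ in the form (3.1) we renumber the indexes in such a way that the first $n$ modes correspond to the $\mathcal{I}$-modes and the first $\hat n$ modes to the $\hat{\mathcal{I}}$-modes. More precisely we construct a re-ordering $\mathbb{N}^+ \to \mathbb{N}^+$, $j \to i_j$, which is bijective and increasing from $\{1, \ldots, n\}$ onto $\mathcal{I}$, from $\{n+1, \ldots, \hat n\}$ onto $\hat{\mathcal{I}} \setminus \mathcal{I}$ and from $\mathbb{N}^+ \setminus \{1, \ldots, \hat n\}$ onto $\mathbb{N}^+ \setminus \hat{\mathcal{I}}$. Calling the variables $\zeta_j := z_{i_j}$, $\forall\, j \ge 1$, the Hamiltonian $H$ assumes the form (3.1)–(3.2) with

$$a := (\lambda_{i_1}, \ldots, \lambda_{i_n}), \quad b := (\lambda_{i_{n+1}}, \ldots), \quad A := (\bar G_{i_h i_k})_{1 \le h, k \le n}, \quad B := (\bar G_{i_h i_k})_{1 \le k \le n < h}. \tag{4.2}$$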

Let us verify the hypotheses of Theorem 3.1. By [27] the matrix $A$ in (4.2) is invertible, actually

$$(A^{-1})_{hk} = \frac{\pi}{6}\Big( \frac{4}{4n-1} - \delta_{hk} \Big) a_h a_k, \qquad 1 \le h, k \le n. \tag{4.4}$$
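The inverse formula (4.4) can be sanity-checked numerically. The following is a minimal sketch, assuming the coefficients read $\bar G_{ij} = \frac{6}{\pi}\frac{4-\delta_{ij}}{\lambda_i\lambda_j}$ and $a_h = \lambda_{i_h} = \sqrt{i_h^2 + m}$ as above; the function names, the sample sites and the sample mass are illustrative choices and do not come from the paper.

```python
import math

def nlw_twist_matrix(sites, m):
    """Return (A, lam) with A_hk = (6/pi) * (4 - delta_hk) / (lam_h * lam_k)."""
    lam = [math.sqrt(i * i + m) for i in sites]
    n = len(sites)
    A = [[6.0 / math.pi * (4 - (h == k)) / (lam[h] * lam[k]) for k in range(n)]
         for h in range(n)]
    return A, lam

def claimed_inverse(lam):
    """Candidate inverse: (pi/6) * (4/(4n-1) - delta_hk) * a_h * a_k."""
    n = len(lam)
    return [[math.pi / 6.0 * (4.0 / (4 * n - 1) - (h == k)) * lam[h] * lam[k]
             for k in range(n)] for h in range(n)]

def matmul(X, Y):
    """Plain list-of-lists matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def max_identity_defect(sites, m):
    """Largest entry of |A * claimed_inverse - Id|."""
    A, lam = nlw_twist_matrix(sites, m)
    P = matmul(A, claimed_inverse(lam))
    n = len(sites)
    return max(abs(P[i][j] - (i == j)) for i in range(n) for j in range(n))

if __name__ == "__main__":
    # Hypothetical tangential sites and mass, chosen only for illustration.
    print(max_identity_defect([1, 3, 4], m=0.7))
```

The defect stays at rounding level for any choice of distinct sites and positive mass, which is what one expects since $A = \frac{6}{\pi} D^{-1}(4\,\mathbf{1}\mathbf{1}^T - \mathrm{Id})D^{-1}$ with $D = \mathrm{diag}(a_h)$, and the middle factor is inverted by the Sherman-Morrison formula.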



Then ($A_1$) holds. Assumption ($A_2$) holds because the frequencies $\lambda_j$ are simple and non-zero. Still in [27] it is verified that (B), (C) are satisfied with

$$d = 1, \qquad \delta_* = -1, \qquad \bar p = p + 1,$$

as well as (3.4) with (see (4.2) and (3.5))

$$g = 6, \qquad \mu = 2/3 > 9/14, \qquad \kappa = 2. \tag{4.5}$$

Assumption ($A_3$) (which is new with respect to [27]) will be a corollary of the next lemma.

Lemma 4.1. $\forall\, 0 < |l| < \infty$, the function $f_l : (0,\infty) \to \mathbb{R}$, $f_l(m) := (b - BA^{-1}a)\cdot l$ is analytic and non-constant.

Proof. By (4.2) and (4.4) we get $(BA^{-1})_{ij} = 4a_j b_i^{-1}/(4n-1)$ and

$$f_l(m) = \sum_{j>n} l_j\, b_j^{-1}(\alpha m + \beta + i_j^2) \quad \text{with} \quad \alpha := \frac{-1}{4n-1}, \quad \beta := -\frac{4\sum_{1\le j\le n} i_j^2}{4n-1}.$$

Let $j_* := \max\{j > n : l_j \neq 0\}$ and $i_* := \max\{i_j : l_j \neq 0\}$. For $m > i_*^2$ we expand the analytic functions $b_j(m)^{-1}$ in power series

$$b_j^{-1} = \frac{1}{\sqrt m}\sum_{k\ge0} c_k\, i_j^{2k}\, m^{-k} \quad \text{with} \quad c_0 := 1, \quad c_k := \Big(-\frac12\Big)\Big(-\frac12 - 1\Big)\cdots\Big(-\frac12 - k + 1\Big)\Big/k! \neq 0.$$

Then

$$f_l(m) = \alpha\sqrt m \sum_{n<j\le j_*} l_j + \frac{1}{\sqrt m}\sum_{k\ge0} c_k\, p_k\, m^{-k} \quad \text{where} \quad p_k := \sum_{n<j\le j_*} l_j\, i_j^{2k}\, q_{k i_j}$$

and $q_{ki} := L_i + \frac{\alpha i^2}{2(k+1)}$, $L_i := (1-\alpha)i^2 + \beta$. We prove that $f_l(m)$ is not constant showing that $p_k \neq 0$ for $k$ large enough. Note that $|q_{ki_*}| \ge 1/k^2$ for $k$ large enough: if $L_{i_*} \neq 0$ then $q_{ki_*} \to L_{i_*} \neq 0$, otherwise $|q_{ki_*}| = |\alpha i_*^2 (2k+2)^{-1}| \ge 1/k^2$ for $k$ large. Moreover $|q_{ki_j}| \le 2i_*^2$, $\forall\, k$. Hence

$$|p_k| \ge i_*^{2k}|q_{ki_*}| - (i_*-1)^{2k}\sum_{n<j\le j_*} |l_j||q_{ki_j}| \ge i_*^{2k}k^{-2} - |l|(i_*-1)^{2k}\,2i_*^2 \to \infty$$

as $k \to \infty$.
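The closed form of $f_l$ used in the proof can be cross-checked against the definition. The sketch below assumes $(BA^{-1})_{ij} = 4a_j b_i^{-1}/(4n-1)$, $\alpha = -1/(4n-1)$ and $\beta = -4\sum_{1\le j\le n} i_j^2/(4n-1)$ as read above; the sites, the vector $l$ and the masses are hypothetical sample values, and the function names are not from the paper.

```python
import math

def f_direct(tang, normal, l, m):
    """(b - B A^{-1} a) . l, using (B A^{-1})_{ij} = 4 a_j / ((4n - 1) b_i)."""
    n = len(tang)
    a = [math.sqrt(i * i + m) for i in tang]      # tangential frequencies
    b = [math.sqrt(i * i + m) for i in normal]    # normal frequencies (finitely many)
    a_dot_a = sum(aj * aj for aj in a)
    return sum(lj * (bj - 4.0 * a_dot_a / ((4 * n - 1) * bj))
               for lj, bj in zip(l, b))

def f_closed(tang, normal, l, m):
    """sum_j l_j (alpha m + beta + i_j^2) / b_j with alpha, beta as in the proof."""
    n = len(tang)
    alpha = -1.0 / (4 * n - 1)
    beta = -4.0 * sum(i * i for i in tang) / (4 * n - 1)
    return sum(lj * (alpha * m + beta + i * i) / math.sqrt(i * i + m)
               for lj, i in zip(l, normal))

if __name__ == "__main__":
    tang, normal, l = [1, 2], [3, 5], [1, -1]     # illustrative choice only
    for m in (1.0, 10.0, 100.0):
        print(m, f_direct(tang, normal, l, m), f_closed(tang, normal, l, m))
```

The two evaluations agree to rounding error, and the values change with $m$, consistent with the non-constancy claimed in Lemma 4.1. This is only a numerical illustration for one choice of $l$, not a substitute for the proof.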

Corollary 4.1. Assumption ($A_3$) is satisfied with the exception of a countable set of $m$'s in $(0,\infty)$.

Proof. If $l$ belongs to the index set of ($A_3$) and $Ak + Bl = 0$, then $a\cdot k + b\cdot l = (b - BA^{-1}a)\cdot l \neq 0$ except at most countably many $m$'s. Analogously, if $Ak + B(\tilde l, 0) = 0$, then $a\cdot k + b\cdot(\tilde l, 0) \pm h = (b - BA^{-1}a)\cdot(\tilde l, 0) \pm h \not\equiv 0$.


The last condition of Theorem 3.1 to verify is ($\hat A$), where $\hat a, \hat b, \hat A, \hat B$, defined in (3.6), (3.7), (3.8), are like $a, b, A, B$ in (4.2) replacing $n$ with $\hat n$. Then ($\hat A_1$) holds as well as ($\hat A_3$), except countably many $m$. Finally, assumption ($\hat A_2$) holds for almost every $m \in (0,\infty)$ as a consequence of Theorem 3.12 of [4] (see also Theorem 6.5 of [1]). More precisely $\inf_l |b(m)\cdot l| > 0$, with $l$ ranging over the index set (with subscripts $\hat n, D$) defined in (2.12), is a consequence of the non-resonance condition ($r$-NR) of [4] with $r = D + 2$, $N = \hat n$. Then Theorem 3.1 applies.

Theorem 4.1. Suppose $f$ is real analytic and (1.2) holds. Fix $\hat n \ge n$. For all the choices of indices $\mathcal{I}, \hat{\mathcal{I}}$ as in (4.1), for almost all the masses $m$ the conclusions (i)–(ii) of Theorem 3.1 apply to the NLW equation (1.1).

Conclusion (i) was proved in Bobenko-Kuksin [8] and Pöschel [27] (under different restrictions on the set of the masses $m$ and of the indexes $\mathcal{I}$). On the other hand, the quasi-periodic solutions obtained in (ii) are new, since they accumulate onto an n-torus and not at the origin. They are not the $\hat n$-dimensional tori bifurcating from the fourth order Birkhoff normal form of (1.1).

As a consequence of Corollary 4.1 we can prove the existence of quasi-periodic solutions with tangential frequency restricted to a fixed direction, see [18] for a similar result.

Theorem 4.2. Suppose that $f$ is real analytic and (1.2) holds. Then, excluding a countable set of masses $m \in (0,\infty)$, the conclusion of Theorem 3.2 applies to the NLW equation (1.1).

5. An Improved Basic KAM Theorem

We consider a family of integrable Hamiltonians

$$N := N(x, y, z, \bar z; \xi) := e(\xi) + \omega(\xi)\cdot y + \Omega(\xi)\cdot z\bar z \tag{5.1}$$

defined on $\mathbb{T}^n_s \times \mathbb{C}^n \times \ell^{a,p} \times \ell^{a,p}$. The frequencies $\omega = (\omega_1, \ldots, \omega_n)$ and $\Omega = (\Omega_{n+1}, \Omega_{n+2}, \ldots)$ depend on $m$-parameters $\xi \in \Pi \subset \mathbb{R}^m$, $m \le n$, $\Pi$ bounded with positive Lebesgue measure, $\rho := \mathrm{diam}(\Pi)$. For each $\xi$ there is an invariant n-torus $T_0 = \mathbb{T}^n \times \{0\} \times \{0\} \times \{0\}$ with frequency $\omega(\xi)$. In its normal space, the origin $(z, \bar z) = 0$ is an elliptic fixed point with proper frequencies $\Omega(\xi)$. The aim is to prove the persistence of a large portion of this family of linearly stable tori under small analytic perturbations $H = N + P$. We assume

($A^*$) Parameter dependence. The map $\omega : \Pi \to \mathbb{R}^n$, $\xi \to \omega(\xi)$, is Lipschitz continuous.

($B^*$) Frequency asymptotics. There exist $d \ge 1$ and $\delta_* < d - 1$ such that $\Omega_i(\xi) = \bar\Omega_i + \Omega_i^*(\xi) \in \mathbb{R}$, $i \ge 1$, where $\bar\Omega_i = i^d + \ldots$ and $\Omega^* : \Pi \to \ell_\infty^{-\delta_*}$ is Lipschitz continuous.



By ($A^*$) and ($B^*$), the Lipschitz semi-norms (defined as in (1.12)) of the frequency maps satisfy

$$|\omega|^{\mathrm{lip}} + |\Omega|^{\mathrm{lip}}_{-\delta_*} \le M < +\infty \tag{5.2}$$

for some $M \ge 1$.

($C^*$) Regularity. The perturbation $P$ is real analytic in the space coordinates, Lipschitz in the parameters, and for every $\xi \in \Pi$ the hamiltonian vector field $X_P$ maps $X_P : D(s,r) \to \mathbb{C}^n \times \mathbb{C}^n \times \ell_b^{a,\bar p}$ with $\bar p$ satisfying (2.6). More precisely, using the notations (1.6), (1.12), we assume

$$|X_P|_{r,s,E,\Pi} + |X_P|^{\mathrm{lip}}_{r,s,E,\Pi} < +\infty \quad \text{where } E := \mathbb{C}^n \times \mathbb{C}^n \times \ell_b^{a,\bar p}. \tag{5.3}$$

Moreover, we also assume (3.3). We suppose that P00 (x) has zero average, by adding a constant to P. We introduce the group (under composition) of analytic maps

a, p a, p¯ Es := : (x+ , y+ , w+ ; ξ ) ∈ Tns × Cn × b × → (x, y, w) ∈ C2n × b of the form x = x00 (x+ ; ξ ) , w = w00 (x+ ; ξ ) + w01 (x+ ; ξ )w+ , y = y00 (x+ ; ξ ) + y01 (x+ ; ξ )w+ + y10 (x+ ; ξ )y+ + y02 (x+ ; ξ )w+ · w+ ,

where x00 , yi j , wi j are analytic and bounded on Tns and Lipschitz on . (5.4) The symplectic map in (5.7) has the form = I + with as in (5.4), like in [26]. It is the composition of infinitely many time-1-flow maps (each having the form I + , ∈ Es ) generated by Hamiltonians in Fs defined in (8.7). Theorem 5.1 (Improved basic KAM theorem). Suppose that H = N + P satisfies assumptions (A∗ ), (B∗ ), (C∗ ). Let α > 0 be a parameter and assume that ⎧ ⎫ ⎨ ⎬ := max 1, |P11 |λs , |P03 |λs , |∂ yi ∂wj P|λs,r , r |∂ y2 ∂w P|λs,r with ⎩ ⎭ 2i+ j=4 √ α α satisfies ≤ . (5.5) λ= M 3r Then there is γ := γ (n, τ, s) > 0 such that, if one of the following KAM-conditions: |P00 |λs |P01 |λs |P10 |λs |P02 |λs • (H1) ε1 := max , ≤γ, , , r 2 α 2 r α 3/2 α α |P00 |λs |P01 |λs |P10 |λs |P02 |λs α 5/4 λ , ≤ γ and |P , • (H2) ε2 := max 2 5/4 , , | ≤ 11 s r α 3/2 α α r r α |P00 |λs |P01 |λs |P10 |λs |P02 |λs α ≤ γ and |P11 |λs , |P03 |λs ≤ , • (H3) ε3 := max , , , r 2 αμ rα α α r wher e μ := 1 i f d > 1 and 0 < μ ≤ 1 if d = 1 , holds, then there exist:


• (Frequencies) Lipschitz functions ω∞ : → Rn , ∞ : → −d ∞ , satisfying −1 αεi |ω∞ − ω|λ , | ∞ − ||λp− ¯ p ≤γ

(5.6)

lip

and |ω∞ |lip , | ∞|−δ∗ ≤ 2M. • (KAM normal form) A Lipschitz family of analytic symplectic maps : D(s/4, r/4) × ∞ (x∞ , y∞ , w∞ ; ξ ) → (x, y, w) ∈ D(s, r )

(5.7)

of the form = I + with ∈ Es/4 , where ∞ is defined in (5.12), such that, ∞ =0 H ∞ (·; ξ ) := H ◦ (·; ξ ) = ω∞ (ξ )y∞ + ∞ (ξ )z ∞ z¯ ∞ + P ∞ has P≤2 (5.8) a, p¯

see (1.10). Moreover X P ∞ : D(s/4, r/4) × ∞ → C2n × b and ∞ |P11 − P11 |s/4 ≤ γ −1 εi (|P11 |s + α pa −1/2 ) ∞− P | −1 ε (|P | + |P | + α pa −1/2 ) . |P03 03 s/4 ≤ γ i 03 s 11 s

(5.9)

• (Smallness estimates) The map satisfies |x00 |λs/4 , |y00 |λs/4 |w00 |λs/4

α 1− pa α 1− pb , |y10 |λs/4 , |y02 |λs/4 , |w01 |λs/4 , , |y01 |λs/4 2 r r

α 1− pb ≤ γ −1 εi , r

accordingly (H i)i=1,2,3 holds, where ⎧ if (H1) ⎨2 pa := 5/4 if (H2) ⎩1 if (H3)

(5.10)

pb :=

3/2 if (H1) or (H2) 1 if (H3).

(5.11)

• (Cantor set) The Cantor set is explicitly if (H 1) or (H 2) or (H 3) − (d > 1) ∞ , (5.12) ∞ := ∞ ∩ ω−1 (Dα μ ,τ ) if (H 3) − (d = 1) where Dα μ ,τ is defined in (1.15) with η = α μ , and l d ∞ := ξ ∈ : |ω∞ (ξ ) · k + ∞ (ξ ) · l| ≥ 2α , 1 + |k|τ ∀(k, l) ∈ Zn × Z∞ \ {0} , |l| ≤ 2 .

(5.13)

Then, ∀ξ ∈ ∞ , the map x∞ → (x∞ , 0, 0; ξ ) is a real analytic embedding of an elliptic, diophantine, n-dimensional torus with frequency ω∞ (ξ ) for the system with Hamiltonian H , see (1.3). Note that (5.8) is the KAM normal form in an open neighborhood of the invariant elliptic torus. Regarding the smallness conditions we note that:



– In (H1) we make assumptions only on $P_{00}, P_{01}, P_{10}, P_{02}$. This is quite natural because, if they vanish, then the torus $T_0$ in (2.2) is already invariant, elliptic, and in normal form.
– In (H2) we relax the smallness assumption on $P_{00}$, at the expense of a smallness condition on $P_{11}$. Note that in (H2) we do not require any assumption on $P_{03}$. We apply (H2) looking for tori in a neighborhood of a fixed torus (where, in general, $P_{03}$ does not vanish), see the proof of Theorem 2.1.
– In (H3) we further relax the smallness assumptions on $P_{00}$ and $P_{01}$, at the expense of stronger conditions on $P_{11}$ and $P_{03}$. We apply (H3) looking for tori close to an elliptic equilibrium (where, after a Birkhoff normal form, both $P_{11}$ and $P_{03}$ are small), see the proof of Theorem 3.1.

Comparison with the KAM Theorem [26]. The KAM condition in [26] on $X_P$ in (2.6) is

$$\alpha^{-1}|X_P|^\lambda_{r,s} \le \mathrm{const} \quad \text{with } \lambda = \alpha/M, \tag{5.14}$$

λ := |X | where |X P |r,s P r,s,E, + λ|X P |r,s,E, is defined in (1.13) and lip

a, p E := (x, y, w) ∈ Cn ×Cn ×b with norm |(x, y, w)|r := |x|+r −2 |y|+r −1 wa, p¯ . We note that (5.14) implies the KAM condition (H3). Indeed, by (5.14) we get |∂x P00 (x)|λs ≤ c αr 2 which implies |P00 |λs ≤ c αr 2 since P00 (x) has zero average. Moreover, since P10 = (∂ y P)(x, 0, 0), we deduce, by (5.14), that |P10 |λs ≤ c α. Similarly (5.14) implies the other conditions in (H3). In the case d = 1 condition (H3) is strictly weaker than (5.14), since μ ≤ 1. This is why we prove the result of [27] for NLW (where μ = 2/3), avoiding the use of Theorem D in [26] (see Theorem 4.1 and the proof √ of Theorem 3.1-(i)). Note that (5.14) √ also implies αr −2 ≥ c , namely αr −1 ≥ c , which is condition (5.5), up to a multiplicative constant (we have ∈ [0 , 20 ] for some constant 0 > 0, uniformly for α, r small). On the other hand, the KAM conditions (H1)-(H2) are quite different from (5.14). The iterative scheme in [24,26] would not converge assuming only (H1) or (H2). We discuss below the KAM iterative process used to prove Theorem 5.1. Finally note that, if |P03 |λs = O(1), then (5.14) implies α ≥ const r . This causes difficulties for verifying the measure estimates because, as r → 0, also the size of the parameters domain shrinks to zero, see Remark 6.3. The KAM Theorem 5.1 is completed by the following remarks. Remark 5.1 (Analytic case). If the Hamiltonian H is analytic in ξ ∈ with ⊂ Cm we can prove the existence of limit-frequency maps ξ → (ω∞ (ξ ), ∞ (ξ )) that are of class C ∞ and, ∀q ≥ 1, 1−q |ω∞ − ω|C q , | ∞ − || p− , ¯ p,C q ≤ C(q)εi α

(5.15)

see Remark 8.1. Moreover in the KAM conditions (H1)–(H3) and in (5.5) we can substitute |Pi j |λs with |Pi j |s thanks to Cauchy estimates. Here the |Pi j |s are defined as in (1.9) with a complex domain .


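Remark 5.2 (Lipeomorphism). If $\omega : \Pi \to \omega(\Pi)$ is a homeomorphism which is Lipschitz in both directions (Lipeomorphism), with $|\omega^{-1}|^{\mathrm{lip}} \le L$ and

$$\varepsilon_i \le \frac{\gamma}{2LM}, \tag{5.16}$$

then $\omega_\infty : \Pi \to \omega_\infty(\Pi)$ is a Lipeomorphism with $|\omega_\infty^{-1}|^{\mathrm{lip}} \le 2L$.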

Remark 5.3 (Dependence on n). The constant γ depends on the dimension n of the torus like, for example, γ = τ˜ −cτ˜ , where τ˜ := (τ + n) ln ((τ + n)/s) and c > 0 is an absolute constant, see Remark 8.2. We have not tried to improve such super-exponential estimate to get larger values of γ . Let us briefly comment on the assumptions of Theorem 5.1. Remark 5.4. The condition ≥ 1 in (5.5) is not restrictive because, rescaling the variables y → ρ 2 y , w → ρw , H → ρ −2 H , (5.17) j we can always verify max{|P11 |s , |P03 |s , 2i+ j=4 |∂ yi ∂w P|s,r } ≥ 1. On the other hand note that the KAM conditions (H1)–(H3) are invariant under the above rescaling. Remark 5.5. The KAM condition (H3) is obtained, for d = 1, performing a normal form step before the KAM iteration, see Sect. 8.4. Such condition is used for the wave equation. Note that, if μ → 0, then the condition (H3) improves, but the measure |Dα μ ,τ | decreases (see (1.15)–(5.12)). The scheme of proof of Theorem 5.1 differs from that in [26]. In order to find the symplectic map which transforms the Hamiltonian H into the KAM normal form H∞ := H ◦ in (5.8), i.e. ∞ ∞ ∞ ∞ ∞ := P00 + P01 w + P10 y + P02 w ·w ≡ 0, P≤2

we perform infinitely many symplectic maps ν , ν ≥ 1, as in [26]. Each Hamiltonian has the form H ν = N ν + Pν,

where

N ν = ων (ξ ) · y + ν (ξ ) · z z¯ ,

(5.18)

and P ν is analytic on D(sν , rν ) with rν > r0 /4 > 0 for all ν ≥ 0. It is natural to look at the map ν ν ν ν (P00 , P01 , P10 , P02 )

→

ν+1 ν+1 ν+1 ν+1 (P00 , P01 , P10 , P02 )

ν+1 is not a quadratic after any KAM step. An explicit calculus shows that the new P≤2 ν : in the terms (P ν+1 , P ν+1 ) there are linear combinations of P ν , P ν , see function of P≤2 10 02 00 01 ν , P ν , P ν , P ν . These terms come from the transforLemma 8.13, with coefficients P11 03 12 20 mation of the cubic and quartic terms of P ν under ν . However, after three iterations, the map ν ν ν ν (P00 , P01 , P10 , P02 )

→

ν+3 ν+3 ν+3 ν+3 (P00 , P01 , P10 , P02 )

turns out to be quadratic, see Lemma 8.16. Then the super-exponential convergence of the iterative process is guaranteed under the smallness conditions (H 1)–(H 3) on the



initial P00 , P01 , P10 , P02 , where α and r occur with different weights. Note that the exponents of r come from the natural rescaling (5.17), while the different exponents of α by explicit computations. Unlike the usual KAM scheme in [23,24,26], the KAM normal form H ∞ converges directly on an open neighborhood of the torus. Also the KAM iterative scheme in [26] is not quadratic, see, for example formula (13) in [26]. This problem is solved letting the domain of the normal form shrink to zero (see also [23]). Hence, at the end of the iteration, the normal form converges on the KAM torus only. The convergence on an open neighborhood of the torus is then recovered by a posteriori arguments. The Cantor set ∞ . Note that the Cantor set ∞ in (5.13) depends only on the final frequencies (ω∞ , ∞ ). It could be empty. In such a case the iterative process stops after finitely many steps and no invariant torus survives for any value of the parameters. However ω∞ , ∞ , and so ∞ , are always well defined. The idea is as follows. Each KAM step can be performed only for the parameters ξ such that the frequencies ων (ξ ), ν (ξ ), satisfy the second order Melnikov non-resonance conditions (8.38). Actually this set could be empty. However we can always extend the frequency maps ων (ξ ), ν (ξ ), to the whole set of parameters ξ ∈ , see the iterative Lemma 8.17-(S2)ν . This extension is Lipschitz continuous and, if the Hamiltonian is analytic, it is C ∞ , see Remark 8.1. Finally we verify in Lemma 8.19 that if ξ belongs to the Cantor set ∞ then all the Melnikov non-resonance conditions required to perform the previous KAM steps are all satisfied. We exploit that (ων , ν ) converge to (ω∞ , ∞ ) super-exponentially fast. Note that we do not claim that the frequencies of the final invariant torus satisfy the second order Melnikov non-resonance conditions, a fact already proved in [26]. We state a stronger claim, namely that if the parameter ξ is in ∞ then the torus is preserved. 
The number of parameters m in Theorem 5.1 is arbitrary. It could be strictly less than n (degenerate KAM theory). In the PDE applications of this paper we have m = n and the frequency map is a Lipeomorphism. In such a case the final frequency ω∞ is a Lipeomorphism too, see Remark 5.2. Then the following measure estimate follows by the classical arguments in [22–24,26], see also Subsect. 7.1. Let κ > 0 be such that (2.4) holds uniformly on and set μ as in (2.5). Theorem 5.2 (Measure estimate I). Let ω : → ω() be a Lipeomorphism and (5.16) hold. If (ξ ) · l = 0 , ∀ |l| = 1, 2 , ∀ ξ ∈ , (5.19) n ∞ |{ξ ∈ : ω(ξ ) · k + (ξ ) · l = 0}| = 0 , ∀k ∈ Z , l ∈ Z , |l| ≤ 2 , (k, l) = 0 , (5.20) then, taking τ as in (2.11), | \ ∞ | → 0 as α → 0. If, moreover, ω(ξ ), (ξ ) are affine functions of ξ , |\∞ | ≤ Cρ n−1 α μ

where

ρ := diam() .

(5.21)

The following theorem states that, given a Diophantine versor ω, ¯ there exist many invariant elliptic KAM tori with tangential frequency t ω, ¯ t ∈ R+ . Theorem 5.3 (Measure estimate II). Assume that ω(ξ ), (ξ ) are affine functions of ξ, ∂ξ ω is invertible, and · l = 0 , ∀ 0 < |l| ≤ 2 . (5.22) − ∂ξ (∂ξ ω)−1 ω |ξ =0

760

M. Berti, L. Biasco

Suppose that is compact and 0 ∈ / ω(). If γ defined in Theorem 5.1 is small enough, there exists K > 1 such that for every versor ω¯ ∈ D K α,τ , |ω∞ ( \ ∞ ) ∩ ωR ¯ +| ≤ K αμ

(5.23)

(here | · | denotes the one dimensional Lebesgue measure). Condition (5.22) is similar to condition (2) of [16] where it is required for 0 < |l| ≤ 3 (see also (2.13) with nˆ = n). By the Fubini theorem, integrating along the directions ω, ¯ the bound (5.23) implies (5.21). 6. Proof of Theorem 2.1 We have

⎡

1 ˆ A yˆ · yˆ = ⎣ 2

⎤

⎡

Pi j˜0 y i w˜ j˜ ⎦ and Bˆ yˆ · Zˆ = ⎣

2i+j˜=4

⎤ Pi j˜2 y i w˜ j˜ wˆ 2 ⎦ .

2i+j˜=2

Proposition 6.1 (Averaging). Let H be as in (2.1). Suppose that (2.12) holds. Then there exists a constant C := C(n, τ, s, d, n) ˆ > 1 large enough, 0 < r+ < r/4 small enough and a symplectic map : (x+ , y+ , w+ ) ∈ D(s+ , r+ ) → (x, y, w) ∈ D(s, r ) , s+ := s/4 , close to the identity, such that, defining H + := H ◦ =: N + P + , the Hamiltonian vector field X P + has the same regularity of X P , Pi+j = 0 if 2i + j ≤ 2 and2 * ) Pi+j˜jˆ y i w˜ j˜ wˆ jˆ = Pi+j˜jˆ y i w˜ j˜ wˆ jˆ if 2i + j˜ + jˆ ≤ D+1 and j˜ + jˆ ≤ 4 , jˆ ≤ 2 or jˆ = 1 . (6.1) Moreover [Pi+j˜jˆ ] − [Pi j˜jˆ ] ≤ Cκ32 /α , κ3 := |P11 |s + |P03 |s , ∀ 2i + j˜ + jˆ = 4 , j˜ = 0, 2, 4 , jˆ = 0, 2 .

(6.2)

In other words, in the case d > 1, D = 4, 1 + ˆ · zˆ + z¯ˆ + + P003 (x+ )wˆ +3 + Aˆ + yˆ+ · yˆ+ + Bˆ + yˆ+ · zˆ + z¯ˆ + + P004 H + = ωˆ · yˆ+ + (x+ )wˆ +4 2 j˜ jˆ j˜ jˆ + + P013 (x+ )w˜ + wˆ +3 + Pi+j˜jˆ (x+ )y+i w˜ + wˆ + + Pi+j˜jˆ (x+ )y+i w˜ + wˆ + , 2i+j˜+jˆ=5,jˆ=1

2i+j˜+jˆ≥6

(6.3) 2 In particular the terms P + , P + , P + , P + , P + , P + , P + , P + vanish. 110 101 030 021 012 111 031 041



while, in the case d = 1, D = 6, 1 + ˆ · zˆ + z¯ˆ + + P003 (x+ )wˆ +3 + Aˆ + yˆ+ · yˆ+ + Bˆ + yˆ+ · zˆ + z¯ˆ + + P004 H + = ωˆ · yˆ+ + (x+ )wˆ +4 2 ) * j˜ jˆ j˜ jˆ + P0+j˜jˆ (x+ )w˜ + wˆ + + + P013 (x+ )w˜ + wˆ +3 + Pi+j˜jˆ (x+ )y+i w˜ + wˆ + , +

2i+j˜+jˆ=5,6, jˆ≤2

2i+j˜+jˆ=7,jˆ=1

j˜ jˆ Pi+j˜jˆ (x+ )y+i w˜ + wˆ +

2i+j˜+jˆ=5,6 ,jˆ≥3

+

2i+j˜+jˆ≥8

j˜ jˆ Pi+j˜jˆ (x+ )y+i w˜ + wˆ + ,

(6.4)

p− ¯ p ˆ and B+ ∈ L(Cnˆ , ∞ ) satisfy where Aˆ + ∈ Mat(nˆ × n)

ˆ , Bˆ + − B ˆ ≤ C(|P11 |s + |P03 |s )2 α −1 . Aˆ + − A

(6.5)

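Proof. We start with some general considerations. We define the degree of a monomial

$$F = F_{ij}\, y^i w^j = F_{i\tilde j\hat j}\, y^i \tilde w^{\tilde j} \hat w^{\hat j} \quad \text{as} \quad \deg F := 2i + j = 2i + \tilde j + \hat j.$$

The Poisson brackets of two monomials is a monomial with

$$\deg\{F, G\} = \deg F + \deg G - 2 \quad \text{or} \quad \{F, G\} = 0. \tag{6.6}$$

We denote $X_F^t$ the hamiltonian flow generated by $F$ at time $t$. Then

$$H \circ X_F^1 = \sum_{j\ge0} L_F^j H / j! \quad \text{where} \quad L_F^j H := \{L_F^{j-1}H, F\} \quad \text{and} \quad L_F^0 H := H. \tag{6.7}$$

Let $H = N + P$ be as in (2.1) and suppose that $F = F_{i\tilde j\hat j}\, y^i \tilde w^{\tilde j}\hat w^{\hat j}$ solves the homological equation

$$\{N, F\} + P_{i\tilde j\hat j}\, y^i \tilde w^{\tilde j}\hat w^{\hat j} = [P_{i\tilde j\hat j}\, y^i \tilde w^{\tilde j}\hat w^{\hat j}]. \tag{6.8}$$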
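The degree rule (6.6) amounts to simple bookkeeping on exponents. The sketch below is a toy model with one tangential pair $(x, y)$ and one normal pair $(z, \bar z)$, assuming a bracket of the standard form $\partial_x F\,\partial_y G - \partial_y F\,\partial_x G$ plus a multiple of $\partial_z F\,\partial_{\bar z} G - \partial_{\bar z} F\,\partial_z G$; coefficients (which depend on $x$) and signs are ignored, so possible cancellations are not detected. The encoding of a monomial as a triple and the function names are assumptions made for illustration.

```python
def degree(mono):
    """Degree 2i + a + b of y^i z^a zbar^b (y counts twice, z and zbar once)."""
    i, a, b = mono
    return 2 * i + a + b

def bracket_terms(F, G):
    """Exponent triples that can appear in {F, G}; coefficients are ignored."""
    (iF, aF, bF), (iG, aG, bG) = F, G
    terms = set()
    # d_x F d_y G and d_y F d_x G: one power of y is lost, x-dependence sits
    # in the coefficients and does not change the exponents.
    if iF >= 1 or iG >= 1:
        terms.add((iF + iG - 1, aF + aG, bF + bG))
    # d_z F d_zbar G and d_zbar F d_z G: one power of z and one of zbar are lost.
    if (aF >= 1 and bG >= 1) or (bF >= 1 and aG >= 1):
        terms.add((iF + iG, aF + aG - 1, bF + bG - 1))
    return terms

if __name__ == "__main__":
    F, G = (1, 1, 0), (0, 2, 1)   # y z  and  z^2 zbar, both of degree 3
    for t in sorted(bracket_terms(F, G)):
        print(t, degree(t), degree(F) + degree(G) - 2)
```

Every listed term has degree $\deg F + \deg G - 2$, and an empty result corresponds to the alternative $\{F, G\} = 0$ in (6.6). This is why normalizing a monomial of degree $\deg F$ through (6.8) leaves the lower-degree terms untouched.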

By (6.7) and (6.6), the terms of H and H ◦ X 1F with degree less than or equal to degF are the same, except for Pi j˜jˆ y i w˜ j˜ wˆ jˆ which is normalized into [Pi j˜jˆ y i w˜ j˜ wˆ jˆ ] . On the other hand the terms of degree equal to degF + 1 are changed by a quantity of order |F|κ3 . For brevity for the rest of this proof a b means that there exists a constant c = c(n, τ, s, D, n) ˆ > 0 such that a ≤ cb. By the Melnikov condition (2.12) there is a solution F = Fi j˜jˆ y i w˜ j˜ wˆ jˆ of the homological equation (6.8) for every (i, j˜, jˆ) satisfying the conditions in (6.1). Indeed the existence of F and the estimate |Fi j˜jˆ |s(1−1/D) |Pi j˜jˆ |s /α

(6.9)

follows as in Lemmata 1-2 of [26]; we just note that the small divisors involved in the ˜ ˆ ˜¯ ( ˆ¯ with ˜ a˜ − a)+ ˆ aˆ − a), definition of every monomial f (x)y m z˜ a˜ z˜¯ a¯ zˆ aˆ zˆ¯ a¯ of F are ω ·k + ( n n−n ˆ ˜¯ = j˜, |aˆ + a| ˆ¯ = jˆ ˜ := ( n+1 , . . . , nˆ ), k ∈ Z , a, ˜ a˜¯ ∈ N , a, ˆ aˆ¯ ∈ N∞ and |a˜ + a| ˜¯ ≤ j˜, |aˆ + a| ˆ¯ ≤ jˆ). (then |a˜ + a| We now proceed normalizing the terms of degree three with (i, j˜, jˆ) = (1, 1, 0), (1, 0, 1), (0, 3, 0), (0, 2, 1), (0, 1, 2) .

(6.10)


Let us define F (3) := Fi j˜jˆ y i w˜ j˜ wˆ jˆ , where the sum is taken over the indexes in (6.10). Let s3 := s(1 − 1/D). For r3 > 0 we have that |∂x F (3) |s3 r33 , |∂ y F (3) |s3 r3 , |∂w F (3) |s3 r32 , since 2i + j˜ + jˆ ≥ 3. Therefore we can choose r3 small enough such that X 1F (3) : D(s3 , r3 ) → D(s, r ). Moreover the terms of order three of H ◦ X 1F (3) are the same of H except for Pi j˜jˆ y i w˜ j˜ wˆ jˆ with indexes as in (6.10) that are normalized; note that, being of odd degree, they actually annihilate. On the other hand the term of degree four are slightly changed by a quantity of order |F (3) |s3 κ3 κ32 /α by (6.9). We now normalize the terms of degree four with (i, j˜, jˆ) = (1, 1, 1) , (0, 3, 1) , (2, 0, 0) , (1, 2, 0) , (1, 0, 2) , (0, 4, 0) , (0, 2, 2) . (6.11) Let us define F (4) := Fi j˜jˆ y i w˜ j˜ wˆ jˆ , where the sum is taken over the indexes in (6.11). If r4 > 0 is small enough and s3 := s(1 − 2/D) we have that X 1F (4) : D(s4 , r4 ) → D(s3 , r3 ). The terms of order three and four of H ◦ X 1F (3) ◦ X 1F (4) are the same of H ◦ X 1F (3) except for those with indexes as in (6.11) that are normalized. Note that the terms corresponding to the first two triples in (6.11) annihilate. The normalization of all the other terms of degree up to D + 1 is analogous. Remark 6.1. The cubic terms P003 (x+ )wˆ +3 on the high modes can not be removed by some averaging procedure because the tangential and normal frequencies satisfy only the second order Melnikov non-resonance conditions (2.12). We introduce parameters ξ ∈ (0, ρ∗ ]nˆ , ρ∗ ∈ (0, r+2 /4) , and new symplectic variables (x∗ , y∗ , w∗ ) = (x∗ , yˆ+ − ξ, wˆ + ) ∈ D(s∗ , r∗ ) ⊂ Tnsˆ∗ × Cnˆ × b , √ s∗ ≤ s+ , r∗ ≤ ρ ∗ /2, a, p

where the n-dimensional ˆ angles are defined by # x∗ j := x+ j , ∀1 ≤ j ≤ n , 2(ξ j + y∗ j ) e−ix∗ j , eix∗ j := w+ j , ∀n < j ≤ nˆ . After this symplectic change of coordinates the Hamiltonian H + becomes j H ∗ = N ∗ + P ∗ = ω∗ (ξ ) · y∗ + ∗ (ξ ) · z ∗ z¯ ∗ + Pi∗j (x∗ ; ξ )y∗i w∗

(6.12)

2i+ j≥0

with

$$\omega^*(\xi) := \hat\omega + \hat A_+\xi, \qquad \Omega^*(\xi) := \hat\Omega + \hat B_+\xi, \tag{6.13}$$

and, by (6.3), (6.4), denoting for simplicity $|\cdot| := |\cdot|^\lambda_{s_*}$,

$$\text{if } d > 1, \quad |P^*_{00}|, |P^*_{01}| = O(\rho_*^{5/2}), \quad |P^*_{10}|, |P^*_{02}| = O(\rho_*^2), \quad |P^*_{11}| = O(\rho_*^{3/2}), \quad |P^*_{03}| = O(1), \tag{6.14}$$

$$\text{if } d = 1, \quad |P^*_{00}|, |P^*_{01}| = O(\rho_*^{7/2}), \quad |P^*_{10}|, |P^*_{02}| = O(\rho_*^3), \quad |P^*_{11}| = O(\rho_*^{5/2}), \quad |P^*_{03}| = O(1). \tag{6.15}$$

Moreover for $\alpha_* > 0$ and $\lambda := \alpha_*/M$, with $M := \|\hat A_+\| + \|\hat B_+\|$ (recall (5.2)).


We now apply the KAM Theorem 5.1. Take α∗ := 92 r∗2 , ρ∗ := r∗2ϑ where ϑ ∈ (9/10, 1) if d > 1 , ϑ ∈ (9/14, μ) if d = 1 . (6.16) Remark 6.2. Other choices of α∗ ≥ 92 r∗2 are clearly possible, giving different estimates on the Cantor manifold. Theorem 2.1 follows applying Theorems 5.1, 5.2 with3 H = H ∗ , P = P ∗ , r := r∗ α := α∗ , etc. Let us verify the hypotheses of the above theorems. It is immediate to check (A∗ ), (B∗ ), (C∗ ). Let as in (5.5) (with respect to the perturbation P ∗ ); note that = O(1) with respect to ξ . By (6.14)-(6.16) the KAM condition (H 2) of Theorem 5.1 holds: 2(1−ϑ) ) for d > 1 O(r∗ μ α∗ /ρ∗ = (6.17) → 0 as r∗ → 0 . 2(μ−ϑ) O(r∗ ) for d = 1 ˆ d + Aˆ −1 ( Aˆ + − A)), ˆ by the twist condition, (6.5) and (2.14) we get Since Aˆ + = A(I ˆ that A+ is invertible with ˆ −1 ˆ −1 2 ˆ ˆ Aˆ −1 + − A ≤ 2 A A+ − A ,

(6.18)

taking c in (2.14) small enough. Therefore, ξ → ω∗ (ξ ) is a diffeomorphism, see (6.13). We finally verify that the frequencies ω∗ , ∗ satisfy (5.19) and (5.20). The nonˆ · l| ≥ α, ∀1 ≤ |l| ≤ 2, and so4 resonance assumption (2.12) implies | (6.13)

ˆ · l| − | Bˆ + ξ · l| ≥ α − 2ρ∗ Bˆ + | ∗ (ξ ) · l| ≥ |

(6.5),(2.14)

≥

α−2ρ∗ (B+ +c) ≥ α/2

if r∗ is small enough. So (5.19) holds. Since ω∗ (ξ ) · k + ∗ (ξ ) · l is an affine function of ξ , the condition (5.20) holds if ˆ · l = 0 or Aˆ + k + Bˆ +l = 0 . ωˆ · k + ˆ Suppose that Aˆ + k + Bˆ + l = 0, then k = − Aˆ −1 + B+ l and

ˆ Aˆ −1 − Aˆ −1 ˆ − Bˆ Aˆ −1 ω) ˆ · l = ( ˆ − Bˆ + Aˆ −1 ˆ · l = ( ˆ · l + B( ωˆ · k + + ω) + ) ωˆ · l = 0 +( Bˆ − Bˆ + ) Aˆ −1 +

by (2.13) and Remark 2.1, (6.5), (6.18) taking c in (2.14) small enough. Then Theorems 5.1 and 5.2 apply and we obtain a family of elliptic n-dimensional ˆ tori parametrized by ξ ∈ ∞ , where the set ∞ has asymptotically full measure as r → 0 by (5.21) and (6.17). Remark 6.3. The KAM theorem in [26] does not apply. Indeed, with only the estimates (6.14)-(6.15) the KAM condition (5.14) implies const α −1 (ρ 5/2 r −2 + r ) = const α −1 (r 5ϑ−2 + r ) if d > 1 −1 , const ≥ α |X P |r,s,E, ≥ const α −1 (ρ 7/2 r −2 + r ) = const α −1 (r 7ϑ−2 + r ) if d = 1 which is incompatible with the measure estimate α r 2ϑ/μ (recall (5.21)). 3 We apply Theorems 5.1 and 5.2 with α := α . Here α is the parameter defined in (6.16) which is small ∗ ∗ with r∗ and has not to be confused with the fixed α appearing in the statement of Theorem 2.1. 4 Recall that α is fixed and independent of ρ and r (see also the previous footnote). ∗ ∗
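The admissible ranges of $\vartheta$ in (6.16) can be recovered by counting powers of $r_*$. The sketch below rests on two stated assumptions: that $\alpha_*$ is proportional to $r_*^2$ and $\rho_* = r_*^{2\vartheta}$ as in (6.16), and that condition (H2) is read as smallness of $|P_{00}|/(r^2\alpha^{5/4})$, $|P_{01}|/(r\alpha^{3/2})$, $|P_{10}|/\alpha$, $|P_{02}|/\alpha$ together with $|P_{11}| \le \alpha^{5/4}/r$. The sizes of the coefficients are those of (6.14) and (6.15); the table and function names are illustrative.

```python
from fractions import Fraction as Fr

# Assumed reading of (H2): each quantity is |P| / (alpha^p * r^q), stored as (p, q).
H2_WEIGHTS = {
    "P00": (Fr(5, 4), 2),
    "P01": (Fr(3, 2), 1),
    "P10": (1, 0),
    "P02": (1, 0),
    "P11": (Fr(5, 4), -1),
}

# Orders |P| = O(rho^e) taken from (6.14) (d > 1) and (6.15) (d = 1).
ORDERS = {
    "d>1": {"P00": Fr(5, 2), "P01": Fr(5, 2), "P10": 2, "P02": 2, "P11": Fr(3, 2)},
    "d=1": {"P00": Fr(7, 2), "P01": Fr(7, 2), "P10": 3, "P02": 3, "P11": Fr(5, 2)},
}

def theta_threshold(case):
    """Smallest theta above which every (H2) quantity tends to 0 with r.

    With alpha ~ r^2 and rho = r^(2 theta), a term rho^e / (alpha^p r^q)
    scales like r^(2 theta e - 2p - q), so one needs theta > (2p + q) / (2e).
    """
    return max(Fr(2 * p + q) / (2 * ORDERS[case][name])
               for name, (p, q) in H2_WEIGHTS.items())

if __name__ == "__main__":
    for case in ("d>1", "d=1"):
        print(case, theta_threshold(case))
```

Under these assumptions the binding constraint in both cases is the $P_{00}$ term, and the resulting thresholds coincide with the lower endpoints $9/10$ and $9/14$ of the intervals chosen for $\vartheta$ in (6.16).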


7. Proof of Theorem 3.1 We divide the proof in several steps. Step 1) Partial Birkhoff Normal Form on nˆ ≥ n modes. By the non-resonance assumpˆ 2 ) where D ≥ 4, we transform H in partial Birkhoff normal form, up to order 4, tion (A on the first nˆ ≥ n modes, namely 1ˆ ˆ ˆ ˆˆ ˆ 3 H = aˆ · Iˆ + bˆ · ζˆ ζˆ¯ + P = aˆ · Iˆ + bˆ · ζˆ ζˆ¯ + A I · I + B I · ζˆ ζ¯ + O(|ζ˜ |ζˆ a, p) 2 g 4 (7.1) +O(ζˆ a, p ) + O(ζ a, p ), ˆ Bˆ in (3.7), (3.8), g := min(g, 6), and where aˆ , bˆ are defined in (3.6), the matrices A, ˆ ˆ , . . .) , ζ = (ζ˜ , ζˆ ) , I˜ := ζ˜ ζ¯˜ , Iˆ := (I, I˜) . ζ˜ := (ζñ+1 , . . . , ζñˆ ) , ζˆ := (ζˆn+1 ˆ , ζn+2 3 ) The proof of this statement follows as in [2,25,27]. Note that the term O(|ζ˜ |ζˆ a, p ˆ 2 ) requires only second order Melnikov non-resonance can not be removed because (A conditions for n > n. ˆ

Step 2) Parameters and action-angle variables on $n$ modes. We introduce parameters

$$\xi \in (0, \rho]^n, \qquad \rho \in (0, 1), \tag{7.2}$$

and angle-action variables $(x, y)$ on the first $n$ modes, setting

$$\zeta_j =: \sqrt{2(\xi_j + y_j)}\; e^{-\mathrm{i}x_j}, \qquad 1 \le j \le n. \tag{7.3}$$

Then $I = \xi + y$ and the Hamiltonian (7.1) assumes the form

$$H = \omega(\xi)\cdot y + \Omega(\xi)\cdot z\bar z + \sum_{i,j\ge0} P^*_{ij}(x;\xi)\, y^i w^j \quad \text{with} \quad \omega(\xi) := a + A\xi, \quad \Omega(\xi) := b + B\xi, \tag{7.4}$$

z = (ζn+1 , . . .), w := (z, z¯ ), and g

j

g

|Pi∗j |λs = O(|ξ | 2 −i− 2 ) , ∀ 2i + j ≤ 3 , |Pi∗j − Pi j |λs = O(|ξ | 2 −2 ) , ∀ 2i + j = 4 . (7.5) The Hamiltonian H is real analytic on D(s, r ), for some 0 < s < 1, 0 < r < ρ/2. Step 3) Apply the KAM Theorem 5.1 and Theorem 5.2 to H . The assumptions (A∗ ), (B∗ ), (C∗ ) of Theorem 5.1 are implied by (B), (C), as in [25]. We take α := 92 r 2 , ρ := r 2ϑ , ϑ ∈ (μ, ¯ μ) where μ¯ := max{2(1 + μ)g −1 , 3(g − 1)−1 } < μ ≤ 1

(7.6)

by (3.4) . Remark 7.1. The parameter domain can not be the whole (0, ρ]n (see (7.2)) because, by (7.3), the Hamiltonian H will be analytic in D(s, r ) only excluding |ξ | ≤ Cr 2 . This difficulty can be handled as in [25], Sect. 7, Step 5. For simplicity of exposition we skip this technical detail in the following.



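The KAM condition (H3) reduces, by (7.5)–(7.6), to

$$\varepsilon_3 = O\big(\max\{r^{g\vartheta-2-2\mu},\ r^{\vartheta(g-1)-3}\}\big)\le\gamma \quad\text{and}\quad O(r^{(g-3)\vartheta-1}) < 1, \tag{7.7}$$

which are both verified for $r$ small enough because $(g-3)\vartheta - 1 > 0$ and $\varepsilon_3\to0$ as $r\to0$. By Theorem 5.1 there is, $\forall\xi\in\Pi_\infty$ defined in (5.12), an analytic symplectic map $\Phi(\cdot;\xi) : D(s/4, r/4)\to D(s, r)$ such that

$$H^\infty := H\circ\Phi = \omega_\infty(\xi)\cdot y_\infty + \Omega_\infty(\xi)\cdot z_\infty\bar z_\infty + P^\infty \quad\text{with}\quad P^\infty_{ij} = 0,\ \ \forall\, 2i+j\le2.$$

Moreover the assumptions (5.19), (5.20) of Theorem 5.2 hold by (7.4) and (A). By Theorem 5.2 the Cantor set of parameters $\Pi_\infty$ has asymptotically full measure

$$\frac{|\Pi\setminus\Pi_\infty|}{|\Pi|} = O\Big(\frac{\alpha^\mu\rho^{n-1}}{\rho^n}\Big) = O(r^{2(\mu-\vartheta)})\to0 \quad\text{as } r\to0. \tag{7.8}$$

By (5.9), (5.10) with $p_a = 1$, and (7.6), we get

$$|P^\infty_{11} - P^*_{11}|\le C\big(|P^*_{11}| + r\big)\varepsilon_3,\qquad |P^\infty_{03} - P^*_{03}|\le C\big(|P^*_{03}| + |P^*_{11}| + r\big)\varepsilon_3,\qquad |P^\infty_{ij} - P^*_{ij}|\le C\varepsilon_3,\ \ \forall\, 2i+j = 4, \tag{7.9}$$

where $|\cdot| := |\cdot|^\lambda_{s/4}$ and $C := C(\gamma,\Theta)$. Moreover, by (5.6), (7.4), (7.6),

$$|\omega_\infty(\xi) - a|\le\gamma^{-1}\alpha\varepsilon_3 + \|A\||\xi|\le Cr^{2\vartheta},\qquad |\Omega_\infty(\xi) - b|_{\bar p-p}\le\gamma^{-1}\alpha\varepsilon_3 + \|B\||\xi|\le Cr^{2\vartheta}. \tag{7.10}$$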
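The exponent bookkeeping behind (7.6)–(7.7) can be checked mechanically. The following Python sketch is only an illustration: the function names `mu_bar` and `kam_exponents` are ours, `theta` stands for $\vartheta$, and the sample values of $g$ and $\mu$ are assumptions chosen so that $\bar\mu < \mu \le 1$ (which forces $g > 4$). It evaluates the three exponents of $r$ that appear in (7.7) and confirms that they are positive once $\vartheta\in(\bar\mu,\mu)$.

```python
def mu_bar(g: float, mu: float) -> float:
    """Lower endpoint of the admissible interval for theta, as in (7.6)."""
    return max(2.0 * (1.0 + mu) / g, 3.0 / (g - 1.0))


def kam_exponents(g: float, mu: float, theta: float) -> tuple:
    """Exponents of r in (7.7): two from epsilon_3, one from the last condition."""
    return (
        g * theta - 2.0 - 2.0 * mu,   # first entry of the max in epsilon_3
        theta * (g - 1.0) - 3.0,      # second entry of the max in epsilon_3
        (g - 3.0) * theta - 1.0,      # exponent in O(r^{(g-3)theta-1}) < 1
    )


if __name__ == "__main__":
    # Hypothetical sample values; any g > 4 and mu with mu_bar(g, mu) < mu works.
    for g, mu in [(5.0, 1.0), (6.0, 1.0), (6.0, 0.8)]:
        lower = mu_bar(g, mu)
        theta = 0.5 * (lower + mu)    # midpoint of (mu_bar, mu)
        print(g, mu, lower, theta, kam_exponents(g, mu, theta))
```

The third exponent is positive for a simple reason: $\vartheta > 3/(g-1)$ gives $(g-3)\vartheta > 3(g-3)/(g-1)\ge1$ whenever $g\ge4$.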

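Step 4) Apply Theorem 2.1 to $H^\infty$. Assumptions (2.3), (2.6) of Theorem 2.1 hold by (7.4). The non-resonance assumption (2.12) holds for any

$$\xi\in\begin{cases}\Pi_0 & \text{if } d>1,\\ \Pi_0\cap\omega^{-1}(D_{\alpha^\mu,\tau}) & \text{if } d=1,\end{cases}$$

where

$$\Pi_0 := \Big\{\xi\in\Pi : |\omega_\infty(\xi)\cdot k + \Omega_\infty(\xi)\cdot l|\ge\frac{2\alpha\langle l\rangle_d}{1+|k|^\tau},\ \ \forall k\in\mathbb Z^n,\ l\in\Lambda_{\hat n,D}\Big\}\subset\Pi_\infty \tag{7.11}$$

and $\Lambda_{\hat n,D}$ is defined in (2.12). In the next section we prove that also $\Pi_0$ has asymptotically full measure

$$\frac{|\Pi\setminus\Pi_0|}{|\Pi|} = O\Big(\frac{\rho^{n-1}\alpha^\mu}{\rho^n}\Big) = O(r^{2(\mu-\vartheta)})\to0 \quad\text{as } r\to0. \tag{7.12}$$

Step 5) Check the twist condition. By (7.9), (7.5), (7.6), the matrices defined in (2.8), (2.10) (with $P = P^\infty$) differ in norm from the matrices $\hat A$, $\hat B$ of Step 1, respectively, by at most

$$C(\varepsilon_3 + r^{\vartheta(g-4)}). \tag{7.13}$$

The matrix $\hat A$ is invertible by $(\hat A_1)$. The twist condition follows for $r$ small enough.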


M. Berti, L. Biasco

ˆ ≤ Step 6) Check the non-resonance condition (2.13). By (7.10), (7.13), for every 0 < |l| 2, ˆ ˆ −1 aˆ ) · lˆ → 0 as r → 0 . (7.14) ˆ · lˆ − (bˆ − Bˆ A ( − Bˆ Aˆ −1 ω) ˆ 3 ) and Remark 2.1 imply Assumption (A ˆ > 0, ˆ −1 aˆ ) · l| inf |(bˆ − Bˆ A

ˆ 0 K α by (7.6), for $r$ and $c$ small enough, where $K > 1$ is the constant defined in Theorem 5.3. Then $\bar\omega\in D_{K\alpha,\tau}$ and (3.10) follows by (5.23) and since $\alpha^\mu/\rho\to0$ as $r\to0$ by (7.6).

Remark 7.2. Actually $\omega_\infty(\Pi)$ is not a neighborhood of the frequency $a$, since $\Pi = (0,\rho]^n$ is not a neighborhood of 0. Nevertheless this small technical point is bypassed as follows. For $1\le j\le n$, inverting the signs in the definition (7.3), namely $\zeta_j := \sqrt{2(\xi_j - y_j)}\,e^{+\mathrm{i}x_j}$, the new tangential frequency in (7.4) becomes $\omega(\xi) = a + A(\xi_1,\ldots,-\xi_j,\ldots,\xi_n)$. Taking all the possible choices of $1\le j\le n$ and $\pm$ signs, the frequencies $\omega(\xi)$, $\xi\in\Pi$, span a whole neighborhood of the frequency $a$, except for $n$ hyperplanes passing through $a$ (but not through the origin).

7.1. Measure estimates. The next proposition implies (7.12), concluding the proof of Theorem 3.1.

Proposition 7.1. $|\Pi\setminus\Pi_0|\le c\rho^{n-1}\alpha^\mu$, where $\mu$ is defined in (3.4) and the constant $c$ depends on $a, b, A, B, n, \hat n, d, D, a_*, \kappa, \delta_*$.

We have to estimate

$$\Pi\setminus\Pi_0 = \bigcup_{k\in\mathbb Z^n,\ l\in\Lambda_{\hat n,D}} R_{kl}(\alpha), \tag{7.16}$$

where $R_{kl}$ are the "resonant zones"

$$R_{kl}(\alpha) := \Big\{\xi\in\Pi : |\omega_\infty(\xi)\cdot k + \Omega_\infty(\xi)\cdot l| < \frac{2\alpha\langle l\rangle_d}{1+|k|^\tau}\Big\}.$$

In the case $d > 1$ there are at most finitely many nonempty resonant zones $R_{kl}(\alpha)$. This is a consequence of the next lemmata. The case $d = 1$ is more complex.



Lemma 7.1. Let $d > 1$. There are $D_*\ge1$, $\sigma_* > 0$, such that

$$\langle l\rangle_d\ge D_*^{-1}|l|_{\sigma_*}|l|_{\delta_*},\qquad\forall l\in\Lambda_{\hat n,D}. \tag{7.17}$$

Proof. We consider only the more difficult case $l = (\tilde l,\hat l)$, $\hat l = e_i - e_j$, $i > j$. We have

$$\langle l\rangle_d\ge i^d - (i-1)^d - D\hat n^d\ge i^{d-1} - D\hat n^d > i^{d-1}/2 \quad\text{for } i^{d-1} > 2D\hat n^d. \tag{7.18}$$

Defining $\delta_0 := \max\{\delta_*, 0\}$, $\sigma_* := d - 1 - \delta_0 > 0$, we have

$$|l|_{\sigma_*}|l|_{\delta_*}\le Di^{\sigma_*}\,Di^{\delta_0} = D^2i^{d-1}. \tag{7.19}$$

Let $D_* := 2D^3\hat n^d$. If $i^{d-1} > 2D\hat n^d$, then (7.17) follows by (7.18); if $i^{d-1}\le2D\hat n^d$, by (7.19).

Remark 7.3. For $d = 1$, $D\ge3$ (as in this paper) the bound (7.17) is false. Taking for example $l = l^{(j)} := e_{\hat n+j} - e_j - e_{\hat n}$ with $j > \hat n$ we have $\langle l^{(j)}\rangle_1 = 1$, $|l^{(j)}|_{\delta_*}\ge\hat n^{\delta_*}$, $|l^{(j)}|_{\sigma_*}\ge j^{\sigma_*}\to\infty$ as $j\to\infty$. This motivates assumption $(A_3)$ for $d = 1$. The bound (7.17) is true for $d = 1$, $D = 2$, see [26].

Lemma 7.2. There exists $\beta_0 > 0$ (depending on $d, b, \hat n, D$) such that

$$|b\cdot l|\ge4\beta_0\langle l\rangle_d,\qquad\forall l\in\Lambda_{\hat n,D}. \tag{7.20}$$

Proof. We consider only the subtlest case $l = (\tilde l,\hat l)$, $|\hat l| = 2$, $\hat l = e_i - e_j$, $i > j$. We have

$$|b\cdot l|\ge|b_i - b_j| - c_1,\qquad\langle l\rangle_d\le i^d - j^d + c_2, \tag{7.21}$$

for some $c_1 := c_1(D, b_{n+1},\ldots,b_{\hat n})$, $c_2 := c_2(d,\hat n, D) > 0$. By $(A_2)$ and (B) there is $\beta_1 > 0$ such that

$$|b_i - b_j|\ge2\beta_1(i^d - j^d),\qquad\forall i > j. \tag{7.22}$$

By (7.21), (7.22), for $\beta_0\le\beta_1/4$ we have that

$$\beta_1(i^d - j^d)\ge\beta_1c_2 + c_1 \quad\Longrightarrow\quad |b\cdot l|\ge4\beta_0\langle l\rangle_d. \tag{7.23}$$

Let $d > 1$. If $i > i_0$ we have $i^d - j^d\ge d\,i_0^{d-1}$, so (7.23) follows for $i_0$ large. On the other hand, the set of $|\tilde l|\le D-2$, $j < i\le i_0$ is finite and $\langle l\rangle_d\le Di_0^d$. Hence (7.20) follows by $(\hat A_2)$ for $\beta_0$ small enough. Let now $d = 1$. Take $h$ large such that $\beta_1h\ge\beta_1c_2 + c_1$. Then (7.23) holds for $i - j\ge h$. On the other hand, if $i - j < h$, we have $\langle l\rangle_1\le h + \hat nD$ and (7.20) follows by $(\hat A_2)$ for $\beta_0$ small enough.

In the following $r$ is small enough.

Lemma 7.3. $|\Omega_\infty(\xi)\cdot l|\ge3\beta_0\langle l\rangle_d$, $\forall\xi\in\Pi$, $l\in\Lambda_{\hat n,D}$.


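Proof. By (7.10), $\bar p - p\ge-\delta_*$, and Lemma 7.2, we have

$$|\Omega_\infty(\xi)\cdot l|\ge|b\cdot l| - |l|_{\delta_*}|\Omega_\infty(\xi) - b|_{-\delta_*}\ge4\beta_0\langle l\rangle_d - C|l|_{\delta_*}r^{2\vartheta}.$$

If $d > 1$ Lemma 7.1 implies $|l|_{\delta_*}\le D_*\langle l\rangle_d$ and the thesis follows for $r$ small enough. If $d = 1$ we have $\delta_* < 0$ (see (3.3)). Therefore $|l|_{\delta_*}\le D + 1$ and we conclude again for $r$ small.

Lemma 7.4. If $R_{kl}(\alpha)\ne\emptyset$, $\alpha\le\beta_0$, then

$$|k|\ge\theta\langle l\rangle_d \quad\text{with}\quad \theta := \beta_0/(1+|a|). \tag{7.24}$$

Proof. If there exists $\xi\in R_{kl}(\alpha)$ then $|\omega_\infty(\xi)\cdot k + \Omega_\infty(\xi)\cdot l| < 2\alpha\langle l\rangle_d$ and, using Lemma 7.3,

$$|k||\omega_\infty(\xi)|\ge|k\cdot\omega_\infty(\xi)|\ge|\Omega_\infty(\xi)\cdot l| - 2\alpha\langle l\rangle_d\ge3\beta_0\langle l\rangle_d - 2\alpha\langle l\rangle_d\ge\beta_0\langle l\rangle_d.$$

By (7.10) we have $|\omega_\infty(\xi)|\le|a| + 1$ for $r$ small enough, implying (7.24).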

From now on we always assume $\alpha\le\beta_0$, taking $r$ small enough. By the previous lemma we shall restrict the union in (7.16) to the cases $|k|\ge\theta\langle l\rangle_d$. In particular we shall always assume $k\ne0$. In the following $a\lessdot b$ means that there is a constant $c$, depending on the same quantities as the constant of Proposition 7.1, such that $a\le cb$. Moreover $M$, $L$ defined in (5.2), (5.16) respectively, are, here, $M = \|A\| + \|B\|$, $L = \|A^{-1}\|$.

Lemma 7.5. If $|k|\ge8LM|l|_{\delta_*}$ then $|R_{kl}(\alpha)|\lessdot\rho^{n-1}\alpha/(1+|k|^\tau)$.

Proof. Assume that $r$ is small enough such that $\varepsilon_3\le\gamma/(2LM)$. By Remark 5.2 the frequency map $\omega_\infty$ is invertible from $\Pi$ to $\tilde\Pi := \omega_\infty(\Pi)$ with $|\omega_\infty^{-1}|^{\rm lip}\le2L$. We introduce the final frequencies $\zeta = \omega_\infty(\xi)$ as parameters over the domain $\tilde\Pi$. Then $\tilde\Omega(\zeta) := \Omega_\infty(\omega_\infty^{-1}(\zeta))$ satisfies (see Remark 5.2)

$$|\tilde\Omega|^{\rm lip}_{-\delta_*}\le|\Omega_\infty|^{\rm lip}_{-\delta_*}|\omega_\infty^{-1}|^{\rm lip}\le2M\,2L = 4ML. \tag{7.25}$$

Choose a vector $v\in\{-1,1\}^n$ such that $v\cdot k = |k|$ and write $\zeta = sv + w$ with $s\in\mathbb R$ and $w\perp v$. Then

$$\zeta\cdot k + \tilde\Omega(\zeta)\cdot l = s|k| + \tilde\Omega(sv+w)\cdot l =: f_{kl}(s) \tag{7.26}$$

and the resonant zones are written

$$\tilde R_{kl}(\alpha) := \omega_\infty(R_{kl}(\alpha)) = \Big\{\zeta = sv+w\in\tilde\Pi : |f_{kl}(s)| < \frac{2\alpha\langle l\rangle_d}{1+|k|^\tau}\Big\}.$$

By (7.26), (7.25) we have $f_{kl}(s_2) - f_{kl}(s_1)\ge(s_2-s_1)|k| - 4ML|l|_{\delta_*}(s_2-s_1)\ge|k|(s_2-s_1)/2$ because $|k|\ge8LM|l|_{\delta_*}$. Fubini's theorem implies

$$|\tilde R_{kl}(\alpha)|\le(\operatorname{diam}\tilde\Pi)^{n-1}\,\frac{2}{|k|}\,\frac{2\alpha\langle l\rangle_d}{1+|k|^\tau}.$$

Going back to the original parameter domain by the inverse map $\omega_\infty^{-1}$ and noting that $\operatorname{diam}\tilde\Pi\le2M\operatorname{diam}\Pi$ (by Remark 5.2), $\langle l\rangle_d\le\theta^{-1}|k|$ (by Lemma 7.4), the final estimate follows.
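The one-dimensional mechanism behind the Fubini argument of Lemma 7.5 is elementary: if a function increases with slope at least $m > 0$, the set where its absolute value is below $\varepsilon$ is an interval of length at most $2\varepsilon/m$. The Python sketch below illustrates this on a hypothetical example; the function `f`, the grid and all numerical values are our assumptions and are not taken from the paper. Here $m$ plays the role of $|k|/2$ and $\varepsilon$ the role of $2\alpha\langle l\rangle_d/(1+|k|^\tau)$, up to constants.

```python
import math


def sublevel_measure(f, eps, a, b, n=200_001):
    """Approximate the Lebesgue measure of {s in [a, b] : |f(s)| < eps}
    by counting points of a uniform grid."""
    h = (b - a) / (n - 1)
    count = sum(1 for i in range(n) if abs(f(a + i * h)) < eps)
    return count * h


def example(k_norm=8.0, eps=0.05):
    """Hypothetical f(s) = |k| s + (|k|/2) sin(s): its derivative is
    |k| + (|k|/2) cos(s) >= |k|/2, mimicking the lower bound on f_kl."""
    slope_lower_bound = k_norm / 2.0

    def f(s):
        return k_norm * s + 0.5 * k_norm * math.sin(s)

    measure = sublevel_measure(f, eps, -1.0, 1.0)
    bound = 2.0 * eps / slope_lower_bound
    return measure, bound


if __name__ == "__main__":
    measure, bound = example()
    print(f"measured length {measure:.5f} <= bound {bound:.5f}")
```

Integrating this one-dimensional bound over the remaining $n-1$ directions is what produces the factor $(\operatorname{diam}\tilde\Pi)^{n-1}$.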



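We estimate the other resonant zones $R_{kl}(\alpha)$ using that the unperturbed frequencies in (7.4) are affine functions of $\xi$ and assumption $(A_3)$. We have

$$\omega_\infty(\xi)\cdot k + \Omega_\infty(\xi)\cdot l = a_{kl} + b_{kl}\cdot\xi + R_{kl}(\xi), \tag{7.27}$$

where

$$a_{kl} := a\cdot k + b\cdot l\in\mathbb R,\qquad b_{kl} := Ak + Bl\in\mathbb R^n, \tag{7.28}$$

and

$$R_{kl}(\xi) := (\omega_\infty(\xi) - \omega(\xi))\cdot k + (\Omega_\infty(\xi) - \Omega(\xi))\cdot l. \tag{7.29}$$

Assumption $(A_3)$ implies that $\delta_{kl} := \min\{|a_{kl}|, |b_{kl}|\} > 0$, $\forall k\in\mathbb Z^n$, $l\in\Lambda_{\hat n,D}$, $(k,l)\ne0$. Moreover (7.29), (5.6) imply

$$|R_{kl}(\xi)|\lessdot\varepsilon_3\alpha(|k| + |l|_{\delta_*}),\qquad|R_{kl}|^{\rm lip}\lessdot\varepsilon_3(|k| + |l|_{\delta_*}). \tag{7.30}$$

Lemma 7.6. Fix $K_* > 0$. For all $0 < |k|\le K_*$, $l\in\Lambda_{\hat n,D}$, $(k,l)\ne0$,

$$\alpha\le\theta\delta_{kl}/4 \quad\Longrightarrow\quad |R_{kl}(\alpha)|\lessdot\rho^{n-1}\alpha/\delta_{kl}. \tag{7.31}$$

Proof. By Lemma 7.1, (7.24), and $\delta_* < 0$, we get

$$\langle l\rangle_d\le K_*/\theta \ \text{ if } d > 1,\qquad |l|_{\delta_*}\le D+1 \ \text{ if } d = 1. \tag{7.32}$$

Case I. $|a_{kl}| = \delta_{kl}$. By (7.27), (7.30), (7.32) we get, for $r$ small enough,

$$|\omega_\infty(\xi)\cdot k + \Omega_\infty(\xi)\cdot l|\ge|a_{kl}| - (\|A\||k| + \|B\||l|)r^{2\vartheta} - |R_{kl}|\ge|a_{kl}| - cK_*r^{2\vartheta}\ge\delta_{kl}/2\overset{(7.31)}{\ge}2\alpha\theta^{-1}\overset{(7.24)}{\ge}2\alpha\langle l\rangle_d|k|^{-1}\ge\frac{2\alpha\langle l\rangle_d}{1+|k|^\tau},$$

implying that $R_{kl}(\alpha) = \emptyset$.

Case II. $|b_{kl}| = \delta_{kl}$. Set $\xi = \xi_s = b_{kl}|b_{kl}|^{-1}s + w$ with $s\in\mathbb R$, $w\perp b_{kl}$. By (7.27), (7.30), (7.32) the function $g_{kl}(s) := \omega_\infty(\xi_s)\cdot k + \Omega_\infty(\xi_s)\cdot l$ satisfies, taking $r$ small,

$$g_{kl}(s_2) - g_{kl}(s_1)\ge\frac{|b_{kl}|}{2}(s_2-s_1) = \frac{\delta_{kl}}{2}(s_2-s_1).$$

Arguing as in Lemma 7.5 by Fubini's theorem we obtain

$$|R_{kl}(\alpha)|\lessdot\frac{\rho^{n-1}\alpha\langle l\rangle_d}{\delta_{kl}(1+|k|^\tau)}\le\frac{\rho^{n-1}\alpha|k|}{\delta_{kl}\theta(1+|k|^\tau)},$$

and the thesis follows (since $\tau\ge1$).

We now distinguish the cases $d > 1$ and $d = 1$.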


• Case $d > 1$. Let

$$L_* := 8D_*LM\theta^{-1},\qquad K_* := 8LM\max_{|l|_{\sigma_*}\le L_*}|l|_{\delta_*}.$$

Lemma 7.7. $|R_{kl}(\alpha)|\lessdot\rho^{n-1}\alpha/(1+|k|^\tau)$, $\forall k\in\mathbb Z^n$, $l\in\Lambda_{\hat n,D}$.

Proof. If $|k|\le K_*$, $|l|_{\sigma_*}\le L_*$, (7.7) follows by Lemma 7.6. Then we can suppose that $|k| > K_*$ or $|l|_{\sigma_*} > L_*$. If $R_{kl}(\alpha)\ne\emptyset$ and $|l|_{\sigma_*} > L_*$, then

$$|k|\overset{(7.24)}{\ge}\theta\langle l\rangle_d\overset{(7.17)}{\ge}\theta|l|_{\sigma_*}|l|_{\delta_*}/D_*\ge8LM|l|_{\delta_*}.$$

On the other hand, when $|l|_{\sigma_*}\le L_*$ we have $|k| > K_*\ge8LM|l|_{\delta_*}$. So, in both cases Lemma 7.5 applies, proving (7.7).

Lemma 7.8. $\operatorname{card}\{l : \langle l\rangle_d\le\theta^{-1}|k|\}\lessdot|k|^{\frac{2}{d-1}}$.

Proof. We claim that

$$c\,\langle l\rangle_d\ge|l|_{d-1},\qquad c := 2D^2\hat n^d. \tag{7.33}$$

We consider only the case $l = (\tilde l, e_i - e_j)$, $i > j$. We have $|l|_{d-1}\le Di^{d-1}$. If $i^{d-1}\le2D\hat n^d$, then $c\,\langle l\rangle_d\ge c\ge Di^{d-1}\ge|l|_{d-1}$. Otherwise by (7.18) $\langle l\rangle_d\ge i^{d-1}/2\ge Di^{d-1}/c\ge|l|_{d-1}/c$ and (7.33) follows. Therefore

$$\operatorname{card}\{l : \langle l\rangle_d\le\theta^{-1}|k|\}\le\operatorname{card}\{l : |l|_{d-1}\le c\,\theta^{-1}|k|\}\lessdot|k|^{\frac{2}{d-1}}.$$

By (7.16), (7.24) and Lemmata 7.7, 7.8, we deduce

$$|\Pi\setminus\Pi_0|\le\sum_{|k|\ge\theta\langle l\rangle_d}|R_{kl}(\alpha)|\lessdot\sum_k\rho^{n-1}\alpha|k|^{\frac{2}{d-1}}/(1+|k|^\tau)\overset{(2.11)}{\lessdot}\rho^{n-1}\alpha,$$

namely Proposition 7.1 in the case $d > 1$.

• Case $d = 1$. Set

$$K_0 := 8(D+1)ML,\qquad L_0 := K_0/\theta. \tag{7.34}$$

Lemma 7.9. $\inf\{\delta_{kl} : 0 < |k|\le K_0,\ \langle l\rangle_1\le L_0\} > 0$.

Proof. Let $l = (\tilde l,\hat l)$. Since the set $\{\langle l\rangle_1\le L_0\}\cap\{|\hat l| = 0\}$ is finite, we consider $|\hat l| = 1$ or 2. If $\hat l = \hat l^{(j)} = \pm e_j$, $j > \hat n$, we have $a_{kl} = a\cdot k + b\cdot(\tilde l,0)\pm b_j\to\pm\infty$ as $j\to\infty$. The same holds for $\hat l = \pm(e_i + e_j)$, $i, j > \hat n$. There remains only the case $\hat l = \pm(e_i - e_j)$, $i > j$. Then $\hat l = \hat l^{(j)} = \pm(e_{h+j} - e_j)$ for some $1\le h\le L_0 + \hat n(D-2)$ (since $L_0\ge\langle l\rangle_1\ge h - |\tilde l|_1\ge h - \hat n(D-2)$). As $j\to\infty$ we have

$$a_{kl} = a\cdot k + b\cdot(\tilde l,0)\pm(b_{h+j} - b_j)\to a\cdot k + b\cdot(\tilde l,0)\pm h,$$
$$b_{kl} = Ak + B(\tilde l,0) + B(0,\pm(e_{h+j} - e_j))\to Ak + B(\tilde l,0).$$

We conclude by Assumption $(A_3)$.



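Lemma 7.10. For all $k\in\mathbb Z^n$, $l\in\Lambda_{\hat n,D}$, there holds $|R_{kl}(\alpha)|\lessdot\rho^{n-1}\alpha/(1+|k|^\tau)$.

Proof. If $|k|\ge K_0\ge8LM|l|_{\delta_*}$ (because $|l|_{\delta_*}\le D+1$, recall $\delta_* < 0$) the estimate follows by Lemma 7.5. If $|k| < K_0$ we conclude by Lemmata 7.6 and 7.9.

We cannot estimate $|\cup_lR_{kl}(\alpha)|$ with $\sum_l|R_{kl}(\alpha)|$ because, even with the constraint $\langle l\rangle_1\le|k|/\theta$, there exist infinitely many $l = (\tilde l, e_{h+j} - e_j)$, $j > \hat n$, with $\langle l\rangle_1\le\hat nD + h$, $\forall h\ge1$. We need more refined estimates. We decompose

$$\Lambda_{\hat n,D} = \Lambda_1\cup\Lambda_2,\qquad\Lambda_2 := \{l = (\tilde l,\hat l),\ \hat l = \pm(e_{h+j} - e_j),\ j > \hat n,\ h\ge1\},\qquad\Lambda_1 := \Lambda_{\hat n,D}\setminus\Lambda_2.$$

Lemma 7.11. $\operatorname{card}(\Lambda_1\cap\{\langle l\rangle_1\le|k|/\theta\})\lessdot|k|^2$.

Proof. We consider only the case $|\hat l| = 2$, $\hat l = \pm(e_i + e_j)$, $i, j > \hat n$ (the cases $|\hat l| = 0, 1$ are simpler). We have $|\tilde l|\le D-2$ and $|i + j|\le|k|\theta^{-1} + \hat nD\lessdot|k|$, implying the lemma.

Lemmata 7.10, 7.11 imply

$$\Big|\bigcup_{l\in\Lambda_1}R_{kl}(\alpha)\Big|\lessdot\rho^{n-1}\alpha\frac{|k|^2}{1+|k|^\tau}. \tag{7.35}$$

We now consider the more difficult case $l\in\Lambda_2$. We define

$$Q_{k\tilde lhj}(\alpha) := \{\xi\in\Pi : |\omega_\infty(\xi)\cdot k + \Omega_\infty(\xi)\cdot(\tilde l,0) + h|\le\delta_{khj}\},$$

where

$$\delta_{khj} := \frac{2\alpha|k|}{\theta(1+|k|^\tau)} + \frac{2(1+\|B\|)\rho}{j^{-\delta_*}} + \frac{a_*h}{j^\kappa}.$$

Lemma 7.12. Let $1\le h\le\theta^{-1}|k| + \hat n(D-2)$, $j > \hat n$. For $r$ small enough,

$$|Q_{k\tilde lhj}(\alpha)|\lessdot\rho^{n-1}\Big(\frac{\alpha}{1+|k|^\tau} + \frac{\rho}{j^{-\delta_*}} + \frac{1}{j^\kappa}\Big). \tag{7.36}$$

Moreover, if $l^{(j)} = (\tilde l,\hat l^{(j)})\in\Lambda_2$, $\hat l^{(j)} = e_{h+j} - e_j$, then $R_{kl^{(j)}}(\alpha)\subseteq Q_{k\tilde lhj}(\alpha)$.

Proof. If $|k|\ge K_0$, arguing as in the proof of Lemma 7.5, for $r$ small enough we get $|Q_{k\tilde lhj}(\alpha)|\lessdot\rho^{n-1}\delta_{khj}/|k|$, and the estimate follows since $h\le\theta^{-1}|k| + \hat n(D-2)$. On the other hand, if $|k| < K_0$ we have $h\le L_0 + \hat n(D-2)$; by assumption $(A_3)$ and arguing as in the proof of Lemmata 7.6 and 7.9, for $r$ small enough we have $|Q_{k\tilde lhj}(\alpha)|\lessdot\rho^{n-1}\delta_{khj}$ and the estimate follows as above.

We now prove that $R_{kl^{(j)}}(\alpha)\subseteq Q_{k\tilde lhj}(\alpha)$. We have $\Omega_\infty(\xi)\cdot l^{(j)} = \Omega_\infty(\xi)\cdot(\tilde l,0) + \Omega_\infty(\xi)\cdot(0,\hat l^{(j)})$. By (5.6) and (3.5) we have

$$|\Omega_\infty(\xi)\cdot(0,\hat l^{(j)}) - h|\le|\Omega_\infty(\xi)\cdot(0,\hat l^{(j)}) - b\cdot\hat l^{(j)} - B\xi\cdot\hat l^{(j)}| + |B\xi\cdot\hat l^{(j)}| + |b_{j+h} - b_j - h|$$
$$\le2\gamma^{-1}\alpha\varepsilon_3|\hat l^{(j)}|_{\delta_*} + \|B\|\rho|\hat l^{(j)}|_{\delta_*} + a_*hj^{-\kappa}\le2(\|B\|+1)\rho j^{\delta_*} + a_*hj^{-\kappa}$$

(for $r$ small enough $2\alpha\le\rho$); the thesis follows since $\langle l\rangle_1\le\theta^{-1}|k|$ by Lemma 7.4.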


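We choose

$$j_0 := \Big(\frac{1+|k|^\tau}{\alpha}\Big)^{\frac{1}{1+\kappa}}. \tag{7.37}$$

Since $R_{kl^{(j)}}(\alpha)\subset Q_{k\tilde lhj}(\alpha)\subseteq Q_{k\tilde lhj_0}(\alpha)$ for $j\ge j_0$, we have

$$\Big|\bigcup_{j>\hat n}R_{kl^{(j)}}(\alpha)\Big|\le\sum_{\hat n<j<j_0}|R_{kl^{(j)}}(\alpha)| + |Q_{k\tilde lhj_0}|\lessdot\rho^{n-1}\Big(\frac{\alpha j_0}{1+|k|^\tau} + \frac{\rho}{j_0^{-\delta_*}} + \frac{1}{j_0^\kappa}\Big) \tag{7.38}$$

by Lemma 7.10 and (7.36). By (7.38), (7.37), (7.6), choosing $\vartheta\in(\max\{\bar\mu,\ \mu + \delta_*(1+\kappa)^{-1}\},\ \mu)$ (note $\delta_* < 0$) we get, for $r$ small enough (recall that $-\delta_*\le\kappa$),

$$\Big|\bigcup_{j>\hat n}R_{kl^{(j)}}(\alpha)\Big|\lessdot\rho^{n-1}\frac{\alpha^\mu}{(1+|k|^\tau)^{\frac{\delta_*}{\delta_*-1}}}.$$

Since $\langle l\rangle_1\le|k|/\theta$ implies $h\le\hat n(D-2) + |k|/\theta$, and $\operatorname{card}\{\tilde l : |\tilde l|\le D-2\}\lessdot1$, we have

$$\Big|\bigcup_{l\in\Lambda_2}R_{kl}(\alpha)\Big|\lessdot\rho^{n-1}\frac{\alpha^\mu}{(1+|k|^\tau)^{\frac{\delta_*}{\delta_*-1}}}. \tag{7.39}$$

By (7.39) and (7.35) we get

$$\Big|\bigcup_{l\in\Lambda_{\hat n,D}}R_{kl}(\alpha)\Big|\lessdot\rho^{n-1}\frac{\alpha^\mu|k|^2}{(1+|k|^\tau)^{\frac{\delta_*}{\delta_*-1}}}.$$

Summing over $k$ and by the choice of $\tau$ in (2.11) we get Proposition 7.1 also when $d = 1$.

8. Proof of the Basic KAM Theorem 5.1

8.1. Technical lemmata. We first give some lemmata on composition of families of analytic functions depending in a Lipschitz way on parameters. We recall that the Lipschitz norms defined in (1.13) satisfy the algebra property $|fg|^\lambda_{s,r}\le|f|^\lambda_{s,r}|g|^\lambda_{s,r}$.

Lemma 8.1. If $h(\cdot;\xi)$ is analytic in $\mathbb T^n_s$ and $|\psi|^\lambda_{s-\sigma}\le\sigma/2$, then $g(x;\xi) := h(x + \psi(x;\xi);\xi)$ satisfies

$$|g|^\lambda_{s-\sigma}\le|h|^\lambda_s + \frac{2}{\sigma}|h|_s|\psi|^\lambda_{s-\sigma}\le2|h|^\lambda_s. \tag{8.1}$$

If $\Psi\in E_{s-\sigma}$ (see (5.4)) satisfies

$$\frac{|x_{00}|^\lambda_{s-\sigma}}{\sigma},\ \frac{|y_{00}|^\lambda_{s-\sigma}}{r^2},\ \frac{|y_{01}|^\lambda_{s-\sigma}}{r},\ |y_{10}|^\lambda_{s-\sigma},\ |y_{02}|^\lambda_{s-\sigma},\ \frac{|w_{00}|^\lambda_{s-\sigma}}{\sigma r},\ \frac{|w_{01}|^\lambda_{s-\sigma}}{\sigma}\le\frac{\delta}{16}, \tag{8.2}$$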



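with $0\le\delta\le1$, then, for all $H(\cdot;\xi)$ analytic in $D(s, r)$, $\tilde H(x, y, w;\xi) := H((x, y, w) + \Psi(x, y, w;\xi);\xi)$ satisfies

$$|\tilde H|^\lambda_{s-\sigma,r-\delta r}\le2|H|^\lambda_{s,r}. \tag{8.3}$$

Proof. Since $h(\cdot;\xi)$ is analytic in $\mathbb T^n_s$, by Cauchy estimates,

$$|\psi|_{s-\sigma}\le\frac{\sigma}{2} \quad\Longrightarrow\quad |g|^{\rm lip}_{s-\sigma}\le|\partial_xh|_{s-\frac{\sigma}{2}}|\psi|^{\rm lip}_{s-\sigma} + |h|^{\rm lip}_s\le\frac{2}{\sigma}|h|_s|\psi|^{\rm lip}_{s-\sigma} + |h|^{\rm lip}_s$$

and (8.1) follows. The proof of (8.3) is similar.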

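We now estimate derivatives of the composite functions.

Lemma 8.2. Given $H : D(s, r)\times\Pi\to\mathbb C$, there exists $c_0 > 0$ such that, if $\Phi : D(\tilde s,\tilde r)\ni(x_+, y_+, w_+)\mapsto(x, y, w)\in D(s, r)$ with

$$0 < \tilde r\le\frac r2,\qquad0 < \tilde s\le\frac s2,$$

and $\Phi = I + \Psi$ with $\Psi\in E_{\tilde s}$ satisfies

$$\frac{|x_{00}|^\lambda_{\tilde s}}{s},\ \frac{|y_{00}|^\lambda_{\tilde s}}{r^2},\ \frac{|y_{01}|^\lambda_{\tilde s}}{r},\ |y_{10}|^\lambda_{\tilde s},\ |y_{02}|^\lambda_{\tilde s},\ \frac{|w_{00}|^\lambda_{\tilde s}}{sr},\ \frac{|w_{01}|^\lambda_{\tilde s}}{s}\le c_0, \tag{8.4}$$

then $\tilde H := H\circ\Phi$ is analytic on $D(\tilde s,\tilde r)$, $\forall\xi\in\Pi$, and

$$\tilde r\,|\partial^2_{y_+}\partial_{w_+}\tilde H|^\lambda_{\tilde s,\tilde r},\ \ |\partial^i_{y_+}\partial^j_{w_+}\tilde H|^\lambda_{\tilde s,\tilde r}\le3\Theta,\ \ \forall\, 2i+j = 4, \quad\text{with}\quad \Theta := \max\Big\{1,\ \sum_{2i+j=4}|\partial^i_y\partial^j_wH|^\lambda_{s,r},\ r|\partial^2_y\partial_wH|^\lambda_{s,r}\Big\} \tag{8.5}$$

(we use the short notation $H\circ\Phi$ to mean $H(\cdot,\xi)\circ\Phi$, $\forall\xi\in\Pi$).

Proof. For $c_0$ small enough, conditions (8.4) imply (8.2) with

$$s\to\frac{3s}{4},\qquad r\to\frac{3r}{4},\qquad\sigma := \frac{3s}{4} - \tilde s\ge\frac s4,\qquad\delta := \frac{3r - 4\tilde r}{3r}\ge\frac13.$$

Then, for $c_0$ small enough, (8.3) implies that $\tilde H$ is analytic in $D(\tilde s,\tilde r)$ and

$$|\partial_{y_+}\partial^2_{w_+}\tilde H|^\lambda_{\tilde s,\tilde r}\le2\Big(|\partial^3_yH|^\lambda_{\frac{3s}{4},\frac{3r}{4}}\big(|y_{01}|^\lambda_{\tilde s} + |y_{02}|^\lambda_{\tilde s}r\big)^2 + 2|\partial^2_y\partial_wH|^\lambda_{\frac{3s}{4},\frac{3r}{4}}\big(1 + |w_{01}|^\lambda_{\tilde s}\big)\big(|y_{01}|^\lambda_{\tilde s} + |y_{02}|^\lambda_{\tilde s}r\big)$$
$$\qquad + 2|\partial^2_yH|^\lambda_{s,r}|y_{02}|^\lambda_{\tilde s} + |\partial_y\partial^2_wH|^\lambda_{s,r}\big(1 + |w_{01}|^\lambda_{\tilde s}\big)^2\big(1 + |y_{10}|^\lambda_{\tilde s}\big)\Big)\le3\Theta,$$

using that, by Cauchy estimates,

$$|\partial^3_yH|^\lambda_{\frac{3s}{4},\frac{3r}{4}}\le16r^{-2}|\partial^2_yH|^\lambda_{s,r}\le16r^{-2}\Theta.$$

The other estimates are analogous.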


We conclude with a lemma on Fourier series. Fix an integer $K > 0$; we denote

$$T_Kf(x;\xi) := \sum_{k\in\mathbb Z^n,\ |k|\le K}f_k(\xi)e^{\mathrm{i}k\cdot x} \quad\text{and}\quad T_K^\perp := I - T_K.$$

Lemma 8.3. Let $f(\cdot;\xi)$ be analytic on $\mathbb T^n_s$. There is $C := C(n)$ such that, $\forall\,0\le\sigma\le s$, $K\sigma\ge1$,

$$K^{-n}e^{K\sigma}|T_K^\perp f|^\lambda_{s-\sigma},\ \ \sigma K^{-n}e^{K\sigma}|T_K^\perp f'|^\lambda_{s-\sigma},\ \ \sigma^n|T_Kf|^\lambda_{s-\sigma},\ \ \sigma^{n+1}|T_Kf'|^\lambda_{s-\sigma}\le C|f|^\lambda_s. \tag{8.6}$$

Proof. We have

$$|T_K^\perp f'|_{s-\sigma}\le\sum_{|k|>K}|k||f_k|e^{|k|(s-\sigma)}\le|f|_s\sum_{|k|>K}|k|e^{-|k|\sigma}\le|f|_s\sum_{l>K}4^nl^ne^{-l\sigma},$$

and the last sum is bounded by $C(n)\sigma^{-1}K^ne^{-K\sigma}$ if $K\sigma\ge1$. The other estimates are analogous.

In the following we will always assume $K\sigma\ge1$.

8.2. A class of symplectic transformations. We introduce the space of Hamiltonians

$$F_s := \{F(x;\xi) = F_{00}(x;\xi) + F_{01}(x;\xi)\cdot w + F_{10}(x;\xi)\cdot y + F_{02}(x;\xi)w\cdot w \ \text{ where } X_F\in\mathbb C^{2n}\times\ell_b^{a,\bar p} \text{ is analytic in } (x, y, w)\in\mathbb T^n_s\times\mathbb C^n\times\ell_b^{a,p} \text{ and Lipschitz in } \xi\in\Pi\}. \tag{8.7}$$

Note that the terms that we want to eliminate from the perturbation through the KAM iteration have such a form. We shall also take "auxiliary" Hamiltonians in $F_s$ whose time one flow generates the KAM symplectic transformations, see Lemma 8.9. The next lemmata will be used to estimate the perturbation after the KAM step, see Lemma 8.11. The time one flow map generated by Hamiltonians in $F_s$ has the form $I + \Psi$ with $\Psi$ as in (5.4), see Lemma 8.6. Lemma 8.4 shows that $F_s$ is closed under composition with such maps. We estimate the transformed map in a slightly smaller analytic strip for the convergence of the KAM iteration.

Lemma 8.4 (Composition). If $F\in F_s$, $\Psi\in E_{s-\sigma}$, $0 < \sigma\le s$, with $|x_{00}|^\lambda_{s-\sigma}\le\sigma/2$, then $S := F\circ(I + \Psi)\in F_{s-\sigma}$ and

$$S_{00} = \tilde F_{00} + \tilde F_{10}\cdot y_{00} + \tilde F_{01}\cdot w_{00} + \tilde F_{02}w_{00}\cdot w_{00},$$
$$S_{01} = (I + w_{01}^\top)\tilde F_{01} + y_{01}^\top\tilde F_{10} + 2(I + w_{01}^\top)\tilde F_{02}w_{00},$$
$$S_{10} = (I + y_{10}^\top)\tilde F_{10},$$
$$S_{02} = \tilde F_{10}\cdot y_{02} + (I + w_{01})\tilde F_{02}(I + w_{01}),$$

where $\tilde F_{ij} = \tilde F_{ij}(x_+) := F_{ij}(x_+ + x_{00}(x_+))$. By (8.1), $|\tilde F_{ij}|^\lambda_{s-\sigma}\le2|F_{ij}|^\lambda_s$.

It is a merely algebraic computation that the space $F_s$ is closed under the Poisson brackets (see (1.4)).



Lemma 8.5 (Poisson bracket). Let $R, F\in F_s$; then $G := \{R, F\}\in F_{s'}$, $\forall\,0 < s' < s$, and

$$G_{00} = F_{10}\cdot R'_{00} - R_{10}\cdot F'_{00} - \mathrm{i}R_{01}\cdot JF_{01},$$
$$G_{01} = F_{10}\cdot R'_{01} - R_{10}\cdot F'_{01} + 2\mathrm{i}F_{02}JR_{01} - 2\mathrm{i}R_{02}JF_{01},$$
$$G_{10} = F_{10}\cdot R'_{10} - R_{10}\cdot F'_{10},$$
$$G_{02} = F_{10}\cdot R'_{02} - R_{10}\cdot F'_{02} - 4\mathrm{i}R_{02}JF_{02}.$$

Given $F\in F_s$, we consider the associated Hamiltonian system (see (1.3))

$$\begin{cases}\dot x = F_{10}(x),\\ \dot y = -F'_{00}(x) - F'_{01}(x)w - F'_{10}(x)y - F'_{02}(x)w\cdot w,\\ \dot w = -\mathrm{i}JF_{01}(x) - 2\mathrm{i}JF_{02}(x)w,\end{cases} \tag{8.8}$$

with initial condition $(x^0, y^0, w^0) = (x_+, y_+, w_+)$. For all $\xi\in\Pi$, the Hamiltonian flow at time $t$, $X^t_F(\cdot;\xi) : (x_+, y_+, w_+)\mapsto(x^t, y^t, w^t)(x_+, y_+, w_+)$, defines a symplectic diffeomorphism which is close to the identity for $0\le t\le1$ and $F$ small. In the next lemma we estimate each component of these symplectic diffeomorphisms separately. These finer estimates are required by our approach. This is a difference with respect to [26].

Lemma 8.6 (Hamiltonian flow). Let $0 < \sigma < s\le1$ and $F\in F_s$ satisfy, for some $\lambda\ge0$,

$$|F_{10}|^\lambda_s\le\sigma/12,\qquad|F_{02}|^\lambda_s\le1/12. \tag{8.9}$$

Then, for all t ∈ [0, 1], X tF = I + t with t ∈ Es−σ satisfying t |λ λ t λ |x00 s−σ ≤ 2|F10 |s , |y00 |s−σ ≤

12 6 t |λ λ |F00 |λs + 9(|F01 |λs )2 , |y10 s−σ ≤ |F10 |s , σ σ

(8.10) 36 27 t |λ t |λ t |λ λ t λ λ |F01 |λs , |y02 |F02 |λs , |w00 |y01 s−σ ≤ s−σ ≤ s−σ ≤ 6|F01 |s , |w01 |s−σ ≤ 6|F02 |s . σ σ

Moreover, if, for $0 < \delta < 1$,

$$|F_{00}|_s\le\frac{\delta r^2\sigma}{72},\qquad|F_{01}|_s\le\frac{\delta r\sigma}{216},\qquad|F_{10}|_s\le\frac{\delta\sigma}{24},\qquad|F_{02}|_s\le\frac{\delta\sigma}{108}, \tag{8.11}$$

then $X^t_F(\cdot;\xi)$ maps $D(s-\sigma, r-\delta r)$ into $D(s, r)$, $\forall\,0\le t\le1$, $\forall\xi\in\Pi$.

Proof. In the Appendix.

Finally we study the composition of two symplectic maps of the form $I + \Psi$ with $\Psi\in E_s$. The symplectic transformation (5.7) of Theorem 5.1 is the composition of infinitely many maps of this form, see the iterative Lemma 8.17-(S6)$_\nu$.


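Lemma 8.7 (Composition of diffeomorphisms). Let $0 < s < \tilde s$, $\tilde\Phi = I + \tilde\Psi$ with $\tilde\Psi\in E_{\tilde s}$, and let $\Phi = I + \Psi$ with $\Psi\in E_s$ satisfy $2|x_{00}|^\lambda_s/(\tilde s - s)\le\eta\le1$. Then the composite map has the form $\tilde\Phi\circ\Phi = I + \hat\Psi$ with $\hat\Psi\in E_s$ and

$$\begin{aligned}
|\hat x_{00} - x_{00}|_s&\le(1+\eta)|\tilde x_{00}|_{\tilde s},\\
|\hat w_{00} - w_{00}|_s&\le(1+\eta)|\tilde w_{00}|_{\tilde s} + 2|\tilde w_{01}|_{\tilde s}|w_{00}|_s,\\
|\hat w_{01} - w_{01}|_s&\le(1+\eta)|\tilde w_{01}|_{\tilde s}(1 + |w_{01}|_s),\\
|\hat y_{00} - y_{00}|_s&\le(1+\eta)|\tilde y_{00}|_{\tilde s} + 2|\tilde y_{01}|_{\tilde s}|w_{00}|_s + 2|\tilde y_{10}|_{\tilde s}|y_{00}|_s + 2|\tilde y_{02}|_{\tilde s}|w_{00}|^2_s,\\
|\hat y_{01} - y_{01}|_s&\le(1+\eta)|\tilde y_{01}|_{\tilde s}(1 + |w_{01}|_s) + 2|\tilde y_{10}|_{\tilde s}|y_{01}|_s + 4|\tilde y_{02}|_{\tilde s}|w_{00}|_s(1 + |w_{01}|_s),\\
|\hat y_{10} - y_{10}|_s&\le(1+\eta)|\tilde y_{10}|_{\tilde s}(1 + |y_{10}|_s),\\
|\hat y_{02} - y_{02}|_s&\le(1+\eta)|\tilde y_{02}|_{\tilde s}(1 + |w_{01}|_s)^2 + 2|\tilde y_{10}|_{\tilde s}|y_{02}|_s,
\end{aligned} \tag{8.12}$$

where for brevity $|\cdot|_{\tilde s} := |\cdot|^\lambda_{\tilde s}$, $|\cdot|_s := |\cdot|^\lambda_s$.

Proof. We have $\hat\Psi - \Psi = \tilde\Psi\circ(I + \Psi)$. The estimate on $\hat x_{00}$ follows by $\hat x_{00}(x_+) - x_{00}(x_+) = \tilde x_{00}(x_+ + x_{00}(x_+))$ and (8.1). All the other estimates follow analogously.

8.3. The KAM step. At the generic $\nu$-th step we have a Hamiltonian $H^\nu = N^\nu + P^\nu$ like in (5.18). Both $\omega^\nu$, $\Omega^\nu$ are Lipschitz in $\Pi_\nu$ with $|\omega^\nu|^{\rm lip} + |\Omega^\nu|^{\rm lip}_{-\delta_*}\le M_\nu$. We set

$$\Theta_\nu := \max\Big\{1,\ |P^\nu_{11}|^{\lambda_\nu}_{s_\nu},\ |P^\nu_{03}|^{\lambda_\nu}_{s_\nu},\ \sum_{2i+j=4}|\partial^i_y\partial^j_wP^\nu|^{\lambda_\nu}_{s_\nu,r_\nu},\ r_\nu|\partial^2_y\partial_wP^\nu|^{\lambda_\nu}_{s_\nu,r_\nu}\Big\} \quad\text{with}\quad \lambda_\nu := \frac{\alpha_0}{M_\nu}. \tag{8.13}$$

Note that, unlike the KAM iterative scheme in [26], $\alpha_0$ will be kept fixed along the iteration. We simplify notations in the next section dropping the index $\nu$ and writing "+" for $\nu + 1$. So $P = P^\nu$, $P^+ = P^{\nu+1}$, etc.

The symplectic change of coordinates. We write

$$H = N + P = N + R + (P - R), \quad\text{where}\quad R := T_KP_{\le2}, \tag{8.14}$$

and $P_{\le2}$ is defined in (1.10). Then we consider the homological equation

$$\{N, F\} + R = [R], \tag{8.15}$$

where

$$[R] := \hat e + \hat\omega\cdot y + \hat\Omega z\cdot\bar z,\qquad\hat e := \langle P_{00}\rangle,\qquad\hat\omega := \langle P_{10}\rangle,\qquad\hat\Omega := \operatorname{diag}_{j\ge1}\langle\partial^2_{z_j\bar z_j}P|_{y=0,w=0}\rangle \tag{8.16}$$

and $\langle\cdot\rangle$ denotes the average with respect to the angles.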



Lemma 8.8 (Homological equation). Suppose that, uniformly on $\Pi$,

$$|\omega(\xi)\cdot k + \Omega(\xi)\cdot l|\ge\alpha\frac{\langle l\rangle_d}{1+|k|^\tau},\qquad\forall(k,l)\ne0,\ |k|\le K,\ |l|\le2. \tag{8.17}$$

Let $0 < \sigma < s$. Then, $\forall R\in F_s$, Eq. (8.15) has a solution $F\in F_{s-\sigma}$ satisfying $[F] = 0$ and

$$|F_{ij}|^\lambda_{s-\sigma}\le\frac{\mathcal K|P_{ij}|^\lambda_s}{\alpha\sigma^{2\tau+n+1}},\qquad0\le2i+j\le2,\qquad0\le\lambda\le\frac{\alpha}{M}, \tag{8.18}$$

with $\mathcal K := \mathcal K(n,\tau)\ge1$. We can take $\mathcal K = (\tau+n)^{c(\tau+n)}$ for some absolute constant $c > 0$.

Proof. The proof is given in [26], Lemmata 1-2, with the only difference that (8.17) holds for every $k$. The truncation $|k|\le K$ does not affect the estimates, since $T_KP_{ij}$ and, therefore, $F_{ij}$ are Fourier polynomials of order $K$.

By Lemma 8.8 and 8.6 we deduce:

Lemma 8.9 (Symplectic map). There exists $C_0 := C_0(n,\tau) > 1$ large enough (we can take $C_0 := \mathcal K^c$ for some absolute constant $c > 0$, with $\mathcal K$ defined in Lemma 8.8) such that, if

$$\frac{|P_{00}|^\lambda_s}{r^2},\ \frac{|P_{01}|^\lambda_s}{r},\ |P_{10}|^\lambda_s,\ |P_{02}|^\lambda_s\le\frac{\delta\alpha\sigma^\beta}{16C_0}, \tag{8.19}$$

where

$$\beta := 2\tau + n + 2, \tag{8.20}$$

$0 < 2\sigma < s < 1$, $0 < \delta < 1$, $0\le\lambda\le\alpha/M$, the symplectic maps

$$\Phi^t = I + \Psi^t := X^t_F : D(s-2\sigma, r-\delta r)\to D(s-\sigma, r-\delta r/2) \tag{8.21}$$

are well defined ∀t ∈ [0, 1], and t ∈ Es−2σ satisfy |P10 |λs |P00 |λs (|P01 |λs )2 t λ , |y | ≤ C + C , 0 0 s−2σ 00 ασ β−1 2ασ β 2α 2 σ 2β−1 |P10 |λs |P01 |λs |P02 |λs t λ t λ t λ |y10 |s−2σ ≤ C0 , |y01 |s−2σ ≤ C0 , |y02 |s−2σ ≤ C0 , (8.22) β β ασ ασ ασ β |P01 |λs |P02 |λs t λ t λ |w00 |s−2σ ≤ C0 β−1 , |w01 |s−2σ ≤ C0 β−1 . ασ ασ

t λ |x00 |s−2σ ≤ C0

Note that (8.19)–(8.22) imply (8.2) (with | · |λs−2σ instead of | · |λs−σ ).


The transformed Hamiltonian under the symplectic map $\Phi^+ := X^1_F$ defined in (8.21) is

$$H^+ := H\circ\Phi^+ = N + \hat N + \int_0^1\{(1-t)\hat N + tR, F\}\circ X^t_F\,dt + (P - R)\circ\Phi^+ =: N^+ + P^+, \tag{8.23}$$

where $N^+ := N + \hat N$ and $\hat N := [R]$ is defined in (8.16).

The new normal form $N^+$. We now estimate $N^+ := N + \hat N$ where $\hat N := \hat e + \hat\omega\cdot y + \hat\Omega z\cdot\bar z$. We identify $\hat\Omega$ with the vector $\hat\Omega = (\hat\Omega_i)_{i\ge n+1}$, $\hat\Omega_i := \langle\partial^2_{z_i\bar z_i}P|_{y=0,w=0}\rangle$.

Lemma 8.10. $|\hat\omega|\le|P_{10}|_s$, $|\hat\omega|^{\rm lip}\le|P_{10}|^{\rm lip}_s$, $|\hat\Omega|_{\bar p-p}\le|P_{02}|_s$, $|\hat\Omega|^{\rm lip}_{\bar p-p}\le|P_{02}|^{\rm lip}_s$ and

$$|\hat\omega\cdot k + \hat\Omega\cdot l|\le|P_{10}|_s|k| + 2|P_{02}|_s\langle l\rangle_d,\qquad\forall(k,l)\in\mathbb Z^n\times\mathbb Z^\infty. \tag{8.24}$$

Proof. We have $\hat\Omega_j = (P_{02}e_j, e_j)_p$, where $(\cdot,\cdot)_p$ and $e_j$ (respectively $(\cdot,\cdot)_{\bar p}$ and $\bar e_j$) denote the scalar product and the $j$-th element of the basis in $\ell_b^{a,p}$ (respectively $\ell_b^{a,\bar p}$). We have $\bar e_i = i^{p-\bar p}e_i$ and, if $u\in\ell_b^{a,\bar p}$, $u = \sum_i\bar u_i\bar e_i = \sum_iu_ie_i$, then $u_i = i^{p-\bar p}\bar u_i$. Denoting $u := P_{02}e_i$, we get

$$i^{\bar p-p}|\hat\Omega_i| = i^{\bar p-p}|(u, e_i)_p| = i^{\bar p-p}|u_i| = |\bar u_i| = |(u,\bar e_i)_{\bar p}|\le\|u\|_{a,\bar p}\le|P_{02}|_s$$

(recall that $|P_{02}|_s = \sup_{x\in\mathbb T_s}\|P_{02}(x)\|_{L(\ell_b^{a,p},\ell_b^{a,\bar p})}$), implying $|\hat\Omega|_{\bar p-p}\le|P_{02}|_s$. Similarly $|\hat\omega|\le|P_{10}|_s$. Then

$$|\hat\omega\cdot k + \hat\Omega\cdot l|\le|\hat\omega||k| + |\hat\Omega|_{-\delta_*}|l|_{\delta_*}\le|\hat\omega||k| + |\hat\Omega|_{\bar p-p}\,2\langle l\rangle_d\le|P_{10}|_s|k| + 2|P_{02}|_s\langle l\rangle_d,$$

using (3.3) and $|l|_{\delta_*}\le|l|_{d-1}\le2\langle l\rangle_d$, $\forall|l|\le2$. The same estimates hold for $|\cdot|^{\rm lip}$.

The new perturbation $P^+$.

Notation. For the rest of this section, $A\lessdot B$ means that $A\le\mathcal K^cB$, where $\mathcal K$ is defined in Lemma 8.8 and $c > 0$ is some absolute constant.

By (8.23), and since $\hat N = [R]$, we have to estimate $P^+ = P^* + \tilde P$, where

$$P^* := \int_0^1\{(1-t)[R] + tR, F\}\circ X^t_F\,dt,\qquad\tilde P := (P - R)\circ\Phi^+.$$

We estimate $P^*$ in Lemma 8.11 and $\tilde P$ in Lemma 8.13. We introduce the rescaled quantities

$$a := \frac{|P_{00}|^\lambda_s}{r^2\alpha^{p_a}},\qquad b := \frac{|P_{01}|^\lambda_s}{r\alpha^{p_b}},\qquad c := \frac{|P_{10}|^\lambda_s}{\alpha},\qquad d := \frac{|P_{02}|^\lambda_s}{\alpha}, \tag{8.25}$$

where $p_a$, $p_b$ are defined in (5.11). Since $p_a, p_b\ge1$, if

$$a, b, c, d\le\frac{\delta\sigma^\beta}{16C_0} \tag{8.26}$$

(the constant $C_0$ is defined in Lemma 8.9), then (8.19) and, so, (8.22) hold.



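Note that the $P^*_{ij}$ in (8.27), $0\le2i+j\le2$, are "quadratic" in the variables $a, b, c, d$ (i.e. $P_{ij}$).

Lemma 8.11. $P^* := \int_0^1\{(1-t)[R] + tR, F\}\circ X^t_F\,dt\in F_{s-2\sigma}$ and

$$|P^*_{00}|^\lambda_{s-2\sigma}\lessdot\sigma^{2-6\beta}r^2\alpha^{p_a}(ac + b^2),\qquad|P^*_{01}|^\lambda_{s-2\sigma}\lessdot\sigma^{2-6\beta}r\alpha^{p_b}b(c + d),$$
$$|P^*_{10}|^\lambda_{s-2\sigma}\lessdot\sigma^{2-6\beta}\alpha c^2,\qquad|P^*_{02}|^\lambda_{s-2\sigma}\lessdot\sigma^{2-6\beta}\alpha d(c + d), \tag{8.27}$$

where $\beta$ is defined in (8.20).

Proof. In the Appendix.

We define the higher order terms of the perturbation

$$P_4 := \sum_{2i+j\ge4}P_{ij}(x)y^iw^j \quad\text{so that}\quad P = P_{\le2} + P_{11}yw + P_{03}w^3 + P_4 \tag{8.28}$$

($P_{\le2}$ was defined in (1.10)). Note that $\partial^i_y\partial^j_wP = \partial^i_y\partial^j_wP_4$ if $2i+j = 4$. We also define

$$\Phi_{00} := \Phi^+|_{\{y_+=0,\,w_+=0\}} = (x_+ + x_{00}(x_+;\xi),\ y_{00}(x_+;\xi),\ w_{00}(x_+;\xi)).$$

By Lemma 8.9, $\Phi_{00} : \{|{\rm Im}\,x| < s-2\sigma\}\to D(s-\sigma, r-\delta r/2)$, $\forall\xi\in\Pi$.

Lemma 8.12. We have

$$\begin{aligned}
|P_4\circ\Phi_{00}|&\lessdot\Theta\big(\delta^{-1}|y_{00}|^2 + \delta^{-1}|y_{00}||w_{00}|^2 + |w_{00}|^4\big),\\
|(\partial_yP_4)\circ\Phi_{00}|&\lessdot\Theta\big(\delta^{-1}|y_{00}| + |w_{00}|^2\big),\\
|(\partial^2_{yw}P_4)\circ\Phi_{00}|&\lessdot\Theta\big((\delta r)^{-1}|y_{00}| + |w_{00}|\big),\\
|(\partial_wP_4)\circ\Phi_{00}|&\lessdot\Theta\big((\delta r)^{-1}|y_{00}|^2 + \delta^{-1}|y_{00}||w_{00}| + |w_{00}|^3\big),\\
|(\partial^2_{ww}P_4)\circ\Phi_{00}|&\lessdot\Theta\big(\delta^{-1}|y_{00}| + |w_{00}|^2\big),\\
|(\partial^3_{www}P_4)\circ\Phi_{00}|&\lessdot\Theta\big((\delta r)^{-1}|y_{00}| + |w_{00}|\big),\\
|(\partial^3_{yyw}P_4)\circ\Phi_{00}|&\lessdot\Theta(\delta r)^{-1},\qquad|(\partial^3_{yyy}P_4)\circ\Phi_{00}|\lessdot\Theta(\delta r)^{-2},
\end{aligned}$$

where all the norms $|\cdot| := |\cdot|^\lambda_{s-2\sigma}$ and $\Theta$ is defined in (5.5). The further estimates

$$|(\partial^2_{yy}P_4)\circ\Phi_{00}|,\ |(\partial^3_{yww}P_4)\circ\Phi_{00}|\le\Theta$$

follow immediately from the definition of $\Theta$ in (5.5).

Proof. In the Appendix.

We now estimate $\tilde P := (P - R)\circ\Phi^+$. Note the "linear" term in the variables $a, b, c, d$.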


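Lemma 8.13. $\tilde P := (P - R)\circ\Phi^+ = (P_{11}yw + P_{03}w^3 + P_4 + T_K^\perp P_{\le2})\circ\Phi^+\in F_{s-2\sigma}$ and

$$\begin{aligned}
\sigma^{8\beta-4}|\tilde P_{00}|^\lambda_{s-2\sigma}&\lessdot|P_{11}|^\lambda_sr^3\alpha^{p_a+p_b-2}(ab + b^3) + |P_{03}|^\lambda_sr^3\alpha^{3p_b-3}b^3 + \delta^{-1}\Theta r^4\alpha^{2p_a-2}(a^2 + b^4) + K^ne^{-K\sigma}r^2\alpha^{p_a}(a + b^2),\\
\sigma^{6\beta-3}|\tilde P_{01}|^\lambda_{s-2\sigma}&\lessdot|P_{11}|^\lambda_sr^2\alpha^{p_a-1}(a + b^2) + |P_{03}|^\lambda_sr^2\alpha^{2p_b-2}b^2 + \delta^{-1}\Theta r^3\alpha^{p_a+p_b-2}(a + b^2)b + K^ne^{-K\sigma}r\alpha^{p_b}b,\\
\sigma^{4\beta-2}|\tilde P_{10}|^\lambda_{s-2\sigma}&\lessdot|P_{11}|^\lambda_sr\alpha^{p_b-1}b + \delta^{-1}\Theta r^2\alpha^{p_a-1}(a + b^2) + K^ne^{-K\sigma}\alpha c,\\
\sigma^{4\beta-2}|\tilde P_{02}|^\lambda_{s-2\sigma}&\lessdot(|P_{11}|^\lambda_s + |P_{03}|^\lambda_s)r\alpha^{p_b-1}b + \delta^{-1}\Theta r^2\alpha^{p_a-1}(a + b^2) + K^ne^{-K\sigma}\alpha d,\\
\sigma^{2\beta-1}|\tilde P_{11} - P_{11}|^\lambda_{s-2\sigma}&\lessdot|P_{11}|^\lambda_s(c + d) + \delta^{-1}\Theta r\alpha^{p_a-1}(a + b),\\
\sigma^{2\beta-1}|\tilde P_{03} - P_{03}|^\lambda_{s-2\sigma}&\lessdot(|P_{11}|^\lambda_s + |P_{03}|^\lambda_s)d + \delta^{-1}\Theta r\alpha^{p_a-1}(a + b),
\end{aligned}$$

where $\beta$ is defined in (8.20).

Proof. In the Appendix.

We summarize the previous estimates in the following key lemma.

Lemma 8.14 (KAM step). Assume (8.26). Then, $\forall\xi\in\Pi$ satisfying (8.17), there is a symplectic map

$$\Phi^+(\cdot;\xi) : D(s-2\sigma, r-\delta r)\to D(s-\sigma, r) \quad\text{with}\quad 0 < 2\sigma < s,\ \ 0 < \delta < 1,$$

satisfying (8.22), such that

$$H^+ := H\circ\Phi^+ = N^+ + P^+ = (N + \hat N) + P^+ = (N + [P]) + P^+$$

and $P^+ = P^* + \tilde P$ satisfies the estimates of Lemmata 8.11 and 8.13.

We define $a_+, b_+, c_+, d_+$ like $a, b, c, d$ in (8.25), with $P^+_{ij}$, $s_+ := s - 2\sigma$, $\alpha_+$, $r_+$ instead of $P_{ij}$, $s$, $\alpha$, $r$.

Lemma 8.15. Assume (8.26), $\Theta r^2\le18\alpha$ and $|P_{11}|^\lambda_s\le9\alpha^{p_e}/r$, $|P_{03}|^\lambda_s\le9\alpha^{p_f}/r$, where

$$p_e := 3 - p_a - p_f = \begin{cases}1/2 & \text{if (H1)}\\ 5/4 & \text{if (H2)}\\ 1 & \text{if (H3)}\end{cases} \quad\text{and}\quad p_f := \begin{cases}1/2 & \text{if (H1) or (H2)}\\ 1 & \text{if (H3)}.\end{cases} \tag{8.29}$$

We have that

$$\begin{aligned}
a_+&\le C_1(ac + b^2 + a^2 + K^ne^{-K\sigma}a)/\delta\sigma^{\tilde\beta},\\
b_+&\le C_1(a + b^2 + bc + bd + K^ne^{-K\sigma}b)/\delta\sigma^{\tilde\beta},\\
c_+&\le C_1(b + c^2 + a + K^ne^{-K\sigma}c)/\delta\sigma^{\tilde\beta},\\
d_+&\le C_1(b + cd + d^2 + a + K^ne^{-K\sigma}d)/\delta\sigma^{\tilde\beta},
\end{aligned} \tag{8.30}$$

where $\tilde\beta := 16\tau + 8n + 12$ and $C_1 = \mathcal K^c$ for some absolute constant $c > 0$ ($\mathcal K$ defined in Lemma 8.8).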



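Proof. By Lemma 8.14 (see the estimates of Lemmata 8.11 and 8.13), with $\tilde\beta = 8\beta - 4$, we get

$$\begin{aligned}
\sigma^{\tilde\beta}a_+&\lessdot ac + b^2 + (ab + b^3)\alpha^{p_b+p_e-2} + b^3\alpha^{3p_b+p_f-p_a-3} + \delta^{-1}(a^2 + b^4)\Theta r^2\alpha^{p_a-2} + K^ne^{-K\sigma}a,\\
\sigma^{\tilde\beta}b_+&\lessdot bc + bd + (a + b^2)\alpha^{p_a+p_e-p_b-1} + b^2\alpha^{p_b+p_f-2} + \delta^{-1}(ab + b^3)\Theta r^2\alpha^{p_a-2} + K^ne^{-K\sigma}b,\\
\sigma^{\tilde\beta}c_+&\lessdot c^2 + b\alpha^{p_b+p_e-2} + \delta^{-1}(a + b^2)\Theta r^2\alpha^{p_a-2} + K^ne^{-K\sigma}c,\\
\sigma^{\tilde\beta}d_+&\lessdot cd + d^2 + b\alpha^{p_b+p_e-2} + b\alpha^{p_b+p_f-2} + \delta^{-1}(a + b^2)\Theta r^2\alpha^{p_a-2} + K^ne^{-K\sigma}d,
\end{aligned}$$

which imply (8.30) thanks to $\Theta r^2\le18\alpha$, (5.11), (8.29) and (8.26).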

8.4. KAM Iteration. We fix $\chi$ such that

$$1 < \chi < 2^{1/3},\qquad\chi^4 + 1 > \chi^5. \tag{8.31}$$
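A quick numerical look at the two conditions in (8.31) may help when choosing $\chi$. The sketch below is illustrative only; the function names `chi_gap` and `is_admissible` are ours. It evaluates $\chi^4 + 1 - \chi^5$, which is decreasing for $\chi\ge1$ and is still positive at $\chi = 2^{1/3}$, so on the interval $(1, 2^{1/3})$ the second inequality holds as soon as the first does.

```python
def chi_gap(chi: float) -> float:
    """Left side minus right side of the second inequality in (8.31)."""
    return chi**4 + 1.0 - chi**5


def is_admissible(chi: float) -> bool:
    """True when chi satisfies both conditions in (8.31)."""
    return 1.0 < chi < 2.0 ** (1.0 / 3.0) and chi_gap(chi) > 0.0


if __name__ == "__main__":
    upper = 2.0 ** (1.0 / 3.0)
    print("gap at the upper endpoint:", chi_gap(upper))
    for chi in (1.05, 1.15, 1.25, 1.30, 1.33):
        print(chi, chi_gap(chi), is_admissible(chi))
```

The sign change of the gap between 1.32 and 1.33 lies outside the interval $(1, 2^{1/3})$, since $2^{1/3}\approx1.26$.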

Lemma 8.16. Let $\{(a_j, b_j, c_j, d_j)\}_{0\le j\le\nu}$ be a sequence of positive numbers satisfying

$$\begin{aligned}
a_{j+1}&\le\kappa^{j+1}\big(a_jc_j + b_j^2 + a_j^2 + K_*^ne^{-K_*2^j}a_j\big),\\
b_{j+1}&\le\kappa^{j+1}\big(a_j + b_j^2 + b_jc_j + b_jd_j + K_*^ne^{-K_*2^j}b_j\big),\\
c_{j+1}&\le\kappa^{j+1}\big(b_j + c_j^2 + a_j + K_*^ne^{-K_*2^j}c_j\big),\\
d_{j+1}&\le\kappa^{j+1}\big(b_j + c_jd_j + d_j^2 + a_j + K_*^ne^{-K_*2^j}d_j\big),
\end{aligned}\qquad\forall\,0\le j\le\nu-1, \tag{8.32}$$

where $\kappa > e^e$ and $K_*\ge26 + 6\ln\kappa + 16n^2$. There exists $0 < \gamma_0 := \gamma_0(\kappa,\chi)\le1/3$ such that

$$a_0, b_0, c_0, d_0\le\varepsilon_0\le\gamma_0 \quad\Longrightarrow\quad a_j, b_j, c_j, d_j\le\gamma_0^{-1}\varepsilon_0e^{-\chi^j},\qquad\forall\,0\le j\le\nu. \tag{8.33}$$

In particular one can take $\gamma_0 = \kappa^{-c\ln(\ln\kappa)}$ for some $c = c(\chi)\ge1$.

Proof. The detailed computations are given in the Appendix.

Note the linear terms $a_j$, $b_j$ in the last three inequalities in (8.32). This seems to contrast with the superconvergent estimate (8.33). However the estimate of $(a_{j+3}, b_{j+3}, c_{j+3}, d_{j+3})$ in terms of $(a_j, b_j, c_j, d_j)$ is quadratic.

For $\nu\in\mathbb N$ we define
• $\sigma_\nu := \sigma_02^{-\nu}$, $\sigma_0 := \frac{s_0}{8}$, $s_{\nu+1} := s_\nu - 2\sigma_\nu\searrow\frac{s_0}{2}$,
• $\delta_\nu := 2^{-\nu-3}$, $r_{\nu+1} := (1-\delta_\nu)r_\nu\searrow r_0\prod_{\nu=0}^\infty(1-\delta_\nu) > \frac{r_0}{2}$,
• $\alpha_0 < 1$, $M_\nu := M_0(2 - 2^{-\nu})\nearrow2M_0$, $\lambda_\nu := \frac{\alpha_0}{M_\nu}\searrow\frac{\alpha_0}{4M_0}$,
• $K_\nu := K_04^\nu$, $K_0 := \frac{8K_*}{s_0}$, $K_{-1} := 0$, $D_\nu := D(s_\nu, r_\nu)$, $K_* := 26 + 6\ln\kappa + 16n^2$.
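The geometric sequences just defined can be tabulated to see how the analyticity strip and the radius shrink while the Fourier cut-off grows. The following Python sketch is a minimal illustration under our own assumptions: the function name `kam_sequences`, the dictionary layout and the sample values of $s_0$, $r_0$, $K_*$ are hypothetical and not taken from the paper. It generates $\sigma_\nu$, $s_\nu$, $\delta_\nu$, $r_\nu$, $K_\nu$ from the formulas above.

```python
import math


def kam_sequences(s0: float, r0: float, K_star: float, n_steps: int):
    """Tabulate sigma_nu, s_nu, delta_nu, r_nu, K_nu for nu = 0..n_steps-1,
    following sigma_nu = (s0/8) 2^-nu, s_{nu+1} = s_nu - 2 sigma_nu,
    delta_nu = 2^(-nu-3), r_{nu+1} = (1 - delta_nu) r_nu, K_nu = (8 K_*/s0) 4^nu."""
    sigma0 = s0 / 8.0
    K0 = 8.0 * K_star / s0
    s, r = s0, r0
    rows = []
    for nu in range(n_steps):
        sigma = sigma0 * 2.0 ** (-nu)
        delta = 2.0 ** (-nu - 3)
        K = K0 * 4.0 ** nu
        rows.append({"nu": nu, "s": s, "r": r, "sigma": sigma, "delta": delta, "K": K})
        s -= 2.0 * sigma          # strip width shrinks towards s0/2
        r *= 1.0 - delta          # radius shrinks but stays above r0/2
    return rows


if __name__ == "__main__":
    # Hypothetical sample: kappa = 20 > e^e and n = 2 in K_* = 26 + 6 ln(kappa) + 16 n^2.
    K_star = 26.0 + 6.0 * math.log(20.0) + 16.0 * 2**2
    for row in kam_sequences(s0=1.0, r0=0.5, K_star=K_star, n_steps=6):
        print(row, "K*sigma =", row["K"] * row["sigma"])
```

In this table the product $K_\nu\sigma_\nu$ doubles at every step, which is the reason the truncation factor $e^{-K_\nu\sigma_\nu}$ decays super-exponentially along the iteration.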


Note that $K_\nu\sigma_\nu = K_*2^\nu\ge1$. Let us define

$$\kappa := 4C_1(4/s_0)^{\tilde\beta}, \tag{8.34}$$

where $C_1 = \mathcal K^c$, $\tilde\beta = 16\tau + 8n + 12$ are introduced in Lemma 8.15 and $\mathcal K = (n+\tau)^{c(n+\tau)}$ in Lemma 8.8. Here and in the following $c, c', \ldots$ denote "absolute" constants depending (possibly) on $\chi$ only. We set

$$\gamma_0 := \gamma_0(\kappa,\chi) \text{ as in Lemma 8.16 with } \kappa \text{ in (8.34)}. \tag{8.35}$$

Note that, for some $1 < c_- < c_+$,

$$e^{c_-\tau_0}\le\kappa\le e^{c_+\tau_0},\qquad\tau_0^{-c_+\tau_0}\le\gamma_0\le\tau_0^{-c_-\tau_0},\qquad\text{with } \tau_0 := (\tau+n)\ln((\tau+n)/s_0). \tag{8.36}$$

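In the following lemma we set $|\cdot|_\nu := |\cdot|^{\lambda_\nu}_{s_\nu}$ for brevity.

Lemma 8.17 (Iterative Lemma). Let $H^0 = N^0 + P^0 : D_0\times\Pi_{-1}\to\mathbb C$ be analytic in $D_0$ with $\Pi_{-1}\subset\mathbb R^m$, $N^0 := e_0 + \omega^0(\xi)\cdot y + \Omega^0(\xi)\cdot z\bar z$ in normal form and $|\omega^0|^{\rm lip} + |\Omega^0|^{\rm lip}_{-\delta_*}\le M_0$. Define

$$a_0 := \frac{|P^0_{00}|_0}{r_0^2\alpha_0^{p_a}},\qquad b_0 := \frac{|P^0_{01}|_0}{r_0\alpha_0^{p_b}},\qquad c_0 := \frac{|P^0_{10}|_0}{\alpha_0},\qquad d_0 := \frac{|P^0_{02}|_0}{\alpha_0}.$$

There exist $C = \gamma_0^{-c_*} > 1$, $\gamma = \gamma_0^c < 1$ (for some absolute constants $c > c_* > 1$), such that, if the smallness conditions

$$\max\{a_0, b_0, c_0, d_0\} =: \varepsilon_0\le\gamma,\qquad r_0|P^0_{11}|_0\le\alpha_0^{p_e},\qquad r_0|P^0_{03}|_0\le\alpha_0^{p_f},\qquad\Theta_0r_0\le\sqrt{\alpha_0}, \tag{8.37}$$

are satisfied (the constant $\Theta_0$ is defined as in (5.5) for $P^0$), then:

(S1)$_\nu$ $\forall\,0\le j\le\nu$ there exist $H^j = N^j + P^j : D_j\times\Pi_{j-1}\to\mathbb C$, analytic in $D_j$, with $N^j := e_j + \omega^j(\xi)\cdot y + \Omega^j(\xi)\cdot z\bar z$ in normal form and

$$\Pi_j := \Big\{\xi\in\Pi_{j-1} : |\omega^j(\xi)\cdot k + \Omega^j(\xi)\cdot l|\ge\alpha_0\frac{\langle l\rangle_d}{1+|k|^\tau},\ \ \forall(k,l)\ne0,\ |k|\le K_j,\ |l|\le2\Big\}. \tag{8.38}$$

Moreover, $\forall\,1\le j\le\nu$, $H^j = H^{j-1}\circ\Phi^j$, where $\Phi^j : D_j\times\Pi_{j-1}\to D_{j-1}$ are a Lipschitz family of real analytic symplectic maps of the form $\Phi^j = I + \Psi^j$ with $\Psi^j\in E_{s_j}$ satisfying

$$\begin{aligned}
|x^j_{00}|_j,\ |y^j_{10}|_j&\le C\,2^{(2\beta-1)(j-1)}c_{j-1},\\
|y^j_{00}|_j&\le C\,2^{(2\beta-1)(j-1)}r_0^2\alpha_0^{p_a-1}(a_{j-1} + b^2_{j-1}),\\
|y^j_{01}|_j,\ |w^j_{00}|_j&\le C\,2^{(2\beta-1)(j-1)}r_0\alpha_0^{p_b-1}b_{j-1},\\
|y^j_{02}|_j,\ |w^j_{01}|_j&\le C\,2^{(2\beta-1)(j-1)}d_{j-1},
\end{aligned} \tag{8.39}$$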



where j

a j :=

|P00 | j p

r 2j α0 a

j

, b j :=

|P01 | j p

r j α0 b

j

j

|P | j |P | j , c j := 10 , d j := 02 . α0 α0

(8.40)

˜ j of ω j , j defined on −1 (S2)ν ∀0 ≤ j ≤ ν there exist Lipschitz extensions ω˜ j , and, for j ≥ 1, j−1

|ω˜ j − ω˜ j−1 | , |ω˜ j − ω˜ j−1 |lip ≤ |P10 |s j−1 , ˜ j − ˜ j−1| p− ˜ j − ˜ j−1| | ¯ p , | p− ¯ p ≤ |P02 |s j−1 , lip ˜ j| |ω˜ j |lip + | ≤ Mj . lip

j−1

−δ∗

(8.41) (8.42)

(S3)ν {(a j , b j , c j , d j )}0≤ j≤ν satisfy (8.32) with κ defined in (8.34). j (S4)ν ∀0 ≤ j ≤ ν − 1, the a j , b j , c j , d j ≤ γ0−1 ε0 e−χ with γ0 defined in (8.35). (S5)ν ∀ 1 ≤ j ≤ ν we have j ≤ 90 (see (8.13)) and p −1/2

0 |P11 − P11 | j ≤ 2− j−1 C ε0 (|P11 |0 + α0 a j

j |P03

j−1

−

j−1 P03 | j

≤2

− j−1

0 C ε0 (|P03 |0

0 + |P11 |0

),

(8.43)

p −1/2 + α0 a ).

(8.44)

˜ j := 1 ◦ 2 ◦ · · · ◦ j = I + ˜ j with (S6)ν ∀1 ≤ j ≤ ν, the composed map ˜ j ∈ Es j satisfies |x˜00 | j , | y˜10 | j , | y˜02 | j , |w˜ 01 | j ≤ C2 (1 − 2− j )ε0 , j

j

j | y˜00 | j

≤

j

j

p −1 C2 (1 − 2− j )r02 α0 a ε0 ,

j j | y˜01 | j , |w˜ 00 | j

(8.45) ≤

p −1 C2 (1 − 2− j )r0 α0 b ε0 .

Proof. The statements (S1)0 and (S2)0 follow by the hypothesis of the lemma, (8.37) ˜ 0 = 0 . The (S4)0 holds by (8.37) because γ0 ≤ 1/3 (see and setting ω˜ 0 := ω0 , Lemma 8.16). The (S6)1 follows by (S1)0 . Note that (S3)0 and (S5)0 trivially hold since there is nothing to verify in (8.32), (8.43) and (8.44) for ν = 0. Then, by induction, we prove the statements (Si)ν+1 , i = 1, . . . , 6. (S4)ν+1 follows by (8.37), (S3)ν and Lemma 8.16. (S1)ν+1 . By (S4)ν+1 we have, since ε0 ≤ γ = γ0c , ν

ν

aν , bν , cν , dν ≤ γ0−1 ε0 e−χ ≤ γ0c −1 e−χ ≤

β

δν σν 16C0

(8.46)

for c large enough. Indeed, since σν := s0 2−ν /8, δν := 2−ν−3 , β := 2τ + n + 2, we get

cβ ν e−χ β τ + n c(τ +n) −β −χ ν (β+1)(ν+3) sup = sup s e 2 ≤ ≤ . 0 β s0 s0 ν≥0 δν σν ν≥0 Then (8.46) follows, for c large enough, by (8.36) and C0 = K c = (τ + n)c" (τ +n) , see Lemma 8.9. Then, by (8.46), ∀ξ ∈ ν , Lemma 8.14 applies with N = N ν , P = Pν , s = sν , σ = σν , r = rν , α = α0 , δ = δν , M = Mν . There exists a real analytic symplectic map ν+1 : Dν+1 × ν → Dν , Lipschitz in ν , such that H ν+1 = H ν ◦ ν+1 =: N ν+1 + P ν+1 ,

N ν+1 := N ν + [P ν ] .


The estimates (8.39) follow by (8.22) and (8.40), taking C large enough (namely c∗ large enough). (S2)ν+1 . The frequency maps ων+1 ν+1 are defined on ν and, by Lemma 8.10, satisfy the estimates ν ν |ων+1 − ων | ≤ |P10 |sν , |ων+1 − ων |lip ≤ |P10 |sν , lip

| ν+1 − ν | p− ¯ p ≤

ν |P02 |sν

lip , | ν+1 − ν | p− ¯ p

≤

(8.47)

ν lip |P02 |sν .

(8.48)

By the Kirszbraun theorem (see e.g. [25]), used componentwise, they can be extended to ˜ ν+1 defined on the whole −1 preserving the same sup-norm and Lipschitz maps ω˜ ν+1 , seminorms (8.47)-(8.48). As a consequence, and since | |−δ∗ ≤ | | p− ¯ p (recall (3.3)), we get ν ν −1 ˜ ν+1 | |ω˜ ν+1 |lip + | −δ∗ ≤ Mν + |P10 |ν + |P02 |ν ≤ Mν + λν α0 (cν + dν ) = Mν (1 + cν + dν ) ≤ Mν+1 lip

lip

lip

by (S4)ν and for c large enough. (S3)ν+1 follows by (8.30) and the definition of κ. The assumptions of Lemma 8.15 hold by (8.46), by (S5)ν

(8.37)

ν rν2 ≤ 90 rν2 ≤ 90 r02 ≤ 9α0 p

ν | ≤ 4α e /r , |P ν | ≤ 4α f /r , that follow by (S5) . Indeed, (recall 0 ≥ 1) and |P11 ν ν ν ν 0 0 03 ν by (8.44) with j = ν, and, since pa ≥ pe ≥ p f , we get by (8.37), p

p −1/2

ν 0 0 0 |P03 |ν ≤ |P03 |0 + C ε0 (|P03 |0 + |P11 |0 + α0 a p −1/2

≤ 3r0−1 α0 f + α0 f p

p −1/2

0 0 ) ≤ 2|P03 |0 + |P11 |0 + α0 a

≤ 4r0−1 α0 f ≤ 4rν−1 α0 f , p

p

(8.49)

ν | ≤ 9α e /r follows as well. for c large enough (with respect to c∗ ). The estimate |P11 ν ν 0 p

(S5)ν+1 . By the last inequality of Lemma 8.13, (S4)ν+1 , (8.37) and ν ≤ 90 we deduce ν

p −1

ν+1 ν ν ν |P03 − P03 |ν+1 ≤ Kc γ0−1 ε0 23βν e−χ (|P11 |ν + |P03 |ν + ν rν α0 a

≤2

−ν−2

0 C ε0 (|P11 |0

0 + |P03 |0

)

p −1/2 + α0 a )

with c∗ large enough, proving (8.44) with j = ν + 1. The proof of (8.43) for j = ν + 1 is analogous. ˜ ν = I + ˜ ν+1 . Finally, by (S6)ν and c large enough, we apply Lemma 8.2 with = Then (8.5) yields ν+1 ≤ 90 because ∂ y i w j H ν+1 = ∂ y i w j P ν+1 for 2i + j ≥ 3. ˜ = ˜ ν , = ν+1 , ˆ = ˜ ν+1 . Then (S6)ν+1 By (S1)ν we can apply Lemma 8.7 with ν+1 ν+1 ˜ ∈ Esν+1 and (S6)ν+1 follows. The estimate for y˜00 follows by the bound in (S6)ν ν | and the inequalities for | y˜00 ν ν+1 ν | y˜00 − y˜00 |ν+1

(8.12)

≤

ν+1 ν+1 ν |y00 |ν+1 + 2ν+3 s0−1 |x00 |ν+1 | y˜00 |ν ν ν+1 ν ν+1 ν ν+1 2 +2(| y˜01 |ν |w00 |ν+1 + | y˜10 |ν |y00 |ν+1 + | y˜02 |ν |w00 |ν+1 )

(S1)ν+1

≤

p −1

C2 2−ν−1 r02 α0 a

ε0



with c∗ large enough and, then, c large enough (w.r.t. c∗ ). All the other estimates are analogous. ˜ν = I + ˜ ν converges Corollary 8.1. For all ξ ∈ α0 := ∩ν≥0 ν the sequence uniformly on D(s0 /2, r0 /2) to an analytic symplectic map = I + , where ∈ Es0 /2 satisfies |x00 |λs00/2 , |y00 |λs00/2

1− pa

α0

r02

, |y01 |λs00/2

1− p

α0 b , r0

|y10 |λs00/2 , |y02 |λs00/2 , |w01 |λs00/2 , |w00 |λs00/2

1− pb

α0 r0

≤ γ0−c ε0

(8.50)

∞ (·, ξ ) = 0. and the perturbation P≤2

˜ ν = ν+1 ◦ ˜ ν is a Cauchy sequence by (8.39), (S4)ν+1 and ˜ ν+1 − Proof. The λ /4 (S6)ν . Estimates (8.50) follow by (8.45), and since | · |s00/2 ≤ 4| · |λs00/2 . Finally ∞ P≤2 (·, ξ ) = 0, ∀ξ ∈ α0 , follows by (8.40) and (S4)ν . Let us define ˜ν. ω∞ := lim ω˜ ν , ∞ := lim ν→∞

ν→∞

It could happen that ν0 = ∅ for some ν0 . In such a case α0 = ∅ and the iterative ˜ ν := process stops after finitely many steps. However, we can always set ω˜ ν := ω˜ ν0 , ˜ ν0 , ∀ν ≥ ν0 , and ω∞ , ∞ are always well defined. ˜ ν − ∞| p− ˜ ν − ∞|lip ≤ γ −c α0 ε0 e−χ ν . Lemma 8.18. |ω˜ ν −ω∞ |, | ˜ ν −ω∞ |lip , | ¯ p , |ω 0 p− ¯ p Proof. By (8.41), (8.40), (S4)ν , we have ∞ ∞ j ν |ω˜ ν − ω∞ | ≤ ω˜ j+1 − ω˜ j ≤ γ0−1 α0 ε0 e−χ ≤ γ0−c α0 ε0 e−χ . j=ν j=ν The other estimates are analogous.
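The last step in the proof of Lemma 8.18 uses that the tail $\sum_{j \geq \nu} e^{-\chi^j}$ is comparable to its first term $e^{-\chi^\nu}$. The sketch below checks this numerically for one illustrative value, `chi = 1.2`; this value is an assumption of the example (the paper only fixes $\chi$ through the conditions used in Lemma 8.16).

```python
import math


def tail_ratio(chi, nu, terms=200):
    """Return sum_{j >= nu} exp(-chi**j) divided by exp(-chi**nu).

    Each term is written as exp(-chi**nu * (chi**k - 1)) with k = j - nu,
    which avoids dividing two very small numbers. The series is truncated
    after `terms` terms; the neglected remainder is far below double precision.
    """
    base = chi ** nu
    total = 0.0
    for k in range(terms):
        total += math.exp(-base * (chi ** k - 1.0))
    return total


chi = 1.2  # illustrative value only, not taken from the paper
ratios = [tail_ratio(chi, nu) for nu in range(30)]
print("ratio at nu = 0:", ratios[0])
print("ratio at nu = 29:", ratios[-1])
```

The ratio is largest at $\nu = 0$ and decreases towards 1, so for this $\chi$ a single constant bounds the tail uniformly in $\nu$; in the lemma that constant is absorbed by passing from $\gamma_0^{-1}$ to $\gamma_0^{-c}$.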

End of the proof of Theorem 5.1. Case 1: Hypotheses (H1), (H2), or (H3)-(d > 1). We apply the Iterative Lemma with s0 := s, r0 :=

r , α0 := α, N 0 := N , P 0 := P, 0 := , M0 := M, −1 := . 2

The smallness assumption (8.37) follows by (5.5), (H1), (H2), (H3), (8.29), taking γ ≤ γ . Theorem 5.1 follows by the conclusions of Lemma 8.17, Corollary 8.1 and Lemma 8.18. Finally we prove the characterization of the Cantor set in terms of the limit frequencies (ω∞ , ∞ ). Lemma 8.19. ∞ ⊆ α := ∩ν≥0 ν .


Proof. By (3.3) we get |l| p− p¯ ≤ |l|d−1 ≤ 2l d . If ξ ∈ ∞ , we have, ∀ν ≥ 0, ∀|k| ≤ K ν , |l| ≤ 2, l d − |ων (ξ ) − ω∞ (ξ )||k| − 2|| ν − ∞| p− ¯ p l d 1 + |k|τ l d ≥α , (8.51) 1 + |k|τ

|ων (ξ ) · k + ν (ξ ) · l| ≥ 2α

because, by Lemma 8.18, for γ small enough, α α |ων (ξ ) − ω∞ (ξ )| ≤ . , | ν − ∞| p− ¯ p ≤ 2(1 + K ντ )K ν 4(1 + K ντ ) By (8.51) we deduce ∞ ⊂ ν , ∀ν ≥ 0.

Case 2: Hypothesis (H3)-(d = 1). We first perform one step of averaging. The homological equation {N , F} + P00 = P00 has a solution Fˆ := Fˆ00 , for all ξ ∈ such that5 ω(ξ ) ∈ Dα μ ,τ (see (1.15)). The ˆ := X 1 : D(s/2, r/2) → D(s, r ) has the form symplectic map ˆ F

ˆ + , y+ , w+ ) = (x+ , y+ + yˆ00 (x+ ), w+ ) (x and | yˆ00 |s/2 α −μ |P00 |s , where, here and in the following, | · |s and | · |s/2 are short for | · |λs and | · |λs/2 respectively. ˆ = N + Pˆ satisfies Then Hˆ := H ◦ | Pˆ00 |s/2 α −μ |P00 |s |P10 |s + α −2μ |P00 |2s ε32 r 2 α + ε32 r 4 ≤ 2ε32 r 2 α, | Pˆ01 |s/2 |P01 |s + α −μ |P11 |s |P00 |s |P01 |s + α 1/2 ε3 r 2 ≤ αε3r, | Pˆ10 |s/2 |P10 |s + α −μ |P00 |s |P10 |s + ε3 r 2 ≤ ε3 α, | Pˆ02 |s/2 |P02 |s + α −μ |P00 |s ≤ ε3 α, and so

ε˜ := max r −2 α −1 | Pˆ00 |s/2 , α −1 r −1 | Pˆ01 |s/2 , α −1 | Pˆ10 |s/2 , α −1 | Pˆ02 |s/2 ε3 .

Moreover | Pˆ11 − P11 |s/2 , | Pˆ03 − P03 |s/2 | yˆ00 |s/2 α −μ |P00 |s ≤ ε3r 2 ≤ ε3 α , ˆ ≤ 3. whence | Pˆ11 |s/2 , | Pˆ03 |s/2 ≤ 2α/r , if γ is small enough. By Lemma 8.2 we get We apply the Iterative Lemma with s r H 0 := Hˆ , N 0 := N , P 0 := Pˆ , s0 := , r0 := , α0 := α, 2 2 0 := 3 , M0 := M , ε0 := ε˜ , −1 := \ ω−1 (Dα μ ,τ ) . Then (8.37) follows since ε˜ ε3 ≤ γ , taking γ small enough (with respect to γ ). We now prove Remark 5.1 for analytic Hamiltonians. 5 Actually it is sufficient to require in (1.15) only finitely many non-resonance conditions, i.e. for |k| ≤ K¯ .



Remark 8.1. We only modify the statement (S2)ν stating the existence of C ∞ extensions of the frequency maps ω∞ , ∞ . We follow the cut-off procedure of [5]. The small divisor condition (8.38) holds with α0 /2 instead of α0 in the neighborhood

−(τ +1) , (8.52) N ( j ) := ξ ∈ j−1 : dist(ξ, j ) ≤ cα0 K j where c is a small constant. Then H j+1 exists for all ξ ∈ N ( j ), and the KAM iteration implies −χ . |ω j+1 − ω j | , | j+1 − j | p− ¯ p ≤ Cα0 ε0 e j

˜ j+1 − ˜ j for all the parameters ξ ∈ −1 By a cut-off procedure we define C ∞ -functions coinciding with j+1 − j on j and equal to zero outside N ( j ). Moreover, by (8.52), the derivatives of such extended frequency maps satisfy j ε0 (τ +1)q −(τ +1) q ) ≤ C(q) q−1 e−χ K j , α

−χ /(α K ˜ j+1 − ˜ j )|| p− | D q ( 0 j ¯ p ≤ Cα0 ε0 e j

∀q ≥ 1 .

An analogous estimate holds for ω˜ j+1 − ω˜ j . Summing in j ≥ 1 we get (5.15). We now discuss the estimates of Remark 5.3. Remark 8.2. By Lemma 8.17 the small constant γ := γ (n, τ, s) of Theorem 5.1 can be taken γ := γ0c , where γ0 is defined in (8.35). Then (8.36) implies the estimate for γ given in Remark 5.3. Proof of Remark 5.2. By (5.6), (1.14), λ = α/M, we get lip

|ω∞ − ω|lip , | ∞ − ||−δ∗ ≤ Mεi /γ .

(8.53)

lip

By (5.2), (3.3) we have |ω∞ |lip , | ∞|−δ∗ ≤ M + Mεi /γ ≤ 2M. Let ξ1 , ξ2 ∈ and −1 (ω ) − ω−1 (ω )| ≤ L|ω − ω | and ω j := ω∞ (ξ j ), j = 1, 2. We have |ξ1 − ξ2 | = |ω∞ 1 2 1 2 ∞ |ω∞ (ξ1 ) − ω∞ (ξ2 )| ≥ |ω1 − ω2 | − |(ω∞ − ω)(ξ1 ) − (ω∞ − ω)(ξ2 )| ≥ L −1 − |ω∞ − ω|lip |ξ1 − ξ2 | (8.53)

≥ (L −1 − γ −1 Mεi )|ξ1 − ξ2 | ≥ (2L)−1 |ξ1 − ξ2 | .

−1 |lip ≤ 2L. Therefore ω∞ is injective and |ω∞

Proof of Theorem 5.3. We have ω(ξ ) = a + Aξ, det A = 0, (ξ ) = b + Bξ and (B ∗ ) implies bi = i d + lower order terms , i > n ,

∗ B ∈ L(Cn , −δ ∞ ) , δ∗ < d − 1 . (8.54)

Since is compact and 0 ∈ / ω() there exist 0 < t− < t+ such that ω∞ () ∩ ωR ¯ + ⊂ [t− , t+ ]ω¯ . By Remark 5.2, for εi small enough, the perturbed frequency map ω∞ is invertible. Then, for all t ∈ [t− , t+ ] such that t ω¯ ∈ ω∞ () we define −1 ¯ ∞ (t) := ∞ ω∞ (t ω) ¯ = b + BA−1 (t ω¯ − a) + r (t),


where r (t) is a Lipschitz map satisfying, by (5.6) and (8.54), |r |−δ∗ α −1 , |r |−δ∗ ≤ cεi ≤ cγ . lip

(8.55)

The map r (t) can be extended to a Lipschitz map on the whole R preserving the bounds (8.55) applying the Kirszbraun theorem componentwise. We define ¯ ∞ (t) · l = (b − BA−1 a) · l + t (k + A−1 Bl) · ω¯ + r (t) · l . (8.56) f kl (t) := t ω¯ · k + We have to estimate the resonant set

+

ω∞ ( \ ∞ ) ∩ ωR ¯ +⊆

Rkl where

k∈Zn ,|l|≤2,(k,l)=0

2αl d . Rkl := t ∈ [t− , t+ ] : | f kl (t)| < 1 + |k|τ Let i0 := {|l| ≤ 2 : li = 0 , ∀ i > i 0 }. Note that i0 is a finite set. Lemma 8.20. There exists β1 > 0 (small enough) and i 0 (large enough) such that α ≤ β1 , l ∈ / i0 , |k| ≤ l d /8t+

⇒ Rkl = ∅ .

(8.57)

Proof. We first prove that if i 0 is large enough then |(b − BA−1 a + tBA−1 ω) ¯ · l| ≥ l d /4 , ∀ t ∈ [t− , t+ ] , 0 < |l| ≤ 2 , l ∈ / i0 . (8.58) We consider only the subtlest case l = ei − e j , i > j. Since l ∈ / i0 , we have i > i 0 . By (8.54) we get |b · l| ≥ l d /2 for i 0 large enough. If d > 1 then l d = i d − j d ≥ di d−1 . Then (8.58) follows for i 0 large enough since, by (8.54), |(BA−1 a + tBA−1 ω) ¯ · l| ≤ Ci δ∗ and δ∗ < d − 1. If d = 1, δ∗ < 0 and it is enough to prove that i − j ≥ C j δ∗ for some C > 1. For all j > j0 such that C j0δ∗ ≤ 1 the thesis follows because i − j ≥ 1. For all j ≤ j0 the thesis follows taking i 0 ≥ j0 + C. By (8.56), (8.58), (8.55), if t+ |k| ≤ l d /8 and α ≤ β1 is small enough, then | f kl (t)| ≥ implying that Rkl = ∅.

1 1 2αl d l d − const α − t+ |k| ≥ l d > , 4 9 1 + |k|τ

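Lemma 8.21. For $\bar\omega \in D_{K\alpha,\tau}$ with $K > 2/t_-$ then $R_{k0} = \emptyset$. Moreover for $\alpha$ small

$$|R_{kl}| \leq \mathrm{const}\, \frac{\alpha \langle l \rangle_d}{1 + |k|^\tau}, \qquad \forall\, k \in \mathbb{Z}^n, \ |l| \leq 2, \ (k, l) \neq 0. \tag{8.59}$$

Proof. Since $\bar\omega \in D_{K\alpha,\tau}$ with $K > 2/t_-$ then, for $t \in [t_-, t_+]$,

$$|f_{k0}(t)| = |t \bar\omega \cdot k| \geq t_- |\bar\omega \cdot k| \geq 2\alpha/(1 + |k|^\tau) \quad \Rightarrow \quad R_{k0} = \emptyset.$$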

We then discuss l = 0. Moreover, by Lemma 8.20, we consider only l ∈ i0 or |k| > l d /8t+ . By the hypotheses (5.22) and (8.54), arguing as in Remark 2.1, cl := (b − BA−1 a) · l

satisfies

|cl | ≥ δ¯ > 0 , ∀ 0 < |l| ≤ 2 .

(8.60)



Now set $m_{kl} := (k + A^{-1} B l) \cdot \bar\omega$. If $|m_{kl}| < \bar\delta/(3t_+)$, by (8.56), (8.60), (8.55), for $\alpha$ small enough,

$$|f_{kl}(t)| \geq |c_l| - \frac{\bar\delta}{3} - 2c\gamma\alpha \geq \frac{\bar\delta}{2} \overset{(8.57)}{\geq} \frac{2\alpha \langle l \rangle_d}{1 + |k|^\tau} \quad \Rightarrow \quad R_{kl} = \emptyset.$$

If $|m_{kl}| \geq \bar\delta/(3t_+)$ we have $|f_{kl}(t_2) - f_{kl}(t_1)| \geq |t_2 - t_1| (|m_{kl}| - 2c\gamma) \geq |t_2 - t_1| \bar\delta/(4t_+)$ for $\gamma$ small enough and (8.59) follows with const $= 8t_+/\bar\delta$.

Now the proof of (5.23) proceeds as in [26] or Subsect. 7.1 (recalling Remark 7.3, now (7.17) holds also for $d = 1$ since $\hat n = n$, $D = 2$). Note that (8.57) and (8.59) are the analogues of Lemma 7.4 and Lemmata 7.7 (case $d > 1$), 7.10 (case $d = 1$) respectively.

Acknowledgements. We thank Michela Procesi for useful comments.

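9. Appendix

Proof of Lemma 8.6. We take $0 \leq t \leq 1$. For brevity we write $|\cdot|$ instead of $|\cdot|^\lambda$.

Step 1. The solution of the first equation in (8.8) with $x^0 = x_+$ has the form

$$x^t = x_+ + x^t_{00}(x_+) \quad \text{where} \quad x^t_{00}(x_+) = \int_0^t F_{10}\big(x_+ + x^\tau_{00}(x_+)\big)\, d\tau.$$

By (8.9) and (8.1) we get $|x^t_{00}|_{s-\sigma} \leq \sigma/2$ and the estimate (8.10) for $x^t_{00}$ follows.

Step 2. Substituting $x^t$ in the third equation in (8.8) we get

$$\dot w^t = -iJ \tilde F^t_{01} - 2iJ \tilde F^t_{02} w^t =: b^t + A^t w^t \quad \text{where} \quad \tilde F^t_{ij} := F_{ij}\big(x_+ + x^t_{00}(x_+)\big). \tag{9.61}$$

By (8.1) we have $|\tilde F^t_{ij}|_{s-\sigma} \leq 2|F_{ij}|_s$ and so

$$|b^t|_{s-\sigma} \leq 2|F_{01}|_s, \qquad |A^t|_{s-\sigma} \leq 4|F_{02}|_s \overset{(8.9)}{\leq} 1/3. \tag{9.62}$$

Let $M^t$ be the solution of the homogeneous system $\dot M^t = A^t M^t$ with $M^0 = I$. We have

$$|M^t - I|_{s-\sigma} \leq \int_0^t |A^\tau|_{s-\sigma} |M^\tau|_{s-\sigma}\, d\tau \overset{(9.62)}{\leq} \frac{1}{3} \sup_{0 \leq t \leq 1} |M^t|_{s-\sigma} \leq \frac{1}{3} + \frac{1}{3} \sup_{0 \leq t \leq 1} |M^t - I|_{s-\sigma},$$

whence

$$|M^t|_{s-\sigma} \leq \frac{3}{2} \quad \text{and} \quad |M^t - I|_{s-\sigma} \leq \frac{3}{2} \sup_{0 \leq t \leq 1} |A^t|_{s-\sigma} \overset{(9.62)}{\leq} 6|F_{02}|_s \overset{(8.9)}{\leq} \frac{1}{2}. \tag{9.63}$$

Then, by Neumann series,

$$|(M^t)^{-1}|_{s-\sigma} \leq \sum_{j \geq 0} |M^t - I|^j_{s-\sigma} \leq 2. \tag{9.64}$$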
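The Neumann series step (9.64) can be illustrated in finite dimension. The sketch below is only an analogy: it uses a hypothetical $2 \times 2$ matrix and the row-sum operator norm in place of the weighted analytic norm $|\cdot|_{s-\sigma}$ of the paper. It sums $\sum_{j \geq 0} (I - M)^j$ and checks that the result inverts $M$ and has norm at most 2 when $|M - I| \leq 1/2$.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def norm_inf(a):
    """Operator norm induced by the max-norm on vectors: maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def neumann_inverse(m, terms=80):
    """Approximate M^{-1} by the partial sum of sum_{j>=0} (I - M)^j."""
    n = len(m)
    e = [[(1.0 if i == j else 0.0) - m[i][j] for j in range(n)] for i in range(n)]
    power = identity(n)
    total = identity(n)
    for _ in range(terms):
        power = mat_mul(power, e)   # power = (I - M)^j
        total = mat_add(total, power)
    return total


# Hypothetical example matrix with |M - I| = 0.35 <= 1/2 in the row-sum norm.
M = [[1.2, -0.1], [0.15, 0.8]]
M_inv = neumann_inverse(M)
residual = mat_mul(M, M_inv)
print("norm of inverse:", norm_inf(M_inv))
```

For a submultiplicative norm the geometric series gives $|M^{-1}| \leq 1/(1 - |M - I|) \leq 2$, which is the bound used in (9.64).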


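The solution of the non-homogeneous problem (9.61) with $w^0 = w_+$ is

$$w^t = w_+ + (M^t - I) w_+ + M^t \int_0^t (M^\tau)^{-1} b^\tau\, d\tau =: w_+ + w^t_{01}(x_+) w_+ + w^t_{00}(x_+). \tag{9.65}$$

The estimates (8.10) on $w^t_{00}$ and $w^t_{01}$ follow by (9.65), (9.63), (9.64), (9.62).

Step 3. Finally, substituting $x^t$ and $w^t$ in the second equation (8.8), we get

$$\dot y^t = -\hat F^t_{00} - \hat F^t_{01} w^t - \hat F^t_{02} w^t \cdot w^t - \hat F^t_{10} y^t =: \hat b^t + \hat A^t y^t, \tag{9.66}$$

where $\hat F^t_{ij} := F'_{ij}\big(x_+ + x^t_{00}(x_+)\big)$, $\hat A^t = -\hat F^t_{10}$, and, using (9.65),

$$\hat b^t = -\big[\hat F^t_{00} + \hat F^t_{01} w^t_{00} + \hat F^t_{02} w^t_{00} \cdot w^t_{00}\big] - \big[\hat F^t_{01} (I + w^t_{01}) + 2 (w^t_{00}) \hat F^t_{02} (I + w^t_{01})\big] w_+ - (I + w^t_{01}) \hat F^t_{02} (I + w^t_{01}) w_+ \cdot w_+. \tag{9.67}$$

Since $|x^t_{00}|_{s-\sigma} \leq \sigma/2$, by Cauchy estimates and (8.1) we get

$$|\hat F^t_{ij}|_{s-\sigma} \leq 2|F'_{ij}|_{s-\frac{\sigma}{2}} \leq \frac{4}{\sigma} |F_{ij}|_s \quad \Rightarrow \quad |\hat A^t|_{s-\sigma} \leq \frac{4}{\sigma} |F_{10}|_s \overset{(8.9)}{\leq} \frac{1}{3}. \tag{9.68}$$

Let $\hat M^t$ be the solution of $\dot{\hat M}^t = \hat A^t \hat M^t$ with $\hat M^0 = I$. Reasoning as in Step 2 we get

$$|\hat M^t|_{s-\sigma} \leq \frac{3}{2}, \qquad |\hat M^t - I|_{s-\sigma} \leq \frac{3}{2} |\hat A^t|_{s-\sigma} \leq \frac{6}{\sigma} |F_{10}|_s \overset{(8.9)}{\leq} \frac{1}{2} \qquad \text{and} \qquad |(\hat M^t)^{-1}|_{s-\sigma} \leq 2. \tag{9.69}$$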

The solution of the non-homogeneous system (9.66) with $y^0 = y_+$ is

$$y^t = y_+ + (\hat M^t - I) y_+ + \hat M^t \int_0^t (\hat M^\tau)^{-1} \hat b^\tau\, d\tau = y_+ + y^t_{00}(x_+) + y^t_{01}(x_+) w_+ + y^t_{10}(x_+) y_+ + y^t_{02}(x_+) w_+ \cdot w_+,$$

where, by (9.67),

$$y^t_{00} = -\hat M^t \int_0^t (\hat M^\tau)^{-1} \big[\hat F^\tau_{00} + \hat F^\tau_{01} w^\tau_{00} + \hat F^\tau_{02} w^\tau_{00} \cdot w^\tau_{00}\big]\, d\tau,$$

$$y^t_{01} = -\hat M^t \int_0^t (\hat M^\tau)^{-1} \big[\hat F^\tau_{01} (I + w^\tau_{01}) + 2 (w^\tau_{00}) \hat F^\tau_{02} (I + w^\tau_{01})\big]\, d\tau,$$

$$y^t_{10} = \hat M^t - I,$$

$$y^t_{02} = -\hat M^t \int_0^t (\hat M^\tau)^{-1} (I + w^\tau_{01}) \hat F^\tau_{02} (I + w^\tau_{01})\, d\tau.$$

The estimates (8.10) on $y^t_{ij}$ follow by (9.69), (9.68) and the previous estimates for $w_{00}$, $w_{01}$.

We finally prove that $X^t_F : D(s - \sigma, r - \delta r) \to D(s, r)$. If $(x_+, y_+, w_+) \in D(s - \sigma, r - \delta r)$ then

$$|\mathrm{Im}\, x^t(x_+)| = |\mathrm{Im}\, x_+ + \mathrm{Im}\, x^t_{00}(x_+)| \leq s - \sigma + |x^t_{00}|_{s-\sigma} \overset{(8.10)}{\leq} s - \sigma + 2|F_{10}|_s \overset{(8.9)}{<} s.$$

The estimates $|y^t(x_+, y_+, w_+)| < r^2$, $\|w^t(x_+, w_+)\|_{a,p} < r$, follow as well by (8.10), (8.11).



1 1 Proof of Lemma 8.11. We estimate 0 t{R, F} ◦ X tF dt. The term 0 (1 − t){[R], F} ◦ X tF dt is analogous. The statement follows by Lemma 8.4 (with s → s − 3σ 2 ,s −σ → s − 2σ ), Lemma 8.5 (with G = {R, F}), Lemma 8.3, and (8.1), (8.6), (8.18), (8.19), (8.25) (8.26). Indeed, using r, α < 1 and 2 pb ≥ pa + 1, we get ∗ |λ λ |P00 s−2σ |G 00 |

s− 3σ 2

+|G 02 |λ

+ |G 10 |λ

s− 3σ 2

s− 3σ 2

|y00 |λs−2σ + |G 01 |λ

s− 3σ 2

|w00 |λs−2σ

(|w00 |λs−2σ )2

" | + |T P ||F " | + |F ||T P | |F10 |λs−σ |TK P00 01 K 01 K 10 00 ! 1−2β 2 p −1 " " r α a (a + b2 ) + |TK P10 ||F10 | + |TK P10 ||F10 | σ

" |+|T P ||F " |+|T P ||F |+|T P ||F |! σ 1−2β r α pb −1 b + |F10 ||TK P01 K 10 01 K 01 02 K 02 01 ! 2−4β 2 2 p −2 2 " " r α b b + |F10 ||TK P02 | + |TK P10 ||F02 | + |TK P02 ||F02 | σ ) α −1 σ 2−6β |P00 |λs |P10 | + |P01 |2 + |P10 |2 r 2 α pa −1 (a + b2 ) * +(|P10 | + |P02 |)|P01 |(1 + r α pb −1 b) + |P02 |2 r 2 α 2 pb −2 b2 ) α −1 σ 2−6β r 2 α pa +1 ac + r 2 α pb b2 + r 2 α pa +1 (a + b2 )c2 * +r 2 α 2 pb (c + d)b2 + r 2 α 2 pb −2 b2 d 2 σ 2−6β r 2 α pa ac,

where in the second term of the chain of inequalities all the norms are | · |λs−σ , in the third term all the norms are | · |λs , and we used Cauchy inequalities. Next ∗ λ |P01 |s−2σ |G 01 |λs− 3σ + |G 10 |λs− 3σ |y01 |λs−2σ + |G 02 |λs− 3σ |w00 |λs−2σ 2

2

2

" " |F10 |λs−σ |TK P01 | + |TK P10 ||F01 | + |F02 ||TK P01 | + |TK P02 ||F01 | " " " + |TK P10 ||F10 | + |TK P10 ||F10 | + |F10 ||TK P02 |

" +|TK P10 ||F02 | + |TK P02 ||F02 | × σ 1−2β r α pb −1 b σ 1−4β r α pb [b(c + d) + bc2 + bd(c + d)] σ 1−4β r α pb b(c + d), where in the second line all the norms are | · |λs−σ . Moreover ∗ λ " λ " λ |P10 |s−2σ |G 10 |λs− 3σ |F10 |λs−σ |TK P10 |s−σ + |F10 |s−σ |TK P10 |λs−σ σ −2β αc2 . 2

Finally ∗ λ |P02 |s−2σ |G 10 |λs− 3σ |y02 |λs−2σ + |G 02 |λs− 3σ

! " " |F10 |λs−σ |TK P10 | + |TK P10 ||F10 | σ 1−2β d 2

2

" " ||F10 | + |F02 ||TK P10 | + |TK P02 ||F02 | +|TK P02

σ 1−4β α(c2 d + cd + d 2 ) σ 1−4β α(c + d)d, where in the second line all the norms are | · |λs−σ .


Proof of Lemma 8.12. We only prove the estimate for ∂w3 P4 ◦ 00 where, for brevity, ∂w3 := ∂www . For all (x, y, w; ξ ) ∈ D(s, r − δr/2) × , since ∂w3 P4 (x, 0, 0; ξ ) = 0 (by definition of P4 ), we have ∂w3 P4 (x, y, w; ξ ) = ∂w3 P4 (x, y, w; ξ ) − ∂w3 P4 (x, 0, 0; ξ ) ≤ sup ∂w3 ∂ y P4 (x, t y, tw; ξ )|y| + sup ∂w4 P4 (x, t y, tw; ξ )wa, p 0≤t≤1

0≤t≤1

≤ ((δr )

−1

|y| + wa, p )

(· denote the operatorial norm) because, by Cauchy estimates, and the definition of , |∂w3 ∂ y P4 |s,(1− δ )r (δr )−1 |∂w2 ∂ y P4 |s,r (δr )−1 . 2

(9.70)

Then ∀|y| < (r − δr/2)2 , wa, p < r − δr/2, |∂w3 P4 (·, y, w; ·)|s , σ |∂w3 ∂x P4 (·, y, w; ·)|s−σ ((δr )−1 |y| + wa, p ) . (9.71) Then, since Lemma 8.9 implies |x00 |λs−2σ ≤ σ/16, |y00 | < (r − δr/2)2 , |w00 |s−2σ < r − δr/2, |∂w3 P4 ◦ 00 |s−2σ ≤

sup

x∈Tns , ζ ∈

|∂w3 P4 (x, y00 (x+ ; ξ ), w00 (x+ ; ξ ); ζ )|

|y00 |s−2σ + |w00 |s−2σ δr

.

With similar estimates |∂w3 P4 ◦ 00 |s,r λ−1 (|y00 |λs−2σ (δr )−1 + |w00 |λs−2σ ). lip

Proof of Lemma 8.13.. Let for simplicity + := . We have P˜01 = ∂w+ ((P − R) ◦ )|y+ =0,w+ =0 , (9.72) 1 = ∂ y+ ((P − R) ◦ )|y+ =0,w+ =0 , P˜02 = ∂w2 + w+ ((P − R) ◦ )|y+ =0,w+ =0 , 2 1 = ∂ y2+ w+ ((P − R) ◦ )|y+ =0,w+ =0 , P˜03 = ∂w3 + w+ w+ ((P − R) ◦ )|y+ =0,w+ =0 . 6

P˜00 = ((P − R) ◦ )|y+ =0,w+ =0 , P˜10 P˜11

For brevity we set | · | := | · |λs , | · |∗ := | · |λs−2σ . The Pi⊥j (x + ) := TK⊥ Pi j (x + + x00 (x+ )), 0 ≤ 2i + j ≤ 2, satisfy, since |x00 |λs−2σ ≤ δσ/16 (by Lemma 8.9), (8.1)

(8.6)

|Pi⊥j |∗ ≤ |TK⊥ Pi j |s−σ K n e−K σ |Pi j | .

(9.73)

All the following estimates are a consequence of (9.72), the definition of P4 in (8.28), Lemmata 8.12 and 8.9, (8.25), (8.26), (9.73) and 2 pb ≥ pa +1. Setting Q := P4 +TK⊥ P≤2



we have | P˜00 |∗ |P11 ||y00 |∗ |w00 |∗ + |P03 ||w00 |3∗ + |Q ◦ 00 |∗

|P11 |σ 2−4β r 3 α pa + pb −2 (ab + b3 ) + |P03 |σ 3−3β r 3 α 3 pb −3 b3 + δ −1 |y00 |2∗ + δ −1 |y00 |∗ |w00 |2∗ + |w00 |4∗ ⊥ ⊥ ⊥ ⊥ |∗ + |P01 |∗ |w00 |∗ + |P10 |∗ |y00 |∗ + |P02 |∗ |w00 |2∗ +|P00

|P11 |σ 2−4β r 3 α pa + pb −2 (ab + b3 ) + |P03 |σ 3−3β r 3 α 3 pb −3 b3 +δ −1 σ 4−8β r 4 α 2 pa −2 (a + b2 )2 + α 4 pb −4 b4 +K n e−K σ σ 2−4β r 2 α pa a + b2 + c(a + b2 ) + db2 . Next | P˜01 |∗ |P11 | (|y01 |∗ |w00 |∗ + |I + w01 |∗ |y00 |∗ ) + |P03 ||w00 |2∗ |I + w01 |∗ +|∂w+ (Q ◦ )|y+ =0,w+ =0 |∗ |P11 |σ 2−4β r 2 α pa −1 (a + b2 ) + |P03 |σ 2−4β r 2 α 2 pb −2 b2 + |(∂ y Q) ◦ 00 |∗ |y01 |∗ +|(∂w Q) ◦ 00 |∗ |I + w01 |∗ |P11 |σ 2−4β r 2 α pa −1 (a + b2 ) +|P03 |σ 2−4β r 2 α 2 pb −2 b2 + δ −1 |y00 |∗ + |w00 |2∗ |y01 |∗ + (δr )−1 |y00 |2∗ + δ −1 |y00 |∗ |w00 |∗ + |w00 |3∗ ⊥ ⊥ ⊥ |∗ |I + w01 |∗ + |P10 |∗ |y01 |∗ + |P02 |∗ |w00 |∗ |I + w01 |∗ +|P01

|P11 |σ 2−4β r 2 α pa −1 (a + b2 ) + |P03 |r 2 σ 2−4β α 2 pb −2 b2 +δ −1 σ 3−6β r 3 α pa + pb −2 (a + b2 )b + K n e−K σ σ 1−2β r α pb b . Moreover | P˜10 |∗ |P11 ||w00 |∗ |I + y10 |∗ + |∂ y+ (Q ◦ )|y+ =0,w+ =0 ⊥ |P11 |σ 1−2β r α pb −1 b + δ −1 |y00 |∗ + |w00 |2∗ + |P10 |∗ |I + y10 |∗ |P11 |σ 1−2β r α pb −1 b + δ −1 σ 2−4β r 2 α pa −1 (a + b2 ) + K n e−K σ αc . By (8.22) and (8.26) we have |y01 |∗ δr and then | P˜02 |∗ |P11 | (|y02 |∗ |w00 |∗ + |I + w01 |∗ |y01 |∗ ) + |P03 ||w00 |∗ |I + w01 |2∗ 2 +|∂w (Q ◦ )|y+ =0,w+ =0 |∗ + w+ 2 Q) ◦ | |y |2 (|P11 | + |P03 |)σ 2−4β r α pb −1 b + |(∂ yy 00 ∗ 01 ∗ 2 Q) ◦ | |I + w | |y | +|(∂ yw 00 ∗ 01 ∗ 01 ∗ 2 Q) ◦ | |I + w |2 +|(∂ y Q) ◦ 00 |∗ |y02 |∗ + |(∂ww 00 ∗ 01 ∗ 2−4β p −1 2 b (|P11 | + |P03 |)σ rα b + |y01 |∗ + (δr )−1 |y00 |∗ + |w00 |∗ |y01 |∗ + δ −1 |y00 |∗ + |w00 |2∗ |y02 |∗ + δ −1 |y00 |∗ + |w00 |2∗ ⊥ | |y | + |P ⊥ | |I + w |2 +|P10 ∗ 02 ∗ 01 ∗ 02 ∗


(|P11 | + |P03 |)σ 2−4β r α pb −1 b + δ −1 |y01 |2∗ + |y00 |∗ + |w00 |∗ |y01 |∗ + |w00 |2∗ ⊥ | |y | + |P ⊥ | +|P10 ∗ 02 ∗ 02 ∗

(|P11 | + |P03 |)σ 2−4β r α pb −1 b + δ −1 σ 2−4β r 2 α pa −1 (a + b2 ) + K n e−K σ σ 1−2β αd .

The estimates of | P˜11 − P11 |∗ and | P˜03 − P03 |∗ follow in the same way.

Proof of Lemma 8.16. Let γ0 := γ˜03 e−χ , where

1 j+1 j 4 5 j γ˜0 := inf κ − j−1 e(χ −1)χ , κ − j−1 e(2−χ )χ , κ − j−1 e(χ +1−χ )χ , 8 j≥0 3 j+2 κ − j−1 e(2−χ )χ . 4

Note that γ˜0 ≥ κ −c˜ ln(ln κ) for some c˜ = c(χ ˜ ) ≥ 1, since inf j≥1 κ − j eαχ ≥ κ −c¯ ln(ln κ) for some c¯ = c(χ ¯ , α) ≥ 1, (recall κ > ee ). By the choice of χ we have 0 < γ˜0 < 1. We claim that j

a j ≤ ε0 eχ

4 −χ j+4

, b j ≤ γ˜0−1 ε0 eχ

4 −χ j+2

, c j , d j ≤ γ˜0−2 ε0 eχ

4 −χ j

,

∀ 0≤ j ≤ ν . (9.74)

Note that (8.33) follows by (9.74) since γ˜0−2 eχ ≤ γ0−1 . We prove (9.74) by induction over j. The case j = 0 follows by a0 , b0 , c0 , d0 ≤ γ0 . Then we prove that (9.74) holds for j + 1. We have 4

a j+1 ≤ κ j+1 (a j c j + b2j + a 2j + K ∗n e−K ∗ 2 a j ) j

≤ e2χ ε02 κ j+1 (γ˜0−2 e−χ 4

≤ ε0 eχ

j+4 −χ j

+ γ˜0−2 e−2χ

j+2

+ e−2χ

j+4

) + ε0 κ j+1 K ∗n eχ

4 −χ j+4 −K

∗2

j

4 −χ j+5

since, ∀ j ≥ 0, 1 − j−1 (χ 4 +1−χ 5 )χ j 1 4 3 j+2 κ e , ε0 γ˜0−2 eχ ≤ γ˜0 ≤ κ − j−1 e(2−χ )χ , 8 8 1 − j−1 (2−χ )χ j+4 j+1 n 1+χ j+5 −χ j+4 −K ∗ 2 j ≤ γ˜0 ≤ κ e , κ K∗ e ≤ 1. 8

ε0 γ˜0−2 eχ ≤ γ˜0 ≤ 4

ε0 eχ

4

The first three estimates directly follow by the definition of γ˜0 . The last one holds since, by K ∗ ≥ 26 + 6 ln κ + 16n 2 , 1 + χ j+5 − χ j+4 − K ∗ 2 j ≤ χ j+5 − K ∗ 2 j ≤ −K ∗ 2 j−1 and6 ( j + 1) ln κ + n ln K ∗ − K ∗ 2 j−1 ≤ 0. We have b j+1 ≤ κ j+1 (a j + b2j + b j (c j + d j ) + K ∗n e−K ∗ 2 b j ) 4 j+4 4 j+2 4 j+2 j ≤ eχ ε0 κ j+1 e−χ + γ˜0−2 ε0 eχ −2χ + 2γ˜0−3 ε0 eχ −χ −χ j

+γ˜0−1 ε0 κ j+1 K ∗n eχ

4 −χ j+2 −K

∗2

j

≤ γ˜0−1 ε0 eχ

4 −χ j+3

6 This inequality holds for j = 0, 1, by K ≥ 26 + 6 ln κ + 16n 2 , while, for j ≥ 2, ( j + 1) ln κ + n ln K − ∗ ∗ K ∗ 2 j−1 ≤ ( j + 1) ln κ + n ln K ∗ − K ∗ ( j − 1) ≤ 3 ln κ + n ln K ∗ − K ∗ ≤ 0.


since, ∀ j ≥ 0, κ j+1 K ∗n e1+χ

j+3 −χ j+2 −K

∗2

j


≤ 1 and

1 − j−1 (χ −1)χ j+3 1 4 j+2 κ e , γ˜0−1 eχ ε0 ≤ γ˜0 ≤ κ − j−1 e(2−χ )χ , 8 8 1 4 2 3 j γ˜0−2 eχ ε0 ≤ γ˜0 ≤ κ − j−1 e(χ +1−χ )χ , 8

γ˜0 ≤

reasoning as above (note that χ 2 + 1 > χ 3 ). Finally c j+1 ≤ κ j+1 (a j + b j + c2j + K ∗n e−K ∗ 2 c j ) j

≤ eχ ε0 κ j+1 (e−χ 4

≤ γ˜0−2 ε0 eχ

j+4

+ γ˜0−1 e−χ

j+2

+ γ˜0−4 eχ ε0 e−2χ ) + γ˜0−2 ε0 κ j+1 K ∗n eχ 4

j

4 −χ j −K

∗2

4 −χ j+1

since, ∀ j ≥ 0, κ j+1 K ∗n e1+χ

j+1 −χ j −K

∗2

j

≤ 1, and

1 − j−1 (χ 3 −1)χ j+1 1 j+1 κ e , γ˜0 ≤ κ − j−1 e(χ −1)χ , 8 8 1 4 j γ˜0−2 eχ ε0 ≤ γ˜0 ≤ κ − j−1 e(2−χ )χ . 8

γ˜02 ≤ γ˜0 ≤

The estimate d j+1 ≤ γ˜0−2 ε0 eχ

4 −χ j+1

follows as well.
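The elementary inequality of footnote 6, $(j+1)\ln\kappa + n \ln K_* - K_* 2^{j-1} \leq 0$ with $K_* = 26 + 6\ln\kappa + 16n^2$, is used repeatedly in the proof above. The sketch below evaluates its left-hand side on a grid of sample values; the grid (values of $\kappa$, $n$ and $j$) is an arbitrary choice made for illustration, subject only to $\kappa > e^e$ as recalled in the proof, and a numerical check of this kind does not replace the argument in the footnote.

```python
import math


def k_star(kappa, n):
    """K_* = 26 + 6 ln(kappa) + 16 n^2, as in the definition of K_*."""
    return 26.0 + 6.0 * math.log(kappa) + 16.0 * n ** 2


def footnote_lhs(j, kappa, n):
    """Left-hand side (j+1) ln(kappa) + n ln(K_*) - K_* 2^(j-1) of footnote 6."""
    ks = k_star(kappa, n)
    return (j + 1) * math.log(kappa) + n * math.log(ks) - ks * 2.0 ** (j - 1)


# Illustrative sample grid (an assumption of this sketch).
kappas = [math.exp(math.e) * 1.001, 1e2, 1e6, 1e50, 1e300]
dims = [1, 2, 5, 20, 100]
worst = max(footnote_lhs(j, kappa, n)
            for j in range(40) for kappa in kappas for n in dims)
print("largest left-hand side on the grid:", worst)
```

On this grid the largest value is negative, consistent with the footnote: the term $K_* 2^{j-1}$ dominates both $(j+1)\ln\kappa$ and $n \ln K_*$.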

References

1. Bambusi, D.: Birkhoff normal form for some nonlinear PDEs. Commun. Math. Phys. 234(2), 253–285 (2003)
2. Bambusi, D., Berti, M.: A Birkhoff-Lewis type theorem for some Hamiltonian PDE’s. SIAM J. Math. Anal. 37(1), 83–102 (2005)
3. Bambusi, D., Berti, M., Magistrelli, E.: Degenerate KAM theory for partial differential equations. J. Diff. Eqs. 250(8), 3379–3397 (2011)
4. Bambusi, D., Grébert, B.: Birkhoff normal form for partial differential equations with tame modulus. Duke Math. J. 135(3), 507–567 (2005)
5. Berti, M., Bolle, P.: Cantor families of periodic solutions for completely resonant nonlinear wave equations. Duke Math. J. 134(2), 359–419 (2006)
6. Berti, M., Bolle, P., Procesi, M.: An abstract Nash-Moser Theorem with parameters and applications to PDEs. Ann. Inst. H. Poincaré, Anal. Nonlin. 27, 377–399 (2010)
7. Biasco, L., Di Gregorio, L.: A Birkhoff-Lewis type Theorem for the nonlinear wave equation. Arch. Rat. Mech. 196(1), 303–362 (2010)
8. Bobenko, A., Kuksin, S.: The nonlinear Klein-Gordon equation on an interval as a perturbed sine-Gordon equation. Comment. Math. Helv. 70(1), 63–112 (1995)
9. Bourgain, J.: On Melnikov’s persistency problem. Math. Res. Lett. 4, 445–458 (1997)
10. Bourgain, J.: Quasi-periodic solutions of Hamiltonian perturbations of 2D linear Schrödinger equations. Ann. of Math. 148, 363–439 (1998)
11. Bourgain, J.: Green’s function estimates for lattice Schrödinger operators and applications. Annals of Mathematics Studies 158, Princeton, NJ: Princeton University Press, 2005
12. Chierchia, L.: A direct method for the construction of solutions of the Hamilton-Jacobi equation. Meccanica 25, 246–252 (1990)
13. Chierchia, L., Gallavotti, G.: Smooth prime integrals for quasi-integrable Hamiltonian systems. Il Nuovo Cimento 67 B, 277–295 (1982)
14. Craig, W.: Problèmes de petits diviseurs dans les équations aux dérivées partielles. Panoramas et Synthèses 9, Paris: Société Math de France, 2000
15. Craig, W., Wayne, C.E.: Newton’s method and periodic solutions of nonlinear wave equation. Comm. Pure Appl. Math. 46, 1409–1498 (1993)
16. Eliasson, L.H.: Perturbations of stable invariant tori for Hamiltonian systems. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 15(1), 115–147 (1988)
17. Eliasson, L.H., Kuksin, S.: KAM for non-linear Schrödinger equation. Ann. of Math. 172, 371–435 (2010)
18. Geng, J., Ren, X.: Lower dimensional invariant tori with prescribed frequency for nonlinear wave equation. J. Diff. Eq. 249(11), 2796–2821 (2010)
19. Gentile, G., Procesi, M.: Periodic solutions for a class of nonlinear partial differential equations in higher dimension. Commun. Math. Phys. 289(3), 863–906 (2009)
20. Giorgilli, A., Morbidelli, A.: Superexponential Stability of KAM Tori. J. Stat. Phys. 78(5/6), 1607–1617 (1995)
21. Jorba, A., Villanueva, J.: On the normal behaviour of partially elliptic lower-dimensional tori of Hamiltonian systems. Nonlinearity 10(4), 783–822 (1997)
22. Kappeler, T., Pöschel, J.: KAM and KdV. Berlin-Heidelberg-New York: Springer, 2003
23. Kuksin, S.: Hamiltonian perturbations of infinite-dimensional linear systems with imaginary spectrum. Funkt. Anal. i Pril. 21(3), 22–37, 95 (1987)
24. Kuksin, S.: Analysis of Hamiltonian PDEs. Oxford Lecture Series in Mathematics and its Applications, 19. Oxford: Oxford University Press, 2000
25. Kuksin, S., Pöschel, J.: Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Ann. of Math, 2, 143(1), 149–179 (1996)
26. Pöschel, J.: A KAM-Theorem for some nonlinear PDEs. Ann. Scuola Norm. Sup. Pisa, Cl. Sci. 23, 119–148 (1996)
27. Pöschel, J.: Quasi-periodic solutions for a nonlinear wave equation. Comment. Math. Helv. 71(2), 269–296 (1996)
28. Pöschel, J.: On the construction of almost periodic solutions for a nonlinear Schrödinger equation. Ergod. Th. and Dyn. Sys. 22, 1537–1549 (2002)
29. Rüssmann, H.: Invariant tori in non-degenerate nearly integrable Hamiltonian systems. Reg. Chaotic Dyn. 6(2), 119–204 (2001)
30. Wayne, E.: Periodic and quasi-periodic solutions of nonlinear wave equations via KAM theory. Comm. Math. Phys. 127, 479–528 (1990)

Communicated by G. Gallavotti




‘Return to Equilibrium’ for Weakly Coupled Quantum Systems: A Simple Polymer Expansion

W. De Roeck (1), A. Kupiainen (2)

(1) Institut für Theoretische Physik, Universität Heidelberg, Philosophenweg 19, D-69120 Heidelberg, Germany. E-mail: [email protected]
(2) Department of Mathematics, University of Helsinki, P. O. Box 68, Helsinki, FIN-00014, Finland. E-mail: [email protected]

Received: 10 May 2010 / Accepted: 5 October 2010
Published online: 22 May 2011 – © Springer-Verlag 2011

Abstract: Recently, several authors studied small quantum systems weakly coupled to free boson or fermion fields at positive temperature. All the rigorous approaches we are aware of employ complex deformations of Liouvillians or Mourre theory (the infinitesimal version of the former). We present an approach based on polymer expansions of statistical mechanics. Despite the fact that our approach is elementary, our results are slightly sharper than those contained in the literature up to now. We show that, whenever the small quantum system is known to admit a Markov approximation (Pauli master equation aka Lindblad equation) in the weak coupling limit, and the Markov approximation is exponentially mixing, then the weakly coupled system approaches a unique invariant state that is perturbatively close to its Markov approximation. 1. Introduction 1.1. Motivation. Quantum systems consisting of a small subsystem (say, an atom) and a large component (say, a heat bath) have received a lot of attention lately, sparked by the elegant results of [21,23,26]. The challenge in this problem is to prove that the subsystem thermalizes under influence of the heat bath; this property will be called ‘return to equilibrium’, or simply ‘RTE’ hereafter. The quoted works show that for typical such systems (in a precisely defined sense of typicality) the subsystem is close to equilibrium for most times. However, if one wants to study in more detail the subsystem dynamics one needs to resort to concrete models. Indeed, such subsystem-reservoir models have been successfully and rigorously studied since the late 90’s (we refer to [2,17] for early results in this field) under the assumptions 1) that the heat bath consists of a free (and hence explicitly solvable) field, 2) that the coupling between subsystem and heat bath is small compared to the energy scales of the subsystem. 
Nevertheless, these results either impose rather strong assumptions on the form of the system-reservoir coupling, or they become quite involved technically. The aim of the

798

W. De Roeck, A. Kupiainen

present paper is to develop an intuitive and simple approach for RTE in this case. To explain our result, let us first recall that these systems were already studied in the 70’s from the point of view of quantum master equations; B. Davies [6] pioneered the rigorous derivation of master equations in this framework, thus making precise earlier heuristic ideas of I. Prigogine and P. Van Hove. The master equation is derived, under mild conditions on the form of the coupling, by scaling time t as t ∼ λ−2 (λ is the coupling strength) and taking λ → 0. It exhibits all irreversible phenomena expected in such model systems and as such, it has inspired many researchers in open quantum systems. However, it does of course not yield information on the long time (longer than λ−2 ) behaviour of the system, We prove that, if the condition necessary for the derivation of the master equation is satisfied and the master equation is exponentially ergodic (exhibits exponentially fast return to equilibrium), then the system thermalizes in the long time limit, for small but nonzero coupling strength λ (more generally, it reaches a steady state, since we do not assume that the heat bath(s) is(are) in equilibrium). Moreover, we give an explicit bound on the speed of convergence towards the steady state. The necessary condition is the integrability in time of certain correlation functions of the free field. To our best knowledge, this condition is weaker than that of other RTE- results in the literature. 1.2. Setup. Let HS be a finite-dimensional Hilbert space (modeling the small system) with a Hamiltonian HS (a Hermitian matrix). To describe the field that plays the role of reservoir, we first pick a finite, discrete hypercube = L = Zd ∩ (−L , L]d for some L ∈ N and we enclose the field in the volume (we could as well choose to be a box in Rd ). Since we will mainly use the Fourier transform, we define the set of (quasi-)momenta ∗ = (π Z/L)d ∩ (−π, π ]d . 
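The dynamics of one reservoir excitation is given by the one-particle dispersion relation $\omega^\Lambda(q)$ and the Hamiltonian of the whole field in $\Lambda$ is given by

$$H_R^\Lambda := \sum_{q \in \Lambda^*} \omega^\Lambda(q)\, a_q^* a_q \tag{1.1}$$

acting on the bosonic (symmetric) Fock space $\mathcal{H}_R^\Lambda = \Gamma(l^2(\Lambda^*))$. Here $a_q^*, a_q$ are the creation/annihilation operators of a mode with (quasi-)momentum $q \in \Lambda^*$, satisfying the canonical commutation relations $[a_q, a_{q'}^*] = \delta_{q,q'}$. The Hilbert space of the total system, consisting of small system and field, is $\mathcal{H}^\Lambda = \mathcal{H}_S \otimes \mathcal{H}_R^\Lambda$, and we simply write $H_S$ and $H_R^\Lambda$ for the operators $H_S \otimes 1$ and $1 \otimes H_R^\Lambda$ acting on $\mathcal{H}^\Lambda$. The coupling between the field and the small system is assumed to be linear in the creation and annihilation operators and it can hence be written in the form

$$H_{Int}^\Lambda := \sum_{i \in I} D_i \otimes \Phi(\phi_i^\Lambda), \qquad \Phi(\phi_i^\Lambda) := \sum_{q \in \Lambda^*} \left( \phi_i^\Lambda(q)\, a_q^* + \overline{\phi_i^\Lambda(q)}\, a_q \right), \tag{1.2}$$

where $D_i = D_i^*$ are self-adjoint operators on $\mathcal{H}_S$ and $\phi_i^\Lambda$ are functions (form factors) to be specified. $I$ is a finite index set. The total Hamiltonian of the system is hence, with a coupling strength $\lambda \in \mathbb{R}$,

$$H_\lambda^\Lambda := H_S + H_R^\Lambda + \lambda H_{Int}^\Lambda \quad \text{on } \mathcal{H}^\Lambda. \tag{1.3}$$

A standard application of the Kato-Rellich theorem states that, if $(\omega^\Lambda)^{-1/2} \phi_i^\Lambda \in l^2(\Lambda^*)$, then $H_\lambda^\Lambda$ is self-adjoint on the domain of $H_R^\Lambda$.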
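As a small sanity check on the setup above, the following sketch enumerates the finite lattice $\mathbb{Z}^d \cap (-L, L]^d$ and the momentum set $(\pi\mathbb{Z}/L)^d \cap (-\pi, \pi]^d$ and confirms that they have the same number of points, $(2L)^d$. The function names `lattice` and `momenta` are hypothetical helpers introduced only for this illustration; momenta are stored as exact rational multiples of $\pi$.

```python
from fractions import Fraction
from itertools import product


def lattice(L, d):
    """Points of Z^d intersected with (-L, L]^d, i.e. coordinates -L+1, ..., L."""
    return list(product(range(-L + 1, L + 1), repeat=d))


def momenta(L, d):
    """Quasi-momenta (pi Z / L)^d in (-pi, pi]^d, in units of pi.

    A coordinate pi*k/L lies in (-pi, pi] exactly when -L < k <= L,
    so the index range is the same as for the position lattice.
    """
    return [tuple(Fraction(k, L) for k in ks) for ks in lattice(L, d)]


if __name__ == "__main__":
    L, d = 3, 2
    print(len(lattice(L, d)), len(momenta(L, d)))  # both equal (2L)^d
```

The equal cardinality is what makes the discrete Fourier transform between $l^2(\Lambda)$ and $l^2(\Lambda^*)$ a bijection in finite volume.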

‘Return to Equilibrium’ for Weakly Coupled Quantum Systems


Initially, the field is in a Gaussian state that will be called the ‘reference state’: the density matrix $\rho_R^{ref,\Lambda}$ of this reference state is specified by the correlation functions of smeared field operators, i.e. by

$$\mathrm{Tr}_R\left[\rho_R^{ref,\Lambda}\, \Phi(\varphi_1) \cdots \Phi(\varphi_n)\right], \tag{1.4}$$

where $\varphi_i$ are functions on $\Lambda^*$ and $\mathrm{Tr}_R$ is the trace on $\mathcal{H}_R^\Lambda$. Our main assumptions on this reference state $\rho_R^{ref,\Lambda}$ are that it is

1) stationary w.r.t. the decoupled dynamics, i.e.

$$e^{-itH_R^\Lambda}\, \rho_R^{ref,\Lambda}\, e^{itH_R^\Lambda} = \rho_R^{ref,\Lambda}, \tag{1.5}$$

2) gauge-invariant, i.e. all correlation functions (1.4) that involve an odd number of field operators are zero,

3) Gaussian (also called “quasifree”), i.e. the higher correlation functions are related to the two-particle correlation function via the Gaussian relation

$$\mathrm{Tr}_R\left[\rho_R^{ref,\Lambda}\, \Phi(\varphi_1) \cdots \Phi(\varphi_n)\right] = \sum_{\text{pairings } \pi}\ \prod_{(i,j) \in \pi} \mathrm{Tr}_R\left[\rho_R^{ref,\Lambda}\, \Phi(\varphi_i) \Phi(\varphi_j)\right], \tag{1.6}$$

where a ‘pairing’ $\pi$ is a set of pairs $(i, j)$ with the convention $i < j$. By standard theory of Gaussian (or quasi-free) states, the above properties imply that the state $\rho_R^{ref,\Lambda}$ is completely determined by a positive density function $0 < \eta^\Lambda(q) < \infty$ via the relations

$$\mathrm{Tr}_R\left[\rho_R^{ref,\Lambda}\, \Phi(\varphi) \Phi(\varphi')\right] = \langle \varphi', \eta^\Lambda \varphi \rangle + \langle \varphi, (1 + \eta^\Lambda) \varphi' \rangle \tag{1.7}$$

with $\langle \cdot, \cdot \rangle$ the scalar product in $l^2(\Lambda^*)$. The invariance of $\rho_R^{ref,\Lambda}$ under the free field dynamics is ensured by the commutation relation $[\omega^\Lambda, \eta^\Lambda] = 0$. Since the field is a finite collection of harmonic oscillators, the reference state $\rho_R^{ref,\Lambda}$ is a well-defined trace-class density matrix. The material in this section is completely standard and we refer the reader to e.g. [4,7] for details that were omitted here (note however that these texts deal with infinite volume from the start and hence they are necessarily more involved technically).

1.3. Thermodynamic limit. As long as $\Lambda$ is finite, we cannot expect the system to have good ergodic properties and hence we will perform the thermodynamic limit as a first step. By the thermodynamic limit $\Lambda \nearrow \mathbb{Z}^d$, we mean that $L \nearrow \infty$, hence the volume $\Lambda$ tends to $\mathbb{Z}^d$ and the set $\Lambda^*$ tends to $\mathbb{T}^d$. As we will see, the influence of the reservoir on the dynamics of the subsystem can be expressed entirely in terms of the correlation functions

$$f_{i,j}^\Lambda(t) := \mathrm{Tr}_R\left[\rho_R^{ref,\Lambda}\, \Phi(e^{it\omega^\Lambda} \phi_i^\Lambda)\, \Phi(\phi_j^\Lambda)\right] \tag{1.8}$$

$$\phantom{f_{i,j}^\Lambda(t) :}= \langle \phi_j^\Lambda, \eta^\Lambda e^{it\omega^\Lambda} \phi_i^\Lambda \rangle + \langle \phi_i^\Lambda, (1 + \eta^\Lambda) e^{-it\omega^\Lambda} \phi_j^\Lambda \rangle \tag{1.9}$$

with $\phi_i^\Lambda$ as in (1.2). Note that $f_{i,j}^\Lambda(t) = \overline{f_{j,i}^\Lambda(-t)}$ by stationarity of the state $\rho_R^{ref,\Lambda}$. To discuss the thermodynamic limit of the small system behaviour, it suffices to ask that the $f_{i,j}^\Lambda(t)$ converge,

$$f_{i,j}(t) = \lim_{\Lambda \nearrow \mathbb{Z}^d} f_{i,j}^\Lambda(t), \tag{1.10}$$

uniformly in $t$ on any compact set, and that $\sup_t |f_{i,j}(t)| < \infty$. We recall that a density matrix on $\mathcal{H}_S$ is a positive trace-class operator, i.e., belonging to $B_1(\mathcal{H}_S)$, whose trace is 1 (of course, since $\mathcal{H}_S$ is finite-dimensional, any operator is trace-class). In what follows, we let $\mathrm{Tr}_R$ stand for the partial trace over the field degrees of freedom, mapping density matrices on $\mathcal{H}^\Lambda$ into density matrices on $\mathcal{H}_S$.

Lemma 1.1. Assume that the $f_{i,j}^\Lambda(t)$ converge to bounded functions $f_{i,j}(t)$, uniformly on compacts (i.e. (1.10)). Then, the thermodynamic limit

$$\rho_{S,t} := \lim_{\Lambda \nearrow \mathbb{Z}^d} \mathrm{Tr}_R\left[e^{-itH_\lambda^\Lambda} \left(\rho_{S,0} \otimes \rho_R^{ref,\Lambda}\right) e^{itH_\lambda^\Lambda}\right] \tag{1.11}$$

exists for any initial density matrix $\rho_{S,0}$ on $\mathcal{H}_S$. The proof of Lemma 1.1 is given in Sect. 3.1.2.

1.4. Markov approximation. The Markov approximation to the model introduced above amounts to replacing the correlation function $f_{i,j}(t)$ by a multiple of $\delta(t)$ (no memory). It can be justified in the weak coupling scaling limit $\lambda \to 0$, if one rescales time as $t \to \lambda^{-2} t$. We state this important result precisely in Sect. 3.3. For now, we just introduce the precise form of the Markov approximation since one of our assumptions refers to it. First, we introduce the left and right multiplication operators $M_{Le}(A)$, $M_{Ri}(A)$:

$$M_{Le}(A) S := A S, \qquad M_{Ri}(A) S := S A^*, \qquad A, S \in B(\mathcal{H}_S). \tag{1.12}$$

Then we set

$$\tilde{L} := \sum_{k_1, k_2 \in \{Le, Ri\}}\ \sum_{i,j \in I} \int_0^\infty dt\ e^{it\,\mathrm{ad}(H_S)}\, M_{k_2}(iD_j)\, e^{-it\,\mathrm{ad}(H_S)}\, M_{k_1}(iD_i) \times \begin{cases} f_{j,i}(t) & \text{if } k_1 = Le, \\ \overline{f_{j,i}(t)} & \text{if } k_1 = Ri, \end{cases} \tag{1.13}$$

where $\mathrm{ad}(H_S) = [H_S, \cdot]$ and the integral over $t$ is well-defined by integrability of $f_{i,j}(\cdot)$, which will be assumed below. Finally

$$L := \lim_{t \to \infty} \frac{1}{t} \int_0^t ds\ e^{is\,\mathrm{ad}(H_S)}\, \tilde{L}\, e^{-is\,\mathrm{ad}(H_S)}, \tag{1.14}$$

where the limit $t \to \infty$ exists since $H_S$ has discrete spectrum. Note also that $L$ commutes with $\mathrm{ad}(H_S)$, as follows from the spectral averaging in (1.14). As is discussed in many places, the Lindblad operator $L$ generates a contractive semigroup $e^{\mathfrak{t} L}$, $\mathfrak{t} \geq 0$, on $B_1(\mathcal{H}_S)$ that is trace-preserving and positivity preserving. In other words, $e^{\mathfrak{t} L}$ maps the set of density matrices on $\mathcal{H}_S$ into itself. In the above formulas, we denote time by the gothic symbol $\mathfrak{t}$ to emphasize that it corresponds physically to a rescaled time. Indeed, the Lindblad operator $L$ describes the dynamics on long time scales, see Sect. 3.3. Lindblad operators were first introduced in [22]; an excellent exposition on the properties of $L$ and its derivation from microscopic models can be found in [20].



1.5. Result. We need an assumption on the decay of temporal correlations of the ‘free reservoir correlation functions’.

Assumption 1.2 (Decay of correlations). Recall the correlation functions $f_{i,j}$ introduced in (1.10). We assume that

$$\int_0^\infty dt\ h(t) < \infty, \qquad \text{where } h(t) := \sum_{i,j \in I} \|D_i\| \|D_j\|\, |f_{i,j}(t)|. \tag{1.15}$$

The second assumption concerns the Lindblad generator $L$, defined in Sect. 1.4.

Assumption 1.3 (Fermi Golden Rule). The operator $L$ has a simple eigenvalue at 0. All other eigenvalues lie in the region $\{z \in \mathbb{C} : \mathrm{Re}\, z < -\mathrm{gap}_L\}$ for some $\mathrm{gap}_L > 0$.

Obviously, Assumption 1.3 and the fact that $e^{\mathfrak{t} L}$ preserves density matrices imply that there is a unique density matrix, $\rho_S^L$, such that $L \rho_S^L = 0$ and

$$\left\| e^{\mathfrak{t} L} - |\rho_S^L\rangle\langle 1| \right\| \leq C_L\, e^{-\mathrm{gap}_L \mathfrak{t}}, \qquad \text{for all } \mathfrak{t} > 0 \text{ and some } C_L < \infty, \tag{1.16}$$

where $\|\cdot\|$ is the operator norm of operators acting on $B_1(\mathcal{H}_S)$ and we use the notation $|A\rangle\langle A'|$ to denote the rank-1 operator that acts as $S \mapsto (\mathrm{Tr}[(A')^* S])\, A$ with $S, A \in B_1(\mathcal{H}_S)$ and $A' \in B(\mathcal{H}_S)$. For us it is more convenient to define a characteristic time $t_L > 1/\mathrm{gap}_L$ such that

$$\left\| e^{\mathfrak{t} L} - |\rho_S^L\rangle\langle 1| \right\| \leq e^{-\mathfrak{t}/t_L}, \qquad \text{for } \mathfrak{t} > t_L. \tag{1.17}$$

Conditions that imply Assumption 1.3 have been discussed extensively, see e.g. [12, 28]. Here, we prefer to give a (rather generic) example where Assumption 1.3 can be checked very explicitly: Assume that the Hamiltonian $H_S$ is non-degenerate, hence its spectral projections, $P(e)$, $e \in \mathrm{sp} H_S$, are one-dimensional. Then Assumption 1.3 is satisfied if and only if the continuous-time Markov process [1] with (finite) state space $\mathrm{sp} H_S$ and jump rates

$$\mathrm{rate}(e \to e') = \sum_{i,j} \hat{f}_{i,j}(e - e')\, \mathrm{Tr}[P(e') D_j P(e) D_i], \qquad \text{where } \hat{f}_{i,j}(\omega) = \frac{1}{2\pi} \int_{\mathbb{R}} dt\ e^{-it\omega} f_{i,j}(t), \tag{1.18}$$

is ergodic. This in turn can be checked by the Perron-Frobenius theorem: a sufficient condition for ergodicity is that for any two eigenvalues $e, e'$, there is a path $e_0, e_1, \ldots, e_n$ with $e_0 = e$, $e_n = e'$ such that, for all $i$, $\mathrm{rate}(e_i \to e_{i+1}) \neq 0$. We are now ready to state our main result.

[1] Since $L$ commutes with $\mathrm{ad}(H_S)$ and preserves positive density matrices, it sends the set of density matrices diagonal in the $H_S$-basis into itself. Since these diagonal density matrices can be identified with probability measures on $\mathrm{sp} H_S$, $e^{\mathfrak{t} L}$ determines a Markov process on $\mathrm{sp} H_S$, namely the one defined by the rates (1.18).

Theorem 1.4. Assume that Assumption 1.2 and Assumption 1.3 hold and let $\rho_{S,t}$ be defined as in Lemma 1.1. Then, there is a $\lambda_0 > 0$ such that for all $\lambda$ satisfying $0 < |\lambda| < \lambda_0$, we have

$$\lim_{t \to \infty} \rho_{S,t} = \rho_S^{inv}, \tag{1.19}$$



where the “invariant density matrix” $\rho_S^{inv} = \rho_S^{inv}(\lambda)$ does not depend on the initial state $\rho_{S,0}$. Moreover, $\rho_S^{inv}$ is a small perturbation of $\rho_S^L$, the invariant density matrix predicted by the Markov approximation:

$$\|\rho_S^{inv} - \rho_S^L\| \to 0 \quad \text{as } \lambda \to 0. \tag{1.20}$$
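The sufficient condition for ergodicity given before Theorem 1.4 (every pair of eigenvalues is joined by a path of nonzero jump rates) is a graph reachability statement and can be checked mechanically. The following is a minimal sketch under the assumption that the rates (1.18) have already been computed and are supplied as a plain matrix `rate[a][b]`, with states labelled by integers; the function name `is_irreducible` and the example matrices are hypothetical and not taken from the paper.

```python
def is_irreducible(rate):
    """Return True if every state reaches every other state via nonzero rates.

    rate[a][b] is the jump rate from state a to state b; the diagonal is ignored.
    """
    n = len(rate)

    def reachable(start):
        # Depth-first search along edges with strictly positive rate.
        seen = {start}
        stack = [start]
        while stack:
            a = stack.pop()
            for b in range(n):
                if b != a and rate[a][b] > 0 and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen

    return all(len(reachable(a)) == n for a in range(n))


if __name__ == "__main__":
    cycle = [[0, 1.0, 0], [0, 0, 2.0], [0.5, 0, 0]]   # 0 -> 1 -> 2 -> 0
    trap = [[0, 1.0, 0], [0, 0, 2.0], [0, 0, 0]]      # state 2 is absorbing
    print(is_irreducible(cycle), is_irreducible(trap))
```

For a finite state space this path condition is exactly irreducibility of the jump process; the spectral gap $\mathrm{gap}_L$ itself is not computed here.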

In (1.20), $\|\cdot\|$ is the operator norm of operators acting on $B_1(\mathcal{H}_S)$ (although it does not matter since $\mathcal{H}_S$ is finite-dimensional). To quantify the speed of convergence towards the steady state $\rho_S^{inv}$, we need to know the decay properties of the function $h(\cdot)$ that was introduced in Assumption 1.2. Let $\zeta(\cdot)$ be a nondecreasing function on $\mathbb{R}^+$ satisfying the conditions

$$1 \leq \zeta(t + t') \leq \zeta(t)\, \zeta(t'), \qquad \text{for any } t, t' \in \mathbb{R}^+. \tag{1.21}$$

We assume that this function governs the decay of the bath correlation function $h$, in the sense that

$$\int_0^\infty dt\ h(t)\, \zeta(t) < \infty. \tag{1.22}$$

The case where $\zeta(t)$ can be chosen to be exponentially increasing is particularly simple but introduces a complication to the statement of the following result. Therefore we exclude this case explicitly by demanding

$$\int_0^\infty dt\ e^{-\kappa t}\, \zeta(t) < \infty, \qquad \text{for any } \kappa > 0. \tag{1.23}$$
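To make conditions (1.21) and (1.23) concrete, the sketch below checks them numerically for two candidate weight functions. The choices $\zeta(t) = (1+t)^\alpha$ and $\zeta(t) = e^{t}$, the grid, and the helper names are assumptions made for illustration only. The polynomial weight satisfies (1.21) because $(1+s)(1+t) \geq 1 + s + t$; the exponential weight also satisfies (1.21) but its damped version $e^{-\kappa t}\zeta(t)$ grows when $\kappa < 1$, which is why (1.23) excludes it.

```python
import math


def zeta_poly(t, alpha=1.5):
    """Polynomial weight (1 + t)^alpha, an illustrative admissible choice."""
    return (1.0 + t) ** alpha


def zeta_exp(t, rate=1.0):
    """Exponential weight e^{rate * t}; submultiplicative but excluded by (1.23)."""
    return math.exp(rate * t)


def satisfies_1_21(zeta, grid, tol=1e-12):
    """Check 1 <= zeta(s + t) <= zeta(s) * zeta(t) on all pairs of grid points."""
    for s in grid:
        for t in grid:
            value = zeta(s + t)
            if value < 1.0 or value > zeta(s) * zeta(t) * (1.0 + tol):
                return False
    return True


def damped(zeta, kappa, t):
    """Integrand of (1.23) at time t."""
    return math.exp(-kappa * t) * zeta(t)


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 3.0, 10.0]
    print(satisfies_1_21(zeta_poly, grid), satisfies_1_21(zeta_exp, grid))
    print(damped(zeta_poly, 0.1, 500.0), damped(zeta_exp, 0.5, 500.0))
```

A grid check is of course not a proof; it only illustrates which side of the subexponential condition each weight falls on.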

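Proposition 1.5. Assume the conditions of Theorem 1.4. Let $\zeta$ be a non-decreasing function as above, satisfying (1.21), (1.22) and (1.23), and let $t_L$ be chosen such that (1.17) holds. Then, for $|\lambda|$ small enough,

$$\|\rho_{S,t} - \rho_S^{inv}\| \leq \exp\left\{ -\frac{\lambda^2 t}{t_L + o(|\lambda|^0)} \right\} + o(|\lambda|^0) \left[ \zeta\!\left( \frac{\lambda^2 t}{2 t_L} \right) \right]^{-1}, \qquad \text{for any } t > t_L \lambda^{-2}. \tag{1.24}$$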

Note that Proposition 1.5 makes no claim about the reduced dynamics $\rho_{S,t}$ for short times $t < \lambda^{-2} t_L$. The restriction to long times is natural since, for times shorter than $\lambda^{-2} t_L$, the exponential decay of the semigroup is not yet visible. For those times, $\rho_{S,t}$ is however well-described by the Markov approximation, see Theorem 3.5. On the RHS of (1.24), the time $t$ appears essentially in the combination $\lambda^2 t / t_L$. As far as the first term is concerned, this is natural since that term originates from the Markov approximation, i.e. the temporal decay embodied in that term takes place on the macroscopic time scale $\sim \lambda^{-2} t_L$. The second term, however, comes from the slow decay of the reservoir correlation function $h(t)$ on the microscopic time scale, and as such it is not clear why that decay gets prolonged to the macroscopic scale in (1.24). The estimate in that second term is indeed far from optimal (note also the odd factor 2 multiplying $t_L$) and this is due to the generality of our result. If, for example, one assumes that $\zeta(t) \sim |t|^\alpha$, $\alpha > 0$, then one can state a sharper and more explicit bound.



1.6. Discussion and comparison with earlier results.

1.6.1. Restriction to confined systems. Our result is suited for confined small systems. We explain this in more detail and we distinguish essential assumptions from those made for convenience.

A: The assumption that the ‘atom’ Hilbert space $\mathcal{H}_S$ is finite-dimensional does not seem crucial to us. Atoms with an infinite number of energy levels (like the harmonic oscillator) should be treatable with the same technique. A complication that does arise in such infinite-dimensional atoms is that the relaxation of the Markov semigroup is in general not exponential since, in the absence of very energetic field quanta, the atom needs a long time to cascade from a very energetic level to the low-lying levels. We believe however that this can be remedied by a change of norm on (a subspace of) $\mathcal{H}_S$ that renders the relaxation exponential, at least for a certain class of interaction Hamiltonians.

B: The restriction to atom-bath couplings that are linear in the field operators is for notational simplicity only. One can study quadratic coupling in the same way. Coupling terms of higher order do not yield a well-defined Hamiltonian for bosonic baths, although they are well-defined for fermionic systems. In that case (fermionic baths with coupling of order at least 3) one has to use sign cancellations to control the Dyson expansion (this is done e.g. in [16]) and in such cases an operator-theoretic treatment might be favorable.

C: The real assumption that excludes application of our result to extended systems is Assumption 1.2 and, more concretely, the sum over $i, j \in I$. For an extended system, the simplest translation invariant coupling would be of the form

$$H_{Int} = \int_{\mathbb{T}^d} dq \left( \varphi(q)\, e^{iqX} a_q + \overline{\varphi(q)}\, e^{-iqX} a_q^* \right) \tag{1.25}$$

$$\phantom{H_{Int}} = \sum_{x \in \mathbb{Z}^d} |x\rangle\langle x| \otimes \Phi(\varphi_x), \qquad \text{with } \varphi_x(q) = e^{iqx} \varphi(q), \tag{1.26}$$

where we have taken $\Lambda = \mathbb{Z}^d$. The expression on the second line is of the form (??) with the index set $I = \mathbb{Z}^d$. Even though one could demand that the correlation functions are integrable in time in the sense that

$$\sup_{x,x'} \int_0^{+\infty} dt\ h_{x,x'}(t) < \infty, \tag{1.27}$$

with

$$h_{x,x'}(t) := \lim_{\Lambda \nearrow \mathbb{Z}^d} \left| \mathrm{Tr}_R\left[ \rho_R^{ref,\Lambda}\, \Phi(e^{it\omega} \varphi_x)\, \Phi(\varphi_{x'}) \right] \right|, \tag{1.28}$$

Assumption 1.2 still cannot hold because of the sum over $x, x' \in \mathbb{Z}^d$. In fact, the appearance of the double sum is artificial and one can arrange to have a single sum; moreover, $h$ depends on the difference $x - x'$ only. Hence, Assumption 1.2 would boil down to

$$\int_0^\infty dt \sum_x h_{0,x}(t) < \infty \tag{1.29}$$

and this assumption cannot be satisfied for any interaction Hamiltonian of the type (1.25).



1.6.2. Interacting reservoirs. Models where the heat bath is not free, i.e. it is made up of a genuinely interacting system, are a distant dream at this moment. However, we would like to draw attention to the fact that, in contrast to earlier results, our method does not exclude such reservoirs per se. Indeed, the important ingredient of our analysis is a temporal decay condition on the reservoir correlation function. This condition is stated in Assumption 2.2, and, for free reservoirs, it is satisfied provided that Assumption 1.2 holds. The huge challenge is of course to prove such a condition for an interacting system. First steps in this direction have recently been taken in [24].

1.6.3. Algebraic quantum dynamical systems. In the literature on the subject, mixing properties are mostly investigated in a more general framework, allowing for initial states that are not factorized (but still local perturbations of $\rho_R^{ref}$) and treating observables that depend on the field as well (since Theorem 1.4 deals with the reduced dynamics, we get information on observables of the small system only). In particular, one usually studies the system in the framework of $C^*$- or $W^*$-algebras, in which the concepts “ergodicity” and “mixing” have a natural meaning, inherited from the theory of dynamical systems. For an introduction to these matters, we refer to [4,7]. It is straightforward to extend our approach so as to prove mixing in the above sense, but since this asks for more notation in Sect. 3, we have opted not to do so. The same remark applies to the study of multitime correlation functions of small system observables. Our technique shows that these correlation functions are perturbatively close to correlation functions calculated within the Markovian approximation [2], see also [11].

A drawback of our technique with respect to the algebraic approach is that, in the case where $\rho_R^{ref}$ is a Gibbs state, it is not immediately clear that the invariant state $\rho_S^{inv}$ is the restriction of the coupled Gibbs state to the small system. However, if one extends the class of initial states as suggested above, this does immediately follow.

[2] Yet, they are qualitatively different, since in the Markovian model, correlations decay exponentially, whereas at finite $\lambda$, the speed of decay is in general not faster than the decay of the correlation functions $f_{i,j}(t)$.

1.6.4. Comparison with earlier results. One should distinguish between the case where the Gaussian reference state of the field has a non-zero density (temperature) in the thermodynamic limit, i.e. $\lim_{\Lambda \nearrow \mathbb{Z}^d} |\Lambda|^{-1} \sum_{q \in \Lambda^*} \eta^\Lambda(q) > 0$, or not. In the latter case, the field is essentially in the vacuum state and the approach to a steady state is related to the question whether the ground state of the coupled system (assuming that it exists) is the only bound state and whether the rest of the spectrum is absolutely continuous. These questions have been extensively studied in [1,3,14,15]. In one sense, our results are sharper than those quoted: they cover cases where the coupled system has no ground state, yet there is approach to a steady state for the small system. We do not explain nor develop these issues further here, but rather postpone them to a subsequent paper. However, the quoted results are stronger in the sense that they allow for the confined system to have continuous spectrum above an ionization threshold.

If the field has a positive density (the prime example is of course the case where the field is in a thermal state at non-zero temperature), then the only results that we are aware of rely on complex deformations of Liouvillians. One either uses complex translations or dilations. To streamline the discussion, we note that one can rewrite the correlation functions $f_{i,j}(t)$ as

$$f_{i,j}(t) = \int_{\mathbb{R}} d\omega\ e^{it\omega} \hat{f}_{i,j}(\omega), \qquad \hat{f}_{i,j}(\omega) := \langle \varphi_{i,eff}(\omega), \varphi_{j,eff}(\omega) \rangle_{S}, \tag{1.30}$$

where the $\varphi_{i,eff}$ are functions from $\mathbb{R}$ to some Hilbert space $S$ that emerge naturally if one follows the operator-theoretic approach to the problem. They are often called “effective form factors” (effective because they incorporate the density function $\eta$ of the reservoir). In the physical literature on the subject, the function $\hat{f}_{i,j}(\omega)$ is often called the ‘spectral function’. The first result on RTE, due to [17,18], proceeds by assuming that

• The function $\omega \mapsto \hat{f}_{i,j}(\omega)$ is analytic in a strip of width $\gamma_0$ such that $\omega \mapsto \hat{f}_{i,j}(\omega + i\gamma)$ is in $L^1(\mathbb{R}, d\omega)$ for $0 < \gamma < \gamma_0$.

This of course corresponds to exponential decay of $f_{i,j}(t)$. This result has been improved in [8,9], where analyticity is replaced by demanding that $\hat{f}_{i,j}(\cdot)$ is in $C^2$, implying $f_{i,j}(t) \sim |t|^{-2}$. A related approach is found in [13]. The approach via dilation analyticity has been pioneered by [2]. There one assumes that

• the function $\omega \mapsto \hat{f}_{i,j}(e^{i(\mathrm{sign}\,\omega)\gamma} \omega)$ is in $L^1(\mathbb{R}, d\omega)$ for $0 < \gamma < \gamma_0$ (this is dilation analyticity),
• for small $\omega$, $|\hat{f}_{i,j}(e^{i(\mathrm{sign}\,\omega)\gamma} \omega)| \leq \mathrm{const}\, |\omega|^{1+\alpha}$, for some $\alpha > 0$.

By deforming the integration contour $\mathbb{R}$ in (1.30) into $e^{-i\gamma} \mathbb{R}^- \cup e^{i\gamma} \mathbb{R}^+$, one realizes that this implies that

$$|f_{i,j}(t)| \leq \mathrm{const}\ t^{-(2+\alpha)} (\log t)^{\mathrm{const}}, \tag{1.31}$$

and hence this case is covered by our result.

1.7. Strategy of the proof. Our proof is based on a polymer expansion in real time. In the context of classical stochastic dynamics, such expansions were successfully applied in e.g. [5,25], and in the case of classical deterministic dynamics in [19]. For the case at hand, a similar strategy was pursued in [27]. In the following Sects. 1.7.1, 1.7.2 and 1.7.3, we introduce the rough ideas.

1.7.1. Markovian approximation and leading dynamics. We discretize time as $t = N\nu$, where $\nu$ is a macroscopic time unit proportional to $\lambda^{-2}$, with a $\lambda$-independent proportionality constant that could actually be chosen equal to 1. Then, we write $\rho_{S,t} = Z_N \rho_{S,0}$, where $\rho_{S,t}$ is the reduced time-evolved density matrix and $Z_N$ could be called the ‘reduced evolution operator’. The idea is that $T \equiv Z_{N=1}$ can be analyzed quite well, at least for sufficiently small coupling $\lambda$, because in that regime the Markovian approximation (Sect. 1.4) can be justified. Indeed, we will state in Sect. 3.3 that $T$ is well-approximated by $e^{\lambda^2 \nu L}$, with $L$ the Lindblad generator (also mentioned in Sect. 1.4). This is not proven in the present paper since the proof is well-known in the literature. For now, we view $T$ as the leading dynamics. An important consequence of the fact that $T$ is close to $e^{\lambda^2 \nu L}$ and of Assumption 1.3 is that we can establish that the operator $T$ has a simple eigenvalue 1 (the corresponding eigenvector is the ‘steady state’ $\rho_S^T$) and the rest of the spectrum lies in a circle with radius $1 - g < 1$. Since $T$ is trace conserving, $\mathrm{Tr}\, T\rho_{S,0} = \mathrm{Tr}\, \rho_{S,0}$, the ‘right’ eigenvector corresponding to the eigenvalue 1 is the identity $1 \in B(\mathcal{H}_S)$, hence we have the spectral decomposition

$$T = R + (1 - R) T, \qquad R = |\rho_S^T\rangle\langle 1|, \qquad \|(1 - R) T^n\| \leq C (1 - g)^n. \tag{1.32}$$
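The structure of (1.32) can be seen in the simplest classical analogue, a two-state Markov chain, where everything is computable by hand. The sketch below is only an illustration under that assumption: $T$ is replaced by a column-stochastic $2 \times 2$ matrix with jump probabilities `a` and `b`, $R$ by the matrix whose columns both equal the stationary vector, and the gap is $g = a + b$. All names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(A, n):
    """n-th power of a square matrix by repeated multiplication."""
    size = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, A)
    return result


def two_state_chain(a, b):
    """Column-stochastic T and the rank-one projector R = |rho><1|."""
    T = [[1.0 - a, b], [a, 1.0 - b]]
    rho = [b / (a + b), a / (a + b)]          # stationary vector, T rho = rho
    R = [[rho[0], rho[0]], [rho[1], rho[1]]]  # R p = (p_0 + p_1) rho
    return T, R


def decay_norm(T, R, n):
    """Largest entry of (1 - R) T^n, which equals T^n - R because R T = R."""
    Tn = matpow(T, n)
    return max(abs(Tn[i][j] - R[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    T, R = two_state_chain(0.3, 0.1)
    for n in (1, 5, 20):
        print(n, decay_norm(T, R, n), (1 - 0.4) ** n)
```

Here $T^n - R = (1 - a - b)^n (1 - R)$ exactly, so the decay rate in $n$ is the one appearing on the right of (1.32) with $g = a + b$.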

This property of $T$ is proven in Sect. 3.3 by simple perturbation theory (with $e^{\lambda^2 \nu L}$ being the ‘unperturbed object’), but it is introduced already in Sect. 2 as an assumption.



1.7.2. Polymer representation. If the reduced dynamics $Z_N$ were exactly Markovian, we would have $Z_N = T^N$, i.e. $Z_N$ could be called a ‘quantum Markov chain’. However, this is of course not the case and as $N$ grows the difference between $Z_N$ and $T^N$ becomes important. We represent the corrections to $T^N$ by ‘nonmarkovian excitations’ that are localized in time. For example,

$$Z_2 = T^2 + \mathcal{T}[E_c(B(1,2))], \tag{1.33}$$

where $E_c(B(1,2))$ is an operator on $B_1(\mathcal{H}_S) \otimes B_1(\mathcal{H}_S)$ that should be thought of as localized in the macroscopic times 1 and 2 (actually, in macroscopic time intervals $[0, \nu]$ and $[\nu, 2\nu]$). The operation $\mathcal{T}[\cdot]$ is a time-ordering; it converts $E_c(B(1,2))$ to an operator on $B_1(\mathcal{H}_S)$, such that it is on the same footing as $T$ (see the full definition in Sect. 2.2; we are actually abusing the correct definition slightly in the present section). For $Z_3$, we get

$$Z_3 = T^3 + T\, \mathcal{T}[E_c(B(1,2))] + \mathcal{T}[E_c(B(2,3))]\, T + \mathcal{T}[E_c(B(\{1,2,3\}))] + \mathcal{T}[E_c(B(1,3))\, T(2)], \tag{1.34}$$

where, for a general set of macroscopic times $A$, $E_c(B(A))$ denotes the (irreducible) excitation that is localized in the elements of $A$ (it acts on the $|A|$-fold tensor power of $B_1(\mathcal{H}_S)$). Since, in the rightmost term, the excitation $E_c(B(1,3))$ is localized in times 1 and 3, and the $T$-operator represents the leading dynamics in the second time interval (hence the ‘2’ in $T(2)$), we need to squeeze $T(2)$ in between the excitations at times 1 and 3. For general $N$, the resulting expression for $Z_N$ is

$$Z_N = T^N + \sum_{\mathcal{A} \in \mathrm{Pol}(N)} \mathcal{T}\left[ \Big( \bigotimes_{\tau \in I_N \setminus \mathrm{Supp}\mathcal{A}} T(\tau) \Big) \bigotimes_{A \in \mathcal{A}} E_c(B(A)) \right], \tag{1.35}$$

where the polymer set $\mathrm{Pol}(N)$ is the set of nonempty collections $\mathcal{A}$ of disjoint subsets $A$ of $I_N = \{1, 2, \ldots, N\}$. To analyze this polymer expression, we use two tools: bounds on the excitation operators $E_c(B(A))$ and a Feynman rule.

Bounds. We will bound each term in the sum (1.35) in operator norm by $\prod_{A \in \mathcal{A}} \|E_c(B(A))\|_\#$, where the norm $\|\cdot\|_\#$ is defined in Sect. 2.2.2. The $T(\tau)$-operators do not show up in these bounds since they have norm 1. We will require that $\|E_c(B(A))\|_\# \sim \epsilon^{|A|}$ for some small parameter $\epsilon$ and, moreover, that $\|E_c(B(A))\|_\#$ decreases as the macroscopic times, i.e. the elements of $A$, are further apart. This decrease as a function of temporal distance is a consequence of Assumption 1.2, but in Sect. 2 it is introduced as Assumption 2.2. In Sect. 3.1.1, we prove how Assumption 1.2 implies Assumption 2.2.

Feynman rule. It is not hard to see that the bounds given above, when summed over the different terms in (1.35), lead to a too pessimistic bound on $Z_N$. Even if we restrict to sets $A$ whose elements are consecutive integers (which is essentially justified because of the temporal decay), then we still get an exponentially diverging bound, of order $e^{CN}$, for some constant $C > 0$. To improve our bounds, we use a Feynman rule (one could also call it a Ward identity) that is a consequence of conservation of probability of the dynamics $Z_N$, to be explained in Sect. 2.3. In our general polymer expansion, this Feynman rule implies that, for every uninterrupted string of $T(\cdot)$ operators that follows a set $A$, we can insert the spectral projection $(1 - R)$ in front of the string of $T$’s. By (1.32), this yields exponential decay in the length of the string. This is illustrated in Fig. 1 (the sets $\mathrm{Hook}(A)$ will be defined later).

Fig. 1. An example of a collection $\mathcal{A} = \{A_1, A_2, A_3\}$. In the picture, $N = 20$, and $A_1 = \{3, 4, 6\}$, $A_2 = \{5, 10, 11, 13\}$, $A_3 = \{16, 17\}$. The exponential decay is on the strings of times that are covered by the dotted lines. These strings are $\mathrm{Hook}(A_1) = \{7, 8, 9\}$, $\mathrm{Hook}(A_2) = \{14, 15\}$, $\mathrm{Hook}(A_3) = \{18, 19, 20\}$. These are exactly the times between $\max A$ for some $A$ and the next-in-time element of some other set $A'$.

Armed with the Feynman rule and the bounds on $E_c(B(A))$, we can now perform the sum over all terms on the RHS of (1.35), resulting in

$$\|Z_N - T^N\| = O(\epsilon) \quad \Rightarrow \quad \|Z_N - R\| = C(1 - g)^N + O(\epsilon). \tag{1.36}$$
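The combinatorics behind (1.35) and Fig. 1 can be made concrete with a short enumeration. The sketch below lists the collections in $\mathrm{Pol}(N)$ and computes the strings $\mathrm{Hook}(A)$ for a given collection. Two assumptions are made and are not taken from the text: first, each subset $A$ is required to have at least two elements, which is what the examples (1.33) and (1.34) suggest (one term for $N = 2$, four for $N = 3$); second, $\mathrm{Hook}(A)$ is computed from the description in the caption of Fig. 1, since its formal definition comes later in the paper. The function names are hypothetical.

```python
from itertools import combinations


def polymers(N, min_size=2):
    """Yield nonempty collections of pairwise disjoint subsets of {1, ..., N}.

    Each subset has at least `min_size` elements (an assumption, see lead-in).
    Subsets are tuples; a collection is a list of tuples ordered by minimum.
    """
    def rec(remaining):
        if not remaining:
            yield []
            return
        first, rest = remaining[0], remaining[1:]
        # Case 1: the time `first` is not covered by any subset.
        yield from rec(rest)
        # Case 2: `first` is the smallest element of some subset.
        for size in range(min_size - 1, len(rest) + 1):
            for others in combinations(rest, size):
                left = tuple(x for x in rest if x not in others)
                for tail in rec(left):
                    yield [(first,) + others] + tail

    for collection in rec(tuple(range(1, N + 1))):
        if collection:
            yield collection


def hooks(collection, N):
    """Times strictly after max(A) and before the next element of another set."""
    result = {}
    for A in collection:
        later = [t for B in collection if B != A for t in B if t > max(A)]
        end = min(later) if later else N + 1
        result[A] = list(range(max(A) + 1, end))
    return result


if __name__ == "__main__":
    print([len(list(polymers(N))) for N in (2, 3, 4)])
    example = [(3, 4, 6), (5, 10, 11, 13), (16, 17)]
    print(hooks(example, 20))
```

With the two-element assumption, collections in $\mathrm{Pol}(N)$ correspond to set partitions of $\{1, \ldots, N\}$ other than the all-singletons partition, so their number grows like the Bell numbers. This is one way to see why a term-by-term bound without the Feynman rule is too pessimistic.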

By very similar arguments, one can see that $Z_N$, for large $N$, is $\epsilon$-close to a one-dimensional projector. Indeed, by the exponential decay following any excitation, all contributions to $Z_N - T^N$ tend to be localized at times close to the final time $N$. This means that they start with a string of $T$’s of length $O(N)$; such a string is equal to $R$, up to an error of order $(1 - g)^{O(N)}$. Hence, up to a vanishing error, all contributions get multiplied by $R$ on the right, and consequently they are of the form $|S\rangle\langle 1|$ for some $S \in B(\mathcal{H}_S)$. This means that also the limit $\lim_{N \nearrow \infty} Z_N$ is of this form. By conservation of trace and positivity it then follows that

$$\lim_{N \nearrow \infty} Z_N = |\rho_S^{inv}\rangle\langle 1|, \qquad \text{for some density matrix } \rho_S^{inv}: \quad \|\rho_S^{inv} - \rho_S^T\| = O(\epsilon), \quad \epsilon \to 0. \tag{1.37}$$

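These arguments are presented in Sect. 2.6.

1.7.3. Dyson expansion. The Dyson expansion is introduced to prove the bounds on $\|E_c(B(A))\|_\#$ discussed above. This is done in Sect. 3.2. It is also the standard tool to prove the weak coupling limit, Theorem 3.5. As we will do later on in the proof, we assume for simplicity that the set $I$ has just one element, such that we can drop the index $i \in I$ and simply write $H_{Int}^\Lambda = D \otimes \Phi(\phi^\Lambda)$. For any operator $O$ on $\mathcal{H}^\Lambda$, let $O(t) = e^{it(H_S + H_R^\Lambda)} O e^{-it(H_S + H_R^\Lambda)}$, and abbreviate $\Phi(t) \equiv (\Phi(\phi^\Lambda))(t)$. Then we can write the Duhamel expansion (the convergence of the series is easily established)

$$e^{it\,\mathrm{ad}(H_S)} \rho_{S,t} := \lim_{\Lambda \nearrow \mathbb{Z}^d} e^{itH_S}\, \mathrm{Tr}_R\left[ e^{-itH_\lambda^\Lambda} \left( \rho_{S,0} \otimes \rho_R^{ref,\Lambda} \right) e^{itH_\lambda^\Lambda} \right] e^{-itH_S}$$

$$= \lim_{\Lambda \nearrow \mathbb{Z}^d} \sum_{n_{Le}, n_{Ri} \in \mathbb{N}} (-i\lambda)^{n_{Le}} (i\lambda)^{n_{Ri}} \int ds_1 \ldots ds_{n_{Ri}} \int ds_1 \ldots ds_{n_{Le}} \cdots$$

$$v_>(N, n) := \sum_{\substack{\mathcal{A} \in \mathrm{Pol}(N) \\ m(\mathcal{A}) > n}} \|Z_N(\mathcal{A})\|. \tag{2.42}$$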


The relations between polymers for different $N$ imply that $v_>(N, n) = v_>(N + k, n + k)$ for any $k \in \mathbb{N}$. Moreover, the summability (2.35) implies that there is a $v < \infty$ such that

$$v = \lim_{N \to \infty} v(N), \qquad v(N) := v_\leq(N, n) + v_>(N, n) \quad \text{(independently of } n\text{)}. \tag{2.43}$$

Given $\kappa > 0$, we choose $N(\kappa)$ such that $|v(N(\kappa)) - v| \leq \kappa$, and hence $v_\leq(N + k, k) < \kappa$ for any $N \geq N(\kappa)$ and any $k \in \mathbb{N}$. Hence

$$\sum_{\mathcal{A} \in \mathrm{Pol}(N + k)} \|Z_{N+k}(\mathcal{A})\|\, C_g (1 - g)^{m(\mathcal{A})} \leq v_>(N + k, k)\, C_g (1 - g)^k + v_\leq(N + k, k) \tag{2.44}$$

$$\leq v\, C_g (1 - g)^k + \kappa. \tag{2.45}$$

As $k \to \infty$, this bound tends to $\kappa$. Since $\kappa$ is arbitrary, this proves (2.36).

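2.6.2. Convergence towards the steady state. From the bounds in (2.34) and Lemma 2.5, we obtain

$$\limsup_{N \to \infty} \|Z_N - Z_{N'}\| \leq \limsup_{N \to \infty} \Bigg[ \sum_{\substack{\mathcal{A} \in \mathrm{Pol}(N) \\ m(\mathcal{A}) \leq N - N'}} \|Z_N(\mathcal{A}) R\| + \sum_{\mathcal{A} \in \mathrm{Pol}(N)} \|Z_N(\mathcal{A})\|\, C_g (1 - g)^{m(\mathcal{A})} + C_g (1 - g)^N \Bigg]. \tag{2.46}$$

We will now estimate the two first terms on the RHS of (2.46) multiplied by the factor $\zeta(N')$. Note first that for any $\mathcal{A} \in \mathrm{Pol}(N)$ with $m(\mathcal{A}) \leq N - N'$,

$$N' < N - m(\mathcal{A}) + 1 = \sum_{A \in \mathcal{A}} |\max A - \min A| + \sum_{A \in \mathcal{A}} |\mathrm{Hook}(A)|, \tag{2.47}$$

and hence, by property (1.21) of the function $\zeta(\cdot)$,

$$\zeta(N') \leq \prod_{A \in \mathcal{A}} \mathrm{dist}_\zeta(A) \times \zeta(|\mathrm{Hook}(A)|). \tag{2.48}$$

Hence

$$\zeta(N') \sum_{\substack{\mathcal{A} \in \mathrm{Pol}(N) \\ m(\mathcal{A}) \leq N - N'}} \|Z_N(\mathcal{A})\| \leq \sum_{\mathcal{A} \in \mathrm{Pol}(N)} \prod_{A \in \mathcal{A}} \mathrm{dist}_\zeta(A)\, \|E_c(B(A))\|_\#\, \zeta(|\mathrm{Hook}(A)|)\, C_g (1 - g)^{|\mathrm{Hook}(A)|}$$

$$\leq \sum_{\mathcal{A} \in \mathrm{Pol}(N)} c(\zeta, g)^{|\mathcal{A}|} \prod_{A \in \mathcal{A}} \mathrm{dist}_\zeta(A)\, \|E_c(B(A))\|_\# \left( \sqrt{1 - g} \right)^{|\mathrm{Hook}(A)|}, \tag{2.49}$$

where we have put (using that $\zeta$ is subexponential, see (1.23))

$$c(\zeta, g) := \sup_{n \geq 1} \zeta(n) \left( \sqrt{1 - g} \right)^n C_g. \tag{2.50}$$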
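The constant in (2.50) is finite precisely because $\zeta$ grows subexponentially while $(\sqrt{1-g})^n$ decays geometrically. As a rough numerical illustration, the sketch below evaluates the supremum for a polynomial weight. The weight $\zeta(n) = (1+n)^2$, the value $C_g = 1$, the cutoff `n_max`, and the function name are all assumptions chosen for the example; the paper does not fix these values.

```python
import math


def c_constant(zeta, g, C_g=1.0, n_max=10000):
    """Approximate sup_{n >= 1} zeta(n) * sqrt(1 - g)^n * C_g by a finite scan.

    The scan is adequate when zeta(n) * r^n is eventually decreasing well
    before n_max, which holds for polynomial zeta and r = sqrt(1 - g) < 1.
    """
    r = math.sqrt(1.0 - g)
    return C_g * max(zeta(n) * r ** n for n in range(1, n_max + 1))


def zeta_quadratic(n):
    """Illustrative polynomial weight (1 + n)^2."""
    return (1.0 + n) ** 2


if __name__ == "__main__":
    # With g = 0.19 the ratio is r = 0.9 and the maximum sits near n = 18.
    print(c_constant(zeta_quadratic, 0.19))
```

The price paid in (2.49) is the replacement of $(1-g)$ by $\sqrt{1-g}$: half of the exponential decay is spent on absorbing $\zeta$, the other half is kept.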



Note that we dropped the restriction that $m(\mathcal{A}) \leq N - N'$ since it was only necessary for (2.48). Consequently, we have derived a bound, (2.49), for the first term on the RHS of (2.46) (since $\|Z_N(\mathcal{A}) R\| \leq \|Z_N(\mathcal{A})\|$). We will now derive a similar bound for the second term on the RHS of (2.46). Instead of (2.48), we use here that

$$\zeta(N') \leq \zeta(m(\mathcal{A})) \times \prod_{A \in \mathcal{A}} \mathrm{dist}_\zeta(A) \times \zeta(|\mathrm{Hook}(A)|), \qquad \text{for } \mathcal{A} \in \mathrm{Pol}(N), \tag{2.51}$$

and we obtain the same bound as in (2.49), except that this time we get $c(\zeta, g)^{|\mathcal{A}|+1}$ instead of $c(\zeta, g)^{|\mathcal{A}|}$ because of the presence of the term $\zeta(m(\mathcal{A}))$ in (2.51). Next, we show that (2.49) (or the analogous bound for the second term on the RHS of (2.46)) can be bounded by $O(\epsilon)$, for $\epsilon$ small enough. To achieve this, we proceed in exactly the same way as in the proof of Lemma 2.5, except that here:

• We include the factor $\mathrm{dist}_\zeta(A)$ in the weight $E_c(B(\tau A))$, which is permitted by Assumption 2.2.
• For each collection $\mathcal{A}$, there is an additional factor $c(\zeta, g)^{|\mathcal{A}|}$ (or $c(\zeta, g)^{|\mathcal{A}|+1}$), which can be handled by choosing $\epsilon$ smaller.
• The factor $(1 - g)$ is replaced by $\sqrt{1 - g}$, which again forces $\epsilon$ to be smaller.

To conclude the proof of Theorem 2.4, it remains to show that $Z_\infty := \lim_{N \nearrow \infty} Z_N$ is of the form $|\rho_S^{inv}\rangle\langle 1|$. This follows from the fact that, by the bound (2.36), only the terms $Z_N(\mathcal{A}) R$ contribute to $Z_\infty$.

3. Discretization of the Physical System

We explain now how the setup of Sect. 1 fits into the framework of Sect. 2. In Sect. 3.1, we introduce and estimate the Dyson expansion and we present the construction of the operators $E_c(B(A))$ from the microscopic model. In Sects. 3.1.3 and 3.3, we check Assumptions 2.2 and 2.3 starting from Assumptions 1.2 and 1.3.

3.1. Expansions. To save notation, we present the case where there is only one element in the sum (??) defining the interaction Hamiltonian, i.e. $|I| = 1$ and we can write $D_i = D$, $\varphi_i = \varphi$. The general case can be treated in essentially the same way; we indicate the changes at the end of Sect. 3.1.2 and in the proof of Lemma 3.2. We expand the reduced dynamics, introduced in Sect. 2,

$$Z_N \rho_S = P\, U(N) U(N - 1) \cdots U(1)\, P \rho_S = \mathrm{Tr}_R\left( e^{-i\nu N H_\lambda} \left( \rho_S \otimes \rho_R^{ref} \right) e^{i\nu N H_\lambda} \right) \tag{3.1}$$

in a Dyson series.

3.1.1. Dyson expansion. We let

$$D(t) := e^{it\,\mathrm{ad}(H_S)}\, D\, e^{-it\,\mathrm{ad}(H_S)}, \tag{3.2}$$



and we recall the left and right multiplication operators $M_{Le}(A)$, $M_{Ri}(A)$ introduced in (1.12). Define the operator products

$$K(t, k) = M_{k_{2n}}(iD(t_{2n})) \cdots M_{k_1}(iD(t_1)), \qquad \|K(t, k)\| \leq \|D\|^{2n}, \tag{3.3}$$

where $t = (t_1, \ldots, t_{2n})$ is an ordered sequence of times $0 < t_1 < \cdots < t_{2n} < t$ and $k = (k_1, \ldots, k_{2n})$ is a sequence in $\{Le, Ri\}$. Next, recall the correlation functions $f(t)$ (the labels $i, j$ have been dropped because of the simplification $|I| = 1$) and define

$$G(t, k) := \sum_{\text{pairings } \pi}\ \prod_{(s,r) \in \pi} \begin{cases} \lambda^2 f(t_r - t_s) & \text{if } k_s = Le, \\ \lambda^2\, \overline{f(t_r - t_s)} & \text{if } k_s = Ri \end{cases} \tag{3.4}$$

(recall the convention that $s < r$ in the pairing $\pi$). Finally, we introduce the free S-dynamics

$$W_\tau := e^{-i\nu\tau\, \mathrm{ad}(H_S)}. \tag{3.5}$$

Then, we are ready to state the Dyson expansion for $Z_N$:

$$Z_N = W_N \sum_{n \in \mathbb{N}} \sum_{k} \int_{0\, \cdots} dt\ K(t, k)\, G(t, k), \tag{3.6}$$


Recommend Documents