Communications In Mathematical Physics - Volume 288

Author: M. Aizenman (Chief Editor)

Commun. Math. Phys. 288, 1–42 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0733-4

Bethe Algebra of Homogeneous XXX Heisenberg Model has Simple Spectrum

E. Mukhin¹, V. Tarasov¹,², A. Varchenko³

1 Department of Mathematical Sciences, Indiana University – Purdue University, Indianapolis, 402 North Blackford St, Indianapolis, IN 46202-3216, USA. E-mail: [email protected]

2 St. Petersburg Branch of Steklov Mathematical Institute, Fontanka 27, St. Petersburg, 191023, Russia. E-mail: [email protected]; [email protected]

3 Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3250, USA. E-mail: [email protected]

Received: 12 September 2007 / Accepted: 23 October 2008
Published online: 18 February 2009 – © Springer-Verlag 2009

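Abstract: We show that the algebra of commuting Hamiltonians of the homogeneous XXX Heisenberg model has simple spectrum on the subspace of singular vectors of the tensor product of two-dimensional $\mathfrak{gl}_2$-modules. As a byproduct we show that there exist exactly $\binom{n}{l} - \binom{n}{l-1}$ two-dimensional vector subspaces $V \subset \mathbb{C}[u]$ with a basis $f, g \in V$ such that $\deg f = l$, $\deg g = n - l + 1$ and $f(u)\,g(u-1) - f(u-1)\,g(u) = (u+1)^n$.

1. Introduction

1.1. Homogeneous XXX Heisenberg model. Consider the vector space $(\mathbb{C}^2)^{\otimes n}$ and the linear operator

\[
H_{XXX} = -\sum_{j=1}^{n} \bigl( \sigma_1^{(j)}\sigma_1^{(j+1)} + \sigma_2^{(j)}\sigma_2^{(j+1)} + \sigma_3^{(j)}\sigma_3^{(j+1)} \bigr),
\]

where $\sigma_a^{(k)} = 1^{\otimes(k-1)} \otimes \sigma_a \otimes 1^{\otimes(n-k)}$, $\sigma_a^{(n+1)} = \sigma_a^{(1)}$, and $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices,

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]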

The operator $H_{XXX}$ is the Hamiltonian of the celebrated XXX Heisenberg model, also called the homogeneous XXX model, and the problem is to find eigenvalues and eigenvectors of the Hamiltonian.

Supported in part by NSF grant DMS-0601005.
Supported in part by RFFI grant 08-01-00638.
Supported in part by NSF grant DMS-0555327.

This problem was first addressed in the pioneering work [Be] by H. Bethe, who looked for eigenvectors of $H_{XXX}$ in a certain special form. His method and its further extensions are traditionally called the Bethe ansatz. The current literature on the XXX model and its generalizations, XXZ and XYZ models, as well as their counterparts in statistical mechanics, the six- and eight-vertex models, is enormous. We limit ourselves to mentioning just two books, [B1] and [KBI]. However, even numerous references therein hardly cover half of the bibliography on the subject.

The Hamiltonian $H_{XXX}$ can be included into a one-parameter family of commuting linear operators called the transfer matrix, see [B1,FT,KBI]. We call the commutative unital subalgebra of linear operators on $(\mathbb{C}^2)^{\otimes n}$ generated by the transfer matrix the Bethe algebra. The actual problem is to construct eigenvalues and eigenvectors for the Bethe algebra. The elements of the Bethe algebra commute with the natural $\mathfrak{gl}_2$-action on $(\mathbb{C}^2)^{\otimes n}$. Therefore, the eigenspaces of the Bethe algebra are representations of $\mathfrak{gl}_2$, and it suffices to construct highest weight vectors of those representations.

The Bethe ansatz method associates to every admissible solution $(\lambda_1, \dots, \lambda_l)$ of the system of equations

\[
\left( \frac{\lambda_j + i/2}{\lambda_j - i/2} \right)^{\!n}
= \prod_{\substack{k=1 \\ k \ne j}}^{l} \frac{\lambda_j - \lambda_k + i}{\lambda_j - \lambda_k - i},
\qquad j = 1, \dots, l, \tag{1.1}
\]

a vector in $(\mathbb{C}^2)^{\otimes n}$, called the corresponding Bethe vector, see [FT]. A solution $(\lambda_1, \dots, \lambda_l)$ is called admissible if all $\lambda_1, \dots, \lambda_l$ are distinct, and all factors in (1.1) are nonzero. A nonzero Bethe vector is a highest weight vector of an $(n - 2l + 1)$-dimensional irreducible representation of $\mathfrak{gl}_2$, and all vectors in that representation are eigenvectors of each element of the Bethe algebra sharing the same eigenvalue.

It is an important question whether the Bethe ansatz method produces all eigenvectors of the Bethe algebra. This question is referred to as the question of completeness of the Bethe ansatz for finite chains. It was discussed by H. Bethe himself in [Be] and many times since then by other authors. For instance, see a recent discussion in [B2]. However, no rigorous proof is available even for the so-called inhomogeneous models. Moreover, as one can see from the results of this paper, Sklyanin's separation of variables does not prove completeness of the Bethe ansatz to the very end, though it is indeed an important step towards the proof.

To be more precise, there are certain quantum integrable models for which the completeness of the Bethe ansatz has been proved. For example, see [YY] and Theorem 1.2.2 in [KBI]. The proofs for those models are based on a variational principle and convexity of some auxiliary action. However, for the XXX model, the corresponding action is not convex, and that technique fails.

In this paper we establish the completeness of the Bethe ansatz method for the homogeneous XXX model provided the method is improved in a certain way, see below in the introduction. We show that the spectrum of the Bethe algebra of the homogeneous XXX model is simple, that is, all eigenspaces of the Bethe algebra are irreducible $\mathfrak{gl}_2$-modules. We also show that eigenvalues of the Bethe algebra are in a one-to-one correspondence with certain second-order linear difference equations with two linearly independent polynomial solutions.

We prove similar results for inhomogeneous higher spin XXX models.

To continue with the introduction and to match the notation in the main part of the paper, we change the variables in system (1.1),

\[
\lambda_j = i \left( t_j + \tfrac{3}{2} \right), \qquad j = 1, \dots, l,
\]

and write the system in the polynomial form

\[
(t_j + 2)^n \prod_{\substack{k=1 \\ k \ne j}}^{l} (t_j - t_k - 1)
= (t_j + 1)^n \prod_{\substack{k=1 \\ k \ne j}}^{l} (t_j - t_k + 1),
\qquad j = 1, \dots, l. \tag{1.2}
\]
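As a check on the change of variables, the substitution can be carried out factor by factor. The sketch below assumes the shift $\lambda_j = i\,(t_j + 3/2)$, which is the value for which (1.1) turns into (1.2) term by term:

```latex
\begin{align*}
\frac{\lambda_j + i/2}{\lambda_j - i/2}
  &= \frac{i\,(t_j + 2)}{i\,(t_j + 1)} = \frac{t_j + 2}{t_j + 1}, \\[4pt]
\frac{\lambda_j - \lambda_k + i}{\lambda_j - \lambda_k - i}
  &= \frac{i\,(t_j - t_k + 1)}{i\,(t_j - t_k - 1)} = \frac{t_j - t_k + 1}{t_j - t_k - 1}.
\end{align*}
```

Raising the first ratio to the power $n$, taking the product of the second over $k \ne j$, and clearing denominators gives the polynomial system (1.2).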

We call system (1.2) the system of the Bethe ansatz equations. The system is invariant with respect to permutations of $t_1, \dots, t_l$, so the symmetric group $S_l$ acts on solutions to the Bethe ansatz equations. We denote by $\omega(t_1, \dots, t_l)$ the Bethe vector corresponding to an admissible solution $(t_1, \dots, t_l)$ of the Bethe ansatz equations. The Bethe vectors corresponding to admissible solutions with permuted coordinates are equal. The number of Bethe vectors $\omega(t_1, \dots, t_l)$ is equal to the number of $S_l$-orbits of admissible solutions to system (1.2).

Since each element of the Bethe algebra commutes with the natural $\mathfrak{gl}_2$-action on $(\mathbb{C}^2)^{\otimes n}$, it is enough to diagonalize the action of the Bethe algebra on each subspace of $\mathfrak{gl}_2$-singular vectors of given weight,

\[
\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,] = \{\, v \in (\mathbb{C}^2)^{\otimes n} \mid e_{12} v = 0,\ e_{11} v = (n - l)\, v,\ e_{22} v = l\, v \,\},
\]

with $2l \le n$. For every admissible solution $(t_1, \dots, t_l)$ of the Bethe ansatz equations, the Bethe vector $\omega(t_1, \dots, t_l)$ belongs to the subspace $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$.

To illustrate the problem with completeness of the Bethe ansatz in the standard form and the way it can be resolved, let us consider an example. Let $n = 4$ and $l = 2$. Then $\dim \mathrm{Sing}\,(\mathbb{C}^2)^{\otimes 4}[\,2\,] = 2$, and the operator $H_{XXX}$ restricted to $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes 4}[\,2\,]$ has eigenvalues 5 and −3. The Bethe ansatz equations are

\[
\begin{aligned}
(t_1 + 2)^4 (t_1 - t_2 - 1) &= (t_1 + 1)^4 (t_1 - t_2 + 1), \\
(t_2 + 2)^4 (t_2 - t_1 - 1) &= (t_2 + 1)^4 (t_2 - t_1 + 1),
\end{aligned} \tag{1.3}
\]

and there is only one orbit of admissible solutions:

\[
t_1 = -\frac{3}{2} + \frac{1}{2}\sqrt{-\frac{1}{3}}, \qquad
t_2 = -\frac{3}{2} - \frac{1}{2}\sqrt{-\frac{1}{3}}. \tag{1.4}
\]
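The example can be checked numerically. The following is a minimal sketch, not part of the paper: the helper name `bethe_residuals` is hypothetical, and the values of $t_1, t_2$ are taken to be the two roots of $u^2 + 3u + 7/3$, which is how (1.4) is read here.

```python
import cmath

def bethe_residuals(ts, n):
    """Return (left side - right side) of system (1.2) for each index j."""
    residuals = []
    for j, tj in enumerate(ts):
        lhs = (tj + 2) ** n
        rhs = (tj + 1) ** n
        for k, tk in enumerate(ts):
            if k != j:
                lhs *= tj - tk - 1
                rhs *= tj - tk + 1
        residuals.append(lhs - rhs)
    return residuals

# Assumed reading of (1.4): t = -3/2 +/- (1/2) * sqrt(-1/3).
half_gap = cmath.sqrt(-1 / 3) / 2
t1, t2 = -1.5 + half_gap, -1.5 - half_gap

# Largest residual of (1.3) at (t1, t2); it vanishes up to rounding.
print(max(abs(r) for r in bethe_residuals([t1, t2], n=4)))

# The pair (-1, -2) also satisfies the polynomial system, but only
# because factors on both sides vanish, so it is not admissible.
print(bethe_residuals([-1, -2], n=4))
```

The second call uses exact integer arithmetic, which makes it visible that both sides of each equation are zero for the nonadmissible pair.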

The Bethe vector $\omega(t_1, t_2)$ is an eigenvector of $H_{XXX}$ with eigenvalue 5.

The results of this paper say that each eigenspace of $H_{XXX}$ acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes 4}[\,2\,]$ corresponds to a difference equation

\[
u^4 f(u) - B(u)\, f(u - 1) + (u + 1)^4 f(u - 2) = 0, \tag{1.5}
\]

where $B(u)$ is a polynomial, and the difference equation has polynomial solutions of degree 2 and 3. The corresponding eigenvalue of $H_{XXX}$ equals $1 - 2B'(0)/B(0)$. Indeed, there are exactly two such difference equations. The first one has $B(u) = 2u^4 + 4u^3 - 2u + 1$, and solutions $u^2 + 3u + \frac{7}{3}$ and $u^3 + 6u^2 + 11u + \frac{13}{2}$. The roots of the quadratic polynomial are the numbers $t_1$ and $t_2$ given by (1.4).

The second difference equation (1.5) with polynomial solutions of degree 2 and 3 has $B(u) = 2u^4 + 4u^3 - 2u - 1$, and solutions $(u + 1)(u + 2)$ and $u^3 + 6u^2 + 10u + \frac{9}{2}$. The roots of the quadratic polynomial, $t_1 = -1$ and $t_2 = -2$, form a nonadmissible solution to system (1.3), and the Bethe vector $\omega(t_1, t_2)$ for $t_1 = -1$, $t_2 = -2$ equals zero.

For general $n$ and $l$ such that $2l \le n$, the results of this paper for the homogeneous XXX model say that eigenspaces of the Bethe algebra acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$ are one-dimensional. They are in a one-to-one correspondence with difference equations

\[
u^n f(u) - B(u)\, f(u - 1) + (u + 1)^n f(u - 2) = 0, \tag{1.6}
\]

where $B(u)$ is a polynomial, and those difference equations have polynomial solutions of degree $l$ and $n - l + 1$. The corresponding eigenvalues of elements of the Bethe algebra are described by the polynomial $B(u)$. In particular, the eigenvalue of $H_{XXX}$ equals $1 - 2B'(0)/B(0)$. The roots $t_1, \dots, t_l$ of the polynomial solution of Eq. (1.6) of degree $l$ form a solution of system (1.2). The Bethe vector $\omega(t_1, \dots, t_l)$ is nonzero if and only if the solution $(t_1, \dots, t_l)$ is admissible.

To obtain an eigenvector of the Bethe algebra corresponding to a difference equation (1.6) with two polynomial solutions, we use the following construction. The space $(\mathbb{C}^2)^{\otimes n}$ has a structure of a module over the Yangian $Y(\mathfrak{gl}_2)$, and the Bethe algebra of the homogeneous XXX model is the image of a commutative subalgebra $B \subset Y(\mathfrak{gl}_2)$, called the Bethe subalgebra. We take another $Y(\mathfrak{gl}_2)$-module $W_{a,d}$, described in Sect. 2.5, which is the holomorphic representation of $Y(\mathfrak{gl}_2)$ associated with the polynomials $a(u) = (u + 1)^n$ and $d(u) = u^n$. There is a natural epimorphism $\sigma : W_{a,d} \to (\mathbb{C}^2)^{\otimes n}$ of $Y(\mathfrak{gl}_2)$-modules. Using the roots $t_1, \dots, t_l$ of the polynomial solution of Eq. (1.6) of degree $l$ and Sklyanin's procedure of separation of variables [Sk], we define a nonzero vector $\tilde\omega(t_1, \dots, t_l)$ in $W_{a,d}$, which is an eigenvector of $B$ acting on $W_{a,d}$. We consider the maximal $B$-invariant subspace $V \subset W_{a,d}$ that contains $\tilde\omega(t_1, \dots, t_l)$ and does not contain other linearly independent eigenvectors of $B$. We show that the image $\sigma(V) \subset (\mathbb{C}^2)^{\otimes n}$ is a one-dimensional subspace of $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$. Since $\sigma$ is a homomorphism of $Y(\mathfrak{gl}_2)$-modules, $\sigma(V)$ is an eigenspace of the Bethe algebra acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$ with the same eigenvalues as the eigenvalues of $\tilde\omega(t_1, \dots, t_l)$ with respect to the action of $B$ on $W_{a,d}$.
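The $n = 4$, $l = 2$ example from earlier in the introduction can be verified with exact rational arithmetic. The sketch below is illustrative only: the helper names are hypothetical, polynomials are stored as coefficient lists with the lowest degree first, and the constant terms $7/3$, $13/2$ and $9/2$ are the values for which the check passes.

```python
from fractions import Fraction as F

def trim(p):
    """Drop trailing zero coefficients (lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def shift(p, c):
    """Coefficients of p(u + c), computed by Horner's rule."""
    out = []
    for a in reversed(p):
        out = add(mul(out, [c, 1]), [a])
    return out

def baxter_residual(f, B, n):
    """u^n f(u) - B(u) f(u-1) + (u+1)^n f(u-2); [] means identically zero."""
    u_pow = [0] * n + [1]
    u1_pow = [1]
    for _ in range(n):
        u1_pow = mul(u1_pow, [1, 1])
    middle = [-c for c in mul(B, shift(f, -1))]
    return trim(add(add(mul(u_pow, f), middle), mul(u1_pow, shift(f, -2))))

def energy(B):
    """The eigenvalue 1 - 2 B'(0)/B(0) quoted in the text."""
    return 1 - 2 * F(B[1], B[0])

B1, f1, g1 = [1, -2, 0, 4, 2], [F(7, 3), 3, 1], [F(13, 2), 11, 6, 1]
B2, f2, g2 = [-1, -2, 0, 4, 2], [2, 3, 1], [F(9, 2), 10, 6, 1]

for B, f, g in ((B1, f1, g1), (B2, f2, g2)):
    print(baxter_residual(f, B, 4), baxter_residual(g, B, 4), energy(B))
```

Both polynomials in each pair give an empty residual, and `energy` returns 5 for the first $B(u)$ and −3 for the second, matching the two eigenvalues stated for $H_{XXX}$ on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes 4}[\,2\,]$.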
The subspace $\sigma(V) \subset \mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$ is that one-dimensional subspace of eigenvectors which we assigned to difference equation (1.6) with two polynomial solutions. If $(t_1, \dots, t_l)$ is an admissible solution, then the subspace $V \subset W_{a,d}$ is one-dimensional, and the subspace $\sigma(V)$ is spanned by the Bethe vector $\omega(t_1, \dots, t_l)$.

The construction described above provides a generalization of the Bethe ansatz method in which the solutions to the Bethe ansatz equations are replaced by difference equation (1.6) with two polynomial solutions, and the Bethe vectors in $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$ are replaced by the subspaces $\sigma(V)$. Our result says that the generalized Bethe vectors form a basis in $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$ and, moreover, the spectrum of the Bethe algebra is simple.

As a remark, we would like to indicate another way to obtain the eigenspace of the Bethe algebra acting on $\mathrm{Sing}\,(\mathbb{C}^2)^{\otimes n}[\,l\,]$ corresponding to the difference equation (1.6). We may consider the inhomogeneous XXX model depending on parameters $z_1, \dots, z_n$. The corresponding system of the Bethe ansatz equations is

\[
\prod_{s=1}^{n} (t_j - z_s + 2) \prod_{\substack{k=1 \\ k \ne j}}^{l} (t_j - t_k - 1)
= \prod_{s=1}^{n} (t_j - z_s + 1) \prod_{\substack{k=1 \\ k \ne j}}^{l} (t_j - t_k + 1), \tag{1.7}
\]

$j = 1, \dots, l$. It follows from the results of this paper that if $f(u) = \prod_{j=1}^{l} (u - t_j)$ is a solution of the difference equation (1.6), then for generic $z = (z_1, \dots, z_n)$ there exists an admissible solution $t(z) = (t_1(z), \dots, t_l(z))$ of system (1.7) such that $t(z) \to (t_1, \dots, t_l)$ as $z \to 0$. The Bethe vectors $\omega(t(z); z)$ are nonzero for generic $z$, and the eigenspaces $\mathbb{C}\,\omega(t(z); z)$ have a one-dimensional limit as $z \to 0$, which is the eigenspace of the Bethe algebra of the homogeneous XXX model. A similar approach for the $\mathfrak{gl}_N$ Gaudin model is developed in [MTV5].

The correspondence between the eigenvectors of the Bethe algebra and second-order linear difference equations with two polynomial solutions is in the spirit of the geometric Langlands correspondence, in which eigenfunctions of commuting differential operators correspond to connections on curves.

Equation (1.6) is known in the physical literature as Baxter's equation. Its connection with the Bethe ansatz equations has been studied in many papers. The fact that the roots of a polynomial solution of Baxter's equation give a solution of the Bethe ansatz equations (provided the roots are distinct) is known as Manakov's principle and the analytic Bethe ansatz. An important observation about the existence of a second polynomial solution of Baxter's equation was made in [PS]. A similar observation in a much more general context has been made independently in [MV2,MV3].

1.2. Content of the paper. The results of this paper for the XXX model are discrete analogues of the results of [MTV3] for the Gaudin model.

In Sect. 2 we discuss the Yangian $Y(\mathfrak{gl}_2)$, the Bethe subalgebra $B \subset Y(\mathfrak{gl}_2)$, and Yangian modules. In particular, we describe the holomorphic representation $W_{a,d}$ of the Yangian $Y(\mathfrak{gl}_2)$. The module $W_{a,d}$ is associated with two monic polynomials

\[
a(u) = \prod_{i=1}^{n} (u - z_i + m_i) \qquad \text{and} \qquad d(u) = \prod_{i=1}^{n} (u - z_i)
\]

and is isomorphic to $\mathbb{C}[x_1, \dots, x_n]$ as a vector space. We introduce a collection $((m_1, 0), \dots, (m_n, 0))$ of $\mathfrak{gl}_2$-weights and say that the pair $((m_1, 0), \dots, (m_n, 0)),\ l$ is separating if $\sum_{i=1}^{n} m_i - 2l + 1 + s \ne 0$ for all $s = 1, \dots, l$.

In Sects. 3–7, we study the algebras $A_W$ and $A_D$, and relations between them. Eventually, we show that the algebras $A_W$ and $A_D$ are isomorphic, see Theorem 7.3.1.

The algebra $A_W$ is the image of the Bethe subalgebra $B$ acting on the subspace $\mathrm{Sing}\,W_{a,d}[\,l\,] \subset W_{a,d}$ of $\mathfrak{gl}_2$-singular vectors. We consider a polynomial $B(u, H) = 2u^n + H_1 u^{n-1} + \dots + H_n$, whose coefficients $H_k \in \mathrm{End}\,(\mathrm{Sing}\,W_{a,d}[\,l\,])$ are generators of $A_W$, and introduce the universal difference operator $D_{\mathrm{Sing}\,W_{a,d}[\,l\,]} = d(u) - B(u, H)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$ acting on $\mathrm{Sing}\,W_{a,d}[\,l\,]$-valued functions in $u$. Here $\vartheta : f(u) \mapsto f(u + 1)$.

The algebra $A_D$ is defined in Sect. 4. We consider the space $\mathbb{C}^{l+n}$ with coordinates $a = (a_1, \dots, a_l)$ and $h = (h_1, \dots, h_n)$, polynomials $B(u, h) = 2u^n + h_1 u^{n-1} + \dots + h_n$ and $p(u, a) = u^l + a_1 u^{l-1} + \dots + a_l$, and the difference operator $D_h = d(u) - B(u, h)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$. We define the scheme $C_D$ of points $p \in \mathbb{C}^{l+n}$ such that the polynomial $p(u, a(p))$ lies in the kernel of the difference operator $D_{h(p)}$. The algebra $A_D$ is the algebra of functions

on $C_D$. There is a natural epimorphism $\psi_{DW} : A_D \to A_W$ such that $\psi_{DW}(h_k) = H_k$, see Theorem 4.3.3.

Using the Bethe ansatz method, we prove that if $z_1, \dots, z_n$ are generic and the pair $((m_1, 0), \dots, (m_n, 0)),\ l$ is separating, then the scheme $C_D$ considered as a set has at least $\dim \mathrm{Sing}\,W_{a,d}[\,l\,]$ distinct points, see Sect. 5.

In Sect. 6, we review Sklyanin's procedure of separation of variables in the XXX model and construct the universal weight function. Theorem 6.3.2 connects the algebras $A_D$, $A_W$ and the universal weight function.

The algebra $A_D$ acts on itself by multiplication operators. We denote by $L_f$ the operator of multiplication by an element $f \in A_D$. The algebra $A_D$ acts on its dual space $A_D^*$ by operators $L_f^*$, dual to multiplication operators. Using the universal weight function we define a linear map $\tau : A_D^* \to \mathrm{Sing}\,W_{a,d}[\,l\,]$ and prove that if the pair $((m_1, 0), \dots, (m_n, 0)),\ l$ is separating, then $\tau$ is an isomorphism that intertwines the action of operators $L_f^*$, $f \in A_D$, with the action of operators $\psi_{DW}(f) \in \mathrm{End}(\mathrm{Sing}\,W_{a,d}[\,l\,])$, see Theorem 7.3.1. Therefore, we prove that $\psi_{DW} : A_D \to A_W$ is an algebra isomorphism. Theorem 7.3.1 is our first main result.

Using the Grothendieck residue, we define an isomorphism $\phi : A_D \to A_D^*$ of $A_D$-modules, see Sect. 7.4. Therefore, if the pair $((m_1, 0), \dots, (m_n, 0)),\ l$ is separating, the composition $\tau\phi : A_D \to \mathrm{Sing}\,W_{a,d}[\,l\,]$ is a linear isomorphism which intertwines the action of the algebra $A_D$ on itself by multiplication operators and the action of the Bethe algebra $A_W$ on $\mathrm{Sing}\,W_{a,d}[\,l\,]$.

In Sects. 8 through 11, we impose more conditions on $m_1, \dots, m_n$ and $z_1, \dots, z_n$. We assume that $m_1, \dots, m_n$ are natural numbers. We keep the assumption that the pair $((m_1, 0), \dots, (m_n, 0)),\ l$ is separating, which takes the form $2l \le \sum_{s=1}^{n} m_s$. We also assume that $z_i - z_j \notin \mathbb{Z}$ if $i \ne j$.

In Sects. 8–11, we study three more algebras $A_G$, $A_P$ and $A_L$, and relations between them.

The algebra $A_G$ is defined in Sect. 8. We consider the subspace $\mathbb{C}_d[u] \subset \mathbb{C}[u]$ of all polynomials of degree at most $d$ for a suitably large number $d$, and the Grassmannian of all two-dimensional subspaces of $\mathbb{C}_d[u]$. Using the numbers $z_1, \dots, z_n$ and $m_1, \dots, m_n$ we define $n + 1$ Schubert cycles $C_{F(z_1), \Lambda^{(1)}}, \dots, C_{F(z_n), \Lambda^{(n)}}, C_{F(\infty), \Lambda^{(\infty)}}$ in the Grassmannian. The algebra $A_G$ is the algebra of functions on the intersection of the Schubert cycles.

The algebra $A_P$ is defined in Sect. 9.1. Let $\tilde l = \sum_{s=1}^{n} m_s + 1 - l$, $\tilde a = (\tilde a_1, \dots, \tilde a_{\tilde l - l - 1}, \tilde a_{\tilde l - l + 1}, \dots, \tilde a_{\tilde l})$,

\[
\tilde p(u, \tilde a) = u^{\tilde l} + \tilde a_1 u^{\tilde l - 1} + \dots + \tilde a_{\tilde l - l - 1} u^{l + 1} + \tilde a_{\tilde l - l + 1} u^{l - 1} + \dots + \tilde a_{\tilde l},
\]

and consider the space $\mathbb{C}^{l + \tilde l + n - 1}$ with coordinates $\tilde a, a, h$. We define the scheme $C_P$ as the scheme of points $p \in \mathbb{C}^{l + \tilde l + n - 1}$ such that the polynomials $p(u, a(p))$ and $\tilde p(u, \tilde a(p))$ lie in the kernel of the difference operator $D_{h(p)}$. The algebra $A_P$ is the algebra of functions on $C_P$. The map $(p(u, a(p)), \tilde p(u, \tilde a(p)), D_{h(p)}) \mapsto (p(u, a(p)), D_{h(p)})$ defines a natural epimorphism $\psi_{DP} : A_D \to A_P$. We also show that the algebras $A_G$ and $A_P$ are naturally isomorphic.

To define the algebra $A_L$, see Sect. 9.3, we consider the tensor product $L_\Lambda(z) = L_{\Lambda^{(1)}}(z_1) \otimes \dots \otimes L_{\Lambda^{(n)}}(z_n)$ of evaluation Yangian modules, where $L_{\Lambda^{(i)}}$ is the irreducible $\mathfrak{gl}_2$-module of highest weight $\Lambda^{(i)} = (m_i, 0)$. The algebra $A_L$ is the image of the Bethe subalgebra $B \subset Y(\mathfrak{gl}_2)$ acting on the subspace $\mathrm{Sing}\,L_\Lambda[\,l\,] \subset L_\Lambda(z)$ of $\mathfrak{gl}_2$-singular vectors. The $Y(\mathfrak{gl}_2)$-module $L_\Lambda(z)$ is isomorphic to the quotient module $W_{a,d}/K$, where $K \subset W_{a,d}$ is the kernel of


the Yangian Shapovalov form on $W_{a,d}$. We denote by $\sigma : \mathrm{Sing}\,W_{a,d}[\,l\,] \to \mathrm{Sing}\,L_\Lambda[\,l\,]$ the epimorphism of vector spaces corresponding to the epimorphism $W_{a,d} \to L_\Lambda(z)$ of $Y(\mathfrak{gl}_2)$-modules. The epimorphism $\sigma$ induces the algebra epimorphism $\psi_{WL} : A_W \to A_L$.

We denote by $\xi : A_D \to \mathrm{Sing}\,L_\Lambda[\,l\,]$ the composition of maps $\sigma\tau\phi$, and by $\psi_{DL} : A_D \to A_L$ the composition of maps $\psi_{WL}\psi_{DW}$. We show that the kernels of the maps $\xi$, $\psi_{DL}$ and $\psi_{DP}$ coincide. This allows us to obtain an algebra isomorphism $\psi_{PL} : A_P \to A_L$ and a linear isomorphism $\zeta : A_P \to \mathrm{Sing}\,L_\Lambda[\,l\,]$ intertwining the action of $A_P$ on itself by multiplication operators and the action of the Bethe algebra $A_L$ on $\mathrm{Sing}\,L_\Lambda[\,l\,]$. This is our second main result, see Theorem 10.3.1.

In Sect. 11, we use the Yangian Shapovalov form on $L_\Lambda(z)$ and the map $\zeta$ to obtain a linear isomorphism $\theta : A_P^* \to \mathrm{Sing}\,L_\Lambda[\,l\,]$ intertwining the action of operators $L_f^*$, $f \in A_P$, with the action of operators $\psi_{PL}(f) \in \mathrm{End}(\mathrm{Sing}\,L_\Lambda[\,l\,])$, see Theorem 11.2.1. Using the isomorphism, we show that eigenvectors of the action of the algebra $A_L$ on $\mathrm{Sing}\,L_\Lambda[\,l\,]$ are in a one-to-one correspondence with certain second-order linear difference equations with two polynomial solutions of degrees $l$ and $n - l + 1$, see Corollary 11.2.3.

Section 12 contains the analogues of the previous results for the homogeneous XXX Heisenberg model.

We recapitulate the main results of this paper as three commutative diagrams. The horizontal arrows of the diagrams are isomorphisms, the downward vertical arrows are epimorphisms, and the upward vertical arrow is an embedding. The first diagram shows the algebras of functions $A_D$, $A_P$ on difference operators with respectively one or two polynomials in the kernels, the algebra $A_G$ of functions on the intersection of Schubert cycles, the Bethe algebras $A_W$ and $A_L$, associated with $\mathrm{Sing}\,W_{a,d}[\,l\,]$ and $\mathrm{Sing}\,L_\Lambda[\,l\,]$, respectively, and their homomorphisms:

\[
\begin{array}{ccccc}
 & & A_D & \xrightarrow{\ \psi_{DW}\ } & A_W \\
 & & \downarrow {\scriptstyle \psi_{DP}} & & \downarrow {\scriptstyle \psi_{WL}} \\
A_G & \xrightarrow{\ \psi_{GP}\ } & A_P & \xrightarrow{\ \psi_{PL}\ } & A_L
\end{array}
\]

The other two diagrams show the vector spaces involved:

\[
\begin{array}{ccc}
A_D^* & \xrightarrow{\ \tau\ } & \mathrm{Sing}\,W_{a,d}[\,l\,] \\
\uparrow {\scriptstyle (\psi_{DP})^*} & & \downarrow {\scriptstyle \sigma} \\
A_P^* & \xrightarrow{\ \theta\ } & \mathrm{Sing}\,L_\Lambda[\,l\,]
\end{array}
\qquad\qquad
\begin{array}{ccc}
A_D & \xrightarrow{\ \tau\phi\ } & \mathrm{Sing}\,W_{a,d}[\,l\,] \\
\downarrow {\scriptstyle \psi_{DP}} & & \downarrow {\scriptstyle \sigma} \\
A_P & \xrightarrow{\ \zeta\ } & \mathrm{Sing}\,L_\Lambda[\,l\,]
\end{array}
\]

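Each vector space on these diagrams is a module over the corresponding algebra on the first diagram, and all linear maps are consistent with the algebra homomorphisms.

2. Yangian $Y(\mathfrak{gl}_2)$ and Yangian Modules

2.1. Lie algebra $\mathfrak{gl}_2$. Let $e_{ab}$, $a, b = 1, 2$, be the standard generators of the complex Lie algebra $\mathfrak{gl}_2$. We have $\mathfrak{gl}_2 = \mathfrak{n}_+ \oplus \mathfrak{h} \oplus \mathfrak{n}_-$, where

\[
\mathfrak{n}_+ = \mathbb{C} \cdot e_{12}, \qquad
\mathfrak{h} = \mathbb{C} \cdot e_{11} \oplus \mathbb{C} \cdot e_{22}, \qquad
\mathfrak{n}_- = \mathbb{C} \cdot e_{21}.
\]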

For a $\mathfrak{gl}_2$-weight $\Lambda \in \mathfrak{h}^*$, we denote by $M_\Lambda$ the Verma $\mathfrak{gl}_2$-module with highest weight $\Lambda$ and by $L_\Lambda$ the irreducible $\mathfrak{gl}_2$-module with highest weight $\Lambda$.

2.1.1. Let $\Lambda = (\Lambda^{(1)}, \dots, \Lambda^{(n)})$ be a collection of $\mathfrak{gl}_2$-weights, where $\Lambda^{(i)} = (\Lambda^{(i)}_1, \Lambda^{(i)}_2)$ for $i = 1, \dots, n$. Let $l$ be a nonnegative integer. The pair $\Lambda, l$ will be called separating if $\sum_{i=1}^{n} (\Lambda^{(i)}_1 - \Lambda^{(i)}_2) - 2l + 1 + s \ne 0$ for all $s = 1, \dots, l$, cf. [MV1,MV2,MTV3]. In the following, we need the next lemma.

2.1.2. Lemma. Let $m$ be a complex number and $l$ a nonnegative integer. Let $V$ be a $\mathfrak{gl}_2$-module with weight decomposition $V = \oplus_{k=0}^{\infty} V[k]$, where $V[k] \subset V$ is a weight subspace of weight $(m - k, k)$. Assume that $m - 2l + 1 + s \ne 0$ for all $s = 1, \dots, l$. Then the map $e_{12} e_{21} : V[l - 1] \to V[l - 1]$ is an isomorphism of vector spaces.

Proof. Let $U_k = \ker e_{12}^{\,l-k}\big|_{V[l-1]}$. Clearly,

\[
V[l - 1] = U_0 \supset U_1 \supset \dots \supset U_{l-1} \supset U_l = \{0\}.
\]

Let $C = e_{11}(e_{22} + 1) - e_{12} e_{21}$. Set $P(x) = \prod_{k=0}^{l-1} (x - c_k)$, where $c_k = k\,(m - k + 1)$, and

\[
Q(x) = \frac{P(x) - P(c_l)}{x - c_l}.
\]

We have $e_{12} e_{21}|_{V[l-1]} = (c_l - C)|_{V[l-1]}$. Since $C$ is a central element, we have $(C - c_k)\,U_k \subset U_{k+1}$. Therefore, $P(C)|_{V[l-1]} = 0$, and $(c_l - C)|_{V[l-1]}\, Q(C)|_{V[l-1]} = P(c_l)$. The assumption on $m$ and $l$ implies that $P(c_l) \ne 0$. Hence, the operator $(c_l - C)|_{V[l-1]}$ is invertible. □
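The step from the assumption on $m$ and $l$ to $P(c_l) \ne 0$ is a short computation. A worked version, using only the definition $c_k = k\,(m - k + 1)$ from the proof, might read:

```latex
\begin{align*}
c_l - c_k &= l\,(m - l + 1) - k\,(m - k + 1) = (l - k)\,(m - l - k + 1), \\[4pt]
P(c_l) &= \prod_{k=0}^{l-1} (c_l - c_k)
        = \prod_{s=1}^{l} s\,(m - 2l + 1 + s), \qquad s = l - k.
\end{align*}
```

Each factor $s$ is nonzero, so $P(c_l) \ne 0$ exactly when $m - 2l + 1 + s \ne 0$ for all $s = 1, \dots, l$, which is the hypothesis of the lemma.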

2.2. Yangian. The Yangian $Y(\mathfrak{gl}_2)$ is the unital associative algebra with generators $T_{ab}^{\{s\}}$, $a, b = 1, 2$ and $s = 1, 2, \dots$. Let

\[
T_{ab}(u) = \delta_{ab} + \sum_{s=1}^{\infty} T_{ab}^{\{s\}} u^{-s}, \qquad a, b = 1, 2.
\]

Then the defining relations in $Y(\mathfrak{gl}_2)$ have the form

\[
(u - v)\,\bigl( T_{ab}(u) T_{cd}(v) - T_{cd}(v) T_{ab}(u) \bigr) = T_{cb}(v) T_{ad}(u) - T_{cb}(u) T_{ad}(v), \tag{2.1}
\]

for all $a, b, c, d$. The Yangian is a Hopf algebra with coproduct

\[
\Delta : T_{ab}(u) \mapsto \sum_{c=1}^{2} T_{cb}(u) \otimes T_{ac}(u) \tag{2.2}
\]

for all $a, b$.

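2.2.1. Proposition [KBI]. The following relations hold:

\begin{align*}
T_{11}(u)\, T_{12}(u_1) \cdots T_{12}(u_k)
 &= \prod_{i=1}^{k} \frac{u - u_i - 1}{u - u_i}\; T_{12}(u_1) \cdots T_{12}(u_k)\, T_{11}(u) \\
 &\quad + \frac{1}{(k-1)!}\, T_{12}(u) \sum_{\sigma \in S_k} \frac{1}{u - u_{\sigma_1}}
   \prod_{i=2}^{k} \frac{u_{\sigma_1} - u_{\sigma_i} - 1}{u_{\sigma_1} - u_{\sigma_i}}\;
   T_{12}(u_{\sigma_2}) \cdots T_{12}(u_{\sigma_k})\, T_{11}(u_{\sigma_1}), \\[6pt]
T_{22}(u)\, T_{12}(u_1) \cdots T_{12}(u_k)
 &= \prod_{i=1}^{k} \frac{u - u_i + 1}{u - u_i}\; T_{12}(u_1) \cdots T_{12}(u_k)\, T_{22}(u) \\
 &\quad - \frac{1}{(k-1)!}\, T_{12}(u) \sum_{\sigma \in S_k} \frac{1}{u - u_{\sigma_1}}
   \prod_{i=2}^{k} \frac{u_{\sigma_1} - u_{\sigma_i} + 1}{u_{\sigma_1} - u_{\sigma_i}}\;
   T_{12}(u_{\sigma_2}) \cdots T_{12}(u_{\sigma_k})\, T_{22}(u_{\sigma_1}).
\end{align*}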
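For $k = 1$ both relations follow directly from the defining relations (2.1). The following worked case is a sketch of that check, with the index choices spelled out as assumptions about which instance of (2.1) to use:

```latex
% (a,b,c,d) = (1,2,1,1) in (2.1), with spectral parameters (u_1, u):
\begin{align*}
(u_1 - u)\bigl( T_{12}(u_1) T_{11}(u) - T_{11}(u) T_{12}(u_1) \bigr)
  &= T_{12}(u) T_{11}(u_1) - T_{12}(u_1) T_{11}(u), \\
\text{hence}\quad
T_{11}(u)\, T_{12}(u_1)
  &= \frac{u - u_1 - 1}{u - u_1}\, T_{12}(u_1) T_{11}(u)
   + \frac{1}{u - u_1}\, T_{12}(u) T_{11}(u_1). \\[6pt]
% (a,b,c,d) = (2,2,1,2) in (2.1), with spectral parameters (u, u_1):
(u - u_1)\bigl( T_{22}(u) T_{12}(u_1) - T_{12}(u_1) T_{22}(u) \bigr)
  &= T_{12}(u_1) T_{22}(u) - T_{12}(u) T_{22}(u_1), \\
\text{hence}\quad
T_{22}(u)\, T_{12}(u_1)
  &= \frac{u - u_1 + 1}{u - u_1}\, T_{12}(u_1) T_{22}(u)
   - \frac{1}{u - u_1}\, T_{12}(u) T_{22}(u_1).
\end{align*}
```

The general case follows by induction on $k$, using that the series $T_{12}(u)$ and $T_{12}(v)$ commute, which is relation (2.1) with $(a,b,c,d) = (1,2,1,2)$.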

2.2.2. A series $f(u)$ in $u^{-1}$ is called monic if $f(u) = 1 + O(u^{-1})$. For a monic series $f(u)$, there is an automorphism

\[
\varphi_f : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2), \qquad T_{ab}(u) \mapsto f(u)\, T_{ab}(u).
\]

There is a one-parameter family of automorphisms

\[
\rho_z : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2), \qquad T_{ab}(u) \mapsto T_{ab}(u - z),
\]

where in the right-hand side, $(u - z)^{-1}$ has to be expanded as a power series in $u^{-1}$.

The Yangian $Y(\mathfrak{gl}_2)$ contains the universal enveloping algebra $U(\mathfrak{gl}_2)$ as a Hopf subalgebra. The embedding is given by the formula $e_{ab} \mapsto T_{ba}^{\{1\}}$ for all $a, b$. We identify $U(\mathfrak{gl}_2)$ with its image.

The evaluation homomorphism $\epsilon : Y(\mathfrak{gl}_2) \to U(\mathfrak{gl}_2)$ is defined by the rule: $T_{ab}^{\{1\}} \mapsto e_{ba}$ for all $a, b$, and $T_{ab}^{\{s\}} \mapsto 0$ for all $a, b$ and all $s > 1$.

We denote by $(\,\cdot\,)^{+} : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2)$ the antiinvolution defined by

\[
(T_{ab}(u))^{+} = T_{ba}(u). \tag{2.3}
\]

2.3. Bethe subalgebra. The series

\[
\mathrm{qdet}\, T(u) = T_{11}(u)\, T_{22}(u - 1) - T_{12}(u)\, T_{21}(u - 1) \tag{2.4}
\]

is called the quantum determinant. The coefficients of the series $\mathrm{qdet}\, T(u)$ belong to the center of the Yangian $Y(\mathfrak{gl}_2)$ [IK].

The series $T_{11}(u) + T_{22}(u)$ is called the transfer matrix. It is known that the coefficients of the series $T_{11}(u) + T_{22}(u)$ commute [FT]. We call the unital subalgebra $B \subset Y(\mathfrak{gl}_2)$ generated by coefficients of the series $\mathrm{qdet}\, T(u)$ and $T_{11}(u) + T_{22}(u)$ the Bethe subalgebra. The Bethe subalgebra is commutative. Elements of the Bethe subalgebra commute with elements of the subalgebra $U(\mathfrak{gl}_2)$ and are invariant under the antiinvolution (2.3).

2.4. Yangian modules.

2.4.1. Theorem [T]. Let $V$ be an irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-module. There exist a vector $v \in V$, unique up to proportionality, monic series $c_1(u), c_2(u)$, and a monic polynomial $P(u)$ such that

\[
T_{21}(u)\, v = 0, \qquad T_{aa}(u)\, v = c_a(u)\, v, \qquad a = 1, 2,
\]

and

\[
\frac{c_1(u)}{c_2(u)} = \frac{P(u + 1)}{P(u)}. \tag{2.5}
\]

The vector $v$ is called a highest weight vector, the series $c_1(u), c_2(u)$ are called the Yangian highest weights, and the polynomial $P(u)$ is called the Drinfeld polynomial of the module $V$.

2.4.2. Theorem [T]. For any monic series $c_1(u), c_2(u)$ and a monic polynomial $P(u)$ obeying relation (2.5), there exists a unique irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-module $V$ such that $c_1(u), c_2(u)$ are the Yangian highest weights of the module $V$.

2.4.3. Let $V_1, V_2$ be irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-modules with respective highest weight vectors $v_1, v_2$. Then for the $Y(\mathfrak{gl}_2)$-module $V_1 \otimes V_2$, we have

\[
T_{21}(u)\, v_1 \otimes v_2 = 0, \qquad
T_{aa}(u)\, v_1 \otimes v_2 = c_a^{(1)}(u)\, c_a^{(2)}(u)\, v_1 \otimes v_2, \qquad a = 1, 2.
\]

Let $W$ be the irreducible subquotient of $V_1 \otimes V_2$ generated by the vector $v_1 \otimes v_2$. Then the Drinfeld polynomial of the module $W$ equals the product of the Drinfeld polynomials of the modules $V_1$ and $V_2$.

2.4.4. For a $\mathfrak{gl}_2$-module $V$, let the $Y(\mathfrak{gl}_2)$-module $V(z)$ be the pullback of $V$ through the homomorphism $\epsilon \circ \rho_z$; that is, the series $T_{ab}(u)$ acts on $V(z)$ as $\delta_{ab} + (u - z)^{-1} e_{ba}$. The module $V(z)$ is called the evaluation module with evaluation point $z$.

2.4.5. Let $\Lambda = (\Lambda^{(1)}, \dots, \Lambda^{(n)})$ be a collection of integral dominant $\mathfrak{gl}_2$-weights, where $\Lambda^{(i)} = (\Lambda^{(i)}_1, \Lambda^{(i)}_2)$, $\Lambda^{(i)}_1 \ge \Lambda^{(i)}_2$, for $i = 1, \dots, n$. For generic complex numbers $z_1, \dots, z_n$, the tensor product of evaluation modules

\[
L_\Lambda(z) = L_{\Lambda^{(1)}}(z_1) \otimes \dots \otimes L_{\Lambda^{(n)}}(z_n)
\]

is an irreducible finite-dimensional $Y(\mathfrak{gl}_2)$-module and the corresponding highest weight series $c_1(u), c_2(u)$ have the form

\[
c_a(u) = \prod_{i=1}^{n} \frac{u - z_i + \Lambda^{(i)}_a}{u - z_i}.
\]

The corresponding Drinfeld polynomial equals

\[
P(u) = \prod_{i=1}^{n} \ \prod_{s = \Lambda^{(i)}_2}^{\Lambda^{(i)}_1 - 1} (u - z_i + s). \tag{2.6}
\]
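That the polynomial (2.6) satisfies relation (2.5) for the highest weight series above is a telescoping product, and it can be spot-checked with exact arithmetic. The sketch below is illustrative: the function names and the sample values of $z_i$, the weights and $u$ are arbitrary choices, not data from the paper.

```python
from fractions import Fraction as F

def highest_weight_ratio(u, z, weights):
    """c_1(u) / c_2(u) for the tensor product of evaluation modules."""
    ratio = F(1)
    for z_i, (w1, w2) in zip(z, weights):
        ratio *= F(u - z_i + w1) / F(u - z_i + w2)
    return ratio

def drinfeld(u, z, weights):
    """P(u) from (2.6): product of (u - z_i + s) over i and s = w2, ..., w1 - 1."""
    value = F(1)
    for z_i, (w1, w2) in zip(z, weights):
        for s in range(w2, w1):
            value *= F(u - z_i + s)
    return value

# Arbitrary sample data: two evaluation points and two dominant weights.
z = [F(1, 3), F(-5, 7)]
weights = [(3, 0), (2, 1)]
u = F(11, 4)

lhs = highest_weight_ratio(u, z, weights)
rhs = drinfeld(u + 1, z, weights) / drinfeld(u, z, weights)
print(lhs == rhs)
```

For each $i$ the inner product over $s$ telescopes to $(u - z_i + \Lambda^{(i)}_1)/(u - z_i + \Lambda^{(i)}_2)$, which is what the comparison confirms at the sample point.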



2.5. Holomorphic representation. The results of this section go back to [T]. Choose monic polynomials $a(u), d(u) \in \mathbb{C}[u]$ of positive degree $n$,

\[
a(u) = \prod_{i=1}^{n} (u - z_i + m_i), \qquad d(u) = \prod_{i=1}^{n} (u - z_i). \tag{2.7}
\]

2.5.1. Proposition. There exists a unique $Y(\mathfrak{gl}_2)$-action on the vector space $\mathbb{C}[x_1, \dots, x_n]$ such that

\[
(T_{12}(u) \cdot p)(x) = \frac{1}{d(u)} \sum_{i=1}^{n} u^{n-i} x_i\, p(x_1, \dots, x_n)
= \left( \frac{x_1}{u} + \frac{x_2 + x_1 \sum_{i=1}^{n} z_i}{u^2} + \dots \right) p(x_1, \dots, x_n) \tag{2.8}
\]

for any polynomial $p \in \mathbb{C}[x_1, \dots, x_n]$, and

\[
T_{11}(u) \cdot 1 = \frac{a(u)}{d(u)} \cdot 1, \qquad T_{22}(u) \cdot 1 = 1, \qquad T_{21}(u) \cdot 1 = 0, \tag{2.9}
\]

where $1$ stands for the constant polynomial equal to 1 as an element of $\mathbb{C}[x_1, \dots, x_n]$.

We denote by $W_{a,d}$ the $Y(\mathfrak{gl}_2)$-module defined by formulae (2.8), (2.9) and call it the holomorphic representation of $Y(\mathfrak{gl}_2)$ associated with the polynomials $a(u), d(u)$.

The Yangian module $W_{a,d}$ is cyclic: every element of $W_{a,d}$ can be obtained from $1$ by the action of a suitable polynomial in $T_{12}^{\{1\}}, T_{12}^{\{2\}}, \dots$. Formulae (2.9) mean that $1$ is an eigenvector of the operators $T_{11}^{\{s\}}, T_{22}^{\{s\}}$ and that $1$ is annihilated by the operators $T_{21}^{\{s\}}$ with $s = 1, 2, \dots$. Then the Yangian commutation relations (2.1) allow us to determine the action of $T_{11}^{\{s\}}, T_{22}^{\{s\}}, T_{21}^{\{s\}}$ on all elements of $W_{a,d}$.

Since the coefficients of the series $\mathrm{qdet}\, T(u)$ are central, and the module $W_{a,d}$ is generated by the polynomial $1$, we have

\[
\mathrm{qdet}\, T(u)\big|_{W_{a,d}} = \frac{a(u)}{d(u)}. \tag{2.10}
\]

For every $i, j = 1, 2$, we have

\[
T_{ij}(u)\big|_{W_{a,d}} = \frac{\tilde T_{ij}(u)}{d(u)}, \tag{2.11}
\]

where $\tilde T_{ij}(u)$ is an $\mathrm{End}\,(W_{a,d})$-valued polynomial in $u$ of degree $n$ for $i = j$, and of degree $n - 1$ for $i \ne j$.

2.5.2. The embedding $U(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2)$ defines a $\mathfrak{gl}_2$-module structure on $W_{a,d}$. The $\mathfrak{gl}_2$-weight decomposition of $W_{a,d}$ is the degree decomposition $W_{a,d} = \oplus_{l=0}^{\infty} W_{a,d}[\,l\,]$ into subspaces of homogeneous polynomials. The subspace $W_{a,d}[\,l\,]$ of homogeneous polynomials of degree $l$ has $\mathfrak{gl}_2$-weight $\bigl( \sum_{i=1}^{n} m_i - l,\ l \bigr)$.



2.5.3. Lemma. Let $\mathrm{Sing}\,W_{a,d}[\,l\,] = \{\, p \in W_{a,d}[\,l\,] \mid e_{12}\, p = 0 \,\}$ be the subspace of $\mathfrak{gl}_2$-singular vectors. Assume that the pair $((m_1, 0), \dots, (m_n, 0)),\ l$ is separating. Then

\[
\dim \mathrm{Sing}\,W_{a,d}[\,l\,] = \dim W_{a,d}[\,l\,] - \dim W_{a,d}[\,l - 1\,].
\]

Proof. The map $e_{12} e_{21} : W_{a,d}[\,l - 1\,] \to W_{a,d}[\,l - 1\,]$ is an isomorphism of vector spaces since the pair $((m_1, 0), \dots, (m_n, 0)),\ l$ is separating, see Lemma 2.1.2. In particular, the map $e_{12} : W_{a,d}[\,l\,] \to W_{a,d}[\,l - 1\,]$ is surjective, and its kernel is $\mathrm{Sing}\,W_{a,d}[\,l\,]$, which implies the lemma. □
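Since $W_{a,d}[\,l\,]$ is the space of homogeneous polynomials of degree $l$ in $n$ variables, the lemma yields an explicit count. A minimal sketch of that count, with hypothetical function names and assuming the separating condition holds:

```python
from math import comb

def dim_weight_space(n, l):
    """dim W_{a,d}[l]: homogeneous polynomials of degree l in n variables."""
    return comb(n + l - 1, l) if l >= 0 else 0

def dim_singular(n, l):
    """dim Sing W_{a,d}[l] as given by Lemma 2.5.3 (separating case)."""
    return dim_weight_space(n, l) - dim_weight_space(n, l - 1)

for n, l in [(3, 0), (3, 1), (4, 2)]:
    print(n, l, dim_weight_space(n, l), dim_singular(n, l))
```

The count uses only the standard formula $\binom{n+l-1}{l}$ for the number of monomials of degree $l$ in $n$ variables. For example, $n = 4$ and $l = 2$ give $10 - 4 = 6$.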

2.5.4. Denote by $(\,\cdot\,)^{+} : Y(\mathfrak{gl}_2) \to Y(\mathfrak{gl}_2)$ the antiinvolution defined by $T_{ij}^{+}(u) = T_{ji}(u)$. Denote by $\phi : W_{a,d} \to \mathbb{C}$ the linear function $p(x_1, \dots, x_n) \mapsto p(0, \dots, 0)$. The Yangian Shapovalov form on $W_{a,d}$ is the unique symmetric bilinear form $S$ on $W_{a,d}$ defined by the formula

\[
S(x \cdot 1,\ y \cdot 1) = \phi(x^{+} y \cdot 1)
\]

for all $x, y \in Y(\mathfrak{gl}_2)$. Different $\mathfrak{gl}_2$-weight subspaces of $W_{a,d}$ are orthogonal with respect to the form $S$, and

\[
\det S\big|_{W_{a,d}[\,l\,]} = \mathrm{const} \prod_{i,j=1}^{n} \prod_{s=0}^{l-1} (z_i - z_j + m_j - s)^{\binom{n+l-s-2}{n-1}},
\]

where the constant does not depend on $z_1, \dots, z_n$, $m_1, \dots, m_n$.

2.6. The kernel $K \subset W_{a,d}$ of the Yangian Shapovalov form is a $Y(\mathfrak{gl}_2)$-submodule. The $Y(\mathfrak{gl}_2)$-module $W_{a,d}/K$ is irreducible. The Yangian Shapovalov form on $W_{a,d}$ induces a nondegenerate symmetric bilinear form on $W_{a,d}/K$ called the Yangian Shapovalov form of the module $W_{a,d}/K$.

2.6.1. Theorem [T]. For generic $z_1, \dots, z_n$, $m_1, \dots, m_n$, the $Y(\mathfrak{gl}_2)$-module $W_{a,d}$ is irreducible and isomorphic to the tensor product of evaluation Verma modules $M_{(m_1,0)}(z_1) \otimes \dots \otimes M_{(m_n,0)}(z_n)$. Any such isomorphism sends $1$ to a scalar multiple of the tensor product $v_{(m_1,0)} \otimes \dots \otimes v_{(m_n,0)}$ of highest weight vectors of the corresponding Verma modules.

2.6.2. Theorem [T]. Let $m_i \in \mathbb{Z}_{\ge 0}$ for $i = 1, \dots, n$, and $m_1 \le m_2 \le \dots \le m_n$. Assume that $z_i - z_j + m_j - s \ne 0$ and $z_i - z_j - 1 - s \ne 0$ for all $i < j$ and $s = 0, 1, \dots, m_i - 1$. Then for any permutation $\sigma \in S_n$, the irreducible $Y(\mathfrak{gl}_2)$-module $W_{a,d}/K$ is isomorphic to the tensor product of evaluation irreducible modules $L_{(m_{\sigma_1},0)}(z_{\sigma_1}) \otimes \dots \otimes L_{(m_{\sigma_n},0)}(z_{\sigma_n})$. Any such isomorphism sends the element corresponding to $1$ to a scalar multiple of the tensor product $v_{(m_{\sigma_1},0)} \otimes \dots \otimes v_{(m_{\sigma_n},0)}$ of highest weight vectors of the corresponding irreducible modules.

For a proof of this theorem see also [CP].

2.6.3. The assumption of Theorem 2.6.2 can be formulated geometrically as the assumption that for $i < j$ the sets $Z_i = \{z_i, z_i - 1, \dots, z_i - m_i\}$ and $Z_j = \{z_j, z_j - 1, \dots, z_j - m_j\}$ either do not intersect, or the smaller set $Z_i$ is a subset of the larger set $Z_j$ (since we assumed that $m_i \le m_j$).



3. Algebra $A_W$ and Universal Difference Operator

3.1. Definition. Let $V$ be a $Y(gl_2)$-module. We call the image of the Bethe algebra $B \subset Y(gl_2)$ in $\mathrm{End}(V)$ the Bethe algebra associated with $V$. If $U \subset V$ is a vector subspace preserved by elements of the Bethe algebra $B_V$, then their restrictions to $U$ define a commutative unital subalgebra $B_U \subset \mathrm{End}(U)$ called the Bethe algebra associated with $U$.

3.1.1. Define the operator $\vartheta$ acting on functions of $u$ as $(\vartheta f)(u) = f(u + 1)$. Let $V$ be a $Y(gl_2)$-module such that for all $a, b$ the series $T_{ab}(u)|_V$ sum up to $\mathrm{End}(V)$-valued rational functions in $u$. Let $U \subset V$ be a vector subspace preserved by the Bethe algebra $B_V$. The universal difference operator $D_U$ acting on $U$-valued functions in $u$ is defined by the formula

$$D_U = 1 - \bigl(T_{11}(u) + T_{22}(u)\bigr)\big|_U\, \vartheta^{-1} + \bigl(\mathrm{qdet}\, T(u)\bigr)\big|_U\, \vartheta^{-2},$$

see [Tal, MTV1, (4.16)], [MTV2]. The operator $D_U$ is a linear second-order difference operator.

3.2. Algebra $A_W$. Operator $D_{\mathrm{Sing}\,W_{a,d}[l]}$. Consider the Bethe algebra $B_{W_{a,d}}$ associated with the $Y(gl_2)$-module $W_{a,d}$. Recall that

$$\bigl(\mathrm{qdet}\, T(u)\bigr)\big|_{W_{a,d}} = \frac{a(u)}{d(u)},$$

see (2.10), and

$$\bigl(T_{11}(u) + T_{22}(u)\bigr)\big|_{W_{a,d}} = \frac{\tilde B(u, \tilde H)}{d(u)},$$

where

$$\tilde B(u, \tilde H) = \tilde H_0 u^n + \tilde H_1 u^{n-1} + \dots + \tilde H_n \tag{3.1}$$

for suitable coefficients $\tilde H_k \in \mathrm{End}(W_{a,d})$, see (2.11). It follows from Proposition 2.2.1 that the coefficients $\tilde H_0, \tilde H_1$ are scalar operators, $\tilde H_0 = 2$, $\tilde H_1 = \sum_{i=1}^{n} (m_i - 2z_i)$. The elements $\tilde H_k$ are called the XXX Hamiltonians associated with $W_{a,d}$.

3.2.1. The Hamiltonians $\tilde H_k$ preserve the subspace $\mathrm{Sing}\,W_{a,d}[l]$ defined in Sect. 2.5.3. Set $H_k = \tilde H_k|_{\mathrm{Sing}\,W_{a,d}[l]} \in \mathrm{End}(\mathrm{Sing}\,W_{a,d}[l])$ and $B(u, H) = H_0 u^n + H_1 u^{n-1} + \dots + H_n$. The coefficients $H_0, H_1, H_2$ are scalar operators,

$$H_0 = 2, \qquad H_1 = \sum_{i=1}^{n} (m_i - 2z_i),$$

$$H_2 = l\Bigl(l - 1 - \sum_{i=1}^{n} m_i\Bigr) + \sum_{1\le i<j\le n} \bigl(z_i z_j + (z_i - m_i)(z_j - m_j)\bigr).$$



The simplest way to get the last formula is to extract $H_2$ from the coefficient of $u^{-2}$ of the series $(\mathrm{qdet}\, T(u))|_{\mathrm{Sing}\,W_{a,d}[l]}$, see (2.4), and to use formula (2.10).

We denote by $A_W$ the Bethe algebra associated with $\mathrm{Sing}\,W_{a,d}[l]$. It is the unital subalgebra of $\mathrm{End}(\mathrm{Sing}\,W_{a,d}[l])$ generated by the operators $H_3, H_4,\dots, H_n$, called the XXX Hamiltonians associated with $\mathrm{Sing}\,W_{a,d}[l]$.

3.2.2. The operators of the algebra $A_W$ are symmetric with respect to the Yangian Shapovalov form on $W_{a,d}$, $S(f v, w) = S(v, f w)$ for all $f \in A_W$ and $v, w \in W_{a,d}$, see [MTV1].

3.3. Operator $\bar D_{\mathrm{Sing}\,W_{a,d}[l]}$. Consider the universal difference operator $D_{\mathrm{Sing}\,W_{a,d}[l]}$ acting on $\mathrm{Sing}\,W_{a,d}[l]$-valued functions,

$$D_{\mathrm{Sing}\,W_{a,d}[l]} = 1 - \frac{B(u, H)}{d(u)}\,\vartheta^{-1} + \frac{a(u)}{d(u)}\,\vartheta^{-2}.$$

The modified universal difference operator $\bar D_{\mathrm{Sing}\,W_{a,d}[l]}$ is defined by the formula $\bar D_{\mathrm{Sing}\,W_{a,d}[l]} = d(u)\, D_{\mathrm{Sing}\,W_{a,d}[l]}$. Then

$$\bar D_{\mathrm{Sing}\,W_{a,d}[l]} = d(u) - B(u, H)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}.$$

3.3.1. Theorem. Assume that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Then for any $v_0 \in \mathrm{Sing}\,W_{a,d}[l]$ there exist unique $v_1,\dots,v_l \in \mathrm{Sing}\,W_{a,d}[l]$ such that the function $w(u) = v_0 u^l + v_1 u^{l-1} + \dots + v_l$ is a solution of the difference equation $\bar D_{\mathrm{Sing}\,W_{a,d}[l]}\, w(u) = 0$.

Proof. By Lemma 2.5.3 the dimension of $\mathrm{Sing}\,W_{a,d}[l]$ does not depend on $z_1,\dots,z_n$, $m_1,\dots,m_n$, if the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Because of that, we may consider the difference equation $\bar D_{\mathrm{Sing}\,W_{a,d}[l]}\, v(u) = 0$ as a difference equation on a fixed vector space whose coefficients depend algebraically on the parameters $z_1,\dots,z_n$, $m_1,\dots,m_n$. Given a vector $v_0 \in \mathrm{Sing}\,W_{a,d}[l]$, we look for a solution of the difference equation $\bar D_{\mathrm{Sing}\,W_{a,d}[l]}\, v(u) = 0$ in the form $v_0 u^l + \sum_{j=1}^{\infty} v_j u^{l-j}$. Substituting this expression into the equation, we can calculate all of the coefficients $v_j$ recursively, and they are algebraic functions of $z_1,\dots,z_n$, $m_1,\dots,m_n$. For generic $z_1,\dots,z_n$ and large positive integral $m_1,\dots,m_n$, the coefficients $v_j$ are equal to zero for all $j > l$ by Theorem 2.6.2 and [MTV3, Theorem 7.3]. Hence, the same coefficients are equal to zero for all $z_1,\dots,z_n$, $m_1,\dots,m_n$ such that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating.



4. Algebra $A_D$

4.1. Definition. From now on until the end of Sect. 11 we fix complex numbers $z_1,\dots,z_n$, $m_1,\dots,m_n$, and a nonnegative integer $l$. We always assume that the polynomials $a(u)$ and $d(u)$ are given by formulae (2.7). Let $a = (a_1,\dots,a_l)$ and $h = (h_1,\dots,h_n)$. Consider the space $\mathbb{C}^{l+n}$ with coordinates $a, h$. Let $D$ be the affine subspace of $\mathbb{C}^{l+n}$ defined by equations $q_1(h) = 0$, $q_2(h) = 0$, where

$$q_1(h) = h_1 - \sum_{i=1}^{n} (m_i - 2z_i),$$

$$q_2(h) = h_2 - l\Bigl(l - 1 - \sum_{i=1}^{n} m_i\Bigr) - \sum_{1\le i<j\le n} \bigl(z_i z_j + (z_i - m_i)(z_j - m_j)\bigr).$$

Let

$$p(u, a) = u^l + a_1 u^{l-1} + \dots + a_l, \qquad B(u, h) = 2u^n + h_1 u^{n-1} + \dots + h_n,$$

$$D_h = d(u) - B(u, h)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}. \tag{4.1}$$

If $h$ satisfies the equations $q_1(h) = 0$ and $q_2(h) = 0$, then the polynomial $D_h(p(u, a))$ is a polynomial in $u$ of degree $l + n - 3$,

$$D_h(p(u, a)) = q_3(a, h)\, u^{l+n-3} + \dots + q_{l+n}(a, h).$$

The coefficients $q_i(a, h)$ are linear functions in $a$ and linear functions in $h$. Denote by $I_D$ the ideal in $\mathbb{C}[a, h]$ generated by polynomials $q_1, q_2, q_3,\dots, q_{l+n}$. The ideal $I_D$ defines a scheme $C_D \subset D$. Then $A_D = \mathbb{C}[a, h]/I_D$ is the algebra of functions on $C_D$. The scheme $C_D$ is the scheme of points $p \in D$ such that the polynomial $p(u, a(p))$ solves the difference equation $D_{h(p)}\, w(u) = 0$.

4.2. Independence of dimension of $A_D$ on $z_1,\dots,z_n$. For fixed $m_1,\dots,m_n$, the scheme $C_D$ and the algebra $A_D$ depend on the choice of numbers $z = (z_1,\dots,z_n)$: $C_D = C_D(z)$, $A_D = A_D(z)$.

4.2.1. Theorem. Assume that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Then the dimension of $A_D(z)$, considered as a vector space, is finite and does not depend on the choice of numbers $z_1,\dots,z_n$.

Proof. It suffices to prove two facts:

(i) For any $z$, there are no algebraic curves over $\mathbb{C}$ lying in $C_D(z)$.

(ii) Let a sequence $z^{(i)}$, $i = 1, 2,\dots$, tend to a finite limit $z = (z_1,\dots,z_n)$. Let $p^{(i)} \in C_D(z^{(i)})$, $i = 1, 2,\dots$, be a sequence of points. Then all coordinates $a(p^{(i)})$, $h(p^{(i)})$ remain bounded as $i$ tends to infinity.



By fact (i), the dimension of A D (z) is finite for any z, whereas fact (ii) implies that dim A D (z) does not depend on z 1 , . . . , z n . For a point p in C D (z), the operator Dh( p) has the form d(u) − (2u n + h 1 ( p)u n−1 + h 2 ( p)u n−2 + h 3 ( p)u n−3 + · · · + h n ( p)) ϑ −1 + a(u) ϑ −2 , where the coefficients h 1 ( p) , h 2 ( p) are determined by the equations q1 (h) = 0 and q2 (h) = 0. Assume that (i) is not true. Since any affine algebraic curve over C is unbounded, there exists a sequence of points p(i) ∈ C D (z), i = 1, 2, . . . , which tends to infinity as i tends to infinity. Then it is easy to see that h( p(i) ) cannot tend to infinity since it would contradict the fact that Dh( p(i) ) p(u, a( p(i) )) = 0. Choosing a subsequence, we may assume that h( p(i) ) has a finite limit as i tends to infinity. Then a( p(i) ) cannot tend to infinity since it would mean that the limiting difference equation has a polynomial solution of degree less than l, and this is impossible. This reasoning implies that p(i) ∈ C D (z) cannot tend to infinity. Thus we get a contradiction and statement (i) is proved. The proof of statement (ii) is similar.

4.3. Second description of $A_D$ and epimorphism $\psi_{DW} : A_D \to A_W$.

4.3.1. Theorem. Assume that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Assume that $h$ satisfies equations $q_1(h) = 0$ and $q_2(h) = 0$. Consider the system

$$q_i(a, h) = 0, \qquad i = 3,\dots, l + 2, \tag{4.2}$$

as a system of linear equations with respect to $a_1,\dots,a_l$. Then this system has a unique solution $a_i = a_i(h)$, $i = 1,\dots,l$, where $a_i(h)$ are polynomials in $h$.

Proof. The claim follows from the fact that

$$q_{2+i}(a, h) = i\Bigl(\sum_{s=1}^{n} m_s - 2l + i + 1\Bigr) a_i + \sum_{j=1}^{i-1} q_{ij}(h)\, a_j$$

for $i = 1,\dots,l$. Here $q_{ij}$ are some linear functions of $h$. The coefficient of $a_i$ does not vanish since the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating.
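The diagonal coefficient in the proof of Theorem 4.3.1 can be checked numerically. The sketch below is only an illustration, not part of the paper: it assumes that formulae (2.7) read $d(u) = \prod_s (u - z_s)$ and $a(u) = \prod_s (u - z_s + m_s)$ (the reading consistent with the expansion of $a(u) + d(u)$ used in Sect. 6), it assumes $n \ge 2$, and all function names are hypothetical. It applies $D_h$ to the monomial $u^{l-i}$ with $h_1, h_2$ fixed by $q_1 = q_2 = 0$ and reads off the coefficient of $u^{l+n-2-i}$, which is the coefficient of $a_i$ in $q_{2+i}$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials stored as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_add(p, q):
    """Sum of two coefficient lists, padding the shorter one with zeros."""
    size = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (size - len(p))
    q = list(q) + [Fraction(0)] * (size - len(q))
    return [x + y for x, y in zip(p, q)]


def monomial_shifted(k, c):
    """Coefficients of (u - c)^k."""
    out = [Fraction(1)]
    for _ in range(k):
        out = poly_mul(out, [Fraction(-c), Fraction(1)])
    return out


def prod_linear(roots):
    """Coefficients of the monic polynomial with the given roots."""
    out = [Fraction(1)]
    for r in roots:
        out = poly_mul(out, [-Fraction(r), Fraction(1)])
    return out


def image_of_monomial(z, m, l, k):
    """Coefficients of D_h(u^k); h_1, h_2 solve q_1 = q_2 = 0, h_3..h_n are set to 0."""
    n = len(z)
    d = prod_linear(z)                                   # assumed d(u) = prod (u - z_s)
    a = prod_linear([zs - ms for zs, ms in zip(z, m)])   # assumed a(u) = prod (u - z_s + m_s)
    h1 = sum(ms - 2 * zs for zs, ms in zip(z, m))
    h2 = l * (l - 1 - sum(m)) + sum(
        z[i] * z[j] + (z[i] - m[i]) * (z[j] - m[j])
        for i in range(n) for j in range(i + 1, n))
    minus_b = [Fraction(0)] * (n - 2) + [-Fraction(h2), -Fraction(h1), Fraction(-2)]
    # D_h(u^k) = d(u) u^k - B(u) (u - 1)^k + a(u) (u - 2)^k
    return poly_add(
        poly_mul(d, monomial_shifted(k, 0)),
        poly_add(poly_mul(minus_b, monomial_shifted(k, 1)),
                 poly_mul(a, monomial_shifted(k, 2))))


def diagonal_coefficient(z, m, l, i):
    """Coefficient of u^(l+n-2-i) in D_h(u^(l-i)), i.e. the coefficient of a_i in q_(2+i)."""
    n, k = len(z), l - i
    return image_of_monomial(z, m, l, k)[n + k - 2]
```

The two top coefficients of $D_h(u^k)$ vanish identically, which is why $D_h(p(u,a))$ drops to degree $l + n - 3$ once $q_1 = q_2 = 0$ holds; the next coefficient reproduces $i\,(\sum_s m_s - 2l + i + 1)$.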

4.3.2. Denote by $I'_D$ the ideal in $\mathbb{C}[h]$ generated by the polynomials $q_1(h)$, $q_2(h)$, $q_j(a(h), h)$, $j = l + 3,\dots, l + n$. The ideal $I'_D$ defines a scheme $C'_D$ in the space $\mathbb{C}^n$ with coordinates $h = (h_1,\dots,h_n)$. The scheme $C'_D$ is the scheme of points $r \in \mathbb{C}^n$ such that the difference equation $D_{h(r)}\, w(u) = 0$ has a polynomial solution of degree $l$. Theorem 4.3.1 implies that

$$A_D \cong \mathbb{C}[h]/I'_D. \tag{4.3}$$

Let $H_1,\dots,H_n$ be the operators introduced in Sect. 3.2.1.



4.3.3. Theorem. Assume that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Then the assignment $h_s \mapsto H_s$, $s = 1,\dots,n$, determines an algebra epimorphism $\psi_{DW} : A_D \to A_W$.

Proof. We use description (4.3) of the algebra $A_D$. The equations defining the scheme $C'_D$ of Sect. 4.3.2 are the equations of existence of a polynomial solution of degree $l$ to the polynomial difference equation $D_h\, w(u) = 0$. The operators $H_1,\dots,H_n$ satisfy the defining equations for $C'_D$ by Theorem 3.3.1.

5. Bethe Ansatz Equations

5.1. Bethe ansatz equations. The Bethe ansatz equations are the following system of equations with respect to complex numbers $t = (t_1,\dots,t_l)$:

$$\prod_{s=1}^{n} (t_j - z_s + 1 + m_s) \prod_{k\ne j} (t_j - t_k - 1) = \prod_{s=1}^{n} (t_j - z_s + 1) \prod_{k\ne j} (t_j - t_k + 1), \qquad j = 1,\dots,l. \tag{5.1}$$

A solution $t$ is called admissible if all $t_1,\dots,t_l$ are distinct, and all factors in (5.1) are nonzero. The permutation group $S_l$ acts on admissible solutions: if $t = (t_1,\dots,t_l)$ is an admissible solution, then any permutation of these numbers is an admissible solution too. We shall consider $S_l$-orbits of admissible solutions. The following lemma is well known, see for example Lemma 2.2 in [MV2].

5.1.1. Lemma. Let $t$ be an admissible solution of system (5.1). Let

$$p(u) = \prod_{i=1}^{l} (u - t_i), \qquad B(u) = \frac{d(u)\, p(u) + a(u)\, p(u - 2)}{p(u - 1)}.$$

Then $B(u)$ is a polynomial of degree $n$ and $p(u)$ is annihilated by the difference operator $d(u) - B(u)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$.
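A small worked instance of Lemma 5.1.1 may help. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $d(u) = \prod_s (u - z_s)$ and $a(u) = \prod_s (u - z_s + m_s)$ as the reading of (2.7), restricts to $l = 1$, $n = 2$, where (5.1) is linear in $t$, and uses hypothetical helper names. It solves for $t$ exactly and divides $d(u)p(u) + a(u)p(u-2)$ by $p(u-1)$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials stored as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_add(p, q):
    """Sum of two coefficient lists, padding the shorter one with zeros."""
    size = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (size - len(p))
    q = list(q) + [Fraction(0)] * (size - len(q))
    return [x + y for x, y in zip(p, q)]


def prod_linear(roots):
    """Coefficients of the monic polynomial with the given roots."""
    out = [Fraction(1)]
    for r in roots:
        out = poly_mul(out, [-Fraction(r), Fraction(1)])
    return out


def poly_divmod(num, den):
    """Polynomial long division; returns (quotient, remainder) as coefficient lists."""
    num = list(num)
    quot = [Fraction(0)] * (len(num) - len(den) + 1)
    for pos in range(len(num) - len(den), -1, -1):
        c = num[pos + len(den) - 1] / den[-1]
        quot[pos] = c
        for i, dc in enumerate(den):
            num[pos + i] -= c * dc
    return quot, num[:len(den) - 1]


def bethe_root_l1_n2(z, m):
    """Solve (5.1) for l = 1, n = 2; the quadratic terms cancel and x = t + 1 is rational."""
    (z1, z2), (m1, m2) = z, m
    x = Fraction(m2 * z1 + m1 * z2 - m1 * m2) / (m1 + m2)
    return x - 1


def b_polynomial(z, m, roots):
    """Return (B, remainder) for B(u) = (d p + a p(u-2)) / p(u-1), p(u) = prod (u - t_i)."""
    d = prod_linear(z)                                   # assumed d(u) = prod (u - z_s)
    a = prod_linear([zs - ms for zs, ms in zip(z, m)])   # assumed a(u) = prod (u - z_s + m_s)
    p = prod_linear(roots)
    p_minus_1 = prod_linear([t + 1 for t in roots])      # p(u - 1) has roots t_i + 1
    p_minus_2 = prod_linear([t + 2 for t in roots])      # p(u - 2) has roots t_i + 2
    numerator = poly_add(poly_mul(d, p), poly_mul(a, p_minus_2))
    return poly_divmod(numerator, p_minus_1)
```

For instance, with $z = (0, 1/2)$ and $m = (1, 2)$ the root is $t = -3/2$, the remainder vanishes, and the quotient is $B(u) = 2u^2 + 2u - 3/2$, whose coefficients agree with $h_1 = \sum_i (m_i - 2z_i)$ and with the value of $h_2$ forced by $q_2(h) = 0$ in Sect. 4.1.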

5.1.2. Corollary. Any Sl -orbit of admissible solutions of the Bethe ansatz equations gives a point of the scheme C D considered as a set. Moreover, different Sl -orbits give different points. 5.1.3. Theorem. Assume that the pair ((m 1 , 0), . . . , (m n , 0)) , l is separating. Then for generic z 1 , . . . , z n the Bethe ansatz equations have at least dim Sing Wa,d [l ] distinct Sl -orbits of admissible solutions.



5.1.4. Corollary. Assume that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Then for generic $z_1,\dots,z_n$ the scheme $C_D$ considered as a set has at least $\dim \mathrm{Sing}\,W_{a,d}[l]$ distinct points.

Proof of Theorem 5.1.3. Make the change of variables: $z_s = \hat z_s/\varepsilon$, $s = 1,\dots,n$, and $t_i = \hat t_i/\varepsilon$, $i = 1,\dots,l$. Then Eqs. (5.1) take the form

$$\prod_{s=1}^{n} \frac{\hat t_j - \hat z_s + \varepsilon + m_s \varepsilon}{\hat t_j - \hat z_s + \varepsilon}\ \prod_{k\ne j} \frac{\hat t_j - \hat t_k - \varepsilon}{\hat t_j - \hat t_k + \varepsilon} = 1, \qquad j = 1,\dots,l. \tag{5.2}$$

As $\varepsilon$ tends to zero, Eqs. (5.2) take the form

$$\sum_{s=1}^{n} \frac{m_s}{\hat t_j - \hat z_s} - \sum_{k\ne j} \frac{2}{\hat t_j - \hat t_k} = O(\varepsilon), \qquad j = 1,\dots,l,$$

and in the limit we obtain

$$\sum_{s=1}^{n} \frac{m_s}{\hat t_j - \hat z_s} - \sum_{k\ne j} \frac{2}{\hat t_j - \hat t_k} = 0, \qquad j = 1,\dots,l. \tag{5.3}$$

The last system is the system of the Bethe ansatz equations for the Gaudin model. It was proved in [RV] that if the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating and $\hat z_1,\dots,\hat z_n$ are generic, then system (5.3) has at least $\dim \mathrm{Sing}\,W_{a,d}[l]$ distinct $S_l$-orbits of admissible solutions. This proves Theorem 5.1.3.
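The passage from (5.2) to (5.3) amounts to a first-order expansion in $\varepsilon$. The following sketch spells it out, assuming that the differences $\hat t_j - \hat z_s$ and $\hat t_j - \hat t_k$ stay away from zero as $\varepsilon \to 0$:

```latex
\begin{align*}
\frac{\hat t_j-\hat z_s+\varepsilon+m_s\varepsilon}{\hat t_j-\hat z_s+\varepsilon}
  &= 1+\frac{m_s\,\varepsilon}{\hat t_j-\hat z_s}+O(\varepsilon^2),\\
\frac{\hat t_j-\hat t_k-\varepsilon}{\hat t_j-\hat t_k+\varepsilon}
  &= 1-\frac{2\,\varepsilon}{\hat t_j-\hat t_k}+O(\varepsilon^2),\\
\prod_{s=1}^{n}\frac{\hat t_j-\hat z_s+\varepsilon+m_s\varepsilon}{\hat t_j-\hat z_s+\varepsilon}
\prod_{k\neq j}\frac{\hat t_j-\hat t_k-\varepsilon}{\hat t_j-\hat t_k+\varepsilon}
  &= 1+\varepsilon\Bigl(\sum_{s=1}^{n}\frac{m_s}{\hat t_j-\hat z_s}
     -\sum_{k\neq j}\frac{2}{\hat t_j-\hat t_k}\Bigr)+O(\varepsilon^2).
\end{align*}
```

Setting the product equal to 1, subtracting 1 and dividing by $\varepsilon$ leaves the bracket equal to $O(\varepsilon)$, which is the intermediate form of the equations.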

6. Separation of Variables

6.1. Change of variables. For a nonnegative integer $l$ let $\mathbb{C}_l[y_1,\dots,y_{n-1}]^{\mathrm{Sym}}$ be the vector space of symmetric polynomials in $y_1,\dots,y_{n-1}$ of degree not greater than $l$ with respect to each variable. Let

$$\mathcal{W}_{a,d}[l] = y_0^{\,l}\ \mathbb{C}_l[y_1,\dots,y_{n-1}]^{\mathrm{Sym}} \subset \mathbb{C}[y_0, y_1,\dots,y_{n-1}]$$

and set $\mathcal{W}_{a,d} = \oplus_{l=0}^{\infty} \mathcal{W}_{a,d}[l]$. Define an isomorphism of vector spaces

$$W_{a,d} \cong \mathcal{W}_{a,d} \tag{6.1}$$

using the formula

$$\sum_{i=1}^{n} x_i u^{n-i} = y_0 \prod_{j=1}^{n-1} (u - y_j),$$

that is, by setting $x_i = (-1)^{i-1} y_0\, \sigma_{i-1}(y_1,\dots,y_{n-1})$, where $\sigma_{i-1}$ is the $(i-1)$st elementary symmetric function. For example, for $n = 2$ we have $x_1 u + x_2 = y_0 (u - y_1)$ and $x_1 = y_0$, $x_2 = -y_0 y_1$.
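The change of variables can be tried out directly. The sketch below (function names are hypothetical, introduced only for this illustration) computes $x_i = (-1)^{i-1} y_0\, \sigma_{i-1}(y)$ and compares both sides of the defining formula at sample values of $u$.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def elementary_symmetric(k, ys):
    """sigma_k(ys); sigma_0 = 1 because the empty product equals 1."""
    return sum(prod(c) for c in combinations(ys, k))


def x_from_y(y0, ys):
    """x_i = (-1)^(i-1) * y0 * sigma_(i-1)(y_1, ..., y_(n-1)) for i = 1, ..., n."""
    n = len(ys) + 1
    return [(-1) ** (i - 1) * y0 * elementary_symmetric(i - 1, ys)
            for i in range(1, n + 1)]


def eval_lhs(xs, u):
    """Evaluate sum_i x_i u^(n-i)."""
    n = len(xs)
    return sum(x * u ** (n - i) for i, x in enumerate(xs, start=1))


def eval_rhs(y0, ys, u):
    """Evaluate y0 * prod_j (u - y_j)."""
    return y0 * prod(u - y for y in ys)
```

For $n = 2$ the call `x_from_y(3, [5])` gives `[3, -15]`, matching $x_1 = y_0$, $x_2 = -y_0 y_1$ from the example in the text.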



We will identify the spaces $W_{a,d}$ and $\mathcal{W}_{a,d}$ using isomorphism (6.1). In particular, this defines a $Y(gl_2)$-module structure on $\mathcal{W}_{a,d}$. We denote by $\mathrm{Sing}\,\mathcal{W}_{a,d}[l] \subset \mathcal{W}_{a,d}[l]$ the subspace of $gl_2$-singular vectors. Isomorphism (6.1) defines on $\mathcal{W}_{a,d}$ and its subspaces the operators which were previously defined on $W_{a,d}$ and its subspaces. Those new operators will be denoted by the same symbols. In particular, we shall consider the action of operators $\tilde T_{ij}(u)$ and $\tilde H_0,\dots,\tilde H_n$ on $\mathcal{W}_{a,d}$.

6.2. Sklyanin's theorem.

6.2.1. Theorem [Sk]. The action of $e_{11}$, $e_{22}$, $\tilde T_{11}(u)$, $\tilde T_{22}(u)$ on $\mathcal{W}_{a,d}$ is given by the following formulae:

$$e_{11} = \sum_{i=1}^{n} m_i - y_0 \frac{\partial}{\partial y_0}, \qquad e_{22} = y_0 \frac{\partial}{\partial y_0}, \tag{6.2}$$

$$\tilde T_{11}(u) = \Bigl(u + e_{11} - \sum_{i=1}^{n} z_i + \sum_{j=1}^{n-1} y_j\Bigr) \prod_{j=1}^{n-1} (u - y_j) + \sum_{j=1}^{n-1} a(y_j) \prod_{j'\ne j} \frac{u - y_{j'}}{y_j - y_{j'}}\ \vartheta_{y_j}^{-1}, \tag{6.3}$$

$$\tilde T_{22}(u) = \Bigl(u + e_{22} - \sum_{i=1}^{n} z_i + \sum_{j=1}^{n-1} y_j\Bigr) \prod_{j=1}^{n-1} (u - y_j) + \sum_{j=1}^{n-1} d(y_j) \prod_{j'\ne j} \frac{u - y_{j'}}{y_j - y_{j'}}\ \vartheta_{y_j}, \tag{6.4}$$

where $\vartheta_{y_j} : f(y_0,\dots,y_{n-1}) \mapsto f(y_0,\dots, y_j + 1,\dots, y_{n-1})$.

Proof. The proofs of formulae (6.2) are straightforward. The proofs of formulae (6.3) and (6.4) are similar. We will prove formula (6.4). Clearly, the weight subspace $\mathcal{W}_{a,d}[l]$ is spanned by vectors of the form

$$\tilde T_{12}(u_1) \cdots \tilde T_{12}(u_l) \cdot 1 = y_0^{\,l} \prod_{i=1}^{l} \prod_{j=1}^{n-1} (u_i - y_j) \tag{6.5}$$

with various $u_1,\dots,u_l$. So, it suffices to verify formula (6.4) on such vectors. Both the expression $\tilde T_{22}(u)\, \tilde T_{12}(u_1) \cdots \tilde T_{12}(u_l) \cdot 1$ and the right-hand side of formula (6.4) applied to $\tilde T_{12}(u_1) \cdots \tilde T_{12}(u_l) \cdot 1$ are polynomials in $u$ of degree $n$. Therefore, they are uniquely determined by their coefficients at $u^n$ and $u^{n-1}$, and the values at $n - 1$ points $y_1,\dots,y_{n-1}$. Proposition 2.2.1 and formulae (2.9), (2.11), (6.5) yield that

$$\tilde T_{22}(u)\, \tilde T_{12}(u_1) \cdots \tilde T_{12}(u_l) \cdot 1 = \Bigl(u^n + \Bigl(l - \sum_{i=1}^{n} z_i\Bigr) u^{n-1} + O(u^{n-2})\Bigr)\, y_0^{\,l} \prod_{i=1}^{l} \prod_{j=1}^{n-1} (u_i - y_j)$$

as $u \to \infty$, and

$$\Bigl(\tilde T_{22}(u)\, \tilde T_{12}(u_1) \cdots \tilde T_{12}(u_l) \cdot 1\Bigr)\Big|_{u = y_j} = d(y_j)\, y_0^{\,l} \prod_{i=1}^{l} \Bigl((u_i - y_j - 1) \prod_{j'\ne j} (u_i - y_{j'})\Bigr),$$

which is the value at $u = y_j$ of the right-hand side of (6.4) applied to the vector (6.5), since $\vartheta_{y_j}$ replaces $y_j$ by $y_j + 1$. This proves the theorem.

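6.2.2. Corollary. We have

$$\tilde B(u, \tilde H) = \tilde T_{11}(u) + \tilde T_{22}(u) = \Bigl(2u + \sum_{i=1}^{n} (m_i - 2z_i) + 2 \sum_{j=1}^{n-1} y_j\Bigr) \prod_{j=1}^{n-1} (u - y_j) + \sum_{j=1}^{n-1} \Bigl(\prod_{j'\ne j} \frac{u - y_{j'}}{y_j - y_{j'}}\Bigr) \bigl(a(y_j)\,\vartheta_{y_j}^{-1} + d(y_j)\,\vartheta_{y_j}\bigr).$$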

6.3. Universal weight function. Let $y = (y_0,\dots,y_{n-1})$. Recall that $a = (a_1,\dots,a_l)$, $h = (h_1,\dots,h_n)$ and $p(x, a) = x^l + a_1 x^{l-1} + \dots + a_l$. Let

$$\omega(y, a) = y_0^{\,l} \prod_{j=1}^{n-1} p(y_j - 1, a).$$

This element of $\mathcal{W}_{a,d}[l] \otimes \mathbb{C}[a] \subset \mathcal{W}_{a,d}[l] \otimes \mathbb{C}[a, h]$ is called the universal weight function. A trivial but important property of the universal weight function is given by the following lemma.

6.3.1. Lemma. Consider $\mathbb{C}^{l+n}$ with coordinates $a, h$. Then for every $p \in \mathbb{C}^{l+n}$, the vector $\omega(y, a(p))$ is a nonzero vector of $\mathcal{W}_{a,d}[l]$.

Denote by $\omega_D$ the projection of the universal weight function $\omega(y, a)$ to $W_{a,d}[l] \otimes A_D = \mathcal{W}_{a,d}[l] \otimes A_D$.

6.3.2. Theorem. Assume that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Then for $s = 1,\dots,n$, we have

$$\tilde H_s\, \omega_D = h_s\, \omega_D \tag{6.6}$$

in $\mathcal{W}_{a,d}[l] \otimes A_D$. Moreover, we have

$$\omega_D \in \mathrm{Sing}\,\mathcal{W}_{a,d}[l] \otimes A_D \subset \mathcal{W}_{a,d}[l] \otimes A_D. \tag{6.7}$$


6.3.3. Corollary. Let $p$ be a point of the scheme $C_D$ considered as a set. Then

$$\omega(y, a(p)) \in \mathrm{Sing}\,\mathcal{W}_{a,d}[l]. \tag{6.8}$$

Moreover, for $s = 1,\dots,n$, we have

$$H_s\, \omega(y, a(p)) = h_s(p)\, \omega(y, a(p)). \tag{6.9}$$

Proof of Corollary 6.3.3. Let $\pi : \mathbb{C}[a, h] \to A_D$ be the canonical projection. A point $p \in C_D$ determines uniquely an algebra homomorphism $p : A_D \to \mathbb{C}$, such that $f(p) = p(\pi(f))$ for any $f \in \mathbb{C}[a, h]$. In particular,

$$\omega(y, a(p)) = (\mathrm{id} \otimes p)(\omega_D). \tag{6.10}$$

Therefore, formulae (6.8) and (6.9) follow from formulae (6.7) and (6.6), respectively.

6.3.4. Corollary. Let p1 , . . . pd be distinct points of the scheme C D considered as a set. Then the vectors ω( y, a( p1 )), . . . , ω( y, a( pd )) are linearly independent. Proof of Corollary 6.3.4. The vector ω( y, a( p j )) is nonzero by Lemma 6.3.1 and is an eigenvector of the operator Hs with eigenvalue h s ( p j ) by formula (6.9). Moreover, the collections of eigenvalues h( p1 ), . . . , h( pd ) are distinct, because a point p ∈ C D is uniquely determined by its coordinates h( p) by Theorem 4.3.1. The corollary is proved.

Proof of Theorem 6.3.2. To prove formula (6.6) it is enough to show that the polynomial $\bigl(\tilde B(u, \tilde H) - B(u, h)\bigr)\, \omega(y, a)$ projects to zero in $\mathbb{C}[u] \otimes \mathcal{W}_{a,d}[l] \otimes A_D$. Let

$$B(u, y_1,\dots,y_{n-1}, h) = \sum_{j=1}^{n-1} B(y_j, h) \prod_{j'\ne j} \frac{u - y_{j'}}{y_j - y_{j'}}.$$

For $j = 1,\dots,n-1$, we have $B(y_j, y_1,\dots,y_{n-1}, h) = B(y_j, h)$ and $B(u, y_1,\dots,y_{n-1}, h)$ is a polynomial in $u$ of degree $n - 2$. Hence

$$B(u, h) - B(u, y_1,\dots,y_{n-1}, h) = \Bigl(2u + h_1 + 2 \sum_{j=1}^{n-1} y_j\Bigr) \prod_{j=1}^{n-1} (u - y_j).$$

By Corollary 6.2.2 we have

$$\bigl(\tilde B(u, \tilde H) - B(u, h) + B(u, y_1,\dots,y_{n-1}, h) - B(u, y_1,\dots,y_{n-1}, h)\bigr)\, \omega(y, a) = \Bigl(-h_1 + \sum_{i=1}^{n} (m_i - 2z_i)\Bigr) \prod_{j=1}^{n-1} (u - y_j)\ \omega(y, a)$$

$$\qquad + \sum_{j=1}^{n-1} y_0^{\,l} \prod_{j'\ne j} \Bigl(\frac{u - y_{j'}}{y_j - y_{j'}}\ p(y_{j'} - 1, a)\Bigr) \times \bigl(a(y_j)\,\vartheta_{y_j}^{-2} - B(y_j, h)\,\vartheta_{y_j}^{-1} + d(y_j)\bigr)\, p(y_j, a).$$

Clearly all terms in the right-hand side of this formula project to zero in $\mathbb{C}[u] \otimes \mathcal{W}_{a,d}[l] \otimes A_D$: the first term is a multiple of $q_1(h)$, and the last factor of the second term is $D_h(p(u, a))$ evaluated at $u = y_j$. Hence, formula (6.6) is proved. The proof of formula (6.7) is based on the following lemma.



Lemma. We have $e_{21} e_{12}\, \omega_D = 0$.

Proof. From the formula for the quantum determinant we have

$$\tilde T_{12}(u)\, \tilde T_{21}(u - 1)\, \omega(y, a) = \bigl(\tilde T_{11}(u)\, \tilde T_{22}(u - 1) - a(u)\, d(u - 1)\bigr)\, \omega(y, a), \tag{6.11}$$

where $\tilde T_{12}(u)\, \tilde T_{21}(u - 1) = e_{21} e_{12}\, u^{2n-2} + O(u^{2n-3})$. Therefore, our goal is to calculate the coefficient of $u^{2n-2}$ in the right-hand side. We have

$$T_{11}(u)\, T_{22}(u - 1) = 1 + \frac{e_{11} + e_{22}}{u} + \frac{e_{22}(e_{11} + 1) + T_{11}^{\{2\}} + T_{22}^{\{2\}}}{u^2} + O(u^{-3}).$$

Hence

$$T_{11}(u)\, T_{22}(u - 1) - T_{11}(u) - T_{22}(u) + 1 = \frac{e_{22}(e_{11} + 1)}{u^2} + O(u^{-3})$$

and

$$\tilde T_{11}(u)\, \tilde T_{22}(u - 1) - \tilde B(u, \tilde H)\, d(u - 1) + d(u)\, d(u - 1) = e_{22}(e_{11} + 1)\, u^{2n-2} + O(u^{2n-3}).$$

Thus the right-hand side of (6.11) equals

$$\bigl(\tilde B(u, \tilde H) - a(u) - d(u)\bigr)\, d(u - 1)\, \omega(y, a) + e_{22}(e_{11} + 1)\, u^{2n-2}\, \omega(y, a) + O(u^{2n-3}).$$

Here

$$e_{22}(e_{11} + 1)\, \omega(y, a) = l\Bigl(\sum_{i=1}^{n} m_i - l + 1\Bigr)\, \omega(y, a), \qquad \tilde B(u, \tilde H)\, \omega(y, a) = B(u, h)\, \omega(y, a),$$

$$a(u) + d(u) = 2u^n - \sum_{s=1}^{n} (2z_s - m_s)\, u^{n-1} + \sum_{1\le i<j\le n} \bigl(z_i z_j + (z_i - m_i)(z_j - m_j)\bigr)\, u^{n-2} + \cdots$$

and $B(u, h) = 2u^n + h_1 u^{n-1} + h_2 u^{n-2} + \cdots$. Therefore, the right-hand side of (6.11) equals

$$\Bigl(h_1 + \sum_{s=1}^{n} (2z_s - m_s)\Bigr)\, u^{n-1}\, d(u - 1)\, \omega(y, a) + \Bigl(h_2 + l\Bigl(\sum_{i=1}^{n} m_i - l + 1\Bigr) - \sum_{1\le i<j\le n} \bigl(z_i z_j + (z_i - m_i)(z_j - m_j)\bigr)\Bigr)\, u^{2n-2}\, \omega(y, a) + O(u^{2n-3}).$$

Clearly the first two terms of this expression project to zero in $\mathbb{C}[u] \otimes \mathcal{W}_{a,d}[l] \otimes A_D$. This proves the lemma.

In order to deduce formula (6.7) from the lemma, it is enough to notice that the operator $e_{21}$ is injective; in the variables $y$ it is the operator of multiplication by $y_0$. Therefore, $e_{12}\, \omega_D = 0$. Theorem 6.3.2 is proved.



7. Multiplication in Algebra $A_D$ and Bethe Algebra $A_W$

7.1. Multiplication in $A_D$. By Theorem 4.2.1, the scheme $C_D$ considered as a set is finite, and the algebra $A_D$ is the direct sum of local algebras, $A_D = \oplus_p A_{p,D}$, corresponding to points $p$ of the set $C_D$. The local algebra $A_{p,D}$ may be defined as the quotient of the algebra of germs at $p$ of holomorphic functions in $a, h$ modulo the ideal $I_{p,D}$ generated by all functions $q_1, q_2,\dots, q_{l+n}$. The local algebra $A_{p,D}$ contains the maximal ideal $\mathfrak{m}_p$ generated by germs which are zero at $p$. For $f \in A_D$, denote by $L_f$ the linear operator $A_D \to A_D$, $g \mapsto f g$, of multiplication by $f$. Consider the dual space $A_D^* = \oplus_p A_{p,D}^*$ and the dual operators $L_f^* : A_D^* \to A_D^*$. Every summand $A_{p,D}^*$ contains the distinguished one-dimensional subspace $\mathfrak{m}_p^{\perp}$ which is the annihilator of $\mathfrak{m}_p$.

7.1.1. Lemma [MTV3]. (i) For any point $p$ of the scheme $C_D$ considered as a set and any $f \in A_D$, we have $L_f^*(\mathfrak{m}_p^{\perp}) \subset \mathfrak{m}_p^{\perp}$. (ii) For any point $p$ of the scheme $C_D$ considered as a set, if $W \subset A_{p,D}^*$ is a nonzero vector subspace invariant with respect to all operators $L_f^*$, $f \in A_D$, then $W$ contains $\mathfrak{m}_p^{\perp}$.

Proof. For any $f \in \mathfrak{m}_p$ we have $L_f^*(\mathfrak{m}_p^{\perp}) = 0$. This proves part (i). To prove part (ii) we consider the filtration of $A_{p,D}$ by powers of the maximal ideal, $A_{p,D} \supset \mathfrak{m}_p \supset \mathfrak{m}_p^2 \supset \dots \supset \{0\}$. We consider a linear basis $\{f_{a,b}\}$ of $A_{p,D}$, $a = 0, 1,\dots$, $b = 1, 2,\dots$, which agrees with this filtration. Namely, we assume that for every $i$, the subset of all vectors $f_{a,b}$ with $a \ge i$ is a basis of $\mathfrak{m}_p^i$. Since $\dim A_{p,D}/\mathfrak{m}_p = 1$, there is only one basis vector with $a = 0$ and we also assume that this vector $f_{0,1}$ is the image of 1 in $A_{p,D}$. Let $\{f^{a,b}\}$ denote the dual basis of $A_{p,D}^*$. Then the vector $f^{0,1}$ generates $\mathfrak{m}_p^{\perp}$. Let $w = \sum_{a,b} c_{a,b}\, f^{a,b}$ be a nonzero vector in $W$. Let $a_0$ be the maximum value of $a$ such that there exists $b$ with a nonzero $c_{a,b}$. Let $b_0$ be such that $c_{a_0,b_0}$ is nonzero. Then it is easy to see that $L^*_{f_{a_0,b_0}}\, w = c_{a_0,b_0}\, f^{0,1}$. Hence $W$ contains $\mathfrak{m}_p^{\perp}$.



7.2. Linear map $\tau : A_D^* \to \mathrm{Sing}\,W_{a,d}[l]$. Let $f_1,\dots,f_\mu$ be a basis of $A_D$ considered as a vector space over $\mathbb{C}$. Write

$$\omega_D = \sum_i v_i \otimes f_i \qquad \text{with} \qquad v_i \in \mathrm{Sing}\,\mathcal{W}_{a,d}[l] = \mathrm{Sing}\,W_{a,d}[l]. \tag{7.1}$$

Denote by $V \subset \mathrm{Sing}\,W_{a,d}[l]$ the vector subspace spanned by $v_1,\dots,v_\mu$. Define the linear map

$$\tau : A_D^* \to \mathrm{Sing}\,W_{a,d}[l], \qquad g \mapsto g(\omega_D) = \sum_i g(f_i)\, v_i. \tag{7.2}$$

Clearly, $V$ is the image of $\tau$.

7.2.1. Lemma. Let $p$ be a point of $C_D$ considered as a set. Let $\omega(y, a(p)) \in \mathcal{W}_{a,d}[l] = W_{a,d}[l]$ be the value of the universal weight function at $p$. Then the vector $\omega(y, a(p))$ belongs to the image of $\tau$.

Proof. The statement follows from formula (6.10).

Let $\psi_{DW} : A_D \to A_W$ be the epimorphism defined in Theorem 4.3.3.

7.2.2. Lemma. Assume that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Then for any $f \in A_D$ and $g \in A_D^*$, we have $\tau(L_f^*(g)) = \psi_{DW}(f)(\tau(g))$. In other words, the map $\tau$ intertwines the action of the algebra of multiplication operators $L_f^*$ on $A_D^*$ and the action of the Bethe algebra on $\mathrm{Sing}\,W_{a,d}[l]$.

Proof. The algebra $A_D$ is generated by $h_1,\dots,h_n$. It is enough to prove that for any $s$ we have $\tau(L^*_{h_s}(g)) = H_s(\tau(g))$. But

$$\tau(L^*_{h_s}(g)) = \sum_i g(h_s f_i)\, v_i = g\Bigl(\sum_i v_i \otimes h_s f_i\Bigr) = g\Bigl(\sum_i H_s v_i \otimes f_i\Bigr) = H_s(\tau(g)),$$

where the third equality is formula (6.6).

7.2.3. Corollary. The vector subspace $V \subset \mathrm{Sing}\,W_{a,d}[l]$ is invariant with respect to the action of the Bethe algebra $A_W$, and the kernel of $\tau$ is a subspace of $A_D^*$ invariant with respect to multiplication operators $L_f^*$, $f \in A_D$.

7.3. First main theorem. 7.3.1. Theorem. Assume that the pair ((m 1 , 0), . . . , (m n , 0)), l is separating. Then the image of τ is Sing Wa,d [ l ] and the kernel of τ is zero.



7.3.2. Corollary. The map $\tau$ identifies the action of operators $L_f^*$, $f \in A_D$, on $A_D^*$ and the action of the Bethe algebra on $\mathrm{Sing}\,W_{a,d}[l]$. Hence the epimorphism $\psi_{DW} : A_D \to A_W$ is an isomorphism.

Proof of Theorem 7.3.1. First we will show that $\tau$ is an epimorphism for generic $z$. Let $d_l = \dim \mathrm{Sing}\,W_{a,d}[l]$. Corollary 5.1.4 says that for generic $z$ there exist $d_l$ distinct points $p_1,\dots,p_{d_l}$ in $C_D$. By Corollary 6.3.4, the vectors $\omega(y, a(p_1)),\dots,\omega(y, a(p_{d_l}))$ are linearly independent and hence form a basis for $\mathrm{Sing}\,W_{a,d}[l]$. Therefore, $\tau$ is an epimorphism for generic $z$ by Lemma 7.2.1. By Theorem 4.2.1 and Lemma 2.5.3, the dimensions of $A_D$ and $\mathrm{Sing}\,W_{a,d}[l]$ do not depend on $z$. Hence $\dim A_D \ge \dim \mathrm{Sing}\,W_{a,d}[l]$ for all $z_1,\dots,z_n$. Therefore, to prove Theorem 7.3.1 it remains to prove that $\tau$ has zero kernel.

Denote the kernel of $\tau$ by $K$. Let $A_D = \oplus_p A_{p,D}$ be the decomposition into the direct sum of local algebras. Since $K$ is invariant with respect to multiplication operators, we have $K = \oplus_p (K \cap A_{p,D}^*)$, and for every $p$, the vector subspace $K \cap A_{p,D}^*$ is invariant with respect to multiplication operators. By Lemma 7.1.1, if $K \cap A_{p,D}^*$ is nonzero, then $K \cap A_{p,D}^*$ contains the one-dimensional subspace $\mathfrak{m}_p^{\perp}$. Let $\{f_{a,b}\}$ be the basis of $A_{p,D}$ constructed in the proof of Lemma 7.1.1, and let $\{f^{a,b}\}$ be the dual basis of $A_{p,D}^*$. Then the vector $f^{0,1}$ generates $\mathfrak{m}_p^{\perp}$. By definition of $\tau$, the vector $\tau(f^{0,1})$ is equal to the value of the universal weight function at $p$. By Lemma 6.3.1, this value is nonzero, which contradicts the assumption that $f^{0,1}$ lies in the kernel of $\tau$.

7.4. Grothendieck bilinear form on $A_D$. Realize the algebra $A_D$ as $\mathbb{C}[h]/I'_D$, where $I'_D$ is the ideal generated by $n$ polynomials $q_1(h)$, $q_2(h)$, $q_j(a(h), h)$, $j = l + 3,\dots, l + n$, see (4.3). Consider the Grothendieck residue $A_D \to \mathbb{C}$,

$$f \mapsto \frac{1}{(2\pi i)^n}\, \mathrm{Res}_{C_D}\, \frac{f}{q_1(h)\, q_2(h) \prod_{j=l+3}^{l+n} q_j(a(h), h)}.$$

Let $(\ ,\ )_D$ be the Grothendieck symmetric bilinear form on $A_D$ defined by the rule: $(f, g)_D$ is the Grothendieck residue of $f g$. The Grothendieck bilinear form is nondegenerate. The form $(\ ,\ )_D$ determines a linear isomorphism $\phi : A_D \to A_D^*$, $f \mapsto (f, \cdot)_D$.

7.4.1. Lemma. The isomorphism $\phi$ intertwines the operators $L_f$ and $L_f^*$ for any $f \in A_D$.

Proof. For $g \in A_D$ we have $\phi(L_f(g)) = \phi(f g) = (f g, \cdot)_D = (g, f\,\cdot)_D = L_f^*((g, \cdot)_D) = L_f^*\, \phi(g)$.

7.4.2. Corollary. Assume that the pair ((m 1 , 0), . . . , (m n , 0)) , l is separating. Then the composition τ φ : A D → Sing Wa,d [ l ] is a linear isomorphism which intertwines the algebra of multiplication operators on A D and the action of the Bethe algebra A W on Sing Wa,d [ l ].



8. Algebra $A_G$

8.1. New conditions on $(m_1, 0),\dots,(m_n, 0)$, $l$. In the remainder of the paper we assume that $\Lambda = (\Lambda^{(1)},\dots,\Lambda^{(n)}) = ((m_1, 0),\dots,(m_n, 0))$ is a collection of dominant integral $gl_2$-weights, that is, $m_s \in \mathbb{Z}_{\ge 0}$ for $s = 1,\dots,n$. We assume that $l \in \mathbb{Z}_{\ge 0}$ is such that the weight $\bigl(\sum_{s=1}^{n} m_s - l,\ l\bigr)$ is dominant integral, that is, $\sum_{s=1}^{n} m_s - l \ge l$. This assumption implies that the pair $((m_1, 0),\dots,(m_n, 0)),\ l$ is separating. Let $\tilde l = \sum_{s=1}^{n} m_s + 1 - l$. We have $\tilde l > l$.

8.2. Wronskian. The (discrete) Wronskian of polynomials $f, g \in \mathbb{C}[u]$ is the polynomial $\mathrm{Wr}(f(u), g(u)) = f(u)\, g(u - 1) - f(u - 1)\, g(u)$.

8.2.1. Lemma. Let $f, g, B \in \mathbb{C}[u]$. Assume that $f, g$ are monic polynomials of degrees $l, \tilde l$, respectively, that lie in the kernel of the difference operator $d(u) - B(u)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$. Then

$$\mathrm{Wr}(f(u), g(u)) = (l - \tilde l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j).$$

Proof. Let $C(u) = \mathrm{Wr}(f(u), g(u))$. Then the top coefficient of $C(u)$ equals $l - \tilde l$, and

$$\frac{C(u)}{C(u - 1)} = \frac{a(u)}{d(u)},$$

which determines the polynomial $C(u)$ uniquely.

8.2.2. Lemma. Let $f, g \in \mathbb{C}[u]$, $z \in \mathbb{C}$, $m \in \mathbb{Z}_{>0}$. Assume that $f(z - j) = 0$ for $j = 1,\dots,m + 1$. Then the polynomial $\mathrm{Wr}(f(u), g(u))$ is equal to zero at $u = z - j$, $j = 1,\dots,m$, and the polynomial $f(u)\, g(u - 2) - f(u - 2)\, g(u)$ is equal to zero at $u = z - j$, $j = 1,\dots,m - 1$.

8.2.3. Lemma. Let $f, g, C \in \mathbb{C}[u]$, $z \in \mathbb{C}$, and $\mathrm{Wr}(f(u), g(u)) = C(u)$. (i) If $C(z) \ne 0$ and $f(z - 1) = 0$, then $g(z - 1) \ne 0$. (ii) If $C(z) \ne 0$ and $f(z) = 0$, then $f(z - 1) \ne 0$.

8.2.4. Lemma. Let $f, g \in \mathbb{C}[u]$, $z \in \mathbb{C}$. Then $\mathrm{Wr}((u - z) f(u), (u - z) g(u)) = (u - z)(u - z - 1)\, \mathrm{Wr}(f(u), g(u))$.
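The discrete Wronskian is easy to experiment with. The sketch below is an illustration only, with hypothetical helper names: polynomials are coefficient lists (lowest degree first), `wr` implements $\mathrm{Wr}(f, g) = f(u) g(u-1) - f(u-1) g(u)$, and the identity of Lemma 8.2.4 together with the top coefficient $\deg f - \deg g$ used in the proof of Lemma 8.2.1 can be checked on examples.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials stored as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_add(p, q):
    """Sum of two coefficient lists, padding the shorter one with zeros."""
    size = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (size - len(p))
    q = list(q) + [Fraction(0)] * (size - len(q))
    return [x + y for x, y in zip(p, q)]


def poly_sub(p, q):
    """Difference p - q of two coefficient lists."""
    return poly_add(p, [-c for c in q])


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def shift(p, c):
    """Coefficients of p(u - c), computed by Horner's scheme."""
    out = [Fraction(0)]
    for coeff in reversed(p):
        out = poly_add(poly_mul(out, [Fraction(-c), Fraction(1)]), [coeff])
    return trim(out)


def wr(f, g):
    """Discrete Wronskian Wr(f, g) = f(u) g(u-1) - f(u-1) g(u)."""
    return trim(poly_sub(poly_mul(f, shift(g, 1)), poly_mul(shift(f, 1), g)))
```

For example, with $f = u^2 + 1$ and $g = u^3 - u$ one finds $\mathrm{Wr}(f, g) = -u^4 + 2u^3 - 5u^2 + 4u$, of degree $\deg f + \deg g - 1$ with top coefficient $\deg f - \deg g = -1$.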



8.3. Intersection of Schubert cycles $C_G$. Let $d$ be a sufficiently large natural number with respect to the numbers $m_1,\dots,m_n$ considered in Sect. 8.1. Let $\mathbb{C}_d[u]$ be the vector subspace in $\mathbb{C}[u]$ of polynomials of degree not greater than $d$.

8.3.1. Denote by $G$ the Grassmannian of all two-dimensional vector subspaces in $\mathbb{C}_d[u]$. Let $F = \{0 = F_{d+1} \subset F_d \subset \dots \subset F_1 \subset F_0 = \mathbb{C}_d[u]\}$ be a complete flag and $\Lambda = (a, b)$ a $gl_2$ dominant integral weight such that $d > a \ge b \ge 0$ and $a, b \in \mathbb{Z}$. Define a Schubert cell $C^{o}_{F,\Lambda} \subset G$ to be the set of all two-dimensional subspaces $V \subset \mathbb{C}_d[u]$ having a basis $f, g$ such that

$$f \in F_{a+1} - F_{a+2} \qquad \text{and} \qquad g \in F_b - F_{b+1}.$$

Define a Schubert cycle $C_{F,\Lambda} \subset G$ as the closure of the Schubert cell $C^{o}_{F,\Lambda}$. For $z \in \mathbb{C}$ and $i \in \mathbb{Z}_{>0}$, set

$$\varphi_i(u, z) = \prod_{j=1}^{i} (u - z + j).$$

Introduce a complete flag in $\mathbb{C}_d[u]$: $F(z) = \{0 = F_{d+1}(z) \subset F_d(z) \subset \dots \subset F_1(z) \subset F_0(z) = \mathbb{C}_d[u]\}$, where $F_i(z)$ consists of all polynomials divisible by $\varphi_i(u, z)$. Introduce the complete flag in $\mathbb{C}_d[u]$ associated with infinity: $F(\infty) = \{0 = F_{d+1}(\infty) \subset F_d(\infty) \subset \dots \subset F_1(\infty) \subset F_0(\infty) = \mathbb{C}_d[u]\}$, where $F_i(\infty)$ consists of all polynomials of degree not greater than $d - i$.

We consider the Schubert cells $C^{o}_{F(z_s),\Lambda^{(s)}} \subset G$, $s = 1,\dots,n$, where $\Lambda^{(s)} = (m_s, 0)$, and the Schubert cell $C^{o}_{F(\infty),\Lambda^{(\infty)}} \subset G$, where $\Lambda^{(\infty)} = (d - l - 1,\ d - \tilde l)$. The cell $C^{o}_{F(z_s),\Lambda^{(s)}}$ is the set of all two-dimensional subspaces $V \subset \mathbb{C}_d[u]$ having a basis $f, g$ such that

$$g(z_s - 1) \ne 0, \qquad f(z_s - m_s - 2) \ne 0, \qquad f(z_s - j) = 0 \ \text{ for } j = 1,\dots,m_s + 1,$$

and the cell $C^{o}_{F(\infty),\Lambda^{(\infty)}}$ is the set of all two-dimensional subspaces $V \subset \mathbb{C}_d[u]$ having a basis $f, g$ such that $\deg f = l$ and $\deg g = \tilde l$. Consider the (scheme-theoretic) intersection

$$C_G = C_{F(\infty),\Lambda^{(\infty)}} \cap \bigcap_{s=1}^{n} C_{F(z_s),\Lambda^{(s)}} \tag{8.1}$$

of the corresponding Schubert cycles. Denote by $A_G$ the algebra of functions on $C_G$.



8.3.2. Lemma. Let $z_i - z_j \notin \mathbb{Z}$ for $i \ne j$. Then

$$C_G = C^{o}_{F(\infty),\Lambda^{(\infty)}} \cap \bigcap_{s=1}^{n} C^{o}_{F(z_s),\Lambda^{(s)}}$$

as sets.

Proof. Let $V$ be a point of $C_{F(\infty),\Lambda^{(\infty)}} \cap \bigcap_{s=1}^{n} C_{F(z_s),\Lambda^{(s)}}$. Let $f, g$ be a monic basis of $V$ such that $\deg f \le l$ and $\deg g \le \tilde l$. Then $\deg \mathrm{Wr}(f(u), g(u)) \le l + \tilde l - 1$. On the other hand, the polynomial $\mathrm{Wr}(f(u), g(u))$ is divisible by $\prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j)$ by Lemma 8.2.2. Since $\sum_{s=1}^{n} m_s = l + \tilde l - 1$, we conclude that $\deg f = l$, $\deg g = \tilde l$, $V$ is a point of $C^{o}_{F(\infty),\Lambda^{(\infty)}}$, and

$$\mathrm{Wr}(f(u), g(u)) = (l - \tilde l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j).$$

Since a suitable linear combination of $f$ and $g$ is divisible by $\varphi_{m_s+1}(u, z_s)$, the subspace $V$ is a point of $C^{o}_{F(z_s),\Lambda^{(s)}}$ by Lemma 8.2.3.

8.3.3. Let $V$ be a point of $C_G$ considered as a set. Then there exists a unique basis $f, g$ of $V$ such that

$$f(u) = u^l + f_1 u^{l-1} + \dots + f_l,$$

$$g(u) = u^{\tilde l} + g_1 u^{\tilde l-1} + \dots + g_{\tilde l-l-1}\, u^{l+1} + g_{\tilde l-l+1}\, u^{l-1} + \dots + g_{\tilde l}$$

for suitable complex numbers $f_1,\dots,f_l$, $g_1,\dots,g_{\tilde l-l-1}$, $g_{\tilde l-l+1},\dots,g_{\tilde l}$.

8.3.4. Lemma. Let $z_i - z_j \notin \mathbb{Z}$ for $i \ne j$. Then all polynomials of the subspace $V$ are annihilated by the difference operator $D_V = d(u) - B_V(u)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$, where

$$B_V(u) = \frac{1}{\tilde l - l}\, \bigl(g(u)\, f(u - 2) - g(u - 2)\, f(u)\bigr) \prod_{s=1}^{n} \prod_{j=1}^{m_s - 1} (u - z_s + j)^{-1}$$

is a polynomial of degree $n$.

Proof. Let $W(u) = \mathrm{Wr}(f(u), g(u))$. It is straightforward to see that all polynomials of the subspace $V$ are annihilated by the difference operator

$$W(u - 1) - \bigl(f(u)\, g(u - 2) - f(u - 2)\, g(u)\bigr)\,\vartheta^{-1} + W(u)\,\vartheta^{-2}. \tag{8.2}$$

Since

$$W(u) = (l - \tilde l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j),$$

see the proof of Lemma 8.3.2, and all coefficients of the difference operator (8.2) are divisible by $\prod_{s=1}^{n} \prod_{j=1}^{m_s - 1} (u - z_s + j)$ by Lemma 8.2.2, dividing (8.2) by $(l - \tilde l) \prod_{s=1}^{n} \prod_{j=1}^{m_s - 1} (u - z_s + j)$ gives the operator $D_V$, and the statement follows.

Write

$$B_V(u) = 2u^n + h_1 u^{n-1} + \dots + h_n.$$

Recall that the scheme $C_D$ is defined in Sect. 4.1.
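The fact that a three-term operator built from Wronskian-type expressions annihilates both $f$ and $g$ is a formal identity and can be checked on any pair of polynomials. The sketch below is an illustration with hypothetical helper names. It fixes the convention $W = \mathrm{Wr}(f, g) = f(u) g(u-1) - f(u-1) g(u)$; with this convention the middle coefficient for which the identity holds is $f(u) g(u-2) - f(u-2) g(u)$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials stored as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_add(p, q):
    """Sum of two coefficient lists, padding the shorter one with zeros."""
    size = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (size - len(p))
    q = list(q) + [Fraction(0)] * (size - len(q))
    return [x + y for x, y in zip(p, q)]


def poly_sub(p, q):
    """Difference p - q of two coefficient lists."""
    return poly_add(p, [-c for c in q])


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def shift(p, c):
    """Coefficients of p(u - c), computed by Horner's scheme."""
    out = [Fraction(0)]
    for coeff in reversed(p):
        out = poly_add(poly_mul(out, [Fraction(-c), Fraction(1)]), [coeff])
    return trim(out)


def wr(f, g):
    """Discrete Wronskian Wr(f, g) = f(u) g(u-1) - f(u-1) g(u)."""
    return trim(poly_sub(poly_mul(f, shift(g, 1)), poly_mul(shift(f, 1), g)))


def apply_operator(f, g, h):
    """W(u-1) h(u) - M(u) h(u-1) + W(u) h(u-2), W = Wr(f, g), M = f(u) g(u-2) - f(u-2) g(u)."""
    w = wr(f, g)
    mid = poly_sub(poly_mul(f, shift(g, 2)), poly_mul(shift(f, 2), g))
    return trim(poly_add(
        poly_sub(poly_mul(shift(w, 1), h), poly_mul(mid, shift(h, 1))),
        poly_mul(w, shift(h, 2))))
```

Applying `apply_operator(f, g, h)` to $h = f$, $h = g$, or any linear combination of them returns the zero polynomial, whatever $f$ and $g$ are; the extra input of Lemma 8.3.4 is only the divisibility that turns this operator into $d(u) - B_V(u)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$.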



8.3.5. Corollary of Lemma 8.3.4. Consider the schemes $C_D$ and $C_G$ as sets. Then the assignment $V \mapsto (f_1,\dots,f_l, h_1,\dots,h_n) \in \mathbb{C}^{l+n}$ defines an injective map of sets $C_G \to C_D$.

8.3.6. Theorem. Let $z_i - z_j \notin \mathbb{Z}$ for $i \ne j$. Assume that $V \in G$ has a basis $f, g$ such that $\deg f = l$ and $\deg g = \tilde l$, and $V$ is annihilated by a difference operator of the form $d(u) - B(u)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$, where $B(u)$ is a polynomial. Then $V$ is a point of $C_G$.

The proof is similar to the proof of Theorem 7.2 in [MTV2].

8.4. Algebra $A_G$.

8.4.1. Lemma. Let $z_i - z_j \notin \mathbb{Z}$ for $i \ne j$. Then $A_G$ considered as a vector space is finite-dimensional. Moreover, this dimension does not depend on $z$.

Proof. The claim follows from Corollary 8.3.5 and the reasoning is similar to the proof of Theorem 4.2.1.

Under conditions of Lemma 8.4.1, the dimension of A G as a vector space is given by Schubert calculus. Namely, let = ( (1) , . . . , (n) ) be the collection of gl2 -highest weights, where (s) = (m s , 0). Denote by L = L (1) ⊗ · · · ⊗ L (n) the tensor product of irreducible gl2 -modules with highest weights (1) , . . . , (n) , respectively. Let Sing L [ l ] be the subspace of L of gl2 -singular vectors of weight ( ns=1 m s − l, l). Then by Schubert calculus, dim A G = dim Sing L [ l ],

(8.3)

see [Fu].

8.5. Presentation of algebra $A_G$. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, we shall use the following presentation of the algebra $A_G$. Let

$$\tilde a = (\tilde a_1, \dots, \tilde a_{\tilde l-l-1}, \tilde a_{\tilde l-l+1}, \dots, \tilde a_{\tilde l}).$$

Consider the space $\mathbb{C}^{\tilde l+l+n-1}$ with coordinates $\tilde a, a, h$, cf. Sect. 4.1. Denote by $\tilde p(u, \tilde a)$ the following polynomial in $u$ depending on parameters $\tilde a$:

$$\tilde p(u, \tilde a) = u^{\tilde l} + \tilde a_1 u^{\tilde l-1} + \dots + \tilde a_{\tilde l-l-1} u^{l+1} + \tilde a_{\tilde l-l+1} u^{l-1} + \dots + \tilde a_{\tilde l}.$$

Recall that $p(u, a) = u^l + a_1 u^{l-1} + \dots + a_l$ and $B(u, h) = 2u^n + h_1 u^{n-1} + \dots + h_n$. Let us write

$$\mathrm{Wr}(\tilde p(u, \tilde a), p(u, a)) = (\tilde l - l)\,u^{\tilde l+l-1} + w_1(\tilde a, a)\,u^{\tilde l+l-2} + \dots + w_{\tilde l+l-1}(\tilde a, a),$$

$$\tilde p(u, \tilde a)\,p(u-2, a) - \tilde p(u-2, \tilde a)\,p(u, a) = 2(\tilde l - l)\,u^{\tilde l+l-1} + \hat w_1(\tilde a, a)\,u^{\tilde l+l-2} + \dots + \hat w_{\tilde l+l-1}(\tilde a, a)$$

for suitable polynomials $w_1, \dots, w_{\tilde l+l-1}, \hat w_1, \dots, \hat w_{\tilde l+l-1}$ in variables $\tilde a, a$. Let us write

$$(\tilde l - l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j) = (\tilde l - l)\,u^{\tilde l+l-1} + c_1 u^{\tilde l+l-2} + \dots + c_{\tilde l+l-1},$$

$$(\tilde l - l)\,B(u, h) \prod_{s=1}^{n} \prod_{j=1}^{m_s-1} (u - z_s + j) = 2(\tilde l - l)\,u^{\tilde l+l-1} + \hat c_1(h)\,u^{\tilde l+l-2} + \dots + \hat c_{\tilde l+l-1}(h),$$

for suitable numbers $c_1, \dots, c_{\tilde l+l-1}$ and polynomials $\hat c_1, \dots, \hat c_{\tilde l+l-1}$ in variables $h$. Denote by $I_G$ the ideal in $\mathbb{C}[\tilde a, a, h]$ generated by the $2(\tilde l + l - 1)$ polynomials

$$w_i(\tilde a, a) - c_i, \qquad \hat w_i(\tilde a, a) - \hat c_i(h), \qquad i = 1, \dots, \tilde l + l - 1. \qquad (8.4)$$

8.5.1. Lemma. Let $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$. Then $A_G = \mathbb{C}[\tilde a, a, h]/I_G$.

Proof. The scheme defined by the ideal $I_G$ consists of points $p$ such that

$$\mathrm{Wr}(\tilde p(u, \tilde a(p)), p(u, a(p))) = (\tilde l - l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j),$$

$$\tilde p(u, \tilde a)\,p(u-2, a) - \tilde p(u-2, \tilde a)\,p(u, a) = (\tilde l - l)\,B(u, h(p)) \prod_{s=1}^{n} \prod_{j=1}^{m_s-1} (u - z_s + j).$$

Hence, the polynomials $\tilde p(u, \tilde a(p))$, $p(u, a(p))$ span a vector subspace $V$ lying in the intersection $C_G$, see Theorem 8.3.6. Conversely, if $V$ is a point of $C_G$, then $V$ has a basis $f, g$ as in Lemma 8.3.4. Then by Lemma 8.3.4 we have

$$\mathrm{Wr}(g(u), f(u)) = (\tilde l - l) \prod_{s=1}^{n} \prod_{j=1}^{m_s} (u - z_s + j),$$

$$g(u) f(u-2) - g(u-2) f(u) = (\tilde l - l)\,B(u) \prod_{s=1}^{n} \prod_{j=1}^{m_s-1} (u - z_s + j)$$

for a suitable polynomial $B(u)$. Hence, the triple $g, f, B$ determines a point $p$, whose coordinates satisfy Eqs. (8.4).

9. Algebras $A_P$ and $A_L$

9.1. Algebra $A_P$. Consider the space $\mathbb{C}^{\tilde l+l+n-1}$ with coordinates $\tilde a, a, h$. Let $D_h = d(u) - B(u, h)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$ be the difference operator defined in (4.1). If $h$ satisfies equations $q_1(h) = 0$ and $q_2(h) = 0$, then the polynomial $D_h(\tilde p(u, \tilde a))$ is a polynomial in $u$ of degree $\tilde l + n - 3$,

$$D_h(\tilde p(u, \tilde a)) = \tilde q_3(\tilde a, h)\,u^{\tilde l+n-3} + \dots + \tilde q_{\tilde l+n}(\tilde a, h).$$

The coefficients $\tilde q_i(\tilde a, h)$ are functions linear in $\tilde a$ and linear in $h$.

Recall that if $p(u, a) = u^l + a_1 u^{l-1} + \dots + a_l$, and $h$ satisfies equations $q_1(h) = 0$ and $q_2(h) = 0$, then the polynomial $D_h(p(u, a))$ is a polynomial in $u$ of degree $l + n - 3$,

$$D_h(p(u, a)) = q_3(a, h)\,u^{l+n-3} + \dots + q_{l+n}(a, h).$$

Denote by $I_P$ the ideal in $\mathbb{C}[\tilde a, a, h]$ generated by polynomials $q_1, q_2, q_3, \dots, q_{l+n}, \tilde q_3, \dots, \tilde q_{\tilde l+n}$. The ideal $I_P$ defines a scheme $C_P \subset \mathbb{C}^{\tilde l+l+n-1}$. The algebra $A_P = \mathbb{C}[\tilde a, a, h]/I_P$ is the algebra of functions on $C_P$.

The scheme $C_P$ is the scheme of points $p \in \mathbb{C}^{\tilde l+l+n-1}$ such that the difference equation $D_{h(p)}\,w(u) = 0$ has two polynomial solutions $\tilde p(u, \tilde a(p))$ and $p(u, a(p))$.

9.2. Isomorphism $\psi_{GP} : A_G \to A_P$.

˜

/ Z for i = j, then the identity map Cl+l+n−1 → Cl+l+n−1 9.2.1. Theorem. If z i − z j ∈ induces an algebra isomorphism ψG P : A G → A P . ˜ a˜ ( p)), p(u, a( p)) are annihilated by Proof. If p is a point of C P , then polynomials p(u, the difference operator d(u) − B(u, h( p))ϑ −1 + a(u)ϑ −2 . If z i − z j ∈ / Z for i = j, then the span V of polynomials p(u, ˜ a˜ ( p)), p(u, a( p)) is a point of C G by Theorem 8.3.6. This reasoning defines an algebra homomorphism ψG P : A G → A P . Conversely, if p is a point of C G , then the triple p(u, ˜ a˜ ( p)), p(u, a( p)), B(u, h( p)) satisfies equations ms n ˜ Wr ( p(u, ˜ a˜ ( p)), p(u, a( p))) = (l − l) (u − z s + j), s=1 j=1

p(u, ˜ a˜ ) p(u − 2, a) − p(u ˜ − 2, a˜ ) p(u, a) = (l˜ − l) B(u, h( p))

n m s −1

(u − z s + j).

s=1 j=1

Hence the polynomials p(u, ˜ a˜ ( p)), p(u, a( p)) are annihilated by the difference operator d(u) − B(u, h( p))ϑ −1 + a(u)ϑ −2 . Therefore, p is a point of C P .

9.3. Algebra $A_L$. Assume that $m_1, \dots, m_n, l$ satisfy conditions of Sect. 8.1. Let $\Lambda = (\Lambda^{(1)}, \dots, \Lambda^{(n)})$ be the collection of $\mathfrak{gl}_2$-highest weights with $\Lambda^{(s)} = (m_s, 0)$. Let $L_\Lambda = L_{\Lambda^{(1)}} \otimes \dots \otimes L_{\Lambda^{(n)}}$ be the tensor product of irreducible $\mathfrak{gl}_2$-modules with highest weights $\Lambda^{(1)}, \dots, \Lambda^{(n)}$, respectively, and $v_\Lambda = v_{(m_1,0)} \otimes \dots \otimes v_{(m_n,0)}$ the tensor product of the corresponding highest weight vectors. Denote by $L_\Lambda(z) = L_{\Lambda^{(1)}}(z_1) \otimes \dots \otimes L_{\Lambda^{(n)}}(z_n)$ the tensor product of evaluation modules.

Let $\mathrm{Sing}\,L_\Lambda[l] \subset L_\Lambda(z)$ be the subspace of $\mathfrak{gl}_2$-singular vectors of weight $(\sum_{i=1}^{n} m_i - l,\ l)$. The algebra $A_L$ is the Bethe algebra associated with $\mathrm{Sing}\,L_\Lambda[l]$.

Assume that $m_i \in \mathbb{Z}_{\geq 0}$ for $i = 1, \dots, n$, and m 1 m 2 · · · m n . Assume that $z_i - z_j + m_j - s \neq 0$ and $z_i - z_j - 1 - s \neq 0$ for all $i < j$ and $s = 0, 1, \dots, m_i - 1$. Then by Theorem 2.6.2, there is a natural isomorphism $W_{a,d}/K \to L_\Lambda(z)$ such that $1 \mapsto v_\Lambda$. Here $K \subset W_{a,d}$ is the kernel of the Yangian Shapovalov form on $W_{a,d}$. The Yangian Shapovalov form on $W_{a,d}$ induces the Yangian Shapovalov form $S$ on $L_\Lambda(z)$ such that $S(v_\Lambda, v_\Lambda) = 1$ and $S(x \cdot v, w) = S(v, x^{+} \cdot w)$ for all $x \in Y(\mathfrak{gl}_2)$ and $v, w \in L_\Lambda(z)$. The form $S$ is nondegenerate and symmetric.

We have the composition of linear maps $W_{a,d} \to W_{a,d}/K \to L_\Lambda(z)$. Restricting this composition to $\mathrm{Sing}\,W_{a,d}$ we get a linear epimorphism $\sigma : \mathrm{Sing}\,W_{a,d}[l] \to \mathrm{Sing}\,L_\Lambda[l]$. The Bethe algebra $A_W$ preserves the kernel of $\sigma$ and induces a commutative subalgebra in $\mathrm{End}(\mathrm{Sing}\,L_\Lambda[l])$. The induced subalgebra coincides with the Bethe algebra $A_L$. We denote by $\psi_{WL} : A_W \to A_L$ the corresponding epimorphism. The operators of the algebra $A_L$ are symmetric with respect to the Yangian Shapovalov form on $L_\Lambda(z)$.

9.3.1. Denote by

$$D_L = d(u) - \bigl(2u^n + \psi_{WL}(H_1)\,u^{n-1} + \dots + \psi_{WL}(H_n)\bigr)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}$$

the universal difference operator associated with the subspace $\mathrm{Sing}\,L_\Lambda[l]$ and collection $z$.

9.3.2. Theorem. Assume that the pair $\Lambda, l$ satisfies conditions of Sect. 8.1. Then for any $v_0 \in \mathrm{Sing}\,L_\Lambda[l]$ there exist $v_1, \dots, v_{\tilde l} \in \mathrm{Sing}\,L_\Lambda[l]$ such that the function

$$w(u) = v_0 u^{\tilde l} + v_1 u^{\tilde l-1} + \dots + v_{\tilde l}$$

is a solution of the difference equation $D_L\,w(u) = 0$.

This theorem is a particular case of Theorem 7.3 in [MTV2].

10. Homomorphisms of Algebras $A_D$, $A_P$ and $A_L$

10.1. Epimorphism $\psi_{DP} : A_D \to A_P$. A point $p$ of $C_P$ determines the difference equation $D_{h(p)}\,w(u) = 0$ and two solutions $\tilde p(u, \tilde a(p))$ and $p(u, a(p))$. Then the pair, consisting of the difference operator $D_{h(p)}$ and the solution $p(u, a(p))$ of the smaller degree, determines a point of $C_D$, see Sect. 4.1. This correspondence defines a natural algebra epimorphism $\psi_{DP} : A_D \to A_P$.

10.2. Linear map $\xi : A_D \to \mathrm{Sing}\,L_\Lambda[l]$. Assume that $z_1, \dots, z_n, m_1, \dots, m_n$ satisfy the assumptions of Theorem 2.6.2. Then we have the composition of linear maps

$$A_D \xrightarrow{\ \phi\ } A_D^{*} \xrightarrow{\ \tau\ } \mathrm{Sing}\,W_{a,d}[l] \xrightarrow{\ \sigma\ } \mathrm{Sing}\,L_\Lambda[l].$$

Denote this composition by $\xi : A_D \to \mathrm{Sing}\,L_\Lambda[l]$. By Theorem 7.3.1, $\xi$ is a linear epimorphism. Let $\psi_{DL} : A_D \to A_L$ be the algebra epimorphism defined as the composition $\psi_{WL}\,\psi_{DW}$.

10.2.1. Lemma. If $z_1, \dots, z_n, m_1, \dots, m_n$ satisfy the assumptions of Theorem 2.6.2, then the linear map $\xi$ intertwines the action of the multiplication operators $L_f$, $f \in A_D$, on $A_D$ and the action of the Bethe algebra $A_L$ on $\mathrm{Sing}\,L_\Lambda[l]$, that is, for any $f, g \in A_D$ we have

$$\xi(L_f(g)) = \psi_{DL}(f)(\xi(g)).$$

The lemma follows from Corollary 7.4.2.

10.2.2. Lemma. If $z_1, \dots, z_n, m_1, \dots, m_n$ satisfy the assumptions of Theorem 2.6.2, then the kernel of $\xi$ coincides with the kernel of $\psi_{DL}$.

Proof. If $\psi_{DL}(f) = 0$, then $\xi(f) = \xi(L_f(1)) = \psi_{DL}(f)(\xi(1)) = 0$. On the other hand, if $\xi(f) = 0$, then for any $g \in A_D$ we have $\psi_{DL}(f)(\xi(g)) = \xi(L_f(g)) = \xi(fg) = \xi(L_g(f)) = \psi_{DL}(g)(\xi(f)) = 0$. Since $\xi$ is an epimorphism, this means that $\psi_{DL}(f) = 0$.

10.2.3. Lemma. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, then the kernel of $\xi$ coincides with the kernel of $\psi_{DP}$.

Proof. If $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, then the assumptions of Theorem 2.6.2 are satisfied and $\xi$ is defined. By Schubert calculus, $\dim \mathrm{Sing}\,L_\Lambda[l] = \dim A_G$. By Theorem 9.2.1, $\dim A_G = \dim A_P$ if $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$. Hence it suffices to show that the kernel of $\xi$ contains the kernel of $\psi_{DP}$. But this follows from Theorems 3.3.1 and 9.3.2. Indeed the defining relations in $A_P = A_D/(\ker \psi_{DP})$ are the conditions on the operator $D_h$ to have two linearly independent polynomials in the kernel. Theorems 3.3.1 and 9.3.2 guarantee these relations for elements of the Bethe algebra $A_L$. Hence, the kernel of $\psi_{DL}$ contains the kernel of $\psi_{DP}$. By Lemma 10.2.2, the kernel of $\xi$ coincides with the kernel of $\psi_{DL}$. Therefore, the kernel of $\xi$ contains the kernel of $\psi_{DP}$.

10.2.4. Corollary. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Then the algebras $A_P$, $A_L$ and $A_G$ are isomorphic.

Proof. Since the algebra epimorphisms $\psi_{DP}$ and $\psi_{DL}$ have the same kernels, the algebras $A_P$ and $A_L$ are isomorphic. Then $A_L$ and $A_G$ are isomorphic by Theorem 9.2.1.

10.3. Second main theorem. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Denote by $\psi_{PL} : A_P \to A_L$ the isomorphism induced by $\psi_{DL}$ and $\psi_{DP}$. Lemmas 10.2.1 – 10.2.3 imply the following theorem.

10.3.1. Theorem. If $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$, then the linear map $\xi$ induces a linear isomorphism $\zeta : A_P \to \mathrm{Sing}\,L_\Lambda[l]$ which intertwines the multiplication operators $L_f$, $f \in A_P$, on $A_P$ and the action of the Bethe algebra $A_L$ on $\mathrm{Sing}\,L_\Lambda[l]$, that is, for any $f, g \in A_P$ we have

$$\zeta(L_f(g)) = \psi_{PL}(f)(\zeta(g)).$$

10.3.2. Corollary. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Assume that every operator $f \in A_L$ is diagonalizable. Then the algebra $A_L$ has simple spectrum and all points of the intersection of Schubert cycles

$$C_G = C_{F(\infty),\Lambda^{(\infty)}} \cap \Bigl( \bigcap_{i=1}^{n} C_{F(z_i),\Lambda^{(i)}} \Bigr)$$

are of multiplicity one.

Proof. The algebras $A_L$, $A_P$ and $A_G$ are isomorphic. We have $A_P = \oplus_p A_{p,P}$, where the sum is over the points of the scheme $C_P$ considered as a set and $A_{p,P}$ is the local algebra associated with a point $p$. The algebra $A_{p,P}$ has nonzero nilpotent elements if $\dim A_{p,P} > 1$. If every element $f \in A_P$ is diagonalizable, then the algebra $A_P$ is the direct sum of one-dimensional local algebras. Hence $A_P$ has simple spectrum as well as the algebras $A_L$ and $A_G$.

Corollary 10.3.2 has the following application.

10.3.3. Corollary. Assume that $z_1, \dots, z_n$ are real, $z_i - z_j \notin \mathbb{Z}$ and $|z_i - z_j|$ 1 for all $i \neq j$. Then all points of the intersection of Schubert cycles

$$C_G = C_{F(\infty),\Lambda^{(\infty)}} \cap \Bigl( \bigcap_{i=1}^{n} C_{F(z_i),\Lambda^{(i)}} \Bigr)$$

are of multiplicity one.

Proof. If $z_1, \dots, z_n$ are real and $|z_i - z_j|$ 1 for all $i \neq j$, then the Yangian Shapovalov form, restricted to the real part of $\mathrm{Sing}\,L_\Lambda[l]$, is positive definite, see Appendix C in [MTV1]. The Hamiltonians $\psi_{WL}(H_1), \dots, \psi_{WL}(H_n)$, restricted to the real part of $\mathrm{Sing}\,L_\Lambda[l]$, are real symmetric operators with respect to the Yangian Shapovalov form, see [MTV1]. Hence, all elements of the Bethe algebra $A_L$ are diagonalizable operators. Therefore, the spectrum of $A_G$ is simple and all points of $C_G$ are of multiplicity one.

Corollary 10.3.3 is related to Theorem 1 from [EGSV] and Theorem 2.1 from [MTV4] concerning the real Schubert calculus.

10.3.4. Example. Let $n = 3$, $\Lambda^{(s)} = (1, 0)$, $s = 1, 2, 3$, $\Lambda^{(\infty)} = (2, 1)$, and

$$R = 4\,(z_1^2 + z_2^2 + z_3^2 - z_1 z_2 - z_1 z_3 - z_2 z_3) - 3.$$

If $R \neq 0$, then every element of $A_L$ is diagonalizable and the algebra $A_L$ is isomorphic to the direct sum $\mathbb{C} \oplus \mathbb{C}$. If $R = 0$, then the algebra $A_L$ contains a nonzero nilpotent matrix and is isomorphic to $\mathbb{C}[b]/b^2$.
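The role of $R$ in Example 10.3.4 can be illustrated by a direct computation. The sketch below is a hand derivation and not taken from the paper: it assumes $l = 1$, $\tilde l = 3$, writes $f(u) = u + a$ and $g(u) = u^3 + b u^2 + c$, and imposes the Wronskian condition of Lemma 8.5.1 with $m_s = 1$ in the form $g(u) f(u-1) - g(u-1) f(u) = 2 \prod_s (u - z_s + 1)$. Comparing coefficients then gives a quadratic equation for $a$ whose discriminant is exactly $R$. All function names are hypothetical.

```python
from fractions import Fraction


def R(z):
    """The quantity R of Example 10.3.4."""
    z1, z2, z3 = (Fraction(x) for x in z)
    return 4 * (z1 * z1 + z2 * z2 + z3 * z3 - z1 * z2 - z1 * z3 - z2 * z3) - 3


def quadratic_for_a(z):
    """Coefficients (A, B, C) of A a^2 + B a + C = 0 for the constant term a of f."""
    y1, y2, y3 = (1 - Fraction(x) for x in z)
    e1 = y1 + y2 + y3
    e2 = y1 * y2 + y1 * y3 + y2 * y3
    return Fraction(3), -(2 * e1 + 3), e1 + e2 + 1


def discriminant(z):
    A, B, C = quadratic_for_a(z)
    return B * B - 4 * A * C


def pair_from_root(a, z):
    """Return (f, g) as coefficient lists (highest degree first) for a root a."""
    y1, y2, y3 = (1 - Fraction(x) for x in z)
    e1, e3 = y1 + y2 + y3, y1 * y2 * y3
    b = 2 * e1 + 3 - 3 * a      # from the u^2 coefficient
    c = a - a * b - 2 * e3      # from the constant coefficient
    return [Fraction(1), a], [Fraction(1), b, Fraction(0), c]


def poly_eval(coeffs, u):
    value = Fraction(0)
    for coeff in coeffs:
        value = value * u + coeff
    return value


def wronskian_residual(f, g, z, u):
    """g(u) f(u-1) - g(u-1) f(u) - 2 prod_s (u - z_s + 1); zero for a solution."""
    lhs = poly_eval(g, u) * poly_eval(f, u - 1) - poly_eval(g, u - 1) * poly_eval(f, u)
    rhs = Fraction(2)
    for zs in z:
        rhs *= u - Fraction(zs) + 1
    return lhs - rhs


# A point with R = 0 and non-integer differences: the quadratic has a double root.
z_double = (Fraction(0), Fraction(1, 7), Fraction(13, 14))
A, B, C = quadratic_for_a(z_double)
a_double = -B / (2 * A)
f, g = pair_from_root(a_double, z_double)
```

For generic $z$ the quadratic has two distinct roots, matching the two points of $C_G$; when $R = 0$ the roots collide, which is consistent with the non-semisimple algebra $\mathbb{C}[b]/b^2$ in the example.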


11. Operators with Polynomial Kernel and Bethe Algebra $A_L$

11.1. Linear isomorphism $\theta : A_P^{*} \to \mathrm{Sing}\,L_\Lambda[l]$. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Define the symmetric bilinear form on $A_P$ by the formula

$$(f, g)_P = S(\zeta(f), \zeta(g)) \qquad \text{for all } f, g \in A_P,$$

where $S(\,,\,)$ denotes the Yangian Shapovalov form on $\mathrm{Sing}\,L_\Lambda[l]$.

11.1.1. Lemma. The form $(\,,\,)_P$ is nondegenerate.

The lemma follows from the fact that the Yangian Shapovalov form on $\mathrm{Sing}\,L_\Lambda[l]$ is nondegenerate and the fact that $\zeta$ is an isomorphism.

11.1.2. Lemma. We have $(fg, h)_P = (g, fh)_P$ for all $f, g, h \in A_P$.

The lemma follows from the fact that the elements of the Bethe algebra are symmetric operators with respect to the Yangian Shapovalov form, see Sect. 3.2.2.

The form $(\,,\,)_P$ defines a linear isomorphism $\pi : A_P \to A_P^{*}$, $f \mapsto (f, \cdot)_P$.

11.1.3. Corollary. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Then the map $\pi$ intertwines the multiplication operators $L_f$, $f \in A_P$, on $A_P$ and the dual operators $L_f^{*}$, $f \in A_P$, on $A_P^{*}$.

11.2. Third main theorem. Summarizing Theorem 10.3.1 and Corollary 11.1.3 we obtain the following theorem.

11.2.1. Theorem. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Then the composition $\theta = \zeta \pi^{-1}$ is a linear isomorphism from $A_P^{*}$ to $\mathrm{Sing}\,L_\Lambda[l]$ which intertwines the multiplication operators $L_f^{*}$, $f \in A_P$, on $A_P^{*}$ and the action of the Bethe algebra $A_L$ on $\mathrm{Sing}\,L_\Lambda[l]$, that is, for any $f \in A_P$ and $g \in A_P^{*}$ we have

$$\theta(L_f^{*}(g)) = \psi_{PL}(f)(\theta(g)).$$

11.2.2. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Assume that $v \in \mathrm{Sing}\,L_\Lambda[l]$ is an eigenvector of the Bethe algebra $A_L$, that is, $\psi_{WL}(H_s)\,v = \lambda_s v$ for suitable $\lambda_s \in \mathbb{C}$ and $s = 1, \dots, n$. Then, by Corollary 7.4 in [MTV2], the difference equation

$$\bigl(d(u) - (2u^n + \lambda_1 u^{n-1} + \dots + \lambda_n)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}\bigr)\,w(u) = 0$$

has two linearly independent polynomial solutions, one of degree $l$ and the other of degree $\tilde l$. The following corollary of Theorem 11.2.1 gives the converse statement.

11.2.3. Corollary of Theorem 11.2.1. Let $z_i - z_j \notin \mathbb{Z}$ for all $i \neq j$. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point such that

$$\lambda_1 = \sum_{i=1}^{n} (m_i - 2z_i), \qquad \lambda_2 = l\Bigl(l - 1 - \sum_{i=1}^{n} m_i\Bigr) + \sum_{1 \leq i < j \leq n} \bigl(z_i z_j + (z_i - m_i)(z_j - m_j)\bigr),$$

and the difference equation

$$\bigl(d(u) - (2u^n + \lambda_1 u^{n-1} + \dots + \lambda_n)\,\vartheta^{-1} + a(u)\,\vartheta^{-2}\bigr)\,w(u) = 0 \qquad (11.1)$$

has two linearly independent polynomial solutions. Then there exists a unique up to normalization eigenvector $v \in \mathrm{Sing}\,L_\Lambda[l]$ of the action of the Bethe algebra $A_L$ such that for every $s = 1, \dots, n$ we have

$$\psi_{WL}(H_s)\,v = \lambda_s v. \qquad (11.2)$$

Proof of Corollary 11.2.3. Indeed, such a point $(\lambda_1, \dots, \lambda_n)$ defines a linear function $\eta : A_P \to \mathbb{C}$, $h_s \mapsto \lambda_s$, for $s = 1, \dots, n$. Moreover, $\eta(fg) = \eta(f)\eta(g)$ for all $f, g \in A_P$. Hence $\eta \in A_P^{*}$ is an eigenvector of the operators $L_f^{*}$ acting on $A_P^{*}$. By Theorem 11.2.1, the vector $v = \theta(\eta) \in \mathrm{Sing}\,L_\Lambda[l]$ is an eigenvector of the action of the Bethe algebra $A_L$ with eigenvalues prescribed in Corollary 11.2.3.

Let $v' \in \mathrm{Sing}\,L_\Lambda[l]$ satisfy (11.2), then $\eta' = \theta^{-1}(v') \in A_P^{*}$ satisfies $\eta'(fg) = \eta(f)\eta'(g)$ for all $f, g \in A_P$. Hence, for $g = 1$ we have $\eta'(f) = \eta(f)\eta'(1)$. Therefore, $\eta'$ is proportional to $\eta$, and $v'$ is proportional to $v$.

11.2.4. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point satisfying the assumptions of Corollary 11.2.3. We describe how to find the eigenvector $v \in \mathrm{Sing}\,L_\Lambda[l]$, indicated in Corollary 11.2.3. Let $f(u)$ be the monic polynomial of degree $l$ which is a solution of the difference equation (11.1). Consider the polynomial

$$\omega(y) = y_0^{\,l} \prod_{j=1}^{n-1} f(y_j - 1)$$

as an element of $W_{a,d}$, see Sect. 6.3. By Theorem 6.3.2 this vector lies in $\mathrm{Sing}\,W_{a,d}[l]$ and $\omega(y)$ is an eigenvector of the Bethe algebra $A_W$ with eigenvalues prescribed in Corollary 11.2.3.

Consider a maximal subspace $V \subset \mathrm{Sing}\,W_{a,d}[l]$ with three properties:
i) $V$ contains $\omega(y)$,
ii) $V$ does not contain other eigenvectors of the Bethe algebra $A_W$,
iii) $V$ is invariant with respect to the Bethe algebra $A_W$.
Such a maximal subspace does exist and is unique. Let $\sigma(V) \subset \mathrm{Sing}\,L_\Lambda[l]$ be the image of $V$ under the epimorphism $\sigma$. Then by Corollary 11.2.3, the subspace $\sigma(V)$ contains a unique one-dimensional subspace of eigenvectors of the Bethe algebra $A_L$. Any such eigenvector may serve as an eigenvector of the Bethe algebra $A_L$ indicated in Corollary 11.2.3.

12. Homogeneous XXX Heisenberg model

12.1. Statement of results. In Sects. 8–11, in most of the assertions we assumed that $z_1, \dots, z_n \in \mathbb{C}$ are such that $z_i - z_j \notin \mathbb{Z}$ for $i \neq j$, and $m_1, \dots, m_n$ are natural numbers. In this section we assume that

$$z_1 = \dots = z_n = 0 \qquad \text{and} \qquad m_1 = \dots = m_n = 1. \qquad (12.1)$$

This special case is called the homogeneous XXX Heisenberg model.

In other words, in this section we consider the $Y(\mathfrak{gl}_2)$-module

$$L_1(0) = L_{(1,0)}(0) \otimes \dots \otimes L_{(1,0)}(0),$$

which is the tensor product of $n$ copies of the two-dimensional evaluation module, and the subspace of $\mathfrak{gl}_2$-singular vectors of weight $(n - l, l)$,

$$\mathrm{Sing}\,L_1[l] = \{\, p \in L_1(0) \mid e_{12}\,p = 0,\ e_{22}\,p = l\,p \,\}.$$

The subspace $\mathrm{Sing}\,L_1[l]$ is nonzero if and only if $2l \leq n$, that is, if and only if the pair $((1, 0), \dots, (1, 0)),\ l$ is separating. In that case

$$\dim \mathrm{Sing}\,L_1[l] = \binom{n}{l} - \binom{n}{l-1}.$$

The algebra $A_L$ is the Bethe algebra associated with the subspace $\mathrm{Sing}\,L_1[l]$. It is generated by the coefficients of the series $(T_{11}(u) + T_{22}(u))|_{\mathrm{Sing}\,L_1[l]}$. The main result of this section is the following theorem.

12.1.1. Theorem. For the homogeneous XXX Heisenberg model, the Bethe algebra $A_L$ has simple spectrum.

The theorem will be proved in Sect. 12.7.

12.1.2. Let $\tilde l = n + 1 - l$. We have $\tilde l + l - 1 = n$ and $\tilde l > l$. Denote by $f, g$ two polynomials in $\mathbb{C}[u]$ of the form:

$$f(u) = u^l + f_1 u^{l-1} + \dots + f_l, \qquad g(u) = u^{\tilde l} + g_1 u^{\tilde l-1} + \dots + g_{\tilde l-l-1} u^{l+1} + g_{\tilde l-l+1} u^{l-1} + \dots + g_{\tilde l}. \qquad (12.2)$$

As a byproduct of the proof of Theorem 12.1.1 we prove the following theorem.

Theorem. There exist exactly $\binom{n}{l} - \binom{n}{l-1}$ distinct pairs of polynomials $f, g$ of the form (12.2), such that

$$f(u)\,g(u-1) - f(u-1)\,g(u) = (l - \tilde l)\,(u + 1)^n.$$

Theorem 12.1.2 will be proved in Sect. 12.8.
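The count in Theorem 12.1.2 can be checked by hand in the two smallest nontrivial cases. The following is only an illustrative sketch: the explicit solutions for $n = 2$ and $n = 3$ (both with $l = 1$) were obtained here by comparing coefficients and are not quoted from the paper, and the names `dim_sing`, `residual`, `pairs_n2`, `pairs_n3` are hypothetical.

```python
import cmath
from math import comb


def dim_sing(n, l):
    """binom(n, l) - binom(n, l-1) for 2l <= n, and 0 otherwise."""
    if l < 0 or 2 * l > n:
        return 0
    lower = comb(n, l - 1) if l >= 1 else 0
    return comb(n, l) - lower


def poly_eval(coeffs, u):
    """Evaluate a polynomial (coefficients from highest degree down) at u."""
    value = 0
    for c in coeffs:
        value = value * u + c
    return value


def residual(f, g, n, u):
    """f(u) g(u-1) - f(u-1) g(u) - (l - l~)(u+1)^n, which vanishes for a solution."""
    l, l_tilde = len(f) - 1, len(g) - 1
    lhs = poly_eval(f, u) * poly_eval(g, u - 1) - poly_eval(f, u - 1) * poly_eval(g, u)
    return lhs - (l - l_tilde) * (u + 1) ** n


# n = 2, l = 1: f = u + a, g = u^2 + c; coefficient comparison gives a = 3/2, c = -5/2.
pairs_n2 = [([1, 1.5], [1, 0, -2.5])]

# n = 3, l = 1: f = u + a, g = u^3 + b u^2 + c with 3a^2 - 9a + 7 = 0.
pairs_n3 = []
for a in ((9 + cmath.sqrt(-3)) / 6, (9 - cmath.sqrt(-3)) / 6):
    b = 9 - 3 * a
    c = 3 * a * a - 8 * a - 2
    pairs_n3.append(([1, a], [1, b, 0, c]))
```

In both cases the number of pairs equals `dim_sing(n, 1)`; for $n = 3$ the two solutions are complex conjugate and distinct.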

12.2. Algebra $A_L$ for the homogeneous XXX model. Consider the Yangian module $W_{a,d}$ corresponding to the polynomials

$$a(u) = (u + 1)^n, \qquad d(u) = u^n.$$

The numbers (12.1) satisfy the assumptions of Theorem 2.6.2. Therefore the $Y(\mathfrak{gl}_2)$-module $L_1(0)$ is irreducible, and there is a natural epimorphism $W_{a,d} \to L_1(0)$ of $Y(\mathfrak{gl}_2)$-modules. Restricting this epimorphism to $\mathrm{Sing}\,W_{a,d}[l]$, we obtain a linear epimorphism $\sigma : \mathrm{Sing}\,W_{a,d}[l] \to \mathrm{Sing}\,L_1[l]$.

The Bethe algebra $A_W$ preserves the kernel of $\sigma$ and induces a commutative subalgebra in $\mathrm{End}(\mathrm{Sing}\,L_1[l])$. The induced subalgebra coincides with the Bethe algebra $A_L$, see Sect. 9.3. Denote by $\psi_{WL} : A_W \to A_L$ the corresponding epimorphism. We have

$$(T_{11}(u) + T_{22}(u))|_{\mathrm{Sing}\,L_1[l]} = 2 + \psi_{WL}(H_1)\,u^{-1} + \dots + \psi_{WL}(H_n)\,u^{-n},$$

where

$$\psi_{WL}(H_1) = n, \qquad \psi_{WL}(H_2) = l(l - 1 - n) + \frac{n(n-1)}{2},$$

see Sect. 3.2.1. Thus the Bethe algebra $A_L$ is generated by the elements $\psi_{WL}(H_3), \dots, \psi_{WL}(H_n)$.

12.3. Algebra $A_P$ for the homogeneous XXX model. Consider the space $\mathbb{C}^{2n}$ with coordinates $\tilde a, a, h$, as in Sect. 8.5, and polynomials $\tilde p(u, \tilde a)$, $p(u, a)$, $B(u, h)$. Given the polynomials $a(u) = (u + 1)^n$ and $d(u) = u^n$, we define the ideal $I_P$, the algebra $A_P$, and the scheme $C_P$ as in Sect. 9.1. The scheme $C_P$ is the scheme of points $p \in \mathbb{C}^{2n}$ such that the difference equation

$$\bigl(u^n - B(u, h)\,\vartheta^{-1} + (u + 1)^n\,\vartheta^{-2}\bigr)\,w(u) = 0$$

has two polynomial solutions $\tilde p(u, \tilde a(p))$ and $p(u, a(p))$.

12.4. Algebra $A_G$ for the homogeneous XXX model. Consider the space $\mathbb{C}^{2n}$ with coordinates $\tilde a, a, h$, and polynomials $\tilde p(u, \tilde a)$, $p(u, a)$, $B(u, h)$. Let us write

$$\mathrm{Wr}(\tilde p(u, \tilde a), p(u, a)) = (\tilde l - l)\,u^n + w_1(\tilde a, a)\,u^{n-1} + \dots + w_n(\tilde a, a),$$

$$\tilde p(u, \tilde a)\,p(u-2, a) - \tilde p(u-2, \tilde a)\,p(u, a) = 2(\tilde l - l)\,u^n + \hat w_1(\tilde a, a)\,u^{n-1} + \dots + \hat w_n(\tilde a, a)$$

for suitable polynomials $w_1, \dots, w_n, \hat w_1, \dots, \hat w_n$ in variables $\tilde a, a$. Denote by $I_G$ the ideal in $\mathbb{C}[\tilde a, a, h]$ generated by the $2n$ polynomials

$$w_i(\tilde a, a) - (\tilde l - l)\binom{n}{i}, \qquad \hat w_i(\tilde a, a) - (\tilde l - l)\,h_i, \qquad i = 1, \dots, n. \qquad (12.3)$$

The ideal $I_G$ defines a scheme $C_G \subset \mathbb{C}^{2n}$. Then $A_G = \mathbb{C}[\tilde a, a, h]/I_G$ is the algebra of functions on $C_G$. The scheme $C_G$ is the scheme of points $p \in \mathbb{C}^{2n}$ such that

$$\mathrm{Wr}(\tilde p(u, \tilde a(p)), p(u, a(p))) = (\tilde l - l)\,(u + 1)^n, \qquad \tilde p(u, \tilde a)\,p(u-2, a) - \tilde p(u-2, \tilde a)\,p(u, a) = (\tilde l - l)\,B(u, h(p)). \qquad (12.4)$$
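The values $\psi_{WL}(H_1) = n$ and $\psi_{WL}(H_2) = l(l-1-n) + n(n-1)/2$ quoted in Sect. 12.2 are exactly what is needed for the homogeneous operator $u^n - B(u,h)\,\vartheta^{-1} + (u+1)^n\,\vartheta^{-2}$ to lower the degree of a degree-$l$ polynomial to $l + n - 3$, in line with Sect. 9.1. The following sketch checks this with integer polynomial arithmetic. It assumes that $\vartheta^{-1}$ acts by $w(u) \mapsto w(u-1)$; the remaining coefficients $h_3, h_4$ and the test polynomial are arbitrary hypothetical choices, and all helper names are introduced only for this illustration.

```python
from math import comb


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_shift(p, k):
    """Coefficients of p(u - k), lowest degree first."""
    out = [0] * len(p)
    for d, c in enumerate(p):
        for i in range(d + 1):
            out[i] += c * comb(d, i) * (-k) ** (d - i)
    return out


def apply_D(n, h, p):
    """Coefficients of u^n p(u) - B(u) p(u-1) + (u+1)^n p(u-2),
    where B(u) = 2u^n + h[0] u^(n-1) + ... + h[n-1]."""
    d_poly = [0] * n + [1]                          # u^n
    a_poly = [comb(n, i) for i in range(n + 1)]     # (u+1)^n
    b_poly = list(reversed([2] + list(h)))          # B(u), lowest degree first
    t1 = poly_mul(d_poly, p)
    t2 = poly_mul(b_poly, poly_shift(p, 1))
    t3 = poly_mul(a_poly, poly_shift(p, 2))
    return [x - y + z for x, y, z in zip(t1, t2, t3)]


def degree(p):
    """Degree of a coefficient list, or -1 for the zero polynomial."""
    for i in range(len(p) - 1, -1, -1):
        if p[i] != 0:
            return i
    return -1


n, l = 4, 2
h1 = n
h2 = l * (l - 1 - n) + n * (n - 1) // 2
p = [-1, 5, 1]                                      # u^2 + 5u - 1, hypothetical
result = apply_D(n, [h1, h2, 7, -3], p)             # h3 = 7, h4 = -3 are arbitrary
```

With these values `degree(result)` is at most `n + l - 3`; replacing `h2` by any other number leaves a nonzero coefficient at $u^{n+l-2}$.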


12.4.1. Theorem. The identity map $\mathbb{C}^{2n} \to \mathbb{C}^{2n}$ induces an algebra isomorphism $\psi_{GP} : A_G \to A_P$.

Proof. The proof is similar to the proof of Theorem 9.2.1.

12.4.2. Lemma. The dimension of $A_G$ considered as a vector space is equal to

$$\dim \mathrm{Sing}\,L_1[l] = \binom{n}{l} - \binom{n}{l-1}.$$

Proof. Consider the ideal $I_G(z)$ defined by (8.4) for $m_1 = \dots = m_n = 1$ and arbitrary $z_1, \dots, z_n$. Consider the algebra $A_G(z) = \mathbb{C}[\tilde a, a, h]/I_G(z)$. By Lemma 8.5.1, if $z_1, \dots, z_n$ are distinct and close to zero, then $A_G(z)$ is the algebra of functions on the intersection of Schubert cells $C_G(z)$, see (8.1), and by (8.3) we have

$$\dim A_G(z) = \binom{n}{l} - \binom{n}{l-1}.$$

To complete the proof of Lemma 12.4.2, it suffices to verify two facts:
(i) There are no algebraic curves over $\mathbb{C}$ lying in the scheme $C_G(0)$, defined by the ideal (12.3).
(ii) Let a sequence $z^{(i)}$, $i = 1, 2, \dots$, tend to $0$. Let $p^{(i)} \in C_G(z^{(i)})$, $i = 1, 2, \dots$, be a sequence of points. Then all coordinates $\tilde a(p^{(i)})$, $a(p^{(i)})$, $h(p^{(i)})$ remain bounded as $i$ tends to infinity.

By Theorem 9.2.1, the schemes $C_G(z)$ and $C_P(z)$ are isomorphic if $z_1, \dots, z_n$ are distinct and close to zero. By Theorem 12.4.1, the schemes $C_G(0)$ and $C_P(0)$ are isomorphic as well. Claims (i) and (ii) hold for the scheme $C_P(z)$ by Theorem 4.2.1 because $C_P(z)$ is a subscheme of the scheme $C_D(z)$.

12.5. Three more homomorphisms for the homogeneous XXX model. In Sects. 10.1 and 10.2, we define an algebra epimorphism $\psi_{DP} : A_D \to A_P$, a linear epimorphism $\xi : A_D \to \mathrm{Sing}\,L_1[l]$ as the composition of linear maps

$$A_D \xrightarrow{\ \phi\ } A_D^{*} \xrightarrow{\ \tau\ } \mathrm{Sing}\,W_{a,d}[l] \xrightarrow{\ \sigma\ } \mathrm{Sing}\,L_1[l],$$

and an algebra epimorphism $\psi_{DL} : A_D \to A_L$ as the composition $\psi_{WL}\,\psi_{DW}$. For the homogeneous XXX model, we have Lemmas 10.2.1 and 10.2.2 and the following analogue of Lemma 10.2.3.

12.5.1. Lemma. For the homogeneous XXX model, the kernel of $\xi$ coincides with the kernel of $\psi_{DP}$.

Proof. The proof is similar to the proof of Lemma 10.2.3 with Theorem 12.4.1 replacing Theorem 9.2.1.

12.5.2. Corollary. For the homogeneous XXX model, the algebras $A_P$, $A_L$ and $A_G$ are isomorphic.

Denote by $\psi_{PL} : A_P \to A_L$ the isomorphism induced by $\psi_{DL}$ and $\psi_{DP}$. We have the following analogue of Theorem 10.3.1.

12.5.3. Theorem. For the homogeneous XXX model, the linear map $\xi$ induces a linear isomorphism $\zeta : A_P \to \mathrm{Sing}\,L_1[l]$ which intertwines the multiplication operators $L_f$, $f \in A_P$, on $A_P$ and the action of the Bethe algebra $A_L$ on $\mathrm{Sing}\,L_1[l]$, that is, for any $f, g \in A_P$ we have

$$\zeta(L_f(g)) = \psi_{PL}(f)(\zeta(g)).$$

12.6. The Bethe algebra $A_L$ of the homogeneous XXX model is diagonalizable.

12.6.1. Theorem. For the homogeneous XXX model, all elements of $A_L$ are diagonalizable operators.

Proof. Let $v_+$ be a highest $\mathfrak{gl}_2$-weight vector of $L_{(1,0)}$ and $v_- = e_{21} v_+$. Then $v_+, v_-$ form a basis of $L_{(1,0)}$. Consider the Hermitian form on $L_1(0)$ for which the vectors

$$v_{i_1} \otimes \dots \otimes v_{i_n} \qquad \text{with } i_j \in \{+, -\}$$

form an orthonormal basis of $L_1(0)$. For any $X \in \mathrm{End}(L_1(0))$, denote by $X^{\dagger}$ the Hermitian conjugate operator with respect to this Hermitian form. It is clear that

$$\bigl((1^{\otimes(j-1)} \otimes e_{ab} \otimes 1^{\otimes(n-j)})|_{L_1(0)}\bigr)^{\dagger} = (1^{\otimes(j-1)} \otimes e_{ba} \otimes 1^{\otimes(n-j)})|_{L_1(0)}.$$

Using the fact that $(e_{11} + e_{22})|_{L_{(1,0)}} = 1$ and the definition of the coproduct (2.2), it is straightforward to verify by induction on $n$ that

$$\bigl(T_{ab}(u)|_{L_1(0)}\bigr)^{\dagger} = (-1)^{a+b+n}\,T_{3-a,3-b}(-\bar u - 1)|_{L_1(0)},$$

where $\bar u$ is the complex conjugate of $u$. Therefore,

$$\bigl((T_{11}(u) + T_{22}(u))|_{L_1(0)}\bigr)^{\dagger} = (-1)^n\,(T_{11}(-\bar u - 1) + T_{22}(-\bar u - 1))|_{L_1(0)}.$$

This means that for any $X \in A_L$, the Hermitian conjugate operator $X^{\dagger}$ lies in $A_L$. Hence, any element of $A_L$ commutes with its Hermitian conjugate and, therefore, is diagonalizable.

12.7. Proof of Theorem 12.1.1. The proof is similar to the proof of Corollary 10.3.2, because every element of $A_L$ is diagonalizable by Theorem 12.6.1.

12.8. Proof of Theorem 12.1.2. The algebras $A_G$ and $A_L$ are isomorphic. So, by Theorem 12.6.1 every element $f \in A_G$ is diagonalizable. Therefore, the algebra $A_G$ is the direct sum of one-dimensional local algebras. Hence $C_G$ considered as a set consists of $\dim A_G = \binom{n}{l} - \binom{n}{l-1}$ distinct points, see Lemma 12.4.2. Theorem 12.1.2 is proved.

12.8.1. Assume that $v \in \mathrm{Sing}\,L_1[l]$ is an eigenvector of the Bethe algebra $A_L$, that is, $\psi_{WL}(H_s)\,v = \lambda_s v$ for suitable $\lambda_s \in \mathbb{C}$ and $s = 1, \dots, n$. Then by Corollary 7.4 in [MTV2], the difference equation

$$\bigl(u^n - (2u^n + \lambda_1 u^{n-1} + \dots + \lambda_n)\,\vartheta^{-1} + (u + 1)^n\,\vartheta^{-2}\bigr)\,w(u) = 0$$

has two linearly independent polynomial solutions, one of degree $l$ and the other of degree $n - l + 1$. The following corollary of Theorem 12.1.1 gives the converse statement.

12.8.2. Corollary of Theorem 12.1.1. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point such that

$$\lambda_1 = n, \qquad \lambda_2 = l(l - 1 - n) + \frac{n(n-1)}{2},$$

and the difference equation

$$\bigl(u^n - (2u^n + \lambda_1 u^{n-1} + \dots + \lambda_n)\,\vartheta^{-1} + (u + 1)^n\,\vartheta^{-2}\bigr)\,w(u) = 0$$

has two linearly independent polynomial solutions. Then there exists a unique up to normalization eigenvector $v \in \mathrm{Sing}\,L_1[l]$ of the action of the Bethe algebra $A_L$ of the homogeneous XXX model such that for every $s = 1, \dots, n$ we have

$$\psi_{WL}(H_s)\,v = \lambda_s v.$$

The proof of Corollary 12.8.2 is similar to the proof of Corollary 11.2.3.

12.8.3. Assume that $(\lambda_1, \dots, \lambda_n) \in \mathbb{C}^n$ is a point satisfying the assumptions of Corollary 12.8.2. In order to find the eigenvector $v \in \mathrm{Sing}\,L_1[l]$, indicated in Corollary 12.8.2, one needs to apply the procedure described in Sect. 11.2.4.

Acknowledgements. The authors thank referees for helpful comments.

References

[B1] Baxter, R.: Exactly solved models in statistical mechanics. London: Academic Press, Inc., 1982
[B2] Baxter, R.: Completeness of the Bethe ansatz for the six- and eight-vertex models. J. Stat. Phys. 108(1-2), 1–48 (2002)
[Be] Bethe, H.: Zur Theorie der Metalle: I. Eigenwerte und Eigenfunktionen der linearen Atomkette. Z. Phys. 71, 205–226 (1931)
[CP] Chari, V., Pressley, A.: A Guide to quantum groups. Cambridge: Cambridge University Press, 1994
[EGSV] Eremenko, A., Gabrielov, A., Shapiro, M., Vainshtein, A.: Rational functions and real Schubert calculus. Proc. Amer. Math. Soc. 134(4), 949–957 (2006)
[FT] Faddeev, L.D., Takhtajan, L.A.: The quantum method for the inverse problem and the XYZ Heisenberg model. Russ. Math. Surv. 34, no. 5, 11–68 (1979); The spectrum and scattering of excitations in the one-dimensional isotropic Heisenberg model. J. Sov. Math. 24, 241–267 (1984)
[Fu] Fulton, W.: Intersection Theory. Berlin-Heidelberg-New York: Springer-Verlag, 1984
[IK] Izergin, A.G., Korepin, V.E.: Lattice model connected with nonlinear Schrödinger equation. Sov. Phys. Doklady 26, 653–654 (1981)
[KBI] Korepin, V.E., Bogoliubov, N.M., Izergin, A.G.: Quantum inverse scattering method and correlation functions. Cambridge: Cambridge University Press, 1993
[MTV1] Mukhin, E., Tarasov, V., Varchenko, A.: Bethe eigenvectors of higher transfer matrices. J. Stat. Mech. Theor. Exp. 2006, no. 8, P08002, 1–44 (electronic) (2006)
[MTV2] Mukhin, E., Tarasov, V., Varchenko, A.: Generating operator of XXX or Gaudin transfer matrices has quasi-exponential kernel. SIGMA 6, 060, 1–31 (2007)
[MTV3] Mukhin, E., Tarasov, V., Varchenko, A.: Bethe algebra and algebra of functions on the space of differential operators of order two with polynomial solutions. http://arxiv.org/abs/0705.4114v1[math.QA], 2007
[MTV4] Mukhin, E., Tarasov, V., Varchenko, A.: On reality property of Wronski maps. http://arxiv.org/abs/0710.5856v2[math.QA], 2008
[MTV5] Mukhin, E., Tarasov, V., Varchenko, A.: On separation of variables and completeness of the Bethe ansatz for quantum glN Gaudin model. http://arxiv.org/abs/0712.0981v1[math.QA], 2007
[MV1] Mukhin, E., Varchenko, A.: Critical points of master functions and flag varieties. Comm. Contemp. Math. 6(1), 111–163 (2004)
[MV2] Mukhin, E., Varchenko, A.: Solutions to the XXX type Bethe ansatz equations and flag varieties. Cent. Eur. J. Math. 1(2), 238–271 (2003)
[MV3] Mukhin, E., Varchenko, A.: Discrete Miura opers and solutions of the Bethe ansatz equations. Commun. Math. Phys. 256(3), 565–588 (2005)
[PS] Pronko, G.P., Stroganov, Yu.G.: Bethe equations “on the wrong side of equator”. J. Phys. A 32(12), 2333–2340 (1999)
[RV] Reshetikhin, N., Varchenko, A.: Quasiclassical asymptotics of solutions to the KZ equations. In: Geometry, Topology and Physics for R. Bott, Somerville, MA: Intern. Press, 1995, pp. 293–322
[Sk] Sklyanin, E.: Quantum inverse scattering method. Selected topics. In: Nankai Lectures Math. Phys., River Edge, NJ: World Sci. Publ., 1992, pp. 63–97
[T] Tarasov, V.: Irreducible monodromy matrices for the R-matrix of the XXZ-model and lattice local quantum Hamiltonians. Theor. Math. Phys. 63(2), 440–454 (1985)
[Tal] Talalaev, D.: Quantization of the Gaudin system. http://arxiv.org/abs/list/hep-th/0404153, 2004
[YY] Yang, C.N., Yang, C.P.: Thermodynamics of a one-dimensional system of bosons with repulsive delta-function interaction. J. Math. Phys. 10, 1115–1122 (1969)

Communicated by L. Takhtajan




Conformal Radii for Conformal Loop Ensembles

Oded Schramm1, Scott Sheffield2,3, David B. Wilson1

1 Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
2 Courant Institute, New York University, 251 Mercer Street, New York, NY 10021, USA
3 Department of Mathematics, M. I. T., 77 Massachusetts Ave., Cambridge, MA 02139, USA

Received: 20 September 2007 / Accepted: 10 November 2008 Published online: 26 February 2009 – © Springer-Verlag 2009

Abstract: The conformal loop ensembles $\mathrm{CLE}_\kappa$, defined for $8/3 \leq \kappa \leq 8$, are random collections of loops in a planar domain which are conjectured scaling limits of the $O(n)$ loop models. We calculate the distribution of the conformal radii of the nested loops surrounding a deterministic point. Our results agree with predictions made by Cardy and Ziff and by Kenyon and Wilson for the $O(n)$ model. We also compute the expectation dimension of the $\mathrm{CLE}_\kappa$ gasket, which consists of points not surrounded by any loop, to be

$$2 - \frac{(8 - \kappa)(3\kappa - 8)}{32\kappa},$$

which agrees with the fractal dimension given by Duplantier for the $O(n)$ model gasket.

1. Introduction

The conformal loop ensembles $\mathrm{CLE}_\kappa$, defined for all $8/3 \leq \kappa \leq 8$, are random collections of loops in a simply connected planar domain $D \subsetneq \mathbb{C}$. They were defined and constructed from branching variants of $\mathrm{SLE}_\kappa$ in [She06], where they were conjectured to be the scaling limits of various random loop models from statistical physics, including the so-called $O(n)$ loop models with

$$n = -2\cos(4\pi/\kappa), \qquad (1)$$

see e.g. [KN04] for an exposition. This paper is a sequel to [She06]. We will state the results about $\mathrm{CLE}_\kappa$ from [She06] that we need for this paper (namely Propositions 1 and 2), but we will not repeat the definition of $\mathrm{CLE}_\kappa$ here. When $8/3 < \kappa < 8$, $\mathrm{CLE}_\kappa$ is almost surely a countably infinite collection of loops. $\mathrm{CLE}_8$ is a single space-filling loop almost surely and $\mathrm{CLE}_{8/3}$ is almost surely empty.

Partially supported by NSF grant DMS0403182.

$\mathrm{CLE}_6$ is the scaling limit of the cluster boundaries of critical site percolation on the triangular lattice [CN06,Smi01,CN07]. We will henceforth assume $8/3 < \kappa < 8$.

For each $z \in D$, we inductively define $L_k^z$ to be the outermost loop surrounding $z$ when the loops $L_1^z, \dots, L_{k-1}^z$ are removed (provided such a loop exists). For each deterministic $z \in D$, the loops $L_k^z$ exist for all $k \geq 1$ with probability one. Define $A_0^z = D$ and let $A_k^z$ be the component of $D \setminus L_k^z$ that contains $z$.

The conformal gasket is the random closed set consisting of points that are not surrounded by any loop of an instance of $\mathrm{CLE}_\kappa$, i.e. the set of points for which $L_1^z$ does not exist.

If $D$ is a simply connected planar domain and $z \in D$, the conformal radius of $D$ viewed from $z$ is defined to be $\mathrm{CR}(D, z) := |g'(z)|^{-1}$, where $g$ is any conformal map from $D$ to the unit disk $\mathbb{D}$ that sends $z$ to $0$. The following is immediate from the construction in [She06]:

Proposition 1. Let $D$ be a simply connected bounded planar domain, and consider a $\mathrm{CLE}_\kappa$ on $D$ for some $8/3 < \kappa < 8$. Then the gasket is almost surely the closure of the set of points that lie on an outermost loop (i.e., a loop of the form $L_1^z$ for some $z$). Conditioned on the outermost loops, the law of the remaining loops is given by an independent $\mathrm{CLE}_\kappa$ in each component of the complement of the gasket in $D$.

For $z \in D$ and $k = 1, 2, 3, \dots$, define

$$B_k^z := \log \mathrm{CR}(A_{k-1}^z, z) - \log \mathrm{CR}(A_k^z, z).$$

For any fixed $z$, the $B_k^z$'s are i.i.d. random variables.

Various authors in the physics literature have used heuristic arguments (based on the so-called Coulomb gas method) to calculate properties of the scaling limits of statistical physical loop models, including the O(n) models, based on certain conformal invariance hypotheses of these limits. Although the scaling limits of the O(n) models have not been shown to exist, there is strong evidence that if they do exist they must be $\mathrm{CLE}_\kappa$. (For example, there is heuristic evidence that any scaling limit of the O(n) models should be in some sense conformally invariant; it is shown in [She06,SW08b,SW08a] that any random loop ensemble satisfying certain hypotheses including conformal invariance and a Markov-type property must be a $\mathrm{CLE}_\kappa$.) It is therefore natural to interpret these calculations as predictions about the behavior of the $\mathrm{CLE}_\kappa$.

Cardy and Ziff [CZ03] predicted and experimentally verified the expected number of loops surrounding a point in the O(n) model, which, in light of (1), may be interpreted as a prediction of the expectation of $B_k^z$:

$$\frac{1}{E[B_k^z]} = \frac{(\kappa/4 - 1)\cot(\pi(1 - 4/\kappa))}{\pi}. \tag{2}$$

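Kenyon and Wilson [KW04] went further and predicted the distribution of $B_k^z$, giving its moment generating function

$$E[\exp(\lambda B_k^z)] = \frac{-\cos(4\pi/\kappa)}{\cos\left(\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}\right)}, \tag{3}$$

for $\lambda$ satisfying $\mathrm{Re}\,\lambda < 1 - \frac{2}{\kappa} - \frac{3\kappa}{32}$, and density function

$$\frac{d}{dx}\Pr[B_k^z < x] = \frac{-\kappa\cos(4\pi/\kappa)}{4\pi} \sum_{j=0}^{\infty} (-1)^j (j + 1/2) \exp\left[-\frac{(j + 1/2)^2 - (1 - 4/\kappa)^2}{8/\kappa}\, x\right]. \tag{4}$$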
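As a numerical sanity check on how (2), (3) and (4) fit together, the sketch below computes the mean of $B_k^z$ in two ways: from the closed form (2), and by integrating $x$ against the series (4) term by term (each term contributes its coefficient divided by the square of its decay rate). The function names and the truncation length are illustrative choices, not part of the original text, and $\kappa = 4$ is excluded because (2) is then a 0/0 limit.

```python
import math


def mean_from_formula(kappa):
    """E[B] from (2): 1/E[B] = (kappa/4 - 1) * cot(pi * (1 - 4/kappa)) / pi."""
    a = 1 - 4 / kappa
    return 4 * math.pi * math.tan(math.pi * a) / (kappa - 4)


def mean_from_series(kappa, terms=4000):
    """E[B] obtained by integrating x times the series density (4) term by term."""
    a = 1 - 4 / kappa
    prefactor = -kappa * math.cos(4 * math.pi / kappa) / (4 * math.pi)
    total = 0.0
    for j in range(terms):
        h = j + 0.5
        rate = (h * h - a * a) * kappa / 8  # decay rate of the j-th exponential
        total += (-1) ** j * h / rate ** 2  # integral of x * exp(-rate * x) is 1 / rate^2
    return prefactor * total


if __name__ == "__main__":
    for kappa in (3.0, 5.0, 6.0, 7.0):
        print(kappa, mean_from_formula(kappa), mean_from_series(kappa))
```

The alternating series for the mean has terms of order $j^{-3}$, so a few thousand terms are ample; the same approach applied to the total mass converges much more slowly and would need series acceleration.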

Conformal Radii for Conformal Loop Ensembles


The main result of this paper is Theorem 1, which confirms these predictions. In the special case $\kappa = 6$, this prediction for the law of $B_k^z$ was independently confirmed by Dubédat [Dub05].

Theorem 1. Let $f_\kappa$ denote the density function for the first time that a standard Brownian motion started at 0 exits the interval $(-2\pi/\sqrt{\kappa}, 2\pi/\sqrt{\kappa})$. Then for $8/3 < \kappa < 8$, the density function for $B_k^z$ is

$$\frac{d}{dx}\Pr[B_k^z < x] = -f_\kappa(x)\cos(4\pi/\kappa)\exp\left[\frac{(\kappa - 4)^2}{8\kappa}\, x\right]. \tag{5}$$

The equivalence of the formulations (4) and (5), and the fact that they imply (3), follow from a calculation of Ciesielski and Taylor, who showed that the exit-time distribution of a Brownian motion from the center of a 1-dimensional ball of radius $r$ has moment generating function $1/\cos\sqrt{2r^2\lambda}$, and who gave two series expansions (one in powers of $e^{-x}$ and the other in powers of $e^{-1/x}$) for its density function [CT62, Theorem 2 and Eq. 2.22] (see also [BS02, Eqs. 1.3.0.1 and 1.3.0.2]). Since the Fourier transform is invertible on $L^2(\mathbb{R})$, the equivalence of (3) and (4) follows by considering the moment generating function restricted to the imaginary line $\mathrm{Re}\,\lambda = 0$.

Duplantier [Dup90] predicted the fractal dimension of the gasket associated with the O(n) model to be

$$\frac{3\kappa}{32} + 1 + \frac{2}{\kappa},$$

where as usual $n = -2\cos(4\pi/\kappa)$. We partially confirm this prediction by giving the expectation dimension of the gasket associated with $\mathrm{CLE}_\kappa$. The expectation dimension of a random bounded set $A$ is defined to be

$$D_E(A) = \lim_{\varepsilon \to 0} \frac{\log E[\text{minimal number of balls of radius } \varepsilon \text{ required to cover } A]}{|\log \varepsilon|},$$

provided the limit exists. The expectation dimension is an upper bound for the Hausdorff dimension.

Theorem 2. Let $\Gamma$ be the gasket of a $\mathrm{CLE}_\kappa$ in the unit disk with $\kappa \in (8/3, 8)$. Then

$$E[\text{minimal number of balls of radius } \varepsilon \text{ required to cover } \Gamma] \asymp \left(\frac{1}{\varepsilon}\right)^{\frac{3\kappa}{32} + 1 + \frac{2}{\kappa}}.$$

In particular, the expectation dimension of the gasket is $\frac{3\kappa}{32} + 1 + \frac{2}{\kappa}$.

Here, $\asymp$ denotes equivalence up to multiplicative constants. Lawler, Schramm, and Werner [LSW02] studied the percolation gasket (associated with $\mathrm{CLE}_6$), effectively proving Theorem 2 in the case $\kappa = 6$. More generally, they studied how long it takes for a radial $\mathrm{SLE}_\kappa$ to surround the origin when $\kappa > 4$, and their results imply Theorem 2 when $\kappa > 4$; see the remark in Sect. 2.1 for further discussion.

We conclude our introduction by noting that the gasket dimension described above plays an important role in the physics literature, where it is related (at least heuristically) to the exponents of magnetization and multipoint correlation functions in critical lattice models. We briefly describe this connection in the case of the q-state Potts model on the square lattice. More details and references are found in [She06,Car07,Gri06].



A sample from the q-state Potts model on a connected planar graph $G$ is a random function $\sigma : V \to \{1, 2, \dots, q\}$, where $V$ is the set of vertices of $G$ and the image values $1, 2, \dots, q$ are often called spins. If the boundary vertices (those on the boundary of the unbounded face) of $G$ are all assigned a particular value (say $b$), then using the standard FK random cluster decomposition [FK72], one may construct a sample from the Potts model as follows:

1. Sample a random subgraph $G'$ of $G$ containing all boundary edges (edges on the boundary of the unbounded face), with probability proportional to
$$\left(\frac{p}{1 - p}\right)^{\#\text{ edges of } G'} q^{\#\text{ components of } G'},$$
where $0 < p < 1$ is a parameter. The law of $G'$ is called the FK random cluster model with parameters $p$ and $q$. Call the component of $G'$ which contains the boundary vertices of $G$ the FK gasket.
2. Set $\sigma(v) = b$ for each $v$ in the FK gasket, and independently assign one of the $q$ states uniformly at random to each of the remaining connected components of $G'$ (assigning all vertices in that component the corresponding state).

The "magnetization" at an interior vertex $v$ of $G$ (i.e., the probability that $\sigma(v) = j$ minus $1/q$) is proportional to the probability that $v$ is in the FK gasket. Given distinct vertices $v$ and $w$, the covariance of $\sigma(v)$ and $\sigma(w)$ is proportional to the probability that both $v$ and $w$ lie in the same component of $G'$.

We now restrict to the case in which $G$ is a finite piece of the square grid in the plane and the parameter $p$ satisfies $p/(1 - p) = \sqrt{q}$. (With this choice of $p$, the FK model is self-dual and believed to be critical, see e.g. [Gri06, Chap. 6].) It is shown in [She06] that if $q \le 4$ and certain other hypotheses including conformal invariance hold, then the scaling limit of the set of boundaries between clusters and dual clusters in the critical FK random cluster models discussed above must be given by $\mathrm{CLE}_\kappa$ for the $\kappa$ satisfying $q = 4\cos^2(4\pi/\kappa)$ and $4 \le \kappa \le 8$.
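The FK weight and the parameter relations used above can be sketched in a few lines. This is a minimal illustration only: the helper names are hypothetical, vertices are assumed to be labelled 0, ..., n-1, and the boundary condition (all boundary edges forced open) is ignored for simplicity.

```python
import math


def count_components(num_vertices, edges):
    """Number of connected components of the graph (vertices 0..n-1), via union-find."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = num_vertices
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components


def fk_weight(num_vertices, open_edges, p, q):
    """Unnormalized FK weight (p/(1-p))^{#edges} * q^{#components} of a subgraph."""
    return (p / (1 - p)) ** len(open_edges) * q ** count_components(num_vertices, open_edges)


def self_dual_p(q):
    """The p solving p/(1-p) = sqrt(q), i.e. the self-dual point on the square grid."""
    root = math.sqrt(q)
    return root / (1 + root)


def q_from_kappa(kappa):
    """q = 4 cos^2(4 pi / kappa), for 4 <= kappa <= 8."""
    return 4 * math.cos(4 * math.pi / kappa) ** 2
```

For instance, `q_from_kappa(6)` is 1 up to rounding (percolation), and `q_from_kappa(4)` is 4, the largest value of $q$ covered by the statement above.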
Assuming these hypotheses, the scaling limit of the discrete gasket is precisely the continuum $\mathrm{CLE}_\kappa$ gasket. A heuristic ansatz is that the law of the critical FK gasket should have properties similar to those of the law of the set of squares in a fine grid which intersect the continuum gasket. If this heuristic holds, then when $G$ is a bounded domain intersected with a square grid with spacing $\varepsilon$, the magnetization at a vertex $v$ at macroscopic distance from the boundary in the discrete model will be on the order of $\varepsilon^{2-d}$, where $d$ is the limiting expectation dimension of the continuum gasket. Similarly, the covariance between $\sigma(v)$ and $\sigma(w)$, for two macroscopically separated vertices $v$ and $w$, should be on the order of $\varepsilon^{2(2-d)}$ (since in the continuum model, the set of pairs $v$ and $w$ which lie in the same continuum spin cluster has dimension $2d$; see [She06]).

2. Diffusions and Martingales

2.1. Reduction to a diffusion problem. Let $B_t : [0, \infty) \to \mathbb{R}$ be a standard Brownian motion and let $\theta_t : [0, \infty) \to [0, 2\pi]$ be a random continuous process on the interval $[0, 2\pi]$ that is instantaneously reflecting at its endpoints (i.e., the set $\{t : \theta_t \in \{0, 2\pi\}\}$ has Lebesgue measure zero almost surely) and evolves according to the SDE

$$d\theta_t = \frac{\kappa - 4}{2}\cot(\theta_t/2)\,dt + \sqrt{\kappa}\,dB_t \tag{6}$$


on each interval of time for which $\theta_t \notin \{0, 2\pi\}$. In other words, $\theta_t$ is a random continuous process adapted to the filtration of $B_t$ which almost surely satisfies

$$\frac{\partial}{\partial t}\left(\theta_t - \sqrt{\kappa}\,B_t\right) = \frac{\kappa - 4}{2}\cot(\theta_t/2)$$

for all $t$ for which the right-hand side is well defined. The law of this process is uniquely determined by $\theta_0$ [She06], and we also have the following from [She06]:

Proposition 2. When $8/3 < \kappa < 8$, the law of $B_k^z$ is the same as the law of $\inf\{t : \theta_t = 2\pi\}$ for the diffusion (6) started at $\theta_0 = 0$.

It is convenient to lift the process $\theta_t$ so that, rather than taking values in $[0, 2\pi]$, it takes values in all of $\mathbb{R}$. Let $R : \mathbb{R} \to [0, 2\pi]$ be the piecewise affine map for which $R(x) = |x|$ when $x \in [-2\pi, 2\pi]$ and $R(4\pi + x) = R(x)$ for all $x$. Given $\theta_t$, we can generate a continuous process $\tilde\theta_t$ with $R(\tilde\theta_t) = \theta_t$ in such a way that for each component $(t_1, t_2)$ of the set $\{t : \theta_t \notin 2\pi\mathbb{Z}\}$, we independently toss a fair coin to decide whether $\tilde\theta_t > \tilde\theta_{t_1}$ or $\tilde\theta_t < \tilde\theta_{t_1}$ on that component. The $\theta_t$ together with these coin tosses (for each interval of $\{t : \theta_t \notin 2\pi\mathbb{Z}\}$) determine $\tilde\theta_t$ uniquely. This $\tilde\theta_t$ is still a solution to (6) provided we modify $B_t$ in such a way that $dB_t$ is replaced with $-dB_t$ on those intervals for which $d\tilde\theta_t = -d\theta_t$. (This modification does not change the law of $B_t$.) In the remainder of the text, we will drop the $\tilde\theta_t$ notation and write $\theta_t$ for the lifted process on $\mathbb{R}$.

Remark. A very similar diffusion process was studied by Lawler, Schramm, and Werner [LSW02], namely

$$d\theta_t = \cot(\theta_t/2)\,dt + \sqrt{\kappa}\,dB_t, \tag{7}$$

which is the same as (6) but without the factor of $(\kappa - 4)/2$, and they too studied the time for the diffusion to reach $\theta_t = 2\pi$ when started at $\theta_0 = 0$. Diffusions (6) and (7) are identical when $\kappa = 6$, and diffusion (6) (for $4 < \kappa < 8$) is given by diffusion (7) (for $4 < \kappa < \infty$) upon substituting $\kappa \to 2\kappa/(\kappa - 4)$ and scaling time by $t \to t(\kappa - 4)/2$. However, diffusion (6) is a more singular Bessel-type process when $\kappa \le 4$, requiring additional technical analysis to deal with the times at which the process is at 0 (see e.g. Lemma 3). Furthermore, only the large-time asymptotic decay rate of the hitting time distribution (used in the proof of Theorem 2) is given in [LSW02], and additional effort is required to obtain the precise hitting time distribution provided in Theorem 1.
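Two elementary ingredients of this subsection are easy to make concrete: the folding map $R$ used to lift $\theta_t$, and the parameter substitution relating (6) to (7). The sketch below uses hypothetical helper names; `lsw_parameters` returns the pair $(\kappa', c)$ such that running (7) with parameter $\kappa'$ at speed $c$ gives drift coefficient $c$ and squared noise coefficient $c\kappa'$, which match those of (6) exactly when $c = (\kappa - 4)/2$ and $c\kappa' = \kappa$.

```python
import math

TWO_PI = 2 * math.pi


def fold(x):
    """The map R: equal to |x| on [-2*pi, 2*pi] and extended with period 4*pi."""
    return abs((x + TWO_PI) % (2 * TWO_PI) - TWO_PI)


def lsw_parameters(kappa):
    """Return (kappa_prime, speed) matching diffusion (6) to diffusion (7).

    Assumes 4 < kappa < 8, the range in which the substitution is stated.
    """
    if not 4 < kappa < 8:
        raise ValueError("the substitution is stated for 4 < kappa < 8")
    speed = (kappa - 4) / 2
    kappa_prime = 2 * kappa / (kappa - 4)
    return kappa_prime, speed
```

At $\kappa = 6$ the helper returns $(6, 1)$, in line with the statement that the two diffusions coincide there.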

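2.2. Local martingales. Recall the hypergeometric function defined by

$$F(a, b; c; z) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n\, n!}\, z^n,$$

where $a, b, c \in \mathbb{C}$ are parameters, $c \notin -\mathbb{N}$ (where $\mathbb{N} = \{0, 1, 2, \dots\}$), and $(\ell)_n$ denotes $\ell(\ell + 1) \cdots (\ell + n - 1)$. This definition holds for $z \in \mathbb{C}$ when $|z| < 1$, and it may be defined by analytic continuation elsewhere (though it is then not always single-valued). We define for $\lambda \in \mathbb{C}$,

$$M^e_{\kappa,\lambda}(\theta) = F\left(1 - \tfrac{4}{\kappa} + \sqrt{\left(1 - \tfrac{4}{\kappa}\right)^2 + \tfrac{8\lambda}{\kappa}},\; 1 - \tfrac{4}{\kappa} - \sqrt{\left(1 - \tfrac{4}{\kappa}\right)^2 + \tfrac{8\lambda}{\kappa}};\; \tfrac{3}{2} - \tfrac{4}{\kappa};\; \sin^2\tfrac{\theta}{4}\right),$$

$$M^o_{\kappa,\lambda}(\theta) = F\left(1 - \tfrac{2}{\kappa} + \sqrt{\left(\tfrac{1}{2} - \tfrac{2}{\kappa}\right)^2 + \tfrac{2\lambda}{\kappa}},\; 1 - \tfrac{2}{\kappa} - \sqrt{\left(\tfrac{1}{2} - \tfrac{2}{\kappa}\right)^2 + \tfrac{2\lambda}{\kappa}};\; \tfrac{3}{2};\; \cos^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}$$

(where the formula for $M^e_{\kappa,\lambda}(\theta)$ makes sense whenever $\kappa \ne \frac{8}{3}, \frac{8}{5}, \frac{8}{7}, \dots$). There is some ambiguity in the choice of square root, but since $F(b, a; c; z) = F(a, b; c; z)$, as long as the same choice of square root is made for both occurrences, there is no ambiguity in these definitions.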

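Lemma 1. For the diffusion (6) with $\kappa > 0$, let $T$ be the first time at which $\theta_t \in 2\pi\mathbb{Z}$, and let $\bar t = \min(t, T)$. For any $\lambda \in \mathbb{C}$, both $\exp[\lambda \bar t]\, M^e_{\kappa,\lambda}(\theta_{\bar t})$ and $\exp[\lambda \bar t]\, M^o_{\kappa,\lambda}(\theta_{\bar t})$ are local martingales parameterized by $t$.

Proof. Given these formulas, in principle it is straightforward to verify that the $dt$ term of the Itô expansion of $d\left(e^{\lambda t} M^{e|o}_{\kappa,\lambda}(\theta_t)\right)$ is equal to zero (where $e|o$ is either $e$ or $o$). This term can be expressed as

$$e^{\lambda t}\left[\lambda M^{e|o}_{\kappa,\lambda}(\theta_t) + \frac{\kappa - 4}{2}\cot(\theta_t/2)\,\big(M^{e|o}_{\kappa,\lambda}\big)'(\theta_t) + \frac{\kappa}{2}\,\big(M^{e|o}_{\kappa,\lambda}\big)''(\theta_t)\right] dt.$$

Since Mathematica does not simplify this to zero, we show how to do this for $M^e_{\kappa,\lambda}$; the case for $M^o_{\kappa,\lambda}$ is similar. We abbreviate $M^e_{\kappa,\lambda}$ with $M$, let $F$ denote the hypergeometric function in the definition of $M^e_{\kappa,\lambda}$, let $a$, $b$, and $c$ denote the parameters of $F$ in $M$, and change variables to $y = y(\theta) = \sin^2(\theta/4) = (1 - \cos(\theta/2))/2$:

$$M(\theta) = F(y(\theta)),$$
$$M'(\theta) = \tfrac{1}{4} F'(y(\theta)) \sin(\theta/2),$$
$$M''(\theta) = \tfrac{1}{16} F''(y(\theta)) \sin^2(\theta/2) + \tfrac{1}{8} F'(y(\theta)) \cos(\theta/2) = F''(y)\,\frac{y - y^2}{4} + F'(y)\,\frac{\tfrac{1}{2} - y}{4},$$

so that the $dt$ term is

$$e^{\lambda t}\left[\lambda F(y) - \frac{\kappa - 4}{4} F'(y)\left(y - \tfrac{1}{2}\right) + \frac{\kappa}{8} F''(y)\left(y - y^2\right) + \frac{\kappa}{8} F'(y)\left(\tfrac{1}{2} - y\right)\right]$$
$$= e^{\lambda t}\left[\lambda F(y) + (1 - 3\kappa/8)\, F'(y)\left(y - \tfrac{1}{2}\right) + \frac{\kappa}{8} F''(y)\left(y - y^2\right)\right]$$
$$= e^{\lambda t} \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n\, n!}\, y^n \underbrace{\left[\lambda + (1 - 3\kappa/8)\left(n - \frac{1}{2}\,\frac{(a + n)(b + n)}{c + n}\right) + \frac{\kappa}{8}\left(n\,\frac{(a + n)(b + n)}{c + n} - n(n - 1)\right)\right]}_{E_n/(c + n)}$$

and we define $E_n$ to be $c + n$ times the expression in brackets. (Note that indeed $c \notin -\mathbb{N}$.) We may write $E_n$ as a polynomial in $n$:

$$E_n = \left[(1 - 3\kappa/8)(1 - 1/2) + (\kappa/8)(1 - c + a + b)\right] n^2 + \left[\lambda + (1 - 3\kappa/8)(c - a/2 - b/2) + (\kappa/8)(c + ab)\right] n + \left[c\lambda - (1 - 3\kappa/8)\,ab/2\right].$$

By our choices of $a$, $b$ and $c$, $E_n = 0$ for each $n$, which proves the claim for $M^e_{\kappa,\lambda}$.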
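The vanishing of $E_n$ is a short but error-prone computation, so a numerical spot check may be useful. The sketch below evaluates $E_n$ directly as $(c + n)$ times the bracketed expression, with $a$, $b$, $c$ taken from the definition of $M^e_{\kappa,\lambda}$; the function names are illustrative, and complex $\lambda$ is handled with `cmath`.

```python
import cmath


def even_parameters(kappa, lam):
    """Parameters (a, b, c) of the hypergeometric function in M^e_{kappa, lambda}."""
    root = cmath.sqrt((1 - 4 / kappa) ** 2 + 8 * lam / kappa)
    a = 1 - 4 / kappa + root
    b = 1 - 4 / kappa - root
    c = 1.5 - 4 / kappa
    return a, b, c


def E_n(n, kappa, lam):
    """(c + n) times the bracketed coefficient of y^n in the dt term."""
    a, b, c = even_parameters(kappa, lam)
    w = 1 - 3 * kappa / 8
    return (
        lam * (c + n)
        + w * (n * (c + n) - (a + n) * (b + n) / 2)
        + (kappa / 8) * (n * (a + n) * (b + n) - n * (n - 1) * (c + n))
    )


if __name__ == "__main__":
    for n in range(5):
        print(n, abs(E_n(n, 6.0, -1 + 2j)))  # each value is zero up to rounding
```

The check relies only on $a + b = 2 - 8/\kappa$, $ab = -8\lambda/\kappa$ and $c = 3/2 - 4/\kappa$, so the choice of branch for the square root does not matter.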

2.3. Expected first hitting of $2\pi\mathbb{Z}$. In this subsection we obtain asymptotics for the function $L(\theta) := E[\theta_T \mid \theta_0 = \theta]$, where $\theta_t$ is the diffusion (6) and $T$ is the first time $t \ge 0$ at which $\theta_t \in 2\pi\mathbb{Z}$. (Recall that $T$ is finite a.s. when $\kappa < 8$.) Whenever $\theta \in 2\pi\mathbb{Z}$, trivially $L(\theta) = \theta$.

Lemma 2. For the diffusion (6) with $8/3 < \kappa < 8$, $L(\theta_t)$ is a martingale.

Proof. Since $L(\theta)$ is defined in terms of expected values, $L(\theta_t)$ is a local martingale whenever $\theta_t \notin 2\pi\mathbb{Z}$, and the stopped process $L(\theta_{\min(t,T)})$ is a martingale. Since the diffusion behaves symmetrically around the points of $2\pi\mathbb{Z}$ and the number of intervals of $\mathbb{R} \setminus 2\pi\mathbb{Z}$ crossed before some deterministic time has exponentially decaying tails (which implies integrability), $L(\theta_t)$ is a martingale.

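Next, we express $L(\theta_t)$ in terms of the $\lambda = 0$ case of the local martingales $\exp[\lambda t]\, M^e_{\kappa,\lambda}(\theta_t)$ and $\exp[\lambda t]\, M^o_{\kappa,\lambda}(\theta_t)$. Because $M^e_{\kappa,0}(\theta) = 1$, this local martingale is uninformative, but

$$M^o_{\kappa,0}(\theta) = F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2}; \cos^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}.$$

We have $M^o_{\kappa,0}(0) = -M^o_{\kappa,0}(2\pi) = \sqrt{\pi}\,\Gamma(4/\kappa - 1/2)/(2\Gamma(4/\kappa))$, and since $M^o_{\kappa,0}$ is bounded (when $\kappa \ne 8$), the stopped process $M^o_{\kappa,0}(\theta_{\min(t,T)})$ is also a martingale. This determines $L$, namely,

$$L(\theta) = \pi - \frac{2\sqrt{\pi}\,\Gamma\left(\tfrac{4}{\kappa}\right)}{\Gamma\left(\tfrac{4}{\kappa} - \tfrac{1}{2}\right)}\, F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2}; \cos^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}, \qquad \theta \in [0, 2\pi] \tag{8}$$

and $L(\theta + 2\pi) = L(\theta) + 2\pi$. (It is also possible to derive (8) from the formula for Pr[SLE trace passes to left of $x + iy$] [Sch01] after applying a Möbius transformation and suitable hypergeometric identities.) We wish to understand the behavior of $L$ near the points in $2\pi\mathbb{Z}$, and to this end we use the formula (see [EMOT53, p. 108, Eq. 2.10.1])

$$F(a, b; c; z) = \frac{\Gamma(c)\Gamma(c - a - b)}{\Gamma(c - a)\Gamma(c - b)}\, F(a, b; a + b - c + 1; 1 - z) + (1 - z)^{c - a - b}\, \frac{\Gamma(c)\Gamma(a + b - c)}{\Gamma(a)\Gamma(b)}\, F(c - a, c - b; c - a - b + 1; 1 - z) \tag{9}$$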


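which is valid when $1 - c$, $b - a$, and $c - b - a$ are not integers and $|\arg(1 - z)| < \pi$. In our case the nonintegrality condition is satisfied whenever $8/\kappa \notin \mathbb{N}$. For the range of $\kappa$ that we are interested in, this rules out $\kappa = 4$, for which we already know $L(\theta) = \theta$, and therefore do not need asymptotics. The endpoints of the range, $\kappa = 8/3$ and $\kappa = 8$, are also ruled out, but for the remaining $\kappa$'s we have

$$F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2}; \cos^2\tfrac{\theta}{2}\right) = \frac{\Gamma\left(\tfrac{3}{2}\right)\Gamma\left(\tfrac{4}{\kappa} - \tfrac{1}{2}\right)}{\Gamma\left(\tfrac{4}{\kappa}\right)\Gamma(1)}\, F\left(\tfrac{3}{2} - \tfrac{4}{\kappa}, \tfrac{1}{2}; \tfrac{3}{2} - \tfrac{4}{\kappa}; \sin^2\tfrac{\theta}{2}\right) + \frac{\Gamma\left(\tfrac{3}{2}\right)\Gamma\left(\tfrac{1}{2} - \tfrac{4}{\kappa}\right)}{\Gamma\left(\tfrac{3}{2} - \tfrac{4}{\kappa}\right)\Gamma\left(\tfrac{1}{2}\right)} \left(\sin\tfrac{\theta}{2}\right)^{\frac{8}{\kappa} - 1} F\left(\tfrac{4}{\kappa}, 1; \tfrac{4}{\kappa} + \tfrac{1}{2}; \sin^2\tfrac{\theta}{2}\right)$$
$$= \frac{\sqrt{\pi}\,\Gamma\left(\tfrac{4}{\kappa} - \tfrac{1}{2}\right)}{2\,\Gamma\left(\tfrac{4}{\kappa}\right)}\, \frac{1}{\cos\tfrac{\theta}{2}} + \frac{\Gamma\left(\tfrac{1}{2} - \tfrac{4}{\kappa}\right)}{2\,\Gamma\left(\tfrac{3}{2} - \tfrac{4}{\kappa}\right)} \left(\sin\tfrac{\theta}{2}\right)^{\frac{8}{\kappa} - 1} F\left(\tfrac{4}{\kappa}, 1; \tfrac{4}{\kappa} + \tfrac{1}{2}; \sin^2\tfrac{\theta}{2}\right),$$

and by (8),

$$L(\theta) = c_\kappa \left(\sin\tfrac{\theta}{2}\right)^{\frac{8}{\kappa} - 1} F\left(\tfrac{4}{\kappa}, 1; \tfrac{4}{\kappa} + \tfrac{1}{2}; \sin^2\tfrac{\theta}{2}\right)\cos\tfrac{\theta}{2}, \qquad \theta \in (0, \pi),$$

for some constant $c_\kappa > 0$. (One can use the Legendre duplication formula to show that $c_\kappa = 2^{8/\kappa - 1}\,\Gamma(4/\kappa)^2/\Gamma(8/\kappa)$, but we do not need this.) Since $L(-\theta) = -L(\theta)$, we conclude that

$$L(\theta) = A_0(\theta^2)\, |\theta|^{8/\kappa}/\theta, \qquad \theta \in (-\pi, \pi),$$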

where $A_0$ is an analytic function (depending on $\kappa$) satisfying $A_0(0) > 0$. This implies

$$\theta^2 = A\left(|L(\theta)|^{2\kappa/(8 - \kappa)}\right) \tag{10}$$

near $\theta = 0$, for some analytic $A$.

2.4. Starting at $\theta_0 = 0$. We will eventually need to start the diffusion at $\theta_0 = 0$, but Lemma 1 only covers what happens up until the first time that $\theta_t \in 2\pi\mathbb{Z}$. In this subsection we show

Lemma 3. For the diffusion (6) with $8/3 < \kappa < 8$, let $T$ be the first time at which $\theta_t \in 2\pi\mathbb{Z} \setminus 4\pi\mathbb{Z}$, and let $\bar t = \min(t, T)$. For any $\lambda \in \mathbb{C}$, the process $\exp[\lambda \bar t]\, M^e_{\kappa,\lambda}(\theta_{\bar t})$ is a martingale.

Proof. Let $M$ abbreviate $M^e_{\kappa,\lambda}$, and let us assume without loss of generality that $-2\pi < \theta_0 < 2\pi$. Let us define $\omega_t = L(\theta_t)$, which is a martingale and may be interpreted as a time-changed Brownian motion. We wish to argue that $e^{\lambda \bar t} M(L^{-1}(\omega_{\bar t})) = e^{\lambda \bar t} M(\theta_{\bar t})$ is a local martingale. (The definition of $L$ implies that it is strictly monotone, and hence $L^{-1}$ is well defined.) Note that by Lemma 1 it is a local martingale when $\theta_{\bar t} \notin 2\pi\mathbb{Z}$. To extend this to a neighborhood of $\theta_{\bar t} = 0$, one could try to use Itô's formula. To do this, it would be necessary that $f := M \circ L^{-1}$ be twice differentiable. We have $M(\theta) = A_1(\theta^2)$ in $(-2\pi, 2\pi)$, where $A_1$ is analytic. Consequently, (10) gives for $8/3 < \kappa < 8$ and for $\omega$ in a neighborhood of 0,

$$f(\omega) = A_2\left(|\omega|^{2\kappa/(8 - \kappa)}\right), \tag{11}$$

for some analytic $A_2$. Though this is not necessary for the proof, one can check that $A_2'(0) \ne 0$ and therefore $f''(0)$ is not finite when $\kappa < 4$.

To circumvent the problem of $f = M \circ L^{-1}$ not being twice differentiable, we use the Itô-Tanaka Theorem ([RY99, Theorem 1.5 on p. 223]). The exponent $\frac{2\kappa}{8 - \kappa}$ in (11) ranges from 1 to $\infty$ as $\kappa$ ranges from 8/3 to 8. In particular, $f'(0) = 0$ and $f'$ is continuous near 0. Since $A_2$ is analytic, near 0 the function $f$ may be expressed as the difference of two convex functions, namely, $f(\omega) = (f(\omega) - f(0))\,1_{\omega \ge 0} + (f(\omega) - f(0))\,1_{\omega \le 0} + f(0)$. Therefore, we may apply the Itô-Tanaka Theorem to conclude that $e^{\lambda \bar t} f(\omega_{\bar t}) = e^{\lambda \bar t} M(\theta_{\bar t})$ is a local martingale also when $\theta_{\bar t}$ is near zero.

Now, the hypergeometric function $F$ satisfies

$$F(a, b; c; 1) = \frac{\Gamma(c)\Gamma(c - a - b)}{\Gamma(c - a)\Gamma(c - b)} \tag{12}$$

provided $-c \notin \mathbb{N}$ and $\mathrm{Re}\,c > \mathrm{Re}(a + b)$ (see e.g. [EMOT53, p. 104, Eq. 46]). Therefore, $M(\pm 2\pi)$ is finite. Thus, $e^{\lambda \bar t} M(\theta_{\bar t})$ is bounded for bounded $t$, and we may conclude that it is a martingale.

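For future reference, we now calculate $M^e_{\kappa,\lambda}(\pm 2\pi)$. Observe that the parameters $a$, $b$, $c$ of the hypergeometric function in the definition of $M^e_{\kappa,\lambda}$ satisfy $2c - a - b = 1$. Consequently, the identity $\Gamma(z)\Gamma(1 - z) = \pi/\sin(\pi z)$ and (12) give

$$M^e_{\kappa,\lambda}(\pm 2\pi) = \frac{\sin\left(\frac{\pi}{2} - \pi\sqrt{\left(1 - \frac{4}{\kappa}\right)^2 + \frac{8\lambda}{\kappa}}\right)}{\sin(3\pi/2 - 4\pi/\kappa)} = \frac{\cos\left(\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}\right)}{\cos(\pi(1 - 4/\kappa))}. \tag{13}$$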
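The hypergeometric manipulations of Sect. 2.3 can be spot-checked numerically. The sketch below compares the closed form (8) for $L(\theta)$ with the local expansion near 0, using the value of $c_\kappa$ given by the duplication formula, and also checks the case $\kappa = 4$, where $L(\theta) = \theta$. The series routine is a plain partial sum (adequate here because both arguments stay well inside the unit disk); all names are illustrative.

```python
import math


def hyp2f1(a, b, c, z, terms=400):
    """Partial sum of the hypergeometric series F(a, b; c; z); assumes |z| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total


def L_closed_form(theta, kappa):
    """Formula (8), evaluated for theta strictly between 0 and 2*pi."""
    x = 4 / kappa
    w = math.cos(theta / 2)
    prefactor = 2 * math.sqrt(math.pi) * math.gamma(x) / math.gamma(x - 0.5)
    return math.pi - prefactor * hyp2f1(1.5 - x, 0.5, 1.5, w * w) * w


def L_local_form(theta, kappa):
    """Expansion of L near 0, for theta in (0, pi), with c_kappa from the duplication formula."""
    x = 4 / kappa
    s = math.sin(theta / 2)
    c_kappa = 2 ** (2 * x - 1) * math.gamma(x) ** 2 / math.gamma(2 * x)
    return c_kappa * s ** (2 * x - 1) * hyp2f1(x, 1.0, x + 0.5, s * s) * math.cos(theta / 2)


if __name__ == "__main__":
    for kappa in (3.0, 4.0, 5.0, 6.0):
        print(kappa, L_closed_form(1.0, kappa), L_local_form(1.0, kappa))
```

Values of $\theta$ very close to 0 or $2\pi$ push the argument of the series in (8) toward 1, where the partial sum converges slowly; the sketch is intended for moderate $\theta$ only.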

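3. Proofs of Main Results

We now restate and prove Theorem 1.

Theorem 3. Suppose the diffusion process (6) (with $8/3 < \kappa < 8$) is started at $\theta_0 = 0$, and $T$ is the first time at which $\theta_t = \pm 2\pi$. If $\mathrm{Re}\,\lambda \le 0$, then

$$E\left[e^{\lambda T} \mid \theta_0 = 0\right] = \frac{\cos(\pi(1 - 4/\kappa))}{\cos\left(\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}\right)}.$$

(This is equivalent to Theorem 1 by Proposition 2 and the remarks following the statement of Theorem 1.)

Proof. Since $M^e_{\kappa,\lambda}(\theta_T) = M^e_{\kappa,\lambda}(\pm 2\pi) = M^e_{\kappa,\lambda}(2\pi)$ a.s. and $\exp[\lambda \bar t]\, M^e_{\kappa,\lambda}(\theta_{\bar t})$ is a martingale, the optional sampling theorem gives

$$M^e_{\kappa,\lambda}(2\pi)\, E\left[e^{\lambda T} \mid \theta_0 = 0\right] = E\left[e^{\lambda T} M^e_{\kappa,\lambda}(\theta_T) \mid \theta_0 = 0\right] = M^e_{\kappa,\lambda}(0) = 1,$$

and the proof is completed by appeal to (13).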



Proof of Theorem 2. Fix some $\varepsilon > 0$ and let $z \in D$. Set $r_0 := 1 - |z|$, $r_1 := \mathrm{dist}(z, L_1^z)$, and suppose that $\varepsilon < r_0$. We seek to estimate the probability that the open disk of radius $\varepsilon$ about $z$ intersects the gasket; that is, the probability that $r_1 < \varepsilon$. By the Koebe 1/4 theorem, $r_1 \le \mathrm{CR}(D_1, z) \le 4 r_1$, where $D_1 := A_1^z$. Likewise, $r_0 \le \mathrm{CR}(D, z) \le 4 r_0$. Thus, $B_1^z = \log \mathrm{CR}(D, z) - \log \mathrm{CR}(D_1, z) = \log(r_0/r_1) + O(1)$. Referring to the density function of $B_k^z$ (4), we see that $\Pr[r_1 < \varepsilon] \asymp \exp[-\alpha \log(r_0/\varepsilon)] = (\varepsilon/r_0)^\alpha$, where

$$\alpha = \frac{1/4 - (1 - 4/\kappa)^2}{8/\kappa} = -\frac{3\kappa}{32} + 1 - \frac{2}{\kappa} = \frac{(8 - \kappa)(3\kappa - 8)}{32\kappa}.$$

For each $j = 1, \dots, \lceil 1/\varepsilon \rceil$, we may cover the annulus $\{z : (j - 1)\varepsilon \le 1 - |z| \le j\varepsilon\}$ by $O(1/\varepsilon)$ disks of radius $\varepsilon$. The total expected number of these disks that intersect the gasket is at most

$$\sum_{j=1}^{\lceil 1/\varepsilon \rceil} O(1/\varepsilon) \times O\left(\varepsilon/(j\varepsilon)\right)^\alpha = O(\varepsilon^{\alpha - 2}).$$

(Here we made use of the fact that $\alpha < 1$.) Thus on average $O(\varepsilon^{\alpha - 2})$ disks of radius $\varepsilon$ suffice to cover the gasket.

On the other hand, we may pack into $D$ on the order of $1/\varepsilon^2$ points so that every two of them are more than distance $4\varepsilon$ apart, and each of them is at least distance 1/2 from the boundary. For each such point $z$ there is a chance of order $\varepsilon^\alpha$ that the disk of radius $\varepsilon$ centered at $z$ is not surrounded by a loop, i.e., that the gasket contains a point $z'$ that is within distance $\varepsilon$ of $z$. Since the points $z$ are sufficiently far apart, the points $z'$ must be covered by distinct disks in any covering of the gasket by disks of radius $\varepsilon$. Thus the expected number of disks of radius $\varepsilon$ required to cover the gasket is at least of order $\varepsilon^{\alpha - 2}$.
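The three expressions for $\alpha$, and the identity $2 - \alpha = 3\kappa/32 + 1 + 2/\kappa$ linking the proof to the statement of Theorem 2, can be checked in exact rational arithmetic. The function names below are illustrative only.

```python
from fractions import Fraction


def alpha(kappa):
    """alpha = (8 - kappa)(3 kappa - 8) / (32 kappa)."""
    kappa = Fraction(kappa)
    return (8 - kappa) * (3 * kappa - 8) / (32 * kappa)


def alpha_from_density(kappa):
    """alpha as the slowest decay rate in (4): (1/4 - (1 - 4/kappa)^2) / (8/kappa)."""
    kappa = Fraction(kappa)
    return (Fraction(1, 4) - (1 - 4 / kappa) ** 2) / (8 / kappa)


def gasket_dimension(kappa):
    """Expectation dimension 3 kappa / 32 + 1 + 2 / kappa from Theorem 2."""
    kappa = Fraction(kappa)
    return 3 * kappa / 32 + 1 + 2 / kappa


if __name__ == "__main__":
    for kappa in (Fraction(8, 3), 4, 6, 8):
        print(kappa, alpha(kappa), gasket_dimension(kappa))
```

For $\kappa = 6$ this gives dimension 91/48, and at the endpoints $\kappa = 8/3$ and $\kappa = 8$ the exponent $\alpha$ vanishes, so the formula returns 2.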

4. Open Problems

Kenyon and Wilson [KW04] also predicted the large-$k$ limiting distribution of another quantity, the "electrical thickness" of the loops $L_k^z$ when $k \to \infty$. The electrical thickness of a loop compares the conformal radius of the loop to the conformal radius of the image of the loop under the map $m(w) = 1/(w - z)$, and more precisely it is

$$\vartheta_z(L_k^z) = -\log \mathrm{CR}(L_k^z, z) - \log \mathrm{CR}(m(L_k^z), z).$$

Kenyon and Wilson [KW04] predicted that the large-$k$ moment generating function of $\vartheta_z(L_k^z)$ is

$$\lim_{k \to \infty} E[\exp(\lambda \vartheta_z(L_k^z))] = \frac{\sin(\pi(1 - 4/\kappa))}{\pi(1 - 4/\kappa)} \cdot \frac{\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}}{\sin\left(\pi\sqrt{(1 - 4/\kappa)^2 + 8\lambda/\kappa}\right)}, \tag{14}$$

or equivalently that the limiting probability density function is given by the density function of the exit time of a standard Brownian excursion started in the middle of the interval $(-2\pi/\sqrt{\kappa}, 2\pi/\sqrt{\kappa})$, reweighted by a factor of const $\times \exp[(\kappa - 4)^2 x/(8\kappa)]$. (This equivalence follows from [BS02, Eq. 5.3.0.1].) Recall that the density function of $B_k^z$ is given by the density function of the exit time of a standard Brownian motion started in the middle of the interval $(-2\pi/\sqrt{\kappa}, 2\pi/\sqrt{\kappa})$, also reweighted by a factor of const $\times \exp[(\kappa - 4)^2 x/(8\kappa)]$. These forms are highly suggestive, but currently we do not know how to calculate the electrical thickness using $\mathrm{CLE}_\kappa$, nor do we have a conceptual explanation for why these distributions take these forms.

References

[BS02] Borodin, A.N., Salminen, P.: Handbook of Brownian Motion - Facts and Formulae. Probability and its Applications. Basel: Birkhäuser Verlag, 2nd edition, 2002
[Car07] Cardy, J.: ADE and SLE. J. Phys. A 40(7), 1427–1438 (2007)
[CN06] Camia, F., Newman, C.M.: Two-dimensional critical percolation: the full scaling limit. Commun. Math. Phys. 268(1), 1–38 (2006)
[CN07] Camia, F., Newman, C.M.: Critical percolation exploration path and SLE6: a proof of convergence. Probab. Theory Related Fields 139(3-4), 473–519 (2007)
[CT62] Ciesielski, Z., Taylor, S.J.: First passage times and sojourn times for Brownian motion in space and the exact Hausdorff measure of the sample path. Trans. Amer. Math. Soc. 103(3), 434–450 (1962)
[CZ03] Cardy, J., Ziff, R.M.: Exact results for the universal area distribution of clusters in percolation, Ising, and Potts models. J. Stat. Phys. 110(1-2), 1–33 (2003)
[Dub05] Dubédat, J.: Personal communication, 2005
[Dup90] Duplantier, B.: Exact fractal area of two-dimensional vesicles. Phys. Rev. Lett. 64(4), 493 (1990)
[EMOT53] Erdélyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G.: Higher Transcendental Functions. Vol. I. New York: McGraw-Hill Book Company, 1953. Based, in part, on notes left by Harry Bateman
[FK72] Fortuin, C.M., Kasteleyn, P.W.: On the random-cluster model. I. Introduction and relation to other models. Physica 57, 536–564 (1972)
[Gri06] Grimmett, G.: The Random-Cluster Model, Volume 333 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 2006
[KN04] Kager, W., Nienhuis, B.: A guide to stochastic Löwner evolution and its applications. J. Stat. Phys. 115(5-6), 1149–1229 (2004)
[KW04] Kenyon, R.W., Wilson, D.B.: Conformal radii of loop models, 2004. Manuscript
[LSW02] Lawler, G.F., Schramm, O., Werner, W.: One-arm exponent for critical 2D percolation. Electron. J. Probab. 7, Paper No. 2, 13 pp. (2002)
[RY99] Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, Volume 293 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, third edition, 1999
[Sch01] Schramm, O.: A percolation formula. Electron. Comm. Probab. 6, 115–120 (2001)
[She06] Sheffield, S.: Exploration trees and conformal loop ensembles. http://arxiv.org/abs/math.PR/0609167, 2006. Duke Math. J., to appear
[Smi01] Smirnov, S.: Critical percolation in the plane: conformal invariance, Cardy's formula, scaling limits. C. R. Acad. Sci. Paris Sér. I Math. 333(3), 239–244 (2001)
[SW08a] Sheffield, S., Werner, W.: Conformal loop ensembles: Construction via loop-soups, 2008, in preparation
[SW08b] Sheffield, S., Werner, W.: Conformal loop ensembles: The Markovian characterization, 2008, in preparation

Communicated by M. Aizenman




A Groupoid Approach to Noncommutative T-Duality Calder Daenzer University of California at Berkeley, 970 Evans Hall, Berkeley, CA 94720-3840, USA. E-mail: [email protected] Received: 6 October 2007 / Accepted: 15 December 2008 Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: Topological T-duality is a transformation taking a gerbe on a principal torus bundle to a gerbe on a principal dual-torus bundle. We give a new geometric construction of T-dualization, which allows the duality to be extended in the following two directions. First, bundles of groups other than tori, even bundles of some nonabelian groups, can be dualized. Second, bundles whose duals are families of noncommutative groups (in the sense of noncommutative geometry) can be treated, though in this case the base space of the bundles is best viewed as a topological stack. Some methods developed for the construction may be of independent interest. These are a Pontryagin type duality that interchanges commutative principal bundles with gerbes, a nonabelian Takai type duality for groupoids, and the computation of certain equivariant Brauer groups.

Contents

1. Introduction
2. Groupoids and G-Groupoids
3. Modules and Morita Equivalence for Groupoids
4. Some Relevant Examples of Groupoids and Morita Equivalences
5. Groupoid Algebras, K-Theory, and Strong Morita Equivalence
6. Equivariant Groupoid Cohomology
7. Gerbes and Twisted Groupoids
8. Pontryagin Duality for Generalized Principal Bundles
9. Twisted Morita Equivalence
10. Some Facts about Groupoid Cohomology
11. Generalized Mackey-Rieffel Imprimitivity
12. Classical T-Duality
13. Nonabelian Takai Duality
14. Nonabelian Noncommutative T-Duality
15. The Equivariant Brauer Group
16. Conclusion
A. Connection with the Mathai-Rosenberg Approach

The research reported here was supported in part by National Science Foundation grants DMS-0703718 and DMS-0611653.

1. Introduction

A principal torus bundle with $U(1)$-gerbe and a principal dual-torus bundle$^1$ with $U(1)$-gerbe are said to be topologically T-dual when there is an isomorphism between the twisted K-theory groups of the two bundles, where the "twisting" of the K-groups is determined by the gerbes on the two bundles.

$^1$ If a torus is written $V/\Lambda$, where $V$ is a real vector space and $\Lambda$ a full rank lattice, then its dual is the torus $\widehat{\Lambda} := \mathrm{Hom}(\Lambda, U(1))$.

The original motivation for the study of T-duality comes from theoretical physics, where it describes several phenomena and is by now a fundamental concept. For example T-duality provides a duality between type IIa and type IIb string theory and a duality on type I string theory (see e.g. [Pol]), and it provides an interpretation of a certain sector of mirror symmetry on Calabi-Yau manifolds (see [SYZ]).

There have been several approaches to constructing T-dual pairs, each with their particular successes. For example Bunke, Rumpf and Schick ([BRS]) have given a description using algebraic topology methods which realizes the duality functorially. This method is very successful for cases in which T-duals exist as commutative spaces, and has recently been extended ([BSST]) to abelian groups other than tori. In complex algebraic geometry, T-duality is effected by the Fourier-Mukai transform; in that context duals to certain toric fibrations with singular fibers (so they are not principal bundles) can be constructed (e.g. [DP,BBP]). Mathai and Rosenberg have constructed T-dual pairs using $C^*$-algebra methods, and with these methods arrived at the remarkable discovery that in certain situations one side of the duality must be a family of noncommutative tori ([MR]).

In this paper we propose yet another construction of T-dual pairs, which can be thought of as a construction of the geometric duality that underlies the $C^*$-algebra duality of the Mathai-Rosenberg approach. To validate the introduction of yet another T-duality construction, let us immediately list some of the new results which it affords. Any of the following language which is not standard will be reviewed in the body of the paper.

• Duality for groups other than tori can be treated, even groups which are not abelian. More precisely, if $N$ is a closed normal subgroup of a Lie group $G$, then the dual of any $G/N$-bundle $P \to X$ with $U(1)$-gerbe can be constructed as long as the gerbe is "equivariant" with respect to the translation action of $G$ on the $G/N$-bundle. The dual is found to be an $N$-gerbe over $X$, with a $U(1)$-gerbe on it. The precise sense in which it is a duality is given by what we call nonabelian Takai duality for groupoids, which essentially gives a way of returning from the dual side to something canonically Morita equivalent to its predual. A twisted K-theory isomorphism is not necessary for there to be a nonabelian Takai duality between the two objects, though we show that there is nonetheless a K-isomorphism whenever $G$ is a simply connected solvable Lie group.

• Duality can be treated for torus bundles (or more generally $G/N$-bundles, as above) whose base is a topological stack rather than a topological space. Such a generalization is found to be crucial for the understanding of duals which are noncommutative (in the sense of noncommutative geometry).
• We find new structure in noncommutative T-duals. For example in the case of principal $T$-bundles, where $T = V/\Lambda \cong \mathbb{R}^n/\mathbb{Z}^n$ is a torus, we find that a noncommutative T-dual is in fact a deformation of a $\Lambda$-gerbe over the base space $X$. The $\Lambda$-gerbe will be given explicitly, as will be the 2-cocycle giving the deformation, and when we restrict the $\Lambda$-gerbe to a point $m \in X$, so that we are looking at what in the classical case would be a single dual-torus fiber, the (deformed) $\Lambda$-gerbe is presented by a groupoid with twisting 2-cocycle, whose associated twisted groupoid algebra is a noncommutative torus. Thus the twisted $C^*$-algebra corresponding to a groupoid presentation of the deformed $\Lambda$-gerbe is a family of noncommutative tori, which matches the result of the Mathai-Rosenberg approach, but we now have an understanding of the "global" structure of this object, which one might say is that of a $\Lambda$-gerbe fibred in noncommutative dual tori. Another benefit to our setup is that a cohomological classification of noncommutative duals is available, given by groupoid (or stack) cohomology.
• Groupoid presentations are compatible with extra geometric structure such as smooth, complex or symplectic structure. This will allow, in particular, for the connection between topological T-duality and the complex T-duality of [DP] and [BBP] to be made precise. We will begin an investigation of this and possible applications to noncommutative homological mirror symmetry in a forthcoming paper with Jonathan Block [BD].

Let us now give a brief outline of our T-dualization construction.
For the outline to be intelligible, the reader should be familiar with groupoids and gerbes or else should browse Sects. (2)–(4) and (7). Let $N$ be a closed normal subgroup of a locally compact group $G$, let $P \to X$ be a principal $G/N$-bundle over a space $X$, and suppose we are given a Čech 2-cocycle $\sigma$ on $P$ with coefficients in the sheaf of $U(1)$-valued functions. It is a classic fact that $\sigma$ determines a $U(1)$-gerbe on $P$, and that such gerbes are classified by the Čech cohomology class of $\sigma$, written $[\sigma] \in \check{H}^2(P; U(1))$. So $\sigma$ represents the gerbe data. (The case which has been studied in the past is $G \simeq \mathbb{R}^n$ and $N \simeq \mathbb{Z}^n$, so that $P$ is a torus bundle.) Given this data $(P, [\sigma])$ of a principal $G/N$-bundle $P$ with $U(1)$-gerbe, we construct a T-dual according to the following prescription:
1. Choose a lift of $[\sigma] \in \check{H}^2(P; U(1))$ to a 2-cocycle $[\tilde\sigma] \in \check{H}^2_G(P; U(1))$ in $G$-equivariant Čech cohomology (see Sect. (6)). If no lift exists there is no T-dual in our framework.
2. From the lift $\tilde\sigma$, define a new gerbe as follows. By definition, $\tilde\sigma$ will be realized as a $G$-equivariant 2-cocycle in the groupoid cohomology of some groupoid presentation $\mathcal{G}(P)$ of $P$ (see Example (3)). Because $\tilde\sigma$ is $G$-equivariant, it can be interpreted as a 2-cocycle on the crossed product groupoid $G \ltimes \mathcal{G}(P)$ for the translation action of $G$ on $\mathcal{G}(P)$ (see Example (2)). Thus $\tilde\sigma$ determines a $U(1)$-gerbe on the groupoid $G \ltimes \mathcal{G}(P)$.
3. The crossed product groupoid $G \ltimes \mathcal{G}(P)$ is shown to present an $N$-gerbe over $X$, so $\tilde\sigma$ is interpreted as the data for a $U(1)$-gerbe on this $N$-gerbe. This $U(1)$-gerbe on an $N$-gerbe can be viewed as the T-dual (there will be ample motivation for this). We construct a canonical induction procedure, nonabelian Takai duality (see Sect. (13)), that recovers the data $(P, \tilde\sigma)$ from this T-dual.


C. Daenzer

4. In the special case that $N$ is abelian and $\tilde\sigma$ has a vanishing “Mackey obstruction”, we construct a “Pontryagin dual” of the $U(1)$-gerbe over the $N$-gerbe of Step (3). This dual object is a principal $G/N^{\mathrm{dual}}$-bundle with $U(1)$-gerbe, where $G/N^{\mathrm{dual}} \equiv \widehat{N} := \mathrm{Hom}(N; U(1))$ is the Pontryagin dual.² Thus in this special case we arrive at a classical T-dual, which is a principal $G/N^{\mathrm{dual}}$-bundle with $U(1)$-gerbe, and the other cases in which we cannot proceed past Step (3) are interpreted as noncommutative and nonabelian versions of classical T-duality.

The above steps are Morita invariant in the appropriate sense and can be translated into statements about stacks. Furthermore, they produce a unique dual object (up to Morita equivalence or isomorphism) once a lift $\tilde\sigma$ has been chosen. It should be noted, however, that neither uniqueness nor existence is an intrinsic feature of a T-dualization whose input data is only $(P, [\sigma])$. In fact, the different possible T-duals are parameterized by the fiber over $[\sigma]$ of the forgetful map $\check{H}^2_G(P; U(1)) \to \check{H}^2(P; U(1))$, which is in general neither injective nor surjective. In some cases the forgetful map is injective. For example, this is true for 1-dimensional tori, and consequently T-duals of gerbes over principal circle bundles are unique.

At the core of our construction is the concept of dualizing by taking a crossed product for a group action. This concept was first applied by Jonathan Rosenberg and Mathai Varghese, albeit in a quite different setting than ours. We have included an Appendix which makes precise the connection between our approach and the approach presented in their paper [MR]. The role of Pontryagin duality in T-duality may have been first noticed by Arinkin and Beilinson, and some notes to this effect can be found in Arinkin’s appendix in [DP] (though this is in the very different setting of complex T-duality).
The idea from Arinkin’s appendix has recently been expanded upon in the topological setting in [BSST]. Our version of Pontryagin duality almost certainly coincides with these, though we arrived at it from a somewhat different perspective.

2. Groupoids and G-Groupoids

Let us fix notation and conventions for groupoids. A set theoretic groupoid is a small category $\mathcal{G}$ with all arrows invertible, written as follows:

\[ \mathcal{G} := (\mathcal{G}_1 \overset{s,r}{\rightrightarrows} \mathcal{G}_0). \]

Here $\mathcal{G}_1$ is the set of arrows, $\mathcal{G}_0$ is the set of units (or objects), $s$ is the source map, and $r$ is the range map. The $n$-tuples of composable arrows will be denoted $\mathcal{G}_n$. Throughout the paper $\gamma$’s will be used to denote arrows in a groupoid unless otherwise noted. A topological groupoid is one whose arrows $\mathcal{G}_1$ and objects $\mathcal{G}_0$ are topological spaces and for which the structure maps (source, range, multiplication, and inversion) are continuous. A left Haar system on a groupoid (see [Ren]) is, roughly speaking, a continuous family of measures on the range fibers of the groupoid that is invariant under left groupoid multiplication. It is shown in [Ren] that for every groupoid admitting a left Haar system, the source and range maps are open maps.

In this paper, a groupoid will mean a topological groupoid whose space of arrows is locally compact Hausdorff. Also each groupoid will be implicitly equipped with a left Haar system. These extra conditions are needed so that groupoid $C^*$-algebras can be defined. Furthermore, all groupoids will be assumed second countable, that is, the space of arrows will be assumed second countable. Second countability of a groupoid ensures that the groupoid algebra is well behaved. For example, second countability implies that the groupoid algebra is separable and thus well-suited for $K$-theory; second countability is invoked in [Ren] when showing that every representation of a groupoid algebra comes from a representation of the groupoid; and the condition is used in [MRW] when showing that Morita equivalence of groupoids implies strong Morita equivalence of the associated groupoid algebras. On the other hand, several results presented here do not involve groupoid algebras in any way. It will hopefully be clear in these situations that the Haar measure and second countability hypotheses, and in some cases local compactness, are unnecessary.

² For example when $N \simeq \mathbb{Z}^n$, $\widehat{N}$ is the $n$-torus which is (by definition) dual to $\mathbb{R}^n/\mathbb{Z}^n$.

Topological groups and topological spaces are groupoids, and they will be assumed here to satisfy the same implicit hypotheses as groupoids. Thus spaces and groups are always second countable, locally compact Hausdorff, and equipped with a left Haar system of measures. A group $G$ can act on a groupoid, forming what is called a $G$-groupoid.

Definition 2.1. A (left) $G$-groupoid is a groupoid $\mathcal{G}$ with a (continuous) left $G$-action on its space of arrows that commutes with all structure maps and whose Haar system is left $G$-invariant.

3. Modules and Morita Equivalence for Groupoids

Let $\mathcal{G}$ be a groupoid. A left $\mathcal{G}$-module is a space $P$ with a continuous map $P \overset{\varepsilon}{\to} \mathcal{G}_0$ called the moment map and a continuous “action” $\mathcal{G} \times_{\mathcal{G}_0} P \to P$; $(\gamma, p) \mapsto \gamma p$. Here $\mathcal{G}_1 \times_{\mathcal{G}_0} P := \{ (\gamma, p) \mid s\gamma = \varepsilon p \}$ is the fibred product and by “action” we mean that $\gamma_1(\gamma_2 p) = (\gamma_1\gamma_2) p$. A right module is defined similarly, and one can convert a left module $P$ to a right module $P^{op}$ by setting $p \cdot \gamma := \gamma^{-1} p$; $\gamma \in \mathcal{G}$, $p \in P^{op}$. A $\mathcal{G}$-action is called free if $(\gamma p = \gamma' p) \Rightarrow (\gamma = \gamma')$ and is called proper if the map $\mathcal{G} \times_{\mathcal{G}_0} P \to P \times P$; $(\gamma, p) \mapsto (\gamma p, p)$ is proper. A $\mathcal{G}$-module is called principal if the $\mathcal{G}$-action is both free and proper, and is called locally trivial when the quotient map $P \to \mathcal{G}\backslash P$ admits local sections. Note that when $\mathcal{G} = (G \rightrightarrows *)$ is a group, a locally trivial principal $\mathcal{G}$-module $P$ is exactly a principal $G$-bundle over the quotient space $G\backslash P$. For this reason principal modules are sometimes called principal bundles. We are reserving the term principal bundle for something else (see Example (3)). Now we come to the important notion of groupoid Morita equivalence.

Definition 3.1. Two groupoids $\mathcal{G}$ and $\mathcal{H}$ are said to be Morita equivalent when there exists a Morita equivalence $(\mathcal{G}$-$\mathcal{H})$-bimodule. This is a space $P$ with commuting left $\mathcal{G}$-module and right $\mathcal{H}$-module structures that are both principal, and satisfying the following extra conditions:


• The quotient space $\mathcal{G}\backslash P$ (with its quotient topology) is homeomorphic to $\mathcal{H}_0$ in a way that identifies the right moment map $P \to \mathcal{H}_0$ with the quotient map $P \to \mathcal{G}\backslash P$.
• The quotient space $P/\mathcal{H}$ (with its quotient topology) is homeomorphic to $\mathcal{G}_0$ in a way that identifies the left moment map $P \to \mathcal{G}_0$ with the quotient map $P \to P/\mathcal{H}$.

In the literature on groupoids one finds several other ways to express Morita equivalence, but they are all equivalent (see for example [BX]). Morita equivalence bimodules give rise to equivalences of module categories. To see this, let $E$ be a right $\mathcal{G}$-module and $P$ a Morita $(\mathcal{G}$-$\mathcal{H})$-bimodule. Then $\mathcal{G}$ acts on $E \times_{\mathcal{G}_0} P$ by $\gamma \cdot (e, p) := (e\gamma^{-1}, \gamma p)$ and one checks that the right $\mathcal{H}$-module structure on $P$ induces one on $E * P := \mathcal{G}\backslash(E \times_{\mathcal{G}_0} P)$. The assignment $E \mapsto E * P$ induces the desired equivalence of module categories. The inverse is given by $P^{op}$; in fact the properties of Morita bimodules ensure an isomorphism of $(\mathcal{G}$-$\mathcal{G})$-bimodules, $P * P^{op} \simeq \mathcal{G}$, and this in turn induces an isomorphism $((E * P) * P^{op}) \simeq E$. If $E$ is principal then so is $E * P$, and if furthermore both $E$ and $P$ are locally trivial, then so is $E * P$.

There is also a notion of $G$-equivariant Morita equivalence of $G$-groupoids. This is given by a Morita equivalence $(\mathcal{G}$-$\mathcal{H})$-bimodule $P$ with compatible $G$-action. The compatibility is expressed by saying that the map $\mathcal{G} \times_{\mathcal{G}_0} P \times_{\mathcal{H}_0} \mathcal{H} \to P$; $(\gamma, p, \eta) \mapsto \gamma p \eta$ satisfies, for $g \in G$,

\[ g(\gamma p \eta) = g(\gamma)\, g(p)\, g(\eta). \tag{1} \]

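4. Some Relevant Examples of Groupoids and Morita Equivalences

Here are some groupoids and Morita equivalences which will be used throughout the paper.

Example 1. Čech groupoids and refinement. If $\mathcal{U} := \{U_i\}_{i \in I}$ is an open cover of a topological space $X$ then the Čech groupoid of the cover, which we denote $\mathcal{G}_{\mathcal{U}}$, is defined as follows:

\[ \mathcal{G}_{\mathcal{U}} := \Big( \coprod_{I \times I} U_{ij} \rightrightarrows \coprod_{I} U_i \Big), \qquad s : U_{ij} \hookrightarrow U_j, \quad r : U_{ij} \hookrightarrow U_i. \]

This groupoid is Morita equivalent to the unit groupoid $X \rightrightarrows X$. Indeed, $\mathcal{G}_0$ is a Morita equivalence bimodule. It is a right $\mathcal{G}$-module in the obvious way. As for the left $(X \rightrightarrows X)$-module structure, the moment map $\mathcal{G}_0 \to X$ is “glue the cover together” and the $X$-action is the trivial one $X \times_X \mathcal{G}_0 \to \mathcal{G}_0$.

More generally, let $\mathcal{G}$ be any groupoid and suppose $\mathcal{U} := \{U_i\}_{i \in I}$ is a locally finite cover of $\mathcal{G}_0$. Define a new groupoid

\[ \mathcal{G}_{\mathcal{U}} := \Big( \coprod_{I \times I} \mathcal{G}_{ij} \rightrightarrows \coprod_{I} U_i \Big), \quad \text{where } \mathcal{G}_{ij} := r^{-1}U_i \cap s^{-1}U_j. \]

The source and range maps are

\[ s_{ij} : \mathcal{G}_{ij} \hookrightarrow s^{-1}U_j \overset{s}{\to} U_j \quad \text{and} \quad r_{ij} : \mathcal{G}_{ij} \hookrightarrow r^{-1}U_i \overset{r}{\to} U_i. \]

We will call such a groupoid a refinement of $\mathcal{G}$ and write $\gamma_{ij}$ for $\gamma \in \mathcal{G}_{ij}$. A groupoid is always Morita equivalent to its refinements. A Morita equivalence $(\mathcal{G}$-$\mathcal{G}_{\mathcal{U}})$-bimodule $P$ is defined as follows:

\[ P := \coprod_{I} s^{-1}U_i. \]

For a left $\mathcal{G}$-module structure on $P$, let the moment map be $r : P \to \mathcal{G}_0$ and, writing $\eta_i$ for $\eta \in s^{-1}U_i \subset P$, define the action by

\[ (\gamma, \eta_i) \mapsto (\gamma\eta)_i, \qquad \gamma \in \mathcal{G},\ \eta_i \in P. \]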

For the right $\mathcal{G}_{\mathcal{U}}$-module structure, the moment map is $\eta_i \mapsto s\eta \in U_i$ and the right action is $(\eta_i, \gamma_{ij}) \mapsto (\eta\gamma)_j$. Of course a Čech groupoid is exactly a refinement of a unit groupoid. In order to keep within our class of second countable groupoids, we restrict to countable covers of $\mathcal{G}_0$.

Example 2. Crossed product groupoids. From a $G$-groupoid $\mathcal{G}$ one can form the crossed product groupoid $G \ltimes \mathcal{G} := (G \times \mathcal{G}_1 \rightrightarrows \mathcal{G}_0)$ whose source and range maps are $s(g, \gamma) := s(g^{-1}\gamma)$ and $r(g, \gamma) := r\gamma$, and for which a composed pair looks like:

\[ (g, \gamma) \circ (g', g^{-1}\gamma') = (gg', \gamma\gamma'). \]

Now suppose two $G$-groupoids $\mathcal{G}$ and $\mathcal{H}$ are equivariantly Morita equivalent via a bimodule $P$ with moment maps $b_\ell : P \to \mathcal{G}_0$ and $b_r : P \to \mathcal{H}_0$. Then $G \times P$ has the structure of a Morita $(G \ltimes \mathcal{G})$-$(G \ltimes \mathcal{H})$-bimodule. The left $G \ltimes \mathcal{G}$ action is

\[ (g, \gamma) \cdot (g', p) := (gg', \gamma\, gp), \qquad (g, \gamma) \in G \ltimes \mathcal{G},\ (g', p) \in G \times P, \]

with moment map $(g', p) \mapsto b_\ell(p)$. The right $G \ltimes \mathcal{H}$-module structure is

\[ (g', p) \cdot (g'', \eta) := (g'g'', p\, g'\eta), \qquad (g'', \eta) \in G \ltimes \mathcal{H}, \]

with moment map $(g', p) \mapsto b_r(g'^{-1}p)$.

Example 3. Generalized principal bundles. Let $\mathcal{G}$ be a groupoid, $G$ a locally compact group, and $\rho : \mathcal{G} \to G$ a homomorphism of groupoids. The generalized principal bundle associated to $\rho$ is the groupoid $G \rtimes_\rho \mathcal{G} := (G \times \mathcal{G}_1 \rightrightarrows G \times \mathcal{G}_0)$ whose source and range maps are $s : (g, \gamma) \mapsto (g\rho(\gamma), s\gamma)$ and $r : (g, \gamma) \mapsto (g, r\gamma)$ and for which a composed pair looks like

\[ (g, \gamma_1) \circ (g\rho(\gamma_1), \gamma_2) = (g, \gamma_1\gamma_2). \]
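As a sanity check on the formulas of Example 3, the following Python sketch builds a toy generalized principal bundle and verifies that the stated composition rule is compatible with the source and range maps. All of the data are illustrative assumptions rather than anything taken from the paper: the base groupoid is the pair groupoid on three points (standing in for a Čech groupoid), the group is Z/4 written additively, and the homomorphism `rho` is a coboundary of a hand-picked function `HEIGHT`.

```python
from itertools import product

# Hypothetical toy data (assumptions for illustration only).
N = 4                                   # the group G = Z/4, written additively
POINTS = [0, 1, 2]                      # unit space of the base groupoid
ARROWS = list(product(POINTS, POINTS))  # pair groupoid: arrow (i, j) has r = i, s = j
HEIGHT = {0: 0, 1: 1, 2: 3}             # hand-picked function on units, values in Z/4


def rho(gamma):
    """rho(i, j) = HEIGHT[i] - HEIGHT[j], so rho(g1 g2) = rho(g1) + rho(g2)."""
    i, j = gamma
    return (HEIGHT[i] - HEIGHT[j]) % N


def source(arrow):
    g, (i, j) = arrow
    return ((g + rho((i, j))) % N, j)   # s(g, gamma) = (g rho(gamma), s gamma)


def rng(arrow):
    g, (i, j) = arrow
    return (g, i)                       # r(g, gamma) = (g, r gamma)


def compose(a1, a2):
    """(g, gamma1) o (g rho(gamma1), gamma2) = (g, gamma1 gamma2); None if not composable."""
    if source(a1) != rng(a2):
        return None
    g, (i, _) = a1
    _, (_, k) = a2
    return (g, (i, k))


BUNDLE = [(g, gamma) for g in range(N) for gamma in ARROWS]

composable = 0
for a1, a2 in product(BUNDLE, BUNDLE):
    a12 = compose(a1, a2)
    if a12 is None:
        continue
    composable += 1
    # The product must start where a2 starts and end where a1 ends.
    assert source(a12) == source(a2)
    assert rng(a12) == rng(a1)

print(composable)
```

The check `source(a12) == source(a2)` is exactly where the homomorphism property of `rho` is used; replacing `rho` by a map that is not a homomorphism would make that assertion fail.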


The reason $G \rtimes_\rho \mathcal{G}$ is called a generalized principal bundle is that when $\mathcal{G} = \check{\mathcal{G}}\{U_i\}$ is the Čech groupoid of Example (1), $G \rtimes_\rho \mathcal{G}$ is $G$-equivariantly Morita equivalent to a principal bundle on $X$. Indeed, in this case $\rho$ is the same thing as a $G$-valued Čech 1-cochain on the cover, and the homomorphism property $\rho(\gamma_1\gamma_2) = \rho(\gamma_1)\rho(\gamma_2)$ translates to $\rho$ being closed. Thus $\rho$ gives transition functions for the principal $G$-bundle on $X$,

\[ P(\rho) := \coprod G \times U_\alpha / \sim, \qquad (g, u \in U_\alpha) \sim (g\rho(\gamma), u \in U_\beta), \]

where $\gamma = u \in U_{\alpha\beta} \subset \mathcal{G}$. Let $\pi$ denote the bundle map $P(\rho) \to X$; then there are isomorphisms $h_\alpha : G \times U_\alpha \to \pi^{-1}U_\alpha$ which satisfy $h_\beta^{-1} h_\alpha(g, u) = (g\rho(\gamma), u)$, and the maps $h_\alpha$ give a $G$-equivariant isomorphism of groupoids between $G \rtimes_\rho \mathcal{G}$ and the Čech groupoid $\check{\mathcal{G}}(\{\pi^{-1}U_\alpha\})$ by sending $(g, \gamma) \in G \times U_{\alpha\beta} \subset G \rtimes_\rho \mathcal{G}$ to $h_\alpha(g, \gamma) \in \pi^{-1}U_{\alpha\beta} \subset \check{\mathcal{G}}(\{\pi^{-1}U_\alpha\})$. Finally, $\check{\mathcal{G}}(\{\pi^{-1}U_\alpha\})$ is $G$-equivariantly Morita equivalent to the unit groupoid $P(\rho) \rightrightarrows P(\rho)$.

To see the importance of keeping track of $G$-equivariance, note for instance that when $G$ is abelian $G \rtimes_\rho \mathcal{G}$ and $G \rtimes_{\rho^{-1}} \mathcal{G}$ are isomorphic (and therefore Morita equivalent), whereas these two groupoids with their natural $G$-groupoid structures are not equivariantly equivalent.

Example 4. Isotropy subgroups. Let $G$ be a locally compact group and $N$ a closed subgroup. Then $G$ acts on the homogeneous space $G/N$ by left translation and one can form the crossed product groupoid $G \ltimes G/N \rightrightarrows G/N$. There is a Morita equivalence $(G \ltimes G/N \rightrightarrows G/N) \sim (N \rightrightarrows *)$. The bimodule implementing the equivalence is $G$, with $N$ acting on the right by translation and $G \ltimes G/N$ acting on the left by $(g, ghN) \cdot h := gh$.

Example 5. Nonabelian groupoid extensions. Let $\mathcal{G}$ be a groupoid and $B \to \mathcal{G}_0$ a bundle of not necessarily abelian groups over $\mathcal{G}_0$. Suppose we have two continuous functions $\mathcal{G}_2 \times_{\mathcal{G}_0} B \to B$; $(\gamma_1, \gamma_2, p) \mapsto \sigma(\gamma_1, \gamma_2)p$ and $\mathcal{G} \times_{\mathcal{G}_0} B \to B$; $(\gamma, p) \mapsto \tau(\gamma)(p)$, such that $\sigma(\gamma_1, \gamma_2)$ is an element of the fiber of $B$ over $r\gamma_1$, $\tau(\gamma)$ is an isomorphism from the fiber over $s\gamma$ to the fiber over $r\gamma$, and the following equations are satisfied:

\[ \tau(\gamma_1) \circ \tau(\gamma_2) = \mathrm{ad}(\sigma(\gamma_1, \gamma_2)) \circ \tau(\gamma_1\gamma_2), \tag{2} \]
\[ (\tau(\gamma_1) \circ \sigma(\gamma_2, \gamma_3))\, \sigma(\gamma_1, \gamma_2\gamma_3) = \sigma(\gamma_1, \gamma_2)\, \sigma(\gamma_1\gamma_2, \gamma_3), \tag{3} \]

where $\mathrm{ad}(p)(q) := pqp^{-1}$ for elements $p, q \in B$ that both lie in the same fiber over $\mathcal{G}_0$. We will write $\gamma(p) := \tau(\gamma)(p)$. The pair $(\sigma, \tau)$ can be thought of as a 2-cocycle in “nonabelian cohomology” with values in $B$, and when $B$ is a bundle of abelian groups, $\tau$ is simply an action and $\sigma$ a 2-cocycle as in Sect. (6). From the data $(\sigma, \tau)$ we form an extension of $\mathcal{G}$ by $B$, which is the groupoid $B \rtimes^\sigma \mathcal{G} := (B \times_{b, \mathcal{G}_0, r} \mathcal{G}_1 \rightrightarrows \mathcal{G}_0)$ with source, range and multiplication maps



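1. $s(p, \gamma) := s\gamma$, $r(p, \gamma) := r\gamma$,
2. $(p_1, \gamma_1) \circ (p_2, \gamma_2) = (p_1\, \gamma_1(p_2)\, \sigma(\gamma_1, \gamma_2), \gamma_1\gamma_2)$.

The next example combines Examples (3), (4), and (5).

Example 6. Let $\rho : \mathcal{G} \to G$ be a continuous function. Define

\[ \delta\rho(\gamma_1, \gamma_2) := \rho(\gamma_1)\rho(\gamma_2)\rho(\gamma_1\gamma_2)^{-1}, \qquad (\gamma_1, \gamma_2) \in \mathcal{G}_2. \]

Suppose $\delta\rho$ takes values in a closed normal subgroup $N$ of $G$, and write $\bar\rho$ for the composition $\mathcal{G} \overset{\rho}{\to} G \to G/N$, which is a homomorphism. Associated to $\rho$ we construct two groupoids.
1. $G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$. This is the crossed product groupoid of $G$ acting by translation on the generalized principal bundle $G/N \rtimes_{\bar\rho} \mathcal{G}$. This means that for $(g, t, \gamma)$’s $\in G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$, the source, range and multiplication maps are
(a) $s(g, t, \gamma) = (g^{-1}t\rho(\gamma), s\gamma)$,
(b) $r(g, t, \gamma) = (t, r\gamma)$,
(c) $(g_1, t, \gamma_1) \circ (g_2, g_1^{-1}t\rho(\gamma_1), \gamma_2) = (g_1g_2, t, \gamma_1\gamma_2)$.
2. $N \rtimes^{\delta\rho} \mathcal{G}$. In the notation of Example (5) this is the extension determined by the pair $(\sigma, \tau) := (\delta\rho, \mathrm{ad}(\rho))$ and the constant bundle $B := \mathcal{G}_0 \times N$. Note that as an $N$-valued groupoid 2-cocycle $\delta\rho$ is not necessarily a coboundary. Explicitly, the source, range and multiplication are
(a) $s(n, \gamma) = s\gamma$,
(b) $r(n, \gamma) = r\gamma$,
(c) $(n_1, \gamma_1) \circ (n_2, \gamma_2) = (n_1\, \gamma_1(n_2)\, \delta\rho(\gamma_1, \gamma_2), \gamma_1\gamma_2)$, where $\gamma(n) := \rho(\gamma)\, n\, \rho(\gamma)^{-1}$.

Proposition 4.1. The two groupoids $\mathcal{H} := G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$ and $\mathcal{K} := (N \rtimes^{\delta\rho} \mathcal{G})$ of Example (6) are Morita equivalent.

Proof. The equivalence bimodule is $P = G \times \mathcal{G}$, endowed with the following structures:
1. Moment maps: $P \ni (g, \gamma) \mapsto (g\rho(\gamma)^{-1}, r\gamma) \in \mathcal{H}_0$ and $P \ni (g, \gamma) \mapsto s\gamma \in \mathcal{K}_0$.
2. $\mathcal{H}$-action: $\mathcal{H} \times_{\mathcal{H}_0} P \ni ((g_1, t, \gamma_1), (g_2, \gamma_2)) \mapsto (g_1g_2, \gamma_1\gamma_2) \in P$, whenever $t = g_1g_2\rho(\gamma_2)^{-1}\rho(\gamma_1)^{-1} \in G/N$.
3. $\mathcal{K}$-action: $P \times_{\mathcal{K}_0} \mathcal{K} \ni ((g, \gamma_1), (n, \gamma_2)) \mapsto (gn\rho(\gamma_2), \gamma_1\gamma_2) \in P$.
Direct checks show that these definitions make $P$ a Morita equivalence bimodule.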
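The extension $N \rtimes^{\delta\rho} \mathcal{G}$ of Example 6 can be tested on a small case. The Python sketch below assumes toy data that do not appear in the paper: the base groupoid is the group Z/2 (a single object), the ambient group is the quaternion group Q8 realized as integer 4-tuples, $N = \langle i \rangle$, and $\rho$ sends the nontrivial element of Z/2 to $j$. It checks Eqs. (2) and (3) for $(\sigma, \tau) = (\delta\rho, \mathrm{ad}(\rho))$ and that $(n, \gamma) \mapsto n\rho(\gamma)$ is a bijective homomorphism onto Q8. It illustrates the extension only, not the Morita equivalence of Proposition 4.1.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + b i + c j + d k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)


def qinv(p):
    """Inverse of a unit quaternion (its conjugate)."""
    a, b, c, d = p
    return (a, -b, -c, -d)


ONE, I, J = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
NORMAL = [ONE, I, (-1, 0, 0, 0), (0, -1, 0, 0)]   # N = <i>, a normal subgroup of Q8
BASE = [0, 1]                                      # base groupoid: the group Z/2
RHO = {0: ONE, 1: J}                               # hypothetical section rho: Z/2 -> Q8


def delta_rho(g1, g2):
    """delta rho(g1, g2) = rho(g1) rho(g2) rho(g1 g2)^{-1}."""
    return qmul(qmul(RHO[g1], RHO[g2]), qinv(RHO[(g1 + g2) % 2]))


def act(g, n):
    """gamma(n) = rho(gamma) n rho(gamma)^{-1}, i.e. tau = ad(rho)."""
    return qmul(qmul(RHO[g], n), qinv(RHO[g]))


def ext_mul(x, y):
    """(n1, g1) o (n2, g2) = (n1 g1(n2) delta rho(g1, g2), g1 g2)."""
    (n1, g1), (n2, g2) = x, y
    return (qmul(qmul(n1, act(g1, n2)), delta_rho(g1, g2)), (g1 + g2) % 2)


def to_q8(x):
    n, g = x
    return qmul(n, RHO[g])


EXT = [(n, g) for n in NORMAL for g in BASE]

# delta rho and ad(rho) take values in N.
assert all(delta_rho(g1, g2) in NORMAL for g1 in BASE for g2 in BASE)
assert all(act(g, n) in NORMAL for g in BASE for n in NORMAL)

# Eq. (2): tau(g1) tau(g2) = ad(sigma(g1, g2)) tau(g1 g2).
for g1 in BASE:
    for g2 in BASE:
        s12 = delta_rho(g1, g2)
        for n in NORMAL:
            lhs = act(g1, act(g2, n))
            rhs = qmul(qmul(s12, act((g1 + g2) % 2, n)), qinv(s12))
            assert lhs == rhs

# Eq. (3): tau(g1)(sigma(g2, g3)) sigma(g1, g2 g3) = sigma(g1, g2) sigma(g1 g2, g3).
for g1 in BASE:
    for g2 in BASE:
        for g3 in BASE:
            lhs = qmul(act(g1, delta_rho(g2, g3)), delta_rho(g1, (g2 + g3) % 2))
            rhs = qmul(delta_rho(g1, g2), delta_rho((g1 + g2) % 2, g3))
            assert lhs == rhs

# (n, gamma) -> n rho(gamma) is a bijective homomorphism onto Q8.
is_bijection = len({to_q8(x) for x in EXT}) == 8
is_homomorphism = all(
    to_q8(ext_mul(x, y)) == qmul(to_q8(x), to_q8(y)) for x in EXT for y in EXT
)
```

In this toy case $\delta\rho(1, 1) = j^2 = -1$ is nontrivial and $\mathrm{ad}(\rho)$ acts on $N$ by inversion, so both pieces of the nonabelian cocycle are exercised.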

Summary of notation. For convenience, let us summarize the notation that has been developed in these examples.
• $(G \ltimes \mathcal{G})$ denotes a crossed product groupoid. It is in some sense a quotient of $\mathcal{G}$ by $G$.
• $(G \rtimes_\rho \mathcal{G})$ denotes a principal bundle over $\mathcal{G}$.
• $(G \rtimes^\sigma \mathcal{G})$ denotes an extension of $\mathcal{G}$ by $G$. We will see that this corresponds to a presentation of a $G$-gerbe over $\mathcal{G}$.


5. Groupoid Algebras, K-Theory, and Strong Morita Equivalence

Let $\mathcal{G}$ be a groupoid. The continuous compactly supported functions $\mathcal{G}_1 \to \mathbb{C}$ form an associative algebra, denoted $C_c(\mathcal{G})$, for the following multiplication called groupoid convolution:

\[ a * b(\gamma) := \int_{\gamma_1\gamma_2 = \gamma} a(\gamma_1)\, b(\gamma_2) \tag{4} \]

for $a, b \in C_c(\mathcal{G})$ and $\gamma$’s $\in \mathcal{G}$. Integration is with respect to the fixed left Haar system of measures. This algebra has an involution, $a \mapsto a^*(\gamma) = \overline{a(\gamma^{-1})}$ (the overline denotes complex conjugation), and can be completed in a canonical way to a $C^*$-algebra (see [Ren]) which we simply refer to as the groupoid algebra and denote $C^*(\mathcal{G})$ or $C^*(\mathcal{G}_1 \rightrightarrows \mathcal{G}_0)$.

The groupoid algebra is a common generalization of the continuous functions on a topological space, to which this reduces when $\mathcal{G}$ is the unit groupoid, and of the convolution $C^*$-algebra of a locally compact group, to which this reduces when the unit space is a point. Indeed, by definition of the groupoid algebra we have $C^*(X \rightrightarrows X) = C(X)$ and $C^*(G \rightrightarrows *) = C^*(G)$ when $X$ is a locally compact Hausdorff space and $G$ is a locally compact Hausdorff group. As is probably common, we will define the K-theory of $\mathcal{G}$, denoted $K(\mathcal{G})$, to be the $C^*$-algebra K-theory of its groupoid algebra. Here are the facts we need about groupoid algebras and $K$-theory:

Proposition 5.1.
1. [MRW] A Morita equivalence of groupoids gives rise to a (strong) Morita equivalence of the associated groupoid algebras.
2. A Morita equivalence between $G$-groupoids $\mathcal{G}$ and $\mathcal{H}$ gives rise to a Morita equivalence between the crossed product $C^*$-algebras $G \ltimes C^*(\mathcal{G})$ and $G \ltimes C^*(\mathcal{H})$.
3. Groupoid $K$-theory is invariant under Morita equivalence.

Proof. The first statement is the main theorem of [MRW]. The second statement follows from the first after noting that the definitions of $G \ltimes C^*(\mathcal{G})$ and $C^*(G \ltimes \mathcal{G})$ coincide and that $G \ltimes \mathcal{G}$ is Morita equivalent to $G \ltimes \mathcal{H}$ (see Example (2)). The last statement now follows from the Morita invariance of $C^*$-algebra $K$-theory.

Let $\mathcal{G}$ and $\mathcal{H}$ be Morita equivalent groupoids. It is useful to know that in [MRW] a $C^*(\mathcal{G})$-$C^*(\mathcal{H})$-bimodule is constructed directly from a $\mathcal{G}$-$\mathcal{H}$-Morita equivalence bimodule $P$. The $C^*$-algebra bimodule is a completion of $C_c(P)$ and has the actions induced from the translation actions of $\mathcal{G}$ and $\mathcal{H}$ on $P$.
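The convolution product (4) from the start of this section can be made concrete for finite groupoids. The sketch below is an illustration under assumptions that are not in the paper: every groupoid is finite and the Haar system is counting measure, so the integral becomes a finite sum; the helper name `convolve` and the three sample groupoids are hypothetical. It shows the two special cases mentioned above (pointwise multiplication for a unit groupoid, group convolution for a group) and, for a pair groupoid, matrix multiplication.

```python
def convolve(a, b, arrows, compose):
    """(a * b)(gamma) = sum over gamma1 gamma2 = gamma of a(gamma1) b(gamma2).

    `a` and `b` are dicts arrow -> number; `compose` returns None for
    non-composable pairs. Counting measure plays the role of the Haar system.
    """
    out = {g: 0 for g in arrows}
    for g1 in arrows:
        for g2 in arrows:
            g = compose(g1, g2)
            if g is not None:
                out[g] += a[g1] * b[g2]
    return out


# Unit groupoid X => X on three points: only x o x = x is composable.
unit_arrows = [0, 1, 2]
unit_compose = lambda x, y: x if x == y else None

# The group Z/3 viewed as a groupoid over a point.
group_arrows = [0, 1, 2]
group_compose = lambda x, y: (x + y) % 3

# Pair groupoid on two points: arrow (i, j) has range i and source j.
pair_arrows = [(i, j) for i in range(2) for j in range(2)]
pair_compose = lambda x, y: (x[0], y[1]) if x[1] == y[0] else None

a = {0: 1, 1: 2, 2: 3}
b = {0: 4, 1: 5, 2: 6}
pointwise = convolve(a, b, unit_arrows, unit_compose)   # pointwise product
cyclic = convolve(a, b, group_arrows, group_compose)    # cyclic convolution

A = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
B = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
matrix = convolve(A, B, pair_arrows, pair_compose)      # matrix product A B
```

The same function computes all three products; only the set of arrows and the partial composition change.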
We present a generalization of this bimodule construction in Lemma (A.4).

6. Equivariant Groupoid Cohomology

In this section we define equivariant groupoid cohomology for $G$-groupoids. Equivariant 2-cocycles will give rise to what we call equivariant gerbes.

Let $\mathcal{H}$ be a groupoid and $B \overset{b}{\to} \mathcal{H}_0$ a left $\mathcal{H}$-module each of whose fibers over $\mathcal{H}_0$ is an abelian group, that is, a (not necessarily locally trivial) bundle of groups over $\mathcal{H}_0$.



Then one defines the groupoid cohomology with $B$ coefficients, denoted $H^*(\mathcal{H}; B)$, as the cohomology of the complex $(C^\bullet(\mathcal{H}; B), \delta)$, where

\[ C^k(\mathcal{H}; B) := \{ \text{continuous maps } f : \mathcal{H}_k \to B \mid b(f(h_1, \dots, h_k)) = r h_1 \} \]

and for $f \in C^k(\mathcal{H}; B)$,

\[ \delta f(h_1, \dots, h_{k+1}) := h_1 \cdot f(h_2, \dots, h_{k+1}) + \sum_{i=1}^{k} (-1)^i f(h_1, \dots, h_i h_{i+1}, \dots, h_{k+1}) + (-1)^{k+1} f(h_1, \dots, h_k). \]

As is common, we tacitly restrict to the quasi-isomorphic subcomplex $\{ f \in C^k \mid f(h_1, \dots, h_k) = 0 \text{ if some } h_i \text{ is a unit} \}$, except for 0-cochains which have no such restriction. When the $\mathcal{H}$-module is $B = \mathcal{H}_0 \times A$, where $A$ is an abelian group, we write $A$ for the cohomology coefficients.

When $\mathcal{H}$ is the Čech groupoid of a locally finite cover of a topological space $X$ and $B$ is the étale space of a sheaf of abelian groups on $X$, then $C^\bullet$ is identical to the Čech complex of the cover with coefficients in the sheaf of sections of $B$, so this recovers Čech cohomology of the given cover. On the other hand, when $\mathcal{H}$ is a group this recovers continuous group cohomology.

If $\mathcal{H}$ is a $G$-groupoid then $C^\bullet(\mathcal{H}; B)$ becomes a complex of left $G$-modules by

\[ g \cdot f(h_1, \dots, h_k) := f(g^{-1}h_1, \dots, g^{-1}h_k), \qquad f \in C^k(\mathcal{H}; B), \]

and one can form the double complex $K^{p,q} = (C^p(G; C^q(\mathcal{H}; B)), d, \delta)$, where $d$ denotes the groupoid cohomology differential for $G \rightrightarrows *$. The $G$-equivariant cohomology of $\mathcal{H}$ with values in $B$, denoted $H_G^*(\mathcal{H}; B)$, is the cohomology of the total complex

\[ \mathrm{tot}(K)^n := \Big( \bigoplus_{p+q=n} K^{p,q},\ D = d + (-1)^p \delta \Big). \]

As one would hope, there is a chain map from the complex $\mathrm{tot}(K)$ computing equivariant cohomology to the chain complex associated to the crossed product groupoid:

Proposition 6.1. The map

\[ F : \mathrm{tot}\, K^\bullet \to C^\bullet(G \ltimes \mathcal{H}; B), \tag{5} \]

defined to be the sum of the maps

\[ f_{pq} : C^p(G; C^q(\mathcal{H}; B)) \to C^{p+q}(G \ltimes \mathcal{H}; B), \]
\[ f_{pq}(c)\big((g_1, \gamma_1), (g_2, g_1^{-1}\gamma_2), \dots, (g_{p+q}, (g_1 g_2 \cdots g_{p+q-1})^{-1}\gamma_{p+q})\big) := c(g_1, \dots, g_p, \gamma_{p+1}, \dots, \gamma_{p+q}) \]

for $c \in C^p(G; C^q(\mathcal{H}; B))$, $g$’s $\in G$, and $(\gamma_1, \dots, \gamma_{p+q}) \in \mathcal{H}_{p+q}$, is a morphism of chain complexes.

Proof. This is a direct check.
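The differential $\delta$ defined earlier in this section can be sketched in the simplest situation, where the groupoid is a finite group and the coefficients are constant. The Python block below assumes (purely for illustration) the group Z/3 acting trivially on the coefficient group $\mathbb{Z}$; the names `delta`, `random_cochain` and `carry` are hypothetical helpers. It verifies $\delta \circ \delta = 0$ on random cochains and checks that the "carry" function is a normalized 2-cocycle.

```python
import random
from itertools import product

N = 3  # the group Z/3, written additively, acting trivially on the coefficients Z


def delta(f, k):
    """Differential of a k-cochain f (a function of k group elements)."""
    def df(*h):
        assert len(h) == k + 1
        total = f(*h[1:])                      # h_1 . f(h_2, ..., h_{k+1}), trivial action
        for i in range(1, k + 1):              # (-1)^i f(..., h_i h_{i+1}, ...)
            merged = h[:i - 1] + ((h[i - 1] + h[i]) % N,) + h[i + 1:]
            total += (-1) ** i * f(*merged)
        total += (-1) ** (k + 1) * f(*h[:k])   # (-1)^{k+1} f(h_1, ..., h_k)
        return total
    return df


def random_cochain(k, rng):
    """A random integer-valued k-cochain, stored as a lookup table."""
    table = {h: rng.randint(-5, 5) for h in product(range(N), repeat=k)}
    return lambda *h: table[h]


rng = random.Random(0)
delta_squared_vanishes = True
for k in (1, 2):
    f = random_cochain(k, rng)
    ddf = delta(delta(f, k), k + 1)
    if any(ddf(*h) != 0 for h in product(range(N), repeat=k + 2)):
        delta_squared_vanishes = False


def carry(a, b):
    """1 if a + b wraps around modulo N, else 0; vanishes when a or b is the unit 0."""
    return 1 if a + b >= N else 0


d_carry = delta(carry, 2)
carry_is_cocycle = all(d_carry(*h) == 0 for h in product(range(N), repeat=3))
```

The cocycle `carry` is the one classifying the extension of Z/3 by $\mathbb{Z}$ given by $\mathbb{Z} \to \mathbb{Z} \to \mathbb{Z}/3$, which connects this computation to the extensions of Example 5.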


It seems likely that this is a quasi-isomorphism, but we have not proved it. In Sect. (10) it is shown that $H(G \ltimes \mathcal{H}; B)$ is always a summand of $H_G(\mathcal{H}; B)$, which is enough for the present purposes.

Remark 6.2. These cohomology groups are not Morita invariant. For example, different covers can have different Čech cohomology. One can form a Morita invariant cohomology (that is, stack cohomology); it is the derived functors of $B \mapsto \mathrm{Hom}_{\mathcal{H}}(\mathcal{H}_0, B) = \Gamma(\mathcal{H}_0, B)^{\mathcal{H}}$, which homological algebra tells us can be computed by using a resolution of $B$ by injective $\mathcal{H}$-modules.³ However, the cocycles obtained via injective resolutions are often not useful for describing geometric objects such as bundles and groupoid extensions, so we will stick with the groupoid cohomology as defined above (which is the approximation to these derived functors obtained by resolving $\mathcal{H}_0$ by $\mathcal{H}_\bullet$ and taking the cohomology of $\mathrm{Hom}_{\mathcal{H}}(\mathcal{H}_\bullet; B)$). In Sect. (9) we will show how to compare the respective groupoid cohomology groups of two groupoids which are Morita equivalent.

We will also encounter cocycles with values in a bundle of nonabelian groups $B$, defined in degrees $n = 0, 1, 2$. The spaces of cochains are the same and a 0-cocycle is also the same as in the abelian setting. In degree one we say $\rho \in C^1(\mathcal{H}; B)$ is closed when $\delta\rho(h_1, h_2) := \rho(h_1)\, h_1 \cdot \rho(h_2)\, \rho(h_1h_2)^{-1} = 1$, and $\rho$ and $\rho'$ are cohomologous when $\rho'(h) = h \cdot \alpha(sh)\, \rho(h)\, \alpha^{-1}(rh)$. A nonabelian 2-cocycle is a pair $(\sigma, \tau)$ as in Example (5).

7. Gerbes and Twisted Groupoids

In this section we describe various constructions that can be made with 2-cocycles and, in particular, explain our slightly non-standard use of the term gerbe. We also describe the construction of equivariant gerbes from equivariant cohomology.

Given a 2-cocycle $\sigma \in Z^2(\mathcal{G}; N)$, where $N$ is an abelian group, we can form an extension of $\mathcal{G}$ by $N$:

\[ N \rtimes^\sigma \mathcal{G} := (N \times \mathcal{G}_1 \rightrightarrows \mathcal{G}_0), \]

with multiplication $(n_1, \gamma_1) \circ (n_2, \gamma_2) := (n_1 n_2 \sigma(\gamma_1, \gamma_2), \gamma_1\gamma_2)$. More generally, if $B$ is a bundle of not necessarily abelian groups, and $(\sigma, \tau)$ a $B$-valued nonabelian 2-cocycle, then we can form the groupoid extension $B \rtimes^\sigma \mathcal{G}$ that was described in Example (5). We will call such an extension a $B$-gerbe, or an $N$-gerbe if $B = \mathcal{G}_0 \times N$ is a constant bundle of (not necessarily abelian) groups.

The term gerbe comes from Giraud’s stack theoretic interpretation of degree two nonabelian cohomology ([Gir]). In the following few paragraphs (everything up to Definition (7.2)) we will outline the stack theoretic terminology leading to Giraud’s gerbes. The point of the outline is only to clear up terminology, and it can be skipped. A nice reference for topological stacks is [Met].

Let $\mathcal{C}$ be any category. A topological stack is a functor $F : \mathcal{C} \to \mathrm{Top}$ satisfying a certain list of axioms. A morphism of stacks from $(F : \mathcal{C} \to \mathrm{Top})$ to $(F' : \mathcal{C}' \to \mathrm{Top})$ is a functor $\alpha : \mathcal{C} \to \mathcal{C}'$ (satisfying a couple of axioms) such that $F = F' \circ \alpha$. Such a morphism is an equivalence of stacks when $\alpha$ is an equivalence of categories.

³ There are enough injective $\mathcal{H}$-modules for étale groupoids, but we do not know if this is true for general groupoids.

Given a groupoid $\mathcal{G}$, define $\mathrm{Prin}_{\mathcal{G}}$ to be the category whose objects are locally trivial right principal $\mathcal{G}$-modules and whose homs are the continuous $\mathcal{G}$-equivariant maps. This category has a natural functor to $\mathrm{Top}$ which sends a principal module $P$ to the quotient $P/\mathcal{G}$, and in fact satisfies the axioms for a stack. This stack (which we denote $\mathrm{Prin}_{\mathcal{G}}$) is called the stack associated to $\mathcal{G}$. A stack which is equivalent to $\mathrm{Prin}_{\mathcal{G}}$ is called presentable and $\mathcal{G}$ is called a presentation of the stack.

The discussion following Definition (3.1) shows that a locally trivial right principal $\mathcal{G}$-$\mathcal{H}$-bimodule $P$ induces a functor $*P : \mathrm{Prin}_{\mathcal{G}} \to \mathrm{Prin}_{\mathcal{H}}$. It is in fact a morphism of stacks, and is an equivalence of stacks when $P$ is also left principal (that is, when $P$ is a Morita equivalence bimodule). Conversely, if there is an equivalence of stacks $\mathrm{Prin}_{\mathcal{G}} \to \mathrm{Prin}_{\mathcal{H}}$, then $\mathcal{G}$ and $\mathcal{H}$ are Morita equivalent. Thus any statement about groupoids which is Morita invariant is naturally a statement about presentable stacks. We will only work with presentable stacks in this paper, and when stacks are mentioned at all, it will only be as motivation for making Morita invariant constructions.

According to Giraud [Gir], a gerbe over a stack $\mathcal{C}$ is a stack $\mathcal{C}'$ equipped with a morphism of stacks $\alpha : \mathcal{C}' \to \mathcal{C}$ satisfying a couple of axioms. Now, the extension $B \rtimes^\sigma \mathcal{G}$ has its natural quotient map to $\mathcal{G}$ (and this quotient map is a functor), and this determines a morphism of stacks $\mathrm{Prin}_{(B \rtimes^\sigma \mathcal{G})} \to \mathrm{Prin}_{\mathcal{G}}$ which in fact makes $\mathrm{Prin}_{(B \rtimes^\sigma \mathcal{G})}$ into what Giraud called a $B$-gerbe over the stack $\mathrm{Prin}_{\mathcal{G}}$ (see also [Met] Definition 84). When $\mathcal{G}$ is Morita equivalent to a space $X$, one usually calls this a gerbe over $X$. Thus we call the groupoid $B \rtimes^\sigma \mathcal{G}$ a $B$-gerbe, although it is actually a presentation of a $B$-gerbe. Hopefully this will not cause much confusion.

Remark 7.1. Every gerbe described so far has the property that $N$ acts on $N \rtimes^\sigma \mathcal{G}_1$, making it a trivial principal $N$-bundle over $\mathcal{G}_1$. Any groupoid presentation of a stack theoretic $N$-gerbe will admit a principal $N$-action on its space of arrows, but not every one is a trivializable principal bundle over $\mathcal{G}_1$. Those that are not trivializable do not admit the 2-cocycle description we have been using. The obstruction to all gerbes being trivializable bundles is the degree one sheaf cohomology of the space $\mathcal{G}_1$ with coefficients in the sheaf of $N$-valued functions, $H^1_{\mathrm{Sheaf}}(\mathcal{G}_1; N)$. Since many groupoids admit a refinement for which this obstruction vanishes (in particular Čech groupoids do), there are plenty of situations in which one may assume the gerbe admits the above 2-cocycle description (in particular, for gerbes on spaces this is fine). Nonetheless, we will encounter gerbes which are not trivial bundles, such as the ones in Example (9).

Closely related to gerbes is the following notion:

Definition 7.2. Let $\mathcal{G}$ be a groupoid and let $B \to \mathcal{G}_0$ be a bundle of groups. A $B$-twisted groupoid is a pair $(\mathcal{G}, (\sigma, \tau))$, where $(\sigma, \tau)$ is a $B$-valued (nonabelian) 2-cocycle over $\mathcal{G}$ as in Example (5). When $\tau$ is understood to be trivial, we simply write $(\mathcal{G}, \sigma)$, and when $B$ is the constant bundle $\mathcal{G}_0 \times U(1)$ we simply call the pair a twisted groupoid.

In fact a $B$-twisted groupoid contains the exact same data as a $B$-gerbe. However, we will encounter a type of duality which takes twisted groupoids to $U(1)$-gerbes and does not extend to a “gerbe-gerbe” duality. Thus it is necessary to have both descriptions at hand. We would like to make $C^*$-algebras out of twisted groupoids in order to define twisted K-theory. Here is the definition.


Definition 7.3. [Ren]. Given a twisted groupoid $(\mathcal{G}, \sigma \in Z^2(\mathcal{G}; U(1)))$, the associated twisted groupoid algebra, denoted $C^*(\mathcal{G}, \sigma)$, is the $C^*$-algebra completion of the compactly supported functions on $\mathcal{G}_1$, with $\sigma$-twisted multiplication

\[ a * b(\gamma) := \int_{\gamma_1\gamma_2 = \gamma} a(\gamma_1)\, b(\gamma_2)\, \sigma(\gamma_1, \gamma_2); \qquad a, b \in C_c(\mathcal{G}_1) \]

and involution $a \mapsto a^*(\gamma) := \overline{a(\gamma^{-1})\, \sigma(\gamma, \gamma^{-1})}$. Here functions are $\mathbb{C}$-valued and the overline denotes complex conjugation. Of course a groupoid algebra is exactly a twisted groupoid algebra for $\sigma = 1$.

Definition 7.4. The twisted K-theory of a twisted groupoid $(\mathcal{G}, \sigma)$ is the K-theory of $C^*(\mathcal{G}, \sigma)$.

Now suppose $G$ is a locally compact group and $\mathcal{H}$ is a $G$-groupoid. By definition, a $U(1)$-valued 2-cocycle in $G$-equivariant cohomology is of the form

\[ (\sigma, \lambda, \beta) \in C^0(G; Z^2(\mathcal{H}; U(1))) \times C^1(G; C^1(\mathcal{H}; U(1))) \times Z^2(G; C^0(\mathcal{H}; U(1))) \tag{6} \]

and it satisfies the cocycle condition:

\[ D(\sigma, \lambda, \beta) = (\delta\sigma,\ \delta\lambda^{-1} d\sigma,\ \delta\beta\, d\lambda,\ d\beta) = (1, 1, 1, 1). \]

The first component, $\sigma$, of the triple determines a twisted groupoid algebra $C^*(\mathcal{H}; \sigma)$. Now the translation action of $G$ on $C^*(\mathcal{H})$,

\[ g \cdot a(h) := a(g^{-1}h); \qquad g \in G,\ h \in \mathcal{H},\ a \in C^*(\mathcal{H}; \sigma), \]

is not an action on $C^*(\mathcal{H}; \sigma)$ because $g \cdot (a *_\sigma b) \neq (g \cdot a) *_\sigma (g \cdot b)$ for $a, b \in C^*(\mathcal{H}; \sigma)$, $g \in G$. The second and third components are “correction terms” that allow $G$ to act on the twisted groupoid algebra. Indeed, define a map

\[ \alpha : G \to \mathrm{Aut}(C^*(\mathcal{H}; \sigma)), \qquad \alpha_g(a)(h) := \lambda(g, h)\, g \cdot a(h). \]

Then we have $\{ \alpha_g(a *_\sigma b) = \alpha_g(a) *_\sigma \alpha_g(b) \} \iff \{ d\sigma = \delta\lambda \}$, so $\alpha$ does land in the automorphisms of $C^*(\mathcal{H}; \sigma)$. However, this is still not a group homomorphism since in general $\alpha_{g_1} \circ \alpha_{g_2} \neq \alpha_{g_1g_2}$. If we attempted to construct a crossed product algebra $G \ltimes_\alpha C^*(\mathcal{H})$ it would not be associative. The failure of $\alpha$ to be homomorphic is corrected by the third component, $\beta$. An interpretation of $\beta$ is that it determines a family over $\mathcal{H}_0$ of deformations of $G$ as a noncommutative space from which $\alpha$ is in some sense a homomorphism. We encode this “noncommutative $G$-action” in the following twisted crossed product algebra: $G \ltimes_{\lambda,\beta} C^*(\mathcal{H}; \sigma)$, which is the algebra with multiplication

\[ a * b(g, h) := \int_{\substack{h_1h_2 = h \\ g_1g_2 = g}} a(g_1, h_1)\, b(g_2, g_1^{-1}h_2)\, \chi((g_1, h_1), (g_2, g_1^{-1}h_2)), \]

where

\[ \chi((g_1, h_1), (g_2, g_1^{-1}h_2)) := \sigma(h_1, h_2)\, \lambda(g_1, h_2)\, \beta(g_1, g_2, sh_2). \]



Lemma 7.5. Let $G \ltimes \mathcal{H}$ be the crossed product groupoid associated to the $G$-action on $\mathcal{H}$ (as in Example (2)). Then $\chi \in Z^2(G \ltimes \mathcal{H}; U(1))$. Consequently the multiplication on $G \ltimes_{\lambda,\beta} C^*(\mathcal{H}; \sigma) \equiv C^*(G \ltimes \mathcal{H}, \chi)$ is associative.

Proof. $\chi$ is the image of the cocycle $(\sigma, \lambda, \beta)$ under the chain map of Eq. (5), thus it is a cocycle.

Now it is clear how to interpret the data $(\sigma, \lambda, \beta)$ at groupoid level: it is the data needed to extend the twisted groupoid $(\mathcal{H}, \sigma)$ to a twisted crossed product groupoid $(G \ltimes \mathcal{H}, \chi)$. The meaning of “extend” in this context is that $(\mathcal{H}, \sigma)$ is a sub-(twisted groupoid): $(\mathcal{H}, \sigma) \simeq (\{1\} \ltimes \mathcal{H}, \sigma) \subset (G \ltimes \mathcal{H}, \chi)$, which follows from the fact that $\chi|_{\mathcal{H}} \equiv \sigma$. Clearly in the groupoid interpretation the $U(1)$ coefficients can be replaced by an arbitrary system of coefficients.

Definition 7.6. The pair $(\mathcal{H}, (\sigma, \lambda, \beta))$, where $\mathcal{H}$ is a $G$-groupoid and $(\sigma, \lambda, \beta) \in Z^2_G(\mathcal{H}, U(1))$, will be called a twisted $G$-groupoid.

8. Pontryagin Duality for Generalized Principal Bundles

In this section we introduce an extension of Pontryagin duality, which has been a duality on the category of abelian locally compact groups, to a correspondence of the form

\[ \{ \text{generalized principal } G\text{-bundles} \} \longleftrightarrow \{ \text{twisted } \widehat{G}\text{-gerbes} \} \]

for any abelian locally compact group $G$. In fact we extend this to a duality between a $U(1)$-gerbe on a principal $G$-bundle (though only certain types of $U(1)$-gerbes are allowed) and a $U(1)$-gerbe on a $\widehat{G}$-gerbe. By construction, there will be a Fourier type isomorphism between the twisted groupoid algebras of a Pontryagin dual pair; consequently any invariant constructed from twisted groupoid algebras ($K$-theory for example) will be unaffected by Pontryagin duality. This duality might be of independent interest, especially because it is not a Morita equivalence and thus induces a nontrivial duality at stack level. The Pontryagin dual of a $U(1)$-gerbe on a principal torus bundle will play a crucial role in the understanding of T-duality.
Let us fix the following notation for the remainder of this section: $G$ denotes an abelian locally compact group, $\widehat{G} = \mathrm{Hom}(G, U(1))$ denotes its Pontryagin dual group, and we use $g$'s and $\phi$'s for elements of $G$ and $\widehat{G}$ respectively. Evaluation is often written as a pairing $\langle\phi, g\rangle \equiv \phi(g)$. As usual $\gamma$'s are elements of a groupoid $\mathcal{G}$. According to our groupoid notation, $(G \Rightarrow G)$ denotes the group $G$ thought of as a topological space while $(G \Rightarrow *)$ denotes the group thought of as a group. Thus by definition, the groupoid algebras $C^*(G \Rightarrow G)$ and $C^*(G \Rightarrow *)$ are functions on $G$ with pointwise multiplication in the first case and convolution multiplication in the second. Keeping this in mind, Fourier transform can be interpreted as an isomorphism of groupoid algebras:

\[ \mathcal{F} : C^*(G \Rightarrow G) \longrightarrow C^*(\widehat{G} \Rightarrow *), \qquad a \mapsto \mathcal{F}(a)(\phi) := \int_{g\in G} a(g)\,\phi(g^{-1}). \]

We use the Plancherel measure so that the inverse transform is given by

\[ \mathcal{F}^{-1}(\hat{a})(g) := \int_{\phi\in\widehat{G}} \hat{a}(\phi)\,\phi(g), \qquad \hat{a} \in C^*(\widehat{G} \Rightarrow *). \]

The group $G$ acts on $C^*(G \Rightarrow G)$ by translation

\[ g_1 \cdot a(g) := a(g g_1), \qquad a \in C^*(G \Rightarrow G), \]

and the dual group acts by “dual translation” on $C^*(G \Rightarrow G)$ by $\phi \star a(g) := \langle\phi, g\rangle a(g)$. Under Fourier transform translation and dual translation are interchanged:

\[ \mathcal{F}(g \cdot a)(\phi) = \langle\phi, g\rangle\,\mathcal{F}(a)(\phi) =: g \star \mathcal{F}(a)(\phi), \tag{7} \]
\[ \mathcal{F}(\phi \star a)(\psi) = \mathcal{F}(a)(\phi^{-1}\psi) =: \phi^{-1} \cdot \mathcal{F}(a)(\psi), \tag{8} \]

for $\phi, \psi \in \widehat{G}$.

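Let us quickly check the first one:

\[ \mathcal{F}(g \cdot a)(\phi) = \int_{g_1} a(g_1 g)\,\phi(g_1^{-1}) = \int_{g'} a(g')\,\phi\bigl((g' g^{-1})^{-1}\bigr) = \phi(g)\int_{g'} a(g')\,\phi(g'^{-1}) = g \star \mathcal{F}(a)(\phi). \]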
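Both rules can also be checked numerically in the simplest case. The following is a minimal sketch, assuming $G = \mathbb{Z}/n$ with $\widehat{G}$ identified with $\mathbb{Z}/n$ via $\phi_k(g) = e^{2\pi i k g/n}$; the function names are hypothetical, and counting measure is used instead of the Plancherel normalization, which does not affect Eqs. (7) and (8).

```python
import cmath

def character(n, k, g):
    """The pairing <phi_k, g> = exp(2*pi*i*k*g/n) on Z/n."""
    return cmath.exp(2j * cmath.pi * k * g / n)

def fourier(a):
    """F(a)(phi_k) = sum_g a(g) * phi_k(-g), with counting measure (an assumption)."""
    n = len(a)
    return [sum(a[g] * character(n, k, -g) for g in range(n)) for k in range(n)]

def translate(a, g1):
    """Translation: (g1 . a)(g) = a(g + g1)."""
    n = len(a)
    return [a[(g + g1) % n] for g in range(n)]

def dual_translate(a, m):
    """Dual translation: (phi_m * a)(g) = <phi_m, g> a(g)."""
    n = len(a)
    return [character(n, m, g) * a[g] for g in range(n)]

def close(u, v, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(u, v))

def check_eq7(a, g1):
    """F(g1 . a)(phi_k) == <phi_k, g1> F(a)(phi_k)."""
    n = len(a)
    lhs = fourier(translate(a, g1))
    rhs = [character(n, k, g1) * fk for k, fk in enumerate(fourier(a))]
    return close(lhs, rhs)

def check_eq8(a, m):
    """F(phi_m * a)(phi_k) == F(a)(phi_m^{-1} phi_k) = F(a)(phi_{k-m})."""
    n = len(a)
    lhs = fourier(dual_translate(a, m))
    fa = fourier(a)
    rhs = [fa[(k - m) % n] for k in range(n)]
    return close(lhs, rhs)
```

For instance, `check_eq7(a, 2)` and `check_eq8(a, 3)` can be evaluated on any list `a` of complex numbers of length $n$.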

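With those basic rules of Fourier transform in mind, we are ready to prove:

Definition 8.1. Let $G$ be a locally compact abelian group and $\mathcal{G}$ a groupoid. The following data:

\[ \rho \in Z^1(\mathcal{G}; G), \qquad f \in Z^2(\mathcal{G}; \widehat{G}), \qquad \text{and} \quad \nu \in C^2(\mathcal{G}; U(1)), \]

satisfying

\[ \delta\nu(\gamma_1,\gamma_2,\gamma_3) = \langle f(\gamma_1,\gamma_2), \rho(\gamma_3)\rangle^{-1} \]

will be called Pontryagin duality data. Given Pontryagin duality data $(\rho, f, \nu)$, the following formulas define twisted groupoids:

1. The generalized principal bundle $(G \rtimes_\rho \mathcal{G})$ with twisting 2-cocycle:
\[ \sigma^{\nu f}((g,\gamma_1),(g\rho(\gamma_1),\gamma_2)) := \nu(\gamma_1,\gamma_2)\,\langle f(\gamma_1,\gamma_2), g\rangle \in Z^2(G \rtimes_\rho \mathcal{G}; U(1)). \]
2. The $\widehat{G}$-gerbe $(\widehat{G} \rtimes^f \mathcal{G})$ with twisting 2-cocycle:
\[ \tau^{\rho\nu}((\phi_1,\gamma_1),(\phi_2,\gamma_2)) := \nu(\gamma_1,\gamma_2)\,\langle\phi_2, \rho(\gamma_1)\rangle \in Z^2(\widehat{G} \rtimes^f \mathcal{G}; U(1)). \]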



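To verify this, simply check that the twistings are indeed 2-cocycles, so that the pairs $(G \rtimes_\rho \mathcal{G}, \sigma^{\nu f})$ and $(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$ are actually twisted groupoids.

Theorem 8.2 (Pontryagin duality for groupoids). Let $(G \rtimes_\rho \mathcal{G}, \sigma^{\nu f})$ and $(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$ be twisted groupoids constructed from Pontryagin duality data $(\rho, f, \nu)$ as in Definition (8.1). Then there is a Fourier type isomorphism between the associated twisted groupoid algebras:

\[ \mathcal{F} : C^*(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f}) \longrightarrow C^*(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu}), \qquad a \mapsto \mathcal{F}(a)(\phi,\gamma) := \int_{g\in G} a(g,\gamma)\,\phi(g^{-1}), \quad \gamma \in \mathcal{G}. \]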

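Also, the natural translation action of $G$ on $C^*(G \rtimes_\rho \mathcal{G}; \sigma)$ is taken to the dual translation analogous to Eq. (7):

\[ \mathcal{F}(g \cdot a)(\phi,\gamma) = \langle\phi, g\rangle\,\mathcal{F}(a)(\phi,\gamma) =: g \star \mathcal{F}(a)(\phi,\gamma). \]

Note, however, that $G$ only acts by vector space automorphisms (as opposed to algebra automorphisms) unless $f = 1$.

Proof. The fact that $\mathcal{F}$ is an isomorphism of Banach spaces follows because “fibrewise” this is a classical Fourier transform. Thus verifying that $\mathcal{F}$ is a $C^*$-isomorphism is a matter of seeing that $\mathcal{F}$ takes the multiplication on the first algebra into the multiplication on the second. Let us check. Set $f_{1,2} = f(\gamma_1,\gamma_2) \in \widehat{G}$, $\nu_{1,2} = \nu(\gamma_1,\gamma_2) \in U(1)$, and $\rho_i = \rho(\gamma_i) \in G$. The multiplication on $C^*(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f})$ is by definition

\[ a * b(g,\gamma) := \int_{\gamma_1\gamma_2=\gamma} a(g,\gamma_1)\,b(g\rho_1,\gamma_2)\,\nu_{1,2}\,\langle f_{1,2}, g\rangle = \int_{\gamma_1\gamma_2=\gamma} a(g,\gamma_1)\,\bigl(f_{1,2} \star (\rho_1 \cdot b)\bigr)(g,\gamma_2)\,\nu_{1,2}. \]

Pointwise multiplication on $G$ is transformed to convolution on $\widehat{G}$, which we denote by $\hat{*}$. Group translation and dual group translation behave under the transform according to Eqs. (7) and (8). Using these rules, we have

\begin{align*}
\mathcal{F}(a * b)(\phi,\gamma) &= \int_{\substack{\phi_1\phi_2=\phi \\ \gamma_1\gamma_2=\gamma}} \mathcal{F}(a)(\phi_1,\gamma_1)\;\hat{*}\;\mathcal{F}\bigl(f_{1,2} \star (\rho_1 \cdot b)\bigr)(\phi_2,\gamma_2)\,\nu_{1,2} \\
&= \int_{\substack{\phi_1\phi_2=\phi \\ \gamma_1\gamma_2=\gamma}} \mathcal{F}(a)(\phi_1,\gamma_1)\,\mathcal{F}(b)(\phi_2 f_{1,2}^{-1},\gamma_2)\,\langle\phi_2 f_{1,2}^{-1}, \rho_1\rangle\,\nu_{1,2} \\
&= \int_{\substack{\phi_1\phi_2 f_{1,2}=\phi \\ \gamma_1\gamma_2=\gamma}} \mathcal{F}(a)(\phi_1,\gamma_1)\,\mathcal{F}(b)(\phi_2,\gamma_2)\,\langle\phi_2, \rho_1\rangle\,\nu_{1,2},
\end{align*}

and this last line is exactly the multiplication on $C^*(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$. The statement about $G$-actions is proved by the same computation as for Eq. (7).

Definition 8.3. A pair of a twisted generalized principal $G$-bundle and twisted $\widehat{G}$-gerbe as in Theorem (8.2) are said to be Pontryagin dual.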
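The multiplicativity of the Fourier transform can be tested numerically in a toy case. The sketch below is an illustration under stated assumptions, not the general theorem: it takes $G = \mathbb{Z}/4$, the groupoid to be the group $\mathbb{Z}/2$ with one object, $\rho(c) = 2c$, and $\nu = f = 1$ (the situation of a bundle with trivial twisting). All names are hypothetical, and counting measure is used, so the dual convolution carries an explicit factor $1/N$.

```python
import cmath
import random

N = 4   # G = Z/N, with the dual group identified with Z/N
M = 2   # the groupoid: the group Z/M, viewed as a groupoid with one object
R = 2   # rho(c) = R*c mod N; a homomorphism because N divides R*M

def pairing(k, g):
    """<phi_k, g> = exp(2*pi*i*k*g/N)."""
    return cmath.exp(2j * cmath.pi * k * g / N)

def rho(c):
    return (R * c) % N

def fourier(a):
    """F(a)(k, c) = sum_g a(g, c) * <phi_k, -g>, with counting measure on G."""
    return {(k, c): sum(a[(g, c)] * pairing(k, -g) for g in range(N))
            for k in range(N) for c in range(M)}

def bundle_product(a, b):
    """(a*b)(g, c) = sum_{c1+c2=c} a(g, c1) * b(g + rho(c1), c2), with nu = f = 1."""
    out = {}
    for g in range(N):
        for c in range(M):
            out[(g, c)] = sum(
                a[(g, c1)] * b[((g + rho(c1)) % N, (c - c1) % M)]
                for c1 in range(M))
    return out

def gerbe_product(A, B):
    """(A*B)(k, c) = (1/N) * sum A(k1, c1) B(k2, c2) <phi_k2, rho(c1)>
    over k1+k2=k and c1+c2=c; the 1/N compensates for counting measure."""
    out = {}
    for k in range(N):
        for c in range(M):
            total = 0
            for c1 in range(M):
                for k1 in range(N):
                    k2 = (k - k1) % N
                    total += (A[(k1, c1)] * B[(k2, (c - c1) % M)]
                              * pairing(k2, rho(c1)))
            out[(k, c)] = total / N
    return out

def random_function(rng):
    return {(g, c): complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            for g in range(N) for c in range(M)}

def max_defect(seed=0):
    """Largest entry of |F(a*b) - F(a)*F(b)| for random a and b."""
    rng = random.Random(seed)
    a, b = random_function(rng), random_function(rng)
    lhs = fourier(bundle_product(a, b))
    rhs = gerbe_product(fourier(a), fourier(b))
    return max(abs(lhs[key] - rhs[key]) for key in lhs)
```

The defect returned by `max_defect` should vanish up to floating point error, reflecting that the factor $\langle\phi_2, \rho_1\rangle$ on the dual side is exactly what translation by $\rho_1$ becomes under the transform.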


This Pontryagin duality is independent of choice of cocycles $\rho$ and $f$ within their cohomology classes, and furthermore, if we alter $\nu$ by a closed 2-cochain (recall $\nu$ itself is not closed) we obtain a new Pontryagin dual pair. Here is the precise statement of these facts; the proof is a simple computation.

Proposition 8.4. Suppose $(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f})$ and $(\widehat{G} \rtimes^f \mathcal{G}; \tau^{\rho\nu})$ are Pontryagin dual and we are given $\alpha \in C^1(\mathcal{G}; \widehat{G})$ and $\beta \in C^0(\mathcal{G}; G)$ and $c \in Z^2(\mathcal{G}; U(1))$. Define $\rho' := \rho\,\delta\beta$ and $f' := f\,\delta\alpha$. Then $(G \rtimes_{\rho'} \mathcal{G}; \sigma^{\nu' f'})$ and $(\widehat{G} \rtimes^{f'} \mathcal{G}; \tau^{\rho'\nu'})$ are Pontryagin dual as well, where

ν1,2 := c1,2 ν1,2 f 1,2 , β(sγ2 )−1 α1 , ρ2 −1 .

Since Pontryagin duality is defined in terms of a Fourier isomorphism, we have the following obvious Morita invariance property:

Proposition 8.5. Suppose we are given two sets of Pontryagin duality data $(\mathcal{G}, G, \rho, \nu, f)$ and $(\mathcal{G}', G, \rho', \nu', f')$. Then

1. $C^*(G \rtimes_\rho \mathcal{G}; \sigma^{\nu f})$ is Morita equivalent to $C^*(G \rtimes_{\rho'} \mathcal{G}'; \sigma^{\nu' f'})$ if and only if $C^*(\widehat{G} \rtimes^f \mathcal{G}; \sigma^{\rho\nu})$ is Morita equivalent to $C^*(\widehat{G} \rtimes^{f'} \mathcal{G}'; \sigma^{\rho'\nu'})$.
2. In particular, if $G \rtimes_\rho \mathcal{G}$ is $G$-equivariantly Morita equivalent to $G \rtimes_{\rho'} \mathcal{G}'$ and the twisted groupoid $(G \rtimes_\rho \mathcal{G}, \sigma^{\nu f})$ is Morita equivalent to $(G \rtimes_{\rho'} \mathcal{G}', \sigma^{\nu' f'})$ (see Theorem (9.1) for Morita equivalence of twisted groupoids), then all $C^*$-algebras in (1) are Morita equivalent. The same is true if the $\widehat{G}$-twisted groupoid $(\mathcal{G}, f)$ is Morita equivalent to $(\mathcal{G}', f')$ and the twisted groupoid $(\widehat{G} \rtimes^f \mathcal{G}, \sigma^{\rho\nu})$ is Morita equivalent to $(\widehat{G} \rtimes^{f'} \mathcal{G}', \sigma^{\rho'\nu'})$.

Note that we have not claimed that a Morita equivalence at the groupoid level produces a Morita equivalence of Pontryagin dual groupoids. Though there is often such a correspondence, it is not clear that there is always one. Here are a couple of important examples of Pontryagin duality:

Example 7. Pontryagin duality actually shows that any twisted groupoid algebra is a $C^*$-subalgebra of the (untwisted) groupoid algebra of a $U(1)$-gerbe. Using the notation of Theorem (8.2), set $\rho = \nu = 1$, $G = \mathbb{Z}$, and $\widehat{G} = U(1)$. Then the twisted groupoid $((\mathbb{Z} \rtimes_1 \mathcal{G}), \sigma^f)$ (this is a trivial $\mathbb{Z}$-bundle with twisting $\sigma^f$) is Pontryagin dual to the gerbe $(U(1) \rtimes^f \mathcal{G})$. Explicitly,

\[ \sigma^f((n,\gamma_1),(n,\gamma_2)) = f(\gamma_1,\gamma_2)^n, \quad \text{for } ((n,\gamma_1),(n,\gamma_2)) \in (\mathbb{Z} \rtimes_1 \mathcal{G})_2, \quad f \in Z^2(\mathcal{G}; U(1)). \]

But the functions in $C^*(U(1) \rtimes^f \mathcal{G}) \simeq C^*(\mathbb{Z} \rtimes_1 \mathcal{G}, \sigma^f)$ with support in $\{1\} \rtimes_1 \mathcal{G}$ clearly form a $C^*$-subalgebra identical to $C^*(\mathcal{G}; f)$.

Example 8. Again in the notation of Theorem (8.2), the case $\nu = f = 1$ shows that a generalized principal bundle $G \rtimes_\rho \mathcal{G}$ is Pontryagin dual to the twisted groupoid $(\widehat{G} \rtimes^1 \mathcal{G}, \tau^\rho)$. (The latter object is the trivial $\widehat{G}$-gerbe on $\mathcal{G}$, with twisting $\tau^\rho$.)
One might wonder if this duality can be expressed purely in terms of gerbes. The answer is that it cannot. More precisely, this duality does not extend via the association (twisted groupoids) ←→ (U (1)-gerbes)



to a duality between $U(1)$-gerbes that is implemented by the Fourier isomorphism of groupoid algebras. Indeed, the gerbe associated to the right side is $U(1) \rtimes^{\tau} (\widehat{G} \rtimes^1 \mathcal{G})$ (here $\tau = \tau^\rho$), but the extension on the left side corresponding (in the sense of having a Fourier isomorphic $C^*$-algebra) to that gerbe is $G \rtimes_\chi (\mathbb{Z} \rtimes_1 \mathcal{G})$, where $\chi(n,\gamma) := \rho(\gamma)^n$, which cannot be written as a $U(1)$-gerbe. The groupoid $G \rtimes_\chi (\mathbb{Z} \rtimes_1 \mathcal{G})$ actually corresponds to the disjoint union of the tensor powers of the generalized principal bundle.

Pontryagin duality will be used to explain “classical” T-duality, in Sect. (12). Before we get to T-duality, however, we need the tools to compare Morita equivalent twisted groupoids, and we need to know some specific Morita equivalences. The development of these tools is the subject of the next two sections.

9. Twisted Morita Equivalence

Let $\mathcal{H}$ and $\mathcal{K}$ be groupoids. A Morita $(\mathcal{H}$-$\mathcal{K})$-bimodule $P$ determines a way to compare the groupoid cohomology of $\mathcal{H}$ with that of $\mathcal{K}$. More specifically, there is a double complex $C^{ij}(P)$ associated to the bimodule such that the moment maps $\mathcal{H}_0 \leftarrow P \rightarrow \mathcal{K}_0$ induce augmentations by the groupoid cohomology complexes

\[ C^i(\mathcal{H}; M) \to C^{i\bullet}(P; M) \quad \text{and} \quad C^{\bullet j}(P; M) \leftarrow C^j(\mathcal{K}; M). \tag{9} \]

We say groupoid cocycles $c \in C^n(\mathcal{H}; M)$ and $c' \in C^n(\mathcal{K}; M)$ are cohomologous when their images in $C(P; M)$ are cohomologous. We assume for simplicity that the coefficients $M$ are constant and have the trivial actions of $\mathcal{H}$ and $\mathcal{K}$. The double complex is defined as follows:

\[ C^{ij}(P) := \bigl(C(\mathcal{H}_j \times_{\mathcal{H}_0} P \times_{\mathcal{K}_0} \mathcal{K}_i; M);\ \delta_H, \delta_K\bigr). \]

The differentials $(\delta_H : C^{ij} \to C^{i\,j+1})$ and $(\delta_K : C^{ij} \to C^{i+1\,j})$ are given, for $f \in C^{ij}$, by the formulas

\begin{align*}
\delta_H f(h_1,\dots,h_{j+1},p,k\text{'s}) :={}& f(h_2,\dots,h_{j+1},p,k\text{'s}) + \sum_{n=1\dots j} (-1)^n f(\dots,h_n h_{n+1},\dots,p,k\text{'s}) \\
& + (-1)^{j+1} f(h_1,\dots,h_j,h_{j+1}p,k\text{'s}), \\
\delta_K f(h\text{'s},p,k_1,\dots,k_{i+1}) :={}& f(h\text{'s},pk_1,k_2,\dots,k_{i+1}) + \sum_{n=1\dots i} (-1)^n f(h\text{'s},p,\dots,k_n k_{n+1},\dots) \\
& + (-1)^{i+1} f(h\text{'s},p,k_1,\dots,k_i).
\end{align*}

Our main reason for comparing cohomology of Morita equivalent groupoids is the following theorem. The first statement of the theorem is a classic statement about (stack theoretic) gerbes, but we will reproduce it for convenience.


Theorem 9.1. Let $P$ be a Morita equivalence $(\mathcal{H}$-$\mathcal{K})$-bimodule, let $M$ be an Abelian group acted upon trivially by the two groupoids, and suppose we are given 2-cocycles $\psi \in Z^2(\mathcal{H}; M)$ and $\chi \in Z^2(\mathcal{K}; M)$ whose images $\tilde\psi$ and $\tilde\chi$ in the double complex of the Morita equivalence are cohomologous. Then

1. The $M$-gerbes $M \rtimes^\psi \mathcal{H}$ and $M \rtimes^\chi \mathcal{K}$ are Morita equivalent.
2. For any group homomorphism $\phi : M \to N$, the $N$-gerbes $N \rtimes^{\phi\circ\psi} \mathcal{H}$ and $N \rtimes^{\phi\circ\chi} \mathcal{K}$ are Morita equivalent.
3. For any group homomorphism $\phi : M \to U(1)$, the twisted $C^*$-algebras $C^*(\mathcal{H}; \phi\circ\psi)$ and $C^*(\mathcal{K}; \phi\circ\chi)$ are Morita equivalent.

For example when $M = U(1)$ with $\mathcal{H}$ and $\mathcal{K}$ acting trivially, the representations of $M$ are identified with the integers, $(u \mapsto u^n)$, and the proposition implies that the “gerbes of weight $n$”, $C^*(\mathcal{H}; \psi^n)$ and $C^*(\mathcal{K}; \chi^n)$, are Morita equivalent.

Proof. We begin by proving the statement about $M$-gerbes. The idea is that a cocycle $(\mu, \nu^{-1}) \in C^{1,0} \times C^{0,1}$ satisfying

\[ D(\mu, \nu^{-1}) := (\delta_H\mu,\ \delta_K\mu\,\delta_H\nu,\ \delta_K\nu^{-1}) = (\tilde\psi,\ 0,\ \tilde\chi^{-1}) \in C^{2,0} \times C^{1,1} \times C^{0,2} \tag{10} \]

provides exactly the data needed to form a Morita $(M \rtimes^\psi \mathcal{H})$-$(M \rtimes^\chi \mathcal{K})$-bimodule structure on $M \times P$. Indeed, define for $m$'s $\in M$, $h$'s $\in \mathcal{H}$, $k$'s $\in \mathcal{K}$, and $p$'s $\in P$,

1. Left multiplication of $M \rtimes^\psi \mathcal{H}$: $(m_1, h) * (m_2, p) := (m_1 m_2\,\mu(h,p),\ hp)$,
2. Right multiplication of $M \rtimes^\chi \mathcal{K} \Rightarrow \mathcal{K}_0$: $(m_1, p) * (m_2, k) := (m_1 m_2\,\nu(p,k),\ pk)$.

Then

$\delta_H\mu = \tilde\psi$ ⇔ the left multiplication is homomorphic,
$\delta_K\nu = \tilde\chi$ ⇔ the right multiplication is homomorphic,
$\delta_K\mu = \delta_H\nu^{-1}$ ⇔ the left and right multiplications commute.

For example the equality

\[ \delta_K\nu(p,k_1,k_2) := \nu(pk_1,k_2)\,\nu(p,k_1k_2)^{-1}\,\nu(p,k_1) = \tilde\chi(p,k_1,k_2) =: \chi(k_1,k_2) \]

holds if and only if

\begin{align*}
((m,p) * (m_1,k_1)) * (m_2,k_2) &= (m m_1 m_2\,\nu(p,k_1)\,\nu(pk_1,k_2),\ pk_1k_2) \\
&= (m(m_1 m_2\,\chi(k_1,k_2))\,\nu(p,k_1k_2),\ pk_1k_2) \\
&= (m,p) * ((m_1,k_1) * (m_2,k_2)).
\end{align*}

Now we will check that the right action is principal and that the orbit space $(M \times P)/(M \rtimes^\chi \mathcal{K})$ is isomorphic to $\mathcal{H}_0$. Suppose $(m,p) * (m_1,k_1) = (m,p) * (m_2,k_2)$. Then $k_1 = k_2$ since the action of $\mathcal{K}$ is principal, and then $m_1 = m_2$ is forced, so the action is principal. Next, the equation

\[ (m,p) * (m_1\,\nu(p,k)^{-1}, k) = (m m_1,\ pk) \]

makes it clear that the orbit space $(M \times P)/(M \rtimes^\chi \mathcal{K})$ is the same as the orbit space $P/\mathcal{K}$, which is $\mathcal{H}_0$. The situation is obviously symmetric, so the left action satisfies the analogous properties, and thus the first statement of the proposition is proved.
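The left-hand condition can be written out in the same way as the right-hand one. This is only a sketch: it assumes the multiplication $(m_1,h_1)*(m_2,h_2) = (m_1 m_2\,\psi(h_1,h_2),\ h_1h_2)$ on the $M$-gerbe over $\mathcal{H}$, mirroring the one used for $\mathcal{K}$ in the proof, and it reads $\delta_H$ multiplicatively, with the signs of the differential becoming exponents.

```latex
\begin{align*}
\bigl((m_1,h_1)*(m_2,h_2)\bigr)*(m,p)
  &= \bigl(m_1 m_2 m\,\psi(h_1,h_2)\,\mu(h_1h_2,p),\; h_1h_2p\bigr),\\
(m_1,h_1)*\bigl((m_2,h_2)*(m,p)\bigr)
  &= \bigl(m_1 m_2 m\,\mu(h_2,p)\,\mu(h_1,h_2p),\; h_1h_2p\bigr),
\end{align*}
so the two expressions agree exactly when
\[
\delta_H\mu(h_1,h_2,p)
  = \mu(h_2,p)\,\mu(h_1h_2,p)^{-1}\,\mu(h_1,h_2p)
  = \psi(h_1,h_2)
  = \tilde\psi(h_1,h_2,p).
\]
```

The same pattern, applied to $((m_1,h)*(m_2,p))*(m_3,k)$ and $(m_1,h)*((m_2,p)*(m_3,k))$, gives $\mu(h,p)\,\nu(hp,k) = \nu(p,k)\,\mu(h,pk)$, which is the commutation condition relating $\delta_K\mu$ and $\delta_H\nu$.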



The second statement now follows immediately because whenever Eq. (10) is satisfied for the quadruple $(\mu, \nu, \psi, \chi)$ in $C(P; M)$, it is also satisfied for $(\phi\circ\mu, \phi\circ\nu, \phi\circ\psi, \phi\circ\chi)$ in $C(P; N)$. In other words $\phi\circ\psi$ and $\phi\circ\chi$ are cohomologous.

The third statement of the theorem can be proved directly by exhibiting a $C^*$-algebra bimodule, but this will not be necessary because we already have a Morita $C^*(M \rtimes^\psi \mathcal{H})$-$C^*(M \rtimes^\chi \mathcal{K})$-bimodule, coming from Proposition (5.1) and the fact that $M \rtimes^\psi \mathcal{H}$ and $M \rtimes^\chi \mathcal{K}$ are Morita equivalent. We will manipulate this bimodule into a $C^*(\mathcal{H}, \phi\circ\psi)$-$C^*(\mathcal{K}, \phi\circ\chi)$-bimodule.

Note that $(\widehat{M} \rtimes_1 \mathcal{H}, \sigma^\psi)$ is Pontryagin dual to $(M \rtimes^\psi \mathcal{H})$ and $(\widehat{M} \rtimes_1 \mathcal{K}, \sigma^\chi)$ is Pontryagin dual to $(M \rtimes^\chi \mathcal{K})$, thus the associated $C^*$-algebras are pairwise Fourier isomorphic. Then the Morita equivalence bimodule (which is a completion of $C_c(M \times P)$, where $P$ is the $\mathcal{H}$-$\mathcal{K}$-bimodule) for $C^*(M \rtimes^\psi \mathcal{H})$ and $C^*(M \rtimes^\chi \mathcal{K})$ is taken via the Fourier isomorphism to a Morita equivalence bimodule between $C^*(\widehat{M} \rtimes_1 \mathcal{H}, \sigma^\psi)$ and $C^*(\widehat{M} \rtimes_1 \mathcal{K}, \sigma^\chi)$. This Fourier transformed bimodule will be a completion $X$ of $C_c(\widehat{M} \times P)$. Then for $\phi \in \widehat{M}$, evaluation at $\phi$ determines projections

\[ ev_\phi : C^*(\widehat{M} \rtimes_1 \mathcal{H}, \sigma^\psi) \to C^*(\mathcal{H}, \phi\circ\psi), \]
\[ ev_\phi : C^*(\widehat{M} \rtimes_1 \mathcal{K}, \sigma^\chi) \to C^*(\mathcal{K}, \phi\circ\chi), \quad \text{and} \]
\[ ev_\phi : C_c(\widehat{M} \times P) \to C_c(P), \]

that are compatible with the bimodule structure of $X$, so $ev_\phi(X)$ is automatically a Morita equivalence $C^*(\mathcal{H}, \phi\circ\psi)$-$C^*(\mathcal{K}, \phi\circ\chi)$-bimodule. Thus $X$ is actually a family of Morita equivalences parameterized by $\phi \in \widehat{M}$, and in particular the third statement of the theorem is true.

Remark 9.2. Just as a Morita equivalence lets you compare cohomology, a $G$-equivalence lets you compare $G$-equivariant cohomology. Indeed, all complexes involved will have the commuting $G$-actions, and there will be an associated tricomplex which is coaugmented by the complexes computing equivariant cohomology of the two groupoids.
Rather than using the tricomplex, however, one can map the equivariant complexes into the complexes of the crossed product groupoids (as in Eq. (5)) and do the comparing there. The result is the same but somewhat less tedious to compute.

10. Some Facts about Groupoid Cohomology

Now there is good motivation for wanting to know when groupoid cocycles are cohomologous. In this section we will collect some facts that help in that pursuit. The Morita equivalences that come from actual groupoid homomorphisms have nice properties with respect to cohomology. They are called essential equivalences.

Definition 10.1. [C]. Let $\phi : \mathcal{G} \to \mathcal{H}$ be a morphism of topological groupoids (i.e. a continuous functor). This morphism determines a right principal $\mathcal{G}$-$\mathcal{H}$-bimodule $P_\phi := \mathcal{G}_0 \times_{\mathcal{H}_0} \mathcal{H}_1$. If $P_\phi$ is a Morita equivalence bimodule (which follows if it is left principal) then $\phi$ is called an essential equivalence. Note that for any morphism $\phi$ the left moment map $P_\phi \to \mathcal{G}_0$ has a canonical section.

Proposition 10.2. Let $\mathcal{G}$ and $\mathcal{H}$ be topological groupoids, and suppose $P$ is a Morita equivalence $\mathcal{G}$-$\mathcal{H}$-bimodule with left and right moment maps $\mathcal{G}_0 \xleftarrow{\ \ell\ } P \xrightarrow{\ \rho\ } \mathcal{H}_0$. Then:


1. $P$ is equivariantly isomorphic to $P_\phi$ for some essential equivalence $\phi : \mathcal{G} \to \mathcal{H}$ if and only if the left moment map $\ell : P \to \mathcal{G}_0$ admits a continuous section.
2. Any morphism $\phi : \mathcal{G} \to \mathcal{H}$ determines, via pullback, a chain morphism $\phi^* : C^\bullet(\mathcal{H}) \to C^\bullet(\mathcal{G})$.
3. The moment maps $\ell$ and $\rho$ determine chain morphisms $C^\bullet(\mathcal{G}) \xrightarrow{\ \ell^*\ } \mathrm{tot}(C^{\bullet\bullet}(P)) \xleftarrow{\ \rho^*\ } C^\bullet(\mathcal{H})$.
4. Any continuous section of $\ell : P \to \mathcal{G}_0$ determines a contraction of the coaugmented complex $C^k(\mathcal{G}) \to C^{\bullet k}(P)$ for each $k$, and thus a quasi-inverse $[\ell^*]^{-1} : H^*(\mathrm{tot}(C(P))) \to H^*(\mathcal{G})$, and thus a homomorphism $[\ell^*]^{-1} \circ [\rho^*] : H^*(\mathcal{H}) \to H^*(\mathcal{G})$.
5. The two chain morphisms $\ell^* \circ \phi^*$ and $\rho^*$ are homotopic; in particular $[\phi^*] = [\ell^*]^{-1} \circ [\rho^*]$.

Proof. The first statement is an easy exercise and the next two statements do not require proof. The homotopy for the fourth statement is in the proof of Lemma 1 [C] and the homotopy for the fifth statement is written on page (8) of [C].

Corollary 10.3. Suppose that $\phi : \mathcal{G} \to \mathcal{H}$ is an essential equivalence, that $M$ is a locally compact abelian group viewed as a trivial module over both groupoids, and that $\chi \in Z^2(\mathcal{H}; M)$ is a 2-cocycle. Then $\chi$ and $\phi^*\chi$ satisfy the conditions of Theorem (9.1); in particular $M \rtimes^\chi \mathcal{H}$ is Morita equivalent to $M \rtimes^{\phi^*\chi} \mathcal{G}$.

Proof. Use the notation of Proposition (10.2). The images of $\phi^*\chi$ and $\chi$ in $C^{\bullet\bullet}(P)$ are $\ell^* \circ \phi^*\chi$ and $\rho^*\chi$ respectively, which are cohomologous by statement (5), and these are precisely the conditions of Theorem (9.1).

Corollary 10.4. If $\phi : \mathcal{G} \to \mathcal{H}$ is an essential equivalence such that the right moment map $P_\phi \xrightarrow{\ \rho\ } \mathcal{H}_0$ admits a section then $[\phi^*] : H^*(\mathcal{H}) \to H^*(\mathcal{G})$ is an isomorphism.

Proof. This follows from Corollary (10.3) and Part (4) of Proposition (10.2).

Corollaries (10.3) and (10.4) will be very useful because all of the Morita equivalences that have been introduced so far are essential equivalences, as the next proposition shows. Before the proposition let us describe two more groupoids.

Example 9. Let $\rho : \mathcal{G} \to G/N$ be a homomorphism and let $G \times_{G/N,\rho} \mathcal{G}_1$ denote the fibred product (i.e. the space $\{(g,\gamma) \mid gN = \rho(\gamma) \in G/N\}$). Define

\[ G \times_{G/N,\rho} \mathcal{G} := (G \times_{G/N,\rho} \mathcal{G}_1 \Rightarrow \mathcal{G}_0) \]

with structure maps $s(g,\gamma) := s\gamma$, $r(g,\gamma) := r\gamma$, and $(g_1,\gamma_1) \circ (g_2,\gamma_2) := (g_1 g_2, \gamma_1\gamma_2)$. Note that any lift of $\rho$ to a continuous map $\tilde\rho : \mathcal{G}_1 \to G$ determines an isomorphism of groupoids

\[ N \rtimes^{\delta\rho} \mathcal{G} \longrightarrow G \times_{G/N,\rho} \mathcal{G}, \qquad (n,\gamma) \mapsto (n\tilde\rho(\gamma), \gamma). \]

When no such lift exists this fibred product groupoid is an example of an $N$-gerbe without section.



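Example 10. The fibred product groupoid of Example (9) is equipped with a canonical right module $P := G \times \mathcal{G}_0$, for which the induced groupoid is

\[ G \rtimes (G \times_{G/N,\rho} \mathcal{G}) := ((G \times G \times_{G/N,\rho} \mathcal{G}_1) \Rightarrow G \times \mathcal{G}_0) \]

with structure maps $s(g,g',\gamma) := (gg', s\gamma)$, $r(g,g',\gamma) := (g, r\gamma)$, and $(g,g_1,\gamma_1) \circ (gg_1,g_2,\gamma_2) := (g, g_1g_2, \gamma_1\gamma_2)$. A lift $\tilde\rho$ determines a homeomorphism

\[ G \rtimes (N \rtimes^{\delta\rho} \mathcal{G}) \longrightarrow G \rtimes (G \times_{G/N,\rho} \mathcal{G}), \qquad (g,n,\gamma) \mapsto (g, n\tilde\rho(\gamma), \gamma) \]

which implicitly defines the groupoid structure on $G \rtimes N \rtimes^{\delta\rho} \mathcal{G}$ as the one making this a groupoid isomorphism.

Proposition 10.5 (Essential equivalences). Let $\mathcal{G}$ be a groupoid, $N$ a closed subgroup of a locally compact group $G$, $N_G(N)$ its normalizer in $G$, and $\rho : \mathcal{G} \to N_G(N)/N \subset G/N$ a homomorphism. Then

1. The gluing morphism $\mathcal{G}_{\mathcal{U}} \to \mathcal{G}$ from a refinement $\mathcal{G}_{\mathcal{U}}$ corresponding to a locally finite cover $\mathcal{U}$ of $\mathcal{G}_0$ (see Example (1)) is an essential equivalence.
2. The quotient morphism
\[ X \rtimes N \to (X/N \Rightarrow X/N), \qquad (x,n) \mapsto xN \in X/N, \]
for $X$ a free and proper right $N$-space, is an essential equivalence.
3. The inclusion
\[ \iota : (G \times_{G/N,\rho} \mathcal{G}) \to G \ltimes (G/N \rtimes_\rho \mathcal{G}), \qquad (g,\gamma) \mapsto (g, eN, \gamma) \]
is an essential equivalence and in particular if $\rho$ lifts to a continuous map $\tilde\rho : \mathcal{G} \to G$ then
\[ \iota : (N \rtimes^{\delta\rho} \mathcal{G}) \to G \ltimes (G/N \rtimes_\rho \mathcal{G}), \qquad (n,\gamma) \mapsto (n\tilde\rho(\gamma), eN, \gamma) \]
is an essential equivalence.
4. The quotient map
\[ \kappa : G \rtimes (G \times_{G/N,\rho} \mathcal{G}) \to G/N \rtimes_\rho \mathcal{G}, \qquad (g,g',\gamma) \mapsto (gN, \gamma) \]
is an essential equivalence and in particular if $\rho$ lifts to a continuous map $\tilde\rho : \mathcal{G} \to G$ then
\[ \kappa : G \rtimes (N \rtimes^{\delta\rho} \mathcal{G}) \to G/N \rtimes_\rho \mathcal{G}, \qquad (g,n,\gamma) \mapsto (gN, \gamma) \]
is an essential equivalence.
5. If $\mathcal{G}$ is a Čech groupoid, $N$ is normal in $G$, and $Q := (G/N \times \mathcal{G}_0)/(t, r\gamma) \sim (t\rho(\gamma), s\gamma)$ is the principal $G/N$-bundle on $X = \mathcal{G}_0/\mathcal{G}_1$ with transition functions given by $\rho$, then the quotient $q : (G/N \rtimes_\rho \mathcal{G}) \longrightarrow (Q \Rightarrow Q)$ is an essential equivalence.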


Furthermore, in all cases where there are obvious left $G$ actions, these are essential $G$-equivalences.

Proof. For the groupoids that were introduced in Sect. (4), simply check that the bimodules determined by these morphisms are the same as the Morita equivalence bimodules that were described there. The two new cases involving Examples (9) and (10) are very similar and are left for the reader. The statement about $G$-equivariance is clear.

Proposition 10.6. In the notation of Proposition (10.5), suppose that $G \to G/N$ admits a continuous section (for example if $N = \{e\}$, or if $N$ is a component of $G$ or if $G$ is discrete), then the essential equivalences $\iota$ and $\kappa$ induce isomorphisms of groupoid cohomology.

Proof. By Corollary (10.4), we only need to produce sections of the right moment maps, and the section $\sigma : G/N \to G$ provides these. The case $\mathcal{G} = *$ illustrates the answer: for $\iota$,

\[ G/N \ni gN \mapsto (*, \sigma(gN), eN) \in \{*\} \times G \times eN \simeq P_\iota \]

does the job, while for $\kappa$ it is

\[ G/N \ni gN \mapsto (\sigma(gN), gN) \in G \times_{G/N} G/N \simeq P_\kappa. \]

The following proposition is important because it implies that cocycles on crossed product groupoids can be assumed to be in a special form.

Proposition 10.7. The groupoid cohomology $H^*(G \ltimes (G \rtimes_\rho \mathcal{G}); B)$ is a direct summand of the equivariant cohomology $H^*_G(G \rtimes_\rho \mathcal{G}; B)$.

Proof. We will show that the identity map on $H^*(G \ltimes (G \rtimes_\rho \mathcal{G}); B)$ factors through $H^*_G(G \rtimes_\rho \mathcal{G}; B)$. The inclusion $\iota : \mathcal{G} \to G \ltimes (G \rtimes_\rho \mathcal{G})$ sending $\gamma \mapsto (\rho(\gamma), e, \gamma)$ induces a chain map

\[ \iota^* : C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G}); B) \longrightarrow C^\bullet(\mathcal{G}; B) \]

which is a quasi-isomorphism by Proposition (10.6). The quotient $G \ltimes (G \rtimes_\rho \mathcal{G}) \longrightarrow \mathcal{G}$ induces the chain map

\[ q^* : C^\bullet(\mathcal{G}; B) \longrightarrow C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G}); B) \]

which is also a quasi-isomorphism, and in fact induces the quasi-inverse to $\iota^*$ since $\iota^* \circ q^* = (q \circ \iota)^* = \mathrm{Id}^*$. Now the quotient $G \rtimes_\rho \mathcal{G} \to \mathcal{G}$ induces a chain map $C^\bullet(\mathcal{G}) \to C^\bullet(G \rtimes_\rho \mathcal{G})$ and thus a chain map

\[ C^\bullet(\mathcal{G}) \to C^\bullet(G \rtimes_\rho \mathcal{G}) \to \mathrm{tot}\, C^\bullet(G, C^\bullet(G \rtimes_\rho \mathcal{G})). \]

Note that the first map and the composition of both maps are chain morphisms, but the second is not.

Following this by the morphism $\mathrm{tot}\, C^\bullet(G, C^\bullet(G \rtimes_\rho \mathcal{G})) \to C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G}))$ of Eq. (5) induces a sequence

\[ C^\bullet(\mathcal{G}) \longrightarrow \mathrm{tot}\, C^\bullet(G; C^\bullet(G \rtimes_\rho \mathcal{G})) \longrightarrow C^\bullet(G \ltimes (G \rtimes_\rho \mathcal{G})) \]

whose composition is easily seen to equal $q^*$. Finally, precomposing with $\iota^*$ provides the promised factorization.



Remark 10.8. Whenever a group $G$ acts freely and properly on a groupoid $\mathcal{H}$ and $\mathcal{H} \to G\backslash\mathcal{H}$ has local sections, then $\mathcal{H}$, being a principal $G$-bundle, is equivariantly Morita equivalent to $G \rtimes_\rho \mathcal{G}$ for some $\rho$ and $\mathcal{G}$, where $\mathcal{G}$ is a refinement of the quotient groupoid $\mathcal{H}/G := (\mathcal{H}_1/G \Rightarrow \mathcal{H}_0/G)$. In particular there is a refinement of $\mathcal{H}$ that is equivariantly isomorphic to $G \rtimes_\rho \mathcal{G}$. Thus after a possible refinement the above proposition is a statement about the cohomology of all free and proper $G$-groupoids with local sections.

For completeness we will include the following proposition, which might be attributable to Haefliger. It implies that one can work exclusively with essential equivalences if desired.

Proposition 10.9. Let $P$ be a $\mathcal{G}$-$\mathcal{H}$-Morita equivalence which is locally trivial as an $\mathcal{H}$-module. Then the Morita equivalence can be factored in the form

\[ \mathcal{G} \longleftarrow \mathcal{G}_{\mathcal{U}} \xrightarrow{\ \phi\ } \mathcal{H}, \]

where $\mathcal{G}_{\mathcal{U}}$ is a refinement of $\mathcal{G}$ and $\phi$ is an essential equivalence.

Proof. The left moment map for $P$ admits local sections so there is a cover $\mathcal{U}$ of $\mathcal{G}_0$ such that the refined Morita $\mathcal{G}_{\mathcal{U}}$-$\mathcal{H}$-bimodule $P_{\mathcal{U}}$ has a section for its left moment map. By Proposition (10.2) $P_{\mathcal{U}} \simeq P_\phi$ for some essential equivalence $\phi$.

11. Generalized Mackey-Rieffel Imprimitivity

In this section we show that a specific pair of twisted groupoids is Morita equivalent. The two Morita equivalent twisted groupoids correspond, respectively, to a $U(1)$-gerbe on a crossed product groupoid for a $G$-action on a generalized principal $G/N$-bundle, and to a $U(1)$-gerbe over an $N$-gerbe. We also describe group actions on the groupoids that make the Morita equivalence equivariant. The equivalence is a simple consequence of the methods of Sect. (12), but it deserves to be singled out because both twisted groupoids appear in the statement of classical T-duality.

Theorem 11.1 (Generalized Mackey-Rieffel imprimitivity). Let $\mathcal{G}$ be a groupoid, $G$ a locally compact group, $N < G$ a closed normal subgroup, and $\bar\rho : \mathcal{G} \to G/N$ a homomorphism that admits a continuous lift $\rho : \mathcal{G} \to G$. Form the two groupoids of Example (6):

1. $\mathcal{H} := G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$,
2. $\mathcal{K} := N \rtimes^{\delta\rho} \mathcal{G}$.

Then for any $G$-equivariant gerbe presented by a 2-cocycle $(\sigma, \lambda, \beta)$ as in Eq. (6), there is a Morita equivalence of twisted groupoids $(\mathcal{H}, \psi) \sim (\mathcal{K}, \chi)$, where $\psi \in Z^2(\mathcal{H}; U(1))$ and $\chi \in Z^2(\mathcal{K}; U(1))$ are given by

\[ \psi((g_1,t,\gamma_1),(g_2, g_1^{-1}t\rho(\gamma_1), \gamma_2)) := \sigma(t,\gamma_1,\gamma_2)\,\lambda(g_1, t\rho(\gamma_1), \gamma_2) \times \beta(g_1, g_2, t\rho(\gamma_1)\rho(\gamma_2), s\gamma_2), \tag{11} \]
\[ \chi((n_1,\gamma_1),(n_2,\gamma_2)) := \sigma(e_{G/N},\gamma_1,\gamma_2)\,\lambda(n_1\rho(\gamma_1), \rho(\gamma_1), \gamma_2) \times \beta(n_1\rho(\gamma_1), n_2\rho(\gamma_2), \rho(\gamma_1)\rho(\gamma_2), s\gamma_2), \tag{12} \]

for $g$'s $\in G$, $t \in G/N$, $n$'s $\in N$, and $\gamma$'s $\in \mathcal{G}$.


Proof. For the sake of understanding, let us first see where $\psi$ comes from. Set $(g_1, h_1) = (g_1, t, \gamma_1)$, $(g_2, h_2) = (g_2, g_1^{-1}t\rho(\gamma_1), \gamma_2)$. Then a composed pair looks like $(g_1,h_1)(g_2,h_2) = (g_1g_2,\ h_1\,g_1h_2)$, and

\[ \psi = \sigma(h_1, g_1h_2)\,\lambda(g_1, g_1h_2) \times \beta(g_1, g_2, g_1g_2(sh_2)). \]

In other words, $\psi = F(\sigma,\lambda,\beta)$, the image of $(\sigma,\lambda,\beta)$ under the chain map $F : \mathrm{tot}\, C^\bullet(G, C^\bullet((G/N \rtimes_{\bar\rho} \mathcal{G}), U(1))) \to C^\bullet(\mathcal{H}; U(1))$ of Eq. (5). So $\psi$ comes from extending a 2-cocycle from $(G/N \rtimes_{\bar\rho} \mathcal{G})$ to an equivariant 2-cocycle. Now $\chi = \iota^* \circ \psi$, where $\iota : N \rtimes^{\delta\rho} \mathcal{G} \to G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$ is as in Proposition (10.5), and the theorem follows immediately from the results in Sect. (10).

Remark 11.2. In the hypotheses of the above theorem it is not necessary to have the lift $\tilde\rho$. In the absence of such a lift, one simply replaces $N \rtimes^{\delta\rho} \mathcal{G}$ by $G \times_{G/N,\rho} \mathcal{G}$ as in Example (9).

When $G$ is abelian there is a canonical action of $\widehat{G}$ on $U(1) \rtimes^\psi \mathcal{H}$ (denoted $\hat\alpha$), given by the formula

\[ \phi \cdot (\theta, g, t, \gamma) = (\theta\langle\phi, g\rangle, g, t, \gamma), \qquad \phi \in \widehat{G},\ (\theta, g, t, \gamma) \in U(1) \rtimes^\psi \mathcal{H}. \tag{13} \]

This action corresponds to the natural $\widehat{G}$-action on a crossed product algebra $G \ltimes A$. As one expects, the above Morita equivalence takes this action to the same action, pulled back via $\iota$.

Proposition 11.3. In the notation of Theorem (11.1) let $G$ be an abelian group. Then the canonical $\widehat{G}$-action, $\hat\alpha$, on $U(1) \rtimes^\psi \mathcal{H}$ is transported under the Morita equivalence $(U(1) \rtimes^\psi \mathcal{H}) \sim (U(1) \rtimes^\chi \mathcal{K})$ to

\[ \iota^*(\hat\alpha)(\phi)(\theta, n, \gamma) := (\theta\langle\phi, n\rho(\gamma)\rangle, n, \gamma), \qquad (\theta, n, \gamma) \in U(1) \rtimes^\chi \mathcal{K}. \]

Thus with these actions the Morita equivalence of Theorem (11.1) becomes $\widehat{G}$-equivariant.

Proof. $U(1) \times P_\iota$ is the Morita $(U(1) \rtimes^\psi \mathcal{H})$-$(U(1) \rtimes^\chi \mathcal{K})$-bimodule here. For $\phi \in \widehat{G}$ and $(\theta, g, \gamma) \in U(1) \times P_\iota$, define an action by $\phi \cdot (\theta, g, \gamma) = (\theta\langle\phi, g\rangle, g, \gamma)$. Then this action, along with the actions $\hat\alpha$ and $\iota^*\hat\alpha$, determines an equivariant Morita equivalence (that is, Eq. (1) is satisfied for these actions).

12. Classical T-Duality

Let us start with the definition. For us classical T-duality will refer to a T-duality between a $U(1)$-gerbe on a generalized torus bundle and a $U(1)$-gerbe on a generalized principal dual-torus bundle. If instead of a torus bundle we start with a $G/N$-bundle, for $N$ a closed subgroup of an abelian locally compact group $G$, then we still call this classical T-duality since it is the same phenomenon. Thus “classical” means for us that the groups involved are abelian and furthermore that the dual object does not involve noncommutative geometry. We are now ready to fully describe the T-dualization procedure. There will be several remarks afterwards.

Classical T-duality. Suppose we are given the data of a generalized principal $G/N$-bundle and $G$-equivariant twisting 2-cocycle whose $C^{2,0}$-component $\beta \in C^2(G; C^0(G/N \rtimes_{\bar\rho} \mathcal{G}; U(1)))$ is trivial:

\[ (G/N \rtimes_{\bar\rho} \mathcal{G},\ (\sigma, \lambda, 1) \in Z^2_G(G/N \rtimes_{\bar\rho} \mathcal{G}; U(1))). \tag{14} \]



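Then this is the initial data for a classically T-dualizable bundle, and the following steps produce its T-dual.

Remark 12.1. Because $G$ is abelian the restriction of $\lambda : G \times G/N \times \mathcal{G}_1 \to U(1)$ to $N \times G/N \times \mathcal{G}_1$ does not depend on $G/N$. Indeed, for $n \in N$, $g \in G$, and $t \in G/N$,

\[ \lambda(n,t,\gamma) = \lambda(ng,t,\gamma)\,\lambda(g,t,\gamma)^{-1} = \lambda(gn,t,\gamma)\,\lambda(g,t,\gamma)^{-1} = \lambda(n, g^{-1}t, \gamma). \]

Furthermore, the cocycle conditions ensure that $\lambda|_{N \times G/N \times \mathcal{G}_1}$ is homomorphic in both $N$ and $\mathcal{G}$. We write $\bar\lambda$ for the induced homomorphism $\bar\lambda : \mathcal{G} \to \widehat{N}$.

Step 1. Pass from $(G/N \rtimes_{\bar\rho} \mathcal{G}, (\sigma,\lambda,1))$ to the crossed product groupoid $G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$, with twisting 2-cocycle $\psi := F(\sigma,\lambda,1)$ of Eq. (11), and with canonical $\widehat{G}$-action $\hat\alpha$ of Eq. (13):

\[ (G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G}),\ \psi,\ \hat\alpha). \]

Step 2. Choose a lift $\rho : \mathcal{G} \to G$ of $\bar\rho$ and pass from $G \ltimes (G/N \rtimes_{\bar\rho} \mathcal{G})$ to the Morita equivalent $N$-gerbe $N \rtimes^{\delta\rho} \mathcal{G}$ with twisting 2-cocycle $\iota^*\psi$ and $\widehat{G}$-action $\iota^*\hat\alpha$ of Proposition (11.3):

\[ (N \rtimes^{\delta\rho} \mathcal{G},\ \iota^*\psi,\ \iota^*\hat\alpha). \]

Step 3. Pass to the Pontryagin dual system. More precisely, pass to the twisted groupoid with $\widehat{G}$-action whose twisted groupoid algebra is $\widehat{G}$-equivariantly isomorphic to the algebra of Step 2, when the chosen isomorphism is Fourier transform in the $N$-direction. The result is an $\widehat{N}$-bundle, but not exactly of the form we have been considering. Here is the Pontryagin dual system:

• Groupoid: $\mathcal{K} := (\widehat{N} \times \mathcal{G}_1 \Rightarrow \widehat{N} \times \mathcal{G}_0)$.
• Source and range: $s(\phi,\gamma) := (\phi, s\gamma)$, $r(\phi,\gamma) := (\phi\bar\lambda(\gamma)^{-1}, r\gamma)$ for $(\phi,\gamma) \in \mathcal{K}_1$.
• Multiplication: $(\phi\bar\lambda(\gamma_2)^{-1}, \gamma_1)(\phi, \gamma_2) := (\phi, \gamma_1\gamma_2)$, where $\bar\lambda := \lambda|_{N \times \{1\} \times \mathcal{G}_1}$ (which is homomorphic in $N$), viewed as a map $\bar\lambda : \mathcal{G}_1 \to \widehat{N}$.
• Twisting: $\tau((\phi\bar\lambda(\gamma_2)^{-1}, \gamma_1), (\phi, \gamma_2)) := \sigma(e,\gamma_1,\gamma_2)\,\lambda(\rho(\gamma_1),\gamma_2)\,\langle\phi, \delta\rho(\gamma_1,\gamma_2)\rangle$.
• $\widehat{G}$-action: $\phi \cdot a(\phi',\gamma) := \langle\phi, \rho(\gamma)\rangle\,a(\phi^{-1}\phi', \gamma)$ for $\phi \in \widehat{G}$ and $a \in C_c(\mathcal{K}_1)$.

For aesthetic reasons, it is preferable to put this data in the same form as Eq. (14). To do this first note that $\mathcal{K}$ is isomorphic to $\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}$ via:

\[ \widehat{N} \rtimes_{\bar\lambda} \mathcal{G} \xrightarrow{\ \sim\ } \mathcal{K}, \qquad (\phi,\gamma) \mapsto (\phi\bar\lambda(\gamma), \gamma). \]

Using this isomorphism to import the twisting and $\widehat{G}$-action on $\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}$ from $\mathcal{K}$ determines that $\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}$ must have:

• Twisting: $\sigma^\vee(\phi,\gamma_1,\gamma_2) := \sigma(e,\gamma_1,\gamma_2)\,\lambda(\rho(\gamma_1),\gamma_2)\,\langle\phi\bar\lambda(\gamma_1)\bar\lambda(\gamma_2), \delta\rho(\gamma_1,\gamma_2)\rangle$.
• $\widehat{G}$-action: $\phi \cdot a(\phi',\gamma) := \langle\phi, \rho(\gamma)\rangle\,a(\phi^{-1}\phi', \gamma)$ for $\phi \in \widehat{G}$ and $a \in C_c(\widehat{N} \rtimes_{\bar\lambda} \mathcal{G})$.

But this is again classical T-duality data, indeed it is what we write as:

\[ (\widehat{N} \rtimes_{\bar\lambda} \mathcal{G},\ (\sigma^\vee, \rho, 1) \in Z^2_{\widehat{G}}(\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}; U(1))). \]

Thus the classical T-dual pair is

\[ (G/N \rtimes_{\bar\rho} \mathcal{G}, (\sigma,\lambda,1)) \longleftrightarrow (\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}, (\sigma^\vee, \rho, 1)). \tag{15} \]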


Some properties of the duality.

(A) Taking $C^*$-algebras everywhere, the dualizing process becomes:

\[ C^*(G/N \rtimes_{\bar\rho} \mathcal{G}; \sigma) \rightsquigarrow G \ltimes_\lambda C^*(G/N \rtimes_{\bar\rho} \mathcal{G}; \sigma) \overset{\text{Morita}}{\sim} C^*(N \rtimes^{\delta\rho} \mathcal{G}; \iota^*\psi) \overset{\text{iso}}{\simeq} C^*(\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}; \sigma^\vee). \]

All algebras to the right of the “$\rightsquigarrow$” have canonically $\widehat{G}$-equivariantly isomorphic spectra. For the passage from Step (1) to Step (2) this follows because it is a Morita equivalence of twisted groupoids by Theorem (11.1), and by Proposition (11.3) this Morita equivalence is $\widehat{G}$-equivariant. For the passage from Step (2) to Step (3) it follows because Pontryagin dualization induces an equivariant isomorphism of twisted groupoid algebras with $\widehat{G}$-action (Theorem (8.2)).

(B) The passages (1)→(2) and (2)→(3) induce isomorphisms in $K$-theory for any $G$, since $K$-theory is invariant under Morita equivalence and isomorphism of $C^*$-algebras. Thus whenever $G$ satisfies the Connes-Thom isomorphism (i.e. $G$ satisfies $K(A) \simeq K(G \ltimes A)$ for every $G$-$C^*$-algebra $A$) the duality of (15) incorporates an isomorphism of twisted $K$-theory:

\[ K^\bullet(G/N \rtimes_{\bar\rho} \mathcal{G}, \sigma) \simeq K^{\bullet + \dim G}(\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}, \sigma^\vee). \]

The class of groups satisfying the Connes-Thom isomorphism includes the (finite dimensional) 1-connected solvable Lie groups.

(C) When $\mathcal{G}$ is a Čech groupoid for a space $X$, this duality can be viewed as a duality

\[ (P \to X,\ [(\sigma,\lambda,1)] \in H^2_G(P; U(1))) \longleftrightarrow (P^\vee \to X,\ [(\sigma^\vee,\rho,1)] \in H^2_{\widehat{G}}(P^\vee; U(1))), \]

where $P$ is a principal $G/N$-bundle and $P^\vee$ is a principal $\widehat{N}$-bundle. Indeed, the pair $(P, [(\sigma,\lambda)])$ are the spectrum (with its $G/N$-action) and $G$-equivariant Dixmier-Douady invariant, respectively, of the $G$-algebra $C^*(G/N \rtimes_{\bar\rho} \mathcal{G}; \sigma)$, and it is enough to show that the spectrum and equivariant Dixmier-Douady invariant of the dual, $C^*(\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}, \sigma^\vee)$, are independent of the groupoid presentation of $(P, [(\sigma,\lambda)])$. But since every procedure to the right of the “$\rightsquigarrow$” is an equivariant Morita equivalence of $C^*$-algebras, the result will follow as long as two different presentations give rise to objects which are equivariantly Morita equivalent at Step (1). But if $(G/N \times_{\bar\rho} \mathcal{G}, (\sigma,\lambda,1))$ and $(G/N \times_{\bar\rho'} \mathcal{G}', (\sigma',\lambda',1))$ are $G$-equivariantly Morita equivalent groupoids with cohomologous cocycles then the groupoids $U(1) \rtimes^\psi (G \ltimes G/N \times_{\bar\rho} \mathcal{G})$ and $U(1) \rtimes^{\psi'} (G \ltimes G/N \times_{\bar\rho'} \mathcal{G}')$ are Morita equivalent as $\widehat{G}$-groupoids, and from this the result follows immediately.

13. Nonabelian Takai Duality

In describing classical T-duality it was crucial that the group $G$ be abelian because the dual side was viewed as a $\widehat{G}$-space, and the procedure of T-dualizing was run in reverse by taking a crossed product by the $\widehat{G}$-action. Since our goal is to describe an analogue of T-duality that is valid for bundles of nonabelian groups, we need a method of returning from the dual side to the original side that does not involve a Pontryagin dual group. A solution is provided by what we call the nonabelian Takai duality for groupoids. In this section we first review classical Takai duality, then we describe the nonabelian version. The nonabelian Takai duality is constructed so that when applied to abelian groups it reduces to what is essentially the Pontryagin dual of classical Takai duality.



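Recall that Takai duality for abelian groups is the passage

$$(G\text{-}C^*\text{-algebras}) \longleftrightarrow (\widehat{G}\text{-}C^*\text{-algebras}), \qquad (\alpha, A) \longmapsto (\hat\alpha, G \ltimes_\alpha A),$$

where $\hat\alpha$ is the canonical $\widehat{G}$-action $\hat\alpha_\phi \cdot a(g) := \phi(g)a(g)$ for $\phi \in \widehat{G}$, $g \in G$ and $a \in G \ltimes_\alpha A$. This is a duality in the sense that a second application produces a $G$-algebra $(\hat{\hat\alpha}, \widehat{G} \ltimes_{\hat\alpha} G \ltimes_\alpha A)$, which is $G$-equivariantly Morita equivalent to the original $(\alpha, A)$.

Now let $(\alpha, \mathcal{H})$ be a $G$-groupoid. Takai duality applied (twice) to the associated groupoid algebra is the passage

$$(\alpha, C^*(\mathcal{H})) \longmapsto (\hat\alpha, C^*(G \ltimes_\alpha \mathcal{H})) \longmapsto (\hat{\hat\alpha}, \widehat{G} \ltimes_{\hat\alpha} C^*(G \ltimes_\alpha \mathcal{H})).$$

Comparing multiplications, one sees that the last algebra is identical to the groupoid algebra of the twisted groupoid $(\widehat{G} \ltimes_{\mathrm{triv}} (G \ltimes_\alpha \mathcal{H}), \chi)$ where

$$\chi((\phi_1, g_1, \gamma_1), (\phi_2, g_2, \gamma_2)) := \langle \phi_1, g_2 \rangle$$

and (triv) denotes the trivial action of $\widehat{G}$. Note that this duality cannot be expressed purely in terms of groupoids since the dual group $\widehat{G}$ only acts on the groupoid algebra. However, taking the Fourier transform in the $\widehat{G}$-direction determines a Pontryagin duality between $(\widehat{G} \ltimes_{\mathrm{triv}} G \ltimes_\alpha \mathcal{H}, \chi)$ and the untwisted groupoid $\mathcal{G} := (G \times G \times \mathcal{H}_1 \rightrightarrows G \times \mathcal{H}_0)$ whose source, range and multiplication are

1. $s(g, h, \gamma) := (g, h^{-1}s\gamma)$, $r(g, h, \gamma) := (gh^{-1}, r\gamma)$,
2. $(gh_2^{-1}, h_1, \gamma_1) \circ (g, h_2, h_1^{-1}\gamma_2) := (g, h_1h_2, \gamma_1\gamma_2)$,

for $g$'s, $h$'s $\in G$ and $\gamma$'s $\in \mathcal{H}$. This groupoid has the natural left translation action of $G$ for which it is equivariantly isomorphic to the generalized $G$-bundle $G \rtimes_q (G \ltimes_\alpha \mathcal{H})$, where $q$ is the quotient homomorphism $q : G \ltimes_\alpha \mathcal{H} \to (G \rightrightarrows *)$. The map is given by

$$\mathcal{G} \longrightarrow G \rtimes_q (G \ltimes_\alpha \mathcal{H}), \qquad (g, h, \gamma) \longmapsto (gh^{-1}, h, \gamma).$$

As was mentioned in the construction of generalized bundles, a generalized bundle $G \rtimes_\rho \mathcal{G}$ can be described as the induced groupoid of the right $\mathcal{G}$-module $P := G \times \mathcal{G}_0$ with the obvious structure. In the current situation the right $(G \ltimes_\alpha \mathcal{H})$-module is

$$P := (G \times \mathcal{H}_0 \xrightarrow{\ b\ } \mathcal{H}_0),$$

where the moment map $b$ is just the projection and the right action is $(g, r\gamma) \cdot (h, h\gamma) := (g\,q((h, h\gamma)), s\gamma) = (gh, s\gamma)$. For convenience we will write down this groupoid structure, $G \rtimes_q (G \ltimes_\alpha \mathcal{H}) \equiv P \rtimes_q (G \ltimes_\alpha \mathcal{H})$, with source, range and multiplication

1. $s(g, h, \gamma) := (gh, h^{-1}s\gamma)$, $r(g, h, \gamma) := (g, r\gamma)$,
2. $(g, h_1, \gamma_1) \circ (gh_1, h_2, h_1^{-1}\gamma_2) := (g, h_1h_2, \gamma_1\gamma_2)$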

for $g$'s, $h$'s $\in G$ and $\gamma$'s $\in \mathcal{H}$.

The content of the previous paragraph is that up to a Pontryagin duality, a Takai duality can be expressed purely in terms of groupoids. Given a $G$-groupoid $(\alpha, \mathcal{H})$, one forms the crossed product $G \ltimes_\alpha \mathcal{H}$. There is a canonical $\widehat{G}$-action on the groupoid algebra, but there is also a canonical right $(G \ltimes_\alpha \mathcal{H})$-module $P$, and the two canonical pieces of data are essentially the same. Now to express the duality, one passes to the induced groupoid $P \rtimes (G \ltimes_\alpha \mathcal{H})$. This induced groupoid is itself equipped with a natural $G$-action coming from the left translation of $G$ on $P$, and is $G$-equivariantly Morita equivalent to the original $G$-groupoid $(\alpha, \mathcal{H})$. Thus the $C^*$-algebra of this induced groupoid takes the place of $\widehat{G} \ltimes_{\hat\alpha} C^*(G \ltimes_\alpha \mathcal{H})$ (and is isomorphic to it via Fourier transform).

This duality $(\alpha, \mathcal{H}) \leftrightarrow (P, G \ltimes_\alpha \mathcal{H})$ expresses essentially the same phenomenon as Takai duality, but while Takai duality applies only to abelian groups, this formulation applies to arbitrary groups. Let us write this down formally.

Theorem 13.1 (Nonabelian Takai duality for groupoids). Let $G$ be a locally compact group and $(\alpha, \mathcal{H})$ a $G$-groupoid. Then

1. $G \ltimes_\alpha \mathcal{H}$ has a canonical right module $P := ((G \times \mathcal{H}_0) \to \mathcal{H}_0)$.
2. The induced groupoid $P \rtimes_q (G \ltimes_\alpha \mathcal{H})$ is naturally a generalized principal $G$-bundle.
3. The $G$-groupoids $(\alpha, \mathcal{H})$ and $(\tau, P \rtimes_q (G \ltimes_\alpha \mathcal{H}))$ are equivariantly Morita equivalent, where $\tau$ denotes the left principal $G$-bundle action.

Proof. The first statement has already been explained, and the second is just the fact that $P \rtimes (G \ltimes_\alpha \mathcal{H}) \equiv G \rtimes_q (G \ltimes_\alpha \mathcal{H})$ and the latter groupoid is manifestly a $G$-bundle. For the third statement, note that the inclusion

$$\mathcal{H} \simeq \{1\} \times \{1\} \times \mathcal{H} \xrightarrow{\ \phi\ } (G \rtimes_q (G \ltimes_\alpha \mathcal{H}))$$

is an essential equivalence. After identifying the equivalence bimodule $P_\phi$ with the space $\{(g, g^{-1}, \gamma) \mid g \in G, \gamma \in \mathcal{H}_1\}$, one verifies easily that the following $G$-actions make this an equivariant Morita equivalence: for $(h, k, \gamma) \in (G \rtimes_q (G \ltimes_\alpha \mathcal{H}))$, $\gamma \in \mathcal{H}$, and $(h, h^{-1}, \gamma) \in P_\phi$, $g \in G$ acts by

$$g \cdot (h, k, \gamma) := (gh, k, \gamma); \qquad g \cdot (h, h^{-1}, \gamma) := (gh, (gh)^{-1}, \gamma); \qquad g \cdot (\gamma) := g\gamma.$$

Note that $\phi$ itself is not an equivariant map.
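As a consistency check on the groupoid structure of the generalized bundle used in this proof, the following short computation is a sketch (not part of the original argument) verifying that the stated source, range and multiplication fit together. It assumes only that $G$ acts on the groupoid by automorphisms, so that the action commutes with $s$ and $r$; the crossed product symbols are written as reconstructed here.

```latex
% Groupoid G \rtimes_q (G \ltimes_\alpha \mathcal{H}):
%   s(g,h,\gamma) = (gh, h^{-1} s\gamma),   r(g,h,\gamma) = (g, r\gamma).
\begin{align*}
s(g,h_1,\gamma_1) &= (gh_1,\ h_1^{-1}s\gamma_1), \\
r(gh_1,h_2,h_1^{-1}\gamma_2) &= (gh_1,\ r(h_1^{-1}\gamma_2)) = (gh_1,\ h_1^{-1}r\gamma_2),
\end{align*}
% so the pair is composable exactly when s\gamma_1 = r\gamma_2.
% The product (g, h_1h_2, \gamma_1\gamma_2) then has the expected range and source:
\begin{align*}
r(g,h_1h_2,\gamma_1\gamma_2) &= (g,\ r\gamma_1) = r(g,h_1,\gamma_1), \\
s(g,h_1h_2,\gamma_1\gamma_2) &= (gh_1h_2,\ h_2^{-1}h_1^{-1}s\gamma_2)
                              = s(gh_1,h_2,h_1^{-1}\gamma_2).
\end{align*}
```

The same bookkeeping shows that the map $(g, h, \gamma) \mapsto (gh^{-1}, h, \gamma)$ intertwines the source and range maps of the two presentations of this groupoid.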

For our intended application to T-duality, it will be necessary to consider a nonabelian Takai duality for twisted groupoids, which we will prove here. In this context it is not possible to make a statement of equivariant Morita equivalence because in general the twisted groupoid does not admit a $G$-action. Instead of $G$-equivariance, there is a Morita equivalence that is compatible with “extending” to a twisted crossed product groupoid.

Theorem 13.2 (Nonabelian Takai duality for twisted groupoids). Let $G$ be a group and $(\alpha, \mathcal{H})$ a $G$-groupoid. Suppose we are given a 2-cocycle $\chi \in Z^2(G \ltimes_\alpha \mathcal{H}; U(1))$. Define $\sigma := \chi|_{\mathcal{H}} \in Z^2(\mathcal{H}, U(1))$. Then $\chi$ can be viewed as a 2-cocycle on $(G \rtimes_q (G \ltimes_\alpha \mathcal{H}))$ which is constant in the first $G$-variable, or as a 2-cocycle on $G \ltimes_\tau (G \rtimes_q (G \ltimes_\alpha \mathcal{H}))$ which is constant in the first two $G$-variables, and we have:



1. The Morita equivalence $(G \rtimes_q (G \ltimes_\alpha \mathcal{H})) \sim \mathcal{H}$ of Theorem (13.1) extends to a Morita equivalence of $U(1)$-gerbes: $U(1) \rtimes^{\chi} (G \rtimes_q (G \ltimes_\alpha \mathcal{H})) \sim U(1) \rtimes^{\sigma} \mathcal{H}$.
2. The Morita equivalence $G \ltimes_\tau (G \rtimes_q (G \ltimes_\alpha \mathcal{H})) \sim G \ltimes_\alpha \mathcal{H}$ induced from $G$-equivariance extends to a Morita equivalence of $U(1)$-gerbes: $U(1) \rtimes^{\chi} (G \ltimes_\tau (G \rtimes_q (G \ltimes_\alpha \mathcal{H}))) \sim U(1) \rtimes^{\chi} (G \ltimes_\alpha \mathcal{H})$.
3. The first equivalence is a subequivalence of the second.

Proof. The first statement follows just as in the proof of the Morita equivalence of $G \rtimes_q (G \ltimes_\alpha \mathcal{H})$ and $\mathcal{H}$; this time the bimodule is $U(1) \times P_\phi$ (using the notation from the proof of Theorem (13.1)). For the second statement, note that $G \times P_\phi$ is the $G \ltimes_\tau (G \rtimes_q (G \ltimes_\alpha \mathcal{H}))$-$G \ltimes_\alpha \mathcal{H}$-bimodule, the bimodule structure being given as in Example (2). Then $U(1) \times G \times P_\phi$ is the desired Morita bimodule. For the third statement, note that restricting to $U(1) \times \{1\} \times P_\phi \subset U(1) \times G \times P_\phi$ recovers the Morita equivalence of the first statement as a subequivalence of the second.

14. Nonabelian Noncommutative T-Duality

Remembering our convention to call a group nonabelian when it is not commutative and reserve the word noncommutative for a noncommutative space in the sense of noncommutative geometry, we define the following extensions to T-duality.

Definition 14.1. Let $N$ be a closed normal subgroup of a locally compact group $G$. There is a canonical equivalence between the data contained in:

• a $G$-equivariant $U(1)$-gerbe on a generalized $G/N$-bundle, and
• a $U(1)$-gerbe on an $N$-gerbe, with canonical right module.

The interpolation between the two objects is described below, and will be called nonabelian noncommutative T-duality whenever the interpolating procedure induces an isomorphism in $K$-theory.

Remark 14.2. Classical T-duality was a duality between gerbes on generalized principal bundles. More precisely, for abelian groups $G$, the duality associated to a $U(1)$-extension of a groupoid of the form $G/N \rtimes_\rho \mathcal{G}$ a new $U(1)$-extension of a groupoid of the form $\widehat{N} \rtimes_{\bar\lambda} \mathcal{G}$.

Now we will instead use groupoids of the form $G \rtimes N \rtimes^{\delta\rho} \mathcal{G}$ or more generally $G \rtimes G \times_{G/N,\rho} \mathcal{G}$, defined in Example (10). These are equivariantly Morita equivalent to the old kind. The reason for this change is that while every gerbe on a classically T-dualizable pair of bundles can be presented by an equivariant 2-cocycle on a groupoid of the form $G/N \rtimes_\rho \mathcal{G}$, this is not the case in general. On the other hand, the following fact will be proved in Sect. (15):

Every $G$-equivariant stack theoretic gerbe on a generalized principal $G/N$-bundle admits a presentation as a $U(1)$-extension of a groupoid of the form $G \rtimes G \times_{G/N,\rho} \mathcal{G}$.


This is not necessarily obvious. In fact without the $G$-equivariance condition, not every gerbe on a generalized $G/N$-bundle admits such a presentation; it is the $G$-equivariance that forces this.

Procedure for nonabelian T-dualization. The initial data for nonabelian T-duality will be a $G/N$-bundle with equivariant 2-cocycle:

$$(G \rtimes N \rtimes^{\delta\rho} \mathcal{G},\ (\sigma, \lambda, \beta) \in Z_G^2(G \rtimes N \rtimes^{\delta\rho} \mathcal{G}; U(1))). \qquad (16)$$

The nonabelian T-dual is obtained in the following two steps:

Step 1. Pass to the crossed product groupoid, together with its canonical right module $P$ of Theorem (13.1) and the 2-cocycle $\psi$ which is the image of $(\sigma, \lambda, \beta)$ in $Z^2(G \ltimes G \rtimes N \rtimes^{\delta\rho} \mathcal{G}; U(1))$:

$$(G \ltimes (G \rtimes N \rtimes^{\delta\rho} \mathcal{G}), \psi, P).$$

Step 2. Pass to the Morita equivalent system $(N \rtimes^{\delta\rho} \mathcal{G}; \iota^*\psi, \iota^*P)$, where $\iota$ is the essential equivalence (see Proposition (10.5)):

$$\iota : (N \rtimes^{\delta\rho} \mathcal{G}) \to G \ltimes (G \rtimes N \rtimes^{\delta\rho} \mathcal{G}), \qquad (n, \gamma) \mapsto (n\rho(\gamma), e, n, \gamma).$$

Some properties of the duality.

(A) If $\rho : \mathcal{G} \to G/N$ does not admit a lift to a map to $G$, then the same T-dualization procedure works with $N \rtimes^{\delta\rho} \mathcal{G}$ replaced by $G \times_{G/N,\rho} \mathcal{G}$ (see Example (9)).

(B) What we are describing is indeed a duality, in the sense that we can recover the initial system (16) by inducing the groupoid via its canonical module $P$. This fact is the content of twisted nonabelian Takai duality for groupoids, Theorem (13.2).

(C) There is an isomorphism of twisted $K$-theory

$$K^{\bullet}(G \rtimes N \rtimes^{\delta\rho} \mathcal{G}; \sigma) \simeq K^{\bullet+\dim G}(G \ltimes (G \rtimes N \rtimes^{\delta\rho} \mathcal{G}), \psi) \simeq K^{\bullet+\dim G}(N \rtimes^{\delta\rho} \mathcal{G}; \iota^*\psi)$$

whenever $G$ satisfies the Connes-Thom isomorphism theorem, in particular whenever $G$ is a (finite dimensional) 1-connected solvable Lie group.

(D) This construction is Morita invariant in the appropriate sense. That is, if we choose two representatives for the same generalized principal bundle with equivariant gerbe, the two resulting dualized objects present the same $N$-gerbe with equivariant gerbe. This is proved in Sect. (15).

(E) The dual can be interpreted as a family of noncommutative groups. Indeed, suppose that in the above situation $\mathcal{G}$ is a Čech groupoid of a locally finite cover of a space $X$. Let us look at the fiber of the dual over a point $m \in X$. This fiber corresponds to $N \rtimes^{\delta\rho} (\mathcal{G}|_m)$, where $\mathcal{G}|_m$ is the restriction of $\mathcal{G}$ to the chosen point. $\mathcal{G}|_m$ is a pair groupoid which is a finite set of points (there is one arrow for each double intersection $U_i \cap U_j$ that contains $m$ and one object for each element of the cover that contains $m$). Any inclusion of the trivial groupoid $(* \rightrightarrows *) \to \mathcal{G}|_m$ induces an essential equivalence $\phi : (N \rightrightarrows *) \to N \rtimes^{\delta\rho} (\mathcal{G}|_m)$ which induces an isomorphism $\phi^*$ in cohomology by Corollary (10.4). So the twisted groupoids $((N \rtimes^{\delta\rho} \mathcal{G})|_m, (\iota^*\psi)|_m)$ and $(N \rightrightarrows *, \phi^*(\iota^*\psi|_m))$ are equivalent. In particular, $C^*(N \rtimes^{\delta\rho} \mathcal{G}|_m; \iota^*\psi)$ is Morita equivalent to $C^*(N; \phi^*(\iota^*\psi))$, and the latter is a standard presentation of a noncommutative



(and nonabelian, if desired) dual group! For example if $N \simeq \mathbb{Z}^n$ we get noncommutative $n$-dimensional tori. So the T-duality applied to $G/N$-bundles produces $N$-gerbes that are fibred in what should be interpreted as noncommutative versions of the dual group $(G/N)^{\vee}$!

15. The Equivariant Brauer Group

In this section we will describe the elements of the $G$-equivariant Brauer group of a principal “$G/N$-stack”. First recall that the Brauer group $\mathrm{Br}(X)$ of a space $X$ is the set of isomorphism classes of stable separable continuous trace $C^*$-algebras with spectrum $X$. The famous Dixmier-Douady classification says that each such algebra is isomorphic to the algebra $\Gamma_0(X; E)$ of sections that vanish at infinity of a bundle $E$ of compact operators. Since such bundles can be described by transition functions with values in $\operatorname{Aut}\mathcal{K} \simeq PU(\mathfrak{h})$, there is an isomorphism $H^1(X; \operatorname{Aut}\mathcal{K}) \simeq \mathrm{Br}(X)$. Since $H^1(X; \operatorname{Aut}\mathcal{K}) \simeq H^2(X; U(1))$, $\mathrm{Br}(X)$ can also be taken to classify $U(1)$-gerbes on $X$.

If $X$ is a $G$-space, one can talk about the equivariant Brauer group $\mathrm{Br}_G(X)$, which can be most simply defined as the equivalence classes of $G$-equivariant bundles of compact operators under the equivalence of isomorphism and outer equivalence of actions. We intend to show that this group also corresponds to $G$-equivariant gerbes on $X$.

Generalizing the case of spaces $X$, one can consider the Brauer group of a presentable topological stack $\mathcal{X}$ (that is, a stack $\mathcal{X}$ which is equivalent to $\mathrm{Prin}_{\mathcal{G}}$ for some groupoid $\mathcal{G}$, as in Sect. (7)). So let $\mathcal{X}$ be a presentable topological stack. Then a vector bundle on $\mathcal{X}$ is a vector bundle on a groupoid $\mathcal{G}$ presenting $\mathcal{X}$. A vector bundle on a groupoid $\mathcal{G}$ is a (left) $\mathcal{G}$-module $E \to \mathcal{G}_0$ which is a vector bundle on $\mathcal{G}_0$ and such that $\mathcal{G}$ acts linearly (in the sense that the action morphism $s^*E \to r^*E$ is a morphism of vector bundles over the space $\mathcal{G}_1$). An $\mathcal{H}$-$\mathcal{G}$-Morita equivalence bimodule $P$ makes $P * E \to \mathcal{H}_0$ a vector bundle on $\mathcal{H}$.

In exactly the same way, one has the notion of a bundle of algebras on a stack; in particular one has bundles of compact operators on a stack. Finally, the stack with $G/N$-action associated to a generalized principal bundle $G/N \rtimes_\rho \mathcal{G}$ is what should be called a principal $G/N$-stack. This data can be presented in at least two ways, for example as a stack over $\mathrm{Prin}_{G/N}$, $\mathrm{Prin}_{\mathcal{G}} \to \mathrm{Prin}_{G/N}$, or as a stack over $\mathrm{Prin}_{\mathcal{G}}$, $\mathrm{Prin}_{G/N \rtimes_\rho \mathcal{G}} \to \mathrm{Prin}_{\mathcal{G}}$.

Now define the $G$-equivariant Brauer group $\mathrm{Br}_G(\mathrm{Prin}_{G/N \rtimes_\rho \mathcal{G}})$ to be the isomorphism classes of $G$-equivariant bundles of compact operators on $\mathrm{Prin}_{G/N \rtimes_\rho \mathcal{G}}$. We will show that whenever $\mathcal{G}_0$ is contractible to a set of points, $\mathrm{Br}_G(\mathrm{Prin}_{G/N \rtimes_\rho \mathcal{G}}) \simeq H^1(G \times_{G/N} \mathcal{G}; \operatorname{Aut}\mathcal{K})$. Of course if $\mathcal{G}_0$ isn't contractible we can refine $\mathcal{G}$ so that it is, thus we will always make that assumption.

So suppose $E$ is a bundle on $\mathrm{Prin}_{G/N \rtimes_\rho \mathcal{G}}$. It may initially be presented as a module over any groupoid $\mathcal{H}$ which is equivariantly Morita equivalent to $(G/N \rtimes_\rho \mathcal{G})$; for example if $\mathcal{G}$ is a Čech groupoid one could imagine that $E$ is a bundle on the actual space $Q \simeq (G/N \times \mathcal{G}_0)/(G/N \rtimes_\rho \mathcal{G})$. We would like $E$ to be presented on the


groupoid $G \rtimes (G \times_{G/N,\rho} \mathcal{G})$ (as in Example (10)), so choose a Morita equivalence $((G \rtimes G \times_{G/N,\rho} \mathcal{G})$-$\mathcal{H})$-bimodule $P$, and replace $E$ by $\tilde{E} := P * E$. Remember that no data is lost here because $P^{op} * P * E \simeq E$. Just for the sake of not having too many $G$'s, assume that $\rho$ admits a lift to $G$ so that there is an isomorphism $(G \rtimes G \times_{G/N,\rho} \mathcal{G}) \simeq (G \rtimes N \rtimes^{\delta\rho} \mathcal{G})$ as in Example (10). The general case when there is no lift works in the exact same way.

If $E$ is $G$-equivariant then so is $\tilde{E}$. Let us suppose this is the case, meaning precisely that $\tilde{E}$ is a $(G \rtimes N \rtimes^{\delta\rho} \mathcal{G})$-module, and $\tilde{E}$ has a $G$-action which is equivariant with respect to the translation action of $G$ on $(G \rtimes N \rtimes^{\delta\rho} \mathcal{G})$. Keep in mind that in particular $\tilde{E}$ is a bundle over the objects $G \times \mathcal{G}_0$ of the groupoid. Then the restriction of $\tilde{E}$ to $\{e\} \times \mathcal{G}_0 \subset G \times \mathcal{G}_0$, denoted $E_0$, is trivializable since $\mathcal{G}_0$ is contractible. So assume that $E_0 = \mathcal{G}_0 \times \mathcal{K}$. But then the whole of $\tilde{E}$ is trivializable since it is a $G$-equivariant bundle over a space with free $G$-action. For example a trivialization is given by:

$$G \times E_0 \to \tilde{E}, \qquad (g, \xi) \mapsto g\xi.$$
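To see why this map is a trivialization, one can write down its inverse explicitly. The following is a sketch under the assumption that $G$ acts on the base $G \times \mathcal{G}_0$ by left translation in the first factor, so that $g^{-1}$ carries the fiber over $(g, x)$ to the fiber over $(e, x)$; the name $\Phi$ is introduced here only for illustration.

```latex
% \Phi : G \times E_0 \to \tilde{E}, \quad \Phi(g,\xi) = g\xi.
% For \eta in the fiber \tilde{E}_{(g,x)} we have g^{-1}\eta \in \tilde{E}_{(e,x)} = (E_0)_x, so
\begin{align*}
\Phi^{-1}(\eta) &= (g,\ g^{-1}\eta), \\
\Phi(\Phi^{-1}(\eta)) &= g\,(g^{-1}\eta) = \eta, \qquad
\Phi^{-1}(\Phi(g,\xi)) = (g,\ g^{-1}g\,\xi) = (g,\xi).
\end{align*}
% Equivariance for the translation action g'\cdot(g,\xi) = (g'g,\xi):
\begin{equation*}
\Phi(g'g,\xi) = (g'g)\,\xi = g'\cdot\Phi(g,\xi).
\end{equation*}
```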

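So assume that $\tilde{E} = G \times \mathcal{G}_0 \times \mathcal{K}$. Note that being a $G$-equivariant $(G \rtimes N \rtimes^{\delta\rho} \mathcal{G})$-module is the same as being a $G \ltimes (G \rtimes (N \rtimes^{\delta\rho} \mathcal{G}))$-module. Since $\tilde{E}$ is trivial as a bundle, it is classified by the action of $G \ltimes (G \rtimes N \rtimes^{\delta\rho} \mathcal{G})$, and this is given by a homomorphism $\pi : G \ltimes (G \rtimes N \rtimes^{\delta\rho} \mathcal{G}) \to \operatorname{Aut}\mathcal{K}$ such that the groupoid action is

$$(G \ltimes (G \rtimes N \rtimes^{\delta\rho} \mathcal{G})) \times_{G \times \mathcal{G}_0} \tilde{E} \longrightarrow \tilde{E},$$
$$((g_1, g_2, n, \gamma), (g_1^{-1}g_2 n\rho(\gamma), s\gamma, k)) \longmapsto (g_2, r\gamma, \pi(g_1, g_2, n, \gamma)k)$$

for $(g_1^{-1}g_2 n\rho(\gamma), s\gamma, k) \in \tilde{E}$. Two such actions, given by $\pi$ and $\pi'$, are outer equivalent if and only if $\pi$ and $\pi'$ are cohomologous, and this shows that $\mathrm{Br}_G(\mathrm{Prin}_{G/N \rtimes_\rho \mathcal{G}}) \simeq H^1(G \ltimes G \rtimes N \rtimes^{\delta\rho} \mathcal{G}; \operatorname{Aut}\mathcal{K})$. But the results of Sect. (10) imply that the inclusion

$$\iota : N \rtimes^{\delta\rho} \mathcal{G} \to G \ltimes G \rtimes N \rtimes^{\delta\rho} \mathcal{G}, \qquad (n, \gamma) \mapsto (n\rho(\gamma), e, n, \gamma)$$

induces a quasi-isomorphism with quasi-inverse the quotient map $q : G \ltimes G \rtimes N \rtimes^{\delta\rho} \mathcal{G} \to N \rtimes^{\delta\rho} \mathcal{G}$, therefore we have:

Proposition 15.1. Let $N$ be a closed normal subgroup of a locally compact group $G$, let $\mathcal{G}$ be a groupoid with $\mathcal{G}_0$ contractible, and let $\rho : \mathcal{G} \to G/N$ be a homomorphism. Then

$$\mathrm{Br}_G(\mathrm{Prin}_{G/N \rtimes_\rho \mathcal{G}}) \simeq H^1(G \ltimes G \rtimes N \rtimes^{\delta\rho} \mathcal{G}; \operatorname{Aut}\mathcal{K}) \simeq H^1(N \rtimes^{\delta\rho} \mathcal{G}; \operatorname{Aut}\mathcal{K}).$$

More generally, $N \rtimes^{\delta\rho} \mathcal{G}$ can be replaced by $G \times_{G/N,\rho} \mathcal{G}$.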
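The role of $\iota$ can be checked directly against the action formula given before the proposition. The computation below is a sketch: it uses the stated rule that the arrow $(g_1, g_2, n, \gamma)$ carries the fiber over $(g_1^{-1}g_2 n\rho(\gamma), s\gamma)$ to the fiber over $(g_2, r\gamma)$, and it assumes that the quotient map $q$ is the projection onto the last two coordinates, which is not spelled out in this section.

```latex
% Source and target of \iota(n,\gamma) = (n\rho(\gamma), e, n, \gamma):
\begin{align*}
\text{source: } & \big((n\rho(\gamma))^{-1}\, e\, n\rho(\gamma),\ s\gamma\big)
   = \big(\rho(\gamma)^{-1}n^{-1}n\,\rho(\gamma),\ s\gamma\big) = (e,\ s\gamma), \\
\text{target: } & (e,\ r\gamma).
\end{align*}
% Hence every arrow in the image of \iota maps \{e\}\times\mathcal{G}_0 to itself,
% so it acts on E_0 = \tilde{E}|_{\{e\}\times\mathcal{G}_0}.
% Assuming q(g_1,g_2,n,\gamma) = (n,\gamma):
\begin{equation*}
q(\iota(n,\gamma)) = q(n\rho(\gamma), e, n, \gamma) = (n,\gamma),
\qquad\text{i.e. } q\circ\iota = \mathrm{id}.
\end{equation*}
```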



The meaning of the isomorphism on the right is that $\pi$ can be taken constant in its two $G$-variables. In fact, one can construct such a $\pi$ directly; simply note that the whole situation is determined by the module structure of $E_0$, and that it is precisely $\iota(N \rtimes^{\delta\rho} \mathcal{G})$ that preserves this subspace. An example of this construction is carried out in the proof of Theorem (A.1).

Now let us explain the relationship between gerbes and bundles of compact operators. If $N$ is a discrete group then there is a connecting homomorphism which is an isomorphism, $H^1(N \rtimes^{\delta\rho} \mathcal{G}; \operatorname{Aut}\mathcal{K}) \simeq H^2(N \rtimes^{\delta\rho} \mathcal{G}; U(1))$. If $N$ is not discrete then there is the possibility that only a Borel connecting homomorphism can be chosen. Rather than tread into the territory of Borel cohomology for groupoids, we point out that for any groupoid $\mathcal{H}$, the $U(1)$-gerbe associated to $\pi \in Z^1(\mathcal{H}; \operatorname{Aut}\mathcal{K})$ is just $U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} \mathcal{H}$, the groupoid constructed in Example (9). It is a $U(1)$-gerbe on $\mathcal{H}$. To make that last fact more clear, note that as a space, the arrows of this groupoid form a possibly nontrivial $U(1)$-bundle over $\mathcal{H}_1$ and any global section of the bundle determines an isomorphism $U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} \mathcal{H} \simeq U(1) \rtimes^{\delta\pi} \mathcal{H}$, as was pointed out in Example (9), and the latter object is clearly a $U(1)$-gerbe.

At the $C^*$-algebra level there is also a relationship between gerbes and bundles of compact operators. It is shown in Lemma (A.5) that a global section of the $U(1)$-bundle $(U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} \mathcal{H}_1) \to \mathcal{H}_1$ induces a Morita equivalence

$$C^*(\mathcal{H}, \delta\pi) \overset{\text{Morita}}{\sim} \Gamma(\mathcal{H}; E(\pi)),$$

where $E(\pi)$ is the trivial bundle $\mathcal{H}_0 \times \mathcal{K}$ with $\mathcal{H}$-module structure given by $\pi$. Independent of the existence of a global section, there is a Morita equivalence

$$\Gamma(\mathcal{H}; \mathrm{Fund}(U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} \mathcal{H})) \overset{\text{Morita}}{\sim} \Gamma(\mathcal{H}; E(\pi)),$$

where $\mathrm{Fund}(U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} \mathcal{H})$ denotes the associated line bundle to the $U(1)$-bundle (viewed as a bundle of (rank 1) $C^*$-algebras) $(U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} \mathcal{H}_1) \to \mathcal{H}_1$.

So in this section we saw that nonabelian noncommutative T-duality as presented in Sect. (14) can be used to describe a dual to a $G$-equivariant gerbe presented on any groupoid $\mathcal{H}$ such that $\mathcal{H}$ describes a principal “$G/N$-stack”; this being because any such gerbe could also be presented on $(G \rtimes N \rtimes^{\delta\rho} \mathcal{G})$, as it was in Sect. (14). (Except for the situation in which $\rho$ and possibly $\pi$ do not admit lifts to $G$ or $U(\mathfrak{h})$ respectively, but we also saw how to modify the setup in these situations.) Finally, as promised:

Corollary 15.2. Nonabelian T-duality is Morita invariant.

Proof. This is easy because we may assume that any gerbe on a generalized principal bundle is presented on a groupoid of the form $G \rtimes N \rtimes^{\delta\rho} \mathcal{G}$ and that the gerbe is defined via a cocycle which is constant in $G$. But then if two such presentations are given, corresponding to $\pi \in Z^1(N \rtimes^{\delta\rho} \mathcal{G}; \operatorname{Aut}\mathcal{K})$ and $\pi' \in Z^1(N \rtimes^{\delta\rho'} \mathcal{G}'; \operatorname{Aut}\mathcal{K})$, it is clear that the gerbes $U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} G \rtimes N \rtimes^{\delta\rho} \mathcal{G}$ and $U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi'} G \rtimes N \rtimes^{\delta\rho'} \mathcal{G}'$ are $G$-equivariantly Morita equivalent if and only if $U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi} N \rtimes^{\delta\rho} \mathcal{G}$ and $U(\mathfrak{h}) \times_{\operatorname{Aut}\mathcal{K},\pi'} N \rtimes^{\delta\rho'} \mathcal{G}'$ are Morita equivalent.


16. Conclusion

A natural direction for future study here is to consider the case when both sides of the duality are fibred in noncommutative groups (in the sense of noncommutative geometry). Interestingly, a completely new phenomenon arises in this context: the gerbe data (that is, the 2-cocycles) become noncompactly supported distributions and it is necessary to multiply them. We have manufactured examples in which this can be done, namely when the singular supports of the distributions do not intersect, but our present methods do not provide a general method for describing a T-dual pair with both sides families of noncommutative groups.

Another direction to look is the case of groupoids for which the $G/N$-action is only free on a dense set. This corresponds to singularities in the fibers of a bundle of groups. The groupoid approach to T-duality seems well suited for this. On the other hand for some other types of singularities in fibers, notably singularities which destroy the possibility of a global $G/N$-action, the groupoid approach will not apply at all. It will be interesting to see if this problem can be fixed.

A third direction these methods can take is to consider complex structures on the groupoids and make the connection between topological T-duality and the T-duality of complex geometry (as in [DP]). We have initiated this project in [BD]. Lastly, it will of course be very nice to find some physically motivated examples of nonabelian T-duality.

A. Connection with the Mathai-Rosenberg Approach

The goal of this section is to describe the connection between our approach and the Mathai-Rosenberg approach to T-duality. Let us begin with a summary of the approach of Mathai and Rosenberg [MR]. One begins with the data of a principal torus bundle $P \to X$ over a space $X$ and a cohomology class $H \in H^3(P; \mathbb{Z})$ called the $H$-flux. The procedure for T-dualizing is as follows:

1. Pass from the data $(P, H)$ to a $C^*$-algebra $A(P, H)$. To do this one traces $H$ through the isomorphisms

$$H^3(P; \mathbb{Z}) \simeq \check{H}^2(P; U(1)) \simeq \check{H}^1(P; \operatorname{Aut}(\mathcal{K})) \qquad (17)$$

(here $\mathcal{K} = \mathcal{K}(\mathfrak{h})$ is the algebra of compact operators on a fixed separable Hilbert space $\mathfrak{h}$) to get an $\operatorname{Aut}(\mathcal{K})$-valued Čech 1-cocycle. This cocycle gives transition functions for a bundle of compact operators over $P$, and $A(P, H)$ is the $C^*$-algebra of continuous sections of this bundle which vanish at infinity. According to the Dixmier-Douady classification, one can recover the torus bundle $P$ and the $H$-flux from $A(P, H)$.

2. Next, writing the torus as a vector space modulo a full rank lattice, $T = V/\Lambda$, one tries to lift the action of $V$ on $P$ to an action of $V$ on $A(P, H)$. Assuming one exists, choose an action $\alpha : V \to \operatorname{Aut}(A(P, H))$ lifting the principal bundle action. If no action exists, the data is not T-dualizable.

3. Now the T-dual of $A = A(P, H)$ is simply the crossed product algebra $V \ltimes_\alpha A$ (or perhaps the T-dual is the spectrum and Dixmier-Douady invariant $(P^{\vee}, H^{\vee})$ of this algebra, if $V \ltimes_\alpha A$ is of continuous trace), and the problem of producing a T-dual object is reduced to understanding this crossed product algebra.



4. There are two scenarios for describing the crossed product, depending on whether a certain obstruction class, called the Mackey obstruction $M(\alpha)$, vanishes. When $M(\alpha) = 0$ the crossed product algebra is isomorphic to one of the form $A(P^{\vee}, H^{\vee})$ for $P^{\vee} \to X$ a principal dual-torus bundle and $H^{\vee} \in H^3(P^{\vee}, \mathbb{Z})$. The transition functions for $P^{\vee}$ are obtained from the so-called Phillips-Raeburn obstruction of the action and there exist explicit formulas for $H^{\vee}$ in terms of the data $(P, H, \alpha)$. This understanding of the crossed product is a result of work by Mackey, Packer, Phillips, Raeburn, Rosenberg, Wassermann, Williams and others (in the subject of crossed products of continuous trace algebras) which is referenced in [MR]. When $M(\alpha) \neq 0$, the crossed product algebra was shown in [MR] to be a continuous field of stable noncommutative (dual) tori over $X$.

We claim that the Mathai-Rosenberg setup corresponds to our approach applied to Čech groupoids and the groups $(G, N) = (V, \Lambda)$. More precisely, we have the following theorem.

Theorem A.1. Let $Q \to X$ be a principal torus bundle trivialized over a good cover of $X$, let $\mathcal{G}$ denote the Čech groupoid for this cover and let $\rho : \mathcal{G} \to V/\Lambda$ be transition functions presenting $Q$. Then

1. For any $H \in H^3(Q; \mathbb{Z})$ such that $A := A(Q; H)$ admits a $V$-action, there is a Morita equivalence

$$A \overset{\text{Morita}}{\sim} C^*(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma),$$

for some $\sigma \in Z^2(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; U(1))$ that is constant in $V$. If $V$ acts by translation on $C^*(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma)$ then the equivalence is $V$-equivariant.

2. $[\sigma]$ is the image of $H$ under the composite map

$$H^3(Q; \mathbb{Z}) \overset{\sim}{\longrightarrow} H^2(Q; U(1)) \longrightarrow H^2(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; U(1)).$$

3. Let $\sigma^{\vee} := \sigma|_{\Lambda \rtimes^{\delta\rho} \mathcal{G}} \in Z^2(\Lambda \rtimes^{\delta\rho} \mathcal{G}; U(1))$. Then for the chosen action of $V$, there is a $\widehat{V}$-equivariant Morita equivalence:

$$V \ltimes A \overset{\text{Morita}}{\sim} V \ltimes C^*(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma) \overset{\text{Morita}}{\sim} C^*(\Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma^{\vee}),$$

where $\widehat{V}$ acts by the canonical dual action on the left two algebras and on the rightmost algebra by $\phi \cdot a(\lambda, \gamma) := \langle \phi, \lambda\rho(\gamma) \rangle a(\lambda, \gamma)$ for $\phi \in \widehat{V}$ and $(\lambda, \gamma) \in \Lambda \rtimes^{\delta\rho} \mathcal{G}$.

The theorem will follow from the next two lemmas concerning bundles of $C^*$-algebras.

Definition A.2. Let $\mathcal{G}$ be a groupoid. A (left) $\mathcal{G}$-$C^*$-algebra $A$ is a bundle of $C^*$-algebras $A \to \mathcal{G}_0$ which is a (left) $\mathcal{G}$-module, such that $\mathcal{G}$ acts by $C^*$-algebra isomorphisms. The groupoid algebra of sections of $A$, written $\Gamma^*(\mathcal{G}; r^*A)$, is the $C^*$-completion of $\Gamma_c(\mathcal{G}; r^*A)$ (= the compactly supported sections of the pullback bundle $r^*A$) with multiplication and involution given by

$$ab(g) := \int_{g_1 g_2 = g} a(g_1)\, g_1 \cdot (b(g_2)), \qquad a^*(g) := g \cdot (a(g^{-1})^*)$$

for $g$'s $\in \mathcal{G}$ and $a, b \in \Gamma_c(\mathcal{G}; r^*A)$, and where the last “$*$” on the right is the $C^*$-algebra involution in the fiber.
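The multiplication and involution of Definition A.2 can be tried out in the smallest nontrivial case. The sketch below is only an illustration under assumptions that are not in the text: the groupoid is the finite group $\mathbb{Z}/2$ (one object, so the integral is a finite sum), the fiber algebra is $\mathbb{C}^2$ with pointwise operations, and the generator acts by swapping coordinates. All function names are hypothetical helpers. It checks that the product is associative and that the involution reverses products.

```python
from itertools import product

# Toy instance (an assumption for illustration): the groupoid is the group
# Z/2 with one object, and the fiber algebra is C^2 with pointwise operations,
# on which the nontrivial element acts by swapping the two coordinates.
GROUP = (0, 1)

def act(g, v):
    """Action of g in Z/2 on v in C^2: the generator swaps coordinates."""
    return v if g == 0 else (v[1], v[0])

def times(v, w):
    return (v[0] * w[0], v[1] * w[1])

def plus(v, w):
    return (v[0] + w[0], v[1] + w[1])

def star(v):
    return (v[0].conjugate(), v[1].conjugate())

def conv(a, b):
    """(ab)(g) = sum over g1 g2 = g of a(g1) * g1.(b(g2))."""
    out = {g: (0j, 0j) for g in GROUP}
    for g1, g2 in product(GROUP, repeat=2):
        g = (g1 + g2) % 2
        out[g] = plus(out[g], times(a[g1], act(g1, b[g2])))
    return out

def invol(a):
    """a*(g) = g.(a(g^{-1})*); the inverse in Z/2 is (-g) mod 2."""
    return {g: act(g, star(a[(-g) % 2])) for g in GROUP}

# Gaussian-integer test data, so floating point arithmetic is exact here.
a = {0: (1 + 2j, 3 + 0j), 1: (0 + 1j, 2 - 1j)}
b = {0: (2 + 0j, 1 - 1j), 1: (1 + 1j, 0 - 3j)}
c = {0: (0 + 1j, 1 + 0j), 1: (2 + 2j, 1 + 1j)}

assoc_ok = conv(conv(a, b), c) == conv(a, conv(b, c))
star_ok = invol(conv(a, b)) == conv(invol(b), invol(a))
print(assoc_ok, star_ok)
```

Both checks rely only on the facts that the action is by algebra automorphisms and commutes with the fiberwise involution, which is what the definition requires of a $\mathcal{G}$-$C^*$-algebra.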


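Remark A.3. This definition is a synonym for the groupoid crossed product algebra $\mathcal{G} \ltimes \Gamma_0(\mathcal{G}_0; A)$, as defined in [Ren2].

Lemma A.4. Let $\mathcal{G}$ and $\mathcal{H}$ be groupoids and $(\mathcal{G}_0 \leftarrow P \xrightarrow{\ \rho\ } \mathcal{H}_0)$ a $(\mathcal{G}$-$\mathcal{H})$-Morita equivalence bimodule. Then for any $\mathcal{H}$-$C^*$-algebra $A \xrightarrow{\ \pi\ } \mathcal{H}_0$, there is a Morita equivalence

$$\Gamma^*(\mathcal{G}; r^*(P * A)) \overset{\text{Morita}}{\sim} \Gamma^*(\mathcal{H}; r^*A)$$

where, as in Sect. (3), $P * A := (P \times_{\rho,\mathcal{H}_0,\pi} A)/\mathcal{H} = (P \times_{\rho,\mathcal{H}_0,\pi} A)/((p, a) \sim (ph^{-1}, h \cdot a))$.

Proof. We will construct a Morita equivalence bimodule for essential equivalences. If this case is true then the lemma is true because any Morita equivalence factors into essential equivalences, and if $\mathcal{G} \xleftarrow{\ \phi_1\ } \mathcal{K} \xrightarrow{\ \phi_2\ } \mathcal{H}$ is such a factorization then, setting $P = P_{\phi_1}^{op} * P_{\phi_2}$, we will have

$$\Gamma^*(\mathcal{H}; r^*A) \overset{\text{Morita}}{\sim} \Gamma^*(\mathcal{K}; r^*(P_{\phi_2} * A)) \overset{\text{iso}}{\sim} \Gamma^*(\mathcal{K}; r^*P_{\phi_1} * (P * A)) \overset{\text{Morita}}{\sim} \Gamma^*(\mathcal{G}; r^*(P * A)).$$

So in the notation of the statement of the lemma, assume $P = P_\phi := \mathcal{G}_0 \times_{\phi,\mathcal{H}_0,r} \mathcal{H}_1$ for $\phi : \mathcal{G} \to \mathcal{H}$ an essential equivalence. Then the left moment map is the projection $P_\phi \to \mathcal{G}_0$ and $\rho$ is the projection $\pi_{\mathcal{H}} : P_\phi \to \mathcal{H}_1$ followed by the source map $s : \mathcal{H}_1 \to \mathcal{H}_0$. We use the isomorphism $P_\phi * A \simeq \phi^*A := \mathcal{G}_0 \times_{\phi,\mathcal{H}_0,\pi} A$.

Now we will construct a Morita equivalence bimodule. Set $A_c := \Gamma_c(\mathcal{G}; r^*(\phi^*A))$ and $B_c := \Gamma_c(\mathcal{H}; r^*A)$. The following structure defines an $A_c$-$B_c$-pre-Morita equivalence bimodule structure on $X_c$, the compactly supported sections over $P$ of the pullback of $\phi^*A$ along the projection $P \to \mathcal{G}_0$, which after completion produces the desired Morita equivalence. Fix the notation: $g$'s $\in \mathcal{G}$, $h$'s $\in \mathcal{H}$, $p$'s $\in P$, $a$'s $\in A_c$, $x$'s $\in X_c$, and $b$'s $\in B_c$, and as usual integration is with respect to the fixed Haar system.

The left pre-Hilbert module structure is given by the following data:

• Action: $ax(p) := \int_g a(g)\,\phi(g) \cdot (x(g^{-1}p))$.
• Inner product: ${}_A\langle x_1, x_2 \rangle(g) := \int_p x_1(gp)\,\phi(g) \cdot x_2(p)^*$.

The right pre-Hilbert module structure is given by the following data:

• Action: $xb(p) := \int_h h \cdot (x(ph))\,\pi_{\mathcal{H}}(ph) \cdot b(h^{-1})$.
• Inner product: $\langle x_1, x_2 \rangle_B(h) := \int_p \pi_{\mathcal{H}}(p)^{-1} \cdot (x_1(p)^* x_2(ph))$.

Verification that this determines a Morita equivalence bimodule is routine.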



Lemma A.5. Let $\mathcal{G}$ be a groupoid, $\sigma : \mathcal{G}_2 \to U(1)$ a 2-cocycle, and $T : \mathcal{G}_1 \to U(\mathfrak{h})$ a continuous map satisfying

$$\sigma(\gamma_1, \gamma_2) = T(\gamma_1)T(\gamma_2)T(\gamma_1\gamma_2)^{-1} =: \delta T(\gamma_1, \gamma_2).$$

Let $A(\operatorname{ad} T)$ denote the $\mathcal{G}$-$C^*$-algebra $(\mathcal{G}_0 \times \mathcal{K}) \to \mathcal{G}_0$ with $\mathcal{G}$-action given by:

$$s^*A(\operatorname{ad} T) \longrightarrow r^*A(\operatorname{ad} T), \qquad g \cdot (sg, k) := (rg, \operatorname{ad} T(g)k).$$

Then

1. There is a Morita equivalence

$$\Gamma^*(\mathcal{G}; r^*A(\operatorname{ad} T)) \overset{\text{Morita}}{\sim} C^*(\mathcal{G}; \sigma).$$

2. When $\mathcal{G} = \check{\mathcal{G}}$ is a Čech groupoid on a good cover of a space $X$, every $\check{\mathcal{G}}$-$C^*$-algebra of compact operators is isomorphic to $A(\operatorname{ad} T)$ for some $T$ and

$$\Gamma_0(X; E(\operatorname{ad} T)) \overset{\text{Morita}}{\sim} C^*(\check{\mathcal{G}}; \sigma),$$

where $E(\operatorname{ad} T) := \mathcal{G}_0 \times \mathcal{K}/((s\gamma, k) \sim (r\gamma, \operatorname{ad} T(\gamma)k))$ is the bundle of compact operators whose transition functions are $\operatorname{ad} T : \mathcal{G} \to \operatorname{Aut}\mathcal{K}$.

Proof. The Morita equivalence in the second statement is proved as follows. First note that $E(\operatorname{ad} T) = (P_\phi)^{op} * A(\operatorname{ad} T)$, where $P_\phi$ is the bimodule of the essential equivalence $\mathcal{G} \xrightarrow{\ \phi\ } \mathcal{G}_0/\mathcal{G} \equiv X$, so an application of Lemma (A.4) implies that

$$\Gamma_0(X; E(\operatorname{ad} T)) \overset{\text{Morita}}{\sim} \Gamma^*(\check{\mathcal{G}}; r^*A(\operatorname{ad} T)).$$

Now apply the Morita equivalence of the first statement to finish. The other part of statement (2), that all bundles of compact operators on $X$ are of the form $E(\operatorname{ad} T)$ for some $T$, follows because the connecting homomorphism in nonabelian Čech cohomology, $H^1(X; \operatorname{Aut}\mathcal{K}) \to H^2(X; U(1))$, is an isomorphism.

It remains to exhibit the Morita equivalence of statement (1). Set $A_c := \Gamma_c(\mathcal{G}; r^*A(\operatorname{ad} T))$ and $B_c := C^*(\mathcal{G}; \sigma)$. We claim that the space $X_c := C_c(\mathcal{G}_1; \mathfrak{h})$ of compactly supported $\mathfrak{h}$-valued maps admits a pre-Morita equivalence $\Gamma^*(\mathcal{G}; r^*A(\operatorname{ad} T))$-$C^*(\mathcal{G}; \sigma)$-bimodule structure. Set the notational conventions: $g$'s $\in \mathcal{G}$, $a$'s $\in A_c$, $x$'s $\in X_c$, and $b$'s $\in B_c$. The unadorned bracket $\langle\ ,\ \rangle$ denotes the $\mathbb{C}$-valued inner product on $\mathfrak{h}$ and is taken to be conjugate-linear in the first variable and linear in the second. The bra-ket notation will be used for the $\mathcal{K}$-valued inner product on $\mathfrak{h}$, so for $v$'s $\in \mathfrak{h}$, $|v_1\rangle\langle v_2|$ is the compact operator defined by $|v_1\rangle\langle v_2|(v) := \langle v_2, v \rangle v_1$.

The left pre-Hilbert module structure on $X_c$ is given by the following data:

• Action: $ax(g) := \int_{g_1 g_2 = g} \sigma(g_1, g_2)\, a(g_1)T(g_1)x(g_2)$.
• Inner product: ${}_A\langle x_1, x_2 \rangle(g) := \int_{g_1 g_2 = g} \sigma(g_1, g_2)\, |x_1(g_1)\rangle\langle T(g)x_2(g_2^{-1})|$.

The right pre-Hilbert module structure on $X_c$ is given by the following data:

• Action: $xb(g) := \int_{g_1 g_2 = g} \sigma(g_1, g_2)\, x(g_1)b(g_2)$.
• Inner product: $\langle x_1, x_2 \rangle_B(g) := \int_{g_1 g_2 = g} \langle x_1(g_1^{-1}), x_2(g_2) \rangle \sigma(g_1, g_2)$.

Verification that this structure gives a Morita equivalence is routine.

Now let us proceed to the proof of the theorem.
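Before turning to that proof, the relation $\sigma = \delta T$ of Lemma A.5 can be illustrated numerically. The sketch below makes several assumptions that are not in the text: the groupoid is the finite group $\mathbb{Z}_N \times \mathbb{Z}_N$ with one object, the Hilbert space is $\mathbb{C}^N$ with $N = 3$, and $T(a, b) = U^a V^b$ is built from the standard shift and clock matrices. The function names are hypothetical. It checks that $T(g)T(h)T(gh)^{-1}$ is a scalar (so $\operatorname{ad} T$ is a genuine homomorphism even though $T$ is not) and that the resulting scalars satisfy the 2-cocycle identity.

```python
import cmath
from itertools import product

N = 3                                   # dimension of the toy Hilbert space (assumption)
OMEGA = cmath.exp(2j * cmath.pi / N)    # primitive N-th root of unity

def T(g):
    """T(a, b) = U^a V^b, where U e_j = e_{j+1} (shift) and V e_j = OMEGA^j e_j (clock)."""
    a, b = g
    M = [[0j] * N for _ in range(N)]
    for j in range(N):
        M[(j + a) % N][j] = OMEGA ** (b * j)
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def dagger(A):
    """Conjugate transpose; equals the inverse because every T(g) is unitary."""
    return [[A[j][i].conjugate() for j in range(N)] for i in range(N)]

def mul(g, h):
    """Group law of Z_N x Z_N."""
    return ((g[0] + h[0]) % N, (g[1] + h[1]) % N)

def sigma(g, h):
    """delta T(g, h) = T(g) T(h) T(gh)^{-1}; asserts it is scalar and returns the scalar."""
    M = matmul(matmul(T(g), T(h)), dagger(T(mul(g, h))))
    c = M[0][0]
    for i, j in product(range(N), repeat=2):
        assert abs(M[i][j] - (c if i == j else 0)) < 1e-9
    return c

ELEMS = list(product(range(N), repeat=2))

def is_cocycle():
    """Check sigma(g,h) sigma(gh,k) = sigma(g,hk) sigma(h,k) on all triples."""
    return all(
        abs(sigma(g, h) * sigma(mul(g, h), k)
            - sigma(g, mul(h, k)) * sigma(h, k)) < 1e-9
        for g, h, k in product(ELEMS, repeat=3)
    )

print(is_cocycle())
```

In this toy case one finds $\sigma((a_1, b_1), (a_2, b_2)) = \omega^{b_1 a_2}$, the kind of bicharacter cocycle that presents a finite analogue of a noncommutative torus.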


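Proof (Theorem). By the Dixmier-Douady classification we know that $A$ is isomorphic to the algebra of continuous sections of a bundle $E \to Q$ of compact operators, so assume $A = \Gamma_0(Q; E)$. A $V$-action on $A$ comes from an action by automorphisms of the bundle of algebras. Now, noting that $Q$ is isomorphic to the quotient $(V/\Lambda \rtimes_\rho \mathcal{G})_0/(V/\Lambda \rtimes_\rho \mathcal{G})_1$, pull back $E$ to a bundle $\tilde{E}$ over $V/\Lambda \times \mathcal{G}_0$. Then $\tilde{E}$ is a module for the groupoid $V \ltimes V/\Lambda \rtimes_\rho \mathcal{G}$. Denote by $E_0$ the restriction of $\tilde{E}$ to $\{e\Lambda\} \times \mathcal{G}_0 \subset V/\Lambda \times \mathcal{G}_0$. Then $E_0$ is trivializable since $\mathcal{G}_0$ is contractible, so we assume $E_0 = \mathcal{G}_0 \times \mathcal{K}$, and write $(s\gamma, k)$ for a point in $E_0 \subset \tilde{E}$, and $[s\gamma, k]$ for its image under the quotient $q : \tilde{E} \to E$. The $V \ltimes V/\Lambda \rtimes_\rho \mathcal{G}$-module structure on $\tilde{E}$ “restricts” to a $\Lambda \rtimes^{\delta\rho} \mathcal{G}$-module structure on $E_0$, via the inclusion

$$\iota : \Lambda \rtimes^{\delta\rho} \mathcal{G} \to V \ltimes V/\Lambda \rtimes_\rho \mathcal{G}, \qquad (\lambda, \gamma) \mapsto (\lambda\rho(\gamma), e\Lambda, \gamma).$$

The action on $E_0$ can be written as $(\lambda, \gamma) \cdot (s\gamma, k) := (r\gamma, \pi(\lambda, \gamma)k)$, where $\pi$ is a homomorphism $\Lambda \rtimes^{\delta\rho} \mathcal{G} \to \operatorname{Aut}\mathcal{K}$. (Note that there exists a lift of $\rho$ to $V$ since $\mathcal{G}_0$ is contractible and $V \to V/\Lambda$ is a covering space, so $\delta\rho$ makes sense.) Then $q$ followed by the $V$-action determines a map

$$V \times E_0 \to E, \qquad (v, (s\gamma, k)) \mapsto (v, [s\gamma, k]) \mapsto v \cdot [s\gamma, k].$$

This map factors through the quotient

$$V \times E_0 \longrightarrow (V \times E_0)/(\Lambda \rtimes^{\delta\rho} \mathcal{G}) := (V \times E_0)/\big((v, (s\gamma, k)) \sim (v - \lambda - \rho(\gamma), (r\gamma, \pi(\lambda, \gamma)k))\big),$$

and induces an isomorphism of bundles

$$(V \times E_0)/(\Lambda \rtimes^{\delta\rho} \mathcal{G}) \overset{\sim}{\longrightarrow} E,$$

which is $V$-equivariant when the bundle on the left is equipped with the natural translation action. So $A(Q; E)$ is equivariantly isomorphic to $\Gamma(Q; (V \times E_0)/(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}))$.

Let $\sigma := \delta\pi : (\Lambda \rtimes^{\delta\rho} \mathcal{G})_2 \to U(1)$ be an image of $\pi$ under the composition of the connecting homomorphism (which is an isomorphism due to the contractibility of $U(\mathfrak{h})$) and the pullback via the quotient map $V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G} \to \Lambda \rtimes^{\delta\rho} \mathcal{G}$,

$$H^1(\Lambda \rtimes^{\delta\rho} \mathcal{G}; \operatorname{Aut}\mathcal{K}) \to H^2(\Lambda \rtimes^{\delta\rho} \mathcal{G}; U(1)) \to H^2(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; U(1)).$$

In other words, we have chosen a continuous map $T : (\Lambda \rtimes^{\delta\rho} \mathcal{G}) \to U(\mathfrak{h})$ such that $\operatorname{ad} T = \pi$ and $\delta T = \sigma$, and $E \simeq E(\operatorname{ad} T)$. Now we know that $A(Q; H) \simeq \Gamma_0(Q; E) = \Gamma_0(Q; E(\operatorname{ad} T))$, and according to Lemma (A.5) there is a Morita equivalence:

$$C^*(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma) \overset{\text{Morita}}{\sim} \Gamma_0(Q; E(\operatorname{ad} T))$$

which is easily seen to be equivariant since $T$ does not depend on $V$. So statement (1) is proved, and statement (2) is obvious from the construction since $\Gamma_0(Q, E(\operatorname{ad} T)) \simeq A(Q, H)$ when $H$ is the image of $[\sigma] = [\delta T]$.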



Statement (3) now follows as well. Indeed, since an equivariant Morita equivalence induces an equivalence of the associated crossed product algebras, we have

$$V \ltimes C^*(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma) \overset{\text{Morita}}{\sim} V \ltimes A(Q; H),$$

and the algebra on the left, being identical to $C^*(V \ltimes V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma)$, is equivariantly Morita equivalent to $C^*(\Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma^{\vee})$ by Proposition (10.5). This completes the proof.

So that is the correspondence: Mathai-Rosenberg do $A(Q; H) \leftrightarrow V \ltimes A(Q; H)$, whereas we do $(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma) \leftrightarrow (\Lambda \rtimes^{\delta\rho} \mathcal{G}; \sigma^{\vee})$.

In Sects. (12) and (14) the presentation is slightly different. The difference is that in this appendix we have assumed that $\sigma$ is already given as a 2-cocycle on $\mathcal{H} := (V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G})$ which is constant in the $V$-direction, whereas in Sect. (14) we begin with an arbitrary $\sigma$ on $\mathcal{H}$ that has been extended to an equivariant 2-cocycle $(\sigma, \lambda, \beta)$. The two setups are essentially the same because according to Sect. (10), the existence of the lift $(\sigma, \lambda, \beta)$ ensures that $\sigma$ is cohomologous to a 2-cocycle which is constant in the $V$-direction. In Sect. (12) there is the further difference that $\sigma$ is presented on $V/\Lambda \rtimes_\rho \mathcal{G}$ rather than $\mathcal{H}$, but we can easily pull it back to a cocycle on $\mathcal{H}$. The slightly messier presentation in Sects. (12) and (14) appeals to the notion that the initial data is a gerbe on a principal bundle (with any groupoid presentation), and that we have found an action of $V$ (that is, a lift to an equivariant cocycle) for which the gerbe is equivariant.

The Mackey obstruction in our setup is simply $\beta := \sigma|_{\Lambda \rtimes^{\delta\rho} \mathcal{G}_0}$. The methods developed in Sect. (10) make it clear that since $\mathcal{G}$ is a Čech groupoid, the restriction to a point in the base space, $\mathcal{G} \mapsto \mathcal{G}|_m$, identifies $\beta$ with a 2-cocycle on $\Lambda$. When $\beta$ is a coboundary, we may assume that $\sigma$ only depends on one copy of $\Lambda$, and the Pontryagin duality methods apply. Indeed, we may always assume that $\sigma$ is in the image of equivariant cohomology, $H_V^2(V \rtimes \Lambda \rtimes^{\delta\rho} \mathcal{G}; U(1))$, and then $\beta$ really corresponds to the component that obstructs classical dualization and Pontryagin dualization described in Sect. (12).

Acknowledgement. I would like to thank Oren Ben-Bassat, Tony Pantev, Michael Pimsner, Jonathan Rosenberg, Jim Stasheff, and most of all Jonathan Block, for advice and helpful discussions. I am also grateful to the Institut Henri Poincaré, which provided a stimulating environment for some of this research.

Open Access. This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

[BX] Behrend, K., Xu, P.: Differentiable stacks and gerbes. http://arxiv.org/abs/math.DG/0605694, 2006
[BBP] Ben-Bassat, O., Block, J., Pantev, T.: Noncommutative tori and Fourier-Mukai duality. http://arxiv.org/abs/math.AG/0509161, 2005
[BD] Block, J., Daenzer, C.: Mukai duality for gerbes with connection. J. für die reine und ang. Math. (Crelle’s journal) in press
[BHM] Bouwknegt, P., Hannabuss, K., Mathai, V.: Nonassociative tori and applications to T-duality. Commun. Math. Phys. 264, 41–69 (2006)
[Br] Brylinski, J.-L.: Loop spaces, characteristic classes and geometric quantization. Progress in Mathematics, 107. Boston, MA: Birkhäuser Boston, Inc., 1993
[BRS] Bunke, U., Rumpf, P., Schick, T.: The topology of T-duality for T^n-bundles. Rev. Math. Phys. 18(10), 1103–1154 (2006)
[BSST] Bunke, U., Schick, T., Spitzweck, M., Thom, A.: Duality for topological abelian group stacks and T-duality. http://arxiv.org/abs/0701428v1[math.AT], 2007
[C] Crainic, M.: Differentiable and algebroid cohomology, Van Est isomorphisms, and characteristic classes. http://arxiv.org/abs/math.DG/0008064, 2000
[CMW] Curto, R., Muhly, P., Williams, D.: Crossed products of strongly Morita equivalent C∗-algebras. Proc. Amer. Math. Soc. 90, 528–530 (1984)
[DP] Donagi, R., Pantev, T.: Torus fibrations, gerbes, and duality. http://arxiv.org/abs/math.AG/0306213, 2003
[Gir] Giraud, J.: Cohomologie non abélienne. Berlin-Heidelberg-New York: Springer-Verlag, 1971
[Met] Metzler, D.: Topological and smooth stacks. http://arxiv.org/abs/math/0306176, 2003
[MR] Mathai, V., Rosenberg, J.: T-duality for torus bundles with H-fluxes via noncommutative topology, II: the high-dimensional case and the T-duality group. Adv. Theor. Math. Phys. 10, 123–158 (2006)
[MRW] Muhly, P.S., Renault, J., Williams, D.P.: Equivalence and isomorphism for groupoid C∗-algebras. J. Op. Th. 17, 3–22 (1987)
[RW] Raeburn, I., Williams, D.P.: Dixmier-Douady classes of dynamical systems and crossed products. Can. J. Math. 45(5), 1032–1066 (1993)
[RW2] Raeburn, I., Williams, D.P.: Morita equivalence and continuous-trace C∗-algebras. Mathematical Surveys and Monographs, 60. Providence, RI: Amer. Math. Soc., 1998
[Ren] Renault, J.: A groupoid approach to C∗-algebras. Lecture Notes in Mathematics, 793. Berlin: Springer, 1980
[Ren2] Renault, J.: Représentations des produits croisés d’algèbres de groupoïdes. J. Op. Th. 18, 67–97 (1987)
[Pol] Polchinski, J.: String Theory, Vol. II. Cambridge: Cambridge University Press, 1998
[SYZ] Strominger, A., Yau, S.-T., Zaslow, E.: Mirror symmetry is T-duality. Nucl. Phys. B 479, 243 (1996)

Communicated by A. Connes




Twistor Actions for Self-Dual Supergravities

Lionel J. Mason1, Martin Wolf2

1 The Mathematical Institute, University of Oxford, 24–29 St. Giles, Oxford OX1 3LP, United Kingdom. E-mail: [email protected]

2 Theoretical Physics Group, The Blackett Laboratory, Imperial College London, Prince Consort Road, London SW7 2AZ, United Kingdom. E-mail: [email protected]

Received: 5 November 2007 / Accepted: 5 November 2008
Published online: 26 February 2009 – © Springer-Verlag 2009

Abstract: We give holomorphic Chern-Simons-like action functionals on supertwistor space for self-dual supergravity theories in four dimensions, dealing with N = 0, . . . , 8 supersymmetries, the cases where different parts of the R-symmetry are gauged, and with or without a cosmological constant. The gauge group is formally the group of holomorphic Poisson transformations of supertwistor space where the form of the Poisson structure determines the amount of R-symmetry gauged and the value of the cosmological constant. We give a formulation in terms of a finite deformation of an integrable $\bar\partial$-operator on a supertwistor space, i.e., on regions in $\mathbb{CP}^{3|8}$. For N = 0, we also give a formulation that does not require the choice of a background.

Contents
1. Introduction
2. Twistor Constructions for Self-Dual Supergravity
2.1 Definitions, notation and conventions
2.2 Self-dual supergravity equations
2.3 Twistor constructions
3. Twistor Actions
3.1 Deformations of twistor space
3.2 Action functionals
4. Covariant Approach, Covariant Action for N = 0 and Special Geometry
4.1 The supersymmetric case
5. Conclusions
Appendix A. Prepotential Formulation
A.1. Real structures on PT[N] and M[N]
A.2. Comparison of the two approaches
Appendix B. Holomorphic Volume Forms and Non-Projective Twistor Space
Appendix C. Supersymmetric BF-Type Theory
References


1. Introduction

Recently it has been discovered that N = 8 supergravity has better ultraviolet behaviour than has hitherto been anticipated, [B-Betal06,BDR07 and GRV07]. This has led some authors to speculate that it is possibly even finite. This improved behaviour relies on exact cancellations that do not follow from standard supersymmetry arguments, [St06]. One possible explanation arises from twistor string theory, [W04 and B04]. The original twistor string theories by Witten and Berkovits correspond to conformal supergravity (together with supersymmetric Yang-Mills theory), [BW04]. By gauging certain symmetries of the Berkovits twistor string, [A-ZHM08] introduced a new family of twistor string theories some of which have the appropriate field content for Einstein supergravity (including N = 4 and N = 8). Such a twistor string formulation of Einstein supergravity could be an explanation for the possible ultraviolet finiteness of N = 8 supergravity if it were fully consistent in its quantum theory. However, it now appears that these twistor string theories are chiral, [N08], unlike the original twistor string theories which were parity invariant. It remains a major open question as to whether a twistor-string theory exists that gives the full content of Einstein (super)-gravity even just at tree level. An approach to understanding what the appropriate twistor string theory might be is via a twistor action, [M05 and MS06] and [BMS07a,BMS07b]. Such actions have two terms. The first on its own gives a kinetic term for all the fields, but with only the self-dual part of the interactions. The second gives the remaining interactions of the full theory and corresponds to the instanton contribution in the twistor-string theory. In the case of N = 4 supersymmetric Yang-Mills theory, the self-dual part of the action on twistor space is a holomorphic Chern-Simons theory, [W04], see also [S95] for a closely related harmonic superspace action.
[BW04] gave a twistor action for self-dual N = 4 conformal supergravity. The purpose of this paper is to give an analogous action in the case of self-dual N = 8 Einstein supergravity. This action is special to N = 8 supergravity in much the same way as Witten’s Chern-Simons action is special to N = 4 supersymmetric Yang-Mills theory. It lends general support to the idea that twistor space has something special to say about full N = 8 supergravity and is suggestive of the existence of an underlying twistor string theory, perhaps even with explicit N = 8 supersymmetry as opposed to those of [A-ZHM08] in which only N = 4 supersymmetry is manifest. Penrose’s non-linear graviton construction [P76] reformulates the local data of a four-metric with self-dual Weyl tensor into the complex structure of a deformed twistor space, a three-dimensional complex manifold obtained by deforming a region in $\mathbb{CP}^3$. The space-time field equation in this case is the vanishing of the anti-self-dual part of the Weyl tensor, and in the [AHS78] approach to twistor theory, this is reformulated as the integrability of the twistor almost complex structure. [BW04] introduce a version of conformal gravity with just self-dual interactions in which the underlying conformal structure is self-dual, but in which there is also a linear anti-self-dual conformal gravity field (a linearised anti-self-dual Weyl tensor B) propagating on the self-dual background. This has a Lagrange multiplier action (analogous to a ‘BF’ action) $\int (B, C^-)\, d\,\mathrm{vol}$, where $C^-$ is the anti-self-dual part of the Weyl tensor, and $(B, C^-)$ is the natural pairing. This can be extended to N = 4 supersymmetry. [BW04] gave a corresponding (supersymmetric) twistor action of the form $\int bN$, where N is the Nijenhuis tensor of the almost complex structure and b is a Lagrange multiplier that doubles up as the Penrose


transform of the field B when the field equations are satisfied. In the non-supersymmetric case, this was extended to a twistor action for full (non-self-dual) conformal gravity in [M05] with further supersymmetric extension and connections with twistor-string theory in [MS08]. For Einstein gravity we wish to encode the vanishing of the Ricci tensor. In the non-linear graviton this can be characterised by requiring that the twistor space admits a fibration over a $\mathbb{CP}^1$ together with a certain Poisson structure up the fibre. [W80] extended this to the Einstein case, with a cosmological constant; in this case, the twistor space is required to admit a holomorphic contact structure that is non-degenerate when the cosmological constant is non-zero, see [WW90 and MW96] for textbook treatments. So, for Einstein gravity, we are seeking a twistor action whose field equations not only imply the integrability of an almost complex structure, but also the existence of some compatible holomorphic geometric structure, for example the contact one-form in the case of the cosmological constant, or the fibration together with a Poisson structure up the fibres in the case of vanishing cosmological constant. The first task is to introduce suitable variables that encode the almost complex structure together with the relevant compatible geometric structure on the real six-manifold underlying the twistor space. This turns out to be a one-form with values in a line bundle, and we write down the appropriate field equations that it must satisfy and an action (depending also on a Lagrange multiplier field) that gives rise to them; the Lagrange multiplier field again corresponds to an anti-self-dual linear gravitational field propagating on the self-dual background via the Penrose transform when the field equations are satisfied. Our primary exposition will focus on the N = 8 supersymmetric cases, and reduce them to the cases with lesser or no supersymmetry.
Supersymmetric extensions of Penrose’s non-linear graviton construction were first discussed by [M92a,M92b] (see also [M91,M92c]) based on work by [M88] and developed further in [A-ZHM08] and in [W07].1 That in [W07] gives a twistor description of four-dimensional N-extended, possibly gauged, self-dual supergravity with and without cosmological constant in terms of a deformed supertwistor space, a deformation of a region in $\mathbb{CP}^{3|N}$ endowed with an even holomorphic contact structure. Here we also discuss the different gaugings in the case without a cosmological constant. It is the integrability of the almost complex structures of these twistor spaces together with the holomorphy of the appropriate geometric structures that correspond to the field equations for our twistor actions. There are now a number of contexts arising from conventional string theory and M-theory in which the task arises of finding variables and action principles whose field equations encode the integrability of complex structures compatibly with some other geometric structure. In particular Kodaira-Spencer theory, [BCOV94], leads to field equations that imply the integrability of an almost complex structure compatible with a global holomorphic volume form on a six-manifold, yielding a Calabi-Yau structure. For a compendium of such theories and relations between them, including conjectured relations to twistor-string theory, see [DGNV05]. The situations considered here are distinct from those in [DGNV05], but given that one of the form theories involved there is a self-dual form theory of four-dimensional gravity including a cosmological constant (see also [A-ZH06]) there may well be some important connections between these ideas. The paper is structured as follows.
In §2, we first review the equations of self-dual supergravity, with cosmological constant and gauged R-symmetry, and then go on to review the various twistor constructions and give a brief proof of the version of the non-linear graviton construction for self-dual Einstein supergravity both with and without cosmological constant and different gaugings. In §3, we study infinitesimal deformations and show that a deformation of the contact structure determines a deformation of the almost complex structure. We develop a non-projective twistor formulation that shows that this persists in the case of a finite deformation giving a compact form for the field equations, i.e., the integrability condition for the almost complex structure. In the case of maximal supersymmetry, N = 8, we present the twistor action and show that it gives the appropriate field equations. We give a brief discussion of its invariance properties and various reductions with lesser gauging, supersymmetry, or no cosmological constant. A Chern-Simons action is always expressed in a given background frame and is not manifestly gauge invariant. In this gravitational context, our action is similarly not manifestly diffeomorphism invariant; we require the choice of some background, which we take to be a solution to the field equations. However, we go some way towards an invariant formulation. We give an invariant formulation of the field equations in general, but only find an explicitly diffeomorphism invariant action in the N = 0 case with cosmological constant. We prove that on any smooth manifold of dimension 4n + 2 equipped with a complex one-form τ up to scale (i.e., a complex line subbundle of the complexified cotangent bundle), if τ ∧ (dτ)^n = 0, and a non-degeneracy condition is satisfied, there is a unique integrable almost complex structure for which τ is proportional to a non-degenerate holomorphic contact structure. This idea can be used to give a covariant form of the field equations in general, and a covariant action in the N = 0 case. In §5, we make some general concluding remarks.

1 See also [S06 and W06] and references therein for recent reviews of supertwistors and their application to supersymmetric gauge theories.
An action principle for N = 8 self-dual supergravity with vanishing cosmological constant has been obtained by [KK98] in harmonic superspace for split space-time signature.2 In that work, harmonic superspace is the spin bundle of super space-time and in Euclidean signature, it can naturally be identified with the supertwistor space. However, their action uses structures pulled back from space-time (e.g., the Laplacian) that are not locally obtainable from the complex structure and contact structure on twistor space. It is therefore not possible to regard it as a twistor action. Nevertheless, their action is closely related to ours and we show that theirs can be obtained from ours by gauge fixing in Appendix A. In Appendix B, we give a detailed discussion of the construction of the line bundle on a supertwistor space whose total space corresponds to a non-projective twistor space. In Appendix C, we discuss some alternative twistor actions.

2. Twistor Constructions for Self-Dual Supergravity

We work throughout in a complex setting. This can be understood as arising from taking a real analytic metric on a real space-time, and extending it to become a holomorphic complex metric on some neighbourhood M of the real slice in complexified space-time. We can straightforwardly restrict attention to a Euclidean or split signature slice by requiring invariance under appropriate anti-holomorphic involutions (for Euclidean signature, these are discussed in Appendix A). In the Euclidean case, one needs to restrict the number of allowed supersymmetries N to be even.

2.1. Definitions, notation and conventions. We model our definition of chiral super space-time on the paraconformal geometries of [BE91] (see also [W07]).

2 A similar action for N = 4 supersymmetric Yang-Mills theory was discovered in the context of harmonic superspace by [S95].



Definition 1. A right-chiral super space-time, M, is a split supermanifold of superdimension 4|2N on which we have an identification3 $TM \cong H \otimes S$, where S is the right (dotted) spin bundle of rank 2|0 and H is the sum of the left spin bundle S and the rank-0|N bundle of supersymmetry generators and so has rank 2|N. We will also assume that S and H are endowed with choices of Berezinian forms (so that TM does also).

This is the superspace one would obtain from a full super space-time by eliminating the left-handed fermionic coordinates, leaving only the right-handed ones in play. Being a split supermanifold, it is locally of the form $\mathbb{C}^{4|2N}$ with coordinates4 $(x^{\mu\dot\nu}, \theta^{m\dot\nu}) := x^{M\dot\nu}$ with $x^{\mu\dot\nu}$ bosonic and $\theta^{m\dot\nu}$ fermionic, where the indices range as follows: $\alpha, \ldots, \mu, \ldots = 0, 1$ for left-handed two-component spinors, $\dot\alpha, \ldots, \dot\mu, \ldots = \dot 0, \dot 1$ for right-handed spinors, $i, \ldots, m, \ldots = 1, \ldots, N$ indexing the supersymmetries and $A = (\alpha, i)$, $M = (\mu, m)$; it will turn out in the following that it is natural, and simplifying in this self-dual context, to group together the supersymmetry index m and the undotted spinor index µ into one index M. We use the convention that letters from the middle of the alphabets are coordinate indices whereas letters from the beginning of the alphabets are structure frame indices. The identification $TM \cong H \otimes S$ will be specified by a choice of ‘structure coframe’ given by the indexed one-forms

$E^{A\dot\alpha} = dx^{M\dot\nu}\, E_{M\dot\nu}{}^{A\dot\alpha}$. (2.1)

The dual vector fields will be denoted $E_{A\dot\alpha}$, $E_{A\dot\alpha} \lrcorner E^{B\dot\beta} = \delta_{\dot\alpha}{}^{\dot\beta}\, \delta_A{}^B$. When contracting a vector field V with a differential one-form α we use the notation $V \lrcorner \alpha$. With the capital Roman indices A, B, . . . ranging over both the bosonic α, β, . . . and the fermionic i, j, . . . indices we use the notation {AB . . .] for graded symmetrization and [AB . . .} for graded skew symmetrization

$T_{\{A_1 A_2 \cdots A_n]} := \frac{1}{n!} \sum_{\sigma \in P_n} (-)^{\bar\sigma}\, T_{A_{\sigma(1)} A_{\sigma(2)} \cdots A_{\sigma(n)}}$, (2.2a)

$T_{[A_1 A_2 \cdots A_n\}} := \frac{1}{n!} \sum_{\sigma \in P_n} (-)^{\bar\sigma + |\sigma|}\, T_{A_{\sigma(1)} A_{\sigma(2)} \cdots A_{\sigma(n)}}$, (2.2b)

where $P_n$ is the group of permutations of n letters, $|\sigma|$ the number of transpositions in σ and $\bar\sigma$ the number of transpositions of odd indices. For an index such as A that ranges over indices for both odd and even coordinates, $p_A$ will denote the Graßmann parity of the index, $p_A = 0$ for an even coordinate, and 1 for an odd one, so that a graded skew form $\Lambda_{AB}$ satisfies

$\Lambda_{AB} = -(-)^{p_A p_B}\, \Lambda_{BA}$. (2.3)

We introduce $\epsilon_{\dot\alpha\dot\beta} = \epsilon_{[\dot\alpha\dot\beta]}$ with $\epsilon_{\dot 0 \dot 1} = -1$ and $\epsilon_{\dot\alpha\dot\gamma}\, \epsilon^{\dot\gamma\dot\beta} = \delta_{\dot\alpha}{}^{\dot\beta}$, and similarly for $\epsilon_{\alpha\beta}$.

In the supersymmetric setting, there is a distinction between differential and integral forms, the latter being required for integration, [M88]. Unless otherwise stated, all our forms will be differential.

3 By $TM$ we will mean $T^{(1,0)}M$. There will be no role for anti-holomorphic objects on M.
4 The index structure on the bosonic coordinates in the curved case is not natural, but simplifies notation.
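As a quick check of the conventions (2.2a) and (2.2b), the case n = 2 can be written out explicitly. This is only a sketch, assuming that the sign $(-)^{\bar\sigma}$ attached to the single transposition exchanging A and B equals $(-)^{p_A p_B}$, as the definition of $\bar\sigma$ suggests.

```latex
\begin{align*}
T_{\{AB]} &= \tfrac{1}{2}\bigl(T_{AB} + (-)^{p_A p_B}\, T_{BA}\bigr), \\
T_{[AB\}} &= \tfrac{1}{2}\bigl(T_{AB} - (-)^{p_A p_B}\, T_{BA}\bigr).
\end{align*}
```

Under this reading, a tensor obeying (2.3) coincides with its own graded skew part, and for two odd indices the graded skew part reduces to the ordinary symmetric part.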


2.2. Self-dual supergravity equations. We introduce connections on H and S represented by connection one-forms $\omega_A{}^B$ and $\omega_{\dot\alpha}{}^{\dot\beta}$, respectively. These determine a connection ∇ on TM by

$\nabla V^{A\dot\alpha} = dV^{A\dot\alpha} + V^{B\dot\alpha}\, \omega_B{}^A + V^{A\dot\beta}\, \omega_{\dot\beta}{}^{\dot\alpha}$ (2.4)

so that it preserves the factorisation $TM \cong H \otimes S$. The fermionic parts of $\omega_A{}^B$ gauge the R-symmetry. In this supersymmetric context, a choice of scale or volume form on M is a section of the Berezinian of $\Omega^1 M$. We can assume that the Berezinians of H and S have been identified so that the scale is determined by a section of the Berezinian of either H∗ or S∗. The connections can be chosen uniquely so that they preserve these sections of the Berezinians of H∗ and S∗ and so that the connection on TM has torsion with vanishing supertrace.5 We assume from hereon that such choices have been made. In the formulae that follow, we will also assume that the connection is torsion-free as that is part of the self-dual Einstein condition (the torsion will not in general vanish on the full super space-time, only on this right-chiral (or left-chiral) reduced supermanifold). The curvature two-form $R_{A\dot\alpha}{}^{B\dot\beta}$ of ∇ decomposes into curvature two-forms for the connections on H and S,

$R_{A\dot\alpha}{}^{B\dot\beta} = \delta_A{}^B\, R_{\dot\alpha}{}^{\dot\beta} + \delta_{\dot\alpha}{}^{\dot\beta}\, R_A{}^B$. (2.5)

Making explicit the form indices, we write the Ricci identities as

$[\nabla_{A\dot\alpha}, \nabla_{B\dot\beta}\} V^{D\dot\delta} = (-)^{p_C(p_A+p_B)}\, V^{C\dot\delta}\, R_{A\dot\alpha B\dot\beta C}{}^{D} + (-)^{p_D(p_A+p_B)}\, V^{D\dot\gamma}\, R_{A\dot\alpha B\dot\beta \dot\gamma}{}^{\dot\delta}$, (2.6)

where $V^{A\dot\alpha}$ is a vector field on M. In the torsion-free case, using the algebraic Bianchi identities, Prop. 2.6 of [W07] gives the decomposition of the curvature into irreducibles:

$R_{A\dot\alpha B\dot\beta C}{}^{D} = -2(-)^{p_C(p_A+p_B)}\, R_{C[A|\dot\alpha\dot\beta|}\, \delta_{B\}}{}^{D} + \epsilon_{\dot\alpha\dot\beta}\, R_{ABC}{}^{D}$, (2.7a)

$R_{ABC}{}^{D} = C_{ABC}{}^{D} - 2(-)^{p_C(p_A+p_B)}\, \Lambda_{C\{A}\, \delta_{B]}{}^{D}$, (2.7b)

$R_{A\dot\alpha B\dot\beta \dot\gamma}{}^{\dot\delta} = C_{AB\dot\alpha\dot\beta\dot\gamma}{}^{\dot\delta} + 2\Lambda_{AB}\, \delta_{(\dot\alpha}{}^{\dot\delta}\, \epsilon_{\dot\beta)\dot\gamma} + \epsilon_{\dot\alpha\dot\beta}\, R_{AB\dot\gamma}{}^{\dot\delta}$, (2.7c)

where the curvature tensors satisfy the algebraic conditions

$R_{AB\dot\alpha\dot\beta} = R_{AB\dot\alpha}{}^{\dot\gamma}\, \epsilon_{\dot\gamma\dot\beta} = R_{AB(\dot\alpha\dot\beta)}$, $C_{ABC}{}^{D} = C_{\{ABC]}{}^{D}$, $(-)^{p_C}\, C_{ABC}{}^{C} = 0$, $\Lambda_{AB} = \Lambda_{[AB\}}$. (2.8)

Here, $\Lambda_{AB}$ is a natural supersymmetric extension of the scalar curvature and will be set equal to the cosmological constant when the field equations are satisfied. (See [W07] for further details of the construction and properties of the connections.)

5 Special care needs to be taken for N = 4, [W07].



Definition 2. A right-chiral superspace will be said to satisfy the N-extended self-dual supergravity equations if
(i) the unique connection that preserves the given Berezinians of H∗ and S∗ is torsion-free and satisfies $C_{AB\dot\alpha\dot\beta\dot\gamma}{}^{\dot\delta} = 0$,
(ii) $R_{AB\dot\alpha\dot\beta} = 0$,
(iii) it preserves some $P^{AB} = P^{[AB\}} \in \wedge^2 H$ of rank 2|r and is flat on the odd (N − r)-dimensional subspace of H∗ that annihilates $P^{AB}$.
When $\Lambda_{AB} \neq 0$ it will be said to be Einstein, whereas if $\Lambda_{AB} = 0$ it will be said to be vacuum.

When r = 0, the connection on H is trivial in the odd directions and the R-symmetry is ungauged; all supersymmetry generators are covariantly constant. For r > 0, a subgroup of the R-symmetry is gauged with gauge group an extension of SO(r, C), the subgroup of SO(N, C) that preserves $P^{ij}$, the odd-odd part of $P^{AB}$. For r = N, the gauge group is SO(N, C). Conformal supergravity corresponds to the more general situation where condition (i) alone is satisfied, and a natural supersymmetric analogue of the hypercomplex case corresponds to conditions (i) and (ii). In this work, we shall mostly be concerned with the situation where (i)–(iii) are satisfied simultaneously. There is only one possibility for the gauging in the Einstein case as follows:

Lemma 1. Either $P^{AB}$ and $\Lambda_{AB}$ both have maximal rank and can be chosen to be multiples of each other’s inverse, or $\Lambda_{AB} = 0$.

Proof. Condition (iii) of Def. 2 implies that $(-)^{p_C + p_C(p_D+p_E)}\, R_{ABC}{}^{[D} P^{E\}C} = 0$, and taking a supertrace gives the equation

$0 = (-)^{p_C(p_A+p_E) + p_C + p_B}\, \Lambda_{C\{A}\, \delta_{B]}{}^{[B} P^{E\}C}$, (2.9)

which quickly leads to the condition that $(-)^{p_B}\, \Lambda_{AB} P^{BC}$ is a multiple of $\delta_A{}^C$. If this multiple is non-zero, $P^{AB}$ and $\Lambda_{AB}$ have maximal rank and are multiples of each other’s inverse. If this multiple is zero, the assumption on the rank of $P^{AB}$ implies that the rank of $\Lambda_{AB}$ is less than or equal to 0|N − r. The condition that the connection is flat on the subspace of H∗ that annihilates $P^{AB}$ implies that $R_{ABC}{}^{D} e_D = 0$ for all $e_D$ such that $P^{AB} e_B = 0$. Symmetrizing over ABC gives that $C_{ABC}{}^{D} e_D = 0$ so we must also have $\Lambda_{C\{A}\, e_{D]} = 0$. Note that for N = 2 the multiple is always zero.

It is a consequence of the Bianchi identities that $\Lambda_{AB}$ is covariantly constant so that, when non-zero, defining $P_{AB}$ as the inverse of $P^{AB}$, we can set $\Lambda_{AB} = \Lambda P_{AB}$. When $\Lambda_{AB}$ is non-trivial, the curvature is non-trivial on the odd directions of H, and the R-symmetry is therefore necessarily gauged with gauge group SO(N, C). We will see in §3.1 in the discussion of the deformations of twistor space how the different gaugings come about. We also obtain $(-)^{p_C + p_C(p_D+p_E)}\, C_{ABC}{}^{[D} P^{E\}C} = 0$ and $\nabla_{[A\dot\alpha}\, C_{B\}CD}{}^{E} = 0$. The field equations of self-dual supergravity with zero cosmological constant lead to the Ricci identities

$[\nabla_{A\dot\alpha}, \nabla_{B\dot\beta}\} V^{D\dot\delta} = (-)^{p_C(p_A+p_B)}\, V^{C\dot\delta}\, \epsilon_{\dot\alpha\dot\beta}\, C_{ABC}{}^{D}$, (2.10)

which in turn imply Ricci-flatness of M. The self-dual supergravity equations on chiral super space-time with vanishing cosmological constant first appeared in light-cone gauge and in their covariant formulation in the work by [S92].6

6 See also [K79,K80,CDDG79,KNG92 and BS92].
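The step from the general Ricci identities (2.6) to the vacuum form (2.10) can be sketched as follows. The sketch assumes the decomposition (2.7) and the conditions of Definition 2 with vanishing cosmological constant, written here as $C_{AB\dot\alpha\dot\beta\dot\gamma}{}^{\dot\delta} = 0$, $R_{AB\dot\alpha\dot\beta} = 0$ and $\Lambda_{AB} = 0$, where $\Lambda_{AB}$ denotes the supersymmetric extension of the scalar curvature and $\epsilon_{\dot\alpha\dot\beta}$ the spinor metric.

```latex
\begin{align*}
R_{A\dot\alpha B\dot\beta C}{}^{D}
  &= \epsilon_{\dot\alpha\dot\beta}\, R_{ABC}{}^{D}
   = \epsilon_{\dot\alpha\dot\beta}\, C_{ABC}{}^{D}
  && \text{by (2.7a), (2.7b) with } R_{AB\dot\alpha\dot\beta} = 0,\ \Lambda_{AB} = 0, \\
R_{A\dot\alpha B\dot\beta\dot\gamma}{}^{\dot\delta}
  &= \epsilon_{\dot\alpha\dot\beta}\, R_{AB\dot\gamma}{}^{\dot\delta} = 0
  && \text{by (2.7c), (2.8)}, \\
[\nabla_{A\dot\alpha}, \nabla_{B\dot\beta}\}\, V^{D\dot\delta}
  &= (-)^{p_C(p_A+p_B)}\, V^{C\dot\delta}\, \epsilon_{\dot\alpha\dot\beta}\, C_{ABC}{}^{D}
  && \text{by (2.6)}.
\end{align*}
```

The second line uses that $R_{AB\dot\alpha\dot\beta} = R_{AB\dot\alpha}{}^{\dot\gamma}\epsilon_{\dot\gamma\dot\beta}$ vanishes only if $R_{AB\dot\alpha}{}^{\dot\gamma}$ does, since $\epsilon$ is invertible.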



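2.3. Twistor constructions. Flat supertwistor space is $PT_{[N]} := \mathbb{CP}^{3|N} \setminus \mathbb{CP}^{1|N}$ with homogeneous coordinates

$Z^I := (\omega^\alpha, \theta^i, \pi_{\dot\alpha}) = (\omega^A, \pi_{\dot\alpha})$, (2.11)

where $\omega^\alpha$ and $\pi_{\dot\alpha}$ are bosonic coordinates and $\theta^i$ fermionic ones. The supertwistor correspondence is between $PT_{[N]}$ and right-chiral complexified super space-time $M_{[N]} \cong \mathbb{C}^{4|2N}$ with coordinates $(x^{\alpha\dot\alpha}, \theta^{i\dot\alpha}) = x^{A\dot\alpha}$, and is expressed by the incidence relation

$\omega^A = x^{A\dot\alpha}\, \pi_{\dot\alpha}$. (2.12)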
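To see what the incidence relation (2.12) encodes, one can solve it for a fixed twistor. The following is only a sketch, assuming $\pi_{\dot\alpha} \neq 0$ and that dotted indices are raised with the antisymmetric spinor metric, so that $\pi^{\dot\alpha}\pi_{\dot\alpha} = 0$; the base point $x_0^{A\dot\alpha}$ and the parameters $\lambda^A$ are introduced here purely for illustration.

```latex
\begin{equation*}
\omega^A = x_0^{A\dot\alpha}\,\pi_{\dot\alpha}
\quad\Longrightarrow\quad
x^{A\dot\alpha} = x_0^{A\dot\alpha} + \lambda^A\,\pi^{\dot\alpha},
\qquad\text{since}\qquad
\lambda^A\,\pi^{\dot\alpha}\pi_{\dot\alpha} = 0 .
\end{equation*}
```

The free parameters $\lambda^A = (\lambda^\alpha, \lambda^i)$ have two bosonic and N fermionic components, which matches the superdimension 2|N of the superplane associated with a point of supertwistor space.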

By holding $x^{A\dot\alpha}$ constant we see that points of $M_{[N]}$ correspond to $\mathbb{CP}^1$s in supertwistor space $PT_{[N]}$ with homogeneous coordinates $\pi_{\dot\alpha}$. Alternatively, by holding $Z^I$ constant, we see that points in $PT_{[N]}$ correspond to (2|N)-dimensional isotropic superplanes. In the curved case, both sides of the correspondence are deformed, but points of super space-time still correspond to $\mathbb{CP}^1$s in supertwistor space (and points of supertwistor space to (2|N)-dimensional isotropic subsupermanifolds of M). Bosonic twistor space will be denoted by P T and will be a deformation of some region in $\mathbb{CP}^3$, whereas a supersymmetrically extended curved twistor space will be denoted by PT and will be a deformation of a region in $\mathbb{CP}^{3|N}$. Similarly, a bosonic space-time will be denoted by M and a supersymmetric one (which will always in this paper be right-chiral) by M. We recall first Ward’s extension [W80] of Penrose’s [P76] non-linear graviton construction to the case of non-zero cosmological constant:

Theorem 1. ([P76,W80]). (i) There is a natural one-to-one correspondence between holomorphic conformal structures [g] on some four-dimensional (complex) manifold M whose anti-self-dual Weyl curvature vanishes, and three-dimensional complex manifolds P T (the twistor space) containing a rational curve (a $\mathbb{CP}^1$) with normal bundle $N \cong O(1) \oplus O(1)$. (ii) The existence of a conformal scale for which the trace-free Ricci tensor vanishes, but for which the scalar curvature is non-vanishing, is equivalent to P T admitting a non-degenerate contact structure. (iii) The existence of a conformal scale for which the full Ricci tensor vanishes is equivalent to P T admitting a fibration P T → $\mathbb{CP}^1$ whose fibres admit a Poisson structure with values in the pullback of O(−2) from $\mathbb{CP}^1$.

Here, O(n) is the complex line bundle of Chern class n on $\mathbb{CP}^1$. The holomorphic contact structure is a rank-2 distribution $D \subset T^{(1,0)}P T$ in the holomorphic tangent bundle of P T. The quotient determines a line bundle $L := T^{(1,0)}P T / D$. The distribution D can be defined dually as the kernel of a holomorphic (1, 0)-form τ defined up to scale on P T, i.e. D = ker τ. If so, τ takes values in L since the map $T^{(1,0)}P T \to T^{(1,0)}P T / D =: L$ is then the contraction of a vector with the (1, 0)-form τ. The nondegeneracy condition is that for any two vector fields X and Y in D, the Frobenius form

$D \wedge D \to L := T^{(1,0)}P T / D$, $(X, Y) \mapsto [X, Y] \bmod D$ (2.13)



is non-degenerate on D. This is equivalent to τ ∧ dτ ≠ 0. When it is everywhere degenerate, D determines a foliation whose leaves are the fibres of the projection P T → $\mathbb{CP}^1$, τ is the pullback of the one-form $\pi^{\dot\alpha} d\pi_{\dot\alpha}$ from $\mathbb{CP}^1$, and L becomes the pullback of O(2) from $\mathbb{CP}^1$. In the non-degenerate case, we can define a Poisson structure with values in L∗ to be the inverse of the Frobenius form on D. This has an analogue also in the degenerate case, now with values in O(−2), although its existence no longer follows from that of τ. We can impose compatibility with, e.g., Euclidean reality conditions by requiring the existence of an anti-holomorphic involution ρ : P T → P T without fixed points sending the given Riemann sphere to itself via the antipodal map. This then induces a corresponding involution on M fixing a real slice on which the metric g is real and of Euclidean signature. The above theorem has a supersymmetric extension as follows:

Theorem 2. (i) There is a natural one-to-one correspondence between conformally self-dual holomorphic right-chiral space-times and complex supermanifolds PT of dimension 3|N with an embedded rational curve (a Riemann sphere $\mathbb{CP}^1$) with normal bundle $N \cong O(1)^{\oplus 2|N}$. (ii) Furthermore, M is a complex solution to the four-dimensional N-extended self-dual supergravity equation with non-vanishing cosmological constant iff the twistor space PT admits a non-degenerate even contact structure. (iii) M is a complex solution to the four-dimensional N-extended self-dual supergravity equation with vanishing cosmological constant iff the twistor space PT admits a fibration PT → $\mathbb{CP}^{1|N-r}$ and a Poisson structure of rank 2|r tangent to the fibres with values in the pullback of O(−2).

Here, $O(n)^{\oplus r|s} := \mathbb{C}^{r|s} \otimes O(n)$. The proof breaks up into three parts; further details of the non-degenerate cosmological constant case are given in [W07].

Proof. Part (i). Let F = P(S∗) be the projective co-spin bundle over M with holomorphic projection p : F → M.
Its fibres $p^{-1}(x)$ over x ∈ M are complex projective lines $\mathbb{CP}^1$ with homogeneous fibre coordinates $\pi_{\dot\alpha}$. We define the twistor distribution to be the rank-2|N distribution $D_F$ on F given by

$D_F := \mathrm{span}\{\tilde E_A\} := \mathrm{span}\left\{ \pi^{\dot\alpha} E_{A\dot\alpha} + \pi^{\dot\alpha} \pi_{\dot\gamma}\, \omega_{A\dot\alpha\dot\beta}{}^{\dot\gamma}\, \frac{\partial}{\partial \pi_{\dot\beta}} \right\}$, (2.14)

where the $E_{A\dot\alpha}$s are the frame fields and $\omega_{\dot\alpha}{}^{\dot\beta}$ is the connection one-form on S. A few lines of algebra show that $D_F$ is integrable if and only if the connection is torsion-free and the $C_{AB(\dot\alpha\dot\beta\dot\gamma\dot\delta)}$-part of the curvature vanishes. In this case, the distribution $D_F$ defines a foliation of F. Working locally on M, the resulting quotient will be our supertwistor space, a (3|N)-dimensional supermanifold denoted by PT. The quotient map will be denoted by q : F → PT so that we have the double fibration $PT \xleftarrow{\ q\ } F \xrightarrow{\ p\ } M$. We note that we can form a non-projective supertwistor space T by taking the quotient of S∗ by the distribution $D_F$. The integral curves of the Euler vector field $\tilde\Upsilon := \pi_{\dot\alpha}\, \partial/\partial\pi_{\dot\alpha}$ are the fibres over P(S∗) and $\tilde\Upsilon$ descends to give a vector field ϒ on T which determines the fibration T → PT. Since F is a $\mathbb{CP}^1$-bundle over M and the fibres are transverse to the distribution $D_F$, the submanifolds $q(p^{-1}(x)) \to PT$, for x ∈ M, are $\mathbb{CP}^1$s. In the other direction, the supermanifolds $p(q^{-1}(Z)) \to M$, for Z ∈ PT, are the (2|N)-dimensional isotropic subsupermanifolds of M given by the p-projections of integral surfaces of $D_F$.



The inverse construction, i.e. starting from PT , follows by applying a supersymmetric extension of Kodaira’s deformation theory ([W86]). This allows one to reconstruct M as the moduli space of CP1 s that arise as deformations of the given CP1 which will correspond to some x ∈ M . According to Kodaira theory, Tx M ∼ = H 0 (CP1 , N ), where N is the normal bundle to the given CP1 ⊂ PT , and in order that the moduli space exist, we require the vanishing of the first cohomology of the normal bundle N . If the given CP1 arises as q( p −1 (x)) for some x ∈ M , then N ∼ = O(1)⊕2|N: this can be seen by expressing it as the quotient of the horizontal tangent vectors to F at p −1 (x) ∼ = CP1 , which can be represented by Tx M , by DF , 0 −→ DF | p−1 (x) −→ Tx M −→ q ∗ N −→ 0 .

(2.15)

Since the twistor distribution DF restricted to the fibres p −1 (x) over x ∈ M is O(−1)⊕2|N, and Tx M ∼ = C4|2N, N takes the form O(1)⊕2|N as stated above. Kodaira theory in turn implies that we can reconstruct M as the moduli space of such CP1 s, and that the construction is stable under deformations of the complex structure on PT . Kodaira theory identifies the tangent bundle Tx M with the sections of the normal bundle, N ∼ = O(1)⊕2|N, and these, by an extension of Liouville’s theorem are linear functions of πα˙ , i.e., V Aα˙ πα˙ where the A index is associated to a basis of C2|N. This gives the right-chiral manifold structure on M , and it is easily seen that lines through a given point of PT correspond to an integrable (2|N) manifold that will be an integral surface of the distribution DF . Thus DF is integrable and the M is therefore conformally self-dual. Part (ii). In the self-dual Einstein case with non-vanishing cosmological constant, we may introduce a one-form of homogeneity 2 on F by ˙

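\[ \tau := \pi^{\dot\alpha}\nabla\pi_{\dot\alpha} = \pi^{\dot\alpha}\,d\pi_{\dot\alpha} - \omega_{\dot\alpha}{}^{\dot\beta}\,\pi^{\dot\alpha}\pi_{\dot\beta}, \tag{2.16} \]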

˙ where ωα˙ β is the connection one-form on S. The one-form τ automatically annihilates horizontal vectors and hence the distribution DF . The form τ descends to PT if and only if d τ is annihilated by DF also. This characterizes the self-dual Einstein equations since when C AB α˙ β˙ α˙ δ˙ = 0, as follows from the conformal self-duality condition, ˙

˙

d τ = ∇π α˙ ∧ ∇πα˙ + E B β ∧ E Aα˙ AB πα˙ πβ˙ − E B γ˙ ∧ E γA˙ R AB α˙ β˙ π α˙ π β ,

(2.17)

and this is annihilated by DF iff R AB α˙ β˙ = 0. Thus, τ descends to PT , i.e., there exists a one-form τ on PT such that τ = q ∗τ . Non-degeneracy of the contact structure is the condition that dτ is non-degenerate on the kernel D of τ , or equivalently, the condition that the three-form τ ∧ dτ should be non-degenerate in the sense that for any vector X , X (τ ∧ dτ ) = 0 ⇒ X = 0. This non-degeneracy is equivalent to the non-degeneracy of AB on H . Thus, τ defines a non-degenerate holomorphic contact structure on PT . Part (iii). In the self-dual vacuum case, we see that the connection on S is flat and a basis for S can be found so that it vanishes. In this basis, πα˙ are constant along the horizontal distribution on F , and so along the distribution (2.14). They are therefore the pullback of coordinates on PT . The condition that the connection is flat on the annihilator of P AB in H ∗ means that there are N − r covariantly constant sections esA of the odd part of H ∗ , s = r + 1, . . . , N. The forms E Aα˙ esA are therefore constant and, since the connection is torsion free, these forms are exact and equal to dθ s α˙ for some odd



coordinates $\theta^{s\dot\alpha}$. The N − r functions $\theta^s = \theta^{s\dot\alpha}\pi_{\dot\alpha}$ can be seen to be constant also along the twistor distribution (2.14). The global holomorphic coordinates $(\pi_{\dot\alpha}, \theta^s)$ define a projection PT → $CP^{1|N-r}$ as promised. We now define the Poisson structure by considering a pair of local functions f, g on PT. Pulled back to F, they satisfy

\[ \pi^{\dot\alpha} E_{A\dot\alpha} f = 0, \qquad \pi^{\dot\alpha} E_{A\dot\alpha} g = 0, \tag{2.18} \]

and this implies that

\[ E_{A\dot\alpha} f = \pi_{\dot\alpha} f_A, \qquad E_{A\dot\alpha} g = \pi_{\dot\alpha} g_A \tag{2.19} \]

for some $f_A$, $g_A$ of weight −1 in $\pi_{\dot\alpha}$ (this follows from the standard fact that $\pi^{\dot\alpha} b_{\dot\alpha} = 0 \Rightarrow b_{\dot\alpha} = b\,\pi_{\dot\alpha}$ for some b, which follows from the two-dimensionality of the spin space and the skew symmetry of $\varepsilon_{\dot\alpha\dot\beta}$). We define the Poisson bracket {f, g} of f with g to be

\[ \{f, g\} := (-)^{p_A(p_f+1)} f_A P^{AB} g_B. \tag{2.20} \]

It is clear that this has weight −2 in πα˙ , but, as given, this expression only lives on F . However, it is easily checked that, as a consequence of the covariant constancy of the P AB , it is constant along the distribution (2.14) and descends to PT .
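The spinor fact used in this step (a vanishing contraction forces proportionality) can be checked numerically. The following is a minimal sketch, assuming the index is raised with $\varepsilon^{\dot 0\dot 1} = 1$; the function names `contract` and `proportionality_factor` are hypothetical and serve only as illustration.

```python
def contract(pi, b):
    """Return pi^alpha b_alpha, raising the index with eps^{01} = -eps^{10} = 1 (assumed convention)."""
    # Under this convention pi^0 = pi_1 and pi^1 = -pi_0.
    return pi[1] * b[0] - pi[0] * b[1]


def proportionality_factor(pi, b, tol=1e-12):
    """Return c with b = c * pi when the contraction vanishes, otherwise None."""
    if abs(contract(pi, b)) > tol:
        return None
    # Divide by the larger component of pi to avoid dividing by zero.
    k = 0 if abs(pi[0]) >= abs(pi[1]) else 1
    return b[k] / pi[k]


pi = (1 + 2j, 3 - 1j)
b_parallel = tuple((2 - 1j) * component for component in pi)  # b = c * pi by construction
b_generic = (1 + 0j, 0j)                                      # not proportional to pi

print(proportionality_factor(pi, b_parallel))
print(proportionality_factor(pi, b_generic))
```

Because the spin space is two-dimensional, the contraction is a 2 × 2 determinant, which is why its vanishing is equivalent to linear dependence of the two spinors.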

See Appendix 5 for more on the non-projective formulation.

3. Twistor Actions

In order to consider actions, we must allow our fields to go off-shell, and this is most straightforwardly done in the Dolbeault setting. We can take an almost complex structure that is not necessarily integrable to be the off-shell field, and regard the integrability condition to be part of the field equations. In the following we will see that if we require the almost complex structure to be compatible with a Poisson structure or complex contact structure, then the almost complex structure can be encoded in a complex one-form h defined up to scale.

In the following, we will mostly work ‘non-projectively’, i.e., on T[N] = C4|N, or at least using homogeneous coordinates. This can also be identified as the total space of the line bundle O(−1) over PT. On this space, we have the Euler homogeneity vector field ϒ, and a canonically defined holomorphic volume form Ω (an integral form in this supersymmetric context) of weight 4 − N, the tautological form pulled back from Ber(PT) ≅ O(N − 4), satisfying $\mathcal{L}_\Upsilon \Omega = (4 - N)\,\Omega$, where $\mathcal{L}_\Upsilon$ is the Lie derivative along ϒ. Similarly, τ will be a well-defined differential one-form of weight 2. See Appendix 5 for further discussion.

3.1. Deformations of twistor space. For simplicity, we take the supertwistor space PT to be a deformation of flat twistor space PT[N] with homogeneous coordinates, as in the flat case, given by⁷

\[ Z^I = (\omega^\alpha, \theta^i, \pi_{\dot\alpha}) = (\omega^A, \pi_{\dot\alpha}) = (Z^a, \theta^i), \tag{3.1} \]

7 We could take a finite deformation of any curved integrable twistor space, but would then need more coordinate patches.



the latter form distinguishes between the odd coordinates $\theta^i$ and the even coordinates $Z^a$. We also assume that we are given an ‘infinity twistor’, a constant graded skew bi-vector

\[ I^{IJ} := \mathrm{diag}(P^{AB}, \Lambda\,\varepsilon^{\dot\alpha\dot\beta}), \tag{3.2a} \]

where

\[ P^{AB} = \mathrm{diag}(\varepsilon^{\alpha\beta}, P^{ij}) \quad\text{and}\quad P^{ij} = P^{(ij)}. \tag{3.2b} \]

When Λ = 0, we will take $P^{ij}$ to be diagonal with r ones and N − r zeroes along the diagonal. We also introduce the graded Poisson structure on homogeneous functions f and g by

\[ [f, g\} := (-)^{p_I(p_f+1)}\,(\partial_I f)\,I^{IJ}\,(\partial_J g), \tag{3.3a} \]

where we introduce the notation

\[ \partial_I := \frac{\partial}{\partial Z^I}, \quad\text{and we will also use}\quad \bar\partial_{\bar I} := \frac{\partial}{\partial \bar Z^{\bar I}}. \tag{3.3b} \]

Infinitesimally, a deformation of the almost complex structure is represented by a holomorphic tangent bundle valued (0, 1)-form j, where the deformed and undeformed anti-holomorphic exterior derivatives are related by ∂¯ = ∂¯0 + j. The first order part of the integrability condition (assuming that ∂¯02 = 0) is ∂¯0 j = 0. An infinitesimal diffeomorphism induced by the real part of a (1, 0)-vector field X gives rise to the deformation j := −∂¯0 X , so that the infinitesimal deformations of the complex structure modulo those obtained by infinitesimal diffeomorphisms define an element of the Dolbeault cohomology group H 1 (PT , T (1,0) PT). In order to impose the Einstein or vacuum conditions, we will also demand that the deformation preserves the Poisson structure = −I J I ∂ I ∧ ∂ J of weight −2. In this linearised context, we can ensure this by requiring that the deforming vector fields j preserve the Poisson structure L j = 0, where L is the Lie derivative. This will follow if j is Hamiltonian with respect to , i.e., if there exists a (0, 1)-form h of weight 2 such that j = dh = (−) p I (∂ I h)I I J ∂ J .

(3.4)

¯ we see that j is ∂((χ ¯ If h = ∂χ )) and so is pure gauge. Thus such deformations correspond to h taken to be Dolbeault representatives for elements of H 1 (PT , O(2)). The Penrose transform gives the identification between elements of H 1 (PT , O(2)) and linearised self-dual gravitational fields, [P68,P76] and in the supersymmetric case this will give the whole associated linearised gravitational supermultiplet. We now consider a finite deformation, again determined by h = d Z¯ a¯ h a¯ which, at this stage, is an arbitrary (even) smooth function of (Z I , Z¯ a¯ ) homogeneous of degree 2 ¯ in Z I and 0 in Z¯ I , holomorphic in the θ i s and satisfies Z¯ a¯ h a¯ = 0; we will never allow any dependence on the complex conjugates of the fermionic cooordinates. We then define the distribution T (0,1) PT of anti-holomorphic tangent vectors on PT by T (0,1) PT := span{ D¯ I¯ } := span ∂¯a¯ + (−) p I (∂ I h a¯ )I I J ∂ J , ∂¯i¯ . (3.5)



This is to be understood as a finite perturbation of the standard complex structure on flat supertwistor space with ∂̄-operator $\bar\partial_0 = d\bar Z^{\bar I}\,\bar\partial_{\bar I}$.⁸ The complex structure can be equivalently determined by specifying the space of (1,0)-forms

\[ \Omega^{(1,0)}PT := \mathrm{span}\{DZ^I\} := \mathrm{span}\{dZ^I + I^{IJ}\partial_J h\}. \tag{3.6} \]

The integrability condition for this distribution is

\[ I^{IJ}\partial_J\big(\bar\partial_{\bar a} h_{\bar b} - \bar\partial_{\bar b} h_{\bar a} + [h_{\bar a}, h_{\bar b}\}\big) = 0 \iff I^{IJ}\partial_J\big(\bar\partial_0 h + \tfrac{1}{2}[h, h\}\big) = 0, \tag{3.7} \]

where the wedge product in the last expression is understood. When this equation is satisfied, not only is the almost complex structure integrable, but also the Poisson bracket of two holomorphic functions is again holomorphic. In the case that = 0, when the Poisson structure is degenerate, the coordinates πα˙ and θ r +1 , . . . , θ N are holomorphic and define a projection to CP1|N−r as required for the characterization of a twistor space for a self-dual vacuum solution. Thus, in this case, Eq. (3.7) is the main field equation. In the Einstein case, we must produce a holomorphic contact structure. On the flat twistor space, introduce the contact structure τ0 = dZ I Z J I J I ,

(3.8a)

where (−) p K I I K I K J = δ I J

and

˙

I I J = diag( PAB , α˙ β ).

(3.8b)

For the Einstein case, from Thm. 2, we need to know that we have a holomorphic contact structure on the deformed space. The deformed one can be taken to be τ := D Z I Z J I J I = dZ I Z J ω J I + Z J (−) p I I J I I I K ∂ K h = τ0 + 2 h, (3.9) = δ J K

where the last equation follows from the homogeneity relation Z I ∂ I h = 2h. The con¯ = 0 ⇔ D¯ ¯ dτ = 0 is dition that ∂τ I F (0,2) := ∂¯0 h + 21 [h, h} = 0.

(3.10)

Thus, integrability of the complex structure follows from the holomorphy of the contact structure when Λ ≠ 0. (When Λ = 0, τ0 remains holomorphic trivially.) Thus, not only is (3.10) our main equation in the Einstein case, it also implies (3.7) in the other cases, and so we will focus on this as the main equation in what follows. The choice of the Poisson structure reduces the diffeomorphism freedom to (infinitesimal) Hamiltonian coordinate transformations of the form

\[ \delta Z^I = [Z^I, \chi\} \quad\text{and}\quad h \to h + \delta h, \quad\text{with}\quad \delta h = \bar\partial_0\chi + [h, \chi\}, \tag{3.11} \]

where χ is some smooth function of weight 2. Under this transformation, the ‘curvature’ $F^{(0,2)}$ behaves as $F^{(0,2)} \to F^{(0,2)} + \delta F^{(0,2)}$ with $\delta F^{(0,2)} = [F^{(0,2)}, \chi\}$. Thus, the field equation (3.10) is invariant under these transformations.

8 As in the linearised context, we eventually want to impose the Einstein condition on the space-time manifold. Therefore, we are only interested in a subclass of (finite) deformations $\bar\partial_0 \to \bar\partial_0 + j$ with j given by $j = d\bar Z^{\bar a}\, j_{\bar a}{}^{I}\,\partial_I = d\bar Z^{\bar a}\,(-)^{p_I}\,\partial_I h_{\bar a}\, I^{IJ}\partial_J$.



We can see that, at least in linear theory, h encodes a supergravity multiplet as follows. The form h may be expanded in the odd coordinates as

\[ h = h_0 + \sum_{r=1}^{N} \frac{1}{r!}\,\theta^{i_1}\cdots\theta^{i_r}\,h_{i_1\cdots i_r}. \tag{3.12} \]
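The component content of an expansion such as (3.12) can be tabulated directly. The sketch below assumes that the component $h_{i_1\cdots i_r}$, which has weight 2 − r and is antisymmetric in its r indices, corresponds under the Penrose transform to helicity (4 − r)/2; the function name `linearised_multiplet` is hypothetical.

```python
from fractions import Fraction
from math import comb


def linearised_multiplet(n_susy):
    """Return [(helicity, multiplicity)] for the components h_{i1...ir}, r = 0..n_susy."""
    spectrum = []
    for r in range(n_susy + 1):
        weight = 2 - r                      # homogeneity of h_{i1...ir}
        helicity = Fraction(weight + 2, 2)  # equals (4 - r)/2, the assumed assignment
        multiplicity = comb(n_susy, r)      # antisymmetric in r indices out of n_susy
        spectrum.append((helicity, multiplicity))
    return spectrum


for helicity, multiplicity in linearised_multiplet(8):
    print(f"helicity {helicity}: {multiplicity} states")
```

For N = 8 the multiplicities are the binomial coefficients of 8, so the total number of states is 2^8 = 256.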

If we further linearise (3.10) around the trivial solution h = 0, it tells us that $\bar\partial_0 h = 0$, or equivalently, $\bar\partial_0 h_0 = 0 = \bar\partial_0 h_{i_1\cdots i_r}$. Because of the gauge invariance (3.11), which at the linearised level reduces to $\delta h = \bar\partial_0\chi$, we see that $h_0 \in H^1(P T, O(2))$ and $h_{i_1\cdots i_r} \in H^1(P T, O(2 - r))$, where P T represents the body of the supermanifold PT (so that P T is a finite deformation of PT[0]). By virtue of the Penrose transform [P68], $h_0$ corresponds on space-time to a helicity s = 2 field while $h_{i_1\cdots i_r}$ corresponds to a helicity s = (4 − r)/2 field. Hence, for maximal N = 8 supersymmetry, we find

\[ (s_m) = \big(-2_1,\ -\tfrac{3}{2}_8,\ -1_{28},\ -\tfrac{1}{2}_{56},\ 0_{70},\ \tfrac{1}{2}_{56},\ 1_{28},\ \tfrac{3}{2}_8,\ 2_1\big), \]

which is precisely the (on-shell) spectrum of N = 8 Einstein supergravity; the subscript ‘m’ refers to the respective multiplicity. Altogether, we see that a single element h ∈ H¹(PT, O(2)) encodes the full particle content of maximally supersymmetric linearised Einstein gravity in four dimensions.

In this linearised context, it is straightforward to see how the gauging works. The bundle of R-symmetry generators on twistor space is the tangent bundle to the odd directions spanned by $\partial/\partial\theta^i$. The linearised variation in the ∂̄-operator on this bundle is $P^{ik}\,\partial^2 h/\partial\theta^j\partial\theta^k$ because the part of $\bar\partial f^i\,\partial/\partial\theta^i$ tangent to the odd directions is $(\bar\partial f^i + P^{ik}\,\partial^2 h/\partial\theta^j\partial\theta^k\, f^j)\,\partial/\partial\theta^i$. Because the $\theta^i$ anti-commute, $\partial^2 h/\partial\theta^i\partial\theta^j$ is skew symmetric in ij. Thus, in the case of non-degenerate $P^{ij}$, this gives an element of the Lie algebra of SO(N, C), and so corresponds to the maximal gauging of the R-symmetry, with gauge group SO(N, C). When $P^{ij}$ has rank r, for r < N, the gauging of the R-symmetry will be reduced to the subgroup of SO(N, C) that preserves $P^{ij}$. In Appendix A, where we compare our approach with that of [KK98], we also make some comments on the space-time fields in the non-linearised setting for zero cosmological constant.

3.2. Action functionals.
We will be interested in integrating Lagrangian densities over twistor space for which we will need the holomorphic volume integral form

\[ \Omega_N = D(DZ^I) = \frac{1}{4!}\,\varepsilon_{abcd}\,Z^a\,DZ^b \wedge DZ^c \wedge DZ^d \otimes D\theta^1 \cdots D\theta^N, \tag{3.13} \]

which has weight 4 − N on account of the Berezinian integration rule $\int d\theta^i\,\theta^j = \delta^{ij}$ implying $d(\lambda\theta^i) = \lambda^{-1}d\theta^i$ for λ ∈ C∗. Here, we use Manin’s notation [M88] to denote integral forms associated with a given basis of differential one-forms. We will not integrate over any complex conjugated odd coordinates. For maximal supersymmetry, N = 8, we can write down an action functional reproducing the field equations (3.10) and hence also (3.7),

\[ S[h] = \int \Omega_8 \wedge \big(h \wedge \bar\partial_0 h + \tfrac{1}{3}\,h \wedge [h, h\}\big) = \int \Omega_8^{(0)} \wedge \big(h \wedge \bar\partial_0 h + \tfrac{1}{3}\,h \wedge [h, h\}\big), \tag{3.14} \]

where the integral form

\[ \Omega_8^{(0)} = D(dZ^I) = \frac{1}{4!}\,\varepsilon_{abcd}\,Z^a\,dZ^b \wedge dZ^c \wedge dZ^d \otimes d\theta^1 \cdots d\theta^8. \tag{3.15} \]
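The weights quoted in this section (a one-form h of weight 2, a Poisson bracket of weight −2, and a volume integral form of weight 4 − N) can be combined to see for which N an action of this cubic type is homogeneous of degree zero. The following is a minimal sketch of that arithmetic; the function name `action_weights` is hypothetical, and $\bar\partial_0$ is taken to carry weight 0.

```python
def action_weights(n_susy):
    """Total homogeneity of the two terms of the action density for a given N."""
    w_h = 2                 # the one-form h
    w_bracket = -2          # the Poisson bracket [.,.}
    w_volume = 4 - n_susy   # the holomorphic volume integral form
    kinetic = w_volume + 2 * w_h             # volume ^ h ^ dbar_0 h
    cubic = w_volume + 3 * w_h + w_bracket   # volume ^ h ^ [h, h}
    return kinetic, cubic


# Values of N for which both terms have total weight zero.
balanced = [n for n in range(0, 17) if action_weights(n) == (0, 0)]
print(balanced)  # prints [8]
```

Both terms have total weight 8 − N, so the integrand descends to the projective space only for N = 8.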

It can be seen that the weights balance as h has weight 2, [·, ·} weight −2 and $\Omega_8$ (respectively, $\Omega_8^{(0)}$) has weight −4. This is the only value of N for which there is such a balance. The action (3.14) is invariant under (3.11). This follows from the Bianchi identity for $F^{(0,2)}$,

\[ \bar\partial_0 F^{(0,2)} + [h, F^{(0,2)}\} = 0, \tag{3.16} \]

implied by the (graded) Jacobi identity for the Poisson structure. It is clear that the almost complex structure, integrability conditions and action formulation (the latter for N = 8) only depend on the Poisson structure $I^{IJ}$ and not on $I_{IJ}$ directly. It is also clear that if $I^{IJ}$ is degenerate, the above field equations and action (the latter for N = 8) all make good sense, although the action most directly yields (3.10) rather than the superficially weaker Eq. (3.7), which is sufficient to determine the relevant structures on the deformed twistor space.

The action (3.14) can be compared with the Kodaira-Spencer actions introduced in [BCOV94], the compendium of topological M-theory related actions in [DGNV05] and the Lagrange multiplier-type action involving the Nijenhuis tensor given in [BW04] in the N = 4 case. Our action is local, in contrast with the non-local Kodaira-Spencer action. Our action is given for a non-Calabi-Yau space (due to the isomorphism (B.5), the holomorphic Berezinian is only trivial when N = 4). Ours is most closely related to that in Berkovits & Witten, although our basic variable is the one-form h, which is a “potential” for the deformation j considered in deformation theory (i.e. j is a holomorphic derivative of h), and the action is most naturally expressed for N = 8 rather than N = 4.

We close this subsection by discussing the cases with N < 8 supersymmetries. We start from the action (3.14) with N = 8 but restrict the dependence of h on $\theta^i$ by requiring invariance under an SO(8 − N, C) subgroup of the R-symmetry. Thus, we set

\[ h = f + \theta^{N+1}\cdots\theta^{8}\, b, \tag{3.17} \]

where f and b are now one-forms depending on the bosonic twistor coordinates and $\theta^1, \ldots, \theta^N$, f has weight 2, and b has weight N − 6. We can now integrate out the anti-commuting variables $\theta^{N+1}, \ldots, \theta^8$ and integrate by parts to obtain the action

\[ S[b, f] = \int \Omega_N \wedge b \wedge \big(\bar\partial_0 f + \tfrac{1}{2}[f, f\}\big). \tag{3.18} \]

This action is now of ‘BF’ form where b acts as a Lagrange multiplier for the field equation

\[ \bar\partial_0 f + \tfrac{1}{2}[f, f\} = 0, \tag{3.19} \]

which, as we have seen, implies that integrability of the complex structure is compatible with a holomorphic Poisson structure. Varying f yields the equation

\[ \bar\partial_f b = 0 \tag{3.20} \]


and, together with the gauge freedom $b \to b + \bar\partial_f\chi$, this implies that b defines an element of the cohomology group H¹(PT, O(N − 6)) and so is the Penrose transform of a superfield of helicity −2 + N/2.

4. Covariant Approach, Covariant Action for N = 0 and Special Geometry

The above actions are non-covariant in the sense that they explicitly depend on the chosen background one has started with so that diffeomorphism invariance is broken. This is normal in the context of Chern-Simons actions for which a frame of the Yang-Mills bundle must be chosen. Nevertheless, we will see that at least for τ non-degenerate and N = 0 we can give a covariant version.

The geometric structure we are concerned with here is closely related to a (real) six-dimensional special geometry introduced by [CE03]. In their geometry, a real rank-4 distribution (subbundle of the tangent bundle) D is introduced and, if suitably non-degenerate and satisfying a positivity condition, it is shown that there is a canonically defined almost complex structure J for which the distribution is an almost complex contact distribution. Furthermore, the obstruction to the integrability of J is identified.

Our situation is somewhat different in that the primary structure on a smooth manifold, P, is a complex one-form τ defined up to complex rescalings (or more abstractly, a complex line bundle L∗ ⊂ CT∗P := C ⊗ T∗P). This is more information in the sense that D is defined directly as the kernel of τ, but τ is only defined by D up to τ → aτ + bτ̄, where a, b are complex valued functions on P. Given D, there is a unique choice of τ that is compatible with the Cap-Eastwood almost complex structure but a priori, one does not know if that is the τ that has been chosen. Our analogue of the Cap-Eastwood theorem works in higher dimensions also and we state it in greater generality than we need.

Theorem 3. Suppose that on a (smooth) manifold P of dimension 4n + 2 we are given a complex line subbundle L∗ ⊂ CT∗P, represented by a complex one-form τ defined up to complex rescalings. Suppose further that

\[ \tau \wedge (d\tau)^{n+1} = 0 \quad\text{and}\quad \tau \wedge (d\tau)^{n} \wedge \bar\tau \wedge (d\bar\tau)^{n} \neq 0, \]

then there is a unique integrable almost complex structure for which τ is proportional to a non-degenerate holomorphic contact structure. Here, $(d\tau)^n := d\tau \wedge \cdots \wedge d\tau$ (n times).

Proof. We claim that, with the assumptions above, the (2n + 1)-form $\tau \wedge (d\tau)^n$ is simple, i.e., that the space of vectors X ∈ Γ(P, CTP) such that X ⌟ (τ ∧ dτ) = 0 is (2n + 1)-dimensional. This follows because the kernel of τ is (4n + 1)-dimensional, whereas dτ defines a skew form on this kernel and so must have even rank. However, its rank is less than 2n + 2 by $\tau \wedge (d\tau)^{n+1} = 0$ but greater than or equal to 2n because $\tau \wedge (d\tau)^n \neq 0$. Hence, the kernel of $\tau \wedge (d\tau)^n$ is (2n + 1)-dimensional and we will take this kernel to be the space of anti-holomorphic tangent vectors spanning $T^{(0,1)}P$. The condition that $T^{(0,1)}P$ should contain no real vectors follows from the second assumption of the theorem. We have that $X ⌟ (\tau \wedge (d\tau)^n) = 0 \Leftrightarrow X ⌟ (\tau \wedge d\tau) = 0$ and we will use this latter characterisation of $T^{(0,1)}P$ in the following.

We now consider the integrability of the distribution. Let X and Y satisfy

\[ X \lrcorner (\tau \wedge d\tau) = 0 = Y \lrcorner (\tau \wedge d\tau). \tag{4.1} \]

Then clearly

\[ X \lrcorner \tau = 0 = Y \lrcorner \tau \quad\text{and}\quad \tau \wedge (X \lrcorner d\tau) = 0, \tag{4.2} \]

so that X ⌟ dτ ∝ τ and $\mathcal{L}_X\tau \propto \tau$, and similarly for Y. Here, $\mathcal{L}_X$ denotes the Lie derivative along X. Thus,

\[ [X, Y] \lrcorner \tau = X(Y \lrcorner \tau) - Y(X \lrcorner \tau) - X \lrcorner (Y \lrcorner d\tau) = 0, \tag{4.3} \]

since X ⌟ τ = 0 = Y ⌟ τ by assumption and so X ⌟ (Y ⌟ dτ) = 0 from above. Furthermore,

\[ \begin{aligned} [X, Y] \lrcorner (\tau \wedge d\tau) &= -\tau \wedge ([X, Y] \lrcorner d\tau) \\ &= -\tau \wedge \big([X, Y] \lrcorner d\tau + d([X, Y] \lrcorner \tau)\big) = -\tau \wedge (\mathcal{L}_{[X,Y]}\tau) \\ &= -\tau \wedge (\mathcal{L}_X\mathcal{L}_Y\tau - \mathcal{L}_Y\mathcal{L}_X\tau) = 0, \end{aligned} \tag{4.4} \]

since $\mathcal{L}_X\tau = X \lrcorner d\tau \propto \tau$, so $\mathcal{L}_X\mathcal{L}_Y\tau \propto \tau$. Thus, the almost complex structure is integrable.
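The dimension count at the start of the proof can be written out as a small piece of arithmetic. This is only a bookkeeping sketch under the hypotheses of the theorem (an even rank bounded below by 2n and strictly below 2n + 2); the function name `contact_dimensions` is hypothetical and all dimensions are complex dimensions in the complexified tangent space.

```python
def contact_dimensions(n):
    """Dimension bookkeeping for a one-form tau on a (4n+2)-manifold, as in the proof."""
    dim_p = 4 * n + 2
    dim_ker_tau = dim_p - 1
    # d tau restricted to ker(tau) is skew, so its rank is even;
    # the two hypotheses bound it by 2n <= rank < 2n + 2.
    ranks = [r for r in range(0, dim_ker_tau + 1, 2) if 2 * n <= r < 2 * n + 2]
    assert len(ranks) == 1  # the rank is forced to be exactly 2n
    rank = ranks[0]
    dim_antiholomorphic = dim_ker_tau - rank
    return dim_p, dim_ker_tau, rank, dim_antiholomorphic


for n in range(1, 4):
    print(n, contact_dimensions(n))
```

In each case the candidate space of anti-holomorphic vectors has dimension 2n + 1, which is half of 4n + 2, as required for an almost complex structure.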

In the twistor context, we will take P to be a six-dimensional manifold with topology U × S² with U ⊂ R⁴ and, as before, we shall denote it by P T. With this theorem, then, our data is simply a complex line subbundle L∗ ⊂ CT∗P T represented by a differential one-form τ with values in L subject to the open condition τ ∧ dτ ∧ τ̄ ∧ dτ̄ ≠ 0. We will also require that the line bundle L has Chern class 2. The field equation is τ ∧ (dτ)² = 0. The N = 0 action above is simply

\[ S[b, \tau] = \int b \wedge \tau \wedge (d\tau)^2, \tag{4.5} \]

where b ∈ Ω¹P T ⊗ (L∗)³ is a Lagrange multiplier. Clearly, the field equation obtained by varying b is τ ∧ (dτ)² = 0, as desired. The action is clearly diffeomorphism invariant, and enjoys a gauge invariance given by τ → χτ and b → χ⁻³b, where χ is a non-vanishing complex-valued function on P T. This gauge freedom corresponds to the fact that τ takes values in a line bundle L which we shall also denote by O(2) since it becomes that on-shell, and hence b is a differential one-form with values in O(−6). The action is also invariant under b → b + γ, where γ ∧ τ ∧ (dτ)² = 0, and the space of such γ is two-dimensional when the field equations are not satisfied, but three-dimensional when they are. (When they are satisfied, this freedom can be used to ensure that b is a (0, 1)-form.)

There is also a gauge freedom in b obtained as follows. We can define a partial connection ∂̄ on O(n) by defining, for χ now assumed to be a section of O(−6), ∂̄χ to be the differential one-form modulo the kernel of ∂̄χ → ∂̄χ ∧ τ ∧ (dτ)² defined by ∂̄χ ∧ τ ∧ (dτ)² := d(χ τ ∧ (dτ)²). It is clear from this definition that the integrand of the action evaluated on such a b = ∂̄χ is a boundary integral and so this represents a gauge freedom. On-shell, the above definition becomes trivial, and ∂̄χ needs to be defined a little differently by $\bar\partial\chi^{2/3} \wedge (\tau \wedge d\tau) := d(\chi^{2/3}\,\tau \wedge d\tau)$, and in this case it leads to an honest ∂̄-operator on the line bundles O(n). The field equation for b is

\[ db \wedge \tau \wedge d\tau - \tfrac{3}{2}\, b \wedge (d\tau)^2 = 0 \tag{4.6} \]

and when the field equation for τ is satisfied, this is the ∂̄-closure condition for sections of $\Omega^{(0,1)}P T \otimes O(-6)$. Taking into account the gauge freedom b → b + ∂̄χ with χ a section of O(−6), b will correspond to an element of H¹(P T, O(−6)).

Thus, solutions to the field equations correspond to a complex three-dimensional manifold P T with holomorphic contact structure τ, and the condition on the Chern class of L implies that it satisfies the topological assumption of Ward’s theorem, so that, if it contains a holomorphic rational curve of degree one in the S²-factor, then it corresponds to a space-time M with self-dual Einstein metric. The field b ∈ H¹(P T, O(−6)) then corresponds via the Penrose transform to a right-handed linearised gravitational field propagating on that self-dual background. Thus, we have the self-dual sector of non-supersymmetric Einstein gravity.

4.1. The supersymmetric case. In the supersymmetric situation, we will assume that PT is a smooth supermanifold with six real bosonic dimensions and N complex fermionic dimensions. Without loss of generality, we can always assume that the supermanifold is split in the smooth category [B79], and that locally the odd coordinates are $\theta^i$, i = 1, . . . , N, and that we will only ever have holomorphic dependence on $\theta^i$; their complex conjugates will not enter the formalism, so, in particular, the transition functions for the supermanifold will be holomorphic in $\theta^i$.⁹ We can still encode the structure of a supersymmetric non-linear graviton into a complex contact form τ as follows. We will assume that τ is a complex differential one-form on the supermanifold PT, again with only holomorphic dependence on the $\theta^i$, i.e., $\tau = dx^a\,\tau_a + d\theta^i\,\tau_i$, where the $x^a$ are the real bosonic coordinates on PT, a = 1, . . . , 6, and $\tau_a$ and $\tau_i$ are holomorphic in $\theta^i$ with $\tau_i$ odd and $\tau_a$ even functions on PT.
On the body of the supermanifold, $\theta^i = 0$, we can assume that we have the equations τ ∧ (dτ)² = 0 as before, but these will not hold when $\theta^i \neq 0$, even for standard flat supertwistor space as, in general, $(d\theta)^n \neq 0$ ∀ n for an odd variable θ. Thus, we cannot express the conditions we need quite so simply in the supersymmetric case. Nevertheless, much of Thm. 3 works in the supersymmetric case also. We will require firstly, as a genericity assumption, that the complexified kernel CD of τ has dimension 5|2N (here we are taking $\partial/\partial\theta^i$ and $\partial/\partial\bar\theta^{\bar i}$ to be independent). Secondly, we require that on this complexified kernel of τ, the two-form dτ has rank 2|N so that the kernel of τ ∧ dτ is 3|N-dimensional and further, that ker(τ ∧ dτ) has no real vectors, i.e.

\[ \ker(\tau \wedge d\tau) \cap \overline{\ker(\tau \wedge d\tau)} = \{0\}. \tag{4.7} \]

The fact that we have required that τ depends only on $\theta^i$ and not $\bar\theta^{\bar i}$ means that dτ annihilates $\partial/\partial\bar\theta^{\bar i}$, for i = 1, . . . , N, and so the rank of dτ is at most 5|N in any case. With these assumptions, the proof of Thm. 3 follows without modification to show that ker(τ ∧ dτ) is integrable and that τ is a holomorphic complex contact structure so that

\[ T^{(0,1)}PT := \ker(\tau \wedge d\tau). \tag{4.8} \]

The main field equation is therefore the condition that τ ∧ dτ annihilates a complex distribution of dimension 3|N. In the supersymmetric context, we do not yet have an equation on τ analogous to the bosonic equation $\tau \wedge (d\tau)^{n+1} = 0$ for higher dimensional complex contact structures nor an action that produces this condition as its Euler-Lagrange equation. As a consequence, we have so far been unable to find a covariant supersymmetric action functional.

9 In a Dolbeault context, this assumption is, in effect, a gauge choice.



5. Conclusions Given that these actions are ‘Chern-Simons-like’ one is led to ask the extent to which they can be interpreted coherently as holomorphic Chern-Simons theories. Clearly, in some sense, the gauge group should be taken to be the diffeomorphisms of the supertwistor space that preserve the holomorphic Poisson structure. This is most easily made sense of in a complexified context so that the holomorphic twistor variables are freed up and become independent from the conjugate twistor variables. Then the theory becomes a complexified Chern-Simons theory with gauge group the holomorphic contact transformations of the holomorphic supertwistor space, a region in CP3|8 , on the conjugate supertwistor space (which is just CP3 as we have no anti-holomorphic fermionic coordinates). A similar connection between the self-dual vacuum equations and a gauge theory with a diffeomorphism group gauge group was given on space-time in [MN89] (here the gauge theory was the self-dual Yang-Mills equations); see also [W07] for a supersymmetric extension thereof. The fact that Thm. 3 works in 4n + 2 dimensions is suggestive of applications of this framework to the twistor theory for quaternionic Kähler manifolds with non-zero scalar curvature in 4n dimensions. It is straightforward to write down a Lagrange multiplier action b ∧ τ ∧ (dτ )n+1 analogous to our N = 0 action, but with b a (2n − 1)-form, although in this context the interpretation of b is less clear. An attractive feature is that we have a fully supersymmetrically invariant and Lorentz invariant off-shell formulation of the theory. However, we have so far been unable to find an action functional of N = 8 self-dual supergravity that does not depend on a given integrable background. Such an action functional would, however, be desirable as one would hope for an explicitly diffeomorphism invariant action principle for N = 8 self-dual supergravity. 
In particular, if one wishes to be able to extend the ideas to the full theory along the lines of [M05] for conformal supergravity,10 then it would seem awkward to have to identify a Minkowski background. A task for the future is to start with the superfield expansions (in the non-linear setting) of τ and h and reproduce the covariant form of the field equations and of the action functional of N = 8 self-dual supergravity in four dimensions as given in [S92].11 In the zero cosmological constant case, our twistor action and field equations must correspond via the Penrose transform to Siegel’s results. Acknowledgements. We would like to thank Alexander Popov for a number of important contributions to this work and Mohab Abou-Zeid, Rutger Boels, Daniel Fox, Chris Hull, Riccardo Ricci, Christian Sämann and David Skinner for useful discussions. We would also like to thank the referee for useful suggestions. The first author is partially supported by the EU through the FP6 Marie Curie RTN ENIGMA (contract number MRTN– CT–2004–5652) and through the ESF MISGAM network. The second author was supported in part by the EU under the MRTN contract MRTN–CT–2004–005104 and by STFC under the rolling grant PP/D0744X/1.

Appendix A. Prepotential Formulation The subject of this appendix is the comparison of [KK98] approach with ours. Their formulation is based on an anti-holomorphic involution which picks a real slice in complexified space-time being of split signature. Pretty much the same holds true, however, 10 See also [A-ZH06] for a space-time action for expanding about the self-dual sector in the case of Einstein gravity. 11 Similar expansions for certain supersymmetric gauge theories were performed in [PW04,PS05], Sämann (2005), [PSW05 and LS06].



for Euclidean signature and it is this latter case we are interested in here. As already indicated, this works only for an even number of supersymmetries. In the following, we shall use conventions from [W06].

A.1. Real structures on PT[N] and M[N]. Let us first consider the supertwistor space PT[N] = CP3|N \ CP1|N with (homogeneous) coordinates $(\omega^A, \pi_{\dot\alpha})$ for flat super space-time M[N] ≅ C4|2N. A Euclidean signature real slice follows from the anti-holomorphic involution without fixed points ρ : PT[N] → PT[N] given by

\[ (\hat\omega^A, \hat\pi_{\dot\alpha}) := \rho(\omega^A, \pi_{\dot\alpha}) := (\bar\omega^B C_B{}^A,\ C_{\dot\alpha}{}^{\dot\beta}\,\bar\pi_{\dot\beta}), \tag{A.1} \]

where bar denotes complex conjugation and $(C_A{}^B) = \mathrm{diag}((C_\alpha{}^\beta), (C_i{}^j))$, with

\[ (C_\alpha{}^\beta) = \epsilon, \quad (C_i{}^j) = \mathrm{diag}(\underbrace{\epsilon, \ldots, \epsilon}_{N/2\ \text{times}}), \quad (C_{\dot\alpha}{}^{\dot\beta}) = -\epsilon, \quad \epsilon := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{A.2} \]

We can extend ρ to a map from a holomorphic function f on PT[N] to another holomorphic function by

\[ \rho(f(\cdots)) := \overline{f(\rho(\cdots))}. \tag{A.3} \]

By virtue of the incidence relation, $\omega^A = x^{A\dot\alpha}\pi_{\dot\alpha}$, we obtain an induced involution on M[N] explicitly given by

\[ \rho(x^{A\dot\alpha}) = -\bar x^{B\dot\beta}\, C_B{}^A\, C_{\dot\beta}{}^{\dot\alpha}. \tag{A.4} \]

We shall use the same notation ρ for the anti-holomorphic involution induced on the different (super)manifolds in the twistor correspondence. The fixed point set of this involution, that is, ρ(x) = x for x ∈ M[N], defines Euclidean right-chiral superspace $M^\rho_{[N]} \cong R^{4|2N}$ inside M[N]. Following [AHS78], the supertwistor space PT[N] can be identified with

\[ O(1)^{\oplus 2|N} \to CP^1 \tag{A.5} \]

and so it can be covered by two (acyclic) coordinate patches $U_\pm$ and coordinatised by $(\omega^A_\pm, \pi_\pm)$, where $\omega^A_\pm$ are local fibre coordinates with $\omega^A_+ := \omega^A/\pi_{\dot 0}$, $\omega^A_- := \omega^A/\pi_{\dot 1}$ and $\pi_+ := \pi_{\dot 1}/\pi_{\dot 0}$, $\pi_- := \pi_{\dot 0}/\pi_{\dot 1}$ are the standard local holomorphic coordinates on CP1, with $\pi_+ = \pi_-^{-1}$ on $U_+ \cap U_-$ ⊂ PT[N]. On the other hand, since PT[N] is diffeomorphic to $M^\rho_{[N]} \times S^2 \cong R^{4|2N} \times S^2$, one may equivalently coordinatise it by using $(x^{A\dot\alpha}, \lambda_\pm)$, where $\lambda_\pm$ are the standard local holomorphic coordinates on S² ≅ CP1. Note that $(\omega^A_\pm, \pi_\pm) = (x^{A\dot\alpha}\lambda^\pm_{\dot\alpha}, \lambda_\pm)$, where

\[ (\lambda^{\dot\alpha}_+) := \begin{pmatrix} \lambda_+ \\ -1 \end{pmatrix} \quad\text{and}\quad (\lambda^{\dot\alpha}_-) := \begin{pmatrix} 1 \\ -\lambda_- \end{pmatrix}. \tag{A.6} \]


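The explicit inverse transformation laws are simply

\[ x^{A\dot\alpha} = \frac{\omega_\pm^A \hat\pi_\pm^{\dot\alpha} - \hat\omega_\pm^A \pi_\pm^{\dot\alpha}}{\hat\pi_\pm^{\dot\beta}\pi^\pm_{\dot\beta}}, \tag{A.7} \]

where $\pi_\pm^{\dot\alpha}$ are similarly defined as in (A.6). Altogether, we have obtained a non-holomorphic fibration

\[ \pi : PT_{[N]} \to M^\rho_{[N]}. \tag{A.8} \]

Introduce

\[ (\hat\lambda_+^{\dot\alpha}) := \begin{pmatrix} 1 \\ \bar\lambda_+ \end{pmatrix}, \quad (\hat\lambda_-^{\dot\alpha}) := \begin{pmatrix} \bar\lambda_- \\ 1 \end{pmatrix}, \quad \gamma_\pm^{-1} := \hat\lambda_\pm^{\dot\alpha}\lambda^\pm_{\dot\alpha} = 1 + \lambda_\pm\bar\lambda_\pm, \tag{A.9} \]

like for $\hat\pi_\pm^{\dot\alpha} = \rho(\pi_\pm^{\dot\alpha})$. Then, due to the above diffeomorphism, we have the following transformation laws between the coordinate vector fields:

\[ \frac{\partial}{\partial\omega_\pm^A} = \gamma_\pm \hat\lambda_\pm^{\dot\alpha} \frac{\partial}{\partial x^{A\dot\alpha}}, \tag{A.10a} \]
\[ \frac{\partial}{\partial\pi_+} = \frac{\partial}{\partial\lambda_+} - \gamma_+ x^{A\dot 1} \hat\lambda_+^{\dot\alpha} \frac{\partial}{\partial x^{A\dot\alpha}}, \tag{A.10b} \]
\[ \frac{\partial}{\partial\pi_-} = \frac{\partial}{\partial\lambda_-} - \gamma_- x^{A\dot 0} \hat\lambda_-^{\dot\alpha} \frac{\partial}{\partial x^{A\dot\alpha}} \tag{A.10c} \]

for the holomorphic tangent vector fields and

\[ \frac{\partial}{\partial\bar\omega_\pm^A} = -\gamma_\pm C_A{}^B \lambda_\pm^{\dot\alpha} \frac{\partial}{\partial x^{B\dot\alpha}}, \tag{A.10d} \]
\[ \frac{\partial}{\partial\bar\pi_+} = \frac{\partial}{\partial\bar\lambda_+} - \gamma_+ x^{A\dot 0} \lambda_+^{\dot\alpha} \frac{\partial}{\partial x^{A\dot\alpha}}, \tag{A.10e} \]
\[ \frac{\partial}{\partial\bar\pi_-} = \frac{\partial}{\partial\bar\lambda_-} + \gamma_- x^{A\dot 1} \lambda_-^{\dot\alpha} \frac{\partial}{\partial x^{A\dot\alpha}} \tag{A.10f} \]

for the anti-holomorphic ones.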
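As a consistency check, the relation (A.10b) can be recovered from (A.10a) by the chain rule on $U_+$. The sketch below assumes the component form $\omega_+^A = x^{A\dot 0} + x^{A\dot 1}\lambda_+$ and $\pi_+ = \lambda_+$, which is our reading of the incidence relation together with (A.6), not a formula quoted from the text; the conjugate coordinates do not depend on $\lambda_+$ and so do not contribute.

```latex
% Assumed component form on U_+ :
%   \omega_+^A = x^{A\dot 0} + x^{A\dot 1}\lambda_+ , \qquad \pi_+ = \lambda_+ .
\begin{align*}
\frac{\partial}{\partial\lambda_+}\bigg|_{x}
  &= \frac{\partial\pi_+}{\partial\lambda_+}\,\frac{\partial}{\partial\pi_+}
   + \frac{\partial\omega_+^A}{\partial\lambda_+}\,\frac{\partial}{\partial\omega_+^A}
   = \frac{\partial}{\partial\pi_+} + x^{A\dot 1}\,\frac{\partial}{\partial\omega_+^A}, \\
\frac{\partial}{\partial\pi_+}
  &= \frac{\partial}{\partial\lambda_+} - x^{A\dot 1}\,\frac{\partial}{\partial\omega_+^A}
   = \frac{\partial}{\partial\lambda_+}
   - \gamma_+\, x^{A\dot 1}\,\hat\lambda_+^{\dot\alpha}\,\frac{\partial}{\partial x^{A\dot\alpha}}
   \qquad \text{by (A.10a).}
\end{align*}
```

The same computation on $U_-$, with $\omega_-^A = x^{A\dot 0}\lambda_- + x^{A\dot 1}$ under the same assumption, gives $x^{A\dot 0}$ in place of $x^{A\dot 1}$.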

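A.2. Comparison of the two approaches. In what follows, we shall restrict our discussion to the $U_+$-patch only and for notational simplicity suppress the patch index. Of course, a similar discussion carries over to the $U_-$-patch. To begin with, let us write down the field Eqs. (3.10) more explicitly. If we let the deformation be $h = d\bar\omega^{\bar\alpha} h_{\bar\alpha} + d\bar\pi\, h_{\bar\pi}$, they read as

\[ \frac{\partial}{\partial\bar\omega^{\bar\alpha}} h_{\bar\beta} - \frac{\partial}{\partial\bar\omega^{\bar\beta}} h_{\bar\alpha} + [h_{\bar\alpha}, h_{\bar\beta}\} = 0, \tag{A.11a} \]
\[ \frac{\partial}{\partial\bar\pi} h_{\bar\alpha} - \frac{\partial}{\partial\bar\omega^{\bar\alpha}} h_{\bar\pi} + [h_{\bar\pi}, h_{\bar\alpha}\} = 0. \tag{A.11b} \]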



Using the incidence relation ω A = x Aα˙ πα˙ and the involutions introduced in the preceding subsection, h can also be expressed in the coordinates (x Aα˙ , λ) as ˙

h = −γ λˆ β˙ dx α β α + dλ¯ λ¯ ,

(A.12)

˙

where α := −γ −1 Cα β h β¯ and λ¯ := h π¯ + γ x α 0 α . In order to compare our approach with those by [KK98], we notice that their formulation deals with the ‘vacuum case’, i.e. with the case of vanishing cosmological constant. Upon also recalling point (iii) of Thm. 2, we must therefore ensure that the fibration of the supertwistor space is preserved, and so (i) h is of the form ˙

h = −γ λˆ β˙ dx α β α ,

(A.13)

˙

i.e. λ¯ = 0 ⇔ h π¯ = −γ x α 0 α and (ii) the relative symplectic structure needs to be preserved which amounts to requiring a degeneracy of the Poisson structure ω = (I I J ) introduced in Sect. 3.1 according to ω = (I AB ). Notice further that α must be of weight 3 in order for h to be of weight 2. Some algebra then reveals that in the ‘vacuum case’ the above equations for h α¯ and h π¯ translate into the following set:

αβ ∂¯α β + 21 αβ [α , β } = 0, ∂λ¯ α + γ

−2 βγ

(∂β α )γ = 0,

(A.14a) (A.14b)

where ∂¯ A := λα˙ ∂/∂ x Aα˙ and ∂ A := γ λˆ α˙ ∂/∂ x Aα˙ . Before going any further, let us say a few words about gauge symmetries. The original equations for h transformed covariantly under gauge transformations of the form h → h + δh, with δh = ∂¯0 χ + [h, χ } for some function χ of weight 2. However, the above equations will no longer transform covariantly under generic gauge transformations, since we have incorporated the constraint λ¯ = 0. Nevertheless, some residual gauge symmetry remains, and which is determined as follows. In order to preserve the con˙ straint λ¯ = 0, we must have δh π¯ = −γ x α 0 δα , where δα = −γ −1 Cα β δh β¯ , i.e. transformations of h π¯ are determined by those of h α¯ . It is not difficult to verify that the remaining gauge symmetry is given by the following transformation laws: δα = −(∂¯α χ + [α , χ }),

with

∂λ¯ χ + γ −2 βγ (∂β χ )γ = 0. (A.15)

In particular, the last of these equations shows that the 2nd equation for α from above does not constrain α any further, so that the only remaining field equation we are left with is

αβ ∂¯α β + 21 αβ [α , β } = 0.

(A.16)

Since in particular α = ∂α (see also [W85]), where is some function of weight 4 (recall that α is of weight 3) and ω = (I AB ), we end up with + 21 αβ (−) p A ∂ A ∂α I AB ∂ B ∂β = 0 which is [KK98] result.

and

:= αβ ∂¯α ∂β , (A.17)



As before, in the case of maximal supersymmetry, N = 8, the field Eqs. (A.17) can be derived from an action principle, S[] = d vol + 3!1 αβ (−) p A ∂ A ∂α I AB ∂ B ∂β , (A.18a) where the measure d vol is given by d vol = d4 x γ 2 dλdλ¯ dθ 1 · · · dθ 8 .

(A.18b)

It remains to give the superfield expansion of . For brevity, let us only discuss the N = 8 case. We find = g + θ i ψi + θ i1 i2 A[i1 i2 ] + θ i1 i2 i3 χi1 i2 i3 + θ i1 i2 i3 i4 φi1 i2 i3 i4 + θi1 i2 i3 χ˜ i1 i2 i3 + θi1 i2 A˜ i1 i2 + θi ψ˜ i + θ g, ˜

(A.19)

where

\[ \theta^{i_1\cdots i_r} := \frac{1}{r!}\,\theta^{i_1}\cdots\theta^{i_r}, \quad\text{for } r = 1, \ldots, 4, \tag{A.20a} \]
\[ \theta_{i_1\cdots i_{8-r}} := \frac{1}{r!}\,\epsilon_{i_1\cdots i_{8-r} i_{9-r}\cdots i_8}\,\theta^{i_{9-r}}\cdots\theta^{i_8}, \quad\text{for } r = 5, \ldots, 8. \tag{A.20b} \]

Here, $\epsilon_{i_1\cdots i_8} = \epsilon_{[i_1\cdots i_8]}$ and $\epsilon_{1\cdots 8} = 1$. Keeping in mind (A.13), we find the following space-time fields:

Table 1. Space-time fields and their helicities and multiplicities

Field          g     ψ     A     χ     φ     χ̃      Ã     ψ̃      g̃
Helicity       2     3/2   1     1/2   0     −1/2   −1    −3/2   −2
Multiplicity   1     8     28    56    70    56     28    8      1
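The multiplicities in Table 1 are the binomial coefficients $\binom{8}{r}$, counting the antisymmetric index sets carried by the component at order $r$ in $\theta$. The sketch below reproduces the table under the assumption, suggested by the table rather than derived here, that the helicity drops by $1/2$ with each power of $\theta$, starting from $2$. The function name `multiplet` is illustrative.

```python
from fractions import Fraction
from math import comb

N = 8  # maximal supersymmetry, as in the text

def multiplet(n_susy=N):
    """Return (helicity, multiplicity) pairs for the component at order r in theta.

    Assumption: helicity = 2 - r/2 and multiplicity = C(n_susy, r).
    """
    return [(Fraction(2) - Fraction(r, 2), comb(n_susy, r)) for r in range(n_susy + 1)]

rows = multiplet()
for helicity, mult in rows:
    print(f"helicity {str(helicity):>4}   multiplicity {mult}")

total_states = sum(mult for _, mult in rows)
print("total number of component fields:", total_states)
```

The total is $2^8$, as expected for a superfield expansion in eight odd coordinates.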

Appendix B. Holomorphic Volume Forms and Non-Projective Twistor Space

It is often convenient to work on the non-projective twistor space $T$ as many of the geometric structures can be formulated globally there and sections of the line bundles $\mathcal{O}(n)$ become ordinary functions of weight $n$ under the action of the Euler vector field $\Upsilon = Z^I\partial/\partial Z^I$. In the curved case, as in the proof of Theorem 2, the non-projective space can be defined as the quotient of the non-projective co-spin bundle $S^*$ by $D_F$. We can also define it intrinsically as follows. In the bosonic case, given a contact structure defined by a one-form $\tau$ with values in a line bundle $L$, we can see that $\tau \wedge d\tau$ defines a (non-vanishing) section of $\Omega^{(3,0)}PT \otimes L^2$. Thus, we must have $L^{-2} \cong \Omega^{(3,0)}PT$. In the flat case, non-projective twistor space $T_{[0]} \cong \mathbb{C}^4$ is the total space of the (tautological) line bundle $\mathcal{O}(-1)$ over the projective twistor space $PT_{[0]}$, and $\Omega^{(3,0)}PT_{[0]} \cong \mathcal{O}(-4)$. In the general (non-supersymmetric) case, we can define the non-projective twistor space $T$ to be the total space of the line bundle $\mathcal{O}(-1)$, now defined to be the 4th root of $\Omega^{(3,0)}PT$. If so, we see that $L \cong \mathcal{O}(2)$. The non-projective space has an Euler vector field $\Upsilon$ that generates the $\mathbb{C}^*$ action on the fibres of $\mathcal{O}(-1)$. The weights of functions and forms pulled back from $PT$ are translated into the weights along $\Upsilon$ on the non-projective space. In this context, $\tau$ defines a 1-form of weight 2 on the non-projective space, and the non-degeneracy of the contact structure translates into the condition that the two-form $d\tau$ is non-degenerate as a two-form on $T$ and, being closed, defines a holomorphic symplectic structure. Its inverse therefore defines a non-degenerate holomorphic Poisson structure on $T$ of weight $-2$. This descends to give a Poisson structure on $PT$ with values in $\mathcal{O}(-2)$.

We can extend this reasoning to the supersymmetric case as follows. We again consider a holomorphic differential one-form $\tau$ with values in a complex line bundle $L$. It defines as its kernel the contact distribution $D$, which now is of rank $2|N$, leading to a short exact sequence as follows:

\[ 0 \longrightarrow D \longrightarrow T^{(1,0)}PT \longrightarrow L \longrightarrow 0. \tag{B.1} \]

Since we assume that $\tau$ defines a non-degenerate holomorphic contact structure, $d\tau$ provides a non-degenerate skew form on $D$. Taking its Berezinian, we get an element

\[ \mathrm{Ber}(d\tau|_D) \in L^{2-N} \otimes (\mathrm{Ber}\,D)^{-2}. \tag{B.2} \]

(This follows from the fact that in the definition of the Berezinian, the odd-odd part of the matrix is inverted before its determinant is taken, leading to inverse weights associated to the odd directions relative to their bosonic counterparts.) When $L$ has a square root, we can take its square root to get an isomorphism

\[ \sqrt{\mathrm{Ber}(d\tau|_D)} : \mathrm{Ber}\,D \to L^{1-N/2}. \tag{B.3} \]

The above exact sequence then gives an identification

\[ \mathrm{Ber}\,T^{(1,0)}PT \cong \mathrm{Ber}\,D \otimes L \cong L^{2-N/2}, \tag{B.4} \]

and so finally we obtain the isomorphism

\[ \mathrm{Ber}(PT) := \mathrm{Ber}\,\Omega^{(1,0)}PT \cong L^{N/2-2}. \tag{B.5} \]

We will take the body of the supertwistor space to have topology $U \times S^2$, where $U$ is an open subset of $\mathbb{R}^4$ (or more generally the total space of the projective co-spin bundle of a real smooth spin four-manifold $M$). The assumption on the normal bundle of a rational curve in supertwistor space implies that the holomorphic Berezinian bundle $\mathrm{Ber}(PT)$ has Chern class $N - 4$, and with the topological assumptions we have made, this will have an $|N - 4|$-th root and we may introduce the (consistent) notation $\mathcal{O}(n) := (\mathrm{Ber}(PT))^{n/(N-4)}$. Thus, $L \cong \mathcal{O}(2)$ and $\mathrm{Ber}(PT) \cong \mathcal{O}(N - 4)$.

Appendix C. Supersymmetric BF-Type Theory

In this appendix we wish to present an alternative interpretation of the holomorphic Chern-Simons-type theory (3.14). We shall see that this theory can be viewed as a certain supersymmetric holomorphic BF-type theory. In what follows, we will borrow ideas of [W89]. To begin with, consider some $(0|2)$-dimensional space $T$ with odd coordinates $\psi^1$ and $\psi^2$, which we collectively denote by $\psi^\alpha$. On $PT \times T$, we may introduce a $(0,1)$-form $H$ of weight 2 according to

\[ H = h + \psi^\alpha\chi_\alpha + \psi^1\psi^2 b. \tag{C.1} \]



Here, $h$ and $b$ are even and $\chi_\alpha$ are odd $(0,1)$-forms of weight 2 on $PT$. As before, we assume that these fields have no dependence on the $\bar\theta^{\bar i}$ coordinates. In analogy to (3.14), we may consider the action functional

(0) 1 2 S[b, h, χα ] = dψ dψ (C.2)

8 ∧ H ∧ ∂¯0 H + 13 H ∧ [H, H } . A short calculation reveals that this action reduces after integration over the ψ α coordinates to (0) S[h, b, χα ] =

8 ∧ b ∧ F (0,2) − 21 αβ χα ∧ (∂¯0 χβ + [h, χβ }) . (C.3) The equations of motion that follow from this action are F (0,2) = 0, ∂¯0 b + [h, b} = 21 αβ [χα , χβ },

∂¯0 χα + [h, χα } = 0.

(C.4a) (C.4b) (C.4c)

The first equation is the field equation (3.10). Note that for $\chi_\alpha = 0$ we get (3.18). The supersymmetry transformations are straightforwardly worked out as they follow from infinitesimal translations in the odd coordinates $\psi^\alpha$. We find

\[ \delta_\alpha h = \chi_\alpha, \quad \delta_\alpha\chi_\beta = \epsilon_{\alpha\beta}\, b \quad\text{and}\quad \delta_\alpha b = 0, \tag{C.5} \]

with $\{\delta_\alpha, \delta_\beta\} = 0$. Therefore, the supersymmetric holomorphic BF-type action (C.3) can also be written as

\[ S[h, b, \chi_\alpha] = -\tfrac{1}{2}\,\delta_1\delta_2 S[h], \tag{C.6} \]

where $S[h]$ is the action (3.14).

References

[A-ZH06] Abou-Zeid, M., Hull, C.M.: A chiral perturbation expansion for gravity. JHEP 0602, 057 (2006)
[A-ZHM08] Abou-Zeid, M., Hull, C.M., Mason, L.J.: Einstein supergravity and new twistor string theories. Commun. Math. Phys. 282, 519 (2008)
[AHS78] Atiyah, M.F., Hitchin, N.J., Singer, I.M.: Self-duality in four-dimensional Riemannian geometry. Proc. Roy. Soc. Lond. A 362, 425 (1978)
[BE91] Bailey, T.N., Eastwood, M.G.: Complex paraconformal manifolds – their differential geometry and twistor theory. Forum. Math. 3, 61 (1991)
[B79] Batchelor, M.: The structure of supermanifolds. Trans. Amer. Math. Soc. 253, 329 (1979)
[BS92] Bergshoeff, E., Sezgin, E.: Self-dual supergravity theories in (2 + 2)-dimensions. Phys. Lett. B 292, 87 (1992)
[B04] Berkovits, N.: An alternative string theory in twistor space for N = 4 super Yang-Mills. Phys. Rev. Lett. 93, 011601 (2004)
[BW04] Berkovits, N., Witten, E.: Conformal supergravity in twistor-string theory. JHEP 0408, 009 (2004)
[BDR07] Bern, Z., Dixon, L.J., Roiban, R.: Is N = 8 supergravity ultraviolet finite? Phys. Lett. B 644, 265 (2007)
[BCOV94] Bershadsky, M., Cecotti, S., Ooguri, H., Vafa, C.: Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes. Commun. Math. Phys. 165, 311 (1994)
[B-Betal06] Bjerrum-Bohr, N.E.J., Dunbar, D.C., Ita, H., Perkins, W.B., Risager, K.: The no-triangle hypothesis for N = 8 supergravity. JHEP 0612, 072 (2006)
[BMS07a] Boels, R., Mason, L.J., Skinner, D.: Supersymmetric gauge theories in twistor space. JHEP 0702, 014 (2007a)
[BMS07b] Boels, R., Mason, L.J., Skinner, D.: From twistor actions to MHV diagrams. Phys. Lett. B 648, 90 (2007b)
[CE03] Cap, A., Eastwood, M.G.: Some special geometry in dimension six. In: Proc. of the 22nd Winter School, Geometry and physics (Srni 2002), Rend. Circ. Mat. Palermo (2) Suppl. No. 71, 93 (2003)
[CDDG79] Christensen, S.M., Deser, S., Duff, M.J., Grisaru, M.T.: Chirality, self-duality, and supergravity counterterms. Phys. Lett. B 84, 411 (1979)
[DGNV05] Dijkgraaf, R., Gukov, S., Neitzke, A., Vafa, C.: Topological M-theory as unification of form theories of gravity. Adv. Theor. Math. Phys. 9, 603 (2005)
[GRV07] Green, M.B., Russo, J.G., Vanhove, P.: Ultraviolet properties of maximal supergravity. Phys. Rev. Lett. 98, 131602 (2007)
[K79] Kallosh, R.E.: Super self-duality. JETP Lett. 29, 172 [Pisma Zh. Eksp. Teor. Fiz. 29, 192] (1979)
[K80] Kallosh, R.E.: Self-duality in superspace. Nucl. Phys. B 165, 119 (1980)
[KK98] Karnas, S., Ketov, S.V.: An action of N = 8 self-dual supergravity in ultra-hyperbolic harmonic superspace. Nucl. Phys. B 526, 597 (1998)
[KNG92] Ketov, S.V., Nishino, H., Gates, S.J.J.: Self-dual supersymmetry and supergravity in Atiyah-Ward space-time. Nucl. Phys. B 393, 149 (1992). See also Phys. Lett. B 297, 323 (1992), Phys. Lett. B 307, 331 (1993), Phys. Lett. B 307, 323 (1993)
[LS06] Lechtenfeld, O., Sämann, C.: Matrix models and D-branes in twistor string theory. JHEP 0603, 002 (2006)
[M88] Manin, Yu.I.: Gauge field theory and complex geometry. New York: Springer Verlag, 1988 [Russian: Moscow: Nauka, 1984]
[M05] Mason, L.J.: Twistor actions for non-self-dual fields: A derivation of twistor string theory. JHEP 0510, 009 (2005)
[MN89] Mason, L.J., Newman, E.T.: A connection between the Einstein and Yang-Mills equations. Commun. Math. Phys. 121, 659 (1989)
[MS06] Mason, L.J., Skinner, D.: An ambitwistor Yang-Mills Lagrangian. Phys. Lett. B 636, 60 (2006)
[MS08] Mason, L.J., Skinner, D.: Heterotic twistor-string theory. Nucl. Phys. B 795, 105 (2008)
[MW96] Mason, L.J., Woodhouse, N.M.J.: Integrability, self-duality, and twistor theory. Oxford: Clarendon Press, 1996
[M91] Merkulov, S.A.: Paraconformal supermanifolds and non-standard N-extended supergravity models. Class. Quant. Grav. 8, 557 (1991)
[M92a] Merkulov, S.A.: Supersymmetric non-linear graviton. Funct. Anal. Appl. 26, 69 (1992a)
[M92b] Merkulov, S.A.: Simple supergravity, supersymmetric non-linear gravitons and supertwistor theory. Class. Quant. Grav. 9, 2369 (1992b)
[M92c] Merkulov, S.A.: Quaternionic, quaternionic Kähler, and hyper-Kähler supermanifolds. Lett. Math. Phys. 25, 7 (1992c)
[N08] Nair, V.P.: A note on graviton amplitudes for new twistor string theories. Phys. Rev. D 78, 041501 (2008)
[P68] Penrose, R.: Twistor quantization and curved space-time. Int. J. Theor. Phys. 1, 61 (1968)
[P76] Penrose, R.: Non-linear gravitons and curved twistor theory. Gen. Rel. Grav. 7, 31 (1976)
[PW04] Popov, A.D., Wolf, M.: Topological B model on weighted projective spaces and self-dual models in four dimensions. JHEP 0409, 007 (2004)
[PS05] Popov, A.D., Sämann, C.: On supertwistors, the Penrose-Ward transform and N = 4 super Yang-Mills theory. Adv. Theor. Math. Phys. 9, 931 (2005)
[PSW05] Popov, A.D., Sämann, C., Wolf, M.: The topological B model on a mini-supertwistor space and supersymmetric Bogomolny monopole equations. JHEP 0510, 058 (2005)
[S05] Sämann, C.: The topological B model on fattened complex manifolds and subsectors of N = 4 self-dual Yang-Mills theory. JHEP 0501, 042 (2005)
[S06] Sämann, C.: Aspects of twistor geometry and supersymmetric field theories within superstring theory. Ph.D. thesis, Leibniz University of Hannover, available at http://arXiv.org/list/hep-th/0603098, 2006
[S92] Siegel, W.: Self-dual N = 8 supergravity as closed N = 2 (N = 4) strings. Phys. Rev. D 47, 2504 (1992)
[S95] Sokatchev, E.S.: Action for N = 4 supersymmetric self-dual Yang-Mills theory. Phys. Rev. D 53, 2062 (1995)
[St06] Stelle, K.S.: Counterterms, holonomy and supersymmetry. In: Deserfest: A celebration of the life and works of Stanley Deser, Ann Arbor Michigan, 2004, Liu, J.T., Duff, M.J., Stelle, K.S., Woodward, R.P. (eds.), River Edge, NJ: World Scientific, 2006, p. 303
[W86] Waintrob, A.Yu.: Deformations and moduli of supermanifolds. In: Group theoretical methods in physics, Vol. 1, Moscow: Nauka, 1986
[W80] Ward, R.S.: Self-dual space-times with cosmological constants. Commun. Math. Phys. 78, 1 (1980)
[WW90] Ward, R.S., Wells, R.O.: Twistor geometry and field theory. Cambridge: Cambridge University Press, 1990
[W89] Witten, E.: Topology changing amplitudes in (2 + 1)-dimensional gravity. Nucl. Phys. B 323, 113 (1989)
[W04] Witten, E.: Perturbative gauge theory as a string theory in twistor space. Commun. Math. Phys. 252, 189 (2004)
[W06] Wolf, M.: On supertwistor geometry and integrability in super gauge theory. Ph.D. thesis, Leibniz University of Hannover, available at http://arXiv.org/list/hep-th/0611013, 2006
[W07] Wolf, M.: Self-dual supergravity and twistor theory. Class. Quant. Grav. 24, 6287 (2007)
[W85] Woodhouse, N.M.J.: Real methods in twistor theory. Class. Quant. Grav. 2, 257 (1985)

Communicated by G. W. Gibbons




Asymptotic Stability of Lattice Solitons in the Energy Space

Tetsu Mizumachi

Faculty of Mathematics, Kyushu University, Hakozaki 6-10-1, Fukuoka 812-8581, Japan. E-mail: [email protected]

Received: 13 November 2007 / Accepted: 19 December 2008
Published online: 14 March 2009 – © Springer-Verlag 2009

Abstract: Orbital and asymptotic stability for 1-soliton solutions of the Toda lattice equations as well as for small solitary waves of the FPU lattice equations are established in the energy space. Unlike analogous Hamiltonian PDEs, the lattice equations do not conserve the adjoint momentum. In fact, the Toda lattice equation is a bidirectional model that does not fit in with the existing theory for the Hamiltonian systems by Grillakis, Shatah and Strauss. To prove stability of 1-soliton solutions, we split a solution around a 1-soliton into a small solution that moves more slowly than the main solitary wave and an exponentially localized part. We apply a decay estimate for solutions to a linearized Toda equation which has been recently proved by Mizumachi and Pego to estimate the localized part. We improve the asymptotic stability results for FPU lattices in a weighted space obtained by Friesecke and Pego.

1. Introduction

In this paper, we study asymptotic stability of solitary waves for a class of Hamiltonian systems of particles connected by nonlinear springs. A typical model of these lattices is the Toda lattice

\[ \ddot q(t, n) = e^{-(q(t,n) - q(t,n-1))} - e^{-(q(t,n+1) - q(t,n))} \quad\text{for } t \in \mathbb{R} \text{ and } n \in \mathbb{Z}, \tag{1} \]

where $q(t, n)$ denotes the displacement of the $n$th particle at time $t$ and a dot denotes differentiation with respect to $t$. Let $p(t, n) = \dot q(t, n)$, $r(t, n) = q(t, n+1) - q(t, n)$, $u(t, n) = {}^t(r(t, n), p(t, n))$ and $V(r) = e^{-r} - 1 + r$. The Toda lattice (1) is an integrable system with the Hamiltonian

\[ H(u(t)) = \sum_{n \in \mathbb{Z}} \left( \frac{1}{2}\, p(t, n)^2 + V(r(t, n)) \right) \]


(see [7]) and can be rewritten as

\[ \frac{du}{dt} = J H'(u), \tag{2} \]

where

\[ J = \begin{pmatrix} 0 & e^{\partial} - 1 \\ 1 - e^{-\partial} & 0 \end{pmatrix}, \]

and $e^{\pm\partial} = e^{\pm\partial/\partial n}$ are the shift operators defined by $(e^{\pm\partial} f)(n) = f(n \pm 1)$ for every sequence $\{f(n)\}_{n \in \mathbb{Z}}$ and $H'$ is the Fréchet derivative of $H$ in $l^2 \times l^2$. The Toda lattice (2) has a two-parameter family of solitary waves

\[ M = \{ u_c(t + \delta) : c > 1,\ \delta \in \mathbb{R} \}, \]

where $u_c(t, n) = \tilde u_c(n - ct)$, $\tilde u_c(x) = {}^t(\tilde r_c(x), \tilde p_c(x))$ and

\[ \tilde q_c(x) = \log \frac{\cosh\{\kappa(x - 1)\}}{\cosh \kappa x}, \tag{3} \]
\[ \tilde p_c(x) = -c\,\partial_x \tilde q_c(x), \quad \tilde r_c(x) = \tilde q_c(x + 1) - \tilde q_c(x), \tag{4} \]

and $\kappa = \kappa(c)$ is the unique positive solution of $c = \sinh\kappa/\kappa$.

Friesecke and Pego [9,10] have proved asymptotic stability of solitary waves to FPU lattices in a weighted space assuming an exponential linear stability property (H1) below. To state the assumption explicitly, we introduce some notation. Let $l^2_a$ be the Hilbert space of $\mathbb{R}^2$-sequences equipped with the norm

\[ \|u\|_{l^2_a} = \left( \sum_{n \in \mathbb{Z}} e^{2an} |u(n)|^2 \right)^{1/2}. \]

Let $\langle u, v \rangle := \sum_{n \in \mathbb{Z}} (u_1(n) v_1(n) + u_2(n) v_2(n))$ for $\mathbb{R}^2$-sequences $u = (u_1, u_2)$ and $v = (v_1, v_2)$, and $\|u\|_{l^2} = \langle u, u \rangle^{1/2}$.
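The 1-soliton formula (3) and the speed relation $c = \sinh\kappa/\kappa$ can be checked numerically against the Toda equation (1). The following is a minimal sketch: the function names are illustrative, the root $\kappa(c)$ is found by plain bisection, and $\ddot q$ is approximated by a central difference in $t$, so the residual is only expected to be small, not zero.

```python
import math

def kappa_of_c(c):
    """Solve c = sinh(kappa)/kappa for the positive root by bisection (needs c > 1)."""
    lo, hi = 1e-9, 1.0
    while math.sinh(hi) / hi < c:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.sinh(mid) / mid < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def q_soliton(t, n, c, kappa):
    """q(t, n) = q~_c(n - c t), with q~_c(x) = log(cosh(kappa (x - 1)) / cosh(kappa x))."""
    x = n - c * t
    return math.log(math.cosh(kappa * (x - 1.0)) / math.cosh(kappa * x))

def toda_residual(t, n, c, kappa, dt=1e-3):
    """Left side minus right side of (1), with q'' from a central difference in t."""
    q = lambda s, m: q_soliton(s, m, c, kappa)
    q_tt = (q(t + dt, n) - 2.0 * q(t, n) + q(t - dt, n)) / dt**2
    rhs = math.exp(-(q(t, n) - q(t, n - 1))) - math.exp(-(q(t, n + 1) - q(t, n)))
    return q_tt - rhs

c = 1.5
kappa = kappa_of_c(c)
worst = max(abs(toda_residual(t, n, c, kappa))
            for t in (0.0, 0.3, 1.7) for n in range(-3, 6))
print(f"kappa = {kappa:.6f}, largest residual = {worst:.2e}")
```

The check rests on the identity $e^{-\tilde r_c(x)} - 1 = \sinh^2\kappa\ \mathrm{sech}^2(\kappa x)$, which turns (1) into the condition $c^2\kappa^2 = \sinh^2\kappa$.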

(H1) Let $a > 0$ be a small number. There exist positive numbers $K$ and $\beta$ such that if

\[ \langle v(s), J^{-1}\dot u_c(s) \rangle = \langle v(s), J^{-1}\partial_c u_c(s) \rangle = 0, \tag{5} \]

then a solution to

\[ \frac{dv}{dt} = J H''(u_c(t))\, v \tag{6} \]

satisfies

\[ \| e^{a(\cdot - ct)} v(t, \cdot) \|_{l^2} \le K e^{-\beta(t - s)} \| e^{a(\cdot - cs)} v(s) \|_{l^2} \quad\text{for every } t \ge s. \tag{7} \]

Remark 1. Solutions $\dot u_c(t)$ and $\partial_c u_c(t)$ to (6) correspond to infinitesimal changes in $t$ and $c$, and they do not decay as $t \to \infty$. Since $J^{-1}\dot u_c(t)$ and $J^{-1}\partial_c u_c(t)$ are the corresponding neutral modes of the adjoint equation

\[ \frac{dw}{dt} = H''(u_c(t))\, J w, \]

the condition (H1) says that a solution to (6) decays exponentially as $t \to \infty$ if it does not include the neutral modes $\dot u_c(t)$ and $\partial_c u_c(t)$.


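Remark 2. In (5), we set

\[ J^{-1} = \begin{pmatrix} 0 & \sum_{k=-\infty}^{0} e^{k\partial} \\ \sum_{k=-\infty}^{-1} e^{k\partial} & 0 \end{pmatrix}. \]

(Note that $J^{-1}$ is a bounded operator on $l^2_{-a}$ if $a > 0$, since $\| e^{-\partial} u \|_{l^2_{-a}} = e^{-a} \| u \|_{l^2_{-a}}$, and that $u$ decays exponentially as $n \to \pm\infty$ if $u \in l^2_{\pm a}$ and $a > 0$.) Since $\dot u_c$ and $\partial_c u_c$ decay like $e^{-2\kappa|n - ct|}$ as $n \to \pm\infty$, we have $J^{-1}\dot u_c,\ J^{-1}\partial_c u_c \in l^2_{-a}$ for every $a \in (0, 2\kappa(c))$.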
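The two series in Remark 2 can be read as the inverses of the difference operators $1 - e^{-\partial}$ and $e^{\partial} - 1$ that make up $J$. The sketch below checks this on a rapidly decaying test sequence, assuming the upper summation limits $0$ and $-1$; the Gaussian sequence `f`, the truncation `K` and the helper names are illustrative choices only.

```python
import math

K = 200  # truncation of the infinite sums; harmless for rapidly decaying sequences

def f(n):
    """A rapidly decaying test sequence (hypothetical example data)."""
    return math.exp(-0.25 * (n - 1.5) ** 2)

def sum_up_to_0(g):
    """n -> sum over k <= 0 of g(n + k), truncated at k = -K."""
    return lambda n: sum(g(n + k) for k in range(-K, 1))

def sum_up_to_minus_1(g):
    """n -> sum over k <= -1 of g(n + k), truncated at k = -K."""
    return lambda n: sum(g(n + k) for k in range(-K, 0))

def one_minus_shift_back(g):
    """(1 - e^{-d}) g, i.e. n -> g(n) - g(n - 1)."""
    return lambda n: g(n) - g(n - 1)

def shift_forward_minus_one(g):
    """(e^{d} - 1) g, i.e. n -> g(n + 1) - g(n)."""
    return lambda n: g(n + 1) - g(n)

window = range(-20, 21)
err_a = max(abs(one_minus_shift_back(sum_up_to_0(f))(n) - f(n)) for n in window)
err_b = max(abs(shift_forward_minus_one(sum_up_to_minus_1(f))(n) - f(n)) for n in window)
print(f"(1 - e^-d) * sum_(k<=0):  max error {err_a:.1e}")
print(f"(e^d - 1)  * sum_(k<=-1): max error {err_b:.1e}")
```

Both compositions telescope to the identity up to the truncated tail, which is why the partial sums $\sum_{m \le n} f(m)$ appear: they converge for sequences decaying as $n \to -\infty$ but tend to a constant, not to zero, as $n \to +\infty$.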

Friesecke and Pego prove in [9] that solitary waves of FPU lattices are asymptotically stable in $l^2_a$ if (H1) holds. They have also proved in [10,11] that small solitary waves of FPU lattices can be approximated by KdV solitons and that they satisfy (H1). In [18], we use the linearized Bäcklund transformation to show that every 1-soliton of the Toda lattice satisfies (H1) and prove that it is asymptotically stable in $l^2_a$ without assuming smallness of solitons. Our goal in the present paper is to prove asymptotic stability of 1-solitons in $l^2$.

Theorem 1. Let $c_0 > 1$, $\tau_0 \in \mathbb{R}$ and let $u(t)$ be a solution to (2) with $u(0) = u_{c_0}(\tau_0) + v_0$. For every $\varepsilon > 0$, there exists a positive number $\delta > 0$ satisfying the following: If $\|v_0\|_{l^2} < \delta$, there exist constants $c_+ > 1$ and $\sigma \in (1, c_+)$ and a $C^1$-function $x(t)$ such that

\[ \| u(t) - \tilde u_{c_0}(\cdot - x(t)) \|_{l^2} < \varepsilon, \]
\[ \lim_{t \to \infty} \| u(t) - \tilde u_{c_+}(\cdot - x(t)) \|_{l^2(n \ge \sigma t)} = 0, \]
\[ \sup_{t \in \mathbb{R}} \left( |c(t) - c_0| + |\dot x(t) - c_0| \right) = O(\|v_0\|_{l^2}), \]
\[ \lim_{t \to \infty} c(t) = c_+, \qquad \lim_{t \to \infty} \dot x(t) = c_+. \]

Remark 3. By a simple computation, we see $dH(u_c)/dc > 0$ and $\lim_{c \to 1} H(u_c) = 0$ (see e.g. [26]). So we have arbitrarily small 1-solitons in $l^2$. However, small solitary waves do not belong to an exponentially weighted space if $c$ is close to 1 because $u_c(t)$ decays like $e^{-2\kappa(c)|n - x(t)|}$ as $n \to \infty$ and $\lim_{c \downarrow 1} \kappa(c) = 0$. Thus from Friesecke and Pego [8–11] and Mizumachi and Pego [18], we cannot see whether a solitary wave is stable under perturbations which include small solitary waves. Theorem 1 and Theorem 2 below assert that a solitary wave does not collapse under small perturbations including other solitary waves.

Since Benjamin [1] and Bona [2] studied stability of KdV 1-solitons, many results have been obtained on stability of solitary waves to infinite dimensional Hamiltonian systems (see [5] and references therein). Those results utilize the fact that the Hamiltonian systems have another conservation law (momentum for KdV and charge for NLS) and that a solitary wave solution is a local minimizer of the Hamiltonian among solutions whose momentum or charge is the same as that of the solitary wave solution. However, a solution to the Toda lattice does not conserve adjoint momentum in general because Noether's theorem is not applicable to the spatial variable $n \in \mathbb{Z}$. Hence stability of solitary waves does not follow from the theory of Hamiltonian systems by Grillakis, Shatah and Strauss [13,14] and Shatah and Strauss [25]. For the same reason, it is not possible to use a Liouville theorem such as [15] to prove asymptotic stability of solitary waves.


Luckily, solitary waves for a class of lattice equations including the Toda lattice equation separate from each other as $t \to \infty$. As can be seen from (3) and (4), the speed of solitary waves which move to the right is larger than 1, and the larger a solitary wave is the faster it moves, whereas the absolute values of the group velocities are less than 1. So a solution to (2) decouples into a train of solitary waves and a remainder term as $t \to \infty$. Friesecke and Pego [8–11] utilize this fact and prove asymptotic stability of solitary waves to FPU lattices in an exponentially weighted space. They decompose a solution as

\[ u(t) = u_{c(t)}(\gamma(t)) + v(t) = \tilde u_{c(t)}(\cdot - x(t)) + v(t), \quad x(t) = c(t)\gamma(t), \tag{8} \]

where $u_{c(t)}(\gamma(t))$ denotes a main solitary wave, and $c(t)$ and $x(t)$ are modulation parameters of the speed and the phase shift of the main wave, respectively. They prove that a solution which lies in a neighborhood of $M$ is absorbed into $M$ exponentially in the $l^2_a$-norm as $t \to \infty$. Their proof basically follows the idea of Pego and Weinstein [21] and imposes the symplectic orthogonality condition (5) on $v$. One of the difficulties in the use of their method in the energy space is that $J^{-1}\partial_c u_c$ tends to a nonzero constant as $n \to \infty$ and (5) is not well defined for $v \in l^2$.

Our strategy is to decompose $v(t)$ into the sum of a small solution $v_1(t)$ of (2) and $v_2(t)$ that is driven by an interaction of $u_c$ and a dispersive part of the solution. Since $v_2(t)$ is exponentially localized in front, we can estimate $v_2(t)$ by using exponential linear stability (7). Since $v_1(t)$ moves more slowly than the main solitary wave, it locally tends to 0 around the solitary wave. To fix the decomposition, we impose the constraint $\langle v, J^{-1}\dot u_c(\gamma) \rangle = \langle v_2, J^{-1}\partial_c u_c(\gamma) \rangle = 0$ instead of (5).

Recently, Martel and Merle [16] gave a direct proof of the asymptotic stability results in $H^1(\mathbb{R})$ for generalized KdV solitons based on a virial identity (which first appeared in Kato [19]). Because the Toda lattice and KdV equations share the feature that the dominant solitary wave outruns and is separated from other parts of solutions as $t \to \infty$, their idea seems promising. We prove a virial lemma (Lemma 5 in Sect. 3) for $v_1(t)$ and apply local energy decay estimates for $v_2$ instead of proving a virial lemma around solitary waves. This enables us to prove our results without numerics, whereas [15,16] need some numerical computation to prove positivity of a quadratic form.
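The speed separation used in this argument can be illustrated with a few lines of arithmetic. The sketch below assumes the linearisation of (1) about the rest state, $\ddot q(n) = q(n+1) - 2q(n) + q(n-1)$, whose dispersion relation is $\omega(\xi) = 2\sin(\xi/2)$; this linearisation and the function names are our additions and are not spelled out in the text. It compares the resulting group velocities with the soliton speed $c = \sinh\kappa/\kappa$.

```python
import math

def group_velocity(xi):
    """d(omega)/d(xi) for omega(xi) = 2 sin(xi / 2), the dispersion relation of the
    linearised chain q''(n) = q(n+1) - 2 q(n) + q(n-1) (assumed linearisation)."""
    return math.cos(xi / 2.0)

def soliton_speed(kappa):
    """Speed c = sinh(kappa) / kappa of the Toda 1-soliton with parameter kappa > 0."""
    return math.sinh(kappa) / kappa

# Sample wave numbers xi in [-pi, pi] and a few soliton parameters kappa.
max_group_speed = max(abs(group_velocity(2.0 * math.pi * j / 1000 - math.pi))
                      for j in range(1001))
min_soliton_speed = min(soliton_speed(k) for k in (0.01, 0.1, 0.5, 1.0, 2.0))

print("largest |group velocity| on the grid:", max_group_speed)
print("smallest sampled soliton speed:      ", min_soliton_speed)
```

Linear waves therefore travel at speed at most 1, while every solitary wave travels strictly faster, with the gap closing only as $\kappa \to 0$.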
We expect our proof to be applicable also to Hamiltonian PDEs like the KdV equation or bidirectional models like Boussinesq equations (see [3,4,20]) by using the renormalization method of Ei [6] and Promislow [22] (see [17] for an application to the generalized KdV equation in a weighted space). We remark that Quintero [23] proved orbital stability of solitary waves of the 1-dimensional Benney-Luke equation by the variational method [13] in a case where surface tension is strong. But the approach fails if the surface tension is weak because then a solitary wave solution is a saddle point of infinite dimensional indefiniteness ([24]). Our method can be applied to such a situation because we do not use the fact that a solitary wave is a constrained minimizer to prove stability of solitary waves.

Now let us consider asymptotic stability of solitary waves to the FPU lattice equations. It is interesting to see whether solitary waves to non-integrable lattices are robust to perturbations in the energy class. Let $u(t, n) = {}^t(r(t, n), p(t, n))$ be a solution to

\[ \frac{du}{dt} = J H_F'(u) \quad\text{for } t \in \mathbb{R}, \tag{9} \]

where

\[ H_F(u(t)) = \sum_{n \in \mathbb{Z}} \left( \frac{1}{2}\, p(t, n)^2 + V_F(r(t, n)) \right), \]

and $V_F$ is a potential satisfying

(H2) $V_F \in C^4(\mathbb{R}; \mathbb{R})$, $V_F(0) = V_F'(0) = 0$, $V_F''(0) > 0$, $V_F'''(0) \ne 0$.

If $c > c_s := \sqrt{V_F''(0)}$ and $c$ is sufficiently close to $c_s$, Friesecke and Pego [8] show that there exists a unique solution $\tilde u_c(x)$ of

\[ -c\,\partial_x \tilde u_c(x) = J H_F'(\tilde u_c(x)) \quad\text{for } x \in \mathbb{R}, \tag{10} \]

up to translation and its profile is close to that of a KdV soliton. We remark that a solitary wave solution $\tilde u_c(n - ct)$ has small amplitude and satisfies $dH(\tilde u_c)/dc > 0$ if $c > c_s$ and $c$ is close to $c_s$. See Friesecke and Wattis [12] for existence of large solitary waves. Friesecke and Pego have proved in [11] that small solitary wave solutions of (9) satisfy (H1) and are asymptotically stable in $l^2_a$. Assuming (H2), we can prove orbital and asymptotic stability of small solitary waves in $l^2$ in exactly the same way as for the Toda lattice.

Theorem 2. Suppose (H2). Let $\delta_*$ be a small positive number and let $c_0 \in (c_s, c_s + \delta_*)$ and $\tau_0 \in \mathbb{R}$. Let $u(t)$ be a solution to (9) with $u(0) = u_{c_0}(\tau_0) + v_0$. Then for every $\varepsilon > 0$, there exists a $\delta > 0$ satisfying the following: If $\|v_0\|_{l^2} < \delta$, there exist constants $c_+ > c_s$ and $\sigma \in (c_s, c_+)$ and a $C^1$-function $x(t)$ such that

\[ \| u(t) - \tilde u_{c_0}(\cdot - x(t)) \|_{l^2} < \varepsilon, \]
\[ \lim_{t \to \infty} \| u(t) - \tilde u_{c_+}(\cdot - x(t)) \|_{l^2(n \ge \sigma t)} = 0, \]
\[ \sup_{t \in \mathbb{R}} \left( |c(t) - c_0| + |\dot x(t) - c_0| \right) = O(\|v_0\|_{l^2}), \]
\[ \lim_{t \to \infty} c(t) = c_+, \qquad \lim_{t \to \infty} \dot x(t) = c_+. \]

In Sect. 2 of the present paper, we introduce a variant of the secular term condition for solutions in the energy class and some estimates that will be used later. In Sect. 3, we derive modulation equations of $x(t)$ and $c(t)$ and prove

\[ \dot c(t) = O\left( \|v_1(t)\|_W^2 + \|v_2(t)\|_X^2 \right) \tag{11} \]

for some weighted spaces $W \subset l^2_a \cap l^2_{-a}$ and $X \subset l^2_a$. On the other hand, we show that

\[ \int_0^\infty \left( \|v_1(t)\|_W + \|v_2(t)\|_X \right)^2 dt \lesssim \|v_0\|_{l^2}^2 \tag{12} \]

by using a virial lemma for $v_1(t)$ and a local energy decay estimate (Corollary 1 in Sect. 2) for $v_2(t)$. Combining (11) and (12) with

\[ \|v(t)\|_{l^2}^2 \le C\left( \|v_0\|_{l^2} + |c(t) - c_0| \right), \tag{13} \]

which follows from the convexity of the Hamiltonian and the orthogonality condition, we will prove Theorem 1. In Sect. 4, we give a brief proof of Theorem 2.

Finally, let us introduce some notation. For a Banach space $X$, we denote by $B(X)$ the space of all linear continuous operators from $X$ to $X$. We use $a \lesssim b$ and $a = O(b)$ to mean that there exists a positive constant $C$ such that $a \le Cb$.

2. Preliminaries

Let $u(t)$ be a solution to (2) which lies in a tubular neighborhood of $M$. We decompose $u(t)$ as (8). Since $\dot u_c = -c\,\partial_x \tilde u_c(\cdot - ct) = J H'(u_c)$, it follows from (3) and (4) that

\begin{align*}
\frac{d}{dt}\, u_{c(t)}(\gamma(t)) &= \dot c(t)\,\partial_c \tilde u_{c(t)}(n - x(t)) - \dot x(t)\,\partial_x \tilde u_{c(t)}(n - x(t)) \\
&= J H'(u_{c(t)}(\gamma(t))) + \dot c(t)\,\partial_c u_{c(t)}(\gamma(t)) + \frac{\dot x(t) - c(t)}{c(t)}\, \dot u_{c(t)}(\gamma(t)).
\end{align*}

Thus by the definition of $v$,

\[ \frac{dv}{dt} = J H''(u_{c(t)}(\gamma(t)))\, v(t) + l_1(t) + N_1(t), \tag{14} \]

where

\begin{align*}
l_1(t) &= -\dot c(t)\,\partial_c u_{c(t)}(\gamma(t)) - \frac{\dot x(t) - c(t)}{c(t)}\, \dot u_{c(t)}(\gamma(t)), \\
N_1(t) &= J\left\{ H'(u_{c(t)}(\gamma(t)) + v(t)) - H'(u_{c(t)}(\gamma(t))) - H''(u_{c(t)}(\gamma(t)))\, v(t) \right\}.
\end{align*}

Let $P_c(t)$ be a spectral projection associated with the subspace of neutral modes $\mathrm{span}\{\dot u_c(t), \partial_c u_c(t)\}$ and let $Q_c(t) = 1 - P_c(t)$. Then for $v \in l^2_a$ $(0 < a < 2\kappa(c))$,

\[ P_c(t) v = \theta(c) \langle v, J^{-1}\dot u_c(t) \rangle\, \partial_c u_c(t) - \theta(c) \langle v, J^{-1}\partial_c u_c(t) \rangle\, \dot u_c(t), \]

where $\theta(c) = (dH(u_c)/dc)^{-1}$. We remark that the projections $P_c(t)$ and $Q_c(t)$ cannot be defined on $l^2$ because $J^{-1}\partial_c u_c$ does not decay as $n \to \infty$. Now we decompose $v(t)$ into the sum of a small solution to (2) and a remainder term that belongs to $l^2_a$ for some $a > 0$. More precisely, we put $v(t) = v_1(t) + v_2(t)$, where

\[ \frac{dv_1}{dt} = J H'(v_1), \qquad v_1(0) = v_0, \tag{15} \]

and $v_2(t)$ is a solution to

\[ \frac{dv_2}{dt} = J H''(u_{c(t)}(\gamma(t)))\, v_2 + l_1(t) + N_2(t), \qquad v_2(0) = u_{c_0}(\tau_0) - u_{c(0)}(\gamma(0)), \tag{16} \]

where $N_2(t) = N_1(t) - J H'(v_1(t)) + J H''(u_{c(t)}(\gamma(t)))\, v_1$. To fix the decomposition, we impose the constraint

\[ \langle v(t), J^{-1}\dot u_{c(t)}(\gamma(t)) \rangle = 0, \tag{17} \]
\[ \langle v_2(t), J^{-1}\partial_c u_{c(t)}(\gamma(t)) \rangle = 0. \tag{18} \]

We remark that $u(t) - v_1(t)$ remains in $l^2_a$ for every $0 \le a < 2\kappa(c_0)$ and $t \in \mathbb{R}$. More precisely, we have the following:

Proposition 1. Let $c_0 > 1$, $\tau_0 \in \mathbb{R}$ and $v_0 \in l^2$. Let $u(t)$ be a solution to (2) satisfying $u(0) = u_{c_0}(\tau_0) + v_0$ and let $v_1(t)$ be a solution to (15). Then $u(t) \in C^2(\mathbb{R}; l^2)$ and $u(t) - v_1(t) \in C^2(\mathbb{R}; l^2_a)$ for $0 \le a < 2\kappa(c_0)$.

Proof. By [9], we have $u, v_1 \in C^2(\mathbb{R}; l^2)$. Let $v_3(t) = u(t) - v_1(t)$. Then $v_3(0) \in$ ∩0≤a 1, $\tau_0 \in \mathbb{R}$, $\gamma_0(t) = t + \tau_0$ and $a \in (0, 2\kappa(c_0))$. Let $u(t)$ be a solution to (2) and let $v_1(t)$ be a solution to (15). Then there exist positive numbers $\delta_0$ and $\delta_1$ satisfying the following: If

\[ \sup_{t \in [T_1, T_2]} \left( \| u(t) - u_{c_0}(\gamma_0(t)) \|_{l^2} + e^{-a c_0 \gamma_0(t)} \| u(t) - u_{c_0}(\gamma_0(t)) - v_1(t) \|_{l^2_a} \right) < \delta_0 \]

for some $0 \le T_1 \le T_2 \le \infty$, there exists $(c(t), \gamma(t)) \in C^2([T_1, T_2]; \mathbb{R}^2)$ satisfying (8), (17), (18) and

\[ \sup_{t \in [T_1, T_2]} \left( |\gamma(t) - \gamma_0(t)| + |c(t) - c_0| \right) < \delta_1. \]

In particular, $|c(0) - c_0| + |\gamma(0) - \tau_0| = O(\|v_0\|_{l^2})$.


Proof. Let

$$F_1(u, \tilde u, c, \gamma) := \langle u - u_c(\gamma), J^{-1}\dot u_c(\gamma)\rangle, \tag{21}$$

$$F_2(u, \tilde u, c, \gamma) := \langle \tilde u - u_c(\gamma), J^{-1}\partial_c u_c(\gamma)\rangle. \tag{22}$$

Then

$$\frac{\partial(F_1, F_2)}{\partial(c,\gamma)}\big(u_{c_0}(\gamma_0), u_{c_0}(\gamma_0), c_0, \gamma_0\big) = -\left(\frac{d}{dc}H(u_{c_0})\right)^2 \ne 0.$$

Let

$$U(\delta_0) = \left\{(u,\tilde u) \in l^2 \times l^2_a : \|u - u_{c_0}(\gamma_0)\|_{l^2} + e^{-ac_0\gamma_0}\|\tilde u - u_{c_0}(\gamma_0)\|_{l^2_a} < \delta_0\right\}$$

and $B(\delta_1) := \{(c,\gamma) \in \mathbb{R}^2 : |c - c_0| + |\gamma - \gamma_0| < \delta_1\}$. Using the implicit function theorem, we see that there exist positive numbers $\delta_0$ and $\delta_1$ and a mapping

$$\Phi : U(\delta_0) \ni (u,\tilde u) \mapsto (c,\gamma) \in B(\delta_1)$$

satisfying $F_1(u,\tilde u,\Phi(u,\tilde u)) = F_2(u,\tilde u,\Phi(u,\tilde u)) = 0$. Since $F_1$ and $F_2$ are $C^2$ in $(u,\tilde u,\gamma,c) \in U(\delta_0) \times B(\delta_1)$, we have $\Phi \in C^2(U(\delta_0))$. We remark that $\delta_0$ and $\delta_1$ can be chosen uniformly with respect to $\gamma$ due to the periodicity $u(t + 1/c) = e^{-\partial}u(t)$ ($t \in \mathbb{R}$). Let $(c(t),\gamma(t)) = \Phi(u(t), u(t) - v_1(t))$ for $t \in [T_1,T_2]$. Then $c(t)$ and $\gamma(t)$ satisfy (17) and (18) and are of class $C^2$ because $\Phi \in C^2(U(\delta_0))$ and $(u(t), u(t) - v_1(t)) \in C^2(\mathbb{R}; U(\delta_0))$. Furthermore, we have

$$|c(t) - c_0| + |\gamma(t) - \gamma_0(t)| \lesssim \|u(t) - u_{c_0}(\gamma_0(t))\|_{l^2} + e^{-ac_0\gamma(t)}\|u(t) - u_{c_0}(\gamma_0(t)) - v_1(t)\|_{l^2_a}.$$

In particular, for $t = 0$ we have $|c(0) - c_0| + |\gamma(0) - \tau_0| = O(\|v_0\|_{l^2})$. This completes the proof of Lemma 1.

To estimate the exponentially decaying part of a solution, we will use the following decay estimate for non-autonomous linearized equations.

Lemma 2 ([10,18]). Let $c_0 > 1$, $a \in (0, 2\kappa(c_0))$ and $b(a) := ca - 2\sinh(a/2)$. Let $U_0(t,\tau)\varphi$ be a solution to

$$\frac{dv}{dt} = JH''(u_{c_0})v, \qquad v(\tau) = \varphi. \tag{23}$$

Then for every $b \in (0, b(a))$, there exists a positive number $K$ such that for every $\varphi \in l^2_a$ and $t \ge \tau$,

$$e^{-ac_0(t-\tau)}\|U_0(t,\tau)Q_{c_0}(\tau)\varphi\|_{l^2_a} \le Ke^{-b(t-\tau)}\|\varphi\|_{l^2_a}.$$

Now let $\gamma = \gamma(t)$ be a $C^1$-function and let $U(t,\tau)\varphi$ be a solution to

$$\frac{dv}{dt} = \dot\gamma\, JH''(u_{c_0}(\gamma))v, \qquad v(\tau) = \varphi. \tag{24}$$

If a modulation parameter γ (t) is an increasing function and γ˙ (t) is bounded away from 0, we have the following:



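Corollary 1. Let $c_0$, $a$, $b$ and $K$ be as in Lemma 2 and let $0 \le T \le \infty$. Suppose $\inf_{t\in[0,T]}\dot\gamma(t) \ge 1/2$, $\varphi \in l^2_a$ and

$$\langle\varphi, J^{-1}\dot u_{c_0}(\gamma(\tau))\rangle = \langle\varphi, J^{-1}\partial_c u_{c_0}(\gamma(\tau))\rangle = 0.$$

Then

$$\|U(t,\tau)\varphi\|_{X(t)} \le Ke^{-b(t-\tau)/2}\|\varphi\|_{X(\tau)} \quad \text{for } 0 \le \tau \le t \le T,$$

where $\|v\|_{X(t)} := e^{-ac_0\gamma(t)}\|v\|_{l^2_a}$.

Proof. Let $s = \gamma(t)$, $\tau_1 = \gamma(\tau)$ and $\tilde v(s) = v(\gamma^{-1}(s))$. Then for $s \in [0, \gamma(T)]$,

$$\frac{d\tilde v}{ds} = JH''(u_{c_0})\tilde v \quad \text{and} \quad \tilde v(s) \in \operatorname{Range} Q_{c_0}(s).$$

Lemma 2 and the fact that $\dot\gamma(t) \ge 1/2$ imply

$$\|v(t)\|_{X(t)} = e^{-ac_0 s}\|\tilde v(s)\|_{l^2_a} \le Ke^{-b(s-\tau_1) - ac_0\tau_1}\|\varphi\|_{l^2_a} \le Ke^{-b(t-\tau)/2}e^{-ac_0\gamma(\tau)}\|\varphi\|_{l^2_a}.$$

This completes the proof of Corollary 1.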
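The change of time variable in the proof of Corollary 1 can be spelled out as follows. This is only a sketch of the two elementary facts being used, under the corollary's assumptions that $\gamma$ is $C^1$ with $\dot\gamma \ge 1/2$ and that $b > 0$; the integration variable $\sigma$ is introduced here for illustration.

```latex
\begin{align*}
\frac{d\tilde v}{ds}(s)
  &= \frac{1}{\dot\gamma(t)}\,\frac{dv}{dt}(t)\Big|_{t=\gamma^{-1}(s)}
   = JH''(u_{c_0}(s))\,\tilde v(s)
   && \text{by (24)},\\
s-\tau_1
  &= \gamma(t)-\gamma(\tau)
   = \int_\tau^t \dot\gamma(\sigma)\,d\sigma
   \;\ge\; \tfrac12\,(t-\tau),\\
e^{-b(s-\tau_1)}
  &\le e^{-b(t-\tau)/2}
   && \text{since } b>0 .
\end{align*}
```

The first line shows that $\tilde v$ solves the autonomous-in-form equation (23), so Lemma 2 applies in the variable $s$; the last two lines convert the decay rate in $s$ back to a decay rate in $t$.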

We can estimate $\|v(t)\|_{l^2}$ by applying an argument from [9] that uses the convexity of the Hamiltonian and the orthogonality condition (17).

Lemma 3. Let $u(t)$ be a solution to (2) satisfying $u(0) = u_{c_0}(\tau_0) + v_0$. Then there exist positive numbers $\delta_2$ and $C$ satisfying the following: Suppose there exists $T \in [0,\infty]$ such that $v(t)$ satisfies (8) and (17) for $t \in [0,T]$ and $\sup_{t\in[0,T]}|c(t) - c_0| + \|v_0\|_{l^2} \le \delta_2$. Then

$$\|v(t)\|^2_{l^2} \le C(|c(t) - c_0| + \|v_0\|_{l^2}) \quad \text{for } t \in [0,T]. \tag{25}$$

Proof. By (17), we have

$$\langle H'(u_{c(t)}(\gamma(t))), v(t)\rangle = \langle J^{-1}\dot u_{c(t)}(\gamma(t)), v(t)\rangle = 0.$$

Since $H(u(t))$ does not depend on $t$, it follows from the convexity of the functional $H$ and the above that

$$\begin{aligned}
\delta H &:= H(u_{c_0}(\tau_0) + v_0) - H(u_{c_0}) = H(u_{c(t)}(\gamma(t)) + v(t)) - H(u_{c_0})\\
&= H(u_{c(t)}) + \langle H'(u_{c(t)}(\gamma(t))), v(t)\rangle + \tfrac12\langle H''(u_{c(t)}(\gamma(t)))v(t), v(t)\rangle - H(u_{c_0}) + O(\|v(t)\|^3_{l^2})\\
&\ge C'\|v(t)\|^2_{l^2} - C''|c(t) - c_0| + O(\|v(t)\|^3_{l^2}),
\end{aligned}$$

where $C'$ and $C''$ are positive constants. Noting that $|\delta H| = O(\|v_0\|_{l^2})$, we have (25) for a $C > 0$.

Because $l^2 \subset l^r$ for every $r \in [2,\infty]$, Lemma 3 allows us to control every $l^r$-norm with $r \ge 2$.
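The last step of the proof of Lemma 3, absorbing the cubic remainder, can be sketched as follows. The constants $C'''$ and $C_0$ are hypothetical names introduced here for the implicit constants in $O(\|v(t)\|^3_{l^2})$ and $|\delta H| = O(\|v_0\|_{l^2})$, and the smallness of $\|v(t)\|_{l^2}$ is an assumption of this sketch rather than something derived in it.

```latex
\begin{align*}
&C'\|v(t)\|_{l^2}^2 - C''|c(t)-c_0| - C'''\|v(t)\|_{l^2}^3
   \;\le\; \delta H \;\le\; C_0\|v_0\|_{l^2},\\
&\text{if } \|v(t)\|_{l^2}\le \frac{C'}{2C'''} \text{ then }
   C'''\|v(t)\|_{l^2}^3 \le \tfrac12 C'\|v(t)\|_{l^2}^2 ,\\
&\text{hence } \tfrac12 C'\|v(t)\|_{l^2}^2
   \le C''|c(t)-c_0| + C_0\|v_0\|_{l^2},
\end{align*}
```

which is (25) with $C = 2\max(C'', C_0)/C'$.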


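3. Proof of Theorem 1

First, we derive from (17) and (18) a system of ordinary differential equations which describes the motion of the modulating speed $c(t)$ and the phase shift $x(t) = c(t)\gamma(t)$ of the main solitary wave.

Lemma 4. Let $u(t)$ be a solution to (2) and $v_1(t)$ be a solution to (15). Suppose that $c$ and $\gamma$ are $C^1$-functions satisfying (17) and (18) on $[0,T]$ and $\inf_{t\in[0,T]}c(t) > 1$. Then it holds for $t \in [0,T]$ that

$$\dot c(t) = O\big(\|v_1(t)\|^2_{W(t)} + \|v_2(t)\|^2_{X(t)}\big),$$

$$\dot x(t) - c(t) = O\big(\|v_1(t)\|_{W(t)} + (\|v(t)\|_{l^2} + \|v_1(t)\|_{l^2})\|v_2(t)\|_{X(t)}\big),$$

where

$$\|u\|_{W(t)} = \Big(\sum_{n\in\mathbb{Z}} e^{-\kappa(c(t))|n - x(t)|}|u(n)|^2\Big)^{1/2}, \qquad \|u\|_{X(t)} = \Big(\sum_{n\in\mathbb{Z}} e^{2a(n - x(t))}|u(n)|^2\Big)^{1/2},$$

and $a$ is a constant satisfying $0 < a \le \inf_{t\in[0,T]}\kappa(c(t))$.

Proof. Differentiating (17) with respect to $t$ and substituting (14) into the resulting equation, we have

$$\begin{aligned}
\frac{d}{dt}\langle v, J^{-1}\dot u_c(\gamma)\rangle
&= \langle\dot v, J^{-1}\dot u_c(\gamma)\rangle + \frac{\dot x}{c}\langle v, J^{-1}\ddot u_c(\gamma)\rangle + \dot c\,\langle v, J^{-1}\partial_c\dot u_c(\gamma)\rangle\\
&= \langle JH''(u_c(\gamma))v, J^{-1}\dot u_c(\gamma)\rangle + \langle v, J^{-1}\ddot u_c(\gamma)\rangle + \langle l_1 + N_1, J^{-1}\dot u_c(\gamma)\rangle\\
&\quad + \Big(\frac{\dot x}{c} - 1\Big)\langle v, J^{-1}\ddot u_c(\gamma)\rangle + \dot c\,\langle v, J^{-1}\partial_c\dot u_c(\gamma)\rangle = 0.
\end{aligned}$$

Substituting $\ddot u_c = JH''(u_c)\dot u_c$ and $J^* = -J$ into the above, we have

$$\dot c\Big(\frac{d}{dc}H(u_c) - \langle v, J^{-1}\partial_c\dot u_c(\gamma)\rangle\Big) - \Big(\frac{\dot x}{c} - 1\Big)\langle v, J^{-1}\ddot u_c(\gamma)\rangle = \langle N_1, J^{-1}\dot u_c(\gamma)\rangle. \tag{26}$$

Differentiating (18) with respect to $t$, we have

$$\begin{aligned}
\frac{d}{dt}\langle v_2, J^{-1}\partial_c u_c(\gamma)\rangle
&= \langle\dot v_2, J^{-1}\partial_c u_c(\gamma)\rangle + \frac{\dot x}{c}\langle v_2, J^{-1}\partial_c\dot u_c(\gamma)\rangle + \dot c\,\langle v_2, J^{-1}\partial_c^2 u_c(\gamma)\rangle\\
&= \langle JH''(u_c(\gamma))v_2, J^{-1}\partial_c u_c(\gamma)\rangle + \langle v_2, J^{-1}\partial_c\dot u_c(\gamma)\rangle + \langle l_1 + N_2, J^{-1}\partial_c u_c(\gamma)\rangle\\
&\quad + \Big(\frac{\dot x}{c} - 1\Big)\langle v_2, J^{-1}\partial_c\dot u_c(\gamma)\rangle + \dot c\,\langle v_2, J^{-1}\partial_c^2 u_c(\gamma)\rangle = 0.
\end{aligned}$$

Substituting $\partial_c\dot u_c = JH''(u_c)\partial_c u_c$ into the above, we obtain

$$\Big(\frac{\dot x}{c} - 1\Big)\Big(\frac{d}{dc}H(u_c) + \langle v_2, J^{-1}\partial_c\dot u_c(\gamma)\rangle\Big) - \dot c\Big(\langle\partial_c u_c, J^{-1}\partial_c u_c\rangle - \langle v_2, J^{-1}\partial_c^2 u_c(\gamma)\rangle\Big) = -\langle N_2, J^{-1}\partial_c u_c(\gamma)\rangle. \tag{27}$$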
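The cancellation behind (26) can be made explicit. The sketch below uses only the two facts quoted in the proof, $J^* = -J$ and $\ddot u_c = JH''(u_c)\dot u_c$, together with the symmetry of the second derivative $H''(u_c)$, which is assumed here.

```latex
\begin{align*}
\langle JH''(u_c)v,\; J^{-1}\dot u_c\rangle
  &= \langle H''(u_c)v,\; J^{*}J^{-1}\dot u_c\rangle
   = -\langle H''(u_c)v,\; \dot u_c\rangle\\
  &= -\langle v,\; H''(u_c)\dot u_c\rangle
   = -\langle v,\; J^{-1}\ddot u_c\rangle .
\end{align*}
```

Thus the first two terms on the right-hand side of the differentiated constraint cancel, and only the terms containing $l_1$, $N_1$, $\dot x/c - 1$ and $\dot c$ remain. The same computation with $\partial_c u_c$ in place of $\dot u_c$ and $\partial_c\dot u_c = JH''(u_c)\partial_c u_c$ gives the cancellation used for (27).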



Since $|N_1(t)| \lesssim |v(t)|^2$ and $|J^{-1}\dot u_c(t,n)| \lesssim e^{-2\kappa(c)|n - x(t)|}$ as $n \to \infty$, we have

$$\langle N_1, J^{-1}\dot u_c(\gamma)\rangle = O(\|v(t)\|^2_{W(t)}).$$

Let $N_2(t) = \tilde N_1(t) + \tilde N_2(t) + \tilde N_3(t)$, where

$$\tilde N_1(t) = N_1(t) - JH'(v(t)) + Jv(t), \quad \tilde N_2(t) = JH'(v(t)) - JH'(v_1(t)) - Jv_2(t), \quad \tilde N_3(t) = J\big(H''(u_{c(t)}(\gamma(t))) - 1\big)v_1(t).$$

We put $G(v) := H'(v) - H'(0) - H''(0)v = H'(v) - v$ so that $JG(v)$ represents a part of $\tilde N_1(t)$ that does not interact with the solitary wave $u_c(\gamma)$. Since $|u_c(t,n)| \lesssim e^{-2\kappa(c)|n - x(t)|}$ and $a \le \inf_{t\in[0,T]}\kappa(c(t))$, we have $\|u_{c(t)}v^2\|_{X(t)} \lesssim \|v\|^2_{W(t)}$. Hence by the definition of $\tilde N_1$ and $\tilde N_2$,

$$\|\tilde N_1(t)\|_{X(t)} = \|N_1(t) - JG(v(t))\|_{X(t)} \lesssim \|v(t)\|^2_{W(t)}, \tag{28}$$

$$\|\tilde N_2(t)\|_{X(t)} = \|JG(v(t)) - JG(v_1(t))\|_{X(t)} \lesssim (\|v(t)\|_{l^\infty} + \|v_1(t)\|_{l^\infty})\|v_2(t)\|_{X(t)}. \tag{29}$$

We see from (3) and (4) or [8] that $H''(u_c) - 1$ decays like $e^{-2\kappa|n - x(t)|}$ as $n \to \pm\infty$ and for $a \in (0, \kappa(c(t)))$,

$$\|\tilde N_3(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)}. \tag{30}$$

Let $\|u\|_{X(t)^*} = e^{ax(t)}\|u\|_{l^2_{-a}}$ and $\|u\|_{W(t)^*} = \big(\sum_{n\in\mathbb{Z}} e^{\kappa(c(t))|n - x(t)|}|u(n)|^2\big)^{1/2}$. In view of (26), (27) and the fact that

$$\sup_{t\in[0,T]}\Big(\|J^{-1}\ddot u_{c(t)}(\gamma(t))\|_{W(t)^*} + \|J^{-1}\partial_c\dot u_{c(t)}(\gamma(t))\|_{W(t)^*}\Big) < \infty,$$

$$\sup_{t\in[0,T]}\Big(\|J^{-1}\partial_c\dot u_{c(t)}(\gamma(t))\|_{X(t)^*} + \|J^{-1}\partial_c^2 u_{c(t)}(\gamma(t))\|_{X(t)^*}\Big) < \infty,$$

we have

$$A(t)\begin{pmatrix}\dot c(t)\\ \dot x(t) - c(t)\end{pmatrix} = \begin{pmatrix} O(\|v(t)\|^2_{W(t)})\\ O\big(\|v_1(t)\|_{W(t)} + (\|v(t)\|_{l^2} + \|v_1(t)\|_{l^2})\|v_2(t)\|_{X(t)}\big)\end{pmatrix},$$

where $A(t) = \operatorname{diag}(dH(u_c)/dc,\, dH(u_c)/dc) + O(\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)})$. We have thus proved Lemma 4.

Since $v_1(t)$ is smaller than the main wave, it moves more slowly and will be separated from the main wave. The following is an analog of the virial lemma for small solutions in Martel and Merle [16].

Lemma 5. Let $v_1(t)$ be a solution to (15).

(i) Suppose $v_0 \in l^2$. Then $\sup_{t\in\mathbb{R}}\|v_1(t)\|_{l^2} \le C\|v_0\|_{l^2}$, where $C$ can be chosen as an increasing function of $\|v_0\|_{l^2}$.


(ii) Let $c_1 > 1$ and $\tilde x(t)$ be a $C^1$-function satisfying $\inf_{t\in\mathbb{R}}\tilde x_t \ge c_1$. Then there exist positive numbers $a_0$ and $\delta_3$ such that if $a \in (0,a_0)$ and $\|v_0\|_{l^2} \le \delta_3$,

$$\|\psi_a(t)^{1/2}v_1(t)\|^2_{l^2} + \int_0^t\|\tilde\psi_a(s)v_1(s)\|^2_{l^2}\,ds \lesssim \|\psi_a(0)^{1/2}v_0\|^2_{l^2},$$

where $\psi_a(t,x) = 1 + \tanh a(x - \tilde x(t))$ and $\tilde\psi_a(t,x) = a^{1/2}\operatorname{sech} a(x - \tilde x(t))$.

Corollary 2. Let $v_1(t)$ be a solution to (15). For every $c_1 > 1$, there exists $\delta_3 > 0$ such that $\lim_{t\to\infty}\|v_1(t)\|_{l^2(n \ge c_1 t)} = 0$ if $\|v_0\|_{l^2} < \delta_3$.

Proof of Lemma 5. Since $v_1(t) \in C^2(\mathbb{R}; l^2)$ is a solution to (15), we have $H(v_1(t)) = H(v_0)$ for $t \in \mathbb{R}$. Noting that $V(x)$ is coercive and $\inf_{|x|\le R}|x|^{-2}V(x) > 0$ for every $R > 0$, we have

$$\delta\|v_1(t)\|^2_{l^2} \le H(v_1(t)) = H(v_0) \le C(\|v_0\|_{l^2})\|v_0\|^2_{l^2},$$

where $C$ can be chosen as an increasing function of $\|v_0\|_{l^2}$ and $\delta$ is a positive constant depending only on $\|v_0\|_{l^2}$.

Next, we prove (ii). Let

$$v_1(t) = {}^t(r_1(t,n), p_1(t,n)), \qquad h_1(t,n) = \tfrac12 p_1(t,n)^2 + V(r_1(t,n)).$$

By (2) and the fact that there exists a $C > 0$ such that for every $n \in \mathbb{Z}$,

$$\Big|V(r_1(t,n)) - \frac{r_1(t,n)^2}{2}\Big| \le C\|v_0\|_{l^2}|r_1(t,n)|^2, \qquad \big|V'(r_1(t,n)) - r_1(t,n)\big| \le C\|v_0\|_{l^2}|r_1(t,n)|,$$

we have

$$\begin{aligned}
\frac{d}{dt}\sum_{n\in\mathbb{Z}}\psi_a(t,n)h_1(t,n)
&= \sum_{n\in\mathbb{Z}} p_1(t,n)V'(r_1(t,n-1))\big(\psi_a(t,n-1) - \psi_a(t,n)\big) + \sum_{n\in\mathbb{Z}}\partial_t\psi_a(t,n)h_1(t,n)\\
&\le -\frac{\tilde x_t(t)}{2}\sum_{n\in\mathbb{Z}}\tilde\psi_a(t,n)^2 p_1(t,n)^2\\
&\quad + (1 + C'\|v_0\|_{l^2})\sum_{n\in\mathbb{Z}}|\psi_a(t,n-1) - \psi_a(t,n)|\,|p_1(t,n)r_1(t,n-1)|\\
&\quad - \frac{\tilde x_t(t)}{2}(1 - C'\|v_0\|_{l^2})\sum_{n\in\mathbb{Z}}\tilde\psi_a(t,n-1)^2 r_1(t,n-1)^2,
\end{aligned}$$

where $C'$ is a positive constant. Let $\delta_3$ and $a$ be sufficiently small positive numbers. Since $\inf\tilde x_t \ge c_1 > 1$ and

$$\sup_{n,t}\left|\frac{\psi_a(t,n) - \psi_a(t,n-1)}{\tilde\psi_a(t,n)^2} - 1\right| = O(a) \quad \text{as } a \downarrow 0,$$



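there exists a $\tilde\delta > 0$ such that for $t \in [0,T]$,

$$\frac{d}{dt}\sum_{n\in\mathbb{Z}}\psi_a(t,n)h_1(t,n) \le -\tilde\delta\sum_{n\in\mathbb{Z}}\tilde\psi_a(t,n)^2\big(p_1(t,n)^2 + r_1(t,n)^2\big). \tag{31}$$

Integrating (31) over $[0,t]$, we have

$$\sum_{n\in\mathbb{Z}}\psi_a(t,n)h_1(t,n) + \tilde\delta\sum_{n\in\mathbb{Z}}\int_0^t\tilde\psi_a(s,n)^2\big(p_1(s,n)^2 + r_1(s,n)^2\big)\,ds \le \sum_{n\in\mathbb{Z}}\psi_a(0,n)h_1(0,n) \lesssim \|v_0\|^2_{l^2}.$$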

We have thus proved Lemma 5.
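The two properties of the weights used in the proof of Lemma 5 follow from elementary calculus. The sketch below uses only the definitions $\psi_a = 1 + \tanh a(x - \tilde x(t))$ and $\tilde\psi_a = a^{1/2}\operatorname{sech} a(x - \tilde x(t))$; the bound $e^{2a} - 1$ is one convenient way to quantify the $O(a)$ term and is not claimed to be the constant intended in the original.

```latex
\begin{align*}
\partial_t\psi_a(t,x)
  &= -a\,\tilde x_t(t)\,\operatorname{sech}^2 a(x-\tilde x(t))
   = -\tilde x_t(t)\,\tilde\psi_a(t,x)^2,\\
\psi_a(t,n)-\psi_a(t,n-1)
  &= \int_{n-1}^{n} a\,\operatorname{sech}^2 a(y-\tilde x(t))\,dy
   = \int_{n-1}^{n}\tilde\psi_a(t,y)^2\,dy,\\
\Big|\partial_y \log\tilde\psi_a(t,y)^2\Big|
  &= 2a\,\big|\tanh a(y-\tilde x(t))\big| \le 2a
  \;\Longrightarrow\;
  e^{-2a}\le\frac{\tilde\psi_a(t,y)^2}{\tilde\psi_a(t,n)^2}\le e^{2a}
  \quad (n-1\le y\le n),\\
\left|\frac{\psi_a(t,n)-\psi_a(t,n-1)}{\tilde\psi_a(t,n)^2}-1\right|
  &\le e^{2a}-1 = O(a)\quad\text{as } a\downarrow 0,
  \text{ uniformly in } n \text{ and } t .
\end{align*}
```

The first identity is what turns $\sum_n \partial_t\psi_a h_1$ into a negative multiple of $\sum_n\tilde\psi_a^2 h_1$, and the last estimate is what allows the cross term to be absorbed when $\tilde x_t \ge c_1 > 1$.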

Proof of Corollary 2. Let $c_2 \in (1,c_1)$ and let $\tilde x(t) = c_2 t$. Then by Lemma 5, we have $\|v_1(t)\|_{l^2(n \ge c_2 t)} \lesssim \|\psi_a(0)^{1/2}v_0\|_{l^2}$. Let $n_0(t) = [(c_1 - c_2)t]$, the largest integer which is smaller than $(c_1 - c_2)t$. Then we have $n_0(t) \to \infty$ as $t \to \infty$ and

$$\|v_1(t)\|_{l^2(n \ge c_1 t)} \le \|v_1(t, \cdot + n_0(t))\|_{l^2(n \ge c_2 t)} \lesssim \|\psi_a(0,\cdot)^{1/2}v_0(\cdot + n_0(t))\|_{l^2}.$$

Letting $t \to \infty$, we have $\lim_{t\to\infty}\|v_1(t)\|_{l^2(n \ge c_1 t)} = 0$. This completes the proof of Corollary 2.

Next, we will estimate $v_2$. It decays slowly due to the slow decay of the interaction between $v_1$ and the solitary wave $u_{c(t)}$.

Lemma 6. Let $c_0 > 1$, $a \in (0, \kappa(c_0)/3)$ and $\delta_4$ be a sufficiently small positive number. Suppose that the decomposition (8), (17) and (18) exists for $t \in [0,T]$ and that $\|v_0\|_{l^2} + \sup_{t\in[0,T]}(|c(t) - c_0| + |\dot x(t) - c_0|) \le \delta_4$, where $x(t) = c(t)\gamma(t)$. Then

$$\|Q_{c(t)}(\gamma(t))v_2(t)\|_{X(t)} \le C\Big(e^{-bt/4}\|v_0\|_{l^2} + \int_0^t e^{-b(t-s)/4}\|v_1(s)\|_{W(s)}\,ds\Big) \tag{32}$$

for $t \in [0,T]$, and

$$\int_0^T\|v_2(t)\|^2_{X(t)}\,dt \le C\|v_0\|^2_{l^2}, \tag{33}$$

where $C$ is a positive constant independent of $T$ and $\|v\|_{X(t)}$ and $\|v\|_{W(t)}$ are as in Lemma 4.

Proof. Since $v_2$ is exponentially localized in front, we will apply (7) to (16). Let $\tilde v_2(t) := Q_{c(t)}(\gamma(t))v_2(t)$ and $w(t) = Q_{c_0}(\tilde\gamma(t))\tilde v_2(t)$, where $\tilde\gamma(t) = x(t)/c_0$. Here we choose $\tilde\gamma(t)$ so that $u_{c(t)}(\gamma(t))$ and $u_{c_0}(\tilde\gamma(t))$ have the same phase shift and for $0 < a < \min(\kappa(c(t)), \kappa(c_0))$,

$$\|Q_{c(t)}(\gamma(t)) - Q_{c_0}(\tilde\gamma(t))\|_{B(l^2_a)} = O(|c(t) - c_0|).$$


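By (17) and (18),

$$\tilde v_2(t) = v_2(t) - \theta(c(t))\langle v_2(t), J^{-1}\dot u_c(\gamma(t))\rangle\,\partial_c u_c(\gamma(t)) = v_2(t) + \theta(c(t))\langle v_1(t), J^{-1}\dot u_c(\gamma(t))\rangle\,\partial_c u_c(\gamma(t)). \tag{34}$$

Thus we have

$$\frac{d\tilde v_2}{dt} = JH''(u_{c(t)}(\gamma(t)))\tilde v_2 + l_1(t) + l_2(t) + l_3(t) + N_2(t),$$

where

$$l_2(t) = \frac{d}{dt}\Big\{\theta(c(t))\langle v_1(t), J^{-1}\dot u_{c(t)}(\gamma(t))\rangle\,\partial_c u_{c(t)}(\gamma(t))\Big\}, \qquad l_3(t) = -\theta(c(t))\langle v_1(t), J^{-1}\dot u_{c(t)}(\gamma(t))\rangle\,\partial_c\dot u_{c(t)}(\gamma(t)).$$

Since

$$\Big[\frac{d}{dt} - \dot{\tilde\gamma}\,JH''(u_{c_0}(\tilde\gamma)),\; Q_{c_0}(\tilde\gamma)\Big] = 0,$$

we have

$$\dot w - \dot{\tilde\gamma}\,JH''(u_{c_0}(\tilde\gamma))w = Q_{c_0}(\tilde\gamma)\Big\{\dot{\tilde v}_2 - \dot{\tilde\gamma}\,JH''(u_{c_0}(\tilde\gamma))\tilde v_2\Big\} = Q_{c_0}(\tilde\gamma)\Big\{\sum_{1\le k\le 4} l_k + \sum_{1\le k\le 3}\tilde N_k\Big\}, \tag{35}$$

where

$$l_4(t) = J\big\{H''(u_{c(t)}(\gamma(t))) - H''(u_{c_0}(\tilde\gamma(t)))\big\}\tilde v_2(t) - (\dot{\tilde\gamma}(t) - 1)JH''(u_{c_0}(\tilde\gamma(t)))\tilde v_2(t).$$

In view of Lemma 4, we have for $a \in (0, 2\kappa(c(t)))$,

$$\|l_1\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)} + \big(\|v(t)\|_{l^2} + \|v_1(t)\|_{l^2} + \|v_2(t)\|_{X(t)}\big)\|v_2(t)\|_{X(t)}. \tag{36}$$

By (15) and the fact that $J^{-1}\dot u_c(\gamma)$, $\frac{d}{dt}J^{-1}\dot u_c(\gamma)$, $\partial_c u_c(\gamma)$ and $\frac{d}{dt}\partial_c u_c(\gamma)$ decay like $e^{-2\kappa|n - x(t)|}$ as $n \to \pm\infty$, we have

$$\|l_2(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)}. \tag{37}$$

Similarly, we have

$$\|l_3(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)}. \tag{38}$$

Since $x(t) = c_0\tilde\gamma(t) = c(t)\gamma(t)$,

$$\|l_4(t)\|_{X(t)} \lesssim (|c(t) - c_0| + |\dot x(t) - c_0|)\|\tilde v_2(t)\|_{X(t)} \lesssim \delta_4\big(\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)}\big). \tag{39}$$

Let $U(t,s)$ be a flow generated by

$$\frac{dw}{dt} = \dot{\tilde\gamma}(t)\,JH''(u_{c_0}(\tilde\gamma(t)))w.$$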



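Applying Corollary 1 to (35) and substituting (28)–(30) and (36)–(39), we have

$$\begin{aligned}
\|w(t)\|_{X(t)} &\lesssim \|U(t,0)w(0)\|_{X(t)} + \sum_{k=1}^4\int_0^t\|U(t,s)Q_{c_0}(\tilde\gamma(s))l_k(s)\|_{X(t)}\,ds + \sum_{k=1}^3\int_0^t\|U(t,s)Q_{c_0}(\tilde\gamma(s))\tilde N_k(s)\|_{X(t)}\,ds\\
&\lesssim e^{-bt/2}\|w(0)\|_{X(0)} + \int_0^t e^{-b(t-s)/2}\|v_2(s)\|^2_{X(s)}\,ds\\
&\quad + \int_0^t e^{-b(t-s)/2}\Big\{\|v_1(s)\|_{W(s)} + \big(\delta_4 + \|v_1(s)\|_{l^2} + \|v_2(s)\|_{l^2}\big)\|v_2(s)\|_{X(s)}\Big\}\,ds.
\end{aligned}$$

Here we use $\|u\|_{W(t)} \lesssim \|u\|_{l^2}$ and $\|u\|_{W(t)} \lesssim \|u\|_{X(t)}$ for $a \in (0, \kappa(c(t))/2)$. By the definition of $\tilde v_2$ and $w$,

$$\|v_2(t)\|_{X(t)} \lesssim \|\tilde v_2(t)\|_{X(t)} + \|v_1(t)\|_{W(t)}, \tag{40}$$

$$\|\tilde v_2(t)\|_{X(t)} \le \|w(t)\|_{X(t)} + \|(Q_{c(t)}(\gamma(t)) - Q_{c_0}(\tilde\gamma(t)))\tilde v_2(t)\|_{X(t)} \lesssim \|w(t)\|_{X(t)} + |c(t) - c_0|\,\|\tilde v_2(t)\|_{X(t)}. \tag{41}$$

If $\delta_4$ is sufficiently small, Eqs. (40) and (41) imply $\|\tilde v_2(t)\|_{X(t)} \lesssim \|w(t)\|_{X(t)}$ and

$$\|v_2(t)\|_{X(t)} \lesssim \|w(t)\|_{X(t)} + \|v_1(t)\|_{W(t)}. \tag{42}$$

It follows from Lemmas 5 (i) and 3 that $\|v_1(t)\|_{l^2} + \|v_2(t)\|^2_{l^2} \lesssim \|v_0\|_{l^2} + |c(t) - c_0|$. Thus as long as $\sup_{0\le s\le t}\|w(s)\|_{X(s)} \le \sqrt{\delta_4}$, we have

$$\|w(t)\|_{X(t)} \lesssim e^{-bt/2}\|w(0)\|_{X(0)} + \int_0^t e^{-b(t-s)/2}\Big(\|v_1(s)\|_{W(s)} + \sqrt{\delta_4}\,\|w(s)\|_{X(s)}\Big)\,ds.$$

Applying Gronwall's inequality, we have

$$\|w(t)\|_{X(t)} \lesssim e^{-(b/2 + O(\sqrt{\delta_4}))t}\|w(0)\|_{X(0)} + \int_0^t e^{-(b/2 + O(\sqrt{\delta_4}))(t-s)}\|v_1(s)\|_{W(s)}\,ds. \tag{43}$$

By the definition of $w$, (16), (34) and Lemma 1,

$$\|w(0)\|_{X(0)} \lesssim \|v_2(0)\|_{X(0)} + \|v_1(0)\|_{l^2} \lesssim \|v_0\|_{l^2}. \tag{44}$$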
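The Gronwall step that produces (43) can be written out explicitly. In the sketch below, $C$ denotes the implicit constant in the preceding inequality, and $y$, $g$, $\alpha$ and $\varepsilon$ are auxiliary names introduced only for this illustration.

```latex
\begin{align*}
&y(t) := e^{bt/2}\|w(t)\|_{X(t)},\qquad
 g(s) := e^{bs/2}\|v_1(s)\|_{W(s)},\qquad
 \varepsilon := C\sqrt{\delta_4},\\
&y(t)\le \alpha(t)+\varepsilon\int_0^t y(s)\,ds,
 \qquad \alpha(t):=C\,y(0)+C\int_0^t g(s)\,ds,\\
&y(t)\le \alpha(t)+\varepsilon\int_0^t\alpha(s)e^{\varepsilon(t-s)}\,ds
 = C\,y(0)e^{\varepsilon t}+C\int_0^t g(s)e^{\varepsilon(t-s)}\,ds
 \quad\text{(integration by parts)},\\
&\|w(t)\|_{X(t)}\le C e^{-(b/2-\varepsilon)t}\|w(0)\|_{X(0)}
 +C\int_0^t e^{-(b/2-\varepsilon)(t-s)}\|v_1(s)\|_{W(s)}\,ds .
\end{align*}
```

The last line is (43), with the term $O(\sqrt{\delta_4})$ in the exponent realized as $-\varepsilon$; for small $\delta_4$ the decay rate stays above $b/4$, which is the rate appearing in (32).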


In view of Lemma 5 (i), (43) and (44), we have $\|w(t)\|_{X(t)} \lesssim \|v_0\|_{l^2} = O(\delta_4)$ and (43) persists for $t \in [0,T]$ if $\delta_4$ is sufficiently small. Thus by (41), we have (32). Combining (32), (42) and Lemma 5 (ii) and using Young's inequality, we have

$$\|v_2(t)\|_{L^2(0,T;X(t))} \lesssim \|v_0\|_{l^2} + \|e^{-bt/4}\|_{L^1(0,T)}\|v_1\|_{L^2(0,T;W(t))} \lesssim \|v_0\|_{l^2}.$$

We have thus completed the proof of Lemma 6.

Now we are in a position to prove the following proposition:

Proposition 2. Let $c_0 > 1$, $\tau_0 \in \mathbb{R}$ and let $u(t)$ be a solution to (2) with $u(0) = u_{c_0}(\tau_0) + v_0$. For every $\varepsilon > 0$, there exists a positive number $\delta > 0$ satisfying the following: If $\|v_0\|_{l^2} < \delta$, there exist a constant $c_+ > 1$ and a $C^1$-function $x(t)$ such that

$$\|u(t) - \tilde u_{c_0}(\cdot - x(t))\|_{l^2} < \varepsilon, \tag{45}$$

$$\lim_{t\to\infty}\|u(t) - \tilde u_{c_+}(\cdot - x(t))\|_{l^2(n \ge x(t) - R)} = 0 \quad \text{for every } R > 0, \tag{46}$$

$$\sup_{t\in\mathbb{R}}\big(|c(t) - c_0| + |\dot x(t) - c_0|\big) = O(\|v_0\|_{l^2}), \tag{47}$$

$$\lim_{t\to\infty}c(t) = c_+, \qquad \lim_{t\to\infty}\dot x(t) = c_+. \tag{48}$$

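Proof. Let $\delta_5 = \min_{1\le i\le 4}\delta_i$ and

$$T_0 := \sup\{t : \text{(8), (17) and (18) hold for } 0 \le \tau \le t\},$$

$$T_1 := \sup\Big\{t \le T_0 : \|v_0\|_{l^2} + \sup_{0\le\tau\le t}\big(|c(\tau) - c_0| + |\dot x(\tau) - c_0|\big) \le \delta_5\Big\}.$$

If $\delta$ is sufficiently small, Proposition 1 and Lemma 1 imply that $T_1 > 0$. We will show that $T_0 = T_1$ for small $\delta$. Suppose that $t \in [0,T_1)$. Lemmas 4, 5 and 6 and (40) imply

$$|\dot x(t) - c(t)| \lesssim \|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)} \lesssim \|v_1(t)\|_{W(t)} + \|\tilde v_2(t)\|_{X(t)} \lesssim \|v_0\|_{l^2}. \tag{49}$$

By Lemmas 1 and 4,

$$|c(t) - c_0| \le |c(0) - c_0| + \int_0^t|\dot c(s)|\,ds \lesssim \|v_0\|_{l^2} + \int_0^t\big(\|v_1(s)\|^2_{W(s)} + \|v_2(s)\|^2_{X(s)}\big)\,ds.$$

In view of Lemmas 5 (ii) and 6, we have

$$|c(t) - c_0| \lesssim \|v_0\|_{l^2}. \tag{50}$$

It follows from (49) and (50) that $T_0 = T_1$ if $\delta$ is sufficiently small. Next, we will show that $T_0 = \infty$ for small $\delta$. Suppose that for every $\delta > 0$, there exists $v_0$ such that $\|v_0\|_{l^2} < \delta$ and $T_0 < \infty$. By Lemma 3 and (50),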
It follows from (49) and (50) that T0 = T1 if δ is sufficiently small. Next, we will show that T0 = ∞ for small δ. Suppose that for every δ > 0, there exists v0 such that v0 l 2 < δ and T0 < ∞. By Lemma 3 and (50),



$$\sup_{t\in[0,T_0]}\|v(t)\|^2_{l^2} \lesssim \|v_0\|_{l^2}. \tag{51}$$

Using (40), Lemmas 6 and 5 (i), we have

$$\sup_{t\in[0,T_0]}\|v_2(t)\|_{X(t)} \lesssim \sup_{t\in[0,T_0]}\big(\|v_1(t)\|_{W(t)} + \|\tilde v_2(t)\|_{X(t)}\big) \lesssim \|v_0\|_{l^2}. \tag{52}$$

By (51) and (52), we get $\|v(T_0)\|_{l^2} + e^{-ax(T_0)}\|v_2(T_0)\|_{l^2_a} \lesssim \|v_0\|_{l^2}$. Hence it follows from Lemma 1 that the decomposition (8), (17) and (18) can be extended beyond $t = T_0$ if $\|v_0\|_{l^2}$ is small. This is a contradiction. Thus we prove $T_0 = \infty$ for small $v_0 \in l^2$.

Let $\delta$ be a small positive number such that $T_0 = T_1 = \infty$. Then Lemma 5 (ii) and Lemma 6 imply $\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)} \in L^2(0,\infty)$. Thus by Lemma 4, we see that $\dot c(t)$ is integrable on $[0,\infty)$ and that there exists $c_+$ satisfying $\lim_{t\to\infty}c(t) = c_+$. Next, we will prove (46). As in the proof of Corollary 2, we can prove $\lim_{t\to\infty}\|v_1(t)\|_{W(t)} = 0$. Combining this with (54), we have

$$\lim_{t\to\infty}\dot x(t) = \lim_{t\to\infty}c(t) = c_+. \tag{53}$$

By (40), Lemma 6 and the fact that $\|v_1(t)\|_{W(t)} \in L^2(0,\infty)$,

$$\begin{aligned}
\|v_2(t)\|_{X(t)} &\lesssim \|v_1(t)\|_{W(t)} + \|\tilde v_2(t)\|_{X(t)}\\
&\lesssim \|v_1(t)\|_{W(t)} + e^{-bt/4}\|v_0\|_{l^2} + \sup_{t/2\le s\le t}\|v_1(s)\|_{W(s)} + e^{-bt/8}\Big(\int_0^{t/2}\|v_1(s)\|^2_{W(s)}\,ds\Big)^{1/2} \to 0 \quad \text{as } t \to \infty.
\end{aligned} \tag{54}$$

Since $\|v_2(t)\|_{l^2(n \ge x(t) - R)} \lesssim \|v_2(t)\|_{X(t)}$ for every $R > 0$, Corollary 2 and (54) imply (46). Combining this with (53) and (54), we have

$$\lim_{t\to\infty}\dot x(t) = \lim_{t\to\infty}c(t) = c_+.$$

We have thus completed the proof of Proposition 2.
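The passage from (32) to the decay estimate (54) rests on splitting the convolution integral at $s = t/2$. A sketch of that computation, writing $f(s) := \|v_1(s)\|_{W(s)}$ (a shorthand introduced here) and using the Cauchy-Schwarz inequality on the first piece:

```latex
\begin{align*}
\int_0^{t} e^{-b(t-s)/4} f(s)\,ds
 &= \int_0^{t/2} e^{-b(t-s)/4} f(s)\,ds
  + \int_{t/2}^{t} e^{-b(t-s)/4} f(s)\,ds,\\
\int_0^{t/2} e^{-b(t-s)/4} f(s)\,ds
 &\le \Big(\int_0^{t/2} e^{-b(t-s)/2}\,ds\Big)^{1/2}
      \Big(\int_0^{t/2} f(s)^2\,ds\Big)^{1/2}
  \le \Big(\frac{2}{b}\Big)^{1/2} e^{-bt/8}
      \Big(\int_0^{t/2} f(s)^2\,ds\Big)^{1/2},\\
\int_{t/2}^{t} e^{-b(t-s)/4} f(s)\,ds
 &\le \frac{4}{b}\,\sup_{t/2\le s\le t} f(s).
\end{align*}
```

The first piece tends to zero because $f \in L^2(0,\infty)$, and the second because $f(s) \to 0$ as $s \to \infty$.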

Combining Proposition 2 and the monotonicity argument given in [16], we obtain Theorem 1.

Proof of Theorem 1. Put

$${}^t(\tilde r(t,n), \tilde p(t,n)) := v(t,n), \qquad h(t,n) = \tfrac12\tilde p(t,n)^2 + V(\tilde r(t,n)),$$

$$N_3(t) = J\big\{H'(u_{c(t)}(\gamma(t)) + v(t)) - H'(u_{c(t)}(\gamma(t))) - H'(v(t))\big\}.$$

Let $\sigma \in (1,c_+)$, $t_1 \ge 0$ and $\tilde x(t) = x(t_1) + \sigma(t - t_1)$. Let $\psi_a(t,n)$ and $\tilde\psi_a(t,n)$ be as in Lemma 5. Then


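$$\begin{aligned}
\frac{d}{dt}\sum_{n\in\mathbb{Z}}\psi_a(t,n)h(t,n)
&= \langle H'(v(t)), \psi_a(t)\dot v(t)\rangle + \sum_{n\in\mathbb{Z}}\partial_t\psi_a(t,n)h(t,n)\\
&= \sum_{n\in\mathbb{Z}}\tilde p(t,n)V'(\tilde r(t,n-1))\big(\psi_a(t,n-1) - \psi_a(t,n)\big)\\
&\quad + \langle\psi_a(t)l_1(t), H'(v(t))\rangle + \langle\psi_a(t)N_3(t), H'(v(t))\rangle + \sum_{n\in\mathbb{Z}}\partial_t\psi_a(t,n)h(t,n).
\end{aligned}$$

Here we use $\frac{dv}{dt} = JH'(v(t)) + l_1(t) + N_3(t)$. Suppose that $a > 0$ and $\|v_0\|_{l^2}$ are sufficiently small. Since $\|v(t)\|_{l^2} \lesssim \|v_0\|_{l^2}$ follows from Proposition 2, we see that there exists a $\delta > 0$ such that

$$\frac{d}{dt}\sum_{n\in\mathbb{Z}}\psi_a(t,n)h(t,n) \le -\delta\sum_{n\in\mathbb{Z}}\tilde\psi_a(t,n)\big(\tilde r(t,n)^2 + \tilde p(t,n)^2\big) + \langle\psi_a(t)l_1(t), H'(v(t))\rangle + \langle\psi_a(t)N_3(t), H'(v(t))\rangle$$

in exactly the same way as the proof of Lemma 5. By the definitions of $l_1(t)$ and $N_3(t)$ and Lemma 4,

$$|N_3(t)| \lesssim |u_{c(t)}(\gamma(t))v(t)|, \qquad |\langle l_1(t), H'(v(t))\rangle| \lesssim \big(\|v_1(t)\|_{W(t)} + \|v_2(t)\|_{X(t)}\big)^2.$$

Combining the above, we have

$$\begin{aligned}
\sum_{n\in\mathbb{Z}}\psi_a(t,n)\big(\tilde r(t,n)^2 + \tilde p(t,n)^2\big)
&\lesssim \sum_{n\in\mathbb{Z}}\psi_a(t_1,n)h(t_1,n) + \int_{t_1}^t\big(\|v_1(s)\|_{W(s)} + \|v_2(s)\|_{X(s)}\big)^2\,ds\\
&\lesssim \sum_{n\in\mathbb{Z}}\psi_a(t_1,n)\big(|v_1(t_1,n)|^2 + |v_2(t_1,n)|^2\big) + \int_{t_1}^t\big(\|v_1(s)\|_{W(s)} + \|v_2(s)\|_{X(s)}\big)^2\,ds.
\end{aligned}$$

As in the proof of Corollary 2, we have

$$\lim_{t_1\to\infty}\sum_{n\in\mathbb{Z}}\psi_a(t_1,n)|v_1(t_1,n)|^2 = 0.$$

On the other hand, Lemma 6 implies

$$\sum_{n\in\mathbb{Z}}\psi_a(t_1,n)|v_2(t_1,n)|^2 \lesssim \|v_2(t_1)\|^2_{X(t_1)} \to 0 \quad \text{as } t_1 \to \infty.$$

Furthermore, Lemmas 5 and 6 and Proposition 2 imply

$$\lim_{t_1\to\infty}\int_{t_1}^\infty\big(\|v_1(s)\|^2_{W(s)} + \|v_2(s)\|^2_{X(s)}\big)\,ds = 0.$$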


Combining the above, we obtain

$$\lim_{t_1\to\infty}\sup_{t\ge t_1}\|v(t)\|_{l^2(n \ge \sigma t)} = 0.$$

Thus we complete the proof of Theorem 1.

4. Proof of Theorem 2

In this section, we will prove orbital and asymptotic stability of solitary waves to the FPU lattice equation (9). For a two-parameter family of solitary wave solutions $\{u_c(t + \delta) : c \in [c_1,c_2], \delta \in \mathbb{R}\}$ that satisfies the conditions (P1)–(P4) below, we can prove the orbital and asymptotic stability of solitary wave solutions in exactly the same way as Theorem 1.

(P1) There exists an open interval $I$ such that $V''(r) > 0$ for every $r \in I$ and that $\{r_c(x) : x \in \mathbb{R}\} \subset I$ for every $c \in [c_1,c_2]$.

(P2) There exists an $a > 0$ such that the map $\mathbb{R} \times [c_1,c_2] \ni (t,c) \mapsto u_c(t) \in l^2_a \cap l^2_{-a}$ is $C^2$.

(P3) The solitary wave energy $H_F(u_c)$ satisfies $dH_F(u_c)/dc \ne 0$ for $c \in [c_1,c_2]$.

(P4) Let $c_0 \in [c_1,c_2]$ and $a \in (0, 2\kappa(c_0/c_s))$. Let $U_0(t,\tau)\varphi$ be a solution to

$$\frac{dv}{dt} = JH_F''(u_{c_0})v, \qquad v(\tau) = \varphi. \tag{55}$$

Then there exist positive numbers $b$ and $K$ such that for every $\varphi \in l^2_a$ and $t \ge \tau$,

$$e^{-ac_0(t-\tau)}\|U_0(t,\tau)Q_{c_0}(\tau)\varphi\|_{l^2_a} \le Ke^{-b(t-\tau)}\|\varphi\|_{l^2_a}.$$

Proof of Theorem 2. If $c > c_s$ and $c$ is sufficiently close to $c_s$, there exists a unique solitary wave solution to (10) up to translation ([8, Theorem 1.1]). By [8, Theorem 1.1], we see that a solitary wave solution satisfies (P1) and (P3) if $c$ is close to $c_s$. Slightly modifying the proof of [8, Prop. 6.1] and [9, Prop. A.3], we obtain (P2). Since (P4) holds for small solitary waves (see [11]), Theorem 2 can be proved in exactly the same way as Theorem 1.

Acknowledgement. The author would like to express his gratitude to Professor Robert L. Pego for his hospitality at Carnegie Mellon University where this work was carried out.

References

1. Benjamin, T.B.: The stability of solitary waves. Proc. Roy. Soc. London A 328, 153–183 (1972)
2. Bona, J.L.: On the stability of solitary waves. Proc. Roy. Soc. London A 344, 363–374 (1975)
3. Bona, J.L., Chen, M., Saut, J.C.: Boussinesq equations and other systems for small-amplitude long waves in nonlinear dispersive media I. Derivation and Linear Theory. J. Nonlinear Sci. 12(4), 283–318 (2002)
4. Bona, J.L., Chen, M., Saut, J.C.: Boussinesq equations and other systems for small-amplitude long waves in nonlinear dispersive media II. The Nonlinear Theory. Nonlinearity 17, 925–952 (2004)
5. Cazenave, T.: Semilinear Schrödinger equations. Courant Lecture Notes in Mathematics 10, New York: New York University, Courant Institute of Mathematical Sciences. Providence, RI: Amer. Math. Soc., 2003
6. Ei, S.-I.: The motion of weakly interacting pulses in reaction-diffusion systems. J. Dynam. Diff. Eq. 14, 85–137 (2002)
7. Flaschka, H.: On the Toda lattice. II. Inverse-scattering solution. Progr. Theor. Phys. 51, 703–716 (1974)
8. Friesecke, G., Pego, R.L.: Solitary waves on FPU lattices I, Qualitative properties, renormalization and continuum limit. Nonlinearity 12, 1601–1627 (1999)
9. Friesecke, G., Pego, R.L.: Solitary waves on FPU lattices. II, Linear Implies Nonlinear Stability. Nonlinearity 15, 1343–1359 (2002)
10. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices III, Howland-type Floquet theory. Nonlinearity 17, 207–227 (2004)
11. Friesecke, G., Pego, R.L.: Solitary waves on Fermi-Pasta-Ulam lattices IV, Proof of stability at low energy. Nonlinearity 17, 229–251 (2004)
12. Friesecke, G., Wattis, J.: Existence theorem for solitary waves on lattices. Commun. Math. Phys. 161, 391–418 (1994)
13. Grillakis, M., Shatah, J., Strauss, W.A.: Stability Theory of solitary waves in the presence of symmetry I. J. Diff. Eq. 74, 160–197 (1987)
14. Grillakis, M., Shatah, J., Strauss, W.A.: Stability Theory of solitary waves in the presence of symmetry II. J. Funct. Anal. 94, 308–348 (1990)
15. Martel, Y., Merle, F.: Asymptotic stability of solitons for subcritical generalized KdV equations. Arch. Ration. Mech. Anal. 157(3), 219–254 (2001)
16. Martel, Y., Merle, F.: Asymptotic stability of solitons of the subcritical gKdV equations revisited. Nonlinearity 18, 55–80 (2005)
17. Mizumachi, T.: Weak interaction between solitary waves of the generalized KdV equations. SIAM J. Math. Anal. 35, 1042–1080 (2003)
18. Mizumachi, T., Pego, R.L.: Asymptotic stability of Toda lattice solitons. Nonlinearity 21, 2061–2071 (2008)
19. Kato, T.: On the Cauchy problem for the (generalized) Korteweg-de Vries equation, Studies in applied mathematics, Adv. Math. Suppl. Stud. 8, New York: Academic Press, 1983, pp. 93–128
20. Pego, R.L., Smereka, P., Weinstein, M.I.: Oscillatory instability of solitary waves in a continuum model of lattice vibrations. Nonlinearity 8, 921–941 (1995)
21. Pego, R.L., Weinstein, M.I.: Asymptotic stability of solitary waves. Commun. Math. Phys. 164, 305–349 (1994)
22. Promislow, K.: A renormalization method for modulational stability of quasi-steady patterns in dispersive systems. SIAM J. Math. Anal. 33, 1455–1482 (2002)
23. Quintero, J.R.: Nonlinear stability of a one-dimensional Boussinesq equation. J. Dynam. Diff. Eq. 15, 125–142 (2003)
24. Quintero, J.R., Pego, R.L.: Asymptotic stability of solitary waves in the Benney-Luke model of water waves. Unpublished manuscript
25. Shatah, J., Strauss, W.A.: Instability of nonlinear bound states. Commun. Math. Phys. 100, 173–190 (1985)
26. Toda, M.: Nonlinear waves and solitons. Mathematics and its Applications (Japanese Series) 5, Dordrecht: Kluwer Academic Publishers Group, Tokyo: SCIPRESS, 1989

Communicated by P. Constantin




Global Wellposedness in the Energy Space for the Maxwell-Schrödinger System

Ioan Bejenaru1, Daniel Tataru2

1 Department of Mathematics, Texas A&M University, College Station, TX 77843-3368, USA. E-mail: [email protected]

2 Department of Mathematics, University of California, Berkeley, CA 94720, USA

Received: 19 December 2007 / Accepted: 22 December 2008 Published online: 13 March 2009 – © Springer-Verlag 2009

Abstract: We prove that the Maxwell-Schrödinger system in R3+1 is globally well-posed in the energy space. The key element of the proof is to obtain a short time wave packet parametrix for the magnetic Schrödinger equation, which leads to linear, bilinear and trilinear estimates. These, in turn, are extended to larger time scales via a bootstrap argument.

1. Introduction

The Maxwell-Schrödinger system in $\mathbb{R}^{3+1}$ describes the evolution of a charged nonrelativistic quantum mechanical particle interacting with the classical electro-magnetic field it generates. It has the form

$$\begin{cases}
iu_t - \Delta_A u = \phi u, & \\
-\Delta\phi + \partial_t\operatorname{div}A = \rho, & \rho = |u|^2,\\
\Box A + \nabla(\partial_t\phi + \operatorname{div}A) = J, & J = 2\,\mathrm{Im}(\bar u\,\nabla_A u),
\end{cases} \tag{1}$$

where $u$ is the wave function of the particle, $(\phi, A)$ is the electro-magnetic potential, $(u,\phi,A) : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{C} \times \mathbb{R} \times \mathbb{R}^3$ and $\nabla_A = \nabla - iA$, $\Delta_A = \nabla_A^2$. The system is invariant under the gauge transform

$$(u,\phi,A) \to (e^{i\lambda}u,\ \phi - \partial_t\lambda,\ A + \nabla\lambda),$$

The first author was partially supported by NSF grant DMS0738442.

The second author was partially supported by NSF grant DMS0354539.


where $\lambda : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}$. To remove this degree of freedom we need to fix the gauge. In this article we choose to work in the Coulomb gauge

$$\operatorname{div}A = 0. \tag{2}$$

Under this assumption, the system can be rewritten as

$$iu_t - \Delta_A u = \phi u, \qquad \Box A = PJ, \tag{3}$$

where $\phi = (-\Delta)^{-1}(|u|^2)$ and $P = 1 - \nabla\operatorname{div}\Delta^{-1}$ is the projection onto divergence-free vector functions, also called the Helmholtz projection. We consider the above system with a set of initial data chosen in Sobolev spaces:

$$(u(0), A(0), A_t(0)) = (u_0, A_0, A_1) \in H^s \times H^\sigma \times H^{\sigma-1}.$$

The gauge condition (2) is conserved in time provided the initial data $(A_0, A_1)$ satisfies it, due to the form of the second equation in (3). The conserved quantities associated to the system are the charge and the energy,

$$Q(u) = \int_{\mathbb{R}^3}|u|^2\,dx, \qquad E(u) = \int_{\mathbb{R}^3}|\nabla_A u|^2 + \tfrac12\big(|A_t|^2 + |\nabla_x A|^2\big) + \tfrac12|\nabla\phi|^2\,dx.$$

The local well-posedness of the system in various Sobolev spaces above the energy level is known, see [9,12]. On the other hand the existence of weak energy solutions is established in [2]. The main outstanding problem which we seek to address is the well-posedness in the energy space. Our result is

Theorem 1. The Maxwell-Schrödinger system (3) is globally well-posed in the energy space $H^1 \times H^1 \times L^2$ in the following sense:

i) (regular solutions) For each initial data $(u_0, A_0, A_1) \in H^2 \times H^2 \times H^1$ there exists a unique global solution $(u,A) \in C(\mathbb{R}, H^2) \times C(\mathbb{R}, H^2) \cap C^1(\mathbb{R}, H^1)$.

ii) (rough solutions) For each initial data $(u_0, A_0, A_1) \in H^1 \times H^1 \times L^2$ there exists a global solution $(u,A) \in C(\mathbb{R}, H^1) \times C(\mathbb{R}, H^1) \cap C^1(\mathbb{R}, L^2)$, which is the unique strong limit of the regular solutions in (i).

iii) (continuous dependence) The solutions $(u,A)$ in (ii) depend continuously on the initial data in $H^1 \times H^1 \times L^2$.
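The gauge invariance stated in the Introduction can be checked directly from the definition $\nabla_A = \nabla - iA$. The following is a sketch of that check for the Schrödinger equation in the system; the primed quantities $u' = e^{i\lambda}u$, $\phi' = \phi - \partial_t\lambda$, $A' = A + \nabla\lambda$ are notation introduced here.

```latex
\begin{align*}
\nabla_{A'}u'
 &= (\nabla - iA - i\nabla\lambda)\,(e^{i\lambda}u)
  = e^{i\lambda}\big(i\nabla\lambda\,u + \nabla u - iAu - i\nabla\lambda\,u\big)
  = e^{i\lambda}\,\nabla_A u,\\
\Delta_{A'}u' &= e^{i\lambda}\,\Delta_A u,\qquad
 |u'|^2 = |u|^2,\qquad
 \mathrm{Im}\big(\bar u'\,\nabla_{A'}u'\big) = \mathrm{Im}\big(\bar u\,\nabla_A u\big),\\
i\,\partial_t u' - \Delta_{A'}u' - \phi' u'
 &= e^{i\lambda}\big(iu_t - \lambda_t u - \Delta_A u - \phi u + \lambda_t u\big)
  = e^{i\lambda}\big(iu_t - \Delta_A u - \phi u\big).
\end{align*}
```

So $u'$ solves the Schrödinger equation with potentials $(\phi', A')$ whenever $u$ solves it with $(\phi, A)$, and the sources $\rho$ and $J$ are unchanged. This is why a gauge condition such as (2) may be imposed without loss of generality.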


We remark that in the process of proving the above results we establish some additional regularity properties for the energy solutions $(u,A)$ which suffice both for the uniqueness and the continuous dependence results. Traditionally these regularity properties are described using $X^{s,b}$ type spaces. Instead here we use the related $U^2$ and $V^2$ type spaces associated to both the wave equation and the magnetic Schrödinger equation. These are introduced in the next section; for more details we refer the reader to [3,6,7].

Remark 2. We note that in some directions our analysis yields stronger results than as stated in the theorem. Precisely, the same arguments as those in Sect. 5 also yield:

a) Local in time a-priori estimates in $H^\beta \times H^1 \times L^2$ for $\beta > \frac12$. This is exactly the range allowed for $\beta$ in Lemma 25 and Lemma 26 (a).

b) Local in time well-posedness in $H^\beta \times H^1 \times L^2$ for $\beta > \frac34$. This reduced range arises due to Lemma 26 (b).

The nonlinearities on the right-hand side of both equations in (3) are fairly mild. Indeed, if $\Delta_A$ were replaced by $\Delta$ then it would be quite straightforward to iteratively close the argument in $X^{s,b}$ or Strichartz spaces. For the magnetic potential $A$ it is quite reasonable to hope to obtain an $X^{s,b}$ type regularity. Thus the main difficulty stems from the linear magnetic Schrödinger equation

$$iu_t - \Delta_A u = f, \qquad u(0) = u_0. \tag{4}$$

The linear and bilinear estimates for $L^2$ solutions to (4) are summarized in Theorem 9 in Sect. 4. The rest of the section is devoted to the well-posedness of (4) in $H^2$, $H^{-2}$ and intermediate spaces.

The proof of our main result is completed in the following section. The first step is to establish local in time a-priori bounds for solutions to (3), first in $H^1$ and then in more regular spaces. This is done by treating the nonlinearities on the right of the equations in a perturbative manner. The transition from local in time to global in time is straightforward, using the conserved energy. The second step is to establish the continuous dependence on the initial data. This is a consequence of a Lipschitz dependence result in a weaker topology. Precisely, we show that the corresponding linearized equation is well-posed in $L^2 \times H^{1/2} \times H^{-1/2}$.

The rest of the paper is devoted to the study of $L^2$ solutions for (4), with the aim of proving Theorem 9. Previous approaches establish Strichartz estimates with a loss of derivatives for this equation in a perturbative manner, starting from the free Schrödinger equation. This no longer suffices for $A$ in the energy space, and instead one needs to study directly the dispersive properties for the linear magnetic Schrödinger equation. Our approach uses some of the ideas described in [5] and [6]. To each dyadic frequency $\lambda$, we associate the time scale $\lambda^{-1}$. On this time scale we show that at frequency $\lambda$ Eq. (4) is well approximated by its paradifferential truncation, which is roughly $iu_t - \Delta_A$ […]

[…] $> 0$ they satisfy the bound

$$I^T_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} \lesssim_A \lambda_1^{-\delta}\lambda_2^{-\delta}\lambda_3^{-\delta}\lambda_4^{-\delta}\cdot RHS((67)). \tag{68}$$



We begin with a weaker bound which follows directly from Strichartz estimates, namely 1

−1 −1 −2

2 IλT1 ,λ2 ,λ3 ,λ4 A T 3 λ1 4 λ2 4 λ3 3 λ4N u U 2 H β A U 2

W

A

H 1 B L ∞ L 2 .

Arguing as in Lemma 8, this allows us to replace (68) with −δ −δ −δ IλT1 ,λ2 ,λ3 ,λ4 A λ−δ 1 λ2 λ3 λ4 · modified R H S((67)),

(69)

1

where we have replaced the VW2 H (λ 2 /m) space with the similar U 2 space in the righthand side of (67). Due to (13) both wave factors are l 2 summable with respect to unit spatial cubes, therefore it is enough to estimate the above integral on a unit cube Q. Also we must have λ4 ≤ max{λ1 , λ2 , λ3 }. Hence we consider two cases. If max{λ1 , λ2 , λ3 } = λ1 (or λ2 ) then we estimate 1

IλT1 ,λ2 ,λ3 ,λ4 ≤ T 6 Sλ1 u L 4 L 3 Sλ2 u L 2 L ∞ Sλ3 A L 3 L 6 Sλ4 B L ∞ L 2 1

≤ T

1 6

− 1 +ε 1 +ε−β − 16 λ1 2 λ22 λ3

λ12 m(λ4 ) 1 2

λ4 m(λ1 )

u U 2 H (m) u U 2 H β A V 2 H 1 B A

A

2 H( UW

W

√

λ m )

.

By (63) the fraction above is less than one, therefore for small enough ε the bound (69) follows. The second case is when max{λ1 , λ2 , λ3 } = λ3 . Then we estimate 1

IλT1 ,λ2 ,λ3 ,λ4 ≤ T 12 Sλ1 u L 4 L 6 Sλ2 u L 2 L ∞ Sλ3 A L 6 L 3 Sλ4 B L ∞ L 2 1

A T

1 12

1 3 +ε−β

λ1

1 2 +ε−β

λ2

−1 λ3 6

λ32 m(λ4 ) 1 2

λ4 m(λ3 )

u U 2 H β u U 2 H β A U 2 A

A

W

H (m) B U 2 H ( W

and (69) again follows. b) By duality we have two estimates to prove. The first one is trilinear, T A T δ u 2 2 v 2 β B u∇v ¯ Bd xdt 1 U L U H 2 0

R3

A

with a divergence free B. The second one is quadrilinear, T uu ¯ ABd xdt A T δ u U 2 H β v U 2 L 2 A U 2 0 R3

A

A

W

,

(70)

VW H 2

A

√ λ m )

H 1 B V 2 H 21 . W

(71)

Since B is divergence free, in (70) we can place the gradient on either u or v. The argument is similar to the one in part (a), using (60) and (61), as well as Lemma 8 in order to substitute the V 2 norm by the U 2 norm. The new restriction β > 43 arises in the case when the low frequency is on B. Indeed, if µ λ then by (60) we have 1

1

T (u, ∇v, B)| A T δ λ1−β+ µ− 2 min{1, µλ− 2 } u U 2 L 2 v U 2 H β B |Iλ,λ,µ A

1

A

3

The worst case is when µ = λ 2 , when the coefficient above is T δ λ 4 −β+ε .

1

VW2 H 2

.



The estimate (71) is also proved as in part (a). Indeed, by Corollary 7 we can substitute the $V^2$ space by $U^2$ at the expense of losing $\varepsilon$ derivatives. Because of the finite speed of propagation for the wave equation, see (13), we can reduce the problem to the case when $A$, $B$ are supported in a unit cube $Q$. There we can use the Strichartz estimates for the wave equation, respectively the local Strichartz estimates for the Schrödinger equation.

The next step in the proof of the theorem is to establish an a-priori $H^1$ estimate for $H^2$ solutions. This is obtained in terms of the conserved quantities in our problem, namely $E$ and $Q$.

Proposition 27. Let $(u, A)$ be an $H^2$ solution for (3) in some time interval $[0, T_0]$ with $T_0 \le 1$. Then
\[
\|u\|_{U^2_A(0,T_0;H^1)} + \|A\|_{U^2_W(0,T_0;H^1)} \le c(E,Q).
\]

Proof. We use a bootstrap argument. Since $u, A \in L^\infty H^2$, we can easily estimate the wave nonlinearity and obtain $\Box A \in L^\infty H^1$. This implies that $A \in U^2_WH^1$, and, in addition, that the function
\[
T \to \|A\|_{U^2_W(0,T;H^1)}
\]
is continuous and satisfies
\[
\lim_{T\to0}\|A\|_{U^2_W(0,T;H^1)} = \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2}.
\]
A similar argument applies in the case of the Schrödinger equation. By (9) we estimate in the Schrödinger equation
\[
\|u\|_{U^2_A(0,T;H^1)} \lesssim_A \|u(0)\|_{H^1} + \|iu_t - \Delta_Au\|_{DU^2_A(0,T;H^1)}.
\]
Then by (62) we obtain
\[
\|u\|_{U^2_A(0,T;H^1)} \le C^1_A\bigl(\|u(0)\|_{H^1} + T^\delta\|u\|^3_{U^2_A(0,T;H^1)}\bigr).
\]
Similarly we can use (64) to obtain a bound for the wave equation
\[
\|A\|_{U^2_W(0,T;H^1)} \le \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2} + T^\delta C^2_A\|u\|^2_{U^2_A(0,T;H^1)}.
\]
We multiply the first equation by $c^1_A = (C^1_A)^{-1}$ and add to the second equation to obtain
\[
\|A\|_{U^2_W(0,T;H^1)} + c^1_A\|u\|_{U^2_A(0,T;H^1)} \le \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2} + T^\delta C^3_A\bigl(1 + \|u\|^3_{U^2_A(0,T;H^1)}\bigr).
\]
We make the bootstrap assumption
\[
c^1_A\|u\|_{U^2_A(0,T;H^1)} + \|A\|_{U^2_W(0,T;H^1)} \le 2 + \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2}.
\]
Then the previous bound implies that
\[
c^1_A\|u\|_{U^2_A(0,T;H^1)} + \|A\|_{U^2_W(0,T;H^1)} \le \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2} + T^\delta C(E,Q)\bigl(1 + \|u\|^3_{U^2_A(0,T;H^1)}\bigr).
\]



This shows that for $T \le T_0(E,Q)$ we have
\[
\|u\|_{U^2_A(0,T;H^1)} + \|A\|_{U^2_W(0,T;H^1)} \le 1 + \|u(0)\|_{H^1} + \|A(0)\|_{H^1} + \|A_t(0)\|_{L^2}, \tag{72}
\]
improving our bootstrap assumption. Hence a continuity argument shows that (72) holds without any bootstrap assumption. The conclusion of the proposition follows by summing up with respect to $T_0(E,Q)$ time intervals.

The next step is to establish an a-priori $H^2$ bound with constants which depend only on the $H^1$ size of the data.

Proposition 28. Let $(u, A)$ be an $H^2$ solution for (3) in some time interval $[0, T_0]$ with $T_0 \le 1$. Then
\[
\|u\|_{U^2_AH^2} + \|A\|_{U^2_WH^2} \le c(E,Q)\bigl(\|u_0\|_{H^2} + \|A_0\|_{H^2} + \|A_1\|_{H^1}\bigr).
\]
Proof. The argument is similar to the one above.
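For orientation, the way the bootstrap in the proof of Proposition 27 closes can be written out as follows. This is a sketch under simplifying assumptions: $N(T)$, $D$, $c$ and $C$ are shorthand introduced here (for the left-hand side of the bootstrap assumption, the size of the data, the constant $c^1_A$, and $C(E,Q)$), not notation from the text.

```latex
\[
N(T) := c\,\|u\|_{U^2_A(0,T;H^1)} + \|A\|_{U^2_W(0,T;H^1)},\qquad
D := \|u(0)\|_{H^1}+\|A(0)\|_{H^1}+\|A_t(0)\|_{L^2}.
\]
% Bootstrap assumption: N(T) \le 2 + D, hence \|u\|_{U^2_A(0,T;H^1)} \le c^{-1}(2+D).
\[
N(T) \le D + T^\delta C\bigl(1+\|u\|^3_{U^2_A(0,T;H^1)}\bigr)
      \le D + T^\delta C\bigl(1 + c^{-3}(2+D)^3\bigr).
\]
% If T is small enough that T^\delta C (1 + c^{-3}(2+D)^3) \le 1, then
\[
N(T) \le 1 + D < 2 + D .
\]
```

Since $T \mapsto N(T)$ is continuous, the strict improvement from $2+D$ to $1+D$ is what allows the bound to be propagated from small times to the whole interval on which the smallness condition on $T$ holds.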

Given the local well-posedness result in $H^2$ proved in earlier work [9], we can iterate the argument and conclude that the $H^2$ solutions are global. Finally, our last a-priori estimate is in intermediate spaces:

Proposition 29. Let $m$ be a weight which satisfies (54) and (63). Let $(u, A)$ be an $H^2$ solution for (3) in the time interval $[0, 1]$. Then
\[
\|u\|_{U^2_AH(m)} + \|A\|_{U^2_WH(m)} \le c(E,Q)\bigl(\|u_0\|_{H(m)} + \|A_0\|_{H(m)} + \|A_1\|_{H(\lambda^{-1}m)}\bigr).
\]
Proof. The argument is again similar to the one above.

In order to obtain $H^1$ solutions and to study the dependence of the solutions on the initial data we need to obtain estimates for differences of solutions. Given a solution $(u, A)$ to (3) we consider the corresponding linearized problem
\[
\left\{
\begin{aligned}
iv_t - \Delta_Av &= 2iB\nabla u + 2ABu + \phi v + \Delta^{-1}(u\bar v)u\\
\Box B &= P(\bar v\nabla_Au + \bar u\nabla_Av + B|u|^2).
\end{aligned}
\right. \tag{73}
\]
Our main estimate for the linearized problem is

Proposition 30. Let $(u, A)$ be an $H^2$ solution for (3) in the time interval $[0, 1]$. Then the linearized problem (73) is well-posed in $L^2 \times H^{\frac12} \times H^{-\frac12}$ uniformly with respect to $(u, A)$ in a bounded set in the energy space,
\[
\|v\|_{U^2_AL^2} + \|B\|_{U^2_WH^{\frac12}} \le c(E,Q)\bigl(\|v_0\|_{L^2} + \|B_0\|_{H^{\frac12}} + \|B_1\|_{H^{-\frac12}}\bigr). \tag{74}
\]
Proof. The conclusion follows iteratively in short time intervals provided that we obtain appropriate estimates for the terms on the right:
\[
\|2iB\nabla u + 2ABu + \phi v + \Delta^{-1}(u\bar v)u\|_{DU^2_AL^2} \lesssim T^\delta c(E,Q)\bigl(\|v\|_{U^2_AL^2} + \|B\|_{U^2_WH^{\frac12}}\bigr),
\]


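respectively
\[
\|P(\bar v\nabla_Au + \bar u\nabla_Av + iB|u|^2)\|_{DU^2_WH^{-\frac12}} \lesssim T^\delta c(E,Q)\bigl(\|v\|_{U^2_AL^2} + \|B\|_{U^2_WH^{\frac12}}\bigr).
\]
These in turn follow by duality from the trilinear and quadrilinear bounds
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} B\nabla u_1\bar u_2\,dx\,dt\right| \lesssim_A T^\delta\,\|B\|_{U^2_WH^{\frac12}}\|u_1\|_{U^2_AH^1}\|u_2\|_{V^2_AL^2}, \tag{75}
\]
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} ABu_1\bar u_2\,dx\,dt\right| \lesssim_A T^\delta\,\|A\|_{U^2_WH^1}\|B\|_{U^2_WH^{\frac12}}\|u_1\|_{U^2_AH^1}\|u_2\|_{V^2_AL^2}, \tag{76}
\]
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} \Delta^{-1}(u_1\bar u_2)u_3\bar u_4\,dx\,dt\right| \lesssim_A T^\delta\,\|u_1\|_{U^2_AH^1}\|u_2\|_{U^2_AH^1}\|u_3\|_{U^2_AL^2}\|u_4\|_{V^2_AL^2}, \tag{77}
\]
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} \Delta^{-1}(u_1\bar u_2)u_3\bar u_4\,dx\,dt\right| \lesssim_A T^\delta\,\|u_1\|_{U^2_AH^1}\|u_2\|_{U^2_AL^2}\|u_3\|_{U^2_AH^1}\|u_4\|_{V^2_AL^2}, \tag{78}
\]
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} B\nabla u_1\bar u_2\,dx\,dt\right| \lesssim_A T^\delta\,\|B\|_{V^2_WH^{\frac12}}\|u_1\|_{U^2_AH^1}\|u_2\|_{U^2_AL^2}, \tag{79}
\]
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} B\nabla u_1\bar u_2\,dx\,dt\right| \lesssim_A T^\delta\,\|B\|_{V^2_WH^{\frac12}}\|u_1\|_{U^2_AL^2}\|u_2\|_{U^2_AH^1}, \tag{80}
\]
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} ABu_1\bar u_2\,dx\,dt\right| \lesssim_A T^\delta\,\|A\|_{U^2_WH^1}\|B\|_{V^2_WH^{\frac12}}\|u_1\|_{U^2_AH^1}\|u_2\|_{U^2_AL^2}, \tag{81}
\]
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} ABu_1\bar u_2\,dx\,dt\right| \lesssim_A T^\delta\,\|A\|_{U^2_WH^{\frac12}}\|B\|_{V^2_WH^{\frac12}}\|u_1\|_{U^2_AH^1}\|u_2\|_{U^2_AH^1}. \tag{82}
\]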

The quadrilinear mixed bounds (76), (81), (82) follow trivially from the Strichartz estimates. For (76) for instance we have
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} ABu_1\bar u_2\,dx\,dt\right| \lesssim T^{\frac1{12}}\,\|A\|_{L^3L^8}\|B\|_{L^4}\|u_1\|_{L^3L^8}\|u_2\|_{L^\infty L^2}
\lesssim_A T^{\frac1{12}}\,\|A\|_{U^2_WH^1}\|B\|_{U^2_WH^{\frac12}}\|u_1\|_{U^2_AH^1}\|u_2\|_{V^2_AL^2}.
\]
We note that there is significant room for improvement in this computation by localizing first to the unit spatial scale and then using the local Strichartz estimates for the Schrödinger equation.

The quadrilinear Schrödinger bound (77) corresponds to the particular choice $m(\lambda) = 1$ and $\beta = 1 > \frac12$ in (62). For (78) we can write
\[
\left|\int_0^T\!\!\int_{\mathbb{R}^3} \Delta^{-1}(u_1\bar u_2)u_3\bar u_4\,dx\,dt\right| \lesssim \|u_1u_2\|_{L^2L^{\frac32}}\,\|u_3u_4\|_{L^2L^{\frac32}}
\]
\[
\lesssim T^{\frac13}\,\|u_1\|_{L^3L^6}\|u_2\|_{L^\infty L^2}\|u_3\|_{L^3L^6}\|u_4\|_{L^\infty L^2}
\lesssim_A T^{\frac13}\,\|u_1\|_{U^2_AH^1}\|u_2\|_{U^2_AL^2}\|u_3\|_{U^2_AH^1}\|u_4\|_{V^2_AL^2}.
\]
Finally, the bounds (79) and (80) are identical since $\operatorname{div} B = 0$ and correspond to (70). Equation (75) is essentially the same estimate.
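The powers of $T$ in the computations for (76) and (78) come from Hölder's inequality in time on an interval of length $T$. The exponent bookkeeping is tallied below as a reader's check; it only uses the Lebesgue exponents displayed in those two computations.

```latex
% (76): A in L^3 L^8, B in L^4 L^4, u_1 in L^3 L^8, u_2 in L^\infty L^2
\[
\text{time: } \tfrac13+\tfrac14+\tfrac13+0 = \tfrac{11}{12} = 1-\tfrac1{12}
\ \Longrightarrow\ \text{a factor } T^{\frac1{12}},
\qquad
\text{space: } \tfrac18+\tfrac14+\tfrac18+\tfrac12 = 1 .
\]
% (78): each product u_1 u_2 and u_3 u_4 is placed in L^2_t L^{3/2}_x
\[
\text{space: } \tfrac16+\tfrac12 = \tfrac23,
\qquad
\text{time: } \tfrac13 + 0 = \tfrac12 - \tfrac16
\ \Longrightarrow\ T^{\frac16} \text{ per product, } T^{\frac13} \text{ in total}.
\]
```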



Proof of Theorem 1, conclusion. By Proposition 30 we can obtain a weak Lipschitz dependence result for $H^2$ solutions $(u_1, A_1)$ and $(u_2, A_2)$ to (3),
\[
\|u_1 - u_2\|_{L^\infty L^2} + \|A_1 - A_2\|_{U^2_WH^{\frac12}} \le c(E_1,Q_1,E_2,Q_2)\bigl(\|(u_1-u_2)(0)\|_{L^2} + \|(A_1-A_2)(0)\|_{H^{\frac12}} + \|(A_1-A_2)_t(0)\|_{H^{-\frac12}}\bigr). \tag{83}
\]
We use this in order to construct solutions to (3) for $H^1$ initial data. Given $(u_0, A_0, A_1) \in H^1 \times H^1 \times L^2$ we consider a sequence of $H^2$ initial data $(u^n_0, A^n_0, A^n_1) \to (u_0, A_0, A_1)$ in $H^1 \times H^1 \times L^2$. The sequence $(u^n_0, A^n_0, A^n_1)$ is compact in $H^1 \times H^1 \times L^2$, therefore we can bound them uniformly in a stronger norm,
\[
\|(u^n_0, A^n_0, A^n_1)\|_{H(m)\times H(m)\times H(\lambda^{-1}m)} \le M,
\]
where $m(\lambda) \ge \lambda$ satisfies (54) and (63) and in addition
\[
\lim_{\lambda\to\infty}\lambda^{-1}m(\lambda) = \infty.
\]
By Proposition 29 we obtain a uniform bound
\[
\|u^n\|_{L^\infty H(m)} + \|A^n\|_{U^2_WH(m)} \le M.
\]
On the other hand, (83) shows that the solutions $(u^n, A^n)$ have a limit in a weaker topology,
\[
(u^n, A^n) \to (u, A) \quad\text{in } L^\infty L^2 \times U^2_WH^{\frac12}.
\]
Combining the two bounds above we obtain strong convergence in $H^1$,
\[
(u^n, A^n) \to (u, A) \quad\text{in } L^\infty H^1 \times U^2_WH^1.
\]
In addition, $u$ will also satisfy the same Strichartz estimates as $u^n$. Passing to the limit in Eq. (3) we easily see that $(u, A)$ is a solution. Due to the weak Lipschitz dependence it is also the unique uniform limit of strong solutions. Due to the Strichartz estimates we can bound the nonlinear term $\phi u$ in the Schrödinger equation as in Lemma 62. Then it also follows that $u \in U^2_AH^1$. The weak Lipschitz dependence (83) carries over to $H^1$ solutions, as well as the bounds in Propositions 27 and 29. Then the same argument as above gives the continuous dependence on the initial data.



6. Wave Packets for Schrödinger Operators with Rough Symbols

An essential part of this article is devoted to understanding the properties of the (95) flow at frequency $\lambda$ on $\lambda^{-1}$ time intervals. As it turns out, for many estimates the parameter $\lambda$ can be factored out by rescaling. This is why in this section we consider a more general equation of the form
\[
iu_t - \Delta u + a^w(t,x,D)u = 0,\qquad u(0) = u_0 \tag{84}
\]
which we study on a unit time scale. Here $a$ is a real symbol which is roughly smooth on the unit scale. For such a problem one seeks to obtain a wave packet parametrix, i.e. to write solutions as almost orthogonal superpositions of wave packets, where the wave packets are localized both in space and in frequency on the unit scale. The simplest setup is to assume uniform bounds on $a$ of the form
\[
|\partial_x^\alpha\partial_\xi^\beta a(t,x,\xi)| \le c_{\alpha\beta},\qquad |\alpha|+|\beta| \ge k.
\]
An analysis of this type has been carried out in [7,10,11]. If $k = 2$ then one obtains a wave packet parametrix where the packets travel along the Hamilton flow. If $k = 1$ the geometry simplifies, and the Hamilton flow stays close to the flow for $a = 0$; however, $a$ still affects a time modulation factor arising in the solutions. Finally if $k = 0$ then the $a^w(t,x,D)$ term is purely perturbative.

For the operators arising in the present paper the above uniform bounds on $a$ are too strong, and need to be replaced by integral bounds of the form
\[
\int_0^1 |\partial_x^\alpha\partial_\xi^\beta a(t,x^t,\xi^t)|\,dt \le c_{\alpha\beta},\qquad |\alpha|+|\beta| \ge k,
\]
where $t \to (x^t,\xi^t)$ is the associated Hamilton flow. The case $k = 2$ has been considered in [8]; as proved there, the Hamilton flow is bilipschitz and a wave packet parametrix can be constructed. The case $k = 0$ was considered in [1]; then the term $a^w(t,x,D)$ is perturbative, and one may use the $a = 0$ Hamilton flow in the above condition.

In the present article we need to deal with the case $k = 1$. This corresponds to a Hamilton flow which is close to the $a = 0$ flow. However, the term $a^w(t,x,D)$ is nonperturbative, and contributes a time modulation factor along each packet. Given these considerations, we consider the following assumption on the symbol $a$:
\[
\sup_{x,\xi}\int_0^1 |\partial_x^\alpha\partial_\xi^\beta a(t,x+2t\xi,\xi)|\,dt \le c_{\alpha,\beta},\qquad |\alpha|+|\beta| \ge 1. \tag{85}
\]
Let $(x_0,\xi_0) \in \mathbb{R}^{2n}$. To describe functions which are localized in the phase space on the unit scale near $(x_0,\xi_0)$ we use the norm:
\[
H^{N,N}_{x_0,\xi_0} := \{f : \langle D-\xi_0\rangle^Nf \in L^2,\ \langle x-x_0\rangle^Nf \in L^2\}.
\]
We work with the lattice $\mathbb{Z}^n$ both in the physical and Fourier space. We consider a partition of unity in the physical space,
\[
\sum_{x_0\in\mathbb{Z}^n}\phi_{x_0} = 1,\qquad \phi_{x_0}(x) = \phi(x-x_0),
\]



where $\phi$ is a smooth bump function with compact support. We use a similar partition of unity on the Fourier side:
\[
\sum_{\xi_0\in\mathbb{Z}^n}\varphi_{\xi_0} = 1,\qquad \varphi_{\xi_0}(\xi) = \varphi(\xi-\xi_0).
\]
An arbitrary function $u$ admits an almost orthogonal decomposition
\[
u = \sum_{(x_0,\xi_0)\in\mathbb{Z}^{2n}} u_{x_0,\xi_0},\qquad u_{x_0,\xi_0} = \varphi_{\xi_0}(D)(\phi_{x_0}u),
\]
so that
\[
\sum_{(x_0,\xi_0)\in\mathbb{Z}^{2n}}\|u_{x_0,\xi_0}\|^2_{H^{N,N}_{x_0,\xi_0}} \approx \|u\|^2_{L^2}. \tag{86}
\]
We remark that a continuous analog of the above discrete decomposition can be obtained using the Bargmann transform.

We first establish that the Hamilton flow is close to the Hamilton flow with $a = 0$:

Lemma 31. Assume that (85) holds with a small enough $\varepsilon$. Then for each $(x^0,\xi^0) \in \mathbb{R}^{2n}$ and $t \in [0,1]$ we have
\[
|x^t - (x^0+2t\xi^0)| + |\xi^t - \xi^0| \lesssim \varepsilon.
\]
The proof is straightforward and is left for the reader; it is also essentially contained in [1]. This allows us to apply the main result in [8]:

Proposition 32. Assume that (85) holds with a small enough $\varepsilon$. Then for each $N \ge 0$ the solution of the homogeneous problem (84) satisfies the following localization estimate:
\[
\|u(t)\|_{H^{N,N}_{x_0+2t\xi_0,\xi_0}} \lesssim_N \|u_0\|_{H^{N,N}_{x_0,\xi_0}}. \tag{87}
\]
We denote the evolution operator for (84) by $S(t,s)$. If the initial data is $u_0 = \delta_x$ then it has a decomposition of the form
\[
u_0 = \sum_{\xi_0\in\mathbb{Z}^n}u_{\xi_0}(0),\qquad \|u_{\xi_0}(0)\|_{H^{N,N}_{x,\xi_0}} \lesssim 1. \tag{88}
\]
By Proposition 32, at time 1 the corresponding solutions $u_{\xi_0}$ are concentrated close to $x + 2t\xi_0$, therefore they are spatially separated. Hence we obtain the following pointwise decay:

Corollary 33. The kernel $K(1,0)$ of $S(1,0)$ satisfies
\[
|K(1,x,0,y)| \lesssim 1.
\]
The solution of the homogeneous equations (84) satisfies
\[
\|S(1,0)u_0\|_{L^\infty} \lesssim \|u_0\|_{L^1}.
\]
If in addition the initial data is localized at some frequency $\lambda$, say $u_0 = S_\lambda\delta_x$, then the decomposition in (88) is restricted to the range $|\xi_0| \approx \lambda$. Then the corresponding solutions travel with speed $O(\lambda)$, and we can obtain better pointwise decay away from the propagation region:



Corollary 34. The kernel $K_\lambda(1,0)$ of $S(1,0)S_\lambda$ satisfies
\[
|K_\lambda(1,x,0,y)| \lesssim (\lambda+|x-y|)^{-N},\qquad |x-y| \not\approx \lambda. \tag{89}
\]
The kernel $K_\lambda(t,s)$ of $S(t,s)S_\lambda$ satisfies
\[
|K_\lambda(t,x,s,y)| \lesssim \lambda^{-N},\qquad |x-y| \approx \lambda,\ |t-s| \ll 1. \tag{90}
\]
The next result concerns localized energy estimates.

Corollary 35. For each ball $B_r$ of radius $r \ge 1$ the solution $u$ to (84) satisfies
\[
\|S(t,0)S_\lambda u_0\|_{L^2(B_r)} \lesssim \lambda^{-\frac12}r^{\frac12}\|u_0\|_{L^2}. \tag{91}
\]
Proof. We consider the wave packet decomposition for $u = S(t,0)S_\lambda u(0)$,
\[
u = \sum_{(x_0,\xi_0)\in\mathbb{Z}^{2n},\ |\xi_0|\approx\lambda} u_{x_0,\xi_0}.
\]
Let $\chi_r$ be a cutoff corresponding to $B_r$. Since $r \ge 1$ it follows that the functions $\chi_ru_{x_0,\xi_0}$ are almost orthogonal, therefore it suffices to prove the estimate for a single packet. But a single packet is concentrated near a tube of spatial size 1 which travels with speed $O(\lambda)$. This tube intersects the cylinder $[0,1] \times B_r$ over a time interval of length $\lambda^{-1}r$. The conclusion easily follows.

To obtain any results below the unit spatial scale we slightly strengthen the condition (85) by adding a weaker pointwise bound
\[
|\partial_x^\alpha\partial_\xi^\beta a(t,x,\xi)| \le c_{\alpha\beta}\langle\xi\rangle^{\frac12},\qquad \forall\,\alpha,\beta. \tag{92}
\]

This will guarantee that on a unit spatial scale the flow in (84) is a small perturbation of the flat Schrödinger flow. Then we have:

Proposition 36. Assume that the conditions (85) and (92) hold. Then
(i) For any $r > 0$ the solution $u$ to (84) satisfies the localized energy estimates
\[
\|S(t,0)S_\lambda u(0)\|_{L^2(B_r)} \lesssim \lambda^{-\frac12}r^{\frac12}\|u_0\|_{L^2}. \tag{93}
\]
(ii) For each $y$, $z$ with $|y-z| \approx \lambda$ we have the square function bound
\[
\Bigl\|\int_I S_\lambda S(t,s)S_\lambda(f(s)\delta_y)\,ds\Bigr\|_{L^2} \lesssim \lambda^{-1}\|f\|_{L^2}. \tag{94}
\]
Proof. To prove this result it is convenient to replace the $L^2$ initial data space by weighted $L^2$ spaces.

Definition 37. A weight $m : \mathbb{R}^{2n} \to \mathbb{R}^+$ is admissible if
\[
|m(x,\xi)/m(y,\eta)| \lesssim (1+|x-y|+|\xi-\eta|)^N
\]
for some real $N$.



Correspondingly we define a weighted $L^2$ space
\[
\|u\|^2_{L^2(m)} = \sum_{(x_0,\xi_0)\in\mathbb{Z}^{2n}}\|m(x_0,\xi_0)\,u_{x_0,\xi_0}\|^2_{H^{N,N}_{x_0,\xi_0}}.
\]
Given a weight $m_0$ at time 0 we evolve it in time by
\[
m_t(x+2t\xi,\xi) = m_0(x,\xi).
\]
As a consequence of Proposition 32 we obtain

Lemma 38. Assume that (85) holds with a small enough $\varepsilon$. Then
\[
\|S(t,s)\|_{L^2(m_s)\to L^2(m_t)} \lesssim 1.
\]
Next we consider truncated solutions on a unit spatial scale. Given a unit ball $B$ and an associated cutoff function $\chi$ we have the following weighted local energy estimates:

Lemma 39. For any solution $u$ to (84) we have
\[
\|\chi u\|_{L^2([0,1],L^2(\langle\xi\rangle^{\frac12}m_t))} + \|(i\partial_t-\Delta)\chi u\|_{L^2([0,1],L^2(\langle\xi\rangle^{-\frac12}m_t))} \lesssim \|u_0\|_{L^2(m)}.
\]
Proof. We begin again with a wave packet decomposition of $u$,
\[
u = \sum_{(x_0,\xi_0)\in\mathbb{Z}^{2n}}u_{x_0,\xi_0}.
\]
The functions $\chi u_{x_0,\xi_0}$ are almost orthogonal in $L^2$, therefore the bound for $\chi u$ follows. On the other hand we have
\[
(i\partial_t-\Delta)\chi u = -2\nabla\chi\nabla u - \Delta\chi\,u + \chi a^w(t,x,D)u.
\]
The first two terms are estimated using the bound for $\chi u$. For the last one we note that, by (92), the operator $a^w$ preserves the $H^{N,N}$ spaces,
\[
\|a^w(t,x,D)u\|_{H^{N-1,N-1}_{x_0,\xi_0}} \lesssim \langle\xi_0\rangle^{\frac12}\|u\|_{H^{N,N}_{x_0,\xi_0}}.
\]
Hence we can use orthogonality again.

Now we can conclude the proof of the proposition. For the local energy estimate (91) we first truncate $u$ to a unit scale. By the above lemma with $m = (1+\lambda^{-3}\langle\xi\rangle^3)(1+\lambda^3\langle\xi\rangle^{-3})$ we obtain
\[
\lambda^{\frac12}\|\chi_1u\|_{L^2(m')} + \lambda^{-\frac12}\|(i\partial_t-\Delta)\chi_1u\|_{L^2(m')} \lesssim \|u_0\|_{L^2(m)},
\]
where $m' = (1+\lambda^{-2}\langle\xi\rangle^2)(1+\lambda^2\langle\xi\rangle^{-2})$. It remains to show that
\[
\lambda^{\frac12}r^{-\frac12}\|\chi_ru\|_{L^2} \lesssim \lambda^{\frac12}\|\chi_1u\|_{L^2(m')} + \lambda^{-\frac12}\|(i\partial_t-\Delta)\chi_1u\|_{L^2(m')}.
\]
Then we can localize the right-hand side to the $\lambda^{-1}$ time scale. On the $\lambda^{-1}$ time scale we can use the Duhamel formula to further reduce the problem to a corresponding estimate for solutions to the homogeneous constant coefficient Schrödinger equation, namely:
\[
\lambda^{\frac12}r^{-\frac12}\|\chi_re^{-it\Delta}u_0\|_{L^2} \lesssim \|u_0\|_{L^2(m')}.
\]


After a dyadic frequency decomposition this becomes
\[
\lambda^{\frac12}r^{-\frac12}\|\chi_re^{-it\Delta}S_\lambda u_0\|_{L^2} \lesssim \|u_0\|_{L^2}
\]
which is exactly the local energy estimate for the homogeneous constant coefficient Schrödinger equation.

Consider now the square function bound. For $|t-s| \ll 1$ we can use the kernel bound (90). Hence without any restriction in generality we assume that $t \in I$, $s \in J$ where $I$, $J$ are intervals of size $O(1)$ with $O(1)$ separation. Choose $t_0$ the center of the interval between $I$ and $J$. We factor the estimate in two and prove the dual estimates
\[
\Bigl\|\int_I S(t_0,s)S_\lambda(f(s)\delta_y)\,ds\Bigr\|_{L^2(m)} \lesssim \lambda^{-\frac12}\|f\|_{L^2},
\]
respectively
\[
\|(S_\lambda S(t,t_0)u)(z)\|_{L^2} \lesssim \lambda^{-\frac12}\|u\|_{L^2(m)},
\]
where the flow invariant weight $m$ is given by
\[
m(x,\xi) = (1+\lambda^{-1}|\xi\wedge(x-y)|)^K(1+\lambda^{-1}|\xi\wedge(x-z)|)^{-K}
\]
with $K$ large enough. These are dual bounds, therefore it suffices to prove the second one. If $\chi$ is a smooth approximation of the characteristic function of $B(z,1)$, then by (a slight modification of) Lemma 39 it remains to show that $v = \chi S_\lambda u$ satisfies
\[
\lambda^{\frac12}\|v(t,x)\|_{L^2(J)} \lesssim \lambda^{\frac12}\|v\|_{L^2_tL^2(m\cdot m')} + \lambda^{-\frac12}\|(i\partial_t-\Delta)v\|_{L^2_tL^2(m\cdot m')},
\]
where the additional weight $m' = (1+\lambda^{-2}\langle\xi\rangle^2)(1+\lambda^2\langle\xi\rangle^{-2})$ can be added due to the localization to frequency $\lambda$. This estimate can be localized to the $\lambda^{-1}$ timescale. In addition, since $v$ has support in $B(z,1)$ we can freeze $x = z$ in $m$ and replace $m$ by
\[
\tilde m(\xi) = (1+\lambda^{-1}|\xi\wedge(y-z)|)^K.
\]
Assuming $y - z = O(\lambda)e_1$ we get
\[
\tilde m(\xi) = (1+|\xi'|+\lambda^{-1}|\xi_1|)^K.
\]
Then the $x'$ variable can be factored out and we are left with a bound for the one dimensional Schrödinger equation,
\[
\|e^{-it\Delta}v_0(\cdot,0)\|_{L^2(J)} \lesssim \lambda^{-\frac12}\|v_0\|_{L^2(m')}.
\]
But this is exactly the one dimensional local energy estimate.



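7. The Short Time Structure

In this section we consider a paradifferential approximation to the magnetic Schrödinger equation (18). Precisely given a dyadic frequency $\lambda$ we consider the evolution $iu_t - \Delta u + i(A$ [...]

[...] $= \langle D_a\partial_v,\partial_u\rangle = 0$, since the left-hand side is symmetric in $u$, $v$ and the right-hand side antisymmetric. Hence $D_u\partial_v = D_v\partial_u = 0$. Now, with the above choices, $[\partial_u,e_a] = -(\partial_uf/f)e_a$,
\[
\langle D_ue_1,e_2\rangle = \langle[\partial_u,e_1],e_2\rangle + \langle D_1\partial_u,e_2\rangle = \langle D_2\partial_u,e_1\rangle = \langle D_ue_2,e_1\rangle = -\langle D_ue_1,e_2\rangle,
\]
which gives $\langle D_ue_1,e_2\rangle = 0$. Since
\[
\langle D_ue_a,\partial_u\rangle = -\langle e_a,D_u\partial_u\rangle = 0,\qquad \langle D_ue_a,\partial_v\rangle = -\langle e_a,D_u\partial_v\rangle = 0,
\]
we obtain finally $D_ue_a = D_ve_a = 0$. From the above formula, we get
\[
D_a\partial_u = D_ue_a + [e_a,\partial_u] = (\partial_uf/f)e_a,\qquad D_a\partial_v = (\partial_vf/f)e_a.
\]
Hence
\[
\langle D_ae_a,\partial_u\rangle = -\langle e_a,D_a\partial_u\rangle = -\partial_uf/f,\qquad \langle D_ae_a,\partial_v\rangle = -\partial_vf/f.
\]
The values of $\langle D_ae_b,e_c\rangle$ are given by the induced connexion $\langle D_ae_b,e_c\rangle =$ [...].

3.2. Energy identity for $X = a\partial_u + b\partial_v$.

Proposition 3.2. Let $X = a\partial_u + b\partial_v$, where the coefficients $a$ and $b$ are $C^1$ functions, and depend only on $(u,v)$. Then
\[
I = Q_{\alpha\beta}\,{}^X\pi^{\alpha\beta} = \Omega^{-2}\bigl[\partial_va(\partial_u\phi)^2 + \partial_ub(\partial_v\phi)^2 - 2(Xf/f)\partial_u\phi\,\partial_v\phi\bigr] - |\nabla\!\!\!\!/\,\phi|^2(\partial_ua + \partial_vb + 2X\Omega/\Omega).
\]
Proof. 1. We first compute the deformation tensor $\pi = {}^1\pi$ of $X = \partial_u$. From the above formula, we have
\[
\pi_{uu} = \pi_{vv} = 0,\quad \pi_{uv} = 4\Omega\,\partial_u\Omega,\quad \pi_{aa} = 2\langle D_a\partial_u,e_a\rangle = 2\partial_uf/f,\quad \pi_{12} = 0,\quad \pi_{ua} = \pi_{va} = 0.
\]
In particular, $\operatorname{tr}{}^1\pi = 4(\partial_u\Omega/\Omega + \partial_uf/f)$. We have a similar formula for the deformation tensor ${}^2\pi$ of $\partial_v$. Finally,
\[
{}^X\pi_{\alpha\beta} = a\,{}^1\pi_{\alpha\beta} + b\,{}^2\pi_{\alpha\beta} + g_{u\alpha}\partial_\beta a + g_{u\beta}\partial_\alpha a + g_{v\alpha}\partial_\beta b + g_{v\beta}\partial_\alpha b,
\]
\[
\operatorname{tr}{}^X\pi = 4a(\partial_u\Omega/\Omega + \partial_uf/f) + 4b(\partial_v\Omega/\Omega + \partial_vf/f) + 2(\partial_ua + \partial_vb).
\]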


S. Alinhac

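2. We thus obtain
\[
I = Q_{\alpha\beta}\pi^{\alpha\beta} = -(1/2)|\nabla\phi|^2\operatorname{tr}\pi + \partial_\alpha\phi\,\partial_\beta\phi\,\bigl(a({}^1\pi)^{\alpha\beta} + b({}^2\pi)^{\alpha\beta}\bigr) + 2\nabla a(\phi)\partial_u\phi + 2\nabla b(\phi)\partial_v\phi.
\]
Now
\[
({}^1\pi)^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\phi = 2(\partial_u\Omega/\Omega^3)\partial_u\phi\,\partial_v\phi + 2(\partial_uf/f)|\nabla\!\!\!\!/\,\phi|^2,
\]
and similarly for ${}^2\pi$. Hence
\[
I = -|\nabla\phi|^2(\partial_ua + \partial_vb + 2X\Omega/\Omega + 2Xf/f) + \Omega^{-2}(\partial_ua\,\partial_v\phi + \partial_va\,\partial_u\phi)\partial_u\phi + \Omega^{-2}(\partial_ub\,\partial_v\phi + \partial_vb\,\partial_u\phi)\partial_v\phi + 2\Omega^{-3}\partial_u\phi\,\partial_v\phi\,X\Omega + 2(Xf/f)|\nabla\!\!\!\!/\,\phi|^2.
\]
This gives the formula.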

3.3. The photon sphere. For the Schwarzschild metric, the function $\bar h = 2\log(\Omega/f)$ depends only on $r^*$, and $\bar h'(0) = 0$. Thus, a null bicharacteristic starting tangentially from the surface $\{r^* = 0\}$ stays on it. For a general metric $g$, let $h = 2\log(\Omega/f)$. Consider a null bicharacteristic for $\Box$ ($\xi$ and $\eta$ being the dual variables of $u$ and $v$)
\[
\dot u = \eta,\quad \dot v = \xi,\quad \dot\phi = \dots,\quad \dot\theta = \dots,\quad \dot\xi = (\partial_uh)\xi\eta,\quad \dot\eta = (\partial_vh)\xi\eta.
\]
For an arbitrary function $\psi(u,v)$, let $\psi_0(s) = \psi(u(s),v(s))$ be $\psi$ on this curve. Then
\[
\dot\psi_0 = \partial_u\psi\,\eta + \partial_v\psi\,\xi,\qquad (d^2/ds^2)\psi_0 = \partial_u^2\psi\,\eta^2 + \partial_v^2\psi\,\xi^2 + \xi\eta\,(2\partial^2_{uv}\psi + \partial_u\psi\,\partial_vh + \partial_v\psi\,\partial_uh).
\]
If we take for instance $\psi = u$, we obtain $(d^2/ds^2)\psi_0 = \partial_vh\,\xi\,\dot\psi_0$, hence a null bicharacteristic starting tangentially to $\psi = C$ will stay on $\psi = C$: we know that $u$ satisfies the eikonal equation, and all level surfaces $u = C$ are characteristic surfaces. The same applies to $v$. Assume now $\partial_u\psi \ne 0$, $\partial_v\psi \ne 0$. We write then
\[
(d^2/ds^2)\psi_0 = \bigl[(\partial^2_{uu}\psi/\partial_u\psi)\eta + (\partial^2_{vv}\psi/\partial_v\psi)\xi\bigr]\dot\psi_0 + \xi\eta A
\]
with
\[
A = 2\partial^2_{uv}\psi - (\partial_v\psi/\partial_u\psi)\partial^2_{uu}\psi - (\partial_u\psi/\partial_v\psi)\partial^2_{vv}\psi + \partial_u\psi\,\partial_vh + \partial_v\psi\,\partial_uh.
\]
If $\psi$ is such that $A$ vanishes on $\psi = 0$, we deduce that a null bicharacteristic starting tangentially from $\psi = 0$ will stay on $\psi = 0$. The typical example is $\psi = u + v$ for the Schwarzschild metric, since then $A = \partial_u\bar h + \partial_v\bar h$ vanishes on $u + v = 0$. If we take simply $\psi = v - k(u)$, the above vanishing condition reads
\[
k''/k' = (\partial_uh - k'\partial_vh)(u,k(u)).
\]
Definition. A surface of equation $v = k(u)$ for which $k$ satisfies the above differential equation is called a photon sphere for $g$ (recall $h = 2\log\Omega/f$).

The following proposition gives a rough sufficient condition for such a $k$ to exist.
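The passage from the general quantity $A$ to the differential equation for $k$ is a direct substitution. It is written out here as a reader's check, with primes denoting derivatives in $u$ (the primes are assumed to be what the defining equation of the photon sphere carries).

```latex
\[
\psi = v - k(u):\qquad \partial_u\psi = -k',\quad \partial_v\psi = 1,\quad
\partial^2_{uu}\psi = -k'',\quad \partial^2_{uv}\psi = \partial^2_{vv}\psi = 0,
\]
\[
A = -\frac{\partial_v\psi}{\partial_u\psi}\,\partial^2_{uu}\psi
    + \partial_u\psi\,\partial_v h + \partial_v\psi\,\partial_u h
  = -\frac{k''}{k'} - k'\,\partial_v h + \partial_u h ,
\]
\[
A = 0 \ \text{ on } v = k(u)
\iff \frac{k''}{k'} = \bigl(\partial_u h - k'\,\partial_v h\bigr)(u,k(u)).
\]
```

For the Schwarzschild choice $k(u) = -u$ one has $k' = -1$, $k'' = 0$, and the condition reduces to $\partial_u\bar h + \partial_v\bar h = 0$ on $u + v = 0$, in agreement with the example $\psi = u + v$.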

Energy Multipliers for Perturbations of the Schwarzschild Metric


Proposition 3.3. Let $\mu \ge 0$, $\epsilon_0 > 0$ be given. Define the space $\bar C^2_\mu$ to be the space of $C^2$ functions $H$ of $(r^*,t)$, defined in the strip $[-\epsilon_0,\epsilon_0] \times \mathbb{R}$, for which the norm
\[
\|H\| = \|H\|_\infty + \|\partial_tH\|_\infty + \|\langle t\rangle^\mu\partial_{r^*}H\|_\infty + \|H''\|_\infty
\]
is finite. There exists $c_0 > 0$ such that, if $\|h - \bar h\| \le c_0$, there exists $k = k(u) \in C^2$ solution of the differential equation
\[
k''/k' = (\partial_uh - k'\partial_vh)(u,k(u))
\]
for which
\[
\|(k+u)\langle u\rangle^\mu\|_\infty + \|(k'+1)\langle u\rangle^\mu\|_\infty + \|k''\langle u\rangle^\mu\|_\infty < \infty.
\]
Moreover, $k + u$ is a $C^1$ function of $h - \bar h$ vanishing when $h = \bar h$.

Proof. 1. This is a simple application of the implicit function theorem. Set $C^2_\mu$ the space of $C^2$ functions $K$ of $u$ alone such that
\[
\|K\langle u\rangle^\mu\|_\infty + \|K'\langle u\rangle^\mu\|_\infty + \|K''\langle u\rangle^\mu\|_\infty < \infty.
\]
The differential equation for $k = -u + K$ and $h = \bar h + H$ can be written
\[
K''/(K'-1) = (2-K')(\partial_{r^*}\bar h + \partial_{r^*}H)(K,K-2u) - K'\partial_tH(K,K-2u).
\]
Set
\[
F(H,K) = K'' + (1-K')(2-K')\bigl(\bar h'(K) + \partial_{r^*}H(K,K-2u)\bigr) - K'(1-K')\partial_tH(K,K-2u).
\]
We have $F(0,0) = 0$, and $F$ is a $C^1$ function defined on a neighbourhood of $(0,0)$ in $\bar C^2_\mu \times C^2_\mu$, with values in $C_\mu$, the space of continuous functions $z$ of $u$ with weight $\langle u\rangle^\mu$ and norm $\|z\langle u\rangle^\mu\|_\infty$. The differential $F'$ of the mapping $K \to F(0,K)$ is given by
\[
y \to F'(y) = y'' + 2\bar h''(0)y.
\]
To prove the proposition, it is sufficient to prove that $F'$ is an isomorphism between $C^2_\mu$ and $C_\mu$. Since $\bar h = \log(1/r^2 - 2m/r^3)$, $dr/dr^* = 1 - 2m/r$, we have
\[
\bar h' = -(2/r^2)(r-3m),\qquad \bar h''(0) = -2/(27m^2) < 0.
\]
2. Setting $2\bar h''(0) = -\alpha^2$, we have to study the equation $y'' - \alpha^2y = z$. The only solution bounded at infinity is
\[
y(u) = -\frac{1}{2\alpha}\Bigl[\int_u^\infty z(s)e^{-\alpha(s-u)}\,ds + \int_{-\infty}^u z(s)e^{-\alpha(u-s)}\,ds\Bigr].
\]
Let us consider first
\[
\Bigl|\int_u^\infty z(s)e^{-\alpha(s-u)}\,ds\Bigr| \le C\int_0^\infty\langle\sigma+u\rangle^{-\mu}e^{-\alpha\sigma}\,d\sigma:
\]
for $u \ge 0$, it is bounded by $C\langle u\rangle^{-\mu}$. If $u \le 0$, we separate $\int_0^\infty$ in $\int_0^{-u/2}$ and $\int_{-u/2}^\infty$. The first integral is smaller than $C\langle u\rangle^{-\mu}$, while the second is exponentially decreasing. Hence this first term in $y$ belongs to $C_\mu$, and similarly for the second term and for $y'$. This concludes the proof.
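Two computations used in this proof can be verified by hand. The sketch below is a reader's check, not part of the original text; it assumes the normalization in which $r^* = 0$ corresponds to $r = 3m$ (consistent with $\bar h'(0) = 0$), and $I_\pm$ are labels introduced here for the two half-line integrals.

```latex
% Value of \bar h''(0), using \bar h = \log\frac{r-2m}{r^3} and d/dr^* = (1-2m/r)\,d/dr:
\[
\bar h' = \Bigl(1-\frac{2m}{r}\Bigr)\Bigl(\frac{1}{r-2m}-\frac{3}{r}\Bigr) = -\frac{2(r-3m)}{r^2},
\qquad
\bar h'' = \Bigl(1-\frac{2m}{r}\Bigr)\Bigl(\frac{2}{r^2}-\frac{12m}{r^3}\Bigr)
\ \xrightarrow{\ r=3m\ }\ \frac13\Bigl(\frac{2}{9m^2}-\frac{4}{9m^2}\Bigr) = -\frac{2}{27m^2}.
\]
% The bounded solution of y'' - \alpha^2 y = z:
\[
y = -\frac{1}{2\alpha}(I_+ + I_-),\qquad
I_+(u) = \int_u^\infty z(s)e^{-\alpha(s-u)}ds,\quad
I_-(u) = \int_{-\infty}^u z(s)e^{-\alpha(u-s)}ds,
\]
\[
I_+' = -z + \alpha I_+,\qquad I_-' = z - \alpha I_-,\qquad
y' = -\tfrac12\,(I_+ - I_-),
\]
\[
y'' = -\tfrac12\bigl(-2z + \alpha(I_+ + I_-)\bigr) = z + \alpha^2 y .
\]
```

Equivalently, $y = -\frac{1}{2\alpha}\,e^{-\alpha|\cdot|} * z$, the convolution with the decaying fundamental solution of $\partial_u^2 - \alpha^2$.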


3.4. Necessary conditions for non-negative multipliers.

Theorem 3.4. Let $g$ be a rotationally invariant metric close enough to the Schwarzschild metric to have a photon sphere, and $X = a(u,v)\partial_u + b(u,v)\partial_v$ be a smooth spacelike field close to $2\partial_{r^*}$ (that is, $a$ and $b$ close to 1). If there exists a function $\nu$ (with $d\nu \ne 0$) such that $\nu X$ is non-negative for $g$, then necessarily $X$ is orthogonal to the photon sphere of $g$ and $\nu$ vanishes on it.

Proof. 1. The interior terms $I = Q_{\alpha\beta}\,{}^{\nu X}\pi^{\alpha\beta}$ are given by Proposition 3.2. Set
\[
y = (a/b)\partial_u(b\nu),\qquad z = (b/a)\partial_v(a\nu).
\]
If $\nu X$ is positive, necessarily $\partial_v(a\nu) \ge 0$, $\partial_u(b\nu) \ge 0$, hence $y \ge 0$, $z \ge 0$. The coefficient of $-|\nabla\!\!\!\!/\,\phi|^2$ in $I$ is
\[
\partial_u(a\nu) + \partial_v(b\nu) + 2\nu X\Omega/\Omega = y + z + \nu A,
\]
where we defined
\[
A = \partial_ua + \partial_vb - (a/b)\partial_ub - (b/a)\partial_va + 2X\Omega/\Omega.
\]
Assume that, for some $R$,
\[
\tilde I = I + R(\Omega^{-2}\partial_u\phi\,\partial_v\phi + |\nabla\!\!\!\!/\,\phi|^2) = \Omega^{-2}\bigl[\partial_v(a\nu)(\partial_u\phi)^2 + \partial_u(b\nu)(\partial_v\phi)^2 - (\partial_u\phi)(\partial_v\phi)(2\nu Xf/f - R)\bigr] - |\nabla\!\!\!\!/\,\phi|^2(y + z + \nu A - R)
\]
is non-negative. Then, necessarily,
\[
\tilde R = -(y + z + \nu A - R) \ge 0,\qquad (\tilde R + y + z + \nu B)^2 \le 4yz,
\]
where, recalling $h = 2\log(\Omega/f)$,
\[
B = A - 2Xf/f = \partial_ua + \partial_vb - (a/b)\partial_ub - (b/a)\partial_va + Xh.
\]
The last inequality can be written
\[
\tilde R^2 + (y-z)^2 + \nu^2B^2 + 2\tilde R(y+z) + 2\nu B(\tilde R + y + z) \le 0.
\]
This implies in particular $\nu B \le 0$. Now, for $a = b = 1$, and $h = \bar h$, corresponding to the Schwarzschild metric, the quantity $B$ vanishes simply on $r^* = 0$. If we assume $a$ and $b$ close to one and $h$ close to the value $\bar h$ for the Schwarzschild metric, $B$ will vanish simply on a surface $S = \{(u,v),\ v = k(u)\}$, for $k$ close to $-u$. Now, on $S$, we must have $\tilde R = 0$, $y = z$, and, from $\nu B \le 0$, we also get $\nu = 0$. From $y = z$ and $\nu = 0$, we deduce $a\partial_u\nu = b\partial_v\nu$ on $S$. Since $d\nu \ne 0$, $\nu = 0$ is an equation of $S$, and $X$ is orthogonal to $S$, that is, $b = -ak'$ on $S$. Then, with $Z = \partial_u + k'\partial_v$ tangent to $S$, $B = 0$ can be written
\[
B = Za + (1/k')Zb + Xh = Za + (1/k')(-k'Za - ak'') + Xh = -ak''/k' + Xh = -a(k''/k' - \partial_uh + k'\partial_vh),
\]
hence $k$ satisfies the differential equation of the photon sphere
\[
k''/k' = \partial_uh(u,k(u)) - k'\partial_vh(u,k(u)).
\]
This concludes the proof of the theorem.
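The rewriting of the discriminant condition used in this proof is elementary algebra. It is expanded below as a reader's check, together with what it yields on the set where $B = 0$.

```latex
\[
(\tilde R + y + z + \nu B)^2 - 4yz
 = \tilde R^2 + (y+z)^2 - 4yz + \nu^2B^2 + 2\tilde R(y+z) + 2\nu B\tilde R + 2\nu B(y+z)
\]
\[
 = \tilde R^2 + (y-z)^2 + \nu^2 B^2 + 2\tilde R(y+z) + 2\nu B(\tilde R + y + z).
\]
% With \tilde R, y, z \ge 0, every term except the last is non-negative, so the last must be \le 0.
% On the surface where B = 0 the inequality reduces to
\[
\tilde R^2 + (y-z)^2 + 2\tilde R(y+z) \le 0
\quad\Longrightarrow\quad \tilde R = 0,\ \ y = z .
\]
```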



3.5. Sufficient conditions I. Let us consider now a rotationally invariant metric $g$ close enough to the Schwarzschild metric to have a photon sphere, according to Proposition 3.3. Let $v = k(u)$ be an equation of the photon sphere, such that $k'(u) < 0$, and take $X = \partial_u - k'(u)\partial_v$.

Proposition 3.5. Recall $h = 2\log\Omega/f$ and set $B = -k''(u)/k'(u) + Xh$. Assume the metric $g$ such that, for some $\bar B < 0$,
\[
B = (v - k(u))\bar B.
\]
Let $W$ and $\beta$ be two functions of the real variable $S = v - k(u)$ (still to be chosen), with $W(0) = 0$, $W' > 0$. Set
\[
\nu = (-1/k'(u))W(v - k(u)),\qquad R = 2[W'(v - k(u)) + \nu Xf/f],
\]
which implies $\nu B \le 0$. Then
\[
\int_D Q_{\alpha\beta}\,{}^{(\nu X)}\pi^{\alpha\beta}\,dV \ge \int_{\partial D}e\,dv + \int_D[C\phi^2 - \nu B|\nabla\!\!\!\!/\,\phi|^2]\,dV,
\]
with
\[
C = k'\Omega^{-2}[W''' + 4\beta W'' + 4W'(\beta^2 + \beta')] + W\,\Box((1/k')Xf/f).
\]
The boundary terms are
\[
\int_{\partial D}e\,dv = \int_{\partial D}[(1/2)n^\alpha\partial_\alpha R]\phi^2\,dv - \int_{\partial D}R\phi(n^\alpha\partial_\alpha\phi)\,dv.
\]
Proof. 1. We keep here the notation of the proof of Theorem 3.4. We first arrange to have $y \equiv z$: this means
\[
\partial_u\nu + k'\partial_v\nu = (-k''/k')\nu.
\]
For any $W$, $\nu = (-1/k')W(v - k(u))$ is a solution of this equation. With this choice, $y = z = W'$. We choose now $\tilde R = -\nu B$, which corresponds to $R = 2(y + \nu Xf/f)$. At this stage, we have
\[
I + R|\nabla\phi|^2 = \tilde I = -\Omega^{-2}(W'/k')(X\phi)^2 - \nu B|\nabla\!\!\!\!/\,\phi|^2.
\]
We transform the term $R(\phi^\alpha\phi_\alpha)$ according to formula 2.1.d to obtain
\[
\int_D -R(\phi^\alpha\phi_\alpha)\,dV = \int_{\partial D}\dots\,dv - \int_D\phi^2\,\Box(y + \nu Xf/f)\,dV.
\]
Now, for a function $w(u,v)$, we have
\[
\Box w = \Omega^{-2}[\partial^2_{uv}w + (1/f)(\partial_uf\,\partial_vw + \partial_vf\,\partial_uw)].
\]


Hence
\[
\Box y = \Omega^{-2}(-k'W''' + W''Xf/f),
\]
\[
\Box(\nu Xf/f) = \Omega^{-2}[W''Xf/f - W'X((1/k')Xf/f) - (W'/k')(Xf/f)^2] - W\,\Box((1/k')Xf/f).
\]
2. Following [4], we use now a Poincaré formula to produce "good" terms in $\phi^2$. In fact, with the above choices, we are left with the non angular interior terms
\[
(-1/k')y(\partial_u\phi)^2 - k'y(\partial_v\phi)^2 + 2y(\partial_u\phi)(\partial_v\phi) = (-y/k')[\partial_u\phi - k'\partial_v\phi]^2.
\]
Let us introduce now, instead of the coordinates $(u,v)$, the coordinates $S = v - k(u)$, $T = v + k(u)$. Then $X = -2k'\partial_S$,
\[
\int_D(-y/k')\Omega^{-2}(X\phi)^2\,dV = 4\int_D yf^2(\partial_S\phi)^2\,dS\,dT\,d\sigma_{S^2}.
\]
Keeping $T$ and the variable on the sphere constant, we will use in the variable $S$ the following Poincaré inequality (see [4]), where $\tilde\beta$ is arbitrary and small at infinity,
\[
\int(\partial_S\phi)^2A\,dS \ge \int\phi^2[(A\tilde\beta)' - \tilde\beta^2A]\,dS.
\]
Here, $A = 4yf^2$. Hence we obtain
\[
\int_D(-y/k')\Omega^{-2}(X\phi)^2\,dV \ge 2\int_D\phi^2/\Omega^2\,dV\,[-2\tilde\beta k'W'' + y(X\tilde\beta + 2\tilde\beta Xf/f + 2k'\tilde\beta^2)].
\]
Putting together all terms, we get for the interior terms the lower bound
\[
\phi^2W\,\Box((1/k')Xf/f) + \phi^2\Omega^{-2}\bigl[k'W''' - 2W''(Xf/f + 2\tilde\beta k') + 2W'X(\tilde\beta + (1/2k')Xf/f) + 4k'W'((1/4k'^2)(Xf/f)^2 + (\tilde\beta/k')Xf/f + \tilde\beta^2)\bigr].
\]
Choosing $\tilde\beta = -(1/2k')Xf/f - \beta(S)$, these terms are
\[
\phi^2W\,\Box((1/k')Xf/f) + \phi^2\Omega^{-2}k'[W''' + 4\beta W'' + 4W'(\beta^2 + \beta')].
\]
3.6. Sufficient conditions II. Using Proposition 3.5, we can now construct the actual multiplier $\nu X$, that is, choose the functions $W$ and $\beta$. Following [4], we will denote by $\phi_l$ the part of $\phi$ which is a $l$-spherical harmonic. We keep here the notation of Sect. 3.5.

Theorem 3.6. Let $E = \Box((1/k')Xf/f)$. Assume that, for some $C_0 > 0$, we have
i) $f^2|E| \le C_0|\bar B|$ everywhere,
ii) $\Omega^2(-\bar B/f^2 + |E|) \le C_0$ for $|v - k(u)| \le C_0^{-1}$,
iii) $\Omega^2|\bar B| \ge C_0^{-1}f^2$ for $|v - k(u)| \le 3$.



Then there exist $l_0(C_0) \in \mathbb{N}$, $W$ and $\beta$ and $c > 0$ such that, for all $\phi$ with $\phi_l = 0$ for $l \le l_0$,
\[
\int_D Q_{\alpha\beta}\,{}^{(\nu X)}\pi^{\alpha\beta}\,dV \ge \int_D[C\phi^2 - \nu B|\nabla\!\!\!\!/\,\phi|^2]\,dV \ge c\int_D\langle r^*\rangle^{-3-\epsilon_0}\phi^2\,dV.
\]
Proof. 1. We want first to obtain
\[
(1/4)a'' + \beta a' + a(\beta^2 + \beta') < 0,
\]
with $a = W'$. To this aim, we set $\beta = -a'/a$ (with $a$ still to be chosen), and get
\[
(1/4)a'' + \beta a' + a(\beta^2 + \beta') = -[(3/4)a'' - a'^2/a].
\]
Remark that the sign of this quantity is scale invariant, which justifies the normalization made in the following lemma.

Lemma. For any small $\eta > 0$, there exists an even $C^2$ function $a$, with $a > 0$, $a(0) = 1$, $a \in L^1$ and
i) For $x \ge 1$, $(3/4)a'' - a'^2/a > 0$,
ii) For $0 \le x \le 1$, $(3/4)a'' - a'^2/a = O(\eta)$,
iii) $\int_1^2a(x)\,dx \ge (1+3\eta)^{-1}$.

Proof of the lemma. For $x \ge 1$, we just take $a(x) = (1 + \eta x^\mu)^{-1}$, with $\mu = (2\eta+3)(\eta+3)^{-1}$. We get
\[
(3/4)a'' - a'^2/a = \eta\mu x^{\mu-2}\bigl(4(1+\eta x^\mu)^3\bigr)^{-1}[(3-\mu)\eta x^\mu - 3(\mu-1)] \ge \eta\mu x^{\mu-2}\bigl(4(1+\eta x^\mu)^3\bigr)^{-1}\eta/2.
\]
We set now $a_1(x) = a(1) + (x-1)a'(1) + (1/2)(x-1)^2a''(1)$, and, for $0 \le x \le 1$,
\[
a(x) = \chi(x) + (1-\chi(x))a_1(x),
\]
where $\chi$ is smooth decreasing between zero and one, being one close to zero and zero close to one. We have then
\[
a'(x) = \chi'(x)(1-a_1) + (1-\chi(x))a_1' = O(\eta),\qquad a \ge 1 + O(\eta),\qquad a''(x) = O(\eta).
\]
This finishes the proof of the lemma.

2. We fix now $\eta$ small enough and choose
\[
W(S) = \int_0^S\bar a(x)\,dx,\qquad \bar a(x) = a(x-2),\qquad S = v - k(u).
\]
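The formula for $(3/4)a'' - a'^2/a$ in the proof of the lemma, and its positivity for $x \ge 1$, can be checked by direct differentiation. The computation below is a reader's verification; $p := 1 + \eta x^\mu$ is an abbreviation introduced here.

```latex
\[
a = p^{-1},\qquad
a' = -\eta\mu x^{\mu-1}p^{-2},\qquad
a'' = -\eta\mu(\mu-1)x^{\mu-2}p^{-2} + 2\eta^2\mu^2x^{2\mu-2}p^{-3},\qquad
\frac{a'^2}{a} = \eta^2\mu^2x^{2\mu-2}p^{-3},
\]
\[
\tfrac34a'' - \frac{a'^2}{a}
 = \frac{\eta\mu x^{\mu-2}}{4p^3}\Bigl[-3(\mu-1)(1+\eta x^\mu) + 2\mu\,\eta x^\mu\Bigr]
 = \frac{\eta\mu x^{\mu-2}}{4p^3}\Bigl[(3-\mu)\eta x^\mu - 3(\mu-1)\Bigr].
\]
% With \mu = (2\eta+3)/(\eta+3):  \mu - 1 = \eta/(\eta+3),  3 - \mu = (\eta+6)/(\eta+3), so for x \ge 1
\[
(3-\mu)\eta x^\mu - 3(\mu-1) \ \ge\ \frac{\eta(\eta+6) - 3\eta}{\eta+3} = \eta \ \ge\ \frac{\eta}{2}.
\]
```

Since $1 < \mu < 2$, the function $a$ is integrable at infinity, and on $[1,2]$ one has $x^\mu \le 3$ for small $\eta$, which is consistent with item iii) of the lemma.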

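For $l \ge l_0$, we have, with $E = \Box((1/k')Xf/f)$,
\[
\int_D[C\phi^2 - \nu B|\nabla\!\!\!\!/\,\phi|^2]\,dV \ge \int_D[C - \nu Bl_0(l_0+1)f^{-2}]\phi^2\,dV
\]
\[
= \int_D\bigl\{(-k')\Omega^{-2}[(3/4)\bar a'' - \bar a'^2/\bar a] + W\bigl((-1/k')(-\bar B)Sl_0(l_0+1)f^{-2} + E\bigr)\bigr\}\phi^2\,dV.
\]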

212

S. Alinhac

Thanks to the assumptions of the theorem, there is $c_1 > 0$ big enough such that, for $|S| \ge c_1/l_0^2$,
\[
S[(-1/k')(-\bar B)Sl_0(l_0+1)f^{-2} + E] > 0.
\]
Thus the integrand for these values of $S$ and $|S - 2| \ge 1$ is positive, by i) of the lemma.

3. For $|S| \le c_1/l_0^2$, we have $|W| = O(l_0^{-2})$, and $(3/4)\bar a'' - \bar a'^2/\bar a > 0$ by i) of the lemma, hence once again the integrand is positive. We are left with the zone $|S - 2| \le 1$: but there $W \ge (1+3\eta)^{-1}$ by iii) of the lemma while, by ii), the first term in the integrand is small. This completes the proof.

Remarks. i) For the Schwarzschild metric, $2/(3r^2) \le -\bar B = (2/r^2)((r-3m)/r^*) \le 2/r^2$, while $E = O(r^{-4})$, hence the assumptions of the theorem are satisfied.
ii) It is possible that positive neglected terms in the right-hand side of the inequality turn out to be useful.

4. Kerr Metrics

4.1. Frame and connection coefficients. We use the usual spherical coordinates $(r,\phi,\theta)$ in $\mathbb{R}^3$,
\[
x_1 = r\sin\theta\cos\phi,\qquad x_2 = r\sin\theta\sin\phi,\qquad x_3 = r\cos\theta.
\]
The Kerr metric is
\[
ds^2 = -((\Delta - a^2\sin^2\theta)/\Sigma)\,dt^2 - 4amr(\sin^2\theta/\Sigma)\,dt\,d\phi + (\Sigma/\Delta)\,dr^2 + ([.]/\Sigma)\sin^2\theta\,d\phi^2 + \Sigma\,d\theta^2,
\]
with
\[
\Sigma = r^2 + a^2\cos^2\theta,\qquad \Delta = r^2 + a^2 - 2mr,\qquad [.] = (r^2+a^2)^2 - a^2\Delta\sin^2\theta.
\]
a. Following [2,10], we use the null vectors
\[
l = ((r^2+a^2)/\Delta)\partial_t + (a/\Delta)\partial_\phi + \partial_r,\qquad n = ((r^2+a^2)/(2\Sigma))\partial_t + (a/(2\Sigma))\partial_\phi - (\Delta/(2\Sigma))\partial_r
\]
for which $\langle l,n\rangle = -1$. We set
\[
e_3 = (\Delta/(r^2+a^2))\,l = X + Y,\qquad e_4 = (2\Sigma/(r^2+a^2))\,n = X - Y,
\]
\[
X = \partial_t + (a/(r^2+a^2))\partial_\phi,\qquad Y = (\Delta/(r^2+a^2))\partial_r.
\]
We have then
\[
\langle e_3,e_4\rangle = -2\mu,\qquad \mu = \Delta\Sigma/(r^2+a^2)^2.
\]
The orthogonal space of $(e_3,e_4)$ is generated by $\partial_\theta$ and $\partial_\phi + a\sin^2\theta\,\partial_t$. In the sequel, for simplicity, we omit the variable $\theta$ in $\sin = \sin\theta$, etc. We have
\[
\langle\partial_\theta,\partial_\theta\rangle = \Sigma,\qquad \langle\partial_\phi + a\sin^2\partial_t,\ \partial_\phi + a\sin^2\partial_t\rangle = \Sigma\sin^2.
\]
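Two of the identities above can be confirmed in a line each. The check below is a sketch that assumes the standard Boyer–Lindquist names $\Sigma = r^2 + a^2\cos^2\theta$ and $\Delta = r^2 + a^2 - 2mr$ for the two functions entering the metric.

```latex
% The dt^2 coefficient agrees with the usual form -(1 - 2mr/\Sigma):
\[
\Delta - a^2\sin^2\theta = r^2 + a^2\cos^2\theta - 2mr = \Sigma - 2mr
\quad\Longrightarrow\quad
-\frac{\Delta - a^2\sin^2\theta}{\Sigma} = -\Bigl(1 - \frac{2mr}{\Sigma}\Bigr).
\]
% The normalization of the pair (e_3, e_4) follows from <l, n> = -1:
\[
\langle e_3, e_4\rangle
 = \frac{\Delta}{r^2+a^2}\cdot\frac{2\Sigma}{r^2+a^2}\,\langle l, n\rangle
 = -\frac{2\Delta\Sigma}{(r^2+a^2)^2} = -2\mu .
\]
```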



Hence we take
\[
e_1 = \Sigma^{-1/2}\partial_\theta,\qquad e_2 = (1/(\Sigma^{1/2}\sin\theta))(\partial_\phi + a\sin^2\partial_t).
\]
Remark that $\langle e_1,e_2\rangle = 0$, hence $(e_1,e_2)$ form an orthonormal system. We compute now the brackets of the vectors of our null frame $(e_1,e_2,e_3,e_4)$. We obtain first
\[
[e_1,e_2] = -\cos(\Sigma - a^2\sin^2)/(\Sigma^2\sin^2)\,\partial_\phi + a\cos(r^2+a^2)/\Sigma^2\,\partial_t.
\]
Since we also have
\[
\Sigma^{1/2}\sin e_2 = \partial_\phi + a\sin^2\partial_t,\qquad (1/2)(e_3+e_4) = a/(r^2+a^2)\,\partial_\phi + \partial_t,
\]
we get
\[
\partial_t = (r^2+a^2)/(2\Sigma)\,(e_3+e_4) - a\sin\Sigma^{-1/2}e_2,
\]
\[
\partial_\phi = -a(r^2+a^2)\sin^2/(2\Sigma)\,(e_3+e_4) + (r^2+a^2)\sin\Sigma^{-1/2}e_2.
\]
Finally
\[
[e_1,e_2] = \cos(r^2+a^2)\Sigma^{-3/2}[a\Sigma^{-1/2}(e_3+e_4) - (1/\sin)e_2].
\]
We also have
\[
[e_1,e_3] = r\Delta/(\Sigma(r^2+a^2))\,e_1,\qquad [e_1,e_4] = -r\Delta/(\Sigma(r^2+a^2))\,e_1,
\]
\[
[e_2,e_3] = r\Delta/(\Sigma(r^2+a^2))\,e_2,\qquad [e_2,e_4] = -r\Delta/(\Sigma(r^2+a^2))\,e_2.
\]
Now $[e_3,e_4] = -2[X,Y] = -4ar\Delta/(r^2+a^2)^3\,\partial_\phi$,
\[
[e_3,e_4] = 4ar\Delta\sin/(\Sigma(r^2+a^2)^2)\,[(a/2)\sin(e_3+e_4) - \Sigma^{1/2}e_2].
\]
b. Using the properties of the metric connection $D$, in particular the torsion free character $D_XY - D_YX = [X,Y]$, we can compute now all the coefficients for the frame $(e_1,e_2,e_3,e_4)$:
\[
\langle D_1e_1,e_1\rangle = 0,\qquad \langle D_1e_1,e_2\rangle = -\langle e_1,[e_1,e_2]\rangle = 0,
\]
\[
\langle D_1e_1,e_3\rangle = -\langle e_1,[e_1,e_3]\rangle = -r\Delta/(\Sigma(r^2+a^2)),\qquad \langle D_1e_1,e_4\rangle = r\Delta/(\Sigma(r^2+a^2)),
\]
\[
\langle D_2e_1,e_1\rangle = 0,\qquad \langle D_2e_1,e_2\rangle = -\langle[e_1,e_2],e_2\rangle = (r^2+a^2)\Sigma^{-3/2}\cos/\sin.
\]
For $j = 3, 4$, we have
\[
2\langle D_je_1,e_2\rangle = \langle[e_j,e_1],e_2\rangle - \langle e_j,[e_1,e_2]\rangle + \langle[e_2,e_j],e_1\rangle,
\]


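hence
$$\langle D_2e_1, e_3\rangle = -\langle e_1, D_3e_2\rangle = \frac{a\Delta\cos}{\Sigma(r^2 + a^2)}, \quad \langle D_2e_1, e_4\rangle = \langle D_4e_1, e_2\rangle = \frac{a\Delta\cos}{\Sigma(r^2 + a^2)},$$
$$\langle D_3e_1, e_1\rangle = 0, \quad \langle D_3e_1, e_2\rangle = \langle D_2e_1, e_3\rangle = \frac{a\Delta\cos}{\Sigma(r^2 + a^2)},$$
$$\langle D_3e_1, e_3\rangle = 0, \quad \langle D_3e_1, e_4\rangle = (1/2)D_1\langle e_3, e_4\rangle = \frac{2a^2\Delta\cos\sin}{\Sigma^{1/2}(r^2 + a^2)^2},$$
$$\langle D_4e_1, e_1\rangle = 0, \quad \langle D_4e_1, e_2\rangle = \frac{a\Delta\cos}{\Sigma(r^2 + a^2)}, \quad \langle D_4e_1, e_3\rangle = \langle D_3e_1, e_4\rangle = \frac{2a^2\Delta\cos\sin}{\Sigma^{1/2}(r^2 + a^2)^2}.$$

We start similar computations with $e_2$:
$$\langle D_1e_2, e_1\rangle = \langle [e_1, e_2], e_1\rangle = 0, \quad \langle D_1e_2, e_2\rangle = 0,$$
$$\langle D_1e_2, e_3\rangle = \langle [e_1, e_2], e_3\rangle + \langle D_2e_1, e_3\rangle = -\frac{a\Delta\cos}{\Sigma(r^2 + a^2)} = \langle D_1e_2, e_4\rangle,$$
$$\langle D_2e_2, e_1\rangle = -(r^2 + a^2)\,\Sigma^{-3/2}\cos/\sin, \quad \langle D_2e_2, e_2\rangle = 0,$$
$$\langle D_2e_2, e_3\rangle = -\frac{r\Delta}{\Sigma(r^2 + a^2)}, \quad \langle D_2e_2, e_4\rangle = \frac{r\Delta}{\Sigma(r^2 + a^2)},$$
$$\langle D_3e_2, e_1\rangle = -\frac{a\Delta\cos}{\Sigma(r^2 + a^2)}, \quad \langle D_3e_2, e_2\rangle = 0 = \langle D_3e_2, e_3\rangle,$$
$$\langle D_3e_2, e_4\rangle = -(1/2)\langle [e_3, e_4], e_2\rangle = \frac{2ar\Delta\sin}{\Sigma^{1/2}(r^2 + a^2)^2},$$
$$\langle D_4e_2, e_1\rangle = -\frac{a\Delta\cos}{\Sigma(r^2 + a^2)}, \quad \langle D_4e_2, e_2\rangle = 0,$$
$$\langle D_4e_2, e_3\rangle = \langle e_2, [e_3, e_4]\rangle + \langle D_3e_2, e_4\rangle = -\frac{2ar\Delta\sin}{\Sigma^{1/2}(r^2 + a^2)^2}, \quad \langle D_4e_2, e_4\rangle = 0.$$
By symmetry, the other terms are obtained:
$$\langle D_1e_3, e_1\rangle = \frac{r\Delta}{\Sigma(r^2 + a^2)}, \quad \langle D_1e_3, e_2\rangle = \frac{a\Delta\cos}{\Sigma(r^2 + a^2)},$$
$$\langle D_1e_3, e_3\rangle = 0, \quad \langle D_1e_3, e_4\rangle = \frac{2a^2\Delta\cos\sin}{\Sigma^{1/2}(r^2 + a^2)^2},$$
$$\langle D_2e_3, e_1\rangle = -\frac{a\Delta\cos}{\Sigma(r^2 + a^2)}, \quad \langle D_2e_3, e_2\rangle = \frac{r\Delta}{\Sigma(r^2 + a^2)},$$
$$\langle D_2e_3, e_3\rangle = 0, \quad \langle D_2e_3, e_4\rangle = \frac{2ar\Delta\sin}{\Sigma^{1/2}(r^2 + a^2)^2},$$
$$\langle D_3e_3, e_1\rangle = 0, \quad \langle D_3e_3, e_2\rangle = 0, \quad \langle D_3e_3, e_3\rangle = 0,$$
$$\langle D_3e_3, e_4\rangle = D_3\langle e_3, e_4\rangle - \langle e_3, [e_3, e_4]\rangle = \frac{4m\Sigma\Delta(a^2 - r^2)}{(r^2 + a^2)^4},$$
$$\langle D_4e_3, e_1\rangle = -\frac{2a^2\Delta\cos\sin}{\Sigma^{1/2}(r^2 + a^2)^2}, \quad \langle D_4e_3, e_2\rangle = \frac{2ar\Delta\sin}{\Sigma^{1/2}(r^2 + a^2)^2},$$
$$\langle D_4e_3, e_3\rangle = 0, \quad \langle D_4e_3, e_4\rangle = \frac{4a^2r\Delta^2\sin^2}{(r^2 + a^2)^4},$$
$$\langle D_1e_4, e_1\rangle = -\frac{r\Delta}{\Sigma(r^2 + a^2)}, \quad \langle D_1e_4, e_2\rangle = \frac{a\Delta\cos}{\Sigma(r^2 + a^2)},$$
$$\langle D_1e_4, e_3\rangle = \frac{2a^2\Delta\cos\sin}{\Sigma^{1/2}(r^2 + a^2)^2}, \quad \langle D_1e_4, e_4\rangle = 0,$$
$$\langle D_2e_4, e_1\rangle = -\frac{a\Delta\cos}{\Sigma(r^2 + a^2)}, \quad \langle D_2e_4, e_2\rangle = -\frac{r\Delta}{\Sigma(r^2 + a^2)},$$
$$\langle D_2e_4, e_3\rangle = -\frac{2ar\Delta\sin}{\Sigma^{1/2}(r^2 + a^2)^2}, \quad \langle D_2e_4, e_4\rangle = 0,$$
$$\langle D_3e_4, e_1\rangle = -\frac{2a^2\Delta\cos\sin}{\Sigma^{1/2}(r^2 + a^2)^2}, \quad \langle D_3e_4, e_2\rangle = -\frac{2ar\Delta\sin}{\Sigma^{1/2}(r^2 + a^2)^2},$$
$$\langle D_3e_4, e_3\rangle = -\frac{4a^2r\Delta^2\sin^2}{(r^2 + a^2)^4}, \quad \langle D_3e_4, e_4\rangle = 0,$$
$$\langle D_4e_4, e_1\rangle = \langle D_4e_4, e_2\rangle = 0, \quad \langle D_4e_4, e_3\rangle = -\langle D_3e_3, e_4\rangle = -\frac{4m\Sigma\Delta(a^2 - r^2)}{(r^2 + a^2)^4}, \quad \langle D_4e_4, e_4\rangle = 0.$$

4.2. Deformation tensors. We are now in a position to compute the deformation tensors of all fields. Denote ${}^i\pi = {}^{e_i}\pi$. From the definition and the previous computations, we get easily the components of $\pi$:

a. ${}^1\pi$: $\pi_{11} = 0$, $\pi_{12} = 0$, $\pi_{13} = -r\Delta/(\Sigma(r^2 + a^2))$, $\pi_{14} = r\Delta/(\Sigma(r^2 + a^2))$, $\pi_{22} = 2(r^2 + a^2)\Sigma^{-3/2}\cos/\sin$, $\pi_{23} = 2a\Delta\cos/(\Sigma(r^2 + a^2)) = \pi_{24}$, $\pi_{33} = 0$, $\pi_{34} = 4a^2\Delta\cos\sin/(\Sigma^{1/2}(r^2 + a^2)^2)$, $\pi_{44} = 0$.

b. ${}^2\pi$: $\pi_{11} = 0$, $\pi_{12} = -(r^2 + a^2)\Sigma^{-3/2}\cos/\sin$, $\pi_{13} = -2a\Delta\cos/(\Sigma(r^2 + a^2)) = \pi_{14}$, $\pi_{22} = 0$, $\pi_{23} = -r\Delta/(\Sigma(r^2 + a^2)) = -\pi_{24}$, $\pi_{33} = 0$, $\pi_{34} = 0$, $\pi_{44} = 0$.

c. ${}^3\pi$: $\pi_{11} = 2r\Delta/(\Sigma(r^2 + a^2))$, $\pi_{12} = 0$, $\pi_{13} = 0$, $\pi_{14} = 0$, $\pi_{22} = 2r\Delta/(\Sigma(r^2 + a^2))$, $\pi_{23} = 0$, $\pi_{24} = 4ar\Delta\sin/(\Sigma^{1/2}(r^2 + a^2)^2)$, $\pi_{33} = 0$, $\pi_{34} = 4m\Sigma\Delta(a^2 - r^2)/(r^2 + a^2)^4$, $\pi_{44} = 8a^2r\Delta^2\sin^2/(r^2 + a^2)^4$.

d. ${}^4\pi$: $\pi_{11} = -2r\Delta/(\Sigma(r^2 + a^2))$, $\pi_{12} = 0$, $\pi_{13} = 0$, $\pi_{14} = 0$, $\pi_{22} = -2r\Delta/(\Sigma(r^2 + a^2))$, $\pi_{23} = -4ar\Delta\sin/(\Sigma^{1/2}(r^2 + a^2)^2)$, $\pi_{24} = 0$, $\pi_{33} = -8a^2r\Delta^2\sin^2/(r^2 + a^2)^4$, $\pi_{34} = -4m\Sigma\Delta(a^2 - r^2)/(r^2 + a^2)^4$, $\pi_{44} = 0$.

Taking into account the actual components of the tensors ${}^i\pi$ for the Kerr metric, we can finally compute their traces against $Q$. We obtain

a. $$Q\,{}^1\pi = -\frac{\cos}{\Sigma^{3/2}\sin}(r^2 + a^2 - 2a^2\sin^2)\,e_1^2 + \frac{\cos}{\Sigma^{3/2}\sin}(r^2 + a^2 + 2a^2\sin^2)\,e_2^2 - r(r^2 + a^2)\Sigma^{-2}e_1(e_3 - e_4) - 2a(r^2 + a^2)\cos\,\Sigma^{-2}e_2(e_3 + e_4) + \frac{(r^2 + a^2)^3\cos}{\Delta\Sigma^{5/2}\sin}\,e_3e_4.$$

b. Similarly $$Q\,{}^2\pi = -2(\cos/\sin)(r^2 + a^2)\Sigma^{-3/2}e_1e_2 + 2a(r^2 + a^2)\Sigma^{-2}\cos\,e_1(e_3 + e_4) - r(r^2 + a^2)\Sigma^{-2}e_2(e_3 - e_4).$$

c. $$Q\,{}^3\pi = 2m\,\frac{a^2 - r^2}{(r^2 + a^2)^2}\,(e_1^2 + e_2^2) - 4ar\sin\,\Sigma^{-3/2}e_2e_3 + 2a^2r\sin^2\Sigma^{-2}e_3^2 + 2r(r^2 + a^2)\Sigma^{-2}e_3e_4.$$

d. $$Q\,{}^4\pi = -2m\,\frac{a^2 - r^2}{(r^2 + a^2)^2}\,(e_1^2 + e_2^2) + 4ar\sin\,\Sigma^{-3/2}e_2e_4 - 2r(r^2 + a^2)\Sigma^{-2}e_3e_4 - 2a^2r\sin^2\Sigma^{-2}e_4^2.$$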
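As a consistency check on item c, the coefficients of $Q\,{}^3\pi$ can be recomputed from the listed components of ${}^3\pi$. The sketch below rests on assumptions that are not stated in this section: that the trace means $Q^{\alpha\beta}\pi_{\alpha\beta}$ with $Q_{\alpha\beta} = e_\alpha(\psi)e_\beta(\psi) - \frac12 g_{\alpha\beta}|\nabla\psi|^2$, that $|\nabla\psi|^2 = e_1^2 + e_2^2 - e_3e_4/\mu$ in this frame, and that $\Sigma$, $\Delta$ have their standard meaning. The function name `q3_trace_coefficients` is hypothetical.

```python
import math

def q3_trace_coefficients(r, theta, a, m):
    """Return (computed, expected) coefficients of the trace of Q against 3-pi."""
    s = math.sin(theta)
    Sigma = r * r + (a * math.cos(theta)) ** 2
    Delta = r * r + a * a - 2.0 * m * r
    R2 = r * r + a * a
    mu = Sigma * Delta / R2 ** 2                 # <e3, e4> = -2 mu

    # Components of the deformation tensor of e3 (item c of Sect. 4.2).
    p11 = p22 = 2 * r * Delta / (Sigma * R2)
    p24 = 4 * a * r * Delta * s / (math.sqrt(Sigma) * R2 ** 2)
    p34 = 4 * m * Sigma * Delta * (a * a - r * r) / R2 ** 4
    p44 = 8 * a * a * r * Delta ** 2 * s * s / R2 ** 4

    # Metric trace of pi, using g^{11} = g^{22} = 1 and g^{34} = -1/(2 mu).
    trace = p11 + p22 - p34 / mu

    computed = {
        "e1^2 + e2^2": p11 - 0.5 * trace,
        "e2 e3": -p24 / mu,                      # 2 * pi^{23}, indices raised with g^{34}
        "e3^2": p44 / (4 * mu * mu),             # pi^{33}
        "e3 e4": p34 / (2 * mu * mu) + 0.5 * trace / mu,
    }
    expected = {
        "e1^2 + e2^2": 2 * m * (a * a - r * r) / R2 ** 2,
        "e2 e3": -4 * a * r * s * Sigma ** -1.5,
        "e3^2": 2 * a * a * r * s * s / Sigma ** 2,
        "e3 e4": 2 * r * R2 / Sigma ** 2,
    }
    return computed, expected

computed, expected = q3_trace_coefficients(r=3.0, theta=1.0, a=0.5, m=1.0)
for key in computed:
    print(key, computed[key], expected[key])
```

Under these assumptions the two columns should agree to rounding error; the same pattern applies to ${}^4\pi$ with $e_3$ and $e_4$ exchanged and the signs reversed.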


4.3. Photon sphere. The principal symbol of the wave equation for the Kerr metric is $p$, and
$$p = -[(r^2 + a^2)^2 - a^2\Delta\sin^2]\,\tau^2 + \Delta^2R^2 + \frac{\Delta - a^2\sin^2}{\sin^2}\,\tilde\phi^2 + \Delta\Theta^2 - 4amr\,\tau\tilde\phi.$$
The order of the variables is $(t, r, \phi, \theta)$, with dual variables $(\tau, R, \tilde\phi, \Theta)$. For a null bicharacteristic,
$$\dot r = 2R\Delta^2, \quad \dot\tau = 0, \quad \dot t = -2\tau[(r^2 + a^2)^2 - a^2\Delta\sin^2] - 4amr\,\tilde\phi,$$
$$-\dot R = \tau^2(-4r(r^2 + a^2) + 2a^2(r - m)\sin^2) + 4(r - m)\Delta R^2 + 2(r - m)(\tilde\phi^2/\sin^2 + \Theta^2) - 4am\,\tau\tilde\phi.$$
From $p = 0$ we get
$$\Delta(\tilde\phi^2/\sin^2 + \Theta^2) = \tau^2[(r^2 + a^2)^2 - a^2\Delta\sin^2] - \Delta^2R^2 + a^2\tilde\phi^2 + 4amr\,\tau\tilde\phi.$$
Hence
$$-\dot R = 2\tau^2(r^2 + a^2)\Delta^{-1}(4mr^2 - (m + r)(r^2 + a^2)) + 2(r - m)\Delta R^2 + 2a\tilde\phi\,\Delta^{-1}(a(r - m)\tilde\phi + 2m(r^2 - a^2)\tau).$$
Set
$$q(r) = \frac{a(r - m)}{r\Delta - m(r^2 - a^2)}.$$
We see that, for any $\bar r$, a null bicharacteristic starting from a point with $r = \bar r$, $R = 0$, $\tau = q(\bar r)\tilde\phi$, stays on the surface $r = \bar r$: it is a partial photon sphere (in the sense that only certain null bicharacteristics stay where they started from). These partial photon spheres accumulate to the one for which $q = \infty$.

Definition. We call a photon sphere any surface $\{r = \bar r\}$ for which $\bar r$ satisfies the relation $r\Delta = m(r^2 - a^2)$.

On such a surface, we have the choice of $\tilde\phi = 0$ or $a(r - m)\tilde\phi + 2m(r^2 - a^2)\tau = 0$. The last possibility is forbidden on null characteristics, for $r$ close to $3m$. The conditions $\tilde\phi = 0$, $R = 0$ are equivalent to $\dot r = 0$, $-2amr\,\dot t + [.]\dot\phi = 0$. Thus we see that not all null bicharacteristics starting tangentially from the photon sphere stay on it; we can say that this is a partial photon sphere (in contrast with the case $a = 0$).
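The claim that $R = 0$, $\tau = q(r)\tilde\phi$ keeps a null bicharacteristic on $r = \mathrm{const}$ amounts to the vanishing of the final expression for $-\dot R$. A minimal numerical sketch follows, assuming $\Delta = r^2 + a^2 - 2mr$ and $q(r) = a(r - m)/(r\Delta - m(r^2 - a^2))$; the function names and sample values are illustrative only.

```python
def delta(r, a, m):
    return r * r + a * a - 2.0 * m * r

def photon_sphere_defect(r, a, m):
    """r*Delta - m(r^2 - a^2); it vanishes exactly on the photon sphere."""
    return r * delta(r, a, m) - m * (r * r - a * a)

def q(r, a, m):
    """Ratio tau / phi_tilde defining the partial photon sphere through r."""
    return a * (r - m) / photon_sphere_defect(r, a, m)

def minus_R_dot(r, tau, phi_t, a, m):
    """-dR/ds along a null bicharacteristic, evaluated at R = 0 after using p = 0."""
    D = delta(r, a, m)
    return (2.0 * tau ** 2 * (r * r + a * a) / D * (4 * m * r * r - (m + r) * (r * r + a * a))
            + 2.0 * a * phi_t / D * (a * (r - m) * phi_t + 2 * m * (r * r - a * a) * tau))

a, m, phi_t = 0.5, 1.0, 1.0
for r in (2.0, 2.5, 3.5):
    tau = q(r, a, m) * phi_t
    print(r, minus_R_dot(r, tau, phi_t, a, m))   # each value should be ~0

# Schwarzschild limit: the defining relation reduces to r = 3m.
print(photon_sphere_defect(3.0, 0.0, 1.0))
```

The sample radii avoid the zeros of $\Delta$ and of $r\Delta - m(r^2 - a^2)$, where $q$ is undefined.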



4.4. Intersphere region.

Lemma 4.4. Let $a < m$, and define the function $f_\alpha$ $(0 \le \alpha \le a)$ by
$$f_\alpha(r) = 2r\Delta + (m - r)(r^2 + \alpha^2) = r^3 - 3mr^2 + (2a^2 - \alpha^2)r + m\alpha^2.$$
Then,
i) For $\lambda_2(\alpha) = m + (m^2 + (\alpha^2 - 2a^2)/3)^{1/2}$, $f_\alpha(\lambda_2) < 0$, $(\partial_rf_\alpha)(\lambda_2) = 0$, and $(\partial_rf_\alpha)(r) > 0$ for $r > \lambda_2(\alpha)$. Hence there exists a unique smooth zero $r(\alpha) > \lambda_2(\alpha) > m$ of $f_\alpha$, for which $r'(\alpha) > 0$ for $\alpha > 0$, and $r'(0) = 0$.
ii) The surface $\{r = r(a) \equiv r_0\}$ is a (partial) photon sphere defined in Sect. 4.3. The value of $r(0) \equiv r_1$ is $r_1 = (3m + (9m^2 - 8a^2)^{1/2})/2$. Hence, $r_1 \le r(\alpha) \le r_0$. Remark also that $r_1 > \lambda_2(a)$.
iii) The function $\tilde k = 2r\Delta + (m - r)\Sigma = f_{a|\cos\theta|}(r)$ vanishes on the surface $r = r(a|\cos\theta|) \equiv S_a(\theta)$, for which $r_1 \le S_a(\theta) \le r_0$, $S_a(0) = S_a(\pi) = r_0$.

Proof of Lemma 4.4. 1. We have $\partial_rf_\alpha = 3r^2 - 6mr + 2a^2 - \alpha^2$, which vanishes for $r = \lambda_1$ and $r = \lambda_2$,
$$\lambda_1 = m - \delta^{1/2}, \quad \lambda_2 = m + \delta^{1/2}, \quad \delta = m^2 + (\alpha^2 - 2a^2)/3.$$
At such a point, with $\epsilon = -1$ for $\lambda_1$ and $\epsilon = 1$ for $\lambda_2$,
$$f_\alpha(r) = 2m(a^2 - m^2) - 2\epsilon\delta^{3/2}.$$
Hence $f_\alpha(\lambda_2) < 0$, and $r(\alpha)$ exists and is smooth. Moreover, $\partial_\alpha f_\alpha + (\partial_rf_\alpha)r'(\alpha) = 0$. Since $\partial_\alpha f_\alpha(r) = 2\alpha(m - r)$, point i) is proved.
2. We note that
$$r\Delta - m(r^2 - a^2) = r^3 - 3mr^2 + a^2r + ma^2 = f_a(r),$$
hence $r = r_0$ is a (partial) photon sphere for the Kerr metric. The rest of point ii) follows from i), except the final remark. But, for $a > 0$,
$$\lambda_2(a) = m + (m^2 - a^2/3)^{1/2} \le 2m = (3m + (m^2)^{1/2})/2 \le (3m + (9m^2 - 8a^2)^{1/2})/2 = r_1.$$

Definition. We call the region $IS = \{(t, r, \phi, \theta),\ 0 \le \theta \le \pi,\ S_a(\theta) \le r \le r_0\}$ the intersphere region.
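The quantities of Lemma 4.4 are easy to tabulate. The sketch below locates $r(\alpha)$ by bisection on $[\lambda_2(\alpha), 4m]$ (the upper bound $4m$ is an assumption of this illustration, justified by $f_\alpha(4m) > 0$ for $0 \le \alpha \le a < m$), and compares $r(0)$ with the closed form for $r_1$. Function names are hypothetical.

```python
import math

def f(alpha, r, a, m):
    """f_alpha(r) = r^3 - 3 m r^2 + (2 a^2 - alpha^2) r + m alpha^2."""
    return r ** 3 - 3 * m * r ** 2 + (2 * a * a - alpha * alpha) * r + m * alpha * alpha

def lambda2(alpha, a, m):
    """Larger critical point of f_alpha."""
    return m + math.sqrt(m * m + (alpha * alpha - 2 * a * a) / 3.0)

def root(alpha, a, m, tol=1e-13):
    """Unique zero of f_alpha above lambda2(alpha), found by bisection."""
    lo, hi = lambda2(alpha, a, m), 4.0 * m       # f < 0 at lo and f > 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(alpha, mid, a, m) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def intersphere_inner(theta, a, m):
    """S_a(theta) = r(a |cos theta|), the inner boundary of the region IS."""
    return root(a * abs(math.cos(theta)), a, m)

a, m = 0.5, 1.0
r1 = root(0.0, a, m)
r0 = root(a, a, m)
print("r1 =", r1, " closed form =", (3 * m + math.sqrt(9 * m * m - 8 * a * a)) / 2)
print("r0 =", r0)
for theta in (0.0, math.pi / 4, math.pi / 2):
    print(theta, intersphere_inner(theta, a, m))  # lies in [r1, r0]
```

For $a \to 0$ both $r_1$ and $r_0$ tend to $3m$, which matches the later remark that the intersphere region collapses in the Schwarzschild case.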


5. A Nonexistence Theorem for Kerr Metrics

Theorem. Assume $0 < a < m$. Let $X = \alpha_1e_1 + \alpha_2e_2 + be_3 + ce_4$ be a vector field with the following properties:
i) The $C^1$ coefficients $b$, $c$ and $\alpha_2$ do not depend on $t$ or $\phi$: $b = b(r, \theta)$, $c = c(r, \theta)$, $\alpha_2 = \alpha_2(r, \theta)$.
ii) The coefficient $\alpha_1$ is $C^\infty$ for $0 \le \theta \le \pi$ and analytic in $r$ for $0 < \theta < \pi$.
Assume that $X$ is non negative, that is, there exists a function $R$ with $\tilde I \equiv Q\,{}^X\pi + R|\nabla\psi|^2 \ge 0$. Then, in the intersphere region IS, $R \equiv 0$ and $\tilde I \equiv 0$.

Remarks. i) The independence of $b$, $c$ and $\alpha_2$ on $\phi$ is technical; it seems natural since the coefficients of the Kerr metric do not depend on $\phi$ either. The fact that these coefficients should be independent of $t$ is motivated by the fact that the flux of the field at time $t$ has to be controlled by the standard energy (as in the Morawetz inequality). ii) The analyticity assumption on $\alpha_1$ is used in point 3 of the proof of the theorem, and remains a rather mysterious technical assumption.

The remaining sections are devoted to the proof of this theorem.

5.1. Computation of the quadratic terms $\tilde I$. Let $X = \bar X + Z$, $\bar X = \alpha_1e_1 + \alpha_2e_2$, $Z = be_3 + ce_4$.

Lemma 5.1. With $\pi = {}^X\pi$, $\bar\pi = {}^{\bar X}\pi$, ${}^\alpha\pi = {}^{e_\alpha}\pi$, we have, for any $R$,
$$\tilde I = Q\pi + R|\nabla\psi|^2 = e_1^2[\bar R + A] + e_2^2[\bar R - A] + 2e_1e_2(\bar\pi_{12} + b\,{}^3\pi_{12} + c\,{}^4\pi_{12}) + 2CT + ye_3^2 + ze_4^2 + (2\mu)^{-1}he_3e_4,$$
where
$$h = -2R + \bar\pi_{11} + \bar\pi_{22} + b({}^3\pi_{11} + {}^3\pi_{22}) + c({}^4\pi_{11} + {}^4\pi_{22}),$$
$$\bar R = R - e_3(b) - e_4(c) + (2\mu)^{-1}[b\,{}^3\pi_{34} + c\,{}^4\pi_{34} + \bar\pi_{34}],$$
$$2A = b({}^3\pi_{11} - {}^3\pi_{22}) + c({}^4\pi_{11} - {}^4\pi_{22}) + \bar\pi_{11} - \bar\pi_{22},$$
$$y = -e_4(b)/\mu + (4\mu^2)^{-1}(b\,{}^3\pi_{44} + c\,{}^4\pi_{44}), \quad z = -e_3(c)/\mu + (4\mu^2)^{-1}(b\,{}^3\pi_{33} + c\,{}^4\pi_{33}),$$
and the cross terms $CT$ are
$$CT = e_1e_3[e_1(b) - (2\mu)^{-1}\bar\pi_{14}] + e_1e_4[e_1(c) - (2\mu)^{-1}\bar\pi_{13}] + e_2e_3[e_2(b) - 2bar\sin\theta\,\Sigma^{-3/2} - (2\mu)^{-1}\bar\pi_{24}] + e_2e_4[e_2(c) + 2acr\sin\theta\,\Sigma^{-3/2} - (2\mu)^{-1}\bar\pi_{23}].$$
Here and in the sequel, we just write $e_\alpha$ for $e_\alpha(\psi)$, when no misunderstanding can occur.



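Proof of Lemma 5.1. 1. For $\bar\pi = {}^{\bar X}\pi$, we have
$$\bar\pi_{\alpha\beta} = \alpha_1\,{}^1\pi_{\alpha\beta} + \alpha_2\,{}^2\pi_{\alpha\beta} + \tilde\pi_{\alpha\beta},$$
$$\tilde\pi_{\alpha\beta} = e_\alpha(\alpha_1)\langle e_1, e_\beta\rangle + e_\alpha(\alpha_2)\langle e_2, e_\beta\rangle + e_\beta(\alpha_1)\langle e_1, e_\alpha\rangle + e_\beta(\alpha_2)\langle e_2, e_\alpha\rangle.$$
Hence
$$\tilde\pi_{11} = 2e_1(\alpha_1), \quad \tilde\pi_{22} = 2e_2(\alpha_2), \quad \tilde\pi_{12} = e_1(\alpha_2) + e_2(\alpha_1), \quad \tilde\pi_{13} = e_3(\alpha_1), \quad \tilde\pi_{14} = e_4(\alpha_1),$$
$$\tilde\pi_{23} = e_3(\alpha_2), \quad \tilde\pi_{24} = e_4(\alpha_2), \quad \tilde\pi_{33} = \tilde\pi_{34} = \tilde\pi_{44} = 0.$$
2. For $Z$, we have similarly
$${}^Z\pi_{\alpha\beta} = b\,{}^3\pi_{\alpha\beta} + c\,{}^4\pi_{\alpha\beta} + e_\alpha(b)\langle e_3, e_\beta\rangle + e_\alpha(c)\langle e_4, e_\beta\rangle + e_\beta(b)\langle e_3, e_\alpha\rangle + e_\beta(c)\langle e_4, e_\alpha\rangle.$$
This gives
$$Q\,{}^Z\pi = bQ\,{}^3\pi + cQ\,{}^4\pi - (e_1^2 + e_2^2)(e_3(b) + e_4(c)) + 2[e_1(b)e_1e_3 + e_1(c)e_1e_4 + e_2(b)e_2e_3 + e_2(c)e_2e_4] - \mu^{-1}[e_4(b)e_3^2 + e_3(c)e_4^2].$$
Without making explicit all the terms in the deformation tensors for $e_3$ and $e_4$ yet, with $d_1 = b({}^3\pi_{11} - {}^3\pi_{22}) + c({}^4\pi_{11} - {}^4\pi_{22})$,
$$Q\,{}^Z\pi = e_1^2[-e_3(b) - e_4(c) + (2\mu)^{-1}(b\,{}^3\pi_{34} + c\,{}^4\pi_{34}) + (1/2)d_1] + e_2^2[-e_3(b) - e_4(c) + (2\mu)^{-1}(b\,{}^3\pi_{34} + c\,{}^4\pi_{34}) - (1/2)d_1]$$
$$+ 2e_1e_2(b\,{}^3\pi_{12} + c\,{}^4\pi_{12}) + 2[e_1(b)e_1e_3 + e_1(c)e_1e_4 + e_2e_3(e_2(b) - 2bar\sin\theta\,\Sigma^{-3/2}) + e_2e_4(e_2(c) + 2acr\sin\theta\,\Sigma^{-3/2})]$$
$$+ e_3^2[-\mu^{-1}e_4(b) + (4\mu^2)^{-1}(b\,{}^3\pi_{44} + c\,{}^4\pi_{44})] + e_4^2[-\mu^{-1}e_3(c) + (4\mu^2)^{-1}(b\,{}^3\pi_{33} + c\,{}^4\pi_{33})] + (2\mu)^{-1}e_3e_4[b({}^3\pi_{11} + {}^3\pi_{22}) + c({}^4\pi_{11} + {}^4\pi_{22})].$$

5.2. The function $H$.

Lemma 5.2.1. With the notations of Lemma 5.1, assuming $(e_3 + e_4)(b) = 0$, $(e_3 + e_4)(c) = 0$, we have
$$h/(2\mu) = -(y + z) - \bar R/(2\mu) + H/(4\mu^2),$$
$$H = k(b - c) + 2\alpha_1\,{}^1\pi_{34} + (4\mu)[\langle D_1\bar X, e_1\rangle + \langle D_2\bar X, e_2\rangle],$$
$$k = 8\Delta(r^2 + a^2)^{-3}(2r\Delta + (m - r)\Sigma).$$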


Proof of Lemma 5.2.1. 1. Taking into account the assumptions on $b$ and $c$, we get
$$y = e_3(b)/\mu + (4\mu^2)^{-1}(b\,{}^3\pi_{44} + c\,{}^4\pi_{44}), \quad \bar R = R - e_3(b) + e_3(c) + (2\mu)^{-1}[b\,{}^3\pi_{34} + c\,{}^4\pi_{34} + \bar\pi_{34}],$$
hence $h$ has the above form with
$$H = b[{}^3\pi_{33} + {}^3\pi_{44} + 2\,{}^3\pi_{34} + 2\mu({}^3\pi_{11} + {}^3\pi_{22})] + c[{}^4\pi_{33} + {}^4\pi_{44} + 2\,{}^4\pi_{34} + (2\mu)({}^4\pi_{11} + {}^4\pi_{22})] + 2\bar\pi_{34} + 2\mu(\bar\pi_{11} + \bar\pi_{22}).$$
The coefficient $co_b$ of $b$ is
$$(1/2)co_b = \langle D_3e_3, e_4\rangle + \langle D_4e_3, e_4\rangle + (2\mu)(\langle D_1e_3, e_1\rangle + \langle D_2e_3, e_2\rangle).$$
The coefficient $co_c$ of $c$ is
$$(1/2)co_c = \langle D_3e_4, e_3\rangle + \langle D_4e_4, e_3\rangle + (2\mu)(\langle D_1e_4, e_1\rangle + \langle D_2e_4, e_2\rangle) = -2(e_3 + e_4)\mu + (2\mu)[\langle [e_1, e_3 + e_4], e_1\rangle + \langle [e_2, e_3 + e_4], e_2\rangle] - (1/2)co_b.$$
In the present case of the Kerr metric, $(e_3 + e_4)\mu = 0$, $[e_1, e_3 + e_4] = 0$, $[e_2, e_3 + e_4] = 0$, hence $co_b = -co_c$.
2. From the formula of Sect. 4.1, we have
$$\langle D_3e_3, e_4\rangle + \langle D_4e_3, e_4\rangle = 4\Delta(r^2 + a^2)^{-4}[m\Sigma(a^2 - r^2) + a^2r\Delta\sin^2\theta].$$
The quantity in the bracket is
$$m\Sigma(a^2 - r^2) + r\Delta(a^2 - a^2\cos^2\theta) = mr^2(a^2 - r^2) + r\Delta a^2 + (m - r)(r^2 + a^2)(a^2\cos^2\theta)$$
$$= (m - r)(r^2 + a^2)\Sigma - r^2(m - r)(r^2 + a^2) + r(r^2 + a^2)(a^2 - mr) = (r^2 + a^2)[(m - r)\Sigma + r\Delta].$$
Since $2\mu(\langle D_1e_3, e_1\rangle + \langle D_2e_3, e_2\rangle) = 4r\Delta^2(r^2 + a^2)^{-3}$, we obtain finally
$$(1/2)co_b = 4\Delta(r^2 + a^2)^{-3}[2r\Delta + (m - r)\Sigma].$$
3. Since ${}^2\pi_{34} = 0$, we have from Lemma 5.1,
$$\bar\pi_{34} = \alpha_1\,{}^1\pi_{34} + \alpha_2\,{}^2\pi_{34} = \alpha_1\,{}^1\pi_{34}, \quad \bar\pi_{11} + \bar\pi_{22} = 2(\langle D_1\bar X, e_1\rangle + \langle D_2\bar X, e_2\rangle),$$
which finishes the proof.

Lemma 5.2.2. i) The non negativity of the quadratic form $\tilde I = Q\pi + R|\nabla\psi|^2$ requires as necessary conditions
$$\bar R \ge |A|, \quad y \ge 0, \quad z \ge 0, \quad \bar R/(2\mu) + (\sqrt y - \sqrt z)^2 \le H/(4\mu^2).$$



ii) For $e_1 = 0$, $e_3 = e_4$, the quadratic form reduces to
$$\tilde I = Q\pi + R|\nabla\psi|^2 = (\bar R - A)e_2^2 + Be_2e_3 + (y + z + h/(2\mu))e_3^2,$$
where
$$B = -4ar\sin\theta\,\Sigma^{-3/2}(b - c) - 4a(r^2 + a^2)\Sigma^{-2}\alpha_1\cos\theta.$$
The non negativity of the quadratic form requires $|B| \le \mu^{-3/2}H$.

Proof of Lemma 5.2.2. 1. If we set $e_1 = e_2 = 0$ in the expression of the quadratic form given in Lemma 5.1, the non negativity of $ye_3^2 + ze_4^2 + (2\mu)^{-1}he_3e_4$ requires $(2\mu)^{-1}|h| \le 2(yz)^{1/2}$, and in particular
$$-h/(2\mu) = y + z + \bar R/(2\mu) - H/(4\mu^2) \le 2(yz)^{1/2},$$
which is the required inequality.
2. From Lemma 5.1 and the assumptions on $b$, $c$, we get
$$B = -4ar\sin\theta\,\Sigma^{-3/2}(b - c) - \mu^{-1}(\bar\pi_{23} + \bar\pi_{24}).$$
But
$$\bar\pi_{23} = \alpha_1\,{}^1\pi_{23} + \alpha_2\,{}^2\pi_{23} + e_3(\alpha_2), \quad \bar\pi_{24} = \alpha_1\,{}^1\pi_{24} + \alpha_2\,{}^2\pi_{24} + e_4(\alpha_2),$$
$$\bar\pi_{23} + \bar\pi_{24} = 4a\alpha_1\Delta\cos\theta/(\Sigma(r^2 + a^2)) + (e_3 + e_4)(\alpha_2),$$
since ${}^2\pi_{23} + {}^2\pi_{24} = 0$, and this gives the expression of $B$. The non-negativity requires $\bar R \le H/(2\mu)$, and $\bar R \ge |A|$, which implies $\bar R - A \le 2\bar R$,
$$B^2 \le (2/\mu)(\bar R - A)(H/(2\mu) - \bar R) \le (4/\mu)(H/(2\mu))^2,$$
that is $|B| \le \mu^{-3/2}H$.

Lemma 5.2.3. Set
$$B = -4ar\sin\theta\,\Sigma^{-3/2}\tilde B, \quad \tilde B = b - c + [(r^2 + a^2)/(r\Sigma^{1/2})]\,\alpha_1\cos\theta/\sin\theta.$$
Then, for some $C^\infty$ functions $\beta_1$, $\beta_2$, analytic in $r$,
$$H = k\tilde B + \tilde H, \quad \tilde H = (4/(\mu\Sigma^{1/2}))[\partial_\theta\alpha_1 + \beta_1(r)\alpha_1\cos\theta/\sin\theta + \beta_2\alpha_1],$$
$$\beta_1(r) = 1 - 2\Delta^2f_a(r)/(r(r^2 + a^2)^4), \quad f_a(r) = 2r\Delta + (m - r)(r^2 + a^2),$$
and, for some smooth $\gamma$,
$$\tilde H = (4/\mu)e^{-\gamma}\Sigma^{-1/2}(\sin\theta)^{-\beta_1}\partial_\theta w, \quad w = e^\gamma\alpha_1(\sin\theta)^{\beta_1}.$$
Note that in the intersphere region IS, $f_a(r) \le 0$, hence $\beta_1 \ge 1$.


Proof of Lemma 5.2.3. We have, with $\bar X = \alpha_1e_1 + \alpha_2e_2$, since $\langle D_1e_2, e_1\rangle = 0$ and $e_2(\alpha_2) = 0$,
$$\langle D_1\bar X, e_1\rangle + \langle D_2\bar X, e_2\rangle = e_1(\alpha_1) + \alpha_1\langle D_2e_1, e_2\rangle = e_1(\alpha_1) + (r^2 + a^2)\Sigma^{-3/2}\alpha_1\cos\theta/\sin\theta.$$
This gives
$$\tilde H = (4/\mu)[e_1(\alpha_1) + \alpha_1(r^2 + a^2)\Sigma^{-3/2}\cos\theta/\sin\theta] + 2\alpha_1\,{}^1\pi_{34} - k(r^2 + a^2)\alpha_1/(r\Sigma^{1/2})\cos\theta/\sin\theta.$$
Now
$$(r^2 + a^2)/\Sigma = (1 - a^2\sin^2\theta/(r^2 + a^2))^{-1} = 1 + \ldots\sin^2\theta,$$
$$[\mu(r^2 + a^2)/(4r)]k = [2\Delta^2/(r(r^2 + a^2)^4)](2r\Delta + (m - r)(r^2 + a^2)) + \ldots\sin^2\theta,$$
which finally gives $\tilde H = (4/(\mu\Sigma^{1/2}))[\partial_\theta\alpha_1 + \beta_1(r)\alpha_1\cos\theta/\sin\theta + \beta_2\alpha_1]$ with
$$\beta_1 = 1 - 2\Delta^2f_a(r)/(r(r^2 + a^2)^4), \quad f_a(r) = 2r\Delta + (m - r)(r^2 + a^2).$$
We set
$$w = e^\gamma\alpha_1(\sin\theta)^{\beta_1}, \quad \partial_\theta w = e^\gamma(\sin\theta)^{\beta_1}[\partial_\theta\alpha_1 + \beta_1\alpha_1\cos\theta/\sin\theta + \alpha_1\partial_\theta\gamma],$$
and it is enough to take $\gamma$ such that $\partial_\theta\gamma = \beta_2$.

6. Proof of the Non Existence Theorem

1. We prove first that, in the interior of the intersphere region, i) the function $\tilde H$ from Lemma 5.2.3 cannot be negative, ii) if $\tilde H(p) = 0$, then $B(p) = 0$. Suppose $\tilde H(p) < 0$ for some $p \in IS$ with $\sin\theta \ne 0$ at $p$. Then the necessary condition from Lemma 5.2.2 would imply
$$|B| = 4ar\sin\theta\,\Sigma^{-3/2}|\tilde B| < \mu^{-3/2}k\tilde B \le \mu^{-3/2}|k||\tilde B|,$$
which in turn implies
$$4ar\sin\theta\,\Sigma^{-3/2} < \mu^{-3/2}|k|.$$
In the region IS, we know from Lemma 4.4 that
$$k = 8\Delta(r^2 + a^2)^{-3}\tilde k \ge 0,$$



and we observe that
$$\tilde k = f_{a|\cos\theta|}(r) = 2r\Delta + (m - r)(r^2 + a^2 - a^2\sin^2\theta) = f_a(r) + (r - m)a^2\sin^2\theta \le (r - m)a^2\sin^2\theta,$$
since, according to Lemma 4.4 ii), $f_a(r) \le 0$ for $r_1 \le r \le r_0$. Thus the inequality we have obtained at $p$ implies
$$4r\Sigma^{-3/2} < \mu^{-3/2}\,8\Delta(r^2 + a^2)^{-3}(r - m)a.$$
2. We prove now that, for $a \le m$, the non strict inequality
$$4r\Sigma^{-3/2} \le \mu^{-3/2}\,8\Delta(r^2 + a^2)^{-3}(r - m)a$$
is impossible in the region IS. This inequality is in fact $r\Delta^{1/2} \le 2(r - m)a$. Now
$$\Delta^{1/2}(r - m)^2\,(r\Delta^{1/2}/(r - m))' = -m\Delta + r(r - m)^2 = (r - m)^3 + m(m^2 - a^2) > 0,$$
hence the inequality would imply
$$r_1(r_1^2 + a^2 - 2mr_1)^{1/2} = r_1(mr_1 - a^2)^{1/2} \le 2(r_1 - m)a.$$
Using the equation satisfied by $r_1$, this is equivalent to
$$r_1^2(mr_1 - a^2) \le 4a^2(r_1 - m)^2, \quad (3mr_1 - 2a^2)(mr_1 - a^2) \le 4a^2(mr_1 + m^2 - 2a^2), \quad 9mr_1 \le 10a^2.$$
Since $2m \le r_1$, this is impossible. This proves points i) and ii) above.
3. From 2 we deduce in particular that $\tilde H \ge 0$ for $r = r_0$ and $0 < \theta < \pi$, hence $\tilde H \equiv 0$ for $r = r_0$ because of Lemma 5.2.3. Consider now the function $\hat H = (\sin\theta)\tilde H$: by assumption, it is a $C^\infty$ function, vanishing on $r = r_0$, analytic in $r$ for $0 < \theta < \pi$. The function $\hat H$ can be written $\hat H = (r - r_0)^lH_1$ for some $l$ and some $H_1 \in C^\infty$ non-identically zero on $r = r_0$; if this were not the case, $\hat H$ would be flat on $r = r_0$, which would imply $\tilde H \equiv 0$ by the analyticity assumption. Now, by Lemma 5.2.3, for $r \ne r_0$,
$$\int_0^\pi (\mu/4)e^\gamma\Sigma^{1/2}(\sin\theta)^{\beta_1 - 1}H_1\,d\theta = 0,$$
hence the same must be true also for $r = r_0$, which implies that there are points
$$m_1 = (r_0, \theta_1), \quad m_2 = (r_0, \theta_2), \quad 0 < \theta_1 < \pi, \quad 0 < \theta_2 < \pi,$$
for which $H_1(m_1) < 0$, $H_1(m_2) > 0$.


This implies that $\tilde H < 0$ at some point interior to IS, which is impossible.
4. Hence $\tilde H \equiv 0$ everywhere by the analyticity assumption, which implies $\alpha_1 \equiv 0$ by Lemma 5.2.3. From point 1, $\tilde B = 0$ in IS, and also $b \equiv c$ in IS. Since $H = 0$ in IS, we obtain from Lemma 5.2.2 that also $\bar R$, $y - z$ and $A$ are zero there. But $b \equiv c$ implies $y = -z$, since ${}^3\pi_{44} = -{}^4\pi_{33}$. Finally $y = z = 0$ in IS, and the quadratic form $\tilde I$ obtained from $X$ in Lemma 5.1 is identically zero. Since ${}^3\pi_{34} + {}^4\pi_{34} = 0$, $b = c$ and $\alpha_1 = 0$ imply $R = \bar R = 0$.

Remark. In the case $a = 0$ of the Schwarzschild metric, the intersphere region reduces to $r = 3m$: it is possible, as in Sect. 3.6, to have on the photon sphere $b = c$ and $\partial_rb = -\partial_rc$. This explains the difference in the conclusions.

Acknowledgement. We would like to thank Prof. S. Klainerman for many helpful conversations about these subjects.

References

1. Alinhac, S.: On the Morawetz-Keel-Smith-Sogge inequality for the wave equation on a curved background. Publ. R. Inst. Math. Sc. Kyoto 42(3), 705–720 (2006)
2. Chandrasekhar, S.: The Mathematical Theory of Black Holes. Int. Ser. Mon. Physics 69, Oxford: Oxford Univ. Press, 1983
3. Christodoulou, D.: Bounded variation solutions of the spherically symmetric Einstein-scalar field equations. Comm. Pure Appl. Math. XLVI, 1131–1220 (1993)
4. Dafermos, M., Rodnianski, I.: The Red-Shift Effect and Radiation Decay on Black Hole Spacetimes. http://arxiv.org/abs/gr-qc/0512119, 2005
5. Dafermos, M., Rodnianski, I.: A Note on Energy Currents and Decay for the Wave Equation on a Schwarzschild Background. http://arxiv.org/abs/0710.071v1[math,AP], 2007
6. Dafermos, M., Rodnianski, I.: Lectures on Black Holes and Linear Waves. http://arxiv.org/abs/0811. 354v1[gr-qc], 2008
7. Hawking, S.W., Ellis, G.F.: The Large Scale Structure of Space-Time. Cambridge Mon. Math. Physics, Cambridge: Cambridge Univ. Press, 1973
8. Hörmander, L.: Lectures on Nonlinear Hyperbolic Differential Equations. Math. Appl. 26, Berlin-Heidelberg-New York: Springer Verlag, 1997
9. Klainerman, S., Nicolò, F.: The Evolution Problem in General Relativity. Prog. Math. Physics 25, Basel-Boston: Birkhäuser, 2002
10. Morawetz, C.S.: Time decay for the nonlinear Klein-Gordon equation. Proc. Roy. Soc. A 306, 291–296 (1968)
11. Tataru, D., Tohaneanu, M.: Local Energy Estimate on Kerr Black Hole Background. http://arxiv.org/abs/ 0810.5766v2[math,AP], 2008
12. Wald, R.: General Relativity. Chicago, IL: Univ. Chicago Press, 1984

Communicated by G.W. Gibbons


Communications in


The N = 1 Triplet Vertex Operator Superalgebras

Dražen Adamović (1), Antun Milas (2)

(1) Department of Mathematics, University of Zagreb, Zagreb, Croatia. E-mail: [email protected]
(2) Department of Mathematics and Statistics, University at Albany (SUNY), Albany, NY 12222, USA. E-mail: [email protected]; [email protected]

Received: 6 February 2008 / Accepted: 9 November 2008
Published online: 18 February 2009 – © Springer-Verlag 2009

Abstract: We introduce a new family of $C_2$-cofinite N = 1 vertex operator superalgebras SW(m), m ≥ 1, which are natural super analogs of the triplet vertex algebra family W(p), p ≥ 2, important in logarithmic conformal field theory. We classify irreducible SW(m)-modules and discuss logarithmic modules. We also compute bosonic and fermionic formulas of irreducible SW(m) characters. Finally, we contemplate possible connections between the category of SW(m)-modules and the category of modules for the quantum group $U_q^{small}(sl_2)$, $q = e^{2\pi i/(2m+1)}$, by focusing primarily on properties of characters and Zhu's algebra A(SW(m)). This paper is a continuation of our paper Adv. Math. 217, no. 6, 2664–2699 (2008).

The second author was partially supported by NSF grant DMS-0802962.

Contents

1. Introduction
2. Preliminaries
   2.1 Zhu's algebra A(V)
   2.2 Intertwining operators among vertex operator superalgebra modules
3. N = 1 Neveu-Schwarz Vertex Operator Superalgebras
4. Fusion Rules For N = 1 Superconformal (2m + 1, 1)-Models
5. Lattice and Fermionic Vertex Superalgebras
   5.1 Fermionic vertex operator superalgebra F
   5.2 Vertex superalgebra SM(1)
6. The N = 1 Neveu-Schwarz Module Structure of $V_L \otimes F$-Modules
7. The Vertex Operator Superalgebra SM(1)
8. Zhu's Algebra A(SM(1)) and Classification of Irreducible SM(1)-Modules
   8.1 Logarithmic SM(1)-modules
   8.2 Further properties of A(SM(1))
9. The N = 1 Triplet Vertex Algebra SW(m)
10. Classification of Irreducible SW(m)-Modules
11. On the Structure of Zhu's Algebra A(SW(m))
12. Modular Properties of Characters of Irreducible SW(m)-Modules
13. SW(m)-Characters and q-Series Identities
   13.1 The m = 1 case: first computation
   13.2 Irreducible SW(m) characters from W(2m + 1) characters
14. A Conjectural Relation of SW(m) with Quantum Groups
15. Outlook and Final Remarks
16. Appendix
References

1. Introduction

Compared to rational vertex algebras, significantly less is known about the structure of modules for general vertex algebras. Recently, geared up with clues from the physics literature, some breakthrough has been achieved in understanding at least quasi-rational vertex algebras (i.e., $C_2$-cofinite irrational vertex algebras), and in particular the triplet vertex algebras W(p), p ≥ 2 [AM2,FHST,CF] (cf. also [Ab] for p = 2). But apart from the symplectic fermions W(2), the description of categories of weak (logarithmic) modules for other triplets W(p), p ≥ 3, remains open, even though there is strong evidence for a Kazhdan-Lusztig correspondence between the category of logarithmic W(p)-modules and certain categories of modules for quantum groups (for these and related developments we refer the reader to [FHST], and especially [FGST1,FGST2,Se], and references therein). In [AM2] we obtained several useful results about the structure of the category of W(p)-modules by using primarily Zhu's algebra and Miyamoto's pseudocharacters [Miy]. Eventually, we will require more-or-less explicit knowledge of "higher" Zhu's algebras for the triplet. But several obstacles (e.g., explicit realization of certain logarithmic modules) prevent us from taking this theory to the next level. We hope that this approach, in particular, will give additional evidence for the Kazhdan-Lusztig correspondence and contribute to a proper understanding of the relationship between quantum groups and triplets.

As with other familiar rational vertex operator algebras (e.g. Virasoro minimal models), one may also wonder if the triplet has (interesting!) N = k super extensions, and whether those exhibit similar properties (e.g., $C_2$-cofiniteness). In this paper we solve this problem for k = 1, by constructing a family of N = 1 vertex operator superalgebras SW(m), m ≥ 1, which share many similarities with the triplet family.
In what follows we briefly recall the construction of SW(m) and present our main results. Let us recall that the triplet vertex algebra W(p) [FHST,AM2] is defined as the kernel of a screening operator acting from $V_L$ to $V_{L - \alpha/p}$, where $V_L$ is the vertex algebra associated to the rank one even lattice $Z\alpha$, $\langle\alpha, \alpha\rangle = 2p$, and $V_{L - \alpha/p}$ is a certain $V_L$-module. To construct an N = 1 super triplet we replace the even lattice with an odd lattice such that $\langle\alpha, \alpha\rangle = 2m + 1$, so that $V_L$ has a natural vertex operator superalgebra structure. Then we tensor $V_L$ with F, the free fermion vertex operator superalgebra (cf. [KWn]), and again, there is a screening operator
$$\tilde Q : V_L \otimes F \longrightarrow V_{L - \frac{\alpha}{2m+1}} \otimes F.$$


The kernel of this operator, denoted by SW(m), is what we call the N = 1 triplet vertex operator superalgebra (or simply, N = 1 super triplet). If we restrict the kernel of $\tilde Q$ on the charge zero subspace we obtain another vertex operator superalgebra SM(1) ⊂ SW(m), called the N = 1 singlet vertex operator superalgebra. Both vertex operator superalgebras contain the Neveu-Schwarz vector τ, giving a representation of the ns Lie superalgebra of central charge $\frac{3}{2} - \frac{12m^2}{2m+1}$. This is precisely the central charge of (1, 2m + 1) Neveu-Schwarz (degenerate) minimal modules. A different N = 1 extension of the symplectic fermion W-algebra W(2) was considered in [MS]. By using the notation used by physicists, our super triplet would be an example of a super W-algebra of type $W(\frac{3}{2}, 2m + \frac{1}{2}, 2m + \frac{1}{2}, 2m + \frac{1}{2})$. Similarly, the N = 1 singlet algebra SM(1) is an example of a super W-algebra of type $W(\frac{3}{2}, 2m + \frac{1}{2})$. We should say that for low m (e.g., m = 1) some general properties of W-superalgebras with two generators were also discussed in the physics literature, but mostly by using the Jacobi identity and methods of Lie algebras (cf. [BS] and references therein). We should also mention that several general results about W-superalgebras associated to affine superalgebras were recently obtained in [Ar,KWak] (see also [HK]). However, super singlet and super triplet vertex superalgebras do not appear in these works.

Because of the similarity between W(p) and SW(m), many results we obtain here are intimately related to those for the triplet [AM2] (cf. [FGST1,CF]), but there are some subtle differences which we address at various stages. However, to keep the paper self-contained, at many places we give proofs that are almost identical to those in [AM2].

Let us first consider the super singlet SM(1). This vertex superalgebra is too small to be $C_2$-cofinite (let alone rational!), which is evident from the following result:

Theorem 1.1. Assume that m ≥ 1.
(i) The singlet vertex superalgebra SM(1) is a simple N = 1 vertex operator superalgebra generated by τ and a primary vector H of conformal weight $2m + \frac{1}{2}$.
(ii) The associative Zhu's algebra A(SM(1)) is isomorphic to the commutative algebra $C[x, y]/\langle P(x, y)\rangle$, where $\langle P(x, y)\rangle$ is the ideal in $C[x, y]$ generated by the polynomial
$$P(x, y) = y^2 - C_m\prod_{i=0}^{2m}(x - h^{2i+1,1}),$$
where $C_m$ is a non-trivial constant and
$$h^{2i+1,1} = \frac{i(i - 2m)}{2(2m + 1)}.$$
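For concreteness, the central charge quoted in the Introduction and the weights $h^{2i+1,1}$ entering $P(x, y)$ can be tabulated with exact rational arithmetic. This is only an illustrative sketch of the two formulas as stated in the text; the function names are hypothetical.

```python
from fractions import Fraction

def central_charge(m):
    """c = 3/2 - 12 m^2 / (2m + 1), as quoted in the Introduction."""
    return Fraction(3, 2) - Fraction(12 * m * m, 2 * m + 1)

def weight(i, m):
    """h^{2i+1,1} = i (i - 2m) / (2 (2m + 1)), for i = 0, ..., 2m."""
    return Fraction(i * (i - 2 * m), 2 * (2 * m + 1))

for m in (1, 2, 3):
    roots = [weight(i, m) for i in range(2 * m + 1)]
    print(f"m = {m}: c = {central_charge(m)}, roots of P in x: {[str(h) for h in roots]}")
```

The formula is symmetric under $i \mapsto 2m - i$, so the $2m + 1$ roots of the product take only $m + 1$ distinct values, with $h^{2m+1,1}$ (the case $i = m$) occurring once.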

So the structure and representation theory of SM(1) is quite similar to that of M(1) investigated in [A3] and [AM1]. In particular we can construct interesting logarithmic SM(1)-modules and logarithmic intertwining operators as defined in [HLZ]. Next we study the vertex operator superalgebra SW(m). The main result on the structure of this vertex superalgebra is

Theorem 1.2. Assume that m ≥ 1.
(i) SW(m) is a simple N = 1 vertex operator superalgebra generated by τ and three primary vectors E, F, H of conformal weight $2m + \frac{1}{2}$.



(ii) The vertex operator superalgebra SW(m) is irrational and $C_2$-cofinite.
(iii) SW(m) has precisely 2m + 1 inequivalent irreducible modules.

Our proof of (ii) imitates the proof of $C_2$-cofiniteness for the triplet W(p) [AM2] (for a different proof see [CF]). The rest is done by combining methods of Zhu's associative algebra and our knowledge of irreducible $V_L \otimes F$-modules. In parallel with the triplet vertex algebra, we do not have an explicit description of A(SW(m)), but we believe that the following conjecture should hold true.

Conjecture 1.1. Zhu's algebra decomposes as a sum of ideals
$$A(SW(m)) = \bigoplus_{i=2m+1}^{3m}M_{h^{2i+1,1}} \oplus \bigoplus_{i=0}^{m-1}I_{h^{2i+1,1}} \oplus C_{h^{2m+1,1}},$$
where $M_{h^{2i+1,1}} \cong M_2(C)$, $\dim(I_{h^{2i+1,1}}) = 2$ and $C_{h^{2m+1,1}}$ is one-dimensional. In particular A(SW(m)) is (6m + 1)-dimensional.

In view of the classification result (cf. Theorem 1.2), it is important to compute irreducible characters and study their modular transformation properties. As with the triplet vertex algebra [F1], irreducible SW(m)-characters are often expressible as sums of modular forms of unequal weight. Also, because we are working within vertex operator superalgebras the SL(2, Z) group should be replaced with the θ-group $\Gamma_\theta$. Then we have

Theorem 1.3. The $\Gamma_\theta$-closure of the space spanned by irreducible SW(m)-characters is (3m + 1)-dimensional.

For a more precise statement see Theorem 12.1. Our result should be compared with [F1], where it was observed that the SL(2, Z) closure of the vector space of W(p) characters is (3p − 1)-dimensional. Finally, in parallel with [FGK] and [FFT], we also obtain (see Sect. 13) fermionic formulas for characters of irreducible SW(m)-modules.

Our main results indicate that there is an interesting relationship between characters of irreducible SW(m)-modules and irreducible characters of W(2m + 1)-modules. It is not clear if there is a deeper connection between these two W-algebras. Notice that if Conjecture 1.1 is true, then the center of A(SW(m)) is (3m + 1)-dimensional, which is precisely the dimension of the center of the small quantum group $U_q^{small}(sl_2)$, $q = e^{2\pi i/(2m+1)}$ [Ker]. It is no accident that this dimension matches the dimension in Theorem 1.3 (a similar phenomenon occurs for the triplet vertex algebra [FGST1]). Furthermore, both $U_q^{small}(sl_2)$ and SW(m) have the same number of irreducible modules [Ker] (see also [La]). Thus, motivated by conjectures in [FGST1], we expect the following (rather bold) conjecture to be true.

Conjecture 1.2. The category of weak SW(m)-modules is equivalent to the category of modules for the quantum group $U_q^{small}(sl_2)$, where $q = e^{2\pi i/(2m+1)}$.

2. Preliminaries

In this section we briefly discuss the definition of vertex operator superalgebras, their modules and intertwining operators as developed in [FFR,K,KWn,Li,HLZ,HM], etc.


229

We assume the reader is familiar with basics of vertex algebra theory (cf. [FHL,FLM,FB,K,LL], etc.). Let $V = V_{\bar 0} \oplus V_{\bar 1}$ be any $Z_2$-graded vector space. Then any element $u \in V_{\bar 0}$ (resp. $u \in V_{\bar 1}$) is said to be even (resp. odd). We define $|u| = \bar 0$ if $u$ is even and $|u| = \bar 1$ if $u$ is odd. Elements in $V_{\bar 0}$ or $V_{\bar 1}$ are called homogeneous. Whenever $|u|$ is written, it is understood that $u$ is homogeneous. The notion of vertex superalgebra is a natural (and straightforward) generalization of the notion of vertex algebra where the vector space $V$ in the definition is assumed to be $Z_2$-graded, where the vertex operator map
$$Y(\cdot, z) : V \longrightarrow \mathrm{Hom}(V, V((z))), \quad Y(u, z) = \sum_{n \in Z}u_nz^{-n-1}$$
is compatible with the $Z_2$-grading, and where the Jacobi identity for a pair of homogeneous elements is adjusted with an appropriate sign. A vertex superalgebra $V$ is called a vertex operator superalgebra if there is a special element $\omega \in V_{\bar 0}$ (called the conformal vector) whose vertex operator we write in the form
$$Y(\omega, z) = \sum_{n \in Z}\omega_nz^{-n-1} = \sum_{n \in Z}L(n)z^{-n-2},$$
such that $L(n)$ close the Virasoro algebra representation on $V$, and where $V$ is $\frac{1}{2}Z$-graded (by weight), truncated from below, with finite-dimensional graded subspaces. Also, the grading is determined with the action of the Virasoro operator $L(0)$. In this paper, we shall assume that
$$V_{\bar 0} = \bigoplus_{n \in Z_{\ge 0}}V(n), \quad V_{\bar 1} = \bigoplus_{n \in \frac{1}{2} + Z_{\ge 0}}V(n), \quad \text{where } V(n) = \{a \in V \mid L(0)a = na\}.$$
For $a \in V(n)$, we shall write wt(a) = n or deg(a) = n. We shall sometimes refer to the vertex operator superalgebra $V$ as a quadruple $(V, Y, \mathbf{1}, \omega)$, where $\mathbf{1}$ is the vacuum vector (as for vertex operator algebras). We say that the vertex operator superalgebra $V$ is generated by the set $S \subset V$ if
$$V = \mathrm{span}_C\{u^1_{n_1}\cdots u^r_{n_r}\mathbf{1} \mid u^1, \ldots, u^r \in S,\ n_1, \ldots, n_r \in Z,\ r \in Z_{>0}\}.$$
The vertex operator superalgebra $V$ is said to be strongly generated (cf. [K]) by the set $R$ if
$$V = \mathrm{span}_C\{u^1_{n_1}\cdots u^r_{n_r}\mathbf{1} \mid u^1, \ldots, u^r \in R,\ n_i < 0,\ r \in Z_{>0}\}.$$
In parallel with vertex algebras we can define the notion of weak module for vertex operator superalgebras. Again, the only new requirement is that the vector space $M$ in the definition is $Z_2$-graded, with grading compatible with respect to the action of $V$, and where the Jacobi identity is adjusted as in the case of vertex superalgebras. The vertex operator acting on $M$ is usually denoted by $Y_M$. A weak $V$-module $(M, Y_M)$ is called an (ordinary) $V$-module if $M$ carries an action of the Virasoro algebra via the expansion of $Y_M(\omega, x)$, and in addition $M$ is equipped with an R-grading (or even C-grading) determined by the Virasoro operator $L(0)$. In addition, the grading is truncated from below, with finite dimensional graded subspaces.



As usual, we say that a $V$-module $M$ is irreducible (or simple) if $M$ has no nonzero proper submodules. We say that a vertex operator superalgebra $V$ is rational if every $V$-module $M$ is semisimple (i.e., $M$ decomposes as a direct sum of irreducible modules) and if $V$ has only finitely many (inequivalent) irreducible modules.

Definition 2.1. Let $V$ be a vertex operator superalgebra. We say that a weak $V$-module $M$ is logarithmic if it carries an action of the Virasoro algebra and if it admits a decomposition
\[ M = \bigoplus_{r \in \mathbb{C}} M_r, \]
where $M_r = \{ v : (L(0) - r)^k v = 0 \text{ for some } k \in \mathbb{N} \}$.

2.1. Zhu's algebra A(V). We define two bilinear maps $* : V \times V \to V$, $\circ : V \times V \to V$ as follows. For homogeneous $a, b \in V$ let
\[ a * b = \begin{cases} \mathrm{Res}_x\, Y(a, x) \dfrac{(1+x)^{\deg(a)}}{x}\, b & \text{if } a, b \in V_{\bar 0}, \\ 0 & \text{if } a \text{ or } b \in V_{\bar 1}, \end{cases} \tag{2.1} \]
\[ a \circ b = \begin{cases} \mathrm{Res}_x\, Y(a, x) \dfrac{(1+x)^{\deg(a)}}{x^2}\, b & \text{if } a \in V_{\bar 0}, \\ \mathrm{Res}_x\, Y(a, x) \dfrac{(1+x)^{\deg(a) - \frac12}}{x}\, b & \text{if } a \in V_{\bar 1}. \end{cases} \tag{2.2} \]
Next, we extend $*$ and $\circ$ to $V \otimes V$ linearly, and denote by $O(V) \subset V$ the linear span of elements of the form $a \circ b$, and by $A(V)$ the quotient space $V/O(V)$. The image of $v \in V$ under the natural map $V \to A(V)$ will be denoted by $[v]$. The space $A(V)$ has a unital associative algebra structure, with the product $*$ and $[1]$ as the unit element. The associative algebra $A(V)$ is called Zhu's algebra of $V$.

Assume that $M = \bigoplus_{n \in \frac12\mathbb{Z}_{\ge 0}} M(n)$ is a $\frac12\mathbb{Z}_{\ge 0}$-graded $V$-module. Then the top component $M(0)$ of $M$ is an $A(V)$-module under the action $[a] \mapsto o(a) = a_{\mathrm{wt}(a)-1}$ for homogeneous $a$ in $V_{\bar 0}$. We shall sometimes write $a(0)$ for $o(a)$. (Note that if $a \in V_{\bar 1}$, then $[a] = 0$ in $A(V)$. We formally set $o(a) = a(0) = 0$ in this case.) Moreover, there is a one-to-one correspondence between irreducible $A(V)$-modules and irreducible $\frac12\mathbb{Z}_{\ge 0}$-graded $V$-modules (cf. [KWn]).

As usual, for a vertex operator superalgebra $V$ we let $C_2(V) = \mathrm{span}_{\mathbb{C}}\{ a_{-2} b : a, b \in V \}$. Then it is not hard to see that $P(V) = V/C_2(V)$ has a super Poisson algebra structure with the multiplication
\[ \bar a \cdot \bar b = \overline{a_{-1} b}, \]
and the Lie bracket
\[ [\bar a, \bar b] = \overline{a_0 b}, \]
where $\bar{\ }$ denotes the natural projection from $V$ to $P(V)$ (see for instance [Z]). We have a decomposition $P(V) = P(V)_0 \oplus P(V)_1$ into even and odd subspaces, respectively. If $V/C_2(V)$ is finite-dimensional we say that $V$ is $C_2$-cofinite. Let $a, b \in V$ be $\mathbb{Z}_2$-homogeneous. Then by using super-commutator formulae in vertex operator superalgebras one can easily see that
\[ \bar a \cdot \bar b - (-1)^{|a||b|}\, \bar b \cdot \bar a = 0 \quad \text{in } V/C_2(V). \tag{2.3} \]
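For reference, the residues defining $*$ and $\circ$ in (2.1) and (2.2) can be unpacked into finite sums of modes. The display below is a sketch that uses only $\mathrm{Res}_x\, x^k\, Y(a,x)\, b = a_k b$ and the binomial expansion of $(1+x)^d$; it relies on the standing assumption that $\deg(a) \in \mathbb{Z}_{\ge 0}$ for even $a$ and $\deg(a) \in \frac12 + \mathbb{Z}_{\ge 0}$ for odd $a$, so that all sums terminate.

```latex
% Mode expansions of (2.1) and (2.2); each sum is finite because the
% upper binomial argument is a non-negative integer.
\begin{align*}
a * b     &= \sum_{i \ge 0} \binom{\deg a}{i}\, a_{i-1} b,
          && a, b \in V_{\bar 0}, \\
a \circ b &= \sum_{i \ge 0} \binom{\deg a}{i}\, a_{i-2} b,
          && a \in V_{\bar 0}, \\
a \circ b &= \sum_{i \ge 0} \binom{\deg a - \tfrac12}{i}\, a_{i-1} b,
          && a \in V_{\bar 1}.
\end{align*}
```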

The following result was proved in [DK], and it is a generalization of Proposition 2.2 in [Ab].

Proposition 2.1. Let $V$ be strongly generated by the set $S$. Then we have:
(1) $P(V)$ is generated by the set $\{ \bar a,\ a \in S \}$.
(2) $A(V)$ is generated by the set $\{ [a],\ a \in S \}$.
(3) If $V$ is $C_2$-cofinite, $\dim(P(V)_0) \ge \dim(A(V))$.

2.2. Intertwining operators among vertex operator superalgebra modules. Intertwining operators for superconformal vertex operator algebras were introduced in [KWn]. Their theory is further developed in [HM] by using both even and odd formal variables. We briefly outline the definition here.

Definition 2.2. Let $V$ be a vertex operator superalgebra and let $M_1$, $M_2$ and $M_3$ be a triple of $V$-modules. An intertwining operator $\mathcal{Y}(\cdot, z)$ of type $\binom{M_3}{M_1\ M_2}$ is a linear map
\[ \mathcal{Y} : M_1 \to \mathrm{Hom}(M_2, M_3)\{z\}, \qquad w_1 \mapsto \mathcal{Y}(w_1, z) = \sum_{n \in \mathbb{C}} (w_1)_n z^{-n-1}, \]
satisfying the following conditions for $w_i \in M_i$, $i = 1, 2$ and $a \in V$:
(I1) $\mathcal{Y}(L(-1) w_1, z) = \frac{d}{dz} \mathcal{Y}(w_1, z)$.
(I2) $(w_1)_n (w_2) = 0$ for $\mathrm{Re}(n)$ sufficiently large.
(I3) The following Jacobi identity holds:
\[ z_0^{-1} \delta\!\left( \frac{z_1 - z_2}{z_0} \right) Y_{M_3}(a, z_1) \mathcal{Y}(w_1, z_2) w_2 - (-1)^{|a||w_1|} z_0^{-1} \delta\!\left( \frac{z_2 - z_1}{-z_0} \right) \mathcal{Y}(w_1, z_2) Y_{M_2}(a, z_1) w_2 = z_2^{-1} \delta\!\left( \frac{z_1 - z_0}{z_2} \right) \mathcal{Y}(Y_{M_1}(a, z_0) w_1, z_2) w_2, \]
for $\mathbb{Z}_2$-homogeneous $a$ and $w_1$.

We shall denote by $I\binom{M_3}{M_1\ M_2}$ the vector space of intertwining operators of type $\binom{M_3}{M_1\ M_2}$. Their dimensions are known as the "fusion rules".



3. N = 1 Neveu-Schwarz Vertex Operator Superalgebras

The N = 1 Neveu-Schwarz (or simply NS) algebra is the Lie superalgebra
\[ \mathfrak{ns} = \bigoplus_{n \in \mathbb{Z}} \mathbb{C} L(n) \oplus \bigoplus_{m \in \frac12 + \mathbb{Z}} \mathbb{C} G(m) \oplus \mathbb{C} C \]
with commutation relations ($m, n \in \mathbb{Z}$):
\[ [L(m), L(n)] = (m - n) L(m+n) + \delta_{m+n,0}\, \frac{m^3 - m}{12}\, C, \]
\[ [G(m + \tfrac12), L(n)] = \left( m + \tfrac12 - \tfrac{n}{2} \right) G(m + n + \tfrac12), \tag{3.1} \]
\[ \{ G(m + \tfrac12), G(n - \tfrac12) \} = 2 L(m+n) + \tfrac13\, m(m+1)\, \delta_{m+n,0}\, C, \tag{3.2} \]
\[ [L(m), C] = 0, \qquad [G(m + \tfrac12), C] = 0. \]
It is important to consider vertex algebras which admit an action of the N = 1 Neveu-Schwarz algebra (cf. [HM]). These vertex operator superalgebras are called N = 1 Neveu-Schwarz vertex operator superalgebras and are subject to an additional axiom: there exists $\tau \in V_{3/2}$ (the superconformal vector) such that
\[ Y(\tau, z) = \sum_{n \in \mathbb{Z} + 1/2} G(n) z^{-n-3/2}, \qquad G(n) \in \mathrm{End}(V), \]
where the $G(n)$ satisfy bracket relations as in (3.1) and (3.2).

The simplest examples of N = 1 vertex operator superalgebras are the $\mathfrak{ns}$-modules $L^{ns}(c, 0)$, $c \ne 0$, where we use the standard notation: for any $(c, h) \in \mathbb{C}^2$ we denote by $L^{ns}(c, h)$ the corresponding irreducible highest weight $\mathfrak{ns}$-module with central charge $c$ and highest weight $h$ (cf. [KWn,Li,A1,HM]). It is well known that the vertex operator superalgebra $L^{ns}(c, 0)$, $c \ne 0$, is simple. Set
\[ c_{p,q} = \frac32 \left( 1 - \frac{2(p-q)^2}{pq} \right), \qquad h^{r,s}_{p,q} = \frac{(sp - rq)^2 - (p-q)^2}{8pq}. \]
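As a quick sanity check on these two formulas, the following sketch evaluates them in exact rational arithmetic at $(p, q) = (2m+1, 1)$ and compares the results with the closed forms that the paper uses later ($c_{2m+1,1} = \frac32(1 - \frac{8m^2}{2m+1})$, $h^{1,3} = 2m + \frac12$, $h^{2i+1,1} = \frac{i(i-2m)}{2(2m+1)}$). The function names are illustrative and not taken from the paper.

```python
from fractions import Fraction

def central_charge(p, q):
    """c_{p,q} = (3/2) * (1 - 2 (p - q)^2 / (p q))."""
    return Fraction(3, 2) * (1 - Fraction(2 * (p - q) ** 2, p * q))

def weight(r, s, p, q):
    """h^{r,s}_{p,q} = ((s p - r q)^2 - (p - q)^2) / (8 p q)."""
    return Fraction((s * p - r * q) ** 2 - (p - q) ** 2, 8 * p * q)

def check_closed_forms(m):
    """Compare the general formulas at (p, q) = (2m+1, 1) with closed forms in m."""
    p, q = 2 * m + 1, 1
    ok = central_charge(p, q) == Fraction(3, 2) * (1 - Fraction(8 * m * m, 2 * m + 1))
    ok = ok and weight(1, 1, p, q) == 0                       # vacuum weight
    ok = ok and weight(1, 3, p, q) == 2 * m + Fraction(1, 2)  # h^{1,3}
    ok = ok and all(
        weight(2 * i + 1, 1, p, q) == Fraction(i * (i - 2 * m), 2 * (2 * m + 1))
        for i in range(2 * m + 1)
    )
    return ok

print(all(check_closed_forms(m) for m in range(1, 8)))
```

Exact fractions are used instead of floats so that equalities such as $h^{1,3} = 2m + \frac12$ are tested without rounding.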

In the rest of the paper we shall focus on certain $\mathfrak{ns}$-modules of central charge $c_{2m+1,1}$, $m \ge 1$.

4. Fusion Rules For N = 1 Superconformal (2m + 1, 1)-Models

From now on we will mostly focus on (non-minimal) $(2m+1, 1)$-models, so that $p = 2m+1$, $q = 1$. Relevant lowest weights are $h^{r,s} := h^{r,s}_{2m+1,1}$, $r, s \in \mathbb{Z}$. It will be of great use to determine the fusion rules
\[ I\binom{L(c_{2m+1,1}, h^{r'',s''})}{L(c_{2m+1,1}, h^{r,s})\ \ L(c_{2m+1,1}, h^{r',s'})} \tag{4.1} \]
for certain triples $(r, s)$, $(r', s')$ and $(r'', s'') \in \mathbb{Z}^2$. For $m = 0$ (i.e., the $c = 3/2$ case) these numbers were computed in [M1]. In particular, for every $s > 0$ we have:
\[ L(\tfrac32, h^{1,3}) \times L(\tfrac32, h^{1,2s+1}) = L(\tfrac32, h^{1,2s-1}) \oplus L(\tfrac32, h^{1,2s+1}) \oplus L(\tfrac32, h^{1,2s+3}), \tag{4.2} \]
where $\times$ is just a formal product indicating which triples of irreducible modules admit nontrivial fusion rules (all with multiplicity one). As shown in [M1], the fusion rules for $m = 0$ can be computed by using certain projection formulas for singular vectors combined with Frenkel-Zhu's formula. It is not hard to see that the same approach extends to $m \ge 1$ as well. We only have to apply appropriate projection formulas as in Lemma 3.1 of [IK1]. Actually, for the purposes of this paper we do not need any results from [IK1], because we are interested only in special properties of the "fusion rules" (4.1) (nevertheless, see Remark 4.1).

Proposition 4.1. For every $i = 0, \ldots, m-1$ and $n \ge 1$ we have: the space
\[ I\binom{L(c_{2m+1,1}, h)}{L(c_{2m+1,1}, h^{1,3})\ \ L(c_{2m+1,1}, h^{2i+1,2n+1})} \]
is nontrivial only if $h \in \{ h^{2i+1,2n-1}, h^{2i+1,2n+1}, h^{2i+1,2n+3} \}$, and
\[ I\binom{L(c_{2m+1,1}, h)}{L(c_{2m+1,1}, h^{1,3})\ \ L(c_{2m+1,1}, h^{2i+1,1})} \]
is nontrivial only if $h = h^{2i+1,3}$. Similarly, for every $i = 0, \ldots, m-1$ and $n \ge 2$ we have: the space
\[ I\binom{L(c_{2m+1,1}, h)}{L(c_{2m+1,1}, h^{1,3})\ \ L(c_{2m+1,1}, h^{2i+1,-2n+1})} \]
is nontrivial only if $h \in \{ h^{2i+1,-2n-1}, h^{2i+1,-2n+1}, h^{2i+1,-2n+3} \}$, and
\[ I\binom{L(c_{2m+1,1}, h)}{L(c_{2m+1,1}, h^{1,3})\ \ L(c_{2m+1,1}, h^{2i+1,-1})} \]
is nontrivial only if $h \in \{ h^{2i+1,-3}, h^{2i+1,-1} \}$. For a stronger statement see Remark 4.1.

Proof. We assume that $n \ge 1$ (for other cases essentially the same argument works). Let $A(L(c_{2m+1,1}, 0))$ be Zhu's algebra of $L(c_{2m+1,1}, 0)$ (a polynomial algebra in one variable) and $A(L(c_{2m+1,1}, h))$ the $A(L(c_{2m+1,1}, 0))$-bimodule of $L(c_{2m+1,1}, h)$ [FZ]. As in [M1], it is sufficient to analyze the structure of the $A(L(c_{2m+1,1}, 0))$-module
\[ A(L(c_{2m+1,1}, h^{1,3})) \otimes_{A(L(c_{2m+1,1}, 0))} L(c_{2m+1,1}, h^{2i+1,2n+1})(0), \tag{4.3} \]
where $L(c_{2m+1,1}, h)(0)$ denotes the top weight component of $L(c_{2m+1,1}, h)$ (cf. [FZ]). From [IK1] (or elsewhere) it follows that the Verma module $M(c_{2m+1,1}, h^{1,3})$ fits into the following short exact sequence:
\[ 0 \longrightarrow M(c_{2m+1,1}, h^{1,3} + \tfrac32) \longrightarrow M(c_{2m+1,1}, h^{1,3}) \longrightarrow L(c_{2m+1,1}, h^{1,3}) \longrightarrow 0. \]
Thus the maximal submodule of $M(c_{2m+1,1}, h^{1,3})$ is generated by a singular vector of weight $h^{1,3} + \frac32$ (explicitly, $(-L(-1)G(-1/2) + (2m+1)G(-3/2)) v_{1,3}$, where $v_{1,3}$ is the highest weight vector in $M(c_{2m+1,1}, h^{1,3})$). Now, as in [M1], it is not hard to see that the space (4.3) is three-dimensional and that all fusion rules covered by the statements are at most 1 (actually, they are all one; see Remark 4.1).

Remark 4.1. We can actually prove the "if and only if" statement in Proposition 4.1 by using at least two different methods. On one hand, we would have to combine methods from [M1] and the projection formula in Lemma 3.1 of [IK1] (we do not have explicit singular vectors to work with!). Alternatively, with Proposition 4.1, it is sufficient to construct non-trivial intertwining operators for all types covered in Proposition 4.1. This is done in later sections. We should say that our fusion rules formulas coincide with Iohara-Koga's fusion rule formula in the generic case, which is computed by using coinvariants and projection formulas rather than Frenkel-Zhu's formula [IK1]. But as we know, the coinvariant approach and Frenkel-Zhu's formula yield the same answer in practically all known examples (for further examples see [W,M1,M4]).

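5. Lattice and Fermionic Vertex Superalgebras

We shall first recall some basic facts about lattice and fermionic vertex superalgebras. Let $m \in \mathbb{Z}_{\ge 0}$. Let $\tilde L = \mathbb{Z}\beta$ be a rational lattice of rank one with nondegenerate bilinear form $\langle \cdot, \cdot \rangle$ given by
\[ \langle \beta, \beta \rangle = \frac{1}{2m+1}. \]
Let $\mathfrak{h} = \mathbb{C} \otimes_{\mathbb{Z}} \tilde L$. Extend the form $\langle \cdot, \cdot \rangle$ on $\tilde L$ to $\mathfrak{h}$. Let $\hat{\mathfrak{h}} = \mathbb{C}[t, t^{-1}] \otimes \mathfrak{h} \oplus \mathbb{C}c$ be the affinization of $\mathfrak{h}$. Set $\hat{\mathfrak{h}}^+ = t\mathbb{C}[t] \otimes \mathfrak{h}$; $\hat{\mathfrak{h}}^- = t^{-1}\mathbb{C}[t^{-1}] \otimes \mathfrak{h}$. Then $\hat{\mathfrak{h}}^+$ and $\hat{\mathfrak{h}}^-$ are abelian subalgebras of $\hat{\mathfrak{h}}$. Let $U(\hat{\mathfrak{h}}^-) = S(\hat{\mathfrak{h}}^-)$ be the universal enveloping algebra of $\hat{\mathfrak{h}}^-$. Let $\lambda \in \mathfrak{h}$. Consider the induced $\hat{\mathfrak{h}}$-module
\[ M(1, \lambda) = U(\hat{\mathfrak{h}}) \otimes_{U(\mathbb{C}[t] \otimes \mathfrak{h} \oplus \mathbb{C}c)} \mathbb{C}_\lambda \cong S(\hat{\mathfrak{h}}^-) \quad \text{(linearly)}, \]
where $t\mathbb{C}[t] \otimes \mathfrak{h}$ acts trivially on $\mathbb{C}_\lambda \cong \mathbb{C}$, $\mathfrak{h}$ acts as $\langle h, \lambda \rangle$ for $h \in \mathfrak{h}$, and $c$ acts on $\mathbb{C}_\lambda$ as multiplication by 1. We shall write $M(1)$ for $M(1, 0)$. For $h \in \mathfrak{h}$ and $n \in \mathbb{Z}$ write $h(n) = t^n \otimes h$. Set $h(z) = \sum_{n \in \mathbb{Z}} h(n) z^{-n-1}$. Then $M(1)$ is a vertex algebra which is generated by the fields $h(z)$, $h \in \mathfrak{h}$, and $M(1, \lambda)$, for $\lambda \in \mathfrak{h}$, are irreducible modules for $M(1)$.

As in [DL] (see also [D,FLM,GL,K]), we have the generalized vertex algebra
\[ V_{\tilde L} = M(1) \otimes \mathbb{C}[\tilde L], \]
where $\mathbb{C}[\tilde L]$ is the group algebra of $\tilde L$ with a generator $e^\beta$. For $v \in V_{\tilde L}$, let $Y(v, z) = \sum_{s \in \frac{1}{2m+1}\mathbb{Z}} v_s z^{-s-1}$ be the corresponding vertex operator (for precise formulae see [DL]).

Define $\alpha = (2m+1)\beta$. Then $\langle \alpha, \alpha \rangle = 2m+1$, implying that $L = \mathbb{Z}\alpha \subset \tilde L$ is an integral lattice. Therefore the subalgebra $V_L \subset V_{\tilde L}$ has the structure of a vertex superalgebra.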



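Define the Schur polynomials $S_r(x_1, x_2, \ldots)$ in variables $x_1, x_2, \ldots$ by the following equation:
\[ \exp\left( \sum_{n=1}^{\infty} \frac{x_n}{n} y^n \right) = \sum_{r=0}^{\infty} S_r(x_1, x_2, \ldots)\, y^r. \tag{5.1} \]
For any monomial $x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$ we have an element $h(-1)^{n_1} h(-2)^{n_2} \cdots h(-r)^{n_r} 1$ in $M(1)$ for $h \in \mathfrak{h}$. Then for any polynomial $f(x_1, x_2, \ldots)$, $f(h(-1), h(-2), \ldots)1$ is a well-defined element in $M(1)$. In particular, $S_r(h(-1), h(-2), \ldots)1 \in M(1)$ for $r \in \mathbb{Z}_{\ge 0}$. We write $S_r(h)$ for $S_r(h(-1), h(-2), \ldots)1$.

The following relations in the generalized vertex operator algebra $V_{\tilde L}$ are of great importance:
\[ e^\gamma_i e^\delta = 0 \quad \text{for } i \ge -\langle \gamma, \delta \rangle. \tag{5.2} \]
In particular, if $\langle \gamma, \delta \rangle \ge 0$, we have $e^\gamma_i e^\delta = 0$ for $i \in \mathbb{Z}_{\ge 0}$, and if $\langle \gamma, \delta \rangle = -n < 0$, we get
\[ e^\gamma_{i-1} e^\delta = S_{n-i}(\gamma)\, e^{\gamma+\delta} \quad \text{for } i \in \{0, \ldots, n\}. \tag{5.3} \]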
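The generating function (5.1) can be turned into a simple recursion. The sketch below is an illustration, not part of the paper: differentiating (5.1) in $y$ gives $r\,S_r = \sum_{k=1}^{r} x_k S_{r-k}$ with $S_0 = 1$, which the hypothetical helper `schur` evaluates in exact arithmetic. Two specializations with known closed forms serve as checks: $x_1 = 1$, $x_n = 0$ otherwise gives $S_r = 1/r!$, and $x_n = (-1)^{n-1} t$ gives $S_r = \binom{t}{r}$.

```python
from fractions import Fraction
from math import comb, factorial

def schur(r_max, x):
    """Return [S_0, ..., S_{r_max}] for the specialisation x_n = x(n), n >= 1.

    Uses the recurrence r * S_r = sum_{k=1}^{r} x_k * S_{r-k}, S_0 = 1,
    obtained by differentiating the generating function (5.1) in y.
    """
    s = [Fraction(1)]
    for r in range(1, r_max + 1):
        total = sum((Fraction(x(k)) * s[r - k] for k in range(1, r + 1)), Fraction(0))
        s.append(total / r)
    return s

# x_1 = 1 and x_n = 0 otherwise: the generating function is exp(y).
print(schur(4, lambda n: 1 if n == 1 else 0))

# x_n = (-1)^(n-1) * t: the generating function is (1 + y)^t.
t = 5
print(schur(5, lambda n: (-1) ** (n - 1) * t))
```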

5.1. Fermionic vertex operator superalgebra F. In what follows we consider the Clifford algebra $CL$, generated by $\{ \phi(n),\ n \in \frac12 + \mathbb{Z} \} \cup \{1\}$ with relations
\[ \{ \phi(n), \phi(m) \} = \delta_{n,-m}, \qquad n, m \in \tfrac12 + \mathbb{Z}. \]
Let $F$ be the $CL$-module generated by the vector $1$ such that $\phi(n)1 = 0$, $n > 0$. Then the field
\[ Y(\phi(-\tfrac12)1, z) = \phi(z) = \sum_{n \in \frac12 + \mathbb{Z}} \phi(n) z^{-n-\frac12} \]
generates a unique vertex operator superalgebra structure on $F$. We choose
\[ \omega^{(s)} = \tfrac12\, \phi(-\tfrac32)\phi(-\tfrac12)1 \]
for the Virasoro element, giving central charge $\frac12$. Moreover, $F$ is a rational vertex operator superalgebra, and $F$ is, up to equivalence, the unique irreducible $F$-module (see [FRW,KWn,Li]).



5.2. Vertex superalgebra SM(1). In this subsection we study the vertex superalgebra $SM(1) := M(1) \otimes F$. We shall first define a family of N = 1 superconformal vectors in $SM(1)$. For every $m \in \mathbb{Z}_{\ge 0}$, we define (see also [MR,K,IK2])
\[ \tau = \frac{1}{\sqrt{2m+1}} \left( \alpha(-1)1 \otimes \phi(-\tfrac12)1 + 2m\, 1 \otimes \phi(-\tfrac32)1 \right), \]
\[ \omega = \frac{1}{2(2m+1)} \left( \alpha(-1)^2 + 2m\,\alpha(-2) \right)1 \otimes 1 + 1 \otimes \omega^{(s)}. \]
Set
\[ Y(\tau, z) = G(z) = \sum_{n \in \mathbb{Z}} G(n + \tfrac12) z^{-n-2}, \qquad Y(\omega, z) = L(z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}. \]
Then $\tau$ is an N = 1 superconformal vector, and the vertex subalgebra of $SM(1)$ strongly generated by the fields $G(z)$ and $L(z)$ is isomorphic to the Neveu-Schwarz vertex operator superalgebra $L^{ns}(c_{2m+1,1}, 0)$, where $c_{2m+1,1} = \frac32 \left( 1 - \frac{8m^2}{2m+1} \right)$. In other words, $SM(1)$ becomes a Fock module for the Neveu-Schwarz algebra with central charge $c_{2m+1,1}$. Moreover, for every $\lambda \in \mathfrak{h}$, the $SM(1)$-module $SM(1, \lambda) := M(1, \lambda) \otimes F$ is also a Fock module with central charge $c_{2m+1,1}$ and conformal weight
\[ \frac{1}{2(2m+1)} \left( \langle \lambda, \alpha \rangle^2 - 2m \langle \lambda, \alpha \rangle \right). \tag{5.4} \]
Now we want to describe the structure of these Fock modules viewed as $\mathfrak{ns}$-modules. For this purpose we need the concept of screening operators. As in [A3], we shall construct these operators using generalized vertex algebras.

The N = 1 superconformal vector $\tau \in M(1) \otimes F$ also defines an N = 1 superconformal structure on $V_{\tilde L} \otimes F$ and $V_L \otimes F$. In particular, $V_L \otimes F$ is an N = 1 vertex operator superalgebra. The operator $L(0)$ defines a $\frac12\mathbb{Z}_{\ge 0}$-gradation on $V_L \otimes F$. Recall that $\mathrm{wt}(v) = n$ if $L(0)v = nv$. Define
\[ s^{(1)} = e^{\alpha} \otimes \phi(-\tfrac12)1 \in V_L \otimes F, \qquad s^{(2)} = e^{-\beta} \otimes \phi(-\tfrac12)1 \in V_{\tilde L} \otimes F. \]
By using the Jacobi identity in the (generalized) vertex algebras $V_L \otimes F$ and $V_{\tilde L} \otimes F$ we get the following formulas:
\[ [G(n + \tfrac12), s^{(1)}_i] = -\frac{i}{\sqrt{2m+1}}\, e^{\alpha}_{i+n}, \qquad [L(n), s^{(1)}_i] = -i\, s^{(1)}_{i+n} \qquad (i \in \mathbb{Z}), \tag{5.5} \]
\[ [G(n + \tfrac12), s^{(2)}_r] = r \sqrt{2m+1}\, e^{-\beta}_{r+n}, \qquad [L(n), s^{(2)}_r] = -r\, s^{(2)}_{r+n} \qquad (r \in \tfrac{1}{2m+1}\mathbb{Z}). \tag{5.6} \]
Let
\[ Q = s^{(1)}_0 = \mathrm{Res}_z\, Y(s^{(1)}, z), \qquad \tilde Q = s^{(2)}_0 = \mathrm{Res}_z\, Y(s^{(2)}, z). \]
From relations (5.5) and (5.6) we see that the operators $Q$ and $\tilde Q$ commute with the action of the Neveu-Schwarz algebra (see also [IK2]).



We are interested in the action of these operators on $SM(1)$. In fact, $Q$ and $\tilde Q$ are the screening operators, and therefore $\mathrm{Ker}_{SM(1)} Q$ and $\mathrm{Ker}_{SM(1)} \tilde Q$ are vertex subalgebras of $SM(1)$ (for details see Sect. 14 in [FB] and references therein).

The following lemma gives the basic properties of the operators $Q$ and $\tilde Q$. The proof is similar to that of Lemma 2.1 in [A3].

Lemma 5.1.
(i) If $m \ne 0$, $[Q, \tilde Q] = 0$.
(ii) $\tilde Q e^{n\alpha} \ne 0$, $n \in \mathbb{Z}_{>0}$.
(iii) $\tilde Q e^{-n\alpha} = 0$, $n \in \mathbb{Z}_{\ge 0}$.

We now define the following three (non-zero) elements in the vertex operator superalgebra $V_L \otimes F$:
\[ F = e^{-\alpha}, \qquad H = QF, \qquad E = Q^2 F. \]
By using the expression (5.4) for conformal weights and Lemma 5.1, we conclude that these vectors are singular vectors for the action of the Neveu-Schwarz algebra, and
\[ \mathrm{wt}(F) = \mathrm{wt}(H) = \mathrm{wt}(E) = h^{1,3} = 2m + \tfrac12. \]
It is also important to notice that $H \in SM(1)$. The proof of the following result is similar to that of Lemma 3.1 in [A3].

Lemma 5.2. In the vertex operator superalgebra $V_L \otimes F$ the following relations hold:
(i) $Q^3 F = 0$.
(ii) $E_i E = F_i F = 0$, for every $i \ge -2m-1$.
(iii) $Q(H_i H) = 0$, for every $i \ge -2m-1$.

We define
\[ \widehat F = e^{-\alpha} \otimes \phi(-\tfrac12), \qquad \widehat H = Q \widehat F, \qquad \widehat E = Q^2 \widehat F. \tag{5.7} \]
These vectors are even and have conformal weight $2m+1$. We will need the following result.

Lemma 5.3. We have
\[ \widehat F_i \widehat F = 0, \qquad \widehat E_i \widehat E = 0, \qquad i \ge -2m. \]
Also,
\[ Q(\widehat H_i \widehat H) = 0, \qquad i \ge -2m. \]

Proof. Since $Q$ acts as a derivation and $Q^3 \widehat F = 0$, if $\widehat F_i \widehat F = 0$ for $i \ge -2m$ then $Q^4(\widehat F_i \widehat F) = 6 \widehat E_i \widehat E = 0$ for $i \ge -2m$. We only have to notice the relations
\[ \widehat F_k \widehat F = \mathrm{Res}_x\, x^k\, Y(e^{-\alpha}, x) e^{-\alpha} \otimes Y(\phi(-1/2), x)\phi(-1/2)1, \]
\[ \mathrm{Res}_x\, x^i\, Y(e^{-\alpha}, x) e^{-\alpha} = 0, \quad i \ge -2m-1, \]
proven in Lemma 5.2, and
\[ \mathrm{Res}_x\, x^j\, Y(\phi(-1/2), x)\phi(-1/2)1 = 0, \quad j \ge 1. \]
The last formula follows from $Q^3(\widehat F_i \widehat F) = 0$ for $i \ge -2m$.



6. The N = 1 Neveu-Schwarz Module Structure of $V_L \otimes F$-Modules

For $i \in \mathbb{Z}$, we set
\[ \gamma_i = \frac{i}{2m+1}\, \alpha. \tag{6.1} \]
We shall first present results on the structure of $V_L \otimes F$-modules as modules for the N = 1 Neveu-Schwarz algebra. It is a known fact that irreducible $V_L \otimes F$-modules are given by $V_{L+\gamma_i} \otimes F$, $i = 0, \ldots, 2m$. Each $V_{L+\gamma_i} \otimes F$ is a direct sum of super Feigin-Fuchs modules via
\[ V_{L+\gamma_i} \otimes F = \bigoplus_{n \in \mathbb{Z}} (M(1) \otimes e^{\gamma_i + n\alpha}) \otimes F. \]
We shall now investigate the action of the operator $Q$. Since the operators $Q^j$, $j \in \mathbb{Z}_{>0}$, commute with the action of the Neveu-Schwarz algebra, they are actually intertwiners between super Feigin-Fuchs modules inside $V_{L+\gamma_i} \otimes F$. Assume that $0 \le i \le m$. If $Q^j e^{\gamma_i - n\alpha}$ is nontrivial, it is a singular vector in the Fock module $SM(1, \gamma_i + (j-n)\alpha)$ of weight
\[ \mathrm{wt}(Q^j e^{\gamma_i - n\alpha}) = \mathrm{wt}(e^{\gamma_i - n\alpha}) = h^{2i+1,2n+1}, \]
where $h^{2i+1,2n+1} := h^{2i+1,2n+1}_{2m+1,1}$. Since $\mathrm{wt}(e^{\gamma_i + (j-n)\alpha}) > \mathrm{wt}(e^{\gamma_i - n\alpha})$ if $j > 2n$, we conclude that
\[ Q^j e^{\gamma_i - n\alpha} = 0 \quad \text{for } j > 2n. \tag{6.2} \]
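The weight comparison behind (6.2) is elementary and can be checked numerically. The following sketch, with illustrative function names, uses the conformal weight formula (5.4) with $t = \langle \lambda, \alpha \rangle$ and the pairing $\langle \gamma_i + k\alpha, \alpha \rangle = i + k(2m+1)$. It confirms, for small $m$ and $0 \le i \le m$, that $\mathrm{wt}(e^{\gamma_i - n\alpha}) = h^{2i+1,2n+1}$ and that the lowest weight of the target Fock module is strictly larger whenever $j > 2n$.

```python
from fractions import Fraction

def fock_weight(t, m):
    """Lowest conformal weight (5.4) of SM(1, lam), where t = <lam, alpha>."""
    return Fraction(t * t - 2 * m * t, 2 * (2 * m + 1))

def h(r, s, m):
    """h^{r,s} = h^{r,s}_{2m+1,1}."""
    p = 2 * m + 1
    return Fraction((s * p - r) ** 2 - (p - 1) ** 2, 8 * p)

def pairing(i, k, m):
    """<gamma_i + k*alpha, alpha> = i + k (2m + 1)."""
    return i + k * (2 * m + 1)

def check_vanishing_range(m_max=4, n_max=5, extra=7):
    """Check the weight inequality used for (6.2), in the range 0 <= i <= m."""
    for m in range(1, m_max + 1):
        for i in range(0, m + 1):
            for n in range(0, n_max + 1):
                w = fock_weight(pairing(i, -n, m), m)
                if w != h(2 * i + 1, 2 * n + 1, m):
                    return False
                # For j > 2n the target Fock module starts strictly above w.
                for j in range(2 * n + 1, 2 * n + 1 + extra):
                    if not fock_weight(pairing(i, j - n, m), m) > w:
                        return False
    return True

print(check_vanishing_range())
```

Only the case $0 \le i \le m$ is tested here; the sketch makes no claim about the range of $j$ for $m+1 \le i \le 2m$.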

One can similarly see that for $m+1 \le i \le 2m$:
\[ Q^j e^{\gamma_i - n\alpha} = 0 \quad \text{for } j > 2n+1. \tag{6.3} \]
The following lemma is useful for constructing singular vectors in $V_{L+\gamma_i} \otimes F$:

Lemma 6.1.
(1) $Q^{2n} e^{\gamma_i - n\alpha} \ne 0$ for $0 \le i \le m$.
(2) $Q^{2n+1} e^{\gamma_i - n\alpha} \ne 0$ for $m+1 \le i \le 2m$.

Proof. We shall prove assertion (1) by induction on $n \in \mathbb{Z}_{>0}$. For $n = 1$ we can see directly that $Q^2 e^{\gamma_i - \alpha} \ne 0$ (or see below). Assume now that (1) holds for a certain $n \in \mathbb{Z}_{>0}$. Since $V_{L+\gamma_i} \otimes F$ is a simple module for the simple vertex operator superalgebra $V_L \otimes F$, we have that
\[ Y(E, z)\, Q^{2n} e^{\gamma_i - n\alpha} \ne 0 \]
(for the proof see [DL]). So there is $j_0 \in \mathbb{Z}$ such that
\[ E_{j_0} Q^{2n} e^{\gamma_i - n\alpha} \ne 0 \quad \text{and} \quad E_j Q^{2n} e^{\gamma_i - n\alpha} = 0 \text{ for } j > j_0. \]
Since $Q$ acts as a derivation, $Q^3 e^{-\alpha} = 0$ and (6.2) holds, only the term with binomial coefficient $\binom{2n+2}{2} = (n+1)(2n+1)$ survives in the expansion of $Q^{2n+2}(e^{-\alpha}_{j_0} e^{\gamma_i - n\alpha})$, so that
\[ E_{j_0} Q^{2n} e^{\gamma_i - n\alpha} = \frac{1}{(n+1)(2n+1)}\, Q^{2n+2}(e^{-\alpha}_{j_0} e^{\gamma_i - n\alpha}). \]
By (5.2) we therefore have $j_0 \le i - 1 - (2m+1)n$. By using the fusion rules from Proposition 4.1, we conclude that
\[ e^{-\alpha}_{j_0} e^{\gamma_i - n\alpha} \in U(\mathfrak{ns}).e^{\gamma_i - (n+1)\alpha}, \]
and therefore $Q^{2n+2} e^{\gamma_i - (n+1)\alpha} \ne 0$, which proves (1). Notice that the idea used in the induction step, together with the fusion rules from Proposition 4.1, can alternatively be used to show that $Q^2 e^{\gamma_i - \alpha} \ne 0$. The proof of (2) is similar, so we omit it here.

Remark 6.1. It would be desirable, in parallel with the Virasoro algebra case, to have a direct proof of Lemma 6.1 with no reference to fusion rules. However, the Virasoro algebra approach based on matrix coefficients does not apply verbatim to superconformal $(1, 2m+1)$-models, so we decided to give a proof which uses the theory of vertex algebras and fusion rules. We found this approach to be quite elegant. We also remark that Iohara and Koga proved certain properties of screening operators among super Feigin-Fuchs modules in Theorem 3.1 of [IK2] (see also [MR]), but it is not clear whether these results can be used to prove Lemma 6.1.

As in the Virasoro algebra case, the N = 1 Feigin-Fuchs modules are classified according to their embedding structure. For our purposes we shall focus only on modules of certain types (Type 4 and 5 in [IK2]). These modules are either semisimple (Type 5) or they become semisimple after quotienting by the maximal semisimple submodule (Type 4). As usual, singular vectors will be denoted by • and cosingular vectors by ◦. The following result follows directly from Lemma 6.1 and the structure theory of super Feigin-Fuchs modules [IK2] after some minor adjustments of parameters (cf. Type 4 embedding structure).

Theorem 6.1. Assume that $i \in \{0, \ldots, m-1\}$.
(i) As a module for the Neveu-Schwarz algebra, $V_{L+\gamma_i} \otimes F$ is generated by the family of singular and cosingular vectors $\widetilde{Sing}_i \cup \widetilde{CSing}_i$, where
\[ \widetilde{Sing}_i = \{ u_i^{(j,n)} \mid j, n \in \mathbb{Z}_{\ge 0},\ 0 \le j \le 2n \}; \qquad \widetilde{CSing}_i = \{ w_i^{(j,n)} \mid n \in \mathbb{Z}_{>0},\ 0 \le j \le 2n-1 \}. \]
These vectors satisfy the following relations:
\[ u_i^{(j,n)} = Q^j e^{\gamma_i - n\alpha}, \qquad Q^j w_i^{(j,n)} = e^{\gamma_i + n\alpha}. \]
The submodule generated by the singular vectors $\widetilde{Sing}_i$, denoted by $S\Lambda(i+1)$, is isomorphic to
\[ \bigoplus_{n=0}^{\infty} (2n+1)\, L^{ns}(c_{2m+1,1}, h^{2i+1,2n+1}). \]
(In this section the notation $k\, L^{ns}(c, h)$ means $L^{ns}(c, h)^{\oplus k}$, $k \in \mathbb{Z}_{\ge 0}$.)
(ii) For the quotient module we have
\[ S\Pi(m-i) := (V_{L+\gamma_i} \otimes F)/S\Lambda(i+1) \cong \bigoplus_{n=1}^{\infty} (2n)\, L^{ns}(c_{2m+1,1}, h^{2i+1,-2n+1}). \]

The situation described in Theorem 6.1 can be depicted by the following diagram:

[Diagram (6.4): embedding structure of $V_{L+\gamma_i} \otimes F$, with columns labelled $j = -2, -1, 0, 1, 2$, singular vectors • and cosingular vectors ◦.]

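Let $M'$ denote the contragredient module of a $V$-module $M$, where $V$ is a vertex operator superalgebra. Then we have an isomorphism of $M(1) \otimes F$-modules,
\[ (M(1) \otimes e^{\frac{j}{2m+1}\alpha + i\alpha} \otimes F)' \cong M(1) \otimes e^{\frac{2m-j}{2m+1}\alpha - i\alpha} \otimes F. \]
By taking direct sums we obtain the following isomorphism of $\mathfrak{ns}$-modules:
\[ (V_{L+\gamma_i} \otimes F)' \cong V_{L+\gamma_{2m-i}} \otimes F. \tag{6.5} \]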

Since the dual functor interchanges cosingular and singular vectors, Theorem 6.1 implies the next result (alternatively, use the Type 4 embedding structure in [IK2]):

Theorem 6.2. Assume that $i \in \{0, \ldots, m-1\}$.
(i) As a module for the Neveu-Schwarz algebra, $V_{L+\gamma_{2m-i}} \otimes F$ is generated by the family of singular and cosingular vectors $\widetilde{Sing}_i \cup \widetilde{CSing}_i$, where
\[ \widetilde{Sing}_i = \{ u_i^{(j,n)} \mid n \in \mathbb{Z}_{>0},\ 0 \le j \le 2n-1 \}; \qquad \widetilde{CSing}_i = \{ w_i^{(j,n)} \mid j, n \in \mathbb{Z}_{\ge 0},\ 0 \le j \le 2n \}. \]
These vectors satisfy the following relations:
\[ u_i^{(j,n)} = Q^j e^{\gamma_{2m-i} - n\alpha}, \qquad Q^j w_i^{(j,n)} = e^{\gamma_{2m-i} + n\alpha}. \]
The submodule generated by the singular vectors $\widetilde{Sing}_i$ is isomorphic to
\[ S\Pi(m-i) \cong \bigoplus_{n=1}^{\infty} (2n)\, L^{ns}(c_{2m+1,1}, h^{2i+1,-2n+1}). \]
(ii) For the quotient module we have
\[ S\Lambda(i+1) \cong (V_{L+\gamma_{2m-i}} \otimes F)/S\Pi(m-i) \cong \bigoplus_{n=0}^{\infty} (2n+1)\, L^{ns}(c_{2m+1,1}, h^{2i+1,2n+1}). \]

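The embedding diagram for $V_{L+\gamma_{2m-i}} \otimes F$, $i = 0, \ldots, m-1$, is now

[Diagram (6.6): embedding structure of $V_{L+\gamma_{2m-i}} \otimes F$, with columns labelled $j = -1, 0, 1, 2$, singular vectors • and cosingular vectors ◦.]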

Finally, (6.5) implies that $V_{L+\gamma_m} \otimes F$ is a self-dual $V_L \otimes F$-module. In view of that, it is not surprising that $V_{L+\gamma_m} \otimes F$ is a semisimple $\mathfrak{ns}$-module. More precisely, we have the following result (for the proof see the embedding structure in the Type 5 case in [IK2]).

Theorem 6.3. As a module for the Neveu-Schwarz algebra, $V_{L+\gamma_m} \otimes F$ is completely reducible and generated by the family of singular vectors
\[ \widetilde{Sing}_m = \{ u_m^{(j,n)} := Q^j e^{\gamma_m - n\alpha} \mid j, n \in \mathbb{Z}_{\ge 0},\ 0 \le j \le 2n \}; \]
and it is isomorphic to
\[ S\Lambda(m+1) := V_{L+\gamma_m} \otimes F \cong \bigoplus_{n=0}^{\infty} (2n+1)\, L^{ns}(c_{2m+1,1}, h^{2m+1,2n+1}). \]



The embedding structure in the last case is a totally disconnected diagram:

[Diagram (6.7): columns labelled $j = -2, -1, 0, 1, 2$, consisting of singular vectors • only, with no arrows between them.]

7. The Vertex Operator Superalgebra $\overline{SM(1)}$

Let us fix a positive integer $m$. We shall first present the structure of the vertex operator superalgebra $SM(1)$ as a module for the Neveu-Schwarz algebra. The next result follows directly from Theorem 6.1.

Theorem 7.1. For every $n \in \mathbb{Z}_{\ge 0}$, set
\[ u_n := u_0^{(n,n)} = Q^n e^{-n\alpha}, \qquad w_{n+1} := w_0^{(n+1,n+1)}. \]
(i) The vertex operator superalgebra $SM(1)$, as a module for the vertex operator superalgebra $L^{ns}(c_{2m+1,1}, 0)$, is generated by the family of singular and cosingular vectors $\widetilde{Sing} \cup \widetilde{CSing}$, where
\[ \widetilde{Sing} = \{ u_n \mid n \in \mathbb{Z}_{\ge 0} \}; \qquad \widetilde{CSing} = \{ w_n \mid n \in \mathbb{Z}_{>0} \}. \]
Moreover, $U(\mathfrak{ns}) u_n \cong L^{ns}(c_{2m+1,1}, h^{1,2n+1})$.
(ii) The submodule generated by the vectors $u_n$, $n \in \mathbb{Z}_{\ge 0}$, is isomorphic to
\[ [\widetilde{Sing}] \cong \bigoplus_{n=0}^{\infty} L^{ns}(c_{2m+1,1}, h^{1,2n+1}). \]
(iii) The quotient module is isomorphic to
\[ SM(1)/[\widetilde{Sing}] \cong \bigoplus_{n=1}^{\infty} L^{ns}(c_{2m+1,1}, h^{1,-2n+1}). \]
(iv) $Q u_0 = Q 1 = 0$, and $Q u_n \ne 0$, $Q w_n \ne 0$ for every $n \ge 1$.

Our Theorem 7.1 immediately gives the following result.



Proposition 7.1. We have
\[ L^{ns}(c_{2m+1,1}, 0) \cong W_0 = \mathrm{Ker}_{SM(1)}\, Q. \]
Define the following vertex algebra:
\[ \overline{SM(1)} = \mathrm{Ker}_{SM(1)}\, \tilde Q. \]
Since $\tilde Q$ commutes with the action of the Neveu-Schwarz algebra, we have
\[ L^{ns}(c_{2m+1,1}, 0) \cong W_0 \subset \overline{SM(1)}. \]
This implies that $\overline{SM(1)}$ is a vertex operator subalgebra of $SM(1)$ in the sense of [FHL] (i.e., $\overline{SM(1)}$ has the same Virasoro element as $SM(1)$). The following theorem describes the structure of the vertex operator superalgebra $\overline{SM(1)}$ as an $L^{ns}(c_{2m+1,1}, 0)$-module.

Theorem 7.2. The vertex operator superalgebra $\overline{SM(1)}$ is isomorphic to $[\widetilde{Sing}]$ as an $L^{ns}(c_{2m+1,1}, 0)$-module, i.e.,
\[ \overline{SM(1)} \cong \bigoplus_{n=0}^{\infty} L^{ns}(c_{2m+1,1}, h^{1,2n+1}). \]

Proof. By Theorem 7.1 we know that the $L^{ns}(c_{2m+1,1}, 0)$-submodule generated by the set $\widetilde{Sing}$ is completely reducible. So to prove the assertion, it suffices to show that the operator $\tilde Q$ annihilates a vector $v \in \widetilde{Sing} \cup \widetilde{CSing}$ if and only if $v \in \widetilde{Sing}$. Let $v \in \widetilde{Sing}$; then $v = Q^n e^{-n\alpha}$ for a certain $n \in \mathbb{Z}_{\ge 0}$. Since by Lemma 5.1 $\tilde Q e^{-n\alpha} = 0$, we have
\[ \tilde Q v = \tilde Q Q^n e^{-n\alpha} = Q^n \tilde Q e^{-n\alpha} = 0. \]
Let now $v \in \widetilde{CSing}$. Then there is $n \in \mathbb{Z}_{>0}$ such that $Q^n v = e^{n\alpha}$. Assume that $\tilde Q v = 0$. Then we have that
\[ 0 = Q^n \tilde Q v = \tilde Q Q^n v = \tilde Q e^{n\alpha}, \]
contradicting Lemma 5.1 (ii). This proves the theorem.

Next we shall prove that the vertex operator superalgebra $\overline{SM(1)}$ is generated by only two generators.

Theorem 7.3.
(1) The vertex operator superalgebra $\overline{SM(1)}$ is generated by $\tau$ and $H$.
(2) The vertex operator superalgebra $\overline{SM(1)}$ is strongly generated by the set $\{ \tau, \omega, H, G(-\frac12)H \}$.



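Proof. Let $U$ be the vertex subalgebra of $\overline{SM(1)}$ generated by $\tau$ and $H$. We need to prove that $U = \overline{SM(1)}$. Let $W_n$ be the (irreducible) $\mathfrak{ns}$-submodule of $\overline{SM(1)}$ generated by the vector $u_n$. Then $W_n \cong L^{ns}(c_{2m+1,1}, h^{1,2n+1})$. Using Lemma 6.1 we see that
\[ \mathrm{Ker}_{\overline{SM(1)}}\, Q^n \cong \bigoplus_{i=0}^{n-1} W_i. \]
To prove (1) it suffices to show that $u_n \in U$ for every $n \in \mathbb{Z}_{\ge 0}$. We shall prove this claim by induction. By definition we have that $u_0, u_1 (= H) \in U$. Assume that we have $k \in \mathbb{Z}_{\ge 0}$ such that $u_n \in U$ for $n \le k$. In other words, the inductive assumption is $\bigoplus_{i=0}^{k} W_i \subset U$. We shall now prove that $u_{k+1} \in U$. Set $j = -(2m+1)k - 1$. By Lemma 6.1 we have
\[ Q^{2k+2} e^{-(k+1)\alpha} = Q^{2k+2}(e^{-\alpha}_j e^{-k\alpha}) \ne 0. \]
Next we notice that
\[ Q^{k+1}(H_j u_k) = Q^{k+1}\big( (Q e^{-\alpha})_j\, Q^k e^{-k\alpha} \big) = \frac{1}{2k+1}\, Q^{2k+2}(e^{-\alpha}_j e^{-k\alpha}), \]
since $Q$ is a derivation and, by $Q^3 e^{-\alpha} = 0$ and (6.2), both sides are multiples of $E_j Q^{2k} e^{-k\alpha}$, with coefficients $k+1$ and $\binom{2k+2}{2} = (k+1)(2k+1)$ respectively. This implies that $Q^{k+1}(H_j u_k) \ne 0$. So we have found a vector $H_j u_k \in U$ such that $\mathrm{wt}(H_j u_k) = \mathrm{wt}(u_{k+1})$. This implies
\[ H_j u_k \in \bigoplus_{i=0}^{k+1} W_i \quad \text{and} \quad H_j u_k \notin \bigoplus_{i=0}^{k} W_i. \]
Since $Q^{k+1} \bigoplus_{i=0}^{k} W_i = 0$ and $\mathrm{wt}(H_j u_k) = \mathrm{wt}(u_{k+1})$, we conclude that there is a constant $C \ne 0$ such that
\[ H_j u_k = C u_{k+1} + u', \qquad u' \in \bigoplus_{i=0}^{k} W_i \subset U. \]
Since $H_j u_k \in U$, we conclude that $u_{k+1} \in U$. Therefore, the claim is verified, and the proof of (1) is complete.

The proof of (1) shows that $\overline{SM(1)}$ is spanned by the vectors
\[ u^1_{n_1} \cdots u^r_{n_r} 1, \qquad u^i \in \{ \tau, H \}, \tag{7.1} \]
such that for $1 \le i \le r$:
\[ n_i \le -1 \text{ if } u^i = H \quad \text{and} \quad n_i \le 0 \text{ if } u^i = \tau. \tag{7.2} \]
This implies that $\overline{SM(1)}$ is strongly generated by the set $\{ \tau, \omega, H, G(-\frac12)H \}$, and (2) holds.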



The following lemma implies that for $i \ge -(2m+1)$ the vectors $H_i H$ and $\widehat H_{i+1} \widehat H$ can be constructed using only the action of the Neveu-Schwarz operators $L(n)$ and $G(n + \frac12)$ on the vacuum vector $1$.

Lemma 7.1. We have:
\[ H_i H \in W_0 \cong L^{ns}(c_{2m+1,1}, 0) \quad \text{for every } i \ge -(2m+1), \]
\[ \widehat H_i \widehat H \in W_0 \cong L^{ns}(c_{2m+1,1}, 0) \quad \text{for every } i \ge -2m. \]

Remark 7.1. If we adopt the notation used by physicists, then Theorem 7.3 implies that $\overline{SM(1)}$ is a $W(\frac32, 2m + \frac12)$ superalgebra, meaning that it is generated by primary fields of weight $\frac32$ and $2m + \frac12$. In some physics papers $W(\frac32, 2m + \frac12)$ superalgebras are studied by using general principles (e.g., Jacobi identities), but only for low $m$. Because $\overline{SM(1)}$ shares many similarities with the singlet algebra $\overline{M(1)}$ [AM1], we call $\overline{SM(1)}$ the super singlet vertex algebra.

8. Zhu's Algebra $A(\overline{SM(1)})$ and Classification of Irreducible $\overline{SM(1)}$-Modules

In this section we completely determine Zhu's algebra $A(\overline{SM(1)})$ and classify all irreducible $\overline{SM(1)}$-modules. It turns out that the structure of Zhu's algebra $A(\overline{SM(1)})$ is similar to the structure of Zhu's algebra $A(\overline{M(1)})$ studied in [A3], and the proofs of the main results are completely analogous.

Recall that $\widehat H = Q(e^{-\alpha} \otimes \phi(-\frac12))$. Clearly, $\widehat H$ is proportional to $G(-\frac12)H$ and therefore $\widehat H \in \overline{SM(1)}$. The next result shows that Zhu's algebra $A(\overline{SM(1)})$ is commutative.

Theorem 8.1. Zhu's algebra $A(\overline{SM(1)})$ is spanned by the set
\[ \{ [\omega]^{*s} * [\widehat H]^{*t} \mid s, t \ge 0 \}. \]
In particular, Zhu's algebra $A(\overline{SM(1)})$ is isomorphic to a certain quotient of the polynomial algebra $\mathbb{C}[x, y]$, where $x$ and $y$ correspond to $[\omega]$ and $[\widehat H]$.

Proof. The proof follows from Proposition 2.1, Theorem 7.3 and the fact that $\tau$ and $H$ are odd vectors, so that $[\tau] = [H] = 0$ in Zhu's algebra.

Let $h^{r,s} = h^{r,s}_{2m+1,1}$, so that $h^{2i+1,1} = \frac{i(i-2m)}{2(2m+1)}$. As in [A3], for $X = F$, $E$ or $H$ we let $\widehat X(n) := \widehat X_{2m+n}$ (here as usual $Y(\widehat X, z) = \sum_{n \in \mathbb{Z}} \widehat X_n z^{-n-1}$). In particular, $\widehat H(0)$ is a degree zero operator acting on $\overline{SM(1)}$. Since $\overline{SM(1)} \subset SM(1)$, every $M(1, \lambda) \otimes F$ is naturally an $\overline{SM(1)}$-module. Let $T$ be the subspace of $M(1) \otimes F$ linearly spanned by the vectors
\[ a \otimes b, \quad \text{where } a \in M(1),\ b \in F,\ \deg(b) > 0. \]
(So we only assume that $b$ is homogeneous in $F$ and that it is not proportional to $1$.) The following lemma is a consequence of the definition of the vertex superalgebra structure on $M(1) \otimes F$.

Lemma 8.1. Let $\lambda \in \mathfrak{h}^*$ and let $v_\lambda$ be the highest weight vector in $M(1, \lambda) \otimes F$. Assume that $w \in T$. Then $o(w) v_\lambda = 0$.



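We have the following proposition about the action of the "Cartan subalgebra" of $\overline{SM(1)}$ on the top component.

Proposition 8.1. Let $\lambda \in \mathfrak{h}^*$, $t = \langle \alpha, \lambda \rangle$ and let $v_\lambda$ be the highest weight vector in $M(1, \lambda) \otimes F$. Then we have
\[ L(0) \cdot v_\lambda = \frac{t(t - 2m)}{2(2m+1)}\, v_\lambda, \qquad \widehat H(0) \cdot v_\lambda = \binom{t}{2m+1} v_\lambda. \]

Proof. From the very definition of $Q$ and $\widehat H$ we see that
\[ \widehat H = \phi(1/2)\, S_{2m+1}(\alpha)\, \phi(-1/2) + w = S_{2m+1}(\alpha) + w, \]
where
\[ w = S_{2m-1}(\alpha) \otimes \phi(-\tfrac32)\phi(-\tfrac12) + \cdots + 1 \otimes \phi(-2m - \tfrac12)\phi(-\tfrac12) \in T. \]
On the other hand it is known (cf. Proposition 3.1 in [A2]) that
\[ S_r(\alpha)(0)\, v_\lambda = \binom{t}{r} v_\lambda, \qquad r \ge 1. \]
The proof follows.

It is not hard to see that $x(t) = \frac{t(t-2m)}{2(2m+1)}$ and $y(t) = \binom{t}{2m+1}$ parametrize the genus zero curve $P(x, y) = 0$, where
\[ P(x, y) = y^2 - C_m \left( x + \frac{m^2}{2(2m+1)} \right) \prod_{i=0}^{m-1} \left( x - \frac{i(i-2m)}{2(2m+1)} \right)^2, \tag{8.1} \]
where $C_m = \frac{2^{2m+1} (2m+1)^{2m+1}}{(2m+1)!^2}$. Alternatively, notice that we can write
\[ P(x, y) = y^2 - C_m \prod_{i=0}^{2m} \left( x - h^{2i+1,1} \right). \tag{8.2} \]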
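The parametrization of the curve $P(x, y) = 0$ can be checked directly. The sketch below, with illustrative names, evaluates $x(t) = \frac{t(t-2m)}{2(2m+1)}$ and $y(t) = \binom{t}{2m+1}$ at rational values of $t$ and confirms that they satisfy the product form (8.2) with $C_m = \frac{2^{2m+1}(2m+1)^{2m+1}}{(2m+1)!^2}$; it also compares the forms (8.1) and (8.2) at points off the curve. The underlying reason is the factorization $x(t) - h^{2i+1,1} = \frac{(t-i)(t+i-2m)}{2(2m+1)}$.

```python
from fractions import Fraction
from math import factorial

def binom(t, r):
    """Generalised binomial coefficient t(t-1)...(t-r+1)/r! for rational t."""
    num = Fraction(1)
    for k in range(r):
        num *= Fraction(t) - k
    return num / factorial(r)

def h_odd(i, m):
    """h^{2i+1,1} = i (i - 2m) / (2 (2m + 1))."""
    return Fraction(i * (i - 2 * m), 2 * (2 * m + 1))

def c_m(m):
    d = 2 * m + 1
    return Fraction((2 * d) ** d, factorial(d) ** 2)

def p_82(x, y, m):
    """P(x, y) in the product form (8.2)."""
    prod = Fraction(1)
    for i in range(2 * m + 1):
        prod *= x - h_odd(i, m)
    return y * y - c_m(m) * prod

def p_81(x, y, m):
    """P(x, y) in the form (8.1), with the linear factor split off."""
    prod = x + Fraction(m * m, 2 * (2 * m + 1))
    for i in range(m):
        prod *= (x - h_odd(i, m)) ** 2
    return y * y - c_m(m) * prod

def on_curve(m, samples=range(-10, 21)):
    """Check P(x(t), y(t)) = 0 at the half-integers t = k/2."""
    for k in samples:
        t = Fraction(k, 2)
        x = t * (t - 2 * m) / (2 * (2 * m + 1))
        y = binom(t, 2 * m + 1)
        if p_82(x, y, m) != 0 or p_81(x, y, m) != 0:
            return False
    return True

print(all(on_curve(m) for m in range(0, 4)))
```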

By using arguments analogous to those in the proof of Lemma 6.1 from [A3], we obtain the following result:

Lemma 8.2. In Zhu's algebra $A(\overline{SM(1)})$ we have the following relation:
\[ [\widehat H] * [\widehat H] = C_m \prod_{i=0}^{2m} ([\omega] - h^{2i+1,1}), \]
where $C_m$ is as above.

By using Theorem 8.1, Lemma 8.2 and the same proof as that of Theorem 6.1 from [A3] we get:



Theorem 8.2. Zhu's algebra $A(\overline{SM(1)})$ is isomorphic to the commutative, associative algebra $\mathbb{C}[x, y]/\langle P(x, y) \rangle$, where $\langle P(x, y) \rangle$ is the ideal in $\mathbb{C}[x, y]$ generated by the polynomial
\[ P(x, y) = y^2 - C_m \prod_{i=0}^{2m} (x - h^{2i+1,1}). \]
The fact that Zhu's algebra $A(\overline{SM(1)})$ is commutative enables us to study irreducible lowest weight representations of the vertex operator superalgebra $\overline{SM(1)}$. For given $(r, s) \in \mathbb{C}^2$ such that $P(r, s) = 0$, let $L(r, s)$ be the irreducible lowest weight $\overline{SM(1)}$-module generated by the vector $v_{r,s}$ such that
\[ L(m) v_{r,s} = r\, \delta_{m,0}\, v_{r,s}, \qquad \widehat H(m) v_{r,s} = s\, \delta_{m,0}\, v_{r,s} \qquad (m \ge 0). \]
Our Theorem 8.2 and standard Zhu's theory imply the following classification result.

Theorem 8.3. The set
\[ \{ L(r, s) \mid P(r, s) = 0 \} \]
provides all non-isomorphic irreducible $\frac12\mathbb{Z}_{\ge 0}$-gradable $\overline{SM(1)}$-modules.

By using the classification of irreducible $\overline{SM(1)}$-modules and the same proof as that of Theorem 4.3 of [AM1] we get:

Corollary 8.1. The vertex operator superalgebra $\overline{SM(1)}$ is simple.

8.1. Logarithmic $\overline{SM(1)}$-modules. In [AM1] we studied logarithmic modules for the singlet vertex algebra $\overline{M(1)}_p$. Here we have a similar result. As in [AM1], let $M(1, \lambda) \otimes \Omega$ be an $\hat{\mathfrak{h}}$-module, where $\Omega$ is a two-dimensional vector space and where $\alpha(0)|_\Omega$ is given by the matrix
\[ \begin{pmatrix} \langle \alpha, \lambda \rangle & 1 \\ 0 & \langle \alpha, \lambda \rangle \end{pmatrix} \tag{8.3} \]
in some basis $\{ w_1, w_2 \}$ of $\Omega$ (see also [M3]). Then $M(1, \lambda) \otimes F \otimes \Omega$ carries an $\mathfrak{ns}$-module structure.

Proposition 8.2. The vector space $M(1, \lambda) \otimes F \otimes \Omega$, $\lambda \ne \frac{m}{2m+1}\alpha$, is a genuine logarithmic $\overline{SM(1)}$-module (in other words, the module involves nontrivial Jordan blocks with respect to the action of $L(0)$), while for $\lambda = \frac{m}{2m+1}\alpha$, $M(1, \lambda) \otimes F \otimes \Omega$ is an ordinary $\overline{SM(1)}$-module.

Notice that the previous result is in agreement with Theorem 8.2. More precisely, because of the linear term $(x + \frac{m^2}{2(2m+1)})$ in $P(x, y)$, as in the proof of Proposition 7.1 of [AM1], Theorem 8.2 can now be used to show that there are no logarithmic self-extensions of $M(1, \frac{m}{2m+1}\alpha) \otimes F$.

248


8.2. Further properties of $A(\overline{SM(1)})$. In the next sections we shall make use of the following important technical results.

Proposition 8.3. In Zhu's algebra $A(\overline{SM(1)})$ we have $[Q^2 e^{-2\alpha}] = B_m f_m([\omega])$, where

\[ f_m([\omega]) = \prod_{i=0}^{3m} ([\omega] - h_{2i+1,1}) \]

and

\[ B_m = (-1)^m \frac{\binom{2m}{m}\,(2(2m+1))^{3m+1}}{\binom{4m+1}{m}\,(3m+1)!^2}. \]

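Proof. First we notice that

\[ Q^2 e^{-2\alpha} = \nu H_{-2m-2} H + v, \]

where $\nu \neq 0$ and $v \in U(\mathfrak{ns})\cdot\mathbf{1}$ (see also [AM2], Lemma 3.3). The above results on the structure of $A(\overline{SM(1)})$ imply that $[Q^2 e^{-2\alpha}] = \Phi_m([\omega])$ for a certain $\Phi_m \in \mathbb{C}[x]$, $\deg \Phi_m \le 3m + 1$. We shall evaluate the action of $Q^2 e^{-2\alpha}$ on top levels of $\overline{SM(1)}$-modules $M(1, \lambda) \otimes F$. Let $v_\lambda$ be the highest weight vector in $M(1, \lambda) \otimes F$. First we notice that

\[ Q^2 e^{-2\alpha} = \sum_{i=0}^{4m+1} e^{\alpha}_{-i-1} e^{\alpha}_{i} e^{-2\alpha} + w, \quad \text{where } w \in T. \]

By using a direct calculation similar to that of [AM2] we see that

\[ \begin{aligned}
o(Q^2 e^{-2\alpha}) v_\lambda &= \sum_{i=0}^{\infty} o(e^{\alpha}_{-i-1} e^{\alpha}_{i} e^{-2\alpha}) v_\lambda \\
&= \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \sum_{i=0}^{\infty} z_1^{-i-1} z_2^{i} (z_1 - z_2)^{2m+1} (z_1 z_2)^{-4m-2} (1 + z_1)^t (1 + z_2)^t v_\lambda \\
&= \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} (z_1 - z_2)^{2m} (z_1 z_2)^{-4m-2} (1 + z_1)^t (1 + z_2)^t v_\lambda \\
&= \Phi_m\!\left(\tfrac{1}{2(2m+1)}(t^2 - 2tm)\right) v_\lambda = \widetilde{\Phi}_m(t) v_\lambda,
\end{aligned} \]

where

\[ \widetilde{\Phi}_m(t) = \sum_{k=0}^{2m} (-1)^k \binom{2m}{k} \binom{t}{4m+1-k} \binom{t}{2m+1+k}, \qquad t = \langle \lambda, \alpha \rangle. \]

As in the proof of Lemma 3.4 in [AM2] one can prove the following identity:

\[ \widetilde{\Phi}_m(t) = \bar{A}_m \binom{t}{3m+1} \binom{t+m}{3m+1}, \quad \text{where } \bar{A}_m = \frac{(-1)^m \binom{2m}{m}}{\binom{4m+1}{m}}. \tag{8.4} \]

This implies

\[ \Phi_m\!\left(\tfrac{1}{2(2m+1)}(t^2 - 2mt)\right) = \widetilde{\Phi}_m(t) = B_m f_m\!\left(\tfrac{1}{2(2m+1)}(t^2 - 2mt)\right). \]

Consequently, $\Phi_m$ is a non-trivial polynomial of degree $3m + 1$ and in $A(\overline{SM(1)})$ we have

\[ [Q^2 e^{-2\alpha}] = \Phi_m([\omega]) = B_m f_m([\omega]), \quad B_m \neq 0. \tag{8.5} \]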
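The binomial identity (8.4) and the constant $B_m$ can be spot-checked with exact rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes $h_{2i+1,1} = i(i-2m)/(2(2m+1))$, which is the value forced by the substitution $x = (t^2-2mt)/(2(2m+1))$ used in the proof.

```python
from fractions import Fraction
from math import comb, factorial

def gbinom(x, n):
    """Generalized binomial x(x-1)...(x-n+1)/n!; zero when n < 0."""
    if n < 0:
        return Fraction(0)
    num = 1
    for r in range(n):
        num *= (x - r)
    return Fraction(num, factorial(n))

def phi_sum(m, t):
    """Residue sum: sum_{k=0}^{2m} (-1)^k C(2m,k) C(t,4m+1-k) C(t,2m+1+k)."""
    return sum((-1) ** k * comb(2 * m, k)
               * gbinom(t, 4 * m + 1 - k) * gbinom(t, 2 * m + 1 + k)
               for k in range(2 * m + 1))

def phi_closed(m, t):
    """Closed form (8.4): A_m * C(t,3m+1) * C(t+m,3m+1)."""
    a_bar = Fraction((-1) ** m * comb(2 * m, m), comb(4 * m + 1, m))
    return a_bar * gbinom(t, 3 * m + 1) * gbinom(t + m, 3 * m + 1)

def h(m, i):
    """Assumed conformal weight h_{2i+1,1} = i(i-2m)/(2(2m+1))."""
    return Fraction(i * (i - 2 * m), 2 * (2 * m + 1))

def B(m):
    """Constant B_m of Proposition 8.3."""
    return Fraction((-1) ** m * comb(2 * m, m) * (2 * (2 * m + 1)) ** (3 * m + 1),
                    comb(4 * m + 1, m) * factorial(3 * m + 1) ** 2)

def f_m(m, x):
    """f_m(x) = prod_{i=0}^{3m} (x - h_{2i+1,1})."""
    out = Fraction(1)
    for i in range(3 * m + 1):
        out *= (x - h(m, i))
    return out
```

For integer $t$ one can then compare `phi_sum(m, t)`, `phi_closed(m, t)` and `B(m) * f_m(m, h(m, t))`; agreement on more than $6m+2$ integer points pins down the polynomial identity for that $m$.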

Define the following non-trivial vector

\[ U^{F,E} := \mathrm{Res}_z\, Y(F, z)E\, \frac{(z + 1)^{2m}}{z} \in \overline{SM(1)}. \]

Set

\[ U^{F,E}(0) := o(U^{F,E}) = \sum_{i \ge 0} \binom{2m}{i} o(F_{i-1} E). \]

Proposition 8.4. In Zhu's algebra $A(\overline{SM(1)})$ we have

\[ [U^{F,E}] = g([\omega])[\widehat{H}], \]

where $g(x) \in \mathbb{C}[x]$ is of degree at most $m$.

Proof. First we notice that $U^{F,E} = a H - H \circ H$ for a certain $a \in U(\mathfrak{ns})$. This implies that in Zhu's algebra $A(\overline{SM(1)})$ we have

\[ [U^{F,E}] = g([\omega])[\widehat{H}], \tag{8.6} \]

where $g \in \mathbb{C}[x]$ is a polynomial of degree at most $m$. (Here we used the relation $[H \circ H] = 0$, which holds in $A(\overline{SM(1)})$.)

It is not at all clear that $g(x)$ is a nonzero polynomial.

9. The N = 1 Triplet Vertex Algebra SW(m)

Define the following vertex superalgebra

\[ SW(m) = \mathrm{Ker}_{V_L \otimes F}\, Q. \]

Recall definition (5.7). For any $X \in \{E, F, H\}$, $\widehat{X}$ is proportional to $G(-\tfrac{1}{2})X$, and therefore $\widehat{X} \in SW(m)$.

Theorem 9.1.
(1) For every $m \ge 1$, $SW(m)$ is an $N = 1$ vertex operator superalgebra and $SW(m) \cong S\Lambda(1)$.
(2) The vertex operator superalgebra $SW(m)$ is generated by $E$, $F$, $H$ and $\tau$.
(3) The vertex operator superalgebra $SW(m)$ is strongly generated by the set $\{\tau, \omega, E, F, H, \widehat{E}, \widehat{F}, \widehat{H}\}$.


Proof. Recall the structure of $V_L \otimes F$ as a module for the Neveu-Schwarz algebra from Theorem 6.1. By using Lemma 5.1, similarly to the proof of Theorem 7.2, we conclude that $SW(m)$ is a completely reducible module for the Neveu-Schwarz algebra, generated by the family of singular vectors

\[ Q^j e^{-n\alpha}, \quad n \in \mathbb{Z}_{\ge 0},\ j \in \{0, \dots, 2n\}. \tag{9.1} \]

This proves (1). Let $Z_n$ be the Neveu-Schwarz module generated by the singular vectors $Q^j e^{-\ell\alpha}$, $\ell \le n$, $j \in \mathbb{Z}_{\ge 0}$. Therefore $SW(m) = \bigcup_{n \in \mathbb{Z}_{\ge 0}} Z_n$.

Let now $U$ be the vertex subalgebra of $SW(m)$ generated by $\tau, E, F, H$. Clearly, $U \subseteq SW(m)$. We shall prove that in fact $U = SW(m)$. In order to do so it is sufficient to show that $Z_n \subseteq U$ for every $n \in \mathbb{Z}_{>0}$. We shall prove this claim by induction on $n$. By definition, the claim holds for $n = 1$. Assume now that $Z_n \subseteq U$. Set $j_0 = (2m + 1)n + 1$. As in the proof of Theorem 7.3 we have

\[ F_{-j_0} e^{-n\alpha} = e^{-(n+1)\alpha}, \qquad E_{-j_0} Q^{2n} e^{-n\alpha} = B_{2n+1} Q^{2n+2} e^{-(n+1)\alpha}, \]

where $B_{2n+1} \neq 0$, and

\[ H_{-j_0} Q^{j} e^{-n\alpha} = B_j Q^{j+1} e^{-(n+1)\alpha} + v_j, \]

where $v_j \in Z_n$, $B_j \neq 0$, $0 \le j \le 2n$. These relations imply that $Z_{n+1} \subseteq U$. By induction we conclude that $Z_n \subseteq U$ for every $n \in \mathbb{Z}_{>0}$ and therefore $U = SW(m)$. This proves (2).

The proof of (2) actually gives that $SW(m)$ is spanned by the vectors

\[ u^1_{n_1} \cdots u^r_{n_r} \mathbf{1}, \quad u^i \in \{\tau, E, F, H\}, \tag{9.2} \]

such that for $1 \le i \le r$:

\[ n_i \le -1 \ \text{ if } u^i \in \{E, F, H\} \quad \text{and} \quad n_i \le 0 \ \text{ if } u^i = \tau. \tag{9.3} \]

The assertion (3) follows.

Theorem 9.2. Assume that $m \ge 1$. Then we have
(1) The vertex operator superalgebra $SW(m)$ is $C_2$-cofinite.
(2) The vertex operator superalgebra $SW(m)$ is irrational.

Proof. By using Proposition 2.1, relation (2.3) and Theorem 7.2 we conclude that $SW(m)/C_2(SW(m))$ is generated by

\[ \overline{\tau},\ \overline{\omega},\ \overline{E},\ \overline{\widehat{E}},\ \overline{F},\ \overline{\widehat{F}},\ \overline{H},\ \overline{\widehat{H}}, \tag{9.4} \]

and that every two generators either commute or anti-commute. In order to prove $C_2$-cofiniteness it suffices to prove that every generator (9.4) is nilpotent in $SW(m)/C_2(SW(m))$. Let $X$ be either $E$ or $F$. From Lemma 5.2 we see that $X_{-1}X = 0$, and thus $\overline{X}^2 = 0$. By using

\[ G(-i - 1/2)^2 = L(-2i - 1) \in U(\mathfrak{ns}) \]

we get $\overline{\tau}^2 = 0$. Similarly, from $H_{-1}H \in U(\mathfrak{ns}) \cdot \mathbf{1}$,

\[ H_{-1}H = \sum_{k/2 + i_1 + \cdots + i_k + j_1 + \cdots + j_s = 4m+1} a_{i_1,\dots,i_k}\, G(-i_1 - 1/2) \cdots G(-i_k - 1/2) L(-j_1) \cdots L(-j_s)\mathbf{1}, \]

where $i_1 > i_2 > \cdots > i_k \ge 1$, $j_1, \dots, j_s \ge 2$, $a_{i_1,\dots,i_k} \in \mathbb{C}$, it follows that $H_{-1}H \in C_2(SW(m))$, and thus

\[ \overline{H}^2 = \overline{H_{-1}H} = 0. \]

We also have $\widehat{X}_{-1}\widehat{X} = 0$, $\widehat{X} \in \{\widehat{F}, \widehat{E}\}$, in $SW(m)$ (cf. Lemma 5.3), so that $\overline{\widehat{X}}^2 = 0$ in $SW(m)/C_2(SW(m))$.

Thus, it remains to prove that $\overline{\omega}$ and $\overline{\widehat{H}}$ are nilpotent. We prove this as in [AM2]. Since

\[ \widehat{E}_{-1}\widehat{F} + \widehat{F}_{-1}\widehat{E} + 2\widehat{H}_{-1}\widehat{H} = 0 \]

we get $\overline{\widehat{H}}^4 = 0$. Moreover, the description of Zhu's algebra from Theorem 8.2 implies that

\[ \overline{\widehat{H}}^2 = C_m \overline{\omega}^{2m+1} \quad (C_m \neq 0), \]

which implies that $\overline{\omega}^{4m+2} = 0$. Therefore, every generator of $SW(m)/C_2(SW(m))$ is nilpotent and $SW(m)$ is $C_2$-cofinite. This proves (1). Assertion (2) follows from the fact that $V_L \otimes F$ is not completely reducible, viewed as an $SW(m)$-module.

10. Classification of Irreducible SW(m)-Modules

From the definition of Zhu's algebra and the structure of the vertex operator superalgebra $SW(m)$ we get:

Proposition 10.1. The associative algebra $A(SW(m))$ is generated by $[\widehat{E}]$, $[\widehat{F}]$, $[\widehat{H}]$ and $[\omega]$.

Proof. The proof follows from Proposition 2.1, Theorem 9.1 and the fact that $\tau$, $E$, $F$ and $H$ are all odd.

Theorem 10.1. In Zhu's algebra $A(SW(m))$ we have the following relation:

\[ f_m([\omega]) = 0, \quad \text{where } f_m(x) = \prod_{i=0}^{3m} (x - h_{2i+1,1}). \]

Proof. Since $O(\overline{SM(1)}) \subset O(SW(m))$, the embedding $\overline{SM(1)} \subset SW(m)$ induces an algebra homomorphism $A(\overline{SM(1)}) \to A(SW(m))$. Applying this homomorphism to Proposition 8.3 and using the fact that $Q^2 e^{-2\alpha} \in O(SW(m))$ we get that $f_m([\omega]) = 0$ in $A(SW(m))$.

Alternatively, we can write the polynomial $f_m(x)$ as

\[ f_m(x) = (x - h_{2m+1,1}) \left( \prod_{i=0}^{m-1} (x - h_{2i+1,1})^2 \right) \prod_{i=2m+1}^{3m} (x - h_{2i+1,1}), \tag{10.1} \]

indicating the possible existence of logarithmic modules of generalized lowest conformal weight $h_{2i+1,1}$, $i = 0, \dots, m - 1$.

Theorem 10.2.
(1) For every $0 \le i \le m$, $S\Lambda(i + 1)$ is an irreducible $\tfrac{1}{2}\mathbb{Z}_{\ge 0}$-gradable $SW(m)$-module, with the top component $S\Lambda(i + 1)(0)$ of lowest weight $h_{2i+1,1}$. Moreover, $S\Lambda(i + 1)(0)$ is a 1-dimensional irreducible $A(SW(m))$-module.
(2) For every $0 \le j \le m - 1$, $S\Pi(m - j)$ is an irreducible $\tfrac{1}{2}\mathbb{Z}_{\ge 0}$-gradable $SW(m)$-module, with the top component $S\Pi(m - j)(0)$ of lowest weight $h_{2i+1,1}$, where $i = 2m + 1 + j$. Moreover, $S\Pi(m - j)(0)$ is a 2-dimensional irreducible $A(SW(m))$-module.

Proof. The proof is similar to that of Theorem 3.7 in [AM2], so we omit it here.

Applying the previous theorem in the case of $SW(m) = S\Lambda(1)$ we get:

Corollary 10.1. The vertex operator superalgebra $SW(m)$ is simple.

As in [AM2] we have the following result:

Proposition 10.2. In Zhu's associative algebra we have

\[ [\widehat{H}] * [\widehat{F}] - [\widehat{F}] * [\widehat{H}] = -2q([\omega])[\widehat{F}], \tag{10.2} \]
\[ [\widehat{H}] * [\widehat{E}] - [\widehat{E}] * [\widehat{H}] = 2q([\omega])[\widehat{E}], \tag{10.3} \]
\[ [\widehat{E}] * [\widehat{F}] - [\widehat{F}] * [\widehat{E}] = -2q([\omega])[\widehat{H}], \tag{10.4} \]

where $q$ is a certain polynomial.

Theorem 10.3. The set $\{S\Pi(i)(0) : 1 \le i \le m\} \cup \{S\Lambda(i)(0) : 1 \le i \le m + 1\}$ provides, up to isomorphism, all irreducible modules for Zhu's algebra $A(SW(m))$.
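The multiplicity pattern in (10.1), and the count of $2m+1$ distinct roots of $f_m$ that matches the number of irreducible modules in Theorem 10.3, can be illustrated with a short exact computation. This is a sketch under the assumption $h_{2i+1,1} = i(i-2m)/(2(2m+1))$ (the value implied by the substitution $x = t(t-2m)/(2(2m+1))$ used elsewhere in the text); the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def h(m, i):
    """Assumed conformal weight h_{2i+1,1} = i(i-2m)/(2(2m+1))."""
    return Fraction(i * (i - 2 * m), 2 * (2 * m + 1))

def root_multiplicities(m):
    """Multiplicity of each root of f_m(x) = prod_{i=0}^{3m} (x - h_{2i+1,1})."""
    return Counter(h(m, i) for i in range(3 * m + 1))

def describe(m):
    """Split the distinct roots into double roots and simple roots."""
    mult = root_multiplicities(m)
    double = sorted(r for r, k in mult.items() if k == 2)
    simple = sorted(r for r, k in mult.items() if k == 1)
    return double, simple
```

The symmetry $i \mapsto 2m - i$ of $i(i-2m)$ is what produces the double roots for $0 \le i \le m-1$, while $i = m$ is its fixed point and the weights with $i \ge 2m+1$ have no partner in the range.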


Proof. The proof is similar to that of Theorem 3.11 in [AM2]. Assume that $U$ is an irreducible $A(SW(m))$-module. The relation $f_m([\omega]) = 0$ in $A(SW(m))$ implies that

\[ L(0)|_U = h_{2i+1,1}\, \mathrm{Id}, \quad \text{for } i \in \{0, \dots, m\} \cup \{2m + 1, \dots, 3m\}. \]

Assume first that $i = 2m + 1 + j$ for $0 \le j \le m - 1$. By combining Proposition 10.2 and Theorem 10.2 we have that $q(h_{2i+1,1}) \neq 0$. Define

\[ e = \frac{1}{\sqrt{2}\, q(h_{2i+1,1})} [\widehat{E}], \qquad f = -\frac{1}{\sqrt{2}\, q(h_{2i+1,1})} [\widehat{F}], \qquad h = \frac{1}{q(h_{2i+1,1})} [\widehat{H}]. \]

Therefore $U$ carries the structure of an irreducible $\mathfrak{sl}_2$-module with the property that $e^2 = f^2 = 0$ and $h \neq 0$ on $U$. This easily implies that $U$ is a 2-dimensional irreducible $\mathfrak{sl}_2$-module. Moreover, as an $A(SW(m))$-module $U$ is isomorphic to $S\Pi(m - j)(0)$.

Assume next that $0 \le i \le m$. If $q(h_{2i+1,1}) \neq 0$, as above we conclude that $U$ is an irreducible 1-dimensional $\mathfrak{sl}_2$-module. Therefore $U \cong S\Lambda(i + 1)(0)$. If $q(h_{2i+1,1}) = 0$, from Proposition 10.2 we have that the actions of the generators of $A(SW(m))$ commute on $U$. Irreducibility of $U$ implies that $U$ is 1-dimensional. Since $[\widehat{H}]^2$, $[\widehat{E}]^2$, $[\widehat{F}]^2$ must act trivially on $U$, we conclude that $[\widehat{H}]$, $[\widehat{E}]$, $[\widehat{F}]$ also act trivially on $U$. Therefore $U \cong S\Lambda(i + 1)(0)$.

As a consequence of the previous theorem we have:

Theorem 10.4. The set $\{S\Pi(i) : 1 \le i \le m\} \cup \{S\Lambda(i) : 1 \le i \le m + 1\}$ provides, up to isomorphism, all irreducible modules for the vertex operator superalgebra $SW(m)$.

11. On the Structure of Zhu's Algebra A(SW(m))

As in [AM2], the main difficulty in the description of Zhu's algebra $A(SW(m))$ is that we do not have a good understanding of logarithmic $SW(m)$-modules. For the triplet $W(p)$ this problem can be resolved, at least if $p$ is prime, by using modular invariance. We believe the same approach can be applied to $SW(m)$, which would require a super version of Miyamoto's result [Miy]. This is the main reason why in this part we focus mostly on the case when $2m + 1$ is prime, but we expect all results to be true in general. In many ways this section is analogous to Sect. 5 (and the Appendix) in [AM2], but as we shall see there are some important differences. First, a few generalities regarding the Lagrange interpolation polynomial.

Proposition 11.1. Let $S = \{(x_1, y_1), \dots, (x_n, y_n)\}$, $x_i \neq x_j$, be a set of points in $\mathbb{C}^2$ such that their Lagrange interpolation polynomial $L_n(x)$ is of degree exactly $n - 1$. Then every interpolation polynomial of degree exactly $n$ is given by

\[ Q_\lambda(x) = L_n(x) + \lambda \prod_{i=1}^{n} (x - x_i), \quad \lambda \neq 0. \]

Proof. Let $P(x)$ be an arbitrary interpolation polynomial of degree $n$. Then for some $\lambda \neq 0$, the polynomial $P(x) - \lambda \prod_{i=1}^{n} (x - x_i)$ is of degree less than or equal to $n - 1$, but not zero. But then $P(x) - \lambda \prod_{i=1}^{n} (x - x_i) = L_n(x)$.
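Proposition 11.1 is easy to illustrate on a small example with exact arithmetic. The sketch below uses hypothetical helper names and arbitrary sample points; polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, x):
    acc = Fraction(0)
    for c in reversed(p):  # Horner's rule
        acc = acc * x + c
    return acc

def node_poly(xs):
    """prod_i (x - x_i) as a coefficient list."""
    p = [Fraction(1)]
    for x in xs:
        p = poly_mul(p, [Fraction(-x), Fraction(1)])
    return p

def lagrange(points):
    """Lagrange interpolation polynomial through the given (x, y) pairs."""
    xs = [x for x, _ in points]
    total = [Fraction(0)]
    for k, (xk, yk) in enumerate(points):
        basis = node_poly(xs[:k] + xs[k + 1:])
        denom = poly_eval(basis, Fraction(xk))
        total = poly_add(total, [Fraction(yk) * c / denom for c in basis])
    return total

def q_lambda(points, lam):
    """Q_lambda(x) = L_n(x) + lambda * prod_i (x - x_i)."""
    xs = [x for x, _ in points]
    return poly_add(lagrange(points), [lam * c for c in node_poly(xs)])

def degree(p):
    d = len(p) - 1
    while d > 0 and p[d] == 0:
        d -= 1
    return d
```

For the sample points $(0,1), (1,3), (2,11)$ the interpolation polynomial is $3x^2 - x + 1$ (degree $n - 1 = 2$), and every `q_lambda(points, lam)` with `lam != 0` is a cubic through the same points.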



Lemma 11.1. Let $L_m(x)$ be the Lagrange interpolation polynomial for $\left(h_{2i+1,1}, \binom{i}{2m+1}\right)$, where $2m + 1 \le i \le 3m$. If we let $r(t) = L_m\!\left(\frac{t(t-2m)}{2(2m+1)}\right)$, then

\[ r(t) = \frac{\prod_{i=2m+1}^{3m} (t - i)(t - 2m + i)}{(2m + 1)!} \times \sum_{i=2m+1}^{3m} \frac{(i!)^2 (-1)^{i+m}}{(i - 2m - 1)!^2\, (3m - i)!\, (i + m)!} \left( \frac{1}{t - i} - \frac{1}{t - 2m + i} \right) \in \mathbb{C}[t]. \]
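The closed formula of Lemma 11.1 can be compared against a direct Lagrange evaluation. The following sketch assumes the interpolation data $\left(h_{2i+1,1}, \binom{i}{2m+1}\right)$ with $h_{2i+1,1} = i(i-2m)/(2(2m+1))$; the function names are hypothetical, and the closed form is only evaluated at integers $0 \le t \le m$, where no denominator vanishes.

```python
from fractions import Fraction
from math import comb, factorial

def h(m, i):
    """Assumed conformal weight h_{2i+1,1} = i(i-2m)/(2(2m+1))."""
    return Fraction(i * (i - 2 * m), 2 * (2 * m + 1))

def lagrange_eval(points, x):
    """Value at x of the Lagrange polynomial through the given points."""
    total = Fraction(0)
    for k, (xk, yk) in enumerate(points):
        term = Fraction(yk)
        for l, (xl, _) in enumerate(points):
            if l != k:
                term *= (x - xl) / (xk - xl)
        total += term
    return total

def r_lagrange(m, t):
    """r(t) = L_m(h(t)), computed directly from the interpolation data."""
    points = [(h(m, i), comb(i, 2 * m + 1)) for i in range(2 * m + 1, 3 * m + 1)]
    return lagrange_eval(points, h(m, t))

def r_closed(m, t):
    """Closed formula of Lemma 11.1 (t an integer with 0 <= t <= m)."""
    prod = 1
    for i in range(2 * m + 1, 3 * m + 1):
        prod *= (t - i) * (t - 2 * m + i)
    total = Fraction(0)
    for i in range(2 * m + 1, 3 * m + 1):
        coeff = Fraction(factorial(i) ** 2 * (-1) ** (i + m),
                         factorial(i - 2 * m - 1) ** 2 * factorial(3 * m - i) * factorial(i + m))
        total += coeff * (Fraction(1, t - i) - Fraction(1, t - 2 * m + i))
    return Fraction(prod, factorial(2 * m + 1)) * total
```

For $m = 2$ both functions give $r(0) = -18/7$, $r(1) = -33/7$ and $r(2) = -38/7$, all nonzero, which is the kind of nonvanishing at $t = 0, \dots, m$ that the subsequent discussion relies on.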

Now, we have an important technical result (in a slightly different setup a similar result has been proven in the Appendix of [AM2]).

Proposition 11.2. For every $m \ge 1$ we have $L_m(h_{2i+1,1}) \neq 0$, $0 \le i \le m$.

Proof. As in [AM2] it suffices to let

\[ s(t) = \frac{r(t)}{\prod_{i=2m+1}^{3m} (t - i)(t - 2m + i)}, \]

and check first that $s(0) < 0$ and $s(1) < 0$, which follows by using hypergeometric summations. That $L_m(h_{2i+1,1}) = r(i) \neq 0$ for $0 \le i \le m$ follows now from the recursion

\[ s(t)(m + t)(2m + 1 - t)^2 = 2(m + 1 - t)(2m^2 + 2tm - 2 - t^2 + 2t)\, s(t - 1) + (t - 1)^2 (3m + 2 - t)\, s(t - 2), \]

because all coefficients in the recursion are positive for $1 \le t \le m$.

As in the Appendix of [AM2] we now observe that

\[ \widehat{H} * \widehat{F} = a.\widehat{F}, \]

where $a \in U(\mathfrak{ns})$. Hence

\[ [\widehat{H}] * [\widehat{F}] = -q([\omega])[\widehat{F}] \tag{11.1} \]

for some $q \in \mathbb{C}[x]$. From $\deg(\widehat{H}_{-1}\widehat{F}) = 4m + 2$ it follows that $q([\omega])$ is a polynomial of degree at most $m$. In [AM2] this observation was sufficient to argue that $q$ has to be the interpolation polynomial. However, in view of Proposition 11.1 and Lemma 11.1, we are unable to argue that $q = L_m$, because $L_m$ is of degree $m - 1$. Thus, it is not clear what the polynomial $q$ should be.


Proposition 11.3. Let $g(x)$ be as in Proposition 8.4 and

\[ u(x) = \prod_{i=2m+1}^{3m} (x - h_{2i+1,1}). \]

Then $g(x) = D_m u(x)$ for some constant $D_m$. Moreover,

\[ D_m u([\omega]) * [\widehat{X}] = 0, \quad X \in \{F, H, E\}. \tag{11.2} \]

Proof. First we notice that $U^{F,E} = F \circ E \in O(SW(m))$. Then Proposition 8.4 implies that

\[ g([\omega]) * [\widehat{H}] = 0 \quad \text{in } A(SW(m)) \]

for some polynomial $g$ of degree at most $m$. Because we already know all irreducible $SW(m)$-modules, we also know that $g([\omega])$ must act as zero on all $SW(m)$-modules with two-dimensional highest weight subspaces (here $[\widehat{H}]$ acts nontrivially). Thus we know that $g([\omega]) = D_m u([\omega])$ for some constant $D_m$. Since $Q$ preserves $O(SW(m))$ we get (11.2).

It is crucial for our considerations to show that $D_m \neq 0$ (i.e., $g(x) \neq 0$). This will require an explicit computation of $U^{F,E}(0)$ on the top degree subspaces of certain $\overline{SM(1)}$-modules. We have the following result.

Theorem 11.1. If $m \in \mathbb{N}$ is such that $2m + 1$ is a prime integer, then $g(x) \neq 0$.

For the proof of this important technical result we refer the reader to the Appendix. If $D_m \neq 0$, from Proposition 11.3 and (11.1) we get

\[ [\widehat{H}] * [\widehat{F}] = -q([\omega])[\widehat{F}] = -\tilde{q}([\omega])[\widehat{F}], \]

where $\tilde{q}([\omega])$ is a polynomial of degree $m - 1$, which forces $\tilde{q} = L_m$. We should say here that in [AM2] the formula (11.2) was a consequence of a formula analogous to (11.1).

Theorem 11.2. Assume that $2m + 1$ is prime or $D_m \neq 0$. Then we have
(i) $[\widehat{E}]^2 = [\widehat{F}]^2 = 0$.
(ii) $[\widehat{H}]^2 = C_m P([\omega])$, where

\[ P(x) = \prod_{i=0}^{2m} (x - h_{2i+1,1}) \in \mathbb{C}[x] \]

and $C_m$ is a nonzero constant.

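(iii)
\[ [\widehat{H}] * [\widehat{F}] = -[\widehat{F}] * [\widehat{H}] = -q([\omega]) * [\widehat{F}], \]
\[ [\widehat{H}] * [\widehat{E}] = -[\widehat{E}] * [\widehat{H}] = q([\omega]) * [\widehat{E}], \]
where $q(x)$ is a nonzero polynomial of degree $m - 1$ and $q(h_{2i+1,1}) \neq 0$, $0 \le i \le m$.
(iv)
\[ [\widehat{H}] * [\widehat{F}] - [\widehat{F}] * [\widehat{H}] = -2q([\omega])[\widehat{F}], \]
\[ [\widehat{H}] * [\widehat{E}] - [\widehat{E}] * [\widehat{H}] = 2q([\omega])[\widehat{E}], \]
\[ [\widehat{E}] * [\widehat{F}] - [\widehat{F}] * [\widehat{E}] = -2q([\omega])[\widehat{H}], \]
where $q(x)$ is as in (iii).
(v)
\[ \prod_{i=2m+1}^{3m} ([\omega] - h_{2i+1,1}) * [\widehat{X}] = 0, \quad \widehat{X} \in \{\widehat{E}, \widehat{F}, \widehat{H}\}. \]
(vi) The center of $A(SW(m))$ is a subalgebra generated by $[\omega]$.

Proof. We recall that $A(SW(m))$ is generated by $[\omega]$ and $[\widehat{X}]$, $X = F$, $H$ and $E$ (see Proposition 10.1). For (i) we recall [AM2] that $Q$ lifts to a derivation of $A(SW(m))$, denoted by the same symbol. Now, because of Lemma 5.3 we have

\[ [\widehat{F}] * [\widehat{F}] = [\widehat{E}] * [\widehat{E}] = 0. \]

Part (ii) has been proven in Lemma 8.2. It is left to show relations (iii), (iv) and (v). As in [AM2] we compute

\[ 0 = Q([\widehat{F}] * [\widehat{F}]) = [\widehat{H}] * [\widehat{F}] + [\widehat{F}] * [\widehat{H}], \]

which yields

\[ [\widehat{H}] * [\widehat{F}] = -[\widehat{F}] * [\widehat{H}]. \]

After an application of $Q^2$ to the previous equation we get

\[ [\widehat{H}] * [\widehat{E}] = -[\widehat{E}] * [\widehat{H}]. \]

The two remaining formulas in (iii),

\[ [\widehat{H}] * [\widehat{F}] = -q([\omega]) * [\widehat{F}], \qquad [\widehat{H}] * [\widehat{E}] = q([\omega]) * [\widehat{E}], \tag{11.3} \]

have already been proven in the discussion preceding the theorem. The relation (iv) follows from (iii) (cf. [AM2]). Part (v) follows directly from Proposition 11.3. Part (vi) follows from the fact that $q([\omega])$ is a unit in $A(SW(m))$.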


Corollary 11.1. Under the assumptions of Theorem 11.2, the associative algebra $A(SW(m))$ is spanned by

\[ \{[\omega]^i,\ 0 \le i \le 3m\} \cup \{[\omega]^i * [\widehat{X}],\ 0 \le i \le m - 1,\ X = E, F \text{ or } H\}. \]

Thus, $A(SW(m))$ is at most $(6m + 1)$-dimensional.

By using the same ideas as in [AM2] it is not hard to show that

Theorem 11.3. Assume $2m + 1$ is prime or $D_m \neq 0$. Then Zhu's algebra decomposes as a sum of ideals

\[ A(SW(m)) = \bigoplus_{i=2m+1}^{3m} \mathbb{M}_{h_{2i+1,1}} \oplus \bigoplus_{i=0}^{m-1} \mathbb{I}_{h_{2i+1,1}} \oplus \mathbb{C}_{h_{2m+1,1}}, \]

where $\mathbb{M}_{h_{2i+1,1}} \cong M_2(\mathbb{C})$, $1 \le \dim(\mathbb{I}_{h_{2i+1,1}}) \le 2$ and $\mathbb{C}_{h_{2m+1,1}}$ is one-dimensional.

It is also not hard to find explicit generators for every ideal, in parallel with [AM2]. As with the triplet we expect that all $\mathbb{I}_{h_{2i+1,1}}$ are two-dimensional (which is related to the existence of logarithmic modules). This is equivalent to

Conjecture 11.1. The associative algebra $A(SW(m))$ is $(6m + 1)$-dimensional. Then the center of $A(SW(m))$ is $(3m + 1)$-dimensional.

Remark 11.1. Dong and Jiang have recently proven [DJ] that if $A(V)$ is semisimple and every irreducible admissible module is an ordinary module, then $V$ is rational. It is feasible to assume that their result applies to vertex operator superalgebras. This would imply $\dim(\mathbb{I}_{h_{2i+1,1}}) = 2$ for at least one $i$, and in particular $\dim A(SW(1)) = 7$. (Note that in the case $m = 1$, $D_1 \neq 0$ certainly holds.)

12. Modular Properties of Characters of Irreducible SW(m)-Modules

We first introduce several basic facts regarding classical modular forms needed for the description of irreducible $SW(m)$ characters. The Dedekind $\eta$-function is usually defined as the infinite product

\[ \eta(\tau) = q^{1/24} \prod_{n=1}^{\infty} (1 - q^n), \]

an automorphic form of weight $\tfrac{1}{2}$. As usual in all these formulas $q = e^{2\pi i \tau}$, $\tau \in \mathbb{H}$.³ We also introduce

\[ \mathfrak{f}(\tau) = q^{-1/48} \prod_{n=0}^{\infty} (1 + q^{n+1/2}), \tag{12.1} \]
\[ \mathfrak{f}_1(\tau) = q^{-1/48} \prod_{n=1}^{\infty} (1 - q^{n-1/2}), \tag{12.2} \]
\[ \mathfrak{f}_2(\tau) = q^{1/24} \prod_{n=1}^{\infty} (1 + q^{n}). \tag{12.3} \]

³ Here $\tau$, the coordinate of $\mathbb{H}$, should not be confused with the superconformal vector used in previous sections.

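These (slightly normalized) Weber functions form a vector-valued modular form of weight zero. More precisely,

\[ \mathfrak{f}(-1/\tau) = \mathfrak{f}(\tau), \qquad \mathfrak{f}_2(-1/\tau) = \frac{1}{\sqrt{2}} \mathfrak{f}_1(\tau), \qquad \mathfrak{f}_1(-1/\tau) = \sqrt{2}\, \mathfrak{f}_2(\tau), \]
\[ \mathfrak{f}(\tau + 1) = e^{-2\pi i/48} \mathfrak{f}_1(\tau), \qquad \mathfrak{f}_2(\tau + 1) = e^{2\pi i/24} \mathfrak{f}_2(\tau), \qquad \mathfrak{f}_1(\tau + 1) = e^{-2\pi i/48} \mathfrak{f}(\tau). \]

In what follows, we denote by

\[ \Theta_{j,k}(\tau) = \sum_{n \in \mathbb{Z}} q^{(2kn+j)^2/4k} \]

the Jacobi-Riemann $\Theta$-series, where $j \in \mathbb{Z}$ and $k \in \mathbb{N}/2$. We also let

\[ (\partial\Theta)_{j,k}(\tau) = \sum_{n \in \mathbb{Z}} (2kn + j)\, q^{(2kn+j)^2/4k}. \]

Then we have the transformation formulas (notice that here $k \in \mathbb{N}/2$, so $\Theta_{j,k}(\tau)$ is not invariant under $\tau \to \tau + 1$ in general):

\[ \eta(-1/\tau) = \sqrt{-i\tau}\, \eta(\tau), \qquad \eta(\tau + 1) = e^{\pi i/12} \eta(\tau), \tag{12.4} \]
\[ \Theta_{j,k}(-1/\tau) = \sqrt{\frac{-i\tau}{2k}} \sum_{j'=0}^{2k-1} e^{i\pi jj'/k}\, \Theta_{j',k}(\tau), \tag{12.5} \]
\[ \Theta_{j,k}(\tau + 2) = e^{i\pi j^2/k}\, \Theta_{j,k}(\tau), \tag{12.6} \]
\[ (\partial\Theta)_{j,k}(\tau + 2) = e^{i\pi j^2/k}\, (\partial\Theta)_{j,k}(\tau), \tag{12.7} \]
\[ (\partial\Theta)_{j,k}(-1/\tau) = (-\tau) \sqrt{\frac{-i\tau}{2k}} \sum_{j'=1}^{2k-1} e^{i\pi jj'/k}\, (\partial\Theta)_{j',k}(\tau). \tag{12.8} \]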
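The transformation rules for $\eta$, the Weber functions and the half-integral-level theta series can be sanity-checked numerically at a sample point of the upper half-plane. The sketch below is an illustration only: the function names, truncation lengths and the sample point are arbitrary choices, fractional powers $q^a$ are taken as $e^{2\pi i a \tau}$, and the square roots use the principal branch.

```python
import cmath
import math

def qpow(tau, a):
    """q^a with q = exp(2*pi*i*tau), using the branch exp(2*pi*i*a*tau)."""
    return cmath.exp(2j * math.pi * a * tau)

def eta(tau, terms=60):
    prod = 1.0 + 0j
    for n in range(1, terms + 1):
        prod *= 1 - qpow(tau, n)
    return qpow(tau, 1 / 24) * prod

def weber(tau, terms=60):
    """Return (f, f1, f2) in the normalization of (12.1)-(12.3)."""
    f = f1 = f2 = 1.0 + 0j
    for n in range(1, terms + 1):
        f *= 1 + qpow(tau, n - 0.5)
        f1 *= 1 - qpow(tau, n - 0.5)
        f2 *= 1 + qpow(tau, n)
    return qpow(tau, -1 / 48) * f, qpow(tau, -1 / 48) * f1, qpow(tau, 1 / 24) * f2

def theta(j, k, tau, N=8):
    return sum(qpow(tau, (2 * k * n + j) ** 2 / (4 * k)) for n in range(-N, N + 1))

def dtheta(j, k, tau, N=8):
    return sum((2 * k * n + j) * qpow(tau, (2 * k * n + j) ** 2 / (4 * k))
               for n in range(-N, N + 1))

def s_image(func, j, k, tau, extra=1.0):
    """Right-hand side of the S-transformation: extra * sqrt(-i tau / 2k) * sum_{j'} ..."""
    pref = extra * cmath.sqrt(-1j * tau / (2 * k))
    return pref * sum(cmath.exp(1j * math.pi * j * jp / k) * func(jp, k, tau)
                      for jp in range(int(2 * k)))

def close(a, b, tol=1e-10):
    return abs(a - b) < tol
```

With, say, `tau = 0.2 + 0.9j` and `k = 1.5` (the case $m = 1$), one can compare `theta(j, k, -1 / tau)` with `s_image(theta, j, k, tau)` and `dtheta(j, k, -1 / tau)` with `s_image(dtheta, j, k, tau, extra=-tau)`; the $j' = 0$ term of the latter sum vanishes because $(\partial\Theta)_{0,k} = 0$.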

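For a vertex operator algebra module $M$ we define its graded dimension, or simply character,

\[ \chi_M(\tau) = \mathrm{tr}|_M\, q^{L(0) - c/24}. \]

If $V = L^{ns}(c_{2m+1,0}, 0)$ and $M = L(c_{2m+1,0}, h_{2i+1,2n+1})$, then (see [IK2], for instance)

\[ \chi_{L^{ns}(c_{2m+1,1}, h_{2i+1,2n+1})}(\tau) = q^{\frac{m^2}{2(2m+1)}}\, \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( q^{h_{2i+1,2n+1}} - q^{h_{2i+1,-2n-1}} \right). \tag{12.9} \]

By combining Theorems 6.1, 6.2 and 6.3, and formula (12.9), we obtain

Proposition 12.1. For $i = 0, \dots, m - 1$,

\[ \chi_{S\Lambda(i+1)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( \frac{2i + 1}{2m + 1}\, \Theta_{m-i, \frac{2m+1}{2}}(\tau) + \frac{2}{2m + 1}\, (\partial\Theta)_{m-i, \frac{2m+1}{2}}(\tau) \right), \tag{12.10} \]
\[ \chi_{S\Pi(m-i)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( \frac{2m - 2i}{2m + 1}\, \Theta_{m-i, \frac{2m+1}{2}}(\tau) - \frac{2}{2m + 1}\, (\partial\Theta)_{m-i, \frac{2m+1}{2}}(\tau) \right). \tag{12.11} \]

Also,

\[ \chi_{S\Lambda(m+1)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, \Theta_{0, \frac{2m+1}{2}}(\tau). \tag{12.12} \]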


For purposes of modular invariance, it is also important to compute supercharacters of irreducible modules. Let us recall that the supercharacter of a $V$-module $M$ is defined as

\[ \chi^F_M(\tau) = \mathrm{tr}|_M\, \sigma q^{L(0) - c/24}, \]

where $\sigma$ is the sign operator taking the value $1$ (resp. $-1$) on even (resp. odd) vectors. In parallel with Proposition 12.1, it is not hard to compute irreducible supercharacters of $SW(m)$-modules. Here is an explicit description in terms of $\Theta$-constants and their derivatives.

Proposition 12.2. For $i = 0, \dots, m - 1$,

\[ \chi^F_{S\Lambda(i+1)}(\tau) = \frac{\mathfrak{f}_2(\tau)}{\eta(\tau)} \Big( \frac{2i + 1}{2m + 1} \left( \Theta_{2(m-i),2(2m+1)}(\tau) - \Theta_{2(m+i+1),2(2m+1)}(\tau) \right) \tag{12.13} \]
\[ \qquad\qquad + \frac{1}{2m + 1} \left( (\partial\Theta)_{2(m-i),2(2m+1)}(\tau) - (\partial\Theta)_{2(m+i+1),2(2m+1)}(\tau) \right) \Big); \tag{12.14} \]
\[ \chi^F_{S\Pi(m-i)}(\tau) = \frac{\mathfrak{f}_2(\tau)}{\eta(\tau)} \Big( \frac{2m - 2i}{2m + 1} \left( \Theta_{2(m-i),2(2m+1)}(\tau) - \Theta_{2(m+i+1),2(2m+1)}(\tau) \right) \tag{12.15} \]
\[ \qquad\qquad - \frac{1}{2m + 1} \left( (\partial\Theta)_{2(m-i),2(2m+1)}(\tau) - (\partial\Theta)_{2(m+i+1),2(2m+1)}(\tau) \right) \Big). \tag{12.16} \]

Also,

\[ \chi^F_{S\Lambda(m+1)}(\tau) = \frac{\mathfrak{f}_2(\tau)}{\eta(\tau)} \left( \Theta_{0,2(2m+1)}(\tau) - \Theta_{2(2m+1),2(2m+1)}(\tau) \right). \tag{12.17} \]

As in [F2] we now study modular invariance properties of irreducible $SW(m)$ characters and supercharacters. We only consider some special modular transformations. For example,

\[ \chi_{S\Lambda(i+1)}(-1/\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \sum_{k=0}^{2m} \lambda_k\, \Theta_{k, \frac{2m+1}{2}}(\tau) + (-\tau) \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \sum_{j=1}^{2m} \nu_j\, (\partial\Theta)_{j, \frac{2m+1}{2}}(\tau), \]

for some constants $\lambda_k$ and $\nu_j$. Because of

\[ \Theta_{j,k} = \Theta_{-j,k} = \Theta_{2k-j,k} = \Theta_{2k+j,k}, \qquad (\partial\Theta)_{j,k} = -(\partial\Theta)_{-j,k}, \]

the previous formula indicates that

\[ \tau\, \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, (\partial\Theta)_{j, \frac{2m+1}{2}}(\tau), \quad j = 1, \dots, m, \tag{12.18} \]

have to be added to the vector space spanned by irreducible $SW(m)$ characters in order to preserve modular invariance. In the case of the triplet vertex algebra, expressions similar to (12.18) could be interpreted as Miyamoto's pseudocharacters (cf. [AM2]). On the

other hand, the $T$ transformation $\tau \to \tau + 1$ maps characters to supercharacters (multiplied by appropriate scalars). In order to find an $SL(2, \mathbb{Z})$-closure, we would have to apply the $S$ transformation on the space of supercharacters, but this requires a knowledge of irreducible $\sigma$-twisted characters. Since we do not study $\sigma$-twisted $SW(m)$-modules in this paper, at this point we record the modular invariance property for the untwisted sector only.

Theorem 12.1. The vector space $\mathcal{NS}$ spanned by

\[ \chi_{S\Lambda(m+1)}(\tau), \quad \chi_{S\Lambda(i+1)}(\tau), \quad \chi_{S\Pi(m-i)}(\tau), \quad i = 0, \dots, m - 1, \]
\[ \tau\, \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, (\partial\Theta)_{m-i, \frac{2m+1}{2}}(\tau), \quad i = 0, \dots, m - 1, \tag{12.19} \]

is $(3m + 1)$-dimensional and invariant under the subgroup $\Gamma_\theta \subset SL(2, \mathbb{Z})$, where $\Gamma_\theta = \langle S, T^2 \rangle$.

Remark 12.1. We expect that $S$-transforms of (generalized) supercharacters are expressible in terms of characters and generalized characters of $\sigma$-twisted $SW(m)$-modules. More precisely, an appropriately defined vector space spanned by supercharacters and generalized supercharacters, denoted by $\widetilde{\mathcal{NS}}$, and the vector space spanned by characters and generalized characters of $\sigma$-twisted modules, denoted by $\mathcal{R}$, should be inter-related as in the following diagram.

[Diagram: the spaces $\mathcal{NS}$, $\widetilde{\mathcal{NS}}$ and $\mathcal{R}$, connected by arrows labelled $S$ and $T$.]

It is known that (super)characters of $N = 1$ minimal models in the NS and R sectors transform according to this picture (see [IK1]).

13. SW(m)-Characters and q-Series Identities

In this section we discuss fermionic expressions for irreducible characters of $SW(m)$-modules. As we shall see, irreducible $SW(m)$-modules admit $q$-series formulas similar to those for the triplet, conjectured by Flohr-Grabow-Koehn [FGK] and proven by Warnaar [Wa] (Feigin et al. independently obtained similar identities by using different methods [FFT]). More precisely, the characters of irreducible modules for the super triplet $SW(m)$ are intimately related to characters of irreducible $W(2m + 1)$-modules. It is not clear whether a deeper connection persists beyond characters.

13.1. The m = 1 case: first computation. Motivated by computations in [FGK] for $W(2)$, here we probe double-sum fermionic expressions of irreducible characters of $SW(1)$-modules. As usual, we will be using

\[ (a; q)_n = (1 - a)(1 - aq) \cdots (1 - aq^{n-1}), \qquad (a; q)_\infty = \prod_{i=1}^{\infty} (1 - aq^{i-1}), \]


and sometimes we shall write $(q)_n = (q; q)_n$, for simplicity. We start with a basic relation

\[ \frac{\prod_{n=1}^{\infty} (1 + q^{n-1/2})}{\prod_{n=1}^{\infty} (1 - q^n)} = \frac{1}{\prod_{n=1}^{\infty} (1 - q^{n/2})(1 + q^n)}. \tag{13.1} \]

We shall also use Durfee rectangle identities, which hold for every $k \in \mathbb{Z}_{\ge 0}$:

\[ \frac{1}{\prod_{n \ge 1} (1 - q^{n/2})} = \sum_{n=0}^{\infty} \frac{q^{(n^2+kn)/2}}{(q^{1/2})_n (q^{1/2})_{n+k}} = \sum_{n=0}^{\infty} \frac{(-q^{1/2})_n (-q^{1/2})_{n+k}\, q^{(n^2+kn)/2}}{(q)_n (q)_{n+k}}. \tag{13.2} \]

Another useful elementary formula due to Euler is

\[ \eta(\tau) = q^{1/24} \sum_{n=0}^{\infty} \frac{(-1)^n q^{(n+1)n/2}}{(q)_n}. \tag{13.3} \]
For m = 1 there are three irreducible characters. We will focus here on f(τ ) 1 2 1, 3 (τ ) + (∂)1, 3 (τ ) . χ S (1) (τ ) = 2 2 η(τ ) 3 3

(13.4)

We first notice a theta-function identity (∂)1,3/2 (τ ) =

η(τ )3 , f(τ )2

(essentially, a consequence of the Jacobi triple product identity) or equivalently η(τ )2 f(τ ) (∂)1,3/2 (τ ) = . η(τ ) f(τ ) Now, we apply the relation f(τ ) =

η(τ )2 η(τ/2)η(2τ )

and (13.3), so we obtain ∞

f(τ ) (∂)1,3/2 (τ ) = η(2τ )η(τ/2) = q 5/48 (1 − q 2n )(1 − q n/2 ) η(τ ) = q 5/48

(m 1 ,m 2 )∈Z2≥0

n=1 (−1)m 1 +m 2 (−q 1/2 ; q 1/2 )

m2 q

(13.5)

m 1 (m 1 +1)+m 2 (m 2 +1)/4

(q 2 )m 1 (q)m 2

.

On the other hand the Durfee square identity (13.2) yields (after some computation)

\[ \frac{\mathfrak{f}(\tau)}{\eta(\tau)}\, \Theta_{1,3/2}(\tau) = \frac{q^{5/48}}{(-q; q)_\infty} \sum_{\substack{(m_1, m_2) \in \mathbb{Z}^2_{\ge 0} \\ m_1 \equiv m_2\ (2)}} \frac{(-q^{1/2}; q^{1/2})_{m_1} (-q^{1/2}; q^{1/2})_{m_2}\, q^{\frac{3(m_1-m_2)^2}{8} + \frac{m_1-m_2}{2} + \frac{m_1 m_2}{2}}}{(q)_{m_1} (q)_{m_2}}. \tag{13.6} \]

Evidently, the double fermionic expressions for $(\partial\Theta)_{1,3/2}(\tau)$ and $\Theta_{1,3/2}(\tau)$ (cf. formulas (13.5) and (13.6), respectively) appear to have little in common, so it is unclear to us whether (13.4) admits a representation as a closed double fermionic sum. Thus, it appears that the $m = 1$ case is rather different compared to the triplet $W(2)$. This is perhaps reflected by the fact that the $p = 2$ triplet admits a fermionic construction, while such a realization seems to be absent for $SW(1)$ and its modules.

13.2. Irreducible SW(m) characters from W(2m + 1) characters. In this part we will be using character formulas of irreducible $W(p)$-modules (see for instance (6.34) and (6.35) in [AM2], or [FHST]). Recall

\[ \mathfrak{f}_2(\tau) = q^{1/24} \prod_{n=1}^{\infty} (1 + q^n). \]

The first result in this part is
The first result in this part is Proposition 13.1.

(i) For 0 ≤ i ≤ m, we have χ S (i+1) (τ ) =

χ (2i+1) ( τ2 ) . f2 (τ )

(ii) For 0 ≤ i ≤ m − 1, we also have χ S(m−i) (τ ) =

χ(2m−2i) ( τ2 ) . f2 (τ )

Here (i) and (2m +2−i), i = 1, . . . , 2m +1, are irreducible W(2m +1)-modules [AM2]. Proof. The proof follows from character formulas for irreducible W( p)-modules, Theorem 12.1, and the following transformation formulas: 2 j,2m+1 (τ/2) =

j,

2m+1 (τ ), 2

(∂)2 j,2m+1 (τ/2) = 2(∂)

j,

2m+1 (τ ), 2

f(τ ) 1 = f2 (τ ). η(τ/2) η(τ )


We recall two multi-sum identities obtained recently by Warnaar [Wa] (these identities are essentially conjectures from [FGK]):

Theorem 13.1. For $\lambda = 0, \dots, p$ and $\sigma \in \{0, 1\}$ we have

\[ \sum_{\substack{n_1, \dots, n_p \ge 0 \\ n_{p-1} + n_p \equiv \sigma\ (2)}} \frac{q^{\sum_{i,j=1}^{p} B_{i,j} n_i n_j + \frac{\lambda}{2}(n_{p-1} - n_p + \sigma) - \frac{\sigma p}{4}}}{(q; q)_{n_1} \cdots (q; q)_{n_p}} = \frac{1}{(q; q)_\infty} \sum_{n \in \mathbb{Z}} q^{pn^2 + (\lambda - \sigma p)n} \tag{13.7} \]

and

\[ \sum_{\substack{n_1, \dots, n_p \ge 0 \\ n_{p-1} + n_p \equiv \sigma\ (2)}} \frac{q^{\sum_{i,j=1}^{p} B_{i,j} n_i n_j + \frac{\lambda}{2}(n_{p-1} + n_p + \sigma) + \sum_{i=p-\lambda}^{p-2} (i - p + \lambda + 1) n_i - \frac{\sigma p}{4}}}{(q; q)_{n_1} \cdots (q; q)_{n_p}} = \frac{1}{(q; q)_\infty} \sum_{n \in \mathbb{Z}} (2n - \sigma + 1)\, q^{pn^2 + (\lambda - \sigma p)n}, \tag{13.8} \]

where $B_{i,j}$ are entries of the inverse Cartan matrix of the Lie algebra $D_p$.

Equipped with Warnaar's formulas and Proposition 13.1 it is now not hard to prove the next result.

Theorem 13.2. We have the following formulas for irreducible $SW(m)$-characters:

\[ q^{-1/16} \chi_{S\Lambda(m+1)}(\tau) = \sum_{\substack{n_1, \dots, n_{2m+1} \ge 0 \\ n_{2m} + n_{2m+1} \equiv 0\ (2)}} \frac{(-q^{1/2}; q^{1/2})_{n_1} \cdots (-q^{1/2}; q^{1/2})_{n_{2m+1}}\, q^{\sum_{k,l=1}^{2m+1} B_{k,l} n_k n_l / 2}}{(-q; q)_\infty\, (q; q)_{n_1} \cdots (q; q)_{n_{2m+1}}}. \tag{13.9} \]

For $i = 0, \dots, m - 1$, we have

\[ q^{-a_{i,m}} \chi_{S\Lambda(i+1)}(\tau) = \sum_{\substack{n_1, \dots, n_{2m+1} \ge 0 \\ n_{2m} + n_{2m+1} \equiv 0\ (2)}} \frac{(-q^{1/2}; q^{1/2})_{n_1} \cdots (-q^{1/2}; q^{1/2})_{n_{2m+1}}\, q^{\sum_{k,l=1}^{2m+1} B_{k,l} n_k n_l / 2 + (m-i)(n_{2m} + n_{2m+1})/2 + \sum_{k=2i+1}^{2m-1} (k - 2i) n_k / 2}}{(-q; q)_\infty\, (q; q)_{n_1} \cdots (q; q)_{n_{2m+1}}} \]

and

\[ q^{-b_{i,m}} \chi_{S\Pi(m-i)}(\tau) = \sum_{\substack{n_1, \dots, n_{2m+1} \ge 0 \\ n_{2m} + n_{2m+1} \equiv 1\ (2)}} \frac{(-q^{1/2}; q^{1/2})_{n_1} \cdots (-q^{1/2}; q^{1/2})_{n_{2m+1}}\, q^{\sum_{k,l=1}^{2m+1} B_{k,l} n_k n_l / 2 + (m-i)(n_{2m} + n_{2m+1})/2 + \sum_{k=2i+1}^{2m-1} (k - 2i) n_k / 2}}{(-q; q)_\infty\, (q; q)_{n_1} \cdots (q; q)_{n_{2m+1}}}, \]

where $a_{i,m}$ and $b_{i,m}$ are certain rational numbers.

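Proof. We prove the middle formula only. The other two formulas follow along the same lines. Recall that

\[ \chi_{S\Lambda(i+1)}(\tau) = \frac{\mathfrak{f}(\tau)}{\eta(\tau)} \left( \frac{2i + 1}{2m + 1}\, \Theta_{m-i, \frac{2m+1}{2}}(\tau) + \frac{2}{2m + 1}\, (\partial\Theta)_{m-i, \frac{2m+1}{2}}(\tau) \right). \tag{13.10} \]

Now,

\[ \frac{2i + 1}{2m + 1}\, \Theta_{m-i, \frac{2m+1}{2}}(\tau) + \frac{2}{2m + 1}\, (\partial\Theta)_{m-i, \frac{2m+1}{2}}(\tau) = q^{(m-i)^2/(2(2m+1))} \sum_{n \in \mathbb{Z}} (2n + 1)\, q^{\frac{(2m+1)n^2 + 2(m-i)n}{2}}. \]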
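The rewriting of the theta combination as a single sum weighted by $2n + 1$ can be checked term by term with exact rational exponents. The sketch below assumes the series definitions $\Theta_{j,k}(\tau) = \sum_n q^{(2kn+j)^2/4k}$ and $(\partial\Theta)_{j,k}(\tau) = \sum_n (2kn+j) q^{(2kn+j)^2/4k}$ with $j = m - i$ and $k = (2m+1)/2$; the function names and the truncation range are hypothetical choices.

```python
from collections import defaultdict
from fractions import Fraction

def theta_combination(m, i, N=10):
    """(2i+1)/(2m+1) * Theta + 2/(2m+1) * dTheta, as {exponent: coefficient}."""
    j, k = m - i, Fraction(2 * m + 1, 2)
    series = defaultdict(Fraction)
    for n in range(-N, N + 1):
        lin = 2 * k * n + j                 # the linear form 2kn + j
        exponent = lin * lin / (4 * k)      # exponent of q in both theta series
        series[exponent] += Fraction(2 * i + 1, 2 * m + 1) + Fraction(2, 2 * m + 1) * lin
    return dict(series)

def single_sum(m, i, N=10):
    """q^{(m-i)^2/(2(2m+1))} * sum_n (2n+1) q^{((2m+1)n^2 + 2(m-i)n)/2}."""
    series = defaultdict(Fraction)
    shift = Fraction((m - i) ** 2, 2 * (2 * m + 1))
    for n in range(-N, N + 1):
        exponent = shift + Fraction((2 * m + 1) * n * n + 2 * (m - i) * n, 2)
        series[exponent] += 2 * n + 1
    return dict(series)
```

Both functions truncate at the same range of $n$, so the two dictionaries should coincide exactly; the underlying reason is that $(2i+1) + 2\big((2m+1)n + m - i\big) = (2m+1)(2n+1)$.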

Finally, if we substitute $q^{1/2}$ for $q$ in (13.8), let $p = 2m + 1$, $\sigma = 0$, $\lambda = 2m - 2i$, and apply formula (13.1) and the simple identity

\[ \frac{1}{(q^{1/2}; q^{1/2})_n} = \frac{(-q^{1/2}; q^{1/2})_n}{(q)_n}, \]

the proof follows.

14. A Conjectural Relation of SW(m) with Quantum Groups

Let $\hat{\mathfrak{g}}$ be an untwisted affine Kac-Moody Lie algebra. Then there is a well-known (Kazhdan-Lusztig) equivalence between the tensor category of $L_{\mathfrak{g}}(k, 0)$-modules, $k \in \mathbb{N}$, and the semisimple part of the tensor category of $U_q(\mathfrak{g})$-modules, where $q$ is a certain root of unity (not to be confused with $q = e^{2\pi i \tau}$ used in the previous section) depending on the level $k$ and $\mathfrak{g}$ [Fi]. Notice that on the quantum group side we have a semisimplified category, and not the full category of $U_q(\mathfrak{g})$-modules.

In [FGST1, FGST2] (see also [Se]) the authors proposed a remarkable equivalence between the (enhanced) tensor category of $W(p)$-modules and the category of $\overline{U}_q(sl_2)$-modules, $q = e^{i\pi/p}$, where $\overline{U}_q(sl_2)$ is the restricted finite-dimensional quantum group. While this is still a conjecture for $p > 2$, the same authors established an important weaker equivalence between the $SL(2, \mathbb{Z})$-module $Z_{cft}$ formed by generalized $W(p)$ characters and the $SL(2, \mathbb{Z})$-module $Z$, the center of $\overline{U}_q(sl_2)$. Thus, it is a natural question to find the Kazhdan-Lusztig dual of the category of ordinary and logarithmic $SW(m)$-modules. In our case the relevant space of generalized characters is the $\Gamma_\theta$-invariant subspace described in Theorem 12.1, which is $(3m + 1)$-dimensional.

As indicated in the introduction, we believe that the quantum group $U_q^{small}(sl_2)$, $q = e^{\frac{2i\pi}{2m+1}}$, is relevant for the supertriplet $SW(m)$. Here is some evidence. Firstly, both $SW(m)$ and $U_q^{small}(sl_2)$ have the same number of inequivalent irreducible representations. Also, in [Ker] (see also [La]) it was proven that the center of $U_q^{small}(sl_2)$ is $(3m + 1)$-dimensional, and that it carries a projective action of the modular group. Notice that $3m + 1$ is also (conjecturally) the dimension of the center of $A(SW(m))$. Thus, in parallel with [FGST1], we expect the following conjecture to be true.

Conjecture 14.1. The category of weak $SW(m)$-modules is equivalent to the category of $U_q^{small}(sl_2)$-modules, where $q = e^{\frac{2\pi i}{2m+1}}$.

Finally, Proposition 13.1 is a strong indication of the possibility that the category of $SW(m)$-modules is related to a subcategory of $W(2m + 1)$- and $\overline{U}_q(sl_2)$-modules, $q = e^{\pi i/(2m+1)}$.

15. Outlook and Final Remarks

There are several research directions we plan to pursue in the future. Let us mention only a few that we found the most interesting.

(i) The most important problem that we left open is the existence and description of logarithmic $SW(m)$-modules. We strongly believe the ideas based on modular invariance as in [AM2] could be successfully applied to the super triplet.
(ii) As with any $N = 1$ vertex operator superalgebra, the most obvious next step would be to examine the category of $\sigma$-twisted $SW(m)$-modules, where $\sigma$ is the parity automorphism. As we already indicated (cf. Theorem 12.1), the space of $SL(2, \mathbb{Z})$-transforms of irreducible $SW(m)$-characters should close on a finite-dimensional vector space. Supposedly, characters of irreducible $\sigma$-twisted modules are included in the same vector space (cf. Remark 12.1); see [AM3].
(iii) Singular vectors in Feigin-Fuchs modules for the $N = 1$ Neveu-Schwarz algebra certainly deserve more attention. We expect these vectors to have a description in terms of modified Jack polynomials and as kernels of super Calogero-Sutherland operators. Similar results for the Virasoro algebra have been obtained in [MY].
(iv) Our fermionic expressions for the $SW(m)$-characters indicate a possibility of parafermionic (or quasiparticle) bases for $SW(m)$-modules. For the triplet $W(p)$ this problem has been resolved in [FFT].

16. Appendix

Here we prove Theorem 11.1 and give strong evidence that in Proposition 11.3 the polynomial $g(x)$ is nonzero for every $m$. In the process of proving these results we discovered certain constant term identities which are of independent interest. We recall

\[ U^{F,E} := \mathrm{Res}_z\, \frac{(1 + z)^{2m}}{z}\, Y(F, z)E \in \overline{SM(1)}. \]

Then we have

\[ U^{F,E}(0) := o(U^{F,E}) = \sum_{i \ge 0} \binom{2m}{i} o(F_{i-1} E). \]

In Proposition 8.4 we proved that inside $A(\overline{SM(1)})$ we have the relation

\[ [U^{F,E}] = g([\omega])[\widehat{H}]. \]

Because of the homomorphism from $A(\overline{SM(1)})$ to $A(SW(m))$ and Proposition 11.3, it is sufficient to show that $U^{F,E}(0)$ acts nontrivially on the top components of at least one $\overline{SM(1)}$-module $M(1, \lambda) \otimes F$.

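Proposition 16.1. Let $v_\lambda$ be the highest weight vector in $M(1, \lambda) \otimes F$. Then we have

\[ U^{F,E}(0) \cdot v_\lambda = -G_m(t) v_\lambda, \]

where $t = \langle \lambda, \alpha \rangle$ and

\[ G_m(t) = \sum_{l=1}^{2m+1} \sum_{i=0}^{l-1} \sum_{j=0}^{2m+1+i-l} \sum_{k=0}^{l-1-i} (-1)^{j+k+l} \binom{2m+1}{l} \binom{-2m-1}{j} \binom{-2m-1}{k} \binom{2m-t}{j+k+2m+1} \binom{t}{i-j-l+2m+1} \binom{t}{l-k-1-i}. \]

Proof. It is not hard to see that

\[ U^{F,E} = \sum_{i=0}^{2m} \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \mathrm{Res}_{z_3}\, \frac{(1 + z_1)^{2m}}{z_1}\, z_2^{-i-1} z_3^{i}\, Y(e^{-\alpha}, z_1) Y(e^{\alpha}, z_2) Y(e^{\alpha}, z_3) e^{-\alpha} + w, \]

where $w \in T$. By repeatedly using the well-known formula (cf. [LL])

\[ E^{+}(\delta, x) E^{-}(\gamma, y) = (1 - y/x)^{\langle \delta, \gamma \rangle} E^{-}(\gamma, y) E^{+}(\delta, x), \]

which holds for every $\delta, \gamma \in \mathbb{Z}\beta$, we get

\[ U^{F,E} = \sum_{i=0}^{2m} \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \mathrm{Res}_{z_3}\, \frac{(1 + z_1)^{2m}}{z_1}\, z_2^{-i-1} z_3^{i} (z_1 z_2 z_3)^{-2m-1} (1 - z_2/z_1)^{-2m-1} (1 - z_3/z_1)^{-2m-1} (z_2 - z_3)^{2m+1} E^{-}(\alpha, z_1) E^{-}(-\alpha, z_2) E^{-}(-\alpha, z_3) + w. \]

The previous formula together with

\[ o(E^{-}(\beta, x)) \cdot v_\lambda = (1 + x)^{-\langle \beta, \lambda \rangle} v_\lambda \]

and $o(w) v_\lambda = 0$ implies

\[ U^{F,E}(0) \cdot v_\lambda = \sum_{i=0}^{2m} \mathrm{Res}_{z_1} \mathrm{Res}_{z_2} \mathrm{Res}_{z_3}\, \frac{(1 + z_1)^{2m}}{z_1}\, z_2^{-i-1} z_3^{i} (z_1 z_2 z_3)^{-2m-1} (1 - z_2/z_1)^{-2m-1} (1 - z_3/z_1)^{-2m-1} (z_2 - z_3)^{2m+1} (1 + z_1)^{-t} (1 + z_2)^{t} (1 + z_3)^{t} v_\lambda. \]

The rest follows by expanding generalized rational functions with respect to standard conventions in vertex algebra theory and extracting the residues in all three variables.

If we view the parameter $t$ as a variable, the expression $G_m(t)$ is a polynomial in $t$ of degree at most $4m + 1$. However, it is a priori not clear that the polynomial $G_m(t)$ is nonzero. We made some computations for small $m$ and we came up with the following hypothesis.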
The rest follows by expanding the generalized rational functions according to the standard conventions of vertex algebra theory and extracting the residues in all three variables.

If we view the parameter t as a variable, the expression $G_m(t)$ is a polynomial in t of degree at most 4m + 1. However, it is a priori not clear that the polynomial $G_m(t)$ is nonzero. We made some computations for small m and arrived at the following conjecture.



Conjecture 16.1.

$$G_m(t) = \binom{2m}{m}^{2} \binom{t+m}{4m+1}.$$
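Since the sum in Proposition 16.1 is finite for each m, the conjectured closed form can be tested numerically at integer values of t. The following is a minimal Python sketch of such a test, not the authors' Mathematica code; the helper names `gbinom`, `G`, `conjectured` and `mismatches` are our own, and binomial coefficients with negative upper argument are assumed to follow the usual convention $\binom{x}{n} = x(x-1)\cdots(x-n+1)/n!$.

```python
from math import comb, factorial


def gbinom(x: int, n: int) -> int:
    """Generalized binomial coefficient x(x-1)...(x-n+1)/n! for any integer x."""
    if n < 0:
        return 0
    num = 1
    for r in range(n):
        num *= x - r
    # A product of n consecutive integers is divisible by n!, so this is exact.
    return num // factorial(n)


def G(m: int, t: int) -> int:
    """Quadruple sum of Proposition 16.1, evaluated at an integer t."""
    total = 0
    for l in range(1, 2 * m + 2):                 # l = 1, ..., 2m+1
        for i in range(l):                        # i = 0, ..., l-1
            for j in range(2 * m + 2 + i - l):    # j = 0, ..., 2m+1+i-l
                for k in range(l - i):            # k = 0, ..., l-1-i
                    total += (
                        (-1) ** (j + k + l)
                        * comb(2 * m + 1, l)
                        * gbinom(-2 * m - 1, j)
                        * gbinom(-2 * m - 1, k)
                        * gbinom(2 * m - t, j + k + 2 * m + 1)
                        * gbinom(t, i - j - l + 2 * m + 1)
                        * gbinom(t, l - k - 1 - i)
                    )
    return total


def conjectured(m: int, t: int) -> int:
    """Right-hand side of Conjecture 16.1."""
    return comb(2 * m, m) ** 2 * gbinom(t + m, 4 * m + 1)


def mismatches(m_max: int, ts) -> list:
    """Pairs (m, t) at which the sum and the conjectured closed form differ."""
    return [
        (m, t)
        for m in range(m_max + 1)
        for t in ts
        if G(m, t) != conjectured(m, t)
    ]
```

For instance, a direct hand computation gives G(0, t) = t and G(1, 4) = 4, in agreement with the conjectured form; `mismatches(3, range(-5, 12))` lists any disagreement for m ≤ 3 on that range of t. The number of terms grows quickly with m, so this brute-force evaluation is only practical for small m.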

We checked this conjecture with Mathematica for every m ≤ 20. As in Sect. 11, by using the representation theory of SW(m) it is not hard to see that $\binom{t+m}{4m+1}$ must divide $G_m(t)$ for every m. Since $\deg(G_m(t)) \le 4m + 1$, we have

$$G_m(t) = A_m \binom{t+m}{4m+1}, \qquad (16.1)$$

for some constant $A_m$. But even proving $A_m \neq 0$ seems to be a nontrivial problem.

Proposition 16.2. Let 2m + 1 be prime. Then $G_m(t) \neq 0$.

Proof. We will prove this result by reduction mod 2m + 1. Let p = 2m + 1 be a prime. It is not hard to see that in fact $G_m(a) \in \mathbb{Z}_{(p)}$ for every $a \in \mathbb{Z}$ (in other words, $G_m(a)$ is p-integral). Thus it is sufficient to prove that for some $t = t_0$ we have $G_m(t_0) \not\equiv 0 \mod p$. We take $t_0 = 3m + 1$ and examine

$$G_m(3m+1) = \sum_{l=1}^{2m+1} \sum_{i=0}^{l-1} \sum_{j=0}^{2m+1+i-l} \sum_{k=0}^{l-1-i} (-1)^{j+k+l} \binom{2m+1}{l} \binom{-2m-1}{j} \binom{-2m-1}{k} \binom{-m-1}{j+k+2m+1} \binom{3m+1}{i-j-l+2m+1} \binom{3m+1}{l-k-1-i}.$$

The finite sum $G_m(3m+1)$ has many terms divisible by p. For instance, in the summation, all terms $\binom{2m+1}{l} \equiv 0 \mod p$ unless l = 2m + 1. After some analysis it is not hard to see that a possible nontrivial (mod p) contribution comes only if k = j = 0 and l = 2m + 1 (in other cases at least one binomial coefficient is divisible by p). Thus we get:

$$G_m(3m+1) \equiv \sum_{i=0}^{2m} (-1) \binom{-m-1}{2m+1} \binom{3m+1}{i} \binom{3m+1}{2m-i} \mod p.$$

Observe the basic relation

$$\binom{-m-1}{2m+1} = -\binom{3m+1}{2m+1} = -\binom{3m+1}{m}.$$

Also, for i as in the summation we have

$$\binom{3m+1}{i} \binom{3m+1}{2m-i} \equiv 0 \mod p, \qquad i \neq m.$$
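The congruences entering this reduction are easy to test for small primes. The sketch below is an illustration only, with function names of our own choosing; it checks, for p = 2m + 1 assumed prime, that i = m is the only index with a product nonzero mod p, and it also evaluates the residues of $\binom{3m+1}{m}$ and $\binom{2m}{m}$ that the argument goes on to use.

```python
from math import comb


def surviving_indices(m: int) -> list:
    """Indices i in 0..2m whose product term is nonzero mod p = 2m + 1."""
    p = 2 * m + 1
    return [
        i
        for i in range(2 * m + 1)
        if (comb(3 * m + 1, i) * comb(3 * m + 1, 2 * m - i)) % p != 0
    ]


def check(m: int) -> bool:
    """Check the mod-p facts used in the proof, for p = 2m + 1 assumed prime."""
    p = 2 * m + 1
    only_middle = surviving_indices(m) == [m]             # only i = m survives
    middle_is_one = comb(3 * m + 1, m) % p == 1           # binom(3m+1, m) = 1 mod p
    central_is_pm_one = comb(2 * m, m) % p in (1, p - 1)  # binom(2m, m) = +-1 mod p
    return only_middle and middle_is_one and central_is_pm_one
```

Both vanishing statements are instances of Lucas' theorem: since 3m + 1 = p + m, one has $\binom{3m+1}{i} \equiv \binom{m}{i} \mod p$ for 0 ≤ i ≤ 2m, which vanishes exactly when i > m.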



However, for i = m we have

$$\binom{3m+1}{m} \equiv 1 \cdot 1^{-1} \cdot 2 \cdot 2^{-1} \cdots m \cdot m^{-1} \equiv 1 \mod p.$$

Consequently, the summation reduces to a single term

$$G_m(3m+1) \equiv \binom{3m+1}{m}^{3} \equiv 1 \mod p.$$

Notice that the previous computations support our Conjecture 16.1 because

$$\binom{2m}{m} \equiv \pm 1 \mod p,$$

so that for t = 3m + 1,

$$\binom{2m}{m}^{2} \binom{t+m}{4m+1} = \binom{2m}{m}^{2} \binom{4m+1}{4m+1} \equiv 1 \mod p.$$

Remark 16.1. Because of the interesting arithmetic involved in Propositions 16.1 and 16.2, we plan to return to Conjecture 16.1 in our future work.

Acknowledgement. We thank the anonymous referee for his/her valuable comments.

References

[Ab] Abe, T.: A $\mathbb{Z}_2$-orbifold model of the symplectic fermionic vertex operator superalgebra. Math. Z. 255, 755–792 (2007)
[A1] Adamović, D.: Rationality of Neveu-Schwarz vertex operator superalgebras. Internat. Math. Res. Notices 17, 865–874 (1997)
[A2] Adamović, D.: Representations of the vertex algebra $W_{1+\infty}$ with a negative integer central charge. Comm. Algebra 29(7), 3153–3166 (2001)
[A3] Adamović, D.: Classification of irreducible modules of certain subalgebras of free boson vertex algebra. J. Algebra 270, 115–132 (2003)
[AM1] Adamović, D., Milas, A.: Logarithmic intertwining operators and W(2, 2p − 1)-algebras. J. Math. Physics 48, 073503 (2007)
[AM2] Adamović, D., Milas, A.: On the triplet vertex algebra W(p). Adv. Math. 217, 2664–2699 (2008)
[AM3] Adamović, D., Milas, A.: The N = 1 triplet vertex operator superalgebras: twisted sector. SIGMA 4(87), 1–24 (2008)
[Ar] Arakawa, T.: Representation theory of superconformal algebras and the Kac-Roan-Wakimoto conjecture. Duke Math. J. 130, 435–478 (2005)
[BS] Bouwknegt, P., Schoutens, K.: W–symmetry in Conformal Field Theory. Phys. Rept. 223, 183–276 (1993)
[CF] Carqueville, N., Flohr, M.: Nonmeromorphic operator product expansion and $C_2$-cofiniteness for a family of W-algebras. J. Phys. A: Math. Gen. 39, 951–966 (2006)
[D] Dong, C.: Vertex algebras associated with even lattices. J. Algebra 160, 245–265 (1993)
[DL] Dong, C., Lepowsky, J.: Generalized vertex algebras and relative vertex operators. Boston: Birkhäuser, 1993
[DJ] Dong, C., Jiang, C.: Rationality of vertex operator algebras. http://arxiv.org/abs/math/0607679.v1[math.QA], 2006
[DK] De Sole, A., Kac, V.: Finite vs. affine W-algebras. Japanese J. Math. 1, 137–261 (2006)
[EFHHNV] Eholzer, W., Flohr, M., Honecker, A., Hubel, R., Nahm, W., Varnhagen, R.: Representations of W–algebras with two generators and new rational models. Nucl. Phys. B 383, 249–288 (1992)
[FFR] Feingold, A.J., Frenkel, I.B., Ries, J.: Spinor Construction of Vertex Operator Algebras, Triality, and $E_8^{(1)}$. Cont. Math. 121, Providence, RI: Amer. Math. Soc., 1991
[FRW] Feingold, A.J., Ries, J., Weiner, M.: Spinor construction of the c = 1/2 minimal model. In: Moonshine, The Monster and related topics, Contemporary Math. 193, Chongying Dong, Geoffrey Mason, eds., Providence, RI: Amer. Math. Soc., 1995, pp. 45–92
[FF] Feigin, B., Fuchs, D.B.: Representations of the Virasoro algebra. In: Representations of infinite-dimensional Lie groups and Lie algebras. New York: Gordon and Breach, 1989
[FFT] Feigin, B., Feigin, E., Tipunin, I.: Fermionic formulas for (1, p) logarithmic model characters in $\Phi_{2,1}$ quasiparticle realisation. http://arxiv.org/abs/0704.2464.v4[hep-th], 2007
[FGST1] Feigin, B.L., Gaĭnutdinov, A.M., Semikhatov, A.M., Tipunin, I.Yu.: Modular group representations and fusion in logarithmic conformal field theories and in the quantum group center. Commun. Math. Phys. 265, 47–93 (2006)
[FGST2] Feigin, B.L., Gaĭnutdinov, A.M., Semikhatov, A.M., Tipunin, I.Yu.: The Kazhdan-Lusztig correspondence for the representation category of the triplet W-algebra in logarithmic conformal field theories. Theor. Math. Phys. 148, 1210–1235 (2006)
[FHL] Frenkel, I.B., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Amer. Math. Soc. 104, 1993
[Fi] Finkelberg, M.: An equivalence of fusion categories. Geom. Funct. Anal. 249–267 (1996)
[F1] Flohr, M.: On modular invariant partition functions of conformal field theories with logarithmic operators. Internat. J. Modern Phys. A 11, 4147–4172 (1996)
[F2] Flohr, M.: Bits and pieces in logarithmic conformal field theory. Proceedings of the School and Workshop on Logarithmic Conformal Field Theory and Its Applications (Tehran, 2001). Internat. J. Mod. Phys. A 18, 4497–4591 (2003)
[FGK] Flohr, M., Grabow, C., Koehn, M.: Fermionic formulas for the characters of $c_{p,1}$ logarithmic field theory. Nucl. Phys. B 768, 263–276 (2007)
[FB] Frenkel, E., Ben-Zvi, D.: Vertex algebras and algebraic curves. Mathematical Surveys and Monographs, no. 88. Providence, RI: Amer. Math. Soc., 2001
[FLM] Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math, Vol. 134. New York: Academic Press, 1988
[FZ] Frenkel, I.B., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[FHST] Fuchs, J., Hwang, S., Semikhatov, A.M., Tipunin, I.Yu.: Nonsemisimple fusion algebras and the Verlinde formula. Commun. Math. Phys. 247(3), 713–742 (2004)
[GK1] Gaberdiel, M., Kausch, H.G.: A rational logarithmic conformal field theory. Phys. Lett. B 386, 131–137 (1996)
[GK2] Gaberdiel, M., Kausch, H.G.: A local logarithmic conformal field theory. Nucl. Phys. B 538, 631–658 (1999)
[GL] Gao, Y., Li, H.: Generalized vertex algebras generated by parafermion-like vertex operators. J. Algebra 240, 771–807 (2001)
[HK] Heluani, R., Kac, V.: SUSY lattice vertex algebras. http://arxiv.org/abs/0710.1587.v1[math.QA], 2007
[H] Honecker, A.: Automorphisms of W algebras and extended rational conformal field theories. Nucl. Phys. B 400, 574–596 (1993)
[HLZ] Huang, Y.-Z., Lepowsky, J., Zhang, L.: Logarithmic tensor product theory for generalized modules for a conformal vertex algebra. http://arxiv.org/abs/0710.2687.v3[math.QA], 2007
[HM] Huang, Y.-Z., Milas, A.: Intertwining operator superalgebras and vertex tensor categories for superconformal algebras, I. Commun. Contemp. Math. 4, 327–355 (2002)
[IK1] Iohara, K., Koga, Y.: Fusion algebras for N = 1 superconformal field theories through coinvariants, II, N = 1 super-Virasoro-symmetry. J. Lie Theory 11, 305–337 (2001)
[IK2] Iohara, K., Koga, Y.: Representation theory of the Neveu-Schwarz and Ramond algebras II: Fock modules. Ann. Inst. Fourier, Grenoble 53(6), 1755–1818 (2003)
[K] Kac, V.G.: Vertex Algebras for Beginners. University Lecture Series, Second Edition. Providence, RI: Amer. Math. Soc., Vol. 10, 1998
[KWn] Kac, V.G., Wang, W.: Vertex operator superalgebras and their representations. Contemp. Math. 175, 161–191 (1994)
[KWak] Kac, V., Wakimoto, M.: Quantum Reduction and Representation Theory of Superconformal Algebras. Adv. Math. 185, 400–458 (2004)
[Ker] Kerler, T.: Mapping class group actions on quantum doubles. Commun. Math. Phys. 168(2), 353–388 (1995)
[La] Lachowska, A.: On the center of the small quantum group. J. Algebra 262, 313–331 (2003)
[LL] Lepowsky, J., Li, H.: Introduction to Vertex Operator Algebras and Their Representations. Progress in Mathematics Vol. 227. Boston: Birkhäuser, 2003
[Li] Li, H.: Local systems of vertex operators, vertex superalgebras and modules. J. Pure Appl. Algebra 109, 143–195 (1996)
[MS] Mavromatos, N., Szabo, R.: The Neveu-Schwarz and Ramond algebras of logarithmic superconformal field theory. JHEP 0301, 041 (2003)
[MR] Meurman, A., Rocha-Caridi, A.: Highest weight representations of the Neveu-Schwarz and Ramond algebras. Commun. Math. Phys. 107, 263–294 (1986)
[M1] Milas, A.: Fusion rings for degenerate minimal models. J. Algebra 254(2), 300–335 (2002)
[M2] Milas, A.: Weak modules and logarithmic intertwining operators for vertex operator algebras. In: Recent developments in infinite-dimensional Lie algebras and conformal field theory (Charlottesville, VA, 2000), Contemp. Math. 297, Providence, RI: Amer. Math. Soc., 2002, pp. 201–225
[M3] Milas, A.: Logarithmic intertwining operators and vertex operators. Commun. Math. Phys. 277, 497–529 (2008)
[M4] Milas, A.: Characters, Supercharacters and Weber modular functions. Crelle's Journal 608, 35–64 (2007)
[MY] Mimachi, K., Yamada, Y.: Singular vectors of the Virasoro algebra in terms of Jack symmetric polynomials. Commun. Math. Phys. 174, 447–455 (1995)
[Miy] Miyamoto, M.: Modular invariance of vertex operator algebras satisfying $C_2$-cofiniteness. Duke Math. J. 122, 51–91 (2004)
[Se] Semikhatov, A.: Factorizable ribbon quantum groups in logarithmic conformal field theories. Theor. Math. Phys. 154, 433–453 (2008)
[Wa] Warnaar, S.O.: Proof of the Flohr-Grabow-Koehn conjecture for characters of logarithmic field theory. J. Phys. A: Math. Theor. 40, 12243–12254 (2007)
[W] Wang, W.: Rationality of Virasoro vertex operator algebras. Internat. Math. Res. Notices 71(1), 197–211 (1993)
[Z] Zhu, Y.-C.: Modular invariance of characters of vertex operator algebras. J. Amer. Math. Soc. 9, 237–302 (1996)

Communicated by Y. Kawahigashi




On the Reeh-Schlieder Property in Curved Spacetime

Ko Sanders

Department of Mathematics, University of York, Heslington, York, YO10 5DD, United Kingdom. E-mail: [email protected]; [email protected]

Received: 26 February 2008 / Accepted: 27 November 2008
Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Dedicated to Klaas Landsman, out of gratitude for the support he offered when it was most needed

Abstract: We attempt to prove the existence of Reeh-Schlieder states on curved spacetimes in the framework of locally covariant quantum field theory using the idea of spacetime deformation and assuming the existence of a Reeh-Schlieder state on a diffeomorphic (but not isometric) spacetime. We find that physically interesting states with a weak form of the Reeh-Schlieder property always exist and indicate their usefulness. Algebraic states satisfying the full Reeh-Schlieder property also exist, but are not guaranteed to be of physical interest.

1. Introduction

The Reeh-Schlieder theorem ([17]) is a result in axiomatic quantum field theory which states that for a scalar Wightman field in Minkowski spacetime any state in the Hilbert space can be approximated arbitrarily well by acting on the vacuum with operations performed in any prescribed open region. The physical meaning of this is that the vacuum state has very many non-local correlations and an experimenter in any given region can exploit the vacuum fluctuations by performing a suitable measurement in order to produce any desired state up to arbitrary accuracy.

The original proof uses analytic continuation arguments, an approach which was extended to analytic spacetimes in [20] by replacing the spectrum condition of the Wightman axioms by an analytic microlocal spectrum condition. For spacetimes which are not analytic, a result by Strohmaier [19], extending an earlier result by Verch [21], shows that in a stationary spacetime all ground and thermal (KMS-)states of several types of free fields (including the Klein-Gordon, Dirac and Proca field) also have the Reeh-Schlieder property. To prove the existence of such states directly one may need to make further assumptions, depending on the type of field (see [19]). Furthermore, the condition of [20] can be weakened to a smoothly covariant condition that implies the Reeh-Schlieder property as well as physical relevance (i.e. the microlocal spectrum condition), but this condition does not seem to be a suitable tool to find such states (see [18] Sect. 5.4).


In this paper we will investigate whether we can find states of a quantum field system in a general (globally hyperbolic) curved spacetime which have the Reeh-Schlieder property. We do this using the technique of spacetime deformation, as pioneered in [9] and as applied successfully to prove a spin-statistics theorem in curved spacetime in [23]. This means that we assume the existence of a Reeh-Schlieder state (i.e. a state with the Reeh-Schlieder property) in one spacetime and try to derive the existence of another state in a diffeomorphic (but not isometric) spacetime which also has the Reeh-Schlieder property. We will prove that for every given region there is a state in the physical state space that has the Reeh-Schlieder property for that particular region (but maybe not for all regions). Algebraic states with the full Reeh-Schlieder property also exist, i.e. states which have the Reeh-Schlieder property for all open regions simultaneously. However, their existence follows from an abstract existence principle and, consequently, such states are not guaranteed to be of any physical interest.

To keep the discussion as general as possible we will work in the axiomatic language known as locally covariant quantum field theory as introduced in [5] (see also [23], where some of these ideas already appeared, and [6] for a recent application). We outline this formulation in Sect. 2 and our most important assumption there will be the time-slice axiom, which expresses the existence of a causal dynamical law. In Sect. 3 we will prove the geometric results on spacetime deformation that we need and we will see what they mean for a locally covariant quantum field theory. Section 4 contains our main results on deforming one Reeh-Schlieder state into another one and it notes some immediate consequences regarding the type of local algebras and Tomita-Takesaki modular theory. As an example we discuss the free scalar field in Sect. 5 and we end with a few conclusions.

2. Locally Covariant Quantum Field Theory

In this section we briefly describe the main ideas of locally covariant quantum field theory as introduced in [5]. It will also serve to fix our notation for the subsequent sections.

In the following any quantum physical system will be described by a $C^*$-algebra $\mathcal{A}$ with a unit I, whose self-adjoint elements are the observables of the system. It will be advantageous to consider a whole class of possible systems rather than just one.

Definition 2.1. The category Alg has as its objects all unital $C^*$-algebras $\mathcal{A}$ and as its morphisms all injective *-homomorphisms α such that α(I) = I. The product of morphisms is given by the composition of maps and the identity map $\mathrm{id}_{\mathcal{A}}$ on a given object serves as an identity morphism.

A morphism $\alpha : \mathcal{A}_1 \to \mathcal{A}_2$ expresses the fact that the system described by $\mathcal{A}_1$ is a subsystem of that described by $\mathcal{A}_2$, which is called a super-system. The injectivity of the morphisms means that, as a matter of principle, any observable of a sub-system can always be measured, regardless of any practical restrictions that a super-system may impose.

A state of a system is represented by a normalised positive linear functional ω, i.e. $\omega(A^*A) \ge 0$ for all $A \in \mathcal{A}$ and ω(I) = 1. The set of all states on $\mathcal{A}$ will be denoted by $\mathcal{A}^{*+}_1$. Not all of these states are of physical interest, so it will be convenient to have the following notion at our disposal.

Definition 2.2. The category States has as its objects all subsets $S \subset \mathcal{A}^{*+}_1$, for all unital $C^*$-algebras $\mathcal{A}$ in Alg, and as its morphisms all maps $\alpha^* : S_1 \to S_2$ for which $S_i \subset (\mathcal{A}_i)^{*+}_1$, i = 1, 2, and $\alpha^*$ is the restriction of the dual of a morphism $\alpha : \mathcal{A}_2 \to \mathcal{A}_1$ in Alg, i.e. $\alpha^*(\omega) = \omega \circ \alpha$ for all $\omega \in S_1$. Again the product of morphisms is given by the composition of maps and the identity map $\mathrm{id}_S$ on a given object serves as an identity morphism.

After these operational aspects we now turn to the physical ones. The systems we will consider are intended to model quantum fields living in a (region of) spacetime which is endowed with a fixed Lorentzian metric (a background gravitational field). The relation between sub-systems will come about naturally by considering sub-regions of spacetime. More precisely we consider the following:

Definition 2.3. By the term globally hyperbolic spacetime we will mean a connected, Hausdorff, paracompact, $C^\infty$ Lorentzian manifold M = (M, g) of dimension d = 4, which is oriented, time-oriented and admits a Cauchy surface. A subset O ⊂ M of a globally hyperbolic spacetime M is called causally convex iff for all x, y ∈ O all causal curves from x to y lie entirely in O. A non-empty open set which is connected and causally convex is called a causally convex region or cc-region. A cc-region whose closure is compact is called a bounded cc-region. The category Man has as its objects all globally hyperbolic spacetimes M = (M, g) and its morphisms $\Psi$ are given by all maps $\psi : M_1 \to M_2$ which are smooth isometric embeddings (i.e. $\psi : M_1 \to \psi(M_1)$ is a diffeomorphism and $\psi_* g_1 = g_2|_{\psi(M_1)}$) such that the orientation and time-orientation are preserved and $\psi(M_1)$ is causally convex. Again the product of morphisms is given by the composition of maps and the identity map $\mathrm{id}_M$ on a given object serves as a unit.

A region O in a globally hyperbolic spacetime is causally convex if and only if O itself is globally hyperbolic (see [11] Sect. 6.6), so a cc-region is exactly a connected globally hyperbolic region. The image of a morphism is by definition a cc-region. Notice that the converse also holds. If O ⊂ M is a cc-region then $(O, g|_O)$ defines a globally hyperbolic spacetime in its own right. In this case there is a canonical morphism $I_{M,O} : O \to M$ given by the canonical embedding $\iota : O \to M$. We will often drop $I_{M,O}$ and ι from the notation and simply write O ⊂ M.

The importance of causally convex sets is that for any morphism $\Psi$ the causality structure of $M_1$ coincides with that of $\psi(M_1)$ in $M_2$:

$$\psi(J^{\pm}_{M_1}(x)) = J^{\pm}_{M_2}(\psi(x)) \cap \psi(M_1), \qquad x \in M_1. \qquad (1)$$

If this were not the case then the behaviour of a quantum physical system living in $M_1$ could depend in an essential way on the super-system, which makes it practically impossible to study the smaller system as a sub-system in its own right. This possibility is therefore excluded from the mathematical framework.

Equation (1) allows us to drop the subscript in $J^{\pm}_M$ if we introduce the convention that $J^{\pm}$ is always taken in the largest spacetime under consideration. This simplifies the notation without causing any confusion, even when $O \subset M_1 \subset M_2$ with canonical embeddings, because then we just have $J^{\pm}(O) := J^{\pm}_{M_2}(O)$ and $J^{\pm}_{M_1}(O) = J^{\pm}(O) \cap M_1$. Similarly we take by convention

$$D(O) := D_{M_2}(O), \qquad O^{\perp} := O^{\perp_{M_2}} := M_2 \setminus J(O),$$


and we deduce from causal convexity that $D_{M_1}(O) = D(O) \cap M_1$ and $O^{\perp_{M_1}} = O^{\perp} \cap M_1$. The following lemma gives some ways of obtaining causally convex sets in a globally hyperbolic spacetime.

Lemma 2.4. Let M = (M, g) be a globally hyperbolic spacetime, O ⊂ M an open subset and A ⊂ M an achronal set. Then:

1. the intersection of two causally convex sets is causally convex,
2. for any subset S ⊂ M the sets $I^{\pm}(S)$ are causally convex,
3. $O^{\perp}$ is causally convex,
4. O is causally convex iff $O = J^{+}(O) \cap J^{-}(O)$,
5. int(D(A)) and int($D^{\pm}(A)$) are causally convex,
6. if O is a cc-region, then D(O) is a cc-region,
7. if S ⊂ M is an acausal continuous hypersurface, then D(S), $D(S) \cap I^{+}(S)$ and $D(S) \cap I^{-}(S)$ are open and causally convex.

Proof. The first two items follow directly from the definitions. The fourth follows from $J^{+}(O) \cap J^{-}(O) = \bigcup_{p,q \in O} (J^{+}(p) \cap J^{-}(q))$, which is contained in O if and only if O is causally convex. The fifth item follows from the first two and Theorem 14.38 and Lemma 14.6 in [14]. To prove the third item, assume that γ is a causal curve between points x, y in $O^{\perp}$ and p ∈ J(O) lies on γ. By perturbing one of the endpoints of γ in $O^{\perp}$ we may ensure that the curve is time-like. Then we may perturb p on γ so that p ∈ int(J(O)) and γ is still causal. This gives a contradiction, because there then exists a causal curve from O through p to either x or y. For the sixth statement we let S ⊂ O be a smooth Cauchy surface for O (see [3]) and note that D(O) is non-empty, connected and D(O) = D(S). The causal convexity of O implies that S ⊂ M is acausal, which reduces this case to statement seven. The first part of statement seven is just Lemma 14.43 and Theorem 14.38 in [14]. The rest of statement seven follows from statements one and two together with the openness of $I^{\pm}(S)$.

We now come to the main set of definitions, which combine the notions introduced above (see [5]).

Definition 2.5. A locally covariant quantum field theory is a covariant functor A : Man → Alg, written as $M \mapsto \mathcal{A}_M$, $\Psi \mapsto \alpha_\Psi$. A state space for a locally covariant quantum field theory A is a contravariant functor S : Man → States, such that for all objects M we have $M \mapsto S_M \subset (\mathcal{A}_M)^{*+}_1$ and for all morphisms $\Psi : M_1 \to M_2$ we have $\Psi \mapsto \alpha_\Psi^*|_{S_{M_2}}$. The set $S_M$ is called the state space for M.

When it is clear that $\Psi = I_{M,O}$ for a canonical embedding $\iota : O \to M$ of a cc-region O in a globally hyperbolic spacetime M, i.e. when O ⊂ M, we will often simply write $\mathcal{A}_O \subset \mathcal{A}_M$ instead of using $\alpha_{I_{M,O}}$. For a morphism $\Psi : M \to M'$ which restricts to a morphism $\Psi|_O : O \to O' \subset M'$ we then have

$$\alpha_{\Psi|_O} = \alpha_\Psi|_{\mathcal{A}_O} \qquad (2)$$

rather than $\alpha_{I_{M',O'}} \circ \alpha_{\Psi|_O} = \alpha_\Psi \circ \alpha_{I_{M,O}}$, as one can see from a commutative diagram.

The framework of locally covariant quantum field theory is a generalisation of algebraic quantum field theory (see [5,10]). We now proceed to discuss several physically



desirable properties that such a locally covariant quantum field theory and its state space may have (cf. [5], but note that our time-slice axiom is stronger, placing a restriction on the state spaces as well as the algebras).

Definition 2.6. A locally covariant quantum field theory A is called causal iff for every pair of morphisms $\Psi_i : M_i \to M$, i = 1, 2, such that $\psi_1(M_1) \subset (\psi_2(M_2))^{\perp}$ in M we have that $[\alpha_{\Psi_1}(\mathcal{A}_{M_1}), \alpha_{\Psi_2}(\mathcal{A}_{M_2})] = \{0\}$ in $\mathcal{A}_M$. A locally covariant quantum field theory A with state space S satisfies the time-slice axiom iff for all morphisms $\Psi : M_1 \to M_2$ such that $\psi(M_1)$ contains a Cauchy surface for $M_2$ we have $\alpha_\Psi(\mathcal{A}_{M_1}) = \mathcal{A}_{M_2}$ and $\alpha_\Psi^*(S_{M_2}) = S_{M_1}$. A state space S for a locally covariant quantum field theory A is called locally quasi-equivalent iff for every morphism $\Psi : M_1 \to M_2$ such that $\psi(M_1) \subset M_2$ is bounded and for every pair of states $\omega, \omega' \in S_{M_2}$ the GNS-representations $\pi_\omega, \pi_{\omega'}$ of $\mathcal{A}_{M_2}$ are quasi-equivalent on $\alpha_\Psi(\mathcal{A}_{M_1})$. The local von Neumann algebras $R^{\omega}_{M_1} := \pi_\omega(\alpha_\Psi(\mathcal{A}_{M_1}))''$ are then *-isomorphic for all $\omega \in S_{M_2}$. A locally covariant quantum field theory A with a state space functor S is called nowhere classical iff for every morphism $\Psi : M_1 \to M_2$ and for every state $\omega \in S_{M_2}$ the local von Neumann algebra $R^{\omega}_{M_1}$ is not commutative.

Note that the condition $\psi_1(M_1) \subset (\psi_2(M_2))^{\perp}$ is symmetric in i = 1, 2. The causality condition formulates how the quantum physical system interplays with the classical gravitational background field, whereas the time-slice axiom expresses the existence of a causal dynamical law. The condition of a locally quasi-equivalent state space is more technical in nature and means that all states of a system can be described in the same Hilbert space representation as long as we only consider operations in a small (i.e. bounded) cc-region of the spacetime.

The condition that $\psi(M_1)$ contains a Cauchy surface for $M_2$ is equivalent to $D(\psi(M_1)) = M_2$, because a Cauchy surface $S \subset M_1$ maps to a Cauchy surface $\psi(S)$ for $D(\psi(M_1))$. On the algebraic level this yields:

Lemma 2.7. For a locally covariant quantum field theory A with a state space S satisfying the time-slice axiom, an object (M, g) ∈ Man and a cc-region O ⊂ M we have $\mathcal{A}_O = \mathcal{A}_{D(O)}$ and $S_O = S_{D(O)}$. If O contains a Cauchy surface of M we have $\mathcal{A}_O = \mathcal{A}_M$ and $S_O = S_M$.

Proof. Note that both $(O, g|_O)$ and $(D(O), g|_{D(O)})$ are objects of Man (by Lemma 2.4) and that a Cauchy surface S for O is also a Cauchy surface for D(O). (The causal convexity of O in M prevents multiple intersections of S.) The first statement then reduces to the second. Leaving the canonical embedding implicit in the notation, the result immediately follows from the time-slice axiom.

Finally we define the Reeh-Schlieder property, which we will study in more detail in the subsequent sections.

Definition 2.8. Consider a locally covariant quantum field theory A with a state space S. A state $\omega \in S_M$ has the Reeh-Schlieder property for a cc-region O ⊂ M iff $\overline{\pi_\omega(\mathcal{A}_O)\Omega_\omega} = H_\omega$, where $(\pi_\omega, \Omega_\omega, H_\omega)$ is the GNS-representation of $\mathcal{A}_M$ in the state ω. We then say that ω is a Reeh-Schlieder state for O. We say that ω is a (full) Reeh-Schlieder state iff it is a Reeh-Schlieder state for all cc-regions in M.


3. Spacetime Deformation

The existence of Hadamard states of the free scalar field in certain curved spacetimes was proved in [9] by deforming Minkowski spacetime into another globally hyperbolic spacetime. Using a similar but slightly more technical spacetime deformation argument, [23] proved a spin-statistics theorem for locally covariant quantum field theories with a spin structure, given that such a theorem holds in Minkowski spacetime. In the next section we will assume the existence of a Reeh-Schlieder state in one spacetime and try to deduce along similar lines the existence of such states on a deformed spacetime. As a geometric prerequisite we will state and prove in the present section a spacetime deformation result employing similar methods as the references mentioned above. First we recall the spacetime deformation result due to [9]:

Proposition 3.1. Consider two globally hyperbolic spacetimes $M_i$, i = 1, 2, with spacelike Cauchy surfaces $C_i$ both diffeomorphic to C. Then there exists a globally hyperbolic spacetime $M' = (\mathbb{R} \times C, g')$ with spacelike Cauchy surfaces $C'_i$, i = 1, 2, such that $C'_i$ is isometrically diffeomorphic to $C_i$ and an open neighbourhood of $C'_i$ is isometrically diffeomorphic to an open neighbourhood of $C_i$.

The proof is omitted, because the stronger result Proposition 3.3 will be proved later on. Note, however, the following interesting corollary (cf. [5] Sect. 4):

Corollary 3.2. Two globally hyperbolic spacetimes $M_i$ with diffeomorphic Cauchy surfaces are mapped to isomorphic $C^*$-algebras $\mathcal{A}_{M_i}$ by any locally covariant quantum field theory A satisfying the time-slice axiom (with some state space S).

Proof. Consider two diffeomorphic globally hyperbolic spacetimes $M_i$, i = 1, 2, let $M'$ be the deforming spacetime of Proposition 3.1 and let $W_i \subset M_i$ be open neighbourhoods of the Cauchy surfaces $C_i \subset M_i$ which are isometrically diffeomorphic under $\psi_i$ to the open neighbourhoods $W'_i \subset M'$ of the Cauchy surfaces $C'_i \subset M'$. We may take the $W_i$ and $W'_i$ to be cc-regions (as will be shown in Proposition 3.3), so that the $\Psi_i$ (determined by $\psi_i$) are isomorphisms in Man. It then follows from Lemma 2.7 that

$$\mathcal{A}_{M_1} = \mathcal{A}_{W_1} = \mathcal{A}_{\psi_1^{-1}(W'_1)} = \alpha_{\Psi_1}^{-1}(\mathcal{A}_{W'_1}) = \alpha_{\Psi_1}^{-1}(\mathcal{A}_{M'}) = \alpha_{\Psi_1}^{-1} \circ \alpha_{\Psi_2}(\mathcal{A}_{M_2}),$$

where the $\alpha_{\Psi_i}$ are *-isomorphisms. This proves the assertion.

At this point a warning seems in place. Whenever $g_1, g_2$ are two Lorentzian metrics on a manifold M such that both $M_i := (M, g_i)$ are objects in Man, Corollary 3.2 gives a *-isomorphism α between the algebras $\mathcal{A}_{M_i}$. If O ⊂ M is a cc-region for $g_1$ then α is a *-isomorphism from $\mathcal{A}_{(O,g_1)}$ into $\mathcal{A}_{M_2}$. However, the image cannot always be identified with $\mathcal{A}_{(O,g_2)}$, because O need not be causally convex for $g_2$, in which case the object is not defined.

We now formulate and prove our deformation result. The geometric situation is schematically depicted in Fig. 1.

Proposition 3.3. Consider two globally hyperbolic spacetimes $M_i$, i = 1, 2, with diffeomorphic Cauchy surfaces and a bounded cc-region $O_2 \subset M_2$ with non-empty causal complement, $O_2^{\perp} \neq \emptyset$. Then there are a globally hyperbolic spacetime $M' = (M', g')$, spacelike Cauchy surfaces $C_i \subset M_i$ and $C'_1, C'_2 \subset M'$ and bounded cc-regions $U_2, V_2 \subset M_2$ and $U_1, V_1 \subset M_1$ such that the following hold:


277

Fig. 1. Sketch of the geometry of Proposition 3.3

• There are isometric diffeomorphisms $\psi_i : W_i \to W'_i$, where $W_1 := I^{-}(C_1)$, $W'_1 := I^{-}(C'_1)$, $W_2 := I^{+}(C_2)$ and $W'_2 := I^{+}(C'_2)$,
• $U_2, V_2 \subset W_2$, $U_2 \subset D(O_2)$, $O_2 \subset D(V_2)$,
• $U_1, V_1 \subset W_1$, $U_1 \neq \emptyset$, $V_1^{\perp} \neq \emptyset$, $\psi_1(U_1) \subset D(\psi_2(U_2))$ and $\psi_2(V_2) \subset D(\psi_1(V_1))$.

Proof. First we recall the result of [3] that for any globally hyperbolic spacetime (M, g) there is a diffeomorphism $F : M \to \mathbb{R} \times C$ for some smooth three dimensional manifold C in such a way that for each $t \in \mathbb{R}$ the surface $F^{-1}(\{t\} \times C)$ is a spacelike Cauchy surface. The pushed-forward metric $g' := F_* g$ makes $(\mathbb{R} \times C, g')$ a globally hyperbolic manifold, where $g'$ is given by

$$g'_{\mu\nu} = \beta\, dt_\mu dt_\nu - h_{\mu\nu}. \qquad (3)$$

Here dt is the differential of the canonical projection on the first coordinate $t : \mathbb{R} \times C \to \mathbb{R}$, which is a smooth time function; β is a strictly positive smooth function and $h_{\mu\nu}$ is a (space and time dependent) Riemannian metric on C. The orientation and time orientation of M induce an orientation and time orientation on $\mathbb{R} \times C$ via F. (If necessary we may compose F with the time-reversal diffeomorphism $(t, x) \mapsto (-t, x)$ of $\mathbb{R} \times C$ to ensure that the function t increases in the positive time direction.)

Applying the above to the $M_i$ gives us two diffeomorphisms $F_i : M_i \to M'$, where $M' = \mathbb{R} \times C$ as a manifold. Note that we can take the same C for both i = 1, 2 by the assumption of diffeomorphic Cauchy surfaces. Define $O'_2 := F_2(O_2)$ and let $t_{\min}$ and $t_{\max}$ be the minimum and maximum value that the function t attains on the compact set $\overline{O'_2}$. We now prove that $F_2^{-1}((t_{\min}, t_{\max}) \times C) \cap O_2^{\perp} \neq \emptyset$. Indeed, if this were empty, then we see that $J(O_2)$ contains $F_2^{-1}([t_{\min}, t_{\max}] \times C)$ and hence also $C_{\max} := F_2^{-1}(\{t_{\max}\} \times C)$ and $C_{\min} := F_2^{-1}(\{t_{\min}\} \times C)$. In fact, $C_{\min} \subset J^{-}(O_2)$. Indeed, if $p := F_2^{-1}(t_{\min}, x)$ is in $J^{+}(O_2)$ then we can consider a basis of neighbourhoods of p of the form $I^{-}(F_2^{-1}(t_{\min} + 1/n, x)) \cap I^{+}(F_2^{-1}(\{t_{\min} - 1/n\} \times C))$. If $q_n \in J^{+}(O_2)$ is in such a basic neighbourhood, then the same neighbourhood also contains a point $p_n \in O_2$. Hence, given a sequence $q_n$ in $J^{+}(O_2)$ converging to p we find a sequence $p_n$ in $O_2$ converging to p and we conclude that $p \in \overline{O_2} \subset J^{-}(O_2)$. Similarly we can show that $C_{\max} \subset J^{+}(O_2)$. It then follows that $I^{+}(C_{\max}) \subset J^{+}(O_2)$ and $I^{-}(C_{\min}) \subset J^{-}(O_2)$, so that $J(O_2) = M_2$ and $O_2^{\perp} = \emptyset$. This contradicts our assumption on $O_2$, so we must have $F_2^{-1}((t_{\min}, t_{\max}) \times C) \cap O_2^{\perp} \neq \emptyset$. Then we may choose $t_2 \in (t_{\min}, t_{\max})$ such that $C_2 := F_2^{-1}(\{t_2\} \times C)$ satisfies $C_2 \cap O_2 \neq \emptyset$ and $C_2 \cap O_2^{\perp} \neq \emptyset$. We define $C'_2 := F_2(C_2)$, $W_2 := I^{+}(C_2)$ and $W'_2 := (t_2, \infty) \times C$.


K. Sanders

Note that $C_2 \cap J(\overline{O_2})$ is compact by [1] Corollary A.5.4. It follows that we can find relatively compact open sets $K, N \subset C$ such that $K_2' := \{t_2\} \times K$, $K_2 := F_2^{-1}(K_2')$, $N_2' := \{t_2\} \times N$ and $N_2 := F_2^{-1}(N_2')$ satisfy $K \neq \emptyset$, $N \neq C$, $K_2 \subset O_2$ and $C_2 \cap J(\overline{O_2}) \subset N_2$. We let $C_{\max} := F_2^{-1}(\{t_{\max}\} \times C)$ and define $U_2 := D(K_2) \cap I^+(K_2) \cap I^-(C_{\max})$ and $V_2 := D(N_2) \cap I^+(N_2) \cap I^-(C_{\max})$. It follows from Lemma 2.4 that $U_2, V_2$ are bounded cc-regions in $M_2$. Clearly $U_2, V_2 \subset W_2$, $U_2 \subset D(O_2)$, $O_2 \subset D(V_2)$ and $V_2^{\perp} \neq \emptyset$.

Next we choose $t_1 \in (t_{\min}, t_2)$ and define $C_1' := \{t_1\} \times C$, $C_1 := F_1^{-1}(C_1')$, $W_1 := I^-(C_1)$ and $W_1' := (-\infty, t_1) \times C$. Let $N', K' \subset C$ be relatively compact connected open sets such that $K' \neq \emptyset$, $N' \neq C$, $K' \subset K$ and $N \subset N'$. We define $N_1' := \{t_1\} \times N'$, $K_1' := \{t_1\} \times K'$, $N_1 := F_1^{-1}(N_1')$, $K_1 := F_1^{-1}(K_1')$ and $C_{\min} := F_1^{-1}(\{t_{\min}\} \times C)$. Let $U_1 := D(K_1) \cap I^-(K_1) \cap I^+(C_{\min})$ and $V_1 := D(N_1) \cap I^-(N_1) \cap I^+(C_{\min})$. Again by Lemma 2.4 these are bounded cc-regions in $M_1$. Note that $U_1, V_1 \subset W_1$ and $V_1^{\perp} \neq \emptyset$. The metric $g'$ of $M'$ is now chosen to be of the form

$$g'_{\mu\nu} := \beta\, dt_\mu\, dt_\nu - f \cdot (h_1)_{\mu\nu} - (1 - f) \cdot (h_2)_{\mu\nu},$$

where we have written ((Fi )∗ gi )µν = βi dtµ dtν − (h i )µν , f is a smooth function on M which is identically 1 on W1 , identically 0 on W2 and 0 < f < 1 on the intermediate region (t1 , t2 ) × C and β is a positive smooth function which is identically βi on Wi . It is then clear that the maps Fi restrict to isometric diffeomorphisms ψi : Wi → Wi . The function β may be chosen small enough on the region (t1 , t2 )×C to make (M, g ) globally hyperbolic. (As pointed out in [9] in their proof of Proposition 3.1, choosing β small “closes up” the light cones and prevents causal curves from “running off to spatial infinity” in the intermediate region.) Furthermore, using the compactness of (t1 , t2 ) × N and the continuity of (h i )µν we see that we may choose β small enough on this set to ensure that any causal curve through K 1 must also intersect K 2 and any causal curve through N2 must also intersect N1 . This means that K 1 ⊂ D(K 2 ) and N2 ⊂ D(N1 ) and hence ψ1 (U1 ) ⊂ D(ψ2 (U2 )) and ψ2 (V2 ) ⊂ D(ψ1 (V1 )). This completes the proof. The analogue of Corollary 3.2 for the situation of Proposition 3.3 is: Proposition 3.4. Consider a locally covariant quantum field theory A with a state space S satisfying the time-slice axiom and two globally hyperbolic spacetimes Mi , i = 1, 2 with diffeomorphic Cauchy surfaces. For any bounded cc-region O2 ⊂ M2 with nonempty causal complement there are bounded cc-regions U1 , V1 ⊂ M1 and a ∗ -isomorphism α : A M2 → A M1 such that V1⊥ = ∅ and AU1 ⊂ α(A O2 ) ⊂ AV1 .

(4)

Moreover, if the spacelike Cauchy surfaces of the Mi are non-compact and P2 ⊂ M2 is any bounded cc-region, then there are bounded cc-regions Q 2 ⊂ M2 and P1 , Q 1 ⊂ M1 such that Q i ⊂ Pi⊥ for i = 1, 2 and α(A P2 ) ⊂ A P1 , A Q 1 ⊂ α(A Q 2 ),

(5)

where α is the same ∗ -isomorphism as in the first part of this proposition. Proof. We apply Proposition 3.3 to obtain sets Ui , Vi and isomorphisms i : Wi → Wi associated to the isometric diffeomorphisms ψi . As in the proof of Corollary 3.2 the i



Fig. 2. Sketch of the proof of the second part of Proposition 3.4 −1 give rise to ∗ -isomorphisms αi and α := α ◦ α2 is a ∗ -isomorphism from A M2 to 1 A M1 . Using the properties of Ui , Vi stated in Proposition 3.3 we deduce: −1 −1 −1 (AU1 ) ⊂ α (A D(U2 ) ) = α (AU2 ) = α(AU2 ) ⊂ α(A O2 ) AU1 = α 1 1 1

⊂ α(AV2 ) = αψ−11 (AV2 ) ⊂ αψ−11 (A D(V1 ) ) = αψ−11 (AV1 ) = AV1 . Here we repeatedly used Eq. (2) and Lemma 2.7 (the time-slice axiom). This proves the first part of the proposition. Now suppose that the Cauchy-surfaces are non-compact and let P2 be any bounded cc-region. We refer to Fig. 2 for a depiction of this part of the proof. First choose Cauchy surfaces T2 , T+ ⊂ W2 such that T+ ⊂ I + (T2 ). Note that J (P2 ) ∩ T2 is compact, so it has a relatively compact connected open neighbourhood N2 ⊂ T2 . Choosing T+ appropriately we see that R := D(N2 ) ∩ I + (N2 ) ∩ I − (T+ ) is a bounded cc-region in M2 by Lemma 2.4 and as usual we set R := ψ2 (R). Now let T− , T1 ⊂ W1 be Cauchy surfaces such that T− ⊂ I − (T1 ) and note that J (R ) ∩ T1 is again compact, so we can find a relatively compact connected open neighbourhood N1 ⊂ T1 and use Lemma 2.4 to define the bounded cc-region P1 := D(N1 ) ∩ I − (N1 ) ∩ I + (T− ) and P1 := ψ1−1 (P1 ). Now let L 1 ⊂ T1 be a connected relatively compact set such that L 1 ∩ N1 = ∅. Such an L 1 exists because T1 is non-compact. Define Q 1 := D(L 1 ) ∩ I − (L 1 ) ∩ I + (T− ) and Q 1 := ψ1−1 (Q 1 ). We see that Q 1 ⊂ P1⊥ is a bounded cc-region and Q 1 ⊂ D(ψ2 (L 2 )) where L 2 ⊂ T2 \ N2 is a relatively compact open set. In fact, we can choose L 2 to be connected because Q 1 lies in a connected component C of D(ψ2 (T2 \ N2 )). We now define the bounded cc-region Q 2 := D(L 2 ) ∩ I + (L 2 ) ∩ I − (T+ ) and Q 2 := ψ2 (Q 2 ), so that Q 2 ⊂ P2⊥ and Q 1 ⊂ D(Q 2 ). This concludes the geometrical part of the proof. Now note that A P2 ⊂ A R by Lemma 2.7 on D(N2 ) and that A R = α2 (A R ). Applying Lemma 2.7 in D(N1 ) we see that −1 (A P1 ). Putting this together yields the inclusion: A R ⊂ A P1 and we have A P1 = α 1 −1 −1 (A R ) ⊂ α (A P1 ) = A P1 . α(A P2 ) ⊂ α(A R ) = α 1 1

−1 Similarly we have A Q 1 = α (A Q 1 ), A Q 2 = α2 (A Q 2 ) and A Q 1 ⊂ A Q 2 by Lemma 1 2.7. This yields the inclusion: −1 −1 (A Q 2 ) ⊃ α (A Q 1 ) = A Q 1 . α(A Q 2 ) = α 1 1


4. The Reeh-Schlieder Property in Curved Spacetime The spacetime deformation argument of the previous section will have some consequences for the Reeh-Schlieder property that we describe in the current section. Unfortunately it is not clear that we can deform a Reeh-Schlieder state into another (full) Reeh-Schlieder state, but we do have the following more limited result: Theorem 4.1. Consider a locally covariant quantum field theory A with state space S which satisfies the time-slice axiom. Let Mi be two globally hyperbolic spacetimes with diffeomorphic Cauchy surfaces and suppose that ω1 ∈ S M1 is a Reeh-Schlieder state. Then given any bounded cc-region O2 ⊂ M2 with non-empty causal complement, O2⊥ = ∅, there is a ∗ -isomorphism α : A M2 → A M1 such that ω2 := α ∗ (ω1 ) has the Reeh-Schlieder property for O2 . Moreover, if the Cauchy surfaces of the Mi are non-compact and P2 ⊂ M2 is a bounded cc-region, then there is a bounded cc-region Q 2 ⊂ P2⊥ for which ω2 has the Reeh-Schlieder property. (Here ω2 = α ∗ (ω1 ) is still defined by the same α as in the first statement of the theorem.) Proof. For the first statement let α and U1 be as in the first part of Proposition 3.4 and note that α gives rise to a unitary map Uα : Hω2 → Hω1 . This map is the expression of the essential uniqueness of the GNS-representation, so that Uα ω2 = ω1 and Uα πω2 U∗α = πω1 ◦ α. The Reeh-Schlieder property for O2 then follows from the observation that Uα πω2 (A O2 )U∗α ⊃ πω1 (AU1 ): πω2 (A O2 )ω2 ⊃ U∗α πω1 (AU1 )ω1 = U∗α Hω1 = Hω2 . Similarly for the second statement, given a bounded cc-region P2 and choosing Q 1 , Q 2 as in the second statement of Proposition 3.4 we see that Uα πω2 (A Q 2 )U∗α ⊃ πω1 (A Q 1 ). The second part of Theorem 4.1 means that ω2 is a Reeh-Schlieder state for all ccregions that are big enough. 
Indeed, if V2 is a sufficiently small cc-region then V2⊥ is connected (recall that we work with four-dimensional spacetimes) and therefore ω2 has the Reeh-Schlieder property for some cc-region in V2⊥ and hence also for V2⊥ itself. A useful consequence of Theorem 4.1 is the following: Corollary 4.2. In the situation of Theorem 4.1 if A is causal then ω2 is a cyclic and separating vector for RωO22 . If the Cauchy surfaces are non-compact ω2 is a separating vector for all RωP22 , where P2 is a bounded cc-region. Proof. Recall that a vector is a separating vector for a von Neumann algebra R iff it is a cyclic vector for the commutant R (see [12] Proposition 5.5.11.). Choosing V1 as in the first part of Proposition 3.4 we have Uα πω2 (A O2 )U∗α ⊂ πω1 (AV1 ) by the inclusion (4). Therefore the commutant of Uα RωO22 U∗α contains (RωV11 ) . As V1⊥ = ∅ this commutant contains the local algebra of some cc-region for which ω1 is cyclic. Hence ω1 is a separating vector for RωV11 and ω2 for RωO22 . If the Cauchy surfaces are non-compact, P2 is a bounded region and Q 2 is as in Theorem 4.1, then (RωP22 ) contains πω2 (A Q 2 ), for which ω2 is cyclic. It follows that ω2 is separating for RωP22 . If the theory is nowhere classical there exist non-local correlations between O2 and any cc-region V2 spacelike to it, just as in the Minkowski spacetime case (see e.g. [16]). Also,



if the Cauchy surfaces are non-compact, any localised non-trivial positive observable has a positive expectation value.

If the state space is locally quasi-equivalent and large enough it is possible to show the existence of full Reeh-Schlieder states. The proof uses abstract existence arguments, as opposed to the proof of Theorem 4.1 which is constructive, at least in principle.

Theorem 4.3. Consider a locally covariant quantum field theory $A$ with a locally quasi-equivalent state space $S$ which is causal and satisfies the time-slice axiom. Assume that $S$ is maximal in the sense that for any state $\omega$ on some $A_M$ which is locally quasi-equivalent to a state in $S_M$ we have $\omega \in S_M$. Let $M_i$, $i = 1, 2$, be two globally hyperbolic spacetimes with diffeomorphic non-compact Cauchy surfaces and assume that $\omega_1$ is a Reeh-Schlieder state on $M_1$. Then $S_{M_2}$ contains a (full) Reeh-Schlieder state.

Proof. Let $\{O_n\}_{n \in \mathbb{N}}$ be a countable basis for the topology of $M_2$ consisting of bounded cc-regions with non-empty causal complement. We then apply Theorem 4.1 to each $O_n$ to obtain a sequence of states $\omega_2^n \in S_{M_2}$ which have the Reeh-Schlieder property for $O_n$. We write $\omega := \omega_2^1$ and let $(\pi, \Omega, H)$ denote its GNS-representation. For all $n \geq 2$ we now find a bounded cc-region $V_n \subset M_2$ such that $V_n \supset O_1 \cup O_n$. For this purpose we first choose a Cauchy surface $C \subset M_2$ and note that $K_n := C \cap J(\overline{O_n})$ is compact. Letting $L_n \subset C$ be a compact connected set containing $K_1 \cup K_n$ in its interior it suffices to choose $V_n := \mathrm{int}(D(L_n)) \cap I^-(C_+) \cap I^+(C_-)$ for Cauchy surfaces $C_\pm$ to the future resp. past of $O_1$, $O_n$ and $C$. Note that $\Omega$ and $\Omega_{\omega_2^n}$ are cyclic and separating vectors for $R^{\omega}_{V_n}$ and $R^{\omega_2^n}_{V_n}$ respectively by $O_1 \cup O_n \subset V_n$ and by Corollary 4.2. Because $\omega$ and $\omega_2^n$ are locally quasi-equivalent there is a $*$-isomorphism $\phi : R^{\omega_2^n}_{V_n} \to R^{\omega}_{V_n}$. In the presence of the cyclic and separating vectors $\phi$ is implemented by a unitary map $U_n : H_{\omega_2^n} \to H$ (see [12] Theorem 7.2.9). We claim that $\psi_n := U_n \Omega_{\omega_2^n}$ is cyclic for $R^{\omega}_{O_n}$. Indeed, by the definition of quasi-equivalence we have $\phi \circ \pi_{\omega_2^n} = \pi_\omega$ on $A_{V_n}$, so

$$\overline{\pi_\omega(A_{O_n})\psi_n} = U_n\, \overline{\pi_{\omega_2^n}(A_{O_n})\Omega_{\omega_2^n}} = U_n H_{\omega_2^n} = H_\omega.$$

We now apply the results of [8] to conclude that $H$ contains a dense set of vectors $\psi$ which are cyclic and separating for all $R^{\omega}_{O_n}$ simultaneously. Because each cc-region $O \subset M_2$ contains some $O_n$ we see that $\omega_\psi : A \mapsto \langle \psi, \pi_\omega(A)\psi \rangle / \|\psi\|^2$ defines a full Reeh-Schlieder state. Finally, because the GNS-representation of $\omega_\psi$ is just $(\pi, \psi, H)$ we see that it is locally quasi-equivalent to $\omega$ and hence $\omega_\psi \in S_{M_2}$.

One reason to assume the maximality condition of Theorem 4.3 is that it guarantees that the state spaces are closed under operations, i.e. if ω ∈ S M and A ∈ A M such that ω(A∗ A) = 1, then S M automatically contains the state ω A defined by ω A (B) := ω(A∗ B A). However, such a large state space may contain many singular states, as we will see in the example of the free scalar field in Sect. 5. In situations of physical interest it therefore remains to be seen whether the state space is big enough to contain full ReehSchlieder states. Nevertheless, Theorem 4.1 is already enough for some applications, such as the following conclusion concerning the type of local von Neumann algebras. Corollary 4.4. Consider a nowhere classical causal locally covariant quantum field theory A with a locally quasi-equivalent state space S which satisfy the time-slice axiom. Let Mi be two globally hyperbolic spacetimes with diffeomorphic Cauchy surfaces and let ω1 ∈ S M1 be a Reeh-Schlieder state. Then for any state ω ∈ S Mi and any cc-region O ⊂ Mi the local von Neumann algebra RωO is not finite.


Proof. We will use Proposition 5.5.3 in [2], which says that RωO is not finite if the GNSvector is a cyclic and separating vector for RωO and for a proper sub-algebra RωV . Note that we can drop the superscript ω if O and V are bounded, by local quasi-equivalence. First we consider M1 . For any bounded cc-region O1 ⊂ M1 such that O1⊥ = ∅ we can find bounded cc-regions O ⊂ O1⊥ and U, V ⊂ O1 such that U ⊂ V ⊥ . By the Reeh-Schlieder property the GNS-vector ω1 is cyclic for RV and hence also for R O1 . Moreover it is cyclic for RO1 ⊃ R O and therefore it is separating for R O1 and RV . Now suppose that R O1 = RV . Then, by causality: πω (AU ) ⊂ πω (AV ) = πω (A O1 ) ⊂ πω (AU ) . , which contradicts the nowhere classicality. Therefore, the It follows that RU ⊂ RU inclusion RV ⊂ R O1 must be proper and the cited theorem applies. Of course, if O ⊂ M1 is a cc-region that is not bounded, then it contains a bounded sub-cc-region O1 as above and RωO ⊃ RωO1 R O1 isn’t finite either for any ω ∈ S M1 . (If V is a partial isometry in the smaller algebra such that I = V ∗ V and E := V V ∗ < I then the same V shows that I is not finite in the larger algebra.) Next we consider M2 and let O ⊂ M2 be any cc-region. It contains a cc-region O2 with O2⊥ = ∅, so we can apply Theorem 4.1. Using the unitary map Uα : Hω2 → Hω1 we see that R O2 RωO22 contains α −1 (RωO11 ), which is not finite by the first paragraph. Hence R O2 is not finite and the statement for O then follows again by inclusion.

Instead of the nowhere classicality we could have assumed that the local von Neumann algebras in $M_1$ are infinite, which allows us to derive the same conclusion for $M_2$. Unfortunately it is in general impossible to completely derive the type of the local algebras using this kind of argument. Even if we know the types of the algebras $A_{U_1}$ and $A_{V_1}$ in the inclusions (4), we can't deduce the type of $A_{O_2}$.

Another important consequence of Theorem 4.1 is that Corollary 4.2 enables us to apply the Tomita-Takesaki modular theory to $R^{\omega_2}_{O_2}$ (or to the von Neumann algebra of any bounded cc-region $V_2$ which contains $O_2$, if the Cauchy surfaces are non-compact). More precisely, let $O_2 \subset M_2$ be given and let $U_1, V_1 \subset M_1$ be the bounded cc-regions and $\alpha : A_{M_2} \to A_{M_1}$ the $*$-isomorphism of Proposition 3.4, so that $A_{U_1} \subset \alpha(A_{O_2}) \subset A_{V_1}$. We can then define $R := U_\alpha R^{\omega_2}_{O_2} U_\alpha^*$ and obtain $R^{\omega_1}_{U_1} \subset R \subset R^{\omega_1}_{V_1}$. It is then clear that the respective Tomita-operators are extensions of each other, $S_{U_1} \subset S_R \subset S_{V_1}$ (see e.g. [12]).

5. The Free Scalar Field

As an example we will consider the free scalar field, which can be quantised using the Weyl algebra (see [7]). For a globally hyperbolic spacetime $M$ the algebra $A_M$ is defined as follows. We let $E := E^- - E^+$ denote the difference of the advanced and retarded fundamental solution of the Klein-Gordon operator $\nabla^a \nabla_a + m^2$ for a given mass $m \geq 0$. The linear space $H := E(C_0^\infty(M))$ has a non-degenerate symplectic form defined by $\sigma(Ef, Eg) := \int_M f\, Eg$, where we integrate with respect to the volume element determined by the metric. To every $Ef \in H$ we can then associate an element $W(Ef)$ subject to the relations

$$W(Ef)^* = W(-Ef), \qquad W(Ef)\,W(Eg) = e^{-\frac{i}{2}\sigma(Ef, Eg)}\, W(E(f + g)).$$

These elements form a $*$-algebra that can be given a norm and completed to a $C^*$-algebra $A_M$. It is shown in [5] Theorem 2.2 that the free scalar field is an example of a locally



covariant quantum field theory which is causal. It satisfies part of the time-slice axiom, namely if O ⊂ M contains a Cauchy surface then A O = A M . A state ω on A M is called regular if the group of unitary operators λ → πω (W (λE f )) is strongly continuous for each f . It then has a self-adjoint (unbounded) generator ω ( f ) and we can define the Hilbert-space valued distribution φω ( f ) := ω ( f )ω . A regular state is quasi-free iff the two-point function w2 ( f, h) := φω ( f¯), φω (h),

f, h ∈ C0∞ (M)

determines the state by ω(W (E f )) = e−w2 ( f, f ) . A quasi-free state is Hadamard iff W F∞ (φω (.)) ⊂ V + , where V + ⊂ T ∗ M denotes the cone of future directed causal covectors of the spacetime (see [20] Proposition 6.1). Quasi-free Hadamard states exist on all globally hyperbolic spacetimes (see [9]) and they are believed to be the most suitable states to play a role similar to the vacuum in Minkowski spacetime. For this reason we will want to choose a state space S M which contains all quasi-free Hadamard states. If we choose these states only it can be shown that we get a locally quasi-equivalent state space (see [22] Theorem 3.6) and the time-slice axiom is satisfied (see [15] Theorem 5.1 and the subsequent discussion). We may now apply the results of Sect. 4: Proposition 5.1. Let M be a globally hyperbolic spacetime, let O ⊂ M a bounded ccregion with non-empty causal complement and assume that the mass m > 0 is strictly positive. Then there is a Hadamard state ω on A M which has the Reeh-Schlieder property for O. The vector ω is cyclic and separating for R O . For all bounded cc-regions V ⊂ M the local von Neumann algebra RV is not finite. Moreover, if the Cauchy surfaces of M are non-compact then ω is a separating vector for all RV . Proof. The theory is causal, satisfies the time-slice axiom and the state space is locally quasi-equivalent. Moreover, the theory is nowhere classical. To see this we note that the local C ∗ -algebras are non-commutative and simple, so the representations πω are faithful. Now we can find an ultrastatic (and hence stationary) spacetime M diffeomorphic to M. Because m > 0 we may apply the results of [13], which imply the existence of a regular quasi-free ground state ω on M . This state has the Reeh-Schlieder property (see [19]) and is Hadamard because it satisfies the microlocal spectrum condition (see [15,20]). The conclusions now follow immediately from Theorem 4.1 and Corollaries 4.2 and 4.4. 
Note that stronger results on the type of the local algebras are known, [22]. If we would enlarge our state space, following [5], and allow any state that is locally quasi-equivalent to a quasi-free Hadamard state, then it follows from Theorem 4.3 that it also contains full Reeh-Schlieder states. In fact, if ω is a suitable quasi-free Hadamard state on A M then the proof of Theorem 4.3 shows that Hω contains a dense G δ of vectors which define Reeh-Schlieder states. An important question is how many states are both Hadamard and Reeh-Schlieder states. As a partial answer we wish to note that most vectors in the given G δ of Reeh-Schlieder vector states are not Hadamard. Indeed, if a vector ψ ∈ Hω defines a Hadamard state then it must be in the domain of the unbounded self-adjoint operator T := ω ( f )∗∗ ω ( f )∗ for every test function f (see [12] Theorem 2.7.8v). We then apply

Note that this is what [5] calls the time-slice axiom. In our definition, however, we also need to choose a suitable state space functor so that we get isomorphisms of the sets of states too.


Proposition 5.2. The domain of an unbounded self-adjoint operator $T$ on a Hilbert space $H$ is a meagre $F_\sigma$ (i.e. the complement of a dense $G_\delta$).

Proof. For each $n \in \mathbb{N}$ we define $V_n := \{\psi \in H \mid \|T\psi\| \leq n\}$ and note that $\mathrm{dom}(T) = \cup_n V_n$. The sets $V_n$ are nowhere dense because $T$ is unbounded. They are also closed because for a Cauchy sequence $\psi_i \to \psi$ with $\psi_i \in V_n$ we have

$$\|T E_{[-r,r]}\psi\| \leq \|T E_{[-r,r]}(\psi - \psi_i)\| + \|T E_{[-r,r]}\psi_i\| \leq r\,\|\psi - \psi_i\| + n,$$

where $E_{[-r,r]}$ is the spectral projection of $T$ on the interval $[-r, r]$. Taking $i \to \infty$ shows that $\|T E_{[-r,r]}\psi\| \leq n$ for all $r$ and hence $\|T\psi\| \leq n$, i.e. $\psi \in V_n$. This completes the proof.

It then follows that most Reeh-Schlieder vector states in $H_\omega$ are not Hadamard. The converse question, how many Hadamard states are Reeh-Schlieder states, remains open. The basic difficulty for that question seems to be that the Hilbert space topology on $H_\omega$ is not fine enough to deal with the meagre set of Hadamard states.

6. Conclusions

If one accepts locally covariant quantum field theory as a suitable axiomatic framework to describe quantum field theories in curved spacetime then one only needs to assume the very natural time-slice axiom in order to use the general technique of spacetime deformation. The geometrical ideas behind deformation results like Proposition 3.3 are insightful, even though the proofs can become a bit involved. It should be noted, however, that these geometrical results, possibly combined with other assumptions such as causality, have immediate consequences on the algebraic side which are not hard to prove. This we have seen in Sect. 4, where most proofs follow easily from the deformation, with the exception of Theorem 4.3. Concerning the Reeh-Schlieder property we have shown that a Reeh-Schlieder state on one spacetime can be deformed in such a way that it gives a state on a diffeomorphic spacetime which is a Reeh-Schlieder state for a given cc-region.
It is even possible to get full Reeh-Schlieder states, but it is not clear whether these are “physical” enough to belong to a state space of interest. Nevertheless, our results do allow us to draw conclusions about non-local correlations and the type of local von Neumann algebras and they open up the way to use Tomita-Takesaki theory in curved spacetime. Acknowledgement. I would like to thank Chris Fewster for suggesting the current approach to the ReehSchlieder property and for many helpful discussions and comments on the second draft. Many thanks also to Lutz Osterbrink for his careful proofreading of the first draft. Finally I would like to express my gratitude to the anonymous referee for pointing out some minor mistakes and providing useful comments. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References 1. Bär, C., Ginoux, N., Pfäffle, F.: Wave equations on Lorentzian manifolds and quantization. EMS Publishing House, Zürich, 2007 2. Baumgärtel, H., Wollenberg, M.: Causal nets of operator algebras. Akademie Verlag, Berlin, 1992 3. Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 257, 43–50 (2005) 4. Bernal, A.N., Sánchez, M.: Further results on the smoothability of Cauchy hypersurfaces and Cauchy time functions. Lett. Math. Phys. 77, 183–197 (2006)



5. Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle—a new paradigm for local quantum field theory. Commun. Math. Phys. 237, 31–68 (2003) 6. Brunetti, R., Ruzzi, G.: Superselection sectors and general covariance. I. Commun. Math. Phys. 270, 69–108 (2007) 7. Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77, 219–228 (1980) 8. Dixmier, J., Maréchal, O.: Vecteurs totalisateurs d’une algèbre de von Neumann. Commun. Math. Phys. 22, 44–50 (1971) 9. Fulling, S.A., Narcowich, F.J., Wald, R.M.: Singularity structure of the two-point function in quantum field theory in curved spacetime, II. Ann. Phys. (N.Y.) 136, 243–272 (1981) 10. Haag, R.: Local quantum physics – fields, particles, algebras. Berlin-Heidelberg:Springer Verlag, 1992 11. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge University Press, Cambridge, 1973 12. Kadison, R.V., Ringrose, J.R.: Fundamentals of the theory of operator algebras. Academic Press, London, 1983 13. Kay, B.S.: Linear spin-zero quantum fields in external gravitational and scalar fields. I. A one particle structure for the stationary case. Commun. Math. Phys. 62, 55–70 (1978) 14. O’Neill, B.: Semi-Riemannian geometry: with applications to relativity. Academic Press, New York, 1983 15. Radzikowski, M.J.: Micro-local approach to the Hadamard condition in quantum field theory on curved space-time. Commun. Math. Phys. 179, 529–553 (1996) 16. Redhead, M.: The vacuum state in relativistic quantum field theory. In: Hull, D., Forbes, M., Burian, eds., Phil. of Sci. Assoc, 1994 PSA, Vol. 1994, Volume 2 E.Lansing, MI:Phil. of Sci. Assoc, 1994, pp. 77–87 17. Reeh, H., Schlieder, S.: Bemerkungen zur Unitäräquivalenz von Lorentzinvarianten Felden. Nuovo Cimento 22, 1051–1068 (1961) 18. Sanders, K.: Aspects of locally covariant quantum field theory. PhD thesis, University of York, also available on http://arxiv.org/abs/:0809.4828v1[math-ph], 2008 19. 
Strohmaier, A.: The Reeh-Schlieder property for quantum fields on stationary spacetimes. Commun. Math. Phys. 215, 105–118 (2000) 20. Strohmaier, A., Verch, R., Wollenberg, M.: Microlocal analysis of quantum fields on curved space-times: analytic wavefront sets and Reeh-Schlieder theorems. J. Math. Phys. 43, 5514–5530 (2002) 21. Verch, R.: Antilocality and a Reeh-Schlieder theorem on manifolds. Lett. Math. Phys. 28, 143–154 (1993) 22. Verch, R.: Continuity of symplectically adjoint maps and the algebraic structure of Hadamard vacuum representations for quantum fields on curved spacetime. Rev. Math. Phys. 9, 635–674 (1997) 23. Verch, R.: A spin-statistics theorem for quantum fields on curved spacetime manifolds in a generally covariant framework. Commun. Math. Phys. 223, 261–288 (2001) 24. Wald, R.M.: General relativity. The University of Chicago Press, Chicago and London, 1984 Communicated by G. W. Gibbons

Commun. Math. Phys. 288, 287–310 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0771-y



Mellin Transform of the Limit Lognormal Distribution Dmitry Ostrovsky 125 Field Point Rd. #3, Greenwich, CT 06830, USA. E-mail: [email protected] Received: 26 February 2008 / Accepted: 24 December 2008 Published online: 11 March 2009 – © Springer-Verlag 2009

Abstract: The technique of intermittency expansions is applied to derive an exact formal power series representation for the Mellin transform of the probability distribution of the limit lognormal multifractal process. The negative integral moments are computed by a novel product formula of Selberg type. The power series is summed in general by means of its small intermittency asymptotic. The resulting integral formula for the Mellin transform is conjectured to be valid at all levels of intermittency. The conjecture is verified partially by proving that the integral formula reproduces known results for the positive and negative integral moments of the limit lognormal distribution and gives a valid characteristic function of the Lévy-Khinchine type for the logarithm of the distribution. The moment problem for the logarithm of the distribution is shown to be determinate, whereas the moment problems for the distribution and its reciprocal are shown to be indeterminate. The conjecture is used to represent the Mellin transform as an infinite product of gamma factors generalizing Selberg’s finite product. The conjectured probability density functions of the limit lognormal distribution and its logarithm are computed numerically by the inverse Fourier transform. 1. Introduction Limit lognormal stochastic processes were introduced and reviewed by Mandelbrot [17,19], formalized in a series of papers by Kahane [12–14], and constructed explicitly by Bacry et. al. [3]. The interest in limit lognormal processes is threefold. First, it stems from their remarkable scaling properties. Indeed, they exhibit nonlinear moment scaling, i.e. multiscaling, via the mechanism of grid-free stochastic self-similarity with lognormal multipliers, and thus are a concrete example of a family of multifractal stochastic processes. In addition, the construction of Bacry et. al. 
has stationary non-gaussian increments and serves as an ideal mathematical model for the physical phenomenon of long-range dependence. Second, limit lognormal processes are non-markovian as their increments are strongly stochastically dependent. We showed in [22 and 23] that this


dependence is stronger than that of both the canonical multifractals [18] and 1D disordered systems [15]. The precise nature of this dependence is the source of mathematical interest as it calls for novel mathematical techniques that are suitable for a strongly non-markovian setting. Third, the positive integral moments of the Bacry et. al. process were shown in [4] to be given by the Selberg integral, which has its own interest, confer [10], hence the interest in the probability distribution that generates it. In this paper we continue our study of the limit lognormal process of Bacry et. al. [3] via the technique of intermittency expansions that we introduced in [23] in the special case of the Laplace transform and developed it in general in [24]. This technique is based on a novel rule of intermittency differentiation for the probability distribution of the limit lognormal process. The rule is a functional equation for the derivatives of the expectation of an arbitrary smooth function of the distribution with respect to the intermittency parameter. By formally re-summing the resulting Taylor series, we obtain a power series expansion of any such functional with universal coefficients that are independent of the function. In [24] we expressed these expansion coefficients as an alternating sum of derivatives of the Selberg integral and used the celebrated Selberg’s formula [26] to compute the derivatives exactly as Bell polynomials of the values of the Riemann zeta function at positive integers. Thus, the technique of intermittency expansions provides a way of reconstructing the functional from the Selberg integral. Our goal in this paper is to carry out this program in the special case of the Mellin transform of the distribution. The contribution of this paper is to provide several new exact results beyond what has been known since the inception of the limit lognormal distribution. 
Our main result is the following formal expansion of the Mellin transform (complex moments) of the distribution in powers of the intermittency parameter µ,

E M

q

∞ µr +1 1 Br +2 (q + 1) + 2Br +2 (q) − 3Br +2 ζ (r + 1) − q = exp r + 1 2r +1 r +2 r =0 Br +2 (q − 1) − Br +2 (2q − 1) + (ζ (r + 1) − 1) , q ∈ C. (1) r +2

As usual, $\zeta(s)$ denotes the Riemann zeta function (see footnote 1) and $B_n(s)$ the $n$th Bernoulli polynomial. Eq. (1) is proved by deriving and solving a linear recurrence relation for the coefficients of the corresponding intermittency expansion. In the special case of integral moments, the series is convergent for a finite range of the moments and allows us to re-derive Selberg's formula for the positive moments and derive the following formula for the negative ones:

$$
\mathbf{E}\bigl[M^{-n}\bigr] = \prod_{k=0}^{n-1} \frac{\Gamma\bigl(2 + (n+2+k)\mu/2\bigr)\,\Gamma(1 - \mu/2)}{\Gamma^2\bigl(1 + (k+1)\mu/2\bigr)\,\Gamma(1 + k\mu/2)}. \tag{2}
$$
For non-integral moments, the power series expansion in Eq. (1) is divergent. We show that it is precisely the small intermittency $\mu \to +0$ asymptotic expansion of the following integral:

$$
\log \mathbf{E}\bigl[M^q\bigr] \sim \int_0^\infty \frac{dx}{x} \Biggl\{ \frac{1}{e^x - 1} \Biggl[ \frac{e^{\frac{\mu x}{2}(q+1)} + 2e^{\frac{\mu x}{2} q} - 3 + e^{\frac{\mu x}{2}(q-1)} - e^{\frac{\mu x}{2}(2q-1)}}{e^{\frac{\mu x}{2}} - 1} - \Bigl(1 + q + q e^{\frac{\mu x}{2}}\Bigr) \Biggr] + e^{-x} \Biggl[ \frac{e^{\frac{\mu x}{2}(2q-1)} - e^{\frac{\mu x}{2}(q-1)}}{e^{\frac{\mu x}{2}} - 1} - q \Biggr] \Biggr\}. \tag{3}
$$

Footnote 1: We will write $\zeta(1)$ to denote Euler's constant. It never enters any of the final formulas as the coefficient it multiplies is identically zero throughout this paper.
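The integral in Eq. (3) can be evaluated numerically at integer moments and compared with closed forms. The sketch below does this with a plain composite midpoint rule. The integrand is written with the grouping $\frac{1}{e^x-1}[\ldots] + e^{-x}[\ldots]$, which is a reconstruction consistent with Eq. (1), and the function names, the cutoff `upper`, the number of panels and the value $\mu = 0.2$ are assumptions made for illustration. Only real $q$ is handled.

```python
from math import exp, expm1, gamma


def integrand(x, q, mu):
    """Integrand of Eq. (3) for real q and x > 0 (hypothetical helper)."""
    t = 0.5 * mu * x
    denom = expm1(t)
    # The constants 1 + 2 - 3 + 1 - 1 in the numerator cancel, so it can be
    # written with expm1 to limit loss of precision at small x.
    numer = (expm1((q + 1) * t) + 2 * expm1(q * t)
             + expm1((q - 1) * t) - expm1((2 * q - 1) * t))
    first = (numer / denom - (1 + q + q * exp(t))) / expm1(x)
    second = exp(-x) * ((expm1((2 * q - 1) * t) - expm1((q - 1) * t)) / denom - q)
    return (first + second) / x


def log_mellin_integral(q, mu, upper=80.0, panels=16000):
    """Composite midpoint rule on (0, upper]; the midpoints avoid x = 0."""
    h = upper / panels
    return h * sum(integrand((j + 0.5) * h, q, mu) for j in range(panels))


if __name__ == "__main__":
    mu = 0.2
    print("integral, q = 2 :", exp(log_mellin_integral(2, mu)))
    print("closed form     :", 1 / ((1 - mu) * (1 - mu / 2)))
    print("integral, q = -1:", exp(log_mellin_integral(-1, mu)))
    print("Eq. (2), n = 1  :", gamma(2 + 1.5 * mu) * gamma(1 - mu / 2) / gamma(1 + mu / 2) ** 2)
```

For $q = 2$ the first bracket vanishes identically and the remaining term is a Frullani integral equal to $-\log(1-\mu) - \log(1-\mu/2)$, which gives a check that does not depend on the quadrature.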

We conjecture that Eq. (3) holds exactly for all intermittency levels $0 \leq \mu < 1$ and complex moments $q$ such that $\Re(q) < 2/\mu$. While we do not know if it is true in this generality, we do know that it holds for integral $q$ as it recovers Selberg's formula when $q$ is positive and Eq. (2) when $q$ is negative. The special case of purely imaginary $q$ corresponds to the characteristic function of $\log M$. To further substantiate our conjecture, we prove that Eq. (3) does indeed produce a valid infinitely divisible characteristic function at all intermittency levels when $q$ is purely imaginary. The corresponding probability density, which is conjectured to be the density of $\log M$, is not known analytically. As a first step towards the goal of computing it, we show that the exponential of the integral in Eq. (3) equals an infinite product of gamma factors in the complex plane, thereby extending Selberg's formula and Eq. (2) to the half-plane $\Re(q) < 2/\mu$:

$$
\mathbf{E}\bigl[M^q\bigr] = \Bigl(\frac{2}{\mu}\Bigr)^{q}\, \Gamma(1 - \mu q/2)\, \Gamma^{-q}(1 - \mu/2)\, \frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} \prod_{n=1}^{\infty} \Bigl(\frac{2n}{\mu}\Bigr)^{2q} \frac{\Gamma^3(1 - q + 2n/\mu)\,\Gamma(2 - q + 2n/\mu)}{\Gamma^3(1 + 2n/\mu)\,\Gamma(2 - 2q + 2n/\mu)}. \tag{4}
$$
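The infinite product in Eq. (4) can be checked against the same integer moments by truncating it. The sketch below works with logarithms through `math.lgamma` and is only meant for real $q$ and $\mu$ for which every gamma argument is positive. The function name `log_mellin_product`, the truncation at 2000 factors and the value $\mu = 0.2$ are illustrative assumptions. The factors behave like $1 + O(n^{-2})$, so the truncation error decays only like the reciprocal of the number of factors.

```python
from math import exp, gamma, lgamma, log


def log_mellin_product(q, mu, factors=2000):
    """Logarithm of the right-hand side of Eq. (4), truncated (hypothetical helper)."""
    tau = 2.0 / mu
    total = (q * log(tau) + lgamma(1 - q / tau) - q * lgamma(1 - 1 / tau)
             + lgamma(2 + tau - 2 * q) - lgamma(2 + tau - q))
    for n in range(1, factors + 1):
        a = n * tau  # a = 2n / mu
        total += (2 * q * log(a) + 3 * (lgamma(1 - q + a) - lgamma(1 + a))
                  + lgamma(2 - q + a) - lgamma(2 - 2 * q + a))
    return total


if __name__ == "__main__":
    mu = 0.2
    print("product, q = 2 :", exp(log_mellin_product(2, mu)))
    print("closed form    :", 1 / ((1 - mu) * (1 - mu / 2)))
    print("product, q = -1:", exp(log_mellin_product(-1, mu)))
    print("Eq. (2), n = 1 :", gamma(2 + 1.5 * mu) * gamma(1 - mu / 2) / gamma(1 + mu / 2) ** 2)
```

At $q = 2$ each factor reduces to $1 - (2n/\mu - 1)^{-2}$, and the full product equals $\Gamma^2(1-\mu/2)/\Gamma(1-\mu)$, which is how the closed form $1/((1-\mu)(1-\mu/2))$ is recovered.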

In particular, this representation implies novel functional equations for the Mellin transform that are stated and proved below. Finally, we use Eq. (3) to compute the conjectured densities of log M and M numerically.

The main technical innovation of this paper is the derivation and use of a novel recurrence relation for the universal expansion coefficients. Its strength is that it encodes both the representation of the coefficients as the alternating sum of the Selberg integral’s derivatives and the Bell polynomial representation of these derivatives mentioned above into a single linear recurrence. We use the method of generating functions to evaluate the coefficients of this recurrence explicitly and relate them to Bernoulli polynomials.

The plan of the paper is as follows. In Sect. 2 we give a brief review of our results on general intermittency expansions following [24]. In Sect. 3 we state and prove the recurrence relation for the expansion coefficients followed by the proof of Eq. (1) in Theorem 3.1. In Sect. 4 we treat the case of integral moments and derive Eq. (2) in Theorem 4.1. In Sect. 5 we establish the small intermittency asymptotic of Eq. (3) in Theorem 5.1. In Sect. 6 we state our main conjecture, explain its origin, show that it gives correct results in the special case of integral q, prove that it gives rise to a valid characteristic function for log M, and relate the Mellin transform to an infinite product of gamma factors. In Sect. 7 we use the conjecture to compute the probability density functions of log M and M numerically and compare them with the corresponding gaussian and lognormal densities, respectively. Conclusions are presented in Sect. 8. The Appendix contains proofs of the infinite product representation and its corollaries.

2. Review of Intermittency Expansions

In this section and throughout this paper we will write M to denote the limit lognormal distribution. It is a probability distribution on the positive real line M > 0, which can


D. Ostrovsky

be thought of as the value of the limit lognormal process at the time and decorrelation length equal to one. We will denote the intermittency parameter by 0 ≤ µ < 1. The simplest approach to the limit lognormal process introduced by Bacry et al. [3] is to consider the exponential functional of a particular stationary gaussian process ω_ε(s) in the limit ε → 0. Specifically, let ω_ε(s) be a gaussian process in s, whose mean and covariance are functions of a finite scale ε > 0. Define them to be

\[ E[\omega_\varepsilon(t)] \triangleq -\frac{\mu}{2}\,(1-\log\varepsilon), \tag{5a} \]

\[ \mathrm{Cov}[\omega_\varepsilon(t),\omega_\varepsilon(s)] \triangleq -\mu\log|t-s|, \quad \varepsilon \le |t-s| \le 1, \tag{5b} \]

\[ \mathrm{Cov}[\omega_\varepsilon(t),\omega_\varepsilon(s)] \triangleq \mu\Big(1-\log\varepsilon-\frac{|t-s|}{\varepsilon}\Big), \tag{5c} \]

if |t − s| < ε, and covariance is zero in the remaining case of |t − s| ≥ 1. Thus, ε is used as a truncation scale, and for simplicity we set the decorrelation length to one. The two key properties of this construction are, first, that E[ω_ε(t)] = −Var[ω_ε(t)]/2 so that E[exp(ω_ε(s))] = 1 and, second, that Var[ω_ε(t)] = µ(1 − log ε) is logarithmically divergent as ε → 0. The first property is essential for convergence, the second is responsible for multifractality, and both are originally due to Mandelbrot [17]. The interest in this construction stems from the ε → 0 limit of the exponential functional $M_\varepsilon(t) \triangleq \int_0^t \exp(\omega_\varepsilon(s))\,ds$. Using the theory of T-martingales developed by Kahane in a series of papers [12–14], and the work of Barral and Mandelbrot [6] on log-Poisson cascades, Bacry and Muzy [5] showed that M_ε(t) converges weakly (as a measure on R+) a.s. to a limit process M(t) = lim_{ε→0} M_ε(t) provided 0 ≤ µ < 1, and the limit is nondegenerate in the sense that E[M(t)] = t. This limit is known as the limit lognormal process, and its value at time t = 1, i.e. M ≜ M(1), is the limit lognormal distribution. We refer the reader to the original work by Bacry et al. [3 and 20] for further details of their construction including its remarkable statistical self-similarity and scaling properties, to [5] for the proof of existence, to [25] for an alternative approach to limit lognormal multifractality, and to [22 and 23] for reviews. The positive integral moments of M were shown in [4] to be given by the celebrated Selberg integral, confer [26] and [1]: for 2 ≤ l < 2/µ,

\[ E\big[M^{l}\big] = \int_0^1\!\cdots\!\int_0^1 \prod_{i<j}^{l} |s_i-s_j|^{-\mu}\,ds^{(l)} = \prod_{k=0}^{l-1} \frac{\Gamma\big(1-(k+1)\mu/2\big)\,\Gamma^{2}(1-k\mu/2)}{\Gamma(1-\mu/2)\,\Gamma\big(2-(l+k-1)\mu/2\big)}, \tag{6} \]

which from now on we will denote by S_l(µ). In general, it is shown in [5] that for q > 0 we have E[M^q] < ∞ if q < 2/µ and, conversely, E[M^q] < ∞ implies that q ≤ 2/µ. Let F(x) be an arbitrary smooth function that does not involve µ and let F^{(k)}(x) denote its kth derivative. Our results on general intermittency expansions established in [24] are summarized in the following propositions.

Proposition 2.1. The expectation E[F(M)] has the formal expansion

\[ E[F(M)] = F(1) + \sum_{n=1}^{\infty} \frac{\mu^{n}}{n!} \sum_{k=2}^{2n} F^{(k)}(1)\,H_{n,k}. \tag{7} \]


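Proposition 2.2. Given n = 1, 2, 3, ···, the coefficients (−1)^l l! H_{n,l} and derivatives of the Selberg integral are binomial transforms of one another:

\[ H_{n,k} = \frac{(-1)^{k}}{k!} \sum_{l=2}^{k} (-1)^{l} \binom{k}{l} \frac{\partial^{n} S_l}{\partial\mu^{n}}\Big|_{\mu=0}, \tag{8a} \]

\[ \frac{\partial^{n} S_l}{\partial\mu^{n}}\Big|_{\mu=0} = \sum_{t=2}^{l} \binom{l}{t}\, t!\, H_{n,t}. \tag{8b} \]

Proposition 2.3. The expansion coefficients H_{n,k} satisfy

\[ H_{n,k} = 0 \quad \forall k > 2n. \tag{9} \]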

Proposition 2.4. Let Y_n(x_1, ···, x_n) denote the complete exponential Bell polynomial of order n and ζ(x) the Riemann zeta function. By convention, we will write ζ(1) to mean Euler’s constant. Define the sequence of coefficients c_p(l) for p = 1, 2, ··· and l = 2, 3, ···

\[ c_p(l) = \frac{1}{p\,2^{p}} \left[ \zeta(p) \sum_{j=0}^{l-1} \Big( (j+1)^{p} + 2j^{p} - 1 - (l+j-1)^{p} \Big) + \sum_{j=0}^{l-1} (l+j-1)^{p} \right]. \tag{10} \]

Then,

\[ \frac{\partial^{n} S_l}{\partial\mu^{n}}\Big|_{\mu=0} = Y_n\big(1!\,c_1(l), \ldots, n!\,c_n(l)\big). \tag{11} \]

We end this section with several remarks. First, H_{n,k} are the universal expansion coefficients, Eq. (8a) is their alternating sum representation, and Eq. (11) is the Bell polynomial representation of the derivatives that we referred to in the Introduction. Second, the intermittency expansion in Eq. (7) can be thought of as a linear operator acting on F(x) with the coefficients that are rational polynomials in values of the Riemann zeta at positive integers. It is an open question as to whether this operator has a number theoretic interpretation similar to that of the Gauss-Kuzmin-Wirsing operator. Third, the coefficient that multiplies ζ(p) in Eq. (10) vanishes if p = 1 ∀l so that Euler’s constant does not enter.

3. Recurrence Relations

We begin by stating and proving the fundamental recurrence relation for the expansion coefficients H_{n,k}. Recall the definition of c_p(l) in Proposition 2.4 above.

Proposition 3.1. The coefficients H_{n,k} satisfy the recurrence relation

\[ H_{n+1,k} = A_{n,k} + \sum_{r=0}^{n-1} \binom{n}{r} \sum_{t=2}^{k} H_{n-r,t}\, B_{r,t,k}, \quad n \ge 0,\ k \ge 2, \tag{12} \]

\[ A_{n,k} \triangleq (-1)^{k}\, \frac{(n+1)!}{k!} \sum_{l=2}^{k} (-1)^{l} \binom{k}{l}\, c_{n+1}(l), \tag{13} \]

\[ B_{r,t,k} \triangleq (-1)^{k}\, t!\, \frac{(r+1)!}{k!} \sum_{l=t}^{k} (-1)^{l} \binom{k}{l} \binom{l}{t}\, c_{r+1}(l). \tag{14} \]


Proof. By Eq. (8a) we have

\[ H_{n+1,k} = \frac{(-1)^{k}}{k!} \sum_{l=2}^{k} (-1)^{l} \binom{k}{l} \frac{\partial^{n+1} S_l}{\partial\mu^{n+1}}\Big|_{\mu=0}. \tag{15} \]

Recall the recursion relation of complete Bell polynomials, confer [9], Chap. 11,

\[ Y_{n+1}(x_1, \cdots, x_{n+1}) = \sum_{r=0}^{n-1} \binom{n}{r}\, Y_{n-r}(x_1, \cdots, x_{n-r})\, x_{r+1} + x_{n+1}. \tag{16} \]
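The recursion in Eq. (16) translates directly into a short program. The following is a minimal Python sketch, not part of the original derivation; the function name `complete_bell` is ours, and the convention Y_0 = 1 is assumed so that the r = 0, ..., n − 1 terms of Eq. (16) are well defined.

```python
from math import comb


def complete_bell(xs):
    """Return [Y_0, Y_1, ..., Y_n] for xs = [x_1, ..., x_n] using Eq. (16)."""
    ys = [1]  # assumed convention: Y_0 = 1
    for n in range(len(xs)):
        # Y_{n+1} = sum_{r=0}^{n-1} C(n, r) * Y_{n-r} * x_{r+1} + x_{n+1}
        total = xs[n]  # xs[n] holds x_{n+1}
        for r in range(n):
            total += comb(n, r) * ys[n - r] * xs[r]  # xs[r] holds x_{r+1}
        ys.append(total)
    return ys


# With x_1 = x_2 = ... = 1 the values Y_n reduce to the Bell numbers.
print(complete_bell([1, 1, 1, 1]))
```

For instance, the third entry computed for general arguments is Y_3 = x_1^3 + 3 x_1 x_2 + x_3, which is the familiar form of the third complete Bell polynomial.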

It follows from this equation and Eq. (11) that we have the identity

\[ \frac{\partial^{n+1} S_l}{\partial\mu^{n+1}}\Big|_{\mu=0} = \sum_{r=0}^{n-1} \binom{n}{r} \frac{\partial^{n-r} S_l}{\partial\mu^{n-r}}\Big|_{\mu=0} (r+1)!\,c_{r+1}(l) + (n+1)!\,c_{n+1}(l). \tag{17} \]

Substituting Eq. (8b) into Eq. (17), we get

\[ \frac{\partial^{n+1} S_l}{\partial\mu^{n+1}}\Big|_{\mu=0} = \sum_{r=0}^{n-1} \binom{n}{r} \sum_{t=2}^{l} \binom{l}{t}\, t!\, H_{n-r,t}\, (r+1)!\,c_{r+1}(l) + (n+1)!\,c_{n+1}(l). \tag{18} \]

Finally, the result follows by substituting this equation back into Eq. (15) and changing the order of summation.

It is clear that Proposition 3.1 determines the H_{n,k} uniquely as H_{1,k} = A_{0,k}. Next, we proceed to compute the alternating sums that occur in Eqs. (13) and (14).

Proposition 3.2. The coefficients A_{n,k} and B_{r,t,k} are

\[ A_{n,k} = \frac{n!}{k!\,2^{n+1}(n+2)}\, \frac{d^{n+2}}{dz^{n+2}}\Big|_{z=0}\, \frac{z}{e^{z}-1} \Big[ \zeta(n+1) \Big( e^{z}(e^{z}-1)^{k} + 2(e^{z}-1)^{k} + e^{-z}(e^{z}-1)^{k} - e^{-z}(e^{2z}-1)^{k} \Big) + e^{-z}(e^{2z}-1)^{k} - e^{-z}(e^{z}-1)^{k} \Big], \tag{19} \]

\[ B_{r,t,k} = t!\, \frac{r!}{k!\,2^{r+1}(r+2)} \Big[ \zeta(r+1) \Big( \binom{k}{t} \frac{d^{r+2}}{dz^{r+2}}\Big|_{z=0}\, \frac{z}{e^{z}-1} \Big( e^{z(t+1)}(e^{z}-1)^{k-t} + 2e^{zt}(e^{z}-1)^{k-t} + e^{z(t-1)}(e^{z}-1)^{k-t} - e^{z(2t-1)}(e^{2z}-1)^{k-t} - 3\delta_{kt} \Big) - (r+2)\,k\,(\delta_{kt} + \delta_{k\,t+1}) \Big) + \binom{k}{t} \frac{d^{r+2}}{dz^{r+2}}\Big|_{z=0}\, \frac{z}{e^{z}-1} \Big( e^{z(2t-1)}(e^{2z}-1)^{k-t} - e^{z(t-1)}(e^{z}-1)^{k-t} \Big) \Big]. \tag{20} \]

Proof. The starting point is the identity known as Faulhaber’s formula that expresses the sum of powers in terms of Bernoulli polynomials B_n(x),

\[ \sum_{j=x}^{y} j^{p} = \frac{B_{p+1}(y+1) - B_{p+1}(x)}{p+1}. \tag{21} \]
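Faulhaber’s formula in Eq. (21) is easy to check in exact rational arithmetic. The sketch below is only an illustration; the helper names `bernoulli_numbers`, `bernoulli_poly` and `faulhaber` are ours, and the Bernoulli numbers are generated from the standard recurrence with the convention B_1 = −1/2.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n):
    """Bernoulli numbers B_0, ..., B_n with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        # sum_{j=0}^{m} C(m+1, j) B_j = 0 determines B_m
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def bernoulli_poly(n, x):
    """Bernoulli polynomial B_n(x) = sum_k C(n, k) B_k x^(n-k), exactly."""
    B = bernoulli_numbers(n)
    x = Fraction(x)
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))


def faulhaber(p, x, y):
    """Right-hand side of Eq. (21), i.e. sum_{j=x}^{y} j^p."""
    return (bernoulli_poly(p + 1, y + 1) - bernoulli_poly(p + 1, x)) / (p + 1)


# Compare with the direct sum of cubes 0^3 + 1^3 + ... + 4^3.
print(faulhaber(3, 0, 4), sum(j ** 3 for j in range(0, 5)))
```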


Using the generating function of Bernoulli polynomials, it follows that c_p(l) can be written as

\[ c_p(l) = \frac{1}{p(p+1)2^{p}} \Big[ \zeta(p) \Big( \frac{d^{p+1}}{dz^{p+1}}\Big|_{z=0}\, \frac{z}{e^{z}-1} \Big( e^{z(l+1)} + 2e^{zl} - 3 + e^{z(l-1)} - e^{z(2l-1)} \Big) - l(p+1) \Big) + \frac{d^{p+1}}{dz^{p+1}}\Big|_{z=0}\, \frac{z}{e^{z}-1} \Big( e^{z(2l-1)} - e^{z(l-1)} \Big) \Big]. \tag{22} \]

We now substitute Eq. (22) in Eqs. (13) and (14) and compute the ensuing alternating sums

\[ \sum_{l=2}^{k} (-1)^{l} \binom{k}{l} e^{zl} = (1-e^{z})^{k} + k e^{z} - 1, \tag{23} \]

\[ \sum_{l=t}^{k} (-1)^{l} \binom{k}{l} \binom{l}{t} e^{zl} = (-1)^{t} \binom{k}{t} e^{zt} (1-e^{z})^{k-t}. \tag{24} \]
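The two alternating sums in Eqs. (23) and (24) can be spot-checked numerically. The following Python sketch is illustrative only; the function names are ours and the test values of k, t and z are arbitrary.

```python
from math import comb, exp, isclose


def alt_sum_23(k, z):
    """Left-hand side of Eq. (23)."""
    return sum((-1) ** l * comb(k, l) * exp(z * l) for l in range(2, k + 1))


def closed_23(k, z):
    """Right-hand side of Eq. (23)."""
    return (1 - exp(z)) ** k + k * exp(z) - 1


def alt_sum_24(k, t, z):
    """Left-hand side of Eq. (24)."""
    return sum((-1) ** l * comb(k, l) * comb(l, t) * exp(z * l)
               for l in range(t, k + 1))


def closed_24(k, t, z):
    """Right-hand side of Eq. (24)."""
    return (-1) ** t * comb(k, t) * exp(z * t) * (1 - exp(z)) ** (k - t)


# Arbitrary test point: both pairs should agree up to rounding.
print(alt_sum_23(5, 0.3), closed_23(5, 0.3))
print(alt_sum_24(5, 2, 0.3), closed_24(5, 2, 0.3))
```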

The rest of the proof follows from lengthy but straightforward algebraic manipulations.

We mention in passing that the derivatives that are involved in Eqs. (19) and (20) can all be evaluated in terms of Nörlund-Bernoulli polynomials. For our purposes, however, it is advantageous to write them in this form as will become clear shortly. We now proceed to derive and solve a key recurrence relation that governs the intermittency expansion for the moments of M. The moments correspond to F(x) = x^q in Eq. (7) for some given q ∈ C. By Proposition 2.1, the intermittency expansion for the moments is

\[ E\big[M^{q}\big] = 1 + \sum_{n=1}^{\infty} \frac{\mu^{n}}{n!}\, f_n(q), \tag{25} \]

\[ f_n(q) = \sum_{k=2}^{\infty} (q)_k\, H_{n,k}, \quad n = 1, 2, 3, \cdots. \tag{26} \]

Here and throughout the rest of the paper we use the standard notation for the ‘falling factorial’ (q)_k ≜ q(q − 1)(q − 2) ··· (q − k + 1) so that the corresponding binomial coefficient satisfies $\binom{q}{k}\,k! = (q)_k$. Note that the upper limit of summation has been extended to infinity by Proposition 2.3.

Proposition 3.3. Let f_0(q) = 1 and define the coefficients b_r(q), r = 0, 1, 2, ···,

\[ b_r(q) = \frac{1}{2^{r+1}} \left[ \zeta(r+1) \left( \frac{B_{r+2}(q+1) + 2B_{r+2}(q) - 3B_{r+2}}{r+2} - q \right) + \big(\zeta(r+1) - 1\big)\, \frac{B_{r+2}(q-1) - B_{r+2}(2q-1)}{r+2} \right]. \tag{27} \]

Then, f_n(q) satisfies the recurrence

\[ f_{n+1}(q) = n! \sum_{r=0}^{n} \frac{f_{n-r}(q)}{(n-r)!}\, b_r(q). \tag{28} \]
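The coefficients b_r(q) in Eq. (27) are linear in ζ(r + 1), so they can be handled exactly by tracking the rational coefficient of ζ(r + 1) and the rational remainder separately. The sketch below does this in Python; the names `bernoulli_poly` and `b_parts` are ours, not the paper's. As a sanity check, for q = 2 the ζ-part vanishes and the remainder equals 1 + 2^{−(r+1)}, which matches the Taylor coefficients of −log((1 − µ)(1 − µ/2)), the logarithm of the l = 2 Selberg integral in Eq. (6).

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n):
    """Bernoulli numbers B_0, ..., B_n with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def bernoulli_poly(n, x):
    """Bernoulli polynomial B_n(x), evaluated exactly."""
    B = bernoulli_numbers(n)
    x = Fraction(x)
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))


def b_parts(r, q):
    """Return (zeta_part, rational_part) such that, by Eq. (27),
    b_r(q) = zeta_part * zeta(r + 1) + rational_part."""
    n = r + 2
    q = Fraction(q)
    first = (bernoulli_poly(n, q + 1) + 2 * bernoulli_poly(n, q)
             - 3 * bernoulli_poly(n, 0)) / n - q
    second = (bernoulli_poly(n, q - 1) - bernoulli_poly(n, 2 * q - 1)) / n
    scale = Fraction(1, 2 ** (r + 1))
    # zeta*first + (zeta - 1)*second = zeta*(first + second) - second
    return scale * (first + second), -scale * second


for r in range(4):
    print(r, b_parts(r, 2))
```

For r = 0 the ζ-part is zero for every q, which is the statement that Euler’s constant never enters, and the remainder reduces to 3q(q − 1)/4.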


Proof. We begin by substituting the main recurrence in Eq. (12) into Eq. (26) and changing the order of summation in the second sum (recall that all the sums involved are finite, despite the notation)

\[ f_{n+1}(q) = \sum_{k=2}^{\infty} (q)_k A_{n,k} + \sum_{r=0}^{n-1} \binom{n}{r} \sum_{t=2}^{\infty} \Big( \sum_{k=t}^{\infty} (q)_k B_{r,t,k} \Big) H_{n-r,t}. \tag{29} \]

We now use Proposition 3.2 to evaluate $\sum_{k=2}^{\infty} (q)_k A_{n,k}$ and $\sum_{k=t}^{\infty} (q)_k B_{r,t,k}$. By Proposition 3.2 it is sufficient to sum the elementary series for |a| < 1,

\[ \sum_{k=2}^{\infty} a^{k} (q)_k / k! = (1+a)^{q} - qa - 1, \qquad \sum_{k=t}^{\infty} a^{k-t} (q)_k / (k-t)! = (q)_t\, (1+a)^{q-t}. \tag{30} \]
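The two elementary series in Eq. (30) are instances of the binomial series and can be confirmed numerically by truncation. The Python sketch below is illustrative; the function names and the truncation length of 80 terms are our choices, and the test values of q, a and t are arbitrary with |a| < 1.

```python
from math import factorial, isclose


def falling(q, k):
    """Falling factorial (q)_k = q (q - 1) ... (q - k + 1)."""
    out = 1.0
    for i in range(k):
        out *= q - i
    return out


def series_first(q, a, terms=80):
    """Truncation of sum_{k>=2} a^k (q)_k / k!."""
    return sum(a ** k * falling(q, k) / factorial(k) for k in range(2, terms))


def series_second(q, a, t, terms=80):
    """Truncation of sum_{k>=t} a^(k-t) (q)_k / (k-t)!."""
    return sum(a ** (k - t) * falling(q, k) / factorial(k - t)
               for k in range(t, t + terms))


q, a, t = 2.5, 0.3, 3
print(series_first(q, a), (1 + a) ** q - q * a - 1)
print(series_second(q, a, t), falling(q, t) * (1 + a) ** (q - t))
```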

After straightforward algebraic reductions, we obtain

\[ \sum_{k=2}^{\infty} (q)_k A_{n,k} = n!\, b_n(q), \tag{31} \]

\[ \sum_{k=t}^{\infty} (q)_k B_{r,t,k} = (q)_t\, r!\, b_r(q). \tag{32} \]

The result follows.

Theorem 3.1. The moments have the following exact formal representation:

\[ E\big[M^{q}\big] = \exp\left( \sum_{r=0}^{\infty} \frac{\mu^{r+1}}{r+1}\, b_r(q) \right), \quad q \in \mathbb{C}. \tag{33} \]

Proof. The key point to observe is that the solution to the recurrence relation in Eq. (28) is

\[ f_n(q) = Y_n\big( b_0(q)\,0!,\ b_1(q)\,1!,\ \cdots,\ b_{n-1}(q)\,(n-1)! \big), \tag{34} \]

where Y_n stands for the exponential Bell polynomial as in Sect. 2. This is proved by noticing that the recurrence in Eq. (28) satisfies the recurrence relation of Bell polynomials in Eq. (16). The result now follows from the well-known formula for the generating function of Bell polynomials, confer Chap. 11 in [9].

The significance of Theorem 3.1 is that it gives an exact closed-form expression for the moments. This solution is however formal as we have not said anything yet about convergence. This question is addressed in the next three sections.

4. Convergent Series

Our goal in this section is to sum the series in Theorem 3.1 in the special case of integral q such that −2/µ + 1/2 < q < 2/µ, in which case the series is convergent. Thus, we will establish the validity of Eq. (2) for this particular range of moments and also show as a self-check that we do indeed recover Selberg’s formula in Eq. (6). Throughout this section we work with the derivative series $\sum_{r=0}^{\infty} \mu^{r} b_r(q)$ instead of the one that is involved in Theorem 3.1 as it is technically easier to handle.


Proposition 4.1. Let q = −n, n = 1, 2, 3, ··· < 2/µ − 1/2. Then, the sum of the series is

\[ \sum_{r=0}^{\infty} \mu^{r} b_r(q) = \frac{\partial}{\partial\mu} \log \prod_{k=0}^{n-1} \frac{\Gamma\big(2+(n+2+k)\mu/2\big)\,\Gamma(1-\mu/2)}{\Gamma^{2}\big(1+(k+1)\mu/2\big)\,\Gamma(1+k\mu/2)}. \tag{35} \]

Proof. The starting point is a generalization of Faulhaber’s formula in Eq. (21),

\[ \frac{B_{p+1}(-x) - B_{p+1}(-y)}{p+1} = \sum_{j=x+1}^{y} (-j)^{p}, \quad 0 < x < y, \tag{36} \]

which follows from Eq. (21) by means of the identity B_{p+1}(−x) = (−1)^{p+1}(B_{p+1}(x) + (p + 1)x^p) that is satisfied by Bernoulli polynomials, confer [30], Chap. 1. By the definition of b_r(q), we obtain

\[ b_r(-n) = \frac{1}{2^{r+1}} \left[ \zeta(r+1) \Big( (-n)^{r+1} - 3 \sum_{k=1}^{n} (-k)^{r+1} + n \Big) + \big(\zeta(r+1) - 1\big) \sum_{k=n+2}^{2n+1} (-k)^{r+1} \right]. \tag{37} \]

Recall the identities, confer Sect. 3.4 of [27], that relate the Riemann zeta to the digamma function (recall that ζ(1) denotes Euler’s constant),

\[ \sum_{r=1}^{\infty} \zeta(r)\, t^{r-1} = -\psi(1-t), \quad |t| < 1, \tag{38} \]

\[ \sum_{r=1}^{\infty} \big(\zeta(r) - 1\big)\, t^{r-1} = -\psi(1-t) - \frac{1}{1-t}, \quad |t| < 2. \tag{39} \]

We now change the index of summation in $\sum_{r=0}^{\infty} \mu^{r} b_r(-n)$ from r to r + 1, change the order of summation over k and r, and evaluate the ensuing sums over r by means of Eqs. (38) and (39). We obtain for the sum of the series $\sum_{r=0}^{\infty} \mu^{r} b_r(-n)$,

\[ \frac{1}{2} \left[ n\psi\Big(1+\frac{\mu n}{2}\Big) - 3 \sum_{k=1}^{n} k\,\psi\Big(1+\frac{\mu k}{2}\Big) - n\psi\Big(1-\frac{\mu}{2}\Big) + \sum_{k=n+2}^{2n+1} \Big( k\,\psi\Big(1+\frac{\mu k}{2}\Big) + \frac{k}{1+\mu k/2} \Big) \right]. \tag{40} \]

It is elementary to verify that the expressions in Eqs. (35) and (40) are the same.

Theorem 4.1. The negative integral moments satisfy Eq. (2) for n = 1, ··· < 2/µ − 1/2.

Proof. The moments solve the equation $\frac{\partial}{\partial\mu} E[M^{q}] = E[M^{q}] \sum_{r=0}^{\infty} \mu^{r} b_r(q)$, subject to $E[M^{q}](\mu = 0) = 1$, by Theorem 3.1. By Proposition 4.1, so does the product formula in Eq. (2).


Proposition 4.2. Let q = n, n = 1, 2, 3, ··· < 2/µ. Then, the sum of the series is

\[ \sum_{r=0}^{\infty} \mu^{r} b_r(q) = \frac{\partial}{\partial\mu} \log \prod_{k=0}^{n-1} \frac{\Gamma\big(1-(k+1)\mu/2\big)\,\Gamma^{2}(1-k\mu/2)}{\Gamma(1-\mu/2)\,\Gamma\big(2-(n+k-1)\mu/2\big)}. \tag{41} \]

Proof. The same argument as in the proof of Proposition 4.1 gives for the sum of the series

\[ \frac{1}{2} \left[ -n\psi\Big(1-\frac{\mu n}{2}\Big) - 3 \sum_{k=0}^{n-1} k\,\psi\Big(1-\frac{\mu k}{2}\Big) + n\psi\Big(1-\frac{\mu}{2}\Big) + \sum_{k=n-1}^{2n-2} \Big( k\,\psi\Big(1-\frac{\mu k}{2}\Big) + \frac{k}{1-\mu k/2} \Big) \right]. \tag{42} \]

It is now easy to verify that the expressions in Eqs. (41) and (42) are the same.

Theorem 4.2. The positive integral moments satisfy Eq. (6) for n = 1, 2, 3, ··· < 2/µ.

5. Asymptotic Series

In this section we will consider the series in Theorem 3.1 for arbitrary q ∈ C. The series is a divergent asymptotic series unless q is integral. Our goal is to prove Eq. (3), which gives the sum of the series in the limit of small intermittency. In other words, we will compute the sum by finding a particular integral function whose asymptotic expansion in the limit of zero intermittency coincides with the series. We begin with a preliminary result that establishes the asymptotic expansion of two particular integrals and is the basis of our summation method.

Proposition 5.1. Let 0 ≤ µ < 1, q ∈ C. Then, in the limit µ → +0,

\[ \sum_{r=0}^{\infty} \Big(\frac{\mu}{2}\Big)^{r+1} \frac{1}{r+1}\, \frac{B_{r+2}(q) - B_{r+2}}{r+2} \sim \int_0^\infty \frac{e^{-x}}{x} \left[ \frac{e^{\frac{\mu x}{2} q} - 1}{e^{\frac{\mu x}{2}} - 1} - q \right] dx, \tag{43} \]

\[ \sum_{r=1}^{\infty} \Big(\frac{\mu}{2}\Big)^{r+1} \frac{\zeta(r+1)}{r+1}\, \frac{B_{r+2}(q) - B_{r+2}}{r+2} \sim \int_0^\infty \left[ \frac{e^{\frac{\mu x}{2} q} - 1}{e^{\frac{\mu x}{2}} - 1} - q - \frac{(q^{2}-q)}{2}\, \frac{\mu x}{2} \right] \frac{dx}{(e^{x}-1)\,x}. \tag{44} \]

Proof. We start with the well-known identity

\[ y\, \frac{e^{yq} - 1}{e^{y} - 1} = \sum_{r=1}^{\infty} \big( B_r(q) - B_r \big) \frac{y^{r}}{r!} \tag{45} \]

that is valid for |y| < 2π, hence valid asymptotically as y → 0. Using that B_1(q) − B_1 = q, we obtain in the limit y → 0,

\[ f(y) \triangleq \frac{1}{y} \left[ \frac{e^{yq} - 1}{e^{y} - 1} - q \right] \sim \sum_{r=0}^{\infty} \frac{B_{r+2}(q) - B_{r+2}}{(r+2)!}\, y^{r}. \tag{46} \]

Finally, as the integral $\int_0^\infty \exp(-2y/\mu)\, f(y)\, dy$ equals the integral on the right-hand side of Eq. (43), the result follows by Watson’s lemma, confer Theorem 3.1 in Chap. 3 of [21]. The proof of Eq. (44) is quite similar and requires a slight generalization of Watson’s Lemma in the following form. Given a function that has the asymptotic expansion $f(y) \sim \sum_{r=1}^{\infty} a_r y^{r}$ as y → 0, then as z → +∞,

\[ \int_0^\infty \frac{1}{e^{zy} - 1}\, f(y)\, dy \sim \sum_{r=1}^{\infty} \zeta(r+1)\, r!\, a_r / z^{r+1}, \tag{47} \]

confer Lemma 10.2 in Chap. 38 of [8]. Now, using that B_2(q) − B_2 = q² − q, we have in the limit y → 0

\[ f(y) \triangleq \frac{1}{y} \left[ \frac{e^{yq} - 1}{e^{y} - 1} - q - (q^{2} - q)\, \frac{y}{2} \right] \sim \sum_{r=1}^{\infty} \frac{B_{r+2}(q) - B_{r+2}}{(r+2)!}\, y^{r}. \tag{48} \]

The result now follows from Eq. (47).

We can naturally think of Eqs. (43) and (44) as defining the sums of the divergent series involved. The summation method in Eq. (43) is known as Borel’s and the one in Eq. (44) is its close kin known as Hardy’s Moment Constant Method, confer Sect. 4.12 and 4.13 in [11]. We now proceed to the main result of this section that gives an asymptotic formula for the sum of the series in Theorem 3.1.

Theorem 5.1. Let b_r(q) be as in Proposition 3.3 and q ∈ C. Then, as µ → +0,

\[ E\big[M^{q}\big] \sim \exp\left( \int_0^\infty \frac{dx}{x} \left\{ \frac{1}{e^{x}-1} \left[ \frac{e^{\frac{\mu x}{2}(q+1)} + 2e^{\frac{\mu x}{2} q} - 3 + e^{\frac{\mu x}{2}(q-1)} - e^{\frac{\mu x}{2}(2q-1)}}{e^{\frac{\mu x}{2}} - 1} - \Big( 1 + q + q\,e^{\frac{\mu x}{2}} \Big) \right] + e^{-x} \left[ \frac{e^{\frac{\mu x}{2}(2q-1)} - e^{\frac{\mu x}{2}(q-1)}}{e^{\frac{\mu x}{2}} - 1} - q \right] \right\} \right). \tag{49} \]

Proof. It is sufficient to show that $\sum_{r=0}^{\infty} \mu^{r+1} b_r(q)/(r+1)$ is asymptotic to the integral on the right-hand side of Eq. (49) as asymptotic series can be exponentiated, confer Sect. 65 of [16]. The expression in Eq. (27) is a linear combination of terms that are of the same types as in Proposition 5.1 with the exception of the q term, which can be treated using the identity

\[ \sum_{r=1}^{\infty} \Big(\frac{\mu}{2}\Big)^{r+1} \frac{\zeta(r+1)}{r+1} = \int_0^\infty \Big( e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2} \Big) \frac{dx}{(e^{x}-1)\,x}. \tag{50} \]

The result follows by an elementary algebraic reduction.


6. The Conjecture and its Corollaries

In this section we will state our conjecture, explain its origin, show that it is consistent with the known values of integral moments, and discuss its implications. Throughout this section we denote by D(q) the integral on the right-hand side of Eq. (49) so that the content of Theorem 5.1 is that E[M^q] ∼ exp(D(q)) as µ → +0.

Conjecture. The equality E[M^q] = exp(D(q)) holds for 0 ≤ µ < 1 and ℜ(q) < 2/µ.

The origin of this conjecture is as follows. For a fixed q ∈ C define two functions of z ∈ C,

\[ g(z) \triangleq \frac{z}{e^{z}-1} \Big( e^{z(q+1)} + 2e^{zq} - 3 + e^{z(q-1)} - e^{z(2q-1)} \Big), \tag{51} \]

\[ h(z) \triangleq \frac{z}{e^{z}-1} \Big( e^{z(2q-1)} - e^{z(q-1)} \Big). \tag{52} \]

It is clear that these functions are meromorphic in z, and z = 0 is a removable singularity with g(0) = h(0) = 0. Now, the integral in Eq. (49) can be written as

\[ D(q) = \int_0^\infty \frac{dx}{x} \left\{ \frac{1}{e^{x}-1} \left[ \frac{1}{\mu x/2} \Big( g\Big(\frac{\mu x}{2}\Big) - g'(0)\, \frac{\mu x}{2} - \frac{1}{2}\, g''(0) \Big(\frac{\mu x}{2}\Big)^{2} \Big) - q \Big( e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2} \Big) \right] + e^{-x}\, \frac{1}{\mu x/2} \Big( h\Big(\frac{\mu x}{2}\Big) - h'(0)\, \frac{\mu x}{2} \Big) \right\}. \tag{53} \]

The Taylor series at z = 0 for the integrands in Eq. (53) are

\[ \frac{1}{z} \Big[ g(z) - g'(0)\,z - \frac{1}{2}\, g''(0)\,z^{2} \Big] - q\,(e^{z} - 1 - z) = \sum_{r=1}^{\infty} \Big( g^{(r+2)}(0) - q(r+2) \Big) \frac{z^{r+1}}{(r+2)!}, \tag{54} \]

\[ \frac{1}{z} \Big[ h(z) - h'(0)\,z \Big] = \sum_{r=0}^{\infty} h^{(r+2)}(0)\, \frac{z^{r+1}}{(r+2)!}. \tag{55} \]

Using

\[ g^{(n)}(0) = B_n(q+1) + 2B_n(q) - 3B_n + B_n(q-1) - B_n(2q-1), \tag{56} \]

\[ h^{(n)}(0) = B_n(2q-1) - B_n(q-1), \tag{57} \]

if we now substitute Eqs. (54) and (55) at z = µx/2 into Eq. (53) and integrate term by term, we obtain precisely the series $\sum_{r=0}^{\infty} \mu^{r+1} b_r(q)/(r+1)$. This is the origin of our conjecture. It is worth emphasizing that the conjecture “explains” the source of divergence of $\sum_{r=0}^{\infty} \mu^{r+1} b_r(q)/(r+1)$ for non-integral q. In fact, when q is integral, the functions g(z) and h(z) are entire so that the substitution of the Taylor expansions in Eqs. (54) and (55) into Eq. (53) is legitimate. For non-integral q, these series have finite radii of convergence, and the substitution leads to a divergent asymptotic series. Aside from the proven correctness of our conjecture in the limit of small intermittency, we also have strong analytic evidence that supports it in the case of the general intermittency level. Specifically, the following two theorems show that Eq. (49) recovers the values of the integral moments that were computed previously in Theorems 4.1 and 4.2, and that it gives rise to a valid characteristic function when q is purely imaginary.


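Theorem 6.1. Let 0 ≤ µ < 1 and n = 1, 2, 3, ···. Then, for q = ±n, the integral in Eq. (49) gives

\[ E\big[M^{n}\big] = \prod_{k=0}^{n-1} \frac{\Gamma\big(1-(k+1)\mu/2\big)\,\Gamma^{2}(1-k\mu/2)}{\Gamma(1-\mu/2)\,\Gamma\big(2-(n+k-1)\mu/2\big)}, \quad n < 2/\mu, \tag{58} \]

\[ E\big[M^{-n}\big] = \prod_{k=0}^{n-1} \frac{\Gamma\big(2+(n+2+k)\mu/2\big)\,\Gamma(1-\mu/2)}{\Gamma^{2}\big(1+(k+1)\mu/2\big)\,\Gamma(1+k\mu/2)}, \quad \forall n. \tag{59} \]

Proof. If q is integral, then the complicated fractions that are involved in Eq. (49) become finite sums of exponentials. The starting point is the algebraic identity

\[ \frac{1}{e^{z}-1} \Big[ e^{z(q+1)} + 2e^{zq} - 3 + e^{z(q-1)} - e^{z(2q-1)} \Big] - \big( 1 + q + q e^{z} \big) = e^{zq} - 1 - q(e^{z}-1) + 2 \Big[ \frac{e^{zq}-1}{e^{z}-1} - q \Big] - \frac{e^{zq}-1}{e^{z}-1} \Big( e^{z(q-1)} - 1 \Big). \tag{60} \]

Consider first the case of q = n, n = 1, 2, 3, ···. Then,

\[ \frac{e^{zn}-1}{e^{z}-1} = \sum_{k=0}^{n-1} e^{zk}, \qquad \frac{e^{z(2n-1)} - e^{z(n-1)}}{e^{z}-1} = \sum_{k=0}^{n-1} e^{z(n+k-1)}. \tag{61} \]

Denoting z = µx/2, the integral in Eq. (49) simplifies to

\[ D(n) = \int_0^\infty \frac{dx}{x} \left\{ \frac{1}{e^{x}-1} \Big[ e^{zn} - 1 - n(e^{z}-1) + 2 \sum_{k=0}^{n-1} (e^{zk} - 1) - \sum_{k=0}^{n-1} \big( e^{z(n+k-1)} - e^{zk} \big) \Big] + e^{-x} \sum_{k=0}^{n-1} \big( e^{z(n+k-1)} - 1 \big) \right\}. \tag{62} \]

By means of the identities, confer Sect. 1.2 of [27],

\[ \log \Gamma(1+s) = \int_0^\infty \Big[ \frac{e^{-ts} - 1}{e^{t} - 1} + s\,e^{-t} \Big] \frac{dt}{t} \ \ \text{(Malmstén)}, \quad \Re(s) > -1, \tag{63} \]

\[ \log(s) = \int_0^\infty \Big( e^{-t} - e^{-ts} \Big) \frac{dt}{t} \ \ \text{(Frullani)}, \quad \Re(s) > 0; \tag{64} \]

it is now not too difficult to show that the integral equals

\[ D(n) = \log \prod_{k=0}^{n-1} \frac{\Gamma\big(1-(k+1)\mu/2\big)\,\Gamma^{2}(1-k\mu/2)}{\Gamma(1-\mu/2)\,\Gamma\big(2-(n+k-1)\mu/2\big)}. \tag{65} \]

This completes the proof of Eq. (58).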
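The statement of Theorem 6.1 lends itself to a direct numerical check: evaluate the integral in Eq. (49) by quadrature at an integral q and compare it with the logarithm of the product formulas in Eqs. (58) and (59). The Python sketch below does this with a plain midpoint rule; the function names, the cutoff `upper`, and the step count are our own choices rather than anything prescribed in the paper, and the accuracy is only what a crude rule on a truncated interval can deliver.

```python
from math import exp, expm1, lgamma, log


def selberg_log_moment(n, mu):
    """log of the product in Eq. (58); needs n < 2/mu."""
    total = 0.0
    for k in range(n):
        total += lgamma(1 - (k + 1) * mu / 2) + 2 * lgamma(1 - k * mu / 2)
        total -= lgamma(1 - mu / 2) + lgamma(2 - (n + k - 1) * mu / 2)
    return total


def negative_log_moment(n, mu):
    """log of the product in Eq. (59)."""
    total = 0.0
    for k in range(n):
        total += lgamma(2 + (n + 2 + k) * mu / 2) + lgamma(1 - mu / 2)
        total -= 2 * lgamma(1 + (k + 1) * mu / 2) + lgamma(1 + k * mu / 2)
    return total


def integrand(x, q, mu):
    """Integrand of Eq. (49), including the 1/x factor, for real q."""
    z = mu * x / 2
    num = (exp(z * (q + 1)) + 2 * exp(z * q) - 3
           + exp(z * (q - 1)) - exp(z * (2 * q - 1)))
    first = (num / expm1(z) - (1 + q + q * exp(z))) / expm1(x)
    second = exp(-x) * ((exp(z * (2 * q - 1)) - exp(z * (q - 1))) / expm1(z) - q)
    return (first + second) / x


def D(q, mu, upper=80.0, steps=40000):
    """Midpoint-rule approximation of the integral in Eq. (49) on (0, upper)."""
    h = upper / steps
    return h * sum(integrand((i + 0.5) * h, q, mu) for i in range(steps))


mu = 0.5
print(D(2, mu), selberg_log_moment(2, mu))    # both close to log(8/3)
print(D(-1, mu), negative_log_moment(1, mu))
```

At µ = 0.5 the second moment is 1/((1 − µ)(1 − µ/2)) = 8/3, which gives an independent value to compare against.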


The case of q = −n is very similar. Instead of Eq. (61) we now have

\[ \frac{e^{-zn}-1}{e^{z}-1} = -\sum_{k=0}^{n-1} e^{-z(k+1)}, \qquad \frac{e^{-z(2n+1)} - e^{-z(n+1)}}{e^{z}-1} = -\sum_{k=0}^{n-1} e^{-z(n+k+2)}, \tag{66} \]

so that the integral in Eq. (49) reduces to

\[ D(-n) = \int_0^\infty \frac{dx}{x} \left\{ \frac{1}{e^{x}-1} \Big[ e^{-zn} - 1 + n(e^{z}-1) - 2 \sum_{k=0}^{n-1} \big( e^{-z(k+1)} - 1 \big) + \sum_{k=0}^{n-1} \big( e^{-z(n+k+2)} - e^{-z(k+1)} \big) \Big] - e^{-x} \sum_{k=0}^{n-1} \big( e^{-z(n+k+2)} - 1 \big) \right\}. \tag{67} \]

Again, the integral is evaluated using the identities in Eqs. (63) and (64) resulting in

\[ D(-n) = \log \prod_{k=0}^{n-1} \frac{\Gamma\big(2+(n+2+k)\mu/2\big)\,\Gamma(1-\mu/2)}{\Gamma^{2}\big(1+(k+1)\mu/2\big)\,\Gamma(1+k\mu/2)}. \tag{68} \]

This completes the proof of Eq. (59).

Remark 6.1. It is not too difficult to see from Eq. (59) that log E[M^{−n}] grows as n² as n → +∞. This is the same rate of growth as that of the moments of the lognormal distribution. As the lognormal moment problem is well-known to be indeterminate, confer [7], it is natural to ask if the same conclusion holds for M^{−1}. It does as shown in Corollary 6.1 below. This, however, cannot be concluded from the rate of growth of the moments alone because the converse to the Carleman criterion of determinacy is false in general.

Having considered the case of integral q, we now turn our attention to that of purely imaginary q in Eq. (49), which corresponds to the characteristic function of log M. Formally, we have the identity

\[ E\big[ e^{iq \log M} \big] = \exp\big( D(iq) \big), \quad q \in \mathbb{R}. \tag{69} \]

Thus, a necessary condition for the validity of our conjecture is that exp(D(iq)) be a valid characteristic function. This is the result of the next theorem.

Theorem 6.2. For any 0 ≤ µ < 1 the function q → exp(D(iq)), q ∈ R, is the characteristic function of an infinitely divisible distribution.

Proof. We will prove the assertion by reducing D(iq) to the Lévy-Khinchine functional form. Fix a q ∈ R and introduce the following functions of x ∈ R+

\[ F(x) \triangleq \Big( e^{iq\frac{\mu x}{2}} - 1 - iq\,\frac{\mu x}{2} \Big) \Big( e^{\frac{\mu x}{2}} + 2 + e^{-\frac{\mu x}{2} - x} \Big), \tag{70} \]

\[ G(x) \triangleq \Big( e^{2iq\frac{\mu x}{2}} - 1 - 2iq\,\frac{\mu x}{2} \Big)\, e^{-\frac{\mu x}{2} - x}, \tag{71} \]

\[ H(x) \triangleq \frac{\mu x}{2} \Big( e^{\frac{\mu x}{2}} + 2 - e^{-\frac{\mu x}{2} - x} \Big) - \Big( 2 + e^{\frac{\mu x}{2}} - e^{-x} \Big) \Big( e^{\frac{\mu x}{2}} - 1 \Big). \tag{72} \]


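Then, it is not hard to see that the integral in Eq. (49) can be written as

\[ D(iq) = \int_0^\infty \frac{dx}{x} \left[ \frac{F(x) - G(x) + iq\,H(x)}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} \right]. \tag{73} \]

Now, using that G(0) = G′(0) = 0, we can split the integral into two as follows

\[ \int_0^\infty \frac{dx}{x} \left[ \frac{F(x) - G''(0)\,x^{2}/2 + iq\,H(x)}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} \right] - \int_0^\infty \frac{dx}{x} \left[ \frac{G(x) - G''(0)\,x^{2}/2}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} \right]. \tag{74} \]

Next, we make a change of variables x′ = 2x in the second integral and then put the two integrals back under the same integral sign. Note that H(x) = O(x³) as x → 0 and G″(0) = −q²µ². There results

\[ D(iq) = \int_0^\infty \frac{dx}{x} \Big( e^{iq\frac{\mu x}{2}} - 1 - iq\,\frac{\mu x}{2} \Big) \left[ \frac{e^{\frac{\mu x}{2}} + 2 + e^{-\frac{\mu x}{2} - x}}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} - \frac{e^{-\frac{\mu x}{4} - \frac{x}{2}}}{\big(e^{\frac{x}{2}}-1\big)\big(e^{\frac{\mu x}{4}}-1\big)} \right] \tag{75} \]

\[ \qquad - \frac{q^{2}\mu^{2}}{2} \int_0^\infty \frac{dx}{x} \left[ \frac{(x/2)^{2}}{\big(e^{\frac{x}{2}}-1\big)\big(e^{\frac{\mu x}{4}}-1\big)} - \frac{x^{2}}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} \right] \tag{76} \]

\[ \qquad + iq \int_0^\infty \frac{dx}{x} \left[ \frac{H(x)}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} \right]. \tag{77} \]

The key observation is that the functions that are involved in Eqs. (75) and (76) are non-negative. That is, it is not difficult to show by inspection that

\[ \frac{e^{\frac{\mu x}{2}} + 2 + e^{-\frac{\mu x}{2} - x}}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} - \frac{e^{-\frac{\mu x}{4} - \frac{x}{2}}}{\big(e^{\frac{x}{2}}-1\big)\big(e^{\frac{\mu x}{4}}-1\big)} \ge 0, \tag{78} \]

\[ \frac{(x/2)^{2}}{\big(e^{\frac{x}{2}}-1\big)\big(e^{\frac{\mu x}{4}}-1\big)} - \frac{x^{2}}{(e^{x}-1)\big(e^{\frac{\mu x}{2}}-1\big)} \ge 0. \tag{79} \]

Finally, the change of variables u = µx/2 in Eq. (75) and simple re-arrangement of the terms bring D(iq) to the canonical form, confer Theorem 4.4 in Chap. 4 of [28],

\[ D(iq) = iqa - \frac{1}{2}\, q^{2}\sigma^{2} + \int_{\mathbb{R}\setminus\{0\}} \Big( e^{iqu} - 1 - \frac{iqu}{1+u^{2}} \Big)\, d\mathcal{M}(u) \tag{80} \]

for some constants a ∈ R and σ² > 0 depending on µ, and the spectral function $\mathcal{M}(u)$ that is defined by

\[ \mathcal{M}(u) = -\int_u^\infty \left[ \frac{e^{v} + 2 + e^{-v - \frac{2v}{\mu}}}{\big(e^{\frac{2v}{\mu}}-1\big)(e^{v}-1)} - \frac{e^{-\frac{v}{2} - \frac{v}{\mu}}}{\big(e^{\frac{v}{\mu}}-1\big)\big(e^{\frac{v}{2}}-1\big)} \right] \frac{dv}{v}, \quad u > 0, \tag{81} \]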


and $\mathcal{M}(u) = 0$ for u < 0. Note that $\mathcal{M}(u)$ is continuous and non-decreasing on (−∞, 0) and (0, ∞) by construction, and satisfies the required integrability and limit conditions $\int_{[-1,1]\setminus\{0\}} u^{2}\, d\mathcal{M}(u) < \infty$ and $\lim_{u\to\pm\infty} \mathcal{M}(u) = 0$. It follows that D(iq) is precisely of the Lévy-Khinchine functional form.

Corollary 6.1. The moment problem for log M is determinate and those for M and M^{−1} are indeterminate.

Proof. The argument is based on the general theory of tail asymptotics of infinitely divisible distributions. The two key properties of log M that we need are that its spectral function $\mathcal{M}(u)$ is concentrated on R+ and decays as exp(−2u/µ)/u up to a constant in the limit u → +∞, confer Eq. (81). We conclude that the left tail u → −∞ of log M is gaussian, confer Theorem 9.7 in Chap. 4 of [28]. On the other hand, it follows from Theorem 2 in [2] that the right tail of log M satisfies log P(log M > u) ∼ −2(u − 1)/µ as u → +∞. It is immediate from here that the moment problem for log M is determinate. Indeed, the moments of log M grow no faster than those of the exponential distribution, which are known to satisfy the Carleman criterion, confer [7]. On the other hand, log P(M > u) ∼ −2(log u − 1)/µ as u → +∞ so that the right tail of M follows a power law, while its left tail u → +0 is lognormal. By the same token, the right tail of M^{−1} is lognormal, while its left tail is a power law. Hence, the moment problems for M and M^{−1} are indeterminate by the Krein criterion, confer [29].

Corollary 6.2. The positive integral moments of log M are finite and satisfy

\[ E\big[ (\log M)^{n+1} \big] = \sum_{r=0}^{n} \binom{n}{r}\, E\big[ (\log M)^{n-r} \big]\, D^{(r+1)}(0). \tag{82} \]

Proof. The positive integral moments of log M are given by the derivatives of E[M^q] at q = 0. As E[M^q] = exp(D(q)), we have by di Bruno’s formula, confer [9], Chap. 11,

\[ E\big[ (\log M)^{n} \big] = Y_n\big( D^{(1)}(0), \cdots, D^{(n)}(0) \big). \tag{83} \]

Equation (82) follows from the recurrence relation of Bell polynomials, confer Eq. (16).

We conclude this section with a partial result pertaining to the open question of analytically computing the probability density function of M. The density is the Mellin inverse of E[M^q] = exp(D(q)). Computing it is not an easy task. We have so far given two representations of D(q), one being its definition in Eq. (49) and the other being the Lévy-Khinchine form in Eqs. (75)–(77) for the case of purely imaginary q. The following theorem gives a formula for exp(D(q)), which extends Selberg’s finite product formula to an infinite product and reveals a deep connection between the structure of the Mellin transform and the gamma function in the complex plane.

Theorem 6.3. The Mellin transform in Eq. (49) satisfies for ℜ(q) < 2/µ,

\[ E\big[M^{q}\big] = \Gamma(1-\mu q/2)\,\Gamma^{-q}(1-\mu/2)\,\Big(\frac{2}{\mu}\Big)^{q}\, \frac{\Gamma(2+2/\mu-2q)}{\Gamma(2+2/\mu-q)} \tag{84} \]

\[ \qquad \times \prod_{n=1}^{\infty} \Big(\frac{2n}{\mu}\Big)^{2q}\, \frac{\Gamma^{3}(1-q+2n/\mu)\,\Gamma(2-q+2n/\mu)}{\Gamma^{3}(1+2n/\mu)\,\Gamma(2-2q+2n/\mu)}. \tag{85} \]
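The infinite product in Eqs. (84) and (85) can be compared numerically with the known integral moments by truncating the product. The Python sketch below does so for real q; the function name `log_mellin_product` and the truncation at 5000 factors are our choices, and the sketch assumes that q is real and small enough that every gamma argument is positive, so that `math.lgamma` returns the logarithm of the gamma function itself. The truncation error decays only like the reciprocal of the number of factors, so the agreement is to a few digits.

```python
from math import lgamma, log


def log_mellin_product(q, mu, terms=5000):
    """Truncated logarithm of the right-hand side of Eqs. (84)-(85), real q.

    Assumes all gamma arguments are positive, e.g. q < 1 + 1/mu and q < 2/mu.
    """
    a = 2.0 / mu
    # prefactor, Eq. (84)
    total = lgamma(1 - mu * q / 2) - q * lgamma(1 - mu / 2) + q * log(a)
    total += lgamma(2 + a - 2 * q) - lgamma(2 + a - q)
    # truncated product, Eq. (85)
    for n in range(1, terms + 1):
        x = n * a
        total += 2 * q * log(x) + 3 * lgamma(1 - q + x) + lgamma(2 - q + x)
        total -= 3 * lgamma(1 + x) + lgamma(2 - 2 * q + x)
    return total


mu = 0.5
# E[M] = 1 and E[M^2] = 1 / ((1 - mu)(1 - mu/2)) by Eq. (6).
print(log_mellin_product(1, mu))
print(log_mellin_product(2, mu), -log((1 - mu) * (1 - mu / 2)))
```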


The proof is quite long and is deferred to the Appendix.² A direct proof that Theorem 6.3 implies Theorem 6.1 is easy and will be omitted.

Corollary 6.3. The Mellin transform satisfies the following functional equations.

\[ E\big[M^{q+\frac{2}{\mu}}\big] = (2\pi)^{\frac{2}{\mu}-1}\, \Gamma^{-\frac{2}{\mu}}\Big(1-\frac{\mu}{2}\Big)\, \frac{2}{\mu}\, \frac{\Gamma(-q)\,\Gamma^{2}(1-q)\,\Gamma(2-q+2/\mu)}{\Gamma(2-2q)\,\Gamma(2-2q+2/\mu)}\, E\big[M^{q}\big], \tag{86} \]

\[ E\big[M^{q-1}\big] = \frac{\Gamma\big(2-(2q-2)\mu/2\big)\,\Gamma\big(2-(2q-3)\mu/2\big)\,\Gamma(1-\mu/2)}{\Gamma(1-\mu q/2)\,\Gamma^{2}\big(1-(q-1)\mu/2\big)\,\Gamma\big(2-(q-2)\mu/2\big)}\, E\big[M^{q}\big]. \tag{87} \]

Equation (86) holds for ℜ(q) < 0 and Eq. (87) for ℜ(q) < 2/µ. The proof is given in the Appendix. We remark in passing that Eqs. (86) and (87) are equivalent to integral equations of the Mellin convolution type involving the Appell F3 function in the kernel for the probability density function of M. Their detailed study is however beyond the scope of this work.

7. Numerical Results

In this section we present results of simple numerical calculations that illustrate some of the main properties of log M and M in comparison to their gaussian and lognormal counterparts. Equation (49) gives us the density of log M as the Fourier inverse of exp(D(iq)), which we compute numerically. Corollary 6.2 gives us the moments of log M, in particular, it gives us the mean and standard deviation. Let N be a gaussian random variable with the same mean and standard deviation as those of log M with intermittency µ = 0.5. It is then natural to make a comparison of log M to N as well as of M to the lognormal random variable exp(N). Table 1 lists the mean ⟨X⟩, standard deviation $\sigma \triangleq \sqrt{E(X - \langle X\rangle)^{2}}$, skewness E(X − ⟨X⟩)³/σ³, and kurtosis E(X − ⟨X⟩)⁴/σ⁴ − 3 of these distributions, while Fig. 1 and 2 show graphs of the corresponding probability densities. The computation was carried out with the help of the computer program MAXIMA (http://maxima.sourceforge.net).

Table 1. The moments of log M with intermittency µ = 0.5 versus those of N and of M versus exp(N)

                        mean      stand. dev.   skewness   kurtosis
log M                   −0.449    0.933         0.084      0.057
N (gaussian)            −0.449    0.933         0          0
M                       1         1.291         8.892      ∞
exp(N) (lognormal)      0.987     1.163         5.176      74.064

We can draw the following conclusions from the numerical data. First, log M is positively skewed, i.e. its mean is shifted to the left of zero resulting in the right tail that is “longer” than the left tail. This is also seen in Fig. 1. Second, both log M and M are leptokurtic, i.e. they have “fat” right tails. This is seen in having a positive kurtosis as well as in higher “peaks” and lower “valleys” in graphs of their densities compared to those of N and exp(N).
In fact, the kurtosis of M is infinite because its fourth moment is infinite at intermittency µ = 0.5. Finally, the left tail of M looks lognormal, which is consistent with the asymptotic computed in the proof of Corollary 6.1.

² It is worth pointing out that the n = 1 term in Eq. (85) cancels the singularity at q = 1 + 1/µ coming from one of the gamma factors in Eq. (84). Hence, the equation has the stated region of validity.

Fig. 1. Graphs of the probability densities of log M with intermittency µ = 0.5 and N (curves: pdf of log M, gaussian)

Fig. 2. Graphs of the probability densities of M with intermittency µ = 0.5 and exp(N) (curves: pdf of M, lognormal)

8. Conclusions

We have presented a detailed study of the limit lognormal probability distribution by means of the intermittency expansion of its Mellin transform (complex moments). In particular, we examined the integral moments of the distribution and the characteristic function of its logarithm. Our findings are as follows.



We derived an explicit expansion of the Mellin transform in powers of the intermittency parameter. The resulting power series is convergent for a particular range of integral moments and is divergent otherwise. We computed the sum of the convergent series, thereby establishing a new formula for the negative integral moments of the distribution and re-deriving Selberg’s famous formula for the positive ones. In general, we computed the asymptotic behavior of the series expansion in the limit of small intermittency by finding a particular integral function whose asymptotic expansion coincides with the series in this limit. It is our conjecture that this function gives an exact representation of the Mellin transform at the general level of intermittency. The conjecture is interesting as it points to the source of divergence of the series expansion for non-integral moments, namely, that the integrand involved is entire when the moments are integral and meromorphic when they are not. More importantly, we evaluated the conjectured integral formula in the special case of integral moments and showed that the result coincides with the sum of the convergent series. Moreover, we proved that in the case of purely imaginary moments, which corresponds to the characteristic function of the logarithm of the distribution, our integral formula is indeed a valid characteristic function of the infinitely divisible type. Thus, in effect, we introduced a new probability distribution with the properties that its positive and negative integral moments at arbitrary intermittency and Mellin transform asymptotic in the limit of small intermittency coincide with the corresponding quantities of the limit lognormal distribution. Our conjecture says that the two are one and the same. The main reason our conjecture is not a theorem is the absence of a proof of uniqueness. 
As the positive integral moments of the limit lognormal distribution are given by the Selberg integral, we can summarize our results in the following way. For each intermittency level $\mu \in [0, 1)$ we constructed an explicit ‘analytical continuation’ of the Selberg integral, as a function of its dimension, to the Mellin transform in the half-plane $\Re(q) < 2/\mu$ of some probability measure on $\mathbb{R}_+$ depending on $\mu$. In addition, for each complex $q$, we showed that the asymptotic expansion of the Mellin transform coincides with our intermittency expansion in the limit $\mu \to +0$. What is lacking is the proof that the construction with these properties is unique. This cannot be concluded from the matching of the integral moments alone because we showed that the corresponding moment problems are in fact indeterminate. On the other hand, the moment problem for the logarithm is determinate, hence the answer lies in the analyticity properties of the Mellin transform as a function of both $q$ and $\mu$.

We illustrated our results graphically by numerically inverting the conjectured characteristic function of the logarithm of the limit lognormal distribution, thereby computing its probability density function. The graph of the density and the analytical computation of the moments indicate that the distribution of the logarithm has positive skewness and kurtosis. The same is true of the limit lognormal distribution itself.

Finally, we mention some avenues for future research. First, aside from proving or disproving our conjecture, it would be interesting to find a closed-form expression for the probability density corresponding to our integral formula. Our results suggest that there is a deep connection between this density and the gamma function in the complex plane, for, indeed, we represented its Mellin transform as an infinite product of gamma factors.
This representation implies functional equations for the Mellin transform that are equivalent to certain integral equations for the underlying density. Their study is left to future research. Second, in this paper we focused on the Mellin transform of the limit lognormal distribution. Similar arguments can be given for the Laplace and Stieltjes transforms of the distribution. What is lacking are explicit formulas for either transform of the kind that we gave for the Mellin transform. Third, the very structure of


the intermittency expansion studied in this paper begs the question of whether there is a connection of the limit lognormal distribution to analytic number theory.

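A. Appendix

In this section we will give proofs of Theorem 6.3 and its corollaries. We begin with the “main lemma.” It can be thought of as an analogue of the Euler formula for the gamma function, whereas Theorem 6.3 is an analogue of the Weierstrass formula.

Lemma A.1. The Mellin transform in Eq. (49) satisfies, for $\Re(q) < 2/\mu$,
\[
E[M^q] = \frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} \Bigl(\frac{2}{\mu}\Bigr)^{q}\, \Gamma\Bigl(1 - \frac{\mu}{2}\Bigr)^{-q}\, \Gamma\Bigl(1 - \frac{\mu q}{2}\Bigr) \tag{A-1}
\]
\[
\times \lim_{N\to\infty} \Biggl[ \Gamma^{2q}\Bigl(1 + \frac{\mu N}{2}\Bigr) \prod_{k=1}^{N} \frac{\Gamma^{3}\bigl(1 + \mu(k - q)/2\bigr)\, \Gamma\bigl(1 + \mu(k + 1 - q)/2\bigr)}{\Gamma^{3}\bigl(1 + \mu k/2\bigr)\, \Gamma\bigl(1 + \mu(k + 1 - 2q)/2\bigr)} \Biggr]. \tag{A-2}
\]

Proof. We assume in this proof that $\Re(q) < 1 + 1/\mu$. It will become clear from Lemma A.2 below that the result holds in fact for $\Re(q) < 2/\mu$. Consider the integral
\[
I(q) := \int_0^\infty \frac{dx}{(e^{x} - 1)x} \Biggl[ \frac{e^{\frac{\mu x}{2} q} - 1}{e^{\frac{\mu x}{2}} - 1} - q - \frac{(q^2 - q)}{2}\, \frac{\mu x}{2} \Biggr]. \tag{A-3}
\]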

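The defining integral for $D(q)$ in Eq. (49) can be naturally split into three terms as they exist individually in the case of $\Re(q) < 1 + 1/\mu$. Specifically, we can write
\[
D(q) = I(q + 1) + 2I(q) + I(q - 1) - I(2q - 1) \tag{A-4}
\]
\[
\qquad - q \int_0^\infty \frac{dx}{(e^{x} - 1)x} \Bigl[ e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2} \Bigr] \tag{A-5}
\]
\[
\qquad + \int_0^\infty \frac{e^{-x}}{x} \Biggl[ \frac{e^{\frac{\mu x}{2}(2q - 1)} - e^{\frac{\mu x}{2}(q - 1)}}{e^{\frac{\mu x}{2}} - 1} - q \Biggr] dx. \tag{A-6}
\]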

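The integrals in Eqs. (A-5) and (A-6) can be computed in closed form using a combination of Malmstén’s formula and Frullani’s identity,
\[
\int_0^\infty \frac{dx}{(e^{x} - 1)x} \Bigl[ e^{\frac{\mu x}{2}} - 1 - \frac{\mu x}{2} \Bigr] = \log\Gamma\Bigl(1 - \frac{\mu}{2}\Bigr) - \gamma\,\frac{\mu}{2}, \tag{A-7}
\]
\[
\int_0^\infty \frac{e^{-x}}{x} \Biggl[ \frac{e^{\frac{\mu x}{2}(2q - 1)} - e^{\frac{\mu x}{2}(q - 1)}}{e^{\frac{\mu x}{2}} - 1} - q \Biggr] dx = \log\frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} + q \log\frac{2}{\mu}, \tag{A-8}
\]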



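where $\gamma$ denotes Euler’s constant. Now, by the dominated convergence theorem, $I(q) = \lim_{N\to\infty} I_N(q)$, where $I_N(q)$ is defined by
\[
I_N(q) := \int_0^\infty \frac{dx}{(e^{x} - 1)x} \Bigl[ e^{\frac{\mu x}{2} q} - 1 - q\bigl(e^{\frac{\mu x}{2}} - 1\bigr) - \frac{(q^2 - q)}{2}\, \frac{\mu x}{2} \bigl(e^{\frac{\mu x}{2}} - 1\bigr) \Bigr] \sum_{k=1}^{N} e^{-\frac{\mu x}{2} k}. \tag{A-9}
\]
Let us fix $k$. Applying Malmstén’s formula to $\log\Gamma(1 + \mu k/2 - \mu q/2) - \log\Gamma(1 + \mu k/2)$ and $\log\Gamma(1 + \mu k/2 - \mu/2) - \log\Gamma(1 + \mu k/2)$, we obtain
\[
\log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu q}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr) - q\Bigl[ \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr) \Bigr]
\]
\[
= \int_0^\infty \frac{dx}{(e^{x} - 1)x} \Bigl[ e^{\frac{\mu x}{2}(q - k)} - e^{-\frac{\mu x}{2} k} - q\bigl( e^{\frac{\mu x}{2}(1 - k)} - e^{-\frac{\mu x}{2} k} \bigr) \Bigr]. \tag{A-10}
\]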

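Similarly, we have
\[
\psi\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}\Bigr) - \psi\Bigl(1 + \frac{\mu k}{2}\Bigr) = \int_0^\infty \frac{dx}{(e^{x} - 1)} \Bigl[ e^{-\frac{\mu x}{2} k} - e^{\frac{\mu x}{2}(1 - k)} \Bigr]. \tag{A-11}
\]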

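Substituting Eqs. (A-10) and (A-11) into Eq. (A-9) and summing over $k$ gives us the following formula for $I_N(q)$:
\[
I_N(q) = \sum_{k=1}^{N} \Bigl[ \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu q}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr) \Bigr] + q \log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) + \frac{1}{2}(q^2 - q)\, \frac{\mu}{2} \Bigl[ \psi(1) - \psi\Bigl(1 + \frac{\mu N}{2}\Bigr) \Bigr]. \tag{A-12}
\]
Hence, we obtain for $I_N(q + 1) + 2I_N(q) + I_N(q - 1) - I_N(2q - 1)$,
\[
\sum_{k=1}^{N} \Bigl[ 3\log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu q}{2}\Bigr) - 3\log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr) + \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}(q - 1)\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}(2q - 1)\Bigr) \Bigr]
\]
\[
+\, 2q \log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) + \log\Gamma\Bigl(1 - \frac{\mu q}{2}\Bigr) + \frac{\mu q}{2}\psi(1) + \log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu N}{2} - \frac{\mu q}{2}\Bigr) - \frac{\mu q}{2}\,\psi\Bigl(1 + \frac{\mu N}{2}\Bigr). \tag{A-13}
\]
By Stirling’s formula written in the form $\log\Gamma(a + b) - \log\Gamma(a) = b\log(a) + O(1/a)$ and $\psi(a) = \log(a) + O(1/a)$ as $a \to +\infty$, we have in the limit $N \to \infty$,
\[
\log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu N}{2} - \frac{\mu q}{2}\Bigr) - \frac{\mu q}{2}\,\psi\Bigl(1 + \frac{\mu N}{2}\Bigr) = O(1/N). \tag{A-14}
\]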
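The $O(1/N)$ estimate (A-14) can be observed numerically. The sketch below is an illustration under stated assumptions: the standard library has no digamma function, so a small helper `digamma` (recurrence plus the usual asymptotic series) is supplied here, and the parameter values are arbitrary. Carrying the same Stirling expansions one order further suggests that $N$ times the remainder tends to $-\mu q^2/4$; this is a side remark used only as a check, not a statement from the paper.

```python
import math

def digamma(x):
    """psi(x) for real x > 0: shift upward with psi(x) = psi(x+1) - 1/x,
    then apply the asymptotic series log x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + series

def remainder_a14(n_terms, mu, q):
    """Left-hand side of (A-14) for real q and a given cutoff N = n_terms."""
    a = 1.0 + 0.5 * mu * n_terms
    b = 0.5 * mu * q
    return math.lgamma(a) - math.lgamma(a - b) - b * digamma(a)

if __name__ == "__main__":
    mu, q = 0.4, 1.5
    for n_terms in (10**2, 10**3, 10**4):
        r = remainder_a14(n_terms, mu, q)
        print(n_terms, r, n_terms * r)
```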


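Putting everything together and recalling that $\psi(1) = -\gamma$, we obtain the following formula for $D(q)$:
\[
D(q) = \log\frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} + q\log\frac{2}{\mu} - q\log\Gamma\Bigl(1 - \frac{\mu}{2}\Bigr) + \log\Gamma\Bigl(1 - \frac{\mu q}{2}\Bigr)
\]
\[
+ \lim_{N\to\infty} \Biggl\{ 2q\log\Gamma\Bigl(1 + \frac{\mu N}{2}\Bigr) + \sum_{k=1}^{N} \Bigl[ 3\log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu q}{2}\Bigr) - 3\log\Gamma\Bigl(1 + \frac{\mu k}{2}\Bigr)
\]
\[
+ \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}(q - 1)\Bigr) - \log\Gamma\Bigl(1 + \frac{\mu k}{2} - \frac{\mu}{2}(2q - 1)\Bigr) \Bigr] \Biggr\}. \tag{A-15}
\]
The result follows.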

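Lemma A.2. The limit in Eq. (A-2) satisfies
\[
\lim_{N\to\infty} \Biggl[ \Gamma^{2q}\Bigl(1 + \frac{\mu N}{2}\Bigr) \prod_{k=1}^{N} \frac{\Gamma^{3}\bigl(1 + \mu(k - q)/2\bigr)\, \Gamma\bigl(1 + \mu(k + 1 - q)/2\bigr)}{\Gamma^{3}\bigl(1 + \mu k/2\bigr)\, \Gamma\bigl(1 + \mu(k + 1 - 2q)/2\bigr)} \Biggr] \tag{A-16}
\]
\[
= \prod_{n=1}^{\infty} \Bigl(\frac{2n}{\mu}\Bigr)^{2q} \frac{\Gamma^{3}(1 - q + 2n/\mu)\, \Gamma(2 - q + 2n/\mu)}{\Gamma^{3}(1 + 2n/\mu)\, \Gamma(2 - 2q + 2n/\mu)}. \tag{A-17}
\]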

Proof. The proof entails a combination of the Weierstrass and Euler formulas for the gamma function. Recall
\[
\Gamma(1 + z) = e^{-\gamma z} \prod_{n=1}^{\infty} \frac{e^{z/n}}{1 + z/n}, \tag{A-18}
\]
\[
\Gamma(z) = \lim_{N\to\infty} \frac{N!\, N^{z}}{z(z + 1) \cdots (z + N)}. \tag{A-19}
\]
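Both product formulas converge slowly, with truncation error of order $1/N$, which is worth keeping in mind when they are combined in the steps that follow. A minimal sketch comparing truncations of (A-18) and (A-19) with the library gamma function is given below; the function names and the restriction to real $z > 0$ are assumptions made for the illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def gamma_weierstrass(z, terms):
    """Truncation of (A-18): Gamma(1+z) ~ e^{-gamma z} * prod_{n<=terms} e^{z/n} / (1 + z/n)."""
    log_val = -EULER_GAMMA * z
    for n in range(1, terms + 1):
        log_val += z / n - math.log1p(z / n)
    return math.exp(log_val)

def gamma_euler(z, n_max):
    """Truncation of (A-19): Gamma(z) ~ N! N^z / (z (z+1) ... (z+N)) with N = n_max.
    Evaluated in logarithms to avoid overflow of N!."""
    log_val = math.lgamma(n_max + 1) + z * math.log(n_max)
    for k in range(n_max + 1):
        log_val -= math.log(z + k)
    return math.exp(log_val)

if __name__ == "__main__":
    z = 0.7
    for n in (10**2, 10**3, 10**4):
        print(n, gamma_weierstrass(z, n), gamma_euler(1.0 + z, n), math.gamma(1.0 + z))
```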

We substitute the Weierstrass product formula for every gamma function that appears in Eq. (A-16) and obtain after some cancelations,
\[
\lim_{N\to\infty} \Biggl[ \prod_{n=1}^{\infty} \frac{e^{2q\frac{\mu N}{2n}}}{(1 + \mu N/2n)^{2q}} \prod_{k=1}^{N} \prod_{n=1}^{\infty} e^{-2q\frac{\mu}{2n}}\, \frac{(1 + \mu k/2n)^{3}\, \bigl(1 + \mu(k + 1 - 2q)/2n\bigr)}{\bigl(1 + \mu(k - q)/2n\bigr)^{3}\, \bigl(1 + \mu(k + 1 - q)/2n\bigr)} \Biggr]. \tag{A-20}
\]
We now change the order of the products³ and let $N \to \infty$,
\[
\prod_{n=1}^{\infty} \Biggl[ \frac{1}{(\mu/2n)^{2q}} \lim_{N\to\infty} \frac{1}{N^{2q}} \prod_{k=1}^{N} \frac{(k + 2n/\mu)^{3}\, (k + 1 - 2q + 2n/\mu)}{(k - q + 2n/\mu)^{3}\, (k + 1 - q + 2n/\mu)} \Biggr]. \tag{A-21}
\]
It is now easy to see that Eq. (A-17) holds by the Euler product formula.

³ This can be easily justified by taking logarithms.



Remark A.1. It is now clear that Lemma A.1 holds for $\Re(q) < 2/\mu$. Indeed, the $n = 1$ term in Eq. (A-17) cancels the singularity at $q = 1 + 1/\mu$ coming from the ratio of gamma functions in Eq. (A-1).

Proof of Theorem 6.3. The proof is immediate from Lemmas A.1 and A.2.

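Proof of Corollary 6.3. It is easy to see from Theorem 6.3 that we have the identity
\[
E\bigl[M^{q + \frac{2}{\mu}}\bigr] = E[M^q] \Bigl(\frac{2}{\mu}\Bigr)^{\frac{2}{\mu} + 1} \Gamma\Bigl(1 - \frac{\mu}{2}\Bigr)^{-\frac{2}{\mu}} \Gamma(-q)\, \frac{\Gamma^{2}(1 - q)\, \Gamma(2 - q + 2/\mu)}{\Gamma(2 - 2q)\, \Gamma(2 - 2q + 2/\mu)}
\]
\[
\times \lim_{N\to\infty} \Biggl[ (N!)^{\frac{4}{\mu}} \Bigl(\frac{2}{\mu}\Bigr)^{\frac{4N}{\mu}} \frac{\Gamma\bigl(2 - 2q + 2(N - 1)/\mu\bigr)\, \Gamma(2 - 2q + 2N/\mu)}{\Gamma^{3}(1 - q + 2N/\mu)\, \Gamma(2 - q + 2N/\mu)} \Biggr]. \tag{A-22}
\]
Equation (86) now follows from Stirling’s formula. Similarly, Theorem 6.3 implies another identity
\[
E\bigl[M^{q - 1}\bigr] = E[M^q]\, \Gamma\Bigl(1 - \frac{\mu}{2}\Bigr) \frac{\Gamma\bigl(1 - \mu(q - 1)/2\bigr)}{\Gamma(1 - \mu q/2)} \bigl(1 + \mu(1 - q)/2\bigr)^{3} \prod_{n=2}^{\infty} \frac{\bigl(1 + \mu(1 - q)/2n\bigr)^{3}\, \bigl(1 + \mu(2 - q)/2n\bigr)}{\bigl(1 + \mu(2 - 2q)/2n\bigr)\, \bigl(1 + \mu(3 - 2q)/2n\bigr)}. \tag{A-23}
\]
Equation (87) now follows by the Weierstrass formula for the gamma function.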
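The identity (A-23) can be cross-checked against the product representation obtained by combining (A-1) with (A-17). The sketch below does this in logarithms with both infinite products cut at the same index. Two caveats: the product formula coded here is assembled from Lemmas A.1 and A.2 rather than copied from the statement of Theorem 6.3, and because the ratio telescopes factor by factor, the agreement tests the algebra of the recurrence $\Gamma(z+1) = z\Gamma(z)$, not the convergence of the products. Function names and parameter values are illustrative, and $q$ is taken real so that `math.lgamma` applies with positive arguments.

```python
import math

def log_term(n, q, mu):
    """log of the n-th factor of the product in (A-17)."""
    x = 2.0 * n / mu
    return (2.0 * q * math.log(x)
            + 3.0 * math.lgamma(1.0 - q + x) + math.lgamma(2.0 - q + x)
            - 3.0 * math.lgamma(1.0 + x) - math.lgamma(2.0 - 2.0 * q + x))

def log_mellin_truncated(q, mu, terms):
    """log of the prefactor in (A-1) times the product (A-17) cut at `terms` factors."""
    head = (math.lgamma(2.0 + 2.0 / mu - 2.0 * q) - math.lgamma(2.0 + 2.0 / mu - q)
            + q * math.log(2.0 / mu) - q * math.lgamma(1.0 - 0.5 * mu)
            + math.lgamma(1.0 - 0.5 * mu * q))
    return head + sum(log_term(n, q, mu) for n in range(1, terms + 1))

def log_factor_a23(q, mu, terms):
    """log of the factor multiplying E[M^q] in (A-23), product cut at `terms`."""
    val = (math.lgamma(1.0 - 0.5 * mu) + math.lgamma(1.0 - 0.5 * mu * (q - 1.0))
           - math.lgamma(1.0 - 0.5 * mu * q) + 3.0 * math.log1p(0.5 * mu * (1.0 - q)))
    for n in range(2, terms + 1):
        c = 0.5 * mu / n
        val += (3.0 * math.log1p(c * (1.0 - q)) + math.log1p(c * (2.0 - q))
                - math.log1p(c * (2.0 - 2.0 * q)) - math.log1p(c * (3.0 - 2.0 * q)))
    return val

if __name__ == "__main__":
    mu, q, terms = 0.4, 0.7, 200
    lhs = log_mellin_truncated(q - 1.0, mu, terms) - log_mellin_truncated(q, mu, terms)
    print(lhs, log_factor_a23(q, mu, terms))
```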

Acknowledgement. The author wishes to thank the anonymous referee for many helpful suggestions.

References

1. Andrews, G.E., Askey, R., Roy, R.: Special Functions. Cambridge: Cambridge University Press, 1999
2. Antonov, S.N.: Asymptotic behavior of infinitely divisible laws. Mathematical Notes 28, 924–929 (1980)
3. Bacry, E., Delour, J., Muzy, J.-F.: Multifractal random walk. Phys. Rev. E 64, 026103 (2001)
4. Bacry, E., Delour, J., Muzy, J.-F.: Modelling financial time series using multifractal random walks. Physica A 299, 84–92 (2001)
5. Bacry, E., Muzy, J.-F.: Log-infinitely divisible multifractal random walks. Commun. Math. Phys 236, 449–475 (2003)
6. Barral, J., Mandelbrot, B.B.: Multifractal products of cylindrical pulses. Prob. Theory Relat. Fields 124, 409–430 (2002)
7. Bisgaard, T.M., Sasvari, Z.: Characteristic Functions and Moment Sequences. Huntington: Nova Science Publishers, 2000
8. Berndt, B.C.: Ramanujan’s Notebooks, Part V. New York: Springer-Verlag, 1998
9. Charalambides, C.A.: Enumerative Combinatorics. Boca Raton: Chapman & Hall/CRC, 2002
10. Forrester, P.J., Warnaar, S.O.: The importance of the Selberg integral. Bull. Amer. Math. Soc. 45, 489–534 (2008)
11. Hardy, G.H.: Divergent Series. Oxford: Clarendon Press, 1949
12. Kahane, J.-P.: Sur le chaos multiplicatif. Ann. Sci. Math. Quebec 9, 105–150 (1985)
13. Kahane, J.-P.: Positive martingales and random measures. Chi. Ann. Math. 8, 1–12 (1987)
14. Kahane, J.-P.: Produits de poids aléatoires indépendants et applications. In: Belair, J., Dubuc, S. (eds.) Fractal Geometry and Analysis. Boston: Kluwer, 1991, p. 277
15. Khorunzhiy, O.: Limit theorems for sums of products of random variables. Markov Process. Relat. Fields 9, 675–686 (2003)
16. Knopp, K.: Theory and Application of Infinite Series. New York: Dover, 1990
17. Mandelbrot, B.B.: Possible refinement of the log-normal hypothesis concerning the distribution of energy dissipation in intermittent turbulence. In: Statistical Models and Turbulence, Rosenblatt, M., Van Atta, C. eds., Lecture Notes in Physics 12, New York: Springer, 1972, p. 333
18. Mandelbrot, B.B.: Intermittent turbulence in self-similar cascades: divergence of high moments and dimension of the carrier. J. Fluid Mech. 62, 331–358 (1974)
19. Mandelbrot, B.B.: Limit lognormal multifractal measures. In: Frontiers of Physics: Landau Memorial Conference, Gotsman, E.A. et al, eds., New York: Pergamon, 1990, p. 309
20. Muzy, J.-F., Bacry, E.: Multifractal stationary random measures and multifractal random walks with log-infinitely divisible scaling laws. Phys. Rev. E 66, 056121 (2002)
21. Olver, F.W.J.: Asymptotics and Special Functions. San Diego: Academic Press, 1974
22. Ostrovsky, D.: Limit lognormal multifractal as an exponential functional. J. Stat. Phys. 116, 1491–1520 (2004)
23. Ostrovsky, D.: Functional Feynman-Kac equations for limit lognormal multifractals. J. Stat. Phys. 127, 935–965 (2007)
24. Ostrovsky, D.: Intermittency expansions for limit lognormal multifractals. Lett. Math. Phys. 83, 265–280 (2008)
25. Schmitt, F.: A causal multifractal stochastic equation and its statistical properties. Eur. J. Phys. B 34, 85–98 (2003)
26. Selberg, A.: Remarks on a multiple integral. Norske Mat. Tidsskr. 26, 71–78 (1944)
27. Srivastava, H.M., Choi, J.: Series Associated with the Zeta and Related Functions. Dordrecht: Kluwer, 2001
28. Steutel, F.W., van Harn, K.: Infinite Divisibility of Probability Distributions on the Real Line. New York: Marcel Dekker, 2004
29. Stoyanov, J.: Krein condition in probabilistic moment problems. Bernoulli 6, 939–949 (2000)
30. Temme, N.M.: Special Functions: an Introduction to the Classical Functions of Mathematical Physics. New York: John Wiley, 1996

Communicated by S. Zelditch




Vortex Condensates for Relativistic Abelian Chern-Simons Model with Two Higgs Scalar Fields and Two Gauge Fields on a Torus

Chang-Shou Lin1, Jyotshana V. Prajapat2

1 Department of Mathematics, Taida Institute of Mathematical Sciences, National Taiwan University, Taipei 106, Taiwan. E-mail: [email protected]
2 Department of Mathematics, Arts and Science Program, The Petroleum Institute, Abu Dhabi, P.O. 2533, U.A.E. E-mail: [email protected]

Received: 10 March 2008 / Accepted: 9 December 2008
Published online: 13 March 2009 – © Springer-Verlag 2009

Dedicated to Professor Louis Nirenberg

Abstract: We prove the existence of maximal condensates for the relativistic Abelian Chern-Simons equations involving two Higgs particles and two gauge fields on a torus. After a change of variable, we obtain a variational formulation of the problem whose critical points are equivalent to solutions of the original system of equations. We prove existence of a local minimizer for this functional as well as the existence of a second mountain-pass critical point.

1. Introduction

The Chern-Simons theories were developed to explain certain condensed matter phenomena, anyon physics, superconductivity, quantum mechanics and so on. The models of this theory give rise to elliptic equations with exponential nonlinearities. Analysis of these equations poses mathematically challenging problems and requires new tools and ideas combined with the known techniques. In this paper, we study the system of equations corresponding to the relativistic Abelian Chern-Simons model involving two Higgs scalar fields and two gauge fields on a torus $\Omega$, viz.,
\[
\left.
\begin{aligned}
\Delta u &= \lambda e^{v}(e^{u} - 1) + 4\pi \sum_{j=1}^{k_1} m_j \delta_{p_j} \\
\Delta v &= \lambda e^{u}(e^{v} - 1) + 4\pi \sum_{j=1}^{k_2} n_j \delta_{q_j}
\end{aligned}
\right\}, \tag{1.1}
\]
where $\lambda > 0$ is a real number, $m_j$, $n_j$ are positive numbers and $N_1 = \sum_{j=1}^{k_1} m_j$, $N_2 = \sum_{j=1}^{k_2} n_j$. The sets of points $S_1 := \{p_1, p_2, \ldots, p_{k_1}\}$ and $S_2 := \{q_1, q_2, \ldots, q_{k_2}\}$ are prescribed and $\delta_p$ denotes the Dirac delta measure at the point $p$. When $m_j$, $n_j$ are integers, the solutions of (1.1) correspond to static solutions, also referred to as periodic


vortices or condensates, of the Lagrangian studied in [8,11] satisfying ’t Hooft periodic boundary conditions on a torus. The functions $u$, $v$ correspond to the gauge fields and the Dirac measure $4\pi n \delta_p$ represents the magnetic flux at a point $p$. The mathematical analysis of the system (1.1) has been recently initiated in [12], where the problem (1.1) is studied on the plane. The authors prove existence of a topological solution of (1.1) in $\mathbb{R}^2$ by first solving the system on balls of radius $R > 0$ by a variational method and then taking the limit $R \to \infty$ (see [12] for details). More recently, the paper [6] studied the uniqueness of topological solutions and existence of nontopological solutions for (1.1) in the plane.

Our goal is to study the system (1.1) on a torus $\Omega$. It is interesting to study this problem on a torus since the solutions of the Chern-Simons equations depend on the topology of the underlying space. For example, the relativistic Abelian Chern-Simons theory (scalar equation) ([2,14–16]), non-Abelian SU(3) Chern-Simons theory ([18]) and Maxwell-Chern-Simons theory ([3,4,17]) models on a torus are well studied. In spite of its similarity with the scalar Abelian Chern-Simons equation and the Abelian Higgs vortex equations (see [12] for details), the system (1.1) is difficult not only since it involves two equations, but also due to the “mixed” nature of the nonlinear terms. Our first result is the existence of maximal solutions, precisely

Theorem 1.1. Let $\Omega \subset \mathbb{R}^2$ be a torus and let $|\Omega|$ denote the measure of $\Omega$. Then, any solution $(u, v)$ of (1.1) in $\Omega$ satisfies
\[
u < 0, \quad v < 0 \quad \text{in } \Omega. \tag{1.2}
\]
Moreover, there exists $\lambda^* > \frac{4\pi}{|\Omega|} \max\{N_1, N_2\}$ such that:

(i) For any $\lambda > \lambda^*$, there exists a unique maximal solution $(u_\lambda, v_\lambda)$ for the system (1.1) in the sense that if $(u, v) \in H^1(\Omega) \times H^1(\Omega)$ is any other solution of (1.1), then
\[
u < u_\lambda; \quad v < v_\lambda. \tag{1.3}
\]

(ii) For $\lambda < \lambda^*$, there exists no solution of (1.1), and the map $\lambda \mapsto (u_\lambda, v_\lambda)$ is monotone, i.e., for $\lambda^* \le \lambda_1 < \lambda_2$,
\[
u_{\lambda_1} < u_{\lambda_2}, \quad v_{\lambda_1} < v_{\lambda_2}. \tag{1.4}
\]

(iii) The limits
\[
\lim_{\lambda \to \lambda^*} u_\lambda := u_*; \qquad \lim_{\lambda \to \lambda^*} v_\lambda := v_* \tag{1.5}
\]
exist a.e. $x \in \Omega$ and $(u_*, v_*) \in H^1(\Omega) \times H^1(\Omega)$ is a solution of (1.1) with $\lambda = \lambda^*$. Moreover, $(u_*, v_*)$ is a strict sub solution of (1.1) for all $\lambda > \lambda^*$ and hence u ∗ λ∗ .

(1.6)

Here, $W^{1,p}(\Omega)$ denotes the Sobolev space which is the completion of $C^1(\Omega)$ with respect to the norm
\[
\|f\|_{W^{1,p}} := \|\nabla f\|_p + \|f\|_p, \tag{1.7}
\]
and $H^1(\Omega)$ is the Hilbert space with
\[
\|f\|_{H^1} := \|\nabla f\|_2 + \|f\|_2, \tag{1.8}
\]


where we shall use the notations $\|\cdot\|_p$ and $\langle \cdot, \cdot \rangle$ to denote the $L^p$ norm and the $L^2$ inner product respectively. Note that, as in scalar Chern-Simons equations, while solutions of the system (1.1) in $\mathbb{R}^2$ exist for all $\lambda > 0$ (see [12]), the solutions for the system (1.1) in a torus exist for $\lambda \in [\lambda^*, \infty)$, where the strict inequality $\lambda^* > \frac{4\pi}{|\Omega|} \max\{N_1, N_2\}$ holds. The $\lambda^*$ given by Theorem 1.1 is optimal in the sense that a solution for (1.1) exists for any $\lambda \ge \lambda^*$ and does not exist for any $\lambda < \lambda^*$.

In order to find the maximal solution, it is conventional to develop a monotone iterative scheme. In general, it is difficult to develop an iterative scheme for a system of equations, more so with nonlinearities involving exponential functions and measures. So, it is surprising that such a scheme works for the system (1.1). Besides the simplicity of proof using an iteration scheme, an added advantage is that it allows for numerical construction of solutions of (1.1). We have used this scheme frequently in our paper, besides using it to prove Theorem 1.1, and hope that it will find applications in other problems too.

Our second result is to find a second solution of (1.1) for $\lambda > \lambda^*$. In principle, if a system of equations under consideration admits a variational structure, then a second solution might be obtained via a variational method, for example, the mountain pass lemma. Unfortunately, the variational form associated with (1.1) is indefinite in $H^1(\Omega) \times H^1(\Omega)$. Thus, we have to find a new approach in order to find a second solution. We explain it briefly below; see Sects. 5 and 6 for details. As in [12], by the change of variables $u \to u + v =: F$, $v \to u - v =: G$, the system (1.1) can be transformed into the equivalent problem
\[
\Delta F = 2\lambda e^{F} - \lambda e^{\frac{F+G}{2}} - \lambda e^{\frac{F-G}{2}} + 4\pi \sum_{j=1}^{k_1} m_j \delta_{p_j} + 4\pi \sum_{j=1}^{k_2} n_j \delta_{q_j}, \tag{1.9}
\]
\[
\Delta G = \lambda e^{\frac{F+G}{2}} - \lambda e^{\frac{F-G}{2}} + 4\pi \sum_{j=1}^{k_1} m_j \delta_{p_j} - 4\pi \sum_{j=1}^{k_2} n_j \delta_{q_j}, \tag{1.10}
\]
which is an Euler-Lagrange equation of an indefinite functional
\[
I_\lambda(\tilde F, \tilde G) = \tilde I_\lambda(F, G) := \frac{1}{2}\|\nabla F\|_2^2 - \frac{1}{2}\|\nabla G\|_2^2 + 2\lambda \int_\Omega \Bigl( e^{u_0 + v_0 + F} - e^{u_0 + \frac{F+G}{2}} - e^{v_0 + \frac{F-G}{2}} \Bigr)\, dx
\]
\[
+ \frac{4\pi(N_1 + N_2)}{|\Omega|} \int_\Omega F\, dx - \frac{4\pi(N_1 - N_2)}{|\Omega|} \int_\Omega G\, dx, \tag{1.11}
\]
where $\tilde F = F + u_0$, $\tilde G = G + v_0$ and $u_0$, $v_0$ are singular parts of $\tilde F$, $\tilde G$ respectively.

Since Theorem 1.1 gives existence of a strict sub solution, we may try to find a constrained minimizer for the functional $I_\lambda$ in the closed subset $\{F \in H^1(\Omega) : F \ge F_*\}$ as for the scalar case. However, even under the constraint on $F$, $I_\lambda$ might not be bounded below in $H^1(\Omega) \times H^1(\Omega)$. So we need new ideas to work on our problem. The idea is to first solve the second equation (1.10) for any given $F$. In fact, we show that for any fixed $F \in H^1(\Omega)$, Eq. (1.10) has a unique solution $G(F)$ depending on $F$, so that we may think of $I_\lambda$ as a function of $F$,
\[
I_\lambda(F) := I_\lambda(F, G(F)). \tag{1.12}
\]
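The passage from (1.1) to (1.9)–(1.10) is a pointwise algebraic identity for the regular (non-measure) parts of the right-hand sides, and can be spot-checked numerically. The following sketch is only an illustration; the function names and the random sampling are assumptions, and the Dirac terms, which simply add and subtract, are left out.

```python
import math
import random

def rhs_uv(u, v, lam):
    """Regular parts of the right-hand sides of the two equations in (1.1)."""
    return (lam * math.exp(v) * (math.exp(u) - 1.0),
            lam * math.exp(u) * (math.exp(v) - 1.0))

def rhs_fg(f_val, g_val, lam):
    """Regular parts of the right-hand sides of (1.9) and (1.10)."""
    e_plus = math.exp(0.5 * (f_val + g_val))    # e^{(F+G)/2} = e^u
    e_minus = math.exp(0.5 * (f_val - g_val))   # e^{(F-G)/2} = e^v
    return (2.0 * lam * math.exp(f_val) - lam * e_plus - lam * e_minus,
            lam * e_plus - lam * e_minus)

def max_mismatch(samples=1000, seed=0):
    """Largest deviation between (sum, difference) of (1.1) and (1.9)-(1.10) over random points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        u, v, lam = rng.uniform(-3, 0), rng.uniform(-3, 0), rng.uniform(0.1, 50)
        ru, rv = rhs_uv(u, v, lam)
        rf, rg = rhs_fg(u + v, u - v, lam)      # F = u + v, G = u - v
        worst = max(worst, abs(ru + rv - rf), abs(ru - rv - rg))
    return worst

if __name__ == "__main__":
    print(max_mismatch())
```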



In spite of this reduction, it is still difficult to bound the functional $I_\lambda$ from below. Hence we approximate the functional $I_\lambda$ by a suitable functional $I_\lambda^\varepsilon$ and study the minimization problem for $I_\lambda^\varepsilon$. This will be done in Sect. 5, where we prove

Theorem 1.2. For every $\lambda > \lambda^*$, the functional $I_\lambda$ has a local minimizer $F_\lambda \in H^1(\Omega)$ such that
\[
F_\lambda > F_*, \tag{1.13}
\]
where $F_* := u_* + v_*$, with $(u_*, v_*)$ defined by (1.5) in Theorem 1.1.

The approximate system is introduced in Sect. 4. There, we replace the measures $\mu := 4\pi \sum_{j=1}^{k_1} m_j \delta_{p_j}$ and $\nu := 4\pi \sum_{j=1}^{k_2} n_j \delta_{q_j}$ by smooth, nonnegative functions $\mu^\varepsilon$, $\nu^\varepsilon$, where $\mu^\varepsilon \rightharpoonup \mu$, $\nu^\varepsilon \rightharpoonup \nu$ in the sense of distributions, i.e., we consider
\[
\left.
\begin{aligned}
\Delta u &= \lambda e^{v}(e^{u} - 1) + \mu^\varepsilon \\
\Delta v &= \lambda e^{u}(e^{v} - 1) + \nu^\varepsilon
\end{aligned}
\right\}. \tag{1.14}
\]
We show that the solution of this system converges to a solution of (1.1) as $\varepsilon \to 0$. Moreover, an analogue of Theorem 1.1 holds for (1.14), and hence there exists $\lambda_\varepsilon > 0$ such that solutions to (1.14) exist for $\lambda \ge \lambda_\varepsilon$, but no solutions exist for $\lambda < \lambda_\varepsilon$. Thus, to prove Theorem 1.2, it has to be shown that
\[
\lim_{\varepsilon \to 0} \lambda_\varepsilon = \lambda^*. \tag{1.15}
\]
The equality (1.15) will be proved by the implicit function theorem, where the non-singularity of the linearized equation at the maximal solution for (1.1) has been established a priori. Generally, the non-singularity can be proved by showing the triviality of the null space of the linearized equation. However, it does not seem easy to prove this for a system of equations. Instead, taking advantage of the monotone scheme, we prove that the linearized equation is onto from $W^{2,2}(\Omega) \times W^{2,2}(\Omega) \to L^2(\Omega) \times L^2(\Omega)$. Thus, we have

Theorem 1.3. For almost every $\lambda > \lambda^*$, the linearized system for (1.1) at the maximal solution is invertible.

The proof of Theorem 1.3 will be given in Sect. 4. In order to apply the mountain pass lemma, we prove that the approximate functional $I_\lambda^\varepsilon$ satisfies the Palais-Smale condition, provided $\min\{N_1, N_2\} > 0$. However, due to the indefinite form of (1.11), there are some difficulties in proving the Palais-Smale condition. In any case, we succeed and will give a proof in Lemma 6.1.

It is not known whether the maximal solution obtained in Theorem 1.1 corresponds to a local minimizer of $I_\lambda$ for $\lambda > \lambda^*$. If not, then the local minimizer obtained in Theorem 1.2 already gives us a second condensate. Again, if it is not a strict local minimizer, then we obtain a sequence of local minimizers of $I_\lambda$ distinct from $(u_\lambda + v_\lambda, u_\lambda - v_\lambda)$, which would prove existence of more than one condensate. Therefore, without loss of generality, we might assume that the maximal solution obtained in Theorem 1.1 corresponds to a strict local minimizer of the functional $I_\lambda$, and then by applying the mountain pass lemma, we have

Theorem 1.4. For each $\lambda > \lambda^*$, the system (1.1) has at least two distinct solutions.



2. Equations (1.1) and the Chern-Simons-Higgs System

The Chern-Simons theory is an alternative gauge theory, different from the Maxwell theory, in 2 + 1 dimensional space used to study planar condensed matter phenomena. The Chern-Simons Lagrangian is characterized by the coupling of the scalar field, the Yang-Mills or Maxwell field and the Chern-Simons gauge field. Moreover, it is interesting to study this theory on space-time with non-trivial topology. Please refer to [4,7,12,17,19] and the references therein for more details.

Here we are concerned with the Chern-Simons model defined on the (2 + 1)-dimensional Minkowski space $\mathbb{R}^{2,1}$ with metric tensor $(g_{rs}) = \mathrm{diag}(1, -1, -1)$, $r, s = 0, 1, 2$, involving two scalar fields, two gauge fields and with mixed charge-flux relations. Let $\phi_1$ and $\phi_2$ be two complex scalar fields representing two Higgs particles of charges $q_1$, $q_2$. We denote by $(A_r^{(1)}, A_r^{(2)})$ the two gauge fields associated with the given metric tensor carrying the induced electromagnetic fields $(F_{rs}^{(1)}, F_{rs}^{(2)})$, where
\[
F_{rs}^{(j)} = \partial_r A_s^{(j)} - \partial_s A_r^{(j)}, \tag{2.1}
\]
$j = 1, 2$ and $r, s = 0, 1, 2$. The Chern-Simons action density or Lagrangian $\mathcal{L}$ studied in [8,11] is defined as
\[
\mathcal{L} = -\frac{1}{4}\kappa \varepsilon^{rst} A_r^{(1)} F_{st}^{(2)} - \frac{1}{4}\kappa \varepsilon^{rst} A_r^{(2)} F_{st}^{(1)} + \overline{D_r \phi_1}\, D^r \phi_1 + \overline{D_r \phi_2}\, D^r \phi_2 - V(\phi_1, \phi_2), \tag{2.2}
\]
where $\kappa > 0$ is the coupling parameter,
\[
D_r \phi_1 = \partial_r \phi_1 - i q_1 A_r^{(1)} \phi_1, \qquad D_r \phi_2 = \partial_r \phi_2 - i q_2 A_r^{(2)} \phi_2 \tag{2.3}
\]
are the covariant derivatives and $V(\phi_1, \phi_2)$ is the Higgs potential density,
\[
V(\phi_1, \phi_2) := \frac{q_1^2 q_2^2}{\kappa^2} \bigl\{ |\phi_1|^2 (|\phi_2|^2 - c_2^2)^2 + |\phi_2|^2 (|\phi_1|^2 - c_1^2)^2 \bigr\}. \tag{2.4}
\]
The special numerical factor in front of the expression of $V$ ensures that self-duality can be achieved for the static field configurations, and the positive vacuum states $\langle \phi_1 \rangle = c_1 > 0$ and $\langle \phi_2 \rangle = c_2 > 0$ lead to spontaneously broken symmetries. The equations of motion of the Lagrangian (2.2) are the Chern-Simons equations
\[
\left.
\begin{aligned}
\tfrac{1}{2}\kappa \varepsilon^{rs\alpha} F_{s\alpha}^{(2)} &= -q_1 i \bigl( \phi_1 \overline{D^r \phi_1} - \bar\phi_1 D^r \phi_1 \bigr) \\
\tfrac{1}{2}\kappa \varepsilon^{rs\alpha} F_{s\alpha}^{(1)} &= -q_2 i \bigl( \phi_2 \overline{D^r \phi_2} - \bar\phi_2 D^r \phi_2 \bigr) \\
D_r D^r \phi_1 &= -\frac{q_1^2 q_2^2}{\kappa^2} \bigl\{ 2|\phi_2|^2 (|\phi_1|^2 - c_1^2) + (|\phi_2|^2 - c_2^2)^2 \bigr\} \phi_1 \\
D_r D^r \phi_2 &= -\frac{q_1^2 q_2^2}{\kappa^2} \bigl\{ 2|\phi_1|^2 (|\phi_2|^2 - c_2^2) + (|\phi_1|^2 - c_1^2)^2 \bigr\} \phi_2
\end{aligned}
\right\}. \tag{2.5}
\]
The first two equations in (2.5) are the conserved matter current densities, and putting $r = 0$ in these equations we get the Gauss laws
\[
\kappa F_{12}^{(2)} = 2 q_1^2 A_0^{(1)} |\phi_1|^2, \qquad \kappa F_{12}^{(1)} = 2 q_2^2 A_0^{(2)} |\phi_2|^2. \tag{2.6}
\]

These are the variational equations for $(A_0^{(1)}, A_0^{(2)})$. The Lagrangian $\mathcal{L}$ is invariant under the gauge transformations
\[
\phi \to e^{i\eta}\phi, \quad A \to A + \nabla\eta, \quad A_0 \to A_0 \tag{2.7}
\]
for any smooth real valued function $\eta$. Using the notations of [2], we let the torus be represented by a fundamental domain $\Omega$ in $\mathbb{R}^2$, generated by two independent vectors $a^1$, $a^2$:
\[
\Omega := \{ x = (x_1, x_2) \in \mathbb{R}^2 : x = s_1 a^1 + s_2 a^2,\ 0 < s_1, s_2 < 1 \}. \tag{2.8}
\]
If we denote by $\Gamma_k := \{ x \in \mathbb{R}^2 : x = s_k a^k,\ 0 < s_k < 1 \}$, $k = 1, 2$, then the boundary of $\Omega$ can be represented as $\partial\Omega = \Gamma_1 \cup \Gamma_2 \cup \{a^1 + \Gamma_2\} \cup \{a^2 + \Gamma_1\} \cup \{0, a^1, a^2, a^1 + a^2\}$. Due to the invariance given in (2.7), we impose, as in the scalar case, the following ’t Hooft boundary conditions:
\[
\left.
\begin{aligned}
e^{i\eta_k(x + a^k)} \phi_1(x + a^k) &= e^{i\eta_k(x)} \phi_1(x) \\
(A^{(j)} + \nabla\eta_k)(x + a^k) &= (A^{(j)} + \nabla\eta_k)(x) \\
A_0^{(j)}(x + a^k) &= A_0^{(j)}(x)
\end{aligned}
\right\} \tag{2.9}
\]
for $k = 1, 2$ and $x \in \Gamma_1 \cup \Gamma_2$. Here $\eta_1$, $\eta_2$ are real valued smooth functions defined in a neighbourhood of $\Gamma_2 \cup \{a^1 + \Gamma_2\}$ and $\Gamma_1 \cup \{a^2 + \Gamma_1\}$ respectively. Let us denote the value of a function $\eta$ at a point $x = s_1 a^1 + s_2 a^2 \in \bar\Omega$ by $\eta(s_1, s_2)$. Since $\phi_1$ is a single valued complex function, its phase around $\partial\Omega$ must be a multiple of $2\pi$. Hence the boundary condition (2.9) implies that there exists an integer $N$ such that
\[
\eta_1(1, 1^-) - \eta_1(1, 0^+) + \eta_1(0, 0^+) - \eta_1(0, 1^-) + \eta_2(0^+, 1) - \eta_2(1^-, 1) + \eta_2(1^-, 0) - \eta_2(0^+, 0) + 2\pi N = 0. \tag{2.10}
\]
Integrating (2.6) over $\Omega$, from (2.9) and (2.10) it follows that there exist integers $N_1$, $N_2$ corresponding to the functions $\phi_1$, $\phi_2$ respectively, such that
\[
\left.
\begin{aligned}
\kappa\Phi^{(2)} &:= \kappa \int_\Omega F_{12}^{(2)}\, dx = \kappa 2\pi N_2; & \int_\Omega 2 q_1^2 A_0^{(1)} |\phi_1|^2 &= Q^{(1)} = \kappa\Phi^{(2)} \\
\kappa\Phi^{(1)} &:= \kappa \int_\Omega F_{12}^{(1)}\, dx = \kappa 2\pi N_1; & \int_\Omega 2 q_2^2 A_0^{(2)} |\phi_2|^2 &= Q^{(2)} = \kappa\Phi^{(1)}
\end{aligned}
\right\}. \tag{2.11}
\]
It is important to note that for our model, we obtain a mixed flux-charge relation. The Hamiltonian (energy) density $\mathcal{H}$ for a static field configuration is given by
\[
\begin{aligned}
\mathcal{H} &= -\mathcal{L} \quad (\text{up to a total divergence}) \\
&= \kappa A_0^{(1)} F_{12}^{(2)} + \kappa A_0^{(2)} F_{12}^{(1)} - q_1^2 (A_0^{(1)})^2 |\phi_1|^2 - q_2^2 (A_0^{(2)})^2 |\phi_2|^2 + |D_j \phi_1|^2 + |D_j \phi_2|^2 + V(\phi_1, \phi_2) \\
&= \frac{\kappa^2 [F_{12}^{(2)}]^2}{4 q_1^2 |\phi_1|^2} + \frac{\kappa^2 [F_{12}^{(1)}]^2}{4 q_2^2 |\phi_2|^2} + |D_j \phi_1|^2 + |D_j \phi_2|^2 + V(\phi_1, \phi_2),
\end{aligned} \tag{2.12}
\]
where in the last equality we have used the Gauss laws (2.6). Moreover, the following identities hold:
\[
|D_j \phi_k|^2 = |D_1 \phi_k \pm i D_2 \phi_k|^2 \pm i \bigl[ \partial_1 (\phi_k \overline{D_2 \phi_k}) - \partial_2 (\phi_k \overline{D_1 \phi_k}) \bigr] \pm F_{12}^{(k)} |\phi_k|^2; \quad k = 1, 2. \tag{2.13}
\]


Substituting these in (2.12) and integrating over the torus, we get the energy
\[
\begin{aligned}
E &= \int_\Omega \mathcal{H}\, dx \\
&= \int_\Omega \Bigl\{ \frac{\kappa^2 [F_{12}^{(2)}]^2}{4 q_1^2 |\phi_1|^2} + \frac{\kappa^2 [F_{12}^{(1)}]^2}{4 q_2^2 |\phi_2|^2} + |D_1 \phi_1 \pm i D_2 \phi_1|^2 + |D_1 \phi_2 \pm i D_2 \phi_2|^2 \\
&\qquad \pm q_1 F_{12}^{(1)} |\phi_1|^2 \pm q_2 F_{12}^{(2)} |\phi_2|^2 + V(\phi_1, \phi_2) \Bigr\}\, dx \\
&= \int_\Omega \Bigl\{ \Bigl[ \frac{\kappa F_{12}^{(1)}}{2 q_2 |\phi_2|} \pm \frac{q_1 q_2}{\kappa} |\phi_2| (|\phi_1|^2 - c_1^2) \Bigr]^2 + \Bigl[ \frac{\kappa F_{12}^{(2)}}{2 q_1 |\phi_1|} \pm \frac{q_1 q_2}{\kappa} |\phi_1| (|\phi_2|^2 - c_2^2) \Bigr]^2 \\
&\qquad + |D_1 \phi_1 \pm i D_2 \phi_1|^2 + |D_1 \phi_2 \pm i D_2 \phi_2|^2 \pm c_1^2 q_1 F_{12}^{(1)} \pm c_2^2 q_2 F_{12}^{(2)} \Bigr\}\, dx \\
&\ge \pm c_1^2 q_1 \Phi^{(1)} \pm c_2^2 q_2 \Phi^{(2)} = c_1^2 q_1 |\Phi^{(1)}| + c_2^2 q_2 |\Phi^{(2)}|,
\end{aligned} \tag{2.14}
\]
where we choose the signs so that $\pm\Phi^{(j)} = |\Phi^{(j)}|$, $j = 1, 2$. Hence, the energy will attain its lower bound if and only if the field configuration $(\phi_1, \phi_2, A_r^{(1)}, A_r^{(2)})$ satisfies the equations
\[
\left.
\begin{aligned}
D_1 \phi_1 \pm i D_2 \phi_1 &= 0, \\
D_1 \phi_2 \pm i D_2 \phi_2 &= 0, \\
F_{12}^{(1)} \pm \frac{2 q_1 q_2^2}{\kappa^2} |\phi_2|^2 (|\phi_1|^2 - c_1^2) &= 0, \\
F_{12}^{(2)} \pm \frac{2 q_1^2 q_2}{\kappa^2} |\phi_1|^2 (|\phi_2|^2 - c_2^2) &= 0.
\end{aligned}
\right\} \tag{2.15}
\]
Using the change of variables $q_l A_j^{(l)} \to A_j^{(l)}$, $\phi_l \to c_l \phi_l$, $l = 1, 2$ and the suppressed parameter $\lambda = 4 c_1^2 c_2^2 q_1^2 q_2^2 / \kappa^2$, we can simplify (2.15) (with positive sign) to
\[
\left.
\begin{aligned}
D_1 \phi_1 + i D_2 \phi_1 &= 0, \\
D_1 \phi_2 + i D_2 \phi_2 &= 0, \\
F_{12}^{(1)} + \frac{\lambda}{2} |\phi_2|^2 (|\phi_1|^2 - 1) &= 0, \\
F_{12}^{(2)} + \frac{\lambda}{2} |\phi_1|^2 (|\phi_2|^2 - 1) &= 0,
\end{aligned}
\right\} \tag{2.16}
\]
where now $D_j \phi_l = \partial_j \phi_l - i A_j^{(l)} \phi_l$, $l, j = 1, 2$. It suffices to consider (2.15) with the plus sign, since the transformation $A_j^{(l)} \to -A_j^{(l)}$ and $\phi \to \bar\phi$ will help us recover results for the negative sign. The first two equations in (2.16) imply that the complex fields $(\phi_1, \phi_2)$ are holomorphic with respect to the gauge invariant derivatives, while the last two equations in (2.16) are the “vortex” equations relating “curvatures” to the “strength” of the scalar particles. The system of equations (2.16) together with the Gauss laws (2.6) (or (2.11)) and the periodic boundary conditions (2.9) are the self-dual Chern-Simons equations with two Higgs particles and two Abelian gauge fields. Note that if $N_2 = 0$, we may choose $A_j^{(2)} = 0$ and $|\phi_2| = 1$ so that (2.16) reduces to the self-dual Ginzburg-Landau equations, while if both $\phi_1$ and $\phi_2$ have only common zeroes with the same multiplicity, then we may take $\phi_1 = \phi_2$ so that (2.16) reduces to the single particle self-dual Abelian Chern-Simons equations.
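The completion of squares behind (2.14) and the rescaling that produces $\lambda$ in (2.16) are both pointwise algebraic statements, so they can be spot-checked at random sample points. The sketch below does this for the gauge-field and potential terms only; the gradient terms $|D_1\phi_k \pm iD_2\phi_k|^2$ appear identically on both sides and are omitted. All names are illustrative, and `F1`, `F2`, `r1`, `r2` stand for $F_{12}^{(1)}$, $F_{12}^{(2)}$, $|\phi_1|$, $|\phi_2|$.

```python
import random

def bogomolny_sides(F1, F2, r1, r2, q1, q2, c1, c2, kappa, sign):
    """Return the integrand of (2.14) before and after completing the squares."""
    V = (q1 * q2 / kappa) ** 2 * (r1**2 * (r2**2 - c2**2) ** 2 + r2**2 * (r1**2 - c1**2) ** 2)
    before = (kappa**2 * F2**2 / (4 * q1**2 * r1**2) + kappa**2 * F1**2 / (4 * q2**2 * r2**2)
              + sign * q1 * F1 * r1**2 + sign * q2 * F2 * r2**2 + V)
    sq1 = (kappa * F1 / (2 * q2 * r2) + sign * (q1 * q2 / kappa) * r2 * (r1**2 - c1**2)) ** 2
    sq2 = (kappa * F2 / (2 * q1 * r1) + sign * (q1 * q2 / kappa) * r1 * (r2**2 - c2**2)) ** 2
    after = sq1 + sq2 + sign * c1**2 * q1 * F1 + sign * c2**2 * q2 * F2
    return before, after

def rescaled_couplings(q1, q2, c1, c2, kappa):
    """Coefficients of the last two equations of (2.15) after q_l A -> A, phi_l -> c_l phi_l,
    together with lambda/2 from the definition lambda = 4 c1^2 c2^2 q1^2 q2^2 / kappa^2."""
    lam = 4 * c1**2 * c2**2 * q1**2 * q2**2 / kappa**2
    coeff1 = q1 * (2 * q1 * q2**2 / kappa**2) * c1**2 * c2**2
    coeff2 = q2 * (2 * q1**2 * q2 / kappa**2) * c1**2 * c2**2
    return coeff1, coeff2, lam / 2

if __name__ == "__main__":
    rng = random.Random(1)
    args = [rng.uniform(0.5, 2.0) for _ in range(9)]
    print(bogomolny_sides(*args, sign=+1))
    print(rescaled_couplings(*args[4:]))
```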



If we substitute
\[
u := \ln|\phi_1|^2, \qquad v := \ln|\phi_2|^2, \tag{2.17}
\]
then it can be seen that (2.16) reduces to the system (1.1) with a suitable periodicity condition. Integrating the equations in (1.1), we see that
\[
\int_\Omega e^{v}(1 - e^{u})\, dx = \frac{4\pi N_1}{\lambda}, \tag{2.18}
\]
\[
\int_\Omega e^{u}(1 - e^{v})\, dx = \frac{4\pi N_2}{\lambda}. \tag{2.19}
\]
Thus, as for the scalar case, we have two types of solutions. The “topological type” satisfying
\[
e^{u} \to 1 \text{ and } e^{v} \to 1 \text{ as } \lambda \to \infty, \quad \text{equivalently, } |\phi_1| \to 1,\ |\phi_2| \to 1 \text{ as } \kappa \to 0, \tag{2.20}
\]
and the “non-topological type” satisfying
\[
e^{u} \to 0 \text{ and } e^{v} \to 0 \text{ as } \lambda \to \infty, \quad \text{equivalently, } |\phi_1| \to 0,\ |\phi_2| \to 0 \text{ as } \kappa \to 0. \tag{2.21}
\]
This is in contrast to the non-Abelian case. In view of (1.2) and (1.6), it follows that the maximal solutions of Theorem 1.1 correspond to topological solutions. Let κ∗ be defined as κ∗ :=

c12 c22 q12 q22 , λ∗

(2.22)

where λ∗ is as in Theorem 1.1. Theorem 1.1 is equivalent to Theorem 2.1. Let N1 , N2 be given integers and S1 := { p1 , p2 , . . . , pk1 }, S2 := {q1 , q2 , . . . , qk2 } be sets of given points prescribed on a torus . Then there exists (an optimal) coupling parameter κ∗ , 0 < κ∗
0, consider the system of equation

\[
\left.
\begin{aligned}
\Delta u &= \lambda e^{v}(e^{u} - 1) + f \\
\Delta v &= \lambda e^{u}(e^{v} - 1) + g
\end{aligned}
\right\} \quad \text{in } \Omega. \tag{3.1}
\]
Here, $f$ and $g$ may be non-negative continuous functions or a linear combination of Dirac measures on $\Omega$. Suppose there exist functions $(\bar u, \bar v) \in H^1(\Omega) \times H^1(\Omega)$, a super solution of (3.1), and $(\underline{u}, \underline{v})$, a sub solution of (3.1), such that $\underline{u} < \bar u$, $\underline{v} < \bar v$. Define $(u_0, v_0) \equiv (\bar u, \bar v)$ in $\Omega$ and consider the following iteration scheme:
\[
\left.
\begin{aligned}
(\Delta - K) u_{n+1} &= \lambda e^{v_n}(e^{u_n} - 1) - K u_n + f \\
(\Delta - K) v_{n+1} &= \lambda e^{u_n}(e^{v_n} - 1) - K v_n + g
\end{aligned}
\right\} \tag{3.2}
\]
for $n = 0, 1, 2, \ldots$, where $K > 0$ is a constant. Then $\{u_n\}$ and $\{v_n\}$ are monotone decreasing sequences of functions in $\Omega$, i.e.,
\[
\left.
\begin{aligned}
\bar u > u_1 > u_2 > \ldots > u_n > u_{n+1} > \ldots > \underline{u} \quad \text{a.e.} \\
\bar v > v_1 > v_2 > \ldots > v_n > v_{n+1} > \ldots > \underline{v} \quad \text{a.e.}
\end{aligned}
\right\} \tag{3.3}
\]
and the limits $\lim_{n\to\infty} u_n(x) = u_\lambda(x)$, $\lim_{n\to\infty} v_n(x) = v_\lambda(x)$ exist for almost every $x \in \Omega$ and $(u_n, v_n) \to (u_\lambda, v_\lambda)$ in $H^1(\Omega) \times H^1(\Omega)$. The functions $(u_\lambda, v_\lambda)$ are solutions of (3.1) and are maximal (hence unique) in the sense that if $(u, v)$ is any other solution of (3.1), then
\[
u < u_\lambda, \qquad v < v_\lambda. \tag{3.4}
\]
The most important observation is that the linear system (3.2) can be solved on the torus $\Omega$, see for example Theorem 4.18 in [1]. Moreover, since the Green’s function for the Laplacian on the torus exists, we can allow $f$, $g$ above to be sums of Dirac measures. Thus, there exist solutions $(u_n, v_n)$ of (3.2) for each $n \in \mathbb{N}$ and hence the iteration scheme (3.2) is well defined. Also note that the regularity of $(u_n, v_n)$ depends on the regularity of $f$ and $g$. The proof of Proposition 3.1 is standard and follows by using the maximum principle. Here and in the following sections, whenever Proposition 3.1 is referred to, all we need to prove is the existence of a sub solution and a super solution for the system being considered. An immediate application is

Theorem 3.1. Define the measures
\[
\mu := 4\pi \sum_{j=1}^{k_1} m_j \delta_{p_j} \quad \text{and} \quad \nu := 4\pi \sum_{j=1}^{k_2} n_j \delta_{q_j}, \tag{3.5}
\]
and let $(f, g) = (\mu, \nu)$ in (3.1). Then, there exists $\lambda_0 > \frac{4\pi}{|\Omega|} \max\{N_1, N_2\}$ such that for every $\lambda > \lambda_0$, the system (3.1) has a maximal solution $(u_\lambda, v_\lambda)$.
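The iteration scheme (3.2) is easy to try out numerically in a simplified setting. The sketch below is an illustration under several assumptions that are not in the paper: it works on a one-dimensional periodic grid (a circle of length 1) rather than a two-dimensional torus, it uses smooth positive data $f$, $g$ in place of Dirac measures, it starts from the super solution $(0, 0)$, and it solves the linear step with a plain discrete Fourier transform. The shift is taken as $K > \lambda$, which keeps the discrete right-hand side monotone for $u, v \le 0$; all function names and parameter values are hypothetical.

```python
import cmath
import math

def solve_shifted_laplacian(rhs, h, K):
    """Solve (D2 - K) w = rhs on a periodic grid, D2 being the 3-point periodic Laplacian."""
    n = len(rhs)
    hat = [sum(rhs[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
           for k in range(n)]
    # eigenvalue of (D2 - K) on the k-th Fourier mode: -(4/h^2) sin^2(pi k / n) - K
    hat = [hat[k] / (-(4.0 / h**2) * math.sin(math.pi * k / n) ** 2 - K) for k in range(n)]
    return [(sum(hat[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n).real
            for j in range(n)]

def monotone_iteration(f, g, lam, K, steps):
    """Run the discrete analogue of (3.2) from (u_0, v_0) = (0, 0); return all iterates."""
    n = len(f)
    h = 1.0 / n
    u, v = [0.0] * n, [0.0] * n
    history = [(u, v)]
    for _ in range(steps):
        ru = [lam * math.exp(v[j]) * (math.exp(u[j]) - 1.0) - K * u[j] + f[j] for j in range(n)]
        rv = [lam * math.exp(u[j]) * (math.exp(v[j]) - 1.0) - K * v[j] + g[j] for j in range(n)]
        u, v = solve_shifted_laplacian(ru, h, K), solve_shifted_laplacian(rv, h, K)
        history.append((u, v))
    return history

def residual(u, v, f, lam):
    """Max-norm residual of the first equation of the discrete version of (3.1)."""
    n = len(u)
    h = 1.0 / n
    return max(abs((u[(j + 1) % n] - 2.0 * u[j] + u[j - 1]) / h**2
                   - lam * math.exp(v[j]) * (math.exp(u[j]) - 1.0) - f[j]) for j in range(n))

def demo(n=32, lam=50.0, K=60.0, steps=40):
    """Smooth positive test data with mean 1; lam is chosen well above the existence threshold."""
    f = [1.0 + 0.5 * math.cos(2.0 * math.pi * j / n) for j in range(n)]
    g = [1.0 + 0.5 * math.sin(2.0 * math.pi * j / n) for j in range(n)]
    return monotone_iteration(f, g, lam, K, steps), f, g, lam

if __name__ == "__main__":
    history, f, g, lam = demo()
    u, v = history[-1]
    print(max(u), max(v), residual(u, v, f, lam))
```

In this toy setting the iterates decrease pointwise and stay negative, mirroring (3.3) and (1.2); handling genuine Dirac data in two dimensions would additionally require subtracting the singular part, which the sketch does not attempt.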



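Proof. We proceed in the following steps:

Step 1. $(0, 0)$ is a super solution for (3.1). If $(u, v)$ is a solution of (3.1), then
\[
\begin{aligned}
u &\sim \ln |x - p_j|^{m_j} \ \text{as } x \to p_j, \ \text{where } m_j \text{ is the multiplicity of } p_j, \text{ for } p_j \in S_1,\\
v &\sim \ln |x - q_j|^{n_j} \ \text{as } x \to q_j, \ \text{where } n_j \text{ is the multiplicity of } q_j, \text{ for } q_j \in S_2,
\end{aligned} \tag{3.6}
\]
see [1]. The functions $u \in C^2(\Omega \setminus S_1)$, $v \in C^2(\Omega \setminus S_2)$ and $u, v \in W^{1,q}(\Omega)$ for all $1 < q < 2$. Thus, there exists $\varepsilon > 0$ such that the balls $B(p_j, \varepsilon)$, $j = 1, \dots, k_1$, are mutually disjoint and $u(x) < 0$ in $B(p_j, \varepsilon)$ for all $j = 1, \dots, k_1$. Let $\Omega_\varepsilon := \Omega \setminus \cup_{j=1}^{k_1} B(p_j, \varepsilon)$ and $\max_{\Omega_\varepsilon} u = u(x_0)$. If $x_0 \in \partial\Omega_\varepsilon$, then $u(x_0) < 0$. If $x_0 \in \Omega_\varepsilon$ is an interior point, then using the differential equation (3.1) we have
\[
0 \ge \Delta u(x_0) \ge \lambda e^{v}(e^{u} - 1). \tag{3.7}
\]
Therefore, $e^{u(x_0)} - 1 \le 0$, i.e., $u(x_0) \le 0$. A similar argument shows that $v \le 0$. In fact, the maximum principle implies that the strict inequalities $u < 0$, $v < 0$ hold. In particular, it follows that $(0, 0)$ is a super solution of (3.1) for any $\lambda > 0$. This proves (1.2) of Theorem 1.1. Integrating Eqs. (3.1) over $\Omega$ and using the fact that $u < 0$, $v < 0$,
\[
\begin{aligned}
\frac{4\pi N_1}{\lambda} &\le \frac{4\pi N_1}{\lambda} + \int_\Omega e^{u+v} = \int_\Omega e^{v} < |\Omega|,\\
\frac{4\pi N_2}{\lambda} &\le \frac{4\pi N_2}{\lambda} + \int_\Omega e^{u+v} = \int_\Omega e^{u} < |\Omega|,
\end{aligned} \tag{3.8}
\]
we derive the necessary condition
\[
\lambda > \frac{4\pi \max\{N_1, N_2\}}{|\Omega|} \tag{3.9}
\]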

for the existence of solutions to (1.1).

Step 2. Iteration. Since here $\mu$, $\nu$ are sums of Dirac measures, we indicate the first two steps in the iteration scheme (3.2), where $K > 0$ will be chosen later. For $n = 1$, we have
\[
\begin{aligned}
(\Delta - K)u_1 &= \mu,\\
(\Delta - K)v_1 &= \nu
\end{aligned} \quad \text{in } \Omega. \tag{3.10}
\]
Referring to [1], there exists a solution $(u_1, v_1)$ of (3.10) which is $C^2$ away from the singular points $\{p_j\}_{j=1}^{k_1}$ and $\{q_j\}_{j=1}^{k_2}$ respectively. Choose $\varepsilon > 0$ small such that the balls $B_\varepsilon(p_i)$, $B_\varepsilon(q_j)$, $i = 1, \dots, k_1$ and $j = 1, \dots, k_2$, are mutually disjoint. Since $u_1 \sim \ln |x - p_j|^{m_j}$, $v_1 \sim \ln |x - q_j|^{n_j}$ near the singular points, we have $u_1 < 0$, $v_1 < 0$ in the closure $\bar\Omega_\varepsilon$, where $\Omega_\varepsilon := \Omega \setminus \{\cup_{j=1}^{k_1} B_\varepsilon(p_j) \cup_{j=1}^{k_2} B_\varepsilon(q_j)\}$. Thus,
\[
(\Delta - K)u_1 = 0 \ \text{in } \Omega_\varepsilon, \qquad (\Delta - K)v_1 = 0 \ \text{in } \Omega_\varepsilon, \qquad u_1 < 0, \ v_1 < 0 \ \text{in } \partial\Omega_\varepsilon.
\]
The maximum principle implies that both $u_1$ and $v_1$ must achieve the maximum on the boundary. Hence $u_1 < 0$, $v_1 < 0$ in $\Omega$. For $n = 2$,
\[
\begin{aligned}
(\Delta - K)u_2 &= \lambda e^{v_1}(e^{u_1} - 1) - K u_1 + \mu,\\
(\Delta - K)v_2 &= \lambda e^{u_1}(e^{v_1} - 1) - K v_1 + \nu
\end{aligned} \quad \text{in } \Omega.
\]

We have ( − K )(u 2 − u 1 ) = λev1 (eu 1 − 1) − K u 1 = (K − λev1 ew )(−u 1 ) > 0 if we choose, say K > 2λ. Here w is a function between u 1 and 0. Again, the maximum principle implies that u 2 2λ, for each λ > || max{N1 , N2 } we obtain a 2 2 monotone sequences {u n } ∈ C ( \ S1 ), {vn } ∈ C ( \ S2 ) from Proposition 3.1. Step 3. Existence of sub-solution. First note that if (u, v) is a sub-solution of (3.1), i.e.,

u > λev (eu − 1) + µ (3.11) u v v > λe (e − 1) + ν then u < 0, v < 0: If max u := u(x0 ) > 0, then at point x0 , (eu(x0 ) − 1) > 0 and hence λev (eu − 1) + µ > 0 at x0 . But then, u(x0 ) > 0, contradicting that x0 is a point of maximum. Next, we claim that u λev (eu − 1) − K u > (K − λev ew )(−u), where u < w < 0. For our choice of K , (K − λev ew ) > 0. Hence, the maximum principle implies u λev (eu − 1) − λevn−1 (eu n−1 − 1) + K (u n−1 − u) > λevn (eu − eu n−1 ) + K (u n−1 − u) = (K − λevn−1 ew )(u n−1 − u) > 0, where u < w 0 such that for all λ > λ0 , the system (3.1) has a sub-solution (w, z) (independent of λ).



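Proof. From Theorem 4.13 in [1] and the existence of the Green's function for the Laplacian on a torus, there exist unique solutions $(u_0, v_0)$ of the equations
\[
\Delta u_0 = -\frac{4\pi N_1}{|\Omega|} + 4\pi \sum_{j=1}^{k_1} m_j \delta_{p_j}, \qquad \int_\Omega u_0 = 0, \tag{3.13}
\]
\[
\Delta v_0 = -\frac{4\pi N_2}{|\Omega|} + 4\pi \sum_{j=1}^{k_2} n_j \delta_{q_j}, \qquad \int_\Omega v_0 = 0. \tag{3.14}
\]
The functions $u_0 \in C^2(\Omega \setminus S_1)$, $v_0 \in C^2(\Omega \setminus S_2)$ and for $\varepsilon > 0$ sufficiently small,
\[
\begin{aligned}
u_0 &\sim 2 m_j \ln |x - p_j| \ \text{in } B(p_j, \varepsilon) \ \text{for } j = 1, \dots, k_1,\\
v_0 &\sim 2 n_j \ln |x - q_j| \ \text{in } B(q_j, \varepsilon) \ \text{for } j = 1, \dots, k_2.
\end{aligned}
\]
Moreover, $u_0, v_0 \in W^{1,p}(\Omega)$ for all $1 < p < 2$ and $e^{u_0}, e^{v_0} \in L^\infty(\Omega)$. If we write $u = u_1 - u_0$, $v = v_1 - v_0$, then it can be easily verified that $(u_1, v_1)$ is a solution of (3.1) if and only if $(u, v)$ solves
\[
\begin{aligned}
\Delta u &= \lambda e^{v_0 + v}(e^{u_0 + u} - 1) + \frac{4\pi N_1}{|\Omega|},\\
\Delta v &= \lambda e^{u_0 + u}(e^{v_0 + v} - 1) + \frac{4\pi N_2}{|\Omega|}.
\end{aligned} \tag{3.15}
\]
Thus, it suffices to find a sub solution of (3.15). The construction of these sub solutions is similar to [2], with suitable modifications due to a different nonlinear term in our equations. Let $\varepsilon > 0$ be sufficiently small so that the balls $B(p_j, 2\varepsilon)$, $1 \le j \le k_1$, and $B(q_j, 2\varepsilon)$, $1 \le j \le k_2$, are mutually disjoint. Let $\varphi_\varepsilon$, $\psi_\varepsilon$ be smooth functions such that
\[
\begin{aligned}
&0 \le \varphi_\varepsilon \le 1; \qquad 0 \le \psi_\varepsilon \le 1;\\
&\varphi_\varepsilon \equiv 1 \ \text{in } B(p_j, \varepsilon), \ j = 1, \dots, k_1; \qquad \psi_\varepsilon \equiv 1 \ \text{in } B(q_j, \varepsilon), \ j = 1, \dots, k_2;\\
&\varphi_\varepsilon \equiv 0 \ \text{in } \Omega \setminus \cup_{j=1}^{k_1} B(p_j, 2\varepsilon); \qquad \psi_\varepsilon \equiv 0 \ \text{in } \Omega \setminus \cup_{j=1}^{k_2} B(q_j, 2\varepsilon).
\end{aligned} \tag{3.16}
\]
Consider the functions
\[
f_\varepsilon := \frac{8\pi N_1}{|\Omega|} \varphi_\varepsilon, \qquad g_\varepsilon := \frac{8\pi N_2}{|\Omega|} \psi_\varepsilon.
\]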

Then

C1 (ε) :=

fε d x ≤

32π 2 N12 2 ε ||

(3.18)

gε d x ≤

32π 2 N22 2 ε . ||

(3.19)

C2 (ε) :=

Define

f ε

:= f ε − C1 (ε),

gε

(3.17)

:= gε − C2 (ε) so that

Theorem 4.7 in [1], the equations w = f ε

f ε d x = 0 and

gε d x = 0. From (3.20)



and z = gε

(3.21)

have a unique solution up to an additive constant. From (3.18), (3.19 ) we have 4π N 4π N1 1 f ε (x) ≥ 2 − 8π N1 ε2 > (3.22) for x ∈ B( p j , ε); || || 4π N2 4π N1 2 − 8π N2 ε2 > (3.23) gε (x) ≥ for x ∈ B(q j , ε) || || for ε > 0 small enough and j = 1, . . . , k1 (respectively, j = 1, . . . , k2 ). Henceforth we fix ε so that the inequalities (3.22) and (3.23) hold. Now choose a solution w0 of (3.20) and z 0 of (3.21) such that eu 0 +w0 < 1, ev0 +z 0 < 1.

(3.24)

Then for any λ > 0, we have 4π N1 4π N1 ≥ λev0 +z 0 (eu 0 +w0 − 1)+ in B( pi , ε), 1 ≤ i ≤ k1 , || || 4π N2 4π N1 ≥ λeu 0 +w0 (ev0 +z 0 − 1)+ in B(qi , ε), 1 ≤ i ≤ k2 . z 0 = gε > || ||

w0 = f ε >

(3.25) (3.26)

k1 B( pi , ε)}, m 2 := inf{ev0 (x)+z 0 (x) : x ∈ Let m 1 := inf{eu 0 (x)+w0 : x ∈ \ ∪i=1 k2 k1 \ ∪i=1 B(qi , ε)} and M1 := sup{eu 0 (x)+w0 (x) : x ∈ \ ∪i=1 B( pi , ε)}, M2 := k2 k1 v (x)+z (x) 0 0 inf{e : x ∈ \ ∪i=1 B(qi , ε)}. Then, for x ∈ \ ∪i=1 B( pi , ε) we have k2 v +z u +w 0 0 0 0 (e −1) < m 2 (M1 −1) and for x ∈ \∪i=1 B(qi , ε) we have eu 0 +w0 (ev0 +z 0 − e 1) < m 1 (M2 − 1). Thus we can choose λ0 > 0 sufficiently large so that for all λ > λ0 ,

4π N1 k1 B( pi , ε), in \ ∪i=1 || 4π N1 k2 in \ ∪i=1 − 1) + B(qi , ε). ||

w0 = f εv0 +z 0 (eu 0 +w0 − 1) + z 0 = gεu 0 +w0 (ev0 +z 0

(3.27) (3.28)

In particular, (w0 , z 0 ) is a strict sub-solution of (3.15) for all λ > λ0 . Step 4. Existence of solutions for (3.1). For each λ > λ0 , (u, v) := (u 0 + w0 , v0 + z 0 ) is a sub-solution for (3.1). Hence, the sequence (u n (x), vn (x)) → (u λ (x), vλ (x)) almost everywhere x ∈ and u n → u λ , vn → vλ in L 2 norm. Since (u n+1 , vn+1 ) satisfies (3.2), multiplying the equations by (u n+1 , vn+1 ) and integrating by parts, we conclude that the right hand side of (3.2) converges in W 1, p () for all 1 < p < 2. Taking the limit as n → ∞ in Eqs. (3.2) and using the elliptic estimates, we conclude that (u λ , vλ ) is a solution of (3.1), which is C k away from the singular points. By definition, it is unique. If (u, v) is any other solution of (3.1) then, it is a sub-solution and hence u || max{N1 , N2 } such that for all λ ≥ λ0 , the Eq. (3.15) has a C 2 maximal solution which we continue to denote by (u λ , vλ ) . Let := {λ > 0 : there exists a maximal solution (u λ , vλ ) of (3.15)}.



For λ1 ∈ and λ1 < λ2 , it is immediate from Eqs. (3.15) that the maximal solution (u λ1 , vλ1 ) is a sub-solution for (3.15) with λ = λ2 . Hence, from Theorem 3.1, λ2 ∈ and u λ1 0 : λ ∈ } ≥ We prove

4π max{N1 , N2 }. ||

(3.30)

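Theorem 3.2. (i) For $x \in \Omega$, $\inf_{\lambda > \lambda_*} u_\lambda(x)$ and $\inf_{\lambda > \lambda_*} v_\lambda(x)$ are finite.

(ii) Define
\[
\lim_{\lambda \to \lambda_*} u_\lambda(x) := u_*(x); \qquad \lim_{\lambda \to \lambda_*} v_\lambda(x) := v_*(x) \quad \text{a.e. } x \in \Omega. \tag{3.31}
\]
Then $(u_*, v_*) \in H^1(\Omega) \times H^1(\Omega)$ and is a solution for (3.15) with $\lambda = \lambda_*$. In particular, $\lambda_* > \frac{4\pi}{|\Omega|} \max\{N_1, N_2\}$.

(iii) $(u_*, v_*)$ is a strict sub solution of (3.15) for all $\lambda > \lambda_*$.

Proof. (i) Writing $u_{\lambda_n} = u_n$ for simplicity of notation, suppose there exists $x_n \in \Omega$ such that $u_n(x_n) \to -\infty$. Then we claim that $u_n(x) \to -\infty$ for almost every $x \in \Omega$. Multiplying the first equation in (3.15) by $u_n - \bar u_n$, where $\bar u_n$ denotes the mean of $u_n$ over $\Omega$, and integrating by parts, using the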

Poincáre inequality we get 2 v0 +vn u 0 +u n |∇u n | ≤ |λe (e − 1)||u n − u n |

≤ λ||e

v0 +vn

(e

u 0 +u n

−1/2 − 1)||∞ ||1 ||∇u n ||2 ,

(3.32)

and hence ||∇u n || ≤ C(λ, 1 , ||), where 1 denotes the first eigenvalue of for a torus. Thus the sequence {||∇u n ||}n is uniformly bounded. Again, the Poincare inequality implies that −1/2 (3.33) ||u n − u n || L 2 () ≤ C1 ||∇u n || ≤ C(λ, 1 , ||), and hence {||u n −

u n ||} is uniformly bounded (independent of n). From the Calderón

Zygmund inequality we conclude ||u n − u n ||W 2,2 () ≤ C(λ, 1 , ||)

(3.34)

uniformly in n and it follows from the Sobolev embedding theorem that ||u n − u n ||C 0 () ≤ C(λ, 1 , ||)

(3.35)



uniformly in n. If there exists x0 ∈ such that u n (x0 ) → −∞ as λn → λ∗ , then inequality (3.34) implies that u n (x) → −∞ as λn → λ∗ for all x ∈ . But then, substituting (u n , vn ) in (3.15) with λ = λn and integrating by parts, we have λ ev0 +vn (1 − eu 0 +u n ) = 4π N1 , λ eu 0 +u n (1 − ev0 +vn ) = 4π N2 , (3.36)

and we get a contradiction if eu 0 +u n → 0 as λn → λ∗ . Therefore, the limits limλ→λ∗ u n (x) := u ∗ (x) and limλn →λ∗ vn (x) := v∗ (x) exist. (ii) By definition (u n , vn ) → (u ∗ , v∗ ) pointwise a.e. in . Since ev0 +vn < 1 and 0 < 1 − eu 0 +u n < 1, ev0 +vn (1 − eu 0 +u n ) are uniformly bounded in L p () for all p ≥ 1 and λn ev0 +vn (1 − eu 0 +u n ) → λ∗ ev0 +v∗ (1 − eu 0 +u ∗ ) pointwise in . Again, as in (3.32) we get ||∇u n ||22 = λn ev0 +vn (eu 0 +u n − 1)u n d x ≤ C(λ∗ )||ev0 +vn (eu 0 +u n − 1)|| L 2 ||u n || L 2 d x

≤ C(λ∗ , ||, 1 )||∇u n ||2 ,

(3.37)

where C(λ∗ , ||, 1 ) is constant independent of n. Similarly, ||∇vn || is uniformly bounded. Moreover, since the functions {u n }, {vn } are a monotone sequences of functions bounded below by u ∗ , v∗ almost everywhere, it follows that u n → u ∗ , vn → v∗ in the L 2 norm. Therefore, (u n , vn ) (u ∗ , v∗ ) weakly in H 1 () × H 1 () and strongly in L p () × L p () for all p ≥ 1. Using the fact that (u n , vn ) satisfies the Eq. (3.15) with λ = λn for all ϕ ∈ H 1 (), ∇u∇ϕ = lim ∇u n ∇ϕ λn →λ∗

= lim

λn →λ∗

=

λn ev0 +v∗ (eu 0 +u ∗

Moreover, since the map we conclude that

H 1 ()

4π N1 || 4π N1 − 1) + ϕ. ||

λn ev0 +vn (eu 0 +u n − 1) +

ϕ

(3.38)

→

L 1 ()

given by ϕ →

e pϕ , p ∈ R is compact,

eu 0 +u n → eu 0 +u ∗ , ev0 +vn → ev0 +v∗ as λn → λ∗

(3.39)

in L p norm for all p > 0. Therefore, from (3.38) and the fact that u n satisfies (3.15), we get |∇(u n − u ∗ )|2 = (λn ev0 +vn (eu 0 +u n − 1) − λ∗ ev0 +v∗ (eu 0 +u ∗ − 1))(u n − u ∗ )

≤ ||λn ev0 +vn (eu 0 +u n − 1) − λ∗ ev0 +v∗ (eu 0 +u ∗ − 1)|| L 2 ||u n − u ∗ || L 2 ≤ → 0 asλn → λ∗ (3.40)

and u n → u ∗ strongly in H 1 (). Similarly, we can see that vn → v∗ strongly in H 1 ().



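(iii) Moreover, $(u_*, v_*)$ is a strict sub solution of (3.15) for any $\lambda > \lambda_*$ since
\[
\Delta u_* = \lambda_* e^{v_*}(e^{u_*} - 1) + \frac{4\pi N_1}{|\Omega|} > \lambda e^{v_*}(e^{u_*} - 1) + \frac{4\pi N_1}{|\Omega|} \tag{3.41}
\]
and
\[
\Delta v_* = \lambda_* e^{u_*}(e^{v_*} - 1) + \frac{4\pi N_2}{|\Omega|} > \lambda e^{u_*}(e^{v_*} - 1) + \frac{4\pi N_2}{|\Omega|}. \tag{3.42}
\]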

Hence, we must have u ∗ λ∗ . Note that this also implies the optimality of λ∗ , i.e., for λ < λ∗ , (1.1) has no solution. Now, Theorem 1.1 is a consequence of Theorem 3.1 and Theorem 3.2 . 4. An Approximate Problem We shall now consider an approximate problem to (3.1) replacing the measures µ and ν by smooth, positive approximating functions. Choose δ > 0 such that the balls B( p, δ), p ∈ S1 ∪ S2 are mutually disjoint. For this choice of δ, let η(x) = η(|x|) denote a ∞ C with compact support such that 0 ≤ η ≤ 1, η ≡ 1 in B(0, δ/2) and function ε η(r ) = π . Thus, (ε+r 2 )2

ε η → π δ0 (ε + r 2 )2

(4.1)

in the sense of distribution. 4π ε Let ρ(r ) := (ε+r 2 )2 η(r ) and define the functions µε (x) :=

k1

4π m j ρ(|x − p j |), νε (x) :=

j=1

k2

4π n j ρ(|x − q j |).

(4.2)

j=1

Then, µε ≥ 0, νε ≥ 0 are smooth, non negative functions such that µε µ, νε ν in the sense of distribution as ε → 0. Moreover, µε = 4π N1 , νε = 4π N2 for all ε > 0. (4.3)
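The normalization behind (4.1) can be checked numerically. The sketch below assumes only the kernel $\varepsilon/(\varepsilon + r^2)^2$ from (4.1), without the cutoff $\eta$ (which equals 1 near the origin), and compares a midpoint-rule radial integral over a disc of radius $R$ with the closed form $\pi R^2/(\varepsilon + R^2)$, which tends to $\pi$ as $\varepsilon \to 0$. The function names and the step count are illustrative choices, not taken from the paper.

```python
import numpy as np

def kernel_mass(eps, radius, steps=200_000):
    """Midpoint-rule value of the integral of eps/(eps+r^2)^2 over the disc |x| < radius."""
    h = radius / steps
    r = (np.arange(steps) + 0.5) * h
    integrand = 2.0 * np.pi * r * eps / (eps + r ** 2) ** 2  # polar coordinates
    return integrand.sum() * h

def kernel_mass_exact(eps, radius):
    """Closed form of the same integral, obtained with the substitution s = r^2."""
    return np.pi * radius ** 2 / (eps + radius ** 2)

mass_coarse = kernel_mass(1e-3, 0.5)
mass_fine = kernel_mass(1e-6, 0.5)
```

As $\varepsilon$ decreases, the mass inside any fixed disc approaches $\pi$ while the kernel concentrates at the origin, which is the content of (4.1).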

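Now consider the system
\[
\begin{aligned}
\Delta u &= \lambda e^{v}(e^{u} - 1) + \mu_\varepsilon,\\
\Delta v &= \lambda e^{u}(e^{v} - 1) + \nu_\varepsilon
\end{aligned} \quad \text{in } \Omega. \tag{4.4}
\]
Arguing as in Step 1 in the proof of Theorem 3.1, any solution $(u^\varepsilon, v^\varepsilon)$ of (4.4) must satisfy
\[
u^\varepsilon(x) < 0, \qquad v^\varepsilon(x) < 0 \quad \text{in } \Omega, \tag{4.5}
\]
and integrating by parts, we immediately see that necessarily
\[
\lambda > \frac{4\pi}{|\Omega|} \max\{N_1, N_2\}. \tag{4.6}
\]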
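The necessary condition (4.6) is easy to evaluate for given vortex data. The helper below is a hypothetical illustration: it assumes, in line with (4.2) and (4.3), that $N_1$ and $N_2$ are the sums of the multiplicities $m_j$ and $n_j$, and that the area $|\Omega|$ is supplied by the caller.

```python
import math

def lambda_lower_bound(m, n, area):
    """Right-hand side of (4.6): 4*pi*max(N1, N2)/|Omega|, with N1 = sum(m), N2 = sum(n)."""
    n1, n2 = sum(m), sum(n)
    return 4.0 * math.pi * max(n1, n2) / area

# Hypothetical data: multiplicities (1, 2) for the first field, (1,) for the second.
bound_unit_area = lambda_lower_bound([1, 2], [1], area=1.0)
bound_double_area = lambda_lower_bound([1, 2], [1], area=2.0)
```

Any admissible coupling constant must exceed this value; the condition is necessary, not sufficient.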

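From [1], there exist unique, smooth functions $u_0^\varepsilon$, $v_0^\varepsilon$ satisfying
\[
\begin{aligned}
\Delta u_0^\varepsilon &= -\frac{4\pi N_1}{|\Omega|} + \mu_\varepsilon; \qquad \int_\Omega u_0^\varepsilon = 0,\\
\Delta v_0^\varepsilon &= -\frac{4\pi N_2}{|\Omega|} + \nu_\varepsilon; \qquad \int_\Omega v_0^\varepsilon = 0
\end{aligned} \quad \text{in } \Omega. \tag{4.7}
\]
It is easy to verify that
\[
u_0^\varepsilon(x) = \sum_{j=1}^{k_1} m_j \ln(|x - p_j|^2 + \varepsilon) + O(1), \qquad
v_0^\varepsilon(x) = \sum_{j=1}^{k_2} n_j \ln(|x - q_j|^2 + \varepsilon) + O(1).
\]
Thus,
\[
\begin{aligned}
&u_0^\varepsilon \to u_0 \ \text{in } C^2(\Omega \setminus S_1), \qquad v_0^\varepsilon \to v_0 \ \text{in } C^2(\Omega \setminus S_2) \qquad \text{as } \varepsilon \to 0,\\
&e^{u_0^\varepsilon} \to e^{u_0} \ \text{in } C^0(\Omega), \qquad e^{v_0^\varepsilon} \to e^{v_0} \ \text{in } C^0(\Omega) \qquad \text{as } \varepsilon \to 0.
\end{aligned} \tag{4.8}
\]
After the change of variables $u \to u + u_0^\varepsilon$, $v \to v + v_0^\varepsilon$, the system (4.4) is equivalent to
\[
\begin{aligned}
\Delta u &= \lambda e^{v_0^\varepsilon + v}(e^{u_0^\varepsilon + u} - 1) + \frac{4\pi N_1}{|\Omega|},\\
\Delta v &= \lambda e^{u_0^\varepsilon + u}(e^{v_0^\varepsilon + v} - 1) + \frac{4\pi N_2}{|\Omega|}
\end{aligned} \quad \text{in } \Omega. \tag{4.9}
\]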
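The equivalence of (4.4) and (4.9) amounts to a one-line substitution. As a sketch for the first equation (the second is analogous), assuming only (4.4) and the defining equation for $u_0^\varepsilon$ in (4.7):

```latex
% Substitute u -> u + u_0^eps, v -> v + v_0^eps in the first equation of (4.4),
% then subtract the equation for u_0^eps from (4.7).
\begin{align*}
\Delta (u + u_0^\varepsilon)
  &= \lambda e^{v_0^\varepsilon + v}\bigl(e^{u_0^\varepsilon + u} - 1\bigr) + \mu_\varepsilon, \\
\Delta u
  &= \lambda e^{v_0^\varepsilon + v}\bigl(e^{u_0^\varepsilon + u} - 1\bigr)
     + \mu_\varepsilon - \Bigl(-\frac{4\pi N_1}{|\Omega|} + \mu_\varepsilon\Bigr) \\
  &= \lambda e^{v_0^\varepsilon + v}\bigl(e^{u_0^\varepsilon + u} - 1\bigr)
     + \frac{4\pi N_1}{|\Omega|}.
\end{align*}
```

The smooth source term $\mu_\varepsilon$ cancels, leaving only the constant $4\pi N_1/|\Omega|$ on the right-hand side.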

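Next, we show that solutions of (4.4) converge to solutions of (3.1) as $\varepsilon \to 0$.

Lemma 4.1. For a fixed $\lambda$, assume that the system (4.9) has a solution $(u^\varepsilon, v^\varepsilon)$ for all $\varepsilon \in (0, \varepsilon_0)$. Then

(i) $\lim_{\varepsilon \to 0} u^\varepsilon(x)$ and $\lim_{\varepsilon \to 0} v^\varepsilon(x)$ exist;

(ii) $(u^\varepsilon, v^\varepsilon) \to (u, v)$ strongly in $H^1(\Omega) \times H^1(\Omega)$ and $(u, v)$ is a solution for (3.15).

Proof. Since
\[
u_0^\varepsilon + u^\varepsilon < 0, \qquad v_0^\varepsilon + v^\varepsilon < 0, \tag{4.10}
\]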

{u ε }ε , {v ε }ε are uniformly bounded above in for all ε > 0. We need to prove that ε lim inf ε→0 u ε (x) and lim inf ε→0 v (x) exists almost everywhere. Multiplying the first equation in (4.9) by u ε − u ε and integrating by parts, using the Poincare inequality we get

ε 2

|∇u | ≤

ε

ε

ε

ε

|λev0 +v (eu 0 +u − 1)||u ε

−

ε

ε

−1/2

u ε | ≤ λ||ev0 +v (eu 0 +u − 1)||∞ ||1

||∇u ε ||2 ,



and hence ||∇u ε || ≤ C(λ, 1 , ||),

(4.11)

where 1 denotes the first eigenvalue of for a torus. Thus the sequence {||∇u ε ||}ε is uniformly bounded. Again, the Poincare inequality implies that −1/2 ||u ε − u ε || ≤ C1 ||∇u ε || ≤ C(λ, 1 , ||), (4.12) and hence

{||u ε

−

u ε ||}

is uniformly bounded (independent of ε). From the Calderón

Zygmund inequality we conclude ε ||u − u ε ||W 2,2 () ≤ C(λ, 1 , ||),

(4.13)

uniformly in ε and it follows from the Sobolev embedding theorem that ε ||u − u ε ||C 0 () ≤ C(λ, 1 , ||)

(4.14)

uniformly in ε. If there exists x0 ∈ such that u ε (x0 ) → −∞ as ε → 0, then inequality (4.14) implies that u ε (x) → −∞ as ε → 0 for all x ∈ . But then, substituting (u ε , v ε ) in (4.9) and integrating by parts, we have ε ε ε ε λ ev0 +v (1 − eu 0 +u ) = 4π N1 ,

ε

ε

ε

ε

eu 0 +u (1 − ev0 +v ) = 4π N2 ,

λ

(4.15)

ε

ε

and we get a contradiction if eu 0 +u → 0 as ε → 0. Therefore, the limits limε→0 u ε (x) := u(x) and limε→0 v ε (x) := v(x) exist and hence ||u ε || → ||u||and ||v ε || → ||v|| in the L 2 norm. Together with (4.11) we conclude that {u ε }, {v ε } are uniformly bounded in H 1 () and hence u ε u, v ε v weakly in H 1 () as ε → 0. Hence, for ϕ ∈ H 1 (), ∇u∇ϕ = lim ∇u ε ∇ϕ ε→∞

= lim

ε→∞

=

ε

ε

ε

ε

λev0 +v (eu 0 +u − 1)ϕ +

λev0 +v (eu 0 +u

4π N1 ||

ϕ

4π N1 ϕ. − 1)ϕ + ||

(4.16)

From the Moser-Trudinger inequality and the fact that (u ε , v ε ) are uniformly bounded ε ε ε ε in H 1 (), we conclude that ev0 +v (eu 0 +u − 1) → ev0 +v (eu 0 +u − 1) in L p for all p > 0. ε Now, using the equation satisfied by (u , v ε ) and (4.16), taking ϕ = (u ε −u)− (u ε −u), it can be verified that (u ε , v ε ) → (u, v) strongly in H 1 () × H 1 ().



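Next we prove

Proposition 4.1. (Existence of maximal solutions for the system (4.4)) Given $\varepsilon_0 > 0$ small, there exists $\lambda_0 = \lambda_0(\varepsilon_0)$ depending on $\varepsilon_0$ such that for each $\varepsilon \in (0, \varepsilon_0)$ the system (4.4) has a unique, smooth maximal solution $(u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ for all $\lambda > \lambda_0$ satisfying
\[
u_\lambda^\varepsilon < 0, \qquad v_\lambda^\varepsilon < 0. \tag{4.17}
\]

Proof. Since $(0, 0)$ is a super solution of (4.4), the iteration scheme (3.2) in Proposition 3.1 can be applied to system (4.4) to obtain a smooth solution once we prove the existence of a sub solution for (4.4) for all $0 < \varepsilon < \varepsilon_0$. From Theorem 4.18 in [1], for a fixed positive constant $M > 0$, the equations
\[
\begin{aligned}
(\Delta - M)w &= \frac{4\pi N_1}{|\Omega|},\\
(\Delta - M)z &= \frac{4\pi N_2}{|\Omega|}
\end{aligned} \tag{4.18}
\]
have unique smooth solutions $(w, z)$. Let $a_1(\varepsilon_0) = \sup_{\varepsilon \le \varepsilon_0} \sup_\Omega u_0^\varepsilon$, $a_2(\varepsilon_0) = \sup_{\varepsilon \le \varepsilon_0} \sup_\Omega v_0^\varepsilon$ and choose a constant $c(\varepsilon_0) > 0$ such that $\sup_{x \in \Omega} w(x) - c(\varepsilon_0) + a_1(\varepsilon_0) < 0$, $\sup_{x \in \Omega} z(x) - c(\varepsilon_0) + a_2(\varepsilon_0) < 0$. Then the functions $w_1 := w - c(\varepsilon_0)$, $z_1 := z - c(\varepsilon_0)$ are such that
\[
\begin{aligned}
\Delta w_1 &= M w_1 + M c(\varepsilon_0) + \frac{4\pi N_1}{|\Omega|}, \qquad w_1 + u_0^\varepsilon < 0 \ \text{for all } \varepsilon \in [0, \varepsilon_0],\\
\Delta z_1 &= M z_1 + M c(\varepsilon_0) + \frac{4\pi N_2}{|\Omega|}, \qquad z_1 + v_0^\varepsilon < 0 \ \text{for all } \varepsilon \in [0, \varepsilon_0].
\end{aligned} \tag{4.19}
\]
Hence, there exists $\lambda_0(\varepsilon_0) > 0$ depending on $\varepsilon_0$ such that
\[
\begin{aligned}
\Delta w_1 &= M w_1 + M c(\varepsilon_0) + \frac{4\pi N_1}{|\Omega|} > \lambda e^{v_0^\varepsilon + z_1}(e^{u_0^\varepsilon + w_1} - 1) + \frac{4\pi N_1}{|\Omega|},\\
\Delta z_1 &= M z_1 + M c(\varepsilon_0) + \frac{4\pi N_2}{|\Omega|} > \lambda e^{u_0^\varepsilon + w_1}(e^{v_0^\varepsilon + z_1} - 1) + \frac{4\pi N_2}{|\Omega|}
\end{aligned}
\]
for all $\lambda > \lambda_0$. Thus, $(w_1, z_1)$ is a sub solution of (4.9), independent of $\varepsilon$, for all $\lambda > \lambda_0$. From Proposition 3.1, there exists a smooth, maximal solution $(u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ of (4.9) for every $\lambda > \lambda_0$.

From Proposition 4.1, the set $\{\lambda > 0 : \text{there exists a maximal solution of (4.9)}\}$ is non-empty for all $\varepsilon \in (0, \varepsilon_0]$, while, from the necessary condition (4.6),
\[
\lambda_\varepsilon := \inf\{\lambda > 0 : \text{there exists a maximal solution of (4.9)}\} \tag{4.20}
\]
exists. By definition, $\lambda_\varepsilon$ is an optimal value for which a solution exists for (4.9). We will show that the family of solutions $(u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ has properties similar to the ones mentioned in Theorem 1.1.

Theorem 4.1. For each $\varepsilon \in (0, \varepsilon_0]$ let $\lambda_\varepsilon$ be defined by (4.20). Then

(i) the map $\lambda \to (u_\lambda^\varepsilon, v_\lambda^\varepsilon)$ is monotone in $(\lambda_\varepsilon, \infty)$;

(ii) the limits
\[
\lim_{\lambda \to \lambda_\varepsilon} u_\lambda^\varepsilon := u_*^\varepsilon; \qquad \lim_{\lambda \to \lambda_\varepsilon} v_\lambda^\varepsilon := v_*^\varepsilon \tag{4.21}
\]
exist and $(u_*^\varepsilon, v_*^\varepsilon)$ is a solution of (4.9) with $\lambda = \lambda_*^\varepsilon$. In particular, $\lambda_*^\varepsilon > \frac{4\pi}{|\Omega|} \max\{N_1, N_2\}$ and $(u_*^\varepsilon, v_*^\varepsilon)$ is a strict sub solution of (4.9) for all $\lambda > \lambda_*^\varepsilon$.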



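The proof of Theorem 4.1 is similar to the proof of Theorem 3.2.

Remark 4.1. Since $\lambda_\varepsilon > \frac{4\pi \max\{N_1, N_2\}}{|\Omega|}$ for all $\varepsilon \in (0, \varepsilon_0)$, the limit
\[
\lim_{\varepsilon \to 0} \lambda_\varepsilon := \bar\lambda \tag{4.22}
\]
exists. From Lemma 4.1, the sequence of solutions $(u_{\lambda_\varepsilon}^\varepsilon, v_{\lambda_\varepsilon}^\varepsilon)$ converges to a solution $(u, v)$ of (3.15) with $\lambda = \bar\lambda$. Hence
\[
\bar\lambda \ge \lambda_*, \tag{4.23}
\]
with $\lambda_*$ defined in Theorem 1.1. To prove $\bar\lambda = \lambda_*$, we have to show that for every $\lambda > \lambda_*$, the problem (4.9) can be solved for all small $\varepsilon > 0$. This will be achieved by studying the linearized system corresponding to (3.15) at the maximal solution $(u_\lambda, v_\lambda)$, namely,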

\[
\begin{aligned}
\Delta w - \lambda e^{u_0 + u_\lambda} w + \lambda e^{v_0 + v_\lambda}(1 - e^{u_0 + u_\lambda}) z &= 0,\\
\Delta z + \lambda e^{u_0 + u_\lambda}(1 - e^{v_0 + v_\lambda}) w - \lambda e^{v_0 + v_\lambda} z &= 0.
\end{aligned} \tag{4.24}
\]
Heuristically, the fact that the map $\lambda \to (u_\lambda, v_\lambda)$ is monotone implies that it is differentiable almost everywhere. Thus Equation (3.15) can be differentiated with respect to $\lambda$, and at such $\lambda$ the linearized equation at $(u_\lambda, v_\lambda)$ will be proved to be non-singular. Conventionally, the non-singularity can be obtained by proving that the null space of the linearized equation is trivial. Since the linearized equation is a system of linear elliptic equations, it is not easy to show the triviality of the null space. Instead, in the following, we will show that the linear system is onto, and then the non-singularity will follow from the Fredholm alternative theorem.

Lemma 4.2. Let $(u_\lambda, v_\lambda)$ denote the maximal solution for (3.15) for $\lambda > \lambda_*$. Then, the map $\lambda \to (u_\lambda, v_\lambda)$ is differentiable almost everywhere in $(\lambda_*, +\infty)$.

Proof. Let $\lambda \in (\lambda_*, \infty)$ and $h$ be sufficiently small such that $\lambda + h \in (\lambda_*, \infty)$. The difference $u_{\lambda+h} - u_\lambda$ satisfies the equation
\[
\begin{aligned}
\Delta(u_{\lambda+h} - u_\lambda) ={}& \lambda\bigl(e^{u_0 + v_0 + u_{\lambda+h} + v_{\lambda+h}} - e^{u_0 + v_0 + u_\lambda + v_\lambda}\bigr) - \lambda\bigl(e^{v_0 + v_{\lambda+h}} - e^{v_0 + v_\lambda}\bigr)\\
&+ h e^{v_0 + v_{\lambda+h}}\bigl(e^{u_0 + u_{\lambda+h}} - 1\bigr).
\end{aligned} \tag{4.25}
\]
For each fixed $x \in \Omega$, the sequence $\{(u_\lambda(x), v_\lambda(x))\}_\lambda$ is monotone in $\lambda$ and hence so is the sequence $\{(u_\lambda^2(x), v_\lambda^2(x))\}_\lambda$. Therefore, the map $\lambda \to \bigl(\int_\Omega u_\lambda^2(x)\,dx, \int_\Omega v_\lambda^2(x)\,dx\bigr)$ is also

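monotone and hence there exists a subset $E_1 \subset (\lambda_*, \infty)$ with measure $|(\lambda_*, \infty) \setminus E_1| = 0$ and such that
\[
\lambda \to \Bigl(\int_\Omega u_\lambda(x)\,dx, \int_\Omega v_\lambda(x)\,dx\Bigr) \quad\text{and}\quad \lambda \to \Bigl(\int_\Omega u_\lambda^2\,dx, \int_\Omega v_\lambda^2\,dx\Bigr) \tag{4.26}
\]
are differentiable for all $\lambda \in E_1$. Also, note that the maps
\[
\lambda \to e^{u_0(x) + v_0(x) + u_\lambda(x) + v_\lambda(x)}, \qquad \lambda \to e^{u_0(x) + u_\lambda(x)}, \qquad \lambda \to e^{v_0(x) + v_\lambda(x)} \tag{4.27}
\]
are all monotone in $\lambda$ for each $x \in \Omega$. Thus, there exists $E_2 \subset (\lambda_*, \infty)$ with $|(\lambda_*, \infty) \setminus E_2| = 0$ and
\[
\begin{aligned}
&\lambda \to \int_\Omega e^{u_0 + v_0 + u_\lambda + v_\lambda}, \qquad \lambda \to \int_\Omega e^{u_0 + u_\lambda}, \qquad \lambda \to \int_\Omega e^{v_0 + v_\lambda},\\
&\lambda \to \int_\Omega e^{2u_0 + 2v_0 + 2u_\lambda + 2v_\lambda}, \qquad \lambda \to \int_\Omega e^{2u_0 + 2u_\lambda}, \qquad \lambda \to \int_\Omega e^{2v_0 + 2v_\lambda}
\end{aligned} \tag{4.28}
\]
are differentiable for all $\lambda \in E_2$. From $L^2$ estimates and (4.25) we have
\[
\begin{aligned}
\|u_{\lambda+h} - u_\lambda\|_{W^{2,2}(\Omega)} \le{}& \|u_{\lambda+h} - u_\lambda\|_{L^2(\Omega)} + \|\Delta(u_{\lambda+h} - u_\lambda)\|_{L^2(\Omega)}\\
\le{}& o(1)|h| + \lambda \bigl\|e^{u_0 + v_0 + u_{\lambda+h} + v_{\lambda+h}} - e^{u_0 + v_0 + u_\lambda + v_\lambda}\bigr\| + \lambda \bigl\|e^{v_0 + v_{\lambda+h}} - e^{v_0 + v_\lambda}\bigr\|\\
&+ |h| \, \bigl\|e^{v_0 + v_{\lambda+h}}(e^{u_0 + u_{\lambda+h}} - 1)\bigr\|,
\end{aligned} \tag{4.29}
\]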

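where $o(1) \to 0$ as $|h| \to 0$. From (4.28), for $\lambda \in E_2$, we have
\[
\begin{aligned}
\bigl\|e^{u_0 + v_0 + u_{\lambda+h} + v_{\lambda+h}} - e^{u_0 + v_0 + u_\lambda + v_\lambda}\bigr\|_{L^2}^2
={}& \int_\Omega \bigl(e^{u_0 + v_0 + u_{\lambda+h} + v_{\lambda+h}} - e^{u_0 + v_0 + u_\lambda + v_\lambda}\bigr)^2\,dx\\
={}& \int_\Omega \bigl(e^{2u_0 + 2v_0 + 2u_{\lambda+h} + 2v_{\lambda+h}} - e^{2u_0 + 2v_0 + 2u_\lambda + 2v_\lambda}\bigr)\,dx\\
&+ 2\int_\Omega \bigl(e^{2u_0 + 2v_0 + 2u_\lambda + 2v_\lambda} - e^{u_0 + v_0 + u_{\lambda+h} + v_{\lambda+h}} e^{u_0 + v_0 + u_\lambda + v_\lambda}\bigr)\,dx\\
\le{}& o(1)|h|
\end{aligned} \tag{4.30}
\]
and similarly,
\[
\bigl\|e^{v_0 + v_{\lambda+h}} - e^{v_0 + v_\lambda}\bigr\|_{L^2}^2 = \int_\Omega \bigl(e^{2v_0 + 2v_{\lambda+h}} - e^{2v_0 + 2v_\lambda}\bigr) + 2\int_\Omega e^{v_0 + v_\lambda}\bigl(e^{v_0 + v_\lambda} - e^{v_0 + v_{\lambda+h}}\bigr) \le o(1)|h|, \tag{4.31}
\]
where $o(1) \to 0$ as $|h| \to 0$. Moreover, for $\lambda \in E_1$, from (4.26),
\[
\|u_{\lambda+h} - u_\lambda\|_{L^2}^2 = \int_\Omega (u_{\lambda+h}^2 - u_\lambda^2)\,dx + 2\int_\Omega u_\lambda(u_\lambda - u_{\lambda+h})\,dx \le o(1)|h| \tag{4.32}
\]
with $o(1) \to 0$ as $|h| \to 0$. Note that we use the fact that the functions $u_\lambda$, $v_\lambda$, $e^{u_\lambda}$, $e^{v_\lambda}$ are bounded on $\Omega$. Hence, if we choose $\lambda \in E_1 \cap E_2$, then $|(\lambda_*, \infty) \setminus (E_1 \cap E_2)| = 0$ and from (4.29), (4.30), (4.31) and (4.32) we conclude that
\[
\|u_{\lambda+h} - u_\lambda\|_{W^{2,2}(\Omega)} < o(1)|h|, \tag{4.33}
\]
$o(1) \to 0$ as $|h| \to 0$. In particular, from the Sobolev embedding theorem,
\[
\|u_{\lambda+h} - u_\lambda\|_{C^0(\Omega)} < o(1)|h|, \tag{4.34}
\]
$o(1) \to 0$ as $|h| \to 0$, i.e., $\lambda \to u_\lambda \in C^0(\Omega)$ is differentiable for all $\lambda \in E_1 \cap E_2$. Similarly, $\lambda \to v_\lambda$, defined from $E_1 \cap E_2 \to C^0(\Omega)$, is differentiable.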



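Proposition 4.2. For any fixed $\lambda_1 > \lambda_*$ and $(u_{\lambda_1}, v_{\lambda_1}) := (u_1, v_1)$ the maximal solution for (3.15), the linearized operator $L : W^{2,2}(\Omega) \times W^{2,2}(\Omega) \to L^2(\Omega) \times L^2(\Omega)$ defined by
\[
L := \begin{pmatrix}
\Delta - \lambda e^{u_0 + u_\lambda} & e^{v_0 + v_1}(1 - e^{u_0 + u_1})\\
e^{u_0 + u_1}(1 - e^{v_0 + v_1}) & \Delta - \lambda e^{v_0 + v_\lambda}
\end{pmatrix} \tag{4.35}
\]
is invertible.

Proof. The proof is complete once we show that the map $L$ is onto.

Step (i). Let $\varphi$, $\psi$ be non-negative, continuous functions defined on $\Omega$ with the property that
\[
\varphi \equiv 0 \ \text{in a neighbourhood of } S_1, \qquad \psi \equiv 0 \ \text{in a neighbourhood of } S_2. \tag{4.36}
\]
We claim $(\varphi, \psi)$ lies in the image of $L$: For $\lambda > \lambda_1$, consider the equation
\[
\begin{aligned}
\Delta u + \lambda e^{v_0 + v}(1 - e^{u_0 + u}) &= \frac{4\pi N_1}{|\Omega|} + (\lambda - \lambda_1)\varphi,\\
\Delta v + \lambda e^{u_0 + u}(1 - e^{v_0 + v}) &= \frac{4\pi N_2}{|\Omega|} + (\lambda - \lambda_1)\psi.
\end{aligned} \tag{4.37}
\]
Since $(\lambda - \lambda_1)\varphi \ge 0$ and $(\lambda - \lambda_1)\psi \ge 0$ and $(u_\lambda, v_\lambda)$ satisfies (3.15), it follows that $(u_\lambda, v_\lambda)$ is a super solution for (4.37) for any $\lambda \ge \lambda_1$. Next, we show that (4.37) has a sub solution for all $\lambda \in (\lambda_1, \lambda_1 + \delta)$, with $\delta > 0$ small, to be chosen later. Set $\sigma = \lambda_1 - c(\lambda - \lambda_1)$, where $c$ is a positive constant to be chosen. If $(u_\sigma, v_\sigma)$ denotes the maximal solution of (4.9) corresponding to $\sigma = \lambda_1 - c(\lambda - \lambda_1)$, then
\[
\begin{aligned}
\Delta u_\sigma + \lambda e^{v_0 + v_\sigma}(1 - e^{u_0 + u_\sigma}) &= \frac{4\pi N_1}{|\Omega|} + (c + 1)(\lambda - \lambda_1) e^{v_0 + v_\sigma}(1 - e^{u_0 + u_\sigma}),\\
\Delta v_\sigma + \lambda e^{u_0 + u_\sigma}(1 - e^{v_0 + v_\sigma}) &= \frac{4\pi N_2}{|\Omega|} + (c + 1)(\lambda - \lambda_1) e^{u_0 + u_\sigma}(1 - e^{v_0 + v_\sigma}).
\end{aligned} \tag{4.38}
\]
Since $\varphi$ and $\psi$ both vanish near the singular points, we can choose $c$ large enough so that $\varphi \le (c + 1) e^{v_0 + v_\sigma}(1 - e^{u_0 + u_\sigma})$, $\psi \le (c + 1) e^{u_0 + u_\sigma}(1 - e^{v_0 + v_\sigma})$. Now, choose $\delta > 0$ sufficiently small such that $\lambda_1 - c(\lambda - \lambda_1) > \lambda_*$ for all $\lambda \in (\lambda_1, \lambda_1 + \delta)$, so that the maximal solution $(u_\sigma, v_\sigma)$ with $\sigma = \lambda_1 - c(\lambda - \lambda_1)$ exists. Hence $(u_\sigma, v_\sigma)$ is a sub solution of (4.37) for all $\lambda \in (\lambda_1, \lambda_1 + \delta)$. By Proposition 3.1, (4.37) has a monotone family of maximal solutions $(\tilde u_\lambda, \tilde v_\lambda)$ such that
\[
u_\sigma \le \tilde u_\lambda \le u_\lambda, \qquad v_\sigma \le \tilde v_\lambda \le v_\lambda \qquad \text{for all } \lambda \in (\lambda_1, \lambda_1 + \delta), \tag{4.39}
\]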



where u˜ λ1 = u 1 , v˜λ1 = v1 In particular, for λ ∈ (λ1 , λ1 + δ), uσ − u1 u˜ λ − u 1 uλ − u1 ≤ ≤ , λ − λ1 λ − λ1 λ − λ1 and the limits u˜ λ −u 1 λ→λ1 λ−λ1 lim u σ −u 1 λ→λ1 λ−λ1

lim

(4.40)

⎫ ⎬ (4.41)

⎭

u˜ λ v˜λ |λ=λ1 exists. Similarly, ∂∂λ |λ=λ1 exist. Clearly, if A := exist. Thus the derivative ∂∂λ ∂ u˜ λ ∂u λ ∂ v˜λ ∂vλ ( ∂λ |λ=λ1 − ∂λ |λ=λ1 ), B := ( ∂λ |λ=λ1 − ∂λ |λ=λ1 ) then L(A, B) = (ϕ, ψ). Hence (ϕ, ψ) are in the image of the operator L. Step (ii). It is easy to see that the linear subspace spanned by the set of functions (ϕ, ψ) in Step (i) satisfying the property (4.36) is dense in L 2 () × L 2 (). Since the image of L is closed in L 2 () × L 2 (), we have L is onto. By the Fredholm alternative theorem, L is invertible.

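We can now prove

Theorem 4.2. $\lim_{\varepsilon \to 0} \lambda_\varepsilon = \bar\lambda = \lambda_*$.

Proof. For simplicity of notation, let
\[
f_\varepsilon(u, v) := e^{v_0^\varepsilon + v}(e^{u_0^\varepsilon + u} - 1), \qquad g_\varepsilon(u, v) := e^{u_0^\varepsilon + u}(e^{v_0^\varepsilon + v} - 1) \tag{4.42}
\]
with the convention that $f_0(u, v) := e^{v_0 + v}(e^{u_0 + u} - 1)$, $g_0(u, v) := e^{u_0 + u}(e^{v_0 + v} - 1)$, so that system (4.4) can be rewritten as
\[
\begin{aligned}
\Delta u &= \lambda f_\varepsilon(u, v) + \frac{4\pi N_1}{|\Omega|},\\
\Delta v &= \lambda g_\varepsilon(u, v) + \frac{4\pi N_2}{|\Omega|}
\end{aligned} \tag{4.43}
\]
for $\varepsilon > 0$, and for $\varepsilon = 0$ we get (3.15). Consider the map $\Phi : [0, \varepsilon_0) \times \mathbb{R} \times W^{2,2}(\Omega) \times W^{2,2}(\Omega) \to \mathbb{R} \times L^2(\Omega) \times L^2(\Omega)$ defined by
\[
\Phi(\varepsilon, \lambda, u, v) = \Phi_\varepsilon(\lambda, u, v) := \bigl(\lambda, \ \Delta u - \lambda f_\varepsilon(u, v), \ \Delta v - \lambda g_\varepsilon(u, v)\bigr). \tag{4.44}
\]
The map $\Phi_\varepsilon$ is $C^1$ in the $\lambda, u, v$ variables and its derivative at a point $(\lambda, u, v)$ is given by
\[
D\Phi_\varepsilon(\lambda, u, v) = \begin{pmatrix}
1 & 0 & 0\\
-f_\varepsilon(u, v) & \Delta - \lambda \partial_1 f_\varepsilon(u, v) & -\lambda \partial_2 f_\varepsilon(u, v)\\
-g_\varepsilon(u, v) & -\lambda \partial_1 g_\varepsilon(u, v) & \Delta - \lambda \partial_2 g_\varepsilon(u, v)
\end{pmatrix}, \tag{4.45}
\]
where $\partial_i$ denotes the partial derivative with respect to the $i$th variable, $i = 1, 2$. From Remark 4.1, $\bar\lambda \ge \lambda_*$. Suppose $\bar\lambda > \lambda_*$. From Lemma 4.2 and Proposition 4.2, there exists $\lambda_1 \in (\lambda_*, \bar\lambda)$ such that $(\partial_\lambda u_\lambda, \partial_\lambda v_\lambda)|_{\lambda = \lambda_1}$ exists and the operator $L$, and hence $D\Phi_0$, is invertible at $(u_{\lambda_1}, v_{\lambda_1})$. By the implicit function theorem there exists $\varepsilon_0 > 0$ small such that (4.44) has a solution for all $\varepsilon \in (0, \varepsilon_0)$, and hence a maximal solution for all $\varepsilon \in (0, \varepsilon_0)$. Thus, $\lambda_1 \ge \lambda_\varepsilon$ for all $\varepsilon \in (0, \varepsilon_0)$ and hence $\lambda_1 \ge \bar\lambda$, a contradiction to $\lambda_1 < \bar\lambda$. Therefore $\bar\lambda = \lambda_*$.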



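5. Variational Method

In general, it is difficult to find a local minimizer for a system of equations. Fortunately, the variational method developed in [12] is helpful here. Since Eqs. (3.15) are symmetric in $u$ and $v$, without loss of generality, we may assume that $N_2 \ge N_1$. To write the variational functional corresponding to (3.15), we first add the two equations therein to get
\[
\Delta(u + v) = 2\lambda e^{u_0 + v_0 + u + v} - \lambda e^{u_0 + u} - \lambda e^{v_0 + v} + \frac{4\pi(N_1 + N_2)}{|\Omega|}, \tag{5.1}
\]
and subtracting the second equation in (3.15) from the first one we get
\[
\Delta(u - v) = \lambda e^{u_0 + u} - \lambda e^{v_0 + v} + \frac{4\pi(N_1 - N_2)}{|\Omega|}. \tag{5.2}
\]
Writing $F := u + v$ and $G := u - v$, $(F, G)$ satisfies
\[
\Delta F = 2\lambda e^{u_0 + v_0 + F} - \lambda e^{u_0 + \frac{F+G}{2}} - \lambda e^{v_0 + \frac{F-G}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}, \tag{5.3}
\]
\[
\Delta G = \lambda e^{u_0 + \frac{F+G}{2}} - \lambda e^{v_0 + \frac{F-G}{2}} + \frac{4\pi(N_1 - N_2)}{|\Omega|}. \tag{5.4}
\]
Thus $(F, G)$ is a solution of (5.3)–(5.4) if and only if $(u = \frac{F+G}{2}, v = \frac{F-G}{2})$ is a solution of (3.15). It can be verified that Eqs. (5.3)–(5.4) correspond to the Euler Lagrange equations for the functional
\[
\begin{aligned}
I_\lambda(F, G) :={}& \frac12 \|\nabla F\|^2 - \frac12 \|\nabla G\|^2 + 2\lambda \int_\Omega \Bigl(e^{u_0 + v_0 + F} - e^{u_0 + \frac{F+G}{2}} - e^{v_0 + \frac{F-G}{2}}\Bigr)\,dx\\
&+ \frac{4\pi(N_1 + N_2)}{|\Omega|} \int_\Omega F\,dx - \frac{4\pi(N_1 - N_2)}{|\Omega|} \int_\Omega G\,dx
\end{aligned} \tag{5.5}
\]
\[
= E(F) - J(F, G), \tag{5.6}
\]
where
\[
E(F) := \frac12 \|\nabla F\|^2 + 2\lambda \int_\Omega e^{u_0 + v_0 + F}\,dx + \frac{4\pi(N_1 + N_2)}{|\Omega|} \int_\Omega F\,dx \tag{5.7}
\]
and
\[
J(F, G) := \frac12 \|\nabla G\|^2 + 2\lambda \int_\Omega \Bigl(e^{u_0 + \frac{F+G}{2}} + e^{v_0 + \frac{F-G}{2}}\Bigr)\,dx + \frac{4\pi(N_1 - N_2)}{|\Omega|} \int_\Omega G\,dx. \tag{5.8}
\]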
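The claim that (5.3)–(5.4) are the Euler Lagrange equations of $I_\lambda$ can be checked by computing the first variations directly. The following is a sketch of that computation, assuming test functions $\varphi, \psi \in H^1(\Omega)$ (notation introduced here for illustration) and the functional as written in (5.5):

```latex
% First variations of I_lambda in the F and G directions.
\begin{align*}
\frac{d}{dt} I_\lambda(F + t\varphi, G)\Big|_{t=0}
  &= \int_\Omega \nabla F \cdot \nabla \varphi \, dx
   + \int_\Omega \Bigl( 2\lambda e^{u_0 + v_0 + F}
       - \lambda e^{u_0 + \frac{F+G}{2}}
       - \lambda e^{v_0 + \frac{F-G}{2}}
       + \frac{4\pi (N_1 + N_2)}{|\Omega|} \Bigr) \varphi \, dx, \\
\frac{d}{dt} I_\lambda(F, G + t\psi)\Big|_{t=0}
  &= - \int_\Omega \nabla G \cdot \nabla \psi \, dx
   - \int_\Omega \Bigl( \lambda e^{u_0 + \frac{F+G}{2}}
       - \lambda e^{v_0 + \frac{F-G}{2}}
       + \frac{4\pi (N_1 - N_2)}{|\Omega|} \Bigr) \psi \, dx.
\end{align*}
```

Setting both variations to zero for all $\varphi$, $\psi$ and integrating the gradient terms by parts on the torus (no boundary terms) gives (5.3) and (5.4) respectively. The opposite signs of the two gradient terms are what make the functional indefinite.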

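As in Sect. 4, we use the approximate problem (4.9) and define the functional $I^\varepsilon$ as
\[
\begin{aligned}
I_\lambda^\varepsilon(F, G) :={}& \frac12 \|\nabla F\|^2 - \frac12 \|\nabla G\|^2 + 2\lambda \int_\Omega \Bigl(e^{u_0^\varepsilon + v_0^\varepsilon + F} - e^{u_0^\varepsilon + \frac{F+G}{2}} - e^{v_0^\varepsilon + \frac{F-G}{2}}\Bigr)\,dx\\
&+ \frac{4\pi(N_1 + N_2)}{|\Omega|} \int_\Omega F\,dx - \frac{4\pi(N_1 - N_2)}{|\Omega|} \int_\Omega G\,dx
\end{aligned} \tag{5.9}
\]
\[
= E^\varepsilon(F) - J^\varepsilon(F, G), \tag{5.10}
\]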



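where
\[
E^\varepsilon(F) := \frac12 \|\nabla F\|^2 + 2\lambda \int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + F}\,dx + \frac{4\pi(N_1 + N_2)}{|\Omega|} \int_\Omega F\,dx \tag{5.11}
\]
and
\[
J^\varepsilon(F, G) := \frac12 \|\nabla G\|^2 + 2\lambda \int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F+G}{2}} + e^{v_0^\varepsilon + \frac{F-G}{2}}\Bigr)\,dx + \frac{4\pi(N_1 - N_2)}{|\Omega|} \int_\Omega G\,dx. \tag{5.12}
\]
The Euler Lagrange equations for $I^\varepsilon$ are
\[
\Delta F = 2\lambda e^{u_0^\varepsilon + v_0^\varepsilon + F} - \lambda e^{u_0^\varepsilon + \frac{F+G}{2}} - \lambda e^{v_0^\varepsilon + \frac{F-G}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}, \tag{5.13}
\]
\[
\Delta G = \lambda e^{u_0^\varepsilon + \frac{F+G}{2}} - \lambda e^{v_0^\varepsilon + \frac{F-G}{2}} + \frac{4\pi(N_1 - N_2)}{|\Omega|}, \tag{5.14}
\]
and $(F, G)$ are solutions of (5.13)–(5.14) if and only if $u = \frac{F+G}{2}$, $v = \frac{F-G}{2}$ are solutions of (4.9). The functional $I^\varepsilon$ is also indefinite and on differentiation we have
\[
D_1 I_\lambda^\varepsilon(F, G) = D_1 E^\varepsilon(F) - D_1 J^\varepsilon(F, G); \tag{5.15}
\]
\[
D_2 I_\lambda^\varepsilon(F, G) = -D_2 J^\varepsilon(F, G). \tag{5.16}
\]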

Thus, to find the critical points of $I^\varepsilon$, we first look for the critical points of $J^\varepsilon(F, G)$ as a function of $G$ for a fixed $F \in H^1(\Omega)$. Our first observation is that for a fixed $F \in H^1(\Omega)$, (5.4) and (5.14) can have at most one solution: for if $G_1$, $G_2$ are two solutions of Eq. (5.14) and $x_0 \in \Omega$ is such that $\max_\Omega (G_1 - G_2) = (G_1 - G_2)(x_0) > 0$, then
\[
0 \ge \Delta(G_1 - G_2)(x_0) = \lambda e^{u_0^\varepsilon + \frac{F}{2}}\bigl(e^{G_1/2} - e^{G_2/2}\bigr) - \lambda e^{v_0^\varepsilon + \frac{F}{2}}\bigl(e^{-G_1/2} - e^{-G_2/2}\bigr) > 0 \tag{5.17}
\]
gives a contradiction. Therefore $G_1(x) \le G_2(x)$ in $\Omega$. Interchanging $G_1$ and $G_2$, it follows that $G_1 \equiv G_2$. A similar argument works for (5.4). Hence (5.14) and (5.4) have a unique solution, if it exists. Eqs. (5.4) and (5.14) are Euler Lagrange equations for the functionals $J(F, G)$ and $J^\varepsilon(F, G)$. We define
\[
J_F(G) := J(F, G), \tag{5.18}
\]
\[
J_F^\varepsilon(G) := J^\varepsilon(F, G) \tag{5.19}
\]
to emphasize the fact that we are considering the functionals for a fixed $F \in H^1(\Omega)$. We prove

Lemma 5.1. For each $\varepsilon \in [0, \varepsilon_0)$ and fixed $F \in H^1(\Omega)$, the functional $J_F^\varepsilon$ has a unique minimizer in $H^1(\Omega)$. Here we use the convention that $J^0 := J$ defined in (5.8).



Proof. From Jensen’s inequality, we have F+G ε + F−G 2 u ε0 + F+G v 2 e ≥ e and e 0 2 ≥ e

1 4π(N1 − N2 ) = ||∇G||2 + 2 || 1 4π(N1 − N2 ) > ||∇G||2 + 2 ||

Since N2 ≥ N1 , we note that if and

J Fε (G)

.

(5.20)

Therefore, J Fε (G)

F−G 2

ε F+G ε F−G e u 0 + 2 + e v0 + 2 G + 2λ

G + 2λe

F−G 2

+ 2λe

dx

.

(5.21)

G → +∞ then e

→ +∞. It follows that

F+G 2

4π(N1 −N2 ) ||

F+G 2

dx

will be the dominating term

G + 2λe

F+G 2

+ 2λe

F−G 2

dx

≥ C and

J Fε is bounded from below. The existence of the global minimizer of J Fε in H 1 () then follows from the coerciveness of ||∇G||. In view of Lemma 5.1, for every F ∈ H 1 () there exists a unique G(F) ∈ H 1 () such that G(F) is a point of minimum of J Fε . Substituting G = G(F) in (5.9), we immediately see that the Iλε (F, G(F) satisfies the condition (5.16) for any F ∈ H 1 () and our problem is now reduced to finding F ∈ H 1 () such that (5.15) is satisfied. Thus, we define the functionals Iλ (F) := Iλ (F, G(F)) = E(F) − J F (G(F))

(5.22)

Iλε (F) := Iλε (F, G(F)) = E ε (F) − J Fε (G(F))

(5.23)

and

to emphasize the fact that they depend only on the function F and our next goal is to find a minimizer for the functional Iλ . Due to the presence of singularities of u 0 and v0 , it is difficult to directly show that Iλ is bounded from below. Thus, we first prove that Iλε is bounded below for any ε > 0, and then by passing to the limit ε → 0, the lower bound for Iλ can be obtained. The main result of this section can be summarized as Theorem 5.1. For ε ∈ (0, ε0 ], let F∗ε := u ε∗ + v∗ε where (u ε∗ , v∗ε ) are solutions of the approximate system (4.9) defined by (4.21). Then, for every ε ∈ (0, ε0 ] there exists a critical point F ε > F∗ε of the functional Iλε which is a local minimum in H 1 (). The limε→0 F ε = Fˆ exists and Fˆ is a local minimum for the functional Iλ in H 1 (). Moreover, Fˆ > F∗ = u ∗ + v∗ , where (u ∗ , v∗ ) are solutions of (3.15) defined by (1.5). The proof of Theorem 5.1 is completed in the following steps each of which will be proved in subsequent lemmas: (i) find a constrained minimizer for the functional Iλε ; (ii) prove that this constrained minimizer is a critical point for Iλε in H 1 ();



(iii) taking the limit as ε → 0, obtain a constrained minimizer which is a critical point for Iλ ; (iv) the critical point is a local minimum of Iλε in H 1 () for all ε ∈ [0, ε0 ], ε0 small. From (iii) of Proposition 4.1 and Theorem 1.1 in case ε = 0, the maximum principle implies that F∗ε is a strict sub solution for Eq. (5.13) with G = G ε∗ = u ε∗ − v∗ε for any λ > λε∗ i.e., ε

F∗ε > 2λeu 0 +v0 +F∗ − λeu 0 +

F∗ε +G ε∗ 2

− λev0 +

F∗ε −G ε∗ 2

4π(N1 + N2 ) . ||

+

(5.24)

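Thus, restricting the functional $I_\lambda$ to the closed set

\[ \mathcal{F}^\varepsilon := \{F \in H^1(\Omega) : F \ge F_*^\varepsilon = u_*^\varepsilon + v_*^\varepsilon\}, \tag{5.25} \]

we prove

Lemma 5.2. $I_\lambda^\varepsilon$ attains its minimum in $\mathcal{F}^\varepsilon$.

Proof. Since $G(F)$ minimizes $J_F^\varepsilon$ in $H^1(\Omega)$, we have $J_F^\varepsilon(G(F)) \le J_F^\varepsilon(0)$. Hence,

\[ I_\lambda^\varepsilon(F) \ge E^\varepsilon(F) - J^\varepsilon(F, 0) = \frac{1}{2}\|\nabla F\|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F + 2\lambda \int_\Omega \Bigl(e^{u_0^\varepsilon + v_0^\varepsilon + F} - e^{u_0^\varepsilon + \frac{F}{2}} - e^{v_0^\varepsilon + \frac{F}{2}}\Bigr). \tag{5.26} \]

By the Cauchy-Schwarz inequality,

\[ \int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F}{2}} + e^{v_0^\varepsilon + \frac{F}{2}}\Bigr) \le \bigl(\|e^{u_0^\varepsilon}\|_\infty + \|e^{v_0^\varepsilon}\|_\infty\bigr)\Bigl(\int_\Omega e^F\Bigr)^{1/2}. \]

It follows that

\[ \int_\Omega \Bigl(e^{u_0^\varepsilon + v_0^\varepsilon + F} - e^{u_0^\varepsilon + \frac{F}{2}} - e^{v_0^\varepsilon + \frac{F}{2}}\Bigr) \ge C_1(\varepsilon)\int_\Omega e^F - C_2(\varepsilon)\Bigl(\int_\Omega e^F\Bigr)^{1/2}, \tag{5.27} \]

which is bounded below as $F$ varies in $\mathcal{F}^\varepsilon$. Here $C_i(\varepsilon)$, $i = 1, 2$, are constants independent of $F$. Therefore, for $F \in \mathcal{F}^\varepsilon$,

\[ I_\lambda^\varepsilon(F) \ge \frac{1}{2}\|\nabla F\|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F_*\,dx + C_1(\varepsilon)\lambda\int_\Omega e^F - C_2(\varepsilon)\lambda\Bigl(\int_\Omega e^F\Bigr)^{1/2}; \tag{5.28} \]

in particular, $I_\lambda^\varepsilon$ is coercive in $\mathcal{F}^\varepsilon$. Moreover, $I_\lambda^\varepsilon$ is lower semicontinuous on $\mathcal{F}^\varepsilon$ since both $E^\varepsilon$ and $J^\varepsilon$ are lower semicontinuous. It follows that $I_\lambda^\varepsilon$ is bounded from below and attains its minimum at a point, say $F^\varepsilon$, in the set $\mathcal{F}^\varepsilon$.

Let $F^\varepsilon \in \mathcal{F}^\varepsilon$ be such that

\[ \min_{\mathcal{F}^\varepsilon} I_\lambda^\varepsilon = I_\lambda^\varepsilon(F^\varepsilon) \tag{5.29} \]

and let $G^\varepsilon$ denote the corresponding solution of (5.14). Then,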



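Lemma 5.3. $F^\varepsilon$ is a critical point of the functional $I_\lambda^\varepsilon$ in $H^1(\Omega)$.

Proof. Note that $\mathcal{F}^\varepsilon$ is a closed subset of $H^1(\Omega)$. To prove that $F^\varepsilon$ is a critical point of $I_\lambda^\varepsilon$ in $H^1(\Omega)$, it suffices to show that $F^\varepsilon$ lies in the interior of the set $\mathcal{F}^\varepsilon$, i.e., $F^\varepsilon > F_*^\varepsilon$. Let $\varphi \in H^1(\Omega)$ be such that $\varphi \ge 0$. Then $F^\varepsilon + t\varphi \in \mathcal{F}^\varepsilon$ for $t > 0$ and

\[ DI_\lambda^\varepsilon(F^\varepsilon)(\varphi) = \lim_{t \to 0^+} \frac{1}{t}\int_0^t \frac{\partial}{\partial s} I_\lambda^\varepsilon(F^\varepsilon + s\varphi)\,ds \ge 0, \]

i.e.,

\[ \int_\Omega \nabla F^\varepsilon \cdot \nabla\varphi + \int_\Omega \Bigl(2\lambda e^{u_0^\varepsilon + v_0^\varepsilon + F^\varepsilon} - \lambda e^{u_0^\varepsilon + \frac{F^\varepsilon + G^\varepsilon}{2}} - \lambda e^{v_0^\varepsilon + \frac{F^\varepsilon - G^\varepsilon}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}\Bigr)\varphi \ge 0, \tag{5.30} \]

where $G^\varepsilon$ satisfies (5.14). Thus, $F^\varepsilon$ is a super solution of (5.13),

\[ \Delta F^\varepsilon \le 2\lambda e^{u_0^\varepsilon + v_0^\varepsilon + F^\varepsilon} - \lambda e^{u_0^\varepsilon + \frac{F^\varepsilon + G^\varepsilon}{2}} - \lambda e^{v_0^\varepsilon + \frac{F^\varepsilon - G^\varepsilon}{2}} + \frac{4\pi(N_1 + N_2)}{|\Omega|}. \tag{5.31} \]

If we define $w := \frac{F^\varepsilon + G^\varepsilon}{2}$, $z := \frac{F^\varepsilon - G^\varepsilon}{2}$, then it can be easily verified from (5.31) and (5.14) that $(w, z)$ is a super solution of (4.9), i.e.,

\[ \Delta w \le \lambda e^{v_0^\varepsilon + z}\bigl(e^{u_0^\varepsilon + w} - 1\bigr) + \frac{4\pi N_1}{|\Omega|}, \tag{5.32} \]
\[ \Delta z \le \lambda e^{u_0^\varepsilon + w}\bigl(e^{v_0^\varepsilon + z} - 1\bigr) + \frac{4\pi N_2}{|\Omega|}. \tag{5.33} \]

We claim that

\[ u_*^\varepsilon \le w, \quad v_*^\varepsilon \le z \quad \text{a.e. } x \in \Omega. \tag{5.34} \]

Consider $\varphi = \min\{w - u_*^\varepsilon, 0\}$. For $x_0 \in \Omega$ such that $(w - u_*^\varepsilon)(x_0) < 0$, we must have $(z - v_*^\varepsilon)(x_0) > 0$ since $F^\varepsilon - F_*^\varepsilon \ge 0$. Therefore, $\lambda e^{v_0^\varepsilon + z}(e^{u_0^\varepsilon + w} - 1) < \lambda_* e^{v_0^\varepsilon + v_*^\varepsilon}(e^{u_0^\varepsilon + w} - 1) \le \lambda_* e^{v_0^\varepsilon + v_*^\varepsilon(x_0)}(e^{u_0^\varepsilon + u_*^\varepsilon(x_0)} - 1)$, and using (5.32) together with the equation satisfied by $u_*^\varepsilon$ we see that

\[ \Delta(w - u_*^\varepsilon)(x_0) \le 0. \tag{5.35} \]

It implies that $\varphi$ is a super solution, and $-\int_\Omega \varphi\,\Delta\varphi = \int_\Omega |\nabla\varphi|^2 \le 0$, and hence $\varphi \equiv 0$ a.e. in $\Omega$. Hence $u_*^\varepsilon \le w$ a.e. in $\Omega$. A similar argument shows that $v_*^\varepsilon \le z$ a.e. in $\Omega$.

We now apply the iteration scheme (3.2) beginning with the super solution $(w, z)$ to get a monotone sequence $\{(u_n, v_n)\}$. Due to the claim (5.34), we note that

\[ u_*^\varepsilon \le u_n, \quad v_*^\varepsilon \le v_n \quad \text{for all } n. \tag{5.36} \]

Hence, the sequence $(u_n, v_n)$ converges to a solution $(\hat u, \hat v)$ of the system (4.9). Clearly, $(\hat u, \hat v)$ are smooth functions and since $\lambda > \lambda_*$, there exists $\delta > 0$ such that

\[ u_*^\varepsilon + \delta < \hat u \le w, \quad v_*^\varepsilon + \delta < \hat v \le z. \tag{5.37} \]

Therefore,

\[ F^\varepsilon = w + z \ge \hat u + \hat v > u_*^\varepsilon + v_*^\varepsilon + 2\delta > F_*^\varepsilon, \tag{5.38} \]

and the lemma is proved.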
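As a check on the change of variables used in the proof of Lemma 5.3, the following sketch derives (5.32) from (5.31). It assumes that (5.14) is the equation for $G^\varepsilon$ in the form recalled in Section 6, namely $\Delta G^\varepsilon = \lambda(e^{u_0^\varepsilon + (F^\varepsilon+G^\varepsilon)/2} - e^{v_0^\varepsilon + (F^\varepsilon-G^\varepsilon)/2}) + 4\pi(N_1-N_2)/|\Omega|$; this reading of (5.14) is an assumption, since that equation lies outside the present section.

```latex
% With w = (F^\varepsilon + G^\varepsilon)/2 and z = (F^\varepsilon - G^\varepsilon)/2
% we have u_0^\varepsilon + v_0^\varepsilon + F^\varepsilon
%        = (u_0^\varepsilon + w) + (v_0^\varepsilon + z).
\begin{align*}
\Delta w = \tfrac12\bigl(\Delta F^\varepsilon + \Delta G^\varepsilon\bigr)
 &\le \tfrac12\Bigl(2\lambda e^{u_0^\varepsilon+w}e^{v_0^\varepsilon+z}
      - \lambda e^{u_0^\varepsilon+w} - \lambda e^{v_0^\varepsilon+z}
      + \tfrac{4\pi(N_1+N_2)}{|\Omega|}\Bigr) \\
 &\quad + \tfrac12\Bigl(\lambda e^{u_0^\varepsilon+w} - \lambda e^{v_0^\varepsilon+z}
      + \tfrac{4\pi(N_1-N_2)}{|\Omega|}\Bigr) \\
 &= \lambda e^{v_0^\varepsilon+z}\bigl(e^{u_0^\varepsilon+w}-1\bigr)
      + \frac{4\pi N_1}{|\Omega|}.
\end{align*}
% Taking the difference instead of the sum gives (5.33) in the same way.
```

Adding (5.32) and (5.33) returns the right-hand side of (5.31), so the two formulations carry the same information once the equation for $G^\varepsilon$ is imposed.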



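From Lemma 4.1, it follows that $F^\varepsilon \to \hat F$ in $C^{1,\alpha}$, $0 < \alpha < 1$, as $\varepsilon \to 0$, where $\hat F$ is a critical point of $I_\lambda$ in $H^1(\Omega)$. Clearly,

\[ \hat F = \lim_{\varepsilon \to 0} F^\varepsilon \ge \lim_{\varepsilon \to 0} F_*^\varepsilon = F_* = u_* + v_*, \tag{5.39} \]

and repeating the proof of Lemma 5.3, we can further conclude $\hat F > F_*$. Let $\mathcal{F} = \{F \in H^1(\Omega) : F \ge F_*\}$.

Lemma 5.4. $\hat F$ is a local minimizer of $I_\lambda$ in $H^1(\Omega)$.

Proof. Suppose that $\hat F$ is not a local minimizer for the functional $I_\lambda$ in $H^1(\Omega)$. Thus, for every $n \in \mathbb{N}$,

\[ \inf_{\|F - \hat F\|_{H^1} \le \frac{1}{n}} I_\lambda(F) < I_\lambda(\hat F). \]

Let $F_n \in H^1(\Omega)$ be such that $\|F_n - \hat F\|_{H^1} \le \frac{1}{n}$ and

\[ I_\lambda(F_n) = \min_{\|F - \hat F\|_{H^1} \le \frac{1}{n}} I_\lambda(F). \]

Using the principle of Lagrange multipliers, there exists a constant $\mu_n \le 0$ such that

\[ DI_\lambda(F_n)(\varphi) = \mu_n\Bigl(\int_\Omega \nabla(F_n - \hat F)\cdot\nabla\varphi + \int_\Omega (F_n - \hat F)\varphi\Bigr) \quad \text{for all } \varphi \in H^1(\Omega); \tag{5.40} \]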

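here $DI_\lambda$ denotes the differential of $I_\lambda$. Let $G_n$ denote the unique minimizer of the functional $J_{F_n}$. Then,

\[ DI_\lambda(F_n) = DE(F_n) - D_1 J(F_n, G_n) - D_2 J(F_n, G(F_n))\Bigl(\frac{\partial G}{\partial F}(F_n)\Bigr) = DE(F_n) - D_1 J(F_n, G_n), \tag{5.41} \]

since $D_2 J(F_n, G(F_n)) = 0$. Therefore,

\[ \int_\Omega \nabla F_n \cdot \nabla\varphi + \lambda\int_\Omega \Bigl(2e^{u_0 + v_0 + F_n} - e^{u_0 + \frac{F_n + G_n}{2}} - e^{v_0 + \frac{F_n - G_n}{2}}\Bigr)\varphi\,dx + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega \varphi\,dx = \mu_n\Bigl(\int_\Omega \nabla(F_n - \hat F)\cdot\nabla\varphi + \int_\Omega (F_n - \hat F)\varphi\Bigr) \tag{5.42} \]

for all $\varphi \in H^1(\Omega)$. Integrating by parts, we conclude from (5.42) that

\[ -\Delta F_n + \lambda\Bigl(2e^{u_0 + v_0 + F_n} - e^{u_0 + \frac{F_n + G_n}{2}} - e^{v_0 + \frac{F_n - G_n}{2}}\Bigr) + \frac{4\pi(N_1 + N_2)}{|\Omega|} = \mu_n\bigl(-\Delta(F_n - \hat F) + (F_n - \hat F)\bigr). \tag{5.43} \]

Since $G_n$ is a minimizer of $J_{F_n}$,

\[ \frac{1}{2}\|\nabla G_n\|_2^2 + \frac{4\pi(N_1 - N_2)}{|\Omega|}\int_\Omega G_n\,dx \le J(F_n, G_n) \le J(F_n, 0) = 2\lambda\int_\Omega \Bigl(e^{u_0 + \frac{F_n}{2}} + e^{v_0 + \frac{F_n}{2}}\Bigr). \tag{5.44} \]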



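Putting $\varphi = 1$ in (5.42), we have

\[ \lambda\int_\Omega e^{u_0 + \frac{F_n + G_n}{2}}\,dx + \lambda\int_\Omega e^{v_0 + \frac{F_n - G_n}{2}}\,dx = 2\lambda\int_\Omega e^{u_0 + v_0 + F_n} + O(1). \tag{5.45} \]

Since

\[ \|F_n\|_{H^1(\Omega)} \le C, \tag{5.46} \]

by the Moser-Trudinger inequality $e^{F_n} \in L^p(\Omega)$ for any $p > 1$. Thus by (5.45),

\[ \int_\Omega \Bigl(e^{u_0 + \frac{F_n + G_n}{2}} + e^{v_0 + \frac{F_n - G_n}{2}}\Bigr) \le C, \tag{5.47} \]

and Jensen's inequality implies that $\int_\Omega (F_n + G_n)$ and $\int_\Omega (F_n - G_n)$ are bounded from above. Thus,

\[ \Bigl|\int_\Omega G_n\Bigr| \le C \quad\text{and}\quad \|\nabla G_n\| \le C, \tag{5.48} \]

where the last inequality is due to (5.44). Again, using the Moser-Trudinger inequality together with (5.43), we get $\Delta F_n \in L^p$ for any $p > 1$ and $\|\Delta F_n\|_{L^p(\Omega)} \le C_p$ for all $p > 1$. By the Sobolev embedding theorem and the fact that $\|F_n - \hat F\|_{H^1(\Omega)} \to 0$, we obtain

\[ \|F_n - \hat F\|_{C^0(\Omega)} \to 0. \tag{5.49} \]

Since $\hat F > F_*$, we have

\[ F_n > F_* \tag{5.50} \]

for all large $n$ and hence $F_n \in \mathcal{F}$ for all large $n$. In particular, there exists $n_0 \gg 0$ such that

\[ F_{n_0} > F_* \quad \text{in } \Omega. \tag{5.51} \]

Since $\lim_{\varepsilon \to 0} F_*^\varepsilon = F_*$, there exists $\varepsilon_0$ such that

\[ F_{n_0} > F_*^\varepsilon \quad \text{for all } \varepsilon < \varepsilon_0, \tag{5.52} \]

and hence $F_{n_0} \in \mathcal{F}^\varepsilon$ for all $\varepsilon < \varepsilon_0$. Thus, $I_\lambda^\varepsilon(F_{n_0}) \ge I_\lambda^\varepsilon(F^\varepsilon)$ for all $\varepsilon < \varepsilon_0$, and taking the limit as $\varepsilon \to 0$ we conclude

\[ I_\lambda(F_{n_0}) = \lim_{\varepsilon \to 0} I_\lambda^\varepsilon(F_{n_0}) \ge \lim_{\varepsilon \to 0} I_\lambda^\varepsilon(F^\varepsilon) = I_\lambda(\hat F), \tag{5.53} \]

which contradicts $I_\lambda(F_{n_0}) < I_\lambda(\hat F)$. This completes the proof.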

Remark 5.1. By similar arguments, we can prove that $F^\varepsilon$ is a local minimizer of $I_\lambda^\varepsilon$ in $H^1(\Omega)$.



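6. Mountain Pass Solution

We begin this section by proving that $I_\lambda^\varepsilon$ satisfies the Palais-Smale condition, i.e.,

Lemma 6.1. Let $N_2 \ge N_1 > 0$ and let $\{F_n\}$ be a sequence in $H^1(\Omega)$ such that
(i) $I_\lambda^\varepsilon(F_n) \to \alpha$ as $n \to \infty$,
(ii) $\|DI_\lambda^\varepsilon(F_n)\| \to 0$ strongly, as $n \to \infty$.
Then $\{F_n\}$ has a convergent subsequence.

Proof. Condition (i) implies that there exists $M > 0$ such that

\[ |I_\lambda^\varepsilon(F_n)| < M. \tag{6.1} \]

Let $G_n := G(F_n)$ be the unique minimizer of $J_{F_n}^\varepsilon$ in $H^1(\Omega)$. Then $J_{F_n}^\varepsilon(G_n) \le J_{F_n}^\varepsilon(0)$ and from (6.1) we have $E^\varepsilon(F_n) - J_{F_n}^\varepsilon(0) < M$, i.e.,

\[ \frac{1}{2}\|\nabla F_n\|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F_n\,dx + 2\lambda\int_\Omega \Bigl(e^{u_0^\varepsilon + v_0^\varepsilon + F_n} - e^{u_0^\varepsilon + \frac{F_n}{2}} - e^{v_0^\varepsilon + \frac{F_n}{2}}\Bigr)dx < M. \tag{6.2} \]

Therefore,

\[ \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F_n\,dx + 2\lambda\int_\Omega \Bigl(e^{u_0^\varepsilon + v_0^\varepsilon + F_n} - e^{u_0^\varepsilon + \frac{F_n}{2}} - e^{v_0^\varepsilon + \frac{F_n}{2}}\Bigr)dx < M. \tag{6.3} \]

By the Cauchy-Schwarz inequality,

\[ \int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F_n}{2}} + e^{v_0^\varepsilon + \frac{F_n}{2}}\Bigr) \le C(\varepsilon)\Bigl(\int_\Omega e^{F_n}\Bigr)^{1/2}. \tag{6.4} \]

Therefore,

\[ \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F_n\,dx + 2\lambda\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + F_n} - C(\varepsilon)\Bigl(\int_\Omega e^{F_n}\Bigr)^{1/2} < M. \tag{6.5} \]

Note that together with Jensen's inequality (6.5) implies that

\[ e^{u_0^\varepsilon + v_0^\varepsilon} \ge c(\varepsilon) > 0, \tag{6.6} \]
\[ \int_\Omega F_n \le C(\varepsilon). \tag{6.7} \]

Recall that $G_n$ satisfies

\[ \Delta G_n = \lambda\Bigl(e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} - e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\Bigr) + \frac{4\pi(N_1 - N_2)}{|\Omega|}. \]

If $x_0 \in \Omega$ is such that $\min_\Omega G_n = G_n(x_0)$, then

\[ \lambda\Bigl(e^{u_0^\varepsilon(x_0) + \frac{F_n(x_0) + G_n(x_0)}{2}} - e^{v_0^\varepsilon(x_0) + \frac{F_n(x_0) - G_n(x_0)}{2}}\Bigr) + \frac{4\pi(N_1 - N_2)}{|\Omega|} \ge 0, \tag{6.8} \]

and since $N_2 \ge N_1$,

\[ e^{u_0^\varepsilon(x_0) + G_n(x_0)/2} - e^{v_0^\varepsilon(x_0) - G_n(x_0)/2} \ge 0. \tag{6.9} \]

Hence

\[ \inf_n \min_\Omega G_n \ge -C_\varepsilon \tag{6.10} \]

for some constant $C_\varepsilon > 0$ independent of $n$. Moreover,

\[ \lambda\int_\Omega e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} = \lambda\int_\Omega e^{v_0^\varepsilon + \frac{F_n - G_n}{2}} + 4\pi(N_2 - N_1). \tag{6.11} \]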
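The identity (6.11) can be traced back to the equation for $G_n$ recalled above. The sketch below assumes that $\int_\Omega \Delta G_n\,dx = 0$, i.e. that there is no boundary contribution (for instance a periodic cell, which the $|\Omega|$-normalized constants suggest); this assumption is not stated in the present section.

```latex
% Integrate the equation for G_n over \Omega, assuming \int_\Omega \Delta G_n\,dx = 0:
\begin{align*}
0 = \int_\Omega \Delta G_n\,dx
  &= \lambda\int_\Omega \Bigl(e^{u_0^\varepsilon+\frac{F_n+G_n}{2}}
       - e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}\Bigr)dx
     + \frac{4\pi(N_1-N_2)}{|\Omega|}\,|\Omega| ,\\
\text{hence}\quad
\lambda\int_\Omega e^{u_0^\varepsilon+\frac{F_n+G_n}{2}}dx
  &= \lambda\int_\Omega e^{v_0^\varepsilon+\frac{F_n-G_n}{2}}dx + 4\pi(N_2-N_1).
\end{align*}
```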

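Since $DJ_{F_n}(G_n) = 0$,

\[ DI_\lambda(F_n) = DE(F_n) - D_1 J(F_n, G_n) - D_2 J(F_n, G_n)\Bigl(\frac{\partial G_n}{\partial F}\Bigr) = DE(F_n) - D_1 J(F_n, G_n), \tag{6.12} \]

where $D_i$ denotes differentiation with respect to the $i$th variable, $i = 1, 2$. Therefore, (ii) implies that

\[ \|DE(F_n) - D_1 J(F_n, G_n)\| \to 0 \quad \text{as } n \to \infty, \tag{6.13} \]

strongly, i.e., for any $\varphi \in H^1(\Omega)$,

\[ \int_\Omega \nabla F_n \cdot \nabla\varphi + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega \varphi + 2\lambda\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + F_n}\varphi - \lambda\int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} + e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\Bigr)\varphi = o(1)\|\varphi\|_{H^1}, \tag{6.14} \]

where $o(1) \to 0$ as $n \to \infty$. Putting $\varphi = 1$, we get

\[ \Bigl|\lambda\int_\Omega \Bigl(2e^{u_0^\varepsilon + v_0^\varepsilon + F_n} - e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} - e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\Bigr)dx + 4\pi(N_1 + N_2)\Bigr| \to 0. \tag{6.15} \]

Using (6.11) and (6.15), we get

\begin{align*}
2\lambda\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + F_n}
 &\le \lambda\int_\Omega e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} + \lambda\int_\Omega e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\,dx - 4\pi(N_1 + N_2) + o(1) \\
 &\le C(\varepsilon, \lambda, N_1, N_2)\Bigl(\int_\Omega e^{F_n}\Bigr)^{1/2}\Bigl(\int_\Omega e^{-G_n}\Bigr)^{1/2} \\
 &\le C(\varepsilon, \lambda, N_1, N_2)\Bigl(\int_\Omega e^{F_n}\Bigr)^{1/2}. \tag{6.16}
\end{align*}

Since

\[ 2\lambda\bigl(\inf e^{u_0^\varepsilon + v_0^\varepsilon}\bigr)\int_\Omega e^{F_n} \le 2\lambda\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + F_n}, \]

(6.16) implies that

\[ \int_\Omega e^{F_n} \le C(\varepsilon, \lambda, N_1, N_2). \tag{6.17} \]

Therefore,

\[ \int_\Omega e^{v_0^\varepsilon + \frac{F_n - G_n}{2}} \le C(\varepsilon, \lambda, N_1, N_2), \tag{6.18} \]

and from (6.11),

\[ \int_\Omega e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} \le C(\varepsilon, \lambda, N_1, N_2). \tag{6.19} \]

Hence from Jensen's inequality,

\[ \int_\Omega (F_n + G_n) \le C. \tag{6.20} \]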
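The Jensen step leading to (6.20) can be spelled out as follows. This is a sketch under the assumption that $u_0^\varepsilon$ is integrable on $\Omega$ (natural for the regularized function, but not restated here); the constant $C$ is the one from (6.19).

```latex
% Jensen's inequality for the convex function exp and the probability
% measure |\Omega|^{-1} dx, applied to u_0^\varepsilon + (F_n + G_n)/2:
\begin{align*}
\exp\Bigl(\frac{1}{|\Omega|}\int_\Omega \Bigl(u_0^\varepsilon + \frac{F_n+G_n}{2}\Bigr)dx\Bigr)
  &\le \frac{1}{|\Omega|}\int_\Omega e^{u_0^\varepsilon+\frac{F_n+G_n}{2}}dx
   \le \frac{C}{|\Omega|},\\
\text{so that}\quad
\int_\Omega (F_n+G_n)\,dx
  &\le 2|\Omega|\ln\frac{C}{|\Omega|} - 2\int_\Omega u_0^\varepsilon\,dx .
\end{align*}
```

The right-hand side does not depend on $n$, which is all that (6.20) requires.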

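From (6.8),

\[ \int_\Omega |\nabla G_n|^2 + \lambda\int_\Omega e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} G_n = \lambda\int_\Omega e^{v_0^\varepsilon + \frac{F_n - G_n}{2}} G_n + \frac{4\pi(N_2 - N_1)}{|\Omega|}\int_\Omega G_n. \tag{6.21} \]

Thus

\begin{align*}
\int_\Omega |\nabla G_n|^2
 &\le \int_\Omega |\nabla G_n|^2 + \lambda\int_\Omega e^{u_0^\varepsilon + \frac{F_n}{2}}\bigl(e^{\frac{G_n}{2}} - 1\bigr)G_n \\
 &= -\lambda\int_\Omega e^{u_0^\varepsilon + \frac{F_n}{2}} G_n + \lambda\int_\Omega e^{v_0^\varepsilon + \frac{F_n - G_n}{2}} G_n + \frac{4\pi(N_2 - N_1)}{|\Omega|}\int_\Omega G_n \\
 &\le C(\varepsilon)\Bigl|\int_\Omega G_n\Bigr|, \tag{6.22}
\end{align*}

where (6.10), (6.17) and the Poincaré inequality are used.

Substituting $\varphi = \tilde F_n := F_n - \alpha_n$, where $\alpha_n := \frac{1}{|\Omega|}\int_\Omega F_n$, in (6.14), we have

\[ \int_\Omega |\nabla F_n|^2 + 2\lambda\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + F_n}\tilde F_n = \lambda\int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} + e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\Bigr)\tilde F_n + o(1)\|\tilde F_n\|_{H^1(\Omega)}, \]

and hence

\begin{align*}
\int_\Omega |\nabla F_n|^2
 &\le \int_\Omega |\nabla F_n|^2 + 2\lambda\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + \alpha_n}\bigl(e^{\tilde F_n} - 1\bigr)\tilde F_n \\
 &= -2\lambda\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + \alpha_n}\tilde F_n + \lambda\int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} + e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\Bigr)\tilde F_n + o(1)\|\nabla F_n\|. \tag{6.23}
\end{align*}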



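Multiplying (6.8) by $\tilde F_n$ and integrating by parts, we get

\[ \int_\Omega \nabla G_n \cdot \nabla F_n = \lambda\int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F_n + G_n}{2}} - e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\Bigr)\tilde F_n. \tag{6.24} \]

Thus,

\[ \lambda\int_\Omega e^{u_0^\varepsilon + \frac{F_n + G_n}{2}}\tilde F_n = \int_\Omega \nabla G_n \cdot \nabla F_n + \lambda\int_\Omega e^{v_0^\varepsilon + \frac{F_n - G_n}{2}}\tilde F_n. \tag{6.25} \]

From (6.10) and (6.17),

\[ \int_\Omega e^{2v_0^\varepsilon + F_n - G_n} \le C(\varepsilon). \tag{6.26} \]

Therefore, substituting (6.24) in (6.23) and using the Poincaré inequality we get

\[ \|\nabla F_n\|^2 = \int_\Omega |\nabla F_n|^2 \le C(\varepsilon)\|\nabla F_n\| + \int_\Omega \nabla G_n \cdot \nabla F_n \le \bigl(C(\varepsilon) + \|\nabla G_n\|\bigr)\|\nabla F_n\|, \tag{6.27} \]

and therefore

\[ \|\nabla F_n\| \le C(\varepsilon) + \|\nabla G_n\|. \tag{6.28} \]

Substituting (6.17), (6.18), (6.19) in (6.1) we get

\begin{align*}
C &\le \frac{1}{2}\|\nabla F_n\|^2 - \frac{1}{2}\|\nabla G_n\|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F_n + \frac{4\pi(N_2 - N_1)}{|\Omega|}\int_\Omega G_n \\
 &\le C_1(\varepsilon) + c(\varepsilon)\|\nabla G_n\|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega (F_n + G_n) - \frac{8\pi N_1}{|\Omega|}\int_\Omega G_n, \tag{6.29}
\end{align*}

where the last inequality is due to (6.28) and $c(\varepsilon)$ is small and will be chosen later. From (6.20) and (6.22), (6.29) yields

\[ \frac{8\pi N_1}{|\Omega|}\int_\Omega G_n \le c(\varepsilon)C(\varepsilon)\int_\Omega G_n + C_2(\varepsilon), \tag{6.30} \]

where $C_i(\varepsilon)$ are constants independent of $c(\varepsilon)$. Choosing $c(\varepsilon)C(\varepsilon) = \frac{4\pi N_1}{|\Omega|}$, from (6.10) we have

\[ \Bigl|\int_\Omega G_n\Bigr| \le C(\varepsilon). \tag{6.31} \]

From (6.22) and (6.28) we have

\[ \int_\Omega |\nabla F_n|^2 + \int_\Omega |\nabla G_n|^2 \le C(\varepsilon). \tag{6.32} \]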



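Again, from (6.1) it gives

\[ \Bigl|\int_\Omega F_n\Bigr| \le C(\varepsilon). \tag{6.33} \]

The Palais-Smale condition can now be proved from (6.31), (6.32) and (6.33). We omit the details since the remaining argument is standard (see [13]). This completes the proof of Lemma 6.1.

Now, we are in a position to prove Theorem 1.4.

Proof of Theorem 1.4. From Theorem 1.2, $I_\lambda$ has a local minimizer $F_0$ for $\lambda > \lambda_*$. We first consider the case when $\min\{N_1, N_2\} > 0$. If $F_0$ is not a strict local minimum, then a second solution of (3.15) obviously exists. Hence, we may assume that $F_0$ is a strict local minimum for $I_\lambda$, i.e., there exists $\delta_0 > 0$ such that

\[ \min_{\|F - F_0\|_{H^1(\Omega)} = \delta_0} I_\lambda(F) - I_\lambda(F_0) \ge C_1 > 0. \tag{6.34} \]

Thus there exists $\varepsilon_0 > 0$ such that

\[ \min_{\|F - F_\varepsilon\|_{H^1(\Omega)} = \delta_0} I_\lambda^\varepsilon(F) - I_\lambda^\varepsilon(F_\varepsilon) \ge C_1/2 > 0 \tag{6.35} \]

for all $0 < \varepsilon \le \varepsilon_0$. For a fixed $\varepsilon > 0$, there exists large $\alpha > 0$ such that

\begin{align*}
I_\lambda^\varepsilon(F_\varepsilon - \alpha)
 &= \frac{1}{2}\int_\Omega |\nabla F_\varepsilon|^2 + \frac{4\pi(N_1 + N_2)}{|\Omega|}\int_\Omega F_\varepsilon - 4\pi(N_1 + N_2)\alpha \\
 &\quad - \frac{1}{2}\|\nabla G_\varepsilon\|^2 + \frac{4\pi(N_2 - N_1)}{|\Omega|}\int_\Omega G_\varepsilon + \lambda e^{-\alpha}\int_\Omega e^{u_0^\varepsilon + v_0^\varepsilon + F_\varepsilon} \\
 &\quad - \lambda e^{-\alpha/2}\int_\Omega \Bigl(e^{u_0^\varepsilon + \frac{F_\varepsilon + G_\varepsilon}{2}} + e^{v_0^\varepsilon + \frac{F_\varepsilon - G_\varepsilon}{2}}\Bigr) \\
 &\le -4\pi(N_1 + N_2)\alpha + C < I_\lambda^\varepsilon(F_\varepsilon), \tag{6.36}
\end{align*}

where $C$ is a constant independent of $\varepsilon$ and $\alpha$, and $\alpha$ is large. Thus, by the mountain pass lemma, there exists a solution $\hat F_\varepsilon \in H^1(\Omega)$ such that

\[ I_\lambda^\varepsilon(\hat F_\varepsilon) - I_\lambda^\varepsilon(F_\varepsilon) \ge C_1/2. \tag{6.37} \]

Taking the limit as $\varepsilon \to 0$, $\hat F_\varepsilon \to \hat F$ and $F_\varepsilon \to F_0$ in $C^2(\Omega)$ and

\[ I_\lambda(\hat F) - I_\lambda(F_0) \ge C_1/2. \tag{6.38} \]

Hence the second solution of (3.15) has been obtained provided that $\min\{N_1, N_2\} > 0$. If $\min\{N_1, N_2\} = 0$, we may assume that $N_2 > N_1 = 0$. Without loss of generality, let the local minimum $F_0$ be a strict local minimizer in $H^1(\Omega)$, i.e., (6.34) holds. Let $k > 0$ and consider $I_\lambda^k$ corresponding to the equation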



\[ \begin{aligned} \Delta u + \lambda e^{v}(1 - e^{u}) &= 4\pi k\,\delta_{p_0}, \\ \Delta v + \lambda e^{u}(1 - e^{v}) &= 4\pi \sum_{j=1}^{k_2} n_j \delta_{q_j}. \end{aligned} \tag{6.39} \]

It is not difficult to see that $I_\lambda^k \to I_\lambda$ and $F_k \to F_0$ as $k \to 0$, where $F_k$ is a local minimizer of (6.39). By the above proof, there exists a second solution of (6.39) with

\[ I_\lambda^k(\hat F_k) - I_\lambda^k(F_k) \ge C_1/2 > 0. \tag{6.40} \]

Letting $k \to 0$, $\hat F_k$ will converge to $\hat F$ in $H^1(\Omega)$. Since the proof is similar to that of Lemma 4.1, it is omitted here. Obviously,

\[ I_\lambda(\hat F) - I_\lambda(F_0) \ge C_1/2 > 0. \tag{6.41} \]

Thus, a second solution is found in the case $N_1 = 0$ and the proof of Theorem 1.4 is complete.

Acknowledgement. This work was done while the second author was visiting Taida Institute for Mathematical Sciences, National Taiwan University with NSC grants. She thanks them for their support and warm hospitality.

References

1. Aubin, T.: Nonlinear analysis on Manifolds: Monge-Ampère equations. Grundlehren Math. Wiss., Vol. 252, NY: Springer, 1982
2. Caffarelli, L.A., Yang, Y.S.: Vortex condensation in the Chern-Simons Higgs model: an existence theorem. Comm. Math. Phys. 168(2), 321–336 (1995)
3. Chae, D., Imanuvilov, O.Yu.: Non-topological multivortex solutions to the self-dual Maxwell-Chern-Simons-Higgs systems. J. Funct. Anal. 196(1), 87–118 (2002)
4. Chae, D., Kim, N.: Topological multivortex solutions of the self-dual Maxwell-Chern-Simons-Higgs system. J. Differential Equations 134(1), 154–182 (1997)
5. Chan, H., Fu, C.-C., Lin, C.-S.: Non-topological multi-vortex solutions to the self-dual Chern-Simons-Higgs equation. Comm. Math. Phys. 231(2), 189–221 (2002)
6. Chern, J.-L., Chen, Z.-Y., Lin, C.-S.: Uniqueness of topological solutions and the structure of solutions for the Chern-Simons system with two Higgs particles. Preprint
7. Dunne, G.V.: Aspects of Chern-Simons theory. Aspects topologiques de la physique en basse dimension/Topological aspects of low dimensional systems (Les Houches, 1998), Les Ulis: EDP Sci., 1999, pp. 177–263
8. Dziarmaga, J.: Low energy dynamics of [U(1)]^N Chern-Simons solitons and two dimensional nonlinear equations. Phys. Rev. D 49, 5469–5479 (1994)
9. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Reprint of the 1998 edition. Classics in Mathematics. Berlin: Springer-Verlag, 2001
10. Jaffe, A., Taubes, C.: Vortices and Monopoles. Progr. Phys. Vol. 2, Boston, MA: Birkhäuser Boston, 1990
11. Kim, C., Lee, C., Ko, P., Lee, B.-H.: Schrödinger fields on the plane with [U(1)]^N Chern-Simons interactions and generalized self-dual solitons. Phys. Rev. D (3) 48, 1821–1840 (1993)
12. Lin, C.-S., Ponce, A.C., Yang, Y.: A system of elliptic equations arising in Chern-Simons field theory. J. Funct. Anal. 247(2), 289–350 (2007)
13. Struwe, M.: Variational methods. Applications to nonlinear partial differential equations and Hamiltonian systems. Fourth edition. Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics], 34, Berlin: Springer-Verlag, 2008
14. Spruck, J., Yang, Y.S.: The existence of nontopological solitons in the self-dual Chern-Simons theory. Comm. Math. Phys. 149(2), 361–376 (1992)
15. Spruck, J., Yang, Y.S.: Topological solutions in the self-dual Chern-Simons theory: existence and approximation. Ann. Inst. H. Poincaré Anal. Non Linéaire 12(1), 75–97 (1995)
16. Tarantello, G.: Multiple condensate solutions for the Chern-Simons-Higgs theory. J. Math. Phys. 37(8), 3769–3796 (1996)
17. Tarantello, G.: Self-dual gauge field vortices: an analytical approach. Berlin-Heidelberg, New York: Springer, 2007
18. Nolasco, M., Tarantello, G.: Vortex condensates for the SU(3) Chern-Simons theory. Comm. Math. Phys. 213(3), 599–639 (2000)
19. Yang, Y.: Solitons in field theory and nonlinear analysis. Springer Monographs in Mathematics. New York: Springer-Verlag, 2001

Communicated by I.M. Sigal

Commun. Math. Phys. 288, 349–377 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0772-x



On the Steady Compressible Navier–Stokes–Fourier System

Piotr B. Mucha1, Milan Pokorný2

1 Institute of Applied Mathematics and Mechanics, University of Warsaw, ul. Banacha 2, 02-097 Warszawa, Poland. E-mail: [email protected]

2 Mathematical Institute of Charles University, Sokolovská 83, 186 75 Praha 8, Czech Republic. E-mail: [email protected]

Received: 10 March 2008 / Accepted: 11 December 2008 Published online: 17 March 2009 – © Springer-Verlag 2009

Abstract: We study the motion of the steady compressible heat conducting viscous fluid in a bounded three dimensional domain governed by the compressible Navier–Stokes–Fourier system. Our main result is the existence of a weak solution to these equations for arbitrarily large data. A key element of the proof is a special approximation of the original system guaranteeing pointwise uniform boundedness of the density as well as the positiveness of the temperature. Therefore the passage to the limit omits tedious technical tricks required by the standard theory. Basic estimates on the solutions are possible to obtain by a suitable choice of physically reasonable boundary conditions.

1. Introduction

We consider the following system of partial differential equations describing the steady flow of a compressible heat conducting Newtonian fluid in a bounded three dimensional domain $\Omega$,

\[ \operatorname{div}(\varrho v) = 0, \tag{1.1} \]
\[ \operatorname{div}(\varrho v \otimes v) - \operatorname{div} S(v) + \nabla p(\varrho, \theta) = \varrho F, \tag{1.2} \]
\[ \operatorname{div}(\varrho e(\varrho, \theta) v) - \operatorname{div}(\kappa(\theta)\nabla\theta) = S(v) : \nabla v - p(\varrho, \theta)\operatorname{div} v, \tag{1.3} \]

where $\varrho : \Omega \to \mathbb{R}_0^+$ is the density of the fluid, $v : \Omega \to \mathbb{R}^3$ is the velocity field, $S(v) = 2\mu D(v) + \lambda(\operatorname{div} v) I$ is the viscous part of the stress tensor, $D(v) = \frac{1}{2}\bigl(\nabla v + (\nabla v)^T\bigr)$ is the symmetric part of the velocity gradient, $p(\cdot, \cdot) : \mathbb{R}_0^+ \times \mathbb{R}^+ \to \mathbb{R}_0^+$, a given function, is the pressure, $F : \Omega \to \mathbb{R}^3$ is the external force, and $e(\cdot, \cdot) : \mathbb{R}_0^+ \times \mathbb{R}^+ \to \mathbb{R}_0^+$, a given function, is the internal energy. System (1.1)–(1.3) is known as the compressible Navier–Stokes–Fourier equations or the full Navier–Stokes system [6].


We assume that the constitutive equation has the form

\[ p(\varrho, \theta) = a_1 \varrho^\gamma + a_2 \varrho\theta, \quad a_1, a_2 > 0, \tag{1.4} \]

i.e. the pressure has one part corresponding to the ideal fluid and a so-called elastic part; for more information see e.g. [6]. Even though we could consider more general pressure laws, we restrict ourselves to this simple model to avoid unnecessary technicalities in the proof. The corresponding internal energy takes the form

\[ e(\varrho, \theta) = a_1 \frac{\varrho^{\gamma - 1}}{\gamma - 1} + c_v \theta, \tag{1.5} \]

see e.g. [6] or [1]. Note that in full generality, Eq. (1.3) should be replaced by the conservation of the total energy, instead of conservation of the internal energy only. For a sufficiently regular class of solutions, including the one we are going to construct, the balance of the kinetic energy is just a consequence of the momentum equation.

We further simplify (1.3). Our solutions will be such that $\varrho \in L^\infty(\Omega)$ and $v \in W_p^1(\Omega)$ for all $p < \infty$. Due to the fact that $\operatorname{div}(\varrho v) = 0$ in the weak sense, we get (see [16])

\[ \frac{1}{\gamma - 1}\operatorname{div}(\varrho^\gamma v) = -\varrho^\gamma \operatorname{div} v, \]

again in the weak sense. Thus, putting $a_1 = a_2 = c_v = 1$, we rewrite the energy equation (1.3) in the form

\[ \operatorname{div}(\varrho\theta v) - \operatorname{div}(\kappa(\theta)\nabla\theta) = S(v) : \nabla v - \varrho\theta \operatorname{div} v. \tag{1.6} \]
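The passage from (1.3) to (1.6) can be checked with a short formal computation. The sketch below assumes smooth $\varrho > 0$ and $v$, so that the product rule applies pointwise (the text itself only claims the identity in the weak sense, with reference to [16]), and uses $a_1 = a_2 = c_v = 1$.

```latex
% Step 1: the renormalized identity, using div(\varrho v) = 0, i.e.
% \varrho\,\mathrm{div}\,v = -v\cdot\nabla\varrho.
\begin{align*}
\operatorname{div}(\varrho^\gamma v)
  &= \varrho^{\gamma-1}\operatorname{div}(\varrho v) + \varrho\, v\cdot\nabla\varrho^{\gamma-1}
   = (\gamma-1)\,\varrho^{\gamma-1}\, v\cdot\nabla\varrho
   = -(\gamma-1)\,\varrho^{\gamma}\operatorname{div} v .
\end{align*}
% Step 2: insert e = \varrho^{\gamma-1}/(\gamma-1) + \theta and p = \varrho^\gamma + \varrho\theta into (1.3).
\begin{align*}
\operatorname{div}(\varrho e v)
  &= \frac{1}{\gamma-1}\operatorname{div}(\varrho^\gamma v) + \operatorname{div}(\varrho\theta v)
   = -\varrho^\gamma\operatorname{div} v + \operatorname{div}(\varrho\theta v),\\
-p\operatorname{div} v
  &= -\varrho^\gamma\operatorname{div} v - \varrho\theta\operatorname{div} v .
\end{align*}
% The terms -\varrho^\gamma div v cancel on both sides of (1.3), which leaves (1.6).
```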

The viscosity coefficients are, for the sake of simplicity, considered to be constant such that the conditions of thermodynamical stability

\[ \mu > 0, \quad \lambda + \frac{2}{3}\mu > 0 \tag{1.7} \]

are satisfied. Finally, the heat conductivity is assumed to be temperature dependent, i.e.

\[ \kappa(\theta) = a_3(1 + \theta^m), \quad a_3, m > 0. \tag{1.8} \]

This fact is important for our study; we are not able to consider a constant heat conductivity. Our domain $\Omega$ is sufficiently smooth, at least a $C^2$ domain. We supplement the system (1.1), (1.2) and (1.6) with the following boundary conditions at $\partial\Omega$. For the velocity, we consider the slip boundary conditions

\[ v \cdot n = 0, \quad \tau_k \cdot (T(p, v)n) + f v \cdot \tau_k = 0 \quad \text{at } \partial\Omega, \tag{1.9} \]

where $\tau_k$, $k = 1, 2$, are two perpendicular tangent vectors to $\partial\Omega$, $n$ is the outer normal vector and $T(p, v) = -pI + S(v)$ is the stress tensor. The friction coefficient $f$ is non-negative (if $f = 0$ we assume additionally that $\Omega$ is not axially symmetric). Recall that $f = 0$ corresponds to perfect slip, while $f \to \infty$ leads to the homogeneous Dirichlet boundary conditions. However, we are not able to perform this limit passage. Concerning the temperature, we assume that

\[ \kappa(\theta)\frac{\partial\theta}{\partial n} + L(\theta)(\theta - \theta_0) = 0 \quad \text{at } \partial\Omega, \tag{1.10} \]


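where $\theta_0 : \partial\Omega \to \mathbb{R}^+$ is a strictly positive sufficiently smooth given function, say $\theta_0 \in C^2(\partial\Omega)$, $0 < \theta_* \le \theta_0 \le \theta^* < \infty$ with $\theta_*, \theta^* \in \mathbb{R}^+$, and

\[ L(\theta) = a_4(1 + \theta^l), \quad l \in \mathbb{R}_0^+. \tag{1.11} \]

We also add the prescribed mass of the gas

\[ \int_\Omega \varrho\,dx = M > 0. \tag{1.12} \]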

The objective of this paper is to prove the existence of weak solutions to problem (1.1)–(1.12) for arbitrarily large data. Until now only partial results have been proved (see e.g. [2,9,14,15]), and the only known general theorems concern weak solutions to the evolutionary version of the system [6]. One of the main obstacles was the construction of suitable a priori estimates. Due to the properties of the boundary condition (1.10) we are able to obtain a nontrivial energy bound for weak solutions, preserving the thermodynamical structure of the system. In the case of the barotropic gas such difficulties do not arise: the energy bound follows in an elementary way from the momentum equation. However, this is not the only difference. The standard methods introduced by P.L. Lions [9] do not work successfully for the heat conducting case, whereas a generalization of the technique introduced in [11,17] gives us sufficient tools to solve the stated problem. An approach to system (1.1)–(1.12) was considered in the book [9]. Unfortunately, this result can be viewed as conditional only, since instead of (1.12) the author assumed artificially that weak solutions satisfy $\int_\Omega \varrho^p\,dx = M^p$ for sufficiently large $p$. On the one hand, this condition is physically not acceptable; on the other hand, it simplifies the mathematical analysis considerably. Nevertheless, this result shows us how the techniques for the barotropic and heat conducting models differ. Looking at results concerning classical solutions for problems with small data, we realize that the heat conducting system has the same mathematical structure (and difficulties) as the barotropic version of the model. Thus results from [2,15] carry over almost immediately to the case of system (1.1)–(1.12). For large data solutions the energy equation starts to play an important role, essentially changing the properties of the whole system.
The evolutionary case of system (1.1)–(1.12), under general assumptions on the pressure law, was considered in [7,8]; the authors treated only the situation when the fluid is thermally isolated, i.e. $\frac{\partial\theta}{\partial n} = 0$ at the boundary. However, the same technique works also for our boundary conditions (1.10). The thermally isolated situation immediately guarantees the energy bound for weak solutions, but considering the limit $t \to \infty$, the only solution which can be obtained as the limit for large times (with time independent force) is a solution with constant temperature. This is connected to the fact that the model does not allow heat transfer through the boundary, and either the energy increases to infinity (non-potential force) or the temperature approaches a constant value (potential force). The boundary condition (1.10) allows heat transfer through the boundary, guaranteeing the balance of the total energy, and thus we are able to prove existence of solutions which are definitely nontrivial and physically acceptable. The main result of this paper is the following.

Theorem 1. Let $\Omega \in C^2$ be a bounded domain in $\mathbb{R}^3$ which is not axially symmetric if $f = 0$. Let $F \in L^\infty(\Omega)$ and

\[ \gamma > 3, \quad m = l + 1 > \frac{3\gamma - 1}{3\gamma - 7}. \]

Then there exists a weak solution to (1.1)–(1.12) such that

\[ \varrho \in L^\infty(\Omega), \quad v \in W_q^1(\Omega), \quad \theta \in W_q^1(\Omega) \quad \text{for all } 1 \le q < \infty \text{ and } \theta > 0 \text{ a.e.} \]
The solution constructed by Theorem 1 is meant in the following sense.

Definition 1. The triple $(\varrho, v, \theta)$ is a weak solution to (1.1)–(1.12) if $\varrho \in L^s(\Omega)$, $s \ge \gamma$, $v \in W_2^1(\Omega)$, $\theta \in W_2^1(\Omega)$, $\theta^m\nabla\theta \in L^1(\Omega)$ and $\theta > 0$ a.e.; $v \cdot n = 0$ at $\partial\Omega$ in the sense of traces and

\[ \int_\Omega \varrho v \cdot \nabla\eta\,dx = 0 \quad \forall \eta \in C^\infty(\overline{\Omega}), \tag{1.13} \]

\[ \int_\Omega \bigl(-\varrho v \otimes v : \nabla\varphi + 2\mu D(v) : D(\varphi) + \lambda \operatorname{div} v \operatorname{div}\varphi - p(\varrho, \theta)\operatorname{div}\varphi\bigr)dx + f\int_{\partial\Omega} (v_\tau) \cdot (\varphi_\tau)\,d\sigma = \int_\Omega \varrho F \cdot \varphi\,dx \quad \forall \varphi \in C^\infty(\overline{\Omega});\ \varphi \cdot n = 0 \text{ at } \partial\Omega \tag{1.14} \]

(we denoted by $v_\tau$ the vector $v - (v \cdot n)n$; see footnote 1) and finally

\[ \int_\Omega \bigl(\kappa(\theta)\nabla\theta \cdot \nabla\psi - \varrho\theta v \cdot \nabla\psi\bigr)dx + \int_{\partial\Omega} L(\theta)(\theta - \theta_0)\psi\,d\sigma = \int_\Omega \bigl(2\mu|D(v)|^2\psi + \lambda(\operatorname{div} v)^2\psi - \varrho\theta \operatorname{div} v\,\psi\bigr)dx \quad \forall \psi \in C^\infty(\overline{\Omega}). \tag{1.15} \]

1 Note that $v \cdot n = 0$ at $\partial\Omega$ and thus $v_\tau = v$.

The proof of Theorem 1 will be based on a special approximation procedure described in the next section, which is the kernel of our method. That section also includes a priori estimates for the approximation. The structure of the approximative system immediately gives us the approximative density bounded uniformly in $L^\infty$, but we must prove refined $L^\infty$ estimates to verify that the limit solves the original system (1.1)–(1.3). This idea has already been successfully applied in [11,17] in the case of barotropic flows. The third section contains a detailed proof of existence for the approximative system. Here the main difficulty comes from the energy equation, since the required positiveness of the temperature does not follow immediately. In the next section we introduce an important quantity, the effective viscous flux, and prove its main property, i.e. the compactness. This feature allows us to improve information about the convergence of the density, which is the fundamental fact in the theory of the compressible Navier–Stokes equations [6,9]. The last section describes the refined $L^\infty$ estimates for the approximative density and the passage to the limit. Then we prove that the limit is indeed our sought solution in the sense of Definition 1. As the reader may easily check, our method works for a slightly larger class of pressure laws. It allows us to consider e.g.


\[ p(\varrho, \theta) = p_b(\varrho) + \varrho\theta, \tag{1.16} \]

where $p_b(\varrho)$ is a strictly monotone function which behaves for large values as $\varrho^\gamma$. The main steps of this generalization are similar to the barotropic case and can be found in [17]; since our problem is technically complicated enough, we shall avoid such generalizations.

Our new result is closely related to the barotropic version of the system (1.1)–(1.12). Let us recall the state of the art in this theory. The steady compressible Navier–Stokes equations for arbitrarily large data were first successfully studied in the book [9], where, in the case of $p(\varrho) = \varrho^\gamma$, the existence of renormalized weak solutions was shown for $\gamma > 1$ ($N = 2$) and $\gamma \ge \frac{5}{3}$ ($N = 3$) for Dirichlet boundary conditions. For potential forces with a small non-potential perturbation the existence was improved in [13] to $\gamma > \frac{3}{2}$ ($N = 3$). In the recent paper [5] the authors proved the existence in two space dimensions also for $\gamma = 1$. See also [3], where the authors considered the three dimensional case and got existence for certain values of $\gamma$ less than $\frac{5}{3}$, however, for periodic boundary conditions. P.L. Lions also considered the existence of solutions with locally bounded density: for the case of Dirichlet boundary conditions he was able to show their existence for $\gamma > 1$ ($N = 2$) and $\gamma \ge 3$ ($N = 3$). Nevertheless, to prove Theorem 1 the above methods are not sufficient, thus we present our new approach for the heat conducting model.

Throughout the paper we use the standard notations for the Lebesgue, Sobolev, etc. spaces; generic constants are denoted by $C$, and sequences $\varepsilon \to 0$ always mean suitably chosen subsequences $\varepsilon_k \to 0^+$. For the sake of simplicity we put $a_1 = a_2 = a_3 = a_4 = c_v = 1$.

2. Approximation

This section contains one of the main difficulties in the proof of Theorem 1: to find a good approximation of problem (1.1)–(1.12). Then we shall be able to show existence and prove the corresponding a priori estimates. Here we present the approximative system as well as the proof of the fundamental a priori estimates, provided the temperature is positive and all quantities are sufficiently smooth. The next section then deals with the solvability of this system and with further a priori bounds. In particular, in Sect. 3 the positiveness of the approximative temperature and smoothness of all quantities is proved.

Our approximative system will contain two parameters: a number $\varepsilon > 0$ and an auxiliary function $K(\cdot)$ defined by a number $k > 0$ as follows:

\[ K(t) = \begin{cases} 1 & \text{for } t < k - 1, \\ \in [0, 1] & \text{for } k - 1 \le t \le k, \\ 0 & \text{for } t > k; \end{cases} \tag{2.1} \]

moreover, we assume that $K'(t) < 0$ for $t \in (k - 1, k)$, where $k \in \mathbb{R}^+$. In the last section we pass with $\varepsilon \to 0^+$ and we shall show that we may take $k$ sufficiently large such that $K(\varrho) \equiv 1$ for our solution. The approximation of our problem (1.1)–(1.12) reads as follows:



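\[ \left. \begin{aligned} &\varepsilon\varrho + \operatorname{div}(K(\varrho)\varrho v) - \varepsilon\Delta\varrho = \varepsilon h K(\varrho), \\ &\tfrac{1}{2}\operatorname{div}(K(\varrho)\varrho v \otimes v) + \tfrac{1}{2}K(\varrho)\varrho v \cdot \nabla v - \operatorname{div} S(v) + \nabla P(\varrho, \theta) = \varrho K(\varrho) F, \\ &-\operatorname{div}\Bigl((1 + \theta^m)\frac{\varepsilon + \theta}{\theta}\nabla\theta\Bigr) + \operatorname{div}\Bigl(v\int_0^\varrho K(t)\,dt\Bigr)\theta + \operatorname{div}(K(\varrho)\varrho v)\,\theta \\ &\qquad + K(\varrho)\varrho v \cdot \nabla\theta - \theta K(\varrho) v \cdot \nabla\varrho = S(v) : \nabla v \end{aligned} \right\} \quad \text{in } \Omega, \tag{2.2} \]

where

\[ P(\varrho, \theta) = \int_0^\varrho \gamma t^{\gamma - 1} K(t)\,dt + \theta\int_0^\varrho K(t)\,dt = P_b(\varrho) + \theta\int_0^\varrho K(t)\,dt \tag{2.3} \]

and $h = \frac{M}{|\Omega|}$. Equation (2.2)$_3$ can be reformulated in the following way, being a modification of the entropy equation:

\[ -\operatorname{div}\Bigl((1 + e^{sm})\frac{\varepsilon + e^s}{e^s}\nabla s\Bigr) + K(\varrho)\varrho v \cdot \nabla s - K(\varrho) v \cdot \nabla\varrho + \operatorname{div}\Bigl(v\int_0^\varrho K(t)\,dt\Bigr) + \operatorname{div}(K(\varrho)\varrho v) = \frac{S(v) : \nabla v}{e^s} + \frac{(1 + e^{sm})(\varepsilon + e^s)}{e^s}|\nabla s|^2 \quad \text{in } \Omega, \tag{2.4} \]

with the "entropy" $s$ defined as follows:

\[ s = \ln\theta. \tag{2.5} \]

The solvability of (2.2)–(2.4), guaranteed by Theorem 2, gives us $s$ integrable, even continuous, for a fixed $\varepsilon > 0$. Hence here the temperature $\theta := e^s$ is positive. This construction is performed in Sect. 3. Additionally, if $s \in W_q^2(\Omega)$ or $\theta \in W_q^2(\Omega)$ with $q > \frac{3}{2}$, then $\theta \ge c_0 > 0$ in $\Omega$, and (2.2)$_3$ and (2.4) are equivalent. The distinguished entropy will allow us to control the positiveness of the temperature, which does not seem to be elementary when working directly with an equation of type (2.2)$_3$. This system is completed by the boundary conditions at $\partial\Omega$,

\[ \begin{aligned} &(1 + \theta^m)(\varepsilon + \theta)\frac{\partial s}{\partial n} + L(\theta)(\theta - \theta_0) + \varepsilon s = 0, \\ &v \cdot n = 0, \quad \tau_k \cdot (T(p, v)n) + f v \cdot \tau_k = 0, \quad k = 1, 2, \\ &\frac{\partial\varrho}{\partial n} = 0. \end{aligned} \tag{2.6} \]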
The key element in the limit passage from the approximative problem to the original one is the energy estimate giving information independent of the choice of function K , i.e. of the choice of the positive constant k — see (2.1):



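Lemma 1. Suppose solutions to (2.1)–(2.6) to be sufficiently smooth, i.e. $\varrho$, $v$ and $\theta \in W_q^2(\Omega)$ for any $q < \infty$, $\theta > 0$ in $\Omega$. Let the assumptions of Theorem 1 be satisfied. Then

\[ \int_\Omega \varrho\,dx \le M \quad \text{and} \quad 0 \le \varrho \le k, \]

\[ \|v\|_{H^1(\Omega)} + \|K(\varrho)\varrho\|_{L^{2\gamma}(\Omega)} + \|P(\varrho, \theta)\|_{L^2(\Omega)} + \|\theta\|_{L^{3m}(\Omega)} + \|\nabla\theta\|_{L^r(\Omega)} + \int_{\partial\Omega} (e^s + e^{-s})\,d\sigma + \|\nabla s\|_{L^2(\Omega)} \le C(\|F\|_{L^\infty(\Omega)}, M), \tag{2.7} \]

where the r.h.s. of (2.7) is independent of $\varepsilon$ and $k$, $s = \ln\theta$ and $r = \min\{2, \frac{3m}{m+1}\}$.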

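Proof. The nonnegativeness of the density and boundedness by $k$ follow directly from features of function $K$ and the form of (2.2)$_1$; it suffices to integrate the equation over the sets $\{x \in \Omega; \varrho(x) < 0\}$ and $\{x \in \Omega; \varrho(x) > k\}$, respectively. The integration of this equation over $\Omega$ leads to the bound on the total mass. For details we refer to [11].

Let us prove the second part of (2.7), which is definitely more complicated. We divide our chain of estimates into eight steps to underline the main parts of our method.

Step I. Multiply the approximative momentum equation (2.2)$_2$ by $v$ and integrate it over $\Omega$:

\[ \int_\Omega \bigl(2\mu|D(v)|^2 + \lambda \operatorname{div}^2 v\bigr)dx + f\int_{\partial\Omega} |v_\tau|^2\,d\sigma + \int_\Omega v \cdot \nabla P_b(\varrho)\,dx = \int_\Omega \varrho K(\varrho) v \cdot F\,dx + \int_\Omega \Bigl(\int_0^\varrho K(t)\,dt\Bigr)\theta \operatorname{div} v\,dx. \tag{2.8} \]

To find a good form of the last term of the l.h.s. of (2.8) we use the approximative continuity equation (2.2)$_1$,

\begin{align*}
\int_\Omega v \cdot \nabla P_b(\varrho)\,dx
 &= \frac{\gamma}{\gamma - 1}\int_\Omega K(\varrho)\varrho v \cdot \nabla\varrho^{\gamma - 1}\,dx \\
 &= -\frac{\gamma}{\gamma - 1}\int_\Omega \bigl(\varepsilon\Delta\varrho + \varepsilon h K(\varrho) - \varepsilon\varrho\bigr)\varrho^{\gamma - 1}\,dx \\
 &= \frac{\varepsilon\gamma}{\gamma - 1}\int_\Omega [\varrho - h K(\varrho)]\varrho^{\gamma - 1}\,dx + \varepsilon\gamma\int_\Omega \varrho^{\gamma - 2}|\nabla\varrho|^2\,dx.
\end{align*}

Thus the momentum equation gives the following inequality:

\[ \int_\Omega S(v) : \nabla v\,dx + f\int_{\partial\Omega} |v_\tau|^2\,d\sigma + \varepsilon\gamma\int_\Omega \varrho^{\gamma - 2}|\nabla\varrho|^2\,dx + \frac{\varepsilon\gamma}{\gamma - 1}\int_\Omega \varrho^\gamma\,dx - \int_\Omega \Bigl(\int_0^\varrho K(t)\,dt\Bigr)\theta \operatorname{div} v\,dx \le C\Bigl(1 + \int_\Omega |\varrho K(\varrho) v \cdot F|\,dx\Bigr). \tag{2.9} \]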



Step II. Integrating the energy equation (2.2)3 and employing the boundary condition (2.6)1 we get ⎞ ⎞ ⎛ ⎛ (L(θ )(θ − θ0 ) + s) dσ = ⎝ S(v) : ∇v − ⎝ K (t)dt ⎠ θ div v ⎠ d x, (2.10)

∂

0

since the integration by parts gives the following identity: ⎞ ⎡ ⎛ ⎣ K ()v · ∇θ − θ K ()v · ∇ + div ⎝v K (t)dt ⎠ θ

⎤ + div (K ()v) θ ⎦ d x =

⎛ ⎝

0

⎞

K (t)dt ⎠ θ div vd x.

0

Summing up (2.9) and (2.10) we get γ + γ −2 2 L(θ )θ + s dσ + γ |∇| d x + γ d x γ −1 ∂ ⎛ ⎞ ≤ s − dσ + C ⎝1 + |K ()v · F|d x ⎠ , ∂

(2.11)

where s + and s − are the positive and negative parts of the entropy, respectively (s = s + − s − ). Note that the form of L(·) implies that the first term of (2.11) gives an estimate on ∂ es dσ independently of . We shall concentrate the attention on this term, since it controls the positive part of entropy s at the boundary. Step III. We integrate the entropy equation (2.4) over getting v · ∇θ L(θ )(θ − θ0 ) −s dσ + K () + se − K ()v · ∇ d x θ θ ∂ S(v) : ∇v (1 + θ m )( + θ ) 2 = (2.12) + |∇s| d x. θ θ

So

S(v) : ∇v (1 + θ m )( + θ ) L(θ )θ0 2 − |s − | dσ + |∇s| d x + + |s |e θ θ θ ∂ + − K ()v · ∇(s − ln )d x ≤ L(θ )dσ + s + e−s dσ. (2.13)

∂

∂

Here we emphasize that the first term in the second integral in the l.h.s. of (2.13) gives us a bound on ∂ e−s dσ , because of the properties of L(·) and θ0 ≥ θ∗ > 0. Hence we control the negative part of entropy s.



Let us look closer at the last term in the l.h.s. of (2.13). We have − K ()v · ∇(s −ln )d x = K ()v · ∇ ln d x − K ()v · ∇sd x = I1 + I2

(2.14) and employing (2.2)1 we get I1 =

K ()v · ∇ ln d x = −

div(K ()v) ln d x

(− + − h K ()) ln d x

=

|∇|2 − h K () ln + ln d x. =

(2.15)

The first term has a good sign (we shall keep in mind this term), the second term has a good sign for ≤ 1, too, and for ≥ 1is easily bounded by h. Similarly, the last term can be controlled by the term (1 + γ d x). The proof was rather formal, as we do not know whether > 0 in . However, we may write K ()v · ∇( + δ) in (2.12) with δ > 0 and find an analogue of (2.15) with ln( + δ). Finally we pass with δ → 0+ and get precisely the same information as above. Next I2 = − K ()v · ∇sd x = ( − + h K ()) sd x =

(−∇∇s − ln θ + h K () ln θ ) d x.

(2.16)

Considering the r.h.s. of (2.16), we have ∇∇sd x ≤ ∇ L () ∇s L () 2 2 ⎛ ⎞ 1 ⎝ |∇|2 1 d x + |∇|2 γ −2 d x ⎠ + ∇s 2L 2 () . ≤ 4 4

Moreover,

− ln θ d x

has a good sign for θ ≤ 1 and for θ > 1,

−(ln θ )+ d x ≤ L 2 () s + L 2 () ≤

(2.17)

+ ∇s L 2 () + γ L 1 () + C. 4

+

s L 1 (∂) 4 (2.18)



The last term of (2.16) can be treated as follows (one part has again a good sign, so we consider only θ ∈ (0, 1], i.e. s ≤ 0): 1 1 h K ()|(ln θ )− |d x ≤ C |s − |d x ≤ C + |s − |dσ + ||∇s|| L 2 () . (2.19) 2 4

∂

Here we applied a Poincaré type inequality yielding

$$
\|u\|_{L_1(\Omega)} \le c(\Omega)\bigl(\|u\|_{L_1(\partial\Omega)} + \|\nabla u\|_{L_2(\Omega)}\bigr). \tag{2.20}
$$

Then combining (2.13) with inequality (2.11) and with (2.15)–(2.19) we obtain S(v) : ∇v 1 + θ m L(θ )θ0 2 L(θ )θ + |∇θ | d x + + + |s| dσ ≤ H, (2.21) θ θ2 θ

∂

where

⎛ H = C ⎝1 +

⎞ |K ()v · F|d x ⎠ .

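The form of the l.h.s. of (2.21) implies that we control also $\|s\|_{L_1(\partial\Omega)}$ and $\|\nabla s\|_{L_2(\Omega)}$; evidently $\|s\|_{L_1(\partial\Omega)}$ is controlled by $\int_{\partial\Omega}(e^s + e^{-s})\,d\sigma$, and $\int_\Omega \frac{|\nabla\theta|^2}{\theta^2}\,dx = \int_\Omega |\nabla s|^2\,dx$, which are estimated by the l.h.s. of (2.21).

Step IV. From the growth conditions and (2.21) we deduce the following "homogeneous" estimates:

$$
C\Bigl(\int_{\partial\Omega} \theta^{l+1}\,d\sigma\Bigr)^{1/(l+1)} \le \Bigl(\int_{\partial\Omega} L(\theta)\theta\,d\sigma\Bigr)^{1/(l+1)} \le H^{1/(l+1)},
$$

$$
C\Bigl(\int_\Omega |\nabla\theta^{m/2}|^2\,dx\Bigr)^{1/m} \le \Bigl(\int_\Omega \frac{1 + \theta^m}{\theta^2}|\nabla\theta|^2\,dx\Bigr)^{1/m} \le H^{1/m}.
$$

We use the following Poincaré type inequality (analogous to (2.20)):

$$
\Bigl(\int_\Omega |\theta^{m/2}|^2\,dx\Bigr)^{1/m} \le C(\Omega)\Bigl(\Bigl(\int_\Omega |\nabla\theta^{m/2}|^2\,dx\Bigr)^{1/m} + \Bigl(\int_{\partial\Omega} \theta^{l+1}\,d\sigma\Bigr)^{1/(l+1)}\Bigr).
$$

Then the imbedding theorem $W^1_2(\Omega) \to L_6(\Omega)$ (for $N = 3$) applied to the function $\theta^{m/2}$ leads to the bound

$$
\Bigl(\int_\Omega \theta^{3m}\,dx\Bigr)^{1/3m} \le H^{1/m} + H^{1/(l+1)}. \tag{2.22}
$$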
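The exponent bookkeeping behind (2.22) can be spelled out. The following is only a sketch of the standard computation, assuming $N = 3$ and writing $u := \theta^{m/2}$ (the name $u$ is introduced here purely for illustration and does not appear in the original):

```latex
% Assumption: u := \theta^{m/2}, N = 3, and the W^1_2-norm is (\|u\|_{L_2}^2 + \|\nabla u\|_{L_2}^2)^{1/2}.
\[
\|u\|_{L_6(\Omega)}^{6} = \int_\Omega \theta^{3m}\,dx
\quad\Longrightarrow\quad
\|\theta\|_{L_{3m}(\Omega)} = \|u\|_{L_6(\Omega)}^{2/m}
\le C\,\|u\|_{W^1_2(\Omega)}^{2/m}
= C\Bigl(\int_\Omega |\theta^{m/2}|^2\,dx + \int_\Omega |\nabla\theta^{m/2}|^2\,dx\Bigr)^{1/m}.
\]
```

Since $(a + b)^{1/m} \le a^{1/m} + b^{1/m}$ for $a, b \ge 0$ and $m \ge 1$, each of the two integrals can then be bounded separately by the Poincaré type inequality and the "homogeneous" estimates of Step IV.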

To simplify further calculations, we set $l + 1 = m$. Other values of $l$ could also be allowed, but at the price of more technical calculations, which we try to avoid.



Step V. We return to (2.9). Hölder’s inequality yields2 γ ||v||2H 1 () + γ γ −2 |∇|2 d x + γ d x γ −1 ⎛ ⎞ ≤ C ⎝1 + |K ()v · F|d x + |θ K (t)dt|2 d x ⎠ .

(2.23)

0

The next step of our estimation is the bound on Pb () which is necessary to estimate the r.h.s. of (2.23). We just repeat the method for the barotropic case, but here we shall obtain an extra term related to the temperature. Introduce : → R3 defined as a solution to the following problem: 1 div = Pb () − {Pb ()} in , with {Pb ()} = Pb ()d x. (2.24) =0 at ∂, ||

The basic theory to the stationary Stokes system gives the existence of a vector field satisfying (2.24) with the following estimate for a solution to (2.24) (for another possible proof, using directly estimates of special solutions to system (2.24), see [16]) |||| H 1 () ≤ C||Pb || L 2 () . (2.25) 0 From the structure of Pb () and information that d x ≤ M we easily get, applying the interpolation inequality, {Pb ()} ≤ δ||Pb ()|| L 2 () + C(δ, M)

for any δ > 0.

Multiplying the momentum equation (2.2)2 by , employing (2.23) and (2.25), we conclude after standard estimates of the r.h.s to (2.2)2 , ⎛ ⎞ (2.26) ||Pb ()||2L 2 () ≤ C ⎝1 + |K ()v ⊗ v|2 d x + |θ K (t)dt|2 d x ⎠ .

As ||Pb ()||2L 2 ()

0

⎛ ⎛ ⎞2γ ⎞ ⎜ ⎟ ≥ C ⎝ (K ())2γ d x + ⎝ K (t)dt ⎠ d x ⎠ ,

(2.27)

0

recalling that 2γ > 6, we get a bound for the first integral in the r.h.s. of (2.26), |K ()v ⊗ v|2 d x ≤ c||v||4H 1 () ||K ()||2L 6 () 2(γ −3)

10γ

−1) −1) ≤ c||v||4H 1 () ||K ()|| L3(2γ ||K ()|| L3(2γ 1 () 2γ ()

(2.28)

6(2γ −1)

−4 ≤ δ||Pb ()||2L 2 () + C(δ, M)||v|| H3γ1 () .

² Note that we used Korn's inequality; for $f = 0$ we therefore require that $\Omega$ is not axially symmetric. For more details see [16].



Hence a suitable choice of δ in (2.28) simplifies (2.26) to ⎛ ⎞ 6(2γ −1) −4 + |θ K (t)dt|2 d x ⎠ . ||Pb ()||2L 2 () ≤ C ⎝1 + ||v|| H3γ1 ()

(2.29)

0

The last estimate can be viewed by (2.27) in the form ||

K (t)dt|| L 2γ () + ||K ()|| L 2γ () 0

⎛

⎜ ≤ C ⎝1 + ||v||

3 2γ −1 γ 3γ −4 H 1 ()

⎛ +⎝

|θ

⎞ 2γ1 ⎞ ⎟ K (t)dt|2 d x ⎠ ⎠ .

(2.30)

0

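Within our estimation we concentrate on a precise specification of the powers of the norms. Then, due to our growth conditions, we shall be able to construct the desired bound (2.7).

Step VI. The last integral in (2.30) can be treated as follows (we need $m > \frac23$ and $m > \frac{2\gamma}{3(\gamma-1)}$):

$$
\Bigl\|\theta\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_2(\Omega)}^{1/\gamma}
\le \|\theta\|_{L_{3m}(\Omega)}^{1/\gamma}\,\Bigl\|\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_{\frac{6m}{3m-2}}(\Omega)}^{1/\gamma}
\le \|\theta\|_{L_{3m}(\Omega)}^{1/\gamma}\,\Bigl\|\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_1(\Omega)}^{\frac{(3m-2)\gamma-3m}{3m\gamma(2\gamma-1)}}\,\Bigl\|\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_{2\gamma}(\Omega)}^{\frac{3m+2}{3m(2\gamma-1)}}, \tag{2.31}
$$

so (2.30) and (2.31) with the Young inequality imply

$$
\Bigl\|\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_{2\gamma}(\Omega)} + \|K(\varrho)\varrho\|_{L_{2\gamma}(\Omega)}
\le C\Bigl(1 + \|v\|_{H^1(\Omega)}^{\frac{3}{\gamma}\frac{2\gamma-1}{3\gamma-4}} + \|\theta\|_{L_{3m}(\Omega)}^{\frac{2\gamma-1}{\gamma}\frac{3m}{6m(\gamma-1)-2}}\Bigr).
$$

Applying the inequality for the temperature, (2.22), we obtain (recall that we put $l + 1 = m$)

$$
\Bigl\|\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_{2\gamma}(\Omega)} + \|K(\varrho)\varrho\|_{L_{2\gamma}(\Omega)}
\le C\Bigl(1 + \|v\|_{H^1(\Omega)}^{\frac{3}{\gamma}\frac{2\gamma-1}{3\gamma-4}} + H^{\frac{2\gamma-1}{\gamma}\frac{3}{6m(\gamma-1)-2}}\Bigr). \tag{2.32}
$$

We have to estimate $H$; it holds

$$
\int_\Omega |K(\varrho)\varrho\, v \cdot F|\,dx \le \|v\|_{L_6(\Omega)}\,\|K(\varrho)\varrho\|_{L_{6/5}(\Omega)}\,\|F\|_{L_\infty(\Omega)}.
$$

Using the interpolation between $1$ and $2\gamma$ as above leads to the following bound:

$$
\int_\Omega |K(\varrho)\varrho\, v \cdot F|\,dx \le C(M)\,\|v\|_{H^1(\Omega)}\,\|K(\varrho)\varrho\|_{L_{2\gamma}(\Omega)}^{\frac{\gamma}{3(2\gamma-1)}}. \tag{2.33}
$$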
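The interpolation powers used in Steps V and VI can be recomputed mechanically. The sketch below is not part of the original argument; it assumes the standard Lebesgue interpolation rule $1/p = (1-\alpha)/p_0 + \alpha/p_1$, and the function names are hypothetical. It recovers the power $\gamma/(3(2\gamma-1))$ in (2.33) and the analogous power for $L_6$ between $L_1$ and $L_{2\gamma}$.

```python
from fractions import Fraction


def interpolation_weight(p0, p1, p):
    """Return alpha in [0, 1] with 1/p = (1 - alpha)/p0 + alpha/p1.

    Then ||f||_{L_p} <= ||f||_{L_p0}^(1 - alpha) * ||f||_{L_p1}^alpha.
    """
    p0, p1, p = Fraction(p0), Fraction(p1), Fraction(p)
    alpha = (1 / p0 - 1 / p) / (1 / p0 - 1 / p1)
    if not 0 <= alpha <= 1:
        raise ValueError("p must lie between p0 and p1")
    return alpha


def power_in_2_33(gamma):
    """Power of the L_{2*gamma} norm when L_{6/5} is placed between L_1 and L_{2*gamma}."""
    return interpolation_weight(1, 2 * Fraction(gamma), Fraction(6, 5))


def power_for_l6(gamma):
    """Power of the L_{2*gamma} norm when L_6 is placed between L_1 and L_{2*gamma}."""
    return interpolation_weight(1, 2 * Fraction(gamma), 6)


if __name__ == "__main__":
    for gamma in (Fraction(7, 2), Fraction(4), Fraction(5)):
        print(gamma, power_in_2_33(gamma), power_for_l6(gamma))
```

Exact rational arithmetic is used so that the powers can be compared with the closed forms without rounding; the $L_1$ factor is harmless because the total mass is bounded by $M$.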



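Inserting this inequality to the r.h.s. of (2.32), recalling that $m \ge \frac14$ and applying the standard Hölder inequality, we obtain from (2.32) the estimate on the density

$$
\Bigl\|\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_{2\gamma}(\Omega)} + \|K(\varrho)\varrho\|_{L_{2\gamma}(\Omega)}
\le C\Bigl(1 + \|v\|_{H^1(\Omega)}^{\frac{3}{\gamma}\frac{2\gamma-1}{3\gamma-4}} + \|v\|_{H^1(\Omega)}^{\frac{2\gamma-1}{\gamma}\frac{1}{2m(\gamma-1)-1}}\Bigr). \tag{2.34}
$$

Step VII. As we can see later, the first term is the most restrictive. So by (2.33) and (2.34) we conclude (for $m > \frac{3\gamma-1}{6\gamma-6}$)

$$
\int_\Omega |K(\varrho)\varrho\, v \cdot F|\,dx \le C\Bigl(1 + \|v\|_{H^1(\Omega)}^{\frac{3\gamma-3}{3\gamma-4}}\Bigr). \tag{2.35}
$$

Hence we obtain from (2.22),

$$
\|\theta\|_{L_{3m}(\Omega)} \le C\Bigl(1 + \|v\|_{H^1(\Omega)}^{\frac{1}{m}\frac{3\gamma-3}{3\gamma-4}}\Bigr). \tag{2.36}
$$

From (2.31) we easily see that

$$
\Bigl\|\theta\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_2(\Omega)} \le C\,\|\theta\|_{L_{3m}(\Omega)}\,\Bigl\|\int_0^{\varrho} K(t)\,dt\Bigr\|_{L_{2\gamma}(\Omega)}^{\frac{3m+2}{3m}\frac{\gamma}{2\gamma-1}}. \tag{2.37}
$$

Step VIII. Summing up inequalities (2.23), (2.35) and (2.37) we obtain the main bound on the norm of the velocity

$$
\|v\|^2_{H^1(\Omega)} \le C\Bigl(1 + \|v\|_{H^1(\Omega)}^{\frac{3\gamma-3}{3\gamma-4}} + \|v\|_{H^1(\Omega)}^{\frac{2}{m}\frac{3\gamma-3}{3\gamma-4} + \frac{2}{m}\frac{3m+2}{3\gamma-4}}\Bigr). \tag{2.38}
$$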

The above bound implies the a priori bound

$$
\|v\|_{H^1(\Omega)} \le C(\|F\|_{L_\infty}, M), \tag{2.39}
$$

provided a suitable dependence between $\gamma$ and $m$ holds. The estimate (2.39) holds as the powers in the r.h.s. of (2.38) are less than 2. It can be described by the sufficient condition ($\gamma > 3$)

$$
m > \frac{3\gamma - 1}{3\gamma - 7}. \tag{2.40}
$$

Note that as we take $\gamma$ near 3 then $m > 4$, and for $\gamma = 4$ we have $m > \frac{11}{5}$. Moreover, the conditions needed above, $m > \frac{3\gamma-1}{6\gamma-6}$, $m > \frac23$ and $m > \frac{2\gamma}{3(\gamma-1)}$, are clearly less restrictive than (2.40). Bound (2.39) implies immediately the a priori estimate (2.7), since it follows from (2.21) with (1.11), (2.29), (2.34)–(2.37), together with (3.7) necessary in the next section.
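The relation between (2.38) and (2.40) can be checked with exact arithmetic. The sketch below is an illustration only: it assumes that the two powers of $\|v\|_{H^1(\Omega)}$ on the r.h.s. of (2.38) are $\frac{3\gamma-3}{3\gamma-4}$ and $\frac{2}{m}\frac{3\gamma-3}{3\gamma-4} + \frac{2}{m}\frac{3m+2}{3\gamma-4} = \frac{2(3\gamma+3m-1)}{m(3\gamma-4)}$, as read off from Steps VI–VIII; the function names are hypothetical.

```python
from fractions import Fraction


def m_threshold(gamma):
    """Right-hand side of (2.40): (3*gamma - 1) / (3*gamma - 7), for gamma > 7/3."""
    gamma = Fraction(gamma)
    if 3 * gamma - 7 <= 0:
        raise ValueError("need gamma > 7/3")
    return (3 * gamma - 1) / (3 * gamma - 7)


def velocity_powers(gamma, m):
    """Assumed powers of ||v||_{H^1} on the right-hand side of (2.38)."""
    gamma, m = Fraction(gamma), Fraction(m)
    first = (3 * gamma - 3) / (3 * gamma - 4)
    second = 2 * (3 * gamma + 3 * m - 1) / (m * (3 * gamma - 4))
    return first, second


def auxiliary_thresholds(gamma):
    """The three weaker lower bounds on m collected during the proof."""
    gamma = Fraction(gamma)
    return (
        Fraction(2, 3),
        2 * gamma / (3 * (gamma - 1)),
        (3 * gamma - 1) / (6 * gamma - 6),
    )


if __name__ == "__main__":
    for gamma in (Fraction(7, 2), Fraction(4), Fraction(5)):
        print(gamma, m_threshold(gamma), max(auxiliary_thresholds(gamma)))
```

With this reading, the second power equals 2 exactly at $m = \frac{3\gamma-1}{3\gamma-7}$ and drops below 2 for larger $m$, which is the content of (2.40).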



3. Existence for the Approximative System

The aim of this section is to show that for any $\varepsilon > 0$ and $k > 0$ there is a solution to the approximative system (2.2)–(2.6). In particular we ensure the positiveness of the temperature. We prove

Theorem 2. Let the assumptions of Theorem 1 be satisfied. Moreover, let $\varepsilon > 0$ and $k > 0$. Then there exists a strong solution $(\varrho, v, s)$ to (2.2)–(2.6) such that $\varrho \in W^2_p(\Omega)$, $v \in W^2_p(\Omega)$ and $s \in W^2_p(\Omega)$ for all $1 \le p < \infty$. Moreover $0 \le \varrho \le k$ in $\Omega$, $\int_\Omega \varrho\,dx \le M$ and

$$
\|v\|_{W^1_{3m}(\Omega)} + \sqrt{\varepsilon}\,\|\nabla\varrho\|_{L_2(\Omega)} + \|\nabla\theta\|_{L_r(\Omega)} + \|\theta\|_{L_{3m}(\Omega)} \le C(k), \tag{3.1}
$$

where $\theta = e^s > 0$, $r = \min\{2, \frac{3m}{m+1}\}$ and the r.h.s. of (3.1) is independent of the parameter $\varepsilon$.

The proof of the existence to the approximative system (2.2) will follow from the standard application of the Leray-Schauder fixed point theorem. It will be split into several lemmas. First we consider the continuity equation. We denote for p ∈ [1, ∞], M p = {w ∈ W p2 (); w · n = 0 at ∂}. We have Lemma 2. Let q > 3. Then the operator S : Mq → W p2 ()

for 1 < p < ∞

such that S(v) = , where is the solution to the following problem: − = h K () − div(K ()v) in , ∂ = 0 at ∂ ∂n

(3.2)

is a well defined continuous compact operator from Mq to W p2 (), 1 < p < ∞. In particular, the solution to (3.2) is unique. Moreover,

W p1 () ≤ C(k, )( v L p () + 1) 1 < p < ∞, ⎧ ⎨ C(k, ) 1 + v 1 (1 + v L () ) 1 < p < 3, W p () 3

W p2 () ≤ 2 ⎩ C(k, )(1 + v W 1 () ) 3 ≤ p < ∞.

(3.3)

p

Proof. The well posedness of the operator S was proved in [16] for K ≡ 1, see also [11], Prop. 3.1 (there the two dimensional case with our function K was considered). The estimates are the direct consequence of the standard elliptic theory, together with the fact that L ∞ () ≤ k.



Next, we define the operator T : M p × W p2 () → M p × W p2 () such that T (v, s) = (w, z), where (w, z) is the solution to the following system: ⎫ 1 1 −div S(w) = − div(K ()v ⊗ v )− K ()v · ∇ v −∇ P(, es )+ K () F ⎪ ⎪ ⎪ 2 2 ⎛ ⎞ ⎪ ⎪ ⎬ ms s s −div (1+e )( + e )∇z = S(v ) : ∇ v −div ⎝v K (t)dt⎠ e ⎪ in, ⎪ (3.4) ⎪ ⎪ ⎪ 0 ⎭ s s s −div (K ()v) e − e K ()v · ∇s + e K()v · ∇ w · n = 0, n · S(w) · τ l + f w · τ l = 0 for l = 1, 2 at ∂, (1 + ems )( + es )∇z + z = −L(es )(es − θ0 )

where = S(w) is given by Lemma 2. The above procedure guarantees us that the temperature obtained in this way, θ = e z , will be strictly positive for fixed > 0. Our aim is to apply the Leray–Schauder fixed point theorem. Thus we need to verify that T is a continuous and compact mapping from M p × W p2 () to M p × W p2 () and that all solutions satisfying tT (w, z) = (w, z),

t ∈ [0, 1]

are bounded in M p × W p2 ().

(3.5)

First we easily have Lemma 3. Let p > 3 and all assumptions of Theorem 2 be satisfied. Then T is a continuous and compact operator from M p × W p2 () to M p × W p2 (). Proof. Note that for > 0 the system (3.4) is strictly elliptic. Since p > 3, the W p1 ()–space is algebra, thus the r.h.s. of (3.4) belongs to the L p –space (the boundary term 1−1/ p belongs to W p (∂)). The coefficients in the operator in the l.h.s. of (3.4)2 are of the C 1+α ()–class. Hence the standard theory for elliptic systems gives us the existence of the solution to (3.4) in M p × W p2 () with the following bound: ||w||W p2 () + ||z||W p2 () ≤ C(||es ||C 1+α () ) ||the r.h.s. of (3.4)1 || L p () + ||the r.h.s. of (3.4)2 || L p () + ||the r.h.s. of (3.4)4 ||W 1−1/ p (∂) p

which guarantees us the uniqueness and the continuous dependence on the data. Moreover, the r.h.s. of (3.4) is at most of the first order derivative of sought functions. Thus this structure implies the compactness for the map T . Next we consider a priori bounds for solutions to (3.5). Lemma 4. All solutions to problem (3.5) in the class M p × W p2 () satisfy the following bounds: √ 0 ≤ ≤ k, ||w|| H 1 () + ||θ || L 3m () + ||∇θ || L r () + ε ∇ L 2 () ≤ C(k), (3.6) 3m where r = min{ m+1 , 2}, θ = e z and the constant C(k) is independent of and t ∈ [0, 1].



Proof. We may basically repeat estimates of Lemma 1 from the previous section. Here we may use that ≤ k in , on the other hand we must control the behaviour of all norms with respect to t. Thus, repeating steps (2.8)–(2.13) for the case t = 1 (the corresponding terms are only multiplied by t) we finally get (1 + θ m )( + θ ) 2 (1 − t) S(w) : ∇wd x + f (w τ ) dσ + |∇z|2 d x θ ∂ S(w) : ∇w γ γ +t + γ γ −2 |∇|2 + dx θ γ −1 ! L(θ )θ0 L(θ )θ − L(θ )θ0 + + z + (1 − e−z + ) + |z − |(e|z − | −1) dσ +t − L(θ ) dσ θ ∂ ∂ ⎛ ⎞ ≤ t (K ()w · ∇z − K ()w · ∇) d x + tC ⎝1 + |K ()w · F|d x ⎠ ,

where = S(w). We may now repeat the arguments between (2.14)–(2.21) (all the corresponding terms are only multiplied by t) and we finally get L(θ )θ0 1 + θm S(w) : ∇w 2 t L(θ )θ + t d x + + |z| dσ |∇θ | d x + t θ2 θ θ ∂ ⎛ ⎞ ≤ tC ⎝1 + |K ()w · F|d x ⎠ .

As 0 ≤ ≤ k, we easily get (the Poincaré inequality is just the same as in the previous section), after dividing by t (the case t = 0 is clear; recall also m = l + 1) ||θ || L 3m () ≤ C(1 + ||w|| L 2 () )1/m , and from an analogue to (2.23) also ||w||2H 1 () ≤ C(1 + ||θ ||2L 2 () ). As m > 1, it implies ||w|| H 1 () + ||θ || L 3m () ≤ C(k). Further, if m ≥ 2 then due to the control of |∇θ| θ and |∇θ |θ ∇θ bounded in the same space. For 1 < m < 2, we only get

m−2 2

∇θ L

3m () m+1

≤ |∇θ |θ

m−2 2

in L 2 () we have also

2−m

2

L 2 () θ L 3m () ≤ C(k).

(3.7)
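The integrability exponent $r = \min\{2, \frac{3m}{m+1}\}$ appearing with (3.7) can be verified by a Hölder count. The sketch below is an illustration under the assumption that, for $1 < m < 2$, one splits $\nabla\theta = (|\nabla\theta|\,\theta^{\frac{m-2}{2}})\cdot\theta^{\frac{2-m}{2}}$ with the first factor in $L_2(\Omega)$ and $\theta \in L_{3m}(\Omega)$; the function name is hypothetical.

```python
from fractions import Fraction


def gradient_exponent(m):
    """Exponent r with grad(theta) in L_r, obtained from Hoelder's inequality.

    For m >= 2 the factor theta^((m-2)/2) is not needed and r = 2.
    For 1 < m < 2: theta^((2-m)/2) lies in L_q with q = 6m/(2-m),
    and 1/r = 1/2 + 1/q.
    """
    m = Fraction(m)
    if m <= 1:
        raise ValueError("the argument assumes m > 1")
    if m >= 2:
        return Fraction(2)
    inverse_q = (2 - m) / (6 * m)
    return 1 / (Fraction(1, 2) + inverse_q)


if __name__ == "__main__":
    for m in (Fraction(5, 4), Fraction(3, 2), Fraction(2), Fraction(3)):
        print(m, gradient_exponent(m))
```

The two branches agree at $m = 2$, where $\frac{3m}{m+1} = 2$.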

Finally, multiplying the approximative continuity equation by and integrating by parts we get ⎛ ⎞ (|∇|2 + 2 )d x ≤ h K ()d x + ⎝ K (t)tdt ⎠ | div w|d x,

from where we deduce the bound for

√

∇ L 2 () .

0



We continue the proof of Theorem 2. To conclude, we verify the bound on (w, z) in W p2 () × W p2 (), p < ∞, independently of t. We apply the bootstrap method to system ⎫ 1 1 ⎪ ⎪ − div S(w) = t − div(K ()w ⊗ w) − K ()w · ∇w ⎪ ⎪ 2 ⎪ 2 ⎪ ⎪ ⎪ z ⎪ ⎪ −∇ P(, e ) + K () F ⎪ ⎪ ⎡ ⎛ ⎞ ⎪ ⎪ ⎬ in , mz z z − div (1 + e )( + e )∇z = t ⎣S(w) : ∇w − div ⎝w K (t)dt ⎠ e ⎪ ⎪ ⎪ (3.8) ⎪ ⎪ 0 ⎪ ⎤ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ − div (K ()w) ez − ez K ()w · ∇z + ez K ()w · ∇⎦ ⎪ ⎪ ⎭ w · n = 0, n · S(w) · τ l + f w · τ l = 0 for l = 1, 2 (1 + emz )( + ez )∇z + z = −t L(ez )(ez − θ0 )

at ∂,

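where $\varrho = S(w)$ is given by Lemma 2. Note first that due to the bounds from Lemma 4 we have

$$
\|w\|_{W^1_3(\Omega)} \le C,
$$

as $K(\varrho)\varrho\, w \otimes w$ is bounded in $L_3(\Omega)$. Thus $w$ is bounded in any $L_q(\Omega)$, $q < \infty$, and the most restrictive term is $\nabla P(\varrho, e^z)$. As $e^z = \theta$ is bounded in $L_{3m}(\Omega)$ and $\varrho$ in $L_\infty(\Omega)$, we deduce the bound

$$
\|w\|_{W^1_{3m}(\Omega)} \le C
$$

and consequently also

$$
\|\varrho\|_{W^2_{3m}(\Omega)} \le C.
$$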

Note that the constant in the estimate for w is independent of . Next, we rewrite Eq. (3.8)2 as follows: ⎡ − (z) = t ⎣S(w) : ∇w − ez K ()w · ∇z + ez K ()w · ∇ ⎛ − div ⎝w

⎞

⎤

K (t)dt ⎠ ez − div (K ()w) ez ⎦ in ,

(3.9)

0

∂(z) = −z − t L(ez )(ez − θ0 ) at ∂ ∂n with z (z) =

(1 + emτ )( + eτ )dτ.

(3.10)

0

We multiply (3.9)1 by and integrate over . It leads to ||∇||2L 2 () + t L(ez )(ez −θ0 )+z dσ ≤ C||the r.h.s. of (3.9)1 || L 6/5 () |||| L 6 () . ∂



It is not difficult to realize that the most restrictive term on the r.h.s is ez K ()w · 3m ∇z ∈ L 3m (), where m+1 > 65 for m > 1. m+1 Let us look at the boundary terms. Note that (s) ∼ s for s → −∞ and (s) ∼ e(m+1)s for s → +∞. Thus t L(es )(es − θ0 ) + s I{≤0} dσ ≥ C1 2 ||I{≤0} ||2L 2 (∂) − C2 ∂

and

t L(es )(es − θ0 ) + s I{≥0} dσ ≥ C1 ||I{≥0} || L 1 (∂) − C2 .

∂

Thus, the estimates above yield W 1 () ≤ C with C independent of t which implies 2

θ

m+1

L 6 () = e

(m+1)z

L 6 () ≤ C and also ∇θ L 2 () = ez ∇z L 2 () ≤ C.

Now, it is not difficult to verify that from (3.9) we get W 2∗ () ≤ C with p

z p ∗ = min{ 3m 2 , 2} (as e ∇z ∈ L 2 () and ∇w ∈ L 3m ()). In particular,

z L ∞ () + θ L ∞ () ≤ C,

∇z L q () + ∇θ L q () ≤ C

3 p∗ 3− p ∗

> 3. Thus from the approximative momentum equation we for 1 ≤ q ≤ q ∗ = get (recall ∇(θ ) ∈ L q ∗ ()) the bound w W 2∗ () ≤ C and from the energy/entropy q

equation also

z W 2∗ () + θ W 2∗ () ≤ C. q

q

The imbedding theorem yields ∇z L ∞ () + ∇θ L ∞ () ≤ C which finally gives as above

Wr2 () + w Wr2 () + z Wr2 () + θ Wr2 () ≤ C, 1 ≤ r < ∞ with C independent of t. This finishes the proof of Theorem 2. 4. Effective Viscous Flux In this part we investigate the properties of the effective viscous flux. Estimates (3.1) from Theorem 2 guarantee us existence of a subsequence → 0+ such that 1 (), v v in W3m v → v in L ∞ (), ∗ Pb ( ) ∗ Pb () in L ∞ (), in L ∞ (), ∗ K ( ) ∗ K () in L ∞ (), K ( ) K () in L ∞ (),

K (t)dt 0

∗

K (t)dt

in L ∞ (),

(4.1)

0

3m }, θ θ in Wr1 () with r = min{2, m+1 θ → θ in L q () for q < 3m.

Here we follow the notation that a weak limit of a sequence {A(a )} is denoted by A(a) (for a fixed subsequence → 0+ ).



Passing to the limit in the weak formulation of our problem (2.2) we get div(K ()v) = 0, ⎛

(4.2) ⎛

⎜ ⎜ K ()v · ∇ v − div ⎝2µ D(v ) + ν(div v ) I − Pb () I − θ ⎝

⎞⎞ ⎟⎟ K (t)dt ⎠ I⎠ = K () F ,

0

(4.3) ⎛ ⎜ − div((1 + θ m )∇θ ) + θ ⎝ div v

⎞ ⎟ K (t)dt ⎠ + div(K ()θ v ) = 2µ|D(v )|2 + ν(div v )2 ,

0

(4.4)

together with the boundary conditions (1.9)–(1.10). Recall that (4.2)–(4.4) is satisfied in the weak sense, similarly to Definition 1. In what follows we must carefully study the dependence of the a priori bounds on $k$. We have

Lemma 5. Under the assumptions of Theorems 1 and 2, we have

$$
\|\varrho_\varepsilon\|_{L_\infty(\Omega)} \le k \quad\text{and}\quad \|v_\varepsilon\|_{W^1_{3m}(\Omega)} \le C\bigl(1 + k^{\gamma\frac{3m-2}{3m}}\bigr). \tag{4.5}
$$

Proof. The bound on the density follows directly from Theorem 2. We therefore estimate the velocity. If we write (2.2)2 in the form ⎛ ⎞⎞ ⎛ − div S(v) = −∇ ⎝ Pb ( ) + θ ⎝ K (t)dt ⎠⎠ + K ( ) F 0

1 1 − div[K ( ) v ⊗ v ] − K ( ) v · ∇v , 2 2 we immediately see that

v W 1 () ≤ C K ( ) v ⊗ v L 3m () + K ( ) v · ∇v L 3m () 3m m+1 ⎞ ⎛ ⎞ + Pb ( ) L 3m () + θ ⎝ K (t)dt ⎠ L 3m () + K ( ) F L 3m () ⎠ . m+1

0

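Note that due to the bound of the temperature we cannot expect an $\varepsilon$-independent estimate for $q > 3m$. The bounds on the density and temperature yield

$$
\|P_b(\varrho_\varepsilon)\|_{L_{3m}(\Omega)} \le \|P_b(\varrho_\varepsilon)\|_{L_2(\Omega)}^{\frac{2}{3m}}\,\|P_b(\varrho_\varepsilon)\|_{L_\infty(\Omega)}^{\frac{3m-2}{3m}} \le C k^{\gamma\frac{3m-2}{3m}},
$$

while

$$
\Bigl\|\theta_\varepsilon\Bigl(\int_0^{\varrho_\varepsilon} K(t)\,dt\Bigr)\Bigr\|_{L_{3m}(\Omega)} \le Ck.
$$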
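The interpolation step $\|f\|_{L_{3m}} \le \|f\|_{L_2}^{2/(3m)}\|f\|_{L_\infty}^{(3m-2)/(3m)}$ used for the pressure holds on any measure space, so it can be sanity-checked on finite samples with the counting measure. The sketch below is an illustration only (function names hypothetical, random data made up); it does not model the actual pressure.

```python
import random


def lp_norm(values, p):
    """Discrete L_p norm with respect to the counting measure."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def sup_norm(values):
    """Discrete L_infinity norm."""
    return max(abs(v) for v in values)


def interpolation_gap(values, m):
    """Right-hand side minus left-hand side of
    ||f||_{3m} <= ||f||_2^(2/(3m)) * ||f||_inf^((3m-2)/(3m)).
    The gap is nonnegative (up to rounding) whenever 3m >= 2.
    """
    p = 3.0 * m
    lhs = lp_norm(values, p)
    rhs = lp_norm(values, 2.0) ** (2.0 / p) * sup_norm(values) ** ((p - 2.0) / p)
    return rhs - lhs


if __name__ == "__main__":
    random.seed(0)
    sample = [random.uniform(-10.0, 10.0) for _ in range(100)]
    for m in (1.5, 2.0, 3.0):
        print(m, interpolation_gap(sample, m) >= -1e-9)
```

In the text the same inequality is combined with an $L_2$ bound independent of $k$ and the pointwise bound $P_b(\varrho_\varepsilon) \le C k^\gamma$, which produces the power $k^{\gamma\frac{3m-2}{3m}}$.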



Note that for m and γ satisfying assumptions of Theorem 1, γ 3m−2 3m > 1. It remains to estimate the convective terms (C.T.) C.T. ≤ K ( ) |v |2 L 3m () + K ( ) |v ||∇v | L 3m () m+1 2 ≤ C L ∞ () v L 6m () + ∇v L 3m () v L ∞ () m+1

for m ≥ 2, while for m < 2 the last term is replaced by ∇v L 2 () v L the fact that for 6 < q ≤ ∞,

6m 2−m

() . Using

α 1 1 1 = + (1 − α) − , q 6 3m 3

v L q () ≤ C v αL 6 () v 1−α with W 1 () 3m

and for 2 < r < 3m, α 1−α 1 = + , r 2 3m

∇v L r () ≤ v αL 2 () ∇v 1−α L 3m () with we end up with 2 2m−1

2

m−1

3m−2 C.T. ≤ C L ∞ () v W3m−2 1 () v W 1 () . 2

Note that

2(m−1) 3m−2

3m

< 1. Thus we may use the bound on and Young’s inequality yields

v W 1

3m ()

γ 3m−2 m

≤ C(1 + k 3

As γ > 3, the lemma is proved.

) + Ck

3m−2 m

1 + v W 1 () . 3m 2

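Before using the bounds proved above, we show one useful result which in particular implies that the limit temperature is positive.

Lemma 6. There exists a subsequence $\{s_\varepsilon\}$ such that $s_\varepsilon \to s$ in $L_2(\Omega)$; subsequently, $\theta_\varepsilon \to \theta$ in $L_q(\Omega)$, $q < 3m$, with $\theta > 0$ a.e. in $\Omega$.

Proof. Recall that from the energy bound we have the following information:

$$
\int_\Omega |\nabla s_\varepsilon|^2\,dx + \int_{\partial\Omega} (e^{s_\varepsilon} + e^{-s_\varepsilon})\,d\sigma \le C,
$$

which in particular gives

$$
\int_\Omega |\nabla s_\varepsilon|^2\,dx + \int_{\partial\Omega} s_\varepsilon^2\,d\sigma \le C.
$$

Thus, remembering that $\Omega$ is bounded, we are allowed to choose a subsequence $s_\varepsilon \to s$ in $L_2(\Omega)$. Recall also that $\theta_\varepsilon = e^{s_\varepsilon}$ and $\theta_\varepsilon \to \theta$ strongly in $L_r(\Omega)$, $r < 3m$. Hence by Vitali's theorem (for a subsequence, if necessary)

$$
e^{s_\varepsilon} \to e^{s} \quad\text{in } L_r(\Omega) \qquad\text{and}\qquad \theta = e^{s} \quad\text{with } s \in L_2(\Omega).
$$

Thus $\theta > 0$ a.e. in $\Omega$, since $s > -\infty$ a.e. in $\Omega$.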
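The passage from the exponential boundary integral to the quadratic one in the proof of Lemma 6 rests on an elementary pointwise inequality. A short derivation, added here only as a reminder and valid for every real $t$:

```latex
% Elementary bound: the symmetric exponential dominates the square.
\[
e^{t} + e^{-t} \;\ge\; e^{|t|} \;\ge\; 1 + |t| + \frac{t^2}{2} \;\ge\; \frac{t^2}{2}
\qquad\Longrightarrow\qquad
\int_{\partial\Omega} s_\varepsilon^2\,d\sigma \;\le\; 2\int_{\partial\Omega}\bigl(e^{s_\varepsilon} + e^{-s_\varepsilon}\bigr)\,d\sigma .
\]
```

Together with the gradient bound this gives a uniform $W^1_2(\Omega)$ estimate for $s_\varepsilon$, and the compact imbedding into $L_2(\Omega)$ yields the strongly convergent subsequence.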



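A crucial role in the proof of the strong convergence of the density is played by a quantity called the effective viscous flux. To define it, the Helmholtz decomposition of the velocity is needed:

$$
v = \nabla\phi + \operatorname{rot} A, \tag{4.6}
$$

where the divergence-free part of the velocity is given as a solution to the following elliptic problem:

$$
\begin{aligned}
\operatorname{rot}\operatorname{rot} A &= \operatorname{rot} v = \omega && \text{in } \Omega,\\
\operatorname{div}\operatorname{rot} A &= 0 && \text{in } \Omega,\\
\operatorname{rot} A \cdot n &= 0 && \text{at } \partial\Omega.
\end{aligned}
\tag{4.7}
$$

The potential part of the velocity is given by the solution to

$$
\Delta\phi = \operatorname{div} v \ \text{ in } \Omega, \qquad \frac{\partial\phi}{\partial n} = 0 \ \text{ at } \partial\Omega, \qquad \int_\Omega \phi\,dx = 0. \tag{4.8}
$$

The classical theory for elliptic equations [18,19] gives us for $1 < q < \infty$,

$$
\begin{aligned}
\|\nabla\operatorname{rot} A\|_{L_q(\Omega)} &\le C\|\omega\|_{L_q(\Omega)}, & \|\nabla^2\operatorname{rot} A\|_{L_q(\Omega)} &\le C\|\omega\|_{W^1_q(\Omega)},\\
\|\nabla^2\phi\|_{L_q(\Omega)} &\le C\|\operatorname{div} v\|_{L_q(\Omega)}, & \|\nabla^3\phi\|_{L_q(\Omega)} &\le C\|\operatorname{div} v\|_{W^1_q(\Omega)}.
\end{aligned}
$$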

The properties of the slip boundary condition enable us to state the following problem: −µ ω = rot (K ( ) F − K ( ) v · ∇v − 21 h K ( )v + 21 v − rot 21 v := H 1 + H 2 in , ω · τ 1 = −(2χ2 − f /µ)v · τ 2 at ∂, ω · τ 2 = (2χ1 − f /µ)v · τ 1 at ∂, div ω = 0 at ∂,

(4.9)

where χk are curvatures related with directions τ k . For the proof of relations (4.9)2,3 – see [12] (also [10]). The structure of ω gives us a hint to consider it as a sum of three components, ω = ω0 + ω1 + ω2 ,

(4.10)

where they are determined by the following systems: −µ ω1 = H 1 , −µ ω2 = H 2 −µ ω0 = 0, · τ 1 = −(2χ2 − f /µ)v · τ 2 , ω1 · τ 1 = 0, ω2 · τ 1 = 0 0 1 ω · τ 2 = (2χ1 − f /µ)v · τ 1 , ω · τ 2 = 0, ω2 · τ 2 = 0 0 1 div ω = 0, div ω = 0, div ω2 = 0

ω0

in , at ∂, (4.11) at ∂, at ∂.

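Lemma 7. For the vorticity $\omega$ written in the form (4.10) we have:³

$$
\begin{aligned}
\|\omega^2\|_{L_r(\Omega)} &\le C(k)\,\varepsilon^{1/2} && \text{for } 1 \le r \le 2,\\
\|\omega^0\|_{W^1_q(\Omega)} + \|\omega^1\|_{W^1_q(\Omega)} &\le C\bigl(1 + k^{1+\gamma(\frac43 - \frac2q)}\bigr) && \text{for } 2 \le q \le 3m.
\end{aligned}
\tag{4.12}
$$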

3 Note that we can prove that ω2 + L r () = o() for → 0 for any r < 3m. As we do not need it and the proof of the rate is slightly more complicated, we skip it. Analogously we may consider the other inequality also for q < 2, with different powers of k.



Proof. First, let us consider ω0 . Take α 0 any divergence–free extension of the boundary data to ω , e.g. in the form of a solution to the following Stokes problem: − µ α 0 + ∇ p0 div α 0 α0 · τ 1 α0 · τ 2 α0 · n 1−1/(3m)

Note that v ∈ W3m

= = = = =

0 in , 0 in , −(2χ2 − f /µ)v · τ 2 (2χ1 − f /µ)v · τ 1 0 at ∂.

at ∂, at ∂,

(4.13)

1 () with the estimate (∂), thus α 0 ∈ W3m

α 0 Wq1 () ≤ C v Wq1 () ,

1 < q ≤ 3m.

Thus we may transform the system for ω0 to the form − µ (ω0 − α 0 ) = µ α 0

in ,

− α0 ) · τ 1 = 0

at ∂,

− α0 ) · τ 2 = 0

at ∂,

div(ω0 − α 0 ) = 0

at ∂.

(ω0 (ω0

(4.14)

To find the estimates for solutions to (4.14) we consider its weak form, then the r.h.s. of (4.14)1 delivers a nontrivial boundary term. It is well defined, since div α 0 = 0. Then results from [18,19] guarantee desired bounds. As the system for ω0 has the same structure as that for ω1 , we get ||ω1 ||Wq1 () ≤ C||H 1 ||Wq−1 () and ||ω0 ||Wq1 () ≤ C||v ||Wq1 () , 1 < q ≤ 3m. Analyzing the form of H 1 we see that the only not elementary term is the convective one; so we obtain ||ω1 ||Wq1 () ≤ C(1 + ||K ( ) v · ∇v || L q () ). We easily see that for q ≥ 2, ||K ( ) v · ∇v || L q () ≤ k||v || L ∞ () ||∇v || L q () . Using interpolation inequalities as in Lemma 5 we prove that 2(m−1)

6m−2q

m

3m(q−2)

∇v L3m−2

∇v L(3m−2)q

∇v L(3m−2)q ||K ( ) v · ∇v || L q () ≤ Ck v L3m−2 6 () 3m () 2 () 3m () ≤ Ck

1+γ ( 43 − q2 )

.

Evidently, the estimate for ω0 is less restrictive. Similarly, for ω2 we have

||ω2 || L q () ≤ C|| v ||Wq−1 () ≤ C sup | φ

v φd x|,



where the sup is taken over all functions belonging to Wq1 () with 1/ p + 1/q = 1. From the continuity equation we know that √ ||∇ || L 2 () ≤ C(k). (For q > 2 we have only ∇ L q () ≤ C.) As q ≤ 2, 1

||ω2 || L q () ≤ C( ∇ L 2 () v L ∞ () + ∇ L 2 () ∇v L 3m () ) ≤ C(k) 2 . The lemma is proved.

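We now introduce the fundamental quantity, the effective viscous flux, which is in fact the potential part of the momentum equation. Using the Helmholtz decomposition in the approximative momentum equation we have

$$
\nabla\bigl(-(2\mu + \nu)\Delta\phi_\varepsilon + P(\varrho_\varepsilon, \theta_\varepsilon)\bigr) = \mu\Delta\operatorname{rot} A_\varepsilon + K(\varrho_\varepsilon)\varrho_\varepsilon F - K(\varrho_\varepsilon)\varrho_\varepsilon v_\varepsilon \cdot \nabla v_\varepsilon - \frac12\varepsilon h K(\varrho_\varepsilon) v_\varepsilon + \frac12\varepsilon\varrho_\varepsilon v_\varepsilon - \frac12\varepsilon\Delta\varrho_\varepsilon v_\varepsilon.
$$

We define

$$
G_\varepsilon = -(2\mu + \nu)\Delta\phi_\varepsilon + P(\varrho_\varepsilon, \theta_\varepsilon) = -(2\mu + \nu)\operatorname{div} v_\varepsilon + P(\varrho_\varepsilon, \theta_\varepsilon) \tag{4.15}
$$

and its limit version

$$
G = -(2\mu + \nu)\operatorname{div} v + \overline{P(\varrho, \theta)}. \tag{4.16}
$$

Note that we are able to control the integrals $\int_\Omega G_\varepsilon\,dx = \int_\Omega P(\varrho_\varepsilon, \theta_\varepsilon)\,dx$ and $\int_\Omega G\,dx = \int_\Omega \overline{P(\varrho, \theta)}\,dx$, where $\overline{P(\varrho, \theta)} = \overline{P_b(\varrho)} + \theta\,\overline{\int_0^{\varrho} K(t)\,dt}$.

The result of the lemma below gives the most important properties of the effective viscous flux, guaranteeing the compactness of $\{G_\varepsilon\}$ as well as the pointwise bound of the limit in terms of the parameter $k$ from definition (2.1).

Lemma 8. We have, up to a subsequence $\varepsilon \to 0^+$:

$$
G_\varepsilon \to G \quad\text{strongly in } L_2(\Omega) \tag{4.17}
$$

and

$$
\|G\|_{L_\infty(\Omega)} \le C(\eta)\bigl(1 + k^{1 + \frac23\gamma + \eta}\bigr) \quad\text{for any } \eta > 0. \tag{4.18}
$$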

Proof. The function G can be naturally decomposed as G = G 1 + G 2 , where 1 2 2 2 G d x = 0 and ∇G = − 2 v − µ rot ω . Thus ||G 2 || L q () ≤ C(|| v ||Wq−1 () + µ rot ω2 Wq−1 () ). Using Lemma 7 we see that 1

G 2 L q () ≤ C(k) 2 ,

1 ≤ q ≤ 2.



Next,using again Lemma 7 and calculations in its proof, we immediately see that (recall that | G d x| ≤ C, we control the average of the r.h.s of (4.15) from the energy bound — Lemma 1) ||G 1 ||Wq1 () ≤ C(1 + k

1+γ ( 43 − q2 )

)

for 2 ≤ q ≤ 3m.

(4.19)

Thus we have, at least for a subsequence, in L ∞ () and G 2 → 0

G 1 → G 1

in L 2 ().

Therefore G = G 1 + G 2 → G 1

in L q (),

1 ≤ q ≤ 2,

and due to the definition, G 1 = G. Finally, choosing q = 3 + η˜ in (4.19), 2

G L ∞ () ≤ C(q) G Wq1 () ≤ C(q) sup G 1 Wq1 () ≤ C(η)(1 + k 1+ 3 γ +η ) >0

with η > 0, arbitrarily small if η˜ is so. This finishes the proof of Lemma 8.
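The power of $k$ in (4.18) can be traced back to the bound (4.19). The sketch below is illustrative only; it assumes the power in (4.19) is $1 + \gamma(\frac43 - \frac2q)$, as in Lemma 7, and evaluates it at $q = 3 + \tilde\eta$. The function names are hypothetical.

```python
from fractions import Fraction


def k_exponent(gamma, q):
    """Assumed power of k in (4.19): 1 + gamma * (4/3 - 2/q), for 2 <= q <= 3m."""
    gamma, q = Fraction(gamma), Fraction(q)
    return 1 + gamma * (Fraction(4, 3) - 2 / q)


def eta_from(gamma, eta_tilde):
    """Excess over 1 + 2*gamma/3 when q = 3 + eta_tilde.

    Closed form: 2 * gamma * eta_tilde / (3 * (3 + eta_tilde)),
    which tends to zero together with eta_tilde.
    """
    gamma, eta_tilde = Fraction(gamma), Fraction(eta_tilde)
    return 2 * gamma * eta_tilde / (3 * (3 + eta_tilde))


if __name__ == "__main__":
    gamma = Fraction(4)
    for eta_tilde in (Fraction(1), Fraction(1, 10), Fraction(1, 100)):
        print(eta_tilde, k_exponent(gamma, 3 + eta_tilde), eta_from(gamma, eta_tilde))
```

At $q = 3$ the power is exactly $1 + \frac23\gamma$; since the imbedding $W^1_q(\Omega) \to L_\infty(\Omega)$ needs $q > 3$ in three dimensions, a small excess $\eta$ is unavoidable in this argument.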

5. Limit Passage

In this section we apply the properties of the effective viscous flux shown in the previous part. First we prove a result characterizing the sequence of approximative densities.

Theorem 3. There exists a sufficiently large number $k_0 > 0$ such that for $k > k_0$,

$$
\frac{k-3}{k}(k - 3)^\gamma - \|G\|_{L_\infty(\Omega)} \ge 1 \tag{5.1}
$$

and for a subsequence $\varepsilon \to 0^+$ it holds

$$
\lim_{\varepsilon \to 0^+} |\{x \in \Omega : \varrho_\varepsilon(x) > k - 3\}| = 0. \tag{5.2}
$$

In particular it follows: $\overline{K(\varrho)\varrho} = \varrho$ a.e. in $\Omega$.

Proof. We define a smooth function $M : \mathbb{R}^+_0 \to [0, 1]$ such that

$$
M(t) = \begin{cases} 1 & \text{for } t \le k - 3,\\ \in [0, 1] & \text{for } k - 3 < t < k - 2,\\ 0 & \text{for } k - 2 \le t \end{cases}
$$

and M (t) < 0 for t ∈ (k − 3, k − 2). We follow the method introduced in [11]. First we multiply the approximative continuity equation (2.2)1 by M l ( ) for l ∈ N getting ⎛ ⎞ (x) ⎜ ⎟ t l M l−1 (t)M (t)dt ⎠ div v ≥ R ⎝

0



with R → 0 as → 0, as M l ( ) d x = −l M l−1 ( )M ( )|∇ |2 d x ≥ 0.

Next, recalling definitions of G and M, we obtain ⎛ ⎞ (x) ⎜ ⎟ −(k − 3) ⎝ l M l−1 (t)M (t)dt ⎠ P( , θ )d x

0

⎛

⎞ (x) ⎜ ⎟ ≤ k ⎝ −l M l−1 (t)M (t)dt ⎠ G d x + R .

0

Thus the properties of M lead us to the following inequality: k−3 l (1 − M ( ))P( , θ )d x ≤ (1 − M l ( ))|G |d x + |R |. k { >k−3}

{ >k−3}

From the explicit form of the pressure function (2.3) we find k−3 k−3 (k − 3)γ |{ > k − 3}| − ||P( , θ )|| L 2 () ||M l ( )|| L 2 () k k ≤ ||G|| L ∞ () |{ > k − 3}| + ||G − G || L 1 () + |R |. But by Lemma 8 – inequality (4.18) – we are able to choose k0 so large that for all k > k0 2 we have (5.1), since γ > 3 and ||G|| L ∞ () ≤ Cη (1 + k 1+ 3 γ +η ) with 0 < η ≤ γ −3 6 . Hence we get |{x ∈ : (x) > k −3}| ≤ C ||M l ( )|| L 2 ({ >k−3}) +||G −G || L 1 () +|R | . (5.3) Now, let us fix δ > 0. Then there exists 0 > 0 such that for < 0 , C(||G − G || L 1 () + |R |) ≤ δ/2.

(5.4)

Having fixed, we consider the sequence {M l ( )I{ >k−3} }l∈N , where I A is the characteristic function of a set A. We see that it monotonely pointwise converges to zero. Thus by the Lebesgue theorem we are able to find l = l(, δ) such that C||M l ( )|| L 2 ({ >k−3}) ≤ δ/2.

(5.5)

From (5.3), (5.4) and (5.5) we obtain lim |{x ∈ ; (x) > k − 3}| ≤ δ.

(5.6)

→0

As δ > 0 can be chosen arbitrarily small, Theorem 3 is proved.
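The choice of $k_0$ in the proof of Theorem 3 relies on comparing powers of $k$ in (5.1) and (4.18). A short worked check, added as a reminder, assuming only $\gamma > 3$ and $0 < \eta \le \frac{\gamma-3}{6}$ as stated in the proof:

```latex
% The power of k in the bound (4.18) for G stays strictly below gamma.
\[
1 + \frac{2}{3}\gamma + \eta \;\le\; 1 + \frac{2}{3}\gamma + \frac{\gamma - 3}{6}
\;=\; \frac{5\gamma + 3}{6} \;<\; \gamma
\quad\Longleftrightarrow\quad \gamma > 3,
\]
\[
\text{so that}\qquad
\frac{k-3}{k}(k-3)^{\gamma} - C_\eta\bigl(1 + k^{1 + \frac{2}{3}\gamma + \eta}\bigr) \longrightarrow +\infty
\quad\text{as } k \to \infty .
\]
```

Hence the left-hand side of (5.1) exceeds 1 for all sufficiently large $k$, which is exactly what fixes $k_0$.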

Thanks to Theorem 3 we are prepared to present the main part of the proof, i.e. the pointwise convergence of the density.



Lemma 9. We have P(, θ )d x ≤ Gd x and P(, θ )d x = Gd x;

(5.7)

consequently, P(, θ ) = P(, θ ) and up to a subsequence → 0+ , → strongly in L q () for any q < ∞.

(5.8)

Proof. Due to Theorem 3 we are able to omit K () in the limit equation. For details we refer to [11] – Sect. 4, consideration for (4.16). Examine the approximative continuity equation (2.2)1 . We use as test function ln( + δ) and passing with δ → 0+ we obtain K ( )v · ∇ d x ≥ C(k), (5.9)

thus Theorem 3 implies div v d x ≥ R .

−

(5.10)

Applying (4.15) to (5.10), passing with → 0, then by the strong convergence of G — see (4.17) — we conclude that G = G, so the first relation in (5.7) is proved. Next, we consider the limit to the continuity equation, i.e. div(v) = 0. Testing it by ln with an application of Friedrich’s lemma to have the possibility to use test functions with lower regularity we obtain (for details see [11]) div vd x = 0.

The definition of G — (4.16) — shows the second part of (5.7). Due to elementary properties of weak limits we get P(, θ ) ≤ P(, θ ) a.e. in , but (5.7) implies (P(, θ ) − P(, θ ) )d x ≤ 0, hence P(, θ ) = P(, θ ) a.e.,

i.e. γ +1 + 2 θ = γ + 2 θ a.e.

However, γ +1 ≥ γ and 2 θ ≥ 2 θ , so γ +1 = γ a.e.

and

2 θ = 2 θ a.e.

By Lemma 6 the temperature θ > 0 a.e., we conclude 2 = 2 and for a suitably taken subsequence, lim || − ||2L 2 () = 2 − 2 L 1 () = 0.

→0

(5.11)

Thus the limit (5.11) implies → strongly in L 2 () and by the pointwise boundedness of and we conclude (5.8).



Next, we would like to study the limit of the energy equation. The first observation concerns the velocity: we obtain the strong convergence of its gradient. Recall that from Theorem 3 and due to the strong convergence of the temperature it follows P( , θ ) → p(, θ ) strongly in L 2 (), hence (4.17) implies div v → div v

strongly in L 2 ().

(5.12)

strongly in L 2 (),

(5.13)

Additionally we already proved that rot v → rot v

since we observed that the vorticity can be written as the sum of two parts, one bounded in Wq1 () and the other one going strongly to zero in L 2 (). The regularity of systems (4.7) and (4.8) and convergences (5.12) and (5.13) imply immediately that v → v

strongly in H 1 ().

In particular, we get S(v ) : ∇v → S(v) : ∇v

strongly at least in L 1 ().

(5.14)

This fact will be crucial in our considerations for the limit of the energy equation. Recall that → in L q () for q < ∞,

v → v in Wq1 () for q < 3m,

θ → θ in L q () for q < 3m,

1 θ θ in Wmin{2, 3m (). } m+1

Consider the weak form of (2.2)3 . For a smooth function φ we have m + θ (1 + θ ) ∇θ · ∇φd x + L(θ )(θ − θ0 )φdσ + ln θ φdσ θ ∂ ∂ ⎡⎛ ⎞ ⎤ (x) ⎢⎜ ⎟ ⎥ − ⎣⎝ K (t)dt ⎠ v · ∇(θ φ) + K ( ) v · ∇(θ φ)⎦ d x

+

⎡

0

⎢ ⎣ K ( ) v · ∇θ φ + div(θ v φ)

(x)

⎤

⎥ K (t)dt ⎦ d x =

0

+ θ ∇θ (1 + θ m )∇θ

(5.16)

S(v ) : ∇v φd x.

Thanks to (5.15), (1 + θm )

(5.15)

in L 1 ().



Passing to the limit with the last four terms of the l.h.s. of (5.16) we get −v∇(θ φ) − v∇(θ φ) + φv∇θ + div(θ φv) d x

=

−θ v · ∇φ + θ div vφ d x.

(5.17)

In (5.17) we essentially used the strong convergence of the density. To control the behavior of the boundary terms we note that due to (5.15)2 we see that θ |∂ → θ |∂ strongly in L l+1 (∂) and by Lemma 6, ln θ is bounded in L 2 (∂). Thus recalling (5.14) we get at the limit

(1 + θ )∇θ · ∇φd x +

=

θ v · ∇φd x

∂

S(v) : ∇vφd x −

L(θ )(θ − θ0 )dσ −

m

θ div vφd x.

(5.18)

To conclude, note that we may show that the limit functions $\theta$ and $v$ belong to $W^1_p(\Omega)$ for any $p < \infty$. To see this, we introduce the function $\Phi(\theta) = \int_0^\theta (1 + t^m)\,dt$, similarly as in Sect. 3, formula (3.10). Thus from (5.18) we immediately see that $\theta \in L_\infty(\Omega)$ and $v \in W^1_p(\Omega)$ for any $p < \infty$. Using this fact once more in the energy equation, we observe that $\theta \in W^1_p(\Omega)$, $p < \infty$. The positiveness of $\theta$ follows from Lemma 6. Theorem 1 is proved.

Acknowledgements. The work has been granted by the working program between Charles and Warsaw Universities. The first author has been partly supported by the Polish KBN grant No. 1 P03A 021 30 and by EC FP6 M. Curie ToK program SPADE2, MTKD-CT-2004-014508 and SPB-M. The work of the second author is a part of the research project MSM 0021620839 financed by MSMT and partly supported by the grant of the Czech Science Foundation No. 201/05/0164 and by the project LC06052 (Jindřich Nečas Center for Mathematical Modeling).

Communicated by P. Constantin

Commun. Math. Phys. 288, 379–401 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0747-y

$L^2$-Restriction Bounds for Eigenfunctions Along Curves in the Quantum Completely Integrable Case

John A. Toth

Department of Mathematics and Statistics, McGill University, Montreal, Canada. E-mail: [email protected]

Received: 17 March 2008 / Accepted: 5 December 2008
Published online: 11 March 2009 – © Springer-Verlag 2009

Abstract: We show that for a quantum completely integrable system in two dimensions, the $L^2$-normalized joint eigenfunctions of the commuting semiclassical pseudodifferential operators satisfy restriction bounds of the form $\int_\gamma |\varphi_j|^2\,ds = O(|\log\hbar|)$ for generic curves $\gamma$ on the surface. We also prove that the maximal restriction bounds of Burq-Gerard-Tzvetkov [BGT] are generically attained for certain exceptional subsequences of eigenfunctions.

1. Introduction

Let $(M, g)$ be a compact, closed orientable Riemannian manifold. Let $-\Delta_g : C^\infty(M) \to C^\infty(M)$ be the associated Laplace-Beltrami operator with eigenvalues $0 < \lambda_1 \le \lambda_2 \le \cdots$ and eigenfunctions $\varphi_j$; $j = 1, 2, 3, \ldots$ satisfying $-\Delta_g \varphi_j = \lambda_j^2 \varphi_j$, and $L^2$-normalized so that $\int_M |\varphi_j|^2\, d\mathrm{vol}(x) = 1$. The celebrated Avakumovic-Levitan-Hörmander asymptotics for the unintegrated spectral counting function $e(x, x, \lambda) = \sum_{\lambda_j \le \lambda} |\varphi_j(x)|^2$ imply that

\[
\|\varphi_j\|_{L^\infty} = O\big(\lambda_j^{\frac{n-1}{2}}\big).
\tag{1.1}
\]

The example of the sphere shows that (1.1) is sharp. The corresponding sharp $L^p$-bounds are due to Sogge [So1,So2,So3]. Even though this $L^\infty$-bound is far from generic [STZ], the only general improvements on (1.1) that we are aware of are due to Sogge and Zelditch [SZ] and more recently, Sogge, Toth and Zelditch [STZ,T4]. These authors obtain pointwise $o(\lambda^{\frac{n-1}{2}})$-bounds under a certain non-recurrence condition for the geodesic flow on $(M, g)$. The methods in [STZ] follow closely the earlier work of Safarov [S] and Safarov-Vassiliev [SV].

It is natural to ask whether one can generically improve the $O(\lambda^{\frac{n-1}{2}})$ sup-bound by polynomial powers of $\lambda$ and if so, by how much. In general, very little is known here: polynomial improvements have been obtained by Iwaniec and Sarnak in arithmetic hyperbolic cases [Sa,IS]. At the other extreme, in the quantum completely integrable (QCI) case it is known that under a natural Morse assumption, one can show that $\sup_{x\in M} |\varphi_\lambda(x)| = O(\lambda^{1/4})$ when the $\varphi_\lambda$'s are joint eigenfunctions of the commuting operators and $\dim M = 2$ (see [T4]). In the latter case, when $\dim M > 2$, one can at least hope to obtain a fairly complete answer to this question provided the $\varphi_j$'s are joint eigenfunctions of $n$ functionally independent, self-adjoint, jointly elliptic, commuting $\hbar$-pseudodifferential operators $P_1(\hbar), P_2(\hbar), \ldots, P_n(\hbar)$. However, due to the presence of often complicated degeneracies of the Lagrangian foliation, even at the classical level, the dynamical picture is only partially understood [VN2]. Likewise, at the quantum level, the asymptotic blow-up properties of eigenfunctions (e.g. sharp $L^p$-bounds) are also only partially understood [T1–T3, TZ1–TZ3].

Apart from pointwise bounds, it is natural when studying asymptotic concentration properties of eigenfunctions to consider limits of expected values $\langle A\varphi_\lambda, \varphi_\lambda\rangle$ as $\lambda \to \infty$, where $A$ is a zeroth-order pseudodifferential operator, and to compute the corresponding semiclassical defect measures. Formally, one can let $A$ approach $\delta_\gamma$, where the latter is surface measure along a submanifold, $\gamma \subset M$. Then, one is faced with estimating asymptotic upper bounds for $L^2$, or more generally, $L^p$ integrals along submanifolds of $M$.

The author was supported by a William Dawson Fellowship and NSERC Grant OGP0170280.
In the case of surfaces, these are curves and the concentration of these defect measures along a periodic geodesic $\gamma$ is called strong scarring. For Laplace eigenfunctions, the eigenfunction restriction bounds have been studied by Reznikov [R] for hyperbolic surfaces and Burq-Gerard-Tzvetkov [BGT] for general manifolds (and for all $p \ge 2$). Both papers are related to earlier work of Tataru [Ta] on estimating boundary traces of wavefunctions. We will focus here on the case where $p = 2$ and $\dim M = 2$ (i.e. $L^2$-restriction bounds along curves on surfaces). At the moment, it is unclear to us whether our methods extend to $L^p$-restriction bounds for $p \neq 2$. In the special case of $L^2$-integrals along curves, the estimates in [BGT] are as follows:

• (i) If $\gamma$ is a unit-length geodesic, then $\int_\gamma |\varphi_j(s)|^2\,ds = O(\lambda_j^{1/2})$.

• (ii) If $\gamma$ is a curve with strictly-positive geodesic curvature, $\int_\gamma |\varphi_j(s)|^2\,ds = O(\lambda_j^{1/3})$.

In this article, we obtain generic asymptotic bounds for $\int_\gamma |\varphi_j(s)|^2\,ds$ in the case where the $\varphi_j$'s are joint eigenfunctions of the QCI system consisting of two commuting $\hbar$-pseudodifferential operators $P_1(\hbar)$ and $P_2(\hbar)$. Other than the fact that our analysis here is specific to QCI systems and to the case $p = 2$, this paper differs from [BGT] in several ways:


1) One of the main issues here is the generic behaviour of restriction bounds, where a curve $\gamma : [a, b] \to M$ is called generic if it satisfies the Morse condition in Definition 1.1. As we show in Sect. 2, in the QCI case the restricted asymptotic eigenfunction mass, $\int_\gamma |\varphi_j|^2\,ds$, is much smaller than the prediction in (i) or (ii) above. Indeed, it is $O(\log\lambda_j)$ (see Theorems 1 and 3) and the example of zonal harmonics on the sphere (see Sect. 4.1) shows that this bound is sharp.

2) In Theorem 3, we establish a converse to (i) above, and show that the bound in (i) is generically attained in the QCI case. Moreover, we identify the specific bicharacteristics in terms of the singular Lagrangian foliation that support such large eigenfunction scars.

3) Finally, we prove all our results for a rather large class of possibly inhomogeneous semiclassical QCI Hamiltonians. The semiclassical Laplacian $P_1(\hbar) = -\hbar^2\Delta_g$ is a special case. The results really have to do with the bicharacteristic flow and are not specific to geodesics.

Before going on, we explain what is meant here by the term generic. Given $E_1 > 0$ a regular value of $p_1$, we assume that for $(x, \xi) \in p_1^{-1}(E_1)$,

\[
\partial_\xi p_1(x, \xi) \neq 0.
\tag{A1}
\]

That is, $p_1$ is real principal type on the hypersurface $p_1^{-1}(E_1)$. Given the canonical projection $\pi : T^*M \to M$, we define

\[
C_\gamma := \{(x, \xi) \in T^*M;\ p_1(x, \xi) = E_1,\ x \in \gamma\} = p_1^{-1}(E_1) \cap \pi^{-1}(\gamma).
\tag{1.2}
\]

Definition 1.1. Let $\iota : C_\gamma \to p_1^{-1}(E_1)$ be the standard inclusion map. We say that the $\mathbb{R}^2$-integrable system with moment map $P = (p_1, p_2)$ is generic along the curve $\gamma : [a, b] \to M$ provided $\iota^* p_2 \in C^\infty(C_\gamma)$ is a Morse function and condition (A1) is satisfied. When it does not cause confusion, we call the curve segment $\gamma$ itself generic when the conditions in Definition 1.1 are satisfied.

Remark 1.2. In the homogeneous case where $p_1 = |\xi|_g^2$, the manifold $C_\gamma = S^*_\gamma M$. In this case, $E_1 = 1$ and by Euler homogeneity, $\xi\cdot\partial_\xi |\xi|_g^2 = 2|\xi|_g^2$, so that (A1) is automatically satisfied.

Theorem 1. Let $\varphi_j$; $j = 1, 2, 3, \ldots$ be the $L^2$-normalized joint eigenfunctions of the commuting operators $P_1(\hbar)$ and $P_2(\hbar)$ on a Riemannian surface $(M^2, g)$ with joint eigenvalues $(\lambda_j^{(1)}(\hbar) = E_1 + O(\hbar),\ \lambda_j^{(2)}(\hbar)) \in \mathrm{Spec}\, P_1(\hbar) \times \mathrm{Spec}\, P_2(\hbar)$; $j = 1, 2, 3, \ldots$. Then for generic curves $\gamma : [a, b] \to M$ and $\hbar \in (0, \hbar_0]$,

\[
\int_\gamma |\varphi_j|^2\,ds = O_{|\gamma|}(|\log\hbar|).
\]

Here, $|\gamma|$ denotes the length of the curve segment $\gamma$ and the RHS of the above estimate is uniform over all energy values $\{E \in \mathbb{R};\ (E_1, E) \in P(T^*M)\}$. In the special case where $P_1(\hbar) = -\hbar^2\Delta_g$ one can scale out $\hbar$ and Theorem 1 becomes


Theorem 2. Let $\varphi_j$; $j = 1, 2, 3, \ldots$ be the $L^2$-normalized joint Laplace eigenfunctions of the commuting operators $P_1 = -\Delta_g$ and $P_2$ on a Riemannian surface $(M^2, g)$. Then, provided $\iota^* p_2|_{S^*_\gamma M}$ is Morse, one gets

\[
\int_\gamma |\varphi_j|^2\,ds = O_{|\gamma|}(\log\lambda_j).
\]

In the homogeneous case, it was already observed in [BGT] (see estimate (i) above) that in the case where $\gamma$ is a geodesic, the restriction upper bounds can grow at the maximal rate $\sim \lambda^{1/2}$. Consistent with this, in the QCI case, we will show that there always exist certain bicharacteristics that support high $L^2$-mass for certain subsequences of eigenfunctions consistent with the $\lambda^{1/2}$-bound in (i) (at least up to possible loss of $\log\lambda$). However, it is important to note that the nature of the bicharacteristic is very important when discussing restriction bounds. To describe what we mean, let $B_{reg}$ (resp. $B_{sing}$) denote the regular (resp. singular) values of the moment map $P = (p_1, p_2) : T^*M \to \mathbb{R}^2$. In the general QCI case, most bicharacteristics of $H_{p_1}$ are subsets of Lagrangian tori in $P^{-1}(B_{reg})$. These do not support large $L^2$-bounds along their configuration space projections. However, as was shown in [TZ3] Lemma 3, unless $(M, g)$ is a flat torus, there is always a subsequence of joint eigenfunctions of $P_1$ and $P_2$ with mass concentrated along (singular) joint orbits of the Hamilton fields $H_{p_1}$ and $H_{p_2}$ contained in $P^{-1}(B_{sing})$. The latter eigenfunctions saturate the maximal bounds in (ii) above. This is of course consistent with simple examples like surfaces of revolution with metric $g = dr^2 + a^2(r)\,d\theta^2$, where the equator is the projection of a singular orbit of the joint flow of $H_{p_1}$ and $H_{p_2}$. In this case, $p_1 = |\xi|_g$ and $p_2 = p_\theta$ with $p_\theta(v) := \langle v, \partial_\theta\rangle$. The corresponding joint eigenfunctions, $\varphi_j$, of $P_1(\hbar) = -\hbar^2\Delta_g$ and $P_2(\hbar) = \hbar D_\theta$ with joint eigenvalues $(\lambda_j^{(1)}(\hbar), \lambda_j^{(2)}(\hbar)) = (1, 0) + o(1)$ (the analogs of highest weight spherical harmonics) satisfy $\int_\gamma |\varphi_j|^2\,ds \sim \hbar^{-1/2}$ along the equator, $\gamma$. So, in particular, the restriction bound for these eigenfunctions is certainly non-generic in the sense of Definition 1.1. However, it is not hard to show (see Sect. 4) that the meridian great circles, while obviously also periodic geodesics, have associated joint eigenfunctions with very different restriction bounds. These geodesics lie in the base space projection of a maximal Lagrangian torus. The zonal harmonics have $\hbar$-microsupport on this torus and, as we show in Subsect. 4.1, the latter eigenfunctions have the $L^2$-restriction bound $\int |\varphi_j|^2 \sim \log\frac{1}{\hbar}$ along generic curves passing through the poles. These examples show that the estimates in Theorems 1 and 2 are sharp. In the case of exceptional bicharacteristics, we prove

Theorem 3. Let $P_j(\hbar)$; $j = 1, 2$ be an Eliasson non-degenerate QCI system on a surface $(M, g)$. Then,

• (i) When $\gamma$ is the projection of a bicharacteristic segment of $p_1$ contained in $P^{-1}(B_{reg})$, $\int_\gamma |\varphi_j(s; \hbar)|^2\,ds = O_{|\gamma|}(1)$.

• (ii) When $\gamma$ is the projection of a singular joint orbit in $P^{-1}(B_{sing})$, $\int_\gamma |\varphi_j(s; \hbar)|^2\,ds = O_{|\gamma|}(\hbar^{-1/2})$.



Moreover, there exists a constant $c_\gamma > 0$ depending only on the curve $\gamma$, and a subsequence of joint eigenfunctions, $\varphi_{j_k}$; $k = 1, 2, \ldots$ such that for $\hbar \in (0, \hbar_0]$,

\[
\begin{aligned}
\int_\gamma |\varphi_{j_k}(s; \hbar)|^2\,ds &\ge c_\gamma\,\hbar^{-1/2} && \text{when } \gamma \text{ is stable},\\
\int_\gamma |\varphi_{j_k}(s; \hbar)|^2\,ds &\ge c_\gamma\,\hbar^{-1/2}\,|\log\hbar|^{-1} && \text{when } \gamma \text{ is unstable}.
\end{aligned}
\]

It is proved in [TZ3] (see Lemma 3) that unless $(M, g)$ is a flat torus, the joint flow $\Phi_t$ always possesses at least one singular orbit (see also [L,LS]). In the case where $\dim M = 2$ this orbit must be one-dimensional (i.e. a geodesic). Thus, the second estimate (ii) in Theorem 3 is generically attained in the QCI case and therefore, up to a power of $\log\lambda$, the maximal $L^2$-restriction bound in [BGT] is attained.

Remark 1.3. Examples to which Theorems 1 and 3 apply include: QCI Laplacians on ellipsoids (with distinct axes), surfaces of revolution, Liouville surfaces. Less well-known examples include QCI Laplacians associated with spherical metrics obtained by reducing the Goryachev-Chaplygin top as well as those constructed in [DM]. In both of the last two classes of examples, the integral in involution, $p_2$, is a cubic polynomial in the momentum variables. Finally, there are also examples known where $p_2$ is quartic in the momenta [Mi]. In addition, our results apply to inhomogeneous QCI systems such as Neumann oscillators, Euler and Kowalevsky tops and the spherical pendulum as well as many others. We have stated our results for surfaces because the formulation is quite elegant in that case. It is not hard to extend the analysis here to higher dimensions under the appropriate notion of a generic submanifold, but the formulation of results becomes more cumbersome. We hope to address this elsewhere.

Remark 1.4. In analogy with the specific results for QCI eigenfunctions in Theorems 1 and 3 above, it is natural to try to determine $L^2$ (or $L^p$) eigenfunction bounds along “typical” curves on general Riemann surfaces $(M^2, g)$ by varying the standard restriction estimates over appropriate moduli spaces of curve segments. We hope to address this point elsewhere.

2. Generic (Joint) Eigenfunction Restriction Bounds Along Curves

We say that $P(\hbar) \in Op_{\hbar,cl}(S^{m,k}(T^*M))$ if locally it has Schwartz kernel

\[
P(x, y; \hbar) = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} e^{i(x-y)\xi/\hbar}\, p(x, \xi; \hbar)\,d\xi,
\]

where $p(x, \xi; \hbar) \sim \sum_{j=0}^\infty p_j(x, \xi)\,\hbar^{-k+j}$ and $p_j \in S^{m-j}_{1,0}(T^*M)$ with $\hbar \in (0, \hbar_0]$. From now on, without loss of generality, we assume that $P_j(\hbar) \in Op_{\hbar,cl}(S^{m,0})$, $P_1(\hbar)$ is elliptic in the classical sense and the $P_j(\hbar)$'s are self-adjoint. In this section, we get generic asymptotic bounds for $\int_\gamma |\varphi_j(s; \hbar)|^2\,ds$ in the case where the $\varphi_j$'s are joint eigenfunctions of $P_1(\hbar)$ and $P_2(\hbar)$. Here, the term generic refers to a non-degeneracy condition on the QCI system along the (generalized) cylinder $C_\gamma$ given in Definition 1.1.


2.1. Proof of Theorem 1. We assume here that $(M, g)$ is a compact surface with QCI quantum Hamiltonian given by $P_1(\hbar)$ and the quantum integral in involution is $P_2(\hbar)$, where we assume that its principal symbol, $p_2$, satisfies the Morse condition in Definition 1.1. The joint spectrum of $P_1(\hbar)$ (resp. $P_2(\hbar)$) will be denoted by $\lambda_j^{(1)}(\hbar)$ (resp. $\lambda_j^{(2)}(\hbar)$) with $j = 1, 2, 3, \ldots$. Let $\rho \in \mathcal{S}(\mathbb{R})$ satisfy $\rho(u) \ge 0$ with $\rho(0) = 1$ and $\hat\rho \in C_0^\infty([-\epsilon, \epsilon])$ with $\epsilon > 0$ sufficiently small. For fixed $x \in M$, we form the joint unintegrated trace attached to the level $(p_1, p_2) = (E_1, E)$ given by

\[
I_E(x; \hbar) := \sum_{j=1}^\infty \rho\big(\hbar^{-1}[\lambda_j^{(1)}(\hbar) - E_1]\big)\, \rho\big(\hbar^{-1}[\lambda_j^{(2)}(\hbar) - E]\big)\, |\varphi_j(x; \hbar)|^2.
\tag{2.3}
\]

Our task is to obtain a locally uniform asymptotic bound (in $E$) for $\int_a^b I_E(x(\tau); \hbar)\,d\tau$ as $\hbar \to 0^+$. Writing the usual small-time $\hbar$-Fourier integral operator (FIO) parametrices for $e^{itP_1(\hbar)/\hbar}$ and $e^{isP_2(\hbar)/\hbar}$ and taking Fourier transforms in (2.3) gives:

\[
I_E(x; \hbar) = (2\pi\hbar)^{-4} \int e^{i\Phi(x,y,\xi,\eta,s,t;E)/\hbar}\, a_\chi(y, \eta, \xi; \hbar)\, \hat\rho(t)\,\hat\rho(s)\, ds\,dt\,d\xi\,d\eta\,dy + O(\hbar^\infty).
\tag{2.4}
\]

In (2.4), because of the cutoff function $\chi$ appearing in the amplitude (see (2.10) below), the uniform $O(\hbar^\infty)$ remainder follows by successive integration by parts in $t$ and $s$. The total phase function is

\[
\Phi(x, y, \xi, \eta, s, t; E) = \psi_1(x, y, \xi, t) + tE_1 + \psi_2(y, x, \eta, s) + sE,
\tag{2.5}
\]

where

\[
\psi_1(x, y, \xi, t) = \vartheta_1(x, \xi, t) - y\xi, \qquad \psi_2(y, x, \eta, s) = \vartheta_2(y, \eta, s) - x\eta.
\tag{2.6}
\]

In (2.6), the $\vartheta_j$; $j = 1, 2$ satisfy the usual eikonal initial value problems

\[
\begin{aligned}
\partial_t\vartheta_1 + p_1(x, \partial_x\vartheta_1) &= 0, & \vartheta_1|_{t=0} &= x\xi,\\
\partial_s\vartheta_2 + p_2(y, \partial_y\vartheta_2) &= 0, & \vartheta_2|_{s=0} &= y\eta.
\end{aligned}
\tag{2.7}
\]

From the equations in (2.7) one easily derives the following Taylor expansions for $\vartheta_1(x, \xi, t)$ (resp. $\vartheta_2(y, \eta, s)$) centered at $t = 0$ (resp. $s = 0$):

\[
\vartheta_1(x, \xi, t) = x\xi - t\,p_1(x, \xi) + O(t^2),
\tag{2.8}
\]
\[
\vartheta_2(y, \eta, s) = y\eta - s\,p_2(y, \eta) + O(s^2).
\tag{2.9}
\]
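The step from (2.7) to (2.8) is a first-order Taylor expansion in $t$. A brief sketch, using only the initial condition and the eikonal equation evaluated at $t = 0$ (the expansion for $\vartheta_2$ in (2.9) is identical with $(y, \eta, s, p_2)$ in place of $(x, \xi, t, p_1)$):

```latex
\begin{align*}
\vartheta_1(x,\xi,t)
  &= \vartheta_1(x,\xi,0) + t\,\partial_t\vartheta_1(x,\xi,0) + O(t^2) \\
  &= x\xi - t\,p_1\big(x,\partial_x\vartheta_1(x,\xi,0)\big) + O(t^2)
     && \text{(eikonal equation at } t=0\text{)} \\
  &= x\xi - t\,p_1(x,\xi) + O(t^2)
     && \text{(since } \partial_x(x\xi)=\xi\text{)} .
\end{align*}
```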

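In (2.4) the amplitude is of the form

\[
a_\chi(x, y, \xi, \eta, s, t; \hbar) = \chi(E_1 - p_1(x, \xi))\,\chi(E - p_2(y, \eta))\,\chi(y - x)\,a(x, y, \eta, \xi, s, t; \hbar),
\tag{2.10}
\]

where $a \sim \sum_{j=0}^\infty a_j\hbar^j$, $a_0 \ge \frac{1}{C_0} > 0$ and $\chi \in C_0^\infty(\mathbb{R})$ with $\chi(x) = 0$ for $|x| \ge 1$ and $\chi(x) = 1$ for $|x| \le 1/2$.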



Since the integral in (2.4) is absolutely convergent, we carry out the $(y, \eta)$-integration first and get that

\[
I_E(x; \hbar) = (2\pi\hbar)^{-4} \int \exp\big[i\big(t(E_1 - p_1)(x, \xi) + O(t^2)\big)/\hbar\big]\, \hat\rho(t)\, I(x, \xi, s, t; \hbar)\, ds\,d\xi\,dt + O(\hbar^\infty),
\tag{2.11}
\]

where

\[
I(x, \xi, s, t; \hbar) := \int e^{i\tilde\Phi(x,s;y,\eta)/\hbar}\, b_\chi(x, y, \xi, \eta, s, t; \hbar)\,\hat\rho(s)\,dy\,d\eta,
\tag{2.12}
\]

and where $b \in S^0_{cl}(1)$ with $b \sim \sum_j b_j\hbar^j$, $b_0 \ge 1/C_0 > 0$ and $b_\chi$ has the same properties as $a_\chi$ in (2.10). The phase function is

\[
\tilde\Phi(x, s; y, \eta) = \langle x - y, \xi - \eta\rangle + s(E - p_2(y, \eta)) + O_{y,\eta}(s^2).
\tag{2.13}
\]

Since $\det(\tilde\Phi''_{y,\eta}) = 1 + O(s)$ and the $s$-support of $b_\chi$ can be taken arbitrarily small, one can apply stationary phase (with parameters) in the $(y, \eta)$-variables in (2.12). The critical point equations for $(y, \eta)$ are

\[
\begin{aligned}
\eta &= \xi + s\,\partial_y p_2(y, \eta) + O(s^2),\\
y &= x + s\,\partial_\eta p_2(y, \eta) + O(s^2).
\end{aligned}
\tag{$*$}
\]
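The critical point equations and the Hessian determinant can be read off directly from (2.13). The sketch below writes $\tilde\Phi$ for the phase in (2.13); this label is an assumption of ours, since the original symbol does not survive in the extracted text.

```latex
\begin{align*}
\partial_y\tilde\Phi &= -(\xi-\eta) - s\,\partial_y p_2(y,\eta) + O(s^2) = 0
  &&\Longrightarrow\quad \eta = \xi + s\,\partial_y p_2(y,\eta) + O(s^2),\\
\partial_\eta\tilde\Phi &= -(x-y) - s\,\partial_\eta p_2(y,\eta) + O(s^2) = 0
  &&\Longrightarrow\quad y = x + s\,\partial_\eta p_2(y,\eta) + O(s^2),\\[4pt]
\tilde\Phi''_{y,\eta} &=
  \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix} + O(s),
  &&\phantom{\Longrightarrow}\quad
  \det\tilde\Phi''_{y,\eta} = (-1)^2 + O(s) = 1 + O(s).
\end{align*}
```

Here $I_2$ is the $2\times 2$ identity, since $y, \eta \in \mathbb{R}^2$ on a surface; the Hessian is therefore non-degenerate for $|s|$ small, which is what the stationary phase step requires.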

By a straightforward computation, $I_E(x; \hbar)$ equals

\[
(2\pi\hbar)^{-2} \int \exp\big[i\big(t(E_1 - p_1)(x, \xi) + s(E - p_2)(x, \xi) + O_{x,\xi}(s^2) + O_{x,\xi}(t^2)\big)/\hbar\big]\, c(x, \xi, s, t; \hbar)\,d\xi\,dt\,ds + O(\hbar^\infty),
\tag{2.14}
\]

where $c \in S^0_{cl}(1)$ with $c(x, \xi, s, t; \hbar) \sim \sum_{j=0}^\infty c_j(x, \xi, s, t)\,\hbar^j$, where the $c_j \in C_0^\infty$. Next, we make a polar variables decomposition in the $\xi$-variables in (2.14), which is legitimate since by assumption, $p_1$ is real principal type on the energy shell $p_1^{-1}(E_1)$ and so, $|\partial_\xi p_1| \ge \frac{1}{C} > 0$ when $p_1 \sim E_1$ and $\mathrm{supp}\, c \subset [-\epsilon, \epsilon]^2 \times p_1^{-1}[E_1 - \epsilon, E_1 + \epsilon]$. We note that in the case of a Schrödinger operator, $p_1(x, \xi) = |\xi|_g^2 + V(x)$ and so, $\xi\cdot\partial_\xi p_1 = 2|\xi|_g^2$ by Euler homogeneity. So, as long as

\[
\gamma \cap \{x \in M;\ V(x) = E_1\} = \emptyset,
\tag{$*$}
\]

the condition $\partial_\xi p_1(x, \xi) \neq 0$ is satisfied for $(x, \xi) \in p_1^{-1}(E_1) \cap \pi^{-1}(\gamma)$. In the homogeneous case, where $p_1 = |\xi|_g$ and $E_1 = 1$, the condition $(*)$ is automatically satisfied. Since by assumption $\partial_\xi p_1 \neq 0$ near $p_1^{-1}(E_1)$ we can choose $p_1$ as a local coordinate on $\pi^{-1}(x)$ near $(x, \xi_0) \in p_1^{-1}(E_1)$. Then, we put $p_1 = rE_1$ and extend it to a local polar coordinate system $(r, \omega) : \pi^{-1}(x) \to \mathbb{R}^2$ near $(x, \xi_0) \in p_1^{-1}(E_1)$. Cover a neighbourhood of $\pi^{-1}(x) \cap p_1^{-1}(E_1)$ by small open sets and choose a partition of unity subordinate to the covering. Then, make the change of variables $(p_1, \omega) \to \xi$ in each open set and sum over the partition to get

\[
I_E(x; \hbar) = (2\pi\hbar)^{-2} \int \exp\big(i\big[tE_1(1 - r) + s(E - p_2(x, r\omega)) + O_{x,\xi}(s^2) + O_{x,\xi}(t^2)\big]/\hbar\big)\, c(x, r\omega, s, t; \hbar)\, r\,dr\,d\omega\,dt\,ds + O(\hbar^\infty),
\tag{2.15}
\]


where $\omega \in p_1^{-1}(E_1) \cap \pi^{-1}(x)$ is a (generalized) angle variable and $d\omega_x$ denotes Liouville measure on $p_1^{-1}(E_1) \cap \pi^{-1}(x)$. In the following and in (2.15) above, we have suppressed the dependence of $d\omega_x$ on $x \in M$. One final application of stationary phase in the $(r, t)$-variables in (2.15) gives

\[
I_E(x; \hbar) = (2\pi\hbar)^{-1} \int \exp\big(i\big[s(E - p_2(x, \omega)) + O(s^2)\big]/\hbar\big)\, \tilde c(x, \omega, s; \hbar)\,d\omega\,ds + O(\hbar^\infty),
\tag{2.16}
\]

where $\tilde c \in S^0_{cl}(1)$. The remainder of the proof of Theorem 1 involves integrating the restriction of $I_E(x; \hbar)$ in (2.16) to $x = x(\tau) \in \gamma$ and then carrying out a detailed analysis of the result under the generic Morse condition in Definition 1.1. From (2.16),

\[
\int_a^b I_E(x(\tau); \hbar)\,d\tau = (2\pi\hbar)^{-1} \int \exp\big(i\big[s(E - p_2(x(\tau), \omega)) + O(s^2)\big]/\hbar\big)\, \tilde c(x(\tau), \omega, s; \hbar)\,d\omega\,d\tau\,ds + O(\hbar^\infty)
\tag{2.17}
\]
\[
= (2\pi\hbar)^{-1} \int e^{isE/\hbar}\, I_\gamma(s; \hbar)\,ds + O(\hbar^\infty).
\tag{2.18}
\]

In (2.17) it is useful to absorb the $O(s^2)$-term into the $p_2$-term in the phase and write $p_2(x(\tau), \omega; s) := p_2(x(\tau), \omega) + O(s)$, uniformly in $(\tau, \omega) \in [a, b] \times (\delta_1, \delta_2)$; $\delta_j \in \mathbb{R}$, $j = 1, 2$. Also, an application of Fubini ensures that the $s$-integral can be carried out last and this shows that the result is uniform in the energy values $E$. By carrying out the $s$-integration last, $E$ will always appear in a harmless, linear fashion in the phase only. So, for $\hbar \in (0, \hbar_0]$ it remains to estimate the integral:

\[
I_\gamma(s; \hbar) = \int_a^b \int_{\delta_1}^{\delta_2} e^{-isp_2(x(\tau),\omega;s)/\hbar}\, \tilde c(x(\tau), \omega, s; \hbar)\,d\omega\,d\tau.
\tag{2.19}
\]

Because of the Morse assumption in Definition 1.1, the $(\omega, \tau)$-critical points of $p_2(x(\tau), \omega)$ are isolated and so, without loss of generality, we assume that there is a single critical point at $(\tau_0, \omega_0)$. Let $B_\delta \subset [a, b] \times (\delta_1, \delta_2)$ be a small $\delta$-ball centered at $(\tau_0, \omega_0)$ and $\chi_0 \in C_0^\infty(B_\delta)$ with $\chi_0 = 1$ in $B_{\delta/2}$. We then choose $\chi_1 \in C^\infty$ so that $\chi_0 + \chi_1 = 1$ and split up the integral

\[
\begin{aligned}
I_\gamma(s; \hbar) &= \int e^{-isp_2(x(\tau),\omega;s)/\hbar}\, \tilde c(x(\tau), \omega, s; \hbar)\,\chi_0(\tau, \omega)\,d\omega\,d\tau
+ \int e^{-isp_2(x(\tau),\omega;s)/\hbar}\, \tilde c(x(\tau), \omega, s; \hbar)\,\chi_1(\tau, \omega)\,d\omega\,d\tau \\
&=: I_\gamma^{(0)}(s; \hbar) + I_\gamma^{(1)}(s; \hbar).
\end{aligned}
\tag{2.20}
\]

First, we deal with the second integral $I_\gamma^{(1)}(s; \hbar)$ on the RHS of (2.20): For $(\tau, \omega) \in \mathrm{supp}\,\chi_1$, we have that for $|s|$ sufficiently small,

\[
\max\{\,|\partial_\tau p_2(x(\tau), \omega; s)|,\ |\partial_\omega p_2(x(\tau), \omega; s)|\,\} \ge \frac{1}{C_0} > 0.
\tag{2.21}
\]



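By the implicit function theorem, in the case where $|\partial_\omega p_2(x(\tau), \omega; s)| \ge \frac{1}{C_0}$, one can make a local change of variables $(\tau, \omega) \to (\tau, p_2(x(\tau), \omega; s))$ in $I_\gamma^{(1)}(s; \hbar)$. Alternatively, when $|\partial_\tau p_2(x(\tau), \omega; s)| \ge \frac{1}{C_0}$, one can make the change of variables $(\tau, \omega) \to (p_2(x(\tau), \omega; s), \omega)$. So, in either case after making a change of variables, one gets

\[
(2\pi\hbar)^{-1} \int e^{iEs/\hbar}\, I_\gamma^{(1)}(s; \hbar)\,ds = (2\pi\hbar)^{-1} \int e^{is(E-\theta)/\hbar}\, \tilde c_1(s, \theta, v; \hbar)\,d\theta\,dv\,ds,
\tag{2.22}
\]

where, again, $\tilde c_1 \in S^0_{cl}(1)$ with compact support in all variables. Finally, another application of stationary phase in the $(s, \theta)$-variables gives

\[
(2\pi\hbar)^{-1} \int e^{iEs/\hbar}\, I_\gamma^{(1)}(s; \hbar)\,ds = O(1).
\tag{2.23}
\]

Moreover, the $O(1)$-bound on the RHS in (2.23) is clearly uniform in $E$. We now deal with $I_\gamma^{(0)}(s; \hbar)$. The Morse assumption and implicit function theorem imply that the critical point equations

\[
\partial_\tau p_2(x(\tau), \omega; s) = 0, \qquad \partial_\omega p_2(x(\tau), \omega; s) = 0, \qquad \tau(0) = \tau_0,\ \omega(0) = \omega_0
\]

have unique local solutions $\tau(s)$ and $\omega(s)$ which are smooth for $|s| \le \frac{1}{C}$ with $C > 0$ sufficiently large. We apply stationary phase in $(\tau, \omega)$ to expand the first integral $I_\gamma^{(0)}(s; \hbar)$ on the RHS of (2.20). First, we split up the domain of $s$-integration and write

\[
\begin{aligned}
\int e^{-isp_2(x(\tau),\omega;s)/\hbar}\,\chi_0(\tau, \omega)\,\tilde c(x(\tau), \omega, s; \hbar)\,d\tau\,d\omega
&= \mathbf{1}_{|s|\le\hbar} \int e^{-isp_2(x(\tau),\omega;s)/\hbar}\,\chi_0(\tau, \omega)\,\tilde c(x(\tau), \omega, s; \hbar)\,d\tau\,d\omega \\
&\quad + \mathbf{1}_{|s|\ge\hbar} \int e^{-isp_2(x(\tau),\omega;s)/\hbar}\,\chi_0(\tau, \omega)\,\tilde c(x(\tau), \omega, s; \hbar)\,d\tau\,d\omega.
\end{aligned}
\tag{2.24}
\]

Clearly,

\[
(2\pi\hbar)^{-1} \int_{|s|\le\hbar} e^{iEs/\hbar}\, I_\gamma^{(0)}(s; \hbar)\,ds = O(1).
\tag{2.25}
\]

An application of stationary phase with parameters ([Ho] Theorem 7.7.5) in the second integral gives

\[
\mathbf{1}_{|s|\ge\hbar}\, I_\gamma^{(0)}(s; \hbar) = \hbar\, s^{-1}\, \tilde c_0(x(\tau(s)), \omega(s), s)\, \mathbf{1}_{|s|\ge\hbar}\, \exp\big[-isp_2(x(\tau(s)), \omega(s); s)/\hbar\big] + O(|s|^{-2}\hbar^2).
\tag{2.26}
\]

So, integrating (2.26) over $\{s;\ 1 \ge |s| \ge \hbar\}$ gives

\[
\begin{aligned}
\Big| (2\pi\hbar)^{-1} \int_{1\ge|s|\ge\hbar} e^{iEs/\hbar}\, I_\gamma^{(0)}(s; \hbar)\,ds \Big|
&\le C_1\hbar^{-1} \int_{1\ge|s|\ge\hbar} \frac{\hbar}{|s|}\,ds + C_2\hbar^{-1} \int_{1\ge|s|\ge\hbar} \frac{\hbar^2}{s^2}\,ds \\
&= O(|\log\hbar|) + O(1) = O(|\log\hbar|).
\end{aligned}
\tag{2.27}
\]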
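The two integrals behind (2.27) can be evaluated in closed form. The following sketch assumes $0 < \hbar < 1$ and reads the symbol lost in extraction as the semiclassical parameter $\hbar$:

```latex
\begin{align*}
\hbar^{-1}\int_{\hbar\le|s|\le 1}\frac{\hbar}{|s|}\,ds
  &= 2\int_{\hbar}^{1}\frac{ds}{s} = 2\log\frac{1}{\hbar} = 2\,|\log\hbar| ,\\
\hbar^{-1}\int_{\hbar\le|s|\le 1}\frac{\hbar^{2}}{s^{2}}\,ds
  &= 2\hbar\int_{\hbar}^{1}\frac{ds}{s^{2}} = 2\hbar\Big(\frac{1}{\hbar}-1\Big) = 2(1-\hbar)\le 2 .
\end{align*}
```

The logarithm therefore comes entirely from the leading $\hbar\,s^{-1}$ term of the stationary phase expansion (2.26), while the $O(|s|^{-2}\hbar^2)$ remainder contributes only $O(1)$.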


Combining (2.23), (2.25) and (2.27) and using the fact that each of these estimates is uniform in $E$ implies that for $\hbar \in (0, \hbar_0]$,

\[
\sup_{\{E;\ (E_1,E)\in P(T^*M)\}} \int_a^b I_E(x(\tau); \hbar)\,d\tau = O(|\log\hbar|).
\]

Rewriting the last estimate, we have proved that

\[
\sup_{\{E;\ (E_1,E)\in P(T^*M)\}} \sum_j \rho\big(\hbar^{-1}[\lambda_j^{(1)}(\hbar) - E_1]\big)\,\rho\big(\hbar^{-1}[\lambda_j^{(2)}(\hbar) - E]\big) \times \int_\gamma |\varphi_j|^2\,ds = O(|\log\hbar|).
\tag{2.28}
\]

We claim that there exists a constant $C_2 > 0$ (independent of $\hbar \in (0, \hbar_0]$ and the joint eigenfunctions $\varphi_j$) such that for any $\hbar \in (0, \hbar_0]$ and $(\lambda_j^{(1)}(\hbar), \lambda_j^{(2)}(\hbar)) \in \mathrm{Spec}(P_1(\hbar), P_2(\hbar))$ with $|\lambda_j^{(1)}(\hbar) - E_1| \le C_1\hbar$,

\[
\inf_E |\lambda_j^{(2)}(\hbar) - E| \le C_2\hbar.
\tag{$*$}
\]

To see this, we argue by contradiction: Assume that (∗) does not hold. Then there + exists a sequence (m )∞ m=1 with m → 0 as m → ∞ for which (∗) is violated. Let ∞ ∈ (m )m=1 and treat > 0 as an adiabatic parameter. Consider the -pseudodiffer(1) (2) ential operator P() := −2 [ P1 () − λ j ()]2 + −2 [P2 () − λ j ()]2 and consider (k)

(k)

pk, := −2 ( pk − λ j () )2 and Pk, := −2 [Pk () − λ j () ]2 ; k = 1, 2. Then, our assumption implies that for any ∈ (m )∞ m=1 , p2, (x, ξ ) ≥ C22 when p1, (x, ξ ) ≤ C12 , and thus P() = P1, + P2, is -elliptic. One then constructs an -pseudodifferential parametrix Q() with Q()P() = I d + O( ∞ ) L 2 →L 2 . Applying Q() to both sides of the equation P()ϕ () j = 0 implies that ∞ ϕ () j L 2 = O( ).

(2.29)

But since the adiabatic parameter can be taken arbitrarily small, (2.29) contradicts the fact that all joint eigenfunctions are $L^2$-normalized. So, after possibly rescaling $\rho$ and using that $\rho \ge 0$ with $\rho(0) = 1$, it follows from $(*)$ that there exists a constant $C_3 > 0$ (independent of $j$ and $\hbar$) such that for all $j \ge 1$ and $\hbar \in (0, \hbar_0]$,

\[
\sup_{\{E;\ (E_1,E)\in P(T^*M),\ |E_1 - \lambda_j^{(1)}(\hbar)| \le C_1\hbar\}} \rho\big(\hbar^{-1}[\lambda_j^{(2)}(\hbar) - E]\big) \ge C_3 > 0.
\]

Since the sum on the LHS of (2.28) has non-negative terms, by restricting to $\{j;\ |\lambda_j^{(1)}(\hbar) - E_1| \le C_1\hbar\}$ and (after possibly rescaling $\rho$) using that $\rho(\hbar^{-1}[\lambda_j^{(1)}(\hbar) - E_1]) \ge C_5 > 0$ for these eigenvalues, one finally gets that

\[
\int_\gamma |\varphi_j|^2\,ds = O_{|\gamma|}(|\log\hbar|) \qquad \text{for } j \in \{j;\ |\lambda_j^{(1)}(\hbar) - E_1| = O(\hbar)\},
\]

uniformly in $E$. This finishes the proof of Theorem 1.

Remark 2.5. The sup bound in Theorem 1 is also uniform in the energy parameter, $E_1$. However, for different values of $E_1$ one needs to excise different subvarieties of $M$ (which depend on $E_1$) to ensure that $p_1$ is real principal type on $p_1^{-1}(E_1)$. For example, in the case where $p_1 = |\xi|_g^2 + V(x)$, assumption (A1) requires that $\gamma \cap \{x \in M;\ V(x) = E_1\} = \emptyset$.

3. Non-Generic Curves

In this section, we turn to the proof of Theorem 3. In contrast to Theorem 1, this result deals with the $L^2$-restriction bounds of joint eigenfunctions of $P_1(\hbar)$ and $P_2(\hbar)$ with $\hbar$-microsupports along singular orbits of the joint bicharacteristic flow of $H_{p_1}$ and $H_{p_2}$. We show that, up to $\log\hbar$-factors, the maximal $L^2$-restriction bound in [BGT] is attained along the base projections of these orbits. In the special homogeneous case, these projections are certain (exceptional) geodesics. For example, as we discuss in Sect. 4, in the case of surfaces of revolution, the equator is such an exceptional geodesic. Just as in the previous section, one is reduced to estimating the integral $I_\gamma(s; \hbar)$. However, unlike the generic case, the phase function $\Psi(\tau, \omega) \in C^\infty(C_\gamma)$ will now have degenerate critical points and we make a change of variables to classical Birkhoff normal form along these singular orbits to compute the asymptotics.

3.1. Orbits of the joint flow, $\Phi_t$. Here, we describe an important class of exceptional curves, $\gamma$, which do not satisfy the Morse assumption in Definition 1.1. As we have already pointed out in the Introduction, it is not difficult to see that in the homogeneous case, geodesics are distinguished as far as $L^p$-restriction bounds are concerned (see for example [BGT]). In the QCI case, the same is true for the bicharacteristics of general inhomogeneous Hamiltonians. Moreover, as we will show, the nature of bicharacteristics vis-a-vis the singular Lagrangian foliation of $T^*M$ also plays a very important role as far as restriction bounds are concerned.

First, we give a slightly different characterization of what it means for a curve $\gamma$ to be generic. This consists of a series of simple but useful geometric lemmas, the main result being Proposition 3.6. Fix a smooth curve $\gamma : [a, b] \to M$ and let $(\tau(0), \omega(0)) \in C_\gamma$ be any point on the cylinder. We define $\Psi : C_\gamma \to \mathbb{R}$ by $\Psi = \iota^* p_2$, where $\iota : C_\gamma \to T^*M$ is the standard inclusion map. So, in terms of the local coordinates $(\tau, \omega) : C_\gamma \to [a, b] \times (\delta_1, \delta_2)$, $\Psi(\tau, \omega) = p_2(x(\tau), \omega)$. Modulo an $O(s)$-error (which is negligible), this is the phase function in (2.19), in the integral $I_\gamma(s; \hbar)$.


The point $(\tau(0), \omega(0))$ is critical for $\Psi : C_\gamma \to \mathbb{R}$ if for every smooth curve segment $\mu(s) := \{(\tau(s), \omega(s)) \in C_\gamma;\ s \in (-\epsilon, \epsilon)\}$ passing through the initial point,

\[
\frac{\partial}{\partial s}\Psi(\tau(s), \omega(s))\Big|_{s=0} = 0.
\tag{3.30}
\]

Since $\Psi(\tau(s), \omega(s)) = p_2(\tau(s), \omega(s))$, writing (3.30) out explicitly and applying the chain rule gives:

\[
\partial_x p_2\cdot\partial_s\tau|_{s=0} + \partial_\xi p_2\cdot\partial_s\omega|_{s=0} = 0.
\tag{3.31}
\]

On the other hand, differentiating the defining equation $p_1(\tau(s), \omega(s)) = 1$ in $s$ gives

\[
\partial_x p_1\cdot\partial_s\tau|_{s=0} + \partial_\xi p_1\cdot\partial_s\omega|_{s=0} = 0.
\tag{3.32}
\]

The following lemma is an immediate consequence of (3.31) and (3.32).

Lemma 4. A point $z_0 = (\tau(0), \omega(0)) \in C_\gamma$ is critical for $\Psi : C_\gamma \to \mathbb{R}$ if and only if $T_{z_0}C_\gamma \subset \ker(dp_1)(z_0) \cap \ker(dp_2)(z_0)$.

The following simple geometric result is central to our proof of Theorem 3 since it describes the bicharacteristics that are non-generic.

Proposition 3.6. Let $\gamma \subset \pi(\tilde\gamma)$, where $\tilde\gamma$ is a joint orbit of $\exp t_j H_{p_j}$; $j = 1, 2$ through the point $z_0 \in C_\gamma$ with $\dim\tilde\gamma \ge 1$. Then, if $\tilde\gamma \subset C_\gamma$, the curve $\gamma$ is not generic.

Proof. First, the real principal type assumption combined with the implicit function theorem imply that $C_\gamma = \pi^{-1}(\gamma) \cap p_1^{-1}(E_1)$ is a smooth two-dimensional submanifold of $T^*M$. We split the analysis into two cases.

Case 1. When $\tilde\gamma$ is a two-dimensional Lagrangian torus, we have that locally $\tilde\gamma = p_1^{-1}(E_1) \cap p_2^{-1}(E)$ for some $E \in \mathbb{R}$. Since by assumption $\tilde\gamma \subset C_\gamma$, and both are two-manifolds, clearly $C_\gamma = \tilde\gamma$. Then, $C_\gamma$ is non-generic since $\Psi = p_2|_{C_\gamma} = E$ and so, all points $z \in C_\gamma$ are critical for $\Psi$.

Case 2. Here we assume that $\tilde\gamma$ is a singular joint orbit of dimension one (see Subsect. 3.2 below). Then, for all $z \in \tilde\gamma$, $dp_2(z) = \lambda(z)\cdot dp_1(z)$, for some $\lambda(z) \neq 0$. So, from Lemma 4, $z_0 \in \tilde\gamma$ is a critical point of $\Psi : C_\gamma \to \mathbb{R}$ if and only if $T_{z_0}C_\gamma \subset \ker(dp_1)(z_0)$. This inclusion is always satisfied since $C_\gamma \subset p_1^{-1}(E_1)$. As a result, all points $z \in \tilde\gamma$ along the one-dimensional orbit are critical for $\Psi$ and so the latter is not Morse.



3.2. Singular leaves of the Lagrangian foliation. Before taking up the proof of Theorem 3 we collect here some basic facts about the geometry of integrable systems and their singular sets. We refer the reader to [TZ3,VN2] for further details. Given the moment map $P = (p_1, p_2)$, the singular variety of the corresponding integrable system is defined to be the set $\Sigma_{sing} = \{(x, \xi) \in T^*M;\ dp_1 \wedge dp_2(x, \xi) = 0\}$. We now recall some elementary results about $\Sigma_{sing}$ which we will need later on. First, given the joint flow $\Phi_t : T^*M \to T^*M$ defined by $\Phi_t(x, \xi) = \exp t_1 H_{p_1} \circ \exp t_2 H_{p_2}(x, \xi)$; $t = (t_1, t_2) \in \mathbb{R}^2$, we observe that

\[
\Phi_t(\Sigma_{sing}) = \Sigma_{sing},
\tag{3.33}
\]

which follows immediately from the fact that $\{p_1, p_2\} = 0$ and $\Phi_t$ is a diffeomorphism. The singular set $\Sigma_{sing}$ consists of a union of orbits of the joint flow $\Phi_t$; $t \in \mathbb{R}^2$.

Definition. Following [TZ3], we say that an orbit of the joint flow $\Phi_t$ is singular if it is not Lagrangian; that is, if its dimension is $\le 1$.

3.2.1. Eliasson nondegeneracy. For our second main result (Theorem 3), we will need to make a non-degeneracy assumption on the integrable system with moment map $P = (p_1, p_2)$. We now give a brief description of this condition. For a more detailed treatment, see [VN1,VN2,TZ3]. Let $\mathfrak{p} = \mathbb{R}\{p_1, p_2\} \subset (C^\infty(T^*M - 0), \{\cdot,\cdot\})$ be the standard abelian subalgebra with Poisson bracket. Then, given a singular orbit $\exp t_1 H_{p_1} \circ \exp t_2 H_{p_2}(v)$ through a point $v \in P^{-1}(B_{sing})$ of rank $k \le 1$, we note that the Hessians $d_v^2 p_j$; $j = 1, 2$, determine an Abelian subalgebra $d_v^2\mathfrak{p} \subset S^2(K/L, \omega_v)^*$ of quadratic forms on the reduced symplectic subspace $K/L$, where we put

\[
K = \ker dp_1(v) \cap \ker dp_2(v), \qquad L = \mathrm{span}(H_{p_1}(v), H_{p_2}(v)).
\]

Definition. We say that the orbit through $v$ is Eliasson non-degenerate of rank $k \le 1$ if $d_v^2\mathfrak{p}$ is a Cartan subalgebra of $S^2(K/L, \omega_v)^*$.

Lemma 5. Assume that the integrable system with moment map $P = (p_1, p_2)$ is Eliasson non-degenerate. Then, $\Sigma_{sing}$ is a finite union of orbits of the joint flow, $\Phi_t$, with dimension $\le 1$. The latter are diffeomorphic to open intervals, circles and isolated points.

Proof. From (3.33) it follows that $\Sigma_{sing}$ is a finite union of joint orbits of the joint flow $\Phi_t$ and so has dimension $\le 2$. The Eliasson non-degeneracy condition (3.2.1) implies that $\dim\Sigma_{sing} \le 1$. As a result, $\Sigma_{sing}$ consists of a union of open intervals, circles and, in the inhomogeneous case, possibly a finite number of isolated points.

Remark. In the homogeneous case where $p_1(x, \xi) = |\xi|_g^2$, the singular orbits are necessarily topological intervals or circles since $P = (p_1, p_2)$ has no isolated critical points.


The Eliasson non-degeneracy assumption implies that $\Sigma_{\mathrm{sing}}$ is a finite union of singular orbits for $p_j$; $j = 1, 2$, and we use this in the next section to analyze the integral (2.19) by microlocalizing near these orbits and applying a classical Birkhoff normal form construction. We note that the crucial difference between the generic case and the case of a bicharacteristic which lifts to a singular joint orbit lies in the fact that, due to the invariance of the integral $p_2$, the computation of $I_\gamma(s; \hbar)$ can be reduced to a single fibre $\pi^{-1}(x) \cap p_1^{-1}(E_1)$ in the latter case. Thus, there is no additional cancellation coming from the computation of the $s$-integral and this is ultimately the reason why the $O(\hbar^{-1/2})$ $L^2$-restriction bound is saturated by these singular orbits.

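3.3. Microlocalization along $C_\gamma$. In the following, it is useful to split up the mapping cylinder $C_\gamma$ as follows:

\[ C_\gamma = C_\gamma^{\mathrm{reg}} \cup C_\gamma^{\mathrm{sing}}, \tag{3.34} \]

where $C_\gamma^{\mathrm{reg}}$ (resp. $C_\gamma^{\mathrm{sing}}$) denote invariant open neighbourhoods of regular (resp. singular) points of $\Psi(\tau, \omega)$. Let $\chi_{\mathrm{reg}}(\omega)$ (resp. $\chi_{\mathrm{sing}}(\omega)$) be a partition of unity subordinate to the corresponding covering of the parametrizing coordinate interval $(\delta_1, \delta_2)$ of a cross-section of the cylinder $C_\gamma$. We then write

\[ I_\gamma(s; \hbar) = \iint e^{-is\Psi(\tau,\omega)/\hbar}\, c(\omega, \tau, s; \hbar)\, \chi_{\mathrm{reg}}(\omega)\, d\omega\, d\tau + \iint e^{-is\Psi(\tau,\omega)/\hbar}\, c(\omega, \tau, s; \hbar)\, \chi_{\mathrm{sing}}(\omega)\, d\omega\, d\tau =: I_{\mathrm{reg}}(s; \hbar) + I_{\mathrm{sing}}(s; \hbar). \tag{3.35} \]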

First, we analyze the regular term on the RHS of (3.35).

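3.4. Analysis of the regular term. Given $\omega \in \mathrm{supp}\, \chi_{\mathrm{reg}}$, in light of the invariance formula (3.33) it easily follows that for all $\tau \in [a, b]$ and $\omega \in \mathrm{supp}\, \chi_{\mathrm{reg}}$, $d\Psi(\tau, \omega) = d(\iota^* p_2)(\tau, \omega) \neq 0$. Indeed, $\mathrm{rank}\,(dp_1, dp_2)(\tau, \omega) = 2$ for all $\omega \in \mathrm{supp}\, \chi_{\mathrm{reg}}$ and so, by Lagrange multipliers, the restriction $\Psi = \iota^* p_2 \in C^\infty(C_\gamma)$ satisfies $d\Psi(\tau, \omega) = d(\iota^* p_2)(\tau, \omega) \neq 0$ for all $(\tau, \omega) \in [a, b] \times \mathrm{supp}\, \chi_{\mathrm{reg}}$. But then one can introduce $\iota^* p_2$ as a new coordinate on $\mathrm{supp}\, \chi_{\mathrm{reg}}$, and so by the change of variables formula, for some $c \in S^0_{cl}(1)$,

\[ (2\pi\hbar)^{-1} \iint e^{i[sE - s\theta]/\hbar}\, c(s, \theta; \hbar)\, d\theta\, ds = O(1). \tag{3.36} \]

The last bound on the RHS of (3.36) follows by stationary phase in $(s, \theta)$ and the estimate is uniform for $E \in \pi_2(P(T^*M))$, where $\pi_2 : (E_1, E) \mapsto E$.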
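As a rough numerical illustration of the $O(1)$ bound in (3.36), the sketch below evaluates the double integral for a model amplitude. Everything specific in it is an assumption made for illustration and is not taken from the text: the Gaussian amplitude $c(s, \theta) = e^{-s^2/2} e^{-\theta^2}$, the value $E = 0.3$, the truncation of the integration domain, and the function names `regular_model` and `closed_form`. For this particular amplitude the integral can be computed in closed form; it stays bounded and tends to $c(0, E)$ as $\hbar \to 0^+$.

```python
import math

def regular_model(E, hbar, ds=0.02, dtheta=0.02, s_max=6.0, theta_max=5.0):
    """Riemann-sum approximation of
        (2*pi*hbar)^(-1) * iint exp(i*s*(E - theta)/hbar) * c(s, theta) dtheta ds
    for the hypothetical amplitude c(s, theta) = exp(-s^2/2) * exp(-theta^2).
    Only the cosine part is summed: c is even in s, so the sine part cancels."""
    n_s = int(round(2.0 * s_max / ds)) + 1
    n_t = int(round(2.0 * theta_max / dtheta)) + 1
    s_grid = [-s_max + i * ds for i in range(n_s)]
    # Gaussian factor in s, multiplied by the grid spacing once and for all.
    s_weight = [math.exp(-0.5 * s * s) * ds for s in s_grid]
    total = 0.0
    for j in range(n_t):
        theta = -theta_max + j * dtheta
        k = (E - theta) / hbar
        inner = sum(w * math.cos(k * s) for s, w in zip(s_grid, s_weight))
        total += inner * math.exp(-theta * theta) * dtheta
    return total / (2.0 * math.pi * hbar)

def closed_form(E, hbar):
    """Exact value of the same Gaussian model integral (a Gaussian convolution)."""
    v = 1.0 + 2.0 * hbar * hbar
    return math.exp(-E * E / v) / math.sqrt(v)

for hbar in (0.4, 0.2, 0.1):
    print(hbar, regular_model(0.3, hbar), closed_form(0.3, hbar))
```

The two printed columns should agree closely, and neither grows as `hbar` decreases, in line with the stationary phase bound.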



3.5. Analysis of the singular term. Here we assume that $\{(x(\tau), \omega);\ a \leq \tau \leq b\} \subset \mathrm{supp}\, \chi_{\mathrm{sing}}$. So, in particular, $(x(\tau), \omega)$ is contained in an arbitrarily small neighbourhood of $\tilde\gamma$ containing $(x(0), \omega(0))$, where $dp_1 \wedge dp_2(x(0), \omega(0)) = 0$. To deal with the second term in (3.35), it is useful to pass to a convergent singular Birkhoff normal form and write the phase function $\Psi(\tau, \omega)$ in (3.35) in normal coordinates. The analysis will be split into several cases depending on the nature of the singularity in the phase function $\Psi$.

3.5.1. Singular Birkhoff normal forms. First, we recall that the orbit $\Lambda = \bigcup_{(t_1, t_2) \in \mathbb{R}^2} \exp t_1 H_{p_1} \circ \exp t_2 H_{p_2}(x(0), \omega(0))$ of the joint flow is of dimension $\leq 1$. So, it is diffeomorphic to a union of intervals and circles and possibly a finite number of (necessarily isolated) critical points. Since the latter case of fixed points is handled very similarly to the case of 1-D orbits, we only consider here restriction bounds along curves. The literature on general classical (and quantum) Birkhoff normal forms is extensive [G1,G2,ISZ,Z1,Z2] and we focus here on the integrable case where the canonical change of variables to normal form is actually convergent [CP,HS,T2,TZ3,MVN,VN2]. Without loss of generality, we assume here that the singular locus $\tilde\gamma = \Phi_t(x(0), \omega(0))$ consists of a bicharacteristic. Whether or not $\gamma$ is the projection of a periodic bicharacteristic is of no consequence here. Since we are considering the case where $n = 2$, there are only two possibilities: $\gamma$ is either stable (elliptic) or unstable (hyperbolic).

3.5.2. Stable case. Let $\gamma$ be a non-degenerate, stable bicharacteristic in the singular locus $P^{-1}(b) \subset P^{-1}(B_{\mathrm{sing}})$. Let $(x, t) : M \to \mathbb{R}^2$ be Fermi coordinates along $\gamma$ centered at the point $x_0 \in \gamma$. We choose the $t$-coordinate to run along the geodesic and the $x$-coordinate is transversal.
By possibly replacing $p_1$ and $p_2$ by appropriate functions $f_j(p_1, p_2)$; $j = 1, 2$ (the corresponding operators $f_j(P_1(\hbar), P_2(\hbar))$; $j = 1, 2$ have the same joint eigenfunctions), one can assume that

\[ p_j(x, t, \xi, \sigma) = b_j + \delta_j(\sigma) + \omega_j(\sigma)(x^2 + \xi^2) + O_{t,\sigma}(|x, \xi|^3); \quad j = 1, 2, \]

where $\delta_j(0) = 0$, $\omega_j(0) \neq 0$; $j = 1, 2$ and $\omega_j, \delta_j$ are locally-defined smooth functions near $\sigma = 0$. In the following, $B_\delta^*(0, 0) := \{(x, \xi);\ x^2 + \xi^2 < \delta^2\}$ and $B_\delta^*([a, b]) := \{(t, \sigma);\ |\sigma| < \delta,\ a \leq t \leq b\}$. In this case [TZ3,VN1,VN2], there exists a canonical map

\[ \kappa : U_\gamma \longrightarrow B_\delta^*(0, 0) \times B_\delta^*([a, b]), \]

where $U_\gamma \supset C_\gamma$ is a small neighbourhood of $C_\gamma$ such that in local coordinates, $\kappa : (x, t; \xi, \sigma) \mapsto (x', t'; \xi', \sigma')$, with

\[ (\kappa^{-1})^* p_j(x', t'; \xi', \sigma') = F_j(x'^2 + \xi'^2, \sigma'); \quad j = 1, 2. \tag{3.37} \]

Here, $\delta > 0$ is a sufficiently small tube radius and $F_j \in C^\infty(B_\delta(0) \times B_\delta(0))$. In (3.37) and in the following, we abuse notation somewhat and write $\kappa$ for both the canonical mapping and its various coordinate representations. By possibly replacing the classical integrals $p_j$; $j = 1, 2$ by $f_k(p_1, p_2)$; $k = 1, 2$ with appropriate $f_k \in C^\infty$, without loss of generality, we can assume that

\[ F_j(u, v) = b_j + \beta_j(v) + \alpha_j(v) u + O_v(u^2); \quad j = 1, 2. \]

Moreover, one can take here $\beta_j(v) = v + O(v^2)$. The Eliasson non-degeneracy condition says that for all $v \in B_\delta(0)$, $\alpha_1(v) \neq \alpha_2(v)$ with $\min_{v \in B_\delta(0)} \{|\alpha_1(v)|, |\alpha_2(v)|\} \geq C_1 > 0$. We need to compute the asymptotics of the RHS of (3.35). Without loss of generality, one can assume that $x_0 \in M$ is an interior point of the segment $\gamma$ and so $a < 0$, $b > 0$. Consider the integral

\[ I_{\mathrm{sing}}(s; \hbar) = \iint e^{-is\Psi(t,\omega)/\hbar}\, \chi_{\mathrm{sing}}(\omega)\, c(\omega, t; s)\, d\omega\, dt. \tag{3.38} \]

To make the change of variables in (3.38) to Birkhoff coordinates $(x', t'; \xi', \sigma') \in B_\delta^*(0, 0) \times B_\delta^*([a, b])$, we use that $x'(x, t; \xi, \sigma) = x + O(x^2)$, $t'(x, t; \xi, \sigma) = t + O(x)$, and so, from the expansion in (3.37), in terms of local coordinates,

\[ \kappa(p_1^{-1}(E_1) \cap \pi^{-1}(\gamma)) \cap (B_\delta^*(0) \times B_\delta^*([a, b])) = \{(x', t'; \xi', \sigma') \in B_\delta^*(0) \times B_\delta^*([a, b]);\ x' = 0,\ \sigma' + \alpha_1(\sigma')\xi'^2 + O_{\sigma'}(\xi'^4) + O(\sigma'^2) = 0\}. \tag{3.39} \]

To simplify the writing, from now on we drop the primes in the Birkhoff coordinates and put $(b_1, b_2) = (E_1, E)$. Then, since

\[ \frac{\partial}{\partial\sigma}\left( \sigma + \alpha_1(\sigma)\xi^2 + O(\sigma^2) + O_\sigma(\xi^4) \right) = 1 + O(\sigma) + O(\xi^2), \]

it follows from the implicit function theorem that one can use $(\sigma, t) \in B_\delta(0) \times [a, b]$ as local parametrizing coordinates on $\kappa(p_1^{-1}(E_1) \cap \pi^{-1}(\gamma)) = \kappa(C_\gamma)$. Substitution of the solution $\xi^2(\sigma)$ of the defining equation in (3.39) into the formula for $\kappa^* p_2$ gives

\[ \Psi(\sigma) = E + \beta_2(\sigma) + \alpha_2(\sigma)\xi^2(\sigma) + O(\xi^4(\sigma)) = E + \sigma - \sigma\,\frac{\alpha_2(\sigma)}{\alpha_1(\sigma)} + O(\sigma^2) = E + \left( 1 - \frac{\alpha_2(\sigma)}{\alpha_1(\sigma)} \right)\sigma + O(\sigma^2). \tag{3.40} \]

Here we have used that $\beta_2(\sigma) = \sigma + O(\sigma^2)$ with $\beta_2' \neq 0$. Next we compute the induced measure $d\omega$ in terms of the Birkhoff coordinates. Let $\Omega$ denote the canonical 2-form locally given by $dx \wedge d\xi + dt \wedge d\sigma$. Since $\kappa$ is canonical, locally the Lebesgue measure $(\kappa^*\Omega)^2 = \Omega^2 = dx\, dt\, d\xi\, d\sigma$. The induced arc-length (i.e. Liouville measure) $d\omega$ satisfies

\[ \kappa_*\, d\omega\, dt = i^*\, d\sigma\, dt. \tag{3.41} \]



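In (3.41), $i : \gamma \times \mathrm{supp}\, \chi_1 \to \kappa(C_\gamma)$ is the local parametrization given by $i(t, \sigma) = (0, \xi(\sigma), t, \sigma)$, where $\xi(\sigma)$ satisfies the identity in (3.39). Choosing $(\sigma, t)$ as local coordinates on $\mathrm{supp}\, \chi_{\mathrm{sing}} \times \gamma$, by a straightforward computation,

\[ \kappa_*\, d\omega\, dt = f(\sigma)\, |\sigma|^{-1/2}\, d\sigma\, dt, \tag{3.42} \]

where, depending on the sign of $\alpha_1$, either $f(\sigma) = f_+(\sigma) 1_{[0,\infty)}$ or $f(\sigma) = f_-(\sigma) 1_{(-\infty,0]}$ with $f_\pm \in C^\infty$ and $f_\pm(\sigma) \geq C_1 > 0$. Consequently, by a change of variables, in terms of the normal coordinates,

\[ (2\pi\hbar)^{-1} \int e^{isE/\hbar}\, I_{\mathrm{sing}}(s; \hbar)\, ds = (2\pi\hbar)^{-1} \iiint \exp\left( \frac{is}{\hbar}\left[ \left( 1 - \frac{\alpha_2(\sigma)}{\alpha_1(\sigma)} \right)\sigma + O(\sigma^2) \right] \right) c(s, |\sigma|^{1/2}, t; \hbar)\, \chi_{\mathrm{sing}}(\sigma)\, |\sigma|^{-1/2} f(\sigma)\, d\sigma\, dt\, ds, \tag{3.43} \]

where $c \in S^0_{cl}(1) \cap C_0^\infty$ is $\hbar$-elliptic on $\mathrm{supp}\, \chi_{\mathrm{sing}}$. Now, from the non-degeneracy of the integrable system, we use the fact that $\alpha_2(\sigma) \neq \alpha_1(\sigma)$ to make the change of variables

\[ \sigma \mapsto \left( 1 - \frac{\alpha_2(\sigma)}{\alpha_1(\sigma)} \right)\sigma + O(\sigma^2) \]

in the phase in (3.43) and integrate out the $t$-variable (note that the phase in (3.43) is independent of $t$). The result is that for some $T > 0$,

\[ (2\pi\hbar)^{-1} \int e^{isE/\hbar}\, I_{\mathrm{sing}}(s; \hbar)\, ds = (2\pi\hbar)^{-1} \int \int_{0 \leq \sigma \leq T} e^{is\sigma/\hbar}\, \tilde c(s, \sigma^{1/2}; \hbar)\, \sigma^{-1/2}\, d\sigma\, ds. \tag{3.44} \]

Here the amplitude $\tilde c$ has the same properties as $c$. Making a first-order Taylor expansion around $\sigma = 0$ we write $\tilde c(s, \sigma^{1/2}; \hbar) = \tilde c(s, 0; \hbar) + \sigma^{1/2} \cdot \delta\tilde c(s, \sigma^{1/2}; \hbar)$, where $\delta\tilde c(\cdot, \cdot; \hbar) \in S^0_{cl}(1)$ with compact support in the $s$-variable and both $\tilde c$ and $\delta\tilde c$ have standard symbolic expansions in $\hbar$ with $\delta\tilde c(s, \sigma^{1/2}; \hbar) = \delta\tilde c(s, \sigma^{1/2}) + O(\hbar)$ and $\tilde c(s, 0; \hbar) = \tilde c(s, 0) + O(\hbar)$. The integral in (3.43) splits into the sum

\[ (2\pi\hbar)^{-1} \int_{-\infty}^{\infty} \int_0^T e^{is\sigma/\hbar}\, \tilde c(s, 0)\, \sigma^{-1/2}\, d\sigma\, ds + (2\pi\hbar)^{-1} \int_{-\infty}^{\infty} \int_0^T e^{is\sigma/\hbar}\, \delta\tilde c(s, \sigma^{1/2})\, d\sigma\, ds + O(1) =: I^{(1)}_{\mathrm{sing}}(\hbar) + I^{(2)}_{\mathrm{sing}}(\hbar) + O(1). \tag{3.45} \]

Let $\mathcal{F}\varphi(\xi) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{-ix\xi} \varphi(x)\, dx$ be the usual Fourier transform. Then, by Fubini,

\[ I^{(2)}_{\mathrm{sing}}(\hbar) = (2\pi\hbar)^{-1} \int_0^T (\mathcal{F}\delta\tilde c)_{s \to \sigma}(\hbar^{-1}\sigma, \sigma^{1/2})\, d\sigma = O(1). \]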


On the other hand, again by Fubini, for the leading term

\[ I^{(1)}_{\mathrm{sing}}(\hbar) = (2\pi\hbar)^{-1} \int_0^T (\mathcal{F}\tilde c)_{s \to \sigma}(\hbar^{-1}\sigma, 0)\, \sigma^{-1/2}\, d\sigma \sim_{\hbar \to 0^+} c_\gamma \hbar^{-1/2}. \tag{3.46} \]

Again, the constant $c_\gamma > 0$ appearing on the RHS in (3.46) is uniform in $E$ with $(E_1, E) \in P(T^*M)$. Consequently,

\[ (2\pi\hbar)^{-1} \int e^{isE/\hbar}\, I_\gamma(s; \hbar)\, ds = I^{(1)}_{\mathrm{sing}}(\hbar) + I^{(2)}_{\mathrm{sing}}(\hbar) + I_{\mathrm{reg}}(\hbar) = c_\gamma \hbar^{-1/2} + O(1). \tag{3.47} \]

From (3.47) it follows that

\[ \sum_j \rho(\hbar^{-1}[\lambda_j^{(1)}(\hbar) - E_1])\, \rho(\hbar^{-1}[\lambda_j^{(2)}(\hbar) - E]) \int_\gamma |\varphi_j(s)|^2\, ds \sim_{\hbar \to 0^+} c_\gamma(E; \rho)\, \hbar^{-1/2}. \tag{3.48} \]

So, by taking the supremum over $E$ in (3.48), the upper bound in (ii) of Theorem 3 follows. The final part of the proof of Theorem 3 follows from the result of Toth and Zelditch [TZ3] which says that, unless $(M, g)$ is a flat torus, the bicharacteristic flow must have a singular orbit $\tilde\gamma$. But then, $\tilde\gamma \subset P^{-1}(E_1, E)$, where $(E_1, E) \in B_{\mathrm{sing}}$. In the case where $\tilde\gamma$ is stable, the existence of the subsequence of joint eigenfunctions follows from the joint trace formula

\[ \sum_j \rho(\hbar^{-1}[\lambda_j^{(1)}(\hbar) - E_1])\, \rho(\hbar^{-1}[\lambda_j^{(2)}(\hbar) - E]) \sim_{\hbar \to 0^+} c(E; \rho). \tag{3.49} \]

To see this, one simply argues by contradiction: Assume that for all eigenfunctions

\[ \int_\gamma |\varphi_j(s)|^2\, ds = o(\hbar^{-1/2}). \]

Then we bound the LHS in (3.48) by

\[ o(\hbar^{-1/2}) \times \sum_j \rho(\hbar^{-1}[\lambda_j^{(1)}(\hbar) - E_1])\, \rho(\hbar^{-1}[\lambda_j^{(2)}(\hbar) - E]) = o(\hbar^{-1/2}) \]

by the joint trace formula (3.49). This contradicts the asymptotic $\sim \hbar^{-1/2}$ on the RHS of (3.48).

3.5.3. Unstable case. In this case, the relevant canonical transformation to normal form is given by $\kappa : U_\gamma \to B_\delta^*(0) \times B_\delta^*([a, b])$, where

\[ \kappa^* p_j(x, t; \xi, \sigma) = F_j(\xi^2 - x^2, \sigma) = b_j + \beta_j(\sigma) + \alpha_j(\sigma)(\xi^2 - x^2) + O(|\xi^2 - x^2|^2); \quad j = 1, 2. \]

The computations follow in the same way as in the stable case by putting $x = 0$ and repeating the analysis in 3.5.2 with a few minor changes: In the unstable case [BPU,TZ3], the formula (3.49) gets replaced by

\[ \sum_j \rho(\hbar^{-1}[\lambda_j^{(1)}(\hbar) - E_1])\, \rho(\hbar^{-1}[\lambda_j^{(2)}(\hbar) - E]) \sim_{\hbar \to 0^+} c(E; \rho)\, |\log \hbar|. \]



As in the stable case, an argument by contradiction then proves the existence of a subsequence $\varphi_{j_k}$; $k = 1, 2, 3, \ldots$ satisfying

\[ \int_\gamma |\varphi_{j_k}(s)|^2\, ds \geq c_\gamma\, \hbar^{-1/2}\, |\log \hbar|^{-1} \]

for $\hbar \in (0, \hbar_0]$. This completes the proof of Theorem 3.
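The $\hbar^{-1/2}$ growth of the leading term (3.46) in the stable case can be checked numerically in a model situation. The sketch below is illustrative only: it assumes the Gaussian amplitude $\tilde c(s, 0) = e^{-s^2/2}$ (not taken from the text), carries out the $s$-integral analytically, and uses the hypothetical names `leading_term` and `C_MODEL`. For this amplitude, $\hbar^{1/2} I^{(1)}_{\mathrm{sing}}(\hbar)$ should approach $2^{-3/4}\Gamma(1/4)/\sqrt{2\pi}$.

```python
import math

def leading_term(hbar, T=1.0, n=20000):
    """Model version of the leading term:
        (2*pi*hbar)^(-1) * int_0^T FT(sigma/hbar) * sigma^(-1/2) dsigma,
    where FT(k) = sqrt(2*pi) * exp(-k^2/2) is the s-integral of
    exp(i*s*k) * exp(-s^2/2) for the assumed Gaussian amplitude.
    The substitution sigma = w^2 removes the sigma^(-1/2) singularity,
    since sigma^(-1/2) dsigma = 2 dw."""
    w_max = math.sqrt(T)
    dw = w_max / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * dw              # midpoint rule in w
        k = w * w / hbar
        total += math.sqrt(2.0 * math.pi) * math.exp(-0.5 * k * k) * 2.0 * dw
    return total / (2.0 * math.pi * hbar)

# Constant predicted for the Gaussian model by rescaling w = hbar^(1/2) * u.
C_MODEL = 2.0 ** (-0.75) * math.gamma(0.25) / math.sqrt(2.0 * math.pi)

for hbar in (0.1, 0.01, 0.001):
    print(hbar, math.sqrt(hbar) * leading_term(hbar), C_MODEL)
```

The rescaled values in the middle column should be essentially independent of `hbar`, which is the model counterpart of the $c_\gamma \hbar^{-1/2}$ asymptotics.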

4. The Example of a Convex Surface of Revolution

One can parametrize convex surfaces of revolution by using geodesic polar coordinates $(t, \varphi) \in (0, 1) \times [0, 2\pi]$ in terms of which $p_1(t, \varphi; \xi_t, \xi_\varphi) = \xi_t^2 + a^{-1}(t)\, \xi_\varphi^2$ and $p_2(t, \varphi, \xi_t, \xi_\varphi) = \xi_\varphi$, where the profile function satisfies $a(0) = a(1) = 0$ and $a(t)$ is a non-negative Morse function with a single non-degenerate maximum at $t = t_0 \in (0, 1)$. The level curve $t = t_0$ is the equator of the surface. Let $\gamma = \{(t, \varphi(t));\ 0 < a \leq t \leq b < 1\}$ with $\varphi \in C^\infty(0, \pi)$ be a curve segment on the surface. The computation of the phase function $\Psi$ in this case is easy. Clearly, $C_\gamma = \{(t, \varphi(t); \xi_t, \xi_\varphi);\ \xi_t^2 + a^{-1}(t)\xi_\varphi^2 = 1\}$, and away from the set $C_\gamma \cap \{(t, \varphi; \xi_t, \xi_\varphi);\ |\xi_\varphi| \leq \delta\}$ where $\delta > 0$ is sufficiently small, one can use $t \in (0, 1)$ and $\xi_t$ to parametrize $C_\gamma$. It is easy to see that $p_2 = \xi_\varphi$ restricted to $C_\gamma \cap \{(t, \varphi; \xi_t, \xi_\varphi);\ |\xi_\varphi| \leq \delta\}$ has no critical points, provided $\delta > 0$ is small enough. So, when $(t, \varphi; \xi_t, \xi_\varphi) \in C_\gamma \cap \{(t, \varphi; \xi_t, \xi_\varphi);\ |\xi_\varphi| \geq \delta\}$, it suffices to consider the function

\[ \Psi^2(t, \xi_t) = a(t)\,(1 - \xi_t^2); \quad t \in [a, b]. \]

Here and in the following we work with $\Psi^2$ rather than $\Psi$, noting that since $\Psi > 0$, the nature of the critical points is the same. The critical points are the solutions of $\partial_t \Psi^2 = a'(t)(1 - \xi_t^2) = 0$ and $\partial_{\xi_t} \Psi^2 = -2a(t)\xi_t = 0$. Since $t \in (0, 1)$, there is a single critical point with $\xi_t = 0$ and $a'(t) = 0$. This happens precisely when $t = t_0$. The end result is that the critical point of $\Psi$ is $(t_0, 0)$. We next compute the terms of the Hessian matrix at the critical point $(t_0, 0)$. The result is that $\partial_t^2 \Psi^2 = a''(t_0)$, $\partial_t \partial_{\xi_t} \Psi^2 = 0$ and $\partial_{\xi_t}^2 \Psi^2 = -2a(t_0)$. Consequently, we get that $\det(d^2\Psi^2)|_{(t_0, 0)} = -2a(t_0)\,a''(t_0) \neq 0$ by our Morse assumption on the profile function of the surface. It follows that the curve segment $\gamma = \{(t, \varphi(t));\ 0 < t < 1\}$ is generic. These curve segments are graphs over the meridian great circle. Next consider curves of the form $\gamma = \{(\theta(t), \varphi(t) = t);\ t \in [a, b]\}$


which are graphs over the equator. In this case, $\Psi^2(t, \xi_t) = a(\theta(t))(1 - \xi_t^2)$; $t \in [a, b]$. The critical points in this case are the solutions of $\partial_t \Psi^2 = a'(\theta(t)) \cdot \theta'(t)(1 - \xi_t^2) = 0$ and $\partial_{\xi_t} \Psi^2 = -2a(\theta(t))\xi_t = 0$. Since $a(\theta) > 0$ the second equation implies that $\xi_t = 0$ at critical points. The first equation implies $a'(\theta(t)) = 0$, or $\theta'(t) = 0$. For the Hessian, $\partial_t^2 \Psi^2 = [a''(\theta(t))|\theta'(t)|^2 + a'(\theta(t))\theta''(t)](1 - \xi_t^2)$, and $\partial_{\xi_t}^2 \Psi^2 = -2a(\theta(t))$. Also, clearly $\partial_t \partial_{\xi_t} \Psi^2 = 0$ at any critical point $(t_0, 0)$. In the case where $a'(\theta(t_0)) = 0$ at the critical point $(t_0, 0)$ one gets

\[ \det(d^2\Psi^2)|_{(t_0, 0)} = -2a(\theta(t_0))\, a''(\theta(t_0))\, |\theta'(t_0)|^2. \tag{$*$} \]

In the case where $\theta'(t_0) = 0$ at the critical point $(t_0, 0)$ one gets

\[ \det(d^2\Psi^2)|_{(t_0, 0)} = -2a(\theta(t_0))\, a'(\theta(t_0))\, \theta''(t_0). \tag{$**$} \]

The only way $(*)$ can vanish is if also $\theta'(t_0) = 0$, so that both $\theta'(t_0) = 0$ and $a'(\theta(t_0)) = 0$; that is, the curve $\gamma(t)$ is tangent to the equator at $t = t_0$. In the second case where $\theta'(t_0) = 0$ and $a'(\theta(t_0)) \neq 0$, the curve $\gamma(t)$ is tangent to another circle parallel to the equator. So, curves which are graphs over the equator are generic in the sense of Definition 1.1 provided $\theta'(t) \neq 0$. This condition is satisfied provided $\theta : [a, b] \to (0, \pi)$ is never tangent to a circle parallel to the equator. In particular, this rules out the cases where $\gamma$ includes pieces of the equator $z = 0$ or parallel circles $z = \mathrm{const}$. The equator is of course the (only) projection of a singular orbit. It is non-generic and in that case, Theorem 3 applies. The parallel circles $z = \mathrm{const}$ are caustics which are also necessarily non-generic since in the latter case there are joint eigenfunctions which blow up like $\sim \lambda^{1/6}$ in sup-norm along the curve.

To see what Theorem 1 means for a specific sequence of eigenfunctions, we consider the special case of the round sphere, where $\varphi_\lambda(x) = \lambda^{1/4}(x_1 + i x_2)^\lambda$ are the highest-weight spherical harmonics, where $\int_M |\varphi_\lambda|^2\, d\mathrm{Vol} \sim 1$, and where $\lambda = n$; $n = 1, 2, 3, \ldots$.
The above computations showed that all smooth curves $\gamma = \{(t, \varphi(t));\ t \in [a, b],\ \varphi'(t) \neq 0\}$ are generic. In terms of spherical coordinates, the restricted eigenfunction is $\varphi_\lambda(t) = \lambda^{1/4}[\cos t \cos \varphi(t) + i \cos t \sin \varphi(t)]^\lambda$, and so

\[ \int_\gamma |\varphi_\lambda|^2\, ds = \lambda^{1/2} \int_a^b (\cos t)^{2\lambda}\, dt. \]


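In the case where $a < 0$ and $b > 0$ (so that $\gamma$ intersects the equator), an application of steepest descent gives

\[ \lambda^{1/2} \int_a^b (\cos t)^{2\lambda}\, dt \sim_{\lambda \to \infty} c_\gamma = O(1). \]

Similarly, when $\gamma = \{(\theta(t), t);\ t \in [a, b]\}$ with $a < 0$, $b > 0$ and $|\theta'(t)| \geq \frac{1}{C} > 0$ one gets

\[ \lambda^{1/2} \int_a^b (\cos \theta(t))^{2\lambda}\, dt \sim_{\lambda \to \infty} \tilde c_\gamma = O(1). \]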
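The first of these asymptotics can be checked numerically. The following is a sketch only: the endpoints $a = -0.5$, $b = 0.7$ and the function name `restricted_mass` are illustrative assumptions, not values from the text. Laplace's method predicts the limit $\sqrt{\pi}$ when the segment crosses the equator $t = 0$, while a segment that stays away from the equator gives an exponentially small value.

```python
import math

def restricted_mass(lam, a=-0.5, b=0.7, n=20000):
    """Midpoint-rule approximation of lam**0.5 * int_a^b cos(t)**(2*lam) dt.
    The default endpoints are illustrative and keep cos(t) > 0 on [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        total += math.cos(t) ** (2 * lam)
    return math.sqrt(lam) * total * dt

# Segment crossing the equator: the values should approach sqrt(pi).
for lam in (10, 100, 1000):
    print(lam, restricted_mass(lam), math.sqrt(math.pi))

# Segment that misses the equator (0 < a < b): the value decays exponentially.
print(restricted_mass(1000, a=0.2, b=0.7))
```

In this model the restricted $L^2$ mass stays bounded as `lam` grows, which is the $O(1)$ behaviour stated above.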

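These bounds are consistent with (and slightly stronger than) the general $O(\log \lambda)$ bound given in Theorem 1.

4.1. Zonal harmonics. Let $x = (x_1, x_2)$ be geodesic normal coordinates on a convex surface of revolution centered at the north pole and $(r, \varphi)$ denote the corresponding polar variables. We consider zonal harmonics centered at the north pole which can be written as oscillatory integrals of the form $\varphi_\lambda(x) + \varphi_\lambda(-x)$ where

\[ \varphi_\lambda(x) = (2\pi\lambda)^{1/2} \int_{S^1} e^{i\lambda\langle x, \omega\rangle}\, a(x, \omega; \lambda)\, d\omega, \tag{4.50} \]

where $a(x, \omega; \lambda) \sim \sum_{j=0}^{\infty} a_j(x, \omega)\lambda^{-j}$ and $|a_0(x, \omega)| \geq C_1 > 0$ with $a_0(x, \omega) = 1 + O(|x|)$. The $\lambda^{1/2}$-factor in front of the terms in (4.50) ensures that $\int_M |\varphi_\lambda|^2\, dx = 1$. Consider the generic curve segment $\gamma = \{(r, \varphi = f(r));\ 0 \leq r \leq \pi\}$ written in geodesic polar variables corresponding to the $x_j$-coordinates. From (4.50) it follows that $\varphi_\lambda$ is radial and we get that in this case $\int_\gamma |\varphi_\lambda|^2\, ds$ equals

\[ \int_0^\pi |\varphi_\lambda(r)|^2\, dr = 2\pi\lambda \int_{r=0}^{\lambda^{-1}} \left| \int_{S^1} e^{i\lambda\langle x, \omega\rangle} a(x, \omega; \lambda)\, d\omega \right|^2 dr + 2\pi\lambda \int_{r=\lambda^{-1}}^{\pi} \left| \int_{S^1} e^{i\lambda\langle x, \omega\rangle} a(x, \omega; \lambda)\, d\omega \right|^2 dr = 2\pi\lambda \int_{r=\lambda^{-1}}^{\pi} \left| \int_{S^1} e^{i\lambda\langle x, \omega\rangle} a(x, \omega; \lambda)\, d\omega \right|^2 dr + O(1). \]

An application of stationary phase in the inner integral on the RHS of the last identity gives

\[ \int_0^\pi |\varphi_\lambda(r)|^2\, dr = 2\pi \int_{\lambda^{-1}}^{\pi} \left| r^{-1/2} e^{i\lambda r} \right|^2 dr + O(1) = 2\pi \int_{\lambda^{-1}}^{\pi} \frac{dr}{r} + O(1). \]

So it follows that

\[ \int_0^\pi |\varphi_\lambda(r)|^2\, dr = 2\pi \log \lambda + O(1), \tag{4.51} \]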

and this example shows that the upper bounds in Theorems 1 and 2 are sharp.

Acknowledgement. We thank Steve Zelditch for helpful comments and suggestions regarding an earlier version of the manuscript.


References

[A] Avakumović, G.V.: Über die eigenfunktionen auf geschlossenen riemannschen mannigfaltigkeiten. Math. Z. 65, 327–344 (1956)
[BGT] Burq, N., Gerard, P., Tzvetkov, N.: Restrictions of the Laplace-Beltrami eigenfunctions to submanifolds. Duke Math. J. 138(3), 445–486 (2007)
[BPU] Brummelhuis, R., Paul, T., Uribe, A.: Spectral estimates around a critical level. Duke Math. J. 78(3), 477–530 (1995)
[CP] Colin de Verdiere, Y., Parisse, B.: Equilibre instable en regime semi-classique I: concentration microlocale. Comm. P.D.E. 19, 1535–1563 (1994)
[DG] Duistermaat, J.J., Guillemin, V.W.: The spectrum of positive elliptic operators and periodic bicharacteristics. Invent. Math. 29, 39–79 (1975)
[DM] Dullin, H.R., Matveev, V.S.: A new integrable system on the sphere. Math. Res. Lett. 11, 715–722 (2004)
[DS] Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. London Math. Soc. Lecture Notes 268, Cambridge: Cambridge Univ. Press, 1999
[G1] Guillemin, V.: Wave trace invariants. Duke Math. J. 83, 287–352 (1996)
[G2] Guillemin, V.: Wave trace invariants and a theorem of Zelditch. Int. Math. Res. Not. (IMRN) 12, 303–308 (1993)
[HS] Helffer, B., Sjöstrand, J.: Semiclassical analysis of Harper’s equation III. Mem. Bull. Soc. Math. France, Ser. 2, 39, 1–124 (1990)
[Ho] Hörmander, L.: The Analysis of Linear Partial Differential Operators, Volume I. Berlin-Heidelberg: Springer-Verlag, 1983
[IS] Iwaniec, H., Sarnak, P.: $L^\infty$ norms of eigenfunctions of arithmetic surfaces. Ann. of Math. Second Ser. 141(2), 301–320 (1995)
[ISZ] Iantchenko, A., Sjöstrand, J., Zworski, M.: Birkhoff normal forms in semiclassical inverse problems. Math. Res. Lett. 9, 337–362 (2002)
[K] Kozlov, V.V.: Topological obstructions to the integrability of natural mechanical systems. Soviet Math. Dokl. 20(6), 1413–1415 (1979)
[Le] Levitan, B.M.: On the asymptotic behavior of the spectral function of a self-adjoint differential equation of second order. Isv. Akad. Nauk SSSR Ser. Mat. 16, 325–352 (1952)
[L] Lerman, E.: Contact toric manifolds. J. Symp. Geom. 1(4), 785–828 (2003)
[LS] Lerman, E., Shirokova, N.: Completely integrable torus actions on symplectic cones. Math. Res. Lett. 9(1), 105–115 (2002)
[Mi] Miller, A.: Riemannian manifolds with integrable geodesic flows. Preprint, available at http://www.dpmms.cam.ac.uk/~hk244/miller.cp.pdf
[MVN] Miranda, E., Vu-Ngoc, S.: A singular Poincare lemma. IMRN 1, 27–45 (2005)
[R] Reznikov, A.: Norms of geodesic restrictions for eigenfunctions on hyperbolic surfaces and representation theory. http://arXiv.org/abs/math/0403437v2[math.AP], 2004
[S] Safarov, Yu.G.: Asymptotics of a spectral function of a positive elliptic operator without a nontrapping condition. (Russian) Funkt. Anal. i Pril. 22(3), 53–65 (1988); translation in Funct. Anal. Appl. 22(3), 213–223 (1988)
[SV] Safarov, Yu., Vassiliev, D.: The asymptotic distribution of eigenvalues of partial differential operators. Translated from the Russian manuscript by the authors. Translations of Mathematical Monographs, 155. Providence, RI: Amer. Math. Soc., 1997
[Sa] Sarnak, P.: Arithmetic quantum chaos: The Schur lectures (1992) (Tel Aviv), Israel Math. Conf. Proc. 8, Ramat Gan: Bar-Ilan Univ., 1995, pp. 183–236
[So1] Sogge, C.D.: Concerning the $L^p$ norm of spectral clusters for second-order elliptic operators on compact manifolds. J. Funct. Anal. 77(1), 123–138 (1988)
[So2] Sogge, C.D.: Oscillatory integrals and spherical harmonics. Duke Math. J. 53(1), 43–65 (1986)
[So3] Sogge, C.D.: Fourier Integrals in Classical Analysis. Cambridge Tracts in Math. 105. Cambridge: Cambridge Univ. Press, 1993
[SZ] Sogge, C.D., Zelditch, S.: Riemannian manifolds with maximal eigenfunction growth. Duke Math. J. 114(3), 387–437 (2002)
[STZ] Sogge, C.D., Toth, J.A., Zelditch, S.: Geodesic recurrence and maximal growth of eigenfunctions, I., 2008, Preprint
[Ta] Tataru, D.: On the regularity of boundary traces for the wave equation. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 26(1), 185–206 (1998)
[T1] Toth, J.A.: Eigenfunction localization in the quantized rigid body. J. Diff. Geom. 43(4), 844–858 (1996)
[T2] Toth, J.A.: On the quantum expected values of integrable metric forms. J. Diff. Geom. 52(2), 327–374 (1999)
[T3] Toth, J.A.: A small-scale density of states formula. Commun. Math. Phys. 238, 225–256 (2003)
[T4] Toth, J.A.: Eigenfunctions of quantum completely integrable systems. Encyclopedia of Math. Phys. vol. 2, Amsterdam: Kluwer, 2006, pp. 148–157
[TZ1] Toth, J.A., Zelditch, S.: Riemannian manifolds with uniformly bounded eigenfunctions. Duke Math. J. 111(1), 97–132 (2002)
[TZ2] Toth, J.A., Zelditch, S.: Norms of modes and quasi-modes revisited. In: Harmonic analysis at Mount Holyoke (South Hadley, MA, 2001), Contemp. Math. 320, Providence, RI: Amer. Math. Soc., 2003, pp. 435–458
[TZ3] Toth, J.A., Zelditch, S.: $L^p$-norms of eigenfunctions in the completely integrable case. Annales Henri Poincaré 4, 343–368 (2003)
[VN1] Vu-Ngoc, S.: Formes normales semi-classiques des systemes completement integrables au voisinage d’un point critique de l’application moment. Asymptotic Analysis 24(3), 319–342 (2000)
[VN2] Vu-Ngoc, S.: Symplectic techniques for semiclassical integrable systems. In: Topological Methods in the Theory of Integrable Systems, Cambridge: Cambridge Scientific Publishers, 2006
[Z1] Zelditch, S.: Wave invariants at elliptic closed geodesics. Geom. Funct. Anal. (GAFA) 7, 145–213 (1997)
[Z2] Zelditch, S.: Wave invariants for non-degenerate closed geodesics. Geom. Funct. Anal. (GAFA) 8, 179–217 (1998)

Communicated by P. Sarnak




Noncommutative Riemann Surfaces by Embeddings in $\mathbb{R}^3$

Joakim Arnlind1,2, Martin Bordemann3, Laurent Hofer4, Jens Hoppe5, Hidehiko Shimada2

1 Institut des Hautes Études Scientifiques, Le Bois-Marie 35, route de Chartres,

F-91440, Bures-sur-Yvette, France. E-mail: [email protected]

2 Max Planck Institute for Gravitational Physics, Am Mühlenberg 1,

D-14476, Golm, Germany. E-mail: [email protected]; [email protected]

3 Laboratoire de MIA, 4, rue des Frères Lumière, Université de Haute-Alsace,

F-68093, Mulhouse, France. E-mail: [email protected]

4 Université du Luxembourg, FSTC 162a, avenue de la Faïencerie,

L-1511, Luxembourg City, Luxembourg. E-mail: [email protected]

5 Department of Mathematics, KTH, S-10044, Stockholm, Sweden. E-mail: [email protected]

Received: 4 January 2008 / Accepted: 17 December 2008 Published online: 20 March 2009 – © Springer-Verlag 2009

Abstract: We introduce C-Algebras of compact Riemann surfaces $\Sigma$ as non-commutative analogues of the Poisson algebra of smooth functions on $\Sigma$. Representations of these algebras give rise to sequences of matrix-algebras for which matrix-commutators converge to Poisson-brackets as $N \to \infty$. For a particular class of surfaces, interpolating between spheres and tori, we completely characterize (even for the intermediate singular surface) all finite dimensional representations of the corresponding C-algebras.

Introduction

Attaching sequences of matrix algebras to a given manifold $M$ to describe a non-commutative and approximate version of its ring of smooth functions has become a rather important tool in non-commutative field theory: more precisely, for each positive integer $N$ let $Q_N : C^\infty(M, \mathbb{C}) \to M_{N,N}(\mathbb{C})$ be a complex linear surjective map of the ring of smooth functions on $M$ into the space of all complex $N \times N$-matrices such that products of functions are approximately mapped to products of matrices in the limit $N \to \infty$. In almost all cases, $C^\infty(M, \mathbb{C})$ carries a Poisson bracket $\{\ ,\ \}$ (for instance if $M$ is symplectic, such as every orientable Riemann surface), and one further demands that Poisson brackets are approximately mapped to matrix commutators in the limit $N \to \infty$ (see e.g. [BHSS91] for details). For the 2-sphere $S^2$ [GH82] one could use the fact that the space of all spherical harmonics of fixed $l$ is in bijection with the space of all harmonic polynomials in $\mathbb{R}^3$ of degree $l$; substituting the three commuting variables by irreducible $N$-dimensional representations of the three-dimensional Lie algebra $su(2)$ allows one to define a map from functions on $S^2$ to $N \times N$ matrices that sends Poisson brackets to matrix commutators up to corrections of order $1/N$ (see also [BHSS91], Example 3, p. 218). The result was dubbed “Fuzzy Sphere” in [Mad92]. The papers [KL92] prove that the (complexified) Poisson algebra of functions on any Riemann surface arises as a $N \to \infty$ limit of


$gl(N, \mathbb{C})$ – which had been conjectured in [BHSS91]. This result was extended to any quantizable compact Kähler manifold in [BMS94], the technical tool being geometric and Berezin-Toeplitz quantization. A thorough analysis of non-commutative Riemann surfaces of genus greater than or equal to 2 as a continuous field of simple $C^*$-algebras, strongly Morita equivalent to a reduced twisted group $C^*$-algebra of its fundamental group, has been given in [NN99]. Insight on how matrices can encode topological information (certain sequences having been identifiable as converging to a particular function, but $gl(N, \mathbb{C})$ lacking topological invariants) was gained in [Shi04].

Even though the above general results are constructive, there seem to be only two explicit formulas, for the two-sphere [GH82] and for the two-torus [FFZ89] (see also [Hop89/88]), which are quite different from each other; the former uses the natural embedding of the two-sphere into $\mathbb{R}^3$ whereas the latter relies on the fact that the two-torus is a quotient $\mathbb{R}^2/\mathbb{Z}^2$. The general results are based on the complex nature of any compact orientable Riemann surface.

In this paper, we should like to propose an approach which to the best of our knowledge does not seem to have been treated in the literature so far, despite its rather intuitive appeal: we are using the ‘visualisable’ embedding of a compact orientable Riemann surface $\Sigma$ into $\mathbb{R}^3$ explicitly given by the set of all zeros of a real polynomial $C$. The function $C$, via

\[ \{f, g\}_C := \nabla C \cdot (\nabla f \times \nabla g), \]

defines a Poisson bracket for all real-valued smooth functions $f, g$ on $\mathbb{R}^3$. Since $C$ is a Casimir function for the bracket $\{\ ,\ \}_C$, one gets a symplectic Poisson bracket on $\Sigma$ by restriction. The idea now is to use the above Poisson bracket on $\mathbb{R}^3$ to first define an infinite-dimensional algebra as a quotient algebra of the free non-commutative algebra in three variables, involving a real parameter $\hbar$ and suitably ordered non-commutative analogues of $\{x, y\}_C$, $\{y, z\}_C$ and $\{z, x\}_C$. In a second step the resulting algebra is divided by an ideal generated by the constraint polynomial $C$, thus giving a non-commutative version of the functions on $\Sigma$. In a third step matrix representations of any size $N$ of this latter algebra are constructed where the parameter $\hbar$ takes specific values depending on $N$. It is noteworthy that the construction does not require the zero set of $C$ to be a regular surface. Thus, even for a singular surface (e.g., in the transition from sphere to torus) the non-commutative analogue is still well defined. The main result of this paper is an explicit construction of non-commutative (non-round) spheres and tori, including the transition region with a singular surface that emerges at the point of topology change. Encouraged by the explicit construction and by the fact that for the two-torus our results almost coincide with the older results of [BHSS91], we are quite optimistic that for the case of genus $g \geq 2$ this embedding approach may give more explicit constructions than the existence proof in [KL92] and [BMS94].

The paper is organized as follows: In Sect. 1 we describe Riemann surfaces of genus $g$ embedded in $\mathbb{R}^3$ as inverse images of polynomial constraint-functions $C(\vec{x})$. The above-mentioned Poisson bracket $\{\ ,\ \}_C$ on $\mathbb{R}^3$ is treated in Sect. 2, where the bracket restricts to a symplectic bracket on the embedded Riemann surface $\Sigma$. In Sect. 3 steps one and two of the above programme are explicitly proven for a polynomial constraint $C$ describing the two-sphere, the two-torus, and a transition region: we give a system of relations (Eqs. (3.2), (3.3), (3.4)), and show that this system satisfies the hypothesis of the Diamond lemma, thus proving that the non-commutative algebra


carries a multiplication which is a converging deformation of the point wise multiplication of polynomials in three commuting variables (see Proposition 3.1). In the central Sect. 4 we completely classify all the finite-dimensional representations of the algebras constructed in the preceding section (two-sphere, two-torus, and transition) which are hermitian in the sense that the variables x, y, and z are sent to hermitian N × N -matrices. The main technical tool is graph-theory describing the non-zero entries of the matrices. Next, in Sect. 5, we confirm that the eigenvalue sequences of these representations reflect topology in the sense suggested in [Shi04]. The final Sect. 6 compares the classification results of Sect. 4 with previously known matrix constructions for the sphere and the torus. In the case of the torus it is shown that our result agrees with what can be obtained by (variants of) Berezin-ToeplitzQuantization, see e.g. [BHSS91]. 1. Genus g Riemann Surfaces The aim of this section is to present compact connected Riemann surfaces of any genus embedded in R3 by inverse images of polynomials. For this purpose we use the regular value theorem and Morse theory. Let C be a polynomial in 3 variables and define = C −1 ({0}). What are the conditions on C, for to be a genus g Riemann surface? If the restriction of C to is a submersion, then is an orientable submanifold of R3 . has to be compact and of the desired genus. For further details see [Hir76,Hof02]. The classification of 2-dimensional compact (connected) manifolds is well-known. In this case there is a one to one correspondence between topological and diffeomorphism classes. The result is that any compact orientable surfaces is homeomorphic (hence diffeomorphic) to a sphere or to a surface obtained by gluing tori together (connected sum). The number g of tori is called the genus and is related to the Euler-Poincaré characteristic by the formula χ = 2 − 2g. 
To compute $\chi(\Sigma)$ we apply Morse theory to a specific function. A point $p$ of $\Sigma$ is a singular point of a (smooth) function $f$ on $\Sigma$ if $Df_p = 0$, in which case $f(p)$ is a singular value. At any singular point $p$ one can consider the second derivative $D^2 f_p$ of $f$, and $p$ is said to be non-degenerate if $\det(D^2 f_p) \neq 0$. Moreover, one can attach an index to each such point depending on the signature of $D^2 f_p$: 0 if positive, 1 if hyperbolic and 2 if negative. A Morse function is a function such that every singular point is non-degenerate and the singular values are all distinct. Then $\chi(\Sigma)$ is given by the formula
\[ \chi(\Sigma) = n(0) - n(1) + n(2), \]
where $n(i)$ is the number of singular points which have index $i$.

The Cote x function is defined as the restriction of the first coordinate to the surface. It is not necessarily a Morse function (one has to choose a “good” embedding for that), but the singular points are those for which the gradient $\operatorname{grad} C$ is parallel to the $Ox$ axis. Moreover the Hessian matrix of Cote x at such a point $p$ is
\[ -\frac{1}{\frac{\partial C}{\partial x}(p)}
\begin{pmatrix}
\frac{\partial^2 C}{\partial y^2}(p) & \frac{\partial^2 C}{\partial y\,\partial z}(p) \\
\frac{\partial^2 C}{\partial y\,\partial z}(p) & \frac{\partial^2 C}{\partial z^2}(p)
\end{pmatrix}. \]
Take
\[ C(\vec{x}) = \frac{1}{2}\bigl(P(x) + y^2\bigr)^2 + \frac{1}{2} z^2 - \frac{1}{2} c, \]



where $c > 0$, $P(x) = a_{2k} x^{2k} + a_{2k-1} x^{2k-1} + \cdots + a_1 x + a_0$ with $a_{2k} > 0$ and $k > 0$. Obviously $\Sigma$ is closed and bounded (even degree of $P$), hence compact. $\Sigma$ is a submanifold of $\mathbb{R}^3$ if, and only if, for each $p \in \Sigma$, $DC_p \neq 0$, which is equivalent to requiring that the polynomials $P - \sqrt{c}$ and $P + \sqrt{c}$ have only simple roots. The singular points of the Cote x function on $\Sigma$ are the points $(x, 0, 0)$ such that $P(x)^2 = c$, and the Hessian matrix is
\[ -\frac{1}{\frac{\partial C}{\partial x}(x,0,0)} \begin{pmatrix} 2P(x) & 0 \\ 0 & 1 \end{pmatrix}. \]
Hence it is positive or negative if, and only if, $P(x) = \sqrt{c}$, and hyperbolic if, and only if, $P(x) = -\sqrt{c}$. Thanks to the fact that $P(x)$ never vanishes at a singular point, this also shows that Cote x is a genuine Morse function. Finally,
\[ n(0) + n(2) = \#\{P = \sqrt{c}\} \quad\text{and}\quad n(1) = \#\{P = -\sqrt{c}\}. \]
If the polynomial $P - \sqrt{c}$ has exactly 2 simple roots and the polynomial $P + \sqrt{c}$ has exactly $2g$ simple roots, then $\chi(\Sigma) = 2 - 2g$ and $\Sigma$ is a surface of genus $g$.

Let $g > 0$. Set:
(i) $G(t) = (t - 1)(t - 2^2) \cdots (t - g^2)$ and $M = \max_{0 \le t \le g^2+1} G(t)$, $\alpha \in \bigl(0, \tfrac{2\sqrt{c}}{M}\bigr)$,
(ii) $Q(x) = \alpha G(x) - \sqrt{c}$ and $P(x) = Q(x^2)$.

One can directly see that $Q + \sqrt{c}$ has exactly $g$ simple roots, hence $P + \sqrt{c}$ has exactly $2g$ simple roots. For $t \in [0, g^2 + 1]$, the function $Q(t) - \sqrt{c}$ has no zero. On the other hand, for $t \ge g^2 + 1$, $Q(t) - \sqrt{c}$ is strictly growing and has exactly one zero. Consequently the polynomial $P - \sqrt{c}$ has exactly 2 simple roots and the surface $\Sigma$ defined above is a genus $g$ compact Riemann surface. Note that non-compact, respectively non-polynomial, higher genus Riemann surfaces have been considered in [BKL05].

2. The Construction for General Riemann Surfaces

For arbitrary smooth $C : \mathbb{R}^3 \to \mathbb{R}$,
\[ \{f, g\}_{\mathbb{R}^3} := \nabla C \cdot (\nabla f \times \nabla g) \tag{2.1} \]
defines a Poisson bracket for functions on $\mathbb{R}^3$ (see e.g. Nowak [Now97] who studied the formal deformability of (2.1)).$^1$ Clearly, $C$ is a Casimir function of the bracket, i.e. $C$ Poisson-commutes with every function. Let now, as in Sect. 1, $\Sigma_g \subset \mathbb{R}^3$ be described as $C^{-1}(0)$ with
\[ C(\vec{x}) = \frac{1}{2}\bigl(P(x) + y^2\bigr)^2 + \frac{1}{2} z^2 - \frac{1}{2} c, \tag{2.2} \]
and $c > 0$. For this choice of $C$, the bracket $\{\cdot,\cdot\}_{\mathbb{R}^3}$ defines a Poisson bracket on $\Sigma_g$ through restriction. The Poisson brackets between $x$, $y$ and $z$ read:
\[ \{x, y\}_{\mathbb{R}^3} = \partial_z C = z, \qquad
\{y, z\}_{\mathbb{R}^3} = \partial_x C = P'(x)\bigl(P(x) + y^2\bigr), \qquad
\{z, x\}_{\mathbb{R}^3} = \partial_y C = 2y\bigl(P(x) + y^2\bigr). \tag{2.3} \]
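The brackets (2.3) can be checked numerically straight from the definition (2.1). The following is only a sketch: it assumes the choice $P(x) = x^2 - \mu$ used later in Sect. 3, and the values of `mu`, `c`, the evaluation point `p` and the test function in the Casimir check are arbitrary illustrations, not taken from the text. Gradients are approximated by central differences.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def grad(f, p, h=1e-5):
    """Central-difference gradient of f at the point p = (x, y, z)."""
    g = []
    for k in range(3):
        up, dn = list(p), list(p)
        up[k] += h
        dn[k] -= h
        g.append((f(*up) - f(*dn)) / (2 * h))
    return tuple(g)

mu, c = 0.5, 1.0          # illustrative values (assumption)

def P(x):
    return x * x - mu     # assumed polynomial; P'(x) = 2x

def C(x, y, z):
    return 0.5 * (P(x) + y * y) ** 2 + 0.5 * z * z - 0.5 * c

def bracket(f, g, p):
    """{f, g} = grad C . (grad f x grad g), cf. (2.1)."""
    return dot(grad(C, p), cross(grad(f, p), grad(g, p)))

X = lambda x, y, z: x
Y = lambda x, y, z: y
Z = lambda x, y, z: z

p = (0.3, -0.7, 1.1)
x, y, z = p
xy, yz, zx = bracket(X, Y, p), bracket(Y, Z, p), bracket(Z, X, p)
# C should Poisson-commute with any function; x*y + z^3 is an arbitrary choice.
casimir = bracket(C, lambda x, y, z: x * y + z ** 3, p)

print(xy - z)                           # {x, y} - z
print(yz - 2 * x * (P(x) + y * y))      # {y, z} - P'(x)(P(x) + y^2)
print(zx - 2 * y * (P(x) + y * y))      # {z, x} - 2y(P(x) + y^2)
print(casimir)                          # {C, g}
```

All four printed numbers should vanish up to discretisation and rounding error.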

1 While we did not (yet find a way to) use his results, we are very grateful for his “New Year’s Eve” explanations, as well as providing us with his Ph.D. Thesis.



We claim that fuzzy analogues of $\Sigma_g$ can be obtained via matrix analogues of (2.3). Apart from possible “explicit 1/N corrections”, direct ordering questions arise on the r.h.s. of (2.3), while on the l.h.s. one replaces Poisson brackets by commutators, i.e. $\{\cdot,\cdot\} \to \frac{1}{i\hbar}[\cdot,\cdot]$. We present the following Ansatz for the C-algebra of $\Sigma_g$, given as three relations in the free algebra generated by the letters $X, Y, Z$:
\[ [X, Y] = i\hbar Z, \tag{2.4} \]
\[ [Y, Z] = i\hbar \sum_{r=1}^{2g} a_r \sum_{i=0}^{r-1} X^i \bigl(P(X) + Y^2\bigr) X^{r-1-i} \equiv \hat\phi_X, \tag{2.5} \]
\[ [Z, X] = i\hbar \bigl(2Y^3 + Y P(X) + P(X) Y\bigr) \equiv \hat\phi_Y, \tag{2.6} \]
where $\hbar$ is a positive real number and $P(X) = \sum_{r=0}^{2g} a_r X^r$. The particular ordering in (2.5) and (2.6) is chosen such that the three equations are consistent, in the sense of the Diamond Lemma [Ber78].

Proposition 2.1. Let $S = \{\sigma_X, \sigma_Y, \sigma_Z\}$ be a reduction system with
\[ \sigma_X = (W_X, f_X) = (ZY,\; YZ - \hat\phi_X), \quad
\sigma_Y = (W_Y, f_Y) = (ZX,\; XZ + \hat\phi_Y), \quad
\sigma_Z = (W_Z, f_Z) = (YX,\; XY - i\hbar Z). \]
This reduction system contains an ambiguity; i.e., there are two ways of reducing the word $ZYX$: either we replace $ZY$ by $YZ - \hat\phi_X$ or we replace $YX$ by $XY - i\hbar Z$. The ambiguity is called resolvable if these two reductions eventually reduce to the same expression, by using that we can replace any occurrence of $W_X, W_Y, W_Z$ by $f_X, f_Y, f_Z$ respectively. The statement of this proposition is that the ambiguity $(ZY)X = Z(YX)$ is resolvable if and only if $[X, \hat\phi_X] + [Y, \hat\phi_Y] = 0$, and that this relation is satisfied for the choice in (2.5) and (2.6).

Proof. The ambiguity is resolvable if we can show that $A := (YZ - \hat\phi_X)X - Z(XY - i\hbar Z) = 0$ using only the possibility to replace any occurrence of $W_i$ with $f_i$, for $i = X, Y, Z$. We get
\begin{align*}
A &= YZX - ZXY - \hat\phi_X X + i\hbar Z^2 \\
  &= Y(XZ + \hat\phi_Y) - (XZ + \hat\phi_Y)Y - \hat\phi_X X + i\hbar Z^2 \\
  &= YXZ - XZY + [Y, \hat\phi_Y] - \hat\phi_X X + i\hbar Z^2 \\
  &= (XY - i\hbar Z)Z - X(YZ - \hat\phi_X) + [Y, \hat\phi_Y] - \hat\phi_X X + i\hbar Z^2 \\
  &= [X, \hat\phi_X] + [Y, \hat\phi_Y].
\end{align*}
It is then straightforward to check that $[Y, \hat\phi_Y] = -[X, \hat\phi_X]$ for the choice in (2.5) and (2.6).

Finding explicit representations of (2.4)–(2.6), let alone classifying them, is of course a very complicated task. We succeeded in doing so for a (continuously deformable) class of surfaces corresponding to spheres and tori.
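The consistency condition $[X, \hat\phi_X] + [Y, \hat\phi_Y] = 0$ of Proposition 2.1 holds for arbitrary $X$ and $Y$ (the inner sum in (2.5) telescopes under the commutator with $X$), so it can be probed on random matrices. The sketch below is only an illustration under stated assumptions: the coefficients `coeffs` of $P$, the value of `hbar`, the matrix size and the random seed are hypothetical choices, and the helper names are ours.

```python
import random

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def lin(A, B, a=1.0, b=1.0):
    """Entrywise a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mpow(A, m):
    R = eye(len(A))
    for _ in range(m):
        R = matmul(R, A)
    return R

def comm(A, B):
    return lin(matmul(A, B), matmul(B, A), 1.0, -1.0)

def poly(coeffs, X):
    """P(X) = sum_r coeffs[r] * X^r."""
    R = [[0.0] * len(X) for _ in X]
    for r, a in enumerate(coeffs):
        R = lin(R, mpow(X, r), 1.0, a)
    return R

def phi_X(coeffs, X, Y, hbar):
    """Right-hand side of (2.5)."""
    A = lin(poly(coeffs, X), matmul(Y, Y))            # P(X) + Y^2
    R = [[0.0] * len(X) for _ in X]
    for r, a in enumerate(coeffs):
        for i in range(r):                            # no terms for r = 0
            term = matmul(matmul(mpow(X, i), A), mpow(X, r - 1 - i))
            R = lin(R, term, 1.0, a)
    return [[1j * hbar * x for x in row] for row in R]

def phi_Y(coeffs, X, Y, hbar):
    """Right-hand side of (2.6)."""
    Pm = poly(coeffs, X)
    Y3 = mpow(Y, 3)
    R = lin(lin(Y3, Y3), lin(matmul(Y, Pm), matmul(Pm, Y)))   # 2Y^3 + YP + PY
    return [[1j * hbar * x for x in row] for row in R]

random.seed(0)
n = 3
def rand_matrix():
    return [[complex(random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5))
             for _ in range(n)] for _ in range(n)]

X, Y = rand_matrix(), rand_matrix()
coeffs = [0.3, -1.0, 0.5, 0.2, 1.0]    # hypothetical a_0, ..., a_4
hbar = 0.4                              # hypothetical value
total = lin(comm(X, phi_X(coeffs, X, Y, hbar)),
            comm(Y, phi_Y(coeffs, X, Y, hbar)))
residual = max(abs(x) for row in total for x in row)
print(residual)     # expected to be at rounding-error level
```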



3. The Torus and Sphere C-Algebras

Let us now take $P(x) = x^2 - \mu$, in which case $C^{-1}(0)$, with
\[ C(x, y, z) = \frac{1}{2}\bigl(x^2 + y^2 - \mu\bigr)^2 + \frac{1}{2} z^2 - \frac{1}{2} c \qquad (c > 0), \tag{3.1} \]
describes a surface of revolution which is a torus for $\mu > \sqrt{c}$ and a sphere for $-\sqrt{c} < \mu < \sqrt{c}$. As $\mu$ increases from $-\sqrt{c}$ to $\sqrt{c}$ the (almost) round sphere gets deformed by introducing two growing “sinks”; one at the north pole and one at the south pole. At the critical point $\mu = \sqrt{c}$ the two sinks meet and the surface develops a singularity. For larger $\mu$ the singularity vanishes and a hole appears, giving the topology of a torus.

The corresponding C-algebra is defined as the quotient of the free algebra $\mathbb{C}\langle X, Y, Z\rangle$ with the two-sided ideal generated by the relations
\[ [X, Y] = i\hbar Z, \tag{3.2} \]
\[ [Y, Z] = i\hbar\bigl(2X^3 + XY^2 + Y^2X - 2\mu X\bigr), \tag{3.3} \]
\[ [Z, X] = i\hbar\bigl(2Y^3 + YX^2 + X^2Y - 2\mu Y\bigr). \tag{3.4} \]
By introducing $W = X + iY$ and $V = X - iY$ one can rewrite (3.3) and (3.4) as
\[ \bigl(W^2V + VW^2\bigr)(1 + \hbar^2) = 4\mu\hbar^2 W + 2(1 - \hbar^2) WVW, \tag{3.5} \]
\[ \bigl(V^2W + WV^2\bigr)(1 + \hbar^2) = 4\mu\hbar^2 V + 2(1 - \hbar^2) VWV, \tag{3.6} \]
and we denote by $I(\mu, \hbar)$ the ideal generated by these relations. Through the “Diamond lemma” [Ber78] one can explicitly construct a basis of this algebra.

Proposition 3.1. Let $C(\mu, \hbar) = \mathbb{C}\langle W, V\rangle / I(\mu, \hbar)$. Then a basis of $C(\mu, \hbar)$ is given by
\[ \{V^i (WV)^j W^k : i, j, k = 0, 1, 2, \ldots\}. \]
As a vector space, $C(\mu, \hbar)$ is therefore isomorphic to the space of commutative polynomials $\mathbb{C}[X, Y, Z]$.

Proof. In the notation of the Diamond Lemma, let $S = \{\sigma_1, \sigma_2\}$ be a reduction system with
\[ \sigma_1 = (w_{\sigma_1}, f_{\sigma_1}) = \Bigl(W^2V,\; \frac{4\mu\hbar^2}{1 + \hbar^2} W + \frac{2(1 - \hbar^2)}{1 + \hbar^2} WVW - VW^2\Bigr), \]
\[ \sigma_2 = (w_{\sigma_2}, f_{\sigma_2}) = \Bigl(WV^2,\; \frac{4\mu\hbar^2}{1 + \hbar^2} V + \frac{2(1 - \hbar^2)}{1 + \hbar^2} VWV - V^2W\Bigr), \]
and let $\le$ be a partial ordering on $\langle W, V\rangle$ such that $p < q$ if either the total degree (in $W$ and $V$) of $p$ is less than the total degree of $q$, or if $p$ is a permutation of the letters in $q$ and the misordering index of $p$ is less than the misordering index of $q$. The misordering index of a word $a_1 a_2 \ldots a_k$ is defined to be the number of pairs $(a_k, a_{k'})$ with $k < k'$ such that $a_k = W$ and $a_{k'} = V$. This partial ordering is compatible with $S$ in the sense that every word in $f_{\sigma_i}$ is less than $w_{\sigma_i}$.



We will now argue that the partial ordering fulfills the descending chain condition, i.e. that every sequence of words such that $w_1 \ge w_2 \ge \cdots$ eventually becomes constant. Assume that $w_1$ has degree $d$ and misordering index $i$. If $w_1 > w_k$, then $d$ or $i$ must decrease by at least 1. Since both the degree and the misordering index are non-negative integers, an infinite sequence of strictly decreasing words cannot exist.

The reduction system $S$ has one overlap ambiguity, namely, there are two ways to reduce the word $W^2V^2$; either you write it as $(W^2V)V$ and use $\sigma_1$, or you write it as $W(WV^2)$ and use $\sigma_2$. In an associative algebra, these must clearly be the same, and if they do reduce to the same expression, we call the ambiguity resolvable. It is now straightforward to check that the indicated ambiguity is in fact resolvable.

The above observations allow for the use of the Diamond lemma, which in particular states that a basis for $C(\mu, \hbar)$ is given by the set of irreducible words. In this particular case, it is clear that the words $V^i (WV)^j W^k$ are irreducible (since they do not contain $W^2V$ or $WV^2$) and that there are no other irreducible words.

By a straightforward calculation, using (3.5) and (3.6), one proves the following result.

Proposition 3.2. Define $D = WV$, $\tilde D = VW$ and $\hat C = (D + \tilde D - 2\mu)^2 + (D - \tilde D)^2/\hbar^2$. Then it holds that
(i) $[D, \tilde D] = 0$,
(ii) $[W, \hat C] = [V, \hat C] = 0$.

In particular, this means that the direct non-commutative analogue of the constraint (3.1) is a Casimir of $C(\mu, \hbar)$.

Let us make a remark on the possibility of choosing a different ordering when constructing a non-commutative analogue of the Poisson algebra. Assume we choose to completely symmetrize the r.h.s. of Eqs. (2.3). Then, the defining relations of the algebra become
\[ [X, Y] = i\hbar Z, \]
\[ [Y, Z] = 2i\hbar\Bigl(X^3 + \tfrac{1}{3}\bigl(XY^2 + Y^2X + YXY\bigr) - \mu X\Bigr), \]
\[ [Z, X] = 2i\hbar\Bigl(Y^3 + \tfrac{1}{3}\bigl(YX^2 + X^2Y + XYX\bigr) - \mu Y\Bigr). \]
Again, defining $W = X + iY$ and $V = X - iY$ gives
\[ \bigl(W^2V + VW^2\bigr)\bigl(1 + 4\hbar^2/3\bigr) = 4\mu\hbar^2 W + 2\bigl(1 - 2\hbar^2/3\bigr) WVW, \]
\[ \bigl(V^2W + WV^2\bigr)\bigl(1 + 4\hbar^2/3\bigr) = 4\mu\hbar^2 V + 2\bigl(1 - 2\hbar^2/3\bigr) VWV, \]
which, by rescaling $\hbar'^2 = \frac{3\hbar^2}{3 + \hbar^2}$, can be brought to the form of Eqs. (3.5) and (3.6), with $\hbar'$ as the new parameter. (Dividing the two displayed relations by $1 + \hbar^2/3$ shows that the denominator is $3 + \hbar^2$.)

4. Representations of the Torus and Sphere Algebras

Let us now turn to the task of finding representations $\phi$ of the algebra $C(\mu, \hbar)$, with $0 < \hbar < 1$, for which $\phi(X)$, $\phi(Y)$, $\phi(Z)$ are hermitian matrices, i.e. $\phi(W)^\dagger = \phi(V)$. First, we observe that any such representation is completely reducible; hence, in the following, we need only consider irreducible representations.



Proposition 4.1. Any representation $\phi$ of $C(\mu, \hbar)$ such that $\phi(W)^\dagger = \phi(V)$ is completely reducible.

Proof. Since the algebra of all complex $N \times N$ matrices equipped with the sup-norm is a $C^*$-algebra, it is clear that any $*$-subalgebra is completely reducible. For the convenience of the reader, we give the algebraic proof. Let $\phi$ be a representation of $C(\mu, \hbar)$ fulfilling the conditions in the proposition. Moreover, let $A$ be the subalgebra of the full matrix algebra generated by $\phi(W)$ and $\phi(V)$. First we note that since $\phi(V) = \phi(W)^\dagger$, the algebra $A$ is invariant under hermitian conjugation; thus given $M \in A$ we know that $M^\dagger \in A$. We prove that $\operatorname{Rad}(A)$ (the radical of $A$), i.e. the largest nilpotent ideal of $A$, vanishes, which implies, by the Wedderburn-Artin theorem, see e.g. [ASS06], that $\phi$ is completely reducible. Let $M \in \operatorname{Rad}(A)$. Since $\operatorname{Rad}(A)$ is an ideal it follows that $M^\dagger M \in \operatorname{Rad}(A)$. For a finite-dimensional algebra, $\operatorname{Rad}(A)$ is nilpotent, which in particular implies that there exists a positive integer $m$ such that $(M^\dagger M)^m = 0$. It follows that $M = 0$, hence $\operatorname{Rad}(A) = 0$.

In the following, we shall always assume that $\phi$ is an hermitian irreducible representation of $C(\mu, \hbar)$. For these representations, $\phi(D)$ and $\phi(\tilde D)$ (as defined in Proposition 3.2) will be two commuting hermitian matrices and therefore one can always choose a basis such that they are both diagonal. We then conclude that the value of the Casimir $\hat C$ will always be a non-negative real number, which we will denote by $4c$. Finding hermitian representations of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$ thus amounts to solving the matrix equations
\[ \bigl(WD + \tilde D W\bigr)(1 + \hbar^2) = 4\mu\hbar^2 W + (1 - \hbar^2)\bigl(W\tilde D + DW\bigr), \tag{4.1} \]
\[ \bigl(D + \tilde D - 2\mu\mathbb{1}\bigr)^2 + \frac{1}{\hbar^2}\bigl(D - \tilde D\bigr)^2 = 4c\,\mathbb{1}, \tag{4.2} \]
with $D = WW^\dagger = \operatorname{diag}(d_1, d_2, \ldots, d_N)$ and $\tilde D = W^\dagger W = \operatorname{diag}(\tilde d_1, \tilde d_2, \ldots, \tilde d_N)$ being diagonal matrices with non-negative eigenvalues. The “constraint” (4.2) constrains the pairs $\vec x_i = (d_i, \tilde d_i)$ to lie on the ellipse $(x + y - 2\mu)^2 + (x - y)^2/\hbar^2 = 4c$, e.g. as in Fig. 1.

Representations with $c = 0$, which we shall call degenerate, are particularly simple, and can be directly characterized.

Proposition 4.2. Let $\phi$ be an hermitian representation of $C(\mu, \hbar)$ such that $\phi(\hat C) = 0$. Then $\mu \ge 0$ and there exists a unitary matrix $U$ such that $\phi(W) = \sqrt{\mu}\, U$.

Proof. When $D$ and $\tilde D$ are non-negative diagonal matrices, $c = 0$ implies $D = \tilde D = \mu\mathbb{1}$ via (4.2), which necessarily gives $\mu \ge 0$. In this case, Eq. (4.1) is identically satisfied, and we are left with solving the equations $WW^\dagger = W^\dagger W = \mu\mathbb{1}$. Hence, there exists a unitary matrix $U$ such that $W = \sqrt{\mu}\, U$.

Assume in the following that $c > 0$. We note that any representation $\phi'$ of $C(\mu', \hbar)$, with $\phi'(\hat C) = 4c'\mathbb{1}$, can be obtained from a representation $\phi$ of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$, if $\mu/\sqrt{c} = \mu'/\sqrt{c'}$. Namely, one simply defines $\phi'(W) := \sqrt[4]{c'/c}\;\phi(W)$.

Proposition 4.3. Let $\phi$ be an hermitian representation of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$. Then it holds that $-\sqrt{c} \le \mu$.



Fig. 1. The constraint ellipse

Proof. Assume that there exists a representation of $C(\mu, \hbar)$ with $-\sqrt{c} > \mu$. Then the diagonal components of Eq. (4.2) describe an ellipse in the $(d, \tilde d)$-plane, for which all points $(d, \tilde d)$ satisfy that either $d$ or $\tilde d$ is strictly negative. This contradicts the fact that $D$ and $\tilde D$ have non-negative eigenvalues. Hence, $-\sqrt{c} \le \mu$.

Writing out (4.1) in components gives
\[ W_{ij}\Bigl[(\hbar^2 + 1)\bigl(\tilde d_i + d_j\bigr) + (\hbar^2 - 1)\bigl(d_i + \tilde d_j\bigr) - 4\mu\hbar^2\Bigr] = 0, \tag{4.3} \]
and we also note that $W\tilde D = DW$ yields $W_{ij}\bigl(d_i - \tilde d_j\bigr) = 0$. If $W_{ij} \neq 0$, the two equations give a relation between the pairs $\vec x_i = (d_i, \tilde d_i)$ and $\vec x_j = (d_j, \tilde d_j)$. Namely, $\vec x_j = s(\vec x_i)$ with

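\[ s\bigl(d, \tilde d\bigr) = \bigl(4\mu\sin^2\theta + 2d\cos 2\theta - \tilde d,\; d\bigr), \tag{4.4} \]
where $\hbar = \tan\theta$ for $0 < \theta < \pi/4$. The map $s$ is better understood if we introduce coordinates $z(\vec x) = (d - \tilde d)/\hbar$ and $\varphi(\vec x) = d + \tilde d - 2\mu$, in which case one finds that
\[ \begin{pmatrix} z\bigl(s(\vec x)\bigr) \\ \varphi\bigl(s(\vec x)\bigr) \end{pmatrix}
= \begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}
\begin{pmatrix} z(\vec x) \\ \varphi(\vec x) \end{pmatrix}. \tag{4.5} \]
We conclude that $s$ amounts to a “rotation” on the ellipse described by the constraint (4.2). Let us collect some basic facts about $s$ in the next proposition.

Proposition 4.4. Let $s : \mathbb{R}^2 \to \mathbb{R}^2$ be the map as defined above and let $q = e^{2i\theta}$. Then
(i) $s$ is a bijection,
(ii) if $\vec x(\beta_0) = \sqrt{c}\,\Bigl(\frac{\mu}{\sqrt c} + \frac{\cos\beta_0}{\cos\theta},\; \frac{\mu}{\sqrt c} + \frac{\cos(\beta_0 + 2\theta)}{\cos\theta}\Bigr)$ then $s^l\bigl(\vec x(\beta_0)\bigr) = \vec x(\beta_0 + 2l\theta)$,
(iii) $s(\vec x) = \vec x$ if and only if $\vec x = (\mu, \mu)$,
(iv) if $\vec x \neq (\mu, \mu)$, then $s^n(\vec x) = \vec x$ if and only if $q^n = 1$.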
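The rotation picture behind the map $s$ is easy to test numerically. The sketch below implements $s$ as in (4.4), compares it with a rotation by $2\theta$ in the $(z, \varphi)$ coordinates of (4.5), and checks the fixed point and the periodicity when $q^n = 1$. The values of `mu`, `theta` and the starting point `x0` are arbitrary illustrations, and the helper names are ours.

```python
import math

def s(x, mu, theta):
    """The map (4.4): (d, dt) -> (4 mu sin^2(theta) + 2 d cos(2 theta) - dt, d)."""
    d, dt = x
    return (4 * mu * math.sin(theta) ** 2 + 2 * d * math.cos(2 * theta) - dt, d)

def to_zphi(x, mu, theta):
    """Coordinates z = (d - dt)/hbar and phi = d + dt - 2 mu, with hbar = tan(theta)."""
    d, dt = x
    return ((d - dt) / math.tan(theta), d + dt - 2 * mu)

def rotate(v, angle):
    z, phi = v
    return (math.cos(angle) * z - math.sin(angle) * phi,
            math.sin(angle) * z + math.cos(angle) * phi)

n = 7
mu, theta = 1.7, math.pi / n       # q = exp(2 i theta) satisfies q^7 = 1
x0 = (2.3, 0.9)                    # arbitrary starting pair (d, dt)

# (4.5): applying s equals rotating (z, phi) by 2*theta.
lhs = to_zphi(s(x0, mu, theta), mu, theta)
rhs = rotate(to_zphi(x0, mu, theta), 2 * theta)

# The quantity z^2 + phi^2 (= 4c on the constraint ellipse) is preserved.
r2_before = sum(t * t for t in to_zphi(x0, mu, theta))
r2_after = sum(t * t for t in lhs)

# (mu, mu) is a fixed point, and s^n is the identity when q^n = 1.
fixed = s((mu, mu), mu, theta)
x = x0
for _ in range(n):
    x = s(x, mu, theta)

print(lhs, rhs)
print(r2_before, r2_after)
print(fixed, x)
```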

From these considerations one realizes that it will be important to keep track of the pairs $(i, j)$ for which $W_{ij} \neq 0$. This leads us to a graph representation of the matrix $W$.



4.1. Graph representation of matrices. In this section we will introduce the directed graph of the matrix $W$. See, e.g., [FH94] for the standard terminology concerning directed graphs.

Definition 4.5. Let $G = (V, E)$ be a directed graph on $N$ vertices with vertex set $V = \{1, 2, \ldots, N\}$ and edge set $E \subseteq V \times V$. We say that an $N \times N$ matrix $W$ is associated to $G$ (or $G$ is associated to $W$) if it holds that $(i\,j) \in E \Leftrightarrow W_{ij} \neq 0$.

Given an equation for $W$, we say that a graph $G$ is a solution if $G$ is associated to a matrix $W$ solving the equation. Needless to say, for a given solution $G$ there might exist many different (matrix) solutions associated to $G$. A graph with several disconnected components is clearly associated to a matrix that is a direct sum of matrices; hence, it suffices to consider connected graphs. In the following, a solution will always refer to a solution of (4.1).

Given a connected solution $G$, we note that given the value of $\vec x_i = (d_i, \tilde d_i)$, for any $i$, we can compute $\vec x_k = (d_k, \tilde d_k)$, for all $k$, using (4.4). Namely, since $G$ is connected, we can always find a sequence of numbers $i = i_1, i_2, \ldots, i_l = k$, such that $W_{i_j i_{j+1}} \neq 0$ or $W_{i_{j+1} i_j} \neq 0$, which will give us $\vec x_k = s^m(\vec x_i)$, where $m$ is the difference between the number of edges (in the path) directed from $i$ and the number of edges directed towards $i$.

Proposition 4.6. Let $G = (V, E)$ be a connected non-degenerate solution. Then
(i) $G$ has no self-loops (i.e. $(i\,i) \notin E$),
(ii) there is at most one edge between any pair of vertices.

Proof. In both cases, assuming the opposite, it follows from (4.3) that there exists an $i$ such that $d_i = \tilde d_i = \mu$. Since the graph is connected we will have $d_i = \tilde d_i = \mu$ for all $i$ ($(\mu, \mu)$ is indeed the fixed point of $s$), giving $c = 0$. Hence, a non-degenerate solution will satisfy the two conditions above.

Any finite directed graph has a directed cycle, which we shall call loop, or a directed path from a transmitter (i.e. a vertex having no incoming edges) to a receiver (i.e. a vertex having no outgoing edges), which we shall call string. The existence of a loop or a string imposes restrictions on the corresponding representations. From Proposition 4.4 we immediately get:

Proposition 4.7. Let $G$ be a non-degenerate solution containing a loop on $n$ vertices. Then $q^n = 1$.

Lemma 4.8. Let $G$ be a solution. The vertex $i$ is a transmitter if and only if $\tilde d_i = 0$. The vertex $i$ is a receiver if and only if $d_i = 0$.

Proof. Since $D = WW^\dagger$ and $\tilde D = W^\dagger W$, we have
\[ d_i = \sum_k W_{ik}\overline{W_{ik}} = \sum_k |W_{ik}|^2, \qquad
\tilde d_i = \sum_k \overline{W_{ki}}\, W_{ki} = \sum_k |W_{ki}|^2, \]
and it follows that $d_i = 0$ if and only if $W_{ik} = 0$ for all $k$, i.e. $i$ is a receiver. In the same way $\tilde d_i = 0$ if and only if $W_{ki} = 0$ for all $k$, i.e. $i$ is a transmitter.



Next we prove that if $G$ is a solution, then $G$ cannot contain both a string and a loop.

Lemma 4.9. Let $G$ be a non-degenerate connected solution and assume that $G$ has a transmitter or a receiver. Then $G$ has no loop and therefore there exists a string.

Proof. Let us prove the case when a transmitter exists. Let us denote the transmitter by $1 \in V$, and by Lemma 4.8 we have $\vec x_1 = (a, 0)$, for some $a > 0$. Assume that there exists a loop and let $i$ be a vertex in the loop. Since $G$ is connected there exists an integer $i'$ such that $\vec x_i = s^{i'}(\vec x_1)$. Let $l$ be the number of vertices in the loop. From Proposition 4.7 we know that $q^l = 1$, which means that there are at most $l$ different values of $\vec x_k$ in the graph, and all values are assumed by vertices in the loop. In particular this means that there exists a vertex $k$ in the loop such that $\vec x_k = \vec x_1$. But this implies, by Lemma 4.8, that $k$ is a transmitter, which contradicts the fact that $k$ is part of a loop. Hence, if a transmitter exists, there exists no loop and therefore there must exist a string.

The above result suggests introducing the concept of loop representations and string representations, since all representations are associated to graphs that have either a loop or a string. Let us now prove a theorem providing the general structure of the representations.

Theorem 4.10. Let $\phi$ be an $N$-dimensional non-degenerate connected hermitian representation of $C(\mu, \hbar)$ with $\phi(\hat C) = 4c\,\mathbb{1}$. Then there exists a positive integer $k$ dividing $N$, a unitary $N \times N$ matrix $T$, unitary $N/k \times N/k$ matrices $U_0, \ldots, U_{k-1}$ and $\beta, \tilde e_0, \ldots, \tilde e_{k-1} \in \mathbb{R}$ with $\tilde e_1, \ldots, \tilde e_{k-1} > 0$, such that
\[ T\phi(W)T^\dagger = \begin{pmatrix}
0 & \sqrt{\tilde e_1}\,U_1 & 0 & \cdots & 0 \\
0 & 0 & \sqrt{\tilde e_2}\,U_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & \sqrt{\tilde e_{k-1}}\,U_{k-1} \\
\sqrt{\tilde e_0}\,U_0 & 0 & \cdots & 0 & 0
\end{pmatrix}, \tag{4.6} \]
\[ \tilde e_l = \sqrt{c}\,\Bigl(\frac{\mu}{\sqrt c} + \frac{\cos(2l\theta + \beta)}{\cos\theta}\Bigr). \tag{4.7} \]

Proof. Let $U$ be a unitary $N \times N$ matrix such that $UDU^\dagger$ and $U\tilde D U^\dagger$ are diagonal, set $\hat W = U\phi(W)U^\dagger$ and let $G$ be the graph associated to $\hat W$. Define $\{\hat x_0, \ldots, \hat x_{k-1}\}$ to be the set of pairwise different vectors out of the set $\{\vec x_1, \vec x_2, \ldots, \vec x_N\}$, such that $\hat x_{i+1} = s(\hat x_i)$ for $i = 0, \ldots, k-2$ (which is always possible since $G$ is connected), and write $\hat x_i = (e_i, \tilde e_i)$. We note that if $G$ has a transmitter, it must necessarily correspond to the vector $\hat x_0$, in which case $\tilde e_0 = 0$. In particular this means that no vertex corresponding to $\hat x_i$, for $i > 0$, can be a transmitter and hence, by Lemma 4.8, $\tilde e_1, \ldots, \tilde e_{k-1} > 0$. Now, define
\[ V_i = \{ j \in V : \vec x_j = \hat x_i \}, \qquad i = 0, \ldots, k-1, \]
and set $l_i = |V_i|$. Since $\hat x_{i+1} = s(\hat x_i)$, a necessary condition for $(i\,j) \in E$ is that $j = i + 1$. This implies that there exists a permutation $\sigma \in S_N$ (permuting vertices to give the order $V_0, \ldots, V_{k-1}$) such that
\[ W' := \sigma \hat W \sigma^\dagger = \begin{pmatrix}
0 & W_1 & 0 & \cdots & 0 \\
0 & 0 & W_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & W_{k-1} \\
W_0 & 0 & \cdots & 0 & 0
\end{pmatrix}, \]



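where $W_i$ is an $l_{i-1} \times l_i$ matrix (counting indices modulo $k$). In this basis we get
\[ D = \operatorname{diag}(\underbrace{e_0, \ldots, e_0}_{l_0}, \ldots, \underbrace{e_{k-1}, \ldots, e_{k-1}}_{l_{k-1}})
= W'W'^\dagger = \operatorname{diag}\bigl(W_1W_1^\dagger, \ldots, W_{k-1}W_{k-1}^\dagger, W_0W_0^\dagger\bigr), \]
\[ \tilde D = \operatorname{diag}(\underbrace{\tilde e_0, \ldots, \tilde e_0}_{l_0}, \ldots, \underbrace{\tilde e_{k-1}, \ldots, \tilde e_{k-1}}_{l_{k-1}})
= W'^\dagger W' = \operatorname{diag}\bigl(W_0^\dagger W_0, W_1^\dagger W_1, \ldots, W_{k-1}^\dagger W_{k-1}\bigr), \]
which gives $W_iW_i^\dagger = e_{i-1}\mathbb{1}_{l_{i-1}}$ and $W_i^\dagger W_i = \tilde e_i \mathbb{1}_{l_i}$. Since $\hat x_{i+1} = s(\hat x_i)$ we know that $\tilde e_{i+1} = e_i$, which implies that $W_iW_i^\dagger = \tilde e_i \mathbb{1}_{l_{i-1}}$ for $i = 1, \ldots, k-1$. Any matrix satisfying such conditions must be a square matrix, i.e. $l_i = l_{i-1}$ for $i = 1, \ldots, k-1$. Hence, $W_i$ is a square matrix of dimension $N/k$, and there exists a unitary matrix $U_i$ such that $W_i = \sqrt{\tilde e_i}\, U_i$. Moreover, we take $T$ to be the unitary $N \times N$ matrix $\sigma U$. Finally, since every point $\hat x_i = (e_i, \tilde e_i)$ lies on the ellipse, there exists a $\beta_0$ such that $\hat x_0$ corresponds to the point $\sqrt{c}\,(\cos(\beta_0 + \theta), \sin(\beta_0 + \theta))$ in the $(z, \varphi)$-plane, as in Proposition 4.4.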

By defining $\beta = \beta_0 + 2\theta$, we get, since $\hat x_{l+1} = s(\hat x_l)$, that $\tilde e_l = \sqrt{c}\,\Bigl(\frac{\mu}{\sqrt c} + \frac{\cos(2l\theta + \beta)}{\cos\theta}\Bigr)$.

The above theorem proves the structure of any connected representation, but the question of irreducibility still remains. We will now prove that any representation is in fact equivalent to a direct sum of representations where the $U_i$'s are $1 \times 1$ matrices.

Lemma 4.11. Let $W_1$ and $W_2$ be matrices such that
\[ W_1 = \begin{pmatrix}
0 & w_1U_1 & 0 & \cdots & 0 \\
0 & 0 & w_2U_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & w_{n-1}U_{n-1} \\
w_0U_0 & 0 & \cdots & 0 & 0
\end{pmatrix}; \qquad
W_2 = \begin{pmatrix}
0 & w_1\mathbb{1} & 0 & \cdots & 0 \\
0 & 0 & w_2\mathbb{1} & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & w_{n-1}\mathbb{1} \\
w_0V & 0 & \cdots & 0 & 0
\end{pmatrix}, \]
where $U_0, \ldots, U_{n-1}$ are unitary matrices, $w_0, \ldots, w_{n-1} \in \mathbb{C}$ and $V$ is a diagonal matrix such that $SVS^\dagger = U_1U_2 \cdots U_{n-1}U_0$ for some unitary matrix $S$. Then there exists a unitary matrix $P$ such that
\[ W_1 = PW_2P^\dagger \quad\text{and}\quad W_1^\dagger = PW_2^\dagger P^\dagger. \]

Proof. Let us define $P$ as $P = \operatorname{diag}(S, P_1, \ldots, P_{n-1})$ with $P_l = (U_1U_2 \cdots U_l)^\dagger S$ for $l = 1, \ldots, n-1$. Then one easily checks that $W_1 = PW_2P^\dagger$ and $W_1^\dagger = PW_2^\dagger P^\dagger$.

Note that a graph associated to a matrix such as $W_2$ consists of $n$ components, each being either a string ($\tilde e_0 = 0$) or a loop ($\tilde e_0 > 0$). Therefore, we have the following result.



Fig. 2. The constraint ellipse of a Toral representation

Theorem 4.12. Let $\phi$ be a non-degenerate hermitian representation of $C(\mu, \hbar)$. Then $\phi$ is unitarily equivalent to a representation whose associated graph is such that every connected component is either a string or a loop.

The existence of strings or loops will depend on the ratio $\mu/\sqrt{c}$, and therefore we split all connected representations of $C(\mu, \hbar)$ into three subsets, in correspondence with the original surface described by the polynomial $C(x, y, z)$:
(a) $-1 < \mu/\sqrt{c} \le 1$: Spherical representations,
(b) $1 < \mu/\sqrt{c} \le 1/\cos\theta$: Critical toral representations,
(c) $1/\cos\theta < \mu/\sqrt{c}$: Toral representations.

4.2. Toral representations. For $\mu/\sqrt{c} > 1/\cos\theta$ the constraint ellipse lies entirely in the region where both $d$ and $\tilde d$ are strictly positive, e.g. as in Fig. 2. In particular this implies, by Lemma 4.8, that a graph associated to a toral representation cannot have any transmitters or receivers. Hence, it must have a loop, and by Proposition 4.7, there exists an integer $k$ such that $q^k = 1$. We note that the restriction $0 < \theta < \pi/4$ necessarily gives $k \ge 5$.

Theorem 4.13. Assume that $\mu/\sqrt{c} > 1/\cos\theta$ and let $k$ be a positive integer such that $q^k = 1$. Furthermore, let $U_0, \ldots, U_{k-1}$ be unitary matrices of dimension $N$ and let $\beta \in \mathbb{R}$. Then $\phi$ is an $N \cdot k$ dimensional hermitian toral representation of $C(\mu, \hbar)$, with $\phi(\hat C) = 4c\,\mathbb{1}$, if
\[ \phi(W) = \begin{pmatrix}
0 & \sqrt{\tilde e_1}\,U_1 & 0 & \cdots & 0 \\
0 & 0 & \sqrt{\tilde e_2}\,U_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & \sqrt{\tilde e_{k-1}}\,U_{k-1} \\
\sqrt{\tilde e_0}\,U_0 & 0 & \cdots & 0 & 0
\end{pmatrix} \tag{4.8} \]
and
\[ \tilde e_l = \sqrt{c}\,\Bigl(\frac{\mu}{\sqrt c} + \frac{\cos(2l\theta + \beta)}{\cos\theta}\Bigr). \tag{4.9} \]
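As an illustration of Theorem 4.13, one can build the smallest case ($k = 5$, $\theta = \pi/5$, all $U_l$ taken to be $1 \times 1$ phases) and check the relation (3.5) and the Casimir constraint (4.2) numerically. This is only a sketch: the values of `mu`, `c`, `beta` and the phases `alphas` are arbitrary choices satisfying $\mu/\sqrt{c} > 1/\cos\theta$, and the matrix helpers are ours.

```python
import cmath
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def lin(A, B, a=1.0, b=1.0):
    """Entrywise a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def max_abs(A):
    return max(abs(x) for row in A for x in row)

k = 5
theta = math.pi / k                      # q = exp(2 i theta), q^5 = 1
hbar = math.tan(theta)
mu, c, beta = 2.0, 1.0, 0.3              # mu/sqrt(c) = 2 > 1/cos(theta)
alphas = [0.1, 0.7, -0.4, 1.3, 2.0]      # U_l = exp(i alpha_l), arbitrary

# (4.9): the numbers e~_l, all strictly positive in the toral regime.
e = [mu + math.sqrt(c) * math.cos(2 * l * theta + beta) / math.cos(theta)
     for l in range(k)]

# (4.8): row i carries sqrt(e~_{i+1}) U_{i+1} in column i+1 (indices mod k).
W = [[0j] * k for _ in range(k)]
for i in range(k):
    j = (i + 1) % k
    W[i][j] = math.sqrt(e[j]) * cmath.exp(1j * alphas[j])
V = dagger(W)                            # hermiticity: phi(V) = phi(W)^dagger

# Relation (3.5).
W2 = matmul(W, W)
lhs = lin(matmul(W2, V), matmul(V, W2), 1 + hbar ** 2, 1 + hbar ** 2)
rhs = lin(W, matmul(W, matmul(V, W)), 4 * mu * hbar ** 2, 2 * (1 - hbar ** 2))
res_relation = max_abs(lin(lhs, rhs, 1.0, -1.0))

# Casimir constraint (4.2).
I = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
D, Dt = matmul(W, V), matmul(V, W)
S = lin(lin(D, Dt), I, 1.0, -2 * mu)
T = lin(D, Dt, 1.0, -1.0)
casimir = lin(matmul(S, S), matmul(T, T), 1.0, 1 / hbar ** 2)
res_casimir = max_abs(lin(casimir, I, 1.0, -4 * c))

print(min(e), res_relation, res_casimir)
```

Both residuals should be at rounding-error level. Changing the phases `alphas` leaves them unchanged, since only $D$ and $\tilde D$ enter (4.1) and (4.2).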



Definition 4.14. We define a single loop representation $\phi_L$ of $C(\mu, \hbar)$ to be a toral representation, as in Theorem 4.13, with $U_i$ chosen to be $1 \times 1$ matrices and $k$ to be the smallest positive integer such that $q^k = 1$.

As a simple corollary to Theorem 4.12 we obtain

Corollary 4.15. Let $\phi$ be a toral representation of $C(\mu, \hbar)$. Then $\phi$ is unitarily equivalent to a direct sum of single loop representations.

Proposition 4.16. A single loop representation of $C(\mu, \hbar)$ is irreducible.

Proof. Given a single loop representation $\phi_L$ of dimension $n$, it holds that $q^n = 1$, and there exists no $n' < n$ such that $q^{n'} = 1$, by definition. Now, assume that $\phi_L$ is reducible. Then, by Proposition 4.1, $\phi_L$ is equivalent to a direct sum of at least two representations. In particular, this means that there exists a toral representation of $C(\mu, \hbar)$ of dimension $m < n$ which implies, by Proposition 4.7, that there exists an integer $n' < n$ such that $q^{n'} = 1$. But this is impossible by the above argument. Hence, $\phi_L$ is irreducible.

For two loop representations of the same dimension, it is not only the value of the Casimir $\hat C$ that distinguishes them, but there is in fact a whole set of inequivalent representations, parametrized by a complex number.

Definition 4.17. Let $\phi_L$ be a single loop representation in the notation of Theorem 4.13 with $U_l = e^{i\alpha_l}$. We define the index $z(\phi_L)$ as the complex number $z(\phi_L) = \tilde e_0 \tilde e_1 \cdots \tilde e_{k-1}\, e^{i\gamma}$ with $\gamma = \alpha_0 + \alpha_1 + \cdots + \alpha_{k-1}$.

Lemma 4.18. Let $k, n$ be integers such that $\gcd(k, n) = 1$ and define
\[ A_l(\beta) = \cos\Bigl(\beta + \frac{2\pi k l}{n}\Bigr) \]
for $l = 0, 1, \ldots, n-1$. Then there exist permutations $\sigma_+, \sigma_- \in S_n$ such that
\[ A_{\sigma_+(l)}(\beta) = A_l(\beta + 2\pi/n) \quad\text{and}\quad A_{\sigma_-(l)}(\beta) = A_l(2\pi/n - \beta) \]
for $l = 0, 1, \ldots, n-1$.

Proof. Let us prove the existence of $\sigma_+$; the proof that $\sigma_-$ exists is analogous. We want to show that there exists a permutation $\sigma_+$ such that $A_{\sigma_+(l)}(\beta) = A_l(\beta + 2\pi/n)$. Let us make an Ansatz for the permutation; namely, we take it to be a shift with $\sigma_+(l) = l + \delta \pmod n$ for some $\delta \in \mathbb{Z}$. We then have to show that there exists a $\delta$ such that
\[ \cos\Bigl(\beta + \frac{2\pi k(l + \delta)}{n}\Bigr) = \cos\Bigl(\beta + \frac{2\pi(kl + 1)}{n}\Bigr). \]
This holds if for some $m \in \mathbb{Z}$,
\[ \beta + \frac{2\pi k(l + \delta)}{n} = \beta + \frac{2\pi(kl + 1)}{n} + 2\pi m
\quad\Longleftrightarrow\quad k\delta - nm = 1. \]
Now, can we find $\delta$ such that this holds for some $m$? It is an elementary fact in number theory that such an equation has integer solutions for $\delta$ and $m$ if $\gcd(k, n) = 1$. Hence, if we set $\sigma_+(l) = l + \delta \pmod n$, where $\delta$ is such a solution, then the argument above shows that $A_{\sigma_+(l)}(\beta) = A_l(\beta + 2\pi/n)$.
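The shift $\delta$ in the proof of Lemma 4.18 is simply the inverse of $k$ modulo $n$, which makes the permutation easy to compute. The sketch below does this for the illustrative values $k = 3$, $n = 7$ and an arbitrary $\beta$. The formula used for $\sigma_-$, namely $\sigma_-(l) = -\delta - l \pmod n$, is our own reading of the "analogous" case and is checked numerically rather than taken from the text.

```python
import math

def shift_for(k, n):
    """Return delta in {0, ..., n-1} with k*delta = 1 (mod n); needs gcd(k, n) = 1."""
    if math.gcd(k, n) != 1:
        raise ValueError("k and n must be coprime")
    return pow(k, -1, n)          # modular inverse (Python 3.8+)

def A(l, beta, k, n):
    return math.cos(beta + 2 * math.pi * k * l / n)

k, n, beta = 3, 7, 0.37           # illustrative values
delta = shift_for(k, n)

sigma_plus = [(l + delta) % n for l in range(n)]
sigma_minus = [(-delta - l) % n for l in range(n)]   # assumed form, see lead-in

err_plus = max(abs(A(sigma_plus[l], beta, k, n)
                   - A(l, beta + 2 * math.pi / n, k, n)) for l in range(n))
err_minus = max(abs(A(sigma_minus[l], beta, k, n)
                    - A(l, 2 * math.pi / n - beta, k, n)) for l in range(n))

print(delta, sigma_plus, sigma_minus)
print(err_plus, err_minus)
```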



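Lemma 4.19. Let $\theta = \pi k/n$ with $\gcd(k, n) = 1$, and set
\[ f(\beta) = \prod_{l=0}^{n-1}\Bigl(\mu + \frac{\sqrt{c}\,\cos(2l\theta + \beta)}{\cos\theta}\Bigr). \]
Then $f(\beta) = f(\beta + 2\pi/n)$, $f(\beta) = f(2\pi/n - \beta)$ and if $\beta, \beta' \in [0, \pi/n]$ then $\beta \neq \beta'$ implies that $f(\beta) \neq f(\beta')$.

Proof. It follows directly from Lemma 4.18 that $f(\beta) = f(\beta + 2\pi/n) = f(2\pi/n - \beta)$. Since $f$ is periodic, with period $2\pi/n$, it can be expanded in a Fourier series as
\[ f(\beta) = \sum_{l=-\infty}^{\infty} a_l\, e^{2\pi i l\beta/(2\pi/n)} = \sum_{l=-\infty}^{\infty} a_l\, e^{i l n \beta}. \]
Comparing the Fourier series with the original expression for $f$, and introducing $q = e^{2i\theta}$, we get
\[ f(\beta) = \Bigl(\frac{\sqrt c}{\cos\theta}\Bigr)^{n} \prod_{l=0}^{n-1}\Bigl(\frac{\mu\cos\theta}{\sqrt c} + \frac{1}{2}\bigl(q^l e^{i\beta} + q^{-l} e^{-i\beta}\bigr)\Bigr) = \sum_{l=-\infty}^{\infty} a_l\, e^{i n l \beta}. \]
From this equality we deduce that there are only three non-zero coefficients in the Fourier series, namely $a_{-1}$, $a_0$, $a_1$. Comparing both sides, with the overall factor $(\sqrt{c}/\cos\theta)^n$ divided out, we obtain
\[ a_{-1} = \frac{1}{2^n}\, q^{-n(n-1)/2}, \qquad a_1 = \frac{1}{2^n}\, q^{n(n-1)/2}, \]
which implies that
\[ \Bigl(\frac{\sqrt c}{\cos\theta}\Bigr)^{-n} f(\beta) = a_0 + \frac{1}{2^n}\, q^{-n(n-1)/2} e^{-in\beta} + \frac{1}{2^n}\, q^{n(n-1)/2} e^{in\beta}
= a_0 + \Bigl(-\frac{1}{2}\Bigr)^{n-1} \cos n\beta. \]
From this it is clear that $f(\beta) \neq f(\beta')$ when $\beta \neq \beta'$ and $\beta, \beta' \in [0, \pi/n]$.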
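The closed form obtained in the proof of Lemma 4.19 says that $f(\beta)$ depends on $\beta$ only through $(\sqrt{c}/\cos\theta)^n (-\tfrac12)^{n-1}\cos n\beta$. A quick numerical sketch of this statement follows, reading $f$ as the product over $l$ and using the illustrative values $k = 1$, $n = 5$, $\mu = 2$, $c = 1$, which are our assumptions.

```python
import math

def f(beta, mu, c, k, n):
    """Product over l = 0..n-1 of mu + sqrt(c) cos(2 l theta + beta) / cos(theta)."""
    theta = math.pi * k / n
    prod = 1.0
    for l in range(n):
        prod *= mu + math.sqrt(c) * math.cos(2 * l * theta + beta) / math.cos(theta)
    return prod

mu, c, k, n = 2.0, 1.0, 1, 5              # illustrative values
theta = math.pi * k / n
amp = (math.sqrt(c) / math.cos(theta)) ** n * (-0.5) ** (n - 1)

betas = [0.05 * j for j in range(1, 40)]
# f(beta) - f(0) should equal amp * (cos(n beta) - 1): the constant term drops out.
err_form = max(abs(f(b, mu, c, k, n) - f(0.0, mu, c, k, n)
                   - amp * (math.cos(n * b) - 1)) for b in betas)
# The two symmetries stated in the lemma.
err_shift = max(abs(f(b, mu, c, k, n) - f(b + 2 * math.pi / n, mu, c, k, n))
                for b in betas)
err_reflect = max(abs(f(b, mu, c, k, n) - f(2 * math.pi / n - b, mu, c, k, n))
                  for b in betas)

print(amp, err_form, err_shift, err_reflect)
```

Since $\cos n\beta$ is strictly monotone on $[0, \pi/n]$, the injectivity claim of the lemma follows from the same closed form.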

Proposition 4.20. Let $\phi_L$ and $\phi'_L$ be single loop representations of dimension $n$, such that $\phi_L(\hat C) = \phi'_L(\hat C)$. Then $\phi_L$ and $\phi'_L$ are equivalent if and only if $z(\phi_L) = z(\phi'_L)$.

Proof. The characteristic equation of $\phi_L(W)$ is $\lambda^n - z(\phi_L) = 0$. Therefore, a necessary condition for $\phi_L$ and $\phi'_L$ to be equivalent is that $z(\phi_L) = z(\phi'_L)$. Now, to prove the opposite implication, assume that $z(\phi_L) = z(\phi'_L)$. Let us denote the $\beta$ in Theorem 4.13 by $\beta$ and $\beta'$ for $\phi_L$ and $\phi'_L$ respectively. The fact that $z(\phi_L) = z(\phi'_L)$ gives directly $\gamma = \gamma'$, and in the notation of Lemma 4.19, we must have $f(\beta) = f(\beta')$. By the same Lemma, writing $\theta = \pi k/n$, this leaves us with three possibilities: either $\beta' = \beta$, $\beta' = \beta + 2\pi m/n$ or $\beta' = 2\pi m/n - \beta$ for some $m \in \mathbb{Z}$. In all three cases, by Lemma 4.18, there exists a permutation $\sigma$ such that for $W' = \sigma\phi'_L(W)\sigma^\dagger$ it holds that $\tilde e'_l = \tilde e_l$. Then it is easy to construct a diagonal unitary matrix $P$ such that $\phi_L(W) = P\sigma\phi'_L(W)\sigma^\dagger P^\dagger$.

Hence, for a given dimension $n$ and for a given value of the Casimir, such that toral representations exist, the set of inequivalent irreducible representations is parametrized by a complex number $w$ such that $\pi/n \le |w| \le 2\pi/n$. We relate $w$ to a single loop representation by setting $w = \beta e^{i\gamma}$.



4.3. Spherical representations. In contrast to the case of toral representations, we will show that, in a spherical representation, there cannot exist any loops. The intuitive picture is that the part of the ellipse lying in the region where either $d$ or $\tilde d$ is negative is too large to skip by a rotation through the map $s$; see, e.g., Fig. 1.

By Lemma 4.8, we know that the $\vec x$ corresponding to a transmitter or a receiver must lie on the $d$-axis or the $\tilde d$-axis respectively. For this reason, let us calculate the points where the ellipse crosses the axes.

Lemma 4.21. Consider the ellipse $(x + y - 2\mu)^2 + (x - y)^2/\hbar^2 = 4c$. Then $x = 0$ implies $y = a_\pm$ and $y = 0$ implies $x = a_\pm$ with
\[ a_\pm = 2\sin\theta\Bigl(\mu\sin\theta \pm \sqrt{c - \mu^2\cos^2\theta}\Bigr)
= 2\sin^2\theta\Biggl[\mu \pm \sqrt{\mu^2 + \frac{c - \mu^2}{\sin^2\theta}}\Biggr]. \tag{4.10} \]

Lemma 4.22. Let $\vec x = (0, a_+)$, with $a_+$ as in Lemma 4.21. Then $s(\vec x) = (a_-, 0)$.

Lemma 4.23. If $\phi$ is a spherical representation of $C(\mu, \hbar)$ that contains a string on $n$ vertices, then
\[ 0 < (n + 1)\theta \le \pi. \tag{4.11} \]

Proof. Let us denote the vectors corresponding to the vertices in the string by $\vec{x}_1, \ldots, \vec{x}_n$, and define $0 < \beta, \theta_0 < 2\pi$ through $\vec{x}_1 = \vec{x}(\beta)$ and $\vec{x}_n = \vec{x}(\beta + \theta_0)$ in the notation of Proposition 4.4. Since $\vec{x}_n = s^{n-1}\vec{x}(\beta)$, we must have $(n-1)2\theta = \theta_0 + 2\pi k$ for some integer $k \ge 0$. Let us prove that $k = 0$. For a spherical representation, $a_- \le 0$, which implies, by Lemma 4.22, that $s\,\vec{x}(\beta + \theta_0) = (a_-, 0)$ cannot correspond to a vertex of a connected representation. Hence, for any $\alpha \in (0, 2\theta)$, $s\,\vec{x}(\beta + \theta_0 - \alpha)$ cannot correspond to a vertex of a connected representation. This implies that $k = 0$, i.e. the string never crosses the $\tilde{d}$-axis. Therefore $0 < (n-1)2\theta = \theta_0 < 2\pi$. Again, by Lemma 4.22, both vectors $s(0, a_+)$ and $s^2(0, a_+)$ have non-positive components, which implies that $0 < (n+1)2\theta \le 2\pi$. In fact, equality is attained when $a_- = 0$.

Proposition 4.24. Let $\phi$ be a spherical representation of $C(\mu,\hbar)$. Then the associated graph has no loops.

Proof. In the same way as in the proof of Lemma 4.23, we can argue that for $\alpha \in (0, 2\theta)$, $s\,\vec{x}(\beta + \theta_0 - \alpha)$ has a negative component (or equals $(0, 0)$), which implies that it is impossible to have loops.

Hence, we have excluded the possibility of loop representations and can conclude that all spherical representations are string representations. We therefore get the following corollary to Theorem 4.12.

Corollary 4.25. Let $\phi$ be a spherical representation of $C(\mu,\hbar)$. Then $\phi$ is unitarily equivalent to a direct sum of string representations.

Let us now investigate the conditions for the existence of strings.

Lemma 4.26. Let $\vec{x}_1 = (a, 0)$ and $\vec{x}_n = (0, b)$. Then $s^{n-1}(\vec{x}_1) = \vec{x}_n$ if and only if
(i) $q^n = -1$, $\mu = 0$ and $a = b$,
(ii) $q^n = 1$ and $b = -a + 4\mu\sin^2\theta$,
(iii) $q^n \neq \pm 1$, and
\[ a = b = -\frac{2\mu\sin\theta\,\sin(n-1)\theta}{\cos n\theta}. \tag{4.12} \]
In particular, if $a = a_+$ and $q^n = 1$, then $b = a_-$.

Proposition 4.27. Let $\phi$ be a spherical representation of $C(\mu,\hbar)$ containing a string on $n$ vertices. Then
\[ \sqrt{c}\,\cos n\theta + \mu\cos\theta = 0. \tag{4.13} \]

Proof. Assume the existence of a string on $n$ vertices. From Lemma 4.26 we can exclude the possibility that $q^n = 1$, since $a_- < 0$. Hence, either $q^n = -1$ and $\mu = 0$, or $q^n \neq \pm 1$. If $q^n = -1$ and $\mu = 0$ then (4.13) is clearly satisfied. Now, assume $q^n \neq \pm 1$ and $a = b = -2\mu\sin\theta\,\sin(n-1)\theta/\cos n\theta$. Demanding that $(a, 0)$ and $(0, b)$ lie on the ellipse determines $c$ as $c = \mu^2\cos^2\theta/\cos^2 n\theta$. Let us set $\varepsilon = \operatorname{sgn}\mu$. Recalling that $0 < (n+1)\theta \le \pi$, from Lemma 4.23, demanding $a > 0$ makes it necessary that $\operatorname{sgn}(\cos n\theta) = -\varepsilon$, which determines the sign of the root in the statement.

As we have seen, the existence of a loop puts a restriction on $\hbar$ through the relation $q^n = 1$. For the case of strings, the restriction comes out as a restriction on the possible values of the Casimir. In the next theorem we show that the necessary conditions for the existence of spherical representations are in fact sufficient.

Theorem 4.28. Let $n$ be a positive integer, $c$ a positive real number such that $\sqrt{c}\cos n\theta + \mu\cos\theta = 0$ and $0 < (n+1)\theta \le \pi$. Furthermore, let $U_1, \ldots, U_{n-1}$ be $N \times N$ unitary matrices. Then $\phi$ is an $N\cdot n$-dimensional spherical representation of $C(\mu,\hbar)$, with $\phi(\hat{C}) = 4c\,\mathbb{1}$, if
\[ \phi(W) = \begin{pmatrix} 0 & \sqrt{\tilde e_1}\,U_1 & 0 & \cdots & 0 \\ 0 & 0 & \sqrt{\tilde e_2}\,U_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & \sqrt{\tilde e_{n-1}}\,U_{n-1} \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix} \]
and
\[ \tilde e_l = \frac{2\sqrt{c}\,\sin l\theta\,\sin(n-l)\theta}{\cos\theta}. \]

Proof. It is easy to check that the matrix $\phi(W)$ satisfies (4.1), since $s(\tilde e_l, \tilde e_{l-1}) = (\tilde e_{l+1}, \tilde e_l)$. Moreover, it is clear that $\tilde e_l > 0$ since $0 < (n-1)\theta < \pi$. Let us show that it is indeed a spherical representation, i.e. $-1 < \mu/\sqrt{c} \le 1$. Since $\sqrt{c}\cos n\theta + \mu\cos\theta = 0$, we get that
\[ \frac{\mu}{\sqrt{c}} = -\frac{\cos n\theta}{\cos\theta}, \]
and from $0 < (n+1)\theta \le \pi$ we obtain $0 < n\theta \le \pi - \theta$. From this it follows that $|\cos n\theta| \le |\cos\theta|$, which implies that $\phi$ is a spherical representation.

Remark. Let us note that the matrix elements of the diagonal matrix $Z = [W, W^\dagger]/(2\hbar)$ can be written as $z_l = \sqrt{c}\,\sin(n+1-2l)\theta$.
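The formulas of Theorem 4.28 and the Remark can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes that the deformation parameter is ħ = tan θ (the relation quoted in Sect. 4.5), that the diagonal of [W, W†] is ẽ_l − ẽ_{l−1} with ẽ_0 = ẽ_n = 0, and it uses the hypothetical helper names `string_rep_data` and `a_plus`. It also compares ẽ_1 with a_+ from (4.10) and with (4.12).

```python
import math


def string_rep_data(n, theta, mu):
    """Return (c, e, z) for a single string on n vertices.

    c is fixed by sqrt(c)*cos(n*theta) + mu*cos(theta) = 0,
    e[l] = 2*sqrt(c)*sin(l*theta)*sin((n-l)*theta)/cos(theta) for l = 0..n,
    z[l-1] = (e[l] - e[l-1]) / (2*hbar) for l = 1..n, assuming hbar = tan(theta).
    """
    sqrt_c = -mu * math.cos(theta) / math.cos(n * theta)
    if sqrt_c <= 0:
        raise ValueError("sign of cos(n*theta) must be opposite to the sign of mu")
    e = [2 * sqrt_c * math.sin(l * theta) * math.sin((n - l) * theta) / math.cos(theta)
         for l in range(n + 1)]
    hbar = math.tan(theta)  # assumption: hbar = tan(theta)
    z = [(e[l] - e[l - 1]) / (2 * hbar) for l in range(1, n + 1)]
    return sqrt_c ** 2, e, z


def a_plus(theta, mu, c):
    """Larger axis intersection a_+ as in (4.10)."""
    return 2 * math.sin(theta) * (mu * math.sin(theta)
                                  + math.sqrt(c - (mu * math.cos(theta)) ** 2))


n, theta, mu = 6, 0.4, 0.5          # sample values with 0 < (n+1)*theta <= pi
c, e, z = string_rep_data(n, theta, mu)
a_412 = -2 * mu * math.sin(theta) * math.sin((n - 1) * theta) / math.cos(n * theta)
print(abs(e[1] - a_plus(theta, mu, c)), abs(e[1] - a_412))
```

For these sample values the two printed differences should be at the level of rounding error, and the entries of `z` should match √c sin((n+1−2l)θ).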



Fig. 3. The constraint ellipse of a critical toral representation

Definition 4.29. We define a single string representation $\phi_S$ of $C(\mu,\hbar)$ to be a spherical representation, as in Theorem 4.28, with $U_i$ chosen to be $1 \times 1$ matrices.

Proposition 4.30. Any single string representation of $C(\mu,\hbar)$ is irreducible.

Proof. Assume that $\phi_S$ is reducible and has dimension $n$ with $\phi_S(\hat{C}) = 4c\,\mathbb{1}$. Then, by Proposition 4.1, $\phi_S$ is equivalent to a direct sum of at least two representations of dimension $< n$. In particular, this implies that there exists a representation $\phi$ of dimension $m < n$ with $\phi(\hat{C}) = 4c\,\mathbb{1}$. But this is false, since there is at most one integer $l$ such that $\vec{x}(\beta + 2l\theta) = \vec{x}(\beta + \theta_0)$, for $0 < (l+1)2\theta < 2\pi$ and $0 < \theta_0 < 2\pi$.

We conclude that the single string representations are the only irreducible spherical representations. Moreover, two single string representations $\phi_S$ and $\phi'_S$, of the same dimension, are equivalent if and only if $\phi_S(\hat{C}) = \phi'_S(\hat{C})$.

4.4. Critical toral representations. In the case of critical toral representations, the constraint ellipse intersects the positive $d$ (resp. $\tilde{d}$) axis twice, as in Fig. 3. As we will show, there are both loop representations and string representations. String representations can be obtained from Theorem 4.28, by demanding that $1 < \mu/\sqrt{c} \le 1/\cos\theta$ instead of $0 < (n+1)\theta \le \pi$. Let us also give an example of a loop representation.

Proposition 4.31. Assume that $\theta = \pi/N$, $N \ge 5$ odd and $1 < \mu/\sqrt{c} \le 1/\cos\theta$. If we define $\phi$ as in Theorem 4.13 with $\beta = 0$, then $\phi$ is a critical toral representation of $C(\mu,\hbar)$.

Proof. One simply has to check that
\[ \tilde e_l = \sqrt{c}\left(\frac{\mu}{\sqrt{c}} + \frac{\cos(2l\pi/N)}{\cos(\pi/N)}\right) > 0 \]
for $l = 0, \ldots, N-1$. If $N$ is odd then $2\theta l \notin (\pi - \theta, \pi + \theta)$ and $2\theta l \notin (2\pi - \theta, 2\pi)$, which implies that $|\cos 2\theta l| < |\cos\theta|$. Since $\mu/\sqrt{c} > 1$ we conclude that $\tilde e_l > 0$ for $l = 0, \ldots, N-1$.



In contrast to the previous cases, it is, for a given value of the Casimir, possible to have both string representations and loop representations. Namely, if we assume that $q^n = 1$ and let $\vec{x}_1$ correspond to the largest intersection with the $d$-axis, then $s^{n-1}(\vec{x}_1)$ will be the smallest intersection with the $\tilde{d}$-axis (cf. Lemma 4.26), and one can check that all pairs $\vec{x}_i$, for $i = 2, \ldots, n-1$, will be strictly positive.

4.5. Summary of representations. We have shown that every representation can be decomposed into a direct sum of irreducible representations of two types: string and loop representations. String representations correspond to matrices of the form
\[ \phi(W) = \begin{pmatrix} 0 & W_{12} & 0 & \cdots & 0 \\ 0 & 0 & W_{23} & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & W_{N-1,N} \\ 0 & 0 & \cdots & \cdots & 0 \end{pmatrix}, \]
and loop representations to matrices of the form
\[ \phi(W) = \begin{pmatrix} 0 & W_{12} & 0 & \cdots & 0 \\ 0 & 0 & W_{23} & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & W_{N-1,N} \\ W_{N,1} & 0 & \cdots & \cdots & 0 \end{pmatrix}, \]
with $W_{12}, W_{23}, \ldots, W_{N-1,N}, W_{N,1} \neq 0$. Furthermore, existence of representations puts restrictions on the parameter $\hbar$ and the value $4c$ of the Casimir $\hat{C}$. A necessary condition for loop representations to exist is that there is a positive integer $k$ such that $q^k = e^{2ik\theta} = 1$, where $\hbar = \tan\theta$. A necessary condition for string representations to exist is that $\sqrt{c}\cos n\theta + \mu\cos\theta = 0$, for some positive integer $n$. The structure of representations respects the classical geometry as follows: in the region $-1 < \mu/\sqrt{c} \le 1$ we have shown that there are only string representations, and when $\mu/\sqrt{c} > 1/\cos\theta$ there are only loop representations. In the critical region, where $1 < \mu/\sqrt{c} \le 1/\cos\theta$ (classically, one is close to the singular surface), there are in fact representations of both types.

5. Eigenvalue Distribution and Surface Topology

In this section we consider the eigenvalue distribution of the matrix $X$ in the representations obtained in Sect. 4, with the help of numerical computations. The eigenvalue distribution is of interest since in [Shi04] it was shown that the Morse theoretic information of topology manifests itself in certain branching phenomena of the eigenvalue distribution of a single matrix. More precisely, critical points of the Morse function correspond to branching points of the eigenvalue distribution. (The meaning of the phrase “branching phenomena” will be illustrated below by using the eigenvalue distribution of the matrix $X$, plotted in Fig. 4, as an example.) This was achieved by using arguments analogous to those used in the WKB approximation in quantum mechanics, and is part of a more general correspondence between matrix elements and certain geometric quantities computed from the corresponding function on the surface. For a description of this analogy and also for more examples, we refer the reader to [Shi04].


[Figure 4: three panels, for µ = 0.9, 1.1 and 1.3, each showing $\lambda_i$ and $\lambda_{i+1} - \lambda_i$ plotted against $i$.]

Fig. 4. Plot of $\lambda_i$ and $\lambda_{i+1} - \lambda_i$ versus $i$, where $\lambda_1 < \lambda_2 < \ldots < \lambda_N$ are eigenvalues of $X$, for µ = 0.9, 1.1, 1.3. The size of matrices is given by N = 30. Critical values of $x$ are also shown by the horizontal lines

Eigenvalues of $X$ (whose continuum counterpart, $x$, is a Morse function on the surface) in the representations obtained in Sect. 4 do exhibit these branching phenomena, consistent with the results in [Shi04]. In Fig. 4, eigenvalues of $X$, computed numerically, are shown for the cases µ = 0.9, 1.1, 1.3. (We use the normalization convention in which c = 1, so that the transition between sphere and torus occurs at µ = 1. The size of matrices is given by N = 30. For the toral representation, we have taken the additional “phase shift” parameter β to be zero. Using different β’s does not change the plot qualitatively.) The horizontal lines correspond to the critical values of the function $x$ on the surface. The plots directly reflect the Morse theoretic information of topology, with $x$ as the Morse function, for each case µ = 0.9, 1.1, 1.3. For the case µ = 0.9, there are two critical values which are connected by a single branch. Correspondingly, the eigenvalue plot shows that there is only one “sequence” of eigenvalues $\lambda_1 < \lambda_2 < \ldots < \lambda_N$ which increase smoothly. For the cases µ = 1.1 and µ = 1.3, there are four critical values of $x$, say $x_A < x_B < x_C < x_D$. For $x_A < x < x_B$ and $x_C < x < x_D$ the surface consists of a single branch, whereas for $x_B < x < x_C$ the surface consists of two branches. Correspondingly, in the plot of eigenvalues, one sees that the eigenvalues with $x_A < \lambda_i < x_B$ and those with $x_C < \lambda_i < x_D$ each form a single smoothly increasing sequence, whereas the eigenvalues with $x_B < \lambda_i < x_C$ are naturally divided into two sequences, both of which increase smoothly. These branching phenomena can be seen more clearly if one plots the difference between eigenvalues, $\lambda_{i+1} - \lambda_i$, as is shown in the figure. From the figure it can also be seen that by decreasing the parameter µ from 1.3 to 1.1 the part of the surface which has two branches shrinks, consistent with the geometrical picture of the transition between torus and sphere.



6. Comparison with Other Quantization Methods 6.1. The torus. The purpose of this section is to compare matrix representations obtained in Sect. 4, in the torus case, with those one gets using Berezin-Toeplitz quantization. Full details and proofs can be found in [Hof07]. We shall use Theorem 5.1 from the paper [BHSS91] applied to S1 × S1 . Namely n = 1, τ = 1 and we omit the Laplacian terms: ! n n 2 rk+n π π τs r2 2 rs2 + s+n . τk rk + 2 exp − and m 2m τs2 tk k=1

s=1

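We reformulate it for simplicity and to fix notation.

Theorem 6.1. Let $r_1, r_2 \in \mathbb{Z}$ and let $N \ge 5$ be an integer. Then the $N \times N$-matrix corresponding to the phase function $e^{2\pi i(r_1\vartheta + r_2\varphi)}$ is
\[ M\!\left(e^{2\pi i(r_1\vartheta + r_2\varphi)}\right) = \chi^{r_1 r_2}\, S^{-r_1}\, T^{r_2}, \qquad \chi := e^{-\pi i/N}, \]
where $S$ and $T$ are the matrices
\[ S = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}, \qquad T = \operatorname{diag}(1, q, \ldots, q^{N-1}), \quad q := \chi^2 = e^{-2\pi i/N}. \]

Remark 6.2. The map $M$ is not a morphism of algebras. However, $M$ is continuous in the topology of uniform convergence.

To apply this theorem to the torus case, i.e. the regular values of the polynomial function $(x^2 + y^2 - \mu)^2 + z^2 - c$ (with $\mu/\sqrt{c} > 1$), one has to choose an embedding:

Proposition 6.3. Assume that $\mu/\sqrt{c} > 1$. By using the parametrization
\[ \begin{cases} x(\vartheta,\varphi) = \cos(2\pi\vartheta)\sqrt{\sqrt{c}\cos(2\pi\varphi) + \mu}, \\ y(\vartheta,\varphi) = \sin(2\pi\vartheta)\sqrt{\sqrt{c}\cos(2\pi\varphi) + \mu}, \\ z(\vartheta,\varphi) = \sqrt{c}\,\sin(2\pi\varphi), \end{cases} \]
one gets:
\[ M(x) = \frac{S}{2}\sqrt{\mu\mathbb{1} + \frac{\sqrt{c}}{2}\left(\chi^{-1}T + \chi T^{-1}\right)} + \frac{S^{-1}}{2}\sqrt{\mu\mathbb{1} + \frac{\sqrt{c}}{2}\left(\chi T + \chi^{-1}T^{-1}\right)}, \tag{6.1} \]
\[ M(y) = \frac{S}{2i}\sqrt{\mu\mathbb{1} + \frac{\sqrt{c}}{2}\left(\chi^{-1}T + \chi T^{-1}\right)} - \frac{S^{-1}}{2i}\sqrt{\mu\mathbb{1} + \frac{\sqrt{c}}{2}\left(\chi T + \chi^{-1}T^{-1}\right)}, \tag{6.2} \]
\[ M(z) = \frac{\sqrt{c}}{2i}\left(T - T^{-1}\right). \tag{6.3} \]

Proof. The key idea is an expansion in Fourier series of $\sqrt{\mu + \sqrt{c}\cos(2\pi\varphi)}$. We then replace phase functions by matrices $T$ and $S$ according to Theorem 6.1. Square roots of matrices are well defined since the matrices are positive definite.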

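Lemma 6.4. Let $D = \operatorname{diag}(d_1, \ldots, d_N)$ be a diagonal $N \times N$-matrix, then:
\[ S^{-1} D S = \operatorname{diag}(d_N, d_1, \ldots, d_{N-1}) \quad\text{and}\quad S D S^{-1} = \operatorname{diag}(d_2, \ldots, d_N, d_1). \]

Let us denote:
\[ D := \sqrt{\mu\mathbb{1} + \frac{\sqrt{c}}{2}\left(\chi T + \chi^{-1}T^{-1}\right)} \quad\text{and}\quad \tilde{D} := \sqrt{\mu\mathbb{1} + \frac{\sqrt{c}}{2}\left(\chi^{-1}T + \chi T^{-1}\right)}. \]
Then one can write (6.1) and (6.2) as:
\[ M(x) = \frac{1}{2}\left(S\tilde{D} + S^{-1}D\right) \quad\text{and}\quad M(y) = -\frac{i}{2}\left(S\tilde{D} - S^{-1}D\right). \]
It is easily seen that the matrices $D$ and $\tilde{D}$ are diagonal:
\[ D = \operatorname{diag}\left(\sqrt{\mu + \sqrt{c}\cos\left(\frac{2\pi l}{N} + \frac{\pi}{N}\right)}\right)_{l=1,\ldots,N}, \qquad \tilde{D} = \operatorname{diag}\left(\sqrt{\mu + \sqrt{c}\cos\left(\frac{2\pi l}{N} - \frac{\pi}{N}\right)}\right)_{l=1,\ldots,N}. \]
By Lemma 6.4,
\[ S\tilde{D} = S\tilde{D}S^{-1}S = \operatorname{diag}\left(\sqrt{\mu + \sqrt{c}\cos\left(\frac{2\pi l}{N} + \frac{\pi}{N}\right)}\right)_{l=1,\ldots,N} \times S = DS. \]
As a consequence, $M(x)$ and $M(y)$ can be written as:
\[ M(x) = \frac{1}{2}\left(DS + S^{-1}D\right) \quad\text{and}\quad M(y) = -\frac{i}{2}\left(DS - S^{-1}D\right). \]

Theorem 6.5. The matrices $M(x)$, $M(y)$ and $M(z)$ are:
\[ M(x) = \frac{1}{2}\begin{pmatrix} 0 & x_1 & 0 & \cdots & 0 & x_N \\ x_1 & 0 & x_2 & \cdots & 0 & 0 \\ 0 & x_2 & 0 & \ddots & \vdots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & 0 & \cdots & \ddots & 0 & x_{N-1} \\ x_N & 0 & \cdots & 0 & x_{N-1} & 0 \end{pmatrix}, \]
\[ M(y) = -\frac{i}{2}\begin{pmatrix} 0 & y_1 & 0 & \cdots & 0 & -y_N \\ -y_1 & 0 & y_2 & \cdots & 0 & 0 \\ 0 & -y_2 & 0 & \ddots & \vdots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \ddots & 0 \\ 0 & 0 & \cdots & \ddots & 0 & y_{N-1} \\ y_N & 0 & \cdots & 0 & -y_{N-1} & 0 \end{pmatrix}, \]
\[ M(z) = \operatorname{diag}(z_1, z_2, \ldots, z_N), \]
where the $x_l$’s, $y_l$’s and $z_l$’s (for $l = 1, \ldots, N$) are:
\[ x_l = y_l = \sqrt{\mu + \sqrt{c}\cos\left(\frac{2\pi l}{N} + \frac{\pi}{N}\right)} \quad\text{and}\quad z_l = -\sqrt{c}\,\sin\frac{2\pi l}{N}. \]

These matrices satisfy the following relations:

Theorem 6.6. Let $\mu, c \in \mathbb{R}$ and $N \ge 5$ be such that $\mu/\sqrt{c} > 1$. If one assumes $\hbar = \tan(\theta)$ with $\theta := \pi/N$, then:
\[ [\tilde{X}, \tilde{Y}] = i\hbar\,(\cos(\theta)\tilde{Z}), \]
\[ [\tilde{Y}, (\cos(\theta)\tilde{Z})] = i\hbar\left(\tilde{X}(\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1}) + (\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1})\tilde{X}\right), \]
\[ [(\cos(\theta)\tilde{Z}), \tilde{X}] = i\hbar\left(\tilde{Y}(\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1}) + (\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1})\tilde{Y}\right), \]
\[ (\tilde{X}^2 + \tilde{Y}^2 - \mu\mathbb{1})^2 + (\cos(\theta)\tilde{Z})^2 = (\sqrt{c}\cos(\theta))^2\,\mathbb{1}, \]
where $\tilde{X} := M(x)$, $\tilde{Y} := M(y)$ and $\tilde{Z} := M(z)$ are the matrices obtained in Theorem 6.5.

Proof. This is a direct computation on matrices.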
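The "direct computation" behind Theorem 6.6 can be reproduced numerically. The sketch below is an illustration only: it builds the matrices of Theorem 6.5 through W = DS (so that X̃ = (W + W†)/2 and Ỹ = (W − W†)/(2i)), assumes that the deformation parameter in the relations is ħ = tan θ with θ = π/N, and returns the largest entry of each of the four residuals. The helper names (`lin`, `comm`, `torus_residuals`) are hypothetical.

```python
import math


def zeros(n):
    return [[0j] * n for _ in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lin(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    n = len(A)
    return [[a * A[i][j] + b * B[i][j] for j in range(n)] for i in range(n)]


def comm(A, B):
    return lin(1, matmul(A, B), -1, matmul(B, A))


def max_abs(A):
    return max(abs(x) for row in A for x in row)


def torus_residuals(N, mu, c):
    """Largest entries of (lhs - rhs) for the four relations of Theorem 6.6."""
    theta = math.pi / N
    hbar = math.tan(theta)  # assumption: hbar = tan(theta)
    sc = math.sqrt(c)
    d = [math.sqrt(mu + sc * math.cos(2 * math.pi * l / N + theta))
         for l in range(1, N + 1)]
    W = zeros(N)  # W = D S: entry (l, l+1 mod N) equals d_l
    for l in range(N):
        W[l][(l + 1) % N] = d[l]
    Wd = [[W[j][i].conjugate() for j in range(N)] for i in range(N)]
    X = lin(0.5, W, 0.5, Wd)
    Y = lin(-0.5j, W, 0.5j, Wd)  # (W - W^dagger) / (2i)
    Zc = zeros(N)  # cos(theta) * M(z)
    one = zeros(N)
    for l in range(1, N + 1):
        Zc[l - 1][l - 1] = -math.cos(theta) * sc * math.sin(2 * math.pi * l / N)
        one[l - 1][l - 1] = 1
    P = lin(1, lin(1, matmul(X, X), 1, matmul(Y, Y)), -mu, one)
    r1 = lin(1, comm(X, Y), -1j * hbar, Zc)
    r2 = lin(1, comm(Y, Zc), -1j * hbar, lin(1, matmul(X, P), 1, matmul(P, X)))
    r3 = lin(1, comm(Zc, X), -1j * hbar, lin(1, matmul(Y, P), 1, matmul(P, Y)))
    r4 = lin(1, lin(1, matmul(P, P), 1, matmul(Zc, Zc)),
             -c * math.cos(theta) ** 2, one)
    return [max_abs(r) for r in (r1, r2, r3, r4)]


print(torus_residuals(7, 1.5, 1.0))
```

The sample values N = 7, µ = 1.5, c = 1 are arbitrary choices with µ/√c > 1; all four residuals should vanish up to rounding.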

This proves that the matrices $X = \tilde{X}$, $Y = \tilde{Y}$ and $Z = \cos(\theta)\tilde{Z}$ satisfy exactly the relations (3.2), (3.3) and (3.4). The Casimir identity is satisfied with $c$ replaced by $c\cos^2\theta$. Note that $\cos\theta$ converges to 1 as $N$ goes to infinity.

6.2. The sphere. Let us start by constructing a parametrization for the deformed sphere described by (3.1) with $-\sqrt{c} < \mu < \sqrt{c}$. Recall the well-known parametrization of the sphere $\xi_1 = \sin\vartheta\cos\varphi$, $\xi_2 = \sin\vartheta\sin\varphi$, $\xi_3 = \cos\vartheta$. We would like to keep the axial symmetry and therefore we make the following Ansatz outside the poles for a map $(x, y, z) : S^2 \to \Sigma$,
\[ x = f(\vartheta)\cos\varphi, \qquad y = f(\vartheta)\sin\varphi. \]
Keeping the relation $\{x, y\}_{S^2} = z$, using the (round sphere) Poisson bracket,
\[ \{f, g\}_{S^2} = \frac{\lambda}{\sin\vartheta}\left(\partial_\vartheta f\,\partial_\varphi g - \partial_\varphi f\,\partial_\vartheta g\right) \]
(where $\lambda > 0$ is an arbitrary parameter scaling the round sphere volume as $4\pi/\lambda$), yields
\[ z = \frac{\lambda}{\sin\vartheta}\, f'(\vartheta) f(\vartheta). \]
Demanding that these functions satisfy the constraint $(x^2 + y^2 - \mu)^2 + z^2 - c = 0$ gives the following differential equation for $f(\vartheta)$:
\[ \left(f^2(\vartheta) - \mu\right)^2 + \frac{\lambda^2}{\sin^2\vartheta}\, f'(\vartheta)^2 f(\vartheta)^2 - c = 0, \]
which is solved by
\[ f(\vartheta) = \sqrt{\mu + \sqrt{c}\cos\left(\frac{2}{\lambda}\cos\vartheta + B\right)}, \]
for arbitrary $B$. As $\vartheta$ goes to 0 or $\pi$ we need that $x$ and $y$ go to zero; there are two ways of achieving this (giving conditions on $\lambda$ and $B$) but only the following leads to an embedding:

Proposition 6.7. The map $S^2 \to \mathbb{R}^3$ defined by
\[ \vartheta \in (0, \pi): \quad \begin{cases} x = \cos\varphi\,\sqrt{\mu + \sqrt{c}\cos\left(\frac{2}{\lambda}\cos\vartheta\right)}, \\ y = \sin\varphi\,\sqrt{\mu + \sqrt{c}\cos\left(\frac{2}{\lambda}\cos\vartheta\right)}, \\ z = \sqrt{c}\,\sin\left(\frac{2}{\lambda}\cos\vartheta\right), \end{cases} \]
\[ \vartheta = 0: \quad x = y = 0,\ z = \sqrt{c}\sin(2/\lambda); \qquad \vartheta = \pi: \quad x = y = 0,\ z = -\sqrt{c}\sin(2/\lambda), \]
with $\mu/\sqrt{c} = -\cos(2/\lambda)$, $-1 < \mu/\sqrt{c} < 1$, and $0 < 2/\lambda < \pi$, is an embedding of the (round) sphere into $\mathbb{R}^3$ whose image coincides with $\Sigma$. Moreover, it holds that $\{x, y\}_{S^2} = z$, $\{y, z\}_{S^2} = 2x(x^2 + y^2 - \mu)$ and $\{z, x\}_{S^2} = 2y(x^2 + y^2 - \mu)$. The embedding is therefore a Poisson map and hence volume preserving (where $\Sigma$ is equipped with the volume defined by the inverse of the restriction of the C-bracket (2.1)).

Proof. Outside the poles all the assertions are computed in a straightforward manner. Around the poles we can express $x$, $y$ and $z$ by the local round sphere charts $\xi_1$ and $\xi_2$ to see that the map is a smooth embedding.

Let us introduce the hermitian $n \times n$ matrices $S_1, S_2, S_3$, whose nonzero matrix elements are
\[ (S_1)_{k,k+1} = \tfrac{1}{2}\sqrt{k(n-k)} = (S_1)_{k+1,k}, \quad k = 1, \ldots, n-1, \]
\[ (S_2)_{k,k+1} = -\tfrac{i}{2}\sqrt{k(n-k)} = -(S_2)_{k+1,k}, \quad k = 1, \ldots, n-1, \]
\[ (S_3)_{k,k} = \tfrac{1}{2}(n + 1 - 2k), \quad k = 1, \ldots, n, \]
satisfying $[S_a, S_b] = i\epsilon_{abc}S_c$ and $S_1^2 + S_2^2 + S_3^2 = \frac{n^2-1}{4}\mathbb{1}$. We then define rescaled matrices $X_a = A(n)S_a$, for some function $A(n)$. In analogy with the case of the torus, we would like to compare the Berezin-Toeplitz quantization of the embedding functions with the results obtained in Sect. 4.3. Quantizing the function $\cos\vartheta$ will (up to scaling) give the diagonal matrix $S_3$. However, a function of $\cos\vartheta$ is in general not mapped to the same function of $S_3$, and one can numerically check that the quantization of $z(\vartheta)$ (in Proposition 6.7) is not equal to the matrix $Z$ obtained in Sect. 4.3. However, they agree up to corrections of order $1/n$.
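The matrix elements of $S_1, S_2, S_3$ given above can be checked against the stated commutation relations and Casimir value. The following sketch is only an illustration; the helper names (`spin_matrices`, `su2_residuals`) are hypothetical, and the matrices are stored as plain nested lists with 0-based indices.

```python
import math


def zeros(n):
    return [[0j] * n for _ in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lin(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    n = len(A)
    return [[a * A[i][j] + b * B[i][j] for j in range(n)] for i in range(n)]


def comm(A, B):
    return lin(1, matmul(A, B), -1, matmul(B, A))


def max_abs(A):
    return max(abs(x) for row in A for x in row)


def spin_matrices(n):
    """S1, S2, S3 with the nonzero matrix elements listed in the text."""
    S1, S2, S3 = zeros(n), zeros(n), zeros(n)
    for k in range(1, n):
        r = 0.5 * math.sqrt(k * (n - k))
        S1[k - 1][k] = S1[k][k - 1] = r
        S2[k - 1][k] = -1j * r
        S2[k][k - 1] = 1j * r
    for k in range(1, n + 1):
        S3[k - 1][k - 1] = 0.5 * (n + 1 - 2 * k)
    return S1, S2, S3


def su2_residuals(n):
    """Residuals of [S_a, S_b] = i eps_abc S_c and of the Casimir identity."""
    S1, S2, S3 = spin_matrices(n)
    one = zeros(n)
    for k in range(n):
        one[k][k] = 1
    casimir = lin(1, lin(1, matmul(S1, S1), 1, matmul(S2, S2)), 1, matmul(S3, S3))
    return [
        max_abs(lin(1, comm(S1, S2), -1j, S3)),
        max_abs(lin(1, comm(S2, S3), -1j, S1)),
        max_abs(lin(1, comm(S3, S1), -1j, S2)),
        max_abs(lin(1, casimir, -(n * n - 1) / 4, one)),
    ]


print(su2_residuals(6))
```

For any n ≥ 2 the four residuals should vanish up to rounding; n = 6 is an arbitrary sample size.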


In [GH82] the following prescription for replacing functions on $S^2$ by matrices was introduced: smooth functions on $S^2$ are expanded in terms of the spherical harmonics $Y_{lm}(\vartheta, \varphi)$, resp. $\mathcal{Y}_{lm} = r^l Y_{lm}$, written as
\[ \mathcal{Y}_{lm}(x_1, x_2, x_3) = \sum_{a_1, \ldots, a_l = 1}^{3} c^{(m)}_{a_1\cdots a_l}\, x_{a_1}\cdots x_{a_l}, \]
($x_1 = r\sin\vartheta\cos\varphi$, $x_2 = r\sin\vartheta\sin\varphi$, $x_3 = r\cos\vartheta$) with $c^{(m)}_{a_1\cdots a_l}$ chosen to be totally symmetric with respect to the lower indices. A function is then mapped to an $n \times n$ matrix via
\[ T^{(n)}\, Y_{lm} = B(n, l)\sum_{a_1, \ldots, a_l = 1}^{3} c^{(m)}_{a_1\cdots a_l}\, X_{a_1}\cdots X_{a_l}, \]
where
\[ B(n, l) = \sqrt{4\pi}\,\sqrt{\frac{(n^2-1)^l\,(n-1-l)!}{(n+l)!}} \]
and $X_1, X_2, X_3$ are defined as above, with the choice $A(n) = 2/\sqrt{n^2-1}$. Disregarding multiplication by an overall $n$-dependent function, the map $T^{(n)}$ will act on the basic functions in the following way:
\[ T^{(n)}(x_1) = T^{(n)}(\sin\vartheta\cos\varphi) \sim S_1, \quad T^{(n)}(x_2) = T^{(n)}(\sin\vartheta\sin\varphi) \sim S_2, \quad T^{(n)}(x_3) = T^{(n)}(\cos\vartheta) \sim S_3. \]

We will now show that, for some scaling of $X_1, X_2, X_3$, the following hermitian matrices:
\[ \hat{X} = \frac{1}{2}\sqrt{\left(\mu + \sqrt{c}\cos\left(\tfrac{2}{\lambda}X_3\right)\right)\left(1 - X_3^2\right)^{-1}}\; X_1 + \frac{1}{2}\, X_1\sqrt{\left(\mu + \sqrt{c}\cos\left(\tfrac{2}{\lambda}X_3\right)\right)\left(1 - X_3^2\right)^{-1}}, \]
\[ \hat{Y} = \frac{1}{2}\sqrt{\left(\mu + \sqrt{c}\cos\left(\tfrac{2}{\lambda}X_3\right)\right)\left(1 - X_3^2\right)^{-1}}\; X_2 + \frac{1}{2}\, X_2\sqrt{\left(\mu + \sqrt{c}\cos\left(\tfrac{2}{\lambda}X_3\right)\right)\left(1 - X_3^2\right)^{-1}}, \]
\[ \hat{Z} = \sqrt{c}\,\sin\left(\tfrac{2}{\lambda}X_3\right), \]
being noncommutative analogues of the embedding functions in Proposition 6.7, agree with the results obtained in Sect. 4.3 up to corrections of order $1/n$; moreover, the matrix $\hat{Z}$ will have an exact agreement. (Actually, all orderings we tried for $x + iy$ gave matrices with nonzero elements only on the first off-diagonal; furthermore, they also agreed with our results up to order $1/n$, as we have seen from numerical computations.) For a spherical representation, it holds that
\[ Z_{ll} = \frac{1}{2\hbar}[W, W^\dagger]_{ll} = \sqrt{c}\,\sin\big((n + 1 - 2l)\theta\big), \]
with $\mu\cos\theta + \sqrt{c}\cos n\theta = 0$. Furthermore, the matrix elements of $\hat{Z}$ are given by
\[ \hat{Z}_{ll} = \sqrt{c}\,\sin\left((n + 1 - 2l)\frac{A(n)}{\lambda}\right). \]
As one can easily check, the relation $\mu\cos\theta + \sqrt{c}\cos n\theta = 0$ defines a unique smooth function $\theta(n)$ such that $0 < \theta(n) < \pi/(n+1)$. Defining $A(n) = \lambda\theta(n)$ gives directly that $Z_{ll} = \hat{Z}_{ll}$, and one can show that the matrices $\hat{X}, \hat{Y}$ will agree with the matrices $X, Y$ up to corrections of order $1/n$. The main ingredient is the following lemma.

Lemma 6.8.
\[ \cos\left(\tfrac{2}{\lambda}X_3\right)_{l,l} + \frac{\mu}{\sqrt{c}} = \frac{2\sin(n-l)\theta\,\sin l\theta}{\cos\theta} + O\!\left(\frac{1}{n}\right). \]

Proof. Setting $\theta = A(n)/\lambda$, we can rewrite
\[ \cos\left(\tfrac{2}{\lambda}X_3\right)_{l,l} = \cos(n-l)\theta\,\cos l\theta + \sin(n-l)\theta\,\sin l\theta + (\cos\theta - 1)\cos(n-2l)\theta - \sin\theta\,\sin(n-2l)\theta. \]
Since $-\cos(2/\lambda) = \mu/\sqrt{c} = -\cos n\theta/\cos\theta$, it follows that $\theta(n) = 2/(\lambda n) + O(1/n^2)$ and we conclude that
\[ \cos\left(\tfrac{2}{\lambda}X_3\right)_{l,l} = \cos(n-l)\theta\,\cos l\theta + \sin(n-l)\theta\,\sin l\theta + O\!\left(\frac{1}{n}\right). \]
Since $\mu = -\sqrt{c}\cos(2/\lambda)$, we have $-\cos n\theta = -\cos(2/\lambda + O(1/n)) = \mu/\sqrt{c} + O(1/n)$, which implies that
\[ \cos\left(\tfrac{2}{\lambda}X_3\right)_{l,l} + \frac{\mu}{\sqrt{c}} = 2\sin(n-l)\theta\,\sin l\theta + O\!\left(\frac{1}{n}\right), \]
from which the statement of the lemma follows.
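The size of the correction term in Lemma 6.8 can be explored numerically. The sketch below is an illustration under stated assumptions: θ(n) is computed by plain bisection on (0, π/(n+1)) (a method chosen here for simplicity, valid when |µ| < √c), the diagonal entry of cos((2/λ)X_3) is taken to be cos((n+1−2l)θ) as in the proof, and the function names are hypothetical.

```python
import math


def theta_n(n, mu, c, iters=200):
    """Root of mu*cos(t) + sqrt(c)*cos(n*t) in (0, pi/(n+1)), by bisection.

    For |mu| < sqrt(c) the function is positive at t = 0 and negative at
    t = pi/(n+1), so the bracket always contains a sign change.
    """
    sc = math.sqrt(c)
    lo, hi = 0.0, math.pi / (n + 1)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mu * math.cos(mid) + sc * math.cos(n * mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lemma_gap(n, mu, c):
    """Return (theta, largest deviation over l) for the identity of Lemma 6.8."""
    t = theta_n(n, mu, c)
    ratio = mu / math.sqrt(c)
    gap = max(
        abs(math.cos((n + 1 - 2 * l) * t) + ratio
            - 2 * math.sin((n - l) * t) * math.sin(l * t) / math.cos(t))
        for l in range(1, n + 1)
    )
    return t, gap


for n in (10, 100, 1000):
    t, gap = lemma_gap(n, 0.3, 1.0)
    print(n, n * t, gap)
```

For the sample values µ = 0.3, c = 1 the product nθ(n) should approach 2/λ = arccos(−µ/√c), and the deviation should shrink roughly like 1/n.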

Acknowledgement. We would like to thank the Swedish Research Council, the Royal Institute of Technology, the Knut and Alice Wallenberg foundation, the Japan Society for the Promotion of Science, the Albert Einstein Institute, the Sonderforschungsbereich “Raum-Zeit-Materie”, the IHES, the ESF Scientific Programme MISGAM, and the Marie Curie Research Training Network ENIGMA for financial support resp. hospitality. In addition, we are thankful for the constructive remarks of the referees.

References

[ASS06] Assem, I., Simson, D., Skowronski, A.: Elements of the Representation Theory of Associative Algebras. LMS Student Texts 65, Cambridge: Cambridge University Press, 2006
[Ber78] Bergman, G.M.: The diamond lemma for ring theory. Adv. Math. 29, 178–218 (1978)
[BHSS91] Bordemann, M., Hoppe, J., Schaller, P., Schlichenmaier, M.: gl(∞) and geometric quantization. Commun. Math. Phys. 138, 209–244 (1991)
[BKL05] Bak, D., Kim, S., Lee, K.: All higher genus BPS membranes in the plane wave background. JHEP 0506, 035 (2005)
[BMS94] Bordemann, M., Meinrenken, E., Schlichenmaier, M.: Toeplitz Quant. of Kähler manifolds and gl(N), N → ∞ limits. Commun. Math. Phys. 165, 281–296 (1994)
[FFZ89] Fairlie, D., Fletcher, P., Zachos, C.: Trigonometric structure constants for new infinite algebras. Phys. Lett. B 218, 203 (1989)
[GH82] Hoppe, J.: Quantum theory of a massless relativistic surface. Ph.D. Thesis (Advisor: J. Goldstone), MIT. http://www.aei.mpg.de/~hoppe/, 1982
[FH94] Harary, F.: Graph Theory. Reading MA: Addison-Wesley, 1969
[Hir76] Hirsch, M.W.: Differential topology. New-York: Springer, 1976
[Hof02] Hofer, L.: Surfaces de Riemann compactes. Master’s thesis, Université de Haute-Alsace Mulhouse, http://laurent.hofer.free.fr/data/master_hofer_2002.pdf, 2002
[Hof07] Hofer, L.: Aspects algébriques et quantification des surfaces minimales. Ph.D. thesis, Université de Haute-Alsace de Mulhouse, http://laurent.hofer.free.fr/data/these_hofer_2007.pdf, June 2007
[Hop89/88] Hoppe, J.: Diffeomorphism groups, quantization, and SU(∞). Int. J. of Mod. Phys. A 4(19), 5235–5248 (1989); Diff_A T^2, and the curvature of some infinite dimensional manifolds. Phys. Lett. B 215, 706–710 (1988)
[KL92] Klimek, S., Lesniewski, A.: Quantum Riemann surfaces I. The unit disc. Commun. Math. Phys. 146, 103–122 (1992); Quantum Riemann surfaces II. The discrete series. Lett. Math. Phys. 24, 125–139 (1992)
[Mad92] Madore, J.: The fuzzy sphere. Class. Quant. Grav. 9, 69–88 (1992)
[NN99] Natsume, T., Nest, R.: Topological approach to quantum surfaces. Commun. Math. Phys. 202, 65–87 (1999)
[Now97] Nowak, C.: Über Sternprodukte auf nichtregulären Poissonmannigfaltigkeiten (Ph.D. Thesis, Freiburg University); Star Products for integrable Poisson Structures on R^3. http://arxiv.org/abs/q-alg/9708012, 1997
[Shi04] Shimada, H.: Membrane topology and matrix regularization. Nucl. Phys. B 685, 297–320 (2004)

Communicated by Y. Kawahigashi




Stability, Convergence to Self-Similarity and Elastic Limit for the Boltzmann Equation for Inelastic Hard Spheres

S. Mischler, C. Mouhot

CEREMADE, Université Paris IX-Dauphine, Place du Maréchal de Lattre de Tassigny, 75775 Paris, France. E-mail: [email protected]; [email protected]

Received: 12 February 2008 / Accepted: 5 January 2009
Published online: 4 March 2009 – © Springer-Verlag 2009

Abstract: We consider the spatially homogeneous Boltzmann equation for inelastic hard spheres, in the framework of so-called constant normal restitution coefficients $\alpha \in [0, 1]$. In the physical regime of a small inelasticity (that is $\alpha \in [\alpha_*, 1)$ for some constructive $\alpha_* \in [0, 1)$) we prove uniqueness of the self-similar profile for given values of the restitution coefficient $\alpha \in [\alpha_*, 1)$, the mass and the momentum; therefore we deduce the uniqueness of the self-similar solution (up to a time translation). Moreover, if the initial datum lies in $L^1_3$, and under some smallness condition on $(1 - \alpha_*)$ depending on the mass, energy and $L^1_3$ norm of this initial datum, we prove time asymptotic convergence (with polynomial rate) of the solution towards the self-similar solution (the so-called homogeneous cooling state). These uniqueness, stability and convergence results are expressed in the self-similar variables and then translate into corresponding results for the original Boltzmann equation. The proofs are based on the identification of a suitable elastic limit rescaling, and the construction of a smooth path of self-similar profiles connecting to a particular Maxwellian equilibrium in the elastic limit, together with tools from perturbative theory of linear operators. Some universal quantities, such as the “quasi-elastic self-similar temperature” and the rate of convergence towards self-similarity at first order in terms of $(1 - \alpha)$, are obtained from our study. These results provide a positive answer and a mathematical proof of the Ernst-Brito conjecture [16] in the case of inelastic hard spheres with small inelasticity.

1. Introduction and Main Results

1.1. The model. We consider the spatially homogeneous Boltzmann equation for hard spheres undergoing inelastic collisions with a constant normal restitution coefficient $\alpha \in [0, 1)$ (see [8,17,23,24]). More precisely, the gas is described by the distribution density of particles $f = f_t = f(t, v) \ge 0$ with velocity $v \in \mathbb{R}^N$ ($N \ge 2$) at time $t \ge 0$ and it satisfies the evolution equation
\[ \frac{\partial f}{\partial t} = Q_\alpha(f, f) \quad\text{in } (0, +\infty) \times \mathbb{R}^N, \tag{1.1} \]
\[ f(0, \cdot) = f_{\rm in} \quad\text{in } \mathbb{R}^N. \tag{1.2} \]

The quadratic collision operator $Q_\alpha(f, f)$ models the interaction of particles by means of inelastic binary collisions (preserving mass and momentum but dissipating kinetic energy). We define the collision operator by its action on test functions, or observables. Taking $\psi = \psi(v)$ to be a suitable regular test function, we introduce the following weak formulation of the collision operator:
\[ \int_{\mathbb{R}^N} Q_\alpha(g, f)\,\psi\,dv = \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} b\,|u|\,g_*\,f\,(\psi' - \psi)\,d\sigma\,dv\,dv_*, \tag{1.3} \]
where we use the shorthand notations $f := f(v)$, $g_* := g(v_*)$, $\psi' := \psi(v')$, etc. Here and below $u = v - v_*$ denotes the relative velocity and $v'$, $v'_*$ denote the possible post-collisional velocities. These post-collisional velocities are functions of $v, v_*, \sigma$ depending on the collision mechanism, and therefore, in our case, depending on $\alpha$. They are defined by
\[ v' = \frac{w}{2} + \frac{u'}{2}, \qquad v'_* = \frac{w}{2} - \frac{u'}{2}, \tag{1.4} \]
with
\[ w = v + v_*, \qquad u' = \frac{1-\alpha}{2}\,u + \frac{1+\alpha}{2}\,|u|\,\sigma. \]
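The collision rule (1.4) can be illustrated with a short numerical sketch. The code below is an illustration only, with hypothetical function names: it computes the post-collisional velocities for sample data in dimension N = 3 and compares the change in kinetic energy with the dissipation identity −(1 − α²)/4 · (1 − û·σ)|u|² quoted later in this section, assuming σ is a unit vector.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def post_collision(v, v_star, sigma, alpha):
    """Post-collisional velocities (v', v'_*) following (1.4).

    sigma is assumed to be a unit vector; alpha is the restitution coefficient.
    """
    w = [a + b for a, b in zip(v, v_star)]
    u = [a - b for a, b in zip(v, v_star)]
    norm_u = math.sqrt(dot(u, u))
    u_prime = [0.5 * (1 - alpha) * ui + 0.5 * (1 + alpha) * norm_u * si
               for ui, si in zip(u, sigma)]
    v_prime = [0.5 * (wi + upi) for wi, upi in zip(w, u_prime)]
    v_star_prime = [0.5 * (wi - upi) for wi, upi in zip(w, u_prime)]
    return v_prime, v_star_prime


def energy_change(v, v_star, sigma, alpha):
    """|v'|^2 + |v'_*|^2 - |v|^2 - |v_*|^2 for one collision."""
    vp, vsp = post_collision(v, v_star, sigma, alpha)
    return dot(vp, vp) + dot(vsp, vsp) - dot(v, v) - dot(v_star, v_star)


def predicted_energy_change(v, v_star, sigma, alpha):
    """-(1 - alpha^2)/4 * (1 - u_hat . sigma) * |u|^2."""
    u = [a - b for a, b in zip(v, v_star)]
    norm_u = math.sqrt(dot(u, u))
    return -0.25 * (1 - alpha ** 2) * (1 - dot(u, sigma) / norm_u) * norm_u ** 2


v, v_star = [1.0, -0.5, 2.0], [-0.3, 0.7, 0.1]
sigma = [2 / 3, -1 / 3, 2 / 3]  # unit vector
print(energy_change(v, v_star, sigma, 0.8), predicted_energy_change(v, v_star, sigma, 0.8))
```

Momentum $v' + v'_* = v + v_*$ is conserved by construction, and for α = 1 the energy change vanishes, which is the elastic case.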

We also introduce the notation $\hat{x} = x/|x|$ for any $x \in \mathbb{R}^N$, $x \neq 0$. The function $b = b(\hat{u} \cdot \sigma)$ in (1.3) is (up to a multiplicative factor) the differential collisional cross-section. We assume that
\[ b \text{ is Lipschitz, non-decreasing and convex on } (-1, 1), \tag{1.5} \]
and that
\[ \exists\, b_m, b_M \in (0, \infty) \ \text{ s.t. } \ \forall\, x \in [-1, 1], \quad b_m \le b(x) \le b_M. \tag{1.6} \]
In the important case of “hard spheres”, the cross-section is given by (see [13,17])
\[ b(x) = b_0'\,(1 - x)^{-\frac{N-3}{2}}, \qquad b_0' \in (0, \infty), \tag{1.7} \]
so that it fulfills the above hypotheses (1.5), (1.6) when $N = 3$. These hypotheses are needed in the proof of moments estimates (see [23, Prop. 3.2] and [24, Prop. 3.1]). We also define the symmetrized (or polar form of the) bilinear collisional operator $\tilde{Q}_\alpha$ by setting
\[ \begin{cases} \displaystyle\int_{\mathbb{R}^N} \tilde{Q}_\alpha(g, h)\,\psi\,dv = \frac{1}{2}\int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} b\,|u|\,g_*\,h\,\Delta_\psi\,d\sigma\,dv\,dv_*, \\[1ex] \text{with } \Delta_\psi = \psi' + \psi'_* - \psi - \psi_*. \end{cases} \tag{1.8} \]

In other words, $\tilde{Q}_\alpha(g, h) = (Q_\alpha(g, h) + Q_\alpha(h, g))/2$. The formula (1.3) suggests the natural splitting $Q_\alpha = Q_\alpha^+ - Q_\alpha^-$ between gain and loss parts. The loss part $Q_\alpha^-$ can be defined in strong form noticing that
\[ \langle Q_\alpha^-(g, f), \psi\rangle = \int_{\mathbb{R}^N \times \mathbb{R}^N \times S^{N-1}} b\,|u|\,g_*\,f\,\psi\,d\sigma\,dv\,dv_* =: \langle f\,L(g), \psi\rangle, \]
where $\langle\cdot,\cdot\rangle$ is the usual scalar product in $L^2$ and $L$ is the convolution operator
\[ L(g)(v) = (b_0\,|\cdot| * g)(v) = b_0\int_{\mathbb{R}^N} g(v_*)\,|v - v_*|\,dv_*, \quad\text{with } b_0 = \int_{S^{N-1}} b(\sigma_1)\,d\sigma. \tag{1.9} \]
In particular note that $L$ and $Q_\alpha^- = Q^-$ are indeed independent of the normal restitution coefficient $\alpha$. The Boltzmann equation (1.1) is complemented with an initial datum (1.2) which satisfies
\[ \begin{cases} \displaystyle 0 \le f_{\rm in} \in L^1(\mathbb{R}^N), \quad \rho(f_{\rm in}) := \int_{\mathbb{R}^N} f_{\rm in}\,dv = \rho \in (0, \infty), \\[1ex] \displaystyle \int_{\mathbb{R}^N} f_{\rm in}\,v\,dv = 0, \quad \mathcal{E}(f_{\rm in}) := \int_{\mathbb{R}^N} f_{\rm in}\,|v|^2\,dv < \infty. \end{cases} \tag{1.10} \]
As explained in [23,24], the operator (1.3) preserves mass and momentum, and so does the evolution equation:
\[ \frac{d}{dt}\int_{\mathbb{R}^N} f_t \begin{pmatrix} 1 \\ v \end{pmatrix} dv = 0, \tag{1.11} \]
while kinetic energy is dissipated
\[ \frac{d}{dt}\mathcal{E}(f_t) = -(1 - \alpha^2)\,D_{\mathcal{E}}(f_t), \qquad \mathcal{E}(f_t) := \int_{\mathbb{R}^N} f_t\,|v|^2\,dv. \tag{1.12} \]
The energy dissipation functional is given by
\[ D_{\mathcal{E}}(f) := b_1\int_{\mathbb{R}^N \times \mathbb{R}^N} f\,f_*\,|u|^3\,dv\,dv_*, \tag{1.13} \]
where $b_1$ is (up to a multiplicative factor) the angular momentum defined by
\[ b_1 := \frac{1}{8}\int_{S^{N-1}} \big(1 - (\hat{u} \cdot \sigma)\big)\,b(\hat{u} \cdot \sigma)\,d\sigma. \tag{1.14} \]
In order to establish (1.12) we have used (1.8) and the elementary computation
\[ \Delta_{|\cdot|^2}(v, v_*, \sigma) = -\frac{1 - \alpha^2}{4}\,\big(1 - (\hat{u} \cdot \sigma)\big)\,|u|^2. \]
The study of the Cauchy theory and the cooling process of (1.1)-(1.2) was done in [23]. The equation is well-posed for instance in $L^1_3$: for $0 \le f_{\rm in} \in L^1_3$, there is a unique global solution in $C(\mathbb{R}_+; L^1_2) \cap L^\infty(\mathbb{R}_+; L^1_3)$ (see Subsect. 1.5 for the notation of functional spaces). This solution preserves mass, momentum and has a positive and decreasing kinetic energy. Moreover, as time goes to infinity, it satisfies:
\[ \mathcal{E}(f_t) \to 0 \quad\text{and}\quad f(t, \cdot) \rightharpoonup \delta_{v=0} \ \text{ in } M^1(\mathbb{R}^N)\text{-weak }*, \tag{1.15} \]
where $M^1(\mathbb{R}^N)$ denotes the space of probability measures on $\mathbb{R}^N$.

1.2. Introduction of rescaled variables. Let us introduce some rescaled variables (which can be found in [8,15,24] for instance), in order to study more precisely the asymptotic behavior (1.15) of the solution. For any solution $f$ to the Boltzmann equation (1.1), we may associate for any $\tau \in (0, \infty)$ the self-similar rescaled solution $g$ by the relation
\[ g(t, v) = e^{-N\tau t}\, f\!\left(\frac{e^{\tau t} - 1}{\tau},\ e^{-\tau t}\,v\right). \]
Using the homogeneity property $Q_\alpha(g(\lambda\,\cdot), g(\lambda\,\cdot))(v) = \lambda^{-(N+1)}\,Q_\alpha(g, g)(\lambda v)$, it is straightforward that $g$ satisfies the evolution equation
\[ \frac{\partial g}{\partial t} = Q_\alpha(g, g) - \tau\,\nabla_v \cdot (v\,g). \tag{1.16} \]
Any non-negative steady state $0 \le G = G(v)$ of (1.16), that is $G$ satisfying
\[ Q_\alpha(G, G) - \tau\,\nabla_v \cdot (v\,G) = 0, \tag{1.17} \]
is called a self-similar profile. It induces a self-similar solution (or homogeneous cooling state) $F$ of the original equation (1.1) by setting
\[ F(t, v) = (V_0 + \tau t)^N\, G\big((V_0 + \tau t)\,v\big), \tag{1.18} \]
for a given constant $V_0 \in (0, \infty)$. Reciprocally, let us consider a self-similar solution $F$ of the original equation (1.1). This means that $F$ is a solution of (1.1) with the specific shape
\[ F(t, v) = V(t)^N\, G\big(V(t)\,v\big), \tag{1.19} \]
for some given non-negative distribution $G = G(v)$ and some $C^1$, positive, increasing time rescaling function $V(t)$. One can easily show (see for instance [24, Sect. 1.2]) that $V(t) = \tau t + V_0$ for some constants $\tau, V_0 > 0$ and $G$ satisfies (1.17) for the velocity rescaling parameter $\tau$.

For a given self-similar profile $G$, associated to a velocity rescaling parameter $\tau$ and with mass $\rho$ and energy $\mathcal{E}$, we may associate a new self-similar profile $\tilde{G}$, associated to a velocity rescaling parameter $\tilde\tau$ and with mass $\tilde\rho$ by setting
\[ \tilde{G}(v) = K\,G(V v), \qquad V = \frac{\tilde\rho}{\rho}\,\frac{\tau}{\tilde\tau}, \qquad K = V^N\,\frac{\tilde\rho}{\rho}. \tag{1.20} \]
The energy of $\tilde{G}$ is then $\tilde{\mathcal{E}} = \frac{\rho}{\tilde\rho}\left(\frac{\tilde\tau}{\tau}\right)^2 \mathcal{E}$. We thus see that there exists a two real parameter family of self-similar profiles which can be either parametrized by $(\rho, \tau)$ or by $(\rho, \mathcal{E})$. For fixed mass, changing the velocity rescaling parameter $\tau$ in (1.17) corresponds to a change of the energy of the profile, or equivalently to a homothetic change of variable of the solution. Therefore it is no restriction to choose this constant arbitrarily. Also note that modifying $V_0$ just corresponds to a time translation in the self-similar solution $F$ defined by (1.18).
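The mass and energy bookkeeping behind the rescaling (1.20) can be illustrated numerically. The sketch below is an illustration only: it works in dimension N = 2 and uses the Gaussian exp(−|v|²) as a stand-in radial function (it is not claimed to be a self-similar profile), so it only checks how mass and energy transform under G̃(v) = K G(Vv). The function names and sample parameters are hypothetical.

```python
import math


def radial_moments_2d(profile, r_max, steps=100000):
    """Midpoint-rule mass and energy of a radial function on R^2."""
    h = r_max / steps
    mass = 0.0
    energy = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        weight = 2 * math.pi * r * profile(r) * h
        mass += weight
        energy += weight * r * r
    return mass, energy


def rescaled_profile(profile, rho, tau, rho_new, tau_new, dim=2):
    """Return (G_tilde, V) with G_tilde(r) = K * G(V r) as in (1.20)."""
    V = (rho_new / rho) * (tau / tau_new)
    K = V ** dim * rho_new / rho
    return (lambda r: K * profile(V * r)), V


def gaussian(r):
    return math.exp(-r * r)  # stand-in profile, mass pi and energy pi in R^2


rho, energy = radial_moments_2d(gaussian, 12.0)
tau, rho_new, tau_new = 1.0, 2.0, 0.5
g_tilde, V = rescaled_profile(gaussian, rho, tau, rho_new, tau_new)
mass_new, energy_new = radial_moments_2d(g_tilde, 12.0 / V)
print(mass_new, energy_new, (rho / rho_new) * (tau_new / tau) ** 2 * energy)
```

The computed mass of the rescaled function should equal the prescribed new mass, and its energy should match $(\rho/\tilde\rho)(\tilde\tau/\tau)^2\,\mathcal{E}$ up to quadrature error.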


435

It follows from [24, Theorem 1.1] that for any inelastic parameter α ∈ (0, 1), any mass ρ ∈ (0, ∞) and (thanks to the preceding discussion) any velocity rescaling parameter τ ∈ (0, ∞), there exists at least one positive and smooth self-similar profile G with given mass ρ and vanishing momentum: ⎧ N ⎪ ⎨ Q α (G, G) − τ ∇v · (v G) = 0 in R , (1.21) ⎪ G dv = ρ, G v dv = 0, 0 < G ∈ S(R N ), ⎩ RN

RN

where S(R N ) denotes the Schwartz space of C ∞ functions decreasing at infinity faster than any polynomial. Furthermore, we recall that the particular self-similar profile G built in [24, Theorem 1.1] satisfies ∀ v ∈ R N , |v| ≥ 1, e−Cα |v| ≤ G(v) ≤ e−cα |v| , for some 0 < cα ≤ Cα < ∞. Finally, for any solution g to the Boltzmann equation in self-similar variables (1.16), we may associate a solution f to the evolution problem (1.1), defining f by the relation ln(V0 + τ t) N , (V0 + τ t)v . f (t, v) = (V0 + τ t) g (1.22) τ We emphasize that the construction of self-similar rescaling of this subsection is a priori no more valid when α is not constant. Therefore other difficulties arise in this case, and we postpone their study to a forthcoming work [25]. 1.3. Rescaled variables and elastic limit α → 1. We now set the value of τ as τ = τα = ρ (1 − α),

(1.23)

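and we denote by $G_\alpha$ a solution to the problem (1.21) for this choice $\tau_\alpha$. At a formal level, it is immediate that with this choice, in the elastic limit α → 1, Eq. (1.21) becomes
\[
\begin{cases}
Q_1(G_1,G_1) = 0 \quad \text{in } \mathbb R^N, \\[2pt]
\displaystyle\int_{\mathbb R^N} G_1\,dv = \rho, \quad \int_{\mathbb R^N} G_1\,v\,dv = 0, \quad 0 \le G_1 \in \mathcal S(\mathbb R^N).
\end{cases}
\tag{1.24}
\]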

Moreover, multiplying the first equation of (1.21) by $|v|^2$, integrating in the velocity variable as in (1.12) and taking into account the additional term coming from the drift term in (1.21), one gets
\[
2\,(1-\alpha)\,\rho\,\mathcal E(G_\alpha) - (1-\alpha^2)\,D_{\mathcal E}(G_\alpha) = 0.
\tag{1.25}
\]

Dividing the above equation by (1 − α) and passing to the limit α → 1, one obtains
\[
\rho\,\mathcal E(G_1) - D_{\mathcal E}(G_1) = 0.
\tag{1.26}
\]
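The passage from (1.25) to (1.26) can be written out in one line. This is only a sketch of the formal step: it assumes, as the formal argument does, that $\mathcal E(G_\alpha) \to \mathcal E(G_1)$ and $D_{\mathcal E}(G_\alpha) \to D_{\mathcal E}(G_1)$ as α → 1.

```latex
\begin{align*}
0 &= 2(1-\alpha)\,\rho\,\mathcal E(G_\alpha) - (1-\alpha)(1+\alpha)\,D_{\mathcal E}(G_\alpha)
   && \text{since } 1-\alpha^2 = (1-\alpha)(1+\alpha),\\
0 &= 2\,\rho\,\mathcal E(G_\alpha) - (1+\alpha)\,D_{\mathcal E}(G_\alpha)
   && \text{after dividing by } (1-\alpha) > 0,\\
0 &= 2\,\rho\,\mathcal E(G_1) - 2\,D_{\mathcal E}(G_1)
   && \text{in the limit } \alpha \to 1,
\end{align*}
```

and dividing the last identity by 2 gives the stated relation between the energy and its dissipation functional.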

It is straightforward (see Prop. 3.6 below) that the only function satisfying (1.24) and (1.26) is the Maxwellian function
\[
\bar G_1 := M_{\bar\theta_1} = M_{\rho,0,\bar\theta_1},
\tag{1.27}
\]



where, for any ρ, θ > 0, $u \in \mathbb R^N$, the function $M_{\rho,u,\theta}$ denotes the Maxwellian with mass ρ, momentum u and temperature θ given by
\[
M_{\rho,u,\theta}(v) := \frac{\rho}{(2\pi\theta)^{N/2}}\; e^{-\frac{|v-u|^2}{2\theta}},
\tag{1.28}
\]

and where the temperature $\bar\theta_1 \in (0,\infty)$ is given by (we recall that $b_1$ is defined in (1.14))
\[
\bar\theta_1 = \frac{N^2}{8\,b_1^2}\left(\int_{\mathbb R^N} M_{1,0,1}(v)\,|v|^3\,dv\right)^{-2}.
\tag{1.29}
\]
For instance in dimension N = 3 we obtain $\bar\theta_1 = (9\pi)/(64\,b_1^2)$. Moreover, in the particular case of the hard-spheres cross-section (1.7) in dimension 3, we find $b_1 = b_0\,(4\pi)/3$ and therefore $\bar\theta_1 = 81/(1024\,\pi\,b_0^2)$.

1.4. Physical and mathematical motivation. For a detailed physical introduction to granular gases we refer to [9,13]. As can be seen from the references included in the latter, granular flows have become a subject of physical research in their own right in the last decades, and for certain regimes of dilute and rapid flows, these studies are based on kinetic theory. By contrast, the mathematical kinetic theory of granular gases is rather young and began in the late 1990s. We refer to [23,24] for a (short) mathematical introduction to this theory and a (non-exhaustive) list of references. As explained in these papers, granular gases are composed of grains of macroscopic size with contact collisional interactions, when one does not consider other possible self-interaction mechanisms such as gravitation (for cosmic clouds, for instance) or electromagnetism (for “dusty plasmas”, for instance). Therefore the natural assumption about the binary interaction between grains is that of inelastic hard spheres, with no loss of “tangential relative velocity” (according to the impact direction) and a loss in “normal relative velocity”. This loss is quantified by a (normal) restitution coefficient. The latter is either assumed to be constant as a first approximation (as in this paper) or can be more intricate: for instance it is a function of the modulus $|v' - v|$ of the normal relative velocity in the case of “visco-elastic hard spheres” (see [9]), which shall be studied in the future in [25].
Simplified Boltzmann models like inelastic Maxwell molecules or pseudo inelastic hard spheres have been proposed (see [5]), for which existence, uniqueness and global stability of a self-similar profile have been shown (see [3,7]); see also [2] for similar results in the case of a thermal bath. However these models do not capture some crucial physical features of the cooling process of granular gases, like the tail behavior of the velocity distribution or the rate of decay of the temperature (the so-called Haff’s law). For (spatially homogeneous) inelastic hard spheres Boltzmann models, the existing mathematical works are:
• the paper [8], which shows a priori polynomial and exponential moment bounds on any possible self-similar profile (resp. stationary solution), whose existence is assumed, for freely cooling (resp. driven by a thermal bath) inelastic hard spheres with constant restitution coefficient;
• the paper [17], which shows existence of stationary solutions for inelastic hard spheres driven by a thermal bath, and improves the estimates of [8] on their tails into pointwise ones;



• the paper [23], which provides a Cauchy theory for freely cooling inelastic hard spheres with a broad family of collision kernels (including in particular restitution coefficients possibly depending on the relative velocity and/or the temperature), and studies whether the gas cools down in finite time or asymptotically, depending on the collision kernel;
• the paper [24], which shows, for freely cooling inelastic hard spheres with constant restitution coefficient, existence of self-similar profile(s) as well as propagation of regularity and damping with time of singularities.

In this paper we study the self-similarity properties of the Boltzmann equation for inelastic hard spheres. Therefore as a natural first step we consider a constant restitution coefficient α in order to have a self-similar scaling, which makes it possible to reduce the study of self-similar solutions to the study of stationary solutions for a rescaled equation. We also restrict to the case of a restitution coefficient α close to 1, that is, small inelasticity. There are several motivations from mathematics and physics for such a choice:
• the first reason is related to the regime of validity of kinetic theory: as explained in [9, Chap. 6] for instance, the more inelasticity, the more correlations between grains are created during the binary collisions, and therefore the molecular chaos assumption, which is at the basis of the validity of Boltzmann’s theory, suggests weak inelasticity to be the most effective;
• second, as emphasized in [9] again, the case of a restitution coefficient α close to 1 has been widely considered in physics or mathematical physics since it allows one to use expansions around the elastic case, and since conversely it is an interesting question to understand the connection of the inelastic case (dissipative at the microscopic level) to the elastic case (“hamiltonian” at the microscopic level);
• finally, this case of a small inelasticity is reasonable from the viewpoint of applications, since it applies to interstellar dust clouds in astrophysics, or sands and dusts in earth-bound experiments, and more generally to visco-elastic hard spheres whose restitution coefficient is not constant but close to 1 on the average.

In this framework we shall show uniqueness and attractivity of self-similar solutions (in a suitable sense), and thus give a complete answer to the Ernst-Brito conjecture [16] (stated in [16] for the simplified inelastic Maxwell model) for inelastic hard spheres with a small inelasticity. Moreover we give precise results about the elastic limit and deduce some quantitative information about the weakly inelastic case.

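1.5. Notation. Throughout the paper we shall use the notation $\langle\cdot\rangle = \sqrt{1 + |\cdot|^2}$. We denote, for any p ∈ [1, +∞], q ∈ R and weight function $\omega : \mathbb R^N \to \mathbb R_+$, the weighted Lebesgue space
\[
L^p_q(\omega) := \left\{ f : \mathbb R^N \to \mathbb R \ \text{measurable};\ \|f\|_{L^p_q(\omega)} < +\infty \right\},
\]
with, for p < +∞,
\[
\|f\|_{L^p_q(\omega)} = \left[\int_{\mathbb R^N} |f(v)|^p\,\langle v\rangle^{pq}\,\omega(v)\,dv\right]^{1/p}
\]
and, for p = +∞,
\[
\|f\|_{L^\infty_q(\omega)} = \sup_{v\in\mathbb R^N} |f(v)|\,\langle v\rangle^{q}\,\omega(v).
\]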
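As an illustration of these weighted norms, here is a minimal numerical sketch. It assumes dimension N = 1 and approximates the integral by a midpoint rule on a truncated interval; the function names (`japanese_bracket`, `weighted_lp_norm`, `gaussian`) and the quadrature parameters are hypothetical choices made for this example only.

```python
import math


def japanese_bracket(v):
    """<v> = sqrt(1 + |v|^2)."""
    return math.sqrt(1.0 + v * v)


def weighted_lp_norm(f, p, q, omega=lambda v: 1.0, vmax=30.0, n=60000):
    """Midpoint-rule approximation of ||f||_{L^p_q(omega)} on R (N = 1, p finite).

    The integral of |f|^p <v>^{pq} omega is truncated to [-vmax, vmax],
    which is an assumption suitable only for rapidly decaying f.
    """
    h = 2.0 * vmax / n
    total = 0.0
    for i in range(n):
        v = -vmax + (i + 0.5) * h
        total += abs(f(v)) ** p * japanese_bracket(v) ** (p * q) * omega(v)
    return (total * h) ** (1.0 / p)


def gaussian(v):
    """Maxwellian M_{1,0,1} in dimension 1."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


# L^1 norm (q = 0): the mass of the Maxwellian, close to 1.
mass = weighted_lp_norm(gaussian, p=1, q=0)
# L^1_2 norm: integral of M (1 + v^2) = mass + energy, close to 2.
mass_plus_energy = weighted_lp_norm(gaussian, p=1, q=2)
print(mass, mass_plus_energy)
```

With p = 1 and q = 2 the norm of a non-negative function is the sum of its mass and its energy, which is why $L^1_2$ is the natural space for the sets of given mass and energy introduced below.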



We shall in particular use the exponential weight functions
\[
m = m_{s,a}(v) := e^{-\zeta(|v|^2)} \qquad \forall\, v \in \mathbb R^N,
\tag{1.30}
\]
where $\zeta(r) = a\,r^{s/2}$ for some a ∈ (0, ∞), s ∈ (0, 1), or (for a mollified exponential weight function m) where $\zeta \in C^\infty$ is such that $\zeta(r) = a\,r^{s/2}$ for r ≥ 1, for some a ∈ (0, ∞), s ∈ (0, 1). In the same way, the weighted Sobolev space $W^{k,p}_q(\omega)$ (k ∈ N) is defined by the norm
\[
\|f\|_{W^{k,p}_q(\omega)} = \Bigg[\sum_{|s|\le k} \|\partial^s f\|^p_{L^p_q(\omega)}\Bigg]^{1/p},
\]
and as usual in the case p = 2 we denote $H^k_q(\omega) = W^{k,2}_q(\omega)$. The weight ω shall be omitted when it is 1. Finally, for $g \in L^1_{2k}$, with k ≥ 0, we introduce the following notation for the homogeneous moment of order 2k:
\[
\mathbf m_k(g) := \int_{\mathbb R^N} g\,|v|^{2k}\,dv,
\]

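and we also denote by $\rho(g) = \mathbf m_0(g)$ the mass of g, by $\mathcal E(g) = \mathbf m_1(g)$ the energy of g and by $\theta(g) = \mathcal E(g)/(\rho(g)\,N)$ the temperature associated to g (when the distribution g has zero mean). For any ρ, $\mathcal E$ ∈ (0, ∞), $u \in \mathbb R^N$ we then introduce the subsets of $L^1$ of functions of given mass, mean velocity and energy
\[
\mathcal C_{\rho,u} := \Big\{ h \in L^1_1;\ \int_{\mathbb R^N} h\,dv = \rho,\ \int_{\mathbb R^N} h\,v\,dv = \rho\,u \Big\},
\]
\[
\mathcal C_{\rho,u,\mathcal E} := \Big\{ h \in L^1_2;\ \int_{\mathbb R^N} h\,dv = \rho,\ \int_{\mathbb R^N} h\,v\,dv = \rho\,u,\ \int_{\mathbb R^N} h\,|v|^2\,dv = \mathcal E \Big\}.
\]
For any (smooth version of) exponential weight function m we introduce the Banach space $\mathbb L^1(m^{-1}) = L^1(m^{-1}) \cap \mathcal C_{0,0}$.

1.6. Main results in self-similar variables. Our main result, which we now state, deals with the evolution equation in self-similar variables
\[
\frac{\partial g}{\partial t} = Q_\alpha(g,g) - \tau_\alpha\,\nabla_v\cdot(v\,g), \qquad g(0,\cdot) = g_{in} \in \mathcal C_{\rho,0}, \qquad \tau_\alpha := \rho\,(1-\alpha),
\tag{1.31}
\]
and with the associated stationary equation for the self-similar profile,
\[
Q_\alpha(G,G) - \tau_\alpha\,\nabla_v\cdot(v\,G) = 0, \qquad G \in \mathcal C_{\rho,0}, \qquad \tau_\alpha := \rho\,(1-\alpha).
\tag{1.32}
\]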

Theorem 1.1. There is some constructive α∗ ∈ (0, 1) such that for any given mass ρ ∈ (0, ∞), we have: (i) For any τ > 0 and α ∈ [α∗ , 1], Eq. (1.16) admits a unique non-negative stationary solution with mass ρ and vanishing momentum. We denote by G¯ α the self-similar profile obtained by fixing τ = τα := ρ (1 − α).



(ii) Let $\bar G_1 = M_{\rho,0,\bar\theta_1}$ be the Maxwellian distribution with mass ρ, momentum 0 and “quasi-elastic self-similar temperature” $\bar\theta_1$ defined in (1.29). The path of self-similar profiles $\alpha \mapsto \bar G_\alpha$ parametrized by the normal restitution coefficient is $C^1$ from [α∗, 1] into $W^{k,1} \cap L^1(e^{a|v|})$ for any k ∈ N and some a ∈ (0, ∞).

(iii) For any α ∈ [α∗, 1], the linearized collision operator
\[
h \mapsto \mathcal L_\alpha h := 2\,\tilde Q_\alpha(\bar G_\alpha, h) - \tau_\alpha\,\nabla_v\cdot(v\,h)
\tag{1.33}
\]
is well-defined and closed on $\mathbb L^1(m^{-1})$ for any exponential weight function m with exponent s ∈ (0, 1) (defined in (1.30)). Its spectrum decomposes into a part which lies in the half-plane $\{\Re e\,\xi \le \bar\mu\}$ for some constructive $\bar\mu < 0$, and some remaining discrete eigenvalue $\mu_\alpha$. This eigenvalue is real negative and satisfies
\[
\mu_\alpha = -\rho\,(1-\alpha) + O\big((1-\alpha)^2\big) \quad \text{when } \alpha \to 1.
\tag{1.34}
\]
The associated eigenspace has dimension 1 and, denoting by $\phi_\alpha = \phi_\alpha(v)$ the unique associated eigenfunction such that $\|\phi_\alpha\|_{L^1_2} = 1$ and $\phi_\alpha(0) < 0$, we have $\phi_\alpha \in \mathcal S(\mathbb R^N)$ (with bounds of regularity independent of α) and
\[
\phi_\alpha \to \phi_1 := c_0\,\big(|v|^2 - N\,\bar\theta_1\big)\,\bar G_1 \quad \text{as } \alpha \to 1,
\tag{1.35}
\]
where $c_0$ is the positive constant such that $\|\phi_1\|_{L^1_2} = 1$. Finally one has constructive decay estimates on the semigroup associated to this spectral decomposition in this Banach space (see the key Theorem 5.2 and the following point).

(iv) The self-similar profile $\bar G_\alpha$ is globally attractive on bounded subsets of $L^1_3$ under some additional smallness condition on (1 − α∗) in the following sense. For any ρ, $\mathcal E_0$, $M_0$ ∈ (0, ∞) there exist α∗∗ ∈ (α∗, 1), C∗ ∈ (0, ∞) and η ∈ (0, 1), such that for any initial datum satisfying
\[
0 \le g_{in} \in L^1_3 \cap \mathcal C_{\rho,0,\mathcal E_0}, \qquad \|g_{in}\|_{L^1_3} \le M_0,
\]
the solution g to (1.31) satisfies
\[
\forall\, \alpha \in [\alpha_{**}, 1), \qquad \|g_t - \bar G_\alpha\|_{L^1_2} \le e^{(1-\eta)\,\mu_\alpha t}\,C_*.
\tag{1.36}
\]

(v) Moreover, under smoothness conditions on the initial datum one may prove a more precise asymptotic decomposition, and construct a Lyapunov functional for Eq. (1.31). More precisely, there exists k∗ ∈ N and, for any exponential weight m as defined in (1.30) and any ρ, $\mathcal E_0$, $M_0$ ∈ (0, ∞), there exist α∗∗ ∈ (α∗, 1) and a constructive functional $\mathcal H : H^{k_*} \cap L^1(m^{-1}) \to \mathbb R$ such that, first, for any initial datum $0 \le g_{in} \in H^{k_*} \cap L^1(m^{-1}) \cap \mathcal C_{\rho,0,\mathcal E_0}$ satisfying $\|g_{in}\|_{H^{k_*} \cap L^1(m^{-1})} \le M_0$, the solution g to (1.31) satisfies
\[
g(t,\cdot) = \bar G_\alpha + c_\alpha(t)\,\phi_\alpha + r_\alpha(t,\cdot),
\tag{1.37}
\]
with $c_\alpha(t) \in \mathbb R$ and $r_\alpha(t,\cdot) \in L^1_2(\mathbb R^N)$ such that ($\mu_\alpha$ satisfies (1.34) above)
\[
|c_\alpha(t)| \le C_*\,e^{\mu_\alpha t}, \qquad \|r_\alpha(t,\cdot)\|_{L^1_2} \le C_*\,e^{(3/2)\,\mu_\alpha t}.
\tag{1.38}
\]



Second, when the initial datum also satisfies the lower bound
\[
g_{in} \ge M_0^{-1}\,e^{-M_0 |v|^8},
\]
the solution is such that $t \mapsto \mathcal H(g(t,\cdot))$ is strictly decreasing (except maybe when $g_t$ reaches the stationary state $\bar G_\alpha$, see the remark below).

Remark 1.2.
1) All the constants of this theorem are constructive and they can be made explicit. In particular the proofs do not use any compactness argument. Unless otherwise mentioned, these constants will depend on b, on the dimension N, and on some bounds on the initial datum, but never on the inelasticity parameter α ∈ (0, 1].
2) Theorem 1.1 establishes that Conjectures 1 and 2 in [24, Sect. 5] hold true at least for the weak inelasticity model (α close enough to 1).
3) In point (iv), the condition on the restitution coefficient α depends on the mass, temperature and $L^1_3$ norm of the initial distribution, but this dependence is not a perturbative condition of smallness of $(g_{in} - \bar G_\alpha)$. This is unexpected and relies on the so-called “entropy-entropy production” estimates, which yield “superlinear” Gronwall-type estimates, and on the decoupling of the timescales of energy dissipation and entropy production.
4) In (1.38) one can prove $\|r_\alpha(t,\cdot)\|_{L^1_2} \le C_\zeta\,e^{\zeta\,\mu_\alpha t}$ for any ζ ∈ (1, 2). We remark that here we do not claim a decay rate $e^{\bar\lambda t}$ on the remaining part when one “removes” from $g_t - \bar G_\alpha$ the projection on the energy eigenvalue, where $\bar\lambda < 0$ would be some constant independent of α related to the second non-zero eigenvalue of $\mathcal L_\alpha$. We do not know if such a decay rate holds, but it is unlikely in our opinion, due to the coupling effect of the bilinear term, which mixes the different parts of the spectral decomposition.
5) As a by-product the above result provides an alternative argument to the one of [24, Sect. 3] to show uniform (according to time and α) non-concentration bounds on the rescaled equation in the case of α close to 1 and a general initial datum $g_{in} \in L^1_3$ (whereas the proof of [24, Sect. 3] was valid for all α ∈ (0, 1) but for some initial datum $g_{in} \in L^1_3 \cap L^p$, p ∈ (1, ∞]).
6) Our results show that no bifurcation occurs for the path of self-similar profiles for α close to 1. We do not know at present whether some bifurcations occur for other values of the inelasticity parameter. Therefore we do not know if there is a continuous branch of self-similar profiles parametrized by α ∈ [0, 1] (even if we know from [24] that self-similar profiles exist for all values of α). The best we can say from the estimates we have proved on the profile together with the classical theory of topological degree (see [29] for instance) is that there is a set $K \subset [0,1] \times F$ (where F is for instance the set of positive functions in the Schwartz space with given mass) which is compact, connected, and such that for any α ∈ [0, 1], the intersection $K \cap (\{\alpha\} \times F)$ is not empty.
7) If $g_{in} \neq \bar G_\alpha$, it is likely that $g(t,\cdot)$ does not reach the stationary state $\bar G_\alpha$ in finite time, and thus $t \mapsto \mathcal H(g(t,\cdot))$ is a strictly decreasing function over $\mathbb R_+$. Unfortunately, we are not able to prove this fact. Nevertheless, it is worth mentioning that when $g_{in} \notin C^\infty$, so that $g_{in} \notin H^k$ for some k, we may adapt to the inelastic Boltzmann equation some results obtained in [28] for the elastic Boltzmann equation and we get that $g(t,\cdot) \notin H^k$ for any t ≥ 0, from which we easily conclude that $g(t,\cdot) \neq \bar G_\alpha$ for all t ≥ 0 (because $\bar G_\alpha \in H^\ell$ for any ℓ ≥ 0).



1.7. Coming back to the original equation. When coming back to the original equation (1.1) with the help of (1.18) and (1.22), Theorem 1.1 translates into

Theorem 1.3. There is a constructive α∗ ∈ (0, 1) such that for any given mass ρ ∈ (0, ∞), we have:

(i) For any α ∈ [α∗, 1), up to a translation of time there exists a unique self-similar solution $\bar F_\alpha$ of Eq. (1.1) with mass ρ, and it is given by
\[
\bar F_\alpha(t,v) = (1+\tau_\alpha t)^N\,\bar G_\alpha\big((1+\tau_\alpha t)\,v\big), \qquad \tau_\alpha = \rho\,(1-\alpha),
\]
where $\bar G_\alpha$ was obtained in Theorem 1.1. More precisely, if $F_\alpha$ is a solution of (1.1) of the form (1.19) and with mass ρ, there exists $t_0 \in \mathbb R$ such that $F_\alpha(t,v) = \bar F_\alpha(t+t_0, v)$ for any $t \ge \max\{0, -t_0\}$ and any $v \in \mathbb R^N$.

(ii) The self-similar solution $\bar F_\alpha$ is globally attractive on bounded subsets of $L^1_3$ under some smallness condition on (1 − α∗) in the following sense. For any ρ, $\mathcal E_0$, $M_0$ ∈ (0, ∞) there exist α∗∗ ∈ (α∗, 1) and η ∈ (0, 1) such that for any q ∈ N there is $c_q \in (0,\infty)$ such that for any initial datum satisfying
\[
0 \le f_{in} \in L^1_3 \cap \mathcal C_{\rho,0,\mathcal E_0}, \qquad \|f_{in}\|_{L^1_3} \le M_0,
\]
the solution $f(t,\cdot)$ to (1.1) satisfies
\[
\forall\, q \in [0,\infty), \qquad \|f(t,\cdot) - \bar F_\alpha(t,\cdot)\|_{L^1(|v|^q)} \le c_q\,(1+\tau_\alpha t)^{(1-\eta)\,\mu_\alpha/\tau_\alpha - q} = c_q\,(1+\tau_\alpha t)^{-(1-\eta) - q + O(1-\alpha)}.
\]

(iii) Moreover, there exists k∗ ∈ N and, for any exponential weight m as defined in (1.30) and any ρ, $\mathcal E_0$, $M_0$ ∈ (0, ∞), there exists α∗∗ ∈ (α∗, 1) such that, for any initial datum $0 \le f_{in} \in H^{k_*} \cap L^1(m^{-1}) \cap \mathcal C_{\rho,0,\mathcal E_0}$ satisfying $\|f_{in}\|_{H^{k_*} \cap L^1(m^{-1})} \le M_0$, the solution f to (1.1) satisfies
\[
f(t,\cdot) = \bar F_\alpha(t,\cdot) + \tilde c_\alpha(t)\,\psi_\alpha(t,\cdot) + \tilde r_\alpha(t,\cdot),
\tag{1.39}
\]
where
\[
\psi_\alpha(t,v) = (1+\tau_\alpha t)^N\,\phi_\alpha\big((1+\tau_\alpha t)\,v\big), \qquad \tilde c_\alpha(t) = c_\alpha\!\left(\frac{\ln(1+\tau_\alpha t)}{\tau_\alpha}\right).
\]
In this expansion, the different terms have the following asymptotic behaviors (for any given q ≥ 0):
\[
\|\bar F_\alpha(t,\cdot)\|_{L^1(|v|^q)} = (1+\tau_\alpha t)^{-q}\,\|\bar G_\alpha\|_{L^1(|v|^q)}, \qquad
\|\psi_\alpha(t,\cdot)\|_{L^1(|v|^q)} = (1+\tau_\alpha t)^{-q}\,\|\phi_\alpha\|_{L^1(|v|^q)},
\]
\[
|\tilde c_\alpha(t)| \le C_*\,(1+\tau_\alpha t)^{\mu_\alpha/\tau_\alpha} = C_*\,(1+\tau_\alpha t)^{-1+O(1-\alpha)},
\]
\[
\exists\, C_q > 0; \qquad \|\tilde r_\alpha\|_{L^1(|v|^q)} \le C_q\,(1+\tau_\alpha t)^{(3/2)\,\mu_\alpha/\tau_\alpha - q} = C_q\,(1+\tau_\alpha t)^{-(3/2) - q + O(1-\alpha)}.
\]
Hence the leading term in the expansion (1.39) is, as expected, the self-similar solution, and the first order correction beyond self-similarity is given by the second term, that is the projection onto the eigenspace of the “energy eigenvalue”.



(iv) We can make Haff’s law precise on the asymptotic behavior of the granular temperature (see [24]) in the following way. Under the assumptions of point (iii), the solution f = f(t, v) to (1.1) satisfies
\[
\mathcal E(f_t) = \frac{\mathcal E(\bar G_\alpha)}{(1+\tau_\alpha t)^2} + O\!\left(\frac{1}{(1+\tau_\alpha t)^{3+O(1-\alpha)}}\right).
\tag{1.40}
\]

(v) Under the assumptions of point (iii) the rescaling by the square root of the energy familiar to physicists is rigorously justified in the following sense: the solution f = f(t, v) to (1.1) satisfies, for t → +∞,
\[
\mathcal E(f_t)^{N/2}\,f\big(t, \mathcal E(f_t)^{1/2}\,v\big) \to \mathcal E(\bar G_\alpha)^{N/2}\,\bar G_\alpha\big(\mathcal E(\bar G_\alpha)^{1/2}\,v\big) \quad \text{in } L^1.
\]

Remark 1.4. We see from this theorem that the convergence towards the self-similar solution is indeed faster than the convergence towards the Dirac mass (hence justifying its interest), but also that the speed of convergence towards this self-similar solution degenerates to 0 as α → 1 (because $\tau_\alpha \to 0$ when α → 1). This fact is surprising, since the self-similar solution converges towards a stationary Maxwellian distribution in the elastic limit, and the latter is known to be exponentially attractive for the elastic equation (see [27] for instance). As we shall see this is related to the fact that a bifurcation occurs in the spectrum of the linearized collision operator at α = 1 (namely the eigenvalue corresponding to the kinetic energy vanishes at α = 1 whereas it is non-zero for α ∈ [α∗, 1)). This remark may explain the fact that in the quasi-elastic limit considered (in dimension 1) in [10], it is proved that the rate of relaxation towards the self-similar solution is worse than any polynomial.

Proof of Theorem 1.3. Except for points (i) and (v) this theorem is an obvious translation of Theorem 1.1. In order to prove (i), one first remarks that for two given self-similar solutions F and $\tilde F$ with same mass ρ, there holds
\[
F(t,v) = (V_0 + A\,t)^N\,G_A\big((V_0 + A\,t)\,v\big), \qquad
\tilde F(t,v) = (\tilde V_0 + \tilde A\,t)^N\,G_{\tilde A}\big((\tilde V_0 + \tilde A\,t)\,v\big),
\]
and thus from (1.20) and Theorem 1.1,
\[
G_{\tilde A}(v) = \left(\frac{A}{\tilde A}\right)^{N} G_A\!\left(\frac{A}{\tilde A}\,v\right).
\]
We deduce
\[
\tilde F(t,v) = \left(\frac{A}{\tilde A}\,\big(\tilde V_0 + \tilde A\,t\big)\right)^{N} G_A\!\left(\frac{A}{\tilde A}\,\big(\tilde V_0 + \tilde A\,t\big)\,v\right) = F(t+t_0, v)
\quad \text{with} \quad t_0 = \frac{\tilde V_0}{\tilde A} - \frac{V_0}{A}.
\]


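In order to prove (v), we introduce the function $\xi(t) = \mathcal E(\bar G_\alpha)^{1/2} / [\mathcal E(f_t)^{1/2}\,(1+\tau_\alpha t)]$ and we compute
\[
\begin{aligned}
&\big\|\mathcal E(f_t)^{N/2}\,f\big(t, \mathcal E(f_t)^{1/2}\,\cdot\big) - \mathcal E(\bar G_\alpha)^{N/2}\,\bar G_\alpha\big(\mathcal E(\bar G_\alpha)^{1/2}\,\cdot\big)\big\|_{L^1} \\
&\qquad = \big\|g\big(\tau_\alpha^{-1}\ln(1+\tau_\alpha t), \cdot\big) - \xi(t)^N\,\bar G_\alpha(\xi(t)\,\cdot)\big\|_{L^1} \\
&\qquad \le \big\|g\big(\tau_\alpha^{-1}\ln(1+\tau_\alpha t), \cdot\big) - \bar G_\alpha\big\|_{L^1} + |\xi(t)^N - 1|\,\|\bar G_\alpha\|_{L^1} + \xi(t)^N\,\big\|\bar G_\alpha(\xi(t)\,\cdot) - \bar G_\alpha\big\|_{L^1}.
\end{aligned}
\]
Using now (1.34), (1.37), (1.38), (1.40) and the fact that $\bar G_\alpha$ is bounded in $W^{1,1}_1$ uniformly in α ∈ (α∗, 1) from Theorem 1.1 (ii), we deduce
\[
\big\|\mathcal E(f_t)^{N/2}\,f\big(t, \mathcal E(f_t)^{1/2}\,\cdot\big) - \mathcal E(\bar G_\alpha)^{N/2}\,\bar G_\alpha\big(\mathcal E(\bar G_\alpha)^{1/2}\,\cdot\big)\big\|_{L^1} \le C\,(1+\tau_\alpha t)^{-1+O(1-\alpha)},
\]
for some constant C ∈ (0, ∞) (which depends in particular on the upper bound on $\|\bar G_\alpha\|_{W^{1,1}_1}$), from which (v) follows.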
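The translation of exponential rates in self-similar time into algebraic rates in the original time rests on the change of variable $s = \tau^{-1}\ln(V_0 + \tau t)$ from (1.22). The following minimal sketch checks numerically that $e^{\mu s} = (V_0 + \tau t)^{\mu/\tau}$. The function names and the numerical values of ρ and α are hypothetical, $V_0 = 1$ is taken as in Theorem 1.3, and $\mu_\alpha$ is replaced by its leading order $-\rho(1-\alpha)$ from (1.34), which is an assumption (the $O((1-\alpha)^2)$ correction is dropped).

```python
import math


def self_similar_time(t, tau, V0=1.0):
    """Rescaled time s = ln(V0 + tau t) / tau appearing in (1.22)."""
    return math.log(V0 + tau * t) / tau


def algebraic_rate(t, tau, mu, V0=1.0):
    """Rate (V0 + tau t)^(mu / tau) in the original time variable."""
    return (V0 + tau * t) ** (mu / tau)


rho, alpha = 1.0, 0.95          # illustrative values only
tau = rho * (1.0 - alpha)       # tau_alpha = rho (1 - alpha), see (1.23)
mu = -rho * (1.0 - alpha)       # leading order of (1.34), higher order dropped

for t in (0.0, 1.0, 10.0, 100.0, 1000.0):
    s = self_similar_time(t, tau)
    exponential_in_s = math.exp(mu * s)        # decay rate seen in self-similar variables
    algebraic_in_t = algebraic_rate(t, tau, mu)  # the same rate in the original time
    print(t, s, exponential_in_s, algebraic_in_t)
```

With this leading-order choice of µ the exponent µ/τ equals −1, which is the origin of the exponent −1 + O(1 − α) in the bound on $\tilde c_\alpha$.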

Remark 1.5. Let us emphasize that the temperature $\bar\theta_1$ of the limit Maxwellian $\bar G_1$ is “universal” in the sense that it only depends on the collisional cross-section b (through its angular momentum), and not for instance on the density distribution. The temperature of the self-similar solution $\bar F_\alpha = \bar F_\alpha(t,v)$ associated to a self-similar profile $\bar G_\alpha$ decreases like
\[
\theta(\bar F_\alpha(t,\cdot)) = \frac{\theta(\bar G_\alpha)}{(1 + \rho\,(1-\alpha)\,t)^2}.
\]
Hence when α is close to 1 (small inelasticity) we obtain
\[
\theta(\bar F_\alpha(t,\cdot)) \approx \frac{\bar\theta_1}{(1 + \rho\,(1-\alpha)\,t)^2}.
\]
Therefore, as soon as the self-similar solutions correctly describe the asymptotic behavior (at least in the framework of point (ii) of Theorem 1.3), as conjectured by physicists, generic solutions satisfy
\[
\theta(f_\alpha(t,\cdot)) \sim_{t\to\infty} \frac{\bar\theta_1\,t^{-2}}{\rho^2\,(1-\alpha)^2}
\]
for an inelasticity coefficient α close to 1. We call $\bar\theta_1$ a “quasi-elastic self-similar temperature”. Remark that its definition as the temperature of $\bar G_1$ seems to depend on the choice of the scaling. However changing this scaling by some asymptotically equivalent one, as α → 1, would only add a factor which would then disappear when coming back to the solution to the original equation (1.1). Therefore a more “canonical” way to define this quasi-elastic self-similar temperature could be
\[
\bar\theta_1 = \rho^2 \lim_{\alpha\to 1}\,(1-\alpha)^2 \lim_{t\to+\infty} \theta(f_\alpha(t,\cdot))\,t^2,
\]
where $f_\alpha$ denotes a generic solution with mass ρ to Eq. (1.1).
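The Haff-type decay in this remark can be illustrated numerically. The sketch below only evaluates the closed-form temperature of the self-similar solution; the function names and the numerical values of the profile temperature, ρ and α are hypothetical, chosen for illustration.

```python
import math


def self_similar_temperature(t, theta_profile, rho, alpha):
    """theta(F_alpha(t, .)) = theta(G_alpha) / (1 + rho (1 - alpha) t)^2."""
    return theta_profile / (1.0 + rho * (1.0 - alpha) * t) ** 2


def haff_prefactor(theta_profile, rho, alpha):
    """Limit of theta(F_alpha(t, .)) * t^2 as t -> infinity."""
    return theta_profile / (rho * (1.0 - alpha)) ** 2


theta_profile, rho, alpha = 0.5, 2.0, 0.9   # illustrative values only

large_t = 1.0e9
numerical_limit = self_similar_temperature(large_t, theta_profile, rho, alpha) * large_t ** 2
exact_limit = haff_prefactor(theta_profile, rho, alpha)
print(numerical_limit, exact_limit)

# Multiplying the limit by rho^2 (1 - alpha)^2 gives back the profile temperature,
# which is the mechanism behind the "canonical" definition of the limit temperature.
recovered = (rho * (1.0 - alpha)) ** 2 * numerical_limit
print(recovered)
```

The temperature thus decays like $t^{-2}$ with a prefactor that blows up as α → 1, in agreement with the degeneracy of the cooling rate in the elastic limit.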



1.8. Method of proof and plan of the paper.
• The first main idea of our method is to consider the rescaled equations (1.16) and (1.17) with an inelasticity-dependent anti-drift coefficient $\tau_\alpha$ which exactly “compensates” the loss of elasticity of the collision operator (in the sense that it compensates its loss of kinetic energy). This scaling allows us, by some technical estimates, to prove uniform bounds according to α for the family of self-similar profiles $G_\alpha$ to Eq. (1.32).
• The second main idea is the decoupling of the variations along the “energy direction” and its “orthogonal direction”. This decoupling makes it possible to identify the limit of different objects as α → 1 (among them the limit of $G_\alpha$).
• The third main idea is to systematically use the well-developed theory on the elastic limit problem, once it has been identified thanks to the previous arguments. In particular we use the spectral study of the linearized problem and the entropy - entropy production inequalities for the elastic problem. This allows us to argue by a perturbative method. Let us emphasize that the perturbation is singular in the classical sense because of the addition of a (vanishing at the limit) first-order derivative operator, but also because of the gain of one more conservative quantity at the limit (which implies in particular at the linearized level that the “energy eigenvalue” $\mu_\alpha$ is negative for α ≠ 1 but converges to $\mu_1 = 0$ in the limit α → 1).

In Sect. 2, we use the regularity properties of the collision operator in order to establish on the one hand that the family $(G_\alpha)$ is bounded in $H^\infty \cap L^1(m^{-1})$ uniformly according to the inelastic parameter α (the key argument being the use of the entropy functional, which provides a uniform lower bound on the energy of $G_\alpha$) and on the other hand that the difference of two self-similar profiles in any regular or weighted norm may be bounded by the difference of these ones in $L^1$ norm (the key idea is a bootstrap argument).
This last point shall allow us to deal with the loss of derivatives and weights in the operator norms used in the sequel of the paper.

In Sect. 3, we prove that $\alpha \mapsto Q^+_\alpha$ is Hölder continuous in the norm of its graph and is Hölder differentiable in a weaker norm. As a consequence we deduce that $G_\alpha \to \bar G_1$ when α → 1 with explicit “Hölder” rate, which (partially) proves point (ii) of Theorem 1.1. The cornerstone of the proof is the decoupling of the variation $G_\alpha - \bar G_1$ between the “energy direction” and its “orthogonal direction”.

In Sect. 4, we prove uniqueness of the profile $\bar G_\alpha$ for small inelasticity (point (i) of Theorem 1.1) by a variation around the implicit function theorem in infinite dimension. We also deduce that $\alpha \mapsto \bar G_\alpha$ is differentiable at α = 1.

Section 5 is devoted to the study of the linearized operator $\mathcal L_\alpha$, and we are partially inspired by the method of [27]. We prove point (iii) of Theorem 1.1 and we end the proof of point (ii) of Theorem 1.1. We obtain information on the localization of the spectrum and we establish some decay estimates on the associated semigroup. Let us emphasize that for technical reasons we state our results in an $L^1$ framework (mainly because we are not able to generalize Lemma 5.8 to an $L^2$ framework), which makes the spectral analysis more intricate. The proof proceeds as follows (the cornerstone idea is again the decoupling of the variations in the “energy direction” and its “orthogonal direction”). First, we localize the essential spectrum inside the half-plane $\Delta^c_{\bar\mu} = \{z \in \mathbb C,\ \Re e\,z \le \bar\mu\}$, with $\bar\mu < 0$, with the help of Weyl’s theorem, the compactness properties of $\mathcal L_\alpha$ and the “rough” (Hölder type) convergence of $Q^+_\alpha(\bar G_\alpha, \cdot)$ to $Q^+_1(\bar G_1, \cdot)$ in the “good” norm of the graph. Second, we localize the discrete spectrum lying in $\Delta_{\bar\mu} = \{z \in \mathbb C,\ \Re e\,z \ge \bar\mu\}$ inside the disc $\{z \in \mathbb C,\ |z| \le C\,(1-\alpha)\}$, thanks to estimates on the resolvent of $\mathcal L_\alpha$. Third, we establish that the spectrum $\Sigma(\mathcal L_\alpha)$ of $\mathcal L_\alpha$ satisfies $\Sigma(\mathcal L_\alpha) \cap \Delta_{\bar\mu} = \{\mu_\alpha\}$, where $\mu_\alpha$ has multiplicity 1 (the proof mainly takes advantage of the “precise” convergence of $Q^+_\alpha(\bar G_\alpha, \cdot)$ to $Q^+_1(\bar G_1, \cdot)$ in a weaker norm, together with a regularity estimate holding in the discrete eigenspace). Last, we establish the expansion (1.34) using the energy equation associated to the eigenvalue $\mu_\alpha$. The decay properties of the linear semigroup are then deduced from resolvent estimates and the above localization of the spectrum.

Section 6 is devoted to the proof of points (iv) and (v) in Theorem 1.1, which is split into several steps. First we establish a “linearized asymptotic stability result” by decoupling the evolution equation (1.31) along the “energy direction” and its “orthogonal direction”, and using the semigroup decay estimates and the quadratic structure of the collision operator. Second we establish a “non-linear stability result” by decomposing the evolution equation (1.31) into two timescales, and using the energy dissipation equation along the “energy direction” and the entropy production method on its “orthogonal direction” (let us mention that this method follows closely the physical idea that for small inelasticity the “molecular” timescale of thermalization of the velocity distribution decouples from the “cooling” timescale of dissipation of energy). Third we prove the asymptotic decomposition and we exhibit a Lyapunov functional for smooth initial data (point (v)) by gathering (and slightly modifying) the two preceding steps. Fourth and last, we prove point (iv) for general initial data, gathering the previous arguments with the decomposition of solutions between a smooth part and a small remaining part as introduced in [28].

2. A Posteriori Estimates on the Self-Similar Profiles

In this section we prove various a posteriori regularity and decay estimates on the self-similar profiles (or on the difference of two self-similar profiles). The main issue here is to obtain uniform estimates as α → 1. This shall be useful in the sequel.

2.1. Uniform estimates on the self-similar profiles.
For any α ∈ (0, 1) we consider $\mathcal G_\alpha$, the set of all the self-similar profiles of the inelastic Boltzmann equation (1.1) with inelasticity coefficient α, with given mass ρ ∈ (0, +∞) and finite energy. More precisely, we define $\mathcal G_\alpha$ as the following set of functions:
\[
\mathcal G_\alpha := \big\{ 0 \le G \in L^1_2 \ \text{satisfying (1.32)} \big\}.
\]
For some fixed $\alpha_0$ ∈ (0, 1), we also define $\mathcal G = \cup_{\alpha\in[\alpha_0,1)}\,\mathcal G_\alpha$. The fact that for any α ∈ (0, 1), $\mathcal G_\alpha$ is not empty was proved in [24], where a solution of (1.32) was built within the class of radially symmetric functions belonging to the Schwartz space. Here we show that any self-similar profile $G_\alpha \in \mathcal G$ belongs to the Schwartz space and that decay estimates, pointwise lower bound and regularity estimates can be made uniform according to the inelasticity coefficient α ∈ [α0, 1). Let us once again emphasize that the choice of the velocity rescaling parameter $\tau_\alpha = \rho\,(1-\alpha)$ in (1.32) is fundamental in order to get this uniformity in the limit α → 1. Let us also mention that our choice of scaling for Eq. (1.32) is mass invariant, that is, G with density ρ(G) satisfies the equation if and only if G/ρ(G) satisfies the equation with ρ = 1. Therefore all the estimates on the profiles are homogeneous in terms of the density ρ.



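Proposition 2.1. Let us fix $\alpha_0$ ∈ (0, 1). There exist $a_1, a_2, a_3, a_4$ ∈ (0, ∞) and, for any k ∈ N, there exists $C_k$ ∈ (0, ∞) such that
\[
\forall\, \alpha \in [\alpha_0, 1),\ \forall\, G_\alpha \in \mathcal G_\alpha, \qquad
\begin{cases}
\|G_\alpha\|_{L^1(e^{a_1|v|})} \le a_2, \quad \|G_\alpha\|_{H^k(\mathbb R^N)} \le C_k, \\[2pt]
G_\alpha \ge a_3\,e^{-a_4 |v|^8}.
\end{cases}
\tag{2.1}
\]
We first recall the following geometrical lemma extracted (in a slightly specified form) from [23, Lemma 2.3 & Lemma 4.4], that we shall use several times in the sequel.

Lemma 2.2. For any α ∈ (0, 1] and $\sigma \in S^{N-1}$ we define
\[
\phi^*_\alpha = \phi^*_{\alpha,v,\sigma} : \mathbb R^N \to \mathbb R^N,\ v_* \mapsto v', \qquad
\phi_\alpha = \phi_{\alpha,v_*,\sigma} : \mathbb R^N \to \mathbb R^N,\ v \mapsto v',
\]
and the Jacobian functions $J^*_\alpha = \det(D\phi^*_{\alpha,v,\sigma})$, $J_\alpha = \det(D\phi_{\alpha,v_*,\sigma})$, as well as the cone
\[
\Omega_\delta = \Omega_{\delta,\sigma} = \big\{ u \in \mathbb R^N,\ \hat u\cdot\sigma > \delta - 1 \big\},
\]
for any δ ∈ (0, 2) and $\sigma \in S^{N-1}$. For any δ ∈ (0, 2), $\phi^*_\alpha$ defines a $C^\infty$-diffeomorphism from $v + \Omega_\delta$ onto $v + \Omega_{\omega^*(\delta)}$ with $\omega^*(\delta) = 1 + \sqrt{\delta/2}$, and $\phi_\alpha$ defines a $C^\infty$-diffeomorphism from $v_* + \Omega_\delta$ onto $v_* + \Omega_{\omega_\alpha(\delta)}$ with
\[
\omega_\alpha(\delta) = 1 + \frac{\delta - 1 + r_\alpha}{\big(1 + 2(\delta-1)\,r_\alpha + r_\alpha^2\big)^{1/2}} \quad \text{and} \quad r_\alpha = \frac{1+\alpha}{3-\alpha}.
\]
Moreover, there exists C ∈ (0, ∞) such that with $C_\delta = C/\delta$,
\[
C_\delta^{-1}\,|v - v_*| \le |\phi_\alpha(v) - v_*| \le 2\,|v - v_*|,
\]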

|φα−1 (v ) − φα−1 (v )| ≤ C δ |α − α| |v − v∗ |, 2 |Jα | ≤ Cδ , |Jα−1 | ≤ Cδ , |Jα−1 − Jα−1 | ≤ C δ |α

(2.2) (2.3) − α|

(2.4)

on $v_* + \Omega_\delta$, uniformly with respect to the parameters α, α′ ∈ [0, 1], $\sigma \in S^{N-1}$ and $v_* \in \mathbb R^N$. The same estimate holds for $\phi^*_\alpha$ on $v + \Omega_\delta$. Finally, for any α, α′ ∈ [0, 1], $\sigma \in S^{N-1}$, $v_* \in \mathbb R^N$ and t ∈ [0, 1], there holds
\[
t\,\phi_\alpha^{-1} + (1-t)\,\phi_{\alpha'}^{-1} = \phi_{\alpha_t}^{-1}
\tag{2.5}
\]
for some $\alpha_t$ belonging to the segment with extremal points α and α′. The same result holds for $\phi^*_\alpha$.

We will also need the following elementary result in order to estimate the convolution operator L defined in (1.9).

Lemma 2.3. For any function $g \in L^1_3(\mathbb R^N)$ there exist some constants $c_1, c_2$ ∈ (0, ∞) such that
\[
c_1\,(1 + |v|) \le L(g) \le c_2\,(1 + |v|).
\tag{2.6}
\]
Moreover, if g satisfies $\mathcal E(g) \ge a_1\,\rho$ and $\mathbf m_{3/2}(g) \le a_2\,\rho$, for some constants $a_1, a_2 > 0$, we can choose $c_1 = C^{-1}\rho$, $c_2 = C\rho$ in (2.6) for some explicit constant C > 0 depending only on $a_1, a_2 > 0$.



Proof of Lemma 2.3. The upper bound in (2.6) is immediate. As for the lower bound, we have, on the one hand, by Jensen’s inequality,
\[
\int_{\mathbb R^N} g_*\,|u|\,dv_* \ge \rho\,|v|.
\tag{2.7}
\]
On the other hand, by the triangle inequality,
\[
\int_{\mathbb R^N} g_*\,|u|\,dv_* \ge \mathbf m_{1/2} - |v|\,\mathbf m_0.
\]
By Hölder’s inequality we have $\mathbf m_{1/2} \ge \mathcal E^2\,\mathbf m_{3/2}^{-1} \ge C_0\,\rho$ for some explicit constant $C_0 > 0$ depending only on $a_1, a_2$. As a consequence
\[
\int_{\mathbb R^N} g_*\,|u|\,dv_* \ge \rho\,(C_0 - |v|).
\tag{2.8}
\]
These two lower bounds (2.7), (2.8) imply immediately that
\[
\int_{\mathbb R^N} g_*\,|u|\,dv_* \ge C^{-1}\,\rho\,(1 + |v|)
\]
for some explicit constant C > 0 depending only on $C_0$.
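The two-sided bound of Lemma 2.3 can be observed numerically on a simple example. The sketch below makes several assumptions that are not taken from the text: dimension N = 1, the profile g is the centered Maxwellian $M_{1,0,1}$ (so ρ = 1), and L(g) is taken to be the plain convolution $\int g(v_*)\,|v - v_*|\,dv_*$ with no multiplicative constant, since the definition (1.9) is not reproduced here. The names `gaussian` and `collision_frequency` are hypothetical.

```python
import math


def gaussian(v):
    """Maxwellian M_{1,0,1} in dimension 1 (mass 1, zero mean)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def collision_frequency(v, vmax=12.0, n=24000):
    """Midpoint-rule approximation of the integral of g(v*) |v - v*| dv* over R."""
    h = 2.0 * vmax / n
    total = 0.0
    for i in range(n):
        v_star = -vmax + (i + 0.5) * h
        total += gaussian(v_star) * abs(v - v_star)
    return total * h


RHO = 1.0
M_HALF = math.sqrt(2.0 / math.pi)   # first absolute moment of the 1D Maxwellian

grid = [0.25 * k for k in range(-20, 21)]
ratios = []
for v in grid:
    L = collision_frequency(v)
    # Jensen-type lower bound, as in (2.7), for a centered g.
    assert L >= RHO * abs(v) - 1e-6
    # Triangle-inequality lower bound used before (2.8).
    assert L >= M_HALF - RHO * abs(v) - 1e-6
    # Upper bound: |v - v*| <= |v| + |v*|.
    assert L <= RHO * abs(v) + M_HALF + 1e-6
    ratios.append(L / (1.0 + abs(v)))

# The ratio L(g)(v) / (1 + |v|) stays between two positive constants on the grid.
print(min(ratios), max(ratios))
```

Near v = 0 the Jensen bound degenerates while the triangle-inequality bound is active, and for large |v| the roles are exchanged; combining both is what yields a lower bound proportional to 1 + |v|.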

Proof of Proposition 2.1. We split the proof into several steps. In Steps 1, 2 and 3, we establish the smoothness of any profile $G_\alpha \in \mathcal G$ as well as upper and lower bounds on its tail. In Steps 4, 5, 6, 7, 8 and 9, we show that these estimates actually are uniform with respect to the choice of the profile $G_\alpha \in \mathcal G_\alpha$ and α ∈ [α0, 1). Thanks to Steps 1, 2 and 3 the computations then performed are rigorously justified. We fix α ∈ [α0, 1) and $G_\alpha$ a solution of (1.32) for which we will establish the announced bounds. From now on we omit the subscript “α” when no confusion is possible.

Step 1. Moment bounds. From [24, Prop. 3.1], by taking $g_{in} = G$ in the evolution equation (1.16), we get that $G \in L^1_k$ for any k ∈ N.

Step 2. $L^2$ a posteriori bound. We aim to prove that $G \in L^2$. Let us fix A > 0 and let us introduce the $C^1$ function
\[
\Lambda_A(x) := \frac{x^2}{2}\,\mathbf 1_{x\le A} + \Big(A\,x - \frac{A^2}{2}\Big)\,\mathbf 1_{x>A}.
\]
We multiply Eq. (1.32) by $\Lambda_A'(G) = \min\{G, A\} =: T_A(G)$. Once again we shall omit the subscript “A” when no confusion is possible. After some straightforward computation we get
\[
\int_{\mathbb R^N} \Big[ T(G)\,G\,L(G) + \rho\,(1-\alpha)\,N\,T(G)^2/2 \Big]\,dv = \int_{\mathbb R^N} T(G)\,Q^+(G,G)\,dv.
\]
Since $L(G) \ge c_1\,(1+|v|)$, thanks to Lemma 2.3, and $\Lambda(G) \le G\,T(G)$, we have
\[
c_1 \int_{\mathbb R^N} \Lambda(G)\,(1+|v|)\,dv \le \int_{\mathbb R^N} T(G)\,G\,L(G)\,dv \le \int_{\mathbb R^N} T(G)\,Q^+(G,G)\,dv \le I_1 + I_2 + I_3 + I_4,
\tag{2.9}
\]


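where the terms $I_k$ are defined in the following way, splitting the collision kernel into smooth and non-smooth parts. Let $\Theta : \mathbb{R} \to \mathbb{R}_+$ be an even $C^\infty$ function such that support $\Theta \subset (-1, 1)$ and $\int_{\mathbb{R}} \Theta = 1$. Let $\tilde\Theta : \mathbb{R}^N \to \mathbb{R}_+$ be a radial $C^\infty$ function such that support $\tilde\Theta \subset B(0, 1)$ and $\int_{\mathbb{R}^N} \tilde\Theta = 1$. Introduce the regularizing sequences
\[
\Theta_m(z) = m \, \Theta(m z), \quad z \in \mathbb{R}, \qquad \tilde\Theta_n(x) = n^N \, \tilde\Theta(n x), \quad x \in \mathbb{R}^N .
\]
As a convention, we shall use subscripts $S$ for "smooth" and $R$ for "remainder". We denote $\Phi(u) := |u|$. First, we set
\[
\Phi_{S,n} = \tilde\Theta_n * \big( \Phi \, \mathbf{1}_{A_n} \big), \qquad \Phi_{R,n} = \Phi - \Phi_{S,n},
\]
where $A_n$ stands for the annulus $A_n = \{ x \in \mathbb{R}^N ; \; 2/n \le |x| \le n \}$. Similarly, we set
\[
b_{S,m}(z) = \Theta_m * \big( b \, \mathbf{1}_{I_m} \big)(z), \qquad b_{R,m} = b - b_{S,m},
\]
where $I_m$ stands for the interval $I_m = \{ x \in \mathbb{R} ; \; -1 + 2/m \le |x| \le 1 - 2/m \}$ ($b$ is understood as a function defined on $\mathbb{R}$ with compact support in $[-1, 1]$). We then define
\[
I_1 = \int_{\mathbb{R}^N} T(G) \, Q^+_R(G, G) \, dv,
\]
where $Q^+_R$ is the gain term associated to the cross-section $B_R := |u| \, b_{R,m}$,
\[
I_2 = \int_{\mathbb{R}^N} T(G) \, Q^+_{RS}(G, G) \, dv,
\]
where $Q^+_{RS}$ is the gain term associated to the cross-section $B_{RS} := \Phi_{R,n} \, b_{S,m}$,
\[
I_3 = \int_{\mathbb{R}^N} T(G) \, \Big[ \tilde Q^+_S(\chi(G), G) + \tilde Q^+_S(T(G), \chi(G)) \Big] \, dv,
\]
where $Q^+_S$ is the gain term associated to the smooth cross-section $B_S := \Phi_{S,n} \, b_{S,m}$ and $\chi(G) := G - T(G)$, and finally
\[
I_4 = \int_{\mathbb{R}^N} T(G) \, Q^+_S(T(G), T(G)) \, dv .
\]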
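We estimate each term separately. We omit the subscripts $m$ and $n$ when there is no confusion. For $I_1$ we proceed along the lines of the proof of the estimate for the term $I^r$ in [23, Proof of Theorem 2.1]. Using Young's inequality $x \, T(y) \le \Lambda(x) + \Lambda(y)$ we have
\[
\begin{aligned}
I_1 &= \int_{\mathbb{R}^N \times \mathbb{R}^N \times \mathbb{S}^{N-1}} G \, G_* \, T(G') \, b_{R,m} \, |u| \, dv \, dv_* \, d\sigma \\
&\le \int_{\mathbb{R}^N \times \mathbb{R}^N \times \mathbb{S}^{N-1}} G \, \big[ \Lambda(G_*) + \Lambda(G') \big] \, b_{R,m} \, \mathbf{1}_{\hat u \cdot \sigma \le 0} \, |u| \, dv \, dv_* \, d\sigma \\
&\quad + \int_{\mathbb{R}^N \times \mathbb{R}^N \times \mathbb{S}^{N-1}} G_* \, \big[ \Lambda(G) + \Lambda(G') \big] \, b_{R,m} \, \mathbf{1}_{\hat u \cdot \sigma \ge 0} \, |u| \, dv \, dv_* \, d\sigma
= I_{1,1} + \dots + I_{1,4} .
\end{aligned}
\]
We just deal with the term $I_{1,2}$; the others may be handled in a similar (or even simpler) way. Making the change of variables $v_* \to v' = \phi^*_\alpha(v_*)$ (for fixed $v, \sigma$) and using the elementary inequality $|u| \le 4 \, |v' - v|$, valid when $\sigma \cdot \hat u \le 0$, there holds
\[
\begin{aligned}
I_{1,2} &= \int_{\mathbb{R}^N \times \mathbb{R}^N \times \mathbb{S}^{N-1}} G \, \Lambda(G') \, b_{R,m} \, \mathbf{1}_{\hat u \cdot \sigma \le 0} \, |u| \, dv \, dv_* \, d\sigma \\
&\le 2^{N+2} \int_{\mathbb{R}^N \times \mathbb{R}^N \times \mathbb{S}^{N-1}} G \, \Lambda(G') \, b_{R,m} \, \mathbf{1}_{\hat u \cdot \sigma \le 0} \, |v - v'| \, dv \, dv' \, d\sigma \\
&\le 2^{N+2} \, \| b_{R,m} \|_{L^1} \, \| G \|_{L^1_1} \int_{\mathbb{R}^N} \Lambda(G) \, (1 + |v|) \, dv .
\end{aligned}
\]
Since the same estimates hold for all the terms $I_{1,k}$, we obtain
\[
I_1 \le \varepsilon(m) \, \| G \|_{L^1_1} \int_{\mathbb{R}^N} \Lambda(G) \, \langle v \rangle \, dv \quad \text{with } \varepsilon(m) \to 0 \text{ as } m \to \infty . \tag{2.10}
\]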

For I2 we proceed along the line of the proof of the estimate for the term I in [24, Proof of Prop. 2.5]. Using again Young’s inequality x T (y) ≤ (x) + (y) and the trivial estimate R,n ≤ C n −1 (|v|2 + |v∗ |2 ) we get I2 = G G ∗ T (G ) b S,m R,n dv dv∗ dσ R N ×R N ×S N −1 C ≤ G |v|2 [(G ∗ ) + (G )] b S,m dv dv∗ dσ n R N ×R N ×S N −1 C + G ∗ |v∗ |2 [(G)+(G )] b S,m dv dv∗ dσ = I2,1 + · · · + I2,4 . n R N ×R N ×S N −1 Because of the truncation on b of frontal and grazing collisions, both changes of variables v → v = φα (v) (for fixed v∗ , σ ) and v∗ → v = φα∗ (v∗ ) (for fixed v, σ ) are allowed (and the jacobian of their inverse is bounded). Hence in a similar way as for the term I1 we obtain C(m) I2 ≤ G L 1 (G) dv. (2.11) 2 n RN For I3 , using again Young’s inequality, plus T (G) ≤ G and the fact that both changes of variables v → v = φα (v) (for fixed v∗ , σ ) and v∗ → v = φα∗ (v∗ ) (for fixed v, σ ) are allowed, we have I3 = (T (G )+T (G ∗ )) [G χ (G ∗ )+χ (G) T (G ∗ )] b S,m S,n dv dv∗ dσ N N N −1 R ×R ×S χ (G ∗ ) (G )+(G ∗ )+2 (G) ≤ C(n) R N ×R N ×S N −1 + χ (G) (G )+(G ∗ )+2 (G ∗ ) b S,m dv dv∗ dσ. We deduce as before

I3 ≤ Cm,n χ (G) L 1

for some constant Cm,n > 0.

RN

(G) dv

(2.12)



Finally for $I_4$, we argue as in the proof of [24, Prop. 2.6] for the treatment of the term involving $Q^+_S$, and we get for some $\theta \in (0, 1)$,
\[
I_4 \le C_{m,n} \, \| T(G) \|_{L^1}^{1 + 2\theta} \, \| T(G) \|_{L^2}^{2 (1 - \theta)}, \tag{2.13}
\]
for some constant $C_{m,n} > 0$.

Gathering (2.9), (2.10), (2.11), (2.12), (2.13) and taking $m$, next $n$ and finally $A \ge A(G)$ large enough, we may control the terms $I_1$, $I_2$ and $I_3$ by half of the left-hand side of (2.9) (for $I_3$ we use that $\| \chi_A(G) \|_{L^1} \to 0$ when $A \to \infty$). Note that the condition $A \ge A(G)$ depends on the distribution $G$ (by means of some non-concentration bound), but it plays no role since we take the limit $A \to +\infty$ in the end. We obtain
\[
\forall \, A \ge A(G), \qquad \frac{c_1}{2} \int_{\mathbb{R}^N} \Lambda_A(G) \, (1 + |v|) \, dv \le C_{b,\rho,E}(G) \, \| T_A(G) \|_{L^2}^{2 (1 - \theta)}
\]
for some constant $C_{b,\rho,E}(G) > 0$ depending on the cross-section $b$ and on the profile $G$ via its energy. Using that $T_A(G)^2/2 \le \Lambda_A(G)$ we deduce
\[
\forall \, A \ge A(G), \qquad \frac{c_1}{4} \, \| T_A(G) \|_{L^2}^{2\theta} \le C_{b,\rho,E}(G),
\]
and we then conclude that $G \in L^2$ by passing to the limit $A \to \infty$ in the preceding estimate, with the bound
\[
\| G \|_{L^2} \le \left( \frac{4 \, C_{b,\rho,E}(G)}{c_1} \right)^{\frac{1}{2\theta}} . \tag{2.14}
\]
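The passage from the weighted estimate to (2.14) is a short absorption argument. The sketch below spells it out under our own notation: $X_A$ for the truncated $L^2$ norm, $K$ for the constant $C_{b,\rho,E}(G)$, and $\Lambda_A$ as an assumed name for the $C^1$ function introduced at the start of Step 2.

```latex
% Sketch of the absorption step leading to (2.14).
% Notation (ours): X_A := \| T_A(G) \|_{L^2},  K := C_{b,\rho,E}(G).
% X_A is finite because 0 \le T_A(G) \le A and T_A(G) \le G \in L^1.
\[
  \frac{c_1}{4} \, X_A^2
  \;=\; \frac{c_1}{2} \int_{\mathbb{R}^N} \frac{T_A(G)^2}{2} \, dv
  \;\le\; \frac{c_1}{2} \int_{\mathbb{R}^N} \Lambda_A(G) \, (1 + |v|) \, dv
  \;\le\; K \, X_A^{2(1-\theta)} .
\]
% If X_A > 0, divide both sides by X_A^{2(1-\theta)}:
\[
  X_A^{2\theta} \;\le\; \frac{4K}{c_1}
  \qquad\Longrightarrow\qquad
  X_A \;\le\; \Big( \frac{4K}{c_1} \Big)^{\frac{1}{2\theta}} .
\]
% T_A(G) = \min\{G, A\} increases to G as A \to \infty, so monotone
% convergence transfers the A-independent bound on X_A to \| G \|_{L^2}.
```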

Remark 2.4. Note that the $L^2$ bound (2.14) only depends on the distribution $G$ by means of the energy $E(G)$ and the constant $c_1$. Therefore, thanks to Lemma 2.3, this bound only depends on a lower bound on the energy $E(g)$ and an upper bound on the third moment $m_{3/2}(g)$.

Step 3. Smoothness and positivity. Thanks to [24, Theorem 1.3] and [8, Theorem 1], taking $g_{in} = G$ as an initial condition in (1.16), we have that $G$ belongs to the Schwartz space of $C^\infty$ functions decreasing faster than any polynomial, and that $G \ge a_1 \, e^{-a_2 |v|}$ for some constants $a_1, a_2 > 0$.

So far the estimates in Step 3 may not be uniform in the elasticity coefficient $\alpha \in [\alpha_0, 1)$ and in the profile $G_\alpha$. The aim of the following steps is to prove that they actually are uniform. Note however that the estimates of the previous steps ensure that the following computations are rigorously justified.

Step 4. Upper bound on the energy using the energy dissipation term. We prove that
\[
\forall \, \alpha \in (0, 1], \qquad E \le \frac{4}{b_1^2} \, \rho . \tag{2.15}
\]
From Eq. (1.25) on the energy of the profile $G$ there holds
\[
(1 + \alpha) \, b_1 \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} G \, G_* \, |u|^3 \, dv \, dv_* = 2 \, \rho \int_{\mathbb{R}^N} G \, |v|^2 \, dv . \tag{2.16}
\]
From Jensen's inequality
\[
\int_{\mathbb{R}^N} |u|^3 \, G_* \, dv_* \ge \rho \, |v|^3,
\]
and Hölder's inequality
\[
\int_{\mathbb{R}^N} |v|^3 \, G \, dv \ge \rho^{-1/2} \left( \int_{\mathbb{R}^N} |v|^2 \, G \, dv \right)^{3/2},
\]
we get $(1 + \alpha) \, b_1 \, \rho^{1/2} \, E^{3/2} \le 2 \, \rho \, E$, from which the bound (2.15) follows since $1 + \alpha \ge 1$.

Step 5. Lower bound on the energy using the entropy. We prove
\[
\forall \, \alpha \in (0, 1], \qquad E \ge \frac{N^2 \, \alpha^4}{8 \, b_2^2} \, \rho \quad \text{with } b_2 := \| b \|_{L^1} . \tag{2.17}
\]

Remark 2.5. The choice of scaling we have made for the evolution equation in self-similar variables becomes clear from this computation: it is chosen such that the energy of the self-similar profile neither blows up nor vanishes as $\alpha \to 1$. The restriction $\alpha \in [\alpha_0, 1)$, $\alpha_0 > 0$, is then made in order to get a uniform estimate from below on the energy.

By integrating the equation satisfied by $G$ against $\log G$ we find
\[
\int_{\mathbb{R}^N} Q(G, G) \, \log G \, dv - \rho \, (1 - \alpha) \int_{\mathbb{R}^N} \log G \; \nabla_v \cdot (v \, G) \, dv = 0 .
\]

Then we write the first term as in [17, Sect. 1.4] to find G G ∗ 1 G G ∗ G G ∗ log − + 1 B dv dv∗ dσ 2N N −1 2 GG ∗ GG ∗ R ×S 1 G G ∗ − GG ∗ B dv dv∗ dσ +ρ (1−α) v · ∇v G dv = 0. + 2 R2N ×S N −1 RN If we denote D H,α (g) =

1 2

R2N ×S N −1

g g∗

g g∗ g g∗ − log − 1 B dv dv∗ dσ ≥ 0, (2.18) gg∗ gg∗

(recall that in this formula the post-collisional velocities v , v∗ are computed according to the inelastic formula (1.4) with normal restitution coefficient α ∈ (0, 1]), we can write 1 − D H,α (G) + − 1 b G G ∗ |u| dv dv∗ − (1 − α) N ρ 2 = 0, (2.19) 2 α2 R2N and thus we get α2 1 N α2 2 N ρ2 + D H,α (G) ≥ ρ . G G ∗ |u| dv dv∗ = 1+α 1−α 2 R2N



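On the other hand, from the Cauchy-Schwarz inequality,
\[
\int_{\mathbb{R}^{2N}} G \, G_* \, |u| \, dv \, dv_* \le \left( \int_{\mathbb{R}^{2N}} G \, G_* \, dv \, dv_* \right)^{1/2} \left( \int_{\mathbb{R}^{2N}} G \, G_* \, |u|^2 \, dv \, dv_* \right)^{1/2} = \sqrt{2} \, \rho^{3/2} \, E^{1/2},
\]
and then the bound (2.17) follows by gathering the two preceding estimates.

Step 6. Upper bound on (exponential) moments using Povzner inequality. There exist $A, C > 0$ such that
\[
\forall \, \alpha \in [0, 1), \qquad \int_{\mathbb{R}^N} G(v) \, e^{A |v|} \, dv \le C \, \rho .
\]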
We refer to [8] where that bound is obtained as an immediate consequence of the following sharp moment estimates: there exists $X > 0$ such that
\[
\forall \, \alpha \in [0, 1), \qquad m_k = \int_{\mathbb{R}^N} G \, |v|^k \, dv \le \Gamma(k + 1/2) \, X^{k/2} \, \rho . \tag{2.20}
\]
It is worth noticing that in [8] the Povzner inequality used in order to get (2.20) is uniform in the normal restitution coefficient $\alpha \in [0, 1]$ and that the factor $\rho$ comes from our choice of the scaling variables (in which $\rho$ is involved).

Step 7. Uniform upper bound on the $L^2$ norm. From (2.17), (2.20) and Remark 2.4, the $L^2$ bound (2.14) is uniform in $\alpha \in [\alpha_0, 1)$ and $G \in \mathcal{G}_\alpha$.

Step 8. Smoothness. It is enough to show some uniform bounds from above and below on the energy together with uniform non-concentration bounds on the self-similar profiles in $\mathcal{G}_\alpha$, in the form of upper bounds on the $L^2$ norms for instance. Indeed the proofs of [24, Prop. 3.1, Prop. 3.2, Prop. 3.4, Theorem 3.5 and Theorem 3.6] then apply straightforwardly (in these proofs we did not use the part associated with the anti-drift in the semigroup). Therefore the uniform bounds on the $H^k$ norms for all $k \ge 0$ follow from these results.

Step 9. Pointwise lower bound. It is a consequence of the following lemma.

Lemma 2.6. Let $g \in C([0, \infty); L^1_3)$ be a solution of the rescaled equation (1.31) with inelasticity parameter $\alpha \in (0, 1)$ and assume that for some $p > 1$ and $C, T \in (0, \infty)$,
\[
\sup_{[0, T]} \| g \|_{L^p \cap L^1_3} \le C .
\]
(i) For any $t_1 \in (0, T)$ there exists $a_1 \in (0, \infty)$ (depending on $C$, $\rho$ and $t_1$ but not on $T$) such that
\[
\forall \, t \in [t_1, T], \ \forall \, v \in \mathbb{R}^N, \qquad g(t, v) \ge a_1^{-1} \, e^{-a_1 |v|^8} . \tag{2.21}
\]
(ii) If furthermore $g_{in}$ satisfies
\[
g_{in}(v) \ge a_0^{-1} \, e^{-a_0 |v|^8},
\]
then (2.21) holds with $t_1 = 0$ and some constant $a_1 \in (0, \infty)$ (depending on $C$, $\rho$, $a_0$ but not on $T$).



Proof of Lemma 2.6. We only prove (i), the proof of (ii) being similar. Let us fix t1 ∈ (0, 1). We closely follow the proof of the Maxwellian lower bound for the solutions of the elastic Boltzmann equation (see [11,30]) taking advantage of some technical results established in its extension to the solutions of the inelastic Boltzmann equation (see [24, Theorem 4.9]). The starting point is again the evolution equation satisfied by g written in the form ∂t g + τα v · ∇v g + (τα N + C + C |v|) g = Q +α (g, g) + (C + C |v| − L(g)) g, where the last term in the right-hand side term is non-negative for some well-chosen numerical constant C ∈ (0, ∞) thanks to Lemma 2.3, (2.20) and (2.17). Let us introduce the semigroup Ut associated to the operator τα v · ∇v + λ(v), where λ(v) := τα N + C + C |v|, which action is given by t (Ut h)(v) = h(v e−τα t ) exp − λ(v e−s ) ds . 0

Thanks to the Duhamel formula, we have t Ut−s Q + (g(s + τ, .), g(s + τ, .)) ds. (2.22) ∀ t > 0, ∀ τ ≥ 0, g(t + τ, .) ≥ 0

Noticing that t − λ(v e−s ) ds ≥ −(C |v| + τα N t + C t), 0

and repeating the arguments of Steps 2 and 3 in the proof of [24, Theorem 4.9], we get that ∀ t ≥ τ, g(t, .) ≥ η 1 B(0,δ) (v),

(2.23)

with τ = τ1 = t1 /2 and some constant η = η1 > 0, δ = δ1 > 1. Let us emphasize that here we make use of Lemma 4.6, Lemma 4.7 and Lemma 4.8 in [24] where the constants exhibited in these ones are uniform in α ∈ [α0 , 1) thanks to the uniform L p ∩ L 13 estimates assumed on g. Now, on the one hand, from [24, Lemma 4.8], there exists κ ∈ (0, ∞) such that Q +α (1 B(0,1) , 1 B(0,1) ) ≥ κ 1 B(0,√5/2) , which in turns implies ∀ δ > 0,

Q +α (1 B(0,δ) , 1 B(0,δ) ) ≥ κ δ −N −1 1 B(0,√5/2 δ) .

(2.24)

On the other hand, there exists κ ∈ (0, ∞) such that ∀ δ > 0, ∀ s ∈ [0, 1], Us (1 B(0,δ) ) ≥ κ e−C δ 1 B(0,δ) .

(2.25)

From (2.23) with η = η1 , δ = δ1 , and making use of (2.22), (2.24), (2.25), we get that (2.23) holds with √ 5 t1 δ1 and η = η2 = (τ2 − τ1 ) κ η12 e−C δ1 , τ = τ2 = τ1 + 2 , δ = δ2 = 2 2



where κ = κ κ and C depends on C and N . Iterating the argument we get that (2.23) √ k+1 5/2 and holds with τ = τk = τk−1 + t1 2−k = (1 − 2−k ) t1 , δ = δk+1 = ηk+1 = (κ t1 )1+2+···+2

η12 e−C

(δ

2−[k+2 (k−1)+···+2 1] ≥ A2 , √ 8 √ √ with A := κ t1 η1 e−C δ1 /2. In other words, using that 25 > 2, we have proved k−1

k

k +2 δk−1 +···+ 2

k−1 δ

1)

k−1

k+1

k

∀ t ≥ t1 , ∀ k ∈ N, g(t, v) ≥ A2 1 B(0,2k/8 δ1 ) (v), from which we easily conclude.
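The bookkeeping of the iteration can be checked numerically. The sketch below is not part of the proof: it only tracks the times $\tau_k$ and the radii of the balls, assuming the recursions $\tau_1 = t_1/2$, $\tau_{k+1} = \tau_k + t_1 2^{-(k+1)}$ and a radius multiplied by $\sqrt{5}/2$ at each step. The function name, the indexing of the radii and the sample values of $t_1$ and $\delta_1$ are our own choices.

```python
import math

# Growth factor of the ball radius at each iteration step (assumed from (2.24)).
SQRT5_OVER_2 = math.sqrt(5.0) / 2.0


def iterate_spreading(t1, delta1, steps):
    """Return a list of (k, tau_k, radius_k) for k = 1..steps.

    tau_1 = t1 / 2 and tau_{k+1} = tau_k + t1 * 2**-(k+1);
    radius_1 = delta1 and radius_{k+1} = (sqrt(5)/2) * radius_k.
    """
    result = []
    tau = t1 / 2.0
    radius = delta1
    for k in range(1, steps + 1):
        result.append((k, tau, radius))
        tau += t1 * 2.0 ** (-(k + 1))
        radius *= SQRT5_OVER_2
    return result


if __name__ == "__main__":
    for k, tau, radius in iterate_spreading(t1=1.0, delta1=2.0, steps=5):
        print(k, tau, radius)
```

Two features matter for the conclusion: all the times $\tau_k$ stay below $t_1$, so every lower bound is available for $t \ge t_1$, and since $(\sqrt{5}/2)^8 = 625/256 > 2$ the radii grow at least like $2^{k/8}$, which is what produces the exponent 8 in (2.21).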

2.2. Estimates on the difference of two self-similar profiles. In this subsection we take advantage of the mixing effects of the collision operator in order to show that the $L^1$ norm of the difference of two self-similar profiles (corresponding to the same inelasticity coefficient) indeed controls the $H^k \cap L^1(m^{-1})$ norm of their difference for any $k \in \mathbb{N}$ and for some exponential weight function $m$, uniformly in terms of $\alpha \in [\alpha_0, 1)$.

Proposition 2.7. For any $k > 0$, there is $m = \exp(-a \, |v|)$, $a \in (0, \infty)$ and $C_k > 0$ such that for any $\alpha \in [\alpha_0, 1)$ and any $G_\alpha, H_\alpha \in \mathcal{G}_\alpha$ there holds
\[
\| H_\alpha - G_\alpha \|_{H^k \cap L^1(m^{-1})} \le C_k \, \| H_\alpha - G_\alpha \|_{L^1} . \tag{2.26}
\]

Proof of Proposition 2.7. We proceed in three steps. It is worth mentioning that all the constants in the proof are uniform in terms of the normal restitution coefficient α ∈ [α0 , 1), as they only depend on the uniform bounds of Proposition 2.1 and some uniform bounds on the collision kernel. Step 1. Control of the L 1 moments. We prove first that there exists A, C ∈ (0, ∞) such that ∀ α ∈ [α0 , 1), |Hα − G α | e A |v| dv ≤ C |Hα − G α | dv. RN

RN

Let us consider some normal restitution coefficient α ∈ [α0 , 1) and two self-similar profiles G, H ∈ Gα (here again, we omit the subscript α when there is no confusion). We denote D = G − H , S = G + H and ϕ = |v|2 p sgn(D), p ∈ 21 N, p ≥ 3/2, where sgn(D) denotes the sign of D. The equation for D reads 0 = Q α (G, G) − Q α (H, H ) − ρ (1 − α) ∇v · (v D) = 2 Q˜ α (D, S) − ρ (1 − α) ∇v · (v D). Multiplying Eq. (2.27) by ϕ, we get 0= B D S∗ ϕ∗ + ϕ − ϕ∗ − ϕ dv dv∗ dσ R N ×R N ×S N −1 ∇v (v D) |v|2 p sgn(D) dv −ρ (1 − α) N R ≤ |u| |D| S∗ K p dv dv∗ + 2 |u| |D| S∗ |v∗ |2 p dv dv∗ N N N N R ×R R ×R 2p +ρ |D| v · ∇(|v| ) dv RN

(2.27)


with

K p (v, v∗ ) :=

S N −1


(|v |2 p + |v∗ |2 p − |v|2 p − |v∗ |2 p ) b(σ · u) dσ.

From [8, Corollary 3, Lemma 2], there holds K p (v, v∗ ) ≤ γ p p − (1 − γ p ) (|v|2 p + |v∗ |2 p ), where (γ p ) p=3/2,2,... is a decreasing sequence of real numbers such that 4 , 0 < γ p < min 1, p+1

(2.28)

and p is defined by kp p |v|2k |v∗ |2 p−2k + |v|2 p−2k |v∗ |2k , p := k k=1

with k p := [( p + 1)/2] is the integer part of ( p + 1)/2 and

p stands for the binomial k

coefficient. As a consequence, 2p (1 − γ3/2 ) |v| |u| S∗ |D| dv dv∗ ≤ γ p |u| |D| S∗ p dv dv∗ N N R N ×R N R ×R +2 |u| |D| S∗ |v∗ |2 p dv dv∗ + 2 ρ p |D| |v|2 p dv. R N ×R N

RN

Using Lemma 2.3 in order to estimate L(S) from below, the inequality |u| ≤ |v| + |v∗ | and introducing the notations 2k dk := |D| |v| dv, sk := S |v|2k dv, RN

RN

we get, for some numerical constant C ∈ (0, ∞), ρ d p+1/2 ≤ γ p S p + d0 s p+1/2 + d1/2 s p + 2 ρ p d p , C

(2.29)

with kp p dk+1/2 s p−k + dk s p−k+1/2 + d p−k+1/2 sk + d p−k sk+1/2 . S p := k k=1

From Proposition 2.1, or more precisely (2.20), we know that sk ≤ ρ (k + 1/2) x k for any k ≥ 1 and for some x ∈ (1, ∞). By Hölder’s inequality, we also have 1+ 21p

dp

1

≤ d p+ 1 d02 p . 2

Repeating the proof of [8, Lemma 4], for any a ≥ 1, there exists A > 0 such that S p ≤ A ρ (d0 + d1/2 ) (a p + a/2 + 1) Z p



with Z p := max {δk+1/2 σ p−k , δk σ p−k+1/2 , δ p−k+1/2 σk , δ p−k σk+1/2 }, k=1,..,k p

and δk :=

dk sk , σk := . (d0 + d1/2 ) (a k + 1/2) ρ (a k + 1/2)

We may then rewrite (2.29) as 1+1/2 p

(a p + 1/2)1/2 p δ p

≤ A γp

(a p + a/2 + 1) Z p + (σ p+1/2 + σ p ) + 2 ρ p δ p . (ap + 1/2)

On the one hand, from (2.28), there exists A such that A γp

(ap + a/2 + 1) ≤ A pa/2−1/2 (ap + 1/2)

∀ p = 3/2, 2, . . . .

On the other hand, thanks to Stirling’s formula n! ∼ n n e−n the estimate (2.28), there exists A > 0 such that (1 − γ p ) (a p + 1/2)1/2 p ≥ A pa/2

√ 2π n when n → ∞ and

∀ p = 3/2, 2, . . . .

Therefore, 1+1/2 p

pa/2 δ p

≤ pa/2−1/2 Z p + (σ p+1/2 + δ1 σ p ) + 2 ρ p δ p .

We finally obtain dk ≤ x k (ak + 1/2) (d0 + d1/2 ), and we easily conclude as in [8, Proof of Theorem 1] or in [23, Proof of Prop. 3.2, Step 2]. Step 2. Control of the L 2 norms. For k = 0, the propagation of the L 2 norm is immediate using the result [24, Cor. 2.3]. Indeed one just has to split the collision kernel as in [24, Sect. 2.4]. For the truncated and regularized part Q +S (we use the notation introduced in Step 2 in the proof of Prop. 2.1), [24, Cor. 2.3] together with some basic interpolation yield the following control: + Q S (S, D) + Q +S (D, S) D dv ≤ C ρ 1+2θ D2−2θ L2 RN

for some explicit C > 0 and θ ∈ (0, 1). For the remaining term Q +R , we use the same control as in [24, Proof of Prop. 2.5] to get + Q R (S, D) + Q +R (D, S) D dv ≤ ε D L 1 + D L 2 D L 2 RN

2

1/2

1/2

for some ε which can be taken as small as wanted by the truncation. Gathering these estimates, we get ∀ε > 0 Q˜ + (S, D) D dv ≤ ε D2L 2 + Cε , RN

1/2



where Cε depends on weighted L 1 and L 2 norms of S, on L 1 norms on D and on ε. Using Eq. (2.30) with i = 0, Lemma 2.3 to treat the term L(D), and some elementary interpolation, we deduce that D L 2 ≤ C D L 1 1/2

2

for some constant C > 0, which concludes the proof for k = 0 using the previous step on the L 1 moments. Step 3. Control of the H k norms. From the previous step and some interpolation, in order to conclude it is enough to prove (2.26) for any k ∈ N and m ≡ 1. We proceed by induction on k. For any i ∈ N N , the equation satisfied by ∂ i D is ∂ i Q + (S, D) + ∂ i Q + (D, S) − ∂ i (L(D) S) − L(S) ∂ i D i ∂ i L(S) ∂ i−i D − ρ (1 − α) ∂ i ∇ · (v D) = 0. − i 0 0 and α ∈ (0, 1), there holds σ ∈ S N −1 , sin2 χ ≥ δ implies m −1 (v ) ≤ m −k (v) m −k (v∗ ),

(3.8)

with k = (1 − δ/160)s/2 . In order to prove (3.7) we fix ∈ L ∞ and we argue by duality again. We estimate thanks to Lemma 3.3, +,v −1 Q α (ψ, ϕ) (v) m (v) dv = |u| bv ψ∗ ϕ (m )−1 dv dv∗ dσ RN R N ×R N ×S N −1 1 |u|2 bv ψ∗ ϕ (m)−k (m ∗ )−k ≤ R R N ×R N ×S N −1 ×dv dv∗ dσ 1 ≤ L ∞ ψ L 1 (m −k ) ϕ L 1 (m −k ) 2 2 R 1 ≤ L ∞ |.| m 1−k (.)2L ∞ ψ L 1 (m −1 ) ϕ L 1 (m −1 ) , 1 1 R from which we easily conclude since x → x m 1−k (x) is uniformly bounded by Ca,s (1 − k)−1/s , Ca,s ∈ (0, ∞). Step 3. The truncated operator. Let us prove that there exists a constant C ∈ (0, ∞) such that for any δ ∈ (0, 1), α, α ∈ (0, 1] and R ∈ (1, ∞) there holds +,r Q +,r α (g, f ) − Q α (g, f ) L 1 (m −1 ) ≤ C |α − α |

R2 R + δ δ3

g L 1 (m −1 ) f W 1,1 (m −1 ) .



We closely follow the proof of [23, Prop. 4.3]. We consider some ∈ L ∞ , f, g ∈ D(R N ), we proceed by duality and next conclude thanks to a density argument. We have +,r −1 I := [Q +,r dv α (g, f ) − Q α (g, f )] m RN = |u| R (u) bδ g∗ f (vα ) m −1 (vα ) − (vα ) m −1 (vα ) dv dv∗ dσ. R N ×R N ×S N −1

With the notations of Lemma 2.2 we perform the changes of variables v → vα = φα (v) and v → vα = φα (v) (for fixed v∗ and σ ) with jacobians Jα and Jα . Observing that without restriction we may assume α ≤ α and therefore Oα = v∗ + ωα (δ) ⊂ Oα = v∗ + ωα (δ) , since s → ωs (0) is an increasing function, we get I = g∗ (m −1 ) F(φα−1 ) Jα−1 dv dv∗ dσ R N ×S N −1 Oα \Oα

+ +

R N ×S N −1 Oα

R N ×S N −1 Oα

dv dv∗ dσ g∗ (m −1 ) F(φα−1 ) Jα−1 − Jα−1 g∗ (m −1 ) F(φα−1 ) − F(φα−1 ) Jα−1 dv dv∗ dσ

= I1 + I2 + I3 ,

− v∗ ). For the first term I1 we use with F(w) := |w − v∗ | R (w − v∗ ) f (w) bδ (σ · w the backward change of variables v → v = φα−1 (v ) (for fixed v∗ and σ ) and we get |u| R (u) f g∗ (vα ) m −1 (vα ) bδ 10≤u·σ I1 = ˆ ≤η dv∗ dv dσ R N ×S N −1 R N

with η := ωα−1 ◦ ωα (δ) ≤ C δ −3/2 |α − α | for some constant C ∈ (0, ∞). Since v → |v|s/2 is an increasing subadditive function, we also have |vα |s ≤ (|v|2 + |v∗ |2 )s/2 ≤ |v|s +|v∗ |s , which implies m(vα ) ≤ C m −1 m −1 ∗ for some constant C ∈ (0, ∞) (depending of ζ ). As a consequence, we obtain |I1 | ≤ C R δ −3/2 |α − α | b L ∞ L ∞ f L 1 (m −1 ) g L 1 (m −1 ) . For the term I2 , using the backward change of variable v → v = φα−1 (v ) (for some −1 −1 fixed v∗ and σ ) and using the bounds (2.4) on Jα and |Jα − Jα |, we obtain

|I1 | ≤ C R δ −3 |α − α | b L ∞ L ∞ f L 1 (m −1 ) g L 1 (m −1 ) . In order to estimate I3 , we introduce αt := (1 − t) α + t α and, thanks to (2.3)-(2.2), we get 1 ! ! C ! ! |I3 | ≤ |α−α | |g∗ | | | (m −1 ) |v −v|!∇w F(φα−1 (v ))!dv dv∗ dσ dt. t N N −1 δ Oαt 0 R ×S Using finally the backward change of variable v → v = φα−1 (v ) and the uniform bound t (2.4) on Jαt , t ∈ [0, 1], on v∗ + δ , we get 2 R R + 2 |α − α | bW 1,∞ L ∞ g L 1 (m −1 ) f W 1,1 (m −1 ) . |I3 | ≤ C δ δ



Gathering the estimates established in Steps 1, 2 and 3, we deduce the first inequality in (3.6). The second inequality in (3.6) is proved in a similar way (using symmetric changes of variable, allowed by the truncation).

Proof of Lemma 3.3. We proceed in three steps.

Step 1. Assume first that $(2/\sqrt{5}) \, |v_*| \le |v| \le (\sqrt{5}/2) \, |v_*|$. Using the fact that $x \mapsto x^{s/2}$ is an increasing and subadditive function, there holds
\[
|v'|^s \le (|v|^2 + |v_*|^2)^{s/2} \le (9/4)^{s/2} \, |v_*|^s,
\]
and then by symmetry and because $s \le 1$,
\[
|v'|^s \le \frac{1}{2} \, (9/4)^{s/2} \, (|v|^s + |v_*|^s) \le \frac{3}{4} \, (|v|^s + |v_*|^s) .
\]

In that case, (3.8) holds with k = 3/4. Step 2. We shall first show that for any v, v∗ ∈ R N and σ ∈ S N −1 , there holds |v |2 , |v∗ |2 ≤ |v|2 + |v∗ |2 −

1+α sin2 χ |v + v∗ |2 . 8

(3.9)

We recall the formula

1+α 1+α v + v∗ 1 1 − α v + v∗ 1 1−α + u+ |u| σ , v∗ := + u− |u| σ . v := 2 2 2 2 2 2 2 2 Straightforward computations yield (denoting S = v + v∗ )

1+α |S|2 1 1 + α 2 2 1 − α 2 2 1−α 2 + |u| + |u| cos θ + (S · u) + |S| |u| cos χ . |v | ≤ 4 4 2 2 4 4 We deduce the bound from above |v |2 ≤

1+α |S|2 |u|2 1 − α + + |S| |u| + |S| |u| cos χ . 4 4 4 4

Then by applying twice Young’s inequality 1 1−α 1 1−α 1+α + + |u|2 + + |S| |u| cos χ , |v |2 ≤ |S|2 4 8 4 8 4 1 1−α 1 1−α 1+α 1+α 2 + |u|2 + ≤ |S|2 + + + |S| cos2 χ , 4 8 4 8 8 8 ≤

|S|2 |u|2 1 + α 2 + + |S| (cos2 χ − 1), 2 2 8

from which we deduce (3.9). √ √ Step 3. Assume that sin2 χ ≥ δ and that either (2/ 5) |v∗ | ≥ |v| or |v| ≥ ( 5/2) |v∗ |. In the first case, we have √ √ √ |v + v∗ | ≥ 1 − (2/ 5) |v∗ | + (2/ 5) |v∗ | − |v| ≥ 1 − (2/ 5) |v∗ |, which then implies

√ √ √ |v + v∗ | ≥ 1 − (2/ 5) ( 5/2) |v| ≥ 1 − (2/ 5) |v|.



The same inequalities are proved in a similar way in the second case. We deduce
\[
|v + v_*|^2 \ge \frac{1}{2} \, \big( 1 - (2/\sqrt{5}) \big) \, (|v|^2 + |v_*|^2) .
\]
We then deduce from (3.9) that $|v'|^2 \le (1 - \delta/160) \, (|v|^2 + |v_*|^2)$ and we conclude that (3.8) holds as in Step 1.

3.2. Quantification of the elastic limit $\alpha \to 1$. We begin with a simple consequence of Proposition 3.1.

Corollary 3.4. There exist $k_0, q_0 \in \mathbb{N}$ such that for any $a_i \in (0, \infty)$, $i = 1, 2, 3$, there exists an explicit constant $C \in (0, \infty)$ such that for any function $g$ satisfying
\[
\| g \|_{H^{k_0} \cap L^1_{q_0}} \le a_1, \qquad g \ge a_2 \, e^{-a_3 |v|^8},
\]
there holds
\[
\big| D_{H,\alpha}(g) - D_{H,1}(g) \big| \le C \, (1 - \alpha),
\]

where we recall that D H,α is defined in (2.18). Proof of Corollary 3.4. We write D H,α (g) − D H,1 (g) = b |u| gα g∗α − g g∗ dv dv∗ dσ (=: I1 ) + b |u| g g∗ log gα + log g∗α − log g − log g∗ dv dv∗ dσ (=: I2 ). For the first term, thanks to Proposition 3.1, we have |I1 | ≤ Q +α (g, g) − Q +1 (g, g) L 1 ≤ C (1 − α) g2

W33,1

.

For the second term, we write !" #!! ! |I2 | = 2 ! (Q +α (g, g) − Q +1 (g, g)) v8 , v−8 log g ! ≤ 2 Q +α (g, g) − Q +1 (g, g) L 1 v−8 log g L ∞ 8

≤ C (1 − α) g2

3,1 W11

≤C

(1 − α) a12

(| log g L ∞ | + | log a2 | + a3 )

(| log a1 | + | log a2 | + C a3 ) ,

3,1 for thanks to Proposition 3.1 and the bounded embedding H k0 ∩ L q10 ⊂ L ∞ ∩ W11 k0 , q0 large enough (see Proposition B.1). We conclude the proof gathering these two estimates.

Let us now recall the Csiszár-Kullback-Pinsker inequality (see [14,22]) and an "entropy-entropy production inequality" (the version we present here is established in [31]) that we will use several times in the sequel.



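Theorem 3.5. (i) For a given function $g \in L^1_2$, let us denote by $M[g]$ the Maxwellian function with the same mass, momentum and temperature as $g$. For any $0 \le g \in L^1_2(\mathbb{R}^N)$, there holds
\[
\| g - M[g] \|_{L^1}^2 \le 2 \, \rho(g) \int_{\mathbb{R}^N} g \, \ln \frac{g}{M[g]} \, dv . \tag{3.10}
\]
(ii) For any $\varepsilon > 0$ there exist $k_\varepsilon, q_\varepsilon \in \mathbb{N}$ and for any $A \in (0, \infty)$ there exists $C_\varepsilon = C_{\varepsilon,A} \in (0, \infty)$ such that for any $g \in H^{k_\varepsilon} \cap L^1_{q_\varepsilon}$ such that
\[
g(v) \ge A^{-1} \, e^{-A |v|^8}, \qquad \| g \|_{H^{k_\varepsilon} \cap L^1_{q_\varepsilon}} \le A,
\]
there holds
\[
C_\varepsilon \, \rho(g)^{1 - \varepsilon} \left( \int_{\mathbb{R}^N} g \, \ln \frac{g}{M[g]} \, dv \right)^{1 + \varepsilon} \le D_{H,1}(g) . \tag{3.11}
\]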
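The two inequalities of Theorem 3.5 are typically chained as follows. This is only a sketch: the shorthand $H(g|M)$ for the relative entropy and the constant $K$ in the assumed bound $D_{H,1}(g) \le K \rho^2 (1 - \alpha)$ are our notation, chosen to mimic how the theorem is applied to self-similar profiles.

```latex
% Sketch: chaining (3.11) and (3.10).
% Notation (ours): H(g|M) := \int g \ln (g / M[g]) \, dv,  \rho = \rho(g),
% and K is a hypothetical constant with D_{H,1}(g) \le K \rho^2 (1 - \alpha).
\[
  H(g|M)
  \;\le\; \Big( \frac{D_{H,1}(g)}{C_\varepsilon \, \rho^{1-\varepsilon}} \Big)^{\frac{1}{1+\varepsilon}}
  \;\le\; \Big( \frac{K}{C_\varepsilon} \Big)^{\frac{1}{1+\varepsilon}}
          \rho \, (1 - \alpha)^{\frac{1}{1+\varepsilon}}
  \qquad \text{by (3.11),}
\]
\[
  \| g - M[g] \|_{L^1}^{2}
  \;\le\; 2 \, \rho \, H(g|M)
  \;\le\; 2 \, \Big( \frac{K}{C_\varepsilon} \Big)^{\frac{1}{1+\varepsilon}}
          \rho^{2} \, (1 - \alpha)^{\frac{1}{1+\varepsilon}}
  \qquad \text{by (3.10).}
\]
% Raising to the power 1 + \varepsilon gives a bound of the form
% \| g - M[g] \|_{L^1}^{2 + 2\varepsilon} \le C \rho^{2 + 2\varepsilon} (1 - \alpha).
```

The powers of $\rho$ balance because $D_{H,1}$ is quadratic in $g$ while the relative entropy is homogeneous of degree one.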

We then have the following estimate on the distance between $G_\alpha$ and $\bar G_1$ for any self-similar profile $G_\alpha$.

Proposition 3.6. For any $\varepsilon > 0$ there exists $C_\varepsilon$ (independent of the mass $\rho$) such that
\[
\forall \, \alpha \in [\alpha_0, 1), \qquad \sup_{G_\alpha \in \mathcal{G}_\alpha} \| G_\alpha - \bar G_1 \|_{L^1_2} \le C_\varepsilon \, \rho \, (1 - \alpha)^{\frac{1}{2 + \varepsilon}}, \tag{3.12}
\]
where we recall that $\bar G_1$ is the Maxwellian function defined by (1.27)–(1.29).

Proof of Proposition 3.6. On the one hand, for any inelasticity coefficient $\alpha \in [\alpha_0, 1)$ and profile $G_\alpha$, there holds from (2.19) together with Corollary 3.4 and the uniform estimates of Proposition 2.1,
\[
D_{H,1}(G_\alpha) \le D_{H,\alpha}(G_\alpha) + \rho^2 \, O(1 - \alpha) \le \rho^2 \, O(1 - \alpha) . \tag{3.13}
\]

On the other hand, introducing the Maxwellian function Mθ with the same mass, momentum and temperature as G α , that is Mθ given by (1.28) with u = 0 and θ = E(G α )/ρ, and gathering (3.13), (3.11), (3.10) with the uniform estimates of Proposition 2.1 and interpolation inequality, we obtain that for any q, ε > 0 there exists Cq,ε such that ∀α ∈ [α0 , 1) Next, from (2.16), we have b1

G α − Mθ 2+ε ≤ Cq,ε ρ 2+ε (1 − α). L1 q

(3.14)

G α G α∗ |u| dv dv∗ − ρ G α |v|2 dv N R b1 3 G α G α∗ |u| dv dv∗ , = (1 − α) 2 RN RN RN

3

RN

and then |(θ )| ≤ C1 G α − Mθ L 1 + C2 ρ 2 (1 − α), 3

(3.15)



where we have used that G α and Mθ are bounded thanks to Proposition 2.1 and we have defined Mθ |v|2 dv − b1 Mθ Mθ∗ |u|3 dv dv∗ . (3.16) (θ ) = ρ RN

RN RN

By elementary changes of variables, this formula simplifies into (θ ) = k1 θ − k2 θ 3/2 with k1 = ρ 2 N and, using (A.3), 2 M1,0,1 (M1,0,1 )∗ |u|3 dv dv∗ = 23/2 ρ 2 b1 m3/2 (M1,0,1 ). k2 = ρ b1 R N ×R N

We next observe that ∈ C ∞ (0, ∞) and is strictly concave. It is also obvious that the equation (θ ) = 0 for θ > 0 has a unique solution which is θ¯1 defined in (1.29), and that we have (θ ) ≤ (θ¯1 ) (θ − θ¯1 ) = −k1 (θ − θ¯1 )/2 as well as (θ ) = θ [k1 − k2 θ 1/2 ] = k2 θ [θ¯1

1/2

− θ 1/2 ].

(3.17)
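The elementary properties of the temperature function used here can be verified directly. In the sketch below, $\Psi$ is our name for the function defined in (3.16), since its symbol is not legible in the text; the computation only uses the simplified form $k_1 \theta - k_2 \theta^{3/2}$ with $k_1, k_2 > 0$.

```latex
% Sketch: properties of \Psi(\theta) = k_1 \theta - k_2 \theta^{3/2} on (0, \infty).
% \Psi is our notation for the function in (3.16).
\[
  \Psi''(\theta) = -\tfrac{3}{4} \, k_2 \, \theta^{-1/2} < 0 ,
  \qquad
  \Psi(\theta) = 0 \iff \theta^{1/2} = \frac{k_1}{k_2}
  \iff \theta = \bar\theta_1 := \Big( \frac{k_1}{k_2} \Big)^{2} .
\]
% Slope at the root, using k_2 \bar\theta_1^{1/2} = k_1:
\[
  \Psi'(\bar\theta_1) = k_1 - \tfrac{3}{2} \, k_2 \, \bar\theta_1^{1/2}
  = k_1 - \tfrac{3}{2} \, k_1 = - \frac{k_1}{2} .
\]
% Concavity places the graph below its tangent at \bar\theta_1:
\[
  \Psi(\theta) \le \Psi(\bar\theta_1) + \Psi'(\bar\theta_1) \, (\theta - \bar\theta_1)
  = - \frac{k_1}{2} \, (\theta - \bar\theta_1) .
\]
% Factorisation, again with k_1 = k_2 \bar\theta_1^{1/2}:
\[
  \Psi(\theta) = \theta \, \big[ k_1 - k_2 \, \theta^{1/2} \big]
  = k_2 \, \theta \, \big[ \bar\theta_1^{1/2} - \theta^{1/2} \big] .
\]
```

The factorised form shows that a bound on $|\Psi(\theta)|$, combined with a lower bound on $\theta$, controls $|\theta^{1/2} - \bar\theta_1^{1/2}|$.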

Plugging this expression for into (3.15) and using the lower bound (2.17) on the temperature θ and the estimate (3.14), we obtain that for any ε > 0 there is Cε ∈ (0, ∞) such that ! ! ! 1/2 ¯ 1/2 !2+ε ∀ α ∈ (α0 , 1) ≤ Cε (1 − α). (3.18) !θ − θ1 ! We have thus proved that the temperature of G¯ α converges (with rate) to the expected temperature θ¯1 . In order to come back to the norm of G α − G¯ 1 , we first write, using Cauchy-Schwarz’s inequality, G α − G¯ 1 L 1

−N

≤ G α − Mθ L 1

−N

+ Mθ − G¯ 1 L 1

−N

≤ G α − Mθ L 1 + C N Mθ − G¯ 1 L 2 ,

(3.19)

and we remark that ! 1/2 ! Mθ − G¯ 1 2L 2 ≤ C ρ 2 !θ 1/2 − θ¯1 !.

(3.20)

Gathering (3.19) with (3.20), (3.18) and (3.14) we deduce that for any ε > 0 there is Cε ∈ (0, ∞) such that ∀ α ∈ (α0 , 1)

G α − G¯ 1 2+ε ≤ Cε ρ 2+ε (1 − α), L1

and (3.12) follows by interpolation again.

−N



4. Uniqueness and Continuity of the Path of Self-Similar Profiles

4.1. The proof of uniqueness.

Theorem 4.1. There exists a constructive $\alpha_1 \in (0, 1)$ such that the solution $G_\alpha$ of (1.32) is unique for any $\alpha \in [\alpha_1, 1]$. We denote by $\bar G_\alpha$ this unique self-similar profile.

This theorem is an immediate consequence of the following result.

Proposition 4.2. There is a constructive constant $\eta \in (0, 1)$ such that
\[
\left.
\begin{array}{l}
G, H \in \mathcal{G}_\alpha, \quad \alpha \in (1 - \eta, 1) \\
\| G - \bar G_1 \|_{L^1_2} \le \eta, \quad \| H - \bar G_1 \|_{L^1_2} \le \eta
\end{array}
\right\} \quad \text{implies} \quad G = H .
\]

Proof of Theorem 4.1. Let us assume that Proposition 4.2 holds. Then Proposition 3.6 implies that there is some explicit $\varepsilon \in (0, 1)$ such that for $\alpha \in (1 - \varepsilon, 1]$ one has
\[
\sup_{G_\alpha \in \mathcal{G}_\alpha} \| G_\alpha - \bar G_1 \|_{L^1_2} \le \eta,
\]

where η is defined in the statement of Proposition 4.2. Up to reducing η, it is always possible to take η ≤ ε, and the proof is completed by applying Proposition 4.2. Proof of Proposition 4.2. Let us consider any exponential weight function m with s ∈ (0, 1), a ∈ (0, +∞), or with s = 1 and a ∈ (0, ∞) small enough. With the notations of Sub-sect. 1.5, let us also define O = C0,0,0 ∩ L1 (m −1 ) the subvector space of L1 (m −1 ) of functions with zero energy, ψ = C (|v|2 − N ) M1,0,1 such that E(ψ) = 1, and the following projection: : L1 (m −1 ) → O,

(g) = g − E(g) ψ.

Finally, let us introduce the following non-linear functional operator: : [0, 1) × (W11,1 (m −1 ) ∩ Cρ,0 ) → R × O, and (1, ·) : (L 11 (m −1 ) ∩ Cρ,0 ) → R × O, by setting (α, g) = (1 + α) DE (g) − 2 ρ E(g), Q α (g, g) − τα divv (v g) , where DE (g) is defined in (1.13). It is straightforward that (α, G α ) = 0 for any α ∈ [α0 , 1] and G α ∈ Gα , and that the equation (1, g) = (0, 0) has a unique solution, given by g = G¯ 1 = Mρ,0,θ¯1 defined in (1.27), (1.29).



The function is nonlinear in terms of its first argument, and it is quadratic in terms of its second argument (more precisely, it is the sum of linear and quadratic terms in terms of its second argument). Hence easy computations yield the following formal differential according to the second argument at the point (1, G¯ 1 ): (4.1) D2 (1, G¯ 1 ) h = A h := 4 D˜ E (G¯ 1 , h) − 2ρ E(h), 2 Q˜ 1 (G¯ 1 , h) , where Q˜ α is defined in (1.8) and D˜ E (g, h) := b1

R N ×R N

g h ∗ |u|3 dv dv∗ .

Notice that we can remove the projection on the last argument in (4.1) since the elastic collision operator always has zero energy. Then we have Lemma 4.3. The linear functional A : L11 (m −1 ) → R × O

h → A h = D2 (1, G¯ 1 ) h

is invertible: it is bijective with A−1 bounded with explicit estimate. Proof of Lemma 4.3. Since the spectrum of the linear operator L1 defined on L 1 (m −1 ) (with domain L 11 (m −1 )) includes 0 as a discrete eigenvalue associated with the eigenspace KerL1 = Span{G¯ 1 , v1 G¯ 1 , . . . , v N G¯ 1 , |v|2 G¯ 1 } by [27, Theorem 1.3] and since moreover O ∩ KerL1 = {0}, we deduce that it is invertible from O ∩ L11 (m −1 ) onto O. Moreover the work [27, Sect. 4] provides explicit estimates on the norm of its inverse. We deduce immediately that L−1 1 maps O onto itself with explicit bound. For any h ∈ L1 (m −1 ), we decompose h = h 1 φ1 + h ⊥ , with h 1 :=

E(h) ∈ R, h ⊥ ∈ O, E(φ1 )

where we recall that φ1 is defined in (1.35). Then, using the characterization (1.32) of G¯ 1 , A h = b1 G¯ 1 (v) h ⊥ (v∗ ) |u|3 dv dv∗ R N ×R N

2 ¯ ¯ 3 4 ⊥ ¯ +h 1 b1 |v| G 1 G 1∗ |u| dv dv∗ − 2ρ G 1 |v| dv , L1 (h ) . R N ×R N

RN

The claimed invertibility follows from the fact that C ∗ = 2 N ρ 2 θ¯12 = 0. Indeed, from (A.2) and (A.4) there holds ∗ 2 ¯ ¯ 3 C := b1 |v| G 1 G 1∗ |u| dv dv∗ −2ρ G¯ 1 |v|4 dv R N ×R N RN

1/2 2 ¯2 2 3 4 ¯ = ρ θ1 b1 θ1 M1,0,1 (M1,0,1 )∗ |v| |u| dv dv∗ −2 M1,0,1 |v| dv R N ×R N RN 1/2 √ 2 (2N + 3) m3/2 (M1,0,1 ) − 2 N (N + 2) , = ρ 2 θ¯12 b1 θ¯1 and we conclude thanks to formula (1.29).



Let us come back to the proof of Proposition 4.2. We write G α − Hα = A−1 [A G α − (α, G α ) + (α, Hα ) − A Hα ] = A−1 (I1 , I2 )

(4.2)

with (recall that the bilinear operators D˜ E and Q˜ α are symmetric) ' I1 := 4 D˜ E (G¯ 1 , G α − Hα ) − (1 + α) D(G α ) + (1 + α) D(Hα ) I2 := I2,1 + I2,2 and '

I2,1 := 2 Q˜ 1 (G¯ 1 , G α − Hα ) − Q α (G α , G α ) + Q α (Hα , Hα ) I2,2 := ρ (1 − α) ∇v · (v (Hα − G α )) .

On the one hand, I1 = 2 D 2G¯ 1 − (G α + Hα ), G α − Hα + (1 − α) D(G α + Hα , G α − Hα ) so that

|I1 | ≤ C3 G¯ 1 − G α L 1 + G¯ 1 − Hα L 1

+ (1 − α) G α L 1 + (1 − α) Hα L 1 G α − Hα L 1 3

3

3

3

3

≤ η1 (α) G α − Hα L 1 (m −1 )

(4.3)

1

with η1 (α) → 0 when α → 1 (with explicit rate, for instance η1 (α) = C1 (1 − α)1/3 ) because of Propositions 2.1 and 3.6. On the other hand, I2,1 = Q 1 (G¯ 1 , G α − Hα ) − Q α (G¯ 1 , G α − Hα ) + Q 1 (G α − Hα , G¯ 1 ) −Q α (G α − Hα , G¯ 1 ) + Q α (G¯ 1 − G α , G α − Hα ) + Q α (G α − Hα , G¯ 1 − Hα ). From Proposition 3.2 there holds Q 1 (G¯ 1 , G α − Hα ) − Q α (G¯ 1 , G α − Hα ) L 1 (m −1 ) ≤ ε(α) G α − Hα L 1 (m −1 ) , 1

Q 1 (G α − Hα , G¯ 1 ) − Q α (G α − Hα , G¯ 1 ) L 1 (m −1 ) ≤ ε(α) G α − Hα L 1 (m −1 ) , 1

with ε(α) → 0 as α → 1 (with again explicit rate, for instance ε(α) = C1 (1 − α)1/12 if s = 1/2 in the formula of m). From elementary estimates in L 1 (m −1 ) we have Q α (G¯ 1 − G α , G α − Hα ) + Q α (G α − Hα , G¯ 1 − Hα ) L 1 (m −1 ) ≤ C4 G α − G¯ 1 L 1 (m −1 ) + Hα − G¯ 1 L 1 (m −1 ) G α − Hα L 1 (m −1 ) . 1

1

1

Together with Propositions 3.6 we thus obtain I2,1 L 1 (m −1 ) ≤ η2 (α) G α − Hα L 1 (m −1 ) 1

(4.4)



for some $\eta_2(\alpha) \to 0$ as $\alpha \to 1$. Here we can take for instance (when $s = 1/2$ in the formula of $m$) $\eta_2(\alpha) = C_2 (1-\alpha)^{1/12}$ for some $C_2 \in (0, \infty)$ by picking a suitable $\varepsilon$ and interpolating. Finally from Proposition 2.7 there holds

\[
\|I_{2,2}\|_{L^1(m^{-1})} \le C_5\, (1-\alpha)\, \|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})}. \tag{4.5}
\]

Gathering (4.3), (4.4) and (4.5) we obtain from (4.2) and Lemma 4.3

\[
\|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})} \le \eta(\alpha)\, \|A^{-1}\|\, \|G_\alpha - H_\alpha\|_{L^1_1(m^{-1})}
\]

for some function $\eta$ such that $\eta(\alpha) \to 0$ as $\alpha \to 1$ (with explicit rate). Hence choosing $\alpha_1$ close enough to 1 we have $\eta(\alpha)\, \|A^{-1}\| \le 1/2$ for any $\alpha \in [\alpha_1, 1)$. This implies $G_\alpha = H_\alpha$ and concludes the proof.

4.2. Differentiability of the map $\alpha \mapsto \bar G_\alpha$ at $\alpha = 1$.

Lemma 4.4. The map $[\alpha_1, 1] \to L^1(m^{-1})$, $\alpha \mapsto \bar G_\alpha$ is continuous on $[\alpha_1, 1]$ and differentiable at $\alpha = 1$. More precisely, there exists $\bar G_1' \in L^1(m^{-1})$ and for any $\eta \in (1, 2)$ there exists a constructive $C_\eta \in (0, \infty)$ such that

\[
\|\bar G_\alpha - \bar G_1 - (1-\alpha)\, \bar G_1'\|_{L^1(m^{-1})} \le C_\eta\, (1-\alpha)^\eta \qquad \forall\, \alpha \in (\alpha_0, 1). \tag{4.6}
\]

Proof of Lemma 4.4. We split the proof into four steps. Step 1. For the continuity we use a classical stability argument. Let us consider a sequence (αn )n≥0 such that αn ∈ [α1 , 1] and αn → α. From the uniform bound (2.1), we may extract a subsequence (G¯ αn ) which strongly converges in L 1 (m −1 ) to a function G α . Passing to the limit in Eqs. (1.32) associated to the normal restitution coefficient αn and written for G αn , we deduce that G α satisfies (1.32) associated to the normal restitution coefficient α. From the uniqueness of the solution proved in Theorem 4.1, there holds G α = G¯ α and thus the whole sequence G¯ αn converges to G¯ α . Step 2. We next prove that there exists an explicit constant C such that ∀ α ∈ [α1 , 1]

G¯ α − G¯ 1 L 1 (m −1 ) ≤ C (1 − α).

We write G¯ α − G¯ 1 = A−1 [A G¯ α − (α, G¯ α ) + (1, G¯ 1 ) − A G¯ 1 ] = A−1 (J1 , J2 ) with '

J1 := 4 D˜ E (G¯ 1 , G¯ α − G¯ 1 ) + 2 D˜ E (G¯ 1 , G¯ 1 ) − (1 + α) D˜ E (G¯ α , G¯ α ) J2 := J2,1 + J2,2

and '

J2,1 := Q 1 (G¯ 1 , G¯ α ) + Q 1 (G¯ α , G¯ 1 ) − Q α (G¯ α , G¯ α ) J2,2 := ρ (1 − α) ∇v · v (G¯ α ) .

(4.7)



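On the one hand,

\[
J_1 = -2\, \tilde D_E(\bar G_1 - \bar G_\alpha, \bar G_1 - \bar G_\alpha) + (1-\alpha)\, D(\bar G_\alpha, \bar G_\alpha)
\]

so that

\[
|J_1| \le C\, \|\bar G_1 - \bar G_\alpha\|^2_{L^1_3} + C\, (1-\alpha).
\]

On the other hand,

\[
J_{2,1} = -Q_1(\bar G_1 - \bar G_\alpha, \bar G_1 - \bar G_\alpha) + Q_1(\bar G_\alpha, \bar G_\alpha) - Q_\alpha(\bar G_\alpha, \bar G_\alpha).
\]

Hence using Propositions 2.7, 3.1, and the bound (2.1), we deduce

\[
|J_{2,1}| \le \|\bar G_\alpha - \bar G_1\|^2_{L^1_2} + C\, (1-\alpha)
\]

and we also have straightforwardly $J_{2,2} = O(1-\alpha)$. Gathering all these estimates, we thus obtain from (4.7),

\[
\|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})} \le \|A^{-1}\| \Big( \|\bar G_\alpha - \bar G_1\|^2_{L^1_1(m^{-1})} + C\, (1-\alpha) \Big).
\]

Using then the explicit result of quantification of the elastic limit in Proposition 3.6, we have that for some $\alpha_2 \in [\alpha_1, 1)$ close enough to 1:

\[
\forall\, \alpha \in [\alpha_2, 1], \qquad \|A^{-1}\|\, \|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})} < \frac12,
\]

and thus we get

\[
\forall\, \alpha \in [\alpha_2, 1], \qquad \|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})} \le 2\, C\, \|A^{-1}\|\, (1-\alpha),
\]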

which implies the claimed estimate. Step 3. In order to prove the differentiability we must slightly improve the estimate established in the preceding step. On the one hand we exhibit what should be the derivative of G¯ α at α = 1, and denote it by R. Formally differentiating Eq. (1.32) at α = 1 we have Q 1 (G¯ 1 , G¯ 1 ) + 2 Q˜ 1 (R, G¯ 1 ) + ρ ∇v · v G¯ 1 = 0. On the other hand, we may compute 1 ¯ 2 ¯

Q α (G 1 , G 1 ), |.| = b |u| G¯ 1 G¯ 1∗ (|u| σ −u) · (|u| σ ) dv dv∗ dσ 4 R N ×R N ×S N −1 (4.8) = 2 DE (G¯ 1 ). Next, dividing Eq. (1.25) on the energy of G α by (1 − α) and formally differentiating the resulting expression we get 2 ρ E(R) − D˜ E (G¯ 1 , G¯ 1 ) − 4 D˜ E (R, G¯ 1 ) = 0. We now rigorously define R in the following way G¯ 1 = R := A−1 − D˜ E (G¯ 1 , G¯ 1 ), −F , F := Q α (G¯ 1 , G¯ 1 ) + ρ ∇v · v G¯ 1 .



Note that R is well-defined since E(F) = 0 because of (4.8) and the definition of G¯ 1 . Step 4. We finally come back to Step 2 and we shall construct a Taylor expansion of order 1. We want to estimate G¯ α − G¯ 1 + (α − 1) G¯ 1 = A−1 J1 − (α − 1) D˜ E (G¯ 1 , G¯ 1 ), J2 − (1 − α) F . On the one hand J1 − (α − 1) D˜ E (G¯ 1 , G¯ 1 ) = −2 D˜ E (G¯ 1 − G¯ α , G¯ 1 − G¯ α ) + (1 − α) D(G¯ α , G¯ α ) − D˜ E (G¯ 1 , G¯ 1 ) , so that we obtain straightforwardly ! ! ! J1 − (α − 1) D˜ E (G¯ 1 , G¯ 1 )! ≤ C (1 − α)2 . On the other hand, J2 − (1 − α) F := J2,1 + J2,2 with J2,1 = −Q 1 (G¯ 1 − G¯ α , G¯ 1 − G¯ α ) + Q 1 (G¯ α − G¯ 1 , G¯ α ) − Q α (G¯ α − G¯ 1 , G¯ α ) +Q 1 (G¯ 1 , G¯ α − G¯ 1 ) − Q α (G¯ 1 , G¯ α − G¯ 1 ) + (1 − α) ∇v · v (G¯ α − G¯ 1 ) and J2,2 = Q 1 (G¯ 1 , G¯ 1 ) − Q α (G¯ 1 , G¯ 1 ) − (1 − α) K . It is clear from Propositions 3.1, the bound of Step 2, and some interpolation with the uniform bounds (2.1), that J2,1 L 1 (m −1 ) , J2,2 L 1 (m −1 ) ≤ Ck (1 − α)k for any k ∈ (1, 2).
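The absorption argument used in Step 2 of the proof of Lemma 4.4 can be spelled out as follows. This is only a sketch: the shorthand $x$, $\kappa$ and $c$ is introduced here for readability and does not appear in the text.

```latex
% Shorthand assumed for this sketch (not the paper's notation):
%   x      = \|\bar G_\alpha - \bar G_1\|_{L^1_1(m^{-1})},
%   \kappa = \|A^{-1}\|,
%   c      = C\,(1-\alpha).
\begin{align*}
x &\le \kappa\,(x^2 + c)
      && \text{(estimate obtained from (4.7))} \\
  &= (\kappa x)\,x + \kappa c \\
  &\le \tfrac12\,x + \kappa c
      && \text{(since } \kappa x < \tfrac12 \text{ for } \alpha \in [\alpha_2, 1]\text{)} \\
\Longrightarrow\quad
x &\le 2\,\kappa\,c = 2\,C\,\|A^{-1}\|\,(1-\alpha).
\end{align*}
```

The same absorption with $c = 0$ is what closes the uniqueness proof of Proposition 4.2: an inequality of the form $x \le \tfrac12 x$ with $x \ge 0$ finite forces $x = 0$.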

5. Study of the Spectrum and Semigroup of the Linearized Problem

In this section we shall obtain information on the geometry of the spectrum of the linearized rescaled inelastic collision operator (for a small inelasticity), as well as estimates on its resolvent and on the associated linear semigroup. We shall again use the properties of the elastic linearized operator and some perturbation arguments. In order to do so, one needs some common functional “ground” for the linearized operators in the limit of vanishing inelasticity. This common functional setting is given by the study [27] in which the spectral study of the elastic linearized operator is made in $L^1$ spaces with exponential weights $e^{a |v|^s}$, $a \in (0, +\infty)$, $s \in (0, 1)$. We thus consider the operator $g \mapsto Q_\alpha(g, g) - \tau_\alpha \nabla_v \cdot (v\, g)$ and some fluctuations $h$ around the self-similar profile $\bar G_\alpha$: $g = \bar G_\alpha + h$ with $h \in L^1(m^{-1})$, where $m$ is a fixed smooth exponential weight function, as defined in (1.30). The corresponding linearized unbounded operator $\mathcal{L}_\alpha$ acting on $L^1(m^{-1})$ with



domain $\mathrm{dom}(\mathcal{L}_\alpha) = W^{1,1}_1(m^{-1})$ if $\alpha \neq 1$ and $\mathrm{dom}(\mathcal{L}_1) = L^1_1(m^{-1})$, is defined in (1.33) (it is straightforward to check that it is closed in this space). Since the equation in self-similar variables preserves mass and the zero momentum, the correct spectral study of $\mathcal{L}_\alpha$ requires restricting this operator to zero mean and centered distributions (which are preserved as well by $\mathcal{L}_\alpha$), and therefore we shall work in $\mathbb{L}^1(m^{-1})$. When restricted to this space, the operator $\mathcal{L}_\alpha$ is denoted by $\hat{\mathcal{L}}_\alpha$. We denote by $R(\hat{\mathcal{L}}_\alpha)$ the resolvent set of $\hat{\mathcal{L}}_\alpha$, and by $\mathcal{R}_\alpha(\xi) = (\hat{\mathcal{L}}_\alpha - \xi)^{-1}$ its resolvent operator for any $\xi \in R(\hat{\mathcal{L}}_\alpha)$.

Let us recall that the linearized elastic hard spheres Boltzmann equation, the spectrum, and the asymptotic stability have been studied by many authors since the pioneering works by Hilbert [20], Carleman [12] and Grad [18], and we refer for instance to [27] for more references. The result established for $\mathcal{L}_1$ (and translated straightforwardly to $\hat{\mathcal{L}}_1$) in [27] is the following:

Theorem 5.1.
(i) There exists a decreasing sequence of real discrete eigenvalues $(\mu_n)_{n \ge 1}$ (that is: eigenvalues isolated and with finite multiplicity) of $\hat{\mathcal{L}}_1$, with “energy” eigenvalue $\mu_1 = 0$ of multiplicity 1 and “energy” eigenvector $\phi_1$ (defined in (1.35)), $\mu_2 < 0$ and $\lim \mu_n = \mu_\infty \in (-\infty, 0)$ such that the spectrum $\Sigma(\hat{\mathcal{L}}_1)$ of $\hat{\mathcal{L}}_1$ in $\mathbb{L}^1(m^{-1})$ is written

\[
\Sigma(\hat{\mathcal{L}}_1) = (-\infty, \mu_\infty] \cup \{\mu_n\}_{n \in \mathbb{N}}.
\]

In particular, $\hat{\mathcal{L}}_1$ is onto from $\mathcal{O} \cap \mathbb{L}^1_1(m^{-1})$ onto $\mathcal{O}$.

(ii) The resolvent $\mathcal{R}_1(\xi)$ has a sectorial property after “subtraction” of the “energy” eigenvalue, namely there is a constructive $\mu_2 < \lambda < 0$ such that

\[
\forall\, \xi \in A, \qquad \|\mathcal{R}_1(\xi)\|_{\mathbb{L}^1(m^{-1})} \le a + \frac{b}{|\xi + \lambda|},
\]

with

\[
A = \Big\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Big[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Big] \ \text{and}\ \Re e\, \xi \le \frac{\lambda}{2} \Big\}.
\]

(iii) The linear semigroup $S_1(t)$ associated to $\hat{\mathcal{L}}_1$ in $\mathbb{L}^1(m^{-1})$ is written

\[
\forall\, t \ge 0, \qquad S_1(t) = \Pi_1 + R_1(t),
\]

where $\Pi_1$ is the projection on the eigenspace associated to $\mu_1$ and $R_1(t)$ is a semigroup which satisfies

\[
\forall\, t \ge 0, \qquad \|R_1(t)\|_{\mathbb{L}^1(m^{-1})} \le C\, e^{\mu_2 t}
\]

with explicit constant $C$.

The main result proved in this section is a perturbation result which extends Theorem 5.1 in the following way. Let us define for any $x \in \mathbb{R}$ the half-plane $\Delta_x$ by

\[
\Delta_x = \{ \xi \in \mathbb{C},\ \Re e\, \xi \ge x \}.
\]



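Theorem 5.2. Let us fix $\bar\mu \in (\mu_2, 0)$, $k, q \in \mathbb{N}$ and $m$ a smooth exponential weight function with $s \in (0, 1)$. Then there exists $\alpha_2 \in (\alpha_1, 1)$ such that for any $\alpha \in [\alpha_2, 1]$ the following holds:

(i) The spectrum $\Sigma(\hat{\mathcal{L}}_\alpha)$ of $\hat{\mathcal{L}}_\alpha$ in $W^{k,1}_q(m^{-1})$ is written

\[
\Sigma(\hat{\mathcal{L}}_\alpha) = E_\alpha \cup \{\mu_\alpha\}, \qquad E_\alpha \subset \Delta^c_{\bar\mu},
\]

where $\mu_\alpha$ is a 1-dimensional real eigenvalue which does not depend on the choice of the space $W^{k,1}_q(m^{-1})$ and satisfies (1.34).

(ii) The resolvent $\mathcal{R}_\alpha(\xi)$ in $W^{k,1}_q(m^{-1})$ is holomorphic on a neighborhood of $\Delta_{\bar\mu} \setminus \{\mu_\alpha\}$ and there are explicit constants $C_1$, $C_2$ such that

\[
\sup_{z \in \mathbb{C},\ \Re e\, z = \bar\mu} \|\mathcal{R}_\alpha(z)\|_{W^{k,1}_q(m^{-1}) \to W^{k,1}_q(m^{-1})} \le C_1
\]

and

\[
\|\mathcal{R}_\alpha(\bar\mu + i s)\|_{W^{k+1,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{C_2}{1 + |s|}.
\]

(iii) The linear semigroup $S_\alpha(t)$ associated to $\hat{\mathcal{L}}_\alpha$ in $W^{k+2,1}_{q+2}(m^{-1})$, $k, q \in \mathbb{N}$, is written

\[
S_\alpha(t) = e^{\mu_\alpha t}\, \Pi_\alpha + R_\alpha(t),
\]

where $\Pi_\alpha$ is the projection on the (1-dimensional) eigenspace associated to $\mu_\alpha$ and where $R_\alpha(t)$ is a semigroup which satisfies

\[
\|R_\alpha(t)\|_{W^{k+2,1}_{q+2}(m^{-1}) \to W^{k,1}_q(m^{-1})} \le C_k\, e^{\bar\mu t} \tag{5.1}
\]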

with explicit bounds. Remark 5.3. Note that we do not claim that the resolvent Rα is sectorial for α < 1. Indeed it is likely that it is not (because of the contribution of the drift term). Moreover, it is not clear to us how to perform the spectral study in a Hilbert functional setting L 2 (m −1 ) with convenient weight function m. In particular, we are not able to prove Proposition 3.2 in an L 2 framework. In such a situation the spectral study and the obtaining of constructive rate of decay on the semigroup become tricky. Let us emphasize also that (as most of the results established in this paper) this result is not an easy consequence of perturbation theory of the unbounded operator since the elastic limit α → 1 is strongly ill-behaved (for instance neither the “relative bound” nor the “operator gap” of [21] go to 0) because of the anti-drift term.

5.1. Recalls and improvements of technical tools from [27]. Proposition 5.4. In the statement of Theorem 5.1 one can replace everywhere L 1 (m −1 ) by Wqk,1 (m −1 ), k, q ∈ N.



Let us first recall the key decomposition of Lˆ 1 in [27, Sect. 2] (re-written within the notation of this paper): Let 1 E denote the usual indicator function of the set E, let : R → R+ be an even ˜ : R N → R+ a radial C ∞ function with mass 1 and support included in [−1, 1] and C ∞ function with mass 1 and support included in B(0, 1). We define the following mollification functions ( > 0): ' (x) = −1 ( −1 x), (x ∈ R) ˜ (x) =

−N

˜ (

−1 x),

(x ∈ R N ).

Then we consider the decompositions L1 (g) = Lc1 (g) − Lν (g)

with Lν (g) := ν g, ν = L(G¯ 1 ),

where Lc1 splits between a “gain” part L+1 (denoted so because it corresponds to the linearization of Q + ) and a convolution part L∗ (not depending on α) as Lc1 (g) = L+1 (g) − L∗ (g) with L∗ (g) := G¯ 1 [g ∗ ] , (we do not write the subscript 1 when there is no dependency on α). Then for any δ ∈ (0, 1) we set + (|v − v∗ |) bδ (cos θ ) g (G¯ 1 )∗ + (G¯ 1 ) g∗ dv∗ dσ, L1,δ (g) = Iδ (v) R N ×S N −1

where ˜ δ ∗ 1{|·|≤δ −1 } , Iδ = and

bδ (z) = δ 2 ∗ 1{−1+2δ 2 ≤z≤1−2δ 2 } b(z).

This approximation induces $\mathcal{L}_{1,\delta} = \mathcal{L}^+_{1,\delta} - \mathcal{L}^* - \mathcal{L}^\nu$. Then the key result is that this approximation converges (in the norm of the graph) to the original linearized operator $\mathcal{L}_1$ as $\delta \to 0$, first in the small classical linearization space $L^2(\bar G_1^{-1})$ (this technical result was in fact mostly already included in Grad’s results [18]), and second, most importantly, in the larger space $L^1(m^{-1})$. On the basis of this approximation result the spectrum is then proved to be the same in both functional spaces, and then the norms of the resolvents within these two functional spaces are related by an explicit control.

Hence the key elements of the proof which are to be extended are, on the one hand, the approximation argument (which has to be extended from an $L^1(m^{-1})$ setting to a $W^{k,1}_q(m^{-1})$ setting), and, on the other hand, the explicit control on the resolvent in the space $L^2(\bar G_1^{-1})$ provided by the self-adjointness structure of the collision operator in this space and the explicit estimates on the spectral gap (see [6]), which has to be extended to an $H^k(\bar G_1^{-1})$ setting. Then the rest of the proof of [27] would extend as well (up to minor technical modifications) to $W^{k,p}(m^{-1})$. Therefore for the first point let us prove

Proposition 5.5. For any $k, q \in \mathbb{N}$ and $g \in W^{k,1}_{q+1}(m^{-1})$, we have

\[
\big\| \big( \mathcal{L}^+_1 - \mathcal{L}^+_{1,\delta} \big)(g) \big\|_{W^{k,1}_q(m^{-1})} \le \varepsilon(\delta)\, \|g\|_{W^{k,1}_{q+1}(m^{-1})},
\]

where $\varepsilon(\delta) > 0$ is an explicit constant going to 0 as $\delta$ goes to 0.



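Proof of Proposition 5.5. The case $k = q = 0$ is provided by Proposition 3.2. Then higher-order derivatives follow by differentiation, and the incorporation of a polynomial weight is trivial.

Concerning the second point let us prove

Proposition 5.6. The spectrum $\Sigma(\mathcal{L}_1)$ of $\mathcal{L}_1$ in $L^2(\bar G_1^{-1})$ is unchanged in any $H^k(\bar G_1^{-1})$, $k \in \mathbb{N}$. Moreover the control on the resolvent, which was (self-adjoint operator)

\[
\|\mathcal{R}_1(\xi)\|_{L^2(\bar G_1^{-1})} \le \frac{1}{\mathrm{dist}(\xi, \Sigma(\mathcal{L}_1))}
\]

in the space $L^2(\bar G_1^{-1})$, extends into

\[
\forall\, \xi \in A, \qquad \|\mathcal{R}_1(\xi)\|_{H^k(\bar G_1^{-1})} \le \frac{C_k}{\mathrm{dist}(\xi, \Sigma(\mathcal{L}_1))},
\]

with

\[
A = \Big\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Big[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Big] \ \text{and}\ \Re e\, \xi \le \frac{\lambda}{2} \Big\},
\]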

for any $k \in \mathbb{N}$ and some explicit constant $C_k > 0$.

Proof of Proposition 5.6. A quick way to prove the result is for instance the following. It is easy to prove by induction on $k \in \mathbb{N}$ the following estimate on the Dirichlet form:

\[
\sum_{|s| \le k} a_s\, \big\langle \nabla^s \mathcal{L}_1(g), \nabla^s g \big\rangle_{L^2(\bar G_1^{-1})} \le -\tau_k \Big( \sum_{|s| \le k} \big\| \bar\Pi(\nabla^s g) \big\|^2_{L^2(\bar G_1^{-1})} \Big)
\]

for some explicit $\tau_k > 0$ and $a_s > 0$, $|s| \le k$, and where $\bar\Pi$ denotes the orthogonal projection in $L^2(\bar G_1^{-1})$ onto the functions with zero mass, momentum and energy. Therefore we deduce on $\hat{\mathcal{L}}_1$ that its semigroup satisfies

\[
\forall\, k \in \mathbb{N}, \qquad \big\| e^{t \hat{\mathcal{L}}_1} \big\|_{H^k(\bar G_1^{-1})} \le C_k
\]

and that obviously the same is true on the stable subspace of functions with zero energy. Then by interpolation with the rate of decay of the semigroup for functions with zero energy in $L^2(\bar G_1^{-1})$, we deduce that

\[
\forall\, \varepsilon > 0,\ k \in \mathbb{N}, \qquad \big\| e^{t \hat{\mathcal{L}}_1}\, \Pi \big\|_{H^k(\bar G_1^{-1})} \le C_{\varepsilon,k}\, e^{-(\mu_2 - \varepsilon)\, t}
\]

for some explicit $C_{\varepsilon,k} > 0$, and where $\Pi$ is the orthogonal projection in $L^2(\bar G_1^{-1})$ onto functions with zero energy. This implies on the resolvent that for any $k \in \mathbb{N}$,

\[
\forall\, \xi \in A, \qquad \|\mathcal{R}_1(\xi)\|_{H^k(\bar G_1^{-1})} \le C_k,
\]

with

\[
A = \Big\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Big[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Big] \ \text{and}\ \Re e\, \xi \le \frac{\lambda}{2} \Big\},
\]

for some explicit $C_k > 0$. Then the result follows by straightforward interpolation with the estimates on the resolvent in $L^2(\bar G_1^{-1})$.
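The passage from a semigroup bound to a resolvent bound in the proof above rests on the standard Laplace-transform representation of the resolvent. The following is a generic sketch, not the paper's own computation: the space $X$, the generator $L$ and the constants $C$, $\omega$ are placeholders assumed here, and the formula only covers the part of the region $A$ lying in the half-plane $\Re e\,\xi > \omega$.

```latex
% Assumptions for this sketch (generic notation, not from the text):
%   L generates a semigroup e^{tL} on a Banach space X with
%   \|e^{tL}\|_{X \to X} \le C e^{\omega t} for all t \ge 0.
% Convention of this section: \mathcal{R}(\xi) = (L - \xi)^{-1}.
\begin{align*}
\mathcal{R}(\xi) &= -\int_0^{\infty} e^{-\xi t}\, e^{tL}\, dt,
   \qquad \Re e\,\xi > \omega, \\
\|\mathcal{R}(\xi)\|_{X \to X}
   &\le \int_0^{\infty} e^{-(\Re e\,\xi)\, t}\, C\, e^{\omega t}\, dt
    = \frac{C}{\Re e\,\xi - \omega}.
\end{align*}
```

The minus sign comes from the convention $(L - \xi)^{-1} = -(\xi - L)^{-1}$; with the opposite convention the integral appears without it.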



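Then we can conclude with the following extension of point (ii) of Theorem 5.1:

Proposition 5.7. We have

\[
\forall\, \xi \in A, \qquad \|\mathcal{R}_1(\xi)\|_{W^{k,1}_q(m^{-1})} \le a_{k,q} + \frac{b_{k,q}}{|\xi + \lambda|},
\]

with

\[
A = \Big\{ \xi \in \mathbb{C},\ \arg(\xi + \lambda) \in \Big[ -\frac{3\pi}{4}, \frac{3\pi}{4} \Big] \ \text{and}\ \Re e\, \xi \le \frac{\lambda}{2} \Big\}
\]

for any $k, q \in \mathbb{N}$ and some explicit constants $a_{k,q}, b_{k,q} > 0$.

5.2. Decomposition of $\hat{\mathcal{L}}_\alpha$ and technical estimates.

We fix once for all some $\bar\mu \in (\mu_2, 0)$ and we split the proof of Theorem 5.2 into four steps, detailed in the following four subsections. Let us introduce the operator

\[
P_\alpha = \mathcal{L}_1 - \mathcal{L}_\alpha = \mathcal{L}^+_1 - \mathcal{L}^+_\alpha + \tau_\alpha\, \nabla_v \cdot (v\, \cdot).
\]

Our first step in this subsection is to estimate the convergence to 0 of the first part of this operator in a suitable norm. Namely we prove

Lemma 5.8.
(i) For any $k, q \in \mathbb{N}$, there exists $C = C_{k,q,m}$ such that

\[
\|\mathcal{L}^+_\alpha(g)\|_{W^{k,1}_q(m^{-1})} \le C\, \|g\|_{W^{k,1}_{q+1}(m^{-1})}, \qquad
\|\mathcal{L}_\alpha(g)\|_{W^{k,1}_q(m^{-1})} \le C\, \|g\|_{W^{k+1,1}_{q+1}(m^{-1})}.
\]

(ii) For any $k, q \in \mathbb{N}$, there is a constructive function $\varepsilon : (0, \infty) \to (0, \infty)$ satisfying $\varepsilon(\alpha) \to 0$ as $\alpha$ goes to 1 and such that for any $g \in W^{k,1}_{q+1}(m^{-1})$,

\[
\big\| \big( \mathcal{L}^+_\alpha - \mathcal{L}^+_1 \big)(g) \big\|_{W^{k,1}_q(m^{-1})} \le \varepsilon(\alpha)\, \|g\|_{W^{k,1}_{q+1}(m^{-1})}.
\]

(iii) There exists $C \in (0, \infty)$ such that for any $g \in W^{3,1}_3(m^{-1})$, we have

\[
\|(\mathcal{L}_1 - \mathcal{L}_\alpha)(g)\|_{L^1(m^{-1})} \le C\, (1-\alpha)\, \|g\|_{W^{3,1}_3(m^{-1})}.
\]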

Proof of Lemma 5.8. The case $k = q = 0$ is proved in Proposition 3.2. Then higher-order derivatives are obtained from the $L^1(m^{-1})$ estimates by straightforward differentiation, and the incorporation of polynomial weights is trivial.

Now let us consider some $\xi \in \mathbb{C}$ and let us define

\[
A_\delta = \mathcal{L}^+_{1,\delta} - \mathcal{L}^* \qquad \text{and} \qquad
B_{\alpha,\delta}(\xi) = \nu + \xi + \mathcal{L}^+_{1,\delta} - \mathcal{L}^+_1 + P_\alpha.
\]

(Recall that the approximation $\mathcal{L}^+_{1,\delta}$ is defined in the beginning of Subsect. 5.1.) It yields the decomposition $\mathcal{L}_\alpha - \xi = A_\delta - B_{\alpha,\delta}(\xi)$. Then we have


Lemma 5.9. Let us consider any $k, q \in \mathbb{N}$ and $\xi$ such that $\Re e\, \xi \ge -\min \nu$. Then

(i) For any $\delta > 0$, the operator $A_\delta : L^1 \to W^{\infty,1}_\infty(m^{-1})$ is a bounded linear operator (more precisely it maps functions of $L^1$ into $C^\infty$ functions with compact support).

(ii) There are some constructive $\delta^* > 0$ and $\alpha_2 \in (\alpha_1, 1)$ (depending on a lower bound on $\mathrm{dist}(\xi, \nu(\mathbb{R}^N))$) such that for $\delta \in [0, \delta^*]$ and $\alpha \in [\alpha_2, 1]$, the operator

\[
B_{\alpha,\delta} : W^{k+1,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1})
\]

is invertible.

(iii) The inverse operator $B_{\alpha,\delta}(\xi)^{-1}$ satisfies for $\delta \in [0, \delta^*]$ and $\alpha \in [\alpha_3, 1]$:

\[
\big\| B_{\alpha,\delta}(\xi)^{-1} \big\|_{W^{k,1}_q(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{C_1}{\mathrm{dist}(\Re e\, \xi, \nu(\mathbb{R}^N))}
\]

and

\[
\big\| B_{\alpha,\delta}(\xi)^{-1} \big\|_{W^{k+1,1}_q(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{C_2}{\mathrm{dist}(\xi, \nu(\mathbb{R}^N))}
\]

for some explicit constants $C_1, C_2 > 0$ depending on $k$, $q$, $\delta^*$, $\alpha_2$ and a lower bound on $\mathrm{dist}(\Re e\, \xi, \nu(\mathbb{R}^N))$.

Proof of Lemma 5.9. For $\xi \in \nu(\mathbb{R}^N)^c$, the convergence to 0 of $\mathcal{L}^+_{1,\delta} - \mathcal{L}^+_1$ as $\delta \to 0$ was proved in [27, Prop. 4.1, Theorem 4.2] (this was done in $L^1_1(m^{-1}) \to L^1(m^{-1})$ in [27] and is extended to any $W^{k,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1})$ by Proposition 5.5). We deduce as in [27] that for $\delta$ small enough (depending on a lower bound on the coercivity norm of $\nu + \xi$, that is on a lower bound on $\mathrm{dist}(\xi, \nu(\mathbb{R}^N))$), we have

+ L − L+ g k,1 −1 ≤ 1 (ν + ξ ) g k,1 . 1,δ 1 Wq (m ) Wq+1 2 It was also proved that Aδ maps functions of L 1 into C ∞ functions with compact support (with explicit estimates). Let us now consider Bα,δ (ξ ) only in the case k = q = 0 (estimates for higher-order derivatives and weights are obtained by straightforward differentiation and computations). From Lemma 5.8 we have for α close enough to 1 (depending on a lower bound on dist(ξ, ν(R N ))), + L − L+ g 1

α

L 1 (m −1 )

≤

1 (ν + ξ ) g L 1 (m −1 ) . 2

By considering the semigroup on L 1 (m −1 ) of Bα,δ (ξ ) and computing the evolution of the norm in symmetric form using the formula for the differentiation of the complex modulus of a function ∇|h| =

∇h h¯ + h ∇ h¯ , 2 |h|

it is easily seen that B (ξ ) t 1 e α,δ g L 1 (m −1 ) ≥ (ν + e ξ ) g L 1 (m −1 ) − (ν + eξ ) g L 1 (m −1 ) , 2



and therefore for α close enough to 1 (depending on a lower bound on dist(ξ, ν(R N ))), we deduce that B (ξ ) t 1 e α,δ g L 1 (m −1 ) ≥ (ν + e ξ ) g L 1 (m −1 ) , 2 and thus that the operator is invertible with its inverse bounded by 2 Bα,δ (ξ )−1 1 −1 ≤ . L (m ) dist(e ξ, ν(R N )) Moreover by computing separately the evolution of the L 1 (m −1 ) norm in non-symmetric form (thus keeping ν + ξ but creating a term of the form O(1 − α) times a W11,1 (m −1 ) norm) and the evolution of the W11,1 (m −1 ) norm in symmetric form: it yields easily B (ξ ) t 1 1 e α,δ g W 1,1 (m −1 ) ≥ (ν + ξ ) g L 1 (m −1 ) + (ν + ξ ) ∇v g L 1 (m −1 ) , 2 2 which implies the result, by dropping the second term. 5.3. Geometry of the essential spectrum and estimates on the eigenvalues. First concerning the geometry of the spectrum, following the same strategy as in [27, Subsect. 3.2] we can prove Proposition 5.10. Let us pick any k, q ∈ N and m a smooth exponential weight function (as defined in (1.30)). Then for any α ∈ [α2 , 1] the spectrum of Lα in Wqk,1 (m −1 ) is composed of a part included in cµ∞ containing all possible essential spectrum, and a remaining part included in µ∞ exclusively composed of discrete eigenvalues. Proof of Proposition 5.10. We follow the same method as in the proof of [27, Prop. 3.4]. One uses the decomposition Lα = Aδ − Bα,δ (0), the compactness of the first part Aδ and the coercivity Bα,δ (0) L 1 (m −1 ) ≥ ν g L 1 (m −1 ) − ε(δ) ν g L 1 (m −1 ) of the second part (where (δ) → 0 as δ → 0). Then one applies Weyl’s theorem and show that (for any δ > 0) µ∞ + (δ) has to be a Fredhom set with indices (0, 0) (except possibly for a countable family of points) since [a, +∞) is included in the resolvent set for a big enough. Second concerning the discrete part of the spectrum, that is the isolated eigenvalues with finite multiplicity, following the same strategy as in [27, Proof of Prop. 3.5] we can prove Proposition 5.11. Let us fix µ¯ ∈ (µ2 , 0). 
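Then for any $\alpha \in [\alpha_2, 1]$ (where $\alpha_2$ is obtained from Lemma 5.9 for this choice of $\bar\mu$), for any $\mu \in \Delta_{\bar\mu}$ and $\phi \in W^{1,1}_1$ satisfying $\mathcal{L}_\alpha(\phi) = \mu\, \phi$ in $L^1$, we have

\[
\|\phi\|_{W^{k,1}(m^{-1})} \le C_{k,m}\, \|\phi\|_{L^1_2}
\]

for any $k \in \mathbb{N}$ and $m = \exp(-a\, |v|^s)$, $a > 0$, $s \in (0, 1)$, where the constant $C_{k,m}$ depends on $k$, $m$ and a lower bound on $\bar\mu - \mu$.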



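Proof of Proposition 5.11. Let us sketch the idea of the proof. We use the decomposition

\[
0 = \mathcal{L}_\alpha \phi - \mu\, \phi = A_\delta \phi - B_{\alpha,\delta}(\mu) \phi
\]

and the fact that, for the choices made for $\mu$ and $\alpha$ in the assumptions (adjusting $\delta$ as in Lemma 5.9), $B_{\alpha,\delta}(\mu)$ is invertible in any $W^{k,1}(m^{-1})$ with explicit bound, and $A_\delta$ maps $L^1$ into $C^\infty$ functions with compact support.

Remark 5.12. An alternative proof could be to adapt the proof of Proposition 2.7.

5.4. Estimate on the resolvent and global stability of the spectrum.

Lemma 5.13. Let us pick $k, q \in \mathbb{N}$ and $m$ a smooth exponential weight function (as defined in (1.30)) and consider the operator $\mathcal{L}_\alpha$ in $W^{k,1}_q(m^{-1})$. Then

(i) For any $\xi \in R(\mathcal{L}_1)$, there is $\alpha_\xi \in [\alpha_2, 1)$ such that $\xi \in R(\mathcal{L}_\alpha)$ for any $\alpha \in [\alpha_\xi, 1]$.

(ii) More precisely, the resolvent $\mathcal{R}_\alpha(\xi)$ satisfies the following two estimates for $\alpha \in [\alpha_2, 1)$:

\[
\|\mathcal{R}_\alpha(\xi)\|_{W^{k,1}_q(m^{-1})} \le \frac{C_1 + C_2\, \|\mathcal{R}_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}}{1 - C_3\, (1-\alpha)\, \|\mathcal{R}_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}},
\]

\[
\|\mathcal{R}_\alpha(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1}) \to W^{k,1}_q(m^{-1})} \le \frac{1}{\delta(\xi)}\, \frac{C_1' + C_2'\, \|\mathcal{R}_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}}{1 - C_3'\, (1-\alpha)\, \|\mathcal{R}_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})}},
\]

with $\delta(\xi) := \mathrm{dist}(\xi, \nu(\mathbb{R}^N))$ and where the constants $C_i$, $C_i'$, $i = 1, 2, 3$ depend on a positive lower bound on $\mathrm{dist}(\Re e\, \xi, \nu(\mathbb{R}^N))$.

(iii) Finally, for any compact set $K \subset \rho(\mathcal{L}_1)$ there exists $\alpha_K \in [\alpha_2, 1)$, $C_K \in (0, \infty)$ such that

\[
\forall\, \xi \in K,\ \alpha \in (\alpha_K, 1], \qquad \|\mathcal{R}_\alpha(\xi)\|_{W^{k,1}_q(m^{-1})} \le C_K,
\]

\[
\forall\, \xi \in K,\ \alpha, \alpha' \in (\alpha_K, 1], \qquad \|\mathcal{R}_\alpha(\xi)\, h - \mathcal{R}_{\alpha'}(\xi)\, h\|_{\mathbb{L}^1(m^{-1})} \le C_K\, (1-\alpha)\, \|h\|_{W^{3,1}_3}.
\]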

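Proof of Lemma 5.13. We split the proof into three steps.

Step 1. Let us consider the following operator defined from $W^{k,1}_q(m^{-1})$ to $W^{k+1,1}_{q+1}(m^{-1})$ (which is seen to be well-defined at a glance)

\[
I_{\alpha,\delta}(\xi) := -B_{\alpha,\delta}(\xi)^{-1} + \mathcal{R}_1(\xi)\, A_\delta\, B_{\alpha,\delta}(\xi)^{-1}.
\]

Some straightforward computations show that

\[
(\mathcal{L}_\alpha - \xi)\, I_{\alpha,\delta}(\xi) = -A_\delta\, B_{\alpha,\delta}(\xi)^{-1} + \mathrm{Id} + [\mathrm{Id} - P_\alpha\, \mathcal{R}_1(\xi)]\, A_\delta\, B_{\alpha,\delta}(\xi)^{-1}
\]

which simplifies into

\[
(\mathcal{L}_\alpha - \xi)\, I_{\alpha,\delta}(\xi) =: J_{\alpha,\delta}(\xi) := \mathrm{Id} - P_\alpha\, \mathcal{R}_1(\xi)\, A_\delta\, B_{\alpha,\delta}(\xi)^{-1} =: \mathrm{Id} - K_{\alpha,\delta}(\xi).
\]

First using that

\[
\|P_\alpha h\|_{W^{k,1}_q(m^{-1})} \le C\, (1-\alpha)\, \|h\|_{W^{k+1,1}_{q+1}(m^{-1})},
\]

the control of $\mathcal{R}_1(\xi)$ in $W^{k+1,1}_{q+1}(m^{-1})$ and the regularization property of $A_\delta$ we deduce that

\[
K_{\alpha,\delta}(\xi) = P_\alpha\, \mathcal{R}_1(\xi)\, A_\delta\, B_{\alpha,\delta}(\xi)^{-1} = O(1-\alpha)
\]

in the norm of bounded operators on $W^{k,1}_q(m^{-1})$, and therefore for $(1-\alpha)$ small enough (with explicit bound) we get that

\[
\|K_{\alpha,\delta}(\xi)\|_{W^{k,1}_q(m^{-1})} \le C_3\, (1-\alpha)\, \|\mathcal{R}_1(\xi)\|_{W^{k+1,1}_{q+1}(m^{-1})} < 1
\]

and $\mathrm{Id} - K_{\alpha,\delta}(\xi)$ is invertible in $W^{k,1}_q(m^{-1})$. As a consequence

\[
(\mathcal{L}_\alpha - \xi)\, I_{\alpha,\delta}(\xi)\, (\mathrm{Id} - K_{\alpha,\delta}(\xi))^{-1} = \mathrm{Id}_{W^{k,1}_q(m^{-1})}
\]

and we have proved that $\mathcal{L}_\alpha - \xi$ admits a right-inverse, namely $I_{\alpha,\delta}(\xi)\, (\mathrm{Id} - K_{\alpha,\delta}(\xi))^{-1}$. This proves that the operator $\mathcal{L}_\alpha - \xi$ is onto.

Step 2. In order to show that $\mathcal{L}_\alpha - \xi$ is invertible and that we have identified the resolvent it remains to prove that it is one-to-one. Let us consider the eigenvalue equation $(\mathcal{L}_\alpha - \xi)\, h = 0$ which is written $(\mathcal{L}_1 - \xi)\, h = P_\alpha h$, from which we deduce (using Proposition 5.11 to get regularity bounds on $h$)

\[
\begin{aligned}
\|h\|_{W^{k,1}_q(m^{-1})} &\le \|\mathcal{R}_1(\xi)\|_{W^{k,1}_q(m^{-1})}\, \|P_\alpha h\|_{W^{k,1}_q(m^{-1})} \\
&\le C\, (1-\alpha)\, \|\mathcal{R}_1(\xi)\|_{W^{k,1}_q(m^{-1})}\, \|h\|_{W^{k+1,1}_{q+1}(m^{-1})} \\
&\le C\, (1-\alpha)\, \|\mathcal{R}_1(\xi)\|_{W^{k,1}_q(m^{-1})}\, \|h\|_{W^{k,1}_q(m^{-1})}.
\end{aligned}
\]

Therefore for $(1-\alpha)$ small enough (depending on the norm of $\mathcal{R}_1(\xi)$) we have that necessarily $h = 0$, and thus the operator $(\mathcal{L}_\alpha - \xi)$ is one-to-one. For $\alpha$ satisfying all the previous conditions, the operator $(\mathcal{L}_\alpha - \xi)$ is bijective from $W^{k+1,1}_{q+1}(m^{-1})$ to $W^{k,1}_q(m^{-1})$ and its inverse is given by

\[
\mathcal{R}_\alpha(\xi) = I_{\alpha,\delta}(\xi)\, J_{\alpha,\delta}(\xi)^{-1}
\]

from which we get the desired bound on the resolvent thanks to the study of $B_{\alpha,\delta}(\xi)^{-1}$ in Lemma 5.9. At this point we have proved points (i), (ii) and the first estimate in (iii).

Step 3. The second estimate in point (iii) is obtained from the resolvent identity

\[
\mathcal{R}_\alpha(\xi) - \mathcal{R}_1(\xi) = \mathcal{R}_\alpha(\xi)\, [\mathcal{L}_1 - \mathcal{L}_\alpha]\, \mathcal{R}_1(\xi),
\]

together with the previous estimates on the resolvent and point (iii) in Lemma 5.8.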
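The algebra behind Step 1 of the proof of Lemma 5.13 can be checked line by line. The display below is a formal sketch only: domains and norms are ignored, and the abbreviations $A$, $B$, $I$, $K$, $\mathcal{R}_1$ (with the dependence on $\alpha$, $\delta$, $\xi$ suppressed) are shorthand assumed here. It uses $\mathcal{L}_\alpha - \xi = A - B$ and $\mathcal{L}_\alpha = \mathcal{L}_1 - P_\alpha$ from Subsect. 5.2, and ends with the Neumann series that inverts $\mathrm{Id} - K$.

```latex
% Shorthand assumed for this sketch:
%   A = A_\delta,  B = B_{\alpha,\delta}(\xi),  \mathcal{R}_1 = \mathcal{R}_1(\xi) = (\mathcal{L}_1 - \xi)^{-1},
%   I = -B^{-1} + \mathcal{R}_1 A B^{-1},  K = P_\alpha \mathcal{R}_1 A B^{-1}.
\begin{align*}
(\mathcal{L}_\alpha - \xi)\,(-B^{-1})
   &= -(A - B)\,B^{-1} = -A\,B^{-1} + \mathrm{Id}, \\
(\mathcal{L}_\alpha - \xi)\,\mathcal{R}_1\,A\,B^{-1}
   &= (\mathcal{L}_1 - \xi - P_\alpha)\,\mathcal{R}_1\,A\,B^{-1}
    = (\mathrm{Id} - P_\alpha \mathcal{R}_1)\,A\,B^{-1}, \\
(\mathcal{L}_\alpha - \xi)\,I
   &= \mathrm{Id} - P_\alpha \mathcal{R}_1 A B^{-1} = \mathrm{Id} - K, \\
(\mathrm{Id} - K)^{-1}
   &= \sum_{n \ge 0} K^n,
   \qquad \|(\mathrm{Id} - K)^{-1}\| \le \frac{1}{1 - \|K\|}
   \quad \text{whenever } \|K\| < 1.
\end{align*}
```

With $\|K\| \le C_3 (1-\alpha) \|\mathcal{R}_1(\xi)\|$, the last bound is what produces the denominator $1 - C_3 (1-\alpha) \|\mathcal{R}_1(\xi)\|$ in the estimates of point (ii).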

Note that this lemma proves point (ii) in Theorem 5.2. Moreover, as a consequence of this estimate on the resolvent $\mathcal{R}_\alpha(\xi)$, we may go one step further in the localization of the spectrum of $\hat{\mathcal{L}}_\alpha$ around 0.

Corollary 5.14. Let us fix $\bar\mu \in (\mu_2, 0)$. In any $W^{k,1}_q(m^{-1})$ there is some constant $C \in (0, \infty)$ such that

\[
\forall\, \alpha \in [\alpha_2, 1], \qquad \Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\mu} \subset B(0, C\, (1-\alpha)).
\]



Proof of Corollary 5.14. The proof follows from the estimates in point (ii) of Lemma 5.13, together with the fact that (Proposition 4.1 of [27] in $\mathbb{L}^1(m^{-1})$ extended to $W^{k,1}_q(m^{-1})$ by the previous discussion):

\[
\forall\, \xi \in \Delta_{\bar\mu}, \qquad \|\mathcal{R}_1(\xi)\|_{W^{k,1}_q(m^{-1})} \le a + \frac{b}{|\xi|}
\]

for some explicit constants $a, b > 0$. We thus get that $\|\mathcal{R}_\alpha(\xi)\|_{W^{k,1}_q(m^{-1})} < \infty$ if $\xi \in \Delta_{\bar\mu}$ and $|\xi| \ge C\, (1-\alpha)$, which concludes the proof.

5.5. Fine study of spectrum close to 0.

Let us fix $r \in (0, |\bar\mu|]$ and let us choose any $\alpha_r \in [\alpha_2, 1)$ such that $C\, (1-\alpha_r) < r$ (with the notations of Corollary 5.14) in such a way that $\Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\lambda} \subset B(0, r)$ for any $\alpha \in [\alpha_r, 1]$. We may then define the spectral projection operator (see [21])

\[
\Pi_\alpha := -\frac{1}{2 \pi i} \int_{S(0,r)} \mathcal{R}_\alpha(\zeta)\, d\zeta \tag{5.2}
\]

in any $W^{k,1}_q(m^{-1})$, with $S(0, r) := \{ \xi \in \mathbb{C},\ |\xi| = r \}$. The operator $\Pi_\alpha$ is the projection operator on the sum of eigenspaces associated to eigenvalues lying in the half plane $\{ \xi \in \mathbb{C},\ \Re e\, \xi \ge -r \}$, see [21]. In particular the operator $\Pi_1$ is the projection on the energy eigenline $\mathbb{R}\, \phi_1$, where we recall that $\phi_1$ is the energy eigenfunction defined by (1.35).

Lemma 5.15. The operator $\Pi_\alpha$ satisfies

(i) For any $k \in \mathbb{N}$ and any exponential weight function $m$ (as defined in (1.30)), it is well-defined and bounded in $W^{k,1}_q(m^{-1})$.

(ii) Moreover there is a constant $C > 0$ (depending on $m$) such that

\[
\forall\, \alpha, \alpha' \in [\alpha_r, 1], \qquad \|\Pi_{\alpha'} - \Pi_\alpha\|_{W^{3,1}_3(m^{-1}) \to L^1(m^{-1})} \le C\, |\alpha' - \alpha|. \tag{5.3}
\]

Proof of Lemma 5.15. It is a straightforward consequence of (5.2) and Lemma 5.13.

Corollary 5.16. There exists $\alpha_3 \in [\alpha_2, 1)$ such that for any $\alpha \in [\alpha_3, 1)$ there holds $\Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\mu} = \{\mu_\alpha\}$ and the eigenspace associated to $\mu_\alpha \in \mathbb{R}$ is 1-dimensional. This eigenvalue is called the energy eigenvalue. We may furthermore remark that Corollary 5.14 implies

\[
\forall\, \alpha \in [\alpha_3, 1), \qquad |\mu_\alpha| \le C\, (1-\alpha). \tag{5.4}
\]

Proof of Corollary 5.16. We already know that $\Sigma(\hat{\mathcal{L}}_\alpha) \cap \Delta_{\bar\mu}$ is entirely composed of discrete spectrum. Therefore we have to prove that it is of dimension 1. Indeed once this is proved, the fact that $\mu_\alpha \in \mathbb{R}$ is trivial since the operator is real, and the control (5.4) is trivial from Corollary 5.14. Let us define the space $X_\alpha := \Pi_\alpha(L^1(m^{-1})) + \Pi_1(L^1(m^{-1}))$ endowed with the norm $\| \cdot \|_{L^1(m^{-1})}$. From Proposition 5.11, there exists a constant $C_1 > 0$ such that

\[
\forall\, \psi \in X_\alpha, \qquad \|\psi\|_{W^{3,1}_3(m^{-1})} \le C_1\, \|\psi\|_{L^1(m^{-1})}.
\]

Thanks to the definition of $\Pi_\alpha$ and $\Pi_1$ and to Lemma 5.15, we then get

\[
\begin{aligned}
\|\Pi_\alpha - \Pi_1\|_{X_\alpha \to X_\alpha}
&\le C_2 \sup_{\psi \in X_\alpha}\ \sup_{z \in S(0,r)} \frac{\|(\mathcal{R}_\alpha(z) - \mathcal{R}_1(z))\, \psi\|_{L^1(m^{-1})}}{\|\psi\|_{L^1(m^{-1})}} \\
&\le C_2\, (1-\alpha) \sup_{\psi \in X_\alpha} \frac{\|\psi\|_{W^{3,1}_3(m^{-1})}}{\|\psi\|_{L^1(m^{-1})}} \\
&\le C_2\, (1-\alpha) < 1,
\end{aligned}
\]

for $(1-\alpha)$ small enough. By classical operator theory (see for instance the arguments presented in [21, Chap. 1, paragraph 4.6] in order to prove [21, Lemma 4.10]) one deduces that $\mathrm{dimension}(\Pi_\alpha) = \mathrm{dimension}(\Pi_1)$. Since $\mathrm{dimension}(\Pi_1) = 1$ (as recalled in Theorem 5.1), this concludes the proof.

Let us introduce for any $\psi \in L^1$ the decomposition

\[
\psi = \Pi_1 \psi + \Pi_1^\perp \psi = (\pi_1 \psi)\, \phi_1 + \Pi_1^\perp \psi,
\]

where $\pi_1 \psi \in \mathbb{R}$ is the coordinate of $\Pi_1 \psi$ on $\mathbb{R}\, \phi_1$ (defined thanks to the projection $\Pi_1$). For any $\alpha \in [\alpha_3, 1)$ we denote by $\phi_\alpha$ the unique eigenfunction associated to $\mu_\alpha$ such that $\|\phi_\alpha\|_{L^1_2} = 1$ and $\pi_1 \phi_\alpha \ge 0$. We can now establish a first order approximation of the eigenfunction $\phi_\alpha$.

Lemma 5.17. For any $k, q \in \mathbb{N}$ and any exponential weight function $m$ (as defined in (1.30)), there exists $C$ such that

\[
\forall\, \alpha \in [\alpha_3, 1], \qquad \|\phi_\alpha - \phi_1\|_{W^{k,1}_q(m^{-1})} \le C\, (1-\alpha). \tag{5.5}
\]

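Remark 5.18. We immediately deduce from Lemma 5.17 that $\phi_\alpha(0) < 0$ for $\alpha$ close enough to 1, and therefore, we get that this definition of $\phi_\alpha$ coincides with the definition in Theorem 1.1.

Proof of Lemma 5.17. On the one hand, from the normalization conditions, we have

\[
\|\phi_1 - \Pi_1 \phi_\alpha\|_{L^1_2} = |1 - \pi_1 \phi_\alpha| = \Big| \|\phi_\alpha\|_{L^1_2} - \|\Pi_1 \phi_\alpha\|_{L^1_2} \Big| \le \|\phi_\alpha - \Pi_1 \phi_\alpha\|_{L^1_2} = \|\Pi_1^\perp \phi_\alpha\|_{L^1_2}.
\]

We then deduce

\[
\|\phi_1 - \phi_\alpha\|_{L^1_2} \le \|\Pi_1 \phi_\alpha - \phi_\alpha\|_{L^1_2} + \|\Pi_1^\perp \phi_\alpha\|_{L^1_2} \le 2\, \|\Pi_1^\perp \phi_\alpha\|_{L^1_2}. \tag{5.6}
\]

On the other hand, since $\hat{\mathcal{L}}_\alpha(\phi_\alpha) = \mu_\alpha\, \phi_\alpha$, the eigenfunction $\phi_\alpha$ satisfies

\[
\hat{\mathcal{L}}_1(\phi_\alpha) = [\hat{\mathcal{L}}_1(\phi_\alpha) - \hat{\mathcal{L}}_\alpha(\phi_\alpha)] + \mu_\alpha\, \phi_\alpha.
\]

Recall that from Proposition 5.11 one has uniform bounds in $W^{\infty,1}_\infty(m^{-1})$ on $\phi_\alpha$ in terms of its $L^1_2$ norm which has been fixed to 1, so that for any $\alpha \in [\alpha_3, 1]$, $\|\phi_\alpha\|_{W^{k,1}_q(m^{-1})} \le C$. Using Proposition 3.1 and Proposition 5.11 we get

\[
\|\hat{\mathcal{L}}_1 \phi_\alpha\|_{L^1(m^{-1})} = O(1-\alpha).
\]

Using that $\hat{\mathcal{L}}_1$ is invertible from $\Pi_1^\perp \mathbb{L}^1_1(m^{-1})$ to $\mathbb{L}^1(m^{-1})$ we deduce that

\[
\|\Pi_1^\perp \phi_\alpha\|_{L^1(m^{-1})} = O(1-\alpha). \tag{5.7}
\]

We conclude that (5.5) holds for the $L^1_2$ norm by gathering (5.6) and (5.7):

\[
\forall\, \alpha \in [\alpha_3, 1], \qquad \|\phi_\alpha - \phi_1\|_{L^1_2} \le C\, (1-\alpha).
\]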

Let us now consider the eigenfunctions α associated to µα for α ∈ [α3 , 1] such that π1 α > 0 with the normalization condition α W k,1 (m −1 ) = 1. Proceeding similarly as before (by working in the space W k,1 (m −1 )), we can get α − 1 W k,1 (m −1 ) = O(1 − α). Because the eigenspace associated to µα is of dimension 1, we have α = cα φα for some constant cα ∈ (0, ∞). Then |c1 − cα | = cα φα − c1 φα L 1 ≤ α − 1 L 1 + |c1 | φ1 − φα L 1 = O(1 − α). 2

2

We then easily conclude that (5.5) holds for any

W k,1 (m −1 )

2

norm.

We now use the linearized energy dissipation equation to get a second order expansion of the eigenvalue.

Lemma 5.19. For $\alpha \in [\alpha_3, 1]$, the eigenvalues $\mu_\alpha$ satisfy (with explicit bound)

\[
\mu_\alpha = -\rho\, (1-\alpha) + O\big( (1-\alpha)^2 \big).
\]

Proof of Lemma 5.19. By integrating the eigenvalue equation $\hat{\mathcal{L}}_\alpha \phi_\alpha = \mu_\alpha\, \phi_\alpha$ against $|v|^2$ and dividing it by $(1-\alpha)$, we get

\[
\frac{\mu_\alpha}{1-\alpha}\, E(\phi_\alpha) = 2\, \rho\, E(\phi_\alpha) - 2\, (1+\alpha)\, \tilde D(\bar G_\alpha, \phi_\alpha).
\]

Using the rate of convergence of $\bar G_\alpha \to \bar G_1$ and $\phi_\alpha \to \phi_1$ established in Lemma 4.4 and Lemma 5.17 we deduce that

\[
\frac{\mu_\alpha}{1-\alpha}\, E(\phi_1) = 2\, \rho\, E(\phi_1) - 4\, \tilde D(\bar G_1, \phi_1) + O(1-\alpha). \tag{5.8}
\]

Then we compute thanks to (A.1) and (A.2),

\[
E(\phi_1) = 2\, N\, c_0\, \rho\, \bar\theta_1^2, \tag{5.9}
\]

where $c_0$ is still the normalizing constant in (1.35) such that $\|\phi_1\|_{L^1_2} = 1$. Similarly, using (A.3), (A.4) and the relation (1.29) which links $b_1$ and $\bar\theta_1$, we find

\[
\tilde D(\bar G_1, \phi_1) = \frac32\, N\, c_0\, \rho^2\, \bar\theta_1^2. \tag{5.10}
\]

We conclude by gathering (5.8), (5.9) and (5.10).
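The final step of the proof of Lemma 5.19 is a one-line computation, written out here as a sketch. It assumes that (5.10) carries the factor $\tfrac32$ and that $E(\phi_1) \neq 0$ (that is, $c_0$, $\rho$ and $\bar\theta_1$ are nonzero), so that (5.8) may be divided by $E(\phi_1)$.

```latex
\begin{align*}
\frac{\mu_\alpha}{1-\alpha}
  &= 2\rho - 4\,\frac{\tilde D(\bar G_1, \phi_1)}{E(\phi_1)} + O(1-\alpha)
     && \text{(dividing (5.8) by } E(\phi_1)\text{)} \\
  &= 2\rho - 4\,\frac{\tfrac32\, N c_0\, \rho^2\, \bar\theta_1^2}{2\, N c_0\, \rho\, \bar\theta_1^2} + O(1-\alpha)
     && \text{(by (5.9) and (5.10))} \\
  &= 2\rho - 3\rho + O(1-\alpha) = -\rho + O(1-\alpha),
\end{align*}
\[
\text{hence}\qquad \mu_\alpha = -\rho\,(1-\alpha) + O\big((1-\alpha)^2\big).
\]
```

The common factor $N c_0 \rho\, \bar\theta_1^2$ cancels in the ratio, which is why the leading coefficient of $\mu_\alpha$ depends only on $\rho$.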



5.6. The map $\alpha \mapsto \bar G_\alpha$ is $C^1$.

The fact that the path of self-similar profiles $\alpha \mapsto \bar G_\alpha$ is $C^0$ on $[\alpha_3, 1]$ and $C^1$ at $\alpha = 1$ was already proved in Lemma 4.4. Therefore we have to prove that it is $C^1$ for $\alpha \in [\alpha_3, 1)$. Let us define the functional

\[
(\alpha, g) \mapsto \Phi(\alpha, g) := Q_\alpha(g, g) - \tau_\alpha\, \nabla_v \cdot (v\, g).
\]

The map $\Phi$ is $C^1$ from $\mathbb{R} \times (W^{1,1}_1(m^{-1}) \cap \mathcal{C}_{\rho,0})$ into $\mathbb{L}^1(m^{-1})$ and it is such that for any $\alpha \in [\alpha_1, 1)$, the equation $\Phi(\alpha, g) = 0$ has only one solution which is the profile $\bar G_\alpha$. Moreover, for any $\alpha \in [\alpha_3, 1)$, the linearized operator $D_2 \Phi(\alpha, \bar G_\alpha) = \mathcal{L}_\alpha$ is invertible from $W^{1,1}(m^{-1})$ into $\mathbb{L}^1(m^{-1})$ because of the spectral properties of $\mathcal{L}_\alpha$ established in Theorem 5.2 (i) & (ii) (note that here there is no eigenvalue approaching 0 at $\alpha$). Then using the same strategy as in Subsect. 4.2 based on the implicit function theorem we easily conclude that $\alpha \mapsto \bar G_\alpha$ is $C^1$ from $[\alpha_3, 1)$ into $L^1(m^{-1})$. That ends the proof of Theorem 1.1 (ii).

5.7. Decay estimate on the semigroup.

We start with a lemma dealing with semigroups in Banach spaces. This result is a tool for deriving constructive decay rates on non-sectorial semigroups, assuming some precise estimates on the resolvent of their generator. We do not try to prove such a decay rate for the semigroup in the norm of the ambient Banach space but instead in a weaker norm (corresponding to the norm of the graph of some power of its generator), which shall be sufficient for our study of the linearized stability of (1.31).

Lemma 5.20. Let $A$ be a closed unbounded operator on a Banach space $E$ with dense domain $\mathrm{dom}(A)$. We denote by $S(t)$ the associated semigroup, by $R(A)$ the associated resolvent set and by $\mathcal{R} = \mathcal{R}(\xi)$ the resolvent operator defined on $R(A)$. Assume that we have a sequence of Banach spaces $E_2 \subset E_1 \subset E_0 = E$ decreasing for inclusion (in most cases this sequence shall be provided by $E_k = \mathrm{dom}(A^k)$ endowed with the norm of the graph of $A^k$). We assume on the operator that:

(i) the resolvent set $R(A)$ contains the half-plane $\Delta_a$ for some $a \in \mathbb{R}$, together with the estimates

\[
\sup_{s \in \mathbb{R}} \|\mathcal{R}(a + i s)\|_{E_0 \to E_0} \le C_1
\]

and

\[
\forall\, s \in \mathbb{R}, \qquad \|\mathcal{R}(a + i s)\|_{E_1 \to E_0},\ \|\mathcal{R}(a + i s)\|_{E_2 \to E_1} \le \frac{C_2}{1 + |s|}
\]

for some constants $C_1, C_2 > 0$;

(ii) the semigroup $S(t)$ satisfies

\[
\forall\, t \ge 0, \qquad \|S(t)\|_{E_2 \to E_0} \le C_3\, e^{b t} \tag{5.11}
\]

for some constants $C_3, b > 0$.

Then for any $a' > a$, there exists a constant $C_4$ depending only on $a$, $b$, $a'$, $C_1$, $C_2$, $C_3$ such that

\[
\forall\, t \ge 0, \qquad \|S(t)\|_{E_2 \to E_0} \le C_4\, e^{a' t}. \tag{5.12}
\]


Proof of Lemma 5.20. We split the proof into two parts.
Step 1. The first bound on the resolvent implies that for any $x \in E_0$,
\[ \|R(a + is)x\|_{E_0} \to 0, \quad |s| \to \infty. \]
Indeed we first consider $x \in \mathrm{dom}(A)$ and then argue by density (since the domain $\mathrm{dom}(A)$ is dense). When $x \in \mathrm{dom}(A)$ the result is proved by the relation
\[ R(z)x = z^{-1}\, [-\mathrm{Id} + R(z) A]\, x. \]
Step 2. Then consider the following integral of $R(z)x$ on a vertical segment with real part $a$ (for some $M > 0$):
\[ I_M(x) := \int_{a - iM}^{a + iM} e^{zt}\, R(z)x \, dz. \]
The function $z \mapsto R(z)$ is differentiable on this segment and we can perform an integration by parts:
\[ I_M(x) = \frac{e^{(a+iM)t}}{t}\, R(a + iM)x - \frac{e^{(a-iM)t}}{t}\, R(a - iM)x - \int_{a-iM}^{a+iM} \frac{e^{zt}}{t}\, R(z)^2 x \, dz, \]
where we have used $R'(z) = R(z)^2$. Now we estimate the $E_0$ norm of this quantity:
\[ \|I_M(x)\|_{E_0} \le \left\| \frac{e^{(a+iM)t}}{t}\, R(a + iM)x \right\|_{E_0} + \left\| \frac{e^{(a-iM)t}}{t}\, R(a - iM)x \right\|_{E_0} + \frac{e^{a t}}{t}\, C_2^2 \left( \int_{-\infty}^{+\infty} \frac{ds}{(1 + |s|)^2} \right) \|x\|_{E_2}. \]
Therefore the integral is semi-convergent and we can pass to the limit $M \to +\infty$ and use (see [4,32]) that
\[ S(t)x = \frac{1}{2i\pi} \lim_{M \to \infty} \int_{a-iM}^{a+iM} e^{zt}\, R(z)x \, dz = \frac{1}{2i\pi} \lim_{M \to \infty} I_M(x) \]
to obtain (the two boundary terms go to 0 as $M \to +\infty$ from the first step)
\[ \|S(t)x\|_{E_0} \le C_2'\, \frac{e^{at}}{t}\, \|x\|_{E_2}, \quad \text{with } C_2' = C_2^2 \int_{-\infty}^{+\infty} \frac{ds}{(1 + |s|)^2}. \tag{5.13} \]
Using (5.11) for $t \le 1$ and (5.13) for $t \ge 1$, we conclude that (5.12) holds with $C_4 = \max(C_2', C_3\, e^{b-a})$.

Proof of point (iii) in Theorem 5.2. The point (ii) of Theorem 5.2 was proved in Lemma 5.13 and it shows that the operator $\bar{\mathcal{L}}_\alpha = (\mathrm{Id} - \Pi_\alpha)\, \hat{\mathcal{L}}_\alpha$, together with the sequence of Banach spaces $E_i = W^{k+i,1}_i(m^{-1})$, $i = 0, 1, 2$, for any fixed $k \in \mathbb{N}$ and any exponential weight function $m$ (as defined in (1.30)), satisfies the assumption (i) of Lemma 5.20 for any $a \in (\mu_2, 0)$. Moreover it is trivial to prove that it satisfies the assumption (ii) of Lemma 5.20 for some explicit $b > 0$ from the decomposition $\mathcal{L}_\alpha = A_\delta - B_{\alpha,\delta}(\xi)$ already introduced.



6. Convergence to the Self-Similar Profile

In this section, we consider the nonlinear rescaled equation (1.31) and we prove the convergence of its solutions to the self-similar profile. As a preliminary step let us recall some results from [24, Proposition 3.1, Theorem 3.5, Theorem 3.6] about propagation and appearance of moments and regularity.

Lemma 6.1. Let us consider $g_{in} \in L^1_3 \cap \mathcal{C}_{\rho,0}$ and the associated solution $g \in C([0, \infty); L^1_3)$ to the rescaled equation (1.31). Then
(i) For any exponential moment weight $m$ (as defined in (1.30)) with exponent $s \in (0, 1/2)$ and any time $t_0 \in (0, \infty)$, there exists a constant $M_1 = M_1(t_0)$ such that
\[ \sup_{[t_0, \infty)} \|g(t, \cdot)\|_{L^1(m^{-1})} \le M_1. \tag{6.1} \]
Moreover, if $g_{in} \in L^1(m^{-1})$ for some polynomial or exponential (with exponent $s \in (0, 1)$) moment weight $m$, then (6.1) holds (for this weight $m$) with $t_0 = 0$ and some constant $M_1 = M_1(\|g_{in}\|_{L^1(m^{-1})})$.
For the following two points we now assume that for some constants $c_1, T \in (0, \infty)$ there holds
\[ \inf_{[0, T]} \mathcal{E}(g(t, \cdot)) \ge c_1, \tag{6.2} \]
and we state some smoothness properties of the solution $g$ which depend on $c_1$ but neither on $T$ nor on $\alpha$.
(ii) Assume (6.2). Then for any $k_0 \in \mathbb{N}$ there is $q_0 = q_0(k_0) \in \mathbb{N}$ such that if $\|g_{in}\|_{H^{k_0} \cap L^1_{q_0}} \le C_0$ holds, then for any $c_1 \in (0, \infty)$ there exists $C_1 = C_1(C_0, c_1) \in (0, \infty)$ such that for any time $T \in (0, \infty)$, we have
\[ \forall\, t \in [0, T], \quad \|g(t, \cdot)\|_{H^{k_1}} \le C_1, \tag{6.3} \]
with $k_1 = 0$ if $k_0 = 0$ and $k_1 = k_0 - 1$ if $k_0 \in \mathbb{N}^*$.
(iii) Assume (6.2) and that $g_{in} \in L^2$, with $\|g_{in}\|_{L^2 \cap L^1_3} \le M_1 \in (0, \infty)$. Then there exists $\lambda \in (-\infty, 0)$ and, for any exponential weight function $m$ with exponent $s \in (0, 1/2)$ and any $k \in \mathbb{N}$, there exists a constant $K$ (which depends on $\rho, c_1, M_1, k, m$) such that we may split $g = g^S + g^R$ with
\[ \forall\, t \in [0, T], \quad \|g^S(t, \cdot)\|_{H^k \cap L^1(m^{-1})} \le K, \quad \|g^R(t, \cdot)\|_{L^1_3} \le K\, e^{\lambda t}. \tag{6.4} \]

Remark 6.2. It is worth mentioning that these estimates are uniform with respect to the inelasticity parameter α ∈ (0, 1). Indeed, on the one hand, this was already the case for the moment estimate (6.1) in [24, Prop. 3.1]. On the other hand (6.3) and (6.4) from [24, Theorem 3.5, Theorem 3.6] were (partially) based on the use of the damping effect of the anti-drift term (whose coefficient was fixed to τ = 1). Here the damping effect of the anti-drift term vanishes (τα → 0) but it is replaced (as for the elastic Boltzmann equation) by the lower bound on the energy (6.2) which allows for a control from below on the convolution term L(g) appearing in the loss term of the collision operator (see Lemma 2.3), which is enough to conclude also in this case.



6.1. Local linearized asymptotic stability. Let us consider the nonlinear evolution equation (1.31) in $L^1(m^{-1}) \cap H^k$, and the associated equation on the fluctuation $h$ of a solution $g$ around the unique equilibrium $\bar G_\alpha$:
\[ g = \bar G_\alpha + h \quad \text{and} \quad \partial_t h = \mathcal{L}_\alpha h + Q_\alpha(h, h). \]
Let us start by stating an inequality that we shall need in the sequel.

Lemma 6.3. For any exponential weight function $m$ (as defined in (1.30)), there is a constant $C \in (0, \infty)$ such that for any $h \in W^{3,1}_3(m^{-1})$ and any $\alpha \in (0, 1)$,
\[ \|\Pi_\alpha Q_\alpha(h, h)\|_{L^1(m^{-1})} \le C\, (1 - \alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})}. \]

Proof of Lemma 6.3. We write
\[ \Pi_\alpha Q_\alpha(h, h) = \Pi_\alpha \big( Q_\alpha(h, h) - Q_1(h, h) \big) + (\Pi_\alpha - \Pi_1)\, Q_1(h, h). \]
On the one hand, from Lemma 5.15 (i) and (3.4), there is $C \in (0, \infty)$ such that
\[ \|\Pi_\alpha \big( Q_\alpha(h, h) - Q_1(h, h) \big)\|_{L^1(m^{-1})} \le C\, (1 - \alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})}. \]
On the other hand, from (5.3) and (3.1), we get
\[ \|(\Pi_\alpha - \Pi_1)\, Q_1(h, h)\|_{L^1(m^{-1})} \le C\, (1 - \alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})}. \]
The proof of the lemma is immediate by gathering the two previous estimates.

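We now state a first local linearized stability result.

Proposition 6.4. For any $\alpha \in [\alpha_3, 1)$, the self-similar profile $\bar G_\alpha$ is locally asymptotically stable, with domain of stability uniform according to $\alpha \in [\alpha_3, 1)$. More precisely, let us fix $\rho \in (0, \infty)$ and some exponential weight function $m$ as in (1.30). There is $k_1, q_1 \in \mathbb{N}^*$ such that for any $M_0 \in (0, \infty)$ there exists $C, \varepsilon \in (0, \infty)$ such that for any $\alpha \in [\alpha_3, 1]$, for any $g_{in} \in H^{k_1} \cap L^1(m^{-q_1})$ with mass $\rho$, momentum 0 satisfying
\[ \|g_{in}\|_{H^{k_1} \cap L^1(m^{-q_1})} \le M_0, \qquad \|g_{in} - \bar G_\alpha\|_{L^1(m^{-1})} \le \varepsilon, \tag{6.5} \]
the solution $g$ to the rescaled equation (1.31) with initial datum $g_{in}$ satisfies
\[ \forall\, t \ge 0, \quad \|\Pi_\alpha (g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, \|g_{in} - \bar G_\alpha\|_{L^1(m^{-1})}\, e^{\mu_\alpha t}, \tag{6.6} \]
\[ \forall\, t \ge 0, \quad \|(\mathrm{Id} - \Pi_\alpha)(g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, \|g_{in} - \bar G_\alpha\|_{L^1(m^{-1})}\, e^{(3/2)\, \mu_\alpha t}. \tag{6.7} \]

Proof of Proposition 6.4. Step 1. Let us first denote by $c_1$ the constant given in Step 5 of Proposition 2.1 such that
\[ \forall\, \alpha \in [\alpha_1, 1), \quad \mathcal{E}(\bar G_\alpha) \ge 2\, c_1. \]
We may then fix $\varepsilon_0 \in (0, \infty)$ in such a way that
\[ \|g - \bar G_\alpha\|_{L^1(m^{-1})} \le \varepsilon_0 \quad \text{implies} \quad \mathcal{E}(g) \ge c_1, \tag{6.8} \]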



and define
\[ T_* := \sup\big\{ T;\ \mathcal{E}(g_t) \ge c_1 \ \ \forall\, t \in [0, T] \big\} \in (0, \infty]. \]
From Lemma 6.1 (i) & (ii), there exists $M \in (0, \infty)$ (depending on $\rho, c_1, k_1, q_1, M_0$) such that for any $T \in (0, \infty)$ there holds
\[ \sup_{t \in [0, T_*]} \|g\|_{H^{k_1} \cap L^1(m^{-q_1})} \le M. \tag{6.9} \]
Let us now consider the fluctuation $h_t = g_t - \bar G_\alpha$. Thanks to the mass and momentum conservations, it satisfies $h_t \in \mathcal{C}_{0,0}$ for all times, as well as the bound (6.9). We define the following decomposition of $h$: $h^1 = \Pi_\alpha h$ and $h^2 = (\mathrm{Id} - \Pi_\alpha) h =: \Pi_\alpha^\perp h$. Since the spectral projection $\Pi_\alpha$ commutes with the linearized operator $\mathcal{L}_\alpha$, the equation on $h^1$ is written
\[ \partial_t h^1 = \mu_\alpha h^1 + \Pi_\alpha Q_\alpha(h, h). \]
Multiplying that equation by $(\mathrm{sign}\, h^1)\, m^{-1}$ and integrating in the velocity variable, we deduce thanks to Lemma 6.3 and to (B.2), (6.9) that on $(0, T_*)$ the following holds:
\[ \frac{d}{dt} \|h^1\|_{L^1(m^{-1})} \le \mu_\alpha \|h^1\|_{L^1(m^{-1})} + C\, (1 - \alpha)\, \|h\|^2_{W^{3,1}_3(m^{-1})} \le (1 - \alpha) \Big[ C_1 \|h^1\|^{3/2}_{L^1} + C_1 \|h^2\|^{3/2}_{L^1} - C_2 \|h^1\|_{L^1(m^{-1})} \Big], \tag{6.10} \]
for some constant $C_1$ depending on $M$ and the possible choice $C_2 = \rho/2$. For the second part $h^2$ we have the following equation:
\[ \partial_t h^2 = \Pi_\alpha^\perp \mathcal{L}_\alpha h^2 + \Pi_\alpha^\perp Q_\alpha(h, h). \]
Since the linearized operator $\mathcal{L}_\alpha$ restricted to $\Pi_\alpha^\perp$ generates the semigroup $R_\alpha(t)$ defined in point (iii) of Theorem 5.2, the Duhamel formula reads
\[ h^2(t) = R_\alpha(t)\, h_{in} + \int_0^t R_\alpha(t - s)\, \Pi_\alpha^\perp Q_\alpha(h, h)(s) \, ds. \]
From (5.1) and (3.1) we have
\[ \|h^2(t)\|_{L^1(m^{-1})} \le C\, e^{\bar\mu t}\, \|h_{in}\|_{L^1(m^{-1})} + C \int_0^t e^{\bar\mu (t - s)}\, \|h(s)\|^2_{W^{2,1}_2(m^{-1})} \, ds. \]
We deduce
\[ \|h^2_t\|_{L^1(m^{-1})} \le C_3\, e^{\bar\mu t}\, \|h_{in}\|_{L^1(m^{-1})} + C_4 \int_0^t e^{\bar\mu (t - s)} \Big( \|h^1_s\|^{3/2}_{L^1(m^{-1})} + \|h^2_s\|^{3/2}_{L^1(m^{-1})} \Big) ds \tag{6.11} \]
with $C_4$ depending on $M$ thanks to (B.2) and (6.9). It is then easy to show by comparison arguments from (6.10) and (6.11) that there are $0 < \varepsilon_2 \le \varepsilon_1 \le \varepsilon_0$ (one can take for instance $\varepsilon_1 \le \varepsilon_0/2$ satisfying $2\, C_1\, \varepsilon_1^{1/2} < C_2$ and $2\, C_4\, \varepsilon_1^{1/2} < 1/2$, and next $\varepsilon_2 \le \varepsilon_1$ satisfying $C_3\, \varepsilon_2 < \varepsilon_1/2$) such that
\[ \|h^1_{in}\|_{L^1(m^{-1})} + \|h^2_{in}\|_{L^1(m^{-1})} \le \varepsilon_2 \quad \text{implies} \quad \sup_{t \in [0, T_*]} \max\big\{ \|h^1_t\|_{L^1(m^{-1})}, \|h^2_t\|_{L^1(m^{-1})} \big\} \le \varepsilon_1. \tag{6.12} \]
Gathering (6.8) and (6.12) we deduce that there exists $\varepsilon \in (0, \varepsilon_2)$ such that under condition (6.5) there holds $T_* = \infty$ as well as
\[ \sup_{t \in (0, \infty)} \|g - \bar G_\alpha\|_{L^1(m^{-1})} \le 2\, \varepsilon_1 \le \varepsilon_0. \]
Step 2. In a second step, coming back to (6.11) and to the integral version of (6.10) and setting $y(t) = \|h^1\| + |\mu_\alpha|\, \|h^2\|$, we obtain
\[ y(t) \le C_5\, e^{\mu_\alpha t}\, y(0) + C_6\, |\mu_\alpha| \int_0^t e^{\mu_\alpha (t - s)}\, y(s)^{3/2} \, ds. \tag{6.13} \]
Then we have the following variant of the Gronwall lemma, whose proof is the same as the one of [27, Lemma 4.5] and is therefore skipped:

Lemma 6.5. Let $y = y(t)$ be a nonnegative continuous function on $\mathbb{R}_+$ such that for some constants $a, b, \theta, \mu > 0$,
\[ y(t) \le a\, e^{-\mu t}\, X + b \int_0^t e^{-\mu (t - s)}\, y(s)^{1 + \theta} \, ds \]
(as compared to [27, Lemma 4.5], $X$ need not be $y(0)$). Then if $X$ and $b$ are small enough, we have
\[ y(t) \le C\, X\, e^{-\mu t} \]
for some explicit constant $C > 0$.

Thanks to the uniform smallness estimate on $y(t)$ we can apply the lemma with $\theta = 1/4$ for instance, and we get
\[ y(t) \le C_7\, y(0)\, e^{\mu_\alpha t}, \]
from which we deduce the estimate (6.6) for the $h^1$ part of $g - \bar G_\alpha$. Finally, we may insert that estimate on $h^1$ in (6.11) and we get
\[ \|h^2(t)\|_{L^1(m^{-1})} \le C_3 \big( e^{\bar\mu t} + e^{(3/2)\, \mu_\alpha t} \big) \|h_{in}\|_{L^1(m^{-1})} + C_4 \int_0^t e^{\bar\mu (t - s)}\, \|h^2(s)\|^{3/2}_{L^1(m^{-1})} \, ds. \]
The same kind of computation yields $\|h^2_t\| \le C_8\, e^{(3/2)\, \mu_\alpha t}\, \|h(0)\|$, from which (6.7) follows.
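The mechanism behind Lemma 6.5 can be illustrated on the extremal case where the integral inequality is an equality. Differentiating in time then gives the Bernoulli equation $y' = -\mu y + b\, y^{1+\theta}$ with $y(0) = aX$, which can be solved in closed form. The following Python sketch is only an illustration under that assumption: the function names, the sample parameter values and the constant $C = a\,(1 - (b/\mu)(aX)^\theta)^{-1/\theta}$ are choices made here and are not taken from [27]. It compares a Runge-Kutta integration with the closed form and checks the bound $y(t) \le C X e^{-\mu t}$.

```python
import math


def bernoulli_closed_form(t, a, b, theta, mu, X):
    """Solution of y' = -mu*y + b*y**(1+theta), y(0) = a*X (equality case of Lemma 6.5)."""
    z0 = a * X
    # z = y*exp(mu*t) satisfies d/dt z**(-theta) = -theta*b*exp(-mu*theta*t).
    inner = z0 ** (-theta) - (b / mu) * (1.0 - math.exp(-mu * theta * t))
    return math.exp(-mu * t) * inner ** (-1.0 / theta)


def gronwall_constant(a, b, theta, mu, X):
    """Illustrative constant C with y(t) <= C*X*exp(-mu*t).

    Requires the smallness condition (b/mu)*(a*X)**theta < 1.
    """
    smallness = (b / mu) * (a * X) ** theta
    if smallness >= 1.0:
        raise ValueError("X and b are not small enough for this bound")
    return a * (1.0 - smallness) ** (-1.0 / theta)


def integrate_rk4(a, b, theta, mu, X, t_end, n_steps):
    """Classical fourth-order Runge-Kutta scheme for the same ODE; returns [(t, y), ...]."""
    def rhs(y):
        return -mu * y + b * y ** (1.0 + theta)

    dt = t_end / n_steps
    t, y = 0.0, a * X
    path = [(t, y)]
    for _ in range(n_steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * dt * k1)
        k3 = rhs(y + 0.5 * dt * k2)
        k4 = rhs(y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
        path.append((t, y))
    return path


if __name__ == "__main__":
    # Hypothetical sample values, with theta = 1/4 as in the text.
    params = dict(a=2.0, b=0.5, theta=0.25, mu=1.0, X=0.1)
    C = gronwall_constant(**params)
    for t, y in integrate_rk4(t_end=5.0, n_steps=5000, **params)[::1000]:
        bound = C * params["X"] * math.exp(-params["mu"] * t)
        print(f"t={t:.1f}  y={y:.3e}  bound={bound:.3e}")
```

In this equality case the decay rate $\mu$ is preserved exactly and only the prefactor degrades, which is why the smallness of $X$ and $b$ enters through the single quantity $(b/\mu)(aX)^\theta$.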



6.2. Nonlinear stability estimates. In this subsection we shall prove that when the inelasticity is small enough, depending on the size of the initial datum (but not on the distance between the initial datum and the self-similar profile), Eq. (1.31) is stable. This mainly relies on the fact that the entropy production timescale is much faster than the energy dissipation timescale as $\alpha \to 1$. This point is familiar to physicists (see for instance [9]), who distinguish, for granular gases with small inelasticity, the "molecular timescale" (the level where entropy production effects dominate) and the "cooling timescale" (much slower than the molecular timescale).

Proposition 6.6. Define $k_2 := \max\{k_0, k_1\}$, $q_2 := \max\{q_0, q_1, 3\}$, where $k_i$ and $q_i$ are defined in Theorem 3.5 and Corollary 3.4. For any $\rho, \mathcal{E}_0, M_0$ there exists $\alpha_4 \in [\alpha_3, 1)$, $c_1 \in (0, \infty)$ and for any $\alpha \in [\alpha_4, 1]$ there exist $\varphi = \varphi(\alpha)$ with $\varphi(\alpha) \to 0$ as $\alpha \to 1$ and $T = T(\alpha)$ (possibly blowing up as $\alpha \to 1$) such that for any initial datum $0 \le g_{in} \in L^1_{q_2} \cap H^{k_2} \cap \mathcal{C}_{\rho,0,\mathcal{E}_0}$ with
\[ \|g_{in}\|_{L^1_{q_2} \cap H^{k_2}} \le M_0, \]
the solution $g$ associated to the rescaled equation (1.31) satisfies
\[ \forall\, t \ge 0, \quad \mathcal{E}(g_t) \ge c_1 \]
and, for all $\alpha' \in [\alpha_4, 1)$ and then all $\alpha \in [\alpha', 1]$,
\[ \forall\, t \ge T(\alpha'), \quad \|g_t - \bar G_\alpha\|_{L^1_2} \le \varphi(\alpha'). \tag{6.14} \]

Proof of Proposition 6.6. Let us consider a solution $g \in C([0, \infty); L^1_{q_2} \cap H^{k_2})$ to the rescaled equation (1.31) with given initial datum $g_{in}$, whose existence has been established in [23,24]. We split the proof of the proposition into five steps.

Step 1. From the propagation and appearance of uniform moment bounds [24, Prop. 3.1, (iii)], which, it is worth noticing, have been obtained uniformly with respect to the elastic coefficient (see also [8]), there exists $C_1 \in (0, \infty)$ such that
\[ \sup_{t \ge 0} \|g\|_{L^1_{q_2}} \le C_1. \tag{6.15} \]
Let us define $c_1 := \min\{\mathcal{E}(\bar G_1), \mathcal{E}_0\}/4$, and
\[ T_* := \sup\big\{ T;\ \forall\, t \in [0, T],\ \mathcal{E}(g(t, \cdot)) \ge c_1 \big\}. \tag{6.16} \]
Next, from the equation on the evolution of energy
\[ \mathcal{E}'(t) = -(1 - \alpha^2)\, b_1\, D_{\mathcal{E}}(g) + 2\, (1 - \alpha)\, \rho\, \mathcal{E} \tag{6.17} \]
and (6.15), there holds
\[ |\mathcal{E}'(t)| \le C_2\, (1 - \alpha) \quad \forall\, t \ge 0 \]
(take for instance $C_2 = 2\, b_1\, C_1^2 + C_1$), from which we deduce that we necessarily have $T_* \ge C_3\, (1 - \alpha)^{-1}$ (take for instance $C_3 = (3/4)\, \mathcal{E}_0 / C_2$).


Step 2. From point (ii) of Lemma 6.1, we have for some constant $C_4 \in (0, \infty)$,
\[ \forall\, t \in [0, T_*], \quad \|g_t\|_{H^{k_2}} \le C_4. \tag{6.18} \]
Moreover from Lemma 2.6, for any time $t_1 \in (0, T_*)$, there exists some constant $C_5 = C_5(\rho, C_4, t_1)$ such that
\[ \forall\, t \in [t_1, T_*], \quad g(t, v) \ge C_5^{-1}\, e^{-C_5 |v|^8}. \tag{6.19} \]

Step 3. With the notations of Theorem 3.5, we compute the evolution of the relative entropy of $g(t, \cdot)$ with respect to the associated Maxwellian $M[g(t, \cdot)]$, and we obtain
\[ \frac{d}{dt} H(g|M[g]) = \frac{d}{dt} H(g) - \frac{d}{dt} \int_{\mathbb{R}^N} g \ln M[g] = \frac{d}{dt} H(g) - \frac{\rho N}{2\, \mathcal{E}}\, \frac{d}{dt} \mathcal{E} = -D_{H,\alpha}(g) + \frac{\rho N}{2\, \mathcal{E}}\, (1 - \alpha^2)\, D_{\mathcal{E}}(g) - (1 - \alpha)\, \rho^2 N. \]
Next, from Lemma 3.4 and the estimates (6.15), (6.16), (6.18) and (6.19) we have
\[ \frac{d}{dt} H(g|M[g]) = -D_{H,1}(g, g) + O(1 - \alpha) \quad \text{on } (t_1, T_*). \]
Then from (3.11), we are led to the following differential inequality on the relative entropy:
\[ \frac{d}{dt} H(g|M[g]) \le -C_6\, H(g|M[g])^2 + C_7\, (1 - \alpha) \quad \text{on } (t_1, T_*). \]
By straightforward computations we deduce that, independently of the value of $H(g_{t_1}|M[g_{t_1}])$ (this "loss of memory" effect is typical of differential equations with superlinear damping terms), we have
\[ \forall\, t \in [t_1, T_*], \quad H(g_t|M[g_t]) \le C_8\, (1 - \alpha)^{1/2}\, \frac{1 + e^{-C_9 (1 - \alpha)^{1/2} (t - t_1)}}{1 - e^{-C_9 (1 - \alpha)^{1/2} (t - t_1)}} \]
for some explicit constants. As a conclusion, defining $t_2 := t_1 + C_9^{-1}\, (1 - \alpha)^{-1/2}$ and choosing $\bar\alpha \in [\alpha_3, 1)$ in such a way that $t_2 < T_*$, we have for $\alpha \in [\bar\alpha, 1)$,
\[ \forall\, t \in [t_2, T_*], \quad H(g(t)|M[g]) \le C_{10}\, (1 - \alpha)^{1/2}. \]
Finally, using the Csiszár-Kullback-Pinsker inequality (3.10), as well as Hölder's inequality, we obtain under the same conditions on $\alpha$ and the time variable:
\[ \|g - M[g]\|_{L^1_3} \le C\, \|g - M[g]\|^{1/2}_{L^1}\, \|g\|^{1/2}_{L^1_6} \le C\, H(g|M[g])^{1/4} \le C\, (1 - \alpha)^{1/8}. \tag{6.20} \]
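The "loss of memory" effect invoked in Step 3 can be checked on the model Riccati equation $y' = -A y^2 + B$, which corresponds to the entropy inequality with $A = C_6$ and $B = C_7 (1 - \alpha)$. The sketch below is an illustration only: the function names and sample values are assumptions, and the identification $C_8 = \sqrt{C_7 / C_6}$, $C_9 = 2 \sqrt{C_6 C_7}$ is one admissible choice of the "explicit constants", not necessarily the one intended by the authors. It uses the exact solution of the Riccati equation and verifies that it lies below an envelope that does not depend on the initial value.

```python
import math


def riccati_solution(t, y0, A, B):
    """Exact solution of y' = -A*y**2 + B with y(0) = y0 >= 0, for A, B > 0."""
    y_inf = math.sqrt(B / A)   # stable equilibrium
    k = math.sqrt(A * B)       # relaxation rate near the equilibrium
    th = math.tanh(k * t)
    return y_inf * (y0 + y_inf * th) / (y_inf + y0 * th)


def memoryless_bound(t, A, B):
    """Envelope sqrt(B/A) * (1 + e^{-2kt}) / (1 - e^{-2kt}), t > 0.

    It is the solution started from y0 = +infinity, hence it dominates
    every solution regardless of its initial value.
    """
    y_inf = math.sqrt(B / A)
    k = math.sqrt(A * B)
    decay = math.exp(-2.0 * k * t)
    return y_inf * (1.0 + decay) / (1.0 - decay)


if __name__ == "__main__":
    # Hypothetical values: C6 = C7 = 1 and alpha = 0.99.
    A, B = 1.0, 1.0 * (1.0 - 0.99)
    waiting_time = 1.0 / (2.0 * math.sqrt(A * B))   # plays the role of t2 - t1
    for y0 in (1.0, 1e3, 1e9):
        print(y0, riccati_solution(waiting_time, y0, A, B), memoryless_bound(waiting_time, A, B))
```

After the waiting time $t_2 - t_1 = C_9^{-1} (1 - \alpha)^{-1/2}$ the envelope equals $\sqrt{B/A}\, (1 + e^{-1})/(1 - e^{-1})$, which is of order $(1 - \alpha)^{1/2}$ whatever the initial relative entropy was.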

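Step 4. Now let us go back to the energy equation (6.17). First, with the help of the moment bound (6.15), one may write
\[ \mathcal{E}'(t) = 2\, (1 - \alpha)\, \big[ \rho\, \mathcal{E} - b_1\, D_{\mathcal{E}}(g) + O(1 - \alpha) \big]. \]
Thanks to (6.20) we deduce
\[ \mathcal{E}'(t) = 2\, (1 - \alpha)\, \big( \rho\, \mathcal{E} - b_1\, D_{\mathcal{E}}(M[g]) + O((1 - \alpha)^{1/8}) \big) \quad \text{on } (t_2, T_*). \]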



Finally, thanks to (3.16), (3.17) and the relation $\mathcal{E}(g) = \rho N \theta(g)$, we get on $(t_2, T_*)$,
\[ \mathcal{E}'(t) = \Psi(\mathcal{E}(t), \alpha) := (1 - \alpha)\, \big[ k_3\, \mathcal{E}\, (\bar{\mathcal{E}}_1^{1/2} - \mathcal{E}^{1/2}) + O((1 - \alpha)^{1/8}) \big], \tag{6.21} \]
where $\bar{\mathcal{E}}_1 = \rho N \bar\theta_1$ and $\bar\theta_1$ is the quasi-elastic self-similar temperature defined in (1.29). We may then choose $\alpha' \in [\bar\alpha, 1)$ such that $\Psi(c_1, \alpha) > 0$ for any $\alpha \in [\alpha', 1)$. We conclude by the maximum principle that $T_* = \infty$ for $\alpha \in [\alpha', 1)$. In particular, all the previous estimates on $g$ are uniform on $(t_2, \infty)$.

Step 5. Thanks to (6.21) we easily get
\[ \frac{d}{dt} (\mathcal{E} - \bar{\mathcal{E}}_1)^2 \le -(1 - \alpha)\, \big[ k_5\, (\mathcal{E} - \bar{\mathcal{E}}_1)^2 + O((1 - \alpha)^{1/8}) \big], \]
so that (for some constants $a, b > 0$)
\[ \forall\, t \ge t_2, \quad |\mathcal{E}(t) - \bar{\mathcal{E}}_1| \le |\mathcal{E}(t_2) - \bar{\mathcal{E}}_1|\, e^{-a\, (1 - \alpha)\, (t - t_2)} + b\, (1 - \alpha)^{1/8}. \]
Setting $T(\alpha) = \max\{t_2, c\, (1 - \alpha)^{-1}\}$ for some suitable constant $c > 0$, we then obtain
\[ |\mathcal{E} - \bar{\mathcal{E}}_1| = O((1 - \alpha)^{1/8}) \quad \text{on } [T(\alpha), \infty). \tag{6.22} \]
In order to conclude that (6.14) holds, we write
\[ g(t) - \bar G_\alpha = (g(t) - M[g(t)]) + (M[g(t)] - \bar G_1) + (\bar G_1 - \bar G_\alpha), \]
and we estimate the first term thanks to (6.20), the second term thanks to (6.22) and the third term by (3.12).

6.3. Decomposition and Lyapunov functional for smooth initial datum. The proof of the global convergence (point (v) of Theorem 1.1) for smooth initial data only amounts to connecting the two previous results of Propositions 6.4 and 6.6 by choosing $\alpha$ such that $\varphi(\alpha) \le \varepsilon$, where $\varepsilon$ is the size of the attraction domain in Proposition 6.4 and $\varphi(\alpha)$ is defined in Proposition 6.6. More precisely, we state without proof the straightforward combination of Propositions 6.6 and 6.4.

Corollary 6.7. Let us fix an exponential weight function $m$ as in (1.30), with exponent $s \in (0, 1)$. Then for any $\rho, \mathcal{E}_0, M_0$ there exists $C$ and $\alpha_5 \in [\alpha_4, 1)$ (depending on $\rho, \mathcal{E}_0, M_0, m$) such that for any $\alpha \in [\alpha_5, 1)$ and any initial datum $0 \le g_{in} \in L^1(m^{-q_2}) \cap H^{k_2}$ satisfying
\[ g_{in} \in \mathcal{C}_{\rho,0,\mathcal{E}_0}, \qquad \|g_{in}\|_{L^1(m^{-q_2}) \cap H^{k_2}} \le M_0, \]
the solution $g$ associated to the rescaled equation (1.31) satisfies
\[ \forall\, t \ge 0, \quad \|\Pi_\alpha (g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, e^{\mu_\alpha t}, \]
\[ \forall\, t \ge 0, \quad \|(\mathrm{Id} - \Pi_\alpha)(g_t - \bar G_\alpha)\|_{L^1(m^{-1})} \le C\, e^{(3/2)\, \mu_\alpha t}. \]

Remark 6.8. Note that the constant $C$ in the rate of decay does not depend on $\alpha$. This comes from the fact that the size of the linearized stability domain is uniform as $\alpha$ goes to 1 in Proposition 6.4, which allows us in Proposition 6.6 to pick a fixed $\alpha'$ such that in the estimate (6.14) $\varphi(\alpha')$ is less than this size, and therefore the time $T(\alpha')$ required to enter this neighborhood does not blow up as $\alpha$ goes to 1.



As a by-product of the previous propositions, we state and prove a result which provides a partial answer to the question (important from the physical viewpoint) of finding Lyapunov functionals for this particle system. Let us define the required objects. We consider a fixed mass $\rho$ and some restitution coefficient $\alpha$ whose range will be specified below. At initial times, non-linear effects dominate and therefore we define
\[ \mathcal{H}_1(g) := H(g|M[g]) + (\mathcal{E} - \bar{\mathcal{E}}_\alpha)^2, \]
where $\bar{\mathcal{E}}_\alpha = \mathcal{E}(\bar G_\alpha)$ is the energy of the self-similar profile corresponding to $\alpha$ and the mass $\rho$. At eventual times, linearized effects dominate. Therefore we define (inspired by the spectral study):
\[ \mathcal{H}_2(g) := \|h^1\|^2_{L^1(m^{-1})} + (1 - \alpha) \int_0^{+\infty} \|R_\alpha(s)\, h^2\|^2_{L^2} \, ds, \]
with $h^1 = \Pi_\alpha h$, $h^2 = \Pi_\alpha^\perp h$ and $h = g - \bar G_\alpha$.

Proposition 6.9. There is $k_4 \in \mathbb{N}$ big enough (this value is specified in the proof) such that for any exponential weight function $m$ as defined in (1.30), any time $t_0 \in (0, \infty)$ and any $\rho, \mathcal{E}_0, M_0 \in (0, \infty)$, there exists $\kappa_* \in (0, \infty)$ and $\alpha_6 \in [\alpha_5, 1)$ such that for any $\alpha \in [\alpha_6, 1]$ and any initial datum $g_{in} \in H^{k_4} \cap L^1(m^{-1})$ satisfying
\[ g_{in} \in \mathcal{C}_{\rho,0,\mathcal{E}_0}, \qquad \|g_{in}\|_{H^{k_4} \cap L^1(m^{-1})} \le M_0, \qquad g_{in}(v) \ge M_0^{-1}\, e^{-M_0 |v|^8}, \]
the solution $g$ to the rescaled equation (1.31) with initial datum $g_{in}$ is such that the functional
\[ \mathcal{H}(g_t) = \mathcal{H}_1(g_t)\, \mathbf{1}_{\mathcal{H}_1(g_t) \ge \kappa_*} + \mathcal{H}_2(g_t)\, \mathbf{1}_{\mathcal{H}_1(g_t) \le \kappa_*} \]
is decreasing for all times $t \in [0, +\infty)$. Moreover, $\mathcal{H}(g(t, \cdot))$ is strictly decreasing as long as $g(t, \cdot)$ has not reached the self-similar profile $\bar G_\alpha$.

Proof of Proposition 6.9. We split the proof into three steps.

Step 1: Initial times. Taking $k_4 \ge k_2$ and $\alpha \in [\alpha_4, 1)$, we know from the proof of Proposition 6.6 that the solution $g$ satisfies
\[ \forall\, t \in [t_0, \infty), \quad \|g(t, \cdot)\|_{H^{k_4} \cap L^1(m^{-1})} \le M_1, \quad g(t, v) \ge M_1^{-1}\, e^{-M_1 |v|^8}, \]
for some constant $M_1 \in (0, \infty)$ (recall that $\alpha_4$ was adjusted in terms of $\rho, \mathcal{E}_0, M_0$). Coming back then to Steps 3 and 4 in the proof of Proposition 6.6, we obtain the two following differential relations on $(t_0, \infty)$:
\[ \frac{d}{dt} H(g|M[g]) \le -K_1\, H(g|M[g])^2 + O(1 - \alpha) \]
and
\[ \frac{d}{dt} (\mathcal{E} - \bar{\mathcal{E}}_\alpha)^2 = 2\, \rho\, (1 - \alpha)\, \big[ K_2\, \mathcal{E}\, (\bar{\mathcal{E}}_\alpha^{1/2} - \mathcal{E}^{1/2})\, (\mathcal{E} - \bar{\mathcal{E}}_\alpha) + O((1 - \alpha)^{1/8}) \big], \]
for some constants $K_i \in (0, \infty)$. We easily deduce that for any $\kappa \in (0, \infty)$ there exists $\alpha_\kappa \in [\alpha_5, 1)$ such that
\[ \frac{d}{dt} \mathcal{H}_1(g_t) < 0 \quad \text{for any } t \in (0, \infty) \text{ such that } \mathcal{H}_1(g_t) \ge \kappa. \tag{6.23} \]



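Step 2: Eventual times. Let us first remark that from point (iii) in Theorem 5.2 and the interpolation inequality (B.2), for any $q \in \mathbb{N}^*$ there exists $k, k' \in \mathbb{N}$ and $C_i \in (0, \infty)$ such that
\[ \|R_\alpha(s)\, h^2\|_{L^2} \le C_1\, \|R_\alpha(s)\, h^2\|_{W^{k,1}(m^{-q/2})} \le C_2\, e^{\bar\mu s}\, \|h^2\|_{W^{k+2,1}_2(m^{-q/2})} \le C_3\, e^{\bar\mu s}\, \|h\|_{H^{k'} \cap L^1(m^{-q})}, \]
so that, taking $k_4$ big enough, the functional $\mathcal{H}_2(g(t, \cdot))$ is well-defined for any time $t \in (0, \infty)$. First observe that from (6.10) there holds
\[ \frac{d}{dt} \|h^1\|^2_{L^1(m^{-1})} \le (1 - \alpha)\, \big[ K_1\, \|h\|^{5/2}_{L^1(m^{-1})} - K_2\, \|h^1\|^2_{L^1(m^{-1})} \big]. \tag{6.24} \]
Second, we compute (with the notation of Subsect. 5.7)
\[ \frac{d}{dt} \int_0^{+\infty} \|R_\alpha(s)\, h^2_t\|^2_{L^2} \, ds = 2 \int_0^{+\infty} \int_{\mathbb{R}^N} (e^{s \bar{\mathcal{L}}_\alpha} h^2)\, \big[ e^{s \bar{\mathcal{L}}_\alpha} (\bar{\mathcal{L}}_\alpha h^2 + \Pi_\alpha^\perp Q_\alpha(h, h)) \big] \, ds \, dv. \]
On the one hand,
\[ I_1 = 2 \int_0^{+\infty} \int_{\mathbb{R}^N} (e^{s \bar{\mathcal{L}}_\alpha} h^2)\, \big[ e^{s \bar{\mathcal{L}}_\alpha} \bar{\mathcal{L}}_\alpha h^2 \big] \, ds \, dv = \int_0^{+\infty} \frac{d}{ds} \|e^{s \bar{\mathcal{L}}_\alpha} h^2\|^2_{L^2} \, ds = -\|h^2\|^2_{L^2}. \]
On the other hand,
\[ I_2 = 2 \int_0^{+\infty} \int_{\mathbb{R}^N} (R_\alpha(s)\, h^2)\, \big[ R_\alpha(s)\, \Pi_\alpha^\perp Q_\alpha(h, h) \big] \, ds \, dv \]
\[ \le 2\, C_1^2 \int_0^{+\infty} \|R_\alpha(s)\, h^2\|_{W^{k_1,1}(m^{-q/2})}\, \|R_\alpha(s)\, \Pi_\alpha^\perp Q_\alpha(h, h)\|_{W^{k_1,1}(m^{-q/2})} \, ds \]
\[ \le C_2 \left( \int_0^{+\infty} e^{2 \bar\mu s} \, ds \right) \|h^2\|_{W^{k_1+1,1}_2(m^{-q/2})}\, \|Q_\alpha(h, h)\|_{W^{k_1+1,1}_2(m^{-q/2})} \]
\[ \le C_3\, \|h\|^{3/2}_{L^2}\, \|h\|^{1/2}_{H^{k_3} \cap L^1(m^{-1})}\, \|h^2\|^{3/4}_{L^2}\, \|h^2\|^{1/4}_{H^{k_3} \cap L^1(m^{-1})}, \]
for some $k_3 \in \mathbb{N}$ given by Proposition B.1. Taking $k_4 \ge k_3$, we then obtain
\[ \frac{d}{dt} \int_0^{+\infty} \|R_\alpha(s)\, h^2_t\|^2_{L^2} \, ds \le K_3\, \|h\|^{9/4}_{L^2} - \|h^2\|^2_{L^2}. \tag{6.25} \]
Gathering (6.24) and (6.25) and using some interpolation again, we deduce that there exists $\kappa' \in (0, \infty)$ such that
\[ \frac{d}{dt} \mathcal{H}_2(g_t) < 0 \quad \text{for any } t \in (0, \infty) \text{ such that } \|h_t\|_{L^1} \le \kappa'. \tag{6.26} \]
Step 3. We conclude by putting together (6.23) and (6.26), and using (3.10), (3.12) in order to prove that $\mathcal{H}_1(g) \le \kappa$ implies $\|h_t\|_{L^1} \le \kappa'$, for $\alpha \in [\alpha_6, 1]$ for some $\alpha_6 \in [\alpha_5, 1)$.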


6.4. Global stability for general initial datum. We first state and prove a regularity result on the iterated gain term which is the inelastic collision operator version of the same result proved for the elastic collision operator in [1,26].

Lemma 6.10. There exists a constant $C$ such that for any $f, g, h \in L^1_2(\mathbb{R}^N)$ and any $\alpha \in (0, 1]$ there holds
\[ \|Q^+_\alpha(f, Q^+_\alpha(g, h))\|_{L^3} \le C\, \|f\|_{L^1_2}\, \|g\|_{L^1_2}\, \|h\|_{L^1_2}. \tag{6.27} \]

Proof of Lemma 6.10. We follow [26, Lemma 2.1] and [1, Lemma 2.1] and we make use of the Carleman representation introduced in [24, Prop. 1.5]. Let us consider f, g, h ∈ L 12 (R N ) and φ ∈ L ∞ (R N ). We apply twice the weak formulation of the gain term Q + ( f, Q + (g, h))(v) φ(v) dv RN

+ Q (g, h)(v) f (v2 ) |v − v2 | φ(w2 ) dσ2 dv2 dv = RN RN S2

= g(v) h(v1 ) f (v2 ) |v − v1 | |v1 − v2 | φ(v2 ) dσ2 dσ1 dv dv1 dv2 RN RN RN

w2

with that for

S2

= V (v2 , v, σ2 ), v1 any given v, v∗ , σ ∈ w=

= V (v, v1 , σ1 ) R N , we define

S2

and therefore v2 = V (v2 , v1 , σ2 ). Recall

v + v∗ 1+e , u = v − v∗ , γ = , u = (1 − γ ) u + γ |u| σ 2 2

and then γ w u + = v + (|u| σ − u), 2 2 2 w u γ V∗ = V∗ (v, v∗ , σ ) = − = v∗ − (|u| σ − u). 2 2 2 V = V (v, v∗ , σ ) =

We denote by = (v, v1 , v2 ) the term between brackets in the last integral. Introducing the point w1 and the set Sv,v1 ,ε defined by w1 := (1 − γ /2) v + (γ /2) v1 , ! ! ! ! := z ∈ R N ; !|z − w1 | − (γ /2) |v − v1 |! ≤ ε/2 ,

Sv,v1 ,ε we get =

(2/γ )2 ε lim , ε = 1 Sv,v1 ,ε (v1 ) |v1 − v2 | φ(v2 ) dσ2 dv1 . (6.28) N 2 |v − v1 | ε→0 ε R S

Remarking that v2 = v2 + (γ /2) (|u 2 | σ2 − u 2 ) with u 2 = v1 − v2 , we observe that the integral term ε is very similar to the collision term Q + (here v2 (resp. v1 , σ2 , γ , v2 ) plays the role of v (resp. v1 , σ , β, v) in the gain term) and therefore we may give a Carleman representation of ε . The same computations as performed in [24, Prop. 1.5] yield 4 ε = 2 1S (v ) |v − v2 |−1 φ(v2 ) d E(v3 ) dv2 , γ R N E v ,v v,v1 ,ε 1 2 2 2



where E v2 ,v2 is the hyperplan orthogonal to the vector v2 − v2 and passing through the point v2 ,v2 = v2 + (1 − γ −1 ) (v2 − v2 ). Here v3 stands for the post collision velocity issued from v1 , that is v3 = V∗ (v2 , v1 , σ2 ), and then, thanks to the momentum conservation, v1 := v2 + v3 − v2 . We finally define v2 ,v2 the hyperplan orthogonal to the vector v2 − v2 and passing through the point v ,v = v2 + (1 − γ −1 ) (v2 − v2 ) and we 2 2 get ε =

4 1S (v ) |v − v2 |−1 φ(v2 ) d E(v1 )dv2 . γ 2 R N v ,v v,v1 ,ε 1 2 2

(6.29)

2

Now, arguing as in [1, Lemma 2.1], we see that the measure of the intersection ε between the plane v2 ,v2 and the thickened sphere Sv,v1 ,ε is bounded by π ε γ |v − v1 | and that v1 ∈ ε implies that v2 ∈ B ε with B ε := z ∈ R N ; |z|2 ≤ |v|2 + |v1 |2 + 2 ε (|v| + |v1 |) + ε2 |v2 |2 . Gathering these estimates with (6.28) and (6.29) we get φ(v2 ) (2/γ )4 1 = lim mes(ε ) dv2 |v − v1 | ε→0 ε R N |v2 − v2 | φ(v2 ) φ(v2 ) 24 π 24 π ε (v ) dv = 1 1 0 (v ) dv2 , ≤ 3 lim B 2 2 γ ε→0 R N |v2 − v2 | γ 3 R N |v2 − v2 | B 2 where we have defined B 0 := {z ∈ R N ; |z|2 ≤ |v|2 + |v1 |2 }. Using [1, Lemma 2.2] we may conclude as in the end of [1, Lemma 2.1] and therefore (6.27) follows. We second establish that the solution g of the rescaled equation (1.31) decomposes between a regular part and a small remaining part as it has been proved for the elastic Boltzmann equation in [28], and then partially extended to the inelastic Boltzmann equation in [24]. As compared to this last paper, this result relaxes the assumption on the initial datum to gin ∈ L 13 , but at the price of the hypothesis of a lower bound on the energy. Lemma 6.11. Consider gin ∈ L 13 and the associated solution g ∈ C([0, ∞); L 13 ) to the rescaled equation (1.31). Assume that for some constant ρ, c1 , M1 , T ∈ (0, ∞) there holds gin ∈ Cρ,0 ,

gin L 1 ≤ M1 , 3

∀ t ∈ [0, T ], E(g(t, ·)) ≥ c1 .

(6.30)

Then, there are α7 ∈ [α6 , 1) and λ ∈ (−∞, 0), and for any smooth exponential weight function m (as defined in (1.30)) and any k ∈ N, there exists a constant K (which depends on ρ, c1 , M1 , k, m) such that for any α ∈ [α7 , 1], we may split g = g S + g R with ∀ t ∈ [0, T ], g S (t, ·) H k ∩L 1 (m −1 ) ≤ K ,

g R (t, ·) L 1 ≤ K eλ t . 3

(6.31)



Proof of Lemma 6.11. The starting point is to write the rescaled equation (1.31) in the following way ∂g + τα v · ∇v g + g = Q +α (g, g), ∂t with (t, v) := τα N + L(g(t, ·))(v). Introducing the linear semigroup t (Ut h)(v) = h(v e−τα t ) exp − (s, v) ds 0

and using the Duhamel formula, we have t gt = Ut gin + Ut−s Q +α (gs , gs ) ds. 0

We iterate that last identity and we obtain g = g1R + g1S with g1R

t

= Ut gin + Ut−s Q +α (gs , Us gin ) ds, 0 t s S g1 = Ut−s Q +α (gs , Us−u Q +α (gu , gu )) du ds. 0

0

On the one hand, the energy lower bound (6.30) and Lemma 2.3 imply that there exists a constant c2 ∈ (0, ∞) such that (Ut h)(v) ≤ e−c2 t (Vξt h)(v) with (Vξ h)(v) = h(ξ v) and ξt = e−τα t . On the other hand, straightforward homogeneity arguments leads to Q +α (g, Vξ h) = ξ −N −1 Vξ −1 Q +α (Vξ −1 g, h) and h ξ |.|q L p = ξ −q−N / p h |.|q L p for any functions g, h and positive real ξ . From these considerations we deduce that g1R (t) L 1 ≤ e(N τα −c2 ) t gin L 1 + e((N +1) τα −c2 ) t gin L 1 sup gs L 1 ≤ C e−(c2 /2) t , 1

s≥0

1

for some constant C and for any (1 − α) small enough. In the same way, we have t s g1S (t) L 3 ≤ e[(2N /3+1) τα −c2 ] (t−σ ) Q +α (Vξ −1 gs , Q +α (gσ , gσ )) L 3 dσ ds. 0

s−σ

0

Taking (1 − α) smaller if necessary and using Lemma 6.10, we obtain t s g1S (t) L 3 ≤ e−(c2 /2) (t−σ ) dσ ds sup gs 3L 1 , 0

0

s≥0

2

which ends the proof of (6.31) in the case k = 0, with the help of point (i) of Lemma 6.1. The general case k ∈ N∗ is then treated by following the strategy introduced in [28] and



using the result of appearance of regularity proved in [24] (and recalled in point (iii) of Lemma 6.1).

Third, we recall a classical $L^1$ stability result for the elastic Boltzmann equation which has been established in [24, Prop. 3.2] for the rescaled equation (1.31).

Lemma 6.12. Consider $0 \le g^1_{in}, g^2_{in} \in L^1_3 \cap \mathcal{C}_{\rho,0}$ and the two associated solutions $g^1_t, g^2_t \in C([0, \infty); L^1_3) \cap L^\infty(0, \infty; L^1_3)$ to the rescaled equation (1.31). There exists $C_{stab} \in (0, \infty)$ (only depending on $b$ and $\sup_{t \ge 0} \|g^1 + g^2\|_{L^1_3}$) such that
\[ \forall\, t \ge 0, \quad \|g^2_t - g^1_t\|_{L^1_2} \le \|g^2_{in} - g^1_{in}\|_{L^1_2}\, e^{C_{stab}\, t}. \]

Proof of point (iv) of Theorem 1.1. Let us consider $g_{in} \in L^1_3 \cap \mathcal{C}_{\rho,0,\mathcal{E}_{in}}$ with $\|g_{in}\|_{L^1_3} \le M_0$ for some fixed $\mathcal{E}_{in}, M_0 \in (0, \infty)$ and $g$ the associated solution to the rescaled equation (1.31) which has been built in [24]. We know that there exists $M_1 \in (0, \infty)$ such that
\[ \sup_{(0, \infty)} \|g(t, \cdot)\|_{L^1_3} \le M_1. \tag{6.32} \]
Step 1. We define
\[ T_* := \sup\big\{ T \in (0, \infty);\ \mathcal{E}(g(t, \cdot)) \ge c_1 \ \ \forall\, t \in [0, T] \big\}, \qquad c_1 := \min\{\mathcal{E}_{in}, \bar{\mathcal{E}}_1\}/2. \]
We shall prove that $T_* = +\infty$. We argue by contradiction, assuming that $T_* < \infty$. From the equation on the energy (6.17), the uniform estimate (6.32) and the definition of $T_*$ we have
\[ T_* \ge C_1\, (1 - \alpha)^{-1} \quad \text{and} \quad \mathcal{E}'(T_*) \le 0. \tag{6.33} \]
Thanks to Lemma 6.11, we may decompose $g = g^S + g^R$ on $(0, t_1)$, with $t_1 \in (0, T_*)$ to be fixed. At time $t_1$ we initiate a new flow starting from the smooth part of $g$. More precisely, we decompose $g = \tilde g^S + \tilde g^R$ on $(t_1, T_*)$, with $\tilde g^S(t_1) = [\rho / \rho(g^S(t_1))]\, g^S(t_1)$, $\tilde g^S$ solution (with mass $\rho$!) to Eq. (1.31) on $(t_1, T_*)$ and $\tilde g^R := g - \tilde g^S$. On the one hand, from (6.31) and Lemma 6.12 we have
\[ \|\tilde g^R(t)\|_{L^1_3} \le C\, e^{C_{stab} (T_* - t_1) + \lambda t_1} \quad \text{on } (t_1, T_*). \]
We choose $t_1 = \eta\, T_*$ with $\eta \in (0, 1)$ in such a way that $C_{stab}\, (1 - \eta) + \lambda\, \eta = \lambda/2$. We have then proved
\[ \|\tilde g^R(t)\|_{L^1_3} \le C\, e^{(\lambda/2)\, C_1\, (1 - \alpha)^{-1}} \quad \text{on } (t_1, T_*). \tag{6.34} \]


On the other hand, following Step 3 in the proof of Proposition 6.6, we deduce an estimate similar to (6.20), namely
\[ \|\tilde g^S(T_*, \cdot) - M[\tilde g^S(T_*, \cdot)]\|_{L^1_3} = O((1 - \alpha)^{1/8}) \tag{6.35} \]
for any $(1 - \alpha)$ small enough, chosen in such a way that the intermediate time $t_2$ defined in Step 3 of the proof of Proposition 6.6 satisfies $t_1 + t_2 \le T_*$. Gathering (6.34) and (6.35) we obtain
\[ \|g(T_*, \cdot) - M[g(T_*, \cdot)]\|_{L^1_3} = O((1 - \alpha)^{1/8}). \]
Coming back to Eq. (6.17) on the energy and proceeding as in Step 4 in the proof of Proposition 6.6, we get
\[ \mathcal{E}'(T_*) \ge (1 - \alpha)\, \big[ k_3\, c_1\, (\bar{\mathcal{E}}_1^{1/2} - c_1^{1/2}) - C\, (1 - \alpha)^{1/8} \big] > 0 \]
for any $(1 - \alpha)$ small enough. That is in contradiction with (6.33) and we conclude that $T_* = +\infty$.

Step 2. Thanks to the previous step, we have a uniform in time lower bound on the energy, and therefore we can run the decomposition theorem for all times. By applying the decomposition theorem as in Step 1 for a given time $t \in (0, \infty)$, starting a new flow at $t_1 = \eta\, t$, taking $[\rho / \rho(g^S(t_1))]\, g^S(t_1, \cdot)$ as initial datum, and then using Corollary 6.7 on the smooth part $\tilde g^S(s, \cdot)$, $s \in [t_1, t]$, we find that at time $t$, the solution $g_t$ decomposes as $\tilde g^S_t + \tilde g^R_t$, where $\tilde g^S_t$ approaches the self-similar profile with rate $C\, e^{\mu_\alpha (t - t_1)}$, that is $C\, e^{(1 - \eta)\, \mu_\alpha t}$, and the remaining part $\tilde g^R$ goes to 0 with rate $C\, e^{(\lambda/2)\, t}$. Since $|\lambda/2|$ is larger than $(1 - \eta)\, |\mu_\alpha|$ for $(1 - \alpha)$ small enough, it concludes the proof of (1.36).

A. Appendix: Moments of Gaussians

We state here some moments of tensor products of Gaussians.

Lemma A.1. The following identities hold:
\[ \int_{\mathbb{R}^N} M_{1,0,1}\, |v|^2 \, dv = N, \tag{A.1} \]
\[ \int_{\mathbb{R}^N} M_{1,0,1}\, |v|^4 \, dv = N\, (N + 2), \tag{A.2} \]
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |u|^3 \, dv \, dv_* = 2^{3/2} \int_{\mathbb{R}^N} M_{1,0,1}\, |v|^3 \, dv, \tag{A.3} \]
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |v|^2\, |u|^3 \, dv \, dv_* = \sqrt{2}\, (2N + 3) \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^3 \, dv. \tag{A.4} \]

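Proof of Lemma A.1. The proof of (A.1) and (A.2) being straightforward and the proof of (A.3) being very similar to the proof of (A.4), we only prove (A.4). We first notice that
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |v|^2\, |u|^3 \, dv \, dv_* = \frac{1}{2} \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, (|v|^2 + |v_*|^2)\, |u|^3 \, dv \, dv_* \]
\[ = \frac{1}{4} \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, (|v + v_*|^2 + |v - v_*|^2)\, |u|^3 \, dv \, dv_*. \]
Making use of the change of variable $(v, v_*) \to (x = (v + v_*)/\sqrt{2},\ y = (v - v_*)/\sqrt{2})$, we then get
\[ \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}\, (M_{1,0,1})_*\, |v|^2\, |u|^3 \, dv \, dv_* = \sqrt{2} \int_{\mathbb{R}^N \times \mathbb{R}^N} M_{1,0,1}(x)\, M_{1,0,1}(y)\, (|x|^2 + |y|^2)\, |y|^3 \, dx \, dy \]
\[ = \sqrt{2}\, N \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^3 \, dv + \sqrt{2} \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^5 \, dv = \sqrt{2}\, (2N + 3) \int_{\mathbb{R}^N} M_{1,0,1}(v)\, |v|^3 \, dv. \]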
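The identities of Lemma A.1 can be checked numerically. The Python sketch below is an illustration under two assumptions stated here rather than in the text: $M_{1,0,1}$ is taken to be the normalized Gaussian $(2\pi)^{-N/2} e^{-|v|^2/2}$ (mass 1, momentum 0, temperature 1), and its absolute moments are evaluated with the standard Gamma-function formula. The one-dimensional moments give (A.1), (A.2) and the recurrence used in the last step of the proof, while (A.3) and (A.4) are checked independently for $N = 1$ by a midpoint quadrature of the double integral. All function names are hypothetical.

```python
import math


def abs_moment(p, N):
    """Integral of M_{1,0,1}(v) |v|^p over R^N for the standard Gaussian."""
    return 2.0 ** (p / 2.0) * math.gamma((N + p) / 2.0) / math.gamma(N / 2.0)


def double_integral_1d(weight, half_width=10.0, n=400):
    """Midpoint rule for the integral of M(v) M(w) weight(v, w) over R^2 (case N = 1)."""
    h = 2.0 * half_width / n
    nodes = [-half_width + (i + 0.5) * h for i in range(n)]
    dens = [math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) for x in nodes]
    total = 0.0
    for v, mv in zip(nodes, dens):
        for w, mw in zip(nodes, dens):
            total += mv * mw * weight(v, w)
    return total * h * h


if __name__ == "__main__":
    for N in range(1, 5):
        print(N, abs_moment(2, N), abs_moment(4, N), abs_moment(5, N) / abs_moment(3, N))
    # (A.3) and (A.4) for N = 1, with u = v - v_*.
    lhs_a3 = double_integral_1d(lambda v, w: abs(v - w) ** 3)
    lhs_a4 = double_integral_1d(lambda v, w: v * v * abs(v - w) ** 3)
    print(lhs_a3, 2.0 ** 1.5 * abs_moment(3, 1))
    print(lhs_a4, math.sqrt(2.0) * 5.0 * abs_moment(3, 1))
```

The ratio printed in the first loop is $N + 3$, which is the identity $\int M_{1,0,1} |v|^5 = (N + 3) \int M_{1,0,1} |v|^3$ that turns $\sqrt{2}\, N + \sqrt{2}\, (N + 3)$ into the factor $\sqrt{2}\, (2N + 3)$ of (A.4).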

B. Appendix: Interpolation Inequalities Lemma B.1. (i) For any k, k ∗ , q, q ∗ ∈ Z with k ≥ k ∗ , q ≥ q ∗ and any θ ∈ (0, 1) ∗∗ there is C ∈ (0, ∞) such that for h ∈ Wqk∗∗ ,1 (m −1 ), hW k,1 (m −1 ) ≤ C h1−θ k ∗ ,1

Wq ∗ (m −1 )

q

hθ

∗∗

Wqk∗∗ ,1 (m −1 )

(B.1)

with k ∗∗ , q ∗∗ ∈ Z such that k = (1 − θ ) k ∗ + θ k ∗∗ , q = (1 − θ ) q ∗ + θ q ∗∗ . (ii) For any k, q ∈ N ∗ and any exponential weight function m as defined in (1.30), there ‡ exists C ∈ (0, ∞) such that for any h ∈ H k ∩ L 1 (m −12 ) with k ‡ := 8k + 7(1 + N /2), 1/4 ‡ Hk

hW k,1 (m −1 ) ≤ C h q

1/4

3/4

h L 1 (m −12 ) h L 1 (m −1 ) .

(B.2)

Proof of Lemma B.1. The inequality (B.1) in point (i) is a classical result from interpolation theory. Let us focus on point (ii). We prove the inequality (B.2) for $h \in \mathcal{S}(\mathbb{R}^N)$ and then argue by density. On the one hand, we observe that for any $\ell$ there exists $C$ such that
\[ \|h\|^2_{H^\ell} \le C\, \|h\|_{L^1}\, \|h\|_{H^{\ell^\dagger}}, \qquad \ell^\dagger := 2\ell + 1 + N/2. \]
Iterating that inequality twice, we get (for some related exponents $k^\dagger$, $k^\ddagger$)
\[ \|h\|_{H^{k^\dagger}} \le C\, \|h\|^{3/4}_{L^1}\, \|h\|^{1/4}_{H^{k^\ddagger}}. \tag{B.3} \]
On the other hand, using first the Cauchy-Schwarz inequality, then the same argument as above and Hölder's inequality, we obtain
\begin{align*}
\|h\|_{W^{k,1}_q(m^{-1})} \le C\, \|h\|_{H^k(m^{-3/2})}
&\le C\, \|h\|^{1/2}_{L^1(m^{-3})}\, \|h\|^{1/2}_{H^{k^\dagger}} \\
&\le C\, \|h\|^{1/8}_{L^1(m^{-12})}\, \|h\|^{3/8}_{L^1}\, \|h\|^{1/2}_{H^{k^\dagger}}. \tag{B.4}
\end{align*}
We conclude by gathering (B.4) and (B.3).
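
The index and exponent bookkeeping behind this proof can be sketched as follows. Two points are assumptions rather than statements of the text: that $k^\dagger$ and $k^\ddagger$ are obtained by iterating the map $\ell \mapsto \ell^\dagger = 2\ell + 1 + N/2$, and that the weight satisfies $m \le 1$, so that $\|h\|_{L^1} \le \|h\|_{L^1(m^{-1})}$.

```latex
% Assumption: k^\dagger is one application of \ell \mapsto 2\ell + 1 + N/2 to k,
% and k^\ddagger is two further applications (the "iterating twice" of the proof).
\begin{align*}
k^\dagger &= 2k + \big(1+\tfrac{N}{2}\big), \\
(k^\dagger)^\dagger &= 4k + 3\big(1+\tfrac{N}{2}\big), \\
k^\ddagger = \big((k^\dagger)^\dagger\big)^\dagger &= 8k + 7\big(1+\tfrac{N}{2}\big).
\end{align*}
% Exponents: inserting (B.3) into the right-hand side of (B.4) gives
\begin{align*}
\|h\|^{1/8}_{L^1(m^{-12})}\,\|h\|^{3/8}_{L^1}\,
\Big(\|h\|^{3/4}_{L^1}\,\|h\|^{1/4}_{H^{k^\ddagger}}\Big)^{1/2}
 = \|h\|^{1/8}_{L^1(m^{-12})}\,\|h\|^{3/4}_{L^1}\,\|h\|^{1/8}_{H^{k^\ddagger}} .
\end{align*}
% The exponents 1/8 + 3/4 + 1/8 sum to 1, as homogeneity in h requires.
% Assumption: m \le 1, so \|h\|_{L^1} \le \|h\|_{L^1(m^{-1})} in the middle factor.
```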



Acknowledgements We thank Alexander Bobylev, José Antonio Carrillo and Cédric Villani for fruitful discussions on the inelastic Boltzmann equation.

References 1. Abrahamsson, F.: Strong L 1 convergence to equilibrium without entropy conditions for the Boltzmann equation. Comm. Part. Diff. Eqs. 24, 1501–1535 (1999) 2. Bisi, M., Carrillo, J.A., Toscani, G.: Contractive metrics for a Boltzmann equation for granular gases: diffusive equilibria. J. Stat. Phys. 118(1–2), 301–331 (2005) 3. Bisi, M., Carrillo, J.A., Toscani, G.: Decay rates in probability metrics towards homogeneous cooling states for the inelastic Maxwell model. J. Stat. Phys. 124(2–4), 625–653 (2006) 4. Blake, M.D.: A spectral bound for asymptotically norm-continuous semigroups. J. Op. Th. 45, 111–130 (2001) 5. Bobylev, A.V., Carillo, J.A., Gamba, I.: On some properties of kinetic and hydrodynamics equations for inelastic interactions. J. Stat. Phys. 98, 743–773 (2000) 6. Baranger, C., Mouhot, C.: Explicit spectral gap estimates for the linearized Boltzmann and Landau operators with hard potentials. Rev. Matem. Iberoam. 21, 819–841 (2005) 7. Bobylev, A.V., Cercignani, C., Toscani, G.: Proof of an asymptotic property of self-similar solutions of the Boltzmann equation for granular materials. J. Stat. Phys. 111, 403–417 (2003) 8. Bobylev, A.V., Gamba, I., Panferov, V.: Moment inequalities and high-energy tails for the Boltzmann equations with inelastic interactions. J. Stat. Phys. 116, 1651–1682 (2004) 9. Brilliantov, N.V., Pöschel, T.: Kinetic Theory of Granular Gases. Oxford Graduate Texts. Oxford: Oxford University Press, 2004 10. Caglioti, E., Villani, C.: Homogeneous cooling states are not always good approximations to granular flows. Arch. Rat. Mech. Anal. 163, 329–343 (2002) 11. Carleman, T.: Sur la théorie de l’équation intégrodifférentielle de Boltzmann. Acta Math. 60, 91–146 (1932) 12. Carleman, T.: Problèmes mathématiques dans la théorie cinétique des gaz. Uppsala: Almqvist and Wiksells Boktryckeri Ab, 1957 13. Cercignani, C.: Recent developments in the mechanics of granular materials. 
In: Fisica matematica e ingegneria delle strutture, Bologna: Pitagora Editrice, 1995, pp. 119–132 14. Csiszár, I.: Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Magyar Tud. Akad. Mat. Kutató Int. Közl. 8, 85–108 (1963) 15. Ernst, M.H., Brito, R.: Driven inelastic Maxwell molecules with high energy tails. Phys. Rev. E 65, 85–108 (2002) 16. Ernst, M.H., Brito, R.: Scaling solutions of inelastic Boltzmann equations with over-populated high energy tails. J. Stat. Phys. 109, 407–432 (2002) 17. Gamba, I., Panferov, V., Villani, C.: On the Boltzmann equation for diffusively excited granular media. Commun. Math. Phys. 246, 503–541 (2004) 18. Grad, H.: Asymptotic theory of the Boltzmann equation. II. In: Rarefied Gas Dynamics (Proc. 3rd Internat. Sympos., Palais de l’UNESCO, Paris, 1962), Vol. I, New York: Academic Press, 1963, pp 26–59 19. Haff, P.K.: Grain flow as a fluid-mechanical phenomenon. J. Fluid Mech. 134, 401–430 (1983) 20. Hilbert, D.: Grundzüge einer Allgemeinen Theorie der Linearen Integralgleichungen. Math. Ann. 72, (1912), New York: Chelsea Publ., 1953 21. Kato, T.: Perturbation Theory for Linear Operators. Berlin: Springer-Verlag, 1995 22. Kullback, S.: Information Theory and Statistics. New York: John Wiley, 1959 23. Mischler, S., Mouhot, C., Rodriguez Ricard, M.: Cooling process for inelastic Boltzmann equations for hard spheres, Part I: The Cauchy problem. J. Stat. Phys. 124, 655–702 (2006) 24. Mischler, S., Mouhot, C.: Cooling process for inelastic Boltzmann equations for hard spheres, Part II: Self-similar solutions and tail behavior. J. Stat. Phys. 124, 703–746 (2006) 25. Mischler, S., Mouhot, C.: Work in progress 26. Mischler, S., Wennberg, B.: On the spatially homogeneous Boltzmann equation. Ann. Inst. Henri Poincaré, Analyse Non linéaire 16, 467–501 (1999) 27. Mouhot, C.: Rate of convergence to equilibrium for the spatially homogeneous Boltzmann equation. Commun. Math. Phys. 
261, 629–672 (2006) 28. Mouhot, C., Villani, C.: Regularity theory for the spatially homogeneous Boltzmann equation with cutoff. Arch. Rat. Mech. Anal. 173, 169–212 (2004) 29. Nirenberg, L.: Topics in Nonlinear Functional Analysis. With a chapter by E. Zehnder. Notes by R. A. Artino. Lecture Notes, 1973–1974. New York: Courant Institute of Mathematical Sciences, New York University, 1974



30. Pulvirenti, A., Wennberg, B.: A Maxwellian lower bound for solutions to the Boltzmann equation. Commun. Math. Phys. 183, 145–160 (1997) 31. Villani, C.: Cercignani’s conjecture is sometimes true and always almost true. Commun. Math. Phys. 234, 455–490 (2003) 32. Yao, P.F.: On the inversion of the Laplace transform of C0 semigroups and its applications. SIAM J. Math. Anal. 26, 1331–1341 (1995) Communicated by H. Spohn




Geometric Optics and Boundary Layers for Nonlinear-Schrödinger Equations D. Chiron, F. Rousset Laboratoire J.A. Dieudonne, Université de Nice - Sophia Antipolis, Parc Valrose, 06108 Nice Cedex 02, France. E-mail: [email protected]; [email protected] Received: 11 April 2008 / Accepted: 22 November 2008 Published online: 20 February 2009 – © Springer-Verlag 2009

Abstract: We justify supercritical geometric optics in small time for the defocusing semiclassical nonlinear Schrödinger equation for a large class of not necessarily homogeneous nonlinearities. The case of a half-space with Neumann boundary condition is also studied.

1. Introduction

We consider the nonlinear Schrödinger equation in $\Omega \subset \mathbb{R}^d$,
\[ i\varepsilon\, \frac{\partial \Psi^\varepsilon}{\partial t} + \frac{\varepsilon^2}{2}\, \Delta \Psi^\varepsilon - \Psi^\varepsilon f(|\Psi^\varepsilon|^2) = 0, \qquad \Psi^\varepsilon : \mathbb{R}_+ \times \Omega \to \mathbb{C}, \tag{1} \]
with a highly oscillating initial datum of the form
\[ \Psi^\varepsilon|_{t=0} = \Psi^\varepsilon_0 = a^\varepsilon_0 \exp\Big(\frac{i}{\varepsilon}\, \varphi^\varepsilon_0\Big), \tag{2} \]
where $\varphi^\varepsilon_0$ is real-valued. We are interested in the semiclassical limit $\varepsilon \to 0$. The nonlinear Schrödinger equation (1) appears, for instance, in optics, and also as a model for Bose-Einstein condensates, with $f(\rho) = \rho - 1$, in which case the equation is termed the Gross-Pitaevskii equation, or with $f(\rho) = \rho^2$ (see [13]). Some more complicated nonlinearities are also used, especially in low dimensions, see [12].

At first, let us focus on the case $\Omega = \mathbb{R}^d$. To guess the formal limit when $\varepsilon$ goes to zero, it is classical to use the Madelung transform, i.e. to seek a solution of (1) of the form
\[ \Psi^\varepsilon = \sqrt{\rho^\varepsilon}\, \exp\Big(\frac{i}{\varepsilon}\, \varphi^\varepsilon\Big). \]

By separating real and imaginary parts and by introducing $u^\varepsilon \equiv \nabla\varphi^\varepsilon$, we can rewrite (1) as a hydrodynamical system,
\[
\begin{cases}
\partial_t \rho^\varepsilon + \nabla\cdot(\rho^\varepsilon u^\varepsilon) = 0, \\[2pt]
\partial_t u^\varepsilon + (u^\varepsilon\cdot\nabla)\, u^\varepsilon + \nabla\big(f(\rho^\varepsilon)\big) = \dfrac{\varepsilon^2}{2}\, \nabla\Big(\dfrac{\Delta\sqrt{\rho^\varepsilon}}{\sqrt{\rho^\varepsilon}}\Big).
\end{cases} \tag{3}
\]
The system (3) is a compressible Euler equation with an additional term on the right-hand side called the quantum pressure. As $\varepsilon$ tends to $0$, the quantum pressure is formally negligible and (3) reduces to the (compressible) Euler equation,
\[
\begin{cases}
\partial_t \rho + \nabla\cdot(\rho u) = 0, \\[2pt]
\partial_t u + (u\cdot\nabla)\, u + \nabla\big(f(\rho)\big) = 0.
\end{cases} \tag{4}
\]
The justification of this formal computation has received much interest recently. The case of analytic data was solved in [7]. Then, for data with Sobolev regularity and a defocusing nonlinearity, so that (4) is hyperbolic, it was noticed by Grenier [9] that it is more convenient to use the transformation
\[ \Psi^\varepsilon = a^\varepsilon \exp\Big(i\, \frac{\varphi^\varepsilon}{\varepsilon}\Big) \tag{5} \]
and to allow the amplitude $a^\varepsilon$ to be complex. By using an identification between $\mathbb{C}$ and $\mathbb{R}^2$, we can rewrite (1) as
\[
\begin{cases}
\partial_t a^\varepsilon + u^\varepsilon\cdot\nabla a^\varepsilon + \dfrac{a^\varepsilon}{2}\, \nabla\cdot u^\varepsilon = \dfrac{\varepsilon}{2}\, J\, \Delta a^\varepsilon, \\[4pt]
\partial_t u^\varepsilon + (u^\varepsilon\cdot\nabla)\, u^\varepsilon + \nabla\big(f(|a^\varepsilon|^2)\big) = 0,
\end{cases} \tag{6}
\]
where $J$ is the matrix of complex multiplication by $i$:
\[ J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \]
When $\varepsilon = 0$, we find the system
\[
\begin{cases}
\partial_t a + u\cdot\nabla a + \dfrac{a}{2}\, \nabla\cdot u = 0, \\[4pt]
\partial_t u + (u\cdot\nabla)\, u + \nabla\big(f(|a|^2)\big) = 0,
\end{cases} \tag{7}
\]
which is another form of (4), since then $(\rho \equiv |a|^2, u)$ solves (4). The convergence of (6) towards (7), provided the initial conditions suitably converge, was rigorously established by Grenier [9] in the case $f(\rho) = \rho$ (which corresponds to the cubic defocusing NLS). More precisely, it was proven in [9] that there exists $T > 0$ independent of $\varepsilon$ such that the solution of (6) is uniformly bounded in $H^s$ on $[0,T]$. In terms of the unknown $\Psi^\varepsilon$ of (1), this gives that
\[ \sup_{\varepsilon\in(0,1]}\ \sup_{[0,T]}\ \Big\| \Psi^\varepsilon \exp\Big(-i\,\frac{\varphi}{\varepsilon}\Big) \Big\|_{H^s} < +\infty \]

for every $s$, where $(a, u = \nabla\varphi)$ is the solution of (7). Furthermore, the justification of WKB expansions of the form
\[ \Psi^\varepsilon - \Big(\sum_{k=0}^{m} \varepsilon^k a^k\Big)\, e^{i\varphi/\varepsilon} = O(\varepsilon^m)\, e^{i\varphi/\varepsilon} \]
for every $m$ was performed in [9]. The main idea in the work of Grenier [9] is to use the symmetrizer
\[ S \equiv \mathrm{diag}\Big(1,\ 1,\ \frac{1}{4 f'(|a|^2)},\ \dots,\ \frac{1}{4 f'(|a|^2)}\Big) \]
of the hyperbolic system (7) to get $H^s$ energy estimates which are uniform in $\varepsilon$ for the singularly perturbed system (6). The case of nonlinearities for which $f'$ vanishes at zero (for instance the case $f(\rho) = \rho^2$) was left open in [9]. The additional difficulty is that for such nonlinearities, the system (7) is only weakly hyperbolic at $a = 0$ and in particular the symmetrizer $S$ becomes singular at $a = 0$.

In more recent works, see [1,14,19], it was proven that for every weak solution of (1) with $f(\rho) = \rho - 1$ or $f(\rho) = \rho$, the limits as $\varepsilon \to 0$,
\[ |\Psi^\varepsilon|^2 - \rho \to 0 \ \text{ in } L^\infty([0,T], L^2), \qquad \varepsilon\, \mathrm{Im}\big(\bar\Psi^\varepsilon \nabla\Psi^\varepsilon\big) - \rho u \to 0 \ \text{ in } L^\infty([0,T], L^1_{loc}) \tag{8} \]
hold under some suitable assumption on the initial data. The approach used in these papers is completely different, and relies on the modulated energy method introduced in [4]. The advantage of this powerful approach is that it makes it possible to describe the limit of weak solutions and to handle general nonlinearities once the existence of a global weak solution in the energy space for (1) is known. Nevertheless, it does not give precise qualitative information on the solution of (1). For example, it does not allow one to prove that the solution remains smooth on an interval of time independent of $\varepsilon$ if the initial data are smooth, or to justify the WKB expansion up to arbitrary orders in smooth norms.

In the work [2], the possibility of getting the same result as in [9] for pure power nonlinearities $f(\rho) = \rho^\sigma$ in the case $\Omega = \mathbb{R}^d$ was studied. It was first noticed that, thanks to the result of [15], the system
\[
\begin{cases}
\partial_t a + \nabla\varphi\cdot\nabla a + \dfrac{a}{2}\, \Delta\varphi = 0, \\[4pt]
\partial_t \varphi + \dfrac12\, |\nabla\varphi|^2 + f(|a|^2) = 0,
\end{cases} \tag{9}
\]
with the initial condition $(a,\varphi)|_{t=0} = (a_0,\varphi_0) \in H^\infty$ has a unique smooth maximal solution $(a,\varphi) \in C\big([0,T^*[,\ H^s(\mathbb{R}^d)\times H^{s-1}(\mathbb{R}^d)\big)$ for every $s$. It was then established:

Theorem 1 ([2]). Let $d \le 3$, $\sigma \in \mathbb{N}^*$ and initial data $a^\varepsilon_0$, $\varphi^\varepsilon_0 \equiv \varphi_0$ in $H^\infty$ such that, for some functions $(\varphi_0, a_0) \in H^\infty$, $\| a^\varepsilon_0 - a_0 \|_{H^s} = O(\varepsilon)$ for every $s \ge 0$. Then, there exists $T^* > 0$ such that (9) with $f(\rho) = \rho^\sigma$ has a smooth maximal solution $(a,\varphi) \in C([0,T^*[,\ H^\infty\times H^\infty)$. Moreover, there exists $T \in (0,T^*)$ independent of $\varepsilon$, such that the solution of (1), (2) remains smooth on $[0,T]$ and verifies the estimate
\[ \sup_{\varepsilon\in(0,1]} \Big\| \Psi^\varepsilon \exp\Big(-i\,\frac{\varphi}{\varepsilon}\Big) \Big\|_{L^\infty([0,T],H^s)} < +\infty, \tag{10} \]
where
• if $\sigma = 1$, then $s \in \mathbb{N}$ is arbitrary,
• if $\sigma = 2$ and $d = 1$, then one can take $s = 2$,
• if $\sigma = 2$ and $2 \le d \le 3$, then one can take $s = 1$,
• if $\sigma \ge 3$, then one can take $s = \sigma$.

As emphasized in [2], in some cases, the global existence of smooth solutions is already known for (1). For example, in the quintic case, $\sigma = 2$, global existence is known for $d \le 3$ (see [6] for the difficult critical case $d = 3$), so that only the bound (10) is interesting. Nevertheless, Theorem 1 may also be applied to cases where (1) is $H^1$ super-critical ($\sigma \ge 3$, $d = 3$ for example), and hence the fact that it is possible to construct a smooth solution on a time interval independent of $\varepsilon$ is already interesting. The main ingredient used in [2] is a subtle transformation of (1) into a perturbation of a quasilinear symmetric hyperbolic system with non smooth coefficients when $\sigma \ge 2$.

The first aim of this paper is to prove that the estimate (10) holds true for every $s$, every dimension $d$ and every nonlinearity $f$ which satisfies the following assumption:

(A) $f \in C^\infty([0,+\infty))$, $\quad f(0) = 0$, $\quad f' > 0$ on $(0,+\infty)$, $\quad \exists\, n \in \mathbb{N}^*$, $f^{(n)}(0) \neq 0$.

Note that we allow $f'$ to vanish at the origin. The assumption (A) takes into account in particular all the homogeneous polynomial nonlinearities $f(\rho) = \rho^\sigma$, but also nonlinearities of the form $f(\rho) = \rho^{\sigma_1} + \rho^{\sigma_2}$ or $f(\rho) = \frac{\rho^\sigma}{1+\rho}$, for example. Our result reads:

Theorem 2. We assume (A), and consider an initial datum (2) with $\varphi^\varepsilon_0$ real-valued, $a^\varepsilon_0$, $\varphi^\varepsilon_0$ in $H^\infty$ such that, for some real-valued functions $(\varphi_0, a_0) \in H^\infty$, we have for every $s$,
\[ \| a^\varepsilon_0 - a_0 \|_{H^s} = O(\varepsilon) \quad\text{and}\quad \| \varphi^\varepsilon_0 - \varphi_0 \|_{H^s} = O(\varepsilon). \]
Then, there exists $T^* > 0$ such that (7) with initial value $(a_0,\varphi_0)$ has a unique smooth maximal solution $(a,\varphi) \in C([0,T^*[,\ H^\infty\times H^\infty)$. Moreover, there exists $T \in (0,T^*]$ such that for every $\varepsilon \in (0,1)$, the solution $\Psi^\varepsilon$ to (1)-(2) exists at least on $[0,T]$ and satisfies for every $s$,
\[ \sup_{\varepsilon\in(0,1]} \Big\| \Psi^\varepsilon \exp\Big(-\frac{i}{\varepsilon}\,\varphi\Big) \Big\|_{L^\infty([0,T],H^s)} < +\infty. \]
More precisely, there exists $\varphi^\varepsilon = \varphi + O_{H^\infty}(\varepsilon)$ such that, for every $s$,
\[ \Big\| \Psi^\varepsilon \exp\Big(-\frac{i}{\varepsilon}\,\varphi^\varepsilon\Big) - a \Big\|_{L^\infty([0,T],H^s)} = O(\varepsilon). \tag{11} \]
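
As a quick check of assumption (A) on the model nonlinearities mentioned before Theorem 2, one can verify the three conditions by hand. The computation below is only a sketch and assumes the exponents $\sigma$, $\sigma_1 < \sigma_2$ are positive integers, which is not stated explicitly for these examples.

```latex
% Assumption: \sigma, \sigma_1, \sigma_2 \in \mathbb{N}^*, with \sigma_1 < \sigma_2.
% (a) f(\rho) = \rho^{\sigma_1} + \rho^{\sigma_2}:
\begin{align*}
f(0) = 0, \qquad
f'(\rho) = \sigma_1\rho^{\sigma_1-1} + \sigma_2\rho^{\sigma_2-1} > 0 \ \text{on } (0,+\infty), \qquad
f^{(\sigma_1)}(0) = \sigma_1! \neq 0 .
\end{align*}
% (b) f(\rho) = \rho^{\sigma}/(1+\rho):
\begin{align*}
f(0) = 0, \qquad
f'(\rho) = \frac{\rho^{\sigma-1}\big(\sigma + (\sigma-1)\rho\big)}{(1+\rho)^2} > 0 \ \text{on } (0,+\infty), \qquad
f^{(\sigma)}(0) = \sigma! \neq 0 ,
\end{align*}
% where the last identity follows from f(\rho) = \rho^\sigma (1 - \rho + \rho^2 - \dots) near \rho = 0.
% In both cases f'(0) = 0 as soon as the lowest exponent is at least 2.
```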



Let us give a few comments on the statement of Theorem 2. At first, note that Theorem 2 contains a result of local existence of smooth solutions for (9) in the case of not necessarily homogeneous nonlinearities satisfying (A). Since $(a, \nabla\varphi)$ solves a compressible type Euler equation, the case of a homogeneous nonlinearity was studied in [15], and we thus give an extension of this result to smooth nonlinearities satisfying assumption (A). A precise statement of our result with the required regularity of the initial data is given in Theorem 4 below. The new difficulty when $f$ is not homogeneous is that the nonlinear symmetrization does not seem to transform the problem into a classical symmetric or symmetrizable hyperbolic system with smooth coefficients.

The correction of order $\varepsilon$ that we have to add to the phase to get the estimate (11) is expected. Indeed, a perturbation of order $\varepsilon$ in the phase modifies the amplitude at the leading order.

Our approach to prove Theorem 2 is completely different from the one of [2] and [9]. We no longer work on the system (6) or on any reformulation of (1) as a perturbation of a quasilinear symmetric hyperbolic system, but directly on the NLS equation (1). Basically, we first prove the linear stability for (1) in arbitrary Sobolev norms of a highly oscillating solution of the form $a\, e^{i\varphi/\varepsilon}$ and then use a fixed point argument to prove the nonlinear stability. The crucial estimate of linear stability of a highly oscillating solution is given in Lemma 1 and Theorem 3. This actually makes it possible to justify WKB expansions up to arbitrary orders (see Theorem 5).

Since we deal in this paper with sufficiently smooth and in particular bounded solutions, the assumption (A) can be replaced by a local version where we assume that $f' > 0$ on $(0,\beta)$ with $\beta$ independent of $\varepsilon$, if the initial datum verifies $|a_0|^2 < \beta$. Indeed, since $a_0$ takes its values in the (weak) hyperbolic region of the limit system (7), there still exists a local smooth solution of (7) defined on $[0,T]$ for some $T > 0$, and the stability argument leading to Theorem 2 still holds. Consequently, our result can also be applied to nonlinearities like $f(\rho) = \rho^{\sigma_1} - \rho^{\sigma_2}$ for every $\sigma_2 > \sigma_1$, provided $|a_0|^2 \le \beta \ll 1$. Note that when $\sigma_2$ is too large, the classical global existence result of weak solutions (see [8]) for (1) is not valid, and hence it does not seem possible to use the modulated energy method of [1,14] to investigate the semiclassical limit.

Finally, the last advantage of our approach is that it can be easily generalized to the case of a domain with boundary and to a non-zero condition at infinity. This will be the aim of the second part of the paper. We shall restrict ourselves to a physical case, the Gross-Pitaevskii equation, i.e. $f(\rho) = \rho - 1$. The generalization to more general nonlinearities satisfying an assumption like (A) is rather straightforward. This simplifying assumption is only made to avoid the multiplication of difficulties. Again to avoid too many technicalities, we restrict ourselves to the simplest domain $\Omega = \mathbb{R}^d_+ = \mathbb{R}^{d-1}\times(0,+\infty)$. For $x \in \mathbb{R}^d_+$, we shall use the notation $x = (y,z)$, $y \in \mathbb{R}^{d-1}$, $z > 0$. We add to (1) the Neumann boundary condition
\[ \partial_z \Psi^\varepsilon(t,y,0) = 0. \tag{12} \]
We also impose the following condition at infinity:
\[ \Psi^\varepsilon(t,x) \sim \exp\Big(-it\,\frac{|u^\infty|^2}{2\varepsilon} + i\,\frac{u^\infty\cdot x}{\varepsilon}\Big), \qquad |x| \to +\infty, \]
that we can write in hydrodynamical variables
\[ |\Psi^\varepsilon(t,x)|^2 \to 1, \qquad u^\varepsilon(t,x) \to u^\infty, \qquad |x| \to +\infty, \tag{13} \]
where $u^\infty$ is a constant vector. This condition appears naturally when we study a moving obstacle in the fluid. Indeed, if we start from (1) with the Neumann boundary condition on an obstacle moving at constant velocity and fluid at rest at infinity, then we can use the Galilean invariance of (1) to transform the problem into the study of (1) in a fixed domain but with the condition (13) at infinity. This problem with such boundary conditions is physically meaningful since it can be used to describe superfluids past an obstacle (we refer to [16] for example).

The semiclassical limit $\varepsilon \to 0$ was already studied in [14] by using the modulated energy method. The limit (8) was proven with $(\rho,u)$ the solution of the compressible Euler equation with boundary condition $u\cdot n|_{\partial\Omega} = 0$, $n$ being the normal to the boundary. Note that the result of [14] is restricted to the two-dimensional case only in order to have a global solution in the energy space of (1). By using more recent results on the Cauchy problem [3], one can also get the result in the three-dimensional case, at least when $u^\infty = 0$.

Our aim here is to give a more precise description of the convergence which takes into account boundary layers. More precisely, since the solution of the Euler system (9) cannot match the Neumann boundary condition $\partial_z a(t,y,0) = 0$, a boundary layer of weak amplitude $\varepsilon$ and of size $\varepsilon$ appears. Such layers are formally described for example in [16]. WKB expansions $\Psi^\varepsilon = a^\varepsilon e^{i\varphi^\varepsilon/\varepsilon}$ are thus to be sought of the form
\[ a^\varepsilon = a^0 + \sum_{k=1}^{m} \varepsilon^k \Big( a^k(t,x) + A^k\big(t,y,\tfrac{z}{\varepsilon}\big) \Big), \qquad \varphi^\varepsilon = \varphi^0 + \sum_{k=1}^{m} \varepsilon^k \Big( \varphi^k(t,x) + \Phi^k\big(t,y,\tfrac{z}{\varepsilon}\big) \Big), \tag{14} \]
where the profiles $A^k(t,y,Z)$, $\Phi^k(t,y,Z)$ are exponentially decreasing in the $Z$ variable and are chosen such that
\[ \partial_z a^k(t,y,0) + \partial_Z A^{k+1}(t,y,0) = 0, \qquad \partial_z \varphi^k(t,y,0) + \partial_Z \Phi^{k+1}(t,y,0) = 0, \]
so that the approximate WKB expansion $\Psi^{WKB} = a^\varepsilon \exp\big(\frac{i}{\varepsilon}\varphi^\varepsilon\big)$ matches the Neumann boundary condition (12). Our result (Theorem 6) is that under suitable assumptions on the initial conditions, we have the nonlinear stability of WKB expansions: in particular we have the existence of a smooth solution for (1), (12), (13) on a time interval independent of $\varepsilon$ and the estimate
\[ \big\| \Psi^\varepsilon e^{-i\varphi^\varepsilon/\varepsilon} - a^\varepsilon \big\|_{W^{1,\infty}} \lesssim \varepsilon. \tag{15} \]
Note that it is necessary to incorporate the boundary layer $\varepsilon A^1$ in order to get (15) since its gradient has amplitude one in $L^\infty$. The case of the Dirichlet boundary condition, which is also physically meaningful (we again refer to [16]), seems more complicated to handle since, as often in boundary layer theory in fluid mechanics, the boundary layers involved have amplitude one. This is left for future work.

The paper is organized as follows. In Sect. 2, we prove the linear stability in $H^s$ of an approximate WKB solution of (1) of the form $a^\varepsilon \exp\big(i\frac{\varphi^\varepsilon}{\varepsilon}\big)$ in the case $\Omega = \mathbb{R}^d$. This is the crucial part towards the proof of Theorem 2. Next, in Sect. 3, we give the construction of a WKB expansion up to arbitrary order and give the proof of the local existence of a smooth solution for the compressible Euler equation with a pressure law satisfying (A). In Sect. 4, we give the justification of WKB expansions at every order and recover Theorem 2 as a particular case. This part uses in a classical way the linear stability result and a fixed point argument. Finally, in Sect. 5, we study the problem in the half-space with Neumann boundary condition.
2. Linear Stability

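In this section, we consider a smooth WKB approximate solution $\Psi^a = a^\varepsilon \exp\big(i\frac{\varphi^\varepsilon}{\varepsilon}\big)$ of (1) such that
\[ NLS(\Psi^a) = R^\varepsilon \exp\Big(i\,\frac{\varphi^\varepsilon}{\varepsilon}\Big), \tag{16} \]
where
\[ NLS(\Psi) \equiv i\varepsilon\, \partial_t \Psi + \frac{\varepsilon^2}{2}\, \Delta\Psi - \Psi f(|\Psi|^2). \]
Moreover, we also set
\begin{align}
R_\varphi &\equiv \partial_t \varphi^\varepsilon + \frac12\, |\nabla\varphi^\varepsilon|^2 + f(|a^\varepsilon|^2), \tag{17}\\
R_a &\equiv \partial_t a^\varepsilon + \nabla\varphi^\varepsilon\cdot\nabla a^\varepsilon + \frac12\, a^\varepsilon \Delta\varphi^\varepsilon, \tag{18}
\end{align}
so that
\[ R^\varepsilon = -a^\varepsilon R_\varphi + i\varepsilon R_a + \frac{\varepsilon^2}{2}\, \Delta a^\varepsilon. \]
Looking for an exact solution of (1) of the form
\[ \Psi^\varepsilon = \Psi^a + w\, e^{i\varphi^\varepsilon/\varepsilon} = (a^\varepsilon + w)\, e^{i\varphi^\varepsilon/\varepsilon}, \]
we find that $w$ solves the nonlinear Schrödinger equation
\[ i\varepsilon\Big( \partial_t w + u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon \Big) + \frac{\varepsilon^2}{2}\, \Delta w - 2(w, a^\varepsilon)\, f'(|a^\varepsilon|^2)\, a^\varepsilon = R_\varphi w - R^\varepsilon + Q^\varepsilon(w), \tag{19} \]
where $(\cdot,\cdot)$ stands for the real scalar product in $\mathbb{C} \simeq \mathbb{R}^2$, with $u^\varepsilon \equiv \nabla\varphi^\varepsilon$, and the nonlinear term $Q^\varepsilon(w)$ is defined by
\[ Q^\varepsilon(w) \equiv (a^\varepsilon + w)\Big( f(|a^\varepsilon + w|^2) - f(|a^\varepsilon|^2) \Big) - 2(w,a^\varepsilon)\, f'(|a^\varepsilon|^2)\, a^\varepsilon. \tag{20} \]
Of course, $R^\varepsilon$ will be very small, and $R_\varphi$ (and $R_a$) are to be thought of as small (at least $O(\varepsilon)$) for applications to nonlinear stability results. Nevertheless, in this section the exact form of these terms is not important. The way to construct an accurate WKB solution $\Psi^a$ will be explained in the next section.

Remark 1. If we work with a nonlinearity $f$ such that $f(A^2) = 0$ for some $A \in \mathbb{R}$, we can impose a non-zero condition at infinity such as $a_0 \in A + H^\infty$ and $\nabla\varphi_0 \in U^\infty + H^\infty$ for some constant vector $U^\infty \in \mathbb{R}^d$. Since we can still look for the perturbation $w$ in $H^s$, this does not affect the proofs.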
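
For readers who want to see where the expression for $R^\varepsilon$ and the equation (19) for $w$ come from, the computation can be sketched as follows. It uses only the definitions (16)-(18) and (20); the auxiliary amplitude $b$ is a notation introduced here for the sketch and does not appear in the paper.

```latex
% Shorthand: a = a^\varepsilon, \varphi = \varphi^\varepsilon, u = \nabla\varphi, so \Delta\varphi = \nabla\cdot u.
% b is a hypothetical generic smooth amplitude, used only in this sketch.
\begin{align*}
e^{-i\varphi/\varepsilon}\, NLS\big(b\, e^{i\varphi/\varepsilon}\big)
 = -\,b\Big(\partial_t\varphi + \tfrac12|\nabla\varphi|^2\Big)
   + i\varepsilon\Big(\partial_t b + u\cdot\nabla b + \tfrac12\, b\,\nabla\cdot u\Big)
   + \tfrac{\varepsilon^2}{2}\Delta b - b\, f(|b|^2).
\end{align*}
% Taking b = a and using (17)-(18):
\begin{align*}
R^\varepsilon = -\,a\,R_\varphi + i\varepsilon R_a + \tfrac{\varepsilon^2}{2}\Delta a .
\end{align*}
% Taking b = a + w, imposing NLS = 0, and writing
% \partial_t\varphi + \tfrac12|\nabla\varphi|^2 = R_\varphi - f(|a|^2):
\begin{align*}
0 &= -(a+w)R_\varphi + i\varepsilon R_a + \tfrac{\varepsilon^2}{2}\Delta a
     + i\varepsilon\Big(\partial_t w + u\cdot\nabla w + \tfrac12\, w\,\nabla\cdot u\Big)
     + \tfrac{\varepsilon^2}{2}\Delta w
     - (a+w)\big(f(|a+w|^2) - f(|a|^2)\big) \\
  &= R^\varepsilon - R_\varphi w
     + i\varepsilon\Big(\partial_t w + u\cdot\nabla w + \tfrac12\, w\,\nabla\cdot u\Big)
     + \tfrac{\varepsilon^2}{2}\Delta w
     - 2(w,a) f'(|a|^2)\, a - Q^\varepsilon(w),
\end{align*}
% which is (19) after moving R^\varepsilon, R_\varphi w and Q^\varepsilon(w) to the right-hand side.
```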

510


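Since we expect the correction term $w$ to be small, we shall only consider in this section the linearized equation
\[ i\varepsilon\, \frac{\partial w}{\partial t} + L^\varepsilon w = R_\varphi w + F^\varepsilon, \qquad x \in \mathbb{R}^d, \tag{21} \]
where the linear operator $L^\varepsilon$ is defined as
\[ L^\varepsilon(w) \equiv \frac{\varepsilon^2}{2}\, \Delta w + i\varepsilon\, u^\varepsilon\cdot\nabla w + \frac{i\varepsilon}{2}\, w\, \nabla\cdot u^\varepsilon - 2 f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\, a^\varepsilon. \]
In this section, $F^\varepsilon$ is considered as a given source term. Of course, for the proof of Theorem 2, we shall apply the result of this section to
\[ F^\varepsilon = -R^\varepsilon + Q^\varepsilon(w). \tag{22} \]
Furthermore, let us emphasize that at this stage, $R_\varphi$ is seen as a multiplicative operator with no link with the vector field $u^\varepsilon$ appearing in $L^\varepsilon$, even though we will use this lemma with $u^\varepsilon = \nabla\varphi^\varepsilon$.

We notice that $L^\varepsilon$ is formally self-adjoint, but only the first and last terms give rise to a nonnegative quadratic functional. Indeed, the quadratic form (in $H^1$) associated to the operator
\[ S^\varepsilon w \equiv -\frac{\varepsilon^2}{2}\, \Delta w + 2 f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\, a^\varepsilon \]
is, since $f' \ge 0$,
\[ \int_{\mathbb{R}^d} \big(w, S^\varepsilon w\big) = \frac12 \int_{\mathbb{R}^d} \Big( \varepsilon^2 |\nabla w|^2 + 4 f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)^2 \Big) \ge 0. \]
It is then natural to consider the (squared) norm $\int_{\mathbb{R}^d} (w, S^\varepsilon(w))$ as a good energy for the linearized equation (21). Consequently, we introduce the weighted norm
\[ N^\varepsilon(w) \equiv \frac12 \int_{\mathbb{R}^d} \Big( \varepsilon^2 |\nabla w|^2 + 4 f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)^2 + K \varepsilon^2 |w|^2 \Big) \]
for every $K > 0$ ($K$ will be chosen sufficiently large only in the next subsection). Our first result of this section is a linear stability result in the energy norm $N^\varepsilon(w)$.

Lemma 1. Assume that $u^\varepsilon : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ and $a^\varepsilon : [0,T]\times\mathbb{R}^d \to \mathbb{C}$ are smooth and such that
\[ M \equiv \| \nabla_x u^\varepsilon \|_{L^\infty([0,T]\times\mathbb{R}^d)} + \| \nabla_x(\nabla\cdot u^\varepsilon) \|_{L^\infty([0,T]\times\mathbb{R}^d)} + \big\| |a^\varepsilon|^2 \big\|_{L^\infty([0,T]\times\mathbb{R}^d)} < +\infty. \]
Let $w \in C^1([0,T], H^2)$ be a solution of (21). Then, there exists $C_M$ depending only on $d$, $f$ and $M$ such that for every $\varepsilon \in (0,1]$, the solution $w$ of (21) satisfies the energy estimate
\begin{align}
\frac{d}{dt} N^\varepsilon(w(t)) \le{}& C_M \Big( 1 + \frac1\varepsilon \| R_a(t) \|_{L^\infty} + \frac1\varepsilon \| R_\varphi(t) \|_{W^{1,\infty}} + \frac{1}{\varepsilon^2} \| R_\varphi(t) \|_{L^\infty} \Big)\, N^\varepsilon(w(t)) \notag\\
& + \| F^\varepsilon(t) \|^2_{L^2} - \frac4\varepsilon \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,(a^\varepsilon, iF^\varepsilon) + \int_{\mathbb{R}^d} \big(\varepsilon\Delta w,\ iF^\varepsilon\big). \tag{23}
\end{align}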

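Note that it is very easy to get from (23) and the Gronwall inequality a classical estimate of linear stability. Indeed, assuming that $R_a = O_{L^\infty([0,T],L^\infty)}(\varepsilon)$ and $R_\varphi = O_{L^\infty([0,T],W^{1,\infty})}(\varepsilon^2)$ (which is true if $(a^\varepsilon,\varphi^\varepsilon)$ come from the WKB method), we infer from a crude estimate for the two last terms in (23) that for $0 \le t \le T$,
\[ \frac{d}{dt} N^\varepsilon(w(t)) \le C\, N^\varepsilon(w(t)) + \frac{1}{\varepsilon^2}\, \| F^\varepsilon(t) \|^2_{H^1}, \]
which gives for $0 \le t \le T$,
\[ N^\varepsilon(w(t)) \le e^{Ct} \Big( N^\varepsilon(w(0)) + \frac{1}{\varepsilon^2} \int_0^t \| F^\varepsilon(\tau) \|^2_{H^1}\, d\tau \Big), \]
which is a more classical result of linear stability in the energy norm $N^\varepsilon(w)$ since the amplification rate $C$ is independent of $\varepsilon$. Nevertheless, to get $H^s$ estimates and the best possible nonlinear results, it is important to keep the special structure of the two last terms in (23). Modulated linearized functionals like $N^\varepsilon$ were also used in asymptotic problems in fluid mechanics, see [10] for example.

2.1. Proof of Lemma 1. The norms $L^\infty$, $W^{1,\infty}$, $L^2$, ... always stand for the norms in the $x$ variable. At first, since $S^\varepsilon$ is self-adjoint, we have
\[ \frac{d}{dt} \int_{\mathbb{R}^d} \big(S^\varepsilon w, w\big) = 2 \int_{\mathbb{R}^d} \big(S^\varepsilon w, \partial_t w\big) + 2 \int_{\mathbb{R}^d} \partial_t\big( f'(|a^\varepsilon|^2) \big)\,(w,a^\varepsilon)^2 + 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,(w,\partial_t a^\varepsilon). \tag{24} \]
Next, we use (21) to express $\partial_t w$ as
\[ \partial_t w = -\frac{i}{\varepsilon}\, S^\varepsilon w - \Big( u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon \Big) - \frac{i}{\varepsilon}\, R_\varphi w - \frac{i}{\varepsilon}\, F^\varepsilon \]
to get
\[ 2 \int_{\mathbb{R}^d} \big(S^\varepsilon w, \partial_t w\big) = 2 \int_{\mathbb{R}^d} \Big( \frac{\varepsilon^2}{2}\, \Delta w - 2 f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\, a^\varepsilon,\ \ u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon + \frac{i}{\varepsilon}\, R_\varphi w + \frac{i}{\varepsilon}\, F^\varepsilon \Big). \tag{25} \]
We shall now estimate the various terms in the right-hand side of (25). Integrating by parts, we get
\begin{align*}
\int_{\mathbb{R}^d} \Big( \varepsilon^2 \Delta w,\ \frac{i}{\varepsilon}\, R_\varphi w \Big) = -\varepsilon \int_{\mathbb{R}^d} \big( \nabla w,\ i w \nabla R_\varphi \big)
&\le \varepsilon\, \|\nabla R_\varphi\|_{L^\infty}\, \|w\|_{L^2}\, \|\nabla w\|_{L^2} \\
&\le \frac1\varepsilon\, \|R_\varphi\|_{W^{1,\infty}}\, N^\varepsilon(w).
\end{align*}
Note that we have used that $R_\varphi$ is real-valued and thus that
\[ (\nabla w,\ i R_\varphi \nabla w) = 0 \]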
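for the first equality. We also easily obtain by integration by parts that
\[ \int_{\mathbb{R}^d} \big( \varepsilon^2 \Delta w,\ w\, \nabla\cdot u^\varepsilon \big) \le C \Big( \| \nabla\cdot u^\varepsilon \|_{L^\infty} + \| \nabla(\nabla\cdot u^\varepsilon) \|_{L^\infty} \Big) \Big( \varepsilon^2 \| \nabla w \|^2_{L^2} + \varepsilon^2 \| w \|^2_{L^2} \Big) \le C_M\, N^\varepsilon(w). \]
In the proof, $C_M$ is a harmless number which changes from line to line and which depends only on $M$. In particular, it is independent of $\varepsilon$. Moreover, we can also write for $k = 1,\dots,d$,
\begin{align*}
\int_{\mathbb{R}^d} \big( \partial^2_{kk} w,\ u^\varepsilon\cdot\nabla w \big)
&= -\int_{\mathbb{R}^d} u^\varepsilon\cdot\nabla\Big( \frac{|\partial_k w|^2}{2} \Big) - \int_{\mathbb{R}^d} \big( \partial_k w,\ \partial_k u^\varepsilon\cdot\nabla w \big) \\
&= \int_{\mathbb{R}^d} \frac{|\partial_k w|^2}{2}\, \nabla\cdot u^\varepsilon - \int_{\mathbb{R}^d} \big( \partial_k w,\ \partial_k u^\varepsilon\cdot\nabla w \big),
\end{align*}
and hence, we immediately infer
\[ \int_{\mathbb{R}^d} \big( \varepsilon^2 \Delta w,\ u^\varepsilon\cdot\nabla w \big) \le C_M\, N^\varepsilon(w). \]
Furthermore, from the inequality $2ab \le a^2 + b^2$, there holds
\begin{align}
-\frac4\varepsilon \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,\big( a^\varepsilon,\ i R_\varphi w \big)
&\le \frac{C_M}{\varepsilon^2}\, \|R_\varphi\|_{L^\infty} \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)^{1/2}\, |(w,a^\varepsilon)|\ \varepsilon |w| \notag\\
&\le \frac{C_M}{\varepsilon^2}\, \|R_\varphi\|_{L^\infty} \int_{\mathbb{R}^d} \Big( f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)^2 + \varepsilon^2 |w|^2 \Big) \notag\\
&\le \frac{C_M}{\varepsilon^2}\, \|R_\varphi\|_{L^\infty}\, N^\varepsilon(w). \tag{26}
\end{align}
Consequently, we can replace (25) in (24) and use the above estimates to get
\begin{align}
\frac{d}{dt} \int_{\mathbb{R}^d} \big(S^\varepsilon w, w\big) ={}& 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon) \Big[ (w,\partial_t a^\varepsilon) - \Big( u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon,\ a^\varepsilon \Big) \Big] \notag\\
& + 2 \int_{\mathbb{R}^d} \partial_t\big( f'(|a^\varepsilon|^2) \big)\,(w,a^\varepsilon)^2 + E_1, \tag{27}
\end{align}
where $E_1$ satisfies the estimate
\[ E_1 \le C_M \Big( 1 + \frac1\varepsilon \| R_\varphi \|_{W^{1,\infty}} + \frac{1}{\varepsilon^2} \| R_\varphi \|_{L^\infty} \Big)\, N^\varepsilon(w) - \frac4\varepsilon \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,(a^\varepsilon, iF^\varepsilon) + \int_{\mathbb{R}^d} \big( \varepsilon\Delta w,\ iF^\varepsilon \big). \tag{28} \]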
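To estimate the first integral in the right-hand side of (27), we use Eq. (18) to get
\begin{align*}
& 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon) \Big[ (w,\partial_t a^\varepsilon) - \Big( u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon,\ a^\varepsilon \Big) \Big] \\
&\quad = 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon) \Big[ (w,R_a) - u^\varepsilon\cdot\nabla(w,a^\varepsilon) - (w,a^\varepsilon)\, \nabla\cdot u^\varepsilon \Big] \\
&\quad = 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,(w,R_a) - 2 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\, u^\varepsilon\cdot\nabla\big( (w,a^\varepsilon)^2 \big) - 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)^2\, \nabla\cdot u^\varepsilon \\
&\quad = 4 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,(w,R_a) + 2 \int_{\mathbb{R}^d} (w,a^\varepsilon)^2\, u^\varepsilon\cdot\nabla\big( f'(|a^\varepsilon|^2) \big) - 2 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)^2\, \nabla\cdot u^\varepsilon.
\end{align*}
To get the last line, we have integrated by parts the second integral. Note that the last term is bounded by $C_M N^\varepsilon(w)$, and, as for (26), that the first integral is bounded by $\frac{C_M}{\varepsilon} \|R_a\|_{L^\infty} N^\varepsilon(w)$. Consequently, we can replace the above identity in (27) to get
\[ \frac{d}{dt} \int_{\mathbb{R}^d} \big(S^\varepsilon w, w\big) = \int_{\mathbb{R}^d} 2\,(w,a^\varepsilon)^2\, \big( \partial_t + u^\varepsilon\cdot\nabla \big)\big( f'(|a^\varepsilon|^2) \big) + E_1 + E_2 =: I + E_1 + E_2, \tag{29} \]
where $E_2$ is such that
\[ E_2 \le C_M \Big( 1 + \frac1\varepsilon \| R_a \|_{L^\infty} \Big)\, N^\varepsilon(w). \tag{30} \]
To estimate $I$, we use again Eq. (18), which gives
\[ \big( \partial_t + u^\varepsilon\cdot\nabla \big)\big( f'(|a^\varepsilon|^2) \big) = 2 f''(|a^\varepsilon|^2)\, \big( a^\varepsilon,\ \partial_t a^\varepsilon + u^\varepsilon\cdot\nabla a^\varepsilon \big) = 2 f''(|a^\varepsilon|^2)\, \Big( R_a - \frac12\, a^\varepsilon\, \nabla\cdot u^\varepsilon,\ a^\varepsilon \Big), \]
and hence we find
\[ I \le C_M \int_{\mathbb{R}^d} |a^\varepsilon|^2\, |f''(|a^\varepsilon|^2)|\,(w,a^\varepsilon)^2 + 4 \int_{\mathbb{R}^d} |a^\varepsilon|\, |f''(|a^\varepsilon|^2)|\,(w,a^\varepsilon)^2\, |R_a|. \]
To conclude, we shall use Assumption (A). By defining $n \in \mathbb{N}^*$ as the first integer such that $f^{(n)}(0) \neq 0$, we see from Taylor expansion that
\[ f'(\rho) = \rho^{n-1}\, q(\rho) \tag{31} \]
for some smooth positive function $q$ on $[0,+\infty)$. In particular, since $q > 0$, we have
\[ \rho \mapsto \frac{\rho f''(\rho)}{f'(\rho)} = n - 1 + \rho\, \frac{q'(\rho)}{q(\rho)} \in C^\infty([0,+\infty)), \]
which implies
\[ \rho\, |f''(\rho)| \le C_M\, f'(\rho) \qquad \text{for } 0 \le \rho \le M. \tag{32} \]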
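
The identity behind (32) is a one-line differentiation of (31). The sketch below writes it out and evaluates it on the pure power law $f(\rho) = \rho^\sigma$, chosen here only as an illustration.

```latex
% From (31): f'(\rho) = \rho^{n-1} q(\rho) with q smooth and q > 0 on [0,+\infty).
\begin{align*}
f''(\rho) = (n-1)\rho^{n-2} q(\rho) + \rho^{n-1} q'(\rho)
\quad\Longrightarrow\quad
\frac{\rho f''(\rho)}{f'(\rho)} = n - 1 + \rho\,\frac{q'(\rho)}{q(\rho)} .
\end{align*}
% The right-hand side is smooth on [0,+\infty), hence bounded on [0,M]; this gives (32)
% with C_M = \sup_{[0,M]} |n - 1 + \rho q'/q| .
% Illustration (pure power, \sigma \in \mathbb{N}^*): f(\rho) = \rho^\sigma, n = \sigma, q \equiv \sigma,
\begin{align*}
\frac{\rho f''(\rho)}{f'(\rho)} = \sigma - 1 ,
\end{align*}
% so (32) holds with C_M = \sigma - 1, independently of M.
```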

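This yields
\[ \int_{\mathbb{R}^d} |a^\varepsilon|^2\, |f''(|a^\varepsilon|^2)|\,(w,a^\varepsilon)^2 \le C_M \int_{\mathbb{R}^d} (w,a^\varepsilon)^2\, f'(|a^\varepsilon|^2) \le C_M\, N^\varepsilon(w), \]
where, again, $C_M$ depends only on $M$. In a similar way, we also obtain
\begin{align*}
\int_{\mathbb{R}^d} (w,a^\varepsilon)^2\, |a^\varepsilon|\, |f''(|a^\varepsilon|^2)|\, |R_a|
&\le \| R_a \|_{L^\infty} \int_{\mathbb{R}^d} |w| \cdot |(w,a^\varepsilon)| \cdot |a^\varepsilon|^2\, |f''(|a^\varepsilon|^2)| \\
&\le \frac{C_M}{\varepsilon}\, \| R_a \|_{L^\infty} \int_{\mathbb{R}^d} (\varepsilon |w|)\, |(w,a^\varepsilon)|\, f'(|a^\varepsilon|^2) \\
&\le \frac{C_M}{\varepsilon}\, \| R_a \|_{L^\infty}\, N^\varepsilon(w).
\end{align*}
Consequently, we have proven that
\[ I \le C_M \Big( 1 + \frac1\varepsilon \| R_a \|_{L^\infty} \Big)\, N^\varepsilon(w). \tag{33} \]
To get the result of Lemma 1, it remains to perform the $L^2$ estimate. Taking the $L^2$ scalar product of (21) with $iw$ and using that
\[ \Big( w,\ u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon \Big) = \frac12\, \nabla\cdot\big( |w|^2 u^\varepsilon \big), \]
we get
\[ \frac{d}{dt} \Big( \frac{\varepsilon^2}{2}\, \|w\|^2_{L^2} \Big) = \int_{\mathbb{R}^d} \varepsilon\, (F^\varepsilon, iw) + 2\varepsilon \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,(a^\varepsilon, iw). \]
Note that we have once again used that $R_\varphi$ is real-valued and hence that $(R_\varphi w, iw) = 0$. The first integral is clearly bounded by $N^\varepsilon(w) + \| F^\varepsilon \|^2_{L^2}$, whereas for the second one, we have
\[ 2\varepsilon \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)\,(a^\varepsilon, iw) \le C_M \int_{\mathbb{R}^d} \Big( f'(|a^\varepsilon|^2)\,(w,a^\varepsilon)^2 + \varepsilon^2 |w|^2 \Big) \le C_M\, N^\varepsilon(w). \]
As a consequence, we get
\[ \frac{d}{dt} \Big( \frac{\varepsilon^2}{2}\, \| w \|^2_{L^2} \Big) \le C_M\, N^\varepsilon(w) + \| F^\varepsilon \|^2_{L^2}. \tag{34} \]
Finally, we can collect (28), (29), (30), (33) and (34) to get (23). This completes the proof.

2.2. Higher order estimates. Since our final aim is to prove Theorem 2 by a fixed point argument, we also need to have $H^s$ estimates for $s$ sufficiently large for the solution of the linear equation (21). This is the aim of the following. Note that the term $-2(w,a^\varepsilon)\, f'(|a^\varepsilon|^2)\, a^\varepsilon$ in (19) can be seen as a singular term with variable coefficients. Consequently, a crude way to get $H^s$ estimates is to apply $\varepsilon^{|\alpha|} \partial^\alpha$ to the equation,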

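the weight $\varepsilon^{|\alpha|}$ being used to compensate the singular commutator when we take the derivative of (19), and then to apply Lemma 1 to the resulting equation. Nevertheless, it is possible to avoid the loss of $\varepsilon^{|\alpha|}$ with more work by using more clever higher order modulated functionals. We set $N^\varepsilon_1 \equiv N^\varepsilon$ and, if $s \in \mathbb{N}$, $s \ge 2$, we define the following weighted norm, where $\alpha \in \mathbb{N}^d$ are multi-indices:
\begin{align}
N^\varepsilon_s(w) &\equiv \sum_{|\alpha|\le s-1} N^\varepsilon(\partial^\alpha w) + K\, \|\mathrm{Re}\, w\|^2_{H^{s-2}} \notag\\
&= \frac12\, \varepsilon^2 \|\nabla w\|^2_{H^{s-1}} + 2 \sum_{|\alpha|\le s-1} \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(\partial^\alpha w, a^\varepsilon)^2 + \frac{K}{2}\, \varepsilon^2 \|w\|^2_{H^{s-1}} + K\, \|\mathrm{Re}\, w\|^2_{H^{s-2}}. \tag{35}
\end{align}
In this section, we shall use that $a^\varepsilon = a^0 + \varepsilon a^r$ with $a^0$ real-valued and
\[ \sup_{\varepsilon\in(0,1]} \| a^r \|_{L^\infty([0,T],W^{s,\infty})} \le C. \]
Note that this allows us to write
\[ \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(\partial^\alpha w, a^\varepsilon)^2 \ge \frac12 \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(a^0)^2\, |\mathrm{Re}\, \partial^\alpha w|^2 - C \varepsilon^2\, \|\mathrm{Re}\, \partial^\alpha w\|^2_{L^2}, \]
and hence by choosing $K$ sufficiently large ($K > C$) we get the lower bound
\[ N^\varepsilon_s(w) \ge \frac12 \Big( \sum_{|\alpha|\le s-1} N^\varepsilon\big(\partial^\alpha_x w\big) + \sum_{|\alpha|\le s-1} \int_{\mathbb{R}^d} f'(|a^\varepsilon|^2)\,(a^0)^2\, |\mathrm{Re}\, \partial^\alpha w|^2\, dx \Big). \tag{36} \]
Note that we also have the equivalence of norms:
\[ \| w \|^2_{H^s} \le \frac{2}{\varepsilon^2}\, N^\varepsilon_s(w), \qquad N^\varepsilon_s(w) \le C\big(|a^\varepsilon|_{W^{s-1,\infty}}\big) \Big( \| w \|^2_{H^s} + \|\mathrm{Re}\, w\|^2_{H^{s-2}} \Big). \tag{37} \]
The main result of this section is: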

Theorem 3. Let 0 < T < ∞, s ∈ N∗ , f satisfying (A) and w ∈ C 1 ([0, T ], H s ) a solution of (21) with u ε : [0, T ] × Rd → Rd and a ε : [0, T ] × Rd → C such that M ≡ sup || u ε || L ∞ ([0,T ],W s+1,∞ (Rd )) + || a ε || L ∞ ([0,T ],W s,∞ (Rd )) < +∞. 0 0 in [0, +∞) (by (A)). In this case, (55) is symmetrizable (with the symmetrizer S = diag 1, 4 f 1(a 2 ) , . . . , 4 f 1(a 2 ) used in [9]) and the local existence and uniqueness for (55) follows easily. Proof of Theorem 4. The first step is to rewrite the system by using more convenient unknowns. At first, we notice that thanks to (A), we can write f under the form f (ρ) = ρ n f˜(ρ), with f˜ smooth on [0, +∞) and such that f˜(0) = 0. Next, since we have by assumption f (0) = 0 and f (ρ) > 0 for ρ = 0, we also have that f (ρ) > 0 for ρ > 0. This implies that f˜(ρ) > 0 for ρ ≥ 0. This allows to define a smooth function h on R by 1 2n . h(a) ≡ a f˜(a 2 )

(56)


521

Note that h(a) = 0 for a = 0. It is useful to notice that we can also write h under the form 1

h(a) = sgn(a) f (a 2 ) 2n , and hence that we have h(a)2n = f (a 2 ),

a ∈ R.

Furthermore, since $\tilde f(0) > 0$ and $f' > 0$ in $(0,+\infty)$, we deduce that $h'(a) > 0$ for $a \neq 0$ and that $h'(0) = \tilde f(0)^{\frac{1}{2n}} > 0$, so that $h' > 0$ on R. Thus h is a smooth diffeomorphism from R to h(R). In particular, this allows us to define a smooth positive function c on h(R) such that
$$\frac12\, a\, h'(a) = h(a)\, c(h(a)), \qquad \forall a \in \mathbb{R}.$$
With this definition, (h, u), with $h \equiv h(a)$, solves the system
$$\begin{cases} \partial_t h + u\cdot\nabla h + h\,c(h)\,\nabla\cdot u = 0 \\ \partial_t u + u\cdot\nabla u + \nabla\big(h^{2n}\big) = 0. \end{cases} \tag{57}$$
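As a sanity check on this change of unknown, the following computation sketches how the system for (h, u) is obtained. It assumes that (55) has the same form as the Euler-type system (78) below (transport of a with the factor a/2, and pressure term f(a²)); the homogeneous nonlinearity $f(\rho) = \rho^n$ in the last display is an illustrative choice, not one made in the text.

```latex
% Assumed form of (55), cf. (78):
%   \partial_t a + u\cdot\nabla a + \tfrac{a}{2}\,\nabla\cdot u = 0,
%   \partial_t u + u\cdot\nabla u + \nabla f(a^2) = 0.
\begin{align*}
% multiply the first equation by h'(a) and use the chain rule
\partial_t h(a) + u\cdot\nabla h(a) + \tfrac12\, a\, h'(a)\,\nabla\cdot u &= 0,\\
% insert the definition of c: (1/2) a h'(a) = h(a) c(h(a))
\partial_t h + u\cdot\nabla h + h\,c(h)\,\nabla\cdot u &= 0,\\
% second equation, using h(a)^{2n} = f(a^2)
\partial_t u + u\cdot\nabla u + \nabla\big(h^{2n}\big) &= 0.
\end{align*}
% Illustrative example (assumption): f(\rho) = \rho^n, so that \tilde f \equiv 1.
\begin{align*}
h(a) = a, \qquad c(h) = \frac{a\,h'(a)}{2\,h(a)} = \frac12, \qquad h^{2n} = a^{2n} = f(a^2).
\end{align*}
```

In this homogeneous example c is constant, which is the situation in which the symmetrization of [15] applies directly.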

Since a is in $H^s$ if and only if h is in $H^s$, we shall prove local existence of a smooth solution for the weakly hyperbolic system (57). As we shall see below, the nonlinear symmetrization method of [15] does not allow us to reduce (57) to a symmetric or symmetrizable system with smooth coefficients except in the case where $c(h) = \tilde c(h^n)$ for some smooth map $\tilde c$. Nevertheless, it will still be possible to use the same idea to prove the existence of an energy estimate with loss for the system (57). In such a situation, the simplest way to construct a solution is to use the vanishing viscosity method. Indeed, this approximation method preserves the nonlinear energy estimate verified by (57). We thus consider for $\epsilon > 0$ the system
$$\begin{cases} \partial_t h_\epsilon + u_\epsilon\cdot\nabla h_\epsilon + h_\epsilon\, c(h_\epsilon)\,\nabla\cdot u_\epsilon = \epsilon\,\Delta h_\epsilon \\ \partial_t u_\epsilon + u_\epsilon\cdot\nabla u_\epsilon + \nabla\big(h_\epsilon^{2n}\big) = \epsilon\,\Delta u_\epsilon. \end{cases} \tag{58}$$
The local existence of smooth solutions for this parabolic system is very easy to obtain. Moreover, we note that $h_\epsilon$ remains nonnegative if the initial datum $(h_\epsilon)_{|t=0}$ is nonnegative. In the following, we shall only prove an $H^s$ energy estimate independent of $\epsilon$ for this system, which ensures that the solution remains smooth on an interval of time independent of $\epsilon$. The final step, which consists in using the uniform bounds to pass to the limit when $\epsilon$ goes to zero to get a solution of (57), is very classical and hence will not be detailed. In the proof of the energy estimates, we shall omit the subscript $\epsilon$ for notational convenience.

As in the work of [15], we introduce the unknown $H \equiv h^n = a^n \tilde f(a^2)^{\frac12}$. Note that by definition of h, H is in $H^s$ as soon as a is in $H^s$. We get for (H, u) the system
$$\begin{cases} \partial_t H + u\cdot\nabla H + n\,H\,c(h)\,\nabla\cdot u = \epsilon\, n\, h^{n-1}\Delta h = \epsilon\,\Delta H - \epsilon\, n(n-1)\,h^{n-2}|\nabla h|^2 \\ \partial_t u + u\cdot\nabla u + 2H\,\nabla H = \epsilon\,\Delta u. \end{cases} \tag{59}$$
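The right-hand side of the equation for H comes from a short chain-rule computation, spelled out here as a sketch. The symbol $\epsilon$ for the viscosity coefficient is an assumption of this sketch (the coefficient's symbol is not legible in the available text); the identities themselves only use $H = h^n$.

```latex
% H = h^n; \epsilon denotes the viscosity coefficient (symbol assumed).
\begin{align*}
\nabla H &= n\,h^{n-1}\nabla h,\\
\Delta H &= \nabla\cdot\big(n\,h^{n-1}\nabla h\big)
          = n\,h^{n-1}\Delta h + n(n-1)\,h^{n-2}|\nabla h|^2,\\
\epsilon\,n\,h^{n-1}\Delta h &= \epsilon\,\Delta H - \epsilon\,n(n-1)\,h^{n-2}|\nabla h|^2,\\
% pressure term: h^{2n} = H^2
\nabla\big(h^{2n}\big) &= \nabla\big(H^2\big) = 2H\,\nabla H.
\end{align*}
```

The extra term $n(n-1)h^{n-2}|\nabla h|^2$ is the one that later has to be controlled through the $H^{s-1}$ norm of $\nabla h$.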



Note that it does not seem possible to get a classical hyperbolic symmetric system (in the case $\epsilon = 0$) involving only H and u as in the case of homogeneous pressure laws considered in [15]. Indeed, the coefficient $c(h) = c(H^{\frac1n})$ is not (in general) a smooth function of H. Nevertheless, it will be possible to prove that the system with unknowns (h, H, u), though only weakly hyperbolic (when $\epsilon = 0$), satisfies an energy estimate. We notice that the symmetrizer
$$S \equiv \mathrm{diag}\Big(1, \frac n2\, c(h)\,\mathrm{Id}\Big),$$
which is positive since c(h) is positive, symmetrizes the first order part of (59). We shall first perform an $H^s$ energy estimate (s > 2 + d/2) on (59), but we have to track carefully the dependence on h in the energy estimates. To prove our $H^s$ energy estimate, we shall make extensive use of the following classical (see [18] for example) tame estimates
$$\|fg\|_{H^k} \le C_k\big(\|f\|_{L^\infty}\|g\|_{H^k} + \|f\|_{H^k}\|g\|_{L^\infty}\big), \tag{60}$$
$$\|\partial^\alpha(fg) - f\,\partial^\alpha g\|_{L^2} \le C_k\big(\|f\|_{H^k}\|g\|_{L^\infty} + \|\nabla f\|_{L^\infty}\|g\|_{H^{k-1}}\big), \quad |\alpha| \le k, \tag{61}$$
$$\|F(u)\|_{H^k} \le C(\|u\|_{L^\infty})\,(1 + \|u\|_{H^k}) \tag{62}$$

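if F is smooth and such that F(0) = 0. First, we notice that $(\partial^\alpha H, \partial^\alpha u)$ for $|\alpha| \le s$ solves the system
$$\begin{cases} \partial_t\partial^\alpha H + u\cdot\nabla\partial^\alpha H + n\,H\,c(h)\,\nabla\cdot\partial^\alpha u = \epsilon\,\Delta\partial^\alpha H - \epsilon\,n(n-1)\,\partial^\alpha\big(h^{n-2}|\nabla h|^2\big) - [\partial^\alpha, u]\cdot\nabla H - n\,[\partial^\alpha, H c(h)]\,\nabla\cdot u \\ \partial_t\partial^\alpha u + u\cdot\nabla\partial^\alpha u + 2H\,\nabla\partial^\alpha H = \epsilon\,\Delta\partial^\alpha u - [\partial^\alpha, u]\cdot\nabla u - [\partial^\alpha, 2H]\,\nabla H. \end{cases}$$
By using (61) to estimate in $L^2$ the commutators in the right-hand side, we get in a classical way by integration by parts
$$\frac{d}{dt}\,\frac12\int_{\mathbb{R}^d}\Big(|\partial^\alpha H|^2 + \frac n2\,c(h)\,|\partial^\alpha u|^2\Big) + \epsilon\int_{\mathbb{R}^d}\Big(|\nabla\partial^\alpha H|^2 + \frac n2\,c(h)\,|\nabla\partial^\alpha u|^2\Big) \le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\,\|V\|_{H^s}^2 + C^\alpha + D^\alpha + R^\alpha, \tag{63}$$
where $V \equiv (H, u)$, $C_0$ is a non-decreasing function depending only on f, s and d, and
$$C^\alpha \equiv -n\int_{\mathbb{R}^d}(\partial^\alpha H)\,[\partial^\alpha, H c(h)]\,(\nabla\cdot u),$$
$$D^\alpha \equiv -\epsilon\,\frac n2\int_{\mathbb{R}^d} c'(h)\,(\nabla h\cdot\nabla)\partial^\alpha u\cdot\partial^\alpha u - \epsilon\,n(n-1)\int_{\mathbb{R}^d}\partial^\alpha\big(h^{n-2}|\nabla h|^2\big)\,\partial^\alpha H,$$
$$R^\alpha \equiv \frac n4\int_{\mathbb{R}^d} c'(h)\,\partial_t h\,|\partial^\alpha u|^2.$$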

We have singled out the three terms above since they are the ones involving h which must be estimated with care. Note that the estimate of C α will be crucial since this



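term involves high order derivatives of h. Next, we can integrate (63) in time, sum the estimates for $|\alpha| \le s$ and use that $c(h) > 0$, hence $n\,c(h)/2 \ge 1/C_1(\|h\|_{L^\infty})$, to obtain
$$\|V(t)\|_{H^s}^2 + \epsilon\int_0^t\|\nabla V(\tau)\|_{H^s}^2\,d\tau \le C_1(\|h\|_{L^\infty})\Big(\|V(0)\|_{H^s}^2 + \int_0^t\Big(C_0\big(\|(h,u)(\tau)\|_{W^{1,\infty}}\big)\|V(\tau)\|_{H^s}^2 + C(\tau) + D(\tau) + R(\tau)\Big)d\tau\Big), \tag{64}$$
with
$$C \equiv \sum_{|\alpha|\le s} C^\alpha, \qquad D \equiv \sum_{|\alpha|\le s} D^\alpha, \qquad R \equiv \sum_{|\alpha|\le s} R^\alpha.$$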

Estimate for C. We claim that
$$C \le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\Big(\|V\|_{H^s}^2 + \|h\|_{H^{s-1}}^2\Big). \tag{65}$$

The crucial point is that this estimate only involves the $H^{s-1}$ norm of h. This will allow us to conclude by using that, for the first equation in (58), the $H^{s-1}$ norm of h is controlled by the $H^s$ norm of u. By using the commutator estimate (61), we have
$$C \le C\,\|H\|_{H^s}\Big(\|H c(h)\|_{H^s}\,\|\nabla\cdot u\|_{L^\infty} + \|\nabla(H c(h))\|_{L^\infty}\,\|\nabla\cdot u\|_{H^{s-1}}\Big) \le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\Big(\|V\|_{H^s}^2 + \|H\|_{H^s}\,\|H c(h)\|_{H^s}\Big).$$
To estimate the last term, we use that $H = h^n$, which yields $h\,\partial_i H = n\,H\,\partial_i h$, thus
$$\partial_i\big(H c(h)\big) = c(h)\,\partial_i H + c'(h)\,H\,\partial_i h = c(h)\,\partial_i H + \frac1n\,c'(h)\,h\,\partial_i H.$$
Consequently, by (60), (62), we get
$$\|H c(h)\|_{H^s} \le C\,\|c(h)\nabla H\|_{H^{s-1}} + C\,\|c'(h)\,h\,\nabla H\|_{H^{s-1}} \le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\Big(\|H\|_{H^s} + \|h\|_{H^{s-1}}\Big),$$

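and (65) follows.

Estimate for D. The term D involves derivatives of u of order ≤ s + 1, and we shall use the energy dissipation in (63). We prove that
$$C_1(\|h\|_{L^\infty})\,D \le \frac{\epsilon}{2}\,\|\nabla V\|_{H^s}^2 + C_0(\|h\|_{W^{1,\infty}})\Big(\|V\|_{H^s}^2 + \epsilon\,\|\nabla h\|_{H^{s-1}}^2\Big). \tag{66}$$
We have, on the one hand,
$$\Big|\int_{\mathbb{R}^d} c'(h)\,(\nabla h\cdot\nabla)\partial^\alpha u\cdot\partial^\alpha u\Big| \le C_0(\|h\|_{W^{1,\infty}})\,\|\nabla u\|_{H^s}\,\|u\|_{H^s} \le C_0(\|h\|_{W^{1,\infty}})\,\|\nabla V\|_{H^s}\,\|V\|_{H^s}.$$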



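On the other hand, for the second term (which vanishes if n = 1), after one integration by parts when $|\alpha| > 0$, we get
$$n(n-1)\Big|\int_{\mathbb{R}^d}\partial^\alpha\big(h^{n-2}|\nabla h|^2\big)\,\partial^\alpha H\Big| \le C\,\|\nabla H\|_{H^s}\,\big\|h^{n-2}|\nabla h|^2\big\|_{H^{s-1}} \le C_0(\|h\|_{W^{1,\infty}})\,\|\nabla H\|_{H^s}\,\|\nabla h\|_{H^{s-1}},$$
and if $\alpha = 0$, since $H = h^n$ and $s \ge 1$,
$$n(n-1)\int_{\mathbb{R}^d} h^{n-2}|\nabla h|^2\,H = \frac{n-1}{n}\int_{\mathbb{R}^d}|\nabla H|^2 \le C\,\|H\|_{H^s}^2.$$
Consequently,
$$D \le \epsilon\,C_0(\|h\|_{W^{1,\infty}})\,\|\nabla V\|_{H^s}\Big(\|V\|_{H^s} + \|\nabla h\|_{H^{s-1}}\Big) + \epsilon\,C\,\|V\|_{H^s}^2,$$
and (66) follows from the standard inequality, for a, b, θ > 0,
$$ab \le \theta a^2 + \frac{b^2}{4\theta}.$$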
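For completeness, this inequality follows from completing the square, as sketched below. The second line is only an illustration of how such a product is split so that one factor can be absorbed by a dissipation term; the names K, X, Y are placeholders and not notation from the text.

```latex
\begin{align*}
% completing the square, valid for any \theta > 0
0 \le \Big(\sqrt{\theta}\,a - \frac{b}{2\sqrt{\theta}}\Big)^2
  &= \theta a^2 - ab + \frac{b^2}{4\theta}
  \quad\Longrightarrow\quad
  ab \le \theta a^2 + \frac{b^2}{4\theta},\\
% illustrative use with a product K X Y (K, X, Y \ge 0 are placeholder names)
K\,X\,Y &\le \frac12\,X^2 + \frac{K^2}{2}\,Y^2
  \qquad \big(a = X,\ b = K\,Y,\ \theta = \tfrac12\big).
\end{align*}
```

With X playing the role of the gradient norm, the term $\frac12 X^2$ is the one that cancels against the dissipation, while the constant K is moved onto the lower-order factor.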

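Estimate for R. We prove that
$$C_1(\|h\|_{L^\infty})\,R \le \frac{\epsilon}{2}\,\|\nabla V\|_{H^s}^2 + C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\,\|V\|_{H^s}^2. \tag{67}$$
By using the first equation in (58) for $\partial_t h$ and an integration by parts, we find, as for the first term in D,
$$\begin{aligned} R^\alpha &\le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\,\|V\|_{H^s}^2 + \epsilon\,\frac n4\int_{\mathbb{R}^d} c'(h)\,\Delta h\,|\partial^\alpha u|^2 \\ &= C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\,\|V\|_{H^s}^2 - \epsilon\,\frac n2\int_{\mathbb{R}^d} c'(h)\,(\nabla h\cdot\nabla)\partial^\alpha u\cdot\partial^\alpha u - \epsilon\,\frac n4\int_{\mathbb{R}^d} c''(h)\,|\nabla h|^2\,|\partial^\alpha u|^2 \\ &\le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\Big(\|V\|_{H^s}^2 + \epsilon\,\|\nabla V\|_{H^s}\,\|V\|_{H^s}\Big). \end{aligned}$$
Then, (67) follows as above from the inequality $ab \le \theta a^2 + \frac{b^2}{4\theta}$. Summing (65), (66) and (67), inserting this into (64) and cancelling the terms $\epsilon\,\|\nabla V\|_{H^s}^2$, we infer
$$\|V(t)\|_{H^s}^2 \le C_1(\|h(t)\|_{L^\infty})\Big(\|V(0)\|_{H^s}^2 + \int_0^t C_0\big(\|(h,u)(\tau)\|_{W^{1,\infty}}\big)\Big(\|V(\tau)\|_{H^s}^2 + \|h(\tau)\|_{H^{s-1}}^2 + \epsilon\,\|\nabla h(\tau)\|_{H^{s-1}}^2\Big)d\tau\Big). \tag{68}$$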

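To close the estimate, it remains to evaluate $\|h\|_{H^{s-1}}^2$ and $\epsilon\int_0^t\|\nabla h\|_{H^{s-1}}^2$. We use the standard $H^{s-1}$ estimate for the convection-diffusion equation (58) which yields, as for (63), for $|\alpha| \le s-1$,
$$\frac{d}{dt}\,\frac12\int_{\mathbb{R}^d}|\partial^\alpha h|^2 + \epsilon\int_{\mathbb{R}^d}|\nabla\partial^\alpha h|^2 \le C_0\big(\|(h,u)\|_{W^{1,\infty}}\big)\Big(\|h\|_{H^{s-1}}^2 + \|h\|_{H^{s-1}}\,\|u\|_{H^s}\Big).$$
Summing for $|\alpha| \le s-1$ and integrating in time, this yields
$$\frac12\,\|h(t)\|_{H^{s-1}}^2 + \epsilon\int_0^t\|\nabla h(\tau)\|_{H^{s-1}}^2\,d\tau \le \frac12\,\|h(0)\|_{H^{s-1}}^2 + \int_0^t C_0\big(\|(h,u)(\tau)\|_{W^{1,\infty}}\big)\Big(\|V(\tau)\|_{H^s}^2 + \|h(\tau)\|_{H^{s-1}}^2\Big)d\tau. \tag{69}$$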

Finally, we can combine (68) and (69), to get
$$\|V(t)\|_{H^s}^2 + \|h(t)\|_{H^{s-1}}^2 \le C_0\big(\|(h,u)\|_{L^\infty([0,t],W^{1,\infty})}\big)\Big(\|V(0)\|_{H^s}^2 + \|h(0)\|_{H^{s-1}}^2 + \int_0^t\Big(\|V(\tau)\|_{H^s}^2 + \|h(\tau)\|_{H^{s-1}}^2\Big)d\tau\Big). \tag{70}$$

Since $H^{s-1}$ is embedded in $W^{1,\infty}$ for s > 2 + d/2, we easily get by classical continuation arguments and the Gronwall lemma that the solution of (58) is defined on an interval of time [0, T) independent of $\epsilon$. Finally, (70) provides a uniform bound for (h, H, u) in $H^{s-1} \times H^s \times H^s$, which allows us to prove in a classical way that $(h_\epsilon, u_\epsilon)$ converges towards a solution of (57). This ends the proof of the existence of a solution. To prove the uniqueness, it suffices to use the same method as above and perform an $L^2$ energy estimate on the system satisfied by $h_1 - h_2$, $u_1 - u_2$, $H_1 - H_2$. This is left to the reader.

3.2. WKB expansions. We now turn to the construction of WKB expansions up to arbitrary order. Let us first notice that in Theorem 4, if the initial datum $(a_0, u_0)$ is in $H^\infty \times H^\infty$, then the solution (a, u) is in $C^0([0, T], H^{s-1} \times H^s)$ for every s > 2 + d/2, with T independent of s > 2 + d/2. In other words, the existence time of the maximal solution in $H^\infty \times H^\infty$ is positive. This fact follows easily from (70) and the Gronwall inequality (since $H^{s-1} \subset W^{1,\infty}$).

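Lemma 2. Consider $\Psi_0^\varepsilon = a_0^\varepsilon e^{i\varphi_0^\varepsilon/\varepsilon}$ with $a_0^\varepsilon \in H^\infty$, $\varphi_0^\varepsilon \in H^\infty$ and assume that for some m ∈ N, there exists an expansion
$$a_0^\varepsilon = \sum_{k=0}^m \varepsilon^k a_0^k + \varepsilon^{m+1}\,\tilde a_0^\varepsilon, \qquad \varphi_0^\varepsilon = \sum_{k=0}^m \varepsilon^k \varphi_0^k + \varepsilon^{m+1}\,\tilde\varphi_0^\varepsilon \tag{71}$$
with $a_0^0 \in \mathbb{R}$, $a_0^k, \varphi_0^k \in H^\infty$, satisfying, for every s,
$$\sup_{\varepsilon\in(0,1)}\Big(\|\tilde a_0^\varepsilon\|_{H^s} + \|\tilde\varphi_0^\varepsilon\|_{H^s}\Big) < +\infty. \tag{72}$$
Let us denote by $0 < T^* \le +\infty$ the existence time of the maximal smooth (i.e. $H^\infty \times H^\infty$) solution $(a^0, \varphi^0)$ for (55) with the initial condition $(a_0^0, \varphi_0^0)$. Then, there exists an approximate smooth solution of (1) on $[0, T^*)$ of the form $\Psi^a = a^\varepsilon e^{i\varphi^\varepsilon/\varepsilon}$, with $a^\varepsilon, \varphi^\varepsilon \in H^\infty$ and $a^\varepsilon$ complex-valued, solving
$$\begin{cases} \dfrac{\partial\varphi^\varepsilon}{\partial t} + f(|a^\varepsilon|^2) + \dfrac12\,|\nabla\varphi^\varepsilon|^2 = R_\varphi^m \\[2mm] \dfrac{\partial a^\varepsilon}{\partial t} + \nabla\varphi^\varepsilon\cdot\nabla a^\varepsilon + \dfrac{a^\varepsilon}{2}\,\Delta\varphi^\varepsilon - \dfrac{i\varepsilon}{2}\,\Delta a^\varepsilon = R_a^m, \end{cases} \tag{73}$$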



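with the initial condition $(a^\varepsilon, \varphi^\varepsilon)_{/t=0} = (a_0^\varepsilon, \varphi_0^\varepsilon)$, and where, for every s and $0 < T < T^*$,
$$\sup_{[0,T]}\Big(\|R_a^m\|_{H^s} + \|R_\varphi^m\|_{H^s}\Big) \le C_{s,T}\,\varepsilon^{m+2}. \tag{74}$$
Finally, for $0 < T < T^*$, $a^\varepsilon$ verifies (38): $a^\varepsilon - a^0 = O(\varepsilon)$ in $L^\infty([0, T], W^{s,\infty})$.

Note that $\Psi^a$ is indeed an approximate solution of (1) since
$$i\varepsilon\,\frac{\partial\Psi^a}{\partial t} + \frac{\varepsilon^2}{2}\,\Delta\Psi^a - \Psi^a f(|\Psi^a|^2) = \Big(i\varepsilon\,R_a^m - a^\varepsilon R_\varphi^m\Big)\exp\Big(i\,\frac{\varphi^\varepsilon}{\varepsilon}\Big).$$
By using the notation of Sect. 2, we have $R^\varepsilon = -i\varepsilon\,R_a^m + a^\varepsilon R_\varphi^m$, hence
$$\sup_{[0,T]}\|R^\varepsilon\|_{H^s} \le C_s\,\varepsilon^{m+2}. \tag{75}$$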

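Proof. As in [9], we look for expansions
$$a^\varepsilon = \sum_{k=0}^m \varepsilon^k a^k + \varepsilon^{m+1} a^{m+1}, \qquad \varphi^\varepsilon = \sum_{k=0}^m \varepsilon^k \varphi^k + \varepsilon^{m+1}\varphi^{m+1}.$$
This yields that $(a^0, \varphi^0)$ solves the nonlinear system
$$\begin{cases} \dfrac{\partial\varphi^0}{\partial t} + f(|a^0|^2) + \dfrac12\,|\nabla\varphi^0|^2 = 0 \\[2mm] \dfrac{\partial a^0}{\partial t} + \nabla\varphi^0\cdot\nabla a^0 + \dfrac{a^0}{2}\,\Delta\varphi^0 = 0, \end{cases} \tag{76}$$
which is just (9), and that for $1 \le k \le m$, $(a^k, \varphi^k)$ solves the linear system
$$\begin{cases} \dfrac{\partial\varphi^k}{\partial t} + 2 f'(|a^0|^2)\,(a^0, a^k) + \nabla\varphi^0\cdot\nabla\varphi^k = S_\varphi^k \\[2mm] \dfrac{\partial a^k}{\partial t} + \nabla\varphi^0\cdot\nabla a^k + \nabla a^0\cdot\nabla\varphi^k + \dfrac{a^0}{2}\,\Delta\varphi^k + \dfrac{a^k}{2}\,\Delta\varphi^0 = S_a^k, \end{cases} \tag{77}$$
where the source terms $(S_\varphi^k, S_a^k)$ depend only on $(a^j, \varphi^j)_{0\le j\le k-1}$, and $S_a^k$ is complex-valued.

We first solve (76) (that is (9)) with the initial condition $\varphi^0_{/t=0} = \varphi_0^0$, $a^0_{/t=0} = a_0^0$. By introducing $u^0 \equiv \nabla\varphi^0$ and by taking the gradient of the first equation of (76), we find
$$\begin{cases} \partial_t a^0 + u^0\cdot\nabla a^0 + \dfrac{a^0}{2}\,\nabla\cdot u^0 = 0 \\[2mm] \partial_t u^0 + u^0\cdot\nabla u^0 + \nabla\big(f((a^0)^2)\big) = 0, \end{cases} \tag{78}$$
which is the compressible Euler type equation considered in the previous section. By using Theorem 4, we get the existence of a smooth solution $(a^0, u^0) \in H^{s-1} \times H^s$ for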



every s on $[0, T^*)$ (with $T^*$ independent of s), with $a^0$ real-valued. Finally, to get $\varphi^0$, it is natural to set
$$\varphi^0(t, x) = \varphi_0^0(x) - \int_0^t\Big(f\big((a^0)^2\big) + \frac12\,|u^0|^2\Big)(\tau, x)\,d\tau,$$
and the same argument as in [2] yields $u^0 = \nabla\varphi^0$.

We now turn to the resolution of (77). We solve it with the initial condition $(\varphi^k, a^k)_{/t=0} = (\varphi_0^k, a_0^k)$. By introducing again $u^k \equiv \nabla\varphi^k$, we can take the gradient in the first line of (77) to get
$$\begin{cases} \partial_t a^k + u^0\cdot\nabla a^k + \dfrac{a^0}{2}\,\nabla\cdot u^k + u^k\cdot\nabla a^0 + \dfrac{a^k}{2}\,\nabla\cdot u^0 = S_a^k, \\[2mm] \partial_t u^k + u^0\cdot\nabla u^k + \nabla\big(a^0, f'((a^0)^2)\,a^k\big) + u^k\cdot\nabla u^0 = \nabla S_\varphi^k. \end{cases} \tag{79}$$
Again, since $f'((a^0)^2)$ can vanish, the symmetrization of this linear hyperbolic system requires some care. We thus set
$$F^k(t, x) \equiv \begin{cases} \sqrt2\, f'\big((a^0)^2\big)^{\frac12}\, a^k & \text{if } n \text{ is odd} \\[1mm] \sqrt2\, \dfrac{a^0}{\sqrt{(a^0)^2}}\, f'\big((a^0)^2\big)^{\frac12}\, a^k & \text{if } n \text{ is even.} \end{cases}$$
Note that in both cases, we have
$$F^k(t, x) = \sqrt2\, g(a^0)\, a^k$$
with g smooth. Indeed, as we have seen, we can write $f'(\rho) = \rho^{n-1} q(\rho)$ with q smooth and positive, and we have in both cases:
$$g(a^0) = (a^0)^{n-1}\, q\big((a^0)^2\big)^{\frac12}. \tag{80}$$

This is the natural generalization of the change of unknown used in [2]. Then, thanks to the equation on $a^0$, we get for $(F^k, u^k)$ the system
$$\begin{cases} \partial_t F^k + u^0\cdot\nabla F^k + \dfrac{1}{\sqrt2}\, a^0 g(a^0)\,\nabla\cdot u^k + \sqrt2\, g(a^0)\, u^k\cdot\nabla a^0 + \dfrac{F^k}{2}\Big(1 + \dfrac{a^0 g'(a^0)}{g(a^0)}\Big)\nabla\cdot u^0 = \sqrt2\, g(a^0)\, S_a^k \\[2mm] \partial_t u^k + u^0\cdot\nabla u^k + \dfrac{1}{\sqrt2}\,\nabla\big(a^0 g(a^0), F^k\big) + u^k\cdot\nabla u^0 = \nabla S_\varphi^k. \end{cases}$$
Note that the coefficient $a^0 g'(a^0)/g(a^0)$ is smooth even when $a^0$ vanishes since g is of the form (80). We have obtained a linear symmetric hyperbolic system with a zero order term and a source term $S^k$ depending only on $(a^j, \varphi^j)$ for $0 \le j < k$ under the form
$$\partial_t U^k + \sum_{j=1}^d A_j^0(t, x)\,\partial_j U^k + L(t, x)\,U^k = S^k, \qquad U^k = \begin{pmatrix} F^k \\ u^k \end{pmatrix},$$



where $A_j^0(t, x)$ are smooth, real and symmetric and the matrix L is smooth. By the classical theory, there exists, on $[0, T^*)$, a smooth solution $(F^k, u^k)$ in $H^\infty \times H^\infty$ of this system. Once $u^k$ is built, we get $a^k$ by solving the transport equation for $a^k$ which is given by the first line of (79). Finally, we deduce the phase $\varphi^k$ by integrating in time the first line of (77). We obtain
$$\varphi^k(t, x) = \varphi_0^k(x) - \int_0^t\Big(2 f'(|a^0|^2)\,(a^0, a^k) + \nabla\varphi^0\cdot u^k - S_\varphi^k\Big)(\tau, x)\,d\tau.$$
Finally, we choose in a similar way $(a^{m+1}, \varphi^{m+1})$ that solve (77) with the initial condition $(a^{m+1}, \varphi^{m+1})_{/t=0}$ given by the remainder terms of order $\varepsilon^{m+1}$ in the expansion (71). Because of the assumption (72), we find that they are also uniformly bounded in $H^{s-1} \times H^s$ with respect to ε. This concludes the proof of Lemma 2.

4. Nonlinear Stability

In this section, we give the proof of Theorem 2. We shall actually prove directly a more precise version which states the existence of a WKB expansion to any order.

Theorem 5. Consider $\Psi_0^\varepsilon = a_0^\varepsilon e^{i\varphi_0^\varepsilon/\varepsilon}$ with $a_0^\varepsilon \in H^\infty$, $\varphi_0^\varepsilon \in H^\infty$ and assume that for some m ∈ N, there exists an expansion (71) as in Lemma 2. We assume (A) and let $(a^\varepsilon, \varphi^\varepsilon)$ be the smooth approximate solution given by Lemma 2 which is smooth on $[0, T^*)$. Then,
• if m = 0, there exist $\varepsilon_0 > 0$ and $T \in (0, T^*)$ such that for every $\varepsilon \in (0, \varepsilon_0]$, the solution $\Psi^\varepsilon$ of (1) with initial data $\Psi_0^\varepsilon$ remains smooth on [0, T] and satisfies, for every s ∈ N, the estimate
$$\Big\|\Psi^\varepsilon \exp\Big(-\frac{i}{\varepsilon}\,\varphi^\varepsilon\Big) - a^\varepsilon\Big\|_{L^\infty([0,T],H^s)} \le C_s\,\varepsilon.$$
• if m ≥ 1, for every $T \in (0, T^*)$, there exists $\varepsilon_0(T) > 0$ such that for every $\varepsilon \in (0, \varepsilon_0(T)]$, the solution $\Psi^\varepsilon$ of (1) with initial data $\Psi_0^\varepsilon$ remains smooth on [0, T] and satisfies, for every s ∈ N, the estimate
$$\Big\|\Psi^\varepsilon \exp\Big(-\frac{i}{\varepsilon}\,\varphi^\varepsilon\Big) - a^\varepsilon\Big\|_{L^\infty([0,T],H^s)} \le C_{s,T}\,\varepsilon^{m+1}.$$
Note that Theorem 2 is actually the special case m = 0 in Theorem 5.

Proof of Theorem 5. Let s > d/2. We take $(a^\varepsilon, \varphi^\varepsilon)$ the approximate solutions given by Lemma 2 and look for the solution of (1) in the form $\Psi^\varepsilon = (a^\varepsilon + w)\,e^{i\varphi^\varepsilon/\varepsilon}$. We get for w Eq. (21) with $F^\varepsilon$ given by (22) and the initial condition $w_{/t=0} = 0$. For s > d/2, and every ε > 0, this semilinear equation is locally well-posed in $H^s$: we get very easily that there exists for some $T^\varepsilon > 0$ a unique maximal solution $w \in C([0, T^\varepsilon), H^s)$ of (21) (see [5] for example). We shall prove that $T^\varepsilon$ is bounded from below by some T > 0 if m = 0, and that $T^\varepsilon \ge T$ for every $T \in (0, T^*)$ for ε sufficiently small if m ≥ 1. Let us define
$$\tau^\varepsilon \equiv \sup\Big\{\tau \in (0, T^\varepsilon),\ \forall t \in [0, \tau],\ 2N_s^\varepsilon(w(t)) \le \varepsilon^{2m+4}\Big\}.$$
Note that $\tau^\varepsilon > 0$ since w(0) = 0 and that by Sobolev embedding, we have, for $t \le \tau^\varepsilon$,
$$\|w(t)\|_{L^\infty}^2 \le K^2\,\varepsilon^{-2}\,N_s^\varepsilon(w(t)) \le K^2\,\varepsilon^{2m+2} \le K^2,$$
for some K independent of ε.



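We will apply Theorem 3 with $F^\varepsilon$ given by (22). To estimate $F^\varepsilon$, we use the following lemma:

Lemma 3. Let R > 0, s > d/2 and w such that $\|w\|_{L^\infty} \le R$, and $F^\varepsilon$ given by (22). Then, for a constant C depending only on $\|a^\varepsilon(t)\|_{W^{s+2,\infty}}$ and R, we have
$$\|F^\varepsilon\|_{H^s}^2 + \frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,F^\varepsilon\|_{H^{s-1}}^2 \le C\varepsilon^{2m+4} + C\varepsilon^{2m} N_s^\varepsilon(w) + C\Big(\frac{N_s^\varepsilon(w)}{\varepsilon^4} + \Big(\frac{N_s^\varepsilon(w)}{\varepsilon^4}\Big)^2\Big) N_s^\varepsilon(w).$$
We postpone the proof of Lemma 3 to the end of the section. We can first easily end the proof of Theorem 5. Notice first that, by definition of $\Psi^a$, we have
$$R_a = R_a^m + \frac{i\varepsilon}{2}\,\Delta a^\varepsilon = O_{H^k}(\varepsilon^{m+1}) + O_{H^k}(\varepsilon) = O_{H^k}(\varepsilon),$$
for every k, uniformly for $0 \le t \le T$, hence
$$\frac{1}{\varepsilon}\,\|R_a(t)\|_{W^{s-1,\infty}} \le C.$$
Applying Theorem 3 and Lemma 3 with R ≡ K, we infer that for $0 \le t \le \tau^\varepsilon$,
$$\frac{d}{dt}\,N_s^\varepsilon(w(t)) \le C\varepsilon^{2m+4} + C\varepsilon^{2m}\,N_s^\varepsilon(w(t)),$$
which gives immediately, since $w_{/t=0} = 0$, that
$$N_s^\varepsilon(w(t)) \le C\varepsilon^{2m+4}\Big(e^{C\varepsilon^{2m} t} - 1\Big) \le \frac12\,\varepsilon^{2m+4}$$
in the following cases:
• for m = 0, $0 \le t \le T$ with $0 < T < T^*$ sufficiently small independent of ε,
• for m ≥ 1, $T \in (0, T^*)$ is arbitrary, $0 \le t \le T$ and $\varepsilon \le \varepsilon_0(T)$ with $\varepsilon_0(T)$ sufficiently small.
As a consequence, $\tau^\varepsilon \ge T$ as desired and
$$\|w\|_{L^\infty([0,T],H^s(\mathbb{R}^d))} \le C_{s,T}\,\varepsilon^{m+1}.$$
It remains to prove Lemma 3.

Proof of Lemma 3. We recall that $F^\varepsilon$ is given by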

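$$F^\varepsilon = R^\varepsilon + Q^\varepsilon(w) = R^\varepsilon + (a^\varepsilon + w)\Big(f(|a^\varepsilon + w|^2) - f(|a^\varepsilon|^2)\Big) - 2(w, a^\varepsilon)\,f'(|a^\varepsilon|^2)\,a^\varepsilon.$$
As a first try, we could use the rough estimate
$$Q^\varepsilon(w) = O(|w|^2) \quad \text{as } w \to 0,$$
which would lead to
$$\|Q^\varepsilon\|_{H^s}^2 + \frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,Q^\varepsilon\|_{H^{s-1}}^2 \le \frac{C}{\varepsilon^2}\,\|w\|_{H^s}^4 \le \frac{C}{\varepsilon^6}\,N_s^\varepsilon(w)^2,$$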



which does not allow us to conclude in the proof of Theorem 5 for m = 0 and does not give the sharp result for the existence time if m = 1. To get the refined estimate of Lemma 3, the idea is then to use a Taylor expansion for $Q^\varepsilon$ w.r.t. w up to second order, and write
$$Q^\varepsilon(w) = |w|^2 f'(|a^\varepsilon|^2)\,a^\varepsilon + 2 f'(|a^\varepsilon|^2)\,(w, a^\varepsilon)\,w + 2 a^\varepsilon f''(|a^\varepsilon|^2)\,(w, a^\varepsilon)^2 + G^\varepsilon(x, w),$$
so that for fixed x, we have as w → 0,
$$G^\varepsilon(x, w) = O\big(|w|^3\big).$$
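The three quadratic terms can be checked by expanding f to second order, as in the following sketch. The shorthand a, ρ, δ is introduced here only for readability and is not notation from the text; (w, a) denotes the real scalar product Re(w ā).

```latex
% Shorthand (assumed for this sketch): a = a^\varepsilon, \rho = |a|^2,
% \delta = |a+w|^2 - |a|^2 = 2(w,a) + |w|^2, with (w,a) = \mathrm{Re}(w\bar a).
\begin{align*}
f(\rho+\delta) - f(\rho)
  &= f'(\rho)\,\delta + \tfrac12 f''(\rho)\,\delta^2 + O(|\delta|^3)\\
  &= 2(w,a)f'(\rho) + |w|^2 f'(\rho) + 2f''(\rho)(w,a)^2 + O(|w|^3),\\
(a+w)\big(f(\rho+\delta)-f(\rho)\big)
  &= 2(w,a)f'(\rho)\,a + |w|^2 f'(\rho)\,a + 2f''(\rho)(w,a)^2\,a
     + 2(w,a)f'(\rho)\,w + O(|w|^3),\\
Q^\varepsilon(w)
  &= (a+w)\big(f(\rho+\delta)-f(\rho)\big) - 2(w,a)f'(\rho)\,a\\
  &= |w|^2 f'(\rho)\,a + 2f'(\rho)(w,a)\,w + 2a f''(\rho)(w,a)^2 + O(|w|^3).
\end{align*}
```

The linear term $2(w,a)f'(\rho)a$ is exactly the one subtracted in the definition of $Q^\varepsilon$, which is why the expansion starts at order two.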

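We now turn to the estimate of each term in $F^\varepsilon$.

Estimate for $R^\varepsilon = -i\varepsilon\,R_a^m + a^\varepsilon R_\varphi^m$. Thanks to (75), we have $\|R^\varepsilon\|_{H^s}^2 \le C\varepsilon^{2m+4}$. Moreover, since $R_\varphi^m$ is real-valued and since, from (38), $\mathrm{Im}\,a^\varepsilon = O_{W^{s,\infty}}(\varepsilon)$, we also have
$$\frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,R^\varepsilon\|_{H^{s-1}}^2 \le C\varepsilon^{2m+4}$$
thanks to (74). We have thus proven that
$$\|R^\varepsilon\|_{H^s}^2 + \frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,R^\varepsilon\|_{H^{s-1}}^2 \le C\varepsilon^{2m+4}.$$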

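Estimate for $G^\varepsilon(x, w)$. The estimate relies on Lemma 5 in the Appendix. Indeed, it is clear from the Taylor formula that $G^\varepsilon$ may be written under the form
$$(\mathrm{Re}\,w)^2\,h_{11}(x, w(x)) + (\mathrm{Re}\,w)(\mathrm{Im}\,w)\,h_{12}(x, w(x)) + (\mathrm{Im}\,w)^2\,h_{22}(x, w(x)),$$
where $h_{11}, h_{12}, h_{22} : \mathbb{R}^d \times \mathbb{C} \to \mathbb{C}$ are of class $C^\infty$ and
$$\forall x \in \mathbb{R}^d, \quad h_{11}(x, 0) = h_{12}(x, 0) = h_{22}(x, 0) = 0.$$
Moreover, $h_{11}$, $h_{12}$ and $h_{22}$ verify the hypothesis of Lemma 5 in the Appendix since $a^\varepsilon \in L^\infty([0, T], W^{s,\infty})$. As a consequence, if $\|w\|_{L^\infty} \le R$,
$$\|G^\varepsilon\|_{H^s} \le C\,\|w\|_{H^s}^3,$$
which implies
$$\|G^\varepsilon(x, w(x))\|_{H^s}^2 + \frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,G^\varepsilon(x, w(x))\|_{H^{s-1}}^2 \le \frac{2}{\varepsilon^2}\,\|G^\varepsilon(x, w(x))\|_{H^s}^2 \le \frac{C}{\varepsilon^8}\,N_s^\varepsilon(w)^3.$$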

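The estimate for the quadratic terms in $Q^\varepsilon(w)$ will rely crucially on the fact that $a^\varepsilon$ is real to first order and that $(w, a^\varepsilon)$ is estimated in $H^{s-1}$ by $N_s^\varepsilon(w)$ and not just by $\varepsilon^{-2} N_s^\varepsilon(w)$.

Estimate for $F_1^\varepsilon \equiv |w|^2 f'(|a^\varepsilon|^2)\,a^\varepsilon$. We have
$$\|F_1^\varepsilon\|_{H^s}^2 \le \frac{C}{\varepsilon^4}\,N_s^\varepsilon(w)^2,$$
and in view of (38), $\mathrm{Im}\,a^\varepsilon = O_{W^{s,\infty}}(\varepsilon)$, thus
$$\frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,F_1^\varepsilon\|_{H^{s-1}}^2 \le C\,\big\||w|^2\big\|_{H^{s-1}}^2 \le \frac{C}{\varepsilon^4}\,N_s^\varepsilon(w)^2.$$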


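Estimate for $F_2^\varepsilon \equiv 2 f'(|a^\varepsilon|^2)\,(w, a^\varepsilon)\,w$. We begin with the rough estimate
$$\|F_2^\varepsilon\|_{H^s}^2 \le \frac{C}{\varepsilon^4}\,N_s^\varepsilon(w)^2.$$
Moreover, one has
$$\big\|f'(|a^\varepsilon|^2)\,(w, a^\varepsilon)\big\|_{H^{s-1}}^2 \le C\,N_s^\varepsilon(w). \tag{81}$$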

Indeed, let µ ∈ Nd with |µ| ≤ s − 1. Then,

∂ µ f (|a ε |2 )(w, a ε ) = ∗ ∂ λ f (|a ε |2 ) ∂ α w, ∂ β a ε , α+β+λ=µ

where ∗ is a coefficient depending only on α, β and λ. Since |µ| ≤ s − 1, the terms 1 (∂ α w, ∂ β a ε ) are bounded in L 2 by (w) 2 + ε||w|| H s−2 as soon as |α| ≤ s − 2. The term in the sum with |α| = s − 1 (hence µ = α and β = λ = 0) is f (|a ε |2 ) (∂ µ w, a ε ) and is bounded in L 2 by N ε (∂ µ w). Hence, (81) follows. As a consequence, by (60) and Sobolev embedding, we obtain

|| f (|a ε |2 )(w, a ε )w || H s−1 ≤ Cs || w || L ∞ || f (|a ε |2 )(w, a ε ) || H s−1 + || w || H s−1 ≤

C ε N (w). ε2 s

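Consequently,
$$\|F_2^\varepsilon\|_{H^s}^2 + \frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,F_2^\varepsilon\|_{H^{s-1}}^2 \le \frac{C}{\varepsilon^4}\,N_s^\varepsilon(w)^2.$$
Estimate for $F_3^\varepsilon \equiv 2 a^\varepsilon f''(|a^\varepsilon|^2)\,(w, a^\varepsilon)^2$. We find as for $F_1^\varepsilon$,
$$\|F_3^\varepsilon\|_{H^s}^2 \le \frac{C}{\varepsilon^4}\,N_s^\varepsilon(w)^2,$$
and once again in view of (38),
$$\frac{1}{\varepsilon^2}\,\|\mathrm{Im}\,F_3^\varepsilon\|_{H^{s-1}}^2 \le C\,\|w\|_{H^{s-1}}^4 \le \frac{C}{\varepsilon^4}\,N_s^\varepsilon(w)^2.$$
We conclude the proof of Lemma 3 by summing these estimates.

5. Geometric Optics in a Half-Space

In this section, we consider the Gross-Pitaevskii equation in a half-space in dimension d ≤ 3,
$$GP(\Psi^\varepsilon) \equiv i\varepsilon\,\partial_t\Psi^\varepsilon + \frac{\varepsilon^2}{2}\,\Delta\Psi^\varepsilon - \Psi^\varepsilon\big(|\Psi^\varepsilon|^2 - 1\big) = 0, \qquad x \in \mathbb{R}^d_+ \equiv \mathbb{R}^{d-1} \times (0, +\infty). \tag{82}$$
We consider the Neumann boundary condition (12) on the boundary and the condition (13) at infinity, that is
$$\frac{\partial\Psi^\varepsilon}{\partial n}_{/\partial\mathbb{R}^d_+} = \frac{\partial\Psi^\varepsilon}{\partial z}_{/z=0} = 0 \quad\text{and}\quad \exp\Big(\frac{i}{2\varepsilon}\,|u^\infty|^2\,t - \frac{i}{\varepsilon}\,u^\infty\cdot x\Big)\Psi^\varepsilon \to 1, \quad |x| \to +\infty,$$
by using the notation $x = (y, z) \in \mathbb{R}^{d-1} \times (0, +\infty)$.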
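To see why the condition at infinity carries this particular phase factor, one can check that the plane wave with constant velocity $u^\infty$ is an exact solution of the Gross-Pitaevskii equation. The computation below is a sketch; the letter Ψ for the unknown is assumed here, since the symbol is not legible in the available text.

```latex
% Plane wave with constant velocity u^\infty (illustrative check).
\begin{align*}
\Psi(t,x) &= \exp\Big(\frac{i}{\varepsilon}\Big(u^\infty\cdot x - \frac{|u^\infty|^2}{2}\,t\Big)\Big),
  \qquad |\Psi| = 1,\\
i\varepsilon\,\partial_t\Psi &= \frac{|u^\infty|^2}{2}\,\Psi,
  \qquad
  \frac{\varepsilon^2}{2}\,\Delta\Psi = -\frac{|u^\infty|^2}{2}\,\Psi,
  \qquad
  \Psi\big(|\Psi|^2-1\big) = 0,\\
GP(\Psi) &= \frac{|u^\infty|^2}{2}\,\Psi - \frac{|u^\infty|^2}{2}\,\Psi - 0 = 0,\\
\exp\Big(\frac{i}{2\varepsilon}\,|u^\infty|^2\,t - \frac{i}{\varepsilon}\,u^\infty\cdot x\Big)\Psi &= 1 .
\end{align*}
```

This plane wave also satisfies the Neumann condition at z = 0 exactly when the normal component of $u^\infty$ vanishes, since $\partial_z\Psi = \frac{i}{\varepsilon}\,u_d^\infty\,\Psi$.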



5.1. Construction of the WKB expansion. In this section, we shall consider a smooth solution (a, u), with a real-valued, of
$$\begin{cases} \partial_t a + u\cdot\nabla a + \dfrac12\,a\,\nabla\cdot u = 0 \\[2mm] \partial_t u + u\cdot\nabla u + \nabla(a^2) = 0, \end{cases} \tag{83}$$
with the boundary condition $u_d(t, y, 0) = 0$ and the condition at infinity
$$u(t, x) \to u^\infty, \qquad a(t, x) \to 1 \qquad \text{when } |x| \to +\infty.$$
Since we look for a real-valued, the resolution of this system is made in [14] (Theorem 2). Given $s \in \mathbb{N}^*$, if the initial datum $a_0$ is positive and $(a_0 - 1, u_0 - u^\infty) \in H^s$, and under some compatibility conditions of sufficiently high order for $(a_0, u_0)$ on the boundary $\partial\mathbb{R}^d_+$, there exist $T_0 \in (0, +\infty)$ and a solution (a, u) on $[0, T_0]$ with $(a - 1, u - u^\infty) \in C^0([0, T_0], H^s) \cap C^1([0, T_0], H^{s-1})$, such that
$$a(t, x) \ge \alpha > 0, \qquad \forall t \in [0, T_0],\ \forall x \in \mathbb{R}^d_+ \tag{84}$$
for some α > 0. We also define the phase φ by
$$\varphi(t, x) \equiv \varphi_0(x) - \int_0^t\Big(\frac12\,|u|^2 + |a|^2 - 1\Big)(\tau, x)\,d\tau.$$

In view of the condition (13) at infinity, ϕ is not in H s but ϕ(t, .)−u ∞ ·x + 2t |u ∞ |2 ∈ H s . As we have seen and as in [2], u = ∇ϕ. The aim of this subsection is to prove the existence of the WKB expansion (which involves boundary layers since the solution of (83) does not match the Neumann boundary condition (12)) up to arbitrary orders for (82), (12), (13) starting from a smooth (a, u) which verifies (84). We define the set of boundary layer profiles Sex p as Sex p = A(t, y, Z ) ∈ H ∞ (R+ × Rd−1 × R+ ), ∀k, α, l, ∃γ > 0, |∂tk ∂ yα ∂ Zl A| # ≤ Ck,α,l exp(−γ Z ) . Lemma 4. Let s ∈ N and m ∈ N∗ be fixed. Then, there exists a smooth function a,m = ϕε

a ε ei ε on [0, Tm ] verifying the Neumann condition (12) and the condition (13) at infinity and such that a,m is an approximate solution of (82) on [0, Tm ]: G P( a,m ) = εm R ε ei

ϕε ε

,

(85)

where R ε can be written under the form

z z R ε = −a ε Rϕint,m (t, x) + Rϕ,m (t, y, ) + i ε Raint,m (t, x) + Ra,m (t, y, ) , ε ε

(86)



,m

with Rϕint,m , Raint,m smooth and uniformly bounded in H s and Ra (t, y, Z ), Rϕ (t, y, Z ) ∈ Sex p . Moreover, a ε is real-valued and a ε , ϕ ε have smooth expansions under the form aε = a + ϕε = ϕ +

m−1

k=1 m−1

k=1

z z εk a k (t, x) + Ak (t, y, ) + εm Am (t, y, ), ε ε

(87)

z z εk ϕ k (t, x) + k (t, y, ) + εm m (t, y, ). ε ε

(88)

The boundary layer profiles Ak (t, y, Z ), k (t, y, Z ) belong to Sex p and are such that ∂ Z A1 (t, y, 0) = −∂z a(t, y, 0),

∂ Z 1 (t, y, 0) = −∂z ϕ(t, y, 0),

∂ Z Ak (t, y, 0) = −∂z a k−1 (t, y, 0), ∂ Z k (t, y, 0) = −∂z ϕ k−1 (t, y, 0) ∀2 ≤ k ≤ m. (89)

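Proof. Since $\Psi^{a,m} = a^\varepsilon \exp\big(i\,\varphi^\varepsilon/\varepsilon\big)$, we want to solve approximately
$$-a^\varepsilon\Big(\partial_t\varphi^\varepsilon + \frac12\,|\nabla\varphi^\varepsilon|^2 + |a^\varepsilon|^2 - 1\Big) + i\varepsilon\Big(\partial_t a^\varepsilon + \nabla\varphi^\varepsilon\cdot\nabla a^\varepsilon + \frac12\,a^\varepsilon\,\Delta\varphi^\varepsilon\Big) + \frac{\varepsilon^2}{2}\,\Delta a^\varepsilon = 0. \tag{90}$$
Since, in this section, we are looking for $a^\varepsilon$ real-valued, we can split the system (90) into
$$\begin{cases} \partial_t a^\varepsilon + \nabla\varphi^\varepsilon\cdot\nabla a^\varepsilon + \dfrac12\,a^\varepsilon\,\Delta\varphi^\varepsilon = 0 \\[2mm] \partial_t\varphi^\varepsilon + \dfrac12\,|\nabla\varphi^\varepsilon|^2 + (a^\varepsilon)^2 - 1 = \dfrac{\varepsilon^2}{2}\,\dfrac{\Delta a^\varepsilon}{a^\varepsilon} \end{cases} \qquad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+. \tag{91}$$
Note that in this section, the division by $a^\varepsilon$ in the right-hand side of the second equation of (91) is not a problem since $a^0 = a$ verifies (84) and hence does not vanish. We thus plug the expansions (87), (88) in (91) and we cancel the powers of ε. To separate interior and boundary layer terms, we use the general theory of [11]. In particular, we use that for every smooth function f and $V \in S_{exp}$, we have the expansion
$$f\big(u(t, x) + V(t, y, z/\varepsilon)\big) = f(u(t, x)) + f\big(u(t, y, 0) + V(t, y, z/\varepsilon)\big) - f(u(t, y, 0)) + \varepsilon R,$$
where $R \in S_{exp}$. This yields that the boundary layer part of $f(u(t, x) + V(t, y, z/\varepsilon))$ is given by $f(u(t, y, 0) + V(t, y, z/\varepsilon)) - f(u(t, y, 0))$. In the following, we use the notation $W_b = W(t, y, 0)$ for every W(t, x). First, the $\varepsilon^{-1}$ term in the equation only gives
$$a_b\,\partial_{ZZ}\Phi^1 = 0,$$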



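and hence we have $\Phi^1 = 0$, since $a_b \ge \alpha > 0$ and $\Phi^1 \in S_{exp}$. Note that this is coherent with the fact that $u_d(t, y, 0) = (\partial_z\varphi)_b = 0$, so that we do not need a boundary layer to correct the boundary condition. The $\varepsilon^0$ term gives, as expected,
$$\begin{cases} \partial_t\varphi + \dfrac12\,|\nabla\varphi|^2 + a^2 - 1 = 0 \\[2mm] \partial_t a + \nabla\varphi\cdot\nabla a + \dfrac12\,a\,\Delta\varphi = 0 \end{cases} \qquad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+ \tag{92}$$
for the interior part, and for the boundary layer terms, for $(t, y) \in \mathbb{R}_+ \times \mathbb{R}^{d-1}$,
$$a_b\,\partial_{ZZ}\Phi^2 = -(\partial_z\varphi)_b\,\partial_Z A^1 = 0 \qquad \text{for } Z > 0,$$
since $(\partial_z\varphi)_b = u_d(t, y, 0) = 0$. Consequently, we also find $\Phi^2 = 0$. Next, the order ε gives
$$\begin{cases} \partial_t a^1 + \nabla\varphi\cdot\nabla a^1 + \nabla\varphi^1\cdot\nabla a + \dfrac12\big(a\,\Delta\varphi^1 + a^1\,\Delta\varphi\big) = 0 \\[2mm] \partial_t\varphi^1 + 2a\,a^1 + \nabla\varphi\cdot\nabla\varphi^1 = 0 \end{cases} \qquad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+ \tag{93}$$
in the interior and for the boundary layer terms
$$\begin{cases} \dfrac12\,\partial_{ZZ}A^1 = A^1\Big(\partial_t\varphi + \dfrac12\,|\nabla\varphi|^2 + a^2 - 1\Big)_b + 2a_b^2\,A^1 = 2a_b^2\,A^1 \\[2mm] a_b\,\partial_{ZZ}\Phi^3 = G^3 \end{cases} \qquad \text{for } Z > 0, \tag{94}$$
where $G^3 \in S_{exp}$ depends only on $(a, A^1, a^1)$ and $(\varphi, \varphi^1)$. Consequently, the boundary layer $A^1$ is given by
$$A^1 \equiv \frac{(\partial_z a)_b}{2a_b}\,e^{-2a_b Z}$$
in order to match (89). Finally, the $\varepsilon^k$, k ≥ 2 terms give
$$\begin{cases} \partial_t\varphi^k + 2a\,a^k + \nabla\varphi\cdot\nabla\varphi^k = S_\varphi^k \\[2mm] \partial_t a^k + \nabla\varphi\cdot\nabla a^k + \nabla a\cdot\nabla\varphi^k + \dfrac a2\,\Delta\varphi^k + \dfrac{a^k}{2}\,\Delta\varphi = S_a^k \end{cases} \qquad \text{for } t \ge 0,\ x \in \mathbb{R}^d_+ \tag{95}$$
and
$$\begin{cases} \partial_{ZZ}A^k = 4a_b^2\,A^k + F^k \\[1mm] \partial_{ZZ}\Phi^k = G^k \end{cases} \qquad \text{for } Z > 0, \tag{96}$$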

where Sϕk and Sak depend only on (a, ϕ) and (a j , ϕ j )1≤ j≤k−1 ; F k ∈ Sex p depends only on (a, ϕ), (a j , ϕ j , A j , j )1≤ j≤k−1 and k ; and G k ∈ Sex p depends on (a, ϕ),



(a j , ϕ j , A j , j )1≤ j≤k−1 . Therefore, if we want to solve by induction these equations, one has to determine first k , then (a k , ϕ k ) and finally Ak . To solve the cascade of equations by induction, we first determine (a 1 , ϕ 1 ). As before, we notice that (a 1 , u 1 ≡ ∇ϕ 1 ) solves a symmetrizable hyperbolic system (there is no problem with the vacuum since we are in the same situation as in [9]). Since the condition at infinity is already absorbed by (a, ϕ), one can look for (a 1 , u 1 ) in H s . Moreover, we solve the system in Rd+ with the boundary condition u 1d (t, y, 0) = 0 which is needed in order to match (89) since we have already found that 2 = 0. The existence of a smooth solution for this linear system with the boundary condition u 1d (t, y, 0) = 0 which is maximal dissipative and an initial condition satisfying suitable compatibility conditions can be obtained by the classical theory [17]. Then, one finds ϕ 1 by the formula t

2a a 1 + u · u 1 (τ, x) dτ. ϕ 1 (t, x) = ϕ01 (x) − 0

Furthermore, since F 2 ∈ Sex p and ab ≥ α > 0, the first equation in (96) (with k = 2) has a unique solution A2 ∈ Sex p . We have therefore found (a 1 , A1 , ϕ 1 , 1 , A2 , 2 ). We now proceed by induction. Assume that, for some m ≥ 2, we have determined (a j , ϕ j )1≤ j≤m−1 and (A j , j )1≤ j≤m . Then, we wish to solve (95) and (96) with k = m +1. Since G m+1 is already determined and G m+1 ∈ Sex p , the differential equation ∂ Z Z m+1 = G m+1 has a unique solution in Sex p and +∞ m+1 G (t, y, ζ ) ∂ Z m+1 (t, y, Z ) = − dζ. a (t, y) b Z This determines the boundary condition for u m+1 ≡ ∇ϕ m+1 . Indeed, to match (89) we shall need to impose +∞ m−1 G (t, y, ζ ) m+1 m+1 dζ, u m+1 (t, y, 0) = (∂ ϕ )(t, y, 0) = −(∂

)(t, y, 0) = z Z d ab (t, y) 0 (97) which is non-zero in general. We then solve (96) in the following way: (a m+1 , u m+1 ≡ ∇ϕ m+1 ) still solves a linear symmetrizable hyperbolic system, with source terms Sϕm+1 and Sam+1 already known, with the maximal dissipative boundary condition (97). It has then a smooth solution by the above mentioned theory. Then, we recover ϕ m+1 as usual by t

Sϕm+1 − 2a a m+1 − u · u m+1 (τ, x) dτ. ϕ m+1 (t, x) ≡ ϕ0m+1 (x) + 0

Finally, the first equation in (96) (with k = m + 1) is a linear ODE for Am+1 , with source term F m+1 ∈ Sex p now determined, for which we can write down explicitly the unique exponentially decreasing solution satisfying ∂ Z Ak (t, y, 0) = −∂z a k (t, y, 0). Consequently, we have constructed an approximate solution of (91) such that ⎧

1 ε ε ⎪ ε ε ε m int,m −1 ,m ⎪ ∂ a R a + ∇ϕ · ∇a + ϕ = ε (t, x) + ε R (t, y, z/ε) ⎪ t a a ⎨ 2 ⎪

2 ε ⎪ ⎪ ⎩ ∂t ϕ ε + 1 |∇ϕ ε |2 + a ε 2 − 1 = ε a (t, x) + εm Rϕint,m (t, x) + Rϕ,m (t, y, z/ε) , 2 2 aε


D. Chiron, F. Rousset ,m

,m

where Raint,m (t, x), Rϕint,m (t, x) are smooth bounded functions and Ra , Rϕ ∈ Sex p . We can thus write the error R ε in the GP equation as

R ε (t, x) = εm −a ε Rϕint,m (t, x) + Rϕ,m (t, y, z/ε) + i ε Raint,m (t, x) + Ra,m (t, y, z/ε) . This ends the proof of Lemma 4.

5.2. Validity of the WKB expansion. We shall now prove the stability of the WKB expansion built in Lemma 4. ϕε

Theorem 6. Let a,m = a ε ei ε be a WKB expansion defined on [0, Tm ] given by Lemma 4. Then for d ≤ 3 and m ≥ 4 there exists a unique smooth solution ε also a,m ε defined on [0, Tm ] of (82), (12), (13) such that /t=0 = /t=0 . Moreover, we have the estimate ε || ε e

−iϕ ε ε

− a ε || H 1 (Rd+ ) + ε3 || ε e−i

ϕε ε

1

− a ε || H 3 (Rd+ ) ≤ Cm εm− 2 , ∀t ∈ [0, Tm ],

and in particular || ε e−i

ϕε ε

7 − a + ε A1 ||W 1,∞ (Rd+ ) ≤ Cm max{ε, εm− 2 }.

(98)

Remark 3. For simplicity, we have restricted ourselves to dimension d ≤ 3. Note however that it is possible to get H s estimates for every s. By contrast with Theorem 2, we emphasize that the initial condition in Theorem 6 is exactly the WKB approximate solution a,m . In particular, this initial datum has to verify some compatibility condition on the boundary. Proof. As in the proof of Theorem 5, we set ε = a,m + w e

iϕ ε ε

and we study the equation for w i.e. (19). Note that we are now seeking a w which tends to zero at infinity since the boundary condition at infinity is already absorbed in the WKB expansion. Again the first step is to get estimates for the linear equation (21) in with the Neumann boundary condition ∂z w(t, y, 0) = 0.

(99)

As we can check in the proof of Lemma 1, in all the integration by parts that are performed, the boundary terms vanish due to the Neumann boundary condition or the fact that u εd (t, y, 0) = 0, and hence the proof of the L 2 stability will be almost the same as the one in the whole space. Nevertheless, we have to pay attention to the presence of boundary layer terms in the coefficients. At first, we note that since 1 = 0 and

2 = 0 in the WKB expansion, we still have that M (which is defined in Lemma 1) is independent of ε. Indeed, for the worst term which is ∇(∇ · u ε ), we have ∇(∇ · u ε ) = ∂ Z Z Z 3 + ∇ϕ + O L ∞ (ε).



Next, keeping the definitions of Ra and Rϕ given in (17), (18) and by construction of the WKB expansion, we have ||Ra || L ∞ ≤ Cεm .

(100)

Nevertheless, again by construction of the WKB expansion, we only have Rϕ = Rϕm +

ε2 a ε , 2 aε

and due to the presence of boundary layers in a ε , we can split Rϕ into z Rϕ = ε2 Rϕint (t, y, z) + ε Rϕ (t, y, ), ε

(101)

where Rϕint is smooth and bounded, whereas Rϕ ∈ Sex p , and we see that ε ||Rϕ || L ∞ =

O(ε), ε ||∇ Rϕ || L ∞ = O(1), hence the estimate (23) of Lemma 1 would be useless. Moreover, the fact that Rϕ belongs to Sex p does not seem to improve the estimates. The way to overcome this difficulty seems to incorporate this new singular term into the functional. Let us define the operator S+ε w = −

ε2 w + 2(w, a ε )a ε + ε Rϕ w, 2

our weighted norm in this section will be

ε (S+ε w, w) + K ε2 |w|2 d x N+ (w) =

1 = ε2 |∇w|2 + 4(w, a ε )2 + 2ε Rϕ |w|2 + 2K ε2 |w|2 d x. 2 Note that Rϕ has no sign, nevertheless, N+ε (w) can be bounded from below by a weighted H 1 norm if K is chosen sufficiently large. Indeed, since Rϕ belongs to Sex p we can write −γ z 2ε Rϕ |w|2 d x ≤ Cε e ε |w|2 d x

and then use the one-dimensional Sobolev inequality 1

|w(t, y, z)| ≤ C 2

|w(t, y, ζ )| dζ 2

R+

1

2

|∂z w(t, y, ζ )| dζ 2

R+

2

to get ε

e

− γεz

|w| ≤ Cε||w|| L 2 ||∇w|| L 2 2

R+

e−

γz ε

dz ≤ Cε2 ||w|| L 2 ||∇w|| L 2 .

In particular, we have proven that 2 2ε Rϕ |w| d x ≤ Cε2 ||w|| L 2 ||∇w|| L 2 .

(102)

(103)



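This yields, thanks to the Young inequality,
\[ \Big| \int 2\varepsilon R_\phi |w|^2 \, dx \Big| \le \frac12 \varepsilon^2 \|\nabla w\|_{L^2}^2 + C\varepsilon^2 \|w\|_{L^2}^2, \tag{104} \]
where $C$ is independent of $\varepsilon$. Consequently, if $K$ is chosen such that $2K > C$, we get
\[ N_+^\varepsilon(w) \ge C_0 \Big( \varepsilon^2 \|w\|_{H^1}^2 + \int (w,a^\varepsilon)^2 \, dx \Big), \qquad C_0 > 0. \]
Note that in this section, we have $a^\varepsilon = a + O(\varepsilon)$ with $a \ge \alpha$; this finally yields that $N_+^\varepsilon(w)$ is equivalent to the weighted norm
\[ N_+^\varepsilon(w) \sim \varepsilon^2 \|w\|_{H^1}^2 + \|\mathrm{Re}\, w\|_{L^2}^2. \tag{105} \]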
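The passage from (103) to (104), and the resulting lower bound on the weighted norm, can be spelled out as follows. This is a sketch under an assumption of notation: $C_1$ is a name introduced here for the constant in (103), and $C_1^2/2$ is then one admissible value of the constant $C$ in (104), not necessarily the one the authors have in mind.

```latex
% requires amsmath
% Young's inequality  ab <= a^2/2 + b^2/2  with
%   a = \varepsilon \|\nabla w\|_{L^2},   b = C_1 \varepsilon \|w\|_{L^2}:
\begin{align*}
\Big| \int 2\varepsilon R_\phi |w|^2 \, dx \Big|
  &\le C_1 \varepsilon^2 \|w\|_{L^2} \|\nabla w\|_{L^2}
   = \big( \varepsilon \|\nabla w\|_{L^2} \big) \big( C_1 \varepsilon \|w\|_{L^2} \big) \\
  &\le \tfrac12 \varepsilon^2 \|\nabla w\|_{L^2}^2 + \tfrac{C_1^2}{2}\, \varepsilon^2 \|w\|_{L^2}^2 .
\end{align*}
% Inserting half of this bound into the definition of N_+^\varepsilon:
\begin{align*}
N_+^\varepsilon(w)
  &\ge \tfrac12 \varepsilon^2 \|\nabla w\|_{L^2}^2 + 2 \int (w,a^\varepsilon)^2 \, dx
       + K \varepsilon^2 \|w\|_{L^2}^2
       - \tfrac12 \Big| \int 2\varepsilon R_\phi |w|^2 \, dx \Big| \\
  &\ge \tfrac14 \varepsilon^2 \|\nabla w\|_{L^2}^2 + 2 \int (w,a^\varepsilon)^2 \, dx
       + \Big( K - \tfrac{C_1^2}{4} \Big) \varepsilon^2 \|w\|_{L^2}^2 .
\end{align*}
```

With $C = C_1^2/2$, the coefficient of the last term is positive exactly when $2K > C$, which matches the condition on $K$ stated above.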

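The first step in the proof of Theorem 6 is to prove the equivalent of Lemma 1. We shall prove the estimate
\[ \frac{d}{dt} N_+^\varepsilon(w(t)) \le C \Big( N_+^\varepsilon(w(t)) + \|F^\varepsilon\|_{L^2}^2 \Big) + \int \frac{4}{\varepsilon}\, (w,a^\varepsilon)(ia^\varepsilon, F^\varepsilon) - \int (i\varepsilon\Delta w, F^\varepsilon) - \int (iF^\varepsilon, R_\phi w), \tag{106} \]
where $C$ is independent of $\varepsilon$.

Proof of (106). The proof follows the same lines as the proof of Lemma 1. At first, since $S_+^\varepsilon$ is self-adjoint, we have
\[ \frac{d}{dt} \int (S_+^\varepsilon w, w) \, dx = 2 \int (S_+^\varepsilon w, \partial_t w) + \int 4(w,a^\varepsilon)(w,\partial_t a^\varepsilon) + 2\varepsilon\, \partial_t R_\phi |w|^2 \, dx. \]
Since $\partial_t R_\phi \in S_{exp}$, we can still use (102) to get
\[ \Big| \int 2\varepsilon\, \partial_t R_\phi |w|^2 \Big| \le C N_+^\varepsilon(w). \]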

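Next, as in the proof of Lemma 1, we use (21) to express $\partial_t w$ as
\[ \partial_t w = -\frac{i}{\varepsilon} S_+^\varepsilon w - \Big( u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon \Big) - i\, \frac{\varepsilon^2 R_\phi^{int}}{\varepsilon}\, w - i\, \frac{F^\varepsilon}{\varepsilon} \]
to get
\[ 2 \int (\partial_t w, S_+^\varepsilon w) \, dx = 2 \int \Big( -\Big( u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon \Big) - \frac{i\varepsilon^2 R_\phi^{int}}{\varepsilon}\, w - i\, \frac{F^\varepsilon}{\varepsilon},\ S_+^\varepsilon w \Big) \, dx. \tag{107} \]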



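Moreover, since $R_\phi^{int}$ and $R_\phi$ are real, we have the cancellation
\[ \int (i R_\phi^{int} w, R_\phi w) \, dx = 0. \]
Therefore, the only terms in the right-hand side of (107) which are not present in (25) are $-\int (iF^\varepsilon, R_\phi w)$ and
\[ I = -2 \int \Big( u^\varepsilon\cdot\nabla w + \frac12\, w\, \nabla\cdot u^\varepsilon,\ \varepsilon R_\phi w \Big). \]
To estimate $I$, we note that the second term can be bounded by using (102) again. It remains to estimate the first term. Integrating by parts and using that $u^\varepsilon_d(t,y,0) = 0$, we get
\[ -2 \int (u^\varepsilon\cdot\nabla w, \varepsilon R_\phi w) = \varepsilon \int \nabla\cdot(R_\phi u^\varepsilon)\, |w|^2 = \int \nabla\cdot u^\varepsilon\, \varepsilon R_\phi |w|^2 + \varepsilon \int u^\varepsilon\cdot\nabla R_\phi\, |w|^2. \]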

Again, the first term can be bounded thanks to (102). For the second one, we first notice that since u εd (t, y, 0) = 0 and Rϕ ∈ Sex p , we have γz ε u ε · ∇ Rϕ ≤ Cε |∇ y Rϕ | + |z∂z Rϕ | ≤ Cεe− ε . This finally yields I ≤ C N+ε (w), thanks to a new use of (102). The end of the proof of (106) is then exactly the same as the proof of Lemma 1, since all the integration by parts do not create boundary terms either because of the Neumann boundary condition or because u εd vanishes on the boundary. Higher order estimates. The estimates of higher order derivatives are more involved than in the whole space. There are two main reasons. The first one is that there is a new singular term ε Rϕ w which creates bad terms when we take the derivatives of the equation. The second reason is that to recover estimates on the normal derivatives, we need to use the equation which gives in particular that ε2 ∂z2 behaves like ε∂t and ε∇. This anisotropy in the weights does not seem to allow to construct high order functionals like Nsε (w) which allows to get H s estimates without additional loss of ε. Let us use the notation t = (0 , . . . , d ) = ∂t , ∇ y , p(z)∂z , where the weight p(z) is given by p(z) = z/(1 + z). Note that we can apply to the equation since w still satisfies the Neumann boundary condition. The use of is classical in hyperbolic characteristic initial boundary value problems (see [17] for example) The weighted norm that we shall estimate is Y+ε (w) ≡ N+ε (w) + N+ε (εw).
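The role of the weight $p(z) = z/(1+z)$ can be illustrated by a short computation. The block below is a sketch written for a generic smooth function $f$ of $z \ge 0$ (a name introduced here only for illustration). It records the derivatives of $p$, checks that the weighted derivative $p(z)\partial_z$ preserves a homogeneous Neumann condition at $z = 0$, and computes its commutator with $\partial_{zz}$.

```latex
% requires amsmath
% The weight and its first two derivatives:
\[
p(z) = \frac{z}{1+z}, \qquad p'(z) = \frac{1}{(1+z)^2}, \qquad p''(z) = -\frac{2}{(1+z)^3},
\]
% so p(0) = 0, 0 <= p < 1, and p', p'' are bounded on z >= 0.

% Neumann condition: if \partial_z f(0) = 0, then
\[
\partial_z \big( p\, \partial_z f \big)(0) = p'(0)\, \partial_z f(0) + p(0)\, \partial_{zz} f(0) = 0 .
\]

% Commutator with the second normal derivative:
\begin{align*}
\partial_{zz} \big( p\, \partial_z f \big) - p\, \partial_z \big( \partial_{zz} f \big)
  &= \big( p''\, \partial_z f + 2 p'\, \partial_{zz} f + p\, \partial_{zzz} f \big) - p\, \partial_{zzz} f \\
  &= 2 p'\, \partial_{zz} f + p''\, \partial_z f .
\end{align*}
```

The commutator is of second order in $\partial_z$ but has bounded coefficients, which is why a term involving two normal derivatives has to be controlled separately when the weighted normal derivative is applied to the equation.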



In dimension d ≤ 3, this is sufficient to get the nonlinear stability. We shall see in the proof why the use of d is necessary. We shall prove that d ε Y+ (w) ≤ C Y+ε (w) + X ε (F ε ) + X ε (εF ε ) dt

(108)

for some $C > 0$ independent of $\varepsilon$, where we have set
\[ X^\varepsilon(F) \equiv \|F\|_{H^1}^2 + \frac{\|F\|_{L^2}^2}{\varepsilon} + \frac{\|\mathrm{Im}\, F\|_{L^2}^2}{\varepsilon^2}. \]

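Proof of (108). As a preliminary, we shall rewrite (106) in a more convenient form. We can use that $a^\varepsilon = a + O(\varepsilon)$ with $a$ real, perform an integration by parts and use (102) to get from (106) that
\[ \frac{d}{dt} N_+^\varepsilon(w(t)) \le C N_+^\varepsilon(w(t)) + X^\varepsilon(F^\varepsilon), \tag{109} \]
where
\[ X^\varepsilon(F^\varepsilon) = \|F^\varepsilon\|_{H^1}^2 + \frac{\|F^\varepsilon\|_{L^2}^2}{\varepsilon} + \frac{\|\mathrm{Im}\, F^\varepsilon\|_{L^2}^2}{\varepsilon^2}. \]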

To prove (108), we start with the estimate of N+ε (ε∂t w). When we apply ε∂t to (21), we find iε∂t + Lε ε∂t w = Rϕ ε∂t w + ε∂t F ε + C, (110) where the commutator C can be split into C = C1 + C2 + C3

(111)

with C1 ≡ ε ∂t Rϕ w, C2 ≡ 2ε (∂t a ε , w)a ε + (a ε , w)∂t a ε , 1 2 ε ε C3 ≡ −iε ∂t u · ∇w + ∂t (∇ · u ) w . 2 Consequently, we can apply (109) to (110) with the new source term ε∂t F ε + C to get d ε N (ε∂t w(t)) ≤ C N+ε (ε∂t w(t)) + X ε (ε∂t F ε ) + X ε (C). dt +

(112)

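Thus it remains to estimate $X^\varepsilon(C)$. Let us begin with $X^\varepsilon(C_1)$. Thanks to the expansion (101), we easily get
\[ X^\varepsilon(C_1) \lesssim N_+^\varepsilon(w) + \int |\partial_t R_\phi|^2 \big( \varepsilon^4 |w|^2 + \varepsilon^4 |\nabla w|^2 \big) + \varepsilon^4 |\nabla \partial_t R_\phi|^2 |w|^2 \lesssim N_+^\varepsilon(w). \tag{113} \]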



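Note that we could have a better estimate by using that $R_\phi \in S_{exp}$ and (102). Next, we turn to the estimate of $X^\varepsilon(C_2)$. By using that $a^\varepsilon = a + O(\varepsilon)$ with $a$ real, we find
\[ X^\varepsilon(C_2) \lesssim N_+^\varepsilon(w) + \varepsilon \|\mathrm{Re}\, w\|_{L^2}^2 + \varepsilon^2 \|\nabla w\|_{L^2}^2 \lesssim N_+^\varepsilon(w). \tag{114} \]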

Note that the above estimate was sharp. This is for the estimate of this commutator C2 that we had to choose the weight ε in front of the time derivative. Finally, we estimate X ε (C3 ) using that ∂t u εd vanishes on the boundary which implies that |∂t u εd | p(z). Thanks to this remark, we find X ε (C3 ) N+ε (w) + ε4 ||∇w||2L 2 Y+ε (w).

(115)

Note that this is for the control of this commutator that we are obliged to add the vector field p(z)∂z in the definition of the functional space. Consequently, the combination of (112), (113), (114) and (115) gives d ε N (ε∂t w(t)) Y+ε (w(t)) + X ε (ε∂t F ε ). dt +

(116)

The estimate of ε∇ y w follows exactly the same lines, and we also find d ε N+ ε∇ y w(t) Y+ε (w(t)) + X ε (ε∇ y F ε ). dt

(117)

The estimate of εd w = εp(z)∂z w requires some additional work since the vector field d does not commute with the Laplacian. By applying εd to (21), we get iε∂t + Lε εd w = Rϕ εd w + εd F ε + C + C4 , (118) where C is defined as in (111) above with ∂t replaced by d and C4 is given by ε3 ε3 [d , ]w = − (2 (∂z p) ∂zz w + (∂zz p) ∂z w) . 2 2 Next, we can apply (106) to get C4 ≡ −

d ε N (εd w(t)) N+ε (εd w(t)) + X ε (εd F ε ) + X ε (C) + ||C4 ||2H 1 dt + 4 ε ε iC4 , Rϕ εd w . + (εd w, a )(ia , C4 ) − ε Since one can easily check that X ε (C) still satisfies the bounds (113), (114), (115), we obtain d ε N (εd w(t)) Y+ε (w) + X ε (εd F ε ) + ||C4 ||2H 1 dt + 4 iC4 , Rϕ εd w . + (εd w, a ε )(ia ε , C4 ) − ε Next, we note that ||C4 ||2H 1 ε6 ||w||2H 3 and that



4 1 4 ε ε (ε w, a )(ia , C ) ε|∂z w| | p(z)C4 | ε2 N+ε (w) 2 || p∂zz w|| L 2 + ||∂z w|| L 2 d 4 ε ε 1

1

N+ε (w) 2 Y+ε (w) 2 . In a similar way, we also get iC4 , Rϕ εd w ε||∂z w|| L 2 || p C4 || L 2 Y+ε (w).

Consequently, we have proven that d ε N (εd w(t)) Y+ε (w) + X ε (εd F ε ) + ε6 ||w||2H 3 . dt +

(119)

To conclude, it remains to estimate ε6 ||w||2H 3 . As usual, this is done thanks to Eq. (19) and the standard regularity result for elliptic equations. We rewrite (19) as the equation ε2 w = G ε , ∂z w(t, y, 0) = 0,

(120)

where the source term enjoys the estimates ||G ε ||2L 2 ε2 ||w||2L 2 + ||w||2L 2 + ||F ε ||2L 2 , ||∇G ε ||2L 2 ε2 ||∇w||2L 2 + ||w||2H 1 + ||∇ F ε ||2L 2 . Consequently, we get from (120) by standard elliptic regularity that ε6 ||w||2H 3 Y+ε (w) + ||F ε ||2H 1 .

(121)

By replacing this last estimate in (119), we finally obtain d ε N (εd w(t)) Y+ε (w) + X ε (εd F ε ) + ||F ε ||2H 1 . dt +

(122)

To conclude, it suffices to sum the estimates (109), (116), (117) and (122) to get (108). The estimate (108) is sufficient to prove the nonlinear stability stated in Theorem 6 for d ≤ 3. Nevertheless, it is possible to prove by induction that for every s, d dt

m≤s

N+ε

m (ε) w X ε (ε)m F ε + N+ε (ε)m w . m≤s



Nonlinear stability.. Thanks to (108) and the Gronwall inequality, we get for 0 ≤ T ≤ Tm , where Tm is the existence time of the approximate solution given by Lemma 4, sup Y+ε (w) Y+ε (0) + T eγ T sup X ε (F ε ) + X ε (εF ε ) [0,T ]

[0,T ]

for some γ > 0 independent of ε. Combining this last estimate with (121), we get ε ε ε ε ε ε sup Z + (w) ≤ C Tm Y+ (0) + sup X (F ) + X (εF ) , (123) [0,T ]

[0,T ]

with Z +ε (w) ≡ Y+ε (w) + ε6 ||w||2H 3 . Thanks to this a priori estimate, one can easily prove by standard fixed point argument the existence of a unique solution of (19) with the Neumann condition ∂z w|z=0 = 0 on some interval of time [0, T ε ] ⊂ [0, Tm ] such that Z +ε (w) remains finite. By using that w/t=0 = 0 and the equation to compute the time derivative, we find Y+ε (w)/t=0 = N+ε (ε∂t w)/t=0 ≤ C Tm ε2m . Moreover, using that F ε = εm R ε + Q ε , we have thanks to (86) that sup X +ε (R ε ) + X +ε (R ε ) ≤ C Tm ε2m−1 . [0,Tm ]

Inserting this into (123) yields, for 0 ≤ t ≤ T ε , sup Z +ε (w) ≤ K Tm ε2m−1 + C Tm sup X ε (Q ε ) + X ε (εQ ε ) .

[0,T ]

[0,T ]

(124)

We can thus define τ ε ∈ (0, Tm ] as the maximal time such that the solution w of (19) satisfies Z +ε (w(t)) ≤ 2K Tm ε2m−1 on [0, τ ε ]. As in the proof of Theorem 5, we shall prove that for ε sufficiently small, we have τ ε = Tm . Here, the expression of Q ε (w) is given by Q ε (w) = a ε |w|2 + 2(w, a ε )w + w|w|2 . To conclude, we need to bound the right-hand side of (124). To estimate the nonlinear term, we use that for d ≤ 3, we have ||w||2L ∞ ||∇ 2 w|| ||w|| H 1 , which gives ||w||2L ∞

Z +ε (w) ε2m−5 ∀t ∈ [0, τ ε ). ε4

We shall take m such that 2m > 5 in order to get ||w|| L ∞ ≤ 1 for t ∈ [0, τ ε ). This implies

Z ε (w)2 ||Q ε ||2H 1 ||w||2L ∞ + ||w||4L ∞ ||w||2H 1 + 6 . ε



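Next, since $H^1(\mathbb{R}^d) \subset L^4$ for $d \le 3$, we also have
\[ \frac{\|Q^\varepsilon\|_{L^2}^2}{\varepsilon^2} \lesssim \frac{\|w\|_{H^1}^4}{\varepsilon^2} \big( 1 + \|w\|_{L^\infty}^2 \big) \lesssim \frac{Z_+^\varepsilon(w)^2}{\varepsilon^6}. \]
Consequently, we have already proven that
\[ X^\varepsilon(Q^\varepsilon) \lesssim \frac{Z_+^\varepsilon(w)^2}{\varepsilon^6}. \tag{125} \]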

Next, we evaluate X ε (εQ ε ). At first, we write

ε2 ||Q ε ||2H 1 ε2 ||w||2H 1 ||w||2L ∞ + ||w||4L ∞ + ε2 ||w||2L 4 ||∇w||2L 4 1 + ||w||2L ∞ ) and by using for d ≤ 3, the Sobolev embedding H 1 ⊂ L 4 and the Gagliardo-Nirenberg inequality 1

3

||∇ f ||2L 4 || f || H2 1 ||∇ 2 f || L2 2 , we get for 0 ≤ t ≤ τ ε : ε2 ||Q ε ||2H 1

1 3 Z +ε (w)2 Z ε (w)2 + ε2 ||∇w||2H 1 ||w|| H2 1 ||∇ 2 w|| L2 2 + 6 . 4 ε ε

Finally, by similar arguments, we also have ||εQ ε ||2L 2 ε2

1

1

||w|| L2 4 ||w|| L2 4 ||w||2H 1 ||w||2H 1

Z +ε (w)2 . ε6

We have thus proven that X ε (εQ ε )

Z +ε (w)2 . ε6

(126)

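Consequently, inserting (125), (126) into (124), we get
\[ \sup_{[0,\tau^\varepsilon]} Z_+^\varepsilon(w) \le K_{T_m} \varepsilon^{2m-1} + C_{T_m} \sup_{[0,\tau^\varepsilon]} \frac{Z_+^\varepsilon(w)^2}{\varepsilon^6} \le K_{T_m} \varepsilon^{2m-1} + 2 K_{T_m} C_{T_m}\, \varepsilon^{2m-7} \sup_{[0,\tau^\varepsilon]} Z_+^\varepsilon(w). \]
By choosing $m \ge 4$, this allows us to get, for $\varepsilon$ sufficiently small, that $\tau^\varepsilon = T_m$ and that
\[ \sup_{[0,T_m]} Z_+^\varepsilon(w) \le C \varepsilon^{2m-1}. \]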
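The closing step of this bootstrap argument can be made explicit. The block below is a sketch under assumptions of notation: the abbreviations $K = K_{T_m}$, $C = C_{T_m}$ and $\zeta = \sup_{[0,\tau^\varepsilon]} Z_+^\varepsilon(w)$ are introduced here for readability, and the threshold $1/4$ is an arbitrary convenient choice.

```latex
% requires amsmath
% On [0, \tau^\varepsilon] the bootstrap hypothesis Z <= 2K \varepsilon^{2m-1} gives
\[
\frac{Z_+^\varepsilon(w)^2}{\varepsilon^6}
  \le \frac{2K \varepsilon^{2m-1}}{\varepsilon^6}\, Z_+^\varepsilon(w)
  = 2K \varepsilon^{2m-7}\, Z_+^\varepsilon(w),
\]
% hence
\[
\zeta \le K \varepsilon^{2m-1} + 2KC\, \varepsilon^{2m-7}\, \zeta .
\]
% For m >= 4 we have 2m - 7 >= 1, so 2KC \varepsilon^{2m-7} <= 1/4 once \varepsilon is small. Then
\[
\zeta \le \frac{K \varepsilon^{2m-1}}{1 - 2KC\, \varepsilon^{2m-7}}
      \le \tfrac43\, K \varepsilon^{2m-1} < 2K \varepsilon^{2m-1}.
\]
```

Since the bound obtained is strictly better than the bootstrap hypothesis, a continuity argument rules out $\tau^\varepsilon < T_m$.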

Finally, the estimate (98) follows by Sobolev embedding. This ends the proof of Theorem 6. Acknowledgement. We thank Rémi Carles for useful comments about this work.



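A. A Lemma about Composition in Sobolev Spaces

During the proof of Lemma 3, we have used a result about composition in Sobolev spaces. This result is very standard when $h$ does not depend on $x$ (see, for instance, [18]).

Lemma 5. Let $R > 0$, $s \in \mathbb{N}$ and $h = h(x,w) \in C^{s+1}(\mathbb{R}^d \times \mathbb{R}^2, \mathbb{R})$, satisfying $h(x,0) = 0$ for all $x \in \mathbb{R}^d$. Assume moreover
\[ A \equiv \sup \Big\{ \|\partial_x^\alpha \partial_w^\beta h\|_{L^\infty(\mathbb{R}^d \times B_R)},\ \alpha \in \mathbb{N}^d,\ \beta \in \mathbb{N}^2,\ |\alpha| \le s,\ |\alpha| + |\beta| \le s+1 \Big\} < +\infty. \]
Then, there exists $C$, depending only on $A$, $s$ and $R$, such that, for any $w \in H^s(\mathbb{R}^d)$ satisfying $|w|_{L^\infty(\mathbb{R}^d)} \le R$, we have $h(x,w(x)) \in H^s(\mathbb{R}^d)$ and
\[ \|h(x,w(x))\|_{H^s} \le C \|w\|_{H^s}. \]

Proof. The proof is by induction on $s \in \mathbb{N}$ and relies on the Gagliardo-Nirenberg inequality. If $s = 0$, it suffices to notice that since $h(x,0) = 0$, then for $w \in B_R$, $|h(x,w)| \le A|w|$. Assume then the result for $s-1 \in \mathbb{N}$. Let $\mu \in \mathbb{N}^d$ with $|\mu| = s$. One has easily
\[ \partial^\mu \big( h(x,w(x)) \big) = \sum *\; \big( \partial_x^\alpha \partial_w^{\beta+\gamma} h \big)(x,w(x))\, \big( \partial^\beta w_1 \big)^p \big( \partial^\gamma w_2 \big)^q, \]
where $\alpha \in \mathbb{N}^d$, $\alpha \le \mu$, $\beta, \gamma \in \mathbb{N}^2$, $p, q \in \mathbb{N}^*$ depend on $\beta$ and $\gamma$, $|\alpha| + p|\beta| + q|\gamma| = s$, and $*$ is a coefficient depending only on $\mu$, $\alpha$, $\beta$ and $\gamma$. Furthermore, since $w \in H^s \cap L^\infty$, the Gagliardo-Nirenberg inequality yields, for $1 \le k \le s$,
\[ \|w\|_{W^{k,\frac{2s}{k}}} \le C_{k,s}\, \|w\|_{H^s}^{\frac{k}{s}}\, \|w\|_{L^\infty}^{1-\frac{k}{s}}. \]
As a consequence, by interpolation, if $w \in H^s \cap L^\infty$ and $\|w\|_{L^\infty} \le R$, then for $\gamma \in \mathbb{N}^d$, $|\gamma| \le s$, and $2 \le p \le \frac{2s}{|\gamma|}$,
\[ \|\partial^\gamma w\|_{L^p} \le C_{s,p,R}\, \|w\|_{H^s}^{\frac{2}{p}}. \]
Therefore, in view of $|\alpha| + p|\beta| + q|\gamma| = s$, by the Hölder inequality, we can estimate the terms in $\partial^\mu(h(x,w(x)))$ for which $\alpha \ne \mu$ (thus $|\alpha| < s$) as
\[ \big\| \big( \partial_x^\alpha \partial_w^{\beta+\gamma} h \big)(x,w(x))\, \big( \partial^\beta w_1 \big)^p \big( \partial^\gamma w_2 \big)^q \big\|_{L^2} \le A\, \|\partial^\beta w_1\|^p_{L^{2\frac{s-|\alpha|}{|\beta|}}}\, \|\partial^\gamma w_2\|^q_{L^{2\frac{s-|\alpha|}{|\gamma|}}} \le C_{s,p,R}\, A\, \|w\|_{H^s}. \]
For the term for which $\alpha = \mu$, we note that since $h(x,0) = 0$ for $x \in \mathbb{R}^d$, then $(\partial_x^\alpha h)(x,0) = 0$ for any $x \in \mathbb{R}^d$, so that if $w \in B_R \subset \mathbb{R}^2$,
\[ \big| (\partial_x^\alpha h)(x,w) \big| \le A|w|, \]
which implies
\[ \big\| (\partial_x^\alpha h)(x,w(x)) \big\|_{L^2} \le A \|w\|_{L^2} \le A \|w\|_{H^s}. \]
Combining these two estimates gives
\[ \big\| \partial^\mu \big( h(x,w(x)) \big) \big\|_{L^2} \le C_{s,p,R}\, A\, \|w\|_{H^s}, \]
and the proof of the lemma is complete.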
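As an illustration, the first nontrivial case $s = 1$ of Lemma 5 can be written out by hand. The block below is a sketch: the index $j$ and the explicit factor $\sqrt{2}$ are introduced here for illustration, the factor coming from bounding the gradient in $w \in \mathbb{R}^2$ by its two partial derivatives, and being harmless for the statement of the lemma.

```latex
% requires amsmath
% Chain rule for one derivative in x_j, with w = (w_1, w_2):
\[
\partial_{x_j} \big( h(x,w(x)) \big)
  = (\partial_{x_j} h)(x,w(x)) + \sum_{k=1}^{2} (\partial_{w_k} h)(x,w(x))\, \partial_{x_j} w_k(x).
\]
% First term: (\partial_{x_j} h)(x,0) = 0 and |\partial_{w_k} \partial_{x_j} h| <= A on B_R, so
\[
\big| (\partial_{x_j} h)(x,w) \big| \le \sqrt{2}\, A\, |w| \quad \text{for } w \in B_R .
\]
% Second term: |\partial_{w_k} h| <= A on B_R, so
\[
\Big| \sum_{k=1}^{2} (\partial_{w_k} h)(x,w)\, \partial_{x_j} w_k \Big| \le \sqrt{2}\, A\, |\partial_{x_j} w| .
\]
% Taking L^2 norms:
\[
\big\| \partial_{x_j} \big( h(x,w(x)) \big) \big\|_{L^2}
  \le \sqrt{2}\, A \big( \|w\|_{L^2} + \|\partial_{x_j} w\|_{L^2} \big)
  \le 2\sqrt{2}\, A\, \|w\|_{H^1} .
\]
```

In this case no interpolation is needed; the Gagliardo-Nirenberg inequality only enters for $s \ge 2$, where products of derivatives of $w$ appear.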



References

1. Alazard, T., Carles, R.: Loss of regularity for supercritical nonlinear Schrödinger equations. Math. Ann. 343(2), 397–420 (2009)
2. Alazard, T., Carles, R.: Supercritical geometric optics for nonlinear Schrödinger equations. Arch. Rat. Mech. Anal., to appear, doi:10.1007/s00205-008-0176-7, 2008
3. Anton, R.: Global existence for defocusing cubic NLS and Gross-Pitaevskii equations in exterior domains. J. Math. Pures Appl. (9) 89(4), 335–354 (2008)
4. Brenier, Y.: Convergence of the Vlasov-Poisson system to the incompressible Euler equations. Comm. Part. Diff. Eqs. 25(3-4), 737–754 (2000)
5. Cazenave, T.: Semilinear Schrödinger equations. Courant Lecture Notes in Mathematics, Vol. 10. New York: New York University, Courant Institute of Mathematical Sciences, 2003
6. Colliander, J., Keel, M., Staffilani, G., Takaoka, H., Tao, T.: Global well-posedness and scattering for the energy-critical nonlinear Schrödinger equation in R3. Ann. of Math. (2) 167(3), 767–865 (2008)
7. Gérard, P.: Remarques sur l’analyse semi-classique de l’équation de Schrödinger non linéaire. Séminaire sur les Equations aux Dérivées Partielles, Ecole Polytechnique, Palaiseau, 1992-1993, Exp. No. XIII, 13 pp.
8. Ginibre, J., Velo, G.: The global Cauchy problem for the nonlinear Schrödinger equation revisited. Ann. Inst. H. Poincaré Anal. Non Linéaire 2(4), 309–327 (1985)
9. Grenier, E.: Semiclassical limit of the nonlinear Schrödinger equation in small time. Proc. Amer. Math. Soc. 126(2), 523–530 (1998)
10. Grenier, E.: On the derivation of homogeneous hydrostatic equations. M2AN Math. Model. Numer. Anal. 33(5), 965–970 (1999)
11. Grenier, E., Guès, O.: Boundary layers of viscous perturbations of noncharacteristic quasilinear hyperbolic problems. J. Diff. Eqs. 143(1), 110–146 (1998)
12. Kivshar, Y.S., Luther-Davies, B.: Dark optical solitons: physics and applications. Physics Reports 298, 81–197 (1998)
13. Kolomeisky, E.B., Newman, T.J., Straley, J.P., Qi, X.: Low-Dimensional Bose Liquids: Beyond the Gross-Pitaevskii Approximation. Phys. Rev. Lett. 85, 1146–1149 (2000)
14. Lin, F., Zhang, P.: Semiclassical limit of the Gross-Pitaevskii equation in an exterior domain. Arch. Rat. Mech. Anal. 179(1), 79–107 (2006)
15. Makino, T., Ukai, S., Kawashima, S.: Sur la solution à support compact de l’équation d’Euler compressible. Japan J. Appl. Math. 3(2), 249–257 (1986)
16. Pham, C.-T., Nore, C., Brachet, M.-E.: Boundary layers and emitted excitations in nonlinear Schrödinger superflow past a disk. Phys. D 210(3-4), 203–226 (2005)
17. Rauch, J.: Symmetric positive systems with boundary characteristic of constant multiplicity. Trans. Amer. Math. Soc. 291(1), 167–187 (1985)
18. Taylor, M.: Partial Differential Equations. (III), Applied Mathematical Sciences, 117. New-York: Springer-Verlag, 1997
19. Zhang, P.: Semiclassical limit of nonlinear Schrödinger equation. II. J. Part. Diff. Eqs. 15(2), 83–96 (2002)

Communicated by I. M. Sigal




Rough Solutions of the Einstein Constraints on Closed Manifolds without Near-CMC Conditions

Michael Holst, Gabriel Nagy, Gantumur Tsogtgerel

Department of Mathematics, University of California San Diego, La Jolla, CA 92093, USA. E-mail: [email protected]; [email protected]; [email protected]

Received: 12 April 2008 / Accepted: 15 October 2008
Published online: 26 February 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: We consider the conformal decomposition of Einstein’s constraint equations introduced by Lichnerowicz and York, on a closed manifold. We establish existence of non-CMC weak solutions using a combination of a priori estimates for the individual Hamiltonian and momentum constraints, barrier constructions and fixed-point techniques for the Hamiltonian constraint, Riesz-Schauder theory for the momentum constraint, together with a topological fixed-point argument for the coupled system. Although we present general existence results for non-CMC weak solutions when the rescaled background metric is in any of the three Yamabe classes, an important new feature of the results we present for the positive Yamabe class is the absence of the near-CMC assumption, if the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields are sufficiently small, and if the energy density of matter is not identically zero. In this case, the mean extrinsic curvature can be taken to be an arbitrary smooth function without restrictions on the size of its spatial derivatives, so that it can be arbitrarily far from constant, giving what is apparently the first existence results for non-CMC solutions without the near-CMC assumption. Using a coupled topological fixed-point argument that avoids near-CMC conditions, we establish existence of coupled non-CMC weak solutions with (positive) conformal factor φ ∈ W s, p , where p ∈ (1, ∞) and s( p) ∈ (1 + 3/ p, ∞). In the CMC case, the regularity can be reduced to p ∈ (1, ∞) and s( p) ∈ (3/ p, ∞) ∩ [1, ∞). In the case of s = 2, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case p = 2, we reproduce the CMC existence results of Maxwell [33], but with a proof that goes through the same analysis framework that we use to obtain the non-CMC results. 
The non-CMC results on closed manifolds here extend the 1996 non-CMC result of Isenberg and Moncrief in three ways: (1) the near-CMC assumption is removed in the case of the positive Yamabe class; (2) regularity is extended down to the maximum allowed by the background metric and the matter; and (3) the result holds for all three Yamabe classes. This last extension was also accomplished recently by Allen, Clausen and Isenberg, although their result is restricted to the near-CMC case and to smoother background metrics and data.

Supported in part by NSF Awards 0715146, 0411723, and 0511766, and DOE Awards DE-FG02-05ER25707 and DE-FG02-04ER25620.
Supported in part by NSF Awards 0715146 and 0411723.

Contents

1. Introduction
2. Preliminary Material
3. Overview of the Main Results
4. Weak Solution Results for the Individual Constraints
5. Barriers for the Hamiltonian Constraint
6. Proof of the Main Results
7. Summary
A. Some Key Technical Tools and Some Supporting Results

1. Introduction In this article, we give an analysis of the coupled Hamiltonian and momentum constraints in the Einstein equations on a 3-dimensional closed manifold. We consider the equations with matter sources satisfying an energy condition implied by the dominant energy condition in the 4-dimensional spacetime; the unknowns are a Riemannian three-metric and a two-index symmetric tensor. The equations form an under-determined system; therefore, we focus entirely on a standard reformulation used in both mathematical and numerical general relativity, called the conformal method, introduced by Lichnerowicz and York [32,49,50]. The conformal method assumes that the unknown metric is known up to a scalar field called a conformal factor, and also assumes that the trace and a term proportional to the trace-free divergence-free part of the two-index symmetric tensor is known, leaving as unknown a term proportional to the traceless symmetrized derivative of a vector. Therefore, the new unknowns are a scalar and a vector field, transforming the original under-determined system for a metric and a symmetric tensor into a (potentially) well-posed elliptic system for a scalar and a vector field. See [5] for a recent review article. The question of existence of solutions to the Lichnerowicz-York conformally rescaled Einstein’s constraint equations, for an arbitrarily prescribed mean extrinsic curvature, has remained an open problem for more than thirty years. The rescaled equations, which are a coupled nonlinear elliptic system consisting of the scalar Hamiltonian constraint coupled to the vector momentum constraint, have been studied almost exclusively in the setting of constant mean extrinsic curvature, known as the CMC case. In the CMC case the equations decouple, and it has long been known how to establish existence of solutions. 
The case of CMC data on closed (compact without boundary) manifolds was completely resolved by several authors over the last twenty years, with the last remaining sub-cases resolved and all the CMC sub-cases on closed manifolds summarized by Isenberg in [25]. Over the last ten years, other CMC cases on different types of manifolds containing various kinds of matter fields were studied and partially or completely resolved; see the survey [5]. We take a moment to point out just some of the quite substantial number of works in this area, including: the original work on the Lichnerowicz equation [32]; the development of the conformal method [49–52]; the initial solution theory for the Hamiltonian constraint [39–41]; the thin sandwich alternative


to the conformal method [4,37]; the complete classification of CMC initial data [25] and the few known non-CMC results [11,26,28]; various technical results on transversetraceless tensors and the conformal Killing operator [6,8]; the more recent development of the conformal thin sandwich formulation [53]; initial data for black holes [7,9]; initial data for Kerr-like black holes [13,14]; initial data with trapped surface boundaries [15,34]; rough solution theory for CMC initial data [10,33,35]; and the gluing approach to generating initial data [12]. A survey of many of these results appears in [5]. On the other hand, the question of existence of solutions to the Einstein constraint equations for non-constant mean extrinsic curvature (the “non-CMC case”) has remained largely unanswered, with progress made only in the case that the mean extrinsic curvature is nearly constant (the “near-CMC case”), in the sense that the size of its spatial derivatives is sufficiently small. The near-CMC condition leaves the constraint equations coupled, but ensures the coupling is weak. In [26], Isenberg and Moncrief established the first existence (and uniqueness) result in the near-CMC case, for background metric having negative Ricci scalar. Their result was based on a fixed-point argument, together with the use of iteration barriers (sub- and super-solutions) which were shown to be bounded above and below by fixed positive constants, independent of the iteration. We note that both the fixed-point argument and the global barrier construction in [26] rely critically on the near-CMC assumption. All subsequent non-CMC existence results are based on the framework in [26] and are thus limited to the near-CMC case (see the survey [5], the non-existence results in [27], and also the newer existence results in [1] for non-negative Yamabe classes). 
This article presents (together with the brief overview in [22]) the first non-CMC existence results for the Einstein constraints that do not require the near-CMC assumption. Two recent advances make this possible: A new topological fixed-point argument (established here and in [21]) and a new global super-solution construction for the Hamiltonian constraint (established here and in [22]) that are both free of near-CMC conditions. These two results allow us to establish existence of non-CMC solutions for conformal background metrics in the positive Yamabe class, with the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields sufficiently small, and with the matter energy density not identically zero. Our results here and in [21,22] can be viewed as reducing the remaining open questions of existence of non-CMC (weak and strong) solutions without near-CMC conditions to two more basic and clearly stated open problems: (1) Existence of near-CMC-free global super-solutions for the Hamiltonian constraint equation when the background metric is in the non-positive Yamabe classes and for large data; and (2) existence of near-CMCfree global sub-solutions for the Hamiltonian constraint equation when the background metric is in the positive Yamabe class in vacuum (without matter). We will make some further comments about this later in the paper. Our results in this article, which can be viewed as pushing forward the rough solutions program that was initiated by Maxwell in [33,35] (see also [10]), further extend the known solution theory for the Einstein constraint equations on closed manifolds in several directions: (i) Far-from-CMC Weak Solutions: We establish the first existence results (Theorem 1) for the coupled Einstein constraints in the non-CMC setting without the near-CMC condition. 
In particular, if the rescaled background metric is in the positive Yamabe class, if the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields are sufficiently small, and if the energy density of matter is not identically zero, then we show existence of non-CMC solutions with mean extrinsic curvature arbitrarily far from constant. Two advances in the analysis of the Einstein constraint equations make this result possible: A topological fixed-point argument (Theorems 4 and 5) based on compactness arguments rather than k-contractions that is free of near-CMC conditions, and constructions of global barriers for the Hamiltonian constraint that are similarly free of the near-CMC condition (Lemmas 7, 8, 9, 13, and 14).

(ii) Near-CMC Weak Solutions: We establish existence results (Theorem 2) for non-CMC solutions to the coupled constraints under the near-CMC condition in the setting of weaker (rougher) solution spaces and for more general physical scenarios than appeared previously in [26,1]. In particular, we establish existence of weak solutions to the coupled Hamiltonian and momentum constraints on closed manifolds for all three Yamabe classes, with (positive) conformal factor $\phi \in W^{s,p}$, where $p \in (1,\infty)$ and $s(p) \in (1 + 3/p, \infty)$. These results are based on combining barriers, a priori estimates, and other results for the individual constraints together with a new type of topological fixed-point argument (Theorems 4 and 5), and are established in the presence of a weak background metric and data meeting very low regularity requirements.

(iii) CMC Weak Solutions: In the CMC case, we establish existence (Theorem 3) of weak solutions to the un-coupled Hamiltonian and momentum constraints on closed manifolds for all three Yamabe classes, with (positive) conformal factor $\phi \in W^{s,p}$, where $p \in (1,\infty)$ and $s(p) \in (3/p, \infty) \cap [1,\infty)$. In the case of $s = 2$, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case $p = 2$, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5). Again, these results are established in the presence of a weak background metric and with data meeting very low regularity requirements.

(iv) Barrier Constructions: We give constructions (Lemmas 9 and 13) of weak global sub- and super-solutions (barriers) for the Hamiltonian constraint equation which are free of the near-CMC condition. The constructions require the assumption that the freely specifiable part of the data given by the traceless-transverse part of the rescaled extrinsic curvature and the matter fields are sufficiently small (required for the super-solution construction in Lemma 9) and that the energy density of matter is not identically zero (required for the sub-solution construction in Lemma 13, although we note this can be relaxed using the technique in [1]). While near-CMC-free sub-solutions are common in the literature, our near-CMC-free super-solution constructions appear to be the first such results of this type.

(v) Supporting Technical Tools: We assemble a number of new supporting technical results in the body of the paper and in several appendices, including: topological fixed-point arguments designed for the Einstein constraints; construction and properties of general Sobolev classes $W^{s,p}$ and elliptic operators on closed manifolds with weak metrics; the development of a very weak solution theory for the momentum constraint; a priori $L^\infty$-estimates for weak $W^{1,2}$-solutions to the Hamiltonian constraint; Yamabe classification of non-smooth metrics in general Sobolev classes $W^{s,p}$; and an analysis of the connection between conformal rescaling and the near-CMC condition.

The results in this paper imply that the weakest differentiable solutions of the Einstein constraint equations we have found correspond to CMC and non-CMC hypersurfaces with physical spatial metric $h_{ab}$ satisfying
\[ h_{ab} \in W^{s,p}(M), \qquad p \in (1,\infty), \qquad s(p) \in \Big( 1 + \frac{3}{p}, \infty \Big). \tag{1.1} \]
The curvature of such metrics can be computed in a distributional sense, following [17]. In the CMC case, the regularity can be reduced to
\[ h_{ab} \in W^{s,p}(M), \qquad p \in (1,\infty), \qquad s(p) \in \Big( \frac{3}{p}, \infty \Big) \cap [1,\infty). \tag{1.2} \]
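To make the two regularity ranges concrete, they can be evaluated at the particular values $p = 2$ and $s = 2$. The block below is plain arithmetic on (1.1) and (1.2) and adds no assumption beyond those formulas.

```latex
% requires amsmath
% Non-CMC range (1.1) at p = 2:
\[
s > 1 + \tfrac{3}{2} = \tfrac{5}{2}, \qquad \text{i.e. } h_{ab} \in W^{s,2}(M) \text{ with } s > \tfrac52 .
\]
% CMC range (1.2) at p = 2 (the constraint s >= 1 is then automatic):
\[
s \in \big( \tfrac{3}{2}, \infty \big) \cap [1,\infty) = \big( \tfrac{3}{2}, \infty \big).
\]
% CMC range (1.2) at s = 2:
\[
\tfrac{3}{p} < 2 \iff p > \tfrac{3}{2}, \qquad \text{i.e. } h_{ab} \in W^{2,p}(M) \text{ with } p > \tfrac32 .
\]
```

Note also that the condition $s \ge 1$ in (1.2) is only active for $p > 3$, where $3/p < 1$.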

In the case s = 2, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case p = 2, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5). In this paper we do not include uniqueness statements on CMC solutions, or necessary and sufficient conditions for the existence of CMC solutions; however, we expect that the techniques used in the above mentioned works can be adapted to this setting without difficulty. There are several related motivations for establishing the extensions outlined above. First, as outlined in [5], new results for the non-CMC case, beyond the case analyzed in [1,26], are of great interest in both mathematical and numerical relativity. Non-CMC results that are free of the near-CMC assumption are of particular interest, since the existence of solutions in this case has been an open question for more than thirty years. Second, there is currently substantial research activity in rough solutions to the Einstein evolution equations, which rest on rough/weak solution results for the initial data [30]. Third, the approximation theory for Petrov-Galerkin-type methods (including finite element, wavelet, spectral, and other methods) for the constraints and similar systems previously developed in [20] establishes convergence of numerical solutions in very general physical situations, but rests on assumptions about the solution theory; the results in the present paper and in [21], help to complete this approximation theory framework. Similarly, very recent results on convergence of adaptive methods for the constraints in [23,24] rest in large part on the collection of results here and in [20,21]. An extended outline of the paper is as follows. In Sect. 2, we summarize the conformal decomposition of Einstein’s constraint equations introduced by Lichnerowicz and York, on a closed manifold. 
We describe the classical strong formulation of the resulting coupled elliptic system, and then define weak formulations of the constraint equations that will allow us to develop solution theories for the constraints in the spaces with the weakest possible regularity. After setting up the basic notation, we give an overview of our main results in Sect. 3, summarized in three existence theorems (Theorems 1, 2, and 3) for weak far-from-CMC, near-CMC, and CMC solutions to the coupled constraints, extending the known solution theory in several distinct ways as described above. We outline the two recent advances in the analysis of the Einstein constraint equations that make these results possible. The first advance is an abstract coupled topological fixed-point result (Theorems 4 and 5), the proof of which is based directly on compactness rather than on k-contractions. This gives an analysis framework for weak solutions to the constraint equations that is fundamentally free of the near-CMC assumption; the near-CMC assumption then only potentially arises in the construction of global barriers as part of the overall fixed-point argument. A result of this type also makes possible the new non-CMC results for the case of compact manifolds with boundary appearing in [21]. The second new advance is the construction

of global super-solutions for the Hamiltonian constraint that are also free of the nearCMC condition; we give an overview of the main ideas in the constructions, which are then derived rigorously in Sect. 5. In Sect. 4 we then develop the necessary results for the individual constraint equations in order to complete an existence argument for the coupled system based on the abstract fixed-point argument in Theorems 4 and 5. In particular, in Sect. 4.1, we first develop some basic technical results for the momentum constraint operator under weak assumptions on the problem data, including existence of weak solutions to the momentum constraint, given the conformal factor as data. In Sect. 4.2, we assume the existence of barriers (weak sub- and super-solutions) to the Hamiltonian constraint equation forming a nonempty positive bounded interval, and then derive several properties of the Hamiltonian constraint that are needed in the analysis of the coupled system. The results are established under weak assumptions on the problem data, and for any Yamabe class. Using order relations on appropriate Banach spaces, we then derive several such compatible weak global sub- and super-solutions in Sect. 5, based both on constants and on more complex non-constant constructions. While the sub-solutions are similar to those found previously in the literature, some of the super-solutions are new. In particular, we give two super-solution constructions that do not require the near-CMC condition. The first is constant, and requires that the scalar curvature be strictly globally positive. The second is based on a scaled solution to a Yamabe-type problem, and is valid for any background metric in the positive Yamabe class. In Sect. 6, we establish the main results by giving the proofs of Theorems 1, 2, and 3. In particular, using the topological fixed-point argument in Theorem 5, we combine the global barrier constructions in Sect. 5 with the individual constraint results in Sect. 
4 to establish existence of weak non-CMC solutions. We summarize our results in Sect. 7. For ease of exposition, various supporting technical results are given in several appendices as follows: Appendix Sect. A.1 – topological fixed-point arguments; Appendix Sect. A.2 – ordered Banach spaces; Appendix Sect. A.3 – monotone increasing maps; Appendix Sect. A.4 – construction of fractional order Sobolev spaces of sections of vector bundles over closed manifolds; Appendix Sect. A.5 – a priori estimates for elliptic operators; Appendix Sect. A.6 – maximum principles on closed manifolds; Appendix Sect. A.7 – Yamabe classification of weak metrics; Appendix Sect. A.8 – conformal covariance of the Hamiltonian constraint; and Appendix Sect. A.9 – conformal rescaling and the near-CMC condition.

2. Preliminary Material

2.1. Notation and conventions. Let M be an n-dimensional smooth closed manifold. We denote by π : E → M (or simply E → M, or just E) a smooth vector bundle over M, where the manifold M is called the base space, E is called the total space, and π is the bundle projection such that for any x ∈ M, $E_x = \pi^{-1}(x)$ is the fiber over x, which is a vector space of (fiber) dimension $m_x$. If all fibers $E_x$ have dimension $m_x = m$, we say the fiber dimension of E is m. The manifold M itself can be considered as the vector bundle E = M × {0} with fiber dimension m = 0. A section of the trivial vector bundle E = M × R with fiber dimension m = 1 is simply a scalar function on M. Our primary interest is the case where

$$E = T^r_s M = \underbrace{TM \otimes \cdots \otimes TM}_{r \text{ times}} \otimes \underbrace{T^*M \otimes \cdots \otimes T^*M}_{s \text{ times}},$$

the (r, s)-tensor bundle with contravariant order r and covariant order s, giving fiber dimension $m = n^{r+s}$, where $TM$ is the tangent bundle, and $T^*M$ is the co-tangent bundle of M. A $C^k$ section of π (or of E) is a $C^k$ map γ : M → E such that for each x ∈ M, π(γ(x)) = x. These $C^k$ sections form real Banach spaces $C^k(E)$ which arise naturally in the global linear analysis of partial differential equations on manifolds. Let $h_{ab} \in C^\infty(T^0_2 M)$ be a smooth Riemannian metric on M (where by convention Latin indices denote abstract indices, as e.g. in [48]), meaning that it is a symmetric, positive definite, covariant, smooth two-index tensor field on M. The combination $(M, h_{ab})$ is referred to as a (smooth) Riemannian manifold; we will relax the smoothness requirement on $h_{ab}$ below. For each x ∈ M, the metric $h_{ab}(x)$ defines a positive definite inner product on the tangent space $T_x M$ at x. Denote by $h^{ab}$ the inverse of $h_{ab}$, that is, $h_{ac} h^{bc} = \delta_a{}^b$, where $\delta_a{}^b : T_x M \to T_x M$ is the identity map. We use the convention that repeated indices, one upper-index and one sub-index, denote contraction. Indices on tensors will be raised and lowered with $h^{ab}$ and $h_{ab}$, respectively. For example, given the tensor $u^{ab}{}_c$ we denote $u_{abc} = h_{aa_1} h_{bb_1} u^{a_1 b_1}{}_c$, and $u^{abc} = h^{cc_1} u^{ab}{}_{c_1}$; notice that the order of the indices is important in the case that the tensor $u_{abc}$ or $u^{abc}$ is not symmetric. We say that a tensor is of type m iff it can be transformed into a tensor $u_{a_1 \cdots a_m}$ by lowering appropriate indices (its vector bundle then has fiber dimension $n^m$). We now give a brief overview of $L^p$ and Sobolev spaces of sections of vector bundles over closed manifolds in order to introduce the notation used throughout the paper. An overview of the construction of fractional order Sobolev spaces of sections of vector bundles can be found in Appendix A.4, based on Besov spaces and partitions of unity.
The case of the sections of the trivial bundle of scalars can also be found in [19], and the case of tensors can also be found in [42]. Let $\nabla_a$ be the Levi-Civita connection associated with the metric $h_{ab}$, that is, the unique torsion-free connection satisfying $\nabla_a h_{bc} = 0$. Let $R_{abc}{}^d$ be the Riemann tensor of the connection $\nabla_a$, where the sign convention used in this article is $(\nabla_a \nabla_b - \nabla_b \nabla_a) v_c = R_{abc}{}^d v_d$. Denote by $R_{ab} := R_{acb}{}^c$ the Ricci tensor and by $R := R_{ab} h^{ab}$ the Ricci scalar curvature of this connection. Integration on M can be defined with the volume form associated with the metric $h_{ab}$. Given an arbitrary tensor $u^{a_1 \cdots a_r}{}_{b_1 \cdots b_s}$ of type m = r + s, we define a real-valued function measuring its magnitude at any point x ∈ M as

$$|u| := \bigl(u^{a_1 \cdots b_s}\, u_{a_1 \cdots b_s}\bigr)^{1/2}. \tag{2.1}$$

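A norm of an arbitrary tensor field $u^{a_1 \cdots a_r}{}_{b_1 \cdots b_s}$ on M can then be defined for any $1 \le p < \infty$ and for $p = \infty$ respectively using (2.1) as follows:

$$\|u\|_p := \Bigl(\int_M |u|^p \, dx\Bigr)^{1/p}, \qquad \|u\|_\infty := \operatorname*{ess\,sup}_{x \in M} |u|. \tag{2.2}$$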

One way to construct the Lebesgue spaces $L^p(T^r_s M)$ of sections of the (r, s)-tensor bundle, for $1 \le p \le \infty$, is through the completion of $C^\infty(T^r_s M)$ with respect to the $L^p$-norm (2.2). The $L^p$ spaces are Banach spaces, and the case p = 2 is a Hilbert space with the inner product and norm given by

$$(u, v) := \int_M u_{a_1 \cdots a_m} v^{a_1 \cdots a_m} \, dx, \qquad \|u\| := \sqrt{(u, u)} = \|u\|_2. \tag{2.3}$$

Denote covariant derivatives of tensor fields as $\nabla^k u^{a_1 \cdots a_m} := \nabla_{b_1} \cdots \nabla_{b_k} u^{a_1 \cdots a_m}$, where k denotes the total number of derivatives represented by the tensor indices $(b_1, \ldots, b_k)$.

Another norm on $C^\infty(T^r_s M)$ is given for any non-negative integer k and for any $1 \le p \le \infty$ as follows:

$$\|u\|_{k,p} := \sum_{l=0}^{k} \|\nabla^l u\|_p. \tag{2.4}$$

The Sobolev spaces $W^{k,p}(T^r_s M)$ of sections of the (r, s)-tensor bundle can be defined as the completion of $C^\infty(T^r_s M)$ with respect to the $W^{k,p}$-norm (2.4). The Sobolev spaces $W^{k,p}$ are Banach spaces, and the case p = 2 is a Hilbert space. We have $L^p = W^{0,p}$ and $\|\cdot\|_p = \|\cdot\|_{0,p}$. See Appendix A.4 for a more careful construction that includes real order Sobolev spaces of sections of vector bundles. Let $C^\infty_+$ be the set of nonnegative smooth (scalar) functions on M. Then we can define the order cone

$$W^{s,p}_+ := \bigl\{ \phi \in W^{s,p} : \langle \phi, \varphi \rangle \ge 0 \ \ \forall\, \varphi \in C^\infty_+ \bigr\}, \tag{2.5}$$

with respect to which the Sobolev spaces $W^{s,p} = W^{s,p}(M)$ are ordered Banach spaces. Here $\langle \cdot, \cdot \rangle$ is the unique extension of the $L^2$-inner product to a bilinear form $W^{s,p} \otimes W^{-s,p'} \to \mathbb{R}$, with $\frac{1}{p'} + \frac{1}{p} = 1$. The order relation is then $\phi \ge \psi$ iff $\phi - \psi \in W^{s,p}_+$. We note that this order cone is normal only for s = 0. See Appendix A.2, where we review the main properties of ordered Banach spaces.

2.2. The Einstein constraint equations. We give a quick overview of the Einstein constraint equations in general relativity, and then define weak formulations that are fundamental to both solution theory and the development of approximation theory. Analogous material for the case of compact manifolds with boundary can be found in [21]. Let $(\mathcal{M}, g_{\mu\nu})$ be a 4-dimensional spacetime, that is, $\mathcal{M}$ is a 4-dimensional, smooth manifold, and $g_{\mu\nu}$ is a smooth, Lorentzian metric on $\mathcal{M}$ with signature (−, +, +, +). Let $\nabla_\mu$ be the Levi-Civita connection associated with the metric $g_{\mu\nu}$. The Einstein equation is $G_{\mu\nu} = \kappa T_{\mu\nu}$, where $G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu}$ is the Einstein tensor, $T_{\mu\nu}$ is the stress-energy tensor, and $\kappa = 8\pi G/c^4$, with G the gravitation constant and c the speed of light. The Ricci tensor is $R_{\mu\nu} = R_{\mu\sigma\nu}{}^{\sigma}$ and $R = R_{\mu\nu} g^{\mu\nu}$ is the Ricci scalar, where $g^{\mu\nu}$ is the inverse of $g_{\mu\nu}$, that is, $g_{\mu\sigma} g^{\sigma\nu} = \delta_\mu{}^\nu$. The Riemann tensor is defined by $R_{\mu\nu\sigma}{}^{\rho} w_\rho = (\nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu) w_\sigma$, where $w_\mu$ is any 1-form on $\mathcal{M}$. The stress-energy tensor $T_{\mu\nu}$ is assumed to be symmetric and to satisfy the condition $\nabla_\mu T^{\mu\nu} = 0$ and the dominant energy condition, that is, the vector $-T^{\mu\nu} v_\nu$ is timelike and future-directed, where $v^\mu$ is any timelike and future-directed vector field. In this section Greek indices µ, ν, σ, ρ denote abstract spacetime indices, that is, tensorial character on the 4-dimensional manifold $\mathcal{M}$. They are raised and lowered with $g^{\mu\nu}$ and $g_{\mu\nu}$, respectively. Latin indices a, b, c, d will denote tensorial character on a 3-dimensional manifold.
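The map $t : \mathcal{M} \to \mathbb{R}$ is a time function iff the function t is differentiable and the vector field $-\nabla^\mu t$ is a timelike, future-directed vector field on $\mathcal{M}$. Introduce the hypersurface $M := \{x \in \mathcal{M} : t(x) = 0\}$, and denote by $n_\mu$ the unit 1-form orthogonal to M. By definition of M the form $n_\mu$ can be expressed as $n_\mu = -\alpha \nabla_\mu t$, where α, called the lapse function, is the positive function such that $n_\mu n_\nu g^{\mu\nu} = -1$. Let $\hat h_{\mu\nu}$ and $\hat k_{\mu\nu}$ be the first and second fundamental forms of M, that is,

$$\hat h_{\mu\nu} := g_{\mu\nu} + n_\mu n_\nu, \qquad \hat k_{\mu\nu} := -\hat h_\mu{}^{\sigma} \nabla_\sigma n_\nu,$$

where the plus sign in $\hat h_{\mu\nu}$ is the one compatible with the signature (−, +, +, +) and the normalization $n_\mu n^\mu = -1$, so that $\hat h_{\mu\nu} n^\nu = 0$. The Einstein constraint equations on M are given by $(G_{\mu\nu} - \kappa T_{\mu\nu})\, n^\nu = 0$. A well known calculation allows us to express these equations involving tensors on $\mathcal{M}$ as equations involving intrinsic tensors on M. The result is the following equations:

$${}^3\hat R + \hat k^2 - \hat k_{ab} \hat k^{ab} - 2\kappa \hat\rho = 0, \tag{2.6}$$
$$\hat D^a \hat k - \hat D_b \hat k^{ab} + \kappa \hat j^a = 0, \tag{2.7}$$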

where the tensors $\hat h_{ab}$, $\hat k_{ab}$, $\hat j_a$ and $\hat\rho$ on a 3-dimensional manifold are the pull-backs on M of the tensors $\hat h_{\mu\nu}$, $\hat k_{\mu\nu}$, $\hat j_\mu$ and $\hat\rho$ on the 4-dimensional spacetime manifold. We have introduced the energy density $\hat\rho := n_\mu n_\nu T^{\mu\nu}$ and the momentum current density $\hat j_\mu := -\hat h_{\mu\nu} n_\sigma T^{\nu\sigma}$. We have denoted by $\hat D_a$ the Levi-Civita connection associated to $\hat h_{ab}$, so $(M, \hat h_{ab})$ is a 3-dimensional Riemannian manifold, with $\hat h_{ab}$ having signature (+, +, +), and we use the notation $\hat h^{ab}$ for the inverse of the metric $\hat h_{ab}$. Indices have been raised and lowered with $\hat h^{ab}$ and $\hat h_{ab}$, respectively. We have also denoted by ${}^3\hat R$ the Ricci scalar curvature of the metric $\hat h_{ab}$. Finally, recall that the constraint Eqs. (2.6)-(2.7) are indeed equations on $\hat h_{ab}$ and $\hat k_{ab}$ due to the matter fields satisfying the energy condition $-\hat\rho^2 + \hat j_a \hat j^a \le 0$ (with strict inequality holding at points on M where $\hat\rho \ne 0$; see [48]), which is implied by the dominant energy condition on the stress-energy tensor $T^{\mu\nu}$ in spacetime.

2.3. Conformal transverse traceless decomposition. Let φ denote a positive scalar field on M, and decompose the extrinsic curvature tensor $\hat k^{ab} = \hat l^{ab} + \frac{1}{3} \hat h^{ab} \hat\tau$, where $\hat\tau := \hat k_{ab} \hat h^{ab}$ is the trace and $\hat l^{ab}$ is the traceless part of the extrinsic curvature tensor. Then, introduce the following conformal re-scaling:

$$\hat h_{ab} =: \phi^4 h_{ab}, \quad \hat l^{ab} =: \phi^{-10} l^{ab}, \quad \hat\tau =: \tau, \quad \hat j^a =: \phi^{-10} j^a, \quad \hat\rho =: \phi^{-8} \rho. \tag{2.8}$$

We have introduced the Riemannian metric $h_{ab}$ on the 3-dimensional manifold M, which determines the Levi-Civita connection $D_a$, and so we have that $D_a h_{bc} = 0$. We have also introduced the symmetric, traceless tensor $l_{ab}$, and the non-physical matter sources $j^a$ and ρ. The different powers of the conformal re-scaling above are carefully chosen so that the constraint Eqs. (2.6)-(2.7) transform into the following equations:

$$-8\Delta\phi + {}^3R\,\phi + \tfrac{2}{3}\tau^2 \phi^5 - l_{ab} l^{ab} \phi^{-7} - 2\kappa\rho\,\phi^{-3} = 0, \tag{2.9}$$
$$-D_b l^{ab} + \tfrac{2}{3}\phi^6 D^a \tau + \kappa j^a = 0, \tag{2.10}$$

where in the equations above, and from now on, indices of unhatted fields are raised and lowered with $h^{ab}$ and $h_{ab}$ respectively. We have also introduced the Laplace-Beltrami operator with respect to the metric $h_{ab}$, acting on smooth scalar fields; it is defined as follows:

$$\Delta\phi := h^{ab} D_a D_b \phi. \tag{2.11}$$

Equations (2.9)–(2.10) can be obtained by a straightforward albeit long computation. In order to perform this calculation it is useful to recall that both $\hat D_a$ and $D_a$ are connections on the manifold M, and so they differ by a tensor field $C_{ab}{}^c$, which can be computed explicitly in terms of φ, and has the form $C_{ab}{}^c = 4\delta_{(a}{}^c D_{b)} \ln(\phi) - 2 h_{ab} h^{cd} D_d \ln(\phi)$. We remark that the power four on the re-scaling of the metric $\hat h_{ab}$ and M being 3-dimensional imply that ${}^3\hat R = \phi^{-5}({}^3R\,\phi - 8\Delta\phi)$, or in other words, that φ satisfies the Yamabe-type problem:

$$-8\Delta\phi + {}^3R\,\phi - {}^3\hat R\,\phi^5 = 0, \qquad \phi > 0, \tag{2.12}$$

where ${}^3\hat R$ represents the scalar curvature corresponding to the physical metric $\hat h_{ab} = \phi^4 h_{ab}$. Note that for any other power in the re-scaling, terms proportional to $h^{ab} (D_a \phi)(D_b \phi)/\phi^2$ appear in the transformation. The set of all metrics on a closed manifold can be classified into the three disjoint Yamabe classes $Y^+(M)$, $Y^0(M)$, and $Y^-(M)$, corresponding to whether one can conformally transform the metric into a metric with strictly positive, zero, or strictly negative scalar curvature, respectively, cf. [31] (see also Appendix A.7). We note that the Yamabe problem is to determine, for a given metric $h_{ab}$, whether there exists a conformal transformation φ solving (2.12) such that ${}^3\hat R = \text{const}$. Arguments similar to those above for φ force the power negative ten on the re-scaling of the tensors $\hat l^{ab}$ and $\hat j^a$, so that terms proportional to $(D_a \phi)/\phi$ cancel out in (2.10). Finally, the ratio between the conformal re-scaling powers of $\hat\rho$ and $\hat j^a$ is chosen such that the inequality $-\rho^2 + h_{ab} j^a j^b \le 0$ implies the inequality $-\hat\rho^2 + \hat h_{ab} \hat j^a \hat j^b \le 0$. For a complete discussion of all possible choices of re-scaling powers, see Appendix A.9. There is one more step to convert the original constraint equations (2.6)-(2.7) into a determined elliptic system of equations. This step is the following: Decompose the symmetric, traceless tensor $l^{ab}$ into a divergence-free part $\sigma^{ab}$, and the symmetrized and traceless gradient of a vector, that is, $l^{ab} =: \sigma^{ab} + (\mathcal{L}w)^{ab}$, where $D_a \sigma^{ab} = 0$ and we have introduced the conformal Killing operator $\mathcal{L}$ acting on smooth vector fields and defined as follows:

$$(\mathcal{L}w)^{ab} := D^a w^b + D^b w^a - \tfrac{2}{3}(D_c w^c)\, h^{ab}. \tag{2.13}$$
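
The scalar curvature identity behind (2.12) can be checked in a few lines. The sketch below assumes the standard formula for the change of scalar curvature under $\hat h_{ab} = e^{2f} h_{ab}$ in dimension three; the auxiliary function $f := 2\ln\phi$ and the shorthand $|D\phi|^2 := h^{ab}(D_a\phi)(D_b\phi)$ are introduced here only for this check and are not notation from the paper.

```latex
% Assumption: standard conformal-change formula for scalar curvature, n = 3,
% with \hat h_{ab} = e^{2f} h_{ab} and the auxiliary choice f := 2 \ln\phi.
\begin{align*}
{}^3\hat R &= e^{-2f}\bigl({}^3R - 4\,\Delta f - 2\,|Df|^2\bigr), \\
\Delta f &= \frac{2\,\Delta\phi}{\phi} - \frac{2\,|D\phi|^2}{\phi^2},
\qquad
|Df|^2 = \frac{4\,|D\phi|^2}{\phi^2}, \\
{}^3\hat R &= \phi^{-4}\Bigl({}^3R - \frac{8\,\Delta\phi}{\phi}
   + \frac{8\,|D\phi|^2}{\phi^2} - \frac{8\,|D\phi|^2}{\phi^2}\Bigr)
   = \phi^{-5}\bigl({}^3R\,\phi - 8\,\Delta\phi\bigr).
\end{align*}
```

The two gradient terms cancel exactly for the exponent four, which is the reason no terms proportional to $|D\phi|^2/\phi^2$ survive; multiplying the last line by $\phi^5$ gives (2.12).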

Therefore, the constraint Eqs. (2.6)-(2.7) are transformed by the conformal re-scaling into the following equations:

$$-8\Delta\phi + {}^3R\,\phi + \tfrac{2}{3}\tau^2\phi^5 - [\sigma_{ab} + (\mathcal{L}w)_{ab}][\sigma^{ab} + (\mathcal{L}w)^{ab}]\,\phi^{-7} - 2\kappa\rho\,\phi^{-3} = 0, \tag{2.14}$$
$$-D_b (\mathcal{L}w)^{ab} + \tfrac{2}{3}\phi^6 D^a\tau + \kappa j^a = 0. \tag{2.15}$$

In the next section we interpret these equations as partial differential equations for the scalar field φ and the vector field $w^a$, while the rest of the fields are considered as given fields. Given a solution φ and $w^a$ of Eqs. (2.14)-(2.15), the physical metric $\hat h_{ab}$ and extrinsic curvature $\hat k^{ab}$ of the hypersurface M are given by

$$\hat h_{ab} = \phi^4 h_{ab}, \qquad \hat k^{ab} = \phi^{-10}[\sigma^{ab} + (\mathcal{L}w)^{ab}] + \tfrac{1}{3}\phi^{-4}\tau\, h^{ab},$$

while the matter fields are given by Eq. (2.8). From this point forward, for simplicity we will denote the Levi-Civita connection of the metric $h_{ab}$ on the 3-dimensional manifold M as $\nabla_a$ rather than $D_a$, and the Ricci scalar of $h_{ab}$ will be denoted by R instead of ${}^3R$. Let (M, h) be a 3-dimensional Riemannian manifold, where M is a smooth, compact manifold without boundary, and $h \in C^\infty(T^0_2 M)$ is a positive definite metric. With the shorthands $C^\infty = C^\infty(M \times \mathbb{R})$ and $\mathbf{C}^\infty = C^\infty(TM)$, let $L : C^\infty \to C^\infty$ and $\mathbb{L} : \mathbf{C}^\infty \to \mathbf{C}^\infty$ be the operators with actions on $\phi \in C^\infty$ and $\mathbf{w} \in \mathbf{C}^\infty$ given by

$$L\phi := -\Delta\phi, \tag{2.16}$$
$$(\mathbb{L}\mathbf{w})^a := -\nabla_b (\mathcal{L}\mathbf{w})^{ab}, \tag{2.17}$$

where Δ denotes the Laplace-Beltrami operator defined in (2.11), and where $\mathcal{L}$ denotes the conformal Killing operator defined in (2.13). We will also use the index-free notation $\mathbb{L}\mathbf{w}$ and $\mathcal{L}\mathbf{w}$. The freely specifiable functions of the problem are a scalar function τ, interpreted as the trace of the physical extrinsic curvature; a symmetric, traceless, and divergence-free, contravariant, two-index tensor σ; the non-physical energy density ρ and the non-physical momentum current density vector $\mathbf{j}$ subject to the requirement $-\rho^2 + \mathbf{j} \cdot \mathbf{j} \le 0$. The term non-physical refers here to a conformally rescaled field, while physical refers to a conformally non-rescaled term. The requirement on ρ and $\mathbf{j}$ mentioned above and the particular conformal rescaling used in the semi-decoupled decomposition imply that the same inequality is satisfied by the physical energy and momentum current densities. This is a necessary condition (although not sufficient) in order that the matter sources in spacetime satisfy the dominant energy condition. The definition of various energy conditions can be found in [48, p. 219]. Introduce the non-linear operators $F : C^\infty \times \mathbf{C}^\infty \to C^\infty$ and $\mathbb{F} : C^\infty \to \mathbf{C}^\infty$ given by

$$F(\phi, \mathbf{w}) = a_\tau \phi^5 + a_R \phi - a_\rho \phi^{-3} - a_{\mathbf{w}} \phi^{-7}, \qquad \mathbb{F}(\phi) = \mathbf{b}_\tau \phi^6 + \mathbf{b}_j,$$

where the coefficient functions are defined as follows:

$$a_\tau := \tfrac{1}{12}\tau^2, \quad a_R := \tfrac{1}{8} R, \quad a_\rho := \tfrac{\kappa}{4}\rho, \quad a_{\mathbf{w}} := \tfrac{1}{8}(\sigma + \mathcal{L}\mathbf{w})_{ab}(\sigma + \mathcal{L}\mathbf{w})^{ab}, \quad b_\tau^a := \tfrac{2}{3}\nabla^a \tau, \quad b_j^a := \kappa j^a. \tag{2.18}$$

Notice that the scalar coefficients $a_\tau$, $a_{\mathbf{w}}$, and $a_\rho$ are non-negative, while there is no sign restriction on $a_R$. With these notations, the classical formulation (or the strong formulation) of the coupled Einstein constraint equations reads: Given the freely specifiable smooth functions τ, σ, ρ, and $\mathbf{j}$ in M, find a scalar field φ and a vector field $\mathbf{w}$ in M solving the system

$$L\phi + F(\phi, \mathbf{w}) = 0 \quad \text{and} \quad \mathbb{L}\mathbf{w} + \mathbb{F}(\phi) = 0 \quad \text{in } M. \tag{2.19}$$
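
The coefficients in (2.18) are just a normalization of (2.14)-(2.15). As a quick consistency check (a sketch only, writing $\mathcal{L}$ for the conformal Killing operator of (2.13), $\mathbb{L}$ for the operator in (2.17), R for the scalar curvature previously written ${}^3R$, and $\nabla$ for $D$), dividing (2.14) by 8 and reading off (2.15) gives:

```latex
% Consistency check of (2.18): (2.14) divided by 8, and (2.15) unchanged.
\begin{align*}
0 &= \tfrac{1}{8}\Bigl(-8\Delta\phi + R\,\phi + \tfrac{2}{3}\tau^2\phi^5
     - (\sigma+\mathcal{L}w)_{ab}(\sigma+\mathcal{L}w)^{ab}\,\phi^{-7}
     - 2\kappa\rho\,\phi^{-3}\Bigr) \\
  &= \underbrace{-\Delta\phi}_{L\phi}
     + \underbrace{\tfrac{1}{12}\tau^2}_{a_\tau}\phi^5
     + \underbrace{\tfrac{1}{8}R}_{a_R}\,\phi
     - \underbrace{\tfrac{\kappa}{4}\rho}_{a_\rho}\,\phi^{-3}
     - \underbrace{\tfrac{1}{8}(\sigma+\mathcal{L}w)_{ab}(\sigma+\mathcal{L}w)^{ab}}_{a_w}\,\phi^{-7}, \\
0 &= \underbrace{-\nabla_b(\mathcal{L}w)^{ab}}_{(\mathbb{L}w)^a}
     + \underbrace{\tfrac{2}{3}\nabla^a\tau}_{b_\tau^a}\,\phi^6
     + \underbrace{\kappa j^a}_{b_j^a}.
\end{align*}
```

The first identity is the Hamiltonian part of (2.19) and the second is the momentum part.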

2.4. Formulation in Sobolev spaces. We now outline a formulation of the Einstein constraint equations that involves the weakest regularity of the equation coefficients such that the equation itself is well-defined. So in particular, the operators $L$ and $\mathbb{L}$ are no longer differential operators sending smooth sections to smooth sections. We shall employ Sobolev spaces to quantify smoothness, cf. Appendix A.4. Let (M, h) be a 3-dimensional Riemannian manifold, where M is a smooth, compact manifold without boundary, and with $p \in (\frac{3}{2}, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap [1, 2]$, $h \in W^{s,p}(T^0_2 M)$ is a positive definite metric. Note that the restriction $s \le 2$ is only apparent, since $W^{t,p} \hookrightarrow W^{2,p}$ for any t > 2. In the formulation of the constraint equations we need to distinguish the cases s > 2 and $s \le 2$ at least notation-wise, and we choose to present in this subsection the case $s \le 2$ because this is the case that is considered in the core existence theory; the higher regularity is obtained by a standard bootstrapping technique. The general case is discussed in Sects. 4 and 6. Let us define $r = r(s, p) = \frac{3p}{3 + (2 - s)p}$, so that the continuous embedding $L^r \hookrightarrow W^{s-2,p}$ holds. Introduce the operators

$$A_L : W^{s,p} \to W^{s-2,p}, \quad \text{and} \quad A_{\mathbb{L}} : \mathbf{W}^{1,2r} \to \mathbf{W}^{-1,2r},$$

as the unique extensions of the operators $L$ and $\mathbb{L}$ in Eqs. (2.16) and (2.17), respectively, cf. Lemma 31 in Appendix A.5. The boldface letters denote spaces of sections of the tangent bundle TM, e.g., $\mathbf{W}^{1,2r} = W^{1,2r}(TM)$. Fix the source functions

$$\tau \in L^{2r}, \quad \rho \in W^{s-2,p}_+, \quad \sigma \in L^{2r}, \quad \mathbf{j} \in \mathbf{W}^{-1,2r}, \tag{2.20}$$

where σ is symmetric, traceless and divergence-free in the weak sense, the latter meaning that $\langle \sigma, \mathcal{L}\omega \rangle = 0$ for all $\omega \in \mathbf{W}^{1,(2r)'}$. Here $\frac{1}{(2r)'} + \frac{1}{2r} = 1$, and $\langle \cdot, \cdot \rangle$ denotes the extension of the $L^2$-inner product to $\mathbf{W}^{-1,2r} \otimes \mathbf{W}^{1,(2r)'}$. We say that the matter fields ρ and $\mathbf{j}$ satisfy the energy condition iff there exist sequences $\{\rho_n\} \subset C^\infty$ and $\{\mathbf{j}_n\} \subset \mathbf{C}^\infty$, respectively converging to ρ and $\mathbf{j}$ in the appropriate topology, such that $\rho_n^2 - \mathbf{j}_n \cdot \mathbf{j}_n \ge 0$. Given any function $\tau \in L^{2r}$ we have $\mathbf{b}_\tau \equiv \frac{2}{3}\nabla\tau \in \mathbf{W}^{-1,2r}$. The assumptions $\tau \in L^{2r}$ and $\sigma \in L^{2r}$ imply that for every $\mathbf{w} \in \mathbf{W}^{1,2r}$ the functions $a_\tau$ and $a_{\mathbf{w}}$ belong to $L^r$. For example, to see that $a_{\mathbf{w}} \in L^r$, we proceed as

$$8\,\|a_{\mathbf{w}}\|_r = \|\sigma + \mathcal{L}\mathbf{w}\|_{2r}^2 \le 2\bigl(\|\sigma\|_{2r}^2 + \|\mathcal{L}\mathbf{w}\|_{2r}^2\bigr) \le 2\bigl(\|\sigma\|_{2r}^2 + c_{\mathcal{L}}^2 \|\mathbf{w}\|_{1,2r}^2\bigr),$$

where we used the boundedness $\|\mathcal{L}\mathbf{w}\|_{2r} \le c_{\mathcal{L}} \|\mathbf{w}\|_{1,2r}$. The assumption on the background metric implies that $a_R \in W^{s-2,p}$. Given any two functions $u, v \in L^\infty$, and $t \ge 0$ and $q \in [1, \infty]$, define the interval

$$[u, v]_{t,q} := \{\phi \in W^{t,q} : u \le \phi \le v\} \subset W^{t,q},$$

see Lemma 1 near the end of Sect. 3. We equip $[u, v]_{t,q}$ with the subspace topology of $W^{t,q}$. We will write $[u, v]_q$ for $[u, v]_{0,q}$, and $[u, v]$ for $[u, v]_\infty$. Now, assuming that $\phi_-, \phi_+ \in W^{s,p}$ and $0 < \phi_- \le \phi_+ < \infty$, we introduce the non-linear operators

$$f : [\phi_-, \phi_+]_{s,p} \times \mathbf{W}^{1,2r} \to W^{s-2,p}, \quad \text{and} \quad \mathbf{f} : [\phi_-, \phi_+]_{s,p} \to \mathbf{W}^{-1,2r},$$

by $f(\phi, \mathbf{w}) = a_\tau \phi^5 + a_R \phi - a_\rho \phi^{-3} - a_{\mathbf{w}} \phi^{-7}$, and $\mathbf{f}(\phi) = \mathbf{b}_\tau \phi^6 + \mathbf{b}_j$, where the pointwise multiplication by an element of $W^{s,p}$ defines a bounded linear map in $W^{s-2,p}$ and in $\mathbf{W}^{-1,2r}$, cf. Corollary 3(a) in Appendix A.4. Now, we can formulate the Einstein constraint equations in terms of the above defined operators: Find elements $\phi \in [\phi_-, \phi_+]_{s,p}$ and $\mathbf{w} \in \mathbf{W}^{1,2r}$ solving

$$A_L \phi + f(\phi, \mathbf{w}) = 0, \tag{2.21}$$
$$A_{\mathbb{L}} \mathbf{w} + \mathbf{f}(\phi) = 0. \tag{2.22}$$
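
The value of the exponent r used above can be motivated by a short duality argument. The following sketch assumes the standard critical Sobolev embedding on a closed 3-manifold and writes $p'$ and $r'$ for the conjugate exponents of p and r (notation introduced here for the check); it is meant as a plausibility check, not as a replacement for the results of Appendix A.4.

```latex
% Assumptions: 1/p + 1/p' = 1, 1/r + 1/r' = 1, 0 <= 2 - s <= 1, and p > 3/2.
\begin{align*}
L^r \hookrightarrow W^{s-2,p}
  \quad&\Longleftarrow\quad
  W^{2-s,p'} \hookrightarrow L^{r'}
  \qquad \text{(by duality)}, \\
\frac{1}{r'} = \frac{1}{p'} - \frac{2-s}{3}
  \quad&\Longrightarrow\quad
  \frac{1}{r} = \frac{1}{p} + \frac{2-s}{3} = \frac{3 + (2-s)p}{3p}, \\
s = 2:\quad r = p,
  \qquad&
  p = 3,\ s = \tfrac{3}{2}:\quad r = \frac{9}{3 + \tfrac{3}{2}} = 2 .
\end{align*}
```

The restriction $p > \frac{3}{2}$ gives $p' < 3$, hence $(2-s)p' < 3$, which is the range in which the critical embedding in the second line applies.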

In the following, we often treat the two equations separately. The Hamiltonian constraint equation is the following: Given a function $\mathbf{w} \in \mathbf{W}^{1,2r}$, find an element $\phi \in [\phi_-, \phi_+]_{s,p}$ solving

$$A_L \phi + f(\phi, \mathbf{w}) = 0. \tag{2.23}$$

When the Hamiltonian constraint equation is under consideration, the function $\mathbf{w}$ is referred to as the source. To indicate the dependence of the solution φ on the source $\mathbf{w}$, sometimes we write $\phi = \phi_{\mathbf{w}}$. Let us define the momentum constraint equation: Given $\phi \in W^{s,p}$ with φ > 0, find an element $\mathbf{w} \in \mathbf{W}^{1,2r}$ solving

$$A_{\mathbb{L}} \mathbf{w} + \mathbf{f}(\phi) = 0. \tag{2.24}$$

When the momentum constraint equation is under consideration, the function φ is referred to as the source. To indicate the dependence of the solution $\mathbf{w}$ on the source φ, sometimes we write $\mathbf{w} = \mathbf{w}_\phi$.

3. Overview of the Main Results

In this section, we state our three main theorems (Theorems 1, 2, and 3 below) on the existence of far-from-CMC, near-CMC, and CMC solutions to the Einstein constraint equations, and give an outline of the overall structure of the argument that we build in the paper. The proofs of the main results appear in Sect. 6 toward the end of the paper, after we develop a number of supporting results in the body of the paper. After we give an overview of the basic abstract structure of the coupled nonlinear constraint problem, we prove two abstract topological fixed-point theorems (Theorems 4 and 5) that are the basis for our analysis of the coupled system; these arguments are also the basis for our results in [21] on existence of non-CMC solutions to the Einstein constraints on compact manifolds with boundary. After proving these abstract results, we give an overview of the technical results that must be established in the remainder of the paper in order to use the abstract results. Before stating the main theorems, let us make precise what we mean by the near-CMC condition in this article. We say that the extrinsic mean curvature τ satisfies the near-CMC condition when the following inequality is satisfied:

$$\|\nabla\tau\|_z < \mathcal{C}\, \inf_M |\tau|, \tag{3.1}$$

where the constant $\mathcal{C} = \frac{\sqrt{3}}{2C}$ if $\rho, \sigma^2 \in L^\infty$, and $\mathcal{C} = \frac{\sqrt{3}}{2C} \bigl(\frac{\min uv}{\max uv}\bigr)^6$ otherwise, with the constant C > 0 as in Corollary 1 and the continuous functions u, v > 0 as defined in (5.14) or in (5.15). Here C depends only on the Riemannian manifold $(M, h_{ab})$, and, apart from $(M, h_{ab})$, u and v depend only on ρ, $\sigma^2$, and τ. It is important to note that we always have $0 < \frac{\min uv}{\max uv} \le 1$, so that in any case the condition (3.1) is at least as strong as the same condition with $\mathcal{C}$ taken to be equal to $\frac{\sqrt{3}}{2C}$. The condition depends on the value of z, which will be specified by the context. Recall that the three Yamabe classes $Y^+(M)$, $Y^-(M)$ and $Y^0(M)$ are defined after Eq. (2.12). See Appendix A.7 for more details.

3.1. Theorem 1: Far-CMC weak solutions. Here is the first of our three main results. This result does not involve the near-CMC condition, which is one of the main contributions of this paper. The result is developed in the presence of a weak background metric $h_{ab} \in W^{s,p}$, for $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$, with the weakest possible assumptions on the data that allow for avoiding the near-CMC condition.

Theorem 1 (Far-CMC $W^{s,p}$ solutions, $p \in (1, \infty)$, $s \in (1 + \frac{3}{p}, \infty)$). Let $(M, h_{ab})$ be a 3-dimensional closed Riemannian manifold. Let $h_{ab} \in W^{s,p}$ admit no conformal Killing field and be in $Y^+(M)$, where $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$ are given. Select q and e to satisfy:
• $\frac{1}{q} \in (0, 1) \cap (0, \frac{s-1}{3}) \cap [\frac{3-p}{3p}, \frac{3+p}{3p}]$,
• $e \in (1 + \frac{3}{q}, \infty) \cap [s - 1, s] \cap [\frac{3}{q} + s - \frac{3}{p} - 1, \frac{3}{q} + s - \frac{3}{p}]$.
Assume that the data satisfies:
• $\tau \in W^{e-1,q}$ if $e \ge 2$, and $\tau \in W^{1,z}$ otherwise, with $z = \frac{3q}{3 + \max\{0, 2-e\}q}$,
• $\sigma \in W^{e-1,q}$, with $\|\sigma^2\|_\infty$ sufficiently small,
• $\rho \in W^{s-2,p}_+ \cap L^\infty \setminus \{0\}$, with $\|\rho\|_\infty$ sufficiently small,
• $\mathbf{j} \in \mathbf{W}^{e-2,q}$, with $\|\mathbf{j}\|_{e-2,q}$ sufficiently small.
Then there exist $\phi \in W^{s,p}$ with φ > 0 and $\mathbf{w} \in \mathbf{W}^{e,q}$ solving the Einstein constraint equations.

Proof. The proof will be given in Sect. 6. See Fig. 1 for clarification of the conditions on e and q.

Remark 1. The above result avoids the near-CMC condition (3.1); however, one should be aware of the various smallness conditions involved in the above theorem. More precisely, the mean curvature τ can be chosen to be an arbitrary function from a suitable function space, and afterwards, one has to choose σ, ρ, and $\mathbf{j}$ satisfying smallness conditions that depend on the chosen τ.
Nevertheless, the novelty of this result is that τ can be specified freely, whereas the condition (3.1) is not satisfied for arbitrary τ.

3.2. Theorem 2: Near-CMC weak solutions. Here is the second of our three main results; this result requires the near-CMC condition, but still extends the known near-CMC results to situations with weaker assumptions on the metric and on the data. In particular, the result is developed in the presence of a weak background metric $h_{ab} \in W^{s,p}$, for $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$, and with the weakest possible assumptions on the data.

Theorem 2 (Near-CMC $W^{s,p}$ solutions, $p \in (1, \infty)$, $s \in (1 + \frac{3}{p}, \infty)$). Let $(M, h_{ab})$ be a 3-dimensional closed Riemannian manifold. Let $h_{ab} \in W^{s,p}$ admit no conformal Killing field, where $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$ are given. Select q, e and z to satisfy:
• $\frac{1}{q} \in (0, 1) \cap (0, \frac{s-1}{3}) \cap [\frac{3-p}{3p}, \frac{3+p}{3p}]$,
• $e \in (1 + \frac{3}{q}, \infty) \cap [s - 1, s] \cap [\frac{3}{q} + s - \frac{3}{p} - 1, \frac{3}{q} + s - \frac{3}{p}]$,
• $z = \frac{3q}{3 + \max\{0, 2-e\}q}$.
Assume that τ satisfies the near-CMC condition (3.1) with z as above, and that the data satisfies:
• $\tau \in W^{e-1,q}$ if e > 2, and $\tau \in W^{1,z}$ if $e \le 2$,
• $\sigma \in W^{e-1,q}$,
• $\rho \in W^{s-2,p}_+$,
• $\mathbf{j} \in \mathbf{W}^{e-2,q}$.
In addition, let one of the following sets of conditions hold:
(a) $h_{ab}$ is in $Y^-(M)$; the metric $h_{ab}$ is conformally equivalent to a metric with scalar curvature $(-\tau^2)$;
(b) $h_{ab}$ is in $Y^0(M)$ or in $Y^+(M)$; either $\rho \not\equiv 0$ and $\tau \not\equiv 0$, or $\tau \in L^\infty$ and $\inf_M \sigma^2$ is sufficiently large.
Then there exist $\phi \in W^{s,p}$ with φ > 0 and $\mathbf{w} \in \mathbf{W}^{e,q}$ solving the Einstein constraint equations.

Proof. The proof will be given in Sect. 6. See Fig. 1 for clarification of the conditions on e and q.

Fig. 1. Range of e and q in Theorems 1 and 2, with $d = s - \frac{3}{p} > 1$
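
As a sanity check on the admissible ranges of q and e in Theorems 1 and 2, it can help to evaluate the intersections for one concrete pair (p, s). The choice p = 2, s = 3 below is purely illustrative and is not taken from the paper; it satisfies $s > 1 + \frac{3}{p} = \frac{5}{2}$ and gives $d = s - \frac{3}{p} = \frac{3}{2} > 1$.

```latex
% Illustrative parameters (an assumption for this example only): p = 2, s = 3.
\begin{align*}
\tfrac{1}{q} &\in (0,1) \cap \bigl(0, \tfrac{s-1}{3}\bigr)
   \cap \bigl[\tfrac{3-p}{3p}, \tfrac{3+p}{3p}\bigr]
   = (0,1) \cap \bigl(0, \tfrac{2}{3}\bigr) \cap \bigl[\tfrac{1}{6}, \tfrac{5}{6}\bigr]
   = \bigl[\tfrac{1}{6}, \tfrac{2}{3}\bigr), \\
q = 2:\quad
e &\in \bigl(1 + \tfrac{3}{q}, \infty\bigr) \cap [s-1, s]
   \cap \bigl[\tfrac{3}{q} + s - \tfrac{3}{p} - 1,\ \tfrac{3}{q} + s - \tfrac{3}{p}\bigr]
   = \bigl(\tfrac{5}{2}, \infty\bigr) \cap [2,3] \cap [2,3]
   = \bigl(\tfrac{5}{2}, 3\bigr], \\
e = 3:\quad
z &= \frac{3q}{3 + \max\{0, 2-e\}\,q} = \frac{6}{3} = 2 .
\end{align*}
```

For this choice one may take q = p and e = s, so that the vector field is sought in $\mathbf{W}^{3,2}$ and, since e > 2, the mean curvature is required to lie in $W^{e-1,q} = W^{2,2}$.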

3.3. Theorem 3: CMC weak solutions. Here is the last of our three main results; it covers specifically the CMC case, and allows for lower regularity of the background metric than the non-CMC case. In particular, the result is developed with a weak background metric $h_{ab} \in W^{s,p}$, for $p \in (1, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap [1, \infty)$. In the case of s = 2, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case p = 2, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5).

Theorem 3 (CMC $W^{s,p}$ solutions, $p \in (1, \infty)$, $s \in (\frac{3}{p}, \infty) \cap [1, \infty)$). Let $(M, h_{ab})$ be a 3-dimensional closed Riemannian manifold. Let $h_{ab} \in W^{s,p}$ admit no conformal Killing field, where $p \in (1, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap [1, \infty)$ are given. With $d := s - \frac{3}{p}$, select q and e to satisfy:
• $\frac{1}{q} \in (0, 1) \cap [\frac{3-p}{3p}, \frac{3+p}{3p}] \cap [\frac{1-d}{3}, \frac{3+sp}{6p})$,
• $e \in [1, \infty) \cap [s - 1, s] \cap [\frac{3}{q} + d - 1, \frac{3}{q} + d] \cap (\frac{3}{q} + \frac{d}{2}, \infty)$.
Assume τ = const (CMC) and that the data satisfies:
• $\sigma \in W^{e-1,q}$,
• $\rho \in W^{s-2,p}_+$,
• $\mathbf{j} \in \mathbf{W}^{e-2,q}$.
In addition, let one of the following sets of conditions hold:
(a) $h_{ab}$ is in $Y^-(M)$; $\tau \ne 0$;
(b) $h_{ab}$ is in $Y^+(M)$; $\rho \ne 0$ or $\sigma \ne 0$;
(c) $h_{ab}$ is in $Y^0(M)$; $\tau \ne 0$; $\rho \ne 0$ or $\sigma \ne 0$;
(d) $h_{ab}$ is in $Y^0(M)$; τ = ρ = σ = 0; $\mathbf{j} = 0$.

Then there exist φ ∈ W s, p with φ > 0 and w ∈ W e,q solving the Einstein constraint equations. Proof. The proof will be given in Sect. 6. See Fig. 2 for clarification of the conditions on e and q. 3.4. A coupled topological fixed-point argument. In Theorems 4 and 5 below (see also [21]) we give some abstract fixed-point results which form the basic framework for our analysis of the coupled constraints. These topological fixed-point theorems will be the main tool by which we shall establish Theorems 1, 2, and 3 above. They have the important feature that the required properties of the abstract fixed-point operators S and T appearing in Theorems 4 and 5 below can be established in the case of the Einstein

Fig. 2. Range of e and q in Theorem 3. Recall that d = s − 3p > 0


563

constraints without using the near-CMC condition; this is not the case for fixed-point arguments for the constraints based on k-contractions (cf. [1,26]), which require near-CMC conditions. The bulk of the paper then involves establishing the required properties of $S$ and $T$ without using the near-CMC condition, and finding suitable global barriers $\phi_-$ and $\phi_+$ for defining the required set $U$ that are similarly free of the near-CMC condition (when possible).

We now set up the basic abstract framework. Let $X$ and $Y$ be Banach spaces, let $f : X \times Y \to X^*$ and $\mathbf{f} : X \to Y^*$ be (generally nonlinear) operators, let $\mathbb{A}_{\mathbb{L}} : Y \to Y^*$ be a linear invertible operator, and let $A_L : X \to X^*$ be a linear invertible operator satisfying the maximum principle, meaning that $A_L u \le A_L v \Rightarrow u \le v$. The order structure on $X$ for interpreting the maximum principle will be inherited from an ordered Banach space $Z$ (see Appendices A.2, A.3, and A.6, and also cf. [54]) through the compact embedding $X \hookrightarrow Z$, which will also make available compactness arguments. The coupled Hamiltonian and momentum constraints can be viewed abstractly as coupled operator equations of the form:
\[ A_L \phi + f(\phi, w) = 0, \tag{3.2} \]
\[ \mathbb{A}_{\mathbb{L}} w + \mathbf{f}(\phi) = 0, \tag{3.3} \]

or equivalently as the coupled fixed-point equations
\[ \phi = T(\phi, w), \tag{3.4} \]
\[ w = S(\phi), \tag{3.5} \]
for appropriately defined fixed-point maps $T : X \times Y \to X$ and $S : X \to Y$. The obvious choice for $S$ is the Picard map for (3.3),
\[ S(\phi) = -\mathbb{A}_{\mathbb{L}}^{-1} \mathbf{f}(\phi), \tag{3.6} \]
which also happens to be the solution map for (3.3). On the other hand, there are a number of distinct possibilities for $T$, ranging from the solution map for (3.2), to the Picard map for (3.2), which inverts only the linear part of the operator in (3.2):
\[ T(\phi, w) = -A_L^{-1} f(\phi, w). \tag{3.7} \]

Assume now that $T$ is as in (3.7), and (for fixed $w \in Y$) that $\phi_-$ and $\phi_+$ are sub- and super-solutions of the semi-linear operator equation (3.2) in the sense that
\[ A_L \phi_- + f(\phi_-, w) \le 0, \qquad A_L \phi_+ + f(\phi_+, w) \ge 0. \]
The assumptions on $A_L$ imply (see Lemma 26 in Appendix A.3) that for fixed $w \in Y$, $\phi_-$ and $\phi_+$ are also sub- and super-solutions of the equivalent fixed-point equation:
\[ \phi_- \le T(\phi_-, w), \qquad \phi_+ \ge T(\phi_+, w). \]
For developing results on fixed-point iterations in ordered Banach spaces, it is convenient to work with maps which are monotone increasing in $\phi$, for fixed $w \in Y$:
\[ \phi_1 \le \phi_2 \quad \Rightarrow \quad T(\phi_1, w) \le T(\phi_2, w). \]

The map $T$ that arises as the Picard map for a semi-linear problem will generally not be monotone increasing; however, if there exists a continuous linear monotone increasing map $J : X \to X^*$, then one can always introduce a positive shift $s$ into the operator equation
\[ A_L^s \phi + f^s(\phi, w) = 0, \]
with $A_L^s = A_L + sJ$ and $f^s(\phi, w) = f(\phi, w) - sJ\phi$. (Throughout this paper, the spaces we encounter for $X$ typically fit into a Gelfand triple $X \hookrightarrow H \hookrightarrow X^*$, where the "pivot" space $H$ is a Hilbert space, and the continuous map between $X$ and $X^*$ is a composition of the two inclusion maps.) Since $s > 0$ the shifted operator $A_L^s$ retains the maximum principle property of $A_L$, and if $s$ is chosen sufficiently large then $f^s$ is monotone decreasing in $\phi$. Under the additional condition on $J$ and $s$ that $A_L^s$ is invertible (see also [21]), the shifted Picard map $T^s(\phi, w) = -(A_L^s)^{-1} f^s(\phi, w)$ is now monotone increasing in $\phi$.

We now give two abstract existence results for systems of the form (3.4)–(3.5).

Theorem 4 (Coupled Fixed-Point Principle A). Let $X$ and $Y$ be Banach spaces, and let $Z$ be a Banach space with compact embedding $X \hookrightarrow Z$. Let $U \subset Z$ be a non-empty, convex, closed, bounded subset, and let
\[ S : U \to R(S) \subset Y, \qquad T : U \times R(S) \to U \cap X, \]
be continuous maps. Then there exist $\phi \in U \cap X$ and $w \in R(S)$ such that $\phi = T(\phi, w)$ and $w = S(\phi)$.

Proof. The proof will be through a standard variation of the Schauder Fixed-Point Theorem, reviewed as Theorem 9 in Appendix A.1. The proof is divided into several steps.

Step 1. Construction of a non-empty, convex, closed, bounded subset $U \subset Z$. By assumption we have that $U \subset Z$ is non-empty, convex (involving the vector space structure of $Z$), closed (involving the topology on $Z$), and bounded (involving the metric given by the norm on $Z$).

Step 2. Continuity of a mapping $G : U \subset Z \to U \cap X \subset X$. Define the composite operator $G := T \circ S : U \subset Z \to U \cap X \subset X$, that is, $G(\phi) = T(\phi, S(\phi))$. The mapping $G$ is continuous, since it is a composition of the continuous operators $S : U \subset Z \to R(S) \subset Y$ and $T : U \subset Z \times R(S) \to U \cap X \subset X$.

Step 3. Compactness of a mapping $F : U \subset Z \to U \subset Z$. The compact embedding assumption $X \hookrightarrow Z$ implies that the canonical injection operator $i : X \to Z$ is compact. Since the composition of compact and continuous operators is compact, the composition $F := i \circ G : U \subset Z \to U \subset Z$ is compact.

Step 4. Invoking the Schauder Theorem. Therefore, by a standard variant of the Schauder Theorem (see Theorem 9 in Appendix A.1), there exists a fixed-point $\phi \in U$ such that $\phi = F(\phi) = T(\phi, S(\phi))$. Since $R(T) \subset U \cap X$, in fact $\phi \in U \cap X$. We now take $w = S(\phi) \in R(S)$ and we have the result.
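The effect of the positive shift $s$ on the Picard map, discussed before Theorem 4, can be illustrated in a scalar toy setting. The sketch below is only an illustration under stated assumptions: it restricts to constant functions (on which $A_L$ acts as zero), takes $J$ to be the identity, and uses hypothetical coefficient values, so that $A_L^s = s$ and $T^s(\phi) = \phi - f(\phi)/s$. With $s$ at least as large as the derivative of $f$ on the interval between the barriers, the map is monotone increasing and leaves the interval invariant.

```python
# Scalar toy model (an illustrative assumption, not the setting of the paper):
# on constants A_L acts as zero, J is the identity, and every coefficient
# below is a hypothetical number chosen so that phi = 1 solves f(phi) = 0.

def f(phi, a_tau=1.0, a_R=1.0, a_rho=1.0, a_w=1.0):
    """Hamiltonian-constraint-type nonlinearity at a constant phi > 0."""
    return a_tau * phi**5 + a_R * phi - a_rho * phi**-3 - a_w * phi**-7


def df(phi, a_tau=1.0, a_R=1.0, a_rho=1.0, a_w=1.0):
    """Derivative of f with respect to phi."""
    return 5 * a_tau * phi**4 + a_R + 3 * a_rho * phi**-4 + 7 * a_w * phi**-8


def shifted_picard(phi0, s, steps):
    """Iterate T^s(phi) = -(A_L^s)^{-1} f^s(phi) with A_L^s = s and
    f^s(phi) = f(phi) - s*phi, i.e. T^s(phi) = phi - f(phi)/s."""
    iterates = [phi0]
    for _ in range(steps):
        phi = iterates[-1]
        iterates.append(phi - f(phi) / s)
    return iterates


phi_minus, phi_plus = 0.5, 2.0  # barriers: f(phi_minus) < 0 < f(phi_plus)

# For non-negative a_tau, a_rho, a_w the derivative df is convex on (0, inf),
# so its maximum over [phi_minus, phi_plus] is attained at an endpoint.
# Choosing s at least this large makes T^s monotone increasing there.
s = max(df(phi_minus), df(phi_plus))

down = shifted_picard(phi_plus, s, 2000)  # starts at the super-solution
up = shifted_picard(phi_minus, s, 2000)   # starts at the sub-solution
print(up[-1], down[-1])
```

In this toy case the iterates started at the super-solution decrease, those started at the sub-solution increase, and both sequences stay inside $[\phi_-, \phi_+]$, which is the ordering expressed abstractly by the sub- and super-solution inequalities for $T$.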



The assumption in Theorem 4 that the mapping $T$ is invariant on the non-empty, closed, convex, bounded subset $U$ can be established using a priori estimates if $T$ is the solution mapping, but if there are multiple fixed-points then continuity of $T$ will not hold. Fixed-point theory for set-valued maps could still potentially be used (cf. [54]). On the other hand, if $T$ is chosen to be the Picard map, then it is typically easier to establish continuity of $T$ even with multiple fixed-points, but more difficult to establish the invariance property without additional conditions on $T$. In our setting, we wish to allow for non-uniqueness in the Hamiltonian constraint (for example see [21] for possible non-uniqueness in the case of compact manifolds with boundary), so we will generally focus on the Picard map for the Hamiltonian constraint in our fixed-point framework for the coupled constraints. The following special case of Theorem 4 gives some simple sufficient conditions on $T$ to establish the invariance using barriers in an ordered Banach space (for a review of ordered Banach spaces, see Appendix A.2 or [54]).

Theorem 5 (Coupled Fixed-Point Principle B). Let $X$ and $Y$ be Banach spaces, and let $Z$ be a real ordered Banach space having the compact embedding $X \hookrightarrow Z$. Let $[\phi_-, \phi_+] \subset Z$ be a nonempty interval which is closed in the topology of $Z$, and set $U = [\phi_-, \phi_+] \cap B_M \subset Z$, where $B_M$ is the closed ball of finite radius $M > 0$ in $Z$ about the origin. Assume $U$ is nonempty, and let the maps
\[ S : U \to R(S) \subset Y, \qquad T : U \times R(S) \to U \cap X, \]
be continuous maps. Then there exist $\phi \in U \cap X$ and $w \in R(S)$ such that $\phi = T(\phi, w)$ and $w = S(\phi)$.

Proof. By choosing the set $U$ to be the non-empty intersection of the interval $[\phi_-, \phi_+]$ with a bounded set in $Z$, we have $U$ bounded in $Z$. We also have that $U$ is convex with respect to the vector space structure of $Z$, since it is the intersection of two convex sets $[\phi_-, \phi_+]$ and $B_M$. Since $U$ is the intersection of the interval $[\phi_-, \phi_+]$, which by assumption is closed in the topology of $Z$, with the closed ball $B_M$ in $Z$, $U$ is also closed. In summary, we have that $U$ is non-empty as a subset of $Z$, closed in the topology of $Z$, convex with respect to the vector space structure of $Z$, and bounded with respect to the metric (via normed) space structure of $Z$. Therefore, the assumptions of Theorem 4 hold and the result then follows.

We make some final remarks about Theorems 4 and 5. If the ordered Banach space $Z$ in Theorem 5 had a normal order cone, then the closed interval $[\phi_-, \phi_+]$ would automatically be bounded in the norm of $Z$ (see Lemma 20 in Appendix A.2 or [54] for this result). The interval by itself is also non-empty and closed by assumption, and trivially convex (see Appendix A.2), so that Theorem 5 would follow immediately from Theorem 4 by simply taking $U = [\phi_-, \phi_+]$. Second, the closed ball $B_M$ in Theorem 5 can be replaced with any non-empty, convex, closed, and bounded subset of $Z$ having non-trivial intersection with the interval $[\phi_-, \phi_+]$. Third, in the case that $T$ in Theorem 5 arises as the Picard map (3.7) of the semi-linear problem (3.2), we can always ensure that $T$ is invariant on $U$ in Theorem 5 by: (1) obtaining sub- and super-solutions to the semi-linear operator equation and using these for $\phi_-$ and $\phi_+$, since these will also be sub- and super-solutions for the fixed-point equation involving the Picard map; (2) introducing a shift into the nonlinearity to ensure $T$ is monotone increasing; and (3) obtaining a priori norm bounds on Picard iterates.
As noted earlier, (1) and (2) will ensure
\[ \phi_- \le T(\phi_-, w) \le T(\phi, w) \le T(\phi_+, w) \le \phi_+, \tag{3.8} \]
for all $\phi \in [\phi_-, \phi_+]$ and $w \in R(S)$, whereas (3) ensures that
\[ \|T(\phi, w)\|_X \le M, \qquad \forall \phi \in [\phi_-, \phi_+], \ \forall w \in R(S), \tag{3.9} \]
which together ensure $T : U \times R(S) \to U \cap X$, where $U = [\phi_-, \phi_+] \cap B_M \subset Z$. Again, if $Z$ has a normal order cone structure, then ensuring (3.8) holds will automatically guarantee that (3.9) also holds, so it is not necessary to establish (3.9) separately in the case of a normal order cone. Finally, note that Theorem 5 also allows one to choose the solution map (or any other fixed-point map) for $T$ together with a priori order cone and norm estimates to ensure the conditions (3.8) and (3.9) hold (as long as continuity for $T$ can be shown). Even if a priori order-cone estimates cannot be shown to hold directly for this choice of $T$, as long as the map can be "bracketed" in the interval $[\phi_-, \phi_+]$ by two auxiliary monotone increasing maps, then it can be shown that (3.8) holds. This allows one to use the Picard map even if it is not monotone increasing, without having to introduce the shift into the Picard map.

The overall argument we use to prove the non-CMC results in Theorems 1, 2, and 3 using Theorems 4 and 5 involves the following steps:

Step 1. The choice of function spaces. We will choose the spaces for use of Theorem 5 as follows:
– $X = W^{s,p}$, with $p \in (1, \infty)$, and $s(p) \in (1 + \frac{3}{p}, \infty)$. In the CMC case in Theorem 3, we can lower $s$ to $s(p) \in (\frac{3}{p}, \infty) \cap [1, \infty)$.
– $Y = W^{e,q}$, with $e$ and $q$ as given in the theorem statements.
– $Z = W^{\tilde{s},p}$, $\tilde{s} \in (\frac{3}{p}, s)$, so that $X = W^{s,p} \hookrightarrow W^{\tilde{s},p} = Z$ is compact.
– $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p} = Z$, with $\phi_-$ and $\phi_+$ global barriers (sub- and super-solutions, respectively) for the Hamiltonian constraint equation which satisfy the compatibility condition: $0 < \phi_- \le \phi_+ < \infty$.

Step 2. Construction of the mapping $S$. Assuming the existence of "global" weak sub- and super-solutions $\phi_-$ and $\phi_+$, and assuming the fixed function $\phi \in U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p} = Z$ is taken as data in the momentum constraint, we establish continuity and related properties of the momentum constraint solution map $S : U \to R(S) \subset W^{e,q} = Y$ (Sect. 4.1).

Step 3. Construction of the mapping $T$. Again assuming existence of "global" weak sub- and super-solutions $\phi_-$ and $\phi_+$, with fixed $w \in R(S) \subset W^{e,q} = Y$ taken as data in the Hamiltonian constraint, we establish continuity and related properties of the Picard map $T : U \times R(S) \to U \cap W^{s,p}$. Invariance of $T$ on $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p}$ is established using a combination of a priori order cone bounds and norm bounds (Sect. 4.2).

Step 4. Barrier construction. Global weak sub- and super-solutions $\phi_-$ and $\phi_+$ for the Hamiltonian constraint are explicitly constructed to build a nonempty, convex, closed, and bounded subset $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M \subset W^{\tilde{s},p}$, which is a strictly positive interval. These include variations of known barrier constructions which require the near-CMC condition, and also some new barrier constructions which are free of the near-CMC condition (Sect. 5). Note: This is the only place in the argument where near-CMC conditions may potentially arise.

Step 5. Application of fixed-point theorem. The global barriers and continuity properties are used together with the abstract topological fixed-point result (Theorem 5) to establish existence of solutions $\phi \in U \cap W^{s,p}$ and $w \in W^{e,q}$ to the coupled system: $w = S(\phi)$, $\phi = T(\phi, w)$ (Sect. 6).



Step 6. Bootstrap. The above application of a fixed-point theorem is actually performed for some low regularity spaces, i.e., for $s \le 2$ and $e \le 2$, and a bootstrap argument is then given to extend the results to the range of $s$ and $p$ given in the statement of the theorem (Sect. 6).

The ordered Banach space $Z$ plays a central role in Theorem 5. We will use $Z = W^{t,q}$, $t \ge 0$, $1 \le q \le \infty$, with order cone defined as in (2.5). Given such an order cone, one can define the closed interval
\[ [\phi_-, \phi_+]_{t,q} = \{\phi \in W^{t,q} : \phi_- \le \phi \le \phi_+\} \subset W^{t,q}, \]
which as noted earlier is denoted more simply as $[\phi_-, \phi_+]_q$ when $t = 0$, and as simply $[\phi_-, \phi_+]$ when $t = 0$, $q = \infty$. When $t = 0$, the $W^{t,q}$ order cone is normal for $1 \le q \le \infty$, meaning that closed intervals $[\phi_-, \phi_+]_q \subset L^q = W^{0,q}$ are automatically bounded in the metric given by the norm on $L^q$. If we consider the interval $U = [\phi_-, \phi_+]_{t,q} \subset W^{t,q} = Z$ defined using this order structure, it will be critically important to establish that $U$ is convex (with respect to the vector space structure of $Z$), closed (in the topology of $Z$), and (when possible) bounded (in the metric given by the norm on $Z$). It will also be important that $U$ be nonempty as a subset of $Z$; this will involve choosing compatible $\phi_-$ and $\phi_+$. Regarding convexity, closure, and boundedness, we have the following lemma.

Lemma 1 (Order cone intervals in $W^{t,q}$). For $t \ge 0$, $1 \le q \le \infty$, the set
\[ U = [\phi_-, \phi_+]_{t,q} = \{\phi \in W^{t,q} : \phi_- \le \phi \le \phi_+\} \subset W^{t,q} \]
is convex with respect to the vector space structure of $W^{t,q}$ and closed in the topology of $W^{t,q}$. For $t = 0$, $1 \le q \le \infty$, the set $U$ is also bounded with respect to the metric space structure of $L^q = W^{0,q}$.

Proof. That $U$ is convex for $t \ge 0$, $1 \le q \le \infty$, follows from the fact that any interval built using order cones is convex. That $U$ is closed in the case of $t = 0$, $1 \le q \le \infty$ follows from the fact that norm convergence in $L^q$ for $1 \le q \le \infty$ implies pointwise subsequential convergence almost everywhere (see Theorem 3.12 in [44]). That $U$ is bounded when $t = 0$, $1 \le q \le \infty$ follows from the fact that the order cone $L^q_+$ is normal (see Appendix A.2). What remains is to show that $U$ is closed in the case of $t > 0$, $1 \le q \le \infty$. The argument is as follows. Let $\{u_k\}_{k=1}^{\infty}$ be a Cauchy sequence in $U \subset W^{t,q} \subset L^q$, with $t > 0$, $1 \le q \le \infty$. From completeness of $W^{t,q}$ there exists $\lim_{k \to \infty} u_k = u \in W^{t,q}$. From the continuous embedding $W^{t,q} \hookrightarrow L^q$ for $t > 0$, we have that $\|u_k - u_l\|_q \le C \|u_k - u_l\|_{t,q}$, so that $u_k$ is also Cauchy in $L^q$. Moreover, the continuous embedding also implies that $u$ is also the limit of $u_k$ as a sequence in $L^q$. Since $[\phi_-, \phi_+]_{0,q}$ is closed in $L^q$, we have $u \in [\phi_-, \phi_+]_{0,q}$, and so $u \in U = [\phi_-, \phi_+]_{t,q} = [\phi_-, \phi_+]_{0,q} \cap W^{t,q}$.

Remark 2. We indicate now how the far-CMC result outlined in [22] can be recovered using Theorem 4 above. The framework is constructed by taking $X = W^{2,p}$, $Y = W^{2,p}$, and $Z = L^\infty$, with $p > 3$, giving the compact embedding $W^{2,p} \hookrightarrow L^\infty$. The coefficients are assumed to satisfy $\tau \in W^{1,p}$ and $\sigma^2, j^a, \rho \in L^p$ as well as the assumptions for the construction of a near-CMC-free global super-solution (presented in [22] as Theorem 1, analogous to Lemma 9 in this paper), and for the construction of a near-CMC-free global



sub-solution (presented in [22] as Theorem 2, analogous to Lemma 13 in this paper). One then takes $U = [\phi_-, \phi_+] \subset Z = L^\infty$, where the compatible $0 < \phi_- \le \phi_+$ are these near-CMC-free barriers. Since $Z = L^\infty$ is an ordered Banach space with normal order cone, we have (by Lemma 1 in this paper) that $U$ is non-empty, convex, closed and bounded as a subset of $Z$. The invariance of the Picard mapping on the interval $[\phi_-, \phi_+]$ is proven using a monotone shift (cf. Lemma 4 in this paper) and a barrier argument (cf. Lemma 5 in this paper). The main result in [22] (stated in [22] as Theorem 4), now follows from Theorem 4 in this paper (stated in [22] as Lemma 1).

4. Weak Solution Results for the Individual Constraints

4.1. The momentum constraint and the solution map S. In this section we fix a particular scalar function $\phi \in W^{s,p}$ with $sp > 3$, and consider separately the momentum constraint equation (2.24) to be solved for the vector valued function $w$. The result is a linear elliptic system of equations for this variable $w = w_\phi$. For convenience, we reformulate the problem here in a self-contained manner. Note that the problem (4.2) below is identical to (2.24) provided the functions $b_\tau$ and $b_j$ are defined accordingly. Our goal is not only to develop some existence results for the momentum constraint, but also to derive the estimates for the momentum constraint solution map $S$ that we will need later in our analysis of the coupled system. We note that a complete weak solution theory for the momentum constraint on compact manifolds with boundary, using both variational methods and Riesz-Schauder Theory, is developed in [21].

Let $(M, h)$ be a 3-dimensional Riemannian manifold, where $M$ is a smooth, compact manifold without boundary, and with $p \in (1, \infty)$ and $s \in (\frac{3}{p}, \infty)$, $h \in W^{s,p}$ is a positive definite metric. With $q \in (1, \infty)$, and $e \in (2 - s, s] \cap \bigl(-s + \frac{3}{p} - 1 + \frac{3}{q}, \, s - \frac{3}{p} + \frac{3}{q}\bigr]$, introduce the bounded linear operator
\[ \mathbb{A}_{\mathbb{L}} : W^{e,q} \to W^{e-2,q}, \]
as the unique extension of the operator $\mathbb{L}$ in (2.17), cf. Lemmata 31 and 32 in Appendix A.5. Fix the source terms $b_\tau, b_j \in W^{e-2,q}$. Fix a function $\phi \in W^{s,p}$, and define
\[ \mathbf{f}_\phi \in W^{s-2,q}, \qquad \mathbf{f}_\phi := b_\tau \phi^6 + b_j. \tag{4.1} \]
We used the subscript $\phi$ in $\mathbf{f}_\phi$ to emphasize that $\phi$ is not a variable (but the "source") of the problem. Note that the above conditions on $q$ and $e$ are sufficient for the pointwise multiplication by an element of $W^{s,p}$ to be a bounded map in $W^{e-2,q}$, cf. Corollary 3(a) in Appendix A.4. The momentum constraint equation is the following: find an element $w \in W^{e,q}$ solution of
\[ \mathbb{A}_{\mathbb{L}} w + \mathbf{f}_\phi = 0. \tag{4.2} \]

We sketch here a proof of existence of weak solutions of the momentum constraint equation (4.2).



Theorem 6 (Momentum constraint). Let $e$ and $q$ be as above. Then there exists a solution $w \in W^{e,q}$ to the momentum constraint equation (4.2) if and only if $\mathbf{f}_\phi(v) = 0$ for all $v \in W^{2-e,q}$ satisfying $\mathbb{A}_{\mathbb{L}}^* v = 0$. The solution is unique if and only if the kernel of $\mathbb{A}_{\mathbb{L}}^*$ is trivial. Moreover, if a solution exists at all in $W^{e,q}$, for any given closed linear space $K \subseteq W^{e,q}$ such that $W^{e,q} = \ker \mathbb{A}_{\mathbb{L}} \oplus K$, there is a unique solution satisfying $w \in K$, and for this solution, we have
\[ \|w\|_{e,q} \le C \|\mathbf{f}_\phi\|_{e-2,q}, \tag{4.3} \]

with some constant $C > 0$ not depending on $w$.

Proof. By Lemma 34 in Appendix A.5, the operator $\mathbb{A}_{\mathbb{L}}$ is semi-Fredholm, and moreover since $\mathbb{A}_{\mathbb{L}}$ is formally self-adjoint, it is Fredholm. The formal self-adjointness also implies that when the metric is smooth, the index of $\mathbb{A}_{\mathbb{L}}$ is zero independent of $e$ and $q$. Now we can approximate the metric $h$ by smooth metrics so that $\mathbb{A}_{\mathbb{L}}$ is sufficiently close to a Fredholm operator with index zero. Since the set of Fredholm operators with constant index is open, we conclude that the index of $\mathbb{A}_{\mathbb{L}}$ is zero, and the theorem follows.

In the later sections we need to bound the coefficient $a_w$ in the Hamiltonian constraint equation, which can be obtained by using the following observation.

Corollary 1. Let $p \in (1, \infty)$ and $s \in (1 + \frac{3}{p}, \infty)$. In addition, let $q \in (3, \infty)$ and $e \in (1, s] \cap (1 + \frac{3}{q}, \, s - \frac{3}{p} + \frac{3}{q}] \cap (1, 2]$, and with $z = \frac{3q}{3 + (2 - e)q}$, let $b_\tau \in L^z$. Assume that the momentum constraint equation has a solution $w \in W^{e,q}$. Then, we have
\[ \|\mathcal{L}w\|_\infty \le C \|\phi\|_\infty^6 \|b_\tau\|_z + C \|b_j\|_{e-2,q}, \tag{4.4} \]

with $C > 0$ not depending on $w$. Moreover, if the solution is unique, the norm $\|w\|_{e,q}$ can be bounded by the same expression.

Proof. Since the kernel of $\mathbb{A}_{\mathbb{L}}$ is finite dimensional, we can write $W^{e,q} = \ker \mathbb{A}_{\mathbb{L}} \oplus K$ with a closed linear space $K \subseteq W^{e,q}$. We have the splitting $w = w_0 + w_1$ such that $w_0 \in \ker \mathbb{A}_{\mathbb{L}} = \ker \mathcal{L}$ and $w_1 \in K$, implying that
\[ \|\mathcal{L}w\|_\infty = \|\mathcal{L}w_1\|_\infty \le c \|w_1\|_{1,\infty} \le c \|w_1\|_{e,q}, \]
the latter inequality by $W^{e,q} \hookrightarrow W^{1,\infty}$. We note that demanding $W^{e,q} \hookrightarrow W^{1,\infty}$ gives us the lower bound $e > 1 + \frac{3}{q}$, and this in turn implies $s > 1 + \frac{3}{p}$ if the range of $e$ is to be nonempty. To complete the proof, we note that $w_1$ is also a solution of the momentum constraint, and taking into account $L^z \hookrightarrow W^{e-2,q}$, we apply Theorem 6 to bound the norm $\|w_1\|_{e,q}$. Note that the latter embedding requires $e \le 2$, and combining this with $e > 1 + \frac{3}{q}$, we need $q > 3$.

We now establish some properties of the momentum constraint solution map $S$ that we will need later for our analysis of the coupled system. Suppose that the conditions for Theorem 6 hold, so that the momentum constraint is uniquely solvable. Then for any fixed $\phi_+ \in W^{s,p}$ with $\phi_+ > 0$, there exists a mapping
\[ S : [0, \phi_+] \cap W^{s,p} \to W^{e,q} \tag{4.5} \]

that sends the source φ to the corresponding solution w of the momentum constraint equation. Since the momentum constraint is linear, it follows easily that S is Lipschitz continuous as stated in the following lemma.



Lemma 2 (Properties of the map S). In addition to the conditions imposed in the beginning of this section, let $s \ge 1$. Let $e \in [1, 3]$ and $\frac{1}{q} \in (\frac{e-1}{2}\delta, \, 1 - \frac{3-e}{2}\delta)$, where $\delta = \max\{0, \frac{1}{p} - \frac{s-1}{3}\}$. Assume that the momentum constraint (4.2) is uniquely solvable in $W^{e,q}$. With some $\phi_+ \in W^{s,p}$ satisfying $\phi_+ > 0$, let $w_1$ and $w_2$ be the solutions to the momentum constraint with the source functions $\phi_1$ and $\phi_2$ from the set $[0, \phi_+] \cap W^{s,p}$, respectively. Then,
\[ \|w_1 - w_2\|_{e,q} \le C \|\phi_+\|_\infty^5 \|b_\tau\|_{e-2,q} \|\phi_1 - \phi_2\|_{s,p}. \]

Proof. The functions $\phi_1$ and $\phi_2$ pointwise satisfy the following inequalities:
\[ |\phi_2^n - \phi_1^n| = \Bigl| \sum_{j=0}^{n-1} \phi_2^j \phi_1^{n-1-j} (\phi_2 - \phi_1) \Bigr| \le n (\phi_+)^{n-1} |\phi_2 - \phi_1|, \tag{4.6} \]
\[ |\phi_2^{-n} - \phi_1^{-n}| = \Bigl| \frac{\phi_2^n - \phi_1^n}{(\phi_2 \phi_1)^n} \Bigr| \le n \frac{(\phi_+)^{n-1}}{(\phi_-)^{2n}} |\phi_2 - \phi_1|, \tag{4.7} \]

for any integer $n > 0$. Since Eq. (4.2) is linear, applying Theorem 6 with the right-hand side $\mathbf{f} := \mathbf{f}_{\phi_1} - \mathbf{f}_{\phi_2}$, and by using Lemma 29 in Appendix, we obtain
\[ \|w_1 - w_2\|_{e,q} \lesssim \|b_\tau\|_{e-2,q} \|\phi_1^6 - \phi_2^6\|_{s,p} \lesssim 6 \|\phi_+\|_\infty^5 \|b_\tau\|_{e-2,q} \|\phi_1 - \phi_2\|_{s,p}. \]

4.2. The Hamiltonian constraint and the Picard map T. In this section we fix a particular function $a_w$ in an appropriate space and we then separately look for weak solutions of the Hamiltonian constraint equation (2.23). For convenience, we reformulate the problem here in a self-contained manner. Note that the problem (4.9) below is identical to (2.23), provided the functionals $a_\tau$ and $a_\rho$ are defined accordingly. Our goal here is primarily to establish some properties and derive some estimates for a Hamiltonian constraint fixed-point map $T$ that we will need later in our analysis of the coupled system, and also for the analysis of the Hamiltonian constraint alone in the CMC setting. We remark that a complete weak solution theory for the Hamiltonian constraint on compact manifolds with boundary, using both variational methods and fixed-point arguments based on monotone increasing maps, combined with sub- and super-solutions, is developed in [21].

Let $(M, h)$ be a 3-dimensional Riemannian manifold, where $M$ is a smooth, compact manifold without boundary, and with $p \in (1, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap [1, \infty)$, $h \in W^{s,p}$ is a positive definite metric. Introduce the operator
\[ A_L : W^{s,p} \to W^{s-2,p}, \]
as the unique extension of the Laplace-Beltrami operator $L = -\Delta$, cf. Lemma 31 in Appendix A.5. Fix the source functions
\[ a_\tau, a_\rho, a_w \in W_+^{s-2,p}, \quad \text{and} \quad a_R = \tfrac{1}{8} R \in W^{s-2,p}, \]
where $R$ is the scalar curvature of the metric $h$. (By Corollary 3(b) in Appendix A.4, we know $h_{ab} \in W^{s,p}$ implies $R \in W^{s-2,p}$.) Given any two functions $\phi_-, \phi_+ \in W^{s,p}$ with $0 < \phi_- \le \phi_+$, introduce the nonlinear operator
\[ f_w : [\phi_-, \phi_+]_{s,p} \to W^{s-2,p}, \qquad f_w(\phi) = a_\tau \phi^5 + a_R \phi - a_\rho \phi^{-3} - a_w \phi^{-7}, \tag{4.8} \]



where the pointwise multiplication by an element of $W^{s,p}$ defines a bounded linear map in $W^{s-2,p}$ since $s - 2 \ge -s$ and $2(s - \frac{3}{p}) > 0 > 2 - 3$, cf. Corollary 3(a) in Appendix A.4. In case the coupled system is under consideration, the dependence of $f_w$ on $w$ is hidden in the fact that the coefficient $a_w$ depends on $w$, cf. (2.18). For generality, in the following we will regard the operator $f_w$ as depending on $a_w$. We now formulate the Hamiltonian constraint equation as follows: find an element $\phi \in [\phi_-, \phi_+]_{s,p}$ solution of
\[ A_L \phi + f_w(\phi) = 0. \tag{4.9} \]

To establish existence results for weak solutions to the Hamiltonian constraint equation using fixed-point arguments, we will rely on the existence of generalized (weak) sub- and super-solutions (sometimes called barriers) which will be derived later in Sect. 5. Let us recall the definition of sub- and super-solutions in the following, in a slightly generalized form that will be necessary in our study of the coupled system. A function $\phi_- \in (0, \infty) \cap W^{s,p}$ is called a sub-solution of (2.23) iff the function $\phi_-$ satisfies the inequality
\[ A_L \phi_- + f_w(\phi_-) \le 0, \tag{4.10} \]
for some $a_w \in W^{s-2,p}$. A function $\phi_+ \in (0, \infty) \cap W^{s,p}$ is called a super-solution of (2.23) iff the function $\phi_+$ satisfies the inequality
\[ A_L \phi_+ + f_w(\phi_+) \ge 0, \tag{4.11} \]
for some $a_w \in W^{s-2,p}$. We say a pair of sub- and super-solutions is compatible if they satisfy
\[ 0 < \phi_- \le \phi_+ < \infty, \tag{4.12} \]
so that the interval $[\phi_-, \phi_+] \cap W^{s,p}$ is both nonempty and bounded.

We now turn to the construction of the fixed-point mapping $T : U \times R(S) \to X$ for the Hamiltonian constraint and its properties. There are a number of possibilities for defining $T$; the requirements are (1) that every fixed-point of $T$ must be a solution to the Hamiltonian constraint; (2) $T$ must be a continuous map from its domain to its range; and (3) $T$ must be invariant on a non-empty, convex, closed, bounded subset $U$ of an ordered Banach space $Z$, with $X \hookrightarrow Z$ compact. It will be sufficient to define $T$ using a variation of the Picard iteration as follows. Due to the presence of the non-trivial kernel of the operator $A_L$, which is a consequence of working with a closed manifold, we must introduce a shift into the Hamiltonian constraint equation in order to construct $T$ with the required properties.

Lemma 3 (Properties of the map T). In the above described setting, assume that $p \in (\frac{3}{2}, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap [1, 3]$. With $a_0 \in W_+^{s-2,p}$ satisfying $a_0 \neq 0$, and $\psi \in W_+^{s,p}$, let $a_s = a_0 + a_w \psi \in W^{s-2,p}$. Fix the functions $\phi_-, \phi_+ \in W^{s,p}$ such that $0 < \phi_- \le \phi_+$, and define the shifted operators
\[ A_L^s : W^{s,p} \to W^{s-2,p}, \qquad A_L^s \phi := A_L \phi + a_s \phi, \tag{4.13} \]
\[ f_w^s : [\phi_-, \phi_+]_{s,p} \to W^{s-2,p}, \qquad f_w^s(\phi) := f_w(\phi) - a_s \phi. \tag{4.14} \]
Let, for $\phi \in [\phi_-, \phi_+]_{s,p}$ and $a_w \in W^{s-2,p}$,
\[ T^s(\phi, a_w) := -(A_L^s)^{-1} f_w^s(\phi). \tag{4.15} \]



Then, the map $T^s : [\phi_-, \phi_+]_{s,p} \times W^{s-2,p} \to W^{s,p}$ is continuous in both arguments. Moreover, there exist $\tilde{s} \in (\frac{3}{p}, s)$ and a constant $C$ such that
\[ \|T^s(\phi, a_w)\|_{s,p} \le C \bigl(1 + \|a_w\|_{s-2,p}\bigr) \|\phi\|_{\tilde{s},p}, \tag{4.16} \]

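for all $\phi \in [\phi_-, \phi_+]_{s,p}$ and $a_w \in W^{s-2,p}$.

Proof. In this proof, we denote by $C$ a generic constant that may have different values at its different occurrences. By applying Lemma 29 from the Appendix, for any $\tilde{s} \in (\frac{3}{p}, s]$, $s - 2 \in [-1, 1]$ and $\frac{1}{p} \in (\frac{s-1}{2}\delta, \, 1 - \frac{3-s}{2}\delta)$ with $\delta = \frac{1}{p} - \frac{\tilde{s}-1}{3}$, we have
\[ \|f_w^s(\phi)\|_{s-2,p} \le C \Bigl( \|a_\tau\|_{s-2,p} \|\phi_+^4\|_\infty + \|a_\rho\|_{s-2,p} \|\phi_-^{-4}\|_\infty + \|a_w\|_{s-2,p} \bigl( \|\phi_-^{-8}\|_\infty + \|\psi\|_{\tilde{s},p} \bigr) + \|a_R + a_0\|_{s-2,p} \Bigr) \|\phi\|_{\tilde{s},p}. \]
Let us verify that $\frac{1}{p}$ is indeed in the prescribed range. First, we have $\delta = \frac{1}{p} - \frac{\tilde{s}}{3} + \frac{1}{3} < \frac{1}{3}$ since $\frac{\tilde{s}}{3} - \frac{1}{p} > 0$, and taking into account $s \ge 1$, we infer
\[ 1 - \frac{3-s}{2}\delta \ge 1 - \frac{3-1}{2} \cdot \frac{1}{3} = \frac{2}{3}. \]
This shows $\frac{1}{p} < 1 - \frac{3-s}{2}\delta$ for $p > \frac{3}{2}$, which is not sharp, but will be sufficient for our analysis. For the other bound, we need $\frac{1}{p} > \frac{s-1}{2}\delta$, that is, $\frac{(s-1)(\tilde{s}-1)}{6} > \frac{s-3}{2p}$.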

0 such that
\[ \|T^s(\phi, a_w)\|_{\tilde{s},p} \le K \bigl(1 + \|a_w\|_{s-2,p}\bigr) \|\phi\|_{t,p}, \qquad \forall \phi \in [\phi_-, \phi_+]_{\tilde{s},p}. \]
For any $\varepsilon > 0$, the norm $\|\phi\|_{t,p}$ can be bounded by the interpolation estimate
\[ \|\phi\|_{t,p} \le \varepsilon \|\phi\|_{\tilde{s},p} + C \varepsilon^{-t/(\tilde{s}-t)} \|\phi\|_p, \]
where $C$ is a constant independent of $\varepsilon$. Since $\phi$ is bounded from above by $\phi_+$, $\|\phi\|_p$ is bounded uniformly, and now demanding that $\phi \in B_M$, we get
\[ \|T^s(\phi, a_w)\|_{\tilde{s},p} \le K \bigl[1 + \|a_w\|_{s-2,p}\bigr] \bigl( M\varepsilon + C \varepsilon^{-t/(\tilde{s}-t)} \bigr), \tag{4.17} \]
with possibly different constant $C$. Choosing $\varepsilon$ such that $2\varepsilon K [1 + \|a_w\|_{s-2,p}] = 1$ and setting $M = 2KC[1 + \|a_w\|_{s-2,p}] \varepsilon^{-t/(\tilde{s}-t)}$, each of the two terms on the right-hand side of (4.17) equals $M/2$, so the right-hand side of (4.17) is bounded by $M$.

5. Barriers for the Hamiltonian Constraint

The results developed in Sect. 4.2 for a particular fixed-point map $T$ for analyzing the Hamiltonian constraint equation and the coupled system rely on the existence of generalized (weak) sub- and super-solutions, or barriers. There, the Hamiltonian constraint was studied in isolation from the momentum constraint, and these generalized barriers only needed to satisfy the conditions given at the beginning of Sect. 4.2 for a given fixed function $w$ appearing as a source term in the nonlinearity of the Hamiltonian constraint. Therefore, these types of barriers are sometimes referred to as local barriers, in that the coupling to the momentum constraint is ignored. In order to establish existence results for the coupled system in the non-CMC case, it will be critical that the sub- and super-solutions satisfy one additional property that now reflects the coupling, giving rise to the term global barriers. It will be useful now to define this global property precisely.

Definition 1. A sub-solution $\phi_-$ is called global iff it is a sub-solution of (2.23) for all vector fields $w_\phi$ solution of (2.24) with source function $\phi \in [\phi_-, \infty) \cap W^{s,p}$. A super-solution $\phi_+$ is called global iff it is a super-solution of (2.23) for all vector fields $w_\phi$ solution of (2.24) with source function $\phi \in (0, \phi_+] \cap W^{s,p}$. A pair $\phi_- \le \phi_+$ of sub- and super-solutions is called an admissible pair if $\phi_-$ and $\phi_+$ are sub- and super-solutions of (2.23) for all vector fields $w_\phi$ of (2.24) with source function $\phi \in [\phi_-, \phi_+] \cap W^{s,p}$.
It is obvious that if $\phi_-$ and $\phi_+$ are respectively global sub- and super-solutions, then the pair $\phi_-, \phi_+$ is admissible in the sense above, provided they satisfy the compatibility condition (4.12). Below we give a number of (local and global) sub- and super-solution constructions for closed manifolds; analogous constructions for compact manifolds with boundary are given in [21]. These constructions are based on generalizing known constant sub- and super-solution constructions given previously in the literature for closed manifolds. On one hand, the generalized global sub-solution constructions appearing here and in [21] do not require the near-CMC condition, inheriting this property from the known sub-solutions from the literature on which they are based. However, on the other hand, all previously known global super-solutions for the Hamiltonian constraint equation have required the near-CMC condition.

Here and in [21,22], one of our primary interests is in developing existence results for weak (and strong) non-CMC solutions to the coupled system which are free of the near-CMC assumption. This assumption had appeared in two distinct places in all prior literature on this problem [1,26]; the first assumption appears in the construction of a fixed-point argument based on strict k-contractions, and the second assumption appears in the construction of global super-solutions. Here and in [21,22], an alternative fixed-point framework based on compactness arguments rather than k-contractions is used to remove the first of these near-CMC assumptions. In this section, we give some new constructions of global super-solutions that are free of the near-CMC assumption, along with some compatible sub-solutions. These sub- and super-solution constructions are needed (without their global property) for the existence result for the Hamiltonian constraint (Theorem 3), and they are also needed (now with their global property) for the general fixed-point result for the coupled system (Theorem 5), leading to our two main non-CMC results (Theorems 1 and 2). The super-solutions in Lemmata 7(b) and 9 appear to be the first such near-CMC-free constructions, and provide the second key piece of the puzzle we need in order to establish non-CMC results through Theorem 5 without the near-CMC condition.

Throughout this section, we will assume that the background metric $h$ belongs to $W^{s,p}$ with $p \in (1, \infty)$ and $s \in (\frac{3}{p}, \infty) \cap (1, 2]$. Recall that $r = \frac{3p}{3 + (2-s)p}$, so that the continuous embedding $L^r \hookrightarrow W^{s-2,p}$ holds. Given a symmetric two-index tensor $\sigma \in L^{2r}$ and a vector field $w \in W^{1,2r}$, introduce the functions $a_\sigma = \frac{1}{8}\sigma^2 \in L^r$ and $a_{\mathcal{L}w} = \frac{1}{8}(\mathcal{L}w)^2 \in L^r$. Note that under these conditions $a_w$ belongs to $L^r \hookrightarrow W^{s-2,p}$, and that if $a_\sigma, a_{\mathcal{L}w} \in L^\infty$ we have the pointwise estimate
\[ a_w^\wedge \le 2a_\sigma^\wedge + 2a_{\mathcal{L}w}^\wedge. \]
Here and in what follows, given any scalar function $u \in L^\infty$, we use the notation
\[ u^\wedge := \operatorname{ess\,sup} u, \qquad u^\vee := \operatorname{ess\,inf} u. \]

In some places we will assume that when the vector field $w \in W^{1,2r}$ is given by the solution of the momentum constraint equation (2.24) (or (4.2)) with the source term $\phi \in W^{s,p}$,
\[ a_{\mathcal{L}w}^\wedge \le k(\phi) := k_1 \|\phi\|_\infty^{12} + k_2, \tag{5.1} \]
with some positive constants $k_1$ and $k_2$. We can verify this assumption e.g. when the conditions of Corollary 1 are satisfied, since from Corollary 1 we would get
\[ a_{\mathcal{L}w}^\wedge \le \|\mathcal{L}w\|_\infty^2 \le C^2 \bigl( \|\phi\|_\infty^6 \|b_\tau\|_z + \|b_j\|_{e-2,q} \bigr)^2, \]
giving the bound (5.1) with the constants
\[ k_1 = 2C^2 \|b_\tau\|_z^2, \quad \text{and} \quad k_2 = 2C^2 \|b_j\|_{e-2,q}^2. \tag{5.2} \]
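The step from the estimate of Corollary 1 to the constants in (5.2) can be spelled out as follows. This is a sketch of the intermediate computation, assuming only the definition $a_{\mathcal{L}w} = \frac{1}{8}(\mathcal{L}w)^2$, the bound (4.4), and the elementary inequality $(x + y)^2 \le 2x^2 + 2y^2$.

```latex
% Assumes (4.4) from Corollary 1 and a_{Lw} = (1/8)(Lw)^2; the last step
% uses (x + y)^2 <= 2x^2 + 2y^2 with x = ||phi||^6 ||b_tau||_z, y = ||b_j||.
\begin{align*}
a^{\wedge}_{\mathcal{L}w}
  = \tfrac{1}{8}\,\|\mathcal{L}w\|_{\infty}^{2}
  &\le \|\mathcal{L}w\|_{\infty}^{2}
   \le C^{2}\bigl(\|\phi\|_{\infty}^{6}\,\|b_{\tau}\|_{z}
        + \|b_{j}\|_{e-2,q}\bigr)^{2} \\
  &\le 2C^{2}\,\|b_{\tau}\|_{z}^{2}\,\|\phi\|_{\infty}^{12}
        + 2C^{2}\,\|b_{j}\|_{e-2,q}^{2}
   = k_{1}\,\|\phi\|_{\infty}^{12} + k_{2}.
\end{align*}
```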

5.1. Constant barriers. Now we will present some global sub- and super-solutions for the Hamiltonian constraint equation (2.23) which are constant functions. The proofs essentially follow the arguments in [21] for the case of compact manifolds with boundary.

Lemma 7 (Global super-solution). Let $(M, h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Assume that the estimate (5.1) holds for the solution of the momentum constraint equation, and assume that $a_\rho, a_\sigma \in L^\infty$ and that $a_R$ is uniformly bounded from below. With the parameter $\varepsilon > 0$ to be chosen later, define the rational polynomial
\[ q_\varepsilon(\chi) = (a_\tau^\vee - K_{1\varepsilon}) \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - K_{2\varepsilon} \chi^{-7}, \]
where $K_{1\varepsilon} := (1 + \frac{1}{\varepsilon}) k_1$ and $K_{2\varepsilon} := (1 + \varepsilon) a_\sigma^\wedge + (1 + \frac{1}{\varepsilon}) k_2$. We distinguish the following two cases:

(a) In case $k_1 < a_\tau^\vee$, choose $\varepsilon > \frac{k_1}{a_\tau^\vee - k_1}$. If $q_\varepsilon$ has a root, let $\phi_+ = \phi_1(a_\tau^\vee - K_{1\varepsilon}, a_R^\vee, a_\rho^\wedge, K_{2\varepsilon})$ be the largest positive root of $q_\varepsilon$, and if $q_\varepsilon$ has no positive roots, let $\phi_+ = 1$. Now, the constant $\phi_+$ is a global super-solution of the Hamiltonian constraint equation (2.23).

(b) In case $k_1 \ge a_\tau^\vee$, choose $\varepsilon > 0$. In addition, assume that $a_R^\vee > 0$ and that both $a_\rho^\wedge$ and $K_{2\varepsilon}$ are sufficiently small, such that $q_\varepsilon$ has two positive roots. Then, the largest root $\phi_+ = \phi_2(a_\tau^\vee - K_{1\varepsilon}, a_R^\vee, a_\rho^\wedge, K_{2\varepsilon})$ of $q_\varepsilon$ is a super-solution of the Hamiltonian constraint equation (2.23).

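Proof. We look for a super-solution among the constant functions. Let $\chi$ be any positive constant. Then we have

$$f(\chi, w) = a_\tau \chi^5 + a_R \chi - a_\rho \chi^{-3} - a_w \chi^{-7} \ge a_\tau^\vee \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - a_w^\wedge \chi^{-7}.$$

Given any $\varepsilon > 0$, the inequality $2|\sigma_{ab}(Lw)^{ab}| \le \varepsilon\sigma^2 + \frac{1}{\varepsilon}(Lw)^2$ implies that

$$8a_w = \sigma^2 + (Lw)^2 + 2\sigma_{ab}(Lw)^{ab} \le (1+\varepsilon)\sigma^2 + (1 + \tfrac{1}{\varepsilon})(Lw)^2,$$

hence, taking into account (5.1), for any $w \in W^{1,2r}$ that is a solution of the momentum constraint equation (2.24) with any source term $\phi \in (0,\chi]$, the constant $a_w^\wedge$ must fulfill the inequality

$$a_w^\wedge \le (1+\varepsilon)a_\sigma^\wedge + (1 + \tfrac{1}{\varepsilon})a_{Lw}^\wedge \le K_{1\varepsilon}\|\phi\|_\infty^{12} + K_{2\varepsilon}. \tag{5.3}$$

Thus, for any constant $\chi > 0$ and all $\phi \in (0,\chi]$, it holds that

$$f(\chi, w_\phi) \ge a_\tau^\vee \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - \left[K_{1\varepsilon}\|\phi\|_\infty^{12} + K_{2\varepsilon}\right]\chi^{-7} \ge B_\varepsilon \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - K_{2\varepsilon}\chi^{-7},$$

where $B_\varepsilon := a_\tau^\vee - K_{1\varepsilon}$. Introduce the rational polynomial in $\chi$ given by

$$q_\varepsilon(\chi) := B_\varepsilon \chi^5 + a_R^\vee \chi - a_\rho^\wedge \chi^{-3} - K_{2\varepsilon}\chi^{-7}. \tag{5.4}$$

We calculate the first and second derivatives of $q_\varepsilon$ as

$$q_\varepsilon'(\chi) = 5B_\varepsilon \chi^4 + a_R^\vee + 3a_\rho^\wedge \chi^{-4} + 7K_{2\varepsilon}\chi^{-8}, \qquad q_\varepsilon''(\chi) = 20B_\varepsilon \chi^3 - 12a_\rho^\wedge \chi^{-5} - 56K_{2\varepsilon}\chi^{-9}. \tag{5.5}$$

Consider the case (a). In this case, because of the choice $\varepsilon > \frac{k_1}{a_\tau^\vee - k_1}$, we have $B_\varepsilon > 0$, and so $q_\varepsilon(\chi) > 0$ for sufficiently large $\chi$, and $q_\varepsilon$ is increasing. The function $q_\varepsilon$ has no positive root only if $a_\rho^\wedge = K_{2\varepsilon} = 0$. So if $q_\varepsilon$ has no positive root, $q_\varepsilon(\chi) \ge 0$ for all $\chi \ge 0$. If $q_\varepsilon$ has at least one positive root, denoting by $\phi_1$ the largest positive root, $q_\varepsilon(\chi) \ge 0$ for all $\chi \ge \phi_1$. Recalling now that any constant $\chi$ satisfies $A_L\chi = 0$, we conclude that

$$A_L\chi + f(\chi, w_\phi) \ge 0 \qquad \forall \chi \ge \phi_1, \ \forall \phi \in (0,\chi],$$

implying that $\phi_+$ is a global super-solution of the Hamiltonian constraint (2.21).

For the case (b), since $B_\varepsilon < 0$ and $a_\rho^\wedge$ and $K_{2\varepsilon}$ are nonnegative, the first derivative $q_\varepsilon'(\chi)$ is strictly decreasing for $\chi > 0$, and since $q_\varepsilon'(\chi) > 0$ for sufficiently small $\chi > 0$ and $q_\varepsilon'(\chi) < 0$ for sufficiently large $\chi > 0$, the derivative $q_\varepsilon'$ has a unique positive root, at which the polynomial $q_\varepsilon$ attains its maximum over $(0,\infty)$. This maximum is positive if both $a_\rho^\wedge$ and $K_{2\varepsilon}$ are sufficiently small, and hence the polynomial $q_\varepsilon$ has two positive roots $\phi_1 \le \phi_2$. Similarly to the above we conclude that

$$A_L\chi + f(\chi, w_\phi) \ge 0 \qquad \forall \chi \in [\phi_1,\phi_2], \ \forall \phi \in (0,\chi],$$

implying that $\phi_+$ is a global super-solution of the Hamiltonian constraint (2.21).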
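To make case (a) concrete, here is a minimal numerical sketch in Python. It is not part of the paper: the function names and the sample data are hypothetical, and it only covers the situation $B_\varepsilon > 0$ with $a_\rho^\wedge + K_{2\varepsilon} > 0$, in which $q_\varepsilon$ changes sign exactly once on $(0,\infty)$ (substituting $y = \chi^4$ gives a cubic with a single sign change in its coefficients). Plain bisection then locates the constant barrier.

```python
def coefficients(a_tau_min, a_sigma_max, k1, k2, eps):
    """Return (B_eps, K_2eps) from the definitions in Lemma 7."""
    K1 = (1 + 1 / eps) * k1
    K2 = (1 + eps) * a_sigma_max + (1 + 1 / eps) * k2
    return a_tau_min - K1, K2


def q_eps(chi, B, a_R_min, a_rho_max, K2):
    """q_eps(chi) = B chi^5 + a_R chi - a_rho chi^-3 - K2 chi^-7."""
    return B * chi**5 + a_R_min * chi - a_rho_max * chi**-3 - K2 * chi**-7


def largest_positive_root(B, a_R_min, a_rho_max, K2, iters=200):
    """Case (a) only: B > 0 and a_rho_max + K2 > 0 (assumption of this sketch)."""
    if B <= 0 or a_rho_max + K2 <= 0:
        raise ValueError("sketch covers only B > 0 with a_rho_max + K2 > 0")
    q = lambda chi: q_eps(chi, B, a_R_min, a_rho_max, K2)
    lo = hi = 1.0
    while q(lo) >= 0:      # q -> -inf as chi -> 0+
        lo /= 2
    while q(hi) <= 0:      # q -> +inf as chi -> +inf
        hi *= 2
    for _ in range(iters):  # bisection on the unique sign change
        mid = 0.5 * (lo + hi)
        if q(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical data with k1 < a_tau_min and eps = 1 > k1 / (a_tau_min - k1) = 1/2.
B, K2 = coefficients(a_tau_min=3.0, a_sigma_max=0.25, k1=1.0, k2=0.25, eps=1.0)
phi_plus = largest_positive_root(B, a_R_min=0.0, a_rho_max=0.0, K2=K2)
print(B, K2, phi_plus)
```

Any constant $\chi \ge$ `phi_plus` then satisfies $q_\varepsilon(\chi) \ge 0$, which is the inequality the proof uses.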

Case (a) of the above lemma has the condition $k_1 < a_\tau^\vee$, which is the near-CMC condition. This condition seems to be present in all non-CMC results to date. The above condition also requires that the extrinsic mean curvature $\tau$ is nowhere zero. Noting that there are solutions even for $\tau \equiv 0$ in some cases (cf. [25]), the condition $\inf \tau > 0$ appears as a rather strong restriction. We see that case (b) of the above lemma removes this restriction, in exchange for the smallness conditions on $\rho$, $j$, and $\sigma$. We also need the scalar curvature to be strictly positive, a condition that is relaxed in the next subsection to allow any metric in the positive Yamabe class. In the following lemma, we list some constant sub-solutions. They impose considerable restrictions on the allowable data, which is the main reason to consider non-constant sub-solutions in the next subsection.

Lemma 8 (Global sub-solution). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Assume that $a_\tau \in L^\infty$ and that $a_R$ is uniformly bounded from above. We distinguish the following three cases:

(a) If $a_R^\wedge < 0$, then the unique positive root of the polynomial $q(\chi) = a_\tau^\wedge \chi^4 + a_R^\wedge$ is a global sub-solution of (2.23).

(b) If $a_\rho^\vee > 0$, then the unique positive root of the polynomial $q_\rho(\chi) = a_\tau^\wedge \chi^8 + \max\{1, a_R^\wedge\}\chi^4 - a_\rho^\vee$ is a global sub-solution of (2.23).

(c) Let $\phi_+ > 0$ be a global super-solution of the Hamiltonian constraint. Let $a_\sigma^\vee > k(\phi_+)$, where $k$ is as in (5.1). Then, with some $\varepsilon \in (k(\phi_+)/a_\sigma^\vee, 1)$, the unique positive root $\phi_-$ of the polynomial $q_\sigma(\chi) = a_\tau^\wedge \chi^{12} + \max\{1, a_R^\wedge\}\chi^8 - K_\varepsilon$, where $K_\varepsilon := (1-\varepsilon)a_\sigma^\vee - (\frac{1}{\varepsilon} - 1)k(\phi_+)$, is a global sub-solution of (2.23).

Proof. For the proof of (a,b), see e.g. [21]. We give a proof of (c) here. Let $\chi > 0$ be any constant function, and let $w \in W^{1,2r}$. Then we have

$$f(\chi, w) = a_\tau \chi^5 + a_R \chi - a_\rho \chi^{-3} - a_w \chi^{-7} \le a_\tau^\wedge \chi^5 + a_R^\wedge \chi - a_w^\vee \chi^{-7} \le a_\tau^\wedge \chi^5 + C\chi - a_w^\vee \chi^{-7}, \tag{5.6}$$

where we have used that $a_\rho$ is nonnegative, and introduced the constant $C = \max\{1, a_R^\wedge\}$. Given any $\varepsilon > 0$, the inequality $2|\sigma_{ab}(Lw)^{ab}| \le \varepsilon\sigma^2 + \frac{1}{\varepsilon}(Lw)^2$ implies that

$$8a_w = \sigma^2 + (Lw)^2 + 2\sigma_{ab}(Lw)^{ab} \ge (1-\varepsilon)\sigma^2 - (\tfrac{1}{\varepsilon} - 1)(Lw)^2,$$

hence, taking into account (5.1), for any $w \in W^{1,2r}$ that is a solution of the momentum constraint equation (2.24) with any source term $\phi \in (0,\phi_+]$, the constant $a_w^\vee$ must fulfill the inequality

$$a_w^\vee \ge (1-\varepsilon)a_\sigma^\vee - (\tfrac{1}{\varepsilon} - 1)a_{Lw}^\wedge \ge (1-\varepsilon)a_\sigma^\vee - (\tfrac{1}{\varepsilon} - 1)k(\phi_+) =: K_\varepsilon.$$

We use the above estimate in (5.6) to get, for any $w \in W^{1,2r}$ that is a solution of the momentum constraint equation (2.24) with any source term $\phi \in (0,\phi_+]$,

$$f(\chi, w) \le a_\tau^\wedge \chi^5 + C\chi - K_\varepsilon \chi^{-7}.$$

Because of the choice $k(\phi_+)/a_\sigma^\vee < \varepsilon < 1$, we have $K_\varepsilon > 0$. So with the unique positive root $\chi_*$ of $q_\sigma(\chi) := a_\tau^\wedge \chi^5 + C\chi - K_\varepsilon \chi^{-7}$, we have $q_\sigma(\chi) \le 0$ for any constant $\chi \in (0,\chi_*]$, establishing the proof.
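The role of the window $\varepsilon \in (k(\phi_+)/a_\sigma^\vee, 1)$ in case (c) can be illustrated numerically. The Python sketch below is ours, with hypothetical sample data; it evaluates $K_\varepsilon$ and then finds the positive root of $a_\tau^\wedge \chi^{12} + C\chi^8 - K_\varepsilon$ by bisection in the variable $y = \chi^4$, where the polynomial is increasing. It assumes $a_\tau^\wedge \ge 0$.

```python
def K_eps(a_sigma_min, k_phi_plus, eps):
    """K_eps = (1 - eps) a_sigma^v - (1/eps - 1) k(phi_+), as in Lemma 8(c)."""
    return (1 - eps) * a_sigma_min - (1 / eps - 1) * k_phi_plus


def sub_solution_root(a_tau_max, a_R_max, K, iters=200):
    """Positive root chi of a_tau_max chi^12 + C chi^8 - K, C = max(1, a_R_max)."""
    if K <= 0:
        raise ValueError("need K > 0, i.e. eps in (k(phi_+)/a_sigma^v, 1)")
    C = max(1.0, a_R_max)
    g = lambda y: a_tau_max * y**3 + C * y**2 - K  # y = chi^4, increasing for y > 0
    lo, hi = 0.0, 1.0
    while g(hi) < 0:
        hi *= 2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (0.5 * (lo + hi)) ** 0.25


# Hypothetical data: a_sigma^v = 6, k(phi_+) = 1, so eps must lie in (1/6, 1).
K = K_eps(6.0, 1.0, 0.5)
chi_star = sub_solution_root(a_tau_max=1.0, a_R_max=0.0, K=K)
print(K, chi_star)
```

Choosing $\varepsilon$ below the threshold (for instance $\varepsilon = 0.1$ with the same data) makes $K_\varepsilon$ negative, and the construction then gives no positive root.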

5.2. Non-constant barriers. All global super-solutions found to date appear to require the near-CMC condition; Lemma 7(b) avoids the near-CMC condition, but it requires the scalar curvature to be strictly positive. The following lemma extends this result to arbitrary metrics in the positive Yamabe class $\mathcal{Y}^+(M)$.

Lemma 9 (Global super-solution $h \in \mathcal{Y}^+$). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in $\mathcal{Y}^+(M)$. Assume there exist continuous positive functions $u, \Lambda \in W^{s,p}$ that together satisfy:

$$-\Delta u + \tfrac{1}{8}Ru = \Lambda > 0, \qquad u > 0. \tag{5.7}$$

Let $0 < k_3 := u^\wedge/u^\vee < \infty$, which is a trivially satisfied Harnack-type inequality. Assume that the estimate (5.1) is satisfied for the solution of the momentum constraint equation for two positive constants $k_1$ and $k_2$, and assume that $a_\rho, a_\sigma \in L^\infty$. If the constants $a_\rho^\wedge$, $a_\sigma^\wedge$, and $k_2$ are sufficiently small, then

$$\phi_+ = \beta u, \qquad \beta = \left[\frac{\Lambda^\vee}{2k_1 k_3^{12}(u^\wedge)^5}\right]^{1/4} > 0, \tag{5.8}$$

is a positive global super-solution to the Hamiltonian constraint equation.

Proof. Taking $\phi = \beta u$ with a constant $\beta > 0$ in (5.7) gives

$$-\Delta\phi + a_R\phi = \beta(-\Delta u + \tfrac{1}{8}Ru) = \beta\Lambda. \tag{5.9}$$

Then for any $\varphi \in C_+^\infty$, by using (5.3) with $K_1 := 2k_1$ and $K_2 := 2a_\sigma^\wedge + 2k_2$, we infer

$$\begin{aligned}
\langle A_L\phi + f(\phi,w), \varphi\rangle &= \langle\nabla\phi, \nabla\varphi\rangle + \langle a_R\phi + a_\tau\phi^5 - a_\rho\phi^{-3} - a_w\phi^{-7}, \varphi\rangle \\
&\ge \langle \beta\Lambda + a_\tau^\vee\phi^5 - [K_1(\phi^\wedge)^{12} + K_2]\phi^{-7} - a_\rho^\wedge\phi^{-3}, \varphi\rangle \\
&\ge \langle \beta\Lambda + [a_\tau^\vee - K_1 k_3^{12}]\phi^5 - K_2\phi^{-7} - a_\rho^\wedge\phi^{-3}, \varphi\rangle \\
&\ge \langle \beta\, G(\beta, K_2, a_\rho), \varphi\rangle,
\end{aligned}$$

where

$$G(\beta, K_2, a_\rho) := \Lambda^\vee - K_1 k_3^{12}\beta^4(u^\wedge)^5 - K_2\beta^{-8}(u^\wedge)^{-7} - a_\rho^\wedge\beta^{-4}(u^\wedge)^{-3},$$

and where we have used the fact that $\phi^\wedge/\phi^\vee = u^\wedge/u^\vee = k_3$. Therefore, to ensure $\phi$ is a super-solution we must now pick arguments ensuring $G(\beta, K_2, a_\rho) \ge 0$. We first pick $\beta$ as in (5.8) giving

$$\tfrac{1}{2}\Lambda^\vee = \Lambda^\vee - K_1 k_3^{12}(u^\wedge)^5\beta^4 > 0.$$

For this fixed $\beta$, we then pick $K_2$ and $a_\rho^\wedge$, each sufficiently small, so that

$$\tfrac{1}{2}\Lambda^\vee - K_2\beta^{-8}(u^\wedge)^{-7} - a_\rho^\wedge\beta^{-4}(u^\wedge)^{-3} \ge 0.$$

The result then follows.
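The two-step choice in the proof (first fix $\beta$, then require smallness of the data) can be sketched numerically. The Python code below is only an illustration under our own assumptions: the function names and sample numbers are hypothetical, the symbol $\Lambda^\vee$ is passed as `lam_min`, and $\beta$ is determined from the relation $K_1 k_3^{12}(u^\wedge)^5\beta^4 = \frac{1}{2}\Lambda^\vee$ stated in the proof, with $K_1$ the constant used there.

```python
def G(beta, K1, K2, a_rho_max, lam_min, u_max, k3):
    """G(beta, K2, a_rho) from the proof of Lemma 9 (lam_min stands for Lambda^v)."""
    return (lam_min
            - K1 * k3**12 * beta**4 * u_max**5
            - K2 * beta**-8 * u_max**-7
            - a_rho_max * beta**-4 * u_max**-3)


def beta_half(K1, lam_min, u_max, k3):
    """beta with K1 k3^12 (u^)^5 beta^4 = lam_min / 2, the relation used in the proof."""
    return (lam_min / (2 * K1 * k3**12 * u_max**5)) ** 0.25


# Hypothetical sample numbers, chosen so that beta = 1.
beta = beta_half(K1=1.0, lam_min=2.0, u_max=1.0, k3=1.0)
small = G(beta, 1.0, K2=0.25, a_rho_max=0.25, lam_min=2.0, u_max=1.0, k3=1.0)
large = G(beta, 1.0, K2=2.0, a_rho_max=0.25, lam_min=2.0, u_max=1.0, k3=1.0)
print(beta, small, large)
```

With small $K_2$ and $a_\rho^\wedge$ the value of $G$ stays nonnegative, while a large $K_2$ drives it negative, which is why the smallness assumptions on $a_\rho^\wedge$, $a_\sigma^\wedge$ and $k_2$ enter the lemma.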

Remark 3. We now make some remarks about the existence of a pair of positive functions $(u, \Lambda)$ which satisfy the hypotheses of Lemma 9. Let the background metric $h_{ab} \in W^{s,p}$ be in the positive Yamabe class. Then in Theorem 11 in Appendix A.7, for the sub-critical range $1 \le q < 5$ we establish the existence of a positive $u \in W^{s,p}$ and a constant $\mu_q > 0$ satisfying $-8\Delta u + Ru = \mu_q u^q$. So the pair $(u, \frac{1}{8}\mu_q u^q)$ readily satisfies (5.7). In a sense the simplest construction of the near-CMC-free global super-solution in Lemma 9 arises by taking $q = 1$; one is then simply using the first eigenfunction of the conformal Laplacian to build the global super-solution. Alternatively, one can consider a solution to the Yamabe problem $-8\Delta u + Ru = u^5$, $u > 0$, which exists for sufficiently smooth metrics in the positive Yamabe class, cf. [31]. This approach is taken for simplicity in [22]. In any case, note that the function $u > 0$ that satisfies (5.7) is the conformal factor which transforms $h_{ab}$ into a metric with scalar curvature $R_u = 8\Lambda u^{-5} > 0$.

We remark that without the near-CMC condition, the only potentially strictly positive term appearing in the nonlinearity of the Hamiltonian constraint is the term involving the scalar curvature $R$. Therefore, global super-solution constructions based on the approach in Lemma 9 are restricted to data in $\mathcal{Y}^+(M)$. We extend this observation in the next lemma, which essentially says that in a nonpositive Yamabe class, there is no way to build a positive global super-solution without the near-CMC condition as long as we use a global estimate of type (5.1).

Lemma 10 (Near-CMC condition and $a_w$ bounds). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in a nonpositive Yamabe class, and let $a_\tau$ be continuous. Let $\phi_+ \in W^{s,p}$ with $\phi_+ > 0$ be a global super-solution to the Hamiltonian constraint equation. We assume that any vector field $w \in W^{1,2r}$ that is a solution of the momentum constraint equation with a source $\phi \le \phi_+$ satisfies the estimate

$$a_w \le K_1\|\phi_+\|_\infty^{12} + K_2, \tag{5.10}$$

with some positive constants $K_1$ and $K_2$. Moreover, we assume that this estimate is sharp in the sense that for any $x \in M$ there exist an open neighborhood $U \ni x$ and a vector field $w \in W^{1,2r}$, a solution of the momentum constraint equation with a source $\phi \le \phi_+$, such that

$$a_w = K_1\|\phi_+\|_\infty^{12} + K_2 \quad \text{in } U. \tag{5.11}$$

Then, we have $K_1 \le \sup_M a_\tau$.

Proof. Since the metric is in a nonpositive Yamabe class, there exists $\tilde\varphi \in W_+^{2-s,p'}$ such that $\langle\nabla\phi_+, \nabla\tilde\varphi\rangle + \langle a_R\phi_+, \tilde\varphi\rangle \le 0$. The collection of all neighborhoods in (5.11) will form an open cover of $M$, and let $\{U_i\}$ be one of its finite subcovers. Let $\{\mu_i\}$ be a partition of unity subordinate to $\{U_i\}$. Then, by writing $\tilde\varphi = \sum_i \mu_i\tilde\varphi$, we can expand the expression $\langle\nabla\phi_+, \nabla\tilde\varphi\rangle + \langle a_R\phi_+, \tilde\varphi\rangle$ into a finite sum, which has at least one non-positive term. Without loss of generality, let us assume $\langle\nabla\phi_+, \nabla\varphi\rangle + \langle a_R\phi_+, \varphi\rangle \le 0$ with $\varphi = \mu_i\tilde\varphi$. With $w \in W^{1,2r}$ being a vector field that satisfies (5.11) with respect to $U := U_i$, we have

$$\begin{aligned}
0 &\le \langle\nabla\phi_+, \nabla\varphi\rangle + \langle a_R\phi_+ + a_\tau\phi_+^5 - a_w\phi_+^{-7} - a_\rho\phi_+^{-3}, \varphi\rangle \\
&\le \langle a_\tau\phi_+^5 - a_w\phi_+^{-7} - a_\rho\phi_+^{-3}, \varphi\rangle \\
&= \langle a_\tau\phi_+^5 - [K_1(\phi_+^\wedge)^{12} + K_2]\phi_+^{-7} - a_\rho\phi_+^{-3}, \varphi\rangle \\
&\le \langle [a_\tau - K_1(\phi_+^\wedge/\phi_+)^{12}]\phi_+^5, \varphi\rangle.
\end{aligned}$$

Using partitions of unity we can make the support of $\varphi$ arbitrarily small, from which we conclude that $a_\tau \ge K_1(\phi_+^\wedge/\phi_+)^{12} \ge K_1$ at some $x \in M$.

All of the subsequent barrier constructions below are more or less known. A number of the more technically sophisticated construction techniques we employ below were pioneered by Maxwell in [33]. For completeness, we first construct local super-solutions and then global super-solutions for the near-CMC case.

Lemma 11 (Local super-solution). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Let $a_\tau, a_\rho, a_w \in W_+^{s-2,p}$, and let one of the following conditions hold:

(a) The metric $h$ is in a non-negative Yamabe class, $a_\tau \ne 0$, and $a_\rho + a_w \ne 0$.

(b) The metric $h$ is in the positive Yamabe class, and $a_\rho + a_w \ne 0$.

(c) The metric $h$ is conformally equivalent to a metric with scalar curvature $(-a_\tau) \ne 0$, thus in particular the metric is in the negative Yamabe class.

Then, there is a positive (local) super-solution $\phi_+ \in W^{s,p}$ of the Hamiltonian constraint equation (2.23).

Proof. First we prove (a) and (b). Let $u \in W^{s,p}$ be a (weak) solution to $-\Delta u + \frac{1}{8}Ru = \lambda u$, $u > 0$, with a constant $\lambda \ge 0$, which exists by Theorem 11 in Appendix A.7, and let $v \in W^{s,p}$ be the solution to

$$\langle u^2\nabla v, \nabla\varphi\rangle + \langle\lambda u^2 v + a_\tau v, \varphi\rangle = \langle a_\rho + a_w, \varphi\rangle, \qquad \forall\varphi \in C^\infty. \tag{5.12}$$



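Since $a_\tau, a_\rho, a_w \in W_+^{s-2,p}$ with $sp > 3$, we have $v \in W^{s,p} \hookrightarrow L^\infty$, and since $\lambda u^2 + a_\tau \ne 0$ and $a_\rho + a_w \ne 0$, Lemma 35 (maximum principle) in Appendix A.6 implies that $v > 0$. Let us define $\phi = \beta uv \in W^{s,p}$ for a constant $\beta > 0$. Then for any $\varphi \in C_+^\infty$ we have

$$\begin{aligned}
\langle A_L\phi + f(\phi,w), u\varphi\rangle &= \langle\nabla\phi, \nabla(u\varphi)\rangle + \langle a_\tau\phi^5 + a_R\phi - a_\rho\phi^{-3} - a_w\phi^{-7}, u\varphi\rangle \\
&= \beta\langle u^2\nabla v, \nabla\varphi\rangle + \langle\beta\lambda u^2 v + a_\tau u\phi^5 - a_\rho u\phi^{-3} - a_w u\phi^{-7}, \varphi\rangle \\
&= \langle a_\tau[\beta^5u^6v^5 - \beta v], \varphi\rangle + \langle a_\rho[\beta - \beta^{-3}u^{-2}v^{-3}], \varphi\rangle + \langle a_w[\beta - \beta^{-7}u^{-6}v^{-7}], \varphi\rangle,
\end{aligned}$$

where the second line is obtained by

$$\begin{aligned}
\langle A_L\phi + a_R\phi, u\varphi\rangle &= \beta\langle\nabla(uv), \nabla(u\varphi)\rangle + \tfrac{\beta}{8}\langle Ruv, u\varphi\rangle \\
&= \beta\langle\nabla u, \nabla(uv\varphi)\rangle + \tfrac{\beta}{8}\langle Ru, uv\varphi\rangle + \beta\langle u\nabla v, u\nabla\varphi\rangle \\
&= \beta\langle\lambda u, uv\varphi\rangle + \beta\langle u^2\nabla v, \nabla\varphi\rangle,
\end{aligned} \tag{5.13}$$

and the third line is from (5.12). Now, choosing $\beta > 0$ sufficiently large, so that $\beta^4u^6v^5 - v \ge 0$, $1 - \beta^{-4}u^{-2}v^{-3} \ge 0$ and $1 - \beta^{-8}u^{-6}v^{-7} \ge 0$, we ensure that $\phi$ is a super-solution.

Now, let us consider (c). Let $u > 0$ be the conformal factor which transforms $h$ into a metric with scalar curvature $\lambda = -8a_\tau$, i.e., let $u \in W^{s,p}$ be a weak solution to $-\Delta u + \frac{1}{8}Ru + a_\tau u^5 = 0$, $u > 0$. If $a_\rho = a_w = 0$, the Hamiltonian constraint equation reduces to the above equation and we can take $u$ as a super-solution (it is even a solution). So we can assume in the following that $a_\rho + a_w \ne 0$. Let $v \in W^{s,p}$ be the solution to

$$\langle u^2\nabla v, \nabla\varphi\rangle + \langle a_\tau v, \varphi\rangle = \langle a_\rho + a_w, \varphi\rangle, \qquad \forall\varphi \in C^\infty.$$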

Defining $\phi = \beta uv \in W^{s,p}$ for a constant $\beta > 0$, the rest of the proof proceeds superficially in the same way as the above case.

Lemma 12 (Near-CMC global super-solution). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$. Let $a_\tau, a_\rho \in W_+^{s-2,p}$ and $a_\sigma \in L_+^\infty$, and let one of the following conditions hold:

(a) The metric $h$ is in a non-negative Yamabe class, $a_\tau \ne 0$, and $a_\rho + a_\sigma \ne 0$. Let $u \in W^{s,p}$ and $v \in W^{s,p}$ be the solutions to

$$-\Delta u + \tfrac{1}{8}Ru = \lambda u, \qquad -\nabla(u^2\nabla v) + (\lambda u^2 + a_\tau)v = a_\rho + a_\sigma \tag{5.14}$$

with a constant $\lambda \ge 0$.

(b) The metric $h$ is conformally equivalent to a metric with scalar curvature $(-a_\tau) \ne 0$, thus in particular the metric is in the negative Yamabe class. Let $u \in W^{s,p}$ and $v \in W^{s,p}$ be the solutions to

$$-\Delta u + \tfrac{1}{8}Ru + a_\tau u^5 = 0, \qquad -\nabla(u^2\nabla v) + a_\tau v = a_\rho + a_\sigma. \tag{5.15}$$

Assume that the estimate (5.1) holds for the momentum constraint equation, and let $k_1 < a_\tau^\vee \left(\frac{\min uv}{\max uv}\right)^{12}$. Then, for any sufficiently large constant $\beta > 0$, $\phi_+ = \beta uv$ is a global super-solution of the Hamiltonian constraint equation (2.23).

Proof. We give a proof of (a). The proof of (b) is similar. Proceeding as in the proof of the preceding lemma, for any $\varphi \in C_+^\infty$ we have

$$\begin{aligned}
\langle A_L\phi + f(\phi,w), u\varphi\rangle &= \langle\nabla\phi, \nabla(u\varphi)\rangle + \langle a_\tau\phi^5 + a_R\phi - a_\rho\phi^{-3} - a_w\phi^{-7}, u\varphi\rangle \\
&= \beta\langle u^2\nabla v, \nabla\varphi\rangle + \langle\beta\lambda u^2 v + a_\tau u\phi^5 - a_\rho u\phi^{-3} - a_w u\phi^{-7}, \varphi\rangle \\
&\ge \beta\langle u^2\nabla v, \nabla\varphi\rangle + \langle\beta\lambda u^2 v + a_\tau u\phi^5 - a_\rho u\phi^{-3} - 2[a_\sigma + a_{Lw}]u\phi^{-7}, \varphi\rangle \\
&= \langle a_\rho[\beta - \beta^{-3}u^{-2}v^{-3}], \varphi\rangle + \langle a_\sigma[\beta - 2\beta^{-7}u^{-6}v^{-7}], \varphi\rangle + \langle a_\tau[\beta^5u^6v^5 - \beta v] - 2a_{Lw}u\phi^{-7}, \varphi\rangle.
\end{aligned}$$

Then, choosing $\beta$ sufficiently large, and by using (5.1), with $\theta = uv$ we infer

$$A_L\phi + f(\phi,w) \ge [a_\tau^\vee(\theta^\vee)^5 - 2k_1(\theta^\wedge)^{12}(\theta^\vee)^{-7}]\beta^5 - p(\beta),$$

where $p(\beta) = a_\tau(v^\wedge/u^\vee)\beta + 2k_2(\theta^\vee)^{-7}\beta^{-7}$. Now, if we have $k_1 < \frac{1}{2}a_\tau^\vee(\theta^\vee/\theta^\wedge)^{12}$, then choosing $\beta$ large enough, we ensure that $\phi$ is a super-solution. If we proceeded as in the proof of Lemma 7, we could remove the factor $\frac{1}{2}$ from the condition $k_1 < \frac{1}{2}a_\tau^\vee(\theta^\vee/\theta^\wedge)^{12}$; however, we omit it for clarity.

We now also give some examples of non-constant global sub-solutions $\phi_-$ which are compatible with $\phi_+$ above in the sense that $0 < \phi_- \le \phi_+$. Such a pair of compatible sub- and super-solutions is needed to establish existence of solutions to the individual Hamiltonian constraint (Theorem 3), and is also needed again to establish existence of solutions to the coupled system (Theorems 1 and 2).

Lemma 13 (Global sub-solution $h \notin \mathcal{Y}^-$, $\rho \not\equiv 0$). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in a non-negative Yamabe class. Let $a_\rho, a_\tau \in W_+^{s-2,p}\setminus\{0\}$. Then, there exists a positive scalar $\phi_- \in W^{s,p}$ such that for any constant $\beta \in (0,1]$, $\beta\phi_-$ is a global sub-solution of the Hamiltonian constraint equation.

Proof. Let $u \in W^{s,p}$ be a (weak) solution to $-\Delta u + \frac{1}{8}Ru = \lambda u$, $u > 0$, with a constant $\lambda \ge 0$, which exists by Theorem 11 in Appendix A.7, and let $v \in W^{s,p}$ be the solution to

$$\langle u^2\nabla v, \nabla\varphi\rangle + \langle\lambda u^2 v + a_\tau v, \varphi\rangle = \langle a_\rho, \varphi\rangle, \qquad \forall\varphi \in C^\infty. \tag{5.16}$$

Since $a_\rho, a_\tau \in W_+^{s-2,p}$ with $sp > 3$, we have $v \in W^{s,p} \hookrightarrow L^\infty$, and Lemma 35 (maximum principle) in Appendix A.6 implies that $v > 0$. Let us define $\phi = \beta uv \in W^{s,p}$ for a constant $\beta > 0$. Then for any $\varphi \in C_+^\infty$ we have

$$\begin{aligned}
\langle A_L\phi + f(\phi,w), u\varphi\rangle &\le \langle A_L\phi, u\varphi\rangle + \langle a_\tau\phi^5 + a_R\phi - a_\rho\phi^{-3}, u\varphi\rangle \\
&= \beta\langle u^2\nabla v, \nabla\varphi\rangle + \langle\beta\lambda u^2 v + a_\tau u^6(\beta v)^5 - a_\rho u^{-2}(\beta v)^{-3}, \varphi\rangle \\
&= \beta\langle a_\rho[1 - u^{-2}v^{-3}\beta^{-4}], \varphi\rangle + \beta\langle a_\tau[u^6v^5\beta^4 - 1], \varphi\rangle,
\end{aligned}$$

where the second line is obtained by (5.13), and the third line is from (5.16). Now, choosing $\beta > 0$ sufficiently small, so that $1 - u^{-2}v^{-3}\beta^{-4} \le 0$ and $(\beta v)^4 - 1 \le 0$, we ensure that $\phi$ is a sub-solution.

The following lemma extends Lemma 8(a) to all reasonable metrics in the negative Yamabe class.

Lemma 14 (Global sub-solution $h \in \mathcal{Y}^-$). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in $\mathcal{Y}^-(M)$. In addition, let $a_\tau \in W^{s-2,p}$, and let the metric $h$ be conformally equivalent to a metric with scalar curvature $(-a_\tau)$. Then, there exists a positive scalar function $\phi_- \in W^{s,p}$ such that for any $\beta \in (0,1]$, $\beta\phi_-$ is a global sub-solution of the Hamiltonian constraint equation.

Proof. Let $u > 0$ be the conformal factor which transforms $h$ into a metric with scalar curvature $\lambda = -8a_\tau$, i.e., let $u \in W^{s,p}$ be a weak solution to $-\Delta u + \frac{1}{8}Ru + a_\tau u^5 = 0$, $u > 0$. Taking $\phi = \beta u$ with a constant $\beta > 0$, we have

$$A_L\phi + f(\phi,w) \le A_L\phi + a_\tau\phi^5 + a_R\phi = -\beta\Delta u + a_\tau(\beta u)^5 + \tfrac{\beta}{8}Ru = \beta a_\tau u^5(\beta^4 - 1).$$

By choosing $\beta \in (0,1]$, we get the sub-solution.

The following lemma shows that the additional condition on the metric appearing in Lemma 14 is indeed not restrictive. It is worth noting that this next result can be viewed as an apparently new non-existence result in the context of the non-CMC constraints, which is interesting in its own right. This result was first proved in [33] for the case of $p = 2$; we just need to reinterpret it here in our setting. It states that for there to be a (CMC or non-CMC) solution to the Hamiltonian constraint, the background metric $h_{ab}$ must be conformally equivalent to a metric with scalar curvature equal to $(-a_\tau)$.

Lemma 15 (Non-existence $h \in \mathcal{Y}^-$). Let $(M,h)$ be a 3-dimensional, smooth, closed Riemannian manifold with metric $h \in W^{s,p}$ in $\mathcal{Y}^-(M)$. Let $a_\tau \in W^{s-2,p}$, and let there exist a solution to the Hamiltonian constraint equation. Then, the metric $h$ is conformally equivalent to a metric with scalar curvature $(-a_\tau)$.

Proof. It suffices to show that the equation

$$-\Delta\psi + \tfrac{1}{8}R\psi + a_\tau\psi^5 = 0 \tag{5.17}$$

has a solution $\psi > 0$. Since the above equation is just a Hamiltonian constraint equation with $a_\rho = a_w = 0$, Theorem 3 establishes the proof upon constructing sub- and super-solutions to (5.17). Let $\phi > 0$ be a solution to the (general) Hamiltonian constraint equation. Then, since both $a_\rho$ and $a_w$ are non-negative, we have $-\Delta\phi + \frac{1}{8}R\phi + a_\tau\phi^5 \ge 0$, which means that $\phi$ is a super-solution to (5.17).



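Let $u \in W^{s,p}$ be a solution to $-\Delta u + \frac{1}{8}Ru = -\lambda u$, $u > 0$, with a constant $\lambda > 0$, which exists by Theorem 11 in Appendix A.7, and with a real parameter $\varepsilon$, let $v_\varepsilon \in W^{s,p}$ be the solution to

$$\langle u^2\nabla v_\varepsilon, \nabla\varphi\rangle + \langle\lambda u^2 v_\varepsilon, \varphi\rangle = \langle\lambda u^2 - a_\tau\varepsilon, \varphi\rangle, \qquad \forall\varphi \in C^\infty.$$

We have $v_\varepsilon \equiv 1$ for $\varepsilon = 0$, and we have $v_\varepsilon \in W^{s,p} \hookrightarrow L^\infty$, so as $\varepsilon$ goes to 0, $v_\varepsilon$ tends to 1 uniformly. Let us fix $\varepsilon > 0$ such that $v_\varepsilon \ge \frac{1}{2}$. By taking $\psi = \beta uv_\varepsilon$ with a constant $\beta > 0$, and using (5.13), it holds for any $\varphi \in C_+^\infty$ that

$$\begin{aligned}
\langle\nabla\psi, \nabla(u\varphi)\rangle + \langle\tfrac{1}{8}R\psi + a_\tau\psi^5, u\varphi\rangle &= \beta\langle u^2\nabla v_\varepsilon, \nabla\varphi\rangle + \langle a_\tau u^6(\beta v_\varepsilon)^5 - \beta\lambda u^2 v_\varepsilon, \varphi\rangle \\
&= \beta\langle a_\tau(u^6v_\varepsilon^5\beta^4 - \varepsilon), \varphi\rangle + \beta\lambda\langle u^6(1 - 2v_\varepsilon), \varphi\rangle.
\end{aligned}$$

Now, by choosing $\beta > 0$ small enough, we can ensure that $\psi$ is a sub-solution of (5.17).

5.3. A priori $L^\infty$ bounds on $W^{1,2}$ solutions. We now establish some related a priori $L^\infty$-bounds on any $W^{1,2}$-solution to the Hamiltonian constraint equation. Although such results are standard for semi-linear scalar problems with monotone nonlinearities (for example, see [29]), the nonlinearity appearing in the Hamiltonian constraint becomes non-monotone when $R$ becomes negative. Nonetheless, we are able to obtain a priori $L^\infty$-bounds on solutions to the Hamiltonian constraint in all cases including the nonmonotone case. See [21] for an analogue of this result in the case of compact manifolds with boundary; in that case a more general result is possible.

Lemma 16 (Pointwise a priori bounds). Let $\phi \in W^{1,2}$ be any non-constant positive solution of the Hamiltonian constraint equation (2.23).

(a) Let $a_{\tau R}^\vee := \operatorname{ess\,inf}(a_\tau + a_R) > 0$, and let $a_\rho^\wedge$ and $a_w^\wedge$ be finite. Then, $\phi$ satisfies the a priori bound

$$\phi^4 \le \max\left\{1, \frac{a_\rho^\wedge + a_w^\wedge}{a_{\tau R}^\vee}\right\}.$$

(b) Let $a_\tau^\vee > 0$ and let $a_\rho^\wedge$ and $a_w^\wedge$ be finite. Then, $\phi$ satisfies the a priori bound

$$\phi^4 \le \max\left\{1, \frac{\sqrt{(a_R^\vee)^2 + a_\tau^\vee(a_\rho^\wedge + a_w^\wedge)} - a_R^\vee}{a_\tau^\vee}\right\}.$$

(c) Let $a_{\rho w}^\vee := \operatorname{ess\,inf}(a_\rho + a_w) > 0$, and let $a_\tau^\wedge$ be finite. Then, $\phi$ satisfies the a priori bound

$$\phi^4 \ge \frac{a_{\rho w}^\vee}{\max\{a_{\rho w}^\vee, a_R^\wedge + a_\tau^\wedge\}}.$$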


Proof. We will only prove (a) since the other cases can be proven similarly. Let $\chi \in W^{1,2}$ be any function with $\chi \ge 1$. Then for $\varphi \in C_+^\infty$ we have

$$\langle f_w(\chi), \varphi\rangle \ge (\chi^\vee)^5\langle a_\tau, \varphi\rangle + \chi^\vee\langle a_R, \varphi\rangle - (\chi^\vee)^{-3}\langle a_\rho, \varphi\rangle - (\chi^\vee)^{-7}\langle a_w, \varphi\rangle \ge \left(a_{\tau R}^\vee\chi^\vee - (\chi^\vee)^{-3}[a_\rho^\wedge + a_w^\wedge]\right)\|\varphi\|_1.$$

So we conclude that

$$\langle f_w(\chi), \varphi\rangle \ge 0 \qquad \forall\chi \ge \phi^\wedge, \ \chi \in W^{1,2}, \quad \forall\varphi \in C_+^\infty,$$

where $(\phi^\wedge)^4 = \max\left\{1, \frac{a_\rho^\wedge + a_w^\wedge}{a_{\tau R}^\vee}\right\}$.

Now, suppose that $\phi \in W^{1,2}$ is a solution of the Hamiltonian constraint equation, such that $\phi \not\le \phi^\wedge$. Denoting by $(\phi - \phi^\wedge)^+$ the positive part of $\phi - \phi^\wedge$ (cf. Appendix A.6), we have

$$0 \ge -\langle f_w(\phi), (\phi - \phi^\wedge)^+\rangle = \langle\nabla\phi, \nabla(\phi - \phi^\wedge)^+\rangle = \langle\nabla(\phi - \phi^\wedge)^+, \nabla(\phi - \phi^\wedge)^+\rangle \ge c\left\|(\phi - \phi^\wedge)^+ - \overline{(\phi - \phi^\wedge)^+}\right\|_2^2,$$

where $c > 0$, and $\overline{(\phi - \phi^\wedge)^+}$ is the integral average of $(\phi - \phi^\wedge)^+$. This implies that $\phi$ is constant, leading to a contradiction.

6. Proof of the Main Results

It is convenient to prove Theorem 2 first, which is the most general of the three; the proofs of Theorem 1 and Theorem 3 involve minor modifications of the proof of Theorem 2.

6.1. Proof of Theorem 2. Our strategy will be to prove the theorem first for the case $s \le 2$, and then to bootstrap to include the higher regularity cases.

Step 1. The choice of function spaces. We have the (reflexive) Banach spaces $X = W^{s,p}$ and $Y = W^{e,q}$, where $p, q \in (3,\infty)$, $s = s(p) \in (1 + \frac{3}{p}, 2]$, and $e = e(p,s,q) \in (1,s] \cap (1 + \frac{3}{q}, s - \frac{3}{p} + \frac{3}{q}]$. We have the ordered Banach space $Z = W^{\tilde s,p}$ with the compact embedding $X = W^{s,p} \hookrightarrow W^{\tilde s,p} = Z$, for $\tilde s \in (\frac{3}{p}, s)$. The interval $[\phi_-,\phi_+]_{\tilde s,p}$ is nonempty (by compatibility of the barriers we will choose below), and by Lemma 1 at the end of Sect. 3 it is also convex with respect to the vector space structure of $W^{\tilde s,p}$ and closed with respect to the norm topology of $W^{\tilde s,p}$. We then take $U = [\phi_-,\phi_+]_{\tilde s,p} \cap B_M$ for sufficiently large $M$ (to be determined below), where $B_M$ is the closed ball in $Z = W^{\tilde s,p}$ of radius $M$ about the origin, ensuring that $U$ is non-empty, convex, closed, and bounded as a subset of $Z = W^{\tilde s,p}$.

Step 2. Construction of the mapping $S$. We have $b_j \in W^{e-2,q}$, and $b_\tau \in L^z$ with $z = \frac{3q}{3+(2-e)q}$ so that $L^z \hookrightarrow W^{e-2,q}$. Moreover, since the metric admits no conformal Killing field, by Lemma 6 the momentum constraint equation is uniquely solvable for any "source" $\phi \in [\phi_-,\phi_+]_{\tilde s,p}$. The ranges for the exponents ensure that Lemma 2 holds, so that the momentum constraint solution map $S : [\phi_-,\phi_+]_{\tilde s,p} \to W^{e,q} = Y$ is continuous.



Step 3. Construction of the mapping $T$. Define $r = \frac{3p}{3+(2-s)p}$, so that the continuous embedding $L^r \hookrightarrow W^{s-2,p}$ holds. Since the pointwise multiplication is bounded on $L^{2r} \otimes L^{2r} \to L^r$, and $w \in W^{e,q} \hookrightarrow W^{1,2r}$, we have $a_w \in W^{s-2,p}$ by $\sigma \in L^{2r}$. The embeddings $W^{1,z} \hookrightarrow W^{e-1,q} \hookrightarrow L^{2r}$ also guarantee that $a_\tau = \frac{1}{12}\tau^2 \in W^{s-2,p}$. We have the scalar curvature $R \in W^{s-2,p}$, and these considerations show that the Hamiltonian constraint equation is well defined with $[\phi_-,\phi_+]_{s,p}$ as the space of solutions. Suppose for the moment that the scalar curvature $R$ of the background metric $h$ is continuous, and by using the map $T^s$ introduced in Lemma 3, define the map $T$ by $T(\phi,w) = T^s(\phi,a_w)$, where $a_w$ is now considered as an expression depending on $w$. Then Lemma 3 implies that the map $T : [\phi_-,\phi_+]_{\tilde s,p} \times W^{e,q} \to W^{s,p}$ is continuous for any reasonable shift $a_s$, which, by Lemma 4, can be chosen so that $T$ is monotone in the first variable. Combining the monotonicity with Lemma 5, we infer that the interval $[\phi_-,\phi_+]_{\tilde s,p}$ is invariant under $T(\cdot,a_w)$ if $w \in S([\phi_-,\phi_+]_{\tilde s,p})$. Since $L^z \hookrightarrow W^{e-2,q}$, from Theorem 6 we have

$$\|w\|_{e,q} \le C\|b_\tau\phi^6 + b_j\|_{e-2,q} \le C\|\phi_+\|_\infty^6\|b_\tau\|_z + C\|b_j\|_{e-2,q}$$

for any $w \in S([\phi_-,\phi_+]_{\tilde s,p})$. In view of Lemma 6, this shows that there exists a closed ball $B_M \subset W^{\tilde s,p}$ such that

$$\phi \in [\phi_-,\phi_+]_{\tilde s,p} \cap B_M, \ w \in S([\phi_-,\phi_+]_{\tilde s,p} \cap B_M) \quad \Rightarrow \quad T(\phi,w) \in B_M.$$

Under the conditions in the above displayed formula, from the invariance of the interval $[\phi_-,\phi_+]_{\tilde s,p}$, we indeed have $T(\phi,w) \in U = [\phi_-,\phi_+]_{\tilde s,p} \cap B_M$.

However, the scalar curvature of $h$ may not be continuous, and in general it is not clear how to introduce a shift so that the resulting operator is monotone. Nevertheless, we can conformally transform the metric into a metric with continuous scalar curvature, cf. Theorem 12, and by using the conformal covariance of the Hamiltonian constraint, we will be able to construct an appropriate mapping $T$. Let $\tilde h = \theta^4 h$ be a metric with continuous scalar curvature, where $\theta \in W^{s,p}$ is the (positive) conformal factor of the scaling. Let $\tilde T^s$ be the mapping introduced in Lemma 3, corresponding to the Hamiltonian constraint equation with the background metric $\tilde h$, and the coefficients $\tilde a_\tau = a_\tau$, and $\tilde a_\rho = \theta^{-8}a_\rho$. With $\tilde a_w = \theta^{-12}a_w$, this scaled Hamiltonian constraint equation has sub- and super-solutions $\theta^{-1}\phi_-$ and $\theta^{-1}\phi_+$, respectively, as long as $\phi_-$ and $\phi_+$ are sub- and super-solutions respectively of the original Hamiltonian constraint equation, cf. Appendix A.8. We choose the shift in $\tilde T^s$ so that it is monotone in $[\theta^{-1}\phi_-,\theta^{-1}\phi_+]_{\tilde s,p}$. Then by the monotonicity and the above mentioned sub- and super-solution property under conformal scaling, for $w \in S([\phi_-,\phi_+]_{\tilde s,p})$, $\tilde T^s(\cdot,\theta^{-12}a_w)$ is invariant on $[\theta^{-1}\phi_-,\theta^{-1}\phi_+]_{\tilde s,p}$. Finally, we define

$$T(\phi,w) = \theta\,\tilde T^s(\theta^{-1}\phi, \theta^{-12}a_w),$$

where, as before, $a_w$ is considered as an expression depending on $w$. From the pointwise multiplication properties of $\theta$ and $\theta^{-1}$, the map $T : [\phi_-,\phi_+]_{\tilde s,p} \times W^{e,q} \to W^{s,p}$ is continuous, and from the monotonicity and Lemma 6, $T(\cdot,w)$ is invariant on $U = [\phi_-,\phi_+]_{\tilde s,p} \cap B_M$ for $w \in S(U)$, where $M$ is taken to be sufficiently large. Moreover, if the fixed point equation

$$\phi = \theta\,\tilde T^s(\theta^{-1}\phi, \theta^{-12}a_w)$$



is satisfied, then $\theta^{-1}\phi$ is a solution to the scaled Hamiltonian constraint equation with $\tilde a_w = \theta^{-12}a_w$, and so by conformal covariance, $\phi$ is a solution to the original Hamiltonian constraint equation, cf. Appendix A.8.

Step 4. Barrier choices and application of the fixed point theorem. At this point, Theorem 5 implies the Main Theorem 2, provided that we have an admissible pair of barriers for the Hamiltonian constraint. The ranges for the exponents ensure through Corollary 1 that we can use the estimate (5.1); see the discussion following the estimate at the beginning of Sect. 5. We will separate into the two cases in the theorem, depending on which Yamabe class we are in:

(a) $h_{ab}$ is in $\mathcal{Y}^-(M)$: We use the global constant super-solution from Lemma 7(a) or the non-constant super-solution from Lemma 12 depending on whether $\rho$ and $\sigma$ are both in $L^\infty$, and the global sub-solution from Lemma 14.

(b) $h_{ab}$ is in $\mathcal{Y}^0(M)$ or in $\mathcal{Y}^+$: We use the global constant super-solution from Lemma 7(a) or the non-constant super-solution from Lemma 12 depending on whether $\rho$ and $\sigma$ are both in $L^\infty$, and the global sub-solution from Lemma 13 or Lemma 8(c).

This concludes the proof for the case $s \le 2$.

Step 5: Bootstrap. Now suppose that $s > 2$. First of all we need to show that the equations are well defined in the sense that the involved operators are bounded in appropriate spaces. All other conditions being obviously satisfied, we will show that $a_\tau \in W^{s-2,p}$, and $a_w \in W^{s-2,p}$ for any $w \in W^{e,q}$. Since $\tau$, $\sigma$ and $Lw$ belong to $W^{e-1,q}$, it suffices to show that the pointwise multiplication is bounded on $W^{e-1,q} \otimes W^{e-1,q} \to W^{s-2,p}$, and by employing Corollary 3(b) in the Appendix, we are done as long as $s - 2 \le e - 1$, $e - 1 \ge 0$, $s - 2 - \frac{3}{p} < 2(e - 1 - \frac{3}{q})$, and $s - 2 - \frac{3}{p} \le e - 1 - \frac{3}{q}$. After a rearrangement these conditions read: $e \ge 1$, $e \ge s - 1$, $e > \frac{3}{q} + \frac{d}{2}$, and $e \ge \frac{3}{q} + d - 1$, with the shorthand $d = s - \frac{3}{p} > 1$, the latter inequality by the hypothesis of the theorem. We have $d - 1 > \frac{d}{2}$ for $d > 2$, and $1 \ge \frac{d}{2}$ for $d \le 2$, meaning that the condition $e > \frac{3}{q} + \frac{d}{2}$ is implied by the hypotheses $e \ge \frac{3}{q} + d - 1$ and $e > 1 + \frac{3}{q}$. So we conclude that the constraint equations are well defined.

Next, we will treat the equations as equations defined with $s = e = 2$ and with $p$ and $q$ appropriately chosen. This is possible, since if the quadruple $(p,s,q,e)$ satisfies the hypotheses of the theorem, then $(\tilde p, \tilde s = 2, \tilde q, \tilde e = 2)$ satisfies the hypotheses too, provided that $2 - \frac{3}{\tilde p} \le s - \frac{3}{p}$, and $1 < 2 - \frac{3}{\tilde q} \le e - \frac{3}{q}$. Since the latter conditions reflect the Sobolev embeddings $W^{s,p} \hookrightarrow W^{2,\tilde p}$ and $W^{e,q} \hookrightarrow W^{2,\tilde q} \hookrightarrow W^{1,\infty}$, the coefficients of the equations can also be shown to satisfy sufficient conditions for posing the problem for $(\tilde p, 2, \tilde q, 2)$. Finally, we have $\tau \in W^{e-1,q} \hookrightarrow W^{1,\tilde q} = W^{1,z}$ since $z = \tilde q$ by $\tilde e = 2$ for this new formulation. Now, by the special case $s \le 2$ of this theorem that is proven in the above steps, under the remaining hypotheses including the conditions on the metric and the near-CMC condition, we have $\phi \in W^{2,\tilde p}$ with $\phi > 0$ and $w \in W^{2,\tilde q}$ solution to the coupled system.

To complete the proof we only need to show that these solutions indeed satisfy $\phi \in W^{s,p}$ and $w \in W^{e,q}$. Suppose that $\phi \in W^{s_1,p_1}$ and $w \in W^{e_1,q_1}$, with $1 < s_1 - \frac{3}{p_1} \le s - \frac{3}{p}$, $1 < e_1 - \frac{3}{q_1} \le e - \frac{3}{q}$, $\max\{2, s-2\} \le s_1 \le s$, and $\max\{2, e-2\} \le e_1 \le \min\{e,s\}$. Then we have $b_\tau\phi^6 + b_j \in W^{e-2,q}$, and so Corollary 5 from Appendix A.5 implies that $w \in W^{e,q}$. This implies that $a_w \in W^{s-2,p}$, and by employing Corollary 5 once again, we get $\phi \in W^{s,p}$. The proof is completed by induction.
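The exponent bookkeeping in Step 5 is easy to get wrong by hand, so here is a small Python sketch with exact rational arithmetic. It is illustrative only: the function names and the sample exponents are our own. The first function evaluates the four rearranged conditions; the second checks, on a grid, the claim that $e > \frac{3}{q} + \frac{d}{2}$ follows from $e \ge \frac{3}{q} + d - 1$ and $e > 1 + \frac{3}{q}$.

```python
from fractions import Fraction


def multiplication_conditions(p, s, q, e):
    """The four rearranged conditions of Step 5, with d = s - 3/p."""
    s, e = Fraction(s), Fraction(e)
    d = s - Fraction(3) / Fraction(p)
    tq = Fraction(3) / Fraction(q)
    return {
        "e >= 1": e >= 1,
        "e >= s - 1": e >= s - 1,
        "e > 3/q + d/2": e > tq + d / 2,
        "e >= 3/q + d - 1": e >= tq + d - 1,
    }


def implied(d, q, e):
    """True unless the two hypotheses hold while e > 3/q + d/2 fails."""
    tq = Fraction(3) / Fraction(q)
    if e >= tq + d - 1 and e > 1 + tq:
        return e > tq + d / 2
    return True


# Hypothetical sample exponents with s > 2.
conds = multiplication_conditions(p=4, s=3, q=4, e=3)
grid_ok = all(implied(Fraction(a, 4), 4, Fraction(b, 4))
              for a in range(5, 17) for b in range(4, 25))
print(conds, grid_ok)
```

The grid covers $d \in [\frac{5}{4}, 4]$ on both sides of the threshold $d = 2$, where the argument in the text switches between its two cases.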



6.2. Proof of Theorem 1. The proof is identical to the proof of Theorem 2, except for the particular barriers used. In the proof of Theorem 2, the near-CMC condition is used to construct global barriers satisfying 0 < φ− φ+ < ∞, for all three Yamabe classes, and then the supporting results for the operators S and T established in Sect. 4.1 and Sect. 4.2 are used to reduce the proof to invoking Theorem 5. The construction of φ+ is in fact the only place in the proof of Theorem 2 that requires the near-CMC condition. Here, the proof is identical, except that the additional conditions made on the background metric h ab (that it be in Y + (M)), and on the data (the smallness conditions on σ , ρ, and j) allow us to make use of the alternative construction of a global super-solution given in Lemma 9, together with compatible global sub-solution given in Lemma 13, properly scaled for compatibility with the super-solution. Theorem 1 now follows from Theorem 5, without the use of near-CMC conditions. 6.3. Proof of Theorem 3. The CMC result in this theorem can be proved using the same analysis framework used for the proofs of the two non-CMC results in Theorem 1 and Theorem 2 above. Therefore, the proof follows the same general outline of the proof of Theorem 2, with slightly different spaces and supporting results. The main difference is that we can avoid having to construct “global” barriers and getting uniform bounds on the solution to the momentum constraint, since it is solved only once a priori and then is input as data into the nonlinearity of the Hamiltonian constraint. The case (d) follows from the Yamabe classification, cf. Appendix A.7. Since otherwise we can use the conformal covariance of the Hamiltonian constraint as in Sect. 6.1, for simplicity, assume that the scalar curvature of the background metric is continuous. Also assume that s 2, and let us look at the hypotheses of Theorem 5. 
We have the (reflexive) Banach spaces $X = W^{s,p}$ and $Y = W^{1,2r}$, where $p \in (\frac{3}{2}, \infty)$, $s = s(p) \in (\frac{3}{p}, \infty) \cap [1, 2]$, and $r = r(s,p) = \frac{3p}{3+(2-s)p}$. On the diagram in Fig. 2, for $s \leq 2$ the space $W^{1,2r}$ corresponds to the lower right corner of the shaded parallelogram, and so $W^{1,2r}$ contains all the spaces $W^{e,q}$ which are represented by the points in the shaded parallelogram. In fact, $W^{1,2r}$ is outside of this parallelogram, because of the strict inequality relating $e$ and $q$ in order to have the boundedness of the pointwise multiplication on $W^{e-1,q} \otimes W^{e-1,q} \to W^{s-2,p}$ by using Corollary 3(b). However, the conditions of Corollary 3(b) are not necessary conditions when some of the smoothness indices are integers; for example, in our case the pointwise multiplication is bounded on $L^{2r} \otimes L^{2r} \to L^{r}$, even though these spaces do not satisfy the conditions of the corollary. As a consequence, as we have seen e.g. in Sect. 2.4, the constraint equations are well defined for these spaces.

We have the ordered Banach space $Z = W^{\tilde{s},p}$ with the compact embedding $X = W^{s,p} \hookrightarrow W^{\tilde{s},p} = Z$, for $\tilde{s} \in (\frac{3}{p}, s)$. The interval $[\phi_-, \phi_+]_{\tilde{s},p}$ is nonempty (by compatibility of the barriers we will choose below), and by Lemma 1 at the end of Sect. 3 it is also convex with respect to the vector space structure of $W^{\tilde{s},p}$ and closed with respect to the norm topology of $W^{\tilde{s},p}$. We then take $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M$ for sufficiently large $M$ (to be determined below), where $B_M$ is the closed ball in $Z = W^{\tilde{s},p}$ of radius $M$ about the origin, ensuring that $U$ is non-empty, convex, closed, and bounded as a subset of $Z = W^{\tilde{s},p}$. We take as $T$ the shifted Picard mapping $T^s$ having as its fixed-point a solution to the Hamiltonian constraint, and we take $S(\phi) = w = -A_L^{-1} b_j \in W^{1,2r}$, which is independent of $\phi$, since the momentum equation decouples from the Hamiltonian constraint in this case. The map $S$, which is constant as a function of $\phi$ due to the CMC decoupling, is trivially continuous as a map $S : U \to W^{1,2r} = Y$.

We now consider properties we have for $T$. By Lemma 3, $T : U \times \mathcal{R}(S) \to W^{s,p} = X$ is a continuous map. By Lemma 4, $T$ is invariant on the closed interval $[\phi_-, \phi_+]_{\tilde{s},p}$, and by Lemma 6, $T$ is invariant on $U = [\phi_-, \phi_+]_{\tilde{s},p} \cap B_M$. To summarize, $T$ is invariant on the non-empty, closed, convex, bounded set $U$. Finally, Theorem 5 implies the Main Theorem 3, as long as we have an admissible pair of barriers for the Hamiltonian constraint. That is when we need to separate into the three remaining cases in the theorem, depending on which Yamabe class we are in:

(a) $h_{ab}$ is in $\mathcal{Y}^-(M)$; $\tau \neq 0$: We take the super-solution from Lemma 11(c), and we take the sub-solution from Lemma 14. These lemmata require that the metric $h_{ab}$ is conformally equivalent to a metric with scalar curvature $(-a_\tau)$, and we shall verify this condition. By conformal invariance, it suffices to verify the condition for metrics with continuous and negative scalar curvature, meaning that we have to solve Eq. (5.17) with $R < 0$ continuous and $a_\tau > 0$ constant. Indeed, this equation has a positive solution $\psi \in W^{s,p}$, as the constants $\psi_- = \left(\frac{\min |R|}{8 a_\tau}\right)^{1/4}$ and $\psi_+ = \left(\frac{\max |R|}{8 a_\tau}\right)^{1/4}$ are respectively sub- and super-solutions of (5.17).

(b) $h_{ab}$ is in $\mathcal{Y}^+(M)$; $\rho \neq 0$ or $\sigma \neq 0$: We take the super-solution from Lemma 11(b), and we take the sub-solution from Lemma 13. For the case $\rho = 0$ and $\sigma \neq 0$, a local sub-solution can easily be constructed following the approach in the proof of Lemma 13.

(c) $h_{ab}$ is in $\mathcal{Y}^0(M)$; $\tau \neq 0$; $\rho \neq 0$ or $\sigma \neq 0$: We take the super-solution from Lemma 11(a), and we take the sub-solution from Lemma 13. The case $\rho = 0$ and $\sigma \neq 0$ is treated as above.

To complete the proof one can bootstrap as in Sect. 6.1.

7. Summary

We began in Sect. 2 by summarizing the conformal decomposition of Einstein's constraint equations introduced by Lichnerowicz and York, on a closed manifold. After setting up the notation, we gave an overview of our main results in Sect. 3, represented by three new weak solution existence results for the Einstein constraint equations in the far-from-CMC, near-CMC, and CMC cases. In Sect. 4 we then developed the necessary results we need for the individual constraint equations in order to analyze the coupled system. In particular, in Sect. 4.1, we first developed some basic technical results for the momentum constraint operator under weak assumptions on the problem data. We also established the properties we need for the momentum constraint solution mapping $S$ appearing in the analysis of the coupled system. In Sect. 4.2, we assumed the existence of barriers $\phi_-$ and $\phi_+$ (weak sub- and super-solutions) to the Hamiltonian constraint equation, forming a nonempty positive bounded interval, and then established the properties we need for the Hamiltonian constraint Picard mapping $T$ appearing in the analysis of the coupled system. We then derived several weak global sub- and super-solutions in Sect. 5, based both on constants and on more complex non-constant constructions. While the sub-solutions are similar to those found previously in the literature, some of the super-solutions were new. In particular, we gave two super-solution constructions that do not require the near-CMC condition. The first was constant, and requires that the



scalar curvature be strictly globally positive. The second was based on a scaled solution to a Yamabe-type problem, and is valid for any background metric in the positive Yamabe class. In Sect. 6, we proved the main results. In particular, using topological fixed-point arguments and global barrier constructions, we combined the results for the individual constraints and the global barriers to establish existence of coupled non-CMC weak solutions with (positive) conformal factor $\phi \in W^{s,p}$, where $p \in (1, \infty)$ and $s(p) \in (1 + \frac{3}{p}, \infty)$. In the CMC case, the regularity can be reduced to $p \in (1, \infty)$ and $s(p) \in (\frac{3}{p}, \infty) \cap [1, \infty)$. In the case of $s = 2$, we reproduce the CMC existence results of Choquet-Bruhat [10], and in the case $p = 2$, we reproduce the CMC existence results of Maxwell [33], but with a different proof; our CMC proof goes through the same analysis framework that we use to obtain the non-CMC results (Theorems 4 and 5). We also assembled a number of new supporting technical results in the body of the paper and in several appendices, including: topological fixed-point arguments designed for the Einstein constraints; construction and properties of general Sobolev classes $W^{s,p}$ and elliptic operators on closed manifolds with weak metrics; the development of a very weak solution theory for the momentum constraint; a priori $L^\infty$-estimates for weak $W^{1,2}$-solutions to the Hamiltonian constraint; Yamabe classification of non-smooth metrics in general Sobolev classes $W^{s,p}$; and a discussion and analysis of conformal covariance and the connection between conformal rescaling and the near-CMC condition. An important feature of the results we presented here is the absence of the near-CMC assumption in the case of the rescaled background metric in the positive Yamabe class, as long as the freely specifiable part of the data given by the matter fields (if present) and the traceless-transverse part of the rescaled extrinsic curvature are taken to be sufficiently small.
In this case, the mean extrinsic curvature can be taken to be an arbitrary smooth function without restrictions on the size of its spatial derivatives, so that it can be arbitrarily far from constant. Under these conditions, we have the first existence result for non-CMC solutions without the near-CMC condition. The two advances in the analysis of the Einstein constraint equations that make these results possible were: A topological fixed-point theorem based on compactness arguments that is free of the near-CMC condition (Theorems 4 and 5 and in [21]), and a new construction of global super-solutions for the Hamiltonian constraint that is similarly free of the near-CMC condition (Lemma 7 and Lemma 9). We note that the near-CMC-free constructions based on scaled solutions to a Yamabe-like problem also work for compact manifolds with boundary and other cases; see e.g. [21]. Finally, we point out that our results here and in [21,22] can be viewed as reducing the remaining open questions of existence of non-CMC (weak and strong) solutions without near-CMC conditions to two more basic and clearly stated open problems: (1) Existence of near-CMC-free global super-solutions for the Hamiltonian constraint equation when the background metric is in the non-positive Yamabe classes and for large data; and (2) existence of near-CMC-free global sub-solutions for the Hamiltonian constraint equation when the background metric is in the positive Yamabe class in vacuum (without matter). However, an important new development, which occurred a few months after the first draft of this article was made available, is that Maxwell has now shown [36] how a related topological fixed-point argument can be constructed so that a global subsolution is not needed, as long as the global super-solution is available; this allows for the extension of the far-CMC results in this article to the vacuum case without having to solve problem (2).



Acknowledgement. The authors would like to thank Jim Isenberg, David Maxwell, and Daniel Pollack for many very insightful comments and suggestions about this work. We would like to thank David Maxwell in particular for his careful reading of earlier drafts of this work and for pointing out various errors. MH was supported in part by NSF Awards 0715146, 0411723, and 0511766, and DOE Awards DE-FG02-05ER25707 and DE-FG02-04ER25620. GN and GT were supported in part by NSF Awards 0715146 and 0411723.

A. Some Key Technical Tools and Some Supporting Results

A.1. Topological fixed-point theorems. In this Appendix, we give a brief review of some standard topological fixed-point theorems in Banach spaces that provide the framework for our analysis of the coupled constraint equations. The analysis framework that was developed earlier in [26] for analyzing the coupled constraints was based on k-contractive mappings, and as a result required the near-CMC condition in order to establish k-contractivity. All subsequent non-CMC results (see e.g. [1]) are based on the framework from [26], and as a result remain limited to the near-CMC case. Our interest here is in more general topological fixed-point arguments that will allow us to avoid the near-CMC condition.

Brouwer, Schauder, and Leray-Schauder Fixed-Point Theorems. To establish the main abstract results we will need, we first give a brief overview of some standard results on topological fixed-point arguments involving compactness.

Theorem 7 (Brouwer Theorem). Let $U \subset \mathbb{R}^n$ be a non-empty, convex, compact subset, with $n \geq 1$. If $T : U \to U$ is a continuous mapping, then there exists a fixed-point $u \in U$ such that $u = T(u)$.

Proof. See Proposition 2.6 in [54]; a short proof can be based on homotopy-invariance of topological degree.

Theorem 8 (Schauder Theorem). Let $X$ be a Banach space, and let $U \subset X$ be a non-empty, convex, compact subset. If $T : U \to U$ is a continuous operator, then there exists a fixed-point $u \in U$ such that $u = T(u)$.

Proof. This is a direct extension of the Brouwer Fixed-Point Theorem from $\mathbb{R}^n$ to $X$; see Corollary 2.13 in [54]. The short proof involves a simple finite-dimensional approximation algorithm and a limiting argument, extending the Brouwer Fixed-Point Theorem (itself generally having a more complicated proof) from $\mathbb{R}^n$ to $X$.

Theorem 9 (Schauder Theorem B). Let $X$ be a Banach space, and let $U \subset X$ be a non-empty, convex, closed, bounded subset.
If $T : U \to U$ is a compact operator, then there exists a fixed-point $u \in U$ such that $u = T(u)$.

Proof. See Theorem 2.A in [54]; the proof is a simple consequence of Theorem 8 above.

A.2. Ordered Banach spaces. These notes follow the main ideas and definitions given in Chap. 7.1, p. 275, in [54], while some examples were taken from [2,16]. Let $X$ be a Banach space, and let $\mathbb{R}_+$ be the non-negative real numbers. A subset $C \subset X$ is a cone iff given any $x \in C$ and $a \in \mathbb{R}_+$ the element $ax \in C$. A subset $X_+ \subset X$ is an order cone iff the following properties hold:

(i) The set $X_+$ is non-empty, closed, and $X_+ \neq \{0\}$;
(ii) Given any $a, b \in \mathbb{R}_+$ and $x, \bar{x} \in X_+$, then $ax + b\bar{x} \in X_+$;
(iii) If $x \in X_+$ and $-x \in X_+$, then $x = 0$.

The second property above says that every order cone is in fact a cone, and that the set $X_+$ is convex. The space $X = \mathbb{R}^2$ is a convenient Banach space to picture non-trivial examples of cones and order cones, as can be seen in Fig. 3. A pair $X$, $X_+$ is called an ordered Banach space iff $X$ is a Banach space and $X_+ \subset X$ is an order cone. The reason for this name is that the order cone $X_+$ defines several relations on elements in $X$, called order relations, as follows:

\[ u \geq v \ \text{iff}\ u - v \in X_+, \qquad u \gg v \ \text{iff}\ u - v \in \mathrm{int}(X_+), \]
\[ u > v \ \text{iff}\ u \geq v \ \text{and}\ u \neq v, \qquad u \not\geq v \ \text{iff}\ u \geq v \ \text{is false}; \]

finally the notations $u \leq v$, $u < v$, and $u \ll v$ are used to mean $v \geq u$, $v > u$, $v \gg u$, respectively. A simple example of an ordered Banach space is $\mathbb{R}$ with the usual order. Another example can be constructed when this order on $\mathbb{R}$ is transported into $C^0(M)$, the set of scalar-valued functions on a set $M \subset \mathbb{R}^n$, with $n \geq 1$. An order on $C^0(M)$ is the following: the functions $u, v \in C^0(M)$ satisfy $u \geq v$ iff $u(x) \geq v(x)$ for all $x \in M$. The following lemmas summarize the main properties of order relations in Banach spaces.

Lemma 17. Let $X$, $X_+$ be an ordered Banach space. Then, for all elements $u, v, w \in X$, the following hold: (i) $u \geq u$; (ii) If $u \geq v$ and $v \geq u$, then $u = v$; (iii) If $u \geq v$ and $v \geq w$, then $u \geq w$.

Proof. The property that $u - u = 0 \in X_+$ implies that $u \geq u$. If $u \geq v$ and $v \geq u$, then $u - v \in X_+$ and $-(u - v) \in X_+$, therefore $u - v = 0$. Finally, if $u \geq v$ and $v \geq w$, then $u - v \in X_+$ and $v - w \in X_+$, which means that $u - w = (u - v) + (v - w) \in X_+$.

Furthermore, the order relation is compatible with the vector space structure and with the limits of sequences.

Lemma 18. Let $X$, $X_+$ be an ordered Banach space. Then, for all $u, \hat{u}, v, \hat{v}, w \in X$, and $a, b \in \mathbb{R}$, the following hold: (i) If $u \geq v$ and $a \geq b \geq 0$, then $au \geq bv$; (ii) If $u \geq v$ and $\hat{u} \geq \hat{v}$, then $u + \hat{u} \geq v + \hat{v}$; (iii) If $u_n \geq v_n$ for all $n \in \mathbb{N}$, then $\lim_{n\to\infty} u_n \geq \lim_{n\to\infty} v_n$.

Proof. The first two properties are straightforward to prove, and we do not do it here. The third property holds because the order cone is a closed set. Indeed, $u_n \geq v_n$ means that $u_n - v_n \in X_+$ for all $n \in \mathbb{N}$, and then $\lim_{n\to\infty}(u_n - v_n) \in X_+$ because $X_+$ is closed, so Property (iii) follows.

The remaining order relations have some other interesting properties.

Lemma 19. Let $X$, $X_+$ be an ordered Banach space. Then, for all $u, v, w \in X$, and $a \in \mathbb{R}$, the following hold: (i) If $u \gg v$ and $v \geq w$, then $u \gg w$; (ii) If $u \geq v$ and $v \gg w$, then $u \gg w$; (iii) If $u \gg v$ and $v \gg w$, then $u \gg w$; (iv) If $u \gg v$ and $a > 0$, then $au \gg av$.

The proof of Lemma 19 is similar to that of the previous lemma, and is not reproduced here.
Given an ordered Banach space $X$, $X_+$, and two elements $u \geq v$, introduce the intervals

\[ [v, u] := \{w \in X : v \leq w \leq u\}, \qquad (v, u) := \{w \in X : v \ll w \ll u\}. \]



Fig. 3. The shaded regions in the first picture represent an order cone, while the second picture represents a cone that is not an order cone. The shaded region between u and v in the third picture represents the closed interval [v, u], constructed with the order cone R2+ , which is also represented by a shaded region

Analogously, introduce the intervals $[v, u)$ and $(v, u]$. See Fig. 3 for an example in $X = \mathbb{R}^2$. Useful order cones for solving PDE are those that define an order structure in the Banach space which is related to the norm and the notion of boundedness. These types of order cones are called normal. More precisely, an order cone $X_+$ in a Banach space $X$ is called a normal order cone iff there exists $0 < a \in \mathbb{R}$ such that for all $u, v \in X$ with $0 \leq v \leq u$, the bound $\|v\| \leq a \|u\|$ holds.

Lemma 20. If $X$, $X_+$ is an ordered Banach space with normal order cone $X_+$, then every closed interval in $X$ is bounded.

Proof. Let $w \in [v, u]$, then $v \leq w \leq u$, and so $0 \leq w - v \leq u - v$. Since the cone $X_+$ is normal, there exists $a > 0$ such that $\|w - v\| \leq a \|u - v\|$. Then, the inequalities $\|w\| \leq \|w - v\| + \|v\| \leq a \|u - v\| + \|v\|$, which hold for all $w \in [v, u]$, establish the lemma.

Not every order cone is normal. For example, consider the Sobolev spaces $W^{k,p}$ of scalar-valued functions on an $n$-dimensional, closed manifold $M$ (or a compact manifold with Lipschitz continuous boundary), where $k$ is a non-negative integer, and $p > 1$ is a real number. An order cone in $W^{k,p}$ is defined by translating the order on the real numbers, almost everywhere in $M$, that is,

\[ W^{k,p}_+ := \{u \in W^{k,p} : u \geq 0 \ \text{a.e. in}\ M\}. \]

In the case $k = 0$, that is, $W^{0,p} = L^p$, the order cone above is a normal cone [2,54]. However, in the case $k \geq 1$ the cone above cannot be normal, since on the one hand, the cone definition involves information only about the values of $u(x)$ and not about its derivatives; on the other hand, the norm in $W^{k,p}$ contains information about both the values of $u(x)$ and its derivatives. In the case of a compact manifold with boundary, since there are no boundary conditions on $\partial M$ in the definition of $W^{k,p}$, there is no way to relate the values of a function in $M$ with the values of its derivatives. (In other words, there is no Poincaré inequality for elements in $W^{k,p}$, with $k \geq 1$.)

An order cone $X_+ \subset X$ is generating iff $\mathrm{Span}(X_+) = X$. An order cone $X_+ \subset X$ is called total iff $\mathrm{Span}(X_+)$ is dense in $X$. Total order cones are important because the order structure associated with them can be translated from the space $X$ into its dual space $X^*$.

Lemma 21. Let $X$, $X_+$ be an ordered Banach space. If $X_+$ is a total order cone, then an order cone in $X^*$ is given by the set $X_+^* \subset X^*$ defined as

\[ X_+^* := \{u^* \in X^* : u^*(v) \geq 0 \ \ \forall\, v \in X_+\}. \]



Proof. We check the three properties in the definition of the order cone. The first property is satisfied because $X_+$ is an order cone, so there exists $v \neq 0$ in $X_+$, and then there exists $u^* \neq 0$ in $X^*$ such that $u^*(v) = 1 \geq 0$, so $X_+^*$ is non-empty. Trivially, $0 \in X_+^*$. Finally, $X_+^*$ is closed because the order relation for real numbers is used in its definition. The second property of an order cone is satisfied, because given any $u^*, v^* \in X_+^*$ and any non-negative $a, b \in \mathbb{R}$, then for all $u \in X_+$, $(au^* + bv^*)(u) = au^*(u) + bv^*(u) \geq 0$ holds since each term is non-negative. This implies that $(au^* + bv^*) \in X_+^*$. The third property is satisfied because the order cone $X_+$ is total. Suppose that $u^* \in X_+^*$ and $-u^* \in X_+^*$; then for all $u \in X_+$ it holds that $u^*(u) \geq 0$ and $-u^*(u) \geq 0$, which implies that $u^*(u) = 0$ for all $u \in X_+$. Therefore, $u^* \in X_+^\circ \subset X^*$, where the superscript $\circ$ in $X_+^\circ$ means the Banach annihilator of the set $X_+$, which is a subset of the space $X^*$. Therefore, we conclude that $u^* \in \mathrm{Span}(X_+)^\circ$. Since the order cone is total, $\overline{\mathrm{Span}(X_+)} = X$, which implies $\mathrm{Span}(X_+)^\circ = \{0\}$, so $u^* = 0$. This establishes the lemma.

An order cone $X_+$ in a Banach space $X$ is called a solid cone iff $X_+$ has non-empty interior. The following result asserts that every solid order cone is generating. We remark that the converse is not true. In the examples below we present function spaces frequently used in solving PDE with order cones having empty interior which are indeed generating.

Lemma 22. Let $X$, $X_+$ be an ordered Banach space. If $X_+$ is a solid cone, then $X_+$ is generating.

Proof. The cone $X_+$ has a non-empty interior, so there exists $x_0 \in \mathrm{int}(X_+)$ with $x_0 \neq 0$. This means that given any $x \in X$ there exists $0 < a \in \mathbb{R}$ small enough such that both $x_+ := x_0 + ax$ and $x_- := x_0 - ax$ belong to $\mathrm{int}(X_+)$. But then, $x = (x_+ - x_-)/(2a)$, so $x \in \mathrm{Span}(X_+)$. This establishes the lemma.

Here is a list of examples of several order cones used in function spaces.
All these examples use order cones obtained from the usual order in $\mathbb{R}$. In particular, they refer to scalar-valued functions on an $n$-dimensional, closed manifold $M$ (or a compact manifold with Lipschitz boundary).

• Introduce on $C^k$ the cone $C^k_+ := \{u \in C^k : u(x) \geq 0 \ \forall x \in M\}$. This is an order cone for all non-negative integers $k$. The cone is a normal cone in the particular case $k = 0$. The cone is solid for all $k \geq 0$, therefore it is a generating cone.
• Introduce on $L^\infty$ the cone $L^\infty_+ := \{u \in L^\infty : u \geq 0 \ \text{a.e. in}\ M\}$. This is a normal order cone. It is a solid cone, therefore it is generating.
• Introduce on $W^{k,\infty}$ the cone $W^{k,\infty}_+ := \{u \in W^{k,\infty} : u \geq 0 \ \text{a.e. in}\ M\}$. This is an order cone. It is not normal for $k \geq 1$. The cone is solid, therefore it is generating.
• Introduce on $L^p$ the cone $L^p_+ := \{u \in L^p : u \geq 0 \ \text{a.e. in}\ M\}$. This is a normal order cone for every real number $p \geq 1$. The cone is not solid; however, it is a generating cone.
• Introduce on $W^{k,p}$ the cone $W^{k,p}_+ := \{u \in W^{k,p} : u \geq 0 \ \text{a.e. in}\ M\}$. This is an order cone for every real number $p \geq 1$. The cone is not normal for $k \geq 1$. The cone is not solid for $kp \leq n$, and it is solid for $kp > n$. In both cases, the cone is generating.



A key concept that becomes possible in ordered Banach spaces is that of an operator satisfying a maximum principle. We have not seen in the literature an approach to maximum principles on ordered Banach spaces in the generality we now present. Let $X$, $X_+$ and $Y$, $Y_+$ be ordered Banach spaces. An operator $A : D_A \subset X \to Y$ satisfies the maximum principle iff for every $u, v \in D_A$ such that $Au - Av \in Y_+$, it holds that $u - v \in X_+$. In the particular case that the operator $A$ is linear, it satisfies the maximum principle iff for all $u \in X$ such that $Au \in Y_+$, it holds that $u \in X_+$. The main example is the Laplace operator acting on scalar-valued functions defined on different domains. It is shown later on in this Appendix that the inverse of an operator that satisfies the maximum principle is monotone increasing. The following result gives a simple sufficient condition for an operator to satisfy the maximum principle. This result is useful for weak formulations of PDE.

Lemma 23. Let $X$, $X_+$ be an ordered Banach space, and $A : X \to X^*$ be a linear and coercive map. Assume that $X_+$ is a generating order cone, and that for all $u \in X$ such that $Au \in X_+^*$ there exists a decomposition $u = u_+ - u_-$ with $u_+, u_- \in X_+$ that also satisfies $Au_+(u_-) = 0$. Then, the operator $A$ satisfies the maximum principle.

Proof. Since the order cone $X_+$ is generating, the space $X^*$ is also an ordered Banach space. Denote its order cone by $X_+^*$. The assumption that the order cone $X_+$ is generating also implies that for any element $u \in X$ there exists a decomposition $u = u_+ - u_-$ with $u_+, u_- \in X_+$. By hypothesis, there exists at least one decomposition with the extra property that $Au_+(u_-) = 0$. Now, by definition of the order in the space $X^*$ we have that

\[ Au \in X_+^* \quad \Leftrightarrow \quad Au(\bar{u}) \geq 0 \ \ \forall\, \bar{u} \in X_+. \]

Pick as a test function $\bar{u} = u_-$. Then,

\[ 0 \leq Au(u_-) = A(u_+ - u_-)(u_-) = Au_+(u_-) - Au_-(u_-) = -Au_-(u_-), \]

where the last equality comes from the condition $Au_+(u_-) = 0$. Therefore, we have

\[ Au_-(u_-) \leq 0 \quad \Rightarrow \quad u_- = 0, \]

because $A$ is coercive. So we showed that $u = u_+ \in X_+$. This establishes the lemma.

An example is the weak form of the shifted Laplace-Beltrami operator $-\Delta + s$ on scalar functions on a closed manifold $M$, where $s > 0$. Consider the case $X = W^{1,2}$, with $Y = X^* = W^{-1,2}$, and $X_+ = W^{1,2}_+$, while $Y_+ = W^{-1,2}_+$. The operator in this case is given by $A : X \to X^*$ with action $Au(v) := (\nabla u, \nabla v) + s\,(u, v)$. It is not difficult to check that this operator satisfies the hypothesis in Lemma 23. Therefore, this operator satisfies the maximum principle, that is, $Au \in W^{-1,2}_+$ implies $u \in W^{1,2}_+$, that is, $u \geq 0$ a.e. in the manifold $M$.

A.3. Monotone increasing maps. Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces. An operator $F : X \to Y$ is monotone increasing iff for all $x, \bar{x} \in X$ such that $x - \bar{x} \in X_+$, it holds that $F(x) - F(\bar{x}) \in Y_+$. An operator $F : X \to Y$ is monotone decreasing iff for all $x, \bar{x} \in X$ such that $x - \bar{x} \in X_+$, it holds that $-[F(x) - F(\bar{x})] \in Y_+$. The main result for these types of maps is the following; it can be found as Theorem 7.A in [54], p. 283, and Corollary 7.18 on p. 284. We reproduce it here for completeness, without the proof.



Theorem 10 (Fixed point for increasing operators). Let $X$ be an ordered Banach space, with a normal order cone $X_+$. Let $T : [x_-, x_+] \subset X \to X$ be a monotone increasing, compact map. If $-[x_- - T(x_-)] \in X_+$ and $x_+ - T(x_+) \in X_+$, then the iterations

\[ x_{n+1} := T(x_n), \quad x_0 = x_-, \qquad \hat{x}_{n+1} := T(\hat{x}_n), \quad \hat{x}_0 = x_+, \]

converge to $x$ and $\hat{x} \in [x_-, x_+]$, respectively, and the following estimate holds:

\[ x_- \leq x_n \leq x \leq \hat{x} \leq \hat{x}_n \leq x_+, \qquad \forall\, n \in \mathbb{N}. \tag{A.1} \]
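The two iterations of Theorem 10 can be illustrated in the simplest ordered Banach space, $\mathbb{R}$ with its usual order. The following is a minimal sketch under stated assumptions: the map $T(x) = \sqrt{1 + x}$, the barriers $x_- = 0$ and $x_+ = 2$, and the helper name `monotone_iterations` are hypothetical choices made only for illustration and do not appear in the source.

```python
import math


def monotone_iterations(T, x_minus, x_plus, steps):
    """Return the lower and upper iteration sequences of Theorem 10 on the real line."""
    lower, upper = [x_minus], [x_plus]
    for _ in range(steps):
        lower.append(T(lower[-1]))   # x_{n+1} = T(x_n), started at x_-
        upper.append(T(upper[-1]))   # xhat_{n+1} = T(xhat_n), started at x_+
    return lower, upper


def T(x):
    # Hypothetical example map: increasing on [0, 2],
    # with 0 <= T(0) = 1 and T(2) = sqrt(3) <= 2.
    return math.sqrt(1.0 + x)


lower, upper = monotone_iterations(T, 0.0, 2.0, 40)

# The only fixed point of T in [0, 2] solves x^2 = 1 + x.
golden = (1.0 + math.sqrt(5.0)) / 2.0
```

In this example the lower sequence increases, the upper sequence decreases, and both approach the same limit, which is the ordering expressed by (A.1).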

We are interested in the following class of nonlinear problems: Find an element $x \in X$ which solves the equation

\[ Ax + F(x) = 0, \tag{A.2} \]

where the principal part involves an invertible linear operator $A : X \to Y$ satisfying the maximum principle, and the non-principal part involves a nonlinear operator $F : X \to Y$ which has monotonicity properties. We now establish some basic results for this class of problems. The first two results relate linear, invertible operators that satisfy the maximum principle with monotone increasing (decreasing) operators.

Lemma 24. Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces. Let $A : X \to Y$ be a linear, invertible operator satisfying the maximum principle. Then, the inverse operator $A^{-1} : Y \to X$ is monotone increasing.

Proof. Let $y, \bar{y} \in Y$ be such that $y - \bar{y} \in Y_+$. Then,

\[ A A^{-1}(y - \bar{y}) \in Y_+ \ \Rightarrow \ A^{-1}(y - \bar{y}) \in X_+ \ \Leftrightarrow \ A^{-1}y - A^{-1}\bar{y} \in X_+. \]

This establishes that the operator A−1 is monotone increasing.
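A finite-dimensional instance of Lemma 24 may help fix ideas. The sketch below assumes $X = Y = \mathbb{R}^2$ with the componentwise order cone $\mathbb{R}^2_+$; the matrix `A`, the sample points, and the helper names are hypothetical and chosen only for illustration. The inverse of this particular matrix has non-negative entries, so $Ax \in \mathbb{R}^2_+$ implies $x \in \mathbb{R}^2_+$, and the check confirms on a few samples that $A^{-1}$ preserves the order.

```python
import itertools


def solve2(A, y):
    """Solve the 2x2 linear system A x = y by Cramer's rule."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d * y[0] - b * y[1]) / det, (a * y[1] - c * y[0]) / det)


def in_cone(x, tol=0.0):
    """Membership in the componentwise order cone of R^2, up to a tolerance."""
    return all(xi >= -tol for xi in x)


def diff(x, x_bar):
    return tuple(p - q for p, q in zip(x, x_bar))


# Hypothetical example matrix; its inverse is (1/3) * [[2, 1], [1, 2]].
A = ((2.0, -1.0), (-1.0, 2.0))

samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.5, 2.0), (3.0, 0.25)]

# For every ordered pair y >= y_bar, check that A^{-1} y >= A^{-1} y_bar.
monotone = all(
    in_cone(diff(solve2(A, y), solve2(A, y_bar)), 1e-12)
    for y, y_bar in itertools.product(samples, repeat=2)
    if in_cone(diff(y, y_bar))
)
```

A sample check of this kind is only a sanity test, not a proof; the general statement is the content of the lemma.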

Lemma 25. Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces. Let $A : X \to Y$ be a linear, invertible operator satisfying the maximum principle. Let $F : X \to Y$ be a monotone decreasing (increasing) operator. Then, the operator $T : X \to X$ given by $T := -A^{-1}F$ is monotone increasing (decreasing).

Proof. Assume first that the operator $F$ is monotone decreasing. So, given any $x, \bar{x} \in X$ such that $x - \bar{x} \in X_+$, the following implications hold:

\[
\begin{aligned}
x - \bar{x} \in X_+ \ &\Rightarrow \ -[F(x) - F(\bar{x})] \in Y_+, \\
&\Leftrightarrow \ A\big(-A^{-1}[F(x) - F(\bar{x})]\big) \in Y_+, \\
&\Rightarrow \ -A^{-1}[F(x) - F(\bar{x})] \in X_+, \\
&\Leftrightarrow \ -[A^{-1}F(x) - A^{-1}F(\bar{x})] \in X_+, \\
&\Leftrightarrow \ T(x) - T(\bar{x}) \in X_+,
\end{aligned}
\]

which establishes that the operator $T$ is monotone increasing. In the case that the operator $F$ is monotone increasing, the first line in the proof above changes into: $x - \bar{x} \in X_+$ implies that $F(x) - F(\bar{x}) \in Y_+$, and then all the remaining inequalities in the proof above are reversed. This establishes the lemma.



The next result translates the inequalities satisfied by sub- and super-solutions to the equation $Ax + F(x) = 0$ into inequalities for the operator $T = -A^{-1}F$.

Lemma 26. Assume the hypothesis in Lemma 25. If there exists an element $x_+ \in X$ such that $Ax_+ + F(x_+) \in Y_+$, then this element satisfies $x_+ - T(x_+) \in X_+$. If there exists an element $x_- \in X$ such that $-[Ax_- + F(x_-)] \in Y_+$, then this element satisfies $-[x_- - T(x_-)] \in X_+$.

Proof. The first statement in the lemma can be shown as follows:

\[ Ax_+ + F(x_+) \in Y_+ \ \Leftrightarrow \ A\big[x_+ + A^{-1}F(x_+)\big] \in Y_+ \ \Rightarrow \ x_+ + A^{-1}F(x_+) \in X_+, \]

which then establishes that $x_+ - T(x_+) \in X_+$. In a similar way, the second statement in the lemma can be shown as follows:

\[ -[Ax_- + F(x_-)] \in Y_+ \ \Leftrightarrow \ A\big[-x_- - A^{-1}F(x_-)\big] \in Y_+ \ \Rightarrow \ -x_- - A^{-1}F(x_-) \in X_+, \]

which then establishes that $-[x_- - T(x_-)] \in X_+$. This establishes the lemma.

For nonlinear problems of the form (A.2), one can use Theorem 10 for monotone nonlinearities to conclude the following.

Corollary 2. (Semi-linear equations with sub-/super-solutions) Let $X$, $X_+$ and $Y$, $Y_+$ be two ordered Banach spaces where $X_+$ is a normal order cone. Let $A : X \to Y$ be a linear, invertible operator satisfying the maximum principle. Let $x_+, x_- \in X$ be elements such that $(x_+ - x_-) \in X_+$, and then assume that the operator $F : [x_-, x_+] \subset X \to Y$ is monotone decreasing and compact. If the elements $x_-$ and $x_+$ satisfy the relations

\[ -[Ax_- + F(x_-)] \in Y_+, \qquad Ax_+ + F(x_+) \in Y_+, \tag{A.3} \]

then there exists a solution $x \in [x_-, x_+] \subset X$ of the equation $Ax + F(x) = 0$.

Proof. Since the operator $A$ is invertible, rewrite the equation $Ax + F(x) = 0$ as a fixed-point equation,

\[ x = -A^{-1}F(x) =: T(x). \tag{A.4} \]

By Lemma 25, we know that the map $T : X \to X$ is monotone increasing. Moreover, this operator $T$ is compact, since it is the composition of the continuous mapping $-A^{-1}$ and the compact map $F$. The elements $x_-$ and $x_+$ satisfy Eq. (A.3), therefore, by Lemma 26, they are also sub- and super-solutions for the fixed-point equation involving the map $T$. It follows from Theorem 10 that there exists an element $x \in X$ that solves the fixed-point equation (A.4), and this solution satisfies the bounds $x_- \leq x \leq x_+$.
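The construction in Corollary 2 can be tried out on a small finite-dimensional model. The following is a sketch under stated assumptions, not a discretization of the constraint equations: $X = Y = \mathbb{R}^2$ with the componentwise order, and the matrix behind `apply_A`, the nonlinearity `F`, and the barriers `x_minus`, `x_plus` are hypothetical choices made only for illustration. The code checks the relations (A.3) for the chosen barriers and then iterates $T = -A^{-1}F$ starting from the sub-solution.

```python
import math


def apply_A(x):
    # Hypothetical operator A = [[2, -1], [-1, 2]]; its inverse has non-negative entries.
    return (2.0 * x[0] - x[1], -x[0] + 2.0 * x[1])


def apply_A_inv(y):
    # Inverse of the matrix above: (1/3) * [[2, 1], [1, 2]].
    return ((2.0 * y[0] + y[1]) / 3.0, (y[0] + 2.0 * y[1]) / 3.0)


def F(x):
    # Hypothetical monotone decreasing nonlinearity, applied componentwise.
    return tuple(-(1.0 + math.tanh(xi)) for xi in x)


def T(x):
    # Fixed-point map T = -A^{-1} F from (A.4).
    return apply_A_inv(tuple(-fi for fi in F(x)))


x_minus, x_plus = (0.0, 0.0), (2.0, 2.0)

# Relations (A.3): -[A x_- + F(x_-)] >= 0 and A x_+ + F(x_+) >= 0, componentwise.
sub_ok = all(-(a + f) >= 0.0 for a, f in zip(apply_A(x_minus), F(x_minus)))
super_ok = all(a + f >= 0.0 for a, f in zip(apply_A(x_plus), F(x_plus)))

# Iterate from the sub-solution.
x = x_minus
for _ in range(60):
    x = T(x)

# Size of A x + F(x) at the final iterate.
residual = max(abs(a + f) for a, f in zip(apply_A(x), F(x)))
```

In this toy model the iterates stay between the two barriers and the residual of $Ax + F(x) = 0$ becomes negligible, in line with the bounds $x_- \leq x \leq x_+$ stated in the corollary.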



A.4. Sobolev spaces on closed manifolds. In this Appendix we will recall some properties of Sobolev spaces of sections of vector bundles over closed manifolds. The following definition makes precise what we mean by fractional order Sobolev spaces. We expect that without much difficulty all the results in this paper can be modified to reflect other smoothness classes such as Bessel potential spaces or general Besov spaces.

Definition 2. For $s \geq 0$ and $1 \leq p \leq \infty$, we denote by $W^{s,p}(\mathbb{R}^n)$ the space of all distributions $u$ defined in $\mathbb{R}^n$, such that

(a) when $s = m$ is an integer,

\[ \|u\|_{m,p} = \sum_{|\nu| \leq m} \|\partial^\nu u\|_p < \infty, \]

where $\|\cdot\|_p$ is the standard $L^p$-norm in $\mathbb{R}^n$;

(b) and when $s = m + \sigma$ with $m$ a (nonnegative) integer and $\sigma \in (0, 1)$,

\[ \|u\|_{s,p} = \|u\|_{m,p} + \sum_{|\nu| = m} \|\partial^\nu u\|_{\sigma,p} < \infty, \]

where

\[ \|u\|_{\sigma,p} = \left( \iint_{\mathbb{R}^n \times \mathbb{R}^n} \frac{|u(x) - u(y)|^p}{|x - y|^{n + \sigma p}} \, dx \, dy \right)^{\frac{1}{p}}, \qquad \text{for } 1 \leq p < \infty, \]

and

\[ \|u\|_{\sigma,\infty} = \operatorname*{ess\,sup}_{x, y \in \mathbb{R}^n} \frac{|u(x) - u(y)|}{|x - y|^{\sigma}}. \]

For $s < 0$ and $1 < p < \infty$, $W^{s,p}(\mathbb{R}^n)$ denotes the topological dual of $W^{-s,p'}(\mathbb{R}^n)$, where $\frac{1}{p} + \frac{1}{p'} = 1$. These well known spaces are Banach spaces with corresponding norms, and become Hilbert spaces when $p = 2$. We refer to [18,46] and references therein for further properties.

Now we will define analogous spaces on closed manifolds. Let $M$ be an $n$-dimensional smooth closed manifold, and let $\{(U_i, \varphi_i)\}$ be a collection of charts such that $\{U_i\}$ forms a finite cover of $M$. Then for any distribution $u \in C_0^\infty(U_i)^*$, the pull-back $\varphi_i^*(u) \in C_0^\infty(\varphi_i(U_i))^*$ is defined by $\varphi_i^*(u)(v) = u(v \circ \varphi_i)$ for all $v \in C_0^\infty(\varphi_i(U_i))$. Extending $\varphi_i^*(u)$ by zero outside $\varphi_i(U_i)$, in the following we treat it as a distribution on $\mathbb{R}^n$. Let $\{\chi_i\}$ be a smooth partition of unity subordinate to $\{U_i\}$.

Definition 3. For $s \in \mathbb{R}$ and $p \in (1, \infty)$, we denote by $W^{s,p}(M)$ the space of all distributions $u$ defined in $M$, such that

\[ \|u\|_{s,p} = \sum_i \|\varphi_i^*(\chi_i u)\|_{s,p} < \infty, \tag{A.5} \]

where the norm under the sum is the $W^{s,p}(\mathbb{R}^n)$-norm. In case $s \geq 0$, these Sobolev spaces can also be defined for $p = 1$ and $p = \infty$.



We collect the most basic properties of these spaces in the following lemma. Recall that a Riemannian metric on $M$ induces a volume form on $M$, so that $L^p$ spaces can be defined on $M$ (cf. [43]).

Lemma 27. Either let $s \geq 0$ and $p \in [1, \infty]$ or let $s < 0$ and $p \in (1, \infty)$. Then the space $W^{s,p}(M)$ is a Banach space. It is independent of the choice of the covering charts $\{(U_i, \varphi_i)\}$ and the partition of unity $\{\chi_i\}$. In particular, the different norms (A.5) are equivalent. Moreover, the following are true when $M$ is equipped with a smooth Riemannian metric.

(a) Let $\nabla$ be the Levi-Civita connection associated to the Riemannian metric. Then for any nonnegative integer $m$,

\[ \|u\|_{m,p} = \sum_{i=0}^{m} \|\nabla^i u\|_p \]

is an equivalent norm on $W^{m,p}(M)$. In particular, we have $W^{0,p}(M) = L^p(M)$.

(b) Identifying $C^\infty(M)$ as a subspace of distributions via the $L^2$-inner product, $C^\infty(M)$ is densely embedded in $W^{s,p}(M)$ for any $s \in \mathbb{R}$ and $p \in (1, \infty)$.

(c) Let $s \in \mathbb{R}$ and $p \in (1, \infty)$. Then the $L^2$-inner product on $C^\infty(M)$ extends uniquely to a continuous bilinear pairing $W^{s,p}(M) \otimes W^{-s,p'}(M) \to \mathbb{R}$, where $\frac{1}{p} + \frac{1}{p'} = 1$. Moreover, the pairing induces a topological isomorphism between $W^{-s,p'}(M)$ and the topological dual space of $W^{s,p}(M)$.

Proof. See for example [3,19,43,45].

A main goal of this subsection is to extend the previous lemma to the case when the Riemannian metric is not smooth. The following result will be of importance.

Lemma 28. Let $s_i \geq s$ with $s_1 + s_2 \geq 0$, and $1 \leq p, p_i \leq \infty$ ($i = 1, 2$) be real numbers satisfying

\[ s_i - s \geq n \left( \frac{1}{p_i} - \frac{1}{p} \right), \qquad s_1 + s_2 - s > n \left( \frac{1}{p_1} + \frac{1}{p_2} - \frac{1}{p} \right), \]

where the strictness of the inequalities can be interchanged if $s \in \mathbb{N}_0$. In case $\min(s_1, s_2) < 0$, in addition let $1 < p, p_i < \infty$, and let

\[ s_1 + s_2 \geq n \left( \frac{1}{p_1} + \frac{1}{p_2} - 1 \right). \]

Then, the pointwise multiplication of functions extends uniquely to a continuous bilinear map $W^{s_1,p_1}(M) \otimes W^{s_2,p_2}(M) \to W^{s,p}(M)$.

Proof. A proof is given in [55] for the case $s \geq 0$, and by using a duality argument one can easily extend the proof to negative values of $s$.

Some important special cases are considered in the following corollary:



Corollary 3. (a) If $p \in (1, \infty)$ and $s \in (\frac{n}{p}, \infty)$, then $W^{s,p}$ is a Banach algebra. Moreover, if in addition $q \in (1, \infty)$ and $\sigma \in [-s, s]$ satisfy $\sigma - \frac{n}{q} \in [-n - s + \frac{n}{p}, s - \frac{n}{p}]$, then the pointwise multiplication is bounded as a map $W^{s,p} \otimes W^{\sigma,q} \to W^{\sigma,q}$.

(b) Let $1 < p, q < \infty$ and $\sigma \leq s \geq 0$ satisfy $\sigma - \frac{n}{q} < 2(s - \frac{n}{p})$ and $\sigma - \frac{n}{q} \leq s - \frac{n}{p}$. Then the pointwise multiplication is bounded as a map $W^{s,p} \otimes W^{s,p} \to W^{\sigma,q}$.

The following lemma is proved in [33] for the case $p = q = 2$. With the help of Lemma 28, the proof can easily be adapted to the following general case.

Lemma 29. Let $p \in (1, \infty)$ and $s \in (\frac{n}{p}, \infty)$, and let $u \in W^{s,p}$. Let $\sigma \in [-1, 1]$ and $\frac{1}{q} \in (\frac{1+\sigma}{2}\delta, 1 - \frac{1-\sigma}{2}\delta)$, and let $v \in W^{\sigma,q}$, where $\delta = \frac{1}{p} - \frac{s-1}{n}$. Moreover, let $f : [\inf u, \sup u] \to \mathbb{R}$ be a smooth function. Then, we have
$$\|v(f \circ u)\|_{\sigma,q} \leq C \|v\|_{\sigma,q} \big( \|f \circ u\|_\infty + \|f' \circ u\|_\infty \|u\|_{s,p} \big),$$
where the constant $C$ does not depend on $u$, $v$ or $f$.

Proof. We consider the case $\sigma = 1$ first. Choosing a smooth Riemannian metric on $M$, we have
$$\begin{aligned}
\|v(f \circ u)\|_{1,q} &\leq C \big( \|v(f \circ u)\|_q + \|\nabla[v(f \circ u)]\|_q \big) \\
&\leq C \big( \|v(f \circ u)\|_q + \|(\nabla v)(f \circ u)\|_q + \|v(f' \circ u)\nabla u\|_q \big) \\
&\leq C \big( \|v\|_q \|f \circ u\|_\infty + \|v\|_{1,q} \|f \circ u\|_\infty + \|f' \circ u\|_\infty \|v \nabla u\|_q \big).
\end{aligned}$$
By Lemma 28, for $\frac{1}{q} \geq \delta$, the last term can be bounded as
$$\|v \nabla u\|_q \leq C \|v\|_{1,q} \|\nabla u\|_{s-1,p} \leq C \|v\|_{1,q} \|u\|_{s,p},$$
proving the lemma for the case $\sigma = 1$. By using duality one proves the case $\sigma = -1$ and $\frac{1}{q} \leq 1 - \delta$, and the lemma follows from interpolation.

Let $M$ be an $n$-dimensional smooth closed manifold, and let $E \to M$ be a smooth vector bundle over $M$. Analogously to Definition 3, we define the Sobolev space $W^{s,p}(E)$ of sections of $E$ by utilizing a finite trivializing cover of coordinate charts, a partition of unity subordinate to the cover, and the space $[W^{s,p}(\mathbb{R}^n)]^k$ of vector functions, where $k$ is the fiber dimension of $E$. Then, Lemma 27 holds for these spaces with obvious modifications. When there is no risk of confusion, we will omit the explicit specification of the vector bundle $E$ from the notation $W^{s,p}(E)$. In the following lemma we consider nonsmooth Riemannian structures on $E$ and nonsmooth volume forms on $M$.

Lemma 30. Let $\gamma \in (1, \infty)$ and $\alpha \in (\frac{n}{\gamma}, \infty)$. Fix on $M$ a volume form of class $W^{\alpha,\gamma}$, and on $E$ a Riemannian structure of class $W^{\alpha,\gamma}$.

(a) Let $p \in (1, \infty)$ and $s \leq \min\{\alpha, \alpha + n(\frac{1}{p} - \frac{1}{\gamma})\}$. Then identifying the space $C^\infty(E)$ of smooth sections of $E$ as a subspace of distributions via the $L^2$-inner product, $C^\infty(E)$ is densely embedded in $W^{s,p}(E)$.

(b) Let $s \in [-\alpha, \alpha]$, $p \in (1, \infty)$, and $s - \frac{n}{p} \in [-n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Then the $L^2$-inner product on $C^\infty(E)$ extends uniquely to a continuous bilinear pairing $W^{s,p}(E) \otimes W^{-s,p'}(E) \to \mathbb{R}$, where $\frac{1}{p} + \frac{1}{p'} = 1$. Moreover, the pairing induces a topological isomorphism $[W^{s,p}(E)]^* \cong W^{-s,p'}(E)$.



Proof. We will prove the lemma for scalar functions on $M$, i.e., for the trivial bundle $E = M \times \mathbb{R}$. The general case is only more technical. Fixing a smooth volume form on $M$ and denoting the associated $L^2$-inner product by $(\cdot, \cdot)_*$, the $L^2$-inner product associated to the nonsmooth volume form (and the nonsmooth metric on $M \times \mathbb{R}$) satisfies
$$(u, v)_{L^2} = (hu, v)_*, \qquad u, v \in C^\infty(M),$$
with some strictly positive function $h \in W^{\alpha,\gamma}$. From Lemma 28, we have that multiplication by $h$ is continuous on $W^{s,p}$ for $s \in [-\alpha, \alpha]$, $p \in (1, \infty)$, and $s - \frac{n}{p} \in [-n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Since $h > 0$, this operation is invertible, hence a homeomorphism on $W^{s,p}$. Now by using Lemma 27 we complete the proof.

Corollary 4. Let $\gamma \in (1, \infty)$ and $\alpha \in (\frac{n}{\gamma}, \infty)$. Fix on $M$ a volume form of class $W^{\alpha,\gamma}$, and on $E$ a Riemannian structure of class $W^{\alpha,\gamma}$. With $s \in [-\alpha, \alpha]$, $p \in (1, \infty)$, and $s - \frac{n}{p} \in [-n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$, let $A : L^p \to W^{s,p}$ be a bounded linear operator and let $A^*$ be its formal $L^2$-adjoint, i.e., let
$$(Au, v)_{L^2} = (u, A^* v)_{L^2}, \qquad \text{for } u, v \in C^\infty(E).$$
Then, $A^*$ extends uniquely to a bounded linear map $A^* : W^{-s,p'} \to L^{p'}$, and we have
$$\langle Au, v \rangle = \langle u, A^* v \rangle, \qquad \text{for } u \in L^p(E),\ v \in W^{-s,p'}(E),$$
where $\langle \cdot, \cdot \rangle$ denotes the extension of the $L^2$-inner product.

Proof. This is an application of Lemma 30.

A.5. Elliptic operators on closed manifolds. In this Appendix we will state a priori estimates for general elliptic operators in some Sobolev spaces. Let $M$ be an $n$-dimensional smooth closed manifold, and let $E \to M$ be a smooth vector bundle over $M$. Let $C^{-\infty}(E)$ be the topological dual of the space $C^\infty(E)$ of smooth sections of $E$. Then for $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$, and $\gamma \in [1, \infty]$, we define $D_m^{\alpha,\gamma}(E)$ to be the space of differential operators $A : C^\infty(E) \to C^{-\infty}(E)$ that can be written in local coordinates (trivializing $E$) as
$$A = \sum_{|\nu| \leq m} a^\nu \partial_\nu \quad \text{with} \quad a^\nu \in W^{\alpha - m + |\nu|, \gamma}(\mathbb{R}^n, \mathbb{R}^{k \times k}), \quad |\nu| \leq m,$$
where $k$ is the fiber dimension of $E$. One can easily verify that if the metric of a Riemannian manifold is in $W^{\alpha,\gamma}$ with $\alpha\gamma > n$, then both the Laplace-Beltrami operator and vector Laplacian defined in (2.17) are in the classes $D_2^{\alpha,\gamma}(M \times \mathbb{R})$ and $D_2^{\alpha,\gamma}(TM)$, respectively.

Lemma 31. Let $A$ be a differential operator of class $D_m^{\alpha,\gamma}(E)$. Then, $A$ can be extended to a bounded linear map
$$A : W^{s,q}(E) \to W^{\sigma,q}(E),$$
for $q \in (1, \infty)$, $s \geq m - \alpha$, and $\sigma$ satisfying
$$\sigma \leq \min\{s, \alpha\} - m, \qquad \sigma < s - m + \alpha - \frac{n}{\gamma},$$
$$\sigma - \frac{n}{q} \leq \alpha - \frac{n}{\gamma} - m, \qquad \text{and} \qquad s - \frac{n}{q} \geq m - n - \alpha + \frac{n}{\gamma}.$$

Proof. This is a straightforward application of Lemma 28.
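The reduction of Lemma 31 to Lemma 28 can be sketched as follows. This is an illustrative bookkeeping only, not the authors' written proof: it assumes Lemma 28 is applied termwise to each product $a^\nu \partial_\nu u$ with $|\nu| \leq m$, taking $s_1 = \alpha - m + |\nu|$, $p_1 = \gamma$, $s_2 = s - |\nu|$, $p_2 = q$, and target space $W^{\sigma,q}$.

```latex
% Sketch under the stated assumptions: conditions of Lemma 28 for a^\nu \partial_\nu u,
% with s_1 = \alpha - m + |\nu|, p_1 = \gamma, s_2 = s - |\nu|, p_2 = q, target W^{\sigma,q}.
\begin{align*}
s_1 \geq \sigma,\ s_2 \geq \sigma \ \text{ for all } 0 \leq |\nu| \leq m
  &\iff \sigma \leq \min\{s, \alpha\} - m, \\
s_1 + s_2 \geq 0
  &\iff s \geq m - \alpha, \\
s_1 - \sigma \geq n\Big(\tfrac{1}{\gamma} - \tfrac{1}{q}\Big) \ \text{ for all } |\nu|
  &\iff \sigma - \tfrac{n}{q} \leq \alpha - \tfrac{n}{\gamma} - m, \\
s_1 + s_2 - \sigma > n\Big(\tfrac{1}{\gamma} + \tfrac{1}{q} - \tfrac{1}{q}\Big)
  &\iff \sigma < s - m + \alpha - \tfrac{n}{\gamma}, \\
s_1 + s_2 \geq n\Big(\tfrac{1}{\gamma} + \tfrac{1}{q} - 1\Big)
  &\iff s - \tfrac{n}{q} \geq m - n - \alpha + \tfrac{n}{\gamma}.
\end{align*}
```

The worst cases are $|\nu| = 0$ for the conditions on $s_1$ and $|\nu| = m$ for the condition on $s_2$, while $s_1 + s_2 = s + \alpha - m$ does not depend on $\nu$. The last line is the additional requirement that Lemma 28 imposes when one of the exponents is negative.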

The Laplace-Beltrami operator and vector Laplacian are elliptic operators. We now consider local a priori estimates for general elliptic operators. For any subset $U \subset M$, the $W^{s,p}(U)$-norm is denoted by $\|\cdot\|_{s,p,U}$.

Lemma 32. Let $A \in D_m^{\alpha,\gamma}(E)$ be an elliptic operator with $\alpha - \frac{n}{\gamma} > \max\{0, \frac{m-n}{2}\}$. Let $q \in (1, \infty)$, $s \in (m - \alpha, \alpha]$, and $s - \frac{n}{q} \in (m - n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Then for any $y \in M$, there exist a constant $c > 0$ and open neighborhoods $K \subset U \subset M$ of $y$ such that
$$c \|\chi u\|_{s,q} \leq \|Au\|_{s-m,q} + \|u\|_{s-1,q,U}, \tag{A.6}$$
for any $u \in W^{s,q}(E)$ and $\chi \in C_0^\infty(K)$ with $\chi \geq 0$.

Proof. We work in a local chart containing $y$, which trivializes $E$. Let $K$ be the open ball of radius $r$ centered at $y$ contained in the domain of the chart, and extend the coefficients of $A$ outside $K$ so that the resulting operator is still in $D_m^{\alpha,\gamma}$, with appropriate vector fields over $\mathbb{R}^n$. We make the decomposition $A = L + R + B$, where $L$ is the highest order term of $A$ with coefficients frozen at $y$, and $R$ is what remains in the highest order terms, i.e.,
$$L = \sum_{|\nu| = m} a^\nu(y) \partial_\nu, \qquad R = \sum_{|\nu| = m} [a^\nu - a^\nu(y)] \partial_\nu.$$
Obviously $B = A - L - R$ are the lower order terms. From the theory of constant coefficient elliptic operators, we infer the existence of a constant $c > 0$ such that for any $u \in W^{s,q}(E)$ with $\operatorname{supp} u \subset K$,
$$c \|u\|_{s,q} \leq \|Lu\|_{s-m,q} + \|u\|_{s-m,q} \leq \|Au\|_{s-m,q} + \|Ru\|_{s-m,q} + \|Bu\|_{s-m,q} + \|u\|_{s-m,q}.$$
Since $\alpha > \frac{n}{\gamma}$, without loss of generality we can assume for $|\nu| = m$ that $a^\nu \in C^{0,h}$ for some $h > 0$, so
$$\|Ru\|_{s-m,q} \leq C r^h \|u\|_{s,q},$$
where $C$ is a constant depending only on $A$. By choosing $r$ so small that $C r^h \leq \frac{c}{2}$, we have
$$\frac{c}{2} \|u\|_{s,q} \leq \|Au\|_{s-m,q} + \|Bu\|_{s-m,q} + \|u\|_{s-m,q}.$$
Now we will work with the lower order term. Choose $\delta \in (0, \alpha - \frac{n}{\gamma})$ such that $\delta \leq \min\{1, s + \alpha - m, s - \frac{n}{q} + \alpha - \frac{n}{\gamma} + n - m\}$. We have $B \in D_{m-1}^{\alpha-1,\gamma}$, so by Lemma 31, $B : W^{s-\delta,q} \to W^{s-m,q}$ is bounded. Then using a well known interpolation inequality, we get
$$\|Bu\|_{s-m,q} \leq C \|u\|_{s-\delta,q} \leq C \varepsilon \|u\|_{s,q} + C \varepsilon^{-(m-\delta)/\delta} \|u\|_{s-m,q},$$
for any $\varepsilon > 0$. Choosing $\varepsilon > 0$ sufficiently small, we conclude that
$$c \|u\|_{s,q} \leq \|Au\|_{s-m,q} + \|u\|_{s-m,q}, \qquad \forall u \in W^{s,q}(E),\ \operatorname{supp} u \subset K.$$
We apply this inequality to $\chi u$, and then observing that $[A, \chi]$ is in $D_{m-1}^{\alpha,\gamma}(M)$, we obtain (A.6).

We can easily globalize the above result as follows:

Corollary 5. Let the conditions of Lemma 32 hold. Then there exists a constant $c > 0$ such that
$$c \|u\|_{s,q} \leq \|Au\|_{s-m,q} + \|u\|_{s-m,q}, \qquad \forall u \in W^{s,q}(E). \tag{A.7}$$

Proof. We first cover $M$ by open neighborhoods $K$ by applying Lemma 32 to every point $y \in M$, and then choose a finite subcover of the resulting cover. Then a partition of unity argument gives (A.7) with the term $\|u\|_{s-m,q}$ replaced by $\|u\|_{s-1,q}$, and finally one can use an interpolation inequality to get the conclusion.

Let us recall the following well known results from functional analysis.

Lemma 33. Let $X$ and $Y$ be Banach spaces with continuous embedding $X \hookrightarrow Y$. Let $A : X \to Y$ be a continuous linear map. Then

(a) A necessary and sufficient condition that the graph of $A$ be closed in $X \times Y$ is that there exists a constant $c > 0$ such that $c \|u\|_X \leq \|Au\|_Y + \|u\|_Y$ for all $u \in X$.

(b) If in addition the embedding $X \hookrightarrow Y$ is compact, then the range of $A$ is closed and the kernel of $A$ is finite-dimensional.

As an immediate consequence, we obtain the following result.

Lemma 34. Let $A \in D_m^{\alpha,\gamma}(E)$ be an elliptic operator with $\alpha - \frac{n}{\gamma} > \max\{0, \frac{m-n}{2}\}$. Let $q \in (1, \infty)$, $s \in (m - \alpha, \alpha]$, and $s - \frac{n}{q} \in (m - n - \alpha + \frac{n}{\gamma}, \alpha - \frac{n}{\gamma}]$. Then, the operator $A : W^{s,q}(E) \to W^{s-m,q}(E)$ is semi-Fredholm, i.e., its range is closed and the kernel is finite-dimensional.

A.6. Maximum principles on closed manifolds. In this Appendix, we present maximum principles for operators of the form $-\nabla \cdot (u \nabla)$ with a positive function $u$, followed by a simple application. These types of results are well known, but nevertheless we state them here for completeness. It is convenient at times when working with barriers and maximum principle arguments to split real valued functions into positive and negative parts; we will use the following notation for these concepts:
$$\phi^+ := \max\{\phi, 0\}, \qquad \phi^- := -\min\{\phi, 0\},$$
whenever they make sense. In the proof of the following lemma we will use the fact that for $\phi \in W^{1,p}$, $\phi^+ \in W^{1,p}$ holds, and so $\phi^- \in W^{1,p}$, cf. [38].



Lemma 35. Let $p \in (1, \infty)$ and $s \in (\frac{n}{p}, \infty) \cap [1, \infty)$, and let $(M, h_{ab})$ be an $n$-dimensional, smooth, closed manifold with a Riemannian metric $h_{ab} \in W^{s,p}$. Moreover, let $u \in W^{s,p}$ be a function with $u > 0$ and let $f \in W^{s-2,p}$. Let $\phi \in W^{s,p}$ be such that
$$\langle u \nabla \phi, \nabla \varphi \rangle + \langle f, \phi \varphi \rangle \geq 0, \qquad \text{for all } \varphi \in C_+^\infty. \tag{A.8}$$

(a) If $f \neq 0$ and $\langle f, \varphi \rangle \geq 0$ for all $\varphi \in C_+^\infty$, then $\phi \geq 0$.

(b) If $M$ is connected and $\phi \geq 0$, then either $\phi \equiv 0$ or $\phi > 0$ everywhere.

Proof. For (a), we will follow the proof of [33, Lemma 2.9]. Since $\phi \in W^{1,n}$, we have $\phi^- \in W_+^{1,n}$ and $-\phi \phi^- \in W_+^{1,n}$. Note that $W^{1,n} \hookrightarrow (W^{s-2,p})^*$ by $n \geq 2$. Now, using the positivity of $f$ and the property (A.8), by density we get
$$0 \geq \langle f, \phi \phi^- \rangle \geq -\langle u \nabla \phi, \nabla \phi^- \rangle = \langle u \nabla \phi^-, \nabla \phi^- \rangle,$$
implying that $\phi^- = \text{const}$. So if $\phi < 0$, it would have to be a negative constant. But property (A.8) gives $\langle f, \varphi \rangle \leq 0$ for all $\varphi \in C_+^\infty$, which, in combination with the positivity, implies $f = 0$. This contradicts the hypothesis $f \neq 0$ and proves (a).

Now we will prove (b). Since $\phi$ is continuous, the level set $\phi^{-1}(0) \subset M$ is closed. Following the proof of [35, Lemma 5.3], we apply the weak Harnack inequality [47, Theorem 5.2] to show that $\phi^{-1}(0)$ is also open. Then by connectedness of $M$ we will have the proof. The weak Harnack inequality [47, Theorem 5.2] can be applied to second order elliptic operators of the form
$$L\phi = \partial_i (a^{ij} \partial_j \phi + a^i \phi) + b^j \partial_j \phi + a \phi,$$
where $a^{ij}$ are continuous, $a^i, b^j \in L^{2t}$, and $a \in L^t$ for some $t > \frac{n}{2}$. The first term in (A.8) satisfies these conditions, and the second term can be cast into a form satisfying the conditions (details can be found in the proof of [35, Lemma 5.3]). Now suppose that $\phi(x) = 0$ for some $x \in M$, and let us work in local coordinates around $x$. Then the weak Harnack inequality says that for sufficiently small $R > 0$, and for some $p > t$,
$$\|\phi\|_{L^p(B(x, 2R))} \leq C R^{n/p} \inf_{B(x, R)} \phi,$$
where $B(x, R)$ denotes the open ball of radius $R$ (in the background flat metric) centered at $x$, and $C$ is a constant that depends only on $t$, $p$, and the differential operator. Since $\phi(x) = 0$ and $\phi$ is nonnegative, the infimum is zero and the inequality implies that $\phi \equiv 0$ in a neighborhood of $x$. Hence the set $\phi^{-1}(0)$ is open.

Lemma 36. Let the hypotheses of Lemma 35 (b) hold, and define the operator $L : W^{s,p} \to W^{s-2,p}$ by
$$\langle L\phi, \varphi \rangle = \langle u \nabla \phi, \nabla \varphi \rangle + \langle f, \phi \varphi \rangle, \qquad \phi \in W^{s,p},\ \varphi \in C^\infty.$$
Then, $L$ is bounded and invertible.

Proof. By Lemma 34, the operator $L$ is semi-Fredholm, and moreover since $L$ is formally self-adjoint, it is Fredholm. It is well known that when the metric is smooth, the index of $L$ is zero independent of $s$ and $p$. We can approximate the metric $h$ by smooth metrics so that $L$ is arbitrarily close to a Fredholm operator with index zero. Since the level sets of the index as a function on Fredholm operators are open, we conclude that the index of $L$ is zero. The injectivity of $L$ follows from Lemma 35(a), for if $\phi_1$ and $\phi_2$ are two solutions of $L\phi = g$, then the above lemma implies that $\phi_1 - \phi_2 \geq 0$ and $\phi_2 - \phi_1 \geq 0$.



A.7. The Yamabe classification of nonsmooth metrics. Let $M$ be a smooth, closed, connected $n$-dimensional Riemannian manifold with a smooth metric $h$, where we assume throughout this section that $n \geq 3$. With a positive scalar $\varphi$, let $\tilde{h}$ be related to $h$ by the conformal transformation $\tilde{h} = \varphi^{2^* - 2} h$, where $2^* = \frac{2n}{n-2}$. We say that $\tilde{h}$ and $h$ are conformally equivalent, and this defines an equivalence relation on the space of metrics. The equivalence class containing $h$ will be denoted by $[h]$; e.g., $\tilde{h} \in [h]$. It is well known that any smooth Riemannian metric $h$ on a given closed connected manifold $M$ satisfies one and only one of the following three conditions:

$Y^+$: There is a metric in $[h]$ with strictly positive scalar curvature;
$Y^0$: There is a metric in $[h]$ with vanishing scalar curvature;
$Y^-$: There is a metric in $[h]$ with strictly negative scalar curvature.

These conditions define three disjoint classes in the space of metrics: they are referred to as the Yamabe classes. We will extend the above classification to metrics in the Sobolev spaces $W^{s,p}$ under rather mild conditions on $s$ and $p$. Since the case $p = 2$ is treated in [33] and the argument there easily extends to our slightly more general setting, we shall only sketch the proof here.

Given a Riemannian metric $h \in W^{s,p}$, let us consider the functional $E : W^{1,2} \to \mathbb{R}$ defined by
$$E(\varphi) = (a \nabla \varphi, \nabla \varphi) + \langle R, \varphi^2 \rangle,$$
where $a = 4\frac{n-1}{n-2}$. By Corollary 3, the pointwise multiplication is bounded on $W^{1,2} \otimes W^{1,2} \to W^{\sigma,q}$ for $\sigma \leq 1$ and $\sigma - \frac{n}{q} < 2 - n$. Putting $\sigma = 2 - s$ and $q = p'$, these conditions read as
$$2 - s - \frac{n}{p'} = 2 - n - s + \frac{n}{p} < 2 - n \quad \text{or} \quad s - \frac{n}{p} > 0, \qquad \text{and} \quad s \geq 1.$$
So if $sp > n$ and $s \geq 1$, then $\varphi^2 \in W^{2-s,p'}$ for $\varphi \in W^{1,2}$, meaning that the second term is bounded in $W^{1,2}$. By using the functional $E$, we define the quantity
$$\mu_q = \mu_q(h) = \inf_{\varphi \in B_q} E(\varphi), \qquad \text{where } B_q = \{\varphi \in W^{1,2} : \|\varphi\|_q = 1\}.$$
Under the conditions $sp > n$ and $s \geq 1$, one can show that $\mu_q$ is finite for $q \geq 2$, and moreover that $\mu_{2^*}$ is a conformal invariant, i.e., $\mu_{2^*}(h) = \mu_{2^*}(\tilde{h})$ for any two metrics $\tilde{h} \in [h]$, now allowing $W^{s,p}$ functions for the conformal factor. We refer to $\mu_{2^*}(h)$ as the Yamabe invariant of the metric $h$, and we will see that the Yamabe classes correspond to the signs of the Yamabe invariant.

Theorem 11. Let $(M, h)$ be a smooth, closed, connected Riemannian manifold with dimension $n \geq 3$ and with a metric $h \in W^{s,p}$, where we assume $sp > n$ and $s \geq 1$. Let $q \in [2, 2^*)$. Then, there exists $\phi \in W^{s,p}$, $\phi > 0$ in $M$, such that
$$-a \Delta \phi + R \phi = \mu_q \phi^{q-1}, \qquad \text{and} \qquad \|\phi\|_q = 1, \tag{A.9}$$
where $\mu_q = \mu_q(h)$ is as defined above.

Proof. The above equation is the Euler-Lagrange equation for the functional $E$, so it suffices to show that $E$ attains its infimum $\mu_q$ over $B_q$ at a positive function $\phi \in W^{s,p}$. Let $\{\phi_i\} \subset B_q$ be a sequence satisfying $E(\phi_i) \to \mu_q$. From the continuity of the embedding $L^q \hookrightarrow L^2$, we have that $\{\phi_i\}$ is bounded in $L^2$. It is the content of [33, Lemma 3.1] that
$$E(\varphi) \geq C_1 \|\varphi\|_{1,2}^2 - C_2 \|\varphi\|_2^2, \qquad \varphi \in W^{1,2},$$



for metrics in $W^{s,2}$ with $s > \frac{n}{2}$. The proof works verbatim for our case, and since $\mu_q$ is finite, from this we conclude that $\{\phi_i\}$ is bounded in $W^{1,2}$. By the reflexivity of $W^{1,2}$ and the compactness of $W^{1,2} \hookrightarrow L^q$, there exist an element $\phi \in W^{1,2}$ and a subsequence $\{\phi_{i'}\} \subset \{\phi_i\}$ such that $\phi_{i'} \rightharpoonup \phi$ in $W^{1,2}$ and $\phi_{i'} \to \phi$ in $L^q$. The latter implies $\phi \in B_q$. It is not difficult to show that $E$ is weakly lower semi-continuous, and it follows that $E(\phi) = \mu_q$, so $\phi$ satisfies (A.9). Bootstrapping with Corollary 5 implies that $\phi \in W^{s,p} \hookrightarrow W^{1,n}$, so that $|\phi| \in W^{1,n}$. Since $E(|\phi|) = E(\phi)$, after replacing $\phi$ by $|\phi|$, we can assume that $\phi \geq 0$. Finally, bootstrapping again gives $\phi \in W^{s,p}$, and since $\phi \neq 0$ as $\phi \in B_q$, by Lemma 35 we have $\phi > 0$.

Under the conformal scaling $\tilde{h} = \varphi^{2^* - 2} h$, the scalar curvature transforms as
$$\tilde{R} = \varphi^{1 - 2^*} (-a \Delta \varphi + R \varphi),$$
so assuming the conditions of the above theorem we infer that any given metric $h \in W^{s,p}$ can be transformed to the metric $\tilde{h} = \phi^{2^* - 2} h$ with the continuous scalar curvature $\tilde{R} = \mu_q \phi^{q - 2^*}$, where the conformal factor $\phi$ is as in the theorem. In other words, given any metric $h_{ab} \in W^{s,p}$, there exist continuous functions $\phi \in W^{s,p}$ with $\phi > 0$ and $\tilde{R} \in W^{s,p}$ having constant sign, such that
$$-a \Delta \phi + R \phi = \tilde{R} \phi^{2^* - 1}. \tag{A.10}$$
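As a short check of the step from (A.9) to (A.10), the following sketch substitutes the equation of Theorem 11 into the transformation law for the scalar curvature. It assumes only that the conformal factor is the function $\phi$ from Theorem 11 and that $\tilde{h} = \phi^{2^*-2} h$.

```latex
% Sketch: (A.9) inserted into \tilde R = \phi^{1-2^*}(-a\Delta\phi + R\phi).
\begin{align*}
\tilde{R} &= \phi^{1-2^*} \bigl( -a \Delta \phi + R \phi \bigr)
           = \phi^{1-2^*} \, \mu_q \, \phi^{q-1}
           = \mu_q \, \phi^{q-2^*}, \\
\tilde{R} \, \phi^{2^*-1} &= \mu_q \, \phi^{q-1} = -a \Delta \phi + R \phi .
\end{align*}
```

Since $\phi$ is continuous and strictly positive, $\tilde{R}$ is continuous and its sign agrees with the sign of $\mu_q$ everywhere on $M$.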

We will prove below that the conformal class of the metric $h$ completely determines the sign of $\tilde{R}$, giving rise to the Yamabe classification of metrics in $W^{s,p}$.

In the class of smooth metrics there is a stronger result known as the Yamabe theorem: each conformal class of smooth metrics contains a metric with constant scalar curvature. The Yamabe theorem is a non-trivial extension of the above theorem to the critical case $q = 2^*$, and we see that for smooth metrics the sign of the Yamabe invariant determines which Yamabe class the metric is in. A proof of the Yamabe theorem requires more delicate techniques since we lose the compactness of the embedding $W^{1,2} \hookrightarrow L^q$; see e.g. [31] for a treatment of smooth metrics. As far as we know, an explicit proof of the Yamabe theorem for nonsmooth metrics such as the ones considered in this paper has not appeared in the literature, although it is generally expected to be true. We will not pursue this issue here; however, the following simpler result justifies the Yamabe classification of nonsmooth metrics.

Theorem 12. Let $(M, h)$ be a smooth, closed, connected Riemannian manifold with dimension $n \geq 3$ and with a metric $h \in W^{s,p}$, where we assume $sp > n$ and $s \geq 1$. Then, the following hold:
• $\mu_{2^*} > 0$ iff there is a metric in $[h]$ with continuous positive scalar curvature.
• $\mu_{2^*} = 0$ iff there is a metric in $[h]$ with vanishing scalar curvature.
• $\mu_{2^*} < 0$ iff there is a metric in $[h]$ with continuous negative scalar curvature.
In particular, two conformally equivalent metrics cannot have scalar curvatures with distinct signs.

Proof. We begin by proving that if there is a metric in $[h]$ with continuous scalar curvature of constant sign, then $\mu_{2^*}$ has the corresponding sign. Since $\mu_{2^*}$ is a conformal invariant, we can assume that the scalar curvature $R$ of $h$ is continuous and has constant sign. If $R < 0$, then $E(\varphi) < 0$ for constant test functions $\varphi = \text{const}$ and there



is a constant function in $B_{2^*}$, so we have $\mu_{2^*} < 0$. If $R \geq 0$, then $E(\varphi) \geq 0$ for any $\varphi \in W^{1,2}$, so $\mu_{2^*} \geq 0$. Taking constant test functions, we infer that $R = 0$ implies $\mu_{2^*} = 0$. Now, if $R > 0$ then $E(\varphi)$ defines an equivalent norm on $W^{1,2}$, and we have $1 = \|\varphi\|_{2^*} \leq C \|\varphi\|_{1,2}$ for $\varphi \in B_{2^*}$, so $\mu_{2^*} > 0$.

Next, we will prove that there is a metric in $[h]$ with continuous scalar curvature with the same sign as that of $\mu_{2^*}$. To this end, for any $q \in [2, 2^*)$, we shall show that the sign of $\mu_{2^*}$ determines the sign of $\mu_q$, so that the proof is completed by Theorem 11. If $\mu_{2^*} < 0$, then $E(\varphi) < 0$ for some $\varphi \in B_{2^*}$, and since $E(k\varphi) = k^2 E(\varphi)$ for $k \in \mathbb{R}$, there is some $k\varphi \in B_q$ such that $E(k\varphi) < 0$, so $\mu_q < 0$. If $\mu_{2^*} \geq 0$, then $E(\varphi) \geq 0$ for all $\varphi \in B_{2^*}$, and for any $\psi \in B_q$ there is $k$ such that $k\psi \in B_{2^*}$, so $\mu_q \geq 0$. All such $k$ are uniformly bounded since $k = 1/\|\psi\|_{2^*} \leq C/\|\psi\|_q = C$ by the continuity estimate $\|\cdot\|_q \leq C \|\cdot\|_{2^*}$. From this we have for all $\psi \in B_q$,
$$E(\psi) = E(k\psi)/k^2 \geq \mu_{2^*}/k^2 \geq \mu_{2^*}/C^2,$$
meaning that $\mu_{2^*} > 0$ implies $\mu_q > 0$. A similar scaling argument gives that if $\mu_{2^*} = 0$ then $\mu_q = 0$.

A.8. Conformal covariance of the Hamiltonian constraint. Let $M$ be a smooth, closed, connected $n$-dimensional manifold equipped with a Riemannian metric $h \in W^{s,p}$, where we assume throughout this section that $p \in (1, \infty)$, $s \in (\frac{n}{p}, \infty) \cap [1, \infty)$ and that $n \geq 3$. We consider the Hamiltonian constraint
$$H(\phi) := -\Delta \phi + \tfrac{1}{r(n-1)} R \phi + a_\tau \phi^{r+1} - a_w \phi^{-r-3} - a_\rho \phi^{-t} = 0,$$
where $r = \frac{4}{n-2}$ and $t \in \mathbb{R}$ are constants, $R \in W^{s-2,p}$ is the scalar curvature of the metric $h$, and the other coefficients satisfy $a_\tau, a_w, a_\rho \in W_+^{s-2,p}$. In this Appendix, we will be interested in the transformation properties of $H$ under the conformal change $\tilde{h} = \theta^r h$ of the metric with the conformal factor $\theta \in W^{s,p}$ satisfying $\theta > 0$. To this end, we consider
$$\tilde{H}(\psi) := -\tilde{\Delta} \psi + \tfrac{1}{r(n-1)} \tilde{R} \psi + \tilde{a}_\tau \psi^{r+1} - \tilde{a}_w \psi^{-r-3} - \tilde{a}_\rho \psi^{-t} = 0,$$
where $\tilde{\Delta}$ is the Laplace-Beltrami operator associated to the metric $\tilde{h}$, $\tilde{R} \in W^{s-2,p}$ is the scalar curvature of $\tilde{h}$, and at the moment we do not impose any conditions on the remaining coefficients other than that they satisfy $\tilde{a}_\tau, \tilde{a}_w, \tilde{a}_\rho \in W_+^{s-2,p}$. One can derive the following relations:
$$\tilde{R} = \theta^{-r} R - r(n-1) \theta^{-r-1} \Delta \theta, \qquad \tilde{\Delta} \psi = \theta^{-r} \Delta \psi + 2 \theta^{-r-1} \nabla^a \theta \nabla_a \psi.$$
Combining these relations with $\Delta(\theta \psi) = \theta \Delta \psi + \psi \Delta \theta + 2 \nabla^a \theta \nabla_a \psi$, we obtain
$$-\tilde{\Delta} \psi + \tfrac{1}{r(n-1)} \tilde{R} \psi = \theta^{-r-1} \Big( -\Delta(\theta \psi) + \tfrac{1}{r(n-1)} R \theta \psi \Big),$$
which in turn implies that
$$\tilde{H}(\psi) = \theta^{-r-1} H(\theta \psi),$$
provided in the definition of $\tilde{H}$ that $\tilde{a}_\tau = a_\tau$, $\tilde{a}_w = \theta^{-2r-4} a_w$, and $\tilde{a}_\rho = \theta^{-t-r-1} a_\rho$. We have proved the following well known result.

Lemma 37. Assume the above setting, so in particular, $\tilde{a}_\tau = a_\tau$, $\tilde{a}_w = \theta^{-2r-4} a_w$, and $\tilde{a}_\rho = \theta^{-t-r-1} a_\rho$. Then we have
$$\tilde{H}(\psi) = 0 \iff H(\theta \psi) = 0,$$
$$\tilde{H}(\psi) \geq 0 \iff H(\theta \psi) \geq 0,$$
$$\tilde{H}(\psi) \leq 0 \iff H(\theta \psi) \leq 0.$$
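The covariance identity behind Lemma 37 can be checked term by term. The following sketch uses only the two transformation rules for $\tilde{R}$ and $\tilde{\Delta}$ and the product rule for $\Delta(\theta\psi)$ stated in this subsection; it is meant as a worked verification rather than as additional theory.

```latex
% Sketch: term-by-term check of \tilde H(\psi) = \theta^{-r-1} H(\theta\psi).
\begin{align*}
-\tilde{\Delta}\psi + \tfrac{1}{r(n-1)} \tilde{R}\psi
 &= -\theta^{-r}\Delta\psi - 2\theta^{-r-1}\nabla^a\theta\,\nabla_a\psi
    + \tfrac{1}{r(n-1)}\theta^{-r} R\psi - \theta^{-r-1}\psi\,\Delta\theta \\
 &= \theta^{-r-1}\Bigl( -\theta\Delta\psi - \psi\Delta\theta
    - 2\nabla^a\theta\,\nabla_a\psi + \tfrac{1}{r(n-1)} R\,\theta\psi \Bigr) \\
 &= \theta^{-r-1}\Bigl( -\Delta(\theta\psi) + \tfrac{1}{r(n-1)} R\,\theta\psi \Bigr), \\[4pt]
\theta^{-r-1}\, a_\tau (\theta\psi)^{r+1} &= a_\tau\, \psi^{r+1}, \\
\theta^{-r-1}\, a_w (\theta\psi)^{-r-3} &= \theta^{-2r-4} a_w\, \psi^{-r-3}, \\
\theta^{-r-1}\, a_\rho (\theta\psi)^{-t} &= \theta^{-t-r-1} a_\rho\, \psi^{-t}.
\end{align*}
```

Because $\theta^{-r-1} > 0$, multiplying by it preserves both the zero set and the sign, which is what the three equivalences in Lemma 37 express.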

A.9. General conformal rescaling and the near-CMC condition. In this article we focused on the standard conformal method to produce the particular coupled elliptic PDE system that we analyzed. Here we examine briefly other decompositions to see if it is possible to remove the near-CMC obstacle for non-CMC existence that still seems to remain for the non-positive Yamabe classes and for the positive Yamabe class with large data. The key question here is whether or not the standard conformal method essentially hard-wires the near-CMC assumption into the coupled system in order to get a domain of attraction for fixed-point iterations. If this is the case, then there remains the possibility that one can reverse-engineer a formulation, different from the conformal method, that gives a domain of attraction (preferably a contraction so that we also get uniqueness) without use of near-CMC conditions. Unfortunately, the answer appears to be negative, as we demonstrate below. In particular, it seems that the near-CMC obstacle is present in all possible formulations based on conformal transformations, if the estimate (5.1) is used.

To begin, recall that the objects $(M, \hat{h}_{ab}, \hat{k}_{ab}, \hat{\rho}, \hat{j}_a)$ form an $n$-dimensional initial data set for Einstein's equations iff $M$ is an $n$-dimensional smooth manifold, the tensor $\hat{h}_{ab}$ is a Riemannian metric on $M$, the tensor $\hat{k}_{ab}$ is a symmetric tensor field on $M$, the fields $\hat{\rho}$ and $\hat{j}_a$ are a non-negative scalar and a tensor field on $M$, respectively, satisfying the condition $-\hat{\rho}^2 + \hat{j}_a \hat{j}^a < 0$, and the following equations hold:
$$\hat{R} + \hat{k}^2 - \hat{k}_{ab} \hat{k}^{ab} - 2\kappa \hat{\rho} = 0, \tag{A.11}$$
$$-\hat{\nabla}_a \hat{k}^{ab} + \hat{\nabla}^b \hat{k} + \kappa \hat{j}^b = 0, \tag{A.12}$$
where $\hat{\nabla}_a$ is the Levi-Civita connection of the metric $\hat{h}_{ab}$, the scalar field $\hat{R}$ is the Ricci scalar of the connection $\hat{\nabla}_a$, the scalar $\hat{k} = \hat{k}_{ab} \hat{h}^{ab}$ is the trace of the tensor $\hat{k}_{ab}$, and the constant $\kappa = 8\pi$ in units where both the gravitation constant $G$ and the speed of light $c$ have value one. The initial data set for Einstein's equations describes an instant of time in the physical world if we choose the number $n = 3$. Nevertheless, in the calculations that follow we keep the number $n$ as a general positive integer.

Introduce the decomposition of the two-index tensor $\hat{k}^{ab}$ into trace-free and trace parts, as follows:
$$\hat{k}^{ab} = \hat{s}^{ab} + \tfrac{1}{n} \hat{k} \hat{h}^{ab},$$
where $\hat{s}_{ab} \hat{h}^{ab} = 0$. Introduce the following conformal rescaling:
$$\hat{h}_{ab} = \phi^r h_{ab}, \qquad \hat{s}^{ab} = \phi^s s^{ab}, \qquad \hat{k} = \phi^t k, \tag{A.13}$$



where the integers $r$, $s$, and $t$ are arbitrary, and we have introduced the Riemannian metric $h_{ab}$, a symmetric tensor $s^{ab}$, and a scalar field $k$. Introduce $\nabla_a$, the Levi-Civita connection of the metric $h_{ab}$, which satisfies the equation $\nabla_a h_{bc} = 0$, and denote by $R$ the Ricci scalar of this connection $\nabla_a$. The rescaling above induces the following equations:
$$\hat{h}^{ab} = \phi^{-r} h^{ab}, \qquad \hat{s}_{ab} = \phi^{(2r+s)} s_{ab},$$
where $\hat{h}^{ab}$ is the inverse tensor of $\hat{h}_{ab}$, and $h^{ab}$ is the inverse tensor of $h_{ab}$. We use the convention that indices in all other hatted tensors are raised and lowered with the tensors $\hat{h}^{ab}$ and $\hat{h}_{ab}$, respectively, while indices on unhatted tensors are raised and lowered with the tensors $h^{ab}$ and $h_{ab}$, respectively. For example:
$$\hat{s}_{ab} = \hat{h}_{ac} \hat{h}_{bd} \hat{s}^{cd} = \phi^r h_{ac} \, \phi^r h_{bd} \, \phi^s s^{cd} = \phi^{(2r+s)} s_{ab}.$$
The rescaling introduced in Eq. (A.13) implies that the tensor field $\hat{k}^{ab}$ transforms as follows:
$$\hat{k}^{ab} = \phi^s s^{ab} + \tfrac{1}{n} \phi^{(t-r)} k h^{ab} \quad \iff \quad \hat{k}_{ab} = \phi^{(2r+s)} s_{ab} + \tfrac{1}{n} \phi^{(t+r)} k h_{ab}.$$
The connections $\hat{\nabla}_a$ and $\nabla_a$ differ in a tensor field $C_{ab}{}^c$, in the sense that for any tensor field $v_a$, $\hat{\nabla}_a v_b = \nabla_a v_b - C_{ab}{}^c v_c$ holds. The tensor field $C_{ab}{}^c$ depends on the scalar field $\phi$ and the number $r$ as follows:
$$C_{ab}{}^c = r \, \delta_{(a}{}^c \nabla_{b)} \ln(\phi) - \tfrac{r}{2} h_{ab} h^{cd} \nabla_d \ln(\phi). \tag{A.14}$$
This expression implies the contractions
$$h^{ab} C_{ab}{}^c = -\tfrac{r}{2}(n-2) h^{cd} \nabla_d \ln(\phi), \qquad C_{ab}{}^b = \tfrac{nr}{2} \nabla_a \ln(\phi).$$
Given any two connections $\hat{\nabla}_a$ and $\nabla_a$ related by a tensor field $C_{ab}{}^c$, the Riemann, Ricci, and Ricci scalar fields associated with these two connections are related by the following expressions:
$$\hat{R}_{abc}{}^d = R_{abc}{}^d - 2 \nabla_{[a} C_{b]c}{}^d + 2 C_{c[a}{}^e C_{b]e}{}^d,$$
$$\hat{R}_{ac} = R_{ac} - \nabla_a C_{cb}{}^b + \nabla_b C_{ac}{}^b + C_{ca}{}^e C_{eb}{}^b - C_{cb}{}^e C_{ae}{}^b,$$
$$\hat{R} = \phi^{-r} \big[ R - \nabla^a C_{ab}{}^b + \nabla_b (h^{ac} C_{ac}{}^b) + h^{ac} C_{ca}{}^e C_{eb}{}^b - h^{ac} C_{cb}{}^e C_{ae}{}^b \big],$$
where indices between square brackets mean anti-symmetrization, that is, given any tensor $u_{ab}$ we define $u_{[ab]} := (u_{ab} - u_{ba})/2$. In the case that the tensor $C_{ab}{}^c$ is given by Eq. (A.14), the Ricci scalars $\hat{R}$ and $R$ satisfy the equation
$$\hat{R} = \phi^{-(r+1)} \Big[ \phi R - r(n-1) \Delta \phi - \frac{r(n-1)[r(n-2) - 4] (\nabla_a \phi)(\nabla^a \phi)}{4\phi} \Big].$$
Introduce the Hamiltonian and momentum fields,
$$\hat{H} := \hat{R} + \hat{k}^2 - \hat{k}_{ab} \hat{k}^{ab}, \qquad \hat{M}_b := -\hat{\nabla}_a \hat{k}_b{}^a + \hat{\nabla}_b \hat{k},$$
then the conformal rescaling given in Eq. (A.13) implies the following equations:
$$\hat{H} = \phi^{-(r+1)} \Big[ \phi R - r(n-1) \Delta \phi - \frac{r(n-1)[r(n-2) - 4] (\nabla_a \phi)(\nabla^a \phi)}{4\phi} \Big] + \frac{n-1}{n} \phi^{2t} k^2 - \phi^{2(r+s)} s_{ab} s^{ab},$$
$$\hat{M}_b = -\phi^{(r+s)} \nabla_a s_b{}^a + \frac{n-1}{n} \phi^t \nabla_b k - \Big( \frac{rn}{2} + r + s \Big) \phi^{(r+s)} s_b{}^a \nabla_a \ln(\phi) + \frac{n-1}{n} t \phi^t k \nabla_b \ln(\phi).$$
It is convenient to reorder the terms in these equations in such a way that the equation for the Hamiltonian field is given by
$$-r(n-1) \Delta \phi - \frac{r(n-1)[r(n-2) - 4] (\nabla_a \phi)(\nabla^a \phi)}{4\phi} + R\phi + \frac{(n-1)}{n} k^2 \phi^{(2t+r+1)} - s_{ab} s^{ab} \phi^{(3r+2s+1)} = \phi^{(r+1)} \hat{H},$$
and the equation for the momentum field is given by
$$-\nabla_a s_b{}^a - \Big( \frac{(n+2)}{2} r + s \Big) s_b{}^a \nabla_a \ln(\phi) = \phi^{-(r+s)} \hat{M}_b - \frac{(n-1)}{n} \phi^{(t-r-s)} \nabla_b k - \frac{(n-1)}{n} t \phi^{(t-r-s-1)} k \nabla_b \phi.$$
There are many interesting particular cases of the equations above. The first case is to keep the dimension $n \geq 3$ arbitrary, and choose:
$$r = \frac{4}{n-2}, \qquad s = -\frac{(n+2)}{2} r, \qquad t = 0,$$
then, introducing the number $2^* := 2n/(n-2)$, we conclude that the $n$-dimensional vacuum Einstein constraint equations $(H = 0, M_b = 0)$ can be written as follows:
$$-\frac{4(n-1)}{(n-2)} \Delta \phi + R\phi + \frac{(n-1)}{n} k^2 \phi^{(2^*-1)} - s_{ab} s^{ab} \phi^{-(2^*+1)} = 0,$$
$$-\nabla_a s_b{}^a + \frac{(n-1)}{n} \phi^{2^*} \nabla_b k = 0.$$
In the case that the manifold $M$ is 3-dimensional, we have the number $2^* = 6$, and the equation for the Hamiltonian field is given by
$$-2r \Delta \phi - \frac{r(r-4) (\nabla_a \phi)(\nabla^a \phi)}{2\phi} + R\phi + \frac{2}{3} k^2 \phi^{(2t+r+1)} - s_{ab} s^{ab} \phi^{(3r+2s+1)} = \phi^{(r+1)} \hat{H}, \tag{A.15}$$
and the equation for the momentum field is given by
$$-\nabla_a s_b{}^a - \Big( \frac{3r}{2} + r + s \Big) s_b{}^a \nabla_a \ln(\phi) = \phi^{-(r+s)} \hat{M}_b - \frac{2}{3} \phi^{(t-r-s)} \nabla_b k - \frac{2}{3} t \phi^{(t-r-s-1)} k \nabla_b \phi. \tag{A.16}$$
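As a sanity check on the first special case above, the exponents and coefficients can be verified by direct substitution into the reordered general equations. The sketch below assumes only the stated choice $r = 4/(n-2)$, $s = -\tfrac{n+2}{2} r$, $t = 0$ and the definition $2^* = 2n/(n-2)$.

```latex
% Sketch: exponent bookkeeping for r = 4/(n-2), s = -\tfrac{n+2}{2} r, t = 0.
\begin{align*}
r(n-2) - 4 &= 0 \quad \text{(the gradient term drops out)}, \\
r(n-1) &= \tfrac{4(n-1)}{n-2}, \\
2t + r + 1 &= \tfrac{n+2}{n-2} = 2^* - 1, \\
3r + 2s + 1 &= (1-n)\, r + 1 = -\tfrac{3n-2}{n-2} = -(2^* + 1), \\
\tfrac{n+2}{2}\, r + s &= 0 \quad \text{(the } s_b{}^a \nabla_a \ln\phi \text{ term drops out)}, \\
t - r - s &= \tfrac{n}{2}\, r = 2^* .
\end{align*}
```

For $n = 3$ these values reduce to $r = 4$, $s = -10$, and $2^* = 6$, so that the powers of $\phi$ in the Hamiltonian equation are $5$ and $-7$.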



The semi-decoupling decomposition in the case of the vacuum Einstein constraint equations $(H = 0, M_b = 0)$ is obtained from Eqs. (A.15)-(A.16) in the particular case of $r = 4$, $s = -10$, and $t = 0$, that is,
$$-8 \Delta \phi + R\phi + \frac{2}{3} k^2 \phi^{(2^*-1)} - s_{ab} s^{ab} \phi^{-(2^*+1)} = 0,$$
$$-\nabla_a s_b{}^a + \frac{2}{3} \phi^{2^*} \nabla_b k = 0.$$
The conformally covariant decomposition, in the case of the vacuum Einstein constraint equations $(H = 0, M_b = 0)$ and in the case that the transverse, traceless part of the tensor $k_{ab}$ vanishes, is obtained from Eqs. (A.15)-(A.16) with the particular choice of $r = 4$, $s = -4$, and $t = 0$, that is,
$$-8 \Delta \phi + R\phi + \Big( \frac{2}{3} k^2 - s_{ab} s^{ab} \Big) \phi^{(2^*-1)} = 0,$$
$$-\nabla_a s_b{}^a - 6 s_b{}^a \nabla_a \ln(\phi) + \frac{2}{3} \nabla_b k = 0.$$
As a final example, it is interesting to write down the rescaled equations above in the case $r = 4$, $s = -10$, $t$ arbitrary:
$$-8 \Delta \phi + R\phi + \frac{2}{3} \phi^{(2t+5)} k^2 - \phi^{-7} s_{ab} s^{ab} = \phi^5 \hat{H},$$
$$-\nabla_a s_b{}^a = \phi^6 \hat{M}_b - \frac{2}{3} \phi^{(t+6)} \nabla_b k - \frac{2}{3} t \phi^{(t+5)} k \nabla_b \phi.$$
Since the leading power in each equation scales exactly as in the conformal method, the same argument leading to the negative result for the conformal method in Lemma 10 will apply here. Therefore, it appears that the different conformal rescalings produce coupled systems leading to precisely the same form of the near-CMC condition to establish non-CMC existence, in the case of both the non-positive Yamabe classes and the positive Yamabe class for large data.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

1. Allen, P., Clausen, A., Isenberg, J.: Near-constant mean curvature solutions of the Einstein constraint equations with non-negative Yamabe metrics. Class. Quant. Grav. 25, 075009 (2008)
2. Amann, H.: Fixed point equations and nonlinear eigenvalue problems in ordered Banach spaces. SIAM Review 18(4), 620–709 (1976)
3. Aubin, T.: Nonlinear Analysis on Manifolds. Monge-Ampère Equations. New York: Springer-Verlag, 1982
4. Bartnik, R., Fodor, G.: On the restricted validity of the thin sandwich conjecture. Phys. Rev. D 48(8), 3596–3599 (1993)
5. Bartnik, R., Isenberg, J.: The constraint equations. In: Chruściel, P., Friedrich, H. eds., The Einstein equations and large scale behavior of gravitational fields. Berlin: Birkhäuser, 2004, pp. 1–38
6. Beig, R.: TT-tensors and conformally flat structures on 3-manifolds. In: Chruściel, P.T. ed., Mathematics of Gravitation, Part 1, Volume 41. Warszawa: Banach Center Publications, Polish Academy of Sciences, Institute of Mathematics, 1997, pp. 109–118. Available at http://arXiv.org/abs/gr-qc/9606055v1, 1996
7. Beig, R.: Generalized Bowen-York initial data. In: Cotsakis, S., Gibbons, G. eds., Mathematical and Quantum Aspects of Relativity and Cosmology, Volume 537. Springer Lecture Note in Physics, Berlin: Springer, 2000, pp. 55–69
8. Beig, R., Ó Murchadha, N.: The momentum constraints of general relativity and spatial conformal isometries. Commun. Math. Phys. 176(3), 723–738 (1996)
9. Bowen, J., York, J.: Time-asymmetric initial data for black holes and black-hole collisions. Phys. Rev. D 21(8), 2047–2055 (1980)
10. Choquet-Bruhat, Y.: Einstein constraints on compact n-dimensional manifolds. Class. Quant. Grav. 21, S127–S151 (2004)
11. Choquet-Bruhat, Y., Isenberg, J., York, J.: Einstein constraint on asymptotically Euclidean manifolds. Phys. Rev. D 61, 084034 (2000)
12. Corvino, J.: Scalar curvature deformation and a gluing construction for the Einstein constraint equations. Commun. Math. Phys. 214, 137–189 (2000)
13. Dain, S.: Initial data for a head on collision of two Kerr-like black holes with close limit. Phys. Rev. D 64(15), 124002 (2001)
14. Dain, S.: Initial data for two Kerr-like black holes. Phys. Rev. Lett. 87(12), 121102 (2001)
15. Dain, S.: Trapped surfaces as boundaries for the constraint equations. Class. Quant. Grav. 21(2), 555–573 (2004)
16. Du, Y.: Order structure and topological methods in nonlinear partial differential equations, Vol I. New Jersey, London, Singapore: World Scientific, 2006
17. Geroch, R., Traschen, J.: Strings and other distributional sources in general relativity. Phys. Rev. D 36(4), 1017–1031 (1987)
18. Grisvard, P.: Elliptic Problems in Nonsmooth Domains. Marshfield, MA: Pitman Publishing, 1985
19. Hebey, E.: Sobolev spaces on Riemannian manifolds, Volume 1635 of Lecture notes in mathematics. Berlin, New York: Springer, 1996
20. Holst, M.: Adaptive numerical treatment of elliptic systems on manifolds. Adv. Comp. Math. 15, 139–191 (2001)
21. Holst, M., Nagy, G., Tsogtgerel, G.: Rough solutions of the Einstein constraints on manifolds with boundary. Preprint, available at http://arXiv.org/abs0712.0798v1[gr-qc], 2007
22. Holst, M., Nagy, G., Tsogtgerel, G.: Far-from-constant mean curvature solutions of Einstein's constraint equations with positive Yamabe metrics. Phys. Rev. Lett. 100(16), 161101.1–161101.4 (2008)
23. Holst, M., Tsogtgerel, G.: Adaptive finite element approximation of nonlinear geometric PDE. Preprint
24. Holst, M., Tsogtgerel, G.: Convergent adaptive finite element approximation of the Einstein constraints. Preprint
25. Isenberg, J.: Constant mean curvature solution of the Einstein constraint equations on closed manifold. Class. Quant. Grav. 12, 2249–2274 (1995)
26. Isenberg, J., Moncrief, V.: A set of nonconstant mean curvature solution of the Einstein constraint equations on closed manifolds. Class. Quant. Grav. 13, 1819–1847 (1996)
27. Isenberg, J., Ó Murchadha, N.: Non CMC conformal data sets which do not produce solutions of the Einstein constraint equations. Class. Quant. Grav. 21, S233–S242 (2004)
28. Isenberg, J., Park, J.: Asymptotically hyperbolic non-constant mean curvature solutions of the Einstein constraint equations. Class. Quant. Grav. 14, A189–A201 (1997)
29. Jerome, J.: Consistency of semiconductor modeling: an existence/stability analysis for the stationary van Roosbroeck system. SIAM J. Appl. Math. 45(4), 565–590 (1985)
30. Klainerman, S., Rodnianski, I.: Improved local well posedness for quasilinear wave equations in dimension three. Duke Math. J. 117(1), 1–124 (2003)
31. Lee, J., Parker, T.: The Yamabe problem. Bull. Amer. Math. Soc. 17(1), 37–91 (1987)
32. Lichnerowicz, A.: L'integration des équations de la gravitation relativiste et le problème des n corps. J. Math. Pures Appl. 23, 37–63 (1944)
33. Maxwell, D.: Rough solutions of the Einstein constraint equations on compact manifolds. J. Hyp. Diff. Eqs. 2(2), 521–546 (2005)
34. Maxwell, D.: Solutions of the Einstein constraint equations with apparent horizon boundaries. Commun. Math. Phys. 253(3), 561–583 (2005)
35. Maxwell, D.: Rough solutions of the Einstein constraint equations. J. Reine Angew. Math. 590, 1–29 (2006)
36. Maxwell, D.: A class of solutions of the vacuum Einstein constraint equations with freely specified mean curvature. http://arXiv.org/abs/0804.0874v1[gr-qc], 2008
37. Misner, C., Thorne, K., Wheeler, J.: Gravitation. San Francisco, CA: W. H. Freeman and Company, 1970
38. Mitrović, D., Žubrinić, D.: Fundamentals of applied functional analysis, Volume 91 of Pitman monographs and surveys in pure and applied mathematics. Essex, UK: Addison Wesley Longman, 1998
39. Ó Murchadha, N., York, J.: Existence and uniqueness of solutions of the Hamiltonian constraint of general relativity on compact manifolds. J. Math. Phys. 14(11), 1551–1557 (1973)



40. Ó Murchadha, N., York, J.: Initial-value problem of general relativity I. General formulation and physical interpretation. Phys. Rev. D 10(2), 428–436 (1974) 41. Ó Murchadha, N., York, J.: Initial-value problem of general relativity II. Stability of solution of the initial-value equations. Phys. Rev. D 10(2), 437–446 (1974) 42. Palais, R.: Seminar on the Atiyah-Singer index theorem. Princeton, NJ: Princeton University Press, 1965 43. Rosenberg, S.: The Laplacian on a Riemannian Manifold. Cambridge: Cambridge University Press, 1997 44. Rudin, W.: Real & Complex Analysis. New York: McGraw-Hill, 1987 45. Schwarz, G.: Hodge decomposition – a method for solving boundary value problems. In: Lecture Notes in Mathematics, Volume 1607. Berlin-Heidelberg-New York: Springer Verlag, 1995 46. Triebel, H.: Theory of function spaces, Volume 78 of Monographs in Mathematics. Basel : Birkhäuser Verlag, 1983 47. Trudinger, N.: Linear elliptic operators with measurable coefficients. Ann. Scuola Norm. Sup. Pisa 27(3), 265–308 (1973) 48. Wald, R.: General Relativity. Chicago, IL: The University of Chicago Press, 1984 49. York, J.: Gravitational degrees of freedom and the initial-value problem. Phys. Rev. Lett. 26(26), 1656–1658 (1971) 50. York, J.: Role of conformal three-geometry in the dynamics of gravitation. Phys. Rev. Lett. 28(16), 1082–1085 (1972) 51. York, J.: Conformally invariant orthogonal decomposition of symmetric tensor on Riemannian manifolds and the initial-value problem of general relativity. J. Math. Phys. 14(4), 456–464 (1973) 52. York, J.: Covariant decompositions of symmetric tensors in the theory of gravitation. Ann. Inst. Henri Poincare A 21(4), 319–332 (1974) 53. York, J.: Conformal “thin-sandwich” data for the initial-value problem of general relativity. Phys. Rev. Lett. 82, 1350–1353 (1999) 54. Zeidler, E.: Nonlinear Functional Analysis and its Applications I, Fixed-Point Theorems. New York: Springer, 1986 55. 
Zolesio, J.L.: Multiplication dans les espaces de Besov. Proc. Royal Soc. Edinburgh (A) 78(2), 113–117 (1977) Communicated by G. W. Gibbons




Isospectral Deformations of Eguchi-Hanson Spaces as Nonunital Spectral Triples

Chen Yang

Department of Mathematical Sciences, Durham University, Durham DH1 3LE, UK. E-mail: [email protected]

Received: 14 April 2008 / Accepted: 15 December 2008
Published online: 20 March 2009 – © Springer-Verlag 2009

Abstract: We study the isospectral deformations of the Eguchi-Hanson spaces along a torus isometric action in the noncompact noncommutative geometry. We concentrate on locality, smoothness and summability conditions of the nonunital spectral triples, and relate them to the geometric conditions to be noncommutative spin manifolds.

Contents
1. Introduction
2. Spin Geometry of Eguchi-Hanson Spaces
3. Smooth Algebras and Projective Modules
4. Nonunital Spectral Triples and Summability
5. Geometric Conditions
6. Conclusion

Supported by the Dorothy Hodgkin Scholarship.

1. Introduction

As a generalization of Connes' noncommutative differential geometry [1], noncompact noncommutative geometry is the study of nonunital spectral triples [2,3]. Various authors also consider the aspect of summability, as in [4–6]. In the unital case, Connes provides a set of axioms for unital spectral triples so as to define compact noncommutative spin manifolds; see for example [1]. Rennie and Várilly explicitly reconstruct compact commutative spin manifolds from slightly modified axioms [7], and Connes thoroughly investigated this problem recently in [8]. As to the nonunital case, a complete generalization of these axioms is not yet known. There are various nonunital examples [9,3,10], which may serve the purpose of testing the axioms or geometric conditions suggested. In this article, we obtain another nonunital example by isospectral deformation of Eguchi-Hanson (EH-) spaces [11]. They are geodesically complete Riemannian spin manifolds in the commutative geometry.

Isospectral deformation is a simple method to deform a commutative spectral triple. It traces back to the Moyal type of deformation from quantum mechanics. Rieffel's insight is to consider Lie group actions on function spaces and hence explain the Moyal product between functions by oscillatory integrals over the group actions [12]. Apart from the well-known Moyal planes and noncommutative tori [13], this scheme allows more general deformations. Connes and Landi in [14] deform spheres and more general compact spin manifolds with isometry groups containing a two-torus. Connes and Dubois-Violette in [15] observe that this works equally well for noncompact spin manifolds. As in the appendix of [2], it is possible to fit such noncompact examples into the nonunital framework there. The deformation of EH-spaces we consider in the following is obtained by these methods and serves as an example of a nonunital triple.

The Eguchi-Hanson spaces are of interest in both Riemannian geometry and physics. Geometrically, they are the simplest asymptotically locally Euclidean (ALE) spaces, for which a complete classification is provided by Kronheimer through the method of hyper-Kähler quotients [16]. This construction realizes the family of EH-spaces as a resolution of a singular conifold. In physics, where they first appeared, EH-spaces are known as gravitational instantons. Due to their hyper-Kähler structure, the ADHM construction [17] of Yang-Mills instantons is generalized to the EH-spaces in an elegant way [18,19]. The nonunital spectral triple from isospectral deformation of Eguchi-Hanson spaces may thus link various perspectives.
Our aim in this article is to concentrate on the locality, smoothness [2] and summability conditions of these triples and further see how they fit into the modified geometric conditions for nonunital spectral triples. The organization of the rest of the article is as follows. In Sect. 2, we describe the Eguchi-Hanson spaces in the spin geometry. In Sect. 3, we consider algebras of functions over EH-spaces, the deformation quantization of algebras, and representations of algebras as operators on the Hilbert space of spinors. We also obtain a projective module description of the spinor bundle. In Sect. 4, we define spectral triples of the deformed EH-spaces and study their summability. In Sect. 5, we discuss how the triple fits into the modified geometric conditions. We conclude in Sect. 6.

2. Spin Geometry of Eguchi-Hanson Spaces

In this section, we first describe the metric and the Levi-Civita connection of the Eguchi-Hanson space, and then introduce its spinor bundle, the spin connection and the Dirac operator. Finally, we write down the torus action through parallel propagators on the spinor bundle.

2.1. Metrics, connections and torus isometric actions. The Eguchi-Hanson spaces were originally constructed as gravitational instantons [11]. Generalized by Gibbons and Hawking, they fall into a new category of solutions of Einstein's equations, known as the multicenter solutions [20]. In local coordinates, the metric is

$$ds^2 = \Delta^{-1}\,dr^2 + r^2(\sigma_x^2 + \sigma_y^2) + r^2\Delta\,\sigma_z^2, \tag{1}$$

where $\Delta := \Delta(r) := 1 - a^4/r^4$ and $\{\sigma_x, \sigma_y, \sigma_z\}$ are the standard Cartan basis for the 3-sphere,

$$
\begin{cases}
\sigma_x = \tfrac{1}{2}(-\cos\psi\,d\theta - \sin\theta\sin\psi\,d\phi),\\
\sigma_y = \tfrac{1}{2}(\sin\psi\,d\theta - \sin\theta\cos\psi\,d\phi),\\
\sigma_z = \tfrac{1}{2}(-d\psi - \cos\theta\,d\phi),
\end{cases}
$$

with $r \ge a$, $0 \le \theta \le \pi$, $0 \le \phi < 2\pi$, $0 \le \psi < 2\pi$.

Remark 1. The convention that the period of $\psi$ is $2\pi$ rather than $4\pi$ as in the original construction is suggested in [20] to remove the singularity at $r = a$, so that the manifold becomes geodesically complete.

The EH-space is diffeomorphic to the tangent bundle of a 2-sphere $T(S^2)$. Modulo a distortion of the metric, the base as a unit two-sphere $S^2$ is parametrized by parameters $\phi$ and $\theta$, with $\theta = 0$ as the south pole and $\theta = \pi$ as the north pole. The angle $\phi$ parametrizes the circle defined by a constant $\theta$. Over each point, say $(\theta, \phi)$ on the 2-sphere, the tangent plane is parametrized by $(r, \psi)$. The coordinate $r$ parametrizes the radial direction with $r = a$ at the origin of the plane. Circles of constant $r$ are parametrized by $\psi$. The identification of $\psi = \psi + 2\pi$ is the identification of the antipodal points on the circle of constant radius. Together with the metric, this implies that the space at large enough $r$ is asymptotic to $\mathbb{R}^4/\mathbb{Z}_2$, so that it is an ALE space.

The parameter $a$ in the metric (1) is a non-negative real number, and thus parametrizes a family of EH-spaces. When $a = 0$, the metric degenerates to the conifold $\mathbb{R}^4/\mathbb{Z}_2$ and the rest of the family is a resolution of the conifold. This appears as the simplest case in Kronheimer's classification of ALE spaces [16]. We will only concentrate on the smooth case so that $a$ is assumed to be positive.

Choose the local coordinates $\{x_i\}$ with $x_1 = r$, $x_2 = \theta$, $x_3 = \phi$, $x_4 = \psi$. We will write the coordinates $(r, \theta, \phi, \psi)$ and $(x_1, x_2, x_3, x_4)$ interchangeably throughout the article, because the former give a clear geometric picture while the latter are convenient in tensorial expressions. The corresponding basis on the tangent space $T_x(EH)$ of any point $x \in EH$ is $\partial_i := \partial/\partial x^i$, and the dual basis on the cotangent space $T_x^*(EH)$ is $\{dx^j\}$. The corresponding metric tensor $g_{ij}(x)\,dx^i \otimes dx^j$ can be written as entries of the matrix $G = (g_{ij})$ as

$$
G(x) = \frac{1}{4}
\begin{pmatrix}
4\Delta^{-1} & 0 & 0 & 0\\
0 & r^2 & 0 & 0\\
0 & 0 & \rho & r^2\Delta\cos\theta\\
0 & 0 & r^2\Delta\cos\theta & r^2\Delta
\end{pmatrix},
\tag{2}
$$

where $\rho := \rho(r, \theta) := (r^4 - a^4\cos^2\theta)/r^2$. We always assume Einstein's summation convention.
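The passage from the Cartan forms to the matrix (2) can be checked numerically. The following is only a sketch: it assumes the $\sigma_z^2$ term of the metric carries the coefficient $r^2\Delta$, which is what the entries of (2) require, and the function names and sample values are illustrative rather than taken from the article.

```python
import math


def metric_from_cartan_forms(r, theta, psi, a):
    """Assemble g_ij in the coordinates (r, theta, phi, psi) from
    ds^2 = dr^2/Delta + r^2 (sx^2 + sy^2) + r^2 Delta sz^2 (assumed form)."""
    delta = 1.0 - a**4 / r**4
    # Components of each one-form in the basis (dr, dtheta, dphi, dpsi).
    dr = [1.0, 0.0, 0.0, 0.0]
    sx = [0.0, -0.5 * math.cos(psi), -0.5 * math.sin(theta) * math.sin(psi), 0.0]
    sy = [0.0, 0.5 * math.sin(psi), -0.5 * math.sin(theta) * math.cos(psi), 0.0]
    sz = [0.0, 0.0, -0.5 * math.cos(theta), -0.5]
    weighted = [(1.0 / delta, dr), (r**2, sx), (r**2, sy), (r**2 * delta, sz)]
    return [[sum(c * w[i] * w[j] for c, w in weighted) for j in range(4)]
            for i in range(4)]


def metric_matrix_eq2(r, theta, a):
    """The matrix G of Eq. (2), with rho = (r^4 - a^4 cos^2 theta) / r^2."""
    delta = 1.0 - a**4 / r**4
    rho = (r**4 - a**4 * math.cos(theta)**2) / r**2
    off = r**2 * delta * math.cos(theta)
    rows = [[4.0 / delta, 0.0, 0.0, 0.0],
            [0.0, r**2, 0.0, 0.0],
            [0.0, 0.0, rho, off],
            [0.0, 0.0, off, r**2 * delta]]
    return [[0.25 * x for x in row] for row in rows]


def max_abs_difference(m1, m2):
    """Largest entrywise deviation between two 4x4 matrices."""
    return max(abs(m1[i][j] - m2[i][j]) for i in range(4) for j in range(4))


if __name__ == "__main__":
    # Sample point with a = 1 and r > a (values chosen only for illustration).
    g_forms = metric_from_cartan_forms(r=2.0, theta=0.7, psi=0.3, a=1.0)
    g_eq2 = metric_matrix_eq2(r=2.0, theta=0.7, a=1.0)
    print(max_abs_difference(g_forms, g_eq2))
```

The dependence on $\psi$ drops out of $\sigma_x^2 + \sigma_y^2$, which is why the matrix (2) involves only $r$ and $\theta$.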


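In the same coordinate chart, the Christoffel symbols of the Levi-Civita connection of (1), defined by $\nabla_i \partial_j = \Gamma^k_{ij}\,\partial_k$, are explicitly

$$
\begin{aligned}
\Gamma^1_{11} &= -\frac{\Delta'}{2\Delta}, &
\Gamma^1_{22} &= -\frac{r\Delta}{4}, &
\Gamma^1_{33} &= -\frac{\Delta\rho^+}{4r}, &
\Gamma^1_{34} &= -\frac{r\Delta^+\Delta\cos\theta}{4},\\
\Gamma^1_{44} &= -\frac{r\Delta^+\Delta}{4}, &
\Gamma^2_{12} &= \frac{1}{r}, &
\Gamma^2_{33} &= -\frac{a^4\sin 2\theta}{2r^4}, &
\Gamma^2_{34} &= \frac{\Delta\sin\theta}{2},\\
\Gamma^3_{13} &= \frac{1}{r}, &
\Gamma^3_{23} &= \frac{\cot\theta\,\Delta^+}{2}, &
\Gamma^3_{24} &= -\frac{\Delta}{2\sin\theta}, &
\Gamma^4_{13} &= \frac{2a^4\cos\theta}{r(r^4 - a^4)},\\
\Gamma^4_{14} &= \frac{\Delta^+}{r\Delta}, &
\Gamma^4_{23} &= -\frac{\rho^+}{2r^2\sin\theta}, &
\Gamma^4_{24} &= \frac{\cot\theta\,\Delta}{2},
\end{aligned}
\tag{3}
$$

where

$$\Delta^+ := \Delta^+(r) := 1 + \frac{a^4}{r^4}, \qquad \rho^+ := \rho^+(r, \theta) := \frac{r^4 + a^4\cos^2\theta}{r^2}, \qquad \Delta' := \frac{\partial\Delta}{\partial r}.$$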
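The closed forms in (3) can be compared against a finite-difference evaluation of $\Gamma^k_{ij} = \tfrac12 g^{kl}(\partial_i g_{lj} + \partial_j g_{li} - \partial_l g_{ij})$. The sketch below is not part of the article: the value of $a$, the step size and the sample point are arbitrary choices, the metric is the matrix (2), and $\Gamma^1_{11}$ is taken as $-\Delta'/(2\Delta)$, which is what $g_{rr} = \Delta^{-1}$ gives.

```python
import math

A = 1.0  # hypothetical value of the parameter a


def metric(x):
    """g_ij of Eq. (2) at x = (r, theta, phi, psi)."""
    r, th = x[0], x[1]
    delta = 1.0 - A**4 / r**4
    rho = (r**4 - A**4 * math.cos(th)**2) / r**2
    off = r**2 * delta * math.cos(th) / 4.0
    return [[1.0 / delta, 0.0, 0.0, 0.0],
            [0.0, r**2 / 4.0, 0.0, 0.0],
            [0.0, 0.0, rho / 4.0, off],
            [0.0, 0.0, off, r**2 * delta / 4.0]]


def inverse_metric(x):
    """Inverse of g, using its block structure (two 1x1 blocks and one 2x2 block)."""
    g = metric(x)
    det = g[2][2] * g[3][3] - g[2][3]**2
    inv = [[0.0] * 4 for _ in range(4)]
    inv[0][0] = 1.0 / g[0][0]
    inv[1][1] = 1.0 / g[1][1]
    inv[2][2] = g[3][3] / det
    inv[3][3] = g[2][2] / det
    inv[2][3] = inv[3][2] = -g[2][3] / det
    return inv


def metric_derivative(x, k, h=1e-5):
    """Central difference approximation of d g_ij / d x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(4)] for i in range(4)]


def christoffel(x):
    """gamma[k][i][j] = Gamma^k_ij (0-based indices) from the metric alone."""
    ginv = inverse_metric(x)
    dg = [metric_derivative(x, k) for k in range(4)]  # dg[k][i][j] = d_k g_ij
    return [[[0.5 * sum(ginv[k][l] * (dg[i][l][j] + dg[j][l][i] - dg[l][i][j])
                        for l in range(4))
              for j in range(4)] for i in range(4)] for k in range(4)]


def closed_forms(x):
    """The symbols listed in Eq. (3), keyed by 1-based (k, i, j)."""
    r, th = x[0], x[1]
    d = 1.0 - A**4 / r**4
    dp = 1.0 + A**4 / r**4
    dprime = 4.0 * A**4 / r**5
    rp = (r**4 + A**4 * math.cos(th)**2) / r**2
    s, c = math.sin(th), math.cos(th)
    return {
        (1, 1, 1): -dprime / (2.0 * d), (1, 2, 2): -r * d / 4.0,
        (1, 3, 3): -d * rp / (4.0 * r), (1, 3, 4): -r * dp * d * c / 4.0,
        (1, 4, 4): -r * dp * d / 4.0, (2, 1, 2): 1.0 / r,
        (2, 3, 3): -A**4 * math.sin(2.0 * th) / (2.0 * r**4),
        (2, 3, 4): d * s / 2.0, (3, 1, 3): 1.0 / r,
        (3, 2, 3): c * dp / (2.0 * s), (3, 2, 4): -d / (2.0 * s),
        (4, 1, 3): 2.0 * A**4 * c / (r * (r**4 - A**4)),
        (4, 1, 4): dp / (r * d), (4, 2, 3): -rp / (2.0 * r**2 * s),
        (4, 2, 4): c * d / (2.0 * s),
    }


def max_discrepancy(x):
    """Largest deviation between the finite-difference and closed-form symbols."""
    gamma = christoffel(x)
    return max(abs(gamma[k - 1][i - 1][j - 1] - value)
               for (k, i, j), value in closed_forms(x).items())


if __name__ == "__main__":
    print(max_discrepancy([2.0, 0.7, 1.1, 0.3]))
```

Since the comparison uses central differences, agreement is expected only up to the discretization error of the chosen step.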

The identity $\Gamma^i_{jk} = \Gamma^i_{kj}$, implied by the torsion-free property of the connection, generates another set of symbols, and all the rest of the Christoffel symbols vanish.

The isometry group of the metric (1) is $(U(1) \times SU(2))/\mathbb{Z}_2$. The Killing vector $\partial_\psi$ generates the group $U(1)/\mathbb{Z}_2$. Another Killing vector is $\partial_\phi$. Its action on the restriction of the space at $r = a$ is analogous to one of the three typical generators of the Lie algebra of the Lie group $SU(2)$ on a standard two-sphere. These are the two Killing vectors which define a torus action $\sigma$ on the Eguchi-Hanson space,

$$\sigma : U(1) \times U(1) \longrightarrow \mathrm{Aut}(EH), \tag{4}$$

by $\sigma(\exp(i t_3 \partial_\phi), \exp(i t_4 \partial_\psi))(r, \theta, \phi, \psi) = (r, \theta, \phi + t_3, \psi + t_4)$, where $0 \le t_3 < 2\pi$, $0 \le t_4 < 2\pi$, for any point $(r, \theta, \phi, \psi) \in EH$. The isometric torus action will determine the isospectral deformation later.

2.2. The stereographic projection and orthonormal basis. We choose an orthonormal basis to trivialize the cotangent bundle of the EH-space and obtain the corresponding transition functions. Since the EH-space is topologically the same as $T(S^2)$, we may obtain another set of coordinates by taking the stereographic projection of the $S^2$ part, while keeping the coordinates on the tangent space unchanged. The EH-space (1) can be covered by two open neighbourhoods $U_N$ and $U_S$, where $U_N$ covers the whole space except at $\theta = \pi$ and $U_S$ covers the whole space except at $\theta = 0$. We may define the map $f_N : U_N \to \mathbb{C} \times \mathbb{R}^2$ by taking a stereographic projection of the base two-sphere to $\mathbb{C}$, i.e., $f_N(\phi, \theta, r, \psi) = (z; r, \psi)$. For the coordinate chart $U_S$, we similarly define the projection map $f_S : U_S \to \mathbb{C} \times \mathbb{R}^2$ by $f_S(\phi, \theta, r, \psi) = (w; r, \psi)$, where

$$z := \cot\frac{\theta}{2}\, e^{-i\phi}, \qquad w := \tan\frac{\theta}{2}\, e^{i\phi}.$$

For any point $x \in U_N \cap U_S$, the transition function from the coordinate chart $U_S$ to $U_N$ is $(w; r, \psi) = (1/z; r, \psi)$, and the transition function from $U_N$ to $U_S$ is $(z; r, \psi) = (1/w; r, \psi)$.
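As a small illustration of the two stereographic coordinates, the following sketch evaluates $z$ and $w$ at a sample point and confirms the transition rule $w = 1/z$ on the overlap. It also checks the half-angle identity $\cos\theta = (z\bar z - 1)/(z\bar z + 1)$, which is an elementary consequence of the definition of $z$ and is used here only as an illustration. The function names and sample angles are hypothetical.

```python
import cmath
import math


def z_coordinate(theta, phi):
    """z := cot(theta/2) * exp(-i*phi); requires 0 < theta <= pi."""
    return cmath.exp(-1j * phi) / math.tan(theta / 2.0)


def w_coordinate(theta, phi):
    """w := tan(theta/2) * exp(i*phi); requires 0 <= theta < pi."""
    return math.tan(theta / 2.0) * cmath.exp(1j * phi)


def cos_theta_from_z(z):
    """cos(theta) recovered from |z|^2 = cot^2(theta/2)."""
    n = abs(z) ** 2
    return (n - 1.0) / (n + 1.0)


if __name__ == "__main__":
    theta, phi = 0.9, 2.1  # a point in the overlap of the two charts
    z, w = z_coordinate(theta, phi), w_coordinate(theta, phi)
    print(abs(z * w - 1.0))                            # deviation from w = 1/z
    print(abs(cos_theta_from_z(z) - math.cos(theta)))  # deviation from cos(theta)
```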



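The restriction of metric (1) on the $U_N$ chart with coordinates $(z; r, \psi)$ is

$$ds^2 = \Delta^{-1}\,dr^2 + \frac{r^2}{(1 + z\bar z)^2}\,dz\,d\bar z + \frac{r^2\Delta}{4}\left(d\psi + \frac{z\bar z - 1}{z\bar z + 1}\,\frac{i}{2}\Big(\frac{dz}{z} - \frac{d\bar z}{\bar z}\Big)\right)^2.$$

To obtain a local orthonormal basis of $T^*(EH)_{U_N}$ we may simply define

$$l := \frac{r}{\sqrt{2}\,(1 + z\bar z)}\,dz, \qquad m := \frac{\Delta^{-1/2}}{\sqrt{2}}\,dr + \frac{r\Delta^{1/2}}{4\sqrt{2}}\left(\frac{1 - z\bar z}{1 + z\bar z}\Big(\frac{dz}{z} - \frac{d\bar z}{\bar z}\Big) + 2i\,d\psi\right),$$

with their complex conjugates $\bar l$, $\bar m$, so that the metric tensor over $U_N$ is $ds^2 = l \otimes \bar l + \bar l \otimes l + m \otimes \bar m + \bar m \otimes m$. A real orthonormal frame $\{\vartheta^\alpha\}$ of $T^*(EH)_{U_N}$ is thus defined by

$$\vartheta^1 := \frac{1}{\sqrt{2}}(l + \bar l), \quad \vartheta^2 := -\frac{i}{\sqrt{2}}(l - \bar l), \quad \vartheta^3 := -\frac{i}{\sqrt{2}}(m - \bar m), \quad \vartheta^4 := \frac{1}{\sqrt{2}}(m + \bar m),$$

such that the metric on $U_N$ is diagonalized as $ds^2 = \delta_{\alpha\beta}\,\vartheta^\alpha \otimes \vartheta^\beta$. The coordinate transformations $\vartheta^\alpha = h^\alpha_i\,dx^i$ are determined by the matrix $H = (h^\alpha_i)$,

$$
H = \frac{1}{2}
\begin{pmatrix}
0 & -r\cos\phi & -r\sin\theta\sin\phi & 0\\
0 & r\sin\phi & -r\sin\theta\cos\phi & 0\\
0 & 0 & r\Delta^{1/2}\cos\theta & r\Delta^{1/2}\\
\dfrac{2}{\Delta^{1/2}} & 0 & 0 & 0
\end{pmatrix},
\tag{5}
$$

whose inverse $H^{-1} = (\tilde h^j_\beta)$ from $dx^j = \tilde h^j_\beta\,\vartheta^\beta$ is

$$
H^{-1} = 2
\begin{pmatrix}
0 & 0 & 0 & \dfrac{\Delta^{1/2}}{2}\\
-\dfrac{\cos\phi}{r} & \dfrac{\sin\phi}{r} & 0 & 0\\
-\dfrac{\sin\phi}{r\sin\theta} & -\dfrac{\cos\phi}{r\sin\theta} & 0 & 0\\
\dfrac{\cos\theta\sin\phi}{r\sin\theta} & \dfrac{\cos\theta\cos\phi}{r\sin\theta} & \dfrac{1}{r\Delta^{1/2}} & 0
\end{pmatrix}.
\tag{6}
$$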
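Two consistency checks on the frame matrices can be sketched numerically: $H H^{-1}$ should be the identity, and, because $ds^2 = \delta_{\alpha\beta}\vartheta^\alpha \otimes \vartheta^\beta$, the product $H^t H$ should reproduce the metric matrix (2). The code below assumes the entries of (5) and (6) as read from the text, with rows of $H$ labelled by frame indices and columns by $(r, \theta, \phi, \psi)$; helper names and sample values are illustrative.

```python
import math


def frame_matrix(r, theta, phi, a):
    """H = (h^alpha_i) of Eq. (5)."""
    d = math.sqrt(1.0 - a**4 / r**4)  # Delta^{1/2}
    s, c = math.sin(theta), math.cos(theta)
    rows = [[0.0, -r * math.cos(phi), -r * s * math.sin(phi), 0.0],
            [0.0, r * math.sin(phi), -r * s * math.cos(phi), 0.0],
            [0.0, 0.0, r * d * c, r * d],
            [2.0 / d, 0.0, 0.0, 0.0]]
    return [[0.5 * x for x in row] for row in rows]


def inverse_frame_matrix(r, theta, phi, a):
    """H^{-1} of Eq. (6)."""
    d = math.sqrt(1.0 - a**4 / r**4)
    s, c = math.sin(theta), math.cos(theta)
    rows = [[0.0, 0.0, 0.0, d / 2.0],
            [-math.cos(phi) / r, math.sin(phi) / r, 0.0, 0.0],
            [-math.sin(phi) / (r * s), -math.cos(phi) / (r * s), 0.0, 0.0],
            [c * math.sin(phi) / (r * s), c * math.cos(phi) / (r * s),
             1.0 / (r * d), 0.0]]
    return [[2.0 * x for x in row] for row in rows]


def metric_matrix(r, theta, a):
    """G of Eq. (2)."""
    delta = 1.0 - a**4 / r**4
    rho = (r**4 - a**4 * math.cos(theta)**2) / r**2
    off = r**2 * delta * math.cos(theta)
    rows = [[4.0 / delta, 0.0, 0.0, 0.0], [0.0, r**2, 0.0, 0.0],
            [0.0, 0.0, rho, off], [0.0, 0.0, off, r**2 * delta]]
    return [[0.25 * x for x in row] for row in rows]


def matmul(m1, m2):
    return [[sum(m1[i][k] * m2[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [[m[j][i] for j in range(4)] for i in range(4)]


def max_abs_difference(m1, m2):
    return max(abs(m1[i][j] - m2[i][j]) for i in range(4) for j in range(4))


IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

if __name__ == "__main__":
    H = frame_matrix(2.0, 0.7, 1.1, 1.0)
    H_inv = inverse_frame_matrix(2.0, 0.7, 1.1, 1.0)
    print(max_abs_difference(matmul(H, H_inv), IDENTITY))
    print(max_abs_difference(matmul(transpose(H), H), metric_matrix(2.0, 0.7, 1.0)))
```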

The above construction on the $U_N$ chart can be carried out in the same way in the $U_S$ coordinates. We denote orthonormal frames over $U_S$ by adding primes to $l$, $m$, $\vartheta^\alpha$, $x^j$, etc. Local frames $\{\vartheta^\alpha\}$ on $U_N$ define a local trivialization of the cotangent bundle, $F_N : T^*(EH)_{U_N} \to U_N \times \mathbb{R}^4$, by $F_N(x; a_1\vartheta^1 + \cdots + a_4\vartheta^4) := (x; a_1, \ldots, a_4)$, where the $a_\alpha$'s are real-valued functions over $U_N$. In a similar way, the choice of local frames $\{\vartheta'^\alpha\}$ on $U_S$ defines a local trivialization of the cotangent bundle, $F_S : T^*(EH)_{U_S} \to U_S \times \mathbb{R}^4$. The transition functions $f^\beta_\alpha$ such that $\vartheta'^\beta = f^\beta_\alpha\,\vartheta^\alpha$ are elements of the matrix $F_{SN} := F_N \circ F_S^{-1}$,

$$
F_{SN} =
\begin{pmatrix}
-\cos 2\phi & \sin 2\phi & 0 & 0\\
-\sin 2\phi & -\cos 2\phi & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
-\dfrac{z^2 + \bar z^2}{2 z\bar z} & -i\,\dfrac{\bar z^2 - z^2}{2 z\bar z} & 0 & 0\\
i\,\dfrac{\bar z^2 - z^2}{2 z\bar z} & -\dfrac{z^2 + \bar z^2}{2 z\bar z} & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.
\tag{7}
$$

The inverse transition function is given by the inverse of the matrix $F_{SN}$, $F_{NS} := F_S \circ F_N^{-1} = F_{SN}^{-1}$. The cotangent bundle is thus

$$T^*(EH) = (U_N \times \mathbb{R}^4) \cup (U_S \times \mathbb{R}^4)/\sim, \tag{8}$$

where $(x; a_1, \ldots, a_4) \in U_N \times \mathbb{R}^4$ and $(x'; a'_1, \ldots, a'_4) \in U_S \times \mathbb{R}^4$ are defined to be equivalent if and only if $x = x'$ and $F_{NS}(a_1, \ldots, a_4)^t = (a'_1, \ldots, a'_4)^t$.

2.3. Spin structures and spinor bundles. Following a standard procedure from [21], we obtain the spinor bundle of the EH-space. In coordinate charts $\{U_N, U_S\}$, the frame bundle $P_{SO(4)}$ of the EH-space is the $SO(4)$-principal bundle with transition functions $F_{NS}$ in (7) and its inverse $F_{SN}$. Recall that the covering map of groups,

$$\rho : Spin(4) \longrightarrow SO(4), \tag{9}$$

is defined by the adjoint representation of $Spin(4)$ as $\rho(w)x := w \cdot x \cdot w^{-1}$ for $x \in \mathbb{R}^4$, where $w = v_1 \cdots v_m \in Spin(4)$, $m$ is even and $v_i \in \mathbb{R}^4$ for $i = 1, \ldots, m$. Geometrically, $\rho(w) = \rho(v_1) \circ \cdots \circ \rho(v_m)$, where $\rho(v_i)$ is the reflection of the space $\mathbb{R}^4$ with respect to the hyperplane with normal vector $v_i$. Locally, the upper left block of the transition matrix (7) is a rotation in the plane spanned by $\{\vartheta^1, \vartheta^2\}$ through an angle $2\phi + \pi$. Such a rotation can be decomposed into two reflections, say $\rho(v_2) \circ \rho(v_1)$, with $v_1 := \vartheta^1$, $v_2 := -\sin\phi\,\vartheta^1 + \cos\phi\,\vartheta^2$.

Remark 2. Another choice is $\rho(-v_2) \circ \rho(v_1)$, which gives the same rotation as an element in $SO(2)$.

The element $v_2 \cdot v_1 \in Spin(4)$ is a lifting of $\rho(v_2) \circ \rho(v_1) \in SO(4)$ under the covering map (9). Thus, in the local coordinate chart $U_N$, $\tilde F_{SN} := v_2 \cdot v_1$ in $Spin(4)$ defines a lifting of the action $F_{NS} \in SO(4)$ as in (7) under the double covering (9). To obtain a global lifting of the frame bundle, we consistently define the transition matrix $\tilde F_{NS}$ as a lifting in the group $Spin(4)$ over $x' \in U_S$ by $\tilde F_{NS} = -v'_2 \cdot v'_1$, where $v'_1 := \vartheta'^1$, $v'_2 := \sin\phi\,\vartheta'^1 + \cos\phi\,\vartheta'^2$. The following confirms the consistency of the liftings on two coordinate charts.

Lemma 1. Transition functions $\{\tilde F_{NS}, \tilde F_{SN}\}$ satisfy the cocycle condition, $\tilde F_{NS} \circ \tilde F_{SN} = \tilde F_{SN} \circ \tilde F_{NS} = 1$.

Proof. Applying the transformation from $\vartheta^\alpha$'s to $\vartheta'^\beta$'s by (7), we have $\vartheta'^1 \cdot \vartheta'^2 = \vartheta^1 \cdot \vartheta^2$. Thus,

$$
\begin{aligned}
\tilde F_{NS} \circ \tilde F_{SN} &= -v'_2 \cdot v'_1 \cdot v_2 \cdot v_1\\
&= -(\sin\phi\,\vartheta'^1 + \cos\phi\,\vartheta'^2) \cdot \vartheta'^1 \cdot (-\sin\phi\,\vartheta^1 + \cos\phi\,\vartheta^2) \cdot \vartheta^1\\
&= \sin^2\phi - \sin\phi\cos\phi\,(\vartheta^1 \cdot \vartheta^2 + \vartheta'^2 \cdot \vartheta'^1) - \cos^2\phi\;\vartheta'^2 \cdot \vartheta'^1 \cdot \vartheta^2 \cdot \vartheta^1 = 1,
\end{aligned}
$$

by using identities $\vartheta^\alpha \cdot \vartheta^\alpha = -1$ and $\vartheta^\alpha \cdot \vartheta^\beta = -\vartheta^\beta \cdot \vartheta^\alpha$ for $\alpha \neq \beta$, of elements of the orthonormal bases $\vartheta^\alpha$'s and those of $\vartheta'^\beta$'s. Similarly, $\tilde F_{SN} \circ \tilde F_{NS} = 1$.
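The computation in the proof of Lemma 1 can be mirrored in a matrix representation of the Clifford algebra. The sketch below is an illustration, not the article's argument: it represents $\vartheta^1, \vartheta^2$ by the matrices $\gamma^1, \gamma^2$ as read from (11) later in this subsection, builds the primed elements from the upper left block of (7), and evaluates $-v'_2 \cdot v'_1 \cdot v_2 \cdot v_1$. Helper names and sample angles are hypothetical.

```python
import math

# gamma^1 and gamma^2 as read from Eq. (11); both square to -1 and anticommute.
GAMMA1 = [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]]
GAMMA2 = [[0, 0, -1j, 0], [0, 0, 0, -1j], [-1j, 0, 0, 0], [0, -1j, 0, 0]]


def matmul(m1, m2):
    return [[sum(m1[i][k] * m2[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def combine(c1, m1, c2, m2):
    """Linear combination c1*m1 + c2*m2 of two 4x4 matrices."""
    return [[c1 * m1[i][j] + c2 * m2[i][j] for j in range(4)] for i in range(4)]


def cocycle_product(phi):
    """Matrix representing -v2' . v1' . v2 . v1 at the angle phi."""
    s, c = math.sin(phi), math.cos(phi)
    # Primed frame elements obtained from the upper left block of (7).
    theta1_p = combine(-math.cos(2 * phi), GAMMA1, math.sin(2 * phi), GAMMA2)
    theta2_p = combine(-math.sin(2 * phi), GAMMA1, -math.cos(2 * phi), GAMMA2)
    v1 = GAMMA1
    v2 = combine(-s, GAMMA1, c, GAMMA2)
    v1_p = theta1_p
    v2_p = combine(s, theta1_p, c, theta2_p)
    product = matmul(matmul(v2_p, v1_p), matmul(v2, v1))
    return [[-x for x in row] for row in product]


def distance_from_identity(m):
    return max(abs(m[i][j] - (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    for phi in (0.3, 1.7, 4.0):
        print(distance_from_identity(cocycle_product(phi)))
```

A faithful matrix representation only confirms the identity at the sampled angles; the algebraic proof above covers all $\phi$.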



Therefore, the principal $Spin(4)$-bundle can be defined by

$$P_{Spin(4)} := (U_N \times Spin(4) \cup U_S \times Spin(4))/\sim, \tag{10}$$

where $(x, \tilde g) \in U_N \times Spin(4)$ and $(x', \tilde g') \in U_S \times Spin(4)$ are defined to be equivalent if and only if $x = x'$ and $\tilde g' = \tilde F_{NS}\,\tilde g$. The double covering of bundles (10) over the EH-space defines a spin structure on it. We will always assume this choice of spin structure.

The spinor bundle can be defined as an associated bundle with typical fiber $\mathbb{C}^4$ of the principal $Spin(4)$-bundle (10), by specifying a representation of $Spin(4)$ on $GL_{\mathbb{C}}(4)$. We know that locally, for any $x \in EH$, there exists a unique irreducible representation space $\Lambda$ of complex dimension 4 of the Clifford algebra $Cl(T_x^*(EH))$ through the Clifford action $c : Cl(T_x^*(EH)) \to \mathrm{End}(\Lambda)$. We define the representation of $Spin(4)$ in $\mathrm{End}(\Lambda)\ (\cong GL_{\mathbb{C}}(4))$ simply by the restriction of $c$ from the Clifford algebra, and obtain the spinor bundle $S$ of typical fiber $\Lambda$, with transition functions $\{c(\tilde F_{NS}), c(\tilde F_{SN})\}$ in the coordinate charts $\{U_N, U_S\}$.

With respect to the orthonormal basis, say $\{\vartheta^\alpha\}$ of $T^*(EH)_{U_N}$, there exists a unitary frame $\{f_\alpha\}$ of the representation space $\Lambda \cong \mathbb{C}^4$, such that the Clifford representations $\gamma^\alpha := c(\vartheta^\alpha(x))$ for $\alpha = 1, \ldots, 4$ can be represented as constant matrices,

$$
\gamma^1 = \begin{pmatrix} 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad
\gamma^2 = \begin{pmatrix} 0 & 0 & -i & 0\\ 0 & 0 & 0 & -i\\ -i & 0 & 0 & 0\\ 0 & -i & 0 & 0 \end{pmatrix}, \quad
\gamma^3 = \begin{pmatrix} 0 & 0 & 0 & -1\\ 0 & 0 & -1 & 0\\ 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad
\gamma^4 = \begin{pmatrix} 0 & 0 & 0 & -i\\ 0 & 0 & i & 0\\ 0 & i & 0 & 0\\ -i & 0 & 0 & 0 \end{pmatrix}.
\tag{11}
$$

In fact, there exist frames $\{f'_\beta\}$ on the coordinate chart $U_S$ so that the representations of $c(\vartheta'^\beta)$ are also the constant matrices $\gamma^\beta$ above. Under the chosen frames $\{f_\alpha\}$ and $\{f'_\beta\}$, we may represent the transition functions of the spinor bundle as follows. Define maps $P, Q : U_N \cap U_S \to GL_{\mathbb{C}}(4)$ by

$$P := c(\tilde F_{SN}) = -\sin\phi\,\gamma^1\gamma^1 + \cos\phi\,\gamma^2\gamma^1 = \mathrm{diag}\Big(-\frac{i\bar z}{|z|}, \frac{iz}{|z|}, \frac{iz}{|z|}, -\frac{i\bar z}{|z|}\Big), \tag{12}$$

$$Q := c(\tilde F_{NS}) = -\sin\phi\,\gamma^1\gamma^1 - \cos\phi\,\gamma^2\gamma^1 = \mathrm{diag}\Big(\frac{i\bar w}{|w|}, -\frac{iw}{|w|}, -\frac{iw}{|w|}, \frac{i\bar w}{|w|}\Big); \tag{13}$$

here $\mathrm{diag}(a, b, c, d)$ stands for the diagonal matrix with diagonal elements $a, b, c, d$. The spinor bundle $S$ is thus

$$S := (U_N \times \mathbb{C}^4 \cup U_S \times \mathbb{C}^4)/\sim, \tag{14}$$

where $(x; s_1, \ldots, s_4) \in U_N \times \mathbb{C}^4$ and $(x'; s'_1, \ldots, s'_4) \in U_S \times \mathbb{C}^4$ are defined to be equivalent if and only if $x = x'$ and $(s'_1, \ldots, s'_4)^t = Q(s_1, \ldots, s_4)^t$. One can easily see that the cocycle condition of the transition functions $P \circ Q = Q \circ P = 1$ holds. The chirality operator is defined by

$$\chi := c(\vartheta^1)\,c(\vartheta^2)\,c(\vartheta^3)\,c(\vartheta^4) = \gamma^1\gamma^2\gamma^3\gamma^4 = \mathrm{diag}(-1, -1, 1, 1), \tag{15}$$


such that $\chi^2 = 1$. The representation space $\Lambda = \Lambda^+ \oplus \Lambda^-$ is decomposed as $\pm 1$-eigenspaces of the operator $\chi$, with $\dim_{\mathbb{C}} \Lambda^+ = \dim_{\mathbb{C}} \Lambda^- = 2$. This fiberwise splitting extends to the global decomposition of the spinor bundle as subbundles over the EH-space, $S = S^+ \oplus S^-$, with each of the complex subbundles $S^+$ and $S^-$ of rank 2. Therefore, any element $s \in S$ can be decomposed as $s = (s^+, s^-)^t$. The charge conjugate operator on the spinor bundle $J : S \to S$ is defined by

$$J \begin{pmatrix} s^+\\ s^- \end{pmatrix} := \begin{pmatrix} -\bar s^-\\ \bar s^+ \end{pmatrix}. \tag{16}$$

2.4. Spin connections and Dirac operators of spinor bundles. Following the general procedure in [22], we can induce the spin connection $\nabla^S$ of the spinor bundle $S$ from the Levi-Civita connection of the EH-space. We will only work on the $U_N$ coordinate chart; the construction on $U_S$ is similar. In the orthonormal frame $\{\vartheta^\alpha\}$, the corresponding Levi-Civita connection on the cotangent bundle $T^*(EH)_{U_N}$ can be expressed as $\nabla^{T^*EH}\vartheta^\beta = -\Gamma^\beta_{i\alpha}\,dx^i \otimes \vartheta^\alpha$. The metric compatibility of the Levi-Civita connection implies that $\Gamma^\alpha_{i\beta} = -\Gamma^\beta_{i\alpha}$. We may represent the $\Gamma^\beta_{i\alpha}$'s in terms of the Christoffel symbols $\Gamma^k_{ij}$'s of $\nabla$ in the $dx^i$'s (3) by

$$\Gamma^\beta_{i\alpha} = \tilde h^j_\alpha\left(h^\beta_k\,\Gamma^k_{ij} - \partial_i h^\beta_j\right), \tag{17}$$

where the $h^\alpha_i$'s and $\tilde h^j_\beta$'s are the matrix entries of $H$ in (5) and $H^{-1}$ in (6), respectively. Modulo the anti-symmetric condition between the $\alpha$ and $\beta$ indices, all the nonvanishing symbols are

$$
\begin{aligned}
\Gamma^1_{23} &= \tfrac12\Delta^{1/2}\sin\phi, &
\Gamma^1_{24} &= -\tfrac12\Delta^{1/2}\cos\phi, &
\Gamma^3_{22} &= \tfrac12\Delta^{1/2}\cos\phi,\\
\Gamma^4_{22} &= -\tfrac12\Delta^{1/2}\sin\phi, &
\Gamma^1_{33} &= -\tfrac12\Delta^{1/2}\sin\theta\cos\phi, &
\Gamma^1_{34} &= -\tfrac12\Delta^{1/2}\sin\theta\sin\phi,\\
\Gamma^1_{32} &= -1 - \tfrac12\Delta^+\cos\theta, &
\Gamma^3_{32} &= -\tfrac12\Delta^{1/2}\sin\theta\sin\phi, &
\Gamma^4_{32} &= \tfrac12\Delta^{1/2}\sin\theta\cos\phi,\\
\Gamma^4_{33} &= -\tfrac12\Delta^+\cos\theta, &
\Gamma^1_{42} &= \tfrac12\Delta, &
\Gamma^4_{43} &= -\tfrac12\Delta^+.
\end{aligned}
\tag{18}
$$

We define $\gamma_\alpha := \gamma^\alpha$; then the spin connection $\nabla^S : S \to S \otimes \Omega^1(EH)$ is

$$\nabla^S := d - \frac{1}{4}\,\Gamma^\beta_{i\alpha}\,dx^i \otimes \gamma^\alpha\gamma_\beta. \tag{19}$$

The covariant derivative $\nabla^S_i := \nabla^S(\partial_i)$, for $i = 1, \ldots, 4$, equals $\nabla^S_i = \partial_i - \omega_i$, where $\omega_i = \frac{1}{4}\Gamma^\beta_{i\alpha}\gamma^\alpha\gamma_\beta$. The Dirac operator $D : \Gamma(S) \to \Gamma(S)$ can be defined by

$$D(\psi) := -i\,\gamma^j\,\nabla^S_j\psi, \qquad \forall \psi \in \Gamma(S), \tag{20}$$

where $\gamma^j := c(dx^j) = \tilde h^j_\beta\,\gamma^\beta$. We note that the compatibility of the spin connection with the spin structure implies that the Dirac operator commutes with the charge conjugate operator, i.e. $[D, J] = 0$.
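The algebraic ingredients of Sects. 2.3 and 2.4 lend themselves to a direct numerical check. The sketch below takes the four matrices as read from (11) and verifies the Clifford relations $\gamma^\alpha\gamma^\beta + \gamma^\beta\gamma^\alpha = -2\delta^{\alpha\beta}$, the chirality operator (15), and the cocycle condition $P \circ Q = 1$ for the transition matrices defined in (12) and (13). The helper names and the sample angle are illustrative choices, not notation from the article.

```python
import math

# The constant Clifford matrices as read from Eq. (11).
GAMMA = {
    1: [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]],
    2: [[0, 0, -1j, 0], [0, 0, 0, -1j], [-1j, 0, 0, 0], [0, -1j, 0, 0]],
    3: [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    4: [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
}


def matmul(m1, m2):
    return [[sum(m1[i][k] * m2[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def clifford_defect():
    """Largest entry of gamma^a gamma^b + gamma^b gamma^a + 2 delta^{ab} I."""
    worst = 0.0
    for a in range(1, 5):
        for b in range(1, 5):
            ab, ba = matmul(GAMMA[a], GAMMA[b]), matmul(GAMMA[b], GAMMA[a])
            for i in range(4):
                for j in range(4):
                    target = -2.0 if (a == b and i == j) else 0.0
                    worst = max(worst, abs(ab[i][j] + ba[i][j] - target))
    return worst


def chirality():
    """chi = gamma^1 gamma^2 gamma^3 gamma^4."""
    return matmul(matmul(GAMMA[1], GAMMA[2]), matmul(GAMMA[3], GAMMA[4]))


def transition_matrices(phi):
    """P and Q from the defining expressions in Eqs. (12) and (13)."""
    s, c = math.sin(phi), math.cos(phi)
    g11 = matmul(GAMMA[1], GAMMA[1])
    g21 = matmul(GAMMA[2], GAMMA[1])
    p = [[-s * g11[i][j] + c * g21[i][j] for j in range(4)] for i in range(4)]
    q = [[-s * g11[i][j] - c * g21[i][j] for j in range(4)] for i in range(4)]
    return p, q


def distance(m, diagonal):
    """Largest deviation of m from the diagonal matrix with the given entries."""
    return max(abs(m[i][j] - (diagonal[i] if i == j else 0.0))
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    print(clifford_defect())
    print(distance(chirality(), [-1, -1, 1, 1]))
    p, q = transition_matrices(0.8)
    print(distance(matmul(p, q), [1, 1, 1, 1]))
```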



2.5. Torus actions on the spinor bundle. A torus action on the spinor bundle $S$ can be induced from the torus isometric action on a general Riemannian manifold [14,15]. In this subsection, we will represent such a torus action (4) through parallel transport of spinors along geodesics. Recall that the isometric action $\sigma$ is generated by the two Killing vectors $\partial_3 = \partial_\phi$ and $\partial_4 = \partial_\psi$. Let $c_k : \mathbb{R} \to EH$ be the geodesics obtained as integral curves of the Killing vector field $\partial_k$ for $k = 3, 4$. The equation of parallel transportation with respect to the spin connection along any curve $c(t)$ is $\nabla^S_{c'(t)}\psi = 0$, where $c'(t) := dc(t)/dt$, for $\psi \in \Gamma(S)$. Substituting (19), we obtain

$$\frac{d\psi}{dt} - A(c(t))\,\psi = 0, \qquad A(c(t)) := \frac{1}{4}\,\Gamma^\beta_{i\alpha}\,dx^i(c'(t)) \otimes \gamma^\alpha\gamma_\beta. \tag{21}$$

When the curve is $c_3(t)$, the corresponding matrix $A(c_3(t))$ is

$$
A(c_3(t)) = \frac{1}{2}
\begin{pmatrix}
i & 0 & 0 & 0\\
0 & -i & 0 & 0\\
0 & 0 & -i(1 + \Delta^+\cos\theta) & -\Delta^{1/2}\sin\theta\,e^{i\phi}\\
0 & 0 & \Delta^{1/2}\sin\theta\,e^{-i\phi} & i(1 + \Delta^+\cos\theta)
\end{pmatrix},
\tag{22}
$$

where $r$, $\theta$ and $\phi$ are understood as components of coordinates on the curve $c_3(t)$. When the curve is $c_4(t)$, the corresponding matrix $A(c_4(t))$ is

$$A(c_4(t)) = \frac{i}{2}\,\mathrm{diag}\Big(-\frac{a^4}{r^4}, \frac{a^4}{r^4}, -1, 1\Big), \tag{23}$$

where $r$ is understood as one of the components of coordinates on the curve $c_4(t)$. The corresponding parallel propagator is a map $P_{c(t)}(t_0, t_1) : \Gamma(S) \to \Gamma(S)$ defined by parallel transporting any section $\psi$ along the curve $c(t)$ with $t \in [t_0, t_1]$. The propagator can be represented by an iterated integration of Eq. (21). For geodesics $c_k(t)$, $k = 3, 4$, the corresponding matrix is formally solved as

$$P_{c_k(t)}(t_0, t_1) = \mathcal{P}\exp\Big(\int_{t_0}^{t_1} A_k(t)\,dt\Big), \tag{24}$$

where $\mathcal{P}$ is the path-ordering operator. Let $\mathcal{H}$ be the Hilbert space completion, with respect to the $L^2$-inner product, of the space of $L^2$-integrable sections of the spinor bundle $S$. The parallel propagators (24) can be extended to families of operators $U_k(t_k - t_0) : \mathcal{H} \to \mathcal{H}$ parametrized by the real number $(t_k - t_0)$ by $U_k(t_k - t_0)(\psi)(x) := (P_{c_k(t)}(t_0, t_k)\psi)(x)$, $\forall \psi \in \mathcal{H}$, where we assume $x = c_k(t_0) \in EH$ for $k = 3, 4$. Without loss of generality, we may take $t_0 = 0$ so that the family of operators is parametrized by $t_k$. Since the spin connection is compatible with the metric of the EH-space, the pointwise inner product of the images of any two sections under parallel transportation along the geodesics $c_k(t)$ remains unchanged. This further implies that their $L^2$-integrations remain the same. Furthermore, the operators $U_k(t_k)$ are unitary. Let $W_k$ be the self-adjoint operators on $\mathcal{H}$ which generate $U_k$ by $U_k(t_k) = e^{i t_k W_k}$, where $t_k \in \mathbb{R}$ for $k = 3, 4$.
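For the curve $c_4$, the coordinate $r$ is constant along the curve, so the matrix in (23) does not depend on $t$ and the path-ordered exponential (24) reduces to an ordinary matrix exponential. The following sketch illustrates this special case only. It takes the diagonal entries of (23) as printed, and the function names and sample values are hypothetical.

```python
import cmath
import math


def generator_c4(r, a):
    """Diagonal entries of A(c_4) as printed in Eq. (23)."""
    q = a**4 / r**4
    return [0.5j * x for x in (-q, q, -1.0, 1.0)]


def propagator(diagonal_generator, t0, t1):
    """Diagonal of exp((t1 - t0) A) for a t-independent diagonal matrix A.

    Path ordering is trivial here because A commutes with itself at all times.
    """
    return [cmath.exp((t1 - t0) * lam) for lam in diagonal_generator]


if __name__ == "__main__":
    gen = generator_c4(r=2.0, a=1.0)
    u_full = propagator(gen, 0.0, 1.3)
    u_first = propagator(gen, 0.0, 0.5)
    u_second = propagator(gen, 0.5, 1.3)
    # Each entry has modulus one, so the diagonal propagator is unitary.
    print([abs(u) for u in u_full])
    # Propagators compose along the curve.
    print(max(abs(u_full[i] - u_second[i] * u_first[i]) for i in range(4)))
    # Entries of the propagator after one full period 2*pi of psi.
    print(propagator(gen, 0.0, 2.0 * math.pi))
```

Because the generator is anti-Hermitian, the resulting propagator is unitary, in line with the statement about $U_k(t_k)$ above. The curve $c_3$ would need a genuine path-ordered integration, since (22) depends on $\phi$ along the curve.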


We may define a representation $\tilde V : \tilde{\mathbb{T}}^2 \to L(\mathcal{H})$ of the double cover $p : \tilde{\mathbb{T}}^2 \to \mathbb{T}^2$ of the two-torus by

$$\tilde V(\tilde t_3, \tilde t_4)\,\psi(x) := e^{i(\tilde t_3 W_3(x) + \tilde t_4 W_4(x))}\,\psi(x), \qquad \forall \psi \in \mathcal{H}. \tag{25}$$

This action covers the isometric action $\sigma$ of $\mathbb{T}^2$ from (4) in the sense that, for any $\tilde v \in \tilde{\mathbb{T}}^2$, $p(\tilde v) = v$ implies $\tilde V_{\tilde v}(f\psi) = \alpha_v(f)\,\tilde V_{\tilde v}(\psi)$, $\forall \psi \in \mathcal{H}$, for any bounded continuous function $f \in C_b(EH)$ and the action $\alpha$ on $C_b(EH)$ defined by $\alpha_v(f)(x) := f(\sigma_{-v}(x))$. We assume the choice of the lifting in the double torus is always fixed and omit the tilde for notational simplicity from now on.

3. Smooth Algebras and Projective Modules

We consider algebras of functions over the Eguchi-Hanson spaces, and their deformations as differential algebras. To obtain a $C^*$-norm on the deformed algebra, we consider representations of algebras as operators on the Hilbert space of spinors. Some algebras may be realized as smooth algebras [2]. We also find projective modules from the spinor bundle.

3.1. Algebras of smooth functions. We first summarize some related facts on topological algebras of complex-valued functions in [2]. For a noncompact Riemannian manifold $X$, let $C_c^\infty(X)$ be the space of smooth functions on $X$ of compact support, $C_0^\infty(X)$ be the space of smooth functions vanishing at infinity, and $C_b^\infty(X)$ be the space of smooth functions whose derivatives are bounded to all degrees. In some local coordinate charts with corresponding partition of unity, say $\mathcal{U} = \{U_a, h_a\}_{a \in A}$, we may define the family of seminorms on $C_b^\infty(X)$ by

$$q_m^{\mathcal{U}}(f) := \sum_{a \in A}\ \sup_{|\alpha| \le m}\ \sup_{x \in U_a} |h_a(x)\,\partial^\alpha f(x)| \tag{26}$$

for any $f \in C_b^\infty(X)$, where $\alpha$ are multi-indices and $m$ is a non-negative integer. These seminorms restrict to $C_0^\infty(X)$ and $C_c^\infty(X)$. The natural topology induced by (26) is the topology of uniform convergence of all derivatives. We can show that two such families of seminorms defined by different coordinate charts are equivalent. Thus the topology defined does not depend on the choice of coordinates $\mathcal{U}$. We also note that the $q_0$ seminorm in the family of seminorms is nothing but the supremum norm $\|\cdot\|_\infty$, which is a $C^*$-norm with the involution defined by normal complex conjugation.

Algebras $C_b^\infty(X)$ and $C_0^\infty(X)$ are both Fréchet in the topology of uniform convergence of all derivatives, while the algebra $C_c^\infty(X)$ is not complete. However, $C_c^\infty(X)$ is complete in the topology of inductive limit, as the inductive limit of the topology obtained by restriction on a family of algebras $C_c^\infty(K_n)$, where $\{K_n\}_{n \in \mathbb{N}}$ is an increasing family of compact subsets in $X$. The algebra $C_c^\infty(X)$ is dense in the Fréchet algebra $C_0^\infty(X)$.

To consider algebras of smooth functions of the Eguchi-Hanson spaces, we may use the coordinate charts $\mathcal{U} = \{U_N, U_S\}$ defined in Sect. 2.2, with a partition of unity $\{h_N, h_S\}$ subordinated to them. The family of seminorms (26) can be written as

$$q_m^{\mathcal{U}}(f) = \sup_{|\alpha| \le m}\ \sup_{x \in U_N} |h_N(x)\,\partial^\alpha f(x)| + \sup_{|\alpha'| \le m}\ \sup_{x' \in U_S} |h_S(x')\,\partial^{\alpha'} f(x')|. \tag{27}$$

We obtain the corresponding topological algebras by taking $X = EH$.


3.2. Algebras of integrable functions. Apart from algebras of functions which can be represented as operators, there are algebras of functions which may define projective modules as representation spaces. Decay conditions at infinity and integrability conditions of functions become important when considering noncompact spaces. We consider the following algebras of integrable functions. The (k, p)th Sobolev norm of a function f , say in Cb∞ (E H ), is given as f H p :=

k

k

m=0

1/ p |∇ f | d V ol m

p

,

(28)

EH

where k is a non-negative integer and p is a positive integer. (We will not consider the case where p is a real number). We define subspaces in Cb∞ (E H ) which contain functions with finite Sobolev norm, Ck (E H ) := { f ∈ Cb∞ (E H ) : f H p < ∞}. p

k

p Hk (E H )

p

Let be the Banach space obtained by the completion of the algebra Ck (E H ) p p with respect to the Sobolev norm. In particular, H0 (E H ) ⊃ · · · ⊃ Hk (E H ) ⊃ p Hk+1 (E H ) ⊃ · · · . Remark 3. Notice that the algebra Cc∞ (E H ) is contained in Hk (E H ) for any k ∈ N. p Completion of Cc∞ (E H ) with respect to · H p gives us the Banach space, Hk,0 (E H ) p

p

k

p

such that Hk,0 (E H ) ⊂ Hk (E H ). The equality does not hold in general. However, in the circumstances of a complete Riemannian manifold with Ricci curvature bounded up to degree k − 2, and positive injective radius (which is satisfied by the E H -space), p p Hk,0 (E H ) = Hk (E H ) when k ≥ 2 [23]. Lemma 2. For a fixed non-negative integer p, the intersection defined as C∞ p (E H ) := ∩k Hk (E H ) p

is a Fréchet algebra in the topology defined by the family of norms { · H p }k∈N . k

Proof. The topology is easily seen to be locally convex and metrizable. To show that it is complete, let $\{f_\beta\}$ be any Cauchy sequence in $C_p^\infty(EH)$; then there exists a limit $f_k$ of $\{f_\beta\}$ under the norm $\|\cdot\|_{H_k^p}$ in $H_k^p(EH)$ for each $k \in N$. For any two indices $k_1, k_2$ such that $k_1 \le k_2$, the norm $\|\cdot\|_{H_{k_2}^p}$ is stronger than the norm $\|\cdot\|_{H_{k_1}^p}$. The Cauchy sequence $\{f_\beta\}$ with the limit $f_{k_2}$ in the norm $\|\cdot\|_{H_{k_2}^p}$ is also a Cauchy sequence with the limit $f_{k_1}$ in the norm $\|\cdot\|_{H_{k_1}^p}$. Uniqueness of the limit implies that $f_{k_2} = f_{k_1}$. Since $k_1, k_2$ are arbitrary, the limits $f_k$ for any $k \in N$ agree. We denote the limit as $f$ so that the Cauchy sequence converges to $f \in C_p^\infty(EH)$ with respect to any of the norms. Thus the topology is complete and $C_p^\infty(EH)$ is a Fréchet algebra.

When $p = 2$, the Fréchet algebra $C_2^\infty(EH)$ belongs to the chain of continuous inclusions,

$$ C_c^\infty(EH) \hookrightarrow C_2^\infty(EH) \hookrightarrow C_0^\infty(EH), \qquad (29) $$

with respect to their aforementioned topologies.


C. Yang

3.3. Deformation quantizations of differentiable Fréchet algebras. Rieffel's deformation quantization of a differentiable Fréchet algebra in [12] (Chapters 1, 2) can be summarized as follows. Let $A$ be a Fréchet algebra whose topology is defined by a family of seminorms $\{q_m\}$. We assume that there is an isometric action $\alpha$ of the vector space $V := R^d$, considered as a $d$-dimensional Lie algebra, acting on $A$. We also assume that the algebra is smooth with respect to the action $\alpha$, i.e. $A = A^\infty$ in the notation of the reference. Under the choice of a basis $\{X_1, \dots, X_d\}$ of the Lie algebra of $V$, the action $\alpha_{X_i}$ of $X_i$ defines a partial differentiation on $A$. One can define a new family of seminorms from $\{q_m\}$ by taking into account the action of $\alpha$. For any $f \in A$,

$$ \|f\|_{j,k} := \sum_{m\le j,\ |\mu|\le k} q_m(\delta^{\mu} f), \qquad (30) $$

where $\mu$ are the multi-indices $(\mu_1, \dots, \mu_d)$ and $\delta^{\mu} = \alpha_{X_1}^{\mu_1} \cdots \alpha_{X_d}^{\mu_d}$. The deformation quantization of the algebra $A$ can be carried out in three steps:

Step 1. Let $C_b(V \times V, A)$ be the space of bounded continuous functions from $V \times V$ to $A$. One can induce the family of seminorms $\{\|\cdot\|^{C}_{j,k}\}$ on the space $C_b(V \times V, A)$ by

$$ \|F\|^{C}_{j,k} := \sup_{w\in V\times V} \|F(w)\|_{j,k}, \qquad (31) $$

for $F$ in $C_b(V \times V, A)$ and $\|\cdot\|_{j,k}$ on $A$ as in (30). Let $\tau$ be an action of $V \times V$ on the space $C_b(V \times V, A)$ defined by translation. That is, $\tau_{w_0}(F)(w) = F(w + w_0)$ for any $w_0, w \in V \times V$ and $F \in C_b(V \times V, A)$. The action $\tau$ is an isometric action with respect to the seminorms (31). We define $B^A(V \times V)$ to be the maximal subalgebra such that $\tau$ is strongly continuous and whose elements are all smooth with respect to the action of $\tau$. In the same way as one induces from the family of seminorms $\{q_m\}$ and obtains the seminorms $\|\cdot\|_{j,k}$ of $A$ in (30), one may induce the family of seminorms on $B^A(V \times V)$ from (31) by taking into account the action of $\tau$. For any $F \in B^A(V \times V)$, let

$$ \|F\|^{B}_{j,k;l} := \sum_{(l,m)\le(j,k)}\ \sum_{|\nu|\le l} \|\delta^{\nu} F\|^{C}_{l,m}, \qquad (32) $$

where $\nu$ are the multi-indices and $\delta^{\nu}$ denotes the partial differentiation operator associated to $\tau$ of $V \times V$.

Step 2. The following is the fundamental result of the deformation quantization of a differentiable algebra. See Proposition 1.6 in [12]. One can define an $A$-valued oscillatory integral over $V \times V$ of $F \in B^A(V \times V)$ by

$$ \int_{V\times V} F(u, v)\, e(u \cdot v)\, du\, dv, \qquad (33) $$

where $e(t) := \exp(2\pi i t)$ for $t \in R$ and $u \cdot v$ is the natural inner product on $V$ considered as its own Lie algebra. It is shown to be $A$-valued by getting the bound of the integral in the family of seminorms $\{\|\cdot\|_{j,k}\}$ on $A$. Specifically, for large enough $l$, there exists a constant $C_l$ such that

$$ \Big\| \int_{V\times V} F(u, v)\, e(u \cdot v)\, du\, dv \Big\|_{j,k} \le C_l\, \|F\|^{B}_{j,k;l} < \infty, $$

where the seminorm $\|\cdot\|^{B}_{j,k;l}$ is defined in (32).



Step 3. For any invertible map $J$ on $V$ and any two functions $f, g \in A$, one defines an element $F_{f,g} \in B^A(V \times V)$ by

$$ F_{f,g}(u, v) := \alpha_{Ju}(f)\,\alpha_v(g) \in A, \qquad \forall (u, v) \in V \times V. \qquad (34) $$

The deformed product $f \times_J g$ is thus defined by the integral (33) of $F_{f,g}(u, v)$ as

$$ f \times_J g := \int_V \int_V \alpha_{Ju}(f)\,\alpha_v(g)\, e(u \cdot v)\, du\, dv. \qquad (35) $$

The algebra $A$ with its deformed product $\times_J$, together with its undeformed seminorms $\{\|\cdot\|_{j,k}\}$, defines the deformed Fréchet algebra $A_J$. This is called the deformation of the algebra $A$ (in the direction of $J$) as a differentiable Fréchet algebra.

In the following, we obtain deformation quantizations of various algebras of functions on EH-spaces. We may induce a torus action $\alpha$ on the algebra $C_b^\infty(EH)$, or similarly on algebras $C_0^\infty(EH)$ and $C_2^\infty(EH)$, from the torus isometric action $\sigma$ of $v \in T^2$ on the EH-space (4) by $\alpha_v f(x) = f(\sigma_{-v}(x))$ for any $f \in C_b^\infty(EH)$ and $x \in EH$. Under the choice of the covering $\{U_N, U_S\}$, the orbit of any point $x \in EH$ lies in the same coordinate chart as $x$. We assume that the partition of unity $h_N$ and $h_S$ only depend on the coordinate $\theta$ so that they are invariant under the torus action $\alpha$. One can easily show that the torus action $\alpha$ is isometric with respect to the family of seminorms (27). We also note that each of the Fréchet algebras $C_b^\infty(EH)$ and $C_0^\infty(EH)$ is already smooth with respect to the action $\alpha$. Thus, each of $C_b^\infty(EH)$ and $C_0^\infty(EH)$, with the isometric action $\alpha$ regarded as a periodic action of $V = R^2$, appears exactly as the starting point $(A, \{q_m\})$ of Rieffel's deformation quantization. We can carry out Step 1 to Step 3 and obtain the product $\times_J$ on the respective algebras,

$$ f \times_J g := \int_{R^2}\int_{R^2} \alpha_{Ju}(f)\,\alpha_v(g)\, e(u \cdot v)\, du\, dv, \qquad (36) $$

where the inner product $u \cdot v$ is the one on $R^2$ and $J$ is a skew-symmetric linear operator on $R^2$. In the following we assume

$$ J := \begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix} $$

for some $\theta \in R \setminus \{0\}$, and denote $\times_J$ as $\times_\theta$. The algebra $C_b^\infty(EH)$ with its deformed product $\times_\theta$, together with its undeformed family of seminorms (27), defines the deformed Fréchet algebra $C_b^\infty(EH)_\theta$ as the deformation quantization of $C_b^\infty(EH)$. Similarly, $C_0^\infty(EH)_\theta$ is the deformation quantization of the algebra $C_0^\infty(EH)$. For the Fréchet algebra $C_2^\infty(EH)$, the torus action $\alpha$ is isometric with respect to the family of norms $\{\|\cdot\|_{H_k^2}\}_{k\in N}$, because it is isometric with respect to the Riemannian metric. We can similarly obtain the Fréchet algebra $C_2^\infty(EH)_\theta$ as the deformation quantization of the algebra $C_2^\infty(EH)$.

Remark 4. For any of the algebras in our example, the family of seminorms $\|\cdot\|_{j,k}$ induced from the $q_m$'s as in Step 1 is equivalent to the original family of seminorms. Indeed, the torus action is defined by the normal differentiation with respect to coordinates.

There follow some immediate observations.

Lemma 3. The algebra $C_2^\infty(EH)_\theta$ is an ideal of the algebra $C_b^\infty(EH)_\theta$.


Proof. Let $f \in C_2^\infty(EH)$ and $g \in C_b^\infty(EH)$. Considered as elements of the algebra $C_b^\infty(EH)$, they define $F_{f,g} \in B^{C_b^\infty(EH)}(R^2 \times R^2)$ by (34). We claim that $F_{f,g}$ lies in $B^{C_2^\infty(EH)}(R^2 \times R^2)$ so that its oscillatory integral, or product $f \times_\theta g$ by definition, will be finite in the family of seminorms on $C_2^\infty(EH)$ and hence $C_2^\infty(EH)$-valued. In fact,

$$ \int_{EH} |F_{f,g}(u, v)(x)|^2 \, dVol(x) = \int_{EH} |f(Ju + x)\, g(v + x)|^2 \, dVol(x) \le \sup_{x\in EH} |g(x)|^2 \int_{EH} |f(Ju + x)|^2 \, dVol(x) = \sup_{x\in EH} |g(x)|^2 \int_{EH} |f(x)|^2 \, dVol(x) < \infty. $$

The last equality is by the invariance of the volume form of the integration with respect to the torus isometric action. The finiteness is because g is a bounded function and f ∈ C2∞ (E H ). Higher orders can be shown as follows. For any non-negative integer k, we may expand ∇ k ( f (J u + x)g(v + x)) by the Leibniz rule to a summation of terms in the form of ∇ l f (J u + x)∇ m g(v + x) with l + m = k. By the assumption that ∇ k f is L 2 -integrable for any k and ∇ l g is bounded for any l, each term in the summation is L 2 -integrable. Thus ∇ k ( f (J u + x)g(v + x)) is L 2 -integrable for any k and F f,g (u, v) ∈ C2∞ (E H ) for any (u, v) ∈ R2 × R2 . As a result, the product f ×θ g is C2∞ (E H )-valued and C2∞ (E H )θ is an ideal. The restriction of the product (36) of the algebra Cb∞ (E H )θ to the algebra Cc∞ (E H ) gives the deformed algebra Cc∞ (E H )θ . We see that it is closed as an algebra as follows. For any f, g ∈ Cc∞ (E H ), the integral (36) vanishes outside the compact set Orb(supp( f )) ∩ Orb(supp(g)), where Orb(U ) := {αT2 (x) : x ∈ U ⊂ E H }. Therefore, f ×θ g is of compact support and Cc∞ (E H )θ is thus closed. We define the topology of inductive limit of Cc∞ (E H )θ as follows. Let {K j } be an increasing family of compact sets of the E H -space such that ∪ j K j = E H and Orb(K j ) ⊂ K j for each j. The product (36) defines a multiplication for each of the algebra Cc∞ (K j ). This defines a family of deformed algebras {Cc∞ (K j )θ }. For each j, the topology of Cc∞ (K j )θ obtained from the restriction of the topology of uniform convergence to all degrees of C0∞ (E H )θ is complete. This induces the strictly inductive limit topology of Cc∞ (E H )θ which is locally convex and complete. Using definitions, we have Lemma 4. Cc∞ (E H )θ is an ideal of the algebras C0∞ (E H )θ and Cb∞ (E H )θ . Proof. For f ∈ Cc∞ (E H )θ and g ∈ Cb∞ (E H )θ , the integral (36) vanishes outside the compact set Orb(supp( f )). 
Hence $f \times_\theta g$ is $C_c^\infty(EH)$-valued, so that $C_c^\infty(EH)_\theta$ is an ideal of the algebra $C_b^\infty(EH)_\theta$. The proof for the algebra $C_0^\infty(EH)_\theta$ is the same.

The torus action $\alpha$ as a compact action of an abelian group defines a spectral decomposition of a function $f$ in the algebra $C_b^\infty(EH)$ or $C_0^\infty(EH)$, by

$$ f = \sum_{s} f_s, \qquad f_s(x) = e^{-i s_3 \phi}\, e^{-i s_4 \psi}\, h_s(r, \theta), $$



where $s = (s_3, s_4) \in Z^2$, $f_s$ satisfies $\alpha_v f_s = e^{i s\cdot v} f_s$, $\forall v \in T^2$, and the series converges in the topology of uniform convergence of all derivatives. Under the decomposition, the product (36) takes a simple form (Chapter 2, [12]). Let $f = \sum_r f_r$ and $g = \sum_s g_s$, in their respective decompositions, both be in the algebra $C_b^\infty(EH)$ (or $C_0^\infty(EH)$); then

$$ f \times_\theta g = \sum_{r,s} \sigma(r, s)\, f_r\, g_s, \qquad (37) $$

where σ (r, s) := eiθ(r4 s3 −r3 s4 ) and r = (r3 , r4 ), s = (s3 , s4 ) ∈ Z2 . The expression (37) can also be restricted to the algebra Cc∞ (E H )θ . Lemma 5. C0∞ (E H )θ is an ideal of Cb∞ (E H )θ . Proof. For any f ∈ C0∞ (E H )θ and g ∈ Cb∞ (E H )θ , it suffices to show that f ×θ g ∈ C0∞ (E H )θ . For g being zero, this is trivial. We thus assume that g is nonzero. The convergence of the series (37) implies that for any ε/2 > 0, there exists an integer N such that | f ×θ g(x)|
0, we may choose N and K ( fr ) as above and define the union of finitely many compact sets as K := ∪|r |≤N K ( fr ), so that x ∈ E H \K implies that | f ×θ g(x)|
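The mode-by-mode form (37) of the deformed product can be sketched numerically. In the example below each spectral component is reduced to a single complex coefficient indexed by its weight $(s_3, s_4)$; treating the profile $h_s(r, \theta)$ as a plain number, and the helper names `sigma` and `twisted_product`, are simplifying assumptions made only for illustration.

```python
import cmath

def sigma(r, s, theta):
    """Phase sigma(r, s) = exp(i*theta*(r4*s3 - r3*s4)) for r = (r3, r4), s = (s3, s4)."""
    r3, r4 = r
    s3, s4 = s
    return cmath.exp(1j * theta * (r4 * s3 - r3 * s4))

def twisted_product(f, g, theta):
    """Multiply two finite expansions {weight: coefficient} following (37)."""
    out = {}
    for r, fr in f.items():
        for s, gs in g.items():
            weight = (r[0] + s[0], r[1] + s[1])  # weights add under the torus action
            out[weight] = out.get(weight, 0) + sigma(r, s, theta) * fr * gs
    return out

theta = 0.7
f = {(1, 0): 2.0}
g = {(0, 1): 3.0}
h = {(2, -1): 1.5}
fg = twisted_product(f, g, theta)
gf = twisted_product(g, f, theta)
```

With these inputs `fg` and `gf` differ by the phase $e^{-2i\theta}$, which is the familiar noncommutative-torus type relation, and setting `theta = 0` recovers the commutative product. Because the exponent of $\sigma$ is bilinear in $(r, s)$, the product is associative.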

Proof of Lemma 5. Start with (2.3). By the first estimate of (2.13),

$$ \lim_{n\to\infty} \sum_{k=ny}^{nx} \big( E\log\tilde\chi_{k\beta} - E\log\chi_{(k+a)\beta} \big) = \frac{a}{2}\log(y/x) $$

uniformly for $y < x$ restricted to compact sets of $(0, 1]$. Thus, for (2.3) it is enough to demonstrate the weak convergence

$$ \sum_{k=[nx]}^{n} \big( \log\chi_{(k+c)\beta} - E\log\chi_{(k+c)\beta} \big) \Rightarrow \int_x^1 (2\beta z)^{-1/2}\, db(z), \qquad (2.15) $$

where c is any fixed number. Indeed, the exponent of the discrete kernel is comprised of √ two such independent sums, and the promised limit will follow as b1 + b2 = 2b3 in law for independent Brownian motions b1 , b2 , b3 . Now refer to Proposition 9 and view the processes on the left of (2.15) as starting from 0 at x = 1 and evolving toward x = 0 (or take t = 1 − x in the proposition). Then, the second estimate of (2.13) yields the first part of (2.14) with f (t) = 1/(2βt); the estimate right after (2.13) with m = 2 produces the second half of (2.14) as x is always > 0. This finishes the job. The convergence (2.2) is easier. For any fixed x ∈ (0, 1], it is just an instance of the law of large numbers. The tightness required to ensure process level convergence is also elementary: via (2.12) one can obtain the increment bound !2 % % (r + 1)β rβ E − = O(1/r 2 ) χ(r2 +a+1)β χ(r2 +a)β


J. A. Ramírez, B. Rider

which more than suffices. While here we dispose of (2.4). First use the sum bound,

$$ P\Big( \sup_{1\le k\le n} \frac{\sqrt{k\beta}}{\chi_{(k+a)\beta}} > M \Big) \le \sum_{k=1}^{n} P\Big( \frac{\chi_{(k+a)\beta}}{\sqrt{k\beta}} < \frac{1}{M} \Big). $$

Then, employing the explicit density $P(\chi_r \in ds) = \frac{2^{1-r/2}}{\Gamma(r/2)}\, s^{r-1} e^{-s^2/2}\, ds$, one can perform a Laplace-type estimate to find that the $k$th term on the right hand side is upper bounded by $C(\sqrt{e}/M)^k$ with $C$ depending only on $\beta$. Since $\sum_{k=1}^{\infty} (\sqrt{e}/M)^k$ may be made arbitrarily small by choice of $M$, the desired tightness of the random variables $\sup_{k\le n} (\sqrt{k\beta}/\chi_{(k+a)\beta})$ follows.

The final piece, or (2.5), is the most elaborate but really comes down to reworking the standard proof of the upper bound in the law of the iterated logarithm. Define,

$$ A^n_x = \sum_{k=j}^{n-1} \big( \log\tilde\chi_{k\beta} - \log\chi_{(k+a)\beta} \big) - \frac{a}{2}\log(j/n) $$

for $x \in [x_j, x_{j+1})$, and $h(x) = [2x \log\log x]^{1/2}$. We will in fact show that

$$ \sup_{1\le j\le n-1} \big( A^n_{x_j} \vee 0 \big) / h(T(x_j)) \ \text{ are tight in distribution,} \qquad (2.16) $$

where again $T(x) = \frac{1}{\beta}\log\frac{1}{x}$. This is stronger than what is claimed. Set

$$ Y^n_j \equiv \exp(A^n_{x_j}) = \prod_{k=j}^{n-1} \frac{\tilde\chi_{k\beta}}{\chi_{(k+a)\beta}} \Big( \frac{k+1}{k} \Big)^{a/2}, \quad \text{and} \quad Z^n_j \equiv (Y^n_j)^{\lambda}\, E[(Y^n_j)^{\lambda}]^{-1} $$

with a small positive λ (the precise conditions on λ follow shortly). The sequence j → n n Z n− j is a martingale for j = 1, 2, . . . with E[Z j ] = 1 for all j. Hence, by Doob’s inequality P max Z nj ≥ eλb ≤ e−λb , ≤ j≤n−1

or

P

max (λAnx j − log E[exp(λAnx j )]) ≥ b ≤ e−λb

≤ j≤n−1

(2.17)

for b > 0. For the next move we need an estimate on the moment generating functions of Anx j , the proof of which we will return to at the end of the section. Claim 10. For all λ > 0 sufficiently small (λ < (β/2)[(1 + a) ∧ 1] will do), 2 # λAn $ λ xj log(1/x j ) + n ( j) = exp E e 2β with |n ( j)| ≤ C for constant C = C(a, β).

(2.18)

Diffusion at the Random Matrix Hard Edge


Using (2.18) in (2.17), we have ! 1 λ n log(1/t) + n (nt) ≥ b/λ ≤ e−λb P sup At − 2β λ x ≤t 1, a positive constant M and set λ = Mθ −m h(θ m ), b = Mh(θ m )/2. (To choose M large one must take θ large as well to respect the condition on λ set down in Claim 10.) The previous display will then imply ! P

Ant ≥ (M + 1)h(θ m ) ≤ (m log θ )−M . 2

sup θ m N ≤ ε(N ) where ε(N ) ↓ 0 as N ↑ ∞. 0 λ, as an examination of (3.1) shows. That is, the event that {0 (L) > λ} is equal in law to the event {x → ψ0 (x, λ) has no roots before x = L}. Continuing, additional zeros of the (almost surely continuous) function λ → ψ0 (L , λ) (and so additional eigenvalues) only occur by increasing λ, whereupon all other roots (in L the x-variable) move to the left. This equates the event that the k th eigenvalue of Gβ,a lies above a fixed λ and the event that ψ0 (x, λ) has at most k − 1 roots on (0, L). Now move to the p(x, λ) formed from ψ0 (x, λ) and its derivative. By appealing again to uniqueness of solutions to (3.1), note ψ0 and ψ0 cannot vanish simultaneously. (In particular, the zeros of ψ0 are isolated, and must be either finite in number or form a sequence tending to infinity.) Thus, at any root m of x → ψ0 (x, λ), including m = 0, an examination of signs shows that limε↓0 p(m + ε, λ) = +∞ and, when m > 0, limε↓0 p(m − ε, λ) = −∞. That is, counting roots of ψ0 (·, λ) is to count passages of the corresponding p(·, λ) to −∞, after subsequent re-starts at +∞. To see that the p-picture stands on its own is to show that there is a unique solution of (3.2) starting from +∞. Replacing the −λe−x term in the drift with any negative constant produces a homogeneous motion with an entrance boundary at +∞ (and which hits −∞ with probability one). This process (begun at +∞) may be constructed unambiguously via speed and scale, see again [12]. By successive dominations of the inhomogeneous p in the statement by such homogeneous versions over all short times, one may conclude the existence and uniqueness of the former. Theorem 2 now follows by taking L → ∞ in Lemma 11 with the aid of the next fact. L converge to the top k eigenvalues Lemma 12. As L → ∞, the top k eigenvalues of Gβ,a of Gβ,a with probability one. L )−1 Proof. This again demonstrates the advantage of having explicit inverses. Now (Gβ,a acts on L 2 ([0, L], m) via

$$ (G^L_{\beta,a})^{-1} f(x) = \int_0^\infty s_L(x, y)\, f(y)\, m(dy), $$

where

$$ s_L(x, y) = \Big[ \int_0^{x\wedge y} s(dz) \Big] \times \Big[ \frac{\int_{x\vee y}^{L} s(dz)}{\int_0^{L} s(dz)} \Big]\, 1_{\{x, y\in[0,L]\}}. $$

Plainly, $s_L(x, y) \le \int_0^{x\wedge y} s(dz)$ and $\lim_{L\to\infty} s_L(x, y) = \int_0^{x\wedge y} s(dz)$ pointwise in $x$ and $y$, almost surely. By dominated convergence we have in the same mode that

$$ \int_0^\infty\!\!\int_0^\infty f(x)\, s_L(x, y)\, g(y)\, m(dx)\, m(dy) \to \int_0^\infty\!\!\int_0^\infty f(x) \Big( \int_0^{x\wedge y} s(dz) \Big) g(y)\, m(dx)\, m(dy) $$

for all $f, g \in L^2[R_+, m]$, and

$$ \mathrm{tr}\,(G^L_{\beta,a})^{-1} = \int_0^L s_L(x, x)\, m(dx) \to \int_0^\infty\!\!\int_0^x s(dy)\, m(dx) = \mathrm{tr}\, G^{-1}_{\beta,a}. $$

But these last two items imply convergence of $G^L_{\beta,a}$ to $G_{\beta,a}$ in trace norm (see [17], Theorem 2.20); the convergence of the eigenvalues then stems from the same style of argument used in the proof of Theorem 1.

4. The Hard-to-Soft Transition

Borodin-Forrester [1] discovered a transition between the hard and soft edge distributions at $\beta = 1, 2$, and $4$. Their proof rests on the explicit Fredholm determinant or Fredholm pfaffian form of these laws. For example, at $\beta = 2$ one has that

$$ P(\Lambda_0(2, a) > \lambda) = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_0^{\lambda} dx_1 \cdots \int_0^{\lambda} dx_k\, \det\big[ K_{Bessel}(x_i, x_j) \big]_{i,j=1,\dots,k}, \qquad (4.1) $$

while

$$ P(TW_2 < \lambda) = 1 + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_{\lambda}^{\infty} dx_1 \cdots \int_{\lambda}^{\infty} dx_k\, \det\big[ K_{Airy}(x_i, x_j) \big]_{i,j=1,\dots,k}. \qquad (4.2) $$

Here,

$$ K_{Bessel}(x, y) := \frac{J_a(\sqrt{x})\,\sqrt{y}\,J_a'(\sqrt{y}) - \sqrt{x}\,J_a'(\sqrt{x})\,J_a(\sqrt{y})}{x - y} $$

with $J_a$ the usual Bessel function of the first kind, which is replaced by the Airy function in

$$ K_{Airy}(x, y) := \frac{\mathrm{Ai}(x)\,\mathrm{Ai}'(y) - \mathrm{Ai}'(x)\,\mathrm{Ai}(y)}{x - y}. $$


For $\beta = 1$ or $4$ the determinants in (4.1) and (4.2) are replaced by quaternion determinants (or, equivalently, pfaffians), but are comprised of the same class of functions. Further, it is a fact that, suitably scaled, $J_a$ goes over into the Airy function as $a \to \infty$, and the analysis of [1] demonstrates that one may pass this limit inside the various multiple integrals in (4.1) and its analogues.

By a much different method, employing the Riccati correspondence, we show the same type of phenomena holds at all $\beta > 0$. From Theorem 2, the event that $\{\Lambda_0(\beta, a) > \lambda\}$ is equivalent in law to the process

$$ dp(x) = \tfrac{2}{\sqrt{\beta}}\, p(x)\, db(x) + \Big( \big(a + \tfrac{2}{\beta}\big) p(x) - p^2(x) - \lambda e^{-x} \Big) dx $$

never hitting $-\infty$. While from [15] we know that the probability of the event $\{TW_\beta < \mu\}$ equals the chance that a separate motion $q$ given by

$$ dq(x) = \tfrac{2}{\sqrt{\beta}}\, db(x) + \big( x + \mu - q^2(x) \big) dx \qquad (4.3) $$

also never hits $-\infty$. (Both processes are begun at $+\infty$.) The question is then: with the scalings

$$ a = 2\sqrt{\eta} - \tfrac{2}{\beta} > -1 \quad \text{and} \quad \lambda = \eta - \eta^{2/3}\mu, $$

does the chance of $p$-explosion go over into that of a $q$-explosion for large $\eta$? To understand the mechanism, set $\mu = 0$ for a moment. This scaled $p$ solves

$$ dp(x) = \tfrac{2}{\sqrt{\beta}}\, p(x)\, db(x) + \big( 2\sqrt{\eta}\, p(x) - p^2(x) - \eta e^{-x} \big) dx, $$

and obviously $\mathbf{p} = p/\sqrt{\eta}$ explodes or not with $p$ while satisfying

$$ d\mathbf{p}(x) = \tfrac{2}{\sqrt{\beta}}\, \mathbf{p}(x)\, db(x) + \sqrt{\eta}\, \big( 2\mathbf{p}(x) - \mathbf{p}^2(x) - e^{-x} \big) dx. $$
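The change of variables $\mathbf{p} = p/\sqrt{\eta}$ can be checked numerically at the level of drift coefficients, for general $\mu$. The sketch below assumes the reading of the scalings as $a = 2\sqrt{\eta} - 2/\beta$ and $\lambda = \eta - \eta^{2/3}\mu$; the function names and the sample parameter values are hypothetical and chosen only for illustration.

```python
import math

def hard_edge_params(eta, mu, beta):
    """Scalings a = 2*sqrt(eta) - 2/beta and lambda = eta - eta^(2/3)*mu."""
    a = 2.0 * math.sqrt(eta) - 2.0 / beta
    lam = eta - eta ** (2.0 / 3.0) * mu
    return a, lam

def drift_p(p, x, a, lam, beta):
    """Drift of the Riccati diffusion p: (a + 2/beta) p - p^2 - lambda e^{-x}."""
    return (a + 2.0 / beta) * p - p * p - lam * math.exp(-x)

def drift_scaled(pb, x, eta, mu):
    """Drift of p/sqrt(eta): sqrt(eta) (2 pb - pb^2 - (1 - eta^{-1/3} mu) e^{-x})."""
    return math.sqrt(eta) * (2.0 * pb - pb * pb - (1.0 - eta ** (-1.0 / 3.0) * mu) * math.exp(-x))

eta, mu, beta = 400.0, 0.5, 2.0
a, lam = hard_edge_params(eta, mu, beta)
pb, x = 1.3, 0.2
lhs = drift_p(math.sqrt(eta) * pb, x, a, lam, beta) / math.sqrt(eta)
rhs = drift_scaled(pb, x, eta, mu)
```

Here `lhs` is the drift of $p$ divided by $\sqrt{\eta}$ and `rhs` is the drift written directly in the scaled variable; the two agree up to rounding. The noise coefficient is linear in $p$, so it keeps the same form under the rescaling and is not checked separately.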

For η ↑ ∞, p comes quickly to the place p = 1, and, if it manages to tunnel through this point in a short time, explosion is hard to avoid. Within this excursion from 1+ to 1− in a small x-window, the q-motion emerges. To make this explicit we will use the following convergence criteria. Proposition 13. (After Theorem 11.1.4 of [18]). Let a(t, z) and b(t, z) be continuous from [0, ∞)× R into R. For each w ∈ R, let the solution of the martingale problem for a and b (diffusion and drift coefficients respectively) begun from w at t = s be unique. Denote this solution by Ps,w . Suppose next that there are {an } and {bn } satisfying sup sup sup (|an (t, z)| + |bn (t, z)|) < ∞ n≥1 t 0 and M > 0. Then, if Ps,w n n → P and bn starting from (s, w), Ps,w s,w .



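Proof of Theorem 3. Restoring a generic value of $\mu$ we write

$$ d\mathbf{p}(x) = \tfrac{2}{\sqrt{\beta}}\, \mathbf{p}(x)\, db(x) + \eta^{1/2} \Big( 2\mathbf{p}(x) - \mathbf{p}^2(x) - (1 - \eta^{-1/3}\mu)\, e^{-x} \Big) dx. \qquad (4.4) $$

Here $\mathbf{p}(0) = +\infty$, while to utilize the proposition it is convenient to move the starting point to a finite place. Certainly, $P_{+\infty}(\mathbf{p} \text{ never explodes}) \ge P_{1+\varepsilon}(\mathbf{p} \text{ never explodes})$ for whatever $\varepsilon > 0$. Also,

$$ P_{+\infty}(\mathbf{p} \text{ never explodes}) \le P_{+\infty}(\mathbf{p} \text{ never explodes},\ m_{1+\varepsilon} \le \delta) + P_{+\infty}(m_{1+\varepsilon} \ge \delta), $$

where $m_c$ is the first passage to the point $c$ and $\delta > 0$. By the Markov property and monotonicity, the first term on the right is less than the $(P_{\delta,1+\varepsilon})$-probability of no explosion. We wish to bound the second term from above for large $\eta$, and to that end note that $P_{+\infty}(m_a \le m_a^{\delta}) = 1$, where $m_a^{\delta}$ is the passage time of the homogeneous process $\mathbf{p}_\delta$ in which the appearance of $e^{-x}$ in the $\mathbf{p}$ drift is replaced by $e^{-\delta}$. (The obvious coupling is used.) Hence,

$$ P_{+\infty}(m_{1+\varepsilon} > \delta) \le \frac{1}{\delta}\, E_{+\infty}[m^{\delta}_{1+\varepsilon}] = \frac{1}{\delta} \int_{1+\varepsilon}^{\infty} \int_{1+\varepsilon}^{x} s(dy)\, m(dx) \qquad (4.5) $$

for $m(dx)$ and $s(dx)$ the speed and scale measures of $\mathbf{p}_\delta$:

$$ m(dx) = \frac{\beta}{2}\, \frac{1}{x^2}\, e^{-\sqrt{\eta}\,\psi(x)}\, dx, \qquad s(dx) = e^{\sqrt{\eta}\,\psi(x)}\, dx, \qquad \psi(x) = \frac{\beta}{2} \Big( x - 2\ln x - \frac{c_{\eta,\delta}}{x} \Big), $$

and $c_{\eta,\delta} = (1 - \mu\eta^{-1/3})\, e^{-\delta}$. Next choose

$$ \varepsilon = \varepsilon(\eta) = M\eta^{-1/6}, \qquad \delta = \delta(\eta) = \frac{1}{K}\,\eta^{-1/3}, \qquad (4.6) $$

where $K \ge 1$ and $M \ge \sqrt{|\mu| + 2}$. These last precautions imply that $\psi(x)$ is increasing for $x > 1 + \varepsilon$. Then an exercise in stationary phase allows the continuation of (4.5) as

$$ P_{+\infty}\big( m_{1+\varepsilon(\eta)} > \delta(\eta) \big) \le K\eta^{1/3} \int_{1+M\eta^{-1/6}}^{\infty} \frac{1}{x^2} \int_{1+M\eta^{-1/6}}^{x} e^{-\sqrt{\eta}\,[\psi(x) - \psi(y)]}\, dy\, dx \le C\,\frac{K}{M}, $$

for $\eta \uparrow \infty$ and a constant $C$ depending only on $\beta$, the inner integral concentrating at the upper limit $y = x$. In summary, for $\mathbf{p}$ paths we have that

$$ P_{0,1+\varepsilon(\eta)}(m_{-\infty} = \infty) \le P_{0,+\infty}(m_{-\infty} = \infty) \le P_{\delta(\eta),1+\varepsilon(\eta)}(m_{-\infty} = \infty) + C\,\frac{K}{M} \qquad (4.7) $$

holds for all large $\eta$. Now bring in

$$ q_\eta(x) = \eta^{1/6} \big( \mathbf{p}(\eta^{-1/3}x) - 1 \big), $$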



and note that, when $\mathbf{p}$ begins at $(0, \varepsilon(\eta))$, $q_\eta$ begins at $(0, M)$, and when $\mathbf{p}$ begins at $(\delta(\eta), \varepsilon(\eta))$, $q_\eta$ begins at $(K^{-1}, M)$. Further, $q_\eta$ hits $-\infty$ if and only if $\mathbf{p}$ does, and a substitution in (4.4) shows that $q_\eta$ satisfies the Itô equation

$$ dq_\eta(x) = \tfrac{2}{\sqrt{\beta}} \big( 1 + \eta^{-1/6} q_\eta(x) \big)\, d\hat{b}(x) + \Big( -q_\eta^2(x) + \eta^{1/3} \big[ 1 - (1 - \eta^{-1/3}\mu)\, e^{-\eta^{-1/3}x} \big] \Big) dx $$

with a new Brownian motion $\hat{b}(x) = \eta^{1/6}\, b(\eta^{-1/3}x)$. Given unique strong solutions in both instances, Proposition 13 easily applies with

$$ a_\eta(t, z) = (2/\beta)\,[1 + \eta^{-1/6}z]^2 \quad \text{and} \quad b_\eta(t, z) = -z^2 + \eta^{1/3} \big( 1 - (1 - \eta^{-1/3}\mu)\, e^{-\eta^{-1/3}t} \big), $$

the $q_\eta$-coefficients, and $a(t, z) = 2/\beta$ and $b(t, z) = -z^2 + \mu + t$, the $q$-coefficients (recall (4.3)). That is to say, $\lim_{\eta\to\infty} E_{x,c}[\phi(q_\eta)] = E_{x,c}[\phi(q)]$ for all bounded continuous functions $\phi$ of the path, and, by approximation we also find, via (4.6) and (4.7), that

$$ P_{0,M}(q \text{ never explodes}) \le \liminf_{\eta\to\infty} P_{0,\infty}(\mathbf{p} \text{ never explodes}) \le \limsup_{\eta\to\infty} P_{0,\infty}(\mathbf{p} \text{ never explodes}) \le P_{K^{-1},M}(q \text{ never explodes}) + C\,\frac{K}{M}. $$

Note while $q \mapsto m_{-\infty}(q)$ is not continuous, $q \mapsto m_{-L}(q)$ is for any $L$ finite (outside a set of measure zero). It follows that we have the distributional convergence of $m_{-L}(q_\eta)$ to $m_{-L}(q)$. The approximation required above is then to show that $\lim_{L\to\infty} m_{-L}(q) = m_{-\infty}(q)$ holds in probability, with the same limit taking place uniformly in $\eta$ when $q_\eta$ replaces $q$. That all processes involved have exit barriers at $-\infty$ makes this routine. To finish the proof, let $M$ and then $K$ tend to infinity. The $q$-law is continuous in its initial time, and that $\lim_{M\to\infty} P_{c,M} = P_{c,\infty}$ is a byproduct of $+\infty$ being an entrance point.

Acknowledgements. We thank P. Forrester for pointing out the transition problem to us, and also M. Krishnapur and T. Kurtz for helpful input. The work of the second author was supported in part by NSF grants DMS-0505680 and DMS-0645756.

References

1. Borodin, A., Forrester, P.J.: Increasing subsequences and the hard-to-soft transition in matrix ensembles. J. Phys. A: Math and Gen. 36(12), 2963–2982 (2003)
2. Bovier, A., Faggionato, A.: Spectral analysis of Sinai’s walk for small eigenvalues. Ann. Probab. 36(1), 198–254 (2008)
3. Brox, T.: A one-dimensional diffusion process in a Wiener medium. Ann. Probab. 14(4), 1206–1218 (1986)
4. Deift, P., Gioev, D., Kriecherbauer, T., Vanlessen, M.: Universality for orthogonal and symplectic Laguerre-type ensembles. J. Statist. Phys. 29(5–6), 949–1053 (2007)
5. Dumitriu, I., Edelman, A.: Matrix models for beta ensembles. J. Math. Phys. 43(11), 5830–5847 (2002)
6. Edelman, A., Sutton, B.: From random matrices to stochastic operators. J. Stat. Phys. 127(6), 1121–1165 (2007)
7. Edelman, A.: Eigenvalues and condition numbers of random matrices. SIAM J. Matrix Anal. Appl. 9, 543–560 (1988)
8. Ethier, S., Kurtz, T.: Markov processes. Wiley Series in Probability and Statistics. New York: John Wiley & Sons, 1986
9. Forrester, P.J.: Exact results and universal asymptotics in the Laguerre random matrix ensemble. J. Math. Phys. 35(5), 2519–2551 (1994)
10. Forrester, P.J.: Hard and soft edge spacing distributions for random matrix ensembles with orthogonal and symplectic symmetry. Nonlinearity 19, 2989–3002 (2006)
11. Halperin, B.I.: Green’s functions for a particle in a one-dimensional random potential. Phys. Rev (2) 139, A104–A117 (1965)
12. Itô, K., McKean, H.P.: Diffusion Processes and their Sample Paths. Berlin-Heidelberg-New York: Springer-Verlag, 1974
13. Killip, R., Stoiciu, M.: Eigenvalue Statistics for CMV Matrices: From Poisson to Clock via Circular Beta Ensembles. To appear, Duke Math. J. available at http://arxiv.org/abs/math-ph/0608002, 2006
14. Muirhead, R.J.: Aspects of Multivariate Statistical Theory. Wiley Series in Probability and Statistics, 1982
15. Ramírez, J., Rider, B., Virág, B.: Beta ensembles, stochastic Airy spectrum and a diffusion. Preprint, available at http://arXiv.org/abs/math/0607331v3, 2007
16. Silverstein, J.: The smallest eigenvalue of a large dimensional Wishart matrix. Ann Probab. 13(4), 1364–1368 (1985)
17. Simon, B.: Trace Ideals and their Applications. Cambridge-New York: Cambridge University Press, 1974
18. Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Springer-Verlag, Berlin-New York, 1997
19. Talet, M.: Annealed tail estimates for a Brownian motion in a drifted Brownian potential. Ann. Probab. 35(1), 32–67 (2007)
20. Telatar, E.: Capacity of multi-antenna Gaussian channels. European Trans. Telecom. 10(6), 585–596 (1999)
21. Tracy, C., Widom, H.: Level spacing distributions and the Airy kernel. Commun. Math. Phys. 159(1), 151–174 (1994)
22. Tracy, C., Widom, H.: Level spacing distributions and the Bessel kernel. Comm. Math. Phys. 161(2), 289–309 (1994)
23. Tracy, C., Widom, H.: On orthogonal and symplectic matrix ensembles. Comm. Math. Phys. 177(3), 727–754 (1996)
24. Trotter, H.F.: Eigenvalue distributions of large Hermitian matrices; Wigner’s semicircle law and a theorem of Kac, Murdock, and Szegö. Adv. in Math. 54(1), 67–82 (1984)
25. Valko, B., Virág, B.: Continuum limits of random matrices and the Brownian carousel. Preprint, available at http://arxiv.org/abs/0712.2000v3, 2008
26. Vanlessen, M.: Strong asymptotics of Laguerre-type orthogonal polynomials and applications in random matrix theory. Constr. Approx. 25(2), 125–175 (2007)
27. Verbaarschot, J.: Spectrum of the QCD Dirac operator and chiral random matrix theory. Phys. Rev. Lett. 72, 2531–2533 (1994)

Communicated by H. Spohn


Communications in


On the Spectrum and Lyapunov Exponent of Limit Periodic Schrödinger Operators

Artur Avila

CNRS UMR 7599, Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie–Boîte Courrier 188, 75252 Paris Cedex 05, France

Received: 12 May 2008 / Accepted: 24 July 2008 Published online: 27 November 2008 – © Springer-Verlag 2008

Abstract: We exhibit a dense set of limit periodic potentials for which the corresponding one-dimensional Schrödinger operator has a positive Lyapunov exponent for all energies and a spectrum of zero Lebesgue measure. No example with those properties was previously known, even in the larger class of ergodic potentials. We also conclude that the generic limit periodic potential has a spectrum of zero Lebesgue measure.

1. Introduction

This work is motivated by a question in the theory of one-dimensional ergodic Schrödinger operators. Those are bounded self-adjoint operators of $\ell^2(Z)$ given by

$$ (Hu)_n = u_{n+1} + u_{n-1} + v(f^n(x))\, u_n, \qquad (1.1) $$

where $f : X \to X$ is an invertible measurable transformation preserving an ergodic probability measure $\mu$ and $v : X \to R$ is a bounded measurable function, called the potential. One is interested in the behavior for $\mu$-almost every $x$. In this case, the spectrum is $\mu$-almost surely independent of $x$. The Lyapunov exponent is defined as

$$ L(E) = \lim_{n\to\infty} \frac{1}{n} \int \ln \|A^{(E)}_n(x)\|\, d\mu(x), \qquad (1.2) $$

where $A^{(E)}_n$ is the $n$-step transfer matrix of the Schrödinger equation $Hu = Eu$. Here we will give first examples of ergodic potentials with a spectrum of zero Lebesgue measure such that the Lyapunov exponent is positive throughout the spectrum. This answers a question raised by Barry Simon (Conjecture 8.7 of [S]).

Current address: IMPA, Estrada Dona Castorina 110, Rio de Janeiro, 22460-320, Brazil. E-mail: [email protected]; [email protected]
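To make the definition (1.2) concrete, here is a minimal numerical sketch for a periodic potential, where the transfer matrices can be multiplied explicitly. It assumes the standard convention $(u_{n+1}, u_n)^T = A\,(u_n, u_{n-1})^T$ for the one-step matrix and uses the Frobenius norm in place of the operator norm (the two give the same limit); the function names are illustrative, and the average over $x$ in (1.2) is dropped because for a periodic potential every phase gives the same limit.

```python
import math

def transfer_matrix(energy, v):
    """One-step transfer matrix of H u = E u with potential value v."""
    return ((energy - v, -1.0), (1.0, 0.0))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2))
        for i in range(2)
    )

def lyapunov_estimate(energy, potential, n_steps=20000):
    """Estimate (1/n) ln ||A_n|| for a periodic potential given as a list."""
    product = ((1.0, 0.0), (0.0, 1.0))
    log_norm = 0.0
    for n in range(n_steps):
        v = potential[n % len(potential)]
        product = matmul(transfer_matrix(energy, v), product)
        norm = math.sqrt(sum(x * x for row in product for x in row))
        # Renormalize at every step to avoid overflow, keeping track of the log.
        product = tuple(tuple(x / norm for x in row) for row in product)
        log_norm += math.log(norm)
    return log_norm / n_steps

outside = lyapunov_estimate(3.0, [0.0])  # energy outside the free spectrum [-2, 2]
inside = lyapunov_estimate(0.5, [0.0])   # energy inside the free spectrum
```

For the zero potential the estimate is close to $\ln\big((E + \sqrt{E^2 - 4})/2\big)$ when $E > 2$ and close to zero inside $[-2, 2]$, which is the usual picture of a Lyapunov exponent vanishing on the spectrum of a periodic operator.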


The example we will construct will belong to the class of limit periodic potentials. Those arise from continuous potentials over a minimal translation of a Cantor group (see Sect. 2 for a discussion of those notions). In our approach, we fix the underlying dynamics and vary the potential: it turns out that a dense set of such potentials provide counterexamples. It is actually possible to incorporate a coupling parameter in our construction. Here is a precise version that can be obtained from our technique:

Theorem 1.1. Let $f : X \to X$ be a minimal translation of a Cantor group. For a dense set of $v \in C^0(X, R)$ and for every $\lambda \ne 0$, the Schrödinger operator with potential $\lambda v$ has a spectrum of zero Lebesgue measure, and the Lyapunov exponent is a continuous positive function of the energy.

Our result implies, by continuity of the spectrum, that a generic potential over a minimal translation of a Cantor group has a spectrum of zero Lebesgue measure.

Corollary 1.2. Let $f : X \to X$ be a minimal translation of a Cantor group. For generic $v \in C^0(X, R)$, and for every $\lambda \ne 0$, the Schrödinger operator with potential $\lambda v$ has a spectrum of zero Lebesgue measure (and the Lyapunov exponent is a continuous function of the energy which vanishes over the spectrum).

The statements about the Lyapunov exponent in the generic context are rather obvious consequences of upper semicontinuity and density of periodic potentials. They highlight however that the generic approach is too rough and that care must be taken in the proof of Theorem 1.1 in order not to lose the Lyapunov exponent.

Remark 1.1. Lebesgue measure zero can be strengthened to Hausdorff dimension zero in both Theorem 1.1 and Corollary 1.2. It suffices to replace the 10th power by an arbitrarily large one in (2), Lemma 3.2, without qualitative impact in the proof, and to replace (3), Lemma 3.3, by the covering estimate which the argument is giving. This remark was prompted by a recent result (based on a different method) of Last-Shamis about the Hausdorff dimension of the spectrum of the critical almost Mathieu operator.

2. Preliminaries

2.1. From limit periodic sequences to Cantor groups. Limit periodic potentials are discussed in depth in [AS]. Here we will restrict ourselves to some basic facts used in this paper. Let $\sigma$ be the shift operator on $\ell^\infty(Z)$, that is, $(\sigma(x))_n = x_{n+1}$. Let $\mathrm{orb}(x) = \{\sigma^k(x),\ k \in Z\}$. We say that $x$ is periodic if $\mathrm{orb}(x)$ is finite. We say that $x$ is limit periodic if it belongs to the closure, in $\ell^\infty(Z)$, of the set of periodic sequences. If $x$ is limit periodic, we let $\mathrm{hull}(x)$ be the closure of $\mathrm{orb}(x)$ in $\ell^\infty(Z)$. It is easy to see that every $y \in \mathrm{hull}(x)$ is limit periodic.

Lemma 2.1. If $x$ is limit periodic then $\mathrm{hull}(x)$ is compact and it has a unique topological group structure with identity $x$ such that $Z \to \mathrm{hull}(x)$, $k \mapsto \sigma^k(x)$ is a homomorphism. Moreover, the group structure is Abelian and there exist arbitrarily small compact open neighborhoods of $x$ in $\mathrm{hull}(x)$ which are finite index subgroups.
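A toy limit periodic sequence may help fix ideas. The sketch below uses $x_n = \sum_{j\ge 1} 2^{-j}\cos(2\pi n/2^j)$, which is an example chosen here and not taken from the text; its truncation at level $J$ is periodic with period $2^J$ and lies within $2^{-J}$ of $x$ in the sup norm, so $x$ belongs to the closure of the periodic sequences. The function names are illustrative.

```python
import math

def truncated_sequence(n, levels):
    """n-th entry of the partial sum sum_{j=1}^{levels} 2^{-j} cos(2 pi n / 2^j)."""
    return sum(
        2.0 ** (-j) * math.cos(2.0 * math.pi * n / 2 ** j)
        for j in range(1, levels + 1)
    )

def sup_distance(levels_a, levels_b, window):
    """Sup-norm distance between two truncations, restricted to a finite window."""
    return max(
        abs(truncated_sequence(n, levels_a) - truncated_sequence(n, levels_b))
        for n in window
    )

J = 4
period = 2 ** J
# Distance from the period-16 truncation to a much deeper truncation standing in for x.
tail = sup_distance(J, 40, range(-200, 200))
```

The value `tail` stays below $2^{-J}$, in line with the geometric bound on the discarded terms, and shifting the level-$J$ truncation by `period` leaves it unchanged.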

Limit Periodic Operators


Proof. Recall that a metric space is called totally bounded if for every $\epsilon > 0$ it is contained in the $\epsilon$-neighborhood of a finite set. It is easy to see that a totally bounded subset of a complete metric space has compact closure. If $x$ is limit periodic then $\mathrm{orb}(x)$ is totally bounded: indeed if $p$ is periodic and $\|x - p\|_\infty < \epsilon$ then $\mathrm{orb}(x)$ is contained in the $\epsilon$-neighborhood of $\mathrm{orb}(p)$. Since $\ell^\infty(\mathbb{Z})$ is a Banach space, $\mathrm{hull}(x)$ is compact.

Clearly there exists a unique (cyclic) group structure on $\mathrm{orb}(x)$ such that the map $\mathbb{Z} \to \mathrm{orb}(x)$, $k \mapsto \sigma^k(x)$ is a homomorphism. Let us show that the group structure is uniformly continuous. We have

$$
\begin{aligned}
\|\sigma^{k+l}(x) - \sigma^{k'+l'}(x)\|_\infty &= \|\sigma^{k-k'}(x) - \sigma^{l'-l}(x)\|_\infty \\
&\le \|\sigma^{k-k'}(x) - x\|_\infty + \|x - \sigma^{l'-l}(x)\|_\infty \\
&= \|\sigma^{k}(x) - \sigma^{k'}(x)\|_\infty + \|\sigma^{l}(x) - \sigma^{l'}(x)\|_\infty,
\end{aligned}
\tag{2.1}
$$

where the inequality is just the triangle inequality and the equalities follow from the fact that $\sigma$ is an isometry of $\ell^\infty(\mathbb{Z})$. Thus if $y, z, y', z' \in \mathrm{orb}(x)$ then $\|y \cdot z - y' \cdot z'\|_\infty \le \|y - y'\|_\infty + \|z - z'\|_\infty$, which shows the uniform (even Lipschitz) continuity. By uniform continuity, the group structure on $\mathrm{orb}(x)$ has a unique continuous extension to $\mathrm{hull}(x)$. Since the group structure on $\mathrm{orb}(x)$ is Abelian, its extension is still Abelian.

For the last statement, fix $\epsilon > 0$ and let $p$ be periodic with $\|x - p\|_\infty < \epsilon/2$. Let $k$ be such that $\sigma^k(p) = p$. Clearly the closure $\mathrm{hull}_k(x)$ of $\{\sigma^{kn}(x),\ n \in \mathbb{Z}\}$ is a compact subgroup of $\mathrm{hull}(x)$ of index at most $k$. Since $\mathrm{hull}(x)$ is the union of finitely many disjoint translates of $\mathrm{hull}_k(x)$, it follows that $\mathrm{hull}_k(x)$ is also open. Since $\sigma$ is an isometry, $\mathrm{hull}_k(x)$ is contained in the $\epsilon/2$-ball around $p$, and hence it is contained in the $\epsilon$-ball around $x$.
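The definition of a limit periodic sequence can be made concrete with a small numerical sketch. The sequence used here, $x_n = \sum_{k \ge 1} 2^{-k} \cos(2\pi n / 2^k)$, and the helper names are illustrative assumptions rather than objects from the paper: each truncation is periodic with period $2^{\text{levels}}$, and the truncations converge uniformly, so the limit is limit periodic.

```python
import math


def periodic_approx(n: int, levels: int) -> float:
    """Truncation p_n = sum_{k=1}^{levels} 2^{-k} cos(2*pi*n / 2^k).

    This sequence is periodic in n with period 2**levels (hypothetical example).
    """
    return sum(
        2.0 ** -k * math.cos(2.0 * math.pi * n / 2 ** k)
        for k in range(1, levels + 1)
    )


def sup_distance(levels_a: int, levels_b: int, window: range) -> float:
    """Sup-norm distance between two truncations, sampled on a finite window."""
    return max(
        abs(periodic_approx(n, levels_a) - periodic_approx(n, levels_b))
        for n in window
    )


WINDOW = range(-512, 513)
# The tail sum_{k > 4} 2^{-k} bounds the distance between truncations 4 and 12.
DISTANCE_4_12 = sup_distance(4, 12, WINDOW)
```

The distance between the level-4 and level-12 truncations stays below $2^{-4}$, which is the uniform approximation by periodic sequences required by the definition.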

By the previous lemma, $\mathrm{hull}(x)$ is compact and totally disconnected, so it is either finite (if and only if $x$ is periodic) or it is a Cantor set. If $x$ is limit periodic but not periodic, we see that every $y$ in $\mathrm{hull}(x)$ (which is also a limit periodic sequence) is of the form $y_n = v(f^n(y))$, where $f$ is a minimal translation of a Cantor group ($f = \sigma|_{\mathrm{hull}(x)}$) and $v$ is continuous ($v(w) = w_0$).

2.2. From Cantor groups to limit periodic sequences. Let us now consider a Cantor group $X$ and let $t \in X$. Let $f : X \to X$ be the translation by $t$. We say that $f$ is minimal if $\{f^n(y),\ n \in \mathbb{Z}\}$ is dense in $X$ for every $y \in X$. This is equivalent to $\{t^n,\ n \in \mathbb{Z}\}$ being dense in $X$. In this case, since there exists a dense cyclic subgroup, we conclude that $X$ is actually Abelian. Let $v : X \to \mathbb{R}$ be any continuous function. Let $\phi : X \to \ell^\infty(\mathbb{Z})$, $\phi(x) = (v(f^n(x)))_{n \in \mathbb{Z}}$.

Lemma 2.2. For every $x \in X$, $\phi(x)$ is limit periodic and $\phi(X) = \mathrm{hull}(\phi(x))$.

Proof. It is enough to show that $\phi(x)$ is limit periodic, since $\phi(X)$ is compact and $\mathrm{orb}(\phi(x))$ is the image under $\phi$ of the set $\{f^n(x),\ n \in \mathbb{Z}\}$, which is dense in $X$. Given $\delta > 0$ we must find a periodic sequence $p$ such that $\|\phi(x) - p\|_\infty \le \delta$. Choose a compact open neighborhood $W$ of the identity of $X$ which is so small that if $y \in W$ then $|v(y \cdot z) - v(z)| \le \delta$. Introduce a metric $d$ on $X$, compatible with the topology. Let $\epsilon > 0$ be such that if $y, z \in X$ are such that $y \in W$ and $z \notin W$ then $d(y, z) > \epsilon$. Choose $m > 0$ such that $t^m$


A. Avila

is so close to the identity that for every $y \in X$, $d(y, f^m(y)) < \epsilon$. Then by induction on $|k|$, $t^{mk} \in W$ for every $k \in \mathbb{Z}$. It follows that the closure of $\{t^{km},\ k \in \mathbb{Z}\}$ is a compact subgroup of $X$ contained in $W$. Clearly it has index at most $m$. Let $p \in \ell^\infty(\mathbb{Z})$ be given by $p_i = v(f^j(x))$, where $0 \le j \le m - 1$ is such that $i = j \bmod m$. Then $|\phi(x)_i - p_i| = |v(f^i(x)) - v(f^j(x))| = |v(y \cdot z) - v(z)|$, where $z = f^j(x)$ and $y = t^{i-j}$. Since $i = j \bmod m$, $t^{i-j} \in W$, and by the choice of $W$ we have $|\phi(x)_i - p_i| \le \delta$. It follows that $\|\phi(x) - p\|_\infty \le \delta$ as desired.

Remark 2.1. By the proof above, there exist arbitrarily small compact subgroups of finite index of $X$ (such subgroups are automatically open as before).

2.3. Limit periodic Schrödinger operators. Given $f : X \to X$ a minimal translation of a Cantor group and $v : X \to \mathbb{R}$ a continuous function, we define for every $x \in X$ a Schrödinger operator $H = H_{f,v,x}$ by (1.1). A formal solution of $Hu = Eu$ satisfies

$$
A_n^{(E,f,v)}(x) \begin{pmatrix} u_0 \\ u_{-1} \end{pmatrix} = \begin{pmatrix} u_n \\ u_{n-1} \end{pmatrix},
\tag{2.2}
$$

where

$$
A_n^{(E)}(x) = A_n^{(E,f,v)}(x) = S_{n-1} \cdots S_0, \quad \text{where } S_i = \begin{pmatrix} E - v(f^i(x)) & -1 \\ 1 & 0 \end{pmatrix}.
\tag{2.3}
$$

The $A_n^{(E)}(x)$ are thus in $\mathrm{SL}(2, \mathbb{R})$, and are called the $n$-step transfer matrices. The Lyapunov exponent $L(E) = L(E, f, v)$ is defined by (1.2), where we take $\mu$ the Haar probability measure on $X$ (this is the only possible choice actually, since minimal translations of Cantor groups are uniquely ergodic). (The limit in (1.2) exists by subadditivity, which also shows that $\lim$ may be replaced by $\inf$.)

Remark 2.2. By subadditivity, $\frac{1}{2^k} \int \ln \|A_{2^k}^{(E)}(x)\| \, d\mu(x)$ is a decreasing sequence converging to $L(E)$. Allowing $E$ to take values in $\mathbb{C}$, we conclude that $E \mapsto L(E)$ is the real part of a subharmonic function.

Lemma 2.3. If $n \ge 2$, for every non-zero vector $z \in \mathbb{R}^2$, the derivative (with respect to $E$) of the argument of $A_n^{(E,f,v)}(x) z$ is strictly negative.

Proof. Let $\rho_n(E, x, z)$ be the derivative (with respect to $E$) of the argument of $A_n^{(E,f,v)}(x) z$. It is easy to see that $\rho_1(E, x, z)$ is strictly negative whenever $z$ is not vertical, and it is zero if $z$ is vertical. By the chain rule, for $n \ge 2$,

$$
\rho_n(E, x, z) = \sum_{i=1}^{n} \kappa_i \, \rho_1\bigl(E, f^{i-1}(x), A_{i-1}^{(E,f,v)}(x) z\bigr),
$$

where the $\kappa_i$ are strictly positive (since $A_{n-i}^{(E,f,v)}(f^i(x)) \in \mathrm{SL}(2, \mathbb{R})$ and hence preserves orientation). Since either $z$ or $A_1^{(E,f,v)}(x) z$ is non-vertical, the result follows.
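The relation (2.2) between transfer matrices and formal solutions can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the potential is given as a finite list of sampled values $v(f^i(x))$ repeated periodically, and the recursion $u_{i+1} = (E - v_i) u_i - u_{i-1}$ is the one encoded by the matrices $S_i$ in (2.3). All function names are hypothetical.

```python
from typing import Sequence, Tuple

Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def step_matrix(energy: float, v_i: float) -> Matrix:
    """One-step matrix S_i = [[E - v_i, -1], [1, 0]] from (2.3)."""
    return ((energy - v_i, -1.0), (1.0, 0.0))


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def transfer_matrix(energy: float, samples: Sequence[float], n: int) -> Matrix:
    """n-step transfer matrix A_n = S_{n-1} ... S_0 (samples repeat periodically)."""
    result: Matrix = ((1.0, 0.0), (0.0, 1.0))
    for i in range(n):
        result = matmul(step_matrix(energy, samples[i % len(samples)]), result)
    return result


def iterate_solution(
    energy: float, samples: Sequence[float], u0: float, u_minus1: float, n: int
) -> Tuple[float, float]:
    """Iterate u_{i+1} = (E - v_i) u_i - u_{i-1}; return (u_n, u_{n-1})."""
    prev, cur = u_minus1, u0
    for i in range(n):
        prev, cur = cur, (energy - samples[i % len(samples)]) * cur - prev
    return cur, prev
```

Each $S_i$ has determinant 1, so the product stays in $\mathrm{SL}(2, \mathbb{R})$, and applying $A_n$ to $(u_0, u_{-1})$ reproduces $(u_n, u_{n-1})$.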

2.3.1. Let us endow the space $\mathcal{H}$ of bounded self-adjoint operators of $\ell^2(\mathbb{Z})$ with the norm $\|\Phi\| = \sup_{\|u\|_2 = 1} \|\Phi(u)\|_2$, and the space of compact subsets of $\mathbb{R}$ with the Caratheodory metric ($d(A, B)$ is the infimum of all $r$ such that $A$ is contained in the $r$-neighborhood of $B$ and $B$ is contained in the $r$-neighborhood of $A$). With respect to those metrics, it is easy to see that the spectrum is a 1-Lipschitz function of $\Phi \in \mathcal{H}$. Since the map $C^0(X, \mathbb{R}) \to \mathcal{H}$, $v \mapsto H_{f,v,x}$ is also 1-Lipschitz, we conclude that the spectrum of $H_{f,v,x}$ is a 1-Lipschitz function of $v \in C^0(X, \mathbb{R})$. It also follows that the spectrum of $H_{f,v,x}$ depends continuously on $x$. Since $H_{f,v,x}$ and $H_{f,v,f(x)}$ obviously have the same spectrum, and $f$ is minimal, we conclude that the spectrum is actually $x$-independent. We will denote it $\Sigma(f, v)$.

2.3.2. We say that $v$ is periodic (of period $n \ge 1$) if $v(f^n(x)) = v(x)$ for every $x \in X$. If $v$ is a periodic potential, then it is locally constant, hence for any compact subgroup $Y \subset X$ contained in a sufficiently small neighborhood of $\mathrm{id}$, the function $v$ is defined over $X/Y$. If $v \in C^0(X, \mathbb{R})$ and $Y \subset X$ is a compact subgroup of finite index, then we can define another potential $v^Y$ by convolution with $Y$: $v^Y(x) = \int_Y v(y \cdot x) \, d\mu_Y$, where $\mu_Y$ is the Haar measure on $Y$. The potential $v^Y$ is then periodic. Since there are compact subgroups with finite index contained in arbitrarily small neighborhoods of $\mathrm{id}$, this shows that the set of periodic potentials is dense in $C^0(X, \mathbb{R})$.

2.3.3. If $v$ is $n$-periodic then $\mathrm{tr}\, A_n^{(E,f,v)}(x)$ is $x$-independent and denoted $\psi(E)$. Then $L(E, f, v)$ is $\frac{1}{n}$ times the logarithm of the spectral radius of $A_n^{(E,f,v)}(x)$, for any $x \in X$. This shows that the Lyapunov exponent is a continuous function of both the potential and the energy when one restricts considerations to potentials of period $n$.

2.3.4. We will need some basic facts on the spectrum of periodic potentials, see [AMS], Sect. 3, for a discussion with further references. If $v$ is periodic of period $n$ the spectrum $\Sigma(f, v)$ of $H$ is the set of $E \in \mathbb{R}$ such that $|\psi(E)| \le 2$. Thus for periodic potentials, we have $\Sigma(f, v) = \{E \in \mathbb{R},\ L(E, f, v) = 0\}$. The function $\psi$ is a polynomial of degree $n$. It can be shown that $\psi$ has $n$ distinct real roots and its critical values do not belong to $(-2, 2)$; moreover, $E$ is a critical point of $\psi$ with $\psi(E) = \pm 2$ if and only if $A_n^{(E,f,v)}(x) = \pm\mathrm{id}$. From this one derives a number of consequences about the structure of periodic spectra:

(1) The set of all $E$ such that $|\psi(E)| < 2$ has $n$ connected components whose closures are called bands.
(2) If $E$ is in the boundary of some band, we obviously have $\mathrm{tr}\, A_n^{(E,f,v)}(x) = \pm 2$.
(3) Conversely, if $\mathrm{tr}\, A_n^{(E,f,v)}(x) = \pm 2$, $E$ is in the boundary of some band, thus the spectrum is the union of the bands.
(4) If two different bands intersect then their common boundary point satisfies $A_n^{(E,f,v)}(x) = \pm\mathrm{id}$.

2.3.5. We will need some simple estimates on the Lebesgue measure of the bands and of the spectrum.

Lemma 2.4. Let $v$ be a periodic potential of period $n$.

(1) The measure of each band is at most $\frac{2\pi}{n}$.
(2) Let $C \ge 1$ be such that for every $E$ in the union of bands, there exist $x \in X$ and $k \ge 1$ such that $\|A_k^{(E,f,v)}(x)\| \ge C$. Then the total measure of the spectrum is at most $\frac{4\pi n}{C}$.

Proof. If $E$ belongs to some band, $A_n^{(E,f,v)}(x)$ is conjugate in $\mathrm{SL}(2, \mathbb{R})$ to a rotation: there exists $B^{(E)}(x) \in \mathrm{SL}(2, \mathbb{R})$ such that $B^{(E)}(x) A_n^{(E,f,v)}(x) B^{(E)}(x)^{-1} \in \mathrm{SO}(2, \mathbb{R})$.


This matrix is not unique, since $R B^{(E)}(x)$ has the same property for $R \in \mathrm{SO}(2, \mathbb{R})$, but this is the only ambiguity. In particular, the Hilbert-Schmidt norm squared $\|B^{(E)}(x)\|_{\mathrm{HS}}^2$ (the sum of the squares of the entries of the matrix of $B^{(E)}(x)$) is a well defined function $b^{(E)}(x)$, which obviously satisfies $b^{(E)}(f^n(x)) = b^{(E)}(x)$. This allows us to define an $x$-independent function $\hat{b}(E)$ which is zero if $E$ does not belong to a band and for $E$ in a band is given by

$$
\hat{b}(E) = \frac{1}{4\pi n} \sum_{i=0}^{n-1} b^{(E)}(f^i(x)).
\tag{2.4}
$$

It turns out that $\hat{b}(E)$ is related to the integrated density of states by the formula $N(E) = \int_{-\infty}^{E} \hat{b}(E) \, dE$. As a consequence, we conclude that for any band $I \subset \Sigma(f, v)$, $\int_I \hat{b}(E) \, dE = \frac{1}{n}$ (in particular $\int_{\mathbb{R}} \hat{b}(E) \, dE = 1$). See [AD2], Sect. 2.4.1 for a discussion of this point of view on the integrated density of states.

The first statement is then an immediate consequence of $\hat{b}(E) \ge \frac{1}{2\pi}$, which in turn comes from the estimate $\|B\|_{\mathrm{HS}}^2 \ge 2$, $B \in \mathrm{SL}(2, \mathbb{R})$.

For the second estimate, it is enough to show that for every $E$ in a band we have $\hat{b}(E) \ge \frac{C}{4\pi n}$. Notice that

$$
\begin{aligned}
& B^{(E)}(f^k(x)) A_k^{(E,f,v)}(x) A_n^{(E,f,v)}(x) A_k^{(E,f,v)}(x)^{-1} B^{(E)}(f^k(x))^{-1} \\
&\qquad = B^{(E)}(f^k(x)) A_n^{(E,f,v)}(f^k(x)) B^{(E)}(f^k(x))^{-1} \in \mathrm{SO}(2, \mathbb{R}).
\end{aligned}
\tag{2.5}
$$

Thus $B^{(E)}(f^k(x)) A_k^{(E,f,v)}(x)$ conjugates $A_n^{(E,f,v)}(x)$ to a rotation so it coincides with $R B^{(E)}(x)$ for some $R \in \mathrm{SO}(2, \mathbb{R})$. Thus

$$
C \le \|A_k^{(E,f,v)}(x)\| \le \|B^{(E)}(f^k(x))^{-1}\| \, \|B^{(E)}(x)\|,
\tag{2.6}
$$

and there exists $y \in X$ (either $y = x$ or $y = f^k(x)$) such that $C \le \|B^{(E)}(y)\|^2 \le b^{(E)}(y)$. It follows that $\hat{b}(E) \ge \frac{C}{4\pi n}$.
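The band structure of Sect. 2.3.4 and the first bound of Lemma 2.4 can be explored on a toy case. The following sketch is only an illustration under assumptions not taken from the paper: the potential is the period-2 list `[1.0, -1.0]`, for which $\psi(E) = E^2 - 3$, and bands are located on a uniform energy grid, so the endpoints are accurate only up to the grid step. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def trace_psi(energy: float, potential: Sequence[float]) -> float:
    """psi(E) = tr(S_{n-1} ... S_0) with S_i = [[E - v_i, -1], [1, 0]]."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for v in potential:
        # Left-multiply the running product [[a, b], [c, d]] by S_i.
        a, b, c, d = (energy - v) * a - c, (energy - v) * b - d, a, b
    return a + d


def find_bands(
    potential: Sequence[float], lo: float, hi: float, steps: int
) -> List[Tuple[float, float]]:
    """Maximal runs of grid points in [lo, hi] where |psi(E)| <= 2."""
    h = (hi - lo) / steps
    found: List[Tuple[float, float]] = []
    start = None
    for i in range(steps + 1):
        energy = lo + i * h
        inside = abs(trace_psi(energy, potential)) <= 2.0
        if inside and start is None:
            start = energy
        elif not inside and start is not None:
            found.append((start, energy - h))
            start = None
    if start is not None:
        found.append((start, hi))
    return found


def lyapunov_periodic(energy: float, potential: Sequence[float]) -> float:
    """(1/n) * log of the spectral radius of A_n, computed from the trace."""
    t = abs(trace_psi(energy, potential))
    if t <= 2.0:
        return 0.0
    return math.log((t + math.sqrt(t * t - 4.0)) / 2.0) / len(potential)


EXAMPLE_POTENTIAL = [1.0, -1.0]
EXAMPLE_BANDS = find_bands(EXAMPLE_POTENTIAL, -4.0, 4.0, 80000)
```

In this example the two bands are $[-\sqrt{5}, -1]$ and $[1, \sqrt{5}]$, each of length $\sqrt{5} - 1 \approx 1.236$, which is below the bound $2\pi/n = \pi$; the Lyapunov exponent vanishes on the bands and is positive in the gaps.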

2.3.6. We conclude with a weak continuity result for the Lyapunov exponent.

Lemma 2.5. Let $v^{(n)} \in C^0(X, \mathbb{R})$ be a sequence converging uniformly to $v \in C^0(X, \mathbb{R})$. Then $L(E, f, v^{(n)}) \to L(E, f, v)$ in $L^1_{\mathrm{loc}}$.

Proof. This follows from the proof of Lemma 1 of [AD1]. Indeed for every compact interval $I \subset \mathbb{R}$, there exists a continuous function $g : I \to \mathbb{R}$, non-vanishing in $\mathrm{int}\, I$, such that

$$
\lim_{n \to \infty} \int_I \max\{L(E, f, v^{(n)}) - L(E, f, v), 0\} \, g(E) \, dE = 0
\tag{2.7}
$$

and

$$
\lim_{n \to \infty} \int_I \min\{L(E, f, v^{(n)}) - L(E, f, v), 0\} \, g(E) \, dE = 0
\tag{2.8}
$$

(see the last two equations in p. 396 of [AD1]). The result follows.



3. Proof of Theorem 1.1

Fix some Cantor group $X$, and let $f : X \to X$ be a minimal translation. Then the homomorphism $\mathbb{Z} \to X$, $n \mapsto f^n(\mathrm{id})$ is injective with dense image. For simplicity of notation, we identify the integers with their image under this homomorphism. For a given potential $w \in C^0(X, \mathbb{R})$ and $n \ge 1$, we write $L(E, w) = L(E, f, w)$ for the Lyapunov exponent with energy $E$ corresponding to the potential $w$.

Since $X$ is Cantor, there exists a decreasing sequence of Cantor subgroups $X_k \subset X$ with finite index such that $\bigcap X_k = \{0\}$. Let $P_k$ be the set of potentials which are defined on $X/X_k$. Potentials in $P_k$ are $n_k$-periodic where $n_k$ is the index of $X_k$. If $w \in C^0(X, \mathbb{R})$ is a periodic potential, then it belongs to some $P_k$. Let $P = \bigcup P_k$ be the set of periodic potentials (which is a dense subset of $C^0(X, \mathbb{R})$, see Sect. 2.3.2).

For $n \ge 1$, we write $A_n^{(E,w)}(x) = A_n^{(E,f,w)}(x)$ for the $n$-step transfer matrix associated with the potential $w$ at $x$. We also let $A_n^{(E,w)} = A_n^{(E,w)}(0)$. The spectrum will be denoted by $\Sigma(w) = \Sigma(f, w)$.

We will actually work with finite families $W$ of periodic potentials. Here we allow for multiplicity of elements, so the number of elements in $W$, denoted by $\#W$, may be larger than the number of distinct elements of $W$. For simplicity of notation, we will often treat $W$ as a set (writing for instance $W \subset P$). We write $L(E, W) = \frac{1}{\#W} \sum_{w \in W} L(E, w)$. (More formally, and generally, one could work with probability measures with compact support contained in $P_k$ for all $k$ sufficiently large.)

The core of the construction is contained in the following two lemmas.

Lemma 3.1. Let $B$ be an open ball in $C^0(X, \mathbb{R})$, let $W \subset P \cap B$ be a finite family of potentials, and let $M \ge 1$. Then there exists a sequence $W^n \subset P \cap B$ such that

(1) $L(E, \lambda W^n) > 0$ whenever $M^{-1} \le |\lambda| \le M$, $E \in \mathbb{R}$,
(2) $L(E, \lambda W^n) \to L(E, \lambda W)$ uniformly on compacts (as functions of $(E, \lambda) \in \mathbb{R}^2$).

Lemma 3.2. Let $B$ be an open ball in $C^0(X, \mathbb{R})$ and let $W \subset P \cap B$ be a finite family of potentials.
Then for every $K$ sufficiently large, there exists $W^K \subset P_K \cap B$ such that

(1) $L(E, \lambda W^K) \to L(E, \lambda W)$ uniformly on compacts (as functions of $(E, \lambda) \in \mathbb{R}^2$).
(2) The diameter of $W^K$ is at most $n_K^{-10}$.
(3) For every $\lambda \in \mathbb{R}$, if $\inf_{E \in \mathbb{R}} L(E, \lambda W) \ge \delta \, \#W \, n_k$ then for every $w \in W^K$, $\Sigma(\lambda w)$ has Lebesgue measure at most $e^{-\delta n_K / 2}$.

Before proving the lemmas, let us conclude the proof of Theorem 1.1. First we combine both lemmas:

Lemma 3.3. Let $B \subset C^0(X, \mathbb{R})$ be an open ball and let $W \subset P \cap B$ be a finite family of potentials. Then for every $M \ge 1$, there exist $\delta > 0$, an open ball $B'$ with closure contained in $B$, with diameter at most $M^{-1}$, and $W' \subset P \cap B'$ such that

(1) $|L(E, \lambda W') - L(E, \lambda W)| < M^{-1}$ for $|E|, |\lambda| \le M$.
(2) $L(E, \lambda W') > \delta$ for every $M^{-1} \le |\lambda| \le M$ and $E \in \mathbb{R}$.
(3) For every $w \in B'$ and $M^{-1} \le |\lambda| \le M$ the Lebesgue measure of $\Sigma(\lambda w)$ is at most $M^{-1}$.

Proof. First apply Lemma 3.1 to find some $\tilde{W} \subset P \cap B$ such that $L(E, \lambda \tilde{W}) > 0$ for every $E \in \mathbb{R}$ and $M^{-1} \le |\lambda| \le M$ (it is easy to see that

$$
L(E, \lambda w) \ge 1 \quad \text{if } |E| \ge \|\lambda w\| + 4,
\tag{3.1}
$$

so this is really a statement about bounded energies which follows from Lemma 3.1), and $|L(E, \lambda \tilde{W}) - L(E, \lambda W)| < M^{-1}/4$ for every $|E|, |\lambda| \le M$. By continuity of the Lyapunov exponent for periodic potentials (Sect. 2.3.3) and compactness (and (3.1) to take care of large energies), we conclude that there exists $\delta > 0$ such that $L(E, \lambda \tilde{W}) > 2\delta$ for every $E \in \mathbb{R}$ and $M^{-1} \le |\lambda| \le M$.

Let us now apply Lemma 3.2 to $W = \tilde{W}$ and let $W' = \tilde{W}^K$ for $K$ large. Then $W'$ is contained in a ball $B' \subset B$ with diameter $n_K^{-10} < M^{-1}$ centered around some $w' \in W'$. Both estimates on $L(E, \lambda W')$ are clear from the statement of Lemma 3.2 (using again (3.1) for large $|E|$). To estimate the measure of $\Sigma(\lambda w)$ for $w \in B'$, we notice that $\Sigma(\lambda w)$ is contained in an $M n_K^{-10}$ neighborhood of $\Sigma(\lambda w')$ (by 1-Lipschitz continuity of the spectrum, see Sect. 2.3.1). Using that $\Sigma(\lambda w')$ has at most $n_K$ connected components and has measure at most $e^{-\delta (\#\tilde{W} n_k)^{-1} n_K / 2}$, the result follows.

Given an open ball $B_0 \subset C^0(X, \mathbb{R})$ and $W_0 \subset P \cap B_0$, and $\epsilon_1 > 0$, we can proceed by induction, applying the previous lemma, to define, for every $i \ge 1$, open balls $B_i$ with $\overline{B_i} \subset B_{i-1}$, finite families of periodic potentials $W_i \subset P \cap B_i$, and constants $0 < \delta_i < 1$ and $\epsilon_{i+1} = \min\{\epsilon_i, \delta_i\}/10$ such that

(1) $L(E, \lambda W_i) \ge \delta_i$ for $E \in \mathbb{R}$ and $\epsilon_i \le |\lambda| \le \epsilon_i^{-1}$.
(2) $|L(E, \lambda W_i) - L(E, \lambda W_{i-1})| < \epsilon_i$ for $|E|, |\lambda| \le \epsilon_i^{-1}$.
(3) For every $w \in B_i$ and $\epsilon_i \le |\lambda| \le \epsilon_i^{-1}$, $\Sigma(\lambda w)$ has measure at most $\epsilon_i$.

Then the common element $w_\infty$ of all the $B_i$ is such that $\Sigma(\lambda w_\infty)$ has zero Lebesgue measure for every $\lambda \neq 0$. Notice that $L(E, \lambda W_i)$ converges uniformly on compacts to a continuous function, positive if $\lambda \neq 0$, which by general considerations must coincide with $L(E, \lambda w_\infty)$. Indeed, if $w_n \to w$ then $L(E, w_n) \to L(E, w)$ in $L^1_{\mathrm{loc}}$ by Lemma 2.5. So $L(E, \lambda w_\infty)$ coincides almost everywhere with $\lim L(E, \lambda W_i)$. Since $E \mapsto L(E, \lambda w_\infty)$ is the real part of a subharmonic function (see Remark 2.2) and $E \mapsto \lim L(E, \lambda W_i)$ is continuous, they coincide everywhere. Since $B_0$ was arbitrary, the denseness claim of Theorem 1.1 follows.

3.1. Proof of Lemma 3.1. Let $k$ be such that $W \subset P_k$. For every $K > k$, choose $N_1(K) > 0$ such that if $|E| \le K$, $|\lambda| \le K$, $w \in W$ and $w' \in P_K$ are such that $w'$ is $\frac{2 n_k + 1}{N_1(K)}$ close to $w$ then $|L(E, \lambda w') - L(E, \lambda w)| < \frac{1}{K}$. Here we use the continuity of the Lyapunov exponent for periodic potentials, see Sect. 2.3.3.

For $w \in W$, $K > k$, $1 \le j \le 2 n_k + 1$, we define potentials $w^{K,j} \in P_K$ by $w^{K,j}(i) = w(i)$, $0 \le i \le n_K - 2$, and

$$
w^{K,j}(n_K - 1) = w(n_K - 1) + \frac{j}{N_1(K)}.
\tag{3.2}
$$

(This uniquely defines $w^{K,j}$ by periodicity.)

Claim 3.4. For every $\lambda \neq 0$, $K > k$ there exists $1 \le j \le 2 n_k + 1$ such that $\Sigma(\lambda w^{K,j})$ has exactly $n_K$ components.

Proof. Recall that for every $w' \in P_m$, there exist exactly $2 n_m$ values of $E$ such that $\mathrm{tr}\, A_{n_m}^{(E,w')} = \pm 2$, if one counts the exceptional energies such that $A_{n_m}^{(E,w')} = \pm\mathrm{id}$ with multiplicity 2, see Sect. 2.3.4.

For each $j$ such that $\Sigma(\lambda w^{K,j})$ does not have exactly $n_K$ components, there exists at least one energy $E_j \in \Sigma(\lambda w^{K,j})$ with $A_{n_K}^{(E_j, \lambda w^{K,j})} = \pm\mathrm{id}$. Then

$$
A_{n_K}^{(E_j, \lambda w)} = \pm \begin{pmatrix} 1 & \frac{-\lambda j}{N_1(K)} \\ 0 & 1 \end{pmatrix}.
\tag{3.3}
$$

But since $w$ is $n_k$-periodic, this means that

$$
A_{n_k}^{(E_j, \lambda w)} = \pm \begin{pmatrix} 1 & \frac{-\lambda j n_k}{N_1(K) n_K} \\ 0 & 1 \end{pmatrix}.
\tag{3.4}
$$

This implies, in particular, that $A_{n_k}^{(E_j, \lambda w)} \neq A_{n_k}^{(E_{j'}, \lambda w)}$ for $j \neq j'$, thus we must also have $E_j \neq E_{j'}$ for $j \neq j'$. But there can be at most $2 n_k$ values of $E$ such that $\mathrm{tr}\, A_{n_k}^{(E, \lambda w)} = \pm 2$. Thus there must be some $1 \le j \le 2 n_k + 1$ such that $\Sigma(\lambda w^{K,j})$ has exactly $n_K$ connected components.

By the previous claim and compactness, there exists $\delta = \delta(W, K, M) > 0$ such that for $w \in W$ and $M^{-1} \le |\lambda| \le M$, there exists $1 \le j = j(K, \lambda, w) \le 2 n_k + 1$ such that $\Sigma(\lambda w^{K,j})$ has $n_K$ components and the measure of the smallest gap is at least $\delta$. Choose an integer $N_2(K)$ with $N_2(K) > \frac{4\pi M}{\delta n_K}$.

For $0 \le l \le N_2(K)$ and $w^{K,j}$ as above, let $w^{K,j,l} \in P_K$ be given by $w^{K,j,l} = w^{K,j} + \frac{4\pi M l}{n_K N_2(K)}$.

Claim 3.5. For every $M^{-1} \le |\lambda| \le M$, $w \in W$, $K > k$,

$$
\bigcap_{0 \le l \le N_2(K)} \Sigma(\lambda w^{K, j(K,\lambda,w), l}) = \emptyset.
\tag{3.5}
$$

Proof. Each of the connected components of $\Sigma(\lambda w^{K,j})$ has measure at most $\frac{2\pi}{n_K}$, see Lemma 2.4. Since $N_2(K) > \frac{4\pi M}{\delta n_K}$, for every $E$ there exists at least some $l$ with $0 \le l \le N_2(K)$ such that $E - \lambda \frac{4\pi M l}{n_K N_2(K)} \notin \Sigma(\lambda w^{K,j})$, that is, $E \notin \Sigma(\lambda w^{K,j,l})$, which gives the result.

Let $W^K$ be the family obtained by collecting the $w^{K,j,l}$ for different $w \in W$, $1 \le j \le 2 n_k + 1$ and $0 \le l \le N_2(K)$. By the second claim, $L(E, \lambda W^K) > 0$ for every $M^{-1} \le |\lambda| \le M$ and $E \in \mathbb{R}$ (since $L(E, \lambda w) > 0$ if $E \notin \Sigma(\lambda w)$, see Sect. 2.3.4). To conclude, it is enough to show that

$$
\max_{1 \le j \le 2 n_k + 1} \ \max_{0 \le l \le N_2(K)} |L(E, \lambda w^{K,j,l}) - L(E, \lambda w)| \to 0
\tag{3.6}
$$

uniformly on compacts of $(E, \lambda) \in \mathbb{R}^2$. Write

$$
\begin{aligned}
|L(E, \lambda w^{K,j,l}) - L(E, \lambda w)| \le{} & \Bigl| L(E, \lambda w^{K,j,l}) - L\Bigl(E - \lambda \frac{4\pi M l}{n_K N_2(K)}, \lambda w\Bigr) \Bigr| \\
& + \Bigl| L\Bigl(E - \lambda \frac{4\pi M l}{n_K N_2(K)}, \lambda w\Bigr) - L(E, \lambda w) \Bigr|.
\end{aligned}
\tag{3.7}
$$

Then the first term in the right-hand side is smaller than $K^{-1}$ provided $K \ge |E| + 4\pi M^2$ (by the choice of $N_1(K)$), while the second term in the right-hand side is bounded by

$$
\max_{w \in W} \ \sup_{|t| \le \frac{4\pi M^2}{n_K}} |L(E + t, \lambda w) - L(E, \lambda w)|,
$$

which converges to zero uniformly on compacts of $(E, \lambda) \in \mathbb{R}^2$ as $K \to \infty$ (by continuity of the Lyapunov exponent for periodic potentials, Sect. 2.3.3).
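The mechanism behind Claim 3.5 is that adding a constant $c$ to a potential translates the spectrum of $\lambda w$ by $\lambda c$, so a finite set of translates can have empty intersection. The sketch below illustrates this on an assumed toy example that is not from the paper: the spectrum is that of a period-2 potential with values $(1, -1)$, namely $\{E : |E^2 - 3| \le 2\}$, with $n_K = 2$, $M = 1$, smallest gap $\delta = 2$ and $N_2 = 4 > 4\pi M / (\delta n_K)$. All names are hypothetical.

```python
import math
from typing import List


def in_spectrum(energy: float) -> bool:
    """Toy spectrum of the period-2 potential (1, -1): |E^2 - 3| <= 2."""
    return abs(energy * energy - 3.0) <= 2.0


def shift_values(m_bound: float, n_period: int, n_shifts: int) -> List[float]:
    """Constants 4*pi*M*l / (n_K * N_2) for l = 0, ..., N_2, as in Claim 3.5."""
    return [
        4.0 * math.pi * m_bound * l / (n_period * n_shifts)
        for l in range(n_shifts + 1)
    ]


def in_all_shifted_spectra(energy: float, coupling: float, shifts: List[float]) -> bool:
    """True if E lies in the spectrum of every shifted potential.

    Adding the constant c to the potential moves the spectrum of
    (coupling * potential) by coupling * c.
    """
    return all(in_spectrum(energy - coupling * c) for c in shifts)


SHIFTS = shift_values(m_bound=1.0, n_period=2, n_shifts=4)
ENERGY_GRID = [-10.0 + 0.001 * i for i in range(20001)]
```

On this grid no energy belongs to all shifted spectra for coupling $\pm 1$, which is the empty intersection (3.5) in this toy case.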


3.2. Proof of Lemma 3.2. Assume that $W \subset P_k$, $n_k \ge 2$, and let $K > k$ be large. Order the elements $w^1, \ldots, w^m$ of $W$. Let $r = [n_K / (m n_k)]$.

First consider a potential $w \in P_K$ obtained as follows. It is enough to define $w(l)$ for $0 \le l \le n_K - 1$. Let $I_j = [j n_k, (j+1) n_k - 1] \subset \mathbb{Z}$ and let $0 = j_0 < j_1 < \ldots < j_{m-1} < j_m = n_K / n_k$ be a sequence such that $j_{i+1} - j_i - r \in \{0, 1\}$. Given $0 \le l \le n_K - 1$, let $j$ be such that $l \in I_j$, let $i$ be such that $j_{i-1} \le j < j_i$ and let $w(l) = w^i(l)$.

For any sequence $t = (t_1, \ldots, t_m)$ with $t_i \in \{0, \ldots, r - 1\}$, let $w^t \in P_K$ be the potential defined as follows. Let $0 \le l \le n_K - 1$, and let $j$ be such that $l \in I_j$. If $j = j_i - 1$ for some $1 \le i \le m$, we let $w^t(l) = w(l) + r^{-20} t_i$. Otherwise we let $w^t(l) = w(l)$.

Let $W^K$ be the family consisting of all the $w^t$. The claimed diameter estimate is obvious for large $K$.

Let us show that $L(E, \lambda W^K) \to L(E, \lambda W)$ uniformly on compacts. It is enough to restrict ourselves to compact subsets of $(E, \lambda) \in \mathbb{R} \times (\mathbb{R} \setminus \{0\})$, since it is easy to see that $L(E, \lambda w) - L(E, 0) \to 0$ uniformly as $\|\lambda w\| \to 0$.

For fixed $E$ and $\lambda$, we write

$$
A_{n_K}^{(E, \lambda w^t)} = C^{(t_m, m)} B^{(m)} \cdots C^{(t_1, 1)} B^{(1)},
\tag{3.8}
$$

where $C^{(t_i, i)} = A_{n_k}^{(E - \lambda r^{-20} t_i, \lambda w^i)}$ and $B^{(i)} = \bigl(A_{n_k}^{(E, \lambda w^i)}\bigr)^{j_i - j_{i-1} - 1}$. Notice that, for $E$ and $\lambda$ in a compact set, the norm of the $C^{(t_i, i)}$-type matrices stays bounded as $r$ grows, while the $B^{(i)}$ matrices may get large.

Find some cutoff $(\ln \ln r)^{-m} \le c \le (\ln \ln \ln r)^m / (\ln \ln r)^m$ such that if $\|B^{(i)}\| < e^{cr}$ then $\|B^{(i)}\| < e^{(\ln \ln \ln r)^{-1} c r} < e^{(\ln \ln \ln r)^{-1} c n_K}$. To see that this is possible, notice that the union of the $m$ intervals $(\ln \ln \|B^{(i)}\| - \ln r,\ \ln \ln \|B^{(i)}\| - \ln r + \ln \ln \ln \ln r]$, $1 \le i \le m$, must omit at least one point in $[-m \ln \ln \ln r,\ -m \ln \ln \ln r + m \ln \ln \ln \ln r]$, which can be taken as $\ln c$.

Call $i$ good if $\|B^{(i)}\| \ge e^{cr}$. If no $B^{(i)}$ is good, then $L(E, \lambda W) \le c \frac{r}{r-1}$ and $L(E, \lambda W^K) \le c \frac{rm}{n_K} + O(1/r)$. In particular $L(E, \lambda W^K)$ and $L(E, \lambda W)$ are close, since $c = o(1)$ with respect to $r$. So we can assume that there exists at least one good $B^{(i)}$.

Let $i_1 < \ldots < i_d$ be the list of all good $i$. Write $A_{n_K}^{(E, \lambda w^t)}(0) = \hat{C}^{(d)} \hat{B}^{(d)} \cdots \hat{C}^{(1)} \hat{B}^{(1)}$, where for $1 \le j \le d$ we let $\hat{C}^{(j)} = C^{(t_{i_j}, i_j)}$ and $\hat{B}^{(j)} = B^{(i_j)} D^{(j)}$, where we denote $D^{(j)} = C^{(t_{i_j - 1}, i_j - 1)} B^{(i_j - 1)} \cdots C^{(t_{i_{j-1} + 1}, i_{j-1} + 1)} B^{(i_{j-1} + 1)}$ (denoting also $i_0 = 0$). By the choice of the cutoff, we have $\|D^{(j)}\| \le e^{cr/2}$ for $r$ large (uniformly on compacts of $(E, \lambda) \in \mathbb{R}^2$), so $\|\hat{B}^{(j)}\| \ge e^{cr/2}$.

Claim 3.6. As $r$ grows,

$$
\frac{1}{n_K} \sum_{j=1}^{d} \ln \|\hat{B}^{(j)}\| \to L(E, \lambda W)
\tag{3.9}
$$

uniformly on compacts of $E$ and $\lambda$.

Proof. Notice that this is equivalent to showing that

$$
\frac{1}{n_K} \sum_{i=1}^{m} \ln \|B^{(i)}\| \to L(E, \lambda W)
\tag{3.10}
$$

(uniformly), which in turn is equivalent to

$$
\frac{1}{m} \sum_{i=1}^{m} \frac{1}{n_k (j_i - j_{i-1} - 1)} \ln \|B^{(i)}\| \to L(E, \lambda W)
\tag{3.11}
$$

(uniformly). Thus it is enough to show that

$$
\frac{1}{n_k (j_i - j_{i-1} - 1)} \ln \|B^{(i)}\| \to L(E, \lambda w^i)
\tag{3.12}
$$

(uniformly). But $B^{(i)}$ is just the $j_i - j_{i-1} - 1$ iterate of the matrix $A_{n_k}^{(E, \lambda w^i)}$, whose spectral radius is precisely the exponential of $n_k L(E, \lambda w^i)$. But it is easy to see that $\|T^n\|^{1/n}$ converges to the spectral radius of $T$ uniformly on compacts of $T \in \mathrm{SL}(2, \mathbb{R})$. This gives (3.12) and the result.
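The convergence of $\|T^n\|^{1/n}$ to the spectral radius, used to obtain (3.12), can be observed numerically. The following sketch is an illustration only: the matrix `T_EXAMPLE` is an arbitrary non-symmetric element of $\mathrm{SL}(2, \mathbb{R})$ chosen for this example (spectral radius 2), and the operator norm is computed from the identity $\|T\|^2 = \bigl(s + \sqrt{s^2 - 4}\bigr)/2$ with $s$ the sum of squared entries, which holds for $2 \times 2$ matrices of determinant $\pm 1$.

```python
import math
from typing import Tuple

Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def operator_norm_sl2(t: Matrix) -> float:
    """Euclidean operator norm of a 2x2 matrix with determinant +-1."""
    s = sum(entry * entry for row in t for entry in row)
    # Singular values satisfy sigma_1^2 + sigma_2^2 = s and sigma_1 * sigma_2 = 1.
    return math.sqrt((s + math.sqrt(max(s * s - 4.0, 0.0))) / 2.0)


def root_of_power_norm(t: Matrix, n: int) -> float:
    """Return ||T^n||^(1/n) for n >= 1."""
    power = t
    for _ in range(n - 1):
        power = matmul(power, t)
    return operator_norm_sl2(power) ** (1.0 / n)


# Hypothetical example in SL(2, R): eigenvalues 2 and 1/2, not symmetric.
T_EXAMPLE: Matrix = ((2.0, 5.0), (0.0, 0.5))
```

For this matrix the quantity $\|T^n\|^{1/n}$ decreases toward 2 as $n$ grows; the exponents are kept moderate (here up to 200) so that the entries stay within floating-point range.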

For every $t$, we have the obvious upper bound

$$
L(E, \lambda w^t) \le \frac{1}{n_K} \sum_{j=1}^{d} \ln \|\hat{B}^{(j)}\| + O(1/r),
\tag{3.13}
$$

and we will now be concerned with bounding $L(E, \lambda w^t)$ from below, not for all $t$, but for a majority of them.

Let $s_j$ be the most contracted direction of $\hat{B}^{(j)}$ and let $u_j$ be the image under $\hat{B}^{(j)}$ of the most expanded direction. Let us say that $t$ is $j$-nice, $1 \le j \le d$, if the absolute value of the angle between $\hat{C}^{(j)} u_j$ and $s_{j+1}$ is at least $r^{-70}$ (with the convention that $j + 1 = 1$ for $j = d$).

Claim 3.7. Let $r$ be sufficiently large, and let $t$ be $j$-nice. If $z$ is a non-zero vector making an angle at least $r^{-80}$ with $s_j$, then $z' = \hat{C}^{(j)} \hat{B}^{(j)} z$ makes an angle at least $r^{-80}$ with $s_{j+1}$ and $\|z'\| \ge \|\hat{B}^{(j)}\| r^{-100} \|z\|$.

Proof. Let $0 \le \theta \le \pi/2$ be the angle between $z$ and $s_j$, and let $0 \le \theta' \le \pi/2$ be the angle between $z'' = \hat{B}^{(j)} z$ and $u_j$. The orthogonal projection of $z''$ on $u_j$ has norm $\|z\| \, \|\hat{B}^{(j)}\| \sin \theta$. Since $\hat{C}^{(j)}$ stays bounded as $r$ grows, we conclude that $\|z'\| \ge \|\hat{B}^{(j)}\| r^{-100} \|z\|$.

On the other hand, $\tan \theta \tan \theta' = \|\hat{B}^{(j)}\|^{-2}$. Since $\|\hat{B}^{(j)}\| \ge e^{cr/2} \ge r^{400}$ for $r$ large, it follows that $\theta' < r^{-100}$. The boundedness of $\hat{C}^{(j)}$ again implies that the angle between $z'$ and $\hat{C}^{(j)} u_j$ is at most $r^{-90}$. Since $t$ is $j$-nice, $z'$ makes an angle at least $r^{-80}$ with $s_{j+1}$.

It follows that if $t$ is very nice in the sense that it is $j$-nice for every $1 \le j \le d$, then if $z$ is a non-zero vector making an angle at least $r^{-80}$ with $s_1$ then $z' = A_{n_K}^{(E, \lambda w^t)} z$ also makes an angle at least $r^{-80}$ with $s_1$, and moreover $\|z'\| / \|z\| \ge \prod_{j=1}^{d} r^{-100} \|\hat{B}^{(j)}\|$. By (3.9) and (3.13), it follows that $L(E, \lambda w^t) - L(E, \lambda W) \to 0$ as $r$ grows, at least for very nice $t$.

To conclude the estimate on the Lyapunov exponent, it is thus enough to show that most $t$ are nice, in the sense that for every $\epsilon > 0$, for every $r$ sufficiently large, the set of $t \in \{0, \ldots, r - 1\}^m$ which are not very nice has at most $\epsilon r^m$ elements. A more precise estimate is provided below.


Claim 3.8. For every $r$ sufficiently large, the set of $t$ which are not very nice has at most $m r^{m-1}$ elements.

Proof. We will show in fact that, for every $1 \le j \le d$, if for every $1 \le k \le m$ with $k \neq i_j$ one chooses $t_k \in \{0, \ldots, r - 1\}$, there exists at most one "exceptional" $t_{i_j} \in \{0, \ldots, r - 1\}$ such that $t = (t_1, \ldots, t_m)$ is not $j$-nice. Thus the set of $t$ which are not $j$-nice has at most $r^{m-1}$ elements and the estimate follows.

Once $t_k$ is fixed for $1 \le k \le m$ with $k \neq i_j$, both $u_j$ and $s_{j+1}$ become determined, but $\hat{C}^{(j)} = C^{(t_{i_j}, i_j)} = A_{n_k}^{(E - \lambda r^{-20} t_{i_j}, \lambda w^{i_j})}$ depends on $t_{i_j}$.

Since $n_k \ge 2$, we can apply Lemma 2.3 to conclude that for any non-zero vector $z \in \mathbb{R}^2$, the derivative of the argument of the vector $A_{n_k}^{(E', \lambda w^{i_j})} z$ as a function of $E'$ is strictly negative, and hence bounded away from zero and infinity, uniformly on $z$ and on compacts of $(E', \lambda) \in \mathbb{R}^2$, and independently of $r$. If $r$ is sufficiently large, we conclude that for every $0 \le l \le r - 2$, there exists a rotation $R_l$ of angle $\theta$ with $r^{-21} < \theta < r^{-19}$ such that $C^{(l+1, i_j)} u_j = R_l C^{(l, i_j)} u_j$. It immediately follows that there exists at most one choice of $0 \le t_{i_j} \le r - 1$ such that $C^{(t_{i_j}, i_j)} u_j$ has angle at most $r^{-90}$ with $s_{j+1}$, as desired.

We now estimate the measure of the spectrum. Let $w^i \in W$ be such that $L(E, \lambda w^i) \ge \delta n_k m$. Then $\|A_{(r-1) n_k}^{(E, \lambda w^t)}(j_{i-1} n_k)\| \ge e^{\delta m (r-1) n_k^2}$. Since $E$ is arbitrary, we can apply Lemma 2.4 to conclude that the measure of the spectrum is at most $4\pi n_K e^{-\delta m (r-1) n_k^2} \le e^{-\delta n_K / 2}$ for $r$ large. The result follows.

Acknowledgements. Conjecture 8.7 of [S] was brought to the attention of the author by Svetlana Jitomirskaya. This work was carried out during visits to Caltech and UC Irvine. This research was partially conducted during the period the author served as a Clay Research Fellow. We are grateful to the referee for several suggestions which led to significant changes in the presentation.

References

[AD1] Avila, A., Damanik, D.: Generic singular spectrum for ergodic Schrödinger operators. Duke Math. J. 130, 393–400 (2005)
[AD2] Avila, A., Damanik, D.: Absolute continuity of the integrated density of states for the almost Mathieu operator with non-critical coupling. Invent. Math. 172, 439–453 (2008)
[AMS] Avron, J., van Mouche, P., Simon, B.: On the measure of the spectrum of the almost Mathieu operator. Commun. Math. Phys. 132, 117–142 (1990)
[AS] Avron, J., Simon, B.: Almost periodic Schrödinger operators, I. Limit periodic potentials. Commun. Math. Phys. 82, 101–120 (1982)
[S] Simon, B.: Equilibrium measures and capacities in spectral theory. Inverse Probl. Imaging 1(4), 713–772 (2007)

Communicated by B. Simon


Communications in


Isometric Embeddings into the Minkowski Space and New Quasi-Local Mass Mu-Tao Wang1 , Shing-Tung Yau2 1 Department of Mathematics, Columbia University, New York,

NY 10027, USA. E-mail: [email protected]

2 Department of Mathematics, Harvard University, Cambridge, MA 02138, USA

Received: 17 May 2008 / Accepted: 17 October 2008 Published online: 27 February 2009 – © The Author(s) 2009

Abstract: The definition of quasi-local mass for a bounded space-like region $\Omega$ in space-time is essential in several major unsettled problems in general relativity. The quasi-local mass is expected to be a type of flux integral on the boundary two-surface $\Sigma = \partial\Omega$ and should be independent of whichever space-like region $\Sigma$ bounds. An important idea which is related to the Hamiltonian formulation of general relativity is to consider a reference surface in a flat ambient space with the same first fundamental form and derive the quasi-local mass from the difference of the extrinsic geometries. This approach has been taken by Brown-York [4,5] and Liu-Yau [16,17] (see also related works [3,6,9,12,14,15,28,32]) to define such notions using the isometric embedding theorem into the Euclidean three space. However, there exist surfaces in the Minkowski space whose quasi-local mass is strictly positive [19]. It appears that the momentum information needs to be accounted for to reconcile the difference. In order to fully capture this information, we use isometric embeddings into the Minkowski space as references. In this article, we first prove an existence and uniqueness theorem for such isometric embeddings. We then solve the boundary value problem for Jang's [13] equation as a procedure to recognize such a surface in the Minkowski space. In doing so, we discover a new expression of quasi-local mass for a large class of "admissible" surfaces (see Theorem A and Remark 1.1). The new mass is positive when the ambient space-time satisfies the dominant energy condition and vanishes on surfaces in the Minkowski space. It also has the nice asymptotic behavior at spatial infinity and null infinity. Some of these results were announced in [29].

© 2009 The authors. Reproduction of this article for non-commercial purposes by any means is permitted.

1. Introduction

1.1. Dominant energy condition and positive mass theorem. Let $N$ be a space-time, i.e. a four-manifold with a Lorentzian metric $g_{\alpha\beta}$ of signature $(-+++)$ that satisfies the Einstein equation:

$$
R_{\alpha\beta} - \frac{s}{2} g_{\alpha\beta} = 8\pi G T_{\alpha\beta},
$$

where $R_{\alpha\beta}$ and $s$ are the Ricci curvature and the Ricci scalar curvature of $g_{\alpha\beta}$, respectively. $G$ is the gravitational constant and $T_{\alpha\beta}$ is the energy-momentum tensor of matter density. The metric $g_{\alpha\beta}$ defines space-like, time-like and null vectors on the tangent space of $N$ accordingly. A "dominant energy condition", which corresponds to a positivity condition on the matter density $T_{\alpha\beta}$, is expected to be satisfied on any realistic space-time. It means the following: for any time-like vector $e_0$, $T(e_0, e_0) \ge 0$ and $T(e_0, \cdot)$ is a non-space-like covector. We shall assume throughout this article that the space-time $N$ satisfies the dominant energy condition.

Consider a space-like hypersurface $(M, g_{ij}, p_{ij})$ in $N$ where $g_{ij}$ is the induced (Riemannian) metric and $p_{ij}$ is the second fundamental form with respect to the future-directed time-like unit normal vector field of $M$. The dominant energy condition together with the compatibility conditions for submanifolds imply

$$
\mu \ge |J|,
\tag{1.1}
$$

where

$$
\mu = \frac{1}{2} \bigl( R - p_{ij} p^{ij} + (p_k^k)^2 \bigr),
$$

and

$$
J^i = D_j (p^{ij} - p_k^k g^{ij}).
$$

Here $R$ is the scalar curvature of $M$. An important special case is when $p_{ij} = 0$ (time-symmetric case), in which the dominant energy condition implies that the scalar curvature of $M$ is non-negative.

The positive mass theorem proved by Schoen-Yau [22–24] (later a different proof by Witten [30]) states:

Theorem 1.1. Let $(M, g_{ij}, p_{ij})$ be a complete three-manifold that satisfies (1.1). Suppose $M$ is asymptotically flat: i.e. there exists a compact set $K \subset M$ such that $M \setminus K$ is diffeomorphic to a union of complements of balls in $\mathbb{R}^3$ (called ends) such that $g_{ij} = \delta_{ij} + a_{ij}$ with $a_{ij} = O(\frac{1}{r})$, $\partial_k(a_{ij}) = O(\frac{1}{r^2})$, $\partial_l \partial_k(a_{ij}) = O(\frac{1}{r^3})$, and $p_{ij} = O(\frac{1}{r^2})$, $\partial_k(p_{ij}) = O(\frac{1}{r^3})$ on each end of $M \setminus K$. Then the ADM mass (Arnowitt-Deser-Misner) of each end of $M$ is positive, i.e.

$$
E \ge |P|,
\tag{1.2}
$$

where

$$
E = \lim_{r \to \infty} \frac{1}{16\pi G} \int_{S_r} (\partial_j g_{ij} - \partial_i g_{jj}) \, d\Omega^i
$$

is the total energy and

$$
P_k = \lim_{r \to \infty} \frac{1}{16\pi G} \int_{S_r} 2 (p_{ik} - \delta_{ik} p_{jj}) \, d\Omega^i
$$

is the total momentum. Here $S_r$ is a coordinate sphere of radius $r$ on an end.

We notice that the conclusion of the theorem is equivalent to the statement that the four-vector $(E, P_1, P_2, P_3)$ is future-directed time-like, i.e.

$$
E \ge 0 \quad \text{and} \quad -E^2 + P_1^2 + P_2^2 + P_3^2 \le 0.
$$

The asymptotically flat condition can be considered a gauge condition to assure that $M$ can be compared to the flat space $\mathbb{R}^3$. The essence of the positive mass theorem is that positive local matter density (1.1) measured pointwise should imply positive total energy momentum (1.2) measured at infinity. In contrast, the "quasi-local mass" corresponds to the measurement of mass at in-between scales.

1.2. Two-surfaces in space-time and quasi-local notion of mass. Let $N$ be a time-oriented space-time. Denote the Lorentzian metric on $N$ by $\langle \cdot, \cdot \rangle$ and covariant derivative by $\nabla^N$. Let $\Sigma$ be a closed space-like two-surface embedded in $N$. Denote the induced Riemannian metric on $\Sigma$ by $\sigma$ and the gradient and Laplace operator of $\sigma$ by $\nabla$ and $\Delta$, respectively. Given any two tangent vectors $X$ and $Y$ of $\Sigma$, the second fundamental form of $\Sigma$ in $N$ is given by $\mathrm{II}(X, Y) = (\nabla^N_X Y)^\perp$, where $(\cdot)^\perp$ denotes the projection onto the normal bundle of $\Sigma$. The mean curvature vector is the trace of the second fundamental form, or $H = \mathrm{tr}\, \mathrm{II} = \sum_{a=1}^{2} \mathrm{II}(e_a, e_a)$, where $\{e_1, e_2\}$ is an orthonormal basis of the tangent bundle of $\Sigma$. The normal bundle is of rank two with structure group $SO(1, 1)$ and the induced metric on the normal bundle is of signature $(-, +)$. Since the Lie algebra of $SO(1, 1)$ is isomorphic to $\mathbb{R}$, the connection form of the normal bundle is a genuine 1-form that depends on the choice of the normal frames. The curvature of the normal bundle is then given by an exact 2-form which reflects the fact that any $SO(1, 1)$ bundle is topologically trivial. Connections of different choices of normal frames differ by an exact form. We define

Definition 1.1. Let $e_3$ be a space-like unit normal along $\Sigma$; the connection form determined by $e_3$ is defined to be

$$
\alpha_{e_3}(X) = \langle \nabla^N_X e_3, e_4 \rangle,
\tag{1.3}
$$

where $e_4$ is the future-directed time-like unit normal that is orthogonal to $e_3$. When $\Sigma$ bounds a space-like hypersurface $\Omega$ with $\partial\Omega = \Sigma$, we choose $e_3$ to be the space-like outward unit normal with respect to $\Omega$. The connection form is then denoted by $\alpha$.

Suppose $\Sigma$ bounds a space-like hypersurface in $N$. The definition of quasi-local mass $m$ asks that (see [7,8]):

(1) $m \ge 0$ under the dominant energy condition.
(2) $m = 0$ if and only if $\Sigma$ is in the Minkowski spacetime.
(3) The limit of $m$ on large coordinate spheres of asymptotically flat (null) hypersurfaces should approach the ADM (Bondi) mass.

The quasi-local mass is supposed to be closely related to the formation of black holes according to the hoop conjecture of Thorne. Various definitions for the quasi-local mass have been proposed (see for example the review article by Szabados [27]). In this article, we shall focus on quasi-local mass defined by the following comparison principle: anchor the intrinsic geometry (the induced metric) by isometric embeddings and compare other extrinsic geometries. An important feature that we expect is that the definition should be a flux type integral on $\Sigma$ and it should depend only on the fact that $\Sigma$ bounds a space-like hypersurface $\Omega$, but not on which specific $\Omega$ it bounds.



1.3. Prior results. We recall the solution of Weyl’s isometric embedding problem by Nirenberg [18] and, independently, Pogorelov [21]:

Theorem 1.2. Let $\Sigma$ be a closed surface with a Riemannian metric of positive Gauss curvature. Then there exists an isometric embedding $i : \Sigma \to \mathbb{R}^3$ that is unique up to Euclidean rigid motions.

In particular, the mean curvature of the isometric embedding is uniquely determined by the metric. Through a Hamilton-Jacobi analysis of Einstein’s action, Brown and York [4,5] introduced

Definition 1.2. Suppose a two-surface $\Sigma$ bounds a space-like region $\Omega$ in a space-time $N$. Let $k$ be the mean curvature of $\Sigma$ with respect to the outward normal of $\Omega$. Assume the induced metric on $\Sigma$ has positive Gauss curvature and denote by $k_0$ the mean curvature of the isometric embedding of $\Sigma$ into $\mathbb{R}^3$. The Brown-York mass is defined to be
\[ \frac{1}{8\pi G}\left(\int_\Sigma k_0 - \int_\Sigma k\right). \]

Liu and Yau [16,17] (see also Kijowski [14]) defined

Definition 1.3. Suppose $\Sigma$ is an embedded two-surface that bounds a space-like region in a space-time $N$. Assume $\Sigma$ has positive Gauss curvature. The Liu-Yau mass is defined to be
\[ \frac{1}{8\pi G}\left(\int_\Sigma k_0 - \int_\Sigma |H|\right), \]
where $|H|$ is the Lorentzian norm of the mean curvature vector.

The Brown-York and Liu-Yau masses are proved to be positive by Shi-Tam [26] in the time-symmetric case, and by Liu-Yau [16,17], respectively.

Theorem 1.3 [26]. Suppose $\Omega$ has non-negative scalar curvature and $k > 0$. Then the Brown-York mass of $\Sigma$ is non-negative and it equals zero if and only if $\Omega$ is flat.

Theorem 1.4 [16,17]. Suppose $N$ satisfies the dominant energy condition and the mean curvature vector of $\Sigma$ is space-like. The Liu-Yau mass is non-negative and it equals zero only if $N$ is isometric to $\mathbb{R}^{3,1}$ along $\Omega$.

However, Ó Murchadha, Szabados, and Tod [19] found examples of surfaces in the Minkowski space which satisfy the assumptions but whose Liu-Yau mass, as well as Brown-York mass, is strictly positive.

The absence of the momentum information $p_{ij}$ seems to be responsible for this inconsistency: the Euclidean space can be considered as a totally geodesic space-like hypersurface in the Minkowski space with second fundamental form $p_{ij} = 0$, and in both the Brown-York and Liu-Yau cases the reference is taken to be the isometric embedding into $\mathbb{R}^3$. In order to capture the information of $p_{ij}$, we need to take the reference surface to be a general isometric embedding into the Minkowski space. However, an intrinsic difficulty for this embedding problem is that there are four unknowns (the coordinate functions in $\mathbb{R}^{3,1}$) but only three equations (for the first fundamental form). An ellipticity condition replacing the positive Gauss curvature condition is also needed to guarantee the uniqueness of the solution. We are



able to achieve these in this article, and indeed, the extra unknown (corresponding to the time function) allows us to identify a canonical gauge in the physical space $N$ and define a quasi-local mass expression. We refer to our paper [29], in which this expression was derived from the more physical point of view, i.e. the Hamilton-Jacobi analysis of the gravitational action.

1.4. Results and organization. We first state the key comparison theorem:

Theorem A. Let $N$ be a space-time that satisfies the dominant energy condition. Suppose $i : \Sigma \to N$ is a closed embedded space-like two-surface in $N$ with space-like mean curvature vector $H$. Let $i_0 : \Sigma \to \mathbb{R}^{3,1}$ be an isometric embedding into the Minkowski space and let $\tau$ denote the restriction of the time function $t$ on $i_0(\Sigma)$. Let $\bar e_4$ be the future-directed time-like unit normal along $i(\Sigma)$ such that
\[ \langle H, \bar e_4\rangle = \frac{-\Delta\tau}{\sqrt{1+|\nabla\tau|^2}} \]
and $\bar e_3$ be the space-like unit normal along $\Sigma$ with $\langle \bar e_3, \bar e_4\rangle = 0$ and $\langle H, \bar e_3\rangle < 0$. Let $\hat\Sigma$ be the projection of $i_0(\Sigma)$ onto $\mathbb{R}^3 = \{t = 0\} \subset \mathbb{R}^{3,1}$ and $\hat k$ be the mean curvature of $\hat\Sigma$ in $\mathbb{R}^3$. If $\tau$ is admissible (see Definition 5.1), then
\[ \int_{\hat\Sigma} \hat k \, dv_{\hat\Sigma} - \int_\Sigma \left[ -\sqrt{1+|\nabla\tau|^2}\,\langle H, \bar e_3\rangle - \alpha_{\bar e_3}(\nabla\tau) \right] dv_\Sigma \tag{1.4} \]
is non-negative.

Indeed, we show
\[ \int_{\hat\Sigma} \hat k \, dv_{\hat\Sigma} = \int_\Sigma \left[ -\sqrt{1+|\nabla\tau|^2}\,\langle H_0, \breve e_3\rangle - \alpha_{\breve e_3}(\nabla\tau) \right] dv_\Sigma \tag{1.5} \]
(see Eq. (3.4)), where $H_0$ is the mean curvature vector of $i_0(\Sigma)$ in $\mathbb{R}^{3,1}$ and $\breve e_3$ is the space-like unit normal along $i_0(\Sigma)$ in $\mathbb{R}^{3,1}$ that is orthogonal to the time direction. The expression (1.4) naturally arises as the surface term in the Hamiltonian of gravitational action (see Remark 2.1). When the reference isometric embedding lies in an $\mathbb{R}^3$ with $\tau = 0$, it recovers the Liu-Yau mass.

Remark 1.1. If the Gauss curvature of $\Sigma$ is positive, an isometric embedding into an $\mathbb{R}^3$ with $\tau = 0$ is admissible (see Corollary 5.3 and the preceding remark). In general, when the Gauss curvature is close to being positive, an isometric embedding with small enough $\tau$ is admissible.

Remark 1.2. We learned the expression in (1.5) from Gibbons’ paper [10]. Indeed, we are motivated by [10] to study the projection of a space-like two-surface in the Minkowski space.

The new quasi-local mass is defined to be the infimum of the expression (1.4) over all such isometric embeddings (see Definition 5.2). We prove that such embeddings are parametrized by the admissible $\tau$.

Theorem B. Given a metric $\sigma$ and a function $\tau$ on $S^2$ such that the condition (3.1) holds, there exists a unique space-like isometric embedding $i_0 : S^2 \to \mathbb{R}^{3,1}$ with the induced metric $\sigma$ and the function $\tau$ as the time function.



In Sect. 2, we study the expression $-\sqrt{1+|\nabla\tau|^2}\,\langle H, e_3\rangle - \alpha_{e_3}(\nabla\tau)$ for surfaces in space-time. We consider it as a generalized mean curvature and study the variation of the total integral. The gauge $\bar e_3, \bar e_4$ in Theorem A indeed minimizes the total integral (see Proposition 2.1). In Sect. 3, we prove Theorem B and study the total mean curvature of the projected surface. In particular, we prove equality (1.5). In Sect. 4, we study the boundary problem of Jang’s equation and calculate the boundary terms. This is an important step in proving Theorem A. In Sect. 5, we define the new quasi-local mass and prove the positivity. In particular, Theorem A is proved. We emphasize that though the proof involves solving Jang’s equation, the results depend only on the solvability but not on the specific solution. The Euler-Lagrange equation of the new quasi-local mass among all admissible $\tau$’s is derived in Sect. 6.

2. A Generalization of Mean Curvature

Definition 2.1. Suppose $i : \Sigma \to N$ is an embedded space-like two-surface. Given a smooth function $\tau$ on $\Sigma$ and a space-like normal $e_3$, the generalized mean curvature associated with these data is defined to be
\[ h(\Sigma, i, \tau, e_3) = -\sqrt{1+|\nabla\tau|^2}\,\langle H, e_3\rangle - \alpha_{e_3}(\nabla\tau), \]
where $H$ is the mean curvature vector of $\Sigma$ in $N$ and $\alpha_{e_3}$ is the connection form (see Definition 1.1) of the normal bundle of $\Sigma$ in $N$ determined by $e_3$ and the future-directed time-like unit normal $e_4$ orthogonal to $e_3$.

Remark 2.1. In the case when $\Sigma$ bounds a space-like region $\Omega$ and $e_3$ is the outward unit normal of $\Omega$, the mean curvature vector is $H = \langle H, e_3\rangle e_3 - \langle H, e_4\rangle e_4$. We can reflect $H$ along the light cone of the normal bundle to get $J = \langle H, e_4\rangle e_3 - \langle H, e_3\rangle e_4$. Denote the tangent vector on $\Sigma$ dual to the one-form $\alpha_{e_3}$ by $V$; then the expression (3) in [29] is $J - V$, where $k = -\langle H, e_3\rangle$ and $p = -\langle H, e_4\rangle$. We have
\[ h(\Sigma, i, \tau, e_3) = -\left\langle J - V,\ \sqrt{1+|\nabla\tau|^2}\, e_4 - \nabla\tau \right\rangle. \]
Notice that $\sqrt{1+|\nabla\tau|^2}\, e_4 - \nabla\tau$ is again a future-directed unit time-like vector along $\Sigma$.

Fix a base frame $\{\hat e_3, \hat e_4\}$ for the normal bundle; any other frame $\{e_3, e_4\}$ can be expressed as $e_3 = \cosh\phi\, \hat e_3 - \sinh\phi\, \hat e_4$, $e_4 = -\sinh\phi\, \hat e_3 + \cosh\phi\, \hat e_4$ for some $\phi$. We compute the integral
\[ \int_\Sigma h(\Sigma, i, \tau, e_3)\, dv_\Sigma = \int_\Sigma \left[ \sqrt{1+|\nabla\tau|^2}\left( \cosh\phi\, \langle \nabla^N_{e_a} \hat e_3, e_a\rangle - \sinh\phi\, \langle \nabla^N_{e_a} \hat e_4, e_a\rangle \right) - \alpha_{\hat e_3}(\nabla\tau) - \nabla\tau\cdot\nabla\phi \right] dv_\Sigma \tag{2.1} \]
and consider this expression as a functional of $\phi$.



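Suppose the mean curvature vector of $\Sigma$ is space-like; we may choose a base frame with
\[ \hat e_3 = -\frac{H}{|H|} \tag{2.2} \]
and $\hat e_4$ the future-directed time-like unit normal that is orthogonal to $\hat e_3$. This choice makes $\langle \nabla^N_{e_a} \hat e_4, e_a\rangle = 0$. Integrating by parts, the functional becomes
\[ \int_\Sigma \left( \sqrt{1+|\nabla\tau|^2}\, \cosh\phi\, |H| - \alpha_{\hat e_3}(\nabla\tau) + \phi\, \Delta\tau \right) dv_\Sigma. \tag{2.3} \]
As $|H|$ is positive, this is clearly a convex functional of $\phi$ which achieves its minimum when
\[ \sinh\phi = \frac{-\Delta\tau}{|H|\sqrt{1+|\nabla\tau|^2}}. \tag{2.4} \]
We notice that the minimum is achieved by $\bar e_4$ such that the expression
\[ |H| \sinh\phi = \langle H, \bar e_4\rangle = \frac{-\Delta\tau}{\sqrt{1+|\nabla\tau|^2}} \tag{2.5} \]
depends only on $\tau$; this is taken as the characterizing property of $\bar e_4$ in [29].

Definition 2.2. Given an isometric embedding $i : \Sigma \to N$ into a space-time with space-like mean curvature vector $H$, denote
\[ \mathfrak{H}(\Sigma, i, \tau) = \int_\Sigma \left[ \sqrt{(\Delta\tau)^2 + |H|^2 (1+|\nabla\tau|^2)} - \nabla\tau\cdot\nabla\phi - \alpha_{\hat e_3}(\nabla\tau) \right] dv_\Sigma, \]
where $\phi$ is defined by (2.4) and $\alpha_{\hat e_3}$ is the connection one-form on $\Sigma$ associated with $\hat e_3$ in Eq. (2.2).

In terms of the frame $\bar e_3, \bar e_4$, where $\bar e_4$ is given by Eq. (2.5) and $\bar e_3$ is the space-like unit normal with $\langle \bar e_3, \bar e_4\rangle = 0$, we have
\[ \mathfrak{H}(\Sigma, i, \tau) = \int_\Sigma h(\Sigma, i, \tau, \bar e_3)\, dv_\Sigma = \int_\Sigma \left[ -\sqrt{1+|\nabla\tau|^2}\,\langle H, \bar e_3\rangle - \alpha_{\bar e_3}(\nabla\tau) \right] dv_\Sigma. \]

Proposition 2.1. If the mean curvature vector of the embedding $i : \Sigma \to N$ is space-like and $e_3$ is any space-like unit normal such that $\langle H, e_3\rangle < 0$, then
\[ \int_\Sigma h(\Sigma, i, \tau, e_3)\, dv_\Sigma \ge \mathfrak{H}(\Sigma, i, \tau). \]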
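The convexity argument behind Proposition 2.1 can be spelled out pointwise. The following is only a sketch under the assumptions of this section ($|H| > 0$ and the base frame (2.2)); the shorthand $F(\phi)$ for the $\phi$-dependent part of the integrand of (2.3) is introduced here for illustration and is not notation from the text.

```latex
% F is a hypothetical shorthand for the phi-dependent part of the integrand of (2.3).
\begin{align*}
F(\phi)   &= \sqrt{1+|\nabla\tau|^2}\,|H|\cosh\phi + \phi\,\Delta\tau, \\
F'(\phi)  &= \sqrt{1+|\nabla\tau|^2}\,|H|\sinh\phi + \Delta\tau, \\
F''(\phi) &= \sqrt{1+|\nabla\tau|^2}\,|H|\cosh\phi > 0 .
\end{align*}
% F'(phi) = 0 is exactly (2.4). At that phi, cosh(phi) = sqrt(1 + sinh^2(phi)) gives
\[
\sqrt{1+|\nabla\tau|^2}\,|H|\cosh\phi
  = \sqrt{|H|^2\,(1+|\nabla\tau|^2) + (\Delta\tau)^2},
\]
% which is the square-root term in Definition 2.2.
```

Since $F$ is strictly convex at every point of $\Sigma$, the pointwise critical value is the pointwise minimum, and integrating over $\Sigma$ (with $\phi\,\Delta\tau$ traded back for $-\nabla\tau\cdot\nabla\phi$ by integration by parts) gives the lower bound of Proposition 2.1.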

3. Isometric Embeddings into the Minkowski Space

3.1. Existence and uniqueness theorem. Let $\Sigma$ be a two-surface diffeomorphic to $S^2$. We fix a Riemannian metric $\sigma$ on $\Sigma$, $\sigma = \sigma_{ab}\, du^a du^b$, in local coordinates $u^1, u^2$. Denote the gradient, the Hessian, and the Laplace operator with respect to the metric $\sigma$ by $\nabla$, $\nabla^2$, and $\Delta$, respectively. We consider the isometric embedding problem of $(\Sigma, \sigma)$ into the Minkowski space $\mathbb{R}^{3,1}$ with prescribed mean curvature in a fixed time direction. Let $\langle\cdot,\cdot\rangle$ denote the standard Lorentzian metric on $\mathbb{R}^{3,1}$ and $T_0$ be a constant unit time-like vector in $\mathbb{R}^{3,1}$; we have the following existence and uniqueness theorem:



Theorem 3.1. Let $\lambda$ be a function on $\Sigma$ with $\int_\Sigma \lambda\, dv_\Sigma = 0$. Let $\tau$ be a potential function of $\lambda$, i.e. $\Delta\tau = \lambda$. Suppose
\[ K + (1+|\nabla\tau|^2)^{-1} \det(\nabla^2\tau) > 0, \tag{3.1} \]
where $K$ is the Gauss curvature of $\sigma$ and $\det(\nabla^2\tau)$ is the determinant of the Hessian of $\tau$. Then there exists a unique space-like embedding $X : \Sigma \to \mathbb{R}^{3,1}$ with the induced metric $\sigma$ and such that the mean curvature vector $H_0$ of the embedding satisfies
\[ \langle H_0, T_0\rangle = -\lambda. \tag{3.2} \]

Proof. We prove the uniqueness part first. Let $X_i : \Sigma \to \mathbb{R}^{3,1}$, $i = 1, 2$, be two isometric embeddings that satisfy (3.2). Since the mean curvature vector of the embedding $X_i$ is $\Delta X_i$, this implies $\Delta\langle X_1 - X_2, T_0\rangle = 0$, or $\langle X_1 - X_2, T_0\rangle$ is a constant on $\Sigma$. Denote $\tau_i = -\langle X_i, T_0\rangle$; we thus have $d\tau_1 = d\tau_2$. Now consider the projection $\hat X_i : \Sigma \to \mathbb{R}^3$ onto the orthogonal complement of $T_0$; $\hat X_i = X_i - \tau_i T_0$. The Gauss curvature of the embedding $\hat X_i$ can be computed as
\[ \hat K_i = (1+|\nabla\tau_i|^2)^{-1} \left[ K + (1+|\nabla\tau_i|^2)^{-1} \det(\nabla^2\tau_i) \right], \tag{3.3} \]
which is positive by the assumption. We compute the induced metric on the image of the embedding: $\langle d\hat X_i, d\hat X_i\rangle = \langle dX_i, dX_i\rangle + d\tau_i^2$. Since we assume $\langle dX_1, dX_1\rangle = \langle dX_2, dX_2\rangle = \sigma$, the $\hat X_i$’s are embeddings into $\mathbb{R}^3$ with the same induced metrics of positive Gauss curvature. By Theorem 1.2, $\hat X_1$ and $\hat X_2$ are congruent in $\mathbb{R}^3$. Since $\tau_1$ and $\tau_2$ differ by a constant, the $X_i$, as the graphs of $\tau_i$ over $\hat X_i$, are congruent in $\mathbb{R}^{3,1}$.

We turn to the existence part. We start with the metric $\sigma$ and the function $\lambda$ and solve for $\tau$ in $\Delta\tau = \lambda$. The Gauss curvature $\hat K$ of the new metric $\hat\sigma = \sigma + d\tau^2$ is again given by (3.3). Theorem 1.2 gives an embedding $\hat X : \Sigma \to \mathbb{R}^3$ with the induced metric $\hat\sigma$. Now $X = \hat X + \tau T_0$ is the desired isometric embedding into $\mathbb{R}^{3,1}$ that satisfies (3.2).
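The relation between the induced metrics of $X$ and its projection, used in both halves of the proof above, follows from a one-line expansion. This is a sketch assuming only $\langle T_0, T_0\rangle = -1$ and the definition $\tau = -\langle X, T_0\rangle$ (so that $\langle dX, T_0\rangle = -d\tau$); the index $i$ is dropped.

```latex
% Projection \hat X = X - \tau T_0, with <T_0,T_0> = -1 and <dX,T_0> = -d\tau.
\begin{align*}
\langle \hat X, T_0\rangle
  &= \langle X, T_0\rangle - \tau\langle T_0, T_0\rangle = -\tau + \tau = 0, \\
\langle d\hat X, d\hat X\rangle
  &= \langle dX, dX\rangle - 2\,d\tau\,\langle dX, T_0\rangle
     + d\tau^2\,\langle T_0, T_0\rangle \\
  &= \sigma + 2\,d\tau^2 - d\tau^2 = \sigma + d\tau^2 .
\end{align*}
```

The first line confirms that $\hat X$ lies in the orthogonal complement of $T_0$, and the second is the metric $\hat\sigma = \sigma + d\tau^2$ to which Theorem 1.2 is applied.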

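The existence theorem can be formulated in terms of $\tau$, as the mean curvature vector is given by $H_0 = \Delta X$.

Corollary 3.1. (Theorem B) Given a metric $\sigma$ and a function $\tau$ on $S^2$ such that the condition (3.1) holds, there exists a unique space-like isometric embedding $i_0 : S^2 \to \mathbb{R}^{3,1}$ with the induced metric $\sigma$ and the function $\tau$ as the time function.

3.2. Total mean curvature of the projection. In this section, we compute the total mean curvature $\int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma}$ of the projection $\hat\Sigma$ in $\mathbb{R}^3$ in terms of the geometry of $\Sigma$ in $\mathbb{R}^{3,1}$. Suppose $X : \Sigma \to \mathbb{R}^{3,1}$ is the embedding and $\tau = -\langle X, T_0\rangle$ is the restriction of the time function associated with $T_0$. The outward unit normal $\hat\nu$ of $\hat\Sigma$ in $\mathbb{R}^3$ and $T_0$ form an orthonormal basis for the normal bundle of $\hat\Sigma$ in $\mathbb{R}^{3,1}$. Extend $\hat\nu$ along $T_0$ by parallel translation and denote it by $\breve e_3$. We have

Proposition 3.1.
\[ \int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} = \int_\Sigma \left[ -\langle H_0, \breve e_3\rangle \sqrt{1+|\nabla\tau|^2} - \alpha_{\breve e_3}(\nabla\tau) \right] dv_\Sigma. \tag{3.4} \]

Proof. Denote by $\nabla^{\mathbb{R}^{3,1}}$ the flat connection associated with the Lorentzian metric on $\mathbb{R}^{3,1}$. Take an orthonormal basis $\hat e_a$, $a = 1, 2$, for the tangent space of $\hat\Sigma$ and compute
\[ \hat k = \langle \nabla^{\mathbb{R}^{3,1}}_{\hat e_a} \hat\nu, \hat e_a\rangle = \langle \nabla^{\mathbb{R}^{3,1}}_{\hat e_a} \hat\nu, \hat e_a\rangle + \langle \nabla^{\mathbb{R}^{3,1}}_{\hat\nu} \hat\nu, \hat\nu\rangle - \langle \nabla^{\mathbb{R}^{3,1}}_{T_0} \hat\nu, T_0\rangle, \]
because the last two terms are both zero. Therefore $\hat k = g^{\alpha\beta} \langle \nabla^{\mathbb{R}^{3,1}}_{e_\alpha} \hat\nu, e_\beta\rangle$ for any orthonormal frame $e_\alpha$ of $\mathbb{R}^{3,1}$, where $g^{\alpha\beta}$ is the inverse of $g_{\alpha\beta} = \langle e_\alpha, e_\beta\rangle$. Now $\breve e_3 = \hat\nu$ may be considered as a space-like normal vector field along $\Sigma$. Pick an orthonormal basis $\{e_1, e_2\}$ tangent to $\Sigma$. Let $\breve e_4 = \frac{1}{\sqrt{1+|\nabla\tau|^2}} (T_0 - T_0^\top)$ be the future-directed unit normal vector in the direction of the normal part of $T_0$. It is not hard to see that $T_0^\top = -\nabla\tau$. Then $\{\breve e_3, \breve e_4\}$ form an orthonormal basis for the normal bundle of $\Sigma$. We derive
\[ \hat k = \langle \nabla^{\mathbb{R}^{3,1}}_{e_a} \breve e_3, e_a\rangle - \langle \nabla^{\mathbb{R}^{3,1}}_{\breve e_4} \breve e_3, \breve e_4\rangle = -\langle H_0, \breve e_3\rangle - \frac{1}{\sqrt{1+|\nabla\tau|^2}} \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} \breve e_3, \breve e_4\rangle \tag{3.5} \]
because $\hat\nu$ is extended along $T_0$ by parallel translation. The area forms of $\Sigma$ and $\hat\Sigma$ are related by $dv_\Sigma = \frac{1}{\sqrt{1+|\nabla\tau|^2}}\, dv_{\hat\Sigma}$. Integrating Eq. (3.5) over $\hat\Sigma$, we obtain (3.4).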

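Suppose the mean curvature vector $H_0$ of $\Sigma$ in $\mathbb{R}^{3,1}$ is space-like. Let $e_3^{H_0} = -\frac{H_0}{|H_0|}$ be the unit vector in the direction of $H_0$ and $e_4^{H_0}$ the future-directed time-like unit normal vector with $\langle e_3^{H_0}, e_4^{H_0}\rangle = 0$. Suppose that
\[ e_3^{H_0} = \cosh\theta\, \breve e_3 + \sinh\theta\, \breve e_4, \quad\text{and}\quad e_4^{H_0} = \sinh\theta\, \breve e_3 + \cosh\theta\, \breve e_4. \]
Since $\Delta\tau = -\langle H_0, T_0\rangle$ and $T_0 = \sqrt{1+|\nabla\tau|^2}\, \breve e_4 - \nabla\tau$, we derive
\[ \sinh\theta = \frac{-\Delta\tau}{|H_0| \sqrt{1+|\nabla\tau|^2}}. \tag{3.6} \]
These imply the following relations:
\[ \breve e_3 = \cosh\theta\, e_3^{H_0} - \sinh\theta\, e_4^{H_0}, \quad\text{and}\quad \breve e_4 = -\sinh\theta\, e_3^{H_0} + \cosh\theta\, e_4^{H_0}. \]
The integrand on the right-hand side of (3.4) becomes
\[ |H_0| \cosh\theta \sqrt{1+|\nabla\tau|^2} - \nabla\theta\cdot\nabla\tau - \langle \nabla^{\mathbb{R}^{3,1}}_{\nabla\tau} e_3^{H_0}, e_4^{H_0}\rangle. \]
Therefore we have

Proposition 3.2. When the mean curvature vector of $\Sigma$ in $\mathbb{R}^{3,1}$ is space-like, $\int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma}$ is equal to
\[ \int_\Sigma \left[ \sqrt{(\Delta\tau)^2 + |H_0|^2 (1+|\nabla\tau|^2)} - \nabla\theta\cdot\nabla\tau - \alpha_{e_3^{H_0}}(\nabla\tau) \right] dv_\Sigma, \tag{3.7} \]
where $\theta$ is given by (3.6) and $\alpha_{e_3^{H_0}}$ is the one-form on $\Sigma$ defined by $\alpha_{e_3^{H_0}}(X) = \langle \nabla^{\mathbb{R}^{3,1}}_X e_3^{H_0}, e_4^{H_0}\rangle$.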
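The passage from the integrand of (3.4) to that of (3.7) rests on two short computations, sketched here under the frame relations stated above (with $\langle \breve e_4, \breve e_4\rangle = \langle e_4^{H_0}, e_4^{H_0}\rangle = -1$); nothing beyond those relations and (3.6) is assumed.

```latex
% Step 1: the mean curvature term, using H_0 = -|H_0| e_3^{H_0}.
\begin{align*}
-\langle H_0, \breve e_3\rangle
  &= |H_0|\,\langle e_3^{H_0},\ \cosh\theta\, e_3^{H_0} - \sinh\theta\, e_4^{H_0}\rangle
   = |H_0|\cosh\theta, \\
|H_0|\cosh\theta\,\sqrt{1+|\nabla\tau|^2}
  &= \sqrt{|H_0|^2(1+|\nabla\tau|^2) + (\Delta\tau)^2}
  \qquad\text{by (3.6) and } \cosh^2\theta = 1+\sinh^2\theta .
\end{align*}
% Step 2: the connection form changes by d\theta under the hyperbolic rotation.
\begin{align*}
\alpha_{\breve e_3}(X)
  &= \langle \nabla_X(\cosh\theta\, e_3^{H_0} - \sinh\theta\, e_4^{H_0}),\
             -\sinh\theta\, e_3^{H_0} + \cosh\theta\, e_4^{H_0}\rangle \\
  &= (\cosh^2\theta - \sinh^2\theta)\,\alpha_{e_3^{H_0}}(X) + X(\theta)
   = \alpha_{e_3^{H_0}}(X) + X(\theta).
\end{align*}
```

Taking $X = \nabla\tau$ in the second computation produces the term $-\nabla\theta\cdot\nabla\tau$ in (3.7).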



4. Jang’s Equation and Boundary Information

4.1. Jang’s equation. Jang’s equation was proposed by Jang [13] in an attempt to solve the positive energy conjecture. Schoen and Yau came up with different geometric interpretations, studied the equation in full, and applied them to their proof [24] of the positive mass theorem. Another important contribution of Schoen and Yau’s work in [24] is to understand the precise connection between the solvability of Jang’s equation and the existence of black holes. This leads to the later works on the existence of black holes due to condensation of matter and boundary effect [25,31].

Given an initial data set $(\Omega, g_{ij}, p_{ij})$, where $p_{ij}$ is a symmetric two-tensor that represents the second fundamental form of $\Omega$ with respect to a future-directed time-like normal $e_4$ in a space-time $N$, we consider the Riemannian product $\Omega \times \mathbb{R}$ and extend $p_{ij}$ by parallel translation along the $\mathbb{R}$ direction to a symmetric tensor $P(\cdot,\cdot)$ on $\Omega \times \mathbb{R}$. Such an extension makes $P(\cdot, v) = 0$, where $v$ denotes the downward unit vector in the $\mathbb{R}$ direction.

Jang’s equation asks for a hypersurface $\tilde\Omega$ in $\Omega \times \mathbb{R}$, defined as the graph of a function $f$ over $\Omega$, such that the mean curvature of $\tilde\Omega$ in $\Omega \times \mathbb{R}$ is the same as the trace of the restriction of $P$ to $\tilde\Omega$. In terms of local coordinates $x^i$ on $\Omega$, the equation takes the form
\[ \sum_{i,j=1}^{3} \left( g^{ij} - \frac{f^i f^j}{1+|Df|^2} \right) \left( \frac{D_i D_j f}{(1+|Df|^2)^{1/2}} - p_{ij} \right) = 0, \tag{4.1} \]
where $Df = g^{ij} \frac{\partial f}{\partial x^j} \frac{\partial}{\partial x^i}$ is the gradient of $f$, $|Df|^2 = g^{ij} \frac{\partial f}{\partial x^i} \frac{\partial f}{\partial x^j}$, and $D_i D_j f = \frac{\partial^2 f}{\partial x^i \partial x^j} - \Gamma^k_{ij} \frac{\partial f}{\partial x^k}$ is the Hessian of $f$.

Pick an orthonormal basis $\{\tilde e_\alpha\}_{\alpha=1\cdots4}$ for the tangent space of $\Omega \times \mathbb{R}$ along $\tilde\Omega$ such that $\{\tilde e_i\}_{i=1\cdots3}$ is tangent to $\tilde\Omega$ and $\tilde e_4$ is the downward unit normal; then Jang’s equation is
\[ \sum_{i=1}^{3} \langle \tilde\nabla_{\tilde e_i} \tilde e_4, \tilde e_i\rangle = \sum_{i=1}^{3} P(\tilde e_i, \tilde e_i); \tag{4.2} \]
here and throughout this section $\tilde\nabla$ is the Levi-Civita connection on the product space $\Omega \times \mathbb{R}$.

4.2. Boundary calculations. Let $\tau$ be a smooth function on $\Sigma = \partial\Omega$. We consider a solution $f$ of Jang’s equation in $\Omega \times \mathbb{R}$ that satisfies the Dirichlet boundary condition $f = \tau$ on $\Sigma$. Denote the graph of $\tau$ over $\Sigma$ by $\tilde\Sigma$ and the graph of $f$ over $\Omega$ by $\tilde\Omega$, so that $\partial\tilde\Omega = \tilde\Sigma$.

We choose orthonormal frames $\{e_1, e_2\}$ and $\{\tilde e_1, \tilde e_2\}$ for $T\Sigma$ and $T\tilde\Sigma$, respectively. Let $e_3$ be the outward normal of $\Sigma$ that is tangent to $\Omega$. We also choose $\tilde e_3, \tilde e_4$ for the normal bundle of $\tilde\Sigma$ in $\Omega \times \mathbb{R}$ such that $\tilde e_3$ is tangent to the graph $\tilde\Omega$ and $\tilde e_4$ is a downward unit normal vector of $\tilde\Omega$ in $\Omega \times \mathbb{R}$. Then $\{e_1, e_2, e_3, v\}$ forms an orthonormal basis for the tangent space of $\Omega \times \mathbb{R}$, and so does $\{\tilde e_\alpha\}_{\alpha=1\cdots4}$. All these frames are extended along the $\mathbb{R}$ direction by parallel translation. Along $\Sigma$, we have
\[ Df = \nabla\tau + f_3 e_3, \]


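where $f_3 = e_3(f)$ is the normal derivative of $f$. The vectors $\tilde e_3$ and $\tilde e_4$ can be written down explicitly:
\[ \tilde e_3 = \frac{1}{\sqrt{1+|Df|^2}} \left( \sqrt{1+|\nabla\tau|^2}\, e_3 - \frac{f_3}{\sqrt{1+|\nabla\tau|^2}} (v + \nabla\tau) \right) \quad\text{and}\quad \tilde e_4 = \frac{1}{\sqrt{1+|Df|^2}} (v + Df). \tag{4.3} \]
We check that $\tilde e_3$ and $\tilde e_4$ are orthogonal to $e_a - e_a(\tau) v$ for $a = 1, 2$. Simple calculations yield
\[ \langle e_3, \tilde e_3\rangle = \frac{\sqrt{1+|\nabla\tau|^2}}{\sqrt{1+|Df|^2}}, \quad\text{and}\quad \langle e_3, \tilde e_4\rangle = \frac{f_3}{\sqrt{1+|Df|^2}}. \tag{4.4} \]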
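The "simple calculations" behind (4.3) and (4.4) can be checked directly. The sketch below assumes only that $\{e_1, e_2, e_3, v\}$ is orthonormal and that $|Df|^2 = |\nabla\tau|^2 + f_3^2$ along $\Sigma$, which follows from $Df = \nabla\tau + f_3 e_3$.

```latex
% Orthonormality of \tilde e_3, \tilde e_4 from (4.3), using |Df|^2 = |\nabla\tau|^2 + f_3^2.
\begin{align*}
|\tilde e_4|^2
  &= \frac{|v|^2 + |Df|^2}{1+|Df|^2} = 1, \\
|\tilde e_3|^2
  &= \frac{1}{1+|Df|^2}\left( (1+|\nabla\tau|^2)
      + \frac{f_3^2}{1+|\nabla\tau|^2}\,(1+|\nabla\tau|^2) \right)
   = \frac{1+|\nabla\tau|^2+f_3^2}{1+|Df|^2} = 1, \\
\langle \tilde e_3, \tilde e_4\rangle
  &= \frac{1}{1+|Df|^2}\left( \sqrt{1+|\nabla\tau|^2}\, f_3
      - \frac{f_3}{\sqrt{1+|\nabla\tau|^2}}\,(1+|\nabla\tau|^2) \right) = 0 .
\end{align*}
% Tangency check against e_a - e_a(\tau) v, a = 1, 2:
\begin{align*}
\langle v + Df,\ e_a - e_a(\tau)\, v\rangle &= -e_a(\tau) + e_a(\tau) = 0, \\
\langle v + \nabla\tau,\ e_a - e_a(\tau)\, v\rangle &= -e_a(\tau) + e_a(\tau) = 0 .
\end{align*}
```

Reading off the $e_3$-components of $\tilde e_3$ and $\tilde e_4$ in (4.3) then gives the two inner products in (4.4).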

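Let $\tilde k = \langle \tilde\nabla_{\tilde e_a} \tilde e_3, \tilde e_a\rangle$ be the mean curvature of $\tilde\Sigma$ with respect to $\tilde\Omega$. We are particularly interested in the following expression on $\tilde\Sigma$:
\[ \tilde k - \langle \tilde\nabla_{\tilde e_4} \tilde e_4, \tilde e_3\rangle + P(\tilde e_4, \tilde e_3). \tag{4.5} \]

Theorem 4.1. Let $i : \Sigma \to N$ be a space-like embedding. Given any smooth function $\tau$ on $\Sigma$ and any space-like hypersurface $\Omega$ with $\partial\Omega = \Sigma$, suppose the Dirichlet problem of Jang’s equation (4.1) over $\Omega$ subject to the boundary condition $f = \tau$ on $\Sigma$ is solvable. Then there exists a space-like unit normal $e_3'$ along $\Sigma$ in $N$ such that the expression (4.5) at $\tilde q \in \tilde\Sigma$ is equal to
\[ -\langle H, e_3'\rangle - (1+|\nabla\tau|^2)^{-1/2} \alpha_{e_3'}(\nabla\tau) \quad\text{at } q \in \Sigma, \]
where $\tilde q = (q, \tau(q)) \in \tilde\Sigma$. In particular
\[ \int_{\tilde\Sigma} \left[ \tilde k - \langle \tilde\nabla_{\tilde e_4} \tilde e_4, \tilde e_3\rangle + P(\tilde e_4, \tilde e_3) \right] dv_{\tilde\Sigma} = \int_\Sigma \left[ -\sqrt{1+|\nabla\tau|^2}\, \langle H, e_3'\rangle - \alpha_{e_3'}(\nabla\tau) \right] dv_\Sigma. \tag{4.6} \]
Let $e_3$ be the outward unit normal of $\Sigma$ that is tangent to $\Omega$ and $e_4$ the future-directed time-like normal of $\Omega$ in $N$; $e_3'$ is given by $e_3' = \cosh\phi\, e_3 + \sinh\phi\, e_4$, where
\[ \sinh\phi = \frac{-f_3}{\sqrt{1+|\nabla\tau|^2}}. \tag{4.7} \]

Proof. The proof is through a sequence of calculations using the product structure of $\Omega \times \mathbb{R}$ and Jang’s equation. It also relies on the fact that $P$, $\{\tilde e_\alpha\}_{\alpha=1}^{4}$, and $\{e_1, e_2, e_3, v\}$ are all parallel in the direction of $v$. We first prove the following identity:
\[ \tilde k - \langle \tilde\nabla_{\tilde e_4} \tilde e_4, \tilde e_3\rangle + P(\tilde e_3, \tilde e_4) = \langle \tilde\nabla_{e_a} \tilde e_3, e_a\rangle + \frac{\langle e_3, \tilde e_4\rangle}{\langle e_3, \tilde e_3\rangle} \langle \tilde\nabla_{e_a} \tilde e_4, e_a\rangle - \frac{\langle e_3, \tilde e_4\rangle}{\langle e_3, \tilde e_3\rangle} P(e_a, e_a) + \frac{1}{\langle e_3, \tilde e_3\rangle} P(e_3, \tilde e_4 - \langle \tilde e_4, e_3\rangle e_3). \tag{4.8} \]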



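We compute the terms $\langle \tilde\nabla_{e_a} \tilde e_3, e_a\rangle$ and $\langle \tilde\nabla_{e_a} \tilde e_4, e_a\rangle$ in the following:
\[ \sum_{a=1}^{2} \langle \tilde\nabla_{e_a} \tilde e_3, e_a\rangle = \sum_{i=1}^{3} \langle \tilde\nabla_{e_i} \tilde e_3, e_i\rangle - \langle \tilde\nabla_{e_3} \tilde e_3, e_3\rangle = \sum_{\alpha=1}^{4} \langle \tilde\nabla_{\tilde e_\alpha} \tilde e_3, \tilde e_\alpha\rangle - \langle \tilde\nabla_{e_3} \tilde e_3, e_3\rangle, \]
as $\{e_1, e_2, e_3, v\}$ and $\{\tilde e_\alpha\}_{\alpha=1\cdots4}$ are both orthonormal frames for the tangent space of $\Omega \times \mathbb{R}$ and $\tilde\nabla_v \tilde e_3 = 0$. Notice that $\langle \tilde\nabla_{\tilde e_3} \tilde e_3, \tilde e_3\rangle = 0$ and thus we obtain
\[ \sum_{a=1}^{2} \langle \tilde\nabla_{e_a} \tilde e_3, e_a\rangle = \tilde k + \langle \tilde\nabla_{\tilde e_4} \tilde e_3, \tilde e_4\rangle - \langle \tilde\nabla_{e_3} \tilde e_3, e_3\rangle. \tag{4.9} \]
On the other hand,
\[ \sum_{a=1}^{2} \langle \tilde\nabla_{e_a} \tilde e_4, e_a\rangle = \sum_{i=1}^{3} \langle \tilde\nabla_{e_i} \tilde e_4, e_i\rangle - \langle \tilde\nabla_{e_3} \tilde e_4, e_3\rangle = \sum_{i=1}^{3} \langle \tilde\nabla_{\tilde e_i} \tilde e_4, \tilde e_i\rangle - \langle \tilde\nabla_{e_3} \tilde e_4, e_3\rangle. \]
Applying Jang’s equation (4.2), we obtain
\[ \sum_{a=1}^{2} \langle \tilde\nabla_{e_a} \tilde e_4, e_a\rangle = \sum_{i=1}^{3} P(\tilde e_i, \tilde e_i) - \langle \tilde\nabla_{e_3} \tilde e_4, e_3\rangle. \]
Furthermore, we derive
\[ \sum_{i=1}^{3} P(\tilde e_i, \tilde e_i) = \sum_{i=1}^{3} P(e_i, e_i) + \frac{\langle e_3, \tilde e_3\rangle}{\langle e_3, \tilde e_4\rangle} P(\tilde e_3, \tilde e_4) - \frac{1}{\langle e_3, \tilde e_4\rangle} P(e_3, \tilde e_4), \]
using
\[ \sum_{i=1}^{3} P(\tilde e_i, \tilde e_i) = \sum_{\alpha=1}^{4} P(\tilde e_\alpha, \tilde e_\alpha) - P(\tilde e_4, \tilde e_4) = \sum_{i=1}^{3} P(e_i, e_i) - P(\tilde e_4, \tilde e_4) \]
and $\langle e_3, \tilde e_4\rangle P(\tilde e_4, \tilde e_4) = P(e_3 - \langle e_3, \tilde e_3\rangle \tilde e_3, \tilde e_4)$. Therefore, we arrive at
\[ \sum_{a=1}^{2} \langle \tilde\nabla_{e_a} \tilde e_4, e_a\rangle = \sum_{a=1}^{2} P(e_a, e_a) + \frac{\langle e_3, \tilde e_3\rangle}{\langle e_3, \tilde e_4\rangle} P(\tilde e_3, \tilde e_4) - \frac{1}{\langle e_3, \tilde e_4\rangle} P(e_3, \tilde e_4 - \langle \tilde e_4, e_3\rangle e_3) - \langle \tilde\nabla_{e_3} \tilde e_4, e_3\rangle. \tag{4.10} \]
Combining (4.9) and (4.10) yields (4.8).

Let $k = \langle \tilde\nabla_{e_a} e_3, e_a\rangle$ be the mean curvature of $\Sigma$ (as the boundary of $\Omega$) with respect to $e_3$. As $\langle e_3, \tilde e_a\rangle = 0$ for $a = 1, 2$, we have $e_3 = \langle e_3, \tilde e_3\rangle \tilde e_3 + \langle e_3, \tilde e_4\rangle \tilde e_4$. Plugging this into the expression for $k$, we obtain
\[ k = \langle e_3, \tilde e_3\rangle \langle \tilde\nabla_{e_a} \tilde e_3, e_a\rangle + \langle e_3, \tilde e_4\rangle \langle \tilde\nabla_{e_a} \tilde e_4, e_a\rangle - \langle e_3, \tilde e_3\rangle\, e_a(\langle e_a, \tilde e_3\rangle) - \langle e_3, \tilde e_4\rangle\, e_a(\langle e_a, \tilde e_4\rangle). \]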


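From here we solve for $\langle \tilde\nabla_{e_a} \tilde e_3, e_a\rangle + \frac{\langle e_3, \tilde e_4\rangle}{\langle e_3, \tilde e_3\rangle} \langle \tilde\nabla_{e_a} \tilde e_4, e_a\rangle$ and substitute into (4.8) to obtain
\[ \tilde k - \langle \tilde\nabla_{\tilde e_4} \tilde e_4, \tilde e_3\rangle + P(\tilde e_3, \tilde e_4) = \frac{1}{\langle e_3, \tilde e_3\rangle} k - \frac{\langle e_3, \tilde e_4\rangle}{\langle e_3, \tilde e_3\rangle} P(e_a, e_a) + \frac{1}{\langle e_3, \tilde e_3\rangle} P(e_3, \tilde e_4 - \langle \tilde e_4, e_3\rangle e_3) + e_a(\langle e_a, \tilde e_3\rangle) + \frac{\langle e_3, \tilde e_4\rangle}{\langle e_3, \tilde e_3\rangle}\, e_a(\langle e_a, \tilde e_4\rangle). \tag{4.11} \]
We calculate the right-hand side of (4.11) using (4.3) and
\[ P(e_3, \tilde e_4 - \langle \tilde e_4, e_3\rangle e_3) = \frac{1}{\sqrt{1+|Df|^2}} P(e_3, \nabla\tau). \]
The last two terms can also be calculated using (4.3):
\[ e_a(\langle e_a, \tilde e_3\rangle) + \frac{\langle e_3, \tilde e_4\rangle}{\langle e_3, \tilde e_3\rangle}\, e_a(\langle e_a, \tilde e_4\rangle) = \mathrm{div}\!\left( \frac{-f_3 \nabla\tau}{\sqrt{(1+|Df|^2)(1+|\nabla\tau|^2)}} \right) + \frac{f_3}{\sqrt{1+|\nabla\tau|^2}}\, \mathrm{div}\!\left( \frac{\nabla\tau}{\sqrt{1+|Df|^2}} \right) = -\frac{1}{\sqrt{1+|Df|^2}}\, \nabla\tau \cdot \nabla\!\left( \frac{f_3}{\sqrt{1+|\nabla\tau|^2}} \right). \]
Recalling the definition of $\phi$ from (4.7), this is equal to
\[ \frac{\nabla\tau\cdot\nabla\phi}{\sqrt{1+|\nabla\tau|^2}}. \]
The right-hand side of (4.11) is therefore
\[ (1+|\nabla\tau|^2)^{-1/2} \left[ \sqrt{1+|Df|^2}\, k - f_3 P(e_a, e_a) + P(e_3, \nabla\tau) + \nabla\tau\cdot\nabla\phi \right]. \tag{4.12} \]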
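The step "recalling the definition of $\phi$ from (4.7)" hides a short computation. It is sketched here using only (4.7) and $|Df|^2 = |\nabla\tau|^2 + f_3^2$ along $\Sigma$; the abbreviation $g$ is introduced for illustration and is not notation from the text.

```latex
% g is a hypothetical shorthand: g = f_3 / sqrt(1+|\nabla\tau|^2), so sinh(phi) = -g by (4.7).
\begin{align*}
\cosh\phi &= \sqrt{1+g^2}
           = \sqrt{\frac{1+|\nabla\tau|^2+f_3^2}{1+|\nabla\tau|^2}}
           = \frac{\sqrt{1+|Df|^2}}{\sqrt{1+|\nabla\tau|^2}}, \\
\nabla g  &= -\nabla(\sinh\phi) = -\cosh\phi\,\nabla\phi, \\
-\frac{1}{\sqrt{1+|Df|^2}}\,\nabla\tau\cdot\nabla g
          &= \frac{\cosh\phi}{\sqrt{1+|Df|^2}}\,\nabla\tau\cdot\nabla\phi
           = \frac{\nabla\tau\cdot\nabla\phi}{\sqrt{1+|\nabla\tau|^2}} .
\end{align*}
```

The same value of $\cosh\phi$, together with $\sinh\phi = -f_3/\sqrt{1+|\nabla\tau|^2}$, is what converts the coefficients $\sqrt{1+|Df|^2}$ and $-f_3$ in (4.12) into $\cosh\phi$ and $\sinh\phi$ after factoring out $(1+|\nabla\tau|^2)^{1/2}$.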

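This is an expression on $\Sigma$ that depends on the functions $\tau$ and $f_3$ on $\Sigma$. Recall that the symmetric tensor $P$ originates from the second fundamental form of $\Omega$ with respect to the future-directed unit time-like normal $e_4$ in the space-time $N$. Rewrite the expression (4.12) in terms of $e_3$ and $e_4$:
\[ \tilde k - \langle \tilde\nabla_{\tilde e_4} \tilde e_4, \tilde e_3\rangle + P(\tilde e_3, \tilde e_4) = (1+|\nabla\tau|^2)^{-1/2} \left[ \sqrt{1+|Df|^2}\, \langle \nabla^N_{e_a} e_3, e_a\rangle - f_3 \langle \nabla^N_{e_a} e_4, e_a\rangle - \alpha_{e_3}(\nabla\tau) + \nabla\tau\cdot\nabla\phi \right]. \tag{4.13} \]
On the other hand, with the orthonormal frame $e_3', e_4'$ given by $e_3' = \cosh\phi\, e_3 + \sinh\phi\, e_4$, $e_4' = \sinh\phi\, e_3 + \cosh\phi\, e_4$, we compute
\[ \langle \nabla^N_{e_a} e_3', e_a\rangle - (1+|\nabla\tau|^2)^{-1/2} \alpha_{e_3'}(\nabla\tau) = \cosh\phi\, \langle \nabla^N_{e_a} e_3, e_a\rangle + \sinh\phi\, \langle \nabla^N_{e_a} e_4, e_a\rangle - (1+|\nabla\tau|^2)^{-1/2} \left( \alpha_{e_3}(\nabla\tau) - \nabla\tau\cdot\nabla\phi \right). \]
Plugging in the expressions for $\cosh\phi$ and $\sinh\phi$, we recover the right-hand side of (4.13).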



4.3. Boundary gradient estimate. In this section, we demonstrate a sufficient condition for Jang’s equation to be solvable. As most estimates are derived in Schoen-Yau’s original paper [24] for the asymptotically flat case, it suffices to control the boundary gradient of the solution.

Theorem 4.2. The normal derivative of a solution of the Dirichlet problem of Jang’s equation is bounded if $k > |\mathrm{tr}_\Sigma P|$.

Proof. We consider the operator
\[ Q(f) = \left( g^{ij} - \frac{f^i f^j}{1+|Df|^2} \right) \frac{D_i D_j f}{(1+|Df|^2)^{1/2}} - \mathrm{tr}_{\tilde\Omega} P, \]
where $\tilde\Omega$ is the graph of $f$ over $\Omega$. The point is to construct sub and super solutions of this operator with the prescribed boundary condition. Denote by $d$ the distance function to $\partial\Omega$. We extend the boundary data $\tau$ to the interior of $\Omega$, still denoted by $\tau$. Consider a test function $f = \psi(d) + \tau$ as the one in (14.11) of [11], where $\psi(d) = \frac{1}{\nu} \log(1+\kappa d)$ with $\kappa, \nu > 0$. In particular $\psi'' = -\nu(\psi')^2 < 0$, $\psi' > 0$, and $\psi'(d) \to \infty$ as $\kappa \to \infty$. We compute
\[ D_i D_j f = \psi'' d_i d_j + \psi' D_i D_j d + D_i D_j \tau. \]
Therefore,
\[ \left( g^{ij} - \frac{f^i f^j}{1+|Df|^2} \right) \frac{D_i D_j f}{(1+|Df|^2)^{1/2}} = \psi'' \left( g^{ij} - \frac{f^i f^j}{1+|Df|^2} \right) \frac{d_i d_j}{(1+|Df|^2)^{1/2}} + \psi' \left( g^{ij} - \frac{f^i f^j}{1+|Df|^2} \right) \frac{D_i D_j d}{(1+|Df|^2)^{1/2}} + \left( g^{ij} - \frac{f^i f^j}{1+|Df|^2} \right) \frac{D_i D_j \tau}{(1+|Df|^2)^{1/2}}. \]
Applying
\[ \frac{1}{1+|Df|^2}\, g^{ij} \le g^{ij} - \frac{f^i f^j}{1+|Df|^2} \le g^{ij} \]
and $|Dd| = 1$, we derive that the first term is bounded above by
\[ \frac{\psi''}{(1+|Df|^2)^{3/2}} \]
and the third term is bounded above by
\[ \frac{|D^2\tau|}{(1+|Df|^2)^{1/2}}. \]
The second term is
\[ \psi' \left( g^{ij} - \frac{f^i f^j}{1+|Df|^2} \right) \frac{D_i D_j d}{(1+|Df|^2)^{1/2}} = \psi' \frac{\Delta d}{(1+|Df|^2)^{1/2}} - \frac{\psi'}{(1+|Df|^2)^{3/2}}\, f^i f^j D_i D_j d. \]
We compute
\[ f^i f^j D_i D_j d = (\psi' d^i + \tau^i)(\psi' d^j + \tau^j) D_i D_j d = \tau^i \tau^j D_i D_j d, \]
where we used the identity $d^i D_i D_j d = 0$. Therefore, $Q(f)$ is bounded from above by
\[ \frac{\psi'' + (1+|Df|^2)|D^2\tau| - \psi' \tau^i \tau^j D_i D_j d}{(1+|Df|^2)^{3/2}} + \frac{\psi' \Delta d}{(1+|Df|^2)^{1/2}} - \mathrm{tr}_{\tilde\Omega} P. \]
We recall that $\tilde\Omega$ is the graph of $\psi(d) + \tau$ over $\Omega$. Let $\Omega_a = \{d \le a\} \cap \Omega$ and $\partial\tilde\Omega_a$ be the graph of $\psi(d) + \tau$ over $\partial\Omega_a$. We have
\[ \mathrm{tr}_{\tilde\Omega} P = \mathrm{tr}_{\partial\tilde\Omega_a} P + \frac{P(Dd, Dd)}{1 + (\psi' + D\tau\cdot Dd)^2}. \]
Therefore $Q(f)$ is bounded from above by
\[ \frac{\psi'' + (1+|Df|^2)|D^2\tau| - \psi' \tau^i \tau^j D_i D_j d}{(1+|Df|^2)^{3/2}} + \frac{\psi' \Delta d}{(1+|Df|^2)^{1/2}} - \mathrm{tr}_{\partial\tilde\Omega_a} P - \frac{P(Dd, Dd)}{1 + (\psi' + D\tau\cdot Dd)^2}. \]
When $\tau = 0$, this recovers formula (5.11) in [31]. In general, we recall $Df = \psi' Dd + D\tau$ and
\[ |Df|^2 \ge \theta (\psi')^2 - \frac{\theta}{1-\theta} |D\tau|^2 \]
for any positive $\theta < 1$. We notice that $\Delta d$ approaches $-k$, where $k$ is the mean curvature of $\partial\Omega$ in $\Omega$. However, $\mathrm{tr}_{\partial\tilde\Omega_a} P$ approaches $\mathrm{tr}_\Sigma P$. Thus a sub and a super solution exist when $k \ge |\mathrm{tr}_\Sigma P|$.
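The lower bound on $|Df|^2$ quoted in the proof is an instance of the Cauchy-Schwarz and Young inequalities. A sketch, assuming only $Df = \psi' Dd + D\tau$, $|Dd| = 1$, $\psi' > 0$ and $0 < \theta < 1$:

```latex
% Young's inequality 2ab <= (1-\theta) a^2 + b^2/(1-\theta), with a = \psi', b = |D\tau|.
\begin{align*}
|Df|^2 &= (\psi')^2 + 2\psi'\, Dd\cdot D\tau + |D\tau|^2 \\
       &\ge (\psi')^2 - 2\psi'\,|D\tau| + |D\tau|^2 \\
       &\ge (\psi')^2 - (1-\theta)(\psi')^2 - \frac{1}{1-\theta}|D\tau|^2 + |D\tau|^2 \\
       &= \theta(\psi')^2 - \frac{\theta}{1-\theta}|D\tau|^2 .
\end{align*}
```

Since $D\tau$ is bounded, the estimate shows that $|Df|$ is comparable to $\psi'$ once $\psi'$ is large, which is what makes the $\psi'\Delta d$ term dominate in the bound for $Q(f)$.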

5. New Quasi-Local Mass and the Positivity

First we define an admissible time function for a surface in space-time.

Definition 5.1. Given a space-like embedding $i : \Sigma \to N$, a smooth function $\tau$ on $\Sigma$ is said to be admissible if:
(1) $K + (1+|\nabla\tau|^2)^{-1} \det(\nabla^2\tau) > 0$.
(2) $\Sigma$ bounds an embedded space-like three-manifold $\Omega$ in $N$ such that Jang’s equation (4.1) with the Dirichlet boundary data $\tau$ is solvable on $\Omega$.
(3) The generalized mean curvature $h(\Sigma, i, \tau, e_3') > 0$ for the space-like unit normal $e_3'$ determined by Jang’s equation through (4.7).

We are now ready to define the quasi-local mass.



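Definition 5.2. Given a space-like embedding $i : \Sigma \to N$, suppose the set of admissible functions is non-empty. The quasi-local mass is defined to be the infimum of $\mathfrak{H}(\Sigma, i_0, \tau) - \mathfrak{H}(\Sigma, i, \tau)$ among all admissible $\tau$, where $\mathfrak{H}$ is given by Definition 2.2 and $i_0$ is the unique space-like isometric embedding of $\Sigma$ into $\mathbb{R}^{3,1}$ associated with $\tau$ given by Theorem B.

The proof of the positivity of quasi-local mass is based on the following theorem, which can be considered as a total mean curvature comparison theorem for solutions of Jang’s equation.

Theorem 5.1. Suppose $\Omega$ is a Riemannian three-manifold with boundary $\Sigma$ and suppose there exists a vector field $X$ on $\Omega$ such that
\[ R \ge 2|X|^2 - 2\,\mathrm{div} X \tag{5.1} \]
in $\Omega$, where $R$ is the scalar curvature of $\Omega$, and
\[ k > \langle X, \nu\rangle \tag{5.2} \]
on $\Sigma$, where $\nu$ is the outward normal of $\Sigma$ and $k$ is the mean curvature of $\Sigma$ with respect to $\nu$. Suppose the Gauss curvature of $\Sigma$ is positive and $k_0$ is the mean curvature of the isometric embedding of $\Sigma$ into $\mathbb{R}^3$. Then
\[ \int_\Sigma k_0\, dv_\Sigma \ge \int_\Sigma \left( k - \langle X, \nu\rangle \right) dv_\Sigma. \]

Remark 5.1. When $X = 0$, the theorem was proved by Shi-Tam [26]. By the calculation in Schoen-Yau [23], the condition (5.1) holds for any solution of Jang’s equation over an initial data set that satisfies the dominant energy condition. The vector field $X$ is the dual of $\langle \tilde\nabla_{\tilde e_4} \tilde e_4, \cdot\rangle - P(\tilde e_4, \cdot)$ in the notation of Sect. 4. In this case, Liu-Yau [16] essentially proved the theorem by conformally changing the metric to zero scalar curvature. The proof of Theorem 6.2 in [28] gives a direct proof without conformal change in a slightly different setting.

Proof. The idea of the proof is similar to the one by Shi-Tam. Consider the isometric embedding of $\Sigma$ into $\mathbb{R}^3$ and denote the region inside the image $\Sigma_0$ by $\Omega_0$. We then glue together $\Omega$ and $\mathbb{R}^3 \setminus \Omega_0$ along the identification of $\Sigma$ and $\Sigma_0$. Write the metric on $\mathbb{R}^3 \setminus \Omega_0$ in the form $dr^2 + g_r$, where $r$ is the distance function to $\Sigma_0$ and $g_r$ is the induced metric on the level set $\Sigma_r$ of $r$. Applying Bartnik’s [2] quasi-spherical construction, we consider a new metric on $\mathbb{R}^3 \setminus \Omega_0$ of the form $u^2 dr^2 + g_r$ with zero scalar curvature and $u = \frac{k_0}{k - \langle X, \nu\rangle}$ at $r = 0$. Then $u$ satisfies a parabolic equation and the solution gives an asymptotically flat metric on $\mathbb{R}^3 \setminus \Omega_0$. Denote by $\tilde M$ the space $\Omega \cup \mathbb{R}^3 \setminus \Omega_0$ with the new metric $u^2 dr^2 + g_r$ on $\mathbb{R}^3 \setminus \Omega_0$. The initial condition on $u$ implies that the mean curvature of $\Sigma_0$ with respect to this new metric is $k - \langle X, \nu\rangle$. We still have the monotonicity formula, i.e.
\[ \frac{d}{dr} \int_{\Sigma_r} k_0(r) \left( 1 - \frac{1}{u} \right) dv_r \le 0, \]
where $k_0(r)$ is the mean curvature of $\Sigma_r$ in $\mathbb{R}^3$. Therefore, it remains to prove the positivity of the total mass of $\tilde M$. In the following, we prove a Lichnerowicz formula for such a manifold and the existence of harmonic spinors asymptotic to constant spinors. According to the standard Lichnerowicz formula, on $\Omega$ we have
\[ \int_\Omega \left( |\nabla\psi|^2 + \frac{1}{4} R |\psi|^2 - |D\psi|^2 \right) = \int_{\partial\Omega} \langle \psi, \nabla_\nu \psi + c(\nu)\cdot D\psi\rangle, \tag{5.3} \]
where $\psi$ is a spinor, $c(\cdot)$ is the Clifford multiplication, $\nabla$ is the spin connection, and $D$ is the Dirac operator. Integrating by parts, we obtain
\[ \int_{\partial\Omega} \langle X, \nu\rangle |\psi|^2 = \int_\Omega \left( \mathrm{div} X\, |\psi|^2 + X(|\psi|^2) \right). \]
Formula (5.3) is equivalent to
\[ \int_\Omega \left( |\nabla\psi|^2 + \frac{1}{4} (R + 2\,\mathrm{div} X) |\psi|^2 + \frac{1}{2} X(|\psi|^2) - |D\psi|^2 \right) = \int_{\partial\Omega} \langle \psi, \nabla_\nu \psi + c(\nu)\cdot D\psi\rangle + \frac{1}{2} \int_{\partial\Omega} \langle X, \nu\rangle |\psi|^2. \tag{5.4} \]
The boundary term can be rearranged as
\[ \int_{\partial\Omega} \left( -\langle \psi, D^{\partial\Omega} \psi\rangle - \frac{1}{2} (k - \langle X, \nu\rangle) |\psi|^2 \right), \]
where $-D^{\partial\Omega} \psi = \sum_{a=1}^{2} c(\nu)\cdot c(e_a)\cdot \nabla^{\partial\Omega}_{e_a} \psi$ for an orthonormal basis $e_1, e_2$ for the tangent bundle of $\Sigma$. Let $\tilde M_r \subset \tilde M$ be the region with $\partial \tilde M_r = \Sigma_r$. On $\tilde M_r \setminus \Omega$, we have
\[ \int_{\tilde M_r \setminus \Omega} \left( |\nabla\psi|^2 - |D\psi|^2 \right) = \int_{\partial\Omega} \left( \langle \psi, D^{\partial\Omega} \psi\rangle + \frac{1}{2} (k - \langle X, \nu\rangle) |\psi|^2 \right) + \int_{\Sigma_r} \langle \nabla_{\nu_r} \psi + c(\nu_r)\cdot D\psi, \psi\rangle. \]
Adding these up, we obtain
\[ \int_{\tilde M_r} \left( |\nabla\psi|^2 + \frac{1}{4} (R + 2\,\mathrm{div} X) |\psi|^2 + \frac{1}{2} X(|\psi|^2) \right) = \int_{\tilde M_r} |D\psi|^2 + \int_{\Sigma_r} \langle \nabla_\nu \psi + c(\nu)\cdot D\psi, \psi\rangle. \tag{5.5} \]
We claim the left-hand side of (5.5) is always greater than or equal to
\[ \frac{1}{2} \int_{\tilde M_r} |\nabla\psi|^2. \]
This follows from the inequality $|\nabla\psi|^2 + |X|^2 |\psi|^2 + X(|\psi|^2) \ge 0$, as
\[ X(|\psi|^2) = \langle \nabla_X \psi, \psi\rangle + \langle \psi, \nabla_X \psi\rangle \ge -2 |\nabla_X \psi| |\psi|. \]
Thus if we can solve the harmonic spinor equation $D\psi = 0$ we obtain
\[ \lim_{r\to\infty} \int_{\Sigma_r} \langle \nabla_\nu \psi + c(\nu)\cdot D\psi, \psi\rangle \ge 0, \]
and it is known that the limit expression for a constant spinor gives the total mass of $\tilde M$. Equation (5.5) also implies the following coercive estimate for spinors of compact support:
\[ \int_{\tilde M_r} |D\psi|^2 \ge \frac{1}{2} \int_{\tilde M_r} |\nabla\psi|^2, \]
which is enough to establish the existence of harmonic spinors that are asymptotic to constant spinors at infinity.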
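The claim that the left-hand side of (5.5) dominates $\frac{1}{2}\int |\nabla\psi|^2$ combines hypothesis (5.1) with the pointwise inequality quoted in the proof. A sketch of the pointwise estimate on $\Omega$, assuming only (5.1) and $|\nabla_X \psi| \le |X|\,|\nabla\psi|$:

```latex
% Pointwise lower bound for the integrand on the left-hand side of (5.5).
\begin{align*}
|\nabla\psi|^2 + \tfrac14(R + 2\,\mathrm{div}X)|\psi|^2 + \tfrac12 X(|\psi|^2)
  &\ge |\nabla\psi|^2 + \tfrac12 |X|^2|\psi|^2 + \tfrac12 X(|\psi|^2)
     && \text{by (5.1)} \\
  &= \tfrac12|\nabla\psi|^2
     + \tfrac12\left( |\nabla\psi|^2 + |X|^2|\psi|^2 + X(|\psi|^2) \right) \\
  &\ge \tfrac12|\nabla\psi|^2
     + \tfrac12\left( |\nabla\psi| - |X|\,|\psi| \right)^2
   \ \ge\ \tfrac12|\nabla\psi|^2 .
\end{align*}
% The last line uses X(|\psi|^2) >= -2 |\nabla_X \psi| |\psi| >= -2 |X| |\nabla\psi| |\psi|.
```

Integrating over $\tilde M_r$ gives the claim, and for compactly supported $\psi$ the boundary term over $\Sigma_r$ in (5.5) vanishes, which yields the coercive estimate.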

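Theorem 5.2. Given an embedding $i : \Sigma \to N$ into a space-time that satisfies the dominant energy condition, suppose $\tau$ is admissible. Then we have
\[ \int_\Sigma h(\Sigma, i_0, \tau, \breve e_3)\, dv_\Sigma \ge \int_\Sigma h(\Sigma, i, \tau, e_3')\, dv_\Sigma, \]
where $i_0 : \Sigma \to \mathbb{R}^{3,1}$ is the isometric embedding into the Minkowski space given by Theorem B.

Proof. Since $\tau$ is admissible, by (2) and (3) of Definition 5.1, $\Sigma$ bounds a space-like hypersurface $\Omega$ such that Jang’s equation over $\Omega$ with boundary value $\tau$ on $\Sigma$ is solvable and the generalized mean curvature $h(\Sigma, i, \tau, e_3')$ is positive. It follows from Theorem 4.1 that
\[ \tilde k - \langle \tilde\nabla_{\tilde e_4} \tilde e_4, \tilde e_3\rangle + P(\tilde e_4, \tilde e_3) > 0 \]
on $\tilde\Sigma$, the graph of $\tau$ over $\Sigma$. Take $X$ to be the vector field on $\tilde\Omega$ dual to $\langle \tilde\nabla_{\tilde e_4} \tilde e_4, \cdot\rangle - P(\tilde e_4, \cdot)$; we see that $\tilde\Omega$ satisfies the assumption of Theorem 5.1 by Remark 5.1. We can take the projection of $i_0$ onto the standard $\mathbb{R}^3$ slice determined by $t = 0$ and denote the image surface by $\hat\Sigma$. The induced metric on $\hat\Sigma$ is then isometric to the metric on the boundary $\tilde\Sigma$ of $\tilde\Omega$. Therefore, by Theorem 5.1 we have
\[ \int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} \ge \int_{\tilde\Sigma} \left( \tilde k - \langle X, \tilde e_3\rangle \right) dv_{\tilde\Sigma}. \tag{5.6} \]
The theorem follows from Eq. (3.4) and Eq. (4.6).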

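We recall the statement of Theorem A and give the proof:

Theorem A. Let $N$ be a space-time that satisfies the dominant energy condition. Suppose $i : \Sigma \to N$ is a closed embedded space-like two-surface in $N$ with space-like mean curvature vector $H$. Let $i_0 : \Sigma \to \mathbb{R}^{3,1}$ be an isometric embedding into the Minkowski space and let $\tau$ denote the restriction of the time function $t$ on $i_0(\Sigma)$. Let $\bar e_4$ be the future-directed time-like unit normal along $i(\Sigma)$ such that
\[ \langle H, \bar e_4\rangle = \frac{-\Delta\tau}{\sqrt{1+|\nabla\tau|^2}} \]
and $\bar e_3$ be the space-like unit normal along $\Sigma$ with $\langle \bar e_3, \bar e_4\rangle = 0$ and $\langle H, \bar e_3\rangle < 0$. Let $\hat\Sigma$ be the projection of $i_0(\Sigma)$ onto $\mathbb{R}^3 = \{t = 0\} \subset \mathbb{R}^{3,1}$ and $\hat k$ be the mean curvature of $\hat\Sigma$ in $\mathbb{R}^3$. If $\tau$ is admissible (see Definition 5.1), then
\[ \int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} - \int_\Sigma \left[ -\sqrt{1+|\nabla\tau|^2}\, \langle H, \bar e_3\rangle - \alpha_{\bar e_3}(\nabla\tau) \right] dv_\Sigma \]
is non-negative.

Proof. Because $\tau$ is admissible and $i_0$ is the unique isometric embedding into $\mathbb{R}^{3,1}$ associated with $\tau$, by Eq. (5.6) and Eq. (4.6), we have
\[ \int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} \ge \int_\Sigma h(\Sigma, i, \tau, e_3')\, dv_\Sigma. \tag{5.7} \]
By Proposition 2.1, $\int_\Sigma h(\Sigma, i, \tau, e_3')\, dv_\Sigma \ge \mathfrak{H}(\Sigma, i, \tau)$. Theorem A now follows from Definition 2.2.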

Rewriting the integrals, we obtain:

Corollary 5.1. Given an embedding $i : \Sigma \to N$ into a space-time that satisfies the dominant energy condition, suppose the mean curvature vector of $\Sigma$ in $N$ is space-like and $\tau$ is admissible. Then $H(\Sigma, i_0, \tau) \ge H(\Sigma, i, \tau)$, where $i_0 : \Sigma \to \mathbb{R}^{3,1}$ is the isometric embedding into the Minkowski space given by Theorem B.

Proof. We have from Eq. (3.4) and Eq. (3.7),

$$\int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} = \int_{\Sigma} h(\Sigma, i_0, \tau, \breve e_3)\, dv_{\Sigma} = H(\Sigma, i_0, \tau)$$

and

$$\int_{\Sigma} h(\Sigma, i, \tau, e_3)\, dv_{\Sigma} \ \ge\ H(\Sigma, i, \tau)$$

from Proposition 2.1.

Corollary 5.2. Under the assumption of Theorem 5.1, if the set of admissible $\tau$ is non-empty, then the quasi-local mass is non-negative. It is zero if the embedding $i : \Sigma \to N$ is isometric to $\mathbb{R}^{3,1}$ along $\Sigma$.

Proof. The first part follows from the previous corollary. If $i : \Sigma \to N$ is isometric to $\mathbb{R}^{3,1}$ along $\Sigma$, we can take the isometric embedding $i_0 : \Sigma \to \mathbb{R}^{3,1}$; the restriction of the time function $\tau$ will be admissible and all the inequalities become equalities by the uniqueness of $\bar e_4$.

By the boundary gradient estimate of Jang's equation, a constant function is admissible if $\Sigma$ has positive Gauss curvature and the mean curvature vector of $\Sigma$ in $N$ is space-like.



Corollary 5.3. Under the assumption of Theorem 5.1, and supposing $\Sigma$ has positive Gauss curvature, the quasi-local mass is non-negative. It is zero if the embedding $i : \Sigma \to N$ is isometric to $\mathbb{R}^{3,1}$ along $\Sigma$.

Suppose the minimum is achieved at some $\tau$; we can then consider the isometric embedding determined by $\tau$ and define a quasi-local energy momentum vector. This is particularly useful when we have a family of surfaces $\Sigma_s \to N$; we find the optimal isometric embedding into $\mathbb{R}^{3,1}$ and apply the procedure to get a family of future-directed time-like vectors in $\mathbb{R}^{3,1}$.

6. The Equation of the Optimal Isometric Embedding

6.1. Variation of total mean curvature. Let $\Sigma$ be an orientable closed embedded hypersurface in $\mathbb{R}^{n+1}$. Denote the outward normal by $\nu$ and the mean curvature with respect to $\nu$ by $H$. We study how the total mean curvature $\int_{\Sigma} H\, dv_{\Sigma}$ changes with respect to the induced metric $\sigma_{ij}$. We fix a local coordinate system $u^i$ on $\Sigma$. The variational field $\delta X = Y$ can be decomposed into the tangential and normal part

$$Y = a^k \frac{\partial X}{\partial u^k} + b\nu.$$

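We compute the variation of the induced metric $\sigma_{ij} = \big\langle \frac{\partial X}{\partial u^i}, \frac{\partial X}{\partial u^j} \big\rangle$,

$$\begin{aligned}
\delta\sigma_{ij} &= \Big\langle \frac{\partial}{\partial u^i}\Big(a^k \frac{\partial X}{\partial u^k} + b\nu\Big), \frac{\partial X}{\partial u^j} \Big\rangle + \Big\langle \frac{\partial X}{\partial u^i}, \frac{\partial}{\partial u^j}\Big(a^k \frac{\partial X}{\partial u^k} + b\nu\Big) \Big\rangle \\
&= \frac{\partial a^k}{\partial u^i}\sigma_{kj} + a^k \Upsilon_{ikj} + b h_{ij} + \frac{\partial a^k}{\partial u^j}\sigma_{ik} + a^k \Upsilon_{jki} + b h_{ij},
\end{aligned} \tag{6.1}$$

where $\Upsilon_{ikj} = \big\langle \frac{\partial^2 X}{\partial u^i \partial u^k}, \frac{\partial X}{\partial u^j} \big\rangle$ is the Christoffel symbol of $\sigma_{ij}$ and $h_{ij} = \big\langle \frac{\partial \nu}{\partial u^i}, \frac{\partial X}{\partial u^j} \big\rangle$ is the second fundamental form. Denote by $\nabla_i$ the covariant derivative with respect to $\frac{\partial}{\partial u^i}$. We solve for

$$b h_{ij} = \frac{1}{2}\big(\delta\sigma_{ij} - \nabla_i a^k\, \sigma_{kj} - \nabla_j a^k\, \sigma_{ik}\big). \tag{6.2}$$

Next we compute the variation of the mean curvature $H = -\sigma^{ij} \big\langle \frac{\partial^2 X}{\partial u^i \partial u^j}, \nu \big\rangle$,

$$\delta H = h_{ij}\, \delta\sigma^{ij} - \sigma^{ij} \Big\langle \frac{\partial^2 Y}{\partial u^i \partial u^j}, \nu \Big\rangle - \sigma^{ij} \Big\langle \frac{\partial^2 X}{\partial u^i \partial u^j}, \delta\nu \Big\rangle.$$

We derive $\delta\sigma^{ij} = -\sigma^{ik}\, \delta\sigma_{kl}\, \sigma^{lj}$, and

$$\delta\nu = \Big(a^k h_{lk} - \frac{\partial b}{\partial u^l}\Big) \sigma^{lj} \frac{\partial X}{\partial u^j}. \tag{6.3}$$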


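On the other hand,

$$\begin{aligned}
\Big\langle \frac{\partial^2 Y}{\partial u^i \partial u^j}, \nu \Big\rangle &= \Big\langle \frac{\partial^2}{\partial u^i \partial u^j}\Big(a^k \frac{\partial X}{\partial u^k} + b\nu\Big), \nu \Big\rangle \\
&= \Big\langle \frac{\partial}{\partial u^i}\Big(\frac{\partial a^k}{\partial u^j}\frac{\partial X}{\partial u^k} + a^k \frac{\partial^2 X}{\partial u^j \partial u^k} + \frac{\partial b}{\partial u^j}\nu + b\frac{\partial \nu}{\partial u^j}\Big), \nu \Big\rangle.
\end{aligned}$$

Substitute in

$$\frac{\partial^2 X}{\partial u^j \partial u^k} = \Upsilon^l_{jk} \frac{\partial X}{\partial u^l} - h_{jk}\nu \quad\text{and}\quad \frac{\partial \nu}{\partial u^j} = h_{jl}\sigma^{lk}\frac{\partial X}{\partial u^k},$$

and we obtain

$$\begin{aligned}
\Big\langle \frac{\partial^2 Y}{\partial u^i \partial u^j}, \nu \Big\rangle &= \Big\langle \frac{\partial}{\partial u^i}\Big( (\nabla_j a^k + b h_{jl}\sigma^{lk}) \frac{\partial X}{\partial u^k} + \Big(\frac{\partial b}{\partial u^j} - a^k h_{jk}\Big)\nu \Big), \nu \Big\rangle \\
&= -h_{ik}(\nabla_j a^k + b h_{jl}\sigma^{lk}) + \frac{\partial}{\partial u^i}\Big(\frac{\partial b}{\partial u^j} - a^l h_{jl}\Big).
\end{aligned}$$

Plug these into (6.3) and we arrive at

$$\delta H = -\sigma^{ik}\sigma^{lj} h_{ij}\, \delta\sigma_{kl} - \Delta b + \sigma^{ij} h_{ik} \nabla_j a^k + b\, \sigma^{ij}\sigma^{lk} h_{ik} h_{jl} + \sigma^{ij}\nabla_i(a^k h_{jk}).$$

We plug (6.2) into this equation and obtain

$$\delta H = -\frac{1}{2}\sigma^{ik}\sigma^{lj} h_{ij}\, \delta\sigma_{kl} - \Delta b + \sigma^{ij}\nabla_i(a^k h_{jk}).$$

Proposition 6.1. Let $\Sigma$ be a closed embedded hypersurface in $\mathbb{R}^{n+1}$. The variation of the total mean curvature with respect to a metric deformation is

$$\delta \int_{\Sigma} H\, dv_{\Sigma} = \frac{1}{2}\int_{\Sigma} \big(H\sigma^{ij} - \sigma^{ik}\sigma^{jl} h_{kl}\big)\, \delta\sigma_{ij}\, dv_{\Sigma}.$$

Corollary 6.1. If $X_s : \Sigma \to \mathbb{R}^{n+1}$ is a smooth family of isometric embeddings of a compact $n$-manifold $\Sigma$, then the total mean curvature is a constant.

6.2. The variational equation. In this section, $\sigma_{ab}$ denotes the metric on a two-surface $\Sigma$ which satisfies the assumption in Theorem A. Recall the metric on the projection $\hat\Sigma$ is $\hat\sigma_{ab} = \sigma_{ab} + \tau_a\tau_b$. The metric $\sigma_{ab}$ is fixed for the isometric embedding and thus $\delta\hat\sigma_{ab} = \delta(\tau_a\tau_b)$. The quasi-local mass expression we try to minimize is

$$\int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} - \int_{\Sigma} \Big[ \sqrt{1 + |\nabla\tau|^2}\, \cosh\theta\, |H| - \nabla\tau\cdot\nabla\theta - V\cdot\nabla\tau \Big]\, dv_{\Sigma},$$

where $\sinh\theta = \frac{-\Delta\tau}{|H|\sqrt{1 + |\nabla\tau|^2}}$, and $V$ is the tangent vector on $\Sigma$ that is dual to the connection one-form $\alpha_{\hat e_3}$ determined by $\hat e_3 = -\frac{H}{|H|}$. This is an expression that is determined by $\sigma_{ab}$ and the mean curvature vector $H$.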



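We can take $X = (x(u^a), y(u^a), z(u^a), \tau(u^a)) : \Sigma \to \mathbb{R}^{3,1}$ and $\hat X = (x(u^a), y(u^a), z(u^a)) : \hat\Sigma \to \mathbb{R}^3$. Notice that $dv_{\hat\Sigma} = \sqrt{\det \hat\sigma_{ab}}\, du^1 du^2$ and $dv_{\Sigma} = \sqrt{\det \sigma_{ab}}\, du^1 du^2$. Recall from the last section that the variation of $\int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma}$ is

$$\delta \int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} = \int_{\hat\Sigma} \big(\hat H \hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\, \tau_a (\delta\tau)_b\, dv_{\hat\Sigma}.$$

Integrate by parts and recall that the tensor $\hat H\hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}$ is divergence free on $\hat\Sigma$; we obtain

$$\delta \int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} = -\int_{\hat\Sigma} \big(\hat H \hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\, \hat\nabla_b \hat\nabla_a \tau\, \delta\tau\, dv_{\hat\Sigma}. \tag{6.4}$$

The relation between the Hessians of $\tau$ on $\Sigma$ and $\hat\Sigma$ is given by

$$\hat\nabla_b \hat\nabla_a \tau = \frac{1}{1 + |\nabla\tau|^2}\, \nabla_b\nabla_a\tau.$$

Since we also have

$$\frac{\det\hat\sigma}{\det\sigma} = 1 + |\nabla\tau|^2,$$

it follows that

$$\delta \int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} = -\int_{\Sigma} \big(\hat H \hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\, \frac{\nabla_b\nabla_a\tau}{\sqrt{1 + |\nabla\tau|^2}}\, \delta\tau\, dv_{\Sigma}.$$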
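The determinant identity $\det\hat\sigma / \det\sigma = 1 + |\nabla\tau|^2$ used above can be sanity-checked numerically. The following Python sketch does this for one arbitrary 2×2 metric and one arbitrary covector $(\tau_1, \tau_2)$, with $|\nabla\tau|^2$ computed as $\sigma^{ab}\tau_a\tau_b$. The function names and sample numbers are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def grad_norm_squared(sigma, dtau):
    """|grad tau|^2 = sigma^{ab} tau_a tau_b for a symmetric 2x2 metric sigma."""
    d = det2(sigma)
    inverse = [[sigma[1][1] / d, -sigma[0][1] / d],
               [-sigma[1][0] / d, sigma[0][0] / d]]
    return sum(inverse[a][b] * dtau[a] * dtau[b]
               for a in range(2) for b in range(2))


def projected_metric(sigma, dtau):
    """Metric of the projection: sigma_hat_ab = sigma_ab + tau_a tau_b."""
    return [[sigma[a][b] + dtau[a] * dtau[b] for b in range(2)]
            for a in range(2)]


# Arbitrary sample data (hypothetical values, exact rational arithmetic).
sigma = [[Fraction(2), Fraction(1, 2)], [Fraction(1, 2), Fraction(1)]]
dtau = [Fraction(3, 10), Fraction(-7, 10)]
ratio = det2(projected_metric(sigma, dtau)) / det2(sigma)
```

With exact fractions, `ratio` agrees with `1 + grad_norm_squared(sigma, dtau)`, which is the rank-one determinant identity behind the change of area element from $\hat\Sigma$ to $\Sigma$.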

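Now

$$\begin{aligned}
\delta \int_{\Sigma} \Big[ \sqrt{1 + |\nabla\tau|^2}\, \cosh\theta\, |H| - \nabla\tau\cdot\nabla\theta \Big]\, dv_{\Sigma}
&= \int_{\Sigma} \Big[ (1 + |\nabla\tau|^2)^{-1/2}\, \nabla\tau\cdot\nabla\delta\tau\, \cosh\theta\, |H| + \sqrt{1 + |\nabla\tau|^2}\, \sinh\theta\, \delta\theta\, |H| \Big]\, dv_{\Sigma} \\
&\quad - \int_{\Sigma} \big( \nabla\delta\tau\cdot\nabla\theta + \nabla\tau\cdot\nabla\delta\theta \big)\, dv_{\Sigma}.
\end{aligned}$$

Substitute $\sinh\theta = \frac{-\Delta\tau}{|H|\sqrt{1 + |\nabla\tau|^2}}$ and integrate by parts; we obtain

$$\delta \int_{\Sigma} \Big[ \sqrt{1 + |\nabla\tau|^2}\, \cosh\theta\, |H| - \nabla\tau\cdot\nabla\theta \Big]\, dv_{\Sigma} = \int_{\Sigma} \Big[ \frac{\nabla\tau}{\sqrt{1 + |\nabla\tau|^2}}\, \cosh\theta\, |H| - \nabla\theta \Big] \cdot \nabla\delta\tau\, dv_{\Sigma}.$$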

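Proposition 6.2. The variation of

$$\int_{\hat\Sigma} \hat k\, dv_{\hat\Sigma} - \int_{\Sigma} \Big[ \sqrt{1 + |\nabla\tau|^2}\, \cosh\theta\, |H| - \nabla\tau\cdot\nabla\theta - V\cdot\nabla\tau \Big]\, dv_{\Sigma}$$

with respect to $\tau$ is

$$-\int_{\Sigma} \big(\hat H \hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\, \frac{\nabla_b\nabla_a\tau}{\sqrt{1 + |\nabla\tau|^2}}\, \delta\tau\, dv_{\Sigma} + \int_{\Sigma} \operatorname{div}_{\Sigma}\Big( \frac{\nabla\tau}{\sqrt{1 + |\nabla\tau|^2}}\, \cosh\theta\, |H| - \nabla\theta - V \Big) \cdot \delta\tau\, dv_{\Sigma}. \tag{6.5}$$

Therefore, the equation for the minimizing isometric embedding is

$$-\big(\hat H \hat\sigma^{ab} - \hat\sigma^{ac}\hat\sigma^{bd}\hat h_{cd}\big)\, \frac{\nabla_b\nabla_a\tau}{\sqrt{1 + |\nabla\tau|^2}} + \operatorname{div}_{\Sigma}\Big( \frac{\nabla\tau}{\sqrt{1 + |\nabla\tau|^2}}\, \cosh\theta\, |H| - \nabla\theta - V \Big) = 0 \tag{6.6}$$

with $\sinh\theta = \frac{-\Delta\tau}{|H|\sqrt{1 + |\nabla\tau|^2}}$.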

Acknowledgements. We wish to thank Richard Hamilton for helpful discussions on isometric embeddings and Melissa Liu for her interest and reading of an earlier version of this article. The first author would like to thank Naqing Xie for pointing out several typos in an earlier version.

Communicated by G. W. Gibbons




Reconstruction of Random Colourings

Allan Sly

Statistics Department, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]

Received: 25 May 2008 / Accepted: 18 December 2008
Published online: 20 March 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: Reconstruction problems have been studied in a number of contexts including biology, information theory and statistical physics. We consider the reconstruction problem for random k-colourings on the ∆-ary tree for large k. Bhatnagar et al. [2] showed non-reconstruction when $\Delta \le \frac{1}{2} k \log k - o(k \log k)$. We tighten this result and show non-reconstruction when $\Delta \le k[\log k + \log\log k + 1 - \log 2 - o(1)]$, which is very close to the best known bound establishing reconstruction, which is $\Delta \ge k[\log k + \log\log k + 1 + o(1)]$.

1. Introduction

Determining the reconstruction threshold of a Markov random field has been of interest in a number of areas including biology, information theory and statistical physics. Reconstruction thresholds on trees are believed to determine the dynamical phase transitions in many constraint satisfaction problems including random K-SAT and random colourings on random graphs. It is thought that at this point the space of solutions splits into exponentially many clusters. The properties of the space of solutions of these problems are of interest to physicists, probabilists and theoretical computer scientists. It is known [18,20,21] that reconstruction holds when the number of colours satisfies $k[\log k + \log\log k + 1 + o(1)] \le \Delta$. This bound is given by the analysis of a naive reconstruction algorithm which reconstructs the root only when it is known with absolute certainty given the leaves. The problem of finding good bounds for when non-reconstruction holds is more difficult: it requires showing that the spins on the root and the leaves are asymptotically independent. The best previous rigorous result was that $\Delta \le \frac{1}{2} k \log k - o(k \log k)$ implies non-reconstruction [2]. We improve this to $\Delta \le k[\log k + \log\log k + 1 - \log 2 - o(1)]$. Even at a heuristic level no non-reconstruction bound as good as ours was known.

Supported by NSF grants DMS-0528488 and DMS-0548249.


1.1. Definitions. We begin by giving a general description of broadcast models on trees and the reconstruction problem. The broadcast model on a tree T is a model in which information is sent from the root ρ across the edges, which act as noisy channels, to the leaves of T. For some given finite set of characters $C$ a configuration on T is an element of $C^T$, that is, an assignment of a character in $C$ to each vertex. The broadcast model is a probability distribution on configurations defined as follows. Some $|C| \times |C|$ probability transition matrix M is chosen as the noisy channel on each edge. The spin $\sigma_\rho$ is chosen from $C$ according to some initial distribution and is then propagated along the edges of the tree according to the transition matrix M. That is, if vertex u is the parent of v in the tree then the spin at v is defined according to the probabilities $P(\sigma_v = j \mid \sigma_u = i) = M_{i,j}$. We will focus on the colouring model with $|C| = k$, which is given by the transition matrix $M_{i,j} = \frac{1_{i \ne j}}{k-1}$.

Broadcast models, and in particular colourings, can also be considered as Gibbs measures on trees. Given a finite set of k colours and a graph T = (V, E), a k-colouring is an assignment of a colour to each vertex so that adjacent vertices have different colours. The random k-colouring model is then the uniform probability distribution on valid k-colourings of the graph. It is a Gibbs measure or Markov random field on the space of configurations $\sigma \in \{1, \ldots, k\}^V$ given by

$$P(\sigma) = \frac{1}{Z} \prod_{(u,v) \in E} 1_{\sigma_u \ne \sigma_v},$$

where Z is a normalizing constant given by the number of colourings of T. On an infinite tree more than one Gibbs measure may exist; the broadcast colouring model corresponds to the free Gibbs measure. We will restrict our attention to ∆-ary trees, that is the infinite rooted tree where every vertex has ∆ offspring. Let L(n) denote the spins at distance n from the root.

Definition 1. We say that a model is reconstructible on a tree T if

$$\limsup_{n} \sum_{i,L} \big| P(\sigma_\rho = i, L(n) = L) - P(\sigma_\rho = i) P(L(n) = L) \big| > 0,$$

where the sum is over all $i \in C$ and all configurations L on the vertices at distance n from the root. When the limsup is 0 we will say the model has non-reconstruction on T.

Non-reconstruction is equivalent to the mutual information between $\sigma_\rho = L(0)$ and $L(n)$ going to 0 as n goes to infinity, and also to $\{L(n)\}_{n=1}^{\infty}$ having a trivial tail sigma-field. More equivalent formulations are given in [17] Prop. 2.1. As increasing ∆ only increases the information on the root, we can define $\Delta^*(k)$ to be the reconstruction threshold, that is the smallest ∆ such that reconstruction holds on the ∆-ary tree. In contrast to reconstruction consider the uniqueness property of a model.

Definition 2. We say that a model has uniqueness on a tree T if

$$\limsup_{n} \sup_{L, L'} \sum_{i \in C} \big| P(\sigma_\rho = i \mid L(n) = L) - P(\sigma_\rho = i \mid L(n) = L') \big| = 0,$$

where the supremum is over all configurations $L, L'$ on the vertices at distance n from the root.
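The broadcast description of the colouring model can be illustrated with a short simulation. The Python sketch below samples the colours level by level on a ∆-ary tree using the channel $M_{i,j} = 1_{i \ne j}/(k-1)$ and a uniform root colour. The function names, the level-by-level list layout and the choice of a uniform root are assumptions made for illustration only.

```python
import random


def sample_child_colour(parent, k, rng):
    """Draw a colour uniformly from {1, ..., k} minus {parent}."""
    c = rng.randrange(1, k)          # uniform on 1 .. k-1
    return c if c < parent else c + 1  # skip over the parent's colour


def broadcast_levels(k, delta, depth, rng):
    """Return levels[n] = colours at distance n from the root.

    Children of the vertex at index m of level n occupy indices
    m*delta .. m*delta + delta - 1 of level n + 1.
    """
    levels = [[rng.randrange(1, k + 1)]]  # uniform root colour (assumed)
    for _ in range(depth):
        previous = levels[-1]
        levels.append([sample_child_colour(p, k, rng)
                       for p in previous for _ in range(delta)])
    return levels


levels = broadcast_levels(k=5, delta=3, depth=4, rng=random.Random(0))
```

Here `levels[-1]` plays the role of L(n); the reconstruction question asks how much it reveals about `levels[0][0]` as the depth grows.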


Reconstruction implies non-uniqueness and is a strictly stronger condition. Essentially non-uniqueness says that there is some configuration on the leaves which provides information on the root, while reconstruction says that a typical configuration on the leaves provides information on the root.

1.2. Background. For some parameterized collection of models the key question in studying reconstruction is finding which models have reconstruction, which typically involves finding a threshold. This problem naturally arises in biology, information theory and statistical physics and involves the trade-off between increasing numbers of leaves and increasingly noisy information as the distance from the root to the leaves increases. The simplest collection of models is the binary symmetric channel, which is defined on two characters with

$$M = \begin{pmatrix} 1 - \epsilon & \epsilon \\ \epsilon & 1 - \epsilon \end{pmatrix}$$

for $0 < \epsilon < \frac{1}{2}$, which corresponds to the ferromagnetic Ising model on the tree with no external field. It was shown in [3,7] that this channel has reconstruction if and only if $\Delta(1 - 2\epsilon)^2 > 1$. The broadcast model is a natural model for the evolution of characters of DNA. In phylogenetic reconstruction the goal is to reconstruct the ancestral tree of a collection of species given their genetic data. Daskalakis, Mossel and Roch [5,16] proved the conjecture of Mike Steel that the number of samples required for phylogenetic reconstruction undergoes a phase transition at the reconstruction threshold for the binary symmetric channel. Exact reconstruction thresholds have only been calculated in the binary symmetric model and binary asymmetric models with sufficiently small asymmetry [4]. In both these cases the threshold corresponds to the Kesten-Stigum bound [10]. The Kesten-Stigum bound shows that reconstruction holds whenever $\Delta\lambda_2(M)^2 > 1$, where $\lambda_2(M)$ denotes the second largest eigenvalue of M.

In fact when $\Delta\lambda_2(M)^2 > 1$, it is possible to asymptotically reconstruct the root from just knowing the number of times each character appears on the leaves (census reconstruction) without using the information on their positions on the leaves. Mossel [15,17] showed that the Kesten-Stigum bound is not the bound for reconstruction in the binary-asymmetric model with sufficiently large asymmetry or in the Potts model with sufficiently many characters. It was shown in [9] that k-colourings have uniqueness on ∆-ary trees if and only if $k \ge \Delta + 2$, which therefore also establishes non-reconstruction in this regime. Exactly finding the threshold for reconstruction is difficult, so most attention has been focused on finding its asymptotics as the number of colours and the degree go to infinity. Recently [2] greatly improved this bound, showing that $\Delta^*(k) \ge (\frac{1}{2} + o(1)) k \log k$. On the other hand [18] showed that when $\Delta \ge (1 + o(1)) k \log k$ then with high probability in k the spin of the root is exactly determined by the leaves and so reconstruction is possible. With a more detailed analysis this argument can be improved to show reconstruction when $k[\log k + \log\log k + 1 + o(1)] \le \Delta$, as was shown in [20,21]. This is a large improvement on the Kesten-Stigum bound, which implies reconstruction when $\Delta > (k - 1)^2$. In related work Mezard and Montanari [14] found a variational principle which establishes bounds on reconstruction for colourings but which is asymptotically weaker than Lemma 7. Our results establish extremely tight bounds on $\Delta^*(k)$, with the upper and lower bounds differing by just $(\log 2 + o(1))k$ rather than $\frac{1}{2} k \log k$ previously.
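To get a feel for how these bounds compare, the following Python sketch evaluates their leading terms for a given k, with every o(1) and o(k log k) correction simply dropped, so the numbers are indicative only. It also uses the standard fact that the colouring channel $M = (J - I)/(k-1)$ has non-trivial eigenvalue $-1/(k-1)$, which gives the Kesten-Stigum condition $\Delta > (k-1)^2$ quoted above. All function names are hypothetical.

```python
import math


def kesten_stigum_degree(k):
    """Degree above which Delta * lambda_2(M)^2 > 1, using lambda_2 = -1/(k-1)."""
    return (k - 1) ** 2


def reconstruction_leading_term(k):
    """k[log k + log log k + 1], the o(1) term dropped (upper bound on the threshold)."""
    return k * (math.log(k) + math.log(math.log(k)) + 1)


def non_reconstruction_leading_term(k):
    """k[log k + log log k + 1 - log 2], the o(1) term dropped (lower bound)."""
    return k * (math.log(k) + math.log(math.log(k)) + 1 - math.log(2))


def previous_lower_leading_term(k):
    """(1/2) k log k, the earlier non-reconstruction bound, lower-order terms dropped."""
    return 0.5 * k * math.log(k)


k = 100
bounds = (previous_lower_leading_term(k),
          non_reconstruction_leading_term(k),
          reconstruction_leading_term(k),
          kesten_stigum_degree(k))
```

For k = 100 the four values are increasing in the order listed, and the gap between the middle two is exactly k log 2 by construction.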


Theorem 1. The k-colouring model has reconstruction threshold $\Delta^*(k)$ satisfying

$$\Delta^*(k) \le k[\log k + \log\log k + 1 + o(1)]$$

and

$$\Delta^*(k) \ge k[\log k + \log\log k + 1 - \log 2 - o(1)].$$

1.3. Applications to Statistical Physics. The reconstruction threshold on trees is believed to play a critical role in the dynamical phase transitions in certain glassy systems given by random constraint satisfaction problems. Important examples include random K-SAT and random colourings on random graphs. We will briefly describe what is conjectured by physicists about such systems [11,21], generally without rigorous proof, and why understanding the reconstruction threshold for colourings plays an important role in such systems. The Erdős-Rényi random graph G(n, p) is a random graph on n vertices where every pair of vertices is connected with probability p. To maintain constant average degree ∆ we let p = ∆/n. The k-colouring model on G(n, ∆/n) or random ∆-regular graphs undergoes several phase transitions as ∆ grows. If we consider the space of solutions to the random colouring model where two colourings are adjacent if they differ on at most o(n) vertices, then for the smallest values of ∆ the space of solutions forms a large connected component. Above the clustering transition $\Delta_d$ the space of solutions breaks into exponentially many disconnected clusters and has no giant component with a constant fraction of the probability. This replica symmetry breaking transition is believed [11,12] to occur at $\Delta_d = k[\log k + \log\log k + \alpha + o(1)]$. In a recent remarkable result, [1] rigorously proved that when $(1 + o(1)) k \log k \le \Delta \le (2 - o(1)) k \log k$, the space of solutions indeed breaks into exponentially many small clusters. A second transition occurs when most clusters have frozen spins, that is vertices which have the same colour in every colouring in the cluster.

This phase transition is believed to occur at $\Delta_r = k[\log k + \log\log k + 1 + o(1)]$ [20,21] and is the best upper bound known for $\Delta_d$. Two more transitions are believed to occur: condensation, where the size of the clusters is given by a Poisson-Dirichlet process, and the colouring threshold beyond which no more colourings are possible. These transitions are conjectured to occur at $\Delta_c = 2k \log k - \log k - 2\log 2 + o(1)$ and $\Delta_s = 2k \log k - \log k - 1 + o(1)$ respectively [21]. Similar results are also expected to hold for K-SAT and other random constraint satisfiability problems [11]. Both random regular and Erdős-Rényi random graphs are locally tree-like. Asymptotically in a random regular graph the neighbourhood of a random vertex is a regular tree, and for Erdős-Rényi random graphs it is a Galton-Watson branching process tree with Poisson offspring distribution. It is conjectured [11] that the reconstruction threshold on the corresponding tree is exactly the clustering threshold $\Delta_d$ on the random graph. As such, rigorous estimates of the reconstruction problem can be seen as part of a larger program of understanding glassy phases in constraint satisfaction problems. The clustering threshold is also believed to play an important role in the efficiency of MCMC algorithms for finding and sampling from colourings of the graphs. MCMC algorithms are believed to be efficient up to the clustering threshold but experience an exponential slowdown beyond it [11]. This is to be expected since a local MCMC algorithm cannot move between clusters each of which has exponentially small probability. Rigorous proofs of rapid mixing of MCMC algorithms, such as the Glauber dynamics, fall a long way behind. For random regular graphs, results of [6] imply rapid mixing when



k ≥ 1.49∆, well below the reconstruction threshold and even the uniqueness threshold. Even less is known for Erdős-Rényi random graphs, as almost all MCMC results are given in terms of the maximum degree, which in this case grows with n. Polynomial time mixing of the Glauber dynamics has been shown [19] for a constant number of colours in terms of ∆.

1.4. Open Problem. If the probability that the leaves uniquely determine the spin at the root does not go to 0 as n goes to infinity then the model has reconstruction. It is natural to ask whether this is a necessary condition for reconstruction. When k = 5 and ∆ = 14 it was shown in [14] using a variational principle that reconstruction holds but the probability that the leaves fix the root goes to 0. However, this is the only case in which the variational principle gives an upper bound on the number of colours required for reconstruction which is better than the bound of the leaves fixing the root. It remains open to determine whether, for large numbers of colours and high degree, this is exactly the reconstruction threshold. Numerical results of [21] suggest this is in fact not the case and there are two separate thresholds. Answering this question would be of significant interest.

2. Proofs

We introduce the notation we use in the proofs. We denote the colours by $C = \{1, \ldots, k\}$ and let T be the ∆-ary tree rooted at ρ. Let $u_1, \ldots, u_\Delta$ be the children of ρ and let $T_j$ denote the subtree of descendants of $u_j$. Let $P(\sigma)$ denote the free measure on colourings on the ∆-ary tree. Let $L(n)$ denote the spins at distance n from ρ and let $L_j(n)$ denote the spins on level n in the subtree $T_j$. We let $E^i$ and $P^i$ denote the expectation and probability with respect to the measure conditioned to have i at the root. For a random variable U, a function of σ, we will let $\mathcal{L}(U)$ denote the law of U and $\mathcal{L}^i(U)$ denote its conditional law with respect to the measure conditioned to have i at the root.

For a configuration L on the spins at distance n from ρ define the deterministic function $f_n$ as

$$f_n(i, L) = P(\sigma_\rho = i \mid L(n) = L).$$

By the recursive nature of the tree we also have that

$$f_n(i, L) = P(\sigma_{u_j} = i \mid L_j(n) = L).$$

Now define $X_i(n) = X_i$ by

$$X_i(n) = f_n(i, L(n)).$$

These random variables are a deterministic function of the random configuration $L(n)$ of the leaves which gives the marginal probability that the root is in state i. By symmetry the $X_i$ are exchangeable. Now we define two distributions

$$X^+ = X^+(n) = \mathcal{L}^1\big(f_n(1, L(n))\big), \quad\text{and}\quad X^- = X^-(n) = \mathcal{L}^2\big(f_n(1, L(n))\big).$$


We will establish non-reconstruction by showing that the distributions $X^+$ and $X^-$ both converge to $\frac{1}{k}$ as n goes to infinity. By symmetry we have

$$\mathcal{L}^{i_1}\big(f_n(i_2, L(n))\big) \stackrel{d}{=} \begin{cases} X^+ & i_1 = i_2, \\ X^- & \text{otherwise}, \end{cases}$$

and the set $\{f_n(i, L(n)) : 2 \le i \le k\}$ is conditionally exchangeable when conditioned on the event $\sigma_\rho = 1$. Moreover, they are conditionally exchangeable given $\sigma_\rho = 1$ and the value of $f_n(1, L(n))$. Now define

$$Y_{ij} = Y_{ij}(n) = f_n(i, L_j(n)).$$

This is equal to the probability that $\sigma_{u_j} = i$, given the random configuration $L_j(n)$ on the spins on level n in the subtree $T_j$. The following proposition follows immediately from the symmetries of the model.

Proposition 1. The $Y_{ij}$ satisfy the following properties:
– The random vectors $Y_j = (Y_{1j}, \ldots, Y_{kj})$ are conditionally independent given $\sigma_\rho$ for $j = 1, \ldots, \Delta$.
– Conditional on $\sigma_{u_j}$ the random variable $Y_{\sigma_{u_j} j}$ is equal in distribution to $X^+(n)$, while for $i \ne \sigma_{u_j}$ the random variables $Y_{ij}$ are equal in distribution to $X^-(n)$.
– Further, for fixed j, given $\sigma_{u_j}$ and $Y_{\sigma_{u_j} j}$ the random variables $\{Y_{ij}\}_{i \ne \sigma_{u_j}}$ are conditionally exchangeable over $i \ne \sigma_{u_j}$.

We make use of these symmetries to simplify the analysis. Given the standard Gibbs measure recursions on trees we have that

$$f_{n+1}(1, L(n+1)) = \frac{\prod_{j=1}^{\Delta} \big(1 - f_n(1, L_j(n))\big)}{\sum_{i=1}^{k} \prod_{j=1}^{\Delta} \big(1 - f_n(i, L_j(n))\big)}$$

and so

$$X_1(n+1) = \frac{Z_1}{\sum_{i=1}^{k} Z_i}, \quad\text{where}\quad Z_i = \prod_{j=1}^{\Delta} (1 - Y_{ij}).$$
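The recursion above can be turned into a direct computation of the posterior at the root from a given leaf configuration. The following Python sketch is one possible implementation under stated assumptions: colours are indexed 0, ..., k−1, leaves are listed left to right so that each child subtree owns a contiguous block, and the base case is $f_0(i, L) = 1_{L = i}$. The function names are hypothetical.

```python
from fractions import Fraction


def root_posterior(child_posteriors, k):
    """One step of the recursion: X_i = Z_i / sum_i Z_i, Z_i = prod_j (1 - Y_ij)."""
    z = []
    for i in range(k):
        product = Fraction(1)
        for y in child_posteriors:   # y[i] plays the role of Y_ij for child j
            product *= 1 - y[i]
        z.append(product)
    total = sum(z)
    if total == 0:
        raise ValueError("leaf configuration is not consistent with any colouring")
    return [zi / total for zi in z]


def posterior_from_leaves(leaves, k, delta):
    """Posterior of the root colour given the colours at the leaves."""
    if len(leaves) == 1:
        return [Fraction(int(i == leaves[0])) for i in range(k)]
    size = len(leaves) // delta
    children = [posterior_from_leaves(leaves[j * size:(j + 1) * size], k, delta)
                for j in range(delta)]
    return root_posterior(children, k)


forced = posterior_from_leaves([1, 2], k=3, delta=2)
undecided = posterior_from_leaves([1, 1], k=3, delta=2)
```

With three colours and two children coloured 1 and 2, the root is forced to colour 0; with both children coloured 1, colours 0 and 2 are equally likely.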

We let $x_n$ and $z_n$ denote $E^1 X_1(n) = E X^+(n)$ and $E^1 (X_1(n) - \frac{1}{k})^2 = E(X^+(n) - \frac{1}{k})^2$ respectively. These quantities, in particular $x_n$, play a major role in our analysis. The following lemma, which can be viewed as the analogue of Lemma 1 of [4], allows us to relate the first and second moments of $X^+$.

Lemma 1. We have that

$$x_n = E X^+ = E^1 \sum_{i=1}^{k} X_i(n)^2 = E \sum_{i=1}^{k} (X_i(n))^2,$$

and

$$x_n - \frac{1}{k} = E X^+ - \frac{1}{k} = E \sum_{i=1}^{k} \Big(X_i(n) - \frac{1}{k}\Big)^2 \ge E\Big(X^+ - \frac{1}{k}\Big)^2 = z_n.$$

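Proof. From the definition of conditional probabilities and of $f_n$ and the fact that $P(\sigma_\rho = 1) = \frac{1}{k}$ we have that

$$\begin{aligned}
E^1 f_n(1, L(n)) &= \sum_{L} f_n(1, L)\, P(L(n) = L \mid \sigma_\rho = 1) \\
&= \sum_{L} f_n(1, L)\, \frac{P(L(n) = L, \sigma_\rho = 1)}{P(\sigma_\rho = 1)} \\
&= k \sum_{L} P(L(n) = L)\, f_n(1, L)^2 \\
&= k E (X_1(n))^2 \\
&= E \sum_{i=1}^{k} (X_i(n))^2.
\end{aligned}$$

By symmetry, for any $i_1, i_2 \in C$,

$$E^{i_1} \sum_{i=1}^{k} (X_i(n))^2 = E^{i_2} \sum_{i=1}^{k} (X_i(n))^2,$$

and so

$$E \sum_{i=1}^{k} (X_i(n))^2 = \frac{1}{k} \sum_{i'=1}^{k} E^{i'} \sum_{i=1}^{k} (X_i(n))^2 = E^1 \sum_{i=1}^{k} (X_i(n))^2.$$

Finally, since $\sum_{i=1}^{k} X_i(n) = 1$, we have that

$$E \sum_{i=1}^{k} \Big(X_i(n) - \frac{1}{k}\Big)^2 = E \sum_{i=1}^{k} (X_i(n))^2 - \frac{2}{k} E \sum_{i=1}^{k} X_i(n) + \frac{1}{k} = E X^+ - \frac{1}{k},$$

which completes the proof.

Corollary 1. We have that $x_n \ge \frac{1}{k}$ and that $\lim_n x_n = \frac{1}{k}$ implies non-reconstruction.

Proof. We have that $x_n \ge z_n + \frac{1}{k} \ge \frac{1}{k}$. If $x_n$ converges to $\frac{1}{k}$ then

$$E \sum_{i=1}^{k} \Big(X_i(n) - \frac{1}{k}\Big)^2 \to 0,$$

which implies non-reconstruction.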
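The identity of Lemma 1 can be checked exactly on a very small tree by enumerating every leaf configuration. The Python sketch below does this under illustrative assumptions: colours are indexed 0, ..., k−1 (so colour 0 plays the role of colour 1 in the text), the root is uniform, and exact rational arithmetic is used. The function names are hypothetical and the enumeration is only feasible for tiny k, ∆ and depth.

```python
from fractions import Fraction
from itertools import product


def leaf_likelihood(leaves, root, k, delta):
    """P(L(n) = leaves | sigma_rho = root) for the colouring broadcast model."""
    if len(leaves) == 1:
        return Fraction(int(leaves[0] == root))
    size = len(leaves) // delta
    result = Fraction(1)
    for j in range(delta):
        block = leaves[j * size:(j + 1) * size]
        # Child colour c is uniform over the k - 1 colours different from root.
        result *= sum(leaf_likelihood(block, c, k, delta)
                      for c in range(k) if c != root) / (k - 1)
    return result


def lemma1_moments(k, delta, depth):
    """Return (E^1 X_1(n), E sum_i X_i(n)^2) with n = depth."""
    x_n = Fraction(0)
    second_moment = Fraction(0)
    for leaves in product(range(k), repeat=delta ** depth):
        likelihoods = [leaf_likelihood(leaves, i, k, delta) for i in range(k)]
        total = sum(likelihoods)
        if total == 0:
            continue                      # configuration has probability zero
        posterior = [l / total for l in likelihoods]
        x_n += likelihoods[0] * posterior[0]
        second_moment += (total / k) * sum(p * p for p in posterior)
    return x_n, second_moment


x1, s1 = lemma1_moments(k=3, delta=2, depth=1)
x2, s2 = lemma1_moments(k=3, delta=2, depth=2)
```

In both cases the two returned quantities coincide, and they stay at or above 1/k, in line with Corollary 1.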


2.1. Non-reconstruction. Our analysis is split into two phases, the first when $x_n$ is close to 1 and the second when $x_n$ is close to $\frac{1}{k}$.

Lemma 2. Suppose that $\beta < 1 - \log 2$. Then for sufficiently large k, if $\Delta < k[\log k + \log\log k + \beta]$ then

$$\limsup_{n} x_n \le \frac{2}{k}.$$

Proof. We fix the colour of the root to be 1 and let $\mathcal{F}$ denote the sigma-algebra generated by $\{\sigma_{u_j} : 1 \le j \le \Delta\}$, the colours of the neighbours of the root. For $1 \le i \le k$ let $b_i = \#\{j : \sigma_{u_j} = i\}$, the number of times each colour appears amongst the neighbours of the root. Of course $b_1 = 0$ since the neighbours of the root cannot be 1. For $1 \le i \le k$ define

$$U_i = \prod_{1 \le j \le \Delta :\, \sigma_{u_j} = i} (1 - Y_{ij}).$$

Note that with this definition $U_1 = 1$. We will use the symmetries and exchangeability of the model to reduce the problem to considering a random variable only involving the $U_i$. Conditional on $\mathcal{F}$, the $U_i$ are independent and are distributed as the product of $b_i$ independent copies of $(1 - X^+(n))$, and $0 \le U_i \le 1$ for all i. Fix an $\ell$ with $2 \le \ell \le k$. Let $W_1$ and $W_\ell$ be defined by

$$W_1 = \prod_{1 \le j \le \Delta :\, \sigma_{u_j} \ne \ell} (1 - Y_{1j}), \qquad W_\ell = \prod_{1 \le j \le \Delta :\, \sigma_{u_j} \ne \ell} (1 - Y_{\ell j}),$$

so $Z_\ell = W_\ell U_\ell$. Note that for $j \in \{1 \le j \le \Delta : \sigma_{u_j} \ne \ell\}$ we have that $\sigma_{u_j} \notin \{1, \ell\}$, since none of the $\sigma_{u_j}$ are 1. So by Proposition 1, conditional on $\mathcal{F}$ and $\sigma_{u_j} \notin \{1, \ell\}$, we have that $Y_{1j}$ and $Y_{\ell j}$ are conditionally exchangeable and so $W_1$ and $W_\ell$ are conditionally exchangeable. We will analyse the effect of swapping $W_1$ with $W_\ell$. Recall that $Z_i = \prod_{j=1}^{\Delta} (1 - Y_{ij})$, so define

$$\tilde Z_\ell = W_1 U_\ell = W_1 \prod_{1 \le j \le \Delta :\, \sigma_{u_j} = \ell} (1 - Y_{\ell j}),$$

and

$$\tilde Z_1 = W_\ell \prod_{1 \le j \le \Delta :\, \sigma_{u_j} = \ell} (1 - Y_{1j}),$$

and for $i \notin \{1, \ell\}$, $\tilde Z_i = Z_i$. Proposition 1 noted that the $Y_j = \{Y_{1j}, \ldots, Y_{kj}\}$ are conditionally independent given $\mathcal{F}$, and for each j, given $\sigma_{u_j}$ and $Y_{\sigma_{u_j} j}$, the random variables $\{Y_{ij} : i \ne \sigma_{u_j}\}$ are conditionally exchangeable. It follows that

$$(W_1, W_\ell, Z_1, \ldots, Z_k, U_1, \ldots, U_k, \sigma_1, \ldots, \sigma_\Delta) \stackrel{d}{=} \big(W_\ell, W_1, \tilde Z_1, \ldots, \tilde Z_k, U_1, \ldots, U_k, \sigma_1, \ldots, \sigma_\Delta\big), \tag{1}$$

where equality is in distribution as random vectors, since this just swaps the $Y_{1j}$'s with the $Y_{\ell j}$'s, which are conditionally exchangeable given all the other random variables.

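Since $0 \le U_\ell \le 1$, and

$$\sum_{i=2}^{k} Z_i - \sum_{i=2}^{k} \tilde Z_i = Z_\ell - \tilde Z_\ell = U_\ell (W_\ell - W_1),$$

it follows that $(W_1 - W_\ell)$ has the same sign as

$$\Big(W_1 + \sum_{i=2}^{k} Z_i\Big) - \Big(W_\ell + \sum_{i=2}^{k} \tilde Z_i\Big) = (W_1 - W_\ell)(1 - U_\ell),$$

and so

$$\frac{1}{W_1 + \sum_{i=2}^{k} Z_i} - \frac{1}{W_\ell + \sum_{i=2}^{k} \tilde Z_i}$$

has the opposite sign to $W_1 - W_\ell$. Applying the equality in distribution of Eq. (1) we have that

$$\begin{aligned}
E^1\Big[ \frac{W_\ell - W_1}{W_1 + \sum_{i=2}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big]
&= \frac{1}{2} E^1\Big[ \frac{W_\ell - W_1}{W_1 + \sum_{i=2}^{k} Z_i} + \frac{W_1 - W_\ell}{W_\ell + \sum_{i=2}^{k} \tilde Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big] \\
&= \frac{1}{2} E^1\Big[ (W_\ell - W_1)\Big( \frac{1}{W_1 + \sum_{i=2}^{k} Z_i} - \frac{1}{W_\ell + \sum_{i=2}^{k} \tilde Z_i} \Big) \,\Big|\, \mathcal{F}, \{U_i\} \Big] \\
&\ge 0,
\end{aligned}$$

where the first equality follows using equality in distribution of the random vectors and the inequality follows from the two terms of the product having the same sign. Since $0 \le Z_1 \le W_1 \le 1$ we have that

$$\begin{aligned}
E^1\Big[ \frac{Z_1}{Z_1 + \sum_{i=2}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big]
&\le E^1\Big[ \frac{W_1}{W_1 + \sum_{i=2}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big] \\
&\le E^1\Big[ \frac{W_\ell}{W_1 + \sum_{i=2}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big] \\
&\le E^1\Big[ \frac{W_\ell}{Z_1 + \sum_{i=2}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big],
\end{aligned}$$

and so, since $Z_\ell = U_\ell W_\ell$ and we are conditioning on $U_\ell$,

$$E^1\Big[ \frac{U_\ell Z_1}{Z_1 + \sum_{i=2}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big] \le E^1\Big[ \frac{Z_\ell}{Z_1 + \sum_{i=2}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big].$$

Recall that $\ell \ge 2$ is arbitrary, so the previous equation holds for all $2 \le \ell \le k$ simultaneously. Summing over all values of $\ell$ we get that

$$E^1\Big[ \frac{Z_1 \big(1 + \sum_{l=2}^{k} U_l\big)}{\sum_{i=1}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big] \le E^1\Big[ \frac{\sum_{l=1}^{k} Z_l}{\sum_{i=1}^{k} Z_i} \,\Big|\, \mathcal{F}, \{U_i\} \Big] = 1,$$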


and hence since we are conditioning on the Ui , Z1 E 1 [ X 1 (n + 1)| F, {Ui }] = E 1 k

1 . F, {Ui } ≤

k 1 + i=2 Ui i=1 Z i

We now estimate the expected value of the right-hand side of the previous equation. Using the fact that $\frac{1}{1+x} = \int_0^1 s^x\,ds$ we have that

$$\frac{1}{1+\sum_{i=2}^{k} U_i} = \int_0^1 s^{\sum_{i=2}^{k} U_i}\,ds.$$

As $s^u$ is convex as a function of $u$ we have that $s^u \le s^0(1-u) + s^1 u$ when $0 \le u \le 1$, and so since $0 \le U_i \le 1$ we have that $E^1 s^{U_i} \le (1 - E^1 U_i) + s\,E^1 U_i = 1 - (1-s)E^1 U_i$. Since, conditional on $\mathcal{F}$, the $U_i$ are independent and are distributed as the product of $b_i$ independent copies of $(1 - X^+(n))$, we have that

$$E^1[X_1(n+1) \mid \mathcal{F}] \le \int_0^1 \prod_{i=2}^{k}\bigl(1-(1-s)E^1[U_i \mid \mathcal{F}]\bigr)\,ds = \int_0^1 \prod_{i=2}^{k}\bigl(1-(1-s)(1-x_n)^{b_i}\bigr)\,ds.$$
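The two elementary facts used in this step, the integral representation $\frac{1}{1+x} = \int_0^1 s^x\,ds$ and the chord bound $s^u \le 1-(1-s)u$ for $0 \le u \le 1$, can be checked numerically. The following is only an illustrative sketch; the function names and the midpoint-rule quadrature are choices made here, not part of the paper.

```python
def integral_of_power(x: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of s**x over [0, 1]."""
    h = 1.0 / steps
    return sum(((i + 0.5) * h) ** x for i in range(steps)) * h


def chord_gap(s: float, u: float) -> float:
    """Chord bound 1 - (1 - s) * u minus s**u.

    Convexity of u -> s**u makes this non-negative for 0 < s <= 1, 0 <= u <= 1.
    """
    return (1.0 - (1.0 - s) * u) - s ** u


if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0, 7.3):
        print(x, integral_of_power(x), 1.0 / (1.0 + x))
    print(min(chord_gap(i / 50, j / 50) for i in range(1, 51) for j in range(51)))
```

Applying the chord bound to each factor $s^{U_i}$ and then integrating over $s$ is exactly what turns the conditional expectation into the product integral above.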

Now the colours $\sigma_{u_j}$ are chosen independently and uniformly from the set $\{2,\dots,k\}$, so $(b_2,\dots,b_k)$ has a multinomial distribution. Let $\beta < \beta^* < 1 - \log 2$ and let $\tilde b_i$ be iid random variables distributed as Poisson($D$), where $D = \log k + \log\log k + \beta^*$. By Lemma 4 we can couple the $b$'s and $\tilde b$'s so that $(b_2,\dots,b_k) \le (\tilde b_2,\dots,\tilde b_k)$ whenever

k j=2 bj ≥ ∆. It follows that xn+1 = E 1 X 1 (n + 1) ≤E

1

1{ k j=2 bj 0 such that when 0 < x < δ, then 1 − e−x r > 0 such that −β ∗

∗

e−1 (1 + r ). Now for large enough k, (k − 1) exp(−D) = (k−1)e k log k −1 and so using the fact that r < r and p = o(k ), ∗ (k − 1)e−β (1 + r )e−1 e−1 1 − ≤1+ p− ≤1− , y1 = g(1) ≤ p + 1 − 2 k log k log k log k provided k is sufficiently large. Now since g is a continuous increasing function and y1 < y0 it follows that the sequence yi is decreasing. Suppose that (k − 1) exp(−yi D) < δ. Then 1 − (k − 1) exp(−yi D), yi+1 ≤ p + 1 − 2 and so for k sufficiently large 1 1 − yi+1 ≥ − (k − 1) exp(−yi D) − p 2 ∗ (k − 1)e−β 1 − exp((1 − yi ) log k) − p ≥ 2 k log k (1 + r )e−1 exp((1 − yi ) log k) − p log k ≥ (1 + r )(1 − yi ) − p ≥ (1 + r )(1 − yi ), ≥

where the second-to-last inequality uses the fact that $e^{x} \ge e\,x$ and the final inequality uses the fact that $1 - y_i \ge \frac{e^{-1}}{\log k}$, while $p = o(k^{-1})$. It follows that $y_i$ decreases until, for some $i$, $(k-1)\exp(-y_i D) \ge \delta$. Now let $\frac{1-e^{-\delta}}{\delta} = \alpha < \alpha' < \alpha'' < 1$ for some $\alpha$. When $k$ is large enough then

$$y_{i+1} \le p + \frac{1-e^{-\delta}}{\delta} \le \alpha'.$$

Then for $k$ large enough, $\exp(-y_{i+1}D) \ge \exp(-\alpha' D) \ge \exp(-\alpha''\log k) = k^{-\alpha''}$. It follows that

$$y_{i+2} \le p + \frac{1}{(k-1)\exp(-y_{i+1}D)} \le 2k^{\alpha''-1}.$$

Finally we have $\exp(-y_{i+2}D) \ge \exp(-2k^{\alpha''-1}D) \ge \frac{2}{3}$ and so

$$y_{i+3} \le p + \frac{1}{(k-1)\exp(-y_{i+2}D)} < 2k^{-1}$$

when $k$ is large enough, which completes the proof.

In the preceding lemma we note that the requirement that $\beta^* < 1 - \log 2$ comes from the fact that $x < \frac{1}{2}e^{x-\beta^*}$ for all $x$ when $\beta^* < 1 - \log 2$.

Lemma 4. Suppose that $(b_1,\dots,b_k)$ has the multinomial distribution $M(n, \frac{1}{k}, \frac{1}{k}, \dots, \frac{1}{k})$. Let $\tilde b_j$ be iid random variables distributed as Poisson($D$). We can couple the $b$'s and $\tilde b$'s so that $(b_1,\dots,b_k) \le (\tilde b_1,\dots,\tilde b_k)$ (respectively $\ge$) whenever $\sum_{j=1}^{k} \tilde b_j \ge n$ (respectively $\le$).
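One way to realise the coupling of Lemma 4 is to view the Poisson counts as balls thrown uniformly into $k$ bins and then to delete or add balls until exactly $n$ remain. The sketch below is an illustration under that assumption; `sample_poisson`, `coupled_counts` and their parameters are hypothetical names, and the Poisson sampler is Knuth's multiplication method, which is adequate only for moderate means.

```python
import math
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Knuth's multiplication method for a Poisson(mean) sample."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def coupled_counts(n: int, k: int, mean: float, rng: random.Random):
    """Return (b, b_tilde) with b_tilde iid Poisson(mean) and sum(b) == n.

    Conditional on their total, the Poisson counts are multinomial, so
    keeping a uniformly chosen subset of n balls (or adding uniform balls)
    yields a uniform multinomial vector ordered against b_tilde.
    """
    b_tilde = [sample_poisson(mean, rng) for _ in range(k)]
    balls = [colour for colour, c in enumerate(b_tilde) for _ in range(c)]
    rng.shuffle(balls)
    if len(balls) >= n:
        kept = balls[:n]  # discard surplus balls uniformly at random
    else:
        kept = balls + [rng.randrange(k) for _ in range(n - len(balls))]
    b = [0] * k
    for colour in kept:
        b[colour] += 1
    return b, b_tilde


if __name__ == "__main__":
    print(coupled_counts(30, 5, 6.0, random.Random(1)))
```

When the Poisson total is at least $n$ the construction only removes balls, so $b \le \tilde b$ coordinatewise; when it is at most $n$ it only adds balls, so $b \ge \tilde b$.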

Proof. Since the $\tilde b_j$ are independent and Poisson, conditional on the sum $N = \sum_{j=1}^{k} \tilde b_j$, the distribution of $(\tilde b_1,\dots,\tilde b_k)$ is multinomial $M(N, \frac{1}{k}, \frac{1}{k}, \dots, \frac{1}{k})$ (see [13], Prop. 6.2.1). Now if $n \le m$ then two multinomial distributions $A$ and $B$ distributed as $M(n, \frac{1}{k}, \frac{1}{k}, \dots, \frac{1}{k})$ and $M(m, \frac{1}{k}, \frac{1}{k}, \dots, \frac{1}{k})$ respectively can be trivially coupled so that $A \le B$, which completes the proof.

Janson and Mossel [8] studied "robust reconstruction", the question of when reconstruction is possible from a very noisy copy of the leaves. They found that the threshold for robust reconstruction is exactly the Kesten-Stigum bound. Lemma 2 establishes that the leaves provide very little information about the spin at a vertex a long distance from the leaves. So as information over long distances is very noisy, the results of [8] suggest that reconstruction would only be possible after the Kesten-Stigum bound whereas, in our context, $\Delta$ is much less than $\lambda_2(M)^{-2}$. As such, only crude bounds are needed to establish the following lemma.

Lemma 5. For sufficiently large $k$, if $\Delta \le 2k\log k$ and if $x_n \le \frac{2}{k}$ then

$$x_{n+1} - \frac{1}{k} \le \frac{1}{2}\Bigl(x_n - \frac{1}{k}\Bigr).$$

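Proof. Using the identity

$$\frac{1}{s+r} = \frac{1}{s} - \frac{r}{s^2} + \frac{r^2}{s^2}\,\frac{1}{s+r}$$

and taking $s = E^1\sum_{i=1}^{k} Z_i$ and $r = \sum_{i=1}^{k}(Z_i - E^1 Z_i)$ we have that

$$\begin{aligned}
x_{n+1} - \frac{1}{k} &= E^1\,\frac{Z_1 - \frac{1}{k}\sum_{i=1}^{k} Z_i}{\sum_{i=1}^{k} Z_i} \\
&= \frac{E^1\bigl(Z_1 - \frac{1}{k}\sum_{i=1}^{k} Z_i\bigr)}{E^1\sum_{i=1}^{k} Z_i}
 - \frac{E^1\bigl[\bigl(Z_1 - \frac{1}{k}\sum_{i=1}^{k} Z_i\bigr)\sum_{i=1}^{k}(Z_i - E^1 Z_i)\bigr]}{\bigl(E^1\sum_{i=1}^{k} Z_i\bigr)^2} \\
&\quad + E^1\Biggl[\frac{Z_1 - \frac{1}{k}\sum_{i=1}^{k} Z_i}{\sum_{i=1}^{k} Z_i}\cdot\frac{\bigl(\sum_{i=1}^{k}(Z_i - E^1 Z_i)\bigr)^2}{\bigl(E^1\sum_{i=1}^{k} Z_i\bigr)^2}\Biggr].
\end{aligned}$$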
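The expansion of $\frac{1}{s+r}$ that drives this decomposition is a purely algebraic identity, so it can be confirmed in exact rational arithmetic. The short sketch below is only a sanity check; the function name `expansion` is chosen here for illustration.

```python
from fractions import Fraction


def expansion(s: Fraction, r: Fraction) -> Fraction:
    """Right-hand side 1/s - r/s^2 + (r^2/s^2) * 1/(s + r)."""
    return 1 / s - r / s**2 + (r**2 / s**2) / (s + r)


if __name__ == "__main__":
    s, r = Fraction(7, 3), Fraction(-2, 5)
    print(expansion(s, r), 1 / (s + r))
```

The first two terms are the mean and the linear fluctuation around $s$, while the last term keeps the exact remainder, which is why it still contains the random denominator $s + r$.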



Now by Lemma 6,

E 1 Z 1 − k1 ki=1 Z i ≤

E 1 ki=1 Z i

2∆ 1 1 − − xn − k1 − k−1 x n k k k2 2∆ 1 1 + (k − 1) 1 − k2 xn − k 1 3∆ . (2) ≤ 2 xn − k k k−1 k

1+

2∆ k

Using the inequality 21 (a 2 + b2 ) ≥ ab we have that

k

k 1 1 Zi Zi − E Zi − Z1 − k i=1 i=1

k k 1 1 1 1 1 1 E Zi − (Z i − E Z i ) = − Z1 − E Z1 + E Z1 − k k i=1 i=1 k

1 · (Z i − E Z i )

i=1

k 2 2 1 1 1 + ≤ Z 1 − E 1 Z 1 + (Z i − E 1 Z i ) 2 2 k i=1

k k 1 − E 1 Z1 − E 1 Zi (Z i − E 1 Z i ) k i=1

i=1

so by Lemma 6 we have that,

k k 1 1 E − Z1 − Zi (Z i − E Z i ) k i=1 i=1 4∆ k − 1 2∆ 1 xn − + 4∆ ≤ k k k

1

and ⎡

⎤ k 1Z ) Z (Z − E i i i i=1 i=1 ⎢ ⎥ E 1 ⎣− ⎦

2 E 1 ki=1 Z i xn − k1 4∆ k + 4∆ ≤ 2 1 1 + (k − 1) 1 − 2∆ x − n 2 k k 5∆ 1 . ≤ 2 xn − k k Z1 −

1 k

k

(3)


Finally since 0 ≤

Z

k 1

i=1

Zi

1 k i=1 1 Z1 − k E

k i=1 Z i

Z − 1 k Z ≤ 1 we have that 1 kk i=1 i ≤ 1, and so i=1

Zi

k i=1 (Z i

− E 1 Zi)

2 E 1 ki=1 Z i

2

Zi

k i=1 (Z i

− E 1 Zi) 1 ≤E

2 E 1 ki=1 Z i 5∆ 1 . ≤ 2 xn − k k

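(4)

Combining Eqs. (2), (3) and (4) we have that

$$x_{n+1} - \frac{1}{k} \le \frac{13\Delta}{k^2}\Bigl(x_n - \frac{1}{k}\Bigr) \le \frac{1}{2}\Bigl(x_n - \frac{1}{k}\Bigr) \tag{5}$$

for large enough $k$, which completes the proof.

Lemma 6. For sufficiently large $k$, if $\Delta \le 2k\log k$ and if $x_n \le \frac{2}{k}$ then the following all hold:

$$\Bigl(\frac{k-1}{k}\Bigr)^{\Delta} \le E^1 Z_1 \le \Bigl(\frac{k-1}{k}\Bigr)^{\Delta}\Bigl(1 + \frac{2\Delta}{k}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr), \tag{6}$$

and for $i \ne 1$,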

∆ 2∆ 1 k−1 ∆ 1 ≤ E Zi ≤ 1 − 2 xn − , k k k k − 1 2∆ 4∆ 1 Var1 Z 1 ≤ xn − , k k k k

1 k − 1 2∆ 1 Var . Zi ≤ 4∆ xn − k k k−1 k

(7) (8) (9)

i=1

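Proof. From Eq. (15) of Lemma 9 we have that

$$E^1 Z_1 = \Bigl(\frac{k-1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr)^{\Delta},$$

and since by Corollary 1, $x_n \ge \frac{1}{k}$, we have that

$$E^1 Z_1 \ge \Bigl(\frac{k-1}{k}\Bigr)^{\Delta}.$$

Then since $\exp(x) = 1 + x + O(x^2)$ and $\frac{k\Delta}{(k-1)^2}\bigl(x_n - \frac{1}{k}\bigr)$ is small for large $k$,

$$E^1 Z_1 \le \Bigl(\frac{k-1}{k}\Bigr)^{\Delta}\exp\Bigl(\frac{k\Delta}{(k-1)^2}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr) \le \Bigl(\frac{k-1}{k}\Bigr)^{\Delta}\Bigl(1 + \frac{2\Delta}{k}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr),$$

which establishes Eq. (6). Equations (7), (8) and (9) are established similarly using identities from Lemma 9.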
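The two-sided bound (6) can be checked numerically from the closed form (15) for a few admissible parameter values. This is a sketch only: the sample values of $k$ and the choice $\Delta = \lfloor 2k\log k\rfloor$ are assumptions made here to stay inside the hypotheses $\Delta \le 2k\log k$ and $\frac{1}{k} \le x_n \le \frac{2}{k}$.

```python
import math


def mean_z1(k: int, delta: int, x_n: float) -> float:
    """Closed form (15) for E^1 Z_1."""
    return ((k - 1) / k + (x_n - 1 / k) / (k - 1)) ** delta


def bounds_eq6(k: int, delta: int, x_n: float):
    """Lower and upper bounds of Eq. (6)."""
    base = ((k - 1) / k) ** delta
    return base, base * (1 + 2 * delta / k * (x_n - 1 / k))


if __name__ == "__main__":
    k = 100
    delta = int(2 * k * math.log(k))
    print(bounds_eq6(k, delta, 2 / k), mean_z1(k, delta, 2 / k))
```

The lower bound is attained at $x_n = \frac{1}{k}$, and the slack in the upper bound grows with $x_n - \frac{1}{k}$.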



2.2. Reconstruction. An upper bound on the reconstruction threshold $\Delta^*(k)$ is found by estimating the probability that the colour of the root is uniquely determined by the colours at the leaves. This method was described in [18] and used to a higher level of precision in [21,20]. We restate the result and give a full proof for completeness.

Lemma 7. Suppose that $\beta > 1$. Then for sufficiently large $k$, if $\Delta > k[\log k + \log\log k + \beta]$ then the colour of the root is uniquely determined by the colours at the leaves with probability at least $1 - \frac{1}{\log k}$, that is

$$\inf_n P(X^+(n) = 1) > 1 - \frac{1}{\log k}.$$

Proof. Let $p_n$ be the probability that the leaves at distance $n$ determine the spin at the root, that is $p_n = P^1(X_1(n) = 1)$. We will show that when $k$ is large then $\liminf_n p_n$ is close to 1. Suppose we fix the colour of the root to be 1 and let $\mathcal{F}$ denote the sigma-algebra generated by $\{\sigma_{u_j} : 1 \le j \le \Delta\}$, the colours of the neighbours of the root. For $2 \le i \le k$ let $b_i = \#\{j : \sigma_{u_j} = i\}$, the number of times each colour appears among the neighbours of the root. Now each colour $\sigma_{u_j}$ is chosen uniformly from the set $\{2,\dots,k\}$, so $(b_2,\dots,b_k)$ has a multinomial distribution. Let $\beta > \beta^* > 1$ and let $\tilde b_i$ be iid random variables distributed as Poisson($D$), where $D = \log k + \log\log k + \beta^*$. By Lemma 4 we can couple the $b$'s and $\tilde b$'s so that $(b_2,\dots,b_k) \ge (\tilde b_2,\dots,\tilde b_k)$ whenever $\sum_{i=2}^{k}\tilde b_i \le \sum_{i=2}^{k} b_i = \Delta$.

If for each colour $2 \le i \le k$ there is some vertex $u_j$ such that the states of the leaves $L_j(n)$ fix the colour of $u_j$ to be $i$, then the leaves $L(n+1)$ fix the colour of $\rho$ to be 1. Conditional on $\mathcal{F}$ the probability that there is such a vertex $u_j$ for a given colour $i$ is at least $1 - (1-p_n)^{b_i}$. Moreover these events are conditionally independent given $\mathcal{F}$, so it follows that

$$\begin{aligned}
p_{n+1} &\ge E^1\prod_{i=2}^{k}\bigl[1 - (1-p_n)^{b_i}\bigr] \\
&\ge E\prod_{i=2}^{k}\bigl[1 - (1-p_n)^{\tilde b_i}\bigr] - s \\
&= (1 - \exp(-p_n D))^{k-1} - s,
\end{aligned}$$

where $s = P(\mathrm{Poisson}((k-1)D) > \Delta) = o(k^{-1})$. Now $f(x) = (1 - \exp(-xD))^{k-1} - s$ is an increasing function in $x$ and hence when $k$ is large enough

$$f\Bigl(1 - \frac{1}{\log k}\Bigr) = \Bigl(1 - \exp\Bigl(-\Bigl(1 - \frac{1}{\log k}\Bigr)(\log k + \log\log k + \beta^*)\Bigr)\Bigr)^{k-1} - s > 1 - \frac{1}{\log k},$$

and since $p_0 = 1$,

$$\inf_n p_n \ge 1 - \frac{1}{\log k},$$

which completes the proof.
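The recursion in this proof is easy to iterate numerically. The sketch below drops the correction term $s$ (an assumption made here, justified only by $s = o(k^{-1})$) and uses the illustrative values $k = 1000$ and $\beta^* = 1.5$, which are not taken from the paper; smaller $\beta^*$ may require much larger $k$ before the bound of Lemma 7 takes hold.

```python
import math


def next_lower_bound(p: float, k: int, d: float) -> float:
    """One step of p -> (1 - exp(-p * D))**(k - 1), with the term s omitted."""
    return (1.0 - math.exp(-p * d)) ** (k - 1)


def iterate(k: int, beta_star: float, steps: int = 200):
    """Iterate the lower-bound recursion starting from p_0 = 1."""
    d = math.log(k) + math.log(math.log(k)) + beta_star
    p, trajectory = 1.0, [1.0]
    for _ in range(steps):
        p = next_lower_bound(p, k, d)
        trajectory.append(p)
    return trajectory


if __name__ == "__main__":
    traj = iterate(1000, 1.5)
    print(min(traj), 1 - 1 / math.log(1000))
```

Because the map is increasing and starts at $p_0 = 1$, the iterates decrease to the largest fixed point, which is the quantity the lemma bounds from below.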


2.3. Main Theorem.

Proof (Theorem 1). Combining Lemmas 2 and 5 establishes non-reconstruction when $\Delta \le k[\log k + \log\log k + 1 - \log 2 - o(1)]$. Lemma 7 shows that the root can be reconstructed correctly with probability at least $1 - \frac{1}{\log k}$, which establishes reconstruction when $\Delta \ge k[\log k + \log\log k + 1 + o(1)]$.

Remarks. For large $k$ the Poisson($\Delta$) distribution is concentrated around $\Delta$ with standard deviation $O(\sqrt{\Delta})$, which is significantly smaller than the error bounds in Theorem 1. With some minor modifications the bounds for $\Delta$-ary trees can be extended to Galton-Watson branching processes with offspring distribution Poisson($\Delta$). The reconstruction of Galton-Watson branching processes with offspring distribution Poisson($\Delta$) is of interest because, as noted before, it is believed to be related to the clustering phase transition for colourings on Erdős-Rényi random graphs.

To be more specific, for the proof of non-reconstruction we can again bound $x_n = E^1 X_1(n)$, where the expected value is taken over all possible trees. In Lemma 2 we repeat the same bounds on $x_n$, the only difference being that $\Delta$ is now random, which does not affect the results for large $k$. Then similar estimates can be made in Lemma 5 provided $\frac{\Delta}{k}\bigl(x_n - \frac{1}{k}\bigr)$ is very small. As $\Delta$ is concentrated around its expected value, the probability of this not holding is very small and this can be used to complete the proof of non-reconstruction.

When $\beta > \beta^* > 1$, with probability going to 1 as $k$ goes to infinity, the Galton-Watson branching process contains a subgraph which is a $(k[\log k + \log\log k + \beta^*])$-ary tree rooted at $\rho$. Reconstruction then follows from Lemma 7.

Acknowledgements. The author would like to thank Elchanan Mossel for his useful comments and advice and thank Dror Weitz, Nayantara Bhatnagar, Lenka Zdeborova, Florent Krząkała, Guilhem Semerjian and Dmitry Panchenko for useful discussions.
He would also like to thank the anonymous referees and associate editor for their careful reading of the paper and suggested improvements in the exposition. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

A. Appendix

In this appendix we calculate identities which are used in the proof of Lemma 6. Observe that since $EX^+(n) + (k-1)EX^- = 1$ we have that $EX^+ - \frac{1}{k} = -(k-1)\bigl(EX^- - \frac{1}{k}\bigr)$. We will show that the means and variances of the $Y_{ij}$ and $Z_i$ can all be calculated in terms of $x_n$ and $z_n$.

Lemma 8. We have the identities

$$E^1 Y_{1j} = \frac{1}{k} - \frac{1}{k-1}\Bigl(x_n - \frac{1}{k}\Bigr), \tag{10}$$

$$E^1 Y_{1j}^2 = \frac{1}{k^2} + \frac{k-2}{k(k-1)}\Bigl(x_n - \frac{1}{k}\Bigr) - \frac{1}{k-1}z_n. \tag{11}$$

For $2 \le i \le k$,

$$E^1 Y_{ij} = \frac{1}{k} + \frac{1}{(k-1)^2}\Bigl(x_n - \frac{1}{k}\Bigr) \tag{12}$$

and

$$E^1 Y_{ij}^2 = \frac{1}{k^2} + \frac{k^2 - 2k + 2}{k(k-1)^2}\Bigl(x_n - \frac{1}{k}\Bigr) + \frac{1}{(k-1)^2}z_n. \tag{13}$$

For any $1 \le i_1 < i_2 \le k$,

$$\mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}) \le 0. \tag{14}$$
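The moment identities (10) to (13) satisfy two consistency relations that can be checked in exact arithmetic: the first moments over all $k$ colours sum to 1, and the second moments sum to $x_n$. The second relation assumes, as the appeal to Lemma 1 in the proof suggests, that $E\sum_{i}(X_i)^2 = EX^+$. The helper below is an illustrative sketch with names chosen here.

```python
from fractions import Fraction


def lemma8_moments(k: int, x_n: Fraction, z_n: Fraction):
    """Return (E Y_1j, E Y_1j^2, E Y_ij, E Y_ij^2) from Eqs. (10)-(13)."""
    a = x_n - Fraction(1, k)
    m1_root = Fraction(1, k) - a / (k - 1)                            # Eq. (10)
    m2_root = (Fraction(1, k**2) + Fraction(k - 2, k * (k - 1)) * a
               - z_n / (k - 1))                                       # Eq. (11)
    m1_other = Fraction(1, k) + a / (k - 1) ** 2                      # Eq. (12)
    m2_other = (Fraction(1, k**2)
                + Fraction(k * k - 2 * k + 2, k * (k - 1) ** 2) * a
                + z_n / (k - 1) ** 2)                                 # Eq. (13)
    return m1_root, m2_root, m1_other, m2_other


if __name__ == "__main__":
    k, x_n, z_n = 7, Fraction(3, 10), Fraction(1, 50)
    m1r, m2r, m1o, m2o = lemma8_moments(k, x_n, z_n)
    print(m1r + (k - 1) * m1o, m2r + (k - 1) * m2o)
```

Both sums are independent of $z_n$, which reflects the cancellation of the $z_n$ terms between (11) and (13).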

Proof. When the root is conditioned to be 1, $\sigma_{u_j} \ne 1$ and so $Y_{1j}$ is distributed as $X^-$ and we have that

$$E^1 Y_{1j} = EX^- = \frac{1}{k} - \frac{1}{k-1}\Bigl(EX^+ - \frac{1}{k}\Bigr) = \frac{1}{k} - \frac{1}{k-1}\Bigl(x_n - \frac{1}{k}\Bigr),$$

and

$$\begin{aligned}
E^1 Y_{1j}^2 &= E(X^-)^2 \\
&= \frac{1}{k-1}\Bigl[E\sum_{i=1}^{k}(X_i)^2 - E(X^+)^2\Bigr] \\
&= \frac{1}{k-1}\bigl[EX^+ - E(X^+)^2\bigr] \\
&= \frac{1}{k-1}\Bigl[\frac{k-1}{k^2} + \frac{k-2}{k}E\Bigl(X^+ - \frac{1}{k}\Bigr) - E\Bigl(X^+ - \frac{1}{k}\Bigr)^2\Bigr] \\
&= \frac{1}{k^2} + \frac{k-2}{k(k-1)}\Bigl(x_n - \frac{1}{k}\Bigr) - \frac{1}{k-1}z_n,
\end{aligned}$$

where the third equality follows from Lemma 1. For $2 \le i \le k$ we have that

$$E^1 Y_{ij} = \frac{1}{k-1}\bigl[1 - E^1 Y_{1j}\bigr] = \frac{1}{k-1}\Bigl[1 - \frac{1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr] = \frac{1}{k} + \frac{1}{(k-1)^2}\Bigl(x_n - \frac{1}{k}\Bigr),$$

and again using Lemma 1,

$$E^1 Y_{ij}^2 = \frac{1}{k-1}\Bigl[E\sum_{i=1}^{k}(X_i)^2 - E^1 Y_{1j}^2\Bigr] = \frac{1}{k^2} + \frac{k^2 - 2k + 2}{k(k-1)^2}\Bigl(x_n - \frac{1}{k}\Bigr) + \frac{1}{(k-1)^2}z_n.$$

Also for $2 \le i \le k$,

$$\begin{aligned}
E^1 Y_{1j}Y_{ij} &= \frac{1}{k-1}\sum_{i'=2}^{k} E^1 Y_{1j}Y_{i'j} \\
&= \frac{1}{k-1}E^1 Y_{1j}(1 - Y_{1j}) \\
&\le \frac{1}{k-1}E^1 Y_{1j}\,E^1(1 - Y_{1j}) \\
&= E^1 Y_{1j}\,E^1 Y_{ij},
\end{aligned}$$

so $\mathrm{Cov}^1(Y_{1j}, Y_{ij}) \le 0$. Finally for $2 \le i_1 < i_2 \le k$,

$$\mathrm{Var}^1(1 - Y_{1j}) = \sum_{i=2}^{k}\mathrm{Var}^1(Y_{ij}) + (k-1)(k-2)\,\mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}),$$

and so

$$(k-1)(k-2)\,\mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}) = \mathrm{Var}^1(1 - Y_{1j}) - \sum_{i=2}^{k}\mathrm{Var}^1(Y_{ij}) = \mathrm{Var}(X^-) - \bigl((k-2)\mathrm{Var}(X^-) + \mathrm{Var}(X^+)\bigr) \le 0,$$

so $\mathrm{Cov}^1(Y_{i_1 j}, Y_{i_2 j}) \le 0$.

Using Lemma 8 we can calculate the means and covariances of the $Z_j$.

Lemma 9. We have the following results:

$$E^1 Z_1 = \Bigl(\frac{k-1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr)^{\Delta}, \tag{15}$$

$$E^1 Z_1^2 = \Bigl(\Bigl(\frac{k-1}{k}\Bigr)^2 + \frac{3k-2}{k(k-1)}\Bigl(x_n - \frac{1}{k}\Bigr) - \frac{1}{k-1}z_n\Bigr)^{\Delta}. \tag{16}$$

For each $2 \le i \le k$,

$$E^1 Z_i = \Bigl(\frac{k-1}{k} - \frac{1}{(k-1)^2}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr)^{\Delta} \tag{17}$$

and

$$E^1 Z_i^2 = \Bigl(\Bigl(\frac{k-1}{k}\Bigr)^2 + \frac{k^2 - 4k + 2}{k(k-1)^2}\Bigl(x_n - \frac{1}{k}\Bigr) + \frac{1}{(k-1)^2}z_n\Bigr)^{\Delta}. \tag{18}$$

For any $1 \le i_1 < i_2 \le k$,

$$\mathrm{Cov}^1(Z_{i_1}, Z_{i_2}) \le 0. \tag{19}$$

Proof. By Eq. (10) we have that

$$E^1 Z_1 = E^1\prod_{j=1}^{\Delta}(1 - Y_{1j}) = \Bigl(1 - \frac{1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr)^{\Delta} = \Bigl(\frac{k-1}{k} + \frac{1}{k-1}\Bigl(x_n - \frac{1}{k}\Bigr)\Bigr)^{\Delta},$$

which establishes Eq. (15). Equations (16), (17) and (18) follow similarly. Using Eq. (14) we have that for $1 \le i_1 < i_2 \le k$,

$$E^1 Z_{i_1}Z_{i_2} = E^1\prod_{j=1}^{\Delta}(1 - Y_{i_1 j})(1 - Y_{i_2 j}) \le \prod_{j=1}^{\Delta}E^1(1 - Y_{i_1 j})\,E^1(1 - Y_{i_2 j}) = E^1 Z_{i_1}\,E^1 Z_{i_2},$$

which establishes Eq. (19).

References

1. Achlioptas, D., Coja-Oghlan, A.: Algorithmic barriers from phase transition. http://front.math.ucdavis.edu/0803.2122, 2008
2. Bhatnagar, N., Vera, J., Vigoda, E.: Reconstruction for colorings on trees. http://front.math.ucdavis.edu/0711.3664, 2007
3. Bleher, P.M., Ruiz, J., Zagrebnov, V.A.: On the purity of limiting Gibbs state for the Ising model on the Bethe lattice. J. Stat. Phys. 79, 473–482 (1995)
4. Borgs, C., Chayes, J.T., Mossel, E., Roch, S.: The Kesten-Stigum reconstruction bound is tight for roughly symmetric binary channels. In: FOCS 2006, Los Alamitos, CA: IEEE Computer Society, 2006, pp. 518–530
5. Daskalakis, C., Mossel, E., Roch, S.: Optimal phylogenetic reconstruction. In: STOC'06: Proceedings of the 38th Annual ACM Symposium on Theory of Computing, New York: ACM, 2006, pp. 159–168
6. Dyer, M., Frieze, A., Hayes, T.P., Vigoda, E.: Randomly coloring constant degree graphs. In: FOCS '04: Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, Washington, DC: IEEE Computer Society, 2004, pp. 582–589
7. Evans, W., Kenyon, C., Peres, Y., Schulman, L.J.: Broadcasting on trees and the Ising model. Ann. Appl. Probab. 10(2), 410–433 (2000)
8. Janson, S., Mossel, E.: Robust reconstruction on trees is determined by the second eigenvalue. Ann. Probab. 32(3B), 2630–2649 (2004)
9. Jonasson, J.: Uniqueness of uniform random colorings of regular trees. Stat. Prob. Lett. 57, 243–248 (2002)
10. Kesten, H., Stigum, B.P.: Additional limit theorems for indecomposable multidimensional Galton-Watson processes. Ann. Math. Stat. 37, 1463–1481 (1966)
11. Krząkała, F., Montanari, A., Ricci-Tersenghi, F., Semerjian, G., Zdeborova, L.: Gibbs states and the set of solutions of random constraint satisfaction problems. Proc. Nat. Acad. Sci. 104, 10318–10323 (2007)
12. Krząkała, F., Pagnani, A., Weigt, M.: Threshold values, stability analysis, and high-q asymptotics for the coloring problem on random graphs. Phys. Rev. E 70(4), 046705 (2004)
13. Lange, K.: Applied probability. Springer Texts in Statistics. New York: Springer-Verlag, 2003
14. Mézard, M., Montanari, A.: Reconstruction on trees and spin glass transition. J. Stat. Phys. 124(6), 1317–1350 (2006)
15. Mossel, E.: Reconstruction on trees: beating the second eigenvalue. Ann. Appl. Prob. 11(1), 285–300 (2001)
16. Mossel, E.: Phase transitions in phylogeny. Trans. Amer. Math. Soc. 356(6), 2379–2404 (electronic) (2004)
17. Mossel, E.: Survey: information flow on trees. In: Graphs, morphisms and statistical physics, Volume 63 of DIMACS Ser. Discrete Math. Theoret. Comput. Sci., Providence, RI: Amer. Math. Soc., 2004, pp. 155–170
18. Mossel, E., Peres, Y.: Information flow on trees. Ann. Appl. Probab. 13, 817–844 (2003)
19. Mossel, E., Sly, A.: Gibbs rapidly samples colorings of G(n,d/n). http://arxiv.org/abs/0707.3241V2[math.PR], 2007
20. Semerjian, G.: On the freezing of variables in random constraint satisfaction problems. J. Stat. Phys. 130, 251 (2008)
21. Zdeborová, L., Krząkała, F.: Phase transitions in the coloring of random graphs. Phys. Rev. E 76, 031131 (2007)

Communicated by F. Toninelli




Heat Kernel on Homogeneous Bundles over Symmetric Spaces

Ivan G. Avramidi

Department of Mathematics, New Mexico Institute of Mining and Technology, Socorro, NM 87801, USA. E-mail: [email protected]

Received: 27 May 2008 / Accepted: 29 May 2008
Published online: 26 September 2008 – © Springer-Verlag 2008

Abstract: We consider Laplacians acting on sections of homogeneous vector bundles over symmetric spaces. By using an integral representation of the heat semi-group we find a formal solution for the heat kernel diagonal that gives a generating function for the whole sequence of heat invariants. We show explicitly that the obtained result correctly reproduces the first non-trivial heat kernel coefficient as well as the exact heat kernel diagonals on the two-dimensional sphere $S^2$ and the hyperbolic plane $H^2$. We argue that the obtained formal solution correctly reproduces the exact heat kernel diagonal after a suitable regularization and analytical continuation.

1. Introduction

The heat kernel is one of the most powerful tools in mathematical physics and geometric analysis (see, for example, the books [13,17,24,26,27] and reviews [2,12,14,18,31]). The short-time asymptotic expansion of the trace of the heat kernel determines the spectral asymptotics of the differential operator. The coefficients of this asymptotic expansion, called the heat invariants, are extensively used in geometric analysis, in particular, in spectral geometry and in proofs of index theorems [17,24].

There has been tremendous progress in the explicit calculation of spectral asymptotics in the last thirty years [2–5,23,30,33]. It seems that further progress in the study of spectral asymptotics can only be achieved by restricting oneself to operators and manifolds with a high level of symmetry, in particular, homogeneous spaces, which enables one to employ powerful algebraic methods. In some very special cases, such as group manifolds, spheres, rank-one symmetric spaces and split-rank symmetric spaces, it is possible to determine the spectrum of the Laplacian exactly and to obtain closed formulas for the heat kernel in terms of the root vectors and their multiplicities [1,18–20,22,26].
The complexity of the method crucially depends on the global structure of the symmetric space, most importantly its rank. Most of the results for symmetric spaces are obtained for rank-one symmetric spaces only [18].


It is well known that heat invariants are determined essentially by local geometry. They are polynomial invariants in the curvature with universal constants that do not depend on the global properties of the manifold [24]. It is this universal structure that we are interested in in this paper. Our goal is to compute the heat kernel asymptotics of the Laplacian acting on homogeneous vector bundles over symmetric spaces. Related problems in a more general context are discussed in [7,9,11].

2. Geometry of Symmetric Spaces

2.1. Twisted spin-tensor bundles. In this section we introduce the basic concepts and fix notation. Let $(M, g)$ be an $n$-dimensional Riemannian manifold without boundary. We assume that it is complete, simply connected, orientable and spin. We denote the local coordinates on $M$ by $x^\mu$, with Greek indices running over $1,\dots,n$. Let $e_a{}^\mu$ be a local orthonormal frame defining a basis for the tangent space $T_xM$ so that

$$g^{\mu\nu} = \delta^{ab}\,e_a{}^\mu e_b{}^\nu. \tag{2.1}$$

We denote the frame indices by lower case Latin indices from the beginning of the alphabet, which also run over $1,\dots,n$. The frame indices are raised and lowered by the metric $\delta_{ab}$. Let $e^a{}_\mu$ be the matrix inverse to $e_a{}^\mu$, defining the dual basis in the cotangent space $T_x^*M$, so that

$$g_{\mu\nu} = \delta_{ab}\,e^a{}_\mu e^b{}_\nu. \tag{2.2}$$

The Riemannian volume element is defined as usual by $d\mathrm{vol} = dx\,|g|^{1/2}$, where $|g| = \det g_{\mu\nu} = (\det e^a{}_\mu)^2$. The spin connection $\omega^{ab}{}_\mu$ is defined in terms of the orthonormal frame by

$$\omega^{ab}{}_\mu = e^{a\nu}e^b{}_{\nu;\mu} = -e^a{}_{\nu;\mu}e^{b\nu} = e^{a\nu}\partial_{[\mu}e^b{}_{\nu]} - e^{b\nu}\partial_{[\mu}e^a{}_{\nu]} + e_{c\mu}e^{a\nu}e^{b\lambda}\partial_{[\lambda}e^c{}_{\nu]}, \tag{2.3}$$

where the semicolon denotes the usual Riemannian covariant derivative with the Levi-Civita connection. The curvature of the spin connection is

$$R^a{}_{b\mu\nu} = \partial_\mu\omega^a{}_{b\nu} - \partial_\nu\omega^a{}_{b\mu} + \omega^a{}_{c\mu}\omega^c{}_{b\nu} - \omega^a{}_{c\nu}\omega^c{}_{b\mu}. \tag{2.4}$$

The Ricci tensor and the scalar curvature are defined by

$$R_{\alpha\nu} = e_a{}^\mu e^b{}_\alpha R^a{}_{b\mu\nu}, \qquad R = g^{\mu\nu}R_{\mu\nu} = e_a{}^\mu e_b{}^\nu R^{ab}{}_{\mu\nu}. \tag{2.5}$$

Let $\mathcal{T}$ be a spin-tensor bundle realizing a representation $\Sigma$ of the spin group $\mathrm{Spin}(n)$, the double covering of the group $SO(n)$, with the fiber $\Lambda$. Let $\Sigma_{ab}$ be the generators of the orthogonal algebra $\mathcal{SO}(n)$, the Lie algebra of the orthogonal group $SO(n)$, satisfying the following commutation relations:

$$[\Sigma_{ab}, \Sigma_{cd}] = -\delta_{ac}\Sigma_{bd} + \delta_{bc}\Sigma_{ad} + \delta_{ad}\Sigma_{bc} - \delta_{bd}\Sigma_{ac}. \tag{2.6}$$
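As a concrete illustration of the commutation relations (2.6), one can take the vector (defining) representation, in which $(\Sigma_{ab})_{cd} = \delta_{ac}\delta_{bd} - \delta_{ad}\delta_{bc}$. This choice of representation and the dimension $n = 4$ are assumptions made here for the sketch; the text itself works with an arbitrary representation.

```python
from itertools import product

N = 4  # dimension n of the orthogonal algebra (illustrative choice)


def delta(i: int, j: int) -> int:
    return 1 if i == j else 0


def sigma(a: int, b: int):
    """Vector-representation generator with entries delta_ac delta_bd - delta_ad delta_bc."""
    return [[delta(a, c) * delta(b, d) - delta(a, d) * delta(b, c)
             for d in range(N)] for c in range(N)]


def matmul(x, y):
    return [[sum(x[i][m] * y[m][j] for m in range(N)) for j in range(N)]
            for i in range(N)]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(N)] for i in range(N)]


def rhs(a: int, b: int, c: int, d: int):
    """Right-hand side of Eq. (2.6) as a matrix."""
    terms = [(-delta(a, c), sigma(b, d)), (delta(b, c), sigma(a, d)),
             (delta(a, d), sigma(b, c)), (-delta(b, d), sigma(a, c))]
    return [[sum(coef * m[i][j] for coef, m in terms) for j in range(N)]
            for i in range(N)]


def relations_hold() -> bool:
    return all(commutator(sigma(a, b), sigma(c, d)) == rhs(a, b, c, d)
               for a, b, c, d in product(range(N), repeat=4))


if __name__ == "__main__":
    print(relations_hold())
```

The same check applies to any other representation once its matrices are supplied in place of `sigma`.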

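The spin connection induces a connection on the bundle $\mathcal{T}$ defining the covariant derivative of smooth sections $\varphi$ of the bundle $\mathcal{T}$ by

$$\nabla_\mu\varphi = \Bigl(\partial_\mu + \frac{1}{2}\omega^{ab}{}_\mu\Sigma_{ab}\Bigr)\varphi. \tag{2.7}$$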


The commutator of covariant derivatives defines the curvature of this connection via

$$[\nabla_\mu, \nabla_\nu]\varphi = \frac{1}{2}R^{ab}{}_{\mu\nu}\Sigma_{ab}\varphi. \tag{2.8}$$

As usual, the orthonormal frame, $e^a{}_\mu$ and $e_a{}^\mu$, will be used to transform the coordinate (Greek) indices to the orthonormal (Latin) indices. The covariant derivative along the frame vectors is defined by $\nabla_a = e_a{}^\mu\nabla_\mu$. For example, with our notation, $\nabla_a\nabla_b T_{cd} = e_a{}^\mu e_b{}^\nu e_c{}^\alpha e_d{}^\beta\nabla_\mu\nabla_\nu T_{\alpha\beta}$. The metric $\delta_{ab}$ induces a positive definite fiber metric on tensor bundles. For Dirac spinors, the fiber metric is defined as follows. First, one defines the Dirac matrices, $\gamma_a$, as generators of the Clifford algebra (represented by $2^{[n/2]} \times 2^{[n/2]}$ complex matrices),

$$\gamma_a\gamma_b + \gamma_b\gamma_a = 2\delta_{ab}I_S, \tag{2.9}$$

where $I_S$ is the identity matrix in the spinor representation. Then one defines the anti-symmetrized products of Dirac matrices

$$\gamma_{a_1\dots a_k} = \gamma_{[a_1}\cdots\gamma_{a_k]}. \tag{2.10}$$

Then the matrices

$$\Sigma_{ab} = \frac{1}{2}\gamma_{ab} \tag{2.11}$$

are the generators of the orthogonal algebra $\mathcal{SO}(n)$ in the spinor representation. The Hermitian conjugation of Dirac matrices defines a Hermitian matrix $\beta$ $^1$ by

$$\gamma_a^\dagger = \beta\gamma_a\beta^{-1}, \tag{2.12}$$

which defines a Hermitian inner product $\bar\psi\varphi = \psi^\dagger\beta\varphi$ in the vector space of spinors. We also find the following important relation:

$$R^{ab}{}_{cd}\,\gamma_{ab}\gamma^{cd} = -2R^{ab}{}_{ab}I_S = -2R\,I_S, \tag{2.13}$$

where $R$ is the scalar curvature. In the present paper we will further assume that $M$ is a locally symmetric space with a Riemannian metric with the parallel curvature

$$\nabla_\mu R_{\alpha\beta\gamma\delta} = 0, \tag{2.14}$$

which means, in particular, that the curvature satisfies the integrability constraints

$$R_{fgea}R^e{}_{bcd} - R_{fgeb}R^e{}_{acd} + R_{fgec}R^e{}_{dab} - R_{fged}R^e{}_{cab} = 0. \tag{2.15}$$

Let $G_{YM}$ be a compact Lie group (called a gauge group). It naturally defines the principal fiber bundle over the manifold $M$ with the structure group $G_{YM}$. We consider a representation of the structure group $G_{YM}$ and the associated vector bundle through this representation with the same structure group $G_{YM}$ whose typical fiber is a $k$-dimensional vector space $W$. Then for any spin-tensor bundle $\mathcal{T}$ we define the twisted spin-tensor bundle $\mathcal{V}$ via the twisted product of the bundles $\mathcal{W}$ and $\mathcal{T}$. The fiber of the bundle $\mathcal{V}$ is $V = \Lambda \otimes W$, so that the sections of the bundle $\mathcal{V}$ are represented locally by $k$-tuples of spin-tensors.

$^1$ The Dirac matrices $\gamma_a$ and the spinor metric $\beta$ should not be confused with the matrices $\gamma_{AB}$ and $\beta_{ij}$ defined below.


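Let $\mathcal{A}$ be a connection one-form on the bundle $\mathcal{W}$ (called Yang-Mills or gauge connection) taking values in the Lie algebra $\mathcal{G}_{YM}$ of the gauge group $G_{YM}$. Then the total connection on the bundle $\mathcal{V}$ is defined by

$$\nabla_\mu\varphi = \Bigl(\partial_\mu + \frac{1}{2}\omega^{ab}{}_\mu\Sigma_{ab}\otimes I_W + I_\Lambda\otimes\mathcal{A}_\mu\Bigr)\varphi, \tag{2.16}$$

and the total curvature $\Omega$ of the bundle $\mathcal{V}$ is defined by

$$[\nabla_\mu, \nabla_\nu]\varphi = \Omega_{\mu\nu}\varphi, \tag{2.17}$$

where

$$\Omega_{\mu\nu} = \frac{1}{2}R^{ab}{}_{\mu\nu}\Sigma_{ab} + \mathcal{F}_{\mu\nu}, \tag{2.18}$$

and

$$\mathcal{F}_{\mu\nu} = \partial_\mu\mathcal{A}_\nu - \partial_\nu\mathcal{A}_\mu + [\mathcal{A}_\mu, \mathcal{A}_\nu] \tag{2.19}$$

is the curvature of the Yang-Mills connection. We also consider the bundle of endomorphisms of the bundle $\mathcal{V}$. The covariant derivative of sections of this bundle is defined by

$$\nabla_\mu X = \Bigl(\partial_\mu + \frac{1}{2}\omega^{ab}{}_\mu\Sigma_{ab}\Bigr)X + [\mathcal{A}_\mu, X], \tag{2.20}$$

and the commutator of covariant derivatives is equal to

$$[\nabla_\mu, \nabla_\nu]X = \frac{1}{2}R^{ab}{}_{\mu\nu}\Sigma_{ab}X + [\mathcal{F}_{\mu\nu}, X]. \tag{2.21}$$

In the following we will consider homogeneous vector bundles with parallel bundle curvature

$$\nabla_\mu\mathcal{F}_{\alpha\beta} = 0, \tag{2.22}$$

which means that the curvature satisfies the integrability constraints

$$[\mathcal{F}_{cd}, \mathcal{F}_{ab}] - R^f{}_{acd}\mathcal{F}_{fb} - R^f{}_{bcd}\mathcal{F}_{af} = 0. \tag{2.23}$$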

2.2. Normal coordinates. Let $x'$ be a fixed point in $M$ and $\mathcal{U}$ be a sufficiently small coordinate patch containing the point $x'$. Then every point $x$ in $\mathcal{U}$ can be connected with the point $x'$ by a unique geodesic. We extend the local orthonormal frame $e_a{}^{\mu'}(x')$ at the point $x'$ to a local orthonormal frame $e_a{}^\mu(x)$ at the point $x$ by parallel transport

$$e_a{}^\mu(x) = g^\mu{}_{\nu'}(x, x')\,e_a{}^{\nu'}(x'), \tag{2.24}$$

$$e^a{}_\mu(x) = g_\mu{}^{\nu'}(x, x')\,e^a{}_{\nu'}(x'), \tag{2.25}$$

where $g^\mu{}_{\nu'}(x, x')$ is the operator of parallel transport of vectors along the geodesic from the point $x'$ to the point $x$. Of course, the frame $e_a{}^\mu$ depends on the fixed point $x'$ as a parameter. Here and everywhere below the coordinate indices of the tangent space at the point $x'$ are denoted by primed Greek letters. They are raised and lowered by the metric tensor $g_{\mu'\nu'}(x')$ at the point $x'$. The derivatives with respect to $x'$ will be denoted by primed Greek indices as well.

The parameters of the geodesic connecting the points $x$ and $x'$, namely the unit tangent vector at the point $x'$ and the length of the geodesic (or, equivalently, the tangent vector at the point $x'$ with the norm equal to the length of the geodesic), provide the normal coordinate system for $\mathcal{U}$. Let $d(x, x')$ be the geodesic distance between the points $x$ and $x'$ and $\sigma(x, x')$ be a two-point function defined by

$$\sigma(x, x') = \frac{1}{2}[d(x, x')]^2. \tag{2.26}$$

Then the derivatives $\sigma_{;\mu}(x, x')$ and $\sigma_{;\nu'}(x, x')$ are the tangent vectors to the geodesic connecting the points $x$ and $x'$ at the points $x$ and $x'$ respectively, pointing in opposite directions; one is obtained from another by parallel transport

$$\sigma_{;\mu} = -g_\mu{}^{\nu'}\sigma_{;\nu'}. \tag{2.27}$$

Here and everywhere below the semicolon denotes the covariant derivative. The operator of parallel transport satisfies the equation

$$\sigma^{;\mu}\nabla_\mu g^\alpha{}_{\beta'} = 0, \tag{2.28}$$

with the initial conditions

$$g^\alpha{}_{\beta'}\Big|_{x=x'} = \delta^\alpha_\beta. \tag{2.29}$$

It can be expressed in terms of the local parallel frame

$$g^\mu{}_{\nu'}(x, x') = e_a{}^\mu(x)\,e^a{}_{\nu'}(x'), \qquad g_\mu{}^{\nu'}(x, x') = e^a{}_\mu(x)\,e_a{}^{\nu'}(x'). \tag{2.30}$$

Now, let us define the quantities

$$y^a = e^a{}_\mu\sigma^{;\mu} = -e^a{}_{\mu'}\sigma^{;\mu'}, \tag{2.31}$$

so that

$$\sigma^{;\mu} = e_a{}^\mu y^a \qquad\text{and}\qquad \sigma^{;\mu'} = -e_a{}^{\mu'}y^a. \tag{2.32}$$

Notice that $y^a = 0$ at $x = x'$. Further, we have

$$\frac{\partial y^a}{\partial x^\nu} = -e^a{}_{\mu'}\sigma^{;\mu'}{}_\nu, \tag{2.33}$$

so that the Jacobian of the change of variables is

$$\det\Bigl(\frac{\partial y^a}{\partial x^\nu}\Bigr) = |g|^{-1/2}(x')\det[-\sigma_{;\nu\mu'}(x, x')]. \tag{2.34}$$

The geometric parameters $y^a$ are nothing but the normal coordinates. By using the Van Vleck-Morette determinant defined by $^2$

$$\Delta(x, x') = |g|^{-1/2}(x')\,|g|^{-1/2}(x)\det[-\sigma_{;\nu\mu'}(x, x')], \tag{2.35}$$

we can write the Riemannian volume element in the form

$$d\mathrm{vol} = dy\,\Delta^{-1}(x, x'). \tag{2.36}$$

$^2$ Do not confuse it with the Laplacian $\Delta$ defined below.

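Let $\mathcal{P}(x, x')$ be the operator of parallel transport of sections of the bundle $\mathcal{V}$ from the point $x'$ to the point $x$. It satisfies the equation

$$\sigma^{;\mu}\nabla_\mu\mathcal{P} = 0, \tag{2.37}$$

with the initial condition

$$\mathcal{P}\Big|_{x=x'} = I_V. \tag{2.38}$$

Any spin-tensor $\varphi$ can now be expanded in the covariant Taylor series

$$\varphi(x) = \mathcal{P}(x, x')\sum_{k=0}^{\infty}\frac{1}{k!}\bigl[\nabla_{(c_1}\cdots\nabla_{c_k)}\varphi\bigr](x')\,y^{c_1}\cdots y^{c_k}. \tag{2.39}$$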

From this it is clear, in particular, that the frame components of a parallel spin-tensor are simply constant. In symmetric spaces one can compute the Van Vleck-Morette determinant explicitly in terms of the curvature. Let $K$ be an $n \times n$ matrix with the entries

$$K^a{}_b = R^a{}_{cbd}\,y^c y^d. \tag{2.40}$$

Then [2,5,13]

$$\frac{\partial y^a}{\partial x^\nu} = \Bigl(\frac{\sqrt{K}}{\sin\sqrt{K}}\Bigr)^a{}_b\,e^b{}_\nu, \tag{2.41}$$

and, therefore,

$$\Delta(x, x') = \det{}_{TM}\Bigl(\frac{\sqrt{K}}{\sin\sqrt{K}}\Bigr). \tag{2.42}$$

Thus, the Riemannian volume element in symmetric spaces takes the following form:

$$d\mathrm{vol} = dy\,\det{}_{TM}\Bigl(\frac{\sin\sqrt{K}}{\sqrt{K}}\Bigr). \tag{2.43}$$

The matrix $(\sin\sqrt{K})/\sqrt{K}$ determines the orthonormal frame in normal coordinates, and the square of this matrix determines the metric tensor in normal coordinates,

$$ds^2 = \Bigl(\frac{\sin^2\sqrt{K}}{K}\Bigr)_{ab}\,dy^a dy^b. \tag{2.44}$$
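A quick way to see what the volume density in (2.43) gives is the unit sphere $S^3$, assuming the standard constant-curvature form $R_{acbd} = \delta_{ab}\delta_{cd} - \delta_{ad}\delta_{cb}$, so that $K_{ab} = \delta_{ab}|y|^2 - y_a y_b$. The sketch below evaluates $(\sin\sqrt{K})/\sqrt{K}$ through its power series $\sum_m (-1)^m K^m/(2m+1)!$ and compares the determinant with $(\sin r/r)^2$, $r = |y|$. The example manifold, the helper names and the truncation order are choices made here.

```python
import math


def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][m] * y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def sinc_sqrt(k_matrix, terms: int = 30):
    """Power series of sin(sqrt(K))/sqrt(K) = sum_m (-1)^m K^m / (2m+1)!."""
    n = len(k_matrix)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for m in range(terms):
        coef = (-1.0) ** m / math.factorial(2 * m + 1)
        for i in range(n):
            for j in range(n):
                total[i][j] += coef * power[i][j]
        power = mat_mul(power, k_matrix)
    return total


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def volume_density(y):
    """det(sin sqrt(K) / sqrt(K)) for the unit 3-sphere in normal coordinates y."""
    r2 = sum(c * c for c in y)
    k_matrix = [[(r2 if a == b else 0.0) - y[a] * y[b] for b in range(3)]
                for a in range(3)]
    return det3(sinc_sqrt(k_matrix))


if __name__ == "__main__":
    y = (0.3, -0.4, 1.2)
    r = math.sqrt(sum(c * c for c in y))
    print(volume_density(y), (math.sin(r) / r) ** 2)
```

The density vanishes when $r = \pi$, which is where the normal coordinates on the sphere become singular.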

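Let us define an endomorphism-valued 1-form $\tilde{\mathcal{A}}_a$ by the equation

$$\nabla_\nu\mathcal{P} = \mathcal{P}\,\tilde{\mathcal{A}}_a\,e^a{}_{\mu'}\sigma^{;\mu'}{}_\nu. \tag{2.45}$$

Then for bundles with parallel curvature over symmetric spaces one can find it explicitly [2,5,13]

$$\tilde{\mathcal{A}}_a = -\mathcal{F}_{bc}\,y^c\Bigl(\frac{I - \cos\sqrt{K}}{K}\Bigr)^b{}_a. \tag{2.46}$$

This object determines the gauge connection in normal coordinates,

$$\mathcal{A} = -\mathcal{F}_{bc}\,y^c\Bigl(\frac{I - \cos\sqrt{K}}{K}\Bigr)^b{}_a\,dy^a. \tag{2.47}$$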

This means that all connections on a homogeneous bundle are essentially the same. In particular, the spin connection one-form in normal coordinates has the form

$$\omega^a{}_b = -R^a{}_{bcd}\,y^d\Bigl(\frac{I - \cos\sqrt{K}}{K}\Bigr)^c{}_e\,dy^e. \tag{2.48}$$

Remark 1. Two remarks are in order here. First, strictly speaking, normal coordinates can be only defined locally, in geodesic balls of radius less than the injectivity radius of the manifold. However, for symmetric spaces normal coordinates cover the whole manifold except for a set of measure zero where they become singular [18]. This set is precisely the set of points conjugate to the fixed point $x'$ (where $\Delta^{-1}(x, x') = 0$) and of points that can be connected to the point $x'$ by multiple geodesics. In any case, this set is a set of measure zero and, as we will show below, it can be dealt with by some regularization technique. Thus, we will use the normal coordinates defined above for the whole manifold. Second, for compact manifolds (or for manifolds with compact submanifolds) the range of some normal coordinates is also compact, so that if one allows them to range over the whole real line $\mathbb{R}$, then the corresponding compact submanifolds will be covered infinitely many times.

2.3. Curvature group of a symmetric space. We assumed that the manifold $M$ is locally symmetric. Since we also assume that it is simply connected and complete, it is a globally symmetric space (or simply symmetric space) [32]. A symmetric space is said to be compact, non-compact or Euclidean if all sectional curvatures are positive, negative or zero, respectively. A generic symmetric space has the structure

$$M = M_0 \times M_s, \tag{2.49}$$

where $M_0 = \mathbb{R}^{n_0}$ and $M_s$ is a semi-simple symmetric space; it is a product of a compact symmetric space $M_+$ and a non-compact symmetric space $M_-$,

$$M_s = M_+ \times M_-. \tag{2.50}$$

Of course, the dimensions must satisfy the relation $n_0 + n_s = n$, where $n_s = \dim M_s$. Let $\Lambda_2$ be the vector space of 2-forms on $M$ at a fixed point $x'$. It has the dimension $\dim\Lambda_2 = n(n-1)/2$, and the inner product in $\Lambda_2$ is defined by

$$\langle X, Y\rangle = \frac{1}{2}X_{ab}Y^{ab}. \tag{2.51}$$

The Riemann curvature tensor naturally defines the curvature operator

$$\mathrm{Riem} : \Lambda_2 \to \Lambda_2 \tag{2.52}$$

by

$$(\mathrm{Riem}\,X)_{ab} = \frac{1}{2}R_{ab}{}^{cd}X_{cd}. \tag{2.53}$$

This operator is symmetric and has real eigenvalues which determine the principal sectional curvatures. Now, let $\mathrm{Ker}(\mathrm{Riem})$ and $\mathrm{Im}(\mathrm{Riem})$ be the kernel and the range of this operator and

$$p = \dim\mathrm{Im}(\mathrm{Riem}) = \frac{n(n-1)}{2} - \dim\mathrm{Ker}(\mathrm{Riem}). \tag{2.54}$$

Further, let $\lambda_i$, $(i = 1,\dots,p)$, be the non-zero eigenvalues, and $E^i{}_{ab}$ be the corresponding orthonormal eigen-two-forms. Then the components of the curvature tensor can be presented in the form [10]

$$R_{abcd} = \beta_{ik}E^i{}_{ab}E^k{}_{cd}, \tag{2.55}$$

where $\beta_{ik}$ is a symmetric, in fact, diagonal, nondegenerate $p \times p$ matrix,

$$(\beta_{ik}) = \mathrm{diag}(\lambda_1,\dots,\lambda_p). \tag{2.56}$$

Of course, the zero eigenvalues of the curvature operator correspond to the flat subspace $M_0$, the positive ones correspond to the compact submanifold $M_+$ and the negative ones to the non-compact submanifold $M_-$. Therefore, $\mathrm{Im}(\mathrm{Riem}) = T_xM_s$. In the following the Latin indices from the middle of the alphabet will be used to denote tensors in $\mathrm{Im}(\mathrm{Riem})$; they should not be confused with the Latin indices from the beginning of the alphabet which denote tensors in $M$. They will be raised and lowered with the matrix $\beta_{ik}$ and its inverse

$$(\beta^{ik}) = \mathrm{diag}(\lambda_1^{-1},\dots,\lambda_p^{-1}). \tag{2.57}$$

Next, we define the traceless $n \times n$ matrices $D_i = (D^a{}_{ib})$, where

$$D^a{}_{ib} = -\beta_{ik}E^k{}_{cb}\,\delta^{ca}. \tag{2.58}$$

Then

$$R^a{}_{bcd} = -D^a{}_{ib}E^i{}_{cd}, \qquad R^a{}_b{}^c{}_d = \beta^{ik}D^a{}_{ib}D^c{}_{kd}, \tag{2.59}$$

$$R^a{}_b = -\beta^{ik}D^a{}_{ic}D^c{}_{kb}, \qquad R = -\beta^{ik}D^a{}_{ic}D^c{}_{ka}. \tag{2.60}$$

Also, we have identically, D a j[b E j cd] = 0.

(2.61)

The matrices Di are known to be the generators of the holonomy algebra, H, i.e. the Lie algebra of the restricted holonomy group, H , [Di , Dk ] = F j ik D j ,

(2.62)



where F j ik are the structure constants of the holonomy group. The structure constants of the holonomy group define the p × p matrices Fi , by (Fi ) j k = F j ik , which generate the adjoint representation of the holonomy algebra, [Fi , Fk ] = F j ik F j .

(2.63)

These commutation relations follow directly from the Jacobi identities F i j[k F j ml] = 0.

(2.64)

For symmetric spaces the introduced quantities satisfy additional algebraic constraints. The most important consequence of the Eq. (2.15) is the equation [10] E i ac D c kb − E i bc D c ka = F i k j E j ab .

(2.65)

It is this equation that makes a generic Riemannian manifold a symmetric space. Now, by using Eqs. (2.62) and (2.65) one can prove the following: Proposition 1. The matrix βik is H -invariant and satisfies the equation βik F k jl + βlk F k ji = 0.

(2.66)

This means that the matrices Fi satisfy the transposition rule (Fi )T = −β Fi β −1 ,

(2.67)

which simply means that the adjoint and the coadjoint representations of the holonomy algebra H are equivalent. In particular, this means that the matrices Fi are traceless. Such an algebra is called compact [16]. Another consequence of the Eq. (2.65) are the identities D a i[b Rc]ade + D a i[d Re]abc = 0, R a c D c ib = D a ic R c b .

(2.68) (2.69)

This means, in particular, that the Ricci tensor matrix commutes with all matrices $D_i$ and is, therefore, an invariant matrix of the holonomy algebra. Thus,
\[ R^a{}_b = \frac{1}{n_s}h^a{}_b R, \tag{2.70} \]
where $h^a{}_b$ is a projection (a symmetric idempotent parallel tensor) to the subspace $T_xM_s$ of the tangent space of dimension $n_s$, that is,
\[ h_{ab}=h_{ba}, \qquad h^a{}_b h^b{}_c = h^a{}_c, \qquad h^a{}_a = n_s. \tag{2.71} \]

It is easy to see that the tensor h ab is nothing but the metric tensor on the semi-simple subspace Tx Ms . Since the curvature exists only in the semi-simple submanifold Ms , the components of the curvature tensor Rabcd , as well as the tensors E i ab , are non-zero only in the semi-simple subspace Tx Ms . Let q a b = δa b − h a b

(2.72)


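be the projection tensor to the flat subspace $\mathbb{R}^{n_0}$ such that
\[ q_{ab}=q_{ba}, \qquad q^a{}_b q^b{}_c = q^a{}_c, \qquad q^a{}_a = n_0, \qquad q^a{}_b h^b{}_c = 0. \tag{2.73} \]
Then
\[ R_{abcd}\,q^a{}_e = R_{ab}\,q^a{}_e = E^i{}_{ab}\,q^a{}_e = D^a{}_{ib}\,q^b{}_e = D^a{}_{ib}\,q_a{}^e = 0. \tag{2.74} \]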

Now, we introduce a new type of indices, the capital Latin indices, A, B, C, . . . , which split according to A = (a, i) and run from 1 to N = p + n. We define new quantities C A BC by C i ab = E i ab ,

C a ib = −C a bi = D a ib ,

C i kl = F i kl ,

(2.75)

all other components being zero. Let us also introduce rectangular $p\times n$ matrices $T_a$ by $(T_a)^j{}_c = E^j{}_{ac}$ and the $n\times p$ matrices $\bar T_a$ by $(\bar T_a)^b{}_i = -D^b{}_{ia}$. Then we can define $N\times N$ matrices $C_A=(C_a,C_i)$,
\[ C_a = \begin{pmatrix} 0 & \bar T_a \\ T_a & 0 \end{pmatrix}, \qquad C_i = \begin{pmatrix} D_i & 0 \\ 0 & F_i \end{pmatrix}, \tag{2.76} \]
so that $(C_A)^B{}_C = C^B{}_{AC}$.

Theorem 1. The quantities $C^A{}_{BC}$ satisfy the Jacobi identities
\[ C^A{}_{B[C}\,C^B{}_{DE]} = 0. \]

(2.77)

This means that the matrices C A satisfy the commutation relations [C A , C B ] = C C AB CC ,

(2.78)

or, in more detail, [Ca , Cb ] = E i ab Ci ,

[Ci , Ca ] = D b ia Cb ,

[Ci , Ck ] = F j ik C j ,

(2.79)

and generate the adjoint representation of a Lie algebra G with the structure constants C A BC . Proof. This can be proved by using Eqs. (2.61), (2.62), (2.64) and (2.65) [10].
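As an illustration of Theorem 1, the following sketch assembles the structure constants for one concrete case and checks the relations numerically. It rests on assumptions that are not taken from the text: the space is the unit 3-sphere, so that $n=p=3$, all $\lambda_i=1$, and the eigen-two-forms are chosen as $E^i{}_{ab}=\varepsilon_{iab}$; the helper names (`eps`, `structure`, `C_mats`) are hypothetical.

```python
# Illustrative check of Eqs. (2.55), (2.62), (2.65), (2.78) and (2.82).
# Assumption (not from the paper): unit 3-sphere, E^i_{ab} = eps_{iab}, lambda_i = 1.

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

n = p = 3
N = n + p
LAM = 1
idx = range(3)

E = [[[eps(i, a, b) for b in idx] for a in idx] for i in idx]        # E[i][a][b] = E^i_{ab}
D = [[[-LAM * E[i][a][b] for b in idx] for a in idx] for i in idx]   # D[i][a][b] = D^a_{ib}, Eq. (2.58)
F = [[[LAM * eps(j, i, k) for k in idx] for j in idx] for i in idx]  # F[i][j][k] = F^j_{ik}

def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(len(A))] for r in range(len(A))]

def lin_comb(coeffs, mats):
    size = len(mats[0])
    return [[sum(c * M[r][s] for c, M in zip(coeffs, mats)) for s in range(size)]
            for r in range(size)]

# Eq. (2.55): R_{abcd} = beta_{ik} E^i_{ab} E^k_{cd} equals the unit-sphere curvature.
curvature_ok = all(
    LAM * sum(E[i][a][b] * E[i][c][d] for i in idx)
    == (a == c) * (b == d) - (a == d) * (b == c)
    for a in idx for b in idx for c in idx for d in idx)

# Eq. (2.62): [D_i, D_k] = F^j_{ik} D_j.
holonomy_ok = all(
    commutator(D[i], D[k]) == lin_comb([F[i][j][k] for j in idx], D)
    for i in idx for k in idx)

# Eq. (2.65): E^i_{ac} D^c_{kb} - E^i_{bc} D^c_{ka} = F^i_{kj} E^j_{ab}.
symmetric_space_ok = all(
    sum(E[i][a][c] * D[k][c][b] - E[i][b][c] * D[k][c][a] for c in idx)
    == sum(F[k][i][j] * E[j][a][b] for j in idx)
    for i in idx for k in idx for a in idx for b in idx)

def structure(B, A, C):
    """C^B_{AC} from Eq. (2.75); capital indices 0..2 are 'a', 3..5 are 'i'."""
    up, mid, low = B < n, A < n, C < n
    if not up and mid and low:          # C^i_{ab} = E^i_{ab}
        return E[B - n][A][C]
    if up and not mid and low:          # C^a_{ib} = D^a_{ib}
        return D[A - n][B][C]
    if up and mid and not low:          # C^a_{bi} = -D^a_{ib}
        return -D[C - n][B][A]
    if not (up or mid or low):          # C^i_{kl} = F^i_{kl}
        return F[A - n][B - n][C - n]
    return 0

# (C_A)^B_C = C^B_{AC}, as stated after Eq. (2.76).
C_mats = [[[structure(B, A, C) for C in range(N)] for B in range(N)] for A in range(N)]

# Eq. (2.78): [C_A, C_B] = C^C_{AB} C_C.
algebra_ok = all(
    commutator(C_mats[A], C_mats[B])
    == lin_comb([structure(K, A, B) for K in range(N)], C_mats)
    for A in range(N) for B in range(N))

# With gamma = identity (all lambda_i = 1), Eq. (2.82) reduces to antisymmetry of C_A.
gamma_invariant = all(C_mats[A][B][C] == -C_mats[A][C][B]
                      for A in range(N) for B in range(N) for C in range(N))
```

In this example the six matrices `C_mats` close into the algebra so(4), which is consistent with the 3-sphere being the quotient SO(4)/SO(3); the check says nothing about other symmetric spaces.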

For the lack of a better name we call the algebra $\mathcal{G}$ the curvature algebra. As it will be clear from the next section it is a subalgebra of the total isometry algebra of the symmetric space. It should be clear that the holonomy algebra $\mathcal{H}$ is the subalgebra of the curvature algebra $\mathcal{G}$. The curvature algebra exists only in symmetric spaces; it is Eq. (2.65) that closes this algebra. Next, we define a symmetric nondegenerate $N\times N$ matrix
\[ (\gamma_{AB}) = \begin{pmatrix} \delta_{ab} & 0 \\ 0 & \beta_{ik} \end{pmatrix} = \mathrm{diag}\,(\underbrace{1,\dots,1}_{n},\lambda_1,\dots,\lambda_p). \tag{2.80} \]
This matrix and its inverse
\[ (\gamma^{AB}) = \begin{pmatrix} \delta^{ab} & 0 \\ 0 & \beta^{ik} \end{pmatrix} = \mathrm{diag}\,(\underbrace{1,\dots,1}_{n},\lambda_1^{-1},\dots,\lambda_p^{-1}) \]
will be used to lower and to raise the capital Latin indices.


973

Finally, by using Eqs. (2.65) and (2.66) one can show the following: Proposition 2. The matrix γ AB is G-invariant and satisfies the equation γ AB C B C D + γ D B C B C A = 0.

(2.81)

In matrix notation this equation takes the form (C A )T = −γ C A γ −1 ,

(2.82)

which means that the adjoint and the coadjoint representations of the curvature group are equivalent. In particular, the matrices C A are traceless. Thus the curvature algebra G is compact; it is a direct sum of two ideals, G = G0 ⊕ Gs ,

(2.83)

an Abelian center $\mathcal{G}_0$ of dimension $n_0$ and a semi-simple algebra $\mathcal{G}_s$ of dimension $p+n_s$. It is worth mentioning that although the holonomy algebra $\mathcal{H}$ is compact the (indefinite, in general) metric, $\beta_{ij}$, introduced above is not equal to the (positive definite) Cartan-Killing form, $\rho_{ij}$, defined by
\[ \mathrm{tr}_{TM}\, D_iD_k = D^a{}_{ib}D^b{}_{ka} = -\rho_{ik}, \tag{2.84} \]
so that
\[ \rho_{ik} = \mathrm{diag}\,(\lambda_1^2,\dots,\lambda_p^2), \tag{2.85} \]
and
\[ \beta^{ik}\rho_{ik} = R. \tag{2.86} \]
Similarly, the generators $F_i$ satisfy
\[ \mathrm{tr}_{\mathcal{H}}\, F_iF_k = F^j{}_{im}F^m{}_{kj} = -4\frac{R_H}{R}\rho_{ik}, \tag{2.87} \]

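where
\[ R_H = -\frac{1}{4}\beta^{ik}F^j{}_{im}F^m{}_{kj}. \tag{2.88} \]
The Killing-Cartan form $\mathrm{tr}_{\mathcal{G}}\,C_AC_B$ for the curvature algebra $\mathcal{G}$ is defined by
\[ \mathrm{tr}_{\mathcal{G}}\, C_aC_b = -\frac{2}{n_s}h_{ab}R, \tag{2.89} \]
\[ \mathrm{tr}_{\mathcal{G}}\, C_iC_j = -\left(1+4\frac{R_H}{R}\right)\rho_{ij}, \tag{2.90} \]
\[ \mathrm{tr}_{\mathcal{G}}\, C_aC_i = 0. \tag{2.91} \]
Notice that it is degenerate and is not equal to the metric $\gamma_{AB}$.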
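The traces (2.89) and (2.91) can be checked by hand in the smallest example. The sketch below does this for the unit 2-sphere under assumptions that are not part of the text: $n=n_s=2$, $p=1$, $\lambda_1=1$, $E^1{}_{ab}=\varepsilon_{ab}$, so that $D^a{}_{1b}=-\varepsilon_{ab}$ and $F=0$. The function names are hypothetical.

```python
# Killing form of the curvature algebra for the unit 2-sphere.
# Assumption (not from the paper): E^1_{ab} = eps_{ab}, lambda_1 = 1, F = 0.

EPS = [[0, 1], [-1, 0]]                                   # E^1_{ab}
D = [[-EPS[a][b] for b in range(2)] for a in range(2)]    # D^a_{1b}, Eq. (2.58) with beta = 1

def generator_Ca(a):
    """3x3 matrix C_a of Eq. (2.76), built from the off-diagonal blocks T_a and bar T_a."""
    C = [[0] * 3 for _ in range(3)]
    for c in range(2):
        C[2][c] = EPS[a][c]     # (T_a)^1_c = E^1_{ac}
        C[c][2] = -D[c][a]      # (bar T_a)^c_1 = -D^c_{1a}
    return C

def generator_Ci():
    """3x3 matrix C_1 of Eq. (2.76): block diagonal diag(D_1, F_1) with F_1 = 0."""
    C = [[0] * 3 for _ in range(3)]
    for a in range(2):
        for b in range(2):
            C[a][b] = D[a][b]
    return C

def trace_product(A, B):
    return sum(A[r][m] * B[m][r] for r in range(3) for m in range(3))

generators = [generator_Ca(0), generator_Ca(1), generator_Ci()]
killing_form = [[trace_product(A, B) for B in generators] for A in generators]

# Scalar curvature from Eq. (2.60): R = -beta^{ik} D^a_{ic} D^c_{ka}.
R = -sum(D[a][c] * D[c][a] for a in range(2) for c in range(2))
n_s = 2
```

Here `killing_form[a][b]` for the two translation indices should agree with $-(2/n_s)\,\delta_{ab}R$ and the mixed entries should vanish. In this example there is no flat factor, so the form happens to be nondegenerate; the degeneracy noted above comes from the Abelian part $\mathcal{G}_0$.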


2.4. Killing vector fields. We will use extensively the isometries of the symmetric space $M$. We follow the approach developed in [2,5,10,13]. The generators of isometries are the Killing vector fields $\xi$ defined by the equation
\[ \nabla_\mu\xi_\nu+\nabla_\nu\xi_\mu = 0. \tag{2.92} \]
The integrability conditions for this equation are
\[ R_{\alpha\beta\mu[\lambda}\nabla_{\nu]}\xi^\mu + R_{\lambda\nu\mu[\beta}\nabla_{\alpha]}\xi^\mu = 0. \tag{2.93} \]
By differentiating this equation, commuting derivatives and using curvature identities we obtain
\[ \nabla_\mu\nabla_\nu\xi^\lambda = -R^\lambda{}_{\nu\alpha\mu}\xi^\alpha, \tag{2.94} \]
which means, in particular,
\[ \Delta\xi^\lambda = -R^\lambda{}_\alpha\xi^\alpha. \tag{2.95} \]

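By induction we obtain
\[ \nabla_{\mu_{2k}}\cdots\nabla_{\mu_1}\xi^\lambda = (-1)^k R^\lambda{}_{\mu_1\alpha_1\mu_2}R^{\alpha_1}{}_{\mu_3\alpha_2\mu_4}\cdots R^{\alpha_{k-1}}{}_{\mu_{2k-1}\alpha_k\mu_{2k}}\,\xi^{\alpha_k}, \tag{2.96} \]
\[ \nabla_{\mu_{2k+1}}\cdots\nabla_{\mu_1}\xi^\lambda = (-1)^k R^\lambda{}_{\mu_1\alpha_1\mu_2}R^{\alpha_1}{}_{\mu_3\alpha_2\mu_4}\cdots R^{\alpha_{k-1}}{}_{\mu_{2k-1}\alpha_k\mu_{2k}}\,\nabla_{\mu_{2k+1}}\xi^{\alpha_k}. \tag{2.97} \]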

These derivatives determine all coefficients of the covariant Taylor series (2.39) for the Killing vectors, and therefore, every Killing vector in a symmetric space has the form
\[ \xi^a(x) = \left(\cos\sqrt{K}\right)^a{}_b\,\xi^b(x') + \left(\frac{\sin\sqrt{K}}{\sqrt{K}}\right)^a{}_b\, y^c\,\xi^b{}_{;c}(x'), \tag{2.98} \]
or
\[ \xi(x) = \left[\left(\sqrt{K}\cot\sqrt{K}\right)^a{}_b\,\xi^b(x') + \xi^a{}_{;c}(x')\,y^c\right]\frac{\partial}{\partial y^a}. \tag{2.99} \]
Thus, Killing vector fields at any point $x$ are determined by their values $\xi^a(x')$ and the values of their derivatives $\xi^a{}_{;c}(x')$ at the fixed point $x'$. Similarly we can obtain the derivatives of the Killing vectors,
\[ \xi^a{}_{;b}(x) = \xi^a{}_{;b}(x') - R^a{}_{bcd}\,y^d\left(\frac{1-\cos\sqrt{K}}{K}\right)^c{}_e\, y^f\,\xi^e{}_{;f}(x') - R^a{}_{bcd}\,y^d\left(\frac{\sin\sqrt{K}}{\sqrt{K}}\right)^c{}_e\,\xi^e(x'). \tag{2.100} \]
The set of all Killing vector fields forms a representation of the isometry algebra, the Lie algebra of the isometry group of the manifold $M$. We define two subspaces of the isometry algebra. One subspace is formed by Killing vectors satisfying the initial conditions
\[ \nabla_\mu\xi_\nu\Big|_{x=x'} = 0, \tag{2.101} \]
and another subspace is formed by the Killing vectors satisfying the initial conditions
\[ \xi^\nu\Big|_{x=x'} = 0. \tag{2.102} \]

We will call the Killing vectors from the first subspace translations and the Killing vectors from the second subspace rotations. However, this should not be understood literally. One can easily show that the initial values $\xi^a(x')$ are independent and, therefore, there are $n$ such parameters. Thus, there are $n$ linearly independent translations, which can be chosen in the form
\[ P_a = \left(\sqrt{K}\cot\sqrt{K}\right)^b{}_a\frac{\partial}{\partial y^b}, \tag{2.103} \]
so that
\[ e^b{}_\mu P_a{}^\mu\Big|_{x=x'} = \delta^b{}_a, \qquad P_a{}^\mu{}_{;\nu}\Big|_{x=x'} = 0. \tag{2.104} \]
It is worth pointing out that the nature of the lower index of the Killing vectors $P_a{}^\mu$ is different from the frame indices. This means, in particular, that the covariant derivative of $P_a{}^\mu$ does not include the spin connection associated with the lower index. In other words, $P_a{}^\mu$ are just $n$ vectors and not the components of a $(1,1)$ tensor. On the other hand, the initial values of the derivatives $\xi^a{}_{;c}(x')$ are not independent because of the constraints (2.93). These constraints are valid only in the semi-simple subspace $T_xM_s$. However, in this subspace, due to the identity (2.68), it should be clear that there are $p$ linearly independent rotations
\[ L_i = -D^b{}_{ia}\,y^a\frac{\partial}{\partial y^b}, \tag{2.105} \]
satisfying the initial conditions
\[ L_i{}^\mu\Big|_{x=x'} = 0, \qquad e^a{}_\mu e_b{}^\nu L_i{}^\mu{}_{;\nu}\Big|_{x=x'} = -D^a{}_{ib}. \tag{2.106} \]
More generally, by using (2.100) we also obtain
\[ e^a{}_\mu e_b{}^\nu P_e{}^\mu{}_{;\nu} = -R^a{}_{bcd}\,y^d\left(\frac{\sin\sqrt{K}}{\sqrt{K}}\right)^c{}_e, \tag{2.107} \]
\[ e^a{}_\mu e_b{}^\nu L_i{}^\mu{}_{;\nu} = -D^a{}_{ib} + R^a{}_{bcd}\,y^d\left(\frac{1-\cos\sqrt{K}}{K}\right)^c{}_e\, y^f D^e{}_{if}. \tag{2.108} \]

This means, in particular, that the derivatives of all Killing vectors have the form
\[ \xi_A{}^a{}_{;b} = -D^a{}_{ib}\,\eta_A{}^i, \tag{2.109} \]
where $\eta_A{}^i$ are defined by
\[ \eta_A{}^i = \alpha^{ij}\,\xi_A{}^a{}_{;b}\,D^b{}_{ja}, \tag{2.110} \]
and the matrix $\alpha^{ij}=(\rho_{ij})^{-1}$ is the inverse matrix of the Cartan-Killing form $\rho$ defined by (2.84). Notice that
\[ \eta_a{}^i\Big|_{x=x'} = 0, \qquad \eta_j{}^i\Big|_{x=x'} = \delta^i{}_j. \tag{2.111} \]


Then, from Eq. (2.94) we also immediately obtain η A i ;b = −E i ab ξ A a .

(2.112)

By adding the trivial Killing vectors for flat subspaces we find that the dimension of the rotation subspace is equal to
\[ p + n_0 n_s + \frac{n_0(n_0-1)}{2}. \tag{2.113} \]

Here n 0 n s is the number of mixed rotations between M0 and Ms and n 0 (n 0 − 1)/2 is the number of rotations of M0 . Since p ≤ n s (n s − 1)/2, then the above number of rotations is less or equal to n(n − 1)/2 as it should be (recall that n = n 0 + n s ). In the following we will need only the Killing vectors Pa and L i defined above. We introduce the following notation (ξ A ) = (Pa , L i ). Theorem 2. The Killing vector fields ξ A satisfy the commutation relations [ξ A , ξ B ] = C C AB ξC ,

(2.114)

or, in more detail, [Pa , Pb ] = E i ab L i ,

[L i , Pa ] = D b ia Pb ,

[L i , L k ] = F j ik L j .

(2.115)

Proof. This can be proved by using the explicit form of the Killing vector fields obtained above [10].

Notice that they do not generate the complete isometry algebra of the symmetric space $M$ but rather they form a representation of the curvature algebra $\mathcal{G}$ introduced in the previous section, which is a subalgebra of the total isometry algebra. It is clear that the Killing vector fields $L_i$ form a representation of the holonomy algebra $\mathcal{H}$, which is the isotropy algebra of the semi-simple submanifold $M_s$, and a subalgebra of the total isotropy algebra of the symmetric space $M$.

Proposition 3. There holds
\[ \xi_A{}^c{}_{;a}\,\xi_B{}^b{}_{;c} - \xi_B{}^c{}_{;a}\,\xi_A{}^b{}_{;c} = C^C{}_{AB}\,\xi_C{}^b{}_{;a} - R^b{}_{acd}\,\xi_A{}^c\xi_B{}^d, \tag{2.116} \]
and
\[ F^j{}_{ik}\,\eta_A{}^i\eta_B{}^k = C^C{}_{AB}\,\eta_C{}^j - E^j{}_{cd}\,\xi_A{}^c\xi_B{}^d. \tag{2.117} \]
Proof. By differentiating Eq. (2.114) and using (2.94) we obtain (2.116). Finally, by using (2.109) and the holonomy algebra (2.62) we obtain (2.117).

Now, we derive some bilinear identities that we will need in the present paper.

Proposition 4. The Killing vector fields satisfy the equation
\[ \gamma^{AB}\xi_A{}^\mu\xi_B{}^\nu = \delta^{ab}P_a{}^\mu P_b{}^\nu + \beta^{ik}L_i{}^\mu L_k{}^\nu = g^{\mu\nu}. \tag{2.118} \]
Proof. This can be proved by using the explicit form of the Killing vectors.
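The identity (2.118) can be seen concretely on the unit 2-sphere embedded in $\mathbb{R}^3$. The sketch below makes the following assumptions, none of which are taken from the text: $\gamma^{AB}$ is the identity (all $\lambda_i=1$), the three Killing fields are the rotation generators $e_A\times x$ about the coordinate axes, and $g^{\mu\nu}$ is represented in the embedding by the tangent projector $\delta^{\mu\nu}-x^\mu x^\nu$.

```python
import math

# Illustration of Eq. (2.118) on the unit 2-sphere embedded in R^3.
# Assumption: gamma = identity and the Killing fields are xi_A(x) = e_A x x.

AXES = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def killing_fields(x):
    """Rotation generators about the three axes, evaluated at the point x."""
    return [cross(e, x) for e in AXES]

def bilinear_sum(x):
    """sum_A xi_A^mu xi_A^nu, i.e. gamma^{AB} xi_A xi_B with gamma = identity."""
    fields = killing_fields(x)
    return [[sum(f[m] * f[nu] for f in fields) for nu in range(3)] for m in range(3)]

def tangent_projector(x):
    """Inverse metric of the sphere written in the embedding coordinates."""
    return [[(1.0 if m == nu else 0.0) - x[m] * x[nu] for nu in range(3)]
            for m in range(3)]

def max_deviation(theta, phi):
    """Largest entrywise difference between the two sides of Eq. (2.118)."""
    x = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    S, P = bilinear_sum(x), tangent_projector(x)
    return max(abs(S[m][nu] - P[m][nu]) for m in range(3) for nu in range(3))
```

At the north pole two of the fields have unit length and vanishing derivative (the role of $P_1,P_2$) and the third vanishes (the role of $L_1$), up to sign conventions; because the bilinear sum is invariant under rotations of the basis, the same projector is obtained at every point.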

Proposition 5. There holds γ AB ξ A α ξ B µ ;νλ = R α λνµ .

(2.119)


Proof. This follows from Eqs. (2.94) and (2.118).


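Proposition 6. There holds
\[ \gamma^{AB}\xi_A{}^\mu\,\xi_B{}^\nu{}_{;\beta} = 0, \tag{2.120} \]
\[ \gamma^{AB}\xi_A{}^\mu{}_{;\alpha}\,\xi_B{}^\nu{}_{;\beta} = R^\mu{}_\alpha{}^\nu{}_\beta. \tag{2.121} \]
Proof. Let
\[ \tau^{\mu\alpha}{}_\nu = \gamma^{AB}\xi_A{}^\mu\,\xi_B{}^\alpha{}_{;\nu} \tag{2.122} \]
and
\[ \theta^\mu{}_\alpha{}^\nu{}_\beta = \gamma^{AB}\xi_A{}^\mu{}_{;\alpha}\,\xi_B{}^\nu{}_{;\beta}. \]
We compute
\[ \nabla_\beta\tau_{\mu\alpha\nu} = \theta_{\mu\beta\nu\alpha} - R_{\mu\beta\nu\alpha}, \tag{2.123} \]
and
\[ \nabla_\gamma\nabla_\beta\tau_{\mu\alpha\nu} = R_{\mu\beta\gamma\rho}\,\tau^\rho{}_{\nu\alpha} + R_{\nu\alpha\gamma\rho}\,\tau^\rho{}_{\mu\beta}. \tag{2.124} \]
All higher derivatives of $\tau_{\mu\nu\alpha}$ are expressed linearly in terms of $\tau_{\mu\nu\alpha}$ and its first derivative $\nabla_\beta\tau_{\mu\alpha\nu}$ with coefficients polynomial in curvature. Let $x'$ be a fixed point. We will show that the tensor $\tau_{\mu\nu\alpha}$ together with all its covariant derivatives is equal to zero at $x=x'$. This will then mean that $\tau_{\mu\nu\alpha}=0$ identically and, therefore, from Eq. (2.123) that $\theta^\mu{}_{\alpha\nu\beta}=R^\mu{}_{\alpha\nu\beta}$. We have
\[ \tau^{\mu\alpha}{}_\nu = \delta^{ab}P_a{}^\mu P_b{}^\alpha{}_{;\nu} + \beta^{ij}L_i{}^\mu L_j{}^\alpha{}_{;\nu}, \tag{2.125} \]
and
\[ \theta^\mu{}_\alpha{}^\nu{}_\beta = \delta^{ab}P_a{}^\mu{}_{;\alpha}P_b{}^\nu{}_{;\beta} + \beta^{ij}L_i{}^\mu{}_{;\alpha}L_j{}^\nu{}_{;\beta}. \tag{2.126} \]
Therefore,
\[ \tau^\mu{}_{\alpha\nu}\Big|_{x=x'} = 0 \qquad\text{and}\qquad \theta^\mu{}_{\alpha\nu\beta}\Big|_{x=x'} = R^\mu{}_{\alpha\nu\beta}. \tag{2.127} \]
Therefore,
\[ \nabla_\beta\tau^\mu{}_{\alpha\nu}\Big|_{x=x'} = 0. \tag{2.128} \]
Thus, by induction, all derivatives of $\tau_{\mu\nu\alpha}$ vanish, and, therefore, $\tau_{\mu\nu\alpha}=0$ identically. This also proves (2.121) by making use of (2.123).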

Let i, j be non-negative integers. We define the tensors X (i, j) which are bilinear in Killing vectors by X (i, j) µν α1 ...αi β1 ...β j = γ AB ∇α1 · · · ∇αi ξ A µ ∇β1 · · · ∇β j ξ B ν .

(2.129)

Theorem 3.
1. The tensors $X_{(i,j)}$ are $G$-invariant and parallel, that is,
\[ \nabla_\lambda X_{(i,j)} = 0. \tag{2.130} \]
2. For even $(i+j)$ the tensors $X_{(i,j)}$ are polynomial in the curvature tensor.
3. For odd $(i+j)$ the tensors $X_{(i,j)}$ are identically equal to zero.


Proof. First of all, we notice that (1) follows from (2) and (3). There are three cases: a) both i = 2k and j = 2m are even, b) both i = 2k + 1 and j = 2m + 1 are odd, and c) i = 2k is even and j = 2m + 1 is odd. In the case (a), when both i and j are even, by using Eqs. (2.96) and (2.118) we immediately obtain a polynomial in the curvature. In the cases (b) and (c) by using Eqs. (2.96) and (2.97) we reduce it to the tensors γ AB ξ A µ ;α ξ B ν ;β and γ AB ξ A µ ξ B ν ;β . Now, by using the Proposition 6 we prove the theorem.

Proposition 7. There holds γ AB ξ A µ η B i = 0,

(2.131)

γ AB η A i η B j = β i j .

(2.132)

Proof. This follows from the definition of η A i (2.110) and Eqs. (2.120) and (2.121).

2.5. Homogeneous vector bundles. Equation (2.23) imposes strong constraints on the curvature of the homogeneous bundle $W$. We define
\[ \mathcal{B}_{ab} = \mathcal{F}_{cd}\,q^c{}_a q^d{}_b, \qquad \mathcal{E}_{ab} = \mathcal{F}_{cd}\,h^c{}_a h^d{}_b, \tag{2.133} \]
so that
\[ \mathcal{B}_{ab}\,h^a{}_c = 0, \qquad \mathcal{E}_{ab}\,q^a{}_c = 0. \tag{2.134} \]
Then, from Eq. (2.23) we obtain
\[ [\mathcal{B}_{ab},\mathcal{B}_{cd}] = [\mathcal{B}_{ab},\mathcal{E}_{cd}] = 0, \tag{2.135} \]
and
\[ [\mathcal{E}_{cd},\mathcal{E}_{ab}] - R^f{}_{acd}\,\mathcal{E}_{fb} - R^f{}_{bcd}\,\mathcal{E}_{af} = 0. \tag{2.136} \]
This means that $\mathcal{B}_{ab}$ takes values in an Abelian ideal of the gauge algebra $\mathcal{G}_{YM}$ and $\mathcal{E}_{ab}$ takes values in the holonomy algebra. More precisely, Eq. (2.136) is only possible if the holonomy algebra $\mathcal{H}$ is an ideal of the gauge algebra $\mathcal{G}_{YM}$. Thus, the gauge group $G_{YM}$ must have a subgroup $Z\times H$, where $Z$ is an Abelian group and $H$ is the holonomy group. We proceed in the following way. The matrices $D^a{}_{ib}$ provide a natural embedding of the holonomy algebra $\mathcal{H}$ in the orthogonal algebra $\mathcal{SO}(n)$ in the following sense. Let $X_{ab}$ be the generators of the orthogonal algebra $\mathcal{SO}(n)$ in some representation satisfying the commutation relations (2.6). Let $T_i$ be the matrices defined by
\[ T_i = -\frac{1}{2}D^a{}_{ib}X^b{}_a. \tag{2.137} \]

Proposition 8. The matrices $T_i$ satisfy the commutation relations
\[ [T_i,T_k] = F^j{}_{ik}T_j \tag{2.138} \]
and form a representation $T$ of the holonomy algebra $\mathcal{H}$. This can be proved by taking into account the orthogonal algebra (2.6).



Thus Ti are the generators of the gauge algebra GY M realizing a representation T of the holonomy algebra H. Since Bab takes values in the Abelian ideal of the algebra of the gauge group we also have [Bab , T j ] = 0.

(2.139)

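Then by using Eq. (2.65) one can show that (see footnote 3)
\[ \mathcal{E}_{ab} = \frac{1}{2}R^{cd}{}_{ab}X_{cd} = -E^i{}_{ab}T_i. \tag{2.140} \]
Proposition 9. The two form
\[ \mathcal{F}_{ab} = -E^i{}_{ab}T_i + \mathcal{B}_{ab} = \frac{1}{2}R^{cd}{}_{ab}X_{cd} + \mathcal{B}_{ab} \tag{2.141} \]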

satisfies the constraints (2.23), and, therefore, gives the curvature of the homogeneous bundle $W$.

Now, we consider the representation $\Sigma$ of the orthogonal algebra defining the spin-tensor bundle $\mathcal{T}$ and define the matrices
\[ G_{ab} = \Sigma_{ab}\otimes I_X + I_\Sigma\otimes X_{ab}. \tag{2.142} \]
Obviously, these matrices are the generators of the orthogonal algebra in the product representation $\Sigma\otimes X$. Next, the matrices
\[ Q_i = -\frac{1}{2}D^a{}_{ib}\Sigma^b{}_a \tag{2.143} \]
form a representation $Q$ of the holonomy algebra $\mathcal{H}$, and the matrices
\[ \mathcal{R}_i = Q_i\otimes I_T + I_\Sigma\otimes T_i = -\frac{1}{2}D^a{}_{ib}G^b{}_a \tag{2.144} \]
are the generators of the holonomy algebra in the product representation $\mathcal{R}=Q\otimes T$. Then the total curvature, that is, the commutator of covariant derivatives, (2.18) of a twisted spin-tensor bundle $\mathcal{V}$ is
\[ \Omega_{ab} = -E^i{}_{ab}\mathcal{R}_i + \mathcal{B}_{ab} = \frac{1}{2}R^{cd}{}_{ab}G_{cd} + \mathcal{B}_{ab}. \tag{2.145} \]
Finally, we define the Casimir operators of the holonomy algebra in the representations $Q$, $T$ and $\mathcal{R}$,
\[ T^2 = C_2(\mathcal{H},T) = \beta^{ij}T_iT_j = \frac{1}{4}R^{abcd}X_{ab}X_{cd}, \tag{2.146} \]
\[ Q^2 = C_2(\mathcal{H},Q) = \beta^{ij}Q_iQ_j = \frac{1}{4}R^{abcd}\Sigma_{ab}\Sigma_{cd}, \tag{2.147} \]
\[ \mathcal{R}^2 = C_2(\mathcal{H},\mathcal{R}) = \beta^{ij}\mathcal{R}_i\mathcal{R}_j = \frac{1}{4}R^{abcd}G_{ab}G_{cd}. \tag{2.148} \]
They commute with all matrices $T_i$, $Q_i$ and $\mathcal{R}_i$ respectively.

Footnote 3: We correct here a sign misprint in Eq. (3.24) in [10].


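2.6. Twisted Lie derivatives. Let $\varphi$ be a section of a twisted homogeneous spin-tensor bundle $\mathcal{T}$. Let $\xi_A$ be the basis of Killing vector fields. Then the covariant (or generalized, or twisted) Lie derivative of $\varphi$ along $\xi_A$ is defined by
\[ \mathcal{L}_A\varphi = \mathcal{L}_{\xi_A}\varphi = \left(\nabla_{\xi_A} + S_A\right)\varphi, \tag{2.149} \]
where $\nabla_{\xi_A} = \xi_A{}^\mu\nabla_\mu$,
\[ S_A = \eta_A{}^i\,\mathcal{R}_i = \frac{1}{2}\xi_A{}^a{}_{;b}\,G^b{}_a, \tag{2.150} \]
and $\eta_A{}^i$ are defined by (2.110). Note that
\[ S_a\,q^a{}_b = 0. \tag{2.151} \]
Proposition 10. There hold
\[ [\nabla_{\xi_A},\nabla_{\xi_B}]\varphi = \left(C^C{}_{AB}\nabla_{\xi_C} - \mathcal{R}_{AB} + \mathcal{B}_{AB}\right)\varphi, \tag{2.152} \]
\[ \nabla_{\xi_A}S_B = \mathcal{R}_{AB}, \tag{2.153} \]
\[ [S_A,S_B] = C^C{}_{AB}S_C - \mathcal{R}_{AB}, \tag{2.154} \]
where
\[ \mathcal{R}_{AB} = \xi_A{}^a\xi_B{}^b E^i{}_{ab}\mathcal{R}_i = -\frac{1}{2}R^{cd}{}_{ab}\,\xi_A{}^a\xi_B{}^b\,G_{cd}, \tag{2.155} \]
\[ \mathcal{B}_{AB} = \xi_A{}^a\xi_B{}^b\,\mathcal{B}_{ab}. \tag{2.156} \]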

Proof. By using the properties of the Killing vectors described in the previous section and Eq. (2.145) we obtain first (2.152). Next, by using Eqs. (2.112) we obtain (2.153), and, further, by using Eq. (2.117) we get (2.154).

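Notice that from the definition (2.133) we have
\[ P_c{}^aL_i{}^b\,\mathcal{B}_{ab} = L_i{}^aL_j{}^b\,\mathcal{B}_{ab} = 0, \tag{2.157} \]
and
\[ P_c{}^aP_d{}^b\,\mathcal{B}_{ab} = \mathcal{B}_{cd}. \tag{2.158} \]
This means that the matrix $\mathcal{B}_{AB}$ has the form
\[ \mathcal{B}_{AB} = \begin{pmatrix} \mathcal{B}_{ab} & 0 \\ 0 & 0 \end{pmatrix}, \tag{2.159} \]
and, therefore,
\[ C^A{}_{BC}\,\mathcal{B}_{AD} = \gamma^{BD}C^A{}_{BC}\,\mathcal{B}_{DE} = 0. \tag{2.160} \]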

We define the operator L2 = γ AB L A L B .

(2.161)



Theorem 4. The operators L A and L2 satisfy the commutation relations [L A , L B ] = C C AB LC + B AB ,

(2.162)

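or, in more detail,
\[ [\mathcal{L}_a,\mathcal{L}_b] = E^i{}_{ab}\mathcal{L}_i + \mathcal{B}_{ab}, \qquad [\mathcal{L}_i,\mathcal{L}_a] = D^b{}_{ia}\mathcal{L}_b, \qquad [\mathcal{L}_i,\mathcal{L}_j] = F^k{}_{ij}\mathcal{L}_k, \tag{2.163} \]
and
\[ [\mathcal{L}_A,\mathcal{L}^2] = 2\gamma^{BC}\mathcal{B}_{AB}\mathcal{L}_C. \tag{2.164} \]
Proof. This follows from
\[ [\mathcal{L}_A,\mathcal{L}_B] = [\nabla_{\xi_A},\nabla_{\xi_B}] + [\nabla_{\xi_A},S_B] - [\nabla_{\xi_B},S_A] + [S_A,S_B] \tag{2.165} \]
and Eqs. (2.152), (2.153), and (2.154). Equation (2.164) follows directly from (2.162).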

The operators $\mathcal{L}_A$ form an algebra that is a direct sum of a nilpotent ideal and a semisimple algebra. For lack of a better name we call this algebra a gauged curvature algebra and denote it by $\mathcal{G}_{\rm gauge}$.

Proposition 11. There hold
\[ \gamma^{AB}\xi_A{}^\mu S_B = 0, \tag{2.166} \]
\[ \gamma^{AB}\nabla_{\xi_A}S_B = 0, \tag{2.167} \]
\[ \gamma^{AB}S_AS_B = \mathcal{R}^2. \tag{2.168} \]
Proof. This can be proved by using Eqs. (2.131), (2.153) and (2.132).

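Theorem 5. The Laplacian $\Delta$ acting on sections of a twisted spin-tensor bundle $\mathcal{V}$ over a symmetric space has the form
\[ \Delta = \mathcal{L}^2 - \mathcal{R}^2, \tag{2.169} \]
\[ [\mathcal{L}_A,\Delta] = 2\gamma^{BC}\mathcal{B}_{AB}\mathcal{L}_C. \tag{2.170} \]
Proof. We have
\[ \gamma^{AB}\mathcal{L}_A\mathcal{L}_B = \gamma^{AB}\nabla_{\xi_A}\nabla_{\xi_B} + \gamma^{AB}S_A\nabla_{\xi_B} + \gamma^{AB}\nabla_{\xi_A}S_B + \gamma^{AB}S_AS_B. \tag{2.171} \]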

Now, by using Eqs. (2.118) and (2.120) we get γ AB ∇ξ A ∇ξ B = ∆.

(2.172)

Next, by using Eqs. (2.153), (2.167) and (2.168), we obtain (2.169). Equation (2.170) follows from the commutation relations (2.162).


2.7. Isometries and pullbacks. Let $\omega^i$ be the canonical coordinates on the holonomy group and $(k^A)=(p^a,\omega^i)$ be the canonical coordinates on the gauged curvature group. We fix a point $x'$ so that the basis Killing vector fields $\xi_A$ satisfy the initial conditions (2.104)-(2.106) and are given by (2.103)-(2.105). Let $\xi=\langle k,\xi\rangle = k^A\xi_A = p^aP_a+\omega^iL_i$ be a Killing vector field and let $\psi_t:M\to M$ be the one-parameter diffeomorphism (the isometry) generated by the vector field $\xi$. Let $\hat x=\psi_t(x)$, so that
\[ \frac{d\hat x}{dt} = \xi(\hat x) \tag{2.173} \]
and
\[ \hat x\big|_{t=0} = x. \tag{2.174} \]
The solution of this equation depends on the parameters $t$, $p$, $\omega$, $x$ and $x'$, that is,
\[ \hat x = \hat x(t,p,\omega,x,x'). \tag{2.175} \]
We will be interested mainly in the case when the points $x$ and $x'$ are close to each other. In fact, at the end of our calculations we will take the limit $x=x'$. In this case, as we will show below, the Jacobian
\[ \det\left(\frac{\partial\hat x^\mu}{\partial p^a}\right)\neq 0 \tag{2.176} \]
is not equal to zero, and, therefore, coordinates $p$ can be used to parametrize the point $\hat x$, that is, Eq. (2.175) defines the function
\[ p = p(t,\omega,\hat x,x,x'). \tag{2.177} \]
We will be interested in those trajectories that reach the point $x'$ at the time $t=1$. So, we look at the values $\hat x(1,p,\omega,x,x')$ when the parameters $p$ are varied. Then, as we will show below, there is always a value of the parameters $p$ that we call $\bar p$ such that
\[ \hat x(1,\bar p,\omega,x,x') = x'. \tag{2.178} \]
Thus, Eq. (2.178) defines a function $\bar p=\bar p(\omega,x,x')$. Therefore, the parameters $\bar p$ can be used to parameterize the point $x$. Of course,
\[ \bar p(\omega,x,x') = p(1,\omega,x',x,x'). \tag{2.179} \]
Now, we choose the normal coordinates $y^a$ of the point $x$ defined above and the normal coordinates $\hat y^a$ of the point $\hat x$ with the origin at $x'$, so that the normal coordinates $y'$ of the point $x'$ are equal to zero, $y'^a=0$. Recall that the normal coordinates are equal to the components of the tangent vector at the point $x'$ to the geodesic connecting the points $x'$ and the current point, that is, $y^a=-e^a{}_{\mu'}(x')\sigma^{;\mu'}(x,x')$ and $\hat y^a=-e^a{}_{\mu'}(x')\sigma^{;\mu'}(\hat x,x')$. Then by taking into account Eqs. (2.103) and (2.105), Equation (2.173) becomes
\[ \frac{d\hat y^a}{dt} = \left(\sqrt{K(\hat y)}\cot\sqrt{K(\hat y)}\right)^a{}_b\,p^b - \omega^iD^a{}_{ib}\,\hat y^b, \tag{2.180} \]
with the initial condition
\[ \hat y^a\big|_{t=0} = y^a. \tag{2.181} \]
The solution of this equation defines a function $\hat y=\hat y(t,p,\omega,y)$.



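Proposition 12. The Taylor expansion of the solution $\hat y=\hat y(t,p,\omega,y)$ of Eq. (2.180) in $t$ reads
\[ \hat y^a = y^a + \left[\left(\sqrt{K(y)}\cot\sqrt{K(y)}\right)^a{}_b\,p^b - \omega^iD^a{}_{ib}\,y^b\right]t + O(t^2). \tag{2.182} \]
The Taylor expansion of the function $\hat y=\hat y(t,p,\omega,y)$ in $p$ and $y$ reads
\[ \hat y^a = \left(\exp[-tD(\omega)]\right)^a{}_b\,y^b + \left(\frac{1-\exp[-tD(\omega)]}{D(\omega)}\right)^a{}_b\,p^b + O(y^2,p^2,py). \tag{2.183} \]
There holds
\[ \det\left(\frac{\partial\hat y^a}{\partial p^b}\right)\bigg|_{p=y=0,\,t=1} = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right). \tag{2.184} \]
Proof. Eq. (2.182) trivially follows from Eq. (2.180). Let us expand the function $\hat y(t,p,\omega,y)$ in Taylor series in $p$ and $y$ restricting ourselves to linear terms, that is,
\[ \hat y^a = \hat y^a\Big|_{p=y=0} + \frac{\partial\hat y^a}{\partial p^b}\bigg|_{p=y=0}p^b + \frac{\partial\hat y^a}{\partial y^b}\bigg|_{p=y=0}y^b + O(p^2,y^2,py). \tag{2.185} \]
First of all, for $p=0$, Eq. (2.180) becomes
\[ \frac{d\hat y^a}{dt} = -\omega^iD^a{}_{ib}\,\hat y^b. \tag{2.186} \]
The solution of this equation with the initial condition $\hat y=0$ is trivial, therefore,
\[ \hat y\Big|_{p=y=0} = \hat y(t,0,\omega,0) = 0. \tag{2.187} \]
Next, by differentiating Eq. (2.186) with respect to $y^b$ and setting $y=0$ we obtain the equation
\[ \frac{d}{dt}\frac{\partial\hat y^a}{\partial y^b}\bigg|_{p=y=0} = -\omega^iD^a{}_{ic}\frac{\partial\hat y^c}{\partial y^b}\bigg|_{p=y=0}, \tag{2.188} \]
with the initial condition
\[ \frac{\partial\hat y^a}{\partial y^b}\bigg|_{p=y=t=0} = \delta^a{}_b. \tag{2.189} \]
The solution of this equation is
\[ \frac{\partial\hat y^a}{\partial y^b}\bigg|_{p=y=0} = \left(\exp[-tD(\omega)]\right)^a{}_b, \tag{2.190} \]
where $D(\omega)=\omega^iD_i$. Let
\[ Z^a{}_b = \frac{\partial\hat y^a}{\partial p^b}\bigg|_{p=y=0}. \tag{2.191} \]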


Then by differentiating Eq. (2.180) with respect to $p^b$ and setting $p=0$, we obtain
\[ \frac{d}{dt}Z^a{}_b = \delta^a{}_b - \omega^iD^a{}_{ic}Z^c{}_b, \tag{2.192} \]
with the initial condition
\[ Z^a{}_b\Big|_{t=0} = 0. \tag{2.193} \]
The solution of this equation is
\[ Z = \frac{1-\exp[-tD(\omega)]}{D(\omega)}. \tag{2.194} \]
By substituting Eqs. (2.187), (2.190) and (2.194) in (2.185) we get the desired result (2.183). Finally, by taking into account that the matrix $D(\omega)$ is traceless, we find first $\det\exp[tD(\omega)]=1$, and, then by using Eq. (2.194) we obtain (2.184).
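The determinant identity (2.184) can be checked numerically for a concrete traceless generator. The sketch below assumes, purely for illustration, the three-dimensional case with $D^a{}_{ib}=-\varepsilon_{iab}$ (as on a unit 3-sphere) and an arbitrary choice of $\omega$; the matrix functions are evaluated by truncated power series, and the helper names are hypothetical.

```python
import math

# Numerical illustration of Eqs. (2.194) and (2.184).
# Assumption (not from the paper): n = 3 and D^a_{ib} = -eps_{iab}.

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(3)) for c in range(3)] for r in range(3)]

def D_of(omega):
    """D(omega) = omega^i D_i as a 3x3 matrix (antisymmetric, hence traceless)."""
    return [[-sum(omega[i] * eps(i, a, b) for i in range(3)) for b in range(3)]
            for a in range(3)]

def power_series(A, coeff, terms=40):
    """Truncated sum over k of coeff(k) * A^k."""
    total = [[0.0] * 3 for _ in range(3)]
    power = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    for k in range(terms):
        ck = coeff(k)
        total = [[total[r][c] + ck * power[r][c] for c in range(3)] for r in range(3)]
        power = matmul(power, A)
    return total

def Z_matrix(D, t=1.0):
    """(1 - exp(-tD))/D = sum_k (-1)^k t^(k+1) D^k / (k+1)!, cf. Eq. (2.194)."""
    return power_series(D, lambda k: (-1) ** k * t ** (k + 1) / math.factorial(k + 1))

def sinh_ratio(D):
    """sinh(D/2)/(D/2) = sum over even k of (D/2)^k / (k+1)!."""
    return power_series(D, lambda k: 0.0 if k % 2 else 0.5 ** k / math.factorial(k + 1))

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

omega = (0.3, -1.1, 0.8)          # arbitrary sample values
D = D_of(omega)
lhs = det3(Z_matrix(D))           # det of d y_hat / d p at p = y = 0, t = 1
rhs = det3(sinh_ratio(D))         # det_TM sinh[D/2] / (D/2)
```

For this choice the eigenvalues of $D(\omega)$ are $0$ and $\pm i|\omega|$, so both determinants should reduce to $\left(\sin(|\omega|/2)/(|\omega|/2)\right)^2$, which gives an independent closed-form check.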

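The function $\hat y=\hat y(t,p,\omega,y)$ implicitly defines the function
\[ p = p(t,\omega,\hat y,y). \tag{2.195} \]
The function $\bar p=\bar p(\omega,y)$ is now defined by the equation
\[ \hat y(1,\bar p,\omega,y) = 0, \tag{2.196} \]
or
\[ \bar p(\omega,y) = p(1,\omega,0,y). \tag{2.197} \]
Proposition 13. The Taylor expansion of the function $\bar p(\omega,y)$ in $y$ has the form
\[ \bar p^a = -\left(D(\omega)\frac{\exp[-D(\omega)]}{1-\exp[-D(\omega)]}\right)^a{}_b\,y^b + O(y^2). \tag{2.198} \]
Therefore,
\[ \det\left(-\frac{\partial\bar p^a}{\partial y^b}\right)\bigg|_{y=0} = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1}. \tag{2.199} \]
Proof. We expand $\bar p$ in Taylor series in $y$,
\[ \bar p^a = \bar p^a\Big|_{y=0} + \frac{\partial\bar p^a}{\partial y^b}\bigg|_{y=0}y^b + O(y^2). \tag{2.200} \]
Next, by taking into account (2.187) we have
\[ \bar p\Big|_{y=0} = 0. \tag{2.201} \]
Further, by differentiating (2.196) with respect to $y^b$ and setting $y=0$ we get
\[ \frac{\partial\hat y^a}{\partial y^b}\bigg|_{p=y=0,\,t=1} + \frac{\partial\hat y^a}{\partial p^c}\bigg|_{p=y=0,\,t=1}\frac{\partial\bar p^c}{\partial y^b}\bigg|_{y=0} = 0, \tag{2.202} \]
and, therefore,
\[ \frac{\partial\bar p^a}{\partial y^b}\bigg|_{y=0} = -\left(D(\omega)\frac{\exp[-D(\omega)]}{1-\exp[-D(\omega)]}\right)^a{}_b. \tag{2.203} \]
This leads to both (2.198) and (2.199).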
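The role of $\bar p$ can be illustrated by integrating the linearised flow equation directly. The sketch below works under simplifying assumptions that are not from the text: two dimensions, where $D(\omega)$ is identified with multiplication by $i\omega$ on $\mathbb{C}\cong\mathbb{R}^2$ (the sign is a convention choice), and curvature corrections dropped, so that $\sqrt{K}\cot\sqrt{K}$ is replaced by the identity as in the linear terms of (2.183). The functions `rhs`, `flow` and `p_bar` are hypothetical helpers.

```python
import cmath
import math

# Linearised version of Eq. (2.180) in two dimensions, with D(omega) -> i*omega.
# Assumption: curvature corrections are neglected (linear order in p and y).

def rhs(y_hat, p, omega):
    """Right-hand side d y_hat / dt = p - i*omega*y_hat."""
    return p - 1j * omega * y_hat

def flow(y0, p, omega, steps=2000):
    """Integrate from t = 0 to t = 1 with the classical fourth-order Runge-Kutta scheme."""
    h = 1.0 / steps
    y = y0
    for _ in range(steps):
        k1 = rhs(y, p, omega)
        k2 = rhs(y + 0.5 * h * k1, p, omega)
        k3 = rhs(y + 0.5 * h * k2, p, omega)
        k4 = rhs(y + h * k3, p, omega)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return y

def p_bar(y0, omega):
    """Linear term of Eq. (2.198) with D(omega) replaced by i*omega."""
    d = 1j * omega
    return -d * cmath.exp(-d) / (1 - cmath.exp(-d)) * y0

def jacobian_det(omega):
    """Determinant of the real 2x2 matrix -d p_bar / d y, i.e. the squared modulus."""
    d = 1j * omega
    return abs(d * cmath.exp(-d) / (1 - cmath.exp(-d))) ** 2
```

Starting at `y0` with the parameter `p_bar(y0, omega)`, the trajectory should arrive at the origin at $t=1$, which is the content of (2.196) at linear order, and `jacobian_det(omega)` should equal $\left(\sin(\omega/2)/(\omega/2)\right)^{-2}$, the two-dimensional form of (2.199).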

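Now, we define
\[ \Lambda^{\hat\mu}{}_\nu = \frac{\partial\hat x^\mu}{\partial x^\nu}. \tag{2.204} \]
The pullback of the metric by the diffeomorphism $\psi_t$ is defined by
\[ (\psi_t^*g)_{\mu\nu}(x) = \Lambda^{\hat\alpha}{}_\mu\Lambda^{\hat\beta}{}_\nu\,g_{\hat\alpha\hat\beta}(\hat x). \tag{2.205} \]
Since $\psi_t$ is an isometry, we have
\[ (\psi_t^*g)_{\mu\nu}(x) = g_{\mu\nu}(x). \tag{2.206} \]
Therefore, the inverse matrix $\Lambda^{-1}$ is equal to
\[ (\Lambda^{-1})^\mu{}_{\hat\alpha} = g^{\mu\nu}(x)\,\Lambda^{\hat\beta}{}_\nu\,g_{\hat\beta\hat\alpha}(\hat x). \tag{2.207} \]
Let $e_a{}^\mu$ and $e^a{}_\mu$ be a local orthonormal frame that is obtained by parallel transport along geodesics from a point $x'$. Then the action of the pullback $\psi_t^*$ on the orthonormal frame is
\[ (\psi_t^*e^a)_\mu(x) = \Lambda^{\hat\alpha}{}_\mu\,e^a{}_{\hat\alpha}(\hat x). \tag{2.208} \]
Since $\psi_t$ is an isometry, we have
\[ \delta_{ab}(\psi_t^*e^a)_\alpha(x)(\psi_t^*e^b)_\beta(x) = \delta_{ab}\,e^a{}_\alpha(x)\,e^b{}_\beta(x). \tag{2.209} \]
Therefore, the frames of 1-forms $e^a$ and $\psi_t^*e^a$ are related by an orthogonal transformation
\[ (\psi_t^*e^a)(x) = O^a{}_b\,e^b(x), \tag{2.210} \]
where the matrix $O^a{}_b$ is defined by
\[ O^a{}_b = e^a{}_{\hat\alpha}(\hat x)\,\Lambda^{\hat\alpha}{}_\mu\,e_b{}^\mu(x). \tag{2.211} \]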

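Proposition 14. For $p=y=0$ the matrix $O$ has the form
\[ O\Big|_{p=y=0} = \exp[-tD(\omega)]. \tag{2.212} \]
Proof. We use normal coordinates $\hat y^a$ and $y^a$. Then the matrix $O$ takes the form
\[ O^a{}_b = e^a{}_{\hat\alpha}\frac{\partial\hat x^\alpha}{\partial\hat y^c}\frac{\partial\hat y^c}{\partial y^d}\frac{\partial y^d}{\partial x^\mu}e_b{}^\mu. \tag{2.213} \]
Now, by using the Jacobian matrix (2.41) and recalling that $\hat y=0$ for $p=y=0$ we obtain
\[ \frac{\partial y^a}{\partial x^\mu}e_b{}^\mu\bigg|_{p=y=0} = e^a{}_{\hat\alpha}\frac{\partial\hat x^\alpha}{\partial\hat y^b}\bigg|_{p=y=0} = \delta^a{}_b. \tag{2.214} \]
Therefore,
\[ O^a{}_b\Big|_{p=y=0} = \frac{\partial\hat y^a}{\partial y^b}\bigg|_{p=y=0}, \tag{2.215} \]
and, finally (2.190) gives the desired result (2.212).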

Let $\varphi$ be a section of the twisted spin-tensor bundle $\mathcal{V}$. Let $V_x$ be the fiber at the point $x$ and $V_{\hat x}$ be the fiber at the point $\hat x=\psi_t(x)$. The pullback of the diffeomorphism $\psi_t$ defines the map, that we call just the pullback,
\[ \psi_t^*: C^\infty(\mathcal{V})\to C^\infty(\mathcal{V}) \tag{2.216} \]
on smooth sections of the twisted spin-tensor bundle $\mathcal{V}$. The pullback of tensor fields of type $(p,q)$ is defined by
\[ (\psi_t^*\varphi)^{\mu_1\dots\mu_p}_{\nu_1\dots\nu_q}(x) = \Lambda^{\hat\beta_1}{}_{\nu_1}\cdots\Lambda^{\hat\beta_q}{}_{\nu_q}(\Lambda^{-1})^{\mu_1}{}_{\hat\alpha_1}\cdots(\Lambda^{-1})^{\mu_p}{}_{\hat\alpha_p}\,\varphi^{\hat\alpha_1\dots\hat\alpha_p}_{\hat\beta_1\dots\hat\beta_q}(\hat x). \tag{2.217} \]
We define the twisted pullback (a combination of a proper pullback and a gauge transformation) of a tensor of type $(p,q)$ by
\[ (\psi_t^*\varphi)^{a_1\dots a_p}_{b_1\dots b_q}(x) = O^{c_1}{}_{b_1}\cdots O^{c_q}{}_{b_q}\,O_{d_1}{}^{a_1}\cdots O_{d_p}{}^{a_p}\,\varphi^{d_1\dots d_p}_{c_1\dots c_q}(\hat x). \tag{2.218} \]
Since the matrix $O$ is orthogonal, it can be parametrized by
\[ O = \exp\theta, \tag{2.219} \]
where $\theta_{ab}$ is an antisymmetric matrix. The orthogonal transformation of the frame pulled back causes the transformation of spinors
\[ (\psi_t^*\varphi)(x) = \exp\left(-\frac{1}{4}\theta_{ab}\gamma^{ab}\right)\varphi(\hat x). \tag{2.220} \]
More generally, we have

Proposition 15. Let $\varphi$ be a section of a twisted spin-tensor bundle $\mathcal{V}$. Then
\[ (\psi_t^*\varphi)(x) = \exp\left(-\frac{1}{2}\theta_{ab}G^{ab}\right)\varphi(\hat x). \tag{2.221} \]
In particular, for $p=y=0$ (or $x=x'$),
\[ (\psi_t^*\varphi)(x)\Big|_{p=y=0} = \exp[t\mathcal{R}(\omega)]\,\varphi(x'), \tag{2.222} \]
where $\mathcal{R}(\omega)=\omega^i\mathcal{R}_i$.

Proof. First, from Eq. (2.212) we see that
\[ \theta^a{}_b\Big|_{p=y=0} = -t\,\omega^iD^a{}_{ib}. \tag{2.223} \]
Then, from the definition (2.144) of the matrices $\mathcal{R}_i$ we get (2.222).

It is not very difficult to check that the Lie derivatives are nothing but the generators of the pullback, that is,
\[ \mathcal{L}_\xi\varphi = k^A\mathcal{L}_A\varphi = \frac{d}{dt}(\psi_t^*\varphi)\bigg|_{t=0}. \tag{2.224} \]
We will use this fundamental fact to compute the heat kernel diagonal below.

3. Heat Semigroup

3.1. Geometry of the curvature group. Let $G_{\rm gauge}$ be the gauged curvature group and $H$ be its holonomy subgroup. Both these groups have compact algebras. However, while the holonomy group is always compact, the curvature group is, in general, a product of a nilpotent group, $G_0$, and a semi-simple group, $G_s$,
\[ G_{\rm gauge} = G_0\times G_s. \tag{3.1} \]

The semi-simple group G s is a product G s = G + × G − of a compact G + and a noncompact G − subgroup. Let ξ A be the basis Killing vectors, k A be the canonical coordinates on the curvature group G and ξ(k) = k A ξ A . The canonical coordinates are exactly the normal coordinates on the group defined above. Let C A be the generators of the curvature group in adjoint representation and C(k) = k A C A . In the following ∂ M means the partial derivative ∂/∂k M with respect to the canonical coordinates. We define the matrix Y A M by the equation exp[−ξ(k)]∂ M exp[ξ(k)] = Y A M ξ A ,

(3.2)

which is well defined since the right hand side lies in the Lie algebra of the curvature group. This can be written in the form exp[−ξ(k)]∂ M exp[ξ(k)] = exp[−Adξ(k) ]∂ M ,

(3.3)

where the operator Ad X is defined by Ad X Z = [X, Z ]. This enables us to compute the matrix Y = (Y A M ) explicitly, namely, Y =

1 − exp[−C(k)] . C(k)

(3.4)

Let X = (X A M ) = Y −1 be the inverse matrix of Y . Then we define the 1-forms Y A and the vector fields X A on the group G by Y A = Y A M dk M ,

X A = X A M ∂M .

(3.5)

Proposition 16. There holds X A exp[ξ(k)] = exp[ξ(k)]ξ A .

(3.6)


Proof. This follows immediately from Eq. (3.2).

Next, by differentiating Eq. (3.2) with respect to k L and alternating the indices L and M we obtain ∂ L Y A M − ∂ M Y A L = −C A BC Y B L Y C M ,

(3.7)

which, of course, can also be written as 1 dY A = − C A BC Y B ∧ Y C . 2

(3.8)

Proposition 17. The vector fields X A satisfy the commutation relations [X A , X B ] = C C AB X C . Proof. This follows from Eq. (3.7).

(3.9)

The vector fields X A are nothing but the right-invariant vector fields. They form a representation of the curvature algebra. We will also need the following fundamental property of Lie groups. Proposition 18. Let G be a Lie group with the structure constants C A BC , C A = (C B AC ) and C(k) = C A k A . Let γ = (γ AB ) be a symmetric non-degenerate matrix satisfying the equation (C A )T = −γ C A γ −1 .

(3.10)

Let $X=(X_A{}^M)$ be a matrix defined by
\[ X = \frac{C(k)}{1-\exp[-C(k)]}. \tag{3.11} \]
Then
\[ (\det X)^{-1/2}\,\gamma^{AB}X_A{}^M\partial_M X_B{}^N\partial_N(\det X)^{1/2} = -\frac{1}{24}\gamma^{AB}C^C{}_{AD}C^D{}_{BC}. \tag{3.12} \]

Proof. It is easy to check that this equation holds at k = 0. Now, it can be proved by showing that it is a group invariant. For a detailed proof for semisimple groups see [18,20,25].

It is worth stressing that this equation holds not only on semisimple Lie groups but on any group with a compact Lie algebra, that is, when the structure constants C A BC and the matrix γ AB , used to define the metric G M N and the operator X 2 , satisfy Eq. (2.81). Such algebras can have an Abelian center as in Eq. (2.83). Now, by using the right-invariant vector fields we define a metric on the curvature group G, G M N = γ AB Y A M Y B N ,

G M N = γ AB X A M X B N .

(3.13)

This metric is bi-invariant and satisfies, in particular, the equation L X A G BC = X A M ∂ M G BC + G B M ∂C X A M + G MC ∂ B X A M = 0.

(3.14)



This equation is proved by using Eqs. (2.81) and (3.9). This means that the vector fields $X_A$ are the Killing vector fields of the metric $G_{MN}$. One can easily show that this metric defines the following natural affine connection $\nabla^G$ on the group
\[ \nabla^G_{X_C} X_A = -\frac12\, C^B{}_{AC} X_B, \qquad \nabla^G_{X_C} Y^A = \frac12\, C^A{}_{BC} Y^B, \tag{3.15} \]
with the scalar curvature
\[ R_G = -\frac14\, \gamma^{AB} C^C{}_{AD} C^D{}_{BC}. \tag{3.16} \]
Since the matrix $C(k)$ is traceless we have $\det \exp[C(k)/2] = 1$, and, therefore, the volume element on the group is
\[ |G|^{1/2} = (\det G_{MN})^{1/2} = |\gamma|^{1/2} \det{}_G\left(\frac{\sinh[C(k)/2]}{C(k)/2}\right), \tag{3.17} \]
where $|\gamma| = \det \gamma_{AB}$. Notice that this function is precisely the inverse Van Vleck-Morette determinant (2.42) on the group in normal coordinates. It is not difficult to see that
\[ k^M Y^A{}_M = k^M X_M{}^A = k^A. \tag{3.18} \]
By differentiating this equation with respect to $k^B$ and contracting the indices $A$ and $B$ we obtain
\[ k^M \partial_A X_M{}^A = N - X_A{}^A. \tag{3.19} \]
Now, by contracting Eq. (3.15) with $G^{BC}$ we obtain the zero-divergence condition for the right-invariant vector fields
\[ |G|^{-1/2}\,\partial_M\left(|G|^{1/2} X_A{}^M\right) = 0. \tag{3.20} \]
Next, we define the Casimir operator
\[ X^2 = C_2(G, X) = \gamma^{AB} X_A X_B. \tag{3.21} \]
By using Eq. (3.20) one can easily show that $X^2$ is an invariant differential operator that is nothing but the scalar Laplacian on the group $G$,
\[ X^2 = |G|^{-1/2}\,\partial_M |G|^{1/2} G^{MN} \partial_N = G^{MN} \nabla^G_M \nabla^G_N. \tag{3.22} \]

Then, by using Eqs. (2.81) and (2.78) one can show that the operator $X^2$ commutes with the operators $X_A$,
\[ [X_A, X^2] = 0. \tag{3.23} \]
Since we will actually be working with the gauged curvature group, we introduce now the operators (covariant right-invariant vector fields) $J_A$ by
\[ J_A = X_A - \frac12\, B_{AB} k^B, \tag{3.24} \]
and the operator
\[ J^2 = \gamma^{AB} J_A J_B. \tag{3.25} \]


I. G. Avramidi

Proposition 19. The operators $J_A$ and $J^2$ satisfy the commutation relations
\[ [J_A, J_B] = C^C{}_{AB} J_C + B_{AB}, \tag{3.26} \]
and
\[ [J_A, J^2] = 2 B_{AB} J^B. \tag{3.27} \]

Proof. By using Eqs. (2.157)-(2.160) we obtain
\[ X_B{}^A B_{AM} = \gamma_{BN}\,\gamma^{AC} X_C{}^N B_{AM} = B_{BM}, \tag{3.28} \]
and, hence,
\[ \gamma^{AB} X_B{}^M B_{AM} = 0, \tag{3.29} \]
and, further, by using (3.9) we obtain (3.26). By using Eqs. (3.28) we get (3.27).

Thus, the operators $J_A$ form a representation of the gauged curvature algebra. Now, let $\mathcal{L}_A$ be the operators of Lie derivatives satisfying the commutation relations (2.162) and $\mathcal{L}(k) = k^A \mathcal{L}_A$.

Proposition 20. There holds
\[ J_A \exp[\mathcal{L}(k)] = \exp[\mathcal{L}(k)]\,\mathcal{L}_A, \tag{3.30} \]
and, therefore,
\[ J^2 \exp[\mathcal{L}(k)] = \exp[\mathcal{L}(k)]\,\mathcal{L}^2. \tag{3.31} \]

Proof. Similarly to (3.3) we have
\[ \exp[-\mathcal{L}(k)]\,\partial_M \exp[\mathcal{L}(k)] = \exp[-\mathrm{Ad}_{\mathcal{L}(k)}]\,\partial_M. \tag{3.32} \]
By using the commutation relations (2.162) and Eq. (2.160) we obtain
\[ \exp[-\mathcal{L}(k)]\,\partial_M \exp[\mathcal{L}(k)] = Y^A{}_M \mathcal{L}_A + \frac12\, B_{MN} k^N. \tag{3.33} \]
The statement of the proposition follows from the definition of the operators $J_A$, $J^2$ and $\mathcal{L}^2$.

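3.2. Heat kernel on the curvature group. Let $\mathcal{B}$ be the matrix with the components $\mathcal{B} = (\gamma^{AB} B_{BC})$. Let $k^A$ be the canonical coordinates on the curvature group $G$ and $A(t; k)$ be a function defined by
\[ A(t; k) = \det{}_G\left(\frac{\sinh[C(k)/2 + t\mathcal{B}]}{C(k)/2 + t\mathcal{B}}\right)^{-1/2}. \tag{3.34} \]
By using Eqs. (3.28) one can rewrite this in the form
\[ A(t; k) = \det{}_G\left(\frac{\sinh[C(k)/2]}{C(k)/2}\right)^{-1/2} \det{}_G\left(\frac{\sinh[t\mathcal{B}]}{t\mathcal{B}}\right)^{-1/2}. \tag{3.35} \]
Notice also that due to (2.159),
\[ \det{}_G\left(\frac{\sinh[t\mathcal{B}]}{t\mathcal{B}}\right)^{-1/2} = \det{}_{TM}\left(\frac{\sinh[t\mathcal{B}]}{t\mathcal{B}}\right)^{-1/2}, \tag{3.36} \]
where $\mathcal{B}$ is now regarded as just the matrix $\mathcal{B} = (B^a{}_b)$. Let $\Theta(t; k)$ be another function on the group $G$ defined by
\[ \Theta(t; k) = \frac12 \langle k, \gamma\hat\Theta k\rangle, \tag{3.37} \]
where $\hat\Theta$ is the matrix
\[ \hat\Theta = t\mathcal{B} \coth(t\mathcal{B}) \tag{3.38} \]
and $\langle u, \gamma v\rangle = \gamma_{AB} u^A v^B$ is the inner product on the algebra $\mathcal{G}$.

Theorem 6. Let $\Phi(t; k)$ be a function on the group $G$ defined by
\[ \Phi(t; k) = (4\pi t)^{-N/2} A(t; k) \exp\left(-\frac{\Theta(t; k)}{2t} + \frac16 R_G\, t\right). \tag{3.39} \]
Then $\Phi(t; k)$ satisfies the equation
\[ \partial_t \Phi = J^2 \Phi, \tag{3.40} \]
and the initial condition
\[ \Phi(0; k) = |\gamma|^{-1/2}\delta(k). \tag{3.41} \]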

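Proof. We compute first
\[ \partial_t \Theta = \frac1t\,\Theta - \frac{1}{2t}\langle k, \gamma\hat\Theta^2 k\rangle + \frac t2 \langle k, \gamma\mathcal{B}^2 k\rangle \tag{3.42} \]
and
\[ \partial_t A = \frac{1}{2t}\left(N - \mathrm{tr}_G\,\hat\Theta\right) A. \tag{3.43} \]
Therefore,
\[ \partial_t \Phi = \left[\frac16 R_G - \frac{1}{2t}\,\mathrm{tr}_G\,\hat\Theta + \frac{1}{4t^2}\langle k, \gamma\hat\Theta^2 k\rangle - \frac14\langle k, \gamma\mathcal{B}^2 k\rangle\right]\Phi. \tag{3.44} \]
Next, we have
\[ J^2 = X^2 - \gamma^{AB} B_{AC} k^C X_B + \frac14\,\gamma^{AB} B_{AC} B_{BD} k^C k^D. \tag{3.45} \]
By using Eqs. (3.28) and (2.160) and the anti-symmetry of the matrix $B_{AB}$ we show that
\[ \gamma^{AB} B_{AC} k^C X_B \Theta = 0 \quad\text{and}\quad \gamma^{AB} B_{AC} k^C X_B A = 0, \tag{3.46} \]
and, therefore,
\[ \gamma^{AB} B_{AC} k^C X_B \Phi = 0. \tag{3.47} \]
Thus,
\[ J^2 \Phi = \left[A^{-1}(X^2 A) - \frac{1}{2t}(X^2\Theta) + \frac{1}{4t^2}\,\gamma^{AB}(X_A\Theta)(X_B\Theta) - \frac1t\,A^{-1}\gamma^{AB}(X_B A)(X_A\Theta) - \frac14\langle k, \gamma\mathcal{B}^2 k\rangle\right]\Phi. \tag{3.48} \]
Further, by using (3.28) we get
\[ \gamma^{AB}(X_A\Theta)(X_B\Theta) = \langle k, \gamma\hat\Theta^2 k\rangle, \tag{3.49} \]
\[ X^2\Theta = \mathrm{tr}_G\,X + \mathrm{tr}_G\,\hat\Theta - N. \tag{3.50} \]
Now, by using Eq. (3.20) in the form
\[ A^2\,\partial_M\left(A^{-2} X_B{}^M\right) = 0 \tag{3.51} \]
and Eqs. (2.160) and (3.19) we show that
\[ A^{-1}\gamma^{AB}(X_A\Theta)\,X_B A = \frac12\left(N - \mathrm{tr}_G\,X\right), \tag{3.52} \]
and by using Eq. (3.12) we obtain
\[ A^{-1} X^2 A = \frac16 R_G. \tag{3.53} \]
Finally, substituting Eqs. (3.49)-(3.53) into Eq. (3.48) and comparing it with Eq. (3.44) we prove Eq. (3.40). The initial condition (3.41) follows easily from the well known property of the Gaussian. This completes the proof of the theorem.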
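The time derivative (3.42) rests on the identity $\partial_t\hat\Theta = \hat\Theta/t - \hat\Theta^2/t + t\mathcal{B}^2$ for $\hat\Theta = t\mathcal{B}\coth(t\mathcal{B})$. The following minimal sketch checks this identity by a finite difference for a single real eigenvalue `b` standing in for the matrix $\mathcal{B}$. The function names and the restriction to a real scalar are assumptions made only for this illustration; they are not part of the paper.

```python
import math

def theta_hat(t, b):
    """Scalar stand-in for Theta-hat = t*B*coth(t*B), with b one eigenvalue of B."""
    x = t * b
    return x / math.tanh(x)

def theta_hat_dot_closed_form(t, b):
    """Right-hand side of the identity: Theta/t - Theta^2/t + t*b^2."""
    th = theta_hat(t, b)
    return th / t - th * th / t + t * b * b

def theta_hat_dot_numeric(t, b, h=1e-5):
    """Central finite difference of theta_hat in t."""
    return (theta_hat(t + h, b) - theta_hat(t - h, b)) / (2.0 * h)

if __name__ == "__main__":
    for t, b in [(0.3, 1.2), (1.0, 0.7), (2.0, 1.5)]:
        print(t, b, theta_hat_dot_numeric(t, b), theta_hat_dot_closed_form(t, b))
```

Contracting the identity with $\frac12\langle k, \gamma\,\cdot\,k\rangle$ reproduces the three terms of (3.42).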

3.3. Regularization and analytical continuation. In the following we will complexify the gauged curvature group in the following sense. We extend the canonical coordinates $(k^A) = (p^a, \omega^i)$ to the whole complex Euclidean space $\mathbb{C}^N$. Then all group-theoretic functions introduced above become analytic functions of $k^A$, possibly with some poles on the real section $\mathbb{R}^N$ for compact groups. In fact, we replace the actual real slice $\mathbb{R}^N$ of $\mathbb{C}^N$ with an $N$-dimensional subspace $\mathbb{R}^N_{\rm reg}$ in $\mathbb{C}^N$ obtained by rotating the real section $\mathbb{R}^N$ counterclockwise in $\mathbb{C}^N$ by $\pi/4$. That is, we replace each coordinate $k^A$ by $e^{i\pi/4} k^A$. In the complex domain the group becomes non-compact. We call this procedure the decompactification. If the group is compact, or has a compact subgroup, then this plane will cover the original group infinitely many times.

Since the metric $(\gamma_{AB}) = \mathrm{diag}(\delta_{ab}, \beta_{ij})$ is not necessarily positive definite (actually, only the metric of the holonomy group $\beta_{ij}$ is non-definite), we analytically continue the function $\Phi(t; k)$ in the complex plane of $t$ with a cut along the negative imaginary axis so that $-\pi/2 < \arg t < 3\pi/2$. Thus, the function $\Phi(t; k)$ defines an analytic function of $t$ and $k^A$. For the purpose of the following exposition we shall consider $t$ to be real negative, $t < 0$. This is needed in order to make all integrals convergent and well defined and to be able to do the analytical continuation.

As we will show below, the singularities occur only in the holonomy group. This means that there is no need to complexify the coordinates $p^a$. Thus, in the following we assume the coordinates $p^a$ to be real and the coordinates $\omega^i$ to be complex, more precisely, to take values in the $p$-dimensional subspace $\mathbb{R}^p_{\rm reg}$ of $\mathbb{C}^p$ obtained by rotating $\mathbb{R}^p$ counterclockwise by $\pi/4$ in $\mathbb{C}^p$. That is, we have $\mathbb{R}^N_{\rm reg} = \mathbb{R}^n \times \mathbb{R}^p_{\rm reg}$.

This procedure (that we call a regularization) with the nonstandard contour of integration is necessary for the convergence of the integrals below since we are treating both the compact and the non-compact symmetric spaces simultaneously. Remember that, in general, the nondegenerate diagonal matrix $\beta_{ij}$ is not positive definite. The space $\mathbb{R}^p_{\rm reg}$ is chosen in such a way as to make the Gaussian exponent purely imaginary. Then the indefiniteness of the matrix $\beta$ does not cause any problems. Moreover, the integrand does not have any singularities on these contours. The convergence of the integral is guaranteed by the exponential growth of the sine for imaginary argument. These integrals can then be computed in the following way. The coordinates $\omega^j$ corresponding to the compact directions are rotated further by another $\pi/4$ to the imaginary axis and the coordinates $\omega^j$ corresponding to the non-compact directions are rotated back to the real axis. Then, for $t < 0$ all the integrals below are well defined and convergent and define an analytic function of $t$ in a complex plane with a cut along the negative imaginary axis.

3.4. Heat semigroup.

Theorem 7. The heat semigroup $\exp(t\mathcal{L}^2)$ can be represented in the form of the integral
\[ \exp(t\mathcal{L}^2) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}(k)\,\Phi(t; k) \exp[\mathcal{L}(k)]. \tag{3.54} \]

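Proof. Let
\[ \Psi(t) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}\,\Phi(t; k) \exp[\mathcal{L}(k)]. \tag{3.55} \]
By using the previous theorem we obtain
\[ \partial_t \Psi(t) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2} \exp[\mathcal{L}(k)]\, J^2 \Phi(t; k). \tag{3.56} \]
Now, by integrating by parts we get
\[ \partial_t \Psi(t) = \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}\,\Phi(t; k)\, J^2 \exp[\mathcal{L}(k)], \tag{3.57} \]
and, by using Eq. (3.31) we obtain
\[ \partial_t \Psi(t) = \Psi(t)\,\mathcal{L}^2. \tag{3.58} \]
Finally, from the initial condition (3.41) for the function $\Phi(t; k)$ we get $\Psi(0) = 1$, and, therefore,
\[ \Psi(t) = \exp(t\mathcal{L}^2). \tag{3.59} \]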


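Theorem 8. Let $\Delta$ be the Laplacian acting on sections of a homogeneous twisted spin-tensor vector bundle over a symmetric space. Then the heat semigroup $\exp(t\Delta)$ can be represented in the form of an integral
\[
\begin{aligned}
\exp(t\Delta) = {}& (4\pi t)^{-N/2} \det{}_{TM}\left(\frac{\sinh(t\mathcal{B})}{t\mathcal{B}}\right)^{-1/2} \exp\left(-t\mathcal{R}^2 + \frac16 R_G\, t\right) \\
& \times \int_{\mathbb{R}^N_{\rm reg}} dk\, |\gamma|^{1/2} \det{}_G\left(\frac{\sinh[C(k)/2]}{C(k)/2}\right)^{1/2} \\
& \times \exp\left(-\frac{1}{4t}\langle k, \gamma\, t\mathcal{B}\coth(t\mathcal{B})\, k\rangle\right) \exp[\mathcal{L}(k)].
\end{aligned}
\tag{3.60}
\]

Proof. By using Eq. (2.169) we obtain
\[ \exp(t\Delta) = \exp(-t\mathcal{R}^2) \exp(t\mathcal{L}^2). \tag{3.61} \]
The statement of the theorem follows now from Eqs. (3.54), (3.39), (3.35)-(3.38) and (3.17).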

4. Heat Kernel

4.1. Heat kernel diagonal and heat trace. The heat kernel diagonal on a homogeneous bundle over a symmetric space is parallel. In a parallel local frame it is just a constant matrix. The fiber trace of the heat kernel diagonal is just a constant. That is why it can be computed at any point in $M$. We fix a point $x'$ in $M$ such that the Killing vectors satisfy the initial conditions (2.104)-(2.106) and are given by the explicit formulas above (2.103)-(2.105). We compute the heat kernel diagonal at the point $x'$. The heat kernel diagonal can be obtained by acting by the heat semigroup $\exp(t\Delta)$ on the delta-function [8,10],
\[ U^{\rm diag}(t) = \exp(t\Delta)\,\delta(x, x')\Big|_{x=x'} = \exp(-t\mathcal{R}^2) \int_{\mathbb{R}^N_{\rm reg}} dk\, |G|^{1/2}\,\Phi(t; k) \exp[\mathcal{L}(k)]\,\delta(x, x')\Big|_{x=x'}. \tag{4.1} \]
To be able to use this integral representation we need to compute the action of the isometries $\exp[\mathcal{L}(k)]$ on the delta-function.

Proposition 21. Let $\varphi$ be a section of the twisted spin-tensor bundle $\mathcal{V}$, $\mathcal{L}_A$ be the twisted Lie derivatives, $k^A = (p^a, \omega^i)$ be the canonical coordinates on the group and $\mathcal{L}(k) = k^A \mathcal{L}_A$. Let $\xi = k^A \xi_A$ be the Killing vector and $\psi_t$ be the corresponding one-parameter diffeomorphism. Then
\[ \exp[\mathcal{L}(k)]\,\varphi(x) = \exp\left(-\frac12\,\theta_{ab} G^{ab}\right) \varphi(\hat x)\Big|_{t=1}, \tag{4.2} \]
where $\hat x = \psi_t(x)$ and the matrix $\theta$ is defined by (2.219). In particular, for $p = 0$ and $x = x'$,
\[ \exp[\mathcal{L}(k)]\,\varphi(x)\Big|_{p=0,\,x=x'} = \exp[\mathcal{R}(\omega)]\,\varphi(x). \tag{4.3} \]



Proof. This statement follows from Eqs. (2.221) and (2.222) and the fact that the Lie derivative is nothing but the generator of the pullback.

Proposition 22. Let $\omega^i$ be the canonical coordinates on the holonomy group $H$ and $(k^A) = (p^a, \omega^i)$ be the natural splitting of the canonical coordinates on the curvature group $G$. Then
\[ \exp[\mathcal{L}(k)]\,\delta(x, x')\Big|_{x=x'} = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1} \exp[\mathcal{R}(\omega)]\,\delta(p). \tag{4.4} \]

Proof. Let $\hat x(t, p, \omega, x, x') = \psi_t(x)$. By making use of Eq. (4.2) we obtain
\[ \exp[\mathcal{L}(k)]\,\delta(x, x')\Big|_{x=x'} = \exp\left(-\frac12\,\theta_{ab} G^{ab}\right) \delta(\hat x(1, p, \omega, x, x'), x')\Big|_{x=x',\,t=1}. \tag{4.5} \]
Now we change the variables from $x^\mu$ to the normal coordinates $y^a$ to get
\[ \delta(\hat x(1, p, \omega, x, x'), x')\Big|_{x=x'} = |g|^{-1/2} \det\left(\frac{\partial y^a}{\partial x^\mu}\right) \delta(\hat y(1, p, \omega, y))\Big|_{y=0}. \tag{4.6} \]
This delta-function picks the values of $p$ that make $\hat y = 0$, which is exactly the functions $\bar p = \bar p(\omega, y)$ defined by Eq. (2.196). By switching further to the variables $p$ we obtain
\[ \delta(\hat x(1, p, \omega, x, x'), x')\Big|_{x=x'} = |g|^{-1/2} \det\left(\frac{\partial y^a}{\partial x^\mu}\right) \det\left(\frac{\partial \hat y^b}{\partial p^c}\right)^{-1} \delta(p - \bar p(\omega, y))\Big|_{y=0,\,t=1}. \tag{4.7} \]
Now, by recalling from (2.201) that $\bar p|_{y=0} = 0$ and by using (2.41) and (2.184) we evaluate the Jacobians for $p = y = 0$ and $t = 1$ to get Eq. (4.4).

Remark 2. Some remarks are in order here. We implicitly assumed that there are no closed geodesics and that the equation of closed orbits of isometries
\[ \hat y^a(1, \bar p, \omega, 0) = 0 \tag{4.8} \]
has a unique solution $\bar p = \bar p(\omega, 0) = 0$. On compact symmetric spaces this is not true: there are infinitely many closed geodesics and infinitely many closed orbits of isometries. However, these global solutions, which reflect the global topological structure of the manifold, will not affect our local analysis. In particular, they do not affect the asymptotics of the heat kernel. That is why we have neglected them here. This is reflected in the fact that the Jacobian in (4.4) can become singular when the coordinates of the holonomy group $\omega^i$ vary from $-\infty$ to $\infty$. Note that the exact results for compact symmetric spaces can be obtained by an analytic continuation from the dual noncompact case when such closed geodesics are absent [18]. That is why we proposed above to complexify our holonomy group. If the coordinates $\omega^i$ are complex, taking values in the subspace $\mathbb{R}^p_{\rm reg}$ defined above, then Eq. (4.8) should have a unique solution and the Jacobian is an analytic function. It is worth stressing once again that the canonical coordinates cover the whole group except for a set of measure zero. Also a compact subgroup is covered infinitely many times. We will show below how this works in the case of the two-sphere, $S^2$.

Now by using the above lemmas and the theorem we can compute the heat kernel diagonal. We define the matrix $F(\omega)$ by $F(\omega) = \omega^i F_i$.


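Theorem 9. The heat kernel diagonal of the Laplacian on twisted spin-vector bundles over a symmetric space has the form
\[
\begin{aligned}
U^{\rm diag}(t) = {}& (4\pi t)^{-n/2} \det{}_{TM}\left(\frac{\sinh(t\mathcal{B})}{t\mathcal{B}}\right)^{-1/2} \exp\left[\left(\frac18 R + \frac16 R_H - \mathcal{R}^2\right) t\right] \\
& \times \int_{\mathbb{R}^p_{\rm reg}} \frac{d\omega}{(4\pi t)^{p/2}}\, |\beta|^{1/2} \exp\left(-\frac{1}{4t}\langle \omega, \beta\omega\rangle\right) \cosh[\mathcal{R}(\omega)] \\
& \times \det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right)^{1/2} \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1/2},
\end{aligned}
\tag{4.9}
\]
where $|\beta| = \det \beta_{ij}$ and $\langle \omega, \beta\omega\rangle = \beta_{ij}\,\omega^i \omega^j$.

Proof. First, we have $dk = dp\, d\omega$ and $|\gamma| = |\beta|$. By using Eqs. (4.1) and (4.4) and integrating over $p$ we obtain the heat kernel diagonal
\[ U^{\rm diag}(t) = \int_{\mathbb{R}^p_{\rm reg}} d\omega\, |G|^{1/2}(0, \omega)\,\Phi(t; 0, \omega) \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1} \exp[\mathcal{R}(\omega) - t\mathcal{R}^2]. \tag{4.10} \]
Further, by using Eq. (2.76) we compute the determinants
\[ \det{}_G\left(\frac{\sinh[C(\omega)/2]}{C(\omega)/2}\right) = \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right) \det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right). \tag{4.11} \]
Now, by using (2.159) we compute (3.37), $\Theta(t; 0, \omega) = \frac12\langle \omega, \beta\omega\rangle$, and, finally, by using Eqs. (3.39), (3.35), (3.16) and (2.88) we get the result (4.9).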

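By using this theorem we can also compute the heat trace for compact manifolds,
\[
\begin{aligned}
\mathrm{Tr}_{L^2} \exp(t\Delta) = {}& \int_M d\mathrm{vol}\, (4\pi t)^{-n/2}\, \mathrm{tr}_V \det{}_{TM}\left(\frac{\sinh(t\mathcal{B})}{t\mathcal{B}}\right)^{-1/2} \exp\left[\left(\frac18 R + \frac16 R_H - \mathcal{R}^2\right) t\right] \\
& \times \int_{\mathbb{R}^p_{\rm reg}} \frac{d\omega}{(4\pi t)^{p/2}}\, |\beta|^{1/2} \exp\left(-\frac{1}{4t}\langle \omega, \beta\omega\rangle\right) \cosh[\mathcal{R}(\omega)] \\
& \times \det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right)^{1/2} \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1/2},
\end{aligned}
\tag{4.12}
\]
where $\mathrm{tr}_V$ is the fiber trace.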

4.2. Heat kernel asymptotics. It is well known that there is the following asymptotic expansion as $t \to 0$ of the heat kernel diagonal [24]:
\[ U^{\rm diag}(t) \sim (4\pi t)^{-n/2} \sum_{k=0}^{\infty} t^k a_k. \tag{4.13} \]
The coefficients $a_k$ are called the local heat kernel coefficients. On compact manifolds, there is a similar asymptotic expansion of the heat trace with the global heat invariants $A_k$ defined by
\[ A_k = \int_M d\mathrm{vol}\; \mathrm{tr}_V\, a_k. \tag{4.14} \]
In symmetric spaces the heat invariants do not contain any additional information since the local heat kernel coefficients define the heat invariants $A_k$ up to a constant equal to the volume of the manifold,
\[ A_k = \mathrm{vol}(M)\, \mathrm{tr}_V\, a_k. \tag{4.15} \]
We introduce a Gaussian average over the holonomy algebra by
\[ \langle f(\omega)\rangle = \int_{\mathbb{R}^p_{\rm reg}} \frac{d\omega}{(4\pi)^{p/2}}\, |\beta|^{1/2} \exp\left(-\frac14\langle \omega, \beta\omega\rangle\right) f(\omega). \tag{4.16} \]
Then we can write
\[
\begin{aligned}
U^{\rm diag}(t) = {}& (4\pi t)^{-n/2} \det{}_{TM}\left(\frac{\sinh(t\mathcal{B})}{t\mathcal{B}}\right)^{-1/2} \exp\left[\left(\frac18 R + \frac16 R_H - \mathcal{R}^2\right) t\right] \\
& \times \left\langle \cosh\left[\sqrt{t}\,\mathcal{R}(\omega)\right] \det{}_H\left(\frac{\sinh[\sqrt{t}\,F(\omega)/2]}{\sqrt{t}\,F(\omega)/2}\right)^{1/2} \det{}_{TM}\left(\frac{\sinh[\sqrt{t}\,D(\omega)/2]}{\sqrt{t}\,D(\omega)/2}\right)^{-1/2} \right\rangle.
\end{aligned}
\tag{4.17}
\]
This equation can be used now to generate all heat kernel coefficients $a_k$ for any locally symmetric space simply by expanding it in a power series in $t$. By using the standard Gaussian averages (here we have corrected a misprint in Eq. (4.68) of [10])
\[ \langle \omega^{i_1} \cdots \omega^{i_{2k+1}}\rangle = 0, \tag{4.18} \]
\[ \langle \omega^{i_1} \cdots \omega^{i_{2k}}\rangle = \frac{(2k)!}{k!}\, \beta^{(i_1 i_2} \cdots \beta^{i_{2k-1} i_{2k})}, \tag{4.19} \]
one can obtain now all heat kernel coefficients in terms of traces of various contractions of the matrices $D^a{}_{ib}$ and $F^j{}_{ik}$ with the matrix $\beta^{ik}$. All these quantities are curvature invariants and can be expressed directly in terms of the Riemann tensor.
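The Gaussian averages (4.18) and (4.19) can be checked numerically in the simplest setting. The sketch below assumes a one-dimensional holonomy algebra ($p = 1$) with a positive constant `beta`, so that the average (4.16) can be taken along the real axis, and compares a Simpson-rule quadrature with $(2k)!/k!\;\beta^{-k}$. The function names, the integration window and the step count are choices made for this illustration only.

```python
import math

def gaussian_average(f, beta, steps=4000):
    """<f> = integral of (4*pi)^(-1/2) * beta^(1/2) * exp(-beta*w^2/4) * f(w) over the real line.

    Composite Simpson rule on [-12*sigma, 12*sigma], sigma = sqrt(2/beta).
    """
    sigma = math.sqrt(2.0 / beta)
    a, b = -12.0 * sigma, 12.0 * sigma
    h = (b - a) / steps  # steps must be even for Simpson's rule

    def integrand(w):
        return math.sqrt(beta / (4.0 * math.pi)) * math.exp(-beta * w * w / 4.0) * f(w)

    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(a + i * h)
    return total * h / 3.0

def expected_even_moment(k, beta):
    """One-dimensional form of Eq. (4.19): <w^(2k)> = (2k)!/k! * beta^(-k)."""
    return math.factorial(2 * k) / math.factorial(k) * beta ** (-k)

if __name__ == "__main__":
    beta = 0.8
    for k in range(4):
        numeric = gaussian_average(lambda w, k=k: w ** (2 * k), beta)
        print(k, numeric, expected_even_moment(k, beta))
```

For $k = 1$ this reduces to $\langle\omega^2\rangle = 2/\beta$, the one-dimensional case of $\langle\omega^i\omega^j\rangle = 2\beta^{ij}$.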


There is an alternative representation of the Gaussian average in purely algebraic terms. Let $b^j$ and $b^*_k$ be operators, called annihilation and creation operators respectively, acting on a Hilbert space, that satisfy the following commutation relations:
\[ [b^j, b^*_k] = \delta^j_k, \qquad [b^j, b^k] = [b^*_j, b^*_k] = 0. \tag{4.20} \]
Let $|0\rangle$ be a unit vector in the Hilbert space, called the vacuum vector, that satisfies the equations
\[ \langle 0|0\rangle = 1, \qquad b^j|0\rangle = \langle 0|b^*_k = 0. \tag{4.21} \]
Then the Gaussian average is nothing but the vacuum expectation value
\[ \langle f(\omega)\rangle = \langle 0|\, f(b) \exp\langle b^*, \beta b^*\rangle\, |0\rangle, \tag{4.22} \]
where $\langle b^*, \beta b^*\rangle = \beta^{jk} b^*_j b^*_k$. This should be computed by the so-called normal ordering, that is, by simply commuting the operators $b^j$ through the operators $b^*_k$ until they hit the vacuum vector giving zero. The remaining non-zero commutation terms precisely reproduce Eqs. (4.18), (4.19).

4.2.1. Calculation of the coefficient $a_1$. As an example let us calculate the lowest heat kernel coefficients: $a_0$ and $a_1$. Let $X$ be a matrix. Then by using
\[ \det\left(\frac{\sinh\sqrt{t}X}{\sqrt{t}X}\right)^{m} = \exp\left[m\, \mathrm{tr} \log\left(\frac{\sinh\sqrt{t}X}{\sqrt{t}X}\right)\right] \tag{4.23} \]
and [21]
\[ \log\left(\frac{\sinh\sqrt{t}X}{\sqrt{t}X}\right) = \sum_{k=1}^{\infty} \frac{2^{2k-1} B_{2k}}{k\,(2k)!}\, t^k X^{2k}, \tag{4.24} \]
where $B_k$ are Bernoulli numbers, in particular,
\[ B_0 = 1, \qquad B_1 = -\frac12, \qquad B_2 = \frac16, \tag{4.25} \]
we obtain
\[ \det\left(\frac{\sinh\sqrt{t}X}{\sqrt{t}X}\right)^{\pm 1/2} = 1 \pm \frac{1}{12}\, t\, \mathrm{tr}\, X^2 + O(t^2). \tag{4.26} \]
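The series (4.24) and its leading coefficient in (4.26) can be checked for a scalar argument. The sketch below is an illustration under that assumption: `x` stands for $\sqrt{t}X$ with $X$ a number, the Bernoulli numbers are generated from the standard recurrence with $B_1 = -1/2$, and the function names are hypothetical. The series converges only for $|x| < \pi$.

```python
import math
from fractions import Fraction

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] with the convention B_1 = -1/2.

    Uses the recurrence sum_{j=0}^{m} C(m+1, j) * B_j = 0 for m >= 1.
    """
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        s = sum(math.comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B

def series_coefficient(k, B):
    """Coefficient of x^(2k) in log(sinh(x)/x): 2^(2k-1) * B_2k / (k * (2k)!)."""
    return Fraction(2 ** (2 * k - 1)) * B[2 * k] / (k * math.factorial(2 * k))

def log_sinh_series(x, terms=12):
    """Partial sum of the series in Eq. (4.24) for a scalar argument x."""
    B = bernoulli_numbers(2 * terms)
    return sum(float(series_coefficient(k, B)) * x ** (2 * k) for k in range(1, terms + 1))

if __name__ == "__main__":
    for x in (0.5, 1.0):
        print(x, log_sinh_series(x), math.log(math.sinh(x) / x))
```

The first coefficient is $1/6$, so a power $\pm 1/2$ of the determinant contributes $\pm\frac{1}{12}\, t\, \mathrm{tr}\, X^2$, which is the linear term of (4.26).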

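Now, by using Eq. (4.17) we obtain
\[ U^{\rm diag}(t) = (4\pi t)^{-n/2}\left[a_0 + t a_1 + O(t^2)\right], \tag{4.27} \]
where
\[ a_0 = I, \qquad a_1 = \langle b_1\rangle, \tag{4.28} \]
and
\[ b_1 = \left[\frac18 R + \frac16 R_H + \frac{1}{48}\, \mathrm{tr}\, F(\omega)^2 - \frac{1}{48}\, \mathrm{tr}\, D(\omega)^2\right] I - \mathcal{R}^2 + \frac12\, \mathcal{R}(\omega)^2. \tag{4.29} \]
Next, by using (4.19), in particular,
\[ \langle \omega^i \omega^j\rangle = 2\beta^{ij}, \tag{4.30} \]
we obtain
\[ \langle \mathcal{R}(\omega)^2\rangle = 2\mathcal{R}^2, \tag{4.31} \]
\[ \langle \mathrm{tr}\, F(\omega)^2\rangle = 2\, \mathrm{tr}\, F_i F^i = 2 F^j{}_{il} F^{li}{}_j = -8 R_H, \tag{4.32} \]
\[ \langle \mathrm{tr}\, D(\omega)^2\rangle = 2\, \mathrm{tr}\, D_i D^i = 2 D^a{}_{ib} D^{bi}{}_a = -2R, \tag{4.33} \]
and, therefore,
\[ a_1 = \left[\frac18 R + \frac16 R_H - \frac16 R_H + \frac{1}{24} R\right] I - \mathcal{R}^2 + \mathcal{R}^2 = \frac16 R\, I. \tag{4.34} \]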
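The bookkeeping behind (4.34) is plain rational arithmetic and can be replayed exactly. The sketch below takes the coefficients of $b_1$ from (4.29) and the averages (4.31)-(4.33) as given, and tracks the coefficients of the three invariants $R$, $R_H$ and $\mathcal{R}^2$. The dictionary keys and the function name are labels chosen for this illustration.

```python
from fractions import Fraction as Fr

def a1_coefficients():
    """Combine Eq. (4.29) with the averages (4.31)-(4.33).

    Returns the coefficients of R, R_H and the Casimir operator (key 'Rcal2') in a_1.
    """
    # Terms of b_1 that do not depend on omega.
    coeff = {"R": Fr(1, 8), "R_H": Fr(1, 6), "Rcal2": Fr(-1)}
    # + (1/48) <tr F(omega)^2>, with <tr F(omega)^2> = -8 R_H.
    coeff["R_H"] += Fr(1, 48) * Fr(-8)
    # - (1/48) <tr D(omega)^2>, with <tr D(omega)^2> = -2 R.
    coeff["R"] -= Fr(1, 48) * Fr(-2)
    # + (1/2) <Rcal(omega)^2>, with <Rcal(omega)^2> = 2 Rcal^2.
    coeff["Rcal2"] += Fr(1, 2) * Fr(2)
    return coeff

if __name__ == "__main__":
    print(a1_coefficients())
```

The holonomy curvature and the Casimir operator cancel, leaving $a_1 = \frac16 R\, I$.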

This confirms the well known result for the coefficient $a_1$ [5,24].

4.3. Heat kernel on $S^2$ and $H^2$. Let us apply our result to a special case of a two-sphere $S^2$ of radius $r$, which is a compact symmetric space equal to the quotient of the isometry group, $SO(3)$, by the isotropy group, $SO(2)$,
\[ S^2 = SO(3)/SO(2). \tag{4.35} \]
The two-sphere is too small to incorporate an additional Abelian field $\mathcal{B}$; therefore, we set $\mathcal{B} = 0$. Let $y^a$ be the normal coordinates defined above. On the 2-sphere of radius $r$ they range over $-r\pi \le y^a \le r\pi$. We define the polar coordinates $\rho$ and $\varphi$ by
\[ y^1 = \rho \cos\varphi, \qquad y^2 = \rho \sin\varphi, \tag{4.36} \]
so that $0 \le \rho \le r\pi$ and $0 \le \varphi \le 2\pi$. The orthonormal frame of 1-forms is
\[ e^1 = d\rho, \qquad e^2 = r \sin\left(\frac{\rho}{r}\right) d\varphi, \tag{4.37} \]
which gives the spin connection 1-form
\[ \omega_{ab} = -\varepsilon_{ab} \cos\left(\frac{\rho}{r}\right) d\varphi, \tag{4.38} \]
with $\varepsilon_{ab}$ being the antisymmetric Levi-Civita tensor, and the curvature
\[ R_{abcd} = \frac{1}{r^2}\,\varepsilon_{ab}\varepsilon_{cd} = \frac{1}{r^2}\left(\delta_{ac}\delta_{bd} - \delta_{ad}\delta_{bc}\right), \tag{4.39} \]
\[ R_{ab} = \frac{1}{r^2}\,\delta_{ab}, \qquad R = \frac{2}{r^2}. \tag{4.40} \]
Since the holonomy group $SO(2)$ is one-dimensional, it is obviously Abelian, so all structure constants $F^i{}_{jk}$ are equal to zero, and therefore, the curvature of the holonomy group vanishes, $R_H = 0$. The metric of the holonomy group $\beta_{ij}$ is now just a constant, $\beta = 1/r^2$. The only generator of the holonomy group in the vector representation is
\[ D_{ab} = -\frac{1}{r^2}\, E_{ab} = -\frac{1}{r^2}\,\varepsilon_{ab}. \tag{4.41} \]
The irreducible representations of $SO(2)$ are parametrized by $\alpha$, which is either an integer, $\alpha = m$, or a half-integer, $\alpha = m + \frac12$. Therefore, the generator $\mathcal{R}$ of the holonomy group and the Casimir operator $\mathcal{R}^2$ are
\[ \mathcal{R} = i\,\frac{\alpha}{r^2}, \tag{4.42} \]
\[ \mathcal{R}^2 = \beta^{ij}\mathcal{R}_i \mathcal{R}_j = -\frac{\alpha^2}{r^2}. \tag{4.43} \]
The extra factor $r^2$ here is due to the inverse metric $\beta^{-1} = r^2$ of the holonomy group. The Lie derivatives $\mathcal{L}_A$ are given by
\[ \mathcal{L}_1 = \cos\varphi\, \partial_\rho - \frac{\sin\varphi}{r} \cot\left(\frac{\rho}{r}\right) \partial_\varphi + i\,\frac{\sin\varphi}{r \sin(\rho/r)}\,\alpha, \tag{4.44} \]
\[ \mathcal{L}_2 = \sin\varphi\, \partial_\rho + \frac{\cos\varphi}{r} \cot\left(\frac{\rho}{r}\right) \partial_\varphi - i\,\frac{\cos\varphi}{r \sin(\rho/r)}\,\alpha, \tag{4.45} \]
\[ \mathcal{L}_3 = \frac{1}{r^2}\,\partial_\varphi, \tag{4.46} \]
and form a representation of the $SO(3)$ algebra
\[ [\mathcal{L}_1, \mathcal{L}_2] = -\mathcal{L}_3, \qquad [\mathcal{L}_3, \mathcal{L}_1] = -\frac{1}{r^2}\,\mathcal{L}_2, \qquad [\mathcal{L}_3, \mathcal{L}_2] = \frac{1}{r^2}\,\mathcal{L}_1. \tag{4.47} \]

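The Laplacian is given by
\[ \Delta = \partial_\rho^2 + \frac1r \cot\left(\frac{\rho}{r}\right) \partial_\rho + \frac{1}{r^2 \sin^2(\rho/r)} \left[\partial_\varphi - i\alpha \cos\left(\frac{\rho}{r}\right)\right]^2. \tag{4.48} \]
The contour of integration over $\omega$ in (4.9) should be the real axis rotated counterclockwise by $\pi/4$. Since $S^2$ is compact, we rotate it further to the imaginary axis, compute the determinant
\[ \det{}_{TM}\left(\frac{\sinh[\omega D/2]}{\omega D/2}\right)^{-1/2} = \frac{\omega/(2r^2)}{\sin[\omega/(2r^2)]}, \tag{4.49} \]
and rescale $\omega$ for $t < 0$ by $\omega \to r\sqrt{-t}\,\omega$ to obtain an analytic function of $t$,
\[ U^{\rm diag}(t) = \frac{1}{4\pi t} \exp\left[\left(\frac14 + \alpha^2\right)\frac{t}{r^2}\right] \int_{-\infty}^{\infty} \frac{d\omega}{\sqrt{4\pi}} \exp\left(-\frac{\omega^2}{4}\right) \frac{\omega\sqrt{-t}/(2r)}{\sinh[\omega\sqrt{-t}/(2r)]} \cosh\left[\alpha\omega\sqrt{-t}/r\right]. \tag{4.50} \]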
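The determinant (4.49) can be checked with a few lines of matrix arithmetic. The sketch below assumes the half-argument form $\sinh[\omega D/2]/(\omega D/2)$ used in (4.9), takes $D_{ab} = -\varepsilon_{ab}/r^2$ with $\varepsilon_{12} = 1$, sums the power series $\sinh(M)/M = \sum_k M^{2k}/(2k+1)!$ for the $2\times 2$ matrix $M = \omega D/2$, and compares with $x/\sin x$ for $x = \omega/(2r^2)$. The helper names are hypothetical, and the comparison is only meaningful for $0 < |x| < \pi$, where $\sin x / x > 0$.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sinh_over_argument(m, terms=30):
    """sinh(M)/M = sum_{k>=0} M^(2k) / (2k+1)! for a 2x2 matrix M."""
    m2 = mat_mul(m, m)
    term = [[1.0, 0.0], [0.0, 1.0]]
    total = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = mat_mul(term, m2)
        term = [[v / ((2 * k) * (2 * k + 1)) for v in row] for row in term]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def determinant_factor(omega, r):
    """Left-hand side of Eq. (4.49), with D_ab = -eps_ab / r^2."""
    d = [[0.0, -1.0 / r ** 2], [1.0 / r ** 2, 0.0]]
    m = [[omega * v / 2.0 for v in row] for row in d]
    s = sinh_over_argument(m)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    return det ** -0.5

def closed_form(omega, r):
    """Right-hand side of Eq. (4.49)."""
    x = omega / (2.0 * r ** 2)
    return x / math.sin(x)

if __name__ == "__main__":
    for omega, r in [(1.0, 1.0), (2.0, 1.5), (4.0, 1.0)]:
        print(omega, r, determinant_factor(omega, r), closed_form(omega, r))
```

Since $\varepsilon^2 = -1$, the matrix $M^2$ is proportional to the identity, which is why the determinant collapses to a scalar function of $x$.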


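If we had rotated the contour to the real axis instead, then we would have obtained, after rescaling $\omega \to r\sqrt{t}\,\omega$ for $t > 0$,
\[ U^{\rm diag}(t) = \frac{1}{4\pi t} \exp\left[\left(\frac14 + \alpha^2\right)\frac{t}{r^2}\right] \,-\!\!\!\!\!\!\int_{-\infty}^{\infty} \frac{d\omega}{\sqrt{4\pi}} \exp\left(-\frac{\omega^2}{4}\right) \frac{\omega\sqrt{t}/(2r)}{\sin[\omega\sqrt{t}/(2r)]} \cos\left[\alpha\omega\sqrt{t}/r\right], \tag{4.51} \]
where $-\!\!\!\!\int$ denotes the Cauchy principal value of the integral. This can also be written as
\[
\begin{aligned}
U^{\rm diag}(t) = {}& \frac{1}{4\pi t} \exp\left[\left(\frac14 + \alpha^2\right)\frac{t}{r^2}\right] \sum_{k=-\infty}^{\infty} (-1)^k \,-\!\!\!\!\!\!\int_0^{2\pi r/\sqrt{t}} \frac{d\omega}{\sqrt{4\pi}} \exp\left[-\frac14\left(\omega + \frac{2\pi r}{\sqrt{t}}\,k\right)^2\right] \\
& \times \frac{\sqrt{t}}{2r}\, \frac{\omega + \frac{2\pi r}{\sqrt{t}}\,k}{\sin[\omega\sqrt{t}/(2r)]}\, \cos\left[\alpha\omega\sqrt{t}/r\right].
\end{aligned}
\tag{4.52}
\]
This is nothing but the sum over the closed geodesics of $S^2$. Note that the factor $\cos[\alpha\omega\sqrt{t}/r]$ is either periodic (for integer $\alpha$) or anti-periodic (for half-integer $\alpha$).

The non-compact symmetric space dual to the 2-sphere is the hyperbolic plane $H^2$ of pseudo-radius $a$. It is equal to the quotient of the isometry group, $SO(1, 2)$, by the isotropy group, $SO(2)$,
\[ H^2 = SO(1, 2)/SO(2). \tag{4.53} \]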

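Let $y^a$ be the normal coordinates defined above. On $H^2$ they range over $-\infty \le y^a \le \infty$. We define the polar coordinates $u$ and $\varphi$ by
\[ y^1 = u \cos\varphi, \qquad y^2 = u \sin\varphi, \tag{4.54} \]
so that $0 \le u \le \infty$ and $0 \le \varphi \le 2\pi$. The orthonormal frame of 1-forms is
\[ e^1 = du, \qquad e^2 = a \sinh\left(\frac{u}{a}\right) d\varphi, \tag{4.55} \]
which gives the spin connection 1-form
\[ \omega_{ab} = -\varepsilon_{ab} \cosh\left(\frac{u}{a}\right) d\varphi, \tag{4.56} \]
and the curvature
\[ R_{abcd} = -\frac{1}{a^2}\,\varepsilon_{ab}\varepsilon_{cd} = -\frac{1}{a^2}\left(\delta_{ac}\delta_{bd} - \delta_{ad}\delta_{bc}\right), \tag{4.57} \]
\[ R_{ab} = -\frac{1}{a^2}\,\delta_{ab}, \qquad R = -\frac{2}{a^2}. \tag{4.58} \]
The metric of the isotropy group $\beta_{ij}$ is just a constant, $\beta = -1/a^2$, and the only generator of the isotropy group in the vector representation is given by
\[ D_{ab} = \frac{1}{a^2}\, E_{ab} = \frac{1}{a^2}\,\varepsilon_{ab}. \tag{4.59} \]
The Lie derivatives $\mathcal{L}_A$ are now
\[ \mathcal{L}_1 = \cos\varphi\, \partial_u - \frac{\sin\varphi}{a} \coth\left(\frac{u}{a}\right) \partial_\varphi + i\,\frac{\sin\varphi}{a \sinh(u/a)}\,\alpha, \tag{4.60} \]
\[ \mathcal{L}_2 = \sin\varphi\, \partial_u + \frac{\cos\varphi}{a} \coth\left(\frac{u}{a}\right) \partial_\varphi - i\,\frac{\cos\varphi}{a \sinh(u/a)}\,\alpha, \tag{4.61} \]
\[ \mathcal{L}_3 = -\frac{1}{a^2}\,\partial_\varphi, \tag{4.62} \]
and form a representation of the $SO(1, 2)$ algebra
\[ [\mathcal{L}_1, \mathcal{L}_2] = -\mathcal{L}_3, \qquad [\mathcal{L}_3, \mathcal{L}_1] = \frac{1}{a^2}\,\mathcal{L}_2, \qquad [\mathcal{L}_3, \mathcal{L}_2] = -\frac{1}{a^2}\,\mathcal{L}_1. \tag{4.63} \]
The Laplacian is given by
\[ \Delta = \partial_u^2 + \frac1a \coth\left(\frac{u}{a}\right) \partial_u + \frac{1}{a^2 \sinh^2(u/a)} \left[\partial_\varphi - i\alpha \cosh\left(\frac{u}{a}\right)\right]^2. \tag{4.64} \]
The contour of integration over $\omega$ in (4.9) for the heat kernel should be the real axis rotated counterclockwise by $\pi/4$. Since $H^2$ is non-compact, we rotate it back to the real axis and rescale $\omega$ for $t > 0$ by $\omega \to a\sqrt{t}\,\omega$ to obtain the heat kernel diagonal for the Laplacian on $H^2$,
\[ U^{\rm diag}(t) = \frac{1}{4\pi t} \exp\left[-\left(\frac14 + \alpha^2\right)\frac{t}{a^2}\right] \int_{-\infty}^{\infty} \frac{d\omega}{\sqrt{4\pi}} \exp\left(-\frac{\omega^2}{4}\right) \frac{\omega\sqrt{t}/(2a)}{\sinh[\omega\sqrt{t}/(2a)]} \cosh\left[\alpha\omega\sqrt{t}/a\right]. \tag{4.65} \]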

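We see that the heat kernel in the compact case of the two-sphere, $S^2$, is related to the heat kernel in the non-compact case of the hyperboloid, $H^2$, by the analytical continuation $a^2 \to -r^2$, or $a \to ir$, or, alternatively, by replacing $t \to -t$ (and $a = r$).

One can go even further and compute the Plancherel (or Harish-Chandra) measure $\mu(\nu)$ in the case of $H^2$ and the spectrum in the case of $S^2$. For $H^2$ we rescale the integration variable in (4.65) by $\omega \to \omega a/\sqrt{t}$, substitute
\[ \exp\left(-\frac{a^2}{4t}\,\omega^2\right) = \frac{\sqrt{4\pi t}}{a} \int_{-\infty}^{\infty} \frac{d\nu}{2\pi} \exp\left(-\frac{t}{a^2}\,\nu^2 + i\omega\nu\right), \tag{4.66} \]
integrate by parts over $\nu$, and use
\[ \int_{-\infty}^{\infty} \frac{d\omega}{2\pi i}\, e^{i\omega\nu}\, \frac{\cosh(\alpha\omega)}{\sinh(\omega/2)} = \frac12\left\{\tanh[\pi(\nu + i\alpha)] + \tanh[\pi(\nu - i\alpha)]\right\} \tag{4.67} \]
(and the fact that $\alpha$ is an integer or a half-integer) to represent the heat kernel for $H^2$ in the form
\[ U^{\rm diag}(t) = \frac{1}{4\pi a^2} \int_{-\infty}^{\infty} d\nu\, \mu(\nu) \exp\left[-\left(\frac14 + \alpha^2 + \nu^2\right)\frac{t}{a^2}\right], \tag{4.68} \]
where
\[ \mu(\nu) = \nu \tanh(\pi\nu) \quad \text{for integer } \alpha = m, \tag{4.69} \]
and
\[ \mu(\nu) = \nu \coth(\pi\nu) \quad \text{for half-integer } \alpha = m + \frac12. \tag{4.70} \]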
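The two representations of the heat kernel diagonal on $H^2$, the contour form (4.65) and the spectral form (4.68), can be compared numerically in the scalar case. The sketch below assumes $\alpha = 0$ and the measure $\mu(\nu) = \nu\tanh(\pi\nu)$; the factor $\pi$ in the argument is the standard form of the Plancherel measure on $H^2$ and is an assumption relative to the text as printed. Both integrals are evaluated with a composite Simpson rule. The function names, integration windows and step counts are choices made for this illustration.

```python
import math

def simpson(f, a, b, steps=6000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def heat_kernel_contour(t, a=1.0):
    """Eq. (4.65) with alpha = 0."""
    def integrand(w):
        x = w * math.sqrt(t) / (2.0 * a)
        ratio = 1.0 if abs(x) < 1e-12 else x / math.sinh(x)
        return math.exp(-w * w / 4.0) * ratio / math.sqrt(4.0 * math.pi)
    prefactor = math.exp(-t / (4.0 * a * a)) / (4.0 * math.pi * t)
    return prefactor * simpson(integrand, -30.0, 30.0)

def heat_kernel_spectral(t, a=1.0):
    """Eq. (4.68) with alpha = 0 and mu(nu) = nu * tanh(pi * nu) (assumed form)."""
    def integrand(nu):
        return nu * math.tanh(math.pi * nu) * math.exp(-(0.25 + nu * nu) * t / (a * a))
    cutoff = 12.0 * a / math.sqrt(t)
    return simpson(integrand, -cutoff, cutoff) / (4.0 * math.pi * a * a)

if __name__ == "__main__":
    for t, a in [(0.5, 1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]:
        print(t, a, heat_kernel_contour(t, a), heat_kernel_spectral(t, a))
```

The agreement of the two values for several $t$ and $a$ is a consistency check of (4.65)-(4.68) for the scalar Laplacian only; it does not test the $\alpha \neq 0$ case.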

For $S^2$ we proceed as follows. We cannot just substitute $a^2 \to -r^2$ in (4.68). Instead, first, we deform the contour of integration in (4.68) to the V-shaped contour that consists of two segments of straight lines, one going from $e^{i3\pi/4}\infty$ to $0$, and another going from $0$ to $e^{i\pi/4}\infty$. Then, after we replace $a^2 \to -r^2$, we can deform the contour further to go counterclockwise around the positive imaginary axis. Then we notice that the function $\mu(\nu)$ is a meromorphic function with simple poles on the imaginary axis at $\nu_k = i d_k$, where
\[ d_k = k + \frac12, \qquad k = 0, \pm 1, \pm 2, \ldots, \qquad \text{for integer } \alpha = m, \tag{4.71} \]
and
\[ d_k = k, \qquad k = \pm 1, \pm 2, \ldots, \qquad \text{for half-integer } \alpha = m + \frac12. \tag{4.72} \]
Therefore, we can compute the integral by residue theory to get
\[ U^{\rm diag}(t) = \frac{1}{4\pi r^2} \sum_{k=0}^{\infty} d_k \exp(-\lambda_k t), \tag{4.73} \]
where
\[ \lambda_k = \frac{1}{r^2}\left[\left(k + \frac12\right)^2 - \frac14 - m^2\right] \quad \text{for integer } \alpha = m, \tag{4.74} \]
and
\[ \lambda_k = \frac{1}{r^2}\left[k^2 - \frac14 - \left(m + \frac12\right)^2\right] \quad \text{for half-integer } \alpha = m + \frac12. \tag{4.75} \]
Our results for the heat kernel on the 2-sphere $S^2$ and the hyperbolic plane $H^2$ coincide with the exact heat kernel of the scalar Laplacian (when $\mathcal{R} = \alpha = 0$) reported in [18] and obtained by completely different methods.


4.4. Index theorem. We can now apply this result for the calculation of the index of the Dirac operator on a spinor bundle on compact manifolds,
\[ D = \gamma^\mu \nabla_\mu. \tag{4.76} \]
Let the dimension $n$ of the manifold be even and
\[ \Gamma = \frac{i^{n(n-1)/2}}{n!}\, \varepsilon^{a_1 \ldots a_n} \gamma_{[a_1} \cdots \gamma_{a_n]} \tag{4.77} \]
be the chirality operator of the spinor representation so that
\[ \Gamma^2 = I_S \qquad \text{and} \qquad \Gamma\gamma_a = -\gamma_a\Gamma. \tag{4.78} \]
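The properties (4.78) can be verified directly in the lowest even dimension. The sketch below assumes $n = 2$ and takes the Pauli matrices $\sigma_1$, $\sigma_2$ as a Euclidean representation of the Dirac matrices; this choice of representation is an assumption for illustration, since the paper's conventions for $\gamma_a$ are fixed in Eq. (2.11), outside this section. For $n = 2$ the definition (4.77) reduces to $\Gamma = \frac{i}{2}(\gamma_1\gamma_2 - \gamma_2\gamma_1)$.

```python
def mat_mul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(a, b, sign=1):
    """Entrywise a + sign*b."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, a):
    """Scalar multiple c*a."""
    return [[c * v for v in row] for row in a]

# Pauli matrices used as Euclidean gamma matrices in n = 2 (assumed representation).
GAMMA_1 = [[0, 1], [1, 0]]
GAMMA_2 = [[0, -1j], [1j, 0]]

def chirality_operator():
    """Gamma = i^(n(n-1)/2)/n! * eps^{ab} gamma_a gamma_b for n = 2."""
    commutator = mat_add(mat_mul(GAMMA_1, GAMMA_2), mat_mul(GAMMA_2, GAMMA_1), sign=-1)
    return mat_scale(1j / 2, commutator)

def is_zero(a, tol=1e-14):
    """True when every entry of a vanishes within tol."""
    return all(abs(v) < tol for row in a for v in row)

if __name__ == "__main__":
    gamma = chirality_operator()
    identity = [[1, 0], [0, 1]]
    print(gamma)
    print(is_zero(mat_add(mat_mul(gamma, gamma), identity, sign=-1)))
    for g in (GAMMA_1, GAMMA_2):
        print(is_zero(mat_add(mat_mul(gamma, g), mat_mul(g, gamma))))
```

In this representation $\Gamma = -\sigma_3$, which squares to the identity and anticommutes with both $\sigma_1$ and $\sigma_2$.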

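Then the index of the Dirac operator is equal to
\[ \mathrm{Ind}(D) = \mathrm{Tr}_{L^2}\, \Gamma \exp(tD^2). \tag{4.79} \]
In this case the generators $\mathcal{R}_i$ have the form
\[ \mathcal{R}_i = -\frac14\, D^a{}_{ib}\, \gamma^b{}_a \otimes I_T + I_S \otimes T_i, \tag{4.80} \]
and the Casimir operator of the holonomy group in the spinor representation is obtained by using (2.13),
\[ \mathcal{R}^2 = \frac18 R\, I_S + I_S \otimes T^2 - \frac12\, E^j{}_{ab}\, \gamma^{ab} \otimes T_j. \tag{4.81} \]
Now, by using Eqs. (2.11), (2.17), (2.13) and (2.141) we compute the square of the Dirac operator
\[ D^2 = \Delta - \frac14 R\, I_S - \frac12\, E^i{}_{ab} T_i\, \gamma^{ab} + \frac12\, \gamma^{ab} B_{ab}, \tag{4.82} \]
and, finally, the index
\[
\begin{aligned}
\mathrm{Ind}(D) = {}& \int_M d\mathrm{vol}\, (4\pi t)^{-n/2}\, \mathrm{tr}_V\, \Gamma \det{}_{TM}\left(\frac{\sinh(t\mathcal{B})}{t\mathcal{B}}\right)^{-1/2} \exp\left[\left(-\frac14 R + \frac16 R_H - T^2 + \frac12 B_{ab}\gamma^{ab}\right) t\right] \\
& \times \int_{\mathbb{R}^p_{\rm reg}} \frac{d\omega}{(4\pi t)^{p/2}}\, |\beta|^{1/2} \exp\left(-\frac{1}{4t}\langle \omega, \beta\omega\rangle\right) \cosh\left(-\frac14\, \omega^i D^a{}_{ib}\, \gamma^b{}_a + \omega^i T_i\right) \\
& \times \det{}_H\left(\frac{\sinh[F(\omega)/2]}{F(\omega)/2}\right)^{1/2} \det{}_{TM}\left(\frac{\sinh[D(\omega)/2]}{D(\omega)/2}\right)^{-1/2}.
\end{aligned}
\tag{4.83}
\]
Since the index does not depend on $t$, the right-hand side of this equation does not depend on $t$. By expanding it in an asymptotic power series in $t$, we see that the index is equal to
\[ \mathrm{Ind}(D) = (4\pi)^{-n/2} \int_M d\mathrm{vol}\; \mathrm{tr}_V\, \Gamma\, a_{n/2}. \tag{4.84} \]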

5. Conclusion We have continued the study of the heat kernel on homogeneous spaces initiated in [6–10]. In those papers we have developed a systematic technique for calculation of the heat kernel in two cases: a) a Laplacian on a vector bundle with a parallel curvature over a flat space [6,9], and b) a scalar Laplacian on manifolds with parallel curvature [8,10]. What was missing in that study was the case of a non-scalar Laplacian on vector bundles with parallel curvature over curved manifolds with parallel curvature. In the present paper we considered the Laplacian on a homogeneous bundle and generalized the technique developed in [10] to compute the corresponding heat semigroup and the heat kernel. It is worth pointing out that our formal result applies to general symmetric spaces by making use of the regularization and the analytical continuation procedure described above. Of course, the heat kernel coefficients are just polynomials in the curvature and do not depend on this kind of analytical continuation (for more detail, see [10]). As we mentioned above, due to existence of multiple closed geodesics the obtained form of the heat kernel for compact symmetric spaces requires an additional regularization, which consists simply in an analytical continuation of the result from the complexified noncompact case. In any case, it gives a generating function for all heat invariants and reproduces correctly the whole asymptotic expansion of the heat kernel diagonal. However, since there are no closed geodesics on non-compact symmetric spaces, it seems that the analytical continuation of the obtained result for the heat kernel diagonal should give the exact result for the non-compact case, and, even more generally, for the general case too. We have seen on the example of the two-sphere that our method gives not just the asymptotic expansion of the heat kernel diagonal but, after an appropriate regularization, in fact, an exact result for the heat kernel diagonal. 
References 1. Anderson, A., Camporesi, R.: Intertwining operators for solving differential equations with applications to symmetric spaces. Commun. Math. Phys. 130, 61–82 (1990) 2. Avramidi, I.G.: Covariant methods for the calculation of the effective action in quantum field theory and investigation of higher-derivative quantum gravity. PhD Thesis, Moscow State University (1987), http://arXiv.org/abs/hep-th/9510140, 1995 3. Avramidi, I.G.: Background field calculations in quantum field theory (vacuum polarization). Teor. Mat. Fiz. 79, 219–231 (1989) 4. Avramidi, I.G.: The covariant technique for calculation of the heat kernel asymptotic expansion. Phys. Lett. B 238, 92–97 (1990) 5. Avramidi, I.G.: A covariant technique for the calculation of the one-loop effective action. Nucl. Phys. B 355, 712–754 (1991); Erratum: Nucl. Phys. B 509, 557–558 (1998) 6. Avramidi, I.G.: A new algebraic approach for calculating the heat kernel in gauge theories. Phys. Lett. B 305, 27–34 (1993) 7. Avramidi, I.G.: Covariant methods for calculating the low-energy effective action in quantum field theory and quantum gravity. University of Greifswald (March, 1994), http://arXiv.org/abs/gr-qc/9403036, 1994 8. Avramidi, I.G.: The heat kernel on symmetric spaces via integrating over the group of isometries. Phys. Lett. B 336, 171–177 (1994) 9. Avramidi, I.G.: Covariant algebraic method for calculation of the low-energy heat kernel. J. Math. Phys. 36, 5055–5070 (1995); Erratum: J. Math. Phys. 39, 1720 (1998) 10. Avramidi, I.G.: A new algebraic approach for calculating the heat kernel in quantum gravity. J. Math. Phys. 37, 374–394 (1996) 11. Avramidi, I.G.: Covariant approximation schemes for calculation of the heat kernel in quantum field theory. In: Quantum Gravity, Eds. V.A. Berezin, V.A. Rubakov, D.V. Semikoz, Singapore: World Scientific, 1998, pp. 61–78 12. Avramidi, I.G.: Covariant techniques for computation of the heat kernel. Rev. Math. Phys. 11, 947–980 (1999)


13. Avramidi, I.G.: Heat Kernel and Quantum Gravity. Lecture Notes in Physics, Series Monographs, LNP:64, Berlin: Springer-Verlag, 2000 14. Avramidi, I.G.: Heat kernel approach in quantum field theory. Nucl. Phys. Proc. Suppl. 104, 3–32 (2002) 15. Avramidi, I.G.: Heat kernel asymptotics on symmetric spaces. Comm. Math. Anal. Conf. 1, 1–10 (2008) 16. Barut, A.O., Raszka, R.: Theory of Group Representations and Applications. Warszawa: PWN, 1977 17. Berline, N., Getzler, E., Vergne, M.: Heat Kernels and Dirac Operators. Berlin: Springer-Verlag, 1992 18. Camporesi, R.: Harmonic analysis and propagators on homogeneous spaces. Phys. Rep. 196, 1–134 (1990) 19. Dowker, J.S.: When is the “sum over classical paths” exact?. J. Phys. A 3, 451–461 (1970) 20. Dowker, J.S.: Quantum mechanics on group space and Huygen’s principle. Ann. Phys. (USA) 62, 361–382 (1971) 21. Erdélyi A., Magnus W., Oberhettinger F., Tricomi F.G.: Higher Transcendental Functions. (New York: McGraw-Hill, 1953), vol. I 22. Fegan, H.D.: The fundamental solution of the heat equation on a compact Lie group. J. Diff. Geom. 18, 659–668 (1983) 23. Gilkey, P.B.: The spectral geometry of Riemannian manifold. J. Diff. Geom. 10, 601–618 (1975) 24. Gilkey, P.B.: Invariance Theory, the Heat Equation and the Atiyah-Singer Index Theorem. Boca Raton FL: CRC Press, 1995 25. Helgason, S.: Groups and Geometric Analysis: Integral Geometry, Invariant Differential Operators, and Spherical Functions. Mathematical Surveys and Monographs, Vol. 83, Providence, RI: Amer. Math. Soc., 2002, p. 270 26. Hurt, N.E.: Geometric Quantization in Action: Applications of Harmonic Analysis in Quantum Statistical Mechanics and Quantum Field Theory. Dordrecht: D. Reidel Publishing, Holland, 1983 27. Kirsten, K.: Spectral Functions in Mathematics and Physics. Boca Raton FL: CRC Press, 2001 28. Ruse, H., Walker, A.G., Willmore, T.J.: Harmonic Spaces. Roma: Edizioni Cremonese, 1961 29. Takeuchi, M.: Lie Groups II. 
In: Translations of Mathematical Monographs. Vol. 85, Providence, RI: Amer. Math. Soc., 1991, p.167 30. Van de Ven, A.E.M.: Index free heat kernel coefficients. Class. Quant. Grav. 15, 2311–2344 (1998) 31. Vassilevich, D.V.: Heat kernel expansion: user’s manual. Phys. Rep. 388, 279–360 (2003) 32. Wolf, J.A.: Spaces of Constant Curvature. University of California, Berkeley, 1972 33. Yajima, S., Higasida, Y., Kawano, K., Kubota, S.-I., Kamo, Y., Tokuo, S.: Higher coefficients in asymptotic expansion of the heat kernel. Phys. Rep. Kumamoto Univ. 12(1), 39–62 (2004) Communicated by A. Connes




Reflectionless Herglotz Functions and Jacobi Matrices

Alexei Poltoratski1, Christian Remling2

1 Mathematics Department, Texas A&M University, College Station, TX 77843, USA.

E-mail: [email protected]

2 Mathematics Department, University of Oklahoma, Norman, OK 73019, USA.

E-mail: [email protected]

Received: 28 May 2008 / Accepted: 18 August 2008
Published online: 16 December 2008 – © Springer-Verlag 2008

Abstract: We study several related aspects of reflectionless Jacobi matrices. First, we discuss the singular part of the corresponding spectral measures. We then show how to identify sets on which measures are reflectionless by looking at the logarithmic potentials of these measures.

A. P.'s work is supported in part by NSF grant DMS 0800300.

1. Introduction

We study several aspects of reflectionless Jacobi matrices and Herglotz functions in this paper. This is part of a larger program; the (perhaps too ambitious) goal is to reach a systematic understanding of the absolutely continuous spectrum of Jacobi operators $J$ on $\ell^2(\mathbb{Z}_+)$,

$$(Ju)(n) = a(n)u(n+1) + a(n-1)u(n-1) + b(n)u(n).$$

We will always assume that the coefficients $a$, $b$ satisfy bounds of the form $(C+1)^{-1} \le a(n) \le C+1$, $|b(n)| \le C$, for some $C > 0$. Note that if $J$ has some absolutely continuous spectrum, then, by the decoupling argument of Dombrowski and Simon-Spencer [3,17], it actually suffices to assume that $a(n)$ is bounded above; the other two inequalities follow automatically.

Let us recall some definitions. A Herglotz function is a holomorphic mapping of $\mathbb{C}^+ = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$ to itself. We denote the set of Herglotz functions by $H$. If $F \in H$, then $F(t) \equiv \lim_{y\to 0+} F(t+iy)$ exists for (Lebesgue) almost every $t \in \mathbb{R}$. We call $F$ reflectionless (on $E \subset \mathbb{R}$) if

$$\operatorname{Re} F(t) = 0 \quad \text{for almost every } t \in E. \tag{1.1}$$
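The two objects just introduced can be tried out numerically. The following is a minimal sketch, not taken from the paper: it builds a finite truncation of a Jacobi matrix (the truncation size `N`, the constant coefficients, and the helper names `jacobi_matrix` and `green_diagonal` are all illustrative assumptions) and checks that a diagonal resolvent entry $\langle \delta_n, (J-z)^{-1}\delta_n\rangle$ has positive imaginary part at sample points of the upper half plane, as a Herglotz function must.

```python
import numpy as np

def jacobi_matrix(a, b):
    """Finite truncation of (Ju)(n) = a(n)u(n+1) + a(n-1)u(n-1) + b(n)u(n).

    a: off-diagonal coefficients (length N-1), b: diagonal coefficients (length N).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.diag(b) + np.diag(a, 1) + np.diag(a, -1)

def green_diagonal(J, n, z):
    """Diagonal resolvent entry <delta_n, (J - z)^{-1} delta_n> of the finite matrix J."""
    e = np.zeros(J.shape[0], dtype=complex)
    e[n] = 1.0
    return np.linalg.solve(J - z * np.eye(J.shape[0]), e)[n]

# Hypothetical example: truncated free Jacobi matrix, a = 1, b = 0.
N = 200
J = jacobi_matrix(np.ones(N - 1), np.zeros(N))

# Sample the resolvent entry slightly above the real axis.
samples = [green_diagonal(J, N // 2, x + 0.05j) for x in np.linspace(-3, 3, 13)]
herglotz_ok = all(g.imag > 0 for g in samples)
```

The positivity holds for any real symmetric matrix, so this only illustrates the Herglotz property; it does not test reflectionlessness, which concerns boundary values of the infinite-volume function.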


We will also use the notation $N(E) = \{F \in H : F \text{ reflectionless on } E\}$. Herglotz functions have unique representations of the form

$$F(z) = F_\mu(z) = a + bz + \int_{-\infty}^{\infty} \left( \frac{1}{t-z} - \frac{t}{t^2+1} \right) d\mu(t), \tag{1.2}$$

with $a \in \mathbb{R}$, $b \ge 0$, and a (positive) Borel measure $\mu$ on $\mathbb{R}$ with $\int_{\mathbb{R}} \frac{d\mu(t)}{t^2+1} < \infty$. We will call such a measure $\mu$ reflectionless (on $E$) if $F_\mu \in N(E)$ for some choice of $a \in \mathbb{R}$, $b \ge 0$; for easier reference, it will also be convenient to introduce the notation $R(E) = \{\mu : \mu \text{ reflectionless on } E\}$. We emphasize again that in particular $\int_{\mathbb{R}} \frac{d\mu(t)}{t^2+1} < \infty$ for all $\mu \in R(E)$. Also, if $\mu \in R(E)$, then $F_\mu$ will refer to the unique Herglotz function $F_\mu \in N(E)$ that is associated with $\mu$ as in (1.2).

There are several reasons for being interested in the class $N(E)$; here, our main motivation is provided by the following fact: Call a (whole line) Jacobi matrix $J$ reflectionless (on $E$) if $g_n \in N(E)$ for all $n \in \mathbb{Z}$, where $g_n(z) = \langle \delta_n, (J-z)^{-1}\delta_n \rangle$ is the $n$th diagonal element of the resolvent of $J$ (also known as the Green function). Then [13, Theorem 1.4] says that all $\omega$ limit points of a Jacobi matrix $J$ with some absolutely continuous spectrum are reflectionless on $E = \Sigma_{ac}$; here, $\Sigma_{ac}$ denotes an essential support of the absolutely continuous part of the spectral measure $\rho$ of $J$. This is defined up to sets of Lebesgue measure zero; we can obtain a representative as $\Sigma_{ac} = \{t : d\rho/dt > 0\}$. Please see [13] for the details.

If $\mu \in R(E)$, then $\chi_E\,dt \ll d\mu_{ac}$. Indeed, this follows immediately from (1.1) because $d\mu_{ac}(t) = (1/\pi)\operatorname{Im} F(t)\,dt$ and the boundary value of a Herglotz function cannot be zero on a set of positive measure. However, it is not so clear in general if $\mu$ can also have a singular part on $E$. See [4–6] for earlier work on this question. We have the following criterion. We say that a (positive) measure $\nu$ is supported by a (measurable) set $S$ if $\nu(S^c) = 0$; supports are not assumed to be closed in this paper.

Theorem 1.1. Let $\mu \in R(E)$. Then:
(a) $\mu_s$, the singular part of $\mu$, is supported by

$$\left\{ x \in \mathbb{R} : \lim_{h\to 0+} \frac{|E \cap (x-h, x+h)|}{2h} = 0 \right\}.$$

(b) Let $\theta \in L^\infty(E)$ be an arbitrary bounded measurable function. Then $\mu_s$ is also supported by

$$\{ x \in \mathbb{R} : (H_E\theta)(x) \text{ exists} \}.$$


Here, we define $H_E f$ as

$$(H_E f)(x) = \lim_{y\to 0+} \int_E \left( \frac{t-x}{(t-x)^2+y^2} - \frac{t}{t^2+1} \right) f(t)\,dt,$$

if the limit exists. This is closely related to the Hilbert transform

$$(Hf)(x) = \lim_{y\to 0+} \int_{|t-x|>y} \frac{f(t)}{t-x}\,dt.$$

For instance, if $f \in L^1(\mathbb{R})$, then both $H_E f$ and $H(\chi_E f)$ exist (Lebesgue) almost everywhere and define the same function (almost everywhere), up to an additive constant. Here, we are interested in the singular part of $\mu$, so sets of Lebesgue measure zero do matter, and we distinguish between the two transforms. However, for a bounded, integrable function, the difference between the integrals defining $H_E$ and $H$ stays bounded, so we also obtain the following variant of Theorem 1.1(b):

(c) Let $\theta \in L^\infty(E)$. Then $\mu_s$ is supported by

θ (t) χ E (t) dt < ∞ . x ∈ R : sup 0 0. In this case, we call d the approximative derivative of f at x, and we write (Dap f )(x) = d. Please see [1,14,20] for much more on this and related topics. Theorem 1.4. For almost all x ∈ R, we have that (Dap γ )(x) = −Re g(x).

(1.3)

In particular, γ is approximately differentiable almost everywhere. This is a development of [9, Theorem 5.4]. See also [10] for subsequent work inspired by the same result. Theorem 1.4 may be viewed as a result on interchanging limits. Indeed, Re g(x + i y) = −∂x γ (x + i y) for y > 0, so, for almost every x ∈ R, Re g(x) = − lim y→0+ ∂x γ (x +i y). This raises the question of whether it is possible to perform these operations in the opposite order; in other words, can we first take the boundary value of γ to obtain γ (x) and then take the derivative? Theorem 1.4 provides an affirmative answer if the derivative is taken in the approximate sense. However, for us here, Theorem 1.4 is significant mainly because it identifies sets on which g is reflectionless; in particular, this set will contain the points of constancy of γ . More precisely, we obtain the following:



Corollary 1.5. Let

$$K = \left\{ c \in \mathbb{R} : |\gamma^{-1}(\{c\})| > 0 \right\}, \qquad C = \gamma^{-1}(K).$$

Then $g \in N(C)$.

Proof. $K$ is countable and thus $C$ is an at most countable union of sets $C_j$ of the form $C_j = \gamma^{-1}(\{c_j\})$. Almost every point of $C_j$ is a point of density, and at such points, clearly $D_{ap}\gamma = 0$. Theorem 1.4 now gives the corollary.

So, first of all, we recover the result from [9,10] that the equilibrium measure of a compact set $K \subset \mathbb{R}$ is reflectionless on $K$ (see also [12] for information on potential theory). More importantly, Theorem 1.4 and Corollary 1.5 may be used to identify additional examples of reflectionless measures, and they unify these results. We will not pursue this theme in detail here, but just make a few quick remarks on how to proceed. Introduce density of states measures $\nu$ as follows:

$$d\nu(t) = \lim_{j\to\infty} \frac{1}{N_j} \sum_{n=1}^{N_j} d\|E(t)\delta_n\|^2.$$

Here, $E$ denotes the spectral resolution of the Jacobi matrix $J$, and the limit is taken in the weak-∗ sense and on a suitable subsequence $N_j \to \infty$. By the Banach-Alaoglu Theorem, such limits always exist; of course, $\nu$ could depend on the choice of the sequence $N_j$. We also remark that there are other ways to obtain $d\nu$: one can use the eigenvalues of truncated problems or Christoffel-Darboux kernels; see [16, Theorem 1.5] for further discussion of this. By (a generalized version of) the Thouless formula, the logarithmic potential $\gamma$ will equal the (generalized) Lyapunov exponent, up to a constant. Therefore, Corollary 1.5 now says that $\nu$ will be reflectionless on sets of constancy of the Lyapunov exponent; from this in turn, it also follows that $\nu \in R(\Sigma_{ac})$, $g \in N(\Sigma_{ac})$, where $\Sigma_{ac}$ again denotes an essential support of the absolutely continuous part of the spectral measure of $J$.

The proofs of the results discussed in this introduction will be given in the following two sections, and we split the material in the obvious way: We will prove Theorems 1.1, 1.2, and 1.3 in Sect. 2, and Sect. 3 has the proof of Theorem 1.4.

2. The Singular Part of Reflectionless Measures

Please recall the notation $F_\mu$ introduced in (1.2). Also, if $f \ge 0$ is a Borel function, then, as expected, $f\mu$ will denote the measure $(f\mu)(A) = \int_A f\,d\mu$. The following result from [11] will be our main tool in this section.

Theorem 2.1. ([11])

$$\lim_{y\to 0+} \frac{F_{f\mu}(x+iy)}{F_\mu(x+iy)} = f(x)$$

for $\mu_s$-almost every $x \in \mathbb{R}$.



A clarifying comment is in order: Given $\nu$, the function $F_\nu$ is of course not completely determined yet (we don't know $a$, $b$). This, however, is not an issue here; the statement from the theorem holds for all such functions. This follows because $|F_\mu(x+iy)| \to \infty$ as $y \to 0+$ for $\mu_s$-almost every $x \in \mathbb{R}$. In Theorem 2.1, we of course implicitly assume that $1/(t^2+1)$ is integrable for all measures involved here. See also [2,7] for further discussion of this theorem. We will also use the following consequence of Theorem 2.1.

Proposition 2.2. Suppose that $\rho = \rho_s$ and $\sigma \perp \rho$. Then

$$\lim_{y\to 0+} \frac{F_\sigma(x+iy)}{F_\rho(x+iy)} = 0$$

for $\rho$-almost every $x \in \mathbb{R}$.

Proof. Pick a Borel set $T \subset \mathbb{R}$ with $\rho(T^c) = \sigma(T) = 0$, and abbreviate $\rho + \sigma = \mu$. Then

$$\frac{F_\sigma}{F_\rho} = \frac{F_{\chi_{T^c}\mu}}{F_{\chi_T\mu}} = \frac{F_{\chi_{T^c}\mu}}{F_\mu} \cdot \frac{F_\mu}{F_{\chi_T\mu}},$$

and $F_{\chi_{T^c}\mu}/F_\mu \to \chi_{T^c}$ $\mu_s$-almost everywhere by Theorem 2.1. In particular, this ratio goes to zero $\rho$-almost everywhere. Similarly, $F_{\chi_T\mu}/F_\mu \to 1$ $\rho$-almost everywhere, so the proposition follows.

Proof of Theorem 1.1. Let $\mu \in R(E)$. Write $F_\mu$ for the associated Herglotz function $F_\mu \in N(E)$, as in (1.2), and let $\xi$ be the Krein function of $F_\mu$, that is,

$$\xi(x) = \frac{1}{\pi} \lim_{y\to 0+} \operatorname{Im} \ln F_\mu(x+iy),$$

where we take the logarithm with $0 < \operatorname{Im} \ln w < \pi$ for $w \in \mathbb{C}^+$. Since $\ln F_\mu$ is a Herglotz function, the limit defining $\xi$ exists almost everywhere and $0 \le \xi(x) \le 1$. If, conversely, a measurable function $\zeta$ with values in $[0,1]$ is given, then $\zeta$ is the Krein function of some Herglotz function $G$. We can in fact recover $\ln G$, up to an additive real constant, from $\zeta$, using the Herglotz representation of $\ln G$. Here, we make use of the fact that since $\ln G$ has bounded imaginary part, the associated measure is purely absolutely continuous.

The condition that $F_\mu \in N(E)$ means that $\xi = 1/2$ (almost everywhere) on $E$. Given an arbitrary function $\theta \in L^\infty(\mathbb{R})$, with $-1 \le \theta \le 1$ and $\theta = 0$ on $E^c$, we can therefore introduce two new Krein functions $\xi_\pm$, as follows:

$$\xi_\pm(x) = \xi(x) \pm \frac{1}{2}\theta(x).$$

As just explained, this also defines two new Herglotz functions $F_\pm$, up to multiplicative constants. We fix these constants by demanding that $|F_\pm(i)| = |F_\mu(i)|$. Call the measures associated with these functions $\mu_+$ and $\mu_-$, respectively. Since $\xi = (\xi_+ + \xi_-)/2$, we then have that

$$F_\mu^2 = F_{\mu_+} F_{\mu_-}. \tag{2.1}$$
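To make the Krein function concrete, here is a small numerical sketch. It is an illustration under stated assumptions, not part of the proof: the helper `krein` approximates the boundary value by evaluating at a small fixed height `y` (an arbitrary choice), and the two test functions are standard examples chosen here, namely $F(z) = -1/z$ (the Herglotz function of a unit point mass at $0$) and $F(z) = -1/\sqrt{z^2-4}$ (the Green function of the free whole-line Jacobi matrix, with the branch fixed by the product of principal square roots).

```python
import numpy as np

def krein(F, x, y=1e-8):
    """Approximate xi(x) = (1/pi) * Im ln F(x + i0) by evaluating at height y.

    For F(x + iy) in the upper half plane, np.angle returns the branch
    with 0 < Im ln w < pi used in the text.
    """
    return np.angle(F(complex(x, y))) / np.pi

# Unit point mass at 0: F(z) = -1/z.
point_mass = lambda z: -1.0 / z

# Free Jacobi Green function -1/sqrt(z^2 - 4); the product of principal
# square roots selects the branch that behaves like z at infinity.
free_green = lambda z: -1.0 / (np.sqrt(z - 2) * np.sqrt(z + 2))

xi_right = krein(point_mass, 1.0)    # F < 0 on (0, inf), so xi is close to 1
xi_left = krein(point_mass, -1.0)    # F > 0 on (-inf, 0), so xi is close to 0
xi_band = [krein(free_green, x) for x in (-1.0, 0.0, 1.0)]  # close to 1/2
```

The point-mass example has Krein function $\chi_{(0,\infty)}$, while the free Green function has $\xi = 1/2$ on $(-2,2)$, which is exactly the reflectionless condition on that interval.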



Our first aim is to show that

$$\mu_s \ll \mu_\pm. \tag{2.2}$$

Suppose this were wrong and write µs = gµ+,s + ν, with ν ⊥ µ+,s , ν = 0. We can then find a Borel set T so that ν(T ) > 0, µ+,s (T ) = ν(T c ) = |T | = 0. Theorem 2.1 now shows that Fχ µ (x + i y) Fν (x + i y) = T s →1 Fµs (x + i y) Fµs (x + i y) for µs -almost every x ∈ T , and, similarly, Fχ (µ +ν) (x + i y) Fν (x + i y) = T + →1 Fµ+ +ν (x + i y) Fµ+ +ν (x + i y) for (µ+,s + ν)-almost every x ∈ T and thus also for µs -almost every x ∈ T . Put differently, this means that Fµ+ (x + i y) →0 Fν (x + i y) for µs -almost every x ∈ T . So on a set of positive µs -measure, Fµ+ (x + i y) Fµ+ Fν = → 0. Fµs (x + i y) Fν Fµs We also have that for µs -almost every x ∈ R, Fµ (x + i y) < ∞. sup − 0 0 2h

for all x ∈ E. This condition is much weaker than the following, which is used to define homogeneous sets: inf inf

x∈E 0 0. 2h

From the work of Sodin-Yuditskii [18] it was previously known that if E is a compact (strongly) homogeneous set and µ ∈ R0 (E), then µs = 0. By using Theorem 1.1, we can go considerably beyond this: Corollary 2.3. Suppose that E is a weakly homogeneous set. If µ ∈ R(E), then µs (E) = 0.



This is more general in two respects: $E$ is only assumed to be weakly homogeneous (rather than homogeneous), and we can treat measures from $R(E)$, not just from $R_0(E)$. This latter improvement, of course, can also be obtained from the general principle that we formulated as Theorem 1.3.

We now move on to proving Theorem 1.2. This will follow quickly from the following known characterization of the point part of $\mu$ in terms of the Krein function $\xi$ of $F_\mu$. See, for example, [8, p. 201]. We include a proof for the reader's convenience.

Lemma 2.4. $\mu(\{x\}) > 0$ if and only if

$$\int_{x-1}^{x+1} \frac{|\xi(t) - \chi_{(x,\infty)}(t)|}{|t-x|}\,dt < \infty. \tag{2.6}$$

Proof. First of all, we can recover the point part as

$$\mu(\{x\}) = -i \lim_{y\to 0+} y F_\mu(x+iy);$$

this is well known and follows quickly from the dominated convergence theorem. So $\mu(\{x\}) > 0$ if and only if

$$\limsup_{y\to 0+} \left( \operatorname{Re} \ln F_\mu(x+iy) + \ln y \right) > -\infty. \tag{2.7}$$

To slightly simplify the notation, we will now assume that $x = 0$. In terms of the Krein function $\xi$, the expression from (2.7) equals

$$\int_{-1}^{1} \frac{t}{t^2+y^2}\,\xi(t)\,dt - \int_y^1 \frac{dt}{t} + O(1) \qquad (y \to 0+).$$

By monotone convergence,

$$\int_{-1}^{0} \frac{t}{t^2+y^2}\,\xi(t)\,dt \to \int_{-1}^{0} \frac{\xi(t)}{t}\,dt$$

(and, of course, this limit could equal $-\infty$). Also,

$$\int_0^1 \frac{t}{t^2+y^2}\,\xi(t)\,dt - \int_y^1 \frac{dt}{t} = \int_y^1 \frac{t(\xi(t)-1)}{t^2+y^2}\,dt - \int_y^1 \frac{y^2}{t(t^2+y^2)}\,dt + \int_0^y \frac{t\,\xi(t)}{t^2+y^2}\,dt = \int_y^1 \frac{t(\xi(t)-1)}{t^2+y^2}\,dt + O(1),$$

and, by monotone convergence again,

$$\int_y^1 \frac{t(\xi(t)-1)}{t^2+y^2}\,dt \to \int_0^1 \frac{\xi(t)-1}{t}\,dt \ge -\infty.$$

These calculations have shown that (2.7), for $x = 0$, holds if and only if

$$\int_{-1}^{0} \frac{\xi(t)}{|t|}\,dt + \int_0^1 \frac{1-\xi(t)}{t}\,dt < \infty,$$

as asserted by the lemma.



Proof of Theorem 1.2. Suppose that condition (b) from Theorem 1.2 fails. Put

$$\xi(t) = \frac{1}{2}\chi_E(t) + \chi_{E^c \cap (x,\infty)}(t).$$

Let $F \in H$ be the corresponding Herglotz function. Since $\xi = 1/2$ on $E$, we have that $F \in N(E)$, but it is also clear that (2.6) holds, so the corresponding measure has a point mass at $x$. The converse is an immediate consequence of Theorem 1.1(c), with $\theta(t) = \operatorname{sgn}(t-x)$. Furthermore, we can also obtain this statement conveniently from Lemma 2.4, as follows: If $F_\mu \in N(E)$, then $\xi = 1/2$ almost everywhere on $E$, so the integrand from (2.6) equals $1/(2|t-x|)$ on $E \cap (x-1, x+1)$ and thus (2.6) cannot hold if we have condition (b) from Theorem 1.2.

Proof of Theorem 1.3. Let $\nu \in R(E)$. We claim that if $\mu \in R_0(E)$, then we must have that

$$\lim_{y\to 0+} \frac{F_\mu(x+iy)}{F_\nu(x+iy)} = 0 \tag{2.8}$$

for $\nu_s$-almost every $x \in \mathbb{R}$. Indeed, $\mu_s = 0$, $\mu(E^c) = 0$ by assumption, and, as discussed above, the condition that $\nu \in R(E)$ forces the absolutely continuous part of $\nu$ to be equivalent to $\chi_E\,dt$ on $E$. So $\mu \ll \nu$, $\mu_s = 0$, and thus (2.8) follows immediately from Theorem 2.1.

Starting from $\nu$, we will now construct a measure $\mu \in R_0(E)$ for which (2.8) cannot hold at any point $x \in E$. This will prove that $\nu_s(E) = 0$, as claimed. We will again work with the Krein functions; the following simple monotonicity property is at the heart of the matter.

Lemma 2.5. For $\xi \in L^\infty(a,b)$, $0 \le \xi \le 1$, and $x \notin [a,b]$, define

$$I_x(\xi) = \int_a^b \frac{\xi(t)\,dt}{t-x}.$$

Let $c = \int_a^b \xi(t)\,dt$. Then $I_x(\xi) \le I_x(\chi_{(a,a+c)})$ for all $x \notin [a,b]$.

Proof. It suffices to prove this for step functions $\xi$ because these are dense in $L^1$. So assume that $\xi = \sum_{j=1}^N s_j \chi_{I_j}$, with disjoint intervals $I_j$. If $(c, c+h)$ is such an interval of constancy of $\xi$ and $\xi = s$ on $(c, c+h)$, with $0 < s < 1$, then, as an elementary argument shows, $I_x(\xi)$ will go up if we redefine $\xi$ on $(c, c+h)$ as $\chi_{(c,c+sh)}$. Use this procedure on all intervals of constancy. Since $I_x(\xi)$ clearly also increases if we pass to the non-increasing rearrangement of $\xi$, we obtain the lemma.

Let $\xi$ be the Krein function of $F_\nu$, and, motivated by Lemma 2.5, define $\xi_0$ as follows: $\xi_0 = 1/2$ on $E$, and if $(a,b)$ is one of the bounded components of the open set $E^c$, set $\xi_0 = \chi_{(a,a+c)}$ on $(a,b)$, where $c = \int_a^b \xi\,dt$, as in the lemma. If $E^c$ has unbounded components, put $\xi_0 = 1$ (say) on these. Notice that $\xi_0$ is the Krein function of a Herglotz function $F_\mu$ whose associated measure satisfies $\mu \in R_0(E)$. Indeed, $\mu$ is reflectionless on $E$ because this property is equivalent to $\xi_0 = 1/2$ on $E$, and $\mu(E^c) = 0$ because $F_\mu(x) \equiv \lim_{y\to 0+} F_\mu(x+iy)$ exists and is real at all points of $E^c$, except possibly at the



jumps of ξ0 . However, these can’t be discrete points of µ either because in order for this to happen, ξ0 would have to jump from 0 to 1, not the other way around, by Lemma 2.4. Now fix x ∈ E and look at ln |Fµ /Fν |. As y → 0+, ξ0 (t) − ξ(t) ln |Fµ (x + i y)| − ln |Fν (x + i y)| = dt + O(1). (2.9) t−x y 2, then y/t−1 (3.1) = 2t + t ln s ds ≤ 2t + y ln(1 + y/t) ≤ 12y ln(1 + y/t). 1



Proof of Theorem 1.4. We will prove here that for almost every $x \in \mathbb{R}$, the one-sided right approximate derivative $(D^+_{ap}\gamma)(x)$ exists and (1.3) holds. The same argument can then be used to establish the corresponding statement about the left derivative, and these two statements together will give the full claim. Here, one-sided derivatives are defined in the obvious way; for example, we say that $(D^+_{ap} f)(x) = d$ if for all $\varepsilon > 0$,

$$\lim_{h\to 0+} \frac{1}{h} \left| \left\{ y \in (0,h) : \left| \frac{f(x+y)-f(x)}{y} - d \right| \ge \varepsilon \right\} \right| = 0.$$

Our basic strategy is modelled on the proof of [9, Theorem 5.4]. The following statements hold at (Lebesgue) almost every point $x \in \mathbb{R}$; here, we write again $\nu_s$ for the singular part of $\nu$, and we also use the notation $\nu(t) = \nu((-\infty, t))$.

• $x$ is a Lebesgue point of $\nu'(t)$;
• $\lim_{y\to 0+} g(x+iy)$ exists;
• $\lim_{y\to 0+} \nu_s([x-y, x+y])/y = 0$;
• $\lim_{y\to 0+} \operatorname{Im} g_s(x+iy) = 0$, where $g_s(z) = \int_{\mathbb{R}} \frac{d\nu_s(t)}{t-z}$.

We will now show that if $x$ has all these properties, then $(D^+_{ap}\gamma)(x)$ exists and (1.3) holds. So fix such an $x$. To simplify the notation, we will again assume that $x = 0$. The basic idea is to look at averages of

$$F(y) \equiv \operatorname{Re} g(iy) + \frac{\gamma(y) - \gamma(0)}{y}.$$

By the definitions of $g$ and $\gamma$, we have that $F(y) = \int_{\mathbb{R}} \varphi_y(t)\,d\nu(t)$, where

$$\varphi(t) = \frac{t}{t^2+1} + \ln\left| 1 - \frac{1}{t} \right|, \qquad \varphi_y(t) = \frac{1}{y}\,\varphi\!\left( \frac{t}{y} \right). \tag{3.2}$$

Recall that we assumed that $\nu$ has a finite logarithmic potential $\gamma(z)$ everywhere, so $\varphi_y$ is in $L^1(d\nu)$. Moreover, since $\varphi(t) = O(t^{-2})$ for large $|t|$, we have that $\varphi, \varphi_y \in L^1(\mathbb{R})$. For later use, we also observe that

$$\int_{-\infty}^{\infty} \varphi(t)\,dt = 0. \tag{3.3}$$

To prove this, look at $\int_{-R}^{R} \varphi$. Clearly, the first term from (3.2) is odd and thus doesn't contribute to this integral, and

$$\int_{-R}^{R} \ln\left| 1 - \frac{1}{t} \right| dt = \int_{-R}^{R} \ln|t-1|\,dt - \int_{-R}^{R} \ln|t|\,dt = \int_{-R-1}^{-R} \ln|t|\,dt - \int_{R-1}^{R} \ln|t|\,dt \to 0 \qquad (R \to \infty),$$

so we obtain (3.3).


Suppose now that $B_y$ is a family of Borel sets with the following properties:

$$B_y \subset [\delta y, y], \qquad |B_y| \ge \delta y, \tag{3.4}$$

for some fixed (but arbitrary) $0 < \delta < 1/2$. Define

$$\psi_y(t) = \frac{1}{|B_y|} \int_{B_y} \varphi_h(t)\,dh.$$

We now claim that

$$|\psi_y(t)| \lesssim \begin{cases} y^{-1}\ln(1 + y/|t|) & 0 < |t| \le 2y, \\ y/t^2 & |t| > 2y. \end{cases} \tag{3.5}$$

The constant implicit in (3.5) only depends on $\delta$. Indeed, for $|t| \le 2y$ this follows immediately from Lemma 3.1 and the obvious bound $|t|/(t^2+y^2) \le 2/y$ (if $|t| \le 2y$). If, on the other hand, $h \le y < |t|/2$, then Taylor's theorem shows that

$$|\varphi_h(t)| = \frac{h^2}{|t|(t^2+h^2)} + O(h/t^2) \lesssim \frac{y}{t^2},$$

and the second bound from (3.5) follows. Next, (3.3) and the Fubini-Tonelli Theorem imply that

$$\int_{-\infty}^{\infty} \psi_y(t)\,dt = 0.$$

Our next goal is to show that

$$\lim_{y\to 0+} \int_{-\infty}^{\infty} \psi_y(t)\,d\nu(t) = 0. \tag{3.6}$$

We rewrite this as

$$\left| \int_{\mathbb{R}} \psi_y(t)\,d\nu(t) \right| \le \left| \int_{\mathbb{R}} \psi_y(t)\,d\nu_s(t) \right| + \left| \int_{\mathbb{R}} \psi_y(t)\,\nu'(t)\,dt \right|. \tag{3.7}$$

Our first step will be to show that the first integral on the right-hand side of (3.7) goes to zero as y → 0. Start by considering the contributions coming from |t| > 2y: By (3.5), y ψ y (t) dνs (t) dνs (t) = Im gs (i y) → 0, 2 2 |t|>2y R t +y by our choice of x (= 0). Next, if > 0 is given, we can find η > 0 so that if h ≤ η, then νs ([−h, h]) ≤ h. If 2y < η, then this, (3.5), and the monotone convergence theorem imply that ∞ ψ y (t) dνs (t) = ψ y (t) dνs (t) |t|≤2y

n=0 2 ∞

1 y

−n y 2y, ψ y (t) is dominated by the Poisson kernel y/(t 2 + y 2 ), so this part goes to zero because x = 0 is a Lebesgue point of ν . For small |t|, on the other hand, we again have that the contributions coming from |t| ≈ 2−n y will be n2−n , and the sum over n is still . Let us summarize: We have shown that lim ψ y (t) dν(t) = 0. y→0

By unwrapping the definitions, we see that this means that

$$\lim_{y\to 0} \left( \frac{1}{|B_y|} \int_{B_y} \frac{\gamma(h) - \gamma(0)}{h}\,dh + \frac{1}{|B_y|} \int_{B_y} \operatorname{Re} g(ih)\,dh \right) = 0.$$

Since $g(ih)$ converges to $g(0)$, by the choice of $x = 0$ again, the second term converges to $\operatorname{Re} g(0)$, so we can also say that

$$\lim_{y\to 0} \frac{1}{|B_y|} \int_{B_y} \left( \frac{\gamma(h) - \gamma(0)}{h} + \operatorname{Re} g(0) \right) dh = 0, \tag{3.8}$$

and this holds for any choice of sets $B_y$ as in (3.4). This implies that the (right) approximate derivative of $\gamma$ at $x = 0$ exists and (1.3) holds. Indeed, if this were not true, then we could find $\delta, \varepsilon > 0$ and a sequence of sets $A_n \subset [0, y_n]$, with $y_n \to 0$, such that $|A_n| \ge 3\delta y_n$ and

$$\left| \frac{\gamma(h) - \gamma(0)}{h} + \operatorname{Re} g(0) \right| \ge \varepsilon$$

for all $h \in A_n$. But then we can also construct sets $B_n \subset [\delta y_n, y_n]$, $|B_n| \ge \delta y_n$, so that either

$$\frac{\gamma(h) - \gamma(0)}{h} + \operatorname{Re} g(0) \ge \varepsilon$$

for all $h \in B_n$ or $\ldots \le -\varepsilon$ for all $h \in B_n$. However, then (3.8) with $B_{y_n} = B_n$ leads to a contradiction, so we have to admit that the (one-sided) approximate derivative exists and (1.3) holds, as claimed.

References

1. Bruckner, A.M.: Differentiation of Real Functions. Lecture Notes in Mathematics 659, Berlin: Springer, 1978 2. Cima, J.A., Matheson, A.L., Ross, W.T.: The Cauchy Transform. Mathematical Surveys and Monographs 125, Providence, RI: Amer. Math. Soc., 2006 3. Dombrowski, J.: Quasitriangular matrices. Proc. Amer. Math. Soc. 69, 95–96 (1978) 4. Gesztesy, F., Krishna, M., Teschl, G.: On isospectral sets of Jacobi operators. Commun. Math. Phys. 181, 631–645 (1996)



5. Gesztesy, F., Yuditskii, P.: Spectral properties of a class of reflectionless Schrödinger operators. J. Funct. Anal. 241, 486–527 (2006) 6. Gesztesy, F., Zinchenko, M.: Local spectral properties of reflectionless Jacobi, CMV, and Schrödinger operators. http://arxiv.org/abs/0803.3177 [math.sp], 2008 7. Jaksic, V., Last, Y.: A new proof of Poltoratski’s theorem. J. Funct. Anal. 215, 103–110 (2004) 8. Martin, M., Putinar, M.: Lectures on Hyponormal Operators. Operator Theory: Advances and Applications 39, Basel: Birkhäuser Verlag, 1989 9. Melnikov, M., Poltoratski, A., Volberg, A.: Uniqueness theorems for Cauchy integrals. http://arxiv.org/abs/0704.0621 [math.sp], 2007 10. Nazarov, F., Volberg, A., Yuditskii, P.: Reflectionless measures with a point mass and singular continuous component. http://arxiv.org/abs/0711.0948 [math.sp], 2008 11. Poltoratski, A.: Boundary behavior of pseudocontinuable functions. St. Petersburg Math. J. 5, 389–406 (1994) 12. Ransford, T.: Potential Theory in the Complex Plane, London Mathematical Society Student Texts 28, Cambridge: Cambridge University Press, 1995 13. Remling, C.: The absolutely continuous spectrum of Jacobi matrices. http://arxiv.org/abs/0706.1101 [math.sp], 2007 14. Saks, S.: Theory of the Integral. Second revised edition, New York: Dover Publications, 1964 15. Simon, B.: Equilibrium measures and capacities in spectral theory. Inverse Probl. Imaging 1, 713–772 (2007) 16. Simon, B.: Weak convergence of CD kernels and applications. To appear in Duke Math. J. 17. Simon, B., Spencer, T.: Trace class perturbations and the absence of absolutely continuous spectra. Commun. Math. Phys. 125, 113–125 (1989) 18. Sodin, M., Yuditskii, P.: Almost periodic Jacobi matrices with homogeneous spectrum, infinitedimensional Jacobi inversion, and Hardy spaces of character-automorphic functions. J. Geom. Anal. 7, 387–435 (1997) 19. 
Stahl, H., Totik, V.: General Orthogonal Polynomials, Encyclopedia of Mathematics and its Applications 43, Cambridge: Cambridge University Press, 1992 20. Thomson, B.S.: Real Functions, Lecture Notes in Mathematics 1170, Berlin: Springer-Verlag, 1985 Communicated by B. Simon

Commun. Math. Phys. 288, 1023–1059 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0754-z



On the Mean-Field Limit of Bosons with Coulomb Two-Body Interaction

Jürg Fröhlich, Antti Knowles, Simon Schwarz

Institute of Theoretical Physics, ETH Hönggerberg, CH-8093 Zürich, Switzerland. E-mail: [email protected]; [email protected]; [email protected]

Received: 28 May 2008 / Accepted: 2 December 2008
Published online: 28 February 2009 – © Springer-Verlag 2009

Abstract: In the mean-field limit the dynamics of a quantum Bose gas is described by a Hartree equation. We present a simple method for proving the convergence of the microscopic quantum dynamics to the Hartree dynamics when the number of particles becomes large and the strength of the two-body potential tends to 0 like the inverse of the particle number. Our method is applicable for a class of singular interaction potentials including the Coulomb potential. We prove and state our main result for the Heisenberg-picture dynamics of "observables", thus avoiding the use of coherent states. Our formulation shows that the mean-field limit is a "semi-classical" limit.

1. Introduction

Whenever many particles interact by means of weak two-body potentials, one expects that the potential felt by any one particle is given by an average potential generated by the particle density. In this mean-field regime, one hopes to find that the emerging dynamics is simpler and less encumbered by tedious microscopic information than the original N-body dynamics.

The mathematical study of such problems has quite a long history. In the context of classical mechanics, where the mean-field limit is described by the Vlasov equation, the problem was successfully studied by Braun and Hepp [3], as well as Neunzert [16]. The mean-field limit of quantum Bose gases was first addressed in the seminal paper [10] of Hepp. We refer to [6] for a short discussion of some subsequent results. The case with a Coulomb interaction potential was treated by Erdős and Yau in [6]. Recently, Rodnianski and Schlein [21] have derived explicit estimates for the rate of convergence to the mean-field limit, using the methods of [10] and [9]. A sharper bound on the rate of convergence in the case of a sufficiently regular interaction potential was derived by Schlein and Erdős [22], by using a new method inspired by Lieb-Robinson inequalities. In [7,15], the mean-field limit (N → ∞) and the classical limit were studied simultaneously. A conceptually quite novel approach to studying mean-field limits was introduced in [8].


In that paper, the time evolution of quantum and corresponding "classical" observables is studied in the Heisenberg picture, and it is shown that "time evolution commutes with quantisation" up to terms that tend to 0 in the mean-field ("classical") limit, which is an Egorov-type result.

In this paper we present a new, simpler way of handling singular interaction potentials. It yields an Egorov-type formulation of convergence to the mean-field limit, thus obviating the need to consider particular (traditionally coherent) states as initial conditions. Another, technical, advantage of our method is that it requires no regularity (traditionally $H^1$- or $H^2$-regularity) when applied to coherent states.

Such kinds of results were first obtained by Egorov [5] for the semi-classical limit of a quantum system. Roughly, the statement is that time-evolution commutes with quantisation in the semi-classical limit. We sketch this in a simple example: Let us start with a classical Hamiltonian system of a finite number $f$ of degrees of freedom. The classical algebra of observables $\mathfrak{A}$ is given by (some subalgebra of) the Abelian algebra of smooth functions on the phase space $\Gamma := \mathbb{R}^{2f}$. Let $H \in \mathfrak{A}$ be a Hamilton function. Together with the symplectic structure on $\Gamma$, $H$ generates a symplectic flow $\phi^t$ on $\Gamma$. Now we define a quantisation map $\widehat{(\cdot)} : \mathfrak{A} \to \widehat{\mathfrak{A}}$, where $\widehat{\mathfrak{A}}$ is some subalgebra of $B(L^2(\mathbb{R}^f))$. For concreteness, let $\widehat{(\cdot)}$ be Weyl quantisation with deformation parameter $\hbar$. This implies that

$$[\widehat{A}, \widehat{B}] = \frac{\hbar}{i}\,\widehat{\{A, B\}} + O(\hbar^2),$$

for $\hbar \to 0$. The quantised Hamilton function defines a 1-parameter group of automorphisms on $\widehat{\mathfrak{A}}$ through

$$\mathbf{A} \mapsto e^{it\widehat{H}/\hbar}\,\mathbf{A}\,e^{-it\widehat{H}/\hbar}, \qquad \mathbf{A} \in \widehat{\mathfrak{A}}.$$

An Egorov-type semi-classical result states that, for all $A \in \mathfrak{A}$ and $t \in \mathbb{R}$,

$$\widehat{(A \circ \phi^t)} = e^{it\widehat{H}/\hbar}\,\widehat{A}\,e^{-it\widehat{H}/\hbar} + R(t),$$

where $R(t) \to 0$ as $\hbar \to 0$. This approach identifies the semi-classical limit as the converse of quantisation.

In a similar fashion, we identify the mean-field limit as the converse of "second quantisation". In this case the deformation parameter is not $\hbar$, but $N^{-1}$, a parameter proportional to the coupling constant. We consider the mean-field dynamics (given by the Hartree equation in the case of bosons), and view it as the Hamiltonian dynamics of a classical Hamiltonian system. We show that its quantisation describes N-body quantum mechanics, and that the "semi-classical" limit corresponding to $N^{-1} \to 0$ takes us back to the Hartree dynamics.

We sketch the key ideas behind our strategy.

(1) Use the Schwinger-Dyson expansion to construct the Heisenberg-picture dynamics of p-particle operators $e^{itH_N} A_N(a^{(p)}) e^{-itH_N}$ (in the notation of Sect. 3).
(2) Use Kato smoothing plus combinatorial estimates (counting of graphs) to prove convergence of the Schwinger-Dyson expansion on N-particle Hilbert space, uniformly in $N$ and for small $|t|$. Diagrams containing $l$ loops yield a contribution of order $N^{-l}$.


(3) Use Kato smoothing plus combinatorial estimates to prove convergence of the iterative solution of the Hartree equation, for small $|t|$.
(4) Show that the Wick quantisation of the series in (3) is equal to the series of tree diagrams in (2).
(5) Extend (2) and (3) to arbitrary times by using unitarity and conservation laws.

This paper is organised as follows. In Sect. 2 we show that the classical Newtonian mechanics of point particles is the second quantisation of Vlasov theory, the latter being the mean-field (or "classical") limit of the former. The bulk of the paper is devoted to a rigorous analysis of the mean-field limit of Bose gases. In Sect. 3 we recall some important concepts of quantum many-body theory and introduce a general formalism which is convenient when dealing with quantum gases. Section 4 contains an implementation of Step (1) above. The convergence of the Schwinger-Dyson series for bounded interaction potentials is briefly discussed in Sect. 5. Section 6 implements Step (2) above. Steps (3), (4) and (5) are implemented in Sect. 7. Finally, Sect. 8 extends our results to more general interaction potentials as well as nonvanishing external potentials.

2. Mean-Field Limit in Classical Mechanics

In this section we consider the example of classical Newtonian mechanics to illustrate how the atomistic constitution of matter arises by quantisation of a continuum theory. The aim of this section is to give a brief and nonrigorous overview of some ideas that we shall develop in the context of quantum Bose gases, in full detail, in the following sections.

A classical gas is described as a continuous medium whose state is given by a nonnegative mass density $d\mu(x,v) = M f(x,v)\,dx\,dv$ on the "one-particle" phase space $\mathbb{R}^3 \times \mathbb{R}^3$. Here $M$ is the mass of one "mole" of gas; $\mu(A)$ is the mass of gas in the phase space volume $A \subset \mathbb{R}^3 \times \mathbb{R}^3$. Let $\int dx\,dv\, f(x,v) = \nu < \infty$ denote the number of "moles" of the gas, so that the total mass of the gas is $\mu(\mathbb{R}^3 \times \mathbb{R}^3) = \nu M$. An example of an equation of motion for $f(x,v)$ is the Vlasov equation

$$\partial_t f_t(x,v) = -(v \cdot \nabla_x f_t)(x,v) + \frac{1}{m}\,(\nabla V_{\mathrm{eff}}[f_t] \cdot \nabla_v f_t)(x,v), \tag{2.1}$$

where $m$ is a constant with the dimension of a mass, $t$ denotes time, and

$$V_{\mathrm{eff}}[f](x) = V(x) + \int dy\, W(x-y) \int dv\, f(y,v).$$

Here $V$ is the potential of external forces acting on the gas and $W$ is a (two-body) potential describing self-interactions of the gas.

The Vlasov equation arises as the mean-field limit of a classical Hamiltonian system of $n$ point particles of mass $m$, with trajectories $(x_i(t))_{i=1}^n$, moving in an external potential $V$ and interacting through two-body forces with potential $N^{-1} W(x_i - x_j)$. Here $N$ is the inverse coupling constant. We interpret $N$ as "Avogadro's number", i.e. as the number of particles per "mole" of gas. Thus, $M = mN$ and $n = \nu N$. More precisely, it is well-known (see [3,16]) that, under some technical assumptions on $V$ and $W$,

$$f_t(x,v) = \operatorname*{w^*\text{-}lim}_{n\to\infty} \frac{\nu}{n} \sum_{i=1}^n \delta(x - x_i(t))\,\delta(v - \dot{x}_i(t)) \tag{2.2}$$



exists for all times t and is the (unique) solution of (2.1), provided that this holds at time t = 0. Here, $f_t$ is viewed as an element of the dual space of continuous bounded functions. Note that n and N are, a priori, unrelated objects. While n is the number of particles in the classical Hamiltonian system, $N^{-1}$ is by definition the coupling constant. The mean-field limit is the limit n → ∞ while keeping n ∝ N; the proportionality constant is ν.

It is of interest to note that the Vlasov dynamics (2.1) may be interpreted as a Hamiltonian dynamics on an infinite-dimensional affine phase space $\Gamma_{\mathrm{Vlasov}}$. To see this, we write $f(x, v) = \bar\alpha(x, v)\,\alpha(x, v)$, where $\bar\alpha(x, v)$, $\alpha(x, v)$ are complex coordinates on $\Gamma_{\mathrm{Vlasov}}$. For our purposes it is enough to say that $\Gamma_{\mathrm{Vlasov}}$ is some dense subspace of $L^2(\mathbb{R}^6)$ (typically a weighted Sobolev space of index 1). On $\Gamma_{\mathrm{Vlasov}}$ we define a symplectic form through
\[ \omega = i \int dx\, dv\; d\bar\alpha(x, v) \wedge d\alpha(x, v). \]
This yields a Poisson bracket which reads
\[ \{\alpha(x, v), \alpha(y, w)\} = \{\bar\alpha(x, v), \bar\alpha(y, w)\} = 0, \qquad \{\alpha(x, v), \bar\alpha(y, w)\} = i\,\delta(x - y)\,\delta(v - w). \tag{2.3} \]

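A Hamilton function H is defined on $\Gamma_{\mathrm{Vlasov}}$ through
\[ \begin{aligned} H(\alpha) := {} & i \int dx\, dv\; \bar\alpha(x, v) \Bigl(-v \cdot \nabla_x + \frac{1}{m} \nabla V(x) \cdot \nabla_v\Bigr) \alpha(x, v) \\ & + \frac{i}{m} \int dx\, dv\; \bar\alpha(x, v) \int dy\, dw\; \nabla W(x - y)\, |\alpha(y, w)|^2 \cdot \nabla_v \alpha(x, v). \end{aligned} \tag{2.4} \]
Note that H is invariant under gauge transformations $\alpha \to e^{-i\theta} \alpha$, $\bar\alpha \to e^{i\theta} \bar\alpha$, which by Noether’s theorem implies that $\int |\alpha|^2\, dx\, dv = \int f\, dx\, dv$ is conserved. Let us abbreviate $K := -\nabla V/m$ and $F := -\nabla W/m$. After a short calculation using (2.3) we find that the Hamiltonian equation of motion $\dot\alpha_t(x, v) = \{H, \alpha_t(x, v)\}$ reads
\[ \begin{aligned} \dot\alpha_t(x, v) = {} & (-v \cdot \nabla_x - K(x) \cdot \nabla_v)\, \alpha_t(x, v) - \int dy\, dw\; F(x - y)\, |\alpha_t(y, w)|^2 \cdot \nabla_v \alpha_t(x, v) \\ & + \int dy\, dw\; F(x - y)\, \bar\alpha_t(y, w)\, \alpha_t(x, v) \cdot \nabla_w \alpha_t(y, w). \end{aligned} \tag{2.5} \]
Also, $\bar\alpha_t$ satisfies the complex conjugate equation. Therefore,
\[ \begin{aligned} \frac{d}{dt} |\alpha_t(x, v)|^2 = {} & (-v \cdot \nabla_x - K(x) \cdot \nabla_v)\, |\alpha_t(x, v)|^2 - \int dy\, dw\; F(x - y)\, |\alpha_t(y, w)|^2 \cdot \nabla_v |\alpha_t(x, v)|^2 \\ & + |\alpha_t(x, v)|^2 \int dy\, dw\; F(x - y) \cdot \bigl[\bar\alpha_t(y, w) \nabla_w \alpha_t(y, w) + \alpha_t(y, w) \nabla_w \bar\alpha_t(y, w)\bigr]. \end{aligned} \tag{2.6} \]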



We assume that
\[ |\alpha(x, v)| = o\bigl(|(x, v)|^{-1}\bigr), \qquad (x, v) \to \infty. \tag{2.7} \]

We shall shortly see that this property is preserved under time-evolution. By integration by parts, we see that the second line of (2.6) vanishes, and we recover the Vlasov equation of motion (2.1) for $f = |\alpha|^2$.

We comment briefly on the existence and uniqueness of solutions to the Hamiltonian equation of motion (2.5). Following Braun and Hepp [3], we assume that K and F are bounded and continuously differentiable with bounded derivatives. We use polar coordinates $\alpha = \beta e^{i\varphi}$, where $\varphi \in \mathbb{R}$ and $\beta \ge 0$. Then the Hamiltonian equation of motion (2.5) reads
\[ \dot\beta_t(x, v) = (-v \cdot \nabla_x - K(x) \cdot \nabla_v)\, \beta_t(x, v) - \int dy\, dw\; F(x - y)\, \beta_t^2(y, w) \cdot \nabla_v \beta_t(x, v), \tag{2.8a} \]
\[ \begin{aligned} \dot\varphi_t(x, v) = {} & (-v \cdot \nabla_x - K(x) \cdot \nabla_v)\, \varphi_t(x, v) - \int dy\, dw\; F(x - y)\, \beta_t^2(y, w) \cdot \nabla_v \varphi_t(x, v) \\ & + \int dy\, dw\; F(x - y)\, \beta_t^2(y, w) \cdot \nabla_w \varphi_t(y, w). \end{aligned} \tag{2.8b} \]
We consider two cases.
(i) $\varphi = 0$. In this case α = β and the equations of motion (2.8) are equivalent to the Vlasov equation for $f = \beta^2$. The results of [3,16] then yield a global well-posedness result.
(ii) $\varphi \neq 0$. The equation of motion (2.8a) is independent of φ. Case (i) implies that it has a unique global solution. In order to solve the linear equation (2.8b), we apply a contraction mapping argument. Consider the space $X := \{\varphi \in C(\mathbb{R}^6) : \nabla\varphi \in L^\infty(\mathbb{R}^6)\}$. Using Sobolev inequalities one finds that X, equipped with the norm $\|\varphi\|_X := |\varphi(0)| + \|\nabla\varphi\|_\infty$, is a Banach space. We rewrite (2.8b) as an integral equation, and using standard methods show that, for small times, it has a unique solution. Using conservation of $\int dx\, dv\, \beta_t^2$ we iterate this procedure to find a global solution. We omit further details.

Note that, as shown in [3], the solution $\beta_t$ can be written using a flow $\phi^t$ on the one-particle phase space: $\beta_t(x, v) = \beta_0(\phi^{-t}(x, v))$. The flow $\phi^t(x, v) = (x(t), v(t))$ satisfies
\[ \dot{x}(t) = v(t), \qquad \dot{v}(t) = K(x(t)) + \int dy\, dw\; \beta_t^2(y, w)\, F(x(t) - y). \]
Using conservation of $\int dx\, dv\, \beta_t^2$ we find that there is a constant C such that
\[ |\phi^{-t}(x, v)| \le |(x, v)|\,(1 + t) + C(1 + t^2). \]
Therefore (2.7) holds for all times t provided that it holds at time t = 0.

The Hamiltonian formulation of Vlasov dynamics can serve as a starting point to recover the atomistic Hamiltonian mechanics of point particles by quantisation: Replace
\[ \bar\alpha(x, v) \to \alpha_N^*(x, v), \qquad \alpha(x, v) \to \alpha_N(x, v), \]
where $\alpha_N^*$ and $\alpha_N$ are creation and annihilation operators acting on the bosonic Fock space $\mathcal{F}_+(L^2(\mathbb{R}^6))$; see Appendix A. They satisfy the canonical commutation relations (A.2); explicitly,
\[ [\alpha_N(x, v), \alpha_N(y, w)] = [\alpha_N^*(x, v), \alpha_N^*(y, w)] = 0, \qquad [\alpha_N(x, v), \alpha_N^*(y, w)] = \frac{1}{N}\, \delta(x - y)\, \delta(v - w). \tag{2.9} \]
Given a function A on $\Gamma_{\mathrm{Vlasov}}$ which is a polynomial in α and $\bar\alpha$, we define an operator $\widehat{A}_N$ on $\mathcal{F}_+$ by replacing $\alpha^\#$ with $\alpha_N^\#$ and Wick-ordering the resulting expression. We denote this quantisation map by $\widehat{(\cdot)}_N$. Here, $N^{-1}$ is the deformation parameter of the quantisation: We find that
\[ \bigl[\widehat{A}_N, \widehat{B}_N\bigr] = \frac{N^{-1}}{i}\, \widehat{\{A, B\}}_N + O(N^{-2}), \]
for N → ∞. Here A and B are polynomial functions on $\Gamma_{\mathrm{Vlasov}}$. The dynamics of a state $\Phi \in \mathcal{F}_+$ is given by the Schrödinger equation
\[ i N^{-1} \partial_t \Phi_t = \widehat{H}_N \Phi_t, \tag{2.10} \]

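where $\widehat{H}_N$ is the quantisation of the Vlasov Hamiltonian H. In order to identify the dynamics given by (2.10) with the classical dynamics of point particles, we study wave functions $\Phi^{(n)}(x_1, v_1, \dots, x_n, v_n)$ in the n-particle sector of $\mathcal{F}_+$, and interpret $\rho^{(n)} := |\Phi^{(n)}|^2$ as a probability density on the n-body classical phase space. If $\Omega \in \mathcal{F}_+$ denotes the vacuum vector annihilated by $\alpha_N(x, v)$ then
\[ \Phi^{(n)} = \frac{N^{n/2}}{\sqrt{n!}} \int dx_1\, dv_1 \cdots dx_n\, dv_n\; \Phi^{(n)}(x_1, v_1, \dots, x_n, v_n)\; \alpha_N^*(x_n, v_n) \cdots \alpha_N^*(x_1, v_1)\, \Omega. \]
It is a simple matter to check that (2.9) and (2.10) imply that
\[ \partial_t \Phi_t^{(n)} = \sum_{i=1}^{n} \Bigl(-v_i \cdot \nabla_{x_i} + \frac{1}{m} \nabla V(x_i) \cdot \nabla_{v_i}\Bigr) \Phi_t^{(n)} + \frac{1}{N} \sum_{1 \le i \neq j \le n} \frac{1}{m} \nabla W(x_i - x_j) \cdot \nabla_{v_i} \Phi_t^{(n)}. \]
Also, $\bar\Phi_t^{(n)}$ satisfies the same equation. Therefore,
\[ \partial_t \rho_t^{(n)} = \sum_{i=1}^{n} \Bigl(-v_i \cdot \nabla_{x_i} + \frac{1}{m} \nabla V(x_i) \cdot \nabla_{v_i}\Bigr) \rho_t^{(n)} + \frac{1}{N} \sum_{1 \le i \neq j \le n} \frac{1}{m} \nabla W(x_i - x_j) \cdot \nabla_{v_i} \rho_t^{(n)}. \]
This is the Liouville equation corresponding to the Hamiltonian equations of motion of n classical point particles,
\[ \partial_t x_i = v_i, \qquad m\, \partial_t v_i = -\nabla V(x_i) - \frac{1}{N} \sum_{j \neq i} \nabla W(x_i - x_j). \]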

Analogous results can be proven if α ∗N and α N are chosen to be fermionic creation and annihilation operators obeying the canonical anti-commutation relations and acting on the fermionic Fock space F− (L 2 (R6 )).
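The point-particle equations of motion above, with coupling 1/N, can be integrated numerically. The following is a minimal sketch and not part of the original argument: it works in one space dimension, and the pair potential W(r) = exp(-r^2), the harmonic external potential, the leapfrog integrator and all function names are assumptions chosen for illustration. The helper `empirical_average` pairs the empirical measure appearing in (2.2) with a test function.

```python
import math

def grad_W(r):
    # Assumed smooth, even pair potential W(r) = exp(-r^2); its gradient is odd in r.
    return -2.0 * r * math.exp(-r * r)

def grad_V(x, k=0.0):
    # Assumed external potential V(x) = k x^2 / 2 (k = 0 means no external force).
    return k * x

def forces(xs, N, k=0.0):
    # Force on particle i: -grad V(x_i) - (1/N) sum_{j != i} grad W(x_i - x_j).
    out = []
    for i, xi in enumerate(xs):
        f = -grad_V(xi, k)
        for j, xj in enumerate(xs):
            if j != i:
                f -= grad_W(xi - xj) / N
        out.append(f)
    return out

def leapfrog(xs, vs, N, m=1.0, k=0.0, dt=1e-3, steps=1000):
    # Velocity-Verlet integration of dx_i/dt = v_i, m dv_i/dt = force_i.
    xs, vs = list(xs), list(vs)
    f = forces(xs, N, k)
    for _ in range(steps):
        vs = [v + 0.5 * dt * fi / m for v, fi in zip(vs, f)]
        xs = [x + dt * v for x, v in zip(xs, vs)]
        f = forces(xs, N, k)
        vs = [v + 0.5 * dt * fi / m for v, fi in zip(vs, f)]
    return xs, vs

def empirical_average(xs, vs, g, nu=1.0):
    # Pairing of the empirical measure (nu/n) sum_i delta_{(x_i, v_i)} with a test function g.
    n = len(xs)
    return nu / n * sum(g(x, v) for x, v in zip(xs, vs))
```

With no external potential, the pair forces cancel in the total momentum, which gives a simple consistency check on the integrator; keeping n = νN fixed in ratio while increasing n is the mean-field scaling described in the text.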



3. Quantum Gases: The Setup

Although our main results are restricted to bosons, all of the following rather general formalism remains unchanged for fermions. We therefore consider both bosonic and fermionic statistics throughout Sects. 3 – 6. Details on systems of fermions will appear elsewhere.

Throughout the following we consider the one-particle Hilbert space $\mathcal{H} := L^2(\mathbb{R}^3, dx)$. We refer the reader to Appendix A for our choice of notation and a short discussion of many-body quantum mechanics.

In the following a central role is played by the p-particle operators, i.e. closed operators $a^{(p)}$ on $\mathcal{H}_\pm^{(p)} = P_\pm \mathcal{H}^{\otimes p}$, where $P_+$ and $P_-$ denote symmetrisation and anti-symmetrisation, respectively. When using second-quantised notation it is convenient to use the operator kernel of $a^{(p)}$. Here is what this means (see [18] for details): Let $\mathcal{S}(\mathbb{R}^d)$ be the usual Schwartz space of smooth functions of rapid decrease, and $\mathcal{S}'(\mathbb{R}^d)$ its topological dual. The nuclear theorem states that to every operator A on $L^2(\mathbb{R}^d)$, such that the map $(f, g) \mapsto \langle f, A g \rangle$ is separately continuous on $\mathcal{S}(\mathbb{R}^d) \times \mathcal{S}(\mathbb{R}^d)$, there belongs a tempered distribution (“kernel”) $\tilde{A} \in \mathcal{S}'(\mathbb{R}^{2d})$, such that
\[ \langle f, A g \rangle = \tilde{A}(\bar{f} \otimes g). \]
In the following we identify $\tilde{A}$ with A. In the suggestive physicist’s notation we thus have
\[ \langle f^{(p)}, a^{(p)} g^{(p)} \rangle = \int dx_1 \cdots dx_p\, dy_1 \cdots dy_p\; \bar{f}^{(p)}(x_1, \dots, x_p)\; a^{(p)}(x_1, \dots, x_p; y_1, \dots, y_p)\; g^{(p)}(y_1, \dots, y_p), \]
where $f^{(p)}, g^{(p)} \in \mathcal{S}(\mathbb{R}^{3p})$. It will be easy to verify that all p-particle operators that appear in the following satisfy the above condition; this is for instance the case for all bounded $a^{(p)} \in \mathcal{B}(\mathcal{H}^{\otimes p})$.

Next, we define second quantisation $A_N$. It maps a closed operator on $\mathcal{H}_\pm^{(p)}$ to a closed operator on $\mathcal{F}_\pm$ according to the formula
\[ A_N(a^{(p)}) := \int dx_1 \cdots dx_p\, dy_1 \cdots dy_p\; \psi_N^*(x_p) \cdots \psi_N^*(x_1)\; a^{(p)}(x_1, \dots, x_p; y_1, \dots, y_p)\; \psi_N(y_1) \cdots \psi_N(y_p). \tag{3.1} \]
Here $\psi_N^\# := \frac{1}{\sqrt{N}} \psi^\#$, where $\psi^\#$ is the usual creation or annihilation operator; see Appendix A.

In order to understand the action of $A_N(a^{(p)})$ on $\mathcal{H}_\pm^{(n)}$, we write
\[ \Phi^{(n)} = \frac{N^{n/2}}{\sqrt{n!}} \int dz_1 \cdots dz_n\; \Phi^{(n)}(z_1, \dots, z_n)\; \psi_N^*(z_n) \cdots \psi_N^*(z_1)\, \Omega \]



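and apply $A_N(a^{(p)})$ to the right side. By using the (anti)commutation relations (A.2) to pull the p annihilation operators $\psi_N(y_i)$ through the n creation operators $\psi_N^*(z_i)$, and $\psi_N(x)\Omega = 0$, we get the “first quantised” expression
\[ A_N(a^{(p)})\big|_{\mathcal{H}_\pm^{(n)}} = \begin{cases} \dfrac{p!}{N^p} \dbinom{n}{p}\, P_\pm \bigl(a^{(p)} \otimes 1^{(n-p)}\bigr) P_\pm, & n \ge p, \\[1ex] 0, & n < p. \end{cases} \tag{3.2} \]
This may be viewed as an alternative definition of $A_N(a^{(p)})$.

We define $\mathfrak{A}$ as the linear span of $\{A_N(a^{(p)}) : p \in \mathbb{N},\ a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})\}$. Then $\mathfrak{A}$ is a ∗-algebra of closable operators on $\mathcal{F}_\pm^0$ (see Appendix A). We list some of its important properties, whose straightforward proofs we omit.
(i) $A_N(a^{(p)})^* = A_N\bigl((a^{(p)})^*\bigr)$.
(ii) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ and $b^{(q)} \in \mathcal{B}(\mathcal{H}_\pm^{(q)})$, then
\[ A_N(a^{(p)})\, A_N(b^{(q)}) = \sum_{r=0}^{\min(p,q)} \binom{p}{r} \binom{q}{r} \frac{r!}{N^r}\, A_N\bigl(a^{(p)} \bullet_r b^{(q)}\bigr), \tag{3.3} \]
where
\[ a^{(p)} \bullet_r b^{(q)} := P_\pm \bigl(a^{(p)} \otimes 1^{(q-r)}\bigr) \bigl(1^{(p-r)} \otimes b^{(q)}\bigr) P_\pm \in \mathcal{B}\bigl(\mathcal{H}_\pm^{(p+q-r)}\bigr). \tag{3.4} \]
(iii) The operator $A_N(a^{(p)})$ leaves the n-particle subspaces $\mathcal{H}_\pm^{(n)}$ invariant.
(iv) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ and $b \in \mathcal{B}(\mathcal{H})$ is invertible, then
\[ \Gamma(b^{-1})\, A_N(a^{(p)})\, \Gamma(b) = A_N\bigl((b^{-1})^{\otimes p}\, a^{(p)}\, b^{\otimes p}\bigr), \tag{3.5} \]
where $\Gamma(b)$ is defined on $\mathcal{H}_\pm^{(n)}$ by $b^{\otimes n}$.
(v) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ then
\[ \bigl\| A_N(a^{(p)})\big|_{\mathcal{H}_\pm^{(n)}} \bigr\| \le \Bigl(\frac{n}{N}\Bigr)^p \|a^{(p)}\|. \tag{3.6} \]
Of course, on an appropriate dense domain, (3.3) holds for unbounded operators $a^{(p)}$ and $b^{(q)}$ too.

We introduce the notation
\[ \bigl[a^{(p)}, b^{(q)}\bigr]_r := a^{(p)} \bullet_r b^{(q)} - b^{(q)} \bullet_r a^{(p)}. \tag{3.7} \]
Note that $[a^{(p)}, b^{(q)}]_0 = 0$. Thus,
\[ \bigl[A_N(a^{(p)}), A_N(b^{(q)})\bigr] = \sum_{r=1}^{\min(p,q)} \binom{p}{r} \binom{q}{r} \frac{r!}{N^r}\, A_N\bigl([a^{(p)}, b^{(q)}]_r\bigr). \tag{3.8} \]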
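The combinatorial prefactors in (3.2), (3.3) and (3.6) are easy to tabulate exactly. The sketch below is an illustration only; the function names are hypothetical and it uses exact rational arithmetic. It computes the prefactor $p!\binom{n}{p}/N^p$ of (3.2), which is bounded by $(n/N)^p$ as in (3.6), and the coefficient of the r-fold contraction in (3.3).

```python
from fractions import Fraction
from math import comb, factorial

def first_quantised_prefactor(n, p, N):
    # Prefactor p! * C(n, p) / N^p of (3.2); it vanishes when n < p.
    if n < p:
        return Fraction(0)
    return Fraction(factorial(p) * comb(n, p), N**p)

def contraction_coefficient(p, q, r, N):
    # Coefficient C(p, r) * C(q, r) * r! / N^r of the r-fold contraction in (3.3).
    return Fraction(comb(p, r) * comb(q, r) * factorial(r), N**r)
```

For a two-particle interaction (p = 2) acting on a q-particle operator, the r = 1 coefficient is 2q/N and the r = 2 coefficient is q(q - 1)/N^2, so each additional contraction costs one further power of 1/N.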

We now move on to discuss dynamics. Take a one-particle Hamiltonian $h^{(1)} \equiv h$ of the form $h = -\Delta + v$, where Δ is the Laplacian over $\mathbb{R}^3$ and v is some real function. We denote by V the multiplication operator v(x). Two-body interactions are described by a real, even function w on $\mathbb{R}^3$. This induces a two-particle operator $W^{(2)} \equiv W$ on $\mathcal{H}^{\otimes 2}$, defined as the multiplication operator $w(x_1 - x_2)$. We define the Hamiltonian
\[ \widehat{H}_N := A_N(h) + \frac{1}{2} A_N(W). \tag{3.9} \]
Under suitable assumptions on v and w that we make precise in the following sections, one shows that $\widehat{H}_N$ is a well-defined self-adjoint operator on $\mathcal{F}_\pm$. It is convenient to introduce $H_N := N \widehat{H}_N$. On $\mathcal{H}_\pm^{(n)}$ we have the “first quantised” expression
\[ H_N\big|_{\mathcal{H}_\pm^{(n)}} = \sum_{i=1}^{n} h_i + \frac{1}{N} \sum_{1 \le i < j \le n} W_{ij} =: H_0 + \frac{1}{N} W, \tag{3.10} \]

in self-explanatory notation.

4. Schwinger-Dyson Expansion and Loop Counting

Without loss of generality, we assume throughout the following that t ≥ 0.

Let $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$ and w be bounded, i.e. $w \in L^\infty(\mathbb{R}^3)$. Using the fundamental theorem of calculus and the fact that the unitary group $(e^{-itH_0})_{t \in \mathbb{R}}$ is strongly differentiable one finds
\[ \begin{aligned} e^{itH_N} A_N(a^{(p)})\, e^{-itH_N} \Phi^{(n)} &= e^{isH_N} e^{-isH_0}\, e^{itH_0} A_N(a^{(p)})\, e^{-itH_0}\, e^{isH_0} e^{-isH_N} \Phi^{(n)} \Big|_{s=t} \\ &= A_N(a_t^{(p)})\, \Phi^{(n)} + \int_0^t ds\; e^{isH_N} e^{-isH_0}\, \frac{iN}{2} \bigl[A_N(W_s), A_N(a_t^{(p)})\bigr]\, e^{isH_0} e^{-isH_N} \Phi^{(n)}, \end{aligned} \]
where $(\cdot)_t := \Gamma(e^{ith})\, (\cdot)\, \Gamma(e^{-ith})$ denotes free time evolution. As an equation between operators defined on $\mathcal{F}_\pm^0$, this reads
\[ e^{itH_N} A_N(a^{(p)})\, e^{-itH_N} = A_N(a_t^{(p)}) + \int_0^t ds\; e^{isH_N} e^{-isH_0}\, \frac{iN}{2} \bigl[A_N(W_s), A_N(a_t^{(p)})\bigr]\, e^{isH_0} e^{-isH_N}. \tag{4.1} \]
Iteration of (4.1) yields the formal power series
\[ \sum_{k=0}^{\infty} \int_{\Delta^k(t)} d\underline{t}\; \frac{(iN)^k}{2^k} \Bigl[A_N(W_{t_k}), \dots \bigl[A_N(W_{t_1}), A_N(a_t^{(p)})\bigr] \dots \Bigr]. \tag{4.2} \]
It is easy to see that, on $\mathcal{H}_\pm^{(n)}$, the k-th term of (4.2) is bounded in norm by
\[ \frac{\bigl(t\, n^2 \|w\|_\infty / N\bigr)^k}{k!} \Bigl(\frac{n}{N}\Bigr)^p \|a^{(p)}\|. \tag{4.3} \]
Therefore, on $\mathcal{H}_\pm^{(n)}$, the series (4.2) converges in norm for all times. Furthermore, (4.3) implies that the rest term arising from the iteration of (4.1) vanishes for k → ∞, so that (4.2) is equal to (4.1).



Fig. 4.1. Two terms of the product $A_N(a_t^{(p)})\, A_N(W_s)$, represented as labelled diagrams. A tree term (left) produces a tree diagram. A loop term (right) produces a diagram with one loop

The mean-field limit is the limit n = νN → ∞, where ν > 0 is some constant. The above estimate is clearly inadequate to prove statements about the mean-field limit. In order to obtain estimates uniform in N, more care is needed.

To see why the above estimate is so crude, consider the commutator
\[ \frac{iN}{2} \bigl[A_N(W_s), A_N(a_t^{(p)})\bigr]\Big|_{\mathcal{H}_\pm^{(n)}} = \frac{p!}{N^p} \binom{n}{p}\, \frac{i}{N}\, P_\pm \sum_{1 \le i < j \le n} \bigl[W_{ij,s},\, a_t^{(p)} \otimes 1^{(n-p)}\bigr] P_\pm. \]
We see that most terms of the commutator vanish (namely, whenever p < i < j). Thus, for large n, the above estimates are highly wasteful. This can be remedied by more careful bookkeeping. We split the commutator into two terms: the tree terms, defined by 1 ≤ i ≤ p and p + 1 ≤ j ≤ n, and the loop terms, defined by 1 ≤ i < j ≤ p. All other terms vanish. This splitting can also be inferred from (3.8).

The naming originates from a diagrammatic representation (see Fig. 4.1). A p-particle operator is represented as a wiggly vertical line to which are attached p horizontal branches on the left and p horizontal branches on the right. Each branch on the left represents a creation operator $\psi_N^*(x_i)$, and each branch on the right an annihilation operator $\psi_N(y_i)$. The product $A_N(a^{(p)})\, A_N(b^{(q)})$ of two operators is given by the sum over all possible pairings of the annihilation operators in $A_N(a^{(p)})$ with the creation operators in $A_N(b^{(q)})$. Such a contraction is graphically represented as a horizontal line joining the corresponding branches. We consider diagrams that arise in this manner from the multiplication of a finite number of operators of the form $A_N(a^{(p)})$.

We now generalise this idea to a systematic scheme for the multiple commutators appearing in the Schwinger-Dyson expansion. To this end, we decompose the multiple commutator
\[ \frac{(iN)^k}{2^k} \Bigl[A_N(W_{t_k}), \dots \bigl[A_N(W_{t_1}), A_N(a_t^{(p)})\bigr] \dots \Bigr] \]
into a sum of $2^k$ terms obtained by writing out each commutator. Each resulting term is a product of k + 1 second-quantised operators, which we furthermore decompose into a sum over all possible contractions for which r > 0 in (3.3) (at least one contraction for each multiplication). The restriction r > 0 follows from $[a^{(p)}, b^{(q)}]_0 = 0$. This is equivalent to saying that all diagrams are connected.



We call the resulting terms elementary. The idea is to classify all elementary terms according to their number of loops l. Write
\[ \frac{(iN)^k}{2^k} \Bigl[A_N(W_{t_k}), \dots \bigl[A_N(W_{t_1}), A_N(a_t^{(p)})\bigr] \dots \Bigr] = \sum_{l=0}^{k} \frac{1}{N^l}\, A_N\Bigl(G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})\Bigr), \tag{4.4} \]
where $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})$ is a (p + k − l)-particle operator, equal to the sum of all elementary terms with l loops. It is defined through the recursion relation (on $\mathcal{H}_\pm^{(p+k-l)}$)
\[ \begin{aligned} G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) = {} & i(p + k - l - 1) \Bigl[W_{t_k},\, G^{(k-1,l)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Bigr]_1 + i \binom{p + k - l}{2} \Bigl[W_{t_k},\, G^{(k-1,l-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Bigr]_2 \\ = {} & i P_\pm \sum_{i=1}^{p+k-l-1} \Bigl[W_{i\, p+k-l,\, t_k},\, G^{(k-1,l)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \otimes 1\Bigr] P_\pm \\ & + i P_\pm \sum_{1 \le i < j \le p+k-l} \Bigl[W_{ij,t_k},\, G^{(k-1,l-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)})\Bigr] P_\pm, \end{aligned} \tag{4.5} \]
as well as $G_t^{(0,0)}(a^{(p)}) := a_t^{(p)}$. If l < 0, l > k, or p + k − l > n, then $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) = 0$. The interpretation of the recursion relation is simple: a (k, l)-term arises from either a (k − 1, l)-term without adding a loop or from a (k − 1, l − 1)-term to which a loop is added. It is not hard to see, using induction on k and the definition (4.5), that (4.4) holds.

It is often convenient to have an explicit formula for the decomposition into elementary terms:
\[ G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) = \sum_{\alpha=1}^{c(p,k,l)} G^{(k,l)(\alpha)}_{t,t_1,\dots,t_k}(a^{(p)}), \]
where $G^{(k,l)(\alpha)}_{t,t_1,\dots,t_k}(a^{(p)})$ is an elementary term, and c(p, k, l) is the number of elementary terms in $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})$.

In order to establish a one-to-one correspondence between elementary terms and diagrams, we introduce a labelling scheme for diagrams. Consider an elementary term arising from a choice of contractions in the multiple commutator of order k, along with its diagram. We label all vertical lines v with an index $i_v \in \mathbb{N}$ as follows. The vertical line of $a^{(p)}$ is labelled by 0. The vertical line of the first (i.e. innermost in the multiple commutator) interaction operator is labelled by 1, of the second by 2, and so on (see Fig. 4.2). Conversely, every elementary term is uniquely determined by its labelled diagram. We consequently use α = 1, . . . , c(p, k, l) to index either elementary terms or labelled diagrams.

Use the shorthand $\underline{t} = (t_1, \dots, t_k)$ and define
\[ G_t^{(k,l)}(a^{(p)}) := \int_{\Delta^k(t)} d\underline{t}\; G^{(k,l)}_{t,\underline{t}}(a^{(p)}). \tag{4.6} \]

In summary, we have an expansion in terms of the number of loops l:



Fig. 4.2. The labelled diagram corresponding to a one-loop elementary term in the commutator of order 4

\[ e^{itH_N} A_N(a^{(p)})\, e^{-itH_N} = \sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{1}{N^l}\, A_N\Bigl(G_t^{(k,l)}(a^{(p)})\Bigr), \tag{4.7} \]
which converges in norm on $\mathcal{H}_\pm^{(n)}$, n ∈ N, for all times t.

5. Convergence for Bounded Interaction

For a bounded interaction potential, $\|w\|_\infty < \infty$, it is now straightforward to control the mean-field limit.

Lemma 5.1. We have the bound
\[ \bigl\| G^{(k,l)}_{t,\underline{t}}(a^{(p)}) \bigr\| \le c(p, k, l)\, \|w\|_\infty^k\, \|a^{(p)}\|. \tag{5.1} \]
Furthermore,
\[ c(p, k, l) \le 2^k \binom{k}{l} (p + k - l)^l\, (p + k - 1) \cdots p. \tag{5.2} \]

Proof. Assume first that l = 0. Then the number of labelled diagrams is clearly given by $2^k\, p \cdots (p + k - 1)$. Now if there are l loops, we may choose to add them at any l of the k steps when computing the multiple commutator. Furthermore, each addition of a loop produces at most p + k − l times more elementary terms than the addition of a tree branch. Combining these observations, we arrive at the claimed bound for c(p, k, l). Alternatively, it is a simple exercise to show the claim, with c(p, k, l) replaced by the bound (5.2), by induction on k.
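The induction on k mentioned in the proof can be checked numerically for small parameters. The sketch below is an illustration under stated assumptions: it counts the terms generated by the recursion (4.5), assuming that each commutator contributes two terms and ignoring the cutoff p + k − l > n, and compares the count with the right side of (5.2). The names `term_count` and `bound_5_2` are hypothetical.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def term_count(p, k, l):
    # Number of terms generated by the recursion (4.5), two per commutator.
    if l < 0 or l > k:
        return 0
    if k == 0:
        return 1
    q = p + k - l  # particle number of the resulting operator
    tree = 2 * (q - 1) * term_count(p, k - 1, l)         # sum over i = 1..q-1
    loop = 2 * comb(q, 2) * term_count(p, k - 1, l - 1)  # sum over pairs i < j <= q
    return tree + loop

def bound_5_2(p, k, l):
    # Right side of (5.2): 2^k * C(k, l) * (p+k-l)^l * p (p+1) ... (p+k-1).
    rising = 1
    for j in range(k):
        rising *= p + j
    return 2**k * comb(k, l) * (p + k - l)**l * rising
```

For l = 0 the recursion reproduces the tree count $2^k\, p \cdots (p + k - 1)$ exactly, so the bound is attained there; for l > 0 the bound is generally not sharp.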

Lemma 5.2. Let ν > 0 and $t < (8\nu \|w\|_\infty)^{-1}$. Then, on $\mathcal{H}_\pm^{(\nu N)}$, the Schwinger-Dyson series (4.7) converges in norm, uniformly in N.



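Proof. Recall that p + k − l ≤ n for nonvanishing $A_N\bigl(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\bigr)\big|_{\mathcal{H}_\pm^{(n)}}$. Using the symbol $I_{\{A\}}$, defined as 1 if A is true and 0 if A is false, we find
\[ \begin{aligned} \sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{1}{N^l} \Bigl\| \int_{\Delta^k(t)} d\underline{t}\; A_N\bigl(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\bigr)\Big|_{\mathcal{H}_\pm^{(\nu N)}} \Bigr\| &\le \sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{(p + k - l)^l}{N^l}\, I_{\{p+k-l \le \nu N\}} \binom{k}{l} \binom{p + k - 1}{k} k!\; \nu^{p+k-l}\, \|a^{(p)}\|\, \frac{(2 \|w\|_\infty t)^k}{k!} \\ &\le \sum_{k=0}^{\infty} (8 \nu \|w\|_\infty t)^k\, (2\nu)^p\, \|a^{(p)}\| \\ &= \frac{1}{1 - 8 \nu \|w\|_\infty t}\, (2\nu)^p\, \|a^{(p)}\|, \end{aligned} \]
where we used that $\sum_{l=0}^{k} \binom{k}{l} = 2^k$, and in particular $\binom{k}{l} \le 2^k$.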

In the spirit of semi-classical expansions, we can rewrite the Schwinger-Dyson series to get a “1/N-expansion”, whereby all l-loop terms add up to an operator of order $O(N^{-l})$.

Lemma 5.3. Let $t < (8\nu \|w\|_\infty)^{-1}$ and L ∈ N. Then we have on $\mathcal{H}_\pm^{(\nu N)}$
\[ e^{itH_N} A_N(a^{(p)})\, e^{-itH_N} = \sum_{l=0}^{L-1} \frac{1}{N^l} \sum_{k=l}^{\infty} A_N\Bigl(G_t^{(k,l)}(a^{(p)})\Bigr) + O\Bigl(\frac{1}{N^L}\Bigr), \]

where the sum converges uniformly in N.

Proof. Instead of the full Schwinger-Dyson expansion (4.2), we can stop the expansion whenever L loops have been generated. More precisely, we iterate (4.1) and use (3.8) at each iteration to split the commutator into tree (r = 1) and loop (r = 2) terms. Whenever a term obtained in this fashion has accumulated L loops, we stop expanding and put it into a remainder term. Thus all fully expanded terms are precisely those arising from diagrams containing up to L − 1 loops, and it is not hard to show that the remainder term is of order $N^{-L}$.

In view of later applications, we also give a proof using the fully expanded Schwinger-Dyson series. From Lemma 5.2 we know that the sum converges on $\mathcal{H}_\pm^{(\nu N)}$ in norm, uniformly in N, and can be reordered as
\[ e^{itH_N} A_N(a^{(p)})\, e^{-itH_N} = \sum_{l=0}^{\infty} \frac{1}{N^l} \sum_{k=l}^{\infty} \int_{\Delta^k(t)} d\underline{t}\; A_N\Bigl(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\Bigr), \]
as an identity on $\mathcal{H}_\pm^{(\nu N)}$. Proceeding as above we find

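\[ \begin{aligned} \Bigl\| \sum_{l=L}^{\infty} \frac{1}{N^l} \sum_{k=l}^{\infty} \int_{\Delta^k(t)} d\underline{t}\; A_N\bigl(G^{(k,l)}_{t,\underline{t}}(a^{(p)})\bigr)\Big|_{\mathcal{H}_\pm^{(\nu N)}} \Bigr\| &\le \frac{1}{N^L} \sum_{l=L}^{\infty} \sum_{k=l}^{\infty} \frac{(p + k - l)^l}{N^{l-L}}\, I_{\{p+k-l \le \nu N\}}\, (2 \|w\|_\infty t)^k \binom{k}{l} \binom{p + k - 1}{k} k!\; \nu^{p+k-l}\, \|a^{(p)}\|\, \frac{1}{k!} \\ &\le \frac{1}{(\nu N)^L} \sum_{l=L}^{\infty} \sum_{k=l}^{\infty} (p + k - l)^L\, (8 \nu \|w\|_\infty t)^k\, (2\nu)^p\, \|a^{(p)}\| \\ &= \frac{1}{(\nu N)^L} \sum_{l=L}^{\infty} \sum_{k=0}^{\infty} (p + k)^L\, (8 \nu \|w\|_\infty t)^{k+l}\, (2\nu)^p\, \|a^{(p)}\| \\ &\le \frac{1}{(\nu N)^L} \sum_{l=L}^{\infty} (8 \nu \|w\|_\infty t)^l\, \frac{e^p L!}{(1 - 8 \nu \|w\|_\infty t)^{L+1}}\, (2\nu)^p\, \|a^{(p)}\| \\ &= \frac{1}{(\nu N)^L}\, \frac{e^p L!\, (8 \nu \|w\|_\infty t)^L}{(1 - 8 \nu \|w\|_\infty t)^{L+2}}\, (2\nu)^p\, \|a^{(p)}\|, \end{aligned} \]
where we used that $\sum_{k=0}^{\infty} (p + k)^L x^k \le \frac{e^p L!}{(1 - x)^{L+1}}$.

6. Convergence for Coulomb Interaction

In this section we consider an interaction potential of the form
\[ w(x) = \kappa\, \frac{1}{|x|}, \tag{6.1} \]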

where κ ∈ R. We take the one-body Hamiltonian to be $h = -\Delta$, the nonrelativistic kinetic energy without external potentials. We assume this form of h and w throughout Sects. 6 and 7. In Sect. 8, we discuss some generalisations.

6.1. Kato smoothing. The non-relativistic dispersive nature of the free time evolution $e^{it\Delta}$ is essential for controlling singular potentials. The key tool for all of the following is the Kato smoothing estimate:
\[ \int_{\mathbb{R}} \bigl\| |x|^{-1} e^{it\Delta} \psi \bigr\|^2\, dt \le \pi \|\psi\|^2, \tag{6.2} \]
where $\psi \in L^2(\mathbb{R}^3)$. Estimate (6.2) follows from Kato’s theory of smooth perturbations; see [20,23]. In Sect. 8 we provide a proof of (6.2) (without the sharp constant π), for a larger class of interaction potentials, using Strichartz estimates.

In order to avoid tedious discussions of operator domains in equations such as (4.1), we introduce a cutoff to make the interaction potential bounded. For ε ≥ 0 set
\[ w^\varepsilon(x) := w(x)\, I_{\{|w(x)| \le \varepsilon^{-1}\}}, \]



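so that $\|w^\varepsilon\|_\infty \le \varepsilon^{-1}$. Now (6.2) implies, for ε ≥ 0,
\[ \int_{\mathbb{R}} \bigl\| w^\varepsilon e^{it\Delta} \psi \bigr\|^2\, dt \le \int_{\mathbb{R}} \bigl\| w\, e^{it\Delta} \psi \bigr\|^2\, dt \le \pi \kappa^2 \|\psi\|^2. \tag{6.3} \]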
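The regularised potential is simple to write down explicitly. The following sketch is for illustration only: the function name `coulomb_cutoff` and its signature are hypothetical, and it is restricted to ε > 0 (the case ε = 0 corresponds to the unregularised Coulomb potential, which is unbounded at the origin). It evaluates $w^\varepsilon(x) = w(x)\, I_{\{|w(x)| \le \varepsilon^{-1}\}}$ for $w(x) = \kappa/|x|$ on $\mathbb{R}^3$.

```python
import math

def coulomb_cutoff(x, kappa=1.0, eps=0.1):
    # w^eps(x) = w(x) if |w(x)| <= 1/eps, else 0, for w(x) = kappa / |x|.
    if eps <= 0.0:
        raise ValueError("this sketch assumes eps > 0")
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        return 0.0  # |w| is infinite at the origin, so the indicator vanishes
    w = kappa / r
    return w if abs(w) <= 1.0 / eps else 0.0
```

By construction every value returned is bounded in modulus by 1/ε, and $|w^\varepsilon| \le |w|$ pointwise, which is the fact behind the first inequality in (6.3).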

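An immediate consequence is the following lemma.

Lemma 6.1. Let $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$. Then
\[ \int_{\mathbb{R}} \bigl\| W_{ij}^\varepsilon\, e^{-itH_0} \Phi^{(n)} \bigr\|^2\, dt \le \frac{\pi \kappa^2}{2}\, \|\Phi^{(n)}\|^2. \tag{6.4} \]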

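Proof. By symmetry we may assume that (i, j) = (1, 2). Choose centre of mass coordinates $X := (x_1 + x_2)/2$ and $\xi = x_2 - x_1$, set $\tilde\Phi^{(n)}(X, \xi, x_3, \dots, x_n) := \Phi^{(n)}(x_1, \dots, x_n)$, and write
\[ \int_{\mathbb{R}} \bigl\| W_{12}^\varepsilon\, e^{-itH_0} \Phi^{(n)} \bigr\|^2\, dt = \int_{\mathbb{R}} \bigl\| w^\varepsilon(\xi)\, e^{2it\Delta_\xi}\, \tilde\Phi^{(n)} \bigr\|^2\, dt, \]
since $H_0 = -\Delta_1 - \Delta_2 = -\Delta_X/2 - 2\Delta_\xi$ and $[\Delta_X, w^\varepsilon(\xi)] = 0$. Therefore, by (6.3) and Fubini’s theorem, we find
\[ \begin{aligned} \int_{\mathbb{R}} \bigl\| W_{12}^\varepsilon\, e^{-itH_0} \Phi^{(n)} \bigr\|^2\, dt &= \int dX\, dx_3 \cdots dx_n \int dt \int d\xi\; \bigl| w^\varepsilon(\xi)\, e^{2it\Delta_\xi}\, \tilde\Phi^{(n)}(X, \xi, x_3, \dots, x_n) \bigr|^2 \\ &\le \frac{\pi \kappa^2}{2} \int dX\, dx_3 \cdots dx_n \int d\xi\; \bigl| \tilde\Phi^{(n)}(X, \xi, x_3, \dots, x_n) \bigr|^2 \\ &= \frac{\pi \kappa^2}{2}\, \|\Phi^{(n)}\|^2. \end{aligned} \]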
By Cauchy-Schwarz we then find that 1/2 2 1/2 t ε −is H (n) 2 πκ t 0 W ε (n) ds ≤ t 1/2 W e ds ≤ (n) . (6.5) i j,s ij 2 R 0

By iteration, this implies that, for all elementary terms α, 2 k/2 t t (k,l)(α),ε ( p) ( p+k−l) ≤ πκ t dt1 . . . dtk G t,t (a ) a ( p) ( p+k−l) , (6.6) 2 0 0 (k,l)(α),ε

where the superscript ε reminds us that G t,t potential wε . Thus one finds

(k,l),ε ( p) G t (a ) ≤ c( p, k, l)

(a ( p) ) is computed with the regularised

π κ 2t 2

k/2

a ( p) ,

for all ε ≥ 0. Unfortunately, the above procedure does not recover the factor 1/k! arising from the time-integration over the k-simplex k (t), which is essential for our convergence



estimates. First iterating (6.4) and then using Cauchy-Schwarz yields a factor $1/\sqrt{k!}$, which is still not good enough. A solution to this problem must circumvent the highly wasteful procedure of replacing the integral over $\Delta^k(t)$ with an integral over $[0, t]^k$. The key observation is that, in the sum over all labelled diagrams, each diagram appears of the order of k! times with different labellings.

6.2. Graph counting. In order to make the above idea precise, we make use of graphs (related to the above diagrams) to index terms in our expansion of the multiple commutator
\[ \frac{(iN)^k}{2^k} \Bigl[A_N(W_{t_k}), \dots \bigl[A_N(W_{t_1}), A_N(a_t^{(p)})\bigr] \dots \Bigr]. \tag{6.7} \]
The idea is to assign to each second quantised operator a vertex v = 0, . . . , k, and to represent each creation and annihilation with an incident edge. A pairing of an annihilation operator with a creation operator is represented by joining the corresponding edges. The vertex 0 has 2p edges and the vertices 1, . . . , k have 4 edges. We call the vertex 0 the root. The edges incident to each vertex v are labelled using a pair λ = (d, i), where d = a, c is the direction (a stands for “annihilation” and c for “creation”) and i labels edges of the same direction; i = 1, . . . , p if v = 0 and i = 1, 2 if v = 1, . . . , k. Thus, a labelled edge is of the form $\{(v_1, \lambda_1), (v_2, \lambda_2)\}$. Graphs $\mathcal{G}$ with such labelled edges are graphs over the vertex set $V(\mathcal{G}) = \{(v, \lambda)\}$. We denote the set of edges of a graph $\mathcal{G}$ (a set of unordered pairs of vertices in $V(\mathcal{G})$) by $E(\mathcal{G})$. The degree of each (v, λ) is either 0 or 1; we call (v, λ) an empty edge of v if its degree is 0. We often speak of connecting two empty edges, as well as removing a nonempty edge; the definitions are self-explanatory.

We may drop the edge labelling of $\mathcal{G}$ to obtain a (multi)graph $\widetilde{\mathcal{G}}$ over the vertex set {0, . . . , k}: Each edge $\{(v_1, \lambda_1), (v_2, \lambda_2)\} \in E(\mathcal{G})$ gives rise to the edge $\{v_1, v_2\} \in E(\widetilde{\mathcal{G}})$. We understand a path in $\mathcal{G}$ to be a sequence of edges in $E(\mathcal{G})$ such that two consecutive edges are adjacent in the graph $\widetilde{\mathcal{G}}$. This leads to the notions of connectedness of $\mathcal{G}$ and loops in $\mathcal{G}$.

The admissible graphs – i.e. graphs indexing a choice of pairings in the multiple commutator (6.7) – are generated by the following “growth process”. We start with the empty graph $\mathcal{G}_0$, i.e. $E(\mathcal{G}_0) = \emptyset$. In a first step, we choose one or two empty edges of 1 of the same direction and connect each of them to an empty edge of 0 of opposite direction. Next, we choose one or two empty edges of 2 of the same direction and connect each of them to an empty edge of 0 or 1 of opposite direction. We continue in this manner for all vertices 3, . . . , k. We summarise some key properties of admissible graphs $\mathcal{G}$.
(a) $\mathcal{G}$ is connected.
(b) The degree of each (v, λ) is either 0 or 1.
(c) The labelled edge $\{(v_1, \lambda_1), (v_2, \lambda_2)\} \in E(\mathcal{G})$ only if $\lambda_1$ and $\lambda_2$ have opposite directions.
Property (c) implies that each graph $\mathcal{G}$ has a canonical directed representative, where each edge is ordered from the a-label to the c-label. See Fig. 6.1 for an example of such a graph.

We call a graph $\mathcal{G}$ of type (p, k, l) whenever it is admissible and it contains l loops. We denote by $\mathfrak{G}(p, k, l)$ the set of graphs of type (p, k, l).



Fig. 6.1. An admissible graph of type ( p = 4, k = 7, l = 3)

By definition of admissible graphs, each contraction in (6.7) corresponds to a unique admissible graph. A contraction consists of at least k and at most 2k pairings. A contraction giving rise to a graph of type (p, k, l) has k + l pairings. The summand in (6.7) corresponding to any given l-loop contraction is given by an elementary term of the form
\[ \frac{(iN)^k}{2^k N^{k+l}}\, A_N\bigl(b^{(p+k-l)}\bigr), \tag{6.8} \]
where the (p + k − l)-particle operator $b^{(p+k-l)}$ is of the form
\[ b^{(p+k-l)} = P_\pm\, W_{i_1 j_1, t_{v_1}} \cdots W_{i_r j_r, t_{v_r}} \bigl(a_t^{(p)} \otimes 1^{(k-l)}\bigr)\, W_{i_{r+1} j_{r+1}, t_{v_{r+1}}} \cdots W_{i_k j_k, t_{v_k}}\, P_\pm, \tag{6.9} \]
for some r = 0, . . . , k. Indeed, the (anti)commutation relations (A.2) imply that each pairing produces a factor of 1/N. Furthermore, the creation and annihilation operators of each summand corresponding to any given contraction are (by definition) Wick ordered, and one readily sees that the associated integral kernel corresponds to an operator of the form (6.9). Thus we recover the splitting (4.4), whereby $G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)})$ is a sum, indexed by all l-loop graphs, of elementary terms of the form (6.9).

As remarked above, we need to exploit the fact that many graphs have the same topological structure, i.e. can be identified after some permutation of the labels {1, . . . , k} of the vertices corresponding to interaction operators. We therefore define an equivalence relation on the set of graphs: $\mathcal{G} \sim \mathcal{G}'$ if and only if there exists a permutation $\sigma \in S_k$ such that $\mathcal{G}' = R_\sigma(\mathcal{G})$. Here $R_\sigma(\mathcal{G})$ is the graph defined by
\[ \{(v_1, \lambda_1), (v_2, \lambda_2)\} \in E(R_\sigma(\mathcal{G})) \iff \{(\sigma(v_1), \lambda_1), (\sigma(v_2), \lambda_2)\} \in E(\mathcal{G}), \]
where σ(0) ≡ 0. We call equivalence classes $[\mathcal{G}]$ graph structures, and denote the set of graph structures of admissible graphs of type (p, k, l) by Q(p, k, l). Note that, in general, $R_\sigma(\mathcal{G})$ need not be admissible if $\mathcal{G}$ is admissible. It is convenient to increase $\mathfrak{G}(p, k, l)$ to include all $R_\sigma(\mathcal{G})$, where $\sigma \in S_k$ and $\mathcal{G}$ is admissible. In order to keep track of the admissible graphs in this larger set, we introduce the symbol $i_{\mathcal{G}}$ which is by definition 1 if $\mathcal{G} \in \mathfrak{G}(p, k, l)$ is admissible and 0 otherwise. Because $R_\sigma(\mathcal{G}) \neq \mathcal{G}$ if σ ≠ id,
\[ |\mathfrak{G}(p, k, l)| = k!\, |Q(p, k, l)|. \tag{6.10} \]



Our goal is to find an upper bound on the number of graph structures of type (p, k, l), which is sharp enough to show convergence of the Schwinger-Dyson series (4.2). Let us start with tree graphs: l = 0. In this case the number of graph structures is equal to $2^k$ times the number of ordered trees [1] with k + 1 vertices, whose root has at most 2p children and whose other vertices have at most 3 children. The factor $2^k$ arises from the fact that each vertex v = 1, . . . , k can use either of the two empty edges of compatible direction to connect to its parent. We thus need some basic facts about ordered trees, which are covered in the following (more or less standard) combinatorial digression.

For x, t ∈ R and n ∈ N define
\[ A_n(x, t) := \frac{x}{x + nt} \binom{x + nt}{n} \tag{6.11} \]
as well as $A_0(x, t) := 1$. After some juggling with binomial coefficients one finds
\[ \sum_{k=0}^{n} A_k(x, t)\, A_{n-k}(y, t) = A_n(x + y, t); \tag{6.12} \]
see [12] for details. Therefore
\[ \sum_{n_1 + \cdots + n_r = n} A_{n_1}(x_1, t) \cdots A_{n_r}(x_r, t) = A_n(x_1 + \cdots + x_r, t). \tag{6.13} \]
Set
\[ C_n^m := A_n(1, m) = \frac{1}{1 + nm} \binom{1 + nm}{n} = \frac{1}{n(m - 1) + 1} \binom{nm}{n}, \tag{6.14} \]
the n-th m-ary Catalan number. Thus we have
\[ \sum_{n_1 + \cdots + n_r = n} C_{n_1}^m \cdots C_{n_r}^m = \frac{r}{r + nm} \binom{r + nm}{n}. \tag{6.15} \]
In particular,
\[ \sum_{n_1 + \cdots + n_m = n - 1} C_{n_1}^m \cdots C_{n_m}^m = C_n^m. \tag{6.16} \]
Define an m-tree to be an ordered tree such that each vertex has at most m children. The number of m-trees with n vertices is equal to $C_n^m$. This follows immediately from $C_0^m = 1$ and from (6.16), which expresses that all trees of order n are obtained by adding m (possibly empty) subtrees of combined order n − 1 to the root.

We may now compute |Q(p, k, 0)|. Since the root of the tree has at most 2p children, we may express |Q(p, k, 0)| as the number of ordered forests comprising 2p (possibly empty) 3-trees whose combined order is equal to k. Therefore, by (6.15),
\[ |Q(p, k, 0)| = 2^k \sum_{n_1 + \cdots + n_{2p} = k} C_{n_1}^3 \cdots C_{n_{2p}}^3 = 2^k\, \frac{2p}{2p + 3k} \binom{2p + 3k}{k}. \tag{6.17} \]
Next, we extend this result to all values of l in the form of an upper bound on |Q(p, k, l)|.

[1] An ordered tree is a rooted tree in which the children of each vertex are ordered.
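The identities (6.14) to (6.17) can be verified by brute force for small parameters. The sketch below is an illustration only; all function names are hypothetical. It computes the m-ary Catalan numbers, the convolution on the left side of (6.15) by direct recursion, the closed form on the right side, and the tree count (6.17).

```python
from math import comb

def fuss_catalan(n, m):
    # C_n^m = binom(n m, n) / (n (m - 1) + 1), cf. (6.14); the division is exact.
    return comb(n * m, n) // (n * (m - 1) + 1)

def convolution(n, r, m):
    # Left side of (6.15): sum over n_1 + ... + n_r = n of C_{n_1}^m ... C_{n_r}^m.
    if r == 0:
        return 1 if n == 0 else 0
    return sum(fuss_catalan(j, m) * convolution(n - j, r - 1, m)
               for j in range(n + 1))

def closed_form(n, r, m):
    # Right side of (6.15): r / (r + n m) * binom(r + n m, n), for r >= 1.
    return r * comb(r + n * m, n) // (r + n * m)

def tree_structures(p, k):
    # |Q(p, k, 0)| according to (6.17).
    return 2**k * closed_form(k, 2 * p, 3)
```

For m = 2 the numbers `fuss_catalan(n, 2)` are the ordinary Catalan numbers 1, 1, 2, 5, 14, and for p = k = 1 the count (6.17) gives 4: the single interaction vertex attaches to one of the two root edges of compatible direction, using either of its own two edges.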


Lemma 6.2. Let p, k, l ∈ N. Then
\[ |Q(p, k, l)| \le 2^k \binom{k}{l} \binom{2p + 3k}{k} (p + k - l)^l. \tag{6.18} \]

Proof. The idea is to remove edges from G ∈ G ( p, k, l) to obtain a tree graph, and then use the special case (6.17). In addition to the properties (a) – (c) above, we need the following property of G ( p, k, l): (d) If G ∈ G ( p, k, l) then there exists a subset V ⊂ {1, . . . , k} of size l and a choice of direction δ : V → {a, c} such that, for each v ∈ V, both edges of v with direction δ(v) are nonempty. Denote by E(v) ⊂ E(G) the set consisting of the two above edges. We additionally require that removing one of the two edges of E(v) from G, for each v ∈ V, yields a tree graph, with the property that, for each v ∈ V, the remaining edge of E(v) is contained in the unique path connecting v to the root. This is an immediate consequence of the growth process for admissible graphs. The set V corresponds to the set of vertices whose addition produces two edges. Note that property (d) is independent of the representative and consequently holds also for non-admissible G ∈ G ( p, k, l). Before coming to our main argument, we note that a tree graph T ∈ G ( p, k, 0) gives rise to a natural lexicographical order on the vertex set {1, . . . , k}. Let v ∈ {1, . . . , k}. There is a unique path that connects v to the root. Denote by 0 = v1 , v2 , . . . , vq = v the sequence of vertices along this path. For each j = 1, . . . , q − 1, let λ j be the label of the edge {v j , v j+1 } at v j . We assign to v the string S(v) := (λ1 , . . . , λq−1 ). Choose some (fixed) ordering of the sets of labels {λ}, for each v. Then the set of vertices {1, . . . , k} is ordered according to the lexicographical order of the string S(v). We now start removing loops from a given graph G ∈ G ( p, k, l). Define R1 as the graph obtained from G by removing all edges in v∈V E(v). By property (d) above, R1 is a forest comprising l trees. Define T1 as the connected component of R1 containing the root. 
Now we claim that there is at least one v ∈ V such that both edges of E(v) are incident to a vertex of T_1. Indeed, were this not the case, we could choose for each v ∈ V an edge in E(v) that is not incident to any vertex of T_1. Call R_1′ the graph obtained by adding all such edges to R_1. Now, since no vertex in V is in the connected component T_1 of R_1, it follows that no vertex in V is in the connected component of the root in R_1′. This is a contradiction to property (d), which requires that R_1′ should be a (connected) tree.

Let us therefore consider the set Ṽ of all v ∈ V such that both edges of E(v) are incident to a vertex of T_1. We have shown that Ṽ ≠ ∅. For each choice of v and e, where v ∈ Ṽ and e ∈ E(v), we get a forest of l − 1 trees by adding e to the edge set of R_1. Then v is in the same tree as the root, so that each such choice of v and e yields a string S(v) as described above. We choose v_1 and e(v_1) as the unique couple that yields the smallest string (note that different choices have different strings). Finally, set G_1 equal to G from which e(v_1) has been removed, and V_1 := V \ {v_1}.

We have thus obtained an (l − 1)-loop graph G_1 and a set V_1 of size l − 1, which together satisfy the property (d). We may therefore repeat the above procedure. In this manner we obtain the sequences v_1, . . . , v_l and G_1, . . . , G_l. Note that G_l is obtained by removing the edges e(v_1), . . . , e(v_l) from G, and is consequently a tree graph. Also, by construction, the sequence v_1, . . . , v_l is increasing in the lexicographical order of G_l.

Next, consider the tree graph G_l. Each edge e(v_j) connects the single empty edge of v_j with direction δ(v_j) with an empty edge of opposite direction of a vertex v, where v is smaller than v_j in the lexicographical order of G_l. It is easy to see that, for each j, there are at most (p + k − l) such connections.

We have thus shown that we can obtain any G ∈ G(p, k, l) by choosing some tree G_l ∈ G(p, k, 0), choosing l elements v_j out of {1, . . . , k}, ordering them lexicographically (according to the order of G_l) and choosing an edge out of at most (p + k − l) possibilities for each of v_1, . . . , v_l. Thus,

$$|\mathscr{G}(p,k,l)| \le \binom{k}{l} (p+k-l)^l\, |\mathscr{G}(p,k,0)|.$$

The claim then follows from (6.10) and (6.17).
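To get a feeling for the size of the bound (6.18), the following minimal Python sketch evaluates its right-hand side with exact integer arithmetic. The function name `graph_structure_bound` is hypothetical and not part of the paper; the sketch only assumes the form of (6.18) stated in Lemma 6.2.

```python
from math import comb


def graph_structure_bound(p: int, k: int, l: int) -> int:
    """Right-hand side of (6.18): 2^k * C(k, l) * C(2p + 3k, k) * (p + k - l)^l.

    Hypothetical helper for illustration; p, k, l are non-negative integers
    with 0 <= l <= k.
    """
    if not (p >= 0 and 0 <= l <= k):
        raise ValueError("need p >= 0 and 0 <= l <= k")
    return 2**k * comb(k, l) * comb(2 * p + 3 * k, k) * (p + k - l) ** l


if __name__ == "__main__":
    # Tabulate the bound for a one-particle observable and small orders k.
    for k in range(4):
        print(k, [graph_structure_bound(1, k, l) for l in range(k + 1)])
```

Since the binomial coefficient satisfies C(2p + 3k, k) ≤ 2^(2p + 3k), the bound grows at most exponentially in k for l = 0, which is what makes the time integrals of order t^(k/2) summable for small t.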

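6.3. Proof of convergence. We are now armed with everything we need in order to estimate $\int_{\Delta^k(t)} d\underline{t}\; G^{(k,l)}_{t,\underline{t}}(a^{(p)})$. Recall that

$$G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) = \frac{i^k}{2^k} \sum_{\mathcal{G} \in \mathscr{G}(p,k,l)} i_{\mathcal{G}}\; G^{(k,l)(\mathcal{G})}_{t,t_1,\dots,t_k}(a^{(p)}), \tag{6.19}$$

where $G^{(k,l)(\mathcal{G})}_{t,t_1,\dots,t_k}(a^{(p)})$ is an elementary term of the form (6.9) indexed by the graph $\mathcal{G}$. We rewrite this using graph structures. Pick some choice $P : \mathscr{Q}(p,k,l) \to \mathscr{G}(p,k,l)$ of representatives. Then we get

$$\begin{aligned}
G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) &= \frac{i^k}{2^k} \sum_{\mathcal{Q} \in \mathscr{Q}(p,k,l)} \sum_{\mathcal{G} \in \mathcal{Q}} i_{\mathcal{G}}\; G^{(k,l)(\mathcal{G})}_{t,t_1,\dots,t_k}(a^{(p)}) \\
&= \frac{i^k}{2^k} \sum_{\mathcal{Q} \in \mathscr{Q}(p,k,l)} \sum_{\sigma \in S_k} i_{R_\sigma(P(\mathcal{Q}))}\; G^{(k,l)(R_\sigma(P(\mathcal{Q})))}_{t,t_1,\dots,t_k}(a^{(p)}).
\end{aligned}$$

Now, by definition of $R_\sigma$, we see that

$$G^{(k,l)(R_\sigma(\mathcal{G}))}_{t,t_1,\dots,t_k}(a^{(p)}) = G^{(k,l)(\mathcal{G})}_{t,t_{\sigma(1)},\dots,t_{\sigma(k)}}(a^{(p)}).$$

Thus,

$$\begin{aligned}
\int_{\Delta^k(t)} d\underline{t}\; G^{(k,l)}_{t,t_1,\dots,t_k}(a^{(p)}) &= \frac{i^k}{2^k} \sum_{\mathcal{Q} \in \mathscr{Q}(p,k,l)} \sum_{\sigma \in S_k} i_{R_\sigma(P(\mathcal{Q}))} \int_{\Delta^k(t)} d\underline{t}\; G^{(k,l)(P(\mathcal{Q}))}_{t,t_{\sigma(1)},\dots,t_{\sigma(k)}}(a^{(p)}) \\
&= \frac{i^k}{2^k} \sum_{\mathcal{Q} \in \mathscr{Q}(p,k,l)} \int_{\Delta^k_{\mathcal{Q}}(t)} d\underline{t}\; G^{(k,l)(P(\mathcal{Q}))}_{t,t_1,\dots,t_k}(a^{(p)}),
\end{aligned}$$

where

$$\Delta^k_{\mathcal{Q}}(t) := \big\{ (t_1,\dots,t_k) : \exists \sigma \in S_k : i_{R_\sigma(P(\mathcal{Q}))} = 1,\ (t_{\sigma(1)},\dots,t_{\sigma(k)}) \in \Delta^k(t) \big\} \subset [0,t]^k$$

is a union of disjoint simplices.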


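Therefore, (6.5) and (6.9) imply, for any $\Phi^{(p+k-l)} \in \mathcal{H}_\pm^{(p+k-l)}$, that

$$\begin{aligned}
\Big\| \int_{\Delta^k(t)} d\underline{t}\; G^{(k,l)}_{t,\underline{t}}(a^{(p)})\, \Phi^{(p+k-l)} \Big\|
&\le \frac{1}{2^k} \sum_{\mathcal{Q} \in \mathscr{Q}(p,k,l)} \int_{\Delta^k_{\mathcal{Q}}(t)} d\underline{t}\; \big\| G^{(k,l)(P(\mathcal{Q}))}_{t,t_1,\dots,t_k}(a^{(p)})\, \Phi^{(p+k-l)} \big\| \\
&\le \frac{1}{2^k} \sum_{\mathcal{Q} \in \mathscr{Q}(p,k,l)} \int_{[0,t]^k} d\underline{t}\; \big\| G^{(k,l)(P(\mathcal{Q}))}_{t,t_1,\dots,t_k}(a^{(p)})\, \Phi^{(p+k-l)} \big\| \\
&\le \frac{1}{2^k} \sum_{\mathcal{Q} \in \mathscr{Q}(p,k,l)} \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\|\, \|\Phi^{(p+k-l)}\| \\
&\le \binom{2p+3k}{k} \binom{k}{l} (p+k-l)^l \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\|\, \|\Phi^{(p+k-l)}\|,
\end{aligned}$$

where the last inequality follows from Lemma 6.2. Of course, the above treatment remains valid for regularised potentials. We summarise:

$$\big\| G^{(k,l),\varepsilon}_t(a^{(p)}) \big\| \le \binom{2p+3k}{k} \binom{k}{l} (p+k-l)^l \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(p)}\|, \tag{6.20}$$

for all ε ≥ 0. Using (6.20) we may now proceed exactly as in the case of a bounded interaction potential. Let

$$\rho(\kappa,\nu) := \frac{1}{128\, \pi \kappa^2 \nu^2}. \tag{6.21}$$

The removal of the cutoff and summary of the results are contained in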

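Lemma 6.3. Let t < ρ(κ, ν). Then we have on $\mathcal{H}_\pm^{(\nu N)}$

$$e^{itH_N}\, \widehat{A}_N(a^{(p)})\, e^{-itH_N} = \sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{1}{N^l}\, \widehat{A}_N\big( G^{(k,l)}_t(a^{(p)}) \big), \tag{6.22}$$

in operator norm, uniformly in N. Furthermore, for L ∈ N, we have the 1/N-expansion

$$e^{itH_N}\, \widehat{A}_N(a^{(p)})\, e^{-itH_N} = \sum_{l=0}^{L-1} \frac{1}{N^l} \sum_{k=l}^{\infty} \widehat{A}_N\big( G^{(k,l)}_t(a^{(p)}) \big) + O\Big( \frac{1}{N^L} \Big), \tag{6.23}$$

where the sum converges on $\mathcal{H}_\pm^{(\nu N)}$ uniformly in N.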

Proof. Using (6.20) we may repeat the proof of Lemma 5.3 to the letter to prove the statements about convergence. Thus (6.22) holds for all ε > 0. The proof of (6.22) for ε = 0 follows by approximation and is deferred to Appendix B.

7. Mean-Field Limit

In this section we identify the mean-field dynamics as the dynamics given by the Hartree equation.

7.1. Hartree equation. The Hartree equation reads

$$i\partial_t \psi = h\psi + (w * |\psi|^2)\psi. \tag{7.1}$$

It is the equation of motion of a classical Hamiltonian system with phase space $\Gamma := H^1(\mathbb{R}^3)$. Here $H^1(\mathbb{R}^3)$ is the usual Sobolev space of index one. In analogy to $\widehat{A}_N$ we define A as the map from closed operators on $\mathcal{H}_+^{(p)}$ to functions on phase space, through

$$\mathrm{A}(a^{(p)})(\psi) := \langle \psi^{\otimes p}, a^{(p)} \psi^{\otimes p} \rangle = \int dx_1 \cdots dx_p\, dy_1 \cdots dy_p\; \bar\psi(x_p) \cdots \bar\psi(x_1)\, a^{(p)}(x_1,\dots,x_p; y_1,\dots,y_p)\, \psi(y_1) \cdots \psi(y_p).$$

We define the space of "observables" $\mathfrak{A}$ as the linear hull of $\{ \mathrm{A}(a^{(p)}) : p \in \mathbb{N},\ a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)}) \}$. The Hamilton function is given by

$$H := \mathrm{A}(h) + \frac{1}{2} \mathrm{A}(W),$$

i.e.

$$H(\psi) = \int dx\, |\nabla \psi|^2 + \frac{1}{2} \int dx\, (w * |\psi|^2) |\psi|^2 = \langle \psi, h\psi \rangle + \frac{1}{2} \langle \psi^{\otimes 2}, W \psi^{\otimes 2} \rangle. \tag{7.2}$$

Using the Hardy-Littlewood-Sobolev and Sobolev inequalities (see e.g. [13]) one sees that H(ψ) is well-defined on Γ:

$$\int dx\, dy\, \frac{|\psi(x)|^2 |\psi(y)|^2}{|x-y|} \lesssim \big\| |\psi|^2 \big\|_{6/5}^2 = \|\psi\|_{12/5}^4 \lesssim \|\psi\|_{H^1}^4,$$

where the symbol ≲ means that the left side is bounded by the right side multiplied by a positive constant that is independent of ψ. The Hartree equation is equivalent to

$$i\partial_t \psi = \partial_{\bar\psi} H(\psi).$$

The symplectic form on Γ is given by

$$\omega = i \int dx\; d\bar\psi(x) \wedge d\psi(x),$$

which induces a Poisson bracket given by

$$\{\psi(x), \bar\psi(y)\} = i\delta(x-y), \qquad \{\psi(x), \psi(y)\} = \{\bar\psi(x), \bar\psi(y)\} = 0.$$

For $A, B \in \mathfrak{A}$ we have that

$$\{A, B\}(\psi) = i \int dx\, \big( \partial_\psi A(\psi)\, \partial_{\bar\psi} B(\psi) - \partial_\psi B(\psi)\, \partial_{\bar\psi} A(\psi) \big).$$

The "mass" function

$$N(\psi) := \int dx\, |\psi|^2$$

is the generator of the gauge transformations $\psi \mapsto e^{-i\theta}\psi$. By the gauge invariance of the Hamiltonian, {H, N} = 0, we conclude, at least formally, that N is a conserved quantity. Similarly, the energy H is formally conserved. The space of observables $\mathfrak{A}$ has the following properties:


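(i) $\overline{\mathrm{A}(a^{(p)})} = \mathrm{A}\big( (a^{(p)})^* \big)$.

(ii) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$ and $b \in \mathcal{B}(\mathcal{H})$, then $\mathrm{A}(a^{(p)})(b\psi) = \mathrm{A}\big( (b^*)^{\otimes p} a^{(p)} b^{\otimes p} \big)(\psi)$.

(iii) If $a^{(p)}$ and $b^{(q)}$ are p- and q-particle operators, respectively, then

$$\big\{ \mathrm{A}(a^{(p)}), \mathrm{A}(b^{(q)}) \big\} = i p q\, \mathrm{A}\big( [a^{(p)}, b^{(q)}]_1 \big). \tag{7.3}$$

(iv) If $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$, then

$$\big| \mathrm{A}(a^{(p)})(\psi) \big| \le \|a^{(p)}\|\, \|\psi\|^{2p}. \tag{7.4}$$

The free time evolution $\phi_0^t(\psi) := e^{-ith}\psi$ is the Hamiltonian flow corresponding to the free Hamilton function A(h). We abbreviate the free time evolution of observables $A \in \mathfrak{A}$ by $A_t := A \circ \phi_0^t$. Thus, $\mathrm{A}(a^{(p)})_t = \mathrm{A}(a_t^{(p)})$. In order to define the Hamiltonian flow on all of $L^2(\mathbb{R}^3)$, we rewrite the Hartree equation (7.1) with initial data ψ(0) = ψ as an integral equation

$$\psi(t) = e^{-ith}\psi - i \int_0^t ds\; e^{-i(t-s)h} (w * |\psi(s)|^2)\psi(s). \tag{7.5}$$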
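The equivalence between the Hartree equation (7.1) and the Hamiltonian form $i\partial_t\psi = \partial_{\bar\psi} H(\psi)$ stated above can be checked by a short formal computation. The sketch below treats ψ and ψ̄ as independent variables and assumes that w is even, w(x) = w(−x); both are assumptions made here for the illustration and are not spelled out at this point of the text.

```latex
\begin{align*}
H(\psi) &= \langle \psi, h\psi\rangle
  + \frac{1}{2}\int dx\,dy\;\bar\psi(x)\,\bar\psi(y)\,w(x-y)\,\psi(x)\,\psi(y),\\
\partial_{\bar\psi(x)} H(\psi) &= (h\psi)(x)
  + \frac{1}{2}\cdot 2\int dy\; w(x-y)\,|\psi(y)|^2\,\psi(x)
  && \text{(two equal terms, using } w(x-y) = w(y-x)\text{)}\\
&= (h\psi)(x) + \big(w * |\psi|^2\big)(x)\,\psi(x).
\end{align*}
```

Inserting this into $i\partial_t\psi = \partial_{\bar\psi}H(\psi)$ reproduces (7.1).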

Lemma 7.1. Let $\psi \in L^2(\mathbb{R}^3)$. Then (7.5) has a unique global solution $\psi(\cdot) \in C(\mathbb{R}; L^2(\mathbb{R}^3))$, which depends continuously on the initial data ψ. Furthermore, ‖ψ(t)‖ = ‖ψ‖ for all t. Finally, we have a Schwinger-Dyson expansion for observables: Let $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$, ν > 0 and t < ρ(κ, ν). Then

$$\begin{aligned}
\mathrm{A}(a^{(p)})(\psi(t)) &= \sum_{k=0}^{\infty} \mathrm{A}\big( G^{(k,0)}_t(a^{(p)}) \big)(\psi) \\
&= \sum_{k=0}^{\infty} \frac{1}{2^k} \int_{\Delta^k(t)} d\underline{t}\; \big\{ \mathrm{A}(W_{t_k}), \dots \big\{ \mathrm{A}(W_{t_1}), \mathrm{A}(a_t^{(p)}) \big\} \dots \big\}(\psi),
\end{aligned} \tag{7.6}$$

uniformly in the ball $B_\nu := \{ \psi \in L^2(\mathbb{R}^3) : \|\psi\|^2 \le \nu \}$.

Proof. The well-posedness of (7.5) is a well-known result; see for instance [4,24]. The remaining statements follow from a "tree expansion", which also yields an existence result. We first use the Schwinger-Dyson expansion to construct an evolution on the space of observables. We then show that this evolution stems from a Hamiltonian flow that satisfies the Hartree equation (7.5).

First, we generalise our class of "observables" to functions that are not gauge invariant, i.e. that correspond to bounded operators $a^{(q,p)} \in \mathcal{B}(\mathcal{H}_+^{(p)}; \mathcal{H}_+^{(q)})$. We set $\mathrm{A}(a^{(q,p)})(\psi) := \langle \psi^{\otimes q}, a^{(q,p)} \psi^{\otimes p} \rangle$, and denote by $\widetilde{\mathfrak{A}}$ the linear hull of observables of the form $\mathrm{A}(a^{(q,p)})$ with $a^{(q,p)} \in \mathcal{B}(\mathcal{H}_+^{(p)}; \mathcal{H}_+^{(q)})$.

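It is convenient to introduce the abbreviations

$$G := \{ \mathrm{A}(h), \cdot\, \}, \qquad D := \frac{1}{2} \{ \mathrm{A}(W), \cdot\, \}.$$

Then $e^{Gt}$ is well-defined on $\widetilde{\mathfrak{A}}$ through $(e^{Gt} A)(\psi) = A(e^{-ith}\psi)$, where $A \in \widetilde{\mathfrak{A}}$. Note also that

$$D_s := e^{Gs} D e^{-Gs} = \frac{1}{2} \{ \mathrm{A}(W_s), \cdot\, \}.$$

Let $A \in \widetilde{\mathfrak{A}}$. We use the Schwinger-Dyson series for $e^{(G+D)t}$ to define the flow S(t)A through

$$\begin{aligned}
S(t)A &:= \sum_{k=0}^{\infty} \int_{\Delta^k(t)} d\underline{t}\; D_{t_k} \cdots D_{t_1} e^{Gt} A \\
&= \sum_{k=0}^{\infty} \int_{\Delta^k(t)} d\underline{t}\; \frac{1}{2^k} \big\{ \mathrm{A}(W_{t_k}), \dots \big\{ \mathrm{A}(W_{t_1}), A_t \big\} \dots \big\}.
\end{aligned} \tag{7.7}$$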

Our first task is to show convergence of (7.7) for small times. Let $A = \mathrm{A}(a^{(q,p)})$. As with (7.3) one finds, after a short computation, that

$$\frac{1}{2} \big\{ \mathrm{A}(W), \mathrm{A}(a^{(q,p)}) \big\} = \mathrm{A}\Big( i \sum_{i=1}^{q} W_{i\,q+1} (a^{(q,p)} \otimes 1) - i \sum_{i=1}^{p} (a^{(q,p)} \otimes 1) W_{i\,p+1} \Big). \tag{7.8}$$

Thus we see that the nested Poisson brackets in (7.7) yield a "tree expansion" which may be described as follows. Define $T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})$ recursively through

$$\begin{aligned}
T^{(0)}_t(a^{(q,p)}) &:= a_t^{(q,p)}, \\
T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)}) &:= i P_+ \sum_{i=1}^{q+k-1} W_{i\,q+k,\,t_k} \big( T^{(k-1)}_{t,t_1,\dots,t_{k-1}}(a^{(q,p)}) \otimes 1 \big) P_+ \\
&\quad - i P_+ \sum_{i=1}^{p+k-1} \big( T^{(k-1)}_{t,t_1,\dots,t_{k-1}}(a^{(q,p)}) \otimes 1 \big) W_{i\,p+k,\,t_k} P_+ .
\end{aligned}$$

Note that $T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})$ is an operator from $\mathcal{H}_+^{(p+k)}$ to $\mathcal{H}_+^{(q+k)}$. Moreover, (7.8) implies that

$$\frac{1}{2^k} \big\{ \mathrm{A}(W_{t_k}), \dots \big\{ \mathrm{A}(W_{t_1}), \mathrm{A}(a_t^{(q,p)}) \big\} \dots \big\} = \mathrm{A}\big( T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)}) \big). \tag{7.9}$$

Also, by definition, we see that for gauge-invariant observables $a^{(p)}$ we have

$$T^{(k)}_{t,t_1,\dots,t_k}(a^{(p)}) = G^{(k,0)}_{t,t_1,\dots,t_k}(a^{(p)}).$$

We may use the methods of Sect. 6 to obtain the desired estimate. One sees that $T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})$ is a sum of elementary terms, indexed by labelled ordered trees, whose root has degree at most p + q, and whose other vertices have at most 3 children. From (6.15) we find that there are

$$\frac{p+q}{p+q+3k} \binom{p+q+3k}{k}$$

unlabelled trees of this kind. Proceeding exactly as in Sect. 6 we find that

$$\Big\| \int_{\Delta^k(t)} d\underline{t}\; T^{(k)}_{t,t_1,\dots,t_k}(a^{(q,p)})\, \Phi^{(p+k)} \Big\| \le \binom{p+q+3k}{k} \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(q,p)}\|\, \|\Phi^{(p+k)}\|,$$

where $\Phi^{(p+k)} \in \mathcal{H}_+^{(p+k)}$. Let $\psi \in L^2(\mathbb{R}^3)$ with ‖ψ‖² ≤ ν. Then $|\mathrm{A}(a^{(q,p)})(\psi)| \le \|a^{(q,p)}\|\, \|\psi\|^{p+q}$ implies

$$\Big| \int_{\Delta^k(t)} d\underline{t}\; \frac{1}{2^k} \big\{ \mathrm{A}(W_{t_k}), \dots \big\{ \mathrm{A}(W_{t_1}), \mathrm{A}(a_t^{(q,p)}) \big\} \dots \big\}(\psi) \Big| \le \binom{p+q+3k}{k} \Big( \frac{\pi \kappa^2 t}{2} \Big)^{k/2} \|a^{(q,p)}\|\, \nu^{k+(p+q)/2}. \tag{7.10}$$

Convergence of the Schwinger-Dyson series (7.7) for small times t follows immediately. Thus, for small times t, the flow S(t) is well-defined on $\widetilde{\mathfrak{A}}$, and it is easy to check that it satisfies the equation

$$S(t)A = e^{Gt} A + \int_0^t ds\; S(s)\, D\, e^{G(t-s)} A, \tag{7.11}$$

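for all $A \in \widetilde{\mathfrak{A}}$. In order to establish a link with the Hartree equation (7.5), we consider $f \in L^2(\mathbb{R}^3)$ and define the function $F_f \in \widetilde{\mathfrak{A}}$ through $F_f(\psi) := \langle f, \psi \rangle$. Clearly, the mapping $f \mapsto (S(t)F_f)(\psi)$ is antilinear and (7.10) implies that it is bounded. Thus there exists a unique vector ψ(t) such that $(S(t)F_f)(\psi) =: \langle f, \psi(t) \rangle$.

We now proceed to show that (S(t)A)(ψ) = A(ψ(t)) for all $A \in \widetilde{\mathfrak{A}}$. By definition, this is true for $A = F_f$. As a first step, we show that

$$S(t)(AB) = (S(t)A)(S(t)B), \tag{7.12}$$

where $A, B \in \widetilde{\mathfrak{A}}$. Write

$$\begin{aligned}
S(t)(AB) &= \sum_{k=0}^{\infty} \int_{\Delta^k(t)} d\underline{t}\; D_{t_k} \cdots D_{t_1} e^{Gt}(AB) \\
&= \sum_{k=0}^{\infty} \int_{\Delta^k(t)} d\underline{t}\; D_{t_k} \cdots D_{t_1} (A_t B_t),
\end{aligned}$$

where we used $e^{Gt}(AB) = (e^{Gt}A)(e^{Gt}B)$. We now claim that

$$\int_{\Delta^k(t)} d\underline{t}\; D_{t_k} \cdots D_{t_1} (A_t B_t) = \sum_{l+m=k} \int_{\Delta^l(t)} d\underline{t} \int_{\Delta^m(t)} d\underline{s}\; \big( D_{t_l} \cdots D_{t_1} A_t \big) \big( D_{s_m} \cdots D_{s_1} B_t \big), \tag{7.13}$$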

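where the sum ranges over l, m ≥ 0. This follows easily by induction on k and using $D_s(AB) = A(D_s B) + (D_s A)B$. Then (7.12) follows immediately.

Next, we note that (7.12) implies that (S(t)A)(ψ) = A(ψ(t)), whenever A is of the form $A = \mathrm{A}(a^{(q,p)})$, where

$$a^{(q,p)} = \sum_j P_+ \big| f_1^j \otimes \cdots \otimes f_q^j \big\rangle \big\langle g_1^j \otimes \cdots \otimes g_p^j \big| P_+, \tag{7.14}$$

where the sum is finite, and $f_i^j, g_i^j \in L^2(\mathbb{R}^3)$. Now each $a^{(q,p)} \in \mathcal{B}(\mathcal{H}_+^{(p)}; \mathcal{H}_+^{(q)})$ can be written as the weak operator limit of a sequence $(a_n^{(q,p)})_{n \in \mathbb{N}}$ of operators of type (7.14). One sees immediately that

$$\lim_n \mathrm{A}(a_n^{(q,p)})(\psi(t)) = \mathrm{A}(a^{(q,p)})(\psi(t)).$$

On the other hand, uniform boundedness implies that $\sup_n \|a_n^{(q,p)}\| < \infty$, so that

$$\begin{aligned}
&\Big| \big\langle \psi^{\otimes(q+k)},\; W_{i_1 j_1, t_{v_1}} \cdots W_{i_r j_r, t_{v_r}} \big( a_n^{(q,p)} \otimes 1^{(k)} \big) W_{i_{r+1} j_{r+1}, t_{v_{r+1}}} \cdots W_{i_k j_k, t_{v_k}}\, \psi^{\otimes(p+k)} \big\rangle \Big| \\
&\qquad \le \|a_n^{(q,p)}\|\; \big\| W_{i_r j_r, t_{v_r}} \cdots W_{i_1 j_1, t_{v_1}} \psi^{\otimes(q+k)} \big\|\; \big\| W_{i_{r+1} j_{r+1}, t_{v_{r+1}}} \cdots W_{i_k j_k, t_{v_k}} \psi^{\otimes(p+k)} \big\|
\end{aligned}$$

justifies the use of dominated convergence in

$$\lim_n \big( S(t)\mathrm{A}(a_n^{(q,p)}) \big)(\psi) = \big( S(t)\mathrm{A}(a^{(q,p)}) \big)(\psi).$$

We have thus shown that

$$(S(t)A)(\psi) = A(\psi(t)), \qquad \forall A \in \widetilde{\mathfrak{A}}. \tag{7.15}$$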

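Let us now return to (7.11). Setting $A = F_f$, we find that (7.11) implies

$$\begin{aligned}
\langle f, \psi(t) \rangle &= \langle f, e^{-ith}\psi \rangle + \int_0^t ds\; \frac{1}{2} \big( S(s) \{ \mathrm{A}(W), (F_f)_{t-s} \} \big)(\psi) \\
&= \langle f, e^{-ith}\psi \rangle + \frac{1}{2} \int_0^t ds\; \{ \mathrm{A}(W), (F_f)_{t-s} \}(\psi(s)),
\end{aligned}$$

where we used (7.15). Using (7.8) we thus find

$$\langle f, \psi(t) \rangle = \langle f, e^{-ith}\psi \rangle - i \int_0^t ds\; \big\langle (e^{ih(t-s)} f) \otimes \psi(s),\; W\, \psi(s) \otimes \psi(s) \big\rangle, \tag{7.16}$$

which is exactly the Hartree equation (7.5) projected onto f. We have thus shown that ψ(t) as defined above solves the Hartree equation.

To show norm-conservation we abbreviate $F(s) := (w * |\psi(s)|^2)\psi(s)$ and write, using (7.5),

$$\|\psi(t)\|^2 - \|\psi\|^2 = i \int_0^t ds\, \big( \langle F(s), e^{-ish}\psi \rangle - \langle e^{-ish}\psi, F(s) \rangle \big) + \int_0^t ds \int_0^t dr\; \langle e^{ish} F(s), e^{irh} F(r) \rangle.$$

The last term is equal to

$$\int_0^t ds \int_0^s dr\, \big( \langle e^{ish} F(s), e^{irh} F(r) \rangle + \langle e^{irh} F(r), e^{ish} F(s) \rangle \big).$$

Therefore (7.5) implies that

$$\|\psi(t)\|^2 - \|\psi\|^2 = i \int_0^t ds\; \langle F(s), \psi(s) \rangle - i \int_0^t ds\; \langle \psi(s), F(s) \rangle = 0,$$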

since $\langle F(s), \psi(s) \rangle \in \mathbb{R}$, as can be seen by explicit calculation. Thus we can iterate the above existence result for short times to obtain a global solution. Furthermore, (7.16) implies that ψ(t) is weakly continuous in t. Since the norm of ψ(t) is conserved, ψ(t) is strongly continuous in t. Similarly, the Schwinger-Dyson expansion (7.7) implies that the map ψ ↦ ψ(t) is weakly continuous for small times, uniformly in ψ in compacts. Therefore, the map ψ ↦ ψ(t) is weakly continuous for all times t, and norm-conservation implies that it is strongly continuous.

7.2. Wick quantisation. In order to state our main result in a general setting, we briefly discuss how the many-body quantum mechanics of bosons can be viewed as a deformation quantisation of the (classical) Hartree theory. The deformation parameter (the analogue of ħ in the usual quantisation of classical theories) is 1/N. We define quantisation as the linear map $\widehat{(\cdot)}_N : \mathfrak{A} \to \widehat{\mathfrak{A}}$ defined by the formal replacement $\psi(x) \mapsto \widehat{\psi}_N(x)$ and $\bar\psi(x) \mapsto \widehat{\psi}_N^*(x)$ followed by Wick ordering. In other words,

$$\widehat{(\cdot)}_N : \mathrm{A}(a^{(p)}) \mapsto \widehat{A}_N(a^{(p)}).$$

Extending the definition of $\widehat{(\cdot)}_N$ to unbounded operators in the obvious way, we see that $\widehat{H}_N$ is the quantisation of H. Note that (3.3) and (7.3) imply, for $A, B \in \mathfrak{A}$,

$$\big[ \widehat{A}_N, \widehat{B}_N \big] = \frac{N^{-1}}{i}\, \widehat{\{A, B\}}_N + O\Big( \frac{1}{N^2} \Big),$$

so that 1/N is indeed the deformation parameter of $\widehat{(\cdot)}_N$.

7.3. Mean-field limit: An Egorov-type result. Let $\phi^t$ denote the Hamiltonian flow of the Hartree equation on $L^2(\mathbb{R}^3)$. Introduce the short-hand notation

$$\alpha^t A := A \circ \phi^t, \quad A \in \mathfrak{A}, \qquad\qquad \widehat{\alpha}^t \widehat{A} := e^{itN\widehat{H}_N}\, \widehat{A}\, e^{-itN\widehat{H}_N}, \quad \widehat{A} \in \widehat{\mathfrak{A}}.$$

We may now state and prove our main result, which essentially says that, in the mean-field limit n = νN → ∞, time evolution and quantisation commute.

Theorem 7.2. Let $A \in \mathfrak{A}$, ν > 0, and ε > 0. Then there exists a function $A(t) \in \mathfrak{A}$ such that

$$\sup_{t \in \mathbb{R}} \big\| \alpha^t A - A(t) \big\|_{L^\infty(B_\nu)} \le \varepsilon,$$

as well as

$$\Big\| \big( \widehat{\alpha}^t \widehat{A}_N - \widehat{A(t)}_N \big) \big|_{\mathcal{H}_+^{(\nu N)}} \Big\| \le \varepsilon + \frac{C(\varepsilon, \nu, t, A)}{N}.$$

Remark. The "intermediate function" A(t) is required, since the full time evolution $\alpha^t$ does not leave $\mathfrak{A}$ invariant.

Proof. Most of the work has already been done in the previous sections. Without loss of generality take $A = \mathrm{A}(a^{(p)})$ for some p ∈ N and $a^{(p)} \in \mathcal{B}(\mathcal{H}_\pm^{(p)})$. Assume that t < ρ(κ, ν). Taking L = 1 in (6.23) we get

$$\widehat{\alpha}^t \widehat{A}_N(a^{(p)}) \big|_{\mathcal{H}_+^{(\nu N)}} = \sum_{k=0}^{\infty} \widehat{A}_N\big( G^{(k,0)}_t(a^{(p)}) \big) \big|_{\mathcal{H}_+^{(\nu N)}} + O\Big( \frac{1}{N} \Big). \tag{7.17}$$

Comparing this with (7.6) immediately yields

$$\widehat{\alpha}^t \widehat{A}_N(a^{(p)}) = \widehat{\big( \alpha^t \mathrm{A}(a^{(p)}) \big)}_N + O\Big( \frac{1}{N} \Big)$$

on $\mathcal{H}_+^{(\nu N)}$, where $\widehat{( \alpha^t \mathrm{A}(a^{(p)}) )}_N$ is defined through its norm-convergent power series. This is the statement of the theorem for short times. The extension to all times follows from an iteration argument. We postpone the details to the proof of Theorem 7.3 below. In its notation A(t) is given by

$$A(t) = \sum_{k_1=0}^{K_1-1} \cdots \sum_{k_m=0}^{K_m-1} \mathrm{A}\big( G^{(k_m,0)}_\tau G^{(k_{m-1},0)}_\tau \cdots G^{(k_1,0)}_\tau a^{(p)} \big).$$

The result may also be expressed in terms of coherent states.

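Theorem 7.3. Let $a^{(p)} \in \mathcal{B}(\mathcal{H}_+^{(p)})$, $\psi \in L^2(\mathbb{R}^3)$ with ‖ψ‖ = 1, and T > 0. Then there exist constants C, β > 0, depending only on p, T and κ, such that

$$\Big| \big\langle \psi^{\otimes N},\, e^{itH_N} \widehat{A}_N(a^{(p)})\, e^{-itH_N} \psi^{\otimes N} \big\rangle - \big\langle \psi(t)^{\otimes p},\, a^{(p)} \psi(t)^{\otimes p} \big\rangle \Big| \le \frac{C}{N^\beta} \|a^{(p)}\|, \qquad t \in [0, T]. \tag{7.18}$$

Here ψ(t) is the solution to the Hartree equation (7.5) with initial data ψ.

Proof. Introduce a cutoff K ∈ N and write (in self-explanatory notation)

$$\widehat{\alpha}^\tau \widehat{A}_N(a^{(p)}) = \sum_{k=0}^{K-1} \widehat{A}_N\big( G^{(k,0)}_\tau(a^{(p)}) \big) + \widehat{\alpha}^\tau_{\ge K} \widehat{A}_N(a^{(p)}) + \frac{1}{N} R_{N,\tau}(a^{(p)}), \tag{7.19}$$

$$\alpha^\tau \mathrm{A}(a^{(p)}) = \sum_{k=0}^{K-1} \mathrm{A}\big( G^{(k,0)}_\tau(a^{(p)}) \big) + \alpha^\tau_{\ge K} \mathrm{A}(a^{(p)}). \tag{7.20}$$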


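To avoid cluttering the notation, from now on we drop the parentheses of the linear map $G^{(k,0)}_\tau$. We iterate (7.19) m times by applying it to its first term and get

$$\begin{aligned}
(\widehat{\alpha}^\tau)^m \widehat{A}_N(a^{(p)}) ={}& \sum_{k_1=0}^{K_1-1} \cdots \sum_{k_m=0}^{K_m-1} \widehat{A}_N\big( G^{(k_m,0)}_\tau G^{(k_{m-1},0)}_\tau \cdots G^{(k_1,0)}_\tau a^{(p)} \big) \\
&+ (\widehat{\alpha}^\tau)^{m-1} \widehat{\alpha}^\tau_{\ge K_1} \widehat{A}_N(a^{(p)}) + \sum_{j=1}^{m-1} \sum_{k_1=0}^{K_1-1} \cdots \sum_{k_j=0}^{K_j-1} (\widehat{\alpha}^\tau)^{m-1-j}\, \widehat{\alpha}^\tau_{\ge K_{j+1}} \widehat{A}_N\big( G^{(k_j,0)}_\tau G^{(k_{j-1},0)}_\tau \cdots G^{(k_1,0)}_\tau a^{(p)} \big) \\
&+ \frac{1}{N} (\widehat{\alpha}^\tau)^{m-1} R_{N,\tau}(a^{(p)}) + \frac{1}{N} \sum_{j=1}^{m-1} \sum_{k_1=0}^{K_1-1} \cdots \sum_{k_j=0}^{K_j-1} (\widehat{\alpha}^\tau)^{m-1-j} R_{N,\tau}\big( G^{(k_j,0)}_\tau \cdots G^{(k_1,0)}_\tau a^{(p)} \big).
\end{aligned} \tag{7.21}$$

A similar expression without the third line holds for $(\alpha^\tau)^m \mathrm{A}(a^{(p)})$. In order to control this somewhat unpleasant expression, we abbreviate

$$x := \frac{\tau}{\rho(\kappa, 1)}.$$

Assume that x < 1. Then (6.20) and (6.23) imply the estimates, valid on $\mathcal{H}_+^{(N)}$,

$$\begin{aligned}
\big\| G^{(k,0)}_\tau a^{(p)} \big\| &\le 4^p \|a^{(p)}\|\, x^k, \\
\big\| \widehat{\alpha}^\tau_{\ge K} \widehat{A}_N(a^{(p)}) \big\| &\le 4^p \|a^{(p)}\| \frac{x^K}{1-x}, \\
\big\| R_{N,\tau}(a^{(p)}) \big\| &\le (4e)^p \|a^{(p)}\| \frac{x}{(1-x)^3}.
\end{aligned}$$

Furthermore, (7.6) implies that

$$\big\| \alpha^\tau_{\ge K} \mathrm{A}(a^{(p)}) \big\|_{L^\infty(B_1)} \le 4^p \|a^{(p)}\| \frac{x^K}{1-x}.$$

We also need

$$\begin{aligned}
\Big| \big\langle \psi^{\otimes N}, \widehat{A}_N(a^{(p)}) \psi^{\otimes N} \big\rangle - \mathrm{A}(a^{(p)})(\psi) \Big| &= \Big| \frac{N \cdots (N-p+1)}{N^p} - 1 \Big|\, \big| \mathrm{A}(a^{(p)})(\psi) \big| \\
&\le \sum_{j=1}^{p-1} \Big| \frac{N \cdots (N-j)}{N^{j+1}} - \frac{N \cdots (N-j+1)}{N^j} \Big|\, \|a^{(p)}\| \\
&\le \frac{p^2}{N} \|a^{(p)}\|.
\end{aligned} \tag{7.22}$$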
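The combinatorial factor in (7.22) is easy to check by hand for small values. The following minimal Python sketch verifies, with exact rational arithmetic, that |N(N−1)···(N−p+1)/N^p − 1| ≤ p²/N on a small grid of values; the helper names `falling_ratio` and `satisfies_722` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def falling_ratio(N: int, p: int) -> Fraction:
    """Return N (N-1) ... (N-p+1) / N^p as an exact fraction."""
    ratio = Fraction(1)
    for j in range(p):
        ratio *= Fraction(N - j, N)
    return ratio


def satisfies_722(N: int, p: int) -> bool:
    """Check the scalar inequality behind (7.22): |ratio - 1| <= p^2 / N."""
    return abs(falling_ratio(N, p) - 1) <= Fraction(p * p, N)


if __name__ == "__main__":
    print(falling_ratio(10, 2))   # the factor for N = 10, p = 2
    print(all(satisfies_722(N, p) for N in range(1, 60) for p in range(1, N + 1)))
```

The telescoping sum in (7.22) explains why the bound holds: each factor (N − j)/N differs from 1 by j/N, and there are fewer than p such factors.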

Armed with these estimates we may now complete the proof of Theorem 7.3. Suppose that 1/2 ≤ x < 1. Then

$$\begin{aligned}
&\Big| \sum_{k_1=0}^{K_1-1} \cdots \sum_{k_m=0}^{K_m-1} \Big[ \big\langle \psi^{\otimes N}, \widehat{A}_N\big( G^{(k_m,0)}_\tau G^{(k_{m-1},0)}_\tau \cdots G^{(k_1,0)}_\tau a^{(p)} \big) \psi^{\otimes N} \big\rangle - \mathrm{A}\big( G^{(k_m,0)}_\tau G^{(k_{m-1},0)}_\tau \cdots G^{(k_1,0)}_\tau a^{(p)} \big)(\psi) \Big] \Big| \\
&\qquad \le \frac{1}{N} (p + K_1 + \cdots + K_m)^2\, 4^{m(p + K_1 + \cdots + K_m)} \|a^{(p)}\|.
\end{aligned}$$

Similarly, the second line of (7.21) on $\mathcal{H}_+^{(N)}$ and its classical equivalent on $B_1$ are bounded by

$$\sum_{j=1}^{m} x^{K_j}\, 4^{j(p + K_1 + \cdots + K_{j-1})} \|a^{(p)}\|.$$

Finally, the last line of (7.21) on $\mathcal{H}_+^{(N)}$ is bounded by

$$\frac{1}{N} \sum_{j=1}^{m} 4^{(j+1)(p + K_1 + \cdots + K_{j-1})} \|a^{(p)}\|.$$

Now pick m large enough that T ≤ mτ. Then it is easy to check that there exist $a_1, \dots, a_m$ such that setting

$$K_j = a_j \log N, \qquad j = 1, \dots, m$$

implies that the three above expressions are all bounded by $C N^{-\beta} \|a^{(p)}\|$, for some β > 0. This remains of course true for all m′ ≤ m. Since any time t ≤ T can be reached by at most m iterations with 1/2 ≤ x < 1, the claim follows.

We conclude with a short discussion on density matrices. First we recall some standard results; see for instance [18]. Let $\Gamma \in \mathcal{L}^1$, where $\mathcal{L}^1$ is the space of trace class operators on some Hilbert space. Equipped with the norm $\|\Gamma\|_1 := \operatorname{Tr}|\Gamma|$, $\mathcal{L}^1$ is a Banach space. Its dual is equal to $\mathcal{B}$, the space of bounded operators, and the dual pairing is given by

$$\langle A, \Gamma \rangle = \operatorname{Tr}(A\Gamma), \qquad A \in \mathcal{B},\ \Gamma \in \mathcal{L}^1.$$

Therefore,

$$\|\Gamma\|_1 = \sup_{A \in \mathcal{B},\, \|A\| \le 1} |\operatorname{Tr}(A\Gamma)|. \tag{7.23}$$

Consider an N-particle density matrix $0 \le \Gamma_N \in \mathcal{L}^1(\mathcal{H}_+^{(N)})$ that satisfies $\operatorname{Tr} \Gamma_N = 1$ and is symmetric in the sense that $\Gamma_N P_+ = \Gamma_N$. Define the p-particle marginals

$$\Gamma_N^{(p)} := \operatorname{Tr}_{p+1,\dots,N} \Gamma_N,$$

where $\operatorname{Tr}_{p+1,\dots,N}$ denotes the partial trace over the coordinates p + 1, . . . , N. Define furthermore

$$\Gamma_N(t) = e^{-itH_N}\, \Gamma_N\, e^{itH_N},$$

as well as the p-particle marginals $\Gamma_N^{(p)}(t)$ of $\Gamma_N(t)$. Noting that

$$\operatorname{Tr}\big( \widehat{A}_N(a^{(p)})\, \Gamma_N(t) \big) = \frac{p!}{N^p} \binom{N}{p} \operatorname{Tr}\big( a^{(p)} \Gamma_N^{(p)}(t) \big) = \operatorname{Tr}\big( a^{(p)} \Gamma_N^{(p)}(t) \big) + O\Big( \frac{1}{N} \Big),$$

we see that (7.23) and Theorem 7.3 imply

Corollary 7.4. Let $\psi \in \mathcal{H}$ with ‖ψ‖ = 1, and let ψ(t) be the solution of (7.5) with initial data ψ. Set $\Gamma_N := (|\psi\rangle\langle\psi|)^{\otimes N}$. Then, for any p ∈ N and T > 0 there exist constants C, β > 0, depending only on p, T and κ, such that

$$\big\| \Gamma_N^{(p)}(t) - (|\psi(t)\rangle\langle\psi(t)|)^{\otimes p} \big\|_1 \le \frac{C}{N^\beta}, \qquad t \in [0, T].$$

Remark. Actually it is enough for $\Gamma_N$ to factorise asymptotically. If $(\Gamma_N)_{N \in \mathbb{N}}$ is a sequence of symmetric density matrices satisfying

$$\lim_{N \to \infty} \big\| \Gamma_N^{(1)} - |\psi\rangle\langle\psi| \big\|_1 = 0,$$

then one finds

$$\lim_{N \to \infty} \big\| \Gamma_N^{(1)}(t) - |\psi(t)\rangle\langle\psi(t)| \big\|_1 = 0, \qquad t \in \mathbb{R}.$$

This is a straightforward corollary of the proof of Theorem 7.3. By an argument of Lieb and Seiringer (see the remark after Theorem 1 in [14]), this implies that

$$\lim_{N \to \infty} \big\| \Gamma_N^{(p)}(t) - (|\psi(t)\rangle\langle\psi(t)|)^{\otimes p} \big\|_1 = 0, \qquad t \in \mathbb{R},$$

for all p.

8. Some Generalisations

In this section we generalise our results to a larger class of interaction potentials, and allow an external potential. For this we need Strichartz estimates for Lorentz spaces. We start with a short summary of the relevant results (see [1,11]).

For 1 ≤ q ≤ ∞ and 0 < θ < 1 we define the real interpolation functor $(\cdot, \cdot)_{\theta,q}$ as follows. Let $A_0$ and $A_1$ be two Banach spaces contained in some larger Banach space A. Define the real interpolation norm

$$\|a\|_{(A_0,A_1)_{\theta,q}} := \begin{cases} \Big( \int_0^\infty \big( t^{-\theta} K(t,a) \big)^q\, dt/t \Big)^{1/q}, & q < \infty, \\ \sup_{t \ge 0}\, t^{-\theta} K(t,a), & q = \infty, \end{cases}$$

where

$$K(t,a) := \inf_{a = a_0 + a_1} \big( \|a_0\|_{A_0} + t \|a_1\|_{A_1} \big).$$

Define $(A_0, A_1)_{\theta,q}$ as the space of a ∈ A such that $\|a\|_{(A_0,A_1)_{\theta,q}} < \infty$. Then $(A_0, A_1)_{\theta,q}$ is a Banach space. The Lorentz space $L^{p,q}(\mathbb{R}^3, dx) \equiv L^{p,q}$ is defined by interpolation as

$$L^{p,q} := (L^{p_0}, L^{p_1})_{\theta,q},$$

where $1 \le p_0, p_1 \le \infty$, $p_0 \ne p_1$, and

$$\frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}.$$

Lorentz spaces have the following properties that are of interest to us. First, $L^{p,p} = L^p$. Second, $L^{p,\infty} = L^p_w$, where $L^p_w$ is the weak $L^p$ space (see e.g. [1,19]). In particular, we have for the Coulomb potential in 3 dimensions

$$\frac{1}{|x|} \in L^{3,\infty}.$$

Finally, Lorentz spaces satisfy a general Hölder inequality (see [17]): Let $1 < p, p_1, p_2 < \infty$ and $1 \le q, q_1, q_2 \le \infty$ satisfy

$$\frac{1}{p_1} + \frac{1}{p_2} = \frac{1}{p}, \qquad \frac{1}{q_1} + \frac{1}{q_2} = \frac{1}{q}.$$

Then we have

$$\|fg\|_{L^{p,q}} \lesssim \|f\|_{L^{p_1,q_1}} \|g\|_{L^{p_2,q_2}}. \tag{8.1}$$
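The membership of the Coulomb potential in weak L³ mentioned above can be verified by a direct computation of the distribution function. The sketch below assumes the usual quasi-norm $\sup_{\lambda > 0} \lambda\, |\{ |f| > \lambda \}|^{1/p}$ for the weak $L^p$ space, which is not written out in the text.

```latex
\begin{align*}
\big|\{x \in \mathbb{R}^3 : |x|^{-1} > \lambda\}\big|
  &= \big|\{x \in \mathbb{R}^3 : |x| < \lambda^{-1}\}\big|
   = \frac{4\pi}{3}\,\lambda^{-3},\\
\sup_{\lambda > 0}\; \lambda\,\big|\{x : |x|^{-1} > \lambda\}\big|^{1/3}
  &= \Big(\frac{4\pi}{3}\Big)^{1/3} < \infty,\\
\int_{\mathbb{R}^3} \frac{dx}{|x|^3}
  &= 4\pi \int_0^\infty \frac{dr}{r} = \infty .
\end{align*}
```

The first two lines show that 1/|x| lies in weak L³, while the last line shows that it does not lie in L³ itself, which is why the weak space is needed.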

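We need an endpoint homogeneous Strichartz estimate proved in [11]. For a map $f : \mathbb{R} \to L^{p,q}$ we define the space-time norm

$$\|f\|_{L^r_t L^{p,q}_x} := \Big( \int dt\; \|f(t)\|^r_{L^{p,q}} \Big)^{1/r}.$$

Then Theorem 10.1 of [11] implies that

$$\big\| e^{it\Delta} f \big\|_{L^r_t L^{p,2}_x} \lesssim \|f\|_{L^2}, \tag{8.2}$$

whenever 2 ≤ r < ∞ and

$$\frac{2}{r} + \frac{3}{p} = \frac{3}{2}.$$

We are now set for proving a generalisation of (6.2).

Lemma 8.1. Let $w \in L^3_w + L^\infty$. Then there is a constant C = C(w) > 0, such that

$$\int_0^1 \big\| w\, e^{it\Delta} \psi \big\|^2 dt \le C \|\psi\|^2.$$

Proof. Let $w = w_1 + w_2$ with $w_1 \in L^\infty$ and $w_2 \in L^3_w$. Then

$$\big\| w\, e^{it\Delta} \psi \big\|_{L^2_t L^2_x} \le \big\| w_1 e^{it\Delta} \psi \big\|_{L^2_t L^2_x} + \big\| w_2 e^{it\Delta} \psi \big\|_{L^2_t L^2_x}.$$

The first term is bounded by $\|w_1\|_{L^\infty} \|\psi\|_{L^2}$. To bound the second we use (8.1) and (8.2) with r = 2 and p = 6 to get

$$\big\| w_2 e^{it\Delta} \psi \big\|_{L^2_t L^2_x} \lesssim \|w_2\|_{L^{3,\infty}} \big\| e^{it\Delta} \psi \big\|_{L^2_t L^{6,2}_x} \lesssim \|w_2\|_{L^{3,\infty}} \|\psi\|_{L^2}.$$

Therefore,

$$\big\| w\, e^{it\Delta} \psi \big\|_{L^2_t L^2_x} \le \sqrt{C(w)}\, \|\psi\|_{L^2}.$$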

Now let us assume that $v, w \in L^\infty + L^3_w$. Set $H_0|_{\mathcal{H}_\pm^{(n)}} := \sum_{i=1}^n -\Delta_i$. Then the required generalisation of Lemma 6.1 is

Lemma 8.2. There exists a constant C ≡ C(w, v) such that

$$\int_0^1 \big\| W_{ij}\, e^{-itH_0} \Phi^{(n)} \big\|^2 dt \le C \|\Phi^{(n)}\|^2, \qquad \int_0^1 \big\| V_i\, e^{-itH_0} \Phi^{(n)} \big\|^2 dt \le C \|\Phi^{(n)}\|^2,$$

where $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$.

Proof. The claim for V follows immediately from Lemma 8.1. The estimate for W follows similarly by using centre of mass coordinates.

Finally, we briefly discuss the changes to the combinatorics arising from an external potential. We classify the elementary terms according to the numbers (k, l, m), where k is the order of the multiple commutator, l is the number of loops, and m is the number of V-operators. Thus, instead of (4.5), we have the recursive definition

$$\begin{aligned}
G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)}) ={}& i(p+k-l-m-1) \big[ W_{t_k}, G^{(k-1,l,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \big]_1 \\
&+ i \binom{p+k-l-m}{2} \big[ W_{t_k}, G^{(k-1,l-1,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \big]_2 \\
&+ i(p+k-l-m) \big[ V_{t_k}, G^{(k-1,l,m-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \big]_1 \\
={}& i P_\pm \sum_{i=1}^{p+k-l-m-1} \big[ W_{i\,p+k-l-m,\,t_k},\; G^{(k-1,l,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \otimes 1 \big] P_\pm \\
&+ i P_\pm \sum_{1 \le i < j \le p+k-l-m} \big[ W_{ij,\,t_k},\; G^{(k-1,l-1,m)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \big] P_\pm \\
&+ i P_\pm \sum_{i=1}^{p+k-l-m} \big[ V_{i,\,t_k},\; G^{(k-1,l,m-1)}_{t,t_1,\dots,t_{k-1}}(a^{(p)}) \big] P_\pm,
\end{aligned}$$

as well as $G^{(0,0,0)}_t(a^{(p)}) := a_t^{(p)}$. We also set $G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)}) = 0$ unless 0 ≤ l ≤ k − m. It is again an easy exercise to show by induction on k that

$$\frac{(iN)^k}{2^k} \big[ \widehat{A}_N(W_{t_k}), \dots \big[ \widehat{A}_N(W_{t_1}), \widehat{A}_N(a_t^{(p)}) \big] \dots \big] = \sum_{l=0}^{k} \sum_{m=0}^{k-l} \frac{1}{N^l}\, \widehat{A}_N\big( G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)}) \big).$$

Note that $G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)})$ is a (p + k − l − m)-particle operator.

The graphs of Sect. 6 have to be modified: Each vertex corresponding to a V-operator has one edge for each direction d = a, c (see Fig. 8.1). Let us first consider tree graphs, l = 0. Take the set of trees without an external potential as in Sect. 6. By allowing each vertex v = 1, . . . , k whose edges (a, 2) and (c, 2) are empty to stand for either an interaction potential W or an external potential V, we count all trees with an external potential. Thus, for a given m, there are at most $\binom{k}{m} |\mathscr{G}(p,k,0)|$ tree graphs contributing to $G^{(k,0,m)}_{t,t_1,\dots,t_k}(a^{(p)})$.

Fig. 8.1. An admissible graph of type (p = 4, k = 7, l = 2, m = 2)

If l > 0 we repeat the argument in the proof of Lemma 6.2, and find that the number of graph structures contributing to $G^{(k,l,m)}_{t,t_1,\dots,t_k}(a^{(p)})$ is bounded by

$$2^k \binom{k}{m} \binom{k}{l} \binom{2p+3k}{k} (p+k-l-m)^l.$$

Putting all this together, we find that

$$\big\| G^{(k,l,m)}_t(a^{(p)}) \big\| \le \binom{k}{m} \binom{k}{l} \binom{2p+3k}{k} (p+k-l-m)^l\, (Ct)^{k/2}\, \|a^{(p)}\|.$$

Using the condition p + k − l − m ≤ n, it is then easy to see that all convergence estimates remain valid with the additional factor $2^k$. In summary, all of the results of Sects. 6 and 7 hold if $v, w \in L^3_w + L^\infty$.

A. Second Quantisation

We briefly summarise the main ingredients of many-body quantum mechanics and second quantisation. See for instance [2] for an extensive discussion.

Let $\mathcal{H} = L^2(\mathbb{R}^d, dx)$ be the "one-particle Hilbert space", where d ∈ N. Many-body quantum mechanics is formulated on subspaces of the n-particle spaces $\mathcal{H}^{\otimes n}$. Let $P_\pm^{(n)} \equiv P_\pm$ be the orthogonal projector onto the symmetric/antisymmetric subspace of $\mathcal{H}^{\otimes n}$, i.e.

$$(P_\pm \Phi^{(n)})(x_1,\dots,x_n) := \frac{1}{n!} \sum_{\sigma \in S_n} (\pm 1)^{|\sigma|}\, \Phi^{(n)}(x_{\sigma(1)},\dots,x_{\sigma(n)}),$$

where |σ| denotes the number of transpositions in the permutation σ, and $\Phi^{(n)} \in \mathcal{H}^{\otimes n}$. We define the bosonic n-particle space as $\mathcal{H}_+^{(n)} := P_+ \mathcal{H}^{\otimes n}$, and the fermionic n-particle space as $\mathcal{H}_-^{(n)} := P_- \mathcal{H}^{\otimes n}$. We adopt the usual convention that $\mathcal{H}^{\otimes 0} = \mathbb{C}$.


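We introduce the Fock space

$$\mathcal{F}_\pm(\mathcal{H}) \equiv \mathcal{F}_\pm := \bigoplus_{n=0}^{\infty} \mathcal{H}_\pm^{(n)}.$$

A state $\Phi \in \mathcal{F}_\pm$ is a sequence $\Phi = (\Phi^{(n)})_{n=0}^{\infty}$, where $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$. Equipped with the scalar product

$$\langle \Phi, \Psi \rangle = \sum_{n=0}^{\infty} \langle \Phi^{(n)}, \Psi^{(n)} \rangle,$$

$\mathcal{F}_\pm$ is a Hilbert space. The vector Ω := (1, 0, 0, . . . ) is called the vacuum. By a slight abuse of notation, we denote a vector of the form $\Phi = (0, \dots, 0, \Phi^{(n)}, 0, \dots) \in \mathcal{F}_\pm$ by its non-vanishing n-particle component $\Phi^{(n)}$. Define also the subspace of vectors with a finite particle number

$$\mathcal{F}_\pm^0 := \{ \Phi \in \mathcal{F}_\pm : \Phi^{(n)} = 0 \text{ for all but finitely many } n \}.$$

On $\mathcal{F}_\pm$ we have the usual creation and annihilation operators, $\widehat{\psi}^*$ and $\widehat{\psi}$, which map the one-particle space $\mathcal{H}$ into densely defined closable operators on $\mathcal{F}_\pm$. For $f \in \mathcal{H}$ and $\Phi \in \mathcal{F}_\pm$, they are defined by

$$\begin{aligned}
\big( \widehat{\psi}^*(f) \Phi \big)^{(n)}(x_1,\dots,x_n) &:= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (\pm 1)^{i-1} f(x_i)\, \Phi^{(n-1)}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_n), \\
\big( \widehat{\psi}(f) \Phi \big)^{(n)}(x_1,\dots,x_n) &:= \sqrt{n+1} \int dy\; \bar{f}(y)\, \Phi^{(n+1)}(y,x_1,\dots,x_n).
\end{aligned}$$

It is not hard to see that $\widehat{\psi}(f)$ and $\widehat{\psi}^*(f)$ are adjoints of each other (see for instance [2] for details). Furthermore, they satisfy the canonical (anti)commutation relations

$$\big[ \widehat{\psi}(f), \widehat{\psi}^*(g) \big]_\mp = \langle f, g \rangle\, 1, \qquad \big[ \widehat{\psi}^\#(f), \widehat{\psi}^\#(g) \big]_\mp = 0, \tag{A.1}$$

where $[A, B]_\mp := AB \mp BA$, and $\widehat{\psi}^\# = \widehat{\psi}^*$ or $\widehat{\psi}$. In order to simplify notation, we usually identify c1 with c, where c ∈ C.

For our purposes, it is more natural to work with the rescaled creation and annihilation operators

$$\widehat{\psi}_N^\# := \frac{1}{\sqrt{N}}\, \widehat{\psi}^\#,$$

where N > 0. We also introduce the operator-valued distributions defined formally by

$$\widehat{\psi}_N^\#(x) := \widehat{\psi}_N^\#(\delta_x),$$

where $\delta_x$ is the delta function at x. The formal expression $\widehat{\psi}_N^\#(x)$ has a rigorous meaning as a densely defined sesquilinear form on $\mathcal{F}_\pm$ (see [19] for details). In particular one has that

$$\widehat{\psi}_N(f) = \int dx\; \bar{f}(x)\, \widehat{\psi}_N(x), \qquad \widehat{\psi}_N^*(f) = \int dx\; f(x)\, \widehat{\psi}_N^*(x).$$

Furthermore, the (anti)commutation relations (A.1) imply that

$$\big[ \widehat{\psi}_N(x), \widehat{\psi}_N^*(y) \big]_\mp = \frac{1}{N} \delta(x - y), \qquad \big[ \widehat{\psi}_N^\#(x), \widehat{\psi}_N^\#(y) \big]_\mp = 0. \tag{A.2}$$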
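The effect of the rescaling on the commutation relations can be illustrated on a single bosonic mode. The minimal Python sketch below restricts the annihilation operator to the span of the states $f^{\otimes n}$, n = 0, . . . , dim − 1, for a fixed normalised f, where it acts as the familiar matrix with entries √n. This single-mode truncation and all function names are assumptions made for the illustration; the truncation spoils the commutator in the last diagonal entry, so only the other entries are meaningful.

```python
from math import sqrt


def annihilation(dim: int):
    """Matrix of a on span{|0>, ..., |dim-1>}, with a|n> = sqrt(n) |n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = sqrt(n)
    return a


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    cols = transpose(y)
    return [[sum(u * v for u, v in zip(row, col)) for col in cols] for row in x]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[s - t for s, t in zip(r1, r2)] for r1, r2 in zip(xy, yx)]


def rescaled_commutator(dim: int, N: float):
    """[a_N, a_N^*] for a_N = a / sqrt(N) on the truncated single-mode space."""
    a_N = [[v / sqrt(N) for v in row] for row in annihilation(dim)]
    return commutator(a_N, transpose(a_N))  # real entries: adjoint = transpose


if __name__ == "__main__":
    c = rescaled_commutator(6, 4.0)
    print([round(c[i][i], 12) for i in range(5)])  # entries away from the cutoff
```

Away from the truncation edge the diagonal entries equal 1/N, in line with (A.2) smeared against a normalised f.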

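B. The Limit ε → 0 in Lemma 6.3

What remains is the justification of the equality in (6.22) for ε = 0. Our strategy is to show that both sides of (6.23) with ε > 0 converge strongly to the same expression with ε = 0.

We first show the strong convergence of $G^{(k,l),\varepsilon}_t(a^{(p)})$. Let $\Phi^{(n)} \in \mathcal{H}_\pm^{(n)}$ and consider

$$\big\| (W^\varepsilon_{ij,s} - W_{ij,s}) \Phi^{(n)} \big\| = \big\| \mathbb{1}_{\{|W_{ij}| > \varepsilon^{-1}\}} W_{ij}\, e^{-isH_0} \Phi^{(n)} \big\| \le \big\| W_{ij}\, e^{-isH_0} \Phi^{(n)} \big\|.$$

Since the right side is in $L^1([0,t])$, we may use dominated convergence to conclude that

$$\lim_{\varepsilon \to 0} \int_0^t ds\; \big\| (W^\varepsilon_{ij,s} - W_{ij,s}) \Phi^{(n)} \big\| = 0.$$

Now

$$\begin{aligned}
&\int_0^t ds \int_0^t ds'\; \big\| W^\varepsilon_{ij,s} W^\varepsilon_{i'j',s'} \Phi^{(n)} - W_{ij,s} W_{i'j',s'} \Phi^{(n)} \big\| \\
&\qquad \le \int_0^t ds \int_0^t ds'\; \big\| W^\varepsilon_{ij,s} W^\varepsilon_{i'j',s'} \Phi^{(n)} - W^\varepsilon_{ij,s} W_{i'j',s'} \Phi^{(n)} \big\| \\
&\qquad\quad + \int_0^t ds \int_0^t ds'\; \big\| W^\varepsilon_{ij,s} W_{i'j',s'} \Phi^{(n)} - W_{ij,s} W_{i'j',s'} \Phi^{(n)} \big\|.
\end{aligned}$$

The first term is bounded by

$$\Big( \frac{\pi \kappa^2 t}{2} \Big)^{1/2} \int_0^t ds'\; \big\| W^\varepsilon_{i'j',s'} \Phi^{(n)} - W_{i'j',s'} \Phi^{(n)} \big\| \to 0, \qquad \varepsilon \to 0.$$

The integrand of the second term is bounded by $2 \| W_{ij,s} W_{i'j',s'} \Phi^{(n)} \| \in L^1([0,t]^2)$, so that dominated convergence implies that the second term vanishes in the limit ε → 0. A straightforward generalisation of this argument shows that

$$G^{(k,l),\varepsilon}_t(a^{(p)})\, \Phi^{(p+k-l)} \to G^{(k,l)}_t(a^{(p)})\, \Phi^{(p+k-l)},$$

as claimed. Since the series (6.22) converges uniformly in ε, we find that

$$\sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{1}{N^l}\, \widehat{A}_N\big( G^{(k,l),\varepsilon}_t(a^{(p)}) \big) \Phi^{(n)} \to \sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{1}{N^l}\, \widehat{A}_N\big( G^{(k,l)}_t(a^{(p)}) \big) \Phi^{(n)},$$

as ε → 0.

Next, we show that $e^{-itH_N^\varepsilon} \Phi^{(n)} \to e^{-itH_N} \Phi^{(n)}$. This follows from strong resolvent convergence of $H_N^\varepsilon$ to $H_N$ as ε → 0 by Trotter's theorem [18]. Let $W^\varepsilon := \sum_{i<j} W^\varepsilon_{ij}$, and consider

$$N \big\| (H_N^\varepsilon - i)^{-1} \Phi^{(n)} - (H_N - i)^{-1} \Phi^{(n)} \big\| = \big\| (H_N^\varepsilon - i)^{-1} (W - W^\varepsilon)(H_N - i)^{-1} \Phi^{(n)} \big\| \le \big\| (W - W^\varepsilon)(H_N - i)^{-1} \Phi^{(n)} \big\|.$$

Clearly $\Psi^{(n)} := (H_N - i)^{-1} \Phi^{(n)}$ is in the domain of $H_N$. By the Kato-Rellich theorem [19], $\Psi^{(n)}$ is in the domain of $W_{ij}$ for all i, j. Therefore,

$$\big\| (W_{ij} - W^\varepsilon_{ij})(H_N - i)^{-1} \Phi^{(n)} \big\| = \big\| \mathbb{1}_{\{|W_{ij}| > \varepsilon^{-1}\}} W_{ij} \Psi^{(n)} \big\| \to 0$$

as ε → 0. Therefore

$$e^{itH_N^\varepsilon}\, \widehat{A}_N(a^{(p)})\, e^{-itH_N^\varepsilon} \Phi^{(n)} \to e^{itH_N}\, \widehat{A}_N(a^{(p)})\, e^{-itH_N} \Phi^{(n)}$$

as ε → 0, and the proof is complete.

Acknowledgements. We thank W. De Roeck, S. Graffi and A. Pizzo for useful discussions and encouragement. We should also like to thank a referee for pointing out Ref. [14] in connection with the remark following Corollary 7.4.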

References

1. Bergh, J., Löfström, J.: Interpolation Spaces, an Introduction. Berlin-Heidelberg-New York: Springer, 1976
2. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. Berlin-Heidelberg-New York: Springer, 2002
3. Brown, W., Hepp, K.: The Vlasov dynamics and its fluctuations in the 1/N limit of interacting classical particles. Commun. Math. Phys. 56, 101–113 (1977)
4. Chadam, J.M., Glassey, R.T.: Global existence of solutions to the Cauchy problem for time-dependent Hartree equations. J. Math. Phys. 16, 1122 (1975)
5. Egorov, Y.V.: The canonical transformations of pseudodifferential operators. Usp. Mat. Nauk 25, 235–236 (1969)
6. Erdős, L., Yau, H.-T.: Derivation of the nonlinear Schrödinger equation with Coulomb potential. Adv. Theor. Math. Phys. 5, 1169–1205 (2001)
7. Fröhlich, J., Graffi, S., Schwarz, S.: Mean-field and classical limit of many-body Schrödinger dynamics for bosons. Commun. Math. Phys. 271, 681–697 (2007)
8. Fröhlich, J., Knowles, A., Pizzo, A.: Atomism and Quantization. J. Phys. A 40, 3033–3045 (2007)
9. Ginibre, J., Velo, G.: The classical field limit of scattering theory for non-relativistic many-boson systems. I-II. Commun. Math. Phys. 66, 37–76 (1979); Commun. Math. Phys. 68, 45–68 (1979)
10. Hepp, K.: The classical limit for quantum mechanical correlation functions. Commun. Math. Phys. 35, 265–277 (1974)
11. Keel, M., Tao, T.: Endpoint Strichartz estimates. Amer. J. Math. 120, 955–980 (1998)
12. Knuth, D.E.: The Art of Computer Programming, Vol. 1. Reading, MA: Addison-Wesley, 1998
13. Lieb, E.H., Loss, M.: Analysis. Providence, RI: Amer. Math. Soc., 2001
14. Lieb, E.H., Seiringer, R.: Proof of Bose-Einstein condensation for dilute trapped gases. Phys. Rev. Lett. 88(17), 170409 (2002)
15. Narnhofer, H., Sewell, G.L.: Vlasov hydrodynamics of a quantum mechanical model. Commun. Math. Phys. 79, 9–24 (1981)
16. Neunzert, H.: Fluid Dyn. Trans. 9, 229 (1977); Neunzert, H.: Neuere qualitative und numerische Methoden in der Plasmaphysik. Paderborn: Vorlesungsmanuskript, 1975
17. O'Neil, R.: Convolution operators and L(p, q) spaces. Duke Math. J. 30, 129–142 (1963)
18. Reed, M., Simon, B.: Methods of Modern Mathematical Physics I: Functional Analysis. New York: Academic Press, 1980
19. Reed, M., Simon, B.: Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1975
20. Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV: Analysis of Operators. New York: Academic Press, 1978
21. Rodnianski, I., Schlein, B.: Quantum fluctuations and rate of convergence towards mean field dynamics. http://arXiv.org/abs/0711.3087v1 [math-ph], 2007
22. Schlein, B., Erdős, L.: Quantum Dynamics with Mean Field Interactions: a New Approach. http://arXiv.org/abs/0804.3774v1 (2008)
23. Simon, B.: Best constants in some operator smoothness estimates. J. Func. Anal. 107, 66–71 (1992)
24. Zagatti, S.: The Cauchy problem for Hartree-Fock time-dependent equations. Ann. Inst. Henri Poincaré (A) 56, 357–374 (1992)

Communicated by H.-T. Yau




Inverse Spectral Problems for Schrödinger Operators

Hamid Hezari

Department of Mathematics, Johns Hopkins University, Baltimore, MD 21218, USA. E-mail: [email protected]

Received: 29 May 2008 / Accepted: 17 October 2008
Published online: 13 January 2009 – © Springer-Verlag 2009

Abstract: In this article we find some explicit formulas for the semi-classical wave invariants at the bottom of the well of a Schrödinger operator. As an application of these new formulas for the wave invariants, we improve the inverse spectral results proved by Guillemin and Uribe in [GU]. They proved that under some symmetry assumptions on the potential $V(x)$, the Taylor expansion of $V(x)$ near a non-degenerate global minimum can be recovered from the knowledge of the low-lying eigenvalues of the associated Schrödinger operator in $\mathbb{R}^n$. We prove similar inverse spectral results using fewer symmetry assumptions. We also show that in dimension 1, no symmetry assumption is needed to recover the Taylor coefficients of $V(x)$.

1. Introduction and Statement of Results

In this article we study the inverse spectral problems for the semi-classical Schrödinger operator
$$\hat P = -\frac{\hbar^2}{2}\Delta + V(x) \quad \text{on } L^2(\mathbb{R}^n), \tag{1}$$
associated to the Hamiltonian
$$P(x,\xi) = \frac12\,\xi^2 + V(x).$$
Here the potential $V(x)$ in (1) satisfies
$$\begin{cases} V(x) \in C^\infty(\mathbb{R}^n),\\ V(x) \text{ has a unique non-degenerate global minimum at } x = 0 \text{ and } V(0) = 0,\\ \text{for some } \rho > 0,\ V^{-1}[0,\rho] \text{ is compact.} \end{cases} \tag{2}$$


Under these conditions, for sufficiently small $\hbar$, say $\hbar \in (0, h_0)$, and sufficiently small $\delta$, a classical fact tells us that the spectrum of $\hat P$ in the energy interval $[0,\delta]$ is finite. We denote these eigenvalues by $\{E_j(\hbar)\}_{j=0}^m$ and call them the low-lying eigenvalues of $\hat P$. Weyl's law reads
$$m = N(\delta) = \#\{j;\ 0 \le E_j(\hbar) \le \delta\} = \frac{1}{(2\pi\hbar)^n}\Big(\int_{\frac12\xi^2 + V(x) \le \delta} dx\,d\xi + o(1)\Big). \tag{3}$$
Recently in [GU], Guillemin and Uribe raised the question whether we can recover the Taylor coefficients of $V$ at $x = 0$ from the low-lying eigenvalues $E_j(\hbar)$. They also established that if we assume some symmetry conditions on $V$, namely $V(x) = f(x_1^2,\dots,x_n^2)$, then the 1-parameter family of low-lying eigenvalues, $\{E_j(\hbar) \mid \hbar \in (0,h_0)\}$, determines the Taylor coefficients of $V$ at $x = 0$. In this article we will attempt to recover as much of $V$ as possible from the family $E_j(\hbar)$, by establishing some new formulas for the wave invariants at the bottom of the potential (Theorem 1.1). Using these new expressions for the wave invariants, in Theorem 1.2 we improve the inverse spectral results of [GU] for a larger class of potentials.

A classical approach in studying this problem is to examine the asymptotic behavior as $\hbar \to 0$ of the truncated trace
$$\mathrm{Tr}\big(\Theta(\hat P)\,e^{-\frac{it}{\hbar}\hat P}\big), \tag{4}$$
where $\Theta \in C_0^\infty([0,\infty))$ is supported in $I = [0,\delta]$ and equals one in a neighborhood of 0. The asymptotic behavior of the truncated trace around the equilibrium point $(x,\xi) = (0,0)$ has been extensively studied in the literature. It is known (see for example [BPU]) that for $t$ in a sufficiently small interval $(0,t_0)$, $\mathrm{Tr}(\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P})$ has an asymptotic expansion of the following form:
$$\mathrm{Tr}\big(\Theta(\hat P)\,e^{-\frac{it}{\hbar}\hat P}\big) \sim \sum_{j=0}^{\infty} a_j(t)\,\hbar^j, \qquad \hbar \to 0. \tag{5}$$
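As a sanity check on Weyl's law (3), the following minimal sketch treats the one-dimensional harmonic oscillator $V(x) = \frac12\omega^2x^2$, whose eigenvalues $\hbar\omega(m+\frac12)$ are known exactly. The function names and the sample parameters are illustrative assumptions, not part of the paper; the point is only that the exact count and the phase-space volume divided by $2\pi\hbar$ differ by a bounded amount as $\hbar$ shrinks.

```python
import math

def count_low_lying(omega: float, hbar: float, delta: float) -> int:
    """Number of oscillator levels hbar*omega*(m + 1/2), m = 0, 1, ..., in [0, delta]."""
    # m + 1/2 <= delta / (hbar * omega)  <=>  m <= delta / (hbar * omega) - 1/2
    top = math.floor(delta / (hbar * omega) - 0.5)
    return max(top + 1, 0)

def weyl_prediction(omega: float, hbar: float, delta: float) -> float:
    """Leading Weyl term: area of {xi^2/2 + omega^2 x^2/2 <= delta} divided by 2*pi*hbar."""
    # The region is an ellipse with semi-axes sqrt(2*delta) and sqrt(2*delta)/omega.
    area = 2.0 * math.pi * delta / omega
    return area / (2.0 * math.pi * hbar)

if __name__ == "__main__":
    for hbar in (1e-1, 1e-2, 1e-3):
        print(hbar, count_low_lying(1.3, hbar, 0.2), weyl_prediction(1.3, hbar, 0.2))
```

In this example the discrepancy stays below one level, which is consistent with the $o(1)$ term in (3) once it is multiplied by $(2\pi\hbar)^{-1}$.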

Throughout this paper when we refer to wave invariants at the bottom of the well, we mean the coefficients $a_j(t)$ in (5). By applying an orthogonal change of variable, we can assume that $V$ is of the form
$$V(x) = \frac12\sum_{k=1}^n \omega_k^2x_k^2 + W(x), \qquad \omega_k > 0, \qquad W(x) = O(|x|^3),\quad |x| \to 0. \tag{6}$$
In addition to conditions in (2), we also assume that $\{\omega_k\}$ are linearly independent over $\mathbb{Q}$. Our first result finds explicit formulas for the wave invariants.


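Theorem 1.1. There exists $t_0$ such that for $0 < t < t_0$:

1. $$a_0(t) = \mathrm{Tr}\big(e^{-\frac{it}{\hbar}\hat H_0}\big) = \prod_{k=1}^n \frac{1}{2i\sin\frac{\omega_kt}{2}}, \qquad\text{where } \hat H_0 = -\frac12\hbar^2\Delta + \frac12\sum_{k=1}^n\omega_k^2x_k^2.$$

2. For $j \ge 1$, the wave invariants $a_j(t)$ defined in (5) are given by
$$a_j(t) = a_0(t)\sum_{l=1}^{2j} i^{\,l(n-1)+\frac n2}\,e^{\frac{i\pi}{4}\mathrm{sgn}H_l}\int_0^t\int_0^{s_1}\cdots\int_0^{s_{l-1}} P_{l+j}b_l(0)\,ds_l\cdots ds_1, \tag{7}$$
where for every $m$,
$$P_mb_l(0) = \frac{i^{-m}}{2^m\,m!}\,\langle H_l^{-1}\nabla,\nabla\rangle^m(b_l)(0),$$
$$b_l = \prod_{i=1}^l W\Big(\frac{\cos\omega_ks_i}{2}\,(z_{i+1}^k + z_i^k) - \frac{\sin\omega_ks_i}{\omega_k}\,\xi_i^k + \frac{\sin\omega_k(t-s_i) + \sin\omega_ks_i}{\sin\omega_kt}\,x^k\Big), \qquad (z_{l+1} := 0),$$
and $H_l^{-1}$ is the inverse matrix of the Hessian $H_l = \mathrm{Hess}\,\Psi_l(0)$, where
$$\Psi_l = \Psi_l(t,x,z_1,\dots,z_l,\xi_1,\dots,\xi_l) = \sum_{k=1}^n\Big\{\big(-\omega_k\tan\tfrac{\omega_k}{2}t\big)x_k^2 + \big(\tfrac{\omega_k}{2}\cot\omega_kt\big)(z_1^k)^2 + \sum_{i=1}^l (z_{i+1}^k - z_i^k)\,\xi_i^k\Big\}.$$
The Hessian of $\Psi_l$ is calculated with respect to every variable except $t$. Therefore the entries of the matrix $H_l^{-1}$ are functions of $t$. The matrix $H_l^{-1}$ is shown in (32).

3. The wave invariant $a_j(t)$ is a polynomial of degree $2j$ of the Taylor coefficients of $V$. The Taylor coefficients of highest order appearing in $a_j(t)$ are of order $2j+2$. In fact these highest order Taylor coefficients appear in the linear term of the polynomial and
$$a_j(t) = \frac{a_0(t)\,t}{(2i)^{j+1}}\sum_{|\vec\alpha| = j+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega}{2}t\Big)^{\vec\alpha}D^{2j+2}_{2\vec\alpha}V(0) + \{\text{a polynomial of Taylor coefficients of order} \le 2j+1\}. \tag{8}$$

Notice that in (8), we have used the standard shorthand notations for multi-indices, i.e. $\vec\alpha = (\alpha_1,\dots,\alpha_n)$, $\vec\omega = (\omega_1,\dots,\omega_n)$, $|\vec\alpha| = \alpha_1 + \dots + \alpha_n$, $\vec\alpha! = \alpha_1!\cdots\alpha_n!$, $\vec X^{\vec\alpha} = X_1^{\alpha_1}\cdots X_n^{\alpha_n}$, and $D^m_{\vec\alpha} = \frac{\partial^m}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}$ with $m = |\vec\alpha|$.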
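The closed form for $a_0(t)$ in part 1 of Theorem 1.1 can be checked numerically from the oscillator spectrum. The sketch below is only an illustration under stated assumptions: it uses the exact levels $\hbar\omega_k(m+\frac12)$ of each one-dimensional oscillator, replaces $t$ by $t - i\tau$ with a small $\tau > 0$ so that the sum over levels converges, and compares with $\prod_k (2i\sin\frac{\omega_k s}{2})^{-1}$ at $s = t - i\tau$. The function names and sample frequencies are hypothetical.

```python
import cmath

def trace_by_levels(omega: float, t: float, tau: float, terms: int = 4000) -> complex:
    """Sum of exp(-i (t - i tau) omega (m + 1/2)) over m >= 0 (hbar cancels out)."""
    s = complex(t, -tau)  # complex time t - i*tau; tau > 0 gives exponential decay in m
    return sum(cmath.exp(-1j * s * omega * (m + 0.5)) for m in range(terms))

def a0_closed_form(omegas, s: complex) -> complex:
    """Product over k of 1 / (2i sin(omega_k s / 2))."""
    out = 1.0 + 0.0j
    for w in omegas:
        out *= 1.0 / (2j * cmath.sin(w * s / 2))
    return out

def a0_from_spectrum(omegas, t: float, tau: float) -> complex:
    """The n-dimensional trace factorizes into one-dimensional traces."""
    out = 1.0 + 0.0j
    for w in omegas:
        out *= trace_by_levels(w, t, tau)
    return out

if __name__ == "__main__":
    omegas = (1.0, 2.0 ** 0.5)
    print(a0_from_spectrum(omegas, 0.7, 0.05))
    print(a0_closed_form(omegas, complex(0.7, -0.05)))
```

The two printed values agree to rounding error, which is just the geometric series $\sum_m q^{m+1/2} = (q^{-1/2} - q^{1/2})^{-1}$ with $q = e^{-i\omega s}$.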

Our second result improves the result of Guillemin and Uribe in [GU]. This theorem is actually a non-trivial corollary of Theorem 1.1.

Theorem 1.2. Let $V$ satisfy (2), (6), and be of the form
$$V(x) = f(x_1^2,\dots,x_n^2) + x_n^3\,g(x_1^2,\dots,x_n^2), \tag{9}$$
for some $f, g \in C^\infty(\mathbb{R}^n)$. Then the low-lying eigenvalues of $\hat P = -\frac12\hbar^2\Delta + V$ determine $D^{|\vec\alpha|}_{\vec\alpha}V(0)$, $|\vec\alpha| = 2, 3$, and if $D^3_{3\vec e_n}V(0) := \frac{\partial^3V}{\partial x_n^3}(0) \neq 0$, they determine all the Taylor coefficients of $V$ at $x = 0$.

One quick consequence of Theorem 1.2 is the following:


Corollary 1.3. If $n = 1$, and $V \in C^\infty(\mathbb{R})$ satisfies (2), then (with no symmetry assumptions) the low-lying eigenvalues determine $V''(0)$ and $V^{(3)}(0)$, and if $V^{(3)}(0) \neq 0$, then these eigenvalues determine all the Taylor coefficients of $V$ at $x = 0$.

Let us briefly sketch our main ideas for the proofs. First, because of a technical reason which arises in the proofs, we will need to replace the Hamiltonian $P$ by the following Hamiltonian $H$:
$$\begin{cases} H(x,\xi) = \frac12\xi^2 + \mathcal{V}(x),\\[2pt] \mathcal{V}(x) = \frac12\sum_{k=1}^n\omega_k^2x_k^2 + \mathcal{W}(x),\\[2pt] \mathcal{W}(x) = \chi\Big(\dfrac{x}{\hbar^{\frac12-\varepsilon}}\Big)W(x), \qquad \varepsilon > 0 \text{ sufficiently small}, \end{cases}$$
where the cut off $\chi \in C_0^\infty(\mathbb{R}^n)$ is supported in the unit ball $B_1(0)$ and equals one in $B_{\frac12}(0)$. Then in two lemmas (Lemma 2.1 and Lemma 2.2) we show that for $t$ in a sufficiently small interval $(0,t_0)$, in the sense of tempered distributions we have
$$\mathrm{Tr}\big(\Theta(\hat P)\,e^{-\frac{it}{\hbar}\hat P}\big) = \mathrm{Tr}\big(e^{-\frac{it}{\hbar}\hat H}\big) + O(\hbar^\infty).$$
This reduces the problem to studying the asymptotic of $\mathrm{Tr}(e^{-\frac{it}{\hbar}\hat H})$. For this we use the construction of the kernel $k(t,x,y)$ of the propagator $U(t) = e^{-\frac{it}{\hbar}\hat H}$ found in [Z]. We find that
$$k(t,x,y) = C(t)\,e^{\frac{i}{\hbar}S(t,x,y)}\sum_{l=0}^\infty a_l(t,\hbar,x,y), \tag{10}$$

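where
$$S(t,x,y) = \sum_{k=1}^n\frac{\omega_k}{\sin\omega_kt}\Big(\frac12(\cos\omega_kt)(x_k^2 + y_k^2) - x_ky_k\Big), \tag{11}$$
$a_0 = 1$, and for $l \ge 1$,
$$a_l(t,\hbar,x,y) = \Big(\frac{-1}{2\pi}\Big)^{ln}\Big(\frac{1}{i\hbar}\Big)^{l(n+1)}\int_0^t\cdots\int_0^{s_{l-1}}\underbrace{\int\cdots\int}_{2l} e^{\frac{i}{\hbar}\Phi_l}\,b_l(s,x,y,z,\xi)\,d^lz\,d^l\xi\,d^ls,$$
where
$$\Phi_l = \sum_{k=1}^n\Big\{\frac{\omega_k}{2}\cot\omega_kt\,(z_1^k)^2 + \sum_{i=1}^l (z_{i+1}^k - z_i^k)\,\xi_i^k\Big\},$$
and
$$b_l = \prod_{i=1}^l\mathcal{W}\Big(\frac{\cos\omega_ks_i}{2}\,(z_{i+1}^k + z_i^k) - \frac{\sin\omega_ks_i}{\omega_k}\,\xi_i^k + \frac{\sin\omega_k(t-s_i)}{\sin\omega_kt}\,y^k + \frac{\sin\omega_ks_i}{\sin\omega_kt}\,x^k\Big).$$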


Next we apply the expression in (10) for $k(t,x,y)$ to the formula $\mathrm{Tr}(e^{-\frac{it}{\hbar}\hat H}) = \int k(t,x,x)\,dx$. Then we obtain an infinite series of oscillatory integrals, each one corresponding to one $a_l$. Finally we apply the method of stationary phase to each oscillatory integral and we show that the resulting series is a valid asymptotic expansion. From the resulting asymptotic expansion we obtain the formulas (7).

Now let us compare our approach for the construction of $k(t,x,y)$ with the classical approach. In the classical approach (see for instance [DSj,D,R,BPU and U]), one constructs a WKB approximation for the kernel $k_P(t,x,y)$ of the operator $\Theta(\hat P)e^{-\frac{it}{\hbar}\hat P}$, i.e.
$$k_P(t,x,y) = \int e^{\frac{i}{\hbar}(\varphi_P(t,x,\eta) - y\cdot\eta)}\,b_P(t,x,y,\eta,\hbar)\,d\eta, \tag{12}$$
where $\varphi_P(t,x,\eta)$ satisfies the Hamilton-Jacobi equation (or eikonal equation in geometrical optics)
$$\partial_t\varphi_P(t,x,\eta) + P(x,\partial_x\varphi_P(t,x,\eta)) = 0, \qquad \varphi_P|_{t=0} = x\cdot\eta,$$
and the function $b_P$ has an asymptotic expansion of the form
$$b_P(t,x,y,\eta,\hbar) \sim \sum_{j=0}^\infty b_{P,j}(t,x,y,\eta)\,\hbar^j.$$

The functions b P, j (t, x, y, η) are calculated from the so called transport equations. See for example [R,DSj,EZ] or Appendix A of the paper in hand for the details of the above construction. In this setting, when one integrates the kernel k P (t, x, y) on the diagonal and applies the stationary phase lemma to the given oscillatory integral, one obtains very complicated expressions for the wave invariants. Of course the classical calculations above show the existence of asymptotic formulas of the form (5) (which can be used to get Weyl-type estimates for the counting functions of the eigenvalues, see for example [BPU]). Unfortunately these formulas for the wave invariants are not helpful when trying to establish some inverse spectral results. Hence, one should look for more efficient methods to calculate the wave invariants a j (t). One approach is to use the semi-classical Birkhoff normal forms, which was used in the papers [Sj] and [ISjZ] and [GU]. The Birkhoff normal forms methods were also used by S. Zelditch in [Z4] to obtain positive inverse spectral results for real analytic domains with symmetries of an ellipse. Zelditch proved that for a real analytic plane domain with symmetries of an ellipse, the wave invariants at a bouncing ball orbit, which is preserved by the symmetries, determine the real analytic domain under isometries of the domain. Recently in [Z3], Zelditch improved his earlier result to the real analytic domains with only one mirror symmetry. His approach for this new result was different. He used a direct approach (Balian-Bloch trace formula) which involves Feynman-diagrammatic calculations of the stationary phase method to obtain a more explicit formula for the wave invariants at the bouncing ball orbit. Motivated by the work of Zelditch [Z3] mentioned above, our approach in this article is also somehow direct and involves combinatorial calculations of the stationary phase. 
Our formula in (10) for the kernel of the propagator, $U(t) = e^{-\frac{it}{\hbar}\hat H}$, is different from the WKB-expression in the sense that we only keep the quadratic part of the phase function, namely the phase function $S(t,x,y)$ in (11) of the propagator of the anisotropic oscillator, and we put the rest in the amplitude $\sum_{l=0}^\infty a_l(t,\hbar,x,y)$ in (10). The details of this construction are mentioned in Sect. 2.2.

Remark 1.4. After the initial posting of this article, Guillemin and Colin de Verdière posted two articles (see [CG1], also [C]) in which they study inverse spectral problems of 1 dimensional semi-classical Schrödinger operators. One of the main results in [CG1] is our Corollary 1.3 in this paper.

2. Proofs of the Results

2.1. Two reductions. Because of some technical issues arising in the proof of Theorem 1.1, we will need to use the following two lemmas as reductions. In the following, we let $\chi \in C_0^\infty(\mathbb{R}^n)$ be a cut off which is supported in the unit ball $B_1(0)$ and equals one in $B_{\frac12}(0)$.

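Lemma 2.1. Let the Hamiltonians $P$ and $H$ be defined by
$$\begin{cases} P(x,\xi) = \frac12\xi^2 + V(x),\\[2pt] V(x) = \frac12\sum_{k=1}^n\omega_k^2x_k^2 + W(x),\\[2pt] W(x) = O(|x|^3), \text{ as } x \to 0, \end{cases} \qquad \begin{cases} H(x,\xi) = \frac12\xi^2 + \mathcal{V}(x),\\[2pt] \mathcal{V}(x) = \frac12\sum_{k=1}^n\omega_k^2x_k^2 + \mathcal{W}(x),\\[2pt] \mathcal{W}(x) = \chi\Big(\dfrac{x}{\hbar^{\frac12-\varepsilon}}\Big)W(x), \quad \varepsilon > 0 \text{ sufficiently small}, \end{cases} \tag{13}$$
and let $\hat P$ and $\hat H$ be the corresponding Weyl (or standard) quantizations. Then for $t$ in a sufficiently small interval $(0,t_0)$,
$$\mathrm{Tr}\big(\Theta(\hat P)\,e^{-\frac{it}{\hbar}\hat P}\big) = \mathrm{Tr}\big(\Theta(\hat H)\,e^{-\frac{it}{\hbar}\hat H}\big) + O(\hbar^\infty).$$
In other words, the wave invariants $a_j(t)$ will not change if we replace $P$ by $H$.

Proof. The proof is given in Appendix A.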

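Next we use the following lemma to get rid of $\Theta(\hat H)$.

Lemma 2.2. Let $H$ be defined by (13). Then in the sense of tempered distributions
$$\mathrm{Tr}\big(\Theta(\hat H)\,e^{-\frac{it}{\hbar}\hat H}\big) = \mathrm{Tr}\big(e^{-\frac{it}{\hbar}\hat H}\big) + O(\hbar^\infty).$$
This means that if we sort the spectrum of $\hat H$ as $E_1(\hbar) < E_2(\hbar) \le \dots \le E_j(\hbar) \to +\infty$, then for every Schwartz function $\varphi(t) \in \mathcal{S}(\mathbb{R})$,
$$\Big\langle \mathrm{Tr}\big(e^{-\frac{it}{\hbar}\hat H}\big) - \mathrm{Tr}\big(\Theta(\hat H)\,e^{-\frac{it}{\hbar}\hat H}\big),\ \varphi(t)\Big\rangle = \sum_{j=1}^\infty\big(1 - \Theta(E_j(\hbar))\big)\,\hat\varphi\Big(\frac{E_j(\hbar)}{\hbar}\Big) = O(\hbar^\infty).$$

Proof. The proof is given in Appendix B.

Because of the above lemmas, it is enough to study the asymptotic of $\mathrm{Tr}(e^{-\frac{it}{\hbar}\hat H})$.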


2.2. Construction of $k(t,x,y)$, the kernel of $U(t) = e^{-\frac{it}{\hbar}\hat H}$. In this section we follow the construction in [Z] to obtain an oscillatory integral representation of $k(t,x,y)$, the kernel of the propagator $e^{-\frac{it}{\hbar}\hat H}$. The reader should consult [Z] for many details. In that article Zelditch uses Dyson's expansion of the propagator to study the singularities of the kernel $k(t,x,y)$. But he does not consider the semi-classical setting $\hbar \to 0$ in his calculations (i.e. $\hbar = 1$). So we follow the same calculations but also consider $\hbar$ carefully. The following important proposition gives a new semi-classical approximation to the propagator $U(t)$ near the bottom of the well. We will use $B(X)$ for the bounded functions on $X$ with bounded derivatives.

Proposition 2.3. Let $k(t,x,y)$ be the Schwartz kernel of the propagator $U(t) = e^{-\frac{it}{\hbar}\hat H}$. Then

(A) We have
$$k(t,x,y) = \prod_{k=1}^n\Big(\frac{\omega_k}{2\pi i\hbar\sin\omega_kt}\Big)^{\frac12}e^{\frac{i}{\hbar}S(t,x,y)}\sum_{l=0}^\infty a_l(t,\hbar,x,y), \tag{14}$$
where
$$S(t,x,y) = \sum_{k=1}^n\frac{\omega_k}{\sin\omega_kt}\Big(\frac12(\cos\omega_kt)(x_k^2 + y_k^2) - x_ky_k\Big).$$
Also $a_0 = 1$ and for $l \ge 1$,
$$a_l(t,\hbar,x,y) = \Big(\frac{-1}{2\pi}\Big)^{ln}\Big(\frac{1}{i\hbar}\Big)^{l(n+1)}\int_0^t\cdots\int_0^{s_{l-1}}\underbrace{\int\cdots\int}_{2l} e^{\frac{i}{\hbar}\Phi_l}\,b_l(s,x,y,z,\xi)\,d^lz\,d^l\xi\,d^ls, \tag{15}$$
where
$$\Phi_l = \sum_{k=1}^n\Big\{\frac{\omega_k}{2}\cot\omega_kt\,(z_1^k)^2 + \sum_{i=1}^l (z_{i+1}^k - z_i^k)\,\xi_i^k\Big\}, \tag{16}$$
and
$$b_l = \prod_{i=1}^l\mathcal{W}\Big(\frac{\cos\omega_ks_i}{2}\,(z_{i+1}^k + z_i^k) - \frac{\sin\omega_ks_i}{\omega_k}\,\xi_i^k + \frac{\sin\omega_k(t-s_i)}{\sin\omega_kt}\,y^k + \frac{\sin\omega_ks_i}{\sin\omega_kt}\,x^k\Big), \qquad (z_{l+1} := 0). \tag{17}$$

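(B) We have $a_l \in B(\mathbb{R}^n_x\times\mathbb{R}^n_y)$. In fact for every $\alpha$ and $\beta$, there exists $k_0 = k_0(\alpha,\beta)$ such that for every $0 < \hbar \le h_0 \le 1$,
$$|\partial_x^\alpha\partial_y^\beta a_l(t,\hbar,x,y)| \le \frac{C_{\alpha,\beta,n}(t)^l\,\|W_\hbar\|^l_{|\alpha|+|\beta|+k_0}}{l!}\,\hbar^{\,l(\frac12-3\varepsilon)-\frac12(|\alpha|+|\beta|)}, \tag{18}$$
where
$$W_\hbar(x) = \frac{\mathcal{W}(\hbar^{\frac12}x)}{\hbar^{3(\frac12-\varepsilon)}} = \chi(\hbar^\varepsilon x)\,\frac{W(\hbar^{\frac12}x)}{\hbar^{3(\frac12-\varepsilon)}} \tag{19}$$
and $W_\hbar$ is uniformly in $B(\mathbb{R}^n_x)$; i.e. $W_\hbar$ is bounded with bounded derivatives and the bounds are independent of $\hbar$. Hence the sum $a(t,\hbar,x,y) = \sum_{l=0}^\infty a_l(t,\hbar,x,y)$ in (14) is uniformly convergent in $B(\mathbb{R}^n_x\times\mathbb{R}^n_y)$. In fact
$$|\partial_x^\alpha\partial_y^\beta a(t,\hbar,x,y)| \le \exp\Big(\hbar^{\frac12-3\varepsilon}\,C_{\alpha,\beta,n}(t)\,\|W_\hbar\|_{|\alpha|+|\beta|+k_0}\Big)\,\hbar^{-\frac12(|\alpha|+|\beta|)}.$$

Proof. Following [Z], we denote
$$\begin{cases}\hat H_0 = -\frac12\hbar^2\Delta + \frac12\sum_{k=1}^n\omega_k^2x_k^2 & \text{(anisotropic oscillator)},\\[2pt] \hat H = \hat H_0 + \mathcal{W}(x) = -\frac12\hbar^2\Delta + \mathcal{V}(x),\end{cases}$$
and by $U_0(t) = e^{-\frac{it}{\hbar}\hat H_0}$ and $U(t) = e^{-\frac{it}{\hbar}\hat H}$ we mean their corresponding propagators. From
$$(i\hbar\partial_t - \hat H_0)\,U(t) = \mathcal{W}\cdot U(t),$$
we obtain
$$U(t) = U_0(t) + \frac{1}{i\hbar}\int_0^tU_0(t-s)\cdot\mathcal{W}\cdot U(s)\,ds.$$
By iteration we get the norm convergent Dyson expansion:
$$U(t) = U_0(t) + \sum_{l=1}^\infty\frac{1}{(i\hbar)^l}\int_0^t\cdots\int_0^{s_{l-1}}U_0(t)\,[U_0(s_1)^{-1}\cdot\mathcal{W}\cdot U_0(s_1)]\cdots[U_0(s_l)^{-1}\cdot\mathcal{W}\cdot U_0(s_l)]\,ds_l\cdots ds_1. \tag{20}$$
It is well-known that for $t \neq \frac{m\pi}{\omega_k}$, the kernel of $U_0(t)$ is given by
$$k_0(t,x,y) = \prod_{k=1}^n\Big(\frac{\omega_k}{2\pi i\hbar\sin\omega_kt}\Big)^{\frac12}e^{\frac{i}{\hbar}S(t,x,y)}, \tag{21}$$
where
$$S(t,x,y) = \sum_{k=1}^n\frac{\omega_k}{\sin\omega_kt}\Big(\frac12(\cos\omega_kt)(x_k^2 + y_k^2) - x_ky_k\Big).$$
Then by taking kernels in (20) and after some change of variables (see [Z], pp. 8–9 and 18–19), we get (14). This finishes the proof of part (A) of the proposition.

Before proving part (B), let us mention a useful estimate from [Z]. The setting in [Z] is a non-semiclassical one, i.e. $\hbar = 1$. In [Z] on pp. 17–18 the following estimate (for $\hbar = 1$) is proved using integration by parts: there exists a positive integer $k_0 = k_0(\alpha,\beta,n)$ and a continuous function $C_{\alpha,\beta,n}(t)$ such that
$$|\partial_x^\alpha\partial_y^\beta a_l(t,1,x,y)| \le \frac{1}{l!}\,C_{\alpha,\beta,n}(t)^l\,\|W_1\|^l_{|\alpha|+|\beta|+k_0}, \qquad (W_1 = W_\hbar|_{\hbar=1}). \tag{22}$$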

The estimates (22) will change if one considers $\hbar$ in the calculations. This would be part (B) of the proposition. Let us prove this, namely the estimate (18). First, in (15), we apply the change of variables $x \to \hbar^{\frac12}x$, $y \to \hbar^{\frac12}y$, $z \to \hbar^{\frac12}z$ and $\xi \to \hbar^{\frac12}\xi$. This gives us $\hbar^{ln}$ in front of the integral. Then we replace $\mathcal{W}(\hbar^{\frac12}(\cdot))$ by $\hbar^{3(\frac12-\varepsilon)}W_\hbar(\cdot)$. After collecting all the powers of $\hbar$ in front of the integral we obtain
$$a_l(t,\hbar,\hbar^{\frac12}x,\hbar^{\frac12}y) = \Big(\frac{-1}{2\pi}\Big)^{ln}\Big(\frac1i\Big)^{l(n+1)}\hbar^{\,l(\frac12-3\varepsilon)}\int_0^t\cdots\int_0^{s_{l-1}}\underbrace{\int\cdots\int}_{2l} e^{i\Phi_l}\,b_l(s,x,y,z,\xi)\,d^lz\,d^l\xi\,d^ls,$$
where $b_l$ is now formed with $W_\hbar$ in place of $\mathcal{W}$:
$$b_l(s,x,y,z,\xi) = \prod_{i=1}^lW_\hbar\Big(\frac{\cos\omega_ks_i}{2}\,(z_{i+1}^k + z_i^k) - \frac{\sin\omega_ks_i}{\omega_k}\,\xi_i^k + \frac{\sin\omega_k(t-s_i)}{\sin\omega_kt}\,y^k + \frac{\sin\omega_ks_i}{\sin\omega_kt}\,x^k\Big).$$
Next we apply (22) to the above integral with $W_1$ replaced by $W_\hbar$, and we get (18).

To finish the proof we have to show that for every positive integer $m$ we can find uniform bounds (i.e. independent of $\hbar$) for the $m$-th derivatives of the function $W_\hbar(x)$. Since $\chi(x)$ is supported in the unit ball, from the definition (19) we see that $W_\hbar$ is supported in $|x| < \hbar^{-\varepsilon}$. So from (19) it is enough to find uniform bounds in $\hbar$ for the $m$-th derivatives of the function
$$\frac{W(\hbar^{\frac12}x)}{\hbar^{3(\frac12-\varepsilon)}}$$
in the ball $|x| < \hbar^{-\varepsilon}$. This is very clear for $m \ge 3$. For $m < 3$, we use the order of vanishing of $W(x)$ at $x = 0$. Since $W(x) = O(|x|^3)$ near $x = 0$, the order of vanishing of $W$ at $x = 0$ is 3. Therefore in the ball $|x| < \hbar^{-\varepsilon}$, the functions
$$\frac{W(\hbar^{\frac12}x)}{(\hbar^{\frac12}x)^3}, \qquad \frac{(\partial^\alpha W)(\hbar^{\frac12}x)}{(\hbar^{\frac12}x)^2}, \qquad \frac{(\partial^\alpha\partial^\beta W)(\hbar^{\frac12}x)}{\hbar^{\frac12}x},$$
are bounded functions with uniform bounds in $\hbar$, and the statement follows easily for $m < 3$.

2.3. Trace of $U(t)$. In this section we show that the integral $\mathrm{Tr}\,U(t) = \int k(t,x,x)\,dx$ is convergent as an oscillatory integral, and using (14) we express $\mathrm{Tr}\,U(t)$ as an infinite sum of oscillatory integrals with an appropriate $\hbar$-estimate for the remainder term.

First of all we review some standard facts. We know that the sum $\mathrm{Tr}\,U(t) = \sum e^{-\frac{itE_j(\hbar)}{\hbar}}$ is convergent in the sense of tempered distributions, i.e. $\mathrm{Tr}\,U(t) \in \mathcal{S}'(\mathbb{R})$. This can be shown by Weyl's law in its high energy setting, which implies that for potentials of the form $\mathcal{V}(x) = \frac12\sum_{k=1}^n\omega_k^2x_k^2 + \mathcal{W}(x)$, with $\mathcal{W} \in B(\mathbb{R}^n)$, for fixed $\hbar$, the $j$-th eigenvalue $E_j(\hbar)$ satisfies
$$E_j(\hbar) \sim C(n,\hbar)\,j^{\frac2n}, \qquad j \to \infty. \tag{23}$$
Another way to define $\mathrm{Tr}\,U(t)$ is to write it as the limit
$$\mathrm{Tr}\,U(t) = \lim_{\tau\to0^+}\mathrm{Tr}\,U(t - i\tau) = \lim_{\tau\to0^+}\sum e^{-\frac{(it+\tau)E_j(\hbar)}{\hbar}}. \tag{24}$$
This time Weyl's law (23) implies that the sum $\mathrm{Tr}\,U(t - i\tau)$ is absolutely uniformly convergent because of the rapidly decaying factor $e^{-\frac{\tau E_j(\hbar)}{\hbar}}$. As a result, $U(t - i\tau)$ is a trace class operator. It is clear that the kernel of $U(t - i\tau)$ is $k(t - i\tau,x,y)$, the analytic continuation of the kernel $k(t,x,y)$ of $U(t)$. Clearly $k(t - i\tau,x,y)$ is continuous in $x$ and $y$. So we can write $\mathrm{Tr}\,U(t - i\tau) = \int k(t - i\tau,x,x)\,dx$. We notice that this integral is uniformly convergent. This is because up to a constant this integral equals
$$\int e^{\frac{i}{\hbar}S(t-i\tau,x,x)}\,a(t - i\tau,\hbar,x,x)\,dx,$$
and the exponential factor in the integral is rapidly decaying for $\tau > 0$ as $|x| \to \infty$ and $a$ is a bounded function. More precisely
$$\Re\big(iS(t - i\tau,x,x)\big) = \sum_{k=1}^n\Re\Big(-i\omega_k\tan\Big(\frac{\omega_k(t - i\tau)}{2}\Big)\Big)x_k^2 = \sum_{k=1}^n\frac{\omega_k(1 - e^{2\tau\omega_k})}{|1 + e^{\omega_k(it+\tau)}|^2}\,x_k^2,$$
and
$$\frac{\omega_k(1 - e^{2\tau\omega_k})}{|1 + e^{\omega_k(it+\tau)}|^2} < 0.$$
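The closed form for the real part of $iS(t - i\tau,x,x)$ can be confirmed numerically for a single mode. The sketch below assumes $S(t,x,x) = -\omega\tan(\frac{\omega t}{2})x^2$ for one frequency (which is what (11) gives on the diagonal) and compares the direct evaluation at complex time with the closed-form coefficient; the function names and sample values are illustrative only.

```python
import cmath
import math

def re_iS_direct(omega: float, t: float, tau: float, x: float) -> float:
    """Real part of i*S(t - i*tau, x, x) with S = -omega * tan(omega * s / 2) * x^2."""
    s = complex(t, -tau)
    return (-1j * omega * cmath.tan(omega * s / 2) * x * x).real

def re_iS_formula(omega: float, t: float, tau: float, x: float) -> float:
    """Closed form omega * (1 - e^{2 tau omega}) / |1 + e^{omega (i t + tau)}|^2 * x^2."""
    e = cmath.exp(omega * complex(tau, t))  # e^{omega (tau + i t)}
    return omega * (1.0 - math.exp(2.0 * tau * omega)) / abs(1.0 + e) ** 2 * x * x

if __name__ == "__main__":
    print(re_iS_direct(1.7, 0.4, 0.3, 0.9), re_iS_formula(1.7, 0.4, 0.3, 0.9))
```

Both evaluations agree and are negative for $\tau > 0$, which is the Gaussian decay used to make the diagonal integral converge.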

The discussion above shows that the integral $\int k(t,x,x)\,dx$ can be defined by integrations by parts as follows. Since
$$\langle D_x\rangle^2e^{iS(t,x,x)} := (1 - \Delta)\,e^{iS(t,x,x)} = \Big(1 + \Big|2\vec\omega\tan\Big(\frac{\vec\omega t}{2}\Big)x\Big|^2 + 2i\sum_{k=1}^n\omega_k\tan\Big(\frac{\omega_kt}{2}\Big)\Big)e^{iS(t,x,x)},$$
we can write
$$\int e^{\frac{i}{\hbar}S(t,x,x)}\,a(t,\hbar,x,x)\,dx = \hbar^{\frac n2}\int e^{iS(t,x,x)}\,a(t,\hbar,\sqrt\hbar x,\sqrt\hbar x)\,dx$$
$$= \hbar^{\frac n2}\int e^{iS(t,x,x)}\Big(\langle D_x\rangle^2\Big(1 + \Big|2\vec\omega\tan\Big(\frac{\vec\omega t}{2}\Big)x\Big|^2 + 2i\sum_{k=1}^n\omega_k\tan\Big(\frac{\omega_kt}{2}\Big)\Big)^{-1}\Big)^{n_0}a(t,\hbar,\sqrt\hbar x,\sqrt\hbar x)\,dx. \tag{25}$$
If we assume $0 < t < \min_{1\le k\le n}\{\frac{\pi}{2\omega_k}\}$, then by choosing $n_0 > \frac n2$, and because $a(t,\hbar,x,y) \in B(\mathbb{R}^n_x\times\mathbb{R}^n_y)$, the integral becomes absolutely convergent. Finally, since by (18) the series $a(t,\hbar,x,y) = \sum_{l=0}^\infty a_l(t,\hbar,x,y)$ is absolutely uniformly convergent, we have
$$\int e^{\frac{i}{\hbar}S(t,x,x)}\,a(t,\hbar,x,x)\,dx = \sum_{l=0}^\infty\int e^{\frac{i}{\hbar}S(t,x,x)}\,a_l(t,\hbar,x,x)\,dx,$$
and therefore we obtain an infinite sum of oscillatory integrals. The next step is to apply the stationary phase method to each integral above and then add the asymptotic expansions to obtain an asymptotic expansion for $\mathrm{Tr}\,U(t)$. Because we have an infinite sum of asymptotic expansions, we have to establish that the resulting asymptotic for the trace is a valid approximation. Hence we have to find some appropriate $\hbar$-estimates for the remainder term of the series. For this we define
$$I_l(t,\hbar) = \hbar^{-\frac n2}\int e^{\frac{i}{\hbar}S(t,x,x)}\,a_l(t,\hbar,x,x)\,dx. \tag{26}$$
Hence by this notation, $\mathrm{Tr}\,U(t) = \prod_{k=1}^n\big(\frac{\omega_k}{2\pi i\sin\omega_kt}\big)^{\frac12}\sum_{l=0}^\infty I_l(t,\hbar)$. Now we have the following crucial proposition.

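Proposition 2.4. Let $0 < \varepsilon < \frac16$, and $I_l(t,\hbar)$ be defined by (26). Then for all $m \ge 1$,
$$\mathrm{Tr}\,U(t) = \prod_{k=1}^n\Big(\frac{\omega_k}{2\pi i\sin\omega_kt}\Big)^{\frac12}\sum_{l=0}^{m-1}I_l(t,\hbar) + O\big(\hbar^{m(\frac12-3\varepsilon)}\big). \tag{27}$$

Proof. If in (26) we integrate by parts as we did in (25), and choose $n_0 = [\frac n2] + 1$, then using (18) we get
$$|I_l(t,\hbar)| \le C_n(t)\,\frac{C_n(t)^l\,\|W_\hbar\|^l_{2n_0+k_0}}{l!}\,\hbar^{\,l(\frac12-3\varepsilon)},$$
where $C_n(t) = \max_{|\alpha|+|\beta|\le2n_0}\{C_{\alpha,\beta,n}(t)\}$. We choose $\varepsilon > 0$ such that $\frac12 - 3\varepsilon > 0$, or $\varepsilon < \frac16$. Now it is clear that for every positive integer $m$, and every $0 < \hbar \le h_0 \le 1$,
$$\Big|\sum_{l=m}^\infty I_l(t,\hbar)\Big| \le C_n(t)\,e^{\{C_n(t)\|W_\hbar\|_{2n_0+k_0}\}}\,\hbar^{m(\frac12-3\varepsilon)}. \tag{28}$$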
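The step from the termwise bound to the tail bound (28) rests on an elementary inequality: for $0 < \hbar \le 1$ and $\kappa = \frac12 - 3\varepsilon > 0$, one has $\sum_{l \ge m}\frac{c^l\hbar^{l\kappa}}{l!} \le e^c\,\hbar^{m\kappa}$, because $\hbar^{(l-m)\kappa} \le 1$ for $l \ge m$. The following sketch checks this numerically; the constant `c` stands in for $C_n(t)\|W_\hbar\|_{2n_0+k_0}$ and all names and sample values are assumptions made for illustration.

```python
import math

def tail_sum(c: float, hbar: float, kappa: float, m: int, terms: int = 200) -> float:
    """Partial evaluation of sum_{l >= m} (c * hbar^kappa)^l / l!."""
    x = c * hbar ** kappa
    term = x ** m / math.factorial(m)  # the l = m term
    total = 0.0
    for l in range(m, m + terms):
        total += term
        term *= x / (l + 1)            # pass from the l-th term to the (l+1)-th
    return total

def tail_bound(c: float, hbar: float, kappa: float, m: int) -> float:
    """Right-hand side e^c * hbar^(m * kappa)."""
    return math.exp(c) * hbar ** (m * kappa)

if __name__ == "__main__":
    kappa = 0.5 - 3 * 0.05
    for m in range(1, 6):
        print(m, tail_sum(3.0, 0.01, kappa, m), tail_bound(3.0, 0.01, kappa, m))
```

The bound is far from sharp, but it is uniform in $m$, which is all that the remainder estimate in (27) needs.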

Since by Proposition 2.3.B, sup0 […]
$$P_jb_l(x,z,\xi) = \frac{i^{-j}}{2^j\,j!}\,\langle H_l^{-1}\nabla,\nabla\rangle^jb_l(x,z,\xi) = \frac{i^{-j}}{2^j\,j!}\sum_{r_1,\dots,r_{2j}\in A_l}h_l^{r_1r_2}\cdots h_l^{r_{2j-1}r_{2j}}\,\frac{\partial^{2j}b_l(x,z,\xi)}{\partial r_1\cdots\partial r_{2j}}, \tag{35}$$
where in the sum (35) the indices $r_1,\dots,r_{2j}$ run in the set $A_l = \{x^k, z_1^k,\dots,z_l^k,\xi_1^k,\dots,\xi_l^k\}_{k=1}^n$, and $h_l^{rr'}$ with $r, r' \in A_l$ corresponds to the $(r,r')$-th entry of the inverse Hessian $H_l^{-1}$. We note that $P_jb_l(0) = 0$ if $2j < 3l$. This is true because of (17) and because $W(0) = 0$, $\nabla W(0) = 0$ and $\mathrm{Hess}\,W(0) = 0$. This implies, first, there are not any negative powers of $\hbar$ in the expansion (as we were expecting). Second, the constant term (i.e. the 0-th wave invariant), which corresponds to the term $l = j = 0$ in the sum, equals
$$a_0(t) = \mathrm{Tr}\,U_0(t) = \prod_{k=1}^n\Big(\frac{\omega_k}{\sin\omega_kt}\Big)^{\frac12}\,\frac{i^{-\frac n2}\,e^{-\frac{i\pi n}{4}}}{\prod_{k=1}^n\big(2\omega_k\tan\frac{\omega_kt}{2}\big)^{\frac12}} = \prod_{k=1}^n\frac{1}{2i\sin\frac{\omega_kt}{2}}.$$
And third (using (33)), for $j \ge 1$ the coefficient of $\hbar^j$ in (34) equals
$$a_j(t) = \Big(\prod_{k=1}^n\frac{1}{2i\sin\frac{\omega_kt}{2}}\Big)\sum_{l=1}^{2j}i^{\,l(n-1)+\frac n2}\,e^{\frac{i\pi}{4}\mathrm{sgn}H_l}\int_0^t\int_0^{s_1}\cdots\int_0^{s_{l-1}}P_{l+j}b_l(0)\,ds_l\cdots ds_1.$$
The sum goes only up to $2j$ because if $l > 2j$ then $2(l+j) < 3l$ and $P_{l+j}b_l(0) = 0$. This proves the first two parts of Theorem 1.1.

2.5. Calculations of the wave invariants and the proof of Theorem 1.1.3. In this section we try to calculate the wave invariants $a_j(t)$ from the formulas (7). First of all, let us investigate how the terms with highest order of derivatives appear in $a_j(t)$. Because $b_l$ is the product of $l$ copies of $W$ functions, and because we have to put at least 3 derivatives on each $W$ to obtain non-zero terms, the highest possible order of derivatives that can appear in $P_{j+l}b_l(0)$ is $2(j+l) - 3(l-1) = 2j - l + 3$. This implies that, because in the sum (7) we have $1 \le l \le 2j$, the highest order of derivatives in $a_j(t)$ is $2j+2$ and those derivatives are produced by the term corresponding to $l = 1$, i.e. $P_{j+1}b_1(0)$. The formula (7) also shows that $a_j(t)$ is a polynomial of degree $2j$. The term with the highest polynomial order is the one with $l = 2j$, i.e. $P_{3j}b_{2j}(0)$ (which has the lowest order of derivatives), and the term $P_{j+1}b_1(0)$ is the linear term of the polynomial.

Now let us calculate $P_{j+1}b_1(x,z,\xi)$ and prove Theorem 1.1.3. By (35),
$$P_{j+1}b_1 = \frac{i^{-(j+1)}}{2^{j+1}\,(j+1)!}\sum_{r_1,\dots,r_{2j+2}\in A_1}h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}}\,\frac{\partial^{2j+2}b_1}{\partial r_1\cdots\partial r_{2j+2}}, \tag{36}$$
where here by (17),
$$b_1 = \mathcal{W}\Big(\frac{\cos\omega_ks}{2}\,z^k - \frac{\sin\omega_ks}{\omega_k}\,\xi^k + \Big(\frac{\sin\omega_k(t-s) + \sin\omega_ks}{\sin\omega_kt}\Big)x^k\Big).$$
Also by (32),
$$H_1^{-1} = \begin{pmatrix} D\big(\frac{-1}{2\omega}\cot(\frac{\omega t}{2})\big) & 0 & 0\\ 0 & 0 & -I\\ 0 & -I & -D(\omega\cot\omega t)\end{pmatrix}.$$
Hence the only non-zero entries of $H_1^{-1}$ are the ones of the form $h_1^{x^kx^k}$, $h_1^{z^k\xi^k} = h_1^{\xi^kz^k}$, and $h_1^{\xi^k\xi^k}$. Now we let
$$\begin{cases} i_{x^kx^k} = \text{the number of times } h_1^{x^kx^k} \text{ appears in } h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}} \text{ in (36)},\\ i_{z^k\xi^k} = \text{the number of times } h_1^{z^k\xi^k} \text{ appears in } h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}} \text{ in (36)},\\ i_{\xi^k\xi^k} = \text{the number of times } h_1^{\xi^k\xi^k} \text{ appears in } h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}} \text{ in (36)}.\end{cases}$$

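By applying these notations to (36), and by (17) we get
$$P_{j+1}b_1 = \frac{i^{-(j+1)}}{2^{j+1}(j+1)!}\sum_{\sum_{k=1}^n i_{x^kx^k}+i_{z^k\xi^k}+i_{\xi^k\xi^k} = j+1}\frac{(j+1)!\;2^{\sum_{k=1}^n i_{z^k\xi^k}}}{\prod_{k=1}^n i_{x^kx^k}!\,i_{z^k\xi^k}!\,i_{\xi^k\xi^k}!}$$
$$\times\prod_{k=1}^n\Big(\frac{-\cot\frac{\omega_kt}{2}}{2\omega_k}\Big)^{i_{x^kx^k}}(-1)^{i_{z^k\xi^k}}(-\omega_k\cot\omega_kt)^{i_{\xi^k\xi^k}}$$
$$\times\prod_{k=1}^n\Big(\frac{\sin\omega_k(t-s) + \sin\omega_ks}{\sin\omega_kt}\Big)^{2i_{x^kx^k}}\Big(\frac{\cos\omega_ks}{2}\Big)^{i_{z^k\xi^k}}\Big(\frac{-\sin\omega_ks}{\omega_k}\Big)^{i_{z^k\xi^k}+2i_{\xi^k\xi^k}}\times D^{2j+2}_{2\alpha_1,\dots,2\alpha_n}\mathcal{W},$$
where $\alpha_k = i_{x^kx^k} + i_{z^k\xi^k} + i_{\xi^k\xi^k}$, for $k = 1,\dots,n$. Next we write the above big sum as
$$\sum_{\sum_{k=1}^n i_{x^kx^k}+i_{z^k\xi^k}+i_{\xi^k\xi^k} = j+1}\frac{(j+1)!}{\prod_{k=1}^n i_{x^kx^k}!\,i_{z^k\xi^k}!\,i_{\xi^k\xi^k}!}\,(*) = \sum_{\sum\alpha_k = j+1}\frac{(j+1)!}{\prod\alpha_k!}\prod_{k=1}^n\ \sum_{i_{x^kx^k}+i_{z^k\xi^k}+i_{\xi^k\xi^k} = \alpha_k}\frac{\alpha_k!}{i_{x^kx^k}!\,i_{z^k\xi^k}!\,i_{\xi^k\xi^k}!}\,(*).$$
So the coefficient of $D^{2j+2}_{2\alpha_1,\dots,2\alpha_n}\mathcal{W}$ in $P_{j+1}b_1$ equals
$$\frac{i^{-(j+1)}(-1)^{j+1}}{2^{j+1}\big(\prod\alpha_k!\big)\big(\prod\omega_k^{\alpha_k}\big)}\prod_{k=1}^n\Big(\frac12\cot\frac{\omega_kt}{2}\Big(\frac{\sin\omega_k(t-s) + \sin\omega_ks}{\sin\omega_kt}\Big)^2 - \cos\omega_ks\,\sin\omega_ks + \cot\omega_kt\,\sin^2\omega_ks\Big)^{\alpha_k}.$$
Now we observe that the term in the parenthesis simplifies to
$$\frac12\cot\frac{\omega_kt}{2}\Big(\frac{\sin\omega_k(t-s) + \sin\omega_ks}{\sin\omega_kt}\Big)^2 - \cos\omega_ks\,\sin\omega_ks + \cot\omega_kt\,\sin^2\omega_ks = \frac12\cot\frac{\omega_kt}{2}.$$
So we get
$$P_{j+1}b_1 = \frac{1}{(2i)^{j+1}}\sum_{|\vec\alpha| = j+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j+2}_{2\vec\alpha}\mathcal{W}. \tag{37}$$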
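The trigonometric simplification used above is equivalent to saying that the quadratic form of $H_1^{-1}$, evaluated on the coefficient vector of $(x, z, \xi)$ inside the argument of $b_1$, does not depend on $s$ and equals $-\frac{1}{2\omega}\cot\frac{\omega t}{2}$. The sketch below checks this for $n = 1$; the matrix entries and coefficients are transcribed from the displayed $H_1^{-1}$ and $b_1$, while the function names and sample values are assumptions made for illustration.

```python
import math

def h1_inverse(omega: float, t: float):
    """3x3 matrix H_1^{-1} for n = 1, in the variable order (x, z, xi)."""
    cot_half = math.cos(omega * t / 2) / math.sin(omega * t / 2)
    cot_full = math.cos(omega * t) / math.sin(omega * t)
    return [
        [-cot_half / (2 * omega), 0.0, 0.0],
        [0.0, 0.0, -1.0],
        [0.0, -1.0, -omega * cot_full],
    ]

def argument_coefficients(omega: float, t: float, s: float):
    """Coefficients of x, z and xi inside the argument of W in b_1."""
    cx = (math.sin(omega * (t - s)) + math.sin(omega * s)) / math.sin(omega * t)
    cz = math.cos(omega * s) / 2
    cxi = -math.sin(omega * s) / omega
    return [cx, cz, cxi]

def quadratic_form(omega: float, t: float, s: float) -> float:
    """Sum over r, r' of h^{r r'} c_r c_{r'}; each application of <H^{-1} grad, grad> to W(c.r) produces this factor."""
    h = h1_inverse(omega, t)
    c = argument_coefficients(omega, t, s)
    return sum(h[a][b] * c[a] * c[b] for a in range(3) for b in range(3))

if __name__ == "__main__":
    omega, t = 1.3, 0.6
    target = -1.0 / (2 * omega * math.tan(omega * t / 2))
    for s in (0.0, 0.1, 0.25, 0.5):
        print(s, quadratic_form(omega, t, s), target)
```

That the value is the same for every $s$ is what makes the $s$-integral in (7) contribute only the factor $t$ in (8).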

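Finally, by plugging $(x,z,\xi) = 0$ into Eq. (37) and applying it to (7), we get (8). This finishes the proof of Theorem 1.1.3.

For future reference let us highlight the equation we just proved:
$$S_1 := \sum_{r_1,\dots,r_{2j+2}\in A_1}h_1^{r_1r_2}\cdots h_1^{r_{2j+1}r_{2j+2}}\,\frac{\partial^{2j+2}\mathcal{W}}{\partial r_1\cdots\partial r_{2j+2}} = (j+1)!\sum_{|\vec\alpha| = j+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j+2}_{2\vec\alpha}\mathcal{W}, \tag{38}$$
where
$$\mathcal{W} = \mathcal{W}\Big(\frac{\cos\omega_ks}{2}\,z^k - \frac{\sin\omega_ks}{\omega_k}\,\xi^k + \Big(\frac{\sin\omega_k(t-s) + \sin\omega_ks}{\sin\omega_kt}\Big)x^k\Big).$$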

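2.6. Calculations of $\int_0^t\int_0^{s_1}P_{j+2}b_2(0)$, and the proof of Theorem 1.2. Throughout this section we assume that $V$ is of the form (9). Hence, the only non-zero Taylor coefficients are of the form $D^{2j+2}_{2\vec\alpha}V(0)$, or $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)$, where $\vec e_n = (0,\dots,0,1)$. We notice that based on our discussion in the previous section, the Taylor coefficients of order $2j+1$ appear in $\int_0^t\int_0^{s_1}P_{j+2}b_2(0)$, and they are of the form $D^{2j+1}_{\vec\beta}V(0)\,D^3_{\vec\delta}V(0)$. Therefore we look for the coefficients of the data
$$\big\{D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0);\ |\vec\alpha| = j-1\big\} \tag{39}$$
in the expansion of $a_j(t)$.

Proposition 2.5. In the expansion of $a_j(t)$, the coefficient of the data $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$, $|\vec\alpha| = j-1$, is
$$\frac{c_2(n)\,t}{(2i)^{j+2}\,\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\Big(\frac{1}{3\omega_n^2}\Big(\frac{2\alpha_n+5}{\alpha_n+1}\Big)\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2 + \frac{1}{9\omega_n^4}\Big). \tag{40}$$
Therefore
$$a_j(t) = \frac{c_1(n)\,t}{(2i)^{j+1}}\sum_{|\vec\alpha| = j+1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}D^{2j+2}_{2\vec\alpha}V(0)$$
$$+ \frac{c_2(n)\,t}{(2i)^{j+2}}\sum_{|\vec\alpha| = j-1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\Big(\frac{1}{3\omega_n^2}\Big(\frac{2\alpha_n+5}{\alpha_n+1}\Big)\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2 + \frac{1}{9\omega_n^4}\Big)D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$$
$$+ \{\text{a polynomial of Taylor coefficients of order} \le 2j\}. \tag{41}$$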
1076

H. Hezari 2 j+1

3 V (0), Proof. As we mentioned at the beginning of Sect. 2.7, the data D2 α +3 en V (0)D3 en

t s1 | α | = j −1, appears first in a j (t) and it is a part of the term 0 0 P j+2 b2 (0). So let us cal2 j+1 3 V (0). culate those terms in the expansion of P j+2 b2 (0) which contain D2 α +3 en V (0)D3 en By (17), since here l = 2, we have

b2 (s1 , s2 , x, x, z 1 , z 2 , ξ1 , ξ2 ) = W1 W2 , W1 = W ( cos 2ωk s1 (z 1k + z 2k ) − W2 = W ( cos 2ωk s2 z 2k − Also from (32) we have ⎛ −1 D( 2ω cot( ωt 2 )) ⎜ ⎜ ⎜ 0 ⎜ ⎜ ⎜ ⎜ 0 H2−1 = ⎜ ⎜ ⎜ 0 ⎜ ⎜ ⎜ ⎝ 0

sin ωk s1 k ωk ξ1

sin ωk s2 k ωk ξ2

where,

1 )+sin ωk s1 + ( sin ωk (t−s )x k ), sin ωk t

(42)

2 )+sin ωk s2 + ( sin ωk (t−s )x k ). sin ωk t

0

0

0

0

0

0

−I

−I

⎞

⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ 0 0 0 −I ⎟ ⎟ ⎟ −I 0 −D(ω cot(ωt)) −D(ω cot(ωt)) ⎟ ⎟ ⎟ −I −I −D(ω cot(ωt)) −D(ω cot(ωt)) ⎠

. (43)

5n×5n

By (35) and (42), P j+2 b2 (0) = S2 =

i− j

S , 2 j . j! 2

where S2 is the following sum: r

h r21 r2 . . . h 22 j+3

r2 j+4

(W1 W2 )r1 ...r2 j+4 (0),

(44)

r1 ,...,r2 j+4 ∈A2

where A2 = {x k , z 1k , z 2k , ξ1k , ξ2k }nk=1 and for every r, r ∈ A2 , h rr 2 is the (r, r )-entry of −1 the matrix H2 in (43). We would like to separate out those terms in S2 which include 2 j+1 3 V (0). To do this, from the total number 2 j +4 derivatives that we want D2 α+3 en V (0)D3 en to apply to W1 W2 , we have to put 3 of them on W1 (or W2 ) and put 2 j + 1 of them on W2 (or W1 respectively). These combinations fit into one of the following two different forms: r r h r21 r2 . . . h 22 j+3 2 j+4 (W1 )r1 r2 r3 (W2 )r4 ,r5 ...r2 j+4 (0). (45) S21 = r1 ,...,r2 j+4 ∈A2

There are 2( j + 1)( j + 2) terms of this form in the expansion of S2 . r r h r21 r2 . . . h 22 j+3 2 j+4 (W1 )r1 r3 r5 (W2 )r2 ,r4 ,r6 ,r7 ...r2 j+4 (0). S22 = r1 ,...,r2 j+4 ∈A2

j +2 terms of this form in the expansion of S2 . 3 Now, we calculate the sums S21 and S22 . There are 23

(46)
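One way to read the two counts is as a classification of the ways to choose which three of the $2j+4$ derivative slots fall on $W_1$: either two of the three slots form one of the $j+2$ index pairs $(r_{2m-1}, r_{2m})$, or all three come from different pairs. The brute-force sketch below, with hypothetical names, enumerates the 3-subsets and compares with $2(j+1)(j+2)$ and $2^3\binom{j+2}{3}$; this interpretation is an assumption made here for illustration, not a statement taken from the paper.

```python
from itertools import combinations
from math import comb

def classify_triples(j: int):
    """Count 3-subsets of the 2j+4 slots by whether they contain a full pair (2m, 2m+1)."""
    n_pairs = j + 2
    one_full_pair = 0  # two slots from one pair, the third from another pair
    all_distinct = 0   # three slots taken from three different pairs
    for triple in combinations(range(2 * n_pairs), 3):
        pairs_hit = {slot // 2 for slot in triple}
        if len(pairs_hit) == 2:
            one_full_pair += 1
        else:
            all_distinct += 1
    return one_full_pair, all_distinct

if __name__ == "__main__":
    for j in range(1, 5):
        print(j, classify_triples(j), (2 * (j + 1) * (j + 2), 8 * comb(j + 2, 3)))
```

The two classes together exhaust all $\binom{2j+4}{3}$ choices, so no other pattern of three derivatives on $W_1$ can occur.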


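2.6.1. Calculation of $S_2^1$. We rewrite $S_2^1$ as
$$S_2^1 = \sum_{r_1,\dots,r_4}h_2^{r_1r_2}h_2^{r_3r_4}\Big(\sum_{r_5,\dots,r_{2j+4}}h_2^{r_5r_6}\cdots h_2^{r_{2j+3}r_{2j+4}}\,(W_2)_{r_5\dots r_{2j+4}}\Big)_{r_4}(W_1)_{r_1r_2r_3}(0).$$
Then from the definition of $W_2$ in (42) and also from (43) it is clear that we can apply (38) to the sum in the big parenthesis above. Hence we get
$$S_2^1 = (j+1)!\sum_{r_1,\dots,r_4}h_2^{r_1r_2}h_2^{r_3r_4}\sum_{|\vec\alpha| = j}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\big(D^{2j}_{2\vec\alpha}W_2\big)_{r_4}(W_1)_{r_1r_2r_3}(0). \tag{47}$$
This reduces the calculation of $S_2^1$ to calculating the small sum
$$A_2^1 = \sum_{r_1,\dots,r_4}h_2^{r_1r_2}h_2^{r_3r_4}\,(\tilde W_2)_{r_4}(W_1)_{r_1r_2r_3}(0), \qquad (\tilde W_2 = D^{2j}_{2\vec\alpha}W_2).$$
Computation of the sum $A_2^1$ is straightforward and we omit writing the details of this computation. Using Maple, we obtain
$$\int_0^t\int_0^{s_1}A_2^1\,ds_2\,ds_1 = -\frac{t}{2\omega_n^2}\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)D^1_{\vec e_n}\tilde W_2\,D^3_{3\vec e_n}W_1(0).$$
If we plug this into (47), after a change of variable $\alpha_n \to \alpha_n + 1$ in indices, we get
$$\int_0^t\int_0^{s_1}S_2^1\,ds_2\,ds_1 = \sum_{|\vec\alpha| = j-1}\frac{(j+1)!}{\alpha_n+1}\,\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\times\Big(-\frac{t}{2\omega_n^2}\Big)\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0). \tag{48}$$

2.6.2. Calculation of $S_2^2$. We rewrite $S_2^2$ as
$$S_2^2 = \sum_{r_1,\dots,r_6}h_2^{r_1r_2}h_2^{r_3r_4}h_2^{r_5r_6}\Big(\sum_{r_7,\dots,r_{2j+4}}h_2^{r_7r_8}\cdots h_2^{r_{2j+3}r_{2j+4}}\,(W_2)_{r_7\dots r_{2j+4}}\Big)_{r_2,r_4,r_6}(W_1)_{r_1r_3r_5}(0).$$
Again from (43) it is clear that we can apply (38) to the sum in the big parenthesis above. So
$$S_2^2 = (j+1)!\sum_{|\vec\alpha| = j-1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\times\sum_{r_1,\dots,r_6}h_2^{r_1r_2}h_2^{r_3r_4}h_2^{r_5r_6}\big(D^{2j-2}_{2\vec\alpha}W_2\big)_{r_2,r_4,r_6}(W_1)_{r_1r_3r_5}(0). \tag{49}$$
So we need to compute
$$A_2^2 = \sum_{r_1,\dots,r_6}h_2^{r_1r_2}h_2^{r_3r_4}h_2^{r_5r_6}\,(\tilde W_2)_{r_2,r_4,r_6}(W_1)_{r_1r_3r_5}(0), \qquad (\tilde W_2 = D^{2j-2}_{2\vec\alpha}W_2).$$
Using Maple,
$$\int_0^t\int_0^{s_1}A_2^2\,ds_2\,ds_1 = \Big(-\frac{t}{2\omega_n^2}\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2 - \frac{t}{12\omega_n^4}\Big)D^3_{3\vec e_n}\tilde W_2\,D^3_{3\vec e_n}W_1(0).$$
If we plug this into (49) we get
$$\int_0^t\int_0^{s_1}S_2^2\,ds_2\,ds_1 = (j+1)!\sum_{|\vec\alpha| = j-1}\frac{1}{\vec\alpha!}\Big(\frac{-1}{2\vec\omega}\cot\frac{\vec\omega t}{2}\Big)^{\vec\alpha}\times\Big(-\frac{t}{2\omega_n^2}\Big(\frac{-1}{2\omega_n}\cot\frac{\omega_nt}{2}\Big)^2 - \frac{t}{12\omega_n^4}\Big)D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0). \tag{50}$$
On the other hand the part of the expansion of $\int_0^t\int_0^{s_1}P_{j+2}b_2(0)$ which contains the data $D^{2j+1}_{2\vec\alpha+3\vec e_n}V(0)\,D^3_{3\vec e_n}V(0)$ equals
$$\frac{i^{-j}}{2^j\,j!}\Big(2(j+2)(j+1)\int_0^t\int_0^{s_1}S_2^1 + 2^3\binom{j+2}{3}\int_0^t\int_0^{s_1}S_2^2\Big).$$
Finally, by applying Eqs. (48) and (50) to this we obtain (41).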
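Now using Proposition 2.5, we give a proof of Theorem 1.2.

Proof of Theorem 1.2. First of all, we prove that the functions $(\cot\frac{\omega}{2}t)^{\alpha}$, taken over all multi-indices $\alpha$, are linearly independent over $\mathbb{C}$. To show this we define
\[ \widetilde{\cot} : (0,\pi)^n \longrightarrow \mathbb{R}^n, \qquad \widetilde{\cot}(x_1,\dots,x_n) = (\cot(x_1),\dots,\cot(x_n)). \]
Because the $\omega_k$ are linearly independent over $\mathbb{Q}$, the set $\{(\frac{\omega_1}{2}t,\dots,\frac{\omega_n}{2}t) + \pi\mathbb{Z}^n;\ t\in\mathbb{R}\}\cap(0,\pi)^n$ is dense in $(0,\pi)^n$. Since $\widetilde{\cot}$ is a homeomorphism and $\cot$ is $\pi$-periodic, we conclude that the set $\{(\cot(\frac{\omega_1}{2}t),\dots,\cot(\frac{\omega_n}{2}t));\ t\in\mathbb{R}\}$ is dense in $\mathbb{R}^n$. Now assume
\[ \sum_{\alpha} c_\alpha \Big(\cot\frac{\omega}{2}t\Big)^{\alpha} = 0. \]
Since $\{(\cot(\frac{\omega_1}{2}t),\dots,\cot(\frac{\omega_n}{2}t));\ t\in\mathbb{R}\}$ is dense in $\mathbb{R}^n$, we get
\[ \sum_{\alpha} c_\alpha X^{\alpha} = 0 \]
for every $X = (X_1,\dots,X_n)\in\mathbb{R}^n$. But the monomials $X^{\alpha}$ are linearly independent over $\mathbb{C}$. So $c_\alpha = 0$.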
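The density step used above can be illustrated numerically. The following is only a sketch and not part of the proof: it assumes n = 2, samples t on a uniform grid (the step `dt`, the grid size and the helper name `cell_coverage` are illustrative choices), and records which cells of a partition of (0, π)² are visited by the points (ω₁t/2, ω₂t/2) mod π.

```python
import math


def cell_coverage(omegas, grid=10, samples=200_000, dt=0.37):
    """Fraction of the grid x grid cells of (0, pi)^2 that are hit by the
    sampled orbit t -> (omega_1 * t / 2, omega_2 * t / 2) mod pi."""
    w1, w2 = omegas
    cell = math.pi / grid
    visited = set()
    for k in range(samples):
        t = k * dt
        x = (w1 * t / 2.0) % math.pi
        y = (w2 * t / 2.0) % math.pi
        # Clamp the index in case rounding puts a point on the upper edge.
        visited.add((min(int(x / cell), grid - 1), min(int(y / cell), grid - 1)))
    return len(visited) / (grid * grid)


# Frequencies linearly independent over Q: the orbit spreads over the square.
independent = cell_coverage((1.0, math.sqrt(2.0)))
# Rationally dependent frequencies: the orbit stays on the curve y = 2x mod pi.
dependent = cell_coverage((1.0, 2.0))
print(independent, dependent)
```

With these assumed parameters the pair (1, √2) reaches essentially every cell, whereas the dependent pair (1, 2) leaves most cells empty, which is the dichotomy behind the argument.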



Next we argue inductively to recover the Taylor coefficients of V from the wave invariants. Since
\[ a_0(t) = \prod_{k=1}^{n} \frac{1}{2i\sin\frac{\omega_k t}{2}}, \]

we can recover nk=1 sin ω2k t , and therefore we can recover {ωk } up to a permutation. This can be seen by Taylor expanding nk=1 sin ω2k t . We fix this permutation and we 3 V (0). This term appears first move on to recover the third order Taylor coefficient D3 en in a1 (t). By Proposition 2.5, we have ω t 1 −1 4 ( cot t)α D2 α V (0) 2 (2i) α ! 2ω 2 | α |=2 2 5 −1 t ωn t 2 1 3 D3 + c2 (n) ) ( cot + en V (0) 3 2 4 (2i) 3ωn 2ωn 2 9ωn +{a rational function of ωk }. ωn t 2 −1 1 are linearly cot ) + Now since the functions {(cot ω2 t)α }| α|=2 and 3ω5 2 ( 2ω 4 2 9ω n a1 (t) = c1 (n)

n

n

4 V (0)} 3 2 independent over C, we can therefore recover the data {D2 | α |=2 and {D3 en V (0) } α 3 from a1 (t). So we have determined the third order term D3 en V (0) up to a minus sign from the first invariant a1 (t). This choice of minus sign corresponds to a reflection. We fix this reflection and we move on to determine the higher order Taylor coefficients inductively. 3 V (0) = 0 and that we know all the Taylor coefficients Next we assume D3 en 2 j+1

D m V (0) with m ≤ 2 j. We wish to determine the data {D2 α+3 en V (0)}| α|= j−1 and

β 2 j+2 {D2 α V (0)}| α|= j+1 ,

from the wave invariant a j (t). At this point we use Proposition 2.5, and to finish the proof of Theorem 1.2 we have to show that the set of functions ω (cot t)α ; | α| = j + 1 ∪ 2 ωt 1 2αn + 5 −1 ωn t 2 1 (cot )α ; | α| = j − 1 , )( ) ( cot + 2 3ωn2 αn + 1 2ωn 2 9ωn4 are linearly independent over C. But this is clear from our discussion at the beginning of the proof. 3. Appendix A In this Appendix we prove Lemma 2.1. Proof. First of all we would like to change the function slightly by rescaling it. We choose 0 < τ < 2ε so that 1−τ = o(1−2ε ). Then we define (x) := (

x ). 1−τ

(51)


Thus ∈ C0∞ ([0, ∞)) is supported in the interval I = [0, 1−τ δ]. In Appendix B, using the min-max principle we show that T r (( Hˆ )e

−it

Hˆ

) = T r (( Hˆ )e

−it

Hˆ

) + O(∞ ) = T r (e

−it

Hˆ

) + O(∞ ).

Hence to prove the lemma it is enough to show ˆ T r (( P)e

−it

Pˆ

) = T r (( Hˆ )e

−it

Hˆ

) + O(∞ ).

To prove this identity we use the WKB construction of the kernel of the operators ˆ ˆ ˆ )e −it ˆ −it P and ( H H and make a compression between them. ( P)e −it

ˆ

ˆ P . In [DSj], Chapter 10, a WKB construction is 3.1. WKB construction for ( P)e −it ˆ P ˆ for symbols P in the symbol class S 0 (1) which are independent of made for ( P)e 0 or of the form P(x, ξ, ) ∼ P0 (x, ξ )+P1 (x, ξ )+. . . , where P j ∈ S00 are independent of (but not for symbols H = H (x, ξ, ) ∈ Sδ00 ). ˆ ˆ −it P for small time t, say t ∈ (−t0 , t0 ), It is shown that we can approximate ( P)e by a fourier integral operator of the form

U P (t)u(x) = (2π )

−n

ei(ϕ P (t,x,η)−y.η)/b P (t, x, y, η, )u(y)dydη,

where b P ∈ C ∞ ((−t0 , t0 ); S(1)) have uniformly compact support in (x, y, η), and ϕ P is real, smooth and is defined near the support of b P . The functions ϕ P and b P are found in such a way that for all t ∈ (−t0 , t0 ), ˆ ||( P)e

−it

Pˆ

− U P (t)||tr = O(∞ ).

Let us briefly review this construction, made in [DSj]. First of all, in Chapter 8, Theoˆ p w (a P (x, ξ, )) rem 8.7, it is proved that for every symbol P ∈ S00 (1), we have ( P)=O for some a P (x, ξ, ) ∈ S00 (1), where here Pˆ and O p w (a P (x, ξ, )) are respectively the Weyl quantization of P and a P (x, ξ, ). It is also shown that a P ∼ a P,0 (x, ξ ) + ha P,1 (x, ξ ) + . . . for some a P, j (x, ξ ) ∈ S00 (1). The idea of proof is as follows. In Theo˜ ∈ C 1 (C) is an almost analytic rem 8.1 of [DSj] it is shown that if ∈ C0∞ (R), and if 0 ∞ ˜ extension of (i.e. ∂¯ (z) = O(|z| )), then ˆ = ( P)

¯˜ ∂ (z) −1 L(dz). π C z − Pˆ

(52)

Then it is verified that for some symbol $r(x,\xi,z;\hbar)$, we have $(z-\hat P)^{-1} = Op^w(r(x,\xi,z;\hbar))$. By symbolic calculus, one can find a formal asymptotic expansion of the form
\[ r(x,\xi,z;\hbar) \sim \frac{1}{z-P} + \hbar\,\frac{q_1(x,\xi,z)}{(z-P)^3} + \hbar^2\,\frac{q_2(x,\xi,z)}{(z-P)^5} + \dots, \]



ˆ = (z − P) ˆ O p w (r (x, ξ, z; )) = 1. by formally solving O p w (r (x, ξ, z; ))(z − P) We can see that q j (x, ξ, z) are polynomials in z with smooth coefficients. Finally it is ˆ = O p w (a P (x, ξ, )), where a P ∈ S 0 is given by shown that ( P) 0 a P (x, ξ, ) =

−1 ˜ ∂¯ (z)r (x, ξ, z; )L(dz). π C

By the above asymptotic expansion for r (x, ξ, z; ) one obtains an asymptotic a P ∼ a P,0 + a P,1 + . . ., where a P, j =

q j (x, ξ, z) −1 1 2j ˜ ∂ (q j (x, ξ, t)(t))|t=P(x,ξ ) . (53) L(dz) = ∂¯ (z) 2 j+1 π C (z − P) (2 j)! t

Then, again in Chapter 10 of [DSj], it is shown that ϕ P (t, x, η) and b P (t, x, y, η, ) satisfy ∂t ϕ P (t, x, η) + P(x, ∂x ϕ P (t, x, η)) = 0, ϕ P |t=0 = x.η, (54) b P ∼ b P,0 + b P,1 + . . . , b P, j = b P, j (t, x, y, η) ∈ C ∞ ((−t0 , t0 ); S00 (1)), where ⎧ ! " ⎨ ∂t b P, j + ∂x ϕ P , ∂x b P, j + 21 x ϕ P . b P, j = − 21 x b P, j−1 , ⎩

b P, j |t=0 = ψ(x, η)a P, j ( x+y 2 , η)ψ(y, η).

j ≥ 0, (b P,−1 = 0), (55)

In (55), a P, j is given by (53) and ψ(x, η) is any C0∞ function which equals 1 in a neighborhood of P −1 (I ), where I = [0, δ] is, as before, the range of our low-lying eigenvalues and where is supported. −it ˆ There exists a similar construction for ( Hˆ )e H , except here H ∈ S 0 . δ0

−it

ˆ

3.2. WKB construction for ( Hˆ )e H . Since in (13), H = H (x, ξ, ) ∈ Sδ00 , with δ0 = 21 − ε, we can not simply use the construction in [DSj] mentioned above. Here in −it ˆ two lemmas we show that the same construction works for the operator ( Hˆ )e H . We will closely follow the proofs in [DSj]. Lemma 3.1. 1) Let be given by (51) and H ∈ Sδ00 by (13). Then for some a H ∈ Sδ00 we have ( Hˆ ) = O p w (a H (x, ξ, )). Moreover a H (x, ξ, ) ∼ a H,0 (x, ξ, ) + a H,1 (x, ξ, ) + . . . , where a H, j (x, ξ, ) ∈ Sδ00 is given by q H, j (x, ξ, z, ) −1 ˜ L(dz) ∂¯ (z) π C (z − H )2 j+1 1 2j = ∂ (q H, j (x, ξ, t, )(t))|t=H (x,ξ,) . (2 j)! t

a H, j =

(56)


2) Choose c such that 0 < c < min{1, ωk2 }nk=1 ≤ max{1, ωk2 }nk=1 < 1c . Let ψ(x, η) be a function in C0∞ (R2n ) ∩ Sδ00 (R2n ) which is supported in the ball {x 2 + η2 < 4c−1 1−τ δ} and equals 1 in a neighborhood of H −1 (I ), where I = [0, 1−τ δ] (I is where is supported). Then x + y −n ˆ ei(x−y).η/ψ(x, η)a H ( H )u(x) = (2π ) , η, 2 (57) × ψ(y, η)u(y)dydη + K ()u(x), where ||K ()||tr = O(∞ ). Proof of Lemma 3.1. Since H ∈ Sδ00 and δ0 = 21 − ε < 21 , the symbolic calculus mentioned in the last section can be followed similarly to prove Lemma 3.1.1. It is also easy to check that in (56), a H, j ∈ Sδ00 . The second part of the lemma is stated in [DSj], Eq. 10.1, for the case P ∈ S00 . The same argument works for H ∈ Sδ00 , precisely because the factor N on the right-hand side of the inequality in Proposition 9.5 of [DSj] changes to N −δ0 α . Thus the discussion on pp. 115–116 still follows. Lemma 3.2. There exists t0 > 0 such that for every t ∈ (−t0 , t0 ), there exist functions ϕ H (t, x, η, ) and b H (t, x, y, η, ) such that the operator U H (t) defined by −n ei(ϕ H (t,x,η,)−y.η)/b H (t, x, y, η, )u(y)dydη, (58) U H (t)u(x) = (2π ) satisfies ||( Hˆ )e

−it

Hˆ

− U H (t)||tr = O(∞ ).

Moreover, we can choose ϕ H and b H such that 1) ϕ H satisfies the eikonal equation ∂t ϕ H (t, x, η, ) + H (x, ∂x ϕ H (t, x, η, )) = 0,

ϕ H |t=0 = x.η.

(59)

This equation can be solved in (−t0 , t0 ) × + < where C is an arbitrary constant. In fact ϕ H is independent of in this domain. (Only the domain of ϕ H depends on . See (62).) 2) For all t ∈ (−t0 , t0 ), we have b H (t, x, y, η, ) ∈ Sδ00 with supp b H ⊂ {x 2 + η2 , y 2 + η2 < C1 1−τ δ} for some constant C1 . Also b H has an asymptotic expansion of the form {x 2

η2

C1−τ δ},

b H ∼ b H,0 + b H,1 + . . . , b H, j = b H, j (t, x, y, η, ) ∈ C ∞ ((−t0 , t0 ); Sδ00 (1)), (60) and the functions b H, j satisfy the transport equations ⎧ " ! ⎨ ∂t b H, j + ∂x ϕ H , ∂x b H, j + 21 x ϕ H . b H, j = − 21 x b H, j−1 , ⎩

j ≥ 0, (b H,−1 = 0),

b H, j |t=0 = ψ(x, η)a H, j ( x+y 2 , η, )ψ(y, η), (61)

where in (61) we let ψ(x, η) be a function in C0∞ (R2n )∩Sδ00 (R2n ) which is supported in the ball {x 2 + η2 < 4c−1 1−τ δ} and equals 1 in a neighborhood of H −1 (I), where I = [0, 1−τ δ]. Here c is defined in Lemma 3.1.2. Also in (61), the functions

a H, j are defined by (56).



3) For all t ∈ (−t0 , t0 ), ϕ H (t, x, η, ) = ϕ P (t, x, η) on {x 2 + η2 , y 2 + η2 < C1 1−τ δ} ⊃ supp (b H (x, y, η, )).

(62) 4) For all t ∈ (−t0 , t0 ), b H, j (t, x, y, η, ) = b P, j (t, x, y, η) on

x 2 + η2 , y 2 + η2 < c1−τ δ . (63)

Proof of Lemma 3.2. First of all we assume U H (t) is given by (58) and we try to solve the equation ⎧ ⎨ ||( i ∂t + Hˆ )U H (t)||tr = O(∞ ), ⎩

U H (0) = ( Hˆ )

for ϕ H and b H , for small time t. Using (57), this leads us to ⎧ (1)), ⎨ e−iϕ H /( i ∂t + Hˆ )(eiϕ H /b H ) ∈ C ∞ ((−t0 , t0 ); Sδ−∞ 0 ⎩

b|t=0 = ψ(x, η)a H ( x+y 2 , η, )ψ(y, η).

We choose the phase function ϕ H = ϕ H (t, x, η, ) to satisfy the eikonal equation (59). We show that this equation can be solved in a neighborhood of the support of b H , for small time t ∈ (−t0 , t0 ) with t0 independent of . Let us explain how to solve this equation. We let (x(t, z, η; ), ξ(t, z, η; )) be the solution to the Hamilton equation ⎧ x(0, z, η; ) = z, ⎨ ∂t x = ∂ξ H (x, ξ, ) = ξ, . (64) ⎩ ∂ ξ = −∂ H (x, ξ, ) = −∂ V (x), ξ(0, z, η; ) = η. t x x We can show that (see Sect. 4 of [Ch]) there exists t0 independent of such that for all |t| ≤ t0 we have ⎧ |∂η x(t, z, η; )| ≤ 21 , ⎨ |∂z x(t, z, η; ) − I | ≤ 21 , . (65) ⎩ |∂z ξ(t, z, η; )| ≤ 21 , |∂η ξ(t, z, η; ) − I | ≤ 21 . We can choose t0 independent of , precisely because in Eq. 4.4 of [Ch] we have a uniform bound in for Hess(V(x)). Now, we define λ : (z, η) −→ (x(t, z, η; ), η). It is easy to see that λ(0, 0) = (0, 0). This is because if (z, η) = (0, 0) then H (x, ξ ) = H (z, η) = 0. By (13) and (2), and W (x) = O(|x|3 ), we can see that H (x, ξ ) = 0 implies (x(t, 0, 0; ), ξ(t, 0, 0; )) = (0, 0). On the other hand from (65) we have 1 3 2 < |∂z x(t, z, η; )| < 2 . Therefore λ is invertible in a neighborhood of origin. We define the inverse function by λ−1 (x, η) = (z(t, x, η; ), η),


which is defined in a neighborhood of (x, η) = (0, 0). Then we have t 1 |ξ(s, z(t, x, η; ), η; )|2 ϕ H (t, x, η, ) = z(t, x, η; ).η + 0 2 −V(x(s, z(t, x, η; ), η; ))ds.

(66)
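For orientation, the Hamilton flow (64) and the phase formula (66) can be evaluated in closed form in the simplest model case. The worked example below is an illustration under the assumption n = 1 and a purely quadratic potential $\frac12\omega^2x^2$ (so no perturbation term is present); it is not part of the proof.

```latex
% Hamilton flow of H = \xi^2/2 + \omega^2 x^2/2 with initial data (z,\eta), cf. (64):
\[ x(s) = z\cos\omega s + \frac{\eta}{\omega}\sin\omega s, \qquad
   \xi(s) = \eta\cos\omega s - \omega z\sin\omega s . \]
% Solving x(t) = x for z, valid for |t| < \pi/(2\omega):
\[ z(t,x,\eta) = \frac{x - \frac{\eta}{\omega}\sin\omega t}{\cos\omega t} . \]
% Inserting the flow into the action integral of (66):
\[ \varphi(t,x,\eta) = z\,\eta + \int_0^t \Big( \tfrac12\xi(s)^2 - \tfrac12\omega^2 x(s)^2 \Big)\,ds
   = \frac{x\eta}{\cos\omega t} - \frac12\Big( \omega x^2 + \frac{\eta^2}{\omega} \Big)\tan\omega t . \]
% Direct check of the eikonal equation (59):
\[ \partial_t\varphi = \frac{\omega x\eta\sin\omega t}{\cos^2\omega t} - \frac{\omega^2x^2 + \eta^2}{2\cos^2\omega t},
   \qquad \partial_x\varphi = \frac{\eta}{\cos\omega t} - \omega x\tan\omega t , \]
\[ \partial_t\varphi + \tfrac12(\partial_x\varphi)^2 + \tfrac12\omega^2x^2 = 0,
   \qquad \varphi|_{t=0} = x\eta . \]
```

In this model $\partial_z x = \cos\omega t$ and $\partial_\eta x = \omega^{-1}\sin\omega t$, so bounds of the type (65) hold for all sufficiently small |t|, and the phase is defined exactly as long as $\cos\omega t \neq 0$.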

A similar formula holds for $\varphi_P$, except that in (64) H should be replaced by P and in (66) $V_\hbar$ by V. It is known that the eikonal equation for $\varphi_P$ can be solved near $\mathrm{supp}\, b_P$, for small time $t\in(-t_0,t_0)$. (Of course $t_0$ is independent of $\hbar$.) Now, we want to show that
\[ \varphi_H(t,x,\eta,\hbar) = \varphi_P(t,x,\eta) \quad \text{in} \quad (-t_0,t_0)\times\{x^2+\eta^2 < C\hbar^{1-\tau}\delta\}. \tag{67} \]
Let $(x,\eta)$ be in $\{x^2+\eta^2 < C\hbar^{1-\tau}\delta\}$. First, we show that $|z(t,x,\eta;\hbar)| < 8C^{\frac12}\hbar^{\frac{1-\tau}{2}}\delta^{\frac12}$. Because $z(t,0,0;\hbar) = 0$, by the Fundamental Theorem of Calculus we have

1

|z(t, x, η; )| ≤ (|x| + |η|) sup{(|∂x | + |∂η |)(z(t, x, η; ))}. From x(t, z(t, x, η; ), η; ) = x, we get ∂η z = −(∂z x)−1 ∂η x. Thus by (65), |∂x z| + |∂η z| ≤ 4. Hence |z(t, x, η; )| < 4(|x| + |η|) < 8C 2 2 δ 2 . This implies that for all |t| ≤ t0 , (x(s, z(t, x, η; ), η; ), ξ(s, z(t, x, η; ), η; )) will stay in a ball of radius O(1−τ ) centered at the origin (this can be seen from the conservation of energy, i.e. H (x, ξ ) = H (z, η)). On the other hand, by definition (13), P and H agree in the ball {x 2 + η2 < 41 1−2ε } and τ < 2ε. So for all t, s ∈ (−t0 , t0 ) and (x, η) ∈ {x 2 + η2 < C1−τ δ} we have z P (t, x, η) = z(t, x, η; ), x P (s, z P (t, x, η), η) = x(s, z(t, x, η; ), η; ), ξ P (s, z P (t, x, η), η) = ξ(s, z(t, x, η; ), η; ),

(68)

where z P (t, x, η), x P (s, z P (t, x, η), η) and ξ P (s, z P (t, x, η), η) are corresponded to the Hamilton flow of P. Hence by (66) and a similar formula for ϕ P , we have (67). This also shows that we can solve (59) in (−t0 , t0 ) × {x 2 + η2 < C1−τ δ}. To find b H we assume it is of the form (60) and we search for functions b H, j such that e−iϕ H /( i ∂t + Hˆ )(eiϕ H /b H ) ∼ 0. After some straightforward calculations and using the eikonal equation for ϕ H we obtain the so called transport equations (61). We can solve the transport equations inductively (see [Ch]). In [Ch] it is shown that the solutions to the transport equation (61) are given by 1

b H,0 (t, x, y, η, ) = J − 2 (t, x, η, )b H,0 (0, z(t, x, η; ), η; ), y, η, ), 1 b H, j (t, x, y, η, ) = J − 2 (t, x, η, ) b H, j (0, z(t, x, η; ), η; ), y, η, ) 1 t 1 − J 2 (s, x, η, )b H, j−1 2 0 (s, x(s, z(t, x, η; ), η; ), y, η, )ds ,

(69)



where J (t, x, η, ) = det(∂x z(t, x, η; ))−1 . Now, we notice by the assumption on ψ, we have supp(b H, j (0, x, y, η; )) ⊂ {x 2 + η2 , x 2 + η2 < 4c−1 1−τ δ}. So by our previous discussion on z(t, x, η, ), we can argue inductively that for all t ∈ (−t0 , t0 ), supp(b H, j ) ⊂ {x 2 + η2 , y 2 + η2 < C1 1−τ δ} for some constant C1 . Since b H, j |t=0 ∈ Sδ00 , we can also see inductively from (69) that b H, j ∈ Sδ00 . Finally, Borel’s Theorem produces a compactly supported amplitude b H ∈ Sδ00 from the compactly supported functions b H, j ∈ Sδ00 . This finishes the proof of items 1, 2 and 3 of Lemma 3.2. Now we give a proof for item 4 of Lemma 3.2. By choosing C > C1 , Eq. (62) is clearly true from (67). Next we prove that Eq. (63) holds. Using (53) and (56), and because P and H agree in the ball {x 2 + η2 < 41 1−2ε }, we observe that the functions a P, j (x, η) and a H, j (x, ξ, ) agree in this ball. Therefore, because suppψ(x, η) ⊂ {x 2 + η2 < 4c−1 1−τ δ} and ψ = 1 in {x 2 + η2 < c1−τ δ}, by (55) and (61), b H, j (0, x, y, η, ) = b P, j (0, x, y, η)

on {(x, y, η); x 2 + η2 , y 2 + η2 < c1−τ δ}.

This proves (63) only at t = 0. But by applying (68) to (69) and a similar formula for b P , we get (63). This finishes the proof of Lemma 3.2. To finish the proof of Lemma 2.1, we have to show that for t sufficiently small T rU H (t) = T rU P (t) + O(∞ ), or equivalently ei(ϕ H (t,x,η,)−x.η)/b H (t, x, x, η, )d xdη = ei(ϕ P (t,x,η)−x.η)/b P (t, x, x, η, )d xdη + O(∞ ). By (62), the phase function ϕ H of the double integral on the left-hand side equals ϕ P on the support of the amplitude b H , so ϕ H is independent of in this domain. Now, if t ∈ (0, t0 ), where t0 is smaller than the smallest non-zero period of the flows of P and H respectively in the energy balls, {(x, η)| H (x, η) ≤ δ1−τ C1 } ⊂ {(x, η)| P(x, η) ≤ δ}, then for every such t, (x, η) = (0, 0) is the only critical point of the phase functions ϕ H (t, x, η, ) − x.η and ϕ P (t, x, η) − x.η in these energy balls. Obviously both integrals in the equation above are convergent because their amplitudes are compactly supported. But the question is whether or not we can apply the stationary phase lemma to these integrals around their unique non-degenerate critical points. By Lemma 3.2 the phase functions ϕ H and ϕ P are independent of on the support of their corresponding amplitudes. Hence ϕ H , ϕ P ∈ S00 on supp b H and supp b P respectively. On the other hand b H (t, x, x, η, ) ∈ Sδ00 , δ0 < 21 ; and b P (t, x, x, η, ) ∈ S00 . These facts can be used to get the required estimates for the remainder term in the stationary phase lemma (for an estimate for the remainder term of the stationary phase lemma, see for example Proposition 5.2 of [DSj]). Finally, by (62) and (63) it is obvious that the integrals above must have the same stationary phase expansions.


4. Appendix B In this Appendix we prove Lemma 2.2. In fact we prove that if is given by (51) then in the sense of tempered distributions T r (( Hˆ )e

−it

Hˆ

) = T r (e

−it

Hˆ

) + O(∞ ).

(70)

The proof of Lemma 2.2 follows similarly. We will use the min-max principle.

Min-max principle. Let H be a self-adjoint operator that is bounded from below, i.e. $H \ge cI$, with purely discrete spectrum $\{E_j\}_{j=0}^{\infty}$. Then
\[ E_j = \sup_{\varphi_1,\dots,\varphi_{j-1}} \ \inf_{\substack{\psi\in D(H);\ \|\psi\|=1 \\ \psi\in \mathrm{span}(\varphi_1,\dots,\varphi_{j-1})^{\perp}}} (\psi, H\psi). \tag{71} \]

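As before we put $\hat H = -\frac12\hbar^2\Delta + V_\hbar(x) = -\frac12\hbar^2\Delta + \frac12\sum_{k=1}^n \omega_k^2x_k^2 + W_\hbar(x)$, and $\hat H_0 = -\frac12\hbar^2\Delta + \frac12\sum_{k=1}^n \omega_k^2x_k^2$. Then if we let $C = \|W_\hbar(x)\|_{L^\infty(\mathbb{R}^n\times(0,h_0))}$, we have
\[ (\psi,\hat H_0\psi) - C \le (\psi,\hat H\psi) \le (\psi,\hat H_0\psi) + C, \]
and therefore by applying the min-max principle to the operators $\hat H$ and $\hat H_0$ we get
\[ E_j^0(\hbar) - C \le E_j(\hbar) \le E_j^0(\hbar) + C. \tag{72} \]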

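Notice we have explicit formulas for the eigenvalues $E_j^0(\hbar)$ of $\hat H_0$. They are given by the lattice points in the first quadrant of $\mathbb{R}^n$. More precisely
\[ \sigma(\hat H_0) = \Big\{ E_\gamma^0(\hbar) = \hbar\sum_{k=1}^{n}\omega_k\Big(\gamma_k+\frac12\Big);\ \gamma_k\in\mathbb{Z}^{\ge 0} \Big\}. \]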
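The lattice description of the spectrum of the harmonic oscillator can be made concrete with a short enumeration. This is a minimal sketch under stated assumptions: the function name `low_lying_levels` and the sample values of ħ, the frequencies and the energy cutoff are hypothetical, chosen only to illustrate the formula $E^0_\gamma(\hbar) = \hbar\sum_k \omega_k(\gamma_k + \frac12)$.

```python
from itertools import product


def low_lying_levels(omegas, hbar, cutoff):
    """Return the sorted pairs (energy, gamma) with
    energy = hbar * sum_k omega_k * (gamma_k + 1/2) <= cutoff."""
    # In each direction gamma_k cannot exceed cutoff / (hbar * omega_k).
    bounds = [max(int(cutoff / (hbar * w)), 0) for w in omegas]
    levels = []
    for gamma in product(*(range(b + 1) for b in bounds)):
        energy = hbar * sum(w * (g + 0.5) for w, g in zip(omegas, gamma))
        if energy <= cutoff:
            levels.append((energy, gamma))
    return sorted(levels)


# Two frequencies, hbar = 0.1, all levels below the (assumed) cutoff 0.5.
levels = low_lying_levels([1.0, 2.0], hbar=0.1, cutoff=0.5)
for energy, gamma in levels:
    print(gamma, round(energy, 6))
```

By (72), each eigenvalue of the perturbed operator lies within C of the corresponding entry of this list, which is why counting lattice points suffices for the tail estimate that follows.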

Since in the sense of tempered distributions T r (( Hˆ )e

−it

Hˆ

) = T r (χ[0,δ 1−τ ] ( Hˆ )e

−it

Hˆ

)+ O(∞ ); (see for example [Ca], Prop. 6),

to prove (70), it is clearly enough to show that for every ϕ in S(R) { j; E j ()>δ 1−τ }

ϕ( ˆ

E j () ) = O(∞ ).

Since ϕˆ is in S(R), for every p ≥ 0 there exists a constant C p such that |ϕ(x)| ˆ ≤ C p |x|− p . Hence by (72), # 0 # # # # E () − C #− p # E j () #− p E j () j # # # ≤ Cp # ϕ( ) ≤ C p ## # . # # #



Again using (72) and because C = W(x) L ∞ (Rn ×(0,h 0 )) < A 2 −3ε < 4δ 1−τ we get 3

# 0 #− p # 0 #− p # E () # E j () δ1−τ − C p ## E j () ## # j # ) ≤ C p ( 1−τ ) # ϕ( # < 2C p # # , for E j () > δ1−τ. # # δ − 2C # # Now let m be an arbitrary positive integer. So in order to prove the lemma it is enough to find a uniform bound for

A() := −m {γ ;

|

ωk (γk + 21 )> δ

1−τ −C }

n k=1

1 ωk (γk + )|− p . 2

By applying the geometric-arithmetic mean value inequality we get A() ≤ n

− p −m

{γ ;

ωk (γk + 21 )> δ

⎧⎛ n ⎪ ⎨ ⎜ −m ≤ n− p ⎝ ⎪ k=1 ⎩

| 1−τ −C }

n k=1

1 ωk (γk + )|− p 2 ⎞

{γk ∈Z≥0 ; ωk (γk + 21 )> δ

⎛ ⎞⎫ ⎪ ⎬ 1 ⎝ |ωk (γk + )|− p ⎠ . × ⎪ 2 ⎭ γk k =k

1−τ −C } n

1 ⎟ |ωk (γk + )|− p ⎠ 2

We claim for p large enough there is a uniform bound for the sum on the right-hand side of the above inequality. It is clear that if p ≥ 2 then the series γ |ωk (γk + k

is convergent. Also if for some γk we have ωk (γk + 21 ) > δ n −C , then because ' (1/τ 3 δ 1/τ 1 > ( 2n ) . Thus C = O( 2 −3ε ), for small enough we have ωk (γk + 21 ) 1−τ

1 −p 2 )|

{γk ∈Z≥0 ; ωk (γk + 12 )> δ

1−τ −C } n

1 2n 1 m −m |ωk (γk + )|− p ≤ ( )m/τ |ωk (γk + )| τ − p . 2 δ 2 γ k

So if we choose $p > \max\{\frac{m}{\tau}, 2\}$, then the sum on the right-hand side is convergent and therefore we have a uniform bound for the sum on the left-hand side and hence for $A(\hbar)$. This finishes the proof of (70).

Acknowledgements. I am sincerely grateful to Steve Zelditch for introducing the problem and many helpful discussions and suggestions on the subject. I would also like to thank him for his great support and encouragement as I was writing this article.

References

[BPU] Brummelhuis, R., Paul, T., Uribe, A.: Spectral estimates around a critical level. Duke Math. J. 78(3), 477–530 (1995)
[C] Colin De Verdière, Y.: A semi-classical inverse problem II: reconstruction of the potential. http://arXiv.org/abs/:0802.1643, 2008
[Ca] Camus, B.: A semi-classical trace formula at a non-degenerate critical level. (English Summary) J. Funct. Anal. 208(2), 446–481 (2004)
[CG1] Colin De Verdière, Y., Guillemin, V.: A semi-classical inverse problem I: Taylor expansions. http://arXiv.org/abs/:0802.1605, 2008
[Ch] Chazarain, J.: Spectre d’un Hamiltonien quantique et mécanique classique. Comm. PDE 5, 595–644 (1980)
[D] Duistermaat, J.J.: Oscillatory integrals, Lagrange immersions and unfolding of singularities. Comm. Pure Appl. Math. 27, 207–281 (1974)
[DSj] Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. London Mathematical Society Lecture Note Series, 268. Cambridge: Cambridge University Press, 1999
[EZ] Evans, L.C., Zworski, M.: Lectures on semiclassical analysis, Lecture notes, available at http://math.berkeley.edu/~zworski/semiclassical.pdf
[GU] Guillemin, V., Uribe, A.: Some inverse spectral results for semi-classical Schrödinger operators. Math. Res. Lett. 14(4), 623–632 (2007)
[GPU] Guillemin, V., Paul, T., Uribe, A.: “Bottom of the well” semi-classical trace invariants. Math. Res. Lett. 14(4), 711–719 (2007)
[ISjZ] Iantchenko, A., Sjöstrand, J., Zworski, M.: Birkhoff normal forms in semi-classical inverse problems. Math. Res. Lett. 9(2-3), 337–362 (2002)
[PU] Paul, T., Uribe, A.: The semi-classical trace formula and propagation of wave packets. J. Funct. Anal. 132(1), 192–249 (1995)
[R] Robert, D.: Autour de l’approximation semi-classique. (French) [On semiclassical approximation] Progress in Mathematics, 68. Boston, MA: Birkhäuser Boston, Inc., 1987
[Sj] Sjöstrand, J.: Semi-excited states in nondegenerate potential wells. Asymptotic Anal. 6(1), 29–43 (1992)
[Sh] Shubin, M.A.: Pseudodifferential operators and spectral theory. Translated from the 1978 Russian original by Stig I. Andersson. Second edition. Berlin: Springer-Verlag, 2001
[U] Uribe, A.: Trace formulae. First Summer School in Analysis and Mathematical Physics (Cuernavaca Morelos, 1998), Contemp. Math. 260. Providence, RI: Amer. Math. Soc., 2000, pp. 61–90
[Z] Zelditch, S.: Reconstruction of singularities for solutions of Schrödinger’s equation. Commun. Math. Phys. 90(1), 1–26 (1983)
[Z1] Zelditch, S.: The inverse spectral problem. With an appendix by J. Sjöstrand and M. Zworski. In: Surv. Differ. Geom. IX, Somerville, MA: Int. Press, 2004, pp. 401–467
[Z2] Zelditch, S.: Inverse spectral problem for analytic domains. I. Balian-Bloch trace formula. Commun. Math. Phys. 248(2), 357–407 (2004)
[Z3] Zelditch, S.: Inverse spectral problem for analytic plane domains II: Z2-symmetric domains. To appear in Ann. Math. http://arXiv.org/abs/math.SP/0111078., 2001; available at http://annals.math.princeton.edu/issues/2006/FinalFiles/Zelditdi.pdf
[Z4] Zelditch, S.: Spectral determination of analytic bi-axisymmetric plane domains. Geom. Funct. Anal. 10(3), 628–677 (2000)

Communicated by B. Simon




A Characterization of Dirac Morphisms

E. Loubeau¹, R. Slobodeanu²

¹ Département de Mathématiques, Université de Bretagne Occidentale, 6, Avenue Victor le Gorgeu, CS 93837, 29238 Brest Cedex 3, France. E-mail: [email protected]
² Faculty of Physics, Bucharest University, 405 Atomiștilor Str., CP Mg-11, RO - 077125 Bucharest, Romania. E-mail: [email protected]

Received: 30 May 2008 / Accepted: 6 October 2008
Published online: 24 January 2009 – © Springer-Verlag 2009

Abstract: Relating the Dirac operators on the total space and on the base manifold of a horizontally conformal submersion, we characterize Dirac morphisms, i.e. maps which pull back (local) harmonic spinor fields onto (local) harmonic spinor fields.

1. Introduction

Introduced by Jacobi [11] in 1848, harmonic morphisms are maps which pull back local harmonic functions onto harmonic functions and, more recently, they were characterized by Fuglede [7] and Ishihara [10] as horizontally weakly conformal harmonic maps. Their dual nature as analytical and geometrical objects has led to a rich theory (cf. [3]) which has encouraged the study of various other morphisms, that is, maps preserving germs of certain differential operators. The central role of the Dirac operator in differential geometry and mathematical physics called for this approach to be applied to harmonic spinors.

Unlike previous cases, the first hurdle is to make sense of a notion of pull-back of spinors by a map. This requires the identification of the spinor bundles involved, necessarily restricting our investigation to horizontally conformal maps between Riemannian manifolds and even-dimensional targets (cf. Sect. 2). Combining a chain rule for the Dirac operator and a local existence lemma, we show that a horizontally conformal submersion between spin manifolds is a Dirac morphism if and only if its horizontal distribution is integrable and the mean curvature of the fibres is related to the dilation factor, in a manner reminiscent of the fundamental equation for harmonic morphisms. We conclude with some simple examples between Euclidean spaces and make our results explicit in the set-up of [13], which initially inspired our construction.

The second author benefited from a one-year fellowship of the Conseil Général du Finistère.


2. Pull-Back of a Spinor ρ

Let (M m , g) be a spin Riemannian manifold, the two-sheeted covering Spin(m) −→ SO(m) induces a double cover χ : PSpin(m) M −→ PSO(m) M of the bundle of positively oriented orthonormal frames by the principal Spin(m)-bundle over M, such that χ (s · g) = χ (s) · ρ(g), ∀s ∈ PSpin(m) M, g ∈ Spin(m). The associated bundle Cl(M) = PSO(m) M ×clm Clm is the Clifford bundle, where Clm is the Clifford algebra and clm the representation of SO(m) into Aut(Cl(Rm )), and the spinor bundle is S M = PSpin(m) M ×γ Sm , with γ the spinorial representation of Spin(m) on the [m/2] (cf. [12]). Clifford module Sm = C2 A spinor field is a (smooth) section of S M, : U ⊂ M −→ S M, (x) = [sx , ψ(x)], where sx ∈ PSpin(m) M is a spinorial frame at x ∈ M and ψ : U −→ Sm , the equivalence class being defined by [s, ψ] = [s · g −1 , γ (g)ψ], for all g ∈ Spin(m). The covariant derivative is m l 1 jk ek · el · ψ , ∇e j = s, dψ(e j ) + 2 k 1 .

(2.2.1) (2.2.2)

Let V be a finite dimensional $\mathbb{C}$-linear space. For any operator $X \in \mathrm{End}(V^{\otimes 2})$ and for all integers $i > 0$, $j > 0$ denote
\[ X_i := I^{\otimes(i-1)} \otimes X \otimes I^{\otimes(j-1)} \in \mathrm{End}(V^{\otimes(i+j)}), \tag{2.2.3} \]
where $I \in \mathrm{Aut}(V)$ is the identity operator.³ We also use the notation $X_{ij}$ for an operator in $\mathrm{End}(V^{\otimes k})$, $1 \le i \ne j \le k$, acting as X in the component spaces V with labels i and j and as the identity in the rest. In this notation $X_{i\,i+1} \equiv X_i$. An operator $R \in \mathrm{Aut}(V \otimes V)$ satisfying the equality
\[ R_1 R_2 R_1 = R_2 R_1 R_2 \tag{2.2.4} \]

is called an R-matrix. Any R-matrix generates representations $\rho_R$ of the braid groups $B_k$, $k = 2, 3, \dots$,
\[ \rho_R : B_k \to \mathrm{Aut}(V^{\otimes k}), \qquad \rho_R(\sigma_i) = R_i, \quad 1 \le i \le k-1. \]
By a slight abuse of notation we assign the same symbol $\rho_R$ to the R-matrix representations of the braid groups $B_k$ for different values of the index k. This should not cause problems as the braid groups admit a series of monomorphisms commuting with $\rho_R$,
\[ B_k \hookrightarrow B_{k+1} : \quad \sigma_i \mapsto \sigma_i \quad \forall\, i = 1, \dots, k-1. \tag{2.2.5} \]

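Definition 2.1. An R-matrix R is called skew invertible if there exists an operator $\Psi_R \in \mathrm{End}(V^{\otimes 2})$ such that
\[ \mathrm{Tr}_{(2)} R_{12} \Psi_{R\,23} = \mathrm{Tr}_{(2)} \Psi_{R\,12} R_{23} = P_{13}. \tag{2.2.6} \]
Here $\mathrm{Tr}_{(i)}$ denotes the trace operation in the i-th space, and P denotes the permutation operator: $P(u \otimes v) = v \otimes u$ $\forall\, u, v \in V$. With any skew invertible R-matrix R we associate a pair of operators $D_R, C_R \in \mathrm{End}(V)$,
\[ D_{R\,1} = \mathrm{Tr}_{(2)} \Psi_{R\,12}, \qquad C_{R\,2} = \mathrm{Tr}_{(1)} \Psi_{R\,12}, \tag{2.2.7} \]
which, by (2.2.6), satisfy the equalities
\[ \mathrm{Tr}_{(2)} R_{12} D_{R\,2} = I_1, \qquad \mathrm{Tr}_{(1)} C_{R\,1} R_{12} = I_2. \tag{2.2.8} \]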
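Relations (2.2.6)–(2.2.8) can be checked by hand in the simplest case R = P. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, `psi` stands for the operator whose existence Definition 2.1 requires, and the choice psi = P is the candidate being tested here, not a statement taken from the text.

```python
def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def kron(A, B):
    """Kronecker product; the pair (i, k) is encoded as i * len(B) + k."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def eye(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def permutation(n):
    """P(e_u (x) e_v) = e_v (x) e_u on V (x) V."""
    P = [[0] * (n * n) for _ in range(n * n)]
    for u in range(n):
        for v in range(n):
            P[v * n + u][u * n + v] = 1
    return P


def trace_middle(M, n):
    """Partial trace over the second factor of an operator on V (x) V (x) V."""
    idx = range(n)
    return [[sum(M[(a2 * n + b) * n + c2][(a * n + b) * n + c] for b in idx)
             for a in idx for c in idx]
            for a2 in idx for c2 in idx]


def trace_second(M, n):
    """Partial trace over the second factor of an operator on V (x) V."""
    idx = range(n)
    return [[sum(M[a2 * n + b][a * n + b] for b in idx) for a in idx] for a2 in idx]


def trace_first(M, n):
    """Partial trace over the first factor of an operator on V (x) V."""
    idx = range(n)
    return [[sum(M[a * n + b2][a * n + b] for a in idx) for b in idx] for b2 in idx]


n = 3
P, I = permutation(n), eye(n)
psi = P  # candidate skew inverse of R = P (assumption under test)
lhs1 = trace_middle(matmul(kron(P, I), kron(I, psi)), n)  # Tr_(2) R_12 psi_23
lhs2 = trace_middle(matmul(kron(psi, I), kron(I, P)), n)  # Tr_(2) psi_12 R_23
D = trace_second(psi, n)                                   # (2.2.7)
C = trace_first(psi, n)
check_228 = trace_second(matmul(P, kron(I, D)), n)         # Tr_(2) R_12 D_2
print(lhs1 == P, lhs2 == P, D == I, C == I, check_228 == I)
```

In this sketch both partial traces reproduce the permutation on the first and third factors, and the associated operators D and C reduce to the identity, so (2.2.8) becomes the elementary fact that the partial trace of P is the identity.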

Further properties of the operators $D_R$ and $C_R$ are summarized below.

³ Strictly speaking a proper notation for the l.h.s. of (2.2.3) would be, say, $X_i^{(i+j)}$. We use the shortened notation $X_i$ since the dependence on j is not critical for our considerations. All formulas below make sense if the index j is large enough. The minimal possible value of j in each case is obvious from the context.


A. P. Isaev, P. Pyatov

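Proposition 2.2 [Is.04,O]. Let R be a skew-invertible R-matrix. The operators $D_R$ and $C_R$ (2.2.7) satisfy the equalities
\[ \begin{aligned}
D_{R\,1} I_2 &= \mathrm{Tr}_{(3)} D_{R\,3} R_2^{\pm 1} P_{12} R_2^{\mp 1}, &\qquad C_{R\,3} I_2 &= \mathrm{Tr}_{(1)} C_{R\,1} R_1^{\pm 1} P_{23} R_1^{\mp 1}, \\
R_{12} D_{R\,1} D_{R\,2} &= D_{R\,1} D_{R\,2} R_{12}, &\qquad R_{12} C_{R\,1} C_{R\,2} &= C_{R\,1} C_{R\,2} R_{12}.
\end{aligned} \tag{2.2.9} \]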

Let W be a $\mathbb{C}$-linear space. For any skew invertible R-matrix R we define an R-trace map⁴ $\mathrm{Tr}_R : \mathrm{End}_W(V) \to W$,
\[ Y \mapsto \mathrm{Tr}_R(Y) := \mathrm{Tr}(D_R Y), \qquad Y \in \mathrm{End}_W(V). \]
The following properties of the R-trace are simple consequences of the relations given in Proposition 2.2.

Corollary 2.3. Let R be a skew invertible R-matrix. For any operator $Y \in \mathrm{End}_W(V)$ the R-trace associated with R satisfies the relations
\[ \mathrm{Tr}_{R\,(2)}\big( R_{12}^{-\varepsilon}\, Y_1\, R_{12}^{\varepsilon} \big) = I_1\, \mathrm{Tr}_R(Y), \tag{2.2.10} \]

where ε = ±1 and the symbol TrR (i) denotes taking the R-trace in ith space. For an element x (k) ∈ C[Bk ] denote XR(k) := ρ R (x (k) ) ∈ End(V ⊗k ). The following cyclic property TrR (1, . . . , k) XR(k) Y (k) = TrR (1, . . . , k) Y (k) XR(k) is fulfilled for any k ≥ 1 and Y (k) ∈ End W (V ⊗k ), and for all x (k) ∈ C[Bk ]. Example 2.4. Permutation P: P(u ⊗ v) := v ⊗ u ∀ u, v ∈ V , is the skew invertible R-matrix. The Identity operator I ⊗2 is the R-matrix which is not skew invertible. Example 2.5. Assume that the quasi-triangular Hopf algebra AR admits a representation ρV : AR → End(V ). As follows from the Yang-Baxter equation (2.1.3) an operator R := η P (ρV ⊗ ρV )(R),

(2.2.11)

satisfies relation (2.2.4). Here the scaling factor η ∈ {C \ 0} is introduced for the sake of future convenience. The R-matrix (2.2.11) is skew invertible, its skew inverse matrix is given by formula (see, e.g., [O], Sect. 4.1.2) R = η−1 P (ρV ⊗ ρV )((id ⊗ S)R). The matrices DR and CR associated with the R-matrix (2.2.11) are: DR = η−1 ρV (u) ,

CR = η−1 ρV (S(u)) .

(2.2.12)

Both, they are invertible and their properties (2.2.9) are descending from (2.1.5). 4 This map is often called a quantum trace or, shortly, a q-trace. In our opinion, the name R-trace is more appropriate to it.

Spectral Extension of the Quantum Group Cotangent Bundle


2.3. Hecke algebras and Hecke type R-matrix. An A-type Hecke algebra $H_k(q)$ is a quotient algebra of the group algebra $\mathbb{C}[B_k]$ (2.2.1), (2.2.2) by the relations
\[ (\sigma_i - q1)(\sigma_i + q^{-1}1) = 0 \qquad \forall\, 1 \le i \le k-1. \]

Under the following conditions on the parameter q: [k] i q := (q i − q −i )/(q − q −1 ) = 0 ∀i = 2, 3, . . . , k,

(2.3.1)

the algebra $H_k(q)$ is isomorphic to the group algebra of the symmetric group $\mathbb{C}[S_k]$ and, hence, semisimple. Its irreducible representations as well as its central idempotents are labeled by the set of partitions $\lambda \vdash k$. We are particularly interested in a series of idempotents corresponding to the one dimensional representations $\lambda = (1^k)$, $k = 1, 2, \dots$. These idempotents, which we denote by $a^{(k)}$, admit a recursive construction (see, e.g., [HIOPT], Sect. 1, or [GPS.97], Sect. 2.3, or [TW], Lemma 7.2)
\[ a^{(1)} = 1, \qquad a^{(k)} = \frac{(k-1)_q}{k_q}\, a^{(k-1)} \Big( \frac{q^{k-1}}{(k-1)_q}\, 1 - \sigma_{k-1} \Big) a^{(k-1)} \tag{2.3.2} \]
\[ \phantom{a^{(1)} = 1, \qquad a^{(k)}} = \frac{(k-1)_q}{k_q}\, a^{(k-1)\uparrow 1} \Big( \frac{q^{k-1}}{(k-1)_q}\, 1 - \sigma_1 \Big) a^{(k-1)\uparrow 1} \qquad \forall\, k = 2, 3, \dots, \tag{2.3.3} \]
where we use the symbol $x^{(k)\uparrow 1} \in H_{k+1}(q)$ for the image of the element $x^{(k)} \in H_k(q)$ under the following algebra monomorphism (cf. (2.2.5)):
\[ H_k \hookrightarrow H_{k+1} : \quad \sigma_i \mapsto \sigma_{i+1} \quad \forall\, i = 1, \dots, k-1. \]
The idempotents $a^{(k)}$ obey the relations
\[ a^{(k)} \sigma_i = \sigma_i a^{(k)} = -q^{-1} a^{(k)} \qquad \forall\, i = 1, 2, \dots, k-1, \tag{2.3.4} \]
\[ a^{(k)} a^{(i)\uparrow j} = a^{(i)\uparrow j} a^{(k)} = a^{(k)}, \qquad \text{if } i + j \le k. \tag{2.3.5} \]

An R-matrix R satisfying a quadratic minimal characteristic identity is called a Hecke type R-matrix. By an appropriate rescaling of R one can always bring its characteristic identity to the form
\[ (R - qI)(R + q^{-1}I) = 0. \tag{2.3.6} \]
In this case the corresponding representations $\rho_R$ become representations of the Hecke algebras $H_k(q)$,
\[ \rho_R : H_k(q) \to \mathrm{Aut}(V^{\otimes k}), \qquad \rho_R(\sigma_i) = R_i, \quad 1 \le i \le k-1. \tag{2.3.7} \]
We reserve a special notation for the R-matrix images of the idempotents $a^{(k)}$:
\[ A^{(k)} := \rho_R(a^{(k)}), \qquad A^{(k)\uparrow 1} := \rho_R(a^{(k)\uparrow 1}) \qquad \forall\, k \ge 1. \tag{2.3.8} \]

We also put A(0) := 1. The elements A(k) will be further referred to as k-antisymmetrizers. Remark 2.6. The R-matrix analogues of relations (2.3.2)–(2.3.5) have been described in the literature (see [J.86,G]) even earlier than their algebraic prototypes.
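The recursion (2.3.2), (2.3.3) and the relations (2.3.4), (2.3.5) can be spot-checked in a concrete representation. The following is a minimal sketch, not part of the original text: it assumes the Drinfeld-Jimbo R-matrix quoted in (2.4.6) with dim V = 3, the sample value q = 3/2, and exact rational arithmetic. The helper names (`r_matrix`, `antisymmetrizer_step`, and so on) are hypothetical.

```python
from fractions import Fraction

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def unit(n, i, j):
    """Matrix unit E_ij (0-based indices)."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(n)] for r in range(n)]

def kron(a, b):
    """Kronecker product, i.e. the matrix of a (x) b."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def mul(a, b):
    """Matrix product; zero entries of a are skipped for speed."""
    return [[sum((x * b[k][j] for k, x in enumerate(row) if x), Fraction(0))
             for j in range(len(b[0]))] for row in a]

def lin(c1, a, c2, b):
    """Entrywise linear combination c1*a + c2*b."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def qnum(k, q):
    """q-number k_q = q^(k-1) + q^(k-3) + ... + q^(1-k)."""
    return sum(q ** (k - 1 - 2 * m) for m in range(k))

def r_matrix(n, q):
    """Assumed reading of (2.4.6): sum q^{delta_ij} E_ij (x) E_ji + (q - 1/q) sum_{i<j} E_ii (x) E_jj."""
    r = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            coeff = q if i == j else Fraction(1)
            r = lin(1, r, coeff, kron(unit(n, i, j), unit(n, j, i)))
            if i < j:
                r = lin(1, r, q - 1 / q, kron(unit(n, i, i), unit(n, j, j)))
    return r

def antisymmetrizer_step(a_prev, sigma, k, q, one):
    """One recursion step: ((k-1)_q / k_q) * a_prev * (q^(k-1)/(k-1)_q - sigma) * a_prev."""
    middle = lin(q ** (k - 1) / qnum(k - 1, q), one, -1, sigma)
    scale = qnum(k - 1, q) / qnum(k, q)
    return [[scale * x for x in row] for row in mul(mul(a_prev, middle), a_prev)]

q, n = Fraction(3, 2), 3
one = eye(n ** 3)
R = r_matrix(n, q)
R1, R2 = kron(R, eye(n)), kron(eye(n), R)            # images of sigma_1, sigma_2 on three tensor factors

A2 = antisymmetrizer_step(one, R1, 2, q, one)        # A^(2)
A2_up = antisymmetrizer_step(one, R2, 2, q, one)     # A^(2) shifted by one slot
A3 = antisymmetrizer_step(A2, R2, 3, q, one)         # recursion (2.3.2)
A3_alt = antisymmetrizer_step(A2_up, R1, 3, q, one)  # recursion (2.3.3)
```

Comparing `A3` with `A3_alt`, and multiplying `A3` by `R1`, `R2` and `A2_up`, gives a direct check of (2.3.2)–(2.3.5) in this representation.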



2.4. $GL_q(n)$ type R-matrix.

Definition 2.7. Consider a Hecke type R-matrix $R$. Assume that the parameter $q$ in its characteristic identity (2.3.6) satisfies the conditions [n] (2.3.1), so that the antisymmetrizers $A^{(2)}, \dots, A^{(n)}$ are well defined. $R$ is called a $GL_q(n)$ type R-matrix if the two conditions
$$A^{(n)} \Big( \frac{q^n}{n_q}\, I - R_n \Big) A^{(n)} = 0 \tag{2.4.1}$$
and
$$\mathrm{rk}\, A^{(n)} = 1 \tag{2.4.2}$$
are fulfilled.

Remark 2.8. Assuming $(n+1)_q \neq 0$, the condition (2.4.1) is equivalent to $A^{(n+1)} = 0$. For generic values of $q$, assuming validity of (2.4.1), the condition (2.4.2) is equivalent to demanding skew invertibility of $R$ (see [G], Props. 3.6 and 3.10).

Proposition 2.9 [G,Is.04]. Let $R$ be a skew invertible R-matrix of the type $GL_q(n)$. Then $C_R$ and $D_R$ are invertible and the following relations are fulfilled:
$$D_R C_R = C_R D_R = q^{-2n} I , \tag{2.4.3}$$
$$\mathrm{Tr}_{R(k)}\, A^{(k)} = q^{-n}\, \frac{(n+1-k)_q}{k_q}\, A^{(k-1)} \qquad \forall\; k = 1, 2, \dots, n , \tag{2.4.4}$$
$$A^{(n)} \prod_{i=1}^{n} (D_R)_i = \prod_{i=1}^{n} (D_R)_i\; A^{(n)} = q^{-n^2} A^{(n)} . \tag{2.4.5}$$

Example 2.10. Consider the case where $A_R$ is the quantized universal enveloping algebra $U_q sl(n)$. Let $V$ be the vector representation of $U_q sl(n)$, $\dim V = n$. In this case formula (2.2.11) with the scaling factor chosen as $\eta = q^{1/n}$ gives the standard Drinfeld-Jimbo R-matrix $R^\circ$ of the $GL_q(n)$ type (see [KSch], Sect. 8.4.2):
$$R^\circ = \sum_{i,j=1}^{n} q^{\delta_{ij}}\, E_{ij} \otimes E_{ji} + (q - q^{-1}) \sum_{i<j} E_{ii} \otimes E_{jj} . \tag{2.4.6}$$
Here $(E_{ij})_{kl} := \delta_{ik} \delta_{jl}$, $i, j = 1, \dots, n$, is the standard basis of $n \times n$ matrix units. Via the so-called twist procedure (for details see [R.90]) $R^\circ$ gives rise to a multiparametric family of $GL_q(n)$ type R-matrices,
$$R^f := F R^\circ F^{-1} = \sum_{i,j=1}^{n} q^{\delta_{ij}}\, \frac{f_{ij}}{f_{ji}}\, E_{ij} \otimes E_{ji} + (q - q^{-1}) \sum_{i<j} E_{ii} \otimes E_{jj} , \qquad \forall\; f_{ij} \in \mathbb{C} \setminus \{0\} . \tag{2.4.7}$$
Here $F := \sum_{i,j=1}^{n} f_{ij}\, E_{ii} \otimes E_{jj}$ is a twisting R-matrix. In what follows we use these particular R-matrices for illustration purposes. Their corresponding matrices $D_{R^\circ}$ and $D_{R^f}$ are
$$D_{R^\circ} = D_{R^f} = \sum_{i=1}^{n} q^{2(i-n)-1} E_{ii} .$$
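The defining properties quoted above can be checked directly for small n. The sketch below is an illustration under stated assumptions, not part of the original text: it takes n = dim V = 2 and q = 3/2, reads (2.4.6) and (2.4.7) as sums of Kronecker products of matrix units, and uses arbitrary sample twist parameters `f`. It also assumes that the R-trace is Tr_2(D_2 ·) with the diagonal matrix D given at the end of Example 2.10; all helper names are hypothetical.

```python
from fractions import Fraction

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def zeros(n):
    return [[Fraction(0)] * n for _ in range(n)]

def unit(n, i, j):
    return [[Fraction(int((r, c) == (i, j))) for c in range(n)] for r in range(n)]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def mul(a, b):
    return [[sum((x * b[k][j] for k, x in enumerate(row) if x), Fraction(0))
             for j in range(len(b[0]))] for row in a]

def lin(c1, a, c2, b):
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def r_matrix(n, q, f=None):
    """(2.4.6) when f is None, otherwise the twisted matrix (2.4.7)."""
    r = zeros(n * n)
    for i in range(n):
        for j in range(n):
            ratio = Fraction(1) if f is None else Fraction(f[i][j]) / f[j][i]
            coeff = (q if i == j else Fraction(1)) * ratio
            r = lin(1, r, coeff, kron(unit(n, i, j), unit(n, j, i)))
            if i < j:
                r = lin(1, r, q - 1 / q, kron(unit(n, i, i), unit(n, j, j)))
    return r

def r_trace_2(m, n, q):
    """Tr_2(D_2 M) for D = sum_i q^(2(i-n)-1) E_ii, i = 1..n (assumed form of the R-trace)."""
    d = [q ** (2 * (k + 1 - n) - 1) for k in range(n)]
    return [[sum(d[k] * m[i * n + k][j * n + k] for k in range(n)) for j in range(n)]
            for i in range(n)]

q, n = Fraction(3, 2), 2
two_q = q + 1 / q
f = [[1, 2], [5, 3]]                  # hypothetical nonzero twist parameters
I2, I4, I8 = eye(n), eye(n * n), eye(n ** 3)

results = {}
for name, R in (("standard", r_matrix(n, q)), ("twisted", r_matrix(n, q, f))):
    R1, R2 = kron(R, I2), kron(I2, R)
    A = lin(q / two_q, I4, -1 / two_q, R)           # A^(2) = (q - R) / 2_q
    A1 = kron(A, I2)
    results[name] = {
        "hecke": mul(lin(1, R, -q, I4), lin(1, R, 1 / q, I4)) == zeros(n * n),   # (2.3.6)
        "braid": mul(mul(R1, R2), R1) == mul(mul(R2, R1), R2),
        "rank_A": sum(A[i][i] for i in range(n * n)),       # trace of an idempotent = its rank
        "cond_241": mul(mul(A1, lin(q ** 2 / two_q, I8, -1, R2)), A1) == zeros(n ** 3),
        "r_trace": r_trace_2(R, n, q) == I2,                # Tr_{R(2)} R_12 = I_1
    }
```

Each entry of `results` records whether the Hecke condition, the braid relation, the rank one condition (2.4.2), the condition (2.4.1) and the R-trace normalization hold for the given matrix.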



Remark 2.11. Generally speaking, a $GL_q(n)$ type R-matrix can be realized in the tensor square of a space $V$ whose dimension is different from $n$. Examples of such R-matrices for any $\dim V \ge n$ are given in [G], Sect. 4. In what follows we do not assume any relation between the parameter $n$ in Definition 2.7 and the dimension of the space $V$, unless it is stated explicitly.

3. Quantized Functions on a Cotangent Bundle Over Matrix Group

In this section we recall the definition of a quantum group cotangent bundle and develop, in the linear cases $GL_q(n)$ and $SL_q(n)$, basic techniques for investigating its structure.

3.1. Quantized functions over matrix group (RTT algebra).

Definition 3.1 [D.86,FRT]. Let $R$ be a skew invertible R-matrix. An associative unital algebra generated by a set of matrix components $\|T^i_j\|_{i,j=1}^{\dim V}$ satisfying the relations
$$R_{12}\, T_1 T_2 = T_1 T_2\, R_{12} \tag{3.1.1}$$
is denoted by F[R] and called an RTT algebra. The RTT algebra is endowed in a standard way with the coproduct and the counit
$$\Delta(T^i_j) = \sum_k T^i_k \otimes T^k_j , \qquad \epsilon(T^i_j) = \delta^i_j . \tag{3.1.2}$$
Let us further extend the RTT algebra by a set of inverse matrix components $\|(T^{-1})^i_j\|_{i,j=1}^{\dim V}$:
$$\sum_k T^i_k\, (T^{-1})^k_j = \sum_k (T^{-1})^i_k\, T^k_j = \delta^i_j\, 1 . \tag{3.1.3}$$
The extended algebra can be endowed with the antipode mapping $S(T^i_j) = (T^{-1})^i_j$, so that (see [R.89]):
$$S^2(T)\, D_R = D_R\, T . \tag{3.1.4}$$

The resulting Hopf algebra is further denoted by FG[R].

Example 3.2. Consider the quasi-triangular Hopf algebra $A_R$ together with its representation $\rho_V$ (see Example 2.5). For any $x \in A_R$ denote by $\rho_V(x)^i_j$ the matrix of the operator $\rho_V(x)$ in a certain basis of the space $V$. Let $A_R^*$ be the dual Hopf algebra and let $\langle \cdot , \cdot \rangle$ denote a nondegenerate pairing between $A_R$ and $A_R^*$. Consider two matrices of linear functionals on $A_R$, $T^i_j$ and $(T^{-1})^i_j$, such that
$$\langle T^i_j , x \rangle = \rho_V(x)^i_j , \qquad \langle (T^{-1})^i_j , x \rangle = \rho_V(S(x))^i_j \qquad \forall\; x \in A_R . \tag{3.1.5}$$
It is easy to see that these functionals satisfy the conditions of Definition 3.1 (for details see, e.g., [B]); the numeric R-matrix $R$ in (3.1.1) is in this case given by (2.2.11), and relation (3.1.4) for the square of the antipode descends from (2.1.4). The functionals $T^i_j$ and $(T^{-1})^i_j$ generate a Hopf subalgebra in $A_R^*$.

In case $A_R$ is a universal enveloping algebra $Ug$ of some Lie algebra $g$, the dual Hopf algebra $(Ug)^*$ can be treated as Fun(G) ≡ FG, where $G$ is a formal group corresponding to $g$. Therefore, heuristically we can treat the RTT algebras FG[R] and F[R] as algebras of quantized functions over the matrix group and matrix semigroup, respectively. Here the term "matrix" refers to the matrix form of the coproduct (3.1.2); the term "quantized" means that the relations (3.1.1) in general define a noncommutative product.



In the rest of the subsection we describe a construction of the inverse matrix $T^{-1}$ for the RTT algebra associated with a $GL_q(n)$ type R-matrix. Consider the element
$$\mathrm{det}_R\, T := \mathrm{Tr}_{(1,\dots,n)} \big( A^{(n)}\, T_1 T_2 \dots T_n \big) . \tag{3.1.6}$$
By the definition of the coproduct (3.1.2) and due to the rank 1 condition (2.4.2), the element $\mathrm{det}_R\, T$ is group-like, $\Delta(\mathrm{det}_R\, T) = \mathrm{det}_R\, T \otimes \mathrm{det}_R\, T$, and it satisfies the relations
$$A^{(n)}\, T_1 T_2 \dots T_n = T_1 T_2 \dots T_n\, A^{(n)} = A^{(n)}\, \mathrm{det}_R\, T .$$
Therefore, it is natural to call $\mathrm{det}_R\, T$ the determinant of the matrix $T$.

Proposition 3.3 [G]. Let $R$ be a skew invertible $GL_q(n)$ type R-matrix. The following relation is satisfied in the corresponding RTT algebra F[R]:
$$(\mathrm{det}_R\, T)\; T = (O_R\, T\, O_R^{-1})\; \mathrm{det}_R\, T ,$$
where $O_R, O_R^{-1} \in \mathrm{Aut}(V)$ are mutually inverse matrices:
$$(O_R)_1 = n_q\, \mathrm{Tr}_{(2,\dots,n+1)} \big( P_1 P_2 \dots P_n\, A^{(n)} \big) , \qquad (O_R^{-1})_1 = n_q\, \mathrm{Tr}_{(2,\dots,n+1)} \big( A^{(n)}\, P_n \dots P_2 P_1 \big) \tag{3.1.7}$$
(recall that $P_i$ are the permutation operators acting in the component spaces $V_i \otimes V_{i+1}$).

Corollary 3.4. Under the assumptions of Proposition 3.3 consider an extension of the RTT algebra F[R] by an element $(\mathrm{det}_R\, T)^{-1}$ subject to the relations
$$(\mathrm{det}_R\, T)^{-1}\, T = (O_R^{-1}\, T\, O_R)\, (\mathrm{det}_R\, T)^{-1} , \qquad \mathrm{det}_R\, T\; (\mathrm{det}_R\, T)^{-1} = (\mathrm{det}_R\, T)^{-1}\, \mathrm{det}_R\, T = 1 .$$
In the extended algebra the inverse matrix $T^{-1}$ satisfying relations (3.1.3) is given by the formula
$$(T^{-1})_1 = q^{n(n-1)}\, n_q\, \mathrm{Tr}_{R(2,\dots,n)} \big( T_2 \dots T_n\, A^{(n)} \big)\, (\mathrm{det}_R\, T)^{-1} .$$
The resulting Hopf algebra is called a $GL_q(n)$ type RTT algebra and denoted by FGL_q(n)[R].

Assume additionally that for the R-matrix $R$ the corresponding matrix $O_R$ (3.1.7) is scalar: $O_R \propto I$. In this case $R$ is called an R-matrix of $SL_q(n)$ type. In the corresponding RTT algebra FGL_q(n)[R] the element $\mathrm{det}_R\, T$ is central. The quotient of this algebra by the relation $\mathrm{det}_R\, T = 1$ is called the $SL_q(n)$ type RTT algebra and denoted by FSL_q(n)[R].



Remark 3.5. For a skew invertible $GL_q(n)$ type R-matrix $R$ consider the system of equations
$$R_{12}\, N_1 N_2 = N_1 N_2\, R_{12} , \qquad N^n \propto O_R \qquad \text{for some } N \in \mathrm{Aut}(V) .$$
Note that a consistency condition for these equations, $R_{12}\, (O_R)_1 (O_R)_2 = (O_R)_1 (O_R)_2\, R_{12}$, is satisfied (see [OP.05]). From any solution $N$ of these equations one can construct the $SL_q(n)$ type R-matrix
$$\tilde R_{12} := N_1 R_{12} N_1^{-1} = N_2^{-1} R_{12} N_2 . \tag{3.1.8}$$

Example 3.6. For the R-matrices described in Example 2.10 one has
$$O_{R^\circ} = -I , \qquad O_{R^f} = -\sum_{i=1}^{n} \prod_{j \neq i} \frac{f_{ji}}{f_{ij}}\; E_{ii} .$$
So $R^\circ$ is of $SL_q(n)$ type, while $R^f$ is of $SL_q(n)$ type only if $\prod_{j \neq i} (f_{ji}/f_{ij}) = 1$ for all $i = 1, \dots, n$. Taking a diagonal $n$-th root $(O_{R^f})^{1/n}$ of the diagonal matrix $O_{R^f}$ one finds the $SL_q(n)$ type R-matrix associated with $R^f$: $\tilde R^f = R^{\tilde f}$, where $\tilde f_{ij} := \prod_{k \neq i,j} (f_{ij} f_{jk} f_{ki})^{1/n}$, so that $O_{R^{\tilde f}} = -I$.

3.2. Quantized right invariant vector fields (reflection equation algebra).

Definition 3.7 [KS]. Let $R$ be a skew invertible R-matrix. An associative unital algebra LG[R] generated by a set of matrix components $\|L^i_j\|_{i,j=1}^{\dim V}$ satisfying the relations
$$L_1 R_{12} L_1 R_{12} = R_{12} L_1 R_{12} L_1 \tag{3.2.1}$$
is called a reflection equation algebra or, for short, RE algebra. The RE algebra LG[R] is naturally endowed with the structure of a left coadjoint FG[R]-comodule algebra
$$\delta_\ell (L^i_j) = \sum_{k,m} T^i_k\, (T^{-1})^m_j \otimes L^k_m . \tag{3.2.2}$$

Example 3.8 [FRT]. In the notation of Examples 2.5, 3.2 consider the following $A_R$-valued matrices:
$$L^{(+)i}_{\;\;j} = \langle \mathrm{id} \otimes T^i_j ,\, \mathcal{R} \rangle , \qquad L^{(-)i}_{\;\;j} = \langle S(T^i_j) \otimes \mathrm{id} ,\, \mathcal{R} \rangle = \langle T^i_j \otimes \mathrm{id} ,\, \mathcal{R}^{-1} \rangle ,$$
$$((L^{(+)})^{-1})^i_j = \langle \mathrm{id} \otimes T^i_j ,\, \mathcal{R}^{-1} \rangle , \qquad ((L^{(-)})^{-1})^i_j = \langle T^i_j \otimes \mathrm{id} ,\, \mathcal{R} \rangle . \tag{3.2.3}$$
As a consequence of the Yang-Baxter equation (2.1.3), the components of these matrices satisfy the relations
$$R_{12}\, L^{(\pm)}_2 L^{(\pm)}_1 = L^{(\pm)}_2 L^{(\pm)}_1\, R_{12} , \qquad R_{12}\, L^{(+)}_2 L^{(-)}_1 = L^{(-)}_2 L^{(+)}_1\, R_{12} , \tag{3.2.4}$$
where $R$ is given by (2.2.11). By (2.1.2), the elements $((L^{(\pm)})^{\pm 1})^i_j$ generate a Hopf subalgebra of $A_R$:
$$\Delta(L^{(\pm)i}_{\;\;j}) = \sum_k L^{(\pm)i}_{\;\;k} \otimes L^{(\pm)k}_{\;\;j} , \qquad \epsilon(L^{(\pm)i}_{\;\;j}) = \delta^i_j , \qquad S(L^{(\pm)i}_{\;\;j}) = ((L^{(\pm)})^{-1})^i_j .$$



Consider a composite matrix $L$ with components
$$L^i_j := q^{(n - \frac{1}{n})} \sum_k ((L^{(-)})^{-1})^i_k\; L^{(+)k}_{\;\;j} = q^{(n - \frac{1}{n})}\, \langle \mathrm{id} \otimes T^i_j ,\, \mathcal{R}_{21} \mathcal{R}_{12} \rangle , \tag{3.2.5}$$
where our choice of the numeric factor $q^{n - \frac{1}{n}}$ is argued in Appendix A. By (3.2.4), the components of $L$ satisfy the reflection equation (3.2.1), where $R$ is given by (2.2.11). Note that the subalgebra of $A_R$ generated by the elements $L^i_j$ (3.2.5) does not carry a natural Hopf algebra structure. Instead, it carries the coadjoint comodule algebra structure (3.2.2) with respect to the Hopf subalgebra of $A_R^*$ generated by the components of the matrices $T$ and $T^{-1}$ (3.1.5).

Let us comment on a geometric interpretation of the RE algebra. In [FRT] the matrices $L^{(\pm)}$ were used to develop an RTT type description of the quantized universal enveloping algebra $U_q g$. Consider the case $g = sl(n)$ and let $V$ be its vector representation. The corresponding $GL_q(n)$ type R-matrix $R$ is given in Example 2.10. Making a linear change of generators $L^i_j \mapsto \ell^i_j$:
$$L^i_j = \delta^i_j + (q - q^{-1})\, \ell^i_j , \tag{3.2.6}$$
and using the Hecke condition (2.3.6), the reflection equation (3.2.1), for $q^2 \neq 1$, can be equivalently rewritten as
$$\ell_1 R_{12}\, \ell_1 R_{12} - R_{12}\, \ell_1 R_{12}\, \ell_1 = R_{12}\, \ell_1 - \ell_1 R_{12} . \tag{3.2.7}$$
In the "classical" limit $q \to 1$ the R-matrix (2.4.6) tends to the permutation and Eqs. (3.2.7) go over into the commutation relations for the basis of generators of the Lie algebra $gl(n)$,
$$[\ell_1 , \ell_2] = P_{12}\, (\ell_1 - \ell_2) . \tag{3.2.8}$$
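The matrix form (3.2.8) of the gl(n) relations can be tested in a concrete realization. The sketch below is only an illustration under stated assumptions: it takes n = 2 and represents the generator with upper index i and lower index j by the matrix unit E_ji acting on an auxiliary copy of C^n (this particular realization is an assumption made for the example, not taken from the text). It also checks that the matrix (2.4.6) reduces to the permutation at q = 1.

```python
from fractions import Fraction

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def zeros(n):
    return [[Fraction(0)] * n for _ in range(n)]

def unit(n, i, j):
    return [[Fraction(int((r, c) == (i, j))) for c in range(n)] for r in range(n)]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def mul(a, b):
    return [[sum((x * b[k][j] for k, x in enumerate(row) if x), Fraction(0))
             for j in range(len(b[0]))] for row in a]

def lin(c1, a, c2, b):
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def r_matrix(n, q):
    """Assumed reading of (2.4.6)."""
    r = zeros(n * n)
    for i in range(n):
        for j in range(n):
            r = lin(1, r, q if i == j else Fraction(1), kron(unit(n, i, j), unit(n, j, i)))
            if i < j:
                r = lin(1, r, q - 1 / q, kron(unit(n, i, i), unit(n, j, j)))
    return r

n = 2
I = eye(n)

# Permutation operator on V (x) V.
P = zeros(n * n)
for a in range(n):
    for b in range(n):
        P = lin(1, P, 1, kron(unit(n, a, b), unit(n, b, a)))

# l_1 and l_2 act on V (x) V (x) W; the generator (i, j) is realized as E_ji on W (assumption).
l1, l2 = zeros(n ** 3), zeros(n ** 3)
for i in range(n):
    for j in range(n):
        x = unit(n, j, i)
        l1 = lin(1, l1, 1, kron(kron(unit(n, i, j), I), x))
        l2 = lin(1, l2, 1, kron(kron(I, unit(n, i, j)), x))

commutator = lin(1, mul(l1, l2), -1, mul(l2, l1))      # [l_1, l_2]
rhs = mul(kron(P, I), lin(1, l1, -1, l2))              # P_12 (l_1 - l_2)
classical_limit_is_permutation = r_matrix(n, Fraction(1)) == P
```

In this realization `commutator` and `rhs` coincide, which is the content of (3.2.8) for the chosen representation.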

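Classically we can treat $\|\ell^i_j\|_{i,j=1}^{n}$ as a basis of right invariant vector fields on $GL(n)$. The transformation of these basic fields under the left translation by a group element $t \in GL(n)$ is given by the formula (cf. (3.2.2))
$$\delta_\ell(t): \quad \ell^i_j \mapsto \sum_{k,m=1}^{n} t^i_k\; \ell^k_m\; (t^{-1})^m_j , \qquad \text{where } t^i_j := \rho_V(t)^i_j .$$
Extrapolating this interpretation to the "quantum" case $q \neq 1$ we call $\|L^i_j\|_{i,j=1}^{n}$ a basis of quantized right invariant vector fields over the matrix group.

It is technically convenient to introduce the notation
$$L_{\overline{1}} := L_1 , \quad L_{\overline{k+1}} := R_k\, L_{\overline{k}}\, R_k^{-1} , \qquad L_{\underline{1}} := L_1 , \quad L_{\underline{k+1}} := R_k^{-1}\, L_{\underline{k}}\, R_k \qquad \forall\; k \ge 1 . \tag{3.2.9}$$
In terms of these R-copies $L_{\overline{k}}$, $L_{\underline{k}}$ of the matrix $L$ the reflection equation (3.2.1) can be equivalently written in any of the following forms:
$$R_k\, L_{\overline{k}} L_{\overline{k+1}} = L_{\overline{k}} L_{\overline{k+1}}\, R_k , \qquad R_k\, L_{\underline{k+1}} L_{\underline{k}} = L_{\underline{k+1}} L_{\underline{k}}\, R_k \qquad \forall\; k \ge 1 . \tag{3.2.10}$$
Taking into account the commutativity relations
$$R_i\, L_{\overline{k}} = L_{\overline{k}}\, R_i , \qquad R_i\, L_{\underline{k}} = L_{\underline{k}}\, R_i \qquad \forall\; i, k : \; k \neq i, i+1 , \tag{3.2.11}$$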


one sees that the R-copies $L_{\overline{k}}$ ($L_{\underline{k}}$) of the matrix $L$ in the RE algebra LG[R] formally satisfy the same relations as the usual copies $T_k$ ($T_k^{-1}$) of the matrix $T$ ($T^{-1}$) in the RTT algebra FG[R]. Matrix monomials in the two different series of R-copies satisfy the relations
$$L_{\overline{1}} L_{\overline{2}} \dots L_{\overline{k}} = L_{\underline{k}} \dots L_{\underline{2}} L_{\underline{1}} \qquad \forall\; k \ge 1 . \tag{3.2.12}$$
For $k = 2$ the equality (3.2.12) is identical to the reflection equation (3.2.1). For $k > 2$ it follows by induction on $k$. Note that the monomials (3.2.12) transform covariantly under the left coadjoint coaction (3.2.2),
$$\delta_\ell \big( L_{\overline{1}} \dots L_{\overline{k}} \big) = (T_1 \dots T_k \otimes 1)\, (1 \otimes L_{\overline{1}} \dots L_{\overline{k}})\, (S(T_1 \dots T_k) \otimes 1) . \tag{3.2.13}$$
The following proposition goes back to Theorem 14 of [FRT] (see also [Is.04], Prop. 5).

Proposition 3.9. Let $R$ be a skew invertible R-matrix. For an element $x^{(k)} \in \mathbb{C}[B_k]$ denote
$$\mathrm{ch}(x^{(k)}) := \mathrm{Tr}_{R(1 \dots k)} \big( X_R^{(k)}\, L_{\overline{1}} L_{\overline{2}} \dots L_{\overline{k}} \big) , \tag{3.2.14}$$
where $X_R^{(k)} := \rho_R(x^{(k)}) \in \mathrm{End}(V^{\otimes k})$. Consider the linear subspace Ch[R] ⊂ LG[R] spanned by the unity and by the elements $\mathrm{ch}(x^{(k)})$ $\forall\, k \ge 1$ and $\forall\, x^{(k)} \in \mathbb{C}[B_k]$. The space Ch[R] is a subalgebra of the center of the RE algebra LG[R]. It is called the characteristic subalgebra of the RE algebra LG[R]. The characteristic subalgebra is invariant with respect to the left FG[R] coadjoint coaction (3.2.2).

Proof. In the setting of quasi-triangular Hopf algebras these statements were proved in [D.89,R.89] (see Sect. 3 and Sect. 4 there, respectively). Below we prove the proposition in the RE algebra setting.

Consider an arbitrary element $\mathrm{ch}(x^{(k)})$ of the characteristic subalgebra. We first prove the following version of formula (3.2.14):
$$\mathrm{ch}(x^{(k)})\, I_1 = \mathrm{Tr}_{R(2,\dots,k+1)} \big( X_R^{(k)\uparrow 1}\, L_{\overline{2}} L_{\overline{3}} \dots L_{\overline{k+1}} \big) = \mathrm{Tr}_{R(2,\dots,k+1)} \big( X_R^{(k)\uparrow 1}\, L_{\underline{k+1}} \dots L_{\underline{3}} L_{\underline{2}} \big) . \tag{3.2.15}$$
Here the first equality results from the calculation
$$\begin{aligned} \mathrm{Tr}_{R(2,\dots,k+1)} \big( X_R^{(k)\uparrow 1} L_{\overline{2}} \dots L_{\overline{k+1}} \big) &= \mathrm{Tr}_{R(2,\dots,k+1)} \big( X_R^{(k)\uparrow 1}\, R_1 \cdots R_k\; L_{\overline{1}} \dots L_{\overline{k}}\; R_k^{-1} \cdots R_1^{-1} \big) \\ &= \mathrm{Tr}_{R(2,\dots,k+1)} \big( R_1 \cdots R_k\, (X_R^{(k)} L_{\overline{1}} \dots L_{\overline{k}})\, R_k^{-1} \cdots R_1^{-1} \big) \\ &= \dots = \mathrm{Tr}_{R(1,\dots,k)} \big( X_R^{(k)} L_{\overline{1}} L_{\overline{2}} \dots L_{\overline{k}} \big) , \end{aligned}$$
where in the last line we applied (2.2.10) $k$ times. To prove the second equality in (3.2.15) we first use the relation (3.2.12) and then perform similar transformations.



With the use of (3.2.15) and (3.2.12), checking the centrality of $\mathrm{ch}(x^{(k)})$ is straightforward:
$$L_1\, \mathrm{ch}(x^{(k)}) = \mathrm{Tr}_{R(2,\dots,k+1)} \big( X_R^{(k)\uparrow 1}\, L_{\overline{1}} L_{\overline{2}} L_{\overline{3}} \dots L_{\overline{k+1}} \big) = \mathrm{Tr}_{R(2,\dots,k+1)} \big( X_R^{(k)\uparrow 1}\, L_{\underline{k+1}} \dots L_{\underline{2}} L_{\underline{1}} \big) = \mathrm{ch}(x^{(k)})\, L_1 .$$
The invariance of $\mathrm{ch}(x^{(k)})$ under the left FG[R] coadjoint coaction follows immediately from (3.2.13) together with the relation (3.1.4) for the square of the antipode.

Consider a series of elements of the RE algebra LG[R],
$$p_i := \mathrm{Tr}_R (L^i) , \qquad i = 1, 2, \dots . \tag{3.2.16}$$
Further on they are called power sums. The following calculation
$$L_1\, p_i = \mathrm{Tr}_{R(2)} \big( L_1 R_{12}\, L_1^i\, R_{12}^{-1} \big) = \mathrm{Tr}_{R(2)} \big( R_{12}^{-1}\, L_1^i\, R_{12} L_1 \big) = p_i\, L_1$$
proves the centrality of the power sums. Here in the first and the last equalities we use formula (2.2.10), and the second equality is a consequence of (3.2.1). Actually, the power sums belong to the characteristic subalgebra Ch[R]: $p_i = \mathrm{ch}(\sigma_{i-1} \dots \sigma_2 \sigma_1)$, which is verified by the following transformation:
$$\begin{aligned} \mathrm{ch}(\sigma_{i-1} \dots \sigma_2 \sigma_1) &= \mathrm{Tr}_{R(1,\dots,i)} \big( L_{\overline{1}} \dots L_{\overline{i}}\, (R_{i-1} \dots R_1) \big) \\ &= \mathrm{Tr}_{R(1,\dots,i)} \big( L_{\overline{1}} \dots L_{\overline{i-1}}\, (R_{i-1} \dots R_1)\, L_1\, (R_1^{-1} \dots R_{i-1}^{-1})(R_{i-1} \dots R_1) \big) \\ &= \mathrm{Tr}_{R(1,\dots,i-1)} \big( L_{\overline{1}} \dots L_{\overline{i-1}}\; \mathrm{Tr}_{R(i)}(R_{i-1})\, (R_{i-2} \dots R_1)\, L_1 \big) \\ &= \mathrm{Tr}_{R(1,\dots,i-2)} \big( L_{\overline{1}} \dots L_{\overline{i-2}}\; \mathrm{Tr}_{R(i-1)}(R_{i-2})\, (R_{i-3} \dots R_1)\, L_1^2 \big) \\ &= \dots = \mathrm{Tr}_R (L^i) = p_i . \end{aligned}$$
Here we repeatedly expand the notation $L_{\overline{j}} = (R_{j-1} \dots R_1)\, L_1\, (R_1^{-1} \dots R_{j-1}^{-1})$ for $j = i, \dots, 2$, and use (2.2.8).

Let $R$ be a skew invertible R-matrix of the Hecke type. Assuming that the conditions [k] (2.3.1) are fulfilled, consider a series of elements $a_i \in \mathrm{Ch}[R]$, $i = 0, 1, \dots, k$, in the corresponding Hecke type RE algebra LG[R],
$$a_0 := 1 , \qquad a_i := \mathrm{ch}(a^{(i)}) = \mathrm{Tr}_{R(1,\dots,i)} \big( A^{(i)}\, L_{\overline{1}} \dots L_{\overline{i}} \big) \qquad \forall\; 1 \le i \le k , \tag{3.2.17}$$
where the notations $a^{(i)}$, $A^{(i)}$ were explained in (2.3.2), (2.3.8). The elements $a_i$ are called elementary symmetric functions.

Definition 3.10. Let $R$ be a skew invertible $GL_q(n)$ type R-matrix. A central extension of the corresponding RE algebra LG[R] by an element $a_n^{-1}$: $a_n\, a_n^{-1} = 1$ is called the $GL_q(n)$ type RE algebra and denoted by LGL_q(n)[R]. The quotient of this algebra by the relation
$$a_n = q^{-1}\, 1 \tag{3.2.18}$$
is called the $SL_q(n)$ type RE algebra and denoted by LSL_q(n)[R].


Remark 3.11. The actual value of the numeric factor in the right-hand side of (3.2.18) is not relevant for the definition. Our choice allows us to avoid numeric factors later in formula (4.1.1) (see the proof of Proposition 4.1). Consider the realization of the RE algebra LSL_q(n)[R] as a subalgebra in the quasi-triangular Hopf algebra $A_R$ (see Example 3.8). In this case the condition (3.2.18) is consistent with the pairing $\langle \cdot , \cdot \rangle$ of the dual Hopf algebras $A_R$ and $A_R^*$ only for the chosen normalizations (3.2.5) for $L$ and $\eta = q^{1/n}$ for $R$ (2.2.11). This point is explained in Appendix A, see (A.3).

Remark 3.12. The $GL_q(n)$ type R-matrix $R$ and its $SL_q(n)$ partner R-matrix $\tilde R$ (3.1.8) define identical RE algebras.

In the theorem below we describe the Cayley-Hamilton and Newton identities specific to the $GL_q(n)$ type and Hecke type RE algebras.

Theorem 3.13. Let $R$ be a skew invertible R-matrix of the Hecke type. Assume that the conditions [k] (2.3.1) are fulfilled. Then in the corresponding RE algebra LG[R] the following Cayley-Hamilton-Newton identities [IOP.98,IOP.99]
$$i_q\, \mathrm{Tr}_{R(2,\dots,i)} \big( A^{(i)}\, L_{\overline{2}} L_{\overline{3}} \dots L_{\overline{i}} \big) = (-1)^{i+1} \sum_{j=0}^{i-1} (-q)^j\, a_j\, L_1^{\,i-j-1} \qquad \forall\; 2 \le i \le k \tag{3.2.19}$$
take place. Multiplying by $L_1$ from the left and taking the R-trace $\mathrm{Tr}_{R(1)}$ of these identities one obtains the Newton relations for the set of power sums $\{p_i\}_{i \ge 1}$ and the set of elementary symmetric functions $\{a_i\}_{i \ge 0}$ [GPS.97],
$$i_q\, a_i + (-1)^i \sum_{j=0}^{i-1} (-q)^j\, a_j\, p_{i-j} = 0 \qquad \forall\; 1 \le i \le k . \tag{3.2.20}$$
Both sets $\{1, p_j\}_{j \ge 1}$ and $\{a_j\}_{j \ge 0}$ in this case generate the characteristic subalgebra Ch[R].

Assume additionally that $R$ is an R-matrix of the $GL_q(n)$ type. Then the finite set $\{a_i\}_{i=0}^{n}$ generates the characteristic subalgebra of the RE algebra LGL_q(n)[R], and the following Cayley-Hamilton identity is fulfilled [GPS.97]:
$$\sum_{i=0}^{n} (-q)^i\, a_i\, L^{n-i} = 0 . \tag{3.2.21}$$
This identity leads, in particular, to the invertibility of the matrix $L$:
$$L^{-1} = q^{-1}\, a_n^{-1} \sum_{i=0}^{n-1} (-q)^{-i}\, a_{n-i-1}\, L^i .$$
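The Newton relations (3.2.20) determine the power sums recursively from the elementary symmetric functions, and the Cayley-Hamilton identity (3.2.21) fixes the inverse of L. The sketch below illustrates both in a commutative toy model only: the a_i are replaced by the ordinary elementary symmetric polynomials of sample numbers mu, and L by the numbers q*mu (the spectral values suggested by the factors (L − q mu I) used later in the text). These substitutions, the sample values, and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def qnum(k, q):
    """q-number k_q = q^(k-1) + q^(k-3) + ... + q^(1-k); equals k at q = 1."""
    return sum(q ** (k - 1 - 2 * m) for m in range(k))

def elementary(mu, i):
    """e_i(mu); e_0 = 1 because the empty product is 1."""
    return sum((prod(c) for c in combinations(mu, i)), Fraction(0))

def power_sums(a, q):
    """Solve (3.2.20) for p_1, ..., p_k given a_0 = 1, a_1, ..., a_k."""
    p = {}
    for i in range(1, len(a)):
        # i_q a_i + (-1)^i [ p_i + sum_{j=1}^{i-1} (-q)^j a_j p_{i-j} ] = 0
        known = sum(((-q) ** j * a[j] * p[i - j] for j in range(1, i)), Fraction(0))
        p[i] = -((-1) ** i) * qnum(i, q) * a[i] - known
    return p

mu = [Fraction(2), Fraction(3), Fraction(5)]      # hypothetical sample values
n = len(mu)
a = [elementary(mu, i) for i in range(n + 1)]

p_classical = power_sums(a, Fraction(1))          # q = 1: ordinary Newton identities
q = Fraction(3, 2)
p_q = power_sums(a, q)                            # q-deformed power sums

def char_poly(lam):
    """Left-hand side of (3.2.21) with L replaced by the number lam."""
    return sum((-q) ** i * a[i] * lam ** (n - i) for i in range(n + 1))

def inverse_formula(lam):
    """Right-hand side of the formula for L^{-1} with L replaced by lam."""
    return sum((-q) ** (-i) * a[n - i - 1] * lam ** i for i in range(n)) / (q * a[n])
```

At q = 1 the recursion reproduces the usual power sums of mu, and for generic q the first two solutions are p_1 = a_1 and p_2 = q a_1^2 − 2_q a_2; the scalar inverse formula returns 1/(q mu) at each root q mu of the characteristic polynomial.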

Remark 3.14. One can introduce generating functions $a(x)$, $p(x)$ for the elementary symmetric functions and for the power sums,
$$a(x) := \sum_{i \ge 0} a_i\, x^i , \qquad p(x) := \sum_{i \ge 1} p_i\, x^i .$$
The Newton relations (3.2.20) can then be written as a finite difference equation for the generating functions:
$$a(qx)\, p(-x) = \frac{a(q^{-1}x) - a(qx)}{q - q^{-1}} .$$

For the $GL_q(n)$ type RE algebra we now construct its central extension by the roots of the characteristic polynomial (3.2.21).

Definition 3.15. Denote by $S_n$ the $\mathbb{C}$-algebra of polynomials in $n$ pairwise commuting invertible indeterminates $\mu_\alpha^{\pm 1}$ and their differences $(\mu_\alpha - \mu_\beta)^{\pm 1}$, $\alpha, \beta = 1, \dots, n$, $\alpha \neq \beta$. Let $R$ be a skew invertible R-matrix of the $GL_q(n)$ type, LGL_q(n)[R] be the corresponding RE algebra, and Ch[R] be its characteristic subalgebra. Consider a monomorphism Ch[R] → $S_n$ defined on generators as$^5$
$$a_i \mapsto e_i(\mu_1, \dots, \mu_n) := \sum_{1 \le j_1 < \dots < j_i \le n} \mu_{j_1} \mu_{j_2} \dots \mu_{j_i} \qquad \forall\; i = 0, 1, \dots, n , \tag{3.2.22}$$

(3.3.12)

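in the second line. By relabelling the subscript indices of the R-traces we then recast (3.3.11) in the following form$^7$
$$I_1\, a_i = q^{i(i-1)}\; \mathrm{Tr}_{R(2,\dots,i+1)} \Big( (L_{\overline{1}} J_1) \dots (L_{\overline{i}} J_i)\, A^{(i)} \Big)^{\uparrow 1} . \tag{3.3.13}$$

7 Notice the similarity of formula (3.3.13) to relation (2.2.10). The role of the R-matrices $R^{\pm\varepsilon}$ is now played by the permutation matrix $P$ (see (3.3.7)).

Now we are ready to permute $T_1$ and $a_i$. Substituting the expression (3.3.13) for $a_i$ and using relations (3.3.10) and (3.3.12) we calculate
$$\begin{aligned} \gamma^{2i}\, T_1\, a_i = \gamma^{2i}\, T_1\, (I_1 a_i) &= q^{i(i-1)}\, \gamma^{2i}\; \mathrm{Tr}_{R(2,\dots,i+1)} \Big( T_1\, \big( (L_{\overline{1}} J_1) \dots (L_{\overline{i}} J_i)\, A^{(i)} \big)^{\uparrow 1} \Big) \\ &= q^{i(i-1)}\; \mathrm{Tr}_{R(2,\dots,i+1)} \Big( (L_{\overline{2}} J_2) \dots (L_{\overline{i+1}} J_{i+1})\, A^{(i)\uparrow 1} \Big)\, T_1 \\ &= q^{i(i-1)}\; \mathrm{Tr}_{R(2,\dots,i+1)} \Big( (L_{\overline{2}} \dots L_{\overline{i+1}})\, Z_{i+1}\, A^{(i)\uparrow 1} \Big)\, T_1 . \end{aligned}$$
To continue the calculation we need the following formula:
$$Z_{i+1}\, A^{(i)\uparrow 1} = A^{(i)\uparrow 1}\, Z_{i+1}\, A^{(i)\uparrow 1} = q^{-i(i-1)} \Big( q^2 A^{(i)\uparrow 1} - q^{-i} (q^2 - 1)\, (i+1)_q\, A^{(i+1)} \Big) ,$$
which follows by a combination of the definitions (2.3.3), (2.3.8), (3.3.8), and relations (2.3.4), (2.3.5), (2.3.6). So we finish the calculation
$$\begin{aligned} \gamma^{2i}\, T_1\, a_i &= \mathrm{Tr}_{R(2,\dots,i+1)} \Big( (L_{\overline{2}} \dots L_{\overline{i+1}}) \big( q^2 A^{(i)\uparrow 1} - q^{-i} (q^2 - 1)\, (i+1)_q\, A^{(i+1)} \big) \Big)\, T_1 \\ &= q^2\, a_i\, T_1 - (-q)^{-i} (q^2 - 1) \sum_{j=0}^{i} (-q)^{j}\, a_j\, (L^{i-j}\, T)_1 \\ &= a_i\, T_1 - (q^2 - 1) \sum_{j=1}^{i} (-q)^{-j}\, a_{i-j}\, (L^j\, T)_1 . \end{aligned} \tag{3.3.14}$$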
Here we calculate the first summand in the second line taking into account the equality
$$(L_{\overline{2}} \dots L_{\overline{i+1}})\, A^{(i)\uparrow 1} = (R_1 \dots R_i)\, (L_{\overline{1}} \dots L_{\overline{i}})\, A^{(i)}\, (R_1 \dots R_i)^{-1} ,$$
and using formula (2.2.10) $i$ times. For the calculation of the second summand we use the Cayley-Hamilton-Newton identity (3.2.19). Thus (3.3.6) is proved.

Remark 3.23. For the set of power sums (3.2.16) the permutation relations with $T^i_j$ in the Hecke case read
$$\gamma^{2i}\, T\, p_i = p_i\, T + (q - q^{-1})^2 \sum_{j=1}^{i-1} \frac{(2j)_q}{2_q}\; p_{i-j}\, (L^j T) + (q - q^{-1})\, \frac{(2i)_q}{2_q}\, (L^i T) .$$
One can derive this formula by applying the R-trace $\mathrm{Tr}_{R(2)}$ to the equality $\gamma^{2i}\, T_1 (L_2)^i = (R L_1 R)^i\, T_1$ and taking into account the relations
$$(R L_1 R)^i = R\, (L_1)^i R + (q - q^{-1}) \sum_{j=1}^{i-1} R^{2j}\, (L_1)^{i-j}\, R\, (L_1)^j , \qquad R^{2j} = 2_q^{-1} \Big( (q^{2j-1} + q^{-2j+1})\, I + (q^{2j} - q^{-2j})\, R \Big) .$$
These relations, in turn, follow inductively from the Hecke condition (2.3.6) and the reflection equation (3.2.1). Note that in this case there is no need to impose the restrictions (2.3.1) on $q$.

Proposition 3.24. Let $R$ be a skew invertible $GL_q(n)$ type R-matrix. An extension of the corresponding HD algebra DG[R, γ] by the elements $(\mathrm{det}_R\, T)^{-1}$ and $(a_n)^{-1}$, satisfying the relations
$$\gamma^{2n}\, L\, (\mathrm{det}_R\, T)^{-1} = q^2\, (\mathrm{det}_R\, T)^{-1}\, (O_R\, L\, O_R^{-1}) , \tag{3.3.15}$$
$$\gamma^{2n}\, (a_n)^{-1}\, T = q^2\, T\, (a_n)^{-1} , \tag{3.3.16}$$
in addition to those given in Corollary 3.4 and Definition 3.10, is called the $GL_q(n)$ type HD algebra and denoted by DGL_q(n)[R, γ].

Let $R$ be a skew invertible $SL_q(n)$ type R-matrix. In the corresponding HD algebra DG[R, γ] let us restrict the parameters by the condition $\gamma^n = q$ and take the quotient by the relations $\mathrm{det}_R\, T = 1$ and $a_n = q^{-1}\, 1$. The quotient algebra is called the $SL_q(n)$ type HD algebra and denoted by DSL_q(n)[R].
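The closed formula for the even powers of a Hecke type R-matrix quoted in Remark 3.23 depends only on the Hecke condition (2.3.6), so it can be spot-checked on any Hecke type R-matrix. The sketch below is an illustration under assumptions: it uses the matrix (2.4.6) with dim V = 2 and the sample value q = 3/2; the helper names are hypothetical.

```python
from fractions import Fraction

def eye(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def unit(n, i, j):
    return [[Fraction(int((r, c) == (i, j))) for c in range(n)] for r in range(n)]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def mul(a, b):
    return [[sum((x * b[k][j] for k, x in enumerate(row) if x), Fraction(0))
             for j in range(len(b[0]))] for row in a]

def lin(c1, a, c2, b):
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def r_matrix(n, q):
    """Assumed reading of (2.4.6)."""
    r = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            r = lin(1, r, q if i == j else Fraction(1), kron(unit(n, i, j), unit(n, j, i)))
            if i < j:
                r = lin(1, r, q - 1 / q, kron(unit(n, i, i), unit(n, j, j)))
    return r

q = Fraction(3, 2)
two_q = q + 1 / q
R, I4 = r_matrix(2, q), eye(4)

checks = []
power = I4
for j in range(1, 4):
    power = mul(mul(power, R), R)                      # R^(2j) by direct multiplication
    closed = lin((q ** (2 * j - 1) + q ** (1 - 2 * j)) / two_q, I4,
                 (q ** (2 * j) - q ** (-2 * j)) / two_q, R)
    checks.append(power == closed)
```

The list `checks` records, for j = 1, 2, 3, whether the directly computed power agrees with the closed formula.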



Remark 3.25. Notice the consistency of the $SL_q(n)$ reduction condition $\gamma^n = q$ with the parameter restrictions $\eta = q^{1/n}$ in Example 2.10 and $\gamma = \eta$ in Example 3.20.

Proof. Relations (3.3.15) and (3.3.16) should be consistent with the permutation relations for $\mathrm{det}_R\, T$ and $a_n$ in the algebra DG[R, γ]. The permutation relations for $a_n$ with $T$ were in fact derived in the first line of the calculation (3.3.14) (put $i = n$ and take into account that $A^{(n+1)} = 0$ in the $GL_q(n)$ case). The permutation relation for $\mathrm{det}_R\, T$ with $L$ can be derived by the same method as for $\mathrm{det}_R\, T$ with $T$ (see [G], Sect. 5, or [Is.04], calculation (3.5.39)). Given these results the consistency is obvious. In the $SL_q(n)$ case ($O_R \propto I$, $\gamma^n = q$) the elements $\mathrm{det}_R\, T$ and $a_n$ are central. Hence, DSL_q(n)[R] is consistently defined.

Corollary 3.26. In the $GL_q(n)$ type HD algebra the elements of the characteristic subalgebra satisfy the following commutation relations with $\mathrm{det}_R\, T$:
$$\gamma^{2nk}\; \mathrm{det}_R\, T\;\, \mathrm{ch}(x^{(k)}) = q^{2k}\; \mathrm{ch}(x^{(k)})\;\, \mathrm{det}_R\, T \qquad \forall\; x^{(k)} \in H_k(q) , \quad k = 1, 2, \dots .$$

Proof. The proof is a direct calculation of the permutation of $\mathrm{ch}(x^{(k)})$ (3.2.14) and $\mathrm{det}_R\, T$, exploiting relations (3.3.15) and the properties of the matrix $O_R$ (3.1.7),
$$R_{12}\, (O_R)_1 (O_R)_2 = (O_R)_1 (O_R)_2\, R_{12} , \qquad O_R\, D_R = D_R\, O_R .$$
The latter relations are proved in [OP.05], Sect. 5.3.

Theorem 3.27. Let $R$ be a skew invertible $GL_q(n)$ ($SL_q(n)$) type R-matrix. An extension of the corresponding HD algebra DGL_q(n)[R, γ] (DSL_q(n)[R]) by the algebra $S_n$ of polynomials in mutually commuting indeterminates $\mu_\alpha^{\pm 1}$, $(\mu_\alpha - \mu_\beta)^{\pm 1}$ satisfying relations (3.2.23) together with
$$\gamma^2\, (P^\beta T)\, \mu_\alpha = q^{2\delta_{\alpha\beta}}\, \mu_\alpha\, (P^\beta T) \qquad \forall\; \alpha, \beta = 1, \dots, n , \tag{3.3.17}$$
or, equivalently,
$$\gamma^2\, T\, \mu_\alpha = \mu_\alpha\, T + (q^2 - 1)\, \mu_\alpha\, (P^\alpha T) ,$$
is called a (semisimple) spectral completion of the $GL_q(n)$ ($SL_q(n)$) type HD algebra and denoted as DGL_q(n)[R, γ] (DSL_q(n)[R]).

Remark 3.28. To avoid problems with permutations of $(\mu_\alpha - \mu_\beta)^{-1}$ with $P^\sigma T$ one could assume the invertibility of all elements $(\mu_\alpha - q^{2k} \mu_\beta)$ $\forall\, \alpha \neq \beta$, $k \in \mathbb{Z}$. Further on we will not make such permutations and so we do not impose the corresponding restrictions.

Remark 3.29. Assuming that the spectral variables $\mu_\alpha$ are invariants of both the left and right coactions, the algebra LGL_q(n)[R, γ] (LSL_q(n)[R]) inherits the structures of left and right FGL_q(n)[R]- (FSL_q(n)[R]-) comodule algebra (see Definition 3.19).

Remark 3.30. Note that relation (3.3.17) is typical for Weyl algebra generators. In fact there are many ways to combine from the elements $(P^\beta T)_{ij}$ a set of $n$ generators satisfying Weyl relations with the spectral variables $\mu_\alpha$. One such possibility is used later in Sect. 4.4.



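Proof. We have to check the consistency of relations (3.3.17), (3.3.6) with the conditions $a_i = e_i(\mu_1, \dots, \mu_n) \equiv e_i(\mu)$ for $1 \le i \le n$. Denote $e_i(\mu^{/\alpha}) := e_i(\mu)|_{\mu_\alpha = 0}$. We have
$$e_i(\mu) = e_i(\mu^{/\alpha}) + \mu_\alpha\, e_{i-1}(\mu^{/\alpha}) \quad \Rightarrow \quad e_i(\mu^{/\alpha}) = \sum_{j=0}^{i} (-\mu_\alpha)^j\, e_{i-j}(\mu) . \tag{3.3.18}$$
Using relations (3.3.17), (3.3.18), (3.2.25) and (3.2.26) we calculate
$$\begin{aligned} \gamma^{2i}\, T\, e_i(\mu) &= \gamma^{2i} \sum_{\alpha=1}^{n} (P^\alpha T) \Big( e_i(\mu^{/\alpha}) + \mu_\alpha\, e_{i-1}(\mu^{/\alpha}) \Big) \\ &= \sum_{\alpha=1}^{n} \Big( e_i(\mu^{/\alpha}) + q^2 \mu_\alpha\, e_{i-1}(\mu^{/\alpha}) \Big) (P^\alpha T) \\ &= \sum_{\alpha=1}^{n} \Big( e_i(\mu) + (q^2 - 1)\, \mu_\alpha \sum_{j=0}^{i-1} (-\mu_\alpha)^j\, e_{i-j-1}(\mu) \Big) (P^\alpha T) \\ &= \sum_{\alpha=1}^{n} \Big( e_i(\mu) - (q^2 - 1) \sum_{j=1}^{i} (-L/q)^j\, e_{i-j}(\mu) \Big) (P^\alpha T) \\ &= e_i(\mu)\, T - (q^2 - 1) \sum_{j=1}^{i} (-q)^{-j}\, e_{i-j}(\mu)\, (L^j T) , \end{aligned}$$
which coincides with (3.3.6) under the identification $e_i(\mu) = a_i$.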
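The identities for elementary symmetric polynomials used in this proof are easy to test numerically. The following sketch is an illustration only: the sample values of the spectral variables and of q are arbitrary, and the function name `elementary` is hypothetical. It checks both parts of (3.3.18) and the passage from the second to the third line of the calculation above.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary(mu, i):
    """Elementary symmetric polynomial e_i(mu); e_0 = 1 and e_i = 0 for i < 0."""
    if i < 0:
        return 0
    return sum(prod(c) for c in combinations(mu, i))

mu = [2, 3, 5, 7]            # hypothetical sample values of the spectral variables
q = Fraction(3, 2)           # hypothetical sample value of q
n = len(mu)

split_ok, inversion_ok, step_ok = [], [], []
for alpha, m in enumerate(mu):
    rest = mu[:alpha] + mu[alpha + 1:]          # mu with mu_alpha removed (mu_alpha set to 0)
    for i in range(n + 1):
        # e_i(mu) = e_i(rest) + mu_alpha e_{i-1}(rest)
        split_ok.append(elementary(mu, i) == elementary(rest, i) + m * elementary(rest, i - 1))
        # e_i(rest) = sum_{j=0}^{i} (-mu_alpha)^j e_{i-j}(mu)
        inversion_ok.append(
            elementary(rest, i) == sum((-m) ** j * elementary(mu, i - j) for j in range(i + 1)))
        # e_i(rest) + q^2 mu_alpha e_{i-1}(rest)
        #   = e_i(mu) + (q^2 - 1) mu_alpha sum_{j=0}^{i-1} (-mu_alpha)^j e_{i-j-1}(mu)
        lhs = elementary(rest, i) + q ** 2 * m * elementary(rest, i - 1)
        rhs = elementary(mu, i) + (q ** 2 - 1) * m * sum(
            (-m) ** j * elementary(mu, i - j - 1) for j in range(i))
        step_ok.append(lhs == rhs)
```

All three lists contain only `True` for these sample values; for i = n the second identity reduces to the vanishing of the product of (mu_beta − mu_alpha) over all beta.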

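Corollary 3.31. In the completed $GL_q(n)$ type HD algebra DGL_q(n)[R, γ] the following permutation relations hold:
$$\gamma^{2n}\; \mathrm{det}_R\, T\;\, \mu_\alpha = q^2\, \mu_\alpha\;\, \mathrm{det}_R\, T \qquad \forall\; \alpha = 1, 2, \dots, n . \tag{3.3.19}$$

Proof. Using formulas (3.1.6), (3.2.25), (3.3.17) we can permute $\mathrm{det}_R\, T$ and $\mu_\alpha$:
$$\begin{aligned} \gamma^{2n}\; \mathrm{det}_R\, T\;\, \mu_\alpha &= \gamma^{2n} \sum_{\beta_1, \dots, \beta_n = 1}^{n} \mathrm{Tr}_{(1,\dots,n)} \Big( A^{(n)}\, (P^{\beta_1} T)_1 \dots (P^{\beta_n} T)_n \Big)\, \mu_\alpha \\ &= \mu_\alpha \sum_{\beta_1, \dots, \beta_n = 1}^{n} q^{\,2 \sum_{j=1}^{n} \delta_{\alpha \beta_j}}\; \mathrm{Tr}_{(1,\dots,n)} \Big( A^{(n)}\, (P^{\beta_1} T)_1 \dots (P^{\beta_n} T)_n \Big) . \end{aligned} \tag{3.3.20}$$
Assuming that
$$\mathrm{Tr}_{(1,\dots,n)} \Big( A^{(n)}\, (P^{\beta_1} T)_1 \dots (P^{\beta_n} T)_n \Big) = 0 , \qquad \text{if there exists a pair } i \neq j : \; \beta_i = \beta_j , \tag{3.3.21}$$
we conclude that for any nonzero summand in (3.3.20) the coefficient $q^{\,2 \sum_{j=1}^{n} \delta_{\alpha \beta_j}}$ equals $q^2$, and therefore we can complete the calculation
$$\gamma^{2n}\; \mathrm{det}_R\, T\;\, \mu_\alpha = q^2\, \mu_\alpha \sum_{\beta_1, \dots, \beta_n = 1}^{n} \mathrm{Tr}_{(1,\dots,n)} \Big( A^{(n)}\, (P^{\beta_1} T)_1 \dots (P^{\beta_n} T)_n \Big) = q^2\, \mu_\alpha\;\, \mathrm{det}_R\, T .$$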


It remains to prove the assumption. First, we note that the conditions on $\beta_i$ in (3.3.21) state that there exists an integer $\sigma$: $1 \le \sigma \le n$, with $\sigma \neq \beta_i$ $\forall\, i$. Therefore, any projector $P^{\beta_i}$ in (3.3.21) contains the factor $(L - q \mu_\sigma I)$. Using relations (3.3.10), (3.3.17) we can move all such factors to the left side of the expression. Thus we obtain
$$\text{left-hand side of (3.3.21)} \;\propto\; \mathrm{Tr}_{(1,\dots,n)} \Big( A^{(n)} \Big\{ \prod_{j=1}^{n} \big( L_{\overline{j}} J_j - q \mu_\sigma I \big) \Big\} \dots \Big) . \tag{3.3.22}$$
Next, we note that the expression in braces is a symmetric function in a commuting set of matrices $L_{\overline{j}} J_j$ (see (3.3.9)) which, by the relations
$$R_i\, (L_{\overline{i}} J_i)(L_{\overline{i+1}} J_{i+1}) = (L_{\overline{i}} J_i)(L_{\overline{i+1}} J_{i+1})\, R_i , \qquad R_i\, (L_{\overline{i}} J_i + L_{\overline{i+1}} J_{i+1}) = (L_{\overline{i}} J_i + L_{\overline{i+1}} J_{i+1})\, R_i ,$$
and by (3.2.11) together with the same formulas for $J_k$, commutes with $R_i$, $i = 1, \dots, n-1$, and so with $A^{(n)}$. Hence, using the relations $A^{(n)} = (A^{(n)})^2$ and $\mathrm{rk}\, A^{(n)} = 1$ we can separate a left factor
$$\kappa := \mathrm{Tr}_{(1,\dots,n)} \Big( A^{(n)} \prod_{j=1}^{n} \big( L_{\overline{j}} J_j - q \mu_\sigma I \big) \Big)$$
in (3.3.22). This factor we now calculate explicitly. Taking into account relations (2.4.5), (3.3.9), (3.3.12) and $A^{(n)} J_i = q^{-2(i-1)} A^{(n)}$ we transform the expression for $\kappa$:
$$\kappa = q^n\; \mathrm{Tr}_{R(1,\dots,n)} \Big( A^{(n)} \prod_{j=1}^{n} \big( L_{\overline{j}} - q^{2j-1} \mu_\sigma I \big) \Big) . \tag{3.3.23}$$
Expanding this expression in powers of $L$ and noticing that (2.4.4) implies
$$\mathrm{Tr}_{R(k+1,\dots,n)}\, A^{(n)} = q^{n(k-n)}\; \frac{k_q!\, (n-k)_q!}{n_q!}\; A^{(k)} =: q^{n(k-n)} \binom{n}{k}_q^{-1} A^{(k)} ,$$
we find that the $k$-th order monomials
$$\mathrm{Tr}_{R(1,\dots,n)} \big( A^{(n)}\, L_{\overline{i_1}} \dots L_{\overline{i_k}} \big) = \mathrm{Tr}_{R(1,\dots,n)} \big( A^{(n)}\, L_{\overline{1}} \dots L_{\overline{k}} \big) = q^{n(k-n)} \binom{n}{k}_q^{-1} a_k$$
are equal to each other for any choice of indices $1 \le i_1 < \dots < i_k \le n$. Their corresponding coefficients in (3.3.23) sum up to

(−q −1 µσ )n−k

q2

n−k

r =1 ir

= q n(n−k)

1≤i 1 0, ∗ M∗ J 1 1

I2,... j+1 = γ TrR ( j + 2) (P j+1 . . . 2

∗

P1 )(L T )1 R 1 T1−1 (P1 . . .

(B.11)

P j+1 ) . (B.12)

In a similar way, for any value of i relations, (B.9) with j > 0 follow from that with j = 0 by a repeated application of (B.11). Therefore, it is enough to consider the case j = 0. ∗

Using relations (B.12) and (B.8) we can rewrite an expression M∗ J i in the following i

way


A. P. Isaev, P. Pyatov ∗

∗

∗

∗

∗

M∗ J i = ( R i−1 . . . R 1 ) M1 ( R 1 . . . R i−1 ) i

∗

∗

∗

∗

∗

= γ TrR (i + 1) (Pi . . . P2 P1 )(L T )1 ( R i . . . R 2 R 1 R 2 . . . R i ) (T1 ) 2

−1

(P1 P2 . . . Pi ) . (B.13)

Now we are ready to prove formula (B.9) by induction on i. Assuming that (B.9) with j = 1 is valid for the product of (i − 1) factors we transform the product of i factors, ∗

∗

∗

M∗ J 2 . . . M∗ J i

M∗ J 1 1

2

i

(i) (2i−1) = γ (i−1)i TrR (i + 1, . . . 2i − 1) ϒ P ϒ P (L T )i−1 . . . (L T )1 ∗ (2i−2) −1 (2i−1) (i) × ϒ∗ (Ti−1 . . . T1 ) ϒ P ϒP M∗ J i . R

i

∗

Next, we apply formulas (B.6), (B.7) to move the last factor (M∗ J i ) in this expression i

left-wards. The result is

(i) (2i−1) (2i−2) (L T )i−1 . . . (L T )1 ϒ ∗ (Ti−2 . . . T1 )−1 = γ (i−1)i TrR (i + 1, . . . 2i − 1) ϒ P ϒ P R ⎞ ↑(i−2)

∗

T1−1 (M∗ J i )↑1

×

i

ϒ P(2i−1) ϒ P(i) ⎠ ,

where we have used identities (Ti−1 )−1 = (T1−1 )↑(i−2) ∗ ((M∗ J i )↑1 )↑(i−2)

to arrange the terms (Ti−1 )−1 and

i

and

∗ (M∗ J i )↑(i−1)

∗

(M∗ J i )↑(i−1) = i

in a suitable way.

i

Next, we use formula (3.4.8) for their permutation and then, in a similar way we move ∗

term (M∗ J ) to the left of all the terms (T∗ )−1 : ··· = γ

(i−1)i+2(i−1) ∗

× (M

∗ 2i−1

(i) (2i−1) (2i−2) TrR (i + 1, . . . 2i − 1) ϒ P ϒ P (L T )i−1 . . . (L T )1 ϒ ∗ R

J 2i−1 )(Ti−1 . . . T1 )−1 ϒ P(2i−1) ϒ P(i) .

Now we substitute the expression (B.13) for (M

∗ 2i−1

∗

J 2i−1 )

= γ i(i+1) TrR (i + 1, . . . 2i) ϒ P(i) ϒ P(2i−1) (L T )i−1 . . . (L T )1 ϒ (2i−2) (P2i−1 . . . P1 )(L T )1 ∗ R

∗

∗

∗

∗

∗

(2i−1)

× ( R 2i−1 . . . R 2 R 1 R 2 . . . R 2i−1 )(T1 )−1 (P1 . . . P2i−1 )(Ti−1 . . . T1 )−1 ϒ P

(i)

ϒP



and move the term (P2i−1 . . . P1 ) leftwards and the term (P1 . . . P2i−1 ) rightwards close to the terms ϒ P(2i−1) . Finally, using (B.4) we complete the calculation (i) (2i) (2i) (2i) (i) = γ i(i+1) TrR (i + 1, . . . 2i) ϒ P ϒ P (L T )i . . . (L T )1 ϒ ∗ (Ti . . . T1 )−1 ϒ P ϒ P . R

∗

Here we transformed terms containing R in the following way: (2i−2)↑1 ∗

ϒ∗

∗

∗

∗

∗

( R 2i−1 . . . R 2 R 1 R 2 . . . R 2i−1 )

R

(2i−2)↑1 ∗

R ∗

∗

∗

∗

∗

( R 1 . . . R 2i−2 R 2i−1 R 2i−2 . . . R 1 )

= ϒ∗

∗

(2i−2) ∗

= ( R 1 . . . R 2i−1 )ϒ ∗ R

∗

∗

∗

(2i−1)

( R 2i−2 . . . R 1 ) = ( R 1 . . . R 2i−1 )ϒ ∗ R

(2i)

= ϒ∗ . R

Lemma B.3. The operators $J_k$ (3.3.8) and $\Upsilon_R^{(k)}$ (B.4) associated with a skew invertible R-matrix $R$ satisfy the relations
$$\mathrm{Tr}_{R(i+1,\dots,2i)}\; \Upsilon_R^{(2i)} = \big( \Upsilon_R^{(i)} \big)^4 = (J_1 J_2 \dots J_i)^2 . \tag{B.14}$$

Proof. The calculation proceeds as follows:
$$\begin{aligned} \mathrm{Tr}_{R(i+1,\dots,2i)}\; \Upsilon_R^{(2i)} &= \mathrm{Tr}_{R(i+1,\dots,2i-1)} \Big( \Upsilon_R^{(2i-1)}\, \big( \mathrm{Tr}_{R(2i)}\, R_{2i-1} \big)\, (R_{2i-2} \dots R_1) \Big) \\ &= \mathrm{Tr}_{R(i+1,\dots,2i-1)} \Big( (R_1 \dots R_{i-1})\; \Upsilon_R^{(2i-1)}\, (R_{i-1} \dots R_1) \Big) \\ &= \dots = (R_1 \dots R_{i-1})^i\; \Upsilon_R^{(i)}\, (R_{i-1} \dots R_1)(R_{i-2} \dots R_1) \dots (R_2 R_1)\, R_1 \\ &= (J_1 J_2 \dots J_i)\, \big( \Upsilon_R^{(i)} \big)^2 = \big( \Upsilon_R^{(i)} \big)^4 . \end{aligned}$$
Here in passing to the second line we calculated the R-trace $\mathrm{Tr}_{R(2i)}$ with the help of (2.2.8) and then used (B.5) to move $(i-1)$ R-matrices to the left of the term $\Upsilon_R^{(2i-1)}$. The expression in the third line results from similar calculations of the R-traces $\mathrm{Tr}_{R(2i-1)}, \dots, \mathrm{Tr}_{R(i+1)}$, consecutively. The equalities in the last line result from rearranging the factors of the product $(R_1 \dots R_{i-1})^i$.

Acknowledgement. We are grateful to Ludwig Dmitrievich Faddeev for acquainting us with the problem of the dynamics of the isotropic q-top, and for numerous inspiring discussions and advice. We would like to thank Alexei Gorodentsev, Sergei Kuleshov, Andrey Levin, Dmitry Lebedev, Andrey Mudrov, Andrey Marshakov and Vyacheslav Spiridonov for their useful comments and conversations. We would also like to acknowledge the warm hospitality of the Max-Planck-Institut für Mathematik, where the writing of this paper was started in 2004 and finished in 2008. The work is supported by the Russian Foundation for Basic Research, grants No. 08-01-00392-a, and CNRS-RFBR No. 07-02-92166 and No. 09-01-93107.


Communicated by L. Takhtajan




Stein’s Method and Characters of Compact Lie Groups

Jason Fulman

Department of Mathematics, University of Southern California, Los Angeles, CA 90089-2532, USA. E-mail: [email protected]

Received: 13 June 2008 / Accepted: 13 September 2008
Published online: 5 December 2008 – © Springer-Verlag 2008

Abstract: Stein’s method is used to study the trace of a random element from a compact Lie group or symmetric space. Central limit theorems are proved using very little information: character values on a single element and the decomposition of the square of the trace into irreducible components. This is illustrated for Lie groups of classical type and Dyson’s circular ensembles. The approach in this paper will be useful for the study of higher dimensional characters, where normal approximations need not hold.

1. Introduction

There is a large literature on the traces of random elements of compact Lie groups. One of the earliest results is due to Diaconis and Shahshahani [DS]. Using the method of moments, they show that if $g$ is random from the Haar measure of the unitary group $U(n,\mathbb{C})$, and $Z = X + iY$ is a standard complex normal with $X$ and $Y$ independent, mean 0 and variance $\frac{1}{2}$ normal variables, then for $j = 1, 2, \ldots$, the traces $\mathrm{Tr}(g^j)$ are independent and distributed as $\sqrt{j}\,Z$ asymptotically as $n \to \infty$. They give similar results for the orthogonal group $O(n,\mathbb{R})$ and the group of unitary symplectic matrices $USp(2n,\mathbb{C})$. The moment computations of [DS] use representation theory. It is worth noting that there are other approaches to their moment computations: [PV] uses a version of integration by parts (and also treats $SO(n,\mathbb{R})$), and [CoSz] uses an “extended Wick calculus” (and also treats symmetric spaces).

Concerning the error in the normal approximation in the [DS] results, Diaconis conjectured that for fixed $j$, it decreases exponentially or even superexponentially in $n$. Stein [St2] uses “Stein’s method” to show that $\mathrm{Tr}(g^k)$ on $O(n,\mathbb{R})$ is asymptotically normal with error $O(n^{-r})$ for any fixed $r$. Johansson [J] proved Diaconis’ conjecture for classical compact Lie groups using Toeplitz determinants and a very detailed analysis of characteristic functions.

The author received funding from NSF grant DMS-0503901.


One direction in which the [DS] results have been extended is the study of linear statistics of eigenvalues: see [J,DE,So] and the numerous references therein. There is also work by D’Aristotile, Diaconis, and Newman [DDN] on central limit theorems for linear functions such as $\mathrm{Tr}(Ag)$, where $A$ is a fixed $n \times n$ real matrix and $g$ is from the Haar measure of $O(n,\mathbb{R})$. In recent work, Meckes [Me2] refined Stein’s technique from [St2] to establish a sharp total variation distance error term (order $n^{-1}$) for the [DDN] result.

A natural goal is to prove limit theorems (with error terms) for the distribution of traces in other irreducible representations: i.e. $\chi^\tau(g)$, where $g$ is a random element of a compact Lie group and $\chi^\tau$ is the character of an irreducible representation $\tau$. This would have direct implications for Katz’s work [Ka] on exponential sums; see Sect. 4.7 of [KLR] for details. We do not attain this goal, but make a useful contribution to it. More precisely, the current paper presents a formulation of Stein’s method designed for the study of $\chi^\tau(g)$. In the case of normal approximation, we obtain $O(n^{-1})$ bounds for the error term using only two pieces of information:

• The value of the “character ratios” $\frac{\chi^\phi(\alpha)}{\dim(\phi)}$, where $\phi$ may be arbitrary but $\alpha$ is a single element of $G$ (typically chosen to be close to the identity)
• The decomposition of $\tau^2$ into irreducible representations.

In contrast, the method of moments approach requires knowing the multiplicity of the trivial representation in $\tau^k$ for all $k \ge 1$ (which could be tricky to compute) and does not give an immediate bound on the error. Johansson’s paper [J] gives sharper bounds when $\chi^\tau$ is the trace of an element from a classical compact Lie group, but requires knowledge of high order moments and deep analytical tools which might not extend to arbitrary representations $\tau$. Even Stein’s method approaches of Stein [St2] and Meckes [Me2] use information about the distribution of matrix entries; very little is known about this for arbitrary $\tau$, whereas the main ingredient for our approach (character theory) is well-developed.

Let us explain our statement in the abstract that the methods of this paper will prove useful for approximation other than normal approximation. We use Stein’s method of exchangeable pairs, which involves the construction of a pair $(W, W')$ of exchangeable random variables. Our pair (which is somewhat different from those of Stein [St2] and Meckes [Me2]) satisfies the linearity condition that $E(W' \mid W)$ is proportional to $W$, and we find representation theoretic formulas for quantities such as $E(W' - W)^k$. These computations are completely general and apply to arbitrary distributional approximation. Stein’s method of exchangeable pairs is still quite undeveloped for continuous distributions other than the normal, but that is temporary and there are some results: see [Mn,Re] for the chi-squared distribution, [Lu] for the Gamma distribution, and [GoT] for the semicircle law. Closest to the current paper is [CFR], which develops error terms for exponential approximation using quantities like $E(W' - W)^k$ with $k$ small.

We remark that the bounds in our paper are all given in the Kolmogorov metric. Similar results can be proved in the slightly stronger total variation metric (see the remarks after Theorem 2.1). However we prefer to work in the Kolmogorov metric as it underscores the similarity with discrete settings such as [Fu], where total variation convergence does not occur. We also mention that all bounds obtained in this paper are given with explicit constants.

The organization of this paper is as follows. Section 2 gives background on Stein’s method and normal approximation. Section 3 develops general theory for the case that $G$ is a compact Lie group and $\chi^\tau$ an irreducible character. It treats the trace of random elements of $O(n,\mathbb{R})$, $USp(2n,\mathbb{C})$, and $U(n,\mathbb{C})$ as examples. Section 4 extends the methods of Sect. 3 to study spherical functions of compact symmetric spaces. The symmetric space setting is natural from the viewpoint of random matrix theory [Dn,KaS]. After illustrating the technique on the sphere, we treat Dyson’s circular ensembles as examples, obtaining an error term.

2. Stein’s Method for Normal Approximation

In this section we briefly review Stein’s method for normal approximation, using the method of exchangeable pairs [St1]. For more details, one can consult the survey [RR] and the references therein.

Two random variables $W, W'$ are called an exchangeable pair if $(W, W')$ has the same distribution as $(W', W)$. As is typical in probability theory, let $E(A \mid B)$ denote the expected value of $A$ given $B$. The following result of Stein uses an exchangeable pair $(W, W')$ to prove a central limit theorem for $W$.

Theorem 2.1. ([St1]). Let $(W, W')$ be an exchangeable pair of real random variables such that $E(W^2) = 1$ and $E(W' \mid W) = (1 - a)W$ with $0 < a < 1$. Then for all real $x_0$,

$$\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{\sqrt{\mathrm{Var}(E[(W' - W)^2 \mid W])}}{a} + (2\pi)^{-1/4} \sqrt{\frac{1}{a}\,E|W' - W|^3}.$$

Remarks. (1) There are variations of Theorem 2.1 (for instance Theorem 6 of [Me1]) which can be combined with our calculations to prove normal approximation in the total variation metric. However Theorem 2.1 is quite convenient for our purposes. (2) In recent work, Röllin [Rl] has given a version of Theorem 2.1 in which the exchangeability condition can be replaced by the slightly weaker condition that $W$ and $W'$ have the same law. Since exchangeability holds in our examples and may be useful for other applications involving Stein’s method, we adhere to using Theorem 2.1.

To apply Theorem 2.1, one needs bounds on $\mathrm{Var}(E[(W' - W)^2 \mid W])$ and $E|W' - W|^3$. The following lemmas are helpful for this purpose.

Lemma 2.2. Let $(W, W')$ be an exchangeable pair of random variables such that $E(W' \mid W) = (1 - a)W$ and $E(W^2) = 1$. Then $E(W' - W)^2 = 2a$.

Proof. Since $W$ and $W'$ have the same distribution,

$$\begin{aligned}
E(W' - W)^2 &= E\bigl(E((W' - W)^2 \mid W)\bigr) \\
&= E((W')^2) + E(W^2) - 2E\bigl(W\,E(W' \mid W)\bigr) \\
&= 2E(W^2) - 2E\bigl(W\,E(W' \mid W)\bigr) \\
&= 2E(W^2) - 2(1 - a)E(W^2) \\
&= 2a.
\end{aligned}$$
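The hypotheses of Theorem 2.1 and the identity of Lemma 2.2 can be checked exactly on a small discrete example. The sketch below is not taken from this paper: it uses the standard toy pair in which $W = S/\sqrt{n}$ for a sum $S$ of $n$ independent fair signs, and $W'$ is obtained by resampling one uniformly chosen sign. The function name `check_lemma_2_2` is hypothetical, and all moments are computed for $S$ and then divided by $n$, so that the arithmetic stays in exact fractions.

```python
from fractions import Fraction
from itertools import product


def check_lemma_2_2(n):
    """Exact check of E(W'|W) = (1 - a)W and E(W' - W)^2 = 2a with a = 1/n.

    X is uniform on {-1, +1}^n and S = sum(X).  S' is obtained by choosing
    an index uniformly at random and replacing that coordinate by a fresh
    fair sign.  W = S / sqrt(n), so second moments of W are those of S
    divided by n.
    """
    p_state = Fraction(1, 2 ** n)   # probability of each sign vector
    p_move = Fraction(1, 2 * n)     # probability of each (index, new sign)
    e_s2 = Fraction(0)              # accumulates E(S^2)
    e_diff2 = Fraction(0)           # accumulates E(S' - S)^2
    linear_ok = True
    for x in product((-1, 1), repeat=n):
        s = sum(x)
        cond_mean = Fraction(0)     # E(S' | X = x)
        for i in range(n):
            for new in (-1, 1):
                s_new = s - x[i] + new
                cond_mean += p_move * s_new
                e_diff2 += p_state * p_move * (s_new - s) ** 2
        e_s2 += p_state * s * s
        # Linearity condition, checked conditionally on the full state x.
        linear_ok = linear_ok and cond_mean == (1 - Fraction(1, n)) * s
    a = Fraction(1, n)
    return linear_ok, e_s2 / n, e_diff2 / n, 2 * a


if __name__ == "__main__":
    print(check_lemma_2_2(4))
```

The four returned values are the linearity flag, $E(W^2)$, $E(W' - W)^2$, and $2a$; the last two should agree, as Lemma 2.2 asserts.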


Lemma 2.3 is a well known inequality (already used in the monograph [St1]) and useful because often the right-hand side is easier to compute or bound than the left-hand side. To make this paper as self-contained as possible, we include a proof. Here $x$ is an element of the state space $X$.

Lemma 2.3. $\mathrm{Var}(E[(W' - W)^2 \mid W]) \le \mathrm{Var}(E[(W' - W)^2 \mid x])$.

Proof. Jensen’s inequality states that if $g$ is a convex function, and $Z$ a random variable, then $g(E(Z)) \le E(g(Z))$. There is also a conditional version of Jensen’s inequality (Sect. 4.1 of [Du]) which states that for any $\sigma$-subalgebra $F$ of the $\sigma$-algebra of all subsets of $X$, $E(g(E(Z \mid F))) \le E(g(Z))$. The lemma follows by setting $g(t) = t^2$, $Z = E((W' - W)^2 \mid x)$, and letting $F$ be the $\sigma$-algebra generated by the level sets of $W$.

3. Compact Lie Groups

This section uses Stein’s method to study the distribution of a fixed irreducible character $\chi^\tau$ of a compact Lie group $G$. Subsect. 3.1 develops general theory for the case that $\chi^\tau$ is real valued. This is applied to study the trace of a random element of $USp(2n,\mathbb{C})$ in Subsect. 3.2 and the trace of a random orthogonal matrix in Subsects. 3.3 and 3.4. Subsect. 3.5 indicates the relevant amendments for the complex setting and Subsect. 3.6 illustrates the theory for $U(n,\mathbb{C})$.

3.1. General theory (real case). Let $G$ be a compact Lie group and $\chi^\tau$ a non-trivial real-valued irreducible character of $G$. The random variable of interest to us is $W = \chi^\tau(g)$, where $g$ is chosen from the Haar measure of $G$. It follows from the orthogonality relations for irreducible characters of $G$ that $E(W) = 0$ and $E(W^2) = 1$. The following functional equation will be useful.

Lemma 3.1. ([He2], p. 392). Let $G$ be a compact Lie group and $\chi^\phi$ an irreducible character of $G$. Then

$$\int_G \chi^\phi(h\alpha h^{-1} g)\,dh = \frac{\chi^\phi(\alpha)}{\dim(\phi)}\,\chi^\phi(g)$$

for all $\alpha, g \in G$.

We now define a pair $(W, W')$ by letting $W = \chi^\tau(g)$, where $g$ is chosen from Haar measure, and $W' = W(\alpha g)$, where $\alpha$ is chosen uniformly at random from a fixed self-inverse conjugacy class of $G$. Exchangeability of $(W, W')$ follows since the conjugacy class of $\alpha$ is self-inverse. Moreover, since $\chi^\phi(\alpha^{-1}) = \overline{\chi^\phi(\alpha)}$, one has that $\chi^\phi(\alpha)$ is real for all irreducible representations $\phi$, a fact which will be used freely throughout this subsection.
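The functional equation of Lemma 3.1 can be sanity-checked exactly on a finite group, viewed as a zero-dimensional compact Lie group whose Haar measure is normalized counting measure. The sketch below is an illustration under that assumption and is not part of the paper: it uses the symmetric group $S_3$, whose three irreducible characters are written down by hand, and all helper names (`mul`, `inv`, `functional_equation_holds`) are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# The six elements of S_3, stored as tuples p with p[i] the image of i.
G = list(permutations(range(3)))
IDENTITY = tuple(range(3))


def mul(p, q):
    """Composition of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)


def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def fixed_points(p):
    return sum(1 for i, j in enumerate(p) if i == j)


# The irreducible characters of S_3: trivial, sign, and the 2-dimensional
# standard character (number of fixed points minus one).
CHARACTERS = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fixed_points(p) - 1,
}


def functional_equation_holds(chi):
    """Check (1/|G|) sum_h chi(h a h^-1 g) == chi(a)/dim * chi(g) for all a, g."""
    dim = chi(IDENTITY)
    for alpha in G:
        for g in G:
            total = sum(chi(mul(mul(mul(h, alpha), inv(h)), g)) for h in G)
            lhs = Fraction(total, len(G))
            rhs = Fraction(chi(alpha), dim) * chi(g)
            if lhs != rhs:
                return False
    return True


if __name__ == "__main__":
    for name, chi in CHARACTERS.items():
        print(name, functional_equation_holds(chi))
```

Exact rational arithmetic is used so that equality, rather than approximate agreement, is tested for every pair $(\alpha, g)$.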



Remark. Non-identity self-inverse conjugacy classes always exist. Indeed, if $G$ has rank $r$ then any maximal torus has $2^r - 1$ elements of order 2, and these are naturally self-inverse. Moreover if $G$ has all characters real valued (as is the case for symplectic and orthogonal groups), then all conjugacy classes are self-inverse, since class functions can be uniformly approximated by sums of characters and $\chi^\phi(\alpha^{-1}) = \overline{\chi^\phi(\alpha)}$.

The remaining results in this subsection show that the exchangeable pair $(W, W')$ has desirable properties.

Lemma 3.2.

$$E(W' \mid W) = \frac{\chi^\tau(\alpha)}{\dim(\tau)}\,W.$$

Proof. Applying Lemma 3.1 with $\phi = \tau$, one has that

$$E(W' \mid g) = \int_{h \in G} \chi^\tau(h\alpha h^{-1} g)\,dh = \frac{\chi^\tau(\alpha)}{\dim(\tau)}\,\chi^\tau(g).$$

The result follows since this depends on $g$ only through $W$.

Lemma 3.3.

$$E(W' - W)^2 = 2\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right).$$

Proof. This is immediate from Lemmas 2.2 and 3.2.

For the remainder of this subsection, if $\phi$ is an irreducible representation of $G$, we let $m_\phi(\tau^r)$ denote the multiplicity of $\phi$ in the $r$-fold tensor product of $\tau$ (which has character $(\chi^\tau)^r$).

Lemma 3.4.

$$E[(W')^2 \mid g] = \sum_\phi m_\phi(\tau^2)\,\frac{\chi^\phi(\alpha)}{\dim(\phi)}\,\chi^\phi(g),$$

where the sum is over all irreducible representations of $G$.

Proof. Write $(W')^2 = \sum_\phi m_\phi(\tau^2)\,\chi^\phi(g')$, where $g' = \alpha g$. Lemma 3.1 gives that

$$E[\chi^\phi(g') \mid g] = \int_G \chi^\phi(h\alpha h^{-1} g)\,dh = \frac{\chi^\phi(\alpha)}{\dim(\phi)}\,\chi^\phi(g),$$

and the result follows.

Lemma 3.5 writes $\mathrm{Var}(E[(W' - W)^2 \mid g])$ as a sum of positive quantities.

Lemma 3.5.

$$\mathrm{Var}(E[(W' - W)^2 \mid g]) = \sum_\phi^{*} m_\phi(\tau^2)^2 \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right)^2,$$

where the star signifies that the sum is over all nontrivial irreducible representations of $G$.


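Proof. By Lemmas 3.2 and 3.4,

$$\begin{aligned}
E((W' - W)^2 \mid g) &= E[(W')^2 \mid g] - 2W\,E(W' \mid g) + W^2 \\
&= E[(W')^2 \mid g] + \left(1 - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right) W^2 \\
&= \sum_\phi m_\phi(\tau^2) \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right) \chi^\phi(g).
\end{aligned}$$

The orthogonality relation for irreducible characters of $G$ gives that

$$E\bigl[E((W' - W)^2 \mid g)^2\bigr] = \sum_\phi m_\phi(\tau^2)^2 \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right)^2.$$

Finally, note that $\mathrm{Var}(E[(W' - W)^2 \mid g]) = E[E((W' - W)^2 \mid g)^2] - (E(W' - W)^2)^2$, and since the multiplicity of the trivial representation in $\tau^2$ is 1, the result follows from Lemma 3.3.

Lemma 3.6. Let $k$ be a positive integer.

(1) $E(W' - W)^k = \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{k-r})\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}$,

(2) $E(W' - W)^4 = \sum_\phi m_\phi(\tau^2)^2 \left[ 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) - 6\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right]$.

Proof. For the first assertion, note that

$$E[(W' - W)^k \mid g] = \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r}\, \chi^\tau(g)^{k-r}\, E[(W')^r \mid g].$$

Arguing as in Lemma 3.4 gives that this is equal to

$$\sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r}\, \chi^\tau(g)^{k-r} \sum_\phi m_\phi(\tau^r)\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}\, \chi^\phi(g).$$

Thus $E(W' - W)^k$ is equal to

$$\begin{aligned}
E\bigl(E[(W' - W)^k \mid g]\bigr) &= \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi(\tau^r)\, \frac{\chi^\phi(\alpha)}{\dim(\phi)} \int_{g \in G} \chi^\tau(g)^{k-r} \chi^\phi(g) \\
&= \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{k-r})\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}.
\end{aligned}$$

For the second assertion, note by the first assertion that

$$E(W' - W)^4 = \sum_{r=0}^{4} (-1)^r \binom{4}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{4-r})\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}.$$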


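If $\alpha$ is the identity element of $G$, then $W' = W$, which implies that

$$0 = \sum_{r=0}^{4} (-1)^r \binom{4}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{4-r}).$$

Thus for general $\alpha$,

$$E(W' - W)^4 = -\sum_{r=0}^{4} (-1)^r \binom{4}{r} \sum_\phi m_\phi(\tau^r)\, m_\phi(\tau^{4-r}) \left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right).$$

Observe that the $r = 0, 4$ terms in this sum vanish, since the only contribution could come from the trivial representation, which contributes 0. The $r = 2$ term is

$$-6 \sum_\phi m_\phi(\tau^2)^2 \left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right).$$

The $r = 1, 3$ terms are equal and together contribute

$$\begin{aligned}
8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) m_\tau(\tau^3) &= 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) \int_{g \in G} \chi^\tau(g)^4 \\
&= 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) \int_{g \in G} \Bigl(\sum_\phi m_\phi(\tau^2)\, \chi^\phi(g)\Bigr)^2 \\
&= 8\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) \sum_\phi m_\phi(\tau^2)^2.
\end{aligned}$$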

This completes the proof.
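Lemmas 3.3, 3.5, and 3.6(2) can be compared with a brute-force computation on a small finite group, again treated as a zero-dimensional compact Lie group with all characters real valued. The following sketch is an illustration under that assumption and is not part of the paper: it takes $G = S_4$, $\tau$ the 3-dimensional standard representation, and $\alpha$ ranging over the class of transpositions. The character formulas for $S_4$ are written down by hand and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

N = 4
G = list(permutations(range(N)))
IDENTITY = tuple(range(N))


def mul(p, q):
    """Composition of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(N))


def fix(p):
    return sum(1 for i in range(N) if p[i] == i)


def two_cycles(p):
    return sum(1 for i in range(N) if p[i] != i and p[p[i]] == i) // 2


def sign(p):
    inversions = sum(1 for i in range(N) for j in range(i + 1, N) if p[i] > p[j])
    return -1 if inversions % 2 else 1


# The five irreducible characters of S_4.  The 2-dimensional one is the
# permutation character on 2-element subsets minus the one on points.
IRREDUCIBLES = {
    "trivial": lambda p: 1,
    "sign": sign,
    "standard": lambda p: fix(p) - 1,
    "sign*standard": lambda p: sign(p) * (fix(p) - 1),
    "two-dimensional": lambda p: fix(p) * (fix(p) - 1) // 2 + two_cycles(p) - fix(p),
}

TAU = IRREDUCIBLES["standard"]
CLASS = [p for p in G if fix(p) == N - 2]   # the six transpositions
ALPHA = CLASS[0]


def haar(f):
    """Average of f over G (normalized counting measure)."""
    return Fraction(1, len(G)) * sum(f(g) for g in G)


def ratio(chi):
    """Character ratio chi(alpha) / dim."""
    return Fraction(chi(ALPHA), chi(IDENTITY))


def cond_moment(g, k):
    """E[(W' - W)^k | g], where W' = chi_tau(alpha g), alpha uniform in CLASS."""
    total = sum((TAU(mul(a, g)) - TAU(g)) ** k for a in CLASS)
    return Fraction(total, len(CLASS))


def brute_force():
    second = haar(lambda g: cond_moment(g, 2))
    variance = haar(lambda g: cond_moment(g, 2) ** 2) - second ** 2
    fourth = haar(lambda g: cond_moment(g, 4))
    return second, variance, fourth


def from_lemmas():
    r_tau = ratio(TAU)
    second = 2 * (1 - r_tau)                      # Lemma 3.3
    variance = Fraction(0)                        # Lemma 3.5
    fourth = Fraction(0)                          # Lemma 3.6, part (2)
    for name, chi in IRREDUCIBLES.items():
        m = haar(lambda g: TAU(g) ** 2 * chi(g))  # multiplicity of phi in tau^2
        if name != "trivial":
            variance += m ** 2 * (1 + ratio(chi) - 2 * r_tau) ** 2
        fourth += m ** 2 * (8 * (1 - r_tau) - 6 * (1 - ratio(chi)))
    return second, variance, fourth


if __name__ == "__main__":
    print(brute_force())
    print(from_lemmas())
```

Both functions return the triple $(E(W'-W)^2,\ \mathrm{Var}(E[(W'-W)^2 \mid g]),\ E(W'-W)^4)$; for this choice of group and class the two triples agree and equal $(4/3,\ 5/9,\ 10/3)$.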

The above lemmas are completely general. Specializing to normal approximation, one obtains the following result.

Theorem 3.7. Let $G$ be a compact Lie group and let $\tau$ be a non-trivial irreducible representation of $G$ whose character is real valued. Fix a non-identity element $\alpha$ with the property that $\alpha$ and $\alpha^{-1}$ are conjugate. Let $W = \chi^\tau(g)$, where $g$ is chosen from the Haar measure of $G$. Then for all real $x_0$,

$$\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \sqrt{\sum_\phi^{*} m_\phi(\tau^2)^2 \left[2 - \frac{1}{a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right)\right]^2} + \left[\frac{1}{\pi} \sum_\phi m_\phi(\tau^2)^2 \left(8 - \frac{6}{a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right)\right)\right]^{1/4}.$$

Here $a = 1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}$, the first sum is over all non-trivial irreducible representations of $G$, and the second sum is over all irreducible representations of $G$.


Proof. One applies Theorem 2.1 to the exchangeable pair $(W, W')$ of this subsection. By Lemmas 2.3 and 3.5, the first term in Theorem 2.1 gives the first term in the theorem. To upper bound the second term in Theorem 2.1, note by the Cauchy-Schwarz inequality that

$$E|W' - W|^3 \le \sqrt{E(W' - W)^2\; E(W' - W)^4}.$$

Now use Lemma 3.3 and part 2 of Lemma 3.6.
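As a numerical companion to Theorem 3.7, the two error terms can be evaluated directly from a table of multiplicities, dimensions, and character values. The sketch below is not from the paper and its function names are hypothetical. As input it uses the $USp(2n,\mathbb{C})$ data of Subsect. 3.2 (the three characters of Lemma 3.8 evaluated at an element $\alpha$ with $x_1 = \cdots = x_{n-1} = 1$ and $x_n = e^{i\theta}$), with each dimension obtained by evaluating the corresponding character at $\theta = 0$. The computed terms are then compared with the closed forms quoted in the proof of Theorem 3.9.

```python
import math


def theorem_3_7_terms(a, components):
    """Evaluate the two error terms of Theorem 3.7.

    components is a list of (multiplicity in tau^2, dim(phi), chi^phi(alpha),
    is_trivial) tuples, one per irreducible constituent phi of tau^2.
    """
    first_sum = 0.0
    second_sum = 0.0
    for mult, dim, value, is_trivial in components:
        defect = (1.0 - value / dim) / a          # (1/a)(1 - chi(alpha)/dim)
        if not is_trivial:
            first_sum += mult ** 2 * (2.0 - defect) ** 2
        second_sum += mult ** 2 * (8.0 - 6.0 * defect)
    # Guard against a tiny negative value caused by floating-point rounding.
    second_sum = max(second_sum, 0.0)
    return math.sqrt(first_sum), (second_sum / math.pi) ** 0.25


def usp_characters(n, theta):
    """Characters of Lemma 3.8 at alpha with x_1 = ... = x_{n-1} = 1, x_n = e^{i theta}."""
    p1 = 2 * (n - 1) + 2 * math.cos(theta)        # sum of x_i + 1/x_i
    p2 = 2 * (n - 1) + 2 * math.cos(2 * theta)    # sum of x_i^2 + 1/x_i^2
    return [1.0, 0.5 * p1 ** 2 + 0.5 * p2, 0.5 * p1 ** 2 - 0.5 * p2 - 1.0]


def usp_error_terms(n, theta):
    values = usp_characters(n, theta)
    dims = usp_characters(n, 0.0)                 # dimension = character at identity
    trace_alpha = 2 * (n - 1) + 2 * math.cos(theta)
    a = 1.0 - trace_alpha / (2 * n)
    components = [
        (1, dims[k], values[k], k == 0) for k in range(3)   # multiplicity free
    ]
    return theorem_3_7_terms(a, components)


def usp_closed_forms(n, theta):
    """Closed forms stated in the proof of Theorem 3.9."""
    c = math.cos(theta)
    first = 2.0 * math.sqrt(4 * c * c - 4 * c + 2) / (2 * n + 1)
    second = (24.0 * (1 - c) / (math.pi * (2 * n + 1))) ** 0.25
    return first, second


if __name__ == "__main__":
    print(usp_error_terms(5, 0.9))
    print(usp_closed_forms(5, 0.9))
```

Letting $\theta \to 0$ in the closed forms sends the second term to 0 and the first to $2\sqrt{2}/(2n+1)$, which is the quantity bounded by $\sqrt{2}/n$ in Theorem 3.9.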

3.2. Example. $USp(2n,\mathbb{C})$. This subsection studies the distribution of $\chi^\tau(g)$, where $\tau$ is the $2n$-dimensional defining representation of $USp(2n,\mathbb{C})$. The only representation theoretic fact needed is Lemma 3.8, which is the $k = 2$ case of a formula from p. 200 of [Su] giving the decomposition of $\tau^k$ into irreducible representations. In its statement, we let $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}$ denote the eigenvalues of an element of $USp(2n,\mathbb{C})$.

Lemma 3.8. For $n \ge 2$, the square of the defining representation of the group $USp(2n,\mathbb{C})$ decomposes in a multiplicity free way as the sum of the following three irreducible representations:

• The trivial representation, with character 1
• The representation with character $\frac{1}{2}\bigl(\sum_i (x_i + x_i^{-1})\bigr)^2 + \frac{1}{2}\sum_i (x_i^2 + x_i^{-2})$
• The representation with character $\frac{1}{2}\bigl(\sum_i (x_i + x_i^{-1})\bigr)^2 - \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) - 1$.

Remark. Lemma 3.8 could also be easily guessed (and proved) by looking at the character formulas for $USp(2n,\mathbb{C})$ on p. 219 of [W].

Theorem 3.9. Let $g$ be chosen from the Haar measure of $USp(2n,\mathbb{C})$, where $n \ge 2$. Let $W(g)$ be the trace of $g$. Then for all real $x_0$,

$$\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{\sqrt{2}}{n}.$$

Proof. One applies Theorem 3.7, with $\tau$ the defining representation, and $\alpha$ an element of type $\{x_1^{\pm 1}, \ldots, x_n^{\pm 1}\}$, where $x_1 = \cdots = x_{n-1} = 1$ and $x_n = e^{i\theta}$. Then $\alpha$ is conjugate to $\alpha^{-1}$ and one computes that $a = \frac{1 - \cos(\theta)}{n}$. Using Lemma 3.8, one calculates that the first error term in Theorem 3.7 is equal to $\frac{2\sqrt{4\cos(\theta)^2 - 4\cos(\theta) + 2}}{2n+1}$. One computes that the second error term is equal to $\left[\frac{24(1 - \cos(\theta))}{\pi(2n+1)}\right]^{1/4}$. Since these bounds hold for all $\theta$ and are continuous in $\theta$, the bounds hold in the limit that $\theta \to 0$. This gives an upper bound of

$$\frac{2\sqrt{2}}{2n+1} \le \frac{\sqrt{2}}{n},$$

as claimed.

3.3. Example. $SO(2n+1,\mathbb{R})$. We investigate the distribution of $\chi^\tau(g)$, where $\tau$ is the $(2n+1)$-dimensional defining representation of $SO(2n+1,\mathbb{R})$. The only ingredient from representation theory needed is Lemma 3.10, which is the $k = 2$ case of a formula from p. 204 of [Su] giving the decomposition of $\tau^k$ into irreducible representations (it is also easily obtained by inspecting the character formulas on p. 228 of [W]). In its statement, we let $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}, 1$ be the eigenvalues of an element of $SO(2n+1,\mathbb{R})$.



Lemma 3.10. For $n \ge 2$, the square of the defining representation of $SO(2n+1,\mathbb{R})$ decomposes in a multiplicity free way as the sum of the following three irreducible representations:

• The trivial representation, with character 1
• The representation with character $\frac{1}{2}\bigl(\sum_i (x_i + x_i^{-1})\bigr)^2 + \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) + \sum_i (x_i + x_i^{-1})$
• The representation with character $\frac{1}{2}\bigl(\sum_i (x_i + x_i^{-1})\bigr)^2 - \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) + \sum_i (x_i + x_i^{-1})$.

This leads to the following theorem.

Theorem 3.11. Let $g$ be chosen from the Haar measure of $SO(2n+1,\mathbb{R})$, where $n \ge 2$. Let $W(g)$ be the trace of $g$. Then for all real $x_0$,

$$\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{\sqrt{2}}{n}.$$

Proof. One applies Theorem 3.7, taking $\tau$ to be the defining representation, and $\alpha$ to be an element of type $\{x_1^{\pm 1}, \ldots, x_n^{\pm 1}, 1\}$, where $x_1 = \cdots = x_{n-1} = 1$ and $x_n = e^{i\theta}$ (i.e. $\alpha$ is a rotation by $\theta$). Then $\alpha$ is conjugate to $\alpha^{-1}$ and $a = \frac{2(1 - \cos(\theta))}{2n+1}$. Using Lemma 3.10 one computes that in the $\theta \to 0$ limit the first error term in Theorem 3.7 is equal to $\frac{\sqrt{2}}{n}$. One calculates that the second error term is equal to $\left[\frac{12(2n+1)(1 - \cos(\theta))}{\pi n(2n+3)}\right]^{1/4}$. The proof of the theorem is completed by noting that this goes to 0 as $\theta \to 0$.

3.4. Example. $O(2n,\mathbb{R})$. We consider the distribution of $\chi^\tau(g)$, where $\tau$ is the $2n$-dimensional defining representation of $O(2n,\mathbb{R})$. The only representation theoretic information needed is Lemma 3.12, which is the $k = 2$ case of a result of Proctor [Pr] (and also not difficult to obtain from the character formulas on p. 228 of [W]). In its statement, we let $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}$ be the eigenvalues of an element of $O(2n,\mathbb{R})$.

Lemma 3.12. For $n \ge 2$, the square of the defining representation of $O(2n,\mathbb{R})$ decomposes in a multiplicity free way as the sum of the following three irreducible representations:

• The trivial representation, with character 1
• The representation with character $\frac{1}{2}\bigl(\sum_i (x_i + x_i^{-1})\bigr)^2 - \frac{1}{2}\sum_i (x_i^2 + x_i^{-2})$
• The representation with character $\frac{1}{2}\bigl(\sum_i (x_i + x_i^{-1})\bigr)^2 + \frac{1}{2}\sum_i (x_i^2 + x_i^{-2}) - 1$.

This leads to the following result.

Theorem 3.13. Let $g$ be chosen from the Haar measure of $O(2n,\mathbb{R})$, where $n \ge 2$. Let $W(g)$ be the trace of $g$. Then for all real $x_0$,

$$\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{\sqrt{2}}{n-1}.$$

Proof. Apply Theorem 3.7, with $\tau$ the defining representation of $O(2n,\mathbb{R})$. We take $\alpha$ to be an element of type $\{x_1^{\pm 1}, \ldots, x_n^{\pm 1}\}$, where $x_1 = \cdots = x_{n-1} = 1$ and $x_n = e^{i\theta}$ (i.e. $\alpha$ is a rotation by $\theta$). Then $\alpha$ is conjugate to $\alpha^{-1}$ and $a = \frac{1 - \cos(\theta)}{n}$. Lemma 3.12 gives the decomposition of $\tau^2$ into irreducibles, and from this one calculates the first error term in Theorem 3.7, and sees that in the $\theta \to 0$ limit it is equal to $\frac{\sqrt{8}}{2n-1}$. One computes that the second error term is equal to $\left[\frac{24n(1 - \cos(\theta))}{\pi(n+1)(2n-1)}\right]^{1/4}$. The proof of the theorem is completed by noting that this goes to 0 as $\theta \to 0$.

3.5. General theory (complex case). Let $G$ be a compact Lie group and $\tau$ be an irreducible representation of $G$ whose character is not real valued. The random variable of interest to us is $W = \frac{1}{\sqrt{2}}\bigl[\chi^\tau(g) + \overline{\chi^\tau(g)}\bigr]$, where $g$ is chosen from the Haar measure of $G$. It follows from the orthogonality relations for irreducible characters of $G$ that $E(W) = 0$ and $E(W^2) = 1$.

We now define a pair $(W, W')$ by letting $W$ be as above and $W' = W(\alpha g)$, where $\alpha$ is chosen uniformly at random from a fixed self-inverse conjugacy class of $G$. As in Subsect. 3.1, the pair $(W, W')$ is exchangeable and all $\chi^\phi(\alpha)$ are real. The remaining results in this subsection are proved by minor modifications of the arguments in Subsect. 3.1.

Lemma 3.14.

$$E(W' \mid W) = \frac{\chi^\tau(\alpha)}{\dim(\tau)}\,W.$$

Lemma 3.15.

$$E(W' - W)^2 = 2\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right).$$

Lemma 3.16.

$$E[(W')^2 \mid g] = \frac{1}{2} \sum_\phi m_\phi[(\tau + \bar{\tau})^2]\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}\, \chi^\phi(g),$$

where the sum is over all irreducible representations of $G$.

Lemma 3.17.

$$\mathrm{Var}(E[(W' - W)^2 \mid g]) = \frac{1}{4} \sum_\phi^{*} m_\phi[(\tau + \bar{\tau})^2]^2 \left(1 + \frac{\chi^\phi(\alpha)}{\dim(\phi)} - \frac{2\chi^\tau(\alpha)}{\dim(\tau)}\right)^2,$$

where the star signifies that the sum is over all nontrivial irreducible representations of $G$.

Lemma 3.18. Let $k$ be a positive integer.

(1) $E(W' - W)^k$ is equal to

$$\frac{1}{2^{k/2}} \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_\phi m_\phi[(\tau + \bar{\tau})^r]\, m_\phi[(\tau + \bar{\tau})^{k-r}]\, \frac{\chi^\phi(\alpha)}{\dim(\phi)}.$$

(2) $E(W' - W)^4 = \sum_\phi m_\phi[(\tau + \bar{\tau})^2]^2 \left[ 2\left(1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}\right) - \frac{3}{2}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right) \right]$.

Finally, one obtains the following central limit theorem.


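Theorem 3.19. Let $G$ be a compact Lie group and $\tau$ an irreducible representation of $G$ whose character is not real valued. Let $\alpha \ne 1$ be such that $\alpha$ and $\alpha^{-1}$ are conjugate. Let $W = \frac{1}{\sqrt{2}}\bigl[\chi^\tau(g) + \overline{\chi^\tau(g)}\bigr]$, where $g$ is chosen from the Haar measure of $G$. Then for all real $x_0$,

$$\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{1}{2}\sqrt{\sum_\phi^{*} m_\phi[(\tau + \bar{\tau})^2]^2 \left[2 - \frac{1}{a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right)\right]^2} + \left[\frac{1}{\pi} \sum_\phi m_\phi[(\tau + \bar{\tau})^2]^2 \left(2 - \frac{3}{2a}\left(1 - \frac{\chi^\phi(\alpha)}{\dim(\phi)}\right)\right)\right]^{1/4}.$$

Here $a = 1 - \frac{\chi^\tau(\alpha)}{\dim(\tau)}$, the first sum is over all non-trivial irreducible representations of $G$, and the second sum is over all irreducible representations of $G$.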
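The two terms of Theorem 3.19 can be evaluated numerically in the same spirit. The sketch below is not from the paper and its function names are hypothetical. It uses the $U(n,\mathbb{C})$ data of Subsect. 3.6: the element $\alpha = \mathrm{diag}(1,\ldots,1,e^{i\theta},e^{-i\theta})$, and the constituents of $(\tau + \bar{\tau})^2$ with multiplicities $2, 1, 1, 1, 1, 2$ for the trivial representation, $s_{(2)}$, $s_{(1,1)}$, their conjugates, and $s_{(1,0^{n-2},-1)}$. Expressing these characters through the power sums $\mathrm{Tr}(\alpha)$ and $\mathrm{Tr}(\alpha^2)$ is an assumption of the sketch (the standard symmetric and exterior square formulas), and dimensions are again obtained at $\theta = 0$.

```python
import math


def un_components(n, theta):
    """(multiplicity, character value at alpha) for the constituents of (tau + conj tau)^2.

    alpha = diag(1, ..., 1, e^{i theta}, e^{-i theta}); all values are real,
    so each character and its complex conjugate agree on alpha.
    """
    p1 = (n - 2) + 2 * math.cos(theta)        # Tr(alpha)
    p2 = (n - 2) + 2 * math.cos(2 * theta)    # Tr(alpha^2)
    sym = 0.5 * (p1 * p1 + p2)                # s_(2) and its conjugate
    alt = 0.5 * (p1 * p1 - p2)                # s_(1,1) and its conjugate
    adj = p1 * p1 - 1.0                       # s_(1,0^{n-2},-1)
    return [(2, 1.0), (1, sym), (1, alt), (1, sym), (1, alt), (2, adj)]


def theorem_3_19_terms(n, theta):
    """Return (first, second) error terms of Theorem 3.19 for U(n, C)."""
    values = un_components(n, theta)
    dims = un_components(n, 0.0)              # dimension = character at identity
    a = 1.0 - ((n - 2) + 2 * math.cos(theta)) / n
    first_sum = 0.0
    second_sum = 0.0
    for index, ((mult, value), (_, dim)) in enumerate(zip(values, dims)):
        defect = (1.0 - value / dim) / a      # (1/a)(1 - chi(alpha)/dim)
        if index != 0:                        # index 0 is the trivial representation
            first_sum += mult ** 2 * (2.0 - defect) ** 2
        second_sum += mult ** 2 * (2.0 - 1.5 * defect)
    second_sum = max(second_sum, 0.0)         # guard against rounding below zero
    return 0.5 * math.sqrt(first_sum), (second_sum / math.pi) ** 0.25


def un_closed_forms(n, theta):
    """Limit of the first term and exact second term from the proof of Theorem 3.20."""
    first_limit = 2.0 * math.sqrt(n * n + 2) / (n * n - 1)
    second = (12.0 * (2 * n - 1) * (1 - math.cos(theta)) / (math.pi * (n * n - 1))) ** 0.25
    return first_limit, second


if __name__ == "__main__":
    print(theorem_3_19_terms(6, 1e-3))
    print(un_closed_forms(6, 1e-3))
```

For small $\theta$ the computed first term approaches $2\sqrt{n^2+2}/(n^2-1)$, which is at most $2/(n-1)$, and the second term tends to 0.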

3.6. Example. $U(n,\mathbb{C})$. Every element of $U(n,\mathbb{C})$ is conjugate to a diagonal matrix with entries $(x_1, \ldots, x_n)$ and the representation theory of $U(n,\mathbb{C})$ is well understood (see for instance [Bu]). The irreducible representations of $U(n,\mathbb{C})$ are parameterized by integer sequences $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. The corresponding character value on an element of type $\{x_1, \ldots, x_n\}$ is given by the Schur function $s_\lambda(x_1, \ldots, x_n)$. (The usual definition of Schur functions requires that $\lambda_n \ge 0$, so if $\lambda_n = -k < 0$, this should be interpreted as $(x_1 \cdots x_n)^{-k}\, s_{\lambda + (k)^n}$, where $\lambda + (k)^n$ is given by adding $k$ to each of $\lambda_1, \ldots, \lambda_n$.) The complex conjugate of a character with data $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ has data $-\lambda_n \ge -\lambda_{n-1} \ge \cdots \ge -\lambda_1$. Combining the above information with Theorem 3.19, one obtains the following result.

Theorem 3.20. Let $g$ be chosen from the Haar measure of $U(n,\mathbb{C})$, where $n \ge 2$. Let $W(g) = \frac{1}{\sqrt{2}}\bigl[\mathrm{Tr}(g) + \overline{\mathrm{Tr}(g)}\bigr]$, where $\mathrm{Tr}$ denotes trace. Then for all real $x_0$,

$$\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{2}{n-1}.$$

Proof. One applies Theorem 3.19 with $\tau$ the $n$-dimensional defining representation. We take $\alpha$ to be an element of type $\{x_1, \ldots, x_n\}$ with $x_1 = \cdots = x_{n-2} = 1$, $x_{n-1} = e^{i\theta}$, and $x_n = e^{-i\theta}$. Then $\alpha$ and $\alpha^{-1}$ are conjugate and $a = \frac{2(1 - \cos(\theta))}{n}$. By the Pieri rule for multiplying Schur functions (p. 73 of [Mac]), the decomposition of the character of $(\tau + \bar{\tau})^2$ in terms of Schur functions is given by

$$\begin{aligned}
\bigl(s_{(1)} + \overline{s_{(1)}}\bigr)^2 &= s_{(1)} s_{(1)} + 2\,\frac{s_{(1)}\, s_{(1^{n-1})}}{x_1 \cdots x_n} + \overline{s_{(1)}}\;\overline{s_{(1)}} \\
&= s_{(2)} + s_{(1,1)} + 2\,\frac{s_{(1^n)} + s_{(2,1^{n-2})}}{x_1 \cdots x_n} + s_{(-2)} + s_{(-1,-1)} \\
&= 2 + s_{(2)} + s_{(1,1)} + s_{(-2)} + s_{(-1,-1)} + 2 s_{(1,0^{n-2},-1)}.
\end{aligned}$$

1192

J. Fulman

One then computes that in the θ → 0 limit, the first error term in Theorem 3.19 is equal to 1/4 √ 2 n 2 +2 2 ≤ n−1 . One computes that the second error term is equal to 12(2n−1)(1−cos(θ)) . n 2 −1 π(n 2 −1) The proof is completed by noting that this approaches 0 as θ → 0. 4. Compact Symmetric Spaces This section extends the methods of Sect. 3 to study the distribution of a fixed spherical function ωτ on a random element of a compact symmetric space G/K . Subsect. 4.1 gives general theory for the case that ωτ is real valued. This is illustrated for the sphere in Subsect. 4.2, giving a different perspective on a result of [DF and Me1]. We note that since compact Lie groups can be viewed as symmetric spaces (see Sect. 4.1), the examples in Subsects. 3.2, 3.3, and 3.4 give further examples. Our theorems should also prove useful for Jacobi-type ensembles arising from other root systems (see for instance [Vr]). Subsection 4.3 indicates the changes needed to treat spherical functions ωτ which are not real valued, and Subsects. 4.4 and 4.5 study the trace of elements from Dyson’s circular orthogonal and circular symplectic ensembles as special cases (the circular unitary ensemble is equivalent to U (n, C), so was already treated in Subsect. 3.6). Central limit theorems are known for the trace of an element from the circular ensembles (see [Ra,BF,CoSz]), but our approach gives an error term.

4.1. General theory (real case). To begin we recall some concepts about spherical functions of symmetric spaces. Standard references which contain more details are [He1], [He2], [Te], and [V]. Chapter 7 of [Mac] is also very helpful. A Riemannian manifold $X$ is said to be a symmetric space if the geodesic symmetry $\sigma : X \to X$ with center at any point $x_0$ is an isometry. Then $X$ can be identified with $G/K$, where $G$ is a connected transitive Lie group of isometries of $X$, and $K$ is a compact group which up to finite index is given by $K = \{g \in G : g x_0 = x_0\}$. A function $\omega_\phi \in L^2(G/K)$ is called spherical if $\omega_\phi(1) = 1$ and the following functional equation is satisfied:
\[
\int_K \omega_\phi(xky)\,dk = \omega_\phi(x)\,\omega_\phi(y) \qquad \forall x, y \in G .
\]

This equation implies that $\omega_\phi$ is $K$ bi-invariant (i.e. $\omega_\phi(k_1 g k_2) = \omega_\phi(g)$ for all $k_1, k_2$ in $K$), which justifies our writing $\omega_\phi(g)$ instead of $\omega_\phi(gK)$. One reason that spherical functions are important is that if $G/K$ is compact and $H_\phi$ is the $G$-invariant subspace of $L^2(G/K)$ generated by $\omega_\phi$, then $H_\phi$ is a finite dimensional irreducible representation of $G$ and $L^2(G/K)$ is a direct sum of all such $H_\phi$. We let $\dim(\phi)$ denote the dimension of $H_\phi$.

In reading this section it is useful to keep in mind that a compact Lie group $U$ can be viewed as a compact symmetric space. Indeed, one can take $G = U \times U$ and $K$ the diagonal subgroup of $U$; then $G/K$ is identified with $U$ under the mapping $(u_1, u_2)K \to u_1 u_2^{-1}$. The spherical functions $\omega_\phi$ of $G/K$ are indexed by irreducible representations $\phi$ of $U$ and are precisely the character ratios $\frac{\chi^\phi(u)}{\chi^\phi(1)}$; moreover, $\dim(H_\phi) = \chi^\phi(1)^2$.



Let $\omega_\tau$ be a non-trivial real valued spherical function of $G/K$. We are interested in the distribution of $\omega_\tau(g)$ (normalized to have variance 1). Here $gK$ is chosen from the "Haar measure" $\mu$ on $G/K$. This is induced from the Haar measure of $G$ using the projection map $G \to G/K$, and is invariant under the action of $G$. The following orthogonality relation will be used; see for instance p. 45 of [Kl] for a proof.

Lemma 4.1.
\[
\int_{G/K} \omega_\phi(g)\,\overline{\omega_\eta(g)} = \frac{\delta_{\phi,\eta}}{\dim(\phi)} .
\]

In particular, Lemma 4.1 implies that $W := [\dim(\tau)]^{1/2}\,\omega_\tau$ has mean 0 and variance 1.
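To spell out this remark (a short worked check, writing $\omega_0 \equiv 1$ for the trivial spherical function, with $\dim(0) = 1$): since $\omega_\tau$ is real valued and non-trivial, taking $(\phi, \eta) = (\tau, 0)$ and then $(\phi, \eta) = (\tau, \tau)$ in Lemma 4.1 gives
\[
E(W) = [\dim(\tau)]^{1/2} \int_{G/K} \omega_\tau(g)\,\omega_0(g) = 0,
\qquad
E(W^2) = \dim(\tau) \int_{G/K} \omega_\tau(g)^2 = \dim(\tau) \cdot \frac{1}{\dim(\tau)} = 1 .
\]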

The following lemma is immediate from the functional equation for $\omega_\phi$ and $K$ bi-invariance of $\omega_\phi$.

Lemma 4.2. Let $G/K$ be a compact symmetric space, and $\omega_\phi$ a spherical function of $G/K$. Then
\[
\int_{K \times K} \omega_\phi(k_1 \alpha k_2 g)\,dk_1\,dk_2 = \omega_\phi(\alpha)\,\omega_\phi(g)
\]
for all $\alpha, g \in G$.

We define the pair $(W, W')$ by letting $W = [\dim(\tau)]^{1/2}\,\omega_\tau(g)$, where $gK$ is from the "Haar measure" of $G/K$, and $W' = W(\alpha g)$, where $\alpha$ is chosen uniformly at random from a fixed double coset $K\alpha K \neq K$ which satisfies the property that $K\alpha K = K\alpha^{-1}K$. Since $K\alpha^{-1}K = (K\alpha K)^{-1}$, it follows that $(W, W')$ is exchangeable. Moreover the integral formula for spherical functions (p. 417 of [He2]) implies that all $\omega_\phi(\alpha)$ are real. The analysis of the exchangeable pair $(W, W')$ can be carried out exactly as in Subsect. 3.1, using Lemmas 4.1 and 4.2 instead of the orthogonality relations for compact Lie groups and Lemma 3.1. Hence we simply record the results.

Lemma 4.3. $E(W'|W) = \omega_\tau(\alpha)\,W$.

Lemma 4.4. $E(W'-W)^2 = 2(1 - \omega_\tau(\alpha))$.

In the statements of the remaining results, we define the "multiplicity" $m_\phi(\tau^r)$ by the expansion
\[
[\omega_\tau(g)]^r = \sum_{\phi} \left(\frac{\dim(\phi)}{\dim(\tau)^r}\right)^{1/2} m_\phi(\tau^r)\,\omega_\phi(g) .
\]

The numbers $m_\phi(\tau^r)$ are real and non-negative (argue as on p. 396 of [Mac] with sums replaced by integrals) but need not be integers. Note that by Lemma 4.1,
\[
m_\phi(\tau^r) = [\dim(\phi)\dim(\tau)^r]^{1/2} \int_{G/K} \omega_\tau(g)^r\,\overline{\omega_\phi(g)} .
\]


Lemma 4.5.
\[
E[(W')^2 \,|\, gK] = \sum_{\phi} m_\phi(\tau^2)\,[\dim(\phi)]^{1/2}\,\omega_\phi(\alpha)\,\omega_\phi(g),
\]
where the sum is over all spherical functions of $G/K$.

Lemma 4.6.
\[
Var(E[(W'-W)^2 \,|\, gK]) = \sum_{\phi}^{*} m_\phi(\tau^2)^2 \left(1 + \omega_\phi(\alpha) - 2\omega_\tau(\alpha)\right)^2,
\]
where the star signifies that the sum is over all nontrivial spherical functions of $G/K$.

Lemma 4.7. Let $k$ be a positive integer.

(1) $E(W'-W)^k = \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_{\phi} m_\phi(\tau^r)\,m_\phi(\tau^{k-r})\,\omega_\phi(\alpha)$.

(2) $E(W'-W)^4 = \sum_{\phi} m_\phi(\tau^2)^2 \left[ 8(1 - \omega_\tau(\alpha)) - 6(1 - \omega_\phi(\alpha)) \right]$.

Finally, one has the following central limit theorem.

Theorem 4.8. Let $G/K$ be a compact symmetric space and let $\omega_\tau$ be a non-trivial real-valued spherical function of $G/K$. Fix an element $\alpha \notin K$ such that $K\alpha K = K\alpha^{-1}K$. Let $W = [\dim(\tau)]^{1/2}\,\omega_\tau(g)$, where $gK$ is chosen from the "Haar measure" of $G/K$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right|
\le \sqrt{ \sum_{\phi}^{*} m_\phi(\tau^2)^2 \left[ 2 - \frac{1}{a}\left(1 - \omega_\phi(\alpha)\right) \right]^2 }
+ \left[ \frac{1}{\pi} \sum_{\phi} m_\phi(\tau^2)^2 \left[ 8 - \frac{6}{a}\left(1 - \omega_\phi(\alpha)\right) \right] \right]^{1/4} .
\]
Here $a = 1 - \omega_\tau(\alpha)$, the first sum is over all non-trivial spherical functions of $G/K$, and the second sum is over all spherical functions of $G/K$.

4.2. Example: the sphere. This subsection studies the unit sphere in $\mathbb{R}^n$, viewed as the symmetric space $SO(n, \mathbb{R})/SO(n-1, \mathbb{R})$. Chapter 9 of [V] is a good reference for the spherical functions of this symmetric space, and Chapter 4 of [DyM] is a very clear textbook treatment for the special case $n = 3$. Let $e_1, \dots, e_n$ be the standard basis of $\mathbb{R}^n$ and embed $SO(n-1, \mathbb{R})$ inside $SO(n, \mathbb{R})$ as the subgroup fixing $e_n$; then $K g_1 K = K g_2 K$ if and only if $g_1(e_n)$ and $g_2(e_n)$ have the same last coordinate. From this it is not difficult to check that $K g K = K g^{-1} K$ for all $g$, and that the double cosets of $SO(n-1, \mathbb{R})$ in $SO(n, \mathbb{R})$ are parameterized by $x_n$, the final coordinate of a point $(x_1, \dots, x_n)$ on the sphere. In what follows we let $x$ denote $x_n$.



From p. 461 of [V], the spherical functions $\omega_l$ are parameterized by integers $l \ge 0$ and satisfy
\[
\omega_l(x) = \frac{l!\,(n-3)!}{(l+n-3)!}\; C_l^{\frac{n-2}{2}}(x) .
\]
Here $C_l^{\rho}$ is the Gegenbauer polynomial, defined by the generating function
\[
\sum_{l \ge 0} C_l^{\rho}(x)\,t^l = (1 - 2xt + t^2)^{-\rho} .
\]
For instance,
\[
C_0^{\rho}(x) = 1, \qquad C_1^{\rho}(x) = 2\rho x, \qquad C_2^{\rho}(x) = -\rho + 2\rho(1+\rho)x^2
\]
and
\[
\omega_0(x) = 1, \qquad \omega_1(x) = x, \qquad \omega_2(x) = \frac{n x^2 - 1}{n-1} .
\]
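These low-degree formulas can be checked mechanically. The sketch below builds the Gegenbauer polynomials in exact rational arithmetic from the standard three-term recurrence $k\,C_k^{\rho}(x) = 2(k+\rho-1)\,x\,C_{k-1}^{\rho}(x) - (k+2\rho-2)\,C_{k-2}^{\rho}(x)$, used here in place of the generating function, and then normalizes as in the formula for $\omega_l$. The function names `gegenbauer` and `spherical` are illustrative, and the sketch assumes $n \ge 3$.

```python
from fractions import Fraction
from math import factorial


def gegenbauer(l, rho):
    """Coefficients (constant term first) of the Gegenbauer polynomial C_l^rho,
    computed from the three-term recurrence."""
    prev, cur = [Fraction(1)], [Fraction(0), 2 * rho]  # C_0 and C_1
    if l == 0:
        return prev
    for k in range(2, l + 1):
        # 2(k + rho - 1) * x * C_{k-1}: multiply by x by shifting coefficients
        shifted = [Fraction(0)] + [2 * (k + rho - 1) * c for c in cur]
        # (k + 2 rho - 2) * C_{k-2}, padded to the same length
        padded = [(k + 2 * rho - 2) * c for c in prev]
        padded += [Fraction(0)] * (len(shifted) - len(padded))
        prev, cur = cur, [(s - p) / k for s, p in zip(shifted, padded)]
    return cur


def spherical(l, n):
    """Coefficients of omega_l for the unit sphere in R^n (n >= 3)."""
    rho = Fraction(n - 2, 2)
    scale = Fraction(factorial(l) * factorial(n - 3), factorial(l + n - 3))
    return [scale * c for c in gegenbauer(l, rho)]


n = 5
print(spherical(1, n))       # coefficients of x
print(spherical(2, n))       # coefficients of (n x^2 - 1)/(n - 1)
print(sum(spherical(3, n)))  # omega_3(1), the value at x = 1
```

Summing the coefficient list evaluates the polynomial at $x = 1$, so the last line checks the normalization $\omega_l(1) = 1$.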

By p. 462 of [V], $\dim(l) = \frac{(2l+n-2)\,(n+l-3)!}{(n-2)!\;l!}$.

We study the random variable $W(x) = [\dim(1)]^{1/2}\,\omega_1 = \sqrt{n}\,x$. In fact sharp (up to constants) normal approximations for $W$ are known: see Diaconis and Freedman [DF] and also Meckes [Me1], who uses Stein's method to obtain an error term of $\frac{2\sqrt{3}}{n-1}$ in total variation distance. Our viewpoint leads to the following result.

Theorem 4.9. Let $W = \sqrt{n}\,x$, where $x$ is the last coordinate of a random point on the unit sphere in $\mathbb{R}^n$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{2\sqrt{2}}{n-1} .
\]

Proof. We apply Theorem 4.8 with $\tau = \omega_1$. Writing $\omega_1(x)^2$ as a linear combination of $\omega_0(x)$ and $\omega_2(x)$, one computes that $m_{(0)}(\tau^2) = 1$, $m_{(2)}(\tau^2) = \sqrt{\frac{2(n-1)}{n+2}}$, and that all other multiplicities in $\tau^2$ vanish. Letting $\alpha$ be less than one but close to 1, one computes that the first error term in Theorem 4.8 is at most $\frac{2\sqrt{2}}{n-1}$. Letting $\alpha$ tend to 1 from below, the second error term in Theorem 4.8 vanishes and the result follows.

4.3. General theory (complex case). In this subsection $G/K$ is a compact symmetric space and $\omega_\tau$ is a spherical function which is not real valued. The random variable we study is $W = \left(\frac{\dim(\tau)}{2}\right)^{1/2}\left[\omega_\tau(g) + \overline{\omega_\tau(g)}\right]$. As in Subsect. 4.1, let $W' = W(\alpha g)$, where $\alpha$ is chosen uniformly at random from a fixed double coset $K\alpha K \neq K$ which satisfies the property that $K\alpha K = K\alpha^{-1}K$. Then $(W, W')$ is exchangeable and all $\omega_\phi(\alpha)$ are real. The remaining results in this subsection extend those of Subsect. 3.5, and are proved by nearly identical arguments, using Lemmas 4.1 and 4.2.

Lemma 4.10. $E(W'|W) = \omega_\tau(\alpha)\,W$.


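Lemma 4.11. $E(W'-W)^2 = 2(1 - \omega_\tau(\alpha))$.

For the remaining results in this subsection, we define the "multiplicity" $m_\phi[(\tau+\bar\tau)^r]$ by the expansion
\[
(\omega_\tau + \overline{\omega_\tau})^r = \sum_{\phi} \left(\frac{\dim(\phi)}{\dim(\tau)^r}\right)^{1/2} m_\phi[(\tau+\bar\tau)^r]\,\omega_\phi .
\]

Arguing as on p. 396 of [Mac], one has that the numbers $m_\phi[(\tau+\bar\tau)^r]$ are real and non-negative (though not necessarily integers). Note that by Lemma 4.1,
\[
m_\phi[(\tau+\bar\tau)^r] = [\dim(\phi)\dim(\tau)^r]^{1/2} \int_{G/K} \left(\omega_\tau(g) + \overline{\omega_\tau(g)}\right)^r\,\overline{\omega_\phi(g)} .
\]

Lemma 4.12.
\[
E[(W')^2 \,|\, gK] = \frac{1}{2} \sum_{\phi} m_\phi[(\tau+\bar\tau)^2]\,\dim(\phi)^{1/2}\,\omega_\phi(\alpha)\,\omega_\phi(g),
\]
where the sum is over all spherical functions of $G/K$.

Lemma 4.13.
\[
Var(E[(W'-W)^2 \,|\, gK]) = \frac{1}{4} \sum_{\phi}^{*} m_\phi[(\tau+\bar\tau)^2]^2 \left(1 + \omega_\phi(\alpha) - 2\omega_\tau(\alpha)\right)^2,
\]
where the star signifies that the sum is over all nontrivial spherical functions of $G/K$.

Lemma 4.14. Let $k$ be a positive integer.

(1) $E(W'-W)^k$ is equal to
\[
\frac{1}{2^{k/2}} \sum_{r=0}^{k} (-1)^{k-r} \binom{k}{r} \sum_{\phi} m_\phi[(\tau+\bar\tau)^r]\,m_\phi[(\tau+\bar\tau)^{k-r}]\,\omega_\phi(\alpha) .
\]

(2) $E(W'-W)^4$ is equal to
\[
\sum_{\phi} m_\phi[(\tau+\bar\tau)^2]^2 \left[ 2(1 - \omega_\tau(\alpha)) - \frac{3}{2}(1 - \omega_\phi(\alpha)) \right] .
\]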
Putting the pieces together, one has the following central limit theorem.

Theorem 4.15. Let $G/K$ be a compact symmetric space and let $\omega_\tau$ be a spherical function of $G/K$ which is not real valued. Fix an element $\alpha \notin K$ such that $K\alpha K = K\alpha^{-1}K$. Let $W = \left(\frac{\dim(\tau)}{2}\right)^{1/2}\left[\omega_\tau(g) + \overline{\omega_\tau(g)}\right]$, where $gK$ is from the "Haar measure" of $G/K$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right|
\le \frac{1}{2} \sqrt{ \sum_{\phi}^{*} m_\phi[(\tau+\bar\tau)^2]^2 \left[ 2 - \frac{1}{a}\left(1 - \omega_\phi(\alpha)\right) \right]^2 }
+ \left[ \frac{1}{\pi} \sum_{\phi} m_\phi[(\tau+\bar\tau)^2]^2 \left[ 2 - \frac{3}{2a}\left(1 - \omega_\phi(\alpha)\right) \right] \right]^{1/4} .
\]
Here $a = 1 - \omega_\tau(\alpha)$, the first sum is over all non-trivial spherical functions of $G/K$, and the second sum is over all spherical functions of $G/K$.

4.4. Example. $U(n, \mathbb{C})/O(n, \mathbb{R})$. This symmetric space can be identified with the set of symmetric unitary matrices by the map $g \to g g^T$ (see [Dn] or [Fo] for details), and the resulting matrix ensemble is known as Dyson's circular orthogonal ensemble. For a thorough discussion of this ensemble, see the texts [Fo] or [Mt]. In particular, it is known that if a function $f$ depends only on the eigenvalues $x_1, \dots, x_n$ of a matrix from Dyson's ensemble, then
\[
\int_{G/K} f = \frac{[\Gamma(3/2)]^n}{\Gamma(\frac{n}{2}+1)} \int_{T^n} f(x_1, \dots, x_n) \prod_{1 \le i < j \le n} |x_i - x_j| \prod_{k=1}^{n} \frac{dx_k}{2\pi} .
\]
In this integral, $T^n$ is the $n$-dimensional torus with coordinates $x_1, \dots, x_n$, $x_i \in \mathbb{C}$, $|x_i| = 1$.

It is convenient to let the inner product $\langle f, g \rangle$ of two functions be defined by
\[
\langle f, g \rangle = \frac{[\Gamma(3/2)]^n}{\Gamma(\frac{n}{2}+1)} \int_{T^n} f(x_1, \dots, x_n)\,\overline{g(x_1, \dots, x_n)} \prod_{1 \le i < j \le n} |x_i - x_j| \prod_{k=1}^{n} \frac{dx_k}{2\pi} .
\]

The spherical functions for this symmetric space are parameterized by integer sequences $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and are $\omega_\lambda := \frac{P_\lambda(x_1, \dots, x_n; 2)}{P_\lambda(1, \dots, 1; 2)}$, the normalized Jack polynomials with parameter 2. An excellent reference for Jack polynomials is Sect. 6.10 of [Mac]. There one assumes that $\lambda_n \ge 0$, so if $\lambda_n = -k < 0$, $P_\lambda$ should be interpreted as $(x_1 \cdots x_n)^{-k} P_{\lambda+(k)^n}$, where $\lambda + (k)^n$ is given by adding $k$ to each of $\lambda_1, \dots, \lambda_n$.

To describe some useful combinatorial properties of Jack polynomials, we use the notation that if $\lambda$ is a partition and $s$ is a box of $\lambda$, then $l'(s)$, $l(s)$, $a(s)$, $a'(s)$ are respectively the number of squares in the diagram of $\lambda$ to the north of $s$ (in the same column), south of $s$ (in the same column), east of $s$ (in the same row), and west of $s$ (in the same row). For example, a box $s$ lying in the second row and second column of a diagram, with two boxes to its south and three boxes to its east, would have $l'(s) = 1$, $l(s) = 2$, $a'(s) = 1$, and $a(s) = 3$.
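The four box statistics are simple to compute from the row lengths of $\lambda$. The sketch below is a minimal illustration; the function name `arm_leg` and the partition $(5, 5, 3, 2)$ are hypothetical choices made here (the partition is merely one diagram in which the box in the second row and second column has the statistics quoted above), not taken from the text.

```python
def arm_leg(partition, row, col):
    """Box statistics for the box in the given row and column (both 0-indexed)
    of a partition given as a weakly decreasing list of row lengths."""
    if not (0 <= row < len(partition) and 0 <= col < partition[row]):
        raise ValueError("box is not in the diagram")
    column_height = sum(1 for part in partition if part > col)
    return {
        "l_prime": row,                   # boxes to the north (same column)
        "l": column_height - row - 1,     # boxes to the south (same column)
        "a": partition[row] - col - 1,    # boxes to the east (same row)
        "a_prime": col,                   # boxes to the west (same row)
    }


# Hypothetical example: second row, second column of the partition (5, 5, 3, 2)
print(arm_leg([5, 5, 3, 2], 1, 1))
```

Rows and columns are 0-indexed in the code, so the box in the second row and second column is `(1, 1)`.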


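Letting $\lambda$ be a partition of $n$, and using this notation, two useful formulas are the "principal specialization formula" (p. 381 of [Mac])
\[
P_\lambda(1, \dots, 1; 2) = \prod_{s \in \lambda} \frac{n + 2a'(s) - l'(s)}{2a(s) + l(s) + 1},
\]
and the formula
\[
\dim(\lambda) = \prod_{s \in \lambda} \frac{\left(n + 1 + 2a'(s) - l'(s)\right)\left(n + 2a'(s) - l'(s)\right)}{\left(2a(s) + l(s) + 2\right)\left(2a(s) + l(s) + 1\right)},
\]
which follows from the formula for $\langle P_\lambda, P_\lambda \rangle$ on p. 383 of [Mac] and the fact (Lemma 4.1) that
\[
\dim(\lambda) = \frac{P_\lambda(1, \dots, 1; 2)^2}{\langle P_\lambda, P_\lambda \rangle} .
\]

Theorem 4.16. Let $W = \frac{1}{2}\sqrt{1 + \frac{1}{n}}\,\left[Tr(g) + \overline{Tr(g)}\right]$, where $g$ is random from Dyson's circular orthogonal ensemble, $Tr$ denotes trace, and $n \ge 2$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{4}{n} .
\]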
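The two product formulas above are easy to evaluate in exact arithmetic, which also explains the constant in Theorem 4.16: with $\tau = \omega_{(1)} = Tr(g)/n$ and $\dim((1)) = n(n+1)/2$, the general normalization $(\dim(\tau)/2)^{1/2}$ of Theorem 4.15 becomes $\frac{1}{2}\sqrt{1 + 1/n}$ times the trace. The sketch below assumes the parameter-2 formulas exactly as stated above; the function names are illustrative.

```python
from fractions import Fraction
from math import isclose, sqrt


def box_statistics(partition):
    """Yield (a, a', l, l') for every box of a partition given as a weakly
    decreasing list of row lengths."""
    for i, row_len in enumerate(partition):
        for j in range(row_len):
            arm = row_len - j - 1                                    # east
            leg = sum(1 for part in partition if part > j) - i - 1   # south
            yield arm, j, leg, i


def principal_specialization(partition, n):
    """P_lambda(1, ..., 1; 2) in n variables, from the product formula."""
    out = Fraction(1)
    for a, ap, l, lp in box_statistics(partition):
        out *= Fraction(n + 2 * ap - lp, 2 * a + l + 1)
    return out


def dimension(partition, n):
    """dim(lambda) for Jack parameter 2, from the product formula."""
    out = Fraction(1)
    for a, ap, l, lp in box_statistics(partition):
        out *= Fraction((n + 1 + 2 * ap - lp) * (n + 2 * ap - lp),
                        (2 * a + l + 2) * (2 * a + l + 1))
    return out


n = 4
print(principal_specialization([1], n), dimension([1], n))
# Normalization of W: sqrt(dim/2) applied to Tr(g)/n versus (1/2) sqrt(1 + 1/n)
print(sqrt(dimension([1], n) / 2) / n, 0.5 * sqrt(1 + 1 / n))
```

As a consistency check, $P_{(2)}(1, \dots, 1; 2)$ computed this way agrees with evaluating $m_{(2)} + \frac{2}{3} m_{(1,1)}$ at $x_1 = \cdots = x_n = 1$, namely $n(n+2)/3$.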

Proof. Apply Theorem 4.15 to the spherical function $\tau = \omega_{(1)}(g) = \frac{Tr(g)}{n}$. To compute $m_\phi[(\tau+\bar\tau)^2]$ for all $\phi$, one has to decompose $(\omega_{(1)} + \overline{\omega_{(1)}})^2$ into spherical functions, which is equivalent to decomposing $(P_{(1)} + \overline{P_{(1)}})^2$ in terms of Jack polynomials. From the Pieri rule for Jack polynomials ([Mac], p. 340), one calculates that
\[
\begin{aligned}
(P_{(1)} + \overline{P_{(1)}})^2
&= P_{(1)} P_{(1)} + 2\,\frac{P_{(1)} P_{(1^{n-1})}}{x_1 \cdots x_n} + \overline{P_{(1)}}\;\overline{P_{(1)}} \\
&= P_{(2)} + \frac{4}{3} P_{(1^2)} + 2\,\frac{\frac{2n}{n+1} P_{(1^n)} + P_{(2,1^{n-2})}}{x_1 \cdots x_n} + \overline{P_{(2)}} + \frac{4}{3}\,\overline{P_{(1^2)}} \\
&= \frac{4n}{n+1} + P_{(2)} + \frac{4}{3} P_{(1^2)} + 2 P_{(1,0^{n-2},-1)} + \overline{P_{(2)}} + \frac{4}{3}\,\overline{P_{(1^2)}} .
\end{aligned}
\]

We choose $\alpha$ to be an element of type $(1, \dots, 1, e^{i\theta}, e^{-i\theta})$. Then $K\alpha K = K\alpha^{-1}K$ and $a = \frac{2(1-\cos(\theta))}{n}$. One computes that in the $\theta \to 0$ limit, the first error term of Theorem 4.15 is $\frac{1}{n}\sqrt{\frac{8(n^3+2n^2+5n+6)}{n^3+4n^2+n-6}} \le \frac{4}{n}$. The proof is completed by computing that the second error term is $\left[\frac{24(n+1)^2(1-\cos(\theta))}{\pi n^2 (n+3)}\right]^{1/4}$, which goes to 0 as $\theta \to 0$.

4.5. Example. $U(2n, \mathbb{C})/USp(2n, \mathbb{C})$. This symmetric space corresponds to Dyson's circular symplectic ensemble ([Dn]); see [Fo] or [Mt] for background on this ensemble. In particular, it is known that if $f$ depends only on the eigenvalues $x_1, \dots, x_n$ of a matrix from this ensemble, then
\[
\int_{G/K} f = \frac{2^n}{(2n)!} \int_{T^n} f(x_1, \dots, x_n) \prod_{1 \le i < j \le n} |x_i - x_j|^4 \prod_{k=1}^{n} \frac{dx_k}{2\pi},
\]



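where $T^n$ is as in the previous example. We let the inner product $\langle f, g \rangle$ of two functions be defined by
\[
\langle f, g \rangle = \frac{2^n}{(2n)!} \int_{T^n} f(x_1, \dots, x_n)\,\overline{g(x_1, \dots, x_n)} \prod_{1 \le i < j \le n} |x_i - x_j|^4 \prod_{k=1}^{n} \frac{dx_k}{2\pi} .
\]

The spherical functions for this symmetric space are parameterized by integer sequences $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and are $\omega_\lambda := \frac{P_\lambda(x_1, \dots, x_n; \frac{1}{2})}{P_\lambda(1, \dots, 1; \frac{1}{2})}$, the normalized Jack polynomials with parameter 1/2. As mentioned in the previous example, Jack polynomials are usually defined assuming that $\lambda_n \ge 0$, so if $\lambda_n = -k < 0$, $P_\lambda$ should be interpreted as $(x_1 \cdots x_n)^{-k} P_{\lambda+(k)^n}$, where $\lambda + (k)^n$ is given by adding $k$ to each of $\lambda_1, \dots, \lambda_n$.

Letting $\lambda$ be a partition of $n$ and using the notation of the previous example, two useful formulas are the "principal specialization formula" (p. 381 of [Mac])
\[
P_\lambda(1, \dots, 1; \tfrac{1}{2}) = \prod_{s \in \lambda} \frac{n + \frac{a'(s)}{2} - l'(s)}{\frac{a(s)}{2} + l(s) + 1},
\]
and the formula
\[
\dim(H_\lambda) = \prod_{s \in \lambda} \frac{\left(n + \frac{a'(s)}{2} - l'(s)\right)\left(2n - 1 + a'(s) - 2l'(s)\right)}{\left(\frac{a(s)}{2} + l(s) + 1\right)\left(a(s) + 2l(s) + 1\right)},
\]
which follows from the formula for $\langle P_\lambda, P_\lambda \rangle$ on p. 383 of [Mac] and the fact (Lemma 4.1) that
\[
\dim(\lambda) = \frac{P_\lambda(1, \dots, 1; \frac{1}{2})^2}{\langle P_\lambda, P_\lambda \rangle} .
\]

Theorem 4.17. Let $W = \sqrt{1 - \frac{1}{2n}}\,\left[Tr(g) + \overline{Tr(g)}\right]$, where $g$ is random from Dyson's circular symplectic ensemble and $n \ge 2$. Then for all real $x_0$,
\[
\left| P(W \le x_0) - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_0} e^{-x^2/2}\,dx \right| \le \frac{4}{n} .
\]

Proof. Apply Theorem 4.15 to the spherical function $\tau = \omega_{(1)}(g) = \frac{Tr(g)}{n}$. To compute $m_\phi[(\tau+\bar\tau)^2]$ for all $\phi$, one has to decompose $(\omega_{(1)} + \overline{\omega_{(1)}})^2$ into spherical functions, which is equivalent to decomposing $(P_{(1)} + \overline{P_{(1)}})^2$ in terms of Jack polynomials. From the Pieri rule for Jack polynomials ([Mac], p. 340), one calculates that $(P_{(1)} + \overline{P_{(1)}})^2$ is equal to
\[
\begin{aligned}
& P_{(1)} P_{(1)} + 2\,\frac{P_{(1)} P_{(1^{n-1})}}{x_1 \cdots x_n} + \overline{P_{(1)}}\;\overline{P_{(1)}} \\
&= P_{(2)} + \frac{2}{3} P_{(1^2)} + 2\,\frac{\frac{n}{2n-1} P_{(1^n)} + P_{(2,1^{n-2})}}{x_1 \cdots x_n} + \frac{2}{3}\,\overline{P_{(1^2)}} + \overline{P_{(2)}} \\
&= \frac{2n}{2n-1} + P_{(2)} + \frac{2}{3} P_{(1^2)} + 2 P_{(1,0^{n-2},-1)} + \frac{2}{3}\,\overline{P_{(1^2)}} + \overline{P_{(2)}} .
\end{aligned}
\]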

Now take $\alpha$ to be an element of type $(1, \dots, 1, e^{i\theta}, e^{-i\theta})$; then $K\alpha K = K\alpha^{-1}K$ and $a = \frac{2(1-\cos(\theta))}{n}$. One computes that in the $\theta \to 0$ limit, the first error term of Theorem 4.15 is $\frac{1}{2n}\sqrt{\frac{8(4n^3-4n^2+5n-3)}{4n^3-8n^2+n+3}} \le \frac{4}{n}$. The second error term is computed to be $\left[\frac{6(2n-1)(4n-5)(1-\cos(\theta))}{\pi n^2 (2n-3)}\right]^{1/4}$, which goes to 0 as $\theta \to 0$.

References

[BF] Baker, T., Forrester, P.: Finite-n fluctuation formulas for random matrices. J. Stat. Phys. 88, 1371–1386 (1997)
[Bu] Bump, D.: Lie Groups. Graduate Texts in Mathematics 225. New York: Springer-Verlag, 2004
[CFR] Chatterjee, S., Fulman, J., Röllin, A.: Exponential approximation by Stein's method and spectral graph theory. http://arXiv.org/list/math.PR/0605552, 2008
[CoSz] Collins, B., Stolz, M.: Borel theorems for random matrices from the classical compact symmetric spaces. Ann. Probab. 36, 876–895 (2008)
[DDN] D'Aristotile, A., Diaconis, P., Newman, C.: Brownian motion and the classical groups. In: Probability, statistics and their applications: papers in honor of Rabi Bhattacharya. IMS Lecture Notes Ser. 41, 2003, pp. 97–116
[DE] Diaconis, P., Evans, S.: Linear functionals of eigenvalues of random matrices. Transac. Amer. Math. Soc. 353, 2615–2633 (2001)
[DF] Diaconis, P., Freedman, D.: A dozen de Finetti-style results in search of a theory. Ann. Inst. H. Poincaré Probab. Statist. 23, 397–423 (1987)
[DS] Diaconis, P., Shahshahani, M.: On the eigenvalues of random matrices. J. Appl. Probab. 31, 49–62 (1994)
[Dn] Dueñez, E.: Random matrix ensembles associated to compact symmetric spaces. Commun. Math. Phys. 244, 29–61 (2004)
[Du] Durrett, R.: Probability: Theory and examples. Second edition. Belmont, CA: Duxbury Press, 1996
[DyM] Dym, H., McKean, H.: Fourier series and integrals. New York - London: Academic Press, 1972
[Fo] Forrester, P.: Log-gases and random matrices. Book in preparation. Available at http://www.ms.unimelb.edu.au/~matpjf/matpjf.html, 2005
[Fu] Fulman, J.: Stein's method and random character ratios. Transac. Amer. Math. Soc. 360, 3687–3730 (2008)
[GoT] Götze, F., Tikhomirov, A.N.: Limit theorems for spectra of random matrices with martingale structure. Theory Probab. Appl. 51, 42–64 (2007)
[He1] Helgason, S.: Differential geometry, Lie groups, and symmetric spaces. San Diego, CA: Academic Press, 1978
[He2] Helgason, S.: Groups and geometric analysis. Corrected reprint of the 1984 original. Providence, RI: Amer. Math. Soc., 2000
[J] Johansson, K.: On random matrices from the compact classical groups. Ann. of Math. 145, 519–545 (1997)
[Ka] Katz, N.: Exponential sums and differential equations. Ann. Math. Studies 124. Princeton, NJ: Princeton University Press, 1990
[KaS] Katz, N., Sarnak, P.: Zeroes of zeta functions and symmetry. Bull. Amer. Math. Soc. 36, 1–26 (1999)
[KLR] Keating, J.P., Linden, N., Rudnick, Z.: Random matrix theory, the exceptional Lie groups and L-functions. J. Phys. A 36, 2933–2944 (2003)
[Kl] Klyachko, A.: Random walks on symmetric spaces and inequalities for matrix spectra. Lin. Alg. Applic. 319, 37–59 (2000)
[Lu] Luk, H.M.: Stein's method for the gamma distribution and related statistical applications. University of Southern California Ph.D. thesis, 1994
[Mac] Macdonald, I.: Symmetric functions and Hall polynomials. Second edition. New York: Oxford University Press, 1995
[Mn] Mann, B.: Stein's method for $\chi^2$ of a multinomial. Unpublished manuscript, 1997
[Me1] Meckes, E.: On the approximate normality of eigenfunctions of the Laplacian. http://arXiv.org/abs/0705.1342V1[math.SP], (2007), to appear in Transac. Amer. Math. Soc.
[Me2] Meckes, E.: Linear functions on the classical matrix groups. Transac. Amer. Math. Soc. 360, 5355–5366 (2008)
[Mt] Mehta, M.: Random matrices. Third edition. Amsterdam: Elsevier/Academic Press, 2004
[PV] Pastur, L., Vasilchuk, V.: On the moments of traces of matrices of classical groups. Commun. Math. Phys. 252, 149–166 (2004)
[Pr] Proctor, R.: A Schensted algorithm which models tensor representations of the orthogonal group. Canad. J. Math. 42, 28–49 (1990)
[Ra] Rains, E.: Topics in probability on compact Lie groups. Harvard University Ph.D. thesis, 1997
[Re] Reinert, G.: Three general approaches to Stein's method. In: An introduction to Stein's method, Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap., vol. 4, 2005
[RR] Rinott, Y., Rotar, V.: Normal approximations by Stein's method. Decis. Econ. Finance 23, 15–29 (2000)
[Rl] Röllin, A.: A note on the exchangeability condition in Stein's method. http://arXiv.org/list/math.PR/0611050, 2006
[So] Soshnikov, A.: The central limit theorem for local linear statistics in classical compact groups and related combinatorial identities. Ann. Probab. 28, 1353–1370 (2000)
[St1] Stein, C.: Approximate computation of expectations. Institute of Mathematical Statistics Lecture Notes, vol. 7, 1986
[St2] Stein, C.: The accuracy of the normal approximation to the distribution of the traces of powers of random orthogonal matrices. Stanford University Statistics Department technical report no. 470, March 1995
[Su] Sundaram, S.: Tableaux in the representation theory of compact Lie groups. In: Invariant theory and tableaux. IMA Volumes in Mathematics, vol. 19, 1990, pp. 191–225
[Te] Terras, A.: Harmonic analysis on symmetric spaces and applications. Volumes I, II. N.Y.: Springer Verlag, 1985, 1988
[V] Vilenkin, N.J.: Special functions and the theory of group representations. Translations of Mathematics Monographs, Volume 2. Providence, RI: Amer. Math. Soc., 1968
[Vr] Vretare, L.: Formulas for elementary spherical functions and generalized Jacobi polynomials. SIAM J. Math. Anal. 15, 805–833 (1984)
[W] Weyl, H.: The classical groups. Fifteenth printing. Princeton, NJ: Princeton University Press, 1997

Communicated by S. Zelditch
