
Journal of Functional Analysis 256 (2009) 1311–1340 www.elsevier.com/locate/jfa The Elliott conjecture for Villadsen al...

Author: A. Connes | D. Stroock (Chief Editors)


Journal of Functional Analysis 256 (2009) 1311–1340 www.elsevier.com/locate/jfa

The Elliott conjecture for Villadsen algebras of the first type ✩

Andrew S. Toms a, Wilhelm Winter b,∗

a Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada M3J 1P3
b Mathematisches Institut der Universität Münster, Einsteinstr. 62, D-48149 Münster, Germany

Received 29 August 2007; accepted 8 December 2008
Available online 18 January 2009
Communicated by Dan Voiculescu

Abstract

We study the class of simple C*-algebras introduced by Villadsen in his pioneering work on perforated ordered K-theory. We establish six equivalent characterisations of the proper subclass which satisfies the strong form of Elliott’s classification conjecture: two C*-algebraic (Z-stability and approximate divisibility), one K-theoretic (strict comparison of positive elements), and three topological (finite decomposition rank, slow dimension growth, and bounded dimension growth). The equivalence of Z-stability and strict comparison constitutes a stably finite version of Kirchberg’s characterisation of purely infinite C*-algebras. The other equivalences confirm, for Villadsen’s algebras, heretofore conjectural relationships between various notions of good behaviour for nuclear C*-algebras.

Crown Copyright © 2008 Published by Elsevier Inc. All rights reserved.

Keywords: Nuclear C*-algebras; Classification

✩ Research partly supported by: Deutsche Forschungsgemeinschaft (through the SFB 478), EU-Network Quantum Spaces – Noncommutative Geometry (Contract No. HPRN-CT-2002-00280), and an NSERC Discovery Grant.
* Corresponding author. Current address: School of Mathematical Sciences, University of Nottingham, NG7 2RD Nottingham, United Kingdom.
E-mail addresses: [email protected] (A.S. Toms), [email protected] (W. Winter).

0022-1236/$ – see front matter Crown Copyright © 2008 Published by Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.015

1. Introduction

The classification theory of norm-separable C*-algebras began with Glimm’s study of UHF algebras in 1960 [13], and was expanded by Bratteli’s 1972 classification of approximately finite-dimensional (AF) algebras via certain directed graphs [4]. It was with the work of Elliott, however, that the theory grew exponentially. His classification of both AF and AT algebras of real rank zero via their scaled ordered K-theory suggested a deep truth about the structure of separable and nuclear C*-algebras [6,7]. He articulated this idea in the late 1980s, and formalised it in his 1994 ICM address [9]: simple, separable, and nuclear C*-algebras should be classified up to ∗-isomorphism by their topological K-theory and traces. This prediction came to be known as the Elliott conjecture.

The 1990s and early 2000s saw Elliott’s conjecture confirmed in remarkable generality, cf. [23]. Kirchberg and Phillips established it for purely infinite C*-algebras satisfying the Universal Coefficient Theorem [17,21], and Lin did the same for his C*-algebras of tracial topological rank zero [19]. Elliott, Gong, and Li confirmed the conjecture for unital approximately homogeneous (AH) algebras of bounded dimension growth [12]. These results cover many natural examples of C*-algebras, including those arising from certain graphs, dynamical systems, and shift spaces.

In the midst of these successes, Villadsen produced a strange thing: a simple, separable, and nuclear C*-algebra whose ordered $K_0$-group was perforated, i.e., contained a non-positive element $x$ such that $nx$ was positive and non-zero for some $n \in \mathbb{N}$ [33]. (This answered a longstanding question of Blackadar concerning the comparison theory of projections in C*-algebras.) The techniques used by Villadsen to study the K-theory of his algebra were drawn from differential topology, and it took time for the functional analysts of the classification community to digest them. Then, in 2002, Rørdam found a way to adapt Villadsen’s techniques to construct the first counterexample to the Elliott conjecture [24]. Other counterexamples followed [26,27].

The success of Elliott’s conjecture, however, is no accident.
It is a deep and fascinating phenomenon, and one must ask whether there is a regularity property lurking in those algebras for which the Elliott conjecture is confirmed. Various candidates exist: stability under tensoring with the Jiang–Su algebra Z, finite decomposition rank, and, for approximately subhomogeneous (ASH) algebras, the notion of strict slow dimension growth. The first property, known as Z-stability, is perhaps the most natural candidate, since tensoring with Z does not affect K-theory or traces of a simple unital C*-algebra with weakly unperforated $K_0$-group. Elliott’s conjecture thus predicts that all such algebras will be Z-stable. It is this very prediction which forms the basis for the counterexamples of Rørdam and the first named author: one produces pairs of simple unital C*-algebras with weakly unperforated $K_0$-groups, one of which is not Z-stable. These examples have legitimised the assumption of Z-stability in Elliott’s classification program, leading to the wide-ranging classification theorem of the second named author for ASH algebras of real rank zero [37,38].

The problem with Z-stability in relation to Elliott’s classification program is that its ability to characterise those algebras which are amenable to classification is an article of faith. In all cases where Z-stability is sufficient for classification (e.g., simple unital ASH algebras of real rank zero), it may also be automatic; when it is known to be necessary for classification (e.g., AH algebras), it is not known to suffice. In this paper we prove that Z-stability does characterise those algebras which satisfy Elliott’s conjecture in an ambient class where the assumption of Z-stability is truly necessary. The class considered is at once substantial and the natural starting point for establishing such a characterisation: Villadsen’s algebras.

In fact we will prove much more. Z-stability is not only the hoped-for necessary and sufficient condition for classification, but is furthermore equivalent to a topological condition (finite decomposition rank) and to a K-theoretic condition (strict comparison of positive elements). These three conditions, all of which make sense for an arbitrary nuclear C*-algebra, are equivalent to three further conditions which



are to varying extents native to the class of algebras we consider: approximate divisibility, slow dimension growth, and bounded dimension growth.

Some comments on our characterisations are in order. Nuclear C*-algebras can be viewed from several angles. They are evidently analytic objects, but can be seen as ordered algebraic objects through their K-theory, or as topological objects via the decomposition rank of Kirchberg and the second named author. Our main result says that from each of these viewpoints, there is a natural way to characterise those C*-algebras which satisfy the Elliott conjecture. The equivalence of Z-stability, approximate divisibility, finite decomposition rank, slow dimension growth, and bounded dimension growth is a satisfying confirmation of the expectations of experts. The equivalence of these conditions with strict comparison of positive elements, however, is unexpected and exciting for several reasons. First, the very idea of there being a K-theoretic characterisation of those algebras which will satisfy Elliott’s conjecture is new. Second, it is a condition that can be verified for large classes of examples generally suspected to be amenable to classification [25,29]. Third, and most remarkably, this equivalence is a stably finite version of Kirchberg’s celebrated characterisation of purely infinite C*-algebras.

Our paper is organised as follows: in Section 2 we recall the definitions of the regularity properties which appear in our main result; Section 3 introduces Villadsen algebras of the first type, and states our main result; Sections 4–7 contain the proof of the main result; Section 8 gives some examples of non-Z-stable Villadsen algebras.

2. Preliminaries and notation

2.1. AH algebras and dimension growth

Below we recall the concepts of (separable unital) AH algebras and their dimension growth.

Definition 2.1. A separable unital C*-algebra $A$ is called approximately homogeneous, or AH, if it can be written as an inductive limit
\[ A = \lim_{i \to \infty} (A_i, \phi_i) \]
where each $A_i$ is a C*-algebra of the form
\[ A_i = \bigoplus_{j=1}^{m_i} p_{i,j} \bigl( C(X_{i,j}) \otimes M_{r_{i,j}} \bigr) p_{i,j} \]
for natural numbers $m_i$ and $r_{i,j}$, compact metrisable spaces $X_{i,j}$ and projections $p_{i,j} \in C(X_{i,j}) \otimes M_{r_{i,j}}$. We refer to the inductive system $(A_i, \phi_i)_{i \in \mathbb{N}}$ as an AH decomposition for $A$. We say the AH decomposition $(A_i, \phi_i)_{i \in \mathbb{N}}$ has slow dimension growth if
\[ \lim_{i \to \infty} \max_{j = 1, \dots, m_i} \frac{\dim X_{i,j}}{\operatorname{rank} p_{i,j}} = 0; \]
it has very slow dimension growth if
\[ \lim_{i \to \infty} \max_{j = 1, \dots, m_i} \frac{(\dim X_{i,j})^3}{\operatorname{rank} p_{i,j}} = 0; \]
and it has bounded dimension growth if
\[ \sup_{i \in \mathbb{N}} \max_{j = 1, \dots, m_i} \dim X_{i,j} = d < \infty. \]
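The three growth conditions on a decomposition compare the same stage-wise data, so they can be tabulated side by side. The following is a minimal sketch, assuming each stage is recorded as a list of hypothetical `(dim X_ij, rank p_ij)` pairs; the function name `growth_profile` and the sample data are illustrative only, and a finite truncation can of course only suggest, not prove, the behaviour of the limits.

```python
from fractions import Fraction

def growth_profile(stages):
    """Return, per stage i, the triple
    (max_j dim/rank, max_j dim^3/rank, max_j dim).

    `stages` is a list of stages; each stage is a list of
    (dim X_ij, rank p_ij) pairs, one pair per direct summand.
    """
    profile = []
    for summands in stages:
        slow = max(Fraction(d, r) for d, r in summands)
        very_slow = max(Fraction(d ** 3, r) for d, r in summands)
        largest_dim = max(d for d, _ in summands)
        profile.append((slow, very_slow, largest_dim))
    return profile

# Hypothetical data: one summand per stage, dim X_i = i and rank p_i = i^2.
stages = [[(i, i * i)] for i in range(1, 6)]
for i, (slow, very_slow, largest_dim) in enumerate(growth_profile(stages), start=1):
    print(i, slow, very_slow, largest_dim)
```

For this sample data the first ratio is 1/i, the second is i, and the dimensions are unbounded, so the truncation is consistent with slow dimension growth of this particular decomposition but with neither of the two stronger conditions.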

The AH algebra $A$ has slow (very slow or bounded, respectively) dimension growth if it has an AH decomposition which has slow (very slow or bounded, respectively) dimension growth.

Remark 2.2. Slow dimension growth is obviously entailed by very slow dimension growth. Moreover, it is easy to see that if $A$ is simple, then bounded dimension growth implies very slow dimension growth. One of the remarkable results of [15] says that, for simple AH algebras, very slow dimension growth also implies bounded dimension growth.

2.2. Approximate divisibility and the Jiang–Su algebra

Let $p$, $q$ and $n$ be natural numbers with $p$ and $q$ dividing $n$. C*-algebras of the form
\[ I[p, n, q] = \bigl\{ f \in M_n(C([0,1])) \bigm| f(0) = 1_{n/p} \otimes a,\ f(1) = b \otimes 1_{n/q},\ a \in M_p,\ b \in M_q \bigr\} \]
are commonly referred to as dimension drop intervals. If $n = pq$ and $\gcd(p, q) = 1$, then the dimension drop interval is said to be prime. In [16], Jiang and Su construct a C*-algebra Z, which is the unique simple unital inductive limit of dimension drop intervals having $K_0 = \mathbb{Z}$, $K_1 = 0$ and a unique normalised trace. It is a limit of prime dimension drop intervals where the matrix dimensions tend to infinity, and there is a unital embedding of any prime dimension drop interval into Z. Jiang and Su show that Z is strongly self-absorbing in the sense of [31]. A C*-algebra $A$ is said to be Z-stable if it is isomorphic to $A \otimes Z$ (since Z is nuclear, there is no need to specify which tensor product we use). It was shown in [16] and [32] that all classes of simple C*-algebras for which the Elliott conjecture has been verified so far consist of Z-stable C*-algebras.

Using semiprojectivity of prime dimension drop intervals, it is not too hard to see that a separable unital C*-algebra $A$ is Z-stable if and only if the following holds (cf. [32]): for any $n \in \mathbb{N}$ there is a sequence of unital completely positive contractions $\phi_i : M_n \oplus M_{n+1} \to A$ such that the restrictions of $\phi_i$ to $M_n$ and $M_{n+1}$ both preserve orthogonality (i.e., have order zero in the sense of [35]) and such that $[\phi_i((x, 0)), \phi_i((0, y))] \to 0$ and $[\phi_i((x, y)), a] \to 0$ as $i$ goes to infinity for every $x \in M_n$, $y \in M_{n+1}$ and $a \in A$. This characterisation shows that Z-stability generalises the concept of approximate divisibility: following [3], we say a separable unital C*-algebra $A$ is approximately divisible if for any $n \in \mathbb{N}$ there is a sequence of unital ∗-homomorphisms $\phi_i : M_n \oplus M_{n+1} \to A$ such that $[\phi_i(x), a] \to 0$ as $i$ goes to infinity for every $x \in M_n \oplus M_{n+1}$ and $a \in A$. It was shown in [32] that approximate divisibility indeed implies Z-stability. The converse cannot hold in general (approximate divisibility asks for the existence of an abundance of projections). However, using the classification result of [12], it was shown in [11] that simple AH algebras of bounded dimension growth are approximately divisible.
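The arithmetic conditions on the triple $(p, n, q)$ in the definition of a dimension drop interval are easy to check mechanically. The sketch below only encodes the two conditions stated above ($p$ and $q$ divide $n$; primeness means $n = pq$ with $\gcd(p, q) = 1$); the function names are hypothetical and nothing about the algebras themselves is computed.

```python
from math import gcd

def is_dimension_drop_triple(p, n, q):
    """True if p and q are natural numbers dividing n, so that I[p, n, q] is defined."""
    return p > 0 and q > 0 and n > 0 and n % p == 0 and n % q == 0

def is_prime_dimension_drop(p, n, q):
    """True if I[p, n, q] is a prime dimension drop interval: n = pq and gcd(p, q) = 1."""
    return is_dimension_drop_triple(p, n, q) and n == p * q and gcd(p, q) == 1

print(is_prime_dimension_drop(2, 6, 3))   # n = pq, coprime endpoints
print(is_prime_dimension_drop(2, 12, 3))  # defined, but n != pq
```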


2.3. The Cuntz semigroup

Let $A$ be a C*-algebra, and let $M_n(A)$ denote the $n \times n$ matrices whose entries are elements of $A$. If $A = \mathbb{C}$, then we simply write $M_n$. Let $M_\infty(A)$ denote the algebraic limit of the direct system $(M_n(A), \phi_n)$, where $\phi_n : M_n(A) \to M_{n+1}(A)$ is given by
\[ a \mapsto \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix}. \]
Let $M_\infty(A)_+$ (resp. $M_n(A)_+$) denote the positive elements in $M_\infty(A)$ (resp. $M_n(A)$). Given $a, b \in M_\infty(A)_+$, we say that $a$ is Cuntz subequivalent to $b$ (written $a \precsim b$) if there is a sequence $(v_n)_{n=1}^{\infty}$ of elements of $M_\infty(A)$ such that
\[ \| v_n b v_n^* - a \| \xrightarrow{\,n \to \infty\,} 0. \]
We say that $a$ and $b$ are Cuntz equivalent (written $a \sim b$) if $a \precsim b$ and $b \precsim a$. This relation is an equivalence relation, and we write $\langle a \rangle$ for the equivalence class of $a$. The set
\[ W(A) := M_\infty(A)_+ / \sim \]
becomes a positively ordered Abelian monoid when equipped with the operation
\[ \langle a \rangle + \langle b \rangle = \langle a \oplus b \rangle \]
and the partial order
\[ \langle a \rangle \le \langle b \rangle \iff a \precsim b. \]
In the sequel, we refer to this object as the Cuntz semigroup of $A$. Given $a \in M_\infty(A)_+$ and $\epsilon > 0$, we denote by $(a - \epsilon)_+$ the element of $C^*(a)$ corresponding (via the functional calculus) to the function
\[ f(t) = \max\{0, t - \epsilon\}, \quad t \in \sigma(a). \]
(Here $\sigma(a)$ denotes the spectrum of $a$.)

2.4. Dimension functions and strict comparison

Now suppose that $A$ is unital and stably finite, and denote by $\mathrm{QT}(A)$ the space of normalised 2-quasitraces on $A$ (v. [2, Definition II.1.1]). Let $S(W(A))$ denote the set of additive and order preserving maps $d$ from $W(A)$ to $\mathbb{R}^+$ having the property that $d(\langle 1_A \rangle) = 1$. Such maps are called states. Given $\tau \in \mathrm{QT}(A)$, one may define a map $d_\tau : M_\infty(A)_+ \to \mathbb{R}^+$ by
\[ d_\tau(a) = \lim_{n \to \infty} \tau(a^{1/n}). \tag{1} \]
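Formula (1) and the cut-down $(a - \epsilon)_+$ become very concrete in the special case $A = M_N$ with its unique normalised trace, where $d_\tau(a)$ is the rank of $a$ divided by $N$. The sketch below makes that assumption and, as a further simplification, represents a positive matrix only by its list of eigenvalues; the function names are hypothetical.

```python
def cut_down(eigenvalues, eps):
    """Eigenvalues of (a - eps)_+, i.e. f(t) = max(0, t - eps) applied to the spectrum."""
    return [max(0.0, t - eps) for t in eigenvalues]

def trace_of_root(eigenvalues, n):
    """tau(a^(1/n)) for the normalised trace on M_N, N = len(eigenvalues)."""
    return sum(t ** (1.0 / n) for t in eigenvalues) / len(eigenvalues)

def d_tau(eigenvalues):
    """Limit of tau(a^(1/n)) as n grows: (number of non-zero eigenvalues) / N."""
    return sum(1 for t in eigenvalues if t > 0) / len(eigenvalues)

a = [0.5, 0.01, 0.0, 0.0]          # a positive element of M_4 with rank 2
for n in (1, 10, 1000):
    print(n, trace_of_root(a, n))  # approaches d_tau(a) from below
print(d_tau(a), d_tau(cut_down(a, 0.1)))
```

In this example the small eigenvalue 0.01 is removed by the cut-down with `eps = 0.1`, so the value of the dimension function drops; letting `eps` tend to zero recovers `d_tau(a)`.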

This map is lower semicontinuous, and depends only on the Cuntz equivalence class of $a$. It moreover has the following properties:

(i) if $a \precsim b$, then $d_\tau(a) \le d_\tau(b)$;
(ii) if $a$ and $b$ are orthogonal, then $d_\tau(a + b) = d_\tau(a) + d_\tau(b)$;
(iii) $d_\tau((a - \epsilon)_+) \nearrow d_\tau(a)$ as $\epsilon \to 0$.

Thus, $d_\tau$ defines a state on $W(A)$. Such states are called lower semicontinuous dimension functions, and the set of them is denoted $\mathrm{LDF}(A)$. If $A$ has the property that $a \precsim b$ whenever $d(\langle a \rangle) < d(\langle b \rangle)$ for every $d \in \mathrm{LDF}(A)$, then we say that $A$ has strict comparison of positive elements or simply strict comparison.

2.5. Decomposition rank

Based on the completely positive approximation property for nuclear C*-algebras, one may define a noncommutative version of covering dimension as follows:

Definition 2.3. (See [18, Definitions 2.2 and 3.1].) Let $A$ be a separable C*-algebra.

(i) A completely positive map $\varphi : \bigoplus_{i=1}^{s} M_{r_i} \to A$ is $n$-decomposable if there is a decomposition $\{1, \dots, s\} = \coprod_{j=0}^{n} I_j$ such that the restriction of $\varphi$ to $\bigoplus_{i \in I_j} M_{r_i}$ preserves orthogonality for each $j \in \{0, \dots, n\}$.
(ii) $A$ has decomposition rank $n$, $\operatorname{dr} A = n$, if $n$ is the least integer such that the following holds: given $\{b_1, \dots, b_m\} \subset A$ and $\epsilon > 0$, there is a completely positive approximation $(F, \psi, \varphi)$ for $b_1, \dots, b_m$ within $\epsilon$ (i.e., $\psi : A \to F$ and $\varphi : F \to A$ are completely positive contractions and $\|\varphi\psi(b_i) - b_i\| < \epsilon$) such that $\varphi$ is $n$-decomposable. If no such $n$ exists, we write $\operatorname{dr} A = \infty$.

This notion has good permanence properties; for example, it behaves well with respect to quotients, inductive limits, hereditary subalgebras, unitization and stabilization. It generalises topological covering dimension, i.e., if $X$ is a locally compact second countable space, then $\operatorname{dr} C_0(X) = \dim X$; see [18] for details. Moreover, if $A$ is an AH algebra of bounded dimension growth, then $\operatorname{dr} A$ is finite.

3. VI Algebras and the main result

3.1. Villadsen algebras of the first type

The class of algebras we consider is an interpolated family of AH algebras.
At their simplest they are the UHF algebras of Glimm, while at their most complex they are the algebras introduced by Villadsen in his work on perforated ordered K-theory. In between these extremes they span the full spectrum of complexity for simple, separable, nuclear, and stably finite C*-algebras. We call these algebras Villadsen algebras of the first type as they are defined by a generalisation of Villadsen’s construction in [33]. (Villadsen used a second and quite distinct construction in his subsequent work on stable rank, cf. [34].)

Let $X$ and $Y$ be compact Hausdorff spaces. Recall that a ∗-homomorphism
\[ \phi : C(X) \to M_n \otimes C(Y) \]
is said to be diagonal if it has the form
\[ f \mapsto \operatorname{diag}(f \circ \lambda_1, \dots, f \circ \lambda_n), \]
where $\lambda_i : Y \to X$ is a continuous map for each $1 \le i \le n$. The maps $\lambda_1, \dots, \lambda_n$ are called the eigenvalue maps of $\phi$. Amplifications of diagonal maps are also called diagonal.

Definition 3.1. Let $X$ be a compact Hausdorff space and $n, m, k \in \mathbb{N}$. A unital diagonal ∗-homomorphism
\[ \phi : M_n \otimes C(X) \to M_k \otimes C(X^{\times m}) \]
is said to be a Villadsen map of the first type (a VI map) if each eigenvalue map is either a co-ordinate projection or has range equal to a point.

Definition 3.2. Let $X$ be a compact Hausdorff space, and let $(n_i)_{i=1}^{\infty}$ and $(m_i)_{i=1}^{\infty}$ be sequences of natural numbers with $n_1 = 1$. Put $X_i = X^{\times n_i}$. A unital C*-algebra $A$ is said to be a Villadsen algebra of the first type (a VI algebra) if it can be written as an inductive limit algebra
\[ A \cong \lim_{i \to \infty} \bigl( M_{m_i} \otimes C(X^{\times n_i}), \phi_i \bigr) \]
where each $\phi_i$ is a VI map.

We will refer to the inductive system in Definition 3.2 as a standard decomposition for $A$ with seed space $X_1$ ($= X$). Clearly, such decompositions are not unique. For $j > i$, put
\[ \phi_{i,j} = \phi_{j-1} \circ \cdots \circ \phi_i. \]
Let $N_{i,j}$ be the number of distinct co-ordinate projections from $X_j = X_i^{\times n_j/n_i}$ to $X_i$ occurring as eigenvalue maps of $\phi_{i,j}$, and let $M_{i,j}$ denote the multiplicity (number of eigenvalue maps) of $\phi_{i,j}$. Notice that
\[ M_{i,j} = M_{j-1,j} M_{i,j-1}, \]
that
\[ N_{i,j} = N_{j-1,j} N_{i,j-1} \]
and that
\[ 0 \le \frac{N_{i,j}}{M_{i,j}} \le 1. \]
From these relations it follows in particular that the sequence
\[ \left( \frac{N_{i,j}}{M_{i,j}} \right)_{j > i} \]
is decreasing and converges for any fixed $i$.
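The monotonicity claim follows from the two multiplicative relations: passing from $j - 1$ to $j$ multiplies the ratio by $N_{j-1,j}/M_{j-1,j} \le 1$. A minimal sketch of this bookkeeping, assuming the one-step data $(N_{k,k+1}, M_{k,k+1})$ are supplied as a list of hypothetical integer pairs (the function name `ratio_sequence` is illustrative):

```python
from fractions import Fraction

def ratio_sequence(step_data):
    """Given one-step pairs (N_{k,k+1}, M_{k,k+1}) for k = i, i+1, ...,
    return the ratios N_{i,j}/M_{i,j} for j = i+1, i+2, ...
    using N_{i,j} = N_{j-1,j} N_{i,j-1} and M_{i,j} = M_{j-1,j} M_{i,j-1}.
    """
    ratios = []
    n_total, m_total = 1, 1
    for n_step, m_step in step_data:
        if not 0 <= n_step <= m_step:
            raise ValueError("need 0 <= N <= M at every step")
        n_total *= n_step
        m_total *= m_step
        ratios.append(Fraction(n_total, m_total))
    return ratios

# Hypothetical one-step data; each factor N/M is at most 1.
print(ratio_sequence([(1, 2), (3, 4), (2, 2)]))
```

Here "decreasing" is meant in the non-strict sense: a step with $N_{k,k+1} = M_{k,k+1}$ leaves the ratio unchanged.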



$A$ is said to have slow (very slow, or bounded) dimension growth as a VI algebra if it admits a standard decomposition as above which has slow (very slow, or bounded, respectively) dimension growth in the sense of Definition 2.1.

Remarks 3.3. Despite its simple definition, the class of VI algebras is surprisingly broad:

• By taking $X_1 = \{*\}$, we can obtain any UHF algebra. If instead we take $X_1$ to be a finite set, then we obtain a good supply of AF algebras.
• With each $X_i$ equal to a disjoint union of finitely many circles, we obtain a large collection of AT algebras of real rank zero and real rank one.
• If each $X_i$ is equal to the same compact Hausdorff space $X$, then we obtain the class of Goodearl algebras.
• If we impose the condition that $n_i/m_i \to 0$, then we obtain AH algebras of slow dimension growth exhibiting a full range of complexity in their Elliott invariants: torsion in $K_0$ or $K_1$, and arbitrary tracial state spaces.
• By taking “most” of the eigenvalue maps in each $\phi_i$ to be distinct co-ordinate projections and setting $X_1 = S^2$ we obtain Villadsen’s example of a simple, separable, and nuclear C*-algebra with perforated ordered $K_0$-group [33]. A variation on Villadsen’s construction yields the counterexample to Elliott’s classification conjecture discovered by the first named author in [26].

The first three examples above are special cases of the fourth, and the latter is a class of algebras for which the Elliott conjecture can be shown to hold. Proving this, however, requires both the most powerful available classification results for stably finite C*-algebras and the detailed analysis of VI algebras provided in the sequel. Thus, from the standpoint of trying to confirm the Elliott conjecture, VI algebras are no less complex than the class of all simple unital AH algebras. The fifth example demonstrates that VI algebras include non-Z-stable algebras which, in general, cannot be detected with classical K-theory. The sequel will show that there are in fact a tremendous number of such VI algebras (see Section 8).

3.2. The main theorem

Theorem 3.4. Let $A$ be a simple VI algebra admitting a standard decomposition with seed space a finite-dimensional CW complex. The following are equivalent:

(i) $A$ is Z-stable;
(ii) $A$ has strict comparison of positive elements;
(iii) $A$ has finite decomposition rank;
(iv) $A$ has slow dimension growth (as an AH algebra);
(v) $A$ has bounded dimension growth (as an AH algebra);
(vi) $A$ is approximately divisible.

If, moreover, $A$ has real rank zero, then $A$ satisfies the equivalent conditions above.

For Theorem 3.4, the following implications are already known:

(v) ⇒ (vi) ⇒ (i) ⇒ (ii); (v) ⇒ (iii); (iv) ⇒ (ii).



More precisely, (v) ⇒ (vi) was shown in [11] (based on the results of [12]), (vi) ⇒ (i) is a result of [32], (i) ⇒ (ii) is [25, Corollary 4.6], (iv) ⇒ (ii) is [29, Corollary 4.6], and (v) ⇒ (iii) is an easy observation of [18]. We will prove (ii) ⇒ (iv), (ii) ⇒ (v), (iii) ⇒ (ii), and the statement about real rank zero. In the real rank zero setting, the implication (iv) ⇒ (v) is due independently to Dadarlat and Gong [5,14].

Remarks 3.5.

• In the absence of Theorem 3.4, no two of conditions (i)–(vi) are known to be equivalent for an arbitrary simple unital AH algebra. However, (i), (ii), (iv), (v) and (vi) are known to be equivalent for simple unital AH algebras of real rank zero (a combination of results from [8,10,11,19,32,38]). One of the central open questions in the classification theory of nuclear C*-algebras is whether any of these conditions is actually necessary in the real rank zero case. Theorem 3.4 makes some progress on this question; see the third remark below.
• Conditions (i)–(iii) should remain equivalent in much larger classes of simple, separable, nuclear, and stably finite C*-algebras. Conditions (iv), (v), and (vi) cannot be expected to hold in general (conditions (iv) and (v) exclude non-AH algebras, and (vi) excludes projectionless algebras, such as Z itself), but it remains possible that (i)–(vi) are equivalent for simple unital ASH algebras, when (iv) and (v) are adapted in the obvious way to this class.
• It is not known whether real rank zero implies conditions (i)–(vi) above for larger classes of simple nuclear C*-algebras (clearly, (iii), (iv), and (v) can only hold in the stably finite case). Every existing classification theorem for real rank zero C*-algebras assumes at least one of conditions (i)–(vi). It is thus remarkable that in the class of VI algebras, real rank zero entails classifiability without assuming any of these conditions.
• Our proof of (iv) ⇒ (v) is the first instance of such among simple unital AH algebras of unconstrained real rank.
• The proof of Theorem 3.4 yields new examples of simple C*-algebras with infinite decomposition rank. Previous examples all had a unique trace, while our examples can exhibit a wide variety of structure in the tracial state space. (See Section 8.)
• Simple VI algebras all have stable rank one by an argument similar to that of [33, Proposition 10]. They may, however, have quite fast dimension growth: [28, Theorem 5.1] exhibits a simple VI algebra for which every AH decomposition has the property that
\[ \liminf_{i \to \infty} \max_{j = 1, \dots, m_i} \frac{\dim X_{i,j}}{\operatorname{rank} p_{i,j}} = \infty. \]
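The implication bookkeeping for Theorem 3.4 can be checked mechanically: the previously known implications together with the three proved in the paper, (ii) ⇒ (iv), (ii) ⇒ (v) and (iii) ⇒ (ii), should make every condition reachable from every other. The sketch below treats the implications as edges of a directed graph; the encoding and the function names are illustrative choices, not part of the paper.

```python
CONDITIONS = {"i", "ii", "iii", "iv", "v", "vi"}

# Implications listed in the text as already known.
KNOWN = [("v", "vi"), ("vi", "i"), ("i", "ii"), ("v", "iii"), ("iv", "ii")]
# Implications the paper announces it will prove.
PROVED = [("ii", "iv"), ("ii", "v"), ("iii", "ii")]

def reachable(edges, start):
    """Set of conditions implied by `start` (including itself)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def all_equivalent(edges):
    """True if every condition implies every other one."""
    return all(reachable(edges, c) == CONDITIONS for c in CONDITIONS)

print(all_equivalent(KNOWN))           # known implications alone
print(all_equivalent(KNOWN + PROVED))  # with the new implications added
```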

For completeness, we note that the class of VI algebras which satisfy the conditions of Theorem 3.4 indeed satisfies the Elliott conjecture. Let Ell(•) denote the Elliott invariant of a unital, exact, and stably finite C*-algebra. We then have:

Corollary 3.6. (See Gong [15], Elliott, Gong and Li [12].) Let $A$ and $B$ be simple VI algebras as in Theorem 3.4 which satisfy conditions (i)–(vi), and suppose that there is an isomorphism $\phi : \mathrm{Ell}(A) \to \mathrm{Ell}(B)$. Then, there is a ∗-isomorphism $\Phi : A \to B$ which induces $\phi$.



3.3. An analogue of Kirchberg’s first Geneva theorem

The most interesting aspect of Theorem 3.4 is that it provides an analogue among VI algebras of Kirchberg’s characterisation of purely infinite algebras. The latter states that for a simple, separable, and nuclear C*-algebra $A$ we have
\[ A \otimes \mathcal{O}_\infty \cong A \iff A \text{ is purely infinite.} \]
If we suppose that $A$ is a priori traceless, then a result of Rørdam (see [25]) says that Z-stability and $\mathcal{O}_\infty$-stability are equivalent, and the definition of strict comparison reduces to the very definition of pure infiniteness. Thus, we see that Kirchberg’s characterisation is equivalent to the statement
\[ A \otimes Z \cong A \iff A \text{ has strict comparison of positive elements.} \]

This statement makes sense even if $A$ has a trace, and is moreover true for the simple VI algebras of Theorem 3.4. Were the statement to hold for all simple, separable, and nuclear C*-algebras (a distinct possibility), it would be a deep and striking generalisation of Kirchberg’s characterisation. In light of this possibility, we suggest that simple and Z-stable C*-algebras be termed “purely finite.”

4. Villadsen’s obstruction in the Cuntz semigroup

In this section we prove that under a technical assumption, a simple VI algebra fails to have strict comparison of positive elements. We shall see later that this failure is dramatic enough to ensure that the algebra also has infinite decomposition rank.

4.1. Vector bundles and characteristic class obstructions

All vector bundles considered in this paper are topological and complex. For any connected topological space $X$, we let $\theta_l$ denote the trivial vector bundle of fibre dimension $l \in \mathbb{N}$. If $\omega$ is a vector bundle over $X$, then we denote by $\bigoplus_{i=1}^{k} \omega$ the $k$-fold Whitney sum of $\omega$ with itself, and by $\omega^{\otimes k}$ its $k$-fold external tensor product (over $X^k$). We use $\operatorname{rank}(\omega)$ to denote the fibre dimension of $\omega$. If $Y$ is a second topological space and $f : Y \to X$ is continuous, then $f^*(\omega)$ denotes the induced bundle over $Y$. By Swan’s theorem, $\omega$ can be represented by a (non-unique) projection in a matrix algebra over $C(X)$; we will use vector bundles and projections interchangeably in the sequel.

Recall that the Chern class $c(\omega)$ is an element of the integral cohomology ring $H^*(X)$ of the form
\[ c(\omega) = \sum_{i=0}^{\infty} c_i(\omega), \]
where $c_i(\omega) \in H^{2i}(X)$ and $c_i(\omega) = 0$ whenever $i > \operatorname{rank}(\omega)$. Let $\gamma$ be a second vector bundle over $X$. We will make use of the following properties of the Chern class:

(i) $c(\theta_l) = 1 \in H^0(X)$;
(ii) $c(\gamma \oplus \omega) = c(\gamma) c(\omega)$, where the product is the cup product;
(iii) if $Y$ is another topological space and $f : Y \to X$ is continuous, then $c(f^*(\omega)) = f^*(c(\omega))$.

Let $\xi$ be the Hopf line bundle over $S^2$. The following Chern class obstruction argument, due essentially to Villadsen, shows that $\theta_k$ is not isomorphic to a sub-bundle of $\bigoplus_{i=1}^{l} \xi^{\otimes l}$ whenever $1 \le k < l$. The top Chern class $c_l(\bigoplus_{i=1}^{l} \xi^{\otimes l})$ (equal, in this case, to the Euler class of $\bigoplus_{i=1}^{l} \xi^{\otimes l}$) is not zero by [24, Proposition 3.2]. If $\theta_k$ is isomorphic to a sub-bundle of $\bigoplus_{i=1}^{l} \xi^{\otimes l}$, then there exists a vector bundle $\gamma$ of rank $l - k$ over $(S^2)^l$ such that
\[ \theta_k \oplus \gamma \cong \bigoplus_{i=1}^{l} \xi^{\otimes l}. \]
Applying the Chern class to this equation yields
\[ c(\gamma) = c\Bigl( \bigoplus_{i=1}^{l} \xi^{\otimes l} \Bigr). \]
But then $c_l(\gamma) \ne 0$, contradicting the fact that $c_i(\gamma) = 0$ whenever $i > \operatorname{rank}(\gamma) = l - k$.

We review for future use some structural aspects of the integral cohomology ring $H^*((S^2)^n)$. It is well known that
\[ H^0(S^2) \cong H^2(S^2) \cong \mathbb{Z} \]
and
\[ H^i(S^2) = 0, \quad i \ne 0, 2. \]
It follows from the Künneth formula that
\[ H^*\bigl((S^2)^n\bigr) \cong H^*(S^2)^{\otimes n} \]
as graded rings. Let $e_i$ denote the generator of $H^2(S^2)$ in the $i$th tensor factor of $H^*(S^2)^{\otimes n}$. Then,
\[ H^*\bigl((S^2)^n\bigr) \cong \mathbb{Z}[1, e_1, \dots, e_n]/R, \]
where $R = \langle e_i^2 = 0 \mid 1 \le i \le n \rangle$. If $n = Nl$ for some $N \in \mathbb{N}$, then
\[ H^*\bigl((S^2)^{Nl}\bigr) \cong H^*\bigl((S^2)^{l}\bigr)^{\otimes N}. \]
Let $e_{i,j}$ denote the generator of the $i$th copy of $H^2(S^2)$, $i \in \{1, \dots, l\}$, in the $j$th tensor factor of the right-hand side above.
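The non-vanishing of the top Chern class can be illustrated by a direct computation in the ring $\mathbb{Z}[1, e_1, \dots, e_l]/R$. The sketch below assumes the standard fact that the external tensor product $\xi^{\otimes l}$ is a line bundle with $c_1 = e_1 + \cdots + e_l$ (with the convention $c_1(\xi) = e$, which fixes a sign), so that property (ii) gives $c(\bigoplus_{i=1}^{l} \xi^{\otimes l}) = (1 + e_1 + \cdots + e_l)^l$. It is an illustration only and does not reproduce the argument of [24, Proposition 3.2]; all function names are hypothetical.

```python
from math import factorial

def multiply(f, g):
    """Product in Z[e_1, ..., e_n]/(e_i^2 = 0).

    A polynomial is a dict mapping a frozenset of indices (a square-free
    monomial) to its integer coefficient.
    """
    product = {}
    for mono_f, coeff_f in f.items():
        for mono_g, coeff_g in g.items():
            if mono_f & mono_g:
                continue  # a repeated index means a factor e_i^2 = 0
            mono = mono_f | mono_g
            product[mono] = product.get(mono, 0) + coeff_f * coeff_g
    return {m: c for m, c in product.items() if c != 0}

def total_chern_class(l):
    """(1 + e_1 + ... + e_l)^l, the assumed total Chern class of the l-fold sum."""
    line = {frozenset(): 1}
    for i in range(1, l + 1):
        line[frozenset([i])] = 1
    result = {frozenset(): 1}
    for _ in range(l):
        result = multiply(result, line)
    return result

def top_chern_coefficient(l):
    """Coefficient of e_1 e_2 ... e_l in the total Chern class."""
    return total_chern_class(l).get(frozenset(range(1, l + 1)), 0)

for l in range(1, 6):
    print(l, top_chern_coefficient(l), factorial(l))
```

Under these assumptions the coefficient of $e_1 \cdots e_l$ equals $l!$, which is non-zero, in line with the obstruction used in the text.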



4.2. A failure of strict comparison in C(X)

Villadsen’s Chern class obstruction argument may be viewed as a statement about projections in a matrix algebra over $C(X)$. We present below an analogue of his argument for certain non-projections in $M_n(C(X))$.

Let $X$ be a CW-complex with $\dim(X) \ge 6$, and let there be given a natural number $l$ satisfying $2 \le l \le \dim(X)/3$. Choose an open set $O \subseteq X$ homeomorphic to $(-1, 1)^{\dim(X)}$. Define
\[ \tilde{A} := \bigl\{ x \in (-1, 1)^3 \bigm| \operatorname{dist}(x, (0, 0, 0)) = 1/2 \bigr\} \cong S^2 \]
and
\[ \tilde{B} := \bigl\{ x \in (-1, 1)^3 \bigm| 1/3 < \operatorname{dist}(x, (0, 0, 0)) < 2/3 \bigr\}, \]
and let $\pi : \tilde{B} \to \tilde{A}$ be the continuous projection along rays emanating from $(0, 0, 0)$. Now define a closed subset
\[ A = \tilde{A}^{l} \times \{0\}^{\dim(X) - 3l} \]
and an open subset
\[ B = \tilde{B}^{l} \times (-1, 1)^{\dim(X) - 3l} \]
of $O$. Define a continuous map $\Pi : B \to A$ by
\[ \Pi = \underbrace{\pi \times \cdots \times \pi}_{l \text{ times}} \times \underbrace{\mathrm{ev}_0 \times \cdots \times \mathrm{ev}_0}_{\dim(X) - 3l \text{ times}}, \]
where $\mathrm{ev}_0(x) = 0$ for every $x \in (-1, 1)$. Let $f : X \to [0, 1]$ be a continuous map which is identically one on $A$ and identically zero off $B$. Notice that $A \cong (S^2)^l$, so $\xi^{\otimes l}$ may be viewed as a vector bundle over $A$. Define positive elements
\[ P = f \cdot \Pi^*\Bigl( \bigoplus_{i=1}^{l} \xi^{\otimes l} \Bigr) \tag{2} \]
and $\Theta_k = f \cdot \theta_k$ in $M_\infty(C(X))$. For every $x \in X$ and $n \in \mathbb{N}$ we have either $\operatorname{rank} P(x) = \operatorname{rank} \Theta_k(x) = 0$ or
\[ \operatorname{rank} P(x) = l, \quad \operatorname{rank} \Theta_k(x) = k. \tag{3} \]


If $\tau \in T(C(X))$ and $a \in M_\infty(C(X))_+$, then $d_\tau(a)$ is obtained by integrating the rank function of $a$ against the probability measure on $X$ corresponding to $\tau$. Thus, if $k < l$, we have
\[ d_\tau(\Theta_k) \le d_\tau(P), \quad \forall \tau \in T(C(X)), \]
and this inequality is strict if $\mu_\tau(B) > 0$, where $\mu_\tau$ is the measure induced on $X$ by $\tau \in T(C(X))$. On the other hand, $\Theta_k \not\precsim P$. To see this suppose, on the contrary, that there exists a sequence $(v_i)_{i=1}^{\infty}$ in $M_\infty(C(X))$ such that
\[ \| v_i P v_i^* - \Theta_k \| \xrightarrow{\,i \to \infty\,} 0. \]
Then, the same is true upon restriction to $A \subseteq X$, i.e.,
\[ \Theta_k|_A \cong \theta_k \precsim \bigoplus_{i=1}^{l} \xi^{\otimes l} \cong P|_A. \]
This amounts to saying that $\theta_k$ is isomorphic to a sub-bundle of $\bigoplus_{i=1}^{l} \xi^{\otimes l}$, contradicting our choice of $\xi$.

We are now ready to prove a key lemma. Its proof is inspired by the proof of [26, Theorem 1.1].

Lemma 4.1. Let $A$ be a simple VI algebra with standard decomposition $(A_i, \phi_i)$ and seed space a CW-complex $X_1$ of dimension strictly greater than zero. Suppose that for any $\epsilon > 0$ there exists $i \in \mathbb{N}$ such that
\[ \frac{N_{i,j}}{M_{i,j}} > 1 - \epsilon, \quad \forall j > i. \tag{4} \]
Then, for any $n \in \mathbb{N}$ there exist pairwise orthogonal elements $a, b_1, \dots, b_n \in M_\infty(A)_+$ such that for each $s \in \{1, \dots, n\}$
\[ d_\tau(a) < d_\tau(b_s), \quad \forall \tau \in T(A), \]
and
\[ \langle a \rangle \nleq \langle b_1 \rangle + \cdots + \langle b_n \rangle. \]
In particular, $A$ does not have strict comparison of positive elements.

Proof. First observe that the simplicity of $A$ combined with the non-zero dimension of $X_1$ imply that $m_i \to \infty$ as $i \to \infty$: the number of point evaluations appearing as eigenvalue maps in $\phi_{i,j}$ is unbounded as $j \to \infty$. It then follows from our assumption on $N_{i,j}/M_{i,j}$ that $\dim(X_i) \to \infty$ as $i \to \infty$. We may thus assume that $\dim(X_i) \ne 0$, $\forall i \in \mathbb{N}$. Since $\phi_i$ always contains at least one eigenvalue map which is not a point evaluation, it is injective.

Let $n \in \mathbb{N}$ be given. Find, using the hypotheses of the lemma, an $i \in \mathbb{N}$ such that
\[ \frac{N_{i,j}}{M_{i,j}} > \frac{6n - 1}{6n}, \quad \forall j > i. \]
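To see how a hypothesis of the form (4) can be met, consider purely hypothetical one-step data $N_{k,k+1} = k^2 - 1$ and $M_{k,k+1} = k^2$ for $k \ge 2$ (one point evaluation among $k^2$ eigenvalue maps). By the multiplicative relations of Section 3.1 the ratio telescopes to $N_{i,j}/M_{i,j} = \frac{(i-1)\,j}{i\,(j-1)}$, which stays strictly above $(i-1)/i$ for all $j > i$; taking $i = 6n$ then gives the bound $(6n-1)/(6n)$ used in the proof. The sketch below checks this with exact arithmetic; it does not construct an algebra and makes no claim that these numbers arise from a simple VI algebra.

```python
from fractions import Fraction

def tail_ratio(i, j):
    """N_{i,j}/M_{i,j} for the hypothetical one-step data
    N_{k,k+1} = k^2 - 1, M_{k,k+1} = k^2, with 2 <= i < j."""
    ratio = Fraction(1)
    for k in range(i, j):
        ratio *= Fraction(k * k - 1, k * k)
    return ratio

def closed_form(i, j):
    """Telescoped value (i - 1) j / (i (j - 1)) of the same product."""
    return Fraction((i - 1) * j, i * (j - 1))

n = 1
i = 6 * n
threshold = Fraction(6 * n - 1, 6 * n)
print(all(tail_ratio(i, j) > threshold for j in range(i + 1, 200)))
```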



Since $N_{i,j}/M_{i,j}$ increases with increasing $i$, we may assume that $i$ is large enough to permit the construction of the element
\[ f \cdot \Pi^*\Bigl( \bigoplus_{i=1}^{3} \xi^{\otimes 3n} \Bigr) \]
(this is just the element $P$ of Eq. (2), with the number of direct summands altered). This element will be our $b_1$. (The maps $\phi_i$ are injective, so we identify forward images in the inductive sequence.) For $b_2, \dots, b_n$ we simply take mutually orthogonal copies of $b_1$. Let $a$ be $\Theta_2$, chosen orthogonal to each of $b_1, \dots, b_n$. We have by construction that
\[ d_\tau(a) < d_\tau(b_s), \quad 1 \le s \le n, \]
whenever $\mu_\tau(B) > 0$ (recall that $B \subseteq X_i$ is the support of $P$ and $\Theta_2$; it is an open set). If $f : X_i \to [0, 1]$ has support equal to $B$, then we may write $\mu_\tau(B) = d_\tau(f \cdot 1_{A_i})$. Now for $\tau \in T(A)$ we have
\[ d_\tau\bigl(\phi_{i\infty}(f \cdot 1_{A_i})\bigr) = d_{\phi_{i\infty}(\tau)}(f \cdot 1_{A_i}). \tag{5} \]
Since $A$ is simple and $\phi_{i\infty}(f \cdot 1_{A_i}) \ne 0$ we have
\[ 0 < \tau\bigl(\phi_{i\infty}(f \cdot 1_{A_i})\bigr) < \lim_{n \to \infty} \tau\bigl(\phi_{i\infty}(f \cdot 1_{A_i})^{1/n}\bigr) = d_\tau\bigl(\phi_{i\infty}(f \cdot 1_{A_i})\bigr). \]
Combining this with (5) above we see that
\[ d_\tau\bigl(\phi_{i\infty}(a)\bigr) = d_{\phi_{i\infty}(\tau)}(a) < d_{\phi_{i\infty}(\tau)}(b) = d_\tau\bigl(\phi_{i\infty}(b)\bigr) \]
for every $\tau \in T(A)$.

It remains to prove that
\[ \langle a \rangle \nleq \langle b_1 \rangle + \cdots + \langle b_n \rangle = \langle b_1 + \cdots + b_n \rangle. \]
Notice that $b_1 + \cdots + b_n$, viewed as an element of $M_\infty(A_i)$, is simply the element $P$ of Eq. (2) with parameter $l = 3n$. Thus, with this choice of $l$, we are in fact trying to prove that
\[ \langle \phi_{i\infty}(\Theta_2) \rangle \nleq \langle \phi_{i\infty}(P) \rangle. \]
It will suffice to prove that for each $j > i$ and $v \in M_\infty(A_j)$
\[ \| v \phi_{i,j}(P) v^* - \phi_{i,j}(\Theta_2) \| \ge 1/2. \]


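Let $S$ be the set of eigenvalue maps of $\phi_{i,j}$. $S$ is the disjoint union of the set $S_1$ of eigenvalue maps which are co-ordinate projections and the set $S_2$ of eigenvalue maps which are point evaluations. (The fact that $\dim(X_i) \neq 0$ ensures that $S_1 \cap S_2 = \emptyset$.) Note that $|S_1| = N_{i,j}$. For $\lambda \in S_1$, let $m(\lambda)$ denote the number of times that $\lambda$ occurs as an eigenvalue map of $\phi_{i,j}$. Write $\phi_{i,j} = \gamma_1 \oplus \gamma_2$, where $\gamma_1$ is a VI map corresponding to the eigenvalue maps of $\phi_{i,j}$ contained in $S_1$, and $\gamma_2$ corresponds similarly to the elements of $S_2$. By construction, $\gamma_2(P)$ is a constant positive matrix-valued function over $X_j$. Put
\[
\tilde{P} = \gamma_1\bigl(1_{M_{m_i}(C(X_i))}\bigr) \oplus \gamma_2(P)^{1/2},
\]
and $q = \lim_{n \to \infty} \gamma_2(P)^{1/n}$. It follows that
\[
\phi_{i,j}(P) = \gamma_1(P) \oplus \gamma_2(P) = \tilde{P}\bigl(\gamma_1(P) \oplus q\bigr)\tilde{P},
\]
and that the projection $q$ corresponds to a trivial vector bundle.

Suppose that there exists $v \in M_\infty(A_j)$ such that
\[
\bigl\| v \phi_{i,j}(P) v^* - \phi_{i,j}(\Theta_2) \bigr\| < 1/2.
\]
Then,
\[
\bigl\| v \tilde{P}\bigl(\gamma_1(P) \oplus q\bigr)\tilde{P} v^* - \gamma_1(\Theta_2) \oplus \gamma_2(\Theta_2) \bigr\| < 1/2.
\]
Cutting down by $\gamma_1(1_{A_i})$ and setting $w = \gamma_1(1_{A_i}) v \tilde{P}$ we have
\[
\bigl\| w\bigl(\gamma_1(P) \oplus q\bigr)w^* - \gamma_1(\Theta_2) \bigr\| < 1/2, \tag{6}
\]
and this estimate holds a fortiori over any closed subset of $X_j$.

Fix a point $x_0 \in X_i$ and let $C$ be the closed subset of $X_j = X_i^{\times n_j/n_i}$ consisting of those $(n_j/n_i)$-tuples which are equal to $x_0$ in those co-ordinates which are not the range of an element of $S_1$, and whose remaining co-ordinates belong to $A \subseteq X_i$. Notice that
\[
C \cong A^{\times N_{i,j}} \cong (S^2)^{\times l N_{i,j}}.
\]
We have
\[
\gamma_1(P)|_C \cong \bigoplus_{\lambda \in S_1} \bigoplus_{m=1}^{l m(\lambda)} \lambda^*\bigl(\xi^{\otimes l}\bigr),
\]
$\gamma_2(P)|_C \cong \theta_{lr}$, and $\gamma_1(\Theta_2)|_C \cong \theta_{2\,\mathrm{mult}(\gamma_1)}$, where $r \leq \mathrm{mult}(\gamma_2)$. [26, Lemma 2.1] and (6) together imply that
\[
\theta_{2\,\mathrm{mult}(\gamma_1)} \precsim \bigoplus_{\lambda \in S_1} \bigoplus_{m=1}^{l m(\lambda)} \lambda^*\bigl(\xi^{\otimes l}\bigr) \oplus \theta_{lr}
\]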



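in the sense of Murray and von Neumann. In other words, there is a $t \in \mathbb{N}$ and a complex vector bundle $\omega$ over $C$ of fibre dimension $(l-2)\,\mathrm{mult}(\gamma_1) + lr$ such that
\[
\theta_{2\,\mathrm{mult}(\gamma_1)+t} \oplus \omega \cong \bigoplus_{\lambda \in S_1} \bigoplus_{m=1}^{l m(\lambda)} \lambda^*\bigl(\xi^{\otimes l}\bigr) \oplus \theta_{lr+t}.
\]
Applying the total Chern class to this equation yields
\[
c(\omega) = c\Bigl( \bigoplus_{\lambda \in S_1} \bigoplus_{m=1}^{l m(\lambda)} \lambda^*\bigl(\xi^{\otimes l}\bigr) \oplus \theta_{lr+t} \Bigr)
= \prod_{\lambda \in S_1} c\bigl(\lambda^*(\xi^{\otimes l})\bigr)^{l m(\lambda)}
= \prod_{\lambda \in S_1} \lambda^*\bigl(c(\xi^{\otimes l})\bigr)^{l m(\lambda)}.
\]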

Let us take the elements of $S_1$ to be numbered $\lambda_1, \ldots, \lambda_{N_{i,j}}$, so that
\[
c(\omega) = \prod_{k=1}^{N_{i,j}} (1 + e_{1,k} + \cdots + e_{l,k})^{l m(\lambda_k)}.
\]
Recall our description of the ring structure of $H^*((S^2)^{\times l})^{\otimes N_{i,j}}$ from the end of Section 4.1. The class $c_{lN_{i,j}}(\omega)$ is the sum of all possible products of $lN_{i,j}$ elements of the form $e_{s,k}$ or 1. Since $H^*((S^2)^{\times l})^{\otimes N_{i,j}}$ is torsion free and the products in question generate independent copies of $\mathbb{Z}$ whenever the products themselves are distinct, we see that $c_{lN_{i,j}}(\omega) \neq 0$ if even one of the products in question is not zero. Since the only relation on the generators of $H^*((S^2)^{\times l})^{\otimes N_{i,j}}$ is $e_{s,k}^2 = 0$, we see that
\[
\prod_{k=1}^{N_{i,j}} \prod_{s=1}^{l} e_{s,k} \neq 0.
\]
Thus, $c_{lN_{i,j}}(\omega) \neq 0$. This in turn necessitates $\mathrm{rank}(\omega) \geq lN_{i,j}$, since the $n$th Chern class of a vector bundle of dimension $< n$ is always zero. We conclude that
\[
lN_{i,j} \leq (l-2)\,\mathrm{mult}(\gamma_1) + lr \leq (l-2)\,\mathrm{mult}(\gamma_1) + l\,\mathrm{mult}(\gamma_2) \leq (l-2)M_{i,j} + 2(M_{i,j} - N_{i,j}).
\]
Dividing the last inequality above by $lM_{i,j}$ we get
\[
\frac{N_{i,j}}{M_{i,j}} \leq \frac{l-2}{l} + \frac{2}{l}\Bigl(1 - \frac{N_{i,j}}{M_{i,j}}\Bigr).
\]
Using the assumption
\[
\frac{N_{i,j}}{M_{i,j}} \geq \frac{6n-1}{6n} = \frac{2l-1}{2l}
\]
we have
\[
\frac{2l-1}{2l} \leq \frac{N_{i,j}}{M_{i,j}} \leq \frac{l-2}{l} + \frac{2}{l^2} < \frac{l-1}{l},
\]
a contradiction. □
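As a sanity check on the arithmetic that closes the proof of Lemma 4.1, the following sketch searches small integer values for a pair (N, M) that satisfies both the assumed lower bound N/M >= (6n - 1)/(6n) and the rank inequality lN <= (l - 2)M + 2(M - N) with l = 3n. The function names and the search ranges are illustrative choices made here, not part of the paper; if the argument is right, the search should find nothing.

```python
from fractions import Fraction


def both_conditions_hold(N, M, n):
    """Return True if (N, M) satisfies BOTH conditions below, with l = 3n.

    (a) assumed lower bound:  N/M >= (6n - 1)/(6n)
    (b) rank inequality:      l*N <= (l - 2)*M + 2*(M - N)
    The proof of Lemma 4.1 argues that (a) and (b) are incompatible.
    """
    l = 3 * n
    lower_bound = Fraction(N, M) >= Fraction(6 * n - 1, 6 * n)
    rank_inequality = l * N <= (l - 2) * M + 2 * (M - N)
    return lower_bound and rank_inequality


def search_counterexample(max_n=5, max_M=80):
    """Brute-force search over 1 <= n <= max_n and 0 <= N <= M <= max_M."""
    for n in range(1, max_n + 1):
        for M in range(1, max_M + 1):
            for N in range(0, M + 1):
                if both_conditions_hold(N, M, n):
                    return (n, N, M)
    return None


if __name__ == "__main__":
    # Prints the first counterexample found, or None if there is none in range.
    print(search_counterexample())
```

Exact rational arithmetic is used so that the comparison at the boundary value (6n - 1)/(6n) is not affected by rounding.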

5. Strict comparison implies bounded dimension growth

The next lemma says that if a simple VI algebra has strict comparison of positive elements, then it not only has slow dimension growth, but even has slow dimension growth as a VI algebra.

Lemma 5.1. Let $A$ be a simple VI algebra; suppose that $A$ admits a standard decomposition $(A_i, \phi_i)$ with seed space a CW-complex $X$. If $A$ has strict comparison of positive elements, then $\dim(X) = 0$ (in which case $A$ is AF), or for every $i \in \mathbb{N}$,
\[
\frac{N_{i,j}}{M_{i,j}} \xrightarrow{j \to \infty} 0.
\]
If $A$ does not have strict comparison of positive elements, then
\[
\lim_{i \to \infty} \lim_{j \to \infty} \frac{N_{i,j}}{M_{i,j}} = 1. \tag{7}
\]

Proof. Suppose, for a contradiction, that $A$ satisfies the hypotheses of the lemma, $\dim(X) \geq 1$, and there is some $i_0 \in \mathbb{N}$ and $\delta > 0$ such that
\[
\frac{N_{i_0,j}}{M_{i_0,j}} \geq \delta, \quad \forall j > i_0.
\]
We must show that $A$ does not have strict comparison of positive elements and that (7) holds. $A$ is simple with $\dim(X) \geq 1$, so $M_{i,j} \to \infty$ as $j \to \infty$ for any fixed $i \in \mathbb{N}$. This forces $N_{i,j} \to \infty$, too, so that $\dim(X_i) \to \infty$ as $i \to \infty$. For any $j > m > i_0$ we have
\[
\delta \leq \frac{N_{i_0,j}}{M_{i_0,j}} = \frac{N_{i_0,m}}{M_{i_0,m}} \cdot \frac{N_{m,j}}{M_{m,j}}.
\]
The sequence $(N_{i_0,j}/M_{i_0,j})_{j > i_0}$ is decreasing, so its limit exists and is larger than or equal to $\delta$. It follows that $N_{m,j}/M_{m,j}$ approaches 1 as $m, j \to \infty$, whence (7) holds. Now apply Lemma 4.1 to see that $A$ must fail to have strict comparison of positive elements. □
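The limiting step in the proof of Lemma 5.1 rests only on the multiplicativity of the ratios and on the fact that a decreasing sequence bounded below by a positive number converges. A minimal numerical sketch follows, assuming a purely hypothetical one-step ratio of 1 - 1/(k+1)^2 (this choice is for illustration and does not come from any particular VI algebra). With it, the ratios starting from index 1 decrease to 1/2, while the tail ratios approach 1.

```python
from fractions import Fraction


def step_ratio(k):
    """Hypothetical one-step ratio N_{k,k+1}/M_{k,k+1} = 1 - 1/(k+1)^2."""
    return 1 - Fraction(1, (k + 1) ** 2)


def ratio(i, j):
    """N_{i,j}/M_{i,j}, modelled as the product of one-step ratios for i <= k < j."""
    out = Fraction(1)
    for k in range(i, j):
        out *= step_ratio(k)
    return out


if __name__ == "__main__":
    # Multiplicativity used in the proof: ratio(i0, j) = ratio(i0, m) * ratio(m, j).
    print(ratio(1, 100) == ratio(1, 50) * ratio(50, 100))
    # Ratios from i0 = 1 stay bounded below, while tail ratios are close to 1.
    print(float(ratio(1, 100)), float(ratio(50, 100)))
```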



Proposition 5.2. Let $A$ be a simple VI algebra with strict comparison of positive elements, and suppose that $A$ admits a standard decomposition $(A_i, \phi_i)$ with seed space a finite-dimensional CW-complex $X$. Then, $A$ has bounded dimension growth.

Proof. $A$ satisfies the hypotheses of Lemma 5.1. If $A$ is AF, then it has bounded dimension growth, so we may assume that $\dim(X) \geq 1$. The conclusion of Lemma 5.1 then implies that
\[
\frac{N_{i,j}}{M_{i,j}} \xrightarrow{j \to \infty} 0 \tag{8}
\]
for every $i \in \mathbb{N}$. We will prove that $A$ has very slow dimension growth in the sense of Gong; bounded dimension growth follows by the reduction theorem of [15] or, alternatively, the classification theorem of [20].

Let there be given a positive tolerance $\epsilon > 0$. For natural numbers $j > i$ let $\pi_1, \ldots, \pi_{N_{i,j}}$ be the co-ordinate projection maps from $X_j = X_i^{\times n_j/n_i}$ to $X_i$ appearing as eigenvalue maps of $\phi_{i,j}$, and let $l_{i,j}$ be the number of eigenvalue maps of $\phi_{i,j}$ which are point evaluations. Since $A$ is simple and $\dim(X) \geq 1$, $l_{i,j_0} > 0$ for some $j_0 > i$. Straightforward calculation then shows that there are at least
\[
l_{i,j_0} \cdot M_{j_0,j} = M_{i,j} \cdot \frac{l_{i,j_0}}{M_{i,j_0}}
\]
point evaluations in the map $\phi_{i,j}$. Combining this with (8) yields
\[
L_j := \frac{M_{i,j}}{N_{i,j}} \cdot \frac{l_{i,j_0}}{M_{i,j_0}} \xrightarrow{j \to \infty} \infty.
\]
In other words, if one wants to partition the eigenvalue maps of $\phi_{i,j}$ which are point evaluations into $N_{i,j}$ roughly equally sized multisets, then these multisets become arbitrarily large as $j \to \infty$. (We say multisets as some point evaluations may well be repeated.) Assume that we have specified such a partition, let $S_1, \ldots, S_{N_{i,j}}$ denote the multisets in the partition, and assume that $\dim(X_i)^3/L_j < \epsilon$. Each $f \in S_l$, $1 \leq l \leq N_{i,j}$, factors through any of the co-ordinate projections $\pi_1, \ldots, \pi_{N_{i,j}}$. Factor $f \in S_l$ as $f = \tilde{f} \circ \pi_l$, where $\tilde{f} : X_i \to X_i$ has range equal to a point; let $R_l$ be the multiset of all maps from $X_i$ to itself obtained in this manner. Let $t(l)$ be the number of copies of $\pi_l$ appearing among the eigenvalue maps of $\phi_{i,j}$. Put
\[
B_i^{(l)} = M_{m_i(t(l)+|R_l|)} \otimes C(X_i),
\]
and observe that
\[
\mathrm{rank}\bigl(1_{B_i^{(l)}}\bigr) = m_i\bigl(t(l) + |R_l|\bigr) \geq |R_l| = |S_l| \geq L_j.
\]
Define a map $\psi^{(l)} : A_i \to B_i^{(l)}$ by
\[
\psi^{(l)}(a) = \Bigl( \bigoplus_{m=1}^{t(l)} a \Bigr) \oplus \Bigl( \bigoplus_{\tilde{f} \in R_l} a \circ \tilde{f} \Bigr).
\]
Put $B_i = B_i^{(1)} \oplus \cdots \oplus B_i^{(N_{i,j})}$, and let $\psi : A_i \to B_i$ be the direct sum of the $\psi^{(l)}$, $1 \leq l \leq N_{i,j}$. For each $1 \leq l \leq N_{i,j}$, let $P_l \in A_j$ be the projection which is the sum of the images of the unit of $A_i$ under all of the copies of $\pi_l$ in $\phi_{i,j}$ and all of the point evaluations in $S_l$. Let $\gamma_l : B_i^{(l)} \to P_l A_j P_l$ be induced by $\pi_l$, and let $\gamma : B_i \to A_j$ be the direct sum of the $\gamma_l$. We now have the factorisation
\[
A_i \xrightarrow{\psi} B_i \xrightarrow{\gamma} A_j,
\]
and each direct summand $B_i^{(l)}$ has the property that
\[
\frac{\dim(X_i)^3}{\mathrm{rank}\bigl(1_{B_i^{(l)}}\bigr)} \leq \frac{\dim(X_i)^3}{L_j} < \epsilon.
\]
Both $i$ and $\epsilon$ were arbitrary, so $A$ has very slow dimension growth. □
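The quantity that drives the proof of Proposition 5.2 is the lower bound L_j on the size of each multiset in the partition, together with the requirement that dim(X_i)^3 / L_j fall below the tolerance. A small sketch of that bookkeeping is given below; the function names and the sample numbers are hypothetical and serve only to make the two formulas concrete.

```python
from fractions import Fraction


def lower_bound_L(M_ij, N_ij, l_ij0, M_ij0):
    """L_j = (M_{i,j} / N_{i,j}) * (l_{i,j0} / M_{i,j0})."""
    return Fraction(M_ij, N_ij) * Fraction(l_ij0, M_ij0)


def summands_are_large_enough(dim_Xi, L_j, eps):
    """Check the condition dim(X_i)^3 / L_j < eps used in the proof."""
    return Fraction(dim_Xi) ** 3 / L_j < eps


if __name__ == "__main__":
    # Hypothetical data: 1000 eigenvalue maps, 10 distinct co-ordinate
    # projections, and 1 point evaluation among the 4 maps at stage j0.
    L = lower_bound_L(1000, 10, 1, 4)
    print(L, summands_are_large_enough(2, L, Fraction(1, 2)))
```

As N_{i,j}/M_{i,j} tends to zero the first factor grows without bound, which is why the check eventually succeeds for any fixed tolerance.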

Proposition 5.2 establishes the implications (ii) ⇒ (iv) and (ii) ⇒ (v) of Theorem 3.4. (The first observation of the proof is that the hypotheses guarantee that $A$ has slow dimension growth as a VI algebra, and so a fortiori as an AH algebra.)

6. Finite decomposition rank

In the present section we prove the remaining implication of Theorem 3.4, namely that finite decomposition rank implies strict comparison of positive elements in VI algebras. The technical key step is Lemma 6.1 below; under the additional assumption of real rank zero, a related result was already observed in [36, Proposition 3.7]. Our proof is inspired by that argument.

Lemma 6.1. Let $A$ be a simple, separable and unital $C^*$-algebra with $\mathrm{dr}\, A = n < \infty$. If $a, d^{(0)}, \ldots, d^{(n)} \in A_+$ satisfy
\[
d_\tau(a) < d_\tau\bigl(d^{(i)}\bigr)
\]
for $i = 0, \ldots, n$ and every $\tau \in T(A)$, then
\[
\langle a \rangle \leq \langle d^{(0)} \rangle + \cdots + \langle d^{(n)} \rangle.
\]

Proof. It will be convenient to set up some notation: Given $0 \leq \alpha < \beta \leq 1$, define functions on the real line by
\[
g_\beta(t) := \begin{cases} 0, & t < \beta, \\ 1, & t \geq \beta, \end{cases}
\qquad
g_{\alpha,\beta}(t) := \begin{cases} 0, & t \leq \alpha, \\ 1, & t \geq \beta, \\ (t-\alpha)/(\beta-\alpha) & \text{else} \end{cases}
\]
and
\[
f_{\alpha,\beta}(t) := \begin{cases} 0, & t \leq \alpha, \\ t, & t \geq \beta, \\ \beta(t-\alpha)/(\beta-\alpha) & \text{else.} \end{cases}
\]
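The three auxiliary functions just introduced are piecewise linear cut-offs, and it can help to see them evaluated. The sketch below transcribes the definitions directly; the Python names g_step, g_ramp and f_ramp are labels chosen here for illustration and do not appear in the paper.

```python
def g_step(beta, t):
    """g_beta(t): 0 for t < beta, 1 for t >= beta."""
    return 1.0 if t >= beta else 0.0


def g_ramp(alpha, beta, t):
    """g_{alpha,beta}(t): 0 for t <= alpha, 1 for t >= beta, linear in between."""
    if t <= alpha:
        return 0.0
    if t >= beta:
        return 1.0
    return (t - alpha) / (beta - alpha)


def f_ramp(alpha, beta, t):
    """f_{alpha,beta}(t): 0 for t <= alpha, t for t >= beta, linear in between."""
    if t <= alpha:
        return 0.0
    if t >= beta:
        return float(t)
    return beta * (t - alpha) / (beta - alpha)


if __name__ == "__main__":
    # Sample values on [0, 1] for alpha = 1/4, beta = 1/2.
    for k in range(0, 9):
        t = k / 8
        print(t, g_step(0.5, t), g_ramp(0.25, 0.5, t), f_ramp(0.25, 0.5, t))
```

Two pointwise facts used later in the proof can be read off from these formulas: g_ramp(d/2, d, t) does not decrease when d shrinks, and g_step(beta, t) dominates g_ramp(beta, 1, t).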



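Before turning to the proof, observe first that our hypotheses imply that $a$ is not invertible in $A$: indeed, if $a$ were invertible, we would have $a^{1/n} \to 1_A$ as $n \to \infty$, so
\[
1 = \lim_{n \to \infty} \tau\bigl(a^{1/n}\bigr) = d_\tau(a) < d_\tau\bigl(d^{(i)}\bigr) \leq 1,
\]
a contradiction, whence $0 \in \sigma(a)$.

By passing to $M_{n+1}(A)$ (which again has decomposition rank $n$ by [18, Corollary 3.9]), replacing each $d^{(i)} \in A \cong e_{00} M_{n+1}(A) e_{00}$ by a Cuntz equivalent element in the corner $e_{ii} M_{n+1}(A) e_{ii}$ for $i = 1, \ldots, n$, and observing that $d_\tau(a) < d_\tau(d^{(i)})$ also holds for every $\tau \in T(M_{n+1}(A))$, we may as well assume that the $d^{(i)}$ themselves are already pairwise orthogonal.

For the actual proof of the lemma, we distinguish two cases. Suppose first that $0 \in \sigma(a)$ is an isolated point. Then, there is $\theta > 0$ such that $p := g_\theta(a) \in A$ is a projection satisfying $\langle p \rangle = \langle a \rangle$, whence $d_\tau(a) = d_\tau(p) = \tau(p)$ for all $\tau \in T(A)$. Furthermore, for any $\tau \in T(A)$ and $i = 0, \ldots, n$, we have
\[
d_\tau\bigl(d^{(i)}\bigr) = \lim_{\delta \searrow 0} \tau\bigl(g_{\delta/2,\delta}(d^{(i)})\bigr), \tag{9}
\]
so there are $\delta_\tau > 0$ and $\eta_\tau > 0$ such that
\[
\tau(p) = d_\tau(a) < d_\tau\bigl(d^{(i)}\bigr) - \eta_\tau < \tau\bigl(g_{\delta_\tau/2,\delta_\tau}(d^{(i)})\bigr)
\]
for all $i$. Since the elements of $A$ are continuous when regarded as functions on $T(A)$, each $\tau$ has an open neighborhood $U_\tau \subset T(A)$ such that
\[
\tau'(p) < \tau'\bigl(g_{\delta_\tau/2,\delta_\tau}(d^{(i)})\bigr)
\]
for $i = 0, \ldots, n$ and $\tau' \in U_\tau$. Now by compactness of $T(A)$ (and since, for any positive $h$, $g_{\delta'/2,\delta'}(h) \geq g_{\delta/2,\delta}(h)$ if only $\delta' \leq \delta$) it is straightforward to find $\delta_1 > 0$ such that
\[
\tau(p) < \tau\bigl(g_{\delta_1/2,\delta_1}(d^{(i)})\bigr)
\]
for all $i = 0, \ldots, n$ and $\tau \in T(A)$. Now by [36, Proposition 3.7], we have
\[
\langle p \rangle \leq \bigl\langle g_{\delta_1/2,\delta_1}(d^{(0)}) \bigr\rangle + \cdots + \bigl\langle g_{\delta_1/2,\delta_1}(d^{(n)}) \bigr\rangle,
\]
whence
\[
\langle a \rangle = \langle p \rangle \leq \bigl\langle g_{\delta_1/2,\delta_1}(d^{(0)}) \bigr\rangle + \cdots + \bigl\langle g_{\delta_1/2,\delta_1}(d^{(n)}) \bigr\rangle \leq \langle d^{(0)} \rangle + \cdots + \langle d^{(n)} \rangle. \tag{10}
\]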



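Next, suppose 0 is a limit point of $\sigma(a)$. The proof in this case is similar to that of [36, Proposition 3.7], but we have to deal with some extra technical difficulties. Since $a = \lim_{\epsilon \searrow 0} f_{\epsilon/2,\epsilon}(a)$ and since the $d^{(i)}$ are pairwise orthogonal, it will be enough to show that
\[
\bigl\langle f_{\epsilon/2,\epsilon}(a) \bigr\rangle \leq \bigl\langle d^{(0)} + \cdots + d^{(n)} \bigr\rangle
\]
for all $\epsilon > 0$. So, given some $\epsilon > 0$, we set
\[
b := f_{\epsilon/2,\epsilon}(a) \quad \text{and} \quad c := (g_{0,\epsilon/4} - g_{\epsilon/4,\epsilon/2})(a),
\]
then
\[
c \perp b \quad \text{and} \quad \langle b + c \rangle \leq \langle a \rangle.
\]
Since 0 is a limit point of $\sigma(a)$, we have $c \neq 0$, hence (each $\tau \in T(A)$ is faithful by simplicity of $A$, $c$ is continuous as a function on $T(A)$ and $T(A)$ is compact)
\[
\alpha := \min\{ \tau(c) \mid \tau \in T(A) \} > 0.
\]
Using that $c \leq 1_A$ and that $c \perp b$, we obtain for all $\tau \in T(A)$
\[
d_\tau(b) + \alpha \leq d_\tau(b) + \tau(c) \leq d_\tau(b) + d_\tau(c) = d_\tau(b + c) \leq d_\tau(a)
\]
and, by hypothesis,
\[
d_\tau(b) < d_\tau\bigl(d^{(i)}\bigr) - \alpha \tag{11}
\]
for $i = 0, \ldots, n$ and $\tau \in T(A)$.

Again, to show that $\langle b \rangle \leq \langle d^{(0)} + \cdots + d^{(n)} \rangle$ it will suffice to prove that
\[
\bigl\langle f_{\eta,2\eta}(b) \bigr\rangle \leq \bigl\langle d^{(0)} + \cdots + d^{(n)} \bigr\rangle
\]
for any given $\eta > 0$. To this end, we set
\[
\bar{b} := g_{\eta/2,\eta}(b) \tag{12}
\]
and choose $0 < \delta_2 < \alpha/4$ such that
\[
\tau(\bar{b}) < \tau\bigl(\bar{d}^{(i)}\bigr) - \frac{3\alpha}{4} \tag{13}
\]
for $i = 0, \ldots, n$ and $\tau \in T(A)$, where
\[
\bar{d}^{(i)} := g_{\delta_2/2,\delta_2}\bigl(d^{(i)}\bigr).
\]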



The number $\delta_2$ is obtained in a similar way as $\delta_1$ in the first part of the proof, using compactness of $T(A)$: From (9) and (11) we see that for each $\tau \in T(A)$ there is $\delta_\tau > 0$ such that for $i = 0, \ldots, n$
\[
\tau(\bar{b}) \leq d_\tau(b) \overset{(11)}{<} d_\tau\bigl(d^{(i)}\bigr) - \alpha \overset{(9)}{\leq} \tau\bigl(g_{\delta_\tau/2,\delta_\tau}(d^{(i)})\bigr) - \frac{3\alpha}{4}.
\]
Each $\tau$ has an open neighborhood $U_\tau$ such that
\[
\tau'(\bar{b}) < \tau'\bigl(g_{\delta_\tau/2,\delta_\tau}(d^{(i)})\bigr) - \frac{3\alpha}{4}
\]
for $i = 0, \ldots, n$ and $\tau' \in U_\tau$. Similar as in the first part of the proof, compactness of $T(A)$ and (10) now yield $\delta_2 > 0$ such that (13) holds.

Since $\mathrm{dr}\, A = n$, by [18, Proposition 5.1], there is a system $(F_k, \psi_k, \varphi_k)_{k \in \mathbb{N}}$ of c.p. approximations for $A$ such that the $\varphi_k$ are $n$-decomposable and the $\psi_k$ are approximately multiplicative. In other words, for each $k \in \mathbb{N}$ there are finite-dimensional $C^*$-algebras $F_k$ and c.p.c. maps
\[
A \xrightarrow{\psi_k} F_k \xrightarrow{\varphi_k} A
\]
such that

(i) $\varphi_k \psi_k(a) \to a$ for each $a \in A$ as $k \to \infty$,
(ii) $F_k$ admits a decomposition $F_k = \bigoplus_{i=0}^{n} F_k^{(i)}$ such that $\varphi_k^{(i)} := \varphi_k|_{F_k^{(i)}}$ preserves orthogonality (i.e., has order zero in the sense of [35, Definition 2.1(b)]) for each $i = 0, \ldots, n$ and $k \in \mathbb{N}$,
(iii) $\|\psi_k(aa') - \psi_k(a)\psi_k(a')\| \to 0$ for any $a, a' \in A$ as $k \to \infty$.

We set
\[
\psi_k^{(i)}(\,\cdot\,) := 1_{F_k^{(i)}} \psi_k(\,\cdot\,)
\]
for each $i$ and $k$ and note that the $\psi_k^{(i)}$ are also approximately multiplicative for each $i$ since the $1_{F_k^{(i)}}$ are central projections in $F_k$. As in [18, Remark 5.2(ii)], we may (and will) assume that the $\psi_k$ are unital.

Recall from [36, 1.2], that each of the order zero maps $\varphi_k^{(i)}$ has a supporting $*$-homomorphism
\[
\sigma_k^{(i)} : F_k^{(i)} \to A'';
\]
this is a $*$-homomorphism satisfying
\[
\varphi_k^{(i)}(x) = \sigma_k^{(i)}(x)\,\varphi_k^{(i)}\bigl(1_{F_k^{(i)}}\bigr) = \varphi_k^{(i)}\bigl(1_{F_k^{(i)}}\bigr)\,\sigma_k^{(i)}(x) \in A
\]
for all $x \in F_k^{(i)}$.



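We proceed to show that there is $K \in \mathbb{N}$ such that
\[
\tau\bigl(g_{1-\alpha/4}(\psi_k^{(i)}(\bar{b}))\bigr) < \tau\bigl(g_{\alpha/4}(\psi_k^{(i)}(\bar{d}^{(i)}))\bigr) \tag{14}
\]
for all $i = 0, \ldots, n$, $\tau \in T(F_k^{(i)})$ and $k \geq K$. If this was not the case, there would be a strictly increasing sequence $(k_l)_{l \in \mathbb{N}} \subset \mathbb{N}$ such that, for some fixed $i_0 \in \{0, \ldots, n\}$, there are $\tau_l \in T(F_{k_l}^{(i_0)})$ satisfying
\[
\tau_l\bigl(g_{1-\alpha/4}(\psi_{k_l}^{(i_0)}(\bar{b}))\bigr) \geq \tau_l\bigl(g_{\alpha/4}(\psi_{k_l}^{(i_0)}(\bar{d}^{(i_0)}))\bigr) \tag{15}
\]
for all $l \in \mathbb{N}$. But then
\[
\tau_l\bigl(\psi_{k_l}^{(i_0)}(\bar{b})\bigr) \geq \tau_l\bigl(g_{1-\alpha/4}(\psi_{k_l}^{(i_0)}(\bar{b}))\bigr) - \frac{\alpha}{4}
\overset{(15)}{\geq} \tau_l\bigl(g_{\alpha/4}(\psi_{k_l}^{(i_0)}(\bar{d}^{(i_0)}))\bigr) - \frac{\alpha}{4}
\geq \tau_l\bigl(\psi_{k_l}^{(i_0)}(\bar{d}^{(i_0)})\bigr) - 2 \cdot \frac{\alpha}{4} \tag{16}
\]
for all $l \in \mathbb{N}$. Now fix some free ultrafilter $\omega \in \beta\mathbb{N} \setminus \mathbb{N}$, then
\[
\bar{\tau}(\,\cdot\,) := \lim_{\omega} \tau_l \psi_{k_l}^{(i_0)}(\,\cdot\,)
\]
obviously is a well-defined positive functional on $A$. It is tracial, since the $\tau_l$ are traces and the $\psi_{k_l}^{(i_0)}$ are approximately multiplicative. It is a state, since $\bar{\tau}(1_A) = 1$ (the $\tau_l$ are states and the $\psi_{k_l}^{(i_0)}$ are unital). We have now constructed a tracial state $\bar{\tau}$ on $A$ satisfying
\[
\bar{\tau}(\bar{b}) \geq \bar{\tau}\bigl(\bar{d}^{(i_0)}\bigr) - \frac{\alpha}{2},
\]
a contradiction to (13). Therefore, there is $K \in \mathbb{N}$ such that (14) holds for all $i = 0, \ldots, n$, $\tau \in T(F_k^{(i)})$ and $k \geq K$.

As a consequence, for $i = 0, \ldots, n$ and $k \geq K$ there exist partial isometries $v_k^{(i)} \in F_k^{(i)}$ such that
\[
\bigl(v_k^{(i)}\bigr)^* v_k^{(i)} = g_{1-\alpha/4}\bigl(\psi_k^{(i)}(\bar{b})\bigr) \geq g_{1-\alpha/4,1}\bigl(\psi_k^{(i)}(\bar{b})\bigr) \tag{17}
\]
and
\[
v_k^{(i)} \bigl(v_k^{(i)}\bigr)^* \leq g_{\alpha/4}\bigl(\psi_k^{(i)}(\bar{d}^{(i)})\bigr)
\]
(the $g_{1-\alpha/4}(\psi_k^{(i)}(\bar{b}))$ and $g_{\alpha/4}(\psi_k^{(i)}(\bar{d}^{(i)}))$ are projections in $F_k^{(i)}$, which in turn are finite-dimensional algebras, hence satisfy the comparison property).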



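Note that
\[
v_k^{(i)} \bigl(v_k^{(i)}\bigr)^* \leq g_{\alpha/4}\bigl(\psi_k^{(i)}(\bar{d}^{(i)})\bigr) \leq g_{\alpha/4}\bigl(\psi_k(\bar{d}^{(i)})\bigr) \leq g_{0,\delta_2}\bigl(\psi_k(\bar{d}^{(i)})\bigr) \leq 1_{F_k}
\]
for $i = 0, \ldots, n$ and $k \geq K$; since the $v_k^{(i)} (v_k^{(i)})^*$ are projections, from this one easily concludes that
\[
g_{0,\delta_2}\bigl(\psi_k(\bar{d}^{(i)})\bigr)\, v_k^{(i)} \bigl(v_k^{(i)}\bigr)^* = v_k^{(i)} \bigl(v_k^{(i)}\bigr)^*.
\]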

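Because the $\psi_k$ are approximately multiplicative, and using that the $\bar{d}^{(i)}$ are mutually orthogonal, we also have
\[
\Bigl\| g_{0,\delta_2}\Bigl(\psi_k\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\Bigr) - \psi_k\Bigl(g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\Bigr) \Bigr\| \xrightarrow{k \to \infty} 0
\]
and
\[
\Bigl\| g_{0,\delta_2}\bigl(\psi_k(\bar{d}^{(i')})\bigr)\, v_k^{(i)} \bigl(v_k^{(i)}\bigr)^* - \delta_{i',i} \cdot v_k^{(i)} \bigl(v_k^{(i)}\bigr)^* \Bigr\| \xrightarrow{k \to \infty} 0
\]
for $i, i' \in \{0, \ldots, n\}$ ($\delta_{i,i'}$ denotes the Kronecker delta of $i$ and $i'$). Moreover, we have
\[
\Bigl\| g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\, \varphi_k\bigl(v_k^{(i)} (v_k^{(i)})^*\bigr) - \varphi_k\Bigl(\psi_k\Bigl(g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\Bigr)\, v_k^{(i)} (v_k^{(i)})^*\Bigr) \Bigr\| \leq 2\mu_k^{1/2}
\]
by [18, Lemma 3.5] (an easy consequence of Stinespring's theorem), where
\[
\mu_k := \max\Bigl\{ \Bigl\| (\varphi_k \psi_k - \mathrm{id})\Bigl(g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\Bigr) \Bigr\|,\ \Bigl\| (\varphi_k \psi_k - \mathrm{id})\Bigl(g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)^2\Bigr) \Bigr\| \Bigr\}.
\]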

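Observing that $\mu_k \to 0$ as $k \to \infty$ and combining all these facts we obtain
\[
\Bigl\| g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\, \varphi_k\bigl(v_k^{(i)} (v_k^{(i)})^*\bigr) - \varphi_k\bigl(v_k^{(i)} (v_k^{(i)})^*\bigr) \Bigr\| \xrightarrow{k \to \infty} 0
\]
and
\[
\Bigl\| g_{0,\delta_2}\bigl(\bar{d}^{(i')}\bigr)\, \varphi_k\bigl(v_k^{(i)} (v_k^{(i)})^*\bigr) - \delta_{i,i'} \cdot \varphi_k\bigl(v_k^{(i)} (v_k^{(i)})^*\bigr) \Bigr\| \xrightarrow{k \to \infty} 0
\]
for $i, i' = 0, \ldots, n$. Using that
\[
\varphi_k\bigl(v_k^{(i)} (v_k^{(i)})^*\bigr) = \varphi_k^{(i)}\bigl(v_k^{(i)} (v_k^{(i)})^*\bigr) = \varphi_k^{(i)}\bigl(1_{F_k^{(i)}}\bigr)\, \sigma_k^{(i)}\bigl(v_k^{(i)}\bigr)\, \sigma_k^{(i)}\bigl(v_k^{(i)}\bigr)^*
\]
it follows easily that
\[
\Bigl\| \sigma_k^{(i')}\bigl(v_k^{(i')}\bigr)^* \varphi_k^{(i')}\bigl(1_{F_k^{(i')}}\bigr)^{1/2}\, g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\, \varphi_k^{(i)}\bigl(1_{F_k^{(i)}}\bigr)^{1/2} \sigma_k^{(i)}\bigl(v_k^{(i)}\bigr) - \delta_{i',i} \cdot \varphi_k^{(i)}\bigl((v_k^{(i)})^* v_k^{(i)}\bigr) \Bigr\| \xrightarrow{k \to \infty} 0
\]
for $i, i' = 0, \ldots, n$, whence
\[
\Bigl\| s_k^*\, g_{0,\delta_2}\Bigl(\sum_{j=0}^{n} \bar{d}^{(j)}\Bigr)\, s_k - f_{\eta,2\eta}(b)^{1/2} \Bigl( \sum_{i=0}^{n} \varphi_k^{(i)}\bigl((v_k^{(i)})^* v_k^{(i)}\bigr) \Bigr) f_{\eta,2\eta}(b)^{1/2} \Bigr\| \xrightarrow{k \to \infty} 0,
\]
where
\[
s_k := \sum_{i=0}^{n} \varphi_k^{(i)}\bigl(1_{F_k^{(i)}}\bigr)^{1/2}\, \sigma_k^{(i)}\bigl(v_k^{(i)}\bigr)\, f_{\eta,2\eta}(b)^{1/2}
\]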
for k K. We now have

n

(j )

∗ lim fη,2η (b) − sk g0,δ2 d¯ sk

k→∞

j =0

n

(i) (i) ∗ (i) 1 1

= lim fη,2η (b) − fη,2η (b) 2 fη,2η (b) 2

ϕk vk vk

k→∞

i=0

n

(i) (17) (i) 1 1

¯ lim fη,2η (b) − fη,2η (b) 2 ϕk g1−α/4,1 ψk (b) fη,2η (b) 2

k→∞

i=0

n

(i) (i) 1 1

(iii)

¯ = lim fη,2η (b) − fη,2η (b) 2 ϕk ψk g1−α/4,1 (b) fη,2η (b) 2

k→∞

i=0

1 1

¯ η,2η (b) 2

= lim fη,2η (b) − fη,2η (b) 2 g1−α/4,1 (b)f (i)

k→∞

(12)

= 0.

This shows that fη,2η (b)

n

g0,δ2 d¯ (j ) .

j =0

Since η was arbitrary, and because n

j =0

n g0,δ2 d¯ (j ) d¯ (j ) , j =0

1335

1336


it follows that b

n

d (j ) ,

j =0

as desired.

2

Corollary 6.2. Let $A$ be as in the hypotheses of Theorem 3.4. If $A$ has finite decomposition rank, then $A$ has strict comparison of positive elements.

Proof. We prove the contrapositive. Suppose that $A$ does not have strict comparison of positive elements and fix a standard decomposition as in 3.4. Then, condition (7) of Lemma 5.1 holds. It follows that $A$ satisfies the hypotheses of Lemma 4.1, and so strict comparison fails in the manner prescribed in the conclusion of that lemma. In light of Lemma 6.1, this failure excludes the possibility that $A$ has finite decomposition rank. □

The preceding corollary establishes the implication (iii) ⇒ (ii) of Theorem 3.4.

7. Real rank zero

In this section we prove that an algebra of real rank zero which also satisfies the hypotheses of Theorem 3.4 must then satisfy conditions (i)–(vi) of the same theorem, thus completing the proof of our main result. The result is a special case of a theorem of Toan Ho and the first named author which will appear in Toan Ho's PhD thesis. As no preprint of this result was available at the time of writing, we give a proof here which applies only to VI algebras.

Let $X$ be a compact connected Hausdorff space and $a$ a self-adjoint element of $M_n(C(X))$. For each $x \in X$, form an $n$-tuple consisting of the eigenvalues of $a$ listed in decreasing order. For each $m \in \{1, \ldots, n\}$ let $\lambda_m : X \to \mathbb{R}$ be the function whose value at $x$ is the $m$th entry of the eigenvalue $n$-tuple for $x$. The variation of the normalised trace of $a$ (v. [1]), denoted $TV(a)$, is defined as
\[
\sup\Bigl\{ \frac{1}{n} \sum_{m=1}^{n} \bigl(\lambda_m(x) - \lambda_m(y)\bigr) : x, y \in X \Bigr\}.
\]
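The definition of the variation of the normalised trace can be explored numerically once eigenvalue lists are known at some sample points. The sketch below assumes the eigenvalues have already been computed and are supplied as lists; the function names are chosen here for illustration. Because it only looks at finitely many points, it returns a lower bound for the supremum in the definition, not the supremum itself.

```python
from fractions import Fraction


def normalised_trace(eigenvalues):
    """(1/n) * sum of the n eigenvalues at a single point."""
    return sum(eigenvalues, Fraction(0)) / len(eigenvalues)


def trace_variation(eigenvalue_lists):
    """Largest difference of normalised traces over the sampled points.

    Each entry of eigenvalue_lists is the eigenvalue n-tuple at one sample
    point. The sum over m of (lambda_m(x) - lambda_m(y)), divided by n, is the
    difference of the normalised traces at x and y, so maximising over pairs
    amounts to taking max minus min.
    """
    traces = [normalised_trace(ev) for ev in eigenvalue_lists]
    return max(traces) - min(traces)


if __name__ == "__main__":
    # Hypothetical 4x4 example sampled at two points x and y.
    at_x = [Fraction(1), Fraction(1), Fraction(1, 2), Fraction(0)]
    at_y = [Fraction(1, 2), Fraction(0), Fraction(0), Fraction(0)]
    print(trace_variation([at_x, at_y]))
```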

Suppose that $A = \lim_{i \to \infty}(M_{m_i}(C(X_i)), \phi_i)$ is of real rank zero, and let $a$ be a self-adjoint element of some $M_{m_i}(C(X_i))$. Then, by Theorem 1.3 of [1], the variation of the normalised trace tends to zero as $j \to \infty$ for each direct summand of $\phi_{i,j}(a)$ corresponding to a connected component of $X_j$.

Proposition 7.1. Let $A = \lim_{i \to \infty}(A_i, \phi_i)$ be a simple VI algebra with seed space a finite-dimensional CW-complex. If $A$ has real rank zero, then $A$ has bounded dimension growth.

Proof. If $A$ is AF, then there is nothing to prove. If $A$ is not AF, then all but finitely many of the $\phi_i$s contain at least one co-ordinate projection as an eigenvalue map, and each $X_i$ has dimension strictly greater than zero. It will be enough to prove that for each $i \in \mathbb{N}$,
\[
\frac{N_{i,j}}{M_{i,j}} \xrightarrow{j \to \infty} 0.
\]
The proof of Proposition 5.2 then shows that $A$ has bounded dimension growth.



Let $\epsilon > 0$ be given, and suppose for a contradiction that for some $i \in \mathbb{N}$ and $c > 0$ we have
\[
\frac{N_{i,j}}{M_{i,j}} \xrightarrow{j \to \infty} c.
\]
By increasing $i$ if necessary (and following the lines of the proof of Lemma 5.1) we may assume that $c > 7/8$. Choose a continuous function $f : X_i \to [0,1]$ such that for some points $x_0, x_1$ in the same connected component of $X_i$ we have $f(x_0) = 0$ and $f(x_1) = 1$; put $a := f \cdot 1_{A_i}$. For any $j > i$ we have
\[
\phi_{i,j}(a)(x) = \mathrm{diag}\bigl(a(\gamma_1(x)), \ldots, a(\gamma_{M_{i,j}}(x))\bigr), \quad \forall x \in X_j,
\]
where the $\gamma_l$s are the eigenvalue maps of $\phi_{i,j}$. Let $\pi_1, \ldots, \pi_{N_{i,j}} : X_j \to X_i$ be the distinct co-ordinate projections appearing among the $\gamma_l$s. Since $TV(\phi_{i,j}(a))$ is unaffected by unitary conjugation in $A_j$, we may assume that
\[
\phi_{i,j}(a)(x) = \mathrm{diag}\bigl(a(\pi_1(x)), \ldots, a(\pi_{N_{i,j}}(x)), \ldots, a(\gamma_{M_{i,j}}(x))\bigr), \quad \forall x \in X_j.
\]
Fix a point $y_0 \in X_j$ which when viewed as an element of a Cartesian power of $X_i$ has the value $x_0$ in each co-ordinate; define $y_1$ similarly with respect to $x_1$, and notice that $y_0$ and $y_1$ are in the same connected component of $X_j$. Then, the eigenvalue list of $\phi_{i,j}(a)(y_0)$ contains at least $m_i N_{i,j}$ 0s, while the list for $\phi_{i,j}(a)(y_1)$ contains at least $m_i N_{i,j}$ 1s. By the pigeonhole principle, at least $m_i[M_{i,j} - 2(M_{i,j} - N_{i,j})]$ of the eigenfunctions $\lambda_m$ corresponding to $\phi_{i,j}(a)$ have the value 0 at $y_0$ and 1 at $y_1$, while the remaining $2m_i(M_{i,j} - N_{i,j})$ eigenfunctions satisfy $\lambda_m(y_1) - \lambda_m(y_0) \geq -1$. Now,
\[
TV\bigl(\phi_{i,j}(a)\bigr) \geq \frac{1}{m_i M_{i,j}} \sum_{m=1}^{m_j} \bigl(\lambda_m(y_1) - \lambda_m(y_0)\bigr) \geq \frac{M_{i,j} - 4(M_{i,j} - N_{i,j})}{M_{i,j}} = \frac{4N_{i,j}}{M_{i,j}} - 3 > \frac{1}{2}
\]
since $N_{i,j}/M_{i,j} > c > 7/8$. This contradicts our real rank zero assumption for $A$, completing the proof. □

8. Non-Z-stable VI algebras

We now give examples of non-isomorphic VI-algebras which cannot be distinguished using topological K-theory and traces. These are not the first such (examples are already given in [26]), but the results of the present paper allow us to construct a large class of examples with relatively little further effort. They will also demonstrate the variety of tracial state spaces which can occur in a simple nuclear $C^*$-algebra of infinite decomposition rank.



Any subgroup $G$ of $\mathbb{Q}$ corresponds to a list of prime powers $P_G = \{p_1^{n_1}, p_2^{n_2}, \ldots\}$, $n_i \in \mathbb{Z}^+ \cup \{\infty\}$, in the following sense: the elements of $G$ are those rationals which, when in lowest terms, have denominators of the form $p_1^{r_1} p_2^{r_2} \cdots$, where $r_i < n_i$ for all $i$ and $r_i = 0$ for all but finitely many $i$. If $n_i = \infty$ for some $i$ then we will say that $G$ is of infinite type. Let $p$ be a prime. If $p^\infty \in P_G$ and $H$ is the subgroup of $\mathbb{Q}$ with $P_H = \{p^\infty\}$, then $H \otimes G \cong G$.

Let $X$ be a contractible and finite-dimensional CW-complex. Construct a VI algebra $A_X = \lim_{i \to \infty}(A_i, \phi_i)$ satisfying:

(i) The ratio $N_{1,j}/M_{1,j}$ does not vanish;
(ii) $A_X$ is simple by virtue of a judicious inclusion of point evaluations as eigenvalue maps of the $\phi_i$;
(iii) The $K_0$-group of $A_X$ (necessarily a subgroup of $\mathbb{Q}$ by the contractibility of $X_1 = X$ and each $X_i$) is of infinite type.

Inspection of Villadsen's construction in [33] shows that for a fixed $X$, one can arrange for $K_0(A_X)$ to be an arbitrary infinite type subgroup of $\mathbb{Q}$. There are uncountably many such subgroups, and hence, for a fixed $X$, uncountably many non-isomorphic algebras $A_X$ satisfying (i)–(iii). Condition (i) ensures that $A_X$ does not have strict comparison of positive elements (use Lemmas 4.1 and 5.1).

Fix an algebra $A_X$ as above. Let $p$ be a prime such that $p^\infty \in P_{K_0(A_X)}$, and let $\mathcal{U}$ be a UHF algebra with $P_{K_0(\mathcal{U})} = \{p^\infty\}$. We claim that the tensor product $A_X \otimes \mathcal{U}$ has the same topological K-theory and tracial state space as $A_X$. At the level of K-theory this statement follows from the Künneth theorem, the triviality of the $K_1$-groups of both $\mathcal{U}$ and $A_X$ (in the case of $A_X$ this is due to the contractibility of $X_i$), and the isomorphism $K_0(A_X) \otimes K_0(\mathcal{U}) \cong K_0(A_X)$. At the level of tracial state spaces the statement is due to the fact that $\mathcal{U}$ admits a unique tracial state. There is only one possible pairing of traces with $K_0$ in each of $A_X$ and $A_X \otimes \mathcal{U}$, as their $K_0$-groups are subgroups of the rationals.
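The correspondence between subgroups of the rationals and lists of prime powers described at the start of this section can be made concrete with a membership test. The sketch below follows the convention stated in the text (a prime p may occur in a reduced denominator with exponent r only if r is strictly less than n_p). The dictionary encoding, in which primes that are not listed are treated as having n_p = 1, and the function names are assumptions made for this illustration.

```python
from fractions import Fraction
import math


def in_subgroup(q, exponents):
    """Decide whether the rational q lies in the subgroup described by `exponents`.

    `exponents` maps a prime p to n_p (a positive integer or math.inf). Primes
    missing from the dict are treated as having n_p = 1, so they may not divide
    the reduced denominator at all (an encoding chosen for this sketch).
    """
    denominator = Fraction(q).denominator  # Fraction stores lowest terms
    for p, n_p in exponents.items():
        r = 0
        while denominator % p == 0:
            denominator //= p
            r += 1
        if not r < n_p:
            return False
    # Any prime factor left over is absent from the dict and needs r = 0.
    return denominator == 1


def is_infinite_type(exponents):
    """True if some prime carries the exponent infinity."""
    return any(n_p == math.inf for n_p in exponents.values())


if __name__ == "__main__":
    P = {2: math.inf, 3: 2}  # hypothetical list: 2^infinity and 3^2
    print(in_subgroup(Fraction(5, 8), P), in_subgroup(Fraction(1, 9), P))
    print(is_infinite_type(P))
```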
As noted above, $A_X$ does not have strict comparison of positive elements, but $A_X \otimes \mathcal{U}$ does by virtue of [22, Lemma 5.1]. Thus, $A_X$ and $A_X \otimes \mathcal{U}$ are not isomorphic, and by varying $X$ and $K_0(A_X)$ independently we obtain a large class of examples of the desired variety.

Straightforward but laborious calculation shows that the tracial state space of $A_X$ as above is a Bauer simplex with extreme boundary homeomorphic to $X^{\times \infty}$. (The details of this calculation are more or less contained in the proof of [30, Theorem 4.1]; we will not reproduce them here.) $A_X$ also has infinite decomposition rank by Theorem 3.4, and this does not depend on $X$ being contractible. Thus, a large variety of structure can occur in the tracial state space of a simple nuclear $C^*$-algebra with infinite decomposition rank.

Finally, we remark that infinite decomposition rank can also occur in the case of a simple AH algebra with unique tracial state, as observed in [36, Example 6.6(i)], using the examples of [34].

Acknowledgments

Part of the work on this paper was carried out while the second named author visited the first at the University of New Brunswick (UNB). Both authors thank UNB and the Atlantic Centre for Operator Algebras for their support.



References

[1] B. Blackadar, O. Bratteli, G.A. Elliott, A. Kumjian, Reduction of real rank in inductive limits of C*-algebras, Math. Ann. 292 (1992) 111–126.
[2] B. Blackadar, D. Handelman, Dimension functions and traces on C*-algebras, J. Funct. Anal. 45 (1982) 297–340.
[3] B. Blackadar, A. Kumjian, M. Rørdam, Approximately central matrix units and the structure of noncommutative tori, K-Theory 6 (1992) 267–284.
[4] O. Bratteli, Inductive limits of finite dimensional C*-algebras, Trans. Amer. Math. Soc. 171 (1972) 195–234.
[5] M. Dadarlat, Reduction to dimension three of local spectra of real rank zero C*-algebras, J. Reine Angew. Math. 460 (1995) 189–212.
[6] G.A. Elliott, On the classification of inductive limits of sequences of semi-simple finite-dimensional algebras, J. Algebra 38 (1976) 29–44.
[7] G.A. Elliott, On the classification of C*-algebras of real rank zero, J. Reine Angew. Math. 443 (1993) 179–219.
[8] G.A. Elliott, An invariant for simple C*-algebras, in: Canadian Math. Soc. 1945–1995, vol. 3, Canadian Math. Soc., Ottawa, ON, 1996, pp. 61–90.
[9] G.A. Elliott, The classification problem for amenable C*-algebras, in: Proc. ICM '94, Zurich, Switzerland, Birkhäuser, Basel, Switzerland, 1995, pp. 922–932.
[10] G.A. Elliott, G. Gong, On the classification of C*-algebras of real rank zero. II, Ann. of Math. (2) 144 (1996) 497–610.
[11] G.A. Elliott, G. Gong, L. Li, Approximate divisibility of simple inductive limit C*-algebras, Contemp. Math. 228 (1998) 87–97.
[12] G.A. Elliott, G. Gong, L. Li, On the classification of simple inductive limit C*-algebras, II: The isomorphism theorem, Invent. Math. 168 (2007) 249–320.
[13] J. Glimm, On a certain class of operator algebras, Trans. Amer. Math. Soc. 95 (1960) 318–340.
[14] G. Gong, On inductive limits of matrix algebras over higher-dimensional spaces. I, II, Math. Scand. 80 (1997) 41–55, 56–100.
[15] G. Gong, On the classification of simple inductive limit C*-algebras I. The reduction theorem, Doc. Math. 7 (2002) 255–461.
[16] X. Jiang, H. Su, On a simple unital projectionless C*-algebra, Amer. J. Math. 121 (1999) 359–413.
[17] E. Kirchberg, The classification of Purely Infinite C*-algebras using Kasparov's theory, Fields Inst. Commun., in press.
[18] E. Kirchberg, W. Winter, Covering dimension and quasidiagonality, Internat. J. Math. 15 (2004) 63–85.
[19] H. Lin, Classification of simple C*-algebras with tracial topological rank zero, Duke Math. J. 125 (2004) 91–119.
[20] H. Lin, Simple nuclear C*-algebras of tracial topological rank one, J. Funct. Anal. 251 (2007) 601–679.
[21] N.C. Phillips, A classification theorem for nuclear purely infinite simple C*-algebras, Doc. Math. 5 (2000) 49–114.
[22] M. Rørdam, On the structure of simple C*-algebras tensored with a UHF-algebra, II, J. Funct. Anal. 107 (1992) 255–269.
[23] M. Rørdam, Classification of Nuclear C*-Algebras, Encyclopaedia Math. Sci., vol. 126, Springer-Verlag, Berlin–Heidelberg, 2002.
[24] M. Rørdam, A simple C*-algebra with a finite and an infinite projection, Acta Math. 191 (2003) 109–142.
[25] M. Rørdam, The stable and the real rank of Z-absorbing C*-algebras, Internat. J. Math. 15 (2004) 1065–1084.
[26] A.S. Toms, On the classification problem for nuclear C*-algebras, Ann. of Math. (2) 167 (2008) 1029–1044.
[27] A.S. Toms, On the independence of K-theory and stable rank for simple C*-algebras, J. Reine Angew. Math. 578 (2005) 185–199.
[28] A.S. Toms, Flat dimension growth for C*-algebras, J. Funct. Anal. 238 (2006) 678–708.
[29] A.S. Toms, Stability in the Cuntz semigroup of a commutative C*-algebra, Proc. London Math. Soc. 96 (2008) 1–25.
[30] A.S. Toms, An infinite family of non-isomorphic C*-algebras with identical K-theory, preprint, 2006, arXiv: math.OA/0609214, Trans. Amer. Math. Soc., in press.
[31] A.S. Toms, W. Winter, Strongly self-absorbing C*-algebras, Trans. Amer. Math. Soc. 359 (2007) 3999–4029.
[32] A.S. Toms, W. Winter, Z-stable ASH algebras, Canad. J. Math. 60 (2008) 703–720.
[33] J. Villadsen, Simple C*-algebras with perforation, J. Funct. Anal. 154 (1998) 110–116.
[34] J. Villadsen, On the stable rank of simple C*-algebras, J. Amer. Math. Soc. 12 (1999) 1091–1102.
[35] W. Winter, Covering dimension for nuclear C*-algebras, J. Funct. Anal. 199 (2003) 535–556.
[36] W. Winter, On topologically finite-dimensional simple C*-algebras, Math. Ann. 332 (2005) 843–878.
[37] W. Winter, On the classification of simple Z-stable C*-algebras with real rank zero and finite decomposition rank, J. London Math. Soc. 74 (2006) 167–183.
[38] W. Winter, Simple C*-algebras with locally finite decomposition rank, J. Funct. Anal. 243 (2007) 394–425.


Combinatorial independence in measurable dynamics

David Kerr a,∗, Hanfeng Li b

a Department of Mathematics, Texas A&M University, College Station, TX 77843-3368, USA
b Department of Mathematics, SUNY at Buffalo, Buffalo, NY 14260-2900, USA

Received 15 February 2008; accepted 15 December 2008
Available online 14 January 2009
Communicated by K. Ball

Abstract

We develop a fine-scale local analysis of measure entropy and measure sequence entropy based on combinatorial independence. The concepts of measure IE-tuples and measure IN-tuples are introduced and studied in analogy with their counterparts in topological dynamics. Local characterizations of the Pinsker von Neumann algebra and its sequence entropy analogue are given in terms of combinatorial independence, $\ell_1$ geometry, and Voiculescu's completely positive approximation entropy. Among the novel features of our local study is the treatment of general discrete acting groups, with the structural assumption of amenability in the case of entropy.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Independence; Entropy; Amenability; Weak mixing

* Corresponding author.
E-mail addresses: [email protected] (D. Kerr), [email protected] (H. Li).
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.014
D. Kerr, H. Li / Journal of Functional Analysis 256 (2009) 1341–1386

1. Introduction

Many of the fundamental concepts in measurable dynamics revolve around the notion of probabilistic independence as an indicator of randomness or unpredictability. Ergodicity, weak mixing, and mixing are all expressions of asymptotic independence, whether in a mean or strict sense. At a stronger level, completely positive entropy can be characterized by a type of uniform asymptotic independence (see [12]).

In topological dynamics the appropriate notion of independence is the combinatorial (or set-theoretic) one, according to which a family of tuples of subsets of a set is independent if when picking any one subset from each of finitely many tuples one always ends up with a collection having nonempty intersection. Combinatorial independence manifests itself dynamically in many ways and has long played an important role in the topological theory, although it has not received the same kind of systematic attention as probabilistic independence has in measurable dynamics. In fact it has only been recently that precise relationships have been established between independence and the properties of nullness, tameness, and positive entropy [22,30]. For example, a topological Z-system has uniformly positive entropy if and only if the orbit of each pair of nonempty open subsets of the space is independent along a positive density subset of Z [22] (see [30] for a combinatorial proof that applies more generally to actions of discrete amenable groups).

The aim of this paper is to develop a theory of combinatorial independence in measurable dynamics. Among other things, this will provide the missing link for a geometric understanding of local entropy production in connection with Voiculescu's operator-algebraic notion of approximation entropy [46].

One of our main motivations is to establish local combinatorial and linear-geometric characterizations of positive entropy and positive sequence entropy. For automorphisms of a Lebesgue space, the extreme situation of complete positive entropy was characterized in terms of combinatorial independence by Glasner and Weiss in Section 3 of [16] using Karpovsky and Milman's generalization of the Sauer–Perles–Shelah lemma. What we see in this case however is an essentially topological phenomenon whereby independence over positive density subsets of iterates occurs for every finite partition of the space into sets of positive measure (cf. Theorem 3.9 in this paper).
This does not help us much in the analysis of entropy production for other kinds of systems, as it can easily happen that combinatorial independence is present but not in a robust enough way to be measure-theoretically meaningful (indeed every free ergodic Z-system has a minimal topological model with uniformly positive entropy [15]). We seek moreover a fine-scale localization predicated not on partitions but rather on tuples of subsets that together compose only a very small fraction of the space, which the Glasner-Weiss result provides for Z-systems with completely positive entropy but in the purely topological sense of [30]. It turns out that we should ask whether combinatorial independence can be observed to the appropriate degree in orbits of tuples of subsets whenever we hide from view a small portion of the ambient space at each stage of the dynamics. Thus the recognition of positive entropy or positive sequence entropy becomes a purely combinatorial issue, with the measure being relegated to the role of observational control device. This way of counting sets appears in the global entropy formulas of Katok for metrizable topological Z-systems with an ergodic invariant measure [26], which rely on the Shannon–McMillan–Breiman theorem for the uniformization of entropy measurement. Here we avoid the Shannon–McMillan–Breiman theorem in our focus on local entropy production and its relation to independence for arbitrary systems. What is particularly important at the technical level is that we be able to vary the obscured part of the space across group elements when making an observation (see Section 2.1), as this will permit us to work with L2 perturbations and thereby establish the link with Voiculescu’s approximation entropy. 
We will thus be developing probabilistic arguments that will render the theory rather different from the topological one, despite the obvious analogies in the statements of the main results, although we will make use of the key combinatorial lemma from [30]. Our basic framework will be that of a discrete group acting on a compact Hausdorff space with an invariant Borel probability measure, with the structural assumption of amenability on the group in the context of entropy. With a couple of exceptions, our results do not require any restrictions of metrizability on the space or countability on the group. In analogy with topological IE-tuples and IN-tuples [30], we introduce the notions of measure IE-tuple (in the entropy context) and measure IN-tuple (in the sequence entropy context) as tuples of points in the space such that the orbit of every tuple of neighbourhoods of the respective points exhibits independence with fixed density on certain finite subsets. For IE-tuples these finite subsets will be required to be approximately invariant in the sense of the Følner characterization of amenability, while for IN-tuples we will demand that they can be taken to be arbitrarily large.

Our main application of measure IE-tuples will be the derivation of a series of local descriptions of the Pinsker σ-algebra (or maximal zero entropy factor) in terms of combinatorial independence, $\ell_1$ geometry, and Voiculescu's c.p. (completely positive) approximation entropy (Theorem 3.7). These local descriptions are formulated as conditions on an $L^\infty$ function $f$ which are equivalent to $f$ not being contained in the Pinsker von Neumann algebra, i.e., the von Neumann subalgebra corresponding to the Pinsker σ-algebra. These conditions include:

(1) there exist $\lambda \ge 1$ and $d > 0$ such that every $L^2$ perturbation of the orbit of $f$ exhibits $\lambda$-equivalence to the standard basis of $\ell_1$ over subsets of Følner sets with density at least $d$,
(2) the local c.p. approximation entropy with respect to $f$ is positive.

If the action is ergodic we can add:

(3) every $L^2$ perturbation of the orbit of $f$ contains a subset of positive asymptotic density which is equivalent to the standard basis of $\ell_1$.

In the case that $f$ is continuous we can add:

(4) $f$ separates a measure IE-pair.

This provides new geometric insight into the phenomenon of positive c.p. approximation entropy, in parallel to what was done in the topological setting for Voiculescu–Brown approximation entropy in [28,29]. In fact the only way to establish positive c.p.
approximation entropy until now has been by means of a comparison with Connes–Narnhofer–Thirring entropy, whose definition is based on Abelian models (see Proposition 3.6 in [46]). We also do not require the Shannon–McMillan–Breiman theorem, which factors critically into Voiculescu’s proof for ∗ -automorphisms in the separable commutative ergodic case that c.p. approximation entropy coincides with the underlying measure entropy [46, Corollary 3.8]. One consequence of the characterization of elements in the Pinsker von Neumann algebra given by condition (1) is a linear-geometric explanation for the well-known disjointness between zero entropy systems and systems with completely positive entropy, as discussed at the end of Section 3. The notion of measure entropy tuple was introduced in [4] in the pair case and in [22] in general and has been a key tool in the local study of both measure entropy and topological entropy for Z-systems (see Section 19 of [12]). We show in Theorem 2.27 that nondiagonal measure IE-tuples are the same as measure entropy tuples. The argument depends in part on a theorem of Huang and Ye for Z-systems from [22], whose proof involves taking powers of the generating automorphism and thus does not extend as is to actions of amenable groups. For more general systems we reduce to Huang and Ye’s result by applying the orbit equivalence technique of Rudolph and Weiss [42]. We point this out in particular because, with the exception of the product formula of Theorem 2.30 and the characterizations of completely positive entropy in Theorem 3.9, our study of measure IE-tuples and their relation to the topological theory does



not otherwise rely on orbit equivalence or any special treatment of the integer action case, in contrast to what the measure entropy tuple approach in its present $\mathbb{Z}$-system form seems to demand (see [11,22]). It is worth emphasizing however that we do need the relation with measure entropy tuples to establish the product formula for measure IE-tuples (Theorem 2.30), while the corresponding product formula for topological IE-tuples as established in Theorem 3.15 of [30] completely avoids the entropy tuple perspective, which would only serve to complicate matters (compare the proof of the entropy pair product formula for topological $\mathbb{Z}$-systems in [11]). We also show (without the use of orbit equivalence) that the set of topological IE-tuples is the closure of the union of the sets of measure IE-tuples over all invariant Borel probability measures (Theorem 2.21), and furthermore that when the space is metrizable there exists an invariant Borel probability measure such that the sets of measure IE-tuples and topological IE-tuples coincide (Theorem 2.23). In the $\mathbb{Z}$-system setting, the latter result for entropy pairs was established in [3] and more generally for entropy tuples in [22].

One of the major advantages of the combinatorial viewpoint is the universal nature of its application to entropy and independence density problems, as was demonstrated in the topological-dynamical domain in [30]. This means that many of the methods we develop for the study of measure IE-tuples apply equally well to the sequence entropy context of measure IN-tuples. Accordingly, using measure IN-tuples we are able to establish various local descriptions of the maximal null von Neumann algebra, i.e., the sequence entropy analogue of the Pinsker von Neumann algebra (Theorem 5.5).
We thus have the following types of conditions on an $L^\infty$ function $f$ characterizing its failure to be contained in the maximal null von Neumann algebra:

(1) there exist $\lambda \ge 1$ and $d > 0$ such that every $L^2$ perturbation of the orbit of $f$ contains arbitrarily large finite subsets possessing subsets of density at least $d$ which are $\lambda$-equivalent to the standard basis of $\ell_1$ in the corresponding dimension,
(2) the local sequence c.p. approximation entropy with respect to $f$ is positive for some sequence,

and, in the case that $f$ is continuous,

(3) $f$ separates a measure IN-pair.

Here, however, additional equivalent conditions arise that have no counterpart on the entropy side, such as:

(4) every $L^2$ perturbation of the orbit of $f$ contains an infinite subset which is equivalent to the standard basis of $\ell_1$, and
(5) every $L^2$ perturbation of the orbit of $f$ contains arbitrarily large finite subsets which are $\lambda$-equivalent to the standard basis of $\ell_1$ for some $\lambda > 0$.

The presence of such conditions reflects the fact that there is a strong dichotomy between nullness and nonnullness, which registers as compactness vs. noncompactness for orbit closures in $L^2$ and is thus tied to weak mixing and the issue of finite-dimensionality for group subrepresentations. Notice that the appearance of condition (4) indicates that the distinction between tameness and nullness in topological dynamics collapses in the measurable setting. In parallel with measure IE-tuples, it turns out (Theorem 4.9) that nondiagonal measure IN-tuples are the same as measure



sequence entropy tuples as introduced in [21], which leads in particular to a simple product formula (Theorem 4.12).

The main body of the paper is divided into four sections. Section 2 consists of four subsections. The first discusses measure independence density for tuples of subsets, while in the second we define measure IE-tuples and establish several basic properties. In the third subsection we address the problem of realizing IE-tuples as measure IE-tuples. The fourth subsection contains the proof that nondiagonal measure IE-tuples are the same as measure entropy tuples and includes the product formula for measure IE-tuples. Section 3 furnishes the local characterizations of the Pinsker von Neumann algebra. In Section 4 we define measure IN-tuples, record their basic properties, show that nondiagonal measure IN-tuples are the same as sequence measure entropy tuples, and derive the measure IN-tuple product formula. Finally, in Section 5 we establish the local characterizations of the maximal null von Neumann algebra.

We now describe some of the basic concepts and notation used in the paper. A collection $\{(A_{i,1}, \dots, A_{i,k}) : i \in I\}$ of $k$-tuples of subsets of a given set is said to be independent if $\bigcap_{i\in J} A_{i,\sigma(i)} \neq \emptyset$ for every finite set $J \subseteq I$ and $\sigma \in \{1, \dots, k\}^J$. The following definition captures a relativized version of this idea of combinatorial independence in a group action context and forms the basis for our analysis of measure-preserving dynamics. The relativized form is not necessary for topological dynamics (cf. Definition 2.1 of [30]) but becomes crucial in the measure-preserving case, where we will need to consider independence relative to subsets of nearly full measure.

Definition 1.1. Let $G$ be a group acting on a set $X$. Let $\mathbf{A} = (A_1, \dots, A_k)$ be a tuple of subsets of $X$. Let $D$ be a map from $G$ to the power set $2^X$ of $X$, with the image of $s \in G$ written as $D_s$. We say that a set $J \subseteq G$ is an independence set for $\mathbf{A}$ relative to $D$ if for every nonempty finite subset $F \subseteq J$ and map $\sigma : F \to \{1, \dots, k\}$ we have $\bigcap_{s\in F} (D_s \cap s^{-1}A_{\sigma(s)}) \neq \emptyset$. For a subset $D$ of $X$, we say that $J$ is an independence set for $\mathbf{A}$ relative to $D$ if for every nonempty finite subset $F \subseteq J$ and map $\sigma : F \to \{1, \dots, k\}$ we have $D \cap \bigcap_{s\in F} s^{-1}A_{\sigma(s)} \neq \emptyset$, i.e., if $J$ is an independence set for $\mathbf{A}$ relative to the map $G \to 2^X$ with constant value $D$.

By a topological dynamical system we mean a pair $(X, G)$ where $X$ is a compact Hausdorff space and $G$ is a discrete group acting on $X$ by homeomorphisms. We will also speak of a topological $G$-system. In this context we will always use $\mathcal{B}$ to denote the Borel σ-algebra of $X$. Given a $G$-invariant Borel probability measure $\mu$ on $X$, we will invariably write $\alpha$ for the induced action of $G$ on $L^\infty(X, \mu)$ given by $\alpha_s(f)(x) = f(s^{-1}x)$ for all $s \in G$, $f \in L^\infty(X, \mu)$, and $x \in X$. Given another topological $G$-system $(Y, G)$, a continuous surjective $G$-equivariant map $X \to Y$ will be called a topological $G$-factor map. In this situation we will regard $C(Y)$ as a unital $C^*$-subalgebra of $C(X)$.

By a measure-preserving dynamical system we mean a quadruple $(X, \mathcal{X}, \mu, G)$ where $(X, \mathcal{X}, \mu)$ is a probability space and $G$ is a discrete group acting on $(X, \mathcal{X}, \mu)$ by $\mu$-preserving bimeasurable transformations. The expression measure-preserving $G$-system will also be used. The action of $G$ is said to be free if for every $s \in G \setminus \{e\}$ the fixed-point set $\{x \in X : sx = x\}$ has measure zero. A topological model for $(X, \mathcal{X}, \mu, G)$ is a measure-preserving $G$-system $(Y, \mathcal{Y}, \nu, G)$ isomorphic to $(X, \mathcal{X}, \mu, G)$ such that $(Y, G)$ is a topological dynamical system.
We will actually work for the most part with an invariant Borel probability measure for a topological dynamical system instead of an abstract measure-preserving dynamical system, since the local study of independence properties requires the specification of a topological model and such a specification entails no essential loss of generality from the measure-theoretic viewpoint.
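To make Definition 1.1 concrete, the following Python sketch checks the independence-set condition by brute force in a toy setting that is our own assumption and not taken from the paper: the cyclic group $\mathbb{Z}/n$ acting on itself by translation, so that $s^{-1}A = \{x : x + s \in A\}$. The names `preimage` and `is_independence_set` are hypothetical, and tuple entries are indexed from 0 rather than 1.

```python
from itertools import combinations, product


def preimage(s, A, n):
    """s^{-1}A for the translation action s.x = (x + s) mod n on X = Z/n."""
    return {x for x in range(n) if (x + s) % n in A}


def is_independence_set(J, A, D, n):
    """Literal check of Definition 1.1 (toy setting, illustration only).

    J : finite set of group elements (integers mod n)
    A : list of subsets A[0], ..., A[k-1] of X = {0, ..., n-1}
    D : dict mapping each group element s to the subset D[s] of X
    """
    J = sorted(J)
    k = len(A)
    # Every nonempty finite F inside J and every map sigma : F -> {0..k-1}.
    for r in range(1, len(J) + 1):
        for F in combinations(J, r):
            for sigma in product(range(k), repeat=r):
                common = set(range(n))
                for s, j in zip(F, sigma):
                    common &= D[s] & preimage(s, A[j], n)
                if not common:
                    return False
    return True
```

With $n = 8$, $A_0 = \{0,1,2,3\}$ and $A_1 = \{4,5,6,7\}$, the set $J = \{0, 2\}$ passes the check relative to the constant map with value $X$, while $J = \{0, 4\}$ fails. Replacing $X$ by $D = X \setminus \{6, 7\}$ already destroys the independence of $\{0, 2\}$, which is the kind of sensitivity to obscured portions of the space that the relativized definition is designed to detect.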



Our basic setting will thus consist of $(X, G)$ along with a $G$-invariant Borel probability measure $\mu$. In Sections 2 and 3 we will also suppose $G$ to be amenable, as the entropy context naturally requires. For a finite $K \subseteq G$ and $\varepsilon > 0$ we write $M(K, \varepsilon)$ for the set of all nonempty finite subsets $F$ of $G$ which are $(K, \varepsilon)$-invariant in the sense that $|\{s \in F : Ks \subseteq F\}| \ge (1 - \varepsilon)|F|$. The Følner characterization of amenability asserts that $M(K, \varepsilon)$ is nonempty for every finite set $K \subseteq G$ and $\varepsilon > 0$. Given a real-valued function $\varphi$ on the finite subsets of $G$ we define the limit supremum and limit infimum of $\varphi(F)/|F|$ as $F$ becomes more and more invariant by

\[
\lim_{(K,\varepsilon)} \, \sup_{F \in M(K,\varepsilon)} \frac{\varphi(F)}{|F|}
\qquad\text{and}\qquad
\lim_{(K,\varepsilon)} \, \inf_{F \in M(K,\varepsilon)} \frac{\varphi(F)}{|F|}
\]

respectively, where the net is constructed by stipulating that $(K, \varepsilon) \succ (K', \varepsilon')$ if $K \supseteq K'$ and $\varepsilon \le \varepsilon'$. These limits coincide under the following conditions:

(1) $0 \le \varphi(A) < +\infty$ and $\varphi(\emptyset) = 0$,
(2) $\varphi(A) \le \varphi(B)$ whenever $A \subseteq B$,
(3) $\varphi(As) = \varphi(A)$ for all finite $A \subseteq G$ and $s \in G$,
(4) $\varphi(A \cup B) \le \varphi(A) + \varphi(B)$ if $A \cap B = \emptyset$.
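The $(K, \varepsilon)$-invariance condition is easy to test directly. The sketch below does so for the group $\mathbb{Z}$ written additively, so that $Ks \subseteq F$ becomes $K + s \subseteq F$; the choice of group and the function name `is_invariant` are assumptions made for illustration only.

```python
def is_invariant(F, K, eps):
    """Check (K, eps)-invariance of a finite subset F of (Z, +).

    F is (K, eps)-invariant when #{s in F : K + s is contained in F}
    is at least (1 - eps)|F|. Only nonempty F belong to M(K, eps).
    """
    F = set(F)
    if not F:
        return False
    interior = [s for s in F if all(k + s in F for k in K)]
    return len(interior) >= (1 - eps) * len(F)
```

For $K = \{-1, 0, 1\}$ and $\varepsilon = 0.05$, the interval $\{0, \dots, 99\}$ is accepted because only its two endpoints fail the test, whereas the short interval $\{0, \dots, 9\}$ and the set of even numbers below 200 are rejected. Long intervals thus serve as the basic examples of Følner sets in $\mathbb{Z}$.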

For the coincidence of these limits see Section 6 of [32] and the last part of Section 3 in [30]. The four conditions above hold in the definition of measure entropy, which we recall next. The entropy of a finite measurable partition $\mathcal{P}$ of a probability space $(X, \mathcal{X}, \mu)$ is defined by $H(\mathcal{P}) = \sum_{P\in\mathcal{P}} -\mu(P) \ln \mu(P)$ (sometimes we write $H_\mu(\mathcal{P})$ for precision). Let $(X, \mathcal{X}, \mu, G)$ be a measure-preserving dynamical system. For a finite set $F \subseteq G$, we abbreviate the join $\bigvee_{s\in F} s^{-1}\mathcal{P}$ to $\mathcal{P}^F$. When $G$ is amenable, we write $h_\mu(\mathcal{P})$ (or sometimes $h_\mu(X, \mathcal{P})$) for the limit of $\frac{1}{|F|} H(\mathcal{P}^F)$ as $F$ becomes more and more invariant, and we define the measure entropy $h_\mu(X)$ to be the supremum of $h_\mu(\mathcal{P})$ over all finite Borel partitions $\mathcal{P}$ of $X$. For general $G$, given a sequence $\mathfrak{s} = \{s_i\}_{i\in\mathbb{N}}$ in $G$ we set $h_\mu(\mathcal{P}; \mathfrak{s}) = \limsup_{n\to\infty} \frac{1}{n} H\big(\bigvee_{i=1}^{n} s_i^{-1}\mathcal{P}\big)$ and define the measure sequence entropy $h_\mu(X; \mathfrak{s})$ to be the supremum of $h_\mu(\mathcal{P}; \mathfrak{s})$ over all finite measurable partitions $\mathcal{P}$. The system is said to be null if $h_\mu(X; \mathfrak{s}) = 0$ for all sequences $\mathfrak{s}$ in $G$. The conditional entropy of a finite measurable partition $\mathcal{P} = \{P_1, \dots, P_n\}$ with respect to a σ-subalgebra $\mathcal{A} \subseteq \mathcal{X}$ is defined by

\[
H(\mathcal{P} \mid \mathcal{A}) = \int_X I_{\mathcal{A}}(\mathcal{P})(x) \, d\mu(x)
\]

where $I_{\mathcal{A}}(\mathcal{P})(x) = -\sum_{i=1}^{n} 1_{P_i}(x) \ln \mu(P_i \mid \mathcal{A})(x)$ is the conditional information function. For references on entropy see [12,35,47].

A unitary representation $\pi : G \to \mathcal{B}(\mathcal{H})$ of a discrete group $G$ is said to be weakly mixing if for all $\xi, \zeta \in \mathcal{H}$ the function $f_{\xi,\zeta}(s) = \langle \pi(s)\xi, \zeta \rangle$ on $G$ satisfies $\mathfrak{m}(|f_{\xi,\zeta}|) = 0$, where $\mathfrak{m}$ is the unique invariant mean on the space of weakly almost periodic bounded functions on $G$. A subset $J$ of $G$ is syndetic if there is a finite set $F \subseteq G$ such that $FJ = G$ and thickly syndetic if for every finite set $F \subseteq G$ the set $\bigcap_{s\in F} sJ$ is syndetic. Weak mixing is equivalent to each of the following conditions:

(1) $\pi$ has no nonzero finite-dimensional subrepresentations,
(2) for every finite set $F \subseteq \mathcal{H}$ and $\varepsilon > 0$ there exists an $s \in G$ such that $|\langle \pi(s)\xi, \zeta \rangle| < \varepsilon$ for all $\xi, \zeta \in F$,
(3) for all $\xi, \zeta \in \mathcal{H}$ and $\varepsilon > 0$ the set of all $s \in G$ such that $|\langle \pi(s)\xi, \zeta \rangle| < \varepsilon$ is thickly syndetic.

We say that a measure-preserving dynamical system $(X, \mathcal{X}, \mu, G)$ is weakly mixing if the associated unitary representation of $G$ on $L^2(X, \mu) \ominus \mathbb{C}1$ is weakly mixing. For references on weak mixing see [2,12].

For a probability space $(X, \mathcal{X}, \mu)$ we write $\|\cdot\|_\mu$ for the corresponding Hilbert space norm on elements of $L^\infty(X, \mu)$, i.e., $\|f\|_\mu = \mu(|f|^2)^{1/2}$.

After this paper was completed we received a preprint by Huang, Ye, and Zhang [24] which uses orbit equivalence to establish a local variational principle for measure-preserving actions of countable discrete amenable groups on compact metrizable spaces. For such systems they provide an entropy tuple variational relation (cf. Section 2.3 herein) and a positive answer to our Question 2.10. They also obtained what appears here as Lemma 2.24 [24, Theorem 5.11].

2. Measure IE-tuples

Throughout this section $(X, G)$ is a topological dynamical system with $G$ amenable and $\mu$ is a $G$-invariant Borel probability measure on $X$.

2.1. Measure independence density for tuples of subsets

Our concept of measure IE-tuple is based on a notion of independence density for tuples of subsets, which in turn is formulated in terms of the concept of independence set from Definition 1.1. In the purely topological framework, we defined in [30] the independence density of a finite tuple $\mathbf{A} = (A_1, \dots, A_k)$ of subsets of $X$ by taking the limit of $\frac{1}{|F|} \varphi_{\mathbf{A}}(F)$ as $F$ becomes more and more invariant, where $\varphi_{\mathbf{A}}(F)$ denotes the maximum of $|J|$ over all $J \subseteq F$ such that $\bigcap_{s\in J} s^{-1}A_{\sigma(s)} \neq \emptyset$ for all $\sigma : J \to \{1, \dots, k\}$, i.e., $J$ is an independence set relative to $X$ in the terminology of Definition 1.1.
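In the measure setting, we only want to consider independent behaviour that is robust enough to be observable when a small portion of the space is obscured. This will translate at the function level into stability under $L^2$ perturbations, as illustrated by Theorem 3.7. So for $\delta > 0$ denote by $\mathcal{B}(\mu, \delta)$ the collection of all Borel subsets $D$ of $X$ such that $\mu(D) \ge 1 - \delta$, and by $\mathcal{B}'(\mu, \delta)$ the collection of all maps $D : G \to \mathcal{B}(X)$ such that $\inf_{s\in G} \mu(D_s) \ge 1 - \delta$. Let $\mathbf{A} = (A_1, \dots, A_k)$ be a tuple of subsets of $X$ and let $\delta > 0$. For every finite subset $F$ of $G$ we define

\[
\varphi_{\mathbf{A},\delta}(F) = \min_{D \in \mathcal{B}(\mu,\delta)} \max\big\{ |F \cap J| : J \text{ is an independence set for } \mathbf{A} \text{ relative to } D \big\},
\]
\[
\varphi'_{\mathbf{A},\delta}(F) = \min_{D \in \mathcal{B}'(\mu,\delta)} \max\big\{ |F \cap J| : J \text{ is an independence set for } \mathbf{A} \text{ relative to } D \big\}.
\]

Since the action of $G$ on $X$ is $\mu$-preserving, we have $\varphi_{\mathbf{A},\delta}(Fs) = \varphi_{\mathbf{A},\delta}(F)$ and $\varphi'_{\mathbf{A},\delta}(Fs) = \varphi'_{\mathbf{A},\delta}(F)$ for all finite sets $F \subseteq G$ and $s \in G$. However, neither $\varphi_{\mathbf{A},\delta}$ nor $\varphi'_{\mathbf{A},\delta}$ satisfies the subadditivity condition in Proposition 3.22 of [30], so that the limit of $\frac{1}{|F|} \varphi_{\mathbf{A},\delta}(F)$ or $\frac{1}{|F|} \varphi'_{\mathbf{A},\delta}(F)$ as $F$ becomes more and more invariant might not exist. We define $\overline{I}_\mu(\mathbf{A}, \delta)$ to be the limit supremum of $\frac{1}{|F|} \varphi_{\mathbf{A},\delta}(F)$ as $F$ becomes more and more invariant, and $\underline{I}_\mu(\mathbf{A}, \delta)$ to be the corresponding limit infimum. Similarly, we define $\overline{I}'_\mu(\mathbf{A}, \delta)$ to be the limit supremum of $\frac{1}{|F|} \varphi'_{\mathbf{A},\delta}(F)$ as $F$ becomes more and more invariant, and $\underline{I}'_\mu(\mathbf{A}, \delta)$ to be the corresponding limit infimum. Note that $\overline{I}'_\mu(\mathbf{A}, \delta) \le \overline{I}_\mu(\mathbf{A}, \delta)$ and $\underline{I}'_\mu(\mathbf{A}, \delta) \le \underline{I}_\mu(\mathbf{A}, \delta)$.

Definition 2.1. We set

\[
\overline{I}_\mu(\mathbf{A}) = \sup_{\delta>0} \overline{I}_\mu(\mathbf{A}, \delta)
\qquad\text{and}\qquad
\underline{I}_\mu(\mathbf{A}) = \sup_{\delta>0} \underline{I}_\mu(\mathbf{A}, \delta)
\]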
and refer to these quantities respectively as the upper $\mu$-independence density and lower $\mu$-independence density of $\mathbf{A}$.

In the case $G = \mathbb{Z}$, we could alternatively take the limit infimum and supremum of averages over larger and larger intervals instead of general Følner sets. The suprema of these respective quantities over $\delta > 0$ would then lie between $\underline{I}_\mu(\mathbf{A})$ and $\overline{I}_\mu(\mathbf{A})$ and thus lead to the same definition of measure IE-tuples in the next subsection in view of Lemma 2.15.

In order to relate independence and c.p. approximation entropy in the local description of the Pinsker von Neumann algebra (Theorem 3.7), we will need to be able to estimate $\overline{I}_\mu(\mathbf{A})$ and $\underline{I}_\mu(\mathbf{A})$ from above in terms of the primed quantities $\overline{I}'_\mu(\mathbf{A}, \delta)$ and $\underline{I}'_\mu(\mathbf{A}, \delta)$. More precisely, if the subsets of $X$ of measure at least $1 - \delta$ relative to which independence is gauged are not required to be uniform over $G$, then the resulting versions of upper and lower independence density are no less than the original ones times some constant depending only on $k$. This is the content of Proposition 2.4, which we now aim to establish.

By Karpovsky and Milman's generalization of the Sauer–Perles–Shelah lemma [25,43,44], for $k \ge 2$, positive integers $n \ge m$, and a subset $S$ of $\{1, \dots, k\}^{\{1,\dots,n\}}$ of cardinality greater than $\sum_{j=0}^{m-1} \binom{n}{j} (k-1)^{n-j}$, there exists an $I \subseteq \{1, \dots, n\}$ with $|I| \ge m$ such that $S|_I = \{1, \dots, k\}^I$. For a fixed $k$, we thus see by Stirling's formula that for every $\lambda \in (\log_k(k-1), 1)$ there is an $a \in (0, 1)$ such that for every $n \in \mathbb{N}$ and $S \subseteq \{1, \dots, k\}^{\{1,\dots,n\}}$ with $|S| \ge k^{\lambda n}$ there exists an $I \subseteq \{1, \dots, n\}$ with $|I| \ge an$ such that $S|_I = \{1, \dots, k\}^I$. Write $a(k)$ for the supremum of all $a$ which witness this statement for some $\lambda \in (\log_k(k-1), 1)$ (this depends on $k$, and tends to zero as $k \to \infty$). For the remainder of this subsection the length $k$ of $\mathbf{A}$ is assumed to be at least 2.

Lemma 2.2. For every $\lambda$ in the interval $(\log_k(k-1), 1)$ there are $a, b > 0$ such that for every $n \in \mathbb{N}$ and $S \subseteq \{0, 1, \dots, k\}^{\{1,\dots,n\}}$ with $|S| \ge k^{\lambda n}$ and $\max_{\sigma\in S} |\sigma^{-1}(0)| \le bn$ there exists an $I \subseteq \{1, \dots, n\}$ with $|I| \ge an$ and $S|_I \supseteq \{1, \dots, k\}^I$. Moreover as $\lambda \nearrow 1$ we may choose $a \nearrow a(k)$.

Proof. Let $\lambda \in (\log_k(k-1), 1)$. Set $f(\lambda) = (1 - \lambda)(\lambda - \log_k(k-1))$. Then the quantity $\lambda - f(\lambda)$ lies in the interval $(\log_k(k-1), 1)$ and tends to one as $\lambda \nearrow 1$. By the result of Karpovsky and Milman as discussed above, there is an $a \in (0, 1)$ such that for every $n \in \mathbb{N}$ and $S \subseteq \{1, \dots, k\}^{\{1,\dots,n\}}$ with $|S| \ge k^{(\lambda - f(\lambda))n}$ there exists an $I \subseteq \{1, \dots, n\}$ with $|I| \ge an$ and $S|_I = \{1, \dots, k\}^I$, and we may choose $a \nearrow a(k)$ as $\lambda \nearrow 1$. By Stirling's formula there is a $b \in (0, 1/2)$ such that $bn \binom{n}{bn} \le k^{f(\lambda)n}$ for all $n \in \mathbb{N}$. Now suppose we are given an $n \in \mathbb{N}$ and $S \subseteq \{0, 1, \dots, k\}^{\{1,\dots,n\}}$ with $|S| \ge k^{\lambda n}$ and $\max_{\sigma\in S} |\sigma^{-1}(0)| \le bn$. Then we can find a $J \subseteq \{1, \dots, n\}$ with $|J| \ge (1 - b)n$ such that the cardinality of the set $\{\sigma \in S : \sigma^{-1}\{1, \dots, k\} = J\}$ is at least

\[
\frac{|S|}{bn \binom{n}{bn}} \ge k^{(\lambda - f(\lambda))n}.
\]

Consequently there exists an $I \subseteq J$ with $|I| \ge a|J| \ge (1 - b)an$ and $S|_I \supseteq \{1, \dots, k\}^I$. Since $b$ may be chosen to be arbitrarily small this yields the result. □

Lemma 2.3. For every $\delta > 0$ there is a $\delta' > 0$ such that

\[
\frac{1}{|F|} \varphi'_{\mathbf{A},\delta'}(F) \ge a(k) \frac{1}{|F|} \varphi_{\mathbf{A},\delta}(F) - \delta
\]

for all finite sets $F \subseteq G$.
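Proof. Let $\delta > 0$, and let $d$ be a positive number to be further specified below as a function of $\delta$. Set $\delta' = \delta d$. Let $F$ be a finite subset of $G$. To establish the inequality in the lemma statement we may assume that $a(k)\varphi_{\mathbf{A},\delta}(F) \ge \delta|F|$. Let $D$ be an element of $\mathcal{B}'(\mu, \delta')$ such that $\varphi'_{\mathbf{A},\delta'}(F)$ is equal to the maximum of $|F \cap J|$ over all independence sets $J$ for $\mathbf{A}$ relative to $D$. Put

\[
E = \big\{ x \in X : |\{s \in F : x \notin D_s\}| \le d|F| \big\}.
\]

Since $\mu(D_s) \ge 1 - \delta'$ for each $s \in F$ we have

\[
d|F| \, \mu(E^c) \le \sum_{s\in F} \mu(D_s^c) \le |F|\delta'
\]

and so $\mu(E) \ge 1 - \delta'/d = 1 - \delta$, that is, $E \in \mathcal{B}(\mu, \delta)$. Hence there exists an $I \subseteq F$ with $|I| = \varphi_{\mathbf{A},\delta}(F)$ which is an independence set for $\mathbf{A}$ relative to $E$. For each $\sigma \in \{1, \dots, k\}^I$ we can find by the definition of $E$ a set $I_\sigma \subseteq I$ with $|I \setminus I_\sigma| \le d|F|$ such that $\bigcap_{s\in I_\sigma} (D_s \cap s^{-1}A_{\sigma(s)}) \neq \emptyset$, and we define $\rho_\sigma \in \{0, 1, \dots, k\}^I$ by

\[
\rho_\sigma(s) =
\begin{cases}
\sigma(s) & \text{if } s \in I_\sigma, \\
0 & \text{if } s \notin I_\sigma.
\end{cases}
\]

Since for every $\rho \in \{0, 1, \dots, k\}^I$ the number of $\sigma \in \{1, \dots, k\}^I$ for which $\rho_\sigma = \rho$ is at most $k^{d|F|}$, the set $S = \{\rho_\sigma : \sigma \in \{1, \dots, k\}^I\}$ has cardinality at least $k^{|I|}/k^{d|F|} \ge k^{(1 - a(k)d/\delta)|I|}$. It follows by Lemma 2.2 that if $d$ is small enough as a function of $\delta$ then there exists a $J \subseteq I$ with $|J| \ge (1 - \delta)a(k)|I|$ such that $S|_J \supseteq \{1, \dots, k\}^J$. Such a $J$ is an independence set for $\mathbf{A}$ relative to $D$, and so we conclude that

\[
\frac{1}{|F|} \varphi'_{\mathbf{A},\delta'}(F) \ge \frac{1}{|F|} (1 - \delta) a(k) \varphi_{\mathbf{A},\delta}(F) \ge a(k) \frac{1}{|F|} \varphi_{\mathbf{A},\delta}(F) - \delta.
\]

Since our choice of $\delta'$ does not depend on $F$ this completes the proof. □

It follows from Lemma 2.3 that for every $\delta > 0$ there is a $\delta' > 0$ such that $\overline{I}'_\mu(\mathbf{A}, \delta') \ge a(k)\overline{I}_\mu(\mathbf{A}, \delta) - \delta$ and $\underline{I}'_\mu(\mathbf{A}, \delta') \ge a(k)\underline{I}_\mu(\mathbf{A}, \delta) - \delta$. We thus obtain the following alternative means of estimating upper and lower $\mu$-independence density.

Proposition 2.4. We have

\[
\overline{I}_\mu(\mathbf{A}) \le a(k)^{-1} \sup_{\delta>0} \overline{I}'_\mu(\mathbf{A}, \delta),
\qquad
\underline{I}_\mu(\mathbf{A}) \le a(k)^{-1} \sup_{\delta>0} \underline{I}'_\mu(\mathbf{A}, \delta).
\]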


2.2. Definition and basic properties of measure IE-tuples

In [30] we defined a tuple $x = (x_1, \dots, x_k) \in X^k$ to be an IE-tuple (or an IE-pair in the case $k = 2$) if for every product neighbourhood $U_1 \times \cdots \times U_k$ of $x$ the $G$-orbit of the tuple $(U_1, \dots, U_k)$ has an independent subcollection of positive density. The following is the measure-theoretic analogue.

Definition 2.5. We call a tuple $x = (x_1, \dots, x_k) \in X^k$ a $\mu$-IE-tuple (or $\mu$-IE-pair in the case $k = 2$) if for every product neighbourhood $U_1 \times \cdots \times U_k$ of $x$ the tuple $(U_1, \dots, U_k)$ has positive upper $\mu$-independence density. We denote the set of $\mu$-IE-tuples of length $k$ by $\mathrm{IE}^\mu_k(X)$.

Evidently every $\mu$-IE-tuple is an IE-tuple. The problem of realizing IE-tuples as $\mu$-IE-tuples for some $\mu$ will be addressed in Section 2.3. We proceed now with a series of lemmas which will enable us to establish some properties of $\mu$-IE-tuples as recorded in Proposition 2.16.

Lemma 2.6. Let $\mathbf{A} = (A_1, \dots, A_k)$ be a tuple of subsets of $X$ which has positive upper $\mu$-independence density. Suppose that $A_1 = A_{1,1} \cup A_{1,2}$. Then at least one of the tuples $\mathbf{A}_1 = (A_{1,1}, A_2, \dots, A_k)$ and $\mathbf{A}_2 = (A_{1,2}, A_2, \dots, A_k)$ has positive upper $\mu$-independence density.

Proof. By Lemma 3.6 of [30] there is a constant $c > 0$ depending only on $k$ such that, for all $n \in \mathbb{N}$, if $S$ is a subset of $(\{(1, 0), (1, 1)\} \cup \{2, \dots, k\})^{\{1,\dots,n\}}$ for which the restriction $\Gamma_n|_S$ is bijective, where $\Gamma_n : (\{(1, 0), (1, 1)\} \cup \{2, \dots, k\})^{\{1,\dots,n\}} \to \{1, \dots, k\}^{\{1,\dots,n\}}$ converts the coordinate values $(1, 0)$ and $(1, 1)$ to 1, then there is an $I \subseteq \{1, \dots, n\}$ with $|I| \ge cn$ and either $S|_I \supseteq (\{(1, 0)\} \cup \{2, \dots, k\})^I$ or $S|_I \supseteq (\{(1, 1)\} \cup \{2, \dots, k\})^I$. Thus, given sets $D_1, D_2 \subseteq X$, any finite set $I \subseteq G$ which is an independence set for $\mathbf{A}$ relative to $D_1 \cap D_2$ has a subset $J$ of cardinality at least $c|I|$ which is either an independence set for $\mathbf{A}_1$ relative to $D_1 \cap D_2$ (and hence relative to $D_1$) or an independence set for $\mathbf{A}_2$ relative to $D_1 \cap D_2$ (and hence relative to $D_2$). Given a $\delta > 0$, we have $D_1 \cap D_2 \in \mathcal{B}(\mu, \delta)$ whenever $D_1, D_2 \in \mathcal{B}(\mu, \delta/2)$ and so we deduce that $\max\{\overline{I}_\mu(\mathbf{A}_1, \delta/2), \overline{I}_\mu(\mathbf{A}_2, \delta/2)\} \ge c \cdot \overline{I}_\mu(\mathbf{A}, \delta)$. By hypothesis there is a $\delta > 0$ such that $\overline{I}_\mu(\mathbf{A}, \delta) > 0$, from which we conclude that $\overline{I}_\mu(\mathbf{A}_j, \delta/2) > 0$ for at least one $j \in \{1, 2\}$, yielding the lemma. □

Lemma 2.7. For every $d > 0$ there exist $\delta > 0$, $c > 0$, and $M > 0$ such that if $F$ is a finite subset of $G$ with $|F| \ge M$, $D$ is in $\mathcal{B}'(\mu, \delta)$, $\mathcal{P} = \{P_1, P_2\}$ is a Borel partition of $X$ with $\frac{H(\mathcal{P}^F)}{|F|} \ge d$, and $A_1 \subseteq P_1$ and $A_2 \subseteq P_2$ are Borel sets with $\mu(P_1 \setminus A_1), \mu(P_2 \setminus A_2) < \delta$, then $(A_1, A_2)$ has a $\mu$-independence set $I \subseteq F$ relative to $D$ with $|I| \ge c|F|$.

Proof. Let $d > 0$. Given a finite set $F \subseteq G$, denote by $\mathcal{Y}$ the set of all $Y \in \mathcal{P}^F$ such that $\mu(Y)$
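0 (depending on $d$) such that for every nonempty finite set $K$ and $S \subseteq \{0, 1, 2\}^K$ with $|S| \ge e^{\frac{d}{12}|K|}$ and $\max_{\sigma\in S} |\sigma^{-1}(0)| \le b|K|$ there exists an $I \subseteq K$ with $|I| \ge c|K|$ and $S|_I \supseteq \{1, 2\}^I$. We may assume that $2^b \le e^{\frac{d}{12}}$.

Set $\delta = \frac{db}{9}$. Then $\mu(X \setminus (D_s \cap s^{-1}(A_1 \cup A_2))) \le 3\delta = \frac{db}{3}$ for every $s \in G$. Set

\[
W = \big\{ x \in X : |\{s \in F : x \in D_s \cap s^{-1}(A_1 \cup A_2)\}| \ge (1 - b)|F| \big\},
\]

which has measure at least

\[
1 - \frac{1}{b|F|} \sum_{s\in F} \mu\big(X \setminus (D_s \cap s^{-1}(A_1 \cup A_2))\big) \ge 1 - \frac{1}{b|F|} \cdot |F| \frac{db}{3} = 1 - \frac{d}{3}.
\]

Then $\mu(W \cap B) \ge \frac{d}{3}$. Thus the set $\mathcal{Y}'$ of all $Y \in \mathcal{Y}$ for which $\mu(W \cap Y) > 0$ has cardinality at least $\frac{d}{3} e^{\frac{d}{3}|F|} \ge e^{\frac{d}{6}|F|}$. For each $Y \in \mathcal{Y}'$ pick an $x_Y \in W \cap Y$. Define a map $\varphi : \mathcal{Y}' \to \{0, 1, 2\}^F$ by

\[
\varphi(Y)(s) =
\begin{cases}
0 & \text{if } x_Y \notin D_s \cap s^{-1}(A_1 \cup A_2), \\
1 & \text{if } x_Y \in D_s \cap s^{-1}A_1, \\
2 & \text{if } x_Y \in D_s \cap s^{-1}A_2,
\end{cases}
\]

for $Y \in \mathcal{Y}'$ and $s \in F$. If $\varphi(Y_1) = \varphi(Y_2)$, then $Y_1$ and $Y_2$ coincide on a subset of $F$ with cardinality at least $(1 - b)|F|$. Hence $|\varphi(\mathcal{Y}')| \ge |\mathcal{Y}'|/2^{b|F|} \ge e^{\frac{d}{12}|F|}$. Therefore there exists an $I \subseteq F$ such that $|I| \ge c|F|$ and $\varphi(\mathcal{Y}')|_I \supseteq \{1, 2\}^I$. Then $I$ is a $\mu$-independence set for $(A_1, A_2)$ relative to $D$. □

We remark that the constants $\delta$, $c$, and $M$ specified in the proof of Lemma 2.7 do not depend on $(X, G)$ or $\mu$.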



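Lemma 2.8. Let $\mathcal{P} = \{P_1, P_2\}$ be a two-element Borel partition of $X$ such that $h_\mu(\mathcal{P}) > 0$. Then there exists an $\varepsilon > 0$ such that $\underline{I}_\mu(\mathbf{A}) > 0$ whenever $\mathbf{A} = (A_1, A_2)$ for Borel subsets $A_1 \subseteq P_1$ and $A_2 \subseteq P_2$ with $\mu(P_1 \setminus A_1), \mu(P_2 \setminus A_2) < \varepsilon$.

Proof. Apply Lemma 2.7. □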

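Lemma 2.9. Let $A$ be a Borel subset of $X$ with $\mu(A) > 0$. Then there are $d > 0$ and $\delta > 0$ such that for every finite subset $F \subseteq G$ and $D \in \mathcal{B}(\mu, \delta)$ there is an $H \subseteq F$ with $|H| \ge d|F|$ and $D \cap \big(\bigcap_{s\in H} s^{-1}A\big) \neq \emptyset$.

Proof. Choose a $d > 0$ less than $\mu(A)$ and set $E = \{x \in X : |\{g \in F : gx \in A\}| \ge d|F|\}$. Then $(1 - d)|F| \, 1_{X\setminus E} \le \sum_{g\in F} 1_{g^{-1}(X\setminus A)}$ so that

\[
(1 - d)|F| \big(1 - \mu(E)\big) = \int (1 - d)|F| \, 1_{X\setminus E} \, d\mu \le \int \sum_{g\in F} 1_{g^{-1}(X\setminus A)} \, d\mu = |F| \big(1 - \mu(A)\big)
\]

and hence $\mu(E) \ge 1 - \frac{1-\mu(A)}{1-d} > 0$. We can thus take $\delta$ to be any strictly positive number less than $1 - \frac{1-\mu(A)}{1-d}$. □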
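The counting estimate in the proof of Lemma 2.9 holds for any measure-preserving action, so it can be tested numerically on a finite example. The sketch below assumes, purely for illustration, the uniform measure on $\mathbb{Z}/n$ with the translation action $g \cdot x = x + g$; the function names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def visit_set_measure(n, A, F, d):
    """mu(E) for E = {x : #{g in F : g.x in A} >= d|F|}, uniform mu on Z/n."""
    threshold = d * len(F)
    E = [x for x in range(n) if sum((g + x) % n in A for g in F) >= threshold]
    return Fraction(len(E), n)


def lemma_lower_bound(n, A, d):
    """The bound 1 - (1 - mu(A)) / (1 - d) from the proof of Lemma 2.9."""
    mu_A = Fraction(len(A), n)
    return 1 - (1 - mu_A) / (1 - d)
```

For example, with $n = 12$, $A = \{0, \dots, 7\}$, $F = \{0, 1, 2, 5\}$ and $d = 1/2 < \mu(A) = 2/3$, the lower bound equals $1/3$ and the computed measure of $E$ is at least that large, as the lemma requires.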

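In order to determine the behaviour of measure IE-tuples under taking factors and to establish the main results of the next two subsections, we need to consider several auxiliary entropy quantities. Let $\mathcal{U}$ be a finite Borel cover of $X$. For a subset $D$ of $X$ denote by $N_D(\mathcal{U})$ the minimal number of members of $\mathcal{U}$ needed to cover $D$. For $\delta > 0$ we set $N_\delta(\mathcal{U}) = \min_{D\in\mathcal{B}(\mu,\delta)} N_D(\mathcal{U})$ and write $\underline{h}_{c,\mu}(\mathcal{U}, \delta)$ for the limit infimum of $\frac{1}{|F|} \ln N_\delta(\mathcal{U}^F)$ as $F$ becomes more and more invariant and $\overline{h}_{c,\mu}(\mathcal{U}, \delta)$ for the limit supremum of $\frac{1}{|F|} \ln N_\delta(\mathcal{U}^F)$ as $F$ becomes more and more invariant. We then define

\[
\underline{h}_{c,\mu}(\mathcal{U}) = \sup_{\delta>0} \underline{h}_{c,\mu}(\mathcal{U}, \delta),
\qquad
\overline{h}_{c,\mu}(\mathcal{U}) = \sup_{\delta>0} \overline{h}_{c,\mu}(\mathcal{U}, \delta).
\]

The metric versions of $\underline{h}_{c,\mu}(\mathcal{U}, \delta)$ and $\overline{h}_{c,\mu}(\mathcal{U}, \delta)$ in the ergodic $\mathbb{Z}$-system case appear in the entropy formulas of Katok from [26]. Writing $H(\mathcal{U})$ for the infimum of $H(\mathcal{P})$ over all Borel partitions $\mathcal{P}$ of $X$ refining $\mathcal{U}$, we define $h^-_\mu(\mathcal{U})$ to be the limit of $\frac{1}{|F|} H(\mathcal{U}^F)$ as $F$ becomes more and more invariant. Finally, we define $h^+_\mu(\mathcal{U})$ to be the infimum of $h_\mu(\mathcal{P})$ over all Borel partitions $\mathcal{P}$ of $X$ refining $\mathcal{U}$. The quantities $h^-_\mu(\mathcal{U})$ and $h^+_\mu(\mathcal{U})$ were introduced by Romagnoli in the case $G = \mathbb{Z}$ [39]. We have the trivial inequalities $\underline{h}_{c,\mu}(\mathcal{U}) \le \overline{h}_{c,\mu}(\mathcal{U})$ and $h^-_\mu(\mathcal{U}) \le h^+_\mu(\mathcal{U})$. Huang, Ye, and Zhang observed in [23] that results in [17,20,39] can be combined to deduce that $h^-_\mu(\mathcal{U}) = h^+_\mu(\mathcal{U})$ for all open covers $\mathcal{U}$ when $X$ is metrizable and $G = \mathbb{Z}$.

Question 2.10. Is it always the case that $h^-_\mu(\mathcal{U}) = h^+_\mu(\mathcal{U})$ for an open cover $\mathcal{U}$?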

The following fact was established by Romagnoli [39, Eq. (8)].



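Lemma 2.11. Let $\pi: X\to Y$ be a factor of $X$. Then
\[ H_\mu(\pi^{-1}\mathcal{U}) = H_{\pi_*(\mu)}(\mathcal{U}) \]
for every finite Borel cover $\mathcal{U}$ of $Y$.

One direct consequence of Lemma 2.11 is the following, which in the case $G=\mathbb{Z}$ is recorded as Proposition 6 in [39].

Lemma 2.12. Let $\pi: X\to Y$ be a factor of $X$. Then
\[ h^-_\mu(\pi^{-1}\mathcal{U}) = h^-_{\pi_*(\mu)}(\mathcal{U}) \]
for every finite Borel cover $\mathcal{U}$ of $Y$.

Lemma 2.13. For a finite Borel cover $\mathcal{U}$ of $X$ and $\delta>0$ we have
\[ \delta\cdot\overline{h}_{c,\mu}(\mathcal{U},\delta) \le h^-_\mu(\mathcal{U}) \le \underline{h}_{c,\mu}(\mathcal{U}). \]

Proof. Let $\varepsilon>0$ and $\delta>0$. When a finite subset $F$ of $G$ is sufficiently invariant, we have $\frac{1}{|F|}H_\mu(\mathcal{U}^F)\le h^-_\mu(\mathcal{U})+\varepsilon$. Then we can find a finite Borel partition $\mathcal{P}$ finer than $\mathcal{U}^F$ with $\frac{1}{|F|}H_\mu(\mathcal{P})\le h^-_\mu(\mathcal{U})+2\varepsilon$. Consider the set $\mathcal{Y}$ consisting of members of $\mathcal{P}$ with $\mu$-measure at least $e^{-|F|(h^-_\mu(\mathcal{U})+2\varepsilon)/\delta}$ and set $D=\bigcup\mathcal{Y}$. Then $\mu(D^c)\le\delta$. Thus $D\in\mathcal{B}(\mu,\delta)$ and hence $N_\delta(\mathcal{U}^F)\le|\mathcal{Y}|\le e^{|F|(h^-_\mu(\mathcal{U})+2\varepsilon)/\delta}$. Consequently, $\overline{h}_{c,\mu}(\mathcal{U},\delta)\le(h^-_\mu(\mathcal{U})+2\varepsilon)/\delta$. Letting $\varepsilon\to0$ we obtain $\delta\cdot\overline{h}_{c,\mu}(\mathcal{U},\delta)\le h^-_\mu(\mathcal{U})$.

For the second inequality, let $\varepsilon>0$ and $\delta\in(0,e^{-1})$. Take a finite subset $F$ of $G$ sufficiently invariant so that $\frac{1}{|F|}\ln N_\delta(\mathcal{U}^F)<\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon$. Then we can find a $D\in\mathcal{B}(\mu,\delta)$ with $\frac{1}{|F|}\ln N_D(\mathcal{U}^F)<\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon$. Take a Borel partition $\mathcal{Y}$ of $D$ finer than the restriction of $\mathcal{U}^F$ to $D$ with cardinality $N_D(\mathcal{U}^F)$ and a Borel partition $\mathcal{Z}$ of $D^c$ finer than the restriction of $\mathcal{U}^F$ to $D^c$ with cardinality $N_{D^c}(\mathcal{U}^F)$. Since the function $x\mapsto-x\ln x$ is concave on $[0,1]$, increasing on $[0,e^{-1}]$, and decreasing on $[e^{-1},1]$, we have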

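\[
\sum_{P\in\mathcal{Y}}-\mu(P)\ln\mu(P)
\le -\mu(D)\ln\frac{\mu(D)}{|\mathcal{Y}|}
\le -(1-\delta)\ln(1-\delta)+\ln N_D(\mathcal{U}^F)
\le -(1-\delta)\ln(1-\delta)+|F|\big(\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon\big),
\]
and
\[
\sum_{P\in\mathcal{Z}}-\mu(P)\ln\mu(P)
\le -\mu(D^c)\ln\frac{\mu(D^c)}{|\mathcal{Z}|}
\le -\delta\ln\delta+\delta\ln N_{D^c}(\mathcal{U}^F)
\le -\delta\ln\delta+\delta|F|\ln|\mathcal{U}|.
\]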

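D. Kerr, H. Li / Journal of Functional Analysis 256 (2009) 1341–1386

Thus
\[
\frac{1}{|F|}H_\mu(\mathcal{U}^F) \le -(1-\delta)\ln(1-\delta)-\delta\ln\delta+\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon+\delta\ln|\mathcal{U}|
\]
and hence
\[
h^-_\mu(\mathcal{U}) \le -(1-\delta)\ln(1-\delta)-\delta\ln\delta+\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon+\delta\ln|\mathcal{U}|.
\]
Letting $\varepsilon\to0$ and $\delta\to0$ we get $h^-_\mu(\mathcal{U})\le\underline{h}_{c,\mu}(\mathcal{U})$. □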
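The concavity step used twice in the proof of Lemma 2.13 is the elementary bound $\sum_{P}-\mu(P)\ln\mu(P)\le-m\ln(m/N)$ for $N$ disjoint sets of total mass $m$, which follows from Jensen's inequality applied to $x\mapsto-x\ln x$. The Python sketch below checks this bound numerically. The function names and the sample masses are illustrative assumptions and do not come from the paper.

```python
import math

def partial_entropy(masses):
    """Sum of -p ln p over the given masses (zero masses contribute nothing)."""
    return -sum(p * math.log(p) for p in masses if p > 0)

def concavity_bound(masses):
    """Jensen bound -m ln(m / N), where m is the total mass and N the number of sets."""
    m = sum(masses)
    n = len(masses)
    return -m * math.log(m / n)

def bound_is_respected(masses, tol=1e-12):
    """True when the partial entropy does not exceed the Jensen bound (up to rounding)."""
    return partial_entropy(masses) <= concavity_bound(masses) + tol
```

Equality holds exactly when all masses are equal, which is why the proof may replace the sum over the atoms of $\mathcal{Y}$ by a single term involving $|\mathcal{Y}|$.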

Let $k\ge2$ and let $Z$ be a nonempty finite set. We write $\mathcal{W}$ for the cover of $\{0,1,\dots,k\}^Z=\prod_{z\in Z}\{0,1,\dots,k\}$ consisting of subsets of the form $\prod_{z\in Z}\{i_z\}^c$, where $1\le i_z\le k$ for each $z\in Z$. For a set $S\subseteq\{0,1,\dots,k\}^Z$ we denote by $F_S$ the minimal number of sets in $\mathcal{W}$ one needs to cover $S$. The following lemma provides a converse to [30, Lemma 3.3].
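To make the covering number $F_S$ concrete, the following Python sketch computes it by brute force for small $k$ and $|Z|$, identifying $Z$ with $\{0,\dots,n-1\}$. It can be used to compare $F_S$ for $S=\{1,\dots,k\}^Z$ with the bound $(k/(k-1))^{|Z|}$ of Lemma 2.14. The function name and the encoding of points as tuples are assumptions made for illustration; the search is exponential and is only meant for tiny cases.

```python
from itertools import combinations, product

def covering_number(S, k, n):
    """Smallest number of sets prod_z {i_z}^c (with 1 <= i_z <= k) needed to cover S.

    Points of {0,...,k}^Z are encoded as length-n tuples. A cover set indexed by
    i in {1,...,k}^n contains x exactly when x_z != i_z for every coordinate z.
    """
    S = set(S)
    # Restrict every cover set to S, since only points of S need to be covered.
    members = [
        frozenset(x for x in S if all(a != b for a, b in zip(x, i)))
        for i in product(range(1, k + 1), repeat=n)
    ]
    for r in range(len(members) + 1):
        for choice in combinations(members, r):
            if set().union(*choice) >= S:
                return r
    raise ValueError("S is not covered by the family")

def full_cube(k, n):
    """The set {1,...,k}^n, encoded as a list of tuples."""
    return list(product(range(1, k + 1), repeat=n))
```

For $k=2$ each cover set meets $\{1,2\}^Z$ in a single point, so the bound $2^{|Z|}$ is attained in that case.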

Lemma 2.14. Let $k\ge2$. For every finite set $Z$ and $S\subseteq\{0,1,\dots,k\}^Z$, if $S|_W\supseteq\{1,\dots,k\}^W$ for some nonempty set $W\subseteq Z$, then $F_S\ge\big(\frac{k}{k-1}\big)^{|W|}$.

Proof. Replacing $S$ by $S|_W$ we may assume that $W=Z$. We prove the assertion by induction on $|Z|$. The case $|Z|=1$ is trivial. Suppose that the assertion holds for $|Z|=n$. Consider the case $|Z|=n+1$. Take $z\in Z$ and set $Y=Z\setminus\{z\}$. For each $1\le j\le k$ write $S_j$ for the set of all elements of $S$ taking value $j$ at $z$. Then $S_j|_Y\supseteq\{1,\dots,k\}^Y$, and so $F_{S_j}\ge\big(\frac{k}{k-1}\big)^{|Y|}$. Now suppose that some $\mathcal{V}\subseteq\mathcal{W}$ covers $S$. Write $\mathcal{V}_j$ for the set of all elements of $\mathcal{V}$ that have nonempty intersection with $S_j$. Then $|\mathcal{V}_j|\ge F_{S_j}\ge\big(\frac{k}{k-1}\big)^{|Y|}$. Note that each element of $\mathcal{V}$ is contained in at most $k-1$ many of the sets $\mathcal{V}_1,\dots,\mathcal{V}_k$. Thus $(k-1)|\mathcal{V}|\ge\sum_{j=1}^k|\mathcal{V}_j|\ge k\big(\frac{k}{k-1}\big)^{|Y|}$, and hence $|\mathcal{V}|\ge\big(\frac{k}{k-1}\big)^{|Z|}$, completing the induction. □

Lemma 2.15. For a finite Borel cover $\mathcal{U}$ of $X$, the three quantities $h^-_\mu(\mathcal{U})$, $\underline{h}_{c,\mu}(\mathcal{U})$, and $\overline{h}_{c,\mu}(\mathcal{U})$ are either all zero or all nonzero. If the complements in $X$ of the members of $\mathcal{U}$ are pairwise disjoint and $\boldsymbol{A}$ is a tuple consisting of these complements, then we may also add $\underline{I}_\mu(\boldsymbol{A})$ and $\overline{I}_\mu(\boldsymbol{A})$ to the list.

Proof. The first assertion follows from Lemma 2.13. If $\boldsymbol{A}$ is a tuple as in the lemma statement, then Lemma 3.3 of [30] and Lemma 2.14 yield the equivalence of $\underline{h}_{c,\mu}(\mathcal{U})>0$ and $\underline{I}_\mu(\boldsymbol{A})>0$ as well as the equivalence of $\overline{h}_{c,\mu}(\mathcal{U})>0$ and $\overline{I}_\mu(\boldsymbol{A})>0$. □

Proposition 2.16. The following hold:
(1) Let $\boldsymbol{A}=(A_1,\dots,A_k)$ be a tuple of closed subsets of $X$ which has positive upper $\mu$-independence density. Then there exists a $\mu$-IE-tuple $(x_1,\dots,x_k)$ with $x_j\in A_j$ for $j=1,\dots,k$.
(2) $\mathrm{IE}^\mu_2(X)\setminus\Delta_2(X)$ is nonempty if and only if $h_\mu(X)>0$.
(3) $\mathrm{IE}^\mu_1(X)=\operatorname{supp}(\mu)$.
(4) $\mathrm{IE}^\mu_k(X)$ is a closed $G$-invariant subset of $X^k$.
(5) Let $\pi: X\to Y$ be a topological $G$-factor map. Then $\pi^k(\mathrm{IE}^\mu_k(X))=\mathrm{IE}^{\pi_*(\mu)}_k(Y)$.



Proof. (1) Apply Lemma 2.6 and a compactness argument.
(2) As is well known and easy to show, $h_\mu(X)>0$ if and only if there is a two-element Borel partition of $X$ with positive entropy. We can thus apply (1) and Lemma 2.8 to obtain the “if” part. The “only if” part follows from Lemma 2.15.
(3) This follows from Lemma 2.9.
(4) Trivial.
(5) This follows from (1), (3), (4), and Lemmas 2.12 and 2.15. □

2.3. IE-tuples and measure IE-tuples

Here we will show that the set of IE-tuples of length $k$ is equal to the closure of the union of the sets $\mathrm{IE}^\mu_k(X)$ over all $G$-invariant Borel probability measures $\mu$ on $X$, and furthermore that when $X$ is metrizable there exists a $G$-invariant Borel probability measure $\mu$ on $X$ such that the sets of $\mu$-IE-tuples and IE-tuples coincide.

We will need a version of the Rokhlin tower lemma. Following [42], for a finite set $F\subseteq G$ and a Borel subset $V$ of $X$ we say that $F\times V$ maps to an $\varepsilon$-quasi-tower if there exists a measurable subset $A\subseteq F\times V$ such that the map $A\to X$ sending $(s,x)$ to $sx$ is one-to-one and for each $x\in V$ the cardinality of $\{s\in F:(s,x)\in A\}$ is at least $(1-\varepsilon)|F|$. The case $\delta=0$ of the following theorem is a direct consequence of Theorem 5 on page 59 of [35]. The general case $\delta>0$ follows from the proof given there. Note that although the acting groups are generally assumed to be countable in [35], this assumption is not necessary here.

Theorem 2.17. Let $1>\varepsilon>0$ and $\frac{\varepsilon^2}{4}>\delta>0$. Then whenever the action of $G$ is free with respect to $\mu$, $F_1\subseteq F_2\subseteq\cdots\subseteq F_k$ are nonempty finite subsets of $G$ such that $F_{j+1}$ is $(F_jF_j^{-1},\eta_j)$-invariant and $\eta_j|F_j|<\frac{\varepsilon^2}{4}$ for all $1\le j<k$, $(1-\frac{\varepsilon}{2})^k<\varepsilon$, and $D_1,\dots,D_k$ are Borel subsets of $X$ with $\mu$-measure at least $1-\delta$, one can find Borel subsets $V_1,\dots,V_k$ such that
(1) each $F_j\times V_j$ maps to an $\varepsilon$-quasi-tower,
(2) $F_iV_i\cap F_jV_j=\emptyset$ for $i\ne j$,
(3) $\mu\big(\bigcup_{j=1}^kF_jV_j\big)>1-\varepsilon$,
(4) $V_j\subseteq D_j$ for each $j$.

For the definitions of the quantities $h^+_\mu(\mathcal{U})$ and $\underline{h}_{c,\mu}(\mathcal{U})$ see the discussion after Lemma 2.9.

Lemma 2.18. Suppose that $G$ is infinite and the action of $G$ is free with respect to $\mu$. Let $\mathcal{U}$ be a finite Borel cover of $X$. Then $h^+_\mu(\mathcal{U})\le\underline{h}_{c,\mu}(\mathcal{U})$.

Proof. Let $1>\varepsilon>0$ and $\frac{\varepsilon^2}{4}>\delta>0$. Then we can find nonempty finite subsets $F_1\subseteq F_2\subseteq\cdots\subseteq F_k$ of $G$ satisfying the conditions of Theorem 2.17 and $\frac{1}{|F_j|}\ln N_\delta(\mathcal{U}^{F_j})<\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon$ for $j=1,\dots,k$. For each $j=1,\dots,k$ take a $D_j\in\mathcal{B}(\mu,\delta)$ such that $\frac{1}{|F_j|}\ln N_{D_j}(\mathcal{U}^{F_j})<\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon$. Then we can find Borel sets $V_1,\dots,V_k\subseteq X$ satisfying the conclusion of Theorem 2.17. For $j=1,\dots,k$ pick a Borel partition $\mathcal{P}_j$ of $D_j$ which is finer than the restriction of $\mathcal{U}^{F_j}$ to $D_j$ and has cardinality $N_{D_j}(\mathcal{U}^{F_j})$. For each $P\in\mathcal{P}_j$ fix a $U_{P,s}\in\mathcal{U}$ for each $s\in F_j$ such



that $P\subseteq\bigcap_{s\in F_j}s^{-1}U_{P,s}$. Since $F_j\times V_j$ maps to an $\varepsilon$-quasi-tower, we can find a measurable subset $A_j$ of $F_j\times V_j$ such that $T|_{A_j}:A_j\to X$ is one-to-one, where $T:G\times X\to X$ is the map $(s,x)\mapsto sx$, and $|\{s\in F_j:(s,x)\in A_j\}|\ge(1-\varepsilon)|F_j|$ for each $x\in V_j$.

Define a Borel partition $\mathcal{Y}=\{Y_U:U\in\mathcal{U}\}$ of $\bigcup_jT(A_j)$ finer than the restriction of $\mathcal{U}$ to $\bigcup_jT(A_j)$ by stipulating that, for each $(s,x)\in A_j$ with $x\in P\in\mathcal{P}_j$, $sx\in Y_U$ exactly when $U=U_{P,s}$. Take a Borel partition $\mathcal{Z}=\{Z_U:U\in\mathcal{U}\}$ of $\big(\bigcup_jT(A_j)\big)^c$ with $Z_U\subseteq U$ for each $U\in\mathcal{U}$. Set $P_U=Y_U\cup Z_U$ for each $U\in\mathcal{U}$. Then $\mathcal{P}=\{P_U:U\in\mathcal{U}\}$ is a Borel partition of $X$ finer than $\mathcal{U}$. Note that $\mu(T(A_j))\ge(1-\varepsilon)\mu(F_jV_j)$ for each $j$. Thus $\mu\big(\bigcup_jT(A_j)\big)>(1-\varepsilon)^2$.

Next we estimate $h_\mu(\mathcal{P})$. Suppose that $F$ is a finite subset of $G$ which is $(F_kF_k^{-1},\sqrt{\varepsilon})$-invariant. Set $F_x=\{s\in F:sx\in\bigcup_jT(A_j)\}$ for each $x\in X$ and put $W=\{x\in X:|F_x|\ge(1-\sqrt{\varepsilon})|F|\}$. It is easy to see that $\mu(W^c)\le\mu\big((\bigcup_jT(A_j))^c\big)/\sqrt{\varepsilon}<2\sqrt{\varepsilon}$. Replacing $W$ by $W\setminus\bigcup_{s\in F^{-1}F\setminus\{e_G\}}\{x\in X:sx=x\}$ we may assume that $s_1x\ne s_2x$ for all $x\in W$ and all distinct $s_1,s_2\in F$.

Let us estimate the number $M$ of atoms of $\mathcal{P}^F$ which have nonempty intersection with $W$. Write $\mathcal{H}_j$ for the collection of all subsets of $F_j$ with cardinality at least $(1-\varepsilon)|F_j|$. For each $x\in W$, setting $F'_x=F_x\cap\{s\in F:F_kF_k^{-1}s\subseteq F\}$, we have $|F'_x|\ge(1-2\sqrt{\varepsilon})|F|$. Note that if $(s,y)\in A_j$ for some $1\le j\le k$ and $sy=s'x$ for some $s'\in F'_x$, setting $c=s^{-1}s'$ and $H=\{h\in F_j:(h,y)\in A_j\}$, we have $y=cx$, $Hc\subseteq F_x$ and $H\in\mathcal{H}_j$. Thus for each $x\in W$ we can find a finite set $C_{j,H}\subseteq G$ for every $H\in\mathcal{H}_j$ such that the following hold:
(1) $Hc\cap H'c'=\emptyset$ for all $c\in C_{j,H}$, $c'\in C_{j',H'}$ unless $H=H'$, $c=c'$, and $j=j'$,
(2) $\bigcup_{j,H}HC_{j,H}\subseteq F$ and $\big|\bigcup_{j,H}HC_{j,H}\big|\ge(1-2\sqrt{\varepsilon})|F|$,
(3) $cx\in V_j$ and $H=\{h\in F_j:(h,cx)\in A_j\}$ for each $c\in C_{j,H}$.

Note that the atom of $\mathcal{P}$ to which $hcx$ for $h\in H$ belongs is determined by $h$ and the atom of $\mathcal{P}_j$ to which $cx$ belongs. Thus, for each fixed choice of sets $C_{j,H}$ satisfying (1) and (2) above, the number of atoms of $\mathcal{P}^F$ containing some $x\in W$ with such a choice of $C_{j,H}$ is at most
\[
\begin{aligned}
|\mathcal{U}|^{2\sqrt{\varepsilon}|F|}\cdot\prod_j\prod_{H\in\mathcal{H}_j}|\mathcal{P}_j|^{|C_{j,H}|}
&\le|\mathcal{U}|^{2\sqrt{\varepsilon}|F|}\cdot\prod_j\prod_{H\in\mathcal{H}_j}\exp\Big(\big(\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon\big)|F_j||C_{j,H}|\Big)\\
&=|\mathcal{U}|^{2\sqrt{\varepsilon}|F|}\cdot\exp\Big(\big(\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon\big)\sum_j|F_j|\sum_{H\in\mathcal{H}_j}|C_{j,H}|\Big)\\
&\le|\mathcal{U}|^{2\sqrt{\varepsilon}|F|}\cdot\exp\Big(\frac{(\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon)|F|}{1-\varepsilon}\Big).
\end{aligned}
\]
By Stirling’s formula, the number of subsets of an $n$-element set with cardinality at least $(1-\varepsilon)n$ is at most $e^{f(\varepsilon)n}$ for all $n\ge0$ with $f(\varepsilon)\to0$ as $\varepsilon\to0$. Fix an element $g_{j,H}\in H$ for each $j$ and $H\in\mathcal{H}_j$. Then $C_{j,H}$ is determined by the set $g_{j,H}C_{j,H}$ in $F$. Thus, for a fixed $Q\subseteq F$, writing $a=\min_j|F_j|$ and summing as appropriate over nonnegative integers $t_{j,H}$, $t_j$, or $t$ subject to the indicated constraints, the number of choices of sets $C_{j,H}$ satisfying (1) and (2) and $\bigcup_{j,H}HC_{j,H}=Q$ is at most


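\[
\begin{aligned}
\sum_{\sum_{j,H}t_{j,H}|H|=|Q|}&\frac{|F|!}{(|F|-\sum_{j,H}t_{j,H})!\,\prod_{j,H}t_{j,H}!}\\
&\le\sum_{(1-\varepsilon)\sum_jt_j|F_j|\le|Q|}\frac{|F|!}{(|F|-\sum_jt_j)!\,\prod_jt_j!}\cdot\prod_j\sum_{\sum_{H\in\mathcal{H}_j}t_{j,H}=t_j}\frac{t_j!}{\prod_{H\in\mathcal{H}_j}t_{j,H}!}\\
&=\sum_{(1-\varepsilon)\sum_jt_j|F_j|\le|Q|}\frac{|F|!}{(|F|-\sum_jt_j)!\,\prod_jt_j!}\cdot\prod_j|\mathcal{H}_j|^{t_j}\\
&\le\sum_{(1-\varepsilon)\sum_jt_j|F_j|\le|Q|}\frac{|F|!}{(|F|-\sum_jt_j)!\,\prod_jt_j!}\cdot\prod_je^{f(\varepsilon)t_j|F_j|}\\
&\le\sum_{(1-\varepsilon)\sum_jt_j|F_j|\le|Q|}\frac{|F|!}{(|F|-\sum_jt_j)!\,\prod_jt_j!}\cdot e^{f(\varepsilon)|F|/(1-\varepsilon)}\\
&\le\sum_{(1-\varepsilon)at\le|Q|}\frac{|F|!}{(|F|-t)!\,t!}\cdot\sum_{\sum_jt_j=t}\frac{t!}{\prod_jt_j!}\cdot e^{f(\varepsilon)|F|/(1-\varepsilon)}\\
&=\sum_{(1-\varepsilon)at\le|Q|}\frac{|F|!}{(|F|-t)!\,t!}\cdot k^t\cdot e^{f(\varepsilon)|F|/(1-\varepsilon)}\\
&\le\sum_{(1-\varepsilon)at\le|Q|}\frac{|F|!}{(|F|-t)!\,t!}\cdot k^{|F|/((1-\varepsilon)a)}\cdot e^{f(\varepsilon)|F|/(1-\varepsilon)}\\
&\le e^{f(1/((1-\varepsilon)a))|F|}\cdot k^{|F|/((1-\varepsilon)a)}\cdot e^{f(\varepsilon)|F|/(1-\varepsilon)}.
\end{aligned}
\]
The number of choices of $Q\subseteq F$ with $|Q|\ge(1-2\sqrt{\varepsilon})|F|$ is at most $e^{f(2\sqrt{\varepsilon})|F|}$. Therefore, $M$ is at most
\[
|\mathcal{U}|^{2\sqrt{\varepsilon}|F|}\cdot\exp\Big(\frac{(\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon)|F|}{1-\varepsilon}\Big)\cdot\exp\Big(f\big(1/((1-\varepsilon)a)\big)|F|\Big)\cdot k^{|F|/((1-\varepsilon)a)}\cdot\exp\Big(\frac{f(\varepsilon)|F|}{1-\varepsilon}\Big)\cdot\exp\big(f(2\sqrt{\varepsilon})|F|\big).
\]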

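Since the function $x\mapsto-x\ln x$ is concave on $[0,1]$, we have
\[
\sum_{P\in\mathcal{P}^F}-\mu(P\cap W)\ln\mu(P\cap W)\le-\mu(W)\ln\frac{\mu(W)}{M}\le-\mu(W)\ln\mu(W)+\ln M
\]
and
\[
\sum_{P\in\mathcal{P}^F}-\mu(P\cap W^c)\ln\mu(P\cap W^c)\le-\mu(W^c)\ln\frac{\mu(W^c)}{|\mathcal{P}|^{|F|}-M}\le-\mu(W^c)\ln\mu(W^c)+\mu(W^c)|F|\ln|\mathcal{U}|.
\]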



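Set $\mathcal{Q}=\{W,W^c\}$. Since the function $x\mapsto-x\ln x$ on $[0,1]$ has maximal value $e^{-1}$, we get
\[
\begin{aligned}
H(\mathcal{P}^F)\le H(\mathcal{P}^F\vee\mathcal{Q})
&=\sum_{P\in\mathcal{P}^F}-\mu(P\cap W)\ln\mu(P\cap W)+\sum_{P\in\mathcal{P}^F}-\mu(P\cap W^c)\ln\mu(P\cap W^c)\\
&\le-\mu(W)\ln\mu(W)+\ln M-\mu(W^c)\ln\mu(W^c)+\mu(W^c)|F|\ln|\mathcal{U}|\\
&\le2e^{-1}+\ln M+2\sqrt{\varepsilon}|F|\ln|\mathcal{U}|.
\end{aligned}
\]
Since $G$ is infinite, $|F|\to\infty$ as $F$ becomes more and more invariant. Therefore
\[
h^+_\mu(\mathcal{U})\le h_\mu(\mathcal{P})\le4\sqrt{\varepsilon}\ln|\mathcal{U}|+\frac{\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon}{1-\varepsilon}+f\big(1/((1-\varepsilon)a)\big)+\frac{\ln k}{(1-\varepsilon)a}+\frac{f(\varepsilon)}{1-\varepsilon}+f(2\sqrt{\varepsilon}).
\]
Since we may choose $F_1,\dots,F_k$ to be as close as we wish to being invariant, we may let $a\to\infty$. Thus
\[
\begin{aligned}
h^+_\mu(\mathcal{U})&\le4\sqrt{\varepsilon}\ln|\mathcal{U}|+\frac{\underline{h}_{c,\mu}(\mathcal{U},\delta)+\varepsilon}{1-\varepsilon}+\frac{f(\varepsilon)}{1-\varepsilon}+f(2\sqrt{\varepsilon})\\
&\le4\sqrt{\varepsilon}\ln|\mathcal{U}|+\frac{\underline{h}_{c,\mu}(\mathcal{U})+\varepsilon}{1-\varepsilon}+\frac{f(\varepsilon)}{1-\varepsilon}+f(2\sqrt{\varepsilon}).
\end{aligned}
\]
Letting $\varepsilon\to0$ we get $h^+_\mu(\mathcal{U})\le\underline{h}_{c,\mu}(\mathcal{U})$, as desired. □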
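The proof of Lemma 2.18 uses Stirling's formula only through the existence of a function $f$ with $f(\varepsilon)\to0$ such that an $n$-element set has at most $e^{f(\varepsilon)n}$ subsets of cardinality at least $(1-\varepsilon)n$. The paper does not fix a particular $f$. One standard admissible choice for $0<\varepsilon\le1/2$, assumed here purely for illustration, is the entropy function $f(\varepsilon)=-\varepsilon\ln\varepsilon-(1-\varepsilon)\ln(1-\varepsilon)$. The Python sketch below checks that bound on small cases, with $\varepsilon=m/n$ given by an integer $m$ to avoid rounding issues; all names are hypothetical.

```python
import math

def large_subset_count(n, m):
    """Number of subsets of an n-element set with cardinality at least n - m."""
    return sum(math.comb(n, j) for j in range(n - m, n + 1))

def f(eps):
    """Entropy function in nats; an assumed choice of the function f from the proof."""
    return -eps * math.log(eps) - (1 - eps) * math.log(1 - eps)

def bound_holds(n, m):
    """Check count <= exp(f(m/n) * n) in logarithmic form, for 0 < m <= n/2."""
    return math.log(large_subset_count(n, m)) <= f(m / n) * n + 1e-9
```

Working with logarithms keeps the comparison stable for large $n$, where the counts themselves are very large integers.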

Lemma 2.19. Let $\mu$ be a Borel probability measure on $X$. Let $C_1,\dots,C_k$ be closed subsets of $X$. Then for every $k$-element Borel partition $\mathcal{P}=\{P_1,\dots,P_k\}$ with $P_i\cap C_i=\emptyset$ for $i=1,\dots,k$ and every $\delta>0$ there is a $k$-element Borel partition $\mathcal{Q}=\{Q_1,\dots,Q_k\}$ such that $Q_i\cap C_i=\emptyset$ and $\mu(\partial Q_i)=0$ for $i=1,\dots,k$ and $H_\mu(\mathcal{Q}\mid\mathcal{P})<\delta$.

Proof. Let $\mathcal{P}=\{P_1,\dots,P_k\}$ be a $k$-element Borel partition with $P_i\cap C_i=\emptyset$ for $i=1,\dots,k$. Let $\varepsilon>0$. By the regularity of $\mu$, for $i=1,\dots,k-1$ we can find a compact set $K_i\subseteq P_i$ such that $\mu(P_i\setminus K_i)<\varepsilon$ and an open set $U_i\supseteq P_i$ such that $\mu(U_i\setminus P_i)<\varepsilon$ and $U_i\cap C_i=\emptyset$. Then $U_1,\dots,U_{k-1}$ cover $C_k$. Thus we can find a closed cover $D_1,\dots,D_{k-1}$ of $C_k$ such that $D_i\subseteq U_i$ for $i=1,\dots,k-1$. For each $x\in K_i\cup D_i$ there exists an open neighbourhood $V$ of $x$ contained in $U_i$ whose boundary has zero measure, for if we take a function $f\in C(X)$ with image in $[0,1]$ which is $0$ at $x$ and $1$ on $U_i^c$ then only countably many of the open sets $\{y\in X:f(y)<t\}$ for $t\in(0,1)$ can have boundary with positive measure. By compactness there is a finite union $B_i$ of such $V$ which covers $K_i\cup D_i$, and $\mu(\partial(B_i))=0$. Then $\mu(B_i\,\Delta\,P_i)<2\varepsilon$ for $i=1,\dots,k-1$. Now define the partition $\mathcal{Q}=\{Q_1,\dots,Q_k\}$ by $Q_1=B_1$, $Q_2=B_2\setminus B_1$, $Q_3=B_3\setminus(B_1\cup B_2)$, $\dots$, $Q_k=X\setminus(B_1\cup\cdots\cup B_{k-1})$. Then $Q_i\cap C_i=\emptyset$ and $\mu(\partial Q_i)=0$ for $i=1,\dots,k$ and $H_\mu(\mathcal{Q}\mid\mathcal{P})<\delta(\varepsilon)$ where $\delta(\varepsilon)\to0$ as $\varepsilon\to0$, yielding the lemma. □

Lemma 2.20. Let $x=(x_1,\dots,x_k)$ be an IE-tuple consisting of distinct points and let $U_1,\dots,U_k$ be pairwise disjoint open neighbourhoods of $x_1,\dots,x_k$, respectively. Then there exist a $G$-invariant Borel probability measure $\mu$ on $X$ and a $\mu$-IE-tuple $(x'_1,\dots,x'_k)$ such that $x'_i\in U_i$ for each $i=1,\dots,k$.



Proof. The case $k=1$ follows from [30, Prop. 3.12] and Proposition 2.16(3). So we may assume $k\ge2$. Let $\{F_n\}_n$ be a Følner net in $G$. For each $i=1,\dots,k$ choose a closed neighbourhood $C_i$ of $x_i$ contained in $U_i$. Since $x$ is an IE-tuple there is a $d>0$ such that for each $n$ we can find an independence set $I_n\subseteq F_n$ for the tuple $\boldsymbol{C}=(C_1,\dots,C_k)$ such that $|I_n|\ge d|F_n|$. For each $n$ pick an $x_\sigma\in\bigcap_{s\in I_n}s^{-1}C_{\sigma(s)}$ for every $\sigma\in\{1,\dots,k\}^{I_n}$ and define on $X$ the following averages of point masses:
\[
\nu_n=\frac{1}{k^{|I_n|}}\sum_{\sigma\in\{1,\dots,k\}^{I_n}}\delta_{x_\sigma},\qquad\mu_n=\frac{1}{|F_n|}\sum_{s\in F_n}s\nu_n.
\]

Take a weak$^*$ limit point $\mu$ of the net $\{\mu_n\}_n$. By passing to a cofinal subset of the net we may assume that $\mu_n$ converges to $\mu$. Let $\mathcal{P}=\{P_1,\dots,P_k\}$ be a Borel partition of $X$ such that $P_i\cap C_i=\emptyset$ and $\mu(\partial P_i)=0$ for each $i=1,\dots,k$. Let $E$ be a nonempty finite subset of $G$. We will use subadditivity and concavity as in the proof of the variational principle in Section 5.2 of [33]. The function $A\mapsto H_{\nu_n}(\mathcal{P}^A)$ on finite subsets of $G$ is subadditive in the sense that if $1_A=\sum\lambda_B1_B$ is a finite decomposition of the indicator of a finite set $A\subseteq G$ over a collection of sets $B\subseteq A$ with each $\lambda_B$ positive, then $H_{\nu_n}(\mathcal{P}^A)\le\sum\lambda_BH_{\nu_n}(\mathcal{P}^B)$ (see Section 3.1 of [33]). Observe that $\varepsilon(n):=|E^{-1}F_n\setminus F_n|/|F_n|$ is bounded above by $|E^{-1}F_n\,\Delta\,F_n|/|F_n|$ and hence by the Følner property tends to zero along the net. Applying the subadditivity of $H_{\nu_n}(\cdot)$ to the decomposition $1_{F_n}=\frac{1}{|E|}\sum_{s\in E^{-1}F_n}1_{Es\cap F_n}$, we have
\[
\begin{aligned}
H_{\nu_n}(\mathcal{P}^{F_n})&\le\frac{1}{|E|}\sum_{s\in F_n}H_{\nu_n}(\mathcal{P}^{Es})+\frac{1}{|E|}\sum_{s\in E^{-1}F_n\setminus F_n}H_{\nu_n}(\mathcal{P}^{Es})\\
&\le\frac{1}{|E|}\sum_{s\in F_n}H_{\nu_n}(\mathcal{P}^{Es})+\varepsilon(n)|F_n|\ln k.
\end{aligned}
\]

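Since $P_i\cap C_i=\emptyset$ for each $i$, every atom of $\mathcal{P}^{I_n}$ contains at most $(k-1)^{|I_n|}$ points from the set $\{x_\sigma:\sigma\in\{1,\dots,k\}^{I_n}\}$ and hence has $\nu_n$-measure at most $\big(\frac{k-1}{k}\big)^{|I_n|}$, so that
\[
H_{\nu_n}(\mathcal{P}^{I_n})=\sum_{W\in\mathcal{P}^{I_n}}-\nu_n(W)\ln\nu_n(W)\ge\sum_{W\in\mathcal{P}^{I_n}}\nu_n(W)\ln\Big(\frac{k}{k-1}\Big)^{|I_n|}=|I_n|\ln\frac{k}{k-1}
\]
and thus
\[
\frac{1}{|F_n|}H_{\nu_n}(\mathcal{P}^{F_n})\ge\frac{1}{|F_n|}H_{\nu_n}(\mathcal{P}^{I_n})\ge d\ln\frac{k}{k-1}.
\]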
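The counting behind the bound $H_{\nu_n}(\mathcal{P}^{I_n})\ge|I_n|\ln\frac{k}{k-1}$ can be simulated in a toy model. In the Python sketch below, the points $x_\sigma$ are indexed by $\sigma\in\{0,\dots,k-1\}^m$ with $m$ playing the role of $|I_n|$, and each point is placed in an atom labelled by some $\tau$ with $\tau(s)\ne\sigma(s)$ for every $s$, mirroring the condition $P_i\cap C_i=\emptyset$. The random assignment of atoms, the seed, and the function names are assumptions made for illustration only.

```python
import math
import random
from collections import Counter
from itertools import product

def atom_entropy(k, m, seed=0):
    """Entropy of the uniform measure on k**m points spread over admissible atoms."""
    rng = random.Random(seed)
    counts = Counter()
    for sigma in product(range(k), repeat=m):
        # The point x_sigma lies in C_{sigma(s)} at coordinate s, so the label
        # tau of its atom must avoid sigma(s) in every coordinate.
        tau = tuple(rng.choice([i for i in range(k) if i != sigma[s]])
                    for s in range(m))
        counts[tau] += 1
    total = k ** m
    return -sum(c / total * math.log(c / total) for c in counts.values())

def lower_bound(k, m):
    """The bound m * ln(k / (k - 1)) from the displayed estimate."""
    return m * math.log(k / (k - 1))
```

Whatever admissible assignment is chosen, each atom receives at most $(k-1)^m$ of the $k^m$ points, which is exactly what forces the entropy above the bound.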



It follows using the concavity of the function $x\mapsto-x\ln x$ that
\[
\frac{1}{|E|}H_{\mu_n}(\mathcal{P}^E)\ge\frac{1}{|F_n|}\sum_{s\in F_n}\frac{1}{|E|}H_{\nu_n}(\mathcal{P}^{Es})\ge d\ln\frac{k}{k-1}-\varepsilon(n)\ln k.
\]

Since the boundary of each $P_i$ has zero $\mu$-measure, the boundary of each atom of $\mathcal{P}^E$ has zero $\mu$-measure, and so by [27, Theorem 17.20] the entropy of $\mathcal{P}^E$ is a continuous function of the measure with respect to the weak$^*$ topology, whence
\[
\frac{1}{|E|}H_\mu(\mathcal{P}^E)=\lim_n\frac{1}{|E|}H_{\mu_n}(\mathcal{P}^E)\ge d\ln\big(k/(k-1)\big).
\]
Since this holds for every nonempty finite set $E\subseteq G$, we obtain $h_\mu(\mathcal{P})\ge d\ln(k/(k-1))$.

Now let $\mathcal{P}=\{P_1,\dots,P_k\}$ be any $k$-element Borel partition of $X$ such that $P_i\cap U_i=\emptyset$ for each $i=1,\dots,k$. By Lemma 2.19, for every $\delta>0$ there is a $k$-element Borel partition $\mathcal{Q}=\{Q_1,\dots,Q_k\}$ such that $Q_i\cap C_i=\emptyset$ and $\mu(\partial Q_i)=0$ for $i=1,\dots,k$ and $H_\mu(\mathcal{Q}\mid\mathcal{P})<\delta$, so that $h_\mu(\mathcal{P})\ge h_\mu(\mathcal{Q})-\delta\ge d\ln(k/(k-1))-\delta$ by the previous paragraph. Thus $h_\mu(\mathcal{P})\ge d\ln(k/(k-1))$. This inequality holds moreover for any finite Borel partition $\mathcal{P}$ that refines $\mathcal{U}:=\{U_1^c,\dots,U_k^c\}$ as a cover since we may assume that $\mathcal{P}$ is of the above form by coarsening it if necessary. Therefore $h^+_\mu(\mathcal{U})>0$.

Suppose that the action of $G$ on $X$ is (topologically) free, i.e., for all $x\in X$ and $s\in G$, $sx=x$ implies $s=e$. Then it is free with respect to $\mu$, and hence $\underline{h}_{c,\mu}(\mathcal{U})>0$ by Lemma 2.18. Therefore by Lemma 2.15 and Proposition 2.16(1) there is a $\mu$-IE-tuple $(x'_1,\dots,x'_k)$ contained in $U_1\times\cdots\times U_k$.

Now suppose that the action of $G$ on $X$ is not free. Take a free action of $G$ on a compact Hausdorff space $(Y,G)$, e.g., the universal minimal $G$-system [10]. Then the product system $(X\times Y,G)$ is an extension of $(X,G)$ which is free. By Proposition 3.9(4) of [30] we can find a lift $\tilde{x}$ of the tuple $x$ under this extension such that $\tilde{x}$ is an IE-tuple. By the previous paragraph there are a $G$-invariant Borel probability measure $\mu$ on $X\times Y$ and a $\mu$-IE-tuple $\tilde{x}'$ contained in the inverse image of $U_1\times\cdots\times U_k$. It then follows by Proposition 2.16(5) that the image $x'$ of $\tilde{x}'$ is a $\nu$-IE-tuple contained in $U_1\times\cdots\times U_k$ for the measure $\nu$ on $X$ induced from $\mu$, completing the proof. □

From Lemma 2.20 we obtain:

Theorem 2.21. For each $k\ge1$ the set of IE-tuples of length $k$ is equal to the closure of the union of the sets $\mathrm{IE}^\mu_k(X)$ over all $G$-invariant Borel probability measures $\mu$ on $X$.

Lemma 2.22. Suppose that $X$ is metrizable. Let $x=(x_1,\dots,x_k)$ be an IE-tuple. Then there is a $G$-invariant Borel probability measure $\mu$ on $X$ such that $x$ is a $\mu$-IE-tuple.

Proof. We may assume that $x$ consists of distinct points. Since $X$ is metrizable, we can find for each $m\in\mathbb{N}$ pairwise disjoint open neighbourhoods $U_{m,1},\dots,U_{m,k}$ of $x_1,\dots,x_k$, respectively, so that for each $i=1,\dots,k$ the family $\{U_{m,i}:m\in\mathbb{N}\}$ forms a neighbourhood basis for $x_i$. For each $m$ take a measure $\mu_m$ and a $\mu_m$-IE-tuple $x_m$ as given by Lemma 2.20 with respect to $U_{m,1},\dots,U_{m,k}$ and define the $G$-invariant Borel probability measure $\mu=\sum_{m=1}^\infty2^{-m}\mu_m$. Then $x_m$ is a $\mu$-IE-tuple for each $m$, and so $x$ is a $\mu$-IE-tuple by Proposition 2.16(4). □



Theorem 2.23. Suppose that $X$ is metrizable. Then there is a $G$-invariant Borel probability measure $\mu$ on $X$ such that the sets of $\mu$-IE-tuples and IE-tuples coincide.

Proof. For each $k\ge1$ take a countable dense subset $\{x_{k,i}\}_{i\in I_k}$ of the set of IE-tuples of length $k$. By Lemma 2.22, for every $k\ge1$ and $i\in I_k$ there is a $G$-invariant Borel probability measure $\mu_{k,i}$ on $X$ such that $x_{k,i}$ is a $\mu_{k,i}$-IE-tuple. Set $\mu=\sum_{k=1}^\infty\sum_{i\in I_k}\lambda_{k,i}\mu_{k,i}$ for some $\lambda_{k,i}>0$ with $\sum_{k=1}^\infty\sum_{i\in I_k}\lambda_{k,i}=1$. Then $\mu$ is a $G$-invariant Borel probability measure, and $x_{k,i}$ is a $\mu$-IE-tuple for every $k\ge1$ and $i\in I_k$. Since the set of $\mu$-IE-tuples of a given length is closed by Proposition 2.16(4) and $\mu$-IE-tuples are always IE-tuples, we obtain the desired conclusion. □

In the case $G=\mathbb{Z}$, the conclusion of Theorem 2.23 for $\mu$-entropy pairs and topological entropy pairs was established in [3] and then more generally for $\mu$-entropy tuples and topological entropy tuples in [22].

2.4. The relation between μ-IE-tuples and μ-entropy tuples

For $G=\mathbb{Z}$ the notion of a $\mu$-entropy pair was introduced in [4] and generalized to $\mu$-entropy tuples in [22]. We will accordingly say for $k\ge2$ that a nondiagonal tuple $(x_1,\dots,x_k)\in X^k$ is a $\mu$-entropy tuple if whenever $U_1,\dots,U_l$ are pairwise disjoint Borel neighbourhoods of the distinct points in the list $x_1,\dots,x_k$, every Borel partition of $X$ refining $\{U_1^c,\dots,U_l^c\}$ has positive measure entropy. In this subsection we aim to show that nondiagonal $\mu$-IE-tuples are the same as $\mu$-entropy tuples.

Our first task is to establish Lemma 2.24. For this we will use the orbit equivalence technique of Rudolph and Weiss [42], which will enable us to apply a result of Huang and Ye for $\mathbb{Z}$-actions [22]. In order to invoke Theorem 2.6 of [42], whose hypotheses include ergodicity, we will need the ergodic decomposition of entropy, which asserts that if $(Y,\mathscr{Y},\nu)$ is a Lebesgue space equipped with an action of a countable discrete amenable group $H$ and $\nu=\int_Z\nu_z\,d\omega(z)$ is the corresponding ergodic decomposition, then for every finite measurable partition $\mathcal{P}$ of $Y$ we have $h_\nu(\mathcal{P})=\int_Zh_{\nu_z}(\mathcal{P})\,d\omega(z)$. The standard proof of this for $G=\mathbb{Z}$ using symbolic representations (see for example Section 15.3 of [12]) also works in the general case.

Given a tuple $\boldsymbol{A}=(A_1,\dots,A_k)$ of Borel subsets of $X$ with $\bigcap_{i=1}^kA_i=\emptyset$, we say that a finite Borel partition $\mathcal{P}$ of $X$ is $\boldsymbol{A}$-admissible if it refines $\{A_1^c,\dots,A_k^c\}$ as a cover of $X$. For the definitions of the quantities $h^+_\mu(\mathcal{U})$ and $\underline{h}_{c,\mu}(\mathcal{U})$ see the discussion after Lemma 2.9. As the proof below involves several different systems, we will explicitly indicate the action in our notation for the various entropy quantities.

Lemma 2.24. Suppose that $X$ is metrizable and $G$ is countably infinite. Let $\boldsymbol{A}=(A_1,\dots,A_k)$ be a tuple of pairwise disjoint Borel subsets of $X$. Denote by $\mathcal{U}$ the Borel cover $\{A_1^c,\dots,A_k^c\}$ of $X$. Suppose that $h_\mu(\mathcal{P})>0$ for every $\boldsymbol{A}$-admissible finite Borel partition $\mathcal{P}$ of $X$. Then $\underline{h}_{c,\mu}(\mathcal{U})>0$.

Proof. Denote by $T$ the action of $G$ on $X$. Take a free weakly mixing action $S$ of $G$ on a Lebesgue space $(Y,\mathscr{Y},\nu)$ (for example a Bernoulli action). We will consider the product action $T\times S$ on $(X\times Y,\mathscr{B}\otimes\mathscr{Y},\mu\times\nu)$ and view $\mathscr{B}$ and $\mathscr{Y}$ as sub-$\sigma$-algebras of $\mathscr{B}\otimes\mathscr{Y}$ when convenient. Since $S$ is free and ergodic, by the Connes-Feldman-Weiss theorem [6] there is an integer action $\hat{R}$ on $(Y,\mathscr{Y},\nu)$ with the same orbits as $S$ and we may choose $\hat{R}$ to have zero measure entropy. Now we define an integer action $R$ on $(X\times Y,\mathscr{B}\otimes\mathscr{Y},\mu\times\nu)$ with the same



orbits as $T\times S$ by setting $R(x,y)=(T\times S)_{s(y)}(x,y)$ where $s(y)$ is the element of $G$ determined by $\hat{R}y=S_{s(y)}y$.

Let $\pi:(X,\mathscr{B},\mu)\to(Z,\mathscr{Z},\omega)$ be the dynamical factor defined by the $\sigma$-algebra $\mathscr{I}_T$ of $T$-invariant sets in $\mathscr{B}$. We write the disintegration of $\mu$ over $\omega$ as $\mu=\int_Z\mu_z\,d\omega(z)$ and for every $z\in Z$ put $X_z=\pi^{-1}(z)$ and $\mathscr{B}_z=\mathscr{B}\cap X_z$ and denote by $T_z$ the restriction of $T$ to $(X_z,\mathscr{B}_z,\mu_z)$. Since $S$ is weakly mixing, the $\sigma$-algebra $\mathscr{I}_{T\times S}$ of $(T\times S)$-invariant sets in $\mathscr{B}\otimes\mathscr{Y}$ coincides with $\mathscr{I}_T$, viewing the latter as a sub-$\sigma$-algebra of $\mathscr{B}\otimes\mathscr{Y}$. The dynamical factor $(X\times Y,\mathscr{B}\otimes\mathscr{Y},\mu\times\nu)\to(Z,\mathscr{Z},\omega)$ defined by $\mathscr{I}_{T\times S}$ is the product of $\pi$ and the trivial factor and gives the ergodic decomposition of $T\times S$ with $\omega$-a.e. ergodic components $(X_z\times Y,\mathscr{B}_z\otimes\mathscr{Y},\mu_z\times\nu)$ with action $T_z\times S$ for $z\in Z$. The orbit equivalence of $R$ with $T\times S$ respects the ergodic decomposition and so for $R$ we have $\omega$-a.e. ergodic components $(X_z\times Y,\mathscr{B}_z\otimes\mathscr{Y},\mu_z\times\nu)$ with action $R_z$ for $z\in Z$. Note that for each $z\in Z$ the action $R_z$ is free and the orbit change from $T_z\times S$ to $R_z$ is $\mathscr{Y}$-measurable in the sense of Definition 2.5 in [42].

Write $\boldsymbol{B}$ for the tuple $(A_1\times Y,\dots,A_k\times Y)$ of pairwise disjoint $\mathscr{B}$-measurable subsets of $X\times Y$. Let $\mathcal{Q}=\{Q_1,\dots,Q_r\}$ be a $\boldsymbol{B}$-admissible finite measurable partition of $X\times Y$. We will show that there exists a set of $z\in Z$ of nonzero measure for which $h_{\mu_z\times\nu}(T_z\times S,\mathcal{Q}_z\mid\mathscr{Y})>0$, where $\mathcal{Q}_z=\{Q_j\cap(X_z\times Y):j=1,\dots,r\}$.

Suppose to the contrary that $h_{\mu_z\times\nu}(T_z\times S,\mathcal{Q}_z\mid\mathscr{Y})=0$ for $\omega$-a.e. $z\in Z$. Consider the conditional expectations
\[
E^{\mathscr{B}}=\mathrm{id}_{L^1(X,\mu)}\otimes\nu:L^1(X\times Y,\mu\times\nu)=L^1(X,\mu)\mathbin{\hat\otimes}L^1(Y,\nu)\to L^1(X,\mu)
\]
and
\[
E^{\mathscr{B}_z}=\mathrm{id}_{L^1(X_z,\mu_z)}\otimes\nu:L^1(X_z\times Y,\mu_z\times\nu)=L^1(X_z,\mu_z)\mathbin{\hat\otimes}L^1(Y,\nu)\to L^1(X_z,\mu_z)
\]
for $z\in Z$. As is easy to check using approximations in the algebraic tensor product, for every $f\in L^1(X\times Y,\mu\times\nu)=L^1(X,\mu)\mathbin{\hat\otimes}L^1(Y,\nu)$ there is a full-measure set of $z\in Z$ for which $E^{\mathscr{B}_z}(f|_{X_z})(x)=E^{\mathscr{B}}(f)(x)$ for $\mu_z$-a.e. $x\in X_z$. For each $j=1,\dots,r$ set $C_j=\{x\in X:E^{\mathscr{B}}(1_{Q_j})(x)>0\}$, which is defined up to a set of $\mu$-measure zero and hence can be assumed to satisfy the condition that for every $i=1,\dots,r$ it is disjoint from $A_i\times Y$ if and only if $Q_j$ is. Then $\{C_j:j=1,\dots,r\}$ is an $\boldsymbol{A}$-admissible Borel cover of $X$. Putting $\mathcal{P}=\{C_j\setminus\bigcup_{d=1}^{j-1}C_d:j=1,\dots,r\}$ we obtain an $\boldsymbol{A}$-admissible measurable partition of $X$.

Now let $z\in Z$. Denote by $\mathscr{R}$ the relative Pinsker $\sigma$-algebra of $T_z\times S$ with respect to $\mathscr{Y}$, i.e., the $\sigma$-algebra generated by all measurable partitions $\mathcal{R}$ of $X_z\times Y$ such that $h_{\mu_z\times\nu}(T_z\times S,\mathcal{R}\mid\mathscr{Y})=0$. In the $\omega$-a.e. situation that $R_z$ is ergodic we have $\mathscr{R}=\mathscr{P}_{T_z}\otimes\mathscr{Y}$ by Theorem 4.10 of [42]. From the discussion in the previous paragraph we see that if $z$ is assumed to belong to a certain set of full measure then for each $j=1,\dots,r$ the sets $C_j\cap X_z$ and $\{x\in X_z:E^{\mathscr{B}_z}(1_{Q_j\cap X_z})(x)>0\}$ coincide up to a set of $\mu_z$-measure zero. In this case, setting $\mathcal{P}_z=\{P\cap X_z:P\in\mathcal{P}\}$ we obtain a partition of $X_z$ which is $\mathscr{P}_{T_z}$-measurable and hence satisfies $h_{\mu_z}(T_z,\mathcal{P}_z)=0$. It follows using the ergodic decomposition of entropy that $h_\mu(T,\mathcal{P})=\int_Zh_{\mu_z}(T_z,\mathcal{P}_z)\,d\omega(z)=0$, contradicting our hypothesis. Therefore we must have $h_{\mu_z\times\nu}(T_z\times S,\mathcal{Q}_z\mid\mathscr{Y})>0$ for all $z$ in a set $W\subseteq Z$ of nonzero measure.

For every $z$ in a subset of $W$ with the same measure as $W$ the action $R_z$ is ergodic and free, in which case we can apply Theorem 2.6 of [42] along with the fact that $\hat{R}$ has zero entropy to obtain
\[
h_{\mu_z\times\nu}(R_z,\mathcal{Q}_z)=h_{\mu_z\times\nu}(R_z,\mathcal{Q}_z\mid\mathscr{Y})=h_{\mu_z\times\nu}(T_z\times S,\mathcal{Q}_z\mid\mathscr{Y})>0.
\]
The ergodic decomposition of entropy then yields
\[
h_{\mu\times\nu}(R,\mathcal{Q})=\int_Zh_{\mu_z\times\nu}(R_z,\mathcal{Q}_z)\,d\omega(z)>0.
\]



It follows by Theorem 4.6 of [22] that the infimum $c$ of $h_{\mu\times\nu}(R,\mathcal{Q})$ over all $\boldsymbol{B}$-admissible finite measurable partitions $\mathcal{Q}$ of $X\times Y$ is nonzero.

Denote by $\mathcal{V}$ the measurable cover $\{A_1^c\times Y,\dots,A_k^c\times Y\}$ of $X\times Y$. Suppose we are given a $\boldsymbol{B}$-admissible finite measurable partition $\mathcal{Q}$ of $X\times Y$. Applying the ergodic decomposition of entropy, Theorem 2.6 of [42], and the fact that $\hat{R}$ has zero entropy we get
\[
\begin{aligned}
h_{\mu\times\nu}(T\times S,\mathcal{Q})&=\int_Zh_{\mu_z\times\nu}(T_z\times S,\mathcal{Q}_z)\,d\omega(z)\\
&\ge\int_Zh_{\mu_z\times\nu}(T_z\times S,\mathcal{Q}_z\mid\mathscr{Y})\,d\omega(z)\\
&=\int_Zh_{\mu_z\times\nu}(R_z,\mathcal{Q}_z\mid\mathscr{Y})\,d\omega(z)\\
&=\int_Zh_{\mu_z\times\nu}(R_z,\mathcal{Q}_z)\,d\omega(z)\\
&=h_{\mu\times\nu}(R,\mathcal{Q})\ge c.
\end{aligned}
\]
Therefore $h^+_{\mu\times\nu}(T\times S,\mathcal{V})\ge c>0$, and since $T\times S$ is free it follows by Lemma 2.18 that $\underline{h}_{c,\mu\times\nu}(T\times S,\mathcal{V})>0$. As we clearly have $\underline{h}_{c,\mu}(T,\mathcal{U})\ge\underline{h}_{c,\mu\times\nu}(T\times S,\mathcal{V})$, this establishes the lemma. □

We remark that, in the last paragraph of the above proof, if $\mathcal{Q}$ is of the form $\{P\times Y:P\in\mathcal{P}\}$ for some finite $\boldsymbol{A}$-admissible Borel partition $\mathcal{P}$ of $X$, then $h_\mu(T,\mathcal{P})=h_{\mu\times\nu}(T\times S,\mathcal{Q})$, in which case the display shows that $h^+_\mu(T,\mathcal{U})\ge c>0$.

In order to reduce the general case of discrete amenable groups to the case of countable ones, we shall need Lemma 2.26 below. For this we need the machinery of quasi-tiling developed by Ornstein and Weiss. The following lemma is contained in the proof of Theorem 6 in [35].

Lemma 2.25. Given $1>\varepsilon>0$, if $F_1\subseteq F_2\subseteq\cdots\subseteq F_k$ are nonempty finite subsets of $G$ such that $F_{i+1}$ is $(F_iF_i^{-1},\eta_i)$-invariant, $\eta_i|F_iF_i^{-1}|\le\frac{\varepsilon^2}{4}$ for $i=1,2,\dots,k-1$, and $(1-\frac{\varepsilon}{2})^k<\varepsilon$, then for any $(F_k,\frac{\varepsilon^2}{4})$-invariant finite nonempty subset $F$ of $G$ there are translates $\{F_ic_{ij}\}_{i,j}$ contained in $F$ and subsets $E_{ij}\subseteq F_ic_{ij}$ such that $E_{ij}\cap E_{i'j'}=\emptyset$ for all $(i,j)\ne(i',j')$, $|E_{ij}|/|F_ic_{ij}|\ge1-\varepsilon$ for all $(i,j)$, and $\big|\bigcup_{ij}F_ic_{ij}\big|/|F|\ge1-\varepsilon$.

The following lemma is a direct consequence of Lemma 2.25. For any $\varphi$ satisfying the conditions below, by Proposition 3.22 in [30], $\frac{\varphi(F)}{|F|}$ converges as $F$ becomes more and more invariant. Note that every subgroup of $G$ is amenable [36, Proposition 1.12].

Lemma 2.26. If $\varphi$ is a real-valued function which is defined on the set of finite subsets of $G$ and satisfies
(1) $0\le\varphi(A)<+\infty$ and $\varphi(\emptyset)=0$,
(2) $\varphi(A)\le\varphi(B)$ for all $A\subseteq B$,
(3) $\varphi(As)=\varphi(A)$ for all finite $A\subseteq G$ and $s\in G$,
(4) $\varphi(A\cup B)\le\varphi(A)+\varphi(B)$ if $A\cap B=\emptyset$,
then the limit of $\frac{\varphi(F)}{|F|}$ as $F$ becomes more and more invariant in $G$ is the minimum of the corresponding limits of $\frac{\varphi(F)}{|F|}$ as $F$ becomes more and more invariant in $H$ for $H$ running over the countable subgroups of $G$.

Theorem 2.27. For every $k\ge2$, a nondiagonal tuple in $X^k$ is a $\mu$-IE-tuple if and only if it is a $\mu$-entropy tuple.

Proof. The fact that a nondiagonal $\mu$-IE-tuple is a $\mu$-entropy tuple follows from Lemma 2.15. In the case that $X$ is metrizable and $G$ is countably infinite, Lemmas 2.24 and 2.15 combine to show that a $\mu$-entropy tuple is a $\mu$-IE-tuple.

Suppose now that $X$ is arbitrary. When $G$ is finite, it is easily seen that the nondiagonal $\mu$-IE-tuples and $\mu$-entropy tuples are both precisely the nondiagonal tuples in $\operatorname{supp}(\mu)^k$. When $G$ is countably infinite, write $X$ as a projective limit of a net of metrizable spaces $X_j$ equipped with compatible $G$-actions and induced Borel probability measures $\mu_j$. Then by Proposition 2.16(5) the $\mu$-IE-tuples are the projective limits of the $\mu_j$-IE-tuples. Since the image of a measure entropy tuple under a factor map is clearly again a measure entropy tuple as long as its image is nondiagonal, we conclude from the metrizable case that every $\mu$-entropy tuple is a $\mu$-IE-tuple. Finally, when $G$ is uncountably infinite, it follows from Lemma 2.26 that the set of $\mu$-entropy tuples for $(X,G)$ is equal to the intersection over the countable subgroups $G'$ of $G$ of the sets consisting of the $\mu$-entropy tuples for $(X,G')$. It is also easily verified that the set of $\mu$-IE-tuples for $(X,G)$ contains the intersection over the countable subgroups $G'$ of $G$ of the sets consisting of the $\mu$-IE-tuples for $(X,G')$. We thus obtain the result. □

To prove the product formula for $\mu$-IE-tuples we will use the Pinsker von Neumann algebra $\mathcal{P}_X$, i.e., the $G$-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ corresponding to the Pinsker $\sigma$-algebra (see the beginning of the next section). Denote by $E_X$ the conditional expectation $L^\infty(X,\mu)\to\mathcal{P}_X$. The following lemma appeared as Lemma 4.3 in [22]. Note that the assumptions in [22] that $X$ is metrizable and $G=\mathbb{Z}$ are not needed here.

Lemma 2.28. Let $\mathcal{U}=\{U_1,\dots,U_k\}$ be a Borel cover of $X$. Then $\prod_{i=1}^kE_X(\chi_{U_i^c})\ne0$ if and only if $h_\mu(\mathcal{P})>0$ for every finite Borel partition $\mathcal{P}$ finer than $\mathcal{U}$ as a cover.

Combining Lemma 2.28, Proposition 2.16(3), and Theorem 2.27, we obtain the following characterization of $\mu$-IE-tuples.

Lemma 2.29. A tuple $x=(x_1,\dots,x_k)\in X^k$ is a $\mu$-IE-tuple if and only if for any Borel neighbourhoods $U_1,\dots,U_k$ of $x_1,\dots,x_k$, respectively, one has $\prod_{i=1}^kE_X(\chi_{U_i})\ne0$.

Theorem 2.30. Let $(Y,G)$ be another topological $G$-system and $\nu$ a $G$-invariant Borel probability measure on $Y$. Then for all $k\ge1$ we have $\mathrm{IE}^{\mu\times\nu}_k(X\times Y)=\mathrm{IE}^\mu_k(X)\times\mathrm{IE}^\nu_k(Y)$.

Proof. By Proposition 2.16(5) we have $\mathrm{IE}^{\mu\times\nu}_k(X\times Y)\subseteq\mathrm{IE}^\mu_k(X)\times\mathrm{IE}^\nu_k(Y)$. Thus we just need to prove $\mathrm{IE}^\mu_k(X)\times\mathrm{IE}^\nu_k(Y)\subseteq\mathrm{IE}^{\mu\times\nu}_k(X\times Y)$.



Assume first that both $X$ and $Y$ are metrizable and $G$ is countable. Then $\mathcal{P}_{X\times Y}=\mathcal{P}_X\otimes\mathcal{P}_Y$ [7, Theorem 0.4(3)] (see also [14, Theorem 4] for the ergodic case) and hence $E_{X\times Y}(f\otimes g)=E_X(f)\otimes E_Y(g)$ for any $f\in L^\infty(X,\mu)$ and $g\in L^\infty(Y,\nu)$. Now the desired inclusion follows from Lemma 2.29. The proof for the general case follows the argument in the proof of Theorem 2.27. □

In the case $G=\mathbb{Z}$, the product formula for measure entropy pairs was established in [11], while for general measure entropy tuples it is implicit in Theorem 8.1 of [22], whose proof we have essentially followed here granted the general tensor product formula for Pinsker von Neumann algebras. Notice that our IE-tuple viewpoint results in a particularly simple formula.

3. Combinatorial independence and the Pinsker algebra

Continuing within the realm of entropy, we will assume throughout the section that $(X,G)$ is a topological dynamical system with $G$ amenable and $\mu$ is a $G$-invariant Borel probability measure on $X$. Recall that the Pinsker $\sigma$-algebra is the $G$-invariant $\sigma$-subalgebra of $\mathscr{B}$ generated by all finite Borel partitions of $X$ with zero entropy (or, equivalently, all two-element Borel partitions of $X$ with zero entropy), and it defines the largest factor of the system with zero entropy (see Chapter 18 of [12]). The corresponding $G$-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ will be denoted by $\mathcal{P}_X$ and referred to as the Pinsker von Neumann algebra. In Theorem 3.7 we will give various local descriptions of the Pinsker von Neumann algebra in terms of combinatorial independence, $\ell_1$ geometry, and c.p. approximation entropy.

The notion of c.p. (completely positive) approximation entropy was introduced by Voiculescu in [46] for $^*$-automorphisms of hyperfinite von Neumann algebras preserving a faithful normal state (see [34] for a general reference on dynamical entropy in operator algebras). We will formulate here a version of the definition for amenable acting groups.
So let $M$ be a von Neumann algebra, $\sigma$ a faithful normal state on $M$, and $\beta$ a $\sigma$-preserving action of the discrete amenable group $G$ on $M$ by ∗-automorphisms. For a finite set $\Upsilon\subseteq M$ and $\delta>0$ we write $\mathrm{CPA}_\sigma(\Upsilon,\delta)$ for the set of all triples $(\varphi,\psi,B)$ where $B$ is a finite-dimensional $C^*$-algebra and $\varphi : M\to B$ and $\psi : B\to M$ are unital completely positive maps such that $\sigma\circ\psi\circ\varphi=\sigma$ and $\|(\psi\circ\varphi)(a)-a\|_\sigma<\delta$ for all $a\in\Upsilon$. We then set

\[ \mathrm{rcp}_\sigma(\Upsilon,\delta) = \inf\{\operatorname{rank} B : (\varphi,\psi,B)\in \mathrm{CPA}_\sigma(\Upsilon,\delta)\} \]

if the set on the right is nonempty, which is always the case if $M$ is commutative or hyperfinite. Otherwise we put $\mathrm{rcp}_\sigma(\Upsilon,\delta)=\infty$. Write $\mathrm{hcpa}_\sigma(\beta,\Upsilon,\delta)$ for the limit supremum of $\frac{1}{|F|}\ln \mathrm{rcp}_\sigma\big(\bigcup_{s\in F}\beta_s(\Upsilon),\delta\big)$ as $F$ becomes more and more invariant, and define

\[ \mathrm{hcpa}_\sigma(\beta,\Upsilon) = \sup_{\delta>0}\, \mathrm{hcpa}_\sigma(\beta,\Upsilon,\delta), \]

\[ \mathrm{hcpa}_\sigma(\beta) = \sup_{\Upsilon}\, \mathrm{hcpa}_\sigma(\beta,\Upsilon), \]

where the last supremum is taken over all finite subsets $\Upsilon$ of $M$. We refer to $\mathrm{hcpa}_\sigma(\beta,\Upsilon)$ as the c.p. approximation entropy of $\beta$. When $G=\mathbb{Z}$ and $M$ is commutative and has separable predual, this coincides with Voiculescu's original definition by the arguments leading to Corollary 3.8 in [46].
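The phrase "as $F$ becomes more and more invariant" can be made concrete in the simplest case $G=\mathbb{Z}$. The following is only an illustrative sketch and not part of the source: it assumes the intervals $\{0,\dots,n-1\}$ as the approximately invariant sets and measures invariance under a shift $s$ by the ratio $|(F+s)\,\triangle\,F|/|F|$; the function names are hypothetical.

```python
def invariance_defect(F, s):
    """Return |(F + s) symmetric-difference F| / |F| for a finite nonempty set F of integers."""
    F = set(F)
    shifted = {x + s for x in F}
    return len(shifted ^ F) / len(F)


def interval(n):
    """The interval {0, ..., n-1}, assumed here as the n-th Folner set of Z."""
    return range(n)


if __name__ == "__main__":
    # The defect under a fixed shift shrinks as the interval grows.
    for n in (10, 100, 1000):
        print(n, invariance_defect(interval(n), 3))
```

For these intervals the defect is $2|s|/n$ once $n\ge |s|$, so every fixed shift eventually moves only a negligible proportion of the set.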



Question 3.1. Does the above definition always coincide with Voiculescu's when $G=\mathbb{Z}$?

By Corollary 3.8 in [46], when $X$ is metrizable, $G=\mathbb{Z}$, and the action is ergodic, the c.p. approximation entropy of the induced action $\alpha$ on $L^\infty(X,\mu)$ agrees with the measure entropy $h_\mu(X)$. The arguments also work for general amenable $G$. It follows using the ergodic decomposition of entropy (see the paragraph before Lemma 2.24) that when $X$ is metrizable the Pinsker von Neumann algebra is the largest $G$-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ on which the c.p. approximation entropy is zero.

We next define geometric analogues of upper and lower measure independence density from Section 2. Let $f\in L^\infty(X,\mu)$. Let $p$ be a projection in $L^\infty(X,\mu)$ and let $\lambda\ge 1$. We say that a set $J\subseteq G$ is an $\ell_1$-$\lambda$-isomorphism set for $f$ relative to $p$ if $\{p\alpha_i(f) : i\in J\}$ is $\lambda$-equivalent to the standard basis of $\ell_1^J$. For $\delta>0$ denote by $P(\mu,\delta)$ the set of projections $p\in L^\infty(X,\mu)$ such that $\mu(p)\ge 1-\delta$. For every finite subset $F$ of $G$, $\lambda\ge 1$, and $\delta>0$ we define

\[ \varphi_{f,\lambda,\delta}(F) = \min_{p\in P(\mu,\delta)} \max\big\{ |F\cap J| : J \text{ is an } \ell_1\text{-}\lambda\text{-isomorphism set for } f \text{ relative to } p \big\}. \]
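For orientation, the $\lambda$-equivalence condition used in this definition can be unpacked as a two-sided norm estimate. The following is a sketch under an assumption: the convention is taken from [29,30] and is not restated in this section, namely that a family $\{v_i : i\in J\}$ is $\lambda$-equivalent to the standard basis $\{e_i\}$ of $\ell_1^J$ when the linear map $\gamma$ with $\gamma(e_i)=v_i$ is an isomorphism onto its image with $\|\gamma\|\,\|\gamma^{-1}\|\le\lambda$.

```latex
% Assumed convention: \gamma : \ell_1^J \to \overline{\operatorname{span}}\{v_i\}, \quad \gamma(e_i) = v_i .
\[
  \|\gamma^{-1}\|^{-1} \sum_{i\in J} |c_i|
  \;\le\; \Big\| \sum_{i\in J} c_i v_i \Big\|
  \;\le\; \|\gamma\| \sum_{i\in J} |c_i|
  \qquad \text{for all finitely supported scalars } (c_i)_{i\in J},
\]
% with v_i = p\,\alpha_i(f) in the definition of an \ell_1-\lambda-isomorphism set.
```

In particular, when $\|f\|\le 1$ the upper estimate holds with $\|\gamma\|\le 1$, so a lower bound $\lambda^{-1}\sum_i |c_i| \le \|\sum_i c_i\, p\alpha_i(f)\|$ is enough for $\lambda$-equivalence under this convention.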

Write $\overline{I}_\mu(f,\lambda,\delta)$ for the limit supremum of $\frac{1}{|F|}\varphi_{f,\lambda,\delta}(F)$ as $F$ becomes more and more invariant, and $\underline{I}_\mu(f,\lambda,\delta)$ for the corresponding limit infimum. Set $\overline{I}_\mu(f,\lambda)=\sup_{\delta>0}\overline{I}_\mu(f,\lambda,\delta)$ and $\underline{I}_\mu(f,\lambda)=\sup_{\delta>0}\underline{I}_\mu(f,\lambda,\delta)$. Finally, we define $\overline{I}_\mu(f)=\sup_{\lambda\ge 1}\overline{I}_\mu(f,\lambda)$ and $\underline{I}_\mu(f)=\sup_{\lambda\ge 1}\underline{I}_\mu(f,\lambda)$, and refer to these quantities respectively as the upper $\mu$-$\ell_1$-isomorphism density and lower $\mu$-$\ell_1$-isomorphism density of $f$.

On the topological side, for each $\lambda\ge 1$ the limit of

\[ \frac{1}{|F|}\max\big\{ |F\cap J| : \{\alpha_i(f) : i\in J\} \text{ is } \lambda\text{-equivalent to the standard basis of } \ell_1^J \big\} \]

as $F$ becomes more and more invariant exists (see the end of Section 3 in [30]), and we refer to the supremum of these limits over all $\lambda\ge 1$ as the $\ell_1$-isomorphism density of $f$.

Glasner and Weiss proved the next lemma for the real scalar case [16, Lemma 2.3]. The complex scalar version follows by considering the map $E\to(\ell_\infty^n)_{\mathbb{R}}\oplus_\infty(\ell_\infty^n)_{\mathbb{R}}=(\ell_\infty^{2n})_{\mathbb{R}}$ sending each $v\in E\subseteq\ell_\infty^n$ to the pair consisting of its real and imaginary parts.

Lemma 3.2. For all $b>0$ and $\delta>0$ there exist $c>0$ and $\varepsilon>0$ such that, for all sufficiently large $n$, if $E$ is a subset of the unit ball of $\ell_\infty^n$ which is $\delta$-separated and $|E|\ge e^{bn}$, then there are a $t\in[-1,1]$ and a set $J\subseteq\{1,2,\dots,n\}$ for which
(1) $|J|\ge cn$, and
(2) either for every $\sigma\in\{0,1\}^J$ there is a $v\in E$ such that for all $j\in J$

\[ \operatorname{re}(v(j)) \ge t+\varepsilon \ \text{ if } \sigma(j)=1, \qquad \operatorname{re}(v(j)) \le t-\varepsilon \ \text{ if } \sigma(j)=0, \]

or for every $\sigma\in\{0,1\}^J$ there is a $v\in E$ such that for all $j\in J$ the above holds with $\operatorname{re}(v(j))$ replaced by $\operatorname{im}(v(j))$.

The following is a consequence of Lemma 3.6 in [30].



Lemma 3.3. There exists a $c>0$ such that whenever $I$ is a finite set and $A_{i,1}$, $A_{i,2}$, and $B_i$ for $i\in I$ are subsets of a given set such that the collection $\{(A_{i,1}\cup A_{i,2},B_i) : i\in I\}$ is independent, there are a set $J\subseteq I$ with $|J|\ge c|I|$ and a $j\in\{1,2\}$ for which the collection $\{(A_{i,j},B_i) : i\in J\}$ is independent.

Lemma 3.4. For every $\delta>0$ there exist $c>0$ and $\varepsilon>0$ such that, for every compact Hausdorff space $Y$ and finite subset $\Theta$ of the unit ball of $C(Y)$ of sufficiently large cardinality, if the linear map $\gamma : \ell_1^\Theta\to C(Y)$ sending the standard basis of $\ell_1^\Theta$ to $\Theta$ is an isomorphism with $\|\gamma^{-1}\|^{-1}\ge\delta$, then there exist closed disks $B_1,B_2\subseteq\mathbb{C}$ of diameter at most $\varepsilon/6$ with $\operatorname{dist}(B_1,B_2)\ge\varepsilon$ and an $I\subseteq\Theta$ with $|I|\ge c|\Theta|$ such that the collection $\{(f^{-1}(B_1),f^{-1}(B_2)) : f\in I\}$ is independent.

Proof. Let $\delta>0$. Define a pseudometric $d_\Theta$ on $Y$ by

\[ d_\Theta(x,y) = \sup_{f\in\Theta} |f(x)-f(y)| \]

and pick a maximal $(\delta/4)$-separated subset $Z$ of $Y$. Then the open balls $B(z,\delta/4)$ with radius $\delta/4$ and centre $z$ for $z\in Z$ cover $Y$. A standard partition of unity argument (see the proof of Proposition 4.8 in [46]) yields the bound $\mathrm{rc}(\Theta,\delta/2)\le|Z|$ for the contractive $(\delta/2)$-rank of $\Theta$ as defined in [29]. By Lemma 3.2 of [29] we have $\ln\mathrm{rc}(\Theta,\delta/2)\ge|\Theta|\,a\,\|\gamma\|^{-2}(\|\gamma^{-1}\|^{-1}-\delta/2)^2$ for some universal constant $a>0$. Thus

\[ |Z| \ge e^{|\Theta| a \|\gamma\|^{-2}(\|\gamma^{-1}\|^{-1}-\delta/2)^2} \ge e^{|\Theta| a\delta^2/4}. \]

Evaluation of the functions in $\Theta$ on the points of $Y$ yields a map $\psi$ from $Y$ to the unit ball of $\ell_\infty^\Theta$ such that $\psi(Z)$ is $(\delta/4)$-separated. By Lemma 3.2 there are $c>0$ and $\varepsilon>0$ depending only on $a$ and $\delta$ such that there exist closed disks $B_1$ and $B_2$ contained in the unit disk of $\mathbb{C}$ with $\operatorname{dist}(B_1,B_2)\ge 4\varepsilon/3$ and an $I\subseteq\Theta$ with $|I|\ge c|\Theta|$ such that the collection $\{(f^{-1}(B_1),f^{-1}(B_2)) : f\in I\}$ is independent. Now for some $N\in\mathbb{N}$ depending on $\varepsilon$ we can cover each of $B_1$ and $B_2$ with $N$ disks of diameter at most $\varepsilon/6$. By repeated application of Lemma 3.3 we can then replace each of $B_1$ and $B_2$ with one of the smaller disks to obtain the result (with a smaller $c$). □

Lemma 3.5. Let $\delta>0$ and $\lambda>0$. Let $\Omega=\{f_1,\dots,f_n\}$ be a subset of the unit ball of $L^\infty(X,\mu)$ and suppose that for all $g_1,\dots,g_n$ in the unit ball of $L^\infty(X,\mu)$ with $\max_{1\le i\le n}\|g_i-f_i\|_\mu<\delta$ there exists an $I\subseteq\{1,\dots,n\}$ of cardinality at least $dn$ for which the linear map $\ell_1^I\to\operatorname{span}\{g_i : i\in I\}$ sending the standard basis element with index $i\in I$ to $g_i$ has an inverse with norm at most $\lambda$. Then $\ln\mathrm{rcp}_\mu(\Omega,\delta)\ge an$ for some constant $a>0$ which depends only on $\lambda$.

Proof. Let $(\varphi,\psi,B)\in\mathrm{CPA}_\mu(\Omega,\delta)$. Then there exists an $I\subseteq\{1,\dots,n\}$ of cardinality at least $dn$ for which the linear map $\ell_1^I\to\operatorname{span}\{(\psi\circ\varphi)(f_i) : i\in I\}$ sending the standard basis element with index $i\in I$ to $(\psi\circ\varphi)(f_i)$ has an inverse with norm at most $\lambda$. It follows using the operator norm contractivity of $\varphi$ and $\psi$ that for any scalars $c_i$ for $i\in I$ we have

\[ \Big\|\sum_{i\in I} c_i\varphi(f_i)\Big\| \ge \Big\|\sum_{i\in I} c_i(\psi\circ\varphi)(f_i)\Big\| \ge \lambda^{-1}\sum_{i\in I}|c_i|, \]



so that the subset $\{\varphi(f_i) : i\in I\}$ of $B$ is $\lambda$-equivalent to the standard basis of $\ell_1^I$. Lemma 3.1 of [28] then guarantees the existence of a constant $a>0$ depending only on $\lambda$ such that $\ln\operatorname{rank}(B)\ge an$, yielding the result. □

Lemma 3.6. Let $\delta>0$. Let $\Omega=\{f_1,\dots,f_n\}$ be a subset of the unit ball of $L^\infty(X,\mu)$ and for each $i=1,\dots,n$ let $\mathcal{P}_i$ be a finite Borel partition of $X$ such that $\operatorname{ess\,sup}_{x,y\in P}|f_i(x)-f_i(y)|<\delta$ for every $P\in\mathcal{P}_i$. Suppose that $H(\mathcal{P})\le n\delta^2$ where $\mathcal{P}=\bigvee_{i=1}^n\mathcal{P}_i$. Then

\[ \ln\mathrm{rcp}_\mu\big(\Omega,\sqrt{\delta^2+4\delta}\,\big) \le 2n\delta \]

if $n$ is sufficiently large as a function of $\delta$.

Proof. For a finite Borel partition $\mathcal{Q}$ of $X$ we write $I(\mathcal{Q})$ for the information function $-\sum_{Q\in\mathcal{Q}}1_Q\ln\mu(Q)$. Then $H(\mathcal{P})=\int_X I(\mathcal{P})\,d\mu$, and so by our assumption the set $D$ on which the nonnegative function $I(\mathcal{P})/n$ takes values less than $\delta$ has measure at least $1-\delta$. Then $\mu(P)\ge e^{-n\delta}$ for all $P\in\mathcal{P}$ such that $\mu(P\cap D)\ne 0$. Let $B$ be the linear span of $\{1_{P\cap D} : P\in\mathcal{P} \text{ and } \mu(P\cap D)\ne 0\}\cup\{1_{D^c}\}$. Then $B$ is a unital ∗-subalgebra of $L^\infty(X,\mu)$ and $\dim B\le e^{n\delta}+1$. Taking the $\mu$-preserving conditional expectation $\varphi : L^\infty(X,\mu)\to B$ and the inclusion $\psi : B\to L^\infty(X,\mu)$ it is readily checked that $(\varphi,\psi,B)\in\mathrm{CPA}_\mu(\Omega,\sqrt{\delta^2+4\delta})$ so that $\mathrm{rcp}_\mu(\Omega,\sqrt{\delta^2+4\delta})\le e^{n\delta}+1$, from which the desired conclusion follows. □

Regarding $L^\infty(X,\mu)$ as a unital commutative $C^*$-algebra, it is isomorphic by Gelfand theory to $C(\Omega)$ for some compact Hausdorff space $\Omega$, which we can identify with the spectrum of $L^\infty(X,\mu)$ (i.e., the space of nonzero multiplicative linear functionals on $L^\infty(X,\mu)$) equipped with the relative weak∗ topology (see Chapter 1 of [8]). Accordingly we will view elements of $L^\infty(X,\mu)$ as continuous functions on $\Omega$ when appropriate. The action $\alpha$ of $G$ on $L^\infty(X,\mu)$ gives rise to a topological dynamical system $(\Omega,G)$ with the action of $G$ defined by $(s,\sigma)\mapsto\sigma\circ\alpha_{s^{-1}}$. Since $\mu$ defines a state on $L^\infty(X,\mu)$ it gives rise to a $G$-invariant Borel probability measure on $\Omega$, which we will also denote by $\mu$. For a projection $p\in L^\infty(X,\mu)$ we write $\Omega_p$ for the clopen subset of $\Omega$ whose characteristic function is $p$.

Theorem 3.7. Let $f\in L^\infty(X,\mu)$. Let $\{F_n\}_{n\in\Lambda}$ be a Følner net in $G$.
Then the following are equivalent:
(1) $f\notin P_X$,
(2) there is a $\mu$-IE-pair $(\sigma_1,\sigma_2)\in\Omega\times\Omega$ such that $f(\sigma_1)\ne f(\sigma_2)$,
(3) there are $d>0$, $\delta>0$, and $\lambda>0$ such that, for all $n$ greater than some $n_0\in\Lambda$, whenever $g_s$ for $s\in F_n$ are elements of $L^\infty(X,\mu)$ satisfying $\|g_s-\alpha_s(f)\|_\mu<\delta$ for every $s\in F_n$ there exists an $I\subseteq F_n$ of cardinality at least $d|F_n|$ for which the linear map $\ell_1^I\to\operatorname{span}\{g_s : s\in I\}$ sending the standard basis element with index $s\in I$ to $g_s$ has an inverse with norm at most $\lambda$,
(4) the same as (3) with "for all $n$ greater than some $n_0\in\Lambda$" replaced by "for all $n$ in a cofinal subset of $\Lambda$,"
(5) $\underline{I}_\mu(f)>0$,
(6) $\overline{I}_\mu(f)>0$,
(7) $\mathrm{hcpa}_\mu(\alpha,\{f\})>0$,
(8) $\mathrm{hcpa}_\mu(\beta)>0$ for the restriction $\beta$ of $\alpha$ to the von Neumann subalgebra of $L^\infty(X,\mu)$ dynamically generated by $f$.



When the action is ergodic and either $X$ is metrizable or $G$ is countable, we can add:
(9) there is a $\delta>0$ such that every $g\in L^\infty(X,\mu)$ satisfying $\|g-f\|_\mu<\delta$ has positive $\ell_1$-isomorphism density with respect to the operator norm.
When $f\in C(X)$ we can add:
(10) $f\notin C(Y)$ whenever $\pi : X\to Y$ is a topological $G$-factor map such that $h_{\pi_*(\mu)}(Y)=0$,
(11) there is a $\mu$-IE-pair $(x_1,x_2)\in X\times X$ such that $f(x_1)\ne f(x_2)$.

Proof. (1) ⇒ (2). Since the $\alpha$-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ generated by $f$ is also dynamically generated by the set of spectral projections of $f$ over closed subsets of the complex plane, we can find a clopen set $Z\subseteq\Omega$ corresponding to a spectral projection of $f$ over $A$ for some set $A\subseteq\mathbb{C}$ such that the two-element clopen partition $\mathcal{Z}=\{Z,Z^c\}$ satisfies $h_\mu(\Omega,\mathcal{Z})>0$. Using Lemma 2.8 we can find a closed set $B\subseteq\mathbb{C}$ with $B\cap A=\emptyset$ such that the pair $(Z,Z')$ has positive $\mu$-independence density, where $Z'$ is the subset of $\Omega$ supporting the spectral projection of $f$ over $B$. By Proposition 2.16(1) there is a $\mu$-IE-pair $(\sigma_1,\sigma_2)\in\Omega\times\Omega$ such that $\sigma_1\in Z$ and $\sigma_2\in Z'$. Then $f(\sigma_1)\in A$ while $f(\sigma_2)\in B$, establishing (2).

(2) ⇒ (3). Let $(\sigma_1,\sigma_2)\in\Omega\times\Omega$ be a $\mu$-IE-pair such that $f(\sigma_1)\ne f(\sigma_2)$. Choose disjoint closed disks $B_1,B_2\subseteq\mathbb{C}$ such that $\operatorname{diam}(B_1)=\operatorname{diam}(B_2)\le\frac{1}{10}\operatorname{dist}(B_1,B_2)$, $f(\sigma_1)\in\operatorname{int}(B_1)$, and $f(\sigma_2)\in\operatorname{int}(B_2)$ and set $\varepsilon=\frac{1}{10}\operatorname{dist}(B_1,B_2)$. Choose clopen neighbourhoods $A_1$ and $A_2$ of $\sigma_1$ and $\sigma_2$, respectively, such that $f(A_1)\subseteq B_1$ and $f(A_2)\subseteq B_2$. Write $\boldsymbol{A}$ for the pair $(A_1,A_2)$. Since $(\sigma_1,\sigma_2)$ is a $\mu$-IE-pair there exists by Proposition 2.4 and Lemma 2.15 a $\delta>0$ such that $\underline{I}_\mu(\boldsymbol{A},\delta)>0$. Take an $\eta>0$ such that whenever $h$ is an element of $L^\infty(X,\mu)$ for which $\|h\|_\mu<\eta$ the set $\{x\in X : |h(x)|\le\varepsilon\}$ has measure at least $1-\delta$. Now let $n\in\Lambda$ and suppose that we are given $g_s\in L^\infty(X,\mu)$ for $s\in F_n$ such that $\|g_s-\alpha_s(f)\|_\mu<\eta$ for every $s\in F_n$. For each $s\in F_n$ set $D_s=\{\sigma\in\Omega : |g_s(\sigma)-\alpha_s(f)(\sigma)|\le\varepsilon\}$, which has measure at least $1-\delta$ by our choice of $\eta$, and for $s\in G\setminus F_n$ set $D_s=\Omega$. Put $d=\underline{I}_\mu(\boldsymbol{A},\delta)/2$. Assuming that $n>n_0$ for a suitable $n_0\in\Lambda$, there exists an independence set $I\subseteq F_n$ for $\boldsymbol{A}$ relative to the map $s\mapsto D_s$ such that $|I|\ge d|F_n|$. The standard Rosenthal–Dor argument [9] then shows that the linear map $\ell_1^I\to\operatorname{span}\{g_s : s\in I\}$ sending the standard basis element with index $s\in I$ to $g_s$ has an inverse with norm at most $\varepsilon^{-1}$, yielding (3).

(3) ⇒ (4). Trivial.

(4) ⇒ (7). Apply Lemma 3.5.

(3) ⇒ (5). We may assume that $\|f\|=1$. Let $d$, $\delta$, and $\lambda$ be as given by (3). Then for any $p\in P(\mu,\delta^2)$ and $s\in G$ we have $\|p\alpha_s(f)-\alpha_s(f)\|_\mu\le\|p-1\|_\mu\|f\|\le\delta$. It follows that $\varphi_{f,\lambda,\delta^2}(F_n)\ge d|F_n|$ for every $n\in\mathbb{N}$, and hence $\underline{I}_\mu(f)\ge\underline{I}_\mu(f,\lambda,\delta^2)\ge d>0$.

(5) ⇒ (6). Trivial.

(6) ⇒ (4). We may assume that $G$ is infinite and $\|f\|=1$. By (6) there are a $\lambda\ge 1$ and a $\delta>0$ such that $\overline{I}_\mu(f,\lambda,\delta)>0$. Then there is a $d>0$ and a cofinal set $L\subseteq\Lambda$ such that $\varphi_{f,\lambda,\delta}(F_n)\ge d|F_n|$ for all $n\in L$. Let $b$ be a positive number to be further specified below, and set $\delta'=\delta b$. Let $c>0$ and $\varepsilon>0$ be as given by Lemma 3.4 with respect to $\delta=\lambda^{-1}$. Take an $\eta>0$ such that whenever $h$ is an element of $L^\infty(X,\mu)$ for which $\|h\|_\mu<\eta$ the set $\{x\in X : |h(x)|\le\varepsilon/12\}$ has measure at least $1-\delta'$. Now let $n\in L$, and suppose we are given $g_s\in L^\infty(X,\mu)$ for $s\in F_n$ such that $\|g_s-\alpha_s(f)\|_\mu<\eta$ for every $s\in F_n$. By our choice of $\eta$, for every $s\in F_n$ there is a projection $p_s\in P(\mu,\delta')$ such that $\|p_s(g_s-\alpha_s(f))\|\le\varepsilon/12$. Denote by $S$ the set of all $\sigma\in\{1,2\}^{F_n}$



such that $|\sigma^{-1}(2)|\le b|F_n|$. Setting $p_{s,1}=p_s$ and $p_{s,2}=p_s^\perp$ we define the projection $r=\bigvee_{\sigma\in S}\prod_{s\in F_n}p_{s,\sigma(s)}$. Then

\[ b|F_n|\,\mu(r^\perp) \le \sum_{s\in F_n}\mu(p_s^\perp) \le |F_n|\,\delta' \]

and so $\mu(r^\perp)\le b^{-1}\delta'=\delta$. Hence there is a $K\subseteq F_n$ with $|K|\ge d|F_n|$ such that $K$ is an $\ell_1$-$\lambda$-isomorphism set for $f$ relative to $r$. By our choice of $c$ and $\varepsilon$, assuming that $|F_n|$ is sufficiently large we can find closed disks $B_1,B_2\subseteq\mathbb{C}$ of diameter at most $\varepsilon/6$ with $\operatorname{dist}(B_1,B_2)\ge\varepsilon$ and a $J\subseteq K$ with $|J|\ge c|K|$ such that the collection

\[ \big\{ \big( (\alpha_s(f)|_{\Omega_r})^{-1}(B_1),\ (\alpha_s(f)|_{\Omega_r})^{-1}(B_2) \big) : s\in J \big\} \]

of pairs of subsets of $\Omega_r$ is independent. Define the subsets $C_{s,1}=(g_s|_{\Omega_r})^{-1}(B_1')$ and $C_{s,2}=(g_s|_{\Omega_r})^{-1}(B_2')$ of $\Omega_r$, where $B_1'$ (resp. $B_2'$) is the closed disk with the same centre as $B_1$ (resp. $B_2$) but with radius bigger by $\varepsilon/12$. Since $\max_{s\in J}\|p_s(g_s-\alpha_s(f))\|\le\varepsilon/12$, for each $\sigma\in\{1,2\}^J$ we can find by the definition of $r$ a set $J_\sigma\subseteq J$ with $|J\setminus J_\sigma|\le b|F_n|$ such that $\bigcap_{s\in J_\sigma}(\Omega_{p_s}\cap C_{s,\sigma(s)})\ne\emptyset$, and we define $\rho_\sigma\in\{0,1,2\}^J$ by

\[ \rho_\sigma(s) = \begin{cases} \sigma(s) & \text{if } s\in J_\sigma, \\ 0 & \text{otherwise.} \end{cases} \]

Since $\max_{\sigma\in\{1,2\}^J}|\rho_\sigma^{-1}(0)|\le b|F_n|$, for every $\rho\in\{0,1,2\}^J$ the number of $\sigma\in\{1,2\}^J$ for which $\rho_\sigma=\rho$ is at most $2^{b|F_n|}$, and so the set $R=\{\rho_\sigma : \sigma\in\{1,2\}^J\}$ has cardinality at least $2^{|J|}/2^{b|F_n|}\ge 2^{(cd-b)|F_n|}$. It follows by Lemma 2.2 that for a small enough $b$ not depending on $n$ there exists a $t>0$ for which we can find an $I\subseteq J$ with $|I|\ge t|J|\ge tcd|F_n|$ such that $R|_I\supseteq\{1,2\}^I$. Then the collection $\{(C_{s,1},C_{s,2}) : s\in I\}$ is independent, and since $\operatorname{dist}(B_1',B_2')\ge 5\varepsilon/6>2\max(\operatorname{diam}(B_1'),\operatorname{diam}(B_2'))$ the standard Rosenthal–Dor argument [9] shows that the linear map $\ell_1^I\to\operatorname{span}\{g_s : s\in I\}$ sending the standard basis element with index $s\in I$ to $g_s$ has an inverse with norm at most $10\varepsilon^{-1}$. We thus obtain (4).

(7) ⇒ (8). It suffices to note that if $N$ is a $G$-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ then for every finite subset $\Theta\subseteq N$ we have $\mathrm{hcpa}_{\mu|_N}(N,\Theta)=\mathrm{hcpa}_\mu(L^\infty(X,\mu),\Theta)$, i.e., for computing c.p. approximation entropy it does not matter whether $\Theta$ is considered as a subset of $N$ or $L^\infty(X,\mu)$. This follows from the fact that there is a $\mu$-preserving conditional expectation from $L^\infty(X,\mu)$ onto $N$ [45, Proposition V.2.36]. See the proof of Proposition 3.5 in [46].

(8) ⇒ (1). Suppose that $f\in P_X$. Let $\Upsilon$ be a finite subset of the von Neumann subalgebra of $L^\infty(X,\mu)$ generated by $f$ and let $\delta>0$. Take a finite Borel partition $\mathcal{P}$ of $X$ such that the characteristic functions of the atoms of $\mathcal{P}$ are spectral projections of $f$ and $\sup_{g\in\Upsilon}\operatorname{ess\,sup}_{x,y\in P}|g(x)-g(y)|<\delta$ for each $P\in\mathcal{P}$. Then $h_\mu(X,\mathcal{P})=0$ by our assumption, and thus, since we may suppose $G$ to be infinite (for otherwise the system has completely positive entropy), we obtain $\mathrm{hcpa}_\mu(\beta,\Upsilon,\sqrt{\delta^2+4\delta})\le 2\delta$ by Lemma 3.6. Hence (8) fails to hold. Thus (8) implies (1).

Assume now that either $X$ is metrizable or $G$ is countable and that the action is ergodic and let us show that (9) is equivalent to the other conditions.



(3) ⇒ (9). Let $d$, $\delta$, and $\lambda$ be as given by (3). Let $g$ be an element of $L^\infty(X,\mu)$ such that $\|g-f\|_\mu<\delta$. Then $\|\alpha_s(g)-\alpha_s(f)\|_\mu<\delta$ for all $s\in G$, and so for every $n\in\mathbb{N}$ there is an $I\subseteq F_n$ of cardinality at least $d|F_n|$ for which $\{\alpha_s(g) : s\in I\}$ is $\|g\|\lambda$-equivalent in the operator norm to the standard basis of $\ell_1^I$. Thus $g$ has positive $\ell_1$-isomorphism density.

(9) ⇒ (8). Suppose that $G$ is countable. We will first treat the case that the action of $G$ on $X$ is free. Suppose contrary to (8) that $\mathrm{hcpa}_\mu(\beta)=0$. Since $\alpha$ is free and ergodic so is $\beta$, and since $G$ is countable the von Neumann subalgebra of $L^\infty(X,\mu)$ dynamically generated by $f$ has separable predual. We can thus apply the Jewett–Krieger theorem for free ergodic measure-preserving actions of countable discrete amenable groups on Lebesgue spaces (see [40], which shows the finite entropy case; the general case was announced in [48] but remains unpublished) to obtain a topological $G$-system $(Y,G)$ with a unique invariant Borel probability measure $\nu$ such that $\beta$ can be realized as the action of $G$ on $L^\infty(Y,\nu)$ arising from the action of $G$ on $Y$. Now let $\delta>0$ be as given by (9). Take a function $g\in C(Y)\subseteq L^\infty(Y,\nu)$ such that $\|g-f\|_\mu<\delta$. Since the system $(Y,G)$ has zero topological entropy by the variational principle [33], it follows by Theorem 5.3 of [29] (which is stated for $\mathbb{Z}$-systems but is readily seen to cover actions of general amenable groups) that the function $g$ has zero $\ell_1$-isomorphism density, contradicting our choice of $\delta$. We thus obtain (9) ⇒ (8) in the case that the action is free.

Suppose now that the action of $G$ on $X$ is not free. Take a free weakly mixing measure-preserving action of $G$ on a Lebesgue space $(Z,\mathcal{Z},\omega)$ (e.g., a Bernoulli shift). Then the product action on $X\times Z$ is free and ergodic. Write $E$ for the conditional expectation of $L^\infty(X\times Z,\mu\times\omega)$ onto $L^\infty(X,\mu)$. With $\delta>0$ as given by (9), for every $g\in L^\infty(X\times Z,\mu\times\omega)$ such that $\|E(g)-f\|_\mu<\delta$ the function $E(g)$ has positive $\ell_1$-isomorphism density, which implies that $g$ has positive $\ell_1$-isomorphism density since $E$ is contractive and $G$-equivariant. Thus the function $f\otimes 1$ in $L^\infty(X\times Z,\mu\times\omega)$ also satisfies (9) for the same $\delta$. By the previous paragraph we obtain (8) for $f\otimes 1$. But this is equivalent to (8) for $f$ itself, yielding (9) ⇒ (8) when $G$ is countable.

Suppose that $G$ is uncountable and $X$ is metrizable. In this case we will actually show (9) ⇒ (7). For every $s\in G$ write $E_s$ for the orthogonal complement in $L^2(X,\mu)$ of the subspace of vectors fixed by $s$. Then the span of $\bigcup_{s\in G}E_s$ is dense in $L^2(X,\mu)\ominus\mathbb{C}1$ by ergodicity, and since $L^2(X,\mu)$ is separable there is a countable set $J\subseteq G$ such that the span of $\bigcup_{s\in J}E_s$ is dense in $L^2(X,\mu)\ominus\mathbb{C}1$. It follows that the subgroup $H$ generated by $J$ does not fix any vectors in $L^2(X,\mu)\ominus\mathbb{C}1$. This means that the action of $H$ on $X$ is ergodic, as is the action of any subgroup of $G$ containing $H$. By Lemma 2.26 condition (9) holds for the action of every subgroup of $G$ containing $H$, and thus for the action of a countable such subgroup we get (9) ⇒ (8) by the two previous paragraphs and hence (9) ⇒ (7). But if (7) fails for the action of $G$ then it fails for the action of every subgroup of $G$ containing some fixed countable subgroup $W$ of $G$ and in particular for the action of the countable subgroup generated by $H$ and $W$, yielding a contradiction.

Finally, we suppose that $f\in C(X)$ and demonstrate the equivalence of (10) and (11) with the other conditions.

(2) ⇒ (11). The inclusion $C(\operatorname{supp}(\mu))\subseteq L^\infty(X,\mu)$ gives rise at the spectral level to a topological $G$-factor map $\Omega\to\operatorname{supp}(\mu)$, and so the implication follows from Proposition 2.16(5).

(11) ⇒ (10). Use Proposition 2.16(5).

(10) ⇒ (11). Suppose that $f(x_1)=f(x_2)$ for every $(x_1,x_2)\in\mathrm{IE}_2^\mu(X)$. Set $E=\{(x,y)\in X\times X : f(x)=f(y)\}$. Then $E$ is a closed equivalence relation on $X$. Thus $\bigcap_{s\in G}sE$ is a $G$-invariant closed equivalence relation on $X$ and hence gives rise to a topological $G$-factor $Y$ of $X$. In particular, $f\in C(Y)$. Denote the factor map $X\to Y$ by $\pi$. Our assumption says that



$\mathrm{IE}_2^\mu(X)\subseteq E$. Since $\mathrm{IE}_2^\mu(X)$ is $G$-invariant, $\mathrm{IE}_2^\mu(X)\subseteq\bigcap_{s\in G}sE$. This means that $(\pi\times\pi)(\mathrm{IE}_2^\mu(X))\subseteq\Delta_Y$, the diagonal of $Y\times Y$. By (2) and (5) of Proposition 2.16, $h_{\pi_*(\mu)}(Y)=0$.

(11) ⇒ (3). Apply the same argument as for (2) ⇒ (3). □

Theorem 3.7 shows that for general $X$ the Pinsker von Neumann algebra is the largest $G$-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ on which the c.p. approximation entropy is zero.

Remark 3.8. One interesting consequence of Theorem 3.7 is the following. In the case that $G$ is countable, if a weakly mixing measure-preserving action of $G$ on a Lebesgue space $(Y,\mathcal{Y},\nu)$ does not have completely positive entropy, then it has a metrizable topological model $(Z,G)$ for which the set $\mathrm{IE}_k(Z)$ of topological IE-tuples has zero $\nu^k$-measure for each $k\ge 2$. Indeed weak mixing implies that the product action of $G$ on $Y^k$ is ergodic with respect to $\nu^k$, so that for a topological model $(Z,G)$ and $k\ge 2$ the set $\mathrm{IE}_k(Z)$ has $\nu^k$-measure either zero or one. If for every metrizable topological model $(Z,G)$ we had $\nu^k(\mathrm{IE}_k(Z))=1$ for some $k\ge 2$, it would follow that every element of $L^\infty(Y,\nu)$ has positive $\ell_1$-isomorphism density, since such an element is a continuous function for some metrizable topological model by the countability of $G$ and hence separates a topological IE-pair. But then $(Y,\mathcal{Y},\nu,G)$ would have completely positive entropy by Theorem 3.7. Actually the weak mixing assumption can be weakened to the requirement that there be no sets of measure strictly between zero and one with finite $G$-orbit.

We also point out that, in a related vein, if the topological system $(X,G)$ does not have completely positive entropy, then for a $G$-invariant Borel probability measure on $X$ the set $\mathrm{IE}_k(X)$ has zero product measure for each $k\ge 2$, unless some nontrivial quotient of $(X,G)$ has points with positive induced measure. The reason is that if $\mathrm{IE}_k(X)$ for some $k\ge 2$ has positive product measure then so does $\mathrm{IE}_k(Y)$ with respect to the induced measure for every quotient $(Y,G)$ of $(X,G)$, and if every point in such a quotient $(Y,G)$ has zero induced measure then the diagonal in $Y^k$ has zero product measure and hence does not contain $\mathrm{IE}_k(Y)$, implying that $(Y,G)$ has positive topological entropy. In particular, we see that if $(X,G)$ is minimal and does not have completely positive entropy and $X$ is connected (and hence has no nontrivial finite quotients) then for every $G$-invariant Borel probability measure on $X$ the set $\mathrm{IE}_k(X)$ has zero product measure for each $k\ge 2$.

At the extreme end of completely positive entropy where the Pinsker von Neumann algebra reduces to the scalars, the picture topologizes and we have the following result. Recall that a topological system is said to have completely positive entropy if every nontrivial factor has positive topological entropy, uniformly positive entropy if every nondiagonal element of $X\times X$ is an entropy pair, and uniformly positive entropy of all orders if for each $k\ge 2$ every nondiagonal element of $X^k$ is an entropy tuple (see [12, Chapter 19] and [22]).

Theorem 3.9. Suppose that $X$ is metrizable or $G$ is countable. Let $\boldsymbol{\Omega}=(\Omega,G)$ be the topological dynamical system associated to $\boldsymbol{X}=(X,\mathcal{B},\mu,G)$ as above. Then the following are equivalent:
(1) $\boldsymbol{X}$ has completely positive entropy,
(2) every nonscalar element of $L^\infty(X,\mu)$ has positive $\ell_1$-isomorphism density,
(3) $\boldsymbol{\Omega}$ has completely positive entropy,
(4) $\boldsymbol{\Omega}$ has uniformly positive entropy,
(5) $\boldsymbol{\Omega}$ has uniformly positive entropy of all orders.



Proof. (1) ⇒ (5). Every Borel partition of $\Omega$ is $\mu$-equivalent to a clopen partition and thus every nontrivial such partition has positive entropy by (1). It follows that, for each $k\ge 2$, every nondiagonal tuple in $\Omega^k$ is a $\mu$-entropy tuple and hence a $\mu$-IE-tuple by Theorem 2.27. Since $\mu$-IE-tuples are obviously IE-tuples and the latter are easily seen to be entropy tuples when they are nondiagonal, we obtain (5).

(5) ⇒ (4) ⇒ (3). These implications hold for any topological $G$-system, the first being trivial and the second being a consequence of the properties of entropy for open covers with respect to taking extensions.

(3) ⇒ (2). Apply Corollary 5.5 of [29] as extended to actions of discrete amenable groups.

(2) ⇒ (1). By (2) there do not exist any nonscalar $G$-invariant projections in $L^\infty(X,\mu)$, i.e., the system $\boldsymbol{X}$ is ergodic. We can thus apply (9) ⇒ (1) of Theorem 3.7. □

For $G=\mathbb{Z}$ the equivalence of (1), (3), (4), and (5) in Theorem 3.9 can also be obtained from Section 3 of [16]. One might wonder whether a similar type of topologization occurs at the other extreme of zero entropy. Glasner and Weiss showed however in [15] that every free ergodic $\mathbb{Z}$-system has a minimal topological model with uniformly positive entropy.

Using Theorem 3.7 and viewing joinings as equivariant unital positive maps, we can give a linear-geometric proof of the disjointness of zero entropy systems from completely positive entropy systems, which for measure-preserving actions of discrete amenable groups on Lebesgue spaces was established in [14] (see also Chapter 6 of [12]). Recall that a joining between two measure-preserving $G$-systems $(Y,\mathcal{Y},\nu,G)$ and $(Z,\mathcal{Z},\omega,G)$ is a $G$-invariant probability measure on $(Y\times Z,\mathcal{Y}\otimes\mathcal{Z})$ with $\nu$ and $\omega$ as marginals. The two systems are said to be disjoint if $\nu\times\omega$ is the only joining between them.

Proposition 3.10. Let $(Y,\mathcal{Y},\nu,G)$ and $(Z,\mathcal{Z},\omega,G)$ be measure-preserving $G$-systems. Let $\varphi : L^\infty(Y,\nu)\to L^\infty(Z,\omega)$ be a $G$-equivariant unital positive linear map such that $\omega\circ\varphi=\nu$. Then $\varphi(P_Y)\subseteq P_Z$.

Proof. Since $\varphi$ is unital and positive it is operator norm contractive and for every $f\in L^\infty(Y,\nu)$ we have

\[ \|\varphi(f)\|_\omega = \omega\big(\varphi(f)^*\varphi(f)\big)^{1/2} \le \omega\big(\varphi(f^*f)\big)^{1/2} = \nu(f^*f)^{1/2} = \|f\|_\nu, \]

that is, $\varphi$ is also contractive for the norms $\|\cdot\|_\nu$ and $\|\cdot\|_\omega$. Thus if condition (3) in Theorem 3.7 holds for a given $f\in L^\infty(Z,\omega)$ with witnessing constants $d$, $\delta$, and $\lambda$ then it also holds for every element of $\varphi^{-1}(\{f\})$ with the same witnessing constants. The equivalence (1) ⇔ (3) in Theorem 3.7 now yields the proposition. □

A joining $\eta$ between two measure-preserving systems $\boldsymbol{Y}=(Y,\mathcal{Y},\nu,G)$ and $\boldsymbol{Z}=(Z,\mathcal{Z},\omega,G)$ gives rise as follows to a $G$-equivariant unital positive linear map $\varphi : L^\infty(Y,\nu)\to L^\infty(Z,\omega)$ such that $\omega\circ\varphi=\nu$ (this is a special case of a construction for correspondences between von Neumann algebras [38]). Define the operator $S : L^2(Z,\omega)\to L^2(Y\times Z,\eta)$ by $(S\xi)(y,z)=\xi(z)$ for all $\xi\in L^2(Z,\omega)$ and $(y,z)\in Y\times Z$ and the representation $\pi : L^\infty(Y,\nu)\to\mathcal{B}(L^2(Y\times Z,\eta))$ by $(\pi(f)\zeta)(y,z)=f(y)\zeta(y,z)$ for all $f\in L^\infty(Y,\nu)$, $\zeta\in L^2(Y\times Z,\eta)$, and $(y,z)\in Y\times Z$. Then for $f\in L^\infty(Y,\nu)$ we set $\varphi(f)=S^*\pi(f)S$. It is easily checked that $S^*\pi(f)S$ commutes with



every element of the commutant $L^\infty(Z,\omega)'$, so that $\varphi(f)\in L^\infty(Z,\omega)''=L^\infty(Z,\omega)$. Now define the representation $\rho : L^\infty(Z,\omega)\to\mathcal{B}(L^2(Y\times Z,\eta))$ by $(\rho(g)\zeta)(y,z)=g(z)\zeta(y,z)$ for all $g\in L^\infty(Z,\omega)$, $\zeta\in L^2(Y\times Z,\eta)$, and $(y,z)\in Y\times Z$. Then for $f\in L^\infty(Y,\nu)$ and $g\in L^\infty(Z,\omega)$ we have, with $1$ denoting the unit in the appropriate $L^\infty$ algebra,

\[ \eta\big(\pi(f)\rho(g)\big) = \langle \pi(f)\rho(g)\,1\otimes 1,\ 1\otimes 1\rangle_\eta = \langle \pi(f)\rho(g)S1,\ S1\rangle_\eta = \langle \pi(f)Sg1,\ S1\rangle_\eta = \langle S^*\pi(f)Sg1,\ 1\rangle_\omega = \omega\big(\varphi(f)g\big). \]

In the case that the image of $\varphi$ is the scalars, we see that $\eta$ gives rise to the product state $\varphi\otimes\omega$ on $L^\infty(Y,\nu)\otimes L^\infty(Z,\omega)$ under composition with the representation $f\otimes g\mapsto\pi(f)\rho(g)$, and furthermore $\varphi=\nu$ by the assumption on the marginals in the definition of joining.

Corollary 3.11. Let $\boldsymbol{Y}=(Y,\mathcal{Y},\nu,G)$ and $\boldsymbol{Z}=(Z,\mathcal{Z},\omega,G)$ be measure-preserving $G$-systems. Suppose that $\boldsymbol{Y}$ has zero entropy and $\boldsymbol{Z}$ has completely positive entropy. Then $\boldsymbol{Y}$ and $\boldsymbol{Z}$ are disjoint.

Proof. As above, a joining $\eta$ between $\boldsymbol{Y}$ and $\boldsymbol{Z}$ gives rise to a $G$-equivariant unital positive linear map $\varphi : L^\infty(Y,\nu)\to L^\infty(Z,\omega)$ such that $\omega\circ\varphi=\nu$. By Proposition 3.10 the image of such a map $\varphi$ must be the scalars. Hence there is only the one joining $\nu\times\omega$. □

4. Measure IN-tuples

In this section $(X,G)$ is an arbitrary topological dynamical system and $\mu$ a $G$-invariant Borel probability measure on $X$. We will define $\mu$-IN-tuples and establish some properties in analogy with $\mu$-IE-tuples. Here the role of measure entropy is played by measure sequence entropy. The combinatorial phenomena responsible for the properties of $\mu$-IE-tuples in Proposition 2.16 apply equally well to the sequence entropy framework, and so it will essentially be a matter of recording the analogues of various lemmas from Section 2. We will also show that nondiagonal $\mu$-IN-tuples are the same as $\mu$-sequence entropy tuples and derive the measure IN-tuple product formula.

For $\delta>0$ we say that a finite tuple $\boldsymbol{A}$ of subsets of $X$ has $\delta$-$\mu$-independence density over arbitrarily large finite sets if there exists a $c>0$ such that for every $M>0$ there is a finite set $F\subseteq G$ of cardinality at least $M$ which possesses the property that for every $D\in\mathcal{B}(X,\delta)$ there is an independence set $I\subseteq F$ relative to $D$ with $|I|\ge c|F|$. We say that $\boldsymbol{A}$ has positive sequential $\mu$-independence density if for some $\delta>0$ it has $\delta$-$\mu$-independence density over arbitrarily large finite sets.

Arguing as in the proof of Lemma 2.6 yields:

Lemma 4.1. Let $\boldsymbol{A}=(A_1,\dots,A_k)$ be a tuple of subsets of $X$ which has positive sequential $\mu$-independence density. Suppose that $A_1=A_{1,1}\cup A_{1,2}$. Then at least one of the tuples $\boldsymbol{A}_1=(A_{1,1},A_2,\dots,A_k)$ and $\boldsymbol{A}_2=(A_{1,2},A_2,\dots,A_k)$ has positive sequential $\mu$-independence density.

In [30] we defined a tuple $x=(x_1,\dots,x_k)\in X^k$ to be an IN-tuple (or an IN-pair in the case $k=2$) if for every product neighbourhood $U_1\times\cdots\times U_k$ of $x$ the $G$-orbit of the tuple



$(U_1,\dots,U_k)$ has arbitrarily large finite independent subcollections. Here is the measure-theoretic analogue:

Definition 4.2. We call a tuple $x=(x_1,\dots,x_k)\in X^k$ a $\mu$-IN-tuple (or $\mu$-IN-pair in the case $k=2$) if for every product neighbourhood $U_1\times\cdots\times U_k$ of $x$ the tuple $(U_1,\dots,U_k)$ has positive sequential $\mu$-independence density. We denote the set of $\mu$-IN-tuples of length $k$ by $\mathrm{IN}_k^\mu(X)$.

Obviously every $\mu$-IN-tuple is an IN-tuple. The following analogue of Lemma 2.8 follows immediately from Lemma 2.7.

Lemma 4.3. Let $\mathcal{P}=\{P_1,P_2\}$ be a two-element Borel partition of $X$ such that $h_\mu(\mathcal{P};s)>0$ for some sequence $s$ in $G$. Then there exists $\varepsilon>0$ such that whenever $A_1\subseteq P_1$ and $A_2\subseteq P_2$ are Borel sets with $\mu(P_1\setminus A_1),\mu(P_2\setminus A_2)<\varepsilon$ the pair $\boldsymbol{A}=(A_1,A_2)$ has positive sequential $\mu$-independence density.

Fix a sequence $s=\{s_j\}_{j\in\mathbb{N}}$ in $G$. Recalling the notation $\varphi_{\boldsymbol{A},\delta}$ and $\varphi'_{\boldsymbol{A},\delta}$ from Section 2.1, for $\delta>0$ we set

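\[ I_\mu(\boldsymbol{A},\delta;s) = \limsup_{n\to\infty}\frac{1}{n}\,\varphi_{\boldsymbol{A},\delta}\big(\{s_1,\dots,s_n\}\big), \]

\[ I'_\mu(\boldsymbol{A},\delta;s) = \limsup_{n\to\infty}\frac{1}{n}\,\varphi'_{\boldsymbol{A},\delta}\big(\{s_1,\dots,s_n\}\big), \]

\[ I_\mu(\boldsymbol{A};s) = \sup_{\delta>0} I_\mu(\boldsymbol{A},\delta;s). \]

By Lemma 2.3, we have

\[ a(k)\,I_\mu(\boldsymbol{A};s) \le \sup_{\delta>0} I'_\mu(\boldsymbol{A},\delta;s) \le I_\mu(\boldsymbol{A};s) \]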

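where $a(k)$ is as defined in Section 2.1. Clearly $\boldsymbol{A}$ has positive sequential $\mu$-independence density if and only if $I_\mu(\boldsymbol{A};s)>0$ for some sequence $s$ in $G$.

Let $\mathcal{U}$ be a finite Borel cover of $X$. Recall that $H(\mathcal{U})$ denotes the infimum of the entropies $H(\mathcal{P})$ over all finite Borel partitions $\mathcal{P}$ of $X$ that refine $\mathcal{U}$. For $\delta>0$ we set

\[ h_{c,\mu}(\mathcal{U},\delta;s) = \limsup_{n\to\infty}\frac{1}{n}\ln N_\delta\Big(\bigvee_{j=1}^n s_j^{-1}\mathcal{U}\Big), \]

\[ h_{c,\mu}(\mathcal{U};s) = \sup_{\delta>0} h_{c,\mu}(\mathcal{U},\delta;s), \]

\[ h^-_\mu(\mathcal{U};s) = \limsup_{n\to\infty}\frac{1}{n} H\Big(\bigvee_{j=1}^n s_j^{-1}\mathcal{U}\Big), \]

\[ h^+_\mu(\mathcal{U};s) = \inf_{\mathcal{P}\succeq\mathcal{U}} h_\mu(\mathcal{P};s), \]

where the last infimum is taken over finite Borel partitions refining $\mathcal{U}$. Both $h^-_\mu(\mathcal{U};s)$ and $h^+_\mu(\mathcal{U};s)$ appeared in [21] for the case of $G=\mathbb{Z}$. We have $h^-_\mu(\mathcal{U};s)\le h^+_\mu(\mathcal{U};s)$ trivially.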
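To make the normalised join entropy concrete, here is a minimal sketch, not taken from the source, for a partition $\mathcal{P}$ in place of a cover. It assumes a toy system: the rotation $x\mapsto x+s$ on $\mathbb{Z}/8$ with uniform measure and the two-element partition into halves. All names (`join_entropy`, `rotation`, `half`) are hypothetical. An atom of $\bigvee_j s_j^{-1}\mathcal{P}$ is identified by the tuple of labels of the points $s_j\cdot x$.

```python
import math
from collections import Counter


def join_entropy(points, act, label, seq):
    """Shannon entropy (natural log) of the join of the partitions s^{-1}P, s in seq,
    for the uniform measure on `points`. `act(s, x)` is the action, `label(x)` names
    the atom of P containing x."""
    names = Counter(tuple(label(act(s, x)) for s in seq) for x in points)
    total = len(points)
    return -sum((c / total) * math.log(c / total) for c in names.values())


def rotation(s, x, m=8):
    """Assumed toy action of Z on Z/m by translation."""
    return (x + s) % m


def half(x, m=8):
    """Assumed two-element partition of Z/m into lower and upper halves."""
    return 0 if x < m // 2 else 1


if __name__ == "__main__":
    seq = [0, 1, 2, 3]
    H = join_entropy(range(8), rotation, half, seq)
    print(H / len(seq))  # the n-th term (n = 4) of the sequence whose limsup is h_mu(P; s)
```

In this finite example the join entropy never exceeds $\ln 8$, so the normalised values tend to $0$ along any sequence; the sketch only illustrates how the finite-$n$ term is assembled.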



The next lemma is the analogue of Lemma 2.12 and follows directly from Lemma 2.11.

Lemma 4.4. Let $\pi : X\to Y$ be a factor of $X$. For any finite Borel cover $\mathcal{U}$ of $Y$, one has

\[ h^-_\mu(\pi^{-1}\mathcal{U};s) = h^-_{\pi_*(\mu)}(\mathcal{U};s). \]

The argument in the proof of Lemma 2.13 can also be used to show:

Lemma 4.5. We have $\delta\cdot h_{c,\mu}(\mathcal{U},\delta;s)\le h^-_\mu(\mathcal{U};s)\le h_{c,\mu}(\mathcal{U};s)$.

Next we come to the analogue of Lemma 2.15.

Lemma 4.6. For a finite Borel cover $\mathcal{U}$ of $X$, the quantities $h^-_\mu(\mathcal{U};s)$ and $h_{c,\mu}(\mathcal{U};s)$ are either both zero or both nonzero. If the complements in $X$ of the members of $\mathcal{U}$ are pairwise disjoint and $\boldsymbol{A}$ is a tuple consisting of these complements, then we may also add the third quantity $I_\mu(\boldsymbol{A};s)$ to the list.

Proof. The first assertion is a consequence of Lemma 4.5. For a tuple $\boldsymbol{A}$ as in the lemma statement, Lemma 3.3 of [30] and Lemma 2.14 show that $h_{c,\mu}(\mathcal{U};s)>0$ if and only if $I_\mu(\boldsymbol{A};s)>0$. □

Proposition 4.7. The following hold:
(1) Let $\boldsymbol{A}=(A_1,\dots,A_k)$ be a tuple of closed subsets of $X$ which has positive sequential $\mu$-independence density. Then there exists a $\mu$-IN-tuple $(x_1,\dots,x_k)$ with $x_j\in A_j$ for $j=1,\dots,k$.
(2) $\mathrm{IN}_2^\mu(X)\setminus\Delta_2(X)$ is nonempty if and only if the system $(X,\mathcal{B},\mu,G)$ is nonnull.
(3) $\mathrm{IN}_1^\mu(X)=\operatorname{supp}(\mu)$ when $G$ is an infinite group.
(4) $\mathrm{IN}_k^\mu(X)$ is a closed $G$-invariant subset of $X^k$.
(5) Let $\pi : X\to Y$ be a topological $G$-factor map. Then $\pi^k(\mathrm{IN}_k^\mu(X))=\mathrm{IN}_k^{\pi_*(\mu)}(Y)$.

Proof. (1) Apply Lemma 4.1 and a compactness argument.
(2) As is well known and easy to show, $(X,\mu)$ is nonnull if and only if there is a two-element Borel partition of $X$ with positive sequence entropy with respect to some sequence in $G$. We thus obtain the "if" part by (1) and Lemma 4.3. For the "only if" part apply Lemma 4.6.
(3) This follows from Lemma 2.9.
(4) Trivial.
(5) This follows from (1), (3), (4) and Lemmas 4.4 and 4.6. □

The concept of measure sequence entropy tuple originates in [21], which deals with the case $G=\mathbb{Z}$. The definition works equally well for general $G$. Thus for $k\ge 2$ we say that a nondiagonal tuple $(x_1,\dots,x_k)\in X^k$ is a sequence entropy tuple for $\mu$ if whenever $U_1,\dots,U_l$ are pairwise disjoint Borel neighbourhoods of the distinct points in the list $x_1,\dots,x_k$, every Borel partition of $X$ refining the cover $\{U_1^c,\dots,U_l^c\}$ has positive measure sequence entropy with respect to some sequence in $G$. To show that nondiagonal $\mu$-IN-tuples are the same as $\mu$-sequence entropy tuples, it suffices by Lemma 4.6 to prove that if $\mathcal{U}$ is a cover of $X$ consisting of the complements


1377

of neighbourhoods of the points in a μ-sequence entropy tuple then h− μ (U; s) > 0 for some sequence s in G. For G = Z this was done by Huang, Maass, and Ye in Theorem 3.5 of [21]. Their methods readily extend to the general case, as we will now indicate. Given a unitary representation π : G → B(H), the Hilbert space H orthogonally decomposes into two G-invariant closed subspaces Hwm and Hcpct such that π is weakly mixing on Hwm and the G-orbit of every vector in Hcpct has compact closure [18]. For our μ-preserving action of G on X, considering its associated unitary representation of G on L2 (X, μ) there exists by Theorem 7.1 of [49] a G-invariant von Neumann subalgebra DX ⊆ L∞ (X, μ) such that L2 (X, μ)cpct = L2 (DX , μ|DX ). The following lemma generalizes part of Theorem 2.3 of [21] with essentially the same proof. In [21] X is assumed to be metrizable, but that is not necessary here. Lemma 4.8. Let P be a finite Borel partition of X. Then there is a sequence s in G such that hμ (P; s) H (P | DX ). Proof. First we show that, given a finite Borel partition Q of X and an ε > 0, the set of all s ∈ G such that H (s −1 P | Q) H (P | DX ) − ε is thickly syndetic. Write P = {P1 , . . . , Pk } and Q = {Q1 , . . . , Ql } and denote by E the μ-preserving conditional expectation onto DX . Since 1A − E(1A ) ∈ L2 (X, μ)wm for every Borel set A ⊆ X and thick syndeticity is preserved under taking finite intersections, for each η > 0 the set of all s ∈ G such that sup1ik, 1j l |Us (1Pi − E(1Pi )), 1Qj | < η is thickly syndetic. It follows that for all s in some thickly syndetic set we have, using the concavity of the function x → −x ln x, k l Us E(1Pi ), 1Qj −1

− Us E(1Pi ), 1Qj ln H s P Q +ε μ(Qj ) i=1 j =1

k

−Us E(1Pi ) ln Us E(1Pi ) dμ

i=1 X

= H (P | DX ), as desired. We can now recursively construct a sequence s = {s1 = e, s2 , s3 , . . .} in G such that H (sn−1 P | n−1 −1 si P) H (P | DX ) − 2−n for each n > 1. Using the identity H ( ni=1 si−1 P) = i=1 n−1 −1 −1 −1 H ( n−1 i=1 si P) + H (sn P | i=1 si P) we then get ! n k−1 1 −1 −1 hμ (P; s) = lim sup H sk P si P H (P | DX ). n→∞ n k=1

2

i=1
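The last step of the proof of Lemma 4.8 rests on a telescoping estimate. The following LaTeX fragment is our own spelling-out of that step, under the assumption that the recursive bound on the conditional entropies holds with error $2^{-k}$ at stage k > 1.

```latex
% Sketch: chain rule for the join, then the recursive lower bounds for k >= 2.
% The k = 1 term is H(s_1^{-1}\mathcal{P}) = H(\mathcal{P}) \ge 0 and is simply dropped.
\[
  H\Bigl(\bigvee_{i=1}^{n} s_i^{-1}\mathcal{P}\Bigr)
  = \sum_{k=1}^{n} H\Bigl(s_k^{-1}\mathcal{P} \Bigm| \bigvee_{i=1}^{k-1} s_i^{-1}\mathcal{P}\Bigr)
  \ge \sum_{k=2}^{n} \bigl( H(\mathcal{P} \mid D_X) - 2^{-k} \bigr)
  \ge (n-1)\, H(\mathcal{P} \mid D_X) - \tfrac{1}{2}.
\]
% Dividing by n and letting n tend to infinity, the error term 1/(2n) vanishes:
\[
  \limsup_{n\to\infty} \frac{1}{n} H\Bigl(\bigvee_{i=1}^{n} s_i^{-1}\mathcal{P}\Bigr)
  \ge H(\mathcal{P} \mid D_X).
\]
```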

Using Lemma 4.8 we can now argue as in the proof of Theorem 3.5 of [21] to deduce that $h^-_\mu(\mathcal{U};s) > 0$ for some sequence s in G whenever $\mathcal{U}$ is a cover of X whose elements are the complements of neighbourhoods of the points in a μ-sequence entropy tuple (it can be checked that the metrizability hypothesis on X in [21] is not necessary in this case). In [21] the authors use the fact that $D_X$-measurable partitions have zero measure sequence entropy for all sequences, which for G = Z and metrizable X is contained in [31]. In our more general setting we can appeal to Theorem 5.5 from the next section. We thus obtain the desired result:



Theorem 4.9. For every k ≥ 2, a nondiagonal tuple in $X^k$ is a μ-IN-tuple if and only if it is a μ-sequence entropy tuple.

To establish the product formula for μ-IN-tuples we will make use of the maximal null von Neumann algebra $N_X \subseteq L^\infty(X,\mu)$, which corresponds to the largest factor of the system with zero sequence entropy for all sequences (see the beginning of the next section). Denote by $E_X$ the conditional expectation $L^\infty(X,\mu) \to N_X$. The following lemma is the analogue of Lemma 2.28 and appeared as Lemma 3.3 in [21]. Note that the assumptions in [21] that X is metrizable and G = Z are not needed here.

Lemma 4.10. Let $\mathcal{U} = \{U_1,\dots,U_k\}$ be a Borel cover of X. Then $\prod_{i=1}^{k} E_X(\chi_{U_i^c}) \ne 0$ if and only if for every finite Borel partition $\mathcal{P}$ finer than $\mathcal{U}$ as a cover one has $h_\mu(\mathcal{P};s) > 0$ for some sequence s in G.

Combining Lemma 4.10, Proposition 4.7(3), and Theorem 4.9, we obtain the following analogue of Lemma 2.29.

Lemma 4.11. When G is infinite, a tuple $x = (x_1,\dots,x_k) \in X^k$ is a μ-IN-tuple if and only if for any Borel neighbourhoods $U_1,\dots,U_k$ of $x_1,\dots,x_k$, respectively, one has $\prod_{i=1}^{k} E_X(\chi_{U_i}) \ne 0$.

The following is the analogue of Theorem 2.30.

Theorem 4.12. Let (Y, G) be another topological G-system and ν a G-invariant Borel probability measure on Y. Then for all k ≥ 1 we have

\[ \mathrm{IN}^{\mu\times\nu}_k(X \times Y) = \mathrm{IN}^{\mu}_k(X) \times \mathrm{IN}^{\nu}_k(Y). \]

Proof. When G is finite, both sides are empty. So we may assume that G is infinite. By Proposition 4.7(5) we have $\mathrm{IN}^{\mu\times\nu}_k(X \times Y) \subseteq \mathrm{IN}^{\mu}_k(X) \times \mathrm{IN}^{\nu}_k(Y)$. Thus we just need to prove $\mathrm{IN}^{\mu}_k(X) \times \mathrm{IN}^{\nu}_k(Y) \subseteq \mathrm{IN}^{\mu\times\nu}_k(X \times Y)$. Since the tensor product of a weakly mixing unitary representation of G and any other unitary representation of G is weakly mixing, we have $L^2(X \times Y, \mu\times\nu)_{\mathrm{cpct}} = L^2(X,\mu)_{\mathrm{cpct}} \otimes L^2(Y,\nu)_{\mathrm{cpct}}$. It follows that $D_{X\times Y} = D_X \otimes D_Y$. By Theorem 5.5 from the next section we have $N_X = D_X$. Thus $N_{X\times Y} = N_X \otimes N_Y$ and hence $E_{X\times Y}(f \otimes g) = E_X(f) \otimes E_Y(g)$ for any $f \in L^\infty(X,\mu)$ and $g \in L^\infty(Y,\nu)$. Now the desired inclusion follows from Lemma 4.11. □

In the case G = Z, the product formula for measure sequence entropy tuples is implicit in Theorem 4.5 of [21], and we have essentially applied the argument from there granted the fact that for general G the maximal null factor is the same as the maximal isometric factor, as shown by Theorem 5.5.

5. Combinatorial independence and the maximal null factor

We will continue to assume that (X, G) is an arbitrary topological dynamical system and μ is a G-invariant Borel probability measure on X. In analogy with the Pinsker σ-algebra in the context of entropy, the G-invariant σ-subalgebra of B generated by all finite Borel partitions of X with zero sequence entropy for all sequences (or, equivalently, all two-element Borel partitions of X with zero sequence entropy for all sequences) defines the largest factor of the system with zero sequence entropy for all sequences (see [21]). The corresponding G-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ will be denoted by $N_X$ and referred to as the maximal null von Neumann algebra. The system (X, B, μ, G) is said to be null if $N_X = L^\infty(X,\mu)$ (i.e., if it has zero measure sequence entropy for all sequences) and completely nonnull if $N_X = \mathbb{C}$. Kushnirenko showed that an ergodic Z-action on a Lebesgue space is isometric if and only if $N_X = L^\infty(X,\mu)$ [31]. As Theorem 5.5 will demonstrate more generally, $N_X$ always coincides with $D_X$, as defined prior to Lemma 4.8.

Our main goal in this section is to establish Theorem 5.5, which gives various local descriptions of the maximal null factor in analogy with Theorem 3.7. To a large extent the same arguments apply and we will simply refer to the appropriate places in the proof of Theorem 3.7. On the other hand, several conditions appear in Theorem 5.5 which have no analogue in the entropy setting, reflecting the fact that there is a particularly strong dichotomy between nullness and nonnullness. This dichotomy hinges on the orthogonal decomposition of $L^2(X,\mu)$ into the G-invariant closed subspaces $L^2(X,\mu)_{\mathrm{wm}}$ and $L^2(X,\mu)_{\mathrm{cpct}}$ (as described prior to Lemma 4.8) and the relationship between compact orbit closures and finite-dimensional subrepresentations recorded below in Proposition 5.3.

To define the sequence analogue of c.p. approximation entropy, let M be a von Neumann algebra, σ a faithful normal state on M, and β a σ-preserving action of the discrete group G on M by ∗-automorphisms. Let $s = \{s_n\}_n$ be a sequence in G. Recall the quantities $\mathrm{rcp}_\sigma(\cdot,\cdot)$ from the beginning of Section 3. For a finite set Υ ⊆ M and δ > 0 we set

\[ \mathrm{hcpa}^s_\sigma(\beta, \Upsilon, \delta) = \limsup_{n\to\infty} \frac{1}{n} \ln \mathrm{rcp}_\sigma\Bigl(\bigcup_{i=1}^{n} \beta_{s_i}(\Upsilon), \delta\Bigr) \]

and define

\[ \mathrm{hcpa}^s_\sigma(\beta, \Upsilon) = \sup_{\delta>0} \mathrm{hcpa}^s_\sigma(\beta, \Upsilon, \delta), \qquad \mathrm{hcpa}^s_\sigma(\beta) = \sup_{\Upsilon} \mathrm{hcpa}^s_\sigma(\beta, \Upsilon), \]

where the last supremum is taken over all finite subsets Υ of M. We call $\mathrm{hcpa}^s_\sigma(\beta, \Upsilon)$ the sequence c.p. approximation entropy of β.

In analogy with the upper μ-$\ell_1$-isomorphism density from Section 3, given a sequence $s = \{s_n\}_n$ in G, $f \in L^\infty(X,\mu)$, λ ≥ 1, and δ > 0 we set

\[ I_\mu(f,\lambda,\delta;s) = \limsup_{n\to\infty} \frac{1}{n}\,\varphi_{f,\lambda,\delta}\bigl(\{s_1,\dots,s_n\}\bigr) \]

and define

\[ I_\mu(f,\lambda;s) = \sup_{\delta>0} I_\mu(f,\lambda,\delta;s), \qquad I_\mu(f;s) = \sup_{\lambda\ge 1} I_\mu(f,\lambda;s). \]

We could also define the lower version but this is less significant for our applications, in which we would always be able to pass to a subsequence.



To establish (10) ⇒ (5) in Theorem 5.5 we will need the relationship between relatively compact orbits and finite-dimensional invariant subspaces given by Proposition 5.3, which is presumably well known. For this we record a couple of lemmas.

Lemma 5.1. Suppose that G acts on a Banach space V by isometries. Then the action factors through a compact Hausdorff group (for a strongly continuous action on V and a homomorphism from G into this group) if and only if the norm closure of the orbit of each vector is compact.

Proof. The “only if” part is obvious. Suppose that the action is compact. Denote by E the closure of the image of G in the space B(V) of bounded linear operators on V with respect to the strong operator topology. Then E is precisely the closure of $\{(sv)_{v\in V} : s \in G\}$ in $\prod_{v\in V} \overline{Gv}$. Thus E is a compact Hausdorff space. Note that multiplication on the unit ball of B(V) is jointly continuous for the strong operator topology. It follows easily that E is a compact Hausdorff group of isometric operators on V and that the action of E on V is strongly continuous. This yields the “if” part. □

A compactification of G is a pair (Γ, ϕ) where Γ is a compact Hausdorff group and ϕ is a homomorphism from G to Γ with dense image. The Bohr compactification $\overline{G}$ of G is the spectrum of the space of almost periodic bounded functions on G and has the universal property that every compactification of G factors through it (see [1]).

Lemma 5.2. Suppose that G acts on a von Neumann algebra M by ∗-automorphisms. Let σ be a G-invariant faithful normal state on M such that the induced unitary representation of G on $L^2(M,\sigma)$ has the property that the norm closure of the orbit of each vector is compact. Then the action factors through an ultraweakly continuous action of $\overline{G}$ on M.

Proof. Denote the unitary on $L^2(M,\sigma)$ corresponding to s ∈ G by $U_s$. By Lemma 5.1 the unitary representation $s \mapsto U_s$ of G factors through a strongly continuous unitary representation of $\overline{G}$. Denote the unitary on $L^2(M,\sigma)$ corresponding to $t \in \overline{G}$ by $U_t$. Note that the action of s ∈ G on M is conjugation by $U_s$. It follows that the conjugation by $U_t$ for each $t \in \overline{G}$ preserves M. □

For any ultraweakly continuous action of a locally compact group Γ on a von Neumann algebra as automorphisms, there is a Γ-invariant ultraweakly dense unital C∗-subalgebra of the von Neumann algebra on which the action of Γ is strongly continuous [37, Lemma 7.5.1]. For any strongly continuous action of a compact group on a Banach space as isometries, the subspace of elements whose orbit spans a finite-dimensional subspace is dense [5, Theorem III.5.7]. Thus we have:

Proposition 5.3. Under the hypotheses of Lemma 5.2, there are a $\overline{G}$-invariant ultraweakly dense unital C∗-subalgebra A of M on which the action of $\overline{G}$ is strongly continuous and a norm dense ∗-subalgebra B of A such that the orbit of every element in B spans a finite-dimensional subspace.

The following lemma is a local version of Theorem 5.2 of [19] and is a consequence of the proof given there in conjunction with the Rosenthal–Dor $\ell_1$ theorem, which asserts that a bounded sequence in a Banach space has either a weakly Cauchy subsequence or a subsequence equivalent to the standard basis of $\ell_1$ [9,41]. Indeed if f is as in the lemma statement, then given a sequence $\{g_j\}_{j=1}^{\infty}$ in the $L^2$ closure of $\{\alpha_s(f) : s \in G\}$ we take for each j an $s_j \in G$ with $\|\alpha_{s_j}(f) - g_j\|_2 < 1/j$ and use the Rosenthal–Dor theorem to find an $h \in L^\infty(X,\mu) \cong C(\Omega)$ and $1 \le j_1 < j_2 < \cdots$ such that $\lim_{i\to\infty} \alpha_{s_{j_i}}(f)(\sigma) = h(\sigma)$ for all σ ∈ Ω, where Ω is the spectrum of $L^\infty(X,\mu)$ (see the discussion before Theorem 3.7). Replacing the topological model X by Ω produces the same $L^2$ completion, so that $\lim_{i\to\infty} \|\alpha_{s_{j_i}}(f) - h\|_2 = 0$ and thus $\lim_{i\to\infty} \|g_{j_i} - h\|_2 = 0$, yielding the sequential compactness and hence compactness of the closure of $\{\alpha_s(f) : s \in G\}$ in $L^2(X,\mu)$.

Lemma 5.4. Let f be a function in $L^\infty(X,\mu)$ whose G-orbit does not contain an infinite subset equivalent to the standard basis of $\ell_1$. Then $f \in L^2(X,\mu)_{\mathrm{cpct}}$.

The converse of Lemma 5.4 is false. Indeed by [15] every free ergodic Z-system has a minimal topological model with uniformly positive entropy, which means in particular that there are $L^\infty$ functions whose G-orbit has a positive density subset equivalent to the standard basis of $\ell_1$.

In the following theorem (Ω, G) is the topological G-system associated to (X, B, G, μ) described before Theorem 3.7.

Theorem 5.5. Let $f \in L^\infty(X,\mu)$. Then the following are equivalent:
(1) $f \notin N_X$,
(2) there is a μ-IN-pair $(\sigma_1,\sigma_2) \in \Omega \times \Omega$ such that $f(\sigma_1) \ne f(\sigma_2)$,
(3) there are d > 0, δ > 0, and λ > 0 such that for any M > 0 there is some finite subset F ⊆ G with |F| ≥ M such that whenever $g_s$ for s ∈ F are elements of $L^\infty(X,\mu)$ satisfying $\|g_s - \alpha_s(f)\|_\mu < \delta$ for every s ∈ F there exists an I ⊆ F of cardinality at least d|F| for which the linear map $\ell_1^I \to \mathrm{span}\{g_s : s \in I\}$ sending the standard basis element with index s ∈ I to $g_s$ has an inverse with norm at most λ,
(4) $I_\mu(f;s) > 0$ for some sequence s in G,
(5) $f \notin L^2(X,\mu)_{\mathrm{cpct}}$,
(6) $\mathrm{hcpa}^s_\mu(\alpha,\{f\}) > 0$ for some sequence s in G,
(7) $\mathrm{hcpa}^s_\mu(\beta) > 0$ for some sequence s in G where β is the restriction of α to the von Neumann subalgebra of M dynamically generated by f,
(8) there is a δ > 0 such that every $g \in L^\infty(X,\mu)$ satisfying $\|g - f\|_\mu < \delta$ has an infinite $\ell_1$-isomorphism set,
(9) there is a δ > 0 such that every $g \in L^\infty(X,\mu)$ satisfying $\|g - f\|_\mu < \delta$ has arbitrarily large λ-$\ell_1$-isomorphism sets for some λ > 0,
(10) there is a δ > 0 such that every $g \in L^\infty(X,\mu)$ satisfying $\|g - f\|_\mu < \delta$ has noncompact orbit closure in the operator norm.
When f ∈ C(X) we can add:
(11) $f \notin C(Y)$ whenever π : X → Y is a topological G-factor map such that $\pi_*(\mu)$ is null,
(12) there is a μ-IN-pair $(x_1,x_2) \in X \times X$ such that $f(x_1) \ne f(x_2)$.

Proof. (1) ⇒ (2). Argue as for (1) ⇒ (2) in Theorem 3.7 using Lemma 4.3 and Proposition 4.7(1) instead of Lemma 2.8 and Proposition 2.16(1).
(2) ⇒ (3). Apply the same argument as for (2) ⇒ (3) in Theorem 3.7, replacing $I_\mu(A,\delta)$ by $I_\mu(A,\delta;s)$ for a suitable sequence s in G.



(3) ⇔ (4). Use the arguments for (6) ⇒ (4) and (3) ⇒ (6) in the proof of Theorem 3.7.
(3) ⇒ (6). Argue as for (4) ⇒ (7) in Theorem 3.7.
(6) ⇒ (7). As in the case of completely positive approximation entropy, if N is a G-invariant von Neumann subalgebra of $L^\infty(X,\mu)$ and s is a sequence in G then for every finite subset Θ ⊆ N we have $\mathrm{hcpa}^s_{\mu|_N}(N,\Theta) = \mathrm{hcpa}^s_\mu(M,\Theta)$, which follows from the fact that there is a μ-preserving conditional expectation from $L^\infty(X,\mu)$ onto N [45, Prop. V.2.36] (cf. Proposition 3.5 in [46]).
(7) ⇒ (1). This can be deduced from Lemma 3.6 in the same way that (8) ⇒ (1) of Theorem 3.7 was.
(6) ⇒ (5). Suppose that $f \in L^2(X,\mu)_{\mathrm{cpct}}$. Let δ > 0. Then the G-orbit $\{\alpha_s(f) : s \in G\}$ contains a finite δ-net Ω for the $L^2$-norm. Take a finite Borel partition $\mathcal{P}$ of X such that $\sup_{g\in\Omega} \operatorname{ess\,sup}_{x,y\in P} |g(x) - g(y)| < \delta$ for each $P \in \mathcal{P}$. Let B be the ∗-subalgebra of $L^\infty(X,\mu)$ generated by $\mathcal{P}$ and let ϕ be the μ-preserving conditional expectation of $L^\infty(X,\mu)$ onto B. Now for every s ∈ G we can find a g ∈ Ω such that $\|\alpha_s(f) - g\|_\mu \le \delta$ so that

\[ \|\varphi(\alpha_s(f)) - \alpha_s(f)\|_\mu \le \|\varphi(\alpha_s(f) - g)\|_\mu + \|\varphi(g) - g\|_\mu + \|g - \alpha_s(f)\|_\mu < 3\delta. \]

Taking the inclusion $\psi : B \to L^\infty(X,\mu)$ it follows that for every finite set F ⊆ G we have $(\varphi,\psi,B) \in \mathrm{CPA}_\mu(\{\alpha_s(f) : s \in F\}, 3\delta)$ and hence $\mathrm{rcp}_\mu(\{\alpha_s(f) : s \in F\}, 3\delta) \le \dim B$. Since δ is arbitrary we conclude that $\mathrm{hcpa}^s_\mu(\alpha,\{f\}) = 0$ for every sequence s in G.
(5) ⇒ (1). Since $f \notin L^2(X,\mu)_{\mathrm{cpct}}$ the restriction of α to the von Neumann algebra N dynamically generated by f has nonzero weak mixing component at the unitary level, and so there exists a finite partition $\mathcal{P}$ of X that is N-measurable but not $D_X$-measurable (where $D_X$ is as defined prior to Lemma 4.8). By Lemma 4.8 there is a sequence s in G such that $h^s_\mu(X,\mathcal{P}) \ge H(\mathcal{P} \mid D_X) > 0$, from which we infer that $f \notin N_X$.
(5) ⇒ (8). This follows by observing that if $\{g_k\}_{k\in\mathbb{N}}$ were a sequence in $L^\infty(X,\mu)$ converging to f in the μ-norm such that each $g_k$ lacks an infinite $\ell_1$-isomorphism set, then we would have $g_k \in L^2(X,\mu)_{\mathrm{cpct}}$ for each k by Lemma 5.4 and hence $f \in L^2(X,\mu)_{\mathrm{cpct}}$.
(8) ⇒ (9). Trivial.
(9) ⇒ (10). It is easy to see that if an element g of $L^\infty(X,\mu)$ has arbitrarily large λ-$\ell_1$-isomorphism sets for some λ > 0 then its G-orbit fails to have a finite ε-net for some ε > 0 depending on λ and g.
(10) ⇒ (5). Suppose contrary to (5) that $f \in L^2(X,\mu)_{\mathrm{cpct}}$. Then the restriction of α to the von Neumann algebra N dynamically generated by f has the property that the norm closure of the G-orbit of each vector in $L^2(N,\mu)$ is compact. By Proposition 5.3 this contradicts (10).
Suppose now that f ∈ C(X). To prove (2) ⇒ (12), observe that the inclusion $C(\mathrm{supp}(\mu)) \subseteq L^\infty(X,\mu)$ gives rise to a topological G-factor map Ω → supp(μ), so that we can apply Proposition 4.7(5). For (12) ⇒ (3) apply the same argument as for (2) ⇒ (3). For (11) ⇒ (12) argue as for (11) ⇒ (12) in Theorem 3.7, this time using Proposition 4.7. Finally, for (12) ⇒ (11) use Proposition 4.7(5). □

As pointed out at the beginning of the section and as used in the proof of Theorem 4.9, Theorem 5.5 shows that a measure-preserving system is isometric if and only if it is null, which in the case of a Z-action on a Lebesgue space is a result of Kushnirenko [31]. Note that Theorem 5.5 does not depend in any way on Theorem 4.9. In conjunction with Theorem 3.7, Theorem 5.5 gives a geometric explanation for the well-known fact that isometric measure-preserving systems have zero entropy.
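As a concrete illustration of nullness for an isometric system, the following Python sketch looks at a single partition of a circle rotation. It is not taken from the paper: the rotation number, the half-circle partition, the sequence of squares, and the function name `arc_partition_entropy` are all assumptions chosen for the example. The arcs cut out by the endpoints of the shifted partitions refine the join $\bigvee_j s_j^{-1}\mathcal{P}$, so their entropy bounds the entropy of the join from above, and with at most 2n arcs this bound is at most ln(2n).

```python
import math

def arc_partition_entropy(alpha, shifts):
    """Entropy (Lebesgue measure on the circle [0, 1)) of the partition into arcs
    cut out by the endpoints of s^{-1}P for s in shifts, where
    P = {[0, 1/2), [1/2, 1)} and the action is rotation by alpha.
    This arc partition refines the join of the shifted partitions,
    so the returned entropy is an upper bound for the entropy of the join."""
    cuts = set()
    for s in shifts:
        # The preimage of [0, 1/2) under the s-th power of the rotation
        # is the arc [-s*alpha, 1/2 - s*alpha) mod 1.
        for endpoint in (0.0, 0.5):
            cuts.add((endpoint - s * alpha) % 1.0)
    points = sorted(cuts)
    lengths = [(points[(i + 1) % len(points)] - points[i]) % 1.0
               for i in range(len(points))]
    entropy = -sum(l * math.log(l) for l in lengths if l > 0.0)
    return entropy, len(points)

if __name__ == "__main__":
    alpha = math.sqrt(2) - 1          # hypothetical irrational rotation number
    for n in (10, 100, 1000):
        shifts = [j * j for j in range(1, n + 1)]   # hypothetical sequence s_j = j^2
        h, arcs = arc_partition_entropy(alpha, shifts)
        print(n, arcs, h / n)
```

Since the bound ln(2n) does not depend on the choice of shifts, the normalized entropy tends to zero along every sequence for this partition, which is consistent with the rotation being null.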



Condition (8) in Theorem 5.5 is the analogue of tameness from topological dynamics [13,30]. Its equivalence with the other conditions shows that tameness as distinct from nullness is a specifically topological-dynamical phenomenon. This equivalence relies in part, via Lemma 5.4, on the local argument used by Huang in the case G = Z to prove that if X is metrizable and the system (X, G) is tame then every G-invariant Borel probability measure on X is measure null [19, Theorem 5.2]. The following standard type of example illustrates that the converse of Huang’s result fails in an extreme way.

Example 5.6. By Lemma 7.2 of [30], when G is Abelian, every nontrivial metrizable weakly mixing system (X, G) is completely untame. We will show how to construct a weakly mixing uniquely ergodic subshift (X, Z) with the invariant measure supported at a fixed point.

We indicate first how to construct weakly mixing subshifts (X, Z) with $X \subseteq \{0,1\}^{\mathbb{Z}}$. We shall construct two elements p and q in $\{0,1\}^{\mathbb{Z}}$ so that (p, q) is a transitive point for X × X where X is the orbit closure of p, and determine two increasing sequences $0 = a_1 < a_2 < \cdots$ and $0 = a'_1 < a'_2 < \cdots$ of nonnegative integers with $a_n \le a'_n < a_{n+1}$ for all n. Set p(k) = q(k) = 0 unless $a_n \le k \le a'_n$ for some n. Set p(0) = 1 and q(0) = 0. Suppose that we have determined $a_1,\dots,a_m$ and $a'_1,\dots,a'_m$ and p(k) and q(k) for all $k \le a'_m$. Take $a_{m+1}$ to be any integer bigger than $\max(m, a'_m)$. If m + 1 ≡ 1 mod 3, we take $a'_{m+1} = a_{m+1} + 2m$ and set q to be 0 on the interval $[a_{m+1}, a'_{m+1}]$ while setting p on $[a_{m+1}, a'_{m+1}]$ to be the shift of q on [−m, m]. If m + 1 ≡ 2 mod 3, we take $a'_{m+1} = a_{m+1} + 2m$ and set p to be 0 on the interval $[a_{m+1}, a'_{m+1}]$ while setting q on $[a_{m+1}, a'_{m+1}]$ to be the shift of p on [−m, m]. If m + 1 ≡ 0 mod 3, consider the set S consisting of the sequences of values of p over the finite subintervals of $(-\infty, a'_m]$. Consider pairs of elements in S of the same length which don’t appear as the sequence of values of (p, q) on some finite subinterval of $(-\infty, a'_m]$. Choose one such pair (f, g) with the smallest length d. Set $a'_{m+1} = a_{m+1} + d - 1$ and set p and q to be f and g, respectively, on the interval $[a_{m+1}, a'_{m+1}]$. Then it is clear that (p, q) is a transitive point for X × X where X is the orbit closure of p.

In general, note that if U is an open subset of X such that there is an infinite subset H of G for which the sets hU for h ∈ H are pairwise disjoint, then μ(U) = 0 for any invariant Borel probability measure μ on X. Denote by Y the complement of the union of all such U. Then every invariant Borel probability measure μ of X is supported on Y. We claim that in the construction above, by choosing $a_{m+1}$ large enough at each step, we can arrange for Y to consist of only the point 0. Then (X, Z) is uniquely ergodic and the invariant measure is supported at 0. Note that Y is always an invariant closed subset of X. Let V be the subset of X consisting of elements with value 1 at 0. It suffices to find an infinite subset $H = \{h_1, h_2, \dots\}$ of Z such that the sets hV for h ∈ H are pairwise disjoint. Set $h_1 = 0$. Suppose that we have determined $a_1,\dots,a_m$ and $a'_1,\dots,a'_m$ and $h_1,\dots,h_m$ and p(k) and q(k) for all $k \le a'_m$. Take $h_{m+1} > h_m + a'_m - a_1$ and $a_{m+1} > a'_m + h_{m+1} - h_1$.

The following theorem addresses the extreme case of complete nonnullness, where we see the same kind of topologization as in the entropy setting of Theorem 3.9. For the definitions of the topological-dynamical properties of complete nonnullness, complete untameness, uniform nonnullness of all orders, and uniform untameness of all orders, see Sections 5 and 6 of [30].

Theorem 5.7. Let $\mathbf{X} = (X, \mathscr{X}, \mu, G)$ be a measure-preserving dynamical system. Let $\mathbf{\Omega} = (\Omega, G)$ be the associated topological dynamical system on the spectrum Ω of $L^\infty(X,\mu)$. Then the following are equivalent:
(1) $\mathbf{X}$ is weakly mixing,
(2) $\mathbf{X}$ is completely nonnull,
(3) for every nonscalar $f \in L^\infty(X,\mu)$ there is a λ ≥ 1 such that for every m ∈ N there exists a set I ⊆ G of cardinality m such that $\{\alpha_s(f) : s \in I\}$ is λ-equivalent to the standard basis of $\ell_1^m$,
(4) every nonscalar element of $L^\infty(X,\mu)$ has an infinite $\ell_1$-isomorphism set,
(5) $\mathbf{\Omega}$ is completely nonnull,
(6) $\mathbf{\Omega}$ is completely untame,
(7) $\mathbf{\Omega}$ is uniformly nonnull of all orders,
(8) $\mathbf{\Omega}$ is uniformly untame of all orders.

Proof. (1) ⇒ (8). Use Theorems 8.2, 8.6, and 9.10 of [30].
(8) ⇒ (7) ⇒ (5) and (8) ⇒ (6). These implications hold for any topological system (see Sections 5 and 6 of [30]).
(6) ⇒ (4). Apply Propositions 6.4 and 6.6 of [30].
(5) ⇒ (3). Apply Proposition 5.4 and Theorem 5.8 of [29].
(4) ⇒ (2), (3) ⇒ (2), and (2) ⇒ (1). Apply Theorem 5.5. □

In analogy with Proposition 3.10, if $(Y, \mathscr{Y}, \nu, G)$ and $(Z, \mathscr{Z}, \omega, G)$ are measure-preserving G-systems and $\varphi : L^\infty(Y,\nu) \to L^\infty(Z,\omega)$ is a G-equivariant unital positive linear map such that ω ∘ ϕ = ν, then $\varphi(N_Y) \subseteq N_Z$. One can deduce this using the characterization of functions in the maximal null von Neumann algebra in terms of either $\ell_1$-isomorphism sets or compact orbit closures in $L^2$. In particular we see that isometric systems are disjoint from weakly mixing systems. Of course it is well known more generally that distal systems are disjoint from weakly mixing systems (see Chapter 6 of [12]).

Acknowledgments

The first author was partially supported by NSF grant DMS-0600907. He is grateful to Bill Johnson and Gideon Schechtman for seminal discussions and in particular for indicating the relevance of the Sauer–Perles–Shelah lemma to the types of perturbation problems considered in the paper.

References

[1] E.M. Alfsen, P. Holm, A note on compact representations and almost periodicity in topological groups, Math. Scand. 10 (1962) 127–136. [2] V. Bergelson, J. Rosenblatt, Mixing actions of groups, Illinois J. Math. 32 (1988) 65–80. [3] F. Blanchard, E. Glasner, B.
Host, A variation on the variational principle and applications to entropy pairs, Ergodic Theory Dynam. Systems 17 (1997) 29–43. [4] F. Blanchard, B. Host, A. Maass, S. Martinez, D.J. Rudolph, Entropy pairs for a measure, Ergodic Theory Dynam. Systems 15 (1995) 621–632. [5] T. Bröcker, T. tom Dieck, Representations of Compact Lie Groups, Grad. Texts in Math., vol. 98, Springer-Verlag, New York, 1995, translated from the German manuscript, corrected reprint of the 1985 translation. [6] A. Connes, J. Feldman, B. Weiss, An amenable equivalence relation is generated by a single transformation, Ergodic Theory Dynam. Systems 1 (1981) 431–450. [7] A.I. Danilenko, Entropy theory from the orbital point of view, Monatsh. Math. 134 (2001) 121–141. [8] K.R. Davidson, C ∗ -Algebras by Example, Fields Inst. Monogr., vol. 6, Amer. Math. Soc., Providence, RI, 1996.



[9] L.E. Dor, On sequences spanning a complex l1 -space, Proc. Amer. Math. Soc. 47 (1975) 515–516. [10] R. Ellis, Universal minimal sets, Proc. Amer. Math. Soc. 11 (1960) 540–543. [11] E. Glasner, A simple characterization of the set of μ-entropy pairs and applications, Israel J. Math. 102 (1997) 13–27. [12] E. Glasner, Ergodic Theory via Joinings, Amer. Math. Soc., Providence, RI, 2003. [13] E. Glasner, On tame dynamical systems, Colloq. Math. 105 (2006) 283–295. [14] E. Glasner, J.-P. Thouvenot, B. Weiss, Entropy theory without a past, Ergodic Theory Dynam. Systems 20 (2000) 1355–1370. [15] E. Glasner, B. Weiss, Strictly ergodic, uniform positive entropy models, Bull. Soc. Math. France 122 (1994) 399– 412. [16] E. Glasner, B. Weiss, Quasi-factors of zero entropy systems, J. Amer. Math. Soc. 8 (1995) 665–686. [17] E. Glasner, B. Weiss, On the interplay between measurable and topological dynamics, in: Handbook of Dynamical Systems, vol. 1B, Elsevier, Amsterdam, 2006, pp. 597–648. [18] R. Godement, Les fonctions de type positif et la théorie des groupes, Trans. Amer. Math. Soc. 63 (1948) 1–84. [19] W. Huang, Tame systems and scrambled pairs under an Abelian group action, Ergodic Theory Dynam. Systems 26 (5) (2006) 1549–1567. [20] W. Huang, A. Maass, P.P. Romagnoli, X. Ye, Entropy pairs and a local Abramov formula for a measure theoretical entropy of open covers, Ergodic Theory Dynam. Systems 24 (2004) 1127–1153. [21] W. Huang, A. Maass, X. Ye, Sequence entropy pairs and complexity pairs for a measure, Ann. Inst. Fourier (Grenoble) 54 (2004) 1005–1028. [22] W. Huang, X. Ye, A local variational relation and applications, Israel J. Math. 151 (2006) 237–279. [23] W. Huang, X. Ye, G. Zhang, A local variational principle for conditional entropy, Ergodic Theory Dynam. Systems 26 (2006) 219–245. [24] W. Huang, X. Ye, G. Zhang, Local entropy theory for a countable discrete amenable group action, preprint, 2007. [25] M.G. Karpovsky, V.D. 
Milman, Coordinate density of sets of vectors, Discrete Math. 24 (1978) 177–184. [26] A. Katok, Lyapunov exponents, entropy and periodic orbits for diffeomorphisms, Publ. Math. Inst. Hautes Études Sci. 51 (1980) 137–173. [27] A.S. Kechris, Classical Descriptive Set Theory, Grad. Texts in Math., vol. 156, Springer, New York, 1995. [28] D. Kerr, Entropy and induced dynamics on state spaces, Geom. Funct. Anal. 14 (2004) 575–594. [29] D. Kerr, H. Li, Dynamical entropy in Banach spaces, Invent. Math. 162 (2005) 649–686. [30] D. Kerr, H. Li, Independence in topological and C ∗ -dynamics, Math. Ann. 338 (2007) 869–926. [31] A.G. Kushnirenko, On metric invariants of entropy type, Russian Math. Surveys 22 (1967) 53–61. [32] E. Lindenstrauss, B. Weiss, Mean topological dimension, Israel J. Math. 115 (2000) 1–24. [33] J. Moulin Ollagnier, Ergodic Theory and Statistical Mechanics, Lecture Notes in Math., vol. 1115, Springer, Berlin, 1985. [34] S. Neshveyev, E. Størmer, Dynamical Entropy in Operator Algebras, Ergeb. Math. Grenzgeb. (3), vol. 50, Springer, Berlin, 2006. [35] D.S. Ornstein, B. Weiss, Entropy and isomorphism theorems for actions of amenable groups, J. Anal. Math. 48 (1987) 1–141. [36] A.L.T. Paterson, Amenability, Math. Surveys Monogr., vol. 29, Amer. Math. Soc., Providence, RI, 1988. [37] G.K. Pedersen, C ∗ -algebras and their Automorphism Groups, London Math. Soc. Monogr., vol. 14, Academic Press, Inc., London–New York, 1979. [38] S. Popa, Correspondences, INCREST preprint, 1986. [39] P.P. Romagnoli, A local variational principle for the topological entropy, Ergodic Theory Dynam. Systems 23 (2003) 1601–1610. [40] A. Rosenthal, Finite uniform generators for ergodic, finite entropy, free actions of amenable groups, Probab. Theory Related Fields 77 (1988) 147–166. [41] H.P. Rosenthal, A characterization of Banach spaces containing l1 , Proc. Natl. Acad. Sci. USA 71 (1974) 2411– 2413. [42] D.J. Rudolph, B. Weiss, Entropy and mixing for amenable group actions, Ann. 
of Math. 151 (2000) 1119–1150. [43] N. Sauer, On the density of families of sets, J. Combin. Theory Ser. A 13 (1972) 145–147. [44] S. Shelah, A combinatorial problem; stability and order for models and theories in infinitary languages, Pacific J. Math. 41 (1972) 247–261. [45] M. Takesaki, Theory of Operator Algebras I, Encyclopaedia Math. Sci., vol. 124, Springer-Verlag, Berlin, 2002. [46] D.V. Voiculescu, Dynamical approximation entropies and topological entropy in operator algebras, Comm. Math. Phys. 170 (1995) 249–281.



[47] P. Walters, An Introduction to Ergodic Theory, Grad. Texts in Math., vol. 79, Springer-Verlag, New York–Berlin, 1982. [48] B. Weiss, Strictly ergodic models for dynamical systems, Bull. Amer. Math. Soc. (N.S.) 13 (1985) 143–146. [49] R. Zimmer, Ergodic actions with generalized discrete spectrum, Illinois J. Math. 20 (1976) 555–588.


Asymptotic type for sectorial operators and an integral of fractional powers Nick Dungey 1 Department of Mathematics, Macquarie University, NSW 2109, Australia Received 7 May 2008; accepted 23 July 2008 Available online 25 September 2008 Communicated by N. Kalton

Abstract There is a standard notion of type for a sectorial linear operator acting in a Banach space. We introduce a notion of asymptotic type for a linear operator V, involving estimates on the resolvent $(\lambda I + V)^{-1}$ as λ → 0. We show, for example, that if V is sectorial and of asymptotic type ω then the fractional power $V^\alpha$ is of asymptotic type αω for a suitable range of positive α. Moreover, we establish various properties of the operator $\int_0^1 d\alpha\, V^\alpha$; in particular, this operator is of asymptotic type 0, for a sectorial operator V. This result has an application to the construction of operators satisfying the well-known Ritt resolvent condition. © 2008 Published by Elsevier Inc. Keywords: Sectorial operator; Fractional power; Type; Ritt resolvent condition; Power-bounded operator

1. Introduction

Sectorial operators are linear operators with spectrum in a sector of the complex plane and whose resolvent satisfies certain estimates outside the sector. To recall the precise definition write $\Lambda_\theta := \{z \in \mathbb{C} : z \ne 0,\ |\arg z| < \theta\}$ and $\overline{\Lambda}_\theta := \{0\} \cup \{z \in \mathbb{C} : |\arg z| \le \theta\}$ for the open and closed sectors of angle θ, θ ∈ [0, π). A closed linear operator V acting in a complex Banach space X is

1 This article is dedicated to the memory of Nicolas Dungey. Nick, aged 35, died tragically in May 2008, shortly after

submitting it. Sadly his work in mathematics, his life’s passion, has come to an untimely end. 0022-1236/$ – see front matter © 2008 Published by Elsevier Inc. doi:10.1016/j.jfa.2008.07.020

1388

N. Dungey / Journal of Functional Analysis 256 (2009) 1387–1407

said to be of type ω, where ω ∈ [0, π), if its spectrum σ (V ) is contained in Λω and if for each θ ∈ (ω, π) one has sup λ(λI + V )−1 : λ ∈ Λπ−θ < ∞. An operator V is sectorial if it is of type ω for some ω ∈ [0, π). Sectorial operators arise naturally in various aspects of partial differential equations and harmonic analysis, and are closely linked to topics such as semigroup theory, the theory of fractional powers of operators, and operator functional calculus. See [6] for a recent account with references to the literature on these topics. If an operator V generates a bounded C0 -semigroup (e−tV )t0 of operators, it is well known that V is of type π/2, and the Laplace transform formula −1

(λI + V )

∞ =

dt e−λt e−tV

0

suggests that the asymptotic behaviour of e−tV as t → ∞ should be reflected in the behaviour of the resolvent (λI + V )−1 as λ → 0. This semigroup consideration motivates us to introduce a notion of asymptotic type for operators which specifies resolvent estimates as λ → 0. Write D(a; r) := {z ∈ C: |z − a| r} for a ∈ C, r 0. We say that a closed linear operator V is of asymptotic type ω, where ω ∈ [0, π), if for each θ ∈ (ω, π) there exists an ε = ε(θ ) > 0 such that σ (V ) ∩ D(0; ε) ⊆ Λθ

(1)

and λ(λI + V )−1 < ∞.

sup λ∈D(0;ε)∩Λπ−θ

(Remark that (1) says exactly that D(0; ε) ∩ Λπ−θ is contained in the resolvent set of −V .) The main goals of this paper are to study asymptotic type in relation to the fractional powers V α , α > 0, of a sectorial operator, and in relation to the operator 1 dα V α ,

(2)

0

an integral of fractional powers. Recall that for an arbitrary sectorial operator there is a welldeveloped theory for the powers V α , α > 0; see for example [6,14]. A classical theorem on types states that if V is sectorial of type ω, then V α is of type αω for a suitable range of α. In Section 3 we establish a counterpart of this result for the asymptotic type. For the asymptotic type of the operator (2), we discover the following interesting result. Theorem 1.1. Let V be a sectorial operator with domain dense in X. Then the operator 1 α 0 dα V , defined on a suitable domain, is sectorial and is of asymptotic type 0.



To prove Theorem 1.1 we give a detailed development of the theory of the operator $\int_0^1 d\alpha\, V^\alpha$, including integral representations for it and for its resolvent $(\lambda I + \int_0^1 d\alpha\, V^\alpha)^{-1}$. These representations are comparable with the classical Balakrishnan and Kato formulae for $V^\alpha$ and $(\lambda I + V^\alpha)^{-1}$, respectively (for the Balakrishnan/Kato formulae see [14], or Section 3 below).

For a general sectorial operator $V$, the operator (2) is not definable by a standard sectorial functional calculus (as found for example in [6, Chapter 2]) unless one imposes additional conditions, such as injectivity of $V$. It turns out, however, that (2) can be seen as a case of the Hirsch functional calculus (see [7,13]) in which $\varphi(V)$ is defined for a certain class of functions $\varphi$ analytic on the cut plane $\mathbb{C} \setminus (-\infty, 0]$. See the remarks in Section 4.

A variant of the integral (2) was considered by Lyubich in [12] (see also [8, Section 2]). He studied fractional powers $J^\alpha$ of the standard Volterra operator $J$ acting in $X = L^p(0, 1)$, $1 \le p \le \infty$, and proved (with a different terminology) that the operator

\[ \mathcal{J} := \int_0^\infty d\alpha\, J^\alpha \tag{3} \]

is bounded and sectorial of type 0, with a single-point spectrum $\sigma(\mathcal{J}) = \{0\}$. Theorem 1.1 could be considered as a far-reaching generalization of this example of [12].

Theorem 1.1 also has an interesting application to the study of bounded operators $T$ with spectrum in the unit disc $\overline{D} := D(0; 1)$, which we now explain. Denote by $\mathcal{L}(X)$ the space of all bounded linear operators $T: X \to X$. An operator $T \in \mathcal{L}(X)$ is said to be a Ritt operator if it satisfies the Ritt resolvent condition, that is, $\sigma(T) \subseteq \overline{D}$ and

\[ \sup\big\{ \|(\lambda - 1)(\lambda I - T)^{-1}\| : \lambda \in \mathbb{C},\ |\lambda| > 1 \big\} < \infty. \tag{4} \]

This condition recently received interest from a number of authors (see [1,2,8,11,16,18,19] and references therein), partly because it has a nice characterization in terms of the behaviour of the powers $T^n$, $n \in \mathbb{N} := \{1, 2, 3, \ldots\}$; see Theorem 6.1 below. In particular, any Ritt operator is power-bounded in the sense that $\sup_{n \in \mathbb{N}} \|T^n\| < \infty$.

As a corollary of his results in [12], Lyubich obtained that the operator $S := I - \mathcal{J}$, with $\mathcal{J}$ as in (3), acting in $L^p(0, 1)$, is a Ritt operator and has spectrum $\sigma(S) = \{1\}$. This answered an earlier question of Zemánek about whether there exist Ritt operators with single-point spectrum $\{1\}$. Using Theorem 1.1 we shall obtain a result of a more general nature.

Theorem 1.2. Suppose that $T \in \mathcal{L}(X)$ satisfies $\sigma(T) \subseteq \overline{D}$ and

\[ \sup\big\{ \|(\lambda - 1)(\lambda I - T)^{-1}\| : \lambda \in (1, \infty) \big\} < \infty. \tag{5} \]

Then the operator

\[ S := I - \int_0^1 d\alpha\, (I - T)^\alpha \]

is a Ritt operator, and $I - S$ is of asymptotic type 0. Moreover, if $\sigma(T) = \{1\}$, then $\sigma(S) = \{1\}$ and $I - S$ is of type 0.



The condition (5) for a bounded operator $T \in \mathcal{L}(X)$ is called Abel boundedness. In comparison, an operator $T \in \mathcal{L}(X)$ is said to be a Kreiss operator if it satisfies the Kreiss resolvent condition, that is, $\sigma(T) \subseteq \overline{D}$ and

\[ \sup\big\{ (|\lambda| - 1)\,\|(\lambda I - T)^{-1}\| : \lambda \in \mathbb{C},\ |\lambda| > 1 \big\} < \infty. \tag{6} \]

The Kreiss condition is obviously stronger than Abel boundedness, but it is weaker than power-boundedness of $T$, which is, in turn, weaker than the Ritt condition (see Section 6 below and [15,18] for more detailed discussions).

For the Volterra operator $J$ acting in $L^p(0, 1)$, it is known that $\sigma(J) = \{0\}$ and that $I - J$ is a Kreiss operator [15]. Therefore Theorem 1.2 applies with $T = I - J$ and $S = I - \int_0^1 d\alpha\, J^\alpha$, to give a variant of the results of Lyubich cited above.

It is interesting to compare Theorem 1.2 with the following observation of the author [4].

Theorem 1.3. (See [4].) Let $T \in \mathcal{L}(X)$ be a Kreiss operator. Then for each $\alpha \in (0, 1)$, the operator $S := I - (I - T)^\alpha$ is a Ritt operator, and $I - S$ is of type $\alpha\pi/2$.

We shall also see that if the operator $T$ is power-bounded, the operator $S$ in Theorem 1.2 can be represented as a power series $S = \sum_{k \ge 0} B(k) T^k$ where $k \mapsto B(k)$ is a probability on the non-negative integers; that is, $B(k) \ge 0$ and $\sum_{k \ge 0} B(k) = 1$. This representation connects with results of [4] on such power series and their relation to Ritt operators, as explained in Section 6 below.

The paper is organized as follows. Section 2 collects some basic facts about sectoriality and asymptotic type. These results might be useful in future work on asymptotic type. In Section 3 we prove an analogue for asymptotic type of the classic type theorem for fractional powers $V^\alpha$. Section 4 is a careful development of the theory of the operator $\int_0^1 d\alpha\, V^\alpha$ for a bounded sectorial operator $V$, including the proof of Theorem 1.1 in this case. In Section 5 we describe the extension of that theory for an unbounded sectorial operator. Finally, in Section 6 we study operators $T$ with spectrum in the disc $\overline{D}$; in particular, we apply the preceding theory to prove Theorem 1.2.

We might explore elsewhere the relation between asymptotic type and asymptotics of the semigroup $(e^{-tV})$ when $V$ is a semigroup generator. For a Ritt operator $T$, one expects a similar connection between asymptotic type of $I - T$ and asymptotics of the powers $T^n$ as $n \to \infty$. For a sample result in this direction, see Proposition 6.5 below.

In what follows $X$ will be a complex Banach space. We always consider the principal branch of the logarithm $z \mapsto \log z$ and of the power function $z \mapsto z^\alpha$ ($\alpha \in \mathbb{C}$), so these functions are analytic on $\mathbb{C} \setminus (-\infty, 0]$.

2. Sectoriality and asymptotic type

In this section we gather some basic facts about sectoriality and asymptotic type. We first recall a standard condition for sectoriality (see, for example, [6, p. 20]).

Lemma 2.1. If $V$ is a closed operator in $X$ satisfying $\sigma(V) \subseteq \mathbb{C} \setminus (-\infty, 0)$ and $\sup_{\lambda > 0} \|\lambda(\lambda I + V)^{-1}\| < \infty$, then $V$ is sectorial.

Concerning asymptotic type, the following observations are basic.



Lemma 2.2. The following statements hold.

(i) If $V$ is an operator of type $\omega \in [0, \pi)$, then $V$ is of asymptotic type $\omega$.
(ii) Suppose that $V \in \mathcal{L}(X)$ is a bounded operator of asymptotic type $\omega \in [0, \pi)$ and that $\sigma(V) \subseteq \overline{\Lambda}_\omega$. Then $V$ is of type $\omega$.

Proof. Part (i) is trivial. Part (ii) holds because the conditions $V \in \mathcal{L}(X)$, $\sigma(V) \subseteq \overline{\Lambda}_\omega$ imply that

\[ \sup\big\{ \|\lambda(\lambda I + V)^{-1}\| : \lambda \in \Lambda_{\pi-\theta},\ |\lambda| \ge \varepsilon \big\} < \infty \]

for each $\varepsilon > 0$ and $\theta \in (\omega, \pi)$. □

Any sectorial operator $V$ has a minimal type $\omega(V)$, given by

\[ \omega(V) = \min\{\omega \in [0, \pi): V \text{ is of type } \omega\}. \]

Similarly, if an operator $V$ is of asymptotic type $\omega_0$ for some $\omega_0 < \pi$, then it has a minimal asymptotic type given by

\[ \omega_{\mathrm{as}}(V) = \min\{\omega \in [0, \pi): V \text{ is of asymptotic type } \omega\}. \]

For a sectorial operator $V$ one has $\omega_{\mathrm{as}}(V) \le \omega(V)$.

The following notion is sometimes useful in connection with asymptotic type. We say that a set $A \subseteq \mathbb{C}$ has asymptotic angle $\omega$, where $\omega \in [0, \pi)$, if for each $\theta \in (\omega, \pi)$ there exists an $\varepsilon = \varepsilon(\theta) > 0$ such that $A \cap D(0; \varepsilon) \subseteq \overline{\Lambda}_\theta$. In the definition of asymptotic type for operators, the condition on $\sigma(V)$ is exactly that $\sigma(V)$ have asymptotic angle $\omega$.

In case 0 is a limit point of the set $A \setminus \{0\}$, the condition that $A$ have asymptotic angle $\omega$ can be expressed as

\[ \limsup_{z \in A \setminus \{0\},\ z \to 0} |\arg z| \le \omega. \]

On the other hand, if 0 is not a limit point of $A \setminus \{0\}$ then $A$ has asymptotic angle 0. If $A \subseteq \mathbb{C}$ has asymptotic angle $\omega$, then so does its closure $\overline{A}$ (note that $\overline{A} \cap D(0; 2^{-1}\varepsilon)$ is contained in the closure of $A \cap D(0; \varepsilon)$).

For future use we record a geometric estimate for asymptotic angle.

Lemma 2.3. Let $A \subseteq \mathbb{C}$ have asymptotic angle $\omega \in [0, \pi)$. Then given any $\theta \in (\omega, \pi)$, there exist $\varepsilon > 0$ and $a \in (0, 1)$ such that

\[ -\lambda \notin A, \qquad |\lambda + z| \ge a\big(|\lambda| + |z|\big) \]

for all $\lambda \in D(0; \varepsilon) \cap \Lambda_{\pi-\theta}$ and all $z \in A$.

Proof. Given $\theta$, fix a $\theta' \in (\omega, \theta)$ and choose an $\varepsilon > 0$ such that $A \cap D(0; 2\varepsilon) \subseteq \overline{\Lambda}_{\theta'}$. Since $\theta' < \theta$, elementary trigonometry gives for some $a \in (0, 1)$ that $|\lambda + z| \ge a(|\lambda| + |z|)$ for all $\lambda \in \Lambda_{\pi-\theta}$ and $z \in \overline{\Lambda}_{\theta'}$. (One may take for instance $a = \tfrac12\big(1 + \cos(\pi - \theta + \theta')\big)$: for $z \neq 0$ one has $|\arg\lambda - \arg z| \le \pi - \theta + \theta' < \pi$ and $|\lambda + z|^2 \ge \tfrac12\big(1 + \cos(\arg\lambda - \arg z)\big)(|\lambda| + |z|)^2$.) This yields the estimate of the lemma in case $z \in A \cap D(0; 2\varepsilon)$. Next, for any $z \in \mathbb{C} \setminus D(0; 2\varepsilon)$ and $\lambda \in D(0; \varepsilon)$ one has $|z| \ge 2|\lambda|$, so $|\lambda + z| \ge |z| - |\lambda| \ge 3^{-1}(|z| + |\lambda|)$. The lemma follows, after replacing $a$ by $\min\{a, 3^{-1}\}$. □

There is a 'local' counterpart of the notion of asymptotic type for operators. In fact, let us say that a closed linear operator $V$ is of local type $\omega$, $\omega \in [0, \pi)$, if for each $\theta \in (\omega, \pi)$ there exists an $R > 1$ such that $\sigma(V) \cap (\mathbb{C} \setminus D(0; R)) \subseteq \overline{\Lambda}_\theta$ and

\[ \sup\big\{ \|\lambda(\lambda I + V)^{-1}\| : \lambda \in (\mathbb{C} \setminus D(0; R)) \cap \Lambda_{\pi-\theta} \big\} < \infty. \]

It is not difficult to see that $V$ is of local type $\omega$ for some $\omega < \pi$ if and only if the operator $\lambda I + V$ is sectorial for some $\lambda > 0$.

Note that our definitions of asymptotic type and local type do not imply sectoriality. For example, if $V \in \mathcal{L}(X)$ has spectrum equal to the unit circle $\{z \in \mathbb{C}: |z| = 1\}$, then $V$ is of asymptotic type 0 and local type 0, yet is not sectorial.

Here is a connection between asymptotic type and local type.

Lemma 2.4. Let $V$ be an injective closed operator, and let $\omega \in [0, \pi)$. Then $V$ is of local type $\omega$ if and only if $V^{-1}$ is of asymptotic type $\omega$.

Proof. It is a straightforward consequence of the relation (see [6, p. 20])

\[ \lambda(\lambda I + V^{-1})^{-1} = I - \lambda^{-1}(\lambda^{-1} I + V)^{-1}. \]

□

Actually, the notions of type, asymptotic type and local type extend readily to multi-valued linear operators $V$; see, for example, [6, Appendix A] for the definitions of spectrum and resolvent for such operators. Lemma 2.4 remains valid for a closed multi-valued operator $V$ without any injectivity hypothesis. In what follows, however, we only consider conventional single-valued operators.

3. Fractional powers

In this section, we briefly recall some standard facts about fractional powers of a sectorial operator, and then prove a result about asymptotic type of these powers (Theorem 3.2 below).
Given a sectorial operator $V$ in $X$, it is well known that one can define the fractional powers $V^\alpha$ for all $\alpha > 0$. Here are some standard properties (for further details see, for example, [6,14]).

(i) $V^\alpha$ is closed, and $V^\alpha V^\beta = V^{\alpha+\beta}$ for all $\alpha, \beta > 0$.
(ii) If $\alpha \in (0, 1)$, then the domain $D(V)$ of $V$ is a core for $V^\alpha$, and one has the Balakrishnan formula

\[ V^\alpha f = \frac{\sin \alpha\pi}{\pi} \int_0^\infty dt\, t^{\alpha-1} (tI + V)^{-1} V f \tag{7} \]

for all $f \in D(V)$.
(iii) If $V, V^{-1} \in \mathcal{L}(X)$, then $V^\alpha$ is given by the Dunford functional calculus for $V$; precisely, $V^\alpha = (2\pi i)^{-1} \int_\gamma dz\, z^\alpha (zI - V)^{-1}$ where $\gamma$ is a positively oriented, simple closed contour lying in $\mathbb{C} \setminus (-\infty, 0]$ and enclosing the spectrum of $V$.
(iv) The spectral mapping theorem: $\sigma(V^\alpha) = \{z^\alpha : z \in \sigma(V)\}$ for all $\alpha > 0$.

Moreover, the following theorem on types is standard (see [14] or [6]).

Theorem 3.1. Let $V$ be sectorial of type $\omega \in [0, \pi)$, and let $\alpha \in (0, \pi/\omega)$. Then $V^\alpha$ is sectorial of type $\alpha\omega$, and $(V^\alpha)^\beta = V^{\alpha\beta}$ for all $\beta > 0$.

Here is our analogue of Theorem 3.1 for asymptotic type.

Theorem 3.2. If $V$ is sectorial and is of asymptotic type $\omega \in [0, \pi)$, then $V^\alpha$ is of asymptotic type $\alpha\omega$ for each $\alpha \in (0, \pi/\omega)$.

The operators $V^\alpha$ in Theorem 3.2 are not necessarily sectorial when $\alpha \in (1, \pi/\omega)$; for a simple example, consider $V = iI$, $\omega = 0$, $\alpha = 2$.

Our proof of Theorem 3.2 for $\alpha \in (0, 1)$ modifies Kato's original proof of Theorem 3.1 for that case (see [9]).

Proof of Theorem 3.2. First consider $\alpha \in (0, 1)$. To prove the theorem in this case, it is enough to show that given $\theta \in (\omega, \pi)$ and $\tau \in (0, (1 - \alpha)\pi)$, there exists a $\delta > 0$ depending on $\theta, \tau$ such that

\[ \sup\big\{ \|\lambda(\lambda I + V^\alpha)^{-1}\| : \lambda \in e^{\pm i\alpha(\pi-\theta)} \Lambda_\tau \cap D(0; \delta) \big\} < \infty. \tag{8} \]

So let $\theta' \in (\omega, \theta)$, and by the asymptotic type hypothesis choose an $\varepsilon \in (0, 1)$ such that

\[ \sup\big\{ \|z(zI + V)^{-1}\| : z \in D(0; 3\varepsilon) \cap \Lambda_{\pi-\theta'} \big\} < \infty. \tag{9} \]

Because $V$ is sectorial one has the resolvent formula of Kato [9]

\[ (\lambda I + V^\alpha)^{-1} = \frac{\sin \alpha\pi}{\pi} \int_0^\infty dt\, \frac{t^\alpha (tI + V)^{-1}}{(\lambda + e^{i\alpha\pi} t^\alpha)(\lambda + e^{-i\alpha\pi} t^\alpha)} \tag{10} \]



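for all $\lambda > 0$, and by analytic continuation, for all $\lambda \in \Lambda_{(1-\alpha)\pi}$. Thanks to (9), we may deform the contour of integration in (10) to a contour $\gamma: (0, \infty) \to \mathbb{C}$ with $\arg \gamma(t) = \pi - \theta$ for small $t$, and analytically continue in $\lambda$, to obtain

\[ (\lambda I + V^\alpha)^{-1} = \frac{\sin \alpha\pi}{\pi} \int_\gamma dz\, \frac{z^\alpha (zI + V)^{-1}}{(\lambda + e^{i\alpha\pi} z^\alpha)(\lambda + e^{-i\alpha\pi} z^\alpha)} \tag{11} \]

valid for all $\lambda \in e^{i\alpha(\pi-\theta)} \Lambda_\tau \cap D(0; \varepsilon^\alpha)$. Precisely, for (11) we can take $\gamma$ such that $\gamma(t) = t e^{i(\pi-\theta)}$ for $t \in (0, 2\varepsilon]$, $\gamma(t)$ traces a circular arc from $2\varepsilon e^{i(\pi-\theta)}$ to $2\varepsilon$ for $t \in [2\varepsilon, 3\varepsilon]$, and $\gamma(t) = t - \varepsilon$ for $t \ge 3\varepsilon$. From (11) and (9) one easily obtains a bound

\[ \|(\lambda I + V^\alpha)^{-1}\| \le c \int_\gamma |dz|\, \frac{|z|^{\alpha-1}}{(|\lambda| + |z|^\alpha)^2} \le c' |\lambda|^{-1} \]

for all $\lambda \in e^{i\alpha(\pi-\theta)} \Lambda_\tau \cap D(0; \varepsilon^\alpha)$. There is a similar bound when $\lambda \in e^{-i\alpha(\pi-\theta)} \Lambda_\tau \cap D(0; \varepsilon^\alpha)$, so (8) follows.

Next, in case $\alpha = 2$ and $\omega \in [0, \pi/2)$, one obtains the statement of the theorem via the identity

\[ (\lambda I + V^2)^{-1} = (i\lambda^{1/2} I + V)^{-1} (-i\lambda^{1/2} I + V)^{-1}; \]

we omit the details. In this case, $V$ need not even be sectorial.

Finally, for any $\alpha \in (1, \pi/\omega)$, by choosing $n \in \mathbb{N}$ with $2^n > \alpha$ and writing $V^\alpha = (V^{\alpha/2^n})^{2^n}$, one deduces the statement of the theorem by applying the previous two cases. □

There is a result analogous to Theorem 3.2 for local type, but we omit the details.

4. The operator $\int_0^1 d\alpha\, V^\alpha$

In this section, we shall define and study the operator $\int_0^1 d\alpha\, V^\alpha$ when $V \in \mathcal{L}(X)$ is a bounded sectorial operator. In order to do this, consider the function $\psi: \mathbb{C} \setminus (-\infty, 0) \to \mathbb{C}$ defined by

\[ \psi(z) := \int_0^1 d\alpha\, z^\alpha. \tag{12} \]

It is easy to see that $\psi$ is continuous on $\mathbb{C} \setminus (-\infty, 0)$, analytic on $\mathbb{C} \setminus (-\infty, 0]$, and that $\psi(0) = 0$, $\psi(1) = 1$. For $z$ not equal to 0 or 1, one writes $z^\alpha = e^{\alpha \log z}$ to evaluate $\psi$ as

\[ \psi(z) = \frac{z - 1}{\log z}, \qquad z \in \mathbb{C} \setminus \big((-\infty, 0] \cup \{1\}\big). \tag{13} \]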


For a sectorial operator $V \in \mathcal{L}(X)$, we now define the operator $\psi(V)$ by

\[ \psi(V) f := \int_0^1 d\alpha\, V^\alpha f \tag{14} \]

for $f \in X$. The main theorem of this section, stated next, collects our fundamental results on $\psi(V)$. Introduce 'boundary value' curves $\psi_+, \psi_-: (0, \infty) \to \mathbb{C}$ by setting

\[ \psi_+(t) := \lim_{\varphi \uparrow \pi} \psi(t e^{i\varphi}) = \frac{-(t + 1)}{\log t + i\pi}, \qquad \psi_-(t) := \lim_{\varphi \uparrow \pi} \psi(t e^{-i\varphi}) = \frac{-(t + 1)}{\log t - i\pi} \]

for all $t > 0$. Write $\operatorname{Rg} \psi := \psi(\mathbb{C} \setminus (-\infty, 0))$ for the range of $\psi$, and $\overline{\operatorname{Rg} \psi}$ for its $\mathbb{C}$-closure.

Theorem 4.1. Let $V \in \mathcal{L}(X)$ be a sectorial operator and let $M > 0$ be such that

\[ \sup_{\lambda > 0} \|\lambda(\lambda I + V)^{-1}\| \le M. \]

Then the following statements hold.

(I) The operator $\psi(V)$, defined by (14), is a well-defined element of $\mathcal{L}(X)$, and $\|\psi(V)\| \le 1 + M + M\|V\|$.

(II) One has

\[ \psi(V) = \int_0^\infty dt\, \frac{t + 1}{t[(\log t)^2 + \pi^2]} (tI + V)^{-1} V \]

where the integral converges in operator norm.

(III) Spectral mapping: one has

\[ \sigma(\psi(V)) = \{\psi(z): z \in \sigma(V)\} \subseteq \operatorname{Rg} \psi. \]

Moreover $\overline{\operatorname{Rg} \psi} \subseteq \mathbb{C} \setminus (-\infty, 0)$.

(IV) For $\lambda \in \mathbb{C}$ with $-\lambda \notin \overline{\operatorname{Rg} \psi}$, one has

\[ (\lambda I + \psi(V))^{-1} = \int_0^\infty dt\, \frac{(t + 1)(tI + V)^{-1}}{[(\log t)^2 + \pi^2](\lambda + \psi_+(t))(\lambda + \psi_-(t))} = \int_0^\infty dt\, \frac{(t + 1)(tI + V)^{-1}}{(t + 1 - \lambda \log t)^2 + \pi^2 \lambda^2} \]

where the integrals converge in operator norm.

(V) One has $\|\lambda(\lambda I + \psi(V))^{-1}\| \le M$ for all $\lambda > 0$. Hence the operator $\psi(V)$ is sectorial.

(VI) Given $\theta \in (0, \pi/2)$, there exist $\varepsilon > 0$, $c > 0$, depending only on $\theta$, such that $-\lambda \notin \overline{\operatorname{Rg} \psi}$ and

\[ \|(\lambda I + \psi(V))^{-1}\| \le M c |\lambda|^{-1} \]

for all $\lambda \in D(0; \varepsilon) \cap \Lambda_{\pi-\theta}$. Hence the operator $\psi(V)$ is of asymptotic type 0.

Remarks. We comment on some different approaches to the definition of $\psi(V)$. Given a sectorial operator $V \in \mathcal{L}(X)$, it is tempting to try and define $\psi(V)$ by a Cauchy integral

\[ \psi(V) = (2\pi i)^{-1} \int_\gamma dz\, \psi(z)(zI - V)^{-1} \tag{15} \]

where $\gamma$ is the boundary of a suitable truncated sector with vertex at 0. But if $0 \in \sigma(V)$ this integral does not converge absolutely near 0, due to the behaviour $|\psi(z)| \sim (\log |1/z|)^{-1}$ as $z \to 0$ (recall (13)). This behaviour actually shows that $\psi(V)$ is not definable through the usual sectorial functional calculus for $V$ (found in [6, Chapter 2] for instance) unless one imposes further conditions on $V$, such as injectivity. Nonetheless, in the special case where $0 \notin \sigma(V)$, that is, $V^{-1} \in \mathcal{L}(X)$, the integral (15) converges and (as is easily seen) agrees with (14).

Consider the formula of part (II) of Theorem 4.1. By a change of variable $s = t^{-1}$ this rewrites as

\[ \psi(V) = \int_0^\infty d\mu(s)\, V(I + sV)^{-1} \tag{16} \]

where $\mu$ is the measure on $(0, \infty)$ given by

\[ d\mu(s) := \frac{(1 + s)\, ds}{s[(\log s)^2 + \pi^2]}. \]

Because $\int_0^\infty d\mu(s)\, (1 + s)^{-1} < \infty$, Eq. (16) exhibits $\psi(V)$ as a special case of the Hirsch functional calculus for a sectorial operator $V$; see [7,13] for this calculus. In this section, however, we give an independent development based on (14).

Before proving Theorem 4.1, let us record some fundamental properties of $\psi$.

Lemma 4.2. The following statements hold.

(i) $\psi(\overline{\Lambda}_\omega) \subseteq \overline{\Lambda}_\omega$ for all $\omega \in [0, \pi)$.
(ii) $\overline{\operatorname{Rg} \psi} = (\operatorname{Rg} \psi) \cup (\psi_+(0, \infty)) \cup (\psi_-(0, \infty)) \subseteq \mathbb{C} \setminus (-\infty, 0)$.
(iii) The set $\overline{\operatorname{Rg} \psi}$ has asymptotic angle 0.
(iv) $(-1)^{n+1} \psi^{(n)}(x) > 0$ for $x > 0$ and $n \in \mathbb{N}$; in particular, the derivative $\psi' = \psi^{(1)}$ is a completely monotone function on $(0, \infty)$.


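Proof. If $z \in \overline{\Lambda}_\omega$ then $z^\alpha \in \overline{\Lambda}_\omega$ for all $\alpha \in (0, 1)$. Then (i) follows by approximating the integral (12) by Riemann sums. The proofs of (ii) and (iv) are straightforward. To establish (iii), we first show that

\[ \lim_{z \to 0} \arg \psi(z) = 0 \tag{17} \]

where it is understood that $z \in \mathbb{C} \setminus (-\infty, 0]$. For small $r \in (0, 1)$, using (13) we have

\[ \arg \psi(r e^{i\varphi}) = \arg(1 - r e^{i\varphi}) - \arg(-\log r - i\varphi) \]

which converges to 0 as $r \downarrow 0$, uniformly for $\varphi \in (-\pi, \pi)$. This proves (17). Given $\theta \in (0, \pi)$, we can thus choose $\varepsilon > 0$ small enough so that $|\arg \psi(z)| \le \theta$ for all $z \in D(0; \varepsilon) \cap (\mathbb{C} \setminus (-\infty, 0])$. Setting

\[ \delta(\varepsilon) := \inf\big\{ |\psi(z)| : z \in \mathbb{C} \setminus (-\infty, 0],\ |z| > \varepsilon \big\} > 0, \]

we deduce that $|\arg \psi(z)| \le \theta$ for all $z \in \mathbb{C} \setminus (-\infty, 0]$ satisfying $|\psi(z)| < \delta(\varepsilon)$. This shows that $\operatorname{Rg} \psi$ has asymptotic angle 0, and therefore $\overline{\operatorname{Rg} \psi}$ has asymptotic angle 0. □

The reader can get a picture of the set $\overline{\operatorname{Rg} \psi} \subseteq \mathbb{C}$ by drawing the boundary curves $\psi_+$ and $\psi_-$.

We proceed to prove statements (I)–(VI) of Theorem 4.1.

Proof of statement (I). For $f \in X$, it is easy to see from (7) that the map $\alpha \in (0, 1) \mapsto V^\alpha f \in X$ is norm continuous. Then the results of statement (I) follow from the following lemma.

Lemma 4.3. For $V$ as in Theorem 4.1 and $\alpha \in (0, 1)$, one has $V^\alpha \in \mathcal{L}(X)$ and

\[ \|V^\alpha\| \le 1 + M + M\|V\|. \]

Proof. Inserting in (7) the bounds $\|(tI + V)^{-1} V\| = \|I - t(tI + V)^{-1}\| \le 1 + M$ for $t \in (0, 1)$ and $\|(tI + V)^{-1} V\| \le M\|V\| t^{-1}$ for $t \ge 1$, we obtain

\[ \|V^\alpha\| \le \pi^{-1} \sin \alpha\pi \left( \int_0^1 dt\, t^{\alpha-1} (1 + M) + \int_1^\infty dt\, t^{-2+\alpha} M\|V\| \right) = (\alpha\pi)^{-1} (\sin \alpha\pi)(1 + M) + ((1 - \alpha)\pi)^{-1} (\sin \alpha\pi) M\|V\| \le 1 + M + M\|V\| \]

where the last step used the estimate $\sin \alpha\pi = \sin(1 - \alpha)\pi \le \min\{\alpha\pi, (1 - \alpha)\pi\}$. □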
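The estimate above rests on the Balakrishnan formula (7). In the scalar case $V = v > 0$ that formula reduces to $v^\alpha = \pi^{-1} \sin(\alpha\pi) \int_0^\infty dt\, t^{\alpha-1} v/(t + v)$, which can be checked numerically. The sketch below is an illustration only: the function name `balakrishnan_scalar`, the truncation width and the panel count are assumptions, and the substitution $t = e^u$ is a convenience chosen here rather than anything used in the paper.

```python
import math


def balakrishnan_scalar(v: float, alpha: float,
                        half_width: float = 200.0, n: int = 40000) -> float:
    """Simpson approximation of (sin(alpha*pi)/pi) * int_0^inf t^(alpha-1) * v/(t+v) dt.

    The substitution t = exp(u) turns the integrand into
    exp(alpha*u) * v / (exp(u) + v), which decays exponentially as u -> +-infinity,
    so truncating to [-half_width, half_width] loses very little for alpha in (0, 1)
    that is not too close to the endpoints.
    """
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = 2 * half_width / n
    total = 0.0
    for j in range(n + 1):
        u = -half_width + j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * math.exp(alpha * u) * v / (math.exp(u) + v)
    return math.sin(alpha * math.pi) / math.pi * total * h / 3


# Hypothetical test values (v, alpha); each result should be close to v**alpha.
cases = [(2.0, 0.5), (0.1, 0.25), (7.5, 0.8)]
deviations = [abs(balakrishnan_scalar(v, a) - v ** a) for v, a in cases]
```

The same scalar computation also shows why the split of the integral at $t = 1$ in the proof is natural: the factor $v/(t + v)$ is bounded by 1 near $t = 0$ and by $v/t$ for large $t$.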



Proof of statement (II). We use the definition of $\psi(V)$ and (7):

\[ \psi(V) = \pi^{-1} \int_0^1 d\alpha\, \sin \alpha\pi \int_0^\infty dt\, t^{\alpha-1} (tI + V)^{-1} V = \pi^{-1} \int_0^\infty dt \int_0^1 d\alpha\, (\sin \alpha\pi)\, t^\alpha\, t^{-1} (tI + V)^{-1} V. \]

(The interchange of the order of integration is justified because the double integrals converge absolutely; absolute convergence follows from the estimates on $(tI + V)^{-1} V$ given in the proof of Lemma 4.3.) Then statement (II) will follow if we can show that

\[ \int_0^1 d\alpha\, (\sin \alpha\pi)\, t^\alpha = \frac{\pi(t + 1)}{(\log t)^2 + \pi^2} \tag{18} \]

for each $t > 0$. To see (18), observe that the left-hand side of (18) equals the imaginary part of

\[ \int_0^1 d\alpha\, e^{i\alpha\pi} t^\alpha = \lim_{\varphi \uparrow \pi} \int_0^1 d\alpha\, (t e^{i\varphi})^\alpha = \psi_+(t) = \frac{-(t + 1)(\log t - i\pi)}{(\log t)^2 + \pi^2}. \]

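Proof of statement (III). We first show an approximation result. Let

\[ \rho(A, B) = \max\Big\{ \sup_{a \in A} \inf_{b \in B} |a - b|,\ \sup_{b \in B} \inf_{a \in A} |b - a| \Big\} \]

denote the Hausdorff distance between compact sets $A, B \subseteq \mathbb{C}$.

Corollary 4.4. Let $V \in \mathcal{L}(X)$ be sectorial and let $V_\varepsilon := \varepsilon I + V$ for $\varepsilon > 0$. Then

\[ \lim_{\varepsilon \to 0} \|\psi(V_\varepsilon) - \psi(V)\| = 0. \]

Moreover, one has

\[ \lim_{\varepsilon \to 0} \rho\big(\sigma(\psi(V_\varepsilon)), \sigma(\psi(V))\big) = 0 \]

and, for $\lambda \in \mathbb{C}$ with $-\lambda \notin \sigma(\psi(V))$,

\[ \lim_{\varepsilon \to 0} \big\|(\lambda I + \psi(V_\varepsilon))^{-1} - (\lambda I + \psi(V))^{-1}\big\| = 0. \]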
Proof. Applying statement (II) to the operators Vε and V , one has ψ(Vε ) − ψ(V )

∞ dt 0

t +1 (t + ε)I + V −1 V − (tI + V )−1 V 2 2 t[(log t) + π ]


1399

which converges to 0 as ε ↓ 0, by use of the dominated convergence theorem and the estimates of Lemma 4.3. The other parts of the corollary then follow by standard results of perturbation theory; see [10, Theorems IV.3.6, IV.2.23, IV.2.25]. 2 To prove statement (III), consider the sets Aε := {ψ(z): z ∈ σ (Vε ) = σ (V ) + ε}. It is clear that Aε converges in the Hausdorff metric ρ to A := {ψ(z): z ∈ σ (V )} as ε ↓ 0. On the other hand, since Vε−1 ∈ L(X) the operator ψ(Vε ) is given by the Dunford functional calculus for Vε , so Aε = σ (ψ(Vε )) by the standard spectral mapping theorem for that calculus. Then by Corollary 4.4, Aε converges to σ (ψ(V )). Thus A = σ (ψ(V )), proving statement (III). Proof of statement (IV). First consider the case where V −1 ∈ L(X). Fix λ as in the statement, and define an analytic function F on C \ (−∞, 0] by F (z) = (λ + ψ(z))−1 . Define also functions F+ , F− by −1 F+ (t) = lim F teiϕ = λ + ψ+ (t) , ϕ↑π

−1 F− (t) = lim F te−iϕ = λ + ψ− (t) ϕ↑π

for all t > 0. Let γ be the positively oriented boundary of the region S = {z: a |z| b, |arg(z)| ϕ}, for b > a > 0 and ϕ ∈ (0, π). For a small enough and b, ϕ large enough, the contour γ encloses σ (V ), so that −1 −1 λI + ψ(V ) = (2πi) dz F (z)(zI − V )−1 γ −1

b

−1 dt F te−iϕ e−iϕ te−iϕ I − V

= (2πi)

a −1

b

− (2πi)

−1 dt F teiϕ eiϕ teiϕ I − V

a

+ (2πi)−1

ϕ

−1 dθ F beiθ ibeiθ beiθ I − V

−ϕ −1

ϕ

− (2πi)

−1 dθ F aeiθ iaeiθ aeiθ I − V

−ϕ −1

b

dt F− (t) − F+ (t) (tI + V )−1

= (2πi)

a −1

π

+ (2πi)

−π

−1 dθ F beiθ ibeiθ beiθ I − V

1400


− (2πi)−1

π

−1 dθ F aeiθ iaeiθ aeiθ I − V

−π

=: J1 + J2 − J3 where the second-to-last step followed by taking ϕ ↑ π . Now let a ↓ 0 and b ↑ ∞; then the term J2 converges to 0 because lim|z|→∞ F (z) = 0, while the term J3 converges to 0 because limz→0 zF (z) = 0. Also, one computes

F− (t) − F+ (t) =

ψ+ (t) − ψ− (t) (λ + ψ+ (t))(λ + ψ− (t))

=

2πi(t + 1) [(log t)2 + π 2 ](λ + ψ+ (t))(λ + ψ− (t))

=

2πi(t + 1) [t + 1 − λ log t]2 + π 2 λ2

for $t > 0$. Thus we obtain the formulae of (IV) in case $V^{-1} \in \mathcal{L}(X)$.

Finally, for general $V$ we apply the formulae of (IV) to the operators $V_\varepsilon := V + \varepsilon I$, $\varepsilon > 0$. Because of the last statement of Corollary 4.4, by letting $\varepsilon \to 0$ one straightforwardly obtains statement (IV) for $V$.

Proof of statement (V). By taking $V$ to be the zero operator acting in $X = \mathbb{C}$, and applying (IV), we find the identity

\[ \lambda^{-1} = \int_0^\infty dt\, \frac{(t + 1) t^{-1}}{[t + 1 - \lambda \log t]^2 + \pi^2 \lambda^2} \tag{19} \]

for $\lambda > 0$. Now for general $V$ satisfying $\sup_{\lambda > 0} \|\lambda(\lambda I + V)^{-1}\| \le M$, we use (IV) and (19) to obtain

\[ \|(\lambda I + \psi(V))^{-1}\| \le M \int_0^\infty dt\, \frac{(t + 1) t^{-1}}{[t + 1 - \lambda \log t]^2 + \pi^2 \lambda^2} = M \lambda^{-1} \]

for $\lambda > 0$. Thus $\psi(V)$ is sectorial (recall Lemma 2.1).

Proof of statement (VI). Let $\theta \in (0, \pi)$. By Lemma 4.2(iii), we can apply Lemma 2.3 to the set $A = \overline{\operatorname{Rg} \psi}$ with $\omega = 0$. Thus there exist $\varepsilon > 0$, $a > 0$ such that $-\lambda \notin \overline{\operatorname{Rg} \psi}$ and

\[ |\lambda + z| \ge a(|\lambda| + |z|) \]

for all $\lambda \in D(0; \varepsilon) \cap \Lambda_{\pi-\theta}$ and $z \in \overline{\operatorname{Rg} \psi}$. Take $z = \psi_+(t)$ or $z = \psi_-(t)$ in this inequality; since $\psi_-(t) = \overline{\psi_+(t)}$ (the complex conjugate), we find that



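\[
\begin{aligned}
\big[(\log t)^2 + \pi^2\big]\, |\lambda + \psi_+(t)|\, |\lambda + \psi_-(t)|
&\ge a^2 \big[(\log t)^2 + \pi^2\big] \big(|\lambda| + |\psi_+(t)|\big)^2 \\
&\ge a^2 \big[(\log t)^2 + \pi^2\big] \big(|\lambda| + \psi_+(t)\big)\big(|\lambda| + \psi_-(t)\big) \\
&= a^2 \big[(t + 1 - |\lambda| \log t)^2 + \pi^2 |\lambda|^2\big]
\end{aligned}
\]

for all $\lambda \in D(0; \varepsilon) \cap \Lambda_{\pi-\theta}$ and $t > 0$. In the third line we used the inequality $(|\lambda| + |z|)^2 \ge (|\lambda| + z)(|\lambda| + \overline{z})$, $z \in \mathbb{C}$. Applying the displayed estimate in the first formula of (IV) yields

\[ \|(\lambda I + \psi(V))^{-1}\| \le M a^{-2} \int_0^\infty dt\, \frac{(t + 1) t^{-1}}{(t + 1 - |\lambda| \log t)^2 + \pi^2 |\lambda|^2} = M a^{-2} |\lambda|^{-1} \]

for $\lambda \in D(0; \varepsilon) \cap \Lambda_{\pi-\theta}$, with the last step by (19). The proof of statement (VI), and of Theorem 4.1, is complete.

Finally, we give a relation between the types of $V$ and $\psi(V)$.

Corollary 4.5. Let $V \in \mathcal{L}(X)$ be an operator of type $\omega \in [0, \pi)$. Then $\psi(V) \in \mathcal{L}(X)$ is of type $\omega$ and of asymptotic type 0.

Proof. Theorem 4.1(III) and Lemma 4.2(i) yield $\sigma(\psi(V)) \subseteq \psi(\overline{\Lambda}_\omega) \subseteq \overline{\Lambda}_\omega$. Since $\psi(V) \in \mathcal{L}(X)$ and $\psi(V)$ is of asymptotic type 0, Lemma 2.2(ii) shows that $\psi(V)$ is of type $\omega$. □

5. Unbounded operators

The theory of Section 4 is sufficient for our applications to bounded operators in Section 6. It is certainly interesting, however, to extend the theory of the operator $\psi(V) = \int_0^1 d\alpha\, V^\alpha$ to the case where $V$ is unbounded. In this section we outline this extension.

Thus let $V$ be a possibly unbounded sectorial operator, with domain $D(V)$ dense in $X$, which satisfies $\sup_{\lambda > 0} \|\lambda(\lambda I + V)^{-1}\| \le M$. Since $V(tI + V)^{-1} \in \mathcal{L}(X)$ with $\|V(tI + V)^{-1}\| \le 1 + M$ for $t > 0$, we can define operators $S, T \in \mathcal{L}(X)$ by

\[ S := \int_0^1 dt\, \frac{t + 1}{t[(\log t)^2 + \pi^2]}\, V(tI + V)^{-1}, \qquad T := \int_1^\infty dt\, \frac{t + 1}{t[(\log t)^2 + \pi^2]}\, (tI + V)^{-1}. \]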
0

∞ T := 1



Now set

\[ \psi(V) := S + VT \tag{20} \]

with domain $D(\psi(V)) = \{f \in X: Tf \in D(V)\}$. It is clear from Theorem 4.1(II) that this definition of $\psi(V)$ coincides with the previous definition in case $V \in \mathcal{L}(X)$. (The definition (20) is partly inspired by the approach to Hirsch functional calculus found in [13, Section 2].)

It is routine to verify that $\psi(V)$ is a closed operator with $D(V) \subseteq D(\psi(V))$, and that

\[ \psi(V) f = \int_0^\infty dt\, \frac{t + 1}{t[(\log t)^2 + \pi^2]} (tI + V)^{-1} V f = \int_0^1 d\alpha\, V^\alpha f \tag{21} \]

for each $f \in D(V)$. (The last integral in (21) converges because $\|V^\alpha f\| \le (1 + M)\|f\| + M\|Vf\|$ for $\alpha \in (0, 1)$ and $f \in D(V)$, an estimate proved in the same way as Lemma 4.3.) Because $V$ is densely defined and sectorial one has $f = \lim_{\varepsilon \downarrow 0} (I + \varepsilon V)^{-1} f$ for all $f \in X$ (see [6, p. 21]). Using this fact it is straightforward to check that $D(V)$ is a core for $\psi(V)$. Thus $\psi(V)$ coincides with the closure of the operator $W$, where $W$ is defined by $Wf := \int_0^1 d\alpha\, V^\alpha f$ for $f \in D(V)$.

We next claim that $\sigma(\psi(V)) \subseteq \overline{\operatorname{Rg} \psi}$ and that the formulae of Theorem 4.1(IV) hold for $-\lambda \notin \overline{\operatorname{Rg} \psi}$. In fact, the bounded operators $V_{[\varepsilon]} := V(I + \varepsilon V)^{-1} \in \mathcal{L}(X)$, for $\varepsilon \in (0, 1]$, form a uniformly sectorial family of operators, and converge to $V$ as $\varepsilon \downarrow 0$ in the norm resolvent sense; see for example [6, Proposition 2.1.3]. One can then deduce the results of Theorem 4.1(IV) for $V$ by using the corresponding results for $V_{[\varepsilon]}$ and limiting arguments of a standard type.

Statements (V) and (VI) of Theorem 4.1 are also valid for $V$, with no change necessary in the proofs. In particular, $\psi(V)$ is sectorial and of asymptotic type 0.

Let us also mention the following result on semigroups.

Theorem 5.1. Suppose the operator $V$ generates a bounded $C_0$-semigroup $(e^{-tV})_{t \ge 0}$ (in particular, $V$ must be of type $\pi/2$). Then $\psi(V)$ also generates a bounded $C_0$-semigroup $(e^{-t\psi(V)})_{t \ge 0}$.

We only sketch the proof of Theorem 5.1, which is based on standard theory for subordinated semigroups; see [3, Section 2] and references therein. Recall (see Lemma 4.2) that $\psi(0) = 0$, $\psi$ is continuous on $\overline{\Lambda}_{\pi/2}$, analytic on $\Lambda_{\pi/2}$, $\psi(\overline{\Lambda}_{\pi/2}) \subseteq \overline{\Lambda}_{\pi/2}$, and the derivative $\psi'$ is completely monotone on $(0, \infty)$. It follows by standard results (cf. [5, Chapter XIII], [3, Theorem 2]) that there exists a convolution semigroup $(\nu_t)_{t \ge 0}$ of probability measures on $[0, \infty)$ such that

\[ e^{-t\psi(z)} = \int_0^\infty d\nu_t(s)\, e^{-sz} \]

for all $t \ge 0$ and $z \in \overline{\Lambda}_{\pi/2}$. The semigroup $(e^{-t\psi(V)})_{t \ge 0}$ is given by the subordination formula

\[ e^{-t\psi(V)} = \int_0^\infty d\nu_t(s)\, e^{-sV}, \qquad t \ge 0. \]

Further remarks. It appears that there is a spectral mapping theorem for $\psi(V)$ generalizing Theorem 4.1(III); in fact, [13, Theorem 3.2] states a more general spectral mapping theorem for the Hirsch functional calculus.

Finally, we conjecture that $\psi(V)$ is of type $\omega$ whenever $V$ is of type $\omega$. For $V \in \mathcal{L}(X)$ this was established in Corollary 4.5, but the proof there does not apply to unbounded operators.

6. Ritt operators

The main aim of this section is to prove Theorem 1.2. First, to clarify the setting, let us state four possible conditions on a bounded operator $T \in \mathcal{L}(X)$, listed in order of increasing strength. (See also [15,18] for discussions of these and related conditions.)

(i) $T$ is Abel bounded, that is, the resolvent condition (5) holds.
(ii) $T$ is a Kreiss operator, that is, $\sigma(T) \subseteq \overline{D}$ and (6) holds.
(iii) $T$ is power-bounded, that is, $\sup_{n \in \mathbb{N}} \|T^n\| < \infty$.
(iv) $T$ is a Ritt operator, that is, $\sigma(T) \subseteq \overline{D}$ and (4) holds.

Note that the implication from condition (ii) to (i) is trivial, while the implication (iii) ⇒ (ii) follows from the expansion $(\lambda I - T)^{-1} = \sum_{k \ge 0} \lambda^{-k-1} T^k$ for $|\lambda| > 1$. The implication (iv) ⇒ (iii) is part of the following fundamental theorem characterizing Ritt operators. Write $D := \{z \in \mathbb{C}: |z| < 1\}$.

Theorem 6.1. Given $T \in \mathcal{L}(X)$, the following three conditions are equivalent.

(I) $T$ is a Ritt operator.
(II) The operator $T$ is power-bounded, and $\sup_{n \in \mathbb{N}} n \|T^n - T^{n+1}\| < \infty$.
(III) One has $\sigma(T) \subseteq D \cup \{1\}$, and the operator $I - T$ is of type $\omega$ for some $\omega \in [0, \pi/2)$.

The equivalence of conditions (II) and (III) in Theorem 6.1 is due to O. Nevanlinna (see [17, Theorem 2.1]), while the equivalence with (I) was then noticed in [16] and [11].

Note that Abel boundedness of $T$ means precisely that $\sup_{\mu > 0} \|\mu(\mu I + I - T)^{-1}\| < \infty$, that is, the operator $I - T$ is sectorial (recall Lemma 2.1). Thus $\psi(I - T)$ is defined in this case, and we can restate Theorem 1.2 in the following equivalent form.

Theorem 6.2. Let $T \in \mathcal{L}(X)$ be Abel bounded and such that $\sigma(T) \subseteq \overline{D}$. Then the operator $S := I - \psi(I - T)$ is a Ritt operator, and the operator $I - S$ is of asymptotic type 0. If in addition $\sigma(T) = \{1\}$, then $\sigma(S) = \{1\}$ and $I - S$ is of type 0.



The proof of Theorem 6.2 is based on Theorem 4.1, but we need two preliminary lemmas. The first lemma gives a useful variation of condition (III) of Theorem 6.1.

Lemma 6.3. An operator T ∈ L(X) is a Ritt operator if, and only if, σ(T) ⊆ D ∪ {1} and I − T is of asymptotic type ω for some ω ∈ [0, π/2).

Proof. Condition (III) of Theorem 6.1 obviously implies the condition stated in the lemma. Conversely, suppose that σ(T) ⊆ D ∪ {1} and I − T is of asymptotic type ω ∈ [0, π/2). Choose θ1 ∈ (ω, π/2) and ε ∈ (0, 1) such that σ(I − T) ∩ D(0; ε) ⊆ Λθ1. It is geometrically evident that there exists a θ2 ∈ (0, π/2) such that σ(T) \ D(1; ε) ⊆ D \ D(1; ε) ⊆ 1 − Λθ2, and hence σ(I − T) \ D(0; ε) ⊆ Λθ2. Setting θ := max{θ1, θ2} ∈ (ω, π/2) gives σ(I − T) ⊆ Λθ. By Lemma 2.2(ii) the operator I − T is of type θ, so condition (III) of Theorem 6.1 holds. □

We require the next lemma to show that the operator S in Theorem 6.2 has spectrum in D ∪ {1}. Actually, the lemma essentially repeats material of [4, Section 6], but we give a proof for the sake of completeness.

Lemma 6.4. Define $\phi : \overline{D} \to \mathbb{C}$ by
$$\phi(w) = 1 - \psi(1 - w) = 1 - \int_0^1 d\alpha\, (1 - w)^{\alpha}$$
for $w \in \overline{D}$. The following statements hold.
(i) There exist numbers B(k) > 0, for k ∈ N, such that $\sum_{k \ge 1} B(k) = 1$ and
$$\phi(w) = 1 - \psi(1 - w) = \sum_{k \ge 1} B(k) w^k \qquad (22)$$
for all $w \in \overline{D}$.
(ii) One has $\phi(\overline{D}) \subseteq D \cup \{1\}$.

Proof. For each α ∈ (0, 1) there is an expansion
$$1 - (1 - w)^{\alpha} = \sum_{k \ge 1} A_\alpha(k) w^k, \qquad w \in \overline{D},$$
with $A_\alpha(k) > 0$ and $\sum_{k \ge 1} A_\alpha(k) = 1$; in fact, one can explicitly calculate
$$A_\alpha(k) = (k!)^{-1} \frac{d^k}{dw^k}\bigl[1 - (1 - w)^{\alpha}\bigr]\Big|_{w=0} = (k!)^{-1} \alpha(1 - \alpha)(2 - \alpha) \cdots ((k - 1) - \alpha)$$
(see also [4, Lemma 5.2]). By setting $B(k) := \int_0^1 d\alpha\, A_\alpha(k)$ one deduces part (i) of the lemma.

Part (i) implies that $\phi(\overline{D}) \subseteq \overline{D}$. Next suppose that $w, \lambda \in \overline{D}$ satisfy |λ| = 1 and φ(w) = λ. Since $\lambda = \sum_{k \ge 1} B(k) w^k$ is a convex combination, with strictly positive coefficients, of the points $w^k \in \overline{D}$, it follows that $w^k = \lambda$ for all k ∈ N. Hence $w^2 = w = \lambda$ so λ = 1. Part (ii) follows. □
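The expansion used in the proof of Lemma 6.4 can be checked numerically. The following is a minimal sketch, not part of the original argument: it builds the coefficients A_α(k) from the ratio A_α(k+1)/A_α(k) = (k − α)/(k + 1), which follows from the explicit product formula, and compares a partial sum with 1 − (1 − w)^α. The function names `A` and `partial_sum` and the sample values of α and w are choices made for this illustration.

```python
def A(alpha, k_max):
    """Coefficients A_alpha(1), ..., A_alpha(k_max).

    Uses A_alpha(1) = alpha and A_alpha(k+1) = A_alpha(k) * (k - alpha) / (k + 1),
    which is equivalent to the product formula in the proof.
    """
    coeffs = [alpha]
    for k in range(1, k_max):
        coeffs.append(coeffs[-1] * (k - alpha) / (k + 1))
    return coeffs


def partial_sum(alpha, w, k_max):
    """Partial sum of sum_{k >= 1} A_alpha(k) w^k."""
    return sum(a * w ** k for k, a in enumerate(A(alpha, k_max), start=1))


# Sample values chosen for illustration only.
alpha, w = 0.5, 0.5
approx = partial_sum(alpha, w, 200)
exact = 1 - (1 - w) ** alpha
print(approx, exact)
```

For |w| < 1 the series converges geometrically, so a few hundred terms suffice; on the unit circle convergence is only polynomial and many more terms would be needed.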



Proof of Theorem 6.2. Abel boundedness of T means that the operator I − T is sectorial. Let S = I − ψ(I − T) and note that, by Theorem 4.1(III) and Lemma 6.4(ii), $\sigma(S) \subseteq \{1 - \psi(1 - w) : w \in \overline{D}\} \subseteq D \cup \{1\}$. Theorem 4.1(VI) yields that I − S = ψ(I − T) is of asymptotic type 0. By Lemma 6.3 it follows that S is a Ritt operator. Suppose finally that σ(T) = {1}. Theorem 4.1(III) then shows that σ(I − S) = {0}, so σ(S) = {1}, and Lemma 2.2(ii) yields that I − S is of type 0. □

We next mention a sample result relating asymptotic type to behaviour of the powers $T^n$. The proof is a straightforward exercise using the spectral theorem for a normal operator in Hilbert space.

Proposition 6.5. Suppose X is a Hilbert space and T ∈ L(X) a Ritt operator which is normal. The operator I − T is of asymptotic type 0 if, and only if,
$$\limsup_{n \to \infty} n \|T^n - T^{n+1}\| \le e^{-1}. \qquad (23)$$
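To see where the constant $e^{-1}$ in (23) comes from, one can look at a simple model case. The sketch below assumes, purely for illustration, that T is multiplication by t on L²[0, 1]; this operator is not discussed in the text. For it, $\|T^n - T^{n+1}\| = \sup_{0 \le t \le 1} t^n (1 - t)$, with the supremum attained at t = n/(n + 1). The name `ritt_quantity` is hypothetical.

```python
import math


def ritt_quantity(n):
    """n * sup_{0 <= t <= 1} t^n (1 - t), the sup being attained at t = n / (n + 1)."""
    t = n / (n + 1)
    return n * t ** n * (1 - t)


for n in (1, 10, 100, 10**6):
    print(n, ritt_quantity(n), math.exp(-1))
```

In this model case the quantity equals $(n/(n+1))^{n+1}$, which increases towards $e^{-1}$ without exceeding it.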

See [8] and references therein for other results concerning differences $T^n - T^{n+1}$ and the optimal constant $e^{-1}$. Actually, if T is a bounded operator in Banach space such that strict inequality holds in (23), it is known that T splits into a direct sum of an identity operator and an operator with spectral radius less than 1; see [8, Theorem 3.1].

In the rest of this section, we explain how Theorem 6.2 relates to results in [4] about operators subordinated to probabilities. Let $\mathbb{Z}^+ := \{0, 1, 2, \dots\}$ and consider the Banach algebra
$$L^1(\mathbb{Z}^+) := \Bigl\{F : \mathbb{Z}^+ \to \mathbb{C} : \|F\|_{L^1} := \sum_{k \ge 0} |F(k)| < \infty\Bigr\}$$
with multiplication given by convolution ∗, so that
$$(F_1 * F_2)(k) = \sum_{m=0}^{k} F_1(m) F_2(k - m), \qquad k \in \mathbb{Z}^+,$$
for $F_1, F_2 \in L^1(\mathbb{Z}^+)$. Let $P(\mathbb{Z}^+) := \{F \in L^1(\mathbb{Z}^+) : F \ge 0, \sum_{k \ge 0} F(k) = 1\}$ be the set of probabilities on $\mathbb{Z}^+$. For $F \in P(\mathbb{Z}^+)$ and n ∈ N, the nth convolution power $F^{(n)} := F * F * \cdots * F$ is also an element of $P(\mathbb{Z}^+)$.

For a power-bounded operator T ∈ L(X), and any $F \in L^1(\mathbb{Z}^+)$, the 'subordinated' operator $\sum_{k \ge 0} F(k) T^k$ is a well-defined element of L(X). Properties of such subordinated operators were studied in [4]. In particular, the following theorem was proved there.

Theorem 6.6. (See [4].) Given $F \in P(\mathbb{Z}^+)$, the following two conditions are equivalent.
(I) For any complex Banach space X and any power-bounded operator T ∈ L(X), the operator $\sum_{k \ge 0} F(k) T^k$ is a Ritt operator.
(II) F satisfies $\sup_{n \in \mathbb{N}} n \|F^{(n)} - F^{(n+1)}\|_{L^1} < \infty$.
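Condition (II) of Theorem 6.6 is easy to explore numerically for finitely supported probabilities. The sketch below is an illustration only: the helper names and the two sample probabilities (F(0) = F(1) = 1/2, and the point mass at 1) are choices made here, not examples from the text. It computes $n \|F^{(n)} - F^{(n+1)}\|_{L^1}$ for n = 1, ..., 100.

```python
def convolve(f, g):
    """Convolution on Z+ of two finitely supported sequences given as lists."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def l1_diff(f, g):
    """L1 norm of f - g, padding the shorter list with zeros."""
    m = max(len(f), len(g))
    f = f + [0.0] * (m - len(f))
    g = g + [0.0] * (m - len(g))
    return sum(abs(a - b) for a, b in zip(f, g))


def scaled_differences(F, n_max):
    """Return [n * ||F^(n) - F^(n+1)||_1 for n = 1, ..., n_max]."""
    values, power = [], list(F)
    for n in range(1, n_max + 1):
        nxt = convolve(power, F)
        values.append(n * l1_diff(power, nxt))
        power = nxt
    return values


lazy = scaled_differences([0.5, 0.5], 100)   # F(0) = F(1) = 1/2
shift = scaled_differences([0.0, 1.0], 100)  # point mass at 1
print(lazy[3], lazy[24], lazy[99])
print(shift[9])
```

For the point mass the quantity is exactly 2n. For the first probability the values keep growing over this range (roughly like the square root of n), which suggests, without proving, that neither of these two samples satisfies condition (II).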

Let us denote by $A(\mathbb{Z}^+)$ the set of all $F \in P(\mathbb{Z}^+)$ which satisfy the equivalent conditions of Theorem 6.6. Theorem 6.2 then leads to the following result.

Corollary 6.7. Let B be the element of $P(\mathbb{Z}^+)$ such that B(0) = 0 and, for k ≥ 1, B(k) is given by the expansion (22). If T ∈ L(X) is power-bounded then the operator
$$I - \psi(I - T) = \sum_{k \ge 0} B(k) T^k \qquad (24)$$
is a Ritt operator. Moreover, $B \in A(\mathbb{Z}^+)$.

Proof. If the spectral radius of T ∈ L(X) is strictly less than 1, that is, σ(T) ⊆ D, then (24) holds because both sides are given by the Dunford functional calculus for T; recall (22). For a general power-bounded operator T one can then deduce (24) with an approximation argument, considering the operators $T_r := rT$ as r ↑ 1. The remaining assertions of the corollary follow from Theorem 6.2. □

Some other examples of probabilities in the class $A(\mathbb{Z}^+)$ were given in [4]. However, the remarkable feature of the probability $B \in A(\mathbb{Z}^+)$ is that for power-bounded T the operator $I - \sum_{k \ge 0} B(k) T^k$ is always of asymptotic type 0.

Acknowledgments

This work was carried out at Macquarie University and financially supported by the ARC (Australian Research Council).

References

[1] S. Blunck, Maximal regularity of discrete and continuous time evolution equations, Studia Math. 146 (2001) 157–176.
[2] S. Blunck, Analyticity and discrete maximal regularity on Lp-spaces, J. Funct. Anal. 183 (2001) 211–230.
[3] A.S. Carasso, T. Kato, On subordinated holomorphic semigroups, Trans. Amer. Math. Soc. 327 (1991) 867–878.
[4] N. Dungey, Subordinated discrete semigroups of operators, preprint, Macquarie University, arXiv.org e-print archive, http://arxiv.org/abs/0801.4557, 2008.
[5] W. Feller, An Introduction to Probability Theory and Its Applications, vol. II, Wiley, New York, 1966.
[6] M. Haase, The Functional Calculus for Sectorial Operators, Birkhäuser, Basel, 2006.
[7] F. Hirsch, Intégrales de résolvantes et calcul symbolique, Ann. Inst. Fourier (Grenoble) 22 (1972) 239–264.
[8] N. Kalton, S. Montgomery-Smith, K. Oleszkiewicz, Y. Tomilov, Power-bounded operators and related norm estimates, J. London Math. Soc. 70 (2004) 463–478.
[9] T. Kato, Note on fractional powers of linear operators, Proc. Japan Acad. 36 (1960) 94–96.
[10] T. Kato, Perturbation Theory for Linear Operators, Grundlehren Math. Wiss., vol. 132, Springer-Verlag, Berlin, 1980.
[11] Yu. Lyubich, Spectral localization, power boundedness and invariant subspaces under Ritt's type condition, Studia Math. 134 (1999) 153–167.
[12] Yu. Lyubich, The single-point spectrum operators satisfying Ritt's resolvent condition, Studia Math. 145 (2001) 135–142.



[13] C. Martínez Carracedo, M. Sanz Alix, An extension of the Hirsch symbolic calculus, Potential Anal. 9 (1998) 301–319.
[14] C. Martínez Carracedo, M. Sanz Alix, The Theory of Fractional Powers of Operators, North-Holland Math. Stud., vol. 187, North-Holland, Amsterdam, 2001.
[15] A. Montes-Rodríguez, J. Sánchez-Álvarez, J. Zemánek, Uniform Abel–Kreiss boundedness and the extremal behaviour of the Volterra operator, Proc. London Math. Soc. 91 (2005) 761–788.
[16] B. Nagy, J. Zemánek, A resolvent condition implying power boundedness, Studia Math. 134 (1999) 143–151.
[17] O. Nevanlinna, On the growth of the resolvent operators for power bounded operators, in: Linear Operators, in: Banach Center Publ., vol. 38, Polish Acad. Sci., Warsaw, 1997, pp. 247–264.
[18] O. Nevanlinna, Resolvent conditions and powers of operators, Studia Math. 145 (2001) 113–134.
[19] P. Vitse, Functional calculus under the Tadmor–Ritt condition, and free interpolation by polynomials of a given degree, J. Funct. Anal. 210 (2004) 43–72.


Property A and CAT(0) cube complexes

J. Brodzki a,∗, S.J. Campbell a,1, E. Guentner b,2, G.A. Niblo a, N.J. Wright a,3

a School of Mathematics, University of Southampton, Highfield, SO17 1BJ, UK
b University of Hawai‘i at Mānoa, 2565 McCarthy Mall, Honolulu, HI 96822, USA

Received 9 May 2008; accepted 15 October 2008 Available online 28 November 2008 Communicated by Alain Connes

Abstract

Property A is a non-equivariant analogue of amenability defined for metric spaces. Euclidean spaces and trees are examples of spaces with Property A. Simultaneously generalising these facts, we show that finite-dimensional CAT(0) cube complexes have Property A. We do not assume that the complex is locally finite. We also prove that given a discrete group acting properly on a finite-dimensional CAT(0) cube complex the stabilisers of vertices at infinity are amenable.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Property A; Coarse geometry; CAT(0) cube complexes; Amenability

* Corresponding author. E-mail address: [email protected] (J. Brodzki).
1 Supported by an EPSRC Postdoctoral Fellowship.
2 Supported in part by a grant from the U.S. National Science Foundation.
3 Supported in part by a Leverhulme Postdoctoral Fellowship.

0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.10.018

J. Brodzki et al. / Journal of Functional Analysis 256 (2009) 1408–1431

0. Introduction

This paper is devoted to the study of Property A for finite-dimensional CAT(0) cube complexes. These spaces, which are higher-dimensional analogues of trees, appear naturally in many problems in geometric group theory and low-dimensional topology [2,7,13,19,21]. Property A was introduced by Yu as a non-equivariant generalisation of amenability from the context of groups to the context of discrete metric spaces. It was used with great effect in his attack on the Baum–Connes conjecture, in which he proved, among other things, that Gromov's δ-hyperbolic spaces, and hence hyperbolic groups, satisfy Property A, even though they may be very far from amenable [22]. In this paper we prove:

Theorem. Let X be a finite-dimensional CAT(0) cube complex. Equipped with the geodesic metric, X has Property A. The vertex set of X, equipped with the edge-path metric, has Property A.

The proof of the theorem rests on the often used statement that intervals in a CAT(0) cube complex admit combinatorial embeddings into Euclidean spaces. While this fact appears several times in the literature, no proof has been published and we take the opportunity to provide one here. Our proof of this generalises to intervals in measured wall spaces, though we omit the details here as this is not relevant to the current application.

While interval embeddings exist they are far from unique. Any given interval may admit a large number of such embeddings in spaces of varying dimensions and the embeddings may be very different from one another. For each embedding the target interval fibres over the image, and again these fiberings vary considerably. Nonetheless it is a remarkable fact that regardless of how we embed the interval into Euclidean space the norms of the functions we are computing on each fibre are independent of the embedding chosen.

Our technique may well have other applications and we present one here. A group acting properly on an Hadamard space, a building for example, fixing a point in a suitable refinement of the visual boundary is amenable [6]. In the context of CAT(0) cube complexes the natural choice for the boundary is the combinatorial boundary.

Theorem. A countable group acting properly on a finite-dimensional CAT(0) cube complex and fixing a vertex at infinity is amenable.

The advantage to working with the combinatorial boundary rather than the refined Hadamard boundary is that it is typically much smaller.
One might expect the cost of this to be somewhat larger stabilisers at infinity, however our theorem shows that this is not the case. The stabilisers at infinity in both cases are virtually abelian of rank bounded by the dimension of the cube complex. Our main theorem is known to be false for infinite-dimensional cube complexes [16], thus our result is the best possible. While it is already known for finite-dimensional CAT(0) cube complexes admitting a cocompact action by a countable discrete group [5], the approach taken there involved a deformation of the standard embedding of the cube complex in Hilbert space and rested on a functional analytic argument involving the uniform Roe algebra to conclude Property A (see [4,12]). That approach is ultimately unsuitable for non-locally finite complexes. Here, we shall remove the assumption of local finiteness by offering a direct proof of Property A in which the asymptotically invariant functions called for in Yu’s non-equivariant generalisation of the Følner criterion are explicitly constructed. Furthermore we do not require the existence of a group action to make this argument work. The problem of clarifying the relationship between Property A and coarse embeddability (in Hilbert space) has attracted some attention lately, and indeed was a motivation for our study. As a consequence of the above theorem, and the coarse invariance of Property A, we obtain the following corollaries.



Corollary. A metric space that coarsely embeds in a finite-dimensional CAT(0) cube complex has Property A.

Corollary. A countable discrete group acting metrically properly on a finite-dimensional CAT(0) cube complex has Property A.

Indeed to conclude Property A for a group it would, according to our theorem, be sufficient for the group to embed uniformly in a finite-dimensional CAT(0) cube complex, with no equivariance assumptions on the embedding. Putting the corollaries in perspective, one can use an approximation argument to show that a metric space which coarsely embeds in Hilbert space coarsely embeds in an infinite-dimensional CAT(0) cube complex. (This follows from the observations that the infinite-dimensional Euclidean space $\mathbb{R}^\infty$ is an infinite-dimensional cube complex and a dense subset of the Hilbert space $\ell^2$.)

1. Preliminaries

1.1. Property A

In his work on the Novikov conjecture Yu introduced Property A [22]. There are now several variants of the basic definition, all of which are equivalent for spaces of bounded geometry; see for example [9,14,20]. We, however, intend to study spaces that do not have bounded geometry and shall restrict ourselves to the definition below. The definition we have chosen is the strongest, implying all others in full generality.

Before formally introducing Property A we recall some elementary notions from coarse geometry. Let X and Y be metric spaces. A function φ : X → Y is a coarse embedding if:
(a) For every A > 0 there exists B > 0 such that d(x, x′) < A ⇒ d(φ(x), φ(x′)) < B.
(b) For every B > 0 there exists A > 0 such that d(φ(x), φ(x′)) < B ⇒ d(x, x′) < A.

A subset Z ⊂ Y is coarsely dense if there exists C > 0 such that for every y ∈ Y there exists z ∈ Z such that d(y, z) < C. A coarse embedding φ : X → Y is a coarse equivalence if its image is coarsely dense in Y. If there is a coarse equivalence X → Y the metric space X is coarsely equivalent to Y. Although not apparent, coarse equivalence is an equivalence relation.

Proposition 1.1. Every metric space contains a discrete coarsely dense subset. In particular, every metric space is coarsely equivalent to a discrete metric space.

Proof. A straightforward application of Zorn's lemma. □

Definition 1.2. A discrete metric space X has Property A if for every R > 0 and every ε > 0 there exists an S > 0 and a family of finite non-empty subsets $A_x \subset X \times \mathbb{N}$, indexed by x ∈ X, such that:
(a) For every x, x′ ∈ X with d(x, x′) < R we have $\dfrac{|A_x \,\triangle\, A_{x'}|}{|A_x|} < \varepsilon$.
(b) For every (x′, n) ∈ $A_x$ we have d(x, x′) ≤ S.

An arbitrary metric space X has Property A if it contains a discrete coarsely dense subset with Property A.

Remark. We shall see presently that if one discrete coarsely dense subset of a metric space has Property A then every such subset has Property A (see Proposition 1.4 below).

Proposition 1.3. Let X and Y be discrete metric spaces. If X is coarsely embeddable in Y and Y has Property A then X has Property A.

Proof. Let φ : X → Y be a coarse embedding. Let ψ : Y → X be a function satisfying $d(\phi(\psi(y)), y) \le d(\phi(X), y) + 1$. Let R > 0 and ε > 0. Since φ is a coarse embedding there exists R′ > 0 such that
$$d(x, x') < R \;\Rightarrow\; d(\phi(x), \phi(x')) < R'.$$
Since Y has Property A there is a family $\{B_y\}_{y \in Y}$ and an S′ satisfying the conditions of Definition 1.2 for R′ and ε. Define
$$A_x = \bigl\{(x', n) \in X \times \mathbb{N} : n \le |\{(y, m) \in B_{\phi(x)} : \psi(y) = x'\}|\bigr\}$$
and, using once more the fact that φ is a coarse embedding, we obtain S such that
$$d(\phi(x), \phi(x')) \le 2S' + 1 \;\Rightarrow\; d(x, x') \le S.$$
The family $\{A_x\}_{x \in X}$ and S satisfy the conditions of Definition 1.2 for R and ε. Indeed, if (x′, n) ∈ $A_x$ then there exists (y, m) ∈ $B_{\phi(x)}$ such that ψ(y) = x′. It follows that d(φ(x), y) ≤ S′ and
$$d(\phi(x), \phi(x')) \le d(\phi(x), y) + d(y, \phi(x')) = d(\phi(x), y) + d(y, \phi(\psi(y))) \le 2S' + 1,$$
hence also d(x, x′) ≤ S. Finally, suppose d(x, x′) ≤ R. Then d(φ(x), φ(x′)) ≤ R′ so that
$$\frac{|A_x \,\triangle\, A_{x'}|}{|A_x|} \le \frac{|B_{\phi(x)} \,\triangle\, B_{\phi(x')}|}{|B_{\phi(x)}|} < \varepsilon. \qquad \square$$

Proposition 1.4. Property A is a coarse invariant of discrete metric spaces. Precisely, if X and Y are coarsely equivalent discrete metric spaces then X has Property A if and only if Y has Property A.

Proof. If X and Y are coarsely equivalent then each is coarsely embeddable in the other. □

We shall work exclusively with the following characterisation of Property A.



Proposition 1.5. A discrete metric space X has Property A if and only if there exists a sequence of families of finitely supported functions $f_{n,x} : X \to \mathbb{N} \cup \{0\}$, indexed by x ∈ X, and a sequence of constants $S_n > 0$, such that:
(a) For every n and x the function $f_{n,x}$ is supported in $B_{S_n}(x)$.
(b) For every R > 0
$$\frac{\|f_{n,x} - f_{n,x'}\|}{\|f_{n,x}\|} \to 0$$
uniformly on the set {(x, x′): d(x, x′) ≤ R} as n → ∞.
Furthermore, if X is the vertex set of a graph, equipped with the edge-path metric, it is sufficient to require (b) only for R = 1.

Remark. The norm ‖·‖ is the $\ell^1$-norm on the space of (finitely supported) functions on X. This is the only norm we shall encounter.

Proof. Both Property A and the conditions in the proposition are equivalent to the following statement: for every R > 0 and ε > 0 there exists a family of finitely supported functions $f_x : X \to \mathbb{N} \cup \{0\}$, indexed by x ∈ X, and an S > 0 such that $f_x$ is supported in $B_S(x)$, and
$$d(x, x') \le R \;\Rightarrow\; \frac{\|f_x - f_{x'}\|}{\|f_x\|} < \varepsilon.$$
The equivalence with the conditions of the proposition is elementary. The equivalence with Property A is given by mapping $A_x$ to $f_x(y) = |A_x \cap (\{y\} \times \mathbb{N})|$, and conversely by mapping $f_x$ to $A_x = \{(y, n) : 1 \le n \le f_x(y)\}$.

It remains to check that in the case of a metric graph (b) for R = 1 implies (b) for every R > 0. It follows from (b) for R = 1 that
$$\|f_{n,x}\| \, \|f_{n,x'}\|^{-1} \to 1 \qquad (1)$$
as n → ∞, uniformly on the set of pairs of adjacent vertices x and x′. Given two vertices x and x′ with d(x, x′) ≤ R we find an r ≤ R and a sequence of vertices $x = x_0, x_1, \dots, x_r = x'$ comprising an edge-path from x to x′. Writing
$$\|f_{n,x}\| \, \|f_{n,x'}\|^{-1} = \|f_{n,x_0}\| \, \|f_{n,x_1}\|^{-1} \cdot \|f_{n,x_1}\| \, \|f_{n,x_2}\|^{-1} \cdots \|f_{n,x_{r-1}}\| \, \|f_{n,x_r}\|^{-1}$$
it follows that the convergence in (1) is in fact uniform on the set {(x, x′): d(x, x′) ≤ R}. The condition (b) for R is now an application of the triangle inequality: writing
$$\frac{\|f_{n,x} - f_{n,x'}\|}{\|f_{n,x}\|} \le \sum_{i=0}^{r-1} \frac{\|f_{n,x_i} - f_{n,x_{i+1}}\|}{\|f_{n,x}\|} = \sum_{i=0}^{r-1} \frac{\|f_{n,x_i} - f_{n,x_{i+1}}\|}{\|f_{n,x_i}\|} \cdot \frac{\|f_{n,x_i}\|}{\|f_{n,x}\|},$$
note that each summand converges to zero uniformly on the appropriate set. □


Definition 1.6. We shall refer to functions fn,x as in Proposition 1.5 as weight functions. 1.2. CAT(0) cube complexes A cube complex is a polyhedral complex in which the cells are Euclidean cubes of side length one, the attaching maps are isometries identifying the faces of a given cube with cubes of lower dimension and the intersection of two cubes is a common face of each [3,11,18]. One-dimensional cubes are called edges, two-dimensional cubes are called squares and a cube complex is finite-dimensional if there is a bound on the dimension of its cubes. The Euclidean distance between points in a cube is well defined, allowing us to define the length of a rectifiable path. If a cube complex is finite-dimensional it is a complete geodesic metric space with respect to the geodesic metric, in which the distance between two points is defined to be the infimum of the lengths of rectifiable paths connecting them [3]. A finite-dimensional cube complex is a CAT(0) cube complex if the geodesic metric satisfies the CAT(0) inequality, according to which a geodesic triangle in the complex is ‘thinner’ than a triangle in Euclidean space with the same side lengths. Equivalently, the underlying topological space of the complex is simply connected and the complex satisfies Gromov’s link condition [11]; these requirements comprise the definition for infinite-dimensional CAT(0) cube complexes. The vertex set of a cube complex is also equipped with the edge-path metric, in which the distance between vertices is defined to be the minimum number of edges on an edge-path connecting them. A CAT(0) cube complex possesses a rich combinatorial structure. A (geometric) hyperplane H divides the vertex set into two path connected subspaces which we shall refer to as half-spaces. Two hyperplanes provide four possible half-space intersections; the hyperplanes intersect if and only if each of these four half-space intersections is non-empty. 
Two vertices in a half-space are connected by an edge-path that does not cross H whereas an edge-path connecting a vertex in one half-space to one in the other must cross H . In the latter case we say that H separates the two vertices. The set of hyperplanes separating the vertices x and y is denoted H(x, y). The interval from x to y, denoted [x, y], is the intersection of all half-spaces containing both x and y. A set of vertices is convex if whenever it contains both x and y it contains the entire interval [x, y]. Finally, the set of vertices of a CAT(0) cube complex is a median space; the median of the vertices w, x and y is the (unique) vertex in [w, x] ∩ [x, y] ∩ [w, y] [17]. Proposition 1.7. Let X be a CAT(0) cube complex. The restriction of the geodesic metric to the vertex set is coarsely equivalent to the edge-path metric. Moreover, if X is finite-dimensional the vertex set (with either metric) is coarsely equivalent to X. Proof. For the purposes of the proof denote the geodesic metric by d2 and the edge-path metric by d1 . Let x and y be vertices in X. Let x = x0 , x1 , . . . , xn = y be the ordered sequence of vertices on a shortest edge-path from x to y. By the triangle inequality, d2 (x, y)

$$\le \sum_{i=1}^{n} d_2(x_{i-1}, x_i) = n = d_1(x, y).$$

Conversely, given two vertices x, y with d1(x, y) = k the interval between them is a CAT(0) cube complex with exactly k hyperplanes, and therefore embeds as a subcomplex of the k-dimensional unit cube. This embedding is an isometry for the edge-path metrics and a contraction at the level of the geodesic metrics. We denote the image of a point z under this embedding by $\bar{z}$, and abuse notation by letting d1 and d2 refer to the edge-path and geodesic metrics in both cube complexes. We conclude
$$\sqrt{d_1(x, y)} = \sqrt{d_1(\bar{x}, \bar{y})} = d_2(\bar{x}, \bar{y}) \le d_2(x, y).$$
Thus, the metrics are coarsely equivalent as required.

If X is finite-dimensional, the vertex set is $\sqrt{\dim(X)}/2$-dense in X in the geodesic metric. Consequently, the vertex set with the (restriction of the) geodesic metric is coarsely equivalent to X. □

A CAT(0) cube complex also possesses a combinatorial boundary, which we now describe. A function σ assigning to each hyperplane one of its two half-spaces is an ultrafilter if it satisfies the following condition: for two hyperplanes H and K the half-spaces σ(H) and σ(K) have non-trivial intersection. (The condition is vacuous when the hyperplanes H and K themselves intersect.) A vertex x ∈ X defines an assignment of half-spaces to hyperplanes as follows: assign to the hyperplane H the half-space $H_x$ that contains x. The assignment is an ultrafilter since for two hyperplanes H and K we have $x \in H_x \cap K_x$. Further, distinct vertices define distinct ultrafilters; indeed, if x ≠ y then $H_x \ne H_y$ precisely when H separates x and y. We have thus described an injective function from vertices of X to ultrafilters. Ultrafilters that are not in the image of this map are vertices at infinity; these comprise the ideal boundary ∂X of X and we denote $\overline{X} = X \cup \partial X$.

The elementary combinatorics of hyperplanes and half-spaces extends to $\overline{X}$. Let $z, w \in \overline{X}$. Being an ultrafilter, z associates to each hyperplane H one of its two half-spaces; we denote this half-space by $H_z$. A hyperplane H separates z and w if $H_z \ne H_w$; the set of these hyperplanes is denoted H(z, w). We say that $H_z$ contains z, and define the interval [z, w] to be the intersection of all half-spaces containing both z and w.
Observe that $[z, w] \subset \overline{X}$.

Lemma 1.8. Let x, w ∈ X and $z \in \overline{X}$. If w ∈ [x, z] then [x, w] ⊂ [x, z].

Proof. The intersection of convex sets is convex; in particular, intervals are convex. □

Lemma 1.9. Let x, y, w ∈ X and $z \in \overline{X}$. If w ∈ [x, z] and y ∈ [x, w] then H(y, w) ⊂ H(y, z).

Proof. If not there is a hyperplane H such that $H_z = H_y \ne H_w$. We must have either $H_x = H_z$ or $H_x = H_w$, but the first of these statements contradicts w ∈ [x, z] and the second contradicts y ∈ [x, w]. □

The set $\overline{X}$ carries a natural topology. We shall require only the following, which we take as a definition: a sequence of vertices $z_j \in X$ converges to a vertex $z \in \overline{X}$ if and only if for every hyperplane H we have $H \notin H(z_j, z)$ for almost every j ∈ N. (As usual, we say that a property holds for almost every j ∈ N if the set of those j ∈ N for which the property does not hold is finite.) We defer the question of whether or not there exist sequences converging to a given vertex at infinity until later. For now we note the following properties of such sequences.

Lemma 1.10. Let $z_j \in X$, $z \in \overline{X}$ and let $z_j \to z$. A hyperplane H separates y from z precisely when it separates y from almost every $z_j$:
$$H(y, z) = \bigcup_{k} \bigcap_{j \ge k} H(y, z_j).$$

Proof. A hyperplane H separates y from z means that $H_y \ne H_z$; $z_j \to z$ means that for every hyperplane H we have $H_z = H_{z_j}$ for almost every j. □

Lemma 1.11. Let $z_j \in X$, $z \in \overline{X}$ and suppose $z_j \to z$. Let x and y ∈ X. Precisely one of the following two statements holds:
(a) $y \in [x, z_j]$ for almost every j,
(b) $y \notin [x, z_j]$ for almost every j.
In the first case y ∈ [x, z] whereas in the second y ∉ [x, z].

Proof. The first statement fails if and only if $y \notin [x, z_j]$ for infinitely many j; this is clearly implied by the second statement, and we must show it implies the second statement. Now, if $y \notin [x, z_j]$ there exists H ∈ H(x, y) such that $H_x = H_{z_j}$. Assuming this is the case for infinitely many j then, since H(x, y) is finite, there exists H ∈ H(x, y) such that $H_x = H_{z_j}$ for infinitely many j. By the definition of convergence we have $H_z = H_{z_j}$ for almost every j. Thus, $H_x = H_z = H_{z_j}$ for almost every j. In particular, $y \notin [x, z_j]$ for almost every j, and y ∉ [x, z].

It remains only to see that the first statement implies y ∈ [x, z]. But, if y ∉ [x, z] there exists an H ∈ H(x, y) such that $H_x = H_z$. By the definition of convergence, we have $H_z = H_{z_j}$ for almost every j, so that $y \notin [x, z_j]$ for almost every j. □

Lemma 1.12. Let x, y ∈ X and $z \in \overline{X}$. The intersection of the intervals [x, y], [x, z] and [y, z] consists of a single vertex of X.

Proof. To prove uniqueness suppose m ≠ m′ are in [x, y] ∩ [x, z] ∩ [y, z] and let H be a hyperplane separating m and m′. Two of the three half-spaces $H_x$, $H_y$ and $H_z$ must be equal; suppose, for example, $H_x = H_z$. Since $H_m \ne H_{m'}$ only one of these can be $H_x$; if, for example, $H_m \ne H_x$ we have m ∉ [x, z], a contradiction.

To prove existence, let $z_j \in X$ be such that $z_j \to z$. The interval [x, y] is finite and contains the medians $m_j = m(x, y, z_j)$. Hence there exists an m ∈ [x, y] such that $m = m_j \in [x, z_j]$ for infinitely many j. By Lemma 1.11, $m \in [x, z_j]$ for almost every j and m ∈ [x, z]. Similarly, m ∈ [y, z].
□

Let x ∈ X and $z \in \overline{X}$. Denote by $N_z(x)$ the set of hyperplanes separating x and z and adjacent to x. (The notation is inspired by [15]; when z ∈ X the hyperplanes in $N_z(x)$ span the first cube on the normal cube path from x to z.)

Lemma 1.13. Let X be a finite-dimensional CAT(0) cube complex. Let x ∈ X and $z \in \overline{X}$. The cardinality of $N_z(x)$ is bounded by the dimension of X.

Proof. Since a family of pairwise intersecting hyperplanes has a common point of intersection, the cardinality of such a family is bounded by the dimension of X [18, Theorem 4.14]. Thus, it suffices to show that every pair of hyperplanes H and K ∈ $N_z(x)$ intersect. For such H and K we have $H_x \cap K_x \ne \emptyset$. Further the vertex immediately across H from x lies in $H_z \cap K_x$; similarly $H_x \cap K_z \ne \emptyset$. Finally, if $z_j \in X$ converge to z then for almost every j we have $z_j \in H_z \cap K_z$. All four half-space intersections being non-empty, H and K intersect. Compare [15, Proposition 3.3]. □

Finally we consider the geometry of intervals in CAT(0) cube complexes. We shall make extensive use of the following often used result; apparently no complete proof exists in the literature so we also provide a detailed discussion. Compare [8].

We view $\mathbb{R}^d$ as a cube complex in the obvious way; the vertex set is the integer grid $\mathbb{Z}^d$ and the (top-dimensional) cubes are the translates of the unit cube with vertices $\{0, 1\}^d$. An interval in $\mathbb{R}^d$ is a cuboid. Precisely, the interval [x, y] for the vertices $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$ is the product
$$\{x_1, \dots, y_1\} \times \{x_2, \dots, y_2\} \times \cdots \times \{x_d, \dots, y_d\}, \qquad (2)$$
where for simplicity we assume that $x_i \le y_i$ for all i. To include vertices in the combinatorial boundary we allow the possibility that one or both of x and y are vertices at infinity, meaning that $x_i = -\infty$ or $y_i = \infty$ (or both) for some i.

Theorem 1.14. Let X be a CAT(0) cube complex of dimension d and let x and y be vertices in $\overline{X}$. Then the interval [x, y] admits an isometric embedding as an interval $[\bar{x}, \bar{y}]$ in the cube complex $\mathbb{R}^d$.

For purposes of the proof we define a partial order on the set H(x, y) of hyperplanes separating x and y as follows:
$$H \le K \;\Leftrightarrow\; H_x \subset K_x.$$
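The cuboid description of intervals in $\mathbb{Z}^d$ given above is easy to experiment with. The sketch below is an illustration under stated assumptions: vertices are finite integer tuples (no vertices at infinity), a hyperplane is encoded as a pair (axis, t) for the hyperplane lying between coordinate values t and t + 1 on that axis, and all function names are hypothetical. It checks that the number of separating hyperplanes equals the edge-path distance, and that the three intervals determined by three vertices meet in the coordinatewise median.

```python
from itertools import product


def separating_hyperplanes(x, y):
    """Hyperplanes of Z^d separating x and y, encoded as (axis, t)."""
    return {(i, t)
            for i, (a, b) in enumerate(zip(x, y))
            for t in range(min(a, b), max(a, b))}


def interval(x, y):
    """The cuboid [x, y]: the product of the coordinate ranges."""
    ranges = [range(min(a, b), max(a, b) + 1) for a, b in zip(x, y)]
    return set(product(*ranges))


def median(w, x, y):
    """Coordinatewise median of three vertices."""
    return tuple(sorted(c)[1] for c in zip(w, x, y))


def edge_path_distance(x, y):
    """Edge-path (l^1) distance between two vertices of Z^d."""
    return sum(abs(a - b) for a, b in zip(x, y))


# Sample vertices chosen for illustration only.
w, x, y = (0, 0, 3), (2, 5, 1), (4, 1, 1)
m = median(w, x, y)
common = interval(w, x) & interval(x, y) & interval(w, y)
print(m, common, len(separating_hyperplanes(x, y)))
```

Here the intersection of the three intervals is the single vertex (2, 1, 1), matching the median property recalled in Section 1.2.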

Lemma 1.15. Two hyperplanes H and K ∈ H(x, y) are incomparable for the partial order precisely when they intersect.

Proof. The intersections $H_x \cap K_x$ and $H_y \cap K_y$ are always non-empty since $H_x \cap K_x = \emptyset$ contradicts the fact that x defines an ultrafilter; further $H_x \cap K_y = \emptyset \Leftrightarrow H_x \subset K_x$ and $H_y \cap K_x = \emptyset \Leftrightarrow K_x \subset H_x$. Consequently, H and K are incomparable precisely when the four possible intersections of half-spaces determined by H and K are non-empty, in other words, when they intersect. □

Lemma 1.16. The partially ordered set H(x, y) is a disjoint union of d (possibly empty) chains:
$$H(x, y) = P_1 \cup \cdots \cup P_d \quad \text{(disjoint)}.$$

Proof. According to the previous lemma an anti-chain in H(x, y) is a collection of pairwise intersecting hyperplanes. A collection of pairwise intersecting hyperplanes has a common intersection [18, Theorem 4.14]. As a consequence, the cardinality of an anti-chain in H(x, y) is bounded by the dimension of X. With this remark, the result is an immediate consequence of Dilworth's lemma [10, Theorem 1.1]. □



Proof of Theorem 1.14. We shall require, and prove, the result only in the case x is a vertex of X. We use the decomposition of H(x, y) given in the previous lemma to define a function $z \mapsto \bar{z}$ of the interval [x, y] ⊂ X into $\overline{\mathbb{Z}^d}$ (the d-dimensional Euclidean cube complex together with its combinatorial boundary):
$$\bar{z} = (\bar{z}_1, \dots, \bar{z}_d), \qquad \bar{z}_i = |\{H \in P_i : z \in H_y\}|.$$
Note that $\bar{x} = 0$, whereas the coordinates of $\bar{y}$ are $\bar{y}_i = |P_i|$; we allow the possibility that some $\bar{y}_i = \infty$. For every z ∈ [x, y] the coordinates of $\bar{z}$ are finite and further $\bar{z} \in [\bar{x}, \bar{y}]$. The function is an isometric embedding. Indeed, we calculate for v, w ∈ [x, y],
$$d(\bar{v}, \bar{w}) = \sum_{i=1}^{d} |\{H \in P_i : H \in H(v, w)\}| = |H(v, w)| = d(v, w),$$
since H(v, w) ⊂ H(x, y). □

Now we return to the question of the existence of sequences of vertices converging to a given vertex at infinity.

Lemma 1.17. Let x ∈ X and let $z \in \overline{X}$. There exists a sequence $(z_j)_{j \in \mathbb{N}}$ of vertices in [x, z] such that $z_j \to z$.

Proof. We follow the construction of normal cube paths as in [15]. Let $z_0 = x$. Assuming we have constructed the vertex $z_i$ in the sequence we define the vertex $z_{i+1}$ to be the vertex opposite to $z_i$ on the unique cube adjacent to $z_i$ crossed by all the hyperplanes adjacent to $z_i$ separating $z_i$ from z. Since no hyperplane separates $z_{i+1}$ from both x and z all the vertices in the sequence lie in the interval [x, z]. It remains to show that given any hyperplane H there are only finitely many values i for which H separates $z_i$ from z. To see this we note that when H separates $z_i$ from z the set of hyperplanes separating $z_i$ from H is properly contained in the set of hyperplanes separating $z_{i-1}$ from H and that both sets are finite. □

1.3. Combinations

The weights that we give to vertices in a CAT(0) cube complex will be defined in terms of the function $\binom{n}{r}$. A priori this function is defined on pairs of integers with 0 ≤ r ≤ n. It is uniquely determined by the following properties:
(a) $\binom{n}{0} = \binom{n}{n} = 1$ for n ≥ 0.
(b) $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$ for 1 ≤ r < n.

In fact the function $\binom{n}{r}$ can be defined for all pairs of integers. It is the unique function on Z × Z with the following properties:
(a) $\binom{n}{0} = 1$ for n ≥ 0, and $\binom{n}{n} = 1$ for all n ∈ Z.
(b) $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$ for all n, r ∈ Z.


It follows that $\binom{n}{r}$ vanishes when $r > n$ or $r < 0 \le n$. Moreover it satisfies the identity

$$\binom{n}{r} = (-1)^{n+r} \binom{-1-r}{-1-n},$$

which allows one to compute $\binom{n}{r}$ for $r < 0$. We will make use of $\binom{n}{r}$ for $r \ge -1$ and $n \in \mathbb{Z}$, where the function takes exclusively non-negative values. In particular note that $\binom{n}{-1} = (-1)^{n-1} \binom{0}{-1-n}$, which is 1 if $n = -1$ and vanishes otherwise.

2. The Euclidean case

The standard proof that $\mathbb{Z}^d$ has Property A proceeds as follows. The weight function $f_{n,x}$ is the characteristic function of the ball of radius $n$ and center $x$. The variation property, condition (b) of Proposition 1.5, follows from the facts that balls are Følner sets for $\mathbb{Z}^d$ and that the weight functions $f_{n,x}$ are translates of the single function $f_{n,0}$.

In this section we shall offer a different proof of Property A for $\mathbb{Z}^d$. Our proof parallels the standard proof for $\mathbb{Z}^d$, but with several differences, each of which is important for generalising the argument to arbitrary finite-dimensional CAT(0) cube complexes (which do not in general admit an action by an amenable group). First, our weight functions $f_{n,x}$ will be supported on a certain subset of the $n$-ball with center $x$, rather than the whole ball. Second, they will not be characteristic functions. Finally, for fixed $n$ and varying $x$ the $f_{n,x}$ will be defined separately, rather than being translates of a single function.

For the remainder of the section fix an ambient dimension $N \ge d - 1$. In proving that $\mathbb{R}^d$ has Property A we will take $N \ge d$; it will nonetheless be useful to note that the definitions and some of the results remain valid in the case $N = d - 1$, when the codimension is said to be $-1$.

2.1. Construction of weight functions

Our definition of weight functions for $\mathbb{Z}^d$, and indeed for general CAT(0) cube complexes, is motivated by the following example.

Example. Let $X$ be a (simplicial) tree. To show that $X$ has Property A one can use weight functions defined as follows. Fix a basepoint $O \in X$. For each vertex $x \in X$ place weights on the interval $[O, x]$ according to

$$f_{n,x}(y) = \begin{cases} 1 & \text{if } y \ne O \text{ and } d(x, y) \le n, \\ n - d(x, y) + 1 & \text{if } y = O \text{ and } d(x, y) \le n, \\ 0 & \text{if } d(x, y) > n. \end{cases}$$

Heuristically we imagine that a charge of $n + 1$ units has been placed at the vertex $x$ and has then flowed towards the origin, where, ultimately, it 'piles up.'

In higher dimensions we take the same heuristic point of view, that we will 'flow' a charge from a vertex $x$ towards the origin $O$, distributing it across the interval $[O, x]$. As with the tree case, excess charge will collect at the origin, but, unlike the tree case, there will be additional points at which the charge accumulates. This occurs wherever the charge reaches the boundary on its journey towards the origin, losing a degree of freedom in the routes it can travel as it continues to flow. This loss of freedom is quantified as a 'deficiency,' defined below.

Fix a basepoint $O = (0, 0, \dots, 0)$ of $\mathbb{R}^d$.
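The tree example above can be checked on a small case. The following is a minimal sketch, assuming the tree is given by a hypothetical `parent` map pointing each vertex towards the basepoint; the names `tree_weights`, `parent`, and the sample tree are illustrative and not taken from the paper.

```python
def tree_weights(parent, root, x, n):
    """Weights f_{n,x} on the geodesic from x to the basepoint `root`.

    `parent` maps every non-root vertex to its neighbour closer to the root,
    so the interval [root, x] is the chain x, parent[x], ..., root.
    """
    weights = {}
    y, dist = x, 0
    while dist <= n:            # weights vanish beyond distance n from x
        if y == root:
            # the remaining charge piles up at the basepoint
            weights[y] = n - dist + 1
            break
        weights[y] = 1          # one unit of charge is left at each vertex
        y, dist = parent[y], dist + 1
    return weights


def l1_distance(f, g):
    """l^1-distance between two finitely supported functions given as dicts."""
    return sum(abs(f.get(v, 0) - g.get(v, 0)) for v in set(f) | set(g))


# Hypothetical tree: O - a - b - c, with an extra leaf d attached to a.
parent = {"a": "O", "b": "a", "c": "b", "d": "a"}
total = sum(tree_weights(parent, "O", "c", 5).values())
```

In this sketch the total charge is always $n + 1$, whether or not the flow reaches the basepoint, and the weight functions of two adjacent vertices differ by exactly two units in $\ell^1$-norm.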



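Definition 2.1. The deficiency $\delta(y)$ of a vertex $y = (y_1, \dots, y_d) \in \mathbb{Z}^d$ is $N$ minus the number of non-zero coordinates of $y$.

Definition 2.2. For a vertex $x \in \mathbb{Z}^d$ define the weight function $f_{n,x} : \mathbb{Z}^d \to \mathbb{N} \cup \{0\}$ by

$$f_{n,x}(y) = \begin{cases} \dbinom{n - d(x, y) + \delta(y)}{\delta(y)}, & y \in [O, x], \\ 0, & \text{otherwise.} \end{cases}$$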
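Definitions 2.1 and 2.2 can be transcribed almost literally. The sketch below is illustrative only: the function names are hypothetical, $d$ is taken to be the $\ell^1$ (edge-path) metric on $\mathbb{Z}^d$, and `ext_binom` implements the extended binomial coefficient of Section 1.3 through the identity $\binom{n}{r} = (-1)^{n+r}\binom{-1-r}{-1-n}$ for negative $n$.

```python
from math import comb


def ext_binom(n, r):
    """Extended binomial coefficient on Z x Z (Section 1.3 convention)."""
    if n >= 0:
        return comb(n, r) if 0 <= r <= n else 0
    if r > n:                      # vanishes whenever r > n
        return 0
    sign = -1 if (n + r) % 2 else 1
    return sign * comb(-1 - r, -1 - n)


def in_interval(y, x):
    """y lies in [O, x]: every coordinate of y sits between 0 and that of x."""
    return all(min(0, xi) <= yi <= max(0, xi) for yi, xi in zip(y, x))


def deficiency(y, N):
    """Ambient dimension N minus the number of non-zero coordinates of y."""
    return N - sum(1 for yi in y if yi != 0)


def weight(n, x, y, N):
    """The weight f_{n,x}(y) of Definition 2.2 for ambient dimension N."""
    if not in_interval(y, x):
        return 0
    dist = sum(abs(a - b) for a, b in zip(x, y))
    delta = deficiency(y, N)
    return ext_binom(n - dist + delta, delta)
```

With $d = N = 1$ this reproduces the tree weights of the example on the line $\mathbb{Z}$: `weight(3, (2,), (0,), 1)` is $n - d(x, O) + 1 = 2$, and every other vertex of $[O, x]$ within distance $n$ of $x$ receives weight 1.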

We make several remarks on the definition. First, since $N \ge d - 1$ we have $\delta(y) \ge -1$ for all $y$, so that $f_{n,x}$ is non-negative integer valued. Second, $f_{n,x}$ is supported in the interval $[O, x]$ so that it lies in the space of finitely supported functions on the vertex set. Finally, although it is not reflected in the notation, the weight functions depend on the fixed ambient dimension $N$.

The definitions are motivated by the following geometric intuition. Imagine a vertex $x$ in the ambient $\mathbb{R}^N$, all of whose coordinates exceed $n$. The intersection of the interval from $x$ to the origin with the ball of radius $n$ is an $N$-dimensional tetrahedron containing $\binom{n+N}{N}$ points of $\mathbb{Z}^N$. Projecting $\mathbb{R}^N$ onto a subspace $\mathbb{R}^d$ (supposing $d \le N$) the image is a $d$-dimensional tetrahedron, and the fibre over a vertex $y$ will be an $(N - d)$-dimensional tetrahedron, the sides of which have length $n - d(x, y)$. Hence each fibre contains $\binom{n - d(x,y) + N - d}{N - d}$ points of $\mathbb{Z}^N$. We thus take a weighting of $\binom{n - d(x,y) + N - d}{N - d}$ on each point of the image tetrahedron in $\mathbb{Z}^d$. Now suppose that the coordinates of $x$ do not all exceed $n$. Then the tetrahedron will cross outside the interval from $x$ to the origin, and we must further project points of the tetrahedron onto the faces of the interval. This results in higher deficiencies than the standard $N - d$.

2.2. Analysis of weight functions

We conclude our proof of Property A for $\mathbb{Z}^d$. The first step is to show that the norm of the weight function $f_{n,x}$ depends only on $n$ and $N$, and in particular does not depend on $x$ or $d$. Indeed, as the intuition above indicates, the norm is exactly the number of points of $\mathbb{Z}^N$ contained in a tetrahedron of side length $n$.

Proposition 2.3. For every $N \ge d - 1$ and $x \in \mathbb{Z}^d$, the $\ell^1$-norm of $f_{n,x}$ is $\binom{n+N}{N}$.

Proof. In the proof we write $f^d_{n,x}$ in place of $f_{n,x}$. We shall show that for every $0 \le d \le N + 1$ and for every $n \in \mathbb{N}$ and $x \in \mathbb{Z}^d$

$$\sum_{y \in \mathbb{Z}^d} f^d_{n,x}(y) = \binom{n+N}{N}.$$

Recall that for $d$ in the range considered $f^d_{n,x}$ is non-negative and integer-valued.

The proof is by induction on $d$. In the case $d = 0$ we also have $x = O$. The sum has the single term $y = O$ and, since the deficiency is $N$, we have $f^0_{n,O}(O) = \binom{n+N}{N}$.

Suppose $d > 0$ and let $x = (x_1, \dots, x_d) \in \mathbb{Z}^d$. Denote the projection of $z = (z_1, \dots, z_d) \in \mathbb{Z}^d$ to $\mathbb{Z}^{d-1}$ by $\tilde z = (z_2, \dots, z_d)$. The decomposition of the interval $[O, x]$ as a product $[0, x_1] \times [\tilde O, \tilde x]$ gives a natural fibring of $[O, x]$ over $[\tilde O, \tilde x]$. The interval $[0, x_1]$ in $\mathbb{Z}$ is ordered from 0 to $x_1$, which is the usual order in $\mathbb{Z}$ when $x_1 \ge 0$ and is the reverse order when $x_1 < 0$. We enumerate the points in the fibre over $\tilde y$ in $[\tilde O, \tilde x]$ in the order $y^0, y^1, \dots, y^{|x_1|}$ determined by the ordering of the interval $[0, x_1]$. This is illustrated in Fig. 1.

Fig. 1. Projecting to $\mathbb{Z}^{d-1}$.

We shall show that for every $\tilde y \in [\tilde O, \tilde x]$

$$\sum_{j=0}^{|x_1|} f^d_{n,x}(y^j) = f^{d-1}_{n,\tilde x}(\tilde y) \overset{\text{def}}{=} \binom{n - d(\tilde x, \tilde y) + \delta(\tilde y)}{\delta(\tilde y)}. \tag{3}$$

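Once we have established this equality, we can compute the $\ell^1$-norm of $f^d_{n,x}$ as follows:

$$\begin{aligned} \sum_{z \in \mathbb{Z}^d} f^d_{n,x}(z) = \sum_{z \in [O, x]} f^d_{n,x}(z) &= \sum_{\tilde y \in [\tilde O, \tilde x]} \sum_{j=0}^{|x_1|} f^d_{n,x}(y^j) \\ &= \sum_{\tilde y \in [\tilde O, \tilde x]} f^{d-1}_{n,\tilde x}(\tilde y) = \sum_{\tilde y \in \mathbb{Z}^{d-1}} f^{d-1}_{n,\tilde x}(\tilde y) = \binom{n+N}{N}, \end{aligned}$$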

where the equality on the second line follows from Eq. (3) and the final equality follows from the induction hypothesis.

To establish (3) let $\tilde y \in [\tilde O, \tilde x]$; we shall prove by induction on $i$ that, for $0 \le i \le |x_1|$,

$$\sum_{j=0}^{i} f^d_{n,x}(y^j) = \binom{n - d(x, y^i) + \delta(\tilde y)}{\delta(\tilde y)}. \tag{4}$$

In coordinates, $\tilde y = (y_2, \dots, y_d)$ so that $y^0 = (0, y_2, \dots, y_d)$ and $y^j = (\pm j, y_2, \dots, y_d)$ for $j \ge 1$, where we choose $\pm$ according to whether $x_1$ is greater or less than zero. It follows that $\tilde y$ and $y^0$ have the same number of non-zero coordinates, and hence the same deficiency: $\delta(\tilde y) = \delta(y^0)$. Similarly for $j \ge 1$ we find that $\delta(y^j) = \delta(\tilde y) - 1$. In particular, we see that $f^d_{n,x}(y^0) = \binom{n - d(x, y^0) + \delta(\tilde y)}{\delta(\tilde y)}$, yielding Eq. (4) in the case $i = 0$.

Assume that (4) holds for $i$. Split the sum for $i + 1$ into the sum for $i$ and the term for $i + 1$, apply the induction hypothesis and the definition of $f^d_{n,x}$ to obtain

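J. Brodzki et al. / Journal of Functional Analysis 256 (2009) 1408–1431

$$\begin{aligned} \sum_{j=0}^{i+1} f^d_{n,x}(y^j) &= \binom{n - d(x, y^i) + \delta(\tilde y)}{\delta(\tilde y)} + f^d_{n,x}(y^{i+1}) \\ &= \binom{n - d(x, y^i) + \delta(\tilde y)}{\delta(\tilde y)} + \binom{n - d(x, y^{i+1}) + \delta(y^{i+1})}{\delta(y^{i+1})} \\ &= \binom{n - d(x, y^{i+1}) + \delta(\tilde y) - 1}{\delta(\tilde y)} + \binom{n - d(x, y^{i+1}) + \delta(\tilde y) - 1}{\delta(\tilde y) - 1} \\ &= \binom{n - d(x, y^{i+1}) + \delta(\tilde y)}{\delta(\tilde y)}, \end{aligned}$$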

where we have used $\delta(y^{i+1}) = \delta(\tilde y) - 1$ ($i \ge 0$) and $d(x, y^i) = d(x, y^{i+1}) + 1$ in the third equality. The final equality is the binomial coefficient formula from Section 1.3.

The formula (3) follows from (4) taking $i = |x_1|$ and noting that $d(x, y^{|x_1|}) = d(\tilde x, \tilde y)$. □

The second step in our proof of Property A for $\mathbb{Z}^d$ is to estimate the norm of the difference $f_{n,x} - f_{n,x'}$ of weight functions when $x$ and $x'$ are adjacent vertices. We shall see that the norm of this difference depends only on $n$ and $N$, and in particular does not depend on the points $x$ and $x'$, nor on $d$.

Proposition 2.4. For every $N \ge d$ and adjacent vertices $x$ and $x' \in \mathbb{Z}^d$, the $\ell^1$-norm of $f_{n,x} - f_{n,x'}$ is $2\binom{n+N-1}{N-1}$.

Proof. In the proof we shall encounter weight functions for various values of the ambient dimension $N$; we incorporate the ambient dimension into the notation where necessary to avoid confusion, writing, for example, $f^N_{n,x}$.

Let $x$ and $x' \in \mathbb{Z}^d$ be adjacent vertices and suppose, without loss of generality, that $x'$ is closer to the origin than $x$. It follows that the interval $[O, x']$ is contained in $[O, x]$. Further, for every $y \in [O, x']$ we have $x' \in [y, x]$ so that $d(x, y) = d(x', y) + 1$. We calculate the difference, for $y \in [O, x']$,

$$\begin{aligned} f^N_{n,x'}(y) - f^N_{n,x}(y) &= \binom{n - d(x', y) + \delta(y)}{\delta(y)} - \binom{n - (d(x', y) + 1) + \delta(y)}{\delta(y)} \\ &= \binom{n - d(x', y) + \delta(y) - 1}{\delta(y) - 1} = f^{N-1}_{n,x'}(y), \end{aligned}$$

where the last equality results from the observation that replacing $N$ by $N - 1$ has the effect of reducing all deficiencies by one. Note also that $N - 1 \ge d - 1$ so that $f^{N-1}_{n,x'}$ is non-negative valued. We conclude from Proposition 2.3 that

$$\sum_{y \in [O, x']} \bigl| f^N_{n,x}(y) - f^N_{n,x'}(y) \bigr| = \sum_{y \in [O, x']} f^{N-1}_{n,x'}(y) = \bigl\| f^{N-1}_{n,x'} \bigr\| = \binom{n+N-1}{N-1}. \tag{5}$$

Recall that $f^N_{n,x'}$ is supported in $[O, x'] \subset [O, x]$, whereas $f^N_{n,x}$ and the difference $f^N_{n,x} - f^N_{n,x'}$ are supported in $[O, x]$. Applying again Proposition 2.3 we obtain

$$\sum_{y \in [O, x]} f^N_{n,x'}(y) = \sum_{y \in [O, x]} f^N_{n,x}(y),$$

which, by rearranging, leads to

$$\sum_{y \in [O, x']} \bigl( f^N_{n,x'}(y) - f^N_{n,x}(y) \bigr) = \sum_{y \in [O, x] \setminus [O, x']} \bigl( f^N_{n,x}(y) - f^N_{n,x'}(y) \bigr),$$

where all terms in both sums are non-negative. Thus, using (5),

$$\sum_{y \in [O, x]} \bigl| f^N_{n,x}(y) - f^N_{n,x'}(y) \bigr| = 2 \sum_{y \in [O, x']} \bigl| f^N_{n,x}(y) - f^N_{n,x'}(y) \bigr| = 2 \binom{n+N-1}{N-1}. \qquad \Box$$
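Propositions 2.3 and 2.4 lend themselves to a brute-force check on small cases. The sketch below enumerates the interval $[O, x]$ in $\mathbb{Z}^d$ directly; all function names are illustrative, $d$ is taken to be the $\ell^1$ metric, and `ext_binom` follows the extended convention of Section 1.3.

```python
from itertools import product
from math import comb


def ext_binom(n, r):
    """Extended binomial coefficient on Z x Z (Section 1.3 convention)."""
    if n >= 0:
        return comb(n, r) if 0 <= r <= n else 0
    if r > n:
        return 0
    sign = -1 if (n + r) % 2 else 1
    return sign * comb(-1 - r, -1 - n)


def interval(x):
    """All vertices of the interval [O, x] in Z^d."""
    axes = [range(0, xi + 1) if xi >= 0 else range(xi, 1) for xi in x]
    return set(product(*axes))


def weight(n, x, y, N):
    """f_{n,x}(y) as in Definition 2.2, with ambient dimension N."""
    if y not in interval(x):
        return 0
    dist = sum(abs(a - b) for a, b in zip(x, y))
    delta = N - sum(1 for yi in y if yi != 0)
    return ext_binom(n - dist + delta, delta)


def l1_norm(n, x, N):
    """l^1-norm of f_{n,x}; Proposition 2.3 predicts comb(n + N, N)."""
    return sum(weight(n, x, y, N) for y in interval(x))


def l1_diff(n, x, xp, N):
    """l^1-norm of f_{n,x} - f_{n,x'}; Proposition 2.4 predicts 2 * comb(n + N - 1, N - 1)."""
    support = interval(x) | interval(xp)
    return sum(abs(weight(n, x, y, N) - weight(n, xp, y, N)) for y in support)
```

For instance, with $d = N = 2$, $x = (1, 1)$, $x' = (0, 1)$ and $n = 1$, both norms equal $\binom{3}{2} = 3$ and the difference has norm $2\binom{2}{1} = 4$.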

Theorem 2.5. The Euclidean space $\mathbb{R}^d$ has Property A for every $d$.

Proof. As $\mathbb{R}^d$ and $\mathbb{Z}^d$ are coarsely equivalent, it suffices to show that $\mathbb{Z}^d$ has Property A. To accomplish this we shall show that the sequence of families $f_{n,x}$ defined above, together with the sequence of constants $S_n = n$, satisfy the conditions given in Proposition 1.5. The support condition (a) is immediate: $f_{n,x}$ is supported in $B_n(x)$ since $\binom{n - d(x,y) + \delta(y)}{\delta(y)}$ vanishes if $n - d(x, y) + \delta(y) < \delta(y)$. The variation condition (b) follows directly from Propositions 2.3 and 2.4: if $d(x, x') \le 1$ then

$$\frac{\| f_{n,x} - f_{n,x'} \|}{\| f_{n,x} \|} \le \frac{2\binom{n+N-1}{N-1}}{\binom{n+N}{N}} = \frac{2N}{n+N} \to 0$$

as $n \to \infty$, the convergence being uniform on $\{(x, x') : d(x, x') \le 1\}$. □
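The simplification of the quotient of binomial coefficients used in the variation estimate can be confirmed with exact rational arithmetic. This is only a sanity check of the identity $2\binom{n+N-1}{N-1} / \binom{n+N}{N} = 2N/(n+N)$; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def variation_bound(n, N):
    """Exact value of 2 * C(n+N-1, N-1) / C(n+N, N) for N >= 1."""
    return Fraction(2 * comb(n + N - 1, N - 1), comb(n + N, N))
```

For fixed $N$ the bound depends only on $n$, which is what makes the convergence uniform in $x$ and $x'$.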

3. Property A for CAT(0) cube complexes

In this section we shall generalise the techniques of the previous section to prove that a finite-dimensional CAT(0) cube complex has Property A. The construction of the weight functions $f_{n,x}$ generalises in a fairly straightforward manner. The main obstacle to the analysis of the weight functions is the computation of their norm, as in Proposition 2.3. To accomplish this step we shall develop a fibring technique for intervals in a CAT(0) cube complex.

Let $X$ be a CAT(0) cube complex of dimension $d < \infty$. As in the previous section, fix an ambient dimension $N \ge d - 1$.

3.1. Construction of the weight functions

The definition of the weight functions is exactly as in the Euclidean case. Fix a basepoint $O \in X$.

Definition 3.1. The deficiency $\delta(y)$ of a vertex $y \in X$ is the ambient dimension minus the number of hyperplanes both adjacent to $y$ and separating it from $O$:

$$\delta(y) = N - |N_O(y)|.$$

Fig. 2. An interval embedded in the plane.

In the Euclidean case, with basepoint $O = 0$, the cardinality of $N_O(y)$ is the number of non-zero coordinates of $y$. Thus, the definition generalises the one in the previous section.

Definition 3.2. For a vertex $x \in X$ define the weight function $f_{n,x} : X \to \mathbb{N} \cup \{0\}$ by

$$f_{n,x}(y) = \begin{cases} \dbinom{n - d(x, y) + \delta(y)}{\delta(y)}, & y \in [O, x], \\ 0, & \text{otherwise.} \end{cases}$$

As in the Euclidean case, $f_{n,x}$ is a non-negative integer valued function because $N \ge d - 1$ implies that $\delta(y) \ge -1$ for all $y$.

3.2. Fibring intervals

Let $x \in X$. According to Theorem 1.14 we may embed the interval $[O, x]$ into an interval in $\mathbb{Z}^d$. We denote the image of a vertex $y$ by $\bar y$ and assume that the embedding maps the basepoint $O \in X$ to the basepoint $\bar O = (0, \dots, 0) \in \mathbb{Z}^d$; by our convention the coordinates of $\bar x$ are non-negative. Our objective is to fibre the interval $I = [\bar O, \bar x]$ (in $\mathbb{Z}^d$) over the image $J$ of the interval $[O, x]$.

Definition 3.3. Let $y \in [O, x]$ with image $\bar y$. The $i$-coordinate is $y$-bound if the vertex in $\mathbb{Z}^d$ with coordinates $(\bar y_1, \dots, \bar y_i - 1, \dots, \bar y_d)$ is in the image of the embedding. The $i$-coordinate is $y$-free if it is not $y$-bound.

In Fig. 2 the first coordinate of $\bar y$ is $y$-bound, whereas the second coordinate is $y$-free.

Definition 3.4. Let $y \in [O, x]$. The fibre of $I$ over $y$ is the set of vertices $a = (a_1, \dots, a_d) \in \mathbb{Z}^d$ with coordinates satisfying:

(a) if $i$ is $y$-bound then $a_i = \bar y_i$,
(b) if $i$ is $y$-free then $0 \le a_i \le \bar y_i$.

The fibre of $I$ over $y$ is denoted by $F_y$.

Remark. For every $y \in [O, x]$ the fibre $F_y$ is an interval in $\mathbb{R}^d$; in fact if $O_y$ is defined in coordinates by

$$O_{y,i} = \begin{cases} \bar y_i, & i \text{ is } y\text{-bound}, \\ 0, & i \text{ is } y\text{-free}, \end{cases}$$

then $F_y = [O_y, \bar y]$. In particular, for every $y \in [O, x]$ we have $\bar y \in F_y$.

Fig. 3. Medians.

As the terminology suggests we shall show, in a sequence of lemmas, that each fibre contains a unique vertex of $J$, that the fibres of distinct vertices are disjoint, and indeed that they partition $I$.

Lemma 3.5. For every $y \ne z \in [O, x]$ the fibres $F_y$ and $F_z$ are disjoint.

Proof. Let $y \ne z \in [O, x]$. Since $y \ne z$ it follows that either $y \notin [O, z]$ or $z \notin [O, y]$; exchanging $y$ and $z$ if necessary we may assume that $y \notin [O, z]$. Let $m$ be the median of $O$, $y$ and $z$; since $m$ is the unique vertex in $[O, y] \cap [O, z] \cap [y, z]$ it follows that $m \ne y$ and $m \in [O, x]$. Let $H \in \mathcal{H}(y, m)$ be adjacent to $y$. See Fig. 3. It follows from the definition of $m$ that $H \in \mathcal{H}(y, z) \cap \mathcal{H}(y, O)$ so that also $H \notin \mathcal{H}(z, O)$. Let $i$ be the coordinate to which $H$ contributes, and suppose that $H$ is the $p$th hyperplane in the chain. It follows that $\bar z_i \le p - 1$, so that the same inequality holds for every vertex in $F_z$. On the other hand, it follows from the definitions that $\bar y_i = p$ and that $i$ is $y$-bound so that every vertex in $F_y$ has $i$-coordinate equal to $p$. We conclude that $F_y$ and $F_z$ are disjoint. □

Lemma 3.6. For every $a \in I$ there exists $y \in [O, x]$ such that $a \in F_y$.

Proof. Let $\bar y \in [a, \bar x]$ minimise the distance from $a$ to $[a, \bar x] \cap J$. We shall show that $a \in F_y$. The condition $\bar y \in [a, \bar x]$ is equivalent to the inequalities $\bar y_i \ge a_i$, for all coordinates $i$. Consequently, it remains to show that for every $y$-bound coordinate $i$ we have $a_i \ge \bar y_i$. But, if the $i$-coordinate is $y$-bound and $a_i < \bar y_i$ then $(\bar y_1, \dots, \bar y_i - 1, \dots, \bar y_d) \in [a, \bar x] \cap J$ and is nearer $a$ than $\bar y$. This contradicts the choice of $y$. □

From these lemmas and the preceding discussion we obtain:

Proposition 3.7. The interval $I$ is the disjoint union of the fibres of the vertices in $[O, x]$, and each fibre intersects $J$ in exactly one point.

Definition 3.8. For vertices $x$ and $z$ in a CAT(0) cube complex we define $n_z(x) = |N_z(x)|$. Recall that $N_z(x)$ is the set of hyperplanes in $\mathcal{H}(x, z)$ adjacent to $x$.

Remark. We shall employ this notation when $z$ is the basepoint of an interval $[z, y]$ containing $x$. In this case $N_z(x) \subset \mathcal{H}(z, y)$. We record two special cases of this notation. If $a \in I = [\bar O, \bar x]$ then $n_{\bar O}(a)$ is the number of non-zero coordinates of $a$; further, if $y$ is the unique element of $[O, x]$ such that $a \in F_y$, an interval with basepoint $O_y$, then $n_{O_y}(a)$ is the number of non-zero $y$-free coordinates of $a$.


Lemma 3.9. For every $y \in [O, x]$ the number of $y$-bound coordinates is $n_O(y)$; for every $a \in F_y$ we have

$$n_{\bar O}(a) = n_{O_y}(a) + n_O(y). \tag{6}$$

Proof. Suppose the $i$-coordinate is $y$-bound. Obtain $z \in [O, x]$ such that $\bar y$ and $\bar z$ agree except in the $i$-coordinate, for which $\bar z_i = \bar y_i - 1$. Since the embedding $y \mapsto \bar y$ is an isometry, we have $d(y, z) = 1$ and $d(O, y) = d(O, z) + 1$. Hence, the unique hyperplane $H$ separating $y$ and $z$ also separates $O$ and $y$. We have thus described a function $i \mapsto H$ from the set of $y$-bound coordinates to the set of hyperplanes adjacent to $y$ and separating $y$ from $O$. It remains to show it is bijective. For injectivity, we merely observe that the hyperplane $H$ associated to $i$ separates $O$ from $x$, belongs to the chain $P_i$, and the distinct $P_i$ are disjoint. For surjectivity, we observe that if $H$ is adjacent to $y$ and separates $y$ from $O$ then $H$ separates $O$ from $x$ and is the image of the $i$ for which $H$ belongs to the chain $P_i$.

For the equation we need to count the number of non-zero coordinates of $a$. Each of these is either $y$-bound or $y$-free. By the observation above the number of non-zero $y$-free coordinates is precisely $n_{O_y}(a)$. By definition of the fibre all $y$-bound coordinates of $a$ are equal to the corresponding coordinates of $\bar y$, which are themselves non-zero, so the number of these is given by $n_O(y)$. □

Remark. It is instructive to examine the case $a = \bar y$ of the lemma. The number $n_{O_y}(\bar y)$ of non-zero $y$-free coordinates of $\bar y$ is simply the dimension of the interval $F_y$. As a consequence, subtracting both sides of (6) from $N$, we conclude that this dimension is the difference of the deficiencies of $y$ and $\bar y$:

$$\text{dimension of } F_y = \delta(y) - \delta(\bar y).$$

Fig. 4 illustrates the fibring in the case of an interval $[O, x]$ embedded in $\mathbb{R}^3$. The vertex $x$ maps to $\bar x = (2, 1, 2)$, while $\bar O = (0, 0, 0)$. The fibres of the points $w$, $x$, $y$ and $z$ are as indicated:

$$\begin{aligned} F_w &= \{\bar w\}, \\ F_x &= \{(2, 0, 2),\ (2, 1, 2) = \bar x\}, \\ F_y &= \{(2, 0, 0),\ (2, 0, 1),\ (2, 1, 0),\ (2, 1, 1) = \bar y\}, \\ F_z &= \{(0, 0, 2),\ (0, 1, 2),\ (1, 0, 2),\ (1, 1, 2) = \bar z\}. \end{aligned}$$

Fig. 4. Fibring an interval over the embedding.

The vertex $x$ has deficiency one (computed with $N = 3$) while both $y$ and $z$ have deficiency two. However, the corresponding elements $\bar x$, $\bar y$ and $\bar z \in I$ all have deficiency zero. As expected, the fibre $F_x$ has dimension one and the fibres $F_y$ and $F_z$ both have dimension two. The vertex $w$ has deficiency two, as does $\bar w$, so the fibre $F_w$ has dimension zero and is reduced to the single point $\bar w$.

3.3. Analysis of the weight functions

We complete our analysis of the weight functions defined for a CAT(0) cube complex following the strategy we used in the Euclidean case. The following analog of Proposition 2.3 provides the crucial step.

Proposition 3.10. Let $X$ be a CAT(0) cube complex of dimension at most $d$, and let $N \ge d - 1$. For a vertex $x \in X$, the $\ell^1$-norm of the weight function $f_{n,x}$ is $\binom{n+N}{N}$. In particular the norm does not depend on the vertex $x$ nor the complex $X$.

The proof rests on a rather remarkable fact: although the construction of the fibres relies heavily on the non-canonical embedding of an interval of $X$ into a Euclidean interval, the process of summing the weights over each fibre gives a quantity which is independent of all choices. Specifically, summing over the fibre $F_y$ one gets the value of $f_{n,x}(y)$, a quantity that is defined intrinsically without reference to an embedding.

Proof. In the proof we shall encounter weight functions for the complex $X$ and Euclidean spaces of various dimensions, as well as for various values of the ambient dimension. To avoid confusion we incorporate these parameters into the notation writing, for example, $f^{N,X}_{n,x}$.

Fix $x$ and an identification of the interval $[O, x]$ with a subset $J$ of an interval $I = [0, \bar x]$ in $\mathbb{R}^d$. As described above, we shall prove that for $y \in [O, x]$

$$f^{N,X}_{n,x}(y) = \sum_{a \in F_y} f^{N,\mathbb{R}^d}_{n,\bar x}(a). \tag{7}$$

Assuming this equality for the moment, we complete the proof of the proposition. Since $f^{N,X}_{n,x}$ is non-negative valued and supported in the interval $[O, x]$ and since the fibres partition $I$ it follows that

$$\bigl\| f^{N,X}_{n,x} \bigr\| = \sum_{y \in [O, x]} f^{N,X}_{n,x}(y) = \sum_{a \in I} f^{N,\mathbb{R}^d}_{n,\bar x}(a) = \bigl\| f^{N,\mathbb{R}^d}_{n,\bar x} \bigr\| = \binom{n+N}{N},$$

the last equality being Proposition 2.3.

We turn to the proof of (7). Fix a vertex $y \in [O, x]$. If $d(x, y) > n$ then $d(a, \bar x) > n$ for all $a \in F_y$ and both sides of (7) are zero. Therefore, we may assume $d(x, y) \le n$.

The deficiency of $y$ with respect to the basepoint $O$ is denoted $\delta^{N,X}(y)$. A vertex $a \in F_y$ has two deficiencies: one with respect to the basepoint $\bar O \in I$, which we denote $\delta^{N,I}(a)$, and another with respect to the basepoint $O_y$ of the interval $F_y$, which we denote $\delta^{N,F_y}(a)$. As one might expect, these are related by a shift in the ambient dimension according to


$$\delta^{N,I}(a) = \delta^{N_y,F_y}(a), \qquad N_y = N - n_O(y). \tag{8}$$

According to our conventions, the deficiency on the right is defined only when the dimension of $F_y$ does not exceed $N_y + 1$. Indeed, this is the case: $F_y$ has dimension $n_{O_y}(\bar y)$ and applying Lemma 3.9 we conclude

$$n_{O_y}(\bar y) = n_{\bar O}(\bar y) - n_O(y) \le d - n_O(y) \le N + 1 - n_O(y) = N_y + 1.$$

The proof of (8) is straightforward. Indeed, directly from the definitions we have

$$\delta^{N,I}(a) = N - n_{\bar O}(a), \qquad \delta^{N,F_y}(a) = N - n_{O_y}(a), \qquad \delta^{N,X}(y) = N - n_O(y), \tag{9}$$

so that applying Lemma 3.9 we conclude

$$\delta^{N,I}(a) = N - n_{\bar O}(a) = N - n_O(y) - n_{O_y}(a) = \delta^{N_y,F_y}(a).$$

On the basis of (8) we complete the proof of (7). For $a \in F_y$ we have the coordinate-wise inequalities $0 \le a_i \le \bar y_i \le \bar x_i$ so that $d(\bar x, a) = d(\bar x, \bar y) + d(\bar y, a)$. Hence

$$\begin{aligned} f^{N,\mathbb{R}^d}_{n,\bar x}(a) &= \binom{n - d(\bar x, a) + \delta^{N,I}(a)}{\delta^{N,I}(a)} \\ &= \binom{(n - d(\bar x, \bar y)) - d(\bar y, a) + \delta^{N_y,F_y}(a)}{\delta^{N_y,F_y}(a)} \\ &= f^{N_y,F_y}_{n - d(x,y),\,\bar y}(a). \end{aligned}$$

Observe that $n - d(\bar x, \bar y) = n - d(x, y) \ge 0$. Summing over $a \in F_y$, applying Proposition 2.3 and using again the fact that $d(\bar x, \bar y) = d(x, y)$ we get

$$\sum_{a \in F_y} f^{N,\mathbb{R}^d}_{n,\bar x}(a) = \bigl\| f^{N_y,F_y}_{n - d(x,y),\,\bar y} \bigr\| = \binom{n - d(x, y) + N_y}{N_y}.$$

Comparing (8) and (9) we see $N_y = \delta^{N,X}(y)$. A glance at the definition of $f^{N,X}_{n,x}(y)$ reveals that (7) is proved. □
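The fibring argument can be traced on a small hypothetical example that is not taken from the paper: a single square with one extra edge attached at a corner, whose interval $[O, x]$ is assumed to embed in $\mathbb{Z}^2$ as the five-point set `J` below, with $\bar x = (2, 1)$. In this sketch distances in $X$ are computed as $\ell^1$ distances between images (the embedding is isometric), and $n_O(y)$ is computed as the number of $y$-bound coordinates, as Lemma 3.9 permits. All names are illustrative.

```python
from itertools import product
from math import comb


def ext_binom(n, r):
    """Extended binomial coefficient on Z x Z (Section 1.3 convention)."""
    if n >= 0:
        return comb(n, r) if 0 <= r <= n else 0
    if r > n:
        return 0
    sign = -1 if (n + r) % 2 else 1
    return sign * comb(-1 - r, -1 - n)


# Assumed image J of the interval [O, x]; the vertex (2, 0) of I is missing.
J = {(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)}
X_BAR = (2, 1)
I = set(product(range(X_BAR[0] + 1), range(X_BAR[1] + 1)))


def l1(a, b):
    return sum(abs(s - t) for s, t in zip(a, b))


def bound_coords(y):
    """Coordinates i such that lowering the i-th entry of y by one stays in J."""
    return [i for i in range(len(y)) if y[:i] + (y[i] - 1,) + y[i + 1:] in J]


def fibre(y):
    """Fibre F_y of Definition 3.4: bound coordinates fixed, free ones in [0, y_i]."""
    bound = set(bound_coords(y))
    axes = [[y[i]] if i in bound else range(0, y[i] + 1) for i in range(len(y))]
    return set(product(*axes))


def euclid_weight(n, a, N):
    """Weight of a in I for the Euclidean interval [0, X_BAR]."""
    delta = N - sum(1 for ai in a if ai != 0)
    return ext_binom(n - l1(X_BAR, a) + delta, delta)


def complex_weight(n, y, N):
    """Intrinsic weight of y in J, with n_O(y) = number of y-bound coordinates."""
    delta = N - len(bound_coords(y))
    return ext_binom(n - l1(X_BAR, y) + delta, delta)
```

Here the only non-trivial fibre is that of $\bar x$, namely $\{(2, 0), (2, 1)\}$, and summing the Euclidean weights over each fibre recovers the intrinsic weight, as in Eq. (7).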

The following results are direct analogs of Proposition 2.4 and Theorem 2.5; their proofs are identical to the proofs of their analogs in the Euclidean case, except making use of Proposition 3.10 in place of Proposition 2.3.

Proposition 3.11. Let $X$ be a CAT(0) cube complex of dimension at most $d$, and let $N \ge d$. For every pair $x$ and $x'$ of adjacent vertices in $X$ the $\ell^1$-norm of the difference $f_{n,x} - f_{n,x'}$ of weight functions is $2\binom{n+N-1}{N-1}$.

Theorem 3.12. Every finite-dimensional CAT(0) cube complex has Property A.


4. Point stabilisers at infinity

An amenable group of isometries of a locally compact Hadamard space is known either to fix a point at infinity, or to preserve a flat subspace [1]. Under certain circumstances there is a converse to this result, for example when a group acts properly on a proper CAT(0) space the stabiliser of a flat is virtually abelian [3], and if the space is an Hadamard space, e.g., a building, then the stabiliser of a point in a suitable refinement of the visual boundary is necessarily amenable [6]. We shall adapt our construction from the previous section to prove an analogous result for the combinatorial boundary of a CAT(0) cube complex.

Of the numerous characterisations of amenability for countable groups we select the Reiter condition, which is most convenient for our purposes.

Definition 4.1. A countable discrete group $G$ is amenable if there exists a sequence of finitely supported probability measures $\xi_n \in \ell^1(G)$ such that for every $g \in G$

$$\lim_{n \to \infty} \| \xi_n - g \cdot \xi_n \| = 0.$$
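As an illustration of the Reiter condition in the simplest case, and not part of the argument here, take $G = \mathbb{Z}$ and let $\xi_n$ be the uniform probability measure on $\{-n, \dots, n\}$. The sketch below computes $\|\xi_n - g \cdot \xi_n\|$ exactly; the function names are hypothetical, and the action is assumed to be by translation, $(g \cdot \xi)(k) = \xi(k - g)$.

```python
from fractions import Fraction


def uniform_measure(n):
    """Uniform probability measure on {-n, ..., n}, as a dict k -> mass."""
    return {k: Fraction(1, 2 * n + 1) for k in range(-n, n + 1)}


def translate(g, xi):
    """Translation action of g in Z on a measure: (g . xi)(k) = xi(k - g)."""
    return {k + g: mass for k, mass in xi.items()}


def l1_distance(xi, eta):
    """l^1-norm of xi - eta for finitely supported measures."""
    return sum(abs(xi.get(k, 0) - eta.get(k, 0)) for k in set(xi) | set(eta))


def reiter_defect(n, g):
    """The quantity || xi_n - g . xi_n || from Definition 4.1."""
    xi = uniform_measure(n)
    return l1_distance(xi, translate(g, xi))
```

For $|g| \le 2n + 1$ the defect equals $2|g| / (2n + 1)$, which tends to zero as $n \to \infty$ for each fixed $g$.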

An action of a discrete group $G$ on a CAT(0) cube complex $X$ is understood to be cellular. In particular, $G$ acts on the set of vertices of $X$ and on the sets of hyperplanes and half-spaces, and preserves all relevant combinatorics of the complex. In particular, the action on vertices is isometric for the edge-path metric. Further, the action extends to the combinatorial boundary $\partial X$ and to the completion $\bar X$. Not having gone into detail concerning the topology on the combinatorial boundary, we remark only that if $z_j \to z$ then $g \cdot z_j \to g \cdot z$.

Theorem 4.2. Let $G$ be a countable discrete group acting properly on a finite-dimensional CAT(0) cube complex $X$ and let $z$ be a vertex at infinity of $X$. The stabiliser of $z$ in $G$ is amenable, and hence virtually abelian.

Our proof will use the following criterion for amenability.

Proposition 4.3. Let $G$ be a countable group acting properly on a discrete metric space $X$. Assume $X$ admits a sequence of families of $\ell^1$ functions $f_{n,x} : X \to \mathbb{N} \cup \{0\}$, indexed by $x \in X$, such that:

(a) For every pair of points $x$ and $x' \in X$ we have
$$\frac{\| f_{n,x} - f_{n,x'} \|}{\| f_{n,x} \|} \to 0.$$
(b) For every $g \in G$, $x \in X$, and $n \in \mathbb{N}$, $f_{n,gx} = g \cdot f_{n,x}$.

Then $G$ is amenable.

Remark. The properness assumption is equivalent to the action having finite point stabilisers.

Proof. We shall construct a sequence of probability measures as required by Definition 4.1. Fix a base point $x_0 \in X$. Let $T$ be a transversal for the action of $G$ on $X$; thus $T$ contains precisely one point from each $G$-orbit. For each $n \in \mathbb{N}$ and $g \in G$ define

$$\varphi_n(g) = \sum_{x \in T} \frac{f_{n,x_0}(gx)}{|G_x|},$$

where $G_x$ is the stabiliser of $x$. Observe that $f_{n,x_0}$ is finitely supported, being an element of $\ell^1(X)$ with values in $\mathbb{N} \cup \{0\}$. Consequently the sum is finite, as indeed are all sums below. Further, $\varphi_n$ is finitely supported. We compute $\|\varphi_n\|$ as follows:

$$\begin{aligned} \| \varphi_n \| = \sum_{g \in G} \varphi_n(g) &= \sum_{g \in G,\, x \in T} \frac{f_{n,x_0}(gx)}{|G_x|} \\ &= \sum_{x \in T} \sum_{y \in G \cdot x} f_{n,x_0}(y) \sum_{g \in G :\, gx = y} \frac{1}{|G_x|} \\ &= \sum_{x \in T} \sum_{y \in G \cdot x} f_{n,x_0}(y) = \| f_{n,x_0} \|. \end{aligned}$$

A similar calculation yields the following estimate:

$$\| \varphi_n - g \cdot \varphi_n \| \le \| f_{n,x_0} - f_{n,gx_0} \|.$$

We obtain the required probability measure by normalising: $\xi_n = \varphi_n / \| \varphi_n \|$. □

Proof of Theorem 4.2. Let $z$ be a vertex at infinity. Replacing $G$ by the stabiliser of $z$, we assume that $G$ stabilises $z$. Define weight functions as in Definition 3.2, with $z$ playing the role of the base point $O$:

$$f_{n,x}(y) = \begin{cases} \dbinom{n - d(x, y) + \delta(y)}{\delta(y)}, & y \in [x, z], \\ 0, & y \notin [x, z], \end{cases} \tag{10}$$

where the deficiency is defined relative to an ambient dimension $N$ by $\delta(y) = N - |N_z(y)|$. Choosing $N$ to be at least the dimension of the cube complex we ensure that all deficiencies are non-negative so that $f_{n,x}$ takes its values in the non-negative integers.

We first note that the support of $f_{n,x}$ lies in the intersection of the ball of radius $n$ around $x$ with the interval $[x, z]$. While the ball itself may contain infinitely many vertices, Theorem 1.14 tells us that the interval embeds in $\mathbb{R}^n$ for some (finite) $n$, so the intersection is in fact finite, and $f_{n,x}$ is finitely supported, and therefore in $\ell^1$.

The equivariance condition is an immediate consequence of the manner in which $G$ acts on $X$ and the fact that $G$ fixes $z$. We verify the remaining condition through a limiting process. Let $z_j$ be a sequence of vertices of $[x, z]$ converging to $z$; this is possible by Lemma 1.17. Define the weight functions as in Definition 3.2 with $z_j$ playing the role of the base point $O$:

$$f^{z_j}_{n,x}(y) = \begin{cases} \dbinom{n - d(x, y) + \delta^j(y)}{\delta^j(y)}, & y \in [x, z_j], \\ 0, & y \notin [x, z_j], \end{cases} \tag{11}$$

where the deficiency is defined relative to an ambient dimension $N$ by $\delta^j(y) = N - |N_{z_j}(y)|$.

We now show that $f^{z_j}_{n,x} = f_{n,x}$, for almost every $j$. The support of $f_{n,x}$ is contained in $[x, z] \cap B(x, n)$; similarly the support of $f^{z_j}_{n,x}$ is contained in $[x, z_j] \cap B(x, n)$. Applying Lemma 1.8 (with $w = z_j$) we see that the support of $f^{z_j}_{n,x}$ is also contained in $[x, z] \cap B(x, n)$. According to Theorem 1.14 this is a finite set.

It remains to show that for $y \in [x, z] \cap B(x, n)$ we have $f^{z_j}_{n,x}(y) = f_{n,x}(y)$ for almost every $j$. The only terms in (10) and (11) dependent on $j$ are the deficiencies $\delta(y)$ and $\delta^j(y)$. Applying Lemma 1.11 we see that $y \in [x, z_j]$ for almost every $j$ and applying Lemma 1.9 (with $w = z_j$) we conclude that

$$N_{z_j}(y) \subset N_z(y), \tag{12}$$

for almost every $j$. Applying Lemma 1.10 we have

$$N_z(y) = \bigcup_{k} \bigcap_{j \ge k} N_{z_j}(y).$$

Since $N_z(y)$ is a finite set, and the union on the right is increasing, we conclude that

$$N_z(y) \subset N_{z_j}(y), \tag{13}$$

for almost every $j$. Combining (12) and (13) we conclude that $\delta(y) = \delta^j(y)$ for almost every $j$. Comparing the definitions (10) and (11) we are done.

The almost invariance of the $f_{n,x}$ now follows. Let $x$ and $x' \in X$. Let $m = m(x, x', z)$ so that $m \in [x, z] \cap [x', z]$, hence also $[m, z] \subset [x, z] \cap [x', z]$. Let $z_j \to z$ and $z_j \in [m, z]$. We have shown above that if $z_j \to z$ and $z_j \in [x, z]$ then $f^{z_j}_{n,x} = f_{n,x}$ for almost every $j$. Applying this to both $x$ and $x'$ we conclude that if $x$ and $x'$ are adjacent then

$$\| f_{n,x} - f_{n,x'} \| = \bigl\| f^{z_j}_{n,x} - f^{z_j}_{n,x'} \bigr\| = 2 \binom{n+N-1}{N-1}$$

and also

$$\| f_{n,x} \| = \bigl\| f^{z_j}_{n,x} \bigr\| = \binom{n+N}{N},$$

where in each case the first equality holds for almost every $j$ and the second for every $j$ by Propositions 3.11 and 3.10, respectively. The argument now follows exactly the same course as that of Theorem 2.5:

$$\frac{\| f_{n,x} - f_{n,x'} \|}{\| f_{n,x} \|} \le \frac{2\, d(x, x') \binom{n+N-1}{N-1}}{\binom{n+N}{N}} = \frac{2\, d(x, x')\, N}{n + N},$$

which tends to zero uniformly on $\{(x, x') : d(x, x') \le R\}$ as $n \to \infty$. □

Acknowledgments

This work was completed during a visit of the third author to the University of Southampton. The third author thanks the others for their gracious hospitality.

References

[1] S. Adams, W. Ballmann, Amenable isometry groups of Hadamard spaces, Math. Ann. 312 (1) (1998) 183–195; MR 1645958 (99i:53032).
[2] I.R. Aitchison, J.H. Rubinstein, An introduction to polyhedral metrics of nonpositive curvature on 3-manifolds, in: Geometry of Low-dimensional Manifolds, 2, Durham, 1989, in: London Math. Soc. Lecture Note Ser., vol. 151, Cambridge Univ. Press, Cambridge, 1990, pp. 127–161; MR 1171913 (93e:57018).
[3] M. Bridson, A. Haefliger, Metric Spaces of Non-positive Curvature, Grundlehren Math. Wiss., vol. 319, Springer-Verlag, Berlin, 1999.
[4] J. Brodzki, G.A. Niblo, N.J. Wright, Property A, partial translation structures and uniform embeddings in groups, J. London Math. Soc. (2) 76 (2007) 479–497.
[5] S.J. Campbell, G.A. Niblo, Hilbert space compression and exactness of discrete groups, J. Funct. Anal. 222 (2) (2005) 292–305; MR 2132393.
[6] P.-E. Caprace, Amenable groups and Hadamard spaces with a totally disconnected isometry group, arXiv preprint, http://fr.arxiv.org/abs/0705.1980v1, 2007.
[7] R. Charney, An introduction to right-angled Artin groups, Geom. Dedicata 125 (2007) 141–158.
[8] I. Chatterji, K. Ruane, Some geometric groups with rapid decay, Geom. Funct. Anal. 15 (2) (2005) 311–339.
[9] M. Dadarlat, E. Guentner, Uniform embeddability of relatively hyperbolic groups, J. Reine Angew. Math. 612 (2007) 1–15.
[10] R.P. Dilworth, A decomposition theorem for partially ordered sets, Ann. of Math. (2) 51 (1950) 161–166; MR 0032578 (11,309f).
[11] M. Gromov, Hyperbolic groups, in: S. Gersten (Ed.), Essays in Group Theory, in: Math. Sci. Res. Inst. Publ., vol. 8, Springer-Verlag, Berlin, 1987, pp. 75–263.
[12] E. Guentner, J. Kaminker, Exactness and uniform embeddability of discrete groups, J. London Math. Soc. (2) 70 (3) (2004) 703–718; MR 2160829 (2006i:43006).
[13] F. Haglund, D. Wise, Special cube complexes, Geom. Funct. Anal. 17 (5) (2008) 1551–1620.
[14] N. Higson, J. Roe, Amenable actions and the Novikov conjecture, J. Reine Angew. Math. 519 (2000) 143–153.
[15] G. Niblo, L. Reeves, The geometry of cube complexes and the complexity of their fundamental groups, Topology 37 (1998) 621–633.
[16] P. Nowak, Coarsely embeddable metric spaces without property A, J. Funct. Anal. 252 (1) (2007) 126–136.
[17] M.A. Roller, Poc sets, median algebras and group actions. An extended study of Dunwoody’s construction and Sageev’s theorem, http://www.maths.soton.ac.uk/pure/preprints.phtml, 1998.
[18] M. Sageev, Ends of group pairs and non-positively curved cube complexes, Proc. London Math. Soc. 71 (1995) 585–617.
[19] M. Sageev, Codimension-1 subgroups and splittings of groups, J. Algebra 189 (2) (1997) 377–389; MR 1438181 (98c:20071).
[20] Jean-Louis Tu, Remarks on Yu’s “Property A” for discrete metric spaces and groups, Bull. Soc. Math. France 129 (1) (2001) 115–139; MR 1871980 (2002j:58038).
[21] D.T. Wise, Cubulating small cancellation groups, Geom. Funct. Anal. 14 (1) (2004) 150–214; MR 2053602 (2005c:20069).
[22] G. Yu, The Coarse Baum–Connes conjecture for spaces which admit a uniform embedding into Hilbert space, Invent. Math. 139 (2000) 201–240.


On the Fučík spectrum of the Laplacian on a torus

Eugenio Massa a,1, Bernhard Ruf b,∗

a Departamento de Matemática, Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, Campus de São Carlos, Caixa Postal 668, 13560-970, São Carlos SP, Brazil
b Dipartimento di Matematica, Università degli Studi di Milano, Via Saldini 50, 20133 Milano, Italy

Received 13 May 2008; accepted 11 August 2008
Available online 29 August 2008
Communicated by J. Coron

Abstract

We study the Fučík spectrum of the Laplacian on a two-dimensional torus $T^2$. Exploiting the invariance properties of the domain $T^2$ with respect to translations we obtain a good description of large parts of the spectrum. In particular, for each eigenvalue of the Laplacian we will find an explicit global curve in the Fučík spectrum which passes through this eigenvalue; these curves are ordered, and we will show that their asymptotic limits are positive. On the other hand, using a topological index based on the mentioned group invariance, we will obtain a variational characterization of global curves in the Fučík spectrum; also these curves emanate from the eigenvalues of the Laplacian, and we will show that they tend asymptotically to zero. Thus, we infer that the variational and the explicit curves cannot coincide globally, and that in fact many curve crossings must occur. We will give a bifurcation result which partially explains these phenomena.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Fučík spectrum; Variational characterization; Secondary bifurcation; Geometrical $T^2$-index

* Corresponding author. Fax: +390250316090.
E-mail addresses: [email protected] (E. Massa), [email protected] (B. Ruf).
1 The author was partially supported by Fapesp/Brazil.

E. Massa, B. Ruf / Journal of Functional Analysis 256 (2009) 1432–1452

1. Introduction

The notion of Fučík spectrum for the Laplacian was introduced in [13] and [10]: it is defined as the set $\Sigma \subseteq \mathbb{R}^2$ of points $(\lambda^+, \lambda^-)$ for which there exists a nontrivial solution of the problem

$$
\begin{cases}
-\Delta u = \lambda^+ u^+ - \lambda^- u^- & \text{in } \Omega,\\
Bu = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{1.1}
$$

where $\Omega$ is a bounded domain in $\mathbb{R}^n$, $Bu$ stands for the considered boundary conditions, and $u^\pm(x) = \max\{0, \pm u(x)\}$.

If $\Omega = (0, 1)$ and $Bu$ denotes either Dirichlet, Neumann or periodic boundary conditions, then the Fučík spectrum can be explicitly determined: it consists of global curves in $\mathbb{R}^2$, emanating from the points $(\lambda_k, \lambda_k)$, where $(\lambda_k)$ are the eigenvalues of $-u''$ with boundary conditions $Bu$. For instance, for periodic boundary conditions, the Fučík spectrum is given by the following curves, arising from the eigenvalues $(\lambda_k, \lambda_k) = (k^2 4\pi^2, k^2 4\pi^2)$:

$$
\begin{aligned}
\Sigma_0 &:\ \{\lambda^+ = \lambda_0\ (= 0)\} \cup \{\lambda^- = \lambda_0\ (= 0)\},\\
\Sigma_k &:\ \frac{1}{\sqrt{\lambda^+}} + \frac{1}{\sqrt{\lambda^-}} = \frac{2}{\sqrt{\lambda_k}} = \frac{1}{k\pi}, \qquad k = 1, 2, 3, \ldots.
\end{aligned}
\tag{1.2}
$$
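As a small numerical illustration of (1.2), the following sketch solves the curve equation for $\lambda^-$ given $\lambda^+$ in the periodic one-dimensional case. The function names `periodic_eigenvalue` and `lambda_minus_on_curve` are hypothetical helpers introduced here and are not part of the paper.

```python
import math


def periodic_eigenvalue(k: int) -> float:
    """Eigenvalue lambda_k = 4*pi^2*k^2 of -u'' on (0, 1) with periodic conditions."""
    return 4.0 * math.pi ** 2 * k ** 2


def lambda_minus_on_curve(k: int, lam_plus: float) -> float:
    """Solve 1/sqrt(lam_plus) + 1/sqrt(lam_minus) = 2/sqrt(lambda_k) for lam_minus."""
    if k < 1:
        raise ValueError("the curves Sigma_k are defined for k >= 1")
    rhs = 2.0 / math.sqrt(periodic_eigenvalue(k)) - 1.0 / math.sqrt(lam_plus)
    if rhs <= 0.0:
        # The curve only exists for lam_plus > lambda_k / 4 (its vertical asymptote).
        raise ValueError("lam_plus must be larger than lambda_k / 4")
    return 1.0 / rhs ** 2


if __name__ == "__main__":
    lam_1 = periodic_eigenvalue(1)
    # On the diagonal the curve passes through (lambda_1, lambda_1).
    print(lambda_minus_on_curve(1, lam_1), lam_1)
    # For very large lam_plus the value approaches the asymptote lambda_1 / 4.
    print(lambda_minus_on_curve(1, 1.0e12), lam_1 / 4.0)
```

The second printed pair illustrates that each curve $\Sigma_k$ has the horizontal asymptote $\lambda^- = \lambda_k/4$, which follows directly from letting $\lambda^+ \to +\infty$ in (1.2).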

In the case of higher dimensions there exist various results, mainly for Dirichlet boundary conditions, but the results are much less complete; it is known that

• $\Sigma$ is a closed set;
• the lines $\{\lambda^+ = \lambda_0\}$ and $\{\lambda^- = \lambda_0\}$ belong to $\Sigma$ (we will refer to this part of $\Sigma$ as the trivial part), and $\Sigma$ does not contain points with $\lambda^+ < \lambda_0$ or $\lambda^- < \lambda_0$;
• in each square $(\lambda_{k-1}, \lambda_{k+m+1})^2$, where $\lambda_{k-1} < \lambda_k = \cdots = \lambda_{k+m} < \lambda_{k+m+1}$, from the point $(\lambda_k, \lambda_k) \in \Sigma$ arises a continuum composed by a lower and an upper curve, both decreasing (and maybe coincident), see for example [14,16,18,19];
• other points in $\Sigma \cap (\lambda_{k-1}, \lambda_{k+m+1})^2$ can only lie between the two curves (and hence in the open squares $(\lambda_{k-1}, \lambda_k)^2$ and $(\lambda_{k+m}, \lambda_{k+m+1})^2$ there never are points of $\Sigma$).

Something more can be said about the lower part of the continuum $\Sigma_1$ arising from $(\lambda_1, \lambda_1)$, the "first nontrivial curve in $\Sigma$": a variational characterization was found in [11], then developed in [9] and applied to the Neumann case in [1]. In these works it was also proved that for Neumann boundary conditions the asymptotic behavior of this first curve depends on the spatial dimension of the problem: it is asymptotic to the lines $\{\lambda^\pm = \lambda_0\ (= 0)\}$ for $N > 1$, while it is bounded away from $\{\lambda^\pm = \lambda_0\ (= 0)\}$ only for $N = 1$.

In a recent paper, Horák and Reichel [15] have combined analytical and numerical methods in the study of the Fučík spectrum for Eq. (1.1) with Dirichlet boundary conditions. They give a new variational characterization for the lower part of the first curve $\Sigma_1$, and show numerically the occurrence of secondary bifurcation on this curve and of curve crossings.

In [12], the periodic problem in an interval was considered: taking advantage of the intrinsic $S^1$-symmetry of the problem, a variational characterization of Fučík curves parting from the eigenvalues of the problem is given. The continuity of the characterization and the complete knowledge of the Fučík spectrum allow in this case to assert that the variational curves actually describe all the curves of the Fučík spectrum.

The difficulties encountered in characterizing the Fučík spectrum for the Laplacian in higher dimensions suggest that it probably has a complicated structure. On the other hand, the knowledge of the Fučík spectrum is important in the study of nonlinear elliptic equations, for example in the study of problems with "jumping nonlinearities," that is nonlinearities which are asymptotically linear at both $+\infty$ and $-\infty$, but with different slopes. If in addition one has also



a variational characterization of the Fučík spectrum, then other interesting results can be obtained, cf. [8,9,11,12].

In this paper we consider the Fučík spectrum $\Sigma \subset \mathbb{R}^2$ of the Laplacian on a two-dimensional torus $T^2 = (0, 1) \times (0, r)$, that is

$$
\begin{cases}
-\Delta u = \lambda^+ u^+ - \lambda^- u^- & \text{in } \mathbb{R}^2,\\
u(x, y) = u(x + 1, y) = u(x, y + r) & \text{for all } (x, y) \in \mathbb{R}^2.
\end{cases}
\tag{1.3}
$$

An important feature of this problem is its invariance under a compact group action given by

$$
g \cdot u(x, y) = u(x + s, y + t), \qquad g = (s, t) \in G = [0, 1) \times [0, r).
$$

More precisely, denoting $F(u) := -\Delta u - (\lambda^+ u^+ - \lambda^- u^-)$, we have that $F$ is equivariant with respect to the action of $G$, i.e.

$$
F(g \cdot u) = g \cdot F(u) \qquad \text{for all } g \in G.
$$
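The equivariance of $F$ can be observed on a discretized torus. The sketch below is only an illustration under stated assumptions: it replaces $-\Delta$ by the standard five-point periodic finite-difference Laplacian, restricts the translations to whole grid steps, and uses arbitrary sample values for $r$, $\lambda^+$ and $\lambda^-$. The helper names `fucik_residual` and `translate` are hypothetical.

```python
import numpy as np


def fucik_residual(u, lam_plus, lam_minus, hx, hy):
    """Discrete F(u) = -Laplace(u) - (lam_plus*u^+ - lam_minus*u^-) on a periodic grid."""
    lap = (
        (np.roll(u, 1, axis=0) - 2.0 * u + np.roll(u, -1, axis=0)) / hx ** 2
        + (np.roll(u, 1, axis=1) - 2.0 * u + np.roll(u, -1, axis=1)) / hy ** 2
    )
    u_plus = np.maximum(u, 0.0)    # positive part u^+
    u_minus = np.maximum(-u, 0.0)  # negative part u^-
    return -lap - (lam_plus * u_plus - lam_minus * u_minus)


def translate(u, s_steps, t_steps):
    """Grid version of (g.u)(x, y) = u(x + s, y + t), with s, t whole grid steps."""
    return np.roll(u, shift=(-s_steps, -t_steps), axis=(0, 1))


# Sample data (assumed values, chosen only for the illustration).
r = 0.7
nx, ny = 32, 24
hx, hy = 1.0 / nx, r / ny
rng = np.random.default_rng(0)
u = rng.standard_normal((nx, ny))

lhs = fucik_residual(translate(u, 5, 7), 30.0, 12.0, hx, hy)   # F(g.u)
rhs = translate(fucik_residual(u, 30.0, 12.0, hx, hy), 5, 7)   # g.F(u)
equivariant = bool(np.allclose(lhs, rhs))
print(equivariant)
```

The check works because both the periodic difference stencil and the pointwise maps $u \mapsto u^\pm$ commute with cyclic shifts of the grid, which mirrors the continuous argument.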

We note that the eigenvalues of the Laplacian on $T^2$ are explicit, given by

$$
\lambda_{j,k} = j^2 4\pi^2 + k^2 4\pi^2 / r^2, \qquad j, k = 0, 1, 2, \ldots.
\tag{1.4}
$$

In this paper, using the mentioned invariance properties of the Laplacian, we will be able to characterize large parts of the Fučík spectrum, and we will see that it is remarkably complex:

In our first result we prove that from every eigenvalue $(\lambda_{j,k}, \lambda_{j,k})$ there emanates an explicit global curve $\Sigma^{\mathrm{expl}}_{j,k} \subset \Sigma$ belonging to the Fučík spectrum.

As already mentioned, it is useful to have a variational characterization of the Fučík spectrum. In the case of the ODE with periodic boundary conditions, such a characterization was obtained in [12] by using the $S^1$-index due to V. Benci [2], see also [3]. Recently, an analogous $G$-index was introduced by W. Marzantowicz [17] for general compact groups.

In our second result we will use this $G$-index, more precisely the $T^2$-index, to prove that from each eigenvalue $\lambda_{j,k}$ there emanates a global branch of values $\Sigma^{\mathrm{var}}_{j,k} \subset \Sigma$ which can be characterized variationally.

Having proved the existence of an explicit global branch $\Sigma^{\mathrm{expl}}_{j,k}$ and a global variational branch $\Sigma^{\mathrm{var}}_{j,k}$ emanating from the same eigenvalue, one may ask whether these two branches coincide (as is the case in the one-dimensional problem mentioned above). Surprisingly, the answer is no. Indeed, in our third result we will show that all the variational eigenvalues tend asymptotically to zero. Since the explicit branches have positive asymptotes, we conclude that many branch crossings occur: in fact, every explicit curve gets crossed by all variational curves starting above it, i.e. by infinitely many curves.

In our fourth result, we give a (partial) explanation regarding the separation of the variational curve from the explicit branch: we will show that on the first explicit branch there exist infinitely many points of secondary bifurcation. Thus, it is plausible that the variational branch initially follows the explicit branch (as we will show), and then, at the first branching point on the explicit curve, it will follow the branch of secondary bifurcation which will asymptotically go to zero. In addition, we prove that all these secondary bifurcations are symmetry breaking: the solutions on the explicit curves depend (after a change of variables) on a single variable, and hence have an $S^1$-symmetry, while the solutions on the secondary bifurcation branch break this symmetry, and hence their orbit is homeomorphic to the full group $T^2$.



2. Results

In this section we give the precise statements of the results which we will prove in this paper.

In the following we will call $H$ the space $H^1(T^2)$, the standard Sobolev space over the domain $T^2 = (0, 1) \times (0, r)$, with periodic boundary conditions as stated in (1.3). We will denote by $0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_k \le \cdots$ the (ordered) eigenvalues of $-\Delta$ in $H$, while we continue to write $\lambda_{j,k}$ when we refer to the explicit form of an eigenvalue given in (1.4) (see Section 3.1 for more details about the notations used).

First, in Section 4, we prove that from every eigenvalue $\lambda_k$ there departs a global explicit curve belonging to $\Sigma$. This is somewhat surprising: in dimension $N \ge 2$ explicit global curves are only known in special domains departing from certain eigenvalues.

Theorem 2.1. Let $\lambda_k$, $k \ge 0$, be an eigenvalue of the Laplacian on $H$; then

(i) if $k = 0$, then the lines $\{\lambda^+ = \lambda_0\}$ and $\{\lambda^- = \lambda_0\}$ are in $\Sigma$;
(ii) if $k \ge 1$, then the curve

$$
\Sigma^{\mathrm{expl}}_k :\ \frac{1}{\sqrt{\lambda^+}} + \frac{1}{\sqrt{\lambda^-}} = \frac{2}{\sqrt{\lambda_k}}
$$

belongs to $\Sigma$.

Remark 2.2. The above curves form an infinite family of curves in $\Sigma$, one for each eigenvalue. All these curves are similar, so that they never cross, and all have asymptotes at the value equal to one quarter of the corresponding eigenvalue.

In some regions, namely near the diagonal points $(\lambda_k, \lambda_k)$ with $\lambda_k$ corresponding to a two-dimensional eigenspace, we may guarantee that these are the only points in $\Sigma$:

Theorem 2.3. Let $\lambda_k$ be an eigenvalue associated to a two-dimensional eigenspace, and let $\lambda_{k-1}$, $\lambda_{k+1}$, resp., denote the nearest eigenvalues below and above $\lambda_k$: then all the points in $\Sigma \cap (\lambda_{k-1}, \lambda_{k+1})^2$ are on the curve $\Sigma^{\mathrm{expl}}_k$ given in Theorem 2.1.

In Section 5 we consider a different approach: we use variational methods and the mentioned $T^2$-index by W. Marzantowicz [17] to prove

Theorem 2.4. For every $\mu \ge 0$ and $k \ge 1$, one can characterize variationally values $\lambda_k(\mu) > 0$ with the following properties:

– $\lambda_k(0) = \lambda_k$;
– $\Sigma^{\mathrm{var}}_k = \{(\lambda_k(\mu) + \mu, \lambda_k(\mu)) : \mu \ge 0\} \subset \Sigma$;
– each $\lambda_k(\mu)$ depends continuously and monotone decreasingly on $\mu$;
– if $k > h$ then $\lambda_k(\mu) \ge \lambda_h(\mu)$.

Moreover, $\lambda_1(\mu)$ describes the first nontrivial curve, in the sense that for a given $\mu$, no point in $\Sigma$ of the form $(\xi + \mu, \xi)$ exists with $\xi \in (0, \lambda_1(\mu))$.



Theorem 2.4 characterizes a family of curves $\Sigma^{\mathrm{var}}_k$ in $\Sigma$, each one passing through a diagonal point $(\lambda_k, \lambda_k)$.

Remark 2.5. Observe that Theorem 2.3 implies that if $\lambda_k$ has a two-dimensional eigenspace, then as long as $\Sigma^{\mathrm{var}}_k \subset (\lambda_{k-1}, \lambda_{k+1})^2$, it coincides with $\Sigma^{\mathrm{expl}}_k$.

However, this is not always the case, as a consequence of the following theorem:

Theorem 2.6. Let $\lambda_k(\mu)$, $k = 1, 2, \ldots$, denote the variational values obtained in Theorem 2.4. Then

$$
\lim_{\mu \to +\infty} \lambda_k(\mu) = 0.
$$

This theorem says that all variational curves $\Sigma^{\mathrm{var}}_k$ have asymptotes in 0. Since the explicit curves $\Sigma^{\mathrm{expl}}_k$ have asymptotes in $\lambda_k/4$, it follows that $\Sigma^{\mathrm{var}}_k$ and $\Sigma^{\mathrm{expl}}_k$ cannot coincide globally. In fact, it also implies

Corollary 2.7. Each explicit branch $\Sigma^{\mathrm{expl}}_k$ gets crossed by all variational curves $\Sigma^{\mathrm{var}}_j$, with $j > k$.

Recall that Theorem 2.3 says that in a neighborhood of the eigenvalue $\lambda_1$ the variational curve and the explicit branch coincide, while Theorem 2.6 implies that these curves cannot coincide globally. The following theorem gives an explanation for this:

Theorem 2.8. There exists a sequence of secondary bifurcation points $\gamma_j$ on $\Sigma^{\mathrm{expl}}_1$, from which bifurcate global branches $\sigma_{1,j}$, $j \in \mathbb{N}$, consisting of $T^2$-tori of solutions.

3. The linear spectrum of $-\Delta$ in $H = H^1(T^2)$

Let the domain $T^2$ be parameterized as $[0, 1] \times [0, r]$, then it is simple to see that the functions

$$
\phi_{j,k}(x, y) = \cos(j 2\pi x) \cos\Big(k \frac{2\pi}{r} y\Big), \qquad j, k \ge 0,
$$

and their translates are eigenfunctions corresponding to the eigenvalues

$$
\lambda_{j,k} = j^2 4\pi^2 + k^2 \frac{4\pi^2}{r^2}, \qquad j, k \ge 0.
$$

Since the functions above (with their translates) form a complete orthogonal system in $H = H^1(T^2)$, it follows that they are all the possible eigenfunctions. In particular, the first eigenvalue is $\lambda_0 = \lambda_{0,0} = 0$ and its eigenspace is the (one-dimensional) subspace of constant functions. The order of the subsequent eigenvalues depends on the value of $r$, however "in general" (for example if $r^2 \notin \mathbb{Q}$), the values $\lambda_{j,k}$ will be all distinct, with multiplicity 4 if $j \ne 0 \ne k$, and with multiplicity 2 for $\lambda_{j,0}$, $\lambda_{0,k}$. In the particular cases in which some of the $\lambda_{j,k}$ coincide, the multiplicity will be the sum of the corresponding multiplicities.
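The multiplicity rule just described can be tabulated for a given $r$. The following sketch is a hedged illustration: `torus_spectrum` is a hypothetical helper, the grouping of coinciding eigenvalues by rounding is a floating-point heuristic rather than an exact test, and only eigenvalues well below the truncation level reached by `max_index` are guaranteed to have their full multiplicity counted.

```python
import math
from collections import defaultdict


def torus_spectrum(r, max_index, digits=9):
    """Sorted (eigenvalue, multiplicity) pairs for indices 0 <= j, k <= max_index."""
    mult = defaultdict(int)
    for j in range(max_index + 1):
        for k in range(max_index + 1):
            lam = 4.0 * math.pi ** 2 * (j ** 2 + k ** 2 / r ** 2)
            # The pair (j, k) contributes 1 (constants), 2 (one index zero)
            # or 4 (both indices nonzero) independent eigenfunctions.
            m = (2 if j > 0 else 1) * (2 if k > 0 else 1)
            # Rounding groups eigenvalues that coincide up to round-off.
            mult[round(lam, digits)] += m
    return sorted(mult.items())


if __name__ == "__main__":
    # For r = 1 the values lambda_{1,0} and lambda_{0,1} coincide.
    for lam, m in torus_spectrum(1.0, 3)[:4]:
        print(lam, m)
    # For r = 0.5 the first nonzero eigenvalue comes from (1, 0) alone.
    for lam, m in torus_spectrum(0.5, 3)[:4]:
        print(lam, m)
```

For $r = 1$ the first nonzero eigenvalue $4\pi^2$ collects the contributions of $\lambda_{1,0}$ and $\lambda_{0,1}$, so its multiplicity is the sum $2 + 2$, in line with the last sentence of the paragraph above.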



3.1. Notation for the spectrum

We will use the following notation: $\lambda_0 = \lambda_{0,0} = 0$ is the first eigenvalue, corresponding to the eigenspace generated by a constant positive function $\phi_0$; then, since the following eigenvalues are always of even multiplicity, we will denote by $\lambda_i$ ($i > 0$) the nondecreasing sequence of eigenvalues, repeated according to half of their multiplicity. Moreover, one may always choose the eigenfunctions in each eigenspace in such a way that they are mutually orthogonal and that to each $i$ correspond two orthogonal eigenfunctions (differing just by a translation) which we will denote by $\phi_i^a$ and $\phi_i^b$; we will also maintain the notation with two indices when needed, denoting the corresponding eigenfunctions with the 4 indices $a, b, c, d$. For example, let $\lambda_i = \lambda_{j,k}$ with $j, k \ne 0$, then one possible choice of the eigenfunctions is

$$
\begin{aligned}
\phi^a_{j,k} = \phi^a_i &= \frac{2}{\sqrt{r}} \cos(j 2\pi x) \cos\Big(k \frac{2\pi}{r} y\Big), &
\phi^b_{j,k} = \phi^b_i &= \frac{2}{\sqrt{r}} \sin(j 2\pi x) \cos\Big(k \frac{2\pi}{r} y\Big),\\
\phi^c_{j,k} = \phi^a_{i+1} &= \frac{2}{\sqrt{r}} \cos(j 2\pi x) \sin\Big(k \frac{2\pi}{r} y\Big), &
\phi^d_{j,k} = \phi^b_{i+1} &= \frac{2}{\sqrt{r}} \sin(j 2\pi x) \sin\Big(k \frac{2\pi}{r} y\Big).
\end{aligned}
$$

Also, we will assume that these eigenfunctions are chosen with unitary $L^2$ norm.

4. Explicit curves in the Fučík spectrum (proof of Theorem 2.1)

In this section we will obtain the explicit curves claimed in Theorem 2.1. These curves will always correspond to nontrivial solutions having a one-dimensional behavior, that is they will depend on a unique variable after a suitable reparameterization of $T^2$.

For point (i) in Theorem 2.1, it is simple to see that the vertical and the horizontal line through $(\lambda_0, \lambda_0) = (0, 0)$ are in $\Sigma$: indeed, $\phi_0$ satisfies Eq. (1.3) with $\lambda^+ = 0$ and any $\lambda^-$, while $-\phi_0$ satisfies it with $\lambda^- = 0$ and any $\lambda^+$.

We now look for other elements in $\Sigma$, in order to obtain point (ii) in Theorem 2.1. Recall that the equation $-u'' = \lambda^+ u^+ - \lambda^- u^-$ has 1-periodic nontrivial solutions for $(\lambda^+, \lambda^-)$ satisfying

$$
\frac{1}{\sqrt{\lambda^+}} + \frac{1}{\sqrt{\lambda^-}} = \frac{1}{n\pi},
\tag{4.1}
$$

corresponding to functions of the form $\sin(\sqrt{\lambda^+}\, x)$ where positive and $\sin(\sqrt{\lambda^-}\, x)$ where negative, having $2n$ zeros (and thus having minimal period $1/n$).

In the torus $[0, 1] \times [0, r]$, we may use the change of variables

$$
\begin{cases}
z = jx + \dfrac{ky}{r},\\[4pt]
w = jx - \dfrac{ky}{r},
\end{cases}
\qquad \text{with } k, j \in \mathbb{N} \text{ mutually prime;}
$$

observe that the periodicity condition

$$
u(x, y) = u(x + 1, y) = u(x, y + r) \qquad \text{for any } x, y \in \mathbb{R},
$$

becomes

$$
U(z, w) = U(z + j, w + j) = U(z + k, w - k) \qquad \text{for any } z, w \in \mathbb{R}.
$$

We look now for solutions depending on only one of these two variables.

If $u(x, y) = U(z) = U(jx + \frac{ky}{r})$, then $\Delta u = U''(z)\,(j^2 + \frac{k^2}{r^2})$ and $U(z) = U(z + j) = U(z + k)$: since we chose $j$, $k$ mutually prime this implies $U(z) = U(z + 1)$.

In the same way, if $u(x, y) = U(w) = U(jx - \frac{ky}{r})$, then $\Delta u = U''\,(j^2 + \frac{k^2}{r^2})$ and $U(w) = U(w + 1)$.

We conclude that any solution of the one-dimensional problem

$$
\begin{cases}
-U'' = \Big(j^2 + \dfrac{k^2}{r^2}\Big)^{-1} \big(\lambda^+ U^+ - \lambda^- U^-\big),\\[4pt]
U \ \text{1-periodic},
\end{cases}
\tag{4.2}
$$

will correspond to the two solutions of $-\Delta u = \lambda^+ u^+ - \lambda^- u^-$ in the torus of the form $u(x, y) = U(jx \pm \frac{ky}{r})$.

These explicit solutions provide the explicit curves in $\Sigma$ claimed in Theorem 2.1: indeed, (4.2) has a solution for $\lambda^\pm / (j^2 + \frac{k^2}{r^2})$ satisfying (4.1), so we obtain, for $n, j, k \in \mathbb{N}$ ($j$, $k$ mutually prime) the curves

$$
\frac{1}{\sqrt{\lambda^+}} + \frac{1}{\sqrt{\lambda^-}} = \frac{1}{n\pi \sqrt{j^2 + \frac{k^2}{r^2}}} = \frac{2}{\sqrt{\lambda_{nj,nk}}}.
$$

As claimed in the theorem, each of these curves passes through the diagonal point corresponding to the eigenvalue $\lambda_{nj,nk} = 4n^2\pi^2 (j^2 + \frac{k^2}{r^2})$, with eigenfunctions $\cos(2\pi((nj)x \pm \frac{(nk)y}{r}))$: indeed, these split as $\cos(2\pi nj x)\cos(2\pi \frac{nky}{r}) \mp \sin(2\pi nj x)\sin(2\pi \frac{nky}{r})$, that is, they are a linear combination of the four separated variable eigenfunctions $\phi^{a,b,c,d}_{nj,nk}$.

Finally, for $j$ or $k$ equal to zero, one does not need any change of variable to obtain the claim by the same technique.
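The reduction to the one-dimensional problem (4.2) can be sanity-checked numerically. The sketch below, with hypothetical helper names and arbitrary sample values of $j$, $k$, $n$, $r$, picks a point on the explicit curve through $(\lambda_{nj,nk}, \lambda_{nj,nk})$ and verifies that the piecewise-sine profile of (4.2), made of one positive arch of length $\pi/\sqrt{a}$ and one negative arch of length $\pi/\sqrt{b}$ with $a = \lambda^+/(j^2 + k^2/r^2)$ and $b = \lambda^-/(j^2 + k^2/r^2)$, has minimal period $1/n$.

```python
import math


def torus_eigenvalue(j, k, r):
    """lambda_{j,k} = 4*pi^2*(j^2 + k^2/r^2), as in (1.4)."""
    return 4.0 * math.pi ** 2 * (j ** 2 + k ** 2 / r ** 2)


def lambda_minus_explicit(j, k, n, r, lam_plus):
    """lam_minus on the explicit curve through (lambda_{nj,nk}, lambda_{nj,nk})."""
    rhs = 2.0 / math.sqrt(torus_eigenvalue(n * j, n * k, r)) - 1.0 / math.sqrt(lam_plus)
    if rhs <= 0.0:
        raise ValueError("lam_plus must exceed one quarter of lambda_{nj,nk}")
    return 1.0 / rhs ** 2


def profile_period(j, k, r, lam_plus, lam_minus):
    """Length of one positive plus one negative sine arch of a solution of (4.2)."""
    c = j ** 2 + k ** 2 / r ** 2
    a, b = lam_plus / c, lam_minus / c
    return math.pi / math.sqrt(a) + math.pi / math.sqrt(b)


# Sample values (assumptions made only for this illustration).
j, k, n, r = 1, 2, 3, 0.7
lam_plus = 2.0 * torus_eigenvalue(n * j, n * k, r)
lam_minus = lambda_minus_explicit(j, k, n, r, lam_plus)
period = profile_period(j, k, r, lam_plus, lam_minus)
print(period, 1.0 / n)
```

A profile with minimal period $1/n$ is in particular 1-periodic, which is the condition required in (4.2).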

5. The variational characterization

In this section we will obtain the variational characterization of curves in $\Sigma$ as claimed in Theorem 2.4. We will follow the ideas of [12], and for this we need a suitable index for $T^2$-actions. We will use the index for general compact Lie groups in [17], whose definition and main properties we recall here, with some simplifications due to our setting.

5.1. The geometrical G-index of [17]

Let $G$ be a compact Lie group and $A$ a separable metric $G$-space (we will denote the action of an element $g \in G$ on $a \in A$ as $g \cdot a$).

First, one defines an index related to the fixed point set $A^G = \{a \in A : g \cdot a = a\ \forall g \in G\}$:

$$
\gamma_e(A^G) = \inf\big\{k \ge 0 : [A^G, S^n] = * \ \text{for any } n \ge k\big\},
$$

where by $[A^G, S^n]$ we mean the set of all the homotopy classes of maps from $A^G$ to $S^n$, and by $*$ the class of those homotopic to a constant (if $A^G = \emptyset$ we put $\gamma_e(A^G) = 0$).

Then, one considers all representations $V$ of the group $G$, such that there exists a $G$-map

$$
f : A \to V \setminus \{0\} \quad \text{where} \quad
\begin{cases}
\bullet\ f(A^G) \subseteq V^G \setminus \{0\},\\
\bullet\ \dim_{\mathbb{R}} V^G = \gamma_e(A^G),\\
\bullet\ f|_{A^G} \text{ is not homotopic to the constant function as a map into } V^G \setminus \{0\},
\end{cases}
\tag{5.1}
$$

and defines

$$
\gamma^0_G(A) = \inf\big\{\dim_{\mathbb{C}} V_G : V \text{ as in (5.1)}\big\},
\tag{5.2}
$$

where $V_G$ is the complement of $V^G$ in $V$. We give some useful properties of this index in the following

Proposition 5.1.

1. If $A$, $B$ are $G$-metric spaces and there exists a $G$-equivariant map $\phi : A \to B$ such that $\phi|_{A^G}$ is a homotopy equivalence between $A^G$ and $B^G$, then $\gamma^0_G(A) \le \gamma^0_G(B)$ (see point 2 in Proposition 3.7 of [17]).
2. In particular, if $\phi : A \to B$ is a $G$-equivariant homeomorphism, then $\gamma^0_G(A) = \gamma^0_G(B)$ (see point 3 in Proposition 3.7 of [17]).
3. If $V$ is an orthogonal representation of $G$ and $S(V)$ the unit sphere in $V$, then $\gamma^0_G(S(V)) = \dim_{\mathbb{C}} V_G$ and $\gamma_e(S(V^G)) = \dim_{\mathbb{R}} V^G$ (see point 10 in Proposition 3.7 of [17]).

Remark 5.2. In the case of our interest (see in the next section), $A^G$ will always be homeomorphic to a subset of $\mathbb{R}$, then we have the following two possibilities:

$\gamma_e(A^G) = 0$, if $A^G$ is homeomorphic to a connected subset of $\mathbb{R}$ or if $A^G = \emptyset$.
$\gamma_e(A^G) = 1$, if $A^G$ is homeomorphic to a subset of $\mathbb{R}$ having more than one component.

5.2. The variational characterization (proof of Theorem 2.4)

In our application, we will consider the natural action of the group $G = T^2$ on $H = H^1(T^2)$ given by:

$$
\text{if } g = (s, t) \in T^2 \text{ and } u = u(x, y) \in H, \quad \text{then } g \cdot u = u(x + s, y + t).
\tag{5.3}
$$

Observe that then $H^G = \{\text{const}\}$, so it is the one-dimensional eigenspace of the eigenvalue $\lambda_0$. Like in [12] we define, for $k \ge 1$, $\mu \ge 0$,

$$
\lambda_k(\mu) = \inf_{A \in \Gamma_k} \sup_{u \in A} I_\mu(u),
\tag{5.4}
$$

where

$$
I_\mu : H \to \mathbb{R} : u \mapsto I_\mu(u) = \int_{T^2} |\nabla u|^2 - \mu \int_{T^2} (u^+)^2
\tag{5.5}
$$

and

$$
\Gamma_k = \big\{A \subseteq \partial B : A \text{ closed}, A\ G\text{-invariant};\ \pm\phi_0 \in A;\ \gamma^0_G(A) \ge k\big\}
\tag{5.6}
$$

(here $B$ is the $L^2$ ball in $H$).

Note that critical points $u \in H$ at level $\lambda$ of $I_\mu$ constrained to $\partial B$ are nontrivial solutions in $H$ of the equation $-\Delta u = (\lambda + \mu)u^+ - \lambda u^-$, implying that $(\lambda + \mu, \lambda) \in \Sigma$. Observe also that for $A \in \Gamma_k$, one always has $A^G = \{\pm\phi_0\}$, so that, by Remark 5.2, $\gamma_e(A^G) = 1$.

Theorem 2.4, except for the last claim which will be proved in Section 6, is a consequence of the following

Proposition 5.3. For $k \ge 1$, $\mu \ge 0$, the values $\lambda_k(\mu)$ are well defined, positive, are critical values for $I_\mu$ constrained to $\partial B$, depend continuously and monotone decreasingly on $\mu$, and $\lambda_k(0) = \lambda_k$. Finally, if $k > h$ then $\lambda_k(\mu) \ge \lambda_h(\mu)$.

Proof. The proof is standard, and goes through the following points:

(1) $I_\mu|_{\partial B}$ is $G$-invariant and satisfies the PS condition.
(2) For each $k \ge 1$, $\Gamma_k \ne \emptyset$ and $\lambda_k(\mu)$ is well defined. Indeed, let $E_k = \mathrm{span}\{\phi_0, \phi_1^a, \phi_1^b, \ldots, \phi_k^a, \phi_k^b\}$: this is a representation of $G$ of (real) dimension $2k + 1$ with $\dim_{\mathbb{R}}(E_k^G) = 1$, then $\gamma^0_G(S_k) = k$, where $S_k = E_k \cap \partial B$ (by point 3 in Proposition 5.1), that is $S_k \in \Gamma_k$. Since $I_\mu(-\phi_0) = 0$ and $S_k \in \Gamma_k$ is compact, we have that the infsup in (5.4) is finite and nonnegative (we refer to Section 6 for the proof that it is in fact strictly positive).
(3) $\lambda_k(\mu)$ is critical. Indeed, let $A_\varepsilon \in \Gamma_k$ with $\sup_{u \in A_\varepsilon} I_\mu(u) < \lambda_k(\mu) + \varepsilon$: if $\lambda_k(\mu)$ were not critical then, using a $G$-equivariant deformation lemma in $\partial B$, $\sup_{u \in \eta(A_\varepsilon)} I_\mu(u) < \lambda_k(\mu) - \varepsilon$, where $\eta$ is an equivariant homeomorphism satisfying $\eta(\pm\phi_0) = \pm\phi_0$ and then, by point 2 in Proposition 5.1, $\eta(A_\varepsilon) \in \Gamma_k$, which gives a contradiction.
(4) Continuity and monotonicity follow easily by the variational formulation, as in [12].
(5) Monotonicity in the index $k$ is a consequence of the fact that if $k > h$ then $\Gamma_k \subseteq \Gamma_h$.
(6a) $\lambda_k(0) \le \lambda_k$: indeed, $S_k \in \Gamma_k$ and $\sup_{u \in S_k} I_0(u) = \lambda_k$.
(6b) $\lambda_k(0) \ge \lambda_k$: indeed, by Lemma 5.4 below, if $A \in \Gamma_k$, then there exists $\hat{u} \in A \cap E_{k-1}^\perp$, and then $I_0(\hat{u}) \ge \lambda_k$. □

Lemma 5.4. If $A \in \Gamma_k$, then $A \cap E_{k-1}^\perp \ne \emptyset$.

Proof.
Write $H$ as $H = H^G \oplus F_1 \oplus F_2$ where $F_1$ and $F_2$ are invariant orthogonal subspaces. The lemma is a consequence of the following claim: Let $A \subseteq \partial B$, $A$ closed, $A$ $G$-invariant, $\pm\phi_0 \in A$; if $A \cap F_2 = \emptyset$, then $\gamma^0_G(A) \le \dim_{\mathbb{C}}(F_1)$. (The lemma follows by taking $F_1$ to be the orthogonal complement of $H^G$ in $E_{k-1}$, so that $\dim_{\mathbb{C}}(F_1) = k - 1$, and $F_2 = E_{k-1}^\perp$.)

In fact, consider $V = H^G \oplus F_1$ as a representation of $G$: then $V^G = H^G$ and $V_G = F_1$. Let $Q : H \to H^G \oplus F_1$ be the orthogonal projection:

– since $A \cap F_2 = \emptyset$, we obtain $Q(A) \subseteq (H^G \oplus F_1) \setminus \{0\}$;
– since $A^G = \{\pm\phi_0\}$ one has $\gamma_e(A^G) = 1 = \dim_{\mathbb{R}} H^G$, moreover $Q|_{A^G}$ is the identity, and then it is nontrivial as a map into $H^G \setminus \{0\}$.

This proves that $Q$ is a $G$-map satisfying the properties in definition (5.1), and then implies, by (5.2), that $\gamma^0_G(A) \le \dim_{\mathbb{C}}(F_1)$. □

6. Proof of Theorem 2.3 and end of proof of Theorem 2.4

In this section we will use some results from [16] and from [9] in order to prove Theorem 2.3 and to conclude the proof of Theorem 2.4. The required result from [16] is summarized in the following proposition:

Proposition 6.1. (See [16].) Let

• $V$ be the eigenspace associated to the eigenvalue $\lambda_h$, $W$ its complement in $H$ and $\partial B_V$ the unitary $L^2$ sphere in $V$;
• $\Lambda$ be the open square $(\underline{\lambda}, \overline{\lambda})^2$ where $\underline{\lambda}$ (resp. $\overline{\lambda}$) is the nearest eigenvalue below (resp. above) $\lambda_h$;
• $I_{\lambda^+,\lambda^-}(u) = \int_\Omega |\nabla u|^2 - \lambda^+ \int_\Omega |u^+|^2 - \lambda^- \int_\Omega |u^-|^2$ be the functional defined in $H$ associated to the Fučík problem with coefficients $\lambda^+$, $\lambda^-$;
• $\theta : V \to W$ be such that $\theta(v)$ is the (unique) solution of the equation $-\Delta w = P_W(g(v + w))$, where $g(t) = \lambda^+ t^+ - \lambda^- t^-$ and $P_W$ is the orthogonal projection onto $W$.

Then the curves in $\Lambda$ given by

$$
\Big\{(\lambda^+, \lambda^-) \in \Lambda : \inf_{v \in \partial B_V} I_{\lambda^+,\lambda^-}\big(v + \theta(v)\big) = 0\Big\},
\qquad
\Big\{(\lambda^+, \lambda^-) \in \Lambda : \sup_{v \in \partial B_V} I_{\lambda^+,\lambda^-}\big(v + \theta(v)\big) = 0\Big\}
\tag{6.1}
$$

are continua in $\Sigma$, and any other point in $\Sigma \cap \Lambda$ must lie between them.

With this result, we may give the

Proof of Theorem 2.3. It is clear by the invariance of problem (1.3) that $\theta(g \cdot v) = g \cdot \theta(v)$ and then $I_{\lambda^+,\lambda^-}(g \cdot v + \theta(g \cdot v)) = I_{\lambda^+,\lambda^-}(g \cdot (v + \theta(v))) = I_{\lambda^+,\lambda^-}(v + \theta(v))$. Since in the hypotheses of Theorem 2.3 the eigenspace $V$ contains a unique orbit and its positive multiples, $I_{\lambda^+,\lambda^-}(v + \theta(v))$ is constant in $\partial B_V$, which implies that the two curves defined in (6.1) coincide, and then no other point in $\Sigma \cap \Lambda$ may exist. □

In [9], as we commented in Section 1, a variational characterization of the first nontrivial curve was given and many interesting properties were proved for this characterization: we will



summarize these results below, and will then establish a connection with our variational characterization for $\lambda_1(\mu)$ given in (5.4); indeed, we will see that the characterized curves coincide, and this will allow to extend to our characterization some of the properties proved there.

The variational characterization given in [9] is

$$
\nu_1(\mu) = \inf_{h \in \Lambda_1} \sup_{u \in h([-1,1])} I_\mu(u),
\tag{6.2}
$$

where $\Lambda_1 = \{h : [-1, 1] \to \partial B \text{ continuous, with } h(\pm 1) = \pm\phi_0\}$, and it was proved that

Proposition 6.2. (See [9].)

• For each $\mu \ge 0$ the level $\nu_1(\mu) > 0$ is critical for the restriction to $\partial B$ of the functional $I_\mu$, that is $(\nu_1(\mu) + \mu, \nu_1(\mu)) \in \Sigma$ (Theorem 2.10).
• $\nu_1(0) = \lambda_1$ (Corollary 3.2).
• No other critical point may lie at level lower than $\nu_1(\mu)$ except for $\pm\phi_0$, which implies that $(\nu_1(\mu) + \mu, \nu_1(\mu))$ is the first nontrivial point of $\Sigma$ on the parallel to the diagonal through $(\mu, 0)$ (Theorem 3.1).
• The curve described is continuous and strictly decreasing (Proposition 4.1).

The following proposition will extend all these properties to our characterization $\lambda_1(\mu)$, and then imply the last claim in Theorem 2.4 and the strict positivity of $\lambda_k(\mu)$ which was not proved in Section 5.

Proposition 6.3. $\nu_1(\mu) = \lambda_1(\mu)$ for all $\mu \ge 0$.

Proof. For a given $h \in \Lambda_1$, let $Gh([-1, 1])$ be the union of the orbits of the points in $h([-1, 1])$: by the invariance of $I_\mu$ with respect to the action of $G$ we have

$$
\nu_1(\mu) = \inf_{h \in \Lambda_1} \sup_{u \in h([-1,1])} I_\mu(u) = \inf_{h \in \Lambda_1} \sup_{u \in Gh([-1,1])} I_\mu(u).
$$

However, $Gh([-1, 1]) \in \Gamma_1$, since it is $G$-invariant, closed, contains $\pm\phi_0$, and also contains a path joining $\pm\phi_0$, so that no continuous function may map it in $H_0 \setminus \{0\}$ if the images of $\pm\phi_0$ are in different components, implying that a representation $V$ as in definition (5.1) necessarily has $\dim_{\mathbb{C}} V_G \ge 1$. This implies that $\nu_1(\mu) \ge \lambda_1(\mu)$.

For $\mu = 0$ we already know that $\nu_1(0) = \lambda_1(0) = \lambda_1$. Finally, since both characterizations are continuous and $\nu_1(\mu) > 0$ we have that if they were not the same, then there would exist $\mu > 0$ such that $\nu_1(\mu) > \lambda_1(\mu) > 0$, contradicting the fact that $\nu_1(\mu)$ is the first nontrivial curve. □

Remark 6.4. In Section 1, we also recalled the results in [9] and in [1] about the asymptotic behavior of the characterized curve $\nu_1(\mu)$: we could use Proposition 6.3 to prove that $\lim_{\mu \to +\infty} \lambda_1(\mu) = \lambda_0 = 0$, however, we refer to the next section where we will use our characterization to obtain a more general result.



7. Study of the asymptotical behavior of the variational curves

The aim of this section is to construct suitable sets with a given value of $\gamma^0_G$ and then to use them in order to obtain estimates on the critical levels (5.4), which will result in the proof of Theorem 2.6.

For this purpose, we recall some useful definitions (see for example in [4]): the join $Y_1 * \cdots * Y_k$ of $k$ nonempty $G$-spaces $Y_i$ is defined as the quotient $E/\!\sim$ of the space

$$
E = \Big\{(a_1 y_1, \ldots, a_k y_k) \text{ with } y_i \in Y_i,\ a_i \in [0, 1]\ (i = 1, \ldots, k),\ \sum_{i=1}^k a_i = 1\Big\},
$$

with respect to the equivalence relation $\sim$:

$$
(a_1 y_1, \ldots, 0 y_j, \ldots, a_k y_k) \sim (a_1 y_1, \ldots, 0 y_j', \ldots, a_k y_k) \quad \text{for any } y_j, y_j' \in Y_j\ (j = 1, \ldots, k),
$$

where the $G$-action on the join is given by $g \cdot (a_1 y_1, \ldots, a_k y_k) = (a_1\, g \cdot y_1, \ldots, a_k\, g \cdot y_k)$. We will denote by $J_k G$ the join $G * G * \cdots * G$, $k$ times. Also, the join of $G$-maps $\phi_1 * \cdots * \phi_k$ where $\phi_i : Y_i \to Z_i$ is defined as

$$
\phi_1 * \cdots * \phi_k : Y_1 * \cdots * Y_k \to Z_1 * \cdots * Z_k : (a_1 y_1, \ldots, a_k y_k) \mapsto \big(a_1 \phi_1(y_1), \ldots, a_k \phi_k(y_k)\big).
$$

We will need the following proposition, which is a consequence of [5]:

Proposition 7.1. If $G$ is a torus, consider the $G$-space $S^m * J_k G$ with the trivial action on $S^m$: then there does not exist a $G$-map $\phi : S^m * J_k G \to S(V)$ if

• $V$ is a representation of $G$ with $\dim_{\mathbb{R}} V^G = m + 1$ and $\dim_{\mathbb{C}} V_G = j < k$,
• $\phi$ induces a homotopy equivalence between the spheres $S^m$ and $S(V^G)$.

Sketch of the proof. As claimed in the proof of Corollary 2 in [5], there exists a $G$-map $\tau : S(V_G) \to G/K_1 * \cdots * G/K_j$, where each $K_i$ is a closed proper subgroup of $G$; then consider the $G$-map

$$
\tilde{\tau} := \mathrm{id}_{S(V^G)} * \tau : S(V) \to S(V^G) * G/K_1 * \cdots * G/K_j;
$$

if the $G$-map $\phi$ in the claim existed, then the composition

$$
\tilde{\tau} \circ \phi : S^m * J_k G \to S(V^G) * G/K_1 * \cdots * G/K_j
$$

would be a $G$-map too; however, by Proposition 6 in [5], no such $G$-map exists if $j < k$. □



7.1. Sets with given $\gamma^0_G$

We may now proceed to the construction of a set in the class $\Gamma_k$ defined in (5.6): let $f_1, \ldots, f_k$ be $k$ functions in $H \setminus H^G$ satisfying the following hypothesis:

(Hf) If $a_i \ge 0$, $g_i \in G$ ($i = 1, \ldots, k$), with $a_j > 0$ for at least one $j \in \{1, \ldots, k\}$, then $\sum_{i=1}^k a_i\, g_i \cdot f_i \notin H^G$.

Remark 7.2. The above condition (Hf) is not difficult to be achieved, by choosing a "different shape" for the functions $f_1, \ldots, f_k$. In particular, the condition is satisfied if the $f_i$ are nonconstant eigenfunctions taken in $k$ distinct eigenspaces. Another possible choice of the functions $f_i$ will be described in the next section, in the proof of Theorem 2.6.

We define the set

$$
W_{\{f_i\}_{i=1,\ldots,k}} = \left\{ \frac{a_0 s \phi_0 + \sum_{i=1}^k a_i\, g_i \cdot f_i}{\big\| a_0 s \phi_0 + \sum_{i=1}^k a_i\, g_i \cdot f_i \big\|_{L^2}} :\ s \in \{\pm 1\},\ a_0, a_i \in [0, 1],\ g_i \in G\ (i = 1, \ldots, k),\ a_0 + \sum_{i=1}^k a_i = 1 \right\}
\tag{7.1}
$$

and we prove

Lemma 7.3. Provided that hypothesis (Hf) is satisfied, the set $W_{\{f_i\}_{i=1,\ldots,k}}$ is a compact $G$-invariant set such that $W_{\{f_i\}_{i=1,\ldots,k}} \in \Gamma_k$ (in fact, $\gamma^0_G(W_{\{f_i\}_{i=1,\ldots,k}}) \ge k$).

Proof. First, one has to check the wellposedness of the definition, that is $\| a_0 s \phi_0 + \sum a_i\, g_i \cdot f_i \|_{L^2} \ne 0$: this is guaranteed by hypothesis (Hf). Then, we see that there exists the natural $G$-map

$$
A_k : S^0 * J_k G \to W_{\{f_i\}_{i=1,\ldots,k}} : (a_0 s, a_1 g_1, \ldots, a_k g_k) \mapsto \frac{a_0 s \phi_0 + \sum_{i=1}^k a_i\, g_i \cdot f_i}{\big\| a_0 s \phi_0 + \sum_{i=1}^k a_i\, g_i \cdot f_i \big\|_{L^2}}.
\tag{7.2}
$$

Since $S^0 * J_k G$ is a compact set, then $W_{\{f_i\}_{i=1,\ldots,k}}$ is compact too; also, it is a $G$-invariant subset of $\partial B$ such that $W_{\{f_i\}_{i=1,\ldots,k}} \cap H^G = \{\pm\phi_0\}$, then $W_{\{f_i\}_{i=1,\ldots,k}} \in \Gamma_k$ provided $\gamma^0_G(W_{\{f_i\}_{i=1,\ldots,k}}) \ge k$.

Then suppose there exists a $G$-map $M : W_{\{f_i\}_{i=1,\ldots,k}} \to V \setminus \{0\}$ satisfying

• $\dim_{\mathbb{R}} V^G = \gamma_e(W^G_{\{f_i\}_{i=1,\ldots,k}}) = 1$ and $\dim_{\mathbb{C}} V_G < k$,
• $M(W^G_{\{f_i\}_{i=1,\ldots,k}}) \subseteq V^G \setminus \{0\}$, and $M|_{W^G_{\{f_i\}_{i=1,\ldots,k}}}$ is not homotopic to the constant function as a map into $V^G \setminus \{0\}$;

by composing $M \circ A_k$ and projecting on $S(V)$, one would obtain a $G$-map $S^0 * J_k G \to S(V)$ which induces homotopy equivalence between $S^0$ and $S(V^G)$: this contradicts Proposition 7.1, hence such a map $M$ cannot exist and then $\gamma^0_G(W_{\{f_i\}_{i=1,\ldots,k}}) \ge k$. □



7.2. Every variational curve is asymptotic to 0

In order to prove Theorem 2.6, we will first produce a suitable function $f \in H$ and then use it in the construction above in order to build a suitable set with a given index $k$, which will allow us to estimate the infsup values in (5.4).

Among some other technical conditions, the main property required of this function $f$ is to change sign, but having a suitably small ratio $\int_{T^2} |\nabla f|^2 \big/ \int_{T^2} f^2$: we remark that it is impossible to achieve this property in one spatial dimension, but it is always possible in higher dimension.

In particular, $f$ will be defined to be the constant $-h < 0$ outside of a small ball, and with a spike in this ball which reaches the level $H > 0$: using a vector coordinate $x$ in the square $[-1/2, 1/2] \times [-r/2, r/2]$ representing $T^2$ we set

$$
f =
\begin{cases}
-h, & \text{if } |x| > \eta,\\[2pt]
H - (h + H)\dfrac{|x|^\delta}{\eta^\delta}, & \text{if } |x| \le \eta,
\end{cases}
$$

where $h$, $H$ are two positive reals, and $\eta, \delta > 0$ are suitably small. We claim that

Lemma 7.4. Given $k \in \mathbb{N}$, $\varepsilon > 0$, it is possible to choose $\eta, \delta, H, h > 0$ in such a way that the following requirements are satisfied:

(1) $\int_{T^2} f = 0$;
(2) $\dfrac{\int_{T^2} |\nabla f|^2}{\int_{T^2} f^2} < \varepsilon/k^2$;
(3) $\int_{T^2} f\,(g \cdot f) \ge -\frac{1}{k} \int_{T^2} f^2$ for any $g \in G$;
(4) $2k\eta < \min\{r; 1\}$.

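Proof. First, straightforward computations give (we set $A = r$: the area of $T^2$)

$$
\int_{T^2} f = -Ah + \pi\eta^2 (h + H) \frac{\delta}{\delta + 2},
\tag{7.3}
$$

$$
\int_{T^2} f^2 = h^2 \big(A - \pi\eta^2\big) + 2\pi\eta^2 \left( \frac{H^2}{2} + \frac{(h + H)^2}{2\delta + 2} - \frac{2H(h + H)}{\delta + 2} \right),
\tag{7.4}
$$

$$
\int_{T^2} |\nabla f|^2 = \pi\delta (h + H)^2.
\tag{7.5}
$$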

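Also (from now on we will assume $\eta, \delta > 0$ suitably small), given $g \in G$, one has $-h \le f \le H$ and then $f\,(g \cdot f) \ge -hH$; however, $f\,(g \cdot f) \equiv h^2$ in a region of area at least $A - 2\pi\eta^2$, then we may estimate

$$
\int_{T^2} f\,(g \cdot f) \ge h^2 \big(A - 2\pi\eta^2\big) - hH\, 2\pi\eta^2.
\tag{7.6}
$$

Now, we choose the ratio $H/h$ in order to obtain property (1): again a simple computation gives

$$
\frac{H}{h} = \frac{A - \pi\eta^2 \frac{\delta}{\delta+2}}{\pi\eta^2 \frac{\delta}{\delta+2}}
\qquad \text{and} \qquad
\frac{h + H}{h} = \frac{A(\delta + 2)}{\pi\eta^2 \delta},
$$

which we estimate as

$$
\frac{A}{\pi\eta^2\delta} \le \frac{H}{h} \le \frac{H + h}{h} \le \frac{3A}{\pi\eta^2\delta}.
\tag{7.7}
$$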
With (7.7) we estimate (observe that the term in parentheses in (7.4) is larger than small η)

f 2 h2 A/2 + πη2 H 2 /2 h2 A/2 + πη2

T2

f (g · f ) T2

A2 2(πη2 δ)2

h2

3A h2 A 6h2 A − h2 2 2πη2 − , 2 δ πη δ

|∇f |2 h2 πδ

3A πη2 δ

2 3h2

A2 . η4 δ

A2 , 2πη2 δ 2

H2 4

for

(7.8)

(7.9)

(7.10)

T2

Now we analyze the requirements in the lemma, which will be achieved by choosing δ, η > 0 small enough: (1) has already been enforced, (4) is straightforward and (3) is equivalent to h2 A2 −6h2 A − k1 2πη 2 δ , then it is possible to be achieved by the choice of η once that δ is small. So at this point we fix the value η so that the above requirements are achieved for suitably small (but still free) δ > 0. A2 η4 δ 2 h2 A2 2 2πη δ

3h2

Finally, (2) is equivalent to conclude the proof.

2

= 6 πδ < ε/k 2 , then we may set δ > 0 small enough and η2

Now we are in a position to prove the main result of this section.

Proof of Theorem 2.6. Since the function $\lambda_k(\mu)$ is decreasing and positive, we suppose, for the sake of contradiction, that $\lambda_k(\mu) \ge \varepsilon > 0$. With these values of $\varepsilon$, $k$ we obtain from Lemma 7.4 a corresponding function $f$; we then set $f_i = f$ for $i = 1,\dots,k$ and consider the set

$$W := W_{\{f_i\}_{i=1,\dots,k}} \qquad (7.11)$$

as defined in (7.1). First we verify hypothesis (Hf): condition (4) implies that $k$ disks of radius $\eta$ cannot cover the whole of $T^2$, so for any choice of $g_1,\dots,g_k \in G$ there exists a point $p$ where $(g_i\cdot f)(p) = -h$ for every $i = 1,\dots,k$, and then $\sum_{i=1}^k a_i (g_i\cdot f)(p) = -h\sum_{i=1}^k a_i$; however, let $a_j > 0$ and $(g_j\cdot f)(t) = H$, then $\sum_{i=1}^k a_i (g_i\cdot f)(t) \ge a_j(H+h) - h\sum_{i=1}^k a_i$. We conclude that $\sum_{i=1}^k a_i\, g_i\cdot f$ is not a constant function and then (Hf) is satisfied. Since $W \in \Gamma_k$ by Lemma 7.3, we have, by (5.4),

$$\lambda_k(\mu) \le \max_{u\in W} I_\mu(u); \qquad (7.12)$$



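let then $v(\mu) \in W$ be such that the maximum in (7.12) is assumed in $v(\mu)$, consider any sequence $\mu_n \to +\infty$ and let $v_n = v(\mu_n)$: up to a subsequence we have

$$v_n \to v_0 \in W \quad \text{strongly in } H;$$

from (7.12) we get

$$I_{\mu_n}(v_n) \ge \lambda_k(\mu_n), \qquad (7.13)$$

that is

$$\|\nabla v_n\|_{L^2}^2 - \mu_n \|v_n^+\|_{L^2}^2 \ge \lambda_k(\mu_n). \qquad (7.14)$$

Taking the limit in (7.14), since we assumed that $\lambda_k(\mu_n) \ge \varepsilon$, gives

$$\|\nabla v_0\|_{L^2}^2 \ge \varepsilon. \qquad (7.15)$$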

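Writing $v_0 = A_k(a_0 s, a_1 g_1, \dots, a_k g_k)$ in the notation of Eq. (7.2), since $\nabla\varphi_0 = 0$ and $\|v_0\|_{L^2} = 1$, this becomes

$$\int_{T^2} \Big|\sum_{i=1}^k a_i \nabla(g_i\cdot f)\Big|^2 \ge \varepsilon \int_{T^2} \Big(a_0 s\varphi_0 + \sum_{i=1}^k a_i\, g_i\cdot f\Big)^2, \qquad (7.16)$$

where $\int_{T^2} (a_0 s\varphi_0 + \sum_{i=1}^k a_i\, g_i\cdot f)^2 = \int_{T^2} (a_0 s\varphi_0)^2 + \int_{T^2} (\sum_{i=1}^k a_i\, g_i\cdot f)^2$ since $f$ is orthogonal to $\varphi_0$ (condition (1) in Lemma 7.4). Now, if $a_0 = 1$ (that is, if all the other coefficients are zero), (7.16) gives $0 \ge \varepsilon$, contradiction; otherwise we collect as

$$\int_{T^2} \Big|\sum_{i=1}^k a_i \nabla(g_i\cdot f)\Big|^2 - \varepsilon \int_{T^2} \Big(\sum_{i=1}^k a_i\, g_i\cdot f\Big)^2 \ge \varepsilon \int_{T^2} (a_0\varphi_0)^2 \ge 0, \qquad (7.17)$$

and then we get

$$\frac{\int_{T^2} \big|\sum_{i=1}^k a_i \nabla(g_i\cdot f)\big|^2}{\int_{T^2} \big(\sum_{i=1}^k a_i\, g_i\cdot f\big)^2} \ge \varepsilon. \qquad (7.18)$$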

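Using the estimate $(\sum_{i=1}^k x_i)^2 \le k\sum_{i=1}^k x_i^2$, one obtains

$$\int_{T^2} \Big|\sum_{i=1}^k a_i \nabla(g_i\cdot f)\Big|^2 \le k \sum_{i=1}^k a_i^2 \int_{T^2} |\nabla(g_i\cdot f)|^2 = k \sum_{i=1}^k a_i^2 \int_{T^2} |\nabla f|^2. \qquad (7.19)$$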



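Writing $\int_{T^2} (\sum_{i=1}^k a_i\, g_i\cdot f)^2$ as

$$\sum_{i=1}^k \int_{T^2} (a_i\, g_i\cdot f)^2 + \sum_{\substack{i,j=1 \\ i\ne j}}^k a_i a_j \int_{T^2} (g_i\cdot f)(g_j\cdot f) = \sum_{i=1}^k a_i^2 \int_{T^2} f^2 + \sum_{\substack{i,j=1 \\ i\ne j}}^k a_i a_j \int_{T^2} (g_i\cdot f)(g_j\cdot f)$$

and using property (3) in Lemma 7.4 and the estimate $\sum_{i,j=1,\, i\ne j}^k x_i x_j \le (k-1)\sum_{i=1}^k x_i^2$, we conclude

$$\int_{T^2} \Big(\sum_{i=1}^k a_i\, g_i\cdot f\Big)^2 \ge \sum_{i=1}^k a_i^2 \int_{T^2} f^2 - \frac{k-1}{k} \sum_{i=1}^k a_i^2 \int_{T^2} f^2 = \frac{1}{k} \sum_{i=1}^k a_i^2 \int_{T^2} f^2. \qquad (7.20)$$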

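Inserting (7.19) and (7.20) into (7.18) and using property (2) in Lemma 7.4, one gets

$$\varepsilon \le \frac{\big[k\sum_{i=1}^k a_i^2\big] \int_{T^2} |\nabla f|^2}{\big[\frac{1}{k}\sum_{i=1}^k a_i^2\big] \int_{T^2} f^2} = k^2\, \frac{\int_{T^2} |\nabla f|^2}{\int_{T^2} f^2} < k^2\, \frac{\varepsilon}{k^2};$$

this contradiction concludes the proof. □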
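The two elementary inequalities invoked for (7.19) and (7.20) are used without proof. For real numbers $x_1,\dots,x_k$ they can be checked as sketched below; this is a standard argument supplied for convenience and is not taken from the paper.

```latex
% Step 1: from 2 x_i x_j \le x_i^2 + x_j^2, summed over all ordered pairs i \ne j
% (each index i appears in 2(k-1) such pairs):
\sum_{\substack{i,j=1 \\ i \ne j}}^{k} x_i x_j
  \;\le\; \frac12 \sum_{\substack{i,j=1 \\ i \ne j}}^{k} \bigl(x_i^2 + x_j^2\bigr)
  \;=\; (k-1) \sum_{i=1}^{k} x_i^2 .

% Step 2: expanding the square and inserting Step 1:
\Bigl(\sum_{i=1}^{k} x_i\Bigr)^{2}
  \;=\; \sum_{i=1}^{k} x_i^2 + \sum_{\substack{i,j=1 \\ i \ne j}}^{k} x_i x_j
  \;\le\; k \sum_{i=1}^{k} x_i^2 .
```

Since $2|x_i x_j| \le x_i^2 + x_j^2$ as well, the bound in Step 1 also holds for the absolute value of the off-diagonal sum.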

8. Secondary bifurcation from the first curve

By comparing Theorems 2.6 and 2.3, we deduce that the variational characterizations $\lambda_k(\mu)$ follow initially the explicit curves (at least in the case when $\lambda_k$ has multiplicity two, to which Theorem 2.3 applies), but eventually separate from them to go asymptotically to 0. This observation implies that the variational curves $\Sigma_k^{\mathrm{var}}$ described by $\lambda_k(\mu)$ cross every explicit curve $\Sigma_j^{\mathrm{expl}}$ with $1 \le j < k$, and also suggests the presence of bifurcation points along the explicit curves: we investigate in this section the bifurcation from the first explicit curve $\Sigma_{1,0}^{\mathrm{expl}}$; for this we will impose $r < 1$ so that $\Sigma_{1,0}^{\mathrm{expl}}$ is in fact the first explicit curve and is distinct from $\Sigma_{0,1}^{\mathrm{expl}}$. The result (which implies Theorem 2.8) is in the following

Theorem 8.1. If $r < 1$, then along the explicit curve $\Sigma_{1,0}^{\mathrm{expl}}$ there exist infinitely many points of bifurcation, in the sense of the following Definition 8.2.

Definition 8.2. If we define a continuous function $(0,+\infty) \ni \mu \mapsto (\lambda_\mu, u_\mu)$ such that $(\lambda_\mu + \mu, \lambda_\mu) \in \Sigma_{1,0}^{\mathrm{expl}}$ and $u_\mu$ is a related solution with $\|u_\mu\|_{L^2} = 1$, then we say that a point $(\lambda_\mu + \mu, \lambda_\mu) \in \Sigma_{1,0}^{\mathrm{expl}}$ is a bifurcation point if there exists a sequence $(\mu_j, \lambda_j, u_j) \to (\mu, \lambda_\mu, u_\mu)$ where $(\lambda_j + \mu_j, \lambda_j) \in \Sigma$, $u_j$ is a corresponding solution with $\|u_j\|_{L^2} = 1$ and $u_j \notin G u_{\mu_j}$.

Remark 8.3. (a) We remark that results of bifurcations from the curves of the Fučík spectrum were obtained for the Dirichlet problem on a square in [15]. In the same work, the authors found, through numerical approximations, examples where curves arising from different eigenvalues cross each other: this behavior was already known in a rectangular domain with Dirichlet boundary conditions between explicitly calculated curves (see [7]); we remark that our result differs



from the cited ones, since it is obtained through analytical tools and since the crossings happen between curves arising from arbitrarily distant eigenvalues (one of which is not explicitly known).

(b) It is interesting to observe that the explicit nontrivial solutions $u_\mu$ corresponding to a point along $\Sigma_{1,0}^{\mathrm{expl}}$ only depend on the variable $x$, and then their orbit is homeomorphic to $S^1$. Once we prove that there exists a bifurcation, since the solutions depending on just one variable are known, we obtain that the bifurcating solutions $u_j$ break this symmetry and then their orbit is homeomorphic to $T^2$.

(c) In Remark 8.7 we show that in fact a result analogous to Theorem 8.1 holds for the curve $\Sigma_{0,1}^{\mathrm{expl}}$ too; also, we suggest that it should be true "in general" for any curve $\Sigma_{h,0}^{\mathrm{expl}}$ or $\Sigma_{0,h}^{\mathrm{expl}}$.

In order to prove Theorem 8.1, we will simplify the problem by getting rid of some of its symmetries: we consider the Neumann problem on a rectangular domain $R$ with sides $1/2$ and $r/2$ and we call $\Sigma_R$ the corresponding Fučík spectrum. Any solution of such a problem may be extended by two subsequent reflections to a periodic solution in the rectangle with sides $1$ and $r$, corresponding to a solution on our torus (in general, the converse will not be true, and so we have the inclusion $\Sigma_R \subseteq \Sigma_{T^2}$). Also, it is straightforward that all the explicit curves $\Sigma_{h,0}^{\mathrm{expl}}$ and $\Sigma_{0,h}^{\mathrm{expl}}$ that we found in Section 4 are also in $\Sigma_R$. If we find a bifurcation point for $\Sigma_R$, then it will correspond to a bifurcation point for $\Sigma_{T^2}$ in the sense of Definition 8.2.

In the context of this simpler problem, we may proceed in a similar way as in [15] in order to investigate bifurcation points along the explicit curves: first, we reformulate our problem as the search for solutions of $F(\mu,\lambda,u) = 0$ where

$$F : \mathbb{R}^2 \times H \to \mathbb{R} \times H : \quad F(\mu,\lambda,u) = \big(\|u\|_{L^2}^2 - 1,\; u - K[u + \lambda u + \mu u^+]\big), \qquad (8.1)$$

where $H = H^1(R)$ and $K : H \to H$ is the inverse of the operator $-\Delta + 1$, in the sense that $\langle Ku, v\rangle_H = \int_R \nabla(Ku)\nabla v + \int_R (Ku)v = \int_R uv$.

Since we are interested in bifurcations from a known solution with $(\lambda+\mu,\lambda) \in \Sigma_{1,0}^{\mathrm{expl}}$, we again define a continuous function $(0,+\infty) \ni \mu \mapsto (\lambda_\mu, u_\mu)$ such that $(\lambda_\mu+\mu, \lambda_\mu) \in \Sigma_{1,0}^{\mathrm{expl}}$ and $u_\mu$ is a related solution with $\|u_\mu\|_{L^2} = 1$. As in Theorem 12 in [15], one may prove that a sufficient condition in order to have a bifurcation point is that 0 is a simple eigenvalue of the derivative $F'_{(\lambda,u)}(\mu,\lambda_\mu,u_\mu)$. Also, the above condition turns out to be equivalent to the problem of determining when the eigenvalue $\lambda = 0$ of the following equation (with Neumann boundary conditions) has multiplicity 2:

$$-\Delta v - \mu\chi_{u_\mu} v - \lambda_\mu v = \lambda v \quad \text{in } (0,1/2)\times(0,r/2), \qquad (8.2)$$

where $\chi_{u_\mu}$ is the characteristic function of the set $\{u_\mu > 0\}$; actually, one considers

$$F'_{(\lambda,u)}(\mu,\lambda_\mu,u_\mu)[l,v] = \Big(\int_R u_\mu v,\; v - K[v + \lambda_\mu v + \mu\chi_{u_\mu>0}\, v + l u_\mu]\Big) = (0,0);$$



by testing the second equation against $u_\mu$ one gets $l\|u_\mu\|_{L^2}^2 = 0$, that is, $l = 0$: then the second equation is equivalent to (8.2) with $\lambda = 0$ and the first one rules out the function $u_\mu$, which is always an eigenfunction of the zero eigenvalue for (8.2). The spectrum of problem (8.2) is described in the following

Lemma 8.4. The eigenvalues $\lambda$ of (8.2) are $\lambda_{i,j}(\mu) = \rho_i(\mu) + k_j$ with corresponding eigenfunctions $v_{i,j}^\mu(x,y) = V_i^\mu(x) W_j(y)$, where $k_j = 4\pi^2 j^2/r^2$ and $W_j$ ($j \ge 0$) are eigenvalues and eigenfunctions of

$$-W'' = kW \quad \text{in } (0,r/2), \qquad W'(0) = W'(r/2) = 0 \qquad (8.3)$$

and $\rho_i(\mu)$, $V_i^\mu$ ($i \ge 0$) are eigenvalues and eigenfunctions of

$$-V'' - \mu\chi_{u_\mu} V - \lambda_\mu V = \rho V \quad \text{in } (0,1/2), \qquad V'(0) = V'(1/2) = 0. \qquad (8.4)$$

Moreover, the eigenvalues $\rho_i(\mu)$ and $k_j$ are all simple and one has that $\rho_1(\mu) = 0$, $V_1^\mu(x) = u_\mu(x)$, $\rho_0(\mu) < 0$, $V_0^\mu(x) > 0$ and $\rho_i(\mu) > 0$ for $i \ge 2$.

Corollary 8.5.
– The eigenvalue $\lambda = 0$ of (8.2) is double if $k_j = -\rho_0(\mu)$ for some $j \ge 1$ and is simple otherwise; in fact, $v(x,y) = u_\mu(x)$ is always in the eigenspace.
– The number of negative eigenvalues of (8.2) is the number of $j \ge 0$ such that $k_j < -\rho_0(\mu)$.

Proof. Using classical arguments (see for example [6]), one performs separation of variables looking for solutions of (8.2) of the form $v(x,y) = V(x)W(y)$ and obtains Eqs. (8.3) and (8.4) where $\rho = \lambda - k$: since both equations have an unbounded increasing sequence of simple eigenvalues and a complete orthogonal system of eigenfunctions, the product functions $V_i^\mu(x)W_j(y)$ form a complete orthogonal system, and then the analysis of the separated variable equations (8.3)–(8.4) is sufficient for the analysis of Eq. (8.2).

Also, since $u_\mu(x)$ is a solution of (8.4) when $\rho = 0$, and since it changes sign once, we deduce that it has to be the second eigenfunction, that is $\rho_1(\mu) = 0$. As a consequence $\rho_0(\mu) < 0$, and it is known that the corresponding eigenfunction $V_0^\mu(x)$ is positive, while $\rho_i(\mu) > 0$ for $i \ge 2$.

The claims in the corollary follow in a straightforward way, since $\lambda = 0$ may be obtained just by the combination $k_0 + \rho_1(\mu) = 0$ and (when possible) $k_j + \rho_0(\mu) = 0$, while $\lambda < 0$ may only come from $k_j + \rho_0(\mu) < 0$. □

In order to prove the existence of the bifurcation points, and then to prove Theorem 8.1, we need to show that the eigenvalue zero of (8.2) is double at infinitely many points of $\Sigma_{1,0}^{\mathrm{expl}}$; in fact we prove the following

Lemma 8.6. For $r < 1$, the function $\rho_0(\mu)$ defined in Lemma 8.4 crosses all the values $-k_i$, $i \ge 1$, as $\mu \to +\infty$.

Proof. When $\mu = 0$, one has $\lambda_\mu = 4\pi^2$ and the constant function is the principal eigenfunction of (8.4), corresponding to $\rho_0(0) = -4\pi^2$: then we have $\rho_0(0) > -k_1 = -4\pi^2/r^2$.



It is known that the principal eigenvalue may be characterized variationally (see for example [6]) as

$$\rho_0(\mu) = \inf_{V \in H^1(0,1)\setminus\{0\}} \frac{\int (V')^2 - \mu\int \chi_{u_\mu} V^2}{\int V^2} - \lambda_\mu; \qquad (8.5)$$

by using $V = \mathrm{const}$ in (8.5) and observing that $u_\mu > 0$ in a set of length $\frac{\pi}{2\sqrt{\mu+\lambda_\mu}}$, one may estimate

$$\rho_0(\mu) \le -\frac{\mu\pi}{2\sqrt{\mu+\lambda_\mu}} - \lambda_\mu;$$

since $\lambda_\mu$ is bounded between $\pi^2$ and $4\pi^2$, this implies that $\lim_{\mu\to+\infty} \rho_0(\mu) = -\infty$ and then, since the function $\rho_0(\mu)$ is continuous, the claim is proved. □
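The argument of Lemma 8.6 can be illustrated numerically. The following is a minimal sketch, not part of the paper: it evaluates the upper bound $-\mu\pi/(2\sqrt{\mu+\lambda_\mu}) - \lambda_\mu$ and finds, for a given level $-k_j$, some $\mu$ at which the bound has already dropped below it. The proof only uses that $\lambda_\mu$ lies between $\pi^2$ and $4\pi^2$; to get concrete numbers the sketch assumes that the first explicit curve is given by $\pi/\sqrt{\lambda+\mu} + \pi/\sqrt{\lambda} = 1$, which is consistent with the values $\lambda_0 = 4\pi^2$ and $\lambda_\mu \to \pi^2$ quoted above but is not stated in this section. All function names are hypothetical.

```python
import math


def lambda_mu(mu, tol=1e-12):
    """Solve pi/sqrt(lam + mu) + pi/sqrt(lam) = 1 for lam by bisection.

    ASSUMPTION: this equation for the first explicit curve is inferred from
    the endpoint values lam = 4*pi^2 (mu = 0) and lam -> pi^2 (mu -> infinity).
    """
    def g(lam):
        return math.pi / math.sqrt(lam + mu) + math.pi / math.sqrt(lam) - 1.0

    # g is decreasing in lam, g(pi^2) > 0 and g(4*pi^2) <= 0 for every mu >= 0.
    lo, hi = math.pi ** 2, 4.0 * math.pi ** 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rho0_upper_bound(mu):
    """Upper bound for rho_0(mu) obtained with a constant test function."""
    lam = lambda_mu(mu)
    return -mu * math.pi / (2.0 * math.sqrt(mu + lam)) - lam


def k(j, r):
    """Neumann eigenvalue k_j = 4*pi^2*j^2/r^2 of (8.3)."""
    return 4.0 * math.pi ** 2 * j ** 2 / r ** 2


def mu_past_level(j, r):
    """Return some mu (found by doubling) where the bound is below -k_j."""
    mu = 1.0
    while rho0_upper_bound(mu) > -k(j, r):
        mu *= 2.0
    return mu


if __name__ == "__main__":
    r = 0.5  # any 0 < r < 1
    for j in (1, 2, 3):
        mu = mu_past_level(j, r)
        print(j, k(j, r), mu, rho0_upper_bound(mu))
```

Because $\rho_0(0) > -k_1 \ge -k_j$ and $\rho_0$ is continuous, $\rho_0$ must take the value $-k_j$ somewhere before the returned $\mu$; the sketch does not locate the crossing itself, since that would require solving the eigenvalue problem (8.4).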

Remark 8.7. We observe that if we consider the curve $\Sigma_{0,1}^{\mathrm{expl}}$ (that is, since we chose $r < 1$, a curve higher than $\Sigma_{1,0}^{\mathrm{expl}}$), then in Eq. (8.2) the function $\chi_{u_\mu}$ depends on the variable $y$ and we are able to obtain a result analogous to that in Lemma 8.4 and in Corollary 8.5, where the equation for $V$ becomes like (8.3) in $(0,1/2)$ and that for $W$ like (8.4) in $(0,r/2)$. A difference arises in Lemma 8.6, since in this case the principal (negative) eigenvalue of the equation for $W$ when $\mu = 0$ is already below $-k_1$, but still it goes to $-\infty$ as $\mu \to +\infty$ and then we get infinitely many points of bifurcation along $\Sigma_{0,1}^{\mathrm{expl}}$ too.

Finally, if we consider a higher curve $\Sigma_{h,0}^{\mathrm{expl}}$ or $\Sigma_{0,h}^{\mathrm{expl}}$ with $h > 1$, we are still able to find infinitely many points where the zero-eigenspace of (8.2) has dimension higher than 1, but we can no longer guarantee that this dimension is exactly 2, since in this case zero is still an eigenvalue of (8.4), but it is the $h$th one, so that we have more than one negative eigenvalue for (8.4) and then it may happen that $\rho_i(\mu) + k_j = 0$ for more than one couple $(i,j) \ne (h,0)$; again $\rho_0(\mu)$ will cross infinitely many values $-k_i$, so we conclude that for such curves there should still be bifurcation points, but also more complicated phenomena might arise.

References

[1] M. Arias, J. Campos, J.-P. Gossez, On the antimaximum principle and the Fučik spectrum for the Neumann p-Laplacian, Differential Integral Equations 13 (1–3) (2000) 217–226.
[2] V. Benci, A geometrical index for the group $S^1$ and some applications to the study of periodic solutions of ordinary differential equations, Comm. Pure Appl. Math. 34 (4) (1981) 393–432.
[3] H. Berestycki, J.-M. Lasry, G. Mancini, B. Ruf, Existence of multiple periodic orbits on star-shaped Hamiltonian surfaces, Comm. Pure Appl. Math. 38 (3) (1985) 253–289.
[4] G.E. Bredon, Introduction to Compact Transformation Groups, Pure Appl. Math., vol. 46, Academic Press, New York, 1972.
[5] M. Clapp, Borsuk–Ulam theorems for perturbed symmetric problems, Nonlinear Anal. 47 (6) (2001) 3749–3758.
[6] R. Courant, D. Hilbert, Methods of Mathematical Physics, vol. I, Interscience, New York, 1953.
[7] M. Cuesta, On the Fučík spectrum of the Laplacian and the p-Laplacian, in: Proceedings of the "2000 Seminar in Differential Equations", Kvilda, May–June 2000, University of West Bohemia Press, Plzeň, 2001.
[8] M. Cuesta, J.-P. Gossez, A variational approach to nonresonance with respect to the Fučik spectrum, Nonlinear Anal. 19 (5) (1992) 487–500.
[9] M. Cuesta, D. de Figueiredo, J.-P. Gossez, The beginning of the Fučik spectrum for the p-Laplacian, J. Differential Equations 159 (1) (1999) 212–238.



[10] E.N. Dancer, On the Dirichlet problem for weakly non-linear elliptic partial differential equations, Proc. Roy. Soc. Edinburgh Sect. A 76 (4) (1976/77) 283–300.
[11] D.G. de Figueiredo, J.-P. Gossez, On the first curve of the Fučik spectrum of an elliptic operator, Differential Integral Equations 7 (5–6) (1994) 1285–1302.
[12] D.G. de Figueiredo, B. Ruf, On the periodic Fučik spectrum and a superlinear Sturm–Liouville equation, Proc. Roy. Soc. Edinburgh Sect. A 123 (1) (1993) 95–107.
[13] S. Fučík, Boundary value problems with jumping nonlinearities, Časopis Pěst. Mat. 101 (1) (1976) 69–87.
[14] T. Gallouët, O. Kavian, Résultats d'existence et de non-existence pour certains problèmes demi-linéaires à l'infini, Ann. Fac. Sci. Toulouse Math. (5) 3 (3–4) (1981) 201–246 (1982).
[15] J. Horák, W. Reichel, Analytical and numerical results for the Fučík spectrum of the Laplacian, J. Comput. Appl. Math. 161 (2) (2003) 313–338.
[16] C.A. Magalhães, Semilinear elliptic problem with crossing of multiple eigenvalues, Comm. Partial Differential Equations 15 (9) (1990) 1265–1292.
[17] W. Marzantowicz, A Borsuk–Ulam theorem for orthogonal $T^k$ and $Z_p^r$ actions and applications, J. Math. Anal. Appl. 137 (1) (1989) 99–121.
[18] B. Ruf, On nonlinear elliptic problems with jumping nonlinearities, Ann. Mat. Pura Appl. (4) 128 (1981) 133–151.
[19] M. Schechter, The Fučík spectrum, Indiana Univ. Math. J. 43 (4) (1994) 1139–1157.


Continuity and generators of dynamical semigroups for infinite Bose systems

Philippe Blanchard a,∗, Mario Hellmich a, Piotr Ługiewicz b, Robert Olkiewicz b

a Faculty of Physics and BiBoS, University of Bielefeld, Universitätsstr. 25, 33615 Bielefeld, Germany
b Institute of Theoretical Physics, University of Wrocław, Pl. M. Borna 9, 50204 Wrocław, Poland

Received 15 May 2008; accepted 19 May 2008
Available online 18 June 2008
Communicated by P. Malliavin

Abstract

For a class of quasifree quantum dynamical semigroups on the algebra of the canonical commutation relations (CCR) we give sufficient conditions for these semigroups to extend to ultraweakly continuous semigroups of normal operators on the von Neumann algebra associated with a representation of the CCR. Then the explicit form of the generators of the extended semigroups is calculated.

© 2008 Published by Elsevier Inc.

Keywords: Quantum dynamical semigroup; Generator; Von Neumann algebra; Open quantum system

∗ Corresponding author.
0022-1236/$ – see front matter © 2008 Published by Elsevier Inc. doi:10.1016/j.jfa.2008.05.013

P. Blanchard et al. / Journal of Functional Analysis 256 (2009) 1453–1475

1. Introduction

The purpose of this article is to construct a class of quantum dynamical semigroups [2] on von Neumann algebras arising from representations of the algebra of canonical commutation relations (CCR-algebra) and to find their generators. The need to construct such semigroups arises in the theory of open quantum systems, where they describe the irreversible time evolution of such systems. In particular, our motivation to construct such semigroups stems from applications to decoherence. The program of environmental decoherence attempts to give an answer to the fundamental question of why the objects around us obey the laws of classical physics despite the fact that our most fundamental physical theory, quantum theory, results in contradictions to what is observed when directly applied to macroscopic systems [5,22,38]. Decoherence is an effect which leads to a dynamical destruction of quantum interference due to the unavoidable openness of macroscopic systems. It accepts quantum theory as a fundamental description of nature but contends that it is practically impossible to distinguish operationally between the vast majority of pure states and the corresponding statistical mixtures due to the emergence of so-called environment-induced superselection rules, which indicate that the quantum system acquires classical properties. Therefore, since reversible time evolutions preserve pure states, one has to consider irreversible evolutions, which are characteristic for open systems, in order to construct models exhibiting the decoherence effect. In the Markovian approximation, these evolutions are described by a quantum dynamical semigroup [2]. We will not discuss applications to decoherence in the present paper but we hope to come back to this point in a future publication.

We are primarily interested in irreversible dynamics of systems with infinitely many degrees of freedom. To describe such systems rigorously we cannot use the framework of standard quantum mechanics in which observables are represented by self-adjoint linear operators on some Hilbert space and states by positive and normalized trace class operators. Instead we use the algebraic framework of quantum theory, which generalizes the mentioned structures of standard quantum mechanics by describing a system through certain representations of an abstract C∗-algebra.
Among other things this method takes into account that for infinite systems the canonical commutation and anticommutation relations admit many inequivalent representations, and that time evolution may not be implemented by a Hamiltonian (which, as the observable of total energy, may not exist in representations describing an infinite system), see [19] for a thorough discussion of these points. Moreover, it is possible to characterize decoherence directly in terms of the algebraic framework [3,6,24,29], so if we want to discuss decoherence in infinite systems we have to construct irreversible evolutions on the von Neumann algebra corresponding to representations of the underlying C∗ -algebra. In this paper we concentrate on Markovian evolutions of bosonic systems given by a quantum dynamical semigroup, so in our case the C∗ -algebra is the algebra of canonical commutation relations (CCR-algebra). Several methods to construct quantum dynamical semigroups on von Neumann algebras have been developed. Uniformly continuous dynamical semigroups have been characterized through their generators [12] by exploiting complete positivity. However, for bosonic systems uniform continuity (or even strong continuity) is a too strong restriction. On von Neumann algebras in standard form a one-to-one correspondence between dynamical semigroups satisfying a certain symmetry condition and noncommutative Dirichlet forms has been established [13], and a method for constructing certain noncommutative Dirichlet forms has been given [30]. This has been applied to construct an example of a weak∗ -continuous quantum dynamical semigroup on representations of the CCR-algebra with respect to quasifree states [4]. However, these methods have not resulted in a rich supply of examples of dynamical semigroups on von Neumann algebras, in particular not on von Neumann algebras corresponding to representations of the CCR. 
We attempt to improve this situation by introducing a class of quantum dynamical semigroups on representations of the CCR. We start out from a class of semigroups on the CCR C∗-algebra, constructed over an arbitrary symplectic space, which was introduced in [7]. These semigroups were constructed by means of the so-called $\{\hat S_t\}_{t\in\mathbb{R}}$-perturbed convolution semigroups of measures. Then we address the question under which conditions they extend to quantum dynamical semigroups on representations of the CCR-algebra. More precisely, we consider the following problem. Let $A(S)$ be the CCR-algebra over a symplectic space $S$ and suppose that $\{\tau_t\}_{t\ge 0}$ is



a one-parameter semigroup of positive contractive and unital operators on $A(S)$, induced by a $\{\hat S_t\}_{t\in\mathbb{R}}$-perturbed convolution semigroup. Let $\pi$ be a representation of $A(S)$ on the Hilbert space $H$ and let $\pi(A(S))'' = M$ be the von Neumann algebra corresponding to $\pi$. Does there exist a quantum dynamical semigroup $\{T_t\}_{t\ge 0}$ on $M$ such that $T_t(\pi(x)) = \pi(\tau_t(x))$ for all $x \in A(S)$ and $t \ge 0$? Here, by the term quantum dynamical semigroup we mean a semigroup $\{T_t\}_{t\ge 0}$ of normal unital and completely positive operators on a von Neumann algebra such that the map $\mathbb{R}_+ \ni t \mapsto T_t(x)$ is ultraweakly continuous for each $x \in M$. This problem consists of two parts: the first is the question of ultraweak continuity of the map $t \mapsto T_t(x)$ and the second is that of normality (i.e. ultraweak–ultraweak continuity) of each $T_t$.

For automorphism groups the first part of this question has been considered by a number of authors [1,23,28,31]. If $A$ is a concrete C∗-algebra acting on a Hilbert space and $\{\alpha_g\}_{g\in G} \subseteq \operatorname{Aut} A$ is a group of automorphisms (here $G$ is a topological group) such that each $\alpha_g$ extends to an automorphism $\bar\alpha_g$ on $M = A''$, they gave sufficient conditions for $\{\bar\alpha_g\}_{g\in G}$ to be ultraweakly continuous provided $G \ni g \mapsto \alpha_g(x)$, $x \in A$, is ultraweakly continuous. In Section 2 we provide an answer to a corresponding question for weak∗-continuous one-parameter semigroups on Banach spaces; then in Section 3 this result is applied to the case of quantum dynamical semigroups on von Neumann algebras.

The second part of the above question is addressed in Section 4. There we give conditions on the $\{\hat S_t\}_{t\in\mathbb{R}}$-perturbed convolution semigroup of measures and the representation of the CCR-algebra such that a quantum dynamical semigroup $\{T_t\}_{t\ge 0}$ with the mentioned properties exists. Having obtained quantum dynamical semigroups on the von Neumann algebra $M$ corresponding to representations of the CCR-algebra by our extension procedure, we can inquire about the explicit form of their generators.
In Section 4.3 we calculate the generators of the semigroups $\{T_t\}_{t\ge 0}$ induced by $\{\hat S_t\}_{t\in\mathbb{R}}$-perturbed convolution semigroups of Gaussian, Poisson, and Dirac type.

2. Continuity of semigroups on Banach spaces

Let $X$ be a Banach space. We assume that it has a predual space $X_*$, i.e. a Banach space such that its dual equals $X$. The canonical dual pairing between $X$ and $X_*$ is denoted by $\langle\cdot,\cdot\rangle$. Generally, unit balls of normed spaces $X$ will be denoted by $\operatorname{ball} X$. We will consider the $\sigma(X, X_*)$-topology or weak∗-topology on $X$. A semigroup $\{T_t\}_{t\ge 0}$ of $\|\cdot\|$-bounded operators is called weak∗-continuous if the maps $t \mapsto T_t(x)$ for each $x \in X$ and $x \mapsto T_t(x)$ for each $t \ge 0$ are continuous in this topology. We define its weak∗-generator $Z$ as usual:

$$\operatorname{dom} Z = \Big\{x \in X : \lim_{t\downarrow 0} t^{-1}\big(T_t(x) - x\big) \text{ exists in the weak∗-topology}\Big\},$$

$$Zx = \lim_{t\downarrow 0} t^{-1}\big(T_t(x) - x\big) \quad \text{in the weak∗-topology}, \; x \in \operatorname{dom} Z.$$

A semigroup $\{T_t\}_{t\ge 0}$ is called contractive if $\|T_t\| \le 1$ for all $t \ge 0$. We start by recalling the Hille–Yosida theorem, which is the basis of the proof of the main result in this section (Theorem 4).

Theorem 1 (Hille–Yosida). Let $Z$ be an operator on $X$, then the following assertions are equivalent:

(1) The operator $Z$ is the generator of a weak∗-continuous contractive semigroup.



(2) The operator $Z$ is weak∗-densely defined and weak∗–weak∗-closed. For all numbers $\lambda \ge 0$

$$\|(\lambda 1 - Z)x\| \ge \lambda\|x\| \quad \text{for all } x \in \operatorname{dom} Z, \qquad (1)$$

and for some and hence all $\lambda > 0$

$$\operatorname{ran}(\lambda 1 - Z) = X. \qquad (2)$$

If (1) or (2) holds we have the following integral representation of the resolvent $R_\lambda(Z)$ of $Z$,

$$R_\lambda(Z)x = (\lambda 1 - Z)^{-1}x = \int_0^\infty e^{-\lambda s}\, T_s(x)\, ds, \qquad x \in X,\ \operatorname{Re}\lambda > 0, \qquad (3)$$

where the integral converges in the weak∗-topology.

For a proof see [10, Proposition 3.1.6 and Theorem 3.1.10]. In the following we assume $X_*$ to be separable and that there exists a subspace $X_0 \subseteq X$ such that $\operatorname{ball} X_0 \subseteq \operatorname{ball} X$ is weak∗-dense (then clearly $X_0$ is weak∗-dense in $X$). Let $\{T_t^0\}_{t\ge 0}$ be a semigroup of contractive operators $T_t^0 : X_0 \to X_0$. We will assume that each operator $T_t^0$ has an extension to a weak∗–weak∗-continuous operator $T_t : X \to X$. It is clear that the extension $T_t$ is unique and contractive. Our objective is to prove that under these assumptions $\{T_t\}_{t\ge 0}$ is a weak∗-continuous semigroup on $X$. We start by establishing two lemmas.

Lemma 2. Assume that $t \mapsto T_t(x) = T_t^0(x)$ is weak∗-measurable for each $x \in X_0$ and define the family of operators $\{R_\lambda^0 : \lambda > 0\}$ by

$$R_\lambda^0(x) = \int_0^\infty e^{-\lambda s}\, T_s^0(x)\, ds, \qquad \text{where } x \in X_0,\ \lambda > 0. \qquad (4)$$

Then $R_\lambda^0$ extends to a bounded weak∗–weak∗-continuous operator $R_\lambda$ on $X$ such that $\|R_\lambda\| \le 1/\lambda$ for all $\lambda > 0$.

Proof. The separability of $X_*$ implies that the weak∗-topology on $\operatorname{ball} X$ is metrizable and by assumption $\operatorname{ball} X_0 \subseteq \operatorname{ball} X$ is dense. Let $x \in \operatorname{ball} X$ and choose $\{x_n\}_{n\in\mathbb{N}} \subseteq \operatorname{ball} X_0$ with $x_n \to x$. As pointwise limit of the maps $t \mapsto \alpha_n(t) = T_t^0(x_n)$ the map $t \mapsto \alpha(t) = T_t(x)$ is weak∗-measurable and we may define

$$R_\lambda(x) = \int_0^\infty e^{-\lambda s}\, T_s(x)\, ds \qquad (5)$$



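for all $x \in X$ and $\lambda > 0$, where the integral is taken to be a weak∗-integral; then clearly $R_\lambda$ extends $R_\lambda^0$. Next we have

$$|\langle R_\lambda(x), \varphi\rangle| \le \int_0^\infty e^{-\lambda s}\, \|x\|\cdot\|\varphi\|\, ds \le \frac{1}{\lambda}\|x\|\cdot\|\varphi\|$$

for all $x \in X$ and $\varphi \in X_*$, thus $\|R_\lambda\| \le 1/\lambda$. By the dominated convergence theorem we obtain for $\varphi \in X_*$ that

$$\langle R_\lambda(x_n), \varphi\rangle = \int_0^\infty e^{-\lambda s} \langle T_s(x_n), \varphi\rangle\, ds \to \int_0^\infty e^{-\lambda s} \langle T_s(x), \varphi\rangle\, ds = \langle R_\lambda(x), \varphi\rangle$$

as $n \to \infty$ for any sequence $\{x_n\}_{n\in\mathbb{N}} \subseteq \operatorname{ball} X$ with $x_n \to x$, therefore $R_\lambda$ is weak∗–weak∗-continuous when restricted to $\operatorname{ball} X$. Using the Krein–Šmulian theorem this implies that $R_\lambda$ is weak∗–weak∗-continuous on $X$. □

Lemma 3. The operators $R_\lambda$ defined in Lemma 2 satisfy the resolvent equation

$$R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu \quad \text{for all } \lambda, \mu > 0. \qquad (6)$$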

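Proof. Assume that $\lambda - \mu > 0$. Using (5) we obtain

$$\begin{aligned}
R_\lambda R_\mu(x) &= \int_0^\infty\!\!\int_0^\infty e^{-\lambda s} e^{-\mu t}\, T_{t+s}(x)\, dt\, ds \\
&= \int_0^\infty e^{-(\lambda-\mu)s} \int_s^\infty e^{-\mu t}\, T_t(x)\, dt\, ds \\
&= \frac{1}{\lambda-\mu} R_\mu(x) - \int_0^\infty e^{-(\lambda-\mu)s} \int_0^s e^{-\mu t}\, T_t(x)\, dt\, ds \\
&= \frac{1}{\lambda-\mu}\big(R_\mu(x) - R_\lambda(x)\big)
\end{aligned}$$

by a partial integration. From $[R_\lambda, R_\mu] = 0$ for all $\lambda, \mu > 0$ the result follows. □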
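A quick numerical sanity check of the Laplace-transform formula (5), the resolvent equation (6) and the bound $\|R_\lambda\| \le 1/\lambda$ can be made in the simplest possible case. The sketch below is an illustration only, under strong assumptions that are not part of the paper: $X = \mathbb{C}$ and the semigroup is the hypothetical scalar one $T_s(x) = e^{-As}x$ with $\operatorname{Re} A \ge 0$, whose generator is $Z = -A$. The names `A`, `T`, `resolvent_numeric` and `resolvent_exact` are made up for this example.

```python
import cmath

# Hypothetical scalar model: T_s(x) = exp(-A*s) * x on X = C, generator Z = -A.
A = 0.5 + 2.0j  # Re A >= 0, so |T_s(x)| <= |x| (the semigroup is contractive)


def T(s, x):
    """Scalar contraction semigroup T_s(x) = exp(-A*s) * x."""
    return cmath.exp(-A * s) * x


def resolvent_numeric(lam, x, upper=40.0, steps=40_000):
    """Trapezoidal approximation of int_0^upper exp(-lam*s) T_s(x) ds, cf. (5)."""
    h = upper / steps
    total = 0.5 * (T(0.0, x) + cmath.exp(-lam * upper) * T(upper, x))
    for n in range(1, steps):
        s = n * h
        total += cmath.exp(-lam * s) * T(s, x)
    return total * h


def resolvent_exact(lam, x):
    """Closed form (lam - Z)^{-1} x = x / (lam + A) for Z = -A."""
    return x / (lam + A)


if __name__ == "__main__":
    lam, mu, x = 3.0, 1.0, 1.0
    # Formula (5): the Laplace transform of the orbit gives the resolvent.
    print(abs(resolvent_numeric(lam, x) - resolvent_exact(lam, x)))
    # Resolvent equation (6): R_lam - R_mu = (mu - lam) R_lam R_mu.
    lhs = resolvent_exact(lam, x) - resolvent_exact(mu, x)
    rhs = (mu - lam) * resolvent_exact(lam, resolvent_exact(mu, x))
    print(abs(lhs - rhs))
    # Contractivity of lam * R_lam.
    print(abs(lam * resolvent_exact(lam, x)))
```

In this scalar model everything is explicit, so the check says nothing about the weak∗-topological subtleties handled in Lemmas 2 and 3; it only confirms the algebra behind (5) and (6).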

Theorem 4. Assume that $t \mapsto T_t(x) = T_t^0(x)$ is weak∗-continuous for all $x \in X_0$. Then $\{T_t\}_{t\ge 0}$ is a weak∗-continuous contractive semigroup on $X$.

Proof. Using (6) we conclude that for any $\lambda, \mu > 0$ the following relations hold true:

$$\ker R_\lambda = \ker R_\mu, \qquad \operatorname{ran} R_\lambda = \operatorname{ran} R_\mu. \qquad (7)$$



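Now let $\varepsilon > 0$ and take $x \in X_0$, $\varphi \in X_*$. Then there exists $\delta > 0$ such that $0 \le t < \delta$ implies $|\langle T_t(x) - x, \varphi\rangle| = |\langle T_t^0(x) - x, \varphi\rangle| < \varepsilon$. Furthermore, there exists $\lambda_0 \ge 0$ such that $\lambda > \lambda_0$ implies $\lambda\int_\delta^\infty e^{-\lambda s}\, ds = e^{-\lambda\delta} < \varepsilon$. Thus if $\lambda > \lambda_0$ it follows that

$$\begin{aligned}
|\langle \lambda R_\lambda(x) - x, \varphi\rangle| &\le \lambda\int_0^\delta e^{-\lambda s}\, |\langle T_s(x) - x, \varphi\rangle|\, ds + \lambda\int_\delta^\infty e^{-\lambda s}\, |\langle T_s(x) - x, \varphi\rangle|\, ds \\
&\le \varepsilon\lambda\int_0^\delta e^{-\lambda s}\, ds + 2\lambda\|x\|\cdot\|\varphi\|\int_\delta^\infty e^{-\lambda s}\, ds \\
&< \varepsilon + 2\varepsilon\|x\|\cdot\|\varphi\|,
\end{aligned}$$

which shows that $\lambda R_\lambda(x) \to x$ as $\lambda \to \infty$ in the weak∗-topology for any $x \in X_0$. Since $\|\lambda R_\lambda\| \le 1$ it follows from (7) that $\operatorname{ball} X_0 \subseteq \overline{\operatorname{ball}\operatorname{ran} R_\lambda} \subseteq \operatorname{ball} X$ (closure in the weak∗-topology), which implies that $\operatorname{ball}\operatorname{ran} R_\lambda$ is dense in $\operatorname{ball} X$. Next from (6) we conclude that

$$\|(\lambda R_\lambda - 1)R_\mu\| = \|\lambda R_\lambda R_\mu - R_\lambda + (\mu - \lambda)R_\lambda R_\mu\| \le \|R_\lambda\| + \mu\|R_\lambda\|\cdot\|R_\mu\| \le 2/\lambda$$

for all $\lambda > 0$. Thus $\|(\lambda R_\lambda - 1)x\| \le (2/\lambda)\|x\|$ for all $x \in \operatorname{ran} R_\lambda$. For each $x \in X$ there exists a $\|\cdot\|$-bounded sequence $\{x_n\}_{n\in\mathbb{N}} \subseteq \operatorname{ran} R_\lambda$ with $\|x_n\| \le \|x\|$ such that $\lim x_n = x$ relative to the weak∗-topology. Then, for any $\varphi \in X_*$, we have

$$|\langle (\lambda R_\lambda - 1)x_n, \varphi\rangle| \le (2/\lambda)\|x\|\cdot\|\varphi\|,$$

and upon letting $n \to \infty$ we get $|\langle (\lambda R_\lambda - 1)x, \varphi\rangle| \le (2/\lambda)\|x\|\cdot\|\varphi\|$, and finally $\|(\lambda R_\lambda - 1)x\| \le (2/\lambda)\|x\|$ for all $x \in X$. This proves that $\lim_{\lambda\to\infty} \lambda R_\lambda x = x$ for all $x \in X$ in the norm topology. If $x \in \ker R_\lambda$ for some and hence all $\lambda > 0$ then this result implies $0 = \lim_{\lambda\to\infty} \lambda R_\lambda x = x$, hence $\ker R_\lambda = \{0\}$ and $R_\lambda$ is injective for any $\lambda > 0$. This allows us to define the operator

$$Z = \lambda_0 1 - R_{\lambda_0}^{-1}, \qquad \operatorname{dom} Z = \operatorname{ran} R_{\lambda_0},$$

for some $\lambda_0 > 0$. Using (6) we have

$$\begin{aligned}
(\lambda 1 - Z)R_\lambda &= \big[(\lambda - \lambda_0)1 + (\lambda_0 1 - Z)\big] R_\lambda \\
&= \big[(\lambda - \lambda_0)1 + (\lambda_0 1 - Z)\big] R_{\lambda_0}\big[1 - (\lambda - \lambda_0)R_\lambda\big] \\
&= 1 + (\lambda - \lambda_0)\big[R_{\lambda_0} - R_\lambda - (\lambda - \lambda_0)R_\lambda R_{\lambda_0}\big] = 1, \qquad (8)
\end{aligned}$$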


and similarly also $R_\lambda(\lambda 1 - Z) = 1$. This shows that $R_\lambda$ is the resolvent of $Z$ and that the definition of $Z$ does not depend on $\lambda_0$. Moreover, $Z$ is weak∗-densely defined and weak∗–weak∗-closed since $R_\lambda$ is weak∗–weak∗-continuous. Taken together we have $\operatorname{ran}(\lambda 1 - Z) = X$ and $\|(\lambda 1 - Z)x\| \ge \lambda\|x\|$ for all $x \in \operatorname{dom} Z$, $\lambda > 0$, showing that condition (2) of Theorem 1 is satisfied and $Z$ is the generator of a weak∗-continuous contractive semigroup $\{S_t\}_{t\ge 0}$. It remains to prove that $\{S_t\}_{t\ge 0}$ coincides with $\{T_t\}_{t\ge 0}$. By the integral representation of the resolvent we have

$$\int_0^\infty e^{-\lambda s}\, S_s(x)\, ds = R_\lambda(x) = \int_0^\infty e^{-\lambda s}\, T_s(x)\, ds$$

for $x \in X$ and all $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda > 0$, which implies equality of $\{T_t\}_{t\ge 0}$ and $\{S_t\}_{t\ge 0}$ because the integrands are continuous functions of $s$. □

3. Continuity of semigroups on von Neumann algebras

We now apply the result of the last section to semigroups on von Neumann algebras. For the basic facts and terminology about C∗-algebras and von Neumann algebras we refer to [10,32]. Recall that on a von Neumann algebra $M$ with predual space $M_*$ the weak∗-topology, i.e. the $\sigma(M, M_*)$-topology, is equivalent to the ultraweak topology, and that an operator $T : M \to M$ is normal if and only if it is $\sigma(M, M_*)$–$\sigma(M, M_*)$-continuous. A semigroup $\{T_t\}_{t\ge 0}$ of operators on a von Neumann algebra $M$ is called a quantum dynamical semigroup if it is weak∗-continuous, i.e. if each $T_t$ is continuous and if $t \mapsto T_t(x)$ is ultraweakly continuous for all $x \in M$, and moreover if each operator $T_t$ is completely positive and unital, i.e. $T_t(1) = 1$ for any $t \ge 0$. Then $\{T_t\}_{t\ge 0}$ is automatically contractive.

Let $B$ be a nondegenerate ∗-subalgebra on the Hilbert space $H$ ($B$ is automatically nondegenerate if it contains the identity operator $1$), and let $M = B''$, the von Neumann algebra generated by $B$; then $B$ is ultraweakly dense in $M$. We consider a contractive semigroup $\{T_t^0\}_{t\ge 0}$ on $B$.

Corollary 5. Assume that $H$ is separable and that each $T_t^0$, $t \ge 0$, extends to a normal operator $T_t$ on $M$. Moreover, assume that $t \mapsto T_t(x)$ is ultraweakly continuous for each $x \in B$. Then $\{T_t\}_{t\ge 0}$ is a weak∗-continuous contractive semigroup on $M$.

Proof. Since $H$ is separable, it follows that $M_*$ is separable [32]. Moreover, by the Kaplansky density theorem $\operatorname{ball} B \subseteq \operatorname{ball} M$ is weak∗-dense and the conclusion follows from Theorem 4. □

It is worth recalling at this point that any unital linear map on a C∗-algebra into a C∗-algebra is contractive if and only if it is positive. Thus if $T_t^0$ in Corollary 5 is unital then the semigroup $\{T_t\}_{t\ge 0}$ is positive. Now let $A$ be an abstract C∗-algebra and let $\{\tau_t\}_{t\ge 0}$ be a contractive semigroup on $A$.
For a state $\omega$ on $A$ let us consider the Gelfand–Naimark–Segal (GNS) representation $(\pi_\omega, H_\omega, \xi_\omega)$ of $A$ on the Hilbert space $H_\omega$ with cyclic vector $\xi_\omega \in H_\omega$. The semigroup $\{\tau_t\}_{t\ge 0}$ is called $\omega$-continuous if $t \mapsto \pi_\omega(\tau_t(x))$ is ultraweakly continuous for all $x \in A$, and we say that $\{\tau_t\}_{t\ge 0}$ is $\omega$-covariant if there is a semigroup of normal contractive operators $\{T_t\}_{t\ge 0}$ on $M$ such that

$$T_t\big(\pi_\omega(x)\big) = \pi_\omega\big(\tau_t(x)\big), \qquad x \in A,\ t \ge 0. \qquad (9)$$



Let us observe that $\{T_t\}_{t\ge 0}$ is uniquely defined. Furthermore, $\{T_t\}_{t\ge 0}$ is positive (resp. completely positive) if (and only if, provided $\pi_\omega$ is faithful) $\{\tau_t\}_{t\ge 0}$ is positive (resp. completely positive). A state $\omega$ is called separating if $\xi_\omega$ is a separating vector for the von Neumann algebra $M = \pi_\omega(A)''$, or equivalently, if $\xi_\omega$ is cyclic for $M'$. Let $S(A)$ be the set of all states on $A$ and let $N_\omega \subseteq S(A)$ be the folium of all $\omega$-normal states, i.e. $\psi \in N_\omega$ if and only if there exists a normal state $\varphi$ on $\pi_\omega(A)''$ such that $\psi(x) = \langle \pi_\omega(x), \varphi\rangle$ for all $x \in A$. Recall that two states $\omega_1$ and $\omega_2$ are called quasi-equivalent if $N_{\omega_1} = N_{\omega_2}$.

Theorem 6. Let $\omega$ be a separating state on $A$ and let $H_\omega$ be separable. Suppose that a positive unital semigroup $\{\tau_t\}_{t\ge 0}$ is $\omega$-continuous, and let $\omega$ and $\omega\circ\tau_t$ be quasi-equivalent for any $t \ge 0$. Then $\{\tau_t\}_{t\ge 0}$ is $\omega$-covariant and the semigroup $\{T_t\}_{t\ge 0}$ on $M$ satisfying (9) is positive and weak∗-continuous.

Proof. If $\omega$ is separating we have $N_\omega = \overline{\{\psi \in S(A) : \psi \le \lambda\omega \text{ for some } \lambda \ge 0\}}$, where the closure is taken in the norm topology (see [15, Theorem 4.2]). By assumption we have $N_\omega = N_{\omega\circ\tau_t}$. Let $\psi \in N_\omega$, then there exists a sequence $\{\psi_n\}_{n\in\mathbb{N}}$ with $\psi_n \le \lambda_n\omega$ such that $\lambda_n \ge 0$ and $\|\psi_n - \psi\| \to 0$ as $n \to \infty$. By positivity of $\tau_t$ we have $\psi_n\circ\tau_t \le \lambda_n(\omega\circ\tau_t)$, hence $\psi_n\circ\tau_t \in N_{\omega\circ\tau_t}$, and since folia are norm-closed we obtain $\psi\circ\tau_t \in N_{\omega\circ\tau_t} = N_\omega$; thus we have proved that $\psi\circ\tau_t \in N_\omega$ for every $\psi \in N_\omega$. Therefore if $\varphi \in M_*$ is a normal state and $\psi(x) = \langle \pi_\omega(x), \varphi\rangle$, $x \in A$, there exists a normal state $T_{t,*}(\varphi) \in M_*$ such that $\psi(\tau_t(x)) = \langle \pi_\omega(\tau_t(x)), \varphi\rangle = \langle \pi_\omega(x), T_{t,*}(\varphi)\rangle$, thus a bounded positive map $T_{t,*}$ on $M_*$ is well defined and its dual $T_t$ is normal and satisfies (9). We conclude by Corollary 5 since $\pi_\omega(A)$ is a nondegenerate ∗-algebra and since $\{\tau_t\}_{t\ge 0}$ is $\omega$-continuous. □

We remark that it is easy to show that $\alpha \in \operatorname{Aut} A$ is $\omega$-covariant if and only if $\omega$ and $\omega\circ\alpha$ are quasi-equivalent (use e.g. [10, Theorem 4.2.26]).
If $A$ has a quasilocal structure, states of different normal folia correspond to systems which differ in global properties [20], such as temperature for systems in thermal equilibrium (KMS states on a type III von Neumann algebra at different temperatures are disjoint). So the last result can be interpreted as follows: if an evolution $\{\tau_t\}_{t \ge 0}$ changes only local properties of the system (i.e. $\omega$ and $\omega \circ \tau_t$ remain quasi-equivalent for all $t \ge 0$ and hence $N_\omega = N_{\omega \circ \tau_t}$), then the evolution is $\omega$-covariant and is given by a semigroup of affine maps on the normal states of a single representation.

4. Dynamical semigroups on representations of the CCR-algebra

4.1. Perturbed convolution semigroups

Let $(S, \sigma)$ be a symplectic space, i.e. a real vector space $S$ with an alternating bilinear form $\sigma : S \times S \to \mathbb{R}$ which we allow to be degenerate. Let $W(f)$, $f \in S$, be abstract symbols, called Weyl operators, and let $A_0(S, \sigma)$ be the set of all formal linear combinations of Weyl operators,

$$A_0(S, \sigma) = \Bigl\{ \sum_{k=1}^n z_k W(f_k) : z_k \in \mathbb{C},\ f_k \in S,\ n \in \mathbb{N} \Bigr\}.$$

The product of two Weyl operators is defined by the canonical commutation relations

$$W(f) W(g) = e^{-i\sigma(f,g)/2}\, W(f + g), \qquad f, g \in S, \tag{10}$$

and the involution by $W(-f) = W(f)^*$, which may be extended by linearity and antilinearity, respectively, to $A_0(S, \sigma)$. The completion $A_1(S, \sigma)$ of $A_0(S, \sigma)$ with respect to the norm

$$\Bigl\| \sum_{k=1}^n z_k W(f_k) \Bigr\|_1 := \sum_{k=1}^n |z_k|, \qquad f_i \ne f_j \text{ if } i \ne j,$$

is a Banach $*$-algebra. We now define a norm on $A_1(S, \sigma)$ by

$$\|x\| := \sup_{\omega \in \mathcal{S}(A_1(S,\sigma))} \omega(x^* x)^{1/2}, \qquad x \in A_1(S, \sigma),$$

where $\mathcal{S}(A_1(S, \sigma))$ denotes the set of all states on $A_1(S, \sigma)$. Then the completion of $A_1(S, \sigma)$ in this norm is a C$^*$-algebra $A(S, \sigma)$, called the algebra of canonical commutation relations, in short the CCR-algebra. We write $A(S)$ if no confusion can arise about $\sigma$. The study of CCR-algebras has generated a vast amount of literature, see for instance [17,25], and [27] for the case of a degenerate symplectic form. For a state $\omega \in \mathcal{S}(A(S, \sigma))$ we define its characteristic function $c_\omega : S \to \mathbb{C}$ by $f \mapsto c_\omega(f) = \omega(W(f))$. Then the characteristic function satisfies the properties:

(1) $c_\omega(0) = 1$.

(2) For all $f_1, \ldots, f_n \in S$ and $z_1, \ldots, z_n \in \mathbb{C}$, $n \in \mathbb{N}$, it follows that

$$\sum_{i,j=1}^n \bar z_i z_j\, e^{i\sigma(f_i, f_j)/2}\, c_\omega(f_j - f_i) \ge 0.$$
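As a short worked check (a sketch, not part of the original argument), property (2) is the positivity $\omega(x^* x) \ge 0$ evaluated on a finite combination $x = \sum_j z_j W(f_j)$, using only the Weyl relations (10), $W(f)^* = W(-f)$, and the bilinearity of $\sigma$. The element $x$ is introduced here purely for illustration.

```latex
% Assumption: x = \sum_{j=1}^n z_j W(f_j) is an arbitrary element of A_0(S,\sigma).
\begin{align*}
0 \le \omega(x^* x)
  &= \sum_{i,j=1}^n \bar z_i z_j \, \omega\bigl(W(-f_i)\, W(f_j)\bigr) \\
  &= \sum_{i,j=1}^n \bar z_i z_j \, e^{-i\sigma(-f_i, f_j)/2}\,
       \omega\bigl(W(f_j - f_i)\bigr)
       && \text{by (10)} \\
  &= \sum_{i,j=1}^n \bar z_i z_j \, e^{i\sigma(f_i, f_j)/2}\, c_\omega(f_j - f_i).
\end{align*}
% Property (1) is the normalisation \omega(W(0)) = \omega(1) = 1.
```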

Conversely [27], every function $S \to \mathbb{C}$ satisfying properties (1) and (2) defines a state. If $S$ is considered as an Abelian group equipped with the discrete topology, its character group, endowed with the Gelfand topology, is denoted by $\hat S$; then $\hat S$ is compact. The algebra of all bounded complex Borel measures on $\hat S$ is denoted by $M_b(\hat S)$ and the set of probability measures by $M_1^+(\hat S)$. For $\mu \in M_b(\hat S)$ the Fourier transform is defined by

$$\hat{\hat S} \cong S \ni f \mapsto (\mathcal{F}\mu)(f) = \int_{\hat S} \chi(f)\, d\mu(\chi),$$

where $\hat{\hat S}$ is the character group of $\hat S$, which is canonically isomorphic to $S$ by the Pontrjagin–van Kampen theorem. If $\mu$ is a positive measure, $\mathcal{F}\mu$ is a positive-definite functional on $S$, and by Bochner's theorem any positive-definite functional on $S$ is the Fourier transform of a positive measure in $M_b^+(\hat S)$. For any linear operator $S : S \to S$ on the symplectic space, the dual operator $\hat S : \hat S \to \hat S$, defined by $(\hat S\chi)(f) = \chi(Sf)$, $f \in S$, $\chi \in \hat S$, is a continuous group homomorphism. In [7] we introduced the concept of a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup: let $\{S_t\}_{t \in \mathbb{R}}$ be a group of symplectic linear maps on $S$, i.e.

$$\sigma(S_t(f), S_t(g)) = \sigma(f, g) \qquad \text{for all } f, g \in S,\ t \in \mathbb{R}.$$



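Then $\{\mu_t\}_{t \ge 0} \subseteq M_1^+(\hat S)$ is called a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup if $\mu_0 = \delta_e$, the Dirac measure concentrated in the unit $e \in \hat S$, and if

$$\mu_t * (\hat S_t)_* \mu_s = \mu_{t+s}, \qquad s, t \ge 0. \tag{11}$$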
Here $(\hat S_t)_* : M_1^+(\hat S) \to M_1^+(\hat S)$ is defined by $[(\hat S_t)_* \mu](E) = \mu(\hat S_t^{-1}(E))$, where $E$ is a Borel set of $\hat S$ and $\mu \in M_1^+(\hat S)$. It is worth remarking that convolution semigroups satisfying a relation similar to (11) were studied independently in the context of generalized Mehler semigroups, see [8,18,34]. A main result of [7] was the first part of the following theorem, which we generalize here by allowing the symplectic form $\sigma$ to be degenerate.

Theorem 7. Any $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup $\{\mu_t\}_{t \ge 0} \subseteq M_1^+(\hat S)$ induces a unique semigroup $\{\tau_t\}_{t \ge 0}$ of completely positive contractive and unital operators on $A(S, \sigma)$ such that

$$\tau_t(W(f)) = \Gamma_t(f)\, W(S_t(f)), \qquad f \in S,\ t \ge 0, \tag{12}$$

where $\Gamma_t = \mathcal{F}\mu_t$ for any $t \ge 0$. Conversely, for any completely positive contractive semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S, \sigma)$ which satisfies (12), where $\{S_t\}_{t \in \mathbb{R}}$ is a group of symplectic linear maps, there exists a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup $\{\mu_t\}_{t \ge 0}$ such that $\Gamma_t = \mathcal{F}\mu_t$ for any $t \ge 0$.

Proof. Consider a map $\tau$ on $A_0(S, \sigma)$, defined by $\tau(W(f)) = \Gamma(f) W(S(f))$, $f \in S$. It was shown in [16] that $\tau$ extends to a completely positive map on $A(S, \sigma)$ if and only if there exists a state $\omega \in \mathcal{S}(A(S, \sigma_S))$ such that $\Gamma = c_\omega$; here $\sigma_S$ is the symplectic form given by $\sigma_S(f, g) = \sigma(f, g) - \sigma(S(f), S(g))$, $f, g \in S$. Since $S_t$ is symplectic, $\sigma_{S_t}$ is zero, and by positive-definiteness of $f \mapsto (\mathcal{F}\mu_t)(f)$ there exists a state $\omega_t \in \mathcal{S}(A(S, 0))$ such that $\mathcal{F}\mu_t = c_{\omega_t}$; hence $\tau_t$ extends from $A_0(S, \sigma)$ to a completely positive contraction on $A(S, \sigma)$. It is enough to check the semigroup property on Weyl operators, which follows by a calculation using (11). Finally, $\tau_t(1) = 1$ for all $t \ge 0$, where $1 = W(0)$ is the unit in $A(S, \sigma)$.

Conversely, if $\{\tau_t\}_{t \ge 0}$ is a completely positive semigroup of the form (12), then there exist states $\omega_t \in \mathcal{S}(A(S, 0))$ such that $\Gamma_t = c_{\omega_t}$; hence this function is positive-definite and by Bochner's theorem there exists $\mu_t \in M_1^+(\hat S)$ such that $\mathcal{F}\mu_t = \Gamma_t$ for every $t \ge 0$. By the semigroup property,

$$\Gamma_{s+t}(f)\, W(S_{s+t}(f)) = \tau_{s+t}(W(f)) = \tau_s(\tau_t(W(f))) = \Gamma_t(f)\, \Gamma_s(S_t(f))\, W(S_{s+t}(f))$$

for any $s, t \ge 0$ and $f \in S$. Multiplying by $W(S_{s+t}(f))^*$ from the left shows that $\Gamma_t(f)\, \Gamma_s(S_t(f)) = \Gamma_{s+t}(f)$, i.e. $(\mathcal{F}\mu_t)(f) \cdot (\mathcal{F}\mu_s)(S_t(f)) = (\mathcal{F}\mu_{s+t})(f)$, which is equivalent to (11). □

4.2. Extension to representations of the CCR-algebra

Let us fix a symplectic space $(S, \sigma)$ with a nondegenerate symplectic form $\sigma$ and consider $A(S) = A(S, \sigma)$. We will give sufficient conditions for a semigroup $\{\tau_t\}_{t \ge 0}$ of the form (12) with a symplectic group $\{S_t\}_{t \in \mathbb{R}}$ to be $\omega$-covariant for a state $\omega \in \mathcal{S}(A(S))$ such that the corresponding semigroup $\{T_t\}_{t \ge 0}$ defined by (9) is a quantum dynamical semigroup.



Let $E$ be a Hausdorff locally convex topological vector space with topology $\beta$ and let $E'$ be its topological dual. Subsets of $E'$ of the form

$$C(B; x_1, \ldots, x_n) = \bigl\{ y \in E' : (\langle x_1, y \rangle, \ldots, \langle x_n, y \rangle) \in B \bigr\}, \tag{13}$$

where $x_1, \ldots, x_n \in E$ are linearly independent, $B \subseteq \mathbb{R}^n$ is a Borel set, and $n \in \mathbb{N}$, are called cylinder sets; the algebra of all cylinder sets is denoted by $U_{E'}$. A set function $\mu : U_{E'} \to \mathbb{R}_+ \cup \{\infty\}$ is called a quasimeasure if the restriction of $\mu$ to the $\sigma$-algebra $\{C(B; x_1, \ldots, x_n) : B \subseteq \mathbb{R}^n \text{ a Borel set}\}$ is a measure for all linearly independent $x_1, \ldots, x_n \in E$. If $\mu$ is $\sigma$-additive on $U_{E'}$ then $\mu$ has a unique extension to $\sigma(U_{E'})$. For certain functions on $E'$, so-called cylindrical functions, it is possible to define an integral with respect to a quasimeasure, see [14,35] for details.

In the following we will need a criterion which ensures that a quasimeasure $\mu$ on $U_{E'}$ is $\sigma$-additive. For this reason we introduce some notions concerning locally convex topologies on $E$. Let $P_h(E, \beta)$ be the set of all $\beta$-continuous Hilbert seminorms $p$ on $E$, i.e. seminorms with the property $p(x + y)^2 + p(x - y)^2 = 2p(x)^2 + 2p(y)^2$ for all $x, y \in E$. For any seminorm $p$ on $E$ let $E_p = E / p^{-1}(0)$; we denote by $\psi_p : E \to E_p$ the quotient map and define the norm $\|\psi_p(x)\|_p = p(x)$, $x \in E$. The completion of $E_p$ with respect to $\|\cdot\|_p$ is denoted by $\tilde E_p$, and we let $\psi_p$ also denote the canonical map from $E$ into $\tilde E_p$. If $p \in P_h(E, \beta)$ then $\tilde E_p$ is a Hilbert space, and we define on $E$ the seminorm $q(x) = \|\rho\, \psi_p(x)\|_p$, where $\rho$ is a positive Hilbert–Schmidt operator on $\tilde E_p$; we let $P_n(E, \beta)$ be the set of all seminorms thus obtained. The locally convex topology $\tau_n(E, \beta)$ induced by $P_n(E, \beta)$ is called the nuclear topology on $E$, and $E$ is called nuclear if $\tau_n(E, \beta) = \beta$. We give two examples of nuclear topologies needed below.

1. $\tau_n(E, \sigma(E, E')) = \sigma(E, E')$. In particular, a nondegenerate symplectic form $\sigma$ on a symplectic space $(S, \sigma)$ induces a dual pairing between $S$ and itself, making it possible to define the $\sigma(S, S)$-topology on $S$, and $S = S'$.

2. Let $E$ be a pre-Hilbert space with inner product $\langle \cdot, \cdot \rangle$; then $q \in P_n(E, \|\cdot\|)$ is of the form $q(x) = \sqrt{\langle \rho x, \rho x \rangle}$, where $\rho$ is a positive Hilbert–Schmidt operator on the completion $\tilde E$.
We can now formulate the following criterion [14,35] for $\sigma$-additivity of a quasimeasure $\mu$ on $U_{E'}$. If

$$E \ni x \mapsto (\mathcal{F}\mu)(x) = \int_{E'} e^{i\langle x, y \rangle}\, d\mu(y) \tag{14}$$

is $\tau_n(E, \beta)$-continuous then $\mu$ is $\sigma$-additive. Notice that $E' \ni y \mapsto e^{i\langle x, y \rangle}$ is cylindrical. This result is a generalization of the well-known Minlos–Sazonov theorem for Hilbert spaces.

Consider a group of symplectic linear maps $\{S_t\}_{t \in \mathbb{R}}$. It induces a group of $*$-automorphisms $\{\alpha_t\}_{t \in \mathbb{R}}$ on $A(S)$ such that $\alpha_t(W(f)) = W(S_t(f))$, $f \in S$, $t \in \mathbb{R}$. Let $\omega \in \mathcal{S}(A(S))$, let $(\pi_\omega, H_\omega, \xi_\omega)$ be the corresponding GNS-triplet, and denote the corresponding von Neumann algebra by $M = \pi_\omega(A(S))''$. As an abbreviation write $W_\omega(f) = \pi_\omega(W(f))$, $f \in S$. Observe that the $W_\omega(f)$ also satisfy (10). Assume that $\{\mu_t\}_{t \ge 0} \subseteq M_1^+(\hat S)$ is a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup. The following conditions will be shown to be sufficient for the semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S)$ given by Theorem 7 to extend to a quantum dynamical semigroup on $M$:



E1. The map $f \mapsto (\mathcal{F}\mu_t)(f)$ is $\sigma(S, S)$-continuous.

E2. The map $t \mapsto \mu_t \in M_b(\hat S)$ is vaguely continuous.

E3. The map $f \mapsto W_\omega(f)$ is measurable when $S$ is endowed with $\sigma(U_S)$ and $M$ with the Borel $\sigma$-algebra of the weak operator topology.

E4. The group $\{\alpha_t\}_{t \in \mathbb{R}}$ is $\omega$-covariant, i.e. there exists a group of $*$-automorphisms $\{\beta_t\}_{t \in \mathbb{R}} \subseteq \operatorname{Aut} M$ such that

$$\pi_\omega(\alpha_t(W(f))) = \beta_t(W_\omega(f)), \qquad f \in S,\ t \in \mathbb{R}. \tag{15}$$

E5. The map $t \mapsto \pi_\omega(\alpha_t(W(f))) = W_\omega(S_t(f))$ is continuous in the weak operator topology (i.e. $\{\alpha_t\}_{t \in \mathbb{R}}$ is $\omega$-continuous).

E6. The Hilbert space $H_\omega$ is separable.

Proposition 8. Suppose that $\omega$, $\{\mu_t\}_{t \ge 0}$, and $\{S_t\}_{t \in \mathbb{R}}$ satisfy conditions E1–E6. Then the semigroup $\{\tau_t\}_{t \ge 0}$ of Theorem 7 with the property (12) is $\omega$-covariant. The corresponding semigroup $\{T_t\}_{t \ge 0}$ on $M$ satisfying (9) is a quantum dynamical semigroup. Moreover, $\{T_t\}_{t \ge 0}$ is the unique ultraweakly continuous semigroup on $M$ satisfying

$$T_t(W_\omega(f)) = (\mathcal{F}\mu_t)(f)\, W_\omega(S_t(f)), \qquad f \in S,\ t \ge 0. \tag{16}$$

Proof. Endow $S$ with the $\sigma(S, S)$-topology. Since for each $t \ge 0$ the map $f \mapsto (\mathcal{F}\mu_t)(f)$ is positive-definite and continuous when restricted to finite-dimensional subspaces of $S$, there exists by Proposition I.3.3 of [14] a normalized quasimeasure $\nu_t$ on $U_S$ such that

$$(\mathcal{F}\mu_t)(f) = \int_S e^{i\sigma(f,g)}\, d\nu_t(g), \qquad f \in S,\ t \ge 0. \tag{17}$$

By E1 each $\nu_t$ is $\sigma$-additive and extends to a measure on $\sigma(U_S)$. Thus we can define on $M$ the operators $V_t$ by an integral with respect to $\nu_t$ over a noncylindrical function:

$$V_t(x) = \int_S W_\omega(g)\, x\, W_\omega(g)^*\, d\nu_t(g), \qquad x \in M,\ t \ge 0.$$

This is well defined since the integrand is weakly measurable by E3. The $V_t$ are completely positive, unital, and normal. Using E6, normality can be shown in the same way as the weak$^*$–weak$^*$-continuity of $R_\lambda$ in the proof of Lemma 4. By (10) it is easy to see that $V_t(W_\omega(f)) = (\mathcal{F}\mu_t)(f)\, W_\omega(f)$, $f \in S$, $t \ge 0$. Using E4 we define on $M$ the operators $T_t$ by

$$T_t(x) = (\beta_t \circ V_t)(x), \qquad x \in M,\ t \ge 0.$$
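The action of $V_t$ on Weyl operators can be verified in a few lines. The following is a sketch that uses only the Weyl relations (10), $W_\omega(g)^* = W_\omega(-g)$, the antisymmetry of $\sigma$, and the representation (17).

```latex
\begin{align*}
W_\omega(g)\, W_\omega(f)\, W_\omega(g)^*
  &= e^{-i\sigma(g,f)/2}\, W_\omega(g + f)\, W_\omega(-g) \\
  &= e^{-i\sigma(g,f)/2}\, e^{-i\sigma(g+f,\,-g)/2}\, W_\omega(f) \\
  &= e^{i\sigma(f,g)/2}\, e^{i\sigma(f,g)/2}\, W_\omega(f)
   = e^{i\sigma(f,g)}\, W_\omega(f), \\[4pt]
V_t\bigl(W_\omega(f)\bigr)
  &= \int_S e^{i\sigma(f,g)}\, d\nu_t(g)\; W_\omega(f)
   = (\mathcal{F}\mu_t)(f)\, W_\omega(f)
   && \text{by (17)}.
\end{align*}
% Applying \beta_t and using (15) then gives (16):
% T_t(W_\omega(f)) = (\mathcal{F}\mu_t)(f)\, W_\omega(S_t(f)).
```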

Clearly, the $T_t$ are normal, completely positive, and unital. Using (11) and the properties of the Fourier transform one can show that $\{T_t\}_{t \ge 0}$ satisfies the semigroup property $T_s \circ T_t = T_{s+t}$ for $s, t \ge 0$ and $T_0 = 1$ by checking it on Weyl operators. By construction the $T_t$ satisfy (16), and in particular $T_t(W_\omega(f)) = \pi_\omega(\tau_t(W(f)))$. By a density argument this relation implies (9). It remains to prove the ultraweak continuity of $\{T_t\}_{t \ge 0}$. Define $B = \operatorname{lin}\{W_\omega(f) : f \in S\}$. By (10) it follows that $B$ is an ultraweakly dense $*$-subalgebra of $M$ and $1 \in B$. Since $t \mapsto (\mathcal{F}\mu_t)(f)$ is continuous by E2 and $t \mapsto W_\omega(S_t(f))$ is ultraweakly continuous by E5, it follows that $t \mapsto T_t(W_\omega(f)) = (\mathcal{F}\mu_t)(f)\, W_\omega(S_t(f))$ is ultraweakly continuous for each $f \in S$. This property extends to finite linear combinations of Weyl operators, hence the map $t \mapsto T_t(x)$, $x \in B$, is ultraweakly continuous. Using E6 the conclusion now follows from Corollary 5. □

We make some remarks about the validity of conditions E1–E6.

1. A natural embedding $S \to \hat S$ is given by the identification of $g \in S$ with the character $\chi_g \in \hat S$ defined by $\chi_g(f) = e^{i\sigma(f,g)}$. If $\operatorname{supp} \mu_t \subseteq S$ for all $t \ge 0$, condition E1 becomes redundant because we have by definition of the Fourier transform

$$(\mathcal{F}\mu_t)(f) = \int_S e^{i\sigma(f,g)}\, d\mu_t(g), \qquad f \in S,\ t \ge 0,$$

which replaces (17) in the proof of the proposition.

2. Assume that $S$ is a complex separable Hilbert space with inner product $\langle \cdot, \cdot \rangle$, considered as a symplectic space by defining $\sigma(f, g) = \operatorname{Im} \langle f, g \rangle$, $f, g \in S$. Then it can be shown that $\sigma(U_S)$ is the Borel $\sigma$-algebra of the norm topology of $S$; therefore E3 can be rephrased by requiring norm-to-weak operator measurability.

3. If there exist unitaries $u_t \in L(H_\omega)$ such that $\pi_\omega(\alpha_t(x)) = u_t \pi_\omega(x) u_t^*$ for all $t \ge 0$ and $x \in A(S)$, condition E4 is satisfied. See [9] for a number of sufficient conditions for the existence of such $u_t$. In particular, this is the case if $\omega \circ \alpha_t = \omega$ for all $t \in \mathbb{R}$. Moreover, if there exists a cyclic and separating vector for $M$ in $H_\omega$, it follows from Tomita–Takesaki theory (see [10, Corollary 2.5.32]) that $\{\alpha_t\}_{t \in \mathbb{R}} \subseteq \operatorname{Aut} A$ is unitarily implementable if and only if it is $\omega$-covariant.

The following example shows that the conditions E1–E6 are satisfied in situations of physical interest.

Example 9. We consider the temperature representation of a free Bose gas. Let $S = L^2(\mathbb{R}^d)$ and assume that $H$ is the free Hamiltonian, i.e. the unique self-adjoint extension of $-\nabla^2$ to $S$. Let $\beta > 0$ and $\mu \in \mathbb{R}$ be such that $H \ge (\mu + c)1$ for some $c > 0$. If we put $z = e^{\beta\mu}$ it can be shown that

$$\omega(W(f)) = \exp\Bigl( -\tfrac{1}{4} \bigl\langle f, (1 + z e^{-\beta H})(1 - z e^{-\beta H})^{-1} f \bigr\rangle \Bigr), \qquad f \in S,$$

defines a gauge invariant quasifree state on $A(S)$. This state is the thermal equilibrium state of a free Bose gas in the high temperature-low density (noncondensed) regime. Let $\{S_t\}_{t \in \mathbb{R}}$ be given by $S_t(f) = e^{itH} f$; then clearly $S_t$ is symplectic. For details and proofs of the above facts we refer to [11]. By Proposition 5.2.29 of [11] the following properties hold: the map $f \mapsto W_\omega(f)$ is strongly continuous when $S$ is endowed with the norm topology, hence E3 is satisfied. There exists a strongly continuous unitary group $\{u_t\}_{t \in \mathbb{R}}$ on $H_\omega$, defined by

$$u_t\, \pi_\omega(x)\, \xi_\omega = \pi_\omega(\alpha_t(x))\, \xi_\omega, \qquad x \in A(S), \tag{18}$$

such that $\pi_\omega(\alpha_t(x)) = u_t \pi_\omega(x) u_t^* = \beta_t(\pi_\omega(x))$, $x \in A(S)$; thus E4 and E5 are satisfied. Finally, let $S_0$ be a countable dense set in $S$. Then it is easy to see that the set

$$\Bigl\{ \sum_{j=1}^n (a_j + i b_j)\, W_\omega(f_j)\, \xi_\omega : a_j, b_j \in \mathbb{Q},\ f_j \in S_0,\ n \in \mathbb{N} \Bigr\} \subseteq H_\omega$$

is countable and dense in $H_\omega$, hence E6 is satisfied. Summarizing, we have shown that for a thermal equilibrium state of a noncondensed free Bose gas all conditions depending on the representation, i.e. E3–E6, are satisfied. Therefore it only remains to construct a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup satisfying conditions E1 and E2, see the next section and [7].

We now show that if $S$ can be densely embedded in a larger symplectic space $\bar S$, then, for certain states $\omega$, the condition $\operatorname{supp} \mu_t \subseteq \bar S$, $t \ge 0$, implies the conclusion of Proposition 8. Let $T(S, \sigma)$ be the set of all topologies $\tau$ on the symplectic space $(S, \sigma)$ such that $S \times S \ni (f, g) \mapsto f + g$ and $S \times S \ni (f, g) \mapsto \sigma(f, g)$ are (jointly) $\tau$-continuous and such that $\mathbb{R} \ni t \mapsto tf$ is $\tau$-continuous. For every $\tau \in T(S, \sigma)$ define $F_\tau(S, \sigma) = \{\omega \in \mathcal{S}(A(S, \sigma)) : f \mapsto c_\omega(f) \text{ is } \tau\text{-continuous}\}$; we remark that a characteristic function $c_\omega$ is $\tau$-continuous on $S$ if and only if it is $\tau$-continuous at $0$ [21]. Then it can be shown [21] that $F_\tau(S, \sigma)$ is a folium and that every state in $F_\tau(S, \sigma)$ is regular. Let $(\bar S, \bar\sigma)$ be a symplectic space with a nondegenerate symplectic form $\bar\sigma$, equipped with some topology $\tau \in T(\bar S, \bar\sigma)$ such that the symplectic space $(S, \sigma)$ is $\tau$-dense in $\bar S$ and $\bar\sigma|_S = \sigma$. Note that because $\bar\sigma$ is nondegenerate and $S \subseteq \bar S$ is dense we have the inclusions $S \subseteq \bar S \subseteq \hat{\bar S} \subseteq \hat S$. It has been shown in [21] that $\omega \in F_\tau(S, \sigma)$ extends to a unique state $\bar\omega \in F_\tau(\bar S, \bar\sigma)$ and that the corresponding GNS-representations have the property $\pi_{\bar\omega}|_{A(S)} = \pi_\omega$, $H_{\bar\omega} = H_\omega$, and $\xi_{\bar\omega} = \xi_\omega$.

Proposition 10. Let $\bar S$ and $\omega$ be as above, and let $\{\mu_t\}_{t \ge 0} \subseteq M_1^+(\hat S)$ be a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup such that E1 and E4–E6 are satisfied. Moreover, assume that $\bar S \ni f \mapsto W_{\bar\omega}(f)$ is $\sigma(\hat S, S)$-weak operator measurable. If $\operatorname{supp} \mu_t \subseteq \bar S$ for all $t \ge 0$ then the conclusion of Proposition 8 holds true.

Proof. As in Remark 1, write

$$(\mathcal{F}\mu_t)(f) = \int_{\hat S} \chi(f)\, d\mu_t(\chi) = \int_{\bar S} e^{i\bar\sigma(f,g)}\, d\mu_t(g).$$

Defining

$$V_t(x) = \int_{\bar S} W_{\bar\omega}(g)\, x\, W_{\bar\omega}(g)^*\, d\mu_t(g), \qquad x \in M,\ t \ge 0, \tag{19}$$

we see that $V_t(W_\omega(f)) = (\mathcal{F}\mu_t)(f)\, W_\omega(f)$ for all $f \in S$, $t \ge 0$, and $V_t(M) \subseteq M$, where $M = \pi_\omega(A(S))''$. We now proceed as in the proof of Proposition 8. □

The preceding proposition can be applied to the following situation. Let the symplectic space $S$ in the problem at hand be the Schwartz space $\mathscr{S}(\mathbb{R}^d)$ or the test function space $\mathscr{D}(\mathbb{R}^d)$, and take $\bar S = L^2(\mathbb{R}^d)$ considered as a symplectic space with symplectic form $\bar\sigma(f, g) = \operatorname{Im} \langle f, g \rangle$, the imaginary part of the $L^2$-inner product, and let $\sigma$ be the restriction of $\bar\sigma$ to $S$. Then the norm topology on $L^2(\mathbb{R}^d)$ is in $T(\bar S, \bar\sigma)$.



4.3. Generators

In [7], we constructed Dirac, Gaussian, and Poisson $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroups which, by Theorem 7, induce semigroups on the CCR-algebra. In this section we will give conditions for them to be $\omega$-covariant for a state $\omega$ on the CCR-algebra and calculate the generators of their extensions to the von Neumann algebra associated with the representation $\pi_\omega$ of the CCR-algebra.

Throughout this section let us fix the following assumptions and notation. Let $(S, \sigma)$ be a symplectic space with a nondegenerate symplectic form $\sigma$ and let $\{S_t\}_{t \in \mathbb{R}}$ be a group of symplectic linear maps. To ensure the $\sigma(S, S)$-continuity of functions of the form $S \ni f \mapsto \int_0^t F(S_r(f))\, dr$, where $F : S \to \mathbb{R}$ is a $\sigma(S, S)$-continuous function, we assume that the $\sigma(S, S)$-topology satisfies the first axiom of countability, in order to characterize continuity by sequential convergence and make the dominated convergence theorem applicable. This assumption is satisfied, for example, if $S$ is a complex separable pre-Hilbert space (e.g. Schwartz space $\mathscr{S}(\mathbb{R}^d)$) considered as a symplectic space in the canonical way. Moreover, we shall assume throughout that $\{S_t\}_{t \in \mathbb{R}}$ is weakly measurable, i.e. that $t \mapsto \sigma(S_t(f), g)$ is measurable for all $f, g \in S$.

We start with a general lemma. The set of all infinitely differentiable functions on $]0, \infty[$ with compact support is denoted by $C_c^\infty(]0, \infty[)$.

Lemma 11. Suppose $\{\mu_t\}_{t \ge 0}$ is a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup satisfying E1–E6 for a state $\omega$ and let $\{T_t\}_{t \ge 0}$ be the semigroup given by Proposition 8. Let $\delta$ be the generator of the automorphism group $\{\beta_t\}_{t \in \mathbb{R}} \subseteq \operatorname{Aut} M$ defined in (15) and let $Z$ be the generator of $\{T_t\}_{t \ge 0}$. Define

$$B = \operatorname{lin}\{W_\omega(f) : f \in S\} \tag{20}$$

and

$$C = \Bigl\{ \int_0^\infty \zeta(t)\, T_t(x)\, dt : x \in B,\ \zeta \in C_c^\infty(]0, \infty[) \Bigr\}. \tag{21}$$

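Then $C$ is ultraweakly dense in $M$, $C \subseteq \operatorname{dom} Z$, $C \subseteq \operatorname{dom} \delta$, and $C$ is a core for both $Z$ and $\delta$.

Proof. Let $C^\circ = \{\varphi \in M_* : \langle x, \varphi \rangle = 0\ \forall x \in C\}$ be the polar of $C$. Let $\varphi \in C^\circ$; then $0 = \int_0^\infty \zeta(t) \langle T_t(x), \varphi \rangle\, dt$ for all $x \in B$ and $\zeta \in C_c^\infty(]0, \infty[)$, thus the continuous function $t \mapsto \langle T_t(x), \varphi \rangle$ vanishes on $]0, \infty[$, hence $\langle x, \varphi \rangle = 0$ for all $x \in B$, hence $\varphi = 0$ and we have $C^\circ = \{0\}$. Now the bipolar theorem implies $M = C^{\circ\circ} = \overline{C}$, the ultraweak closure of $C$. Let $x_\zeta = \int_0^\infty \zeta(t)\, T_t(x)\, dt \in C$; then

$$\frac{1}{s}\bigl(T_s(x_\zeta) - x_\zeta\bigr) = \frac{1}{s} \int_0^\infty \zeta(t) \bigl[T_s(T_t(x)) - T_t(x)\bigr]\, dt,$$

and since $T_t(x) \in \operatorname{dom} Z$ if $t > 0$ we obtain by the dominated convergence theorem upon letting $s \downarrow 0$ that

$$Z x_\zeta = \int_0^\infty \zeta(t)\, \frac{d}{dt} T_t(x)\, dt = \int_0^\infty \zeta(t)\, Z(T_t(x))\, dt, \qquad x \in B, \tag{22}$$

thus $C \subseteq \operatorname{dom} Z$. Clearly $T_t(C) \subseteq C$, therefore $C$ is a core for $Z$. Similarly, if $x_\zeta = \int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, dt = \int_0^\infty \zeta(t)\, \Gamma_t(f)\, \beta_t(W_\omega(f))\, dt$, $f \in S$, $\zeta \in C_c^\infty(]0, \infty[)$, then as before

$$\frac{1}{s}\bigl(\beta_s(x_\zeta) - x_\zeta\bigr) = \frac{1}{s} \int_0^\infty \zeta(t)\, \Gamma_t(f) \bigl[\beta_s(\beta_t(W_\omega(f))) - \beta_t(W_\omega(f))\bigr]\, dt.$$

Since $\beta_t(W_\omega(f)) \in \operatorname{dom} \delta$ if $t > 0$ we get by letting $s \downarrow 0$

$$\delta(x_\zeta) = \int_0^\infty \zeta(t)\, \Gamma_t(f)\, \delta(\beta_t(W_\omega(f)))\, dt = \int_0^\infty \zeta(t)\, \delta(T_t(W_\omega(f)))\, dt. \tag{23}$$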

We conclude that $C \subseteq \operatorname{dom} \delta$, and since $\beta_t(W_\omega(f)) = W_\omega(S_t(f))$ we also see that $\beta_t(C) \subseteq C$, thus $C$ is a core for $\delta$. □

4.3.1. The Gaussian case

Let $Q : S \to \mathbb{R}$ be a positive quadratic form (i.e. $Q(f + g) + Q(f - g) = 2(Q(f) + Q(g))$ for all $f, g \in S$). We assume throughout that $t \mapsto Q(S_t(f))$ is measurable and locally integrable. Then $Q_t(f) = \int_0^t Q(S_r(f))\, dr$ is also a quadratic form and there exists a unique measure $\mu_t \in M_1^+(\hat S)$ such that

$$(\mathcal{F}\mu_t)(f) = \Gamma_t(f) = e^{-Q_t(f)}, \qquad f \in S,\ t \ge 0. \tag{24}$$
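To see why (24) is compatible with the perturbed convolution property (11), one can check the Fourier-transformed identity $\Gamma_{t+s}(f) = \Gamma_t(f)\, \Gamma_s(S_t(f))$ directly. The following is a sketch that uses only the group law $S_{r+t} = S_r S_t$ and a change of variables; it is not a substitute for the proof in [7].

```latex
\begin{align*}
Q_{t+s}(f)
  &= \int_0^{t} Q\bigl(S_r(f)\bigr)\, dr + \int_t^{t+s} Q\bigl(S_r(f)\bigr)\, dr \\
  &= Q_t(f) + \int_0^{s} Q\bigl(S_{r}(S_t(f))\bigr)\, dr
   = Q_t(f) + Q_s\bigl(S_t(f)\bigr), \\[4pt]
\Gamma_{t+s}(f)
  &= e^{-Q_t(f)}\, e^{-Q_s(S_t(f))}
   = \Gamma_t(f)\, \Gamma_s\bigl(S_t(f)\bigr),
   \qquad s, t \ge 0 .
\end{align*}
% Differentiating (24) gives, for almost every t,
% \frac{d}{dt}\Gamma_t(f) = -Q(S_t(f))\,\Gamma_t(f).
```

The last remark is the source of the term $-Q(S_t(f))$ that enters the generator in the Gaussian case.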

It was shown in [7] that $\{\mu_t\}_{t \ge 0}$ is a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup. We say that $\{\mu_t\}_{t \ge 0}$ and the corresponding semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S)$ given by Theorem 7 are of Gaussian type. If $f \mapsto Q(f)$ is $\tau_n(S, \sigma(S, S)) = \sigma(S, S)$-continuous then $f \mapsto Q_t(f)$ is $\sigma(S, S)$-continuous as well because $S_t$ is symplectic; thus in this case E1 is satisfied, and E2 is clearly satisfied as well.

If $t \mapsto W_\omega(tf)$ is strongly continuous for all $f \in S$ the state $\omega$ is called regular and we can introduce the field operators $\phi_\omega(f)$, $f \in S$, of the Weyl system $\{W_\omega(f) : f \in S\}$ as the generators of the unitary groups $\{W_\omega(tf)\}_{t \in \mathbb{R}}$. It can be shown [11] that $\xi_\omega \in \operatorname{dom} \phi_\omega(f)$ for all $f \in S$ if and only if $t \mapsto \omega(W(tf))$ is twice differentiable for all $f \in S$. This condition is satisfied for all quasifree states; see [26] for the necessary definitions concerning quasifree states.

Lemma 12. Assume that $\xi_\omega \in \operatorname{dom} \phi_\omega(f)$ for all $f \in S$. Moreover, assume that E1–E6 are satisfied for a semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S)$ of the form (12). Then it follows that $D = \operatorname{lin}\{W_\omega(g)\xi_\omega : g \in S\} \subseteq \operatorname{dom} \phi_\omega(f)$ for all $f \in S$ and

$$\Bigl[\phi_\omega(g_1), \Bigl[\phi_\omega(g_2), \int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, dt\Bigr]\Bigr] \xi = \int_0^\infty \sigma(g_1, S_t(f))\, \sigma(g_2, S_t(f))\, \zeta(t)\, T_t(W_\omega(f))\, \xi\, dt \tag{25}$$

for all $g_1, g_2, f \in S$, $\xi \in D$, and all $\int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, dt \in C$, $\zeta \in C_c^\infty(]0, \infty[)$. The double commutator in (25) is well defined.
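Proof. Since the right-hand side of $W_\omega(tf) W_\omega(g) \xi_\omega = e^{-it\sigma(f,g)}\, W_\omega(g) W_\omega(tf) \xi_\omega$ is strongly differentiable at $t = 0$, with derivative given by $\phi_\omega(f) W_\omega(g) \xi_\omega = -\sigma(f, g)\, W_\omega(g) \xi_\omega + W_\omega(g) \phi_\omega(f) \xi_\omega$, we have $W_\omega(g) \xi_\omega \in \operatorname{dom} \phi_\omega(f)$, and it follows that $D \subseteq \operatorname{dom} \phi_\omega(f)$. Using this result we get by a similar calculation

$$[\phi_\omega(f), W_\omega(g)]\, \xi = -\sigma(f, g)\, W_\omega(g)\, \xi, \qquad f, g \in S,\ \xi \in D. \tag{26}$$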
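We have $T_t(W_\omega(f))\xi \in \operatorname{dom} \phi_\omega(g)$, and by E4 the map $t \mapsto T_t(W_\omega(f))$ is ultraweakly and hence weak-operator continuous; hence the integrals

$$\int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, \xi\, dt, \qquad \int_0^\infty \zeta(t)\, \phi_\omega(g)\, T_t(W_\omega(f))\, \xi\, dt$$

exist in $H_\omega$ as weak$^*$-integrals. If $\eta \in \operatorname{dom} \phi_\omega(g)$ then

$$\Bigl\langle \phi_\omega(g)\eta, \int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, \xi\, dt \Bigr\rangle = \int_0^\infty \zeta(t) \bigl\langle \phi_\omega(g)\eta, T_t(W_\omega(f))\, \xi \bigr\rangle\, dt = \Bigl\langle \eta, \int_0^\infty \zeta(t)\, \phi_\omega(g)\, T_t(W_\omega(f))\, \xi\, dt \Bigr\rangle,$$

and we see that $\int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, \xi\, dt \in \operatorname{dom} \phi_\omega^*(g) = \operatorname{dom} \phi_\omega(g)$. In the same way it can be shown that $\int_0^\infty \zeta(t)\, \phi_\omega(g_1)\, T_t(W_\omega(f))\, \xi\, dt \in \operatorname{dom} \phi_\omega(g_2)$, so that we also have

$$\int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, \xi\, dt \in \operatorname{dom} \phi_\omega(g_2)\phi_\omega(g_1) \qquad \text{for all } f, g_1, g_2 \in S \text{ and } \xi \in D.$$

By a similar procedure we also obtain $\int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, \phi_\omega(g_1)\, \xi\, dt \in \operatorname{dom} \phi_\omega(g_2)$ for all $f, g_1, g_2 \in S$ and $\xi \in D$. Using these results we see upon expanding the double commutator on the left-hand side of (25) that it is well defined, and by applying (26) to the left-hand side of (25) we obtain the desired result. □

In what follows we will write the bounded operator on $H_\omega$ defined by $D \ni \xi \mapsto [\phi_\omega(f), x]\xi = \phi_\omega(f) x \xi - x \phi_\omega(f) \xi$, $x \in C$, formally as a commutator $[\phi_\omega(f), x]$.

Proposition 13. Let $\{\mu_t\}_{t \ge 0}$ be a Gaussian $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup defined by (24) with quadratic form $Q : S \to \mathbb{R}_+$ such that $f \mapsto Q(f)$ is $\tau_n(S, \sigma(S, S))$-continuous.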



Assume that $\omega$ is regular and $\xi_\omega \in \operatorname{dom} \phi_\omega(f)$ for all $f \in S$, and that E3–E6 are satisfied. Then the corresponding semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S)$ given by Theorem 7 is $\omega$-covariant and the generator $Z$ of the quantum dynamical semigroup $\{T_t\}_{t \ge 0}$ on $M$ induced by the covariance condition (9) is given by

$$Z x = \delta(x) - \sum_{k=1}^\infty \lambda_k \bigl[\phi_\omega(f_k), [\phi_\omega(g_k), x]\bigr], \qquad x \in C, \tag{27}$$

where $\{f_k\}_{k \in \mathbb{N}}$, $\{g_k\}_{k \in \mathbb{N}}$ are $\sigma(S, S)$-equicontinuous sequences in $S$ and the sequence $\{\lambda_k\}_{k \in \mathbb{N}} \subseteq \mathbb{R}$ satisfies $\sum_{k=1}^\infty |\lambda_k| < \infty$.

Proof. First we notice that E1–E6 are satisfied and that the semigroup $\{T_t\}_{t \ge 0}$ given by (16) exists and is ultraweakly continuous. By polarization the bilinear form $B : S \times S \to \mathbb{R}$ corresponding to $Q$ via $Q(f) = B(f, f)$, $f \in S$, is jointly $\tau_n(S, \sigma(S, S))$-continuous. The nuclear kernel theorem [33] implies the existence of equicontinuous sequences $\{f_k\}_{k \in \mathbb{N}}, \{g_k\}_{k \in \mathbb{N}} \subseteq S' = S$ (in particular, the sequences are contained in a compact set by the Alaoglu–Bourbaki theorem) and an absolutely summable sequence $\{\lambda_k\}_{k \in \mathbb{N}} \subseteq \mathbb{R}$ such that

$$B(f, g) = \sum_{k=1}^\infty \lambda_k\, \sigma(f, f_k)\, \sigma(g, g_k), \qquad f, g \in S. \tag{28}$$

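Now observe that $t \mapsto \Gamma_t(f)$ is differentiable almost everywhere by the Lebesgue fundamental theorem of calculus. Let $x_\zeta = \int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, dt \in C$, $f \in S$; then from (22) and (23) we have

$$\begin{aligned}
Z x_\zeta &= \int_0^\infty \zeta(t)\, \frac{d}{dt}\bigl[\Gamma_t(f)\, \beta_t(W_\omega(f))\bigr]\, dt \\
&= \int_0^\infty \zeta(t)\, \Gamma_t(f) \bigl[\delta(\beta_t(W_\omega(f))) - Q(S_t(f))\, W_\omega(S_t(f))\bigr]\, dt \\
&= \delta(x_\zeta) - \sum_{k=1}^\infty \lambda_k \int_0^\infty \zeta(t)\, \sigma(S_t(f), f_k)\, \sigma(S_t(f), g_k)\, T_t(W_\omega(f))\, dt \\
&= \delta(x_\zeta) - \sum_{k=1}^\infty \lambda_k \bigl[\phi_\omega(f_k), [\phi_\omega(g_k), x_\zeta]\bigr],
\end{aligned}$$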

where we have used Lemma 12, the expansion of the bilinear form (28), and the dominated convergence theorem for series together with the weak measurability of $\{S_t\}_{t \in \mathbb{R}}$ and local integrability of $t \mapsto Q_t(f)$. We conclude that (27) holds for all $x \in C$. □

The result of the preceding proposition can be sharpened if $S$ carries a complex structure, i.e. there exists a complex pre-Hilbert space $H$ with inner product $\langle \cdot, \cdot \rangle$ such that the real vector space $S$ equals $H$ and $\sigma(f, g) = \operatorname{Im} \langle f, g \rangle$ for all $f, g \in S$. Let $J : S \to S$ be given by $Jf = if$, $f \in S$. The completion of $H$ is denoted by $\tilde H$. The nuclear topology $\tau_n(H, \|\cdot\|)$ is induced by the seminorms $q(x) = \|\rho x\|$, where $\rho$ is a Hilbert–Schmidt operator on $\tilde H$. A linear operator $A$ on $H$ is nuclear if and only if [33] there exist sequences $\{g_k\}_{k \in \mathbb{N}} \subseteq H^* \supseteq H$ and $\{h_k\}_{k \in \mathbb{N}} \subseteq H$ and a sequence $\{\lambda_k\}_{k \in \mathbb{N}} \subseteq \mathbb{R}$ with $\sum_{k=1}^\infty |\lambda_k| < \infty$ such that

$$A f = \sum_{k=1}^\infty \lambda_k \langle g_k, f \rangle\, h_k, \qquad f \in H, \tag{29}$$

where $\{g_k\}_{k \in \mathbb{N}}$ is equicontinuous and $\{h_k\}_{k \in \mathbb{N}}$ lies in a complete convex circled and bounded subset $B$ of $H$, where complete means that the subspace $H_B = \bigcup_{n=1}^\infty nB$ of $H$, normed by the gauge $p_B(x) = \inf\{\lambda > 0 : x \in \lambda B\}$ of $B$, is complete, i.e. a Banach space. Clearly, if $H$ is complete then in view of Alaoglu's theorem it is sufficient that $\|g_k\|, \|h_k\| \le 1$ for all $k \in \mathbb{N}$.

Corollary 14. Let all assumptions be as in Proposition 13 except that $S$ carries a complex structure and that $Q(f) = \langle f, Af \rangle$, where $A$ is a nuclear positive $\mathbb{C}$-linear operator on $H$. Then $\{\tau_t\}_{t \ge 0}$ is $\omega$-covariant and the generator $Z$ of the corresponding quantum dynamical semigroup $\{T_t\}_{t \ge 0}$ is given by

$$Z x = \delta(x) - \sum_{k=1}^\infty \lambda_k \bigl[\phi_\omega(e_k), [\phi_\omega(e_k), x]\bigr], \qquad x \in C, \tag{30}$$

where $\{e_k\}_{k \in \mathbb{N}} \subseteq H$ is a norm bounded sequence and $\{\lambda_k\}_{k \in \mathbb{N}} \subseteq \mathbb{R}_+$ is a sequence such that $\sum_{k=1}^\infty \lambda_k < \infty$.

Proof. Since $A$ is nuclear and positive it is easy to show that $f \mapsto Q(f) = \langle f, Af \rangle$ is $\tau_n(S, \sigma(S, S))$-continuous, thus E1–E6 are satisfied. From (29) it follows that $A$ has an extension $\tilde A$ to $\tilde H$ such that $\tilde A$ is nuclear and positive. Now there exists an orthonormal basis $\{e_k\}_{k \in \mathbb{N}} \subseteq \tilde H$ and a summable sequence $\{\lambda_k\}_{k \in \mathbb{N}}$ with $\lambda_k \ge 0$ such that $\tilde A f = \sum_{k=1}^\infty \lambda_k \langle e_k, f \rangle\, e_k$. Since $A$ is nuclear on $H$ it is compact [33]. Now let $f \in \tilde H$; then there exists a sequence $\{f_n\}_{n \in \mathbb{N}}$ such that $f_n \to f$ in norm, and by compactness (after passing to a subsequence) we can assume that $A f_n \to y \in H$. But then $\tilde A f = y$ and we conclude $\tilde A(\tilde H) \subseteq H$, in particular $\tilde A e_k = \lambda_k e_k \in H$, i.e. $e_k \in H$, $k \in \mathbb{N}$. For the quadratic form we obtain

$$\begin{aligned}
Q(f) &= \sum_{k=1}^\infty \lambda_k\, |\langle e_k, f \rangle|^2 = \sum_{k=1}^\infty \lambda_k\, \bigl|\sigma(e_k, Jf) + i\sigma(e_k, f)\bigr|^2 \\
&= \sum_{k=1}^\infty \lambda_k\, \sigma(J e_k, f)^2 + \sum_{k=1}^\infty \lambda_k\, \sigma(e_k, f)^2 = \sum_{k=1}^\infty \lambda'_k\, \sigma(e'_k, f)^2, \qquad f \in S,
\end{aligned} \tag{31}$$

with $\lambda'_{2k} = \lambda'_{2k-1} = \lambda_k$, and $e'_{2k} = e_k$, $e'_{2k-1} = J e_k$, $k \in \mathbb{N}$. Thus the quadratic form has been diagonalized in the symplectic basis $\{e'_k\}_{k \in \mathbb{N}}$. The rest of the proof proceeds as in the proof of Proposition 13. □

It is worth noting that in [36] generators of Gaussian semigroups on representations of $A(S)$ were studied, but in a different setting and under somewhat more restrictive assumptions. Note also that the proofs in that paper contain some gaps [37], one of which is filled by our Corollary 5.



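4.3.2. The Poisson case

Let $\nu \in M_b^+(\hat S)$ be a positive bounded measure and suppose throughout that $t \mapsto (\mathcal{F}\nu)(S_t(f))$ is measurable and locally integrable for all $f \in S$. Then $t \mapsto (\hat S_t)_* \nu$ is $\sigma(M_b(\hat S), C_0(\hat S))$-measurable and the measures $\nu_t = \int_0^t (\hat S_r)_* \nu\, dr$ are well defined. It was shown in [7] that $\{\mu_t\}_{t \ge 0}$, where $\mu_t = e^{-t\|\nu\|} \exp(\nu_t)$, is a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup. In this case we say that $\{\mu_t\}_{t \ge 0}$ and the corresponding semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S)$ are of Poisson type. From the above definitions we have

$$(\mathcal{F}\nu_t)(f) = \int_0^t \int_{\hat S} \chi(f)\, d\bigl[(\hat S_r)_* \nu\bigr](\chi)\, dr = \int_0^t (\mathcal{F}\nu)(S_r(f))\, dr, \tag{32}$$

therefore

$$(\mathcal{F}\mu_t)(f) = \Gamma_t(f) = e^{-t\|\nu\|}\, \mathcal{F}\Bigl(\sum_{k=0}^\infty \frac{\nu_t^{*k}}{k!}\Bigr)(f) = e^{-t\|\nu\|} \exp\bigl((\mathcal{F}\nu_t)(f)\bigr), \qquad f \in S,\ t \ge 0. \tag{33}$$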

We see that if $f \mapsto (\mathcal{F}\nu)(f)$ is $\sigma(S, S)$-continuous then condition E1 is satisfied because $S_t$ is symplectic; moreover, E2 is satisfied as well. As in the proof of Proposition 8 there exists a bounded measure $\nu_0$ on $\sigma(U_S)$ such that $(\mathcal{F}\nu)(f) = \int_S e^{i\sigma(f,g)}\, d\nu_0(g)$, $f \in S$.

Proposition 15. As above let $\{\mu_t\}_{t \ge 0}$ be a Poisson $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup and assume that $f \mapsto (\mathcal{F}\nu)(f)$ is $\tau_n(S, \sigma(S, S))$-continuous. Moreover, assume that E3–E6 are satisfied for some state $\omega$ on $A(S)$. Then the semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S)$ given by Theorem 7 is $\omega$-covariant and the generator $Z$ of the corresponding quantum dynamical semigroup $\{T_t\}_{t \ge 0}$ on $M$ induced by the covariance condition (9) is given by

$$Z x = \delta(x) + L_P(x), \qquad x \in C, \tag{34}$$

where

$$L_P(x) = \int_S \bigl[W_\omega(g)\, x\, W_\omega(g)^* - x\bigr]\, d\nu_0(g), \qquad x \in M, \tag{35}$$

and $L_P$ is a bounded normal operator.

Proof. By the preceding remarks E1–E6 are satisfied. For $f \in S$ and a function $\zeta \in C_c^\infty(]0, \infty[)$ let $x_\zeta = \int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, dt \in C$. From (22) and (23) we have, as in the proof of Proposition 13,

$$\begin{aligned}
Z x_\zeta &= \int_0^\infty \zeta(t)\, \frac{d}{dt}\bigl[\Gamma_t(f)\, \beta_t(W_\omega(f))\bigr]\, dt \\
&= \delta(x_\zeta) + \int_0^\infty \zeta(t)\, \Gamma_t(f) \bigl[(\mathcal{F}\nu)(S_t(f)) - \|\nu\|\bigr]\, W_\omega(S_t(f))\, dt \\
&= \delta(x_\zeta) + \int_0^\infty \int_S \zeta(t)\, \Gamma_t(f) \bigl[e^{i\sigma(S_t(f), g)} - 1\bigr]\, W_\omega(S_t(f))\, d\nu_0(g)\, dt \\
&= \delta(x_\zeta) + \int_S \int_0^\infty \zeta(t)\, \Gamma_t(f) \bigl[W_\omega(g)\, W_\omega(S_t(f))\, W_\omega(g)^* - W_\omega(S_t(f))\bigr]\, dt\, d\nu_0(g) \\
&= \delta(x_\zeta) + \int_S \bigl[W_\omega(g)\, x_\zeta\, W_\omega(g)^* - x_\zeta\bigr]\, d\nu_0(g),
\end{aligned}$$

where we have used (32) and (33), the Weyl relations, and Fubini's theorem. We conclude that (34) holds for all $x \in C$. □

4.3.3. The Dirac case

Let $\psi$ be a linear functional on $S$ such that $t \mapsto \psi(S_t(f))$ is measurable and locally integrable for all $f \in S$, and define $\psi_t(f) = \int_0^t \psi(S_r(f))\, dr$ and the character $\chi_t \in \hat S$ by $\chi_t(f) = e^{i\psi_t(f)}$. Then $\{\mu_t\}_{t \ge 0}$, where $\mu_t = \delta_{\chi_t}$, is a $\{\hat S_t\}_{t \in \mathbb{R}}$-perturbed convolution semigroup [7]. We say that $\{\mu_t\}_{t \ge 0}$ and the corresponding semigroup $\{\tau_t\}_{t \ge 0}$ on $A(S)$ are of Dirac type. Here $\delta_\chi$ denotes the Dirac measure in $M_1^+(\hat S)$ concentrated in $\chi$. The map $f \mapsto (\mathcal{F}\mu_t)(f) = \chi_t(f)$ is $\tau_n(S, \sigma(S, S)) = \sigma(S, S)$-continuous if and only if $\psi \in S'$, since $S_t$ is symplectic, $t \ge 0$; in this case E1 and E2 are satisfied.

Proposition 16. Let $\psi$ be a linear functional of the form $\psi(f) = \sigma(f, g)$, $g \in S$, and let $\{\mu_t\}_{t \ge 0}$ be the associated perturbed convolution semigroup of Dirac type. Suppose that $\omega$ is a regular state on $A(S)$ such that $\xi_\omega \in \operatorname{dom} \phi_\omega(f)$ for all $f \in S$, and such that E3–E6 are satisfied. Then the corresponding semigroup $\{\tau_t\}_{t \ge 0}$ given by Theorem 7 is $\omega$-covariant and the generator $Z$ of the quantum dynamical semigroup $\{T_t\}_{t \ge 0}$ on $M$ induced by the covariance condition (9) is given by

$$Z x = \delta(x) + i\,[\phi_\omega(g), x], \qquad x \in C. \tag{36}$$

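Proof. We proceed as in the proof of Proposition 13. For $f \in S$ and $\zeta \in C_c^\infty(]0, \infty[)$ let $x_\zeta = \int_0^\infty \zeta(t)\, T_t(W_\omega(f))\, dt \in C$; then

$$\begin{aligned}
Z x_\zeta &= \int_0^\infty \zeta(t)\, \frac{d}{dt}\bigl[\Gamma_t(f)\, W_\omega(S_t(f))\bigr]\, dt \\
&= \delta(x_\zeta) + i \int_0^\infty \zeta(t)\, \sigma(S_t(f), g)\, T_t(W_\omega(f))\, dt \\
&= \delta(x_\zeta) + i \int_0^\infty \zeta(t)\, \bigl[\phi_\omega(g), T_t(W_\omega(f))\bigr]\, dt \\
&= \delta(x_\zeta) + i\,[\phi_\omega(g), x_\zeta],
\end{aligned}$$

where we have used (26). We conclude that (36) holds for all $x \in C$. □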

5. Concluding remarks By our strategy of extending semigroups on A(S) which are induced by a {Sˆt }t∈R -perturbed convolution semigroup of measures via Theorem 7 we obtained quantum dynamical semigroups on representations of the CCR-algebra which have the familiar generators (27), (34), or (36) of Gauss, Poisson, and Dirac type. We emphasize that instead of constructing semigroups directly from these expressions the method we employed in the present paper bypasses some serious mathematical difficulties. For example, starting from (27), (34), and (36), we would have to check that the operators defined by these equations are ultraweakly closed and that their resolvents satisfy (1) and (2) in order to apply Theorem 1, which seems to be a formidable task, at least in the Gaussian case. Acknowledgment M.H. acknowledges financial support by Evangelisches Studienwerk e.V. Villigst. References [1] J.F. Aarnes, On the continuity of automorphic representations of groups, Comm. Math. Phys. 7 (1968) 332–336. [2] R. Alicki, K. Lendi, Quantum Dynamical Semigroups and Applications, Springer, Berlin, 1987. [3] S. Attal, A. Joye, C.-A. Pillet (Eds.), Open Quantum Systems, vols. I–III, Lecture Notes in Mathematics, vols. 1880– 1882, Springer, Berlin, 2006. [4] C. Bahn, C.K. Ko, Y.M. Park, Dirichlet forms and symmetric Markovian semigroups on CCR algebras with respect to quasi-free states, J. Math. Phys. 44 (2003) 723–753. [5] Ph. Blanchard (Ed.), Decoherence: Theoretical, Experimental and Conceptual Problems, Springer, Berlin, 2000. [6] Ph. Blanchard, R. Olkiewicz, Decoherence induced transition from quantum to classical dynamics, Rev. Math. Phys. 15 (2003) 217–243. [7] Ph. Blanchard, M. Hellmich, P. Ługiewicz, R. Olkiewicz, Quantum dynamical semigroups for finite and infinite Bose systems, J. Math. Phys. 48 (2007) 012106. [8] V.I. Bogachev, M. Röckner, B. Schmuland, Generalized Mehler semigroups and applications, Probab. Theory Relat. Fields 105 (1996) 193–225. [9] H.J. 
Borchers, On the implementability of automorphism groups, Comm. Math. Phys. 14 (1969) 305–314. [10] O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics I, second ed., Springer, New York, 1987. [11] O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics II, Springer, New York, 1981. [12] E. Christensen, Generators of semigroups of completely positive maps, Comm. Math. Phys. 62 (1978) 167–171. [13] F. Cipriani, Dirichlet forms and Markovian semigroups on standard forms of von Neumann algebras, J. Funct. Anal. 147 (1997) 259–300. [14] Yu.L. Dalecky, S.V. Fomin, Measures and Differential Equations in Infinite-Dimensional Spaces, Math. Appl., vol. 76, Kluwer, Dordrecht, 1991. [15] E.B. Davies, Irreversible dynamics of infinite Fermion systems, Comm. Math. Phys. 55 (1977) 231–258. [16] B. Demoen, P. Vanheuverzwijn, A. Verbeure, Completely positive maps on the CCR-algebra, Lett. Math. Phys. 2 (1977) 161–166. [17] G.G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory, Wiley, New York, 1972.



[18] M. Fuhrman, M. Röckner, Generalized Mehler semigroups: The non-Gaussian case, Potential Anal. 12 (2000) 1–47. [19] R. Haag, Local Quantum Physics, second ed., Springer, Berlin, 1996. [20] R. Haag, R.V. Kadison, D. Kastler, Nets of C∗ -algebras and classification of states, Comm. Math. Phys. 16 (1970) 81–104. [21] R. Honegger, On the continuous extensions of states on the CCR algebra, Lett. Math. Phys. 42 (1997) 11–25. [22] E. Joos, et al., Decoherence and the Appearance of a Classical World in Quantum Theory, second ed., Springer, Berlin, 2003. [23] R. Kallman, A remark on a paper of J.F. Aarnes, Comm. Math. Phys. 14 (1969) 13–14. [24] P. Ługiewicz, R. Olkiewicz, Classical properties of infinite quantum open systems, Comm. Math. Phys. 239 (2003) 241–259. [25] J. Manuceau, C∗ -algèbre de relations de commutation, Ann. Inst. H. Poincaré 8 (1968) 139–161. [26] J. Manuceau, A. Verbeure, Quasi-free states of the C.C.R.-algebra and Bogoliubov transformations, Comm. Math. Phys. 9 (1968) 293–302. [27] J. Manuceau, M. Sirugue, D. Testard, A. Verbeure, The smallest C∗ -algebra for the canonical commutation relations, Comm. Math. Phys. 32 (1973) 231–243. [28] J. Moffat, Continuity of automorphic representations, Proc. Cambridge Philos. Soc. 74 (1973) 461–465. [29] R. Olkiewicz, Environment-induced superselection rules in Markovian regime, Comm. Math. Phys. 208 (1999) 245–265. [30] Y.M. Park, Construction of Dirichlet forms on standard forms of von Neumann algebras, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 3 (2000) 1–14. [31] K. Saitô, A note on the continuity of automorphic representations of groups, Tôhoku Math. J. 28 (1976) 305–310. [32] S. Sakai, C∗ -Algebras and W∗ -Algebras, Springer, New York, 1971. [33] H.H. Schaefer, Topological Vector Spaces, second ed., Springer, New York, 1999. [34] B. Schmuland, Wei Sun, On the equation μt+s = μs ∗ Ts μt , Statist. Probab. Lett. 52 (2001) 183–188. [35] O.G. Smolyanov, S.V. 
Fomin, Measures on linear topological spaces, Uspekhi Mat. Nauk 31 (4) (1976) 1–53; translated in: Russian Math. Surveys 31 (4) (1976) 3–56. [36] P. Vanheuverzwijn, Generators for quasi-free completely positive semigroups, Ann. Inst. H. Poincaré 29 (1978) 123–138. [37] P. Vanheuverzwijn, Erratum to [36], Ann. Inst. H. Poincaré 30 (1979) 83. [38] W.H. Zurek, Decoherence, einselection, and the quantum origins of the classical, Rev. Modern. Phys. 75 (2003) 715–775.


Maximal vectors in Hilbert space and quantum entanglement William Arveson University of California, Berkeley, Department of Mathematics, 970 Evans Hall #3840, CA, USA Received 17 May 2008; accepted 12 August 2008 Available online 3 September 2008 Communicated by L. Gross

Abstract

Let $V$ be a norm-closed subset of the unit sphere of a Hilbert space $H$ that is stable under multiplication by scalars of absolute value 1. A maximal vector (for $V$) is a unit vector $\xi \in H$ whose distance to $V$ is maximum,
\[
d(\xi, V) = \sup_{\|\eta\| = 1} d(\eta, V),
\]
$d(\xi, V)$ denoting the distance from $\xi$ to the set $V$. Maximal vectors generalize the maximally entangled unit vectors of quantum theory. In general, under a mild regularity hypothesis on $V$, there is a norm on $H$ whose restriction to the unit sphere achieves its minimum precisely on $V$ and its maximum precisely on the set of maximal vectors. This “entanglement-measuring norm” is unique. There is a corresponding “entanglement-measuring norm” on the predual of $B(H)$ that faithfully detects entanglement of normal states. We apply these abstract results to the analysis of entanglement in multipartite tensor products $H = H_1 \otimes \cdots \otimes H_N$, and we calculate both entanglement-measuring norms. In cases for which $\dim H_N$ is relatively large with respect to the others, we describe the set of maximal vectors in explicit terms and show that it does not depend on the number of factors of the Hilbert space $H_1 \otimes \cdots \otimes H_{N-1}$.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Quantum entanglement; Maximally entangled vectors

E-mail address: [email protected]. 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.08.004

W. Arveson / Journal of Functional Analysis 256 (2009) 1476–1510


1. Introduction

Let $H = H_1 \otimes \cdots \otimes H_N$ be a finite tensor product of separable Hilbert spaces. In the literature of physics and quantum information theory, a normal state $\rho$ of $B(H_1 \otimes \cdots \otimes H_N)$ is called separable or classically correlated if it belongs to the norm-closed convex set generated by product states $\sigma_1 \otimes \cdots \otimes \sigma_N$, where $\sigma_k$ denotes a normal state of $B(H_k)$. Normal states that are not separable are said to be entangled. The notion of entanglement is a distinctly noncommutative phenomenon, and has been a fundamental theme of quantum physics since the early days of the subject. It has received increased attention recently because of possible applications emerging from quantum information theory.

In the so-called bipartite case in which $N = 2$, several numerical measures of entanglement have been proposed that emphasize various features (see [10,11,13,17,18]). Despite the variety of proposed measures, only one we have seen (the projective cross norm introduced in [15,16]) is capable of distinguishing between entangled mixed states and separable mixed states of bipartite tensor products. Of course, the bipartite case has special features because vectors in $H_1 \otimes H_2$ can be identified with Hilbert–Schmidt operators from $H_1$ to $H_2$, thereby allowing one to access operator-theoretic invariants (most notably the singular value list of a Hilbert–Schmidt operator) to analyze vectors in $H_1 \otimes H_2$. Most notably, such considerations lead to the so-called Schmidt decomposition of vectors in $H_1 \otimes H_2$. On the other hand, such operator-theoretic tools are much less effective for higher order tensor products, and perhaps that explains why the higher order cases $N \ge 3$ are poorly understood. For example, there does not appear to be general agreement as to what properties a “maximally entangled” vector should have in such cases; and in particular, there is no precise definition of the term.

In this paper we propose such a definition and introduce two numerical invariants (one for vectors and one for states) that faithfully detect entanglement, in a general mathematical setting that includes the cases of physical interest. We start with a separable Hilbert space $H$ and a distinguished set $V \subseteq \{\xi \in H : \|\xi\| = 1\}$ of unit vectors that satisfies the following two conditions:

V1. $\lambda \cdot V \subseteq V$, for every $\lambda \in \mathbb{C}$ with $|\lambda| = 1$.
V2. For every $\xi \in H$, $\langle \xi, V \rangle = \{0\} \Rightarrow \xi = 0$.

By replacing $V$ with its closure if necessary, we can and do assume that $V$ is closed in the norm topology of $H$. A normal state $\rho$ of $B(H)$ is said to be $V$-correlated if for every $\epsilon > 0$, there are vectors $\xi_1, \dots, \xi_n \in V$ and positive numbers $t_1, \dots, t_n$ with sum 1 such that
\[
\sup_{\|x\| \le 1} \Bigl| \rho(x) - \sum_{k=1}^{n} t_k \langle x \xi_k, \xi_k \rangle \Bigr| \le \epsilon.
\]
A normal state that is not $V$-correlated is called $V$-entangled, or simply entangled. The motivating examples are those in which $H = H_1 \otimes \cdots \otimes H_N$ is an $N$-fold tensor product of Hilbert spaces $H_k$ and
\[
V = \{\xi_1 \otimes \cdots \otimes \xi_N : \xi_k \in H_k,\ \|\xi_1\| = \cdots = \|\xi_N\| = 1\}
\]
is the set of decomposable unit vectors. In such cases the $V$-correlated states are the separable states, and when $H$ is finite-dimensional, the $V$-correlated states are simply the convex combinations of vector states $x \mapsto \langle x\xi, \xi \rangle$ with $\xi$ a unit vector of the form $\xi = \xi_1 \otimes \cdots \otimes \xi_N$, $\xi_k \in H_k$, $k = 1, \dots, N$. Of course, there are many other examples that have less to do with physics.

In general, given such a set $V \subseteq H$, a maximal vector is defined as a unit vector $\xi \in H$ whose distance to $V$ is maximum,
\[
d(\xi, V) = \sup_{\|\eta\| = 1} d(\eta, V),
\]
$d(\eta, V)$ denoting the distance from $\eta$ to $V$. While it is not obvious from this geometric definition, it is a fact that in the case of bipartite tensor products $H = H_1 \otimes H_2$, maximal vectors turn out to be exactly the “maximally entangled” unit vectors of the physics literature (see (1.2) below).

Sections 2–4 are devoted to an analysis of the geometric properties of maximal vectors in general. We introduce a numerical invariant $r(V)$ of $V$ (the inner radius) and show that when $r(V) > 0$, there is a uniquely determined “entanglement measuring norm” $\|\cdot\|^V$ on $H$ with the property that $\xi \in V$ iff $\|\xi\|^V = 1$ and $\xi$ is maximal iff $\|\xi\|^V = r(V)^{-1}$ (see Proposition 2.1 and Theorem 4.2). In Section 5 we introduce an extended real-valued function $E(\rho)$ of normal states $\rho$ that takes values in the interval $[1, +\infty]$. This “entanglement” function $E$ is convex, lower semicontinuous, and faithfully detects generalized entanglement in the sense that $\rho$ is entangled iff $E(\rho) > 1$ (Theorems 5.3 and 6.2). We also show that under the same regularity hypothesis on the given set $V$ of unit vectors (namely $r(V) > 0$), $E$ is a norm equivalent to the ambient norm of $B(H)_* \cong L^1(H)$, and it achieves its maximum on vector states of the form $\omega(A) = \langle A\xi, \xi \rangle$, $A \in B(H)$, precisely when $\xi$ is a maximal vector (Theorem 9.1).

In the third part of the paper (Sections 8–13), we apply these abstract results to cases in which $V$ is the set of decomposable unit vectors in an $N$-fold tensor product $H = H_1 \otimes \cdots \otimes H_N$. We assume that at most one of the $H_k$ is infinite-dimensional, and that they are arranged so that the dimensions $n_k = \dim H_k$ weakly increase with $k$ and $n_{N-1} < \infty$. We identify the vector norm $\|\cdot\|^V$ that measures entanglement as the greatest cross norm on the projective tensor product of Hilbert spaces
\[
H_1 \hat{\otimes} H_2 \hat{\otimes} \cdots \hat{\otimes} H_N
\]
in general; see Theorem 8.2.
Similarly, we identify the entanglement function of mixed states as the restriction to density operators of the greatest cross norm of the projective tensor product of Banach spaces
\[
L^1(H_1) \hat{\otimes} L^1(H_2) \hat{\otimes} \cdots \hat{\otimes} L^1(H_N),
\]
$L^1(H)$ denoting the Banach space of trace class operators on a Hilbert space $H$ (Theorem 9.1). Note that in the bipartite case $N = 2$, the latter reduces to the norm introduced in a more ad hoc way by Rudolph in [15,16].

We are unable to identify the maximal vectors in this generality, and our sharpest results for multipartite tensor products require an additional hypothesis, namely that one of the spaces $H_k$ should be significantly larger than the others in the sense that $n_N \ge n_1 \cdots n_{N-1}$. In every case, of course, the entanglement measuring norm $\|\cdot\|^V$ depends strongly on the relative dimensions $n_1, \dots, n_N$ of the factors of the decomposition $H_1 \otimes \cdots \otimes H_N$, because the “shape” of its unit ball $\{\xi : \|\xi\|^V \le 1\}$ depends strongly on these relative dimensions. What is interesting is that when $n_N \ge n_1 \cdots n_{N-1}$, the set of maximal vectors does not depend on that finer structure. Indeed, we show that in such cases the maximal vectors are precisely the vectors in $H_1 \otimes \cdots \otimes H_N$ that can be represented
\[
\xi = \frac{1}{\sqrt{n_1 n_2 \cdots n_{N-1}}} \sum_{k=1}^{n_1 \cdots n_{N-1}} e_k \otimes f_k \tag{1.1}
\]
where $e_1, \dots, e_{n_1 \cdots n_{N-1}}$ is an orthonormal basis for $H_1 \otimes \cdots \otimes H_{N-1}$ and $f_1, \dots, f_{n_1 \cdots n_{N-1}}$ is an orthonormal set in $H_N$ (see Theorem 12.1). The simplest case is $N = 2$, where our hypotheses reduce to $n_1 \le n_2 \le \infty$ with $n_1$ finite, and the expression (1.1) becomes a familiar representation of “maximally entangled” vectors of bipartite tensor products that is commonly found in the physics literature.

To make the point in somewhat more physical terms, let $H$ and $K$ be finite-dimensional Hilbert spaces with $n = \dim H \le m = \dim K$. The maximal vectors of the bipartite tensor product $H \otimes K$ are those of the form
\[
\xi = \frac{1}{\sqrt{n}} (e_1 \otimes f_1 + \cdots + e_n \otimes f_n) \tag{1.2}
\]
where $(e_k)$ is an orthonormal basis for $H$ and $(f_k)$ is an orthonormal set in $K$. On the other hand, if the Hilbert space $H$ represents a composite of several subsystems in the sense that it can be further decomposed into a tensor product $H = H_1 \otimes \cdots \otimes H_r$ of Hilbert spaces, then the set of maximal vectors relative to the more refined decomposition $H_1 \otimes \cdots \otimes H_r \otimes K$ is precisely the same set of unit vectors (1.2).

This unexpected stability of the set of maximal vectors is established by showing that the states associated with maximal vectors $\xi$ are characterized by the following requirement on their “marginal distributions.” The algebra $\mathcal{A} = B(H_1 \otimes \cdots \otimes H_{N-1})$ can be viewed as a matrix algebra with tracial state $\tau$, and we show that a unit vector $\xi$ in $H_1 \otimes \cdots \otimes H_N$ is maximal if and only if
\[
\langle (A \otimes 1_{H_N}) \xi, \xi \rangle = \tau(A), \qquad A \in \mathcal{A},
\]
see Theorem 11.1. We do not know if there is a useful characterization of the marginal states of maximal vectors in the remaining cases for which $n_N < n_1 \cdots n_{N-1}$, and that is an issue calling for further research. Of course, it was also necessary to calculate the geometric invariant $r(V)$ for these examples, see Theorems 10.1 and 10.2. A more precise and more complete summary of our main results for multipartite tensor products is presented at the end of the paper in Theorem 14.1 (also see Remark 13.3).

We conclude this introduction by summarizing a few items of the physics literature and their relation to the basic issues discussed below. The authors of [17] propose three conditions that any measure of entanglement should satisfy, and a variety of entanglement measures are discussed that meet these three criteria. Their measures arise from various considerations of quantum information theory, and they differ from the one proposed below, which, as we have seen above, emerges from a general analysis of states that can be associated with “arbitrary” sets of vectors in Hilbert space. The idea of measuring entanglement of vectors in terms of their distance to the decomposable vectors also appears in [19], and calculations are carried out for several examples. While a related measure was also introduced for states, it is different from the one below,

and there appears to be no further overlap with this paper. Also see [8, formula (22)]. A related operator-theoretic notion of entanglement for bipartite tensor products was introduced in [4], where it is shown essentially that a density operator that is maximally far from the separable ones relative to the Hilbert–Schmidt norm provides a maximal violation of the Bell inequalities. Perhaps it is also relevant to point out that the recent paper [14] establishes unbounded violations of the Bell inequalities for tripartite tensor products using quite different methods.

This is the third of a series of papers that relate to entangled states on matrix algebras [2,3]. However, while the results below certainly apply to matrix algebras, many of them also apply to the context of infinite-dimensional Hilbert spaces.

Part 1. Vectors in Hilbert spaces

2. Detecting membership in convex sets

Let $H$ be a Hilbert space and let $V \subseteq \{\xi \in H : \|\xi\| = 1\}$ be a norm-closed subset of the unit sphere of $H$ that satisfies V1 and V2. Recall that since the weak closure and the norm closure of a convex subset of $H$ are the same, it is unambiguous to speak of the closed convex hull of $V$. In this section we show that there is a unique function $u : H \to [0, +\infty]$ with certain critical properties that determines membership in the closed convex hull of $V$, and more significantly for our purposes, such a function determines membership in $V$ itself. While the proof of Proposition 2.1 below involves some familiar ideas from convexity theory, it is not part of the lore of topological vector spaces, hence we include details.

We begin with a preliminary function $\|\cdot\|_V$ defined on $H$ by
\[
\|\xi\|_V = \sup_{\eta \in V} \Re\langle \xi, \eta \rangle, \qquad \xi \in H. \tag{2.1}
\]
Axiom V2 implies that $\|\cdot\|_V$ is a norm, and since $V$ consists of unit vectors we have $\|\xi\|_V \le \|\xi\|$. The associated unit ball $\{\xi \in H : \|\xi\|_V \le 1\}$ is a closed convex subset of $H$ that contains the unit ball $\{\xi \in H : \|\xi\| \le 1\}$ of $H$ because $\|\xi\|_V \le \|\xi\|$, $\xi \in H$. Now consider the function $\|\cdot\|^V : H \to [0, +\infty]$ defined by
\[
\|\xi\|^V = \sup_{\|\eta\|_V \le 1} \Re\langle \xi, \eta \rangle = \sup_{\|\eta\|_V \le 1} |\langle \xi, \eta \rangle|, \qquad \xi \in H. \tag{2.2}
\]
Since $\|\eta\|_V \le \|\eta\|$, the right-hand side of (2.2) is at least $\|\xi\|$, hence
\[
\|\xi\|_V \le \|\xi\| \le \|\xi\|^V, \qquad \xi \in H. \tag{2.3}
\]
Significantly, it is possible for $\|\xi\|^V$ to achieve the value $+\infty$ when $H$ is infinite-dimensional; an example is given in Proposition 8.3 below.

An extended real-valued function $u : H \to [0, +\infty]$ is said to be weakly lower semicontinuous if for every $r \in [0, +\infty)$, the set $\{\xi \in H : u(\xi) \le r\}$ is closed in the weak topology of $H$.

Proposition 2.1. The extended real-valued function $\|\cdot\|^V : H \to [0, +\infty]$ has the following properties:

(i) $\|\xi + \eta\|^V \le \|\xi\|^V + \|\eta\|^V$, $\xi, \eta \in H$.
(ii) $\|\lambda \cdot \xi\|^V = |\lambda| \cdot \|\xi\|^V$, $0 \ne \lambda \in \mathbb{C}$, $\xi \in H$.
(iii) It is weakly lower semicontinuous.
(iv) The closed convex hull of $V$ is $\{\xi \in H : \|\xi\|^V \le 1\}$.

This function is uniquely determined: if $u : H \to [0, +\infty]$ is any function that satisfies (ii) and (iv), then $u(\xi) = \|\xi\|^V$, $\xi \in H$.

The proof rests on the following result.

Lemma 2.2. Let $K$ be the closed convex hull of $V$. Then
\[
K = \{\xi \in H : \|\xi\|^V \le 1\}, \tag{2.4}
\]
and in particular,
\[
\|\xi\|_V = \sup_{\|\eta\|^V \le 1} \Re\langle \xi, \eta \rangle, \qquad \xi \in H. \tag{2.5}
\]

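Proof of Lemma 2.2. For the inclusion $\subseteq$ of (2.4), note that if $\xi \in V$ and $\eta$ is any vector in $H$, then $|\langle \xi, \eta \rangle| \le \|\eta\|_V$, so that
\[
\|\xi\|^V = \sup_{\|\eta\|_V \le 1} |\langle \xi, \eta \rangle| \le \sup_{\|\eta\|_V \le 1} \|\eta\|_V \le 1.
\]
Since the set $\{\xi \in H : \|\xi\|^V \le 1\}$ is closed and convex, it therefore contains $K$.

For the other inclusion, a standard separation theorem implies that it is enough to show that for every continuous linear functional $f$ on $H$ and every $\alpha \in \mathbb{R}$,
\[
\sup_{\xi \in V} \Re f(\xi) \le \alpha \quad \Longrightarrow \quad \sup_{\|\eta\|^V \le 1} \Re f(\eta) \le \alpha.
\]
Fix such a pair $f$, $\alpha$ with $f \ne 0$. By the Riesz lemma, there is a vector $\zeta \in H$ such that $f(\xi) = \langle \xi, \zeta \rangle$, $\xi \in H$, and the first inequality above implies
\[
0 < \|\zeta\|_V = \sup_{\xi \in V} \Re\langle \xi, \zeta \rangle = \sup_{\xi \in V} \Re f(\xi) \le \alpha.
\]
Hence $\|\alpha^{-1} \zeta\|_V \le 1$. By definition of $\|\cdot\|^V$ we have $|\langle \eta, \alpha^{-1} \zeta \rangle| \le \|\eta\|^V$, therefore $|\langle \eta, \zeta \rangle| \le \alpha \|\eta\|^V$, and finally
\[
\sup_{\|\eta\|^V \le 1} \Re f(\eta) \le \sup_{\|\eta\|^V \le 1} |\langle \eta, \zeta \rangle| \le \alpha,
\]
which is the inequality on the right of the above implication.

To deduce the formula (2.5), use (2.4) to write
\[
\|\xi\|_V = \sup_{\eta \in V} \Re\langle \xi, \eta \rangle = \sup_{\eta \in K} \Re\langle \xi, \eta \rangle = \sup_{\|\eta\|^V \le 1} \Re\langle \xi, \eta \rangle,
\]
and (2.5) follows. □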

Proof of Proposition 2.1. Properties (i) and (ii) are obvious from the definition (2.2) of $\|\cdot\|^V$, lower semicontinuity (iii) also follows immediately from the definition (2.2), and property (iv) follows from Lemma 2.2.

Uniqueness: Property (iv) implies that for $\xi \in H$,
\[
u(\xi) \le 1 \quad \Longleftrightarrow \quad \|\xi\|^V \le 1.
\]
Using $u(r \cdot \xi) = r \cdot u(\xi)$ for $r > 0$, we conclude that for every positive real number $r$ and every $\xi \in H$, one has
\[
u(\xi) \le r \quad \Longleftrightarrow \quad \|\xi\|^V \le r,
\]
from which it follows that $u(\xi) = \|\xi\|^V$ whenever one of $u(\xi)$, $\|\xi\|^V$ is finite, and that $u(\xi) = \|\xi\|^V = +\infty$ whenever one of $u(\xi)$, $\|\xi\|^V$ is $+\infty$. Hence $u(\xi) = \|\xi\|^V$ for all $\xi \in H$. □

What is more significant for our purposes is that the function $\|\cdot\|^V$ detects membership in $V$ itself:

Theorem 2.3. The restriction of the function $\|\cdot\|^V$ of (2.2) to the unit sphere $\{\xi \in H : \|\xi\| = 1\}$ of $H$ satisfies
\[
\|\xi\|^V \ge 1, \quad \text{and} \quad \|\xi\|^V = 1 \iff \xi \in V. \tag{2.6}
\]

Proof. (2.3) implies that $\|\xi\|^V \ge 1$ for all $\|\xi\| = 1$. Let $K$ be the closed convex hull of $V$. The description of $K$ given in (2.4) and the properties (i) and (ii) of Proposition 2.1 imply that the extreme points of $K$ are the vectors $\xi \in H$ satisfying $\|\xi\|^V = 1$. Since $V$ consists of extreme points of the unit ball of $H$, it consists of extreme points of $K$, hence $\|\xi\|^V = 1$ for every $\xi \in V$.

Conversely, if $\xi$ satisfies $\|\xi\| = \|\xi\|^V = 1$, then the preceding remarks show that $\xi$ is an extreme point of $K$, so that Milman's converse of the Krein–Milman theorem implies that $\xi$ belongs to the weak closure of $V$. But if $\xi_n$ is a sequence in $V$ that converges weakly to $\xi$ then
\[
\|\xi_n - \xi\|^2 = 2 - 2\Re\langle \xi_n, \xi \rangle \to 2 - 2\langle \xi, \xi \rangle = 0
\]
as $n \to \infty$. We conclude that $\xi \in \overline{V}^{\,\mathrm{norm}} = V$. □
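As an illustration of Theorem 2.3 (not part of the original argument), one can look at the two-qubit case $H = \mathbb{C}^2 \otimes \mathbb{C}^2$ with $V$ the decomposable unit vectors. The sketch below assumes the standard identification of a vector $\sum a_{ij}\, e_i \otimes e_j$ with its coefficient matrix $(a_{ij})$, under which the norm of (2.1) becomes the largest singular value and the function of (2.2) becomes the sum of the singular values. It is restricted to real coefficient matrices, and the function names are hypothetical.

```python
import math

def singular_values_2x2(a):
    """Singular values s1 >= s2 >= 0 of a real 2x2 matrix [[a00, a01], [a10, a11]]."""
    t = sum(x * x for row in a for x in row)            # s1^2 + s2^2 (squared Frobenius norm)
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])      # s1 * s2 (absolute determinant)
    disc = math.sqrt(max(t * t - 4.0 * d * d, 0.0))
    s1 = math.sqrt((t + disc) / 2.0)
    s2 = d / s1 if s1 > 0.0 else 0.0
    return s1, s2

def lower_norm(a):
    """Assumed bipartite form of (2.1): the largest singular value."""
    return singular_values_2x2(a)[0]

def upper_norm(a):
    """Assumed bipartite form of (2.2): the sum of the singular values."""
    return sum(singular_values_2x2(a))

c, s = math.cos(0.3), math.sin(0.3)
product = [[c * c, c * s], [s * c, s * s]]      # (c e1 + s e2) tensor (c e1 + s e2), a unit vector in V
bell = [[1 / math.sqrt(2), 0.0],
        [0.0, 1 / math.sqrt(2)]]                # (e1 tensor e1 + e2 tensor e2) / sqrt(2), not in V

print(upper_norm(product))   # close to 1, consistent with membership in V
print(upper_norm(bell))      # strictly greater than 1
```

Under these assumptions the product vector sits at the minimum value 1 of the function in (2.2), while the entangled unit vector does not, which is the dichotomy stated in (2.6).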

3. The geometric invariant r(V)

In this section we introduce a numerical invariant of $V$ that will play a central role.

Definition 3.1. The inner radius $r(V)$ of $V$ is defined as the largest $r \ge 0$ such that $\{\xi \in H : \|\xi\| \le r\}$ is contained in the closed convex hull of $V$.

Obviously, $0 \le r(V) \le 1$. The following result and its corollary imply that $r(V) > 0$ when $H$ is finite-dimensional. More generally, they imply that whenever the inner radius is positive, both $\|\cdot\|_V$ and $\|\cdot\|^V$ are norms that are equivalent to the ambient norm of $H$. We write $d(\xi, V)$ for the distance from a vector $\xi \in H$ to the set $V$, $d(\xi, V) = \inf\{\|\xi - \eta\| : \eta \in V\}$.

Theorem 3.2. The inner radius of $V$ is characterized by each of the following three formulas:
\[
\inf_{\|\xi\| = 1} \|\xi\|_V = r(V), \tag{3.1}
\]
\[
\sup_{\|\xi\| = 1} \|\xi\|^V = \frac{1}{r(V)}, \tag{3.2}
\]
\[
\sup_{\|\xi\| = 1} d(\xi, V) = \sqrt{2\bigl(1 - r(V)\bigr)}. \tag{3.3}
\]

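Proof. Let $K$ be the closed convex hull of $V$. If $K$ contains the ball of radius $r$ about 0, then for every $\xi \in H$ we have
\[
\sup_{\eta \in V} \Re\langle \xi, \eta \rangle = \sup_{\eta \in K} \Re\langle \xi, \eta \rangle \ge \sup_{\|\eta\| \le r} \Re\langle \xi, \eta \rangle = r \cdot \|\xi\|.
\]
Hence
\[
\inf_{\|\xi\| = 1} \|\xi\|_V = \inf_{\|\xi\| = 1} \sup_{\eta \in V} \Re\langle \xi, \eta \rangle \ge r,
\]
and $r(V) \le \inf\{\|\xi\|_V : \|\xi\| = 1\}$ follows. For the opposite inequality, set
\[
r = \inf_{\|\xi\| = 1} \|\xi\|_V.
\]
Then for every $\xi \in H$ satisfying $\|\xi\| = 1$, we have
\[
\sup_{\|\eta\| \le r} \Re\langle \xi, \eta \rangle = r \cdot \sup_{\|\eta\| \le 1} \Re\langle \xi, \eta \rangle = r \cdot \|\xi\| = r \le \|\xi\|_V = \sup_{\eta \in V} \Re\langle \xi, \eta \rangle,
\]
and after rescaling $\xi$ we obtain
\[
\sup_{\|\eta\| \le r} \Re\langle \xi, \eta \rangle \le \sup_{\eta \in V} \Re\langle \xi, \eta \rangle, \qquad \xi \in H.
\]
At this point, a standard separation theorem implies that $\{\eta \in H : \|\eta\| \le r\}$ is contained in the closed convex hull of $V$, hence $r \le r(V)$.

(3.2) follows from (3.1), since by definition of the norm $\|\xi\|^V$
\[
\sup_{\|\xi\| = 1} \|\xi\|^V = \sup_{\|\xi\| = 1,\ \|\eta\|_V = 1} |\langle \xi, \eta \rangle| = \sup_{\|\eta\|_V = 1} \|\eta\| = \sup_{\eta \ne 0} \frac{\|\eta\|}{\|\eta\|_V} = \sup_{\|\eta\| = 1} \frac{1}{\|\eta\|_V} = \Bigl(\inf_{\|\eta\| = 1} \|\eta\|_V\Bigr)^{-1} = r(V)^{-1}.
\]
To prove (3.3), the distance $d(\xi, V)$ from a unit vector $\xi$ to $V$ satisfies
\[
d(\xi, V)^2 = \inf_{\eta \in V} \|\xi - \eta\|^2 = \inf_{\eta \in V} \bigl(2 - 2\Re\langle \xi, \eta \rangle\bigr) = 2 - 2 \sup_{\eta \in V} \Re\langle \xi, \eta \rangle = 2 - 2 \sup_{\eta \in V} |\langle \xi, \eta \rangle| = 2 - 2\|\xi\|_V,
\]
and (3.3) follows after taking square roots. □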
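The identity $d(\xi, V)^2 = 2 - 2\|\xi\|_V$ used for (3.3) can be sanity-checked numerically. The following sketch is an illustration only: it assumes the two-qubit setting with $V$ the decomposable unit vectors, takes as given that $r(V) = 1/\sqrt{2}$ there and that the vector of the form (1.2) is maximal (both are established elsewhere in the paper, not in this section), and searches only over real product vectors on an angular grid. The helper names are hypothetical.

```python
import math

def product_vector(a, b):
    """Coefficients of (cos a, sin a) tensor (cos b, sin b) in the basis e_i tensor e_j."""
    u = (math.cos(a), math.sin(a))
    v = (math.cos(b), math.sin(b))
    return [ui * vj for ui in u for vj in v]

def distance_to_products(xi, steps=180):
    """Grid-search estimate of d(xi, V), restricted to real product unit vectors."""
    angles = [2.0 * math.pi * k / steps for k in range(steps)]
    best = float("inf")
    for a in angles:
        for b in angles:
            best = min(best, math.dist(xi, product_vector(a, b)))
    return best

# Candidate maximal vector (e1 tensor e1 + e2 tensor e2) / sqrt(2), as in (1.2).
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]

estimate = distance_to_products(bell)
predicted = math.sqrt(2.0 * (1.0 - 1.0 / math.sqrt(2)))   # right-hand side of (3.3) with r(V) = 1/sqrt(2)
print(estimate, predicted)
```

A grid search can only overestimate the true distance, so agreement here relies on the grid containing a nearest product vector (for instance $e_1 \otimes e_1$), which it does for this particular vector.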

Corollary 3.3. If the inner radius $r(V)$ is positive, then $\|\cdot\|^V$ is a norm on $H$ satisfying
\[
\|\xi\| \le \|\xi\|^V \le \frac{1}{r(V)} \|\xi\|, \qquad \xi \in H.
\]
If $H$ is finite-dimensional, then $r(V) > 0$.

Proof. The first sentence follows from (2.3) and (3.2). If $H$ is finite-dimensional, the norm $\|\cdot\|_V$ must be equivalent to the ambient norm on $H$, hence $r(V) > 0$ follows from (3.1). □

Corollary 3.4. In general, for any closed set $V$ of unit vectors that satisfies axiom V1, the following five assertions about $V$ are equivalent:

(i) The closed convex hull of $V$ has nonempty interior.
(ii) The inner radius of $V$ is positive.
(iii) The seminorm $\|\cdot\|_V$ is equivalent to the ambient norm of $H$.
(iv) The function $\|\cdot\|^V$ is a norm equivalent to the ambient norm of $H$.
(v) The function $d(\cdot, V)$ is bounded away from $\sqrt{2}$ on the unit sphere:
\[
\sup_{\|\xi\| = 1} d(\xi, V) < \sqrt{2}.
\]

Proof. The equivalences (ii) $\Leftrightarrow$ (iii) $\Leftrightarrow$ (iv) $\Leftrightarrow$ (v) are immediate consequences of the formulas of Theorem 3.2. Since the implication (ii) $\Rightarrow$ (i) is trivial, it suffices to prove (i) $\Rightarrow$ (ii). For that, let $U$ be a nonempty open set contained in the closed convex hull $K$ of $V$. The vector difference $U - U$ is an open neighborhood of 0 that is contained in $K - K$. By axiom V1, $K - K$ is contained in $2 \cdot K$, so that $2^{-1} \cdot (U - U)$ is a subset of $K$ that contains an open ball about 0. □

4. Maximal vectors

Throughout this section, $V$ will denote a norm-closed subset of the unit sphere of a Hilbert space $H$ that satisfies V1 and V2. For every unit vector $\xi \in H$, the distance from $\xi$ to $V$ satisfies $0 \le d(\xi, V) \le \sqrt{2}$; and since $V$ is norm-closed, one has $d(\xi, V) = 0$ iff $\xi \in V$.

Definition 4.1. By a maximal vector we mean a vector $\xi \in H$ satisfying $\|\xi\| = 1$ and
\[
d(\xi, V) = \sup_{\|\eta\| = 1} d(\eta, V).
\]

When H is finite-dimensional, an obvious compactness argument shows that maximal vectors exist; and they exist for significant infinite-dimensional examples as well (see Sections 8–14).


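Maximal vectors will play a central role throughout the remainder of this paper. In this section we show that whenever $r(V) > 0$, the restriction of the function $\|\cdot\|^V$ to the unit sphere of $H$ detects maximality as well as membership in $V$. Indeed, in Theorem 3.2 we calculated the minimum of $\|\cdot\|_V$ and the maximum of $\|\cdot\|^V$ over the unit sphere of $H$. What is notable is that when either of the two extremal values is achieved at some unit vector $\xi$ then they are both achieved at $\xi$; and that such vectors $\xi$ are precisely the maximal vectors.

Theorem 4.2. If $r(V) > 0$, then for every unit vector $\xi \in H$, the following three assertions are equivalent:

(i) $\|\xi\|_V = r(V)$ is minimum.
(ii) $\|\xi\|^V = r(V)^{-1}$ is maximum.
(iii) $d(\xi, V) = \sqrt{2(1 - r(V))}$ is maximum.

Proof. Choose a unit vector $\xi$. We will prove the implications (i) $\Leftrightarrow$ (iii), (i) $\Rightarrow$ (ii) and (ii) $\Rightarrow$ (i).

(i) $\Leftrightarrow$ (iii). Theorem 3.2 implies that the minimum value of $\|\xi\|_V$ is $r(V)$, the maximum value of $d(\xi, V)$ is given by (iii), and that $d(\xi, V)$ is maximized at $\xi$ iff $\|\xi\|_V$ is minimized at $\xi$.

(i) $\Rightarrow$ (ii). If $\|\xi\|_V = r(V)$ then $\|r(V)^{-1} \xi\|_V = 1$, so that
\[
\|\xi\|^V = \sup_{\|\eta\|_V \le 1} |\langle \xi, \eta \rangle| \ge \langle \xi, r(V)^{-1} \xi \rangle = \frac{1}{r(V)}.
\]
Since (3.2) implies $\|\xi\|^V \le r(V)^{-1}$, we conclude that $\|\xi\|^V = r(V)^{-1}$.

(ii) $\Rightarrow$ (i). Assuming (ii), we have
\[
r(V)^{-1} = \|\xi\|^V = \sup_{\|\eta\|_V \le 1} |\langle \xi, \eta \rangle| = \sup_{\|\eta\|_V = 1} |\langle \xi, \eta \rangle| = \sup_{\eta \ne 0} \frac{|\langle \xi, \eta \rangle|}{\|\eta\|_V} = \sup_{\|\eta\| = 1} \frac{|\langle \xi, \eta \rangle|}{\|\eta\|_V},
\]
the last equality holding because the function
\[
\eta \in \{\eta \in H : \eta \ne 0\} \mapsto \frac{|\langle \xi, \eta \rangle|}{\|\eta\|_V}
\]
is homogeneous of degree zero. After taking reciprocals, we obtain
\[
r(V) = \inf_{\|\eta\| = 1} \frac{\|\eta\|_V}{|\langle \xi, \eta \rangle|}. \tag{4.1}
\]
Now (4.1) implies that there is a sequence of unit vectors $\eta_n$ such that
\[
\lim_{n \to \infty} \frac{\|\eta_n\|_V}{|\langle \xi, \eta_n \rangle|} = r(V). \tag{4.2}
\]
Since
\[
\frac{\|\eta_n\|_V}{|\langle \xi, \eta_n \rangle|} \ge \|\eta_n\|_V \ge r(V), \qquad n = 1, 2, \dots,
\]
it follows that $\langle \xi, \eta_n \rangle \ne 0$ for large $n$; moreover, since the left-hand side converges to $r(V)$ we must have
\[
\lim_{n \to \infty} \|\eta_n\|_V = r(V), \quad \text{and} \quad \lim_{n \to \infty} |\langle \xi, \eta_n \rangle| = 1.
\]
Since $\xi$ and $\eta_n$ are unit vectors for which $|\langle \xi, \eta_n \rangle|$ converges to 1, there is a sequence $\lambda_n \in \mathbb{C}$, $|\lambda_n| = 1$, such that $\langle \lambda_n \xi, \eta_n \rangle = \lambda_n \cdot \langle \xi, \eta_n \rangle$ is nonnegative and converges to 1. It follows that
\[
\lim_{n \to \infty} \|\lambda_n \cdot \xi - \eta_n\|^2 = \lim_{n \to \infty} \bigl(2 - 2\Re\langle \lambda_n \cdot \xi, \eta_n \rangle\bigr) = 0,
\]
hence $\bar{\lambda}_n \cdot \eta_n$ converges in norm to $\xi$. By continuity of the norm $\|\cdot\|_V$,
\[
\|\xi\|_V = \lim_{n \to \infty} \|\bar{\lambda}_n \cdot \eta_n\|_V = \lim_{n \to \infty} \|\eta_n\|_V = r(V),
\]
and (i) follows. □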
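The coincidence of the three extremal conditions in Theorem 4.2 can be observed in a small example. The sketch below is illustrative only and rests on assumptions not made in this section: it works in the two-qubit case with $V$ the decomposable unit vectors, restricts attention to the one-parameter family $\xi_\theta = \cos\theta\, e_1 \otimes e_1 + \sin\theta\, e_2 \otimes e_2$, and uses the standard bipartite formulas in which the norm of (2.1) is the largest Schmidt coefficient and the function of (2.2) is the sum of the Schmidt coefficients. The function name is hypothetical.

```python
import math

def invariants(theta):
    """(lower norm, upper norm, distance to V) for xi = cos(theta) e1(x)e1 + sin(theta) e2(x)e2."""
    s1, s2 = math.cos(theta), math.sin(theta)   # Schmidt coefficients, nonnegative on [0, pi/2]
    lower = max(s1, s2)                         # assumed bipartite form of (2.1)
    upper = s1 + s2                             # assumed bipartite form of (2.2)
    dist = math.sqrt(2.0 - 2.0 * lower)         # d^2 = 2 - 2 * lower, as in the proof of (3.3)
    return lower, upper, dist

thetas = [0.5 * math.pi * k / 100 for k in range(101)]
values = [invariants(t) for t in thetas]

# Grid indices at which each quantity is extremal within the family.
argmin_lower = min(range(101), key=lambda k: values[k][0])
argmax_upper = max(range(101), key=lambda k: values[k][1])
argmax_dist = max(range(101), key=lambda k: values[k][2])
print(argmin_lower, argmax_upper, argmax_dist)
```

Within this family all three extrema occur at the same grid point, $\theta = \pi/4$, and there the product of the two norm values is 1, matching the reciprocal relation between (i) and (ii). This checks the pattern on one family only; it is not a substitute for the proof.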

Corollary 4.3. If $r(V) > 0$ then $\|\cdot\|^V$ restricts to a bounded norm-continuous function on the unit sphere of $H$ with the property that for every unit vector $\xi$, $\|\xi\|^V = 1$ iff $\xi \in V$ and $\|\xi\|^V = r(V)^{-1}$ iff $\xi$ is maximal.

Part 2. Normal states and normal functionals on B(H)

Let $H$ be a Hilbert space and let $V$ be a norm-closed subset of the unit sphere of $H$ that satisfies axioms V1 and V2. We now introduce a numerical function of normal states of $B(H)$ that faithfully measures “generalized entanglement,” and we develop its basic properties in general. When the inner radius of $V$ is positive, this function is shown to be the restriction of a norm on the predual $B(H)_*$ to the space of normal states, or equivalently, the restriction of a norm on the Banach space $L^1(H)$ of trace class operators to the space of density operators.

5. Generalized entanglement of states

Fix a Hilbert space $H$. The Banach space $B(H)_*$ of normal linear functionals on $B(H)$ identifies naturally with the dual of the $C^*$-algebra $\mathcal{K}$ of compact operators on $H$, and we may speak of the weak$^*$-topology on $B(H)_*$. Similarly, $B(H)$ identifies with the dual of $B(H)_*$, and we may speak of the weak$^*$-topology on $B(H)$. Thus, a net of normal functionals $\rho_n$ converges weak$^*$ to zero iff
\[
\lim_{n \to \infty} \rho_n(K) = 0, \qquad \forall K \in \mathcal{K},
\]
and a net of operators $A_n \in B(H)$ converges weak$^*$ to zero iff
\[
\lim_{n \to \infty} \rho(A_n) = 0, \qquad \forall \rho \in B(H)_*.
\]
There is a natural involution $\rho \mapsto \rho^*$ defined on $B(H)_*$ by
\[
\rho^*(A) = \overline{\rho(A^*)}, \qquad A \in B(H),
\]
and we may speak of self-adjoint normal functionals $\rho$. Of course, $B(H)_*$ identifies naturally with the Banach $*$-algebra of trace class operators, but that fact is not particularly useful for our purposes.

Our aim is to introduce a measure of “generalized entanglement” for normal states. It will be convenient to define it more generally as a function (5.3) defined on the larger Banach space $B(H)_*$. For every $X \in B(H)$, define
\[
\|X\|_V = \sup_{\xi, \eta \in V} |\langle X\xi, \eta \rangle|. \tag{5.1}
\]

Axiom V2 implies that $\|\cdot\|_V$ is a norm, and obviously $\|X\|_V \le \|X\|$ and $\|X^*\|_V = \|X\|_V$ for every $X$. Consider the $C^*$-algebra $\mathcal{A}$ obtained from the compact operators $\mathcal{K} \subseteq B(H)$ by adjoining the identity operator,
\[
\mathcal{A} = \{K + \lambda \cdot 1 : K \in \mathcal{K},\ \lambda \in \mathbb{C}\}.
\]
Operators in $\mathcal{A}$ serve as “test operators” for our purposes. The $V$-ball in $\mathcal{A}$
\[
\mathcal{B} = \{X \in \mathcal{A} : \|X\|_V \le 1\} \tag{5.2}
\]
is a norm-closed convex subset of $\mathcal{A}$ that is stable under the $*$-operation, stable under multiplication by complex scalars of absolute value 1, and it contains the unit ball of $\mathcal{A}$. Thus we can define an extended real-valued function $E : B(H)_* \to [0, +\infty]$ by
\[
E(\rho) = \sup_{X \in \mathcal{B}} \Re\rho(X) = \sup_{X \in \mathcal{B}} |\rho(X)|, \qquad \rho \in B(H)_*. \tag{5.3}
\]

Remark 5.1 (Self-adjoint elements of $B(H)_*$). Note that if $\rho = \rho^*$ is a self-adjoint functional in $B(H)_*$, then $E(\rho)$ can be defined somewhat differently in terms of self-adjoint operators:
\[
E(\rho) = \sup\{\rho(X) : X^* = X \in \mathcal{A},\ \|X\|_V \le 1\} = \sup\{|\rho(X)| : X^* = X \in \mathcal{A},\ \|X\|_V \le 1\}.
\]
Indeed, every $Z \in \mathcal{B}$ has a Cartesian decomposition $Z = X + iY$ where $X$ and $Y$ are self-adjoint with $X = (Z + Z^*)/2$, and we have
\[
\Re\rho(Z) = \frac{1}{2}\bigl(\rho(Z) + \overline{\rho(Z)}\bigr) = \frac{1}{2}\rho(Z + Z^*) = \rho(X),
\]
where $X = X^* \in \mathcal{B}$. After noting $|\rho(X)| = \max(\rho(X), \rho(-X))$, we obtain
\[
E(\rho) \le \sup\{\rho(X) : X^* = X \in \mathcal{A},\ \|X\|_V \le 1\}.
\]
The opposite inequality is obvious.

In general, $E(\rho)$ can achieve the value $+\infty$ (see Remark 7.4). We first determine when the set $\mathcal{B}$ is bounded.

Proposition 5.2. Let $r(V)$ be the inner radius of $V$ and let $\mathcal{B}_0$ be the set of all positive rank-one operators in $\mathcal{B}$. Then
\[
\sup_{X \in \mathcal{B}} \|X\| = \sup_{X \in \mathcal{B}_0} \|X\| = \frac{1}{r(V)^2}. \tag{5.4}
\]
Consequently, for every normal linear functional $\rho \in B(H)_*$,
\[
\|\rho\| \le E(\rho) \le r(V)^{-2} \cdot \|\rho\|. \tag{5.5}
\]

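Proof. To prove (5.4), it suffices to show that for every positive number $M$, the following are equivalent:

(i) $\|X\| \le M\cdot\|X\|_V$ for every rank one projection $X \in \mathcal{K}$.
(ii) $\|X\| \le M\cdot\|X\|_V$ for every $X \in \mathcal{B}(H)$.
(iii) $M \ge r(V)^{-2}$.

Since the implication (ii) ⇒ (i) is trivial, it is enough to prove (i) ⇒ (iii) and (iii) ⇒ (ii).

(i) ⇒ (iii). Choose a unit vector $\zeta \in H$ and let $X$ be the rank one projection $X\xi = \langle\xi,\zeta\rangle\zeta$, $\xi \in H$. Then (i) implies

\[ 1 = \|X\| \le M\cdot\sup\{|\langle X\xi,\eta\rangle| : \xi,\eta \in V\} = M\cdot\sup\{|\langle\xi,\zeta\rangle|\cdot|\langle\zeta,\eta\rangle| : \xi,\eta \in V\} = M\cdot\sup\{|\langle\zeta,\xi\rangle|^2 : \xi \in V\}, \]

from which it follows that

\[ \sqrt{M}\cdot\sup_{\xi\in V}\operatorname{Re}\langle\zeta,\xi\rangle = \sqrt{M}\cdot\sup_{\xi\in V}|\langle\zeta,\xi\rangle| \ge 1. \]

Let $K$ be the closed convex hull of $V$. After multiplying through by $\|\zeta\|$ for more general nonzero vectors $\zeta \in H$, the preceding inequality implies

\[ \sqrt{M}\cdot\sup_{\xi\in K}\operatorname{Re}\langle\zeta,\xi\rangle = \sqrt{M}\cdot\sup_{\xi\in V}\operatorname{Re}\langle\zeta,\xi\rangle \ge \|\zeta\| = \sup_{\|\eta\|\le 1}\operatorname{Re}\langle\zeta,\eta\rangle. \]

Since every bounded real-linear functional $f : H \to \mathbb{R}$ must have the form $f(\xi) = \operatorname{Re}\langle\zeta,\xi\rangle$ for some vector $\zeta \in H$, a standard separation theorem implies that the unit ball of $H$ is contained in $\sqrt{M}\cdot K$, namely the closed convex hull of $\sqrt{M}\cdot V$. Hence $r(V) \ge M^{-1/2}$.

(iii) ⇒ (ii). Fix $X \in \mathcal{B}(H)$ and let $\xi_0, \eta_0 \in H$ satisfy $\|\xi_0\| \le 1$, $\|\eta_0\| \le 1$. By definition of $r(V)$, hypothesis (iii) implies that both $\xi_0$ and $\eta_0$ belong to the closed convex hull of $\sqrt{M}\cdot V$, and hence

\[ |\langle X\xi_0,\eta_0\rangle| \le \sup\{|\langle X\xi,\eta\rangle| : \xi,\eta \in \sqrt{M}\cdot V\} = M\cdot\sup_{\xi,\eta\in V}|\langle X\xi,\eta\rangle| = M\cdot\|X\|_V. \]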

After taking the supremum over $\xi_0, \eta_0$, we obtain $\|X\| \le M\cdot\|X\|_V$. The estimates (5.5) follow immediately from (5.4). □

The basic properties of the function $E$ are summarized as follows.

Theorem 5.3. The function $E : \mathcal{B}(H)_* \to [0,+\infty]$ satisfies:

(i) For all $\rho_1, \rho_2 \in \mathcal{B}(H)_*$, $E(\rho_1 + \rho_2) \le E(\rho_1) + E(\rho_2)$.
(ii) For every nonzero $\lambda \in \mathbb{C}$ and every $\rho \in \mathcal{B}(H)_*$, $E(\lambda\cdot\rho) = |\lambda|\cdot E(\rho)$.
(iii) $E$ is lower semicontinuous relative to the weak$^*$ topology of $\mathcal{B}(H)_*$.
(iv) If $r(V) > 0$, then $E$ is a norm equivalent to the norm of $\mathcal{B}(H)_*$.

Moreover, letting $\Sigma$ be the set of all normal states of $\mathcal{B}(H)$, we have

\[ \sup_{\rho\in\Sigma} E(\rho) = \sup_{X\in\mathcal{B}} \|X\| = \frac{1}{r(V)^2}, \tag{5.6} \]

the term on the right being interpreted as $+\infty$ when $r(V) = 0$.

Proof. (i), (ii) and (iii) are immediate consequences of the definition (5.3) of $E$ after noting that a supremum of continuous real-valued functions is lower semicontinuous, and (iv) follows from (5.5). To prove (5.6), let $\mathcal{B}_1 = \{X = X^* \in \mathcal{A} : \|X\|_V \le 1\}$ be the set of self-adjoint operators in $\mathcal{B}$. Remark 5.1 implies that

\[ \sup_{\rho\in\Sigma} E(\rho) = \sup_{\rho\in\Sigma}\,\sup_{X\in\mathcal{B}_1} \rho(X) = \sup_{X\in\mathcal{B}_1}\,\sup_{\rho\in\Sigma} \rho(X). \]

Noting that $\mathcal{B}_1 = -\mathcal{B}_1$ and that the norm of a self-adjoint operator agrees with its numerical radius, the right-hand side can be replaced with

\[ \sup_{X\in\mathcal{B}_1}\,\sup_{\rho\in\Sigma} |\rho(X)| = \sup_{X\in\mathcal{B}_1} \|X\|. \]

Formula (5.6) now follows from (5.4) of Proposition 5.2. □

We may conclude that when the inner radius is positive, $E(\cdot)$ is uniformly continuous on the unit ball of $\mathcal{B}(H)_*$:

Corollary 5.4. Assume that $r(V) > 0$. Then for $\rho, \sigma \in \mathcal{B}(H)_*$ we have

\[ |E(\rho) - E(\sigma)| \le r(V)^{-2}\cdot\|\rho - \sigma\|. \tag{5.7} \]

Proof. Theorem 5.3(iv) implies that $E(\cdot)$ is a norm on $\mathcal{B}(H)_*$, hence

\[ |E(\rho) - E(\sigma)| \le E(\rho - \sigma) \le r(V)^{-2}\,\|\rho - \sigma\|, \]

the second inequality following from (5.5). □
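The first inequality in the proof of Corollary 5.4 is the reverse triangle inequality for the norm $E$. As a short sketch, it can be spelled out using only parts (i) and (ii) of Theorem 5.3, together with the finiteness of $E$ guaranteed by (5.5) when $r(V) > 0$:

```latex
\begin{align*}
E(\rho)   &= E\bigl((\rho-\sigma)+\sigma\bigr) \le E(\rho-\sigma) + E(\sigma),\\
E(\sigma) &= E\bigl((\sigma-\rho)+\rho\bigr) \le E(\sigma-\rho) + E(\rho)
           = E(\rho-\sigma) + E(\rho),
\end{align*}
% the last equality uses E(\lambda\rho) = |\lambda| E(\rho) with \lambda = -1
so that
\[
  |E(\rho)-E(\sigma)| \le E(\rho-\sigma) \le r(V)^{-2}\,\|\rho-\sigma\| .
\]
```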

6. $V$-correlated states and faithfulness of $E$

Given two unit vectors $\xi, \eta \in H$, we will write $\omega_{\xi,\eta}$ for the linear functional

\[ \omega_{\xi,\eta}(A) = \langle A\xi,\eta\rangle, \qquad A \in \mathcal{B}(H). \]

One has $\|\omega_{\xi,\eta}\| = \|\xi\|\cdot\|\eta\| = 1$, and $\omega_{\xi,\eta}^* = \omega_{\eta,\xi}$. We begin by recalling two definitions from the introduction.

Definition 6.1. A normal state $\rho$ of $\mathcal{B}(H)$ is said to be $V$-correlated if for every $\epsilon > 0$, there is an $n = 1, 2, \dots$, a set of vectors $\xi_1, \dots, \xi_n \in V$ and a set of positive reals $t_1, \dots, t_n$ satisfying $t_1 + \cdots + t_n = 1$ such that

\[ \|\rho - (t_1\omega_{\xi_1,\xi_1} + \cdots + t_n\omega_{\xi_n,\xi_n})\| \le \epsilon. \]

A normal state $\rho$ that is not $V$-correlated is said to be entangled.

By (5.5), $E(\rho) \ge 1$ for every normal state $\rho$. The purpose of this section is to prove the following result that characterizes entangled states by the inequality $E(\rho) > 1$. We assume that $H$ is a perhaps infinite-dimensional Hilbert space, that $V \subseteq \{\xi \in H : \|\xi\| = 1\}$ satisfies hypotheses V1 and V2, but we make no assumption about the inner radius of $V$.

Theorem 6.2. A normal state $\rho$ of $\mathcal{B}(H)$ is $V$-correlated iff $E(\rho) = 1$.

The proof of Theorem 6.2 requires some preparation that is conveniently formulated in terms of the state space of the unital $C^*$-algebra

\[ \mathcal{A} = \mathcal{K} + \mathbb{C}\cdot 1 = \{K + \lambda\cdot 1 : K \in \mathcal{K},\ \lambda \in \mathbb{C}\}, \]

which of course reduces to $\mathcal{B}(H)$ when $H$ is finite-dimensional. After working out these preliminaries, we will return to the proof of Theorem 6.2 later in the section. The state space of $\mathcal{A}$ is compact convex in its relative weak$^*$-topology, not to be confused with the various weak$^*$ topologies described in the previous section. We write $\Sigma_V$ for the set of all states $\rho$ of $\mathcal{A}$ that satisfy

\[ |\rho(X)| \le \|X\|_V = \sup_{\xi,\eta\in V} |\langle X\xi,\eta\rangle|, \qquad X \in \mathcal{A}. \tag{6.1} \]

Theorem 6.3. Every state of $\Sigma_V$ is a weak$^*$-limit of states of $\mathcal{A}$ of the form

\[ t_1\cdot\omega_{\xi_1,\xi_1}|_{\mathcal{A}} + \cdots + t_n\cdot\omega_{\xi_n,\xi_n}|_{\mathcal{A}}, \]

where $n = 1, 2, \dots$, $\xi_1, \dots, \xi_n \in V$ and the $t_k$ are positive reals with sum 1.

Proof. Since (6.1) exhibits $\Sigma_V$ as an intersection of weak$^*$-closed subsets of the state space of $\mathcal{A}$, it follows that $\Sigma_V$ is weak$^*$-compact as well as convex. The Krein–Milman theorem implies that $\Sigma_V$ is the weak$^*$-closed convex hull of its extreme points, hence it suffices to show that for every extreme point $\rho$ of $\Sigma_V$, there is a net of vectors $\xi_n \in V$ such that

\[ \rho(X) = \lim_{n\to\infty} \langle X\xi_n,\xi_n\rangle, \qquad X \in \mathcal{A}. \tag{6.2} \]

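To that end, consider the somewhat larger set $\Omega_V$ of all bounded linear functionals $\omega$ on $\mathcal{A}$ that satisfy

\[ |\omega(X)| \le \sup_{\xi,\eta\in V} |\langle X\xi,\eta\rangle| = \|X\|_V, \qquad X \in \mathcal{A}. \tag{6.3} \]

Since $\|X\|_V \le \|X\|$, $\Omega_V$ is contained in the unit ball of the dual of $\mathcal{A}$, and it is clearly convex and weak$^*$-closed, hence compact. We claim that

\[ \Omega_V = \overline{\operatorname{conv}}^{\,\mathrm{weak}^*}\{\omega_{\xi,\eta}|_{\mathcal{A}} : \xi,\eta \in V\}, \tag{6.4} \]

conv denoting the convex hull. Indeed, the inclusion ⊇ is immediate from the definition of $\Omega_V$. For the inclusion ⊆, choose an operator $X \in \mathcal{A}$ and a real number $\alpha$ such that $\operatorname{Re}\omega_{\xi,\eta}(X) = \operatorname{Re}\langle X\xi,\eta\rangle \le \alpha$ for all $\xi, \eta \in V$. By axiom V1, this implies that for fixed $\xi, \eta \in V$ we have

\[ |\langle X\xi,\eta\rangle| = \sup_{|\lambda|=1} \operatorname{Re}\lambda\langle X\xi,\eta\rangle = \sup_{|\lambda|=1} \operatorname{Re}\langle X\lambda\cdot\xi,\eta\rangle \le \sup_{\xi,\eta\in V} \operatorname{Re}\langle X\xi,\eta\rangle \le \alpha, \]

and after taking the supremum over $\xi, \eta$ on the left-hand side we obtain $\|X\|_V \le \alpha$. It follows that for every $\omega \in \Omega_V$, $\operatorname{Re}\omega(X) \le \|X\|_V \le \alpha$, and (6.4) now follows from a standard separation theorem.

Now let $\rho$ be an extreme point of $\Sigma_V$. Then $\rho \in \Omega_V$, and we claim that in fact, $\rho$ is an extreme point of $\Omega_V$. Indeed, if $\omega_1, \omega_2 \in \Omega_V$ and $0 < t < 1$ are such that $\rho = t\cdot\omega_1 + (1-t)\cdot\omega_2$, then

\[ 1 = \rho(1) = t\cdot\omega_1(1) + (1-t)\cdot\omega_2(1). \]

Since $|\omega_k(1)| \le \|\omega_k\| \le 1$ and 1 is an extreme point of the closed unit disk, it follows that $\omega_1(1) = \omega_2(1) = 1$. Since $\|\omega_k\| \le 1 = \omega_k(1)$, this implies that both $\omega_1$ and $\omega_2$ are states of $\mathcal{A}$, hence $\omega_k \in \Sigma_V$. By extremality of $\rho$, we conclude that $\omega_1 = \omega_2 = \rho$, as asserted.

Finally, since $\rho$ is an extreme point of $\Omega_V$ and $\Omega_V$ is given by (6.4), Milman's converse of the Krein–Milman theorem implies that there is a net of pairs $\xi_n, \eta_n \in V$ such that $\omega_{\xi_n,\eta_n}$ converges to $\rho$ in the weak$^*$ topology. It remains to show that we can choose $\eta_n = \xi_n$ for all $n$, and for that consider $\omega_{\xi_n,\eta_n}(1) = \langle\xi_n,\eta_n\rangle$, which converges to $\rho(1) = 1$ as $n \to \infty$. This implies that

\[ \|\xi_n - \eta_n\|^2 = 2\bigl(1 - \operatorname{Re}\langle\xi_n,\eta_n\rangle\bigr) \to 0 \]

as $n \to \infty$, so that $\|\omega_{\xi_n,\xi_n} - \omega_{\xi_n,\eta_n}\| \le \|\xi_n - \eta_n\| \to 0$ as $n \to \infty$. Hence $\omega_{\xi_n,\xi_n}$ converges weak$^*$ to $\rho$, and the desired conclusion (6.2) follows. □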
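The last two estimates in the proof of Theorem 6.3 rest on elementary Hilbert space identities. The following sketch writes them out for arbitrary unit vectors $\xi, \eta$ and an arbitrary bounded operator $X$; it uses nothing beyond the Schwarz inequality.

```latex
\begin{align*}
\|\xi-\eta\|^2
  &= \|\xi\|^2 - 2\operatorname{Re}\langle\xi,\eta\rangle + \|\eta\|^2
   = 2\bigl(1-\operatorname{Re}\langle\xi,\eta\rangle\bigr),\\
|\omega_{\xi,\xi}(X)-\omega_{\xi,\eta}(X)|
  &= |\langle X\xi,\,\xi-\eta\rangle|
   \le \|X\|\,\|\xi\|\,\|\xi-\eta\|
   = \|X\|\,\|\xi-\eta\| .
\end{align*}
% Taking the supremum over \|X\| \le 1 gives
% \|\omega_{\xi,\xi}-\omega_{\xi,\eta}\| \le \|\xi-\eta\|.
```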



Proof of Theorem 6.2. It is clear from the definition (5.3) that $E(\rho) \ge 1$ in general. We claim first that $E(\rho) = 1$ for every $V$-correlated normal state $\rho$. Indeed, since $E(\cdot)$ is a convex function that is lower semicontinuous with respect to the norm topology on states, the set $C$ of all normal states $\rho$ for which $E(\rho) \le 1$ is norm closed and convex. It contains every state of the form $\omega_{\xi,\xi}$ for $\xi \in V$ since for every $X \in \mathcal{A}$ we have

\[ |\omega_{\xi,\xi}(X)| = |\langle X\xi,\xi\rangle| \le \sup_{\eta,\zeta\in V} |\langle X\eta,\zeta\rangle| = \|X\|_V, \]

so that $E(\omega_{\xi,\xi}) \le 1$. Hence $C$ contains every $V$-correlated state.

Conversely, let $\rho$ be a normal state for which $E(\rho) = 1$, or equivalently,

\[ |\rho(X)| \le \|X\|_V, \qquad X \in \mathcal{A}. \]

Theorem 6.3 implies that there is a net of normal states $\rho_n$ of $\mathcal{B}(H)$, each of which is a finite convex combination of states of the form $\omega_{\xi,\xi}$ with $\xi \in V$, such that

\[ \rho(X) = \lim_{n\to\infty} \rho_n(X), \qquad X \in \mathcal{A}, \]

and in particular

\[ \rho(K) = \lim_{n\to\infty} \rho_n(K), \qquad K \in \mathcal{K}. \]

It is well known that if a net of normal states converges to a normal state pointwise on compact operators, then in fact $\|\rho - \rho_n\| \to 0$ as $n \to \infty$ (for example, see [1, Lemma 2.9.10]). We conclude from the latter that $\rho$ is $V$-correlated. □

Remark 6.4. In the special case where $H$ is a tensor product of Hilbert spaces $H = H_1 \otimes H_2$ and

\[ V = \{\xi_1 \otimes \xi_2 : \xi_k \in H_k,\ \|\xi_1\| = \|\xi_2\| = 1\}, \]

Holevo, Shirokov and Werner showed [9] that when $H_1$ and $H_2$ are infinite-dimensional, there are normal states that can be norm approximated by convex combinations of vector states of the form $\omega_{\xi,\xi}$, $\xi \in V$, but which cannot be written as a discrete infinite convex combination

\[ \rho = \sum_{k=1}^{\infty} t_k\cdot\omega_{\xi_k,\xi_k} \]

with $\xi_k \in V$ and with nonnegative numbers $t_k$ having sum 1. On the other hand, they also show that every such $\rho$ can be expressed as an integral

\[ \rho(X) = \int_S \langle X\xi,\xi\rangle\, d\mu(\xi), \qquad X \in \mathcal{B}(H), \tag{6.5} \]

where $\mu$ is a probability measure on the Polish space $S = \{\xi = \eta \otimes \zeta : \|\eta\| = \|\zeta\| = 1\}$.



It seems likely that an integral representation like (6.5) should persist for $V$-correlated states in the more general setting of Theorem 6.2, where of course $S$ is replaced with $V$, though we have not pursued that issue.

7. Maximally entangled states

The entanglement of a normal state $\rho$ satisfies $1 \le E(\rho) \le r(V)^{-2}$, and the minimally entangled states were characterized as the $V$-correlated states in Theorem 6.2. In this section we discuss states at the opposite extreme.

Definition 7.1. A normal state $\rho$ satisfying $E(\rho) = r(V)^{-2}$ is said to be maximally entangled.

We now calculate the entanglement of (normal) pure states in general, and we characterize the maximally entangled pure states in cases where the inner radius of $V$ is positive.

Theorem 7.2. Let $V$ be a norm-closed subset of the unit sphere of $H$ satisfying V1 and V2, let $\xi$ be a unit vector in $H$ and let $\omega$ be the corresponding vector state $\omega(X) = \langle X\xi,\xi\rangle$, $X \in \mathcal{B}(H)$. Then

\[ E(\omega) = \bigl(\|\xi\|^V\bigr)^2. \tag{7.1} \]

Assuming further that $r(V) > 0$, then $\omega$ is maximally entangled iff $\xi$ is a maximal vector. More generally, let $\rho$ be an arbitrary maximally entangled normal state, and decompose $\rho$ into a perhaps infinite convex combination of vector states

\[ \rho = t_1\cdot\omega_1 + t_2\cdot\omega_2 + \cdots \tag{7.2} \]

where the $t_k$ are positive numbers with sum 1 and each $\omega_k$ has the form $\omega_k(X) = \langle X\xi_k,\xi_k\rangle$, $X \in \mathcal{B}(H)$, with $\|\xi_k\| = 1$. Then each $\omega_k$ is maximally entangled.

The proof of Theorem 7.2 makes use of the following basic inequality:

Lemma 7.3. For every $\xi, \eta \in H$ and every $A \in \mathcal{B}(H)$,

\[ |\langle A\xi,\eta\rangle| \le \|A\|_V\, \|\xi\|^V\, \|\eta\|^V. \tag{7.3} \]

Proof of Lemma 7.3. After rescaling both $\xi$ and $\eta$, it is enough to show that

\[ \|\xi\|^V \le 1,\ \|\eta\|^V \le 1 \ \Longrightarrow\ |\langle A\xi,\eta\rangle| \le \|A\|_V. \tag{7.4} \]

To that end, assume first that $\xi, \eta \in V$. Then

\[ |\langle A\xi,\eta\rangle| \le \sup_{\xi,\eta\in V} |\langle A\xi,\eta\rangle| = \|A\|_V. \]

Since $\langle A\xi,\eta\rangle$ is sesquilinear in $\xi, \eta$, the same inequality $|\langle A\xi,\eta\rangle| \le \|A\|_V$ persists if $\xi$ and $\eta$ are finite convex combinations of elements of $V$, and by passing to the norm closure, $|\langle A\xi,\eta\rangle| \le \|A\|_V$ remains true if $\xi$ and $\eta$ belong to the closed convex hull of $V$. By Lemma 2.2, the closed convex hull of $V$ is $\{\zeta \in H : \|\zeta\|^V \le 1\}$, and (7.4) follows. □



Proof of Theorem 7.2. Let $\xi \in H$ be a unit vector with associated vector state $\omega$ and let $\mathcal{A} = \mathcal{K} + \mathbb{C}\cdot 1$. Then for every $A \in \mathcal{A}$ satisfying $\|A\|_V \le 1$, (7.3) implies

\[ |\omega(A)| = |\langle A\xi,\xi\rangle| \le \bigl(\|\xi\|^V\bigr)^2, \]

and $E(\omega) \le (\|\xi\|^V)^2$ follows from the definition (5.3) after taking the supremum over $A$. To prove the opposite inequality $E(\omega) \ge (\|\xi\|^V)^2$, consider

\[ \|\xi\|^V = \sup_{\|\zeta\|_V = 1} |\langle\xi,\zeta\rangle|. \]

Let $\zeta_n$ be a sequence of vectors in $H$ satisfying $\|\zeta_n\|_V = 1$ for all $n = 1, 2, \dots$ and $|\langle\xi,\zeta_n\rangle| \uparrow \|\xi\|^V$ as $n \to \infty$. Consider the sequence of rank one operators $A_1, A_2, \dots$ defined by $A_n(\eta) = \langle\eta,\zeta_n\rangle\zeta_n$, $\eta \in H$, and note that $\|A_n\|_V = 1$. Indeed, we have

\[ \|A_n\|_V = \sup_{\eta_1,\eta_2\in V} |\langle A_n\eta_1,\eta_2\rangle| = \sup_{\eta_1,\eta_2\in V} |\langle\eta_1,\zeta_n\rangle|\,|\langle\zeta_n,\eta_2\rangle| = \sup_{\eta\in V} |\langle\zeta_n,\eta\rangle|^2 = \|\zeta_n\|_V^2 = 1. \]

So by (5.3), $E(\omega) \ge |\omega(A_n)|$ for every $n = 1, 2, \dots$. But since

\[ \omega(A_n) = \langle A_n\xi,\xi\rangle = |\langle\xi,\zeta_n\rangle|^2 \uparrow \bigl(\|\xi\|^V\bigr)^2 \]

as $n \to \infty$, it follows that $E(\omega) \ge (\|\xi\|^V)^2$.

For the second paragraph, assume that $r(V) > 0$. Theorem 4.2 implies that $\|\xi\|^V = r(V)^{-1}$ iff $\xi$ is a maximal vector; and from (7.1) we conclude that $\omega$ is a maximally entangled state iff $\xi$ is a maximal vector.

Let $\rho$ be a maximally entangled state of the form (7.2). By symmetry and since all the $t_k$ are positive, it suffices to show that $\omega_1$ is maximally entangled. For that, consider the normal state

\[ \sigma = \frac{t_2}{1-t_1}\,\omega_2 + \frac{t_3}{1-t_1}\,\omega_3 + \cdots. \]

We have $\rho = t_1\cdot\omega_1 + (1-t_1)\cdot\sigma$, and since $E$ is a convex function,

\[ \frac{1}{r(V)^2} = E(\rho) \le t_1 E(\omega_1) + (1-t_1) E(\sigma). \]

Since $E(\omega_1)$ and $E(\sigma)$ are both $\le r(V)^{-2}$, it follows that $E(\omega_1) = E(\sigma) = r(V)^{-2}$, hence $\omega_1$ is a maximally entangled pure state. □

Remark 7.4 (Infinitely entangled states). Consider the case $H = H_1 \otimes H_2$ with $V$ the set of decomposable unit vectors $\eta_1 \otimes \eta_2$, with $\eta_k$ a unit vector in $H_k$, $k = 1, 2$. When $\dim H_1 = \dim H_2 = \infty$, infinitely entangled normal states exist. Indeed, Proposition 8.3 below implies that there are unit vectors $\xi$ satisfying $\|\xi\|^V = +\infty$ in this case, and by Theorem 7.2, such a $\xi$ gives rise to a vector state $\omega$ for which $E(\omega) = +\infty$.



Part 3. $N$-fold tensor products

In the remaining sections we consider Hilbert spaces presented as $N$-fold tensor products $H = H_1 \otimes \cdots \otimes H_N$ in which at most one of the factors $H_k$ is infinite-dimensional. We can arrange that the dimensions $n_k = \dim H_k$ increase, $n_1 \le \cdots \le n_N$, so that $n_{N-1} < \infty$. The set $V$ of distinguished vectors is the set of all decomposable unit vectors

\[ V = \{\xi_1 \otimes \cdots \otimes \xi_N : \xi_k \in H_k,\ \|\xi_1\| = \cdots = \|\xi_N\| = 1\}. \]

The general results above imply that we will have a rather complete understanding of separable states and entanglement once we determine the inner radius of $V$, have an explicit description of the maximal vectors, and identify the entanglement norm of states. In the remaining sections we present our progress in carrying out those calculations. We calculate the vector norms $\|\cdot\|_V$ and $\|\cdot\|^V$ and the entanglement measuring norm $E$ of normal states in general. In order to determine the maximal vectors one must first calculate the inner radius $r(V)$. While we are unable to obtain an explicit formula in general, we do obtain such a formula under the assumption that $H_N$ is "large" in the sense that $n_N \ge n_1 \cdots n_{N-1}$, and we characterize maximal vectors as those unit vectors that purify the tracial state of the subalgebra $\mathcal{B}(H_1 \otimes \cdots \otimes H_{N-1}) \otimes 1_{H_N}$. Of course, a natural setting in which all of the results of this section are valid is that in which exactly one of the factors of $H_1 \otimes \cdots \otimes H_N$ is infinite-dimensional.

8. Calculation of the vector norms $\|\cdot\|_V$ and $\|\cdot\|^V$

Remark 8.1 (Projective tensor products). We begin by reviewing the definition and universal property of the projective tensor product $E_1 \hat\otimes \cdots \hat\otimes E_N$ of complex Banach spaces $E_1, \dots, E_N$. We require these results only when at most one of $E_1, \dots, E_N$ is infinite-dimensional and we confine the discussion to such cases, with the $E_k$ arranged so that their dimensions $n_k = \dim E_k$ weakly increase with $k$ and satisfy $n_{N-1} < \infty$.

Every vector $z$ of the algebraic tensor product of vector spaces $E_1 \odot \cdots \odot E_N$ can be expressed as a sum of elementary tensors

\[ z = \sum_{k=1}^{n} x_1^k \otimes \cdots \otimes x_N^k \tag{8.1} \]

in many ways, with $1 \le n \le n_1 n_2 \cdots n_{N-1}$, $x_j^k \in E_j$, $j = 1, \dots, N$. The projective norm (or greatest cross norm) $\|z\|_\gamma$ is defined as

\[ \|z\|_\gamma = \inf \sum_{k=1}^{n} \|x_1^k\|\cdot\|x_2^k\| \cdots \|x_N^k\|, \]

the infimum extended over all representations of $z$ of the form (8.1). It is a fact that the norm $\|\cdot\|_\gamma$ makes the algebraic tensor product into a Banach space, the projective tensor product, denoted $E_1 \hat\otimes \cdots \hat\otimes E_N$. The projective norm is a cross norm ($\|x_1 \otimes \cdots \otimes x_N\|_\gamma = \|x_1\| \cdots \|x_N\|$) that dominates every cross norm on $E_1 \odot \cdots \odot E_N$.



It is characterized by the following universal property: For every Banach space $F$ and every bounded multilinear mapping $B : E_1 \times \cdots \times E_N \to F$, there is a unique bounded linear operator $L : E_1 \hat\otimes \cdots \hat\otimes E_N \to F$ with the property $L(x_1 \otimes \cdots \otimes x_N) = B(x_1, \dots, x_N)$ for all $x_j \in E_j$, $1 \le j \le N$, and the norm of the linearizing operator $L$ is given by

\[ \|L\| = \sup\{\|B(x_1, \dots, x_N)\| : \|x_j\| \le 1,\ j = 1, \dots, N\}. \]

In particular, the norm of a linear functional $F : E_1 \hat\otimes \cdots \hat\otimes E_N \to \mathbb{C}$ is

\[ \|F\| = \sup_{\|x_1\| = \cdots = \|x_N\| = 1} |F(x_1 \otimes x_2 \otimes \cdots \otimes x_N)|. \tag{8.2} \]

Moreover, every bounded linear functional $F$ on $E_1 \hat\otimes \cdots \hat\otimes E_N$ can be written as a finite linear combination of decomposable functionals

\[ F = \sum_{k=1}^{n_1 n_2 \cdots n_{N-1}} F_1^k \otimes \cdots \otimes F_N^k, \tag{8.3} \]

where for each $j = 1, \dots, N$, $F_j^k$ is a bounded linear functional on $E_j$.

We now calculate the vector norms $\|\cdot\|_V$ and $\|\cdot\|^V$ for cases in which $V$ is the set of decomposable unit vectors in $N$-fold tensor products $H_1 \otimes \cdots \otimes H_N$ where the dimensions $n_k = \dim H_k$ weakly increase with $n_{N-1} < \infty$. The space $H_N$ is allowed to be infinite-dimensional.

Theorem 8.2. For every $\xi \in H_1 \otimes \cdots \otimes H_N$, let $F_\xi$ be the element of the dual of the projective tensor product $H_1 \hat\otimes \cdots \hat\otimes H_N$ defined by

\[ F_\xi(\eta_1 \otimes \cdots \otimes \eta_N) = \langle\eta_1 \otimes \cdots \otimes \eta_N, \xi\rangle. \]

Then the norms $\|\cdot\|_V$ and $\|\cdot\|^V$ are given by

\[ \|\xi\|_V = \|F_\xi\|, \qquad \|\xi\|^V = \|\xi\|_\gamma, \qquad \xi \in H_1 \otimes \cdots \otimes H_N. \tag{8.4} \]
Proof. The first formula of (8.4) is an immediate consequence of the definition of ξ V and the formula (8.2), since ξ V = =

sup

η1 ⊗ · · · ⊗ ηN , ξ

sup

Fξ (η1 ⊗ · · · ⊗ ηN ) = Fξ .

η1 =···=ηN =1 η1 =···=ηN =1

For the second formula, write ξ V = sup ξ, η = sup Fη (ξ ). ηV 1

ηV 1


1497

The formula just proved asserts that ηV = Fη , so the preceding formula can be written ξ V = sup Fη (ξ ) ξ γ . Fη 1

For the opposite inequality, we use the Hahn–Banach theorem to find a linear functional F of ˆ ··· ⊗ ˆ HN such that ξ γ = F (ξ ). By the Riesz lemma there is a norm 1 in the dual of H1 ⊗ unique vector η ∈ H1 ⊗ · · · ⊗ HN such that F (ζ ) = ζ, η for all ζ , and in particular ξ γ = F (ξ ) = ξ, η = Fη (ξ ). By the first part of the proof we have ηV = Fη = 1. Hence ξ γ = ξ, η sup ξ, η = ξ V , ηV 1

and ξ γ = ξ V follows.

2
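For $N = 2$ the two norms of Theorem 8.2 can be evaluated by hand once a vector is written in Schmidt form $\xi = \sum_j s_j\, e_j \otimes f_j$ with orthonormal $e_j$, $f_j$ and $s_j \ge 0$. The sketch below relies on a standard fact that is not stated in the text and is treated here as an assumption: in that case $\|\xi\|_V$ is the largest Schmidt coefficient and $\|\xi\|^V = \|\xi\|_\gamma$ is the sum of the Schmidt coefficients. The function names are hypothetical and chosen only for illustration.

```python
import math


def normalize(coeffs):
    """Rescale nonnegative Schmidt coefficients so that sum(s_j**2) == 1."""
    length = math.sqrt(sum(c * c for c in coeffs))
    return [c / length for c in coeffs]


def norm_sub_V(schmidt):
    """||xi||_V for N = 2 (assumed fact): the largest Schmidt coefficient."""
    return max(schmidt)


def norm_sup_V(schmidt):
    """||xi||^V = ||xi||_gamma for N = 2 (assumed fact): sum of the coefficients."""
    return sum(schmidt)


def entanglement_of_vector_state(schmidt):
    """E(omega) = (||xi||^V)**2, as in formula (7.1)."""
    return norm_sup_V(schmidt) ** 2


product = normalize([1.0, 0.0])   # a decomposable unit vector
bell = normalize([1.0, 1.0])      # equal Schmidt coefficients 1/sqrt(2)
skew = normalize([3.0, 4.0])      # Schmidt coefficients 0.6 and 0.8

for name, s in [("product", product), ("bell", bell), ("skew", skew)]:
    print(name, norm_sub_V(s), norm_sup_V(s), entanglement_of_vector_state(s))
```

Under these assumptions a decomposable vector has $E(\omega) = 1$, while equal Schmidt coefficients give the largest value of $\|\xi\|^V$ among unit vectors with the same number of terms.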

The following observation implies that $r(V)$ can be zero and infinitely entangled vectors can exist. While the physics literature contains examples of infinitely entangled states (e.g., see [12]), it seems worthwhile to present concrete examples of that phenomenon in this context.

Proposition 8.3. Consider the case $N = 2$, and let $H = H_1 \otimes H_2$ where $H_1$ and $H_2$ are both infinite-dimensional. Then there are vectors $\xi \in H$ satisfying $\|\xi\| = 1$ and $\|\xi\|^V = +\infty$.

Proof. Let $\theta_1, \theta_2, \dots$ be positive numbers with sum 1, such as $\theta_k = 2^{-k}$, let $n_1, n_2, \dots$ be positive integers such that $\theta_k n_k \to \infty$ as $k \to \infty$, and let $e_1, e_2, \dots$ and $f_1, f_2, \dots$ be orthonormal sets in $H_1$ and $H_2$, respectively. Partition the positive integers into disjoint subsets $S_1, S_2, \dots$ such that $|S_k| = n_k$ for $k = 1, 2, \dots$. For every $k = 1, 2, \dots$, let $\xi_k$ be the vector

\[ \xi_k = \sum_{j\in S_k} e_j \otimes f_j. \]

Obviously, $\|\xi_k\|^2 = |S_k| = n_k$, and we claim that $\|\xi_k\|_V = 1$. Indeed,

\[ \|\xi_k\|_V = \sup_{\|\eta\| = \|\zeta\| = 1} |\langle\xi_k, \eta \otimes \zeta\rangle| = \sup_{\|\eta\| = \|\zeta\| = 1} \Bigl|\sum_{j\in S_k} \langle e_j,\eta\rangle\langle f_j,\zeta\rangle\Bigr| = 1, \]

where the last equality is achieved with unit vectors $\eta, \zeta$ of the form

\[ \eta = n_k^{-1/2} \sum_{j\in S_k} e_j, \qquad \zeta = n_k^{-1/2} \sum_{j\in S_k} f_j. \]

The vectors $\xi_1, \xi_2, \dots$ are mutually orthogonal, so that

\[ \xi = \sum_k \frac{\sqrt{\theta_k}}{\|\xi_k\|}\,\xi_k = \sum_k \frac{\sqrt{\theta_k}}{\sqrt{n_k}}\,\xi_k \]

defines a unit vector in $H$. We claim that $\|\xi\|^V = +\infty$. To see that, fix $k = 1, 2, \dots$ and use $\|\xi_k\|_V = 1$ to write

\[ \|\xi\|^V = \sup_{\|\eta\|_V = 1} |\langle\xi,\eta\rangle| \ge |\langle\xi,\xi_k\rangle| = \frac{\sqrt{\theta_k}}{\sqrt{n_k}}\,\|\xi_k\|^2 = \sqrt{\theta_k n_k}. \]

By the choice of $n_k$ the right-hand side is unbounded, hence $\|\xi\|^V = +\infty$. □
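To see how quickly the lower bounds $\sqrt{\theta_k n_k}$ in the proof of Proposition 8.3 can grow, the following sketch tabulates them for one admissible choice. The text suggests $\theta_k = 2^{-k}$; the block sizes $n_k = 4^k$ are an assumption made here for illustration, since any $n_k$ with $\theta_k n_k \to \infty$ would do.

```python
import math


def lower_bounds(num_terms):
    """Return sqrt(theta_k * n_k) for k = 1..num_terms.

    theta_k = 2**-k follows the text; n_k = 4**k is an illustrative assumption.
    """
    bounds = []
    for k in range(1, num_terms + 1):
        theta_k = 2.0 ** (-k)
        n_k = 4 ** k
        bounds.append(math.sqrt(theta_k * n_k))
    return bounds


bounds = lower_bounds(10)
# Squared norm of the truncated vector: the partial sum of the weights theta_k.
partial_weight = sum(2.0 ** (-k) for k in range(1, 11))

for k, b in enumerate(bounds, start=1):
    print(k, b)
print("partial sum of theta_k:", partial_weight)
```

With this choice $\sqrt{\theta_k n_k} = 2^{k/2}$, so the bounds increase without limit while the weights $\theta_k$ still sum to 1.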

9. Calculation of the entanglement norm $E$

Continuing in the context of the previous section, we now calculate the entanglement norm $E(\rho)$ of normal states $\rho$ on $\mathcal{B}(H_1 \otimes \cdots \otimes H_N)$. We write $\mathcal{L}^1(H)$ for the Banach space of trace class operators on a Hilbert space $H$, with trace norm

\[ \|A\| = \operatorname{trace}|A|, \qquad A \in \mathcal{L}^1(H), \]

$|A|$ denoting the positive square root of $A^*A$. Every normal linear functional $\rho$ on $\mathcal{B}(H)$ has a density operator $A \in \mathcal{L}^1(H)$, defined by

\[ \rho(B) = \operatorname{trace}(AB), \qquad B \in \mathcal{B}(H), \]

and the identification of $\rho$ with its density operator $A$ is a linear isometry.

Theorem 9.1. Let $\rho$ be a normal state of $\mathcal{B}(H_1 \otimes \cdots \otimes H_N)$ with density operator $A$, $\rho(X) = \operatorname{trace}(AX)$. The entanglement of $\rho$ is given by

\[ E(\rho) = \|A\|_\gamma, \tag{9.1} \]

where $\|\cdot\|_\gamma$ is the greatest cross norm on the projective tensor product of Banach spaces $\mathcal{L}^1(H_1) \hat\otimes \cdots \hat\otimes \mathcal{L}^1(H_N)$.

Before giving the proof, we first calculate the norm $\|B\|_V$, defined on operators $B \in \mathcal{B}(H_1 \otimes \cdots \otimes H_N)$ as in (5.1), in the current setting in which $V$ is the set of decomposable unit vectors of $H_1 \otimes \cdots \otimes H_N$.

Lemma 9.2. For every operator $B \in \mathcal{B}(H_1 \otimes \cdots \otimes H_N)$, one has

\[ \|B\|_V = \sup\bigl\{|\operatorname{trace} B(T_1 \otimes \cdots \otimes T_N)| : T_k \in \mathcal{L}^1(H_k),\ \operatorname{trace}|T_k| \le 1\bigr\}. \tag{9.2} \]

Proof. In this case, the definition (5.1) of the norm $\|B\|_V$ becomes

\[ \|B\|_V = \sup |\langle B(\xi_1 \otimes \cdots \otimes \xi_N), \eta_1 \otimes \cdots \otimes \eta_N\rangle|, \]

the supremum extended over all pairs $\xi_k, \eta_k \in H_k$, $k = 1, \dots, N$, that satisfy $\|\xi_k\| = \|\eta_k\| = 1$. It follows that this formula can be written equivalently as

\[ \|B\|_V = \sup |\operatorname{trace} B(T_1 \otimes \cdots \otimes T_N)|, \tag{9.3} \]

the supremum extended over all rank one operators $T_k \in \mathcal{B}(H_k)$ having norm 1. It is well known that for every Hilbert space $H$, the unit ball of the Banach space $\mathcal{L}^1(H)$ of trace class operators is the closure (in the trace norm) of the set of convex combinations of rank one operators of norm at most 1. It follows that the formula (9.3) is equivalent to (9.2). □

Proof of Theorem 9.1. We claim first that the bounded linear functionals on the projective tensor product $\mathcal{L}^1(H_1) \hat\otimes \cdots \hat\otimes \mathcal{L}^1(H_N)$ are precisely those of the form

\[ F_B(A) = \operatorname{trace}(AB), \qquad A \in \mathcal{L}^1(H_1) \hat\otimes \cdots \hat\otimes \mathcal{L}^1(H_N), \tag{9.4} \]

where $B$ is an operator in $\mathcal{B}(H_1 \otimes \cdots \otimes H_N)$. Indeed, for every operator $B \in \mathcal{B}(H_1 \otimes \cdots \otimes H_N)$, the universal property of the projective cross norm implies that there is a unique bounded linear functional $F_B$ on $\mathcal{L}^1(H_1) \hat\otimes \cdots \hat\otimes \mathcal{L}^1(H_N)$ that satisfies

\[ F_B(T_1 \otimes \cdots \otimes T_N) = \operatorname{trace}\bigl((T_1 \otimes \cdots \otimes T_N)B\bigr), \qquad T_k \in \mathcal{L}^1(H_k),\ 1 \le k \le N. \]

For the opposite inclusion, by (8.3), every bounded linear functional $F$ on $\mathcal{L}^1(H_1) \hat\otimes \cdots \hat\otimes \mathcal{L}^1(H_N)$ is a finite sum of the form

\[ F = \sum_{j=1}^{n} F_j^1 \otimes \cdots \otimes F_j^N, \]

where $F_j^k$ belongs to the dual of $\mathcal{L}^1(H_k)$, $1 \le k \le N$. Letting $B_j^k \in \mathcal{B}(H_k)$ be the operator defined by $F_j^k(T) = \operatorname{trace}(T B_j^k)$, one sees that the operator

\[ B = \sum_{j=1}^{n} B_j^1 \otimes \cdots \otimes B_j^N \in \mathcal{B}(H_1 \otimes \cdots \otimes H_N) \]

satisfies (9.4), and the claim is proved.

Note too that by the universal property of projective tensor products, Lemma 9.2 implies that the norm of the linear functional $F_B$ associated with an operator $B$ as in (9.4) is given by

\[ \|F_B\| = \|B\|_V. \tag{9.5} \]

Fixing $\rho(X) = \operatorname{trace}(AX)$ as above, the Hahn–Banach theorem, together with the preceding remarks, implies that

\[ \|A\|_\gamma = \sup_{\|F_B\| \le 1} |F_B(A)| = \sup_{\|F_B\| \le 1} |\operatorname{trace}(AB)|. \]

Using (9.5), the right-hand side becomes

\[ \sup_{\|F_B\| \le 1} |\operatorname{trace}(AB)| = \sup_{\|B\|_V \le 1} |\operatorname{trace}(AB)| = \sup_{\|B\|_V \le 1} |\rho(B)| = E(\rho), \]

and (9.1) is proved. □



10. Calculation of the inner radius

In this section we establish a universal lower bound on $r(V)$, we show that this lower bound is achieved when $n_N$ is sufficiently large, and we exhibit maximal vectors for those cases.

Theorem 10.1. Let $V$ be the set of all decomposable unit vectors in a tensor product $H = H_1 \otimes \cdots \otimes H_N$ with weakly increasing dimensions $n_k = \dim H_k$ such that $n_{N-1} < \infty$. Then the inner radius satisfies

\[ r(V) \ge \frac{1}{\sqrt{n_1 n_2 \cdots n_{N-1}}}. \tag{10.1} \]

Proof. By formula (3.2) of Theorem 3.2, it suffices to show that for every unit vector $\xi \in H$,

\[ \|\xi\|^V \le \sqrt{n_1 n_2 \cdots n_{N-1}}. \tag{10.2} \]

Fix orthonormal bases

\[ \{e_1^1, \dots, e_{n_1}^1\},\ \dots,\ \{e_1^{N-1}, \dots, e_{n_{N-1}}^{N-1}\} \tag{10.3} \]

for $H_1, \dots, H_{N-1}$, respectively. Every unit vector $\xi \in H_1 \otimes \cdots \otimes H_N$ can be decomposed uniquely into a sum

\[ \xi = \sum_{i_1=1}^{n_1} \cdots \sum_{i_{N-1}=1}^{n_{N-1}} e_{i_1}^1 \otimes \cdots \otimes e_{i_{N-1}}^{N-1} \otimes \xi_{i_1,\dots,i_{N-1}}, \tag{10.4} \]

where $\{\xi_{i_1,\dots,i_{N-1}}\}$ is a set of vectors in $H_N$ satisfying

\[ \sum_{i_1,\dots,i_{N-1}=1}^{n_1,\dots,n_{N-1}} \|\xi_{i_1,\dots,i_{N-1}}\|^2 = 1. \]

Indeed, $\xi_{i_1,\dots,i_{N-1}}$ is the vector of $H_N$ defined by

\[ \langle\xi_{i_1,\dots,i_{N-1}}, \zeta\rangle = \langle\xi, e_{i_1}^1 \otimes \cdots \otimes e_{i_{N-1}}^{N-1} \otimes \zeta\rangle, \qquad \zeta \in H_N. \]

By Theorem 8.2, $\|\cdot\|^V$ is a cross norm on the algebraic tensor product $H_1 \odot \cdots \odot H_N$, so from (10.4) and the Schwarz inequality, we conclude that

\[ \|\xi\|^V \le \sum_{i_1,\dots,i_{N-1}} \bigl\|e_{i_1}^1 \otimes \cdots \otimes e_{i_{N-1}}^{N-1} \otimes \xi_{i_1,\dots,i_{N-1}}\bigr\|^V = \sum_{i_1,\dots,i_{N-1}} \|\xi_{i_1,\dots,i_{N-1}}\| \le \Bigl(\sum_{i_1,\dots,i_{N-1}} 1\Bigr)^{1/2} \Bigl(\sum_{i_1,\dots,i_{N-1}} \|\xi_{i_1,\dots,i_{N-1}}\|^2\Bigr)^{1/2} = (n_1 \cdots n_{N-1})^{1/2}, \]

and (10.2) follows. □


Assume now that $n_N \ge n_1 n_2 \cdots n_{N-1}$, choose a set of orthonormal bases $\{e_{i_1}^1\}, \dots, \{e_{i_{N-1}}^{N-1}\}$ for $H_1, \dots, H_{N-1}$ as in (10.3), let

\[ \{f_{i_1,\dots,i_{N-1}} : 1 \le i_1 \le n_1,\ \dots,\ 1 \le i_{N-1} \le n_{N-1}\} \]

be an orthonormal set in $H_N$, and consider the unit vector $\xi \in H_1 \otimes \cdots \otimes H_N$ defined by

\[ \xi = \frac{1}{\sqrt{n_1 \cdots n_{N-1}}} \sum_{i_1=1}^{n_1} \cdots \sum_{i_{N-1}=1}^{n_{N-1}} e_{i_1}^1 \otimes \cdots \otimes e_{i_{N-1}}^{N-1} \otimes f_{i_1,\dots,i_{N-1}}. \tag{10.5} \]

Theorem 10.2. For all cases in which $n_N \ge n_1 \cdots n_{N-1}$, we have

\[ r(V) = \frac{1}{\sqrt{n_1 \cdots n_{N-1}}}, \tag{10.6} \]

and vectors of the form (10.5) are maximal vectors.

Proof. Let $\xi$ be a unit vector of the form (10.5). We will show that

\[ \|\xi\|^V = \sqrt{n_1 \cdots n_{N-1}}. \tag{10.7} \]

Once (10.7) is established, formula (3.2) of Theorem 3.2 implies that

\[ r(V)^{-1} = \sup_{\|\xi\| = 1} \|\xi\|^V \ge \sqrt{n_1 \cdots n_{N-1}}, \]

so that $r(V) \le (n_1 \cdots n_{N-1})^{-1/2}$, and (10.6) will follow after an application of Theorem 10.1. At that point, (10.7) makes the assertion $\|\xi\|^V = r(V)^{-1}$, and Theorem 4.2 will imply that $\xi$ is maximal. Thus it suffices to establish (10.7). Note first that by (3.2) and (10.1),

\[ \|\xi\|^V \le \frac{1}{r(V)} \le \sqrt{n_1 \cdots n_{N-1}}. \tag{10.8} \]

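For the opposite inequality, Theorem 8.2 implies that $\|\xi\|^V$ is the projective cross norm $\|\xi\|_\gamma$, and it suffices to exhibit a linear functional $F$ of norm 1 on the projective tensor product $H_1 \hat\otimes \cdots \hat\otimes H_N$ such that

\[ F(\xi) = \sqrt{n_1 \cdots n_{N-1}}. \]

For that, consider the vector

\[ \eta = (n_1 \cdots n_{N-1})^{1/2}\cdot\xi = \sum_{i_1=1}^{n_1} \cdots \sum_{i_{N-1}=1}^{n_{N-1}} e_{i_1}^1 \otimes \cdots \otimes e_{i_{N-1}}^{N-1} \otimes f_{i_1,\dots,i_{N-1}}, \tag{10.9} \]

and define $F$ on $H_1 \hat\otimes \cdots \hat\otimes H_N$ by $F(\zeta) = \langle\zeta,\eta\rangle$. By the universal property of the projective tensor product, the norm of $F$ is

\[ \|F\| = \sup\{|F(v_1 \otimes \cdots \otimes v_N)| : v_k \in H_k,\ \|v_k\| \le 1\}. \]

Choosing $v_k \in H_k$, we have

\[ F(v_1 \otimes \cdots \otimes v_N) = \langle v_1 \otimes \cdots \otimes v_N, \eta\rangle = \sum_{i_1=1}^{n_1} \cdots \sum_{i_{N-1}=1}^{n_{N-1}} \langle v_1, e_{i_1}^1\rangle \cdots \langle v_{N-1}, e_{i_{N-1}}^{N-1}\rangle \langle v_N, f_{i_1,\dots,i_{N-1}}\rangle = \Bigl\langle v_N, \sum_{i_1,\dots,i_{N-1}} \langle e_{i_1}^1, v_1\rangle \cdots \langle e_{i_{N-1}}^{N-1}, v_{N-1}\rangle f_{i_1,\dots,i_{N-1}} \Bigr\rangle. \]

Using orthonormality of $\{f_{i_1,\dots,i_{N-1}}\}$, we can write

\[ \sup_{\|v_N\| \le 1} |F(v_1 \otimes \cdots \otimes v_N)|^2 = \Bigl\|\sum_{i_1,\dots,i_{N-1}} \langle e_{i_1}^1, v_1\rangle \cdots \langle e_{i_{N-1}}^{N-1}, v_{N-1}\rangle f_{i_1,\dots,i_{N-1}}\Bigr\|^2 = \sum_{i_1,\dots,i_{N-1}} |\langle e_{i_1}^1, v_1\rangle|^2 \cdots |\langle e_{i_{N-1}}^{N-1}, v_{N-1}\rangle|^2 = \|v_1\|^2 \cdots \|v_{N-1}\|^2, \]

so that $\|F\|^2 = \sup\{\|v_1\|^2 \cdots \|v_{N-1}\|^2 : \|v_k\| \le 1\} = 1$. Applying this linear functional to $\xi$, we find that

\[ F(\xi) = \langle\xi,\eta\rangle = \sqrt{n_1 \cdots n_{N-1}}\cdot\|\xi\|^2 = \sqrt{n_1 \cdots n_{N-1}}, \]

and the desired inequality $\|\xi\|_\gamma \ge \sqrt{n_1 \cdots n_{N-1}}$ is proved. □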
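As a concrete instance of Theorem 10.2, the smallest nontrivial case can be worked out directly. The following sketch takes $N = 2$ and $n_1 = 2 \le n_2$ (an illustrative choice, not a case singled out in the text) and simply substitutes into (10.5)–(10.7), (7.1) and the characterization of maximal vectors from Theorem 4.2.

```latex
% N = 2, n_1 = 2, n_2 \ge 2; e_1, e_2 an orthonormal basis of H_1,
% f_1, f_2 an orthonormal set in H_2.
\begin{align*}
\xi &= \tfrac{1}{\sqrt{2}}\,(e_1\otimes f_1 + e_2\otimes f_2)
      && \text{by (10.5)},\\
r(V) &= \tfrac{1}{\sqrt{2}}
      && \text{by (10.6)},\\
\|\xi\|^V &= \sqrt{2} = r(V)^{-1}
      && \text{by (10.7)},\\
E(\omega) &= \bigl(\|\xi\|^V\bigr)^2 = 2 = r(V)^{-2}
      && \text{by (7.1)},
\end{align*}
% so the vector state \omega(X) = \langle X\xi,\xi\rangle is maximally entangled
% in the sense of Definition 7.1.
```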

11. Significance of the formula $r(V) = (n_1 n_2 \cdots n_{N-1})^{-1/2}$

Theorem 10.1 asserts that for $N$-fold tensor products $H = H_1 \otimes \cdots \otimes H_N$ in which the dimensions $n_k = \dim H_k$ increase with $k$ and satisfy $n_{N-1} < \infty$, the inner radius of the set $V$ of decomposable vectors satisfies

\[ r(V) \ge \frac{1}{\sqrt{n_1 n_2 \cdots n_{N-1}}}. \tag{11.1} \]

We have also seen that for fixed $n_1 \le \cdots \le n_{N-1} < \infty$, equality holds in (11.1) when $n_N$ is sufficiently large (see Theorem 10.2). In this section we show that equality in (11.1) can be characterized in a way that is perhaps unexpected, in that $r(V) = (n_1 \cdots n_{N-1})^{-1/2}$ iff the tracial state of $\mathcal{B}(H_1 \otimes \cdots \otimes H_{N-1})$ can be extended to a pure state of $\mathcal{B}(H_1 \otimes \cdots \otimes H_{N-1}) \otimes \mathcal{B}(H_N)$. We also characterize that situation in terms of the size of $n_N$.



Theorem 11.1. Let $V$ be the decomposable unit vectors in a tensor product of finite-dimensional Hilbert spaces $H = H_1 \otimes \cdots \otimes H_N$, with $n_k = \dim H_k$ weakly increasing with $k$, consider the subfactor $\mathcal{A} = \mathcal{B}(H_1 \otimes \cdots \otimes H_{N-1})$ of $\mathcal{B}(H_1 \otimes \cdots \otimes H_N)$, and let $\tau$ be the tracial state of $\mathcal{A}$. The following assertions are equivalent:

(i) Minimality of the inner radius:

\[ r(V) = (n_1 \cdots n_{N-1})^{-1/2}. \tag{11.2} \]

(ii) Existence of purifications: there is a maximal vector $\xi \in H_1 \otimes \cdots \otimes H_N$ such that

\[ \tau(A) = \langle(A \otimes 1_{H_N})\xi,\xi\rangle, \qquad A \in \mathcal{A}. \tag{11.3} \]

(iii) Lower limit on $\dim H_N$: $n_N \ge n_1 n_2 \cdots n_{N-1}$.

The proof of Theorem 11.1 requires the following elementary result.

Lemma 11.2. Let $H$ and $K$ be finite-dimensional Hilbert spaces and let $\omega$ be a faithful state of $\mathcal{B}(H)$. If there is a vector $\xi \in H \otimes K$ such that

\[ \omega(A) = \langle(A \otimes 1_K)\xi,\xi\rangle, \qquad A \in \mathcal{B}(H), \]

then $\dim K \ge \dim H$.

Proof. Let $\eta$ be a unit vector in $H \otimes H$ such that

\[ \omega(A) = \langle(A \otimes 1_H)\eta,\eta\rangle = \langle(1_H \otimes A)\eta,\eta\rangle, \qquad A \in \mathcal{B}(H). \]

For example, setting $n = \dim H$, let $\Omega$ be the density operator of $\omega$, with eigenvalue list $\lambda_1 \ge \cdots \ge \lambda_n > 0$ and corresponding eigenvectors $e_1, \dots, e_n$. One can take

\[ \eta = \sqrt{\lambda_1}\cdot e_1 \otimes e_1 + \cdots + \sqrt{\lambda_n}\cdot e_n \otimes e_n. \]

Since $\omega$ is a faithful state, $\eta$ is a cyclic and separating vector for $\mathcal{B}(H) \otimes 1_H$. For every $A \in \mathcal{B}(H)$ we have

\[ \|(A \otimes 1_H)\eta\|^2 = \omega(A^*A) = \|(A \otimes 1_K)\xi\|^2, \]

hence there is an isometry $U : H \otimes H = (\mathcal{B}(H) \otimes 1_H)\eta \to H \otimes K$ satisfying

\[ U(A \otimes 1_H)\eta = (A \otimes 1_K)\xi, \qquad A \in \mathcal{B}(H). \]

It follows that $\dim H\cdot\dim K = \dim(H \otimes K) \ge \dim(H \otimes H) = (\dim H)^2$, and $\dim K \ge \dim H$ follows after canceling $\dim H$. □



Proof of Theorem 11.1. The implication (ii) ⇒ (iii) follows after applying Lemma 11.2 to the case $H = H_1 \otimes \cdots \otimes H_{N-1}$ and $K = H_N$, and (iii) ⇒ (i) is an immediate consequence of Theorem 10.2.

(i) ⇒ (ii). Since $H_1 \otimes \cdots \otimes H_N$ is finite-dimensional, maximal vectors exist. For each $k = 1, \dots, N-1$, let $E_k$ be a rank-one projection in $\mathcal{B}(H_k)$. We claim that for every maximal vector $\xi$, one has

\[ \langle(E_1 \otimes E_2 \otimes \cdots \otimes E_{N-1} \otimes 1_{H_N})\xi,\xi\rangle = \frac{1}{n_1 n_2 \cdots n_{N-1}}. \tag{11.4} \]

For the proof, choose a unit vector $e_k \in E_k H_k$, $1 \le k \le N-1$, and consider the operator $U : H_N \to H_1 \otimes \cdots \otimes H_N$ defined by

\[ U\zeta = e_1 \otimes \cdots \otimes e_{N-1} \otimes \zeta, \qquad \zeta \in H_N. \]

$U$ is a partial isometry whose range projection is $E_1 \otimes \cdots \otimes E_{N-1} \otimes 1_{H_N}$, and since $\langle UU^*\xi,\xi\rangle = \|U^*\xi\|^2$, (11.4) is equivalent to the assertion

\[ \|U^*\xi\| = \frac{1}{\sqrt{n_1 \cdots n_{N-1}}}. \tag{11.5} \]

We claim first that $\|U^*\xi\| \le (n_1 \cdots n_{N-1})^{-1/2}$. Indeed, for every unit vector $\zeta \in H_N$ we have

\[ |\langle U^*\xi,\zeta\rangle| = |\langle\xi,U\zeta\rangle| = |\langle\xi, e_1 \otimes \cdots \otimes e_{N-1} \otimes \zeta\rangle| \le \sup_{\|\eta_1\| = \cdots = \|\eta_N\| = 1} |\langle\xi, \eta_1 \otimes \cdots \otimes \eta_N\rangle| = \|\xi\|_V = r(V) = \frac{1}{\sqrt{n_1 \cdots n_{N-1}}}, \]

where the equality $\|\xi\|_V = r(V)$ follows from the characterization of maximal vectors of Theorem 4.2. The asserted inequality now follows after taking the supremum over $\|\zeta\| = 1$.

To prove (11.5), choose orthonormal bases $\{e_1^1, \dots, e_{n_1}^1\}, \dots, \{e_1^{N-1}, \dots, e_{n_{N-1}}^{N-1}\}$ for $H_1, \dots, H_{N-1}$, respectively, such that $e_1^1 = e_1, \dots, e_1^{N-1} = e_{N-1}$. For every sequence of integers $i_1, \dots, i_{N-1}$, $1 \le i_k \le n_k$, consider the operator

\[ U_{i_1,\dots,i_{N-1}} : \zeta \in H_N \mapsto e_{i_1}^1 \otimes \cdots \otimes e_{i_{N-1}}^{N-1} \otimes \zeta \in H_1 \otimes \cdots \otimes H_N. \]

The preceding argument implies $\|U_{i_1,\dots,i_{N-1}}^*\xi\|^2 \le (n_1 \cdots n_{N-1})^{-1}$ for each $i_1, \dots, i_{N-1}$, hence

\[ \sum_{i_1=1}^{n_1} \cdots \sum_{i_{N-1}=1}^{n_{N-1}} \|U_{i_1,\dots,i_{N-1}}^*\xi\|^2 \le \sum_{i_1=1}^{n_1} \cdots \sum_{i_{N-1}=1}^{n_{N-1}} \frac{1}{n_1 \cdots n_{N-1}} = 1. \tag{11.6} \]

For each $k = 1, \dots, N-1$ and every $i = 1, \dots, n_k$, let $E_i^k$ be the projection onto the subspace of $H_k$ spanned by $e_i^k$. For each $i_1, \dots, i_{N-1}$, we have

\[ \|U_{i_1,\dots,i_{N-1}}^*\xi\|^2 = \langle(E_{i_1}^1 \otimes \cdots \otimes E_{i_{N-1}}^{N-1} \otimes 1_{H_N})\xi,\xi\rangle. \]

Since the projections occurring in the right-hand side are mutually orthogonal and sum to the identity operator of $H_1 \otimes \cdots \otimes H_N$, the left-hand side of (11.6) is

\[ \sum_{i_1=1}^{n_1} \cdots \sum_{i_{N-1}=1}^{n_{N-1}} \langle(E_{i_1}^1 \otimes \cdots \otimes E_{i_{N-1}}^{N-1} \otimes 1_{H_N})\xi,\xi\rangle = \|\xi\|^2 = 1. \]
It follows that the inequality of (11.6) is actually equality; and since each summand satisfies Ui∗1 ,...,iN−1 ξ 2 (n1 · · · nN −1 )−1 , we must have equality throughout the summands. Formula (11.4) follows. Let S be the set of all operators A ∈ A for which (11.3) holds. Obviously, S is a linear space, and by (11.4), every tensor product E1 ⊗ · · · ⊗ EN −1 of rank one projections Ek ∈ B(Hk ) belongs to S. For fixed k, the rank one projections in B(Hk ) span B(Hk ), so by multilinearity, S contains all operators of the form A1 ⊗ · · · ⊗ AN −1 with Ak ∈ B(Hk ). Since operators of the form A1 ⊗ · · · ⊗ AN −1 span A itself, Theorem 11.1 follows. 2 Remark 11.3 (Finite dimensionality). Notice that the hypothesis nN < ∞ was used only in the proof of (i) ⇒ (ii), and there only to ensure the existence of maximal vectors. If maximal vectors are known to exist in a setting in which nN = ∞, then the proof of (i) ⇒ (ii) applies verbatim. Of course, whenever (iii) holds, maximal vectors exist by Theorem 10.2. 12. Homogeneity and the case nN n1 ···nN −1 Continuing under the hypotheses n1 · · · nN −1 < ∞, we show in this section that when nN n1 · · · nN −1 , the set of maximal vectors in H1 ⊗ · · · ⊗ HN is acted upon transitively by the unitary group of HN , and we draw out several consequences. Theorem 12.1. Assume that nN n1 · · · nN −1 and let ξ1 and ξ2 be two maximal vectors in H1 ⊗ · · · ⊗ HN . Then there is a unitary operator U in B(HN ) such that ξ2 = (1H1 ⊗ · · · ⊗ 1HN−1 ⊗ U )ξ1 .

(12.1)

Maximal vectors are characterized as the unit vectors ξ ∈ H1 ⊗ · · · ⊗ HN that purify the tracial state of B(H1 ⊗ · · · ⊗ HN −1 ) as in (11.3). The maximal vectors for H1 ⊗ · · · ⊗ HN are simply those of the form 1 ξ=√ (e1 ⊗ f1 + · · · + en1 ···nN−1 ⊗ fn1 ···nN−1 ), n1 n2 · · · nN −1

(12.2)

where {ek : 1 k n1 · · · nN −1 } is an orthonormal basis for H1 ⊗ · · · ⊗ HN −1 and {fk : 1 k n1 · · · nN −1 } is an orthonormal set in HN .
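As a check on the last characterization, the following short computation confirms that every vector of the form (12.2) gives the tracial state on the first $N-1$ factors. It is only a sketch: the shorthand $n = n_1 \cdots n_{N-1}$ and $\tau(A) = n^{-1}\operatorname{trace}(A)$ is introduced here for convenience, and it assumes (as the proof of Theorem 12.1 suggests) that "purifying the tracial state" means $\langle (A \otimes \mathbf 1_{H_N})\xi, \xi\rangle = \tau(A)$ for all $A \in \mathcal B(H_1 \otimes \cdots \otimes H_{N-1})$.

```latex
% xi = n^{-1/2} \sum_k e_k \otimes f_k, with {e_k} an orthonormal basis and {f_k} orthonormal
\begin{align*}
\langle (A \otimes \mathbf{1}_{H_N})\xi, \xi\rangle
  &= \frac{1}{n} \sum_{j,k=1}^{n} \langle A e_j, e_k\rangle\,\langle f_j, f_k\rangle \\
  &= \frac{1}{n} \sum_{k=1}^{n} \langle A e_k, e_k\rangle
     && \text{since } \langle f_j, f_k\rangle = \delta_{jk} \\
  &= \frac{1}{n}\operatorname{trace}(A) = \tau(A).
\end{align*}
```

Only the orthonormality of the $f_k$ is used in the second step, which is why an orthonormal set of $n$ vectors in $H_N$, and hence the hypothesis $n_N \ge n_1 \cdots n_{N-1}$, is needed.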



We require the following elementary consequence of familiar methods associated with the GNS construction. We sketch the proof for completeness.

Lemma 12.2. Let $\xi_1$, $\xi_2$ be vectors in $H_1 \otimes \cdots \otimes H_N$ such that

\[
\langle (A \otimes \mathbf{1}_{H_N})\xi_1, \xi_1\rangle = \langle (A \otimes \mathbf{1}_{H_N})\xi_2, \xi_2\rangle \tag{12.3}
\]

for all $A \in \mathcal B(H_1 \otimes \cdots \otimes H_{N-1})$. Then there is a unitary operator $U \in \mathcal B(H_N)$ such that

\[
(\mathbf{1}_{H_1 \otimes \cdots \otimes H_{N-1}} \otimes U)\,\xi_1 = \xi_2. \tag{12.4}
\]

Proof. Consider the following subalgebra $\mathcal B \subseteq \mathcal B(H_1 \otimes \cdots \otimes H_N)$:

\[
\mathcal B = \mathcal B(H_1 \otimes \cdots \otimes H_{N-1}) \otimes \mathbf{1}_{H_N}.
\]

$\mathcal B$ is a finite-dimensional factor isomorphic to $\mathcal B(H_1 \otimes \cdots \otimes H_{N-1})$ whose commutant is $\mathbf{1}_{H_1 \otimes \cdots \otimes H_{N-1}} \otimes \mathcal B(H_N)$. For $k = 1, 2$, consider the finite-dimensional subspace $L_k$ of the tensor product $H_1 \otimes \cdots \otimes H_N$ defined by $L_k = \{B\xi_k : B \in \mathcal B\}$. Since

\[
\|B\xi_k\|^2 = \langle B^*B\xi_k, \xi_k\rangle = \langle B^*B\xi_2, \xi_2\rangle = \|B\xi_2\|^2, \qquad k = 1, 2,\ B \in \mathcal B,
\]

there is a unique partial isometry $V$ in the commutant of $\mathcal B$ having initial space $L_1$, final space $L_2$, such that

\[
V B \xi_1 = B \xi_2, \qquad B \in \mathcal B,
\]

and in particular, this operator satisfies $V\xi_1 = \xi_2$. Since both spaces $L_k$ are invariant under $\mathcal B$, they are the ranges of projections in the commutant of $\mathcal B$, and therefore must have the form

\[
L_k = H_1 \otimes \cdots \otimes H_{N-1} \otimes K_k, \qquad k = 1, 2,
\]

where $K_k$ is a finite-dimensional subspace of $H_N$. Moreover, since $V$ belongs to the commutant of $\mathcal B$, it has the form $V = \mathbf{1}_{H_1 \otimes \cdots \otimes H_{N-1}} \otimes U_0$, where $U_0$ is a partial isometry in $\mathcal B(H_N)$ having initial and final spaces $K_1$ and $K_2$, respectively. Finally, since a finite rank partial isometry $U_0 \in \mathcal B(H_N)$ can always be extended to a unitary operator $U \in \mathcal B(H_N)$, we obtain a unitary operator $U \in \mathcal B(H_N)$ with the property asserted in (12.4). □

Proof of Theorem 12.1. Choose an orthonormal set $\{f_{i_1,\dots,i_{N-1}} : 1 \le i_1 \le n_1, \dots, 1 \le i_{N-1} \le n_{N-1}\}$ in $H_N$ and let $\xi$ be the vector of the form (10.5). Theorem 10.2 implies that $\xi$ is a maximal vector. Let $\xi'$ be another maximal vector. The proof of the implication (i) ⇒ (ii) of Theorem 11.1 implies that

\[
\langle A\xi, \xi\rangle = \langle A\xi', \xi'\rangle = \tau(A), \qquad A \in \mathcal B(H_1 \otimes \cdots \otimes H_{N-1}) \otimes \mathbf{1}_{H_N}
\]

(see Remark 11.3), where $\tau$ is the tracial state. Lemma 12.2 implies that there is a unitary operator $U \in \mathcal B(H_N)$ such that $\xi' = (\mathbf{1}_{H_1 \otimes \cdots \otimes H_{N-1}} \otimes U)\xi$. Notice that this implies that $\xi'$ also has the form (12.2), in which $\{f_{i_1,\dots,i_{N-1}}\}$ is replaced with $\{U f_{i_1,\dots,i_{N-1}}\}$. It also shows that every maximal vector purifies the tracial state $\tau$.

Another application of Lemma 12.2 shows that every vector $\eta$ in the tensor product $H_1 \otimes \cdots \otimes H_N$ that purifies the tracial state $\tau$ above must have the form $\eta = (\mathbf{1}_{H_1 \otimes \cdots \otimes H_{N-1}} \otimes U)\xi$ where $\xi$ is the vector above; therefore $\eta$ is also a maximal vector of the form (10.5). Finally, since every vector $\eta$ of the apparently more general form (12.2) must purify the tracial state $\tau$ as above, it follows from Lemma 12.2 that there is a unitary operator $U \in \mathcal B(H_N)$ such that $\eta = (\mathbf{1}_{H_1 \otimes \cdots \otimes H_{N-1}} \otimes U)\xi$. It follows that $\eta$ can be rewritten so that it has the form (10.5), and is therefore maximal. □

Remark 12.3 (Stability of maximal vectors when $\dim H_N$ is large). It is of interest to reformulate the above results as follows. Let $H$, $K$ be finite-dimensional Hilbert spaces such that $\dim H \le \dim K$, and consider the bipartite tensor product $G = H \otimes K$ with the associated set

\[
V = \{\xi \otimes \eta \in G : \xi \in H,\ \eta \in K,\ \|\xi\| = \|\eta\| = 1\}
\]

of unit decomposable vectors. Suppose we are given a further decomposition of $H$ into a tensor product $H = H_1 \otimes \cdots \otimes H_r$, with the resulting set

\[
\tilde V = \{\xi_1 \otimes \cdots \otimes \xi_r \otimes \eta \in G : \xi_k \in H_k,\ \eta \in K,\ \|\xi_k\| = \|\eta\| = 1\}
\]

of decomposable unit vectors in $G$. Then the preceding results show that the sets $V$ and $\tilde V$ give rise to the same set of maximal vectors, and their inner radii satisfy $r(V) = r(\tilde V)$.

Remark 12.4 (An example). That fact seems remarkable, given that the entanglement measuring norms $\|\cdot\|^V$ and $\|\cdot\|^{\tilde V}$ are different. To illustrate the latter in more concrete terms, let $H$ be a finite-dimensional Hilbert space, let $K = H \otimes H$, and consider the sets $V, \tilde V \subseteq H \otimes H \otimes K$ defined by

\[
V = \{\xi \otimes \eta : \xi \in H \otimes H,\ \eta \in K,\ \|\xi\| = \|\eta\| = 1\},
\]
\[
\tilde V = \{\xi_1 \otimes \xi_2 \otimes \eta : \xi_k \in H,\ \eta \in K,\ \|\xi_1\| = \|\xi_2\| = \|\eta\| = 1\}.
\]

Recall from Theorem 8.2 that the norms $\|\cdot\|^V$ and $\|\cdot\|^{\tilde V}$ are the projective cross norms on the bipartite tensor product $(H \otimes H) \hat\otimes K$ and the tripartite tensor product $H \hat\otimes H \hat\otimes K$, respectively. To see that the norms are different, it suffices to exhibit a linear functional $F$ on the vector space $H \otimes H \otimes K$ with the property that its norm in the dual of $(H \otimes H) \hat\otimes K$ is 1 but its norm in the dual of $H \hat\otimes H \hat\otimes K$ is $< 1$. To that end, choose a unit vector $e \in H \otimes H$ that does not decompose into a tensor product $e_1 \otimes e_2$, let $f$ be an arbitrary unit vector in $K$, and set

\[
F(\xi_1 \otimes \xi_2 \otimes \eta) = \langle \xi_1 \otimes \xi_2, e\rangle \langle \eta, f\rangle, \qquad \xi_k \in H,\ \eta \in K.
\]

If one views $F$ as a linear functional in the dual of $(H \otimes H) \hat\otimes K$, then its norm is $\|e\| \cdot \|f\| = 1$. On the other hand,

\[
\sup_{\|\xi_1\| = \|\xi_2\| = \|\eta\| = 1} |F(\xi_1 \otimes \xi_2 \otimes \eta)|
 = \sup_{\|\xi_1\| = \|\xi_2\| = \|\eta\| = 1} |\langle \xi_1 \otimes \xi_2, e\rangle| \cdot |\langle \eta, f\rangle|
 = \sup_{\|\xi_1\| = \|\xi_2\| = 1} |\langle \xi_1 \otimes \xi_2, e\rangle| < 1,
\]

since $e$ is not a decomposable vector. That implies that the norm of $F$ as an element of the dual of $H \hat\otimes H \hat\otimes K$ is $< 1$, hence $\|\cdot\|^V \ne \|\cdot\|^{\tilde V}$. In a similar way, one can see that while the entanglement measuring function $E$ of states is different for the two sets $V$ and $\tilde V$, the set of maximally entangled states is the same for both sets $V$ and $\tilde V$.

13. Remarks on the case $n_N < n_1 n_2 \cdots n_{N-1}$

In this section we continue the discussion of $N$-fold tensor products $H = H_1 \otimes \cdots \otimes H_N$ with increasing dimensions $n_k = \dim H_k$, with $n_{N-1} < \infty$, and with $V$ the set of decomposable unit vectors. We have discussed the cases in which $n_N \ge n_1 \cdots n_{N-1}$ at some length, having calculated the inner radius of $V$ and having identified the maximal vectors. The following result and its corollary address the remaining cases. The fact is that we have little information about the inner radius and the structure of maximal vectors in such cases that goes beyond the content of Corollary 13.2. Perhaps there is no simple formula for $r(V)$ in general.

Theorem 13.1. If $n_N < n_1 \cdots n_{N-1}$, then

\[
r(V) > \frac{1}{\sqrt{n_1 n_2 \cdots n_{N-1}}}.
\]

Proof. By Theorem 10.1, $r(V) \ge (n_1 \cdots n_{N-1})^{-1/2}$, and we have to show that equality cannot hold. But if equality held, then the hypothesis on $n_N$ implies that $H_1 \otimes \cdots \otimes H_N$ is finite-dimensional, so that maximal vectors exist. Every maximal vector $\xi$ satisfies the criteria of Theorem 11.1(i), but item (iii) of Theorem 11.1 contradicts the hypothesis on $n_N$. □

Corollary 13.2. The inner radius is given by $r(V) = (n_1 \cdots n_{N-1})^{-1/2}$ if $n_N \ge n_1 \cdots n_{N-1}$; otherwise, $r(V) > (n_1 \cdots n_{N-1})^{-1/2}$.

Remark 13.3 (Best constants for the projective norm of $H_1 \hat\otimes \cdots \hat\otimes H_N$). It is of interest to reformulate the information about the inner radius given by Theorems 10.2 and 13.1 in purely Banach space terms. Given finite-dimensional Hilbert spaces $H_1, \dots, H_N$, let $\|\cdot\|_\gamma$ be the projective cross norm on the tensor product $H_1 \otimes \cdots \otimes H_N$ and let $\|\cdot\|$ be its Hilbert space norm.
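Then one has the following information about the best constant $c$ for which $\|\xi\|_\gamma \le c \cdot \|\xi\|$ for all $\xi \in H_1 \otimes \cdots \otimes H_N$:

\[
c = \sup_{\|\xi\| = 1} \|\xi\|_\gamma = \sqrt{n_1 \cdots n_{N-1}}, \qquad \text{if } n_N \ge n_1 \cdots n_{N-1},
\]

and

\[
c = \sup_{\|\xi\| = 1} \|\xi\|_\gamma < \sqrt{n_1 \cdots n_{N-1}}, \qquad \text{if } n_N < n_1 \cdots n_{N-1}.
\]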


Note too that the preceding results provide no further information about the constant $c$ in cases where $n_N < n_1 \cdots n_{N-1}$, and the problem of developing sharper information is one of obvious significance for quantum information theory as well as for the local theory of Banach spaces. For example, in the case of bipartite tensor products, it is shown in [7] that the space $\mathcal B(H_1, H_2)$ (endowed with the operator norm) fails to have local unconditional structure if the dimensions of $H_1$ and $H_2$ are large, with further developments in [5]. Also see [6], an important paper on the local theory and the many connections with ideal norms.

Remark 13.4 (Qubit triplets). The simplest case of tripartite tensor products to which our results do not apply is the case in which $V$ is the set of unit vectors $f \otimes g \otimes h$ in $\mathbb C^2 \otimes \mathbb C^2 \otimes \mathbb C^2$. We have not attempted to calculate $r(V)$ or determine the maximal vectors for this example; and if one seeks to extend the preceding calculations into the cases $n_N < n_1 \cdots n_{N-1}$, this would seem the natural place to begin. Notice that Corollary 13.2 implies $r(V) > 1/2$, since here $n_3 = 2 < 4 = n_1 n_2$ and $(n_1 n_2)^{-1/2} = 1/2$. In a more qualitative direction, one might seek asymptotic information about the behavior of $r(V_N)$ for large $N$, where $V_N$ is the set of decomposable unit vectors of $(\mathbb C^2)^{\otimes N}$.

14. Summary of results for $N$-fold tensor products

We have not interpreted the main abstract results for multipartite tensor products. For the reader's convenience, we conclude by summarizing the results of Proposition 5.2, and Theorems 4.2, 6.2, 6.3, 7.2, 8.2, 9.1, 11.1 in more concrete terms for these special cases. Let $H_1, \dots, H_N$ be Hilbert spaces whose dimensions $n_k = \dim H_k$ are weakly increasing, with $n_{N-1} < \infty$. For brevity, we confine ourselves to the case in which $n_N \ge n_1 \cdots n_{N-1}$, where our results are sharp; however, some of the following statements remain valid in the remaining cases as well.

What is missing in the remaining cases $n_N < n_1 \cdots n_{N-1}$ is that we have only rough knowledge of the inner radius (see Theorem 13.1), and correspondingly little information about the structure of maximal vectors. Obviously, the existence of those gaps in what we know about multipartite entanglement calls for further research.

Let $V$ be the set of decomposable unit vectors $\xi_1 \otimes \cdots \otimes \xi_N$ in the tensor product of Hilbert spaces $H = H_1 \otimes \cdots \otimes H_N$, in which $\xi_k \in H_k$ and $\|\xi_k\| = 1$.

Theorem 14.1. Let $\|\cdot\|$ be the ambient norm of $H = H_1 \otimes \cdots \otimes H_N$ and let $\|\cdot\|_\gamma$ be the norm of the projective tensor product of Hilbert spaces $H_1 \hat\otimes \cdots \hat\otimes H_N$. The restriction of $\|\cdot\|_\gamma$ to the unit sphere of $H$,

\[
S = \{\xi \in H : \|\xi\| = 1\},
\]

has these properties. Its range is the interval $\|S\|_\gamma = [1, \sqrt{n_1 \cdots n_{N-1}}]$. For every $\xi \in S$ one has $\|\xi\|_\gamma = 1$ iff $\xi \in V$ is a decomposable vector, and

\[
\|\xi\|_\gamma = \sqrt{n_1 \cdots n_{N-1}} \iff \xi \text{ is maximal} \iff \xi \text{ has the form (12.2)}.
\]

The maximal vectors are also characterized as the unit vectors $\xi \in H$ that purify the tracial state of $\mathcal B(H_1 \otimes \cdots \otimes H_{N-1})$ in the sense of (11.3).

Let $\|\cdot\|_\gamma$ be the norm of the projective tensor product of Banach spaces $L^1(H_1) \hat\otimes \cdots \hat\otimes L^1(H_N)$, and let $D$ be the space of all density operators, that is, positive operators in $\mathcal B(H)$ having trace 1. The range of $\|\cdot\|_\gamma$ on $D$ is the interval $\|D\|_\gamma = [1, n_1 \cdots n_{N-1}]$.



Let $A \in D$ and let $\rho(X) = \operatorname{trace}(AX)$ be the corresponding normal state of $\mathcal B(H)$. Then $\rho$ is separable $\iff$ $\|A\|_\gamma = 1$, and for every rank one density operator $A\eta = \langle \eta, \xi\rangle \xi$, $\eta \in H$,

\[
\|A\|_\gamma = n_1 \cdots n_{N-1} \iff \xi \text{ is a maximal vector}.
\]

If a mixed state $\rho$ is maximally entangled in the sense that its density operator $A$ satisfies $\|A\|_\gamma = n_1 \cdots n_{N-1}$, then $A$ is a convex combination of rank one projections associated with maximal vectors. In particular, the unique entanglement measuring norms for vectors and states are identified in these cases as $\|\xi\|^V = \|\xi\|_\gamma$ and $E(\rho) = \|A\|_\gamma$, respectively, where $A$ is the density operator of the state $\rho$.

Acknowledgments

I thank Mary Beth Ruskai for calling my attention to some key results in the physics literature, and Yoram Gordon for helpful comments.

References

[1] W. Arveson, Noncommutative Dynamics and E-semigroups, Monogr. Math., Springer-Verlag, New York, 2003.
[2] W. Arveson, The probability of entanglement, preprint, arXiv: 0712.4163, 2007, 31 pp.
[3] W. Arveson, Quantum channels that preserve entanglement, preprint, arXiv: 0801.2531, 2008, 14 pp.
[4] R.A. Bertlmann, H. Narnhofer, W. Thirring, A geometric picture of entanglement and Bell inequalities, Phys. Rev. A 66 (2002) 032319.
[5] Y. Gordon, A note on the GL constant of $E \otimes F$, Israel J. Math. 39 (1981) 141–144.
[6] Y. Gordon, M. Junge, Volume ratios in $L_p$ spaces, Studia Math. 136 (1999) 147–182.
[7] Y. Gordon, D.R. Lewis, Absolutely summing operators and local unconditional structures, Acta Math. 133 (1974) 27–48.
[8] O. Gühne, M. Reimpell, R. Werner, Lower bounds on entanglement measures from incomplete information, preprint, arXiv: 0802.1734 [quant-ph], 2008.
[9] A.S. Holevo, M.E. Shirokov, R. Werner, Separability and entanglement-breaking in infinite dimensions, preprint, arXiv: quant-ph/0504204v1, 2005, 12 pp.
[10] R. Horodecki, P. Horodecki, M. Horodecki, K. Horodecki, Quantum entanglement, preprint, arXiv: quant-ph/0702225v2, 2007.
[11] P. Hyllus, O. Gühne, D. Bruß, M. Lewenstein, Relations between entanglement witnesses and Bell inequalities, Phys. Rev. A 72 (2005) 012321.
[12] M. Keyl, D. Schlingemann, R. Werner, Infinitely entangled states, preprint, arXiv: quant-ph/0212014, 2002.
[13] A. Peres, Separability criterion for density matrices, Phys. Rev. Lett. 77 (8) (1996) 1413.
[14] D. Perez-Garcia, M. Wolf, C. Palazuelos, I. Villanueva, M. Junge, Unbounded violation of tripartite Bell inequalities, Comm. Math. Phys. 279 (2008) 455–486.
[15] O. Rudolph, A separability criterion for density operators, J. Phys. A 33 (2000) 3951–3955.
[16] O. Rudolph, A new class of entanglement measures, J. Math. Phys. 42 (2001) 2507–2512.
[17] V. Vedral, M. Plenio, M. Rippin, P. Knight, Quantifying entanglement, Phys. Rev. Lett. 78 (12) (1997) 2275–2279.
[18] X. Wang, S. Gu, Negativity, entanglement witnesses and quantum phase transition in spin-1 Heisenberg chains, J. Phys. A 40 (2007) 10759–10767.
[19] T.-C. Wei, P. Goldbart, Geometric measure of entanglement and applications to bipartite and multipartite quantum states, Phys. Rev. A 68 (2003) 042307.


Local unitary cocycles of $E_0$-semigroups

Daniel Markiewicz a,∗, Robert T. Powers b

a Department of Mathematics, Ben-Gurion University of the Negev, P.O.B. 653, Be'er Sheva 84105, Israel
b Department of Mathematics, University of Pennsylvania, Philadelphia, PA 19104, USA

Received 19 May 2008; accepted 8 July 2008
Available online 28 August 2008
Communicated by D. Voiculescu

Abstract

This paper concerns the structure of the group of local unitary cocycles, also called the gauge group, of an $E_0$-semigroup. The gauge group of a spatial $E_0$-semigroup has a natural action on the set of units by operator multiplication. Arveson has characterized completely the gauge group of $E_0$-semigroups of type I, and as a consequence it is known that in this case the gauge group action is transitive. In fact, if the semigroup has index $k$, then the gauge group action is transitive on the set of $(k+1)$-tuples of appropriately normalized independent units. An action of the gauge group having this property is called $(k+1)$-fold transitive. We construct examples of $E_0$-semigroups of type II and index 1 which are not 2-fold transitive. These new examples also illustrate that an $E_0$-semigroup of type II$_k$ need not be a tensor product of an $E_0$-semigroup of type II$_0$ and another of type I$_k$.

© 2008 Elsevier Inc. All rights reserved.

Keywords: CP-semigroup; $E_0$-semigroup; Units; Cocycles; Dilations

* Corresponding author. E-mail addresses: [email protected] (D. Markiewicz), [email protected] (R.T. Powers).
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.07.009
D. Markiewicz, R.T. Powers / Journal of Functional Analysis 256 (2009) 1511–1543

0. Introduction

An $E_0$-semigroup is a strongly continuous one-parameter semigroup of unit preserving ∗-endomorphisms of $\mathcal B(H)$, the algebra of all bounded operators on a separable Hilbert space $H$. In the 1930s Wigner showed that a one-parameter group of ∗-automorphisms of $\mathcal B(H)$ is always given by the action of a one-parameter strongly continuous unitary group by conjugation. In particular, the classification of one-parameter automorphism groups of $\mathcal B(H)$ up to conjugacy can be reduced to the well-known multiplicity theory of Hahn–Hellinger of unbounded self-adjoint operators. In contrast, the classification theory of $E_0$-semigroups up to cocycle conjugacy (which is the appropriate equivalence relation in this context) has proved to be much richer and full of surprises (see for example [3,9–11,14,17,20,21]; we recommend Arveson's book [5] for an excellent introduction to the theory of $E_0$-semigroups).

One important cocycle conjugacy invariant of an $E_0$-semigroup is its gauge group (or more precisely the isomorphism class of the gauge group). Given an $E_0$-semigroup $\alpha$, a cocycle $C$ is a one-parameter strongly continuous family $\{C(t) : t \ge 0\}$ satisfying the cocycle identity $C(t+s) = C(t)\alpha_t(C(s))$ for $t, s \ge 0$. The cocycle $C$ is said to be local if it satisfies the additional property that $C(t) \in \alpha_t(\mathcal B(H))'$ for all $t \ge 0$. It is easy to verify that given two local cocycles $C_1$ and $C_2$, the expression $(C_1 \cdot C_2)(t) = C_1(t)C_2(t)$, for $t \ge 0$, defines another local cocycle. The set of unitary local cocycles is a group when endowed with this operation, and this group is called the gauge group of the $E_0$-semigroup $\alpha$. In terms of the product system approach to the study of $E_0$-semigroups, the gauge group is canonically isomorphic to the group of automorphisms of the product system associated with $\alpha$.

A unit for an $E_0$-semigroup $\alpha$ is a strongly continuous one-parameter semigroup of isometries $\{U(t) : t \ge 0\}$ which intertwines the $E_0$-semigroup: $\alpha_t(A)U(t) = U(t)A$ for all $t \ge 0$, $A \in \mathcal B(H)$. If an $E_0$-semigroup has at least one unit, it is called spatial. If it is spatial and it is generated by its units, it is called completely spatial or type I. All other spatial $E_0$-semigroups are called type II. Non-spatial $E_0$-semigroups, also called type III, have been proven to exist by Powers [14], and in fact it follows from work of Tsirelson [20] that there exists a continuum of pairwise non-cocycle conjugate examples.
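The assertion made earlier in this introduction, that the pointwise product of two local cocycles is again a local cocycle, can be checked directly. The sketch below assumes, as is standard, that locality means $C(t)$ lies in the commutant $\alpha_t(\mathcal B(H))'$; beyond that it uses only the cocycle identity and the multiplicativity of each $\alpha_t$.

```latex
\begin{align*}
(C_1 \cdot C_2)(t+s)
  &= C_1(t+s)\,C_2(t+s) \\
  &= C_1(t)\,\alpha_t(C_1(s))\,C_2(t)\,\alpha_t(C_2(s)) \\
  &= C_1(t)\,C_2(t)\,\alpha_t(C_1(s))\,\alpha_t(C_2(s))
     && \text{since } C_2(t) \in \alpha_t(\mathcal B(H))' \\
  &= (C_1 \cdot C_2)(t)\,\alpha_t\bigl((C_1 \cdot C_2)(s)\bigr).
\end{align*}
```

Locality of the product is immediate, because the commutant $\alpha_t(\mathcal B(H))'$ is an algebra and therefore contains $C_1(t)C_2(t)$.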
The type of an $E_0$-semigroup is also a cocycle conjugacy invariant. We will only consider spatial semigroups in this paper, although we should note that, to our knowledge, little is known about the gauge group of a non-spatial $E_0$-semigroup.

Our main goal in this work is to study the action of the gauge group of a spatial $E_0$-semigroup on the set of units. If $U$ is a unit of $\alpha$ and $C$ is a local unitary cocycle, then $U'(t) = C(t)U(t)$ is another unit of $\alpha$, defining a natural action of the gauge group.

Completely spatial $E_0$-semigroups and their gauge groups are well understood. Every completely spatial $E_0$-semigroup is cocycle conjugate to a CAR/CCR flow, and they are completely classified by the index [3,15,19]. Furthermore, their gauge groups were completely characterized by Arveson [3,5]. One property which becomes apparent upon examining his characterization is that the gauge group of a completely spatial semigroup acts transitively on the set of units. In fact, even more can be gleaned from that characterization. If the completely spatial $E_0$-semigroup has index $k$, then any pair of $(k+1)$-tuples of appropriately normalized and independent units are related by an element of the gauge group. When the action of the gauge group of a spatial $E_0$-semigroup on its units has this property, we say that the action is $(k+1)$-fold transitive.

Alevras, Powers and Price [1] were the first to break ground on the study of the gauge group of $E_0$-semigroups of type II. In their work, they characterize all contractive local cocycles (not just unitary local cocycles) for a certain class of $E_0$-semigroups of type II and index zero. For semigroups of index zero, the set of units is essentially one-dimensional and the gauge group action is automatically (1-fold) transitive. It is natural to inquire whether the action of the gauge group of a spatial $E_0$-semigroup of index $k$ on the units is always $(k+1)$-fold transitive.
In this paper we show that this is not the case, by constructing a class of $E_0$-semigroups of type II and index 1 whose gauge group action on the set of units is not 2-fold transitive. It is possible, although we could not verify it, that within the class which we have constructed there could also be examples of $E_0$-semigroups whose gauge group action on the set of units is not transitive. While this article was in preparation, it came to our attention that non-transitive examples were obtained by Tsirelson [22], using different techniques. We do not know the exact relationship between his examples and our own. Nevertheless, in the last section we discuss some features which they have in common.

We also observe that our examples, as well as Tsirelson's [22], provide a direct answer to an old question. When Arveson [2] proved that the index is additive with respect to tensor products, it was natural to inquire whether type II$_k$ semigroups can be decomposed as tensor products of type II$_0$ and I$_k$. Alas, that is not the case, as the $E_0$-semigroups which we construct are of type II$_1$ yet they are not tensor products of type II$_0$ and type I$_1$ semigroups.

Our approach involves a detailed analysis of the $E_0$-semigroups obtained via minimal dilation of certain CP-flows. Bhat [6] proved that CP-semigroups can be dilated to $E_0$-semigroups, and this result proved very useful for the construction and analysis of new examples of $E_0$-semigroups (a very incomplete list of work in this direction includes [4,8,12,13,16]). Bhat [7] has also found a one-to-one correspondence between the compressions of a CP-semigroup and the compressions of its minimal dilation. Pursuing this correspondence, Powers [18] subsequently carried out a study of a class of CP-semigroups, called CP-flows, and their minimal dilations to $E_0$-semigroups. In particular, several results were obtained in [18] for the analysis of the cocycle conjugacy of the minimal dilations of CP-flows, as well as their contractive local cocycles. We make full use of this favorable framework, which is in fact quite general, given that all spatial $E_0$-semigroups arise from the minimal dilation of an appropriate CP-flow (see [18]).

We now provide an outline of the contents of the following sections.
In Section 1 we describe in detail the basic background and terminology, with an emphasis on the material related to [18]. In Section 2 we introduce the class of examples which will be of interest, and describe some of its key properties. In Section 3 we turn to the analysis of the local cocycles of the $E_0$-semigroups under consideration. Finally, in the last section we summarize our main results.

1. Background, notation and definitions

We begin with the definition of $E_0$-semigroups of $\mathcal B(H)$, the set of all bounded operators on a separable Hilbert space $H$. For a detailed discussion of $E_0$-semigroups we refer to Arveson's excellent book [5].

Definition 1.1. We say $\alpha$ is an $E_0$-semigroup of $\mathcal B(H)$ if the following conditions are satisfied:
(i) $\alpha_t$ is a ∗-endomorphism of $\mathcal B(H)$ for each $t \ge 0$.
(ii) $\alpha_0$ is the identity endomorphism and $\alpha_t \circ \alpha_s = \alpha_{t+s}$ for all $s, t \ge 0$.
(iii) For each $\rho \in \mathcal B(H)_*$ (the predual of $\mathcal B(H)$) and $A \in \mathcal B(H)$ the function $\rho(\alpha_t(A))$ is a continuous function of $t$.
(iv) $\alpha_t(I) = I$ for each $t \ge 0$ ($\alpha_t$ preserves the unit).

The appropriate notions of when two $E_0$-semigroups are similar are conjugacy and cocycle conjugacy (which comes from Alain Connes' definition of outer conjugacy).

Definition 1.2. Suppose $\alpha$ and $\beta$ are $E_0$-semigroups of $\mathcal B(H_1)$ and $\mathcal B(H_2)$. We say $\alpha$ and $\beta$ are conjugate, denoted $\alpha \approx \beta$, if there is a ∗-isomorphism $\phi$ of $\mathcal B(H_1)$ onto $\mathcal B(H_2)$ so that $\phi \circ \alpha_t = \beta_t \circ \phi$ for all $t \ge 0$. We say $\alpha$ and $\beta$ are cocycle conjugate, denoted $\alpha_t \sim \beta_t$, if $\alpha'$ and $\beta$ are conjugate where $\alpha$ and $\alpha'$ differ by a unitary cocycle (i.e., there is a strongly continuous one-parameter family of unitaries $U(t)$ in $\mathcal B(H_1)$ for $t \ge 0$ satisfying the cocycle condition $U(t)\alpha_t(U(s)) = U(t+s)$ for all $t, s \ge 0$ so that $\alpha'_t(A) = U(t)\alpha_t(A)U(t)^{-1}$ for all $A \in \mathcal B(H_1)$ and $t \ge 0$).

An $E_0$-semigroup $\alpha_t$ is spatial if there is a semigroup of isometries $U(t)$ which intertwine, so that $U(t)A = \alpha_t(A)U(t)$ for $A \in \mathcal B(H)$ and $t > 0$. The property of being spatial is a cocycle conjugacy invariant.

An extremely useful and well-known result in the theory of $C^*$-algebras is the Gelfand–Segal construction of a cyclic ∗-representation of a $C^*$-algebra associated with a state of the $C^*$-algebra. In the study of $E_0$-semigroups there is a result in the same spirit which says that every semigroup of unital completely positive maps of $\mathcal B(K)$ can be dilated to an $E_0$-semigroup of $\mathcal B(H)$ where $H$ can be thought of as a larger Hilbert space containing $K$. We begin with a review of the properties of completely positive maps. A linear map $\phi$ from a $C^*$-algebra $\mathcal A$ into $\mathcal B(H)$ is completely positive if

\[
\sum_{i,j=1}^{n} \bigl(f_i, \phi(A_i^* A_j) f_j\bigr) \ge 0
\]

for $A_i \in \mathcal A$, $f_i \in H$ for $i = 1, 2, \dots, n$ and $n = 1, 2, \dots$. Stinespring's central result is that if $\mathcal A$ has a unit and $\phi$ is a completely positive map from $\mathcal A$ into $\mathcal B(H)$ then there is a ∗-representation $\pi$ of $\mathcal A$ on a Hilbert space $K$ and an operator $V$ from $H$ to $K$ so that $\phi(A) = V^*\pi(A)V$ for $A \in \mathcal A$. Moreover, $\pi$ is determined by $\phi$ up to unitary equivalence if the linear span of $\{\pi(A)Vf\}$ for $A \in \mathcal A$ and $f \in H$ is dense in $K$.

Often we speak of one functional or map dominating another. We introduce a word for the functional or map that is dominated. The word is "subordinate." If $A$ is an object which is positive with respect to some order structure we say $B$ is a subordinate of $A$ if $B$ is the same kind of thing $A$ is and $B$ is positive and $B$ is less than or equal to $A$. For example, if we are speaking of the positive integers the subordinates of 4 are 4, 3, 2, 1. If $A$ is a positive operator then the subordinates of $A$ are operators $B$ with $A \ge B \ge 0$. Suppose $E$ is a projection. Are the subordinates of a projection $E$ projections under $E$ or the operators under $E$? The answer depends on the context.

A CP-semigroup of $\mathcal B(H)$ is a strongly continuous one-parameter semigroup of completely positive maps of $\mathcal B(H)$ into itself. We now state Bhat's theorem [6] for $\mathcal B(H)$.

Theorem 1.3. Suppose $\alpha$ is a unital CP-semigroup of $\mathcal B(H)$. Then there is an $E_0$-semigroup $\alpha^d$ of $\mathcal B(H_1)$ and an isometry $W$ from $H$ to $H_1$ so that

\[
\alpha_t(A) = W^* \alpha_t^d(W A W^*) W
\]

and $\alpha_t^d(WW^*) \ge WW^*$ for $t > 0$, and if the projection $E = WW^*$ is minimal, which means the span of the vectors

\[
\alpha_{t_1}^d(E A_1 E)\,\alpha_{t_2}^d(E A_2 E) \cdots \alpha_{t_n}^d(E A_n E)\,W f
\]

for $f \in H$, $A_i \in \mathcal B(H)$, $t_i \ge 0$ for $i = 1, 2, \dots, n$ and $n = 1, 2, \dots$ is dense in $H_1$, then $\alpha^d$ is determined up to conjugacy.

We use Arveson's definition of minimality, which is easier to state and equivalent to Bhat's.
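The easy direction of Stinespring's theorem quoted earlier in this section can be verified in a few lines: any map of the form $\phi(A) = V^*\pi(A)V$ is completely positive. The following is a sketch that uses only the fact that $\pi$ is a ∗-representation, with $(\cdot,\cdot)$ denoting the inner product on either Hilbert space.

```latex
\begin{align*}
\sum_{i,j=1}^{n} \bigl(f_i, \phi(A_i^* A_j) f_j\bigr)
  &= \sum_{i,j=1}^{n} \bigl(f_i, V^* \pi(A_i)^* \pi(A_j) V f_j\bigr) \\
  &= \sum_{i,j=1}^{n} \bigl(\pi(A_i) V f_i, \pi(A_j) V f_j\bigr) \\
  &= \Bigl\| \sum_{j=1}^{n} \pi(A_j) V f_j \Bigr\|^2 \;\ge\; 0.
\end{align*}
```

The substantive content of Stinespring's theorem is the converse, namely that every completely positive map on a unital $C^*$-algebra arises in this way.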



Suppose $\alpha$ is an $E_0$-semigroup of $\mathcal B(H)$. We characterize the subordinates of $\alpha$ (i.e. the CP-semigroups $\beta$ of $\mathcal B(H)$ so that the mapping $A \to \alpha_t(A) - \beta_t(A)$ is completely positive for all $t \ge 0$). The subordinates of $\alpha$ are given by positive local cocycles. A cocycle is a $\sigma$-weakly continuous one-parameter family of operators $C(t)$ satisfying the cocycle relation $C(t+s) = C(t)\alpha_t(C(s))$ for all $s, t \ge 0$. The cocycle $C(t)$ is local if $C(t) \in \alpha_t(\mathcal B(H))'$ for all $t > 0$. The local cocycles and their order structure are a cocycle conjugacy invariant. As first shown by Bhat [6], there is an order isomorphism from the subordinates of a unital CP-semigroup of $\mathcal B(H)$ to the subordinates of its minimal dilation to an $E_0$-semigroup of $\mathcal B(H_1)$. We use the notation of [18].

Theorem 1.4. Suppose $\alpha$ is a unital CP-semigroup of $\mathcal B(H)$ and $\alpha^d$ is the minimal dilation of $\alpha$ to an $E_0$-semigroup of $\mathcal B(H_1)$ and $W$ is an isometry from $H$ to $H_1$ so that $WW^*$ is a minimal projection for $\alpha^d$ and

\[
\alpha_t(A) = W^* \alpha_t^d(W A W^*) W
\]

for $A \in \mathcal B(H)$ and $t \ge 0$. Then there is an order isomorphism from the subordinates of $\alpha$ to the subordinates of $\alpha^d$ given as follows. Suppose $\gamma$ is a subordinate of $\alpha^d$ and $C(t) = \gamma_t(I)$ for $t \ge 0$ is the local cocycle associated with $\gamma$; then the subordinate $\beta$ of $\alpha$ under this isomorphism is given by

\[
\beta_t(A) = W^* C(t)\,\alpha_t^d(W A W^*) W
\]

for $A \in \mathcal B(H)$ and $t \ge 0$.

In this paper we will frequently make use of corners. This is a trick introduced by A. Connes.

Definition 1.5. Suppose $\alpha$ and $\beta$ are CP-semigroups of $\mathcal B(H)$ and $\mathcal B(K)$. Then $\gamma$ is a corner from $\alpha$ to $\beta$ if $\Theta$ given by

\[
\Theta_t \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} \alpha_t(A) & \gamma_t(B) \\ \gamma_t^*(C) & \beta_t(D) \end{pmatrix}
\]

for $t \ge 0$ and $A \in \mathcal B(H)$, $D \in \mathcal B(K)$, $B$ a linear operator from $K$ to $H$ and $C$ a linear operator from $H$ to $K$, is a CP-semigroup of $\mathcal B(H \oplus K)$. Suppose $\gamma$ is a corner from $\alpha$ to $\beta$ and $\Theta$ is the CP-semigroup of $\mathcal B(H \oplus K)$ defined above. Suppose $\Theta'$ is a subordinate of $\Theta$ of the form

\[
\Theta'_t \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} \alpha'_t(A) & \gamma_t(B) \\ \gamma_t^*(C) & \beta'_t(D) \end{pmatrix}
\]

for $t \ge 0$ and for $A$, $B$, $C$ and $D$ as stated above. We say $\gamma$ is maximal if for every subordinate $\Theta'$ of the above form we have $\alpha' = \alpha$. We say $\gamma$ is hyper-maximal if for every subordinate $\Theta'$ of the above form we have $\alpha' = \alpha$ and $\beta' = \beta$.



We state Theorem 3.13 of [18], which shows how to determine when two CP-semigroups dilate to cocycle conjugate $E_0$-semigroups.

Theorem 1.6. Suppose $\alpha$ and $\beta$ are unital CP-semigroups of $\mathcal B(H)$ and $\mathcal B(K)$ and $\alpha^d$ and $\beta^d$ are the minimal dilations of $\alpha$ and $\beta$ to $E_0$-semigroups. Then $\alpha^d$ and $\beta^d$ are cocycle conjugate if and only if there is a hyper-maximal corner $\gamma$ from $\alpha$ to $\beta$.

If $\alpha$ is a unital CP-semigroup and $\alpha^d$ is its minimal dilation to an $E_0$-semigroup then the corners from $\alpha$ to $\alpha$ come from contractive local cocycles. The following theorem follows from [18, Theorem 3.16 and Corollary 3.17].

Theorem 1.7. Suppose $\alpha$ is a unital CP-semigroup of $\mathcal B(H)$ and $\alpha^d$ is its minimal dilation to an $E_0$-semigroup of $\mathcal B(H_1)$. The relation between $\alpha$ and $\alpha^d$ is given by

\[
\alpha_t(A) = W^* \alpha_t^d(W A W^*) W
\]

for $A \in \mathcal B(H)$ and $t \ge 0$, where $W$ is an isometry from $H$ to $H_1$ and $\alpha^d$ is minimal over the range of $W$. Suppose $\gamma$ is a corner from $\alpha$ to $\alpha$. Then there is a unique contractive local cocycle $C$ for $\alpha^d$ so that

\[
\gamma_t(A) = W^* C(t)\,\alpha_t^d(W A W^*) W
\]

for all $A \in \mathcal B(H)$ and $t \ge 0$. Conversely, if $C$ is a contractive local cocycle for $\alpha^d$ then $\gamma$ given above is a corner from $\alpha$ to $\alpha$. Furthermore, $C(t)$ is an isometry for all $t \ge 0$ if and only if $\gamma$ is maximal, and $C(t)$ is unitary for all $t \ge 0$ if and only if $\gamma$ is hyper-maximal.

Also in [18, Theorem 3.16] there is a similar theorem for matrices of corners.

Theorem 1.8. Suppose $\alpha$ is a unital CP-semigroup of $\mathcal B(H)$ and $\alpha^d$ is its minimal dilation to an $E_0$-semigroup of $\mathcal B(H_1)$. The relation between $\alpha$ and $\alpha^d$ is given by

\[
\alpha_t(A) = W^* \alpha_t^d(W A W^*) W
\]

for $A \in \mathcal B(H)$ and $t \ge 0$, where $W$ is an isometry from $H$ to $H_1$ and $\alpha^d$ is minimal over the range of $W$. Suppose $n$ is a positive integer and $\Theta$ is a positive $(n \times n)$-matrix of corners from $\alpha$ to $\alpha$. Then there is a unique positive $(n \times n)$-matrix $C$ of contractive local cocycles $C_{ij}$ for $\alpha^d$, $i, j = 1, \dots, n$, so that

\[
\theta_t^{(ij)}(A) = W^* C_{ij}(t)\,\alpha_t^d(W A W^*) W
\]

for all $A \in \mathcal B(H)$ and $t \ge 0$. Conversely, if $C$ is a positive $(n \times n)$-matrix of contractive local cocycles for $\alpha^d$ then the matrix $\Theta$ whose coefficients $\theta^{(ij)}$ are given above is a positive $(n \times n)$-matrix of corners from $\alpha$ to $\alpha$.

Next we define CP-flows. We believe these are the simplest objects which can be dilated to produce all spatial $E_0$-semigroups. CP-flows are studied extensively in [18].


Definition 1.9. Suppose K is a separable Hilbert space, H = K ⊗ L²(0, ∞), and U(t) is right translation of H by t ≥ 0. Specifically, we may realize H as the space of K-valued Lebesgue measurable functions with inner product
\[
(f, g) = \int_0^\infty \big(f(x), g(x)\big)\,dx
\]
for f, g ∈ H. The action of U(t) on an element f ∈ H is given by (U(t)f)(x) = f(x − t) for x ∈ [t, ∞) and (U(t)f)(x) = 0 for x ∈ [0, t). A semigroup α is a CP-flow over K if α is a CP-semigroup of B(H) which is intertwined by the translation semigroup U(t), i.e., U(t)A = α_t(A)U(t) for all A ∈ B(H) and t ≥ 0.

Henceforth, unless stated explicitly otherwise, we will arrange our notation so that our CP-flows will be CP-flows over K acting on B(H), where H = K ⊗ L²(0, ∞), and U(t) will denote the translation semigroup on H. In [18, Theorem 4.0A] it is shown that every spatial E0-semigroup is cocycle conjugate to an E0-semigroup which is also a CP-flow.

We introduce notation for describing CP-flows. Let H = K ⊗ L²(0, ∞) and let U(t) be translation by t. Let
\[
E(t) = I - U(t)U(t)^* \qquad\text{and}\qquad E(a, b) = U(a)U(a)^* - U(b)U(b)^*
\]
for t ∈ [0, ∞) and 0 ≤ a < b < ∞. We will also write E(t, ∞) = U(t)U(t)^*. Let d = d/dx be the operator of differentiation with the boundary condition f(0) = 0. More precisely, the domain D(d) is all f ∈ H of the form
\[
f(x) = \int_0^x g(t)\,dt
\]
with g ∈ H. The hermitian adjoint d^* is −d/dx with no boundary condition at x = 0; that is to say, the domain D(d^*) consists of the linear span of D(d) and the functions g(x) = e^{−x}k with k ∈ K. In summary, we can represent elements f ∈ D(d^*) as f = f0 + f+ where f0 ∈ D(d) and f+(x) = f(0)e^{−x}. Thus, the space D(d^*) has a natural semi-definite inner product given by ⟨f, g⟩ = (f(0), g(0)), which induces a (definite) inner product on D(d^*) mod D(d). This leads to a natural identification D(d^*)/D(d) ≅ K via the map [f] → f(0).

Suppose α is a CP-flow over K and A ∈ B(H). Then, for t > 0, one computes
\[
\alpha_t(A) = U(t)AU(t)^* + E(t)\alpha_t(A)E(t) = U(t)AU(t)^* + B
\]
for all t ≥ 0. Then B commutes with E(s) for all s ∈ [0, t], so B is of the form (Bf)(x) = b(x)f(x), and for t > x ≥ 0, b(x) ∈ B(K) depends σ-strongly on A.

We now define the boundary representation, π0. Let δ be the generator of α. Then for A ∈ D(δ) we have AD(d) ⊂ D(d) and AD(d^*) ⊂ D(d^*), so A acts on D(d^*) mod D(d). In terms of the identification D(d^*)/D(d) ≅ K



discussed in the previous paragraph, it follows that if f ∈ D(d^*), then (Af)(0) only depends on f(0). We call this mapping π0 : D(δ) → B(K), given by π0(A)f(0) = (Af)(0), the boundary representation. Note π0 tells you what flows in from the origin. The boundary representation need not be σ-weakly continuous, and even when it is it may not tell the whole story. If π is a σ-weakly continuous completely positive contraction of B(K ⊗ L²(0, ∞)) into B(K) then there is a minimal CP-flow with that boundary representation, and if that flow is unital then the E0-semigroup induced by the flow is completely spatial (type I_n) where n is the rank of π. For a detailed discussion of these properties of the boundary representation, we refer the reader to [18].

We now define the generalized boundary representation. The resolvent R_α for α is given by
\[
R_\alpha(A) = \int_0^\infty e^{-t}\alpha_t(A)\,dt.
\]
Next we introduce some notation. If φ is a σ-weakly continuous mapping from B(H) to B(K), we define $\hat\phi$ to be the predual map from B(K)_* to B(H)_*, so we have $\rho(\phi(A)) = (\hat\phi\rho)(A)$ for all A ∈ B(H) and ρ ∈ B(K)_*. We define the mapping Γ as
\[
\Gamma(A) = \int_0^\infty e^{-t}U(t)AU(t)^*\,dt
\]
for A ∈ B(H). Note R_α − Γ is completely positive, which we denote by writing R_α − Γ ≥ 0. Note Γ is the resolvent of a CP-flow with boundary representation π0 = 0. We need one more bit of notation. We define Λ : B(K) → B(H) by (Λ(A)f)(x) = e^{−x}Af(x) for A ∈ B(K). We define Λ = Λ(I). Note Γ(I) = I − Λ. Now we present the main formula:
\[
\hat R_\alpha(\rho) = \hat\Gamma\big(\omega(\hat\Lambda\rho) + \rho\big)
\]
for ρ ∈ B(H)_*, where η → ω(η) is the boundary weight map and ω(η) is the boundary weight associated with η. A boundary weight is a particular example of a T-weight, which we define presently.

Definition 1.10. Suppose T ∈ B(H) is a positive strictly contractive operator (i.e. 0 ≤ T ≤ I and ‖Tf‖ < 1 for ‖f‖ ≤ 1, so one is not an eigenvalue for T). We denote by A(H, T) the linear space
\[
A(H, T) = (I - T)^{1/2}\,B(H)\,(I - T)^{1/2}
\]



and by A(H, T)_* the linear functionals ρ on A(H, T) of the form
\[
\rho\big((I - T)^{1/2}A(I - T)^{1/2}\big) = \eta(A)
\]
for A ∈ B(H) with η ∈ B(H)_*. We call such functionals T-weights. The T-norm of a T-weight ρ, denoted ‖ρ‖_T, is the norm of η. If ρ is a T-weight and ‖ρ‖_T ≤ 1 we say ρ is T-contractive.

Suppose T ∈ B(H) is a positive strictly contractive operator and P(λ) is the spectral resolution of T, so
\[
T = \int_0^1 \lambda\,dP(\lambda).
\]
If ρ ∈ A(H, T)_* then ρ restricted to P(λ)B(H)P(λ) is normal for all λ > 0. Consider now the case when T1 ≥ T2 ≥ 0 and T1 is strictly contractive, so T2 is strictly contractive. Define S on the range of (I − T2)^{1/2} by the relation
\[
S(I - T_2)^{1/2}f = (I - T_1)^{1/2}f
\]
for f ∈ H. Note
\[
\big\|S(I - T_2)^{1/2}f\big\|^2 = \big(f, (I - T_1)f\big) \le \big(f, (I - T_2)f\big) = \big\|(I - T_2)^{1/2}f\big\|^2
\]
for f ∈ H. Then S is a contractive map on the range of (I − T2)^{1/2}, which is dense in H, so S has a unique bounded extension to a contraction defined on all of H. We also denote this operator by S. We note S is a contraction which satisfies the operator equation
\[
S(I - T_2)^{1/2} = (I - T_1)^{1/2}
\qquad\text{so}\qquad
(I - T_2)^{1/2}S^*AS(I - T_2)^{1/2} = (I - T_1)^{1/2}A(I - T_1)^{1/2}
\]
for A ∈ B(H), so it follows that A(H, T1) ⊂ A(H, T2). We show that A(H, T2)_* ⊂ A(H, T1)_*. Suppose ρ ∈ A(H, T2)_*, which means
\[
\rho\big((I - T_2)^{1/2}A(I - T_2)^{1/2}\big) = \eta(A)
\]
for all A ∈ B(H), where η ∈ B(H)_*. Then we have
\[
\rho\big((I - T_1)^{1/2}A(I - T_1)^{1/2}\big) = \rho\big((I - T_2)^{1/2}S^*AS(I - T_2)^{1/2}\big) = \eta(S^*AS)
\]
for A ∈ B(H). So we see ρ ∈ A(H, T1)_*, and since S is a contraction we have ‖ρ‖_{T1} ≤ ‖ρ‖_{T2}. Note a 0-weight is just a normal functional.

We caution the reader that the T-weights we consider are not normal weights. A normal weight on a von Neumann algebra has the property that if 0 ≤ A1 ≤ A2 ≤ · · · is an increasing sequence of operators which converge strongly to A then ω(A) is the limit of the ω(A_k), where we allow +∞ as a possible limit. Let H = L²(0, ∞) and let ω be the Λ-weight given by
\[
\omega\big((I - \Lambda)^{1/2}A(I - \Lambda)^{1/2}\big) = (h, Ah)
\]


D. Markiewicz, R.T. Powers / Journal of Functional Analysis 256 (2009) 1511–1543

for A ∈ B(H), where h(x) = x^{−s/2}(1 − e^{−x})^{1/2} for s ∈ (1, 2). For each n = 1, 2, 3, . . . , let M_n be the set of functions g in H with support in [1/n, ∞) and
\[
\int_{1/n}^\infty x^{-s/2}g(x)\,dx = 0.
\]
Let P_n be the orthogonal projection onto M_n, and consider B_n = (I − Λ)^{−1/2}P_n. Observe that B_n is a bounded operator, and moreover A_n = B_n B_n^* satisfies (I − Λ)^{1/2}A_n(I − Λ)^{1/2} = P_n for n = 1, 2, . . . . Note that ω(P_n) = (A_n h, h) = 0 for n = 1, 2, . . . and P_n → I as n → ∞, but it is not true that ω(I) = 0. If we were to assign ω(I) a value it would be +∞, since ω is positive and unbounded. Although T-weights are not in general normal weights, we do not think of them as pathological like non-normal bounded functionals, since T-weights are normal when scaled down by (I − T)^{1/2}.

In the particular case when H = K ⊗ L²(0, ∞), when we speak of boundary weights we mean the following. Let Λ be the operator corresponding to multiplication by e^{−x}. Then the boundary algebra A(H) is
\[
A(H) = A(H, \Lambda) = (I - \Lambda)^{1/2}\,B(H)\,(I - \Lambda)^{1/2}
\]
and the boundary weights, denoted by A(H)_*, are A(H)_* = A(H, Λ)_*. If ω is a boundary weight we say ω is weight contractive if ‖ω‖_Λ ≤ 1, and if ω is a positive boundary weight we say ω is normalized if ‖ω‖_Λ = 1. If ω is a boundary weight and we say ω is bounded, we mean ω is bounded as a functional on B(H) (i.e. there exists k > 0 such that |ω(A)| ≤ k‖A‖ for all A ∈ A(H)). The mapping ρ → ω(ρ) defined for ρ ∈ B(K)_* is a boundary weight map if this mapping is a linear mapping of B(K)_* into boundary weights on A(H) and this mapping is completely bounded, with the norm on B(K)_* being the usual norm and the norm on the boundary weights being the boundary weight norm. A boundary weight map is positive if it is completely positive. A boundary weight map ω is unital if ω(ρ)(I − Λ) = ρ(I) for all ρ ∈ B(K)_*.

Maintaining the notation of the above definition, we observe that U(t)AU(t)^* ∈ A(H) for all A ∈ B(H) and t > 0. Recall the mapping Γ defined above. Since Γ is completely positive and Γ(I) = I − Λ, so Γ(I) ∈ A(H), it follows that Γ(A) ∈ A(H) for all A ∈ B(H). For more details see the discussion after Definition 4.16 in [18].

Every CP-flow is given by a boundary weight map ρ → ω(ρ). As we have mentioned, the map is completely positive. There is a further complicated positivity condition. The condition says that if you construct an approximation π_t to the boundary representation, then π_t is completely positive. We describe the connection between boundary weight and boundary representation. One can construct a boundary weight map so that the boundary representation is a given σ-weakly continuous completely positive contraction of B(H) into B(K). Suppose π is a σ-weakly continuous completely positive contraction of B(H) into B(K). Let
\[
\omega = \hat\pi + \hat\pi\hat\Lambda\hat\pi + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi + \cdots.
\]



This converges as a weight (i.e. the above series converges on the boundary algebra A(H)) and this is the boundary weight map of a CP-flow. We call this the minimal CP-flow derived from π. Formally $\omega = \hat\pi(I - \hat\Lambda\hat\pi)^{-1}$, and solving for $\hat\pi$ we have $\hat\pi = \omega(I + \hat\Lambda\omega)^{-1}$. If a boundary weight associated with a CP-flow is bounded, the boundary representation is well defined, as stated in the next theorem (see [18, Theorem 4.27]).

Theorem 1.11. Suppose α is a CP-flow over K and ρ → ω(ρ) is the associated boundary weight map. Suppose ‖ω(ρ)‖ < ∞ for ρ ∈ B(K)_*, so ω(ρ) ∈ B(H)_* for all ρ ∈ B(K)_*. Then the mapping $\rho \to \rho + \hat\Lambda\omega(\rho)$ is invertible, i.e. $(I + \hat\Lambda\omega)^{-1}$ exists, and $\hat\pi$ given by
\[
\hat\pi = \omega(I + \hat\Lambda\omega)^{-1}
\]
is a completely positive contraction from B(K)_* to B(H)_*. There is a unique CP-flow derived from π and its boundary weight map is given by
\[
\omega = \hat\pi + \hat\pi\hat\Lambda\hat\pi + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi + \cdots.
\]
So when ω(ρ) is bounded for all ρ ∈ B(K)_* we have
\[
\omega = \hat\pi(I - \hat\Lambda\hat\pi)^{-1}
\qquad\text{and}\qquad
\hat\pi = \omega(I + \hat\Lambda\omega)^{-1}.
\]
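The formal inversion relating ω and $\hat\pi$ can be checked directly from the series. The following is only a sketch at the level of formal manipulation, not an argument taken from [18]; it assumes the series for ω converges and that $(I + \hat\Lambda\omega)^{-1}$ exists, which Theorem 1.11 asserts when ω(ρ) is bounded.

```latex
% Formal sketch: split off the first term of the series and recognise
% the remaining sum as \hat\pi\hat\Lambda applied to \omega itself.
\begin{align*}
\omega
  &= \hat\pi + \hat\pi\hat\Lambda\hat\pi + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi + \cdots \\
  &= \hat\pi + \hat\pi\hat\Lambda\big(\hat\pi + \hat\pi\hat\Lambda\hat\pi + \cdots\big)
   = \hat\pi + \hat\pi\hat\Lambda\omega
   = \hat\pi\,(I + \hat\Lambda\omega), \\
\hat\pi
  &= \omega\,(I + \hat\Lambda\omega)^{-1}.
\end{align*}
% In the other direction, grouping the series as a geometric series in
% \hat\Lambda\hat\pi gives the formal closed form
\[
\omega = \hat\pi\big(I + \hat\Lambda\hat\pi + (\hat\Lambda\hat\pi)^2 + \cdots\big)
       = \hat\pi\,(I - \hat\Lambda\hat\pi)^{-1}.
\]
```

Only the inverse of $I + \hat\Lambda\omega$ on B(K)_* is needed for the first identity, which matches the way the hypothesis of Theorem 1.11 is phrased.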

Now we introduce a bit of notation. Suppose ω is a boundary weight and t > 0. We denote by ω|_t the functional given by ω|_t(A) = ω(E(t, ∞)AE(t, ∞)) for A ∈ B(H). Note ω|_t(ρ) ∈ B(H)_*, i.e. ω|_t(ρ) is a bounded σ-weakly continuous functional. We use the same notation for operators. If A ∈ B(H) and t > 0 then we denote by A|_t the operator A|_t = E(t, ∞)AE(t, ∞). Note that for ω a boundary weight and A ∈ B(H) we have ω|_t(A) = ω(A|_t). From [18, Theorems 4.23 and 4.27 and Lemma 4.34] we have the following theorem.

Theorem 1.12. Suppose ρ → ω(ρ) is the boundary weight map of a CP-flow over K. Then for each t > 0 we have that ρ → ω|_t(ρ) is the boundary weight map of a CP-flow over K. Suppose ρ → ω(ρ) is a completely positive mapping of B(K)_* into boundary weights on B(H) satisfying ω(ρ)(I − Λ) ≤ ρ(I) for ρ positive. Suppose
\[
\hat\pi^{\#}_t = \omega|_t\big(I + \hat\Lambda\,\omega|_t\big)^{-1}
\]
is a completely positive contraction of B(K)_* into B(H)_* for each t > 0. Then ρ → ω(ρ) is the boundary weight map of a CP-flow over K. Furthermore, the mapping $\hat\pi^{\#}_t$ defined above has the property that if φ_t(A) = π^#_t(E(s, ∞)AE(s, ∞)) for 0 < t ≤ s < ∞ and A ∈ B(H), then φ_t is increasing in t in the sense of complete positivity (i.e., the mapping A → φ_t(A) − φ_r(A) for A ∈ B(H) and 0 < t < r ≤ s is completely positive).

Definition 1.13. If ρ → ω(ρ) is a mapping of B(K)_* into boundary weights on B(H) so that $\hat\pi^{\#}_t$ defined above is completely positive for each t > 0, we say this map is q-positive. The family



π^#_t of completely positive σ-weakly continuous contractions of B(H) into B(K) is called the generalized boundary representation.

We remark that in checking that the π^#_t are completely positive it is only necessary to check for small t. If the mapping π^#_t is completely positive then π^#_s is completely positive for all s ≥ t. Next we give the order relation for the generalized boundary representation (see [18, Theorem 4.20]).

Theorem 1.14. If α and β are CP-flows over K then β is a subordinate of α (α ≥ β) if and only if π^#_t ≥ φ^#_t for all t > 0, where π^#_t and φ^#_t are the generalized boundary representations of α and β. Also, if π^#_t ≥ φ^#_t then π^#_s ≥ φ^#_s for all s ≥ t, so one only has to check for a sequence {t_n} tending to zero.

In Theorem 1.11 we used the phrase “α is derived from π.” The next theorem (see [18, Theorem 4.24]) and definition will make this more precise. We need a bit of notation, which is given in the next definition.

Definition 1.15. Let Q0 be the map from K to H given by (Q0 k)(x) = e^{−x/2}k. And let Φ be the mapping of B(K)_* into B(H)_* given by Φ(ρ)(A) = ρ(Q0^* A Q0) for all A ∈ B(H). Note that Φ(ρ)(U(t)AU(t)^*) = e^{−t}Φ(ρ)(A) and Φ(ρ)(Γ(A)) = ½Φ(ρ)(A) for t ≥ 0, and Φ(ρ)(Λ(C)) = ½ρ(C), for all ρ ∈ B(K)_*, A ∈ B(H) and C ∈ B(K).

Theorem 1.16. Suppose ρ → ω(ρ) defines a CP-flow α over K as described in Definition 1.9 and δ is the generator of α (i.e., δ is the derivative of α_t at t = 0). Suppose π is a completely positive normal contraction of B(H) into B(K). Then the following are equivalent:
(i) $\Phi(\rho) \in D(\hat\delta)$ and $\hat\delta(\Phi(\rho)) = \hat\pi(\rho) - \Phi(\rho)$ for each ρ ∈ B(K)_*.
(ii) $\omega(\rho - \hat\Lambda(\hat\pi(\rho))) = \hat\pi(\rho)$ for all ρ ∈ B(K)_*.
(iii) π(A) = π0(A) for all A ∈ D(δ), where π0 is the boundary representation of α.

Definition 1.17. We say a CP-flow α over K is derived from the completely positive normal contraction π of B(H) into B(K) if it satisfies one and, therefore, all the conditions of Theorem 1.16.

As mentioned earlier, for each such π there is a CP-flow α derived from π, and the next theorem (see [18, Theorem 4.26]) gives a condition for uniqueness.

Theorem 1.18. Suppose π is a completely positive σ-weakly continuous linear contraction of B(H) into B(K). Then for each ρ ∈ B(K)_* the sum
\[
\omega(\rho) = \hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi(\rho) + \cdots
\]
converges as a weight on A(H), and the mapping ρ → ω(ρ) is the boundary weight map of a CP-flow α which is derived from π. Furthermore, this α is the minimal CP-flow derived from π, in that if ρ → η(ρ) is the boundary weight map of a second CP-semigroup derived from π then ω(ρ) ≤ η(ρ) for all positive ρ ∈ B(K)_*. Moreover, if (π ◦ Λ)^n(I) → 0 weakly as n → ∞ then α defined above is unique (i.e. α is the only CP-flow derived from π).
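The three identities for Φ stated in Definition 1.15 can be verified by a direct computation. The sketch below assumes the reading (Q0 k)(x) = e^{−x/2}k, which is the normalisation under which Q0 is an isometry and all three identities hold; it also uses that U(t)^* is left translation, (U(t)^*f)(x) = f(x + t).

```latex
% Assumption: (Q_0 k)(x) = e^{-x/2} k, so \|Q_0 k\|^2 = \int_0^\infty e^{-x}\,dx\,\|k\|^2 = \|k\|^2.
\begin{align*}
\big(U(t)^*Q_0k\big)(x) &= (Q_0k)(x+t) = e^{-t/2}\,(Q_0k)(x)
  \;\Longrightarrow\;
  Q_0^*\,U(t)AU(t)^*\,Q_0 = e^{-t}\,Q_0^*AQ_0, \\
\Phi(\rho)\big(U(t)AU(t)^*\big) &= e^{-t}\,\Phi(\rho)(A), \\
\Phi(\rho)\big(\Gamma(A)\big)
  &= \int_0^\infty e^{-t}\,\Phi(\rho)\big(U(t)AU(t)^*\big)\,dt
   = \int_0^\infty e^{-2t}\,dt\;\Phi(\rho)(A)
   = \tfrac12\,\Phi(\rho)(A), \\
\big(Q_0k,\ \Lambda(C)\,Q_0k'\big)
  &= \int_0^\infty e^{-x}\,\big(e^{-x/2}k,\ C\,e^{-x/2}k'\big)\,dx
   = \int_0^\infty e^{-2x}\,dx\;(k, Ck')
   = \tfrac12\,(k, Ck'), \\
\Phi(\rho)\big(\Lambda(C)\big) &= \rho\big(Q_0^*\Lambda(C)Q_0\big) = \tfrac12\,\rho(C).
\end{align*}
```

In particular Φ(ρ)(I) = ρ(Q0^*Q0) = ρ(I), a fact used when norms of positive functionals of the form Φ(ρ) are evaluated at I.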



We remark that we believe that Theorem 1.18 can be strengthened, with the stronger conclusion being that α is a flow subordinate of any CP-flow derived from π. In the examples we construct in this paper we will show that the stronger result holds.

So far most of the results in this section are proved in [18]. The next two theorems are new.

Theorem 1.19. Suppose α is a CP-flow over K derived from π as described in Definition 1.17 and β is a CP-flow subordinate to α, so the mapping A → α_t(A) − β_t(A) for A ∈ B(H) is completely positive for all t ≥ 0. Then there is a unique completely positive normal contraction φ of B(H) into B(K) which is subordinate to π so that β is derived from φ.

Proof. Assume the hypothesis of the theorem and suppose δ_α and δ_β are the generators of α and β, respectively. Let γ_t(A) = U(t)AU(t)^* for t ≥ 0 and A ∈ B(H). Since β is a subordinate of α and β is intertwined by U(t), the maps A → α_t(A) − β_t(A) and A → β_t(A) − γ_t(A) for A ∈ B(H) are completely positive for all t ≥ 0. Suppose ρ ∈ B(K)_* and ρ ≥ 0. Then we have
\[
\begin{aligned}
\vartheta_t = t^{-1}\big(\hat\alpha_t(\Phi(\rho)) - \Phi(\rho)\big) + \Phi(\rho)
 &\ge t^{-1}\big(\hat\beta_t(\Phi(\rho)) - \Phi(\rho)\big) + \Phi(\rho) = \nu_t \\
 &\ge t^{-1}\big(\hat\gamma_t(\Phi(\rho)) - \Phi(\rho)\big) + \Phi(\rho)
  = t^{-1}\big(e^{-t} - 1 + t\big)\Phi(\rho)
\end{aligned}
\]
for t > 0, where the first two equal signs are definitions of ϑ_t and ν_t. Since α is derived from π we have
\[
\vartheta_t = t^{-1}\big(\hat\alpha_t(\Phi(\rho)) - \Phi(\rho)\big) + \Phi(\rho)
 \to \hat\delta\big(\Phi(\rho)\big) + \Phi(\rho) = \hat\pi(\rho) = \vartheta_0
\]
as t → 0+, and the convergence is in norm. Since $\vartheta_0 = \hat\pi(\rho) \in B(H)_*$ there is a positive trace class operator Ω0 so that $\vartheta_0(A) = \hat\pi(\rho)(A) = \mathrm{tr}(A\Omega_0)$ for A ∈ B(H), and for every ρ1 ∈ B(H)_* with ϑ0 ≥ ρ1 ≥ 0 there is an X ∈ B(H) with 0 ≤ X ≤ I so that
\[
\rho_1(A) = \mathrm{tr}\big(A\,\Omega_0^{1/2}X\Omega_0^{1/2}\big)
\]
for A ∈ B(H), and conversely, if X ∈ B(H) and 0 ≤ X ≤ I then ρ1 defined above is in B(H)_* and 0 ≤ ρ1 ≤ ϑ0. (If we require that the null space of X contains Range(Ω0)^⊥ then X is uniquely determined by ρ1.) Suppose t > 0 and Ω_t is the unique positive trace class operator so that
\[
\vartheta_t(A) = \Big(t^{-1}\big(\hat\alpha_t(\Phi(\rho)) - \Phi(\rho)\big) + \Phi(\rho)\Big)(A) = \mathrm{tr}(A\Omega_t)
\]
for A ∈ B(H). From the inequality above we have ϑ_t ≥ ν_t ≥ 0, so there is an operator X_t ∈ B(H) with 0 ≤ X_t ≤ I so that
\[
\nu_t(A) = \Big(t^{-1}\big(\hat\beta_t(\Phi(\rho)) - \Phi(\rho)\big) + \Phi(\rho)\Big)(A) = \mathrm{tr}\big(A\,\Omega_t^{1/2}X_t\Omega_t^{1/2}\big)
\]



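for A ∈ B(H). We will require that the null space of X_t contains Range(Ω_t)^⊥, so X_t is uniquely determined. Now let
\[
\eta_t(A) = \mathrm{tr}\big(A\,\Omega_0^{1/2}X_t\Omega_0^{1/2}\big)
\]
for A ∈ B(H). Now we have
\[
\|\nu_t - \eta_t\| = \sup\Big\{\mathrm{Re}\,\mathrm{tr}\Big(A\big(\Omega_t^{1/2}X_t\Omega_t^{1/2} - \Omega_0^{1/2}X_t\Omega_0^{1/2}\big)\Big) : A \in B(H),\ \|A\| \le 1\Big\}.
\]
We have |tr(AB)| ≤ ‖A‖_HS ‖B‖_HS for A, B ∈ B(H), with ‖A‖_HS = tr(A^*A)^{1/2} the Hilbert–Schmidt norm. Then for A ∈ B(H) with ‖A‖ ≤ 1,
\[
\begin{aligned}
\Big|\mathrm{tr}\Big(A\big(\Omega_t^{1/2}X_t\Omega_t^{1/2} - \Omega_0^{1/2}X_t\Omega_0^{1/2}\big)\Big)\Big|
 &= \Big|\mathrm{tr}\Big(A\big(\Omega_t^{1/2}X_t\Omega_t^{1/2} - \Omega_t^{1/2}X_t\Omega_0^{1/2} + \Omega_t^{1/2}X_t\Omega_0^{1/2} - \Omega_0^{1/2}X_t\Omega_0^{1/2}\big)\Big)\Big| \\
 &\le \Big|\mathrm{tr}\Big(A\,\Omega_t^{1/2}X_t\big(\Omega_t^{1/2} - \Omega_0^{1/2}\big)\Big)\Big|
    + \Big|\mathrm{tr}\Big(X_t\Omega_0^{1/2}A\big(\Omega_t^{1/2} - \Omega_0^{1/2}\big)\Big)\Big| \\
 &\le \Big(\big\|A\,\Omega_t^{1/2}X_t\big\|_{HS} + \big\|A\,\Omega_0^{1/2}X_t\big\|_{HS}\Big)\,\big\|\Omega_t^{1/2} - \Omega_0^{1/2}\big\|_{HS} \\
 &\le \Big(\mathrm{tr}(\Omega_t)^{1/2} + \mathrm{tr}(\Omega_0)^{1/2}\Big)\,\big\|\Omega_t^{1/2} - \Omega_0^{1/2}\big\|_{HS}
\end{aligned}
\]
and, hence, it follows that
\[
\|\nu_t - \eta_t\|
 \le \big(\|\nu_t\|^{1/2} + \|\vartheta_0\|^{1/2}\big)\,\big\|\Omega_t^{1/2} - \Omega_0^{1/2}\big\|_{HS}
 \le \big(\|\vartheta_t\|^{1/2} + \|\vartheta_0\|^{1/2}\big)\,\big\|\Omega_t^{1/2} - \Omega_0^{1/2}\big\|_{HS}.
\]
Now if UT is the polar decomposition of Ω_t^{1/2} − Ω_0^{1/2}, so that $U^*(\Omega_t^{1/2} - \Omega_0^{1/2}) = |\Omega_t^{1/2} - \Omega_0^{1/2}| = ((\Omega_t^{1/2} - \Omega_0^{1/2})^2)^{1/2}$, we have
\[
\begin{aligned}
\|\vartheta_t - \vartheta_0\|
 &\ge \vartheta_t(U^*) - \vartheta_0(U^*) = \mathrm{tr}\big(U^*(\Omega_t - \Omega_0)\big) \\
 &= \tfrac12\,\mathrm{tr}\Big(U^*\Big(\big(\Omega_t^{1/2} - \Omega_0^{1/2}\big)\big(\Omega_t^{1/2} + \Omega_0^{1/2}\big) + \big(\Omega_t^{1/2} + \Omega_0^{1/2}\big)\big(\Omega_t^{1/2} - \Omega_0^{1/2}\big)\Big)\Big) \\
 &= \mathrm{tr}\Big(\big|\Omega_t^{1/2} - \Omega_0^{1/2}\big|\big(\Omega_t^{1/2} + \Omega_0^{1/2}\big)\Big)
  \ge \mathrm{tr}\Big(\big|\Omega_t^{1/2} - \Omega_0^{1/2}\big|^2\Big) \\
 &= \mathrm{tr}\Big(\big(\Omega_t^{1/2} - \Omega_0^{1/2}\big)^2\Big)
  = \big\|\Omega_t^{1/2} - \Omega_0^{1/2}\big\|_{HS}^2 .
\end{aligned}
\]
Hence, we have
\[
\|\nu_t - \eta_t\| \le \big(\|\vartheta_t\|^{1/2} + \|\vartheta_0\|^{1/2}\big)\,\|\vartheta_t - \vartheta_0\|^{1/2}
\]
for all t > 0. Note ϑ0 ≥ η_t ≥ 0 for each t > 0. Note the set S of η ∈ B(H)_* with ϑ0 ≥ η ≥ 0 is compact. This may be seen as follows. For every ε1 > 0 there is a finite rank ξ ∈ B(H)_* so that 0 ≤ ξ ≤ ϑ0 and ‖ξ − ϑ0‖ < ε1. Given ε > 0 we can, by choosing ε1 small enough, ensure that for every η ∈ S there is a positive η′ ≤ ξ with ‖η − η′‖ < ε. Hence, for every ε > 0 there is a finite-dimensional compact subset of S so that the ε-neighborhoods of this set cover S; thus for every ε > 0 there is a cover of S with a finite number of open balls of radius ε, and we obtain that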



S is totally bounded. Since S is complete (it is clearly closed) and totally bounded, it follows that it is compact. Since S is compact there is a sequence t_n → 0+ as n → ∞ so that η_{t_n} converges to a limit η0 in norm as n → ∞. Since ‖ϑ_t − ϑ0‖ → 0 as t → 0+, it follows from the above estimate that ν_{t_n} → η0 as n → ∞. Hence, we have
\[
\Big\|t_n^{-1}\big(\hat\beta_{t_n}(\Phi(\rho)) - \Phi(\rho)\big) - \big(\eta_0 - \Phi(\rho)\big)\Big\| \to 0
\]
as n → ∞. Now let
\[
\mu_n = \frac{1}{t_n}\int_0^{t_n}\hat\beta_s\big(\Phi(\rho)\big)\,ds
\]
for n = 1, 2, . . . . Let δ_β be the generator of β. Note $\mu_n \in D(\hat\delta_\beta)$ and
\[
\hat\delta_\beta(\mu_n) = t_n^{-1}\big(\hat\beta_{t_n}(\Phi(\rho)) - \Phi(\rho)\big) \to \eta_0 - \Phi(\rho)
\]
as n → ∞, where the convergence is in norm. Since μ_n → Φ(ρ) in norm as n → ∞, it follows from the fact that $\hat\delta_\beta$ is closed that $\Phi(\rho) \in D(\hat\delta_\beta)$ and $\hat\delta_\beta(\Phi(\rho)) = \eta_0 - \Phi(\rho)$. Since ρ was an arbitrary positive element of B(K)_* it follows that $\Phi(\rho) \in D(\hat\delta_\beta)$ and $\hat\delta_\beta(\Phi(\rho)) + \Phi(\rho) = \hat\phi(\rho)$, where this equation defines φ. Since α ≥ β ≥ γ in the sense of complete positivity, it follows that π ≥ φ ≥ 0 in the sense of complete positivity. From Definition 1.17 it follows that β is derived from φ. The uniqueness of φ follows from the defining equation for φ. □

Theorem 1.20. Suppose α is a CP-flow over K and π is a normal completely positive contraction of B(H) into B(K), and suppose further that π is unital, so π(I) = I. Suppose β is a CP-flow over K derived from π and α ≥ β (i.e. the mapping A → α_t(A) − β_t(A) for A ∈ B(H) is completely positive for all t ≥ 0). Then α is derived from π.

Proof. Assume the hypothesis and notation of the theorem. Suppose ρ ∈ B(K)_* and ρ is positive. Then, defining ϑ_t and ν_t as in the proof of the last theorem, we have ϑ_t ≥ ν_t ≥ 0 for t > 0 and $\nu_t \to \hat\pi(\rho)$ in norm as t → 0+. Since ϑ_t − ν_t ≥ 0 and α_t(I) ≤ I and π is unital, we have
\[
\begin{aligned}
\|\vartheta_t - \nu_t\| = \vartheta_t(I) - \nu_t(I)
 &= t^{-1}\Phi(\rho)\big(\alpha_t(I) - I\big) + \Phi(\rho)(I) - \nu_t(I) \\
 &\le \Phi(\rho)(I) - \nu_t(I) = \rho(I) - \nu_t(I) \to \rho(I) - \hat\pi(\rho)(I) = 0.
\end{aligned}
\]
Hence, $\vartheta_t \to \hat\pi(\rho)$ in norm as t → 0+. Since each ρ ∈ B(K)_* is a linear combination of at most four positive elements of B(K)_*, we have
\[
t^{-1}\big(\hat\alpha_t(\Phi(\rho)) - \Phi(\rho)\big) + \Phi(\rho) \to \hat\pi(\rho)
\]
in norm as t → 0+. Thus, $\Phi(\rho) \in D(\hat\delta)$ and $\hat\delta(\Phi(\rho)) + \Phi(\rho) = \hat\pi(\rho)$ for all ρ ∈ B(K)_*. Hence, α is derived from π. □

We will want to analyze the action of local cocycles on units. Suppose α is a unital CP-flow and α^d is the minimal dilation of α to an E0-semigroup acting on B(H1) as described in Theorem 1.3. A unit for α^d is a one-parameter semigroup of isometries V(t) which intertwine α^d



(so V(t)A = α^d_t(A)V(t) for all A ∈ B(H1) and t ≥ 0). Units for α^d are in one to one correspondence with semigroups S(t) acting on H with the property that the semigroup Ω_t(A) = S(t)AS(t)^* is a trivially maximal subordinate of α (i.e., the mapping A → α_t(A) − e^{st}Ω_t(A) for A ∈ B(H) is completely positive for all t ≥ 0 provided s ≤ 0, and the mapping is not positive for s > 0 and t > 0). The next two theorems (see [18, Theorems 4.46, 4.50 and 4.51]) describe such semigroups and the connection between them and units for the dilated E0-semigroup.

Theorem 1.21. Suppose α is a CP-flow over K and S(t) is a strongly continuous one-parameter semigroup, and Ω_t(A) = S(t)AS(t)^* for t ≥ 0 and A ∈ B(H) is a subordinate of α. Then S(t) is a strongly continuous one-parameter semigroup of contractions with generator −D, where
\[
D(D) = \{f \in D(d^*) : f(0) = Vf\} \qquad\text{and}\qquad Df = -d^*f + cf,
\]
where c is a complex number with non-negative real part and V is a linear operator from H to K with norm satisfying ‖V‖² ≤ 2 Re(c). Furthermore, if π(A) = (2 Re(c))^{−1}VAV^* for all A ∈ B(H) and γ is the minimal CP-semigroup derived from π, then α dominates γ. In the case Re(c) = 0 we define π = 0. Conversely, suppose c is a complex number with Re(c) > 0 and V is a linear operator from H to K with norm satisfying ‖V‖² ≤ 2 Re(c), and suppose π(A) = (2 Re(c))^{−1}VAV^* for A ∈ B(H), γ is the minimal CP-semigroup derived from π, and α dominates γ. If D is the operator with domain
\[
D(D) = \{f \in D(d^*) : f(0) = Vf\} \qquad\text{and}\qquad Df = -d^*f + cf,
\]
then −D is the generator of a contraction semigroup S(t), and if Ω_t(A) = S(t)AS(t)^* for t ≥ 0 and A ∈ B(H) then α dominates Ω.

Theorem 1.22. Suppose α is a unital CP-flow over K and α^d is the minimal dilation of α to an E0-semigroup, and suppose the relation between α and α^d is given by
\[
\alpha_t(A) = W^*\alpha^d_t(WAW^*)W
\]
for all A ∈ B(H) (with H = K ⊗ L²(0, ∞)) and t ≥ 0, where W is an isometry from H to H1, WW^* is an increasing projection for α^d, and α^d is minimal over the range of W.
Then H1 can be expressed as H1 = K1 ⊗ L²(0, ∞) and α^d is a CP-flow over K1 so that if U(t) and U1(t) are right translation on H and H1 for α and α^d, respectively, then U1(t)W = WU(t) and U1(t)^*W = WU(t)^* for all t ≥ 0. This means that W, as a mapping of H = K ⊗ L²(0, ∞) into H1 = K1 ⊗ L²(0, ∞), can be expressed in the form W = W1 ⊗ I where W1 is an isometry from K into K1. Suppose S(t) is a strongly continuous semigroup of contractions of H and Ω given by Ω_t(A) = S(t)AS(t)^* for A ∈ B(H) and t ≥ 0 is a subordinate of α. Further assume Ω is trivially maximal. Then there is a unique strongly continuous one-parameter semigroup of isometries S1(t) which intertwine α^d_t for each t ≥ 0 and
\[
S(t) = W^*S_1(t)W
\]
for all t ≥ 0. Conversely, if S1(t) is a strongly continuous one-parameter semigroup of isometries which intertwine α^d_t for each t ≥ 0, then with S(t) as defined in the equation above, S(t) is a strongly continuous one-parameter semigroup of contractions so that Ω defined by Ω_t(A) = S(t)AS(t)^* for A ∈ B(H) and t ≥ 0 is a subordinate of α which is trivially maximal.
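The norm condition ‖V‖² ≤ 2 Re(c) in Theorem 1.21 can be motivated by a short formal computation. The sketch below is not taken from [18]; it assumes that f ∈ D(D) decays at infinity (so the boundary term at x = ∞ vanishes) and that integration by parts applies, and it uses d^* = −d/dx, so that Df = f′ + cf.

```latex
% Formal sketch: f in D(D), hence f(0) = Vf, and Df = -d^*f + cf = f' + cf.
\begin{align*}
2\,\mathrm{Re}\int_0^\infty \big(f(x), f'(x)\big)\,dx
  &= \int_0^\infty \frac{d}{dx}\,\|f(x)\|^2\,dx
   = -\|f(0)\|^2 = -\|Vf\|^2, \\
\mathrm{Re}\,(f, Df)
  &= \mathrm{Re}(c)\,\|f\|^2 - \tfrac12\,\|Vf\|^2
   \;\ge\; \big(\mathrm{Re}(c) - \tfrac12\|V\|^2\big)\,\|f\|^2 \;\ge\; 0 .
\end{align*}
% Consequently, for f in D(D) and S(t) generated by -D,
\[
\frac{d}{dt}\,\|S(t)f\|^2 = -2\,\mathrm{Re}\,\big(S(t)f,\ D\,S(t)f\big) \le 0 .
\]
```

So ‖S(t)f‖ is non-increasing in t on the domain, which is consistent with S(t) being a semigroup of contractions exactly when the boundary term −½‖Vf‖² is controlled by Re(c)‖f‖².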



We end this section with some notation and results which we will need in the next section. As we saw in Theorem 1.18 the boundary weight map of the minimal CP-flow derived from π is given by
\[
\omega(\rho) = \hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi(\rho) + \cdots.
\]
We introduce some notation. For n ∈ N we write R_n(φ) to denote the finite sum and R(φ) to denote the infinite series
\[
R_n(\phi) = I + \phi + \phi^2 + \cdots + \phi^n
\qquad\text{and}\qquad
R(\phi) = I + \phi + \phi^2 + \cdots.
\]
Then the expression for ω above can be written
\[
\omega = \hat\pi R(\hat\Lambda\hat\pi) = R(\hat\pi\hat\Lambda)\hat\pi.
\]
Formally, R(φ) = (I − φ)^{−1}; however, the inverse in question may not exist. The sums above make sense in that the series
\[
\omega(\rho)(A) = \rho\big(\pi(A)\big) + \rho\big(\pi(\Lambda(\pi(A)))\big) + \cdots
\]
converges absolutely for A ∈ A(H). This is seen by setting A = I − Λ and assuming ρ ∈ B(K)_* is positive. As we saw in Theorem 1.18 the series above for ω defines the minimal CP-flow derived from π. We know from Theorem 1.12 that the truncated boundary weight map ρ → ω|_t for t > 0 is the minimal CP-flow derived from the truncated boundary representation φ^#_t.

2. An almost type I CP-flow

In this section we study CP-flows derived from a particular strongly continuous ∗-representation π. Let K be the infinite tensor product of L²(0, ∞), so $K = \bigotimes_{k=1}^\infty L^2(0, \infty)$, with the reference vector (see [23] for details of infinite tensor products of Hilbert spaces)
\[
F_0 = k_1 \otimes k_2 \otimes \cdots
\qquad\text{with}\qquad
k_i(x) = \lambda_i e^{-\frac12\lambda_i^2 x}
\]
for x ≥ 0, where λ_i > 0 for i = 1, 2, . . . . The Hilbert space K is spanned by product vectors of the form F = f1 ⊗ f2 ⊗ · · · where
\[
\sum_{i=1}^\infty \|f_i - k_i\|^2 < \infty. \tag{2.1}
\]



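The inner product between two such product vectors is given by
\[
(F, G) = \prod_{i=1}^\infty (f_i, g_i).
\]
We impose the following two conditions on the positive numbers λ_i:
\[
\sum_{n=1}^\infty \lambda_n^{-2} < \infty
\qquad\text{and}\qquad
\sum_{n=1}^\infty \frac{|\lambda_n - \lambda_{n+1}|^2}{\lambda_n^2 + \lambda_{n+1}^2} < \infty. \tag{2.2}
\]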
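For orientation, conditions (2.2) can be checked by hand on two sample sequences, and the second condition can be related to the overlap of consecutive factors of the reference vector. The computation below is a sketch that assumes the reading k_n(x) = λ_n e^{−λ_n² x/2} (so that ‖k_n‖ = 1); the interpretation in the closing remark is ours and is not a statement from [23].

```latex
% Condition (2.2) for \lambda_n = n: both series converge.
\[
\sum_{n\ge 1}\frac{1}{n^2} < \infty,
\qquad
\sum_{n\ge 1}\frac{|n-(n+1)|^2}{n^2+(n+1)^2}
  = \sum_{n\ge 1}\frac{1}{2n^2+2n+1} < \infty .
\]
% Condition (2.2) for \lambda_n = 2^n: every term of the second series equals 1/5,
% so it diverges, although the first series \sum 4^{-n} converges.
\[
\frac{|2^n-2^{n+1}|^2}{2^{2n}+2^{2n+2}} = \frac{4^n}{5\cdot 4^n} = \frac15 .
\]
% Overlap of consecutive reference factors.
\begin{align*}
(k_n, k_{n+1})
  &= \lambda_n\lambda_{n+1}\int_0^\infty e^{-\frac12(\lambda_n^2+\lambda_{n+1}^2)x}\,dx
   = \frac{2\lambda_n\lambda_{n+1}}{\lambda_n^2+\lambda_{n+1}^2}, \\
\tfrac12\,\|k_n-k_{n+1}\|^2
  &= 1-(k_n,k_{n+1})
   = \frac{(\lambda_n-\lambda_{n+1})^2}{\lambda_n^2+\lambda_{n+1}^2} .
\end{align*}
```

Under this reading, the second condition in (2.2) says exactly that Σ‖k_n − k_{n+1}‖² < ∞, which has the same shape as condition (2.1) applied to a reference vector whose factors have been shifted by one place.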

We note both these conditions are satisfied for λ_n = n, and the second condition is not satisfied for λ_n = 2^n. Let S0 be the unitary mapping of H = K ⊗ L²(0, ∞) into K given by
\[
S_0\big((f_1 \otimes f_2 \otimes \cdots) \otimes h\big) = h \otimes f_1 \otimes f_2 \otimes \cdots \tag{2.3}
\]
and let π(A) = S0 A S0^* and Δ = e^{−x} ⊗ e^{−x} ⊗ · · · , where e^{−x} is shorthand for the operation of multiplication by e^{−x} on L²(0, ∞). The first sum condition ensures that Δ is not zero and the second condition ensures that S0 is well defined. Note π is a normal ∗-representation of B(H) on B(K).

Suppose n is a positive integer. We define K_n as the tensor product of the Hilbert spaces L²(0, ∞) from n + 1 on, with the reference vector $F_{n0} = \bigotimes_{i=n+1}^\infty k_i$. Let S_n be the linear mapping from K to K_n which takes the product vector $F = \bigotimes_{i=1}^\infty f_i \in K$ to the product vector $S_nF = \bigotimes_{i=1}^\infty f_i \in K_n$ (with f_i placed in the (n + i)-th factor). From the second sum condition above one finds S_n is well defined, and one checks that S_n is unitary. We define Q_n(A) = S_n A S_n^* for A ∈ B(K). Let $K^{(n)} = \bigotimes_{i=1}^n L^2(0, \infty)$ denote the tensor product of the first n factors. We see that B(K) = B(K^{(n)}) ⊗ B(K_n) and
\[
Q_m\big(I_n \otimes Q_n(A)\big) = I_{n+m} \otimes Q_{n+m}(A)
\]
for A ∈ B(K), where I_k is the unit in B(K^{(k)}), for n, m, k = 1, 2, . . . . We have the formulae
\[
\begin{aligned}
\pi(A \otimes A_0) &= A_0 \otimes Q_1(A), \\
\Lambda(A) &= A \otimes e^{-x}, \\
\pi\Lambda(A) &= e^{-x} \otimes Q_1(A), \\
(\pi\Lambda)^n(A) &= e^{-x} \otimes e^{-x} \otimes \cdots \otimes e^{-x} \otimes Q_n(A)
\end{aligned}
\]
for A ∈ B(K) and n = 1, 2, . . . , where there are n factors of e^{−x} in the last equation. For A = A1 ⊗ A2 ⊗ · · · we write these formulae:
\[
\begin{aligned}
\pi\big((A_1 \otimes A_2 \otimes \cdots) \otimes A_0\big) &= A_0 \otimes A_1 \otimes A_2 \otimes \cdots, \\
\Lambda(A_1 \otimes A_2 \otimes \cdots) &= (A_1 \otimes A_2 \otimes \cdots) \otimes e^{-x}, \\
\pi\Lambda(A_1 \otimes A_2 \otimes \cdots) &= e^{-x} \otimes A_1 \otimes A_2 \otimes \cdots, \\
(\pi\Lambda)^n(A_1 \otimes A_2 \otimes \cdots) &= e^{-x} \otimes e^{-x} \otimes \cdots \otimes e^{-x} \otimes A_1 \otimes A_2 \otimes \cdots.
\end{aligned}
\]
We first note that (πΛ)^n(I) converges to Δ as n → ∞. We have
\[
(\pi\Lambda)^n(I) = e^{-x} \otimes e^{-x} \otimes \cdots \otimes e^{-x} \otimes I \otimes I \otimes \cdots
\]



where there are n factors of e^{−x}, and we see that (πΛ)^n(I) forms a decreasing sequence of positive operators which must converge strongly to a limit, which is Δ = e^{−x} ⊗ e^{−x} ⊗ · · · . As we have mentioned, the first sum condition on the λ_n ensures that Δ is not zero.

Next we note that if π(Λ(A)) = A then A is a multiple of Δ (i.e. A = cΔ with c ∈ C). In fact, we show first that it is enough to prove that if A is positive and π(Λ(A)) = A then A = λΔ with λ ≥ 0. Note that if π(Λ(A)) = A and A = A1 + iA2, where A1 and A2 are hermitian, then π(Λ(A_i)) = A_i for i = 1, 2. So it is enough to show that if A = A^* and π(Λ(A)) = A then A = λΔ with λ real. Next note that if A ∈ B(K) is hermitian and π(Λ(A)) = A and ‖A‖ = 1 then (πΛ)^n(I + A) → Δ + A as n → ∞, and since Δ + A is the strong limit of positive operators, Δ + A is positive. If Δ + A = λΔ it follows that A is a multiple of Δ. Hence, it is sufficient to show that if A is positive and π(Λ(A)) = A then A = λΔ with λ ≥ 0.

Suppose then that A ∈ B(K) is positive, ‖A‖ = 1 and π(Λ(A)) = A. Since (πΛ)^n(I − A) → Δ − A ≥ 0 we have 0 ≤ A ≤ Δ. Recalling the reference vector F0 we have
\[
(F_0, \Delta F_0) = \big(k_1, e^{-x}k_1\big)\big(k_2, e^{-x}k_2\big)\cdots
 = \frac{\lambda_1^2}{1 + \lambda_1^2}\cdot\frac{\lambda_2^2}{1 + \lambda_2^2}\cdots.
\]
Since Δ ≥ A ≥ 0 we have (F0, AF0) = c(F0, ΔF0) with c ∈ [0, 1]. Now since π(Λ(A)) = A it follows that A = e^{−x} ⊗ Q1(A) = e^{−x} ⊗ e^{−x} ⊗ Q2(A) = · · · and we have
\[
\Big(\bigotimes_{i=n+1}^\infty k_i,\ Q_n(A)\bigotimes_{i=n+1}^\infty k_i\Big)
 = c\,\frac{\lambda_{n+1}^2}{1 + \lambda_{n+1}^2}\cdot\frac{\lambda_{n+2}^2}{1 + \lambda_{n+2}^2}\cdots
\]
for n = 1, 2, . . . . Now let
\[
F = \bigotimes_{i=1}^\infty f_i \qquad\text{and}\qquad G = \bigotimes_{i=1}^\infty g_i
\]
be product vectors so that f_i = g_i = k_i for i ≥ m. Then we see that
\[
(F, AG) = \big(f_1, e^{-x}g_1\big)\big(f_2, e^{-x}g_2\big)\cdots\big(f_m, e^{-x}g_m\big)\,c\,\frac{\lambda_{m+1}^2}{1 + \lambda_{m+1}^2}\cdot\frac{\lambda_{m+2}^2}{1 + \lambda_{m+2}^2}\cdots = c\,(F, \Delta G).
\]
Since such vectors F and G are dense in K we have A = cΔ. Then we have proved the following lemma.

Lemma 2.1. Suppose π is the ∗-representation described above. Let Δ be as described above. Then Δ = lim_{n→∞}(πΛ)^n(I). Furthermore, if A ∈ B(K) and π(Λ(A)) = A then A is a multiple of Δ (i.e., A = cΔ with c ∈ C).
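The value of (F0, ΔF0) used in the argument above, and the role of the first condition in (2.2), can be checked with a one-line integral. This is a sketch assuming k_i(x) = λ_i e^{−λ_i² x/2}; the numerical value for λ_i = i is included only as an illustration.

```latex
% Assumes k_i(x) = \lambda_i e^{-\lambda_i^2 x/2}.
\begin{align*}
\|k_i\|^2 &= \lambda_i^2\int_0^\infty e^{-\lambda_i^2 x}\,dx = 1, \\
\big(k_i, e^{-x}k_i\big)
  &= \lambda_i^2\int_0^\infty e^{-(1+\lambda_i^2)x}\,dx
   = \frac{\lambda_i^2}{1+\lambda_i^2}
   = 1-\frac{1}{1+\lambda_i^2}, \\
(F_0,\Delta F_0) &= \prod_{i=1}^\infty\Big(1-\frac{1}{1+\lambda_i^2}\Big).
\end{align*}
% For 0 < a_i < 1 the product \prod (1 - a_i) is strictly positive
% if and only if \sum a_i < \infty. Here
\[
\sum_{i=1}^\infty\frac{1}{1+\lambda_i^2} \;\le\; \sum_{i=1}^\infty\lambda_i^{-2} < \infty
\quad\Longrightarrow\quad (F_0,\Delta F_0) > 0,
\]
% so \Delta \neq 0. For \lambda_i = i the product is \pi/\sinh(\pi) \approx 0.272.
```

Conversely, if Σ λ_i^{−2} diverged, the same criterion would give (F0, ΔF0) = 0, which is why the first sum condition is the natural hypothesis for Δ to be non-zero.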



We will need a stronger characterization of this property, which is provided by the following lemma.

Lemma 2.2. Suppose ρ ∈ B(K)_* and ρ(Δ) = 0. Then $\|(\hat\Lambda\hat\pi)^n(\rho)\| \to 0$ as n → ∞.

Proof. Suppose ρ ∈ B(K)_* and ρ(Δ) = 0. Suppose ε > 0. Since ρ can be approximated arbitrarily well in norm by a finite sum of functionals ρ_i of the form ρ_i(A) = (F_i, AG_i) with F_i, G_i ∈ K for i = 1, . . . , n, and the vectors F_i and G_i can be approximated by vectors F_i′ and G_i′ which are finite sums of product vectors of the form f1 ⊗ f2 ⊗ · · · with f_i = k_i for i > m, with m some large integer, it follows that there is a functional η so that ‖ρ − η‖ < ½ε(F0, ΔF0) and
\[
\eta(A) = \sum_{i=1}^n (F_i, AG_i)
\]
and each of the vectors F_i and G_i is of the form
\[
F \otimes \Big(\bigotimes_{i=m+1}^\infty k_i\Big)
\]
(i.e. they consist of sums of product vectors with factors f_i = k_i for i > m). We have
\[
|\eta(\Delta)| = |\rho(\Delta) - \eta(\Delta)| \le \|\rho - \eta\| < \tfrac12\varepsilon\,(F_0, \Delta F_0).
\]
Now let μ(A) = η(A) − (F0, AF0)η(Δ)(F0, ΔF0)^{−1}. Note
\[
\|\rho - \mu\| \le \|\rho - \eta\| + \|\eta - \mu\| \le \tfrac12\varepsilon\,(F_0, \Delta F_0) + \tfrac12\varepsilon < \varepsilon.
\]
Since $(\hat\Lambda\hat\pi)^k(\mu)(A) = \mu((\pi\Lambda)^k(A))$ and (πΛ)^k(A) is of the form
\[
(\pi\Lambda)^k(A) = e^{-x} \otimes e^{-x} \otimes \cdots \otimes e^{-x} \otimes Q_k(A),
\]
where there are k factors of e^{−x}, it follows from the form of μ that μ((πΛ)^k(A)) = 0 for k ≥ m. Hence, $\|(\hat\Lambda\hat\pi)^k(\rho)\| < \varepsilon$ for k ≥ m and we have $\|(\hat\Lambda\hat\pi)^k(\rho)\| \to 0$ as k → ∞. □

Let α^1 be the minimal CP-flow derived from π. If ρ → ω1(ρ) is the boundary weight map for α^1 then
\[
\omega_1(\rho) = \hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi(\rho) + \cdots = \hat\pi R(\hat\Lambda\hat\pi)(\rho),
\]
where the shorthand R(ψ) = I + ψ + ψ² + · · · was introduced in the last section. We analyze CP-flows derived from π. We begin with the following observation.

Theorem 2.3. Suppose α is a CP-flow derived from π and ω is the boundary weight map for α. Then ω is of the form
\[
\omega(\rho) = \omega_1(\rho) + \rho(\Delta)\xi
\]



for ρ ∈ B(K)_*, where ω1 is the boundary weight map for α^1, the minimal CP-flow derived from π, and ξ ∈ A(H)_* is a positive boundary weight on A(H) = (I − Λ)^{1/2}B(H)(I − Λ)^{1/2} with ξ(I − Λ) ≤ 1, and α is unital (i.e. α_t(I) = I for t ≥ 0) if and only if ξ(I − Λ) = 1.

Proof. Assume the hypothesis and notation of the theorem. Since α is derived from π we have (by Theorem 1.16) that $\omega(\rho - \hat\Lambda\hat\pi(\rho)) = \hat\pi(\rho)$ for ρ ∈ B(K)_*. Suppose ρ ∈ B(K)_*. Let $\rho_n = \rho + \hat\Lambda\hat\pi(\rho) + \cdots + (\hat\Lambda\hat\pi)^n(\rho)$. Applying the identity to ρ_n, we have
\[
\omega\big(\rho - (\hat\Lambda\hat\pi)^{n+1}(\rho)\big) = \hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi(\rho) + \cdots + \hat\pi(\hat\Lambda\hat\pi)^n(\rho).
\]
Now suppose ρ(Δ) = 0. Then by Lemma 2.2 we have $\|(\hat\Lambda\hat\pi)^n\rho\| \to 0$ as n → ∞, so taking the limit as n → ∞ we have ω(ρ) = ω1(ρ) for ρ ∈ B(K)_* with ρ(Δ) = 0. Now suppose η ∈ B(K)_* is positive and η(Δ) = 1. Then for arbitrary ρ ∈ B(K)_* we have
\[
\omega(\rho) = \omega\big(\rho - \rho(\Delta)\eta\big) + \rho(\Delta)\omega(\eta) = \omega_1(\rho) + \rho(\Delta)\big(\omega(\eta) - \omega_1(\eta)\big).
\]
Setting ξ = ω(η) − ω1(η) we have ω given in terms of ω1 and ξ as stated in the theorem. Next we show ξ is positive. Suppose ρ ∈ B(K)_* is positive and ρ(Δ) = 1. Then we have
\[
\omega\big((\hat\Lambda\hat\pi)^n(\rho)\big) = \omega_1\big((\hat\Lambda\hat\pi)^n(\rho)\big) + \xi
\]
for each n = 1, 2, . . . , and since $\omega_1((\hat\Lambda\hat\pi)^n(\rho)) \to 0$ as a weight and $(\hat\Lambda\hat\pi)^n(\rho)$ is positive, ξ is the limit of positive weights, so ξ is positive. For ρ ∈ B(K)_* we have
\[
\omega(\rho)(I - \Lambda) = \omega_1(\rho)(I - \Lambda) + \rho(\Delta)\xi(I - \Lambda)
\]
and calculating ω1(ρ)(I − Λ) we find
\[
\omega_1(\rho)(I - \Lambda) = \rho\big(I - \pi(\Lambda) + \pi(\Lambda) - (\pi\Lambda)^2(I) + \cdots\big) = \rho(I) - \rho(\Delta).
\]
Hence, we have ω(ρ)(I − Λ) = ρ(I) − ρ(Δ)(1 − ξ(I − Λ)) for ρ ∈ B(K)_*. Since we have the inequality ω(ρ)(I − Λ) ≤ ρ(I) for positive ρ ∈ B(K)_*, we find ξ(I − Λ) ≤ 1, and ω(ρ)(I − Λ) = ρ(I) if and only if ξ(I − Λ) = 1. □

It follows from this result that if α is a unital CP-flow derived from π and α^d is its minimal dilation E0-semigroup, then α^d is of type II and of index 1. This is because Δ ≠ 0, so the minimal CP-flow derived from π is not unital, and therefore it must be a proper subordinate. Now recall that since α is derived from π and π is σ-weakly continuous, π is the normal spine of α (see [18, Definition 4.36 and Lemma 4.37]). Furthermore, by [18, Theorem 4.52], α^d is completely spatial if and only if α is the minimal CP-flow derived from its normal spine. It follows that α^d cannot be completely spatial. Finally, we observe that by [18, Theorem 4.49], the index of α^d is precisely the rank of the normal spine of α, and the rank of π is one.

We remark that it was shown in [18, Theorem 4.62] that if ν is a positive element of B(H)_* with ν(I) ≤ 1 and ξ is of the form
\[
\xi = \big(1 - \nu(\Lambda(\Delta))\big)^{-1}R(\hat\pi\hat\Lambda)\nu
\]

1532


then ω of the form given in Theorem 2.3 is the boundary weight map of a CP-flow α is unital if and only if ν(I ) = 1. In a subsequent paper we find necessary and sufficient conditions on ξ that ω as given in the statement of the above theorem is the boundary weight map of a CP-flow over K. If ξ satisfies these conditions we say ξ is q-positive. In a subsequent paper we show that the above formula for ξ can be generalized to positive Λ(Δ)-weights with ν(I − Λ(Δ)) 1. We also show that there are more general ξ . For this paper we simply note that there are plenty of q-positive ξ which yield unital CP-semigroups α. 3. Local flow cocycles In this section we study the local flow cocycles associated with the CP-flows constructed in the previous section. Suppose α is a unital CP-flow and α d is the minimal dilation of α to an E0 semigroup. As we saw in Theorem 1.22 then the α d is also a CP-flow over K1 and the translation U (t) on the Hilbert space H on which α lives dilate to the translations U 1 (t) on the Hilbert space H1 on which α d lives. Recall t → C(t) is a local cocycle for α d C is a cocycle and C(t) commutes with αtd (B(H1 )) for all t 0. The cocycle C is a flow cocycle if C(t)U 1 (t) = U 1 (t) for all t 0. Just as each local unitary cocycle corresponds to a hyper-maximal corner from α to α, each local unitary flow cocycle for α d the dilation of a CP-flow over K corresponds to a hyper-maximal flow corner γ from α to α. Here a flow corner from α to α is a corner so that the matrix Θ in Definition 1.5 is a CP-flow over K ⊕ K. Theorems 1.7 and 1.8 of Section 1 of this paper are valid if one replaces the word “CP-semigroup” with “CP-flow” and “cocycle” with “flow cocycle” (see [18, Theorem 4.54]). One ambiguity that occurs in speaking of flow corners is the following. When one says γ is a maximal flow corner do we mean γ is maximal as a flow corner or simply maximal as a corner. 
In [18, Lemma 4.55] it was shown that if $\alpha$ and $\beta$ are CP-semigroups and $\gamma$ is a flow corner from $\alpha$ to $\beta$ then $\alpha$ and $\beta$ are CP-flows. It then follows that the two notions of maximality are the same. We mention one technical problem. Suppose $\alpha^d$ is the dilation of the CP-flow $\alpha$ and $t \to C(t)$ is a contractive local cocycle with $C(t)U^1(t) = \exp(-zt)U^1(t)$ for $t > 0$, where $z$ is a complex number with positive real part. Let $C'(t) = \exp(zt)C(t)$. Then $C'$ is a local flow cocycle; however, it is not clear that it is contractive, so there may not be a flow corner associated with it. Fortunately, Theorem 4.61 in [18] shows that $C'$ is contractive, so there is a local flow corner associated with it. This means that every contractive local cocycle $C$ is of the form $C(t) = \exp(-zt)C'(t)$ for $t \ge 0$, where $C'$ is a flow cocycle and $z$ is a complex number with non-negative real part.

Here we introduce some notation which we will use throughout this section. As in the last section, $\pi$ is the ∗-representation of $B(H)$ on $B(K)$ constructed there. We denote by $\xi$ a q-positive (usually unital) boundary weight and by $\alpha = \alpha^\xi$ the CP-flow derived from $\pi$ associated with $\xi$ as described in the last section. The boundary weight map for $\alpha$ is
$$\omega(\rho) = \omega_1(\rho) + \rho(\Delta)\xi$$
for $\rho \in B(K)_*$, where $\omega_1 = R(\hat\pi\hat\Lambda)\hat\pi$. Recall that q-positive means that $\omega$ given above is the boundary weight map of a CP-flow over $K$. As we mentioned in the last section, the complete characterization of such $\xi$ will be given in a subsequent paper, but for now we simply remark that there are many q-positive $\xi$ as given in the previous section. If $z$ is a complex number with $|z| \le 1$ we set
$$\omega_z = zR(z\hat\pi\hat\Lambda)\hat\pi = z\hat\pi R(z\hat\Lambda\hat\pi) = z\hat\pi + z^2\hat\pi\hat\Lambda\hat\pi + z^3\hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi + \cdots \qquad (3.1)$$


where the sum converges as a boundary weight since the sum converges for $z = 1$, where all the terms are positive. Next we introduce a family of one-parameter semigroups of isometries which intertwine $\alpha$. For $z$ any complex number we denote by $U_z$ the one-parameter semigroup of isometries of $H = K \otimes L^2(0, \infty)$ given by
$$U_z(t) = \exp(-tD_z), \qquad \text{where } D_z = -d^* + \tfrac{1}{2}|z|^2 I,$$
for $t \ge 0$. Here $d$ is the operation of differentiation defined in the last section and the domain is $D(D_z) = \{f \in D(d^*) : f(0) = zS_0 f\}$, where $S_0$ is the unitary operator mapping $H$ into $K$ defined by (2.3) in the last section as
$$S_0\bigl((f_1 \otimes f_2 \otimes \cdots) \otimes f_0\bigr) = f_0 \otimes f_1 \otimes f_2 \otimes \cdots$$
for $f_i \in L^2(0, \infty)$, where the $f_i$ satisfy condition (2.1), and $S_0$ defines $\pi$ in that $\pi(A) = S_0 A S_0^*$ for $A \in B(H)$. Note $U_0 = U$, the standard right translation, and $D_0 = d$. Suppose $w$ and $z$ are complex numbers. We show the $U_z$ are a one-parameter family of isometries and the covariance $c(w, z)$ of $U_w$ with $U_z$ is given by
$$U_w(t)^* U_z(t) = \exp\bigl(c(w, z)t\bigr) I = \exp\Bigl(\tfrac{1}{2}\bigl(2\overline{w}z - |w|^2 - |z|^2\bigr)t\Bigr) I \qquad (3.2)$$
for $t \ge 0$. For $f \in D(D_w)$ and $g \in D(D_z)$ we have
$$\frac{d}{dt}\langle U_w(t)f, U_z(t)g\rangle = \langle d^*U_w(t)f, U_z(t)g\rangle + \langle U_w(t)f, d^*U_z(t)g\rangle - \tfrac{1}{2}\bigl(|w|^2 + |z|^2\bigr)\langle U_w(t)f, U_z(t)g\rangle$$
for $t \ge 0$. Now we have
$$\langle d^*U_w(t)f, U_z(t)g\rangle + \langle U_w(t)f, d^*U_z(t)g\rangle = \langle (U_w(t)f)(0), (U_z(t)g)(0)\rangle = \langle wS_0U_w(t)f, zS_0U_z(t)g\rangle = \overline{w}z\,\langle U_w(t)f, U_z(t)g\rangle$$
for $t \ge 0$, where we have used the relation between $h(0)$ and $S_0h$ for $h$ in $D(D_z)$ or $D(D_w)$ and the fact that $S_0$ is an isometry. Hence we have
$$\frac{d}{dt}\langle U_w(t)f, U_z(t)g\rangle = c(w, z)\,\langle U_w(t)f, U_z(t)g\rangle$$
for $t \ge 0$, and since the domains $D(D_w)$ and $D(D_z)$ are dense in $H$ (see the argument in [18, Lemma 4.44]) Eq. (3.2) follows. Next we note that if $S$ is a one-parameter semigroup, then $\Omega$ given by
$$\Omega_t(A) = S(t)^* A S(t)$$



for $A \in B(H)$ and $t \ge 0$ satisfies $\alpha \ge \Omega$ (meaning the mapping $A \to \alpha_t(A) - \Omega_t(A)$ is completely positive for $A \in B(H)$ and $t \ge 0$) if and only if there are complex numbers $y, z$ with $\mathrm{Re}(y) \ge 0$ so that $S(t) = e^{-yt}U_z(t)$ for $t \ge 0$. This follows from Theorem 1.21 of Section 1 once one notes that the condition of the theorem is satisfied if and only if the mapping $A \to \pi(A) - (2\,\mathrm{Re}(c))^{-1}VAV^*$ is completely positive, and since $\pi(A) = S_0AS_0^*$ for $A \in B(H)$ this is the case if and only if $V$ is an appropriate multiple of $S_0$.

Suppose $z \in \mathbb{C}$. We show $U_z$ intertwines $\alpha$. From the result just established, the mapping
$$\beta_t(A) = \alpha_t(A) - U_z(t)AU_z(t)^*$$
for $A \in B(H)$ and $t \ge 0$ is completely positive. Suppose $t > 0$. Then since $\alpha$ is unital we have $\beta_t(I) = I - U_z(t)U_z(t)^*$. Since $U_z(t)$ is an isometry and $\beta_t$ is positive we have $0 \le \beta_t(A) \le I - U_z(t)U_z(t)^*$ for $A \in B(H)$ with $0 \le A \le I$, and consequently
$$\beta_t(A) = \bigl(I - U_z(t)U_z(t)^*\bigr)\beta_t(A)\bigl(I - U_z(t)U_z(t)^*\bigr),$$
and by linearity this extends to all $A \in B(H)$. Then we have
$$\alpha_t(A) = U_z(t)AU_z(t)^* + \bigl(I - U_z(t)U_z(t)^*\bigr)\alpha_t(A)\bigl(I - U_z(t)U_z(t)^*\bigr)$$
for all $A \in B(H)$. Multiplying the above equation on the right by $U_z(t)$ we obtain $U_z(t)A = \alpha_t(A)U_z(t)$, so $U_z$ intertwines $\alpha$. Summarizing our results to this point: if the mapping $A \to \alpha_t(A) - V(t)AV(t)^*$ for $A \in B(H)$ and $t \ge 0$ is completely positive, where $V$ is a one-parameter semigroup of contractions, then the $V(t)$ are in fact multiples of a semigroup $U_z$ of isometries which intertwine $\alpha$.

Now suppose $\alpha^d$ is the dilation of $\alpha$ to an $E_0$-semigroup on $H_1$ as described in Theorem 1.22. Then from Theorem 1.22 we see the mapping
$$W^*U_z^1(t)W = U_z(t) \qquad (3.3)$$

for $t \ge 0$ and $z \in \mathbb{C}$ gives us a bijection from the $U_z$ to the semigroups of isometries $U_z^1$ which intertwine $\alpha^d$, where the covariance for the $U_z^1$ is the same as the covariance for the $U_z$ given in Eq. (3.2). Also every intertwining semigroup for $\alpha^d$ is of the form $V^1(t) = e^{-yt}U_z^1(t)$ with $y, z \in \mathbb{C}$. Next we describe the action of local cocycles on the units $U_z^1$. One checks that if $t \to C(t)$ is a local cocycle for $\alpha^d$ then $C(t)U_z^1(t)$ is an intertwining semigroup for $\alpha^d$. Now the action of the local unitary (respectively contractive) cocycles on the units $U_z^1$ restricts to an action of local unitary (respectively contractive) cocycles on $\gamma$, the type I part of $\alpha^d$ (see [5, Remark 10.4.2, p. 347]), which is the “maximal” type I E-semigroup subordinate to $\alpha^d$ (necessarily type I$_1$ in



this case). Every E-semigroup is cocycle conjugate to an $E_0$-semigroup, hence the action at the level of $\gamma$ must arise via cocycle conjugacy from an action of a subgroup of the gauge group (respectively semigroup of local contractive cocycles) of an $E_0$-semigroup of type I$_1$ acting on its set of units. Local unitary cocycles generate automorphisms of the product systems associated with an $E_0$-semigroup and these have been computed in the type I case by Arveson in [3] (see also [5, Section 3.8]). Going one step further, Bhat [7] computed the positive contractive local cocycles of an $E_0$-semigroup of type I. The general contractive local flow cocycles for a CP-flow of type I are characterized in [1, Theorem 2.11].

Now we characterize the action of the contractive local cocycles on units. If $C$ is a contractive local cocycle for $\alpha^d$ then there are complex numbers $a, b, c, y \in \mathbb{C}$ with $|a| \le 1$ and $\mathrm{Re}(y) \ge 0$ so that the action of $C$ on the units $U_z^1$ is given by
$$C(t)U_z^1(t) = \exp\Bigl(t\bigl(-y - \tfrac{1}{2}|v + z|^2(1 - |a|^2) + i\,\mathrm{Im}(cz)\bigr)\Bigr)U^1_{az+b}(t) \qquad (3.4)$$
for $t \ge 0$ with
$$v = -(1 - |a|^2)^{-1}(ab + c),$$
and when $|a| = 1$ the numbers $a, b, c \in \mathbb{C}$ above satisfy the additional constraint $ac + b = 0$, so
$$C(t)U_z^1(t) = e^{-t(y + i\,\mathrm{Im}(abz))}U^1_{az+b}(t).$$
The action of $C^*$ is obtained by making the replacements
$$a \to \bar{a}, \qquad b \leftrightarrow c \qquad \text{and} \qquad y \to \bar{y}.$$
In the case when $|a| = 1$ we parameterize $C$ with complex numbers $(y, a, b)$, not using $c$, so the action of $C^*$ in this case is given by
$$C(t)^*U_z^1(t) = e^{-t(\bar{y} - i\,\mathrm{Im}(bz))}U^1_{\bar{a}(z-b)}(t).$$
If the cocycle is isometric then
$$|a| = 1, \qquad ac + b = 0, \qquad \text{and} \qquad \mathrm{Re}(y) = 0.$$
If the cocycle is a flow cocycle then $b = c = y = 0$, so the action of a flow cocycle on the units $U_z^1$ is given by
$$C(t)U_z^1(t) = e^{-\frac{1}{2}t|z|^2(1-|a|^2)}U^1_{az}(t)$$
for $t \ge 0$ and $z \in \mathbb{C}$. If $C$ and $C'$ are contractive local cocycles whose action on the units is characterized by the tuples $(a, b, c, y)$ and $(a', b', c', y')$ as described above, then the corresponding numbers for the product cocycle $t \to C(t)C'(t)$ are
$$\bigl(aa',\; ab' + b,\; a'c + c',\; y + y' + i\,\mathrm{Im}(cb') - \tfrac{1}{2}r\bigr)$$



where $r = 0$ if either $|a| = 1$ or $|a'| = 1$ and otherwise
$$r = (1 - |a'|^2)^{-1}|a'b' + c'|^2 + (1 - |a|^2)^{-1}\bigl|b'(1 - |a|^2) - ab - c\bigr|^2 - (1 - |aa'|^2)^{-1}\bigl|aa'(ab' + b) + a'c + c'\bigr|^2$$
is a non-negative real function of $(a, b, c, a', b', c')$. Given the complexity of the function $r$ above we wonder if there is a better parameterization of the action of the local cocycles on the units. If either of the local cocycles above is unitary the number $r$ above is zero, so the parameterization of contractive local cocycles is much more difficult than the parameterization of the unitary local cocycles. We caution the reader that the action of a local cocycle on the units $U_z^1$ does not completely determine the cocycle, since in our case $\alpha^d$ is not completely spatial. In the next theorem we characterize the contractive local flow cocycles which, as we have explained, is equivalent to determining the flow corners from $\alpha$ to $\alpha$. First we prove the following lemma.

Lemma 3.1. Suppose $\xi$ is a unital q-positive boundary weight on $A(H)$ and $\alpha$ is the CP-flow over $K$ derived from $\pi$ associated with $\xi$. Suppose $\gamma$ is a flow corner from $\alpha$ to $\alpha$, which means that
$$\Theta_t\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} \alpha_t(A_{11}) & \gamma_t(A_{12}) \\ \gamma_t^*(A_{21}) & \alpha_t(A_{22}) \end{pmatrix}$$
for $t > 0$ and $A_{ij} \in B(H)$ for $i, j = 1, 2$ is a CP-flow over $K \oplus K$. Then there is a complex number $z$ with $|z| \le 1$ so that $\Theta$ is derived from $\Pi_z$ given by
$$\Pi_z\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} \pi(A_{11}) & z\pi(A_{12}) \\ \bar{z}\pi(A_{21}) & \pi(A_{22}) \end{pmatrix}$$
for $A_{ij} \in B(H)$ for $i, j = 1, 2$. Furthermore, for each $w \in \mathbb{C}$ we have
$$U_{zw}(t)A = e^{\frac{1}{2}t|w|^2(1-|z|^2)}\gamma_t(A)U_w(t)$$
and
$$U_w(t)A = e^{\frac{1}{2}t|w|^2(1-|z|^2)}\gamma_t^*(A)U_w(t)$$
for $A \in B(H)$ and $t \ge 0$.

Proof. Assume the hypothesis and notation of the lemma. Let $\alpha^d$ and $\Theta^d$ be the dilations of $\alpha$ and $\Theta$ to $E_0$-semigroups on $H_1$ and $H_1 \oplus H_1$, where the relation between the CP-flow and the dilated $E_0$-semigroup is as described in Section 1, so $\alpha_t(A) = W^*\alpha_t^d(WAW^*)W$ for $t \ge 0$ and $A \in B(H)$. We will show that there is $z \in \mathbb{C}$ with $|z| \le 1$ so that $\Theta$ is derived from $\Pi_z$ as defined above.



First note that $U(t) \oplus U(t)$ intertwines $\Theta$. Using this we find the boundary representation of $\Theta$ is of the form
$$\Pi(A) = \Pi\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} \pi(A_{11}) & \phi(A_{12}) \\ \phi^*(A_{21}) & \pi(A_{22}) \end{pmatrix}$$
for $A$ in the domain of the generator of $\Theta$. Since $\pi$ is pure, meaning the only subordinates of $\pi$ are of the form $\lambda\pi$ with $0 \le \lambda \le 1$, and $\Pi$ is completely positive, it follows that $\phi = z\pi$ for some $z \in \mathbb{C}$ with $|z| \le 1$. Note that in general the boundary representation is the direct sum of a normal and a non-normal representation of the domain of the generator, but in our case we are assured that there is no non-normal part because $\pi$ is unital and therefore $\Pi$ is normal. Thus the boundary representation of $\Theta$ is $\Pi$, so $\Theta$ is derived from $\Pi$. As we have seen, since $\gamma$ is a flow corner from $\alpha$ to $\alpha$ there is a unique contractive local flow cocycle $C$ for $\alpha^d$ so that
$$\gamma_t(A) = W^*C(t)\alpha_t^d(WAW^*)W$$
for all $t \ge 0$ and $A \in B(H)$. Then, as we have seen, there is a number $y \in \mathbb{C}$ with $|y| \le 1$ so that
$$C(t)U_w^1(t) = \exp\bigl(-\tfrac{1}{2}t|w|^2(1 - |y|^2)\bigr)U^1_{yw}(t)$$
for $t \ge 0$ and $w \in \mathbb{C}$. Then we have
$$\begin{aligned}
\gamma_t(A)U_w(t) &= W^*C(t)\alpha_t^d(WAW^*)WU_w(t) = W^*C(t)\alpha_t^d(WAW^*)U_w^1(t)W \\
&= W^*C(t)U_w^1(t)WAW^*W = \exp\bigl(-\tfrac{1}{2}t|w|^2(1 - |y|^2)\bigr)W^*U^1_{yw}(t)WA \\
&= \exp\bigl(-\tfrac{1}{2}t|w|^2(1 - |y|^2)\bigr)U_{yw}(t)A
\end{aligned}$$
for $t \ge 0$, $w \in \mathbb{C}$ and $A \in B(H)$. Also, since $C$ is a local cocycle we have
$$\gamma_t^*(A) = W^*\alpha_t^d(WAW^*)C(t)^*W = W^*C(t)^*\alpha_t^d(WAW^*)W$$
so
$$\gamma_t^*(A)U_w(t) = \exp\bigl(-\tfrac{1}{2}t|w|^2(1 - |y|^2)\bigr)U_w(t)A$$
for $t \ge 0$, $w \in \mathbb{C}$ and $A \in B(H)$. Hence, we have proved the lemma provided we can show $y = z$.

We show $y = z$. Let $d_2 = d \oplus d$, so $d_2$ is the ordinary differential operator $d/dx$ on $H \oplus H$. We use capital letters $F$ and $G$ to denote elements of $H \oplus H$ and lower case letters $f, g$ to denote elements of $H$. Recall that the boundary representation discussed in Section 1 for $\Theta$ is given by
$$\Pi_z(A)F(0) = (AF)(0)$$
for $F \in D(d_2^*)$ and $A \in D(\delta_2)$, where $\delta_2$ is the generator of $\Theta$. Suppose $w \in \mathbb{C}$ and $w \ne 0$. Now suppose $G = \{0, g\}$ and $g \in D(D_w)$, so $g \in D(d^*)$ and $g(0) = wS_0g$. Suppose $A \in D(\delta_2)$ and $A_{ij} \in B(H)$ are the matrix coefficients of $A$ for $i, j = 1, 2$. Now from what we have shown we have
$$\gamma_t(A_{12})U_w(t)g = \exp\bigl(-\tfrac{1}{2}t|w|^2(1 - |y|^2)\bigr)U_{yw}(t)A_{12}g$$
for $t \ge 0$. Since $-D_w$ is the generator of $U_w$ and $g \in D(D_w)$, $U_w(t)g$ is differentiable in $t$, and since $A \in D(\delta_2)$, $\gamma_t(A_{12})$ is differentiable in $t$, so the expression on the left-hand side of the above equation is differentiable in $t$. Hence, $U_{yw}(t)A_{12}g$ is differentiable in $t$, so $A_{12}g \in D(D_{yw})$, and we have $A_{12}g \in D(d^*)$ and
$$(A_{12}g)(0) = ywS_0A_{12}g = ywS_0A_{12}w^{-1}S_0^*g(0) = yS_0A_{12}S_0^*g(0) = y\pi(A_{12})g(0).$$
Since $\Pi_z(A)F(0) = (AF)(0)$ we have $(A_{12}g)(0) = z\pi(A_{12})g(0)$, and comparing the two equations we see $y = z$. □
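As a quick consistency check on the parameter $y$ used in the proof above, one can take norms in the formula for $C(t)U_w^1(t)$. This is only a sketch; it uses nothing beyond the facts stated in the text that each $U_w^1(t)$ is an isometry and that $C(t)$ is a contraction. For $f \in H_1$,

$$
\|C(t)U_w^1(t)f\| = \exp\bigl(-\tfrac{1}{2}t|w|^2(1 - |y|^2)\bigr)\,\|U^1_{yw}(t)f\| = \exp\bigl(-\tfrac{1}{2}t|w|^2(1 - |y|^2)\bigr)\,\|f\| .
$$

Since $\|C(t)U_w^1(t)f\| \le \|f\|$, the exponential factor is at most $1$ for every $t > 0$ and $w \ne 0$, which is exactly the condition $|y| \le 1$. If $C(t)$ is unitary the same identity gives equality, hence $|y| = 1$.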

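Theorem 3.2. Suppose $\xi$ is a unital q-positive boundary weight on $A(H)$ and $\alpha$ is the CP-flow over $K$ derived from $\pi$ associated with $\xi$. Suppose $\gamma$ is a flow corner from $\alpha$ to $\alpha$, which means that
$$\Theta_t\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} \alpha_t(A_{11}) & \gamma_t(A_{12}) \\ \gamma_t^*(A_{21}) & \alpha_t(A_{22}) \end{pmatrix}$$
for $t > 0$ and $A_{ij} \in B(H)$ for $i, j = 1, 2$ is a CP-flow over $K \oplus K$, and if $\Omega$ is the boundary weight map for $\Theta$ then $\Omega$ is of the form
$$\Omega\begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix} = \begin{pmatrix} \omega(\rho_{11}) & \sigma(\rho_{12}) \\ \sigma^*(\rho_{21}) & \omega(\rho_{22}) \end{pmatrix}$$
for $\rho_{ij} \in B(K)_*$ for $i, j = 1, 2$. Then there is a unique complex number $z$ with $|z| \le 1$ so that if $z \ne 1$ then $\sigma(\rho) = \omega_z(\rho)$ for $\rho \in B(K)_*$, and if $z = 1$ then there is a boundary weight $\xi$ so that $\sigma(\rho) = \omega_1(\rho) + \rho(\Delta)\xi$ for all $\rho \in B(K)_*$.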



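Proof. Assume the hypothesis and notation of the first paragraph of the theorem. Then from the previous lemma there is a unique $z \in \mathbb{C}$ with $|z| \le 1$ so that $\Theta$ as given in the previous lemma is derived from $\Pi_z$. Since $\Theta$ is derived from $\Pi_z$ we have, repeating the argument of Theorem 2.3, that
$$\sigma\bigl(\rho - z^{n+1}(\hat\Lambda\hat\pi)^{n+1}(\rho)\bigr) = z\hat\pi(\rho) + z^2\hat\pi\hat\Lambda\hat\pi(\rho) + \cdots + z^{n+1}\hat\pi(\hat\Lambda\hat\pi)^n(\rho).$$
Suppose $\rho(\Delta) = 0$. Then we have from Lemma 2.2 that $(\hat\Lambda\hat\pi)^n(\rho) \to 0$ as $n \to \infty$, so we have $\sigma(\rho) = z\hat\pi R(z\hat\Lambda\hat\pi)(\rho)$. Choose a positive $\rho_1$ so that $\rho_1(\Delta) = 1$ and we find
$$\sigma(\rho) = \sigma\bigl(\rho - \rho(\Delta)\rho_1\bigr) + \rho(\Delta)\sigma(\rho_1) = z\hat\pi R(z\hat\Lambda\hat\pi)(\rho) + \rho(\Delta)\bigl(\sigma(\rho_1) - z\hat\pi R(z\hat\Lambda\hat\pi)(\rho_1)\bigr).$$
Letting $\xi = \sigma(\rho_1) - z\hat\pi R(z\hat\Lambda\hat\pi)(\rho_1)$ we have $\sigma(\rho) = \omega_z(\rho) + \rho(\Delta)\xi$.

Now since $\sigma$ is derived from $z\pi$ we have $\sigma(\rho - z\hat\Lambda\hat\pi(\rho)) = z\hat\pi(\rho)$ for $\rho \in B(K)_*$, and since $\omega_z$ is also derived from $z\pi$ the same equation is true for $\omega_z$, from which it follows that
$$\rho(\Delta)\xi - z(\hat\Lambda\hat\pi\rho)(\Delta)\xi = (1 - z)\rho(\Delta)\xi = 0$$
for $\rho \in B(K)_*$. For $z \ne 1$ the only solution to this equation is $\xi = 0$. Hence, if $z \ne 1$ we have
$$\sigma(\rho) = \omega_z(\rho) = z\hat\pi(\rho) + z^2\hat\pi\hat\Lambda\hat\pi(\rho) + z^3\hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi(\rho) + \cdots$$
for $\rho \in B(K)_*$. □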
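The last step of the proof can be unpacked as follows. This is a sketch under two assumptions that are not spelled out at this point in the text: the duality $(\hat\Lambda\hat\pi\rho)(A) = \rho(\pi(\Lambda(A)))$ for $A \in B(K)$, and the fixed-point relation $\pi(\Lambda(\Delta)) = \Delta$, which we take from $\Delta = \lim_n (\pi\Lambda)^n(I)$. Write $\rho' = \rho - z\hat\Lambda\hat\pi\rho$. Since $\sigma$ and $\omega_z$ both send $\rho'$ to $z\hat\pi(\rho)$, the decomposition $\sigma = \omega_z + (\,\cdot\,)(\Delta)\,\xi$ gives

$$
0 = \sigma(\rho') - \omega_z(\rho') = \rho'(\Delta)\,\xi, \qquad
\rho'(\Delta) = \rho(\Delta) - z\,\rho\bigl(\pi(\Lambda(\Delta))\bigr) = (1 - z)\,\rho(\Delta).
$$

Choosing $\rho$ with $\rho(\Delta) = 1$, which is possible because $\Delta \ne 0$, yields $(1 - z)\xi = 0$, so $\xi = 0$ whenever $z \ne 1$.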

Theorem 3.3. Suppose $\xi$ is a unital q-positive boundary weight on $A(H)$ and $\alpha$ is the CP-flow over $K$ derived from $\pi$ associated with $\xi$. Suppose $\alpha^d$ is the minimal dilation of $\alpha$ to an $E_0$-semigroup of $B(H_1)$ as given in Theorem 1.6, so $\alpha_t(A) = W^*\alpha_t^d(WAW^*)W$ for $A \in B(H)$ and $t \ge 0$. Then there is a bijection from the units $U_z^1$ of $\alpha^d$ onto the units $U_z$ of $\alpha$ given by $W^*U_z^1(t)W = U_z(t)$ for $t \ge 0$ and $z \in \mathbb{C}$. Suppose $t \to C(t)$ is a local unitary cocycle which fixes $U_0^1$, so that $C(t)U_0^1(t) = U_0^1(t)$ for $t \ge 0$. Then $C(t)U_z^1(t) = U_z^1(t)$ for $t \ge 0$ and all $z \in \mathbb{C}$. This means that the action of the local unitary cocycles on the units contains no rotations.



Proof. Assume the hypothesis and notation of the theorem. Let $\gamma_t(A) = W^*C(t)\alpha_t^d(WAW^*)W$ for $A \in B(H)$ and $t \ge 0$. From Theorem 1.6 we have that $\gamma$ is a hyper-maximal flow corner from $\alpha$ to $\alpha$. Since $C(t)$ is a unitary local cocycle which fixes $U_0^1$, we know from the general properties of such cocycles discussed before Lemma 3.1 that there is a complex number $y$ of modulus one so that $C(t)U_w^1(t) = U^1_{yw}(t)$ for all $t \ge 0$ and $w \in \mathbb{C}$. Since $\gamma$ is a flow corner from $\alpha$ to $\alpha$ we know that there is a complex number $z$ so that
$$U_{zw}(t)A = e^{\frac{1}{2}t|w|^2(1-|z|^2)}\gamma_t(A)U_w(t)$$
for $w \in \mathbb{C}$, $t \ge 0$ and $A \in B(H)$. Then we have
$$\begin{aligned}
\gamma_t(A)U_w(t) &= W^*C(t)\alpha_t^d(WAW^*)WU_w(t) = W^*C(t)\alpha_t^d(WAW^*)U_w^1(t)W \\
&= W^*C(t)U_w^1(t)WAW^*W = W^*U^1_{yw}(t)WA = U_{yw}(t)A
\end{aligned}$$
for all $w \in \mathbb{C}$, $t \ge 0$ and $A \in B(H)$. Comparing the two equations we see $y = z$, so $|z| = 1$. To complete the proof of the theorem all we need do is to show $z = 1$. Suppose $|z| = 1$ and $z \ne 1$. Now we apply Theorem 3.2. Let $\Theta$ be the CP-flow described in the theorem. Assuming the notation of Theorem 3.2 we have $\sigma(\rho) = \omega_z(\rho)$ for $\rho \in B(K)_*$. We show this implies that $C$ is not unitary. From Theorem 1.6 we know $C$ is a unitary cocycle if and only if $\gamma$ is hyper-maximal. But $\gamma$ is not hyper-maximal, as can be seen as follows. Let $\Theta^1$ be the CP-semigroup of $B(H \oplus H)$ given by
$$\Theta_t^1\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} \eta_t(A_{11}) & \gamma_t(A_{12}) \\ \gamma_t^*(A_{21}) & \eta_t(A_{22}) \end{pmatrix}$$
for $t > 0$ and $A_{ij} \in B(H)$ for $i, j = 1, 2$, where $\eta$ is the minimal CP-flow derived from $\pi$. Note the boundary weight map $\Omega^1$ for $\Theta^1$ is of the form
$$\Omega^1\begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix} = \begin{pmatrix} \omega_1(\rho_{11}) & \omega_z(\rho_{12}) \\ \omega_{\bar{z}}(\rho_{21}) & \omega_1(\rho_{22}) \end{pmatrix}$$
for $\rho_{ij} \in B(K)_*$ for $i, j = 1, 2$. Now we see $\gamma$ is not hyper-maximal, since $\alpha \ge \eta$ and if $\gamma$ were hyper-maximal we would have $\alpha = \eta$. Hence, if $z \ne 1$ then $C$ is not unitary, so the action of the local unitary cocycles does not contain the rotations. □

4. Conclusion

Here we present our conclusions. Suppose $K$ is a separable Hilbert space and $H = K \otimes L^2(0, \infty)$ and $S$ is a unitary mapping from $H$ onto $K$ and $\pi(A) = SAS^*$ for $A \in B(H)$. Note



$\pi$ is an irreducible ∗-representation of $B(H)$ on $B(K)$. Suppose $\Lambda$ is the mapping of $B(K)$ into $B(H)$ given by
$$(\Lambda(A)F)(x) = e^{-x}AF(x)$$
for $A \in B(K)$ and $x \ge 0$ for all $K$-valued functions $F \in H$. Let
$$\Delta = \lim_{n \to \infty}(\pi\Lambda)^n(I)$$
where the limit exists in the sense of strong convergence since the terms are decreasing. Assume $\Delta \ne 0$. Assume further that for all $\rho \in B(K)_*$ with $\rho(\Delta) = 0$ we have $(\hat\Lambda\hat\pi)^n(\rho) \to 0$ as $n \to \infty$. It follows from this that if $\pi(\Lambda(A)) = A$ then $A = \lambda\Delta$. Then there are unital CP-semigroups $\alpha$ of $B(H)$ which are intertwined by the shifts $U$, and their boundary weight maps are given by
$$\omega(\rho) = \omega_1(\rho) + \rho(\Delta)\xi$$
for $\rho \in B(K)_*$, where $\omega_1$ is the boundary weight map for the minimal CP-flow derived from $\pi$, given by
$$\omega_1(\rho) = \hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi(\rho) + \hat\pi\hat\Lambda\hat\pi\hat\Lambda\hat\pi(\rho) + \cdots = \hat\pi R(\hat\Lambda\hat\pi)(\rho),$$
and $\xi \in A(H)_*$ is a positive boundary weight on
$$A(H) = (I - \Lambda)^{1/2}B(H)(I - \Lambda)^{1/2}$$
with $\xi(I - \Lambda) \le 1$, and $\alpha$ is unital (i.e. $\alpha_t(I) = I$ for $t \ge 0$) if and only if $\xi(I - \Lambda) = 1$. The boundary weight $\xi$ satisfies certain positivity conditions which we analyze in a separate paper. It was shown in [18, Theorem 4.62] that if $\nu$ is a positive element of $B(H)_*$ with $\nu(I) \le 1$ and $\xi$ is of the form
$$\xi = \bigl(1 - \nu(\Lambda(\Delta))\bigr)^{-1}R(\hat\pi\hat\Lambda)\nu$$
then $\omega$ as given above is the boundary weight map of a CP-flow $\alpha$, and $\alpha$ is unital if and only if $\nu(I) = 1$. If $\alpha$ is such a unital CP-flow then $\alpha$ has a Bhat dilation to an $E_0$-semigroup $\alpha^d$. This $E_0$-semigroup is of index one. The action of the local unitary cocycles on the units for this $E_0$-semigroup is not two-fold transitive. The Hilbert space for the dilation is of the form $H_1 = K_1 \otimes L^2(0, \infty)$, and if $U^1(t)$ is right translation by $t$ on $H_1$ then $U^1$ is a unit for $\alpha^d$, meaning $U^1(t)A = \alpha_t^d(A)U^1(t)$ for all $A \in B(H_1)$ and $t \ge 0$. If $C(t)$ is a unitary local cocycle for $\alpha^d$, so $C(t) \in \alpha_t^d(B(H_1))'$ for all $t \ge 0$ and the $C(t)$ are unitary operators satisfying the relation $C(t)\alpha_t^d(C(s)) = C(t + s)$ for $s, t \ge 0$, and if $C(t)U^1(t) = U^1(t)$ for $t \ge 0$, then $C(t) = I$ for all $t \ge 0$. This means the action of the gauge group on the units of $\alpha^d$ is a smaller group than for an $E_0$-semigroup of type I$_1$. Also, this means $\alpha^d$ is not cocycle conjugate to the tensor product of a semigroup of type II$_0$ with one of type I$_1$, for if this were the case the action of the gauge group on the units would contain all the Euclidean



transformations, just as the action for an $E_0$-semigroup of type I$_1$ does. The same reasoning applies to the $E_0$-semigroups of type II$_1$ corresponding to the examples of product systems constructed by Tsirelson in [22]. The full action of the gauge group on a type I$_1$ is the Euclidean group, whose action on $\mathbb{C}$ is given by $z \to az + b$ for $a, b \in \mathbb{C}$ and $|a| = 1$. In our examples we have the further restriction $a = 1$. Tsirelson has examples where there are the restrictions $a = 1$ and $\mathrm{Im}(b) = 0$. It is quite possible that in our case there are further restrictions. It may be that $b$ lies on a one-dimensional line, or there may even be the further restriction $b = 0$. This would be interesting since it would give an example of an action which is rigid. That means that if $C(t)$ is a local unitary cocycle and $U$ is a unit then $C(t)U(t) = e^{i\lambda t}U(t)$ for $t \ge 0$. We are somewhat embarrassed to report that in order to establish this result all that is required is to determine whether certain fairly simple first order differential equations with constant coefficients have a bounded solution or not. The equations are parameterized by the complex numbers $(a, b)$ with $|a| = 1$. We have shown that if $a \ne 1$ the equations have no solution. If the equations never have solutions the action is rigid. If the equations have solutions when $b$ lies on a one-dimensional line we are in the situation Tsirelson found, and if the equations have a solution for all $b$ then we are in the case where we have transitivity of the gauge group on the units but no two-fold transitivity. As the reader can probably guess, the feature that makes these equations interesting and difficult is that they involve infinitely many variables. We will present them in a longer and more detailed paper.

Acknowledgments

This work was partially carried out while D.M. was a Lecturer at the University of Pennsylvania and later a Technion Swiss Society Postdoctoral Fellow at the Technion in Haifa, Israel. He would like to thank R.T.P.
and his wife for their wonderful generosity and hospitality during his time in Philadelphia. He also thanks Baruch Solel at the Technion, for his hospitality and many interesting conversations, and the friendly staff of both institutions. Finally, the authors thank the anonymous referee for his helpful comments and suggestions.

References

[1] A. Alevras, R.T. Powers, G.L. Price, Cocycles for one-parameter flows of B(H), J. Funct. Anal. 230 (1) (2006) 1–64.
[2] W. Arveson, An addition formula for the index of semigroups of endomorphisms of B(H), Pacific J. Math. 137 (1) (1989) 19–36.
[3] W. Arveson, Continuous analogues of Fock space, Mem. Amer. Math. Soc. 80 (409) (1989), iv+66 pp.
[4] W. Arveson, On the index and dilations of completely positive semigroups, Internat. J. Math. 10 (7) (1999) 791–823.
[5] W.B. Arveson, Noncommutative Dynamics and E-Semigroups, Springer Monogr. Math., Springer-Verlag, New York, 2003.
[6] B.V.R. Bhat, An index theory for quantum dynamical semigroups, Trans. Amer. Math. Soc. 348 (2) (1996) 561–583.
[7] B.V.R. Bhat, Cocycles of CCR flows, Mem. Amer. Math. Soc. 149 (709) (2001), xx+114 pp.
[8] B.V.R. Bhat, M. Skeide, Tensor product systems of Hilbert modules and dilations of completely positive semigroups, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 3 (4) (2000) 519–575.
[9] M. Izumi, Every sum system is divisible, eprint, arXiv: 0708.1591.
[10] M. Izumi, A perturbation problem for the shift semigroup, J. Funct. Anal. 251 (2) (2007) 498–545.
[11] M. Izumi, R. Srinivasan, Generalized CCR flows, eprint, arXiv: 0705.3280.
[12] D. Markiewicz, On the product system of a completely positive semigroup, J. Funct. Anal. 200 (1) (2003) 237–280.
[13] P.S. Muhly, B. Solel, Quantum Markov processes (correspondences and dilations), Internat. J. Math. 13 (8) (2002) 863–906.
[14] R.T. Powers, A nonspatial continuous semigroup of ∗-endomorphisms of B(H), Publ. Res. Inst. Math. Sci. 23 (6) (1987) 1053–1069.
[15] R.T. Powers, An index theory for semigroups of ∗-endomorphisms of B(H) and type II1 factors, Canad. J. Math. 40 (1) (1988) 86–114.
[16] R.T. Powers, Induction of semigroups of endomorphisms of B(H) from completely positive semigroups of (n × n) matrix algebras, Internat. J. Math. 10 (7) (1999) 773–790.
[17] R.T. Powers, New examples of continuous spatial semigroups of ∗-endomorphisms of B(H), Internat. J. Math. 10 (2) (1999) 215–288.
[18] R.T. Powers, Continuous spatial semigroups of completely positive maps of B(H), New York J. Math. 9 (2003) 165–269 (electronic).
[19] R.T. Powers, G. Price, Continuous spatial semigroups of ∗-endomorphisms of B(H), Trans. Amer. Math. Soc. 321 (1) (1990) 347–361.
[20] B. Tsirelson, From slightly coloured noises to unitless product systems, eprint, arXiv: math/0006165.
[21] B. Tsirelson, From random sets to continuous tensor products: Answers to three questions of W. Arveson, eprint, arXiv: math.FA/0001070, 2000.
[22] B. Tsirelson, On automorphisms of type II Arveson systems (probabilistic approach), eprint, arXiv: math.OA/0411062, 2004.
[23] J. von Neumann, On infinite direct products, Compos. Math. 6 (1939) 1–77.


Amenability properties of the centres of group algebras

Ahmadreza Azimifard a,1, Ebrahim Samei b,2, Nico Spronk c,∗,3

a Fields Institute, 222 College Street, Toronto, Ontario, M5T 3J1, Canada
b Department of Mathematics and Statistics, University of Saskatchewan, 106 Wiggins Road, Saskatoon, Saskatchewan, S7N 5E6, Canada
c Department of Pure Mathematics, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada

Received 22 May 2008; accepted 29 November 2008 Available online 13 January 2009 Communicated by N. Kalton

Abstract

Let $G$ be a locally compact group, and $ZL^1(G)$ be the centre of its group algebra. We show that when $G$ is compact, $ZL^1(G)$ is not amenable when $G$ is either non-abelian and connected, or is a product of infinitely many finite non-abelian groups. We also study, for some non-compact groups $G$, some conditions which imply amenability and the hyper-Tauberian property for $ZL^1(G)$. © 2008 Published by Elsevier Inc.

Keywords: Compact group; Group algebra; Amenability

Let $G$ be a locally compact group and $L^1(G)$ denote the group algebra, i.e. the subalgebra of the measure algebra $M(G)$ consisting of measures which are absolutely continuous with respect to the left Haar measure. We let
$$ZL^1(G) = \{f \in L^1(G) : f * g = g * f \text{ for all } g \in L^1(G)\}$$
be the centre of $L^1(G)$. Our goal is to study amenability and weak amenability for $ZL^1(G)$.

* Corresponding author.

E-mail addresses: [email protected] (A. Azimifard), [email protected] (E. Samei), [email protected] (N. Spronk). 1 Research supported by the Workshop on Operator Algebras (2007), at the Fields Institute. 2 Research supported by an NSERC Post Doctoral Fellowship at University of Waterloo. 3 Research supported by NSERC Grant 312515-05. 0022-1236/$ – see front matter © 2008 Published by Elsevier Inc. doi:10.1016/j.jfa.2008.11.026

A. Azimifard et al. / Journal of Functional Analysis 256 (2009) 1544–1564


We show that when $G$ is compact, $ZL^1(G)$ is generally not amenable. In fact, it fails to be amenable whenever $G$ is either non-abelian and connected (Section 1.4), or when $G$ is a product of infinitely many non-abelian finite groups (Section 1.5). These results make substantial use of some discoveries from the intensive study of central idempotent measures on compact groups by D. Rider [24]. It is mentioned in [26] that it was known to B. E. Johnson that $ZL^1(G)$ fails to be amenable for some compact group $G$. However, no further information is provided. Our results, and the techniques therein, lead us towards the following.

Conjecture 0.1. If $G$ is compact, then $ZL^1(G)$ is amenable if and only if $G$ admits an open abelian subgroup.

We note that by [19], $G$ admits an open abelian subgroup if and only if the set of degrees of its irreducible representations is bounded. We address the above conjecture with an illustrative example (Section 1.6). As a complement to many of the methods used in the prior sections, we illustrate two examples using hypergroup techniques (Section 1.7). We close the article with a study of some non-compact groups. We use results of R.D. Mosak and J. Liukkonen [18,20,21] extensively. When the commutator of $G$ with the open subgroup which supports all elements of $ZL^1(G)$ is finite, then $ZL^1(G)$ is amenable (Section 2.2). When $G$ has relatively compact conjugacy classes, then $ZL^1(G)$ is hyper-Tauberian (Section 2.3). We outline the basic theory of hyper-Tauberian algebras below.

0.1. Amenability

If $A$ is a Banach algebra, we let $A \hat\otimes A$ denote the projective tensor product of $A$ with itself. Following B. E. Johnson [14], we say $A$ is amenable if it admits a bounded approximate diagonal (b.a.d.): a bounded net $(\mu_\alpha) \subset A \hat\otimes A$ which satisfies
$$m(\mu_\alpha)a,\; a\,m(\mu_\alpha) \to a \qquad \text{and} \qquad a \cdot \mu_\alpha - \mu_\alpha \cdot a \to 0$$
for $a$ in $A$, where $m : A \hat\otimes A \to A$ is the multiplication map, and the module actions of $A$ on $A \hat\otimes A$ are given on elementary tensors by $a \cdot (b \otimes c) = (ab) \otimes c$ and $(b \otimes c) \cdot a = b \otimes (ca)$. As shown in [14], amenability is equivalent to the existence of a virtual diagonal: an element $M$ in $(A \hat\otimes A)^{**}$ such that
$$a \cdot M = M \cdot a \qquad \text{and} \qquad a\,m^{**}(M) = m^{**}(M)\,a = a$$
for $a$ in $A$, where the module actions of $A$ on $(A \hat\otimes A)^{**}$ and $A^{**}$ are the second adjoints of the module actions of $A$ on $A \hat\otimes A$ and $A$, respectively, and $m^{**}$ is the second adjoint of the multiplication map. We can quantify amenability via the amenability constant, which was defined in [15]. Let
$$AM(A) = \inf\Bigl\{\sup_\alpha \|\mu_\alpha\| : (\mu_\alpha) \text{ is a b.a.d. for } A\Bigr\}$$
where we allow the infimum of an empty set to be $\infty$. The above definition is equivalent to a cohomological one: $A$ is amenable if every derivation into a dual Banach $A$-bimodule is inner; see [13] for more on this. We say $A$ is weakly amenable



if every bounded derivation into $A^*$ is inner. If $A$ is commutative, this is equivalent to having every bounded derivation into any symmetric bimodule be inner; see [2]. We will not directly conduct any computations with derivations. We note the important fact that $L^1(G)$ is amenable exactly when $G$ is an amenable group [13].

0.2. The hyper-Tauberian property

Let $A$ be a commutative semisimple Banach algebra. Suppose $A$ is regular on its spectrum $X$; we regard $A$ as an algebra of functions on $X$. If $\varphi \in A^*$ we define

\[ \operatorname{supp}\varphi = \Big\{ \chi \in X : \text{for every neighbourhood } U \text{ of } \chi \text{ there is } f \text{ in } A \text{ such that } \operatorname{supp} f \subset U \text{ and } \varphi(f) \neq 0 \Big\}. \]

A linear operator $T : A \to A^*$ is said to be local if $\operatorname{supp} Tf \subset \operatorname{supp} f$ for every $f$ in $A$. We say $A$ is hyper-Tauberian if every bounded local operator $T : A \to A^*$ is an $A$-module map. This concept was developed by the second named author [25] to study the reflexivity of the (completely bounded) derivation space of $A$. However, it has nice applications to weak amenability and spectral synthesis problems, which we summarise.

Theorem 0.2. If $A$ is hyper-Tauberian, then
(i) $A$ is weakly amenable;
(ii) finite subsets of $X$ are sets of spectral synthesis; and
(iii) if $A \hat\otimes A$ is semi-simple, then $\{(\chi,\chi) : \chi \in X\}$ is a set of local synthesis for that algebra, and hence is a set of spectral synthesis when $A$ has a bounded approximate identity.

See [25, Theorem 5, Corollary 8 and Theorem 6] for the proof.

1. Compact groups

1.1. Notation

In this section we let $G$ denote a compact group. Let $\hat G$ denote the set of equivalence classes of irreducible representations of $G$. By standard abuse of notation, we will use $\hat G$ to denote a set of representatives, one from each equivalence class. We let $d_\pi$ denote the dimension of $\pi$. We let $ZM(G)$ denote the centre of the measure algebra. Let, for $\pi$ in $\hat G$,

\[ \chi_\pi = \operatorname{Tr}\pi(\cdot) \quad\text{and}\quad \psi_\pi = \frac{1}{d_\pi}\chi_\pi \]

so $\chi_\pi$ is the character of $\pi$ and $\psi_\pi$ the normalised character with $\psi_\pi(e) = 1$. If $\mu \in M(G)$ we let

\[ \hat\mu(\pi) = \int_G \bar\pi(s)\,d\mu(s) \in B(H_\pi). \]



If $\mu \in ZM(G)$ then it is well known, and straightforward to compute, that

\[ \hat\mu(\pi) = \Big(\int_G \bar\psi_\pi\,d\mu\Big)\cdot I_{H_\pi} \]

and we then let

\[ \mu(\pi) = \int_G \bar\psi_\pi\,d\mu. \tag{1.1} \]

We note that the map taking $f$ to the function $\pi \mapsto f(\pi)$ defined by (1.1) is the Gelfand transform on $ZL^1(G)$.

1.2. Some functorial properties of the centre of the group algebra

We recall, as observed in [20, Prop. 1.5], that the map

\[ P = P_G : L^1(G) \to ZL^1(G), \qquad Pf(s) = \int_G f(tst^{-1})\,dt \tag{1.2} \]

is a surjective quotient map.

Proposition 1.1. $ZL^1(G) \hat\otimes ZL^1(G) \cong ZL^1(G \times G)$.

Proof. This follows from the fact that $P_G$ is a surjective quotient map and that in the identification $L^1(G) \hat\otimes L^1(G) \cong L^1(G \times G)$, we have that $P_G \otimes P_G = P_{G\times G}$. □

If $N$ is a closed normal subgroup of $G$, we have a map

\[ T_N : C(G) \to C(G/N), \qquad T_N f(sN) = \int_N f(sn)\,dn \]

for every $sN$ in $G/N$. This map extends to a surjective quotient map from $L^1(G)$ to $L^1(G/N)$ which we again denote $T_N$. See [23, Thm. 3.5.4].

Proposition 1.2. $T_N(ZL^1(G)) = ZL^1(G/N)$ and $T_N : ZL^1(G) \to ZL^1(G/N)$ is a surjective quotient map.

Proof. It is sufficient to verify that $T_N \circ P_G = P_{G/N} \circ T_N$ since each is a surjective quotient map. For $f \in C(G)$ we have for $s$ in $G$, using Weil's integral formula, that



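\begin{align*}
T_N \circ P_G f(sN) &= \int_N \int_G f(tsnt^{-1})\,dt\,dn \\
&= \int_N \int_{G/N} \int_N f\big(tn'sn(tn')^{-1}\big)\,dn'\,dtN\,dn \\
&= \int_{G/N} \int_N \int_N f\big(tn'sn(tn')^{-1}\big)\,dn\,dn'\,dtN \\
&= \int_{G/N} T_N f(tst^{-1}N)\,dtN = P_{G/N} \circ T_N f(sN).
\end{align*}

Since $C(G)$ is dense in $L^1(G)$ we are done. □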

Corollary 1.3. If $N$ is a closed normal subgroup of $G$ then $AM(ZL^1(G)) \geq AM(ZL^1(G/N))$. In particular, if $ZL^1(G)$ is amenable, then $ZL^1(G/N)$ is amenable.

Proof. If $(\mu_\alpha) \subset ZL^1(G) \hat\otimes ZL^1(G)$ is an approximate diagonal for $ZL^1(G)$, then it is a standard fact that $(T_N \otimes T_N(\mu_\alpha))$ is an approximate diagonal for $ZL^1(G/N)$. □

Let $\tilde m : M(G \times G) \to M(G)$ be given by

\[ \int_G u\,d\tilde m(\mu) = \int_{G\times G} u(st)\,d\mu(s,t) \quad\text{for } u \text{ in } C(G). \]

Then $\tilde m(\mu \otimes \nu) = \mu * \nu$ and $\tilde m$ is the weak* continuous extension of the multiplication map $m : L^1(G) \hat\otimes L^1(G) \cong L^1(G \times G) \to L^1(G)$.

Proposition 1.4.
(i) $ZM(G)$ is weak* closed in $M(G)$ and $ZL^1(G)$ is weak* dense in $ZM(G)$.
(ii) $\tilde m(ZM(G \times G)) = ZM(G)$ and $\tilde m : ZM(G \times G) \to ZM(G)$ is a homomorphism.

Proof. (i) The product on $M(G)$ is well known to be weak* continuous in each variable. Hence if $(\mu_\alpha)$ is a net contained in $ZM(G)$ with weak* limit point $\mu$, then for each $\nu$ in $M(G)$ we have

\[ \nu * \mu = \text{w*-}\lim_\alpha \nu * \mu_\alpha = \text{w*-}\lim_\alpha \mu_\alpha * \nu = \mu * \nu, \]

so $\mu \in ZM(G)$. Let $(U)$ be a base of neighbourhoods of the identity in $G$, each invariant for inner automorphisms. Then $(e_U) = \big(\tfrac{1}{\lambda(U)} 1_U\big)$ is a central approximate identity for $L^1(G)$. If $\mu \in ZM(G)$ then, for each $U$, $e_U * \mu \in ZL^1(G)$, and we have w*-$\lim_U e_U * \mu = \mu$.



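(ii) Suppose that $\mu \in ZM(G \times G)$. Then for any $u$ in $C(G)$ and $s$ in $G$ we have

\begin{align*}
\int_G u\,d\big(\delta_s * \tilde m(\mu) * \delta_{s^{-1}}\big) &= \int_G u(s^{-1}ts)\,d\tilde m(\mu)(t) = \int_{G\times G} u(s^{-1}xys)\,d\mu(x,y) \\
&= \int_{G\times G} u(s^{-1}xs\,s^{-1}ys)\,d\mu(x,y) = \int_{G\times G} u(xy)\,d\mu(x,y) \\
&= \int_G u\,d\tilde m(\mu)
\end{align*}

so $\tilde m(\mu) \in ZM(G)$. Hence $\tilde m(ZM(G \times G)) \subset ZM(G)$. Now if $\mu \in ZM(G \times G)$ and $\nu \in M(G \times G)$ then for $u$ in $C(G)$ we have

\begin{align*}
\int_G u\,d\tilde m(\mu * \nu) &= \int_{G\times G} u(st)\,d(\mu * \nu)(s,t) \\
&= \int_{G\times G}\int_{G\times G} u(ss'tt')\,d\mu(s,t)\,d\nu(s',t') \\
&= \int_{G\times G}\int_{G\times G} u(sts't')\,d\mu(s,t)\,d\nu(s',t'), \quad\text{since } \delta_{(e,s')} * \mu * \delta_{(e,s'^{-1})} = \mu \text{ for any } s' \\
&= \int_G\int_G u(xy)\,d\tilde m(\mu)(x)\,d\tilde m(\nu)(y) = \int_G u\,d\big(\tilde m(\mu) * \tilde m(\nu)\big).
\end{align*}

Observe that we actually proved that $\tilde m$ is a (left) $ZM(G \times G)$-module map. Since $\tilde m(\mu \otimes \delta_e) = \mu$, it follows that $\tilde m(ZM(G \times G)) = ZM(G)$. □

1.3. Approximate diagonals for centres of compact group algebras

We note that $ZTrig(G) = \operatorname{span}\{\chi_\pi : \pi \in \hat G\}$ is dense in $ZL^1(G)$. To see this, we first recall that the set $Trig(G) = \operatorname{span}\{\pi_{ij} : i, j = 1, \ldots, d_\pi,\ \pi \in \hat G\}$ of matrix coefficients is dense in $L^1(G)$. It is easily checked that $P\pi_{ij} = \psi_\pi = d_\pi^{-1}\chi_\pi$ for each $\pi_{ij}$ where $P$ is the map defined in (1.2). Then if $(u_n) \subset Trig(G)$ converges to $f$ in $ZL^1(G)$, we have $\lim_n Pu_n = Pf = f$.

Lemma 1.5. There exists a net $(f_\beta)$ in $ZTrig(G)$ such that $(f_\beta)$ is a bounded approximate identity for $L^1(G)$. Moreover, if for each $\beta$ we have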

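\[ f_\beta = \sum_{\pi\in\hat G} a_\pi^\beta \chi_\pi \]

where $a_\pi^\beta = 0$ except for finitely many elements $\pi$, then for each $\pi$ in $\hat G$ we have

\[ \lim_\beta a_\pi^\beta = d_\pi. \]

Proof. Let $(e_U)$ be the bounded approximate identity for $ZL^1(G)$ specified in the proof of Proposition 1.4(i). Since $ZTrig(G)$ is dense in $ZL^1(G)$ we can find for each $\varepsilon > 0$ and $U$ as above, $f_{\varepsilon,U} \in ZTrig(G)$ such that $\|f_{\varepsilon,U} - e_U\|_1 < \varepsilon$. Then $(f_\beta) = (f_{\varepsilon,U})$ is the desired bounded approximate identity. Since for each $\pi$ in $\hat G$ we have

\[ \frac{a_\pi^\beta}{d_\pi}\chi_\pi = f_\beta * \chi_\pi \xrightarrow{\ \beta\ } \chi_\pi \]

it follows that $\lim_\beta a_\pi^\beta = d_\pi$. □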
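The proof above rests on the convolution relation $\chi_\pi * \chi_\sigma = \delta_{\pi,\sigma} d_\pi^{-1}\chi_\pi$ (with normalised Haar measure), which is what turns $f_\beta * \chi_\pi$ into $(a_\pi^\beta/d_\pi)\chi_\pi$. As a sanity check, the following sketch verifies this relation by brute force for the finite group $S_3$. The choice of $S_3$, its three irreducible characters as coded here, and all function names are assumptions made for illustration; they do not come from the paper.

```python
from fractions import Fraction
from itertools import permutations

# Elements of S3 as permutation tuples; p[i] is the image of i (illustrative choice).
G = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """Sign character, computed from the number of inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def standard(p):
    """Character of the 2-dimensional representation: fixed points minus one."""
    return sum(1 for i in range(3) if p[i] == i) - 1

characters = {"trivial": lambda p: 1, "sign": sign, "standard": standard}

def convolve(f, g):
    """(f * g)(x) = (1/|G|) sum_y f(y) g(y^{-1} x), i.e. normalised Haar measure."""
    return {
        x: Fraction(sum(f(y) * g(compose(inverse(y), x)) for y in G), len(G))
        for x in G
    }

def check_relations():
    """Check chi_pi * chi_sigma = delta_{pi,sigma} chi_pi / d_pi on all of S3."""
    for name_p, chi_p in characters.items():
        for name_s, chi_s in characters.items():
            product = convolve(chi_p, chi_s)
            d = chi_p(IDENTITY)
            for x in G:
                expected = Fraction(chi_p(x), d) if name_p == name_s else 0
                if product[x] != expected:
                    return False
    return True

print(check_relations())
```

Exact rational arithmetic is used so that the comparison with $\chi_\pi/d_\pi$ is an equality rather than a floating-point tolerance.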

We recall from [9, (27.43)] that $\widehat{G \times G} = \{\pi \times \sigma : \pi, \sigma \in \hat G\}$.

Theorem 1.6. Let $G$ be a compact group and $(f_\beta)$ be as in Lemma 1.5, above. For each $\beta$ define

\[ \mu_\beta = \sum_{\pi\in\hat G} (a_\pi^\beta)^2\, \chi_\pi \otimes \chi_\pi \quad\text{in } ZL^1(G) \hat\otimes ZL^1(G). \]

Then $(\mu_\beta)$ is an approximate diagonal for $ZL^1(G)$. Moreover, the following are equivalent:
(i) $(\mu_\beta)$ is bounded;
(ii) $ZL^1(G)$ is amenable; and
(iii) there is a measure $\mu$ in $ZM(G \times G)$ which satisfies

\[ \mu(\pi \times \sigma) = \delta_{\pi,\sigma} \tag{1.3} \]

where $\delta$, in this context, is the Kronecker symbol. For such $\mu$ we have that $\tilde m(\mu) = \delta_e$ and $(f \otimes \delta_e) * \mu = \mu * (\delta_e \otimes f)$ for $f$ in $ZL^1(G)$.

Note that we thus have that $ZL^1(G)$ is pseudo-amenable, in the sense defined in [5].

Proof. It is clear that for each $\pi$ in $\hat G$ and each $\beta$ we have $\chi_\pi \cdot \mu_\beta = \mu_\beta \cdot \chi_\pi$. Since $ZTrig(G)$ is dense in $ZL^1(G)$, it follows that $f \cdot \mu_\beta = \mu_\beta \cdot f$ for each $f$ in $ZL^1(G)$ too. Also

\[ m(\mu_\beta) = \sum_{\pi\in\hat G} \frac{(a_\pi^\beta)^2}{d_\pi}\chi_\pi = f_\beta * f_\beta \]

so $(m(\mu_\beta))$ is a bounded approximate identity. Thus $(\mu_\beta)$ is an approximate diagonal for $ZL^1(G)$. It is immediate that (i) ⇒ (ii).

(ii) ⇒ (i). If we suppose that $ZL^1(G)$ is amenable, it admits a bounded approximate diagonal $(\mu'_\gamma)$. We may assume $(\mu'_\gamma)$ is weakly Cauchy, i.e. it converges to a virtual diagonal $M$ in



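$(ZL^1(G) \hat\otimes ZL^1(G))^{**}$. With $(f_\beta)$ as in the lemma above, let for each $\beta$, $F_\beta = \{\pi \in \hat G : a_\pi^\beta \neq 0\}$ and $A_\beta = \operatorname{span}\{\chi_\pi : \pi \in F_\beta\}$. Then $A_\beta \otimes A_\beta$ is a finite dimensional ideal in $ZL^1(G) \hat\otimes ZL^1(G)$ which contains $f_\beta \otimes f_\beta$. Then $((f_\beta \otimes f_\beta) * \mu'_\gamma)_\gamma$ is a bounded net in $A_\beta \otimes A_\beta$ with limit point $\mu'_\beta$. Write

\[ \mu'_\beta = \sum_{\pi,\sigma\in F_\beta} c^\beta_{\pi,\sigma}\, \chi_\pi \otimes \chi_\sigma. \]

Then for any $\pi \in F_\beta$, using that $f_\beta * \chi_\pi = \chi_\pi * f_\beta$, we have

\[ \chi_\pi \cdot \mu'_\beta = (f_\beta \otimes f_\beta) * \lim_\gamma \chi_\pi \cdot \mu'_\gamma = (f_\beta \otimes f_\beta) * \lim_\gamma \mu'_\gamma \cdot \chi_\pi = \mu'_\beta \cdot \chi_\pi \]

and thus

\[ \sum_{\sigma\in F_\beta} \frac{c^\beta_{\pi,\sigma}}{d_\pi}\, \chi_\pi \otimes \chi_\sigma = \sum_{\sigma\in F_\beta} \frac{c^\beta_{\sigma,\pi}}{d_\pi}\, \chi_\sigma \otimes \chi_\pi. \]

It follows from the orthogonality relations of the characters that $c^\beta_{\pi,\sigma} = 0$ if $\sigma \neq \pi$ and hence

\[ \mu'_\beta = \sum_{\pi\in F_\beta} c^\beta_{\pi,\pi}\, \chi_\pi \otimes \chi_\pi. \]

Since $m(\mu'_\beta) = m(f_\beta \otimes f_\beta) * \lim_\gamma m(\mu'_\gamma) = f_\beta * f_\beta$ we obtain

\[ \sum_{\pi\in F_\beta} \frac{c^\beta_{\pi,\pi}}{d_\pi}\chi_\pi = \sum_{\pi\in F_\beta} \frac{(a_\pi^\beta)^2}{d_\pi}\chi_\pi \]

and thus $c^\beta_{\pi,\pi} = (a_\pi^\beta)^2$ for each $\pi$ in $F_\beta$. Then for each $\beta$ we have $\mu_\beta = \mu'_\beta$, so

\[ \|\mu_\beta\| = \|\mu'_\beta\| \leq \|f_\beta\|_1^2\, \sup_\gamma \|\mu'_\gamma\| \]

and, since $(f_\beta)$ is bounded, $(\mu_\beta)$ is bounded too.

(i) ⇒ (iii). Using Proposition 1.1 we identify $(\mu_\beta)$ as a bounded net in $M(G \times G)$. It thus has a weak* cluster point $\mu$. We note that $\mu$ is, in fact, a limit point. Indeed, $Trig(G \times G)$ is uniformly dense in $C(G \times G)$, and if $u \in Trig(G \times G)$ it is clear that

\begin{align*}
\sum_{\pi\in\hat G} d_\pi^2 \int_{G\times G} u(s,t)\chi_\pi(s)\chi_\pi(t)\,d(s,t) &= \lim_\beta \sum_{\pi\in\hat G} (a_\pi^\beta)^2 \int_{G\times G} u(s,t)\chi_\pi(s)\chi_\pi(t)\,d(s,t) \\
&= \lim_\beta \langle u, \mu_\beta \rangle
\end{align*}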



as all sums in the expression are finite. Moreover, the above expression must be $\int_{G\times G} u(s,t)\,d\mu(s,t)$. By Proposition 1.4, $\mu \in ZM(G \times G)$. By (1.1) we find

\[ \mu(\pi \times \sigma) = \frac{1}{d_\pi d_\sigma}\int_{G\times G} \bar\chi_\pi(s)\bar\chi_\sigma(t)\,d\mu(s,t) = \delta_{\pi,\sigma}. \]

Let $R : C(\hat G \times \hat G) \to C(\hat G)$ be the map of restriction to the diagonal: $Ru(\pi) = u(\pi,\pi)$. Note that for any $\nu$ in $ZM(G \times G)$, the transform of $\tilde m(\nu)$ is $R$ applied to the transform of $\nu$. Thus we have for any $\pi$ in $\hat G$ that

\[ \tilde m(\mu)(\pi) = R\mu(\pi) = 1 = \delta_e(\pi) \]

so $\tilde m(\mu) = \delta_e$. Also, if $f \in ZL^1(G)$ and $\pi, \sigma \in \hat G$ then $(f \otimes \delta_e)(\pi \times \sigma) = f(\pi)$ while $(\delta_e \otimes f)(\pi \times \sigma) = f(\sigma)$. We thus have

\[ \big((f \otimes \delta_e) * \mu\big)(\pi \times \sigma) = f(\pi)\delta_{\pi,\sigma} = \big(\mu * (\delta_e \otimes f)\big)(\pi \times \sigma) \]

so it follows that $(f \otimes \delta_e) * \mu = \mu * (\delta_e \otimes f)$.

(iii) ⇒ (ii). Let $(f_\alpha)$ be any bounded approximate identity in $ZL^1(G \times G)$. We will show that any weak* cluster point $M$ of $(\mu * f_\alpha)$ in $(ZL^1(G \times G))^{**}$ is a virtual diagonal. We may assume $M$ is a limit point. First, if $f \in ZL^1(G)$ we have

\[ f \cdot M = \lim_\alpha (f \otimes \delta_e) * \mu * f_\alpha = \lim_\alpha \mu * (\delta_e \otimes f) * f_\alpha = \lim_\alpha \mu * f_\alpha * (\delta_e \otimes f) = M \cdot f. \]

Second, we note it follows from Proposition 1.4 that $(m(f_\alpha))$ is a bounded approximate identity for $ZL^1(G)$. We let $E$ be any weak* cluster point of $(m(f_\alpha))$, which we may consider to be a limit point. We then have, again by Proposition 1.4, and using $\tilde m(\mu) = \delta_e$, that

\[ m^{**}(M) = \lim_\alpha m(\mu * f_\alpha) = \lim_\alpha \tilde m(\mu) * m(f_\alpha) = \lim_\alpha m(f_\alpha) = E. \]

It is clear that $f \cdot E = E \cdot f = f$ for $f \in ZL^1(G)$. Thus $M$ is a virtual diagonal. □

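Note that if $G$ is abelian, then $\mu$ is the Haar measure of the anti-diagonal subgroup $A = \{(s, s^{-1}) : s \in G\}$. Indeed, if we denote the latter by $\lambda_A$ then we have for $\chi, \psi$ in $\hat G$

\[ \lambda_A(\chi \times \psi) = \int_G \bar\chi(s)\bar\psi(s^{-1})\,ds = \int_G \bar\chi(s)\psi(s)\,ds = \delta_{\chi,\psi} = \mu(\chi \times \psi) \]

and hence $\mu = \lambda_A$. Though the definition of $\lambda_A$, as above, makes sense for any compact group, it forms a central measure only when $G$ is abelian.

Suppose $d_G = \sup_{\pi\in\hat G} d_\pi < \infty$. Then for $u, v$ in $Trig(G)$ we use the Cauchy–Schwarz inequality and Bessel's inequality on the orthonormal set $\{\chi_\pi : \pi \in \hat G\}$ to see that for the approximate diagonal $(\mu_\beta)$ in the theorem above we have

\begin{align*}
\Big|\lim_\beta \int_{G\times G} u(s)v(t)\mu_\beta(s,t)\,ds\,dt\Big| &= \Big|\sum_{\pi\in\hat G} d_\pi^2 \int_G u(s)\chi_\pi(s)\,ds \cdot \int_G v(t)\chi_\pi(t)\,dt\Big| \\
&\leq d_G^2 \sum_{\pi\in\hat G} |\langle u \mid \chi_{\bar\pi}\rangle|\,|\langle v \mid \chi_{\bar\pi}\rangle| \\
&\leq d_G^2\, \|u\|_2 \|v\|_2 \\
&\leq d_G^2\, \|u\|_\infty \|v\|_\infty.
\end{align*}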

Since $Trig(G)$ is dense in $C(G)$, it follows that $(\mu_\beta)$ converges to a bimeasure in the terminology of [7], i.e. an element $\mu$ of $(C(G) \hat\otimes C(G))^*$. Conjecture 0.1, if true, would further imply that $\mu \in M(G \times G)$.

1.4. Connected groups

Theorem 1.7. If $G$ is a non-abelian connected compact group, then $ZL^1(G)$ is not amenable.

Proof. There is a family $\{G_i\}_{i\in I}$ of compact connected Lie groups, at least one of which is simple (in the sense of Lie groups) with finite centre, such that

\[ G \cong \Big(\prod_{i\in I} G_i\Big)\Big/A \]

where $A$ is a central subgroup of $P = \prod_{i\in I} G_i$. See [22, 6.5.6], for example. Hence $G$ admits, as a quotient,

\[ \prod_{i\in I} G_i/Z(G_i) \cong P/Z(P) \cong (P/A)\big/\big(Z(P)/A\big). \]

Let $i_0$ be such that $G_{i_0}$ is simple with finite centre. Then $G_{i_0}/Z(G_{i_0})$ is simple with trivial centre. Hence there is a closed normal subgroup $N$ of $G$ such that $G/N$ is a simple Lie group with trivial centre. By [24, Lem. 9.1] we obtain "Condition I" on $G/N$, which is the property that

\[ \lim_{d_\pi\to\infty} \psi_\pi(sN) = 0 \quad\text{for } sN \in G/N \setminus \{eN\}. \]

Hence, there is a sequence $\{\pi_n\}_{n=1}^\infty \subset \hat G$ such that

\[ \psi_n(s) = 1 \text{ for } s \text{ in } N \quad\text{and}\quad \lim_{n\to\infty} \psi_n(s) = 0 \text{ for } s \in G \setminus N \tag{1.4} \]

where $\psi_n = \psi_{\pi_n}$. Indeed, choose any sequence of representations $\{\tilde\pi_n\}_{n=1}^\infty \subset \widehat{G/N}$ where $\lim_{n\to\infty} d_{\tilde\pi_n} = \infty$, and let $\pi_n = \tilde\pi_n \circ q$ where $q : G \to G/N$ is the quotient map.

If it were the case that $ZL^1(G)$ were amenable, then we would obtain $\mu$ in $ZM(G \times G)$ as in (1.3). Let us see that the existence of such $\mu$ gives a contradiction. Let $N$ and $(\psi_n)$ be as in (1.4). Define two sequences $(u_n)$ and $(v_n)$ of functions on $G \times G$ by

\[ u_n = \psi_n \otimes \psi_n \quad\text{and}\quad v_n = \psi_n \otimes \psi_{n+1}. \]



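Then $(u_n)$ and $(v_n)$ are bounded sequences with

\[ \lim_{n\to\infty} u_n(s,t) = \lim_{n\to\infty} v_n(s,t) = \begin{cases} 1 & \text{if } (s,t) \in N \times N, \\ 0 & \text{if } (s,t) \notin N \times N. \end{cases} \]

Hence it follows from the Lebesgue dominated convergence theorem that

\[ \lim_{n\to\infty} \int_{G\times G} u_n\,d\mu = \mu(N \times N) = \lim_{n\to\infty} \int_{G\times G} v_n\,d\mu. \tag{1.5} \]

However, by (1.3) we have that

\[ \int_{G\times G} u_n\,d\mu = \mu(\bar\pi_n \times \bar\pi_n) = 1 \quad\text{while}\quad \int_{G\times G} v_n\,d\mu = \mu(\bar\pi_n \times \bar\pi_{n+1}) = 0, \]

which contradicts (1.5). □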

1.5. Products of finite groups

Let $G$ be a finite group. We will treat $G$ as a compact group so we have normalised Haar integral: $\int_G f = \frac{1}{|G|}\sum_{s\in G} f(s)$. Then it is well known that

\[ ZL^1(G) = \operatorname{span}\{\chi_\pi : \pi \in \hat G\}. \tag{1.6} \]

Moreover, if we let for any $x$ in $G$, $C_x = \{sxs^{-1} : s \in G\}$ denote the conjugacy class, and $Conj(G) = \{C_x : x \in G\}$, then since elements of $ZL^1(G)$ are constant on conjugacy classes we have

\[ ZL^1(G) = \operatorname{span}\{1_C : C \in Conj(G)\} \tag{1.7} \]

where $1_C$ is the indicator function of $C$. We will let $f(C) = f(x)$ where $C = C_x$, for $f \in ZL^1(G)$.

Theorem 1.8. If $G$ is a finite group, then $ZL^1(G)$ has a unique diagonal and we have

\[ AM(ZL^1(G)) = \frac{1}{|G|^2} \sum_{C,C'\in Conj(G)} \Big|\sum_{\pi\in\hat G} d_\pi^2\, \chi_\pi(C)\chi_\pi(C')\Big|\, |C||C'|. \]

Proof. That

\[ \mu = \sum_{\pi\in\hat G} d_\pi^2\, \chi_\pi \otimes \chi_\pi \]

is a unique diagonal for $ZL^1(G)$ follows from the proof of Theorem 1.6. However, using the relations $\chi_\pi * \chi_\sigma = \delta_{\pi,\sigma} d_\pi^{-1}\chi_\pi$ for $\pi, \sigma$ in $\hat G$, that $\mu$ is a diagonal is easily verified manually using (1.6). The uniqueness of the diagonal in any amenable finite dimensional commutative algebra has been observed in [6, Prop. 0.2].

If $C \in Conj(G)$ with $C = C_x$, we let $\bar C = C_{x^{-1}}$. The operation $C \mapsto \bar C$ is an involution on $Conj(G)$. If $\pi \in \hat G$ then $\chi_\pi(\bar C) = \overline{\chi_\pi(C)}$. We appeal to (1.7) to obtain

\begin{align*}
\mu &= \sum_{\pi\in\hat G} d_\pi^2 \Big(\sum_{C\in Conj(G)} \chi_\pi(C) 1_C\Big) \otimes \Big(\sum_{C'\in Conj(G)} \chi_\pi(C') 1_{C'}\Big) \\
&= \sum_{C,C'\in Conj(G)} \Big(\sum_{\pi\in\hat G} d_\pi^2\, \chi_\pi(C)\chi_\pi(C')\Big)\, 1_C \otimes 1_{C'}.
\end{align*}

We then compute $AM(ZL^1(G)) = \|\mu\|_1$ to finish. □
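To make the formula of Theorem 1.8 concrete, the following sketch evaluates it for two small groups from their character tables. The character tables of $S_3$ and $\mathbb{Z}_2$ are supplied by hand, the function name is hypothetical, and the code assumes real-valued characters, so it is an illustration of the formula rather than a general implementation.

```python
from fractions import Fraction

def amenability_constant(class_sizes, characters):
    """Evaluate (1/|G|^2) * sum_{C,C'} |sum_pi d_pi^2 chi_pi(C) chi_pi(C')| |C||C'|.

    class_sizes[i] is |C_i|; characters[p][i] is chi_p(C_i), with the identity
    class in column 0 so that d_p = characters[p][0].
    Assumption: all character values are real integers.
    """
    order = sum(class_sizes)
    total = 0
    for i, size_i in enumerate(class_sizes):
        for j, size_j in enumerate(class_sizes):
            coeff = sum(chi[0] ** 2 * chi[i] * chi[j] for chi in characters)
            total += abs(coeff) * size_i * size_j
    return Fraction(total, order ** 2)

# S3: classes = identity, transpositions, 3-cycles.
s3_sizes = [1, 3, 2]
s3_chars = [
    [1, 1, 1],    # trivial
    [1, -1, 1],   # sign
    [2, 0, -1],   # 2-dimensional
]

# Z2 (abelian): every class is a singleton.
z2_sizes = [1, 1]
z2_chars = [[1, 1], [1, -1]]

print(amenability_constant(s3_sizes, s3_chars))  # 7/3
print(amenability_constant(z2_sizes, z2_chars))  # 1
```

The abelian case returns 1, while the non-abelian group $S_3$ already gives a value strictly larger than 1, in line with Corollary 1.9 below.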

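Corollary 1.9. If $G$ is a non-abelian finite group, then $AM(ZL^1(G)) > 1$.

Proof. Letting $C' = \bar C$ we obtain the lower bound

\begin{align*}
AM(ZL^1(G)) &\geq \frac{1}{|G|^2} \sum_{C\in Conj(G)} \sum_{\pi\in\hat G} d_\pi^2\, |\chi_\pi(C)|^2\, |C|^2 \\
&= \frac{1}{|G|} \sum_{\pi\in\hat G} d_\pi^2 \sum_{C\in Conj(G)} \frac{|C|\,|\chi_\pi(C)|^2}{|G|}\, |C|.
\end{align*}

Since $G$ is non-abelian we have $|C| > 1$ for some conjugacy class $C$. Moreover, there is some $\pi$ so $\chi_\pi(C) \neq 0$. Thus we find

\[ AM(ZL^1(G)) > \frac{1}{|G|} \sum_{\pi\in\hat G} d_\pi^2 \sum_{C\in Conj(G)} \frac{|\chi_\pi(C)|^2\, |C|}{|G|} = \frac{1}{|G|} \sum_{\pi\in\hat G} d_\pi^2\, \|\chi_\pi\|_2^2 = 1 \]

since $\|\chi_\pi\|_2 = 1$ and $\sum_{\pi\in\hat G} d_\pi^2 = |G|$. □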
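The two facts used at the end of this proof, $\|\chi_\pi\|_2 = 1$ and $\sum_\pi d_\pi^2 = |G|$, together with the lower bound itself, can be checked on a small example. The sketch below does this for $S_3$; the hand-entered character table and the function names are assumptions for illustration, and real-valued characters are assumed.

```python
from fractions import Fraction

# Character table of S3 (assumed input): classes = identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]
characters = [
    [1, 1, 1],    # trivial
    [1, -1, 1],   # sign
    [2, 0, -1],   # 2-dimensional
]
order = sum(class_sizes)

def inner_product(chi, psi):
    """<chi, psi> = (1/|G|) sum_C |C| chi(C) psi(C), for real-valued characters."""
    return Fraction(sum(n * a * b for n, a, b in zip(class_sizes, chi, psi)), order)

def is_orthonormal():
    return all(
        inner_product(chi, psi) == (1 if p == q else 0)
        for p, chi in enumerate(characters)
        for q, psi in enumerate(characters)
    )

def lower_bound():
    """(1/|G|^2) sum_C sum_pi d_pi^2 |chi_pi(C)|^2 |C|^2, as in the proof of Corollary 1.9."""
    total = sum(
        chi[0] ** 2 * chi[i] ** 2 * n ** 2
        for chi in characters
        for i, n in enumerate(class_sizes)
    )
    return Fraction(total, order ** 2)

print(is_orthonormal())                         # True
print(sum(chi[0] ** 2 for chi in characters))   # 6
print(lower_bound())                            # 5/3
```

For $S_3$ the bound evaluates to $5/3$, which is strictly larger than 1 as the corollary asserts.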

Let us take a second look at the proof of the above corollary. The Schur orthogonality relations tell us that the $\hat G \times Conj(G)$ matrix

\[ U = \Big[\frac{|C|^{1/2}}{|G|^{1/2}}\chi_\pi(C)\Big] \]

is unitary. Letting $C' = \bar C$ we obtain the lower bound

\begin{align*}
AM(ZL^1(G)) &\geq \frac{1}{|G|} \sum_{C\in Conj(G)} \sum_{\pi\in\hat G} d_\pi^2\, \frac{|C|\,|\chi_\pi(C)|^2}{|G|}\, |C| \\
&= \frac{1}{|G|} \big\| \operatorname{diag}(d_\pi)\, U \operatorname{diag}(|C|^{1/2}) \big\|_2^2
\end{align*}

where $\|\cdot\|_2$ denotes the Hilbert–Schmidt norm. Is it possible to get a lower estimate in terms of $\max_{\pi\in\hat G} d_\pi$? If so, Conjecture 0.1 may be shown to hold for compact totally disconnected groups which do not admit an open abelian subgroup.

Theorem 1.10. If $G = \prod_{i=1}^\infty G_i$ where each $G_i$ is a non-abelian finite group, then $ZL^1(G)$ is not amenable.

Proof. For each $i$, the diagonal $\mu_i$ for $ZL^1(G_i)$ promised by Theorem 1.8 is an idempotent in $ZL^1(G \times G)$. Hence by [24, Thm. 5.3], there is a constant $\delta > 0$ (in fact $\delta \geq 1/700$) for which

\[ AM(ZL^1(G_i)) \geq 1 + \delta \]

for each $i$. Since $G$ admits, for each $n$, $G^{(n)} = \prod_{i=1}^n G_i$ as a quotient, we have that

\[ AM(ZL^1(G)) \geq AM(ZL^1(G^{(n)})) = \prod_{i=1}^n AM(ZL^1(G_i)) \geq (1 + \delta)^n. \]

Hence we have that $AM(ZL^1(G)) = \infty$ and $ZL^1(G)$ is not amenable. □

1.6. An amenable example

The following example further illustrates Conjecture 0.1. Let $G = \mathbb{T} \rtimes \mathbb{Z}_2$ where $\mathbb{T} = \{s \in \mathbb{C} : |s| = 1\}$ and $\mathbb{Z}_2 = \{-1, 1\}$. The group law and inverse are given by

\[ (s,a)(t,b) = (st^a, ab) \quad\text{and}\quad (s,a)^{-1} = (s^{-a}, a) \]

for $(s,a), (t,b)$ in $G$. An application of the "Mackey machine," see [4, Sec. 6.6] for example, gives us $\hat G = \{1, \sigma, \pi_n : n \in \mathbb{N}\}$ where

\[ 1(s,a) = 1, \qquad \sigma(s,a) = a \qquad\text{and}\qquad \pi_n(s,a) = \begin{pmatrix} s^n & 0 \\ 0 & s^{-n} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^{(1-a)/2} \]

for $(s,a)$ in $G$. It follows that we have normalised characters $1$, $\sigma$ and

\[ \psi_{\pi_n}(s,a) = \begin{cases} \tfrac{1}{2}(s^n + s^{-n}) & \text{if } a = 1, \\ 0 & \text{if } a = -1 \end{cases} \]

for $(s,a)$ in $G$. We note that all of the calculations thus far, and hence the next proposition, also hold if $\mathbb{T}$ is replaced by any compact abelian group $T$ admitting only $1$ as a real character, i.e. for $\chi$ in $\hat T$, $\bar\chi = \chi$ implies $\chi = 1$. For sake of concreteness, we will continue with $T = \mathbb{T}$.



Theorem 1.11. For G = T Z2 , ZL1 (G) is amenable. Proof. Let μ = 1 ⊗ 1 + σ ⊗ σ − 2(1 + σ ) ⊗ (1 + σ ) + ν, where ν = λD + λA , the sum of the Haar measures on the subgroups of G × G given by D = {((t, 1), (t, 1)): t ∈ T} and A = {((t, 1), (t −1 , 1)): t ∈ T}, each normalised to have total mass 1. We note that ν ∈ ZM(G) since for (s, a) in G we have δ((s,a),(t,b)) ∗ ν ∗ δ((s −a ,a),(t −b ,b)) = δ((1,a),(1,b)) ∗ δ((s a ,1),(t b ,1)) ∗ ν ∗ δ((s −a ,1),(t −b ,1)) ∗ δ((1,a),(1,b)) = ν. we have Thus μ ∈ ZM(G). Now for π, ρ in G

1 ⊗ 1 + σ ⊗ σ − 2(1 + σ ) ⊗ (1 + σ ) · ψπ ⊗ ψρ

μ(π × ρ) = G×G

+

ψπ (s, 1)ψρ (s, 1) + ψπ (s, 1)ψρ s −1 , 1 ds

T

= (1) + (2) where (1) =

−1 if (π, ρ) = (1, 1), (σ, σ ), −2 if (π, ρ) = (1, σ ), (σ, 1), 0 if (π, ρ) = (1, πn ), (πn , 1), (σ, πn ), (πn , σ ), (πn , πm ), n, m ∈ N

and (2) = 2

ψπ (s, 1)ψρ (s, 1) ds T

2 if (π, ρ) = (1, 1), (σ, σ ), (1, σ ), (σ, 1), = 1 if (π, ρ) = (πn , πn ), n ∈ N, 0 if (π, ρ) = (1, πn ), (πn , 1), (σ, πn ), (πn , σ ), (πn , πm ), n = m, n, m ∈ N. Thus it follows that μ satisfies (1.3). We remark that the measure μ corresponds to the (formal) Fourier series

∞

1 1 ⊗ 1 + σ ⊗ σ − 2(1 + σ ) ⊗ (1 + σ ) + 4 (1 + σ ) ⊗ (1 + σ ) + χπn ⊗ χπn 2

n=1

as suggested by Theorem 1.6. The coefficient 4, in the last term, is 1/λ(G(1,1) × G(1,1) ), where G(1,1) ∼ = T is the connected component of the identity in G. The last term corresponds to the Fourier series for λA + λD on T × T, as may be revealed by a simple computation which we leave to the reader. 2



Let us make a few observations about $G = \mathbb{T} \rtimes \mathbb{Z}_2$. First we compute, for $s$ in $\mathbb{T}$ and $(t,b)$ in $G$,

\[ (t,b)(s,1)(t^{-b},b) = (s^b, 1) \quad\text{and}\quad (t,b)(s,-1)(t^{-b},b) = (t^2 s^b, -1). \]
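The two conjugation identities just computed follow directly from the group law $(s,a)(t,b) = (st^a, ab)$ and can be checked numerically. The sketch below represents elements of $\mathbb{T} \rtimes \mathbb{Z}_2$ as pairs of a unit complex number and a sign; this representation, the random sampling, and all function names are illustrative assumptions.

```python
import cmath
import random

def mul(x, y):
    """Group law (s, a)(t, b) = (s * t**a, a*b) on T ⋊ Z2."""
    (s, a), (t, b) = x, y
    return (s * t ** a, a * b)

def inv(x):
    """Inverse (s, a)^{-1} = (s**(-a), a)."""
    s, a = x
    return (s ** (-a), a)

def close(x, y, tol=1e-12):
    """Compare two group elements up to floating-point error in the T-coordinate."""
    return x[1] == y[1] and abs(x[0] - y[0]) < tol

def conjugate(g, x):
    """Return g x g^{-1}."""
    return mul(mul(g, x), inv(g))

def random_element(rng):
    angle = rng.uniform(0.0, 2.0 * cmath.pi)
    return (cmath.exp(1j * angle), rng.choice([-1, 1]))

def check(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        g, x, y = (random_element(rng) for _ in range(3))
        t, b = g
        s = x[0]
        ok = (
            close(mul(mul(g, x), y), mul(g, mul(x, y)))             # associativity
            and close(mul(x, inv(x)), (1, 1))                       # inverse
            and close(conjugate(g, (s, 1)), (s ** b, 1))            # first identity
            and close(conjugate(g, (s, -1)), (t * t * s ** b, -1))  # second identity
        )
        if not ok:
            return False
    return True

print(check())
```

The first identity shows that conjugation can only send $(s,1)$ to $(s,1)$ or $(s^{-1},1)$, while the factor $t^2$ in the second shows that every element of the form $(s,-1)$ is conjugate to every other such element.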

Hence we deduce that

\[ Conj(G) = \Big\{ \{(1,1)\},\ \{(-1,1)\},\ \big\{(s,1),(s^{-1},1)\big\}_{\operatorname{Im} s > 0},\ G_{(1,-1)} \Big\} \]

where $G_{(1,-1)}$ is the connected component of $(1,-1)$. Moreover we compute commutators

\[ [(t,b),(s,a)] = (t,b)(s,a)(t^{-b},b)(s^{-a},a) = (t^{1-a}s^{b-1}, 1). \]

Letting $a = 1$, $b = -1$ and $s, t$ be arbitrary in $\mathbb{T}$ we find, in the notation of Section 2.2, that $G' = [G, G_0] = G_{(1,1)}$. In particular, notice that the assumptions of Theorem 2.2, below, are not necessary for $ZL^1(G)$ to be amenable.

Let us close by noting the following decomposition

\[ ZL^1(\mathbb{T} \rtimes \mathbb{Z}_2) = Z_{\mathbb{Z}_2}L^1(\mathbb{T}) \oplus \mathbb{C}(1 - \sigma) \]

where $Z_{\mathbb{Z}_2}L^1(\mathbb{T}) = \{f \in L^1(\mathbb{T}) : \check f = f\}$, $\check f(s) = f(s^{-1})$. We note that both of the components of this decomposition are closed subalgebras, but neither is an ideal. Hence it is not apparent that $Z_{\mathbb{Z}_2}L^1(\mathbb{T})$ is amenable. We show this fact in the next section.

1.7. The hypergroup approach

We indicate, by way of two examples, how the problem of amenability for $ZL^1(G)$ can be treated by using hypergroups. We refer to [3] for the definition of a hypergroup $K$ and its left Haar measure $\lambda_K$, or to [12], where a hypergroup is referred to as a "convos." If $G$ is a compact group, then $K = Conj(G)$ is a hypergroup [12, 8.4]. Since $K$ is compact and commutative, it admits a Haar measure. Moreover we have $ZL^1(G) \cong L^1(K)$, where $L^1(K)$ is the hypergroup algebra. Such $K$ is a strong hypergroup in the sense that its character set $\hat K$ is a (discrete) hypergroup under pointwise multiplication. In fact $\hat K$ identifies naturally with $\{\psi_\pi\}_{\pi\in\hat G}$.

We first consider $G = SU(2)$. By [12, 15.4], $Conj(SU(2))$ identifies naturally with a hypergroup whose underlying set is $K = [-1, 1]$. We will not explicitly need the convolution formula on $K$, but we will require the formula for the Haar measure

\[ \int_K f\,d\lambda_K = \frac{2}{\pi}\int_{-1}^{1} f(x)\,(1 - x^2)^{1/2}\,dx = \frac{2}{\pi}\int_0^\pi f(\cos\theta)\sin^2\theta\,d\theta \tag{1.8} \]

where $dx$, $d\theta$ each denote integration with respect to Lebesgue measure, and the (non-normalised) characters are given by

\[ \{\chi_k\}_{k=0}^\infty \quad\text{where}\quad \chi_k(\cos\theta) = \frac{\sin(k+1)\theta}{\sin\theta}. \]

Note that $\chi_k$ is, up to identification, the character of a unique representation of $SU(2)$ of dimension $k + 1$.



Theorem 1.12. ZL1 (SU(2)) is not amenable. Proof. We first note that by Lemma 1.5, there is a bounded approximate identity for L1 (K), (eα ) ⊂ Trig(K) = span{ψk : k ∈ N0 } where N0 = {0} ∪ N. This bounded approximate identity (n) (n) may be taken to be a sequence, (en ), and we have for each n, en = ∞ k=0 ak χk where ak = 0 for all but finitely many indices k. We obtain, again from Lemma 1.5, that lim a (n) n→∞ k

= k + 1.

(1.9)

(n) 2 ∞ Now let μn = ∞ k=0 (ak ) χk ⊗ χk , so (μn )n=1 is the approximate identity from Theorem 1.6, L1 (K) ∼ and we are done once we establish (μn ) is not bounded. The using the fact that L1 (K) ⊗ = 1 L (K × K) and (1.8) we have 2 π π π μn (cos θ, cos θ ) sin2 θ sin2 θ dθ dθ μn 1 = 2 0 0

π/2 π/2 ∞

(n) 2 sin(k + 1)θ sin(k + 1)θ 2 ak sin θ sin2 θ dθ dθ sin θ sin θ 0

k=0

0

π/2 π/2 ∞ (n) 2 ak sin(k + 1)θ sin(k + 1)θ sin θ sin θ dθ dθ 0

k=0

0

2 π/2 ∞

(n) 2 = ak sin(k + 1)θ sin θ dθ k=0

0

2 π/2 ∞

(n) 2 1

ak cos kθ − cos(k + 2)θ dθ = 2 k=0

=

(n)

2

∞ a (n) (2j + 2) 2 2j +1 j =0

a

kπ sin k(k + 2) 2

k

k=0

=

0

∞ (n) a (k + 1)

(2j + 1)(2j + 3)

.

(2j +2)

+1 ∞ 2 Let fn = ( (2j2j+1)(2j +3) )j =0 . If (μn ) is bounded, then (fn ) is bounded in (N0 ), in which cases

+2) the latter sequence has a cluster point f . We have, by (1.9), that f (j ) = (2j(2j +1)(2j +3) , which means that f cannot be an element of 2 (N0 ). Thus (μn ) must not be bounded. 2 2
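The key computation in this proof is the integral $\int_0^{\pi/2} \sin((k+1)\theta)\sin\theta\,d\theta$, which for $k \geq 1$ equals $\frac{k+1}{k(k+2)}\sin\frac{k\pi}{2}$ and so has absolute value $\frac{2j+2}{(2j+1)(2j+3)}$ when $k = 2j+1$. The sketch below compares a numerical approximation with this closed form and then evaluates the limiting entries $\frac{(2j+2)^2}{(2j+1)(2j+3)}$ obtained from (1.9). The Simpson quadrature, its step count, and the function names are assumptions chosen for illustration.

```python
import math

def integral_numeric(k, steps=2000):
    """Composite Simpson approximation of the integral of sin((k+1)x) sin(x) over [0, pi/2]."""
    a, b = 0.0, math.pi / 2
    h = (b - a) / steps  # steps must be even for Simpson's rule
    f = lambda x: math.sin((k + 1) * x) * math.sin(x)
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integral_closed_form(k):
    """Closed form (k+1)/(k(k+2)) * sin(k*pi/2), valid for k >= 1."""
    return (k + 1) / (k * (k + 2)) * math.sin(k * math.pi / 2)

def limiting_entry(j):
    """Limit of a_{2j+1} * (2j+2)/((2j+1)(2j+3)) as a_{2j+1} tends to 2j+2."""
    return (2 * j + 2) ** 2 / ((2 * j + 1) * (2 * j + 3))

for k in range(1, 6):
    print(k, integral_numeric(k), integral_closed_form(k))

# The limiting entries stay above 1, so their squares cannot be summable.
print([limiting_entry(j) for j in range(5)])
```

Since $(2j+2)^2 = (2j+1)(2j+3) + 1$, every limiting entry exceeds 1, which is why the cluster point cannot lie in $\ell^2(\mathbb{N}_0)$.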



Let us now turn our attention to ZZ2 L1 (T), from the last section. We let 1 ψ0 = (1 + σ ) 2

and ψn = ψπn

for n ∈ N.

Then the family of all Z2 -invariant characters of ZZ2 L1 (T) is XZ2 (T) = {ψn }n∈N0 . Observe, under pointwise multiplication, that XZ2 (T) satisfies the same multiplication rules as the cosine functions {cos(m·)}m∈N0 , and hence is isomorphic to the Chebychev polynomial hypergroup of the first kind [3]. There is a commutative hypergroup K = [−1, 1], which is isomorphic to the double conjugacy class hypergroup T//Z2 , such that ZZ2 L1 (T) ∼ = L1 (K). The Haar measure on K is given by

1 f dλK = π

K

1 −1

f (x) 1 dx = √ π 1 − x2

π f (cos θ ) dθ 0

and the characters, in the present identification, are given by ψn (cos θ ) = cos nθ for n in N0 . Theorem 1.13. ZZ2 L1 (T) is amenable. Proof. We let Kn : [0, π] → R0 for n ∈ N0 denote the well-known Fejer kernel (see [16, 2.5], for example), so n 2k 1− ψk ◦ cos . Kn = n+1 k=0

2k )ψk ⊗ ψk . We have that (μn ) is an approximate diagonal by TheThen let μn = nk=0 (1 − n+1 orem 1.6. Moreover, (μn ) is bounded since

1 μn 1 = 2 π

π π n 2k

1− cos kθ cos kθ dθ dθ n+1 0 0

1 = 2π 2

k=0

π π n 2k

1− cos k(θ + θ ) + cos k(θ − θ ) dθ dθ n+1 0 0

1 = 2π 2

π π

k=0

Kn (θ + θ ) + Kn (θ − θ ) dθ dθ = 1.

0 0

Thus ZZ2 L1 (T) ∼ = L1 (K) is amenable.

2



2. Some non-compact groups 2.1. Preliminaries and notation If G is a locally compact group, then ZL1 (G) = {0} if and only if G has a relatively compact neighbourhood which is invariant under inner automorphisms, i.e. G is an [I N ]-group; see [21, Prop. 1]. In fact, it is shown in [18, Cor. 1.5] that ZL1 (G) is related to certain centers of [F I A]− Bgroups, which we define below. Let Aut(G) denote the space of continuous automorphism of G, which can be endowed with a Hausdorff topology [10, (26.5)]. We let Inn(G) = {s → tst −1 : t ∈ G} denote the group of inner automorphisms in Aut(G). We say G has relatively compact inner automorphisms if Inn(G) is relatively compact in Aut(G). More generally, if there is a relatively compact subgroup B of 1 Aut(G) such that B ⊃ Inn(G) we say G is of class [F I A]− B . We let for β in B and f in L (G), f ◦ β(s) = f (β(s)) for almost every s in G. We then let ZB L1 (G) = f ∈ L1 (G): f ◦ β = f for all β in B . This is a subalgebra of ZL1 (G). The result [18, Cor. 1.5], to which we alluded, above, is that for an [I N]-group G, there is an open normal subgroup G0 of G generated by all elements with relatively compact conjugacy classes, and a closed normal subgroup of G0 , N , which is the intersection of all Inn(G)|G0 -invariant neighbourhoods of e, so that the group B = {sN → t −1 stN: t ∈ G0 } is relatively compact in Aut(H ) where H = G0 /N , and ZL1 (G) ∼ = ZB L1 (H ).

(2.1)

We let XB (G) denote the Gelfand spectrum of ZB L1 (G), and let X (G) = XInn(G) (G). The identification (2.1) gives a natural identification X (G) ∼ = XB (H ). It follows from [20, 4.12] (see [11, 4.2]) that XB (G) may be identified with a certain family of continuous positive definite functions on G. We record the following important structural result, which will be key to many of the results which follow. It summarises results from [18, Prop. 2.3] and [21, Lem. 1]. See the summary presented in [26]. Lemma 2.1. Let G be an [F I A]− B -group and suppose there exists a compact B-invariant subgroup K such that each “β-commutator” s −1 β(s) ∈ K, where β in B and s in G (thus G/K is abelian). Define an equivalence relation on XB (G) by χ ∼ω

⇔

χ|K = ω|K .

Let [χ] denote the equivalence class of χ . Then (i) there is a family of ideals {J (χ): [χ] ∈ XB (G)/ ∼} such that J (χ) ∩ J (ω) = {0} if χ ω,

ZB L1 (G) =

J (χ) [χ]∈XB (G)/∼



and each J (χ) is isomorphic to L1 (G(χ)), where G(χ) is an abelian group, isomorphic to a quotient of an open subgroup of G by K; and (ii) {χ|K : [χ] ∈ XB (G)/ ∼} is an orthogonal family in L2 (K). Note that for such a compact subgroup as K to exist, it is necessary and sufficient that the closed subgroup generated by B-commutators be compact. In this case G is said to be an [F D]− B∼ group. Note that if G is compact we may take K = G and we obtain, for each π ∈ G X (G), = J (χπ ) = Cχπ ∼ = L1 (G/G). 2.2. Some amenable centres If A, B are any pair of subgroups of G, we let [A, B] denote the closed subgroup generated by commutators {aba −1 b−1 : a ∈ A, b ∈ B}. The derived subgroup is given by G = [G, G]. The following result is a generalisation of [26, Thm. 1]. We recall that if G is an [I N]-group, then the subgroup G0 , of all elements with relatively compact conjugacy classes is an open normal subgroup. Theorem 2.2. If [G, G0 ] is finite, then ZL1 (G) is amenable. Proof. We may suppose that ZL1 (G) = {0}, so G has an invariant neighbourhood. Let B and H = G0 /N be as in (2.1) and K = [G, G0 ]/N . Then it is straightforward to check that K is Binvariant and that it is generated by B-commutators. Since K is finite, the orthogonality relations given in Lemma 2.1(ii) imply that there are only finitely many ideals {J (χ): [χ] ∈ XB (G)/ ∼}. It then follows from Lemma 2.1(i), [13, Prop. 5.2], and the fact that each L1 (G(χ)) is amenable, that ZL1 (G) ∼ = ZB L1 (H ) is amenable. 2 Observe that condition of the theorem above holds when G is finite. It also holds when G0 = {e}, in which case G is called an infinite conjugacy class group. 2.3. Some hyper-Tauberian centres We direct the reader to Section 0.2 for the definition and consequences of the hyper-Tauberian property. Proposition 2.3. Suppose G, B and K are as in the hypotheses of Lemma 2.1. Then ZB L1 (G) is hyper-Tauberian. Proof. By Lemma 2.1(i) we may write

L1 G(χ) .

J (χ) ∼ =

ZB L1 (G) = [χ]∈XB (G)/∼

[χ]∈XB (G)/∼

Hence it follows from [25, Cor. 13] that ZB L1 (G) is hyper-Tauberian.

2

We say that G is an [F C]− -group if each conjugacy class in G is relatively compact. In the notation of Section 2.1 this is the same as having G = G0 .



Theorem 2.4. If G is an [F C]− -group, then ZL1 (G) is hyper-Tauberian. Proof. In the notation of (2.1) we have that H = G/N and B = Inn(H ). Thus ZL1 (G) ∼ = ZL1 (H ), and we may assume G, itself, is an [F I A]− -group. If G is compactly generated, then [8, (3.20)] guarantees that the derived group K = G is compact. Hence we can apply Proposition 2.3, and we are done. If G is not compactly generated, we must localise our argument to a compactly generated subgroup. We first wish to see that ZCc (G), the space of all compactly supported continuous 1 1 1 elements of ZL1 (G), is dense in ZL (G). We note that P : L (G) → ZL (G), given for almost every s in G by Pf (s) = Inn(G) f (β(s)) dβ, defines a surjective quotient map. Hence if f ∈ ZL1 (G) and (un ) ⊂ Cc (G) is a sequence with limn un = f , then limn P un = Pf = f . Now let T : ZL1 (G) → ZL1 (G)∗ be a local operator. To see that T is a ZL1 (G)-module map, it suffices to show that !

" ! " T (u ∗ v), w = u ∗ T (v), w

(2.2)

for any u, v, w in ZCc (G). The set U = {s ∈ G : |u(s)| + |v(s)| + |w(s)| > 0} is Inn(G)-invariant, open and relatively compact. Hence U generates a normal open subgroup F of G. We let B = Inn(G)|F and note that F is an [F I A]− B -group. We have that the closed subgroup K generated by B-commutators in F is compact. This is noted in [17], though does follow obviously from [8, (3.20)]. Let us show how this can be proved from [8]. It is shown in [8, (3.16)] that G consists of periodic elements, elements which individually generate relatively compact subgroups of G. Hence K = [F, G] ⊂ G consists of periodic elements. Since F is compactly generated and an [F I A]− B -group, it is clear that K is compactly generated. Then by [8, (3.17)], K is compact. Clearly K is B-invariant. Thus ZB L1 (F ) is hyper-Tauberian by Proposition 2.3. We note that ZB L1 (F ) is the closed subalgebra of all elements of ZL1 (G) which vanish almost everywhere off of F . Moreover, the mapping χ → χ|ZB L1 (F ) maps X (G) continuously onto XB (F ), by [18, Prop. 2.9]. Let ι : ZB L1 (F ) → ZL1 (G) be the injection map, so ι∗ ◦ T ◦ ι : ZB L1 (F ) → ZB L1 (F )∗ is a local map. Then ι∗ ◦ T ◦ ι is a ZB L1 (F )-module map. Since u, v, w ∈ ZB L1 (F ), we see that (2.2) holds. 2 We note that there are non-[F C]− -groups for which the above result fails. Let n 3 and Gn = Rn SO(n)d , the semi-direct product of Rn with the discrete special orthogonal group. We have for odd n that ZL1 (Gn ) ∼ = ZSO(n) L1 (Rn ); for n = 3 this was observed in [18, p. 162]. (Note that for even n we have Z(SO(n)) = {1, −1} = Z2 and we have ZL1 (Gn ) ∼ = ZSO(n) L1 (Rn Z2 ).) It is proved in [23, Prop. 2.6.8] (see also [1, Thm. 5.5]) that for n 3, ZSO(n) L1 (Rn ) admits non-zero point derivations. Hence this algebra cannot even be weakly amenable, neverless hyperTauberian, as noted in Theorem 0.2. 
Moreover, for $n \geq 3$, it is shown in [23, 2.6.10] that except for the augmentation character, no singleton in $X_{SO(n)}(\mathbb{R}^n)$ is a set of spectral synthesis.

References

[1] A. Azimifard, On α-amenability of hypergroups, Monatsh. Math. 155 (2008) 1–13.
[2] W.G. Bade, P.C. Curtis, H.G. Dales, Amenability and weak amenability for Beurling and Lipschitz algebras, Proc. London Math. Soc. 55 (1987) 359–377.
[3] W.R. Bloom, H. Heyer, Harmonic Analysis of Probability Measures on Hypergroups, de Gruyter Stud. Math., vol. 20, de Gruyter, 1995.
[4] G.B. Folland, A Course in Abstract Harmonic Analysis, Stud. Adv. Math., CRC Press, 1995.
[5] F. Ghahramani, Y. Zhang, Pseudo-amenable and pseudo-contractible Banach algebras, Math. Proc. Cambridge Philos. Soc. 142 (2007) 111–123.
[6] M. Ghandehari, H. Hatami, S. Spronk, Amenability constants for semilattice algebras, Semigroup Forum, in press, DOI: 10.1007/s00233-008-9115-z, see arXiv:0705.4277.
[7] C.C. Graham, B.M. Schreiber, Bimeasure algebras on LCA groups, Pacific J. Math. 115 (1984) 91–127.
[8] S. Grosser, M. Moskowitz, Compactness conditions in topological groups, J. Reine Angew. Math. 246 (1971) 1–40.
[9] E. Hewitt, K.A. Ross, Abstract Harmonic Analysis II, Grundlehren Math. Wiss., vol. 152, Springer, New York, 1970.
[10] E. Hewitt, K.A. Ross, Abstract Harmonic Analysis I, second ed., Grundlehren Math. Wiss., vol. 115, Springer, New York, 1979.
[11] A. Hulanicki, On positive functionals on a group algebra multiplicative on a subalgebra, Studia Math. 37 (1971) 163–171.
[12] R.I. Jewett, Spaces with an abstract convolution of measures, Adv. Math. 18 (1975) 1–101.
[13] B.E. Johnson, Cohomology in Banach algebras, Mem. Amer. Math. Soc., vol. 127, American Math. Soc., 1972.
[14] B.E. Johnson, Approximate diagonals and cohomology of certain annihilator algebras, Amer. J. Math. 94 (1972) 685–698.
[15] B.E. Johnson, Non-amenability of the Fourier algebra of a compact group, J. London Math. Soc. 50 (1994) 361–374.
[16] Y. Katznelson, An Introduction to Harmonic Analysis, Cambridge Univ. Press, 2004.
[17] R. Lasser, Primary ideals in centers of group algebras, Math. Ann. 229 (1977) 53–58.
[18] J. Liukkonen, R. Mosak, Harmonic analysis and centers of group algebras, Trans. Amer. Math. Soc. 195 (1974) 147–163.
[19] C.C. Moore, Groups with finite dimensional irreducible representations, Trans. Amer. Math. Soc. 166 (1972) 401–410.
[20] R.D. Mosak, The $L^1$- and $C^*$-algebras of $[FIA]^-_B$ groups and their representations, Trans. Amer. Math. Soc. 163 (1972) 277–310.
[21] R.D. Mosak, Ditkin's condition and primary ideals in central Beurling algebras, Monatsh. Math. 85 (1978) 115–124.
[22] J.F. Price, Lie Groups and Compact Groups, London Math. Soc. Lecture Note Ser., vol. 25, Cambridge Univ. Press, 1977.
[23] H. Reiter, J.D. Stegeman, Classical Harmonic Analysis and Locally Compact Groups, London Math. Soc. Monogr. Ser., vol. 22, Oxford, 2000.
[24] D. Rider, Central idempotent measures on compact groups, Trans. Amer. Math. Soc. 186 (1973) 459–479.
[25] E. Samei, Hyper-Tauberian algebras and weak amenability of Figà–Talamanca–Herz algebras, J. Funct. Anal. 231 (2006) 195–220.
[26] U. Stegmeir, Centers of group algebras, Math. Ann. 243 (1979) 11–16.


Gelfand pairs on the Heisenberg group and Schwartz functions ✩

Francesca Astengo a, Bianca Di Blasio b, Fulvio Ricci c,∗

a Dipartimento di Matematica, Via Dodecaneso 35, 16146 Genova, Italy
b Dipartimento di Matematica e Applicazioni, Via Cozzi 53, 20125 Milano, Italy
c Scuola Normale Superiore, Piazza dei Cavalieri 7, M 56126 Pisa, Italy

Received 25 May 2008; accepted 8 October 2008 Available online 12 November 2008 Communicated by P. Delorme

Abstract

Let $H_n$ be the $(2n+1)$-dimensional Heisenberg group and $K$ a compact group of automorphisms of $H_n$ such that $(K \ltimes H_n, K)$ is a Gelfand pair. We prove that the Gelfand transform is a topological isomorphism between the space of $K$-invariant Schwartz functions on $H_n$ and the space of Schwartz functions on a closed subset of $\mathbb{R}^s$ homeomorphic to the Gelfand spectrum of the Banach algebra of $K$-invariant integrable functions on $H_n$.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Gelfand pair; Schwartz space; Heisenberg group

1. Introduction

A fundamental fact in harmonic analysis on $\mathbb{R}^n$ is that the Fourier transform is a topological isomorphism of the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ onto itself. Various generalizations of this result for different classes of Lie groups exist in the literature, in particular in the context of Gelfand pairs, where the operator-valued Fourier transform can be replaced by the scalar-valued spherical transform. Most notable is the case of a symmetric pair

✩ Work partially supported by MIUR and GNAMPA.

* Corresponding author.

0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.10.008


F. Astengo et al. / Journal of Functional Analysis 256 (2009) 1565–1587

of the noncompact type, with Harish-Chandra's definition of a bi-$K$-invariant Schwartz space on the isometry group (cf. [17, p. 489]).

The definition of a Schwartz space on a Lie group becomes quite natural on a nilpotent group $N$ (say connected and simply connected). In that case one can define the Schwartz space by identifying $N$ with its Lie algebra via the exponential map.

The image of the Schwartz space on the Heisenberg group $H_n$ under the group Fourier transform has been described by D. Geller [14]. Let $K$ be a compact group of automorphisms of $H_n$ such that convolution of $K$-invariant functions is commutative; in other words, assume that $(K \ltimes H_n, K)$ is a Gelfand pair. Then a scalar-valued spherical transform $\mathcal{G}_K$ (where $\mathcal{G}$ stands for "Gelfand transform") of $K$-invariant functions is available, and Geller's result can be translated into a characterization of the image under $\mathcal{G}_K$ of the space $\mathcal{S}_K(H_n)$ of $K$-invariant Schwartz functions. In the same spirit, a characterization of $\mathcal{G}_K \mathcal{S}_K(H_n)$ is given in [6] for closed subgroups $K$ of the unitary group $U(n)$.

In [1] we have proved that, for $K$ equal to $U(n)$ or $\mathbb{T}^n$ (i.e. for radial, respectively polyradial, functions), an analytically more significant description of $\mathcal{G}_K \mathcal{S}_K(H_n)$ can be obtained by making use of natural homeomorphic embeddings of the Gelfand spectrum of $L^1_K(H_n)$ in Euclidean space. The result is that $\mathcal{G}_K \mathcal{S}_K(H_n)$ is the space of restrictions to the Gelfand spectrum of the Schwartz functions on the ambient space. This condition of "extendibility to a Schwartz function on the ambient space" subsumes the rather technical conditions on iterated differences in discrete parameters that are present in the previous characterizations.

In this article we extend the result of [1] to general Gelfand pairs $(K \ltimes H_n, K)$, with $K$ a compact group of automorphisms of $H_n$. Some preliminary notions and facts are required before we can give a precise formulation of our main theorem.
Let $G$ be a connected Lie group and $K$ a compact subgroup thereof such that $(G, K)$ is a Gelfand pair, and denote by $L^1(G/\!/K)$ the convolution algebra of all bi-$K$-invariant integrable functions on $G$. The Gelfand spectrum of the commutative Banach algebra $L^1(G/\!/K)$ may be identified with the set of bounded spherical functions with the compact-open topology. Spherical functions are characterized as the joint eigenfunctions of all $G$-invariant differential operators on $G/K$, normalized in the $L^\infty$-norm. $G$-invariant differential operators on $G/K$ form a commutative algebra $\mathbb{D}(G/K)$ which is finitely generated [16].

Given a finite set of generators $\{V_1, \dots, V_s\}$ of $\mathbb{D}(G/K)$, we can assign to each bounded spherical function $\phi$ the $s$-tuple $\widehat{V}(\phi) = (\widehat{V}_1(\phi), \dots, \widehat{V}_s(\phi))$ of its eigenvalues with respect to these generators. In this way, the Gelfand spectrum is identified with a closed subset $\Sigma_K^V$ of $\mathbb{C}^s$. When all bounded spherical functions are of positive type and the operators $V_j$ are self-adjoint, $\Sigma_K^V \subset \mathbb{R}^s$. As proved in [10], the Euclidean topology induced on $\Sigma_K^V$ coincides with the compact-open topology on the set of bounded spherical functions (see also [4] for $G = K \ltimes H_n$ and $K \subset U(n)$). When the Gelfand spectrum is identified with $\Sigma_K^V$, the spherical transform will be denoted by $\mathcal{G}_K^V$.

Let $K$ be a compact group of automorphisms of $H_n$ such that $(K \ltimes H_n, K)$ is a Gelfand pair and denote by $\mathbb{D}_K$ the commutative algebra of left-invariant and $K$-invariant differential operators on $H_n$. Let $V = \{V_1, \dots, V_s\}$ be a set of formally self-adjoint generators of $\mathbb{D}_K$. We denote by $\mathcal{S}(\Sigma_K^V)$ the space of restrictions to $\Sigma_K^V$ of Schwartz functions on $\mathbb{R}^s$, endowed with the quotient topology of $\mathcal{S}(\mathbb{R}^s)/\{f : f|_{\Sigma_K^V} = 0\}$. Our main result is the following:

Theorem 1.1. The map $\mathcal{G}_K^V$ is a topological isomorphism between $\mathcal{S}_K(H_n)$ and $\mathcal{S}(\Sigma_K^V)$.



As customary, for us $H_n$ is understood as $\mathbb{R} \times \mathbb{C}^n$, with canonical coordinates of the first kind. It is well known that, under the action
\[ k \cdot (t, z) = (t, kz) \qquad \forall k \in U(n),\ (t, z) \in H_n, \]
$U(n)$ is a maximal compact connected group of automorphisms of $H_n$, and that every compact connected group of automorphisms of $H_n$ is conjugate to a subgroup of $U(n)$. Therefore, if $K$ is a compact group of automorphisms of $H_n$, then its identity component is conjugate to a subgroup of $U(n)$. For most of this article, we deal with the case of $K$ connected and contained in $U(n)$, leaving the discussion of the general case to the last section.

In Section 3 we show that it suffices to prove Theorem 1.1 for one particular set of generators of $\mathbb{D}_K$, and in Section 4 we choose a convenient set of generators. From a homogeneous Hilbert basis of $K$-invariant polynomials on $\mathbb{R}^{2n}$ we derive by symmetrization $d$ differential operators $V_1, \dots, V_d$, invariant under $K$; to these we add the central operator $V_0 = i^{-1}\partial_t$, obtaining in this way a generating system of $d + 1$ homogeneous operators. Here we benefit from the deep study of the algebraic properties of the multiplicity-free actions of subgroups of $U(n)$, developed in [2–4,6–8]. In particular, rationality of the "generalized binomial coefficients" is a crucial point in our argument (see Proposition 7.5 below). It must be noticed that the proof of rationality in [8] is based on the actual classification of multiplicity-free actions. In this respect, our proof depends on the actual classification of the groups $K$ giving rise to Gelfand pairs.

After these preliminaries, we split the proof of Theorem 1.1 into two parts.

In the first part we show that, if $m$ is a Schwartz function on $\mathbb{R}^{d+1}$, its restriction to $\Sigma_K^V$ is the Gelfand transform of a function $f$ in $\mathcal{S}_K(H_n)$ (see Theorem 5.5 below). The argument is based on Hulanicki's theorem [19], stating that Schwartz functions on the real line operate on positive Rockland operators on graded nilpotent Lie groups producing convolution operators with Schwartz kernels. We adapt the argument in [26] to obtain a multivariate extension of Hulanicki's theorem (see Theorem 5.2 below).
In the second part we prove that the Gelfand transform $\mathcal{G}_K^V f$ of a function $f$ in $\mathcal{S}_K(H_n)$ can be extended to a Schwartz function on $\mathbb{R}^{d+1}$ (see Theorem 7.1 below). The proof begins with an extension to the Schwartz space of the Schwarz–Mather theorem [23,25] for $C^\infty$ $K$-invariant functions (see Theorem 6.1 below). This allows us to extend to a Schwartz function on $\mathbb{R}^d$ the restriction of $\mathcal{G}_K^V f$ to the "degenerate part" $\Sigma_0$ of the Gelfand spectrum (that corresponding to the one-dimensional representations of $H_n$, or equivalently corresponding to the eigenvalue 0 for $V_0$). Then we associate to $f$ a Schwartz jet on $\Sigma_0$. As in [1], the key tool here is the existence of "Taylor coefficients" at points of $\Sigma_0$, proved by Geller [14] (see Theorem 7.2 below). The Whitney extension theorem (adapted to Schwartz jets in Proposition 7.4) therefore gives a Schwartz extension to $\mathbb{R}^{d+1}$ of the jet associated to $f$. To conclude the proof, it remains to prove that if $f \in \mathcal{S}_K(H_n)$ and the associated jet on $\Sigma_0$ is trivial, then $\mathcal{G}_K^V f$ admits a Schwartz extension. This is done by adapting an explicit interpolation formula already used in [1] (see Proposition 7.5).

For a nonconnected group $K$, we remark that, calling $K_0$ the connected component of the identity, one can view the $K$-Gelfand spectrum as the quotient of the $K_0$-Gelfand spectrum under the action of the finite group $F = K/K_0$ (it is known that if $(K \ltimes H_n, K)$ is a Gelfand pair, so is $(K_0 \ltimes H_n, K_0)$, cf. [5]). Starting from an $F$-invariant generating system of $K_0$-invariant differential operators, the $K$-Gelfand spectrum is then conveniently embedded in a Euclidean space by means of the Hilbert map associated to the action of $F$ on the linear span of these generators.



The paper is structured as follows. In Section 2 we recall some facts about Gelfand pairs $(K \ltimes H_n, K)$ and the associated Gelfand transform. In Section 3 we show that it suffices to prove Theorem 1.1 for one particular set of generators of $\mathbb{D}_K$. In Section 4 we choose a convenient set of generators in the case where $K$ is a connected closed subgroup of $U(n)$. In Section 5 we show that every function in $\mathcal{S}(\mathbb{R}^{d+1})$ gives rise to the Gelfand transform of a function in $\mathcal{S}_K(H_n)$ via functional calculus. In Section 6 we extend the Schwarz–Mather theorem [23,25] to Schwartz spaces. Section 7 is devoted to defining a Schwartz extension on $\mathbb{R}^{d+1}$ of the Gelfand transform of a function in $\mathcal{S}_K(H_n)$. In Section 8 we show that our result holds for all compact groups of automorphisms of $H_n$.

2. Preliminaries

For the content of this section we refer to [2–4,9,10,21].

2.1. The Heisenberg group and its representations

We denote by $H_n$ the Heisenberg group, i.e., the real manifold $\mathbb{R} \times \mathbb{C}^n$ equipped with the group law
\[ (t, z)(u, w) = \Bigl(t + u + \tfrac{1}{2}\operatorname{Im} w \cdot \bar z,\ z + w\Bigr), \qquad t, u \in \mathbb{R},\ z, w \in \mathbb{C}^n, \]
where $w \cdot \bar z$ is shorthand for $\sum_{j=1}^n w_j \bar z_j$. It is easy to check that Lebesgue measure $dt\,dz$ is a Haar measure on $H_n$. Denote by $Z_j$ and $\bar Z_j$ the complex left-invariant vector fields
\[ Z_j = \partial_{z_j} - \frac{i}{4}\bar z_j \partial_t, \qquad \bar Z_j = \partial_{\bar z_j} + \frac{i}{4} z_j \partial_t, \]
and set $T = \partial_t$.

For $\lambda > 0$, denote by $\mathcal{F}_\lambda$ the Fock space consisting of the entire functions $F$ on $\mathbb{C}^n$ such that
\[ \|F\|^2_{\mathcal{F}_\lambda} = \Bigl(\frac{\lambda}{2\pi}\Bigr)^n \int_{\mathbb{C}^n} |F(z)|^2 e^{-\frac{\lambda}{2}|z|^2}\,dz < \infty, \]
equipped with the norm $\|\cdot\|_{\mathcal{F}_\lambda}$. Then $H_n$ acts on $\mathcal{F}_\lambda$ through the unitary representation $\pi_\lambda$ defined by
\[ \pi_\lambda(t, z)F(w) = e^{i\lambda t} e^{-\frac{\lambda}{2} w \cdot \bar z - \frac{\lambda}{4}|z|^2} F(w + z) \qquad \forall (t, z) \in H_n,\ F \in \mathcal{F}_\lambda,\ w \in \mathbb{C}^n, \]
and through its contragredient $\pi_{-\lambda}(t, z) = \pi_\lambda(-t, \bar z)$. These are the Bargmann representations of $H_n$.

The space $\mathcal{P}(\mathbb{C}^n)$ of polynomials on $\mathbb{C}^n$ is dense in $\mathcal{F}_\lambda$ ($\lambda > 0$), and an orthonormal basis of $\mathcal{P}(\mathbb{C}^n)$ seen as a subspace of $\mathcal{F}_\lambda$ is given by the monomials
\[ p_{\lambda,d}(w) = \frac{w^d}{\bigl((2/\lambda)^{|d|}\, d!\bigr)^{1/2}}, \qquad d \in \mathbb{N}^n. \]
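The normalization of the monomials can be checked numerically. The following is a minimal sketch for $n = 1$ only, assuming the Fock norm as written above; the function names and the choice of Simpson's rule are ours and are not taken from the references. In polar coordinates the squared norm of $w^d$ reduces to a one-dimensional integral, which should agree with $(2/\lambda)^{d}\, d!$.

```python
import math

def fock_norm_sq_monomial(d, lam, steps=4000):
    """Numerically evaluate ||w^d||^2 in the Fock space F_lam for n = 1.

    In polar coordinates the defining integral
        (lam / 2pi) * int_C |w|^(2d) exp(-lam |w|^2 / 2) dw
    reduces to  lam * int_0^inf r^(2d+1) exp(-lam r^2 / 2) dr.
    """
    upper = 12.0 / math.sqrt(lam)   # the Gaussian tail beyond this is negligible
    h = upper / steps               # steps must be even for Simpson's rule

    def integrand(r):
        return lam * r ** (2 * d + 1) * math.exp(-lam * r * r / 2.0)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0

def expected_norm_sq(d, lam):
    """Closed form (2/lam)^d * d! used in the normalization of p_{lam,d}."""
    return (2.0 / lam) ** d * math.factorial(d)
```

Comparing `fock_norm_sq_monomial(d, lam)` with `expected_norm_sq(d, lam)` for a few small values of `d` and `lam` gives agreement up to quadrature error, which is consistent with $p_{\lambda,d}$ having norm one.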



Besides the Bargmann representations, $H_n$ has the one-dimensional representations $\tau_w(t, z) = e^{i \operatorname{Re} z \cdot \bar w}$ with $w \in \mathbb{C}^n$. The $\pi_\lambda$ ($\lambda \neq 0$) and the $\tau_w$ fill up the unitary dual of $H_n$.

2.2. Gelfand pairs $(K \ltimes H_n, K)$

Let $K$ be a compact group of automorphisms of $H_n$ such that the convolution algebra $L^1_K(H_n)$ of integrable $K$-invariant functions on $H_n$ is abelian, i.e., such that $(K \ltimes H_n, K)$ is a Gelfand pair. It is known [5] that this property holds for $K$ if and only if it holds for its connected identity component $K_0$. On the other hand, every compact, connected group of automorphisms of $H_n$ is conjugate, modulo an automorphism, to a subgroup of $U(n)$, acting on $H_n$ via
\[ k \cdot (t, z) = (t, kz) \qquad \forall (t, z) \in H_n,\ k \in U(n). \]
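That the action $k \cdot (t, z) = (t, kz)$ respects the group law can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the group law is taken with the twist $\tfrac12 \operatorname{Im} w \cdot \bar z$, elements are stored as pairs `(t, z)` with `z` a tuple of complex numbers, and the matrix `K_SAMPLE` is a hypothetical unitary matrix chosen for the example.

```python
import cmath

def heis_mul(g, h):
    """Heisenberg group law (t, z)(u, w) = (t + u + Im(w . conj(z)) / 2, z + w)."""
    (t, z), (u, w) = g, h
    twist = sum(wj * zj.conjugate() for wj, zj in zip(w, z)).imag
    return (t + u + 0.5 * twist, tuple(zj + wj for zj, wj in zip(z, w)))

def act(k, g):
    """Action k . (t, z) = (t, k z) of a matrix k (given as a list of rows)."""
    t, z = g
    return (t, tuple(sum(kij * zj for kij, zj in zip(row, z)) for row in k))

def close(g, h, tol=1e-12):
    """Compare two group elements up to floating-point error."""
    return abs(g[0] - h[0]) < tol and all(abs(a - b) < tol for a, b in zip(g[1], h[1]))

# Hypothetical sample element of U(2): a diagonal phase matrix times a rotation.
c, s = 0.6, 0.8
K_SAMPLE = [[c * cmath.exp(0.3j), -s * cmath.exp(0.3j)],
            [s * cmath.exp(-1.1j), c * cmath.exp(-1.1j)]]
```

For any two elements `g` and `h`, `act(K_SAMPLE, heis_mul(g, h))` should agree with `heis_mul(act(K_SAMPLE, g), act(K_SAMPLE, h))`, because a unitary matrix preserves $w \cdot \bar z$. A non-unitary matrix such as `[[2, 0], [0, 1]]` changes the twist term and in general fails the same check.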

For the remainder of this section, we assume that $K$ is connected, contained in $U(n)$, and that $(K \ltimes H_n, K)$ is a Gelfand pair.

For the one-dimensional representation $\tau_w$, we have $\tau_w(t, k^{-1}z) = \tau_{kw}(t, z)$. Therefore, $\tau_w(f)$ is a $K$-invariant function of $w$ for every $f \in L^1_K(H_n)$.

As to the Bargmann representations, if $k \in U(n)$, $\pi^k_{\pm\lambda}(t, z) = \pi_{\pm\lambda}(t, kz)$ is equivalent to $\pi_{\pm\lambda}$ for every $\lambda > 0$ and every choice of the $\pm$ sign. Precisely, we set
\[ \nu_+(k)F(z) = F(k^{-1}z), \qquad \nu_-(k)F(z) = F(\bar k^{-1}z), \]
each of the two actions being the contragredient of the other. We then have
\[ \pi_\lambda(kz, t) = \nu_+(k)\pi_\lambda(z, t)\nu_+(k)^{-1}, \qquad \pi_{-\lambda}(kz, t) = \nu_-(k)\pi_{-\lambda}(z, t)\nu_-(k)^{-1}. \]

By homogeneity, the decomposition of $\mathcal{F}_\lambda$ into irreducible invariant subspaces under $\nu_+$ (respectively $\nu_-$) is independent of $\lambda$ and can be reduced to the decomposition of the dense subspace $\mathcal{P}(\mathbb{C}^n)$ of polynomials. It is known since [2,9] that $(K \ltimes H_n, K)$ is a Gelfand pair if and only if $\nu_+$ (equivalently $\nu_-$) decomposes into irreducibles without multiplicities (in other words, if and only if it is multiplicity free). The subgroups of $U(n)$ giving multiplicity free actions on $\mathcal{P}(\mathbb{C}^n)$ have been classified by Kac [21] and the resulting Gelfand pairs $(K \ltimes H_n, K)$ are listed in [8, Tables 1 and 2].

Under these assumptions, the space $\mathcal{P}(\mathbb{C}^n)$ of polynomials on $\mathbb{C}^n$ decomposes into $\nu_+$-irreducible subspaces,
\[ \mathcal{P}(\mathbb{C}^n) = \bigoplus_{\alpha \in \Lambda} P_\alpha, \]
where $\Lambda$ is an infinite subset of the unitary dual $\widehat{K}$ of $K$ and $\alpha$ denotes the equivalence class of the action on $P_\alpha$. The irreducible $\nu_-$-invariant subspaces of $\mathcal{P}(\mathbb{C}^n)$ are
\[ P_{\alpha'} = \bigl\{\bar p(z) = \overline{p(\bar z)} : p \in P_\alpha\bigr\}, \]
with the action of $K$ on $P_{\alpha'}$ being equivalent to the contragredient $\alpha'$ of $\alpha$.



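On the other hand, $\mathcal{P}(\mathbb{C}^n) = \bigoplus_{m \in \mathbb{N}} \mathcal{P}_m(\mathbb{C}^n)$, where $\mathcal{P}_m(\mathbb{C}^n)$ is the space of homogeneous polynomials of degree $m$. Since $\nu_\pm$ preserves each $\mathcal{P}_m(\mathbb{C}^n)$, each $P_\alpha$ is contained in $\mathcal{P}_m(\mathbb{C}^n)$ for some $m$. We then say that $|\alpha| = m$, so that
\[ \mathcal{P}_m(\mathbb{C}^n) = \bigoplus_{|\alpha| = m} P_\alpha = \bigoplus_{|\alpha| = m} P_{\alpha'}. \]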
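A simple instance of this decomposition is the torus. The sketch below assumes $K = \mathbb{T}^n$ acting diagonally on $\mathbb{C}^n$ (the polyradial case), where each monomial $w^d$ spans a one-dimensional invariant subspace and distinct multi-indices give distinct characters, so that $\Lambda$ can be identified with $\mathbb{N}^n$ and $|\alpha| = |d|$. The function names are ours. The code checks that the one-dimensional pieces of degree $m$ add up to the dimension of $\mathcal{P}_m(\mathbb{C}^n)$.

```python
from itertools import product
from math import comb

def torus_irreducibles(n, m):
    """Multi-indices d in N^n with |d| = m.

    Assumption for this illustration: K = T^n acts diagonally on C^n, so each
    monomial w^d spans a one-dimensional invariant subspace, and distinct d
    give distinct characters (the action is multiplicity free).
    """
    return [d for d in product(range(m + 1), repeat=n) if sum(d) == m]

def dim_homogeneous(n, m):
    """Dimension of P_m(C^n), the homogeneous polynomials of degree m."""
    return comb(m + n - 1, n - 1)
```

For example, `torus_irreducibles(2, 3)` lists the four weights of degree 3 in two variables, matching `dim_homogeneous(2, 3)`.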

As proved in [2], all the bounded spherical functions are of positive type. Therefore there are two families of spherical functions. Those of the first family are
\[ \eta_{Kw}(t, z) = \int_K e^{i \operatorname{Re}\langle z, kw\rangle}\,dk, \qquad w \in \mathbb{C}^n, \]
parametrized by $K$-orbits in $\mathbb{C}^n$ and associated with the one-dimensional representations of the Heisenberg group.

The elements of the second family are parametrized by pairs $(\lambda, \alpha) \in \mathbb{R}^* \times \Lambda$. If $\lambda > 0$ and $\{v_1^\lambda, \dots, v_{\dim(P_\alpha)}^\lambda\}$ is an orthonormal basis of $P_\alpha$ in the norm of $\mathcal{F}_\lambda$, we have the spherical function
\[ \phi_{\lambda,\alpha}(t, z) = \frac{1}{\dim(P_\alpha)} \sum_{j=1}^{\dim(P_\alpha)} \bigl\langle \pi_\lambda(t, z)v_j^\lambda, v_j^\lambda \bigr\rangle_{\mathcal{F}_\lambda}. \tag{2.1} \]
Taking, as we can, $v_j^\lambda$ as $\lambda^{|\alpha|/2} v_j^1$, we find that
\[ \phi_{\lambda,\alpha}(z, t) = \phi_{1,\alpha}(\sqrt{\lambda}\,z, \lambda t). \]
For $\lambda < 0$, the analogous matrix entries of the contragredient representation give the spherical functions
\[ \phi_{\lambda,\alpha}(z, t) = \overline{\phi_{-\lambda,\alpha}(z, t)}. \]
Setting for simplicity $\phi_\alpha = \phi_{1,\alpha}$,
\[ \phi_\alpha(z, t) = e^{it} q_\alpha(z, \bar z)\, e^{-|z|^2/4}, \]
where $q_\alpha \in \mathcal{P}(\mathbb{C}^n) \otimes \overline{\mathcal{P}(\mathbb{C}^n)}$ is a real $K$-invariant polynomial of degree $2|\alpha|$ in $z$ and $\bar z$ (cf. [3]).

Denote by $\mathbb{D}_K$ the algebra of left-invariant and $K$-invariant differential operators on $H_n$. The symmetrization map establishes a linear bijection from the space of $K$-invariant elements in the symmetric algebra over $\mathfrak{h}_n$ to $\mathbb{D}_K$. Therefore every element $D \in \mathbb{D}_K$ can be expressed as $\sum_{j=0}^m D_j T^j$, where $D_j$ is the symmetrization of a $K$-invariant polynomial in $Z, \bar Z$.

Let $D$ be the symmetrization of the $K$-invariant polynomial $P(Z, \bar Z, T)$. With $\widehat{D}(\phi)$ denoting the eigenvalue of $D \in \mathbb{D}_K$ on the spherical function $\phi$, we have
\[ \widehat{D}(\eta_{Kw}) = P(w, \bar w, 0) \tag{2.2} \]



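for the spherical functions associated to the one-dimensional representations. For $\lambda \neq 0$, $d\pi_\lambda(D)$ commutes with the action of $K$ and therefore it preserves each $P_\alpha$. By Schur's lemma, $d\pi_\lambda(D)|_{P_\alpha}$ is a scalar operator $c_{\lambda,\alpha}(D) I_{P_\alpha}$. It follows from (2.1) that
\[ \widehat{D}(\phi_{\lambda,\alpha}) = c_{\lambda,\alpha}(D). \]
In particular, for $D = T$, we have
\[ \widehat{T}(\eta_{Kw}) = 0, \qquad \widehat{T}(\phi_{\lambda,\alpha}) = i\lambda. \]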

3. Embeddings of the Gelfand spectrum

Let $K$ be a compact group of automorphisms of $H_n$ such that $(K \ltimes H_n, K)$ is a Gelfand pair. The Gelfand spectrum of the commutative Banach algebra $L^1_K(H_n)$ is the set of bounded spherical functions endowed with the compact-open topology. Given a set $V = \{V_0, V_1, \dots, V_d\}$ of formally self-adjoint generators of $\mathbb{D}_K$, we assign to each spherical function $\phi$ the $(d+1)$-tuple $\widehat{V}(\phi) = (\widehat{V}_0(\phi), \widehat{V}_1(\phi), \dots, \widehat{V}_d(\phi))$. Since $d\pi(V_j)$ is formally self-adjoint for every irreducible representation $\pi$, $\widehat{V}(\phi)$ is in $\mathbb{R}^{d+1}$. It has been proved, in a more general context [10], that $\Sigma_K^V = \{\widehat{V}(\phi) : \phi \text{ spherical}\}$ is closed in $\mathbb{R}^{d+1}$ and homeomorphic to the Gelfand spectrum via $\widehat{V}$ (see also [4]). Once we have identified the Gelfand spectrum with $\Sigma_K^V$, the Gelfand transform of a function $f$ in $L^1_K(H_n)$ can be defined on the closed subset $\Sigma_K^V$ of $\mathbb{R}^{d+1}$ as
\[ \bigl(\mathcal{G}_K^V f\bigr)\bigl(\widehat{V}(\phi)\bigr) = \int_{H_n} f\phi. \]

In order to prove Theorem 1.1, we first show that different choices of the generating system $V$ give rise to natural isomorphisms among the corresponding restricted Schwartz spaces $\mathcal{S}(\Sigma_K^V)$. It will then suffice to prove Theorem 1.1 for one particular set of generators.

On the Schwartz space $\mathcal{S}(\mathbb{R}^m)$ we consider the following family of norms, parametrized by a nonnegative integer $p$:
\[ \|f\|_{(p,\mathbb{R}^m)} = \sup_{y \in \mathbb{R}^m,\ |\alpha| \le p} \bigl(1 + |y|\bigr)^p \bigl|\partial^\alpha f(y)\bigr|. \]

Lemma 3.1. Let $E$ and $F$ be closed subsets of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. Let $P : \mathbb{R}^n \to \mathbb{R}^m$ and $Q : \mathbb{R}^m \to \mathbb{R}^n$ be polynomial maps such that $P(E) = F$ and $Q \circ P$ is the identity on $E$. Given $f$ in $\mathcal{S}(F)$ we let $P^\flat f = f \circ P|_E$. Then $P^\flat$ maps $\mathcal{S}(F)$ in $\mathcal{S}(E)$ continuously.

Proof. We show that if $f$ is in $\mathcal{S}(\mathbb{R}^m)$, then $P^\flat f$ can be extended to a function $\widetilde{P^\flat f}$ in $\mathcal{S}(\mathbb{R}^n)$ in a linear and continuous way. Let $\Psi$ be a smooth function on $\mathbb{R}^n$ such that $\Psi(t) = 1$ if $|t| \le 1$ and $\Psi(t) = 0$ if $|t| > 2$. Define
\[ \widetilde{P^\flat f}(x) = \Psi\bigl(x - Q \circ P(x)\bigr)\,(f \circ P)(x) \qquad \forall x \in \mathbb{R}^n. \]
Clearly $\widetilde{P^\flat f}$ is smooth and $\widetilde{P^\flat f}|_E = P^\flat f$. Moreover $\widetilde{P^\flat f}$ is zero when $|x - Q \circ P(x)| > 2$, so it suffices to prove rapid decay for $|x - Q \circ P(x)| \le 2$. Note that there exists $\ell$ in $\mathbb{N}$ such that
\[ |x| \le 2 + \bigl|Q\bigl(P(x)\bigr)\bigr| \le C\bigl(1 + |P(x)|\bigr)^\ell \qquad \forall x \in \mathbb{R}^n,\ |x - Q \circ P(x)| \le 2. \]
Therefore given a positive integer $p$ there exists a positive integer $q$ such that $\|\widetilde{P^\flat f}\|_{(p,\mathbb{R}^n)} \le C\|f\|_{(q,\mathbb{R}^m)}$. The thesis follows immediately from the definition of the quotient topology on $\mathcal{S}(F)$ and $\mathcal{S}(E)$. □
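The extension used in the proof of Lemma 3.1 can be made concrete in a small case. The following sketch is an illustration with hypothetical data, not part of the argument: $E$ is the parabola $\{(x, x^2)\} \subset \mathbb{R}^2$, $F = \mathbb{R}$, $P(x, y) = x$ and $Q(s) = (s, s^2)$, so that $Q \circ P$ is the identity on $E$. The cutoff `psi` is one standard smooth choice equal to 1 on $[0, 1]$ and 0 on $[2, \infty)$.

```python
import math

def _h(s):
    """Smooth function vanishing to infinite order for s <= 0."""
    return math.exp(-1.0 / s) if s > 0 else 0.0

def psi(r):
    """Smooth cutoff of r = |t|: equal to 1 for r <= 1 and to 0 for r >= 2."""
    s = r - 1.0
    return 1.0 - _h(s) / (_h(s) + _h(1.0 - s))

# Hypothetical data: E is the parabola {(x, x^2)} in R^2 and F = R.
def P(x, y):
    return x                      # polynomial map R^2 -> R with P(E) = F

def Q(s):
    return (s, s * s)             # polynomial map R -> R^2 with Q(P(e)) = e on E

def extend(f):
    """Return the function x -> Psi(x - Q(P(x))) * f(P(x)) on R^2."""
    def f_tilde(x, y):
        qx, qy = Q(P(x, y))
        return psi(math.hypot(x - qx, y - qy)) * f(P(x, y))
    return f_tilde
```

With `f = lambda s: math.exp(-s * s)`, the function `extend(f)` agrees with `f(x)` at every point `(x, x * x)` of the parabola and vanishes wherever the point is at distance more than 2 from `Q(P(x, y))`, as in the proof.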

Corollary 3.2. Suppose that $\{V_0, \dots, V_d\}$ and $\{W_0, \dots, W_s\}$ are two sets of formally self-adjoint generators of $\mathbb{D}_K$. Then the spaces $\mathcal{S}(\Sigma_K^V)$ and $\mathcal{S}(\Sigma_K^W)$ are topologically isomorphic.

Proof. There exist real polynomials $p_j$, $j = 0, 1, \dots, s$, and $q_h$, $h = 0, 1, \dots, d$, such that $W_j = p_j(V_0, \dots, V_d)$ and $V_h = q_h(W_0, \dots, W_s)$. Setting $P = (p_0, p_1, \dots, p_s) : \mathbb{R}^{d+1} \to \mathbb{R}^{s+1}$ and $Q = (q_0, q_1, \dots, q_d) : \mathbb{R}^{s+1} \to \mathbb{R}^{d+1}$, we can apply Lemma 3.1 in both directions. □

4. Choice of the generators

In this section $K$ shall be a closed connected subgroup of $U(n)$ such that $(K \ltimes H_n, K)$ is a Gelfand pair. The subject of the following lemma is the choice of a convenient set of formally self-adjoint generators of $\mathbb{D}_K$.

Lemma 4.1. A generating system $\{V_0 = -iT, V_1, \dots, V_d\}$ can be chosen such that, for each $j = 1, \dots, d$,
(1) $V_j$ is homogeneous of even order $2m_j$;
(2) $V_j$ is formally self-adjoint and $\widehat{V}_j(\phi_\alpha)$ is a positive integer for every $\alpha$ in $\Lambda$;
(3) $\widehat{V}_j(\eta_{Kw}) = \rho_j(w, \bar w)$ for every $w$ in $\mathbb{C}^n$, where $\rho_j$ is a nonnegative homogeneous polynomial of degree $2m_j$, strictly positive outside of the origin.

Notice that (1) and (2) imply that, when $j = 1, \dots, d$,
\[ \widehat{V}_j(\phi_{\lambda,\alpha}) = |\lambda|^{m_j}\, \widehat{V}_j(\phi_\alpha) \qquad \forall \lambda \in \mathbb{R} \setminus \{0\},\ \forall \alpha \in \Lambda. \tag{4.1} \]
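Relation (4.1) says that, for fixed $\alpha$, the point $(\lambda, \widehat{V}_1(\phi_{\lambda,\alpha}), \dots, \widehat{V}_d(\phi_{\lambda,\alpha}))$ moves along a curve that is invariant under the anisotropic dilations $(\lambda, \xi_1, \dots, \xi_d) \mapsto (t\lambda, t^{m_1}\xi_1, \dots, t^{m_d}\xi_d)$, $t > 0$. A minimal sketch of this bookkeeping follows; the degrees and base eigenvalues used with it are hypothetical inputs, not values computed from any particular group $K$.

```python
def spectrum_point(lam, base_eigenvalues, degrees):
    """Point (lam, xi) with xi_j = |lam|^{m_j} * Vhat_j(phi_alpha), as in (4.1)."""
    xi = tuple(abs(lam) ** m * v for v, m in zip(base_eigenvalues, degrees))
    return (lam,) + xi

def dilate(t, point, degrees):
    """Anisotropic dilation (lam, xi_1, ..., xi_d) -> (t lam, t^{m_1} xi_1, ...)."""
    lam, xi = point[0], point[1:]
    return (t * lam,) + tuple(t ** m * x for x, m in zip(xi, degrees))
```

With hypothetical degrees `(1, 2, 3)` and base eigenvalues `(3, 5, 7)`, dilating the point at $\lambda = -0.5$ by $t = 4$ gives the point at $\lambda = -2$ on the same curve, and the point at $\lambda = 1$ has the integer coordinates `(3, 5, 7)`.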

Proof. Let $\mathbb{C}^n_{\mathbb{R}}$ denote $\mathbb{C}^n$ with the underlying structure of a real vector space. We denote by $\mathcal{P}(\mathbb{C}^n_{\mathbb{R}}) \cong \mathcal{P}(\mathbb{C}^n) \otimes \overline{\mathcal{P}(\mathbb{C}^n)}$ the algebra of polynomials in $z$ and $\bar z$, and by $\mathcal{P}^K(\mathbb{C}^n_{\mathbb{R}})$ the subalgebra of $K$-invariant polynomials. The fact that the representation of $K$ on $\mathcal{P}(\mathbb{C}^n)$ is multiplicity free implies that the trivial representation is contained in $P_\alpha \otimes \overline{P_\beta} \subset \mathcal{P}(\mathbb{C}^n_{\mathbb{R}})$ if and only if $\alpha = \beta$, and with multiplicity one in each of them. Therefore a linear basis of $\mathcal{P}^K(\mathbb{C}^n_{\mathbb{R}})$ is given by the polynomials
\[ p_\alpha(z, \bar z) = \sum_{h=1}^{\dim(P_\alpha)} v_h(z)\,\overline{v_h(z)} = \sum_{h=1}^{\dim(P_\alpha)} |v_h(z)|^2, \]
where $\{v_1, \dots, v_{\dim(P_\alpha)}\}$ is any orthonormal basis of $P_\alpha$ in the $\mathcal{F}_1$-norm. A result in [18] ensures that there exist $\delta_1, \dots, \delta_d$ in $\Lambda$ such that the polynomials
\[ \gamma_j = p_{\delta_j}, \qquad j = 1, \dots, d, \tag{4.2} \]
freely generate $\mathcal{P}^K(\mathbb{C}^n_{\mathbb{R}})$. In [8] the authors prove that $\gamma_1, \dots, \gamma_d$ have rational coefficients. More precisely, setting $m_j = |\delta_j|$, each $\gamma_j$ can be written in the form
\[ \gamma_j(z, \bar z) = \sum_{|a| = |b| = m_j} \theta^{(j)}_{a,b}\, z^a \bar z^b, \]
where $a, b$ are in $\mathbb{N}^n$ and $\theta^{(j)}_{a,b}$ are rational numbers.

The symmetrization $L_{\gamma_j}$ of $\gamma_j(Z, \bar Z)$ is a homogeneous operator of degree $2m_j$ in $\mathbb{D}_K$ with rational coefficients, and $\{-iT, L_{\gamma_1}, \dots, L_{\gamma_d}\}$ generate $\mathbb{D}_K$ [3,4]. Moreover, the eigenvalues $\widehat{L}_{\gamma_j}(\phi_\alpha)$ are rational numbers [7,8].

Fix any positive integer $m$ and denote by $M_{j,m}$ the matrix which represents the restriction of $d\pi_1(L_{\gamma_j})$ to $\mathcal{P}_m(\mathbb{C}^n)$ in the basis of monomials $w^\alpha$, $|\alpha| = m$. For every $F$ in $\mathcal{F}_1(\mathbb{C}^n)$,
\[ \bigl[d\pi_1(Z_h)F\bigr](w) = [\partial_{w_h} F](w), \qquad \bigl[d\pi_1(\bar Z_h)F\bigr](w) = -\frac{1}{2} w_h F(w) \qquad \forall w \in \mathbb{C}^n, \]
for $h = 1, \dots, n$. Therefore $M_{j,m}$ has rational entries, with denominators varying in a finite set independent of $m$. We can then take $N$ such that the matrices $N M_{j,m}$ have integral entries, for all $m$ and $j = 1, \dots, d$. Thus the characteristic polynomial of $N M_{j,m}$ is monic, with integral coefficients and rational zeroes $N \widehat{L}_{\gamma_j}(\phi_\alpha)$; therefore these zeroes must be integers. They all have the same sign, independently of $m$, equal to $(-1)^{m_j}$ (cf. [4]). We then define
\[ V_j = N(-1)^{m_j} L_{\gamma_j} + L^{m_j}, \qquad j = 1, \dots, d, \]
where $L = -2\sum_{j=1}^n (Z_j \bar Z_j + \bar Z_j Z_j)$ is the $U(n)$-invariant sublaplacian, satisfying $\widehat{L}(\phi_\alpha) = 2|\alpha| + n$. We show that $\{V_0 = -iT, V_1, \dots, V_d\}$ is a set of generators satisfying the required conditions.

Since $\sum_{m_j = 1} \gamma_j = \frac{1}{2}|z|^2$ (cf. [4]), we have
\[ \sum_{m_j = 1} L_{\gamma_j} = \sum_{j=1}^n (Z_j \bar Z_j + \bar Z_j Z_j) = -\frac{1}{2} L, \]
and then
\[ \sum_{m_j = 1} V_j = \sum_{m_j = 1} (-N L_{\gamma_j} + L) = \Bigl(\frac{N}{2} + r\Bigr) L, \tag{4.3} \]
where $r$ is the cardinality of the set $\{\delta_j : m_j = |\delta_j| = 1\}$. Therefore each $L_{\gamma_j}$ is a polynomial in $V_1, \dots, V_d$. Since $-iT, L_{\gamma_1}, \dots, L_{\gamma_d}$ generate $\mathbb{D}_K$, the same holds for $-iT, V_1, \dots, V_d$.

Condition (1) follows from the homogeneity of $L_{\gamma_j}$ and $L^{m_j}$, and from (4.2). Since the polynomials in (4.2) are real-valued, the $V_j$ are formally self-adjoint and condition (2) is easily verified. Finally condition (3) follows from (2.2), which gives
\[ \widehat{L}_{\gamma_j}(\eta_{Kw}) = (-1)^{m_j} \gamma_j(w, \bar w), \qquad \widehat{L^{m_j}}(\eta_{Kw}) = |w|^{2m_j}, \]
for all $w$ in $\mathbb{C}^n$ and $j = 1, \dots, d$. □



Let $V = \{V_0, V_1, \dots, V_d\}$ denote the privileged set of generators chosen in Lemma 4.1. We denote by $\rho = (\rho_1, \dots, \rho_d)$ the polynomial map in (3) of Lemma 4.1.

Coordinates in $\mathbb{R}^{d+1}$ will be denoted by $(\lambda, \xi)$, with $\lambda$ in $\mathbb{R}$ and $\xi$ in $\mathbb{R}^d$. So, if $(\lambda, \xi) = \widehat{V}(\phi)$, then either $\lambda = -i\widehat{T}(\phi) = 0$, in which case $\phi = \eta_{Kw}$ and $\xi = \rho(w, \bar w)$, or $\lambda = -i\widehat{T}(\phi) \neq 0$, in which case $\phi = \phi_{\lambda,\alpha}$ and $\xi_j = \widehat{V}_j(\phi_{\lambda,\alpha}) = |\lambda|^{m_j}\, \widehat{V}_j(\phi_\alpha)$.

The spectrum $\Sigma_K^V$ consists therefore of two parts. The first part is $\Sigma_0 = \{0\} \times \rho(\mathbb{C}^n)$, a semi-algebraic set. The second part, $\Sigma'$, is the countable union of the curves $\Gamma_\alpha(\lambda) = \widehat{V}(\phi_{\lambda,\alpha})$, $\lambda \neq 0$. Each $\Gamma_\alpha$ and $\Sigma_0$ are homogeneous with respect to the dilations
\[ (\lambda, \xi_1, \dots, \xi_d) \mapsto \bigl(t\lambda, t^{m_1}\xi_1, \dots, t^{m_d}\xi_d\bigr) \qquad (t > 0) \tag{4.4} \]
and to the symmetry $(\lambda, \xi_1, \dots, \xi_d) \mapsto (-\lambda, \xi_1, \dots, \xi_d)$. By our choice of $V$, $\Sigma' \cap \{\lambda = 1\}$ is contained in the positive integer lattice. Moreover $\Sigma'$ is dense in $\Sigma_K^V$ (cf. [4,14]).

For the sake of brevity, we denote by $\widehat{f}$ the Gelfand transform $\mathcal{G}_K^V f$ of the integrable $K$-invariant function $f$.

5. Functional calculus

In this section we prove one of the two implications of Theorem 1.1 for a closed connected subgroup $K$ of $U(n)$ such that $(K \ltimes H_n, K)$ is a Gelfand pair. More precisely we prove that if $m$ is a Schwartz function on $\mathbb{R}^{d+1}$, its restriction to $\Sigma_K^V$ is the Gelfand transform of a function $f$ in $\mathcal{S}_K(H_n)$ (see Theorem 5.5 below). The proof is based on a result of Hulanicki [19] (see Theorem 5.1 below) on functional calculus for Rockland operators on graded groups and a multivariate extension of it.

A Rockland operator $D$ on a graded Lie group $N$ is a formally self-adjoint left-invariant differential operator on $N$ which is homogeneous with respect to the dilations and such that, for every nontrivial irreducible representation $\pi$ of $N$, the operator $d\pi(D)$ is injective on the space of $C^\infty$ vectors. As noted in [20], it follows from [24] and [15] that a Rockland operator $D$ is essentially self-adjoint on the Schwartz space $\mathcal{S}(N)$, as well as $d\pi(D)$ on the Gårding space for every unitary representation $\pi$. We keep the same symbols for the self-adjoint extensions of such operators.

Let $N$ be a graded Lie group, and let $|\cdot|$ be a homogeneous gauge on it. We say that a function on $N$ is Schwartz if and only if it is represented by a Schwartz function on the Lie algebra $\mathfrak{n}$ in any given set of canonical coordinates. The fact that changes of canonical coordinates are expressed by polynomials makes this condition independent of the choice of the coordinates. Given a homogeneous basis $\{X_1, \dots, X_n\}$ of the Lie algebra $\mathfrak{n}$ we keep the same notation $X_j$ for the associated left-invariant vector fields on $N$.
Following [12], we shall consider the following family of norms on $\mathcal{S}(N)$, parametrized by a nonnegative integer $p$:
\[ \|f\|_{(p,N)} = \sup\bigl\{\bigl(1 + |x|\bigr)^p \bigl|X^I f(x)\bigr| : x \in N,\ \deg X^I \le p\bigr\}, \tag{5.1} \]
where $X^I = X_1^{i_1} \cdots X_n^{i_n}$ and $\deg X^I = \sum_j i_j \deg X_j$. Note that the Fréchet space structure induced on $\mathcal{S}(N)$ by this family of norms is independent of the choice of the $X_j$ and is equivalent to that induced from $\mathcal{S}(\mathfrak{n})$ via composition with the exponential map.



Theorem 5.1. (See [19].) Let $D$ be a positive Rockland operator on a graded Lie group $N$ and let $D = \int_0^{+\infty} \lambda\, dE(\lambda)$ be its spectral decomposition. If $m$ is in $\mathcal{S}(\mathbb{R})$ and
\[ m(D) = \int_0^{+\infty} m(\lambda)\, dE(\lambda), \]
then there exists $M$ in $\mathcal{S}(N)$ such that
\[ m(D)f = f * M \qquad \forall f \in \mathcal{S}(N). \]
Moreover for every $p$ there exists $q$ such that $\|M\|_{(p,N)} \le C\|m\|_{(q,\mathbb{R})}$.

Suppose that $D_1, \dots, D_s$ form a commutative family of self-adjoint operators on $N$ (in the sense that they have commuting spectral resolutions). Then they admit a joint spectral resolution and one can define the bounded operator $m(D_1, \dots, D_s)$ for any bounded Borel function $m$ on their joint spectrum in $\mathbb{R}^s$. The following theorem was proved in [26] in a special situation.

Theorem 5.2. Suppose that $N$ is a graded Lie group and $D_1, \dots, D_s$ form a commutative family of positive Rockland operators on $N$. If $m$ is in $\mathcal{S}(\mathbb{R}^s)$, then there exists $M$ in $\mathcal{S}(N)$ such that
\[ m(D_1, \dots, D_s)f = f * M. \]
Moreover, for every $p$ there exists $q$ such that $\|M\|_{(p,N)} \le C\|m\|_{(q,\mathbb{R}^s)}$.

Proof. We prove the theorem by induction on $s$. By Theorem 5.1 the thesis holds when $s = 1$. Let $s \ge 2$ and suppose that the thesis holds for $s - 1$. Let $m(\lambda_1, \dots, \lambda_s)$ be in $\mathcal{S}(\mathbb{R}^s)$. Then there exist sequences $\{\psi_k\}$ in $\mathcal{S}(\mathbb{R}^{s-1})$ and $\{\varphi_k\}$ in $\mathcal{S}(\mathbb{R})$ such that $m(\lambda_1, \dots, \lambda_{s-1}, \lambda_s) = \sum_k \psi_k(\lambda_1, \dots, \lambda_{s-1})\varphi_k(\lambda_s)$ and $\sum_k \|\psi_k \otimes \varphi_k\|_{(N,\mathbb{R}^s)} < \infty$ for every $N$. For a proof of this, one can first decompose $m$ as a sum of $C^\infty$-functions supported in a sequence of increasing balls and with rapidly decaying Schwartz norms, and then separate variables in each of them by a Fourier series expansion (see also [26]).

By the inductive hypothesis, for every $k$ there exist $\Psi_k$ and $\Phi_k$ in $\mathcal{S}(N)$ such that $\psi_k(D_1, \dots, D_{s-1})f = f * \Psi_k$ and $\varphi_k(D_s)f = f * \Phi_k$ for every $f$ in $\mathcal{S}(N)$. Then
\[ \psi_k \otimes \varphi_k(D_1, \dots, D_{s-1}, D_s)f = \psi_k(D_1, \dots, D_{s-1})\,\varphi_k(D_s)f = f * \Phi_k * \Psi_k. \]
By straightforward computations (cf. [12, Proposition 1.47]) and the inductive hypothesis, we obtain that
\[ \|\Psi_k * \Phi_k\|_{(p,N)} \le C_p \|\Psi_k\|_{(p',N)} \|\Phi_k\|_{(p'+1,N)} \le C_p \|\psi_k\|_{(q,\mathbb{R}^{s-1})} \|\varphi_k\|_{(q',\mathbb{R})} \le C_p \|\psi_k \otimes \varphi_k\|_{(q'',\mathbb{R}^s)}. \]
Since the series $\sum_k \psi_k \otimes \varphi_k$ is totally convergent in every Schwartz norm on $\mathbb{R}^s$, there exists a function $F$ in $\mathcal{S}(N)$ such that
\[ \sum_k \Psi_k * \Phi_k = F \]
and hence
\[ \sum_k f * \Psi_k * \Phi_k = f * F \]
for every $f$ in $\mathcal{S}(N)$. On the other hand, if $M \in \mathcal{S}'(N)$ is the convolution kernel of $m(D_1, \dots, D_{s-1}, D_s)$, by the Spectral Theorem,
\[ m(D_1, \dots, D_{s-1}, D_s)f = \sum_k \psi_k(D_1, \dots, D_{s-1}) \otimes \varphi_k(D_s)f = \sum_k f * \Psi_k * \Phi_k, \]
for every $f$ in $\mathcal{S}(N)$, with convergence in $L^2(N)$. Therefore $\sum_k \Psi_k * \Phi_k$ converges to $M$ in $\mathcal{S}'(N)$, i.e. $F = M$ and $M$ is in $\mathcal{S}(N)$.

Finally, given $p$ there exists $q$ such that
\[ \|M\|_{(p,N)} \le C\|m\|_{(q,\mathbb{R}^s)}. \]
This follows from the Closed Graph Theorem. Indeed, we have shown that there is a linear correspondence $m \mapsto M$ from $\mathcal{S}(\mathbb{R}^s)$ to $\mathcal{S}(N)$; moreover, reasoning as before, if $m_h \to m$ in $\mathcal{S}(\mathbb{R}^s)$ and $M_h \to \varphi$ in $\mathcal{S}(N)$, then $M_h \to M$ in $\mathcal{S}'(N)$ by the Spectral Theorem. Therefore $\varphi = M$. □

Going back to our case, we prove that the operators $V_1, \dots, V_d$ satisfy the hypotheses of Theorem 5.2.

Lemma 5.3. The differential operators $V_1, \dots, V_d$ defined in Lemma 4.1 form a commutative family of positive Rockland operators on $H_n$.

Proof. Suppose that $j = 1, \dots, d$. The formal self-adjointness of $V_j$ follows directly from its definition and the identity $\int_{H_n} Z_k f\, \bar g = -\int_{H_n} f\, \overline{\bar Z_k g}$, for every pair of Schwartz functions $f$ and $g$ on $H_n$. We check now the injectivity condition on the image of $V_j$ in the nontrivial irreducible unitary representations of $H_n$.



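By (3) in Lemma 4.1, $d\tau_w(V_j) = \rho_j(w) > 0$ for $w \neq 0$. As to the Bargmann representations $\pi_\lambda$, the Gårding space in $\mathcal{F}_{|\lambda|}$ can be characterized as the space of those $F = \sum_{\alpha \in \Lambda} F_\alpha$ (with $F_\alpha$ in $P_\alpha$ for $\lambda > 0$ and in $P_{\alpha'}$ for $\lambda < 0$) such that
\[ \sum_{\alpha \in \Lambda} \bigl(1 + |\alpha|\bigr)^N \|F_\alpha\|^2_{\mathcal{F}_{|\lambda|}} < \infty \]
for every integer $N$. For such an $F$,
\[ d\pi_\lambda(V_j)F = \sum_{\alpha \in \Lambda} \widehat{V}_j(\phi_{\lambda,\alpha}) F_\alpha, \]
where the series is convergent in norm, and therefore it is zero if and only if $F_\alpha = 0$ for every $\alpha$.

Positivity of $V_j$ follows from Plancherel's formula: for $f \in \mathcal{S}(H_n)$,
\[ \int_{H_n} V_j f\, \bar f = \Bigl(\frac{1}{2\pi}\Bigr)^{n+1} \int_0^{+\infty} \sum_{\alpha \in \Lambda} \widehat{V}_j(\phi_{\lambda,\alpha}) \Bigl( \bigl\|\pi_\lambda(f)|_{P_\alpha}\bigr\|^2_{HS} + \bigl\|\pi_{-\lambda}(f)|_{P_{\alpha'}}\bigr\|^2_{HS} \Bigr) \lambda^n\, d\lambda \ge 0, \]
since the eigenvalues $\widehat{V}_j(\phi_{\lambda,\alpha})$ are positive.

Given a Borel subset $\omega$ of $\mathbb{R}^+$, define the operator $E_j(\omega)$ on $L^2(H_n)$ by
\[ \pi_\lambda\bigl(E_j(\omega)f\bigr) = \sum_{\alpha \in \Lambda} \chi_\omega\bigl(\widehat{V}_j(\phi_{\lambda,\alpha})\bigr)\, \pi_\lambda(f)\, \Pi_{\lambda,\alpha}, \tag{5.2} \]
where $\chi_\omega$ is the characteristic function of $\omega$ and $\Pi_{\lambda,\alpha}$ is the orthogonal projection of $\mathcal{F}_{|\lambda|}$ onto $P_\alpha$ if $\lambda > 0$, or onto $P_{\alpha'}$ if $\lambda < 0$. Then $E_j = \{E_j(\omega)\}$ defines, for each $j$, a resolution of the identity, and, for $f \in \mathcal{S}(H_n)$,
\[ \int_0^{+\infty} \xi\, dE_j(\xi) f = V_j f. \]
Therefore $E_j$ is the spectral resolution of the self-adjoint extension of $V_j$. It is then clear that $E_j(\omega)$ and $E_k(\omega')$ commute for every $\omega, \omega'$ and $j, k$. □

Corollary 5.4. Let $V_0, \dots, V_d$ be the differential operators defined in Lemma 4.1. If $m$ is in $\mathcal{S}(\mathbb{R}^{d+1})$, then there exists $M$ in $\mathcal{S}_K(H_n)$ such that
\[ m(V_0, \dots, V_d)f = f * M \qquad \forall f \in \mathcal{S}(H_n). \]
Moreover, for every $p$ there exists $q$ such that $\|M\|_{(p,H_n)} \le C\|m\|_{(q,\mathbb{R}^{d+1})}$.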
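The algebraic mechanism behind statements of the form $m(D)f = f * M$ can be seen in a finite, commutative toy model. The sketch below is only an analogue under our own assumptions and proves nothing about $H_n$: the group is the cyclic group $\mathbb{Z}_N$, the invariant operator is the cyclic second difference $(Df)(x) = 2f(x) - f(x+1) - f(x-1)$, and all function names are ours. Since the characters of $\mathbb{Z}_N$ diagonalize $D$, the operator $m(D)$ acts as a Fourier multiplier and therefore as convolution with the kernel $M = m(D)\delta_0$.

```python
import cmath
import math

def dft(f):
    """Naive discrete Fourier transform on the cyclic group Z_N."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
            for k in range(n)]

def idft(fhat):
    """Inverse of dft."""
    n = len(fhat)
    return [sum(fhat[k] * cmath.exp(2j * math.pi * k * x / n) for k in range(n)) / n
            for x in range(n)]

def apply_multiplier(m, f):
    """m(D) f, where (D f)(x) = 2 f(x) - f(x+1) - f(x-1) is the cyclic Laplacian.

    The characters of Z_N diagonalize D with eigenvalues 2 - 2 cos(2 pi k / N).
    """
    n = len(f)
    fhat = dft(f)
    return idft([m(2 - 2 * math.cos(2 * math.pi * k / n)) * fhat[k] for k in range(n)])

def convolve(f, kernel):
    """Cyclic convolution (f * M)(x) = sum_y f(y) M(x - y)."""
    n = len(f)
    return [sum(f[y] * kernel[(x - y) % n] for y in range(n)) for x in range(n)]

def kernel_of(m, n):
    """Convolution kernel M = m(D) delta_0."""
    delta = [1.0] + [0.0] * (n - 1)
    return apply_multiplier(m, delta)
```

For a sample `f` on $\mathbb{Z}_{16}$ and `m = lambda s: math.exp(-s)`, the lists `apply_multiplier(m, f)` and `convolve(f, kernel_of(m, 16))` agree up to rounding. The content of Theorem 5.1 and Corollary 5.4 lies elsewhere, namely in the Schwartz regularity of the kernel on a noncommutative group, which this finite model does not address.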



Proof. We replace $V_0 = -iT$ by $\widetilde{V}_0 = -iT + 2L$. By (4.3), $\widetilde{V}_0$ is a linear combination of the $V_j$. Therefore $m(V_0, \dots, V_d) = \widetilde{m}(\widetilde{V}_0, \dots, V_d)$, where $\widetilde{m}$ is the composition of $m$ with a linear transformation of $\mathbb{R}^{d+1}$. Moreover, $\widetilde{V}_0$ is a positive Rockland operator (cf. [11]), which commutes with the other $V_j$ because so do $V_0$ and $L$. Applying Theorem 5.2, $\widetilde{m}(\widetilde{V}_0, \dots, V_d)f = f * M$, with $M \in \mathcal{S}_K(H_n)$ and
\[ \|M\|_{(p,H_n)} \le C\|\widetilde{m}\|_{(q,\mathbb{R}^{d+1})} \le C'\|m\|_{(q,\mathbb{R}^{d+1})}. \]
□

Theorem 5.5. Suppose that $m$ is a Schwartz function on $\mathbb{R}^{d+1}$. Then there exists a function $M$ in $\mathcal{S}_K(H_n)$ such that $\widehat{M} = m|_{\Sigma_K^V}$. Moreover the map $m \mapsto M$ is a continuous linear operator from $\mathcal{S}(\mathbb{R}^{d+1})$ to $\mathcal{S}_K(H_n)$.

Proof. It follows from (5.2) that the joint spectrum of $V_0, \dots, V_d$ is $\Sigma_K^V$ (cf. [13, Theorem 1.7.10]). Therefore the continuous map $m \mapsto M$ of Corollary 5.4 passes to the quotient modulo $\{m : m = 0 \text{ on } \Sigma_K^V\}$.

On the other hand, by (5.2),
\[ \pi_\lambda\bigl(m(V_0, \dots, V_d)f\bigr) = \sum_{\alpha \in \Lambda} m\bigl(\widehat{V}(\phi_{\lambda,\alpha})\bigr)\, \pi_\lambda(f)\, \Pi_{\lambda,\alpha}, \]
and this must coincide with $\pi_\lambda(f)\pi_\lambda(M)$. It follows that $\widehat{M}(\widehat{V}(\phi_{\lambda,\alpha})) = m(\widehat{V}(\phi_{\lambda,\alpha}))$ for every $\lambda, \alpha$. By density, $\widehat{M} = m|_{\Sigma_K^V}$. □

6. Extension of Schwartz invariant functions on $\mathbb{R}^m$

Suppose that $K$ is a compact Lie group acting orthogonally on $\mathbb{R}^m$. It follows from Hilbert's Basis Theorem [27] that the algebra of $K$-invariant polynomials on $\mathbb{R}^m$ is finitely generated. Let $\rho_1, \dots, \rho_d$ be a set of generators and denote by $\rho = (\rho_1, \dots, \rho_d)$ the corresponding map from $\mathbb{R}^m$ to $\mathbb{R}^d$. The image $\Sigma = \rho(\mathbb{R}^m)$ of $\rho$ is closed in $\mathbb{R}^d$.

If $h$ is a smooth function on $\mathbb{R}^d$, then $h \circ \rho$ is in $C^\infty_K(\mathbb{R}^m)$, the space of $K$-invariant smooth functions on $\mathbb{R}^m$. G. Schwarz [25] proved that the map $h \mapsto h \circ \rho$ is surjective from $C^\infty(\mathbb{R}^d)$ to $C^\infty_K(\mathbb{R}^m)$, so that, passing to the quotient, it establishes an isomorphism between $C^\infty(\Sigma)$ and $C^\infty_K(\mathbb{R}^m)$.

J. Mather [23] proved that the map $h \mapsto h \circ \rho$ is split-surjective, i.e. there is a continuous linear operator $E : C^\infty_K(\mathbb{R}^m) \to C^\infty(\mathbb{R}^d)$ such that $(Ef) \circ \rho = f$ for every $f \in C^\infty_K(\mathbb{R}^m)$.

From this one can derive the following analogue of the Schwarz–Mather theorem for $\mathcal{S}_K(\mathbb{R}^m)$.

Theorem 6.1. There is a continuous linear operator $E' : \mathcal{S}_K(\mathbb{R}^m) \to \mathcal{S}(\mathbb{R}^d)$ such that $(E'g) \circ \rho = g$ for every $g \in \mathcal{S}_K(\mathbb{R}^m)$. In particular, the map $g \mapsto g \circ \rho$ is an isomorphism between $\mathcal{S}(\Sigma)$ and $\mathcal{S}_K(\mathbb{R}^m)$.



Proof. It follows from Lemma 3.1 that the validity of the statement is independent of the choice of the Hilbert basis $\rho$. We can then assume that the polynomials $\rho_j$ are homogeneous of degree $\alpha_j$. On $\mathbb{R}^d$ we define anisotropic dilations by the formula
\[ \delta_r(y_1,\dots,y_d) = \big(r^{\alpha_1}y_1,\dots,r^{\alpha_d}y_d\big) \qquad \forall r > 0, \]
and we shall denote by $|\cdot|_\alpha$ a corresponding homogeneous gauge, e.g.
\[ |y|_\alpha = c\sum_{j=1}^{d} |y_j|^{1/\alpha_j}, \qquad (6.1) \]
satisfying $|\delta_r y|_\alpha = r|y|_\alpha$. On $\mathbb{R}^m$ we keep isotropic dilations, given by scalar multiplication. Clearly, $\rho$ is homogeneous of degree 1 with respect to these dilations, i.e.,
\[ \rho(rx) = \delta_r\rho(x) \qquad \forall r > 0,\ x \in \mathbb{R}^m, \]
and $\Sigma$ is $\delta_r$-invariant for all $r > 0$. Since $\rho$ is continuous, the image under $\rho$ of the unit sphere in $\mathbb{R}^m$ is a compact set not containing 0 (in fact, since $|x|^2$ is a polynomial in the $\rho_j$, cf. [23], $\rho_j(x) = 0$ for every $j$ implies that $x = 0$). Choosing the constant $c$ in (6.1) appropriately, we can assume that $1 \le |\rho(x)|_\alpha \le R$ for every $x$ in the unit sphere in $\mathbb{R}^m$. It follows by homogeneity that for every $a$, $b$ with $0 \le a < b$,
\[ \rho\big(\{x : a \le |x| \le b\}\big) \subset \{y : a \le |y|_\alpha \le Rb\}. \qquad (6.2) \]
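The homogeneity relations used above can be checked numerically on a small example. The sketch below assumes a hypothetical case that is not treated in the text: the two-element group acting on $\mathbb{R}^2$ by swapping coordinates, with Hilbert basis $\rho_1 = x_1 + x_2$, $\rho_2 = x_1x_2$ of degrees $\alpha = (1,2)$ and gauge constant $c = 1$. All function names are illustrative.

```python
import math
import random

# Hypothetical example (not from the text): K = {id, swap} acting on R^2.
ALPHA = (1, 2)  # degrees of the invariants rho_1 = x1 + x2, rho_2 = x1 * x2


def rho(x):
    """Hilbert map for the coordinate swap on R^2."""
    x1, x2 = x
    return (x1 + x2, x1 * x2)


def delta(r, y):
    """Anisotropic dilation delta_r(y) = (r^{alpha_1} y_1, r^{alpha_2} y_2)."""
    return tuple(r ** a * yj for a, yj in zip(ALPHA, y))


def gauge(y, c=1.0):
    """Homogeneous gauge |y|_alpha = c * sum_j |y_j|^{1/alpha_j}, as in (6.1)."""
    return c * sum(abs(yj) ** (1.0 / a) for a, yj in zip(ALPHA, y))


def check_homogeneity(trials=1000, seed=0):
    """Largest relative defect in rho(rx) = delta_r rho(x) and |delta_r y| = r |y|."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        r = rng.uniform(0.1, 10)
        lhs = rho((r * x[0], r * x[1]))
        rhs = delta(r, rho(x))
        scale = 1.0 + max(abs(v) for v in rhs)
        worst = max(worst, max(abs(a - b) for a, b in zip(lhs, rhs)) / scale)
        y = rho(x)
        g = gauge(delta(r, y))
        worst = max(worst, abs(g - r * gauge(y)) / (1.0 + g))
    return worst


def gauge_range_on_circle(samples=3600):
    """Minimum and maximum of |rho(x)|_alpha sampled over the unit circle."""
    vals = [
        gauge(rho((math.cos(t), math.sin(t))))
        for t in (2 * math.pi * k / samples for k in range(samples))
    ]
    return min(vals), max(vals)
```

In this example the gauge stays bounded away from 0 on the unit circle, so dividing $c$ by the sampled minimum gives the normalisation $1 \le |\rho(x)|_\alpha \le R$ with $R$ equal to the ratio of the maximum to the minimum.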

Fix $E : C_K^\infty(\mathbb{R}^m) \to C^\infty(\mathbb{R}^d)$ a continuous linear operator satisfying the condition $(Ef)\circ\rho = f$, whose existence is guaranteed by Mather's theorem. Denote by $B_s$ the subset of $\mathbb{R}^d$ where $|y|_\alpha < s$. For every $p \in \mathbb{N}$ there is $q \in \mathbb{N}$ such that, for $f$ supported in the unit ball, $\|Ef\|_{C^p(B_{R^2})} < C_p\|f\|_{C^q}$. Given $r > 0$, set $f_r(x) = f(rx)$ and $E_rf = (Ef_r)\circ\delta_{r^{-1}}$. By the homogeneity of $\rho$, $(E_rf)\circ\rho = f$. For $r > 1$ and $f$ supported on the ball of radius $r$, we have
\[ \|E_rf\|_{C^p(B_{rR^2})} < C_p\,r^q\,\|f\|_{C^q}. \qquad (6.3) \]

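Let $\{\varphi_j\}_{j\ge 0}$ be a partition of unity on $\mathbb{R}^m$ consisting of radial smooth functions such that

(a) $\varphi_0$ is supported on $\{x : |x| < 1\}$;
(b) for $j \ge 1$, $\varphi_j$ is supported on $\{x : R^{j-2} < |x| < R^j\}$;
(c) for $j \ge 1$, $\varphi_j(x) = \varphi_1(R^{-(j-1)}x)$.

Similarly, let $\{\psi_j\}_{j\ge 0}$ be a partition of unity on $\mathbb{R}^d$ consisting of smooth functions such that

(a′) $\psi_0$ is supported on $\{y : |y|_\alpha < R\}$;
(b′) for $j \ge 1$, $\psi_j$ is supported on $\{y : R^{j-1} < |y|_\alpha < R^{j+1}\}$;
(c′) for $j \ge 1$, $\psi_j(y) = \psi_1(\delta_{R^{-(j-1)}}y)$.

For $f \in \mathcal{S}_K(\mathbb{R}^m)$ define
\[ \mathcal{E}f(y) = \sum_{j=0}^{\infty}\sum_{\ell=-2}^{1} \psi_{j+\ell}(y)\,E_{R^j}(\varphi_jf)(y) = \sum_{j=0}^{\infty} \psi_j(y) \sum_{\ell=-2}^{1} E_{R^{j-\ell}}(\varphi_{j-\ell}f)(y), \]
with the convention that $\psi_{-1} = \psi_{-2} = \varphi_{-1} = 0$. Then
\[ \mathcal{E}f\big(\rho(x)\big) = \sum_{j=0}^{\infty}\sum_{\ell=-2}^{1} \psi_{j+\ell}\big(\rho(x)\big)\,\varphi_j(x)f(x). \]
By (6.2), $\sum_{\ell=-2}^{1}\psi_{j+\ell}(\rho(x)) = 1$ on the support of $\varphi_j$, hence $\mathcal{E}f\circ\rho = f$.

We have the following estimate for the Schwartz norms in (5.1):
\[ \|f\|_{(p,\mathbb{R}^m)} \sim \sum_{j=0}^{\infty} R^{jp}\,\|f\varphi_j\|_{C^p}. \]
On $\mathbb{R}^d$ we adapt the Schwartz norms to the dilations $\delta_r$ by setting
\[ \|g\|_{(p,\mathbb{R}^d)} = \sup_{y\in\mathbb{R}^d,\ \sum_j a_j\alpha_j \le p} \big(1+|y|_\alpha\big)^p\,\big|\partial^a g(y)\big|. \]
We then have
\[ \|g\|_{(p,\mathbb{R}^d)} \sim \sum_{j=0}^{\infty} R^{jp} \sup_{\sum_j a_j\alpha_j\le p} \big\|\partial^a(g\psi_j)\big\|_\infty \le \sum_{j=0}^{\infty} R^{jp}\,\|g\psi_j\|_{C^p}. \]
Therefore, with constants $C$, $C_p$ that may change from line to line,
\[
\begin{aligned}
\|\mathcal{E}f\|_{(p,\mathbb{R}^d)}
&\le C\sum_{j=0}^{\infty} R^{jp}\,\Big\|\psi_j\sum_{\ell=-2}^{1} E_{R^{j-\ell}}(\varphi_{j-\ell}f)\Big\|_{C^p} \\
&\le C_p\sum_{j=0}^{\infty}\sum_{\ell=-2}^{1} R^{jp}\,\big\|E_{R^{j-\ell}}(\varphi_{j-\ell}f)\big\|_{C^p(B_{R^{j+1}})} \\
&\le C_p\sum_{j=0}^{\infty}\sum_{\ell=-2}^{1} R^{jp}\,\big\|E_{R^j}(\varphi_jf)\big\|_{C^p(B_{R^{j+\ell+1}})} \\
&\le C_p\sum_{j=0}^{\infty} R^{jp}\,\big\|E_{R^j}(\varphi_jf)\big\|_{C^p(B_{R^{j+2}})}.
\end{aligned}
\]
By (6.3), since $\varphi_jf$ is supported on the ball of radius $R^j$,
\[ \|\mathcal{E}f\|_{(p,\mathbb{R}^d)} \le C_p\sum_{j=0}^{\infty} R^{j(p+q)}\,\|\varphi_jf\|_{C^q} \le C_p\,\|f\|_{(p+q,\mathbb{R}^m)}. \qquad \Box \]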

7. Schwartz extensions of the Gelfand transform of $f \in \mathcal{S}_K(H_n)$

In this section we suppose that $K$ is a closed connected subgroup of $U(n)$. The following theorem settles the proof of Theorem 1.1 in this case.

Theorem 7.1. Let $f$ be in $\mathcal{S}_K(H_n)$. For every $p$ in $\mathbb{N}$ there exist $F_p$ in $\mathcal{S}(\mathbb{R}^{d+1})$ and $q$ in $\mathbb{N}$, both depending on $p$, such that $F_p|_{\Sigma_K^V} = \hat f$ and $\|F_p\|_{(p,\mathbb{R}^{d+1})} \le C_p\|f\|_{(q,H_n)}$.

Notice that this statement implies the existence of a continuous map from $\mathcal{S}(\Sigma_K^V)$ to $\mathcal{S}_K(H_n)$ that inverts the Gelfand transform, even though its formulation is much weaker than that of Theorem 6.1. We do not claim that for each $f$ a single $F$ can be found, all of whose Schwartz norms are controlled by those of $f$. In addition, our proof does not show whether $F_p$ can be chosen to depend linearly on $f$.

The proof of Theorem 7.1 is modelled on that given in [1] for the cases $K = U(n)$, $\mathbb{T}^n$, but with some relevant differences. On one hand we present a simplification of the argument given there, disregarding the partial results concerning extensions of $\hat f$ with finite orders of regularity; on the other hand extra arguments are required in the general setting.

We need to show that the Gelfand transform $\hat f$ of $f \in \mathcal{S}_K(H_n)$ extends from $\Sigma_K^V$ to a Schwartz function on $\mathbb{R}^{d+1}$. Our starting point is the construction of a Schwartz extension to all of $\{0\}\times\mathbb{R}^d$ of the restriction of $\hat f$ to
\[ \Sigma_0 = \big\{\widehat V(\eta_{Kw}) : w \in \mathbb{C}^n\big\} = \{0\}\times\rho(\mathbb{C}^n). \]
If $\mathcal{F}f$ denotes the Fourier transform in $\mathbb{C}^n\times\mathbb{R}$,
\[ \mathcal{F}f(\lambda,w) = \int_{\mathbb{C}^n\times\mathbb{R}} f(z,t)\,e^{-i(\lambda t + \operatorname{Re} z\cdot\bar w)}\,dz\,dt, \]
we denote $\tilde f(w) = \mathcal{F}f(0,-w) = \hat f(0,\rho(w))$. To begin with, we set $\hat f^{\sharp} = \mathcal{E}\tilde f \in \mathcal{S}(\mathbb{R}^d)$, where $\mathcal{E}$ is the operator of Theorem 6.1. Then
\[ \hat f^{\sharp}(\xi) = \hat f(0,\xi) \quad \text{if } (0,\xi)\in\Sigma_0. \qquad (7.1) \]



The next step consists in producing a Taylor development of $\hat f$ at $\lambda = 0$. The following result is derived from [14]. In our setting the formula must take into account the extended functions in (7.1).

Proposition 7.2. Let $f$ be in $\mathcal{S}_K(H_n)$. Then there exist functions $f_j$, $j \ge 1$, in $\mathcal{S}_K(H_n)$, depending linearly and continuously on $f$, such that for any $p$ in $\mathbb{N}$,
\[ \hat f(\lambda,\xi) = \sum_{j=0}^{p} \frac{\lambda^j}{j!}\,\hat f_j^{\sharp}(\xi) + \frac{\lambda^{p+1}}{(p+1)!}\,\hat f_{p+1}(\lambda,\xi), \qquad \forall(\lambda,\xi)\in\Sigma_K^V, \]
where $f_0 = f$ and $\hat f_j^{\sharp}$ is obtained from $f_j$ applying (7.1).

Proof. For $f$ in $\mathcal{S}_K(H_n)$, we claim that the restriction of $u(\lambda,\xi) = \hat f^{\sharp}(\xi)$ to $\Sigma_K^V$ is in $\mathcal{S}(\Sigma_K^V)$. It is quite obvious that $u$ is smooth. Let $\psi$ be a smooth function on the line, equal to 1 on $[-2,2]$ and supported on $[-3,3]$. Define
\[ \Psi(\lambda,\xi) = \psi\big(\lambda^2 + \xi_1^{2/m_1} + \cdots + \xi_d^{2/m_d}\big) + \psi\Big(\frac{\lambda^2}{\xi_1^{2/m_1} + \cdots + \xi_d^{2/m_d}}\Big)\Big(1 - \psi\big(\lambda^2 + \xi_1^{2/m_1} + \cdots + \xi_d^{2/m_d}\big)\Big). \]
By (4.1) and (2) in Lemma 4.1, $\Psi$ is equal to 1 on a neighborhood of $\Sigma_K^V$. It is also homogeneous of degree 0 with respect to the dilations (4.4) outside of a compact set. Then $\Psi u$ is in $\mathcal{S}(\mathbb{R}^{d+1})$ and coincides with $u$ on $\Sigma_K^V$.

It follows from Corollary 5.4 that there exists $h$ in $\mathcal{S}_K(H_n)$ such that
\[ \hat f(\lambda,\xi) - \hat f^{\sharp}(\xi) = \hat h(\lambda,\xi) \qquad \forall(\lambda,\xi)\in\Sigma_K^V. \]
Since $\hat h(0,\rho(w)) = 0$ for every $w$, $\int_{-\infty}^{+\infty} h(z,t)\,dt = 0$ for every $z$. Therefore
\[ f_1(z,t) = \int_{-\infty}^{t} h(z,s)\,ds \]
is in $\mathcal{S}_K(H_n)$ and
\[ \hat h(\lambda,\xi) = \lambda\,\hat f_1(\lambda,\xi) \qquad \forall(\lambda,\xi)\in\Sigma_K^V. \]
It is easy to verify that the map $U : f \mapsto f_1$ is linear and continuous on $\mathcal{S}_K(H_n)$. We then define $f_j$, $j \ge 1$, by the recursion formula $f_j = jUf_{j-1}$ and the thesis follows by induction. □

We now use the Whitney Extension Theorem [22] to extend the $C^\infty$-jet $\{\partial_\xi^\alpha\hat f_j^{\sharp}\}_{(j,\alpha)\in\mathbb{N}^{d+1}}$ to a Schwartz function on $\mathbb{R}^{d+1}$. In doing so, we must keep accurate control of the Schwartz norms. For this purpose we use Lemma 4.1 in [1], which reads as follows.



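Lemma 7.3. Let $k \ge 1$ and let $h(\lambda,\xi)$ be a $C^k$-function on $\mathbb{R}^m\times\mathbb{R}^n$ such that

(1) $\partial_\lambda^\alpha h(0,\xi) = 0$ for $|\alpha| \le k$ and $\xi\in\mathbb{R}^n$;
(2) for every $p\in\mathbb{N}$,
\[ \alpha_p(h) = \sup_{|\alpha|+|\beta|\le k} \big\|(1+|\cdot|)^p\,\partial_\lambda^\alpha\partial_\xi^\beta h\big\|_\infty < \infty. \]

Then, for every $\varepsilon > 0$ and $M\in\mathbb{N}$, there exists a function $h_{\varepsilon,M}\in\mathcal{S}(\mathbb{R}^m\times\mathbb{R}^n)$ such that

(1) $\partial_\lambda^\alpha h_{\varepsilon,M}(0,\xi) = 0$ for every $\alpha\in\mathbb{N}^m$ and $\xi\in\mathbb{R}^n$;
(2) $\sup_{|\alpha|+|\beta|\le k-1}\big\|(1+|\cdot|)^M\,\partial_\lambda^\alpha\partial_\xi^\beta(h - h_{\varepsilon,M})\big\|_\infty < \varepsilon$;
(3) for every $p\in\mathbb{N}$ there is a constant $C_{k,p,M}$ such that
\[ \|h_{\varepsilon,M}\|_{(p,\mathbb{R}^{m+n})} \le C_{k,p,M}\big(1 + \alpha_M(h)^p\varepsilon^{-p}\big)\,\big\|(1+|\cdot|)^p h\big\|_\infty. \]

The following proposition is in [1, Proposition 4.2] for $K = \mathbb{T}^n$. We give here a simplified proof.

Proposition 7.4. Given $f\in\mathcal{S}_K(H_n)$ and $p\in\mathbb{N}$, there are $H\in\mathcal{S}(\mathbb{R}^{d+1})$ and $q\in\mathbb{N}$ such that $\partial_\lambda^jH(0,\xi) = \hat f_j^{\sharp}(\xi)$ and $\|H\|_{(p,\mathbb{R}^{d+1})} \le C_p\|f\|_{(q,H_n)}$.

Proof. Let $\eta$ be a smooth function on $\mathbb{R}$ such that $\eta(t) = 1$ if $|t| \le 1$ and $\eta(t) = 0$ if $|t| \ge 2$. By Theorem 6.1 and Proposition 7.2, for every $k$ and $r$ there exists $q_{k,r}$ such that
\[ \big\|\hat f_k^{\sharp}\big\|_{(r,\mathbb{R}^d)} \le C_{k,r}\,\|f\|_{(q_{k,r},H_n)}. \qquad (7.2) \]
We fix $p\in\mathbb{N}$ and apply Lemma 7.3 to
\[ h_k(\lambda,\xi) = \eta(\lambda)\,\frac{\lambda^{k+1}}{(k+1)!}\,\hat f_{k+1}^{\sharp}(\xi). \]
Hypothesis (1) is obviously satisfied and (2) also, because $h_k$ is a Schwartz function. By (7.2),
\[ \alpha_r(h_k) \le C_{k,r}\,\|f\|_{(q_{k+1,r},H_n)}. \qquad (7.3) \]
Let $q$ be the maximum among the $q_{k,p}$ for $k \le p+1$. Setting $\varepsilon_k = 2^{-k}\|f\|_{(q,H_n)}$, $M = p$, for each $k$ there is a function $H_k\in\mathcal{S}(\mathbb{R}^{d+1})$ such that

(i) $\partial_\lambda^jH_k(0,\xi) = 0$ for all $j\in\mathbb{N}$ and $\xi\in\mathbb{R}^d$;
(ii) $\sup_{|\alpha|+|\beta|\le k-1}\big\|(1+|\cdot|)^p\,\partial_\lambda^\alpha\partial_\xi^\beta(h_k - H_k)\big\|_\infty < 2^{-k}\|f\|_{(q,H_n)}$;
(iii) for $k \le p$, using (7.3),
\[ \|H_k\|_{(p,\mathbb{R}^{d+1})} \le C_{k,p}\big(1 + \varepsilon_k^{-p}\|f\|_{(q,H_n)}^p\big)\,\big\|(1+|\cdot|)^p h_k\big\|_\infty \le C_p\,\|f\|_{(q,H_n)}. \]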



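Define
\[ H = \sum_{k=0}^{p} h_k - \sum_{k=0}^{p} H_k + \sum_{k=p+1}^{\infty}(h_k - H_k). \]
By (7.2), (ii) and (iii), the $p$th Schwartz norm of $H$ is finite and controlled by a constant times the $q$th Schwartz norm of $f$. Differentiating term by term, using (i) and the identity $\partial_\lambda^jh_k(0,\xi) = \delta_{j,k+1}\hat f_{k+1}^{\sharp}(\xi)$, we obtain that $\partial_\lambda^jH(0,\xi) = \hat f_j^{\sharp}(\xi)$ for every $j$. □

Let now $\varphi$ be a smooth function on $\mathbb{R}$ such that $\varphi(t) = 1$ if $|t| \le 1/2$ and $\varphi(t) = 0$ if $|t| \ge 3/4$. For $h$ defined on $\Sigma_K^V$, we define the function $Eh$ on $\mathbb{R}^{d+1}$ by
\[ Eh(\lambda,\xi) = \begin{cases} \displaystyle\sum_{\alpha\in\Lambda} h\big(\lambda,\xi_{(\lambda,\alpha)}\big)\prod_{\ell=1}^{d}\varphi\Big(\frac{\xi_\ell}{|\lambda|^{m_\ell}} - \widehat V_\ell(\phi_\alpha)\Big), & \lambda\ne 0,\ \xi\in\mathbb{R}^d, \\[2mm] 0, & \lambda = 0,\ \xi\in\mathbb{R}^d, \end{cases} \]
where $\xi_{(\lambda,\alpha)} = \big(\widehat V_1(\phi_{\lambda,\alpha}),\dots,\widehat V_d(\phi_{\lambda,\alpha})\big) = \big(|\lambda|^{m_1}\widehat V_1(\phi_\alpha),\dots,|\lambda|^{m_d}\widehat V_d(\phi_\alpha)\big)$.

Recall that, by Lemma 4.1, each $\widehat V_\ell(\phi_\alpha)$ is a positive integer. Therefore, if $\xi_\ell \le 0$ for some $\ell$, every term in the series vanishes, whereas, if $\xi$ is in $\mathbb{R}^d_+$, the series reduces to at most one single term. Moreover, for every $g$ in $\mathcal{S}(H_n)$, $E\hat g = \hat g$ on $\Sigma_K^V\setminus\Sigma_0$.

The proof of the following result goes as for [1, Lemma 3.1], using [6, p. 407] instead of [1, (2.2)]. In contrast with [1] we state it only for vanishing of infinite order of the Taylor development of the Gelfand transform on $\Sigma_0$.

Proposition 7.5. Suppose that $g$ is in $\mathcal{S}_K(H_n)$ and $\hat g_j|_{\Sigma_0} = 0$ for every $j$. Then

(1) $E\hat g(\lambda,\xi) = \hat g(\lambda,\xi)$ for all $(\lambda,\xi)\in\Sigma_K^V$;
(2) $\partial_\lambda^s(E\hat g)(0,\xi) = 0$ for all $s$ and $\xi\in\mathbb{R}^d$;
(3) for every $p \ge 0$ there exist a constant $C_p$ and an integer $q \ge 0$ such that
\[ \|E\hat g\|_{(p,\mathbb{R}^{d+1})} \le C_p\,\|g\|_{(q,H_n)}. \]

In particular, $E\hat g\in\mathcal{S}(\mathbb{R}^{d+1})$.

To conclude the proof of Theorem 7.1, take $f$ in $\mathcal{S}_K(H_n)$ and $p$ in $\mathbb{N}$. Let $H$ be the function in $\mathcal{S}(\mathbb{R}^{d+1})$, depending on $p$, defined as in Proposition 7.4. By Theorem 5.2, there exists $h$ in $\mathcal{S}(H_n)$ such that
\[ H|_{\Sigma_K^V} = \hat h. \]
Define
\[ F = E(\hat f - \hat h) + H, \]
and the thesis follows easily.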


8. General compact groups of automorphisms of $H_n$

We have discussed in the previous sections the Gelfand pairs associated with connected subgroups of $U(n)$. In this section we only assume that $K$ is a compact group of automorphisms of $H_n$. Let $K_0$ be the connected identity component of $K$. Then $K_0$ is a normal subgroup of $K$ and $F = K/K_0$ is a finite group. Conjugating $K$ with an automorphism if necessary, we may suppose that $K_0$ is a subgroup of $U(n)$.

For $D$ in $\mathbb{D}_{K_0}$ and $w = kK_0$ in $F$, define $D^w$ by
\[ D^wf = D\big(f\circ k^{-1}\big)\circ k \qquad \forall f\in C^\infty(H_n). \]
Since $K_0$ is normal, $D^w$ is in $\mathbb{D}_{K_0}$. It is also clear that $\mathbb{D}_{K_0}$ admits a generating set which is stable under the action of the group $F$. Indeed, it suffices to add to any given system of generators the $F$-images of its elements. Denoting by $\mathcal{V}$ the linear span of these generators, $F$ acts linearly on $\mathcal{V}$. Let $V = \{V_1,\dots,V_d\}$ be a basis of $\mathcal{V}$, orthonormal with respect to an $F$-invariant scalar product. Clearly, $V$ is a generating set for $\mathbb{D}_{K_0}$.

Applying Hilbert's Basis Theorem as in Section 6, there exists a finite number of (homogeneous) polynomials $\rho_1,\dots,\rho_r$ generating the subalgebra $\mathcal{P}_F(\mathbb{R}^d)$ of $F$-invariant elements in $\mathcal{P}(\mathbb{R}^d)$. Let $\rho = (\rho_1,\dots,\rho_r) : \mathbb{R}^d\to\mathbb{R}^r$ be the corresponding Hilbert map and let $W_j = \rho_j(V_1,\dots,V_d)$ for $j = 1,\dots,r$. When $f$ is $K_0$-invariant and $w = kK_0$ in $F$, we set $f\circ w = f\circ k$.

Lemma 8.1. The set $W = \{W_1,\dots,W_r\}$ generates $\mathbb{D}_K$. Moreover if $\psi$ is a $K$-spherical function, then
\[ \psi = \frac{1}{|F|}\sum_{w\in F}\phi\circ w, \qquad (8.1) \]
for some $K_0$-spherical function $\phi$.

Proof. Take $D$ in $\mathbb{D}_K$. As an element of $\mathbb{D}_{K_0}$, $D$ is a polynomial in the $V_j$. Averaging over the action of $F$, we can express $D$ as an $F$-invariant polynomial in the $V_j$. Hence $D$ is a polynomial in $W_1 = \rho_1(V_1,\dots,V_d),\dots,W_r = \rho_r(V_1,\dots,V_d)$.

Recall that all $K_0$- and $K$-spherical functions are of positive type [2]. Let $P_K$ (respectively $P_{K_0}$) denote the convex set of $K$-invariant (respectively $K_0$-invariant) functions of positive type equal to 1 at the identity element, and consider the linear map $J : L^\infty_{K_0}\to L^\infty_K$ defined by $J\varphi = \frac{1}{|F|}\sum_{w\in F}\varphi\circ w$. Since $P_{K_0}$ and $P_K$ are weak$^*$-compact and $J$ maps $P_{K_0}$ to $P_K$, the extremal points of $P_K$ are images of extremal points of $P_{K_0}$. This proves that every $K$-spherical function has the form (8.1).

Conversely, if $\psi$ is given by (8.1) and $D\in\mathbb{D}_K$, then
\[ D\psi = \frac{1}{|F|}\sum_{w\in F}D(\phi\circ w) = \frac{1}{|F|}\sum_{w\in F}(D\phi)\circ w = \widehat D(\phi)\,\psi, \]
showing that $\psi$ is $K$-spherical. □
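The averaging map $J$ in the proof above can be illustrated on a toy model. The following sketch assumes a hypothetical finite group, the two-element group swapping the coordinates of $\mathbb{R}^2$, rather than the quotient $F = K/K_0$ of the text, and represents functions as Python callables. It only illustrates that the average over the group is invariant and that averaging twice changes nothing; the names are illustrative.

```python
def identity(x):
    """Trivial group element."""
    return x


def swap(x):
    """Non-trivial element: exchange the two coordinates."""
    return (x[1], x[0])


# Hypothetical finite group acting on R^2 (closed under composition).
GROUP = (identity, swap)


def average(phi, group=GROUP):
    """Return J(phi) = (1/|F|) * sum over w in F of phi o w."""
    def j_phi(x):
        return sum(phi(w(x)) for w in group) / len(group)
    return j_phi
```

For instance, with `phi = lambda x: x[0] ** 3 + 2 * x[1]`, the function `average(phi)` takes the same value at `(1.0, 2.0)` and at `(2.0, 1.0)`, and `average(average(phi))` agrees with `average(phi)`.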



From Lemma 8.1 we derive the following property of the Gelfand spectra:
\[ \rho\big(\Sigma_{K_0}^V\big) = \Sigma_K^W \subset \mathbb{R}^r. \]
If $V$ is as above, the linear action of $F$ on $\mathcal{V}$ leaves $\Sigma_{K_0}^V$ invariant. For a $K_0$-invariant function $f$ and $w$ in $F$,
\[ \mathcal{G}_V(f\circ w) = (\mathcal{G}_Vf)\circ w. \qquad (8.2) \]
Let $\mathcal{S}_F(\Sigma_{K_0}^V)$ be the space of $F$-invariant elements in $\mathcal{S}(\Sigma_{K_0}^V)$.

Lemma 8.2. The map $f\mapsto f\circ\rho$ is an isomorphism between $\mathcal{S}(\Sigma_K^W)$ and $\mathcal{S}_F(\Sigma_{K_0}^V)$.

Proof. If $f$ is in $\mathcal{S}(\Sigma_K^W)$, let $\tilde f$ be any Schwartz extension of $f$ to $\mathbb{R}^r$. Then $g = \tilde f\circ\rho$ is an $F$-invariant Schwartz function on $\mathbb{R}^d$ and its restriction to $\Sigma_{K_0}^V$ is $f\circ\rho$. This proves the continuity of the map.

Conversely, given $g$ in $\mathcal{S}_F(\Sigma_{K_0}^V)$, let $\tilde g$ be an $F$-invariant Schwartz extension of $g$ to $\mathbb{R}^d$. Set $h = (\mathcal{E}\tilde g)|_{\Sigma_K^W}$, where $\mathcal{E}$ is the operator of Theorem 6.1 for the group $F$. The proof that the dependence of $h$ on $g$ is continuous is based on the simple observation that, for any Schwartz norm $\|\cdot\|_{(N)}$ on $\mathbb{R}^d$, the infimum of the norms of all extensions of $g$ is the same as the infimum restricted to its $F$-invariant extensions. □

We can now prove Theorem 1.1 for general $K$. Assume that $K$ is a compact group of automorphisms of $H_n$ and let $K_0$, $F$, $V_1,\dots,V_d$, $\rho$ be as above.

Take $f$ in $L^1_K(H_n)$. Denote by $\mathcal{G}_Vf$ (respectively $\mathcal{G}_Wf$) its Gelfand transform as a $K_0$-invariant (respectively $K$-invariant) function. Then $\mathcal{G}_Vf = \mathcal{G}_Wf\circ\rho$. In particular, a $K_0$-invariant function is $K$-invariant if and only if $\mathcal{G}_Vf$ is $F$-invariant.

If $f$ is in $\mathcal{S}_K(H_n)$, then $f$ is also $K_0$-invariant and $\mathcal{G}_Vf$ is in $\mathcal{S}_F(\Sigma_{K_0}^V)$ by Theorem 7.1; therefore $\mathcal{G}_Wf$ is in $\mathcal{S}(\Sigma_K^W)$ by Lemma 8.2.

Conversely, if $\mathcal{G}_Wf$ is in $\mathcal{S}(\Sigma_K^W)$, it follows as before that $\mathcal{G}_Vf$ is in $\mathcal{S}_F(\Sigma_{K_0}^V)$ and therefore $f$ is in $\mathcal{S}_K(H_n)$ by (8.2).

References

[1] F. Astengo, B. Di Blasio, F. Ricci, Gelfand transforms of polyradial Schwartz functions on the Heisenberg group, J. Funct. Anal. 251 (2007) 772–791.
[2] C. Benson, J. Jenkins, G. Ratcliff, On Gelfand pairs associated with solvable Lie groups, Trans. Amer. Math. Soc. 321 (1990) 85–116.
[3] C. Benson, J. Jenkins, G. Ratcliff, Bounded spherical functions on Heisenberg groups, J. Funct. Anal. 105 (1992) 409–443.
[4] C. Benson, J. Jenkins, G. Ratcliff, T. Worku, Spectra for Gelfand pairs associated with the Heisenberg group, Colloq. Math. 71 (1996) 305–328.
[5] C. Benson, J. Jenkins, R.L. Lipsman, G. Ratcliff, A geometric criterion for Gelfand pairs associated with the Heisenberg group, Pacific J. Math. 178 (1997) 1–36.
[6] C. Benson, J. Jenkins, G. Ratcliff, The spherical transform of a Schwartz function on the Heisenberg group, J. Funct. Anal. 154 (1998) 379–423.
[7] C. Benson, G. Ratcliff, Combinatorics and spherical functions on the Heisenberg group, Represent. Theory 2 (1998) 79–105.



[8] C. Benson, G. Ratcliff, Rationality of the generalized binomial coefficients for a multiplicity free action, J. Austral. Math. Soc. Ser. A 68 (2000) 387–410.
[9] G. Carcano, A commutativity condition for algebras of invariant functions, Boll. Unione Mat. Ital. Sez. B 7 (1987) 1091–1105.
[10] F. Ferrari Ruffino, The topology of the spectrum for Gelfand pairs on Lie groups, Boll. Unione Mat. Ital. Sez. B Artic. Ric. Mat. (8) 10 (2007) 569–579.
[11] G.B. Folland, E.M. Stein, Estimates for the $\bar\partial_b$ complex and analysis on the Heisenberg group, Comm. Pure Appl. Math. 27 (1974) 429–522.
[12] G.B. Folland, E.M. Stein, Hardy Spaces on Homogeneous Groups, Princeton Univ. Press, Princeton, NJ, 1982.
[13] R. Gangolli, V.S. Varadarajan, Harmonic Analysis of Spherical Functions on Real Reductive Groups, Springer, Berlin, 1988.
[14] D. Geller, Fourier analysis on the Heisenberg group. I. Schwartz space, J. Funct. Anal. 36 (2) (1980) 205–254.
[15] B. Helffer, J. Nourrigat, Caractérisation des opérateurs hypoelliptiques homogènes invariants à gauche sur un groupe de Lie gradué, Comm. Partial Differential Equations 4 (1979) 899–958.
[16] S. Helgason, Differential Geometry and Symmetric Spaces, Academic Press, New York, 1962.
[17] S. Helgason, Groups and Geometric Analysis, Academic Press, New York, 1984.
[18] R. Howe, T. Umeda, The Capelli identity, the double commutant theorem and multiplicity-free actions, Math. Ann. 290 (1991) 565–619.
[19] A. Hulanicki, A functional calculus for Rockland operators on nilpotent Lie groups, Studia Math. 78 (1984) 253–266.
[20] A. Hulanicki, J.W. Jenkins, J. Ludwig, Minimum eigenvalues for positive Rockland operators, Proc. Amer. Math. Soc. 94 (1985) 718–720.
[21] V. Kac, Some remarks on nilpotent orbits, J. Algebra 64 (1980) 190–213.
[22] B. Malgrange, Ideals of Differentiable Functions, Oxford Univ. Press, Bombay, 1966.
[23] J.N. Mather, Differentiable invariants, Topology 16 (1977) 145–155.
[24] E. Nelson, W.F. Stinespring, Representation of elliptic operators in an enveloping algebra, Amer. J. Math. 81 (1959) 547–560.
[25] G.W. Schwarz, Smooth functions invariant under the action of a compact Lie group, Topology 14 (1975) 63–68.
[26] A. Veneruso, Schwartz kernels on the Heisenberg group, Boll. Unione Mat. Ital. Sez. B Artic. Ric. Mat. (8) 6 (2003) 657–666.
[27] H. Weyl, David Hilbert and his mathematical work, Bull. Amer. Math. Soc. 50 (1944) 612–654.


A Sobolev-like inequality for the Dirac operator

Simon Raulot 1

Université de Neuchâtel, Institut de Mathématiques, Rue Emile-Argand 11, 2007 Neuchâtel, Switzerland

Received 26 May 2008; accepted 11 November 2008
Available online 28 November 2008
Communicated by Paul Malliavin

Abstract

In this article, we prove a Sobolev-like inequality for the Dirac operator on closed compact Riemannian spin manifolds with a nearly optimal Sobolev constant. As an application, we give a criterion for the existence of solutions to a nonlinear equation with critical Sobolev exponent involving the Dirac operator. We finally specify a case where this equation can be solved.
© 2008 Elsevier Inc. All rights reserved.

Keywords: Dirac operator; Sobolev inequality; Conformal geometry; Nonlinear elliptic equations

1. Introduction

Let $(M^n,g)$ be a compact Riemannian manifold of dimension $n \ge 3$. The Sobolev embedding theorem asserts that the Sobolev space $H_1^2$ of functions $u\in L^2$ such that $\nabla u\in L^2$ embeds continuously in the Lebesgue space $L^N$ (with $N = \frac{2n}{n-2}$). In other words, there exist two constants $A, B > 0$ such that, for all $u\in H_1^2$, we have
\[ \Big(\int_M |u|^N\,dv(g)\Big)^{2/N} \le A\int_M |\nabla u|^2\,dv(g) + B\int_M u^2\,dv(g). \qquad S(A,B) \]

E-mail address: [email protected].
1 Supported by the Swiss SNF grant 20-118014/1.
S. Raulot / Journal of Functional Analysis 256 (2009) 1588–1617


Considerable work has been devoted to the analysis of sharp Sobolev-type inequalities, very often in connection with concrete problems from geometry. One of these concerns the best constant in $S(A,B)$ defined by
\[ A_2(M) := \inf\mathcal{A}_2(M), \]
where
\[ \mathcal{A}_2(M) := \big\{A > 0 : \exists B > 0 \text{ such that } S(A,B) \text{ holds for all } u\in C^\infty(M)\big\}. \]
From $S(A,B)$ and by definition of $A_2$, we easily get that:

(1) $A_2(M) \ge K(n,2)^2$,
(2) for any $\varepsilon > 0$ there exists $B_\varepsilon > 0$ such that inequality $S(A_2(M)+\varepsilon, B_\varepsilon)$ holds.

Here $K(n,2)^2$ denotes the best constant of the corresponding Sobolev embedding theorem in the Euclidean space given by (see [8,33]):
\[ K(n,2)^2 = \frac{4}{n(n-2)\,\omega_n^{2/n}}, \]
where $\omega_n$ stands for the volume of the standard $n$-dimensional sphere. In fact, Aubin [8] showed that $A_2(M) = K(n,2)^2$ and conjectured that $S(A,B)$ should hold for $A = K(n,2)^2$, that is, that the set $\mathcal{A}_2(M)$ is closed. The proof of this conjecture by Hebey and Vaugon (see [22,23]) gave rise to various interesting problems dealing with the best constants in Riemannian Geometry. One of those, given in [24], is the problem of prescribed critical functions, which studies the existence of functions for which $S(A_2(M),B_0)$ is an equality (here $B_0 > 0$ denotes the infimum of the $B > 0$ such that $S(A_2(M),B)$ holds). For more details and related topics, we refer to [16].

Recall that one of the first geometric applications of the best constant problem was discovered by Aubin [7] regarding the Yamabe problem. This famous problem of Riemannian geometry can be stated as follows: given a compact Riemannian manifold $(M^n,g)$ of dimension $n \ge 3$, can one find a metric conformal to $g$ such that its scalar curvature is constant? This problem has a long and fruitful history and it has been completely solved in several steps by Yamabe [36], Trudinger [35], Aubin [7] and finally Schoen [31] using the Positive Mass Theorem coming from General Relativity (see also [27] for a complete review). The Yamabe problem is in fact equivalent to finding a smooth positive solution $u\in C^\infty(M)$ to a nonlinear elliptic equation:
\[ L_gu := 4\,\frac{n-1}{n-2}\,\Delta_gu + R_gu = \lambda u^{N-1}, \qquad (1) \]
where $L_g$ is known as the conformal Laplacian (or the Yamabe operator), $\Delta_g$ (resp. $R_g$) denotes the standard Laplacian acting on functions (resp. the scalar curvature) with respect to the Riemannian metric $g$ and $\lambda\in\mathbb{R}$ is a constant. Indeed, if such a function exists then the metric $\bar g = u^{N-2}g$ is conformal to $g$ and satisfies $R_{\bar g} = \lambda$. A standard variational approach does not allow one to conclude, because of the lack of compactness in the Sobolev embedding theorem involved in this method. However, Aubin [7] proved that if:

\[ Y(M,[g]) = \inf_{f\ne 0} I(f) < Y(\mathbb{S}^n,[g_{st}]) = 4\,\frac{n-1}{n-2}\,K(n,2)^{-2} \qquad (2) \]
holds, where $I$ denotes the functional defined by
\[ I(f) = \frac{4\,\frac{n-1}{n-2}\int_M|\nabla f|^2\,dv(g) + \int_M R_gf^2\,dv(g)}{\big(\int_M|f|^N\,dv(g)\big)^{2/N}}, \]
Eq. (1) admits a positive smooth solution. This condition points out the tight relation between the Yamabe problem and the best constant involved in the Sobolev inequality. Moreover, it is sharp in the sense that for all compact Riemannian manifolds $(M^n,g)$, the following inequality holds (see [7]):
\[ Y(M,[g]) \le 4\,\frac{n-1}{n-2}\,K(n,2)^{-2}. \qquad (3) \]
In the setting of Spin Geometry, a problem similar to the Yamabe problem has been studied in several works of Ammann (see [4,6]), and Ammann, Humbert and others (see [2,3]). The starting point of all these works is the Hijazi inequality [18,19] which links the first eigenvalue of two elliptic differential operators: the conformal Laplacian $L_g$ and the Dirac operator $D_g$. Hijazi's result can be stated as follows:
\[ \lambda_1^2(g)\,\mathrm{Vol}(M,g)^{2/n} \ge \frac{n}{4(n-1)}\,Y(M,[g]), \qquad (4) \]
where $\lambda_1(g)$ denotes the first eigenvalue of the Dirac operator $D_g$. Thereafter, Ammann studies a spin conformal invariant defined by
\[ \lambda_{\min}(M,[g],\sigma) := \inf_{\bar g\in[g]} \lambda_1(\bar g)\,\mathrm{Vol}(M,\bar g)^{1/n} \qquad (5) \]
and points out that studying critical metrics for this invariant involves similar analytic problems to those appearing in the Yamabe problem. Indeed, finding a critical metric of (5) is equivalent to proving the existence of a smooth spinor field $\varphi$ minimizing the functional defined by
\[ F_g(\psi) = \frac{\big(\int_M|D_g\psi|^{\frac{2n}{n+1}}\,dv(g)\big)^{\frac{n+1}{n}}}{\big|\int_M\langle D_g\psi,\psi\rangle\,dv(g)\big|}, \qquad (6) \]
with the corresponding Euler–Lagrange equation given by
\[ D_g\varphi = \lambda_{\min}(M,[g],\sigma)\,|\varphi|^{\frac{2}{n-1}}\varphi. \qquad (7) \]

In [4], the author observes that a standard variational approach does not yield the existence of such minimizers. Indeed, the Sobolev inclusion involved in this method is precisely the one for which compactness is lost in the Rellich–Kondrakov theorem. The argument to overcome this problem is similar to the one used in the Yamabe problem. In fact, one can prove the existence of a smooth solution of Eq. (7), but this solution can be trivial (that is, identically zero). One therefore looks for a criterion which prevents this situation. It is now important to note that an inequality similar to (3) holds in the spinorial setting (see [2,5]), namely:
\[ \lambda_{\min}(M,[g],\sigma) \le \lambda_{\min}(\mathbb{S}^n,[g_{st}],\sigma_{st}) = \frac{n}{2}\,\omega_n^{1/n} = \sqrt{\frac{n}{n-2}}\;K(n,2)^{-1}, \qquad (8) \]

where $(\mathbb{S}^n,g_{st},\sigma_{st})$ stands for the $n$-dimensional sphere equipped with its standard Riemannian metric $g_{st}$ and its standard spin structure $\sigma_{st}$. The criterion obtained by Ammann in [4] is tightly related to the one involved in the Yamabe problem since he shows that if inequality (8) is strict then the spinor field solution of (7) is nontrivial (compare with (2)).

In this paper, we study a more general nonlinear equation involving the Dirac operator (since it also includes Ammann's result in the case of an invertible Dirac operator). This equation is closely related to the problem of conformal immersion of a manifold as a hypersurface in a manifold carrying a parallel spinor (see [1] for example). The proof we give here relies on a Sobolev-type inequality for the Dirac operator. It emphasizes in particular that the same kind of questions as those arising from the Yamabe problem can be studied in the context of Spin Geometry.

2. Geometric and analytic preliminaries

2.1. Geometric preliminaries

In this paragraph, we recall briefly some basic facts on Spin Geometry. For more details, we refer to [14] or [25] for example. Let $(M^n,g,\sigma)$ be an $n$-dimensional compact Riemannian manifold equipped with a spin structure denoted by $\sigma$. It is well known that on such a manifold one can construct a complex vector bundle of rank $2^{[n/2]}$ denoted by $\Sigma_g(M)$, called the complex spinor bundle. This bundle is naturally endowed with the spinorial Levi-Civita connection $\nabla$, a pointwise Hermitian scalar product $\langle\cdot,\cdot\rangle$ and a Clifford multiplication "$\cdot$". There is also a natural elliptic differential operator of order one acting on sections of this bundle, the Dirac operator. This operator is locally given by
\[ D_g\varphi = \sum_{i=1}^{n} e_i\cdot\nabla_{e_i}\varphi \]
for all $\varphi\in\Gamma(\Sigma_g(M))$, where $\{e_1,\dots,e_n\}$ is a local $g$-orthonormal frame of the tangent bundle. It defines a self-adjoint operator whose spectrum consists of an unbounded sequence of real numbers. Estimates on the spectrum of the Dirac operator have been and remain the main subject of several works (a nonexhaustive list is [13], [18] or [9]). As pointed out in the introduction, a key result for the remainder of this paper is the Hijazi inequality. More precisely, Hijazi gives an inequality which links the square of the first eigenvalue of the Dirac operator with the first eigenvalue of the conformal Laplacian. The proof of this inequality relies on the famous Schrödinger–Lichnerowicz formula (see [20] for example) and on the conformal covariance of the Dirac operator. In fact, if $\bar g\in[g]$, there is a canonical identification between the spinor bundle over $(M,g)$ and the one over $(M,\bar g)$ (see [21] or [18]). This identification will be denoted by
\[ \Sigma_g(M)\to\Sigma_{\bar g}(M), \qquad \varphi\mapsto\bar\varphi. \qquad (9) \]



Under this isomorphism, one can relate the Dirac operators $D_g$ and $D_{\bar g}$ acting respectively on $\Sigma_g(M)$ and $\Sigma_{\bar g}(M)$. Indeed, if $\bar g = e^{2u}g$ where $u$ is a smooth function, then:
\[ D_{\bar g}\bar\varphi = e^{-\frac{n+1}{2}u}\;\overline{D_g\big(e^{\frac{n-1}{2}u}\varphi\big)} \qquad (10) \]
for all $\varphi\in\Gamma(\Sigma_g(M))$.

2.2. Analytic preliminaries

In this section we give some well-known facts on Sobolev spaces on spinors and on the analysis of differential equations involving the Dirac operator. In the following, we assume that $(M^n,g)$ is an $n$-dimensional compact Riemannian spin manifold ($n \ge 2$) such that the Dirac operator is invertible. We let $L^q := L^q(\Sigma_g(M))$, the space of spinors $\varphi\in\Gamma(\Sigma_g(M))$ such that:
\[ \|\varphi\|_q := \Big(\int_M|\varphi|^q\,dv(g)\Big)^{1/q} \]
is finite. The Sobolev space $H_1^q := H_1^q(\Sigma_g(M))$ is defined as being the completion of the space of smooth spinor fields with respect to the norm:
\[ \|\varphi\|_{1,q} := \|\nabla\varphi\|_q + \|\varphi\|_q. \qquad (11) \]
However, since our problem involves the Dirac operator, it would be more convenient if we could consider the following equivalent norm:

Lemma 1. The map:
\[ \varphi\mapsto\|hD_g\varphi\|_q \qquad (12) \]
defines a norm equivalent to the $H_1^q$-norm for every smooth positive function $h$ on $M$.

Proof. From the definition of (12) it is clear that this map defines a norm on the space of smooth spinors which is equivalent to the norm defined by $\varphi\mapsto\|D_g\varphi\|_q$. Now we show that this norm is equivalent to the $H_1^q$-norm. In fact, for any smooth spinor field $\varphi$, the Cauchy–Schwarz inequality yields $|D_g\varphi|^2 \le n|\nabla\varphi|^2$, which implies the existence of a positive constant $C_1 > 0$ such that:
\[ \|D_g\varphi\|_q \le C_1\big(\|\nabla\varphi\|_q + \|\varphi\|_q\big). \]
On the other hand, with the help of pseudo-differential operators (see the proof of Lemma 2), it is not difficult to see that there also exists another positive constant $C_2 > 0$ such that:
\[ \|\nabla\varphi\|_q + \|\varphi\|_q \le C_2\|D_g\varphi\|_q, \]
which concludes the proof of this lemma. □

Using this result and the fact that the Sobolev space H1 is defined as the completion of q the space of smooth spinors with respect to the H1 -norm, it is clear that one can consider the Sobolev space as defined independently from one of the three preceding norms. It will provide a very useful tool to solve the nonlinear equation studied in this paper. A natural way to prove the existence of solutions for this kind of equation is the variational approach. It consists of minimizing a certain functional defined on an adapted Sobolev space and then to apply the machinery of Sobolev–Kondrakov embedding theorems, Schauder estimates and a-priori elliptic estimates. Here we will use this method, and we refer to the works of Ammann [4,6] for proofs of all these results in the setting of Spin Geometry. However, for clarity, we prove the following result which will be of great help in the next section: Lemma 2. If the Dirac operator is invertible then there exists a constant C > 0 such that for all q ϕ ∈ H1 we have ϕp CDg ϕq , where p −1 + q −1 = 1 and 2 p < ∞. Proof. We show that the operator: Dg−1 : Lq → Lp defines a continuous map. Since Dg−1 is a pseudo-differential operator of order −1 the operator 1

(Id + ∇ ∗ ∇) 2 Dg−1 is a pseudo-differential operator of order zero hence (see [34]) a bounded operator from Ls to Ls for all s > 1. Thus if ϕ ∈ Lq : 1 Id + ∇ ∗ ∇ 2 Dg−1 ϕ ∈ Lq , and the spinor field Dg−1 ϕ is in the Sobolev space H1 which is continuously embedded in Lp (using the Sobolev embedding theorem). Then there exists a positive constant C > 0 such that: q

−1

D ϕ Cϕq , g p and this concludes the proof.

2

1594


Remark 1.

(1) For $q = q_D = 2n/(n+1)$ and $p_D$ such that $p_D^{-1} + q_D^{-1} = 1$ the quotient:
\[ C_g(\varphi) = \frac{\|D_g\varphi\|_{q_D}}{\|\varphi\|_{p_D}} \]
is invariant under a conformal change of metric, that is:
\[ C_{\bar g}\big(h^{-\frac{n-1}{2}}\bar\varphi\big) = C_g(\varphi) \qquad (13) \]
for all $\varphi\in H_1^{q_D}$ and for $\bar g = h^2g\in[g]$. Indeed, an easy computation using the canonical identification (9) between $\Sigma_g(M)$ and $\Sigma_{\bar g}(M)$ and the formula (10) which relates $D_g$ and $D_{\bar g}$, leads to (13).

(2) On the $n$-dimensional sphere $(\mathbb{S}^n,g_{st},\sigma_{st})$ endowed with its standard spin structure $\sigma_{st}$, the Dirac operator is invertible since the scalar curvature is positive. Then using Lemma 2, there exists a constant $C > 0$ such that for all $\Phi\in H_1^{q_D}$:
\[ \|\Phi\|_{p_D} \le C\big\|D^{\mathbb{S}^n}\Phi\big\|_{q_D}. \]
Moreover, since the standard sphere $(\mathbb{S}^n\setminus\{q\},g_{st})$ (where $q\in\mathbb{S}^n$) is conformally isometric to the Euclidean space $(\mathbb{R}^n,\xi)$, we conclude that for all $\psi\in\Gamma_c(\Sigma_\xi(\mathbb{R}^n))$:
\[ \|\psi\|_{p_D} \le C\|D_\xi\psi\|_{q_D}, \]
where $\Gamma_c(\Sigma_\xi(\mathbb{R}^n))$ denotes the space of smooth spinor fields over $(\mathbb{R}^n,\xi)$ with compact support.

3. The Sobolev inequality

In this section, we prove a Sobolev inequality in the spinorial setting. The classical Sobolev inequality $S(A,B)$ shows in particular that the Sobolev space of functions $H_1^2$ is continuously embedded in $L^{\frac{2n}{n-2}}$. Here one could interpret our result as the inequality involved in the continuous embedding:
\[ H_1^{2n/(n+1)}\hookrightarrow H_{1/2}^2, \]
where $H_{1/2}^2$ is defined as the completion of the space of smooth spinors with respect to the norm:
\[ \|\psi\|_{\frac12,2} := \sum_i|\lambda_i|^{\frac12}|A_i|^2. \]

Here $\psi = \sum_iA_i\psi_i$ is the decomposition of any smooth spinor in the spectral resolution $\{\lambda_i;\psi_i\}$ of $D_g$ (see [6]). Let us first examine the case of the sphere, which is the starting point of the inequality we want to prove. In fact, it is quite easy to compute that the invariant defined by (5) on the sphere is
\[ \lambda_{\min}(\mathbb{S}^n,[g_{st}],\sigma_{st}) = \inf_{\psi\ne 0}\frac{\big(\int_{\mathbb{S}^n}|D^{\mathbb{S}^n}\psi|^{\frac{2n}{n+1}}\,dv(g_{st})\big)^{\frac{n+1}{n}}}{\big|\int_{\mathbb{S}^n}\langle D^{\mathbb{S}^n}\psi,\psi\rangle\,dv(g_{st})\big|} = \frac{n}{2}\,\omega_n^{1/n}. \qquad (14) \]
The proof of this fact relies on the Hijazi inequality (4) and on the existence of real Killing spinors on the round sphere (see [15]). Thus using the conformal covariance of (14) and the fact that the sphere (minus a point) is conformally isometric to the Euclidean space, we can conclude that:
\[ \Big|\int_{\mathbb{R}^n}\langle D_\xi\psi,\psi\rangle\,dx\Big| \le \lambda_{\min}(\mathbb{S}^n,[g_{st}],\sigma_{st})^{-1}\Big(\int_{\mathbb{R}^n}|D_\xi\psi|^{\frac{2n}{n+1}}\,dx\Big)^{\frac{n+1}{n}} \qquad (15) \]

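for all $\psi \in \Gamma_c(\Sigma_\xi(\mathbb{R}^n))$. With this in mind, we can now state the main result of this section:

Theorem 3. Let $(M^n, g, \sigma)$ be an $n$-dimensional closed compact Riemannian spin manifold and suppose that the Dirac operator is invertible. Then for all $\varepsilon > 0$, there exists a constant $B_\varepsilon$ such that:

$$\left|\int_M \langle D_g \varphi, \varphi\rangle\, dv(g)\right| \le \big(K(n) + \varepsilon\big) \left(\int_M |D_g \varphi|^{\frac{2n}{n+1}}\, dv(g)\right)^{\frac{n+1}{n}} + B_\varepsilon \left(\int_M |\varphi|^{\frac{2n}{n+1}}\, dv(g)\right)^{\frac{n+1}{n}} \tag{16}$$

for all $\varphi \in H_1^{\frac{2n}{n+1}}$, where

$$K(n) := \lambda_{\min}(S^n, [g_{st}], \sigma_{st})^{-1} = \frac{2}{n}\,\omega_n^{-\frac1n} = \sqrt{\frac{n-2}{n}}\; K(n, 2).$$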
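As a quick numerical illustration of the constant in Theorem 3, the following sketch evaluates $\omega_n$, $\lambda_{\min}(S^n)$ and $K(n)$. The function names are hypothetical, and the closed form used for the classical Sobolev constant $K(n,2)$ (namely $K(n,2)^2 = 4 / (n(n-2)\omega_n^{2/n})$, valid for $n \ge 3$) is an assumption taken from the standard literature, not from this text.

```python
import math


def sphere_volume(n: int) -> float:
    """Volume omega_n of the unit round sphere S^n."""
    return 2.0 * math.pi ** ((n + 1) / 2.0) / math.gamma((n + 1) / 2.0)


def lambda_min_sphere(n: int) -> float:
    """Value (n/2) * omega_n^(1/n) appearing in (14)."""
    return 0.5 * n * sphere_volume(n) ** (1.0 / n)


def spinorial_constant(n: int) -> float:
    """K(n) = lambda_min(S^n)^(-1) = (2/n) * omega_n^(-1/n)."""
    return 1.0 / lambda_min_sphere(n)


def classical_sobolev_constant(n: int) -> float:
    """Assumed classical value K(n,2) = 2 / (sqrt(n(n-2)) * omega_n^(1/n)), n >= 3."""
    if n < 3:
        raise ValueError("K(n, 2) is only defined here for n >= 3")
    return 2.0 / (math.sqrt(n * (n - 2)) * sphere_volume(n) ** (1.0 / n))


if __name__ == "__main__":
    for n in range(3, 7):
        print(n, spinorial_constant(n),
              math.sqrt((n - 2) / n) * classical_sobolev_constant(n))
```

For $n = 2$ the helper `lambda_min_sphere` returns $2\sqrt{\pi}$, which is the constant appearing in the two-dimensional bound of Remark 5.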

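In order to prove (16), we need some well-known technical results which are summarized in the following lemma:

Lemma 4. Let $(a_i)_{1 \le i \le N_0} \subset \mathbb{R}_+$ ($N_0 \in \mathbb{N}^*$), $p \in [0, 1]$ and $q \ge 1$. The following inequalities hold:

(1) $\left(\sum_{i=1}^{N_0} a_i\right)^p \le \sum_{i=1}^{N_0} a_i^p$;

(2) $\sum_{i=1}^{N_0} a_i^q \le \left(\sum_{i=1}^{N_0} a_i\right)^q$;

(3) $\forall \varepsilon > 0$, $\exists C_\varepsilon > 0$, $\forall a, b \ge 0$: $(a + b)^p \le (1 + \varepsilon)\, a^p + C_\varepsilon\, b^p$;

(4) For all functions $f_1, \dots, f_r : M \to [0, \infty[$, we have

$$\sum_{i=1}^{r} \left(\int_M f_i^p\, dv(g)\right)^{\frac1p} \le \left(\int_M \Big(\sum_{i=1}^{r} f_i\Big)^p\, dv(g)\right)^{\frac1p}.$$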
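The first two items of Lemma 4 can be sanity-checked numerically. The sketch below is only an illustration on random non-negative samples (it proves nothing); the helper names and the small relative tolerance used to absorb floating-point rounding are assumptions of this example.

```python
import random


def item1_holds(a, p, rel_tol=1e-12):
    """(sum a_i)^p <= sum a_i^p for a_i >= 0 and 0 < p <= 1."""
    lhs = sum(a) ** p
    rhs = sum(x ** p for x in a)
    return lhs <= rhs * (1.0 + rel_tol)


def item2_holds(a, q, rel_tol=1e-12):
    """sum a_i^q <= (sum a_i)^q for a_i >= 0 and q >= 1."""
    lhs = sum(x ** q for x in a)
    rhs = sum(a) ** q
    return lhs <= rhs * (1.0 + rel_tol)


def random_check(trials=1000, seed=0):
    """Test both inequalities on random samples; True if no violation is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(0.0, 10.0) for _ in range(rng.randint(1, 8))]
        p = rng.uniform(0.05, 1.0)
        q = rng.uniform(1.0, 5.0)
        if not (item1_holds(a, p) and item2_holds(a, q)):
            return False
    return True
```

In the proof of Theorem 3 these inequalities are used with $p = n/(n+1)$, which is why the exponent $n/(n+1) \le 1$ matters there.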

We can now give the proof of inequality (16).

Proof of Theorem 3. Let $x \in M$ and $\varepsilon > 0$. Let $U$ (resp. $V$) be a neighborhood of $x \in M$ (resp. $0 \in \mathbb{R}^n$) such that the exponential map

$$\exp_x : V \subset \mathbb{R}^n \to U \subset M$$

is a diffeomorphism. Then we can identify the spinor bundle over $(U, g)$ with the one over $(V, \xi)$, that is, there exists a map

$$\tau : \Sigma_g(U) \to \Sigma_\xi(V) \tag{17}$$

which is a fiberwise isometry (see [10]). Moreover, the Dirac operators $D_g$ and $D_\xi$ (acting respectively on $\Sigma_g(U)$ and $\Sigma_\xi(V)$) are related by the formula

$$D_g \varphi(y) = \tau^{-1}\big(D_\xi(\tau(\varphi))\big)\big(\exp_x^{-1}(y)\big) + \rho(\varphi)(y) \tag{18}$$

for all $y \in U$, where $\rho(\varphi) \in \Gamma(\Sigma_g(U))$ is a smooth spinor such that $|\rho(\varphi)| \le \varepsilon |\varphi|$. Now since $M$ is compact, we choose a finite sequence $(x_i)_{1 \le i \le N_0} \subset M$ and a finite cover $(U_i)_{1 \le i \le N_0}$ of $M$ (where $U_i$ is a neighborhood of $x_i \in M$) such that there exist open neighborhoods $(V_i)_{1 \le i \le N_0}$ of $0 \in \mathbb{R}^n$ and maps $\tau_i$ such that (17) and (18) are fulfilled. Moreover, without loss of generality, we can assume that

$$\frac{1}{1+\varepsilon}\,\xi \le g \le (1+\varepsilon)\,\xi$$

as symmetric bilinear forms, and consequently the volume forms satisfy

$$\frac{1}{(1+\varepsilon)^{n/2}}\, dx \le dv(g) \le (1+\varepsilon)^{\frac n2}\, dx. \tag{19}$$

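Let $(\eta_i)_{1 \le i \le N_0}$ be a smooth partition of unity subordinate to the covering $(U_i)_{1 \le i \le N_0}$; in other words, $\eta_i$ satisfies

$$\operatorname{supp}(\eta_i) \subset U_i, \qquad 0 \le \eta_i \le 1, \qquad \sum_{i=1}^{N_0} \eta_i = 1.$$

For $\varphi \in \Gamma(\Sigma_g(M))$, we write

$$(\mathrm{LHS}) := \left|\int_M \langle D_g \varphi, \varphi\rangle\, dv(g)\right| = \left|\sum_{i=1}^{N_0} \int_M \langle \sqrt{\eta_i}\, D_g(\varphi), \sqrt{\eta_i}\, \varphi\rangle\, dv(g)\right| = \left|\sum_{i=1}^{N_0} \int_M \langle D_g(\sqrt{\eta_i}\, \varphi), \sqrt{\eta_i}\, \varphi\rangle\, dv(g)\right|$$

since (LHS) is real and $\operatorname{Re}\langle d(\sqrt{\eta_i}) \cdot \varphi, \varphi\rangle = 0$. Inequality (19) leads to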


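$$(\mathrm{LHS}) \le (1+\varepsilon)^{\frac n2} \sum_{i=1}^{N_0} \left|\int_{\mathbb{R}^n} \big\langle \tau_i\big(D_g(\sqrt{\eta_i}\varphi)\big), \tau_i(\sqrt{\eta_i}\varphi)\big\rangle\, dx\right|$$

and using formula (18), we can write

$$(\mathrm{LHS}) \le (1+\varepsilon)^{\frac n2} \sum_{i=1}^{N_0} \left(\left|\int_{\mathbb{R}^n} \big\langle D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big), \tau_i(\sqrt{\eta_i}\varphi)\big\rangle\, dx\right| + C \int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^2\, dx\right).$$

On the other hand, since $\tau_i(\sqrt{\eta_i}\varphi) \in \Gamma_c(\Sigma_\xi(\mathbb{R}^n))$, inequality (15) gives

$$(\mathrm{LHS}) \le (1+\varepsilon)^{\frac n2} \sum_{i=1}^{N_0} \left(K(n) \left(\int_{\mathbb{R}^n} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^{\frac{2n}{n+1}}\, dx\right)^{\frac{n+1}{n}} + C \int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^2\, dx\right).$$

Now note that with the help of (4) of Lemma 4 and since $n/(n+1) \le 1$, it follows that:

$$\sum_{i=1}^{N_0} \left(\int_{\mathbb{R}^n} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^{\frac{2n}{n+1}}\, dx\right)^{\frac{n+1}{n}} \le \left(\int_{\mathbb{R}^n} \Big(\sum_{i=1}^{N_0} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^2\Big)^{\frac{n}{n+1}}\, dx\right)^{\frac{n+1}{n}}.$$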

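Using (3) of Lemma 4, we finally get

$$(\mathrm{LHS})^{\frac{n}{n+1}} \le (1+\varepsilon)^{\frac{(n+1)^2+1}{2(n+1)}}\, K(n)^{\frac{n}{n+1}}\, A + (1+\varepsilon)^{\frac{n^2}{2(n+1)}}\, C_\varepsilon\, B, \tag{20}$$

where

$$A = \int_{\mathbb{R}^n} \Big(\sum_{i=1}^{N_0} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^2\Big)^{\frac{n}{n+1}}\, dx \qquad\text{and}\qquad B = \Big(\sum_{i=1}^{N_0} \int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^2\, dx\Big)^{\frac{n}{n+1}}.$$

We now give an estimate of $A$. If we let $\gamma_i(\varphi) = d(\sqrt{\eta_i}) \cdot \varphi - \rho_i(\varphi)$, then:

$$\sum_{i=1}^{N_0} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^2 = \sum_{i=1}^{N_0} \big|D_g(\sqrt{\eta_i}\varphi) - \rho_i(\varphi)\big|^2 = \sum_{i=1}^{N_0} \big|\sqrt{\eta_i}\, D_g\varphi + \gamma_i(\varphi)\big|^2$$

and Minkowski's inequality leads to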

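S. Raulot / Journal of Functional Analysis 256 (2009) 1588–1617

$$\sum_{i=1}^{N_0} \big|\sqrt{\eta_i}\, D_g\varphi + \gamma_i(\varphi)\big|^2 \le \left(\Big(\sum_{i=1}^{N_0} |\sqrt{\eta_i}\, D_g\varphi|^2\Big)^{\frac12} + \Big(\sum_{i=1}^{N_0} |\gamma_i(\varphi)|^2\Big)^{\frac12}\right)^2 \le \big(|D_g\varphi| + C|\varphi|\big)^2 \quad (\text{using (1) of Lemma 4}).$$

Thus we have shown that:

$$A \le (1+\varepsilon)^{\frac n2} \int_M \big(|D_g\varphi|^2 + C|\varphi|^2 + C|D_g\varphi||\varphi|\big)^{\frac{n}{n+1}}\, dv(g),$$

and with (1) of Lemma 4 (applied with $p = n/(n+1)$), we get

$$A \le (1+\varepsilon)^{\frac n2} \left(\int_M |D_g\varphi|^{\frac{2n}{n+1}}\, dv(g) + C \int_M |\varphi|^{\frac{2n}{n+1}}\, dv(g) + C \int_M |D_g\varphi|^{\frac{n}{n+1}} |\varphi|^{\frac{n}{n+1}}\, dv(g)\right).$$

Then we apply the Cauchy–Schwarz inequality to the last term of the preceding inequality:

$$\int_M |D_g\varphi|^{\frac{n}{n+1}} |\varphi|^{\frac{n}{n+1}}\, dv(g) \le \left(\int_M |D_g\varphi|^{\frac{2n}{n+1}}\, dv(g)\right)^{\frac12} \left(\int_M |\varphi|^{\frac{2n}{n+1}}\, dv(g)\right)^{\frac12}$$

and next we use the Young inequality:

$$\int_M |D_g\varphi|^{\frac{n}{n+1}} |\varphi|^{\frac{n}{n+1}}\, dv(g) \le \frac{\varepsilon^2}{2} \int_M |D_g\varphi|^{\frac{2n}{n+1}}\, dv(g) + \frac{1}{2\varepsilon^2} \int_M |\varphi|^{\frac{2n}{n+1}}\, dv(g).$$

Finally, we have

$$A \le (1+\varepsilon)^{\frac n2} \left(\Big(1 + \frac{\varepsilon^2}{2}\Big) \int_M |D_g\varphi|^{\frac{2n}{n+1}}\, dv(g) + C_\varepsilon \int_M |\varphi|^{\frac{2n}{n+1}}\, dv(g)\right).$$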

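Now we estimate $B$ in inequality (20). Hölder's inequality gives

$$\int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^2\, dx \le \left(\int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^{\frac{2n}{n-1}}\, dx\right)^{\frac{n-1}{2n}} \left(\int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^{\frac{2n}{n+1}}\, dx\right)^{\frac{n+1}{2n}}$$

and using (1) of Lemma 4, the preceding inequality leads to

$$B \le \sum_{i=1}^{N_0} \left(\int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^{\frac{2n}{n-1}}\, dx\right)^{\frac{n-1}{2(n+1)}} \left(\int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^{\frac{2n}{n+1}}\, dx\right)^{\frac12}.$$

With the help of (2) of Remark 1, there exists a constant $C > 0$ such that:

$$B \le C \sum_{i=1}^{N_0} \left(\int_{\mathbb{R}^n} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^{\frac{2n}{n+1}}\, dx\right)^{\frac12} \left(\int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^{\frac{2n}{n+1}}\, dx\right)^{\frac12}.$$

On the other hand, the Young inequality gives

$$B \le \sum_{i=1}^{N_0} \left(C\varepsilon^2 \int_{\mathbb{R}^n} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^{\frac{2n}{n+1}}\, dx + \frac{C}{\varepsilon^2} \int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^{\frac{2n}{n+1}}\, dx\right)$$

and it is easy to see that:

$$\frac{C}{\varepsilon^2} \sum_{i=1}^{N_0} \int_{\mathbb{R}^n} |\tau_i(\sqrt{\eta_i}\varphi)|^{\frac{2n}{n+1}}\, dx \le C_\varepsilon (1+\varepsilon)^{\frac n2} \int_M |\varphi|^{\frac{2n}{n+1}}\, dv(g).$$

To conclude, an argument similar to the one used in the estimate of $A$ shows that:

$$\sum_{i=1}^{N_0} \int_{\mathbb{R}^n} \big|D_\xi\big(\tau_i(\sqrt{\eta_i}\varphi)\big)\big|^{\frac{2n}{n+1}}\, dx \le (1+\varepsilon)^{\frac n2} \int_M \Big(|D_g\varphi|^{\frac{2n}{n+1}} + C|\varphi|^{\frac{2n}{n+1}} + |D_g\varphi|^{\frac{n}{n+1}} |\varphi|^{\frac{n}{n+1}}\Big)\, dv(g)$$

and the Cauchy–Schwarz inequality and the Young inequality lead to

$$B \le C\varepsilon^2 (1+\varepsilon)^{\frac n2} \int_M |D_g\varphi|^{\frac{2n}{n+1}}\, dv(g) + C_\varepsilon \int_M |\varphi|^{\frac{2n}{n+1}}\, dv(g).$$

Combining the estimates of $A$ and $B$ in (20) gives inequality (16). □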
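The proof above relies on a handful of exponent identities involving $q_D = 2n/(n+1)$, $p_D = 2n/(n-1)$ and $p = n/(n+1)$. The sketch below checks them in exact rational arithmetic; it is a bookkeeping aid under the reading of (20) and of the Hölder step given here, and the function name is hypothetical.

```python
from fractions import Fraction


def exponents_consistent(n: int) -> bool:
    """Check the exponent arithmetic used in the proof of Theorem 3 (n >= 2)."""
    p = Fraction(n, n + 1)          # power applied to (LHS) in (20)
    q_d = Fraction(2 * n, n + 1)    # critical exponent q_D
    p_d = Fraction(2 * n, n - 1)    # conjugate exponent p_D
    checks = [
        # q_D and p_D are Hoelder conjugate.
        1 / q_d + 1 / p_d == 1,
        # (integral of |D phi|^{q_D})^{(n+1)/n} is the squared L^{q_D} norm.
        q_d * Fraction(n + 1, n) == 2,
        # Exponent of (1 + eps) in front of A: (n/2) * p + 1.
        Fraction(n, 2) * p + 1 == Fraction((n + 1) ** 2 + 1, 2 * (n + 1)),
        # Exponent of (1 + eps) in front of B: (n/2) * p.
        Fraction(n, 2) * p == Fraction(n * n, 2 * (n + 1)),
        # Hoelder exponents in the estimate of B after raising to the power p.
        Fraction(n - 1, 2 * n) * p == Fraction(n - 1, 2 * (n + 1)),
        Fraction(n + 1, 2 * n) * p == Fraction(1, 2),
    ]
    return all(checks)
```

Calling `exponents_consistent(n)` for a range of dimensions is a cheap way to catch a misread exponent when following the chain of inequalities.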

4. A nonlinear equation for the Dirac operator

4.1. A criterion for the existence of solutions

As a direct application of Theorem 3, we give a sufficient criterion for the existence of solutions for a nonlinear equation involving the Dirac operator. More precisely, the aim of this section is to prove the following result:

Theorem 5. Let $(M^n, g)$ be an $n$-dimensional compact Riemannian spin manifold and let $H$ be a smooth positive function on $M$. If the Dirac operator is invertible and if

$$\lambda_{\min} < K(n)^{-1} (\max H)^{-\frac{2}{p_D}}, \tag{21}$$

then there exists a spinor field $\varphi \in C^{1,\alpha}(M) \cap C^\infty(M \setminus \varphi^{-1}(0))$ satisfying the following nonlinear elliptic equation:

$$D_g \varphi = \lambda_{\min}\, H |\varphi|^{\frac{2}{n-1}} \varphi \qquad\text{and}\qquad \int_M H |\varphi|^{\frac{2n}{n-1}}\, dv_g = 1. \tag{22}$$


In the statement of Theorem 5, we let for $2 \ge q \ge q_D$:

$$\lambda_q = \lambda_q(M, g, \sigma) := \inf_{\psi \neq 0} \frac{\|H^{-1/p} D_g \psi\|_q^2}{\left|\int_M \langle D_g \psi, \psi\rangle\, dv(g)\right|} = \inf_{\psi \neq 0} \frac{\left(\int_M H^{-q/p} |D_g \psi|^q\, dv(g)\right)^{\frac2q}}{\left|\int_M \langle D_g \psi, \psi\rangle\, dv(g)\right|}, \tag{23}$$

where the infimum is taken over all $\psi \in H_1^q$ and where $\lambda_{q_D}(M, g, \sigma) := \lambda_{\min}$. In the rest of this section, we will let

$$F_q(\psi) = F_{g,q}(\psi) = \frac{\|H^{-1/p} D_g \psi\|_q^2}{\left|\int_M \langle D_g \psi, \psi\rangle\, dv(g)\right|}.$$

Here $C^\infty(M)$ (resp. $C^{k,\alpha}(M)$) denotes the space of smooth spinor fields (resp. of spinor fields with finite $(k, \alpha)$-Hölder norm) on $M$ (see [6]).

Remark 2. Using Lemma 2, we have $\lambda_q > 0$.

A standard variational approach to (22) does not allow us to conclude, because the inclusion of $H_1^{q_D}$ in $L^{p_D}$ is not compact. The method we use here consists in proving the existence of solutions for subcritical equations, for which the Sobolev embedding is compact. Then we prove that one can extract a subsequence which converges to a solution of (22). We begin with the existence of solutions for subcritical equations, that is:

Proposition 6. For all $q \in (q_D, 2)$, there exists a spinor field $\varphi_q \in C^{1,\alpha}(M) \cap C^\infty(M \setminus \varphi_q^{-1}(0))$ such that:

$$D_g \varphi_q = \lambda_q H |\varphi_q|^{p-2} \varphi_q \tag{$E_q$}$$

where $p \in \mathbb{R}$ is such that $p^{-1} + q^{-1} = 1$. Moreover, we have

$$\int_M H |\varphi_q|^p\, dv_g = 1.$$

Proof. The proof of this result is divided into two parts. In a first step, we show that there exists a spinor field $\varphi_q \in H_1^q$ satisfying $(E_q)$, and then we show that this solution has the desired regularity. For the rest of this proof, we fix $q \in (q_D, 2)$.

First step: We prove the existence of a spinor field $\varphi_q \in H_1^q$ satisfying $(E_q)$. First we study the functional

$$F_q : \mathcal{H}_1^q := \Big\{\psi \in H_1^q \,:\, \int_M \langle D_g \psi, \psi\rangle\, dv(g) = 1\Big\} \to \mathbb{R}.$$

It is clear that $\mathcal{H}_1^q$ is nonempty. Take for example a smooth eigenspinor $\psi_1$ associated with the first positive eigenvalue $\lambda_1 > 0$ of the Dirac operator, so that $(\lambda_1)^{-1/2} \|\psi_1\|_2^{-1} \psi_1 \in \mathcal{H}_1^q$. On the other hand, since $F_q(\psi) \ge 0$ for all $\psi \in \mathcal{H}_1^q$, we can consider a minimizing sequence $(\psi_i)$ for $F_q$, that is a sequence such that $F_q(\psi_i) \to \lambda_q$ with $(\psi_i) \subset \mathcal{H}_1^q$. It is clear that this sequence is bounded in $H_1^q$ and thus there exists a spinor field $\psi_q \in H_1^q$ such that:

• $\psi_i \to \psi_q$ strongly in $L^p$ with $p^{-1} + q^{-1} = 1$ (by the Rellich–Kondrachov theorem).
• $\psi_i \to \psi_q$ weakly in $H_1^q$ (by reflexivity of the Sobolev space $H_1^q$).

Moreover, we write

$$\int_M \langle D_g \psi_q, \psi_q\rangle\, dv(g) = \int_M \langle D_g \psi_q, \psi_q - \psi_i\rangle\, dv(g) + \int_M \langle D_g \psi_q, \psi_i\rangle\, dv(g)$$

and we note that:

$$\left|\int_M \langle D_g \psi_q, \psi_q - \psi_i\rangle\, dv(g)\right| \le \|D_g \psi_q\|_q\, \|\psi_q - \psi_i\|_p \to 0,$$

where we used the Hölder inequality and the strong convergence in $L^p$. One can also easily check that the map

$$\Phi \mapsto \int_M \langle D_g \psi_q, \Phi\rangle\, dv(g)$$

defines a continuous linear form on $H_1^q$, and then the weak convergence in $H_1^q$ gives

$$\int_M \langle D_g \psi_q, \psi_q\rangle\, dv(g) = 1,$$

that is $\psi_q \in \mathcal{H}_1^q$. Once again because of the weak convergence in $H_1^q$ and of Lemma 1, we also have

$$\|H^{-1/p} D_g \psi_q\|_q^2 \le \liminf_{i \to \infty} \|H^{-1/p} D_g \psi_i\|_q^2 = \lambda_q$$

and thus $\lambda_q = F_q(\psi_q)$. Finally, we proved that there exists $\psi_q \in \mathcal{H}_1^q$ which reaches $\lambda_q$. For all smooth spinors $\Phi$, we compute

$$\frac{d}{dt}\Big|_{t=0} \|H^{-1/p} D_g(\psi_q + t\Phi)\|_q^2 = 2 \lambda_q^{\frac{2-q}{2}} \int_M \operatorname{Re}\big\langle H^{-q/p} |D_g \psi_q|^{q-2} D_g \psi_q, D_g \Phi\big\rangle\, dv(g)$$

and

$$\frac{d}{dt}\Big|_{t=0} \int_M \operatorname{Re}\big\langle D_g(\psi_q + t\Phi), \psi_q + t\Phi\big\rangle\, dv(g) = 2 \int_M \operatorname{Re}\langle \psi_q, D_g \Phi\rangle\, dv(g)$$

which, by the Lagrange multipliers theorem, gives the existence of a real number $\alpha$ such that:

$$\lambda_q^{\frac{2-q}{2}} \int_M \operatorname{Re}\big\langle H^{-q/p} |D_g \psi_q|^{q-2} D_g \psi_q, D_g \Phi\big\rangle\, dv(g) = \alpha \int_M \operatorname{Re}\langle \psi_q, D_g \Phi\rangle\, dv(g).$$

Moreover, since $\psi_q$ is a critical point for $F_q$, we get $\alpha = \lambda_q$ and thus:

$$\int_M \big\langle \lambda_q^{\frac q2} \psi_q - H^{-q/p} |D_g \psi_q|^{q-2} D_g \psi_q, D_g \Phi\big\rangle\, dv(g) = 0.$$

To sum up, we proved the existence of a spinor field $\psi_q \in \mathcal{H}_1^q$ satisfying weakly the equation:

$$|D_g \psi_q|^{q-2} D_g \psi_q = \lambda_q^{\frac q2} H^{q/p} \psi_q.$$

If we let $\varphi_q = \lambda_q^{1/2} \psi_q$, we can easily check that $\varphi_q \in H_1^q$ satisfies $(E_q)$ (where we used the relations $|\psi_q| = \lambda_q^{-q/2} H^{-q/p} |D_g \psi_q|^{q/p}$ and $|D_g \psi_q|^{2-q} = (\lambda_q^{q/2} H^{q/p} |\psi_q|)^{p-2}$). On the other hand, since

$$\int_M \langle D_g \psi_q, \psi_q\rangle\, dv(g) = 1,$$

and since the spinor field $\varphi_q$ is a solution of $(E_q)$, we deduce that:

$$\int_M H |\varphi_q|^p\, dv_g = 1.$$

Second step: We show that $\varphi_q \in C^{1,\alpha}(M) \cap C^\infty(M \setminus \varphi_q^{-1}(0))$. The proof of this result uses the classical "bootstrap argument." Indeed, the spinor field $\varphi_q$ is in the Sobolev space $H_1^q$, which is continuously embedded in $L^{p_1}$ with $p_1 = nq/(n-q)$ by the Sobolev embedding theorem. The Hölder inequality implies that $H |\varphi_q|^{p-2} \varphi_q \in L^{p_1/(p-1)}$ and then elliptic a-priori estimates (see [4]) give $\varphi_q \in H_1^{p_1/(p-1)}$. Once again, the Sobolev embedding theorem implies that $\varphi_q \in L^{p_2}$ with

$$p_2 = \frac{n p_1}{n(p-1) - p_1}$$

if $n(p-1) > p_1$, or $\varphi_q \in L^s$ for all $s > 1$ if $n(p-1) \le p_1$. Note that since $q > q_D$, we can easily check that $p_2 > p_1$ and thus we get a better regularity for the spinor field $\varphi_q$. In fact, if we push this argument further, we can show that $\varphi_q \in L^{p_i}$ for all $i$, where $p_i$ is the sequence of real numbers defined by

$$p_i := \begin{cases} \dfrac{n p_{i-1}}{n(p-1) - p_{i-1}} & \text{if } n(p-1) > p_{i-1}, \\[2mm] +\infty & \text{if } n(p-1) \le p_{i-1}. \end{cases}$$

A classical study of this sequence leads to the existence of a rank $i_0 \in \mathbb{N}$ such that $p_{i_0} = +\infty$ and thus we can conclude that $\varphi_q \in L^s$ for all $s > 1$. The elliptic a-priori estimate gives that $\varphi_q \in H_1^s$ for all $s > 1$, and if we apply the Sobolev embedding theorem, we conclude that $\varphi_q \in C^{0,\alpha}(M)$ for $\alpha \in (0, 1)$. Hence $H |\varphi_q|^{p-2} \varphi_q \in C^{0,\alpha}(M)$ as well, and the Schauder estimate (see [4]) gives $\varphi_q \in C^{1,\alpha}(M)$. It is clear that one can carry on this argument on $M \setminus \varphi_q^{-1}(0)$ to obtain $\varphi_q \in C^\infty(M \setminus \varphi_q^{-1}(0))$. □
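The bootstrap exponents $p_i$ of the second step can be tabulated to see the finite termination claimed above. The following is a minimal sketch in floating-point arithmetic, assuming $q_D < q < 2$ and $n \ge 2$; the function name and the `max_steps` safety cap are hypothetical additions of this example.

```python
import math


def bootstrap_exponents(n: int, q: float, max_steps: int = 1000) -> list:
    """Return [p_1, p_2, ...] up to the first index where p_i = +infinity.

    p_1 = n q / (n - q) comes from the Sobolev embedding of H_1^q, and
    p_i = n p_{i-1} / (n (p - 1) - p_{i-1}) while n (p - 1) > p_{i-1},
    where p is the conjugate exponent of q.
    """
    q_d = 2.0 * n / (n + 1)
    if not (q_d < q < 2.0):
        raise ValueError("q must lie strictly between q_D and 2")
    p = q / (q - 1.0)              # conjugate exponent: 1/p + 1/q = 1
    threshold = n * (p - 1.0)
    exponents = [n * q / (n - q)]  # p_1
    while math.isfinite(exponents[-1]) and len(exponents) < max_steps:
        prev = exponents[-1]
        if threshold > prev:
            exponents.append(n * prev / (threshold - prev))
        else:
            exponents.append(math.inf)
    return exponents


if __name__ == "__main__":
    print(bootstrap_exponents(3, 1.6))
```

For $n = 3$ and $q = 1.6$ the list has three entries, the last one being infinite. Taking $q$ closer to $q_D$ lengthens the sequence, in line with the fact that the argument breaks down at the critical exponent.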


Remark 3. If we assume that $p \ge 2$, the regularity of the spinor field $\varphi_q$ can be improved to $C^{2,\alpha}(M)$.

In the following, we want to prove the existence of a solution of Eq. $(E_{q_D})$. However, we cannot argue as in the proof of Proposition 6, because the embedding $H_1^{q_D} \hookrightarrow L^{p_D}$, which is precisely the one involved in our problem, is not compact. The idea is to adapt the proof of the Yamabe problem (see for example [27]). Indeed, we will prove that one can extract a subsequence from the sequence of solutions $(\varphi_q)$ which converges to a weak solution of problem (22) (see Lemma 7). Then in Lemma 8, we will get the desired regularity for this solution, and finally in Lemma 9, using inequality (16) of Theorem 3, we will be able to exclude the trivial solution. So we first have

Lemma 7. There exists a sequence $(q_i)$ which tends to $q_D$ and such that the corresponding sequence $(\varphi_{q_i})$, solution of $(E_{q_i})$, converges to a weak solution $\varphi \in H_1^{q_D}$ of (22).

Proof. It is clear that without loss of generality, we can suppose that the volume of the manifold $(M, g)$ is equal to 1. Otherwise, because of the conformal covariance of Eq. (22), we replace the metric with a homothetic one (and so a conformal one). In a similar way, we can also assume (because of a rescaling argument) that the maximum of the function $H$ is equal to 1. Now we prove that the sequence $(\varphi_q)$ is uniformly bounded in $H_1^{q_D}$. Indeed, since $q \ge q_D$, the Hölder inequality gives

$$\|H^{-1/p_D} D_g \varphi_q\|_{q_D}^2 \le \|H^{-1/p_D} D_g \varphi_q\|_q^2.$$

On the other hand, $p \le p_D$ implies that:

$$\|H^{-1/p_D} D_g \varphi_q\|_{q_D}^2 \le \lambda_q^2.$$

The variational characterization of $\lambda_q$ and the Hölder inequality directly yield

$$\|H^{-1/p_D} D_g \varphi_q\|_{q_D}^2 \le \lambda_q^2 \le \lambda_1^2 (\min H)^{-1}$$

and thus we conclude that $(\varphi_q)$ is uniformly bounded in $H_1^{q_D}$. Then there exist a sequence $(q_i)$ which tends to $q_D$ and a spinor field $\varphi \in H_1^{q_D}$ such that:

• $\varphi_{q_i} \to \varphi$ weakly in $H_1^{q_D}$ (by reflexivity of the Sobolev space $H_1^{q_D}$).
• $\varphi_{q_i} \to \varphi$ a.e. on $M$.

Moreover, since $(\varphi_{q_i})$ is bounded in $H_1^{q_D}$, the Sobolev embedding theorem implies that it is bounded in $L^{p_D}$, and so $H |\varphi_{q_i}|^{p_i - 2} \varphi_{q_i}$ is bounded in $L^{p_D/(p_i - 1)}$. However, since $p_D/(p_D - 1) < p_D/(p_i - 1)$, the sequence $H |\varphi_{q_i}|^{p_i - 2} \varphi_{q_i}$ is also bounded in $L^{p_D/(p_D - 1)}$. Using this fact and since

$$H |\varphi_{q_i}|^{p_i - 2} \varphi_{q_i} \to H |\varphi|^{p_D - 2} \varphi \quad \text{a.e. on } M,$$

we finally get that:

$$H |\varphi_{q_i}|^{p_i - 2} \varphi_{q_i} \to H |\varphi|^{p_D - 2} \varphi \quad \text{weakly in } L^{p_D/(p_D - 1)},$$

and so weakly in $L^1$. Now note that for all smooth spinor fields $\Phi$, the map

$$\psi \mapsto \int_M \langle D_g \psi, D_g \Phi\rangle\, dv(g)$$

defines a continuous linear form on $H_1^{q_D}$, and thus by weak convergence in $H_1^{q_D}$, we obtain

$$\int_M \langle D_g \varphi_{q_i}, D_g \Phi\rangle\, dv(g) \xrightarrow[i \to +\infty]{} \int_M \langle D_g \varphi, D_g \Phi\rangle\, dv(g).$$

The weak convergence in $L^1$ gives

$$\int_M \big\langle H |\varphi_{q_i}|^{p_i - 2} \varphi_{q_i}, D_g \Phi\big\rangle\, dv(g) \xrightarrow[i \to +\infty]{} \int_M \big\langle H |\varphi|^{p_D - 2} \varphi, D_g \Phi\big\rangle\, dv(g).$$

Now using the variational characterization (23) of $\lambda_q$ and the fact that the function $q \mapsto \|D_g \Phi\|_q$ is continuous, we easily conclude that $q \mapsto \lambda_q$ is also continuous. Combining all the preceding statements with the fact that $\varphi_{q_i}$ is a solution of $(E_{q_i})$ leads to

$$\int_M \langle D_g \varphi, D_g \Phi\rangle\, dv(g) = \lambda_{\min} \int_M \big\langle H |\varphi|^{\frac{2}{n-1}} \varphi, D_g \Phi\big\rangle\, dv(g)$$

for all smooth spinor fields $\Phi$, that is $\varphi \in H_1^{q_D}$ is a weak solution of (22). □

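We then state a regularity lemma which is proved in [4], and thus we omit the proof here.

Lemma 8. The spinor field $\varphi$ given in Lemma 7 satisfies $\varphi \in C^{1,\alpha}(M) \cap C^\infty(M \setminus \varphi^{-1}(0))$.

As pointed out by Trudinger in the context of the Yamabe problem, one cannot exclude at this step the case where the spinor field $\varphi$ obtained in Lemmas 7 and 8 is identically zero. In [4], Ammann proves that if (21) (with $H$ constant) is fulfilled then $\varphi$ is nontrivial. We give a similar result for Eq. (22) which generalizes the one of Ammann in the case where the Dirac operator is invertible. The proof we present here is based on the Sobolev-type inequality obtained in Theorem 3. More precisely, we get

Lemma 9. If (21) is satisfied, the spinor $\varphi$ obtained in Lemmas 7 and 8 is not identically zero and:

$$\int_M H |\varphi|^{\frac{2n}{n-1}}\, dv_g = 1.$$

Proof. Let $\varphi_q \in C^{1,\alpha}(M) \cap C^\infty(M \setminus \varphi_q^{-1}(0))$ be a solution of Eq. $(E_q)$, that is:

$$D_g \varphi_q = \lambda_q H |\varphi_q|^{p-2} \varphi_q$$

and $\int_M H |\varphi_q|^p\, dv_g = 1$ for all $q \in (q_D, 2)$ (where $p$ is such that $p^{-1} + q^{-1} = 1$). Since $q > q_D$, the Hölder inequality yields

$$\left(\int_M |D_g \varphi_q|^{q_D}\, dv(g)\right)^{\frac{2}{q_D}} \le (\max H)^{\frac2p} \left(\int_M \big|H^{-1/p} D_g \varphi_q\big|^q\, dv(g)\right)^{\frac2q} \operatorname{Vol}(M, g)^{\frac{2(q - q_D)}{q\, q_D}}$$

and with the help of $(E_q)$, we get

$$\int_M \big|H^{-1/p} D_g \varphi_q\big|^q\, dv(g) = \lambda_q^q.$$

We finally obtain

$$\left(\int_M |D_g \varphi_q|^{q_D}\, dv(g)\right)^{\frac{2}{q_D}} \le (\max H)^{\frac2p}\, \lambda_q^2\, \operatorname{Vol}(M, g)^{\frac{2(q - q_D)}{q\, q_D}}. \tag{24}$$

On the other hand, applying Theorem 3 to the spinor fields $\varphi_q$ gives

$$\int_M \langle D_g \varphi_q, \varphi_q\rangle\, dv(g) = \lambda_q \le \big(K(n) + \varepsilon\big) \left(\int_M |D_g \varphi_q|^{q_D}\, dv(g)\right)^{\frac{2}{q_D}} + B_\varepsilon \left(\int_M |\varphi_q|^{q_D}\, dv(g)\right)^{\frac{2}{q_D}},$$

where $B_\varepsilon > 0$ is a positive constant. Using (24) in the preceding inequality leads to

$$1 \le \big(K(n) + \varepsilon\big) (\max H)^{\frac2p}\, \lambda_q\, \operatorname{Vol}(M, g)^{\frac{2(q - q_D)}{q\, q_D}} + B_\varepsilon \left(\int_M |\varphi_q|^{q_D}\, dv(g)\right)^{\frac{2}{q_D}}.$$

Now if $q$ tends to $q_D$, we obtain

$$1 \le \big(K(n) + \varepsilon\big) (\max H)^{\frac{2}{p_D}}\, \lambda_{\min} + B_\varepsilon \left(\int_M |\varphi|^{q_D}\, dv(g)\right)^{\frac{2}{q_D}}.$$

However, because of (21), we have

$$(\max H)^{\frac{2}{p_D}}\, \lambda_{\min}\, K(n) < 1,$$

which allows us to conclude that, for $\varepsilon > 0$ small enough, the norm $\|\varphi\|_{q_D} > 0$ and thus $\varphi$ is not identically zero. □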

Remark 4. Note that we recover the result of Ammann proved in [4] for $H$ constant (under the assumption that the Dirac operator has a trivial kernel).

4.2. An upper bound for $\lambda_{\min}$

In this section, we prove a general upper bound for $\lambda_{\min}$. Namely, we get

Theorem 10. Let $(M^n, g)$ be an $n$-dimensional compact Riemannian spin manifold with $n \ge 3$. If $H \in C^\infty(M)$ is a smooth positive function on $M$, then the following inequality holds:

$$\lambda_{\min} \le K(n)^{-1} \Big(\max_M H\Big)^{-\frac{2}{p_D}}.$$

The proof of Theorem 10 relies on the construction of an adapted test spinor which will be estimated in the variational characterization of $\lambda_{\min}$. We first note that $\lambda_{\min}$ is invariant under a conformal change of the metric; therefore we can work with any metric within the conformal class of $g$. Indeed, we have

Proposition 11. The number $\lambda_{\min}$ is a conformal invariant of $(M, g)$.

Proof. We can easily compute that for $\bar g = u^2 g \in [g]$ we have

$$F_{\bar g, q_D}(\bar\psi) = F_{g, q_D}\big(u^{\frac{n-1}{2}} \psi\big)$$

and then, because of the variational characterization (23) of $\lambda_{\min}(M, g, \sigma)$, its conformal invariance follows directly. □

For the sake of completeness, we briefly recall the work of Ammann, Humbert, Grosjean and Morel [2], which describes in particular the construction of the test-spinor. We first need a trivialization of the spinor bundle, given by the Bourguignon–Gauduchon trivialization [10], which is adapted to our problem. Let $(x_1, \dots, x_n)$ be the Riemannian normal coordinates given by the exponential map at $p \in M$:

$$\exp_p : V \subset T_p M \cong \mathbb{R}^n \to U \subset M, \qquad (x_1, \dots, x_n) \mapsto m.$$

Now if we consider the smooth map $m \mapsto G_m := (g_{ij}(m))$ which associates to any point $m \in U$ the matrix of the coefficients of the metric $g$ at this point in the basis $\{\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n}\}$, then one can find a unique symmetric matrix $B_m := (b_i^j(m))$ (which depends smoothly on $m$) such that $B_m^2 = G_m^{-1}$. Thus, at each point $m \in U$ we obtain an isometry between $\mathbb{R}^n$ and the tangent space $T_m M$ defined by

$$B_m : \big(T_{\exp_p^{-1}(m)} V \cong \mathbb{R}^n, \xi\big) \to (T_m U, g_m), \qquad (a^1, \dots, a^n) \mapsto \sum_{i,j} b_i^j(m)\, a^i\, \frac{\partial}{\partial x_j}(m).$$

This map induces an identification between the two $SO_n$-principal bundles of orthonormal frames over $(V, \xi)$ and $(U, g)$. Thereafter, this identification can be lifted to the $Spin_n$-principal bundles of spinorial frames over $(V, \xi)$ and $(U, g)$ and then gives an isometry:

$$\Sigma_\xi(V) \to \Sigma_g(U), \qquad \varphi \mapsto \bar\varphi.$$

This identification has already been used in Section 3 and was denoted by $\tau$. However, for the sake of clarity, we will denote it by $\tau(\varphi) := \bar\varphi$ for $\varphi \in \Gamma(\Sigma_\xi(V))$. Now let

$$e_i := b_i^j\, \frac{\partial}{\partial x_j},$$

such that $\{e_1, \dots, e_n\}$ defines an orthonormal frame of $(TU, g)$. Via the preceding identification, one can relate the Dirac operator acting on $\Sigma_\xi(V)$ with the one acting on $\Sigma_g(U)$. Indeed, if $D_\xi$ and $D_g$ denote those Dirac operators, we have

$$D_g \bar\psi = \overline{D_\xi \psi} + \sum_{i,j=1}^{n} \big(b_i^j - \delta_i^j\big)\, \overline{\partial_i \cdot \nabla_{\partial_j} \psi} + W \cdot \bar\psi + V \cdot \bar\psi, \tag{25}$$

where $W \in \Gamma(Cl_g(TU))$ and $V \in \Gamma(TU)$. With a little work, one can compute the expansion of $W$ and $V$ in a neighborhood of $p \in U$. In fact, if $m \in U$ and $r$ denotes the distance from $m$ to $p$, we have

$$b_i^j = \delta_i^j - \frac16 R_{i\alpha\beta j}(p)\, x^\alpha x^\beta + O(r^3), \tag{26}$$

$$V = \Big(-\frac14 (\mathrm{Ric})_{\alpha k}(p)\, x^\alpha + O(r^2)\Big)\, e_k, \tag{27}$$

$$|W| = O(r^3), \tag{28}$$

where $R_{ijkl}$ (resp. $(\mathrm{Ric})_{ik}$) are the components of the Riemann (resp. Ricci) curvature tensor. Now consider the smooth spinor field defined on $(V, \xi)$ by

$$\psi(x) = f^{\frac n2}(x)\,(1 - x) \cdot \psi_0,$$

where $f(x) = \frac{2}{1 + r^2}$ (with $r^2 = x_1^2 + \dots + x_n^2$) and $\psi_0 \in \Sigma_\xi(V)$ is a constant spinor which can be chosen such that $|\psi_0| = 1$. A straightforward computation shows that:

$$D_\xi \psi = \frac n2 f \psi, \qquad |\psi|^2 = f^{n-1} \qquad\text{and}\qquad |D_\xi \psi|^2 = \frac{n^2}{4} f^{n+1}. \tag{29}$$

With these constructions, we can prove the main statement of this section.


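Proof of Theorem 10. Let $\varepsilon > 0$ and $\psi$ the spinor field described above, then we define

$$\psi_\varepsilon(x) := \eta\, \psi\Big(\frac{x}{\varepsilon}\Big) \in \Gamma(\Sigma_\xi(\mathbb{R}^n)), \tag{30}$$

where $\eta = 0$ on $\mathbb{R}^n \setminus B_p(2\delta)$, $\eta = 1$ on $B_p(\delta)$ and $0 < \delta < 1$ is chosen such that $B_p(2\delta) \subset V$. Since the support of the spinor field $\psi_\varepsilon$ lies in the open set $V$ of $\mathbb{R}^n$, one can use the trivialization described previously to obtain a spinor field $\bar\psi_\varepsilon$ over $(M, g)$. On the other hand, because of the conformal covariance of $\lambda_{\min}$, we can assume that the metric $g$ satisfies $\mathrm{Ric}(p)_{ij} = 0$. First we compute

$$D_g \bar\psi_\varepsilon(x) = \overline{\nabla\eta \cdot \psi}\Big(\frac x\varepsilon\Big) + \frac{\eta\, n}{2\varepsilon}\, f\Big(\frac x\varepsilon\Big)\, \bar\psi\Big(\frac x\varepsilon\Big) + \eta \sum_{i,j} \big(b_i^j - \delta_i^j\big)\, \overline{\partial_i \cdot \nabla_{\partial_j}\Big(\psi\Big(\frac x\varepsilon\Big)\Big)} + \eta\, W \cdot \bar\psi\Big(\frac x\varepsilon\Big) + \eta\, V \cdot \bar\psi\Big(\frac x\varepsilon\Big),$$

where $|W| = O(r^3)$ and $|V| = O(r^2)$ (since $\mathrm{Ric}(p)_{ij} = 0$). Using [2], we have

$$|D_g \bar\psi_\varepsilon|^2(x) \le \frac{n^2}{4\varepsilon^2}\, f^{n+1}\Big(\frac x\varepsilon\Big) + C r^4 f^{n-1}\Big(\frac x\varepsilon\Big) + \frac{C}{\varepsilon}\, r^2 f^{n - \frac12}\Big(\frac x\varepsilon\Big) = \frac{n^2}{4\varepsilon^2}\, f^{n+1}\Big(\frac x\varepsilon\Big)\big(1 + \Lambda(x)\big),$$

where $\Lambda(x) = C\varepsilon^2 r^4 f^{-2}(\frac x\varepsilon) + C\varepsilon r^2 f^{-\frac32}(\frac x\varepsilon)$. Now note that for all $u \ge -1$:

$$(1 + u)^{\frac{n}{n+1}} \le 1 + \frac{n}{n+1}\, u,$$

then we get

$$|D_g \bar\psi_\varepsilon|^{\frac{2n}{n+1}}(x) \le \Big(\frac{n}{2\varepsilon}\Big)^{\frac{2n}{n+1}} f^n\Big(\frac x\varepsilon\Big) + \frac{n}{n+1} \Big(\frac{n}{2\varepsilon}\Big)^{\frac{2n}{n+1}} f^n\Big(\frac x\varepsilon\Big)\, \Lambda(x). \tag{31}$$

On the other hand, since $p \in M$ is a point where $H$ is maximum, we have $H(x) = H(p) + O(r^2)$, which yields

$$H(x)^{-\frac{n-1}{n+1}} = H(p)^{-\frac{n-1}{n+1}} \big(1 + O(r^2)\big). \tag{32}$$

An integration combining (31) and (32) gives

$$\int_{B_p(2\delta)} H^{-\frac{n-1}{n+1}}\, |D_g \bar\psi_\varepsilon|^{\frac{2n}{n+1}}\, dv(g) \le \Big(\frac{n}{2\varepsilon}\Big)^{\frac{2n}{n+1}} H(p)^{-\frac{n-1}{n+1}}\, (\mathrm{A} + \mathrm{B} + \mathrm{C} + \mathrm{D}),$$

where

$$\mathrm{A} = \int_{B_p(2\delta)} f^n\Big(\frac x\varepsilon\Big)\, dv(g), \qquad \mathrm{B} = C \int_{B_p(2\delta)} f^n\Big(\frac x\varepsilon\Big)\, \Lambda(x)\, dv(g),$$

$$\mathrm{C} = C \int_{B_p(2\delta)} r^2 f^n\Big(\frac x\varepsilon\Big)\, dv(g), \qquad \mathrm{D} = C \int_{B_p(2\delta)} r^2 f^n\Big(\frac x\varepsilon\Big)\, \Lambda(x)\, dv(g).$$

Since the function $f$ is radially symmetric, we have

$$\mathrm{A} = \int_0^{2\delta} f^n\Big(\frac r\varepsilon\Big)\, \omega_{n-1}\, G(r)\, r^{n-1}\, dr,$$

where

$$G(r) = \frac{1}{\omega_{n-1}} \int_{S^{n-1}} \sqrt{|g|_{rx}}\; d\sigma(x) \qquad\text{with}\qquad |g|_y := \det g_{ij}(y).$$

Now using the fact that $\mathrm{Ric}_{ij}(p) = 0$, one can compute that (see [17], for example):

$$G(r) \le 1 + O(r^4).$$

Thus, a direct calculation shows that if $n \ge 3$:

$$\mathrm{A} = \omega_{n-1}\, I\, \varepsilon^n + o(\varepsilon^n),$$

where $I = \int_0^{+\infty} r^{n-1} f^n(r)\, dr$. In the same way, we can prove that for $n \ge 3$:

$$\mathrm{B} = \mathrm{C} = \mathrm{D} = o(\varepsilon^n).$$

In brief, we showed that:

$$\int_{B_p(2\delta)} H^{-\frac{n-1}{n+1}}\, |D_g \bar\psi_\varepsilon|^{\frac{2n}{n+1}}\, dv(g) = \Big(\frac n2\Big)^{\frac{2n}{n+1}} (\omega_{n-1} I)\, H(p)^{-\frac{n-1}{n+1}}\, \varepsilon^{\frac{n(n-1)}{n+1}}\, \big(1 + o(1)\big),$$

hence:

$$\left(\int_M H^{-\frac{n-1}{n+1}}\, |D_g \bar\psi_\varepsilon|^{\frac{2n}{n+1}}\, dv(g)\right)^{\frac{n+1}{n}} = \Big(\frac n2\Big)^2 (\omega_{n-1} I)^{\frac{n+1}{n}}\, H(p)^{-\frac{n-1}{n}}\, \varepsilon^{n-1}\, \big(1 + o(1)\big).$$

The denominator of the functional $\lambda_{\min}$ can also be estimated and similar computations give (see [2]):

$$\int_M \langle D_g \bar\psi_\varepsilon, \bar\psi_\varepsilon\rangle\, dv(g) = \frac n2\, \omega_{n-1}\, I\, \varepsilon^{n-1} + o(\varepsilon^{n+1})$$

for $n \ge 3$. Combining these two estimates yields

$$\lambda_{\min} \le K(n)^{-1} \Big(\max_M H\Big)^{-\frac{2}{p_D}} \big(1 + o(1)\big),$$

which concludes the proof. □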
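The constant $K(n)^{-1} = \frac n2 \omega_n^{1/n}$ appears at the end of this proof because $\omega_{n-1} I = \omega_n$, where $I = \int_0^{+\infty} r^{n-1} f^n(r)\, dr$ and $f(r) = 2/(1+r^2)$. This identity is a standard consequence of stereographic projection and is stated here as an assumption, since the text does not spell it out. The sketch below checks it numerically with a midpoint rule after the substitution $r = \tan\theta$; the function names and the step count are hypothetical choices.

```python
import math


def sphere_volume(n: int) -> float:
    """Volume omega_n of the unit round sphere S^n."""
    return 2.0 * math.pi ** ((n + 1) / 2.0) / math.gamma((n + 1) / 2.0)


def radial_integral(n: int, steps: int = 20000) -> float:
    """Midpoint-rule value of I = int_0^inf r^(n-1) f(r)^n dr, f(r) = 2 / (1 + r^2).

    The substitution r = tan(theta), dr = (1 + r^2) d(theta), maps the
    half-line to the interval (0, pi/2).
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        r = math.tan(theta)
        f = 2.0 / (1.0 + r * r)
        total += r ** (n - 1) * f ** n * (1.0 + r * r)
    return total * h


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, sphere_volume(n - 1) * radial_integral(n), sphere_volume(n))
```

With this identity, the ratio of the numerator and denominator estimates reduces to $\frac n2 \omega_n^{1/n} H(p)^{-(n-1)/n}$, which matches the bound of Theorem 10 since $2/p_D = (n-1)/n$.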

Remark 5. We can derive a similar result for the case of 2-dimensional manifolds. It is sufficient to adapt the proof of [2] to our situation, and one can show that if $(M^2, g)$ is a smooth surface we have

$$\lambda_{\min} \le 2\sqrt{\pi}\, \Big(\max_M H\Big)^{-\frac12}.$$

Remark 6. This result is in the spirit of the one obtained by Aubin in [7] for the conformal Laplacian. Indeed, in that article, the author proves that on an $n$-dimensional compact Riemannian manifold with $n > 4$, if $f, h$ are smooth positive functions on $M$ such that:

$$h(p) - R_g(p) + \frac{n-4}{2}\, \frac{\Delta_g f(p)}{f(p)} < 0,$$

where $f(p) = \max_{x \in M} f(x)$, then the nonlinear equation:

$$4\, \frac{n-1}{n-2}\, \Delta_g u + h u = f u^{\frac{n+2}{n-2}}$$

admits a smooth positive solution. We could hope to obtain a similar criterion for the equation studied in this paper. However, if one carries out the computations in the proof of Theorem 10, we obtain for $n \ge 5$:

$$\lambda_{\min} \le K(n)^{-1} \Big(\max_M H\Big)^{-\frac{2}{p_D}} \Big(1 + \frac{n-1}{2n(n-2)}\, \frac{\Delta H(p)}{H(p)}\, \varepsilon^2 + o(\varepsilon^2)\Big).$$

Thus no conclusion can be drawn, since at a point $p \in M$ where $H$ is maximum we have $\Delta H(p) \ge 0$.


4.3. An existence result

To end this section, we give conditions on the manifold $(M^n, g)$ and on the function $H \in C^\infty(M)$ which ensure that (21) is fulfilled. Then, applying Theorem 5, we get the existence of a solution to the nonlinear Dirac equation (22). The condition on $H$ is a technical one, given by

There is a maximum point $p \in M$ at which all partial derivatives of $H$ of order less than or equal to $(n - 1)$ vanish. (33)

The result we obtain is the following:

Theorem 12. Let $(M^n, g)$ be an $n$-dimensional compact Riemannian spin manifold. Assume that $(M^n, g)$ is locally conformally flat and that $H \in C^\infty(M)$ is a smooth positive function on $M$ for which (33) holds. Then if the Dirac operator is invertible and the mass endomorphism has a positive eigenvalue, there exists a spinor field solution of the nonlinear Dirac equation (22).

This result is quite close to the work of Escobar and Schoen [11] and relies on the construction of the mass endomorphism by Ammann, Humbert and Morel [3]. We first briefly recall this construction. For more details, we refer to [3]. Consider a point $p \in M$ and suppose that there is a neighborhood $U$ of $p$ which is flat. Since we assumed that the Dirac operator has a trivial kernel, one can show that the Green function $G_D$ of the Dirac operator has the following expansion in $U$:

$$\omega_{n-1}\, G_D(x, p)\, \psi_p = -\frac{x - p}{|x - p|^n} \cdot \psi_p + v(x, p)\, \psi_p$$

for all $x \in U$, where $v(\cdot, p)\psi_p$ is a smooth harmonic spinor near $p$ with $\psi_p \in \Sigma_p(M)$. The mass endomorphism is then the self-adjoint endomorphism of the fiber $\Sigma_p(M)$ defined by

$$\alpha_p(\psi_p) = v(p, p)\, \psi_p.$$

This operator shares many properties with the mass of the Green function of the conformal Laplacian. One of them is that the sign of its eigenvalues is invariant under conformal changes of metric which preserve the flatness near $p$. With this construction, we can prove the main result of this section.

Proof of Theorem 12. We have to construct a test-spinor which will be estimated in the variational characterization of $\lambda_{\min}$. The assumption on the mass endomorphism implies that (21) is fulfilled and the result will follow from Theorem 5. The test-spinor is exactly the one used in [3]. In order to make this paper self-contained, we have chosen to briefly recall this construction. First, since $\lambda_{\min}$ is a conformal invariant of $(M^n, g)$, which is locally conformally flat, one can suppose that the metric is flat near a point $p \in M$ where (33) is satisfied. Now for $\varepsilon > 0$ we set

$$\xi := \varepsilon^{\frac{1}{n+1}}, \qquad \varepsilon_0 := \frac{\xi^n}{\varepsilon}\, f\Big(\frac\xi\varepsilon\Big)^{\frac n2},$$

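where $f(r) = \frac{2}{1 + r^2}$ is the function used in the previous section. The test-spinor is then defined by

$$\Phi_\varepsilon(x) = \begin{cases} f\big(\frac x\varepsilon\big)^{\frac n2} \big(1 - \frac x\varepsilon\big) \cdot \psi_p + \varepsilon_0\, \alpha_p(\psi_p) & \text{if } r \le \xi, \\[1mm] \varepsilon_0 \big(\omega_{n-1} G_D(x, p) - \eta(x)\, \theta_p(x)\big) + \eta(x)\, f\big(\frac\xi\varepsilon\big)^{\frac n2} \psi_p & \text{if } \xi \le r \le 2\xi, \\[1mm] \varepsilon_0\, \omega_{n-1}\, G_D(x, p) & \text{if } r \ge 2\xi, \end{cases}$$

where $\eta$ is a cut-off function such that:

$$\eta = \begin{cases} 1 & \text{on } B_p(\xi), \\ 0 & \text{on } M \setminus B_p(2\xi), \end{cases} \qquad\text{and}\qquad |\nabla\eta| \le \frac2\xi,$$

and $\theta_p(x) := v(x, p)\psi_p - \alpha_p(\psi_p)$ is a smooth spinor field (harmonic near $p$) which satisfies $|\theta_p| = O(r)$. Now an easy calculation shows that:

$$|D\Phi_\varepsilon|^{\frac{2n}{n+1}}(x) = \begin{cases} \big(\frac n2\big)^{\frac{2n}{n+1}}\, \varepsilon^{-\frac{2n}{n+1}}\, f^n\big(\frac r\varepsilon\big) & \text{if } r \le \xi, \\[1mm] \big|\varepsilon_0 \nabla\eta(x) \cdot \theta_p(x) - f\big(\frac\xi\varepsilon\big)^{\frac n2} \nabla\eta(x) \cdot \psi_p\big|^{\frac{2n}{n+1}} & \text{if } \xi \le r \le 2\xi, \\[1mm] 0 & \text{if } r \ge 2\xi. \end{cases}$$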

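On the other hand, since the function $H$ satisfies the condition (33), we get $H(x) = H(p) + O(r^n)$, that is:

$$H(x)^{-\frac{n-1}{n+1}} = H(p)^{-\frac{n-1}{n+1}} \big(1 + O(r^n)\big).$$

We can now give the estimate of the functional (23) (with $q = q_D$) evaluated at the spinor field $\Phi_\varepsilon$. First, on $B_p(\xi)$ we have

$$\int_{B_p(\xi)} H^{-\frac{n-1}{n+1}}\, |D\Phi_\varepsilon|^{\frac{2n}{n+1}}\, dx \le \Big(\frac n2\Big)^{\frac{2n}{n+1}} \varepsilon^{-\frac{2n}{n+1}}\, H(p)^{-\frac{n-1}{n+1}} \left(\int_{B_p(\xi)} f^n\Big(\frac x\varepsilon\Big)\, dx + C \int_{B_p(\xi)} r^n f^n\Big(\frac x\varepsilon\Big)\, dx\right)$$

and we compute that:

$$\int_{B_p(\xi)} f^n\Big(\frac x\varepsilon\Big)\, dx \le \varepsilon^n \int_{\mathbb{R}^n} f^n(x)\, dx, \qquad \int_{B_p(\xi)} r^n f^n\Big(\frac x\varepsilon\Big)\, dx = o(\varepsilon^{2n-1}).$$

Finally we obtain

$$\int_{B_p(\xi)} H^{-\frac{n-1}{n+1}}\, |D\Phi_\varepsilon|^{\frac{2n}{n+1}}\, dx = \Big(\frac n2\Big)^{\frac{2n}{n+1}} \varepsilon^{\frac{n(n-1)}{n+1}}\, H(p)^{-\frac{n-1}{n+1}}\, I\, \big(1 + o(\varepsilon^{n-1})\big),$$

where $I = \int_{\mathbb{R}^n} f^n(x)\, dx$. On $C_p(\xi) := B_p(2\xi) \setminus B_p(\xi)$:

$$\int_{C_p(\xi)} H^{-\frac{n-1}{n+1}}\, |D\Phi_\varepsilon|^{\frac{2n}{n+1}}\, dx \le C \int_{C_p(\xi)} |\varepsilon_0 \nabla\eta \cdot \theta_p|^{\frac{2n}{n+1}}\, dx + C \int_{C_p(\xi)} \Big|f\Big(\frac\xi\varepsilon\Big)^{\frac n2} \nabla\eta \cdot \psi_p\Big|^{\frac{2n}{n+1}}\, dx$$
$$\qquad + C \int_{C_p(\xi)} r^n\, |\varepsilon_0 \nabla\eta \cdot \theta_p|^{\frac{2n}{n+1}}\, dx + C \int_{C_p(\xi)} r^n\, \Big|f\Big(\frac\xi\varepsilon\Big)^{\frac n2} \nabla\eta \cdot \psi_p\Big|^{\frac{2n}{n+1}}\, dx$$

and since $\varepsilon_0 \le C\varepsilon^{n-1}$, $|\nabla\eta| \le 2\xi^{-1}$, $|\theta_p| = O(r)$ and $\operatorname{Vol}(C_p(\xi)) \le C\xi^n$, we get

$$\int_{C_p(\xi)} H^{-\frac{n-1}{n+1}}\, |D\Phi_\varepsilon|^{\frac{2n}{n+1}}\, dx = o\Big(\varepsilon^{\frac{(2n+1)(n-1)}{n+1}}\Big).$$

In conclusion, the numerator of (23) is given by

$$\left(\int_M H^{-\frac{n-1}{n+1}}\, |D\Phi_\varepsilon|^{\frac{2n}{n+1}}\, dv(g)\right)^{\frac{n+1}{n}} = \Big(\frac n2\Big)^2 \varepsilon^{n-1}\, H(p)^{-\frac{n-1}{n}}\, I^{\frac{n+1}{n}}\, \big(1 + o(\varepsilon^{n-1})\big).$$

Similar computations for the denominator lead to (see also [3]):

$$\int_M \langle D\Phi_\varepsilon, \Phi_\varepsilon\rangle\, dv(g) = \frac n2\, \varepsilon^{n-1}\, I\, \Big(1 + J \langle \psi_p, \alpha_p(\psi_p)\rangle\, \varepsilon^{n-1} + o(\varepsilon^{n-1})\Big),$$

where $J = \int_{\mathbb{R}^n} f(x)^{\frac n2 + 1}\, dx$. Now we choose $\psi_p \in \Sigma_p(M)$ as an eigenspinor for the mass endomorphism associated with a positive eigenvalue $\lambda$, and we finally get

$$\lambda_{\min} \le F_{q_D}(\Phi_\varepsilon) = K(n)^{-1} \Big(\max_M H\Big)^{-\frac{n-1}{n}} \Big(1 - \lambda J \varepsilon^{n-1} + o(\varepsilon^{n-1})\Big).$$

Now it is clear that for $\varepsilon > 0$ sufficiently small, (21) is true, and thus Theorem 5 allows us to conclude. □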

Remark 7. In dimension two, a Riemannian surface is always locally conformally flat and condition (33) is satisfied for all $H \in C^\infty(M)$; however, the mass endomorphism vanishes (see [3]) and so Theorem 12 cannot be applied.

5. A remark on manifolds with boundary

In this last section, we briefly study the case of manifolds with boundary. Since the calculations are quite close to those of the boundaryless case, we only point out arguments which need some explanations. Indeed, let $(M^n, g)$ be an $n$-dimensional compact Riemannian spin manifold with smooth boundary equipped with a chirality operator $\gamma$, that is an endomorphism of the spinor bundle which satisfies:

$$\gamma^2 = \mathrm{Id}, \qquad \langle \gamma\psi, \gamma\varphi\rangle = \langle \psi, \varphi\rangle, \qquad \nabla_X(\gamma\psi) = \gamma(\nabla_X \psi), \qquad X \cdot \gamma\psi = -\gamma(X \cdot \psi)$$

for all $X \in \Gamma(TM)$ and for all spinor fields $\psi, \varphi \in \Gamma(\Sigma_g(M))$. The orthogonal projection

$$B_g^\pm := \frac12 (\mathrm{Id} \pm \nu_g \cdot \gamma),$$

where $\nu_g$ denotes the inner unit vector field normal to $\partial M$, defines a (local) elliptic boundary condition (called the chiral bag boundary condition or (CHI) boundary condition) for the Dirac operator $D_g$ of $(M, g)$. Moreover, under this boundary condition, the spectrum of the Dirac operator consists entirely of isolated real eigenvalues with finite multiplicity. In [28] (see also [29]), we define a spin conformal invariant similar to (5) using this boundary condition. More precisely, if $\lambda_1^\pm(g)$ stands for the first eigenvalue of the Dirac operator $D_g$ under the chiral bag boundary condition $B_g^\pm$, then the chiral bag invariant is defined by

$$\lambda_{\min}(M, \partial M) := \inf_{\bar g \in [g]} |\lambda_1^\pm(\bar g)|\, \operatorname{Vol}(M, \bar g)^{\frac1n}$$

and one can check that:

$$\lambda_{\min}(M, \partial M) = \inf_{\varphi \neq 0} \frac{\left(\int_M |D_g \varphi|^{\frac{2n}{n+1}}\, dv(g)\right)^{\frac{n+1}{n}}}{\left|\int_M \langle D_g \varphi, \varphi\rangle\, dv(g)\right|}, \tag{34}$$

where the infimum is taken over all spinor fields $\varphi \in H_1^{q_D}$ such that $B_g^\pm \varphi_{|\partial M} = 0$. On the round hemisphere $(S^n_+, g_{st})$, we can compute that:

$$\lambda_{\min}(S^n_+, \partial S^n_+) = \frac n2 \Big(\frac{\omega_n}{2}\Big)^{\frac1n} = 2^{-\frac1n}\, K(n)^{-1}, \tag{35}$$

and using the conformal covariance of (34) and the fact that the hemisphere is conformally isometric to the half Euclidean space $(\mathbb{R}^n_+, \xi)$, we conclude that:

$$\left|\int_{\mathbb{R}^n_+} \langle D_\xi \psi, \psi\rangle\, dx\right| \le 2^{\frac1n}\, K(n) \left(\int_{\mathbb{R}^n_+} |D_\xi \psi|^{\frac{2n}{n+1}}\, dx\right)^{\frac{n+1}{n}} \tag{36}$$

for all $\psi \in \Gamma_c(\Sigma_\xi(\mathbb{R}^n_+))$, where $\Gamma_c(\Sigma_\xi(\mathbb{R}^n_+))$ denotes the space of smooth spinor fields over $(\mathbb{R}^n_+, \xi)$ with compact support. In order to prove a Sobolev-type inequality for manifolds with boundary, we give a result similar to Lemma 2 in this context:

Lemma 13. If the Dirac operator is invertible under the chiral bag boundary condition, then there exists a constant $C > 0$ such that:

$$\|\varphi\|_{p_D} \le C\, \|D_g \varphi\|_{q_D}$$

for all $\varphi \in H_1^{q_D}$ such that $B_g^\pm \varphi_{|\partial M} = 0$.

Proof. Since the Dirac operator is assumed to be invertible and since the Fredholm property of $D_g$ does not depend on the choice of the Sobolev spaces (see [32]), we have that:

$$D_g : H_\pm^{q_D} := \big\{\varphi \in H_1^{q_D} \,:\, B_g^\pm \varphi_{|\partial M} = 0\big\} \to L^{q_D}$$

defines a continuous bijection. Using the open mapping theorem, the inverse map is also continuous, and then we get the existence of a constant $C > 0$ such that:

$$\|\varphi\|_{H_1^{q_D}} = \big\|D_g^{-1}(D_g \varphi)\big\|_{H_1^{q_D}} \le C\, \|D_g \varphi\|_{q_D}$$

for all $\varphi \in H_\pm^{q_D}$. On the other hand, the Sobolev embedding theorem implies that the map $H_1^{q_D} \hookrightarrow L^{p_D}$ is continuous, so there exists a constant $C > 0$ such that:

$$\|\varphi\|_{p_D} \le C\, \|\varphi\|_{H_1^{q_D}}$$

for all $\varphi \in H_1^{q_D}$, and this concludes the proof. □

and then Lemma 2 yields

|

Dg ϕ2qD

M Dg ϕ, ϕ dv(g)|

DϕqD C ϕpD

(37)

1616

S. Raulot / Journal of Functional Analysis 256 (2009) 1588–1617 q

for all ϕ ∈ H±D . Using the variational characterization (34) of λmin (M, ∂M) leads to the result. In [30], we give an explicit lower bound for the chiral bag invariant given by λmin (M, ∂M)2

n μ[g] (M, ∂M). 4(n − 1)

(38)

The number $\mu_{[g]}(M,\partial M)$ is a conformal invariant of the manifold introduced by Escobar in [12] to study the Yamabe problem on manifolds with boundary and defined by

$$\mu_{[g]}(M,\partial M) = \inf_{u \in C^1(M),\ u \neq 0} \frac{\int_M \bigl(4\frac{n-1}{n-2}|\nabla u|^2 + R_g u^2\bigr)\,dv(g) + 2(n-1)\int_{\partial M} h_g u^2\,ds(g)}{\bigl(\int_M u^N\,dv(g)\bigr)^{\frac{2}{N}}}.$$

This invariant is called the Yamabe invariant of $(M^n, g)$. Here $h_g$ denotes the mean curvature of the boundary $(\partial M, g)$ in $(M, g)$. Inequality (38) is significant only if the Yamabe invariant is positive, and in this case the Dirac operator under the chiral bag boundary condition is invertible. So inequality (37) is more general than (38); however, it does not give an explicit lower bound. We can now argue as in the proof of Theorem 3 and state a Sobolev-like inequality on manifolds with boundary:

Theorem 14. Let $(M^n, g, \sigma)$ be an $n$-dimensional compact spin manifold with a nonempty smooth boundary and equipped with a chirality operator. Moreover, we assume that the Dirac operator under the chiral bag boundary condition is invertible. Then for all $\varepsilon > 0$, there exists a constant $B_\varepsilon$ such that

$$\Bigl|\int_M \langle D_g\varphi,\varphi\rangle\,dv(g)\Bigr| \le \bigl(2^{\frac{1}{n}}K(n) + \varepsilon\bigr)\Bigl(\int_M |D_g\varphi|^{q_D}\,dv(g)\Bigr)^{\frac{2}{q_D}} + B_\varepsilon\Bigl(\int_M |\varphi|^{q_D}\,dv(g)\Bigr)^{\frac{2}{q_D}}$$

for all $\varphi \in H_1^{q_D}$ such that $B_g^{\pm}\varphi|_{\partial M} = 0$.

Acknowledgments

I would like to thank Oussama Hijazi and Emmanuel Humbert for their encouragement, support and remarks on previous versions of this paper. I am also very grateful to Bernd Ammann for his remarks and suggestions. Finally, I would like to thank the Mathematical Institute of Neuchâtel for its financial support.

References

[1] B. Ammann, E. Humbert, M. Ould Ahmedou, An obstruction for the mean curvature of the conformal immersion $S^n$ into $\mathbb{R}^{n+1}$, Proc. Amer. Math. Soc. 137 (2) (2007) 489–493.
[2] B. Ammann, E. Humbert, J.-F. Grosjean, B. Morel, A spinorial analogue of Aubin's inequality, Math. Z. 260 (1) (2008) 127–151.
[3] B. Ammann, E. Humbert, B. Morel, Mass endomorphism and spinorial Yamabe type problem on conformally flat manifolds, Comm. Anal. Geom. 14 (1) (2006) 163–182.
[4] B. Ammann, The smallest Dirac eigenvalue in a spin-conformal class and cmc-immersions, preprint, 2003.
[5] B. Ammann, A spin-conformal lower bound of the first positive Dirac eigenvalue, Differential Geom. Appl. 18 (2003) 21–32.
[6] B. Ammann, A variational problem in conformal spin geometry, Habilitationsschrift, Universität Hamburg, 2003.
[7] T. Aubin, Équations différentielles non linéaires et problème de Yamabe concernant la courbure scalaire, J. Math. Pures Appl. (9) (1976) 269–296.
[8] T. Aubin, Problèmes isopérimétriques et espaces de Sobolev, J. Differential Geom. 11 (1976) 573–598.
[9] C. Bär, Lower eigenvalue estimate for Dirac operator, Math. Ann. 293 (1992) 39–46.
[10] J.-P. Bourguignon, P. Gauduchon, Spineurs, opérateurs de Dirac et variations de métriques, Comm. Math. Phys. 144 (1992) 581–599.
[11] J.F. Escobar, R.M. Schoen, Conformal metrics with prescribed scalar curvature, Invent. Math. 86 (1986) 243–254.
[12] J.F. Escobar, The Yamabe problem on manifolds with boundary, J. Differential Geom. 35 (1992) 21–84.
[13] T. Friedrich, Der erste Eigenwert des Dirac-Operators einer kompakten Riemannschen Mannigfaltigkeit nicht negativer Skalarkrümmung, Math. Nachr. 97 (1980) 117–146.
[14] T. Friedrich, Dirac Operators in Riemannian Geometry, Grad. Stud. Math., vol. 25, Amer. Math. Soc., 2000.
[15] S. Gutt, Killing spinors on spheres and projective spaces, in: Spinors in Physics and Geometry, Proc. Conf. Trieste, 1986, pp. 238–248.
[16] E. Hebey, Nonlinear Analysis on Manifolds: Sobolev Spaces and Inequalities, Courant Lect. Notes Math., vol. 5, Amer. Math. Soc., 1999.
[17] E. Hebey, Introduction à l'analyse non-linéaire sur les variétés, in: Diderot Editeur and Arts et sciences, Humboldt-Universität zu Berlin, 1997.
[18] O. Hijazi, A conformal lower bound for the smallest eigenvalue of the Dirac operator and Killing spinors, Comm. Math. Phys. 25 (1986) 151–162.
[19] O. Hijazi, Première valeur propre de l'opérateur de Dirac et nombre de Yamabe, C. R. Acad. Sci. Paris 313 (1991) 865–868.
[20] O. Hijazi, Spectral properties of the Dirac operator and geometrical structures, in: Proceedings of the Summer School on Geometric Methods in Quantum Field Theory, Villa de Leyva, Colombia, 1999, World Scientific, 2001.
[21] N. Hitchin, Harmonic spinors, Adv. Math. 14 (1974) 1–55.
[22] E. Hebey, M. Vaugon, The best constant problem in the Sobolev embedding theorem for complete Riemannian manifolds, Duke Math. J. 79 (1995) 235–279.
[23] E. Hebey, M. Vaugon, Meilleures constantes dans le théorème d'inclusion de Sobolev, Ann. Inst. H. Poincaré Anal. Non Linéaire 13 (1996) 57–93.
[24] E. Hebey, M. Vaugon, From best constants to critical functions, Math. Z. 237 (2001) 737–767.
[25] H.B. Lawson, M.L. Michelsohn, Spin Geometry, Princeton Math. Ser., vol. 38, Princeton Univ. Press, 1989.
[26] J. Lott, Eigenvalue bounds for the Dirac operator, Pacific J. Math. 125 (1986) 117–126.
[27] J.M. Lee, T.H. Parker, The Yamabe problem, Bull. Amer. Math. Soc. (N.S.) 17 (1987) 37–91.
[28] S. Raulot, On a spin conformal invariant on manifolds with boundary, Math. Z. 261 (2) (2009) 321–349.
[29] S. Raulot, Aspect conforme de l'opérateur de Dirac sur une variété à bord, PhD thesis, Université Henri Poincaré, Nancy I, 2006.
[30] S. Raulot, The Hijazi inequality on manifolds with boundary, J. Geom. Phys. 56 (2006) 2189–2202.
[31] R. Schoen, Conformal deformation of a Riemannian metric to constant scalar curvature, J. Differential Geom. 20 (1984) 473–495.
[32] G. Schwartz, Hodge Decomposition—A Method for Solving Boundary Value Problems, Lecture Notes in Math., Springer, 1995.
[33] G. Talenti, Best constants in Sobolev inequality, Ann. Mat. Pura Appl. 110 (1976) 353–372.
[34] M.E. Taylor, Pseudodifferential Operators, Princeton Univ. Press, Princeton, NJ, 1981.
[35] N.S. Trudinger, Remarks concerning the conformal deformation of Riemannian structures on compact manifolds, Ann. Sc. Norm. Super. Pisa Sci. Fis. Mat. (3) 22 (1968) 265–274.
[36] H. Yamabe, On a deformation of Riemannian structures on compact manifolds, Osaka Math. J. (1960) 21–37.


Smooth solutions for the motion of a ball in an incompressible perfect fluid

Carole Rosier a, Lionel Rosier b,∗

a Laboratoire de Mathématiques Pures et Appliquées Joseph Liouville, Université du Littoral, 50 rue F. Buisson, B.P. 699, 62228 Calais Cedex, France
b Institut Elie Cartan, Université Henri Poincaré Nancy 1, B.P. 239, 54506 Vandœuvre-lès-Nancy Cedex, France

Received 28 May 2008; accepted 28 October 2008
Available online 28 November 2008
Communicated by J. Coron

Abstract

In this paper we investigate the motion of a rigid ball surrounded by an incompressible perfect fluid occupying $\mathbb{R}^N$. We prove the existence, uniqueness, and persistence of the regularity for the solutions of this fluid-structure interaction problem.
© 2008 Elsevier Inc. All rights reserved.

Keywords: Euler equations; Fluid-rigid body interaction; Exterior domain; Classical solutions

* Corresponding author.
E-mail addresses: [email protected] (C. Rosier), [email protected] (L. Rosier).
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.10.024
C. Rosier, L. Rosier / Journal of Functional Analysis 256 (2009) 1618–1641

1. Introduction

We consider a homogeneous rigid body occupying a ball $B(t) \subset \mathbb{R}^N$ ($N \ge 2$) of radius one and which is surrounded by a homogeneous incompressible perfect fluid. We denote by $\Omega(t) = \mathbb{R}^N \setminus \overline{B(t)}$ the domain occupied by the fluid, and write merely $B = B(0) = \{x;\ |x| < 1\}$ and $\Omega = \Omega(0) = \{x;\ |x| > 1\}$. The equations modeling the dynamics of the system read

$$\frac{\partial u}{\partial t} + (u\cdot\nabla)u + \nabla p = g \quad \text{in } \Omega(t)\times[0,T], \tag{1.1}$$
$$\operatorname{div} u = 0 \quad \text{in } \Omega(t)\times[0,T], \tag{1.2}$$
$$u\cdot n = \bigl(h' + r\times(x-h)\bigr)\cdot n \quad \text{on } \partial\Omega(t)\times[0,T], \tag{1.3}$$
$$\lim_{|x|\to\infty} u(x,t) = u_\infty, \tag{1.4}$$
$$m h'' = \int_{\partial\Omega(t)} p\,n\,d\sigma + f_{rb} \quad \text{in } [0,T], \tag{1.5}$$
$$J r' = \int_{\partial\Omega(t)} (x-h)\times p\,n\,d\sigma + T_{rb} \quad \text{in } [0,T], \tag{1.6}$$
$$u(x,0) = a(x), \quad x\in\Omega, \tag{1.7}$$
$$h(0) = 0 \in \mathbb{R}^N, \quad h'(0) = b \in \mathbb{R}^N. \tag{1.8}$$

In the above equations, $u$ (respectively $p$) is the velocity field (respectively the pressure) of the fluid, $g$ is the external force field applied to the fluid (assumed for simplicity to be defined on $\mathbb{R}^N\times[0,T_0]$), $f_{rb}$ (respectively $T_{rb}$) stands for the external force (respectively the external torque) applied to the rigid body, $m$ (respectively $J$) is the mass (respectively the inertia matrix) of the ball, $h$ denotes the position of the center of the ball, assumed to be 0 at $t = 0$, $r$ is the angular velocity of the ball, $n$ is the unit outward normal vector to $\partial\Omega$, and $u_\infty$ is a given constant vector. As $x - h = -n$ on $\partial\Omega(t)$, (1.3) reduces to

$$u\cdot n = h'\cdot n, \tag{1.9}$$

whereas (1.6) simplifies into $J r' = T_{rb}$. It follows that the dynamics of $r$, which has no influence on the dynamics of $u$ and $h$, may be ignored. As in most fluid-structure interaction problems, one of the main difficulties in proving the wellposedness of (1.1)–(1.8) comes from the fact that the domain occupied by the fluid is variable and not a priori known. While in the last decade a large number of papers have been devoted to the wellposedness of fluid-structure interaction problems involving a viscous fluid (that is, governed by Navier–Stokes equations), the motion of a rigid body in a (not potential) Eulerian flow has been investigated only in a few papers. In [11], the existence and uniqueness of a (global) classical solution of (1.1)–(1.8) was established when $N = 2$. A result in the same vein was obtained in [12] for a body of arbitrary form, again for $N = 2$. The aim of this paper is to extend the results of [11] to a space of arbitrary dimension $N$ ($N \in \{2,3\}$ in practice), and to any order of smoothness. We shall for instance establish the existence of $C^\infty$ smooth (global) solutions when $N = 2$. Moreover, the fluid considered here will have a (not necessarily zero) limit at infinity, and will undergo the action of a force. It is clear that a suitable wellposedness theory is required if we have in mind to prove control results in the spirit of those in [5]. Notice that another application concerns inverse problems. In [4], it was proved that a moving ball surrounded by a potential fluid occupying a bounded domain in $\mathbb{R}^2$ can be detected thanks to a measurement at some time of the velocity of the fluid on some part of the boundary of the domain.

In this paper, the wellposedness of (1.1)–(1.8) is tackled in a direct way, without proving a similar result for Navier–Stokes equations as in [11,12]. This results in a direct and shorter proof. The method of proof combines the study of a variant of the Leray projector, designed to eliminate the pressure and to take into account the dynamics of the solid, with the classical approach for the wellposedness of Euler equations due to R. Temam [13,14], T. Kato [7], and Kato, Lai [8], which is based upon certain a priori estimates and a Galerkin method. For the sake of brevity, we shall derive the existence of smooth solutions from an abstract result given in [8], although a direct proof as in [13] could certainly be done.

To state the results, we introduce the usual solution $v_\infty = v_\infty(y)$ of the system

$$\operatorname{curl} v_\infty = 0 \quad \text{in } \Omega,$$
$$\operatorname{div} v_\infty = 0 \quad \text{in } \Omega,$$
$$v_\infty\cdot n = 0 \quad \text{on } \partial\Omega,$$
$$\lim_{|y|\to\infty} v_\infty(y) = u_\infty.$$

Simple calculations give

$$v_\infty(y) = u_\infty + \frac{1}{(N-1)|y|^{N+2}}\bigl(|y|^2 u_\infty - N(u_\infty\cdot y)\,y\bigr). \tag{1.10}$$
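The closed form (1.10) can be sanity-checked numerically. The following is a minimal sketch in plain Python, not part of the original argument: the function names (`v_inf`, `max_normal_flux`, `far_field_gap`) and the sample vectors are illustrative assumptions. It evaluates (1.10) on random points of the unit sphere to check the slip condition $v_\infty\cdot n = 0$, and far from the ball to check the limit $u_\infty$.

```python
import math
import random


def v_inf(y, u_inf):
    """Evaluate formula (1.10) at a point y (|y| >= 1) in dimension N = len(y)."""
    N = len(y)
    r2 = sum(c * c for c in y)
    r = math.sqrt(r2)
    dot = sum(a * b for a, b in zip(u_inf, y))
    coeff = 1.0 / ((N - 1) * r ** (N + 2))
    return [u + coeff * (r2 * u - N * dot * c) for u, c in zip(u_inf, y)]


def random_unit_vector(N, rng):
    """Draw a point on the unit sphere of R^N by normalising a Gaussian sample."""
    g = [rng.gauss(0.0, 1.0) for _ in range(N)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]


def max_normal_flux(N, u_inf, samples=200, seed=0):
    """Largest |v_inf . y| over random points of |y| = 1 (there n = -y)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        y = random_unit_vector(N, rng)
        v = v_inf(y, u_inf)
        worst = max(worst, abs(sum(a * b for a, b in zip(v, y))))
    return worst


def far_field_gap(N, u_inf, radius=1.0e3):
    """Max-norm distance between v_inf and u_inf at a point of norm `radius`."""
    y = [radius] + [0.0] * (N - 1)
    v = v_inf(y, u_inf)
    return max(abs(a - b) for a, b in zip(v, u_inf))
```

For instance, `max_normal_flux(3, [0.3, 1.0, -0.7])` should vanish up to rounding, while `far_field_gap` decays like $|y|^{-N}$, in line with the remark that $v_\infty - u_\infty$ lies in every $W^{s,p}(\Omega)$ with $p > 1$.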

Notice that $v_\infty(\cdot) - u_\infty \in W^{s,p}(\Omega)$ for all $s \ge 0$ and all $p \in (1,+\infty]$. In order to write the equations of the fluid in a fixed domain, we perform a change of coordinates. For any $y \in \Omega = \Omega(0)$ and any $t \in [0,T]$, we set $v(y,t) = u(y+h(t),t) - v_\infty(y)$, $q(y,t) = p(y+h(t),t)$, $f(y,t) = g(y+h(t),t)$, and $l(t) = h'(t)$. Then, the functions $(v,q,l)$ satisfy the following system:

$$\frac{\partial v}{\partial t} + (v_\infty + v - l)\cdot\nabla(v_\infty + v) + \nabla q = f \quad \text{in } \Omega\times[0,T], \tag{1.11}$$
$$\operatorname{div} v = 0 \quad \text{in } \Omega\times[0,T], \tag{1.12}$$
$$v\cdot n = l\cdot n \quad \text{on } \partial\Omega\times[0,T], \tag{1.13}$$
$$\lim_{|y|\to\infty} v(y,t) = 0, \tag{1.14}$$
$$m l' = \int_{\partial\Omega} q\,n\,d\sigma + f_{rb} \quad \text{in } [0,T], \tag{1.15}$$
$$v(y,0) = a(y) - v_\infty(y), \quad y\in\Omega, \tag{1.16}$$
$$l(0) = b. \tag{1.17}$$

For the sake of brevity, if $H$ denotes any space of real-valued functions, we write $v \in H$ when each component $v_i$ of $v$ belongs to $H$. For any $s \ge 1$, we denote by $\widehat{H}^s(\Omega)$ the homogeneous Sobolev space

$$\widehat{H}^s(\Omega) = \bigl\{q \in L^2_{loc}(\Omega)\ \big|\ \nabla q \in H^{s-1}(\Omega)\bigr\},$$

where $q \in L^2_{loc}(\Omega)$ means that $q \in L^2(\Omega\cap B_0)$ for all open balls $B_0 \subset \mathbb{R}^N$ with $B_0\cap\Omega \neq \emptyset$. Throughout the paper, $s_0$ will denote the number $s_0 = [N/2] + 2$, so that $H^{s-1}(\Omega) \subset L^\infty(\Omega)$ for each $s \ge s_0$. ($s$ is assumed to be an integer.) The main result in this paper is the following one.

Theorem 1.1. Let $s \ge s_0$, $u_\infty \in \mathbb{R}^N$, $a \in v_\infty + H^s(\Omega)$, where $v_\infty$ is given by (1.10), $b \in \mathbb{R}^N$, $f \in C([0,T_0]; H^s(\Omega))$, and $f_{rb} \in C([0,T_0])$. Assume that $\operatorname{div} a = 0$ and $(a-b)\cdot n|_{\partial\Omega} = 0$. Then there exist a time $T \le T_0$ and a solution $(v,q,l)$ of (1.11)–(1.17) such that $v \in C([0,T]; H^s(\Omega))$, $q \in C([0,T]; \widehat{H}^s(\Omega))$ and $l \in C^1([0,T]; \mathbb{R}^N)$. Such a solution is unique up to an arbitrary function of $t$ which may be added to $q$. Furthermore, $T$ does not depend on $s$.

Remark 1.2.
(1) It follows from (1.11) that $v \in C^1([0,T]; H^{s-1}(\Omega))$.
(2) Theorem 1.1 can be extended to the case when the external force field $f$ has a nonzero limit at infinity (e.g. if $f$ stands for the gravity force). Let $f_\infty(t) := \lim_{|y|\to\infty} f(y,t)$ and $\tilde f(y,t) := f(y,t) - f_\infty(t)$. If $\tilde f \in C([0,T_0]; H^s(\Omega))$ and $(\tilde v,\tilde q,\tilde l)$ is the solution of (1.11)–(1.17) corresponding to $a$, $b$, $\tilde f$ and $\tilde f_{rb} = f_{rb} + \int_{\partial\Omega}(f_\infty(t)\cdot y)\,n\,d\sigma$, then $(v,q,l) = (\tilde v, \tilde q + f_\infty(t)\cdot y, \tilde l)$ solves (1.11)–(1.17) with the forcing terms $f$, $f_{rb}$ in (1.11) and (1.15), respectively.
(3) It is sufficient to prove Theorem 1.1 with $f_{rb} \equiv 0$. Indeed, introducing a function $q_{rb} \in C([0,T_0]; H^{s+1}(\Omega))$ with $\int_{\partial\Omega} q_{rb}(y,t)\,n\,d\sigma = f_{rb}(t)$ and setting $\hat q = q(y,t) + q_{rb}(y,t)$, $\hat f = f + \nabla q_{rb}$, then (1.11) and (1.15) hold with $(\hat q,\hat f,0)$ substituted for $(q,f,f_{rb})$. We shall assume thereafter that $f_{rb} \equiv 0$.

Finally, the existence of global smooth solutions can be asserted when $N = 2$.

Corollary 1.3. Assume that $N = 2$ and that $s$, $T_0$, $u_\infty$, $a$, $b$, $f$ and $f_{rb}$ are as in Theorem 1.1, with $\operatorname{curl} a \in L^p(\Omega)$ and $\operatorname{curl} f \in L^1(0,T_0; L^p(\Omega))$ for some $p \in [1,2)$. Then we can pick $T = T_0$ in Theorem 1.1.

We stress that Corollary 1.3 does not follow from [11], since there is a gap between the regularity of the solutions provided in [11] (namely, $v \in C^1(\Omega)\cap H^1(\Omega)$) and the minimal regularity required in Theorem 1.1 ($v \in H^3(\Omega)$). To prove Corollary 1.3, we use the well-known fact (see e.g. [2]) that a solution remains smooth as long as its vorticity is uniformly bounded. If we compare the results in this paper with the ones in [9–12], we notice that no weighted Sobolev space is involved here. This follows from the crucial observation that $\nabla v$ can be estimated in terms of the vorticity $\omega = \operatorname{curl} v$ in the same usual Sobolev space $H^s(\Omega)$, without incorporating any weight.

The paper is outlined as follows. Section 2 provides some background on Kato–Lai theory. Section 3 is concerned with the proof of Theorem 1.1. It begins with the study of the projector which has to be substituted for the Leray projector in order to take into account the motion of the rigid ball. Then we apply Kato–Lai theory to a certain abstract system, and we check that the solution provided by that theory is indeed a solution of the original fluid-structure interaction problem. Section 4 is concerned with the proof of Corollary 1.3. It contains the proof of several a priori estimates relating the velocity to the vorticity in an exterior domain.

2. Proof of Theorem 1.1

2.1. Kato–Lai theory

In this section we review briefly Kato–Lai theory and introduce some notations. The reader is referred to [8] for more details. Let $V$, $H$, $X$ be three real separable Banach spaces. We say that the family $\{V,H,X\}$ is an admissible triplet if the following conditions hold.

(i) $V \subset H \subset X$, the inclusions being dense and continuous.
(ii) $H$ is a Hilbert space, with inner product $(\cdot,\cdot)_H$ and norm $\|\cdot\|_H = (\cdot,\cdot)_H^{1/2}$.
(iii) There is a continuous, nondegenerate bilinear form on $V\times X$, denoted by $\langle\cdot,\cdot\rangle$, such that

$$\langle v,u\rangle = (v,u)_H \quad \text{for all } v\in V \text{ and } u\in H. \tag{2.1}$$

Recall that the bilinear form $\langle v,u\rangle$ is continuous and nondegenerate when

$$|\langle v,u\rangle| \le C\,\|v\|_V\,\|u\|_X \quad \text{for some constant } C > 0; \tag{2.2}$$
$$\langle v,u\rangle = 0 \text{ for all } u\in X \text{ implies } v = 0; \tag{2.3}$$
$$\langle v,u\rangle = 0 \text{ for all } v\in V \text{ implies } u = 0. \tag{2.4}$$

A map $A : [0,T_0]\times H \to X$ is said to be sequentially weakly continuous if $A(t_n,v_n) \rightharpoonup A(t,v)$ in $X$ whenever $t_n \to t$ and $v_n \rightharpoonup v$ in $H$. We denote by $C_w([0,T];H)$ the space of sequentially weakly continuous functions from $[0,T]$ to $H$, and by $C_w^1([0,T];X)$ the space of the functions $u \in W^{1,\infty}(0,T;X)$ such that $du/dt \in C_w([0,T];X)$. We are concerned with the Cauchy problem

$$\frac{dv}{dt} + A(t,v) = 0, \quad t \ge 0, \qquad v(0) = v_0. \tag{2.5}$$

The Kato–Lai existence result for abstract evolution equations is as follows.

Theorem 2.1. (See [8, Theorem A].) Let $\{V,H,X\}$ be an admissible triplet. Let $A$ be a sequentially weakly continuous map from $[0,T_0]\times H$ into $X$ such that

$$\langle v, A(t,v)\rangle \ge -\beta\bigl(\|v\|_H^2\bigr) \quad \text{for } t\in[0,T_0],\ v\in V, \tag{2.6}$$

where $\beta(r) \ge 0$ is a continuous nondecreasing function of $r \ge 0$. Then for any $v_0 \in H$ there is a time $T > 0$, $T \le T_0$, and a solution $v$ of (2.5) in the class

$$v \in C_w([0,T];H) \cap C_w^1([0,T];X). \tag{2.7}$$

Moreover, one has

$$\|v(t)\|_H^2 \le \gamma(t), \quad t\in[0,T], \tag{2.8}$$

where $\gamma$ solves the ODE $\gamma'(t) = 2\beta(\gamma(t))$, $\gamma(0) = \|v_0\|_H^2$.

3. Proof of Theorem 1.1

In this section, we put system (1.11)–(1.17) (with $f_{rb} \equiv 0$) in the form (2.5) in order to apply Theorem 2.1.
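As a side illustration of the comparison function $\gamma$ in Theorem 2.1, the sketch below integrates $\gamma' = 2\beta(\gamma)$ for one hypothetical choice, $\beta(r) = r^{3/2}$, which is an assumption made only for this example and is not taken from the paper. For that choice the ODE has the closed form $\gamma(t) = (\gamma_0^{-1/2} - t)^{-2}$, which blows up at $t^* = \gamma_0^{-1/2}$; this suggests why the bound (2.8) can only be expected on a time interval depending on $\|v_0\|_H$. The helper names are illustrative.

```python
def beta(r):
    """Hypothetical growth function beta(r) = r**1.5 (illustration only)."""
    return r ** 1.5


def gamma_exact(t, gamma0):
    """Closed-form solution of gamma' = 2*gamma**1.5, gamma(0) = gamma0, for t < blow-up."""
    return (gamma0 ** -0.5 - t) ** -2


def blow_up_time(gamma0):
    """Time at which gamma_exact becomes infinite."""
    return gamma0 ** -0.5


def gamma_rk4(t_end, gamma0, steps=2000):
    """Integrate gamma' = 2*beta(gamma) on [0, t_end] with the classical RK4 scheme."""
    h = t_end / steps
    g = gamma0

    def rhs(value):
        return 2.0 * beta(value)

    for _ in range(steps):
        k1 = rhs(g)
        k2 = rhs(g + 0.5 * h * k1)
        k3 = rhs(g + 0.5 * h * k2)
        k4 = rhs(g + h * k3)
        g += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return g
```

With $\gamma_0 = 4$ the blow-up time is $0.5$, and the numerical solution at $t = 0.25$ can be compared with `gamma_exact(0.25, 4.0)`.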


Pick $s \ge s_0$ and define the (uniform) density of the ball as $\rho = m/|B|$, where $|B|$ stands for the Lebesgue measure of the ball $B$. Let $X = L^2(\mathbb{R}^N)$ be endowed with the scalar product

$$(u,v)_X = \int_\Omega u(x)\cdot v(x)\,dx + \rho\int_B u(x)\cdot v(x)\,dx.$$

We introduce the (closed) subspace

$$X^* = \bigl\{u \in X;\ \operatorname{div} u = 0 \text{ on } \mathbb{R}^N \text{ and } u = \text{const on } B\bigr\}.$$

For any $u \in X^*$, we denote by $l_u$ the unique vector in $\mathbb{R}^N$ such that $u(x) = l_u$ a.e. on $B$. Let $H = \{u\in X;\ u|_\Omega \in H^s(\Omega)\} = H^s(\Omega)\oplus L^2(B)$ be endowed with the scalar product

$$(u_1,u_2)_H = (u_1,u_2)_{H^s(\Omega)} + \rho\,(u_1,u_2)_{L^2(B)}.$$

Finally, following Kato–Lai, we define $V$ as the space of functions $v \in H$ such that $v|_\Omega$ belongs to $D(S)^N$, where $S$ is the nonnegative selfadjoint operator $S : D(S)\subset L^2(\Omega)\to L^2(\Omega)$ defined by

$$(Sf,g)_{L^2(\Omega)} = (f,g)_{H^s(\Omega)} \quad \forall f\in D(S),\ \forall g\in H^s(\Omega).$$

Recall that $S$ is the elliptic operator $Sf = \sum_{|\alpha|\le s}(-1)^{|\alpha|}\partial^{2\alpha}f$ with Neumann boundary conditions, and that $D(S)\subset H^{2s}(\Omega)$. $V$ is endowed with the scalar product

$$(v_1,v_2)_V = (v_1,v_2)_{H^{2s}(\Omega)} + \rho\,(v_1,v_2)_{L^2(B)}.$$

To emphasize the dependence on $s$, at some places we shall write $X_s$, $H_s$, $V_s$ instead of $X$, $H$, $V$. Clearly, $X$, $H$ and $V$ are Hilbert spaces, and the inclusions in $V\subset H\subset X$ are continuous and dense. Introduce the bilinear form on $V\times X$

$$\langle v,u\rangle = \Bigl(\sum_{|\alpha|\le s}(-1)^{|\alpha|}\partial^{2\alpha}v,\ u\Bigr)_{L^2(\Omega)} + \rho\,(v,u)_{L^2(B)}.$$

Notice that

$$\langle v,u\rangle = (v,u)_H \quad \text{for all } v\in V,\ u\in H.$$

Clearly, the conditions (2.2) and (2.3) are satisfied. (2.4) follows from the self-adjointness of $S$.

3.1. Determination of the projector

Let $P$ denote the orthogonal projection from the space $X = L^2(\mathbb{R}^N)$, endowed with the scalar product $(\cdot,\cdot)_X$, onto $X^*$, and $Q = 1 - P$. To prove that $P(H)\subset H$, we need to compute explicitly $P(u)$ for any $u\in X$. This is done in the following proposition.

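Proposition 3.1.
(i) Pick any $u \in X$. Then

$$P(u) = \begin{cases} u - \nabla\varphi & \text{in } \Omega,\\ l & \text{in } B,\end{cases}$$

where $\varphi(x) = \varphi_u(x) + (l\cdot x)(N-1)^{-1}|x|^{-N}$, $\varphi_u$ is the unique solution in $\widehat{H}^1(\Omega)$ of the elliptic problem

$$\Delta\varphi_u = \operatorname{div} u \quad \text{in } \Omega, \qquad \frac{\partial\varphi_u}{\partial n} = u\cdot n \quad \text{on } \partial\Omega,$$

and

$$l = \Bigl(|B| + \frac{|\partial B|}{\rho N(N-1)}\Bigr)^{-1}\Bigl(\int_B u(x)\,dx + \rho^{-1}\int_{\partial\Omega}\varphi_u\,n\,d\sigma\Bigr).$$

(ii) $P$ maps $H_s$ into $H_s$ continuously for any $s \ge 1$.

Proof. (i) We write $X^* = X_1\cap X_2$, where $X_1 := \{u\in X;\ u = \text{const in } B\}$ and $X_2 := \{u\in X;\ \operatorname{div} u = 0 \text{ in } \mathbb{R}^N\}$. Obviously $X_1^\perp + X_2^\perp \subset (X^*)^\perp$. Clearly

$$X_1^\perp = \Bigl\{v\in X;\ v = 0 \text{ in } \Omega \text{ and } \int_B v(x)\,dx = 0\Bigr\}.$$

We claim that

$$X_2^\perp = \bigl\{v = 1_\Omega\nabla\varphi_\Omega + 1_B\nabla\varphi_B;\ \varphi_\Omega\in\widehat{H}^1(\Omega),\ \varphi_B\in H^1(B) \text{ with } \varphi_\Omega - \rho\varphi_B = 0 \text{ on } \partial\Omega\bigr\},$$

where $1_\Omega$ and $1_B$ denote the characteristic functions of $\Omega$ and $B$, respectively. Indeed, if $v\in X_2^\perp$, then by a classical result (see e.g. [15]) there exist two functions $\varphi_\Omega\in\widehat{H}^1(\Omega)$ and $\varphi_B\in H^1(B)$ such that $v = \nabla\varphi_\Omega$ in $\Omega$ and $v = \nabla\varphi_B$ in $B$. Pick any $u\in X_2\cap H^1(\mathbb{R}^N)$. Then, writing $n_B = -n$ for the unit outward normal to $B$, we have

$$0 = (u,v)_X = \int_\Omega u\cdot\nabla\varphi_\Omega\,dx + \rho\int_B u\cdot\nabla\varphi_B\,dx = \int_{\partial\Omega}\varphi_\Omega\,u\cdot n\,d\sigma + \rho\int_{\partial B}\varphi_B\,u\cdot n_B\,d\sigma = \int_{\partial\Omega}(\varphi_\Omega - \rho\varphi_B)\,u\cdot n\,d\sigma.$$

This yields $\varphi_\Omega - \rho\varphi_B = 0$ on $\partial\Omega$. The other inclusion is obvious.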

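We aim to construct two functions $u_1, u_2\in X$ satisfying

$$u = u_1 + u_2, \tag{3.1}$$
$$\operatorname{div} u_1 = 0 \quad \text{in } \Omega, \tag{3.2}$$
$$u_1\cdot n = l\cdot n \quad \text{on } \partial\Omega, \tag{3.3}$$
$$u_1 = l \quad \text{in } B, \tag{3.4}$$
$$u_2 = \nabla\varphi_\Omega \quad \text{in } \Omega, \tag{3.5}$$
$$u_2 = \nabla\varphi_B + v \quad \text{in } B \tag{3.6}$$

for some vector $l\in\mathbb{R}^N$, some functions $\varphi_\Omega\in\widehat{H}^1(\Omega)$, $\varphi_B\in H^1(B)$ with $\varphi_\Omega - \rho\varphi_B = 0$ on $\partial\Omega$, and some function $v\in L^2(B)$ with $\int_B v(x)\,dx = 0$. With such a pair $(u_1,u_2)$ at hand, it is clear that $P(u) = u_1$, for $u_1\in X^*$ and $u_2\in (X^*)^\perp$. We first determine the function $\varphi_\Omega$. From (3.1)–(3.3) and (3.5), we infer that $\varphi_\Omega$ has to solve

$$\Delta\varphi_\Omega = \operatorname{div} u \quad \text{in } \Omega, \tag{3.7}$$
$$\frac{\partial\varphi_\Omega}{\partial n} = u\cdot n - l\cdot n \quad \text{on } \partial\Omega. \tag{3.8}$$

We seek $\varphi_\Omega$ in the form $\varphi_\Omega = \varphi_u - \varphi_l$, where $\varphi_u$ and $\varphi_l$ solve respectively

$$\Delta\varphi_u = \operatorname{div} u \quad \text{in } \Omega, \tag{3.9}$$
$$\frac{\partial\varphi_u}{\partial n} = u\cdot n \quad \text{on } \partial\Omega, \tag{3.10}$$
$$\Delta\varphi_l = 0 \quad \text{in } \Omega, \tag{3.11}$$
$$\frac{\partial\varphi_l}{\partial n} = l\cdot n \quad \text{on } \partial\Omega. \tag{3.12}$$

Clearly, for a general function $u\in L^2(\mathbb{R}^N)$, the trace $u\cdot n$ on $\partial\Omega$ does not make sense. However, we may define a generalized solution of (3.9)–(3.10) by using a variational formulation. Multiplying (3.9) by $\theta\in\widehat{H}^1(\Omega)$ and integrating by parts, we arrive at

$$\int_\Omega \nabla\varphi_u\cdot\nabla\theta\,dx = \int_\Omega u\cdot\nabla\theta\,dx \quad \text{for all } \theta\in\widehat{H}^1(\Omega). \tag{3.13}$$

According to the Riesz representation theorem, for any $u\in L^2(\mathbb{R}^N)$ there exists a unique function $\varphi_u\in\widehat{H}^1(\Omega)$ satisfying (3.13). Simple computations show that the function

$$\varphi_l(x) = -\frac{1}{N-1}\,\frac{l\cdot x}{|x|^N} \tag{3.14}$$

is the unique solution of (3.11)–(3.12) in the class $\widehat{H}^1(\Omega)$. Thus $\varphi_\Omega = \varphi_u - \varphi_l$ is the unique solution of (3.7)–(3.8) in $\widehat{H}^1(\Omega)$.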
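The "simple computations" behind (3.14) can be spot-checked numerically. The sketch below is an illustration only, with hypothetical helper names: it evaluates $\varphi_l$, uses its closed-form gradient to test the Neumann condition (3.12) on the unit sphere (where the outward normal to $\Omega$ is $n = -x$), and approximates the Laplacian by central finite differences to test (3.11) at an interior point of $\Omega$.

```python
import math


def phi_l(x, l):
    """Potential (3.14): phi_l(x) = -(l . x) / ((N - 1) |x|^N)."""
    N = len(x)
    r = math.sqrt(sum(c * c for c in x))
    dot = sum(a * b for a, b in zip(l, x))
    return -dot / ((N - 1) * r ** N)


def grad_phi_l(x, l):
    """Closed-form gradient of phi_l, obtained by differentiating (3.14)."""
    N = len(x)
    r = math.sqrt(sum(c * c for c in x))
    dot = sum(a * b for a, b in zip(l, x))
    return [-(li / r ** N - N * dot * xi / r ** (N + 2)) / (N - 1)
            for li, xi in zip(l, x)]


def laplacian_fd(x, l, h=1.0e-3):
    """Second-order central finite-difference approximation of the Laplacian of phi_l."""
    total = 0.0
    centre = phi_l(x, l)
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        total += (phi_l(xp, l) - 2.0 * centre + phi_l(xm, l)) / (h * h)
    return total


def neumann_residual(x_on_sphere, l):
    """Residual of (3.12) at a point with |x| = 1, where n = -x."""
    n = [-c for c in x_on_sphere]
    g = grad_phi_l(x_on_sphere, l)
    return sum(gi * ni for gi, ni in zip(g, n)) - sum(li * ni for li, ni in zip(l, n))
```

Both residuals should be zero up to discretisation and rounding error, for instance at `x = [0.6, 0.0, 0.8]` on the sphere and at `x = [1.5, 0.4, -0.8]` inside $\Omega$.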

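The function $u_1$ is defined as

$$u_1(x) = \begin{cases} u(x) - \nabla\varphi_\Omega(x) & \text{if } x\in\Omega,\\ l & \text{if } x\in B.\end{cases} \tag{3.15}$$

From (3.7)–(3.8), we infer that $\operatorname{div} u_1 = 0$ in $\mathbb{R}^N$ and that $u_1\cdot n = l\cdot n$ on $\partial\Omega$, hence $u_1\in X^*$. Since $\varphi_\Omega|_{\partial\Omega}\in H^{1/2}(\partial\Omega)$, we may pick a function $\varphi_B\in H^1(B)$ such that

$$\rho\varphi_B = \varphi_\Omega \quad \text{on } \partial\Omega = \partial B. \tag{3.16}$$

Let $v : B\to\mathbb{R}^N$ be defined by $v(x) = u(x) - l - \nabla\varphi_B(x)$ for any $x\in B$. The value of $l$ is imposed by the constraint $\int_B v(x)\,dx = 0$, i.e.,

$$\int_B u(x)\,dx - l\,|B| - \int_B\nabla\varphi_B\,dx = 0.$$

Note that, by (3.14)–(3.16), and with $n_B = -n$ denoting the unit outward normal to $B$,

$$\int_B\nabla\varphi_B\,dx = \int_{\partial B}\varphi_B\,n_B\,d\sigma = -\rho^{-1}\Bigl(\int_{\partial\Omega}\varphi_u\,n\,d\sigma - \int_{\partial\Omega}\varphi_l\,n\,d\sigma\Bigr) = -\rho^{-1}\Bigl(\int_{\partial\Omega}\varphi_u\,n\,d\sigma - \frac{|\partial B|}{N(N-1)}\,l\Bigr).$$

Therefore

$$l = \Bigl(|B| + \frac{|\partial B|}{\rho N(N-1)}\Bigr)^{-1}\Bigl(\int_B u(x)\,dx + \rho^{-1}\int_{\partial\Omega}\varphi_u\,n\,d\sigma\Bigr). \tag{3.17}$$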

Notice that, for u sufficiently small at infinity, ∂Ω ϕu n dσ = Ω u dx, as it can be seen by letting θ = xi in (3.13). Let us proceed to the proof of (ii). Pick any u ∈ Hs (s 1), and consider P (u) = u1 where u1 is defined in (3.15). Clearly, P (u) ∈ X, and to prove that P (u) ∈ H s (Ω), it is sufficient to show that ∇ϕu ∈ H s (Ω). Observe that ϕu is defined up to an additive constant. To fix that constant we may impose the condition ϕu (x) dx = 0. (3.18) 1 0 or fξ > 0. If (n, i) = (f, a) and nξ = fξ = 0, then we have 0 = ζ(n,i),(f,a) , and we obtain from the fori,j mula (11) and the fact that bal(tn,m,α )ξ = mξ 0, that 0 = ζ(n,i),(f,a) = ζ˜(n,i),(f,a) by an application of property (a) for c = f − n. Recalling (14) we observe that we have proved that ζ˜(n,i),(f,a) = t˜t˜∗ . q1 = (n,i),(f,a)∈(Zd+ )

2

Repeating the above computations with t := t ∗ rather than t (adjoining equation (9) we see i,j

j,i

i,j

∗

j,i

that it is enough to replace λn,m,α by λm,n,α and tn,m,α by (tm,n,α ) in the definition of t) it ∗ turns out that q2 = t˜ t˜ = t˜∗ t˜ since t˜ = t˜∗ (everything passes as above, since the only essential i,j i,j i,j property we required for λn,m,α and tn,m,α ∈ W , namely bal(tn,m,α ) = m − n, does also hold for ∗

∗

the new coefficients λm,n,α and (tm,n,α ) ∈ W , namely bal((tm,n,α ) ) = m − n). This proves that [q1 ] = [q2 ], and thus q = 0. 2 j,i

j,i

j,i

Proposition 4.8. K0 (ϕQ ) is injective for the embedding ϕQ : C ∗ (Q) → X0 AX0 given by ϕQ (q) = X0 qX0 . Proof. Since Q ⊆ A, and A is the inductively ordered union of finite dimensional C ∗ subalgebras, the same holds for Q and X0 AX0 . Hence any element of K0 (C ∗ (Q)) allows a representation q = [q1 ] − [q2 ] for projections q1 , q2 ∈ M∞ (Q). Assume that K0 (ϕQ )(q) = 0. Notice that (ϕQ )∞ (qi ) ∈ M∞ (X0 AX0 ), and therefore there exists a finite dimensional C ∗ subalgebra F ⊆ X0 AX0 and some t ∈ M∞ (F ) such that (ϕQ )∞ (q1 ) = tt ∗ ∼ t ∗ t = (ϕQ )∞ (q2 ). Since we will next have similar computations as in the proof of Proposition 4.7, we will save space and refer to some formulas appearing there. We choose for t a representation as in (9) (notice that F ⊆ A 0 ). In particular we have n = m = 0 everywhere in the representation (9) i,j of t, and we have bal(tn,m,α ) = m − n = 0. In the representation (10) (fixing i and a, whereas i,j i,j i,j n = m = f = 0 anyway) we can replace λn,m,α by λ˜ n,m,α = 1{t i,j ∈Q} λn,m,α by property (b). n,m,α

B. Burgstaller / Journal of Functional Analysis 256 (2009) 1693–1707


Thus we get (ϕQ )∞ (q1 ) = tt ∗ = t˜t˜∗ for t˜ being defined like t with the only difference that i,j i,j λn,m,α is replaced by λ˜ n,m,α . Similarly we get (ϕQ )∞ (q2 ) = t ∗ t = t˜∗ t˜. Since t˜ ∈ M∞ (X0 QX0 ) we obtain [q1 ] = [q2 ]. 2 Proof of Theorem 2.2. Clearly, A and Q ⊆ A are the inductively ordered union of their finite dimensional subalgebras. Thus {[q] | q ∈ Q is a projection} generates K0 (C ∗ (Q)). By Proposition 4.8 and Corollary 4.6, K0 (ϕ0 ϕQ ) is injective and has image T0 . By Proposition 3.5 (justified by Proposition 4.1 and Proposition 4.7), K0 (θ ) is an isomorphism for θ = ϕA ϕ0 ϕQ , and K1 (A Γˆ Zd ) = 0. A slight analysis of the Takai duality map (exploiting that Q is invariant under Γ ) shows that K0 (θ ) is also an isomorphism. 2 Acknowledgments I thank Joachim Cuntz and Siegfried Echterhoff for their invitation and hospitality at the University of Münster. I am indebted to Toke Meier Carlsen and Aidan Sims for the proof of Lemma 2.6. References [1] S. Allen, D. Pask, A. Sims, A dual graph construction for higher-rank graphs, and K-theory for finite 2-graphs, Proc. Amer. Math. Soc. 134 (2006) 455–464. [2] B. Burgstaller, Notes on Cuntz–Krieger uniqueness theorems and C ∗ -algebras of labelled graphs, Quaest. Math., in press. [3] B. Burgstaller, The uniqueness of Cuntz–Krieger type algebras, J. Reine Angew. Math. 594 (2006) 207–236. [4] B. Burgstaller, A class of higher rank Exel–Laca algebras, Acta Sci. Math. 73 (2007) 209–235. [5] B. Burgstaller, The K-theory of some higher rank Exel–Laca algebras, J. Aust. Math. Soc. 84 (1) (2008) 21–38. [6] B. Burgstaller, D.G. Evans, On certain properties of Cuntz–Krieger type algebras, preprint. [7] J. Cuntz, K-theory for certain C ∗ -algebras, Ann. of Math. 113 (1981) 181–197. [8] D.G. Evans, On the K-theory of higher rank graph C ∗ -algebras, New York J. Math. 14 (2008) 1–31. [9] N.J. Fowler, Discrete product systems of Hilbert bimodules, Pacific J. Math. 204 (2) (2002) 335–375. [10] N.J. Fowler, I. 
Raeburn, The Toeplitz algebra of a Hilbert bimodule, Indiana Univ. Math. J. 48 (1) (1999) 155–181. [11] E. Kirchberg, Michael’s noncommutative selection principle and the classification of nonsimple algebras, in: J. Cuntz, et al. (Eds.), C ∗ -Algebras. Proceedings of the SFB-Workshop, Münster, Germany, March 8–12, 1999, Springer, Berlin, 2000, pp. 92–141. [12] A. Kumjian, D. Pask, Higher rank graph C ∗ -algebras, New York J. Math. 6 (2000) 1–20. [13] N.C. Phillips, A classification theorem for nuclear purely infinite simple C ∗ -algebras, Doc. Math., J. DMV 5 (2000) 49–114. [14] I. Raeburn, A. Sims, Product systems of graphs and the Toeplitz algebras of higher-rank graphs, J. Operator Theory 53 (2) (2005) 399–429. [15] I. Raeburn, A. Sims, T. Yeend, The C ∗ -algebras of finitely aligned higher-rank graphs, J. Funct. Anal. 213 (2004) 206–240. [16] I. Raeburn, W. Szymański, Cuntz–Krieger algebras of infinite graphs and matrices, Trans. Amer. Math. Soc. 356 (2004) 39–59. [17] A. Sims, C ∗ -algebras associated to higher rank graphs, PhD-thesis at the University of Newcastle, 2003. [18] A. Sims, Relative Cuntz–Krieger algebras of finitely aligned higher-rank graphs, Indiana Univ. Math. J. 55 (2) (2006) 849–868. [19] M. Takesaki, Theory of Operator Algebras I, Encyclopaedia Math. Sci. Operator Algebras and Non-Commutative Geometry, vol. 124(5), Springer, Berlin, 2002, second printing of the 1979 ed.


Nonlinear random ergodic theorems for affine operators

Takeshi Yoshimoto

Department of Mathematics, Toyo University, Kawagoe, Saitama 350-8585, Japan

Received 25 September 2007; accepted 5 January 2009
Available online 3 February 2009
Communicated by H. Brezis

Abstract

Let $(\Omega, ß, \mu)$ be a finite measure space and let $(S, \mathcal{F}, \nu)$ be another probability measure space on which a measure preserving transformation $\varphi$ is given. We introduce the so-called affine systems and prove a vector-valued nonlinear random ergodic theorem for the random affine system determined by a strongly $\mathcal{F}$-measurable family $\{T_s + \xi(s,\cdot) : s\in S\}$ of affine operators, where $B$ is a reflexive Banach space, $\{T_s : s\in S\}$ is a strongly $\mathcal{F}$-measurable family of linear contractions on $L_1(\Omega,B)$ as well as on $L_\infty(\Omega,B)$ and $\xi$ is a function in $(I-T)L_p(S\times\Omega,B)$ ($1\le p<\infty$) with the operator $T$ defined by $Tf(s,\omega) = [T_s f_{\varphi s}](\omega)$, which denotes the $\mathcal{F}\otimes ß$-measurable version of $T_s f_{\varphi s}(\omega)$. Moreover, some variant forms of the nonlinear random ergodic theorem are also obtained, together with some examples of affine systems for which the nonlinear ergodic theorems fail to hold.
© 2009 Elsevier Inc. All rights reserved.

Keywords: Nonlinear operator; Strong measurability; Measurable representation (version); Affine system; Random affine system; Nonlinear ergodic theorem; Nonlinear random ergodic theorem; Pointwise convergence; Mean (strong) convergence; Abstract Abelian theorem

E-mail address: [email protected].
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.009
T. Yoshimoto / Journal of Functional Analysis 256 (2009) 1708–1730

1. Introduction

This paper deals with random ergodic theorems for affine operators in $L_p$, as a first step in the study of random ergodic theorems for nonlinear operators. An affine operator on a Banach space $X$ is an operator of the type $Ax = Tx + y$, where $T$ is a bounded linear operator on $X$ and $y$ is a fixed element of $X$. One usually takes $\|T\|\le 1$, so $A$ is nonexpansive. The fixed points of $A$ are solutions of Poisson's equation for $T$, which is $(I-T)x = y$. Thus it is a natural attempt to reach the fixed points of $A$ by averaging its iterates, which are $A^n x = T^n x + \sum_{k=0}^{n-1}T^k y$. This yields for the averages of $A$ the representation

$$\frac{1}{N}\sum_{n=1}^{N}A^n x = \frac{1}{N}\sum_{n=1}^{N}T^n x + \frac{1}{N}\sum_{n=1}^{N}\sum_{k=0}^{n-1}T^k y. \tag{$*$}$$

When we assume $T$ to be mean ergodic, the averages of ($*$) converge if and only if $y\in(I-T)X$ (see [16], or the earlier result in Dotson [5]). When $y\in(I-T)X$, there is a unique $z\in\overline{(I-T)X}$ such that $(I-T)z = y$, and the limit in ($*$) is $Ex + z$ (where $E$ is the ergodic limit $Ex = \lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}T^k x$). We can therefore have convergence of the iterate averages (even weakly on a subsequence) only if $Ax = Tx + (I-T)z$. When $T$ is a Dunford–Schwartz contraction on $L_1$ of a probability space (a contraction of $L_1$ which also contracts the $L_\infty$ norm), it is mean ergodic and we also have a pointwise ergodic theorem, so iterating $Af = Tf + (I-T)h$ will yield, in addition to norm convergence, also almost everywhere convergence of the averages of $A^n f$, for any $f\in L_1$. The a.e. convergence of ($*$) holds also if $T$ is a Dunford–Schwartz contraction of $L_1$ of a (not necessarily finite) $\sigma$-finite measure, since the pointwise ergodic theorem holds for $T$.

We are motivated by the affine method and the random ergodic theorems for linear operators. It is the purpose of the present paper to establish nonlinear random ergodic theorems for affine operators in $L_p$ spaces. In 1978 Baillon [1] established the weak nonlinear ergodic theorem of Cesàro $(C,1)$ type for a nonexpansive self-mapping $T$ of a bounded closed convex subset $C$ of $L_p$ with $1 < p < \infty$: for every $f\in C$, the $(C,1)$ mean $\frac{1}{n+1}\sum_{k=0}^{n}T^k f$ converges weakly to a $T$-fixed point in $C$. Later, Krengel and Lin [15] considered a class of order preserving, $L_\infty$-norm decreasing and $L_1$-nonexpansive operators on $L_p$ and proved the weak nonlinear ergodic theorem (which cannot be covered by Baillon's theorem) for operators belonging to this class. These results due to Baillon, Krengel and Lin have recently been extended by the author [25] to the case of Cesàro $(C,\alpha)$ type of order $\alpha$ with $0 < \alpha < \infty$.
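The representation ($*$) and the limit $Ex + z$ described above can be observed on a toy example. The sketch below is an illustration under stated assumptions, not taken from the paper: it uses the hypothetical contraction $T = \mathrm{diag}(1, 1/2)$ on $\mathbb{R}^2$, for which $Ex = (x_1, 0)$ and $(I-T)X = \{0\}\times\mathbb{R}$. With $y = (0,1)$ one has $z = (0,2)$, so the Cesàro averages of $A^n x$ should approach $(x_1, 2)$, whereas for $y = (1,0)\notin(I-T)X$ they diverge.

```python
def apply_T(x):
    """Hypothetical linear contraction on R^2: T = diag(1, 1/2)."""
    return [x[0], 0.5 * x[1]]


def apply_A(x, y):
    """Affine operator A x = T x + y."""
    t = apply_T(x)
    return [t[0] + y[0], t[1] + y[1]]


def iterate_direct(x, y, n):
    """Compute A^n x by applying A n times."""
    current = list(x)
    for _ in range(n):
        current = apply_A(current, y)
    return current


def iterate_via_formula(x, y, n):
    """Compute A^n x from the identity A^n x = T^n x + sum_{k=0}^{n-1} T^k y."""
    tn = list(x)
    for _ in range(n):
        tn = apply_T(tn)
    partial = [0.0, 0.0]
    tk = list(y)
    for _ in range(n):
        partial = [partial[0] + tk[0], partial[1] + tk[1]]
        tk = apply_T(tk)
    return [tn[0] + partial[0], tn[1] + partial[1]]


def cesaro_average_of_iterates(x, y, N):
    """Return (1/N) * sum_{n=1}^{N} A^n x, the left-hand side of (*)."""
    total = [0.0, 0.0]
    current = list(x)
    for _ in range(N):
        current = apply_A(current, y)
        total = [total[0] + current[0], total[1] + current[1]]
    return [c / N for c in total]
```

In this example the second coordinate of the average differs from its limit $2$ by a term of order $1/N$, so a tolerance of about $10^{-3}$ is appropriate for $N = 10^4$.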
All nonlinear ergodic theorems obtained so far are statements about the behavior of the averages of the iterates $I = T^0, T, T^2, \dots, T^n, \dots$ of a single nonlinear operator $T$ acting on a function. Now it is interesting to ask what happens if we transform a function $f$ in $L_p$ with a random sequence $T_1, T_2, \dots, T_n, \dots$ of operators chosen at random from some stock of nonlinear operators on $L_p$ given in advance. Then our object is to investigate the limit behavior of the Cesàro averages for the function sequence $T_1 f, T_1 T_2 f, \dots, T_1 T_2 \cdots T_n f, \dots$ in various topologies of $L_p$. What can we say about the limit $\lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} T_1 T_2 \cdots T_k f$? Unfortunately we cannot expect convergence for every random sequence chosen from the stock, so it is desirable to consider almost every (not every) random sequence chosen from the stock. On the other hand, Wittmann [17] considered another class of order preserving, integral preserving, positively homogeneous and $L_\infty$-nonexpansive operators on $L_1$ and proved both the pointwise convergence and the mean convergence of the averages of the so-called nonlinear sums for operators belonging to this class (cf. Krengel and Lin [15]). We also introduce the so-called random nonlinear sums of Wittmann's type and consider a similar question in our setting. Particularly in the linear case, the random ergodic theorems have been given satisfactory operator-theoretical formulations (Jacobs [13], Yoshimoto [20,21], Woś [18,19]). Thus it is natural for our consideration to take the affine method to discuss the random nonlinear ergodic theorems as generalizations of the linear random ergodic theorems.

Let $B$ be a Banach space and let $(\Omega, \text{ß}, \mu)$ be a finite measure space. Let $L_p(\Omega, B) = L_p(\Omega, \text{ß}, \mu, B)$, $1 \le p \le \infty$, denote the usual Lebesgue spaces of



strongly ß-measurable $B$-valued functions defined on $\Omega$. In addition we also consider simultaneously another probability measure space $(S, \mathcal{F}, \nu)$ (which will be needed to provide the stock of nonlinear operators on $L_p(\Omega, B)$) on which an $\mathcal{F}$-measurable $\nu$-measure preserving transformation $\varphi$ is given. The idea of applying a non-random ergodic theorem to a product measure space is a recipe which has furnished numerous (linear) random ergodic theorems. It seems to be very difficult to apply this idea to establish the nonlinear random ergodic theorem for general nonlinear operators in $L_p$. Indeed the nonlinear ergodic theorems have not received satisfactory general treatment in $L_p$ so far. So, we take the so-called affine method and consider a strongly $\mathcal{F}$-measurable family $\{T_s + \xi(s, \cdot) : s \in S\}$ of affine operators (which is just the stock of nonlinear operators in question), where $\{T_s : s \in S\}$ is a strongly $\mathcal{F}$-measurable family of linear contractions on $L_1(\Omega, B)$ as well as on $L_\infty(\Omega, B)$ and $\xi$ is a function in $(I - T)L_p(S \times \Omega, B)$ for some $p$ with $1 \le p < \infty$. Here the operator $T$ is defined on $L_p(S \times \Omega, B)$ by $Tf(s, \omega) = [T_s f_{\varphi s}](\omega)$, which denotes the (uniquely determined) $\mathcal{F} \otimes \text{ß}$-measurable version of $T_s f_{\varphi s}(\omega)$. The strong $\mathcal{F}$-measurability of the family $\{T_s : s \in S\}$ means that for every $h \in L_1(\Omega, B)$ the function $T_s h$ is strongly $\mathcal{F}$-measurable as an $L_1(\Omega, B)$-valued function defined on $S$, that is, for the mapping $\Psi_h : s \to T_s h$ of $\mathcal{F}$ into $L_1(\Omega, B)$, $\nu \circ \Psi_h^{-1}$ has a separable support (cf. Hille and Phillips [10]).

2. Nonlinear random ergodic theorems

When given a function $f(s, \omega)$ defined on $S \times \Omega$, for the sake of convenience, we shall write $f_s(\omega)$ for $f(s, \omega)$ in what follows if we want to regard $f(s, \omega)$ as a function of $\omega$ defined on $\Omega$ for $s$ fixed in $S$. Now let there be given a strongly $\mathcal{F}$-measurable family $\{T_s : s \in S\}$ of linear contractions on $L_1(\Omega, B)$ as well as on $L_\infty(\Omega, B)$. Then it follows from the Riesz convexity theorem (cf.
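Dunford and Schwartz [6]) that for each $s \in S$, $\|T_s\|_{B(L_p(\Omega,B))} \le 1$ for $1 \le p \le \infty$, where $B(L_p(\Omega, B))$ denotes the Banach space of bounded linear operators on $L_p(\Omega, B)$, so that $T_s$ is a continuous linear mapping in each space $L_p(\Omega, B)$, $1 \le p < \infty$. If $f \in L_p(S \times \Omega, B)$ then $f_s(\cdot) \in L_p(\Omega, B)$ for almost all $s$ in $S$ and so $T_s f_s(\cdot)$ is well defined for almost all $s$ in $S$. Note here that the function $T_s f_s(\omega)$ may not be strongly $\mathcal{F} \otimes \text{ß}$-measurable as a function of $(s, \omega)$ in $S \times \Omega$. But by Lemmas 1 and 2 of [21] there exists the uniquely determined $\mathcal{F} \otimes \text{ß}$-measurable version $[T_s f_{\varphi s}](\omega)$ of $T_s f_{\varphi s}(\omega)$ such that excepting a $\nu$-null set $E_1(f)$ (for each $s \in S - E_1(f)$), $[T_s f_{\varphi s}](\omega) = T_s f_{\varphi s}(\omega)$ $\mu$-a.e. Therefore we can define the linear operator $T$ on $L_p(S \times \Omega, B)$ by $Tf(s, \omega) = [T_s f_{\varphi s}](\omega)$. It follows that for every $f \in L_p(S \times \Omega, B)$ there exists a set $E_2(f) \in \mathcal{F}$ of $\nu$-measure zero such that for every $s \in S - E_2(f)$ and $k = 1, 2, \dots$, $T_s T_{\varphi s} \cdots T_{\varphi^{k-1} s} f_{\varphi^k s}(\cdot)$ is well defined and strongly $\mathcal{F}$-measurable as a function in $L_p(\Omega, B)$. Our first main result comes next.

Theorem 1. Let $B$ be a reflexive Banach space and let $\{T_s : s \in S\}$ be a strongly $\mathcal{F}$-measurable family of linear contractions on $L_1(\Omega, B)$ as well as on $L_\infty(\Omega, B)$. Put $Tf(s, \omega) = T_s f_{\varphi s}(\omega)$ on $L_p(S \times \Omega, B)$ and define $U_s = T_s + \xi_s$, $s \in S$, where $\xi = (I - T)g$ for some $g \in L_p(S \times \Omega, B)$, $1 \le p < \infty$. Then for every $f \in L_p(S \times \Omega, B)$ there exist a function $\eta \in L_p(S \times \Omega, B)$ and a set $E(f, g) \in \mathcal{F}$ of $\nu$-measure zero such that for each $s \in S - E(f, g)$

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}(\omega) - \eta_s(\omega) \Bigr\|_B = 0 \quad \mu\text{-a.e.} \tag{1} \]

and if $1 < p < \infty$ then

\[ \lim_{n\to\infty} \Bigl( \int_\Omega \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}(\omega) - \eta_s(\omega) \Bigr\|_B^p \, d\mu \Bigr)^{1/p} = 0, \tag{2} \]

while, if $p = 1$ and $f - g \in L(S \times \Omega, B) \log^+ L(S \times \Omega, B)$, then

\[ \lim_{n\to\infty} \int_\Omega \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}(\omega) - \eta_s(\omega) \Bigr\|_B \, d\mu = 0. \tag{3} \]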

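For the proof we need the following lemmas.

Lemma 1. Under the hypothesis of Theorem 1, let $f \in L_p(S \times \Omega, B)$. Then there exists a uniquely determined $\mathcal{F} \otimes \text{ß}$-measurable version $[U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}](\omega)$ of $U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}(\omega)$ for each $k = 1, 2, \dots$, such that excepting a $\nu$-null set,

\[ [U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}](\omega) = U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}(\omega) \quad \mu\text{-a.e.} \]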

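Proof. First let $p = 1$ and $f \in L_1(S \times \Omega, B)$. We show that there exists the $\mathcal{F} \otimes \text{ß}$-measurable version $[T_s f_{\varphi s}](\omega)$ of $T_s f_{\varphi s}(\omega)$. We can choose a sequence of functions $\{f_k\}$ in $L_1(S \times \Omega, B)$ such that each $f_k$ is of the form

\[ f_k(s, \omega) = \sum_j a_{kj} I_{A_{kj}}(s) I_{B_{kj}}(\omega), \qquad T_s (f_k)_s(\omega) = \sum_j a_{kj} I_{A_{kj}}(s)\, T_s I_{B_{kj}}(\omega), \]

where $a_{kj} \in B$, and $\{A_{kj}\}$ and $\{B_{kj}\}$ are measurable partitions of $S$ and $\Omega$, respectively, and such that

\[ \lim_{k\to\infty} \int_{S\times\Omega} \bigl\| f(\varphi s, \omega) - f_k(s, \omega) \bigr\|_B \, d\nu \otimes \mu = 0. \]

So, passing to a subsequence if necessary, one has

\[ \lim_{k\to\infty} \int_\Omega \bigl\| T_s f_{\varphi s}(\omega) - T_s (f_k)_s(\omega) \bigr\|_B \, d\mu = 0 \quad \nu\text{-a.e.} \]

Thus since $T_s (f_k)_s(\cdot)$ is strongly $\mathcal{F}$-measurable, so is $T_s f_{\varphi s}(\cdot)$. We then see that there exist countably $L_1(\Omega, B)$-valued, $\mathcal{F}$-measurable functions $(h_k)_s(\cdot)$ defined on $S$ such that $h_k \in L_1(S \times \Omega, B)$ $(k = 1, 2, \dots)$ and

\[ \bigl\| T_s (f_k)_s(\cdot) - (h_k)_s(\cdot) \bigr\|_{L_1(\Omega,B)} \le \frac{1}{k} \quad \nu\text{-a.e.} \quad (k = 1, 2, \dots). \]

Hence

\[ \lim_{k\to\infty} \bigl\| T_s f_{\varphi s}(\cdot) - (h_k)_s(\cdot) \bigr\|_{L_1(\Omega,B)} = 0 \quad \nu\text{-a.e.}, \]

from which follows that the function $T_s f_{\varphi s}(\cdot)$ is strongly $\mathcal{F}$-measurable. Moreover, we can show that there exists a function $h \in L_1(S \times \Omega, B)$ such that $\lim_{k\to\infty} \|h_k - h\|_{L_1(S\times\Omega,B)} = 0$ and such that excepting a $\nu$-null set

\[ T_s f_{\varphi s}(\omega) = h_s(\omega) \quad \mu\text{-a.e.} \]

Therefore we get the (unique) $\mathcal{F} \otimes \text{ß}$-measurable version $[T_s f_{\varphi s}](\omega)$ of $T_s f_{\varphi s}(\omega)$ by defining $((Tf)(s, \omega) =)\ [T_s f_{\varphi s}](\omega) = h(s, \omega)$. This is also true even for $f \in L_p(S \times \Omega, B)$, $1 < p < \infty$. Similarly, excepting a $\nu$-null set, $T_s T_{\varphi s} \cdots T_{\varphi^{k-1} s} f_{\varphi^k s}(\omega)$ can be defined $\mu$-almost everywhere and is strongly $\mathcal{F}$-measurable. Moreover, excepting a $\nu$-null set

\[ \bigl(T^k f\bigr)_s(\omega) = [T_s T_{\varphi s} \cdots T_{\varphi^{k-1} s} f_{\varphi^k s}](\omega) = T_s T_{\varphi s} \cdots T_{\varphi^{k-1} s} f_{\varphi^k s}(\omega) \quad \mu\text{-a.e.} \]

for each $k = 1, 2, \dots$. Then defining the $\mathcal{F} \otimes \text{ß}$-measurable version $[U_s f_{\varphi s}](\omega)$ of $U_s f_{\varphi s}(\omega)$ by $[U_s f_{\varphi s}](\omega) = [T_s f_{\varphi s}](\omega) + \xi_s(\omega)$, we obtain the desired measurable version by setting

\[ [U_s U_{\varphi s} \cdots U_{\varphi^k s} f_{\varphi^{k+1} s}](\omega) = [T_s T_{\varphi s} \cdots T_{\varphi^k s} f_{\varphi^{k+1} s}](\omega) + \sum_{j=1}^{k} [T_s T_{\varphi s} \cdots T_{\varphi^{j-1} s} \xi_{\varphi^j s}](\omega) + \xi_s(\omega) \quad (k = 1, 2, \dots). \qquad \Box \]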

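Lemma 2. Under the hypothesis of Theorem 1, $\|T\|_{B(L_p(S\times\Omega,B))} \le 1$, $1 \le p < \infty$, and $\|T\|_{B(L_\infty(S\times\Omega,B))} \le 1$.

Proof. If $1 \le p < \infty$ then by Fubini's theorem we get

\[ \|Tf\|_{L_p(S\times\Omega,B)}^p = \int_{S\times\Omega} \|Tf(s, \omega)\|_B^p \, d\nu \otimes \mu = \int_S \bigl\| [T_s f_{\varphi s}](\cdot) \bigr\|_{L_p(\Omega,B)}^p \, d\nu, \]

and

\[ \bigl\| [T_s f_{\varphi s}](\cdot) \bigr\|_{L_p(\Omega,B)} \le \bigl\| f_{\varphi s}(\cdot) \bigr\|_{L_p(\Omega,B)} \quad \nu\text{-a.e.} \]

Hence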

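T. Yoshimoto / Journal of Functional Analysis 256 (2009) 1708–1730

\[ \|Tf\|_{L_p(S\times\Omega,B)}^p \le \int_S \bigl\| f_{\varphi s}(\cdot) \bigr\|_{L_p(\Omega,B)}^p \, d\nu = \int_\Omega \bigl\| f_\omega(\varphi\,\cdot) \bigr\|_{L_p(S,B)}^p \, d\mu = \int_\Omega \bigl\| f_\omega(\cdot) \bigr\|_{L_p(S,B)}^p \, d\mu = \int_{S\times\Omega} \|f(s, \omega)\|_B^p \, d\nu \otimes \mu = \|f\|_{L_p(S\times\Omega,B)}^p. \]

While if $f \in L_1(S \times \Omega, B) \cap L_\infty(S \times \Omega, B)$ we get

\[ \|Tf\|_{L_\infty(S\times\Omega,B)} = \operatorname*{ess\,sup}_{s\in S}\, \operatorname*{ess\,sup}_{\omega\in\Omega} \|Tf(s, \omega)\|_B = \operatorname*{ess\,sup}_{s\in S}\, \operatorname*{ess\,sup}_{\omega\in\Omega} \bigl\| [T_s f_{\varphi s}](\omega) \bigr\|_B \le \operatorname*{ess\,sup}_{s\in S}\, \operatorname*{ess\,sup}_{\omega\in\Omega} \|f_{\varphi s}(\omega)\|_B = \operatorname*{ess\,sup}_{\omega\in\Omega}\, \operatorname*{ess\,sup}_{s\in S} \|f_s(\omega)\|_B = \|f\|_{L_\infty(S\times\Omega,B)}. \]

Hence the lemma follows. $\Box$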

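Lemma 3. Under the hypothesis of Theorem 1, define the affine operator $U$ on $L_p(S \times \Omega, B)$ $(1 \le p < \infty)$ by $Uf(s, \omega) = [U_s f_{\varphi s}](\omega)$ $(f \in L_p(S \times \Omega, B))$. If $1 < p < \infty$ and $f \in L_p(S \times \Omega, B)$, then

\[ \Bigl( \int_{S\times\Omega} \sup_{n\ge 1} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U^k f(s, \omega) \Bigr\|_B^p \, d\nu \otimes \mu \Bigr)^{1/p} \le K_p \bigl( \|f\|_{L_p(S\times\Omega,B)} + \|g\|_{L_p(S\times\Omega,B)} \bigr), \tag{1} \]

while, if $p = 1$, then

\[ \int_{S\times\Omega} \sup_{n\ge 1} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U^k f(s, \omega) \Bigr\|_B \, d\nu \otimes \mu \le \|g\|_{L_1(S\times\Omega,B)} + K \Bigl[ \mu(\Omega) + \int_{S\times\Omega} \|f(s, \omega) - g(s, \omega)\|_B \log^+ \|f(s, \omega) - g(s, \omega)\|_B \, d\nu \otimes \mu \Bigr], \tag{2} \]

where $K_p$ and $K$ are positive constants and $\log^+ u = \log \max(1, u)$ for $u \ge 0$.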



Proof. Since $T$ is linear, $Uf = Tf + (I - T)g = T(f - g) + g$. More precisely, we have by induction

\[ U^k f = T^k (f - g) + g \quad (k = 0, 1, 2, \dots). \]

Hence the inequalities (1) and (2) follow directly from the maximal ergodic inequalities (for the operator $T$) already known even in the vector-valued case (see [22, Theorem 1]; cf. also Dunford and Schwartz [6, VIII, Theorem 8], for complex-valued functions). $\Box$

Proof of Theorem 1. Let $f, g \in L_p(S \times \Omega, B)$. It follows from Lemma 2 and the Riesz convexity theorem that $\|T\|_{B(L_p(S\times\Omega,B))} \le 1$ for $1 \le p \le \infty$. The operator $U$ is clearly nonlinear and nonexpansive in $L_p(S \times \Omega, B)$. Since $B$ is reflexive it follows from Chacon's ergodic theorem [3] applied to the operator $T$ that there exists a function $\eta \in L_p(S \times \Omega, B)$ such that

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U^k f(s, \omega) - \eta(s, \omega) \Bigr\|_B = 0 \quad \nu \otimes \mu\text{-a.e.}, \]

and by Lebesgue's dominated convergence theorem

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U^k f - \eta \Bigr\|_{L_p(S\times\Omega,B)} = 0 \quad (\text{if } 1 < p < \infty). \]

Moreover, if $f - g \in L(S \times \Omega, B) \log^+ L(S \times \Omega, B)$ then by Lemma 3 and Lebesgue's dominated convergence theorem

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U^k f - \eta \Bigr\|_{L_1(S\times\Omega,B)} = 0. \]

Note here that excepting a $\nu$-null set

\[ \bigl(U^k f\bigr)_s(\omega) = [U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}](\omega) = U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} f_{\varphi^k s}(\omega) \quad \mu\text{-a.e.} \]

Hence statement (1) follows from Fubini's theorem. Statements (2) and (3) of Theorem 1 follow from Fubini's theorem and Lebesgue's dominated convergence theorem again, noting that in view of Lemma 3

\[ \int_\Omega \sup_{n\ge 1} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U^k f(\cdot, \omega) \Bigr\|_B^p \, d\mu \in L_1(S) \quad (1 \le p < \infty) \]

in each case of statements (2) and (3). The proof of Theorem 1 has hereby been completed. $\Box$

Remark. Theorem 1 generalizes both the random ergodic theorem of Gladysz [8] and the vector-valued random ergodic theorem of Beck and Schwartz [2].



We shall call the triples $\{(T_s, U_s, \xi_s) : s \in S\}$ and $\{T, U, \xi\}$ used in Theorem 1 (and its proof) a random affine system on $L_p(\Omega, B)$ and an affine system on $L_p(S \times \Omega, B)$, respectively. As will be seen below, these affine systems on $L_p$ spaces offer us full information about the ergodic behaviors of affine operators. The argument used in the proof of Theorem 1 yields

Theorem 2. Let $B$ be a reflexive Banach space and let $\{T_s : s \in S\}$ be a strongly $\mathcal{F}$-measurable family of linear contractions on $L_1(S \times \Omega, B)$ as well as on $L_\infty(S \times \Omega, B)$. Then for every $f \in (I - T)L_p(S \times \Omega, B)$, $1 \le p < \infty$, there exist a function $f^* \in L_p(S \times \Omega, B)$ and a set $E(f) \in \mathcal{F}$ of $\nu$-measure zero such that for every $s \in S - E(f)$

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n}\sum_{j=1}^{k} T_s T_{\varphi s} \cdots T_{\varphi^{j-1} s} f_{\varphi^j s}(\omega) - f^*_s(\omega) \Bigr\|_B = 0 \quad \mu\text{-a.e.}, \]

and if $1 < p < \infty$ then

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n}\sum_{j=1}^{k} T_s T_{\varphi s} \cdots T_{\varphi^{j-1} s} f_{\varphi^j s}(\cdot) - f^*_s(\cdot) \Bigr\|_{L_p(\Omega,B)} = 0. \]

Proof. It suffices to note that if we define $U_s = T_s + f_s$, $s \in S$, then for any $h \in L_p(S \times \Omega, B)$

\[ \sum_{j=1}^{k-1} T_s T_{\varphi s} \cdots T_{\varphi^{j-1} s} f_{\varphi^j s}(\cdot) + f_s(\cdot) = U_s U_{\varphi s} \cdots U_{\varphi^{k-1} s} h_{\varphi^k s}(\cdot) - T_s T_{\varphi s} \cdots T_{\varphi^{k-1} s} h_{\varphi^k s}(\cdot) \quad (k = 2, 3, \dots). \]

Then the assertion of the theorem follows immediately from Theorem 1. $\Box$

In general one can only expect weak convergence of Cesàro $(C, 1)$ processes for nonlinear operators on $L_p$. In fact, the example given by Krengel [14] shows that the pointwise convergence of the $(C, 1)$ averages of nonlinear and nonexpansive operators on $L_p$ may fail to hold and the example given by Krengel and Lin [15] shows that the $(C, 1)$ averages of nonlinear and nonexpansive operators need not converge in the strong topology of $L_p$. As a special case, let $T$ be a positive linear contraction on $L_1(\Omega)$ and suppose that there is a strictly positive function $g \in L_1^+(\Omega)$ with $Tg = g$. Write $Uf = Tf + g$. Then $U^k f = T^k f + kg$, $k = 1, 2, \dots$, for every $f \in L_1(\Omega)$. Thus by Hopf's ergodic theorem [11]

\[ \lim_{n\to\infty} \Bigl( \frac{1}{n}\sum_{k=1}^{n} U^k f - \frac{1}{n}\sum_{k=1}^{n} kg \Bigr) = \lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} T^k f \]

exists almost everywhere on $\Omega$. However, $\frac{1}{n}\sum_{k=1}^{n} U^k f$ tends to $\infty$ (as $n \to \infty$) almost everywhere on $\Omega$.

Theorem 3. Let $B$ be a reflexive Banach space and let $\{T_s : s \in S\}$ be a strongly $\mathcal{F}$-measurable family of linear contractions on $L_1(\Omega, B)$ as well as on $L_\infty(\Omega, B)$. Define $U_s = T_s + \xi_s$, $s \in S$, where $\xi \in (I - T)L_p(S \times \Omega, B)$, $1 \le p < \infty$. Then for every $f \in L_p(S \times \Omega, B)$ there exist a function $\eta \in L_p(S \times \Omega, B)$ and a set $E_0(f, \xi) \in \mathcal{F}$ of $\nu$-measure zero such that for each $s \in S - E_0(f, \xi)$

\[ \lim_{\lambda\to 1+0} \Bigl\| (\lambda - 1)\sum_{n=1}^{\infty} \frac{1}{\lambda^{n+1}} U_s U_{\varphi s} \cdots U_{\varphi^{n-1} s} f_{\varphi^n s}(\omega) - \eta_s(\omega) \Bigr\|_B = 0 \quad \mu\text{-a.e.} \]

and if $1 < p < \infty$, then

\[ \lim_{\lambda\to 1+0} \Bigl( \int_\Omega \Bigl\| (\lambda - 1)\sum_{n=1}^{\infty} \frac{1}{\lambda^{n+1}} U_s U_{\varphi s} \cdots U_{\varphi^{n-1} s} f_{\varphi^n s}(\omega) - \eta_s(\omega) \Bigr\|_B^p \, d\mu \Bigr)^{1/p} = 0. \]

Proof. If we consider the affine system $\{T, U, \xi\}$ determined as the measurable version of the random affine system $\{(T_s, U_s, \xi_s) : s \in S\}$ then we have $U^n f = T^n f - T^n g + g$, $n = 0, 1, 2, \dots$, where $\xi = (I - T)g$ for some $g \in L_p(S \times \Omega, B)$. Then by Chacon's ergodic theorem [3] for $T$, the pointwise and strong $(C, 1)$ ergodic theorems hold for $U$. Thus the pointwise and strong Abelian ergodic theorems for $U$ follow from Hille's abstract Abelian theorem [9]. Hence the assertion of the theorem follows from Fubini's theorem and Lebesgue's dominated convergence theorem. $\Box$

Example 1. We consider the measure space $(\Omega, \text{ß}, \mu)$ given above and let $\phi$ be a ß-measurable, $\mu$-measure preserving transformation of $\Omega$ into itself. Define an affine operator $U$ on $L_p(\Omega)$ $(1 \le p < \infty)$ by $Uf(\omega) = f(\phi\omega) + 1$, $f \in L_p(\Omega)$. Clearly $U$ is nonlinear and nonexpansive on $L_p(\Omega)$. Then

\[ \frac{1}{n}\sum_{k=1}^{n} U^k f = \frac{1}{n}\sum_{k=1}^{n} f \circ \phi^k + \frac{n+1}{2} \quad (n = 1, 2, \dots), \]

and obviously $\frac{1}{n}\sum_{k=1}^{n} U^k f$ tends to $\infty$ (as $n \to \infty$) almost everywhere on $\Omega$ (cf. Krengel [14]).
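The identity in Example 1 can be illustrated on a finite model. In the sketch below $\Omega$ is replaced by $m$ equally weighted points and $\phi$ by the cyclic shift $\omega \mapsto \omega + 1 \pmod m$, which preserves the uniform measure; this discretization, the test function `f` and the helper names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def U(f):
    """(Uf)(w) = f(phi(w)) + 1, with phi(w) = w + 1 (mod m) on m equally weighted points."""
    return np.roll(f, -1) + 1.0

def cesaro_U(f, n):
    """(1/n) * sum_{k=1}^n U^k f, by direct iteration of U."""
    v = np.array(f, dtype=float)
    total = np.zeros_like(v)
    for _ in range(n):
        v = U(v)
        total += v
    return total / n

def cesaro_shift(f, n):
    """(1/n) * sum_{k=1}^n f o phi^k for the cyclic shift phi."""
    f = np.asarray(f, dtype=float)
    return sum(np.roll(f, -k) for k in range(1, n + 1)) / n

m = 7
f = np.sin(2 * np.pi * np.arange(m) / m)   # arbitrary test function on the m points

n = 50
identity_holds = np.allclose(cesaro_U(f, n), cesaro_shift(f, n) + (n + 1) / 2)
print(identity_holds, cesaro_U(f, 50).max(), cesaro_U(f, 200).min())
```

The averages of the shift part stay bounded, so the growth of the printed values reflects the term $(n+1)/2$ alone.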

Example 2. We consider the function space $C[0, 1]$ which consists of functions $f(t)$ continuous for $0 \le t \le 1$, with norm $\|f\| = \max |f(t)|$. Let $V$ be the operator on $C[0, 1]$ defined by $Vf(t) = t \cdot f(t)$, $t \in [0, 1]$, for $f \in C[0, 1]$. Clearly, $\|V^n\| = 1$ for $n = 1, 2, \dots$. Let us consider the bounded linear operator $W$ defined on $X = C[0, 1] \times C[0, 1]$ by

\[ W(f, g) = (Vf, f + g), \quad (f, g) \in X. \]

Then

\[ W^n(f, g) = \Bigl( V^n f, \ \sum_{k=0}^{n-1} V^k f + g \Bigr) \quad (n = 1, 2, \dots). \]

Symbolically,

\[ W = \begin{pmatrix} V & 0 \\ I & I \end{pmatrix} \quad \text{and} \quad W^n = \begin{pmatrix} V^n & 0 \\ \sum_{k=0}^{n-1} V^k & I \end{pmatrix} \quad (n = 1, 2, \dots). \]
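The formula for $W^n$ can be checked on a grid. The sketch below samples functions in $C[0,1]$ at 101 equally spaced points, which is an assumption made purely for illustration, and compares direct iteration of $W$ with the closed form; the helper names are hypothetical.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 101)      # grid on [0, 1]; functions are sampled on it

def V(f):
    """(Vf)(t) = t * f(t)."""
    return t * f

def W(pair):
    """W(f, g) = (Vf, f + g)."""
    f, g = pair
    return (V(f), f + g)

def W_power(pair, n):
    """W^n(f, g) by direct iteration."""
    for _ in range(n):
        pair = W(pair)
    return pair

def W_closed(pair, n):
    """Closed form (V^n f, sum_{k=0}^{n-1} V^k f + g), valid for n >= 1."""
    f, g = pair
    second = sum(t**k * f for k in range(n)) + g
    return (t**n * f, second)

f = np.cos(3 * t)
g = t**2
for n in (1, 2, 5, 20):
    a, b = W_power((f, g), n), W_closed((f, g), n)
    print(n, np.allclose(a[0], b[0]), np.allclose(a[1], b[1]))

# With f = 1 and g = 0 the second component equals n at t = 1,
# so the sup norm of W^n(1, 0) grows linearly in n.
ones, zeros = np.ones_like(t), np.zeros_like(t)
growth = [np.max(np.abs(W_power((ones, zeros), n)[1])) for n in (10, 20, 40)]
print(growth)
```

On this grid the second component of $W^n(1, 0)$ is $\sum_{k=0}^{n-1} t^k$, whose maximum is attained at $t = 1$.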



Note that $\lim_{n\to\infty} \frac{1}{n}\|W^n\|_{B(X)} \ne 0$. Then there exists at least one nontrivial pair $(\xi_0, \eta_0) \in X$ such that $\lim_{n\to\infty} \frac{1}{n}\|W^n(\xi_0, \eta_0)\|_X \ne 0$. We define $U(f, g) = W(f, g) + (I - W)(\xi_0, \eta_0)$ for $(f, g) \in X$. Then $U$ is nonlinear but not nonexpansive and

\[ \frac{1}{n}\sum_{k=1}^{n} U^k(2\xi_0, 2\eta_0) = \frac{1}{n}\sum_{k=1}^{n} W^k(\xi_0, \eta_0) + (\xi_0, \eta_0) \]

does not converge as $n \to \infty$ in the strong topology of $X$ (cf. Krengel and Lin [15]).

We next consider the case that $(\Omega, \text{ß}, \mu)$ is a probability space. If $X$ is a strongly measurable function defined on $\Omega$ with values in $B$, $X$ is called a strongly measurable $B$-valued random variable (defined on $\Omega$). A two-sided sequence $\{X_i\}$, $-\infty < i < \infty$, of strongly measurable $B$-valued random variables is said to be stationary if

\[ \mu\{\omega : X_i(\omega) \in A_i, \ |i| \le n\} = \mu\{\omega : X_{i+1}(\omega) \in A_i, \ |i| \le n\} \]

for each finite collection $\{A_i\}$, $|i| \le n$, of Borel subsets of $B$. We are especially interested in the following theorem which generalizes the strong law of large numbers in Banach spaces.

Theorem 4. Let $B$ be a reflexive Banach space in which a linear operator $T$ is given and satisfies the norm condition $\|T\|_B \le 1$. Let $\{X_i\}$, $-\infty < i < \infty$, be a stationary sequence in $L_1(\Omega, B)$ and define $U = T + \xi$ for some fixed $\xi \in (I - T)L_1(\Omega, B)$. Then there exists a strongly measurable random variable $Y$ in $L_1(\Omega, B)$ such that

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} U^k X_k(\omega) - Y(\omega) \Bigr\|_B = 0 \quad \text{almost surely.} \]

Proof. Let $\xi$ be of the form $\xi = (I - T)Z$ $(Z \in L_1(\Omega, B))$. It follows immediately that

\[ U^n X_n = T^n X_n - T^n Z + Z \quad (n = 1, 2, \dots). \]

For indeed suppose this fact has been established for $n = k$ and note that $U^k \chi = T^k \chi - T^k Z + Z$ for any $\chi \in L_1(\Omega, B)$. Then by the induction hypothesis

\begin{align*}
U^{k+1} X_{k+1} &= U\bigl(U^k X_{k+1}\bigr) = T\bigl(U^k X_{k+1}\bigr) - TZ + Z \\
&= T\bigl(T^k X_{k+1} - T^k Z + Z\bigr) - TZ + Z \\
&= T^{k+1} X_{k+1} - T^{k+1} Z + TZ - TZ + Z \\
&= T^{k+1} X_{k+1} - T^{k+1} Z + Z.
\end{align*}

Thus, applying Beck and Schwartz's theorem (Corollary to Theorem 2 of [2]) and Chacon's (vector-valued) ergodic theorem [3] to the linear operator $T$, we have that there exist random variables $X^*, Z^* \in L_1(\Omega, B)$ such that

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} T^k X_k(\omega) - X^*(\omega) \Bigr\|_B = 0 \quad \text{almost surely,} \]

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n}\sum_{k=1}^{n} T^k Z(\omega) - Z^*(\omega) \Bigr\|_B = 0 \quad \text{almost surely.} \]

Hence the conclusion of the theorem follows directly from this fact with $Y = X^* - Z^* + Z$. $\Box$
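A small simulation may help to see the identity $U^k \chi = T^k \chi - T^k Z + Z$ and the averages of Theorem 4 at work. The sketch below makes strong simplifying assumptions that are not part of the paper: $B = \mathbb{R}^2$, $T$ is a rotation, $Z$ is a constant vector, and the stationary sequence is i.i.d. standard Gaussian. Under these assumptions one would expect $X^* = 0$ and $Z^* = 0$, so that the averages should settle near $Z$.

```python
import numpy as np

theta = 0.7                                    # illustrative angle (assumption)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Z = np.array([1.0, -2.0])
xi = Z - T @ Z                                 # xi = (I - T) Z

def apply_U_power(chi, k):
    """U^k chi for U = T + xi, by direct iteration."""
    v = np.array(chi, dtype=float)
    for _ in range(k):
        v = T @ v + xi
    return v

def closed_form(chi, k):
    """T^k chi - T^k Z + Z, the identity used in the proof."""
    Tk = np.linalg.matrix_power(T, k)
    return Tk @ chi - Tk @ Z + Z

def cesaro_average(X):
    """(1/n) * sum_{k=1}^n U^k X_k, evaluated through the closed form."""
    n = X.shape[0]
    Tk = np.eye(2)
    total = np.zeros(2)
    for k in range(n):
        Tk = T @ Tk                            # T^(k+1)
        total += Tk @ (X[k] - Z) + Z           # U^(k+1) applied to X_(k+1)
    return total / n

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 2))             # i.i.d. sample standing in for X_1, X_2, ...
Y_approx = cesaro_average(X)
print(Y_approx, np.linalg.norm(Y_approx - Z))
```

The closed form is used inside `cesaro_average` only to avoid a quadratic number of matrix products; `apply_U_power` gives the direct iteration for comparison on small $k$.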

The original form of Theorem 4 was first proved by Beck and Schwartz [2] for a linear operator $T$ (with $\|T\|_B \le 1$) on $B$. So, Theorem 4 is a (nonlinear) generalization of Beck and Schwartz's theorem (Corollary to Theorem 2 of [2]).

3. More about the nonlinear random ergodic theorem

To discuss convergence almost everywhere and in the norm of $L_p(S \times \Omega)$, Wittmann [17] introduced the so-called nonlinear sums $S_n f$ (i.e., $S_0 f = f$, $S_{n+1} f = f + R(S_n f)$) for $f \in L_1(S \times \Omega)$, where $R$ is an order preserving, integral preserving, positively homogeneous and $L_\infty$-nonexpansive operator on $L_1(S \times \Omega)$. He showed that $\frac{1}{n+1} S_n f$ is a.e. convergent for any $f \in L_1(S \times \Omega)$ and norm convergent in $L_p(S \times \Omega)$ for any $f \in L_p(S \times \Omega)$, $1 \le p < \infty$. In the sequel we make use of the same random affine system $\{(T_s, U_s, \xi_s) : s \in S\}$ on $L_p(\Omega, B)$ and the affine system $\{T, U, \xi\}$ on $L_p(S \times \Omega, B)$ as used in Section 2 without change of the meaning. For $f \in L_p(S \times \Omega, B)$, $1 \le p < \infty$, we define a new sequence of random functions $\{V_f(n, s) : s \in S\}$ $(n = 0, 1, 2, \dots)$ in $L_p(\Omega, B)$ inductively by

\[ V_f(0, s) = f_s, \quad V_f(1, s) = f_s + U_s V_f(0, \varphi s), \quad V_f(n + 1, s) = f_s + U_s V_f(n, \varphi s). \]

Theorem 5. Let $B$ be a reflexive Banach space and let $\{V_f(n, s) : s \in S\}$ be a random sequence associated with $f \in L_p(S \times \Omega, B)$, $1 \le p < \infty$ (which is defined by the random affine system $\{(T_s, U_s, \xi_s) : s \in S\}$). Then there exists a function $\eta \in L_p(S \times \Omega, B)$ such that except for a set of $\nu$-measure zero,

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n} V_f(n, s)(\omega) - \eta_s(\omega) \Bigr\|_B = 0 \quad \mu\text{-a.e.} \]

and if $1 < p < \infty$ then

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n} V_f(n, s) - \eta_s \Bigr\|_{L_p(\Omega,B)} = 0. \]
Proof. One may assume that ξ = (I − T )g for some g ∈ Lp (S × Ω, B). There is a set N (f ) ∈ F of ν-measure zero such that for every s ∈ S − N (f ), we have by definition



Vf (1, s) = fs + Ts fϕs − Ts gϕs + gs , Vf (2, s) = fs + Ts fϕs + Ts Tϕs fϕ 2 s − Ts Tϕs gϕ 2 s + Ts gϕs + gs , .. . Vf (n, s) = fs +

n

Ts Tϕs · · · Tϕ k−1 s fϕ k s − Ts Tϕs · · · Tϕ n−1 s gϕ n s

k=1

+

n−1

Ts Tϕs · · · Tϕ k−1 s gϕ k s + gs

(n = 2, 3, . . .).

k=1

Moreover, it follows from the ergodic theorem for T that 1

=0 f lim (ω) + g (ω) s s n→∞ n B 1

lim fs (·) + gs (·) =0 n→∞ n Lp (Ω,B) 1 =0 n T T · · · T g (ω) lim n−1 s ϕs ϕ s ϕ s n→∞ n B 1 lim Ts Tϕs · · · Tϕ n−1 s gϕ n s (·) =0 n→∞ n

μ-a.e., (if 1 < p < ∞), μ-a.e., (if 1 < p < ∞).

Lp (Ω,B)

Hence we may apply Theorem 1 to conclude that Theorem 5 holds.

2

Following Wittmann [17], let us define a sequence of functions $\{V_n f\}$ in $L_p(S \times \Omega, B)$ inductively by $V_0 f = f$, $V_{n+1} f = f + U V_n f$, $n = 0, 1, 2, \dots$, using the affine system $\{T, U, \xi\}$. Then $V_f(n, s) = (V_n f)_s$, $n = 0, 1, 2, \dots$.

Theorem 6. Let $B$ be a reflexive Banach space and let $\{V_n f\}$ be a sequence of nonlinear sums for $f \in L_p(S \times \Omega, B)$, $1 \le p < \infty$ (which is defined by the affine system $\{T, U, \xi\}$). Then there exists a function $\eta \in L_p(S \times \Omega, B)$ such that

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n+1} V_n f(s, \omega) - \eta(s, \omega) \Bigr\|_B = 0 \quad \nu \otimes \mu\text{-a.e.} \]

Moreover, if $1 < p < \infty$, then

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n+1} V_n f - \eta \Bigr\|_{L_p(S\times\Omega,B)} = 0, \]

and if $p = 1$ and $f \in L(S \times \Omega, B) \log^+ L(S \times \Omega, B)$, then

\[ \lim_{n\to\infty} \Bigl\| \frac{1}{n+1} V_n f - \eta \Bigr\|_{L_1(S\times\Omega,B)} = 0. \]

Proof. Let $\xi$ be of the form $\xi = (I - T)g$ for some $g \in L_p(S \times \Omega, B)$. By definition we get

\[ V_n f = \sum_{k=0}^{n} T^k f - T^n g + g \quad (n = 0, 1, 2, \dots), \]

and by the ergodic theorem

\[ \lim_{n\to\infty} \Bigl\| \frac{T^n g(s, \omega) - g(s, \omega)}{n+1} \Bigr\|_B = 0 \quad \nu \otimes \mu\text{-a.e.}, \qquad \lim_{n\to\infty} \Bigl\| \frac{T^n g - g}{n+1} \Bigr\|_{L_p(S\times\Omega,B)} = 0 \quad (\text{if } 1 < p < \infty). \]

Finally we may apply the ergodic theorem for $T$ to conclude that Theorem 6 holds. $\Box$
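The closed form for the nonlinear sums used in this proof can be compared with the recursion $V_{n+1} f = f + U V_n f$ in a finite-dimensional stand-in. In the sketch below $T$ is a $3 \times 3$ doubly stochastic matrix (so it contracts both the $\ell_1$ and the $\ell_\infty$ norm), and `f`, `g` are arbitrary vectors; all of these choices and the helper names are assumptions made for illustration only.

```python
import numpy as np

P = np.roll(np.eye(3), 1, axis=0)         # cyclic permutation matrix
T = 0.5 * np.eye(3) + 0.5 * P             # doubly stochastic stand-in for T

def U(v, g):
    """Affine operator U v = T v + (I - T) g."""
    return T @ v + (g - T @ g)

def V_recursive(f, g, n):
    """V_0 f = f, V_{n+1} f = f + U V_n f."""
    v = f.copy()
    for _ in range(n):
        v = f + U(v, g)
    return v

def V_closed(f, g, n):
    """sum_{k=0}^n T^k f - T^n g + g."""
    Tk_f = f.copy()
    total = f.copy()
    for _ in range(n):
        Tk_f = T @ Tk_f
        total += Tk_f
    Tn_g = np.linalg.matrix_power(T, n) @ g
    return total - Tn_g + g

f = np.array([1.0, 2.0, 6.0])
g = np.array([4.0, -1.0, 0.0])

for n in (0, 1, 2, 10):
    print(n, np.allclose(V_recursive(f, g, n), V_closed(f, g, n)))

n = 2000
print(V_recursive(f, g, n) / (n + 1), f.mean())
```

For this particular matrix the Cesàro averages of $T^k f$ approach the constant vector with entries equal to the mean of `f`, so the normalized sums should be close to that value.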

We are also interested in the properties (or behavior) of the ergodic maximal functions for affine operators. For example, Lemma 3 may be regarded as the so-called dominated ergodic theorem for affine operators. The following inequality may be called the maximal ergodic inequality for affine operators. Inequality 1. Let {T , U, ξ } be an affine system on Lp (S × Ω, B) with 1 p < ∞, where ξ = (I − T )g, g ∈ Lp (S × Ω, B) ∩ L∞ (S × Ω, B). Assume T B(L∞ (S×Ω,B)) 1. Then for every function f ∈ Lp (S × Ω, B) and for every positive number λ, we have 1

ν ⊗ μ eV∗ λ + 2g∞ λ

f (s, ω) dν ⊗ μ, B

e(λ)

∗

1 ν ⊗ μ eU λ + 2g∞ λ

f (s, ω) dν ⊗ μ, B

e(λ)

where 1 Vn f (s, ω) > λ , (s, ω): sup n0 n + 1 B

n 1 ∗ k U f (s, ω) > λ , eU (λ) = (s, ω): sup n0 n + 1 k=0 B e(λ) = (s, ω): f (s, ω) B > λ , g∞ = gL∞ (S×Ω,B) . eV∗ (λ) =

Proof. It is sufficient to note that

∗ λ + 2g∞ ⊂ eT∗ (λ), eV∗ λ + 2g∞ ∪ eU

1 f (s, ω) dν ⊗ μ. ν ⊗ μ eT∗ (λ) B λ e(λ)



And apply the maximal ergodic inequality for $T$ to obtain the desired inequalities (cf. Dunford and Schwartz [5, VIII (6), Lemma 7], [21, Lemma 1]). $\Box$

Let $(X, \Xi, \pi)$ be a $\sigma$-finite measure space on which a $\Xi$-measurable $\pi$-measure preserving transformation $\tau$ is given. We consider the Banach space $L_1(X, B) + L_\infty(X, B)$ of all functions $f$ which can be written as $f = g + h$, where $g \in L_1(X, B)$ and $h \in L_\infty(X, B)$, endowed with the norm

\[ \|f\|_{1,\infty} = \inf\bigl\{ \|g\|_{L_1(X,B)} + \|h\|_{L_\infty(X,B)} : f = g + h \bigr\}. \]

The completeness of the norm $\|\cdot\|_{1,\infty}$ follows from the completeness of the norms $\|\cdot\|_{L_1(X,B)}$ and $\|\cdot\|_{L_\infty(X,B)}$. For each real $p \ge 0$, we denote by $M_p(X, B)$ the class of all functions $f$ such that

\[ \int_{\{\|f(x)\|_B > t\}} \frac{\|f(x)\|_B}{t} \Bigl[ \log \frac{\|f(x)\|_B}{t} \Bigr]^p d\pi < \infty \]

for every $t > 0$ and by $L(X, B)[\log^+ L(X, B)]^p$ the class of all functions such that

\[ \int_X \|f(x)\|_B \bigl[ \log^+ \|f(x)\|_B \bigr]^p d\pi < \infty. \]

It is well known (Fava [7] and Yoshimoto [22–24]) that

(i) Each $M_p(X, B)$ is a linear space,
(ii) $L_1(X, B) \subset M_0(X, B) \subset L_1(X, B) + L_\infty(X, B)$,
(iii) $M_p(X, B) \subset L_1(X, B) + L_\infty(X, B)$,
(iv) $M_p(X, B) = L(X, B)[\log^+ L(X, B)]^p$ if and only if $\pi(X) < \infty$,
(v) $M_q(X, B) \subset M_p(X, B)$ if $p < q$,
(vi) $M_p(X, B)$ contains the linear space spanned by $\bigcup_{q>1} L_q(X, B)$.

In particular, when we consider the numerically-valued case, we omit the symbol B in the above notation. Note that the class M0 (X, B) is considerably wider than the spaces Lp (X, B), 1 p < ∞. Theorem 7. Let B be reflexive and let {Tx : x ∈ X} be a strongly Ξ -measurable family of linear operators on B such that Tx B 1 for all x ∈ X. Define an operator T on L1 (X, B) + L∞ (X, B) by (Tf )(x) = Tx (f (τ x)) for f ∈ L1 (X, B) + L∞ (X, B). Let Ux = Tx + ξ(x), x ∈ X, with some fixed ξ ∈ (I − T )M0 (X, B). Then for every f ∈ M0 (X, B)

1 Ux Uτ x · · · Uτ k−1 x f τ k x n→∞ n n

lim

k=1

exists strongly π -almost everywhere on X. Proof. We may assume that ξ is of the form ξ = (I − T )p for some p ∈ M0 (X, B). Note that T L1 (X, B) ⊂ L1 (X, B) and T L∞ (X, B) ⊂ L∞ (X, B) and that ξ = (I − T )p ∈ L1 (X, B) +



L∞ (X, B). Then we can define an operator U on L1 (X, B) + L∞ (X, B) by (Uf )(x) = Ux (f (τ x)) for f ∈ L1 (X, B) + L∞ (X, B). Furthermore, for f ∈ M0 (X, B) we have by induction U nf = T nf − T np + p

(n = 1, 2, . . .).

Thus it follows from Theorem 3 of [22] that for any f ∈ M0 (X, B) 1 k 1 k 1 k lim U f (x) = lim T f (x) − lim T p(x) + p(x) n→∞ n n→∞ n n→∞ n n

n

n

k=1

k=1

k=1

exists strongly for π -almost all x ∈ X. Hence the assertion of the theorem follows directly from the following relation

n

U f (x) = Ux Uτ x · · · Uτ k−1 x f τ k x π-a.e. which is easily checked.

(n = 1, 2, . . .),

2

Now, using the random affine system {(Tx , Ux , ξ(x)): x ∈ X} (ξ = (I − T )p for some p ∈ M0 (X, B)) used in Theorem 7, we define for f ∈ L1 (X, B) + L∞ (X, B) a sequence {Wf (n, x): x ∈ X} (n = 0, 1, 2, . . .), in B inductively by Wf (0, x) = f (x),

Wf (1, x) = f (x) + Ux Wf (0, τ x) ,

Wf (n + 1, x) = f (x) + Ux Wf (n, τ x) . Observe that Wf (n, x) = f (x) +

n

Tx Tτ x · · · Tτ k−1 x f τ k x − Tx Tτ x · · · Tτ n−1 x p τ n x

k=1

+

n−1

Tx Tτ x · · · Tτ k−1 x p τ k x + p(x)

(n = 2, 3, . . .).

k=1

Then by Theorem 3 of [22] and Theorem 7 of [22], we have Theorem 8. Let B be reflexive and f ∈ M0 (X, B). Under the hypotheses of Theorem 7, let {Wf (n, x): x ∈ X} be a random sequence defined by using the random affine system {(Tx , Ux , (I − T )p(x)): x ∈ X} with some p ∈ M0 (X, B). Then lim

n→∞

exists strongly for π -almost all x ∈ X.

1 Wf (n, x) n



Especially in the numerically-valued case, after appealing to Dunford–Schwartz’s noncommuting ergodic theorem [6] in the linear case, we have the following nonlinear random ergodic theorem. (i)

Theorem 9. Let {Ts : s ∈ S}, i = 1, 2, . . . , be strongly F -measurable families of linear con(i) (i) tractions on L1 (Ω) as well as on L∞ (Ω). Define Us = Ts + ((I − Ti )g)s for s ∈ S and some (i) fixed g ∈ Lp (S × Ω) with 1 < p < ∞, where Ti f (s, ω) = [Ts fϕs ](ω), i = 1, 2, . . . , r. Let (i) (i) Ui (ki , s) = Us Uϕs · · · Uϕ ki −1 s for i = 1, 2, . . . , r. Then for every f ∈ Lp (S × Ω) (1 < p < ∞), there exists a ν-null set E(f, g, r) ∈ F such that for every s ∈ S − E(f, g, r) n1 nr

1 ··· U1 (k1 , s)U2 k2 , ϕ k1 s · · · Ur kr , ϕ k1 +···+kr−1 s fϕ k1 +···+kr s (ω) n1 n2 · · · nr k1 =1

kr =1

is convergent (as n1 → ∞, . . . , nr → ∞ independently) almost everywhere on Ω, as well as in the norm of Lp (Ω). The special interest in the above theorem lies in the fact that Ui (ki , ϕ k1 +···+ki−1 s) and Uj (kj , ϕ k1 +···+kj −1 s) need not commute. Inequality 2. Let T1 , T2 , . . . , Tr be positive linear L1 − L∞ contractions on L1 (Ω). Define Ui = Ti + (I − Ti )g (i = 1, 2, . . . , r), for some g ∈ L∞ (Ω) ∩ Mr−1 (Ω). Then for every f ∈ Mr−1 (Ω) and every t > 0 ν ⊗ μ fU∗ (ω) > 4t + 2g∞ C(f, r)

{|f |>t}

|f | r−1 |f | log dμ, t t

where C(f, r) is a positive constant, U = (U1 , U2 , . . . , Ur ) and n1 nr 1 k1 ∗ kr fU (ω) = sup ··· U1 · · · Ur f (ω). n1 ,...,nr 1 n1 · · · nr k1 =1

kr =1

Proof. This inequality follows easily after applying Fava’s weak type estimate [7, Theorem 2] to the maximal function n1 nr 1 fT∗ (ω) = sup ··· T1k1 · · · Trkr f (ω), n1 ,...,nr 1 n1 · · · nr k1 =1

where T = (T1 , T2 , . . . , Tr ) (cf. [24, Theorem 2]).

kr =1

2

It is desirable (and natural) to consider wider classes containing that of affine operators. Neither of the classes of operators considered by Krengel and Lin [15] and Wittmann [17] contains the class of affine operators U considered in this paper. In fact, such an affine operator U is not L∞ -contractive, not integral preserving and not positively homogeneous. When given an affine system {T , U, ξ } on Lp , the ergodic behaviors of U depend essentially on the norm conditions for (the linear operator) T .



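Example 3. Let $\alpha$ be a fixed positive real number. Following Hille [9], for $f \in L_1(0, 1)$, we define

\[ J_\alpha f(t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t - u)^{\alpha-1} f(u) \, du, \quad 0 < t < 1, \qquad T_\alpha f = (I - J_\alpha) f. \]

Then a careful calculation gives

\[ T_\alpha^n f(t) = f(t) - \int_0^t P_n(t - u, \alpha) f(u) \, du, \quad 0 < t < 1 \quad (n = 1, 2, \dots), \]

where

\[ P_n(w, \alpha) = \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} \frac{w^{k\alpha-1}}{\Gamma(k\alpha)}. \]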
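The kernel $P_n(w, \alpha)$ is easy to evaluate directly. The sketch below implements the sum as written and, as a sanity check for $\alpha = 1$, uses the term-by-term antiderivative $\int_0^t w^{k\alpha-1}/\Gamma(k\alpha)\,dw = t^{k\alpha}/\Gamma(k\alpha+1)$ to evaluate $T_1^n$ on the constant function $1$. The function names are hypothetical and only the Python standard library is used.

```python
from math import comb, gamma

def P(n, w, alpha):
    """P_n(w, alpha) = sum_{k=1}^n (-1)^(k-1) * C(n, k) * w^(k*alpha - 1) / Gamma(k*alpha)."""
    return sum((-1) ** (k - 1) * comb(n, k) * w ** (k * alpha - 1) / gamma(k * alpha)
               for k in range(1, n + 1))

def integral_P(n, t, alpha):
    """Integral of P_n(w, alpha) over 0 <= w <= t, integrated term by term."""
    return sum((-1) ** (k - 1) * comb(n, k) * t ** (k * alpha) / gamma(k * alpha + 1)
               for k in range(1, n + 1))

def T_power_on_one(n, t, alpha):
    """T_alpha^n applied to the constant function 1, evaluated at t."""
    return 1.0 - integral_P(n, t, alpha)

# For alpha = 1, J_1 is plain integration, so T_1 1 = 1 - t and T_1^2 1 = 1 - 2t + t^2/2.
t = 0.5
print(P(1, 0.3, 1.0), P(2, 0.3, 1.0))
print(T_power_on_one(1, t, 1.0), 1 - t)
print(T_power_on_one(2, t, 1.0), 1 - 2 * t + t ** 2 / 2)
```

For large $n$ the alternating sum involves large binomial coefficients and loses accuracy in floating point, so this direct evaluation is only meant for small $n$.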

Let α = 1 and define U1 = T1 + (I − T1 )ξ for some ξ ∈ L1 (0, 1). Then for any f ∈ L1 (0, 1) one gets U1n f = T1n f − T1n ξ + ξ (n = 1, 2, . . .). Thus by virtue of Hille’ theorem [9, Theorem 11] it follows that for every h ∈ L1 (0, 1) 1 k T1 h(t) = 0 for almost all t ∈ (0, 1), n→∞ n k=1 n 1 lim T1k h = 0. n→∞ n n

lim

k=1

L1 (0,1)

Hence from these results we obtain 1 k U1 f (t) = ξ(t) for almost all t ∈ (0, 1), n→∞ n k=1 n 1 lim U1k f − ξ = 0. n→∞ n L1 (0,1) n

lim

k=1

Next let α = 2. Then there exist by Theorem 10 of Hille [9] two nontrivial functions ξ0 , ξ00 ∈ L1 (0, 1) such that (λ − 1)R(λ, T2 )ξ0 does not converge (as λ → 1 + 0) almost everywhere on (0, 1) and such that (λ − 1)R(λ, T2 )ξ00 does not converge in the strong topology of L1 (0, 1). Now define U2,0 = T2 + (I − T2 )ξ0 , U2,00 = T2 + (I − T2 )ξ00 .



Theorem 11 of Hille [9] shows that T2 cannot be pointwise and strongly ergodic in L1 (0, 1). In fact, Hille proved the estimate log T2n B(L1 (0,1)) > C · n1/3 , where C is a positive constant. Thus by a simple calculation we find that 1 n 1 1/3 T2 B(L1 (0,1)) lim eC·n = ∞. n→∞ n n→∞ n lim

Therefore 1 k 1 k U2,0 (2ξ0 )(t) = T2 ξ0 (t) + ξ0 (t) n n n

n

k=1

k=1

does not converges as n → ∞ almost everywhere on (0, 1) (cf. Krengel [14]), and 1 k 1 k U2,00 (2ξ00 ) = T2 ξ00 + ξ00 n n n

n

k=1

k=1

does not converge as n → ∞ in the strong topology of L1 (0, 1) (cf. Krengel and Lin [15]). (1) We consider the sequence {Vn f } (f ∈ L1 (0, 1)) determined by the affine system {T1 , U1 , (I − T1 )ξ } (ξ ∈ L1 (0, 1)). By virtue of Hille’s theorem [9, Theorem 11], n T 1

B(L(0,1))

= O n1/4 ,

which immediately gives lim

n→∞

1 T n 1 B(L1 (0,1)) = 0. n

Then for every f ∈ L1 (0, 1) n 1 1 k 1 n Vn(1) f (t) = T1 ξ(t) − ξ(t) T1 f (t) − n+1 n+1 n+1 k=0

is convergent (as n → ∞) almost everywhere on (0, 1), as well as in the strong topology of L1 (0, 1). Next let 0 < α < 1. Hille also proved that Tα is power-bounded in the (operator) topology of B(L1 (0, 1)) and that for 0 < t < 1 t Pn (w, α) dw = 1

lim

n→∞ 0

uniformly with respect to t in any interval (ε, 1). Thus for every h ∈ L1 (0, 1) lim Tαn h(t) = 0 a.e.

n→∞

and

lim Tαn hL

n→∞

1 (0,1)

= 0.

Hence if we consider the sequence $\{V_n^{(\alpha)} f\}$ $(f \in L_1(0, 1))$ determined by the affine system $\{T_\alpha, U_\alpha, (I - T_\alpha)\xi\}$ $(\xi \in L_1(0, 1))$ then for every $f \in L_1(0, 1)$

\[ \frac{1}{n+1} V_n^{(\alpha)} f(t) = \frac{1}{n+1}\sum_{k=0}^{n} T_\alpha^k f(t) - \frac{1}{n+1}\bigl[ T_\alpha^n \xi(t) - \xi(t) \bigr] \]

is convergent (as n → ∞) almost everywhere on (0, 1), as well as in the strong topology of L1 (0, 1). Example 4. Let W and U be the same operators defined on X = C[0, 1] × C[0, 1] as in Example 2. Then it follows that there exists at least one pair (ξ0 , η0 ) ∈ X such nontrivial k (ξ , η ) cannot converge as W that limn→∞ n1 W n (ξ0 , η0 )X = 0. This implies that n1 n−1 0 0 k=0 n → ∞ in the strong topology of X. In addition, the affine operator U is defined by U (f, g) = W (f, g) + (I − W )(ξ0 , η0 ) for (f, g) ∈ X (as in Example 2). For any (f, g) fixed in X we define a new sequence of functions {Pn (f, g)} (n = 0, 1, 2, . . .), in X inductively by P0 (f, g) = (f, g), Pn+1 (f, g) = (f, g) + U Pn (f, g). Then it follows that Pn (f, g) =

n

W k (f, g) − W n (ξ0 , η0 ) + (ξ0 , η0 ) (n = 0, 1, 2, . . .).

k=0

Therefore

\[ \frac{1}{n+1} P_n(\xi_0, \eta_0) = \frac{n}{n+1} \cdot \frac{1}{n} \sum_{k=0}^{n-1} W^k(\xi_0, \eta_0) + \frac{1}{n+1} (\xi_0, \eta_0) \]

cannot be convergent (as $n \to \infty$) in the strong topology of $X$. Next we consider the operators $T_2$, $U_{2,0}$, $U_{2,00}$ and the functions $\xi_0$, $\xi_{00}$ used in Example 3. For any $f$ fixed in $L_1(0, 1)$ we define the sequences of functions $\{Q_n^{(0)} f\}$ and $\{Q_n^{(00)} f\}$ ($n = 0, 1, 2, \ldots$) inductively by $Q_0^{(0)} f = f$, $Q_0^{(00)} f = f$ and

\[ Q_{n+1}^{(0)} f = f + U_{2,0} Q_n^{(0)} f, \qquad Q_{n+1}^{(00)} f = f + U_{2,00} Q_n^{(00)} f. \]

Then we have

\[ Q_n^{(0)} f = \sum_{k=0}^{n} T_2^k f - T_2^n \xi_0 + \xi_0, \qquad Q_n^{(00)} f = \sum_{k=0}^{n} T_2^k f - T_2^n \xi_{00} + \xi_{00}. \]



Therefore, by the results of Example 3 with the aid of Hille's abstract Abelian theorem [9] we see that

\[ \frac{1}{n+1} Q_n^{(0)} \xi_0(t) = \frac{n}{n+1} \cdot \frac{1}{n} \sum_{k=0}^{n-1} T_2^k \xi_0(t) + \frac{1}{n+1} \xi_0(t) \]

cannot be convergent (as $n \to \infty$) almost everywhere on $(0, 1)$ and that

\[ \frac{1}{n+1} Q_n^{(00)} \xi_{00} = \frac{n}{n+1} \cdot \frac{1}{n} \sum_{k=0}^{n-1} T_2^k \xi_{00} + \frac{1}{n+1} \xi_{00} \]

cannot be convergent (as $n \to \infty$) in the strong topology of $L_1(0, 1)$ (cf. Wittmann [17] (open question), p. 251: this is just the question to find examples of operators $T$ (on $L_1$) satisfying the order preserving, integral preserving and $L_\infty$-nonexpansive properties except positive homogeneity, for which there exists a function $f \in L_1$ such that $\frac{1}{n+1} S_n[T]f$ is divergent, where $S_0[T]f = f$, $S_{n+1}[T]f = f + T S_n[T]f$, $n = 0, 1, 2, \ldots$).

Example 5. To illustrate the situation of our consideration in this paper, we take the measure spaces $(\Omega, \mathcal{B}, \mu)$ and $(S, \mathcal{F}, \nu)$ to be $\Omega = S = [0, 1)$, $\mathcal{B} = \mathcal{F}$ = the $\sigma$-field of Borel sets, and $\mu = \nu$ = the Lebesgue measure. Let us consider the product measure spaces

\[ S^* = S_1 \times S_2 \times \cdots, \qquad \mathcal{F}^* = \mathcal{F}_1 \otimes \mathcal{F}_2 \otimes \cdots, \qquad \nu^* = \nu_1 \otimes \nu_2 \otimes \cdots, \]

where $S_i = S$, $\mathcal{F}_i = \mathcal{F}$, $\nu_i = \nu$, $i = 1, 2, \ldots$, and let $\varphi$ be the one-sided shift transformation on $S^*$, that is,

\[ x_n(\varphi s^*) = x_{n+1}(s^*), \qquad s^* \in S^* \quad (n = 1, 2, \ldots), \]

where $x_n(s^*)$ denotes the $n$th coordinate of $s^* \in S^*$. We define an $\mathcal{F} \otimes \mathcal{B}$-measurable family $\{\psi_s : s \in S\}$ of $\mu$-measure preserving transformations on $\Omega$ by

\[ \psi_s \omega = [\omega + \beta s] \quad (= \omega + \beta s, \ \mathrm{mod}\ 1) \]

with a real constant $\beta$. Let $\tau$ be the skew product of $\varphi$ and $\{\psi_s : s \in S\}$:

\[ \tau(s^*, \omega) = (\varphi s^*, \psi_{x_1(s^*)} \omega), \qquad (s^*, \omega) \in S^* \times \Omega, \]

and define, for $h \in L_1(\Omega)$ and $f \in L_1(S^* \times \Omega)$,

\[ (Tf)(s^*, \omega) = f\big(\tau(s^*, \omega)\big), \]
\[ (U_{x_1(s^*)} f_{\varphi s^*})(\omega) = \Psi_{x_1(s^*)} f_{\varphi s^*}(\omega) + (I - T) g_{x_1(s^*)}(\omega), \]
\[ (\Psi_{x_1(s^*)} h)(\omega) = h(\psi_{x_1(s^*)} \omega), \qquad (s^*, \omega) \in S^* \times \Omega. \]



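Then for any $f, g \in L_p(S^* \times \Omega)$ ($1 \le p < \infty$) there exists a $\nu^*$-null set $N(f, g)$ of $S^*$ such that for any $s^* \in S^* - N(f, g)$, there exists a function $\xi_{s^*}(\cdot) \in L_p(\Omega)$ such that

\[ \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} U_{x_1(s^*)} U_{x_1(\varphi s^*)} \cdots U_{x_1(\varphi^{k-1} s^*)} f_{\varphi^k s^*}(\omega) = \xi_{s^*}(\omega) \qquad \mu\text{-a.e.} \]

and

\[ \lim_{n\to\infty} \Big( \int_\Omega \Big| \frac{1}{n} \sum_{k=1}^{n} U_{x_1(s^*)} U_{x_1(\varphi s^*)} \cdots U_{x_1(\varphi^{k-1} s^*)} f_{\varphi^k s^*}(\omega) - \xi_{s^*}(\omega) \Big|^p \, d\mu \Big)^{1/p} = 0. \]

Next, given a function $f \in L_1(S^* \times \Omega)$ we define a sequence of random functions $\{V_f(n, s^*) : s^* \in S^*\}$ ($n = 0, 1, 2, \ldots$) in $L_1(\Omega)$ inductively by

\[ V_f(0, s^*) = f_{s^*}, \quad V_f(1, s^*) = f_{s^*} + U_{x_1(s^*)} V_f(0, \varphi s^*), \quad V_f(n + 1, s^*) = f_{s^*} + U_{x_1(s^*)} V_f(n, \varphi s^*). \]

Then for any $f, g \in L_p(S^* \times \Omega)$ ($1 \le p < \infty$) there exists a $\nu^*$-null set $N(f, g)$ of $S^*$ such that for any $s^* \in S^* - N(f, g)$, there exists a function $\eta_{s^*}(\cdot) \in L_p(\Omega)$ such that

\[ \lim_{n\to\infty} \frac{1}{n} V_f(n, s^*)(\omega) = \eta_{s^*}(\omega) \qquad \mu\text{-a.e.} \]

and

\[ \lim_{n\to\infty} \Big( \int_\Omega \Big| \frac{1}{n} V_f(n, s^*)(\omega) - \eta_{s^*}(\omega) \Big|^p \, d\mu \Big)^{1/p} = 0. \]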

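It seems to be significant to generalize the above $(C, 1)$ results to the case of order $\alpha > 0$ by the $(C, \alpha)$ method. For a real $\alpha > -1$ and each integer $n \ge 0$, let $A_n^\alpha$ denote the $(C, \alpha)$ coefficient of order $\alpha$, which is defined by the generating function

\[ \frac{1}{(1 - \lambda)^{\alpha+1}} = \sum_{n=0}^{\infty} A_n^\alpha \lambda^n, \qquad 0 < \lambda < 1, \]

with $A_0^\alpha = 1$. We also let $A_0^{-1} = 1$ and $A_n^{-1} = 0$ for all $n \ge 1$. Then we have

\[ A_n^\alpha = \sum_{k=0}^{n} A_{n-k}^{\alpha-1}, \qquad A_n^\alpha > 0, \qquad A_n^0 = 1, \qquad A_n^\alpha \sim \frac{n^\alpha}{\Gamma(\alpha + 1)} \quad (n \to \infty). \]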
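The listed properties of the $(C, \alpha)$ coefficients can be checked numerically. The sketch below assumes the standard product formula $A_n^\alpha = \prod_{j=1}^{n} (\alpha + j)/j$, which is what the generating function yields by the binomial series. The function names `cesaro_coeff` and `cesaro_mean` are illustrative and do not come from the paper.

```python
import math
from fractions import Fraction


def cesaro_coeff(alpha, n):
    """(C, alpha) coefficient A_n^alpha = prod_{j=1}^{n} (alpha + j) / j.

    Works with Fraction (exact) or float alpha. For alpha = -1 the factor
    j = 1 vanishes, matching the convention A_0^{-1} = 1, A_n^{-1} = 0 (n >= 1).
    """
    a = 1
    for j in range(1, n + 1):
        a = a * (alpha + j) / j
    return a


def cesaro_mean(alpha, xs):
    """(C, alpha) mean (1 / A_n^alpha) * sum_{k=0}^{n} A_{n-k}^{alpha-1} x_k, n = len(xs) - 1."""
    n = len(xs) - 1
    weighted = sum(cesaro_coeff(alpha - 1, n - k) * x for k, x in enumerate(xs))
    return weighted / cesaro_coeff(alpha, n)


def summation_identity_holds(alpha, n):
    """Check A_n^alpha == sum_{k=0}^{n} A_{n-k}^{alpha-1} exactly (use a Fraction alpha)."""
    return cesaro_coeff(alpha, n) == sum(cesaro_coeff(alpha - 1, n - k)
                                         for k in range(n + 1))


def asymptotic_ratio(alpha, n):
    """A_n^alpha * Gamma(alpha + 1) / n^alpha, which should tend to 1 as n grows."""
    return cesaro_coeff(float(alpha), n) * math.gamma(alpha + 1) / n ** alpha
```

For $\alpha = 1$ the weights $A_{n-k}^{0}$ are all 1 and $A_n^1 = n + 1$, so `cesaro_mean` reduces to the ordinary arithmetic mean, which is the $(C, 1)$ case treated earlier.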



According to Irmisch's theorem [12], for $f, g \in L_p(S^* \times \Omega)$ with $\alpha p > 1$ and $0 < \alpha \le 1$, the Cesàro $(C, \alpha)$ means

\[ C_n^{(\alpha)}[T](f - g) = \frac{1}{A_n^\alpha} \sum_{k=0}^{n} A_{n-k}^{\alpha-1} T^k (f - g) \]

converge $\nu^* \otimes \mu$-almost everywhere on $S^* \times \Omega$, and thus $C_n^{(\alpha)}[U]f = C_n^{(\alpha)}[T](f - g) + g$ converges $\nu^* \otimes \mu$-almost everywhere on $S^* \times \Omega$. Moreover, since

\[ U_{x_1(s^*)} U_{x_1(\varphi s^*)} \cdots U_{x_1(\varphi^{k-1} s^*)} f_{\varphi^k s^*}(\omega) = \Psi_{x_1(s^*)} \Psi_{x_1(\varphi s^*)} \cdots \Psi_{x_1(\varphi^{k-1} s^*)} (f - g)_{\varphi^k s^*}(\omega) + g_{s^*}(\omega) \qquad \nu^* \otimes \mu\text{-a.e.}, \]

there exists by Fubini's theorem a $\nu^*$-null set $N(f, g) \in \mathcal{F}^*$ such that for any $s^* \in S^* - N(f, g)$

\[ \lim_{n\to\infty} \frac{1}{A_n^\alpha} \sum_{k=1}^{n} A_{n-k}^{\alpha-1} \Psi_{x_1(s^*)} \Psi_{x_1(\varphi s^*)} \cdots \Psi_{x_1(\varphi^{k-1} s^*)} (f - g)_{\varphi^k s^*}(\omega) = \xi_{s^*}(\omega) \qquad \mu\text{-a.e.}, \]

\[ \lim_{n\to\infty} \frac{1}{A_n^\alpha} \sum_{k=1}^{n} A_{n-k}^{\alpha-1} U_{x_1(s^*)} U_{x_1(\varphi s^*)} \cdots U_{x_1(\varphi^{k-1} s^*)} f_{\varphi^k s^*}(\omega) = \eta_{s^*}(\omega) \qquad \mu\text{-a.e.} \]

for some functions $\xi, \eta \in L_p(S^* \times \Omega)$, and using Deniel's result [4] (where $0 < \alpha < 1$, $\alpha p > 1$)

\[ \lim_{n\to\infty} \Big\| \frac{1}{A_n^\alpha} \sum_{k=1}^{n} A_{n-k}^{\alpha-1} \Psi_{x_1(s^*)} \cdots \Psi_{x_1(\varphi^{k-1} s^*)} (f - g)_{\varphi^k s^*} - \xi_{s^*} \Big\|_{L_p(\Omega)} = 0, \]

\[ \lim_{n\to\infty} \Big\| \frac{1}{A_n^\alpha} \sum_{k=1}^{n} A_{n-k}^{\alpha-1} U_{x_1(s^*)} \cdots U_{x_1(\varphi^{k-1} s^*)} f_{\varphi^k s^*} - \eta_{s^*} \Big\|_{L_p(\Omega)} = 0. \]

Irmisch in fact proved the a.e. convergence of the Cesàro $(C, \alpha)$ means $C_n^{(\alpha)}[T]f$ for a positive linear contraction $T$ on $L_p$. He also proved that this result is false in general for $\alpha p = 1$ (cf. [4]). Even in the nonlinear case there are always three types of convergence, namely strong convergence, weak convergence, and convergence almost everywhere. As far as the ergodic behavior of Cesàro-type processes for nonexpansive operators on $L_p$ is concerned, one can only expect weak convergence in general. This stems from the fact that there is an essential difference between linearity and nonlinearity of the operators in question. The so-called affine systems considered in this paper are very informative in the light of these facts.

Acknowledgments

The author thanks Professor H. Brezis for his great kindness and the referee for his many helpful comments which improved the presentation of this paper.

References

[1] J.B. Baillon, Comportement asymptotique des itérés de contractions non-linéaires dans les espaces Lp, C. R. Acad. Sci. Paris 286 (1978) 157–159.
[2] A. Beck, J.T. Schwartz, A vector-valued random ergodic theorem, Proc. Amer. Math. Soc. 8 (1957) 1049–1059.
[3] R.V. Chacon, An ergodic theorem for operators satisfying norm conditions, J. Math. Mech. 11 (1962) 165–172.
[4] Y. Deniel, On the a.s. Cesàro-α convergence for stationary or orthogonal random variables, J. Theoret. Probab. 2 (1989) 475–485.
[5] W.G. Dotson Jr., An application of ergodic theory to the solution of linear functional equations in Banach spaces, Bull. Amer. Math. Soc. 75 (1969) 347–352.
[6] N. Dunford, J.T. Schwartz, Linear Operators. I, Interscience, New York, 1957.
[7] N.A. Fava, Weak inequalities for product operators, Studia Math. 42 (1972) 271–288.
[8] S. Gladysz, Ein ergodischer Satz, Studia Math. 15 (1956) 148–157.
[9] E. Hille, Remarks on ergodic theorems, Trans. Amer. Math. Soc. 57 (1945) 246–269.
[10] E. Hille, R.S. Phillips, Functional Analysis and Semigroups, Colloq. Publ. Amer. Math. Soc., 1957.
[11] E. Hopf, On the ergodic theorem for positive linear operators, J. Reine Angew. Math. 205 (1960) 101–106.
[12] R. Irmisch, Punktweise ergodensätze für (C, α)-verfahren, 0 < α < 1, Dissertation, Fachbereich Mathematik, Darmstadt, 1980.
[13] K. Jacobs, Lecture Notes on Ergodic Theory, Aarhus Univ., Mathematisk Inst., 1962–63.
[14] U. Krengel, An example concerning the nonlinear pointwise ergodic theorem, Israel J. Math. 58 (1987) 193–197.
[15] U. Krengel, M. Lin, Order preserving nonexpansive operators in L1, Israel J. Math. 58 (1987) 170–192.
[16] M. Lin, R. Sine, Ergodic theory and the functional equation (I − T)x = y, J. Operator Theory 10 (1983) 153–166.
[17] R. Wittmann, Hopf's ergodic theorem for nonlinear operators, Math. Ann. 289 (1991) 239–253.
[18] J. Woś, Random ergodic theorem for Dunford–Schwartz operators, Bull. Acad. Pol. Sci. 27 (1979) 865–867.
[19] J. Woś, Random ergodic theorems for sub-Markovian operators, Studia Math. 74 (1982) 191–212.
[20] T. Yoshimoto, Random ergodic theorems with weighted averages, Z. Wahr. Verw. Geb. 30 (1974) 149–165.
[21] T. Yoshimoto, Induced contraction semigroups and random ergodic theorems, Dissertationes Math. (Rozprawy Mat.) 139 (1976).
[22] T. Yoshimoto, Vector valued ergodic theorems for operators satisfying norm conditions, Pacific J. Math. 85 (1979) 485–499.
[23] T. Yoshimoto, Pointwise ergodic theorems and function classes $M_p^\alpha$, Studia Math. 72 (1982) 253–271.
[24] T. Yoshimoto, Inequalities for product operators and vector valued ergodic theorems, Studia Math. 73 (1982) 95–114.
[25] T. Yoshimoto, Remarks on nonlinear ergodic theory in Lp, in: Proc. the 4th International Conf. on Nonlinear Anal. and Convex Anal., Okinawa 2005, pp. 677–689.


Function spaces of variable smoothness and integrability

L. Diening a,1, P. Hästö b,∗,2, S. Roudenko c,3

a Section of Applied Mathematics, Eckerstraße 1, Freiburg University, 79104 Freiburg/Breisgau, Germany
b Department of Mathematical Sciences, PO Box 3000, FI-90014 University of Oulu, Finland
c Department of Mathematics and Statistics, Arizona State University, Tempe, AZ 85287-1804, USA

Received 31 October 2007; accepted 20 January 2009

Communicated by N. Kalton

Abstract

In this article we introduce Triebel–Lizorkin spaces with variable smoothness and integrability. Our new scale covers spaces with variable exponent as well as spaces of variable smoothness that have been studied in recent years. Vector-valued maximal inequalities do not work in the generality which we pursue, and an alternate approach is thus developed. Using it we derive molecular and atomic decomposition results and show that our space is well-defined, i.e., independent of the choice of basis functions. As in the classical case, a unified scale of spaces permits clearer results in cases where smoothness and integrability interact, such as Sobolev embedding and trace theorems. As an application of our decomposition we prove an optimal trace theorem in the variable indices case. © 2009 Elsevier Inc. All rights reserved.

Keywords: Triebel–Lizorkin spaces; Variable indices; Variable exponent; Non-standard growth; Decomposition; Molecule; Atom; Trace spaces


E-mail addresses: [email protected] (L. Diening), [email protected] (P. Hästö), [email protected] (S. Roudenko). URLs: http://www.mathematik.uni-freiburg.de/IAM/homepages/diening/ (L. Diening), http://cc.oulu.fi/~phasto/ (P. Hästö), http://math.asu.edu/~svetlana (S. Roudenko). 1 Supported in part by the Landesstiftung Baden-Württemberg. 2 Supported in part by the Academy of Finland, INTAS, and the Emil Aaltonen foundation. 3 Partially supported by the NSF grant DMS-0531337. 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.017


L. Diening et al. / Journal of Functional Analysis 256 (2009) 1731–1768

1. Introduction

From a vast mass of different function spaces a well ordered superstructure appeared in the 1960's and 1970's based on two three-index spaces: the Besov space $B^{\alpha}_{p,q}$ and the Triebel–Lizorkin space $F^{\alpha}_{p,q}$. In recent years there has been a growing interest in generalizing classical spaces such as Lebesgue and Sobolev spaces to the case with either variable integrability (e.g., $W^{1,p(\cdot)}$) or variable smoothness (e.g., $W^{m(\cdot),2}$). These generalized spaces are obviously not covered by the superstructures with fixed indices.

It is well-known from the classical case that smoothness and integrability often interact, for instance, in trace and embedding theorems. However, there has so far been no attempt to treat spaces with variable integrability and smoothness in one scale. In this article we address this issue by introducing Triebel–Lizorkin spaces with variable indices, denoted $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$.

Spaces of variable integrability can be traced back to 1931 and W. Orlicz [41], but the modern development started with the paper [30] of Kováčik and Rákosník in 1991. A survey of the history of the field with a bibliography of more than a hundred titles published up to 2004 can be found in [17] by Diening, Hästö and Nekvinda; further surveys are due to Samko [49] and Mingione [42]. Apart from interesting theoretical considerations, the motivation to study such function spaces comes from applications to fluid dynamics, image processing, PDE and the calculus of variations.

The first concrete application arose from a model of electrorheological fluids in [45] (cf. [1,2,47,48] for mathematical treatments of the model). To give the reader a feeling for the idea behind this application we mention that an electrorheological fluid is a so-called smart material in which the viscosity depends on the external electric field. This dependence is expressed through the variable exponent $p$; specifically, the motion of the fluid is described by a Navier–Stokes-type equation where the Laplacian $\Delta u$ is replaced by the $p(x)$-Laplacian $\operatorname{div}(|\nabla u|^{p(x)-2} \nabla u)$. By standard arguments, this means that the natural energy space of the problem is $W^{1,p(\cdot)}$, the Sobolev space of variable integrability. For further investigations of these differential equations see, e.g. [3,18,19].

More recently, an application to image restoration was proposed by Chen, Levine and Rao [10,40]. Their model combines isotropic and total variation smoothing. In essence, their model requires the minimization over $u$ of the energy

\[ \int_\Omega |\nabla u(x)|^{p(x)} + \lambda |u(x) - I(x)|^2 \, dx, \]

where $I$ is the given input. Recall that in the constant exponent case, the power $p \equiv 2$ corresponds to isotropic smoothing, whereas $p \equiv 1$ gives total variation smoothing. Hence the exponent varies between these two extremes in the variable exponent model. This variational problem has an Euler–Lagrange equation, and the solution can be found by solving a corresponding evolutionary PDE.

Partial differential equations have also been studied from a more abstract and general point of view in the variable exponent setting, see, e.g. [19]. In analogy to the classical case, we can approach boundary value problems through a suitable trace space, which, by definition, is a space consisting of restrictions of functions to the boundary. For the Sobolev space $W^{1,p(\cdot)}$, the trace space was first characterized by the first two authors by an intrinsic norm, see [16]. In analogy with the classical case, this trace space can be formally denoted $F^{1-1/p(\cdot)}_{p(\cdot),p(\cdot)}$, so it is an example of a


space with variable smoothness and integrability, albeit one with a very special relationship between the two indices. Already somewhat earlier Almeida and Samko [4] and Gurka, Harjulehto and Nekvinda [26] had extended variable integrability Sobolev spaces to Bessel potential spaces $W^{\alpha,p(\cdot)}$ for constant but non-integer $\alpha$.⁴

⁴ After the completion of this paper we learned that Xu [57,58] has studied Besov and Triebel–Lizorkin spaces with variable $p$, but fixes $q$ and $\alpha$. The results in two subsections of Section 4 were proved independently in [58]. However, most of the advantages of unification do not occur with only $p$ variable: for instance, trace spaces cannot be covered, and spaces of variable smoothness are not included. Therefore Xu's work does not essentially overlap with the results presented here.

Along a different line of study, Leopold [34–37] and Leopold and Schrohe [38] studied pseudo-differential operators with symbols of the type $\langle \xi \rangle^{m(x)}$, and defined related function spaces of Besov-type with variable smoothness, formally $B^{m(\cdot)}_{p,p}$. In the case $p = 2$, this corresponds to the Sobolev space $H^{m(\cdot)} = W^{m(\cdot),2}$. Function spaces of variable smoothness have recently been studied by Besov [5–8]. He generalized Leopold's work by considering both Triebel–Lizorkin spaces $F^{\alpha(\cdot)}_{p,q}$ and Besov spaces $B^{\alpha(\cdot)}_{p,q}$ in $\mathbb{R}^n$. In a recent preprint, Schneider and Schwab [52] used $H^{m(\cdot)}(\mathbb{R})$ in the analysis of certain Black–Scholes equations. In this application the variable smoothness corresponds to the volatility of the market, which surely should change with time.

The purpose of the present paper is to define and study a generalized scale of Triebel–Lizorkin type spaces with variable smoothness, $\alpha(x)$, and variable primary and secondary indices of integrability, $p(x)$ and $q(x)$. By setting some of the indices to appropriate values we recover all previously mentioned spaces as special cases, except the Besov spaces (which, like in the classical case, form a separate scale). Apart from the value added through unification, our new space allows treating traces and embeddings in a uniform and comprehensive manner, rather than doing them case by case. Some particular examples are:

• The trace space of $W^{k,p(\cdot)}$ is no longer a space of the same type. So, if we were interested in the trace space of the trace space, the theory of [16] no longer applies, and thus, a new theory is needed. In contrast to this, as we show in Section 7, the trace of a Triebel–Lizorkin space is again a Triebel–Lizorkin space (also in the variable indices case), hence, no such problem occurs.
• Our approach allows us to use the so-called "r-trick" (cf. Lemma A.6) to study spaces with integrability in the range $(0, \infty]$, rather than in the range $[1, \infty]$.
• It is well-known that the constant exponent Triebel–Lizorkin space $F^{0}_{p,2}$ corresponds to the Hardy space $h^p$ when $p \in (0, 1]$. Hardy spaces have thus far not been studied in the variable exponent case. Therefore, our formulation opens the door to this line of investigation.

Based on the results of this article, J. Vybíral [56] has proved optimal Sobolev embeddings in the variable index Triebel–Lizorkin spaces, see also [20].

When generalizing Triebel–Lizorkin spaces, we have several obstacles to overcome. The main difficulty is the absence of the vector-valued maximal function inequalities. It turns out that the inequalities are not merely missing; they do not even hold in the variable indices case (see Section 5). As a consequence of this, the Hörmander–Mikhlin multiplier theorem does not apply in the case of variable indices. Our solution is to work in closer connection with the actual



structure of the space with what we call η-functions and to derive suitable estimates directly for these functions.

The structure of the article is as follows: we first briefly recapitulate some standard definitions and results in the next section. In Section 3 we state our main results: atomic and molecular decomposition of Triebel–Lizorkin spaces, a trace theorem, and a multiplier theorem. In Section 4 we show that our new scale is indeed a unification of previous spaces, in that it includes them all as special cases with appropriate choices of the indices. In Section 5 we formulate and prove an appropriate version of the multiplier theorem. In Section 6 we give the proofs of the main decomposition theorems, and in Section 7 we discuss the trace theorem. Finally, in Appendix A we derive several technical lemmas that were used in the other sections.

In the constant indices case, the Triebel–Lizorkin space has been considered also for negative smoothness. Our treatment of the smoothness differs from the classical case, and it is not clear how our approach extends to the case of negative smoothness.

2. Preliminaries

For $x \in \mathbb{R}^n$ and $r > 0$ we denote by $B^n(x, r)$ the open ball in $\mathbb{R}^n$ with center $x$ and radius $r$. By $B^n$ we denote the unit ball $B^n(0, 1)$. We use $c$ as a generic constant, i.e., a constant whose value may change from appearance to appearance. The notation $f \approx g$ means that $\frac{1}{c} g \le f \le c g$ for some suitably independent constant $c$. By $\chi_A$ we denote the characteristic function of the set $A$. If $a \in \mathbb{R}$, then we use the notation $a_+$ for the positive part of $a$, i.e., $a_+ = \max\{0, a\}$. By $\mathbb{N}$ and $\mathbb{N}_0$ we denote the sets of positive and non-negative integers. For $x \in \mathbb{R}$ we denote by $\lfloor x \rfloor$ the largest integer less than or equal to $x$. We denote the mean-value of the integrable function $f$, defined on a set $A$ of finite, non-zero measure, by

\[ \fint_A f(x)\,dx = \frac{1}{|A|} \int_A f(x)\,dx. \]

The Hardy–Littlewood maximal operator $M$ is defined on $L^1_{\mathrm{loc}}(\mathbb{R}^n)$ by

\[ Mf(x) = \sup_{r>0} \fint_{B^n(x,r)} |f(y)|\,dy. \]
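To make the definition of $M$ concrete, here is a minimal discrete analogue on a one-dimensional grid. It rests on assumptions that are not in the paper: the function is given by integer samples on a finite grid, it is extended by zero outside the grid, and balls are replaced by centered index windows of $2r + 1$ points. The name `maximal_function` is illustrative.

```python
from fractions import Fraction


def maximal_function(samples):
    """Centered discrete maximal function of integer samples (zero outside the grid).

    For each index i, take the largest average of |f| over the windows
    {i - r, ..., i + r}, r = 0, 1, ..., each window having 2r + 1 points.
    """
    n = len(samples)
    abs_vals = [abs(v) for v in samples]
    result = []
    for i in range(n):
        best = Fraction(0)
        # Radii up to n - 1 already cover the whole grid from any index;
        # larger radii only add zeros and lower the average.
        for r in range(n):
            lo, hi = max(0, i - r), min(n - 1, i + r)
            window_sum = sum(abs_vals[lo:hi + 1])
            best = max(best, Fraction(window_sum, 2 * r + 1))
        result.append(best)
    return result
```

For a single spike, `maximal_function([0, 0, 4, 0, 0])` decays like the spike height divided by the window length needed to reach it. This mirrors the familiar fact that $Mf$ of a bump decays like $|x|^{-n}$ and is therefore not integrable in general.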

By $\operatorname{supp} f$ we denote the support of the function $f$, i.e., the closure of the set where $f$ is non-zero.

2.1. Spaces of variable integrability

By $\Omega \subset \mathbb{R}^n$ we always denote an open set. By a variable exponent we mean a measurable bounded function $p : \Omega \to (0, \infty)$ which is bounded away from zero. For $A \subset \Omega$ we denote $p_A^+ = \operatorname{ess\,sup}_A p(x)$ and $p_A^- = \operatorname{ess\,inf}_A p(x)$; we abbreviate $p^+ = p_\Omega^+$ and $p^- = p_\Omega^-$. We define the modular of a measurable function $f$ to be

\[ \varrho_{L^{p(\cdot)}(\Omega)}(f) = \int_\Omega |f(x)|^{p(x)}\,dx. \]

The variable exponent Lebesgue space $L^{p(\cdot)}(\Omega)$ consists of all measurable functions $f : \Omega \to \mathbb{C}$ for which $\varrho_{L^{p(\cdot)}(\Omega)}(f) < \infty$. We define the Luxemburg norm on this space by

\[ \|f\|_{L^{p(\cdot)}(\Omega)} = \inf\big\{ \lambda > 0 : \varrho_{L^{p(\cdot)}(\Omega)}(f/\lambda) \le 1 \big\}, \]

which is the Minkowski functional of the absolutely convex set $\{f : \varrho_{L^{p(\cdot)}(\Omega)}(f) \le 1\}$. In the case when $\Omega = \mathbb{R}^n$ we replace the $L^{p(\cdot)}(\mathbb{R}^n)$ in subscripts simply by $p(\cdot)$, e.g., $\|f\|_{p(\cdot)}$ denotes $\|f\|_{L^{p(\cdot)}(\mathbb{R}^n)}$. Note that $L^{p(\cdot)}(\mathbb{R}^n) \subset L^1_{\mathrm{loc}}(\mathbb{R}^n)$. The variable exponent Sobolev space $W^{1,p(\cdot)}(\Omega)$ is the subspace of $L^{p(\cdot)}(\Omega)$ of functions $f$ whose distributional gradient exists and satisfies $|\nabla f| \in L^{p(\cdot)}(\Omega)$. The norm

\[ \|f\|_{W^{1,p(\cdot)}(\Omega)} = \|f\|_{L^{p(\cdot)}(\Omega)} + \|\nabla f\|_{L^{p(\cdot)}(\Omega)} \]

makes $W^{1,p(\cdot)}(\Omega)$ a Banach space.

For fixed exponent spaces we of course have a very simple relationship between the norm and the modular. In the variable exponent case this is not so. Nevertheless, we have the following useful property: $\varrho_{p(\cdot)}(f) \le 1$ if and only if $\|f\|_{p(\cdot)} \le 1$. This and many other basic results were proven in [30].

Definition 2.1. Let $g \in C(\mathbb{R}^n)$. We say that $g$ is locally log-Hölder continuous, abbreviated $g \in C^{\log}_{\mathrm{loc}}(\mathbb{R}^n)$, if there exists $c_{\log} > 0$ such that

\[ |g(x) - g(y)| \le \frac{c_{\log}}{\log(e + 1/|x - y|)} \]

for all $x, y \in \mathbb{R}^n$. We say that $g$ is globally log-Hölder continuous, abbreviated $g \in C^{\log}(\mathbb{R}^n)$, if it is locally log-Hölder continuous and there exists $g_\infty \in \mathbb{R}$ such that

\[ |g(x) - g_\infty| \le \frac{c_{\log}}{\log(e + |x|)} \]

for all $x \in \mathbb{R}^n$.

Note that $g$ is globally log-Hölder continuous if and only if

\[ |g(x) - g(y)| \le \frac{c}{|\log \frac{1}{2} q(x, y)|} \]

for all $x, y \in \mathbb{R}^n$, where $q$ denotes the spherical-chordal metric (the metric inherited from a projection to the Riemann sphere), hence the name, global log-Hölder continuity.

Building on [12] and [13] it is shown in [15, Theorem 3.6] that $M : L^{p(\cdot)}(\mathbb{R}^n) \to L^{p(\cdot)}(\mathbb{R}^n)$ is bounded if $p \in C^{\log}(\mathbb{R}^n)$ and $1 < p^- \le p^+ \le \infty$. Global log-Hölder continuity is the best possible modulus of continuity to imply the boundedness of the maximal operator, see [12,44].
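The Luxemburg norm of Section 2.1 is defined as an infimum rather than by a closed formula, so a small numerical sketch may help. The code below approximates the modular by a midpoint Riemann sum on $(0, 1)$ and finds the norm by bisection, using that the modular of $f/\lambda$ is non-increasing in $\lambda$. The grid, the tolerance, and the names `modular`, `luxemburg_norm` and `midpoint_grid` are assumptions made for illustration only.

```python
def modular(values, exponents, weight):
    """Riemann-sum approximation of the modular: sum |f_i|^{p_i} * weight."""
    return sum(abs(v) ** p for v, p in zip(values, exponents)) * weight


def luxemburg_norm(values, exponents, weight, rel_tol=1e-12):
    """Approximate inf{lam > 0 : modular(f / lam) <= 1} by bisection."""
    if all(v == 0 for v in values):
        return 0.0

    def scaled(lam):
        return modular([v / lam for v in values], exponents, weight)

    hi = 1.0
    while scaled(hi) > 1.0:      # grow hi until f / hi lies in the unit ball
        hi *= 2.0
    lo = 0.0                     # never admissible: the modular blows up as lam -> 0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if scaled(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


def midpoint_grid(n):
    """Midpoints of n equal cells of (0, 1) together with the cell width."""
    return [(i + 0.5) / n for i in range(n)], 1.0 / n
```

With a constant exponent the result should agree with the usual $L^p$ norm, which gives a quick sanity check. With a variable exponent such as $p(x) = 1 + x$, the modular of $f/\|f\|$ should come out close to 1.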



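However, if one moves beyond assumptions based on continuity moduli, it is possible to derive results also under weaker assumptions, see [14,39,43].

2.2. Partitions

Let $\mathcal{D}$ be the collection of dyadic cubes in $\mathbb{R}^n$ and denote by $\mathcal{D}^+$ the subcollection of those dyadic cubes with side-length at most 1. For a cube $Q$ let $\ell(Q)$ denote the side length of $Q$ and $x_Q$ the "lower left corner." Let $\mathcal{D}_\nu = \{Q \in \mathcal{D} : \ell(Q) = 2^{-\nu}\}$. For $c > 0$, we let $cQ$ denote the cube with the same center and orientation as $Q$ but with side length $c\,\ell(Q)$.

The set $\mathcal{S}$ denotes the usual Schwartz space of rapidly decreasing complex-valued functions and $\mathcal{S}'$ denotes the dual space of tempered distributions. We denote the Fourier transform of $\varphi$ by $\hat{\varphi}$ or $\mathcal{F}\varphi$.

Definition 2.2. We say a pair $(\varphi, \Phi)$ is admissible if $\varphi, \Phi \in \mathcal{S}(\mathbb{R}^n)$ satisfy

• $\operatorname{supp} \hat{\varphi} \subseteq \{\xi \in \mathbb{R}^n : \frac{1}{2} \le |\xi| \le 2\}$ and $|\hat{\varphi}(\xi)| \ge c > 0$ when $\frac{3}{5} \le |\xi| \le \frac{5}{3}$,
• $\operatorname{supp} \hat{\Phi} \subseteq \{\xi \in \mathbb{R}^n : |\xi| \le 2\}$ and $|\hat{\Phi}(\xi)| \ge c > 0$ when $|\xi| \le \frac{5}{3}$.

We set $\varphi_\nu(x) = 2^{\nu n} \varphi(2^\nu x)$ for $\nu \in \mathbb{N}$ and $\varphi_0(x) = \Phi(x)$. For $Q \in \mathcal{D}_\nu$ we set

\[ \varphi_Q(x) = \begin{cases} |Q|^{1/2} \varphi_\nu(x - x_Q) & \text{if } \nu \ge 1, \\ |Q|^{1/2} \Phi(x - x_Q) & \text{if } \nu = 0. \end{cases} \]

We define $\psi_\nu$ and $\psi_Q$ analogously. Following [23], given an admissible pair $(\varphi, \Phi)$ we can select another admissible pair $(\psi, \Psi)$ such that

\[ \hat{\tilde{\Phi}}(\xi) \cdot \hat{\Psi}(\xi) + \sum_{\nu \ge 1} \hat{\tilde{\varphi}}(2^{-\nu}\xi) \cdot \hat{\psi}(2^{-\nu}\xi) = 1 \quad \text{for all } \xi. \]

Here, $\tilde{\Phi}(x) = \overline{\Phi(-x)}$ and similarly for $\tilde{\varphi}$.

For each $f \in \mathcal{S}'(\mathbb{R}^n)$ we define the (inhomogeneous) $\varphi$-transform $S_\varphi$ as the map taking $f$ to the sequence $(S_\varphi f)_{Q \in \mathcal{D}^+}$ by setting $(S_\varphi f)_Q = \langle f, \varphi_Q \rangle$. Here, $\langle \cdot, \cdot \rangle$ denotes the usual inner product on $L^2(\mathbb{R}^n; \mathbb{C})$. For later purposes note that $(S_\varphi f)_Q = |Q|^{1/2}\, \tilde{\varphi}_\nu * f(2^{-\nu} k)$ for $\ell(Q) = 2^{-\nu} < 1$ and $(S_\varphi f)_Q = \tilde{\Phi} * f(k)$ for $\ell(Q) = 1$. The inverse (inhomogeneous)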

ϕ-transform Tψ is the map taking a sequence s = {sQ }l(Q)1 to Tψ s = l(Q)=1 sQ ΨQ + l(Q) n. α(·)

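Definition 3.3. Let $\varphi_\nu$, $\nu \in \mathbb{N}_0$, be as in Definition 2.2. The Triebel–Lizorkin space $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}(\mathbb{R}^n)$ is defined to be the space of all distributions $f \in \mathcal{S}'$ with $\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} < \infty$, where

\[ \|f\|_{F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} := \Big\| \big\| 2^{\nu\alpha(x)}\, \varphi_\nu * f(x) \big\|_{\ell^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x}. \]

In the case of $p = q$ we use the notation $F^{\alpha(\cdot)}_{p(\cdot)}(\mathbb{R}^n) := F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}(\mathbb{R}^n)$.

Note that, a priori, the function space depends on the choice of admissible functions $(\varphi, \Phi)$. One of the main purposes of this paper is to show that, up to equivalence of norms, every pair of admissible functions produces the same space.

We next present a formulation of the Triebel–Lizorkin norm which is similar to the classical discrete Triebel–Lizorkin spaces. For a sequence of real numbers $\{s_Q\}_Q$ we define

\[ \big\| \{s_Q\}_Q \big\|_{f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} := \Big\| \Big\| 2^{\nu\alpha(x)} \sum_{Q \in \mathcal{D}_\nu} |s_Q|\, |Q|^{-\frac{1}{2}} \chi_Q \Big\|_{\ell^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x}. \]

The space $f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$ consists of all those sequences $\{s_Q\}_Q$ for which this norm is finite. We are ready to state our first decomposition result, which says that $S_\varphi : F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)} \to f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$ is a bounded operator.

Theorem 3.4. If $p$, $q$ and $\alpha$ are as in the Standing Assumptions, then

\[ \|S_\varphi f\|_{f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} \le c\, \|f\|_{F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}}. \]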

If we have a sequence $\{s_Q\}_Q$, then we can easily construct a candidate Triebel–Lizorkin function by taking the weighted sum with certain basis functions, $\sum s_Q m_Q$. Obviously, certain restrictions are necessary on the functions $m_Q$ in order for this to work. We therefore make the following definitions:

Definition 3.5. Let $\nu \in \mathbb{N}_0$, $Q \in \mathcal{D}_\nu$ and $k \in \mathbb{Z}$, $l \in \mathbb{N}_0$ and $M \ge n$. A function $m_Q$ is said to be a $(k, l, M)$-smooth molecule for $Q$ if it satisfies the following conditions for some $m > M$:

(M1) if $\nu > 0$, then $\int_{\mathbb{R}^n} x^\gamma m_Q(x)\,dx = 0$ for all $|\gamma| \le k$; and
(M2) $|D^\gamma m_Q(x)| \le 2^{|\gamma|\nu} |Q|^{1/2} \eta_{\nu,m}(x + x_Q)$ for all multi-indices $\gamma \in \mathbb{N}_0^n$ with $|\gamma| \le l$.

The conditions (M1) and (M2) are called the moment and decay conditions, respectively.



Note that (M1) is vacuously true if $k < 0$. When $M = n$, this definition is a special case of the definition given in [23] for molecules. The difference is that we consider only $k$ and $l$ integers, and $l$ non-negative. In this case two of the four conditions given in [23] are vacuous.

Definition 3.6. Let $K, L : \mathbb{R}^n \to \mathbb{R}$ and $M > n$. The family $\{m_Q\}_Q$ is said to be a family of $(K, L, M)$-smooth molecules if $m_Q$ is $(\lfloor K_Q^- \rfloor, \lfloor L_Q^- \rfloor, M)$-smooth for every $Q \in \mathcal{D}^+$.

In this definition the notation $K_Q^-$ and $L_Q^-$ is analogous to the notation $p_Q^-$, and refers to the essential infimum of the functions in the cube $Q$.

Definition 3.7. We say that $\{m_Q\}_Q$ is a family of smooth molecules for $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$ if it is a family of $(N + \varepsilon, \alpha + 1 + \varepsilon, M)$-smooth molecules, where

\[ N(x) := \frac{n}{\min\{1, p(x), q(x)\}} - n - \alpha(x), \]

for some constant $\varepsilon > 0$, and $M$ is a sufficiently large constant.

The number $M$ needs to be chosen sufficiently large, for instance

\[ 2\, \frac{n + c_{\log}(\alpha)}{\min\{1, p^-, q^-\}} \]

will do, where $c_{\log}(\alpha)$ denotes the log-Hölder continuity constant of $\alpha$. Since $M$ can be fixed depending on the parameters we will usually omit it from our notation of molecules.

Note that the functions $\varphi_Q$ are smooth molecules for arbitrary indices. Also note that compared to the classical case we assume the existence of 1 more derivative (rounded down) for smooth molecules for $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$. We need the assumption for technical reasons (cf. Lemma 6.3). However, we think the additional assumptions are inconsequential; for instance the trace result (Theorem 3.13), and indeed any result based on atomic decomposition, can still be proven in an optimal form.

Theorem 3.8. Let the functions $p$, $q$, and $\alpha$ be as in the Standing Assumptions. Suppose that $\{m_Q\}_Q$ is a family of smooth molecules for $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$ and that $\{s_Q\}_Q \in f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$. Then

\[ f = \sum_{\nu \ge 0} \sum_{Q \in \mathcal{D}_\nu} s_Q m_Q \]

converges in $\mathcal{S}'$ and

\[ \|f\|_{F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} \le c\, \big\| \{s_Q\}_Q \big\|_{f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}}. \]

Theorems 3.4 and 3.8 yield an isomorphism between $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$ and a subspace of $f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$ via the $S_\varphi$ transform:

Corollary 3.9. If the functions $p$, $q$, and $\alpha$ are as in the Standing Assumptions, then

\[ \|f\|_{F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} \approx \|S_\varphi f\|_{f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} \]

for every $f \in F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}(\mathbb{R}^n)$.


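With these tools we can prove that the space $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}(\mathbb{R}^n)$ is well-defined.

Theorem 3.10. The space $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}(\mathbb{R}^n)$ is well-defined, i.e., the definition does not depend on the choice of the functions $\varphi$ and $\Phi$ satisfying the conditions of Definition 2.2, up to the equivalence of norms.

Proof. Let $\varphi_\nu$ and $\tilde{\varphi}_\nu$ be different basis functions as in Definition 2.2. Let $\|\cdot\|_\varphi$ and $\|\cdot\|_{\tilde{\varphi}}$ denote the corresponding norms of $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$. By symmetry, it suffices to prove $\|f\|_{\tilde{\varphi}} \le c\, \|f\|_\varphi$ for all $f \in \mathcal{S}'$. Let $\|f\|_\varphi < \infty$. By (2.3) we have $f = \sum_{Q \in \mathcal{D}^+} (S_\varphi f)_Q \psi_Q$ with convergence in $\mathcal{S}'$. It follows by Theorem 3.4 that $\|S_\varphi f\|_{f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} \le c\, \|f\|_\varphi$. Since $\{\psi_Q\}_Q$ is a family of smooth molecules, $\|f\|_{\tilde{\varphi}} \le c\, \|S_\varphi f\|_{f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}}$ by Theorem 3.8, which completes the proof. □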

It is often convenient to work with compactly supported basis functions. Thus, we say that the molecule $a_Q$ concentrated on $Q$ is an atom if it satisfies $\operatorname{supp} a_Q \subset 3Q$. Note that this coincides with the definition of atoms in [23] in the case when $p$, $q$ and $\alpha$ are constants. The downside of atoms, as in the constant exponent case, is that we need to choose a new set of them for each function $f$ that we represent. For atomic decomposition we have the following result.

Theorem 3.11. Let the functions $p$, $q$, and $\alpha$ be as in the Standing Assumptions and let $f \in F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$. Then there exists a family of smooth atoms $\{a_Q\}_Q$ and a sequence of coefficients $\{t_Q\}_Q$ such that

\[ f = \sum_{Q \in \mathcal{D}^+} t_Q a_Q \quad \text{in } \mathcal{S}' \qquad \text{and} \qquad \big\| \{t_Q\}_Q \big\|_{f^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}} \approx \|f\|_{F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}}. \]

Moreover, the atoms can be chosen to satisfy conditions (M1) and (M2) in Definition 3.5 for arbitrarily high, given order.

If the maximal operator is bounded and $1 < p^- \le p^+ < \infty$, then it follows easily that $C_0^\infty(\mathbb{R}^n)$ (the space of smooth functions with compact support) is dense in $W^{1,p(\cdot)}(\mathbb{R}^n)$, since it is then possible to use convolution. However, density can be achieved also under more general circumstances, see [21,29,59]. Our standing assumptions are strong enough to give us density directly:

Proposition 3.12. Let the functions $p$, $q$, and $\alpha$ be as in the Standing Assumptions. Then $C_0^\infty(\mathbb{R}^n)$ is dense in $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}(\mathbb{R}^n)$.

Another consequence of our atomic decomposition is the analogue of the standard trace theorem. Since its proof is much more involved, we present it in Section 7. Note that the assumption $\alpha - \frac{1}{p} - (n - 1)\big(\frac{1}{p} - 1\big)_+ > 0$ is optimal also in the constant smoothness and integrability case, cf. [22, Section 5] and [23].



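Theorem 3.13. Let the functions $p$, $q$, and $\alpha$ be as in the Standing Assumptions. If

\[ \alpha - \frac{1}{p} - (n - 1)\Big(\frac{1}{p} - 1\Big)_+ > 0, \]

then $\operatorname{tr} F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}(\mathbb{R}^n) = F^{\alpha(\cdot) - \frac{1}{p(\cdot)}}_{p(\cdot)}(\mathbb{R}^{n-1})$.

4. Special cases

In this section we show how the Triebel–Lizorkin scale $F^{\alpha(\cdot)}_{p(\cdot),q(\cdot)}$ includes as special cases previously studied spaces with variable differentiability or integrability.

4.1. Lebesgue spaces

We begin with the variable exponent Lebesgue spaces from Section 2.1, which were originally introduced by Orlicz in [41]. We show that $F^{0}_{p(\cdot),2} \cong L^{p(\cdot)}$ under suitable assumptions on $p$ using an extrapolation result for $L^{p(\cdot)}$. Recall that a weight $\omega$ is in the Muckenhoupt class $A_1$ if $M\omega \le K\omega$ for some $K > 0$. The smallest such $K$ is the $A_1$ constant of $\omega$.

Lemma 4.1. (See [11, Theorem 1.3].) Let $p \in C^{\log}(\mathbb{R}^n)$ with $1 < p^- \le p^+ < \infty$ and let $\mathcal{G}$ denote a family of tuples $(f, g)$ of measurable functions on $\mathbb{R}^n$. Suppose that there exists a constant $r_0 \in (0, p^-)$ so that

\[ \Big( \int_{\mathbb{R}^n} |f(x)|^{r_0} \omega(x)\,dx \Big)^{\frac{1}{r_0}} \le c_0 \Big( \int_{\mathbb{R}^n} |g(x)|^{r_0} \omega(x)\,dx \Big)^{\frac{1}{r_0}} \]

for all $(f, g) \in \mathcal{G}$ and every weight $\omega \in A_1$, where $c_0$ is independent of $f$ and $g$ and depends on $\omega$ only via its $A_1$-constant. Then

\[ \|f\|_{L^{p(\cdot)}(\mathbb{R}^n)} \le c_1 \|g\|_{L^{p(\cdot)}(\mathbb{R}^n)} \]

for all $(f, g) \in \mathcal{G}$ with $\|f\|_{L^{p(\cdot)}(\mathbb{R}^n)} < \infty$.

Theorem 4.2. Let $p \in C^{\log}(\mathbb{R}^n)$ with $1 < p^- \le p^+ < \infty$. Then $L^{p(\cdot)}(\mathbb{R}^n) \cong F^{0}_{p(\cdot),2}(\mathbb{R}^n)$. In particular,

\[ \|f\|_{L^{p(\cdot)}(\mathbb{R}^n)} \approx \big\| \|\varphi_\nu * f\|_{\ell^2_\nu} \big\|_{L^{p(\cdot)}(\mathbb{R}^n)} \]

for all $f \in L^{p(\cdot)}(\mathbb{R}^n)$.

Proof. Since $C_0^\infty(\mathbb{R}^n)$ is dense in $L^{p(\cdot)}(\mathbb{R}^n)$ (see [30]) and also in $F^{0}_{p(\cdot),2}(\mathbb{R}^n)$ by Proposition 3.12, it suffices to prove the claim for all $f \in C_0^\infty(\mathbb{R}^n)$. Fix $r_0 \in (1, p^-)$. Then

\[ \big\| \|\varphi_\nu * f\|_{\ell^2_\nu} \big\|_{L^{r_0}(\mathbb{R}^n;\omega)} \approx \|f\|_{L^{r_0}(\mathbb{R}^n;\omega)}, \]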



for all ω ∈ A1 by [32, Theorem 1], where the constant depends only on the A1 -constant of the weight ω, so the assumptions of Lemma 4.1 are satisfied. Applying the lemma with G equal to either ϕν ∗ f lν2 , f : f ∈ C0∞ Rn completes the proof.

or

f, ϕν ∗ f lν2 : f ∈ C0∞ Rn

2

0 for constant p ∈ (1, ∞) to the Theorem 4.2 generalizes the equivalence of Lp (Rn ) ∼ = Fp, 2 setting of variable exponent Lebesgue spaces. If p ∈ (0, 1], then the spaces Lp (Rn ) have to be replaced by the Hardy spaces hp (Rn ). This suggests the following definition:

Definition 4.3. Let $p \in C^{\log}(\mathbb{R}^n)$ with $0 < p^- \le p^+ < \infty$. Then we define the variable exponent real Hardy space $h^{p(\cdot)}(\mathbb{R}^n)$ by $h^{p(\cdot)}(\mathbb{R}^n) := F^{0}_{p(\cdot),\,2}(\mathbb{R}^n)$.

The investigation of this space is left for future research.

4.2. Sobolev and Bessel spaces

We move on to Bessel potential spaces with variable integrability, which have been independently introduced by Almeida and Samko [4] and Gurka, Harjulehto and Nekvinda [26]. This scale includes also the variable exponent Sobolev spaces $W^{k,p(\cdot)}$.

In the following let $B^\sigma$ denote the Bessel potential operator $B^\sigma = \mathcal{F}^{-1}(1+|\xi|^2)^{-\frac{\sigma}{2}}\mathcal{F}$ for $\sigma \in \mathbb{R}$. Then the variable exponent Bessel potential space is defined by
$$L^{\alpha,p(\cdot)}(\mathbb{R}^n) := B^\alpha\big(L^{p(\cdot)}(\mathbb{R}^n)\big) = \big\{ B^\alpha g : g \in L^{p(\cdot)}(\mathbb{R}^n) \big\}$$
equipped with the norm $\|g\|_{L^{\alpha,p(\cdot)}} := \|B^{-\alpha} g\|_{p(\cdot)}$. It was shown independently in [4, Corollary 6.2] and [26, Theorem 3.1] that $L^{k,p(\cdot)}(\mathbb{R}^n) \cong W^{k,p(\cdot)}(\mathbb{R}^n)$ for $k \in \mathbb{N}_0$ when $p \in C^{\log}(\mathbb{R}^n)$ with $1 < p^- \le p^+ < \infty$.

We will show that $L^{\alpha,p(\cdot)}(\mathbb{R}^n) \cong F^{\alpha}_{p(\cdot),\,2}(\mathbb{R}^n)$ under suitable assumptions on $p$ for $\alpha \ge 0$ and that $L^{k,p(\cdot)}(\mathbb{R}^n) \cong W^{k,p(\cdot)}(\mathbb{R}^n) \cong F^{k}_{p(\cdot),\,2}(\mathbb{R}^n)$ for $k \in \mathbb{N}_0$. It is clear by the definition of $L^{\alpha,p(\cdot)}(\mathbb{R}^n)$ that $B^\sigma$ with $\sigma \ge 0$ is an isomorphism between $L^{\alpha,p(\cdot)}(\mathbb{R}^n)$ and $L^{\alpha+\sigma,p(\cdot)}(\mathbb{R}^n)$, i.e., the space has a lifting property. Therefore, in view of Theorem 4.2 and $L^{0,p(\cdot)}(\mathbb{R}^n) = L^{p(\cdot)}(\mathbb{R}^n) \cong F^{0}_{p(\cdot),\,2}(\mathbb{R}^n)$, we will complete the circle by proving a lifting property for the scale $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\mathbb{R}^n)$.

Lemma 4.4 (Lifting property). Let $p$, $q$, and $\alpha$ be as in the Standing Assumptions and $\sigma \ge 0$. Then the Bessel potential operator $B^\sigma$ is an isomorphism between $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$ and $F^{\alpha(\cdot)+\sigma}_{p(\cdot),\,q(\cdot)}$.

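Proof. Let $f \in F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$. By (2.3), $f = \sum_{Q\in\mathcal{D}^+} s_Q \varphi_Q$ in $\mathcal{S}'$, and by Corollary 3.9 we have that
$$\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} \approx \|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}.$$
Therefore,
$$B^\sigma f = \sum_{Q\in\mathcal{D}^+} s_Q\, B^\sigma\varphi_Q = \sum_{Q\in\mathcal{D}^+} \underbrace{2^{-\nu\sigma}s_Q}_{=:\,s'_Q}\;\underbrace{2^{\nu\sigma}B^\sigma\varphi_Q}_{=:\,\varphi'_Q}.$$
Let us check that $\{K\varphi'_Q\}_Q$ is a family of smooth molecules of an arbitrary order for a suitable constant $K > 0$. Let $Q \in \mathcal{D}^+$. Without loss of generality we may assume that $x_Q = 0$. Then
$$\widehat{\varphi'_Q}(\xi) = \frac{2^{\nu\sigma}\,\hat\varphi_Q(\xi)}{(1+|\xi|)^\sigma} = \frac{2^{\nu\sigma}\,|Q|^{1/2}\,\hat\varphi(2^{-\nu}\xi)}{(1+|\xi|)^\sigma}.$$
Since $\hat\varphi$ has support in the annulus $B^n(0,2)\setminus B^n(0,1/2)$, it is clear that $\widehat{\varphi'_Q} \equiv 0$ in a neighborhood of the origin when $\ell(Q) < 1$, so the family satisfies the moment condition in Definition 3.5 for an arbitrarily high order.

Next we consider the decay condition for molecules. Let $\mu \in \mathbb{N}_0^n$ be a multi-index with $|\mu| = m$. We estimate
$$\big|D^\mu_\xi \widehat{\varphi'_Q}(\xi)\big| = 2^{\nu\sigma}|Q|^{1/2}\,\Big|D^\mu_\xi\Big[\frac{\hat\varphi(2^{-\nu}\xi)}{(1+|\xi|)^\sigma}\Big]\Big| = |Q|^{1/2}\,2^{-\nu m}\,\Big|D^\mu_\zeta\Big[\frac{\hat\varphi(\zeta)}{(2^{-\nu}+|\zeta|)^\sigma}\Big]\Big| \le c\,|Q|^{1/2}\,2^{-\nu m}\,\big|D^\mu_\zeta\big[\hat\varphi(\zeta)|\zeta|^{-\sigma}\big]\big|,$$
where $\zeta = 2^{-\nu}\xi$ and we used that the support of $\hat\varphi$ lies in the annulus $B^n(0,2)\setminus B^n(0,1/2)$ for the last estimate. Define
$$K_m = \sup_{|\mu| = m,\ \zeta\in\mathbb{R}^n} 2^{-\nu m}\,\big|D^\mu_\zeta\big[\hat\varphi(\zeta)|\zeta|^{-\sigma}\big]\big|.$$
Since $\sigma \ge 0$ and $\hat\varphi$ vanishes in a neighborhood of the origin, we conclude that $K_m < \infty$ for every $m$. From the estimate
$$|x^\mu \psi(x)| = c\,\Big|\int_{\mathbb{R}^n} (-1)^m D^\mu_\xi\hat\psi(\xi)\,e^{i x\cdot\xi}\,d\xi\Big| \le c\,|\operatorname{supp}\hat\psi|\,\sup_\xi \big|D^\mu_\xi\hat\psi(\xi)\big|,$$
we conclude that
$$|x|^m\,|\varphi'_Q(x)| \le c\,2^{\nu n}|Q|^{1/2}\,2^{-\nu m}K_m \quad\text{and}\quad |\varphi'_Q(x)| \le c\,2^{\nu n}|Q|^{1/2}K_0.$$
Multiplying the former of the two inequalities by $2^{\nu m}$ and adding it to the latter gives
$$\big(1 + 2^{\nu m}|x|^m\big)\,|\varphi'_Q(x)| \le c\,2^{\nu n}|Q|^{1/2}(K_0 + K_m).$$
Finally, this implies that
$$|\varphi'_Q(x)| \le c\,\frac{2^{\nu n}|Q|^{1/2}(K_0+K_m)}{(1+2^\nu|x|)^m} = c\,|Q|^{1/2}(K_0+K_m)\,\eta_{\nu,m}(x),$$
from which we conclude that the family $\{K\varphi'_Q\}_Q$ satisfies the decay condition when $K \le \big(|Q|^{1/2}(K_0+K_m)\big)^{-1}$. A similar argument yields the decay condition for $D^\mu_x\varphi'_Q$.

Since $\{K\varphi'_Q\}_Q$ is a family of smooth molecules for $F^{\alpha(\cdot)+\sigma}_{p(\cdot),\,q(\cdot)}$, we can apply Theorem 3.8 to conclude that
$$\|B^\sigma f\|_{F^{\alpha(\cdot)+\sigma}_{p(\cdot),\,q(\cdot)}} \le c\,\|\{s'_Q/K\}_Q\|_{f^{\alpha(\cdot)+\sigma}_{p(\cdot),\,q(\cdot)}} \le c\,\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} \approx \|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}.$$
The reverse inequality is handled similarly. □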

Theorem 4.5. Let $p \in C^{\log}(\mathbb{R}^n)$ with $1 < p^- \le p^+ < \infty$ and $\alpha \in [0,\infty)$. Then $F^{\alpha}_{p(\cdot),\,2}(\mathbb{R}^n) \cong L^{\alpha,p(\cdot)}(\mathbb{R}^n)$. If $k \in \mathbb{N}_0$, then $F^{k}_{p(\cdot),\,2}(\mathbb{R}^n) \cong W^{k,p(\cdot)}(\mathbb{R}^n)$.

Proof. Suppose that $f \in F^{\alpha}_{p(\cdot),\,2}(\mathbb{R}^n)$. By Lemma 4.4, $B^{-\alpha}f \in F^{0}_{p(\cdot),\,2}(\mathbb{R}^n)$, thus we conclude by Theorem 4.2 that $B^{-\alpha}f \in L^{p(\cdot)}(\mathbb{R}^n) = L^{0,p(\cdot)}(\mathbb{R}^n)$. Then it follows by the definition of the Bessel space that $f = B^{\alpha}[B^{-\alpha}f] \in L^{\alpha,p(\cdot)}(\mathbb{R}^n)$. The reverse inclusion follows by reversing these steps.

The claim regarding the Sobolev spaces follows from this and the equivalence $L^{k,p(\cdot)}(\mathbb{R}^n) \cong W^{k,p(\cdot)}(\mathbb{R}^n)$ for $k \in \mathbb{N}_0$ (see [4, Corollary 6.2] or [26, Theorem 3.1]). □

4.3. Spaces of variable smoothness Finally, we come to spaces of variable smoothness as introduced by Besov [5], following log Leopold [33]. Let p, q ∈ (1, ∞) and let α ∈ Cloc (Rn ) ∩ L∞ (Rn ) with α 0. Then Besov defines the following spaces of variable smoothness n p α(·),Besov Fp, R := f ∈ Lloc Rn : f F α(·),Besov q p, q να(x) M −k 2 h, f (x) dh f F α(·),Besov := 2 p, q

q

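n there exists $c = c(m,n) > 0$ such that
$$\eta_{\nu,m} * |g|(x) \le c \sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,M_Q g$$
for all $\nu \ge 0$, $g \in L^1_{\mathrm{loc}}$, and $x \in \mathbb{R}^n$.

Proof. Fix $\nu \ge 0$, $g \in L^1_{\mathrm{loc}}$, and $x, y \in \mathbb{R}^n$. If $|x-y| \le 2^{-\nu}$, then we choose $Q \in \mathcal{D}_\nu$ containing $y$. If $|x-y| > 2^{-\nu}$, then we choose $j \in \mathbb{N}_0$ such that $2^{j-\nu} \le |x-y| \le 2^{j-\nu+1}$ and let $Q \in \mathcal{D}_{\nu-j}$ be the cube containing $y$. Note that, in either case, $x \in 3Q$. Thus we conclude that
$$2^{\nu n}\big(1 + 2^\nu|x-y|\big)^{-m} \le c\,2^{-j(m-n)}\,\chi_{3Q}(x)\,\frac{\chi_Q(y)}{|Q|}.$$
Next we multiply this inequality by $|g(y)|$ and integrate with respect to $y$ over $\mathbb{R}^n$. This gives
$$\eta_{\nu,m} * |g|(x) \le c \sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,M_Q g,$$
which is the claim. □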
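The pointwise kernel bound in the proof above can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, it only tests the size of the kernel (not the geometric condition $x \in 3Q$), and it assumes the constant $c = 1$, which suffices for this reduced inequality because $|Q| = 2^{-(\nu-j)n}$ for $Q \in \mathcal{D}_{\nu-j}$.

```python
import random

def eta(nu, m, t, n):
    """eta_{nu,m} at a point with |x| = t: 2^(n*nu) * (1 + 2^nu * t)^(-m)."""
    return 2.0 ** (n * nu) * (1.0 + 2.0 ** nu * t) ** (-m)

def scale_index(nu, t):
    """Index j from the proof: j = 0 for small t, else 2^(j-nu) <= t <= 2^(j-nu+1)."""
    j = 0
    while 2.0 ** (j - nu + 1) < t:
        j += 1
    return j

def kernel_bound(nu, m, t, n):
    """Right-hand side 2^(-j(m-n)) / |Q| with |Q| = 2^(-(nu-j)n) (constant c = 1 assumed)."""
    j = scale_index(nu, t)
    cube_volume = 2.0 ** (-(nu - j) * n)
    return 2.0 ** (-j * (m - n)) / cube_volume

def check_kernel_bound(trials=5000, n=2, m=4, seed=0):
    """Randomly test eta_{nu,m}(t) <= kernel_bound for m > n."""
    rng = random.Random(seed)
    for _ in range(trials):
        nu = rng.randint(0, 12)
        t = 2.0 ** rng.uniform(-15.0, 6.0)  # distances across many dyadic scales
        if eta(nu, m, t, n) > kernel_bound(nu, m, t, n) * (1 + 1e-12):
            return False
    return True
```

Calling `check_kernel_bound()` samples distances $|x-y|$ over many scales; any failure would indicate that the exponent $-j(m-n)$ or the normalisation by $|Q|$ had been mis-stated.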



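For the proof of Lemma 5.4 we need the following result on the maximal operator. It follows from Lemma 3.3 and Corollary 3.4 of [15], since $p^+ < \infty$ in our case.

Lemma 5.3. Let $p \in C^{\log}(\mathbb{R}^n)$ with $1 \le p^- \le p^+ < \infty$. Then there exists $h \in \text{weak-}L^1(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$ such that
$$Mf(x)^{p(x)} \le c\,M\big(|f(\cdot)|^{p(\cdot)}\big)(x) + \min\{|Q|,1\}\,h(x)$$
for all $f \in L^{p(\cdot)}(\mathbb{R}^n)$ with $\|f\|_{L^{p(\cdot)}(\mathbb{R}^n)} \le 1$.

We are now ready for a preliminary version of Theorem 3.2, containing an additional condition.

Lemma 5.4. Let $p, q \in C^{\log}(\mathbb{R}^n)$, $1 < p^- \le p^+ < \infty$, $1 < q^- \le q^+ < \infty$, and $(p/q)^- \cdot q^- > 1$. Then there exists $m > n$ such that
$$\big\| \|\eta_{\nu,m} * f_\nu\|_{l^{q(x)}_\nu} \big\|_{L^{p(\cdot)}_x} \le c\,\big\| \|f_\nu\|_{l^{q(x)}_\nu} \big\|_{L^{p(\cdot)}_x}$$
for every sequence $\{f_\nu\}_{\nu\in\mathbb{N}_0}$ of $L^1_{\mathrm{loc}}$-functions.

Proof. By homogeneity, it suffices to consider the case
$$\big\| \|f_\nu\|_{l^{q(x)}_\nu} \big\|_{L^{p(\cdot)}_x} \le 1.$$
Then, in particular,
$$\int_{\mathbb{R}^n} |f_\nu(x)|^{p(x)}\,dx \le 1 \qquad (5.5)$$
for every $\nu \ge 1$. Using Lemma 5.2 and Jensen's inequality (i.e., the embedding in weighted discrete Lebesgue spaces), we estimate
$$\begin{aligned}
\int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0} |\eta_{\nu,m} * f_\nu(x)|^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx
&\le \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0} \Big[c \sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,M_Q f_\nu\Big]^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx \\
&\le c \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)} \Big[\sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,M_Q f_\nu\Big]^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx \\
&\le c \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,(M_Q f_\nu)^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx.
\end{aligned}$$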



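For the last inequality we used the fact that the innermost sum contains only a finite, uniformly bounded number of non-zero terms. It follows from (5.5) and $p(x) \ge \frac{q(x)}{q^-}$ that $\|f_\nu\|_{L^{q(\cdot)/q^-}} \le c$. Thus, by Lemma 5.3,
$$(M_Q f_\nu)^{\frac{q(x)}{q^-}} \le c\,M_Q\big(|f_\nu|^{\frac{q}{q^-}}\big) + c\,\min\{|Q|,1\}\,h(x)$$
for all $Q \in \mathcal{D}_{\nu-j}$ and $x \in Q$. Combining this with the estimates above, we get
$$\begin{aligned}
\int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0} |\eta_{\nu,m} * f_\nu(x)|^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx
&\le c \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,\Big[M_Q\big(|f_\nu|^{\frac{q}{q^-}}\big)\Big]^{q^-}\Big)^{\frac{p(x)}{q(x)}} dx \\
&\quad + c \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,\big[\min\{|Q|,1\}\,h(x)\big]^{q^-}\Big)^{\frac{p(x)}{q(x)}} dx \\
&=: (I) + (II).
\end{aligned}$$
Now we easily estimate that
$$\begin{aligned}
(I) &\le c \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0} \Big[M\big(|f_\nu|^{\frac{q}{q^-}}\big)(x)\Big]^{q^-} \sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\Big)^{\frac{p(x)}{q(x)}} dx \\
&\le c \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0} \Big[M\big(|f_\nu|^{\frac{q}{q^-}}\big)(x)\Big]^{q^-}\Big)^{\frac{p(x)}{q(x)}} dx \\
&= c \int_{\mathbb{R}^n} \Big\| M\big(|f_\nu|^{\frac{q}{q^-}}\big)(x) \Big\|_{l^{q^-}_\nu}^{\frac{p(x)}{q(x)}\,q^-} dx.
\end{aligned}$$
The vector valued maximal inequality, Lemma 5.1, with $(p/q)^- \cdot q^- > 1$ and $q^- > 1$, implies that the last expression is bounded since
$$\int_{\mathbb{R}^n} \Big\| |f_\nu(x)|^{\frac{q(x)}{q^-}} \Big\|_{l^{q^-}_\nu}^{\frac{p(x)}{q(x)}\,q^-} dx = \int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0} |f_\nu(x)|^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx \le 1.$$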


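For the estimation of (II) we first note the inequality
$$\begin{aligned}
\sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,\min\{|Q|,1\}^{q^-}
&\le c \sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)}\,\min\big\{2^{n(j-\nu)q^-},\,1\big\} \\
&\le c \sum_{j\ge 0} 2^{-j(m-n)} \Big(j + \sum_{\nu > j} 2^{n(j-\nu)q^-}\Big) \\
&\le c \sum_{j\ge 0} 2^{-j(m-n)}\,(j+1) \le c.
\end{aligned}$$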
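The convergence of this double sum can be illustrated numerically. The sketch below is an illustration only, with hypothetical function names: it evaluates a truncation of $\sum_{\nu,j\ge 0} 2^{-j(m-n)}\min\{2^{n(j-\nu)q^-},1\}$ and compares it with the closed form obtained by summing, for each $j$, the $j+1$ terms with $\nu \le j$ and the geometric tail over $\nu > j$. The parameter values are arbitrary choices satisfying $m > n$.

```python
def double_sum(n, m, q_minus, terms=80):
    """Truncated sum over nu, j >= 0 of 2^(-j(m-n)) * min(2^(n(j-nu)q_minus), 1)."""
    gap = m - n
    total = 0.0
    for j in range(terms):
        for nu in range(terms):
            total += 2.0 ** (-j * gap) * min(2.0 ** (n * (j - nu) * q_minus), 1.0)
    return total

def closed_form(n, m, q_minus):
    """Exact value of the untruncated sum: sum_j a^j * (j + 1 + tail), a = 2^(-(m-n))."""
    a = 2.0 ** (-(m - n))
    tail = 1.0 / (2.0 ** (n * q_minus) - 1.0)  # sum over nu > j of 2^(n(j-nu)q_minus)
    return 1.0 / (1.0 - a) ** 2 + tail / (1.0 - a)
```

The truncated sum never exceeds the closed form, and the two agree to many digits, which reflects the geometric decay in $j$ that the estimate relies on.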

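We then estimate (II) as follows:
$$\begin{aligned}
(II) &\le c \int_{\mathbb{R}^n} \Big(h(x)^{q^-} \sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,\min\{|Q|^{q^-},1\}\Big)^{\frac{p(x)}{q(x)}} dx \\
&\le c \int_{\mathbb{R}^n} h(x)^{\frac{p(x)}{q(x)}\,q^-}\,dx.
\end{aligned}$$
Since $(p/q)^- q^- > 1$ and $h \in \text{weak-}L^1 \cap L^\infty$, the last expression is bounded. □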

Using a partitioning trick, it is possible to remove the strange condition $(p/q)^- \cdot q^- > 1$ from the previous lemma and prove our main result regarding multipliers:

Proof of Theorem 3.2. Because of the uniform continuity of $p$ and $q$, we can choose a finite cover $\{\Omega_i\}$ of $\mathbb{R}^n$ with the following properties:

(1) each $\Omega_i \subset \mathbb{R}^n$, $1 \le i \le k$, is open;
(2) the sets $\Omega_i$ cover $\mathbb{R}^n$, i.e., $\bigcup_i \Omega_i = \mathbb{R}^n$;
(3) non-contiguous sets are separated in the sense that $d(\Omega_i,\Omega_j) > 0$ if $|i-j| > 1$; and
(4) we have $(p/q)^-_{A_i}\, q^-_{A_i} > 1$ for $1 \le i \le k$, where $A_i := \bigcup_{j=i-1}^{i+1} \Omega_j$ (with the understanding that $\Omega_0 = \Omega_{k+1} = \emptyset$).

Let us choose an integer $l$ so that $2^l \le \min_{|i-j|>1} 3d(\Omega_i,\Omega_j) < 2^{l+1}$. Since there are only finitely many indices, the third condition implies that such an $l$ exists.

Next we split the problem and work with the domains $\Omega_i$. In each of these we argue as in the previous lemma to conclude that
$$\begin{aligned}
\int_{\mathbb{R}^n} \Big(\sum_{\nu\ge 0} |\eta_{\nu,m} * f_\nu(x)|^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx
&\le \sum_{i=1}^{k} \int_{\Omega_i} \Big(\sum_{\nu\ge 0} |\eta_{\nu,m} * f_\nu(x)|^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx \\
&\le c \sum_{i=1}^{k} \int_{\Omega_i} \Big(\sum_{\nu\ge 0}\sum_{j\ge 0} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,(M_Q f_\nu)^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx.
\end{aligned}$$



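From this we get
$$\begin{aligned}
\int_{\Omega_i} \Big(\sum_{\nu\ge 0} |\eta_{\nu,m} * f_\nu(x)|^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx
&\le c \int_{\Omega_i} \Big(\sum_{\nu\ge 0}\sum_{j=0}^{\nu+l} 2^{-j(m-n)} \sum_{Q\in\mathcal{D}_{\nu-j}} \chi_{3Q}(x)\,(M_Q f_\nu)^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx \\
&\quad + c \int_{\Omega_i} \Big(\sum_{\nu\ge 0}\sum_{j\ge \nu+l} 2^{-j(m-n)}\,\big(Mf_\nu(x)\big)^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx.
\end{aligned}$$
The first integral on the right-hand side is handled as in the previous proof. This is possible, since the cubes in this integral are always in $A_i$ and $(p/q)^-_{A_i}\,q^-_{A_i} > 1$. So it remains only to bound
$$\int_{\Omega_i} \Big(\sum_{\nu\ge 0}\sum_{j\ge \nu+l} 2^{-j(m-n)}\,\big(Mf_\nu(x)\big)^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx \le c \int_{\Omega_i} \Big(\sum_{\nu\ge 0} 2^{-(m-n)\nu}\,\big(Mf_\nu(x)\big)^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx.$$
For a non-negative sequence $(x_i)$ we have
$$\Big(\sum_{i\ge 0} 2^{-i(m-n)} x_i\Big)^{r} \le \begin{cases} c(r) \sum_{i\ge 0} 2^{-i(m-n)}\,x_i^{\,r}, & \text{if } r \ge 1, \\[4pt] \sum_{i\ge 0} 2^{-i(m-n)r}\,x_i^{\,r}, & \text{if } r \le 1. \end{cases}$$
We apply this estimate for $r = \frac{p(x)}{q(x)}$ and conclude that
$$\int_{\Omega_i} \Big(\sum_{\nu\ge 0} 2^{-(m-n)\nu}\,\big(Mf_\nu(x)\big)^{q(x)}\Big)^{\frac{p(x)}{q(x)}} dx \le c \sum_{\nu\ge 0} 2^{-(m-n)\nu\,\min\{1,(p/q)^-\}} \int_{\Omega_i} Mf_\nu(x)^{p(x)}\,dx.$$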
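The two-case inequality for non-negative sequences used above can be tested on random data. This is a sketch under stated assumptions: the helper names are hypothetical, and for $r \ge 1$ it uses the explicit choice $c(r) = W^{r-1}$ with $W = \sum_{i\ge 0} 2^{-i(m-n)}$, which follows from Jensen's inequality for the weights $2^{-i(m-n)}$; the source only asserts that some constant $c(r)$ exists.

```python
import random

def weighted_power_lhs(xs, r, gap):
    """(sum_i 2^(-i*gap) * x_i)^r for a finite non-negative sequence xs, gap = m - n > 0."""
    return sum(2.0 ** (-i * gap) * x for i, x in enumerate(xs)) ** r

def weighted_power_rhs(xs, r, gap):
    """Right-hand side of the two-case inequality (c(r) = W^(r-1) is an assumed choice)."""
    if r >= 1:
        total_weight = 1.0 / (1.0 - 2.0 ** (-gap))  # W = sum_i 2^(-i*gap)
        c_r = total_weight ** (r - 1)
        return c_r * sum(2.0 ** (-i * gap) * x ** r for i, x in enumerate(xs))
    return sum(2.0 ** (-i * gap * r) * x ** r for i, x in enumerate(xs))

def check_sequence_inequality(trials=2000, seed=0):
    """Random test of lhs <= rhs for exponents r on both sides of 1."""
    rng = random.Random(seed)
    for _ in range(trials):
        gap = rng.uniform(0.2, 3.0)
        r = rng.uniform(0.1, 4.0)
        xs = [rng.uniform(0.0, 10.0) for _ in range(rng.randint(1, 30))]
        if weighted_power_lhs(xs, r, gap) > weighted_power_rhs(xs, r, gap) * (1 + 1e-9):
            return False
    return True
```

For $r \le 1$ no constant is needed, which is why the exponent of the geometric factor changes from $m-n$ to $(m-n)r$ in that case.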

The boundedness of the maximal operator implies that the integral may be estimated by a constant, since $\int |f_\nu(x)|^{p(x)}\,dx \le 1$. We are left with a geometric sum, which certainly converges. □

6. Proofs of the decomposition results

We can often take care of the variable smoothness simply by treating it as a constant in a cube, which is what the next lemma is for.

Lemma 6.1. Let $\alpha$ be as in the Standing Assumptions. There exists $d \in (n,\infty)$ such that if $m > d$, then
$$2^{\nu\alpha(x)}\,\eta_{\nu,2m}(x-y) \le c\,2^{\nu\alpha(y)}\,\eta_{\nu,m}(x-y)$$
for all $x, y \in \mathbb{R}^n$.



Proof. Choose $k \in \mathbb{N}_0$ as small as possible subject to the condition that $|x-y| \le 2^{-\nu+k}$. Then $1 + 2^\nu|x-y| \approx 2^k$. We estimate that
$$\frac{\eta_{\nu,2m}(x-y)}{\eta_{\nu,m}(x-y)} \le c\,\big(1+2^k\big)^{-m} \le c\,2^{-km}.$$
On the other hand, the log-Hölder continuity of $\alpha$ implies that
$$2^{\nu(\alpha(x)-\alpha(y))} \ge 2^{-\nu c_{\log}/\log(e+1/|x-y|)} \ge 2^{-k c_{\log}}\,|x-y|^{-c_{\log}/\log(e+1/|x-y|)} \ge c\,2^{-k c_{\log}}.$$
The claim follows from these estimates provided we choose $m \ge c_{\log}$. □

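Proof of Theorem 3.4. Let $f \in F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$. By (2.3) we have the representation
$$f = \sum_{Q\in\mathcal{D}^+} \langle \varphi_Q, f\rangle\,\psi_Q = \sum_{Q\in\mathcal{D}^+} |Q|^{\frac12}\,\varphi_\nu * f(x_Q)\,\psi_Q$$
with convergence in $\mathcal{S}'$. Let $r \in (0, \min\{p^-, q^-\})$ and let $m$ be so large that Lemma 6.1 applies. The functions $\varphi_\nu * f$ fulfill the requirements of Lemma A.6, so
$$\begin{aligned}
\|S_\varphi f\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}
&= \Big\| \Big\| 2^{\nu\alpha(x)} \sum_{Q\in\mathcal{D}_\nu} \varphi_\nu * f(x_Q)\,\chi_Q \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x} \\
&\le c\,\Big\| \Big\| 2^{\nu\alpha(x)} \big(\eta_{\nu,2m} * |\varphi_\nu * f|^r\big)^{\frac1r} \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x} \\
&= c\,\Big\| \Big\| 2^{\nu\alpha(x)r}\,\eta_{\nu,2m} * |\varphi_\nu * f|^r \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x}.
\end{aligned}$$
By Lemma 6.1 and Theorem 3.2, we further conclude that
$$\begin{aligned}
\|S_\varphi f\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}
&\le c\,\Big\| \Big\| \eta_{\nu,m} * \big(2^{\nu\alpha(\cdot)}|\varphi_\nu * f|\big)^r \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x} \\
&\le c\,\Big\| \Big\| 2^{\nu\alpha(x)r}\,|\varphi_\nu * f|^r \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x} \\
&= c\,\Big\| \big\| 2^{\nu\alpha(x)}\,\varphi_\nu * f \big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x}.
\end{aligned}$$
This proves the theorem. □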

We will next prove the following version of Theorem 3.8, wherein it is assumed a priori that the sum of molecules converges in $\mathcal{S}'$. This is then used to prove the theorem itself.

Proposition 6.2. Let the functions $p$, $q$, and $\alpha$ be as in the Standing Assumptions. Suppose that $\{m_Q\}_Q$ is a family of smooth molecules for $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$, that $\{s_Q\}_Q \in f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$ and that $f = \sum_{\nu\ge 0}\sum_{Q\in\mathcal{D}_\nu} s_Q m_Q$ converges in $\mathcal{S}'$. Then
$$\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} \le c\,\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}.$$



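In order to prove Proposition 6.2 we need to split our domain into several parts. The following lemma will be applied to each part. For the statement we need Triebel–Lizorkin spaces defined in domains of $\mathbb{R}^n$. These are achieved simply by replacing $L^{p(\cdot)}(\mathbb{R}^n)$ by $L^{p(\cdot)}(\Omega)$ in the definitions of $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$ and $f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$:
$$\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)} := \Big\| \big\| 2^{\nu\alpha(x)}\,\varphi_\nu * f(x) \big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x(\Omega)}$$
and
$$\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)} := \Big\| \Big\| 2^{\nu\alpha(x)} \sum_{Q\in\mathcal{D}_\nu} |s_Q|\,|Q|^{-\frac12}\,\chi_Q \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x(\Omega)}.$$
Since we only need these spaces in an auxiliary result, we still assume that $f \in \mathcal{S}'(\mathbb{R}^n)$ even when we consider the norm only over $\Omega$.

Lemma 6.3. Let $p$, $q$, and $\alpha$ be as in the Standing Assumptions and define functions $J = n/\min\{1,p,q\}$ and $N = J - n - \alpha$. Let $\Omega$ be a cube or the complement of a finite collection of cubes and suppose that $\{m_Q\}_Q$, $Q \subset \Omega$, is a family of $(J^+ - n - \alpha^- + \varepsilon,\ \alpha^+ + 1 + \varepsilon)$-smooth molecules, for some $\varepsilon > 0$. Suppose that $f = \sum_{\nu\ge 0}\sum_{Q\in\mathcal{D}_\nu} s_Q m_Q$ converges in $\mathcal{S}'$, where $s_Q = 0$ when $Q \not\subset \Omega$. Then
$$\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)} \le c\,\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)},$$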

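where $c > 0$ is independent of $\{s_Q\}_Q$ and $\{m_Q\}_Q$.

Proof. Let $2m$ be sufficiently large, i.e., larger than $M$ (from the definition of molecules). Choose $r \in (0, \min\{1, p^-, q^-\})$, $\varepsilon > 0$, $k_1 \ge \alpha^+ + 2\varepsilon$ and $k_2 \ge \frac{n}{r} - n - \alpha^- + 2\varepsilon$ such that $\{m_Q\}$ are $(k_2, k_1+1, 2m)$-smooth molecules. Define $k(\nu,\mu) := k_1(\nu-\mu)_+ + k_2(\mu-\nu)_+$ and $\tilde s_{Q_\mu} := s_{Q_\mu}|Q_\mu|^{-1/2}$. Next we apply Lemma A.5 twice: with $g = \varphi_\nu$, $h(x) = m_{Q_\mu}(x - x_{Q_\mu})$ and $k = k_2 + 1$ if $\mu \ge \nu$, and $g(x) = m_{Q_\mu}(x - x_{Q_\mu})$, $h = \varphi_\nu$ and $k = k_1 + 1$ otherwise. This and Lemma A.2 give
$$|\varphi_\nu * m_{Q_\mu}(x)| \le c\,2^{-k(\nu,\mu)}\,|Q_\mu|^{1/2}\,\eta_{\nu,2m} * \eta_{\mu,2m}(x + x_{Q_\mu}) \approx c\,2^{-k(\nu,\mu)}\,|Q_\mu|^{-1/2}\,(\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu})(x).$$
Thus, we have
$$\begin{aligned}
\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)}
&= \Big\| \Big\| 2^{\nu\alpha(x)} \sum_{\mu\ge 0} \sum_{\substack{Q_\mu\in\mathcal{D}_\mu \\ Q_\mu\subset\Omega}} |s_{Q_\mu}|\,|\varphi_\nu * m_{Q_\mu}| \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x(\Omega)} \\
&\le c\,\Big\| \Big\| \sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|\,2^{\nu\alpha(x)-k(\nu,\mu)}\,\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu} \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x(\Omega)} \\
&= c\,\Big\| \Big\| \Big(\sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|\,2^{\nu\alpha(x)-k(\nu,\mu)}\,\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu}\Big)^{r} \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x(\Omega)}.
\end{aligned}$$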



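Next we use the embedding $l^r \hookrightarrow l^1$ and obtain the estimate on the term inside the two norms above as follows:
$$\Big(\sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|\,2^{\nu\alpha(x)-k(\nu,\mu)}\,\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu}\Big)^{r} \le \sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,2^{\nu\alpha(x)r-k(\nu,\mu)r}\,(\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu})^r.$$
By Lemma A.4 we conclude that
$$\begin{aligned}
2^{\nu\alpha(x)r-k(\nu,\mu)r}\,(\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu})^r
&\le c\,2^{\nu\alpha(x)r - k_1 r(\nu-\mu)_+ - k_2 r(\mu-\nu)_+ + n(1-r)(\nu-\mu)_+}\,\eta_{\nu,2mr} * \eta_{\mu,2mr} * \chi_{Q_\mu} \\
&\le c\,2^{\mu\alpha(x)r - 2\varepsilon|\nu-\mu|}\,\eta_{\nu,2mr} * \eta_{\mu,2mr} * \chi_{Q_\mu},
\end{aligned} \qquad (6.4)$$
where, in the second step, we used the assumptions on $k_1$ and $k_2$. We use this with our previous estimate to get
$$\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)} \le c\,\Big\| \Big\| \sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,2^{\mu\alpha(x)r-2\varepsilon|\nu-\mu|r}\,\eta_{\nu,2mr} * \eta_{\mu,2mr} * \chi_{Q_\mu} \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x(\Omega)}.$$
We apply Lemma 6.1 and Theorem 3.2 to conclude that
$$\begin{aligned}
\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)}
&\le c\,\Big\| \Big\| \eta_{\nu,mr} * \Big[\sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,2^{\mu\alpha(\cdot)r-2\varepsilon|\nu-\mu|r}\,\eta_{\mu,2mr} * \chi_{Q_\mu}\Big] \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x(\Omega)} \\
&\le c\,\Big\| \Big\| \sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,2^{\mu\alpha(x)r-2\varepsilon|\nu-\mu|r}\,\eta_{\mu,2mr} * \chi_{Q_\mu} \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x(\Omega)}.
\end{aligned}$$
We estimate the inner part (which depends on $x$) pointwise as follows:
$$\begin{aligned}
\Big\| \sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,2^{\mu\alpha(x)r-2\varepsilon|\nu-\mu|r}\,\eta_{\mu,2mr} * \chi_{Q_\mu} \Big\|^{\frac{q(x)}{r}}_{l^{q(x)/r}_\nu}
&= \sum_{\nu\ge 0} \Big(\sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,2^{\mu\alpha(x)r-2\varepsilon|\nu-\mu|r}\,\eta_{\mu,2mr} * \chi_{Q_\mu}\Big)^{\frac{q(x)}{r}} \\
&\le c \sum_{\nu\ge 0}\sum_{\mu\ge 0} 2^{-\varepsilon|\nu-\mu|r} \Big(2^{\mu\alpha(x)r} \sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,\eta_{\mu,2mr} * \chi_{Q_\mu}\Big)^{\frac{q(x)}{r}},
\end{aligned}$$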



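where, for the inequality, we used Hölder's inequality in the space with geometrically decaying weight, as in the proof of Theorem 3.2. Now the only part which depends on $\nu$ is a geometric sum, which we estimate by a constant. Next we change the power $\alpha(x)$ to $\alpha(y)$ by Lemma 6.1:
$$\begin{aligned}
\Big\| \sum_{\mu\ge 0}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,2^{\mu\alpha(x)r-2\varepsilon|\nu-\mu|r}\,\eta_{\mu,2mr} * \chi_{Q_\mu} \Big\|^{\frac{q(x)}{r}}_{l^{q(x)/r}_\nu}
&\le c \sum_{\mu\ge 0} \Big(2^{\mu\alpha(x)r} \sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,\eta_{\mu,2mr} * \chi_{Q_\mu}\Big)^{\frac{q(x)}{r}} \\
&\le c\,\Big\| \eta_{\mu,m} * \Big(2^{\mu\alpha(\cdot)r} \sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|^r\,\chi_{Q_\mu}\Big) \Big\|^{\frac{q(x)}{r}}_{l^{q(x)/r}_\mu}.
\end{aligned}$$
Hence, we have shown that
$$\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)} \le c\,\Big\| \Big\| \eta_{\mu,m} * \Big(2^{\mu\alpha(\cdot)r} \sum_{Q\in\mathcal{D}_\mu} |\tilde s_Q|^r\,\chi_Q\Big) \Big\|_{l^{q(x)/r}_\mu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x(\Omega)}.$$
Therefore, by Theorem 3.2, we conclude that
$$\begin{aligned}
\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)}
&\le c\,\Big\| \Big\| 2^{\mu\alpha(x)r} \sum_{Q\in\mathcal{D}_\mu} |\tilde s_Q|^r\,\chi_Q \Big\|_{l^{q(x)/r}_\mu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x(\Omega)} \\
&= c\,\Big\| \Big\| 2^{\mu\alpha(x)} \sum_{Q\in\mathcal{D}_\mu} |s_Q|\,|Q|^{-\frac12}\,\chi_Q \Big\|_{l^{q(x)}_\mu} \Big\|_{L^{p(\cdot)}_x(\Omega)} = c\,\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}(\Omega)},
\end{aligned}$$
where we used that the sum consists of a single non-zero term. □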

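Proof of Proposition 6.2. We will reduce the claim to the previous lemma. By assumption there exists $\varepsilon > 0$ such that the molecules $m_Q$ are $(N + 4\varepsilon,\ \alpha + 1 + 3\varepsilon)$-smooth. By the uniform continuity of $p$, $q$ and $\alpha$, we may choose $\mu_0 \ge 0$ such that $N^-_Q > J^+_Q - \alpha^-_Q - n - \varepsilon$ and $\alpha^-_Q > \alpha^+_Q - \varepsilon$ for every dyadic cube $Q$ of level $\mu_0$. Note that if $Q_0$ is a dyadic cube of level $\mu_0$ and $Q \subset Q_0$ is another dyadic cube, then
$$N^-_Q \ge N^-_{Q_0} > J^+_{Q_0} - \alpha^-_{Q_0} - n - \varepsilon \ge J^+_Q - \alpha^-_Q - n - \varepsilon,$$
similarly for $\alpha$. Thus we conclude that $m_Q$ is $(J^+_Q - \alpha^-_Q - n + 3\varepsilon,\ \alpha^+_Q + 1 + 2\varepsilon)$-smooth when $Q$ is of level at most $\mu_0$.

Since $p$, $q$ and $\alpha$ have a limit at infinity, we conclude that $N^-_{\mathbb{R}^n\setminus K} > J^+_{\mathbb{R}^n\setminus K} - \alpha^-_{\mathbb{R}^n\setminus K} - n - \varepsilon$ and $\alpha^-_{\mathbb{R}^n\setminus K} > \alpha^+_{\mathbb{R}^n\setminus K} - \varepsilon$ for some compact set $K \subset \mathbb{R}^n$. We denote by $\Omega_i$, $i = 1,\dots,M$, those dyadic cubes of level $\mu_0$ which intersect $K$, and define $\Omega_0 = \mathbb{R}^n \setminus \bigcup_{i=1}^{M} \Omega_i$.

For every integer $i \in [0,M]$ choose $r_i \in (0, \min\{1, p^-_{\Omega_i}, q^-_{\Omega_i}\})$ such that $\frac{n}{r_i} < J^+_{\Omega_i} + \varepsilon$, and set $k_i := \frac{n}{r_i} - n - \alpha^-_{\Omega_i} + 2\varepsilon$ and $K_i = \alpha^+_{\Omega_i} + 2\varepsilon$. Then $m_Q$ is a $(k_i, K_i + 1)$-smooth molecule when $Q$ is of level at most $\mu_0$. Define $k_i(\nu,\mu) := K_i(\nu-\mu)_+ + k_i(\mu-\nu)_+$ and $\tilde s_{Q_\mu} := s_{Q_\mu}|Q_\mu|^{-1/2}$. Finally, let $r \in (0, \min\{1, p^-, q^-\})$.

Note that the constants $k_i$ and $K_i$ have been chosen so that in each set $\Omega_i$ we may argue as in the previous lemma. Thus we get
$$|\varphi_\nu * m_{Q_\mu}(x)| \le c\,2^{-k(\nu,\mu)}\,|Q_\mu|^{-1/2}\,(\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu})(x)$$
for $x \in \Omega_i$ and $Q_\mu$ with sidelength at most $2^{\mu_0}$. With $k$ as in Lemma 6.3, we conclude from this that
$$\begin{aligned}
\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} &= \Big\| \big\| 2^{\nu\alpha}\,\varphi_\nu * f \big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x} \\
&\le c\,\Big\| \Big\| \sum_{\mu=0}^{\mu_0-1}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|\,2^{\nu\alpha(x)-k(\nu,\mu)}\,\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu} \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x} \\
&\quad + c \sum_{i=0}^{M} \Big\| \Big\| \sum_{\mu > \mu_0-1}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|\,2^{\nu\alpha(x)-k(\nu,\mu)}\,\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu} \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x(\Omega_i)}.
\end{aligned}$$
By the previous lemma, each term in the last sum is dominated by $\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}$, so we conclude that
$$\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} \le c\,\Big\| \Big\| \sum_{\mu=0}^{\mu_0-1}\sum_{Q_\mu\in\mathcal{D}_\mu} |\tilde s_{Q_\mu}|\,2^{\nu\alpha(x)-k(\nu,\mu)}\,\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu} \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x} + c\,(M+1)\,\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}.$$
It remains only to take care of the first term on the right-hand side. An analysis of the proof of the previous lemma shows that the only part where the assumption on the smoothness of the molecules was needed was in estimate (6.4). In the current case we get instead
$$2^{\nu\alpha(x)r - rk(\nu,\mu)}\,(\eta_{\nu,2m} * \eta_{\mu,2m} * \chi_{Q_\mu})^r \le c\,2^{\mu\alpha(x)r - 2\varepsilon|\nu-\mu| + n(1-r)_+(\mu-\nu)_+}\,\eta_{\nu,2mr} * \eta_{\mu,2mr} * \chi_{Q_\mu},$$
since we have no control of $k_2$. However, since $\mu \le \mu_0$ and $\nu \ge 0$, the extra term satisfies $2^{n(1-r)_+(\mu-\nu)_+} \le 2^{n(1-r)_+\mu_0}$, so it is just a constant. After this modification the rest of the proof of Lemma 6.3 takes care of the first term. □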

Proof of Theorem 3.8. Suppose that $\{s_Q\} \in f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$. In order to prove Theorem 3.8 we only need to show that $\sum_\nu\sum_Q s_Q m_Q$ converges in $\mathcal{S}'$, since then we can apply Proposition 6.2.

Take $\xi \in \mathcal{S}$. We have to show that $\sum_\nu\sum_Q s_Q\langle m_Q, \xi\rangle$ converges. First of all we note that $|\langle m_Q,\xi\rangle| \le (|m_Q| * |\xi|)(0)$. Then we use Lemma A.5 to estimate
$$|\langle m_Q,\xi\rangle| \le c(\xi)\,|Q|^{-1/2}\,\eta_{0,m}(x_Q),$$
using that $|\xi| \le c(\xi)\,\eta_{0,m}$, since $\xi \in \mathcal{S}$. Let $D$ be a finite set of dyadic cubes of size at most 1. Then
$$\begin{aligned}
\Big|\sum_{Q\in D} s_Q\langle m_Q,\xi\rangle\Big| &\le c \sum_{Q\in D} |Q|^{1/2}\,|s_Q|\,\eta_{0,m}(x_Q) \\
&\le c \int_{\mathbb{R}^n} \sum_{Q\in D} |Q|^{-1/2}\,|s_Q|\,\chi_Q(x)\,\eta_{0,m}(x)\,dx \\
&\le c\,\|\{s_Q\}_{Q\in D}\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}\,\|\eta_{0,m}\|_{p'(\cdot)}.
\end{aligned}$$
If $D_k$ is a family of sets of dyadic cubes of size at most 1 which increases to $\mathcal{D}^+$, then we conclude from the previous inequality and $\{s_Q\} \in f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$ that $\sum_{Q\in D_k} s_Q\langle m_Q,\xi\rangle$ is a Cauchy sequence, hence it converges, which was to be shown. □

Proof of Theorem 3.11. Define constants $K = n/\min\{1,p^-,q^-\} - n + \varepsilon$ and $L = \alpha^+ + 1 + \varepsilon$. We construct $(K,L)$-smooth atoms $\{a_Q\}_{Q\in\mathcal{D}^+}$ exactly as on p. 132 of [23]. Note that we may use the constant indices construction, since the constants $K$ and $L$ give sufficient smoothness at every point. These atoms are also atoms for the space $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$.

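Let $f \in F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$. With functions as in Definition 2.2, we represent $f$ as $f = \sum_{Q\in\mathcal{D}^+} t_Q\varphi_Q$ with convergence in $\mathcal{S}'$, where $t_Q = \langle f,\psi_Q\rangle$. Next, we define
$$(t^*_r)_Q = \Big(\sum_{P\in\mathcal{D}_\nu} \frac{|t_P|^r}{(1+2^\nu|x_P - x_Q|)^m}\Big)^{1/r},$$
for $Q = Q_{\nu k}$, $\nu \in \mathbb{N}_0$ and $k \in \mathbb{Z}^n$. For these numbers $(t^*_r)_Q$ we know that $f = \sum_Q (t^*_r)_Q\,a_Q$ in $\mathcal{S}'$, where $\{a_Q\}_Q$ are atoms (molecules with support in $3Q$), by the construction of [23]. (Technically, the atoms from the construction of [23] satisfy our inequalities for molecules only up to a constant (independent of the cube and scale). We will ignore this detail.)

For $\nu \in \mathbb{N}_0$ define $T_\nu := \sum_{Q\in\mathcal{D}_\nu} t_Q\chi_Q$. By definition, $t^*_r$ is a discrete convolution of $T_\nu$ with $\eta_{\nu,m}$. Changing to the continuous version, we see that $(t^*_r)_{Q_{\nu k}} \approx \big(\eta_{\nu,m} * (|T_\nu|^r)(x)\big)^{1/r}$ for $x \in Q_{\nu k}$. By this point-wise estimate we conclude that
$$\begin{aligned}
\|t^*_r\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} &= \Big\| \Big\| \Big\{2^{\nu\alpha(x)} \sum_{Q\in\mathcal{D}_\nu} |Q|^{-\frac12}\,(t^*_r)_Q\,\chi_Q\Big\}_\nu \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x} \\
&\approx \Big\| \Big\| \big\{2^{\nu\alpha(x)+\nu n/2}\,\eta_{\nu,m} * |T_\nu|^r\big\}_\nu \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x}.
\end{aligned}$$
Next we use Lemma 6.1 and Theorem 3.2 to conclude that
$$\begin{aligned}
\Big\| \Big\| \big\{2^{\nu\alpha(x)+\nu n/2}\,\eta_{\nu,m} * |T_\nu|^r\big\}_\nu \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x}
&\le c\,\Big\| \Big\| \big\{2^{\nu\alpha(x)+\nu n/2}\,T_\nu\big\}_\nu \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x} \\
&= c\,\Big\| \Big\| \Big\{2^{\nu\alpha(x)} \sum_{Q\in\mathcal{D}_\nu} |Q|^{-\frac12}\,t_Q\,\chi_Q\Big\}_\nu \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x}.
\end{aligned}$$
Since $f = \sum_{Q\in\mathcal{D}^+} t_Q\varphi_Q$, Theorem 3.4 implies that this is bounded by a constant times $\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}$.

This completes one direction. The other direction,
$$\|f\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} \le c\,\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}},$$
follows from Theorem 3.8, since every family of atoms is in particular a family of molecules. □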

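We next consider a general embedding lemma. The local classical scale of Triebel–Lizorkin spaces is decreasing in the primary index $p$ and increasing in the secondary index $q$. This is a direct consequence of the corresponding properties of $L^p$ and $l^q$. In the variable exponent setting we have the following global result provided we assume that $p$ stays constant at infinity:

Proposition 6.5. Let $p_j$, $q_j$, and $\alpha_j$ be as in the Standing Assumptions, $j = 0, 1$.
(a) If $p_0 \ge p_1$ and $(p_0)_\infty = (p_1)_\infty$, then $L^{p_0(\cdot)}(\mathbb{R}^n) \hookrightarrow L^{p_1(\cdot)}(\mathbb{R}^n)$.
(b) If $\alpha_0 \ge \alpha_1$, $p_0 \ge p_1$, $(p_0)_\infty = (p_1)_\infty$, and $q_0 \le q_1$, then $F^{\alpha_0(\cdot)}_{p_0(\cdot),\,q_0(\cdot)}(\mathbb{R}^n) \hookrightarrow F^{\alpha_1(\cdot)}_{p_1(\cdot),\,q_1(\cdot)}(\mathbb{R}^n)$.

Proof. In Lemma 2.2 of [13] it is shown that $L^{p_0(\cdot)}(\mathbb{R}^n) \hookrightarrow L^{p_1(\cdot)}(\mathbb{R}^n)$ if and only if $p_0 \ge p_1$ almost everywhere and $1 \in L^{r(\cdot)}(\mathbb{R}^n)$, where $\frac{1}{r(x)} := \frac{1}{p_1(x)} - \frac{1}{p_0(x)}$. Note that $r(x) = \infty$ if $p_1(x) = p_0(x)$. The condition $1 \in L^{r(\cdot)}(\mathbb{R}^n)$ means in this context (since $r$ is usually unbounded) that the modular satisfies $\lim_{\lambda\searrow 0} \varrho_{r(\cdot)}(\lambda) = 0$, where we use the convention that $\lambda^{r(x)} = 0$ if $r(x) = \infty$ and $\lambda \in [0,1)$. Due to the assumptions on $p_0$ and $p_1$, we have $\frac1r \in C^{\log}$, $\frac1r \ge 0$, and $\frac{1}{r_\infty} = 0$. In particular, $\big|\frac{1}{r(x)}\big| \le \frac{A}{\log(e+|x|)}$ for some $A > 0$ and all $x \in \mathbb{R}^n$. Thus,
$$\varrho_{r(\cdot)}\big(\exp(-2nA)\big) = \int_{\mathbb{R}^n} \exp\Big(\frac{-2nA}{|\frac{1}{r(x)}|}\Big)\,dx \le \int_{\mathbb{R}^n} \big(e+|x|\big)^{-2n}\,dx < \infty.$$
The convexity of $\varrho_{r(\cdot)}$ implies that $\varrho_{r(\cdot)}(\lambda\exp(-2nA)) \to 0$ as $\lambda \searrow 0$ and (a) follows.

For (b) we argue as follows. Since $\alpha_0 \ge \alpha_1$, we have $2^{\nu\alpha_0(x)} \ge 2^{\nu\alpha_1(x)}$ for all $\nu \ge 0$ and all $x \in \mathbb{R}^n$. Moreover, $q_0 \le q_1$ implies $\|\cdot\|_{l^{q_1}} \le \|\cdot\|_{l^{q_0}}$ and (a) implies $L^{p_0(\cdot)}(\mathbb{R}^n) \hookrightarrow L^{p_1(\cdot)}(\mathbb{R}^n)$. Now, the claim follows immediately from the definitions of the norms of $F^{\alpha_0(\cdot)}_{p_0(\cdot),\,q_0(\cdot)}$ and $F^{\alpha_1(\cdot)}_{p_1(\cdot),\,q_1(\cdot)}$. □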
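The finiteness of the dominating integral $\int_{\mathbb{R}^n}(e+|x|)^{-2n}\,dx$ in the proof of part (a) can be made concrete in dimension one. The sketch below is illustrative only and restricted to $n = 1$ as an assumption: it compares a midpoint-rule approximation on $[-X, X]$ with the elementary antiderivative value $2\big(\tfrac1e - \tfrac{1}{e+X}\big)$, which tends to $2/e$ as $X \to \infty$. The function names are hypothetical.

```python
import math

def truncated_integral(cutoff, steps=200000):
    """Midpoint rule for the integral of (e + |x|)^(-2) over [-cutoff, cutoff] (case n = 1)."""
    h = cutoff / steps
    half = sum((math.e + (i + 0.5) * h) ** (-2) for i in range(steps)) * h
    return 2.0 * half  # the integrand is even

def exact_truncated(cutoff):
    """Closed form 2 * (1/e - 1/(e + cutoff)) from the antiderivative -(e + x)^(-1)."""
    return 2.0 * (1.0 / math.e - 1.0 / (math.e + cutoff))
```

As the cutoff grows the value approaches $2/e \approx 0.736$, so the modular $\varrho_{r(\cdot)}(\exp(-2nA))$ is indeed dominated by a finite quantity in this case.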

With the help of this embedding result we can prove the density of smooth functions.

Proof of Proposition 3.12. Choose $K$ so large that $F^{K}_{p^+,\,2} \hookrightarrow F^{\alpha^+}_{p^+,\,1}$. This is possible by classical, fixed exponent, embedding results.

Let $f \in F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$ and choose smooth atoms $a_Q \in C^K(\mathbb{R}^n)$ so that $f = \sum_{Q\in\mathcal{D}^+} t_Q a_Q$ in $\mathcal{S}'$. Define
$$f_m = \sum_{\nu=0}^{m} \sum_{Q\in\mathcal{D}_\nu,\ |x_Q| < m} t_Q\,a_Q.$$
Then clearly $f_m \in C^K_0$ and $f_m \to f$ in $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$.

We can choose a sequence of functions $\varphi_{m,k} \in C_0^\infty$ so that $\|f_m - \varphi_{m,k}\|_{W^{K,p^+}} \to 0$ as $k \to \infty$ and the support of $\varphi_{m,k}$ lies in the ball $B(0,r_m)$. By the choice of $K$ we conclude that
$$\|f_m - \varphi_{m,k}\|_{F^{\alpha^+}_{p^+,\,1}} \le c\,\|f_m - \varphi_{m,k}\|_{F^{K}_{p^+,\,2}} = c\,\|f_m - \varphi_{m,k}\|_{W^{K,p^+}}.$$
By Proposition 6.5 we conclude that
$$\|f_m - \varphi_{m,k}\|_{F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} \le c\,\|f_m - \varphi_{m,k}\|_{F^{\alpha^+}_{p^+,\,1}}.$$
Note that the assumption $(p_0)_\infty = (p_1)_\infty$ of the proposition is irrelevant, since our functions have bounded support. Combining these inequalities yields that $\varphi_{m,k} \to f_m$ in $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$, hence we may choose a sequence $k_m$ so that $\varphi_{m,k_m} \to f$ in $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$, as required. □

Remark 6.6. Note that we used density of smooth functions in the proof of the equality $F^{k}_{p(\cdot),\,2} \cong W^{k,p(\cdot)}$. However, in the proof of the previous corollary we needed this result only for constant exponent: $F^{k}_{p,\,2} \cong W^{k,p}$. Therefore, the argument is not circular.

7. Traces

In this section we deal with trace theorems for Triebel–Lizorkin spaces. We write $\mathcal{D}^n$ and $\mathcal{D}^n_\nu$ for the families of dyadic cubes in $\mathcal{D}^+$ when we want to emphasize the dimension of the underlying space. The idea of the proof of the main trace theorem is to use the localization afforded by the atomic decomposition, and express a function as a sum of only those atoms with support intersecting the hyperplane $\mathbb{R}^{n-1} \subset \mathbb{R}^n$. In the classical case, this approach is due to Frazier and Jawerth [22].

There have been other approaches to deal with traces and extension operators using wavelet decomposition instead of atomic decomposition, which utilizes compactly supported Daubechies wavelets, and thus, conveniently gives trace theorems (see, e.g. [25]). However, for this one would need to define and establish properties of almost diagonal operators and almost diagonal matrices for the $F^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$ and $f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$ spaces. In the interest of brevity we leave this for future research.

The following lemma shows that it does not matter much for the norm if we shift around the mass a bit in the sequence space.

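Lemma 7.1. Let $p$, $q$, and $\alpha$ be as in the Standing Assumptions, $\varepsilon > 0$, and let $\{E_Q\}_Q$ be a collection of sets with $E_Q \subset 3Q$ and $|E_Q| \ge \varepsilon|Q|$. Then
$$\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} \approx \Big\| \Big\| 2^{\nu\alpha(x)} \sum_{Q\in\mathcal{D}_\nu} |s_Q|\,|Q|^{-\frac12}\,\chi_{E_Q} \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x}$$
for all $\{s_Q\}_Q \in f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}$.

Proof. We start by proving the inequality "$\le$". Let $r \in (0, \min\{p^-, q^-\})$. We express the norm as
$$\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}} = \Big\| \Big\| 2^{\nu\alpha(x)r} \sum_{Q\in\mathcal{D}_\nu} |s_Q|^r\,|Q|^{-\frac r2}\,\chi_Q \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x},$$
since the sum has only one non-zero term. We use the estimate $\chi_Q \le c\,\eta_{\nu,m} * \chi_{E_Q}$ for all $Q \in \mathcal{D}_\nu$. Now Lemma 6.1 implies that
$$\begin{aligned}
\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}
&\le c\,\Big\| \Big\| 2^{\nu\alpha(x)r} \sum_{Q\in\mathcal{D}_\nu} |s_Q|^r\,|Q|^{-\frac r2}\,\eta_\nu * \chi_{E_Q} \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x} \\
&\le c\,\Big\| \Big\| \eta_\nu * \Big(2^{\nu\alpha(\cdot)r} \sum_{Q\in\mathcal{D}_\nu} |s_Q|^r\,|Q|^{-\frac r2}\,\chi_{E_Q}\Big) \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x}.
\end{aligned}$$
Then Theorem 3.2 completes the proof of the first direction:
$$\begin{aligned}
\|\{s_Q\}_Q\|_{f^{\alpha(\cdot)}_{p(\cdot),\,q(\cdot)}}
&\le c\,\Big\| \Big\| 2^{\nu\alpha(x)r} \sum_{Q\in\mathcal{D}_\nu} |s_Q|^r\,|Q|^{-\frac r2}\,\chi_{E_Q} \Big\|_{l^{q(x)/r}_\nu} \Big\|^{\frac1r}_{L^{p(\cdot)/r}_x} \\
&= c\,\Big\| \Big\| 2^{\nu\alpha(x)} \sum_{Q\in\mathcal{D}_\nu} |s_Q|\,|Q|^{-\frac12}\,\chi_{E_Q} \Big\|_{l^{q(x)}_\nu} \Big\|_{L^{p(\cdot)}_x}.
\end{aligned}$$
The other direction follows by the same argument, since $\chi_{E_Q} \le c\,\eta_{\nu,m} * \chi_Q$. □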

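Next we use the embedding proposition from the previous section to show that the trace space does not really depend on the secondary index of integration.

Lemma 7.2. Let $p_1$, $p_2$, $q_1$, $\alpha_1$ and $\alpha_2$ be as in the Standing Assumptions and let $q_2 \in (0,\infty)$. Assume that $\alpha_1 = \alpha_2$ and $p_1 = p_2$ in the upper or lower half space, and that $\alpha_1 \le \alpha_2$ and $p_1 \le p_2$. Then
$$\operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,q_1(\cdot)}(\mathbb{R}^n) = \operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot),\,q_2}(\mathbb{R}^n).$$

Proof. We assume without loss of generality that $\alpha_1 = \alpha_2$ and $p_1 = p_2$ in the upper half space. We define $r_0 = \min\{q_2, q_1^-\}$ and $r_1 = \max\{q_2, q_1^+\}$. It follows from Proposition 6.5 that
$$\operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0} \hookrightarrow \operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,q_1(\cdot)} \hookrightarrow \operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1} \quad\text{and}\quad \operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0} \hookrightarrow \operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot),\,q_2} \hookrightarrow \operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1}.$$
We complete the proof by showing that $\operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1} \hookrightarrow \operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0}$. Let $f \in \operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1}$. According to Theorem 3.11 we have the representation
$$f = \sum_{Q\in\mathcal{D}^+} t_Q\,a_Q \quad\text{with}\quad \|\{t_Q\}_Q\|_{f^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1}} \le c\,\|f\|_{F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1}},$$
where the $a_Q$ are smooth atoms for $F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1}$ satisfying (M1) and (M2) up to high order. Then they are also smooth atoms for $F^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0}$.

Let $A := \{Q \in \mathcal{D}^+ : 3Q \cap \{x_n = 0\} \ne \emptyset\}$. If $Q \in A$ is contained in the closed upper half space, then we write $Q \in A^+$, otherwise $Q \in A^-$. We set $\tilde t_Q = t_Q$ when $Q \in A$, and $\tilde t_Q = 0$ otherwise. Then we define $\tilde f = \sum_{Q\in\mathcal{D}^+} \tilde t_Q\,a_Q$. It is clear that $\operatorname{tr} f = \operatorname{tr}\tilde f$, since all the atoms of $f$ whose support intersects $\mathbb{R}^{n-1}$ are included in $\tilde f$.

For $Q \in A^+$ we define
$$E_Q = \Big\{x \in Q : \tfrac12\,\ell(Q) \le x_n \le \tfrac34\,\ell(Q)\Big\};$$
for $Q \in A^-$ we define
$$E_Q = \Big\{(x', x_n) \in \mathbb{R}^n : (x', -x_n) \in Q,\ \tfrac12\,\ell(Q) \le x_n \le \tfrac34\,\ell(Q)\Big\};$$
for all other cubes $E_Q = \emptyset$. If $Q \in A$, then $|Q| = 4|E_Q|$; moreover, $\{E_Q\}_Q$ covers each point at most three times. By Theorem 3.8 and Lemma 7.1 we conclude that
$$\|\tilde f\|_{F^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0}} \le c\,\|\{\tilde t_Q\}_Q\|_{f^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0}} \le c\,\Big\| \Big\| 2^{\nu\alpha_2(x)} \sum_{Q\in\mathcal{D}_\nu} |t_Q|\,|Q|^{-\frac12}\,\chi_{E_Q} \Big\|_{l^{r_0}_\nu} \Big\|_{L^{p_2(\cdot)}_x}.$$
The inner norm consists of at most three non-zero members for each $x \in \mathbb{R}^n$. Therefore, we can replace $r_0$ by $r_1$. Moreover, each $E_Q$ is supported in the upper half space, where $\alpha_2$ and $\alpha_1$, and $p_2$ and $p_1$ agree. Thus,
$$\|\tilde f\|_{F^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0}} \le c\,\Big\| \Big\| 2^{\nu\alpha_1(x)} \sum_{Q\in\mathcal{D}_\nu} |t_Q|\,|Q|^{-\frac12}\,\chi_{E_Q} \Big\|_{l^{r_1}_\nu} \Big\|_{L^{p_1(\cdot)}_x}.$$
The right-hand side is bounded by $\|f\|_{F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1}}$ according to Theorem 3.8 and Lemma 7.1. Therefore, $\operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,r_1} \hookrightarrow \operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot),\,r_0}$, and the claim follows. □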

For the next proposition we recall the common notation $F^{\alpha(\cdot)}_{p(\cdot)} = F^{\alpha(\cdot)}_{p(\cdot),\,p(\cdot)}$ for the Triebel–Lizorkin space with identical primary and secondary indices of integrability. The next result shows that the trace space depends only on the values of the indices at the boundary, as should be expected.

Proposition 7.3. Let $p_1$, $p_2$, $q_1$, $\alpha_1$ and $\alpha_2$ be as in the Standing Assumptions. Assume that $\alpha_1(x) = \alpha_2(x)$ and $p_1(x) = p_2(x)$ for all $x \in \mathbb{R}^{n-1}\times\{0\}$. Then
$$\operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,q_1(\cdot)}(\mathbb{R}^n) = \operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot)}(\mathbb{R}^n).$$

L. Diening et al. / Journal of Functional Analysis 256 (2009) 1731–1768 α (·)


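Proof. By Lemma 7.2 we conclude that $\operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot),\,q_1(\cdot)} = \operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot)}$. Therefore, we can assume that $q_1 = p_1$. We define $\tilde\alpha_j$ to equal $\alpha_j$ on the lower half space and $\min\{\alpha_1,\alpha_2\}$ on the upper half space and let $\tilde\alpha = \min\{\alpha_1,\alpha_2\}$. Similarly, we define $\tilde p_j$ and $\tilde p$. Applying Lemma 7.2 four times in the chain
$$\operatorname{tr} F^{\alpha_1(\cdot)}_{p_1(\cdot)}(\mathbb{R}^n) = \operatorname{tr} F^{\tilde\alpha_1(\cdot)}_{\tilde p_1(\cdot)}(\mathbb{R}^n) = \operatorname{tr} F^{\tilde\alpha(\cdot)}_{\tilde p(\cdot)}(\mathbb{R}^n) = \operatorname{tr} F^{\tilde\alpha_2(\cdot)}_{\tilde p_2(\cdot)}(\mathbb{R}^n) = \operatorname{tr} F^{\alpha_2(\cdot)}_{p_2(\cdot)}(\mathbb{R}^n)$$
gives the result. □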

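Proof of Theorem 3.13. By Proposition 7.3 it suffices to consider the case $q = p$ with $p$ and $\alpha$ independent of the $n$th coordinate for $|x_n| \le 2$. Let $f \in F^{\alpha(\cdot)}_{p(\cdot)}$ with $\|f\|_{F^{\alpha(\cdot)}_{p(\cdot)}} \le 1$ and let $f = \sum s_Q a_Q$ be an atomic decomposition as in Theorem 3.11.

We denote by $\pi$ the orthogonal projection of $\mathbb{R}^n$ onto $\mathbb{R}^{n-1}$, and $(x', x_n) \in \mathbb{R}^n = \mathbb{R}^{n-1}\times\mathbb{R}$. For $J \in \mathcal{D}^{n-1}_\mu$, a dyadic cube in $\mathbb{R}^{n-1}$, we define $Q_i(J) \in \mathcal{D}^n_\mu$, $i = 1,\dots,6\cdot 5^{n-1}$, to be all the dyadic cubes satisfying $J \subset 3Q_i$. We define $t_J = |Q_1(J)|^{-\frac{1}{2n}} \sum_i |s_{Q_i(J)}|$ and $h_J(x') = t_J^{-1} \sum_i s_{Q_i} a_{Q_i}$. By $Q^+(J)$ we denote the cube $Q_i(J)$ which has $J$ as a face (i.e., $J \subset \partial Q^+(J)$). Then we have
$$\operatorname{tr} f(x') = \sum_\mu \sum_{J\in\mathcal{D}^{n-1}_\mu} t_J\,h_J(x'),$$
with convergence in $\mathcal{S}'$. The condition $\alpha - \frac1p - (n-1)\big(\frac1p - 1\big)_+ > 0$ implies that molecules in $F^{\alpha(\cdot)-\frac{1}{p(\cdot)}}_{p(\cdot)}(\mathbb{R}^{n-1})$ are not required to satisfy any moment conditions. Therefore, $h_J$ is a family of smooth molecules for this space. Consequently, by Theorem 3.8, we find that
$$\|\operatorname{tr} f\|_{F^{\alpha(\cdot)-\frac{1}{p(\cdot)}}_{p(\cdot)}(\mathbb{R}^{n-1})} \le c\,\|\{t_J\}_J\|_{f^{\alpha(\cdot)-\frac{1}{p(\cdot)}}_{p(\cdot)}(\mathbb{R}^{n-1})}.$$
Thus, we conclude the proof by showing that the right-hand side is bounded by a constant. Since the norm is bounded if and only if the modular is bounded, we see that it suffices to show that
$$\int_{\mathbb{R}^{n-1}} \sum_\mu \sum_{J\in\mathcal{D}^{n-1}_\mu} \Big(2^{\mu\big(\alpha(x',0)-\frac{1}{p(x',0)}\big)}\,|t_J|\,|J|^{-1/2}\,\chi_J(x',0)\Big)^{p(x',0)} dx' = \sum_\mu 2^{-\mu} \sum_{J\in\mathcal{D}^{n-1}_\mu} \int_J \Big(2^{\mu\alpha(x',0)}\,|t_J|\,|J|^{-1/2}\Big)^{p(x',0)} dx'$$
is bounded. Using the fact that $p(x',0) = p(x',x_n) = p(x)$ when $|x_n| \le 2$, we calculate
$$\begin{aligned}
2^{-\mu} \int_J \Big(2^{\mu\alpha(x',0)}\,|t_J|\,|J|^{-1/2}\Big)^{p(x',0)} dx'
&= \int_{Q^+(J)} \Big(2^{\mu\alpha(x',0)}\,|t_J|\,|J|^{-1/2}\Big)^{p(x',0)} d(x',x_n) \\
&\le c \int_{Q^+(J)} \Big(2^{\mu\alpha(x)} \sum_i |s_{Q_i}|\,|Q|^{-\frac{1}{2n}-\frac{n-1}{2n}}\Big)^{p(x)} dx \\
&= c \int_{Q^+(J)} \Big(2^{\mu\alpha(x)} \sum_i |s_{Q_i}|\,|Q|^{-1/2}\Big)^{p(x)} dx.
\end{aligned}$$
Hence, we obtain
$$\begin{aligned}
\sum_\mu 2^{-\mu} \sum_{J\in\mathcal{D}^{n-1}_\mu} \int_J \Big(2^{\mu\alpha(x',0)}\,|t_J|\,|J|^{-1/2}\Big)^{p(x',0)} dx'
&\le c \sum_\mu \sum_{Q\in\mathcal{D}^n_\mu} \int_Q \Big(2^{\mu\alpha(x)} \sum_i |s_{Q_i}|\,|Q|^{-1/2}\Big)^{p(x)} dx \\
&\le c \int_{\mathbb{R}^n} \sum_\nu \sum_{Q\in\mathcal{D}^n_\nu} \Big(2^{\nu\alpha(x)}\,|s_Q|\,|Q|^{-1/2}\,\chi_Q(x)\Big)^{p(x)} dx,
\end{aligned}$$
where we again swapped the integral and the sums. Since $\|f\|_{F^{\alpha(\cdot)}_{p(\cdot)}} \le 1$, the right-hand side quantity is bounded, and we are done. □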

Acknowledgments

We would like to thank H.-G. Leopold for useful discussions on how our spaces relate to his spaces of variable smoothness, and Tuomas Hytönen for a piece of advice on Fourier analysis. Finally, we would like to thank the referee for some helpful comments. The first author thanks Arizona State University and the University of Oulu for their hospitality. All authors thank the W. Pauli Institute, Vienna, at which it was possible to complete the project.

Appendix A. Technical lemmas

Recall from (3.1) that $\eta_{\nu,m}(x) = 2^{n\nu}(1 + 2^\nu |x|)^{-m}$.

Lemma A.1. Let $\nu_1 \ge \nu_0$, $m > n$, and $y \in \mathbb{R}^n$. Then

$$\eta_{\nu_0,m}(y) \le 2^m \eta_{\nu_1,m}(y) \quad \text{if } |y| \le 2^{-\nu_1}; \text{ and}$$
$$\eta_{\nu_1,m}(y) \le 2^m \eta_{\nu_0,m}(y) \quad \text{if } |y| \ge 2^{-\nu_0}.$$

Proof. Let $|y| \le 2^{-\nu_1}$. Then $1 + 2^{\nu_1}|y| \le 2$ and

$$\frac{\eta_{\nu_0,m}(y)}{\eta_{\nu_1,m}(y)} = \frac{2^{n\nu_0}(1 + 2^{\nu_1}|y|)^m}{2^{n\nu_1}(1 + 2^{\nu_0}|y|)^m} \le \frac{2^{n\nu_0} \cdot 2^m}{2^{n\nu_1}} \le 2^m,$$

which proves the first inequality. Assume now that $|y| \ge 2^{-\nu_0}$. Then $1 + 2^{\nu_0}|y| \le 2 \cdot 2^{\nu_0}|y|$ and

$$\frac{\eta_{\nu_1,m}(y)}{\eta_{\nu_0,m}(y)} = \frac{2^{n\nu_1}(1 + 2^{\nu_0}|y|)^m}{2^{n\nu_0}(1 + 2^{\nu_1}|y|)^m} \le \frac{2^{n\nu_1}(2 \cdot 2^{\nu_0}|y|)^m}{2^{n\nu_0}(2^{\nu_1}|y|)^m} = 2^m\, 2^{(\nu_1 - \nu_0)(n - m)} \le 2^m,$$

which gives the second inequality. □
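The two inequalities of Lemma A.1 can be spot-checked numerically. The following is only an illustrative sketch, not part of the original argument: the function names `eta` and `check_lemma_a1`, the parameter ranges, and the random sampling are assumptions made for the example, and the kernel depends on $y$ only through $|y|$, which is passed as `y_norm`.

```python
import random


def eta(nu, m, y_norm, n):
    """eta_{nu,m}(y) = 2^{n nu} (1 + 2^nu |y|)^{-m}, as a function of |y|."""
    return 2.0 ** (n * nu) * (1.0 + 2.0 ** nu * y_norm) ** (-m)


def check_lemma_a1(trials=10_000, seed=0):
    """Sample nu_1 >= nu_0 and m > n, then test both inequalities of Lemma A.1."""
    rng = random.Random(seed)
    tol = 1.0 + 1e-12  # multiplicative slack for floating-point rounding
    for _ in range(trials):
        n = rng.randint(1, 4)
        m = n + rng.uniform(0.1, 5.0)
        nu0 = rng.randint(0, 8)
        nu1 = nu0 + rng.randint(0, 8)
        small = rng.uniform(0.0, 1.0) * 2.0 ** (-nu1)        # |y| <= 2^{-nu_1}
        large = 2.0 ** (-nu0) * (1.0 + rng.uniform(0.0, 50.0))  # |y| >= 2^{-nu_0}
        if eta(nu0, m, small, n) > 2.0 ** m * eta(nu1, m, small, n) * tol:
            return False
        if eta(nu1, m, large, n) > 2.0 ** m * eta(nu0, m, large, n) * tol:
            return False
    return True


if __name__ == "__main__":
    print(check_lemma_a1())
```

A passing check is of course no substitute for the proof; it only guards against misreading the direction of the inequalities.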

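Lemma A.2. Let $\nu \ge 0$ and $m > n$. Then for $Q \in \mathcal{D}_\nu$, $y \in Q$ and $x \in \mathbb{R}^n$, we have

$$\eta_{\nu,m} \ast \frac{\chi_Q}{|Q|}(x) \approx \eta_{\nu,m}(x - y).$$

Proof. Fix $Q \in \mathcal{D}_\nu$ and set $d = 1 + \sqrt{n}$. If $y, z \in Q$, then $|y - z| \le \sqrt{n}\, 2^{-\nu}$ and

$$\frac{1}{d}\big(1 + 2^\nu |x - z|\big) \le 1 + \frac{1}{d} \cdot 2^\nu \big(|x - z| - \sqrt{n}\, 2^{-\nu}\big) \le 1 + 2^\nu |x - y| \le 1 + 2^\nu \big(|x - z| + \sqrt{n}\, 2^{-\nu}\big) \le d\big(1 + 2^\nu |x - z|\big).$$

Therefore, for all $y, z \in Q$ we have $d^{-m} \eta_{\nu,m}(x - y) \le \eta_{\nu,m}(x - z) \le d^m \eta_{\nu,m}(x - y)$. The claim follows when we integrate this estimate over $z \in Q$ and use the formula

$$\eta_{\nu,m} \ast \frac{\chi_Q}{|Q|}(x) = \frac{1}{|Q|} \int_Q \eta_{\nu,m}(x - z)\, dz.$$ □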
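As a sanity check of Lemma A.2, the averaged kernel can be compared with a point value in dimension $n = 1$, where $d = 1 + \sqrt{n} = 2$. This is a hedged sketch under assumptions of our own: the names `averaged_eta` and `ratio_within_bounds`, the midpoint rule, and the indexing of the dyadic interval $Q = [k 2^{-\nu}, (k+1) 2^{-\nu})$ by an integer `k` are illustrative and do not come from the paper.

```python
def eta(nu, m, x):
    """One-dimensional kernel eta_{nu,m}(x) = 2^nu (1 + 2^nu |x|)^{-m}."""
    return 2.0 ** nu * (1.0 + 2.0 ** nu * abs(x)) ** (-m)


def averaged_eta(nu, m, x, k, samples=2000):
    """Midpoint-rule value of (eta_{nu,m} * chi_Q / |Q|)(x) for Q = [k 2^-nu, (k+1) 2^-nu)."""
    side = 2.0 ** (-nu)
    left = k * side
    h = side / samples
    total = sum(eta(nu, m, x - (left + (i + 0.5) * h)) for i in range(samples))
    return total * h / side


def ratio_within_bounds(nu, m, x, k, y):
    """Check d^{-m} <= averaged / eta(x - y) <= d^m with d = 2 (the case n = 1)."""
    ratio = averaged_eta(nu, m, x, k) / eta(nu, m, x - y)
    slack = 1e-12  # rounding slack only
    return 2.0 ** (-m) * (1 - slack) <= ratio <= 2.0 ** m * (1 + slack)


if __name__ == "__main__":
    # Q = [0.25, 0.5) for nu = 2, k = 1; y must lie in Q.
    print(all(ratio_within_bounds(2, 1.5, x, 1, y)
              for x in (-3.0, 0.1, 0.3, 7.5)
              for y in (0.25, 0.3, 0.49)))
```

Because the midpoint sum is an average of point values $\eta_{\nu,m}(x - z)$ with $z \in Q$, it inherits the pointwise two-sided bound from the proof, so the check does not rely on quadrature accuracy.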

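Lemma A.3. For $\nu_0, \nu_1 \ge 0$ and $m > n$, we have

$$\eta_{\nu_0,m} \ast \eta_{\nu_1,m} \approx \eta_{\min\{\nu_0,\nu_1\},m}$$

with the implicit constants depending only on $m$ and $n$.

Proof. Using dilations and symmetry we may assume that $\nu_0 = 0$ and $\nu_1 \ge 0$. Since $m > n$, we have $\|\eta_{\nu_0,m}\|_1 \le c$ and $\|\eta_{\nu_1,m}\|_1 \le c$. We start with the direction "$\gtrsim$". If $|y| \le 2^{-\nu_1} \le 1$, then $1 + |x - y| \le 2(1 + |x|)$, and therefore, $\eta_{\nu_0,m}(x - y) \ge c\, \eta_{\nu_0,m}(x)$. Hence,

$$\eta_{\nu_0,m} \ast \eta_{\nu_1,m}(x) \ge \int_{\{y:\, |y| \le 2^{-\nu_1}\}} \eta_{\nu_0,m}(x - y)\, \eta_{\nu_1,m}(y)\, dy \ge c\, \eta_{\nu_0,m}(x) \int_{\{y:\, |y| \le 2^{-\nu_1}\}} 2^{n\nu_1} \big(1 + 2^{\nu_1}|y|\big)^{-m} dy$$
$$\ge c\, \eta_{\nu_0,m}(x) \int_{\{y:\, |y| \le 2^{-\nu_1}\}} 2^{n\nu_1} 2^{-m}\, dy \ge c\, 2^{-m} \eta_{\nu_0,m}(x).$$

We now prove the opposite direction, "$\lesssim$". Let $A := \{y \in \mathbb{R}^n : |y| \le 3 \text{ or } |x - y| > |x|/2\}$. If $y \in A$, then $1 + |x - y| \ge \frac{1}{4}(1 + |x|)$, which implies that $\eta_{\nu_0,m}(x - y) \le c\, \eta_{\nu_0,m}(x)$ and

$$\int_A \eta_{\nu_0,m}(x - y)\, \eta_{\nu_1,m}(y)\, dy \le c\, \eta_{\nu_0,m}(x) \int_A \eta_{\nu_1,m}(y)\, dy \le c\, \eta_{\nu_0,m}(x).$$

If $y \in \mathbb{R}^n \setminus A$, then $|y| \ge 1$ and $|y| \ge \frac{1}{2}|x|$. So $\eta_{\nu_1,m}(y) \le c\, \eta_{\nu_0,m}(y) \le c\, \eta_{\nu_0,m}(x)$ by Lemma A.1. Hence,

$$\int_{\mathbb{R}^n \setminus A} \eta_{\nu_0,m}(x - y)\, \eta_{\nu_1,m}(y)\, dy \le c \int_{\mathbb{R}^n \setminus A} \eta_{\nu_0,m}(x - y)\, dy\; \eta_{\nu_0,m}(x) \le c\, \eta_{\nu_0,m}(x).$$

Combining the estimates over $A$ and $\mathbb{R}^n \setminus A$ gives $\eta_{\nu_0,m} \ast \eta_{\nu_1,m}(x) \le c\, \eta_{\min\{\nu_0,\nu_1\},m}(x)$. □

Lemma A.4. Let $r \in (0, 1]$. Then for $\nu, \mu \ge 0$, $m > \frac{n}{r}$ and $Q_\mu \in \mathcal{D}_\mu$, we have

$$(\eta_{\nu,m} \ast \eta_{\mu,m} \ast \chi_{Q_\mu})^r \approx 2^{(\mu - \nu)_+ n(1 - r)}\, \eta_{\nu,mr} \ast \eta_{\mu,mr} \ast \chi_{Q_\mu},$$

where the implicit constants depend only on $m$, $n$ and $r$.

Proof. Without loss of generality, we may assume that $x_{Q_\mu} = 0$. Then by Lemmas A.2 and A.3

$$\eta_{\nu,m} \ast \eta_{\mu,m} \ast \chi_{Q_\mu} \approx 2^{-n\mu}\, \eta_{\nu,m} \ast \eta_{\mu,m} \approx 2^{-n\mu}\, \eta_{\min\{\nu,\mu\},m},$$
$$\eta_{\nu,mr} \ast \eta_{\mu,mr} \ast \chi_{Q_\mu} \approx 2^{-n\mu}\, \eta_{\nu,mr} \ast \eta_{\mu,mr} \approx 2^{-n\mu}\, \eta_{\min\{\nu,\mu\},mr}.$$

From the definition of $\eta$ we get $(\eta_{\min\{\nu,\mu\},m})^r = 2^{\min\{\nu,\mu\} n(r-1)}\, \eta_{\min\{\nu,\mu\},mr}$. Thus, we get

$$(\eta_{\nu,m} \ast \eta_{\mu,m} \ast \chi_{Q_\mu})^r \approx 2^{\mu n(1-r)}\, 2^{\min\{\nu,\mu\} n(r-1)}\, \eta_{\nu,mr} \ast \eta_{\mu,mr} \ast \chi_{Q_\mu}.$$ □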
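The proof of Lemma A.4 rests on two elementary facts: the exact identity $(\eta_{k,m})^r = 2^{kn(r-1)} \eta_{k,mr}$ and the exponent bookkeeping $\mu - \min\{\nu,\mu\} = (\mu - \nu)_+$. A small sketch that tests both numerically follows; the helper names `power_identity_gap` and `exponents_agree` are hypothetical and chosen only for this illustration.

```python
def eta(k, m, x_norm, n):
    """eta_{k,m}(x) = 2^{n k} (1 + 2^k |x|)^{-m}, as a function of |x|."""
    return 2.0 ** (n * k) * (1.0 + 2.0 ** k * x_norm) ** (-m)


def power_identity_gap(k, m, r, x_norm, n):
    """Relative gap between eta_{k,m}^r and 2^{k n (r-1)} eta_{k,mr}."""
    lhs = eta(k, m, x_norm, n) ** r
    rhs = 2.0 ** (k * n * (r - 1.0)) * eta(k, m * r, x_norm, n)
    return abs(lhs - rhs) / rhs


def exponents_agree(nu, mu, n, r):
    """mu*n*(1-r) + min(nu,mu)*n*(r-1) should equal (mu-nu)_+ * n * (1-r)."""
    combined = mu * n * (1.0 - r) + min(nu, mu) * n * (r - 1.0)
    target = max(mu - nu, 0) * n * (1.0 - r)
    return abs(combined - target) < 1e-12


if __name__ == "__main__":
    print(power_identity_gap(3, 2.5, 0.5, 0.7, 2))
    print(exponents_agree(2, 5, 3, 0.25), exponents_agree(5, 2, 3, 0.25))
```

The identity is exact, so the reported gap is pure floating-point rounding.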

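Lemma A.5. Let $g, h \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ and $k \in \mathbb{N}_0$ such that $D^\gamma g \in L^1(\mathbb{R}^n)$ for all multi-indices $\gamma$ with $|\gamma| \le k$. Assume that there exist $m_0 > n$ and $m_1 > n + k$ such that $|h| \le \eta_{\mu,m_1}$ and $|D^\gamma g| \le 2^{\nu k} \eta_{\nu,m_0}$ for the same $\gamma$. Further, suppose that

$$\int_{\mathbb{R}^n} x^\gamma h(x)\, dx = 0, \quad \text{for } |\gamma| \le k - 1.$$

Then

$$|g \ast h| \le c\, 2^{k(\nu - \mu)}\, \eta_{\nu,m_0} \ast \eta_{\mu,m_1 - k}.$$

Proof. If $k = 0$, then the estimate is obvious, so we can assume $k \ge 1$. It suffices to prove the result for $g, h$ smooth. Since $h$ has vanishing moments up to order $k - 1$, we estimate by Taylor's formula

$$|g \ast h(x)| \le \int_{\mathbb{R}^n} \Big| g(y) - \sum_{|\gamma| \le k-1} D^\gamma g(x) \frac{(y - x)^\gamma}{\gamma!} \Big|\, |h(x - y)|\, dy$$
$$\le c \int_{\mathbb{R}^n} \int_{[x,y]} \sup_{|\gamma| = k} |D^\gamma g(\xi)|\, |x - \xi|^{k-1}\, d\xi\, |h(x - y)|\, dy$$
$$\le c \int_{\mathbb{R}^n} \int_{[x,y]} 2^{\nu k} \eta_{\nu,m_0}(\xi)\, |x - \xi|^{k-1}\, \eta_{\mu,m_1}(x - y)\, d\xi\, dy.$$

Changing the order of integration with $y - x = r(\xi - x)$, where $r \ge 1$, yields the inequality

$$|g \ast h(x)| \le c\, 2^{\nu k} \int_{\mathbb{R}^n} \int_1^\infty \eta_{\nu,m_0}(\xi)\, |x - \xi|^k\, \eta_{\mu,m_1}\big(r(x - \xi)\big)\, dr\, d\xi. \tag{A.1}$$

We estimate the inner integral: for $2^\mu r |x - \xi| \ge 1$ we have $\eta_{\mu,m_1}(r(x - \xi)) \approx r^{-m_1} \eta_{\mu,m_1}(x - \xi)$; for $2^\mu r |x - \xi| < 1$ we simply use $\eta_{\mu,m_1}(r(x - \xi)) \le \eta_{\mu,m_1}(x - \xi)$. Thus, we find that

$$\int_1^\infty \eta_{\mu,m_1}\big(r(x - \xi)\big)\, dr \le \bigg( \int_1^\infty r^{-1-m_1}\, dr + \int_1^{2^{-\mu}|x - \xi|^{-1}} r^{-1}\, dr \bigg)\, \eta_{\mu,m_1}(x - \xi) \approx \log\big(e + 2^{-\mu}|x - \xi|^{-1}\big)\, \eta_{\mu,m_1}(x - \xi).$$

Substituting this into (A.1) produces

$$|g \ast h(x)| \le c\, 2^{\nu k} \int_{\mathbb{R}^n} \eta_{\nu,m_0}(\xi)\, |x - \xi|^k \log\big(e + 2^{-\mu}|x - \xi|^{-1}\big)\, \eta_{\mu,m_1}(x - \xi)\, d\xi$$
$$= c\, 2^{\nu k} \int_{\mathbb{R}^n} \log\big(e + 2^{-\mu}|x - \xi|^{-1}\big) \bigg( \frac{|x - \xi|}{1 + 2^\mu |x - \xi|} \bigg)^k \eta_{\nu,m_0}(\xi)\, \eta_{\mu,m_1 - k}(x - \xi)\, d\xi$$
$$\le c\, 2^{k(\nu - \mu)}\, \eta_{\nu,m_0} \ast \eta_{\mu,m_1 - k}(x),$$

proving the assertion. □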

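Lemma A.6 ("The r-trick"). Let $r > 0$, $\nu \ge 0$ and $m > n$. Then there exists $c = c(r, m, n) > 0$ such that

$$|g(x)| \le c\, \big( \eta_{\nu,m} \ast |g|^r(x) \big)^{1/r}$$

for all $x \in \mathbb{R}^n$ and every $g \in \mathcal{S}'$ with $\operatorname{supp} \hat g \subset \{\xi : |\xi| \le 2^{\nu+1}\}$.

Proof. Fix a dyadic cube $Q = Q_{\nu,k}$ and $x \in Q$. By (2.11) of [22] we have

$$|g(x)|^r \le \sup_{z \in Q} |g(z)|^r \le c_r\, 2^{\nu n} \sum_{l \in \mathbb{Z}^n} \big(1 + |l|\big)^{-m} \int_{Q_{\nu,k+l}} |g(y)|^r\, dy.$$

In the reference this was shown only for $m = n + 1$, but it is easy to see that it is also true for $m > n + 1$. Now for $x \in Q_{\nu,k}$ and $y \in Q_{\nu,k+l}$, we have $|x - y| \approx 2^{-\nu}|l|$ for large $l$, hence, $1 + 2^\nu |x - y| \approx 1 + |l|$. From this we conclude that

$$\sup_{z \in Q} |g(z)|^r \le c_{r,n}\, 2^{\nu n} \sum_{l \in \mathbb{Z}^n} \int_{Q_{\nu,k+l}} \big(1 + 2^\nu |x - y|\big)^{-m} |g(y)|^r\, dy$$
$$= c_{r,n} \int_{\mathbb{R}^n} 2^{\nu n} \big(1 + 2^\nu |x - z|\big)^{-m} |g(z)|^r\, dz = c_{r,n}\, \eta_{\nu,m} \ast |g|^r(x).$$

Now, taking the $r$th root, we obtain the claim. □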

References

[1] E. Acerbi, G. Mingione, Regularity results for stationary electro-rheological fluids, Arch. Ration. Mech. Anal. 164 (3) (2002) 213–259.
[2] E. Acerbi, G. Mingione, Regularity results for electrorheological fluids: The stationary case, C. R. Acad. Sci. Paris Ser. I 334 (9) (2002) 817–822.
[3] E. Acerbi, G. Mingione, Gradient estimates for the p(x)-Laplacean system, J. Reine Angew. Math. 584 (2005) 117–148.
[4] A. Almeida, S. Samko, Characterization of Riesz and Bessel potentials on variable Lebesgue spaces, J. Funct. Spaces Appl. 4 (2) (2006) 113–144.
[5] O. Besov, Embeddings of spaces of differentiable functions of variable smoothness, in: Issled. po Teor. Differ. Funkts. Mnogikh Perem. i ee Prilozh. 17, Tr. Mat. Inst. Steklova 214 (1997) 25–58, translation in Proc. Steklov Inst. Math. 214 (3) (1996) 19–53.
[6] O. Besov, On spaces of functions of variable smoothness defined by pseudodifferential operators, in: Issled. po Teor. Differ. Funkts. Mnogikh Perem. i ee Prilozh. 18, Tr. Mat. Inst. Steklova 227 (1999) 56–74, translation in Proc. Steklov Inst. Math. 227 (4) (1999) 50–69.
[7] O. Besov, Equivalent normings of spaces of functions of variable smoothness, in: Funkts. Prostran., Priblizh., Differ. Uravn., Tr. Mat. Inst. Steklova 243 (2003) 87–95 (in Russian), translation in Proc. Steklov Inst. Math. 243 (2003) 80–88.
[8] O. Besov, Interpolation, embedding, and extension of spaces of functions of variable smoothness, in: Issled. po Teor. Funkts. i Differ. Uravn., Tr. Mat. Inst. Steklova 248 (2005) 52–63 (in Russian), translation in Proc. Steklov Inst. Math. 248 (2005) 47–58.
[9] M. Bownik, K.-P. Ho, Atomic and molecular decompositions of anisotropic Triebel–Lizorkin spaces, Trans. Amer. Math. Soc. 358 (4) (2006) 1469–1510.
[10] Y. Chen, S. Levine, R. Rao, Variable exponent, linear growth functionals in image restoration, SIAM J. Appl. Math. 66 (4) (2006) 1383–1406.
[11] D. Cruz-Uribe, A. Fiorenza, J.M. Martell, C. Pérez, The boundedness of classical operators in variable Lp spaces, Ann. Acad. Sci. Fenn. Math. 31 (2006) 239–264.
[12] D. Cruz-Uribe, A. Fiorenza, C.J. Neugebauer, The maximal function on variable Lp spaces, Ann. Acad. Sci. Fenn. Math. 28 (2003) 223–238, Ann. Acad. Sci. Fenn. Math. 29 (2004) 247–249.
[13] L. Diening, Maximal function on generalized Lebesgue spaces Lp(·), Math. Inequal. Appl. 7 (2) (2004) 245–254.
[14] L. Diening, Maximal function on Orlicz–Musielak spaces and generalized Lebesgue spaces, Bull. Sci. Math. 129 (8) (2005) 657–700.


[15] L. Diening, P. Harjulehto, P. Hästö, Y. Mizuta, T. Shimomura, Maximal functions in variable exponent spaces: Limiting cases of the exponent, preprint, 2007.
[16] L. Diening, P. Hästö, Variable exponent trace spaces, Studia Math. 183 (2) (2007) 127–141.
[17] L. Diening, P. Hästö, A. Nekvinda, Open problems in variable exponent Lebesgue and Sobolev spaces, in: Drabek and Rakosnik (Eds.), FSDONA04 Proceedings, Milovy, Czech Republic, 2004, pp. 38–58.
[18] L. Diening, M. Růžička, Calderón–Zygmund operators on generalized Lebesgue spaces Lp(·) and problems related to fluid dynamics, J. Reine Angew. Math. 563 (2003) 197–220.
[19] X.-L. Fan, Global C^{1,α} regularity for variable exponent elliptic equations in divergence form, J. Differential Equations 235 (2) (2007) 397–417.
[20] X.-L. Fan, Boundary trace embedding theorems for variable exponent Sobolev spaces, J. Math. Anal. Appl. 339 (2) (2008) 1395–1412.
[21] X.-L. Fan, S. Wang, D. Zhao, Density of C^∞(Ω) in W^{1,p(x)}(Ω) with discontinuous exponent p(x), Math. Nachr. 279 (1–2) (2006) 142–149.
[22] M. Frazier, B. Jawerth, Decomposition of Besov spaces, Indiana Univ. Math. J. 34 (1985) 777–799.
[23] M. Frazier, B. Jawerth, A discrete transform and decompositions of distribution spaces, J. Funct. Anal. 93 (1990) 34–170.
[24] M. Frazier, S. Roudenko, Matrix-weighted Besov spaces and conditions of Ap type for 0 < p ≤ 1, Indiana Univ. Math. J. 53 (5) (2004) 1225–1254.
[25] M. Frazier, S. Roudenko, Traces and extensions of matrix-weighted Besov spaces, Bull. London Math. Soc. 40 (2) (2008) 181–192.
[26] P. Gurka, P. Harjulehto, A. Nekvinda, Bessel potential spaces with variable exponent, Math. Inequal. Appl. 10 (3) (2007) 661–676.
[27] P. Harjulehto, P. Hästö, Sobolev inequalities for variable exponents attaining the values 1 and n, Publ. Mat. 52 (2) (2008) 347–363.
[28] P. Harjulehto, P. Hästö, V. Latvala, Minimizers of the variable exponent, non-uniformly convex Dirichlet energy, J. Math. Pures Appl. (9) 89 (2) (2008) 174–197.
[29] P. Hästö, On the density of smooth functions in variable exponent Sobolev space, Rev. Mat. Iberoamericana 23 (1) (2007) 215–237.
[30] O. Kováčik, J. Rákosník, On spaces Lp(x) and W^{1,p(x)}, Czechoslovak Math. J. 41 (116) (1991) 592–618.
[31] T. Kühn, H.-G. Leopold, W. Sickel, L. Skrzypczak, Entropy numbers of embeddings of weighted Besov spaces. III. Weights of logarithmic type, Math. Z. 255 (1) (2007) 1–15.
[32] D.S. Kurtz, Littlewood–Paley and multiplier theorems on weighted Lp spaces, Trans. Amer. Math. Soc. 259 (1) (1980) 235–254.
[33] H.-G. Leopold, Pseudodifferentialoperatoren und Funktionenräume variabler Glattheit, Dissertation B, Friedrich-Schiller-Universität, Jena, 1987.
[34] H.-G. Leopold, On Besov spaces of variable order of differentiation, Z. Anal. Anwend. 8 (1) (1989) 69–82.
[35] H.-G. Leopold, Interpolation of Besov spaces of variable order of differentiation, Arch. Math. (Basel) 53 (2) (1989) 178–187.
[36] H.-G. Leopold, On function spaces of variable order of differentiation, Forum Math. 3 (1991) 633–644.
[37] H.-G. Leopold, Embedding of function spaces of variable order of differentiation in function spaces of variable order of integration, Czechoslovak Math. J. 49(124) (3) (1999) 633–644.
[38] H.-G. Leopold, E. Schrohe, Trace theorems for Sobolev spaces of variable order of differentiation, Math. Nachr. 179 (1996) 223–245.
[39] A. Lerner, Some remarks on the Hardy–Littlewood maximal function on variable Lp spaces, Math. Z. 251 (3) (2005) 509–521.
[40] S. Levine, An adaptive variational model for image decomposition, in: Energy Minimization Methods in Computer Vision and Pattern Recognition, in: Lecture Notes in Comput. Sci., vol. 3757, Springer-Verlag, 2005, pp. 382–397.
[41] W. Orlicz, Über konjugierte Exponentenfolgen, Studia Math. 3 (1931) 200–212.
[42] G. Mingione, Regularity of minima: An invitation to the dark side of the calculus of variations, Appl. Math. 51 (2006) 355–425.
[43] A. Nekvinda, Hardy–Littlewood maximal operator on Lp(x)(Rn), Math. Inequal. Appl. 7 (2) (2004) 255–266.
[44] L. Pick, M. Růžička, An example of a space Lp(x) on which the Hardy–Littlewood maximal operator is not bounded, Expo. Math. 19 (2001) 369–371.
[45] K. Rajagopal, M. Růžička, On the modeling of electrorheological materials, Mech. Res. Comm. 23 (1996) 401–407.
[46] S. Roudenko, Matrix-weighted Besov spaces, Trans. Amer. Math. Soc. 355 (2003) 273–314.
[47] M. Růžička, Electrorheological Fluids: Modeling and Mathematical Theory, Lecture Notes in Math., vol. 1748, Springer-Verlag, Berlin, 2000.

[48] M. Růžička, Modeling, mathematical and numerical analysis of electrorheological fluids, Appl. Math. 49 (6) (2004) 565–609.
[49] S. Samko, On a progress in the theory of Lebesgue spaces with variable exponent: Maximal and singular operators, Integral Transforms Spec. Funct. 16 (5–6) (2005) 461–482.
[50] J. Schneider, Function spaces with negative and varying smoothness, Banach Center Publ. 79 (2007) 187–195.
[51] J. Schneider, Function spaces of varying smoothness I, Math. Nachr. 280 (16) (2007) 1801–1826.
[52] R. Schneider, C. Schwab, Wavelet solution of variable order pseudodifferential equations, preprint, 2006.
[53] H. Triebel, Theory of Function Spaces, Monogr. Math., vol. 78, Birkhäuser-Verlag, Basel, 1983.
[54] H. Triebel, Theory of Function Spaces II, Monogr. Math., vol. 84, Birkhäuser-Verlag, Basel, 1992.
[55] H. Triebel, Theory of Function Spaces III, Monogr. Math., vol. 100, Birkhäuser-Verlag, Basel, 2006.
[56] J. Vybíral, Sobolev and Jawerth embeddings for spaces with variable smoothness and integrability, Ann. Acad. Sci. Fenn. Math., in press.
[57] J.-S. Xu, Variable Besov and Triebel–Lizorkin spaces, Ann. Acad. Sci. Fenn. Math. 33 (2) (2008) 511–522.
[58] J.-S. Xu, The relation between variable Bessel potential spaces and Triebel–Lizorkin spaces, Integral Transforms Spec. Funct. 19 (8) (2008) 599–605.
[59] V. Zhikov, On the density of smooth functions in Sobolev–Orlicz spaces, in: Kraev. Zadachi Mat. Fiz. i Smezh. Vopr. Teor. Funkts. 35 [34], Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 310 (2004) 67–81, 226 (in Russian), translation in J. Math. Sci. (N.Y.) 132 (3) (2006) 285–294.


Spectral radius, index estimates for Schrödinger operators and geometric applications

Bruno Bianchini, Luciano Mari, Marco Rigoli ∗

Dipartimento di Matematica, Università degli Studi di Milano, Via Saldini 50, I-20133 Milano, Italy

Received 12 November 2007; accepted 23 January 2009

Communicated by L. Gross

Abstract

In this paper we study the existence of a first zero and the oscillatory behavior of solutions of the ordinary differential equation $(vz')' + Avz = 0$, where $A$, $v$ are functions arising from geometry. In particular, we introduce a new technique to estimate the distance between two consecutive zeros. These results are applied in the setting of complete Riemannian manifolds: in particular, we prove index bounds for certain Schrödinger operators, and an estimate of the growth of the spectral radius of the Laplacian outside compact sets when the volume growth is faster than exponential. Applications to the geometry of complete minimal hypersurfaces of Euclidean space, to minimal surfaces and to the Yamabe problem are discussed.

© 2009 Elsevier Inc. All rights reserved.

Keywords: Spectral radius; Index estimates; Minimal surfaces; Positioning of zeroes

* Corresponding author.
E-mail addresses: [email protected] (B. Bianchini), [email protected] (L. Mari), [email protected] (M. Rigoli).
0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.021
B. Bianchini et al. / Journal of Functional Analysis 256 (2009) 1769–1820

1. Introduction

Radialization techniques are a powerful tool in investigating complete Riemannian manifolds. In favourable circumstances these lead to the study of an ordinary differential equation in order to control the solutions of a given partial differential equation. In this respect, one of the challenging problems involved is the study of the sign of the solutions of the ODE, and the positioning of the possible zeros. In this paper we determine some conditions ensuring the oscillatory behavior, the existence of zeros and their positioning, of a solution $z(t)$ of the following Cauchy problem:

$$\begin{cases} (v(t) z'(t))' + A(t) v(t) z(t) = 0 & \text{on } (0, +\infty), \\ z'(t) = O(1) \text{ as } t \downarrow 0^+, \quad z(0^+) = z_0 > 0, \end{cases} \tag{1.1}$$

where $v(t)$, $A(t)$ are non-negative functions. The application of these results to the geometric problems we shall consider below leads us to assume the following structural conditions:

$$A(t) \in L^\infty_{\mathrm{loc}}([0, +\infty)), \quad A(t) \ge 0, \quad A(t) \not\equiv 0,$$
$$0 \le v(t) \in L^\infty_{\mathrm{loc}}([0, +\infty)), \quad 1/v(t) \in L^\infty_{\mathrm{loc}}((0, +\infty)),$$
$$v(t) \text{ is non-decreasing near } 0 \text{ and } \lim_{t \to 0^+} v(t) = 0.$$

Of course, the requirements $A, v \ge 0$, $A \not\equiv 0$ are understood in the $L^\infty_{\mathrm{loc}}$ sense, while the last requirement means that there exists a version of $v(t)$ which is non-decreasing in a neighborhood of zero and whose limit as $t \to 0^+$ is equal to zero. Due to the weak regularity of $v$ and $A$, solutions $z(t)$ of (1.1) are not expected to be classical, and the Cauchy problem is expected to hold almost everywhere (a.e.) on $(0, +\infty)$. Equivalently (integrating and using the condition at zero), we are interested in solutions $z(t)$ of the integral equation

$$z(t) = z_0 - \int_0^t \frac{1}{v(s)} \int_0^s A(x) v(x) z(x)\, dx\, ds.$$

For our purposes we shall look for $z(t) \in \mathrm{Lip}_{\mathrm{loc}}([0, +\infty))$, that is, locally Lipschitz solutions. Note that the locally Lipschitz condition near zero ensures that $z'(t) = O(1)$ holds almost everywhere in a neighborhood of zero. The existence of such solutions under our assumptions will be given in the Appendix, where we will also prove that the zeros of $z(t)$, if any, are attained at isolated points. The Cauchy problem (1.1) is a somewhat "integrated" version of that presented in [2], in the sense that, as we shall see, in the geometric applications the role of $v(t)$ will be played by the volume growth of geodesic spheres of some complete Riemannian manifold $M$, and $A(t)$ will represent the spherical mean of some given function $a(x)$. However, the techniques introduced here are completely different from those in [2], and are reminiscent of some in the work of Do Carmo and Zhou [8]. Nevertheless, as in [2], we recognize an explicit critical function $\chi(t)$, depending only on $v(t)$, which serves as a border line for the behavior of $z(t)$: roughly speaking, if $A(t)$ is much greater than $\chi(t)$ in some region, then $z(t)$ has a first zero, while if $A(t)$ is not greater than $\chi(t)$ there are examples of positive solutions. We will see that $\chi(t)$ generalizes the critical functions presented in [2]. Using $\chi(t)$ we will provide a condition in finite form for the existence and localization of a first zero of $z(t)$ (Corollary 2.3), and a sharp condition for the oscillatory behavior (Corollary 2.4). In particular, this latter corollary improves on the application of the Hille–Nehari oscillation theorem (see [10]) to (1.1).
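To get a feeling for the first zero of a solution of (1.1), one can integrate the problem numerically in smooth model cases. The sketch below is only an illustration under assumptions that are not in the paper: it takes smooth $v$ and $A$ (not the weak regularity allowed above), rewrites (1.1) as a first-order system in $(z, w)$ with $w = v z'$, starts at a small positive time with $w = 0$, and uses a fixed-step Runge–Kutta scheme. The name `first_zero` and all numerical parameters are hypothetical choices. Two classical cases serve as checks: for $v(t) = t^2$, $A \equiv 1$ the solution is $z_0 \sin(t)/t$, with first zero at $\pi$; for $v(t) = t$, $A \equiv 1$ it is $z_0 J_0(t)$, with first zero near $2.4048$.

```python
import math


def first_zero(v, A, z0=1.0, t0=1e-6, t_max=50.0, h=1e-3):
    """Integrate (v z')' + A v z = 0 as a system in (z, w), where w = v z'.

    Returns the first t > t0 at which z changes sign (located by linear
    interpolation), or None if z stays positive up to t_max.
    """
    def rhs(t, z, w):
        # z' = w / v,  w' = -A v z
        return w / v(t), -A(t) * v(t) * z

    # w(0+) = 0 because v(0+) = 0 and z' = O(1) near zero.
    t, z, w = t0, z0, 0.0
    while t < t_max:
        k1z, k1w = rhs(t, z, w)
        k2z, k2w = rhs(t + h / 2, z + h / 2 * k1z, w + h / 2 * k1w)
        k3z, k3w = rhs(t + h / 2, z + h / 2 * k2z, w + h / 2 * k2w)
        k4z, k4w = rhs(t + h, z + h * k3z, w + h * k3w)
        z_new = z + h / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
        w_new = w + h / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        if z_new <= 0.0:
            return t + h * z / (z - z_new)
        t, z, w = t + h, z_new, w_new
    return None


if __name__ == "__main__":
    # Model case v(t) = t^2, A = 1: exact solution sin(t)/t, first zero at pi.
    print(first_zero(lambda t: t * t, lambda t: 1.0), math.pi)
```

The scheme is deliberately naive near the singular point $t = 0$; for rough or rapidly growing $v$ a more careful treatment would be needed.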


The key technical result of the paper is Theorem 4.1 which, under very general assumptions, estimates the distance between two consecutive zeros of an oscillatory solution of (1.1): denoting with $T_1(\tau) < T_2(\tau)$ the first two consecutive zeros of $z(t)$ after $t = \tau$, Theorem 4.1 states that

$$T_2(\tau) - T_1(\tau) = O(\tau) \quad \text{as } \tau \to +\infty.$$

This result is achieved using a new but elementary technique which greatly improves on the application of Sturm-type arguments to (1.1). Roughly speaking, the estimate is obtained by carefully controlling the level sets of the solution of the Riccati equation associated to (1.1). Moreover, in case

$$v(t) \le f(t) = \Lambda \exp\big\{ a t^\alpha \log^\beta t \big\}, \quad \Lambda, a, \alpha > 0, \ \beta \ge 0,$$

we provide an upper estimate for

$$\limsup_{\tau \to +\infty} \frac{T_2(\tau)}{\tau}$$

with an explicit constant depending only on $\alpha$ and the growth of $A(t)$ with respect to $\chi(t)$ (more precisely, with respect to a critical curve $\chi_f(t)$ modelled on $f(t)$ instead of $v(t)$).

There are several geometric applications of the above results; the main idea is that (1.1) naturally appears in spectral estimates. We will follow two slightly different ways. On the one hand, we will provide an index estimate for Schrödinger type operators $L = \Delta + a(x)$, while, on the other hand, we will bound from above the growth of the spectral radius of the Laplacian outside geodesic balls, even when the volume growth of the manifold is faster than exponential. Applications naturally arise in the setting of minimal hypersurfaces of Euclidean space, their Gauss map, minimal surfaces and the Yamabe problem. We state these geometric results in the next subsections.

1.1. The geometric setting

From now on, we let $(M, \langle\,,\rangle)$ denote a connected, geodesically complete, non-compact Riemannian manifold of dimension $m \ge 2$. Fix an origin $o \in M$ and let $r(x) = \operatorname{dist}(x, o)$ be the distance function from $o$. It is well known that $r(x)$ is a Lipschitz function on $M$ which is smooth outside $o$ and its cut-locus $\operatorname{cut}(o)$. For later use we briefly recall some basic facts on the cut-locus in case $M$ is geodesically complete; the interested reader can consult, for instance, [21, pp. 267–275]. Denote with $\exp$ the exponential map $\exp : T_oM \to M$, which, by the Hopf–Rinow theorem, is surjective and defined on the whole $T_oM$. The origin $o$ is called a pole of $M$ if it has no conjugate points; for example, this is the case if the sectional curvature of $M$ is non-positive. It turns out that, if $o$ is a pole, $\exp$ is a covering map, hence a diffeomorphism if $M$ is simply connected. For every $w \in T_oM$ such that $|w| = 1$, we indicate

with $\gamma_w : [0, +\infty) \to M$ the geodesic ray starting from $o$ with velocity 1 in the direction of $w$, and we consider

$$t_w = \sup\big\{ s \in [0, +\infty) \text{ such that } r(\gamma_w(s)) = s \big\}.$$

Clearly, $t_w > 0$ because of the existence of geodesic neighborhoods. If $t_w < +\infty$, we define the cut-point of $o$ along $\gamma_w$ as $\gamma_w(t_w)$. The cut-locus of $o$ is defined as the union of the cut-points of $o$ along every geodesic ray. In other words, $\operatorname{cut}(o) = \exp(\Sigma)$, where

$$\Sigma = \big\{ tw \in T_oM : |w| = 1 \text{ and } t = t_w < +\infty \big\}.$$

It is easy to see that, if $r(\gamma_w(s)) = s$ for some $s > 0$, then the same equality holds for every $t \in [0, s)$. Therefore, if $t_w < +\infty$ then $\gamma_w$ is length minimizing for every $t \in (0, t_w]$ and it does not minimize length for any $t \in (t_w, +\infty)$. By the Hopf–Rinow theorem we argue that the exponential map restricted to the set $U \cup \Sigma$, where

$$U = \big\{ tw \in T_oM : |w| = 1 \text{ and } t < t_w \big\},$$

is still surjective, hence $\exp(U) = M \setminus \operatorname{cut}(o) = \operatorname{cut}(o)^c$. One can prove that

– $\operatorname{cut}(o)$ is a zero measure, closed subset of $M$, hence $U = \exp^{-1}\{\operatorname{cut}(o)^c\}$ is open in $T_oM$.
– $M$ is compact if and only if, for every $w \in T_oM$, $t_w < +\infty$.
– $p \in \operatorname{cut}(o)$ if and only if either it is a conjugate point of $o$, or there exist at least 2 distinct geodesics joining $o$ to $p$ with the same length. The two possibilities are not mutually exclusive.
– For every $q \in \exp(U)$, there exists a unique minimizing geodesic from $o$ to $q$. In other words, $\exp : U \to \operatorname{cut}(o)^c$ is a bijection (indeed, a diffeomorphism).

We indicate with $B_r$ the geodesic ball of radius $r$ centered at $o$, with $\partial B_r$ its boundary and we call $\partial B_r \cap \operatorname{cut}(o)^c$ the regular part of $\partial B_r$. The regular part of $\partial B_r$ is an open set in the induced topology on $\partial B_r$, and $\partial B_r \cap \operatorname{cut}(o)^c$ is diffeomorphic, through the exponential map, to the set $U \cap S^{m-1}(r)$, where $S^{m-1}(r)$ is the hypersphere

$$S^{m-1}(r) = \big\{ w \in T_oM : |w| = r \big\}.$$

We denote with $\operatorname{Vol}(\partial B_r)$ the $(m-1)$-dimensional volume of $\partial B_r$, that is, the Hausdorff measure of $\partial B_r$. It turns out that it coincides with the induced Riemannian measure when restricted to the regular part of $\partial B_r$. The points of $\partial B_r \cap \operatorname{cut}(o)$ may be the image of many points of $\Sigma \cap S^{m-1}(r)$. For this reason, indicating with $\theta$ a point of the unit sphere $S^{m-1} = S^{m-1}(1) \subset T_oM$, we define the multiplicity function

$$n_r(\theta) = \text{cardinality of } \big\{ \varphi \in S^{m-1} : \exp(r\theta) = \exp(r\varphi) \big\} \le +\infty.$$

This coincides with the number of distinct minimizing geodesic segments joining $o$ to $q = \exp(r\theta)$, which, analogously, we denote with $n_r(q)$. According to the work of Grimaldi and Pansu [18], if we set

$$\chi_r(\theta) = \begin{cases} 1 & \text{if } r < t_\theta, \\ 1/n_r(\theta) & \text{if } r = t_\theta, \\ 0 & \text{if } r > t_\theta, \end{cases} \quad \text{and} \quad \chi_{r^\pm}(\theta) = \lim_{t \to r^\pm} \chi_t(\theta),$$

then the Hausdorff measure of $\partial B_r$ is given by

$$\operatorname{Vol}(\partial B_r) = \int_{S^{m-1}} \Theta(r, \theta) \chi_r(\theta)\, d\theta,$$

where $\Theta(r, \theta)$ is the density of the Riemannian measure. Moreover, by the dominated convergence theorem,

$$\lim_{t \to r^\pm} \operatorname{Vol}(\partial B_t) = \int_{S^{m-1}} \Theta(r, \theta) \chi_{r^\pm}(\theta)\, d\theta.$$

Therefore, in general circumstances $\operatorname{Vol}(\partial B_r)$ may present discontinuities of the "first kind," that is, at a point $r > 0$ we always have the existence of finite limits both from the right and from the left, possibly with two different values. Indeed, setting $v(r) = \operatorname{Vol}(\partial B_r)$, it is shown in [18] that for every complete Riemannian manifold

$$v(r^+) - v(r^-) = -2 \operatorname{Vol}\big(\partial B_r \cap \operatorname{cut}(o)\big).$$

The key ingredient of their proof is a technical lemma which shows that, up to a set of $(m-1)$-dimensional measure zero, $\partial B_r \cap \operatorname{cut}(o)$ is made up of points having exactly 2 distinct geodesics which minimize distance from $o$. Observe that $v(t)$ jumps downward and that, a priori, the discontinuities of $v(t)$ may be non-isolated. Note also that, from the definition of $\chi_r(\theta)$, we get

$$\chi_{r^+}(\theta) = \begin{cases} 1 & \text{if } r < t_\theta, \\ 0 & \text{if } r = t_\theta, \\ 0 & \text{if } r > t_\theta, \end{cases} \qquad \chi_{r^-}(\theta) = \begin{cases} 1 & \text{if } r < t_\theta, \\ 1 & \text{if } r = t_\theta, \\ 0 & \text{if } r > t_\theta, \end{cases}$$

hence from $\chi_{r^+} \le \chi_r \le \chi_{r^-}$ we deduce that $v(t) \in [v(t^+), v(t^-)]$. Therefore, a necessary and sufficient condition for $\operatorname{Vol}(\partial B_r)$ to be continuous on $[0, +\infty)$ is given by the "transversality condition"

$$\operatorname{Vol}\big(\partial B_r \cap \operatorname{cut}(o)\big) = 0 \quad \forall r \ge 0.$$

However, this reasonable requirement is sometimes not easy to verify. This is the case, for example, when one constructs manifolds as immersed submanifolds of some ambient space. This suggests working with discontinuous volume functions $\operatorname{Vol}(\partial B_r)$, which will take the role of $v$ in (1.1). The next result will prove important in what follows.

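Proposition 1.2. Let $v(r) = \operatorname{Vol}(\partial B_r)$ be the volume of geodesic spheres of a connected, complete, non-compact Riemannian manifold. Then $v(r)$ is continuous and increasing in a neighborhood of $r = 0$. Furthermore,

$$v(r) = \frac{v(r^+) + v(r^-)}{2}, \qquad v(r) > 0 \text{ for } r > 0, \qquad \frac{1}{v(r)} \in L^\infty_{\mathrm{loc}}((0, +\infty)). \tag{1.2}$$

Proof. The first part is immediate using polar coordinates around zero. As for the first property in (1.2), we denote with $V = \{w \in S^{m-1} : rw \in U\}$ and with $W = \{w \in S^{m-1} : rw \in \Sigma\}$. Since $rV = S^{m-1}(r) \cap U$ is open, $V$ is an open set of $S^{m-1}$. In polar coordinates

$$v(r) = \int_{S^{m-1}} \Theta(r, \theta) \chi_r(\theta)\, d\theta \equiv \int_V \Theta(r, \theta)\, d\theta + \int_W \Theta(r, \theta) \frac{1}{n_r(\theta)}\, d\theta = \operatorname{Vol}\big(\partial B_r \cap \operatorname{cut}(o)^c\big) + \operatorname{Vol}\big(\partial B_r \cap \operatorname{cut}(o)\big),$$
$$v(r^+) = \int_{S^{m-1}} \Theta(r, \theta) \chi_{r^+}(\theta)\, d\theta \equiv \int_V \Theta(r, \theta)\, d\theta = \operatorname{Vol}\big(\partial B_r \cap \operatorname{cut}(o)^c\big),$$
$$v(r^-) = \int_{S^{m-1}} \Theta(r, \theta) \chi_{r^-}(\theta)\, d\theta \equiv \int_V \Theta(r, \theta)\, d\theta + \int_W \Theta(r, \theta)\, d\theta = \operatorname{Vol}\big(\partial B_r \cap \operatorname{cut}(o)^c\big) + \int_{\partial B_r \cap \operatorname{cut}(o)} n_r(x)\, d\sigma(x).$$

By the Grimaldi–Pansu lemma [18], up to a set of $(m-1)$-dimensional measure zero, the multiplicity $n_r(x)$ is equal to 2. Therefore, from the above expressions it is immediate to deduce that $v(r^+) + v(r^-) = 2v(r)$. We observe now that if we prove that $1/v \in L^\infty_{\mathrm{loc}}((0, +\infty))$, then $v(r) > 0$ on $(0, +\infty)$. Indeed, assume $v(r_0) = 0$ for some $r_0 \in (0, +\infty)$. Then necessarily $v(r_0^+) = 0$, $v(r_0^-) = 2v(r_0) - v(r_0^+) = 0$ and $1/v$ is unbounded in a neighborhood of $r_0$. It remains to prove that $1/v \in L^\infty_{\mathrm{loc}}((0, +\infty))$, that is, $v(r)$ is bounded away from zero on every compact set $K$ disjoint from $r = 0$. Assume by contradiction that there exists $\{r_k\} \subset K$ such that $v(r_k) \to 0$. By compactness, there exists $\tilde r \in K$ such that $r_k \to \tilde r$. Up to passing to a subsequence we have two cases: $r_k \uparrow \tilde r$ or $r_k \downarrow \tilde r$. In the first case $v(\tilde r^-) = 0$, in the second $v(\tilde r^+) = 0$. However, since $v$ jumps downward, in both cases $v(\tilde r^+) = 0$. We are going to show that

$$\partial B_{\tilde r} \subseteq \operatorname{cut}(o). \tag{1.3}$$

Indeed, let (1.3) be false, and let $q \in \partial B_{\tilde r} \cap \operatorname{cut}(o)^c$. Since $\exp$ is a diffeomorphism in a neighborhood of $q$, we can choose a unique $\theta_0 \in V$ such that $q = \exp(\tilde r \theta_0)$. Moreover, since $U$ is open, from $\tilde r \theta_0 \in U$ we can choose a neighborhood $J$ with compact closure in $U$ of the form

$$J = \big\{ r\theta : r \in (\tilde r - 2\varepsilon, \tilde r + 2\varepsilon),\ \theta \in V_{\theta_0} \big\},$$

where $\varepsilon > 0$ is sufficiently small and $V_{\theta_0}$ is a neighborhood of $\theta_0$ on the unit sphere $S^{m-1}$, independent of $\varepsilon$. Since the Riemannian density $\Theta$ is smooth and positive, there exists $C > 0$ independent of $\varepsilon$ such that $\Theta(r, \theta) \ge C$ on $J$. It follows that

$$v(\tilde r + \varepsilon) = \int_{S^{m-1}} \Theta(\tilde r + \varepsilon, \theta) \chi_{\tilde r + \varepsilon}(\theta)\, d\theta \ge C \int_{V_{\theta_0}} d\theta = C \operatorname{Vol}_{\mathrm{Eucl}}(V_{\theta_0}) \quad \forall \varepsilon.$$

This contradicts $v(\tilde r^+) = 0$ and proves (1.3). By (1.3) we deduce that, for every geodesic ray $\gamma_w$ starting from $o$, there exists $t_w \le \tilde r$ such that $\gamma_w(t_w) \in \operatorname{cut}(o)$. Therefore, $M$ is compact with diameter $\le 2\tilde r$, against our assumptions. □

Let $s(x)$ be the scalar curvature of $(M, \langle\,,\rangle)$. The previous proposition enables us to define the spherical mean

$$S(r) = \frac{1}{\operatorname{Vol}(\partial B_r)} \int_{\partial B_r} s$$

on the whole $(0, +\infty)$. $S(r)$ is continuous in a neighborhood of zero with $\lim_{r \to 0^+} S(r) = s(o)$, and possesses at least the same regularity as $\operatorname{Vol}(\partial B_r)$. In case $(\operatorname{Vol}(\partial B_r))^{-1} \in L^1(+\infty)$ we define the critical function

$$\chi(r) = \bigg( 2 \operatorname{Vol}(\partial B_r) \int_r^{+\infty} \frac{ds}{\operatorname{Vol}(\partial B_s)} \bigg)^{-2} \in L^\infty_{\mathrm{loc}}((0, +\infty)), \tag{1.4}$$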

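that we shall consider below. Since in the sequel we will be concerned with spectral arguments, we briefly recall some definitions. Let $\Delta$ denote the Laplace–Beltrami operator on $M$, and consider a differential operator $L = \Delta + a(x)$, where $a(x) \in C^0(M)$, and a bounded domain $\Omega \subset M$. The $k$th eigenvalue $\lambda_k^L(\Omega)$ of $L$ on $\Omega$ (counted with its multiplicity) is defined by the Rayleigh characterization:

$$\lambda_k^L(\Omega) = \inf_{\substack{V_k \le C_0^\infty(\Omega) \\ \dim(V_k) = k}} \ \sup_{0 \ne \varphi \in V_k} \frac{\int_\Omega |\nabla \varphi|^2 - \int_\Omega a \varphi^2}{\int_\Omega \varphi^2}, \tag{1.5}$$

where we can substitute $C_0^\infty(\Omega)$ with $\mathrm{Lip}_0(\Omega)$. If $\Omega$ has sufficiently regular boundary, $\lambda_1^L(\Omega)$ is achieved by the non-zero solutions of the Dirichlet problem

$$\begin{cases} Lu + \lambda_1^L(\Omega) u = 0 & \text{on } \Omega, \\ u \equiv 0 & \text{on } \partial\Omega. \end{cases} \tag{1.6}$$

Note that $L$ is non-positive on $C_0^\infty(\Omega)$ if and only if $\lambda_1^L(\Omega) \ge 0$. The main example of a non-positive operator on every $\Omega$ is the Laplacian itself. We define the index $\operatorname{ind}_L(\Omega)$ as the number of negative eigenvalues of $-L$. By the Rellich theorem, this number is finite. Indeed, using the Rayleigh characterization

$$\lambda_k^L(\Omega) \ge \lambda_k^\Delta(\Omega) - \|a\|_{L^\infty(\Omega)},$$

therefore $\tilde L = L - \|a\|_{L^\infty(\Omega)}$ is strictly non-positive on $C_0^\infty(\Omega)$, hence it is invertible. The Friedrichs extension of $(-\tilde L)^{-1} : L^2(\Omega) \to L^2(\Omega)$ is a compact operator, so that its spectrum consists of a discrete sequence $\{\lambda_j\}$ of eigenvalues, each of them with finite multiplicity. It follows that the spectrum of $-L$ is $\{\lambda_j - \|a\|_{L^\infty(\Omega)}\}$, and $\operatorname{ind}_L(\Omega)$ is clearly finite. The bottom of the spectrum of $L$ on $M$, also called the first eigenvalue or the spectral radius, $\lambda_1^L(M)$, is defined by

$$\lambda_1^L(M) = \inf\big\{ \lambda_1^L(\Omega) : \Omega \subset M \text{ is a bounded domain} \big\}. \tag{1.7}$$

Let $Z \subset M$ be a subset. We define the first eigenvalue of $L$ on the "punctured" manifold $M \setminus Z$ by

$$\lambda_1^L(M \setminus Z) = \inf\big\{ \lambda_1^L(\Omega) : \Omega \subset M \setminus Z \text{ is a bounded domain} \big\}. \tag{1.8}$$

Similarly, the index of $L$ on $M$ is defined by

$$\operatorname{ind}_L(M) = \sup\big\{ \operatorname{ind}_L(\Omega) : \Omega \subset M \text{ is a bounded domain} \big\}$$

and it may be infinite. Note that $\operatorname{ind}_L(M) = 0$ if and only if $\lambda_1^L(M) \ge 0$.

1.3. Spectral estimates: the two main results

The first theorem deals with the index of $L$.

Theorem 1.4. Let $a(x) \in C^0(M)$. Suppose that the spherical mean $A(r)$ of $a(x)$ is non-negative and not identically null. Consider the following assumptions:

(i) either

$$\big(\operatorname{Vol}(\partial B_r)\big)^{-1} \notin L^1(+\infty)$$

or $(\operatorname{Vol}(\partial B_r))^{-1} \in L^1(+\infty)$ and there exist $0 < R_0 < R_1$ such that $A(r) \not\equiv 0$ on $[0, R_0]$ and

$$\int_{R_0}^{R_1} \Big( \sqrt{A(s)} - \sqrt{\chi(s)} \Big)\, ds > -\frac{1}{2} \bigg( \log \int_{B_{R_0}} a + \log \int_{R_0}^{+\infty} \frac{ds}{\operatorname{Vol}(\partial B_s)} \bigg); \tag{1.9}$$

(ii) either

$$\big(\operatorname{Vol}(\partial B_r)\big)^{-1} \notin L^1(+\infty), \qquad a(x) \notin L^1(M) \tag{1.10}$$

or

$$\big(\operatorname{Vol}(\partial B_r)\big)^{-1} \in L^1(+\infty), \qquad \limsup_{r \to +\infty} \int_R^r \Big( \sqrt{A(s)} - \sqrt{\chi(s)} \Big)\, ds = +\infty \tag{1.11}$$

for some $R$ sufficiently large;

(iii) $(\operatorname{Vol}(\partial B_r))^{-1} \in L^1(+\infty)$,

$$\operatorname{Vol}(\partial B_r) \le \Lambda \exp\big\{ a r^\alpha \log^\beta r \big\} \quad \text{for some } \Lambda, a, \alpha > 0,\ \beta \ge 0,$$

and for some $R > 0$, $c > 1$,

$$\sqrt{A(r)} \ge c\, \frac{a\alpha}{2}\, r^{\alpha - 1} \log^\beta r \quad \forall r \ge R. \tag{1.12}$$

Let $L = \Delta + a(x)$. Then

– under assumption (i), $\lambda_1^L(M) < 0$;
– under assumption (ii), $L$ is unstable at infinity, that is, $\lambda_1^L(M \setminus B_R) < 0$ for every $R > 0$. In particular, $L$ has infinite index;
– under assumption (iii), $L$ is unstable at infinity and

$$\liminf_{r \to +\infty} \frac{\operatorname{ind}_L(B_r)}{\log r} \ge \frac{\alpha}{2 \log\big(\frac{c+1}{c-1}\big)}. \tag{1.13}$$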

We observe that (1.11) and (1.12) are conditions "at infinity" and they are typical of oscillation results. On the other hand, condition (1.9) deserves some special attention since it is in finite form, in the sense that it only involves the behavior of $a(x)$ on a compact set, namely $B_{R_1}$: the left-hand side states how much $a(x)$ must exceed the critical curve on the compact annular region $\overline{B}_{R_1} \setminus B_{R_0}$ in order to have a negative spectral radius, and the right-hand side only depends on the behavior of $a(x)$ near zero (on $B_{R_0}$) and on the geometry at infinity of $M$. Note also that $R_1$ does not appear in the right-hand side of (1.9).

Remark 1.5. By a famous result of Fisher-Colbrie [14], the condition $\mathrm{ind}_L(M) < \infty$ implies stability at infinity (that is, $\lambda_1^L(M \setminus B_R) \ge 0$ for some $R \ge 0$). As far as we know, it is still an open problem to prove the converse, or to provide an explicit counterexample. However, we remark that a sufficient condition to have finite index is that the strict inequality $\lambda_1^L(M \setminus B_R) > 0$ holds for some $R$. For a detailed account of spectral theory for Schrödinger operators on Riemannian manifolds we refer the reader to [5].

The second result can probably be regarded as the core of the paper: it provides a sharp upper bound for the growth of $\lambda_1^\Delta(M \setminus B_R)$ as a (monotone) function of $R$. In the literature, bounds for the spectral radius on $M$ are obtained under at most exponential volume growth of geodesic spheres. On the contrary, Theorem 1.6 also works with faster volume growths. To better appreciate the result that we shall introduce below, we begin with some preliminary considerations.

It is well known that, if $Z$ is any compact subset of $\mathbb{R}^m$, then $\lambda_1^\Delta(\mathbb{R}^m \setminus Z) = 0$. Extending a result of Cheng and Yau [12], Brooks [13] has shown that if the manifold $(M, \langle\,,\rangle)$ has at most sub-exponential volume growth then $\lambda_1^\Delta(M) = 0$. However, if we puncture the manifold by a compact set $Z \ne \emptyset$, contrary to the case of $\mathbb{R}^m$, it may happen that $\lambda_1^\Delta(M \setminus Z) \ne 0$. Indeed, Do Carmo and Zhou [8] give an example where $\mathrm{Vol}(M) < +\infty$ and

\[
\lambda_1^\Delta(M \setminus \overline{B}_1) \ge \frac14.
\]



Moreover, up to the missing requirement of continuity of $\mathrm{Vol}(\partial B_r)$, they prove (see also [13], where slightly more general results are proved by a different method) that in case $M$ has infinite volume,

– if $M$ has sub-exponential volume growth of geodesic spheres, then

\[
\lambda_1^\Delta(M \setminus B_R) = 0 \qquad \forall R \ge 0; \tag{1.14}
\]

– if $\mathrm{Vol}(\partial B_r) \le C e^{ar}$ for some $C, a > 0$, then

\[
\lambda_1^\Delta(M \setminus B_R) \le \frac{a^2}{4} \qquad \forall R \ge 0. \tag{1.15}
\]

It is interesting to see what happens when the volume growth is faster than exponential. Towards this aim, we extend Do Carmo and Zhou's example to grasp the situation a step further. Thus we consider the model, in the sense of Greene and Wu, $(M, ds^2) = (\mathbb{R}^m, ds^2)$, with metric given in polar coordinates by

\[
ds^2 = dr^2 + h(r)^2\,d\theta^2, \tag{1.16}
\]

where $h \in C^\infty([0,+\infty))$ is positive on $(0,+\infty)$ and satisfies

\[
h(r) =
\begin{cases}
r & \text{on } [0,1],\\[2pt]
\exp\Big(\dfrac{a r^\alpha}{m-1}\Big) & \text{on } [2,+\infty)
\end{cases}
\tag{1.17}
\]

for some $a > 0$, $\alpha \ge 1$. Note that (1.16) extends smoothly at the origin because of the definition of $h$ near $0$, and that, for $r \ge 2$, $\mathrm{Vol}(\partial B_r) = \exp\{a r^\alpha\}$. We let $b \in (0,a)$ and set

\[
u_b(x) = e^{-b\,r(x)^\alpha} \qquad \text{on } M \setminus B_2. \tag{1.18}
\]

A simple check shows that

\[
\Delta u_b + \lambda_b(r)\,u_b = 0 \qquad \text{on } M \setminus B_2,
\]

where $\lambda_b(r)$ is defined as

\[
\lambda_b(r) = \alpha^2 b(a-b)\,r^{2(\alpha-1)} + \alpha(\alpha-1)\,b\,r^{\alpha-2}. \tag{1.19}
\]

Observe that, in case $\alpha = 1$, $\lambda_b(r) \equiv b(a-b)$, while, if $\alpha > 1$, $\lambda_b(r)$ is strictly increasing on $(R_0,+\infty)$, with $R_0$ sufficiently large that $2\alpha(a-b)R_0^\alpha + (\alpha-2) > 0$. Up to further enlarging $R_0$, we can also assume that

\[
\frac{\alpha-1}{2\alpha}\,\frac{1}{r^\alpha} < \frac{a}{2} \qquad \text{for } r \ge R_0. \tag{1.20}
\]



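Applying a result of Cheng and Yau [12], we have that, for every $b \in (0,a)$, $R \ge R_0$,

\[
\lambda_1^\Delta(M \setminus B_R) \ge \inf_{M\setminus B_R}\Big(-\frac{\Delta u_b}{u_b}\Big) = \inf_{[R,+\infty)} \lambda_b(r) = \lambda_b(R).
\]

The choice

\[
\tilde b = \frac{a}{2} + \frac{\alpha-1}{2\alpha}\,\frac{1}{R^\alpha}
\]

maximizes $\lambda_b(R)$, and $\tilde b \in (0,a)$ because of (1.20). Then, for $R \ge R_0$,

\[
\lambda_1^\Delta(M \setminus B_R) \ge \alpha^2 R^{2(\alpha-1)}\left[\frac{a^2}{4} - \frac{(\alpha-1)^2}{4\alpha^2}\,\frac{1}{R^{2\alpha}}\right]. \tag{1.21}
\]

Note that for $\alpha = 1$ the above reduces to

\[
\lambda_1^\Delta(M \setminus B_R) \ge \frac{a^2}{4}.
\]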
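The maximization step above can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates $\lambda_b(R)$ from (1.19) on a grid of $b \in (0,a)$ and compares the grid maximizer with $\tilde b$. The parameter values and the helper names (`lambda_b`, `b_tilde`, `grid_argmax`) are illustrative assumptions; the closed form $\alpha^2 R^{2(\alpha-1)}\tilde b^2$ for the maximum is obtained here by substituting $\tilde b$ into (1.19), and the lower bound is (1.21) as read from the text.

```python
import math

def lambda_b(b, r, a, alpha):
    """lambda_b(r) as in (1.19)."""
    return (alpha**2 * b * (a - b) * r**(2 * (alpha - 1))
            + alpha * (alpha - 1) * b * r**(alpha - 2))

def b_tilde(R, a, alpha):
    """Candidate maximizer of b -> lambda_b(R): a/2 + (alpha-1)/(2 alpha R^alpha)."""
    return a / 2 + (alpha - 1) / (2 * alpha * R**alpha)

def grid_argmax(R, a, alpha, n=20000):
    """Brute-force maximizer of lambda_b(R) over a uniform grid inside (0, a)."""
    best_b, best_val = 0.0, -math.inf
    for i in range(1, n):
        b = a * i / n
        val = lambda_b(b, R, a, alpha)
        if val > best_val:
            best_b, best_val = b, val
    return best_b, best_val

# Illustrative values only (hypothetical, chosen so that (1.20) holds at r = R).
a, alpha, R = 1.0, 2.0, 3.0
bt = b_tilde(R, a, alpha)
b_num, val_num = grid_argmax(R, a, alpha)

# Value of lambda_b(R) at b = b_tilde, simplified by hand.
closed_form = alpha**2 * R**(2 * (alpha - 1)) * bt**2
# Right-hand side of (1.21) as read from the text.
lower_bound = alpha**2 * R**(2 * (alpha - 1)) * (
    a**2 / 4 - (alpha - 1)**2 / (4 * alpha**2 * R**(2 * alpha)))

print(bt, b_num, val_num, closed_form, lower_bound)
```

In this sketch the grid maximizer agrees with $\tilde b$ up to the grid spacing, and the maximal value is no smaller than the right-hand side of (1.21), consistent with (1.21) being a lower bound.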

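In particular, this shows that the upper bound in Theorem 3.1 in [8] is sharp. This example, for $\mathrm{Vol}(\partial B_r) \le C e^{a r^\alpha}$, $C, a > 0$, $\alpha \ge 1$, suggests looking for an upper bound of $\lambda_1^\Delta(M \setminus B_R)$ of the form $C_1 R^{2(\alpha-1)}$ with $C_1 = C_1(a,\alpha) > 0$. The guess is indeed correct, as Theorem 1.6 shows.

Theorem 1.6. If $M$ is a connected, complete, non-compact Riemannian manifold such that

\[
(\mathrm{Vol}(\partial B_r))^{-1} \in L^1(+\infty), \qquad \mathrm{Vol}(\partial B_r) \le \Lambda \exp\big(a r^\alpha \log^\beta r\big) \quad \text{for } r \text{ large},
\]

for some $\Lambda, a, \alpha > 0$, $\beta \ge 0$, the following estimates hold:

– If $0 < \alpha < 1$, then $\lambda_1^\Delta(M \setminus B_R) = 0$ $\forall R \ge 0$.
– If $\alpha \ge 1$, then

\[
\limsup_{R\to+\infty} \frac{\lambda_1^\Delta(M \setminus B_R)}{R^{2(\alpha-1)}\log^{2\beta} R} \le \frac{a^2\alpha^2}{4}\,\inf_{c\in(1,+\infty)}\left\{c^2\Big(\frac{c+1}{c-1}\Big)^{\frac{4(\alpha-1)}{\alpha}}\right\}. \tag{1.22}
\]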
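To get a feeling for the size of the constant in (1.22), one can minimize the factor $c^2\big(\frac{c+1}{c-1}\big)^{4(\alpha-1)/\alpha}$ over $c > 1$ numerically. The sketch below is only illustrative and assumes the constant has exactly this form, as read from (1.22); the function names and the search interval are hypothetical choices. Setting the logarithmic derivative to zero suggests the minimizer $c^* = \big(p + \sqrt{p^2+4}\big)/2$ with $p = 4(\alpha-1)/\alpha$, which the golden-section search is compared against.

```python
import math

def constant_factor(c, alpha):
    """c^2 * ((c+1)/(c-1))^(4(alpha-1)/alpha), the quantity minimized over c > 1."""
    return c**2 * ((c + 1) / (c - 1)) ** (4 * (alpha - 1) / alpha)

def minimise_over_c(alpha, lo=1.0 + 1e-9, hi=50.0, iters=200):
    """Golden-section search on (lo, hi); assumes the factor is unimodal there."""
    inv_phi = (math.sqrt(5) - 1) / 2
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = constant_factor(x1, alpha), constant_factor(x2, alpha)
    for _ in range(iters):
        if f1 < f2:
            # The minimum lies in [lo, x2]: shrink from the right.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = constant_factor(x1, alpha)
        else:
            # The minimum lies in [x1, hi]: shrink from the left.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = constant_factor(x2, alpha)
    c_star = (lo + hi) / 2
    return c_star, constant_factor(c_star, alpha)

def predicted_minimiser(alpha):
    """Critical point of the logarithm of the factor: (p + sqrt(p^2 + 4)) / 2."""
    p = 4 * (alpha - 1) / alpha
    return (p + math.sqrt(p * p + 4)) / 2

c2, val2 = minimise_over_c(2.0)   # alpha = 2: Gaussian-type volume growth
c1, val1 = minimise_over_c(1.0)   # alpha = 1: exponential volume growth
print(c2, val2, c1, val1)
```

For $\alpha = 1$ the exponent vanishes and the infimum is approached as $c \to 1^+$ with value $1$, so the bound reduces to $a^2/4$, in agreement with (1.15). For $\alpha = 2$ the search returns $c \approx 1 + \sqrt2$.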

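Remark 1.7. Note that $(\mathrm{Vol}(\partial B_r))^{-1} \in L^1(+\infty)$ implies $\mathrm{Vol}(M) = \infty$. This follows from the Schwarz inequality

\[
\int_r^R \frac{ds}{\mathrm{Vol}(\partial B_s)} \int_r^R \mathrm{Vol}(\partial B_s)\,ds \ge (R-r)^2
\]

letting $R \to +\infty$.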



We stress that the hypothesis $\mathrm{Vol}(M) = \infty$ is essential. In fact, the example of Do Carmo and Zhou quoted above shows that the theorem fails if $\mathrm{Vol}(M) < \infty$. On the contrary, the stronger assumption $(\mathrm{Vol}(\partial B_r))^{-1} \in L^1(+\infty)$ is for convenience: if it fails, we will show in Lemma 5.13 that $\lambda_1^\Delta(M \setminus B_R) = 0$ for every $R \ge 0$. We underline that in Theorem 1.6 we have been considering volume growth assumptions, which are weaker and more general than the usual curvature conditions used in estimating $\lambda_1^\Delta(M)$ (see for instance [15]). It is also worth mentioning that the problem of estimating $\lambda_1^\Delta(M \setminus B_R)$ from above arises naturally in the study of unstable hypersurfaces with constant mean curvature: see for example [8] for details and further references.

1.8. Geometric consequences

The first geometric consequence is the following density theorem for complete minimally immersed hypersurfaces of Euclidean space.

Theorem 1.9. Let $\varphi : M \to \mathbb{R}^{m+1}$ be a minimal hypersurface. We identify $T_xM$ with $\varphi_* T_xM$ viewed as an affine hyperplane in $\mathbb{R}^{m+1}$ passing through $\varphi(x)$. Assume that

\[
(\mathrm{Vol}(\partial B_r))^{-1} \notin L^1(+\infty), \qquad s(x) \notin L^1(M) \tag{1.23}
\]

or that

\[
(\mathrm{Vol}(\partial B_r))^{-1} \in L^1(+\infty), \qquad \mathrm{Vol}(\partial B_r) \le \Lambda \exp\{r^\alpha\}, \tag{1.24}
\]

\[
S(r) \le -\frac{C}{r^\mu} \tag{1.25}
\]

for $r \gg 1$ and some constants $C, \Lambda, \alpha > 0$, $\mu \in \mathbb{R}$, with

\[
2\alpha < 2 - \mu. \tag{1.26}
\]

Then, for every compact set $\Omega \subseteq M$,

\[
\bigcup_{x \in M\setminus\Omega} T_xM \equiv \mathbb{R}^{m+1}. \tag{1.27}
\]

We note that Halpern [6] has proved that, when the hypersurface is compact and orientable, $\bigcup_{x\in M} T_xM \not\equiv \mathbb{R}^{m+1}$ if and only if $M$ is embedded as the boundary of an open star-shaped domain of $\mathbb{R}^{m+1}$. In case $M$ is non-compact there are many examples with $\bigcup_{x\in M} T_xM \not\equiv \mathbb{R}^{m+1}$, for instance cylinders over suitable curves. However, in case $m = 2$, complete minimal surfaces in $\mathbb{R}^3$ for which $\bigcup_{x\in M} T_xM \not\equiv \mathbb{R}^3$ are planes: this has been proved by Hasanis and Koutroufiotis in [9]. In an analogous way, we prove the following result.

Theorem 1.10. Let $\varphi : M \to \mathbb{R}^{m+1}$ be a connected, complete non-compact minimal hypersurface in $\mathbb{R}^{m+1}$. Assume that either

\[
(\mathrm{Vol}(\partial B_r))^{-1} \notin L^1(+\infty), \qquad s(x) \notin L^1(M) \tag{1.28}
\]

or that, for some $C, \Lambda, \alpha > 0$, $\mu \in \mathbb{R}$,

\[
(\mathrm{Vol}(\partial B_r))^{-1} \in L^1(+\infty), \qquad \mathrm{Vol}(\partial B_r) \le \Lambda \exp\{r^\alpha\}, \qquad S(r) \le -\frac{C}{r^\mu}, \qquad \text{and } 2\alpha < 2 - \mu. \tag{1.29}
\]

Fix an equator $E$ in $S^m$. Then the spherical Gauss map $\nu$ meets $E$ infinitely many times along a divergent sequence in $M$.

Note that we have not assumed the orientability of $M$; hence, the spherical Gauss map is only locally defined. However, due to the central symmetry of the equators, the conclusion of the theorem does not depend on the chosen local orientation: if $\nu(x) \in E$, then also $-\nu(x) \in E$. As a third consequence of Theorem 1.4, we have the following result of Fisher-Colbrie [14] and Gulliver [19].

Theorem 1.11. Let $N$ be a flat 3-manifold, and let $\varphi : M \to N$ be a simply connected, minimally immersed surface. We denote by $K$ the (necessarily non-positive) sectional curvature of $M$. Consider the stability operator $L = \Delta + |II|^2$. If $M$ is stable at infinity (in particular, if $\mathrm{ind}_L(M) < \infty$), then $M$ is parabolic and

\[
\int_M |K| < +\infty. \tag{1.30}
\]

With the same technique, we recover a well-known result of Do Carmo and Peng [16], Fisher-Colbrie and Schoen [1] and Pogorelov [17].

Corollary 1.12. Let $\varphi : M \to \mathbb{R}^3$ be a minimally immersed surface. If $M$ is stable, then $M$ is totally geodesic (hence, an affine plane).

The last geometrical application employs Theorem 1.4 directly, together with Theorems 2.4 and 2.1 of [4], to yield the following existence result for the Yamabe problem, which requires no assumptions on the Ricci curvature.

Theorem 1.13. Suppose that the dimension of $M$ is $m \ge 3$ and that the spherical mean $S(r)$ satisfies

\[
S(r) \le 0 \quad \text{on } [0,+\infty), \qquad S \not\equiv 0.
\]

Let $k(x) \in C^\infty(M)$ be non-positive on $M$ and strictly negative outside a compact set. Set $K_0 = k^{-1}\{0\}$ and, for

\[
L = \Delta - \frac{1}{c_m}\,s(x), \qquad \text{where } c_m = \frac{4(m-1)}{m-2},
\]

define $\lambda_1^L(K_0) = \sup_D \lambda_1^L(D)$, where $D$ varies among all open sets with smooth boundary containing $K_0$. Suppose

\[
\lambda_1^L(K_0) > 0.
\]

Assume that either $(\mathrm{Vol}(\partial B_r))^{-1} \notin L^1(+\infty)$ or otherwise that there exist $0 < R_0 < R_1$ such that $S \not\equiv 0$ on $[0,R_0]$ and

\[
\int_{R_0}^{R_1}\left(\sqrt{\frac{|S(t)|}{c_m}} - \sqrt{\chi(t)}\right)dt > -\frac12\left(\log\int_{B_{R_0}}\frac{|s(x)|}{c_m} + \log\int_{R_0}^{+\infty}\frac{dt}{\mathrm{Vol}(\partial B_t)}\right). \tag{1.31}
\]

Then, the metric $\langle\,,\rangle$ can be conformally deformed to a new metric of scalar curvature $k(x)$.

As the discussion after Theorem 1.4 suggests, this latter result implies that a strongly negative scalar curvature on a compact region $\Omega$ gives the existence of the conformal deformation independently of the behavior of $s(x)$ outside $\Omega$.

2. Existence of a first zero and oscillations

Fix $R \in (0,+\infty]$ (note that the value $+\infty$ is allowed), and consider the following set of assumptions:

(A1) $0 \le A(t) \in L^\infty_{\mathrm{loc}}([0,R))$, $A \not\equiv 0$ in $L^\infty_{\mathrm{loc}}$ sense;

(V1) $0 \le v(t) \in L^\infty_{\mathrm{loc}}([0,R))$, $\dfrac{1}{v(t)} \in L^\infty_{\mathrm{loc}}((0,R))$, $\lim_{t\to0^+} v(t) = 0$.

In case $1/v \in L^1(R^-)$, we define the critical function

\[
\chi_R(t) = \left(2v(t)\int_t^R\frac{ds}{v(s)}\right)^{-2} = \left[\left(-\frac12\log\int_t^R\frac{ds}{v(s)}\right)'\right]^2 \in L^\infty_{\mathrm{loc}}((0,R)). \tag{2.1}
\]

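For ease of notation we write $\chi(t)$ in case $R = +\infty$. We are now ready to prove:

Theorem 2.1. Let $A$, $v$ satisfy (A1), (V1) and let $z \in \mathrm{Lip}_{\mathrm{loc}}([0,R))$ be a positive solution of

\[
\begin{cases}
(v(t)z'(t))' + A(t)v(t)z(t) = 0 & \text{almost everywhere on } (0,R),\\
z'(t) = O(1) \ \text{as } t \downarrow 0^+, \quad z(0^+) = z_0 > 0.
\end{cases}
\tag{2.2}
\]

Then

\[
\frac1v \in L^1(R^-) \tag{2.3}
\]

and for every $0 < T < t < R$ such that $A \not\equiv 0$ in $L^\infty([0,T])$

\[
\int_T^t\Big(\sqrt{A(s)} - \sqrt{\chi_R(s)}\Big)\,ds \le -\frac12\left(\log\int_0^T A(s)v(s)\,ds + \log\int_T^R\frac{ds}{v(s)}\right). \tag{2.4}
\]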


Proof. We set

\[
y(t) = -\frac{v(t)z'(t)}{z(t)} \qquad \text{on } (0,R). \tag{2.5}
\]

Then $y \in \mathrm{Lip}_{\mathrm{loc}}([0,R))$; this follows since $(vz')' = -Avz \in L^\infty_{\mathrm{loc}}([0,R))$, therefore $vz'$ is locally Lipschitz. Moreover, from (V1) and (2.2) we deduce that $y(0^+) = 0$. Differentiating, we see that $y(t)$ satisfies the Riccati equation

\[
y' = A(t)v(t) + \frac{1}{v(t)}\,y^2 \qquad \text{a.e. on } (0,R). \tag{2.6}
\]

Note that, since $A(t) \not\equiv 0$, $z$ is non-constant and $y \not\equiv 0$. Moreover, $y'(t) \ge 0$ almost everywhere on $(0,R)$. From (A1) and (2.6) it follows that, for every $T > 0$ such that $A \not\equiv 0$ on $[0,T]$,

\[
y(t) \ge y(T) \ge \int_0^T A(s)v(s)\,ds > 0 \qquad \forall t \in [T,R). \tag{2.7}
\]

From (2.6) and the elementary inequality $\varepsilon a^2 + \varepsilon^{-1}b^2 \ge 2|a||b|$, $a, b \in \mathbb{R}$, $\varepsilon > 0$, we also deduce $y' \ge 2\sqrt{A(t)}\,|y(t)|$ and therefore

\[
y' \ge 2\sqrt{A(t)}\,y \qquad \text{a.e. on } [T,R). \tag{2.8}
\]

From (2.7) and (2.8) we infer

\[
y(t) \ge \left(\int_0^T A(s)v(s)\,ds\right) e^{2\int_T^t\sqrt{A(s)}\,ds} \qquad \text{on } [T,R). \tag{2.9}
\]

Moreover, from (2.6) and (A1),

\[
\frac{y'}{y^2} \ge \frac{1}{v(t)} \qquad \text{a.e. on } [T,R). \tag{2.10}
\]

Integrating on $[t, R-\varepsilon]$ for some small $\varepsilon > 0$ we get

\[
\frac{1}{y(t)} \ge \frac{1}{y(R-\varepsilon)} + \int_t^{R-\varepsilon}\frac{ds}{v(s)} \ge \int_t^{R-\varepsilon}\frac{ds}{v(s)}. \tag{2.11}
\]

Letting $\varepsilon \to 0^+$ we obtain (2.3), and using (2.11) in (2.9) we reach the following inequality:

\[
\int_T^t\sqrt{A(s)}\,ds \le -\frac12\log\int_0^T A(s)v(s)\,ds - \frac12\log\int_t^R\frac{ds}{v(s)}. \tag{2.12}
\]

Inequality (2.4) is simply a rewriting of (2.12): it is enough to point out that

\[
-\frac12\log\int_t^R\frac{ds}{v(s)} = -\frac12\log\int_T^R\frac{ds}{v(s)} + \int_T^t\sqrt{\chi_R(s)}\,ds \tag{2.13}
\]

which follows by integrating the definition of $\chi_R(t)$. $\Box$

Although very simple, inequality (2.4) is deep. As we have already stressed in the Introduction, the right-hand side of (2.4) is independent both of $t$ and of the behavior of $A$ after $T$: if (2.4) is contradicted for some $0 < T < t < R$, the left-hand side represents how much $A(t)$ must exceed the critical curve on the compact region $[T,t]$ in order to have a first zero of $z(t)$, and this amount only depends on the behavior of $A$ and $v$ before $T$ (the first addendum of the right-hand side) and on the growth of $v$ after $T$.

For geometrical purposes, from now on we will focus on the case $R = +\infty$. However, the next corollaries can be restated on $(0,R)$ replacing $+\infty$ with $R$ and $\chi(t)$ with $\chi_R(t)$.

Remark 2.2. Consider (2.13) with $R = +\infty$:

\[
-\frac12\log\int_t^{+\infty}\frac{ds}{v(s)} = -\frac12\log\int_T^{+\infty}\frac{ds}{v(s)} + \int_T^t\sqrt{\chi(s)}\,ds,
\]

valid for $1/v \in L^1(+\infty)$. Letting $t \to +\infty$ we deduce that

\[
\sqrt{\chi(t)} \notin L^1(+\infty). \tag{2.14}
\]
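As an illustration of definition (2.1) and of the identity in Remark 2.2, the following sketch evaluates the critical function numerically for the model weight $v(t) = t^{m-1}$, for which a direct computation gives $\chi(t) = (m-2)^2/(4t^2)$. This is a hedged numerical check under stated assumptions, not part of the proof: the quadrature rule (midpoint rule after the substitution $s = t/u$), the function names, and the sample values of $m$, $T$, $t$ are all illustrative choices.

```python
import math

def tail_integral(v, t, n=2000):
    """Approximate int_t^inf ds / v(s) via s = t/u, u in (0, 1], with the
    midpoint rule (which never evaluates the integrand at u = 0)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += (t / u**2) / v(t / u)
    return total * h

def chi(v, t, n=2000):
    """Critical function (2.1) with R = +inf: (2 v(t) int_t^inf ds/v(s))^(-2)."""
    return (2.0 * v(t) * tail_integral(v, t, n)) ** -2

def chi_euclidean(m, t):
    """Closed form of chi for v(t) = t^(m-1), m >= 3."""
    return (m - 2) ** 2 / (4.0 * t**2)

m = 5                                  # illustrative dimension
v = lambda s: s ** (m - 1)
T, t = 1.5, 4.0                        # illustrative endpoints, 0 < T < t

# Both sides of the identity in Remark 2.2; for this v, sqrt(chi(s)) = (m-2)/(2s),
# so the integral of sqrt(chi) over [T, t] equals (m-2)/2 * log(t/T).
lhs = -0.5 * math.log(tail_integral(v, t))
rhs = -0.5 * math.log(tail_integral(v, T)) + 0.5 * (m - 2) * math.log(t / T)

print(chi(v, 2.0), chi_euclidean(m, 2.0), lhs, rhs)
```

Since $\int_T^t\sqrt{\chi}$ grows like $\log t$ here, the non-integrability (2.14) is visible directly in this example.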

Corollary 2.3 (Existence of a first zero). In the assumptions of Theorem 2.1 with $R = +\infty$, suppose that either $1/v \notin L^1(+\infty)$ or otherwise there exist $0 < T < t$ such that

\[
\int_T^t\Big(\sqrt{A(s)} - \sqrt{\chi(s)}\Big)\,ds > -\frac12\left(\log\int_0^T A(s)v(s)\,ds + \log\int_T^{+\infty}\frac{ds}{v(s)}\right). \tag{2.15}
\]

Then, for every solution $z(t) \in \mathrm{Lip}_{\mathrm{loc}}([0,+\infty))$ of (2.2), there exists $T_0 = T_0(z) > 0$ such that $z(T_0) = 0$. Moreover, the first zero is attained on $(0,R]$, where $R > 0$ is the unique real number satisfying

\[
\int_T^t\sqrt{A(s)}\,ds = -\frac12\log\int_0^T A(s)v(s)\,ds - \frac12\log\int_t^R\frac{ds}{v(s)}. \tag{2.16}
\]

Proof. Observe that (2.15) is equivalent to saying that (2.4) with $R = +\infty$ is false for some $0 < T < t$. Hence, the existence of a first zero on $(0,+\infty)$ is immediate from Theorem 2.1. As for the position of $T_0$, note first that (2.4) is a rewriting of (2.12). Suppose that $1/v \in L^1(+\infty)$. We note that the RHS of (2.12) is strictly decreasing as a function of $R \in (t,+\infty)$, $\lim_{R\to t^+}\mathrm{RHS} = +\infty$, and (2.12) is contradicted for $R = +\infty$ by assumption (2.15). Therefore, there exists a unique $R \in (t,+\infty)$ such that (2.16) holds. Choosing $\varepsilon > 0$ and applying Theorem 2.1 on the interval $(0,R+\varepsilon)$ we deduce the existence of a first zero on $(0,R+\varepsilon)$. Letting $\varepsilon \to 0$ we reach the desired conclusion. The case $1/v \notin L^1(+\infty)$ is similar: we restrict the considerations to a finite interval $[0,R]$, with $R > t$ small enough that (2.12) holds on $[0,R]$. Then, we enlarge $R$ until equality is reached in (2.12), and we conclude as in the previous case. $\Box$

Corollary 2.4 (Oscillatory behavior). Fix $t_0 \in (0,+\infty)$. Suppose that (A1), (V1) are met on $[t_0,+\infty)$, with $1/v \in L^\infty_{\mathrm{loc}}([t_0,+\infty))$, and let $z_0 \in \mathbb{R}\setminus\{0\}$. Assume that either

\[
\frac{1}{v(t)} \notin L^1(+\infty), \qquad A(t)v(t) \notin L^1(+\infty) \tag{2.17}
\]

or

\[
\frac{1}{v(t)} \in L^1(+\infty), \qquad \limsup_{t\to+\infty}\int_T^t\Big(\sqrt{A(s)} - \sqrt{\chi(s)}\Big)\,ds = +\infty \tag{2.18}
\]

for some (hence any) $T > t_0$. Then, every solution $z(t) \in \mathrm{Lip}_{\mathrm{loc}}([t_0,+\infty))$ of

\[
\begin{cases}
(v(t)z'(t))' + A(t)v(t)z(t) = 0 & \text{a.e. on } (t_0,+\infty),\\
z(t_0) = z_0,
\end{cases}
\tag{2.19}
\]

is oscillatory.

Proof. First, we claim that the two conditions in (2.18) imply that $A(t)v(t) \notin L^1(+\infty)$. Indeed, from (2.14) and the second condition of (2.18) it follows that $\sqrt{A(t)} \notin L^1(+\infty)$, and from the Cauchy–Schwarz inequality

\[
\int_T^t A(s)v(s)\,ds \int_T^t\frac{ds}{v(s)} \ge \left(\int_T^t\sqrt{A(s)}\,ds\right)^2,
\]

letting $t \to +\infty$ we deduce the claim. Suppose by contradiction that $z(t)$ has eventually constant sign. Up to replacing $z$ with $-z$, we can assume $z(t) > 0$ on $[\tau,+\infty)$, for some $\tau \ge t_0$. We define $y$ as in (2.5). Then $y \in \mathrm{Lip}_{\mathrm{loc}}([\tau,+\infty))$ and satisfies (2.6), hence it is increasing. Integrating we get

\[
y(t) \ge y(T) \ge y(\tau) + \int_\tau^T A(s)v(s)\,ds \qquad \forall t > T > \tau. \tag{2.20}
\]

By assumption, in both cases the non-integrability of $A(t)v(t)$ ensures that there exists $T > \tau$ such that

\[
y(\tau) + \int_\tau^T A(s)v(s)\,ds > 0,
\]

therefore $y > 0$ on $[T,+\infty)$. Now, we argue as in Theorem 2.1. In particular, integrating (2.10) on $[t,R_0]$ we get

\[
\frac{1}{y(t)} \ge \frac{1}{y(t)} - \frac{1}{y(R_0)} \ge \int_t^{R_0}\frac{ds}{v(s)} \qquad \forall R_0 > t > T, \tag{2.21}
\]

so that $1/v \in L^1(+\infty)$, which contradicts (2.17). As for (2.18), from $y' \ge 2y\sqrt{A}$ almost everywhere we deduce

\[
y(t) \ge y(T)\exp\left(2\int_T^t\sqrt{A(s)}\,ds\right) \qquad \forall t > T. \tag{2.22}
\]

Combining (2.20), (2.21), (2.22) and using the definition of $\chi(t)$ we obtain the following inequality:

\[
\int_T^t\Big(\sqrt{A(s)} - \sqrt{\chi(s)}\Big)\,ds \le -\frac12\log\left(y(\tau) + \int_\tau^T A(s)v(s)\,ds\right) - \frac12\log\int_T^{+\infty}\frac{ds}{v(s)}.
\]

Letting $t \to +\infty$ along a sequence realizing (2.18) we reach the desired contradiction. $\Box$

Here are some stronger conditions which imply oscillation, and that will be used in the sequel.

Proposition 2.5. In the assumptions (A1), (V1) on the interval $[t_0,+\infty)$, Eq. (2.19) is oscillatory in the following cases:

– $1/v \in L^1(+\infty)$ and one of the following conditions is satisfied for some $T > t_0$:

(i) $\sqrt{A(t)} \ge \sqrt{\chi(t)}$ a.e. on $[T,+\infty)$ and $\sqrt{A(s)} - \sqrt{\chi(s)} \notin L^1(+\infty)$;

(ii) $\displaystyle\limsup_{t\to+\infty}\frac{\int_T^t\sqrt{A(s)}\,ds}{\int_T^t\sqrt{\chi(s)}\,ds} > 1$;

(iii) $\displaystyle\liminf_{t\to+\infty}\frac{\sqrt{A(t)}}{\sqrt{\chi(t)}} > 1$;

(iv) $\displaystyle\limsup_{t\to+\infty}\frac{\int_T^t\sqrt{A(s)}\,ds}{-\frac12\log\int_t^{+\infty}\frac{ds}{v(s)}} > 1$;

– $v(t) \notin L^1(+\infty)$, $v(t) \le f(t)$ a.e. for some continuous function $f(t)$ such that $1/f \in L^1(+\infty)$, and

(v) $A$ is positive, increasing and $\displaystyle\sqrt{A(t_n)} > \inf_{t>t_n}\left\{-\frac12\,\frac{\log\int_t^{+\infty}\frac{ds}{f(s)}}{t-t_n}\right\}$ for some increasing sequence $\{t_n\} \uparrow +\infty$.

Proof. Implications (i)–(iii) are immediate from (2.14). To obtain (iv) we also use equality (2.13) with $R = +\infty$. Regarding (v), we proceed, by contradiction, as in Corollary 2.4, restricting the problem to $[\tau,+\infty)$, $\tau > t_0$. Since $A(t)$ is increasing, it is bounded from below away from zero on $[\tau,+\infty)$. Therefore, since $v(t) \notin L^1(+\infty)$ we can choose $T > \tau$ such that

\[
y(\tau) + \int_\tau^T A(s)v(s)\,ds \ge 1.
\]

Using the monotonicity of $A$ and $v \le f$, (2.12) becomes

\[
\sqrt{A(T)}\,(t-T) \le \int_T^t\sqrt{A(s)}\,ds \le -\frac12\log\int_t^{+\infty}\frac{ds}{v(s)} \le -\frac12\log\int_t^{+\infty}\frac{ds}{f(s)}
\]

for every $T < t$; (v) contradicts this last chain of inequalities. $\Box$

Corollary 2.4 is related to the classical Hille–Nehari oscillation theorem (see [3]). However, in order to apply the latter to ensure that a solution $z(t)$ of (2.19) is oscillatory, one needs to perform a change of variables which requires $1/v \in L^1(+\infty)$. Therefore, the Hille–Nehari criterion is not straightforwardly applicable when $1/v \notin L^1(+\infty)$. Moreover, in case $1/v \in L^1(+\infty)$, in order to have oscillatory solutions the criterion requires that

\[
\liminf_{t\to+\infty}\sqrt{A(t)}\,v(t)\int_t^{+\infty}\frac{ds}{v(s)} > \frac12 \tag{2.23}
\]

which is exactly request (iii) of Proposition 2.5, using definition (2.1) of $\chi(t)$. It is worth pointing out that (2.18) implies oscillations even in some cases when the "liminf" in (2.23) is equal to $1/2$, a case left undecided by the Hille–Nehari theorem.

3. Why is the critical curve really critical?

In this section we show that Corollary 2.4 is sharp. This will be done by studying the relationship between $\chi(t)$ and the two critical functions introduced in [2]. Consider the "Euclidean" problem

\[
\begin{cases}
(t^{m-1}z'(t))' + A(t)\,t^{m-1}z(t) = 0 & \text{on } (0,+\infty),\\
z'(0^+) = 0, \quad z(0) = z_0 > 0,
\end{cases}
\qquad m \ge 3. \tag{3.1}
\]

In this case, from $v(t) = t^{m-1}$ it is immediate to see that

\[
\chi(t) = \frac{(m-2)^2}{4}\,\frac{1}{t^2}. \tag{3.2}
\]

Suppose that $0 \le A(t) \in C^\infty([0,+\infty))$ is such that, for some $\varepsilon > 0$,

\[
A(t)
\begin{cases}
\le \dfrac{(m-2)^2}{4}\,\dfrac{1}{t^2} & \text{on } [0,\varepsilon),\\[6pt]
= \dfrac{(m-2)^2}{4}\,\dfrac{1}{t^2} & \text{on } [\varepsilon,+\infty).
\end{cases}
\tag{3.3}
\]



Then, problem (3.1) admits a positive solution $0 < z(t) \in C^1([0,+\infty))$ satisfying, by Proposition 4.1 of [2],

\[
C^{-1}\,t^{-\frac{m-2}{2}}\log t \le z(t) \le C\,t^{-\frac{m-2}{2}}\log t
\]

for some positive constant $C$ and $t \gg 1$. Suppose now that $A(t) = H^2/t^2$ on $[\varepsilon,+\infty)$. By Proposition A.4 in Appendix A, there exists a positive solution for every $H \le \frac{m-2}{2}$, while in case $H > \frac{m-2}{2}$ the limit in item (iii) of Proposition 2.5 is

\[
\lim_{t\to+\infty}\frac{\sqrt{A(t)}}{\sqrt{\chi(t)}} = \frac{2H}{m-2} > 1, \tag{3.4}
\]

and by Corollary 2.4 every solution $z(t)$ is oscillatory. Therefore, in the Euclidean case we recognize (3.2) as the correct critical curve for the behavior of $z(t)$.

The hyperbolic case is less immediate. However, fix $B > 0$ and consider

\[
\begin{cases}
(\sinh^{m-1}(Bt)\,z'(t))' + A(t)\sinh^{m-1}(Bt)\,z(t) = 0 & \text{on } (0,+\infty),\\
z'(0^+) = 0, \quad z(0) = z_0 > 0,
\end{cases}
\qquad m \ge 2. \tag{3.5}
\]

In this case $v(t) = \sinh^{m-1}(Bt)$ and the expression of $\chi(t)$ is more complicated. Nevertheless, using De l'Hôpital's theorem, we see that, as $t \to +\infty$,

\[
\chi(t) = \left(\frac{1}{2\sinh^{m-1}(Bt)\int_t^{+\infty}\sinh^{1-m}(Bs)\,ds}\right)^2 \sim \frac{(m-1)^2B^2}{4}\coth(Bt).
\]

Suppose now that $0 \le A(t) \in C^\infty([0,+\infty))$ is such that, for some $\varepsilon > 0$,

\[
A(t)
\begin{cases}
\le \dfrac{(m-1)^2B^2}{4}\coth(Bt) & \text{on } [0,\varepsilon),\\[6pt]
= \dfrac{(m-1)^2B^2}{4}\coth(Bt) & \text{on } [\varepsilon,+\infty).
\end{cases}
\tag{3.6}
\]

Then, (3.5) has a positive solution $z \in C^1([0,+\infty))$ satisfying

\[
C^{-1}\,t\,e^{-\frac{m-1}{2}Bt} \le z(t) \le C\,t\,e^{-\frac{m-1}{2}Bt}
\]

for some appropriate constant $C > 0$ and $t \gg 1$. In case $A(t) = H^2B^2\coth(Bt)$ on $[\varepsilon,+\infty)$, again using Proposition A.4 we deduce that, for every $H \le \frac{m-1}{2}$, there exists a positive solution of (3.5). On the contrary, if $H > \frac{m-1}{2}$ the limit in item (iii) of Proposition 2.5 is strictly greater than 1, hence every solution is oscillatory. The characteristic curve $\chi(t)$ is "asymptotically sharp" even in the hyperbolic case, and numerical evidence shows that it agrees sharply with the curve $\frac{(m-1)^2B^2}{4}\coth(Bt)$ away from $t = 0$.


4. Oscillation estimates: the key result

So far, we have only ensured an oscillatory behavior of solutions of (2.19) in case $A(t)$ is, for example, asymptotic to the critical curve and $\sqrt{A(t)} - \sqrt{\chi(t)}$ is eventually positive and non-integrable at infinity. Under these assumptions, we cannot expect the oscillations to be automatically thick, since we have proved that $\chi(t)$ is sharp as a border line function. Nevertheless, suppose that

\[
\frac{A(t)}{\chi(t)} \ge c > 1 \qquad \text{for } t \gg 1.
\]

In this case, one may expect that the somewhat "uniform" mass of $A(t)$ exceeding $\chi(t)$ can control the distance between zeros from above. The key result, Theorem 4.1, goes in this direction: given two consecutive zeros $T_1(\tau) < T_2(\tau)$ of $z(t)$ after $\tau$, it states that

\[
T_2(\tau) - T_1(\tau) = O(\tau) \qquad \text{as } \tau \to +\infty.
\]

Moreover, in case

\[
v(t) \le f(t) = \Lambda \exp\big(a t^\alpha \log^\beta t\big), \qquad \Lambda, a, \alpha > 0,\ \beta \ge 0, \tag{4.1}
\]

we will be able to estimate the quantity

\[
\limsup_{\tau\to+\infty}\frac{T_2(\tau)}{\tau}.
\]

Theorem 4.1 exploits upper bounds for the function $v(t)$ in terms of some function $f(t)$, instead of dealing with $v(t)$ itself. The necessity of working with such an upper bound needs some preliminary comment. Although the critical function $\chi(t)$ is suitable to describe the oscillatory behavior of (2.19), due to its integral expression in $v(t)$ it is in general not easy to handle. Moreover, $v(t)$ itself can behave very badly since, in our geometric applications, it represents the volume growth of geodesic spheres; indeed, in many situations, such as volume comparison results, one deals only with upper bounds of the volume growth in terms of some known function $f(r)$ which possesses some further regularity property (for example, as we will suppose in the sequel, monotonicity and differentiability). Hence, it would be useful to look for a modified, more manageable critical function depending on $f(t)$ instead of $v(t)$. The most natural way is to define

\[
\chi_f(t) = \left(\frac{1}{2f(t)\int_t^{+\infty}\frac{ds}{f(s)}}\right)^2 = \left[\left(-\frac12\log\int_t^{+\infty}\frac{ds}{f(s)}\right)'\right]^2. \tag{4.2}
\]

Obviously $\chi_f \equiv \chi$ in case $v \equiv f$. It is not hard to see that, if we substitute $\chi(t)$ with $\chi_f(t)$ and $v(t)$ with its upper bound $f(t)$ in the assumptions (with the exception of the terms involving integrals of $A(t)v(t)$), all the conclusions of the theorems of Section 2 are still true.



Unfortunately, despite the further properties of $f$, even this critical function is too difficult to handle in many instances. Hence, we choose the simpler critical function

\[
\tilde\chi_f(t) = \left(\frac{f'(t)}{2f(t)}\right)^2. \tag{4.3}
\]

Since (4.1) represents the prototype of most volume growth bounds, it is important to stress the relationship between $\chi_f(t)$ and $\tilde\chi_f(t)$ in case $f(t) = \Lambda\exp\{a t^\alpha \log^\beta t\}$. Using De l'Hôpital's theorem we have

\[
\lim_{t\to+\infty}\frac{\tilde\chi_f(t)}{\chi_f(t)} = \lim_{t\to+\infty}\frac{f'(t)^2}{f(t)f''(t)} = 1 \qquad \text{since } \alpha > 0. \tag{4.4}
\]

Therefore, with this choice of $f$ the modified critical function $\tilde\chi_f(t)$ is asymptotic to the critical function $\chi_f(t)$. This justifies the use of $\tilde\chi_f(t)$ as a border line "at infinity" for $A(t)$ in (A4) below. Throughout this section we shall require the validity of the following properties on $[t_0,+\infty)$, for some $t_0 > 0$.

(V2) $0 \le v(t) \in L^\infty_{\mathrm{loc}}([t_0,+\infty))$, $\dfrac{1}{v(t)} \in L^\infty_{\mathrm{loc}}([t_0,+\infty))$, $\dfrac{1}{v(t)} \in L^1(+\infty)$,

(F1) $f \in C^1([t_0,+\infty))$, $f(t_0) > 0$,

(F2) $f$ is non-decreasing on $[t_0,+\infty)$,

(F3) $v(t) \le f(t)$ a.e. on $[t_0,+\infty)$,

(F4) $\forall t \ge t_0$, $\dfrac{f'(t)}{f(t)} \ge \dfrac{1}{Dt^\mu}$ for some $D > 0$, $\mu < 1$,

(A2) $A \in L^\infty_{\mathrm{loc}}([t_0,+\infty))$, $A(t) \ge 0$ a.e. on $[t_0,+\infty)$,

(A3) $\displaystyle\limsup_{t\to+\infty}\int_{t_0}^t\Big(\sqrt{A(s)} - \sqrt{\chi(s)}\Big)\,ds = +\infty$,

(A4) $\exists c > 0$ such that $\sqrt{A(t)} \ge c\sqrt{\tilde\chi_f(t)} = \dfrac{c}{2}\,\dfrac{f'(t)}{f(t)}$ a.e. on $[t_0,+\infty)$.
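The asymptotic relation (4.4) between $\tilde\chi_f$ and $\chi_f$ can be observed numerically in a case where the tail integral is explicit. The sketch below assumes the special choice $f(t) = \exp(at^2)$ (that is, $\Lambda = 1$, $\alpha = 2$, $\beta = 0$ in (4.1)), for which $\int_t^{+\infty}ds/f(s) = \frac12\sqrt{\pi/a}\,\mathrm{erfc}(\sqrt a\,t)$; the function names and sample points are illustrative, and the arguments are kept moderate so that $\exp(at^2)$ does not overflow in floating point.

```python
import math

def chi_f(t, a):
    """chi_f from (4.2) for f(t) = exp(a t^2), using the closed-form tail
    integral 0.5 * sqrt(pi/a) * erfc(sqrt(a) t)."""
    x = math.sqrt(a) * t
    scaled_tail = math.exp(x * x) * math.erfc(x)   # = 2 sqrt(a/pi) f(t) * tail
    return (math.sqrt(math.pi / a) * scaled_tail) ** -2

def chi_tilde_f(t, a):
    """chi-tilde_f from (4.3): (f'/(2f))^2 = (a t)^2 for f(t) = exp(a t^2)."""
    return (a * t) ** 2

def ratio(t, a=1.0):
    """chi-tilde_f / chi_f, expected to tend to 1 as t grows."""
    return chi_tilde_f(t, a) / chi_f(t, a)

samples = [1.0, 5.0, 15.0]         # keep a * t^2 well below ~700 to avoid overflow
ratios = [ratio(t) for t in samples]
print(ratios)
```

In this example the ratio stays below $1$ and increases towards $1$, so $\tilde\chi_f$ slightly underestimates $\chi_f$ at finite $t$ while agreeing with it asymptotically.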

Next, we introduce two classes of functions: for $f \in C^0([t_0,+\infty))$, $f > 0$ on $[t_0,+\infty)$, $h$, $k$ piecewise $C^0$ and non-negative on $[t_0,+\infty)$, $c > 0$ we set

\[
\mathcal{A}(f,h,c) = \left\{ g : [t_0,+\infty) \to [0,+\infty) \text{ piecewise } C^0 \text{ such that } \limsup_{t\to+\infty}\sup_{\xi\in(0,1)}\frac{(1-\xi)\,g(t)\,f(t+g(t)+h(t))^c}{f(t+(1-\xi)g(t)+h(t))^{c+1}} < +\infty \right\}, \tag{4.5}
\]

\[
\mathcal{B}(f,k,c) = \left\{ g : [t_0,+\infty) \to [0,+\infty) \text{ piecewise } C^0 \text{ such that } \limsup_{t\to+\infty}\sup_{\xi\in(0,1)}\frac{\xi\,g(t)\,f(t+(1-\xi)g(t)+k(t))^c}{f(t+g(t)+k(t))\cdot f(t+k(t))^c} < +\infty \right\}. \tag{4.6}
\]

Definition. We shall say that $f$ satisfies property (P) for some $c > 0$ if whenever

\[
h(t),\ k(t) = O(t) \quad \text{as } t \to +\infty, \qquad g \in \mathcal{A}(f,h,c) \cup \mathcal{B}(f,k,c)
\]

implies $g(t) = O(t)$ as $t \to +\infty$.

An example of $f$ satisfying property (P) that we shall use in the sequel is the following. Let

\[
f(t) = \exp\big(a t^\alpha \log^\beta t\big), \qquad a > 0,\ \alpha > 0,\ \beta \ge 0 \quad \text{for } t \ge t_0. \tag{4.7}
\]

Then $f$ satisfies property (P) for every $c > 1$. Indeed, let $h$ and $k$ be non-negative and such that $h(t), k(t) = O(t)$ as $t \to +\infty$ and let $g \in \mathcal{A}(f,h,c)$. Assume, by contradiction, the existence of a sequence $\{t_n\} \to +\infty$ with the property

\[
\frac{g(t_n)}{t_n} \to +\infty \qquad \text{as } n \to +\infty. \tag{4.8}
\]

Without loss of generality we suppose $g(t_n) > 1$ $\forall n$ and we define $\xi_n = 1 - \frac{1}{g(t_n)}$. Then

\[
\begin{aligned}
\frac{(1-\xi_n)\,g(t_n)\,f(t_n+g(t_n)+h(t_n))^c}{f(t_n+(1-\xi_n)g(t_n)+h(t_n))^{c+1}}
&= \frac{f(t_n+g(t_n)+h(t_n))^c}{f(t_n+1+h(t_n))^{c+1}}\\
&= \exp\Big\{ac\,\big(t_n+g(t_n)+h(t_n)\big)^\alpha\log^\beta\big(t_n+g(t_n)+h(t_n)\big)\\
&\qquad\quad - a(c+1)\big(t_n+1+h(t_n)\big)^\alpha\log^\beta\big(t_n+1+h(t_n)\big)\Big\}\\
&= \exp\Big\{ac\,g(t_n)^\alpha\log^\beta\big(t_n+g(t_n)+h(t_n)\big)\,\big[E_1(n) - E_2(n)\big]\Big\},
\end{aligned}
\tag{4.9}
\]

where

\[
E_1(n) = \left(1 + \frac{t_n}{g(t_n)} + \frac{h(t_n)}{g(t_n)}\right)^\alpha, \tag{4.10}
\]

\[
E_2(n) = \frac{(c+1)\,t_n^\alpha}{c\,g(t_n)^\alpha}\left(1 + \frac{1}{t_n} + \frac{h(t_n)}{t_n}\right)^\alpha\frac{\log^\beta(t_n+1+h(t_n))}{\log^\beta(t_n+g(t_n)+h(t_n))}. \tag{4.11}
\]

Note that expression (4.10) tends to 1 as $n \to +\infty$, while expression (4.11) goes to 0. Their difference is thus eventually positive, so (4.9) goes to $+\infty$, but this contradicts the fact that $g \in \mathcal{A}(f,h,c)$. Observe that here any $c > 0$ would work. Let now $g \in \mathcal{B}(f,k,c)$ and reason again by contradiction. Let $\{t_n\}$ be as above. Then

\[
\begin{aligned}
&\frac{\xi\,g(t_n)\,f(t_n+(1-\xi)g(t_n)+k(t_n))^c}{f(t_n+g(t_n)+k(t_n))\cdot f(t_n+k(t_n))^c}\\
&\quad= \xi\,g(t_n)\exp\Big\{ac(1-\xi)^\alpha g(t_n)^\alpha\Big(1+\frac{1}{1-\xi}\Big(\frac{t_n}{g(t_n)}+\frac{k(t_n)}{g(t_n)}\Big)\Big)^\alpha\log^\beta\big(t_n+(1-\xi)g(t_n)+k(t_n)\big)\\
&\qquad\qquad - a\,g(t_n)^\alpha\Big(1+\frac{t_n}{g(t_n)}+\frac{k(t_n)}{g(t_n)}\Big)^\alpha\log^\beta\big(t_n+g(t_n)+k(t_n)\big)\\
&\qquad\qquad - ac\,t_n^\alpha\Big(1+\frac{k(t_n)}{t_n}\Big)^\alpha\log^\beta\big(t_n+k(t_n)\big)\Big\}\\
&\quad\ge \xi\,g(t_n)\exp\Big\{a\,g(t_n)^\alpha\log^\beta\big(t_n+(1-\xi)g(t_n)+k(t_n)\big)\,\big[F_1(n) - F_2(n)\big]\Big\},
\end{aligned}
\tag{4.12}
\]

where

\[
F_1(n) = c(1-\xi)^\alpha - \Big(1+\frac{t_n}{g(t_n)}+\frac{k(t_n)}{g(t_n)}\Big)^\alpha\frac{\log^\beta(t_n+g(t_n)+k(t_n))}{\log^\beta(t_n+(1-\xi)g(t_n)+k(t_n))}, \tag{4.13}
\]

\[
F_2(n) = c\,\frac{t_n^\alpha}{g(t_n)^\alpha}\Big(1+\frac{k(t_n)}{t_n}\Big)^\alpha\frac{\log^\beta(t_n+k(t_n))}{\log^\beta(t_n+(1-\xi)g(t_n)+k(t_n))}. \tag{4.14}
\]

Since expression (4.14) goes to 0 as $n \to +\infty$, we can choose $n$ such that it is eventually less than $\varepsilon$, for some fixed $\varepsilon > 0$. Moreover, since $\forall \xi \in (0,1)$

\[
\frac{\log^\beta(t_n+g(t_n)+k(t_n))}{\log^\beta(t_n+(1-\xi)g(t_n)+k(t_n))} \to 1 \qquad \text{as } n \to +\infty,
\]

and using now $c > 1$, we can choose a suitable $\xi$ such that expression (4.13) is eventually strictly positive and greater than $2\varepsilon$, if we choose $\varepsilon$ sufficiently small. Now letting $n \to +\infty$ we have that (4.12) goes to infinity, which implies $g \notin \mathcal{B}(f,k,c)$, a contradiction.

Note that the assumption $\alpha > 0$ is necessary: it is not hard to see that, if $f(t)$ has polynomial growth, then $f$ does not satisfy property (P) for any $c > 0$. On the contrary, proceeding in a way similar to that outlined above one verifies, for instance, that also the function

\[
\Lambda\exp\big\{a e^{bt}\big\}, \qquad \Lambda, a, b > 0,
\]

satisfies property (P) for every $c > 1$. Assuming $f(t)$ of this type, one can prove estimates analogous to those in (4.16) and (1.22).

Going back to (4.7), we observe that (F1), (F2) and (F4) are satisfied. We also observe that the validity of (V2), (A2) and (A3) enables us to apply Corollary 2.4 to conclude that Eq. (2.19) is oscillatory on $[t_0,+\infty)$, and that, by Proposition A.3 of Appendix A, the zeros of $z(t)$ are isolated. Now, we are ready to prove our main technical result.

Theorem 4.1. Assume the validity of (V2), (F1)–(F4), (A2)–(A4) and that $f$ satisfies property (P) for the parameter $c > 0$ required in (A4). Let $z \not\equiv 0$ be a locally Lipschitz solution of (2.19) on $[t_0,+\infty)$. Let $\tau \in [T,+\infty)$, where $T$ is defined in Corollary 2.4, and let $T_1(\tau)$, $T_2(\tau)$ be the first two consecutive zeros of $z(t)$ on $[\tau,+\infty)$. Then

\[
T_2(\tau) - \tau = O(\tau) \qquad \text{as } \tau \to +\infty. \tag{4.15}
\]

Moreover, in case $f(t) = \Lambda\exp[a t^\alpha\log^\beta t]$ we have the estimate

\[
\limsup_{\tau\to+\infty}\frac{T_2(\tau)}{\tau} \le \left(\frac{c+1}{c-1}\right)^{\frac{2}{\alpha}}. \tag{4.16}
\]

Proof. As we have observed, $z(t)$ is oscillatory. Having fixed $\tau \in [T,+\infty)$, let $U = [\tau, T_2(\tau)) \setminus \{T_1(\tau)\}$ and on $U$ consider the locally Lipschitz function

\[
y(t) = -\frac{v(t)z'(t)}{z(t)},
\]

solution of

\[
y'(t) = A(t)v(t) + \frac{1}{v(t)}\,y^2(t) \qquad \text{a.e. on } [t_0,+\infty). \tag{4.17}
\]

Because of (A2) and (V2), (4.17) shows that $y$ is non-decreasing on $U$. Indeed, from (A4), (F4), (V2) we see that $y$ is strictly increasing on $U$. Since $z \not\equiv 0$, proceeding analogously to Proposition A.3 in Appendix A we deduce that

\[
y(T_1(\tau)^+) = -\infty, \qquad y(T_1(\tau)^-) = +\infty, \qquad y(T_2(\tau)^-) = +\infty. \tag{4.18}
\]

Note that it could be $U = (T_1(\tau), T_2(\tau))$: this is exactly the case when $T_1(\tau) = \tau$. Due to the fact that $y$ is non-decreasing, $U$ can be decomposed as a disjoint union of intervals of the types

$I_1 \subseteq \{x \in U : y(x) \in [-1,1]\}$, interval of type 1,
$I_2 \subseteq \{x \in U : y(x) > 1\}$, interval of type 2,
$I_3 \subseteq \{x \in U : y(x) < -1\}$, interval of type 3.

To fix ideas we consider the case $y(\tau) < -1$ (see Fig. 1). In this case we have $U = I_3 \cup I_1 \cup I_2 \cup I_3' \cup I_1' \cup I_2'$ where:

\[
\begin{aligned}
&I_1 \text{ is the first interval of type 1, after } \tau \text{ and before } T_1(\tau);\\
&I_2 \text{ is the first interval of type 2, after } \tau \text{ and before } T_1(\tau);\\
&I_3 \text{ is the first interval of type 3, after } \tau \text{ and before } T_1(\tau);\\
&I_1' \text{ is the first interval of type 1, after } T_1(\tau) \text{ and before } T_2(\tau);\\
&I_2' \text{ is the first interval of type 2, after } T_1(\tau) \text{ and before } T_2(\tau);\\
&I_3' \text{ is the first interval of type 3, after } T_1(\tau) \text{ and before } T_2(\tau).
\end{aligned}
\tag{4.19}
\]



Fig. 1. Riccati solution.

We study this situation, which is the worst that can happen. The remaining cases can be dealt with similarly and we shall skip the proofs. For $i \in \{1,2,3\}$ we set $|I_i| = g_i(\tau)$ and $|I_i'| = g_i'(\tau)$ (here the prime on $I_i'$ and $g_i'$ marks the intervals after $T_1(\tau)$, not a derivative). We are going to prove that, in the above hypotheses, each $g_i(\tau)$, $g_i'(\tau)$ is $O(\tau)$ as $\tau \to +\infty$.

We consider at first an open interval $J$ of type 3, so that $J$ could be either $I_3$ or $I_3'$. Let $P(\tau) < Q(\tau)$ denote its end points; thus $g_3(\tau) = |J|(\tau) = Q(\tau) - P(\tau)$ and $g_3(\tau)$ is clearly piecewise $C^0([T,+\infty))$. We have $y(Q) = -1$ and $y(P) \le -1$ if $y$ is defined in $P$, otherwise $y(P^+) = -\infty$. As in Theorem 2.1, (4.17) yields

\[
y' \ge 2\sqrt{A(t)}\,|y| = 2\sqrt{A(t)}\,(-y) \qquad \text{a.e. on } J.
\]

Fix $t \in (P,Q]$ and integrate on $[t,Q]$. Recalling that $y(s) \le y(Q) = -1$ $\forall s \in (P,Q]$ we have

\[
y(t) \le -\exp\left(2\int_t^Q\sqrt{A(s)}\,ds\right) \qquad \forall t \in (P,Q]. \tag{4.20}
\]

Since $y'/y^2 \ge 1/v$ almost everywhere, integrating on $[P+\varepsilon,t]$ for some small $\varepsilon > 0$ we obtain

\[
\frac{1}{y(P+\varepsilon)} - \frac{1}{y(t)} \ge \int_{P+\varepsilon}^t\frac{ds}{f(s)}. \tag{4.21}
\]

Letting $\varepsilon \to 0^+$ we get

\[
-\frac{1}{y(t)} \ge -\frac{1}{y(P^+)} + \int_P^t\frac{ds}{f(s)} \ge \int_P^t\frac{ds}{f(s)} \tag{4.22}
\]


which is valid $\forall t \in (P,Q]$. Now, because of (A4),

\[
2\int_t^Q\sqrt{A(s)}\,ds \ge c\int_t^Q\frac{f'(s)}{f(s)}\,ds = \log\left(\frac{f(Q)}{f(t)}\right)^c
\]

and therefore, from (4.20),

\[
-\frac{1}{y(t)} \le \left(\frac{f(t)}{f(Q)}\right)^c.
\]

Substituting into (4.22) and using (F2) we obtain

\[
1 \ge \left(\frac{f(Q)}{f(t)}\right)^c\int_P^t\frac{ds}{f(s)} \ge (t-P)\,\frac{f(Q)^c}{f(t)^{c+1}} \qquad \forall t \in (P,Q). \tag{4.23}
\]

Suppose now that $J = I_3$, so that $P(\tau) = \tau$ and $Q(\tau) = \tau + g_3(\tau)$. Since $t \in (P,Q)$, there exists $\xi \in (0,1)$ such that

\[
t = \tau + (1-\xi)g_3(\tau), \qquad t - P = (1-\xi)g_3(\tau),
\]

and since $t$ was arbitrary, from (4.23) we obtain

\[
\sup_{\xi\in(0,1)}\frac{(1-\xi)\,g_3(\tau)\,f(\tau+g_3(\tau))^c}{f(\tau+(1-\xi)g_3(\tau))^{c+1}} \le 1; \tag{4.24}
\]

in this case it follows that $g_3 \in \mathcal{A}(f,0,c)$ and then $g_3(\tau) = O(\tau)$ as $\tau \to +\infty$. We will deal with the case $J = I_3'$ later.

Next, we consider an interval $J$ of type 1. Let $P(\tau) < Q(\tau)$ denote its end points; thus $g_1(\tau) = |J|(\tau) = Q(\tau) - P(\tau)$ and $g_1(\tau)$ is piecewise $C^0([T,+\infty))$. In this case $y(P) = -1$, $y(Q) = 1$ and $|y| \le 1$ on $J$. We integrate the Riccati equation (4.17) on $[P,Q]$ to obtain

\[
2 = \int_P^Q y'(s)\,ds = \int_P^Q A(s)v(s)\,ds + \int_P^Q\frac{y^2(s)}{v(s)}\,ds \ge \int_P^Q A(s)v(s)\,ds.
\]

Next, without loss of generality we can suppose to have chosen $T$ sufficiently large that (V2), in particular $1/v \in L^1(+\infty)$, implies

\[
\int_T^{+\infty}\frac{ds}{v(s)} \le 1
\]

so that

\[
\int_P^Q\frac{ds}{v(s)} \le 1.
\]

From the above inequality, using (A4) and the generalized mean value theorem it follows that, for some $T_0 \in [P,Q]$,

\[
2 \ge \int_P^Q A(s)v(s)\,ds\int_P^Q\frac{ds}{v(s)} \ge \int_P^Q\frac{c^2}{4}\left(\frac{f'(s)}{f(s)}\right)^2v(s)\,ds\int_P^Q\frac{ds}{v(s)} = \frac{c^2}{4}\left(\frac{f'(T_0)}{f(T_0)}\right)^2\int_P^Q v(s)\,ds\int_P^Q\frac{ds}{v(s)}.
\]

On the other hand, from the Hölder inequality

\[
(Q-P)^2 \le \int_P^Q v(s)\,ds\int_P^Q\frac{ds}{v(s)}
\]

so that

\[
c\,\frac{f'(T_0)}{f(T_0)}\,(Q-P) \le 2\sqrt2
\]

or, in other words, using (F1), (F2) and observing that (F4) implies that $f'$ is eventually positive,

\[
Q - P \le \frac{2\sqrt2}{c}\,\frac{f(T_0)}{f'(T_0)}. \tag{4.25}
\]

Now, if $J = I_1$, then $P(\tau) = \tau + g_3(\tau)$, $Q(\tau) = P(\tau) + g_1(\tau)$ and there exists $\theta \in [0, 1]$ such that $T_0 = \tau + g_3(\tau) + \theta g_1(\tau)$. Substituting in (4.25) and using (F4) we obtain
\[ g_1(\tau) \le \frac{2\sqrt{2}}{c}\,\frac{f(\tau + g_3(\tau) + \theta g_1(\tau))}{f'(\tau + g_3(\tau) + \theta g_1(\tau))} \le \frac{2\sqrt{2}\,D}{c}\,\bigl(\tau + g_3(\tau) + \theta g_1(\tau)\bigr)^{\mu}. \tag{4.26} \]
In case $\mu \le 0$ we immediately obtain $g_1(\tau) = O(\tau)$, hence we examine the case $\mu \in (0, 1)$. Using the already known estimate $g_3(\tau) = O(\tau)$ and the inequality $(x + y)^{\mu} \le x^{\mu} + y^{\mu}$, there exist constants $K_1, K_2 > 0$ such that
\[ \frac{g_1(\tau)}{\tau} \le \frac{K_1}{\tau^{1-\mu}} + \frac{K_2\, g_1(\tau)^{\mu}}{\tau}. \tag{4.27} \]

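Using a simple reasoning by contradiction, (4.27) implies $g_1(\tau) = O(\tau)$ as $\tau \to +\infty$. If $J = \tilde I_1$, then $P(\tau) = \tau + (g_1 + g_2 + g_3)(\tau) + \tilde g_3(\tau)$, $Q(\tau) = P(\tau) + \tilde g_1(\tau)$, $T_0 = \tau + (g_1 + g_2 + g_3)(\tau) + \tilde g_3(\tau) + \theta \tilde g_1(\tau)$, and substituting into (4.25)
\[ \tilde g_1(\tau) \le \frac{2\sqrt{2}}{c}\,\frac{f(\tau + (g_1 + g_2 + g_3)(\tau) + \tilde g_3(\tau) + \theta \tilde g_1(\tau))}{f'(\tau + (g_1 + g_2 + g_3)(\tau) + \tilde g_3(\tau) + \theta \tilde g_1(\tau))}. \tag{4.28} \]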



We will come back to this inequality later to prove $\tilde g_1(\tau) = O(\tau)$ as $\tau \to +\infty$. Indeed, by the same argument as above, the only things that remain to be shown for this purpose are $g_2(\tau) = O(\tau)$ and $\tilde g_3(\tau) = O(\tau)$ as $\tau \to +\infty$, and we are going to prove these facts now. We consider an interval $J$ of type 2 and again let $P(\tau) < Q(\tau)$ denote its end points. Clearly $y(P) = 1$ and $y(Q) = \mu > 1$ (or $y(Q^-) = +\infty$ in case $z(Q) = 0$; indeed, what follows works with any $\mu > 0$). Again
\[ y' \ge 2\sqrt{A(t)}\,y \quad \text{and} \quad \frac{y'}{y^2} \ge \frac{1}{v} \quad \text{a.e. on } J. \]
Fix $t \in [P, Q)$. Using $y(P) = 1$, integration of the first inequality on $[P, t]$ yields
\[ y(t) \ge \exp\left( 2\int_P^t \sqrt{A(s)}\,ds \right) \quad \forall t \in [P, Q), \tag{4.29} \]

while integrating the second one on $[t, Q - \varepsilon)$, for some small $\varepsilon > 0$, and proceeding as in (4.21) we have
\[ \frac{1}{y(t)} \ge \int_t^Q \frac{ds}{f(s)} \quad \forall t \in (P, Q). \tag{4.30} \]

Thus, observing that
\[ 2\int_P^t \sqrt{A(s)}\,ds \ge \log\left(\frac{f(t)}{f(P)}\right)^{c}, \]
we deduce from (4.29)
\[ \frac{1}{y(t)} \le \left(\frac{f(P)}{f(t)}\right)^{c}. \]

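Finally, substituting into (4.30),
\[ 1 \ge \left(\frac{f(t)}{f(P)}\right)^{c} \int_t^Q \frac{ds}{f(s)} \ge (Q - t)\,\frac{1}{f(Q)} \left(\frac{f(t)}{f(P)}\right)^{c} \quad \forall t \in (P, Q). \tag{4.31} \]
Suppose now $J = I_2$, so that $g_2(\tau) = Q(\tau) - P(\tau)$,
\[ P(\tau) = \tau + g_3(\tau) + g_1(\tau), \qquad Q(\tau) = \tau + g_3(\tau) + g_1(\tau) + g_2(\tau), \]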



and since $t \in (P, Q)$, for some $\xi \in (0, 1)$ we have
\[ t = \tau + (1 - \xi) g_2(\tau) + g_1(\tau) + g_3(\tau), \qquad Q - t = \xi g_2(\tau). \]
Substituting into (4.31) yields
\[ \sup_{\xi \in (0,1)} \frac{\xi\, g_2(\tau)\, f(\tau + (1 - \xi) g_2(\tau) + g_1(\tau) + g_3(\tau))^{c}}{f(\tau + g_2(\tau) + g_1(\tau) + g_3(\tau))\, f(\tau + g_1(\tau) + g_3(\tau))^{c}} \le 1. \tag{4.32} \]

Thus, setting $k(\tau) = (g_1 + g_3)(\tau)$, since $g_1(\tau) = O(\tau)$ and $g_3(\tau) = O(\tau)$ as $\tau \to +\infty$, we have $k(\tau) = O(\tau)$ as $\tau \to +\infty$ and $g_2 \in B(f, k, c)$, and so $g_2(\tau) = O(\tau)$ as $\tau \to +\infty$. We can now deal with the case $J = \tilde I_3$. We have already shown that $g_1(\tau) + g_2(\tau) + g_3(\tau) = O(\tau)$ as $\tau \to +\infty$. We go back to (4.23) with $J = \tilde I_3 = (P(\tau), Q(\tau))$: note that now
\[ P(\tau) = \tau + g_3(\tau) + g_1(\tau) + g_2(\tau), \qquad Q(\tau) = P(\tau) + \tilde g_3(\tau), \]
where, obviously, $\tilde g_3(\tau) = |\tilde I_3|$. Since $t \in (P, Q)$, for some $\xi \in (0, 1)$ we have
\[ t = \tau + (1 - \xi)\tilde g_3(\tau) + (g_3 + g_1 + g_2)(\tau), \qquad t - P = (1 - \xi)\tilde g_3(\tau), \]
and substituting into (4.23), since $t \in (P, Q)$ is arbitrary, we have
\[ \sup_{\xi \in (0,1)} \frac{(1 - \xi)\,\tilde g_3(\tau)\, f(\tau + \tilde g_3(\tau) + (g_1 + g_2 + g_3)(\tau))^{c}}{f(\tau + (1 - \xi)\tilde g_3(\tau) + (g_1 + g_2 + g_3)(\tau))^{c+1}} \le 1. \tag{4.33} \]

Thus, setting $h(\tau) = (g_1 + g_2 + g_3)(\tau)$, we have $h(\tau) = O(\tau)$ as $\tau \to +\infty$, and so $\tilde g_3 \in A(f, h, c)$; therefore $\tilde g_3(\tau) = O(\tau)$ as $\tau \to +\infty$. Coming back to inequality (4.28), we can now claim that also $\tilde g_1(\tau) = O(\tau)$ as $\tau \to +\infty$. The last case is $J = \tilde I_2$, so that $\tilde g_2(\tau) = Q(\tau) - P(\tau)$. Now we have
\[ P(\tau) = \tau + (g_3 + g_1 + g_2 + \tilde g_3 + \tilde g_1)(\tau), \qquad Q(\tau) = P(\tau) + \tilde g_2(\tau), \]
and since $t \in (P, Q)$ there exists $\xi \in (0, 1)$ such that
\[ t = \tau + (1 - \xi)\tilde g_2(\tau) + (g_3 + g_1 + g_2 + \tilde g_3 + \tilde g_1)(\tau), \qquad Q(\tau) - t = \xi \tilde g_2(\tau). \]



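Setting $k(\tau) = (g_3 + g_1 + g_2 + \tilde g_3 + \tilde g_1)(\tau)$, we have already proved that $k(\tau) = O(\tau)$ as $\tau \to +\infty$. Substituting into (4.31) yields
\[ \sup_{\xi \in (0,1)} \frac{\xi\,\tilde g_2(\tau)\, f(\tau + (1 - \xi)\tilde g_2(\tau) + k(\tau))^{c}}{f(\tau + \tilde g_2(\tau) + k(\tau))\, f(\tau + k(\tau))^{c}} \le 1. \tag{4.34} \]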

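Thus we have $\tilde g_2 \in B(f, k, c)$, therefore $\tilde g_2(\tau) = O(\tau)$ as $\tau \to +\infty$, and this shows that
\[ T_2(\tau) - T_1(\tau) \le T_2(\tau) - \tau = (g_3 + g_1 + g_2 + \tilde g_3 + \tilde g_1 + \tilde g_2)(\tau) = O(\tau) \quad \text{as } \tau \to +\infty, \]
so we have the first part of the theorem, that is, (4.15). To conclude, we shall estimate the quantity
\[ K = \limsup_{\tau \to +\infty} \frac{T_2(\tau) - \tau}{\tau}. \]
Looking at the group of inequalities (4.24), (4.26), (4.32), (4.33), (4.28) and (4.34), we first note that each of the functions $g_i(\tau)$ and $\tilde g_i(\tau)$ involved in the proof (briefly, $g(\tau)$) satisfies one of the following inequalities, for $\tau \ge T$ and for some suitable function $h(\tau)$ which is known to be $O(\tau)$:
\[ \sup_{\xi \in (0,1)} \frac{(1 - \xi)\, g(\tau)\, f(\tau + g(\tau) + h(\tau))^{c}}{f(\tau + (1 - \xi) g(\tau) + h(\tau))^{c+1}} \le 1 \quad \text{for } g_3 \text{ and } \tilde g_3, \tag{4.35} \]
\[ g(\tau) \le \frac{2\sqrt{2}}{c}\,\frac{f(\tau + h(\tau) + \theta g(\tau))}{f'(\tau + h(\tau) + \theta g(\tau))} \quad \text{for } g_1 \text{ and } \tilde g_1, \tag{4.36} \]
\[ \sup_{\xi \in (0,1)} \frac{\xi\, g(\tau)\, f(\tau + (1 - \xi) g(\tau) + h(\tau))^{c}}{f(\tau + g(\tau) + h(\tau)) \cdot f(\tau + h(\tau))^{c}} \le 1 \quad \text{for } g_2 \text{ and } \tilde g_2. \tag{4.37} \]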

For the sake of simplicity, we perform the computations in the case
\[ f(t) = \Lambda \exp\left(a t^{\alpha}\right), \qquad a, \Lambda, \alpha > 0 \]
(note that $f$ satisfies property (P) for every $c > 1$). We shall determine $K$ by computing, in each of the three cases above,
\[ K_j = \limsup_{\tau \to +\infty} \frac{g(\tau)}{\tau} \]
(the index $j$ corresponds to the cases satisfied by $g_j$ and $\tilde g_j$), and then summing the terms "inductively", following the changes of the known function $h$ case by case. For this purpose let
\[ H \ge \limsup_{\tau \to +\infty} \frac{h(\tau)}{\tau}. \]



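Consider first inequality (4.36): we immediately find that, for this choice of $f$,
\[ \frac{g(\tau)}{\tau} \le \frac{2\sqrt{2}}{c}\,\frac{1}{\tau}\,\frac{1}{a\alpha\,(\tau + h(\tau) + \theta g(\tau))^{\alpha - 1}} \le \frac{2\sqrt{2}}{c\,a\,\alpha}\,\frac{1}{\tau^{\alpha}} \left(1 + \frac{h(\tau)}{\tau} + \frac{g(\tau)}{\tau}\right)^{1 - \alpha}. \]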

We claim that $K_1 = 0$. Indeed, suppose by contradiction that there exists a divergent sequence $\{\tau_n\}$ such that $g(\tau_n)/\tau_n \to K_1 > 0$. Then, evaluating the above inequality along $\{\tau_n\}$ and passing to the limit we reach $0 < K_1 \le 0$, a contradiction. We now focus our attention on (4.35). By an algebraic manipulation,
\[ g(\tau) \le \frac{1}{1 - \xi}\,\frac{f(\tau + (1 - \xi) g(\tau) + h(\tau))^{c+1}}{f(\tau + g(\tau) + h(\tau))^{c}} \quad \forall \xi \in (0, 1). \]
Due to the form of $f$, better estimates can be obtained by choosing $\xi$ near 1. For $\tau > 1$, we choose $\xi = (\tau - 1)/\tau$. For ease of notation let $x(\tau) = g(\tau)/\tau$, so that $x(\tau)$ is bounded on $[T, +\infty)$ because $f$ satisfies property (P). With this choice of $\xi$ we have
\[ x(\tau) \le \frac{f(\tau + x(\tau) + h(\tau))^{c+1}}{f(\tau + \tau x(\tau) + h(\tau))^{c}}, \tag{4.38} \]

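thus substituting
\[ x(\tau) \le \Lambda \exp\left\{ a\tau^{\alpha} \left[ (c + 1)\left(1 + \frac{x(\tau)}{\tau} + \frac{h(\tau)}{\tau}\right)^{\alpha} - c\left(1 + x(\tau) + \frac{h(\tau)}{\tau}\right)^{\alpha} \right] \right\}. \]
Suppose now that $K_3 > 0$, and evaluate this inequality along a sequence $\{\tau_n\}$ such that $x(\tau_n) \to K_3$. Choose $0 < \delta < K_3$, and let $n$ be large enough that the following inequalities hold:
\[ x(\tau_n) > K_3 - \delta, \qquad \frac{x(\tau_n)}{\tau_n} < \delta. \]
This yields
\[ x(\tau_n) \le \Lambda \exp\left\{ a\tau_n^{\alpha} \left[ (c + 1)\left(1 + \delta + \frac{h(\tau_n)}{\tau_n}\right)^{\alpha} - c\left(1 + K_3 - \delta + \frac{h(\tau_n)}{\tau_n}\right)^{\alpha} \right] \right\}. \tag{4.39} \]
Suppose now that $K_3$ satisfies
\[ \max_{\mu \in [0, H]} \left[ (c + 1)(1 + \mu)^{\alpha} - c\,(1 + K_3 + \mu)^{\alpha} \right] < 0, \tag{4.40} \]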

and compare it with (4.39). By continuity, there exists a small $\delta > 0$ such that the expression between square brackets is strictly less than 0. Letting now $\tau_n$ go to infinity in (4.39) we deduce $0 < K_3 \le 0$, a contradiction. Note that (4.40) holds if and only if
\[ (c + 1) - c\left(\frac{K_3}{\mu + 1} + 1\right)^{\alpha} < 0 \quad \forall \mu \in [0, H], \]
that is,
\[ K_3 > \left[\left(\frac{c+1}{c}\right)^{1/\alpha} - 1\right](1 + H). \]
Hence, if $K_3 > 0$, we necessarily have
\[ K_3 \le \left[\left(\frac{c+1}{c}\right)^{1/\alpha} - 1\right](1 + H). \tag{4.41} \]

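The same technique can be exploited when dealing with (4.37): from
\[ g(\tau) \le \frac{1}{\xi}\,\frac{f(\tau + g(\tau) + h(\tau)) \cdot f(\tau + h(\tau))^{c}}{f(\tau + (1 - \xi) g(\tau) + h(\tau))^{c}} \quad \forall \xi \in (0, 1), \tag{4.42} \]
we deduce that it is better to choose $\xi$ near 0, so we set $\xi = 1/\tau$ and we obtain, with the same notation,
\[ x(\tau) \le \frac{f(\tau + \tau x(\tau) + h(\tau)) \cdot f(\tau + h(\tau))^{c}}{f(\tau + (\tau - 1) x(\tau) + h(\tau))^{c}}. \]
Thus
\[ x(\tau) \le \Lambda \exp\left\{ a\tau^{\alpha} \left[ \left(1 + x(\tau) + \frac{h(\tau)}{\tau}\right)^{\alpha} + c\left(1 + \frac{h(\tau)}{\tau}\right)^{\alpha} - c\left(1 + \frac{\tau - 1}{\tau}\,x(\tau) + \frac{h(\tau)}{\tau}\right)^{\alpha} \right] \right\}. \]
Next, if $K_2 > 0$ we choose a sequence $\{\tau_n\}$ realizing $K_2$ and we consider $n$ sufficiently large that
\[ \frac{\tau_n - 1}{\tau_n} > 1 - \delta, \qquad K_2 - \delta < x(\tau_n) < K_2 + \delta, \]
obtaining the estimate
\[ x(\tau_n) \le \Lambda \exp\left\{ a\tau_n^{\alpha} \left[ \left(1 + (K_2 + \delta) + \frac{h(\tau_n)}{\tau_n}\right)^{\alpha} - c\left(1 + (1 - \delta)(K_2 - \delta) + \frac{h(\tau_n)}{\tau_n}\right)^{\alpha} + c\left(1 + \frac{h(\tau_n)}{\tau_n}\right)^{\alpha} \right] \right\}. \tag{4.43} \]
Now, if $K_2$ satisfies
\[ \max_{\mu \in [0, H]} \left[ (1 + K_2 + \mu)^{\alpha} + c\,(1 + \mu)^{\alpha} - c\,(1 + K_2 + \mu)^{\alpha} \right] < 0, \tag{4.44} \]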



we reach a contradiction proceeding as in the previous case. As above, this yields the bound
\[ K_2 \le \left[\left(\frac{c}{c-1}\right)^{1/\alpha} - 1\right](1 + H). \tag{4.45} \]
To simplify the writing we now set
\[ W = \left(\frac{c+1}{c}\right)^{1/\alpha} - 1, \qquad Z = \left(\frac{c}{c-1}\right)^{1/\alpha} - 1. \]

To estimate $g_3(\tau)/\tau$, we shall use (4.41) and, from (4.24), we deduce $h(\tau) \equiv 0$ and thus $H = 0$. Therefore, we get $K_3 \le W$. We have already shown that $K_1 = 0$. Next, to estimate $g_2(\tau)/\tau$ we shall consider (4.45). By (4.32), $h(\tau) = g_3(\tau) + g_1(\tau)$, so we can use for $H$ the sum $W + 0 = W$, hence $K_2 \le Z(1 + W)$. Proceeding along the same lines we obtain the estimates
\[ \tilde K_3 \le W\bigl[1 + W + Z(1 + W)\bigr]; \qquad \tilde K_1 = 0; \]
\[ \tilde K_2 \le Z\bigl[1 + W + Z(1 + W) + W\bigl(1 + W + Z(1 + W)\bigr)\bigr]. \]
Summing up the $K_j$ and the $\tilde K_j$, we obtain the surprisingly simple expression
\[ K \le \sum_{j=1}^{3} \bigl(K_j + \tilde K_j\bigr) = (W + 1)^2 (Z + 1)^2 - 1 = \left(\frac{c+1}{c-1}\right)^{2/\alpha} - 1. \]
Thus we eventually have
\[ \limsup_{\tau \to +\infty} \frac{T_2(\tau)}{\tau} \le \left(\frac{c+1}{c-1}\right)^{2/\alpha}. \tag{4.46} \]
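The summation leading to (4.46) can be checked numerically. The sketch below is only an illustration: the function names and the sample values $c = 2$, $\alpha = 1.5$ are assumptions, not part of the proof. It accumulates the six bounds in the order in which the intervals occur (type 3, type 1, type 2, then the same three types again), using at each step the running sum of the previous bounds as the constant $H$ in (4.41) and (4.45).

```python
def limsup_bounds(c, alpha):
    """Accumulate the six bounds for the intervals of type 3, 1, 2, taken twice."""
    W = ((c + 1.0) / c) ** (1.0 / alpha) - 1.0
    Z = (c / (c - 1.0)) ** (1.0 / alpha) - 1.0
    H = 0.0  # running bound for limsup h(tau)/tau
    bounds = []
    for factor in (W, 0.0, Z, W, 0.0, Z):
        K = factor * (1.0 + H)  # (4.41) for type 3, K = 0 for type 1, (4.45) for type 2
        bounds.append(K)
        H += K
    return W, Z, bounds


def closed_form(c, alpha):
    """The value ((c+1)/(c-1))^(2/alpha) - 1 appearing before (4.46)."""
    return ((c + 1.0) / (c - 1.0)) ** (2.0 / alpha) - 1.0


# Hypothetical sample parameters, chosen only for illustration.
W, Z, bounds = limsup_bounds(c=2.0, alpha=1.5)
total = sum(bounds)
product_form = (W + 1.0) ** 2 * (Z + 1.0) ** 2 - 1.0
print(total, product_form, closed_form(2.0, 1.5))
```

The three printed numbers should agree up to rounding, which reflects the telescoping identity $(W+1)(Z+1) = ((c+1)/(c-1))^{1/\alpha}$.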

With few modifications in the computations, it can be seen that, considering $f(t) = \Lambda \exp[a t^{\alpha} \log^{\beta} t]$ instead of the above, the value of the constant $K$ does not change. □

Remark 4.2. Since $f(t) = \Lambda \exp\{a t^{\alpha} \log^{\beta} t\}$ satisfies property (P) for every $c > 1$, in this case conditions (A3) and (A4) may be replaced by
\[ \text{(A3 + A4)} \qquad \sqrt{A(t)} \ge c\,\frac{a\alpha}{2}\,t^{\alpha - 1} \log^{\beta} t \quad \text{a.e. on } [T, +\infty), \text{ for some } c > 1. \]



Indeed, since v f for t 1 we have e 1 0 < − log 2

+∞

1 1 ds − log v 2

t

+∞

at α logβ t 1 ds ∼ . v 2

t

Since c > 1, we deduce that (iv) of Proposition 2.5 holds, and therefore so does (A3). Since (A4) clearly holds, applying Theorem 4.1 yields lim sup τ →+∞

T2 (τ ) = τ

c+1 c−1

2

α

,

showing that (4.16) holds.

Remark 4.3. One might ask whether, by varying the choice of the level sets in (4.19), one could obtain better estimates. It is not hard to see that, for every choice of the level, (4.46) does not change.

5. Geometric applications

This section is devoted to the proofs of the geometric applications given in the Introduction, which follow from the results of Sections 2 and 4. The core are Theorems 1.4 and 1.6, where the Cauchy problems (2.2), (2.19) appear in order to obtain suitable radial test functions which yield estimates for the Rayleigh quotients of $L$ and $\Delta$ respectively. An almost direct use of Theorem 1.4 proves Theorems 1.9, 1.10 and 1.13, while Theorem 1.11 requires some special attention and further work.

5.1. The index of $\Delta + a(x)$: proof of Theorem 1.4

Choose $v(t) = \mathrm{Vol}(\partial B_t)$. Proposition 1.2 yields that the spherical mean $A(t)$ belongs to $L^{\infty}_{\mathrm{loc}}([0, +\infty))$, that (V1) holds, and that there exists a locally Lipschitz solution of (2.2) whose zeros (if any) are isolated (Theorems A.1 and A.3 of Appendix A). Consider problem (2.2), and note that, by the coarea formula,
\[ 0 < \int_0^{R_0} A(s)v(s)\,ds = \int_0^{R_0} \int_{\partial B_s} a \, ds = \int_{B_{R_0}} a. \]

By Corollary 2.3, assumption (i) guarantees the existence of a first zero of every locally Lipschitz solution $z(t)$, whereas Corollary 2.4 implies that assumption (ii) forces $z(t)$ to be oscillatory. Note that a different choice of $R$ in assumption (ii) does not affect the value of the “limsup.” We now consider case (i): choose a locally Lipschitz solution $z(t)$ of (2.2), and denote by $T$ its first zero. Define $\psi(x) = z(r(x))$, so that
\[ \psi \in \mathrm{Lip}(\overline{B_T}), \qquad \psi \equiv 0 \text{ on } \partial B_T, \qquad \nabla\psi(x) = z'(r(x))\,\nabla r(x) \quad \text{a.e. on } M \]



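and fix $0 < \varepsilon < T$. Then, using the coarea formula, the Gauss lemma and (2.2) we obtain
\[ \int_{B_T \setminus B_\varepsilon} |\nabla\psi|^2 - a(x)\psi^2 = \int_{B_T \setminus B_\varepsilon} |\nabla\psi|^2 - A(r)\psi^2 = \int_\varepsilon^T \bigl(z'(r)\bigr)^2 v(r)\,dr - \int_\varepsilon^T A(r) z^2(r) v(r)\,dr \]
\[ = -z(\varepsilon) z'(\varepsilon) v(\varepsilon) - \int_\varepsilon^T z(r)\Bigl[\bigl(v(r) z'(r)\bigr)' + A(r) v(r) z(r)\Bigr]\,dr = -z(\varepsilon) z'(\varepsilon) v(\varepsilon), \]
and letting $\varepsilon \downarrow 0^+$ we deduce
\[ \int_{B_T} |\nabla\psi|^2 - a(x)\psi^2 \le 0. \]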
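The integration by parts above can be tested on an explicit model. The following sketch makes illustrative assumptions that are not taken from the paper: $M = \mathbb{R}^3$ and $a \equiv 1$, so that $A \equiv 1$, $v(r)$ is proportional to $r^2$ (the factor $4\pi$ is dropped, since the equation is homogeneous in $v$), and $z(r) = \sin r / r$ solves $(v z')' + A v z = 0$ with first zero $T = \pi$. It compares the radial integral with the boundary term $-z(\varepsilon) z'(\varepsilon) v(\varepsilon)$ using a composite Simpson rule.

```python
import math


def z(r):
    """Model solution z(r) = sin(r)/r of (v z')' + v z = 0 with v(r) = r^2."""
    return math.sin(r) / r


def dz(r):
    """Derivative of z."""
    return (r * math.cos(r) - math.sin(r)) / r**2


def v(r):
    """Area of geodesic spheres in R^3 up to the constant 4*pi (assumed model)."""
    return r**2


def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


eps, T = 0.1, math.pi  # T is the first zero of z
lhs = simpson(lambda r: (dz(r) ** 2 - z(r) ** 2) * v(r), eps, T)
rhs = -z(eps) * dz(eps) * v(eps)
print(lhs, rhs)
```

Both numbers are small and positive for this $\varepsilon$, and they tend to 0 as $\varepsilon \downarrow 0$, in agreement with the limit taken in the text.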

By the Rayleigh characterization of eigenvalues and by domain monotonicity we conclude $\lambda_1^L(M) < 0$. Suppose now we are in case (ii), and assume by contradiction that there exists $R > 0$ such that
\[ \lambda_1^L(M \setminus B_R) \ge 0. \tag{5.1} \]
As already stressed in the Introduction, by a result of Fisher-Colbrie [14], if the index of $L$ is finite then (5.1) holds for a sufficiently large $R$. In our assumptions, every locally Lipschitz solution $z(t)$ of (2.19) is oscillatory. Let $T_1 < T_2$ be two consecutive zeros of $z(t)$ strictly after $R$. Define $\psi(x) = z(r(x))$ in the annular region $B_{T_2} \setminus B_{T_1}$, and $\psi(x) \equiv 0$ in the rest of $M$. Then $\psi \in \mathrm{Lip}_0(M)$ with support contained in $M \setminus B_R$. Proceeding as in the previous case, we obtain
\[ \int_{B_{T_2} \setminus B_{T_1}} |\nabla\psi|^2 - a(x)\psi^2 \le 0, \]
hence, by strict domain monotonicity, $\lambda_1^L(M \setminus B_R) < 0$, contradicting (5.1). Let us finally consider case (iii). By Remark 4.2 and Theorem 4.1, (2.2) is oscillatory, thus $L$ is unstable at infinity. In particular, the index of $L$ is infinite. Note that (1.13) is equivalent to proving that
\[ \liminf_{r \to +\infty} \frac{\mathrm{ind}_L(B_r)}{\log r} \ge \frac{1}{\log K}, \qquad \text{with } K = \left(\frac{c+1}{c-1}\right)^{2/\alpha}. \]



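Fix $\varepsilon > 0$. Then, by Theorem 4.1 there exists $T = T(\varepsilon)$ such that on $[T, +\infty)$
\[ \frac{T_2(r)}{r} \le K_\varepsilon = \left(\frac{c+1}{c-1}\right)^{2/\alpha} + \varepsilon. \]
Proceeding as above, on $M \setminus B_r$ we can find a radial function $\psi_1(x)$, with support contained in $B_{K_\varepsilon r}$, which makes the Rayleigh quotient non-positive. Starting from $T_2(r)$, the second zero after $T_2(r)$ is attained before $K_\varepsilon T_2(r) \le K_\varepsilon^2 r$, and we can construct a new Lipschitz radial function $\psi_2(x)$ which makes the Rayleigh quotient non-positive. Moreover, the support of $\psi_2$ is disjoint from that of $\psi_1$. In conclusion, the index of $L$ grows at least by 1 when the radius is multiplied by $K_\varepsilon$, hence
\[ \mathrm{ind}_L(B_r) \ge \mathrm{ind}_L(B_T) + \left\lfloor \log_{K_\varepsilon} \frac{r}{T} \right\rfloor, \]
where $\lfloor s \rfloor$ denotes the floor of $s$. Therefore we have
\[ \liminf_{r \to +\infty} \frac{\mathrm{ind}_L(B_r)}{\log_{K_\varepsilon} r} \ge 1 \quad \forall \varepsilon > 0. \tag{5.2} \]
From the change of base theorem, for every $u, v > 1$, $r > 0$,
\[ \frac{\log_u r}{\log_v r} = \log_u v = \frac{\log v}{\log u}, \tag{5.3} \]
so that
\[ \liminf_{r \to +\infty} \frac{\mathrm{ind}_L(B_r)}{\log r} \ge \frac{1}{\log K_\varepsilon} \quad \forall \varepsilon > 0. \tag{5.4} \]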
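The passage from the floor estimate to (5.4) can be made concrete with a few numbers. In the sketch below the function name and the values $K_\varepsilon = 9.5$, $T = 10$, $\mathrm{ind}_L(B_T) = 0$ are hypothetical choices made only for illustration; the code evaluates the lower bound $\mathrm{ind}_L(B_T) + \lfloor \log_{K_\varepsilon}(r/T) \rfloor$ and compares its ratio to $\log r$ with $1/\log K_\varepsilon$.

```python
import math


def index_lower_bound(r, T, K_eps, ind_T=0):
    """Lower bound ind_T + floor(log_{K_eps}(r / T)), meaningful for r >= T."""
    return ind_T + math.floor(math.log(r / T) / math.log(K_eps))


# Hypothetical sample values, chosen only for illustration.
K_eps, T = 9.5, 10.0
limit = 1.0 / math.log(K_eps)
for exponent in (5, 50, 300):
    r = 10.0 ** exponent
    ratio = index_lower_bound(r, T, K_eps) / math.log(r)
    print(exponent, ratio, limit)
```

As $r$ grows, the printed ratio approaches $1/\log K_\varepsilon$ from below, since the floor and the shift by $\log T$ become negligible against $\log r$.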

Letting $\varepsilon \to 0$ yields the desired conclusion.

5.2. Tangent envelopes: proof of Theorem 1.9

We briefly recall some well known facts. Suppose we are given an isometrically immersed hypersurface $\varphi : M^m \to N^{m+1}$, where $N$ is orientable. We fix the index notation $i, j, k, t \in \{1, \ldots, m\}$, and we choose a local Darboux frame $\{e_i, \nu\}$. Let $R$, $\mathrm{Ricc}$, $s$ (resp. $\bar R$, $\overline{\mathrm{Ricc}}$, $\bar s$) be the curvature tensor, the Ricci tensor and the scalar curvature of $M$ (resp. $N$), denote by $\mathrm{II} = (h_{ij})$ the second fundamental form of the immersion, by $|\mathrm{II}|^2$ the square of its Hilbert–Schmidt norm and by $H = m^{-1} h_{ii}\,\nu$ the mean curvature vector. Tracing twice the Gauss equations
\[ R_{ijkt} = \bar R_{ijkt} + h_{ik} h_{jt} - h_{it} h_{jk} \tag{5.5} \]



we get
\[ s = \bar s - 2\,\overline{\mathrm{Ricc}}(\nu, \nu) + m^2 |H|^2 - |\mathrm{II}|^2. \tag{5.6} \]

Moreover, we recall the Codazzi–Mainardi equation
\[ h_{ijk} - h_{ikj} = \bar R^{\,m+1}{}_{ijk}, \tag{5.7} \]

where $(h_{ijk})$ are the components of the covariant derivative $\nabla \mathrm{II}$. A minimal immersion $\varphi$ is characterized by $H \equiv 0$, which implies that $\varphi$ is a stationary point for the volume functional on every relatively compact domain with smooth boundary in $M$. It is known that if, for example, $N = \mathbb{R}^{m+1}$, a minimal hypersurface cannot be compact and, by (5.6), $s(x) = -|\mathrm{II}|^2 \le 0$. We say that $\varphi$ is stable if it locally minimizes the volume functional up to second order, and unstable otherwise. Analytically the condition of stability is expressed by
\[ \int_M |\nabla\psi|^2 - \bigl(|\mathrm{II}|^2 + \overline{\mathrm{Ricc}}(\nu, \nu)\bigr)\psi^2 \ge 0 \quad \forall \psi \in C_0^{\infty}(M), \]
and it is equivalent to the fact that the Schrödinger operator $L = \Delta + |\mathrm{II}|^2 + \overline{\mathrm{Ricc}}(\nu, \nu)$ satisfies $\lambda_1^L(M) \ge 0$. Observe that, if $N$ is Ricci flat (for example, $N = \mathbb{R}^{m+1}$), using (5.6) we get $L = \Delta + |\mathrm{II}|^2 = \Delta - s(x)$. The strategy of the proof of Theorem 1.9 is to proceed by contradiction. First we prove that, if (1.27) fails, $M$ is stable at infinity, i.e. $\lambda_1^L(M \setminus \Omega) \ge 0$ for the chosen compact domain $\Omega$; then, we contradict this fact using Theorem 1.4 under assumptions (1.23) or (1.24), (1.25), (1.26).

Proof of Theorem 1.9. We reason by contradiction and, without loss of generality, we can assume that the origin $o$ of $\mathbb{R}^{m+1}$ belongs to
\[ \mathbb{R}^{m+1} \setminus \bigcup_{x \in M \setminus \Omega} T_x M. \]
Consider on $M \setminus \Omega$ a local normal unit vector field $\nu$ and define the local vector field $X = \langle \varphi, \nu \rangle \nu$, where $\langle\,,\rangle$ denotes the canonical metric on $\mathbb{R}^{m+1}$. For every point $x$ in the domain of $X$ we have $X_x \ne 0$, since otherwise $\varphi(x)$ would be orthogonal to $\nu(x)$ and thus $T_x M$ would contain the origin $o$. Moreover, under a change of Darboux frame the value of $X$ does not change, hence it provides a globally defined, nowhere vanishing normal vector field, proving that $M \setminus \Omega$ is orientable. Define $u(x) = \langle \varphi(x), \nu(x) \rangle \ne 0$, $u \in C^{\infty}(M \setminus \Omega)$. Possibly inverting the orientation on connected components, we can suppose $u > 0$ on $M \setminus \Omega$. A simple computation using minimality of $\varphi$ and the Codazzi equation (5.7) for $N = \mathbb{R}^{m+1}$ shows that $u$ is a positive solution of $\Delta u - s(x) u = 0$ on $M \setminus \Omega$. By the result of Fisher-Colbrie and Schoen [1] it follows that $L = \Delta - s(x)$ has non-negative spectral radius $\lambda_1^L(M \setminus \Omega)$, hence $M$ is stable at infinity.



To contradict this latter result, we choose A(r) = −S(r) and we use Theorem 1.4, case (ii). A contradiction is immediate in case of (1.23), while if we assume (Vol(∂Br ))−1 ∈ L1 (+∞) we can apply Proposition 2.5(iv): indeed, under assumptions (1.24), (1.25) and (1.26), observing that d 1−α exp −s α −s exp −s α C ds

for s 1,

> 0, there exist positive constants D and H such that for some C r √ r √ −μ/2 Cs ds R A(s) ds lim inf 1 lim inf 1 R +∞ exp{−s α } +∞ ds r→+∞ − log r→+∞ − log ds r 2 Vol(∂Bs ) r 2 Λ μ lim inf Dr 1− 2 −α log−H r = +∞. r→+∞

Proposition 2.5(iv) implies (1.11), so that Theorem 1.4 case (ii) contradicts the stability at infinity of $L$. □

Remark 5.3. Obviously, when $\Omega = \emptyset$ there is a version of the above theorem in finite form, which is based on case (i) of Theorem 1.4. We have preferred not to make the proposition too cumbersome, in order to better appreciate the result itself. Nevertheless, even this case seems interesting: inequality (1.9) implies that a strongly negative scalar curvature on a compact set spreads the tangent hyperplanes everywhere on $\mathbb{R}^{m+1}$, independently of the behavior of the curvature outside the compact set.

5.4. The Gauss map: proof of Theorem 1.10

The proof follows the same lines as that of Theorem 1.9, and we maintain the same notation. We fix an equator $E$ and we reason by contradiction: assume that there exists a sufficiently large geodesic ball $B_R$ such that, outside $B_R$, $\nu$ does not meet $E$. In other words $\nu(M \setminus B_R)$ is contained in the open spherical caps determined by $E$. Indicating with $w \in S^m$ one of the two focal points of $E$, we can say that $\langle w, \nu(x) \rangle \ne 0$ for every $x \in M \setminus B_R$, where $\langle\,,\rangle$ stands for the scalar product of unit vectors in $S^m \subset \mathbb{R}^{m+1}$. Then, the normal vector field $X = \langle w, \nu \rangle \nu$ is globally defined and nowhere vanishing on $M \setminus B_R$, proving that $M \setminus B_R$ is orientable. Therefore, the Gauss map is globally defined on $M \setminus B_R$. Let $C$ be one of the (finitely many) connected components of $M \setminus B_R$; then, $\nu(C)$ is contained in only one of the open spherical caps determined by $E$. Up to replacing $w$ with $-w$, we can suppose $u = \langle w, \nu \rangle > 0$ on $C$. Proceeding in the same way for every connected component, we can construct a positive function $u$ on $M \setminus B_R$. By a standard calculation $u$ satisfies $\Delta u = -h_{ikk} \langle e_i, w \rangle - |\mathrm{II}|^2 u$ on $M \setminus B_R$. Using Schwarz symmetry, the Codazzi equation (5.7) and minimality we deduce
\[ h_{ikk} = h_{kik} = h_{kki} = 0, \tag{5.8} \]



hence $\Delta u + |\mathrm{II}|^2 u = 0$. From (5.6) we get
\[ \Delta u - s(x) u = 0. \tag{5.9} \]
In particular, (5.9) implies $\lambda_1^L(M \setminus B_R) \ge 0$, where $L = \Delta - s(x)$. Observe that since $s(x) = -|\mathrm{II}|^2 \le 0$, its spherical mean $S(r)$ is non-positive. As in the proof of Theorem 1.9, the assumptions imply case (ii) of Theorem 1.4, and this contradicts $\lambda_1^L(M \setminus B_R) \ge 0$.

5.5. The Yamabe problem: proof of Theorem 1.13

Applying Theorem 1.4 to the operator $L = \Delta - \frac{1}{c_m} s(x)$ we obtain $\lambda_1^L(M) < 0$. Hence, the conclusion follows from Theorems 2.4 and 2.1 of [4], with the observation after Theorem 2.3 therein.

Remark 5.6. We can state an alternative version at infinity of condition (1.31) via Proposition 2.5(iv). This reads as follows. Suppose that
\[ S(r) \le -\frac{H}{r^{\beta}} \quad \text{for } r \gg 1 \text{ and some } H > 0,\ \beta \le 2. \]

Then, condition %

⎧ +∞ ds β2 −1 if β < 2, log r m − 2 ⎨ lim infr→+∞ r −β/2+1 Vol(∂Bs ) H > m − 1 ⎩ lim infr→+∞ − 1 log +∞ ds if β = 2, r log r Vol(∂Bs )

(5.10)

implies the existence of the desired conformal deformation.

5.7. Minimal surfaces: proof of Theorems 1.11 and 1.12

We will obtain both results as easy consequences of the next two lemmas, the first of which is a somewhat modified version of a result of Colding and Minicozzi [7]. We adopt the notation of Theorems 1.9 and 1.10.

Lemma 5.8. Let $\varphi : M^2 \to N^3$ be a simply connected, minimally immersed surface in an ambient 3-manifold. Assume that the Ricci tensor of $N$ satisfies
\[ \overline{\mathrm{Ricc}} \ge 0. \tag{5.11} \]
Suppose that $M$ has a pole $o$, and let $L$ be the stability operator. If $\lambda_1^L(M \setminus \Omega) \ge 0$ for some compact set $\Omega$, then there exists a constant $C > 0$ such that
\[ \mathrm{Vol}(B_R) \le C R^2 \quad \forall R \ge 0. \]

Proof. Let $K$ be the sectional curvature of $M$. Since for surfaces $s(x) = 2K$, using (5.11) in (5.6) yields
\[ 2K = \overline{\mathrm{Ricc}}(e_1, e_1) + \overline{\mathrm{Ricc}}(e_2, e_2) - \overline{\mathrm{Ricc}}(\nu, \nu) - |\mathrm{II}|^2 \ge -\overline{\mathrm{Ricc}}(\nu, \nu) - |\mathrm{II}|^2, \]
hence the Rayleigh quotient for the stability operator does not exceed that for $\tilde L = \Delta - 2K$. It follows that, for every subset $D \subset M$, we have the inequality
\[ \lambda_1^{L}(D) \le \lambda_1^{\tilde L}(D), \tag{5.12} \]
thus by the assumptions $\lambda_1^{\tilde L}(M \setminus B_{R_0}) \ge 0$ for some $R_0$ sufficiently large that $\Omega \subset B_{R_0}$. Since $M$ is simply connected and has a pole, the geodesic spheres centered at $o$ are smooth and the geodesic balls are diffeomorphic to Euclidean ones. By the Gauss–Bonnet theorem together with the first variation formula, we have
\[ \int_{B_r} K = 2\pi - l'(r), \tag{5.13} \]
where $l(r)$ is the length of $\partial B_r$ (another way to derive this formula can be found in [5, p. 238]). Denote $K(r) = \int_{B_r} K$, and observe that, by the coarea formula, $K'(r) = \int_{\partial B_r} K$. By the stability of $\tilde L$, for every $\psi \in \mathrm{Lip}_0(M \setminus B_{R_0})$ we have
\[ \int_{M \setminus B_{R_0}} |\nabla\psi|^2 + 2\int_{M \setminus B_{R_0}} K\psi^2 \ge 0. \tag{5.14} \]
Fix $R > R_0 + 2$ and choose $\psi(x) = f(r(x))$, where
\[ f(t) = \begin{cases} 0 & \text{if } t \le R_0, \\ t - R_0 & \text{if } t \in [R_0, R_0 + 1], \\ \dfrac{R - t}{R - R_0 - 1} & \text{if } t \in [R_0 + 1, R], \\ 0 & \text{if } t \ge R. \end{cases} \]
Then, using (5.13) in (5.14) and integrating by parts, by the properties of $f$ we have
\[ 0 \le \int_{R_0}^{R} \bigl(f'(r)\bigr)^2 l(r)\,dr + 2\int_{R_0}^{R} l'(r)\,\bigl(f^2(r)\bigr)'\,dr. \]
Inserting the explicit expression of $f$ we obtain
\[ 0 \le \mathrm{Vol}(B_{R_0+1}) - \mathrm{Vol}(B_{R_0}) + \frac{\mathrm{Vol}(B_R) - \mathrm{Vol}(B_{R_0+1})}{(R - R_0 - 1)^2} + 4\,l(R_0 + 1) - 4\bigl[\mathrm{Vol}(B_{R_0+1}) - \mathrm{Vol}(B_{R_0})\bigr] + \frac{4\,l(R_0 + 1)}{R - R_0 - 1} - \frac{4\bigl(\mathrm{Vol}(B_R) - \mathrm{Vol}(B_{R_0+1})\bigr)}{(R - R_0 - 1)^2}. \]



Therefore, there exists a constant $C = C(R_0)$ depending on the geometry of $B_{R_0+1}$ such that, for every $R > R_0 + 2$,
\[ \frac{3\bigl(\mathrm{Vol}(B_R) - \mathrm{Vol}(B_{R_0+1})\bigr)}{(R - R_0 - 1)^2} \le C(R_0), \]
hence,
\[ \mathrm{Vol}(B_R) \le \mathrm{Vol}(B_{R_0+1}) + \frac{C(R_0)}{3}\,(R - R_0 - 1)^2 \le \tilde C(R_0)\,R^2. \]

Since near $o$ the geometry of $M$ is "nearly" Euclidean, up to enlarging the constant the same estimate holds on all of $M$, and this concludes the proof. □

Remark 5.9. Note that, in case $\Omega = \emptyset$ and $N = \mathbb{R}^3$, we recover the Colding and Minicozzi theorem, for which the simply-connectedness assumption is unnecessary: in fact, we can pass to the Riemannian universal covering $\tilde M$ of $M$. Indeed, by the Fisher-Colbrie and Schoen result [1] stability is equivalent to the existence of a positive solution $u$ of $Lu = 0$ on $M$; $u$ can be lifted up by composition with the covering projection, which is a local isometry, yielding a positive solution of the same equation on $\tilde M$. Moreover, in this case the existence of a pole is automatically satisfied since by (5.5) $\tilde M$ has non-positive sectional curvature.

The next lemma is a calculus exercise (see [11]).

Lemma 5.10. If
\[ \frac{r}{\mathrm{Vol}(B_r)} \notin L^1(+\infty), \]
then
\[ \frac{1}{\mathrm{Vol}(\partial B_r)} \notin L^1(+\infty). \]

Now we are ready to prove Theorem 1.11 and Corollary 1.12.

Proof of Theorem 1.11. By assumption there exists a relatively compact set $\Omega$ such that $M \setminus \Omega$ is stable, that is, $\lambda_1^L(M \setminus \Omega) \ge 0$. Moreover, by the Gauss equation (5.5) we get $2K = -|\mathrm{II}|^2 \le 0$, so that every point of $M$ is a pole. Lemma 5.8 implies that $\mathrm{Vol}(B_r) \le C r^2$, hence
\[ \frac{r}{\mathrm{Vol}(B_r)} \notin L^1(+\infty). \]
From Lemma 5.10 we obtain $(\mathrm{Vol}(\partial B_r))^{-1} \notin L^1(+\infty)$, and by a classical result $M$ is parabolic. Suppose now that (1.30) is false, that is,
\[ \int_M |K| = \infty. \]
Then, the function $a(x) = -2K$ satisfies all the assumptions of Theorem 1.4, and case (ii) implies that $\Delta - 2K \equiv \Delta + |\mathrm{II}|^2 = L$ is unstable at infinity, which is a contradiction and concludes the proof. □



Proof of Corollary 1.12. If M is stable, then there exists a global positive smooth solution u we deduce that M is a stable minimal surface of Lu = 0. Lifting u to the universal covering M is parabolic, hence u is a positive with non-positive sectional curvature. By Theorem 1.11, M constant: indeed, u = −|II|2 u 0. Equality Lu = 0 shows that |II|2 ≡ 0. Alternatively, one can conclude as follows: by Lemma 5.8 applying we deduce 1/v ∈ / L1 (+∞), where v is the volume of the geodesic spheres of M; < 0, contradicting the stability as( M) Theorem 1.4, case (i) we deduce that, if |II|2 ≡ 0, λL 1 sumption. 2 Remark 5.11. Theorems 1.11 and 1.12 can be slightly generalized to the case Ricc 0, assuming a-priori that M has a pole. Indeed, with different techniques, by [14] there is no need to require the existence of the pole. However, this seems to be essential in Lemma 5.8 in order to apply the Gauss–Bonnet theorem. 5.12. The growth of the spectral radius: proof of Theorem 1.6 We begin with a lemma. In case the volume growth is at most exponential, by a direct application of this result we recover Do Carmo and Zhou estimates (1.14) and (1.15). Lemma 5.13. Suppose that Vol(∂Br ) f (r)

on (R, +∞)

for some R sufficiently large and some f ∈ C 0 ([R0 , +∞)). Fix R 0. – If M has infinite volume and (Vol(∂Br ))−1 ∈ / L1 (+∞), then λ 1 (M \ BR ) = 0.

(5.15)

– If (Vol(∂Br ))−1 ∈ L1 (+∞), then for every > 0 there exists T0 = T0 () > R such that +∞ ds 2 log t 1 f (s) λ + . (5.16) 1 (M \ BR ) inf − t>T0 2 t − T0 Proof. Set v(r) = Vol(∂Br ). We begin with the case 1/v ∈ L1 (+∞). Let R > 0 be sufficiently large that +∞ R0 > R,

ds < 1, v(s)

R0

and let > 0. We define on [R0 , +∞) +∞ ds 2 1 log t f (s) A (r) = inf − + . t>r 2 t −r



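Then, $A_\varepsilon(r) \ge \varepsilon$, and $A_\varepsilon(r)$ is continuous and non-decreasing. By Remark 1.7, $M$ has infinite volume, thus we can apply (v) of Proposition 2.5 to obtain that (2.19) (with $A_\varepsilon$ instead of $A$) is oscillatory. Let $z_\varepsilon$ be a locally Lipschitz solution of (2.19), and let $R_0 < T_1 < T_2$ be two consecutive zeros. Define $\varphi_\varepsilon(x) = z_\varepsilon(r(x))$ on $B_{T_2} \setminus B_{T_1}$. Proceeding as in the proof of Theorem 1.4, by the domain monotonicity of eigenvalues we have
\[ 0 \le \lambda_1^{\Delta}(M \setminus B_R) < \lambda_1^{\Delta}(B_{T_2} \setminus B_{T_1}) \le \frac{\int_{B_{T_2} \setminus B_{T_1}} |\nabla\varphi_\varepsilon|^2}{\int_{B_{T_2} \setminus B_{T_1}} \varphi_\varepsilon^2} = \frac{\int_{T_1}^{T_2} [z_\varepsilon'(r)]^2 v(r)\,dr}{\int_{T_1}^{T_2} z_\varepsilon(r)^2 v(r)\,dr} = \frac{\int_{T_1}^{T_2} A_\varepsilon(r) z_\varepsilon(r)^2 v(r)\,dr}{\int_{T_1}^{T_2} z_\varepsilon(r)^2 v(r)\,dr} \le A_\varepsilon(T_2) = \inf_{t > T_2} \left( \frac{-\frac{1}{2}\log \int_t^{+\infty} \frac{ds}{f(s)}}{t - T_2} \right)^{2} + \varepsilon. \]
Thus we get (5.16) with $T_0 = T_2$ (note that $T_0$ depends on $\varepsilon$ since $z_\varepsilon(t)$ does). In case $1/v \notin L^1(+\infty)$ and $M$ has infinite volume, by Theorem 2.4, Eq. (2.2) is oscillatory whenever $A(r) \ge \varepsilon > 0$: indeed
\[ \int_{R_0}^{+\infty} A(s)v(s)\,ds \ge \varepsilon \int_{R_0}^{+\infty} v(s)\,ds = +\infty. \]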

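Thus, choosing $A_\varepsilon(r) = \varepsilon$ the above reasoning shows that $\lambda_1^{\Delta}(M \setminus B_R) \le \varepsilon$, and the validity of (5.15) follows at once. □

Lemma 5.14. In case $1/v \in L^1(+\infty)$, the previous lemma yields in particular the weaker estimate
\[ \lambda_1^{\Delta}(M \setminus B_R) \le \liminf_{t \to +\infty} \left( \frac{-\frac{1}{2}\log \int_t^{+\infty} \frac{ds}{f(s)}}{t} \right)^{2} \quad \forall R > 0. \tag{5.17} \]
Proof. This follows immediately from the next observation: if we substitute in (5.16) “inf” with the greater “liminf,” we observe that this does not depend on $T_0(\varepsilon)$. We can thus fix a particular $T_0(\varepsilon)$, compute the “liminf” and then let $\varepsilon \to 0$. □

Proof of Theorem 1.6. First, we apply Lemma 5.14 to estimate $\lambda_1^{\Delta}(M \setminus B_R)$ in case the volume growth is at most exponential. Towards this aim suppose that $(\mathrm{Vol}(\partial B_r))^{-1} \in L^1(+\infty)$ and that
\[ \mathrm{Vol}(\partial B_r) \le f(r) = \Lambda \exp\left(a r^{\alpha}\right), \qquad 0 < \alpha \le 1,\ \Lambda, a > 0. \tag{5.18} \]
Due to our choice of $\alpha$ we easily see that
\[ \frac{-\frac{1}{2}\log \int_t^{+\infty} \frac{ds}{f(s)}}{t} \sim \frac{a}{2}\,t^{\alpha - 1} \quad \text{as } t \to \infty. \]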



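Because of this we can apply Lemma 5.14. Hence, for every $R \ge 0$,
\[ \lambda_1^{\Delta}(M \setminus B_R) \le \begin{cases} 0 & \text{if } 0 < \alpha < 1, \\ a^2/4 & \text{if } \alpha = 1. \end{cases} \tag{5.19} \]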
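The two regimes in (5.19) can be observed on closed-form examples. The sketch below is an illustration under stated assumptions: the function names and the values $a = 3$, $\Lambda = 2$ are hypothetical, and only the exponents $\alpha = 1$ and $\alpha = 1/2$ are treated, because for these the tail integral $\int_t^{+\infty} ds/f(s)$ with $f(s) = \Lambda \exp(a s^{\alpha})$ has an elementary expression.

```python
import math


def quotient_alpha_one(t, a, lam):
    """Squared quotient of (5.17) at time t for f(s) = lam * exp(a * s)."""
    # Closed form: int_t^inf exp(-a s) / lam ds = exp(-a t) / (a * lam).
    log_tail = -a * t - math.log(a * lam)
    return (-0.5 * log_tail / t) ** 2


def quotient_alpha_half(t, a, lam):
    """Squared quotient of (5.17) at time t for f(s) = lam * exp(a * sqrt(s))."""
    # Closed form: int_t^inf exp(-a sqrt(s)) ds
    #            = 2 * (sqrt(t)/a + 1/a^2) * exp(-a sqrt(t)).
    root = math.sqrt(t)
    log_tail = math.log(2.0 * (root / a + 1.0 / a**2) / lam) - a * root
    return (-0.5 * log_tail / t) ** 2


# Hypothetical sample parameters, chosen only for illustration.
a, lam = 3.0, 2.0
print(quotient_alpha_one(1e6, a, lam), a**2 / 4.0)
print(quotient_alpha_half(1e8, a, lam))
```

For $\alpha = 1$ the quotient approaches $a^2/4$, while for $\alpha = 1/2$ it tends to 0, matching the two cases of (5.19).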

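In this way we recover the Do Carmo and Zhou results quoted in the Introduction, and we also show that the estimate in Lemma 5.13 is sharp. The above observations work also in case $\mathrm{Vol}(\partial B_r) \le \Lambda \exp\{a r^{\alpha} \log^{\beta} r\}$, with $\alpha < 1$, $\beta \ge 0$, since it is enough to note that
\[ \exp\left\{a r^{\alpha} \log^{\beta} r\right\} = O\left(\exp\left\{a r^{\alpha'}\right\}\right) \quad \text{for every } \alpha' > \alpha, \]
and to choose $\alpha'$ such that $\alpha < \alpha' < 1$. We are left with the case $\alpha \ge 1$, $\beta \ge 0$. For $c > 1$ and $r > R$ we define
\[ A(r) = c^2 \left[\frac{a\alpha}{2}\,r^{\alpha - 1} \log^{\beta} r\right]^{2}. \]
Note that $A(r)$ is monotone increasing. Moreover, Remark 4.2 ensures that (2.19) is oscillatory. Hence, proceeding as in Lemma 5.13 we have, for $R \ge R_0$,
\[ \lambda_1^{\Delta}(M \setminus B_R) \le A(T_2), \]
where $T_2(R)$ is the second zero of the solution $z$ of (2.19) after $R$. By Theorem 4.1, for every $\varepsilon > 0$ there exists $R_1(\varepsilon)$ such that, for every $R \ge R_1$,
\[ T_2(R) \le \left(\frac{c+1}{c-1}\right)^{2/\alpha} (1 + \varepsilon)\,R. \]
Therefore, from the monotonicity of $A(r)$ we get
\[ \lambda_1^{\Delta}(M \setminus B_R) \le A\left( \left(\frac{c+1}{c-1}\right)^{2/\alpha} (1 + \varepsilon)\,R \right) \quad \forall R \ge R_1(\varepsilon). \]
Inserting the value of $A(r)$, up to choosing $\varepsilon$ small enough and $R_2 \ge R_1$ large enough we deduce that, for every fixed $c > 1$,
\[ \lambda_1^{\Delta}(M \setminus B_R) \le c^2\,\frac{a^2\alpha^2}{4} \left(\frac{c+1}{c-1}\right)^{\frac{4(\alpha - 1)}{\alpha}} (1 + 2\varepsilon)\,R^{2(\alpha - 1)} \log^{2\beta} R \quad \forall R \ge R_2(\varepsilon). \]
Thus, letting first $R \to +\infty$ and then $\varepsilon \to 0$, and minimizing over all $c \in (1, +\infty)$ we finally have
\[ \limsup_{R \to +\infty} \frac{\lambda_1^{\Delta}(M \setminus B_R)}{R^{2(\alpha - 1)} \log^{2\beta} R} \le \frac{a^2\alpha^2}{4} \inf_{c \in (1, +\infty)} \left\{ c^2 \left(\frac{c+1}{c-1}\right)^{\frac{4(\alpha - 1)}{\alpha}} \right\}. \tag{5.20} \]
This concludes the proof of the theorem. □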
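The infimum over $c$ in (5.20) is easy to explore numerically. The sketch below is only illustrative: the function names, the sample value $\alpha = 2$ and the search grid are assumptions. It computes the positive root of the stationarity condition $\alpha(c + 1)(c - 1) = 4(\alpha - 1)c$, obtained by differentiating the logarithm of $c^2 \left(\frac{c+1}{c-1}\right)^{4(\alpha-1)/\alpha}$, and compares it with a brute-force minimization on a grid.

```python
import math


def objective(c, alpha):
    """The function c^2 * ((c+1)/(c-1))^(4(alpha-1)/alpha) minimized in (5.20)."""
    return c**2 * ((c + 1.0) / (c - 1.0)) ** (4.0 * (alpha - 1.0) / alpha)


def critical_point(alpha):
    """Positive root of alpha*(c+1)*(c-1) = 4*(alpha-1)*c, a quadratic in c."""
    b = 4.0 * (alpha - 1.0)
    return (b + math.sqrt(b * b + 4.0 * alpha * alpha)) / (2.0 * alpha)


# Hypothetical sample exponent and search grid, chosen only for illustration.
alpha = 2.0
c_star = critical_point(alpha)
grid = [1.0 + 0.001 * k for k in range(1, 20000)]
c_grid = min(grid, key=lambda c: objective(c, alpha))
print(c_star, c_grid, objective(c_star, alpha))
```

For $\alpha = 2$ the root is $1 + \sqrt{2}$, and the grid minimizer lands within the grid spacing of it.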



Remark 5.15. The infimum of the function
\[ c^2 \left(\frac{c+1}{c-1}\right)^{\frac{4(\alpha - 1)}{\alpha}} \]
is attained by the unique positive solution $c$ of $\alpha(c + 1)(c - 1) = 4(\alpha - 1)c$, which can be computed, although its explicit expression is not so neat.

Remark 5.16. It is worth pointing out that an application of (5.20) in case $\alpha = 1$ and $\beta = 0$ gives $\lambda_1^{\Delta}(M \setminus B_R) \le a^2/4$, hence estimate (5.20) is sharp with respect to the constant appearing in the RHS.

Remark 5.17. Proceeding as in the Introduction, one can study a model manifold whose function $h(r)$ is of the following type:
\[ h(r) = \begin{cases} r, & r \in [0, 1], \\ \exp\left\{\dfrac{a r^{\alpha}}{m - 1} \log^{\beta} r\right\}, & r \in [2, +\infty), \end{cases} \]
for which the volume growth of geodesic spheres is $\exp\{a r^{\alpha} \log^{\beta} r\}$. Performing the same computations as in the Introduction, one obtains for $R$ sufficiently large
\[ \lambda_1^{\Delta}(M \setminus B_R) \ge K R^{2(\alpha - 1)} \log^{2\beta} R \]

for some $K > 0$. This shows that the estimate of Theorem 1.6 is sharp even with respect to the power of the logarithm.

Acknowledgments

The authors express their gratitude to the referee for a very careful reading of the manuscript and for the many useful observations which led to substantial improvements.

Appendix A

This appendix is devoted to showing existence for the Cauchy problem (2.2) under general assumptions on $v(t)$, $A(t)$. Moreover, we prove that the zeros of such solutions, if any, are at isolated points, and we stress a Sturm type comparison result. In this respect, we fix $R \in (0, +\infty]$ (note that $+\infty$ is allowed), and we assume that $v(t)$, $A(t)$ satisfy the following set of assumptions:

(A1) $0 \le A(t) \in L^{\infty}_{\mathrm{loc}}([0, R))$, $A \not\equiv 0$ in $L^{\infty}_{\mathrm{loc}}$ sense;

(V1) $0 \le v(t) \in L^{\infty}_{\mathrm{loc}}([0, R))$, $\dfrac{1}{v(t)} \in L^{\infty}_{\mathrm{loc}}((0, R))$, $\lim_{t \to 0^+} v(t) = 0$;

(V3) there exists $a \in (0, R)$ such that $v$ is strictly increasing on $(0, a)$.



Proposition A.1. Under assumptions (A1), (V1), (V3) there exists a locally Lipschitz function $z \in \mathrm{Lip}_{\mathrm{loc}}([0, R))$ such that
\[ \begin{cases} \bigl(v(t) z'(t)\bigr)' + A(t) v(t) z(t) = 0 & \text{almost everywhere on } (0, R), \\ z(0^+) = z_0 > 0. \end{cases} \tag{A.1} \]
Moreover, up to a zero-measure set $\Omega$,
\[ \lim_{\substack{t \to 0^+ \\ t \notin \Omega}} z'(t) = 0. \]
If in addition $A(t)$, $v(t)$ are continuous on $[0, R)$, then $z \in C^1([0, R))$ and $z'(0^+) = 0$.

Proof. First, fix a sequence $T_j \uparrow R$. We can suppose that $a \in (0, T_j)$ for every $j$, where $a$ is as in (V3), and $A \not\equiv 0$ on $[0, T_j]$: the case $A \equiv 0$ is easier and can be treated similarly. Fix $\varepsilon \in (0, a)$, and define
\[ v_\varepsilon(t) = \begin{cases} v(\varepsilon) & \text{on } (0, \varepsilon], \\ v(t) & \text{on } [\varepsilon, R). \end{cases} \]
Then,
\[ k_\varepsilon(t, s) = -A(s) v_\varepsilon(s) \int_s^t \frac{dx}{v_\varepsilon(x)} \tag{A.2} \]
belongs to $L^{\infty}_{\mathrm{loc}}([0, R) \times [0, R))$. Thus, by standard theory (one can consult Chapter IX of [20]), the Volterra integral equation of the second type
\[ w(t) = z_0 + \int_0^t k_\varepsilon(t, s) w(s)\,ds, \tag{A.3} \]

restricted to every interval [0, Tj ] (where the kernel kε (t, s) is bounded), admits a unique solution zε,j ∈ L2 ((0, Tj )). From (A.2), using integration by parts applied to the integrable function −A(s)vε (s)zε,j (s) and to the absolutely continuous one t s

dx . vε (x)

We see that zε,j satisfies t zε,j (t) = z0 − 0

1 vε (s)

s

A(x)vε (x)zε,j (x) dx ds

0

(A.4)



on $[0,T_j]$. This shows that $z_{\varepsilon,j}(t)$, being an integral function, is absolutely continuous on $[0,T_j]$ (hence, almost everywhere differentiable), and its derivative is almost everywhere
\[ -\frac{1}{v_\varepsilon(t)}\int_0^t A(x)v_\varepsilon(x)z_{\varepsilon,j}(x)\,dx \in L^\infty([0,T_j]). \]

Therefore, $z_{\varepsilon,j}(t)$ is a Lipschitz function on $[0,T_j]$. By the uniqueness of solutions of (A.3), we deduce that, when j < k, $z_{\varepsilon,k}$ restricted to $[0,T_j]$ coincides with $z_{\varepsilon,j}$. Hence, we can construct a locally Lipschitz solution $z_\varepsilon(t)$ defined on the whole [0, R). What we want to prove is that, for every $T_j$, the family $\{z_\varepsilon\}_{\varepsilon\in(0,a)}$ is equibounded and equi-Lipschitz in $C^0([0,T_j])$. For ease of notation, from now on we omit the subscript j and we consider the problem on [0, T] ⊂ [0, R). We observe that, because of (V3) and (A1), for $0 \le s \le t \le a$ we have
\[ |k_\varepsilon(t,s)| \le \|A\|(t-s) \le \|A\|a, \tag{A.5} \]
where $\|A\| = \|A\|_{L^\infty([0,T])}$. Next, we consider the case $0 \le s \le a < t \le T$. Because of (V1), $v^{-1}$ is bounded on [a, T]. We indicate with $\|v^{-1}\|$ the $L^\infty$-norm of $v^{-1}(t)$ on [a, T], and with $\|v\|$ the $L^\infty$-norm of v(t) on the whole [0, T]. It follows that
\[ |k_\varepsilon(t,s)| = A(s)v_\varepsilon(s)\left[\int_s^a \frac{dx}{v_\varepsilon(x)} + \int_a^t \frac{dx}{v_\varepsilon(x)}\right] \le \|A\|\big(a + v_\varepsilon(s)\|v^{-1}\|T\big) \le \|A\|\big(a + \|v\|\|v^{-1}\|T\big). \]
It remains to consider the case $0 < a \le s \le t \le T$. In this case we obtain
\[ |k_\varepsilon(t,s)| \le A(s)v_\varepsilon(s)\|v^{-1}\|T \le \|A\|\|v\|\|v^{-1}\|T. \]
Therefore, there exists L = L(T, a) > 0 such that
\[ \sup_{\varepsilon\in(0,a)}\Big(\sup_{0\le s\le t\le T}|k_\varepsilon(t,s)|\Big) \le L. \tag{A.6} \]

Using (A.6) in (A.3) we have
\[ |z_\varepsilon(t)| \le z_0 + L\int_0^t |z_\varepsilon(s)|\,ds \qquad \forall t\in[0,T], \]
so that, applying the Gronwall lemma to the continuous function $|z_\varepsilon(t)|$, we conclude
\[ |z_\varepsilon(t)| \le z_0e^{Lt} \le z_0e^{LT} \qquad \text{on } [0,T]. \tag{A.7} \]



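This shows equiboundedness of the family $\{z_\varepsilon\}_{\varepsilon\in(0,a)}$. To show equicontinuity we differentiate (A.4) to obtain
\[ z_\varepsilon'(t) = -\frac{1}{v_\varepsilon(t)}\int_0^t A(x)v_\varepsilon(x)z_\varepsilon(x)\,dx \qquad \text{almost everywhere on } [0,T]. \tag{A.8} \]
We set
\[ H(\varepsilon,t) = \frac{1}{v_\varepsilon(t)}\max_{s\in[0,t]} A(s)v_\varepsilon(s). \]
If $0 \le t \le a$, because of (A1) and (V3) we have $H(\varepsilon,t) \le \|A\|$. If $a < t \le T$, since ε ∈ (0, a), $v_\varepsilon(t) = v(t)$ and therefore
\[ H(\varepsilon,t) \le \|A\|\frac{\|v_\varepsilon\|_{L^\infty([0,t])}}{v(t)} \le \|A\|\|v^{-1}\|\|v_\varepsilon\|_{L^\infty([0,t])} \le \|A\|\|v^{-1}\|\|v\|, \]
where the last inequality is an immediate consequence of (V3) and the definition of $v_\varepsilon(t)$. Summarizing, there exists M = M(T, a) > 0 such that
\[ \sup_{\varepsilon\in(0,a)} H(\varepsilon,t) \le M \qquad \text{a.e. on } [0,T]. \]
From (A.8) it follows that
\[ |z_\varepsilon'(t)| \le M\int_0^t |z_\varepsilon(x)|\,dx \qquad \text{a.e. on } [0,T] \]
and thus, from (A.7),
\[ |z_\varepsilon'(t)| \le z_0MTe^{LT} \qquad \text{a.e. on } [0,T]. \tag{A.9} \]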

This shows that $\{z_\varepsilon\}_{\varepsilon\in(0,a)}$ is equi-Lipschitz on every compact subset [0, T] ⊂ [0, R). By the Ascoli–Arzelà theorem, the set $\{z_\varepsilon\}_{\varepsilon\in(0,a)}$ is relatively compact in $C^0([0,T])$. Therefore, there exists a sequence $\varepsilon_n \to 0$ such that $z_{\varepsilon_n}$ converges uniformly to a Lipschitz function z on [0, T]. A Cantor diagonal argument on the exhaustion $[0,T_j] \uparrow [0,R)$ yields a sequence $z_{\varepsilon_n}$ which converges locally uniformly to a locally Lipschitz function z on [0, R). Clearly, $v_{\varepsilon_n} \to v$ in $L^\infty([0,R))$. If we set
\[ r_\varepsilon(t) = \frac{1}{v_\varepsilon(t)}\int_0^t A(s)v_\varepsilon(s)z_\varepsilon(s)\,ds \]



using (A.8) and (A.9) we see that $r_{\varepsilon_n}$ is locally a bounded sequence of $L^\infty_{loc}$-functions converging pointwise to
\[ r(t) = \frac{1}{v(t)}\int_0^t A(s)v(s)z(s)\,ds \qquad \text{a.e. on } [0,R). \]
By the dominated convergence theorem $r_{\varepsilon_n} \to r$ in $L^1((0,t])$ for all t ∈ (0, R). Hence, for every t ∈ [0, R),
\[ \lim_{n\to+\infty}\int_0^t \frac{ds}{v_{\varepsilon_n}(s)}\int_0^s A(x)v_{\varepsilon_n}(x)z_{\varepsilon_n}(x)\,dx = \int_0^t \frac{ds}{v(s)}\int_0^s A(x)v(x)z(x)\,dx. \]
Because of (A.4) it follows that z satisfies the integral equation
\[ z(t) = z_0 - \int_0^t \frac{1}{v(s)}\left[\int_0^s A(x)v(x)z(x)\,dx\right]ds, \tag{A.10} \]

hence the Cauchy problem (A.1). Note that, in case v(t), A(t) are also continuous, from (A.10) we deduce that $z(t) \in C^1((0,R))$. Because of (V3), for t ∈ (0, a] we have
\[ |z'(t)| = \left|\frac{1}{v(t)}\int_0^t A(s)v(s)z(s)\,ds\right| \le \int_0^t A(s)|z(s)|\,ds \qquad \text{almost everywhere} \]
so that, up to a zero-measure set Ω, $z'(t) \to 0$ as $t \to 0^+$, $t \notin \Omega$. In case v(t), A(t) are continuous, the above inequality is everywhere valid and shows that $z(t) \in C^1([0,R))$ with $z'(0^+) = 0$. This concludes the proof. □

Remark A.2. With the same technique (but a simpler proof) we can provide existence of a locally Lipschitz solution of problem (2.19) when (A1), (V1) are met on $[t_0,R)$, for some $t_0 > 0$. Note that 1/v is required to be bounded also in a neighborhood of $t_0$.

Proposition A.3. Assume (A1) and (V1). Then, the zeros of every locally Lipschitz solution z(t) of (A.1), if any, are at isolated points of [0, R).

Proof. Let
\[ y(t) = -\frac{v(t)z'(t)}{z(t)}. \]
Since $z \in \mathrm{Lip}_{loc}([0,R))$, y(t) is at least locally Lipschitz on compact sets of [0, R) \ {t: z(t) = 0}. This follows since $(vz')' = -Avz \in L^\infty_{loc}([0,R))$, hence $vz'$ is locally Lipschitz. Differentiating and using (A.1) we get
\[ y'(t) = A(t)v(t) + \frac{y^2(t)}{v(t)} \qquad \text{almost everywhere,} \]



hence y(t) is increasing on its domain. Assume that $t_0 \in (0,R)$ is a zero of z(t) (note that $z_0 > 0$). First, we prove that
\[ \lim_{t\to t_0^\pm} y(t) = \mp\infty. \tag{A.11} \]
Indeed, both limits exist by monotonicity. Indicating with $L_\pm$ the two limits, if by contradiction $L_- < +\infty$ (analogously for $L_+ > -\infty$) then necessarily
\[ v(t_0)z'(t_0) = \lim_{t\to t_0} v(t)z'(t) = \lim_{t\to t_0^-} v(t)z'(t) = -z(t_0)L_- = 0. \]
Therefore, z(t) should solve
\[ \big(v(t)z'\big)' + A(t)v(t)z = 0 \ \text{ almost everywhere on } (0,R), \qquad z(t_0) = 0, \quad v(t_0)z'(t_0) = 0. \tag{A.12} \]
In other words, z(t) should be a locally Lipschitz solution of the Volterra integral problem
\[ z(t) = -\int_{t_0}^t \frac{1}{v(s)}\left[\int_{t_0}^s A(x)v(x)z(x)\,dx\right]ds = -\int_{t_0}^t A(s)v(s)\left[\int_s^t \frac{dx}{v(x)}\right]z(s)\,ds, \tag{A.13} \]

where the last equality follows by integrating by parts. Since v(t) is bounded away from zero on compact sets of (0, R), the kernel of the Volterra operator is locally bounded. Therefore, (A.13) has a unique local solution, which is necessarily z ≡ 0 on every $[T_1,T_2] \subset (0,R)$. This contradicts $z(0^+) = z_0 > 0$ and proves (A.11). Now, if there exists $\{t_k\}$ such that $z(t_k) = 0$ and $t_k \to t_0$, every neighborhood of $t_0$ should contain points $t_k$ such that $\lim_{t\to t_k^\pm} y(t) = \mp\infty$, and this clearly contradicts the fact that y(t) has both left and right limits in $t_0$. □

Proposition A.4. Assume (V1), and let $A_1$, $A_2$ satisfy (A1) and $A_1 \le A_2$ a.e. on [0, R). Suppose that $z_i(t)$, i ∈ {1, 2}, is a locally Lipschitz solution of (A.1) with $A(t) = A_i(t)$. Fix $T \le R$ such that $z_1(t), z_2(t) > 0$ on [0, T). Then $z_2(t) \le z_1(t)$ on [0, T).

Proof. We consider the locally Lipschitz function $F = (vz_1')z_2 - (vz_2')z_1$. Differentiating we obtain
\[ F' = (vz_1')'z_2 - (vz_2')'z_1 = z_2(-A_1vz_1) - z_1(-A_2vz_2) = (A_2 - A_1)vz_1z_2 \ge 0 \]
almost everywhere on [0, T). This shows that F is non-decreasing. From $F(0^+) = 0$ we argue $F \ge 0$ and therefore $vz_1'z_2 \ge vz_2'z_1$. By (V1) we deduce that v is essentially bounded from below with a positive constant on compact sets of (0, T), thus $z_1'z_2 \ge z_2'z_1$ almost everywhere. Hence
\[ \left(\frac{z_1}{z_2}\right)' = \frac{z_1'z_2 - z_2'z_1}{z_2^2} \ge 0 \qquad \text{almost everywhere on } (0,T). \]
Since $z_1(0)/z_2(0) = 1$ we conclude $z_1(t) \ge z_2(t)$ on [0, T). □
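Propositions A.1 and A.4 can be illustrated numerically by iterating the integral equation (A.10) on a grid. The following is a minimal sketch and not part of the original argument. It assumes the model data v(t) = t with constant A, for which (A.1) becomes a Bessel equation with solution $z_0 J_0(\sqrt{A}\,t)$; the names `picard_solution` and `bessel_j0`, the grid size and the iteration count are illustrative choices.

```python
import numpy as np

def cumulative_trapezoid(y, h):
    """Cumulative trapezoidal integral of samples y on a uniform grid of step h."""
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * h * (y[1:] + y[:-1]))
    return out

def picard_solution(v, A, z0, T, n=3001, iterations=80):
    """Iterate z <- z0 - int_0^t (1/v(s)) int_0^s A v z dx ds, i.e. equation (A.10)."""
    t = np.linspace(0.0, T, n)
    h = t[1] - t[0]
    vt, At = v(t), A(t)
    z = np.full(n, float(z0))
    for _ in range(iterations):
        inner = cumulative_trapezoid(At * vt * z, h)
        ratio = np.zeros(n)
        ratio[1:] = inner[1:] / vt[1:]  # v(0) = 0; the quotient tends to 0 as t -> 0+
        z = z0 - cumulative_trapezoid(ratio, h)
    return t, z

def bessel_j0(x, terms=40):
    """Power series of the Bessel function J_0, adequate for moderate |x|."""
    x = np.asarray(x, dtype=float)
    term = np.ones_like(x)
    total = np.ones_like(x)
    for k in range(1, terms):
        term = -term * (x / 2.0) ** 2 / k ** 2
        total = total + term
    return total

# Model data (assumed for illustration): v(t) = t, A_1 = 1 <= A_2 = 4, z_0 = 1.
v = lambda t: t
t, z1 = picard_solution(v, lambda t: np.ones_like(t), 1.0, 1.2)
_, z2 = picard_solution(v, lambda t: 4.0 * np.ones_like(t), 1.0, 1.2)

err1 = np.max(np.abs(z1 - bessel_j0(t)))        # distance to J_0(t)
err2 = np.max(np.abs(z2 - bessel_j0(2.0 * t)))  # distance to J_0(2t)
comparison_holds = bool(np.all(z2 <= z1 + 1e-12))
print(err1, err2, comparison_holds)
```

On [0, 1.2] both solutions stay positive, so the ordering $z_2 \le z_1$ checked in the last lines is the conclusion of Proposition A.4 for this particular pair.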



References

[1] D. Fisher-Colbrie, R. Schoen, The structure of complete stable minimal surfaces in 3-manifolds of non-negative scalar curvature, Comm. Pure Appl. Math. XXXIII (1980) 199–211.
[2] B. Bianchini, M. Rigoli, Non-existence and uniqueness of positive solutions of Yamabe type equations on nonpositively curved manifolds, Trans. Amer. Math. Soc. 349 (1997) 4753–4774.
[3] C.A. Swanson, Comparison and Oscillation Theory of Linear Differential Equations, Academic Press, New York–London, 1968.
[4] S. Pigola, M. Rigoli, A.G. Setti, Existence and non-existence results for a logistic type equation on manifolds, Trans. Amer. Math. Soc., in press.
[5] S. Pigola, M. Rigoli, A.G. Setti, Vanishing and Finiteness Results in Geometric Analysis, Progr. Math., vol. 266, Birkhäuser, 2008.
[6] B. Halpern, On the immersion of an m-dimensional manifold in (m + 1)-dimensional Euclidean space, Proc. Amer. Math. Soc. 30 (1971) 181–184.
[7] T. Colding, W. Minicozzi, Estimates for parametric elliptic integrands, Int. Math. Res. Not. (2002) 291–297.
[8] M.P. Do Carmo, D. Zhou, Eigenvalue estimate on complete non-compact Riemannian manifolds and applications, Trans. Amer. Math. Soc. 351 (1999) 1391–1401.
[9] T. Hasanis, D. Koutroufiotis, A property of complete minimal surfaces, Trans. Amer. Math. Soc. 281 (1984) 833–843.
[10] A. Ratto, M. Rigoli, A.G. Setti, A uniqueness result in PDE’s and parallel mean curvature immersions in Euclidean space, Complex Variables 30 (1996) 221–233.
[11] M. Rigoli, A.G. Setti, Liouville-type theorems for ϕ-subharmonic functions, Rev. Mat. Iberoamericana 17 (2001) 471–520.
[12] S.Y. Cheng, S.T. Yau, Differential equations on Riemannian manifolds and geometric applications, Comm. Pure Appl. Math. 28 (1975) 333–354.
[13] R. Brooks, A relation between growth and the spectrum of the Laplacian, Math. Z. 178 (1981) 501–508.
[14] D. Fisher-Colbrie, On complete minimal surfaces with finite Morse index in three manifolds, Invent. Math. 82 (1985) 121–132.
[15] M. Pinsky, The spectrum of the Laplacian on a manifold of negative curvature I, J. Differential Geom. 13 (1978) 87–91.
[16] M.P. Do Carmo, C.K. Peng, Stable complete minimal surfaces in R3 are planes, Bull. Amer. Math. Soc. 1 (1979) 903–906.
[17] A.V. Pogorelov, On the stability of minimal surfaces, Soviet Math. Dokl. 24 (1981) 274–276.
[18] R. Grimaldi, P. Pansu, Sur la régularité de la fonction croissance d’une variété riemannienne, Geom. Dedicata 50 (1994) 301–307.
[19] R. Gulliver, Index and total curvature of complete minimal surfaces, in: Geom. Measure Theory and the Calculus of Variations, Arcata, Calif., 1984, in: Proc. Sympos. Pure Math., vol. 44, Amer. Math. Soc., Providence, RI, 1986, pp. 207–211.
[20] A.N. Kolmogorov, S.V. Fomin, Elements of Function Theory and Functional Analysis, Mir, 1980 (in Italian translation).
[21] M.P. Do Carmo, Riemannian Geometry, Birkhäuser, 1992.


Lyapunov conditions for Super Poincaré inequalities

Patrick Cattiaux a, Arnaud Guillin b,c,∗, Feng-Yu Wang d,e, Liming Wu f,g

a Université Paul Sabatier, Institut de Mathématiques, Laboratoire de Statistique et Probabilités, UMR C 5583, 118 route de Narbonne, F-31062 Toulouse Cedex 09, France
b Ecole Centrale Marseille et LATP, Université de Provence, Technopole Château-Gombert, 39, rue F. Joliot Curie, 13453 Marseille Cedex 13, France
c Université Blaise Pascal, 33, avenue des landais, 63177 Aubières Cedex, France
d School of Mathematical Science, Beijing Normal University, Beijing 100875, China
e Department of Mathematics, Swansea University, Singleton Park, SA2 8PP, Swansea, UK
f Laboratoire de Mathématiques Appliquées, CNRS-UMR 6620, Université Blaise Pascal, 63177 Aubière, France
g Department of Mathematics, Wuhan University, 430072 Hubei, China

Received 30 November 2007; accepted 6 January 2009
Available online 26 January 2009
Communicated by C. Villani

Abstract We show how to use Lyapunov functions to obtain functional inequalities which are stronger than Poincaré inequality (for instance logarithmic Sobolev or F -Sobolev). The case of Poincaré and weak Poincaré inequalities was studied in [D. Bakry, P. Cattiaux, A. Guillin, Rate of convergence for ergodic continuous Markov processes: Lyapunov versus Poincaré, J. Funct. Anal. 254 (3) (2008) 727–759. Available on Mathematics arXiv:math.PR/0703355, 2007]. This approach allows us to recover and extend in a unified way some known criteria in the euclidean case (Bakry and Emery, Wang, Kusuoka and Stroock, . . . ). © 2009 Elsevier Inc. All rights reserved. Keywords: Ergodic processes; Lyapunov functions; Poincaré inequalities; Super Poincaré inequalities; Logarithmic Sobolev inequalities

* Corresponding author at: Ecole Centrale Marseille et LATP, Université de Provence, Technopole Château-Gombert, 39, rue F. Joliot Curie, 13453 Marseille Cedex 13, France. E-mail addresses: [email protected] (P. Cattiaux), [email protected] (A. Guillin), [email protected], [email protected] (F.-Y. Wang), [email protected] (L. Wu).



P. Cattiaux et al. / Journal of Functional Analysis 256 (2009) 1821–1841

1. Introduction

During the last thirty years, a lot of attention has been devoted to the study of various functional inequalities, and among them much effort has been devoted to the logarithmic Sobolev inequality. Our goal here is to give a new and practical condition to prove a logarithmic Sobolev inequality in a general setting. Our method being general, we will also be able to get conditions for Super Poincaré inequalities, and in particular for various inequalities such as F-Sobolev or general Beckner inequalities. Our assumptions are based mainly on a Lyapunov type condition as well as a Nash inequality (for example valid in $\mathbb{R}^d$). But let us make precise the objects and inequalities we are interested in.

Let $(\mathcal{X}, \mathcal{F}, \mu)$ be a probability space and L a self-adjoint operator on $L^2(\mu)$, with domain $D_2(L)$, such that $P_t = e^{tL}$ is a Markov semigroup. Consider then the Dirichlet form associated to L
\[ \mathcal{E}(f,f) := \langle -Lf, f\rangle_\mu, \qquad f \in D_2(L), \]
with domain $D(\mathcal{E})$. Throughout the paper, all test functions in an inequality will belong to D(L). By definition, L possesses a spectral gap if and only if the following Poincaré inequality holds (for all nice f’s)
\[ \mathrm{Var}_\mu(f) := \int f^2\,d\mu - \left(\int f\,d\mu\right)^2 \le C_P\,\mathcal{E}(f,f) \tag{1.1} \]

where $C_P^{-1}$ is the spectral gap. Note that such an inequality is also equivalent to the exponential decay in $L^2(\mu)$ of $P_t$. A defective logarithmic Sobolev inequality (say DLSI) is satisfied if for all nice f’s
\[ \mathrm{Ent}_\mu(f^2) := \int f^2\log f^2\,d\mu - \int f^2\,d\mu\,\log\int f^2\,d\mu \le C_{LS}\,\mathcal{E}(f,f) + D_{LS}\int f^2\,d\mu. \tag{1.2} \]
When $D_{LS} = 0$ the inequality is said to be tight, or we simply say that a logarithmic Sobolev inequality is verified (for short (LSI)). Dimension free gaussian concentration, hypercontractivity and exponential decay of entropy are directly deduced from such an inequality, which explains the huge interest in it. Note that if a Poincaré inequality is valid, a DLSI can be transformed, via Rothaus’s lemma, into a (tight) LSI. For all this we refer to [1] or [32].

Recently, Wang [29] introduced a so-called Super Poincaré inequality (say SPI) to study the essential spectrum: there exists a non-increasing β ∈ C(0, ∞) such that for all nice f and all r > 0
\[ \mu(f^2) \le r\,\mathcal{E}(f,f) + \beta(r)\,\mu(|f|)^2. \tag{1.3} \]
Wang moreover establishes a correspondence between this SPI and a defective F-Sobolev inequality (F-Sob) for a proper choice of increasing F ∈ [0, ∞[ with $\lim_\infty F = \infty$, i.e. for all nice f with $\mu(f^2) = 1$
\[ \mu\big(f^2F(f^2)\big) \le c_1\,\mathcal{E}(f,f) + c_2. \tag{1.4} \]



More precisely, if (1.4) holds for some increasing function F satisfying limu→+∞ F (u) = +∞ and sup00 u an (F-Sob) inequality holds with C1 F (u) = u

u ξ(t/2) dt − C2 0

for some well-chosen $C_1$ and $C_2$. For details see [32, Theorems 3.3.1 and 3.3.3]. Note that these results are still available when μ is a non-negative possibly non-bounded measure. In particular an inequality (DLSI) is equivalent to an (SPI) inequality with $\beta(u) = ce^{c'/u}$. These inequalities and their consequences (concentration of measure, isoperimetry, rate of convergence to equilibrium) have been studied for diffusions and jump processes by various authors [29,4,5,25,11] under various conditions.

In this paper we shall use Lyapunov type conditions. These conditions are well known to furnish some results on the long time behavior of the laws of Markov processes (see e.g. [16,18,15]). The relationship between Lyapunov conditions and functional inequalities of Poincaré type (ordinary or weak Poincaré introduced in [24]) is studied in detail in the recent work [3]. The present paper is thus a complement of [3] for the study of inequalities stronger than the Poincaré inequality. Let us also mention the paper [2], which is a companion paper of the present one (actually written immediately after the present one) dealing with the (simpler) ordinary Poincaré inequality, and actually with the stronger $L^1$ Poincaré inequality, also called Cheeger inequality. The main idea of the use of a Lyapunov function is similar in [2] and in the present work; however, here we have to face many more technical difficulties when handling the “local term” in the proof of Theorem 2.1 below. In addition we provide a method allowing us to deal with general Markov processes (including jump processes), giving a simple example of application.

We will therefore suppose that $(\mathcal{X}, d)$ is a Polish space (actually a Riemannian manifold). Namely we will assume

(L) there is a function $W \ge 1$, a positive function $\phi > \phi_0 > 0$, b > 0 and $r_0 > 0$ such that
\[ \frac{LW}{W} \le -\phi + b\,\mathbf{1}_{B(o,r_0)} \tag{1.5} \]

where B(o, r0 ) is a ball, w.r.t. d, with center o and radius r0 . The main idea of the paper (in fact of the use of such Lyapunov functions) is the following one: in order to get some Super Poincaré inequality for μ it is enough that μ satisfies some (SPI) locally and that there exists some Lyapunov function. In other words the Lyapunov function is useful to extend (SPI) on (say) balls to the whole space. General statements are given in Section 2. In particular on nice manifolds the Riemannian measure satisfies locally some (SPI), so that an absolutely continuous probability measure will also satisfy a local (SPI) in most cases. The existence of a Lyapunov function allows us to get some (SPI) on the whole manifold.
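To make condition (L) concrete, here is a minimal one-dimensional sketch. It is not taken from the paper: the operator $L = d^2/dx^2 - x\,d/dx$, the function $W(x) = e^{x^2/4}$, the choice $\phi(x) = 1/2 + x^2/8$ and the constants b = 1, $r_0 = \sqrt{8}$ are all assumptions chosen so that (1.5) can be checked by hand and then on a grid.

```python
import numpy as np

def lyapunov_ratio(x, a):
    """LW/W for L = d^2/dx^2 - x d/dx and W(x) = exp(a x^2), computed by hand
    from W'/W = 2 a x and W''/W = 2 a + 4 a^2 x^2."""
    return (2 * a + 4 * a ** 2 * x ** 2) - x * (2 * a * x)

def lyapunov_ratio_fd(x, a, h=1e-4):
    """Same quantity from central finite differences, as an independent cross-check."""
    W = lambda y: np.exp(a * y ** 2)
    d1 = (W(x + h) - W(x - h)) / (2 * h)
    d2 = (W(x + h) - 2 * W(x) + W(x - h)) / h ** 2
    return (d2 - x * d1) / W(x)

# Illustrative (assumed) data for condition (L).
a = 0.25
b = 1.0
r0 = np.sqrt(8.0)
phi = lambda x: 0.5 + x ** 2 / 8.0      # phi >= 1/2 > 0 and phi -> infinity

x = np.linspace(-10.0, 10.0, 20001)
lhs = lyapunov_ratio(x, a)              # equals 1/2 - x^2/4 for a = 1/4
rhs = -phi(x) + b * (np.abs(x) <= r0)   # -phi + b 1_{B(0, r0)}
condition_holds = bool(np.all(lhs <= rhs + 1e-12))

x_small = np.linspace(-5.0, 5.0, 101)
fd_gap = np.max(np.abs(lyapunov_ratio(x_small, a) - lyapunov_ratio_fd(x_small, a)))
print(condition_holds, fd_gap)
```

In this example the ball term is needed only where $1/2 - x^2/4 > -1/2 - x^2/8$, that is for $|x| < \sqrt{8}$, which is why $r_0 = \sqrt{8}$ and b = 1 suffice.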



The aim of Sections 3 and 4 is to show how one can build such Lyapunov functions, either as a function of the log-density or as a function of the Riemannian distance. In the first case we improve upon previous results in [22,10,4,5] among others. In the second case we (partly) recover and extend some celebrated results: the Bakry–Emery criterion for the log-Sobolev inequality, and Wang’s result on the converse Herbst argument. In particular we thus obtain results similar to Wang’s, but for measures satisfying a sub-gaussian concentration phenomenon. This kind of new result can be compared to the recent [6]. The main interest of this approach (apart from the new results we obtain) is that it provides us with a drastically simple method of proof for many results. The price to pay is that the explicit constants we obtain are far from optimal.

2. A general result

2.1. Diffusion case

To simplify we will deal here with the diffusion case: we assume that $\mathcal{X} = M$ is a d-dimensional connected complete Riemannian manifold, possibly with boundary ∂M. When ∂M ≠ ∅, we assume that W in (1.5) satisfies the boundary condition $NW|_{\partial M} \le 0$ where N is the inward unit normal vector on the boundary. We denote by dx the Riemannian volume element and by ρ(x) = ρ(x, o) the Riemannian distance function from a fixed point o. Let $L = \Delta - \nabla V\cdot\nabla$ for some $V \in W^{1,2}_{loc}$ such that $Z = \int e^{-V}dx < \infty$, and L is self-adjoint in $L^2(\mu)$ where $d\mu = Z^{-1}e^{-V}dx$. Note that in this case, we are in the symmetric diffusion case and the Dirichlet form is given by
\[ \mathcal{E}(f,f) = \int |\nabla f|^2\,d\mu, \qquad D(\mathcal{E}) = W^{2,1}(\mu). \]

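We shall obtain (SPI) by perturbing a known Super Poincaré inequality.

Theorem 2.1. Suppose that the Lyapunov condition (L) is verified for some function φ such that φ(x) → ∞ as ρ(x, o) → ∞ and $NW|_{\partial M} \le 0$ if ∂M ≠ ∅. Assume also that there exists T locally Lipschitz continuous on M such that dλ = exp(−T) dx satisfies an (SPI) (1.3) with function β. Then (SPI) holds for μ and some α : (0, ∞) → (0, ∞) in place of β. More precisely, for a family of compact sets $\{A_r \supset B(o,r_0)\}_{r\ge 0}$ such that $A_r \uparrow M$ as r ↑ ∞, define for r > 0:
\[ \Phi(r) := \inf_{A_r^c}\phi, \qquad \Phi^{-1}(r) := \inf\{s \ge 0 : \Phi(s) \ge r\}, \]
\[ g(r) := \sup_{\rho(\cdot,A_r)\le 2}|V - T|, \qquad G(r) := \sup_{\rho(\cdot,A_r)\le 2}|\nabla(V - T)|^2, \qquad H(r) = \mathrm{Osc}_{\rho(\cdot,A_r)\le 2}(V - T). \]
Then we may choose for s > 0, either

(1)
\[ \alpha(s) := \inf_{\varepsilon\in(0,1)} \frac{5}{2\varepsilon}\,\beta\!\left(\frac{\varepsilon s}{10}\wedge\frac{\varepsilon}{16}\wedge\frac{2(1-\varepsilon)}{G\circ\Phi^{-1}\big(\frac{4b}{\varepsilon}\vee\frac{4}{s\varepsilon}\big)}\right)\exp\!\left(g\circ\Phi^{-1}\Big(\frac{4b}{\varepsilon}\vee\frac{4}{s\varepsilon}\Big)\right) \]
or

(2)
\[ \alpha(s) := 2\exp\!\left(2H\Big(r_0\vee\Phi^{-1}\Big(\frac{4}{s}\vee\frac{bs}{2}\Big)\Big)\right)\beta\!\left(\frac{s}{8}\,e^{-H\circ\Phi^{-1}(\frac{4}{s}\vee\frac{bs}{2})}\right). \]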

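Proof. Let $f \in C_0^\infty(M)$. For $r > r_0$ it holds
\[ \int f^2\,d\mu = \int_{A_r^c} f^2\,d\mu + \int_{A_r} f^2\,d\mu = \int_{A_r^c}\frac{f^2\phi}{\phi}\,d\mu + \int_{A_r} f^2\,d\mu \le \frac{1}{\Phi(r)}\int f^2\phi\,d\mu + \int_{A_r} f^2\,d\mu \]
\[ \le \frac{1}{\Phi(r)}\int f^2\,\frac{-LW}{W}\,d\mu + \left(\frac{b}{\Phi(r)} + 1\right)\int_{A_r} f^2\,d\mu \]
using our assumption (L). The proof then turns to the estimation of the two terms in the right-hand side of the latter inequality, a global term and a local one. Let $\mu_\partial$ be the measure on ∂M induced by μ, which of course vanishes if the boundary is empty. For the first term remark, by our assumption on L and W, that
\[ \int f^2\,\frac{-LW}{W}\,d\mu = \int \nabla\!\Big(\frac{f^2}{W}\Big)\cdot\nabla W\,d\mu + \int_{\partial M}\frac{f^2}{W}(NW)\,d\mu_\partial \le 2\int \frac{f}{W}\nabla f\cdot\nabla W\,d\mu - \int \frac{f^2|\nabla W|^2}{W^2}\,d\mu \]
\[ = \int |\nabla f|^2\,d\mu - \int \Big|\nabla f - \frac{f}{W}\nabla W\Big|^2 d\mu \]
which leads to
\[ \int f^2\,\frac{-LW}{W}\,d\mu \le \int |\nabla f|^2\,d\mu. \tag{2.2} \]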

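For the local term we will localize the (SPI) for the measure λ. To this end, let ψ be a Lipschitz function defined on M such that $\mathbf{1}_{A_r} \le \psi \le \mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}$ and $|\nabla\psi| \le 1$. Writing (SPI) for the function fψ we get that for all s > 0
\[ \int_{A_r} f^2\,d\lambda \le \int f^2\psi^2\,d\lambda \le 2s\int |\nabla f|^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\lambda + 2s\int f^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\lambda + \beta(s)\left(\int |f|\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\lambda\right)^2. \tag{2.3} \]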

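To deduce a similar local inequality for μ we have two methods. For the first one we apply this inequality to $fe^{-V/2+T/2}$. It yields
\[ \int_{A_r} f^2\,d\mu = \int_{A_r} f^2e^{-V+T}\,d\lambda \le 2s\int |\nabla f|^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\mu + \frac{s}{2}\int f^2|\nabla(V-T)|^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\mu \]
\[ \qquad + 2s\int f^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\mu + \beta(s)\left(\int |f|e^{(V-T)/2}\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\mu\right)^2 \]
so that if we choose s small enough so that $sG(r) \le 2(1-\varepsilon)$, we get
\[ \int_{A_r} f^2\,d\mu \le \int 2s|\nabla f|^2\,d\mu + (1-\varepsilon)\int f^2\,d\mu + 2s\int f^2\,d\mu + \beta(s)\exp\big(g(r)\big)\left(\int |f|\,d\mu\right)^2. \tag{2.4} \]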

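Now combine (2.2) and (2.4). On the left-hand side we get
\[ \left[1 - \Big(\frac{b}{\Phi(r)} + 1\Big)(1 - \varepsilon + 2s)\right]\int f^2\,d\mu. \]
For the coefficient to be larger than ε/2 it is enough that $s \le \varepsilon/16$ and $\Phi(r) \ge 4b/\varepsilon$. Assuming this in addition to $sG(r) \le 2(1-\varepsilon)$ we obtain that for such s > 0 and r,
\[ \mu(f^2) \le \frac{2}{\varepsilon}\left(\frac{1}{\Phi(r)} + \frac{5s}{2}\right)\mu(|\nabla f|^2) + \frac{5}{2\varepsilon}\,\beta(s)\exp\big(g(r)\big)\,\mu(|f|)^2. \]
If t is given, it remains to choose first
\[ r = \Phi^{-1}\Big(\frac{4b}{\varepsilon}\vee\frac{4}{\varepsilon t}\Big), \]
and then
\[ s = \frac{\varepsilon t}{10}\wedge\frac{\varepsilon}{16}\wedge\frac{2(1-\varepsilon)}{G(r)}, \]
to get the first α(t). The second method is more naive but does not introduce any condition on the gradient of V.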



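Start with
\[ \int f^2\mathbf{1}_{A_r}\,d\mu = \int f^2e^{-V+T}\mathbf{1}_{A_r}\,d\lambda \le e^{-\inf_{A_r}(V-T)}\int f^2\mathbf{1}_{A_r}\,d\lambda \]
\[ \le e^{-\inf_{A_r}(V-T)}\left(2s\int |\nabla f|^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\lambda + 2s\int f^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\lambda + \beta(s)\Big(\int |f|\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\lambda\Big)^2\right) \]
\[ \le e^{-\inf_{A_r}(V-T)}e^{\sup_{\rho(\cdot,A_r)\le 2}(V-T)}\left(2s\int |\nabla f|^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\mu + 2s\int f^2\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\mu + \beta(s)e^{\sup_{\rho(\cdot,A_r)\le 2}(V-T)}\Big(\int |f|\mathbf{1}_{\{\rho(\cdot,A_r)\le 2\}}\,d\mu\Big)^2\right) \]
\[ \le e^{\mathrm{Osc}_{\rho(\cdot,A_r)\le 2}(V-T)}\left(2s\int |\nabla f|^2\,d\mu + 2s\int f^2\,d\mu\right) + e^{2\,\mathrm{Osc}_{\rho(\cdot,A_r)\le 2}(V-T)}\beta(s)\Big(\int |f|\,d\mu\Big)^2. \]
If we combine the latter inequality with (2.2) and denote $s' = 2se^{\mathrm{Osc}_{\rho(\cdot,A_r)\le 2}(V-T)}$ we obtain
\[ \left(1 - \frac{bs'}{\Phi(r)}\right)\int f^2\,d\mu \le \left(\frac{1}{\Phi(r)} + s'\right)\int |\nabla f|^2\,d\mu + e^{2\,\mathrm{Osc}_{\rho(\cdot,A_r)\le 2}(V-T)}\beta\big(s'e^{-\mathrm{Osc}_{\rho(\cdot,A_r)\le 2}(V-T)}/2\big)\Big(\int |f|\,d\mu\Big)^2. \]
Hence, if we choose $r = \Phi^{-1}\big(\frac{4}{s}\vee\frac{bs}{2}\big)$ and $s' = s/4$ we obtain the second possible α(s). □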

Remark 2.5. (1) The previous proof extends immediately to the general case of a “diffusion” process with a “carré du champ” which is a derivation, i.e. if E(f, f ) = Γ (f, f ) dμ for a symmetric Γ such that Γ (f g, h) = f Γ (g, h) + gΓ (f, h) (see [3] for more details on this framework). (2) For a general diffusion process, say with a non-constant diffusion term, as noted in the previous remark we have to modify the energy term so that it is no further difficulty and there are numerous examples where condition (L) is verified, i.e. consider L = a(x)Δ − x.∇ where 2 a is uniformly elliptic and bounded (consider W = ea|x| so φ(x) = c|x|2 ). But our method as expressed here relies crucially on the explicit knowledge of V . Note however, that for the second approach, only an upper bound on the behavior of V over, say, balls is needed, which can be made explicit in some cases. Remark 2.6. We may for instance choose Ar = V¯r := {x; |V − T |(x) < r} (i.e. a level set of |V − T |) provided |V − T |(x) → +∞ as ρ(o, x) → +∞. However we have to look at an enlargement V¯ r+2 = {x; ρ(x, V¯r ) < 2} (not the level set of level r + 2). If we want to replace V¯ r+2 by the level set V¯r+2 we have to modify the proof, choosing some ad hoc function ψ which is no more 1-Lipschitz. It is not difficult to see that we have to modify



(2.3) and what follows, replacing 1 (the 1 of 1-Lipschitz) by supV¯r+2 |∇(V − T )|2 . So we have to modify the condition on s in (1) of the previous theorem, i.e. 2 inf(V¯r )c φ

εs

2 inf(V¯r )c φ

+

2(1 − ε) , supV¯r+2 |∇(V − T )|2

(2.7)

i.e. we get the same result as (1) but with $\Phi(r) = \inf_{(\bar V_r)^c}\phi$, g(r) = r + 2 and $G(r) = \sup_{\bar V_{r+2}}|\nabla(V-T)|^2$. The second case (2) cannot (easily) be extended in this direction.

Actually one can derive a lot of results following the lines of the proof, provided some “local” (SPI) is satisfied. Here is the most general result in this direction.

Theorem 2.8. In Theorem 2.1 define $\lambda_{A_r}(f) = \lambda(f\mathbf{1}_{A_r})$ where $A_r$ is an increasing family of open sets such that $\bigcup_r A_r = M$. Given two such families $A_r \subseteq B_r$, assume that for all r large enough the following local (SPI) holds,
\[ \lambda_{A_r}(f^2) \le s\,\lambda_{B_r}(|\nabla f|^2) + \beta_r(s)\big(\lambda_{B_r}(|f|)\big)^2. \tag{2.9} \]

Then the conclusions of Theorem 2.1 are still true if we replace $\rho(\cdot,A_r) \le 2$ by $B_r$ and β(s) by $\beta_{r(s)}(s)$ with $r(s) = \Phi^{-1}\big(\frac{4b}{\varepsilon}\vee\frac{4}{\varepsilon s}\big)$ for each given ε in case (1) and $r(s) = \Phi^{-1}\big(\frac{4}{s}\vee\frac{bs}{2}\big)$ in case (2).

2.2. General case

We consider here the case of general Markov processes on a manifold M, with particular attention to jump processes. Indeed, a crucial step in the previous proof is to prove (2.2); it was obtained directly by taking advantage of the gradient structure, but it can be proved in greater generality. However the second part, relying on a perturbation approach, seems more difficult. We therefore introduce a local Super Poincaré inequality. Let $D_w(L)$ be the weak domain of L for the martingale problem; i.e. $f \in D_w(L)$ if and only if $t \mapsto f(X_t) - \int_0^t Lf(X_s)\,ds$ is a local martingale for a Markov process $X_t$ generated by L.

Theorem 2.10. Suppose that $W \in D_w(L)$ satisfies the Lyapunov condition (L) for some function φ such that φ(x) → ∞ as ρ(x, o) → ∞. Assume also that the following family of local Super Poincaré inequalities holds for μ: for a family of compact sets $\{A_r \supset B(o,r_0)\}_{r\ge 0}$ such that $A_r \uparrow M$ as r ↑ ∞, there exists β(r, ·) such that for all nice f and s > 0
\[ \mu(f^2\mathbf{1}_{A_r}) \le s\,\mathcal{E}(f,f) + \beta(r,s)\,\mu(|f|)^2. \]
Then, denoting
\[ \Phi(r) := \inf_{A_r^c}\phi, \qquad \Phi^{-1}(r) := \inf\{s \ge 0 : \Phi(s) \ge r\}, \]
μ verifies a Super Poincaré inequality with function, for small enough s > 0,
\[ \alpha(s) = \beta\big(\Phi^{-1}(2/s),\, s/2\big)\left(\frac{bs}{2} + 1\right). \tag{2.11} \]



Proof. The proof relies on a simple optimization procedure between the weighted energy term and the local Super Poincaré inequality. Namely for all positive s
\[ \mu(f^2) = \mu(f^2\mathbf{1}_{A_r^c}) + \mu(f^2\mathbf{1}_{A_r}) \le \frac{1}{\Phi(r)}\,\mu\Big(f^2\,\frac{-LW}{W}\Big) + \Big(\frac{b}{\Phi(r)} + 1\Big)\mu(f^2\mathbf{1}_{A_r}) \]
\[ \le \Big(\frac{1}{\Phi(r)} + s\Big)\mathcal{E}(f,f) + \beta(r,s)\Big(\frac{b}{\Phi(r)} + 1\Big)\mu(|f|)^2 \]
where in the last line we use a generalization of (2.2), which is obtained by a large deviations argument in Lemma 2.12. Conclude now by setting s = 1/Φ(r), which is possible for small enough s, and changing 2s into s. □

Lemma 2.12. For every continuous function $U \ge 1$ such that $U \in D_w(L)$ and −LU/U is bounded from below,
\[ \int -\frac{LU}{U}\,f^2\,d\mu \le \mathcal{E}(f,f), \qquad \forall f \in D(\mathcal{E}). \tag{2.13} \]

Proof. Remark that
\[ N_t = U(X_t)\exp\left(-\int_0^t \frac{LU}{U}(X_s)\,ds\right) \]
is a $P_\mu$-local martingale. Indeed, let $A_t := \exp\big(-\int_0^t \frac{LU}{U}(X_s)\,ds\big)$; we have by Ito’s formula,
\[ dN_t = A_t\big[dM_t(U) + LU(X_t)\,dt\big] - \frac{LU}{U}(X_t)A_tU(X_t)\,dt = A_t\,dM_t(U). \]
Now let $\beta := (1+U)^{-1}d\mu/Z$ (Z being the normalization constant). $(N_t)$ is also a local martingale, hence a super-martingale, w.r.t. $P_\beta$. We so get
\[ E^\beta\exp\left(-\int_0^t \frac{LU}{U}(X_s)\,ds\right) \le E^\beta N_t \le \beta(U) < +\infty. \]
Let $u_n := \min\{-LU/U, n\}$, which increases to −LU/U as n → ∞. Since $u_n \le -LU/U$ the above estimate implies
\[ F(u_n) := \limsup_{t\to\infty}\frac{1}{t}\log E^\beta\exp\left(\int_0^t u_n(X_s)\,ds\right) \le 0. \]

We cannot use this argument if we also truncate −LU/U from below. We may now apply the lower bound of large deviation in [34, Theorem B.1, Corollary B.11] and Varadhan’s Laplace principle, for which we need $u_n$ to be bounded. This requires −LU/U to be bounded below, so that $u_n$ is clearly bounded. Define now $I(\nu|\mu) = \mathcal{E}\big(\sqrt{d\nu/d\mu}, \sqrt{d\nu/d\mu}\big)$; we get
\[ F(u_n) \ge \sup\big\{\nu(u_n) - I(\nu|\mu);\ \nu \in M_1(E)\big\}. \]
Thus $\int u_n\,d\nu \le I(\nu|\mu)$, which yields (by letting n → ∞ and monotone convergence)
\[ -\int \frac{LU}{U}\,d\nu \le I(\nu|\mu), \qquad \forall \nu \in M_1(E). \tag{2.14} \]
That is equivalent to (2.13) by the fact that $\mathcal{E}(|f|,|f|) \le \mathcal{E}(f,f)$ for all $f \in D(\mathcal{E})$. □

We will fully discuss examples on diffusion processes in the next sections. Let us just give a simple example on jump processes, see also [32, Th. 3.4.2] for results in this direction. Remark that in full generality, the main difficulty is to find a local Super Poincaré inequality. However if the state space is discrete, this difficulty mainly disappears.

Lyapunov conditions for discrete-valued pure jump process. Let us consider here a pure jump process with values in a countable space E, symmetric with respect to the invariant measure μ, such that for all x ∈ E, μ(x) > 0. The important remark is that a local Super Poincaré inequality in the sense of (2.11) is easy to obtain; indeed for B ⊂ E with finite cardinal
\[ \mu(|f|\mathbf{1}_B)^2 = \Big(\sum_{x\in B}|f(x)|\mu(x)\Big)^2 \ge \sum_{x\in B} f^2(x)\mu^2(x) \ge \inf_{x\in B}\mu(x)\;\mu(f^2\mathbf{1}_B) \]
so that
\[ \mu(f^2\mathbf{1}_B) \le \Big(\inf_{x\in B}\mu(x)\Big)^{-1}\mu(|f|)^2. \]
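The discrete local inequality above is easy to test numerically. The sketch below is only an illustration: the weights defining `mu`, the set `B`, the random test functions and the helper name `local_spi_sides` are assumptions that do not come from the paper.

```python
import numpy as np

def local_spi_sides(mu, f, B):
    """Return (mu(f^2 1_B), mu(|f|)^2 / min_{x in B} mu(x)) on the state space {0, ..., N-1}."""
    mu = np.asarray(mu, dtype=float)
    f = np.asarray(f, dtype=float)
    B = np.asarray(B, dtype=int)
    lhs = np.sum(f[B] ** 2 * mu[B])
    rhs = np.sum(np.abs(f) * mu) ** 2 / np.min(mu[B])
    return lhs, rhs

rng = np.random.default_rng(0)
N = 50
weights = 1.0 / (1.0 + np.arange(N)) ** 2   # hypothetical weights, for illustration only
mu = weights / weights.sum()                # probability measure with mu(x) > 0 everywhere
B = np.arange(10)                           # finite set B = {0, ..., 9}

checks = []
for _ in range(1000):
    f = rng.normal(size=N)                  # random test function
    lhs, rhs = local_spi_sides(mu, f, B)
    checks.append(bool(lhs <= rhs * (1 + 1e-12)))
all_hold = all(checks)
print(all_hold)
```

The bound has no energy term, which is why it is available for any finite set B; its constant degrades as $\inf_B \mu$ shrinks, and this is the dependence that the Lyapunov condition has to compensate for.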

Let us apply the previous theorem for the birth and death process on N with birth rate bi and death rate di both equal to i a logα (1 + i) for a 2 and α ∈ R and d0 = 0, b1 = 1. Remark that the invariant measure is given by μ(i) = (1/Z)di (where Z is the normalizing constant) so that we have some local Super Poincaré inequality 2 μ f 2 1xn Zns logα (1 + n)μ |f | . Choose now W (n) = 1 + nγ with 0 < γ < 1, for which we derive that for n n0 , there exists λ > 0 such that LW −λns−2 logα (s) W which is our Lyapunov condition. By Theorem 2.10, and for a = 2, if α > 0 then μ verifies a Super Poincaré inequality: for all positive s, there exist c, C > 0 2 1/α μ f 2 sE(f, f ) + ceCs μ |f | . Note that we find back the (optimal) results of [30] in a simple way.



3. Examples in $\mathbb{R}^n$

We use the setting of Section 2.1 (or of Remark 2.5) but in the euclidean case $M = \mathbb{R}^n$ for simplicity. Hence in this section λ is the Lebesgue measure, i.e. we have T = 0. Recall that $d\mu = Z^{-1}e^{-V}dx$. It is well known that λ satisfies an (SPI) with $\beta(s) = c_1 + c_2s^{-n/2}$. However it is interesting to have some hints on the constants (in particular dimension dependence). It is also interesting (in view of Theorem 2.8) to prove (SPI) for subsets of $\mathbb{R}^n$. Hence we shall first discuss the (SPI) for λ and its restriction to subsets. Since we want to show that the Lyapunov method is also quite quick and simple in many cases, we shall also recall the quickest way to recover these (SPI) results.

3.1. Nash inequalities for the Lebesgue measure

Let A be an open connected domain with a smooth boundary. For simplicity we assume that $A = \{\psi(x) \le 0\}$ for some $C^2$ function ψ such that $|\nabla\psi|^2(x) \ge a > 0$ for x ∈ ∂A = {ψ = 0}. It is then known that one can build a Brownian motion reflected at ∂A, corresponding to the heat semi-group with Neumann condition. Let $P_t^N$ denote this semi-group, and denote by $p_t^N$ its kernel. Recall the following

Proposition 3.1. The following statements are equivalent

(3.1.1) for all $0 < t \le 1$ and all $f \in L^2(A,dx)$,
\[ \|P_t^Nf\|_\infty \le C_1t^{-n/4}\|f\|_{L^2(A,dx)}, \]
(3.1.2) (provided n > 2) for all $f \in C^\infty(\bar A)$,
\[ \|f\|^2_{L^{2n/(n-2)}(A,dx)} \le C_2\left(\int_A |\nabla f|^2\,dx + \int_A f^2\,dx\right), \]
(3.1.3) for all $f \in C^\infty(\bar A)$,
\[ \|f\|^{2+4/n}_{L^2(A,dx)} \le C_3\left(\int_A |\nabla f|^2\,dx + \int_A f^2\,dx\right)\|f\|^{4/n}_{L^1(A,dx)}, \]
(3.1.4) the (SPI) inequality
\[ \int_A f^2\,dx \le s\int_A |\nabla f|^2\,dx + \beta(s)\left(\int_A |f|\,dx\right)^2 \]
holds with $\beta(s) = C_4(s^{-n/2} + 1)$.

Furthermore any constant $C_i$ can be expressed in terms of any other $C_j$ and the dimension n.
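The passage from an (SPI) with a power-type rate to a Nash-type bound is an optimisation in s. The sketch below is not part of the paper: it assumes a pure power rate $\beta(s) = c\,s^{-n/2}$, and the constant `c`, the values `energy` (standing for $\int|\nabla f|^2dx$) and `l1_sq` (standing for $(\int|f|dx)^2$) are placeholders. It compares the closed-form minimiser $s^* = (n c\,\mathtt{l1\_sq}/(2\,\mathtt{energy}))^{2/(n+2)}$ with a brute-force search.

```python
import numpy as np

def spi_rhs(s, energy, l1_sq, c, n):
    """Right-hand side s * energy + c * s^(-n/2) * l1_sq of a power-type (SPI)."""
    return s * energy + c * s ** (-n / 2.0) * l1_sq

def optimal_bound(energy, l1_sq, c, n):
    """Closed-form minimiser s* and minimum value (1 + 2/n) * energy * s*."""
    s_star = (n * c * l1_sq / (2.0 * energy)) ** (2.0 / (n + 2.0))
    return s_star, (1.0 + 2.0 / n) * energy * s_star

# Placeholder values, chosen only for illustration.
n = 3
c = 1.0
energy, l1_sq = 2.0, 3.0

s_star, closed_form = optimal_bound(energy, l1_sq, c, n)
grid = np.logspace(-6, 6, 200001)                     # brute-force search over s
grid_min = spi_rhs(grid, energy, l1_sq, c, n).min()
rel_gap = (grid_min - closed_form) / closed_form      # small and non-negative
print(s_star, closed_form, rel_gap)
```

Raising the optimised bound to the power (n + 2)/n gives an inequality of the shape $\|f\|_2^{2+4/n} \le C\int|\nabla f|^2dx\,\|f\|_1^{4/n}$, with a constant C depending only on c and n.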



These results are well known. They are due to Nash, Carlen, Kusuoka and Stroock [9] and Davies, and can be found in [14, Section 2.4] or [26]. See also Varopoulos [28] for the equivalence between (3.1.1) and (3.1.2) in a very general setting. Generalizations to other situations (including general forms of rate functions β) can be found in [32, Section 3.3]. If $A = \mathbb{R}^n$, (3.1.1) holds (for all t) with $C = (2\pi)^{-n/2}$ and α = n/2, yielding an (SPI) inequality with
\[ \beta(s) = c_ns^{-n/2} = \left(\frac{1}{4\pi}\right)^{n/2}s^{-n/2} \tag{3.2} \]
which is equivalent, after optimizing in s, to the Nash inequality
\[ \|f\|_2^{2+4/n} \le C_n\left(\int |\nabla f|^2\,dx\right)\|f\|_1^{4/n}, \tag{3.3} \]

with $C_n = 2(1 + 2/n)(1 + n/2)^{2/n}(1/8\pi)^{n/4}$. For nice open bounded domains in $\mathbb{R}^n$, as we consider here, (3.1.2) is a well-known consequence of the Sobolev inequality in $\mathbb{R}^n$ (see e.g. [14, Lemma 1.7.11]; the particular cases $n = 1, 2$ can be treated by extending the dimension, see [14, Theorem 2.4.4]). But we want here to get some information on the constants. In particular, when $A$ is the level set $\bar V_r$ we would like to know how $\beta_r$ depends on $r$.

Remark 3.4. If $n = 1$, we have an explicit expression for $p_t^N$ when $A = (0, r)$, namely
\[ p_t^N(x, y) = (2\pi t)^{-n/2} \sum_{k \ge 0} \left( \exp\left( -\frac{(x - y - 2kr)^2}{2t} \right) + \exp\left( -\frac{(x + y + 2kr)^2}{2t} \right) \right). \]
It immediately follows that
\[ \sup_{x,y \in (0,r)} p_t^N(x, y) \le (2\pi t)^{-n/2} \left( 2 + \sum_{k \ge 1} \left( \exp\left( -\frac{((2k - 1)r)^2}{2t} \right) + \exp\left( -\frac{(2kr)^2}{2t} \right) \right) \right), \tag{3.5} \]
so that, using translation invariance, for any interval $A$ of length $r > r_0$ and for $0 < t \le 1$, $\sup_{x,y \in A} p_t^N(x, y) \le c(r_0)(2\pi t)^{-n/2}$. Hence (3.1.1) is satisfied, and an (SPI) inequality holds in $A$ with the same function $\beta_r(s) = c_B (s^{-n/2} + 1)$ independently of $r > r_0$. By tensorization, the result extends to any cube or parallelepiped in $\mathbb{R}^n$ with edges of length larger than $r_0$.

If we replace cubes by other domains, the situation is more intricate. However in some cases one can use some homogeneity property. For instance, for $n > 2$ we know that (3.1.2) holds for the unit ball with a constant $C_2$ (for $n = 2$ we may add a dimension and consider a cylinder $B_2(0, 1) \otimes \mathbb{R}$ as in [14, Theorem 2.4.4]). But a change of variables yields
\[ \|f\|^2_{L^{2n/(n-2)}(B(0,r),dx)} \le C_2 \left( \int_{B(0,r)} |\nabla f|^2\,dx + r^{-2} \int_{B(0,r)} f^2\,dx \right), \]
so that for $r \ge 1$ (3.1.2) holds in the ball of radius $r$ with a constant $C_2$ independent of $r$.
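The change of variables behind the inequality on the ball of radius $r$ can be written out as follows. This is only a sketch, assuming that $f$ is smooth on the closed ball and that $C_2$ denotes the constant of (3.1.2) on the unit ball; the rescaled function $g$ is introduced for this illustration and is not notation from the text.

```latex
% Sketch of the scaling argument. g is an auxiliary function (an assumption of this sketch).
% Put g(y) = f(ry) for y in B(0,1), so that (\nabla g)(y) = r (\nabla f)(ry).
% With x = ry, dx = r^n dy and 2^* = 2n/(n-2):
\begin{align*}
\|g\|_{L^{2^*}(B(0,1))}^2 &= r^{-2n/2^*}\,\|f\|_{L^{2^*}(B(0,r))}^2 = r^{2-n}\,\|f\|_{L^{2^*}(B(0,r))}^2, \\
\int_{B(0,1)} |\nabla g|^2\,dy &= r^{2-n} \int_{B(0,r)} |\nabla f|^2\,dx, \\
\int_{B(0,1)} g^2\,dy &= r^{-n} \int_{B(0,r)} f^2\,dx .
\end{align*}
% Applying (3.1.2) on the unit ball to g and multiplying by r^{n-2}:
\[
\|f\|_{L^{2^*}(B(0,r))}^2 \le C_2 \Big( \int_{B(0,r)} |\nabla f|^2\,dx + r^{-2} \int_{B(0,r)} f^2\,dx \Big),
\]
% and r^{-2} \le 1 as soon as r \ge 1.
```

The point of the computation is that the gradient term and the $L^{2^*}$ norm scale with the same power of $r$, so only the zero-order term picks up a factor, and that factor is harmless for $r \ge 1$.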



The previous argument extends to $A = \bar V_r$ provided that, for $r \ge r_0$, $\bar V_r$ is star-shaped; in particular it holds if $V$ is convex at infinity. This is a direct consequence of the coarea formula (see e.g. [17, Proposition 3, p. 118]). Indeed if $f$ has its support in an annulus $r_0 < r_1 < V(x) < r_2$, the surface measure on the level sets $\bar V_r$ is an image of the surface measure on the unit sphere. This is immediate since the map $x \mapsto (V(x), x/|x|)$ is a diffeomorphism in this annulus. Hence for such $f$'s the previous homogeneity property can be used. For a given $r > r_0$ large enough, it remains to cover $\bar V_r$ by such an annulus and a large ball (such that the ball contains $\bar V_{r_0}$ and is included in $\bar V_r$) and to use a partition of unity related to this covering. We thus get as before that for $r$ large enough, $C_2$ can be chosen independent of $r$.

For general domains $A$, recall that (3.1.2) holds true if $A$ satisfies the "extension property" of the boundary, i.e. the existence of a continuous extension operator $E : W^{1,2}(A) \to W^{1,2}(\mathbb{R}^n)$. If this extension property is true, (3.1.2) is true in $A$ with a constant $C_2$ depending only on $n$ and the operator norm of $E$ (see [14, Proposition 1.7.11]).

If $A = \bar V_r$ is bounded, as soon as $\nabla V$ does not vanish on $\partial A$, the implicit function theorem tells us that for all $x \in \partial A$ one can find an open neighborhood $v_x$ of $x$, an index $i_x$ and a 2-Lipschitz function $\phi_x$ defined on $v_x$ such that $v_x \cap A = v_x \cap \{\phi(y_1, \dots, y_{i_x - 1}, y_{i_x + 1}, \dots, y_n) < y_{i_x}\}$. To this end choose $i_x$ such that $|\partial_{i_x} V|(x) \ge |\partial_j V|(x)$ for all $j = 1, \dots, n$, so that, for $y \in \partial A$ neighboring $x$, $2|\partial_{i_x} V|(y) \ge |\partial_j V|(y)$, and the partial derivative of the implicit function $\phi$ given by the ratio $\partial_j V(y)/\partial_{i_x} V(y)$ is less than 2 in absolute value. By compactness we may choose a finite number $Q$ of points such that $\bigcup_{j=1,\dots,Q} v_{x_j} \supset \partial A$. Hence we are in the situation of [14, Proposition 1.7.9].

This property implies the extension property but with some extension operator $E$ whose norm depends on two quantities: first the maximal $\varepsilon > 0$ such that for all $x \in \partial A$, $B(x, \varepsilon) \subseteq v_{x_j}$ for some $j = 1, \dots, Q$; second, the maximal integer $N$ such that any $x \in \partial A$ belongs to at most $N$ such $v_{x_j}$'s. This is shown in [27, pp. 180–192]. Actually a careful reading of Stein's proof (pp. 190 and 191) shows that $\|E\| \le C(n)(N/\varepsilon)$ (recall that we have chosen $\phi$ 2-Lipschitz). Now assume that
\[ \text{there exist } R > 0,\ v > 0,\ k \in \mathbb{N} \text{ such that for } |x| \ge R,\ |\nabla V(x)| \ge v > 0. \tag{3.6} \]
Then it is easy to check that for $A = \bar V_r$ one has $\varepsilon \ge \varepsilon_0 = c(v, R, n)\,\theta^{-1}(r)$ with
\[ \theta(r) = \sup_{x \in \partial \bar V_r} \max_{i,j = 1, \dots, n} \left| \frac{\partial^2 V}{\partial x_i \partial x_j}(x) \right|. \tag{3.7} \]
But $\varepsilon_0$ being given, it is well known that one can find a covering of $A$ by balls of radius $\varepsilon_0/2$ such that each $x \in \bar V_r$ belongs to at most $N = c^n$ such balls for some universal $c$ large enough. Hence $N$ can be chosen as a constant depending on the dimension only. It follows that

Proposition 3.8. If (3.6) is satisfied, the (SPI) (3.1.4) holds with $A = \bar V_r$, $\theta$ defined by (3.7) and
\[ \beta_r(s) = C(n)\,\theta^n(r) \left( 1 + s^{-n/2} \right). \]



For the computation of $\beta_r$ we used [14, Lemma 1.7.11], which says that $C_2 = c(n)\|E\|^2$, and [14, proof of Theorem 2.4.2, p. 77], which yields a logarithmic Sobolev inequality with $\beta(\varepsilon) = -(n/4) \log \varepsilon + (n/4) \log(C_2 n/4)$, together with [14, Corollary 2.2.8], which gives $C_1 = c(n) C_2^{n/4}$. Finally the proof of [14, Theorem 2.4.6] gives $\beta(s) = C_1^2 (1 + (s/2)^{-n/2})$.

Proposition 3.8 gives of course the worst result and in many cases one can expect a much better behavior of $\beta_r$ as a function of $r$. In particular in the homogeneous case we know the result with a constant independent of $r$.

Remark 3.9. Another possibility to get (SPI) in some domain $A$ is to directly prove the Nash inequality (3.1.3). One possible way to get such a Nash inequality is to prove some Poincaré–Sobolev inequality. The case of euclidean balls is well known. According to [26, Theorem 1.5.2], for $n > 2$, with $p = 2$ and $s = 2n/(n - 2) = 2^*$ therein, for all $r > 0$ and every ball $B_r$ of radius $r$, if $\lambda_r$ is the Lebesgue measure on $B_r$ and $\bar f_r = (1/\mathrm{Vol}(B_r)) \int_{B_r} f\,dx$, we have
\[ \left( \lambda_r\big( |f - \bar f_r|^{\frac{2n}{n-2}} \big) \right)^{\frac{n-2}{2n}} \le C_n \left( \lambda_r\big( |\nabla f|^2 \big) \right)^{\frac12}, \tag{3.10} \]
so that, using first Minkowski, we have
\[ \left( \lambda_r\big( |f|^{\frac{2n}{n-2}} \big) \right)^{\frac{n-2}{2n}} \le C_n \left( \lambda_r\big( |\nabla f|^2 \big) \right)^{\frac12} + \frac{1}{\mathrm{Vol}(B_r)}\,\lambda_r(|f|), \tag{3.11} \]
and finally, using the Hölder inequality and the Cauchy–Schwarz inequality, we get the local Nash inequality
\begin{align*}
\lambda_r\big( |f|^2 \big) &\le \big( \lambda_r(|f|) \big)^{4/(n+2)} \left( C_n \left( \lambda_r\big( |\nabla f|^2 \big) \right)^{\frac12} + \frac{1}{\mathrm{Vol}(B_r)}\,\lambda_r(|f|) \right)^{2n/(n+2)} \\
&\le \big( \lambda_r(|f|) \big)^{4/(n+2)} \left( C_n \left( \lambda_r\big( |\nabla f|^2 \big) \right)^{\frac12} + \frac{1}{\sqrt{\mathrm{Vol}(B_r)}} \left( \lambda_r\big( |f|^2 \big) \right)^{\frac12} \right)^{2n/(n+2)}. \tag{3.12}
\end{align*}
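The exponents $4/(n+2)$ and $2n/(n+2)$ in (3.12) come from interpolating $L^2$ between $L^1$ and $L^{2^*}$. The following is a minimal sketch of that step; the interpolation parameter $\theta$ is an auxiliary symbol introduced here, and all norms are taken with respect to $\lambda_r$.

```latex
% theta is an auxiliary interpolation exponent (not notation from the text); 2^* = 2n/(n-2).
% Hoelder's inequality gives \|f\|_2 \le \|f\|_1^{\theta}\,\|f\|_{2^*}^{1-\theta} as soon as
\[
\frac12 = \frac{\theta}{1} + (1-\theta)\,\frac{n-2}{2n}
\quad\Longleftrightarrow\quad
\theta = \frac{2}{n+2}, \qquad 1-\theta = \frac{n}{n+2}.
\]
% Squaring:
\[
\lambda_r\big(|f|^2\big) \le \big(\lambda_r(|f|)\big)^{4/(n+2)}
\Big( \big(\lambda_r(|f|^{2n/(n-2)})\big)^{(n-2)/(2n)} \Big)^{2n/(n+2)} ,
\]
% and the inner factor is then bounded by (3.11). For the second line of (3.12),
% Cauchy--Schwarz on B_r gives
\[
\lambda_r(|f|) \le \mathrm{Vol}(B_r)^{1/2}\,\big(\lambda_r(|f|^2)\big)^{1/2}.
\]
```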

Again, for $r > r_0$ we get a Nash inequality, hence an (SPI) inequality independent of $r$ with $\beta_r(s) = c_n (1 + s^{-n/2})$. Notice that (3.10) is scale invariant, i.e. if it holds for some subset $A$, it holds for the homothetic set $rA$ ($r > 0$) with the same constants. That is why the constants do not depend on the radius for balls. If we replace a ball by a convex set, the classical method of proof using Riesz potentials (see e.g. [26] or [14, Lemma 1.7.3]) yields a similar result but with an additional constant, namely $\mathrm{diam}^n(A)/\mathrm{Vol}(A)$, so that if $V$ is a convex function the constant we obtain with this method in (3.10) for $\bar V_r$ may depend on $r$. Actually the Sobolev–Poincaré inequality (3.10) extends to any John domain with a constant $C$ depending on the dimension $n$ and on the John constant of the domain. This result is due to Bojarski [7] (see also [20] for another proof and [8] for a converse statement). A John domain satisfies a chaining condition (by cubes or balls) which is the key to the result (see the quoted papers for the definition of a John domain and of the chaining condition). But an explicit calculation of the John constant is not easy.



3.2. Typical Lyapunov functions and applications

We here specify classes of natural Lyapunov functions: functions of the potential or of the distance. As will be seen, they give new practical conditions for the Super Poincaré inequality and for the logarithmic Sobolev inequality. First, since $W \ge 1$ we may write $W = e^U$ so that condition (L) becomes
\[ \Delta U + |\nabla U|^2 - \nabla U \cdot \nabla V + \phi \le 0 \quad \text{"at infinity"}. \tag{3.13} \]
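The passage from condition (L) to (3.13) is a one-line computation, sketched below. Two assumptions are made here because the relevant definitions lie outside this section: the generator is taken to be $L = \Delta - \nabla V \cdot \nabla$ (the operator symmetric with respect to $d\mu = Z^{-1} e^{-V} dx$), and (L) is read as $LW \le -\phi W$ outside a compact set.

```latex
% Assumptions of this sketch: L = \Delta - \nabla V\cdot\nabla, and (L) means LW \le -\phi W at infinity.
\begin{align*}
\nabla W &= e^{U}\,\nabla U, \qquad \Delta W = e^{U}\big( \Delta U + |\nabla U|^2 \big), \\
\frac{LW}{W} &= \Delta U + |\nabla U|^2 - \nabla U\cdot\nabla V .
\end{align*}
% Hence LW \le -\phi W is the same as (3.13). For the choice U = aV:
\[
\frac{LW}{W} = a\,\Delta V + a^2|\nabla V|^2 - a|\nabla V|^2
= a\big( \Delta V - (1-a)|\nabla V|^2 \big).
\]
```

The last identity is the expression for $LW/W$ used with $W = e^{aV}$ in Section 3.2.1.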

3.2.1. Lyapunov function $e^{aV}$

Test functions $e^{aV}$ for $a < 1$ are quite natural in that they are the limiting case for the spectral gap (see [3]). Indeed, $\mu(e^{aV})$ is finite if and only if $a < 1$, and a drift condition such as $LW \le -\lambda W + b 1_C$ formally implies, by integration with respect to $\mu$, that $\mu(W)$ is finite. So in a sense, the $e^{aV}$ are the "largest" possible Lyapunov functions. Hence, if $W = e^{aV}$, $LW/W = a(\Delta V - (1 - a)|\nabla V|^2)$. Introduce the following conditions

(3.14.1) $V(x) \to +\infty$ as $|x| \to +\infty$,

(3.14.2) there exist $0 < a_0 < 1$, a non-decreasing function $\eta$ with $\eta(u) \to +\infty$ as $u \to +\infty$ and a constant $b_0$ such that $(1 - a_0)|\nabla V|^2 - \Delta V \ge \eta(V) +$ b0 1I|x| 0 and b > 1,
\[ \beta(s) = C e^{c(1/s)^{\frac{b}{2((b-1) \wedge 1)}}}. \]

(3.21.2) $c_0 \ge 0$, $d'|x|^{b'} \ge V(x) \ge c'|x|^b$ for $|x|$ large enough, for some $d', c' > 0$ and $b' \ge b > 1$,
\[ \beta(s) = C e^{c(1/s)^{\frac{b'}{b'+b-2}}}. \]

(3.21.3) $c_0 \le 0$, for $|x|$ large enough, $V(x) \ge (\varepsilon - c_0/2)|x|^2$ for some $\varepsilon > 0$, and
\[ \beta(s) = C e^{c(1/s)}. \]

(3.21.4) c0 0, for |x| large enough, d |x|b V (x) c |x|b and β as in (3.21.2). Proof. In all the proof D will be an arbitrary positive constant whose value may change from place to place. All the calculations are assuming that |x| is large enough. Consider first case 1. Choosing a small enough and using Lemma 3.20, we see that φ(x) 2(b−1) D|x|b−2 V (x). If b 2 we thus have φ(x) DV (x) while for b < 2, φ(x) DV b (x) for large |x| according to the hypothesis. For φ to go to infinity at infinity, b > 1 is required. In particular on the level sets V¯r we have either φ(x) Dr or φ(x) Dr 2(b−1)/b . Now since the level sets V¯r are convex, we know that some Nash inequality holds on V¯r according to the discussion in the previous subsection. We may thus use Theorem 2.8 in the situation (2) of Theorem 2.1. Choosing s = d/r or s = d/r 2(b−1) b for some well-chosen d yields the result with an extra factor s −k for some k > 0. This extra term can be skipped just changing the constants in the exponential term. Case 2 is similar but improving the lower bound for φ. Indeed since D|x| V 1/b (x), φ(x) b +b−2

DV b (x). It allows us to improve β. Let us now consider case 3. Since b = 2, our hypothesis implies that for 2a < ε, φ DV . But the curvature assumption implies that the level sets of x → H (x) = V (x) + c0 |x|2 /2 are convex. Since V (x) D|x|2 , one has cr V (x) r if x ∈ H¯ r . We may thus mimic case 1, just replacing V¯r by H¯ r . Case 4 is similar to the previous one just improving the bound on φ as in case 2. 2



Corollary 3.22.
(1) If (3.21.3) holds, $\mu$ satisfies a logarithmic Sobolev inequality.
(2) If (3.21.1) holds with $b = 2$, $\mu$ satisfies a logarithmic Sobolev inequality. In particular if $\rho > 0$, $\mu$ satisfies a logarithmic Sobolev inequality (Bakry–Emery criterion).
(3) If (3.21.1) holds for some $1 < b < 2$, $\mu$ satisfies an $F$-Sobolev inequality with $F(u) = \log_+^{2(1-(1/b))}(u)$.

The first statement of the corollary is reminiscent of Wang's improvement of the Bakry–Emery criterion, namely if $\iint e^{(-\rho+\varepsilon)|x-y|^2} \mu(dx)\mu(dy) < +\infty$, $\mu$ satisfies a logarithmic Sobolev inequality. Our statement is weaker since we are assuming some uniform behavior. The third statement can thus be seen as an extension of Wang's result to the case of $F$-Sobolev inequalities interpolating between the Poincaré inequality and the log-Sobolev inequality. These inequalities are related to the Latała–Oleszkiewicz interpolating inequalities [23]; see [4] for a complete description. It would be interesting to improve (3) in the spirit of Wang's concentration result. See [21,6] for an attempt involving modified log-Sobolev inequalities introduced in [19] and mass transport.

4. The general manifold case

In fact, as one guesses, the main point is to get the additional Super Poincaré inequality, local as developed in Section 3.1, or global (and then using the localization technique already mentioned). It is of course a fundamental field of research which exceeds the scope of the present paper. We may however use our main results, Theorems 2.1 and 2.8, with the same Lyapunov functionals as developed in Sections 3.2.1 and 3.2.2, replacing of course the euclidean distance by the Riemannian distance (w.r.t. a fixed point), at least in two main cases. According to [13], if the injectivity radius of $M$ is positive then (1.3) holds for $T = 0$ and $\beta(s) = c_1 + c_2 s^{-d/2}$ for some constants $c_1, c_2 > 0$; if in particular the injectivity radius is infinite, then one may take $c_1 = 0$ [29, p. 225].
Next, if the Ricci curvature of M is bounded below, then by [29, Theorem 7.1], there exist c1 , c2 > 0 such that (1.3) holds for T = c1 ρ and β(s) = c2 s −d/2 . For simplicity, throughout this section we assume that (H ) The injectivity radius of M is positive. 4.1. Lyapunov condition eaV In this context, one may readily generalize the result of Theorem 3.17 for the first case (3.17.1), with the euclidean distance replaced by the Riemannian one, assuming (3.14.1), (3.14.2) and (3.14.3). Theorem 4.1. Assume (H ) and that (3.14.1), (3.14.2), (3.14.3) are satisfied. Suppose moreover that for large ρ, |∇V |(x) γ (V (x)). Then μ will satisfy a (SPI) inequality with function β given by −1 β(s) = C 1 + eη (c/s) γ n η−1 (c/s) .



The second point of Theorem 3.17 is more delicate as it relies on finer conditions on the manifold and the potential; it should however be possible to give mild additional assumptions ensuring such a result (for instance the so-called "rolling ball condition"). Note that it extends Kusuoka–Stroock's result to the manifold case (giving life to Remark (2.49) in their paper).

4.2. Lyapunov condition $e^{a\rho^b}$

We suppose moreover here that $M$ is a Cartan–Hadamard manifold, i.e. a simply connected complete Riemannian manifold with non-positive sectional curvature. According to the Cartan–Hadamard theorem, $M$ is thus diffeomorphic to $\mathbb{R}^n$. One certainly could relax this assumption, at the price however of some technicalities. In addition we will assume that the Ricci curvature is bounded from below. If we try to use $W = e^{a\rho^b}$ for $\rho \ge 1$, since $\Delta\rho$ is bounded above on $\{\rho \ge 1\}$ (see for example Th. 0.4.10 in [32]), (L) holds for $\phi := ab\rho^{b-2}\psi$ with $\psi := \langle \nabla\rho^2, \nabla V \rangle - c + ab\rho^b$ for some constant $c > 0$, provided $\psi$ is positive for large $\rho$. We may then extend Lemma 3.20 to the manifold context.

Lemma 4.2. If (3.19) holds, then $\rho \langle \nabla\rho, \nabla V \rangle \ge V - V(o) + c_0\rho^2/2$.

Proof. For $x \in M$, let $\xi : [0, \rho(x)] \to M$ be the minimal geodesic from $o$ to $x$. Let
\[ g(t) = t \langle \nabla\rho, \nabla V \rangle(\xi_t), \quad t \ge 0. \]
We have
\[ g'(t) = \langle \nabla\rho, \nabla V \rangle(\xi_t) + t\,\mathrm{Hess}_V(\nabla\rho, \nabla\rho)(\xi_t) \ge c_0 t + \frac{dV(\xi_t)}{dt}. \]
This implies the desired assertion by integrating both sides on $[0, \rho(x)]$. □
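The integration step that closes the proof of Lemma 4.2 can be spelled out as below. The sketch assumes that (3.19), which is stated outside this section, is the lower bound $\mathrm{Hess}_V \ge c_0$, and that $\xi$ is parametrized by arc length, so that $\dot\xi_t = \nabla\rho(\xi_t)$.

```latex
% Assumptions of this sketch: (3.19) reads Hess_V \ge c_0, and \xi has unit speed,
% so dV(\xi_t)/dt = \langle \nabla\rho, \nabla V \rangle(\xi_t); this is the second term in the bound on g'.
\begin{align*}
\rho(x)\,\langle \nabla\rho, \nabla V \rangle(x) = g(\rho(x)) - g(0)
&= \int_0^{\rho(x)} g'(t)\,dt \\
&\ge \int_0^{\rho(x)} \Big( c_0\,t + \frac{d}{dt} V(\xi_t) \Big)\,dt \\
&= \frac{c_0\,\rho(x)^2}{2} + V(x) - V(o).
\end{align*}
```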

We may thus state

Proposition 4.3. Let $M$ be a Cartan–Hadamard manifold with Ricci curvature bounded below. Let $V$ satisfy (3.19). Then one can find positive constants $c, C$ such that $\mu$ satisfies some (SPI) with function $\beta$ (given below for $s$ small enough) in the following cases:

(4.3.1) $c_0 \ge 0$, $V(x) \ge c'\rho^b$ for $\rho$ large enough, for some $c' > 0$ and $b > 1$,
\[ \beta(s) = C e^{c(1/s)^{\frac{b}{2((b-1) \wedge 1)}}}. \]

(4.3.2) $c_0 \ge 0$, $d'\rho^{b'} \ge V(x) \ge c'\rho^b$ for $\rho$ large enough, for some $d', c' > 0$ and $b' \ge b > 1$,
\[ \beta(s) = C e^{c(1/s)^{\frac{b'}{b'+b-2}}}. \]

(4.3.3) $c_0 \le 0$, for $\rho$ large enough, $V(x) \ge (\varepsilon - c_0/2)\rho^2$ for some $\varepsilon > 0$, and
\[ \beta(s) = C e^{c(1/s)}. \]

(4.3.4) c0 0, for ρ large enough, d ρ b V (x) c ρ b and β as in (3.21.2). The first point of this proposition specialized to the case c0 > 0 enables us to recover [31, Th. 1.3] which extends Bakry–Emery criterion to lower bounded Ricci curvature manifold. It then extends the result to various F -Sobolev. Proof. The proof follows exactly the same line as in the flat case so that case 1 and case 2 follow once it is noted that since HessV 0 implies the convexity of the level sets V¯r , we know that some Nash inequality holds on V¯r according to the discussion in the previous subsection and the boundedness of these level sets ensured by our hypotheses on V . Let us now consider case 3. Since b = 2, our hypothesis implies that for 2a < ε, φ DV . But (3.19) and Hessρ 2 2 on Cartan–Hadamard manifolds imply that the level sets of x → H = V + c0 ρ 2 /2 are convex. Since V Dρ 2 , one has cr V r on H¯ r . We may thus mimic case 1, just replacing V¯r by H¯ r . Case 4 is similar to the previous one just improving the bound on φ as in case 2. 2 Remark 4.4. Remark that in full generality, according to [33, Theorem 1.2] and the recent paper [12], there always exists T ∈ C ∞ (M) such that dλ := e−T (x) dx satisfies a logarithmic −1 Sobolev inequality hence (SPI) with β(s) = es . Of course for practical purposes, this very general fact is not completely useful since T is unknown. Acknowledgment We thank an anonymous careful referee for his comments and constructive questions. References [1] C. Ané, S. Blachère, D. Chafaï, P. Fougères, I. Gentil, F. Malrieu, C. Roberto, G. Scheffer, Sur les inégalités de Sobolev logarithmiques, Panoramas et Synthèses, vol. 10, Société Mathématique de France, Paris, 2000. [2] D. Bakry, F. Barthe, P. Cattiaux, A. Guillin, A simple proof of the Poincaré inequality for a large class of measures including the logconcave case, Electron. Comm. Probab. 13 (2008) 60–66. [3] D. Bakry, P. Cattiaux, A. 
Guillin, Rate of convergence for ergodic continuous Markov processes: Lyapunov versus Poincaré, J. Funct. Anal. 254 (3) (2008) 727–759. Available on Mathematics arXiv:math.PR/0703355, 2007. [4] F. Barthe, P. Cattiaux, C. Roberto, Interpolated inequalities between exponential and Gaussian, Orlicz hypercontractivity and isoperimetry, Rev. Mat. Iberoamericana 22 (3) (2006) 993–1066. [5] F. Barthe, P. Cattiaux, C. Roberto, Isoperimetry between exponential and Gaussian, Electron. J. Probab. 12 (2007) 1212–1237. [6] F. Barthe, A.V. Kolesnikov, Mass transport and variants of the logarithmic Sobolev inequality, J. Geom. Anal. 18 (4) (2008) 921–979. [7] B. Bojarski, Remarks on Sobolev imbedding inequalities, in: Complex Analysis, in: Lecture Notes in Math., vol. 1351, Springer, Berlin, 1987, pp. 52–68.



[8] S.M. Buckley, P. Koskela, Sobolev–Poincaré implies John, Math. Res. Lett. 2 (1995) 881–901. [9] E. Carlen, S. Kusuoka, D. Stroock, Upper bounds for symmetric Markov transition functions, Ann. Inst. H. Poincaré Probab. Statist. 23 (1987) 245–287. [10] P. Cattiaux, Hypercontractivity for perturbed diffusion semi-groups, Ann. Fac. Sci. Toulouse Math. 14 (4) (2005) 609–628. [11] P. Cattiaux, A. Guillin, Trends to equilibrium in total variation distance, Ann. Inst. H. Poincaré, in press. Available on Mathematics arXiv:math.PR/0703451, 2007. [12] X. Chen, F.Y. Wang, Construction of larger Riemannian metrics with bounded sectional curvatures and applications, Bull. London Math. Soc. 40 (2008) 659–663. [13] C.B. Croke, Some isoperimetric inequalities and eigenvalue estimates, Ann. Sci. École Norm. Sup. 13 (1980) 419– 435. [14] E.B. Davies, Heat Kernels and Spectral Theory, Cambridge University Press, 1989. [15] R. Douc, G. Fort, A. Guillin, Subgeometric rates of convergence of f -ergodic strong Markov processes, preprint. Available on Mathematics arXiv:math.ST/0605791, 2006. [16] N. Down, S.P. Meyn, R.L. Tweedie, Exponential and uniform ergodicity of Markov processes, Ann. Probab. 23 (4) (1995) 1671–1691. [17] L.C. Evans, R.F. Gariepy, Measure Theory and Fine Properties of Functions, CRC Press, 1992. [18] G. Fort, G.O. Roberts, Subgeometric ergodicity of strong Markov processes, Ann. Appl. Probab. 15 (2) (2005) 1565–1589. [19] I. Gentil, A. Guillin, L. Miclo, Modified logarithmic Sobolev inequalities and transportation inequalities, Probab. Theory Related Fields 133 (3) (2005) 409–436. [20] P. Hajlasz, Sobolev inequalities, truncation method, and John domains, Report. Univ. Jyväskylä 83 (2001) 109–126, also see the Web page of the author. [21] A.V. Kolesnikov, Modified log-Sobolev inequalities and isoperimetry, preprint. Available on Mathematics arXiv: math.PR/0608681, 2006. [22] S. Kusuoka, D. 
Stroock, Some boundedness properties of certain stationary diffusion semigroups, J. Funct. Anal. 60 (1985) 243–264. [23] R. Latała, K. Oleszkiewicz, Between Sobolev and Poincaré, in: Geometric Aspects of Functional Analysis, in: Lecture Notes in Math., vol. 1745, 2000, pp. 147–168. [24] M. Röckner, F.Y. Wang, Weak Poincaré inequalities and L2 -convergence rates of Markov semigroups, J. Funct. Anal. 185 (2) (2001) 564–603. [25] M. Röckner, F.Y. Wang, Supercontractivity and ultracontractivity for (non-symmetric) diffusion semi-groups on manifolds, Forum Math. 15 (6) (2003) 893–921. [26] L. Saloff-Coste, Aspects of Sobolev Type Inequalities, Cambridge University Press, 2002. [27] E. Stein, Singular Integrals and Differentiability Properties of Functions, Princeton University Press, 1970. [28] N. Varopolous, Hardy Littlewood theory for semigroups, J. Funct. Anal. 63 (2) (1985) 240–260. [29] F.Y. Wang, Functional inequalities for empty essential spectrum, J. Funct. Anal. 170 (1) (2000) 219–245. [30] F.Y. Wang, Sobolev type inequalities for general symmetric forms, Proc. Amer. Math. Soc. 128 (12) (2000) 3675– 3682. [31] F.Y. Wang, Logarithmic Sobolev inequalities: Conditions and counterexamples, J. Operator Theory 46 (2001) 183– 197. [32] F.Y. Wang, Functional Inequalities, Markov Processes and Spectral Theory, Science Press, Beijing, 2004. [33] F.Y. Wang, Functional inequalities on arbitrary Riemannian manifolds, J. Math. Anal. Appl. 30 (2004) 426–435. [34] L. Wu, Uniformly integrable operators and large deviations for Markov processes, J. Funct. Anal. 172 (2) (2000) 301–376.


Fractional Laplacian phase transitions and boundary reactions: A geometric inequality and a symmetry result Yannick Sire a , Enrico Valdinoci b,∗ a Université Aix-Marseille 3, Paul Cézanne, LATP, Marseille, France b Università di Roma Tor Vergata, Dipartimento di Matematica, I-00133 Rome, Italy

Received 10 January 2008; accepted 23 January 2009 Available online 31 January 2009 Communicated by C. Villani

Abstract

We deal with symmetry properties for solutions of nonlocal equations of the type
\[ (-\Delta)^s v = f(v) \quad \text{in } \mathbb{R}^n, \]
where $s \in (0, 1)$ and the operator $(-\Delta)^s$ is the so-called fractional Laplacian. The study of this nonlocal equation is made via a careful analysis of the following degenerate elliptic equation
\[ \begin{cases} -\operatorname{div}(x^\alpha \nabla u) = 0 & \text{on } \mathbb{R}^n \times (0, +\infty), \\ -x^\alpha u_x = f(u) & \text{on } \mathbb{R}^n \times \{0\}, \end{cases} \]
where $\alpha \in (-1, 1)$, $y \in \mathbb{R}^n$, $x \in (0, +\infty)$ and $u = u(y, x)$. This equation is related to the fractional Laplacian since the Dirichlet-to-Neumann operator $\Gamma_\alpha : u|_{\partial\mathbb{R}^{n+1}_+} \mapsto -x^\alpha u_x|_{\partial\mathbb{R}^{n+1}_+}$ is $(-\Delta)^{\frac{1-\alpha}{2}}$. More generally, we study the so-called boundary reaction equations given by
\[ \begin{cases} -\operatorname{div}(\mu(x) \nabla u) + g(x, u) = 0 & \text{on } \mathbb{R}^n \times (0, +\infty), \\ -\mu(x) u_x = f(u) & \text{on } \mathbb{R}^n \times \{0\}, \end{cases} \]
under some natural assumptions on the diffusion coefficient $\mu$ and on the nonlinearities $f$ and $g$. We prove a geometric formula of Poincaré-type for stable solutions, from which we derive a symmetry result in the spirit of a conjecture of De Giorgi. © 2009 Elsevier Inc. All rights reserved.

Keywords: Boundary reactions; Allen–Cahn phase transitions; Fractional operators; Poincaré-type inequality

E-mail addresses: [email protected] (Y. Sire), [email protected] (E. Valdinoci). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.020

Y. Sire, E. Valdinoci / Journal of Functional Analysis 256 (2009) 1842–1864

Contents

0. Introduction
1. Regularity theory for Eq. (4)
1.1. Regularity for Eq. (4) under assumption (11)
1.2. Verification of assumption (11)
2. Proof of Theorem 1
3. Proof of Theorem 2
4. Proof of Theorem 3
Acknowledgments
References

0. Introduction

This paper is devoted to some geometric results on the following equation
\[ (-\Delta)^s v = f(v) \quad \text{in } \mathbb{R}^n. \tag{1} \]
The operator $(-\Delta)^s$ is the fractional Laplacian and it is a pseudo-differential operator with symbol $|\eta|^{2s}$, with $s \in (0, 1)$; here, $\eta$ denotes the variable in the frequency space. This operator, which is a nonlocal operator, can also be defined, up to a multiplicative constant, by the formula
\[ (-\Delta)^s v(x) = \mathrm{P.V.} \int_{\mathbb{R}^n} \frac{v(x) - v(y)}{|x - y|^{n+2s}}\,dy, \tag{2} \]
where P.V. stands for the Cauchy principal value (see [29] for further details). Seen as an operator acting on distributional spaces, the quantity $(-\Delta)^s v$ is well-defined as long as $v$ belongs to the space
\[ \mathcal{L}_s = \left\{ v \in \mathcal{S}'(\mathbb{R}^n),\ \int_{\mathbb{R}^n} \frac{|v(x)|}{(1 + |x|)^{n+2s}}\,dx < \infty \right\} \cap C^2_{loc}(\mathbb{R}^n). \]
Notice in particular that smooth bounded functions are admissible for the fractional Laplacian. The $L^1$ assumption makes the integral in (2) convergent at infinity, whereas the additional assumption of $C^2_{loc}$-regularity is here to make sense of the principal value$^1$ near the singularity.

From a probabilistic point of view, the fractional Laplacian is the infinitesimal generator of a Lévy process (see, e.g., [5]). This type of diffusion operator arises in several areas such as optimization [18], flame propagation [9] and finance [14]. Phase transitions driven by fractional Laplacian-type boundary effects have also been considered in [2,27] in the Gamma convergence framework. Power-like nonlinearities for boundary reactions have also been studied in [7]. See also [34] for applications to stratified media.

In this paper, we focus on an analogue of the De Giorgi conjecture [17] for equations of type (1), namely, whether or not "typical" solutions possess one-dimensional symmetry. One of the main difficulties in the analysis of this operator is its nonlocal character. However, it is a well-known fact in harmonic analysis that the power 1/2 of the Laplacian is the boundary operator of harmonic functions in the half-space. In [11], the equivalence between (1) and the $\alpha$-harmonic extension in the half-space has recently been proved. More precisely, if one considers the boundary reaction problem for $u = u(y, x)$, with $y \in \mathbb{R}^n$ and $x > 0$,
\[ \begin{cases} \operatorname{div}(x^\alpha \nabla u) = 0 & \text{on } \mathbb{R}^{n+1}_+ := \mathbb{R}^n \times (0, +\infty), \\ -x^\alpha u_x = f(u) & \text{on } \mathbb{R}^n \times \{0\}, \end{cases} \tag{3} \]

it is proved in [11] that, up to a normalizing factor, the Dirichlet-to-Neumann operator $\Gamma_\alpha : u|_{\partial\mathbb{R}^{n+1}_+} \mapsto -x^\alpha u_x|_{\partial\mathbb{R}^{n+1}_+}$ is precisely $(-\Delta)^{\frac{1-\alpha}{2}}$ and then that $u(0, y)$ is a solution of
\[ (-\Delta)^{\frac{1-\alpha}{2}} u(0, y) = f(u(0, y)). \]
Note that the condition $\frac{1-\alpha}{2} = s \in (0, 1)$ in (1) reduces to $\alpha \in (-1, 1)$. Qualitatively, the result of [11] states that one can localize the fractional Laplacian by adding an additional variable. This argument plays, for instance, a crucial role in the proof of full regularity of the solutions of the quasigeostrophic model as given by [15] and in the free boundary analysis in [13].

The operator $\operatorname{div}(x^\alpha \nabla)$ is degenerate elliptic. However, since $\alpha \in (-1, 1)$, the weight $x^\alpha$ is integrable at 0. This type of weight falls into the category of $A_2$-Muckenhoupt weights (see, for instance, [32]), and an almost complete theory for these equations is available (see [21,22]). In particular, one can obtain Hölder regularity, Poincaré–Sobolev-type estimates, Harnack and boundary Harnack principles.

In the present paper, we want to give a geometric insight into the phase transitions for Eq. (1). Our goal is to give a geometric proof of the one-dimensional symmetry result for fractional boundary reactions in dimension $n = 2$, inspired by the De Giorgi conjecture and in the spirit of the proof of the Bernstein theorem given in [26]. A similar De Giorgi-type result for boundary reactions in dimension $n = 2$ has been proven in [12] for $\alpha = 0$, which corresponds to the square root of the Laplacian in (1). The technique of [12] will be adapted to the case $\alpha \ne 0$ in the forthcoming [10].

$^1$ For $v \in C^2_{loc}(\mathbb{R}^n)$, the singular integral in (2) makes sense for any $s \in (0, 1)$. Of course, it is possible to weaken such an assumption depending on the values of $s$.
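The statement that $x^\alpha$ is an $A_2$-Muckenhoupt weight exactly in the range $\alpha \in (-1, 1)$ can be illustrated on intervals starting at the origin. The computation below is only a sketch: it uses the one-dimensional form of the $A_2$ condition (the product of the integrals of the weight and of its reciprocal over an interval is bounded by a constant times the squared length), takes the interval $(0, b)$, and does not treat general intervals $(a, b)$.

```latex
% Sketch for the interval (0,b) only; general intervals (a,b) are not covered here.
\[
\int_0^b x^{\alpha}\,dx \cdot \int_0^b x^{-\alpha}\,dx
= \frac{b^{1+\alpha}}{1+\alpha} \cdot \frac{b^{1-\alpha}}{1-\alpha}
= \frac{b^2}{1-\alpha^2}, \qquad -1 < \alpha < 1 .
\]
% Both integrals converge at 0 exactly when 1+\alpha > 0 and 1-\alpha > 0,
% and the right-hand side has the form \kappa\,(b-0)^2 with \kappa = 1/(1-\alpha^2).
```

The constant degenerates as $\alpha \to \pm 1$, which is consistent with the restriction $\alpha \in (-1, 1)$.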



However, the proofs in [10,12] are based on different methods (namely, a Liouville-type result inspired by [1,3,4] and a careful analysis of the linearized equation around a solution) and they are quite technical and long. Our techniques also give some geometric insight into more general types of boundary reactions (see Eq. (4) below).

There has been a large number of works devoted to the symmetry properties of semilinear equations for the standard Laplacian. In particular, the De Giorgi conjecture on the flatness of level sets of standard phase transitions has been studied in low dimensions in [1,3,4,24,25]. The conjecture has also been settled in [33] up to dimension 8 under an additional assumption on the profiles at infinity. Here we give a proof of analogous symmetry properties for phase transitions driven by the fractional Laplacian as in (1). Such a proof will be rather simple and short, with minimal assumptions (even on the nonlinearity $f$, which can be taken here to be just locally Lipschitz), and it reveals some geometric aspects of the equation. Indeed, our proof, which is based on the recent work [23], relies heavily on a Poincaré-type inequality which involves the geometry of the level sets of $u$.

Most of our paper will focus on the boundary reaction equation in (3) (and, in fact, on the more general framework of (4) below). We recall that (3) still exhibits nonlocal properties. For instance, for $\alpha = 0$, it has been proven in [12] that layer solutions admit nonlocal Modica-type estimates. Furthermore, in virtue of [11], these equations can be considered as models of a large variety of nonlocal operators. As a consequence, it is worth studying the largest possible class of boundary reaction equations. We will then focus on the following problem:
\[ \begin{cases} -\operatorname{div}(\mu(x) \nabla u) + g(x, u) = 0 & \text{on } \mathbb{R}^{n+1}_+, \\ -\mu(x) u_x = f(u) & \text{on } \partial\mathbb{R}^{n+1}_+, \end{cases} \tag{4} \]

under the following structural assumptions (denoted by (S)):

• The function $\mu$ is in $L^1((0, r))$, for any $r > 0$. Also, $\mu$ is positive and bounded over all open sets compactly contained in $\mathbb{R}^{n+1}_+$, i.e. for all $K \Subset \mathbb{R}^{n+1}_+$, there exist $\mu_1, \mu_2 > 0$, possibly depending on $K$, such that $\mu_1 \le \mu(x) \le \mu_2$, for any $x \in K$.

• The function $\mu$ is an $A_2$-Muckenhoupt weight, that is, there exists $\kappa > 0$ such that
\[ \int_a^b \mu(x)\,dx \int_a^b \frac{1}{\mu(x)}\,dx \le \kappa (b - a)^2 \tag{5} \]
for any $b \ge a \ge 0$.

• The map $(0, +\infty) \ni x \mapsto g(x, 0)$ belongs to $L^\infty((0, r))$ for any $r > 0$. Also, for any $x > 0$, the map $\mathbb{R} \ni u \mapsto g(x, u)$ is locally Lipschitz, and given any $R, M > 0$ there exists $C > 0$, possibly depending on $R$ and $M$ in such a way that sup gu (x, u) C. 0<x 0, and that$^2$
\[ \int_{\mathbb{R}^{n+1}_+} \mu(x) \nabla u \cdot \nabla\xi + \int_{\mathbb{R}^{n+1}_+} g(x, u)\xi = \int_{\partial\mathbb{R}^{n+1}_+} f(u)\xi \tag{8} \]
for any $\xi : B_R^+ \to \mathbb{R}$ which is bounded, locally Lipschitz in the interior of $\mathbb{R}^{n+1}_+$, which vanishes on $\mathbb{R}^{n+1}_+ \setminus B_R$ and such that
\[ \mu(x)|\nabla\xi|^2 \in L^1(B_R^+). \tag{9} \]
As usual, we are using here the notation $B_R^+ := B_R \cap \mathbb{R}^{n+1}_+$. A classical definition is to say that $u$ is stable if
\[ \int_{B_R^+} \mu(x)|\nabla\xi|^2 + \int_{B_R^+} g_u(x, u)\xi^2 - \int_{\partial B_R^+} f'(u)\xi^2 \ge 0 \tag{10} \]

for any $\xi$ as above. The stability (sometimes also called semistability) condition in (10) appears naturally in the calculus of variations setting and it is usually related to minimization and monotonicity properties. In particular, (10) says that the (formal) second variation of the energy functional associated to the equation has a sign (see, e.g., [1,20,31] and Section 7 of [23] for further details).

In our case, however, it is convenient to relax this definition of stability. Namely, we say that $u$ is stable if (10) holds for any $\xi$ of the form $\xi := |\nabla_y u|\phi$, where $\phi : \mathbb{R}^{n+1} \to \mathbb{R}$ is Lipschitz and vanishes on $\mathbb{R}^{n+1}_+ \setminus B_R$. This relaxation of the stability definition is convenient for our setting, since it makes it possible to write (10) when $f$ is only locally Lipschitz and not necessarily differentiable. Indeed, since the map $y \mapsto u(y, x)$ will be taken to be locally Lipschitz (see (11) below), then so is the map $y \mapsto f(u(y, x))$ and therefore $f'(u)\xi^2 = \nabla_y (f(u)) \cdot \nabla_y u\,\phi^2$ is well-defined almost everywhere, making sense of the last term in (10). The regularity theory on $u$, see (7) and (25), also makes the first term in (10) well-posed.

$^2$ Condition (7) is assumed here to make sense of (8). We will see in the forthcoming Lemma 5 that it is always uniformly fulfilled when $u$ is bounded. The structural assumptions on $g$ may be easily checked when $g(x, u)$ has the product-like form $g^{(1)}(x) g^{(2)}(u)$.



The main results we prove are a geometric formula, of Poincaré-type, given in Theorem 1, and a symmetry result, given in Theorem 2. For our geometric result, we need to recall the following notation. Having fixed $x > 0$ and $c \in \mathbb{R}$, we look at the level set
\[ S := \{ y \in \mathbb{R}^n \text{ s.t. } u(y, x) = c \}. \]
We will consider the regular points of $S$, that is, we define
\[ L := \{ y \in S \text{ s.t. } \nabla_y u(y, x) \ne 0 \}. \]
Note that $L$ depends on the $x \in (0, +\infty)$ that we fixed at the beginning, though we do not keep explicit track of this in the notation. For any point $y \in L$, we let $\nabla_L$ be the tangential gradient along $L$, that is, for any $y_o \in L$ and any $G : \mathbb{R}^n \to \mathbb{R}$ smooth in the vicinity of $y_o$, we set
\[ \nabla_L G(y_o) := \nabla_y G(y_o) - \left( \nabla_y G(y_o) \cdot \frac{\nabla_y u(y_o, x)}{|\nabla_y u(y_o, x)|} \right) \frac{\nabla_y u(y_o, x)}{|\nabla_y u(y_o, x)|}. \]
Since $L$ is a smooth manifold, in virtue of the Implicit Function Theorem (and of the standard elliptic regularity of $u$ away from the boundary of $\mathbb{R}^{n+1}_+$), we can define the principal curvatures on it, denoted by $\kappa_1(y, x), \dots, \kappa_{n-1}(y, x)$, for any $y \in L$. We will then define the total curvature
\[ \mathcal{K}(y, x) := \sqrt{ \sum_{j=1}^{n-1} \big( \kappa_j(y, x) \big)^2 }. \]
We also define
\[ \mathcal{R}^{n+1}_+ := \{ (y, x) \in \mathbb{R}^n \times (0, +\infty) \text{ s.t. } \nabla_y u(y, x) \ne 0 \}. \]
With this notation, we can state our geometric formula:

Theorem 1. Let $u$ be $C^2_{loc}$ in the interior of $\mathbb{R}^{n+1}_+$. Assume that $u$ is a bounded and stable weak solution of (4) under assumptions (S). Assume furthermore that for all $r > 0$,
\[ |\nabla_y u| \in L^\infty(B_r^+). \tag{11} \]
Then, for any $R > 0$ and any $\phi : \mathbb{R}^{n+1} \to \mathbb{R}$ which is Lipschitz and vanishes on $\mathbb{R}^{n+1}_+ \setminus B_R$, we have that
\[ \int_{\mathcal{R}^{n+1}_+} \mu(x)\,\phi^2 \Big( \mathcal{K}^2 |\nabla_y u|^2 + \big| \nabla_L |\nabla_y u| \big|^2 \Big) \le \int_{\mathbb{R}^{n+1}_+} \mu(x)\,|\nabla_y u|^2\,|\nabla\phi|^2 . \]



Assumption (11) is natural and it holds in particular in the important case $g := 0$, $\mu(x) = x^\alpha$ where $\alpha \in (-1,1)$, as discussed in Lemmata 9 and 13 below. Interior elliptic regularity also ensures that $u$ is smooth inside $\mathbb{R}^{n+1}_+$.

The result in Theorem 1 has been inspired by the work of [35,36], as developed in [19,23]. In particular, [35,36] obtained a similar inequality for stable solutions of the standard Allen–Cahn equation, and symmetry results for possibly singular or degenerate models have been obtained in [19,23]. Related geometric inequalities also played an important role in [6].

The advantage of the above formula is that one bounds tangential gradients and curvatures of level sets of stable solutions in terms of the gradient of the solution. That is, suitable geometric quantities of interest are controlled by an appropriate energy term. On the other hand, since the geometric formula bounds a weighted $L^2$-norm of any test function $\phi$ by a weighted $L^2$-norm of its gradient, we may consider Theorem 1 as a weighted Poincaré inequality. Again, the advantage of such a formula is that the weights have a neat geometric interpretation.

The second result we present is a symmetry result in low dimension:

Theorem 2. Let the assumptions of Theorem 1 hold and let $n = 2$. Suppose also that one of the following conditions (12) or (13) holds, namely assume that either
\[ \text{for any } M > 0, \text{ the map } (0,+\infty) \ni x \mapsto \sup_{|u| \le M} |g(x,u)| \text{ is in } L^1((0,+\infty)) \tag{12} \]
or that
\[ \inf_{x \in \mathbb{R}^n,\ u \in \mathbb{R}} g(x,u)\,u \ge 0. \tag{13} \]
Suppose also that there exists $C > 0$ in such a way that
\[ \int_0^R \mu(x)\,dx \le C R^2 \tag{14} \]
for any $R \ge 1$. Then, there exist $\omega : (0,+\infty) \to S^1$ and $u_o : \mathbb{R} \times [0,+\infty) \to \mathbb{R}$ such that
\[ u(y,x) = u_o(\omega(x) \cdot y, x) \]
for any $(y,x) \in \mathbb{R}^3_+$. Also, if $g := 0$ and $\mu(x) := x^\alpha$ where $\alpha \in (-1,1)$, then $\omega$ is constant.

Roughly speaking, Theorem 2 asserts that, for any $x > 0$, the function $\mathbb{R}^2 \ni y \mapsto u(y,x)$ depends only on one variable (and, at least for $g := 0$ and $\mu(x) := x^\alpha$, it depends "on the same" $y$-variable for any fixed $x$). Of course, condition (14) is satisfied, for instance, for $\mu := x^\alpha$ and $\alpha \in (-1,1)$, and (12) is fulfilled by $g := 0$, or, more generally, by $g := g^{(1)}(x)g^{(2)}(u)$, with $g^{(1)}$ summable over $\mathbb{R}^+$ and $g^{(2)}$ locally Lipschitz. Also, condition (13) is fulfilled by $g := u^{2\ell+1}$, with $\ell \in \mathbb{N}$.
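The remark that $\mu(x) = x^\alpha$ with $\alpha \in (-1,1)$ satisfies (14) rests on the elementary computation $\int_0^R x^\alpha\,dx = R^{1+\alpha}/(1+\alpha)$ together with $1+\alpha < 2$. The following minimal Python sketch spells out this arithmetic. The function names and the explicit constant $C = 1/(1+\alpha)$ are illustrative assumptions and do not come from the paper.

```python
def weight_integral(R, alpha):
    """Closed form of the integral of x**alpha over (0, R); requires alpha > -1."""
    return R ** (1.0 + alpha) / (1.0 + alpha)


def check_condition_14(alpha, radii):
    """Check weight_integral(R, alpha) <= C * R**2 for every R >= 1 in `radii`.

    The constant C = 1 / (1 + alpha) is one admissible (hypothetical) choice:
    for R >= 1 and alpha < 1 one has R**(1 + alpha) <= R**2.
    """
    C = 1.0 / (1.0 + alpha)
    tolerance = 1.0 + 1e-12  # guards against rounding in the borderline case R = 1
    return all(
        weight_integral(R, alpha) <= C * R ** 2 * tolerance
        for R in radii
        if R >= 1.0
    )


if __name__ == "__main__":
    for alpha in (-0.9, 0.0, 0.5):
        print(alpha, check_condition_14(alpha, [1.0, 2.0, 10.0, 1000.0]))
```

The check only samples finitely many radii, so it illustrates the bound rather than proving it; the proof is the one-line comparison of exponents above.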



We remark that when $u$ is not bounded, the claim of Theorem 2 does not, in general, hold (a counterexample being $\mu := 1$, $f := 0$, $g := 0$ and $u(y_1,y_2,x) := y_1^2 - y_2^2$). Theorem 12 below will also provide a result, slightly more general than Theorem 2, which will be valid for $n \ge 2$ and without conditions (12) or (13), under an additional energy assumption.

The pioneering work in [12] is related to Theorem 2. Indeed, with different methods, [12] proved a result analogous to our Theorem 2 under the additional assumptions that $\alpha := 0$, $g := 0$ and $f \in C^{1,\beta}$ for some $\beta > 0$ (see, in particular, p. 1681 and Theorem 1.5 in [12]).

We finally state the symmetry result for Eq. (1):

Theorem 3. Let $v \in C^2_{\rm loc}(\mathbb{R}^n)$ be a bounded solution of Eq. (1), with $n = 2$ and $f$ locally Lipschitz. Suppose that either
\[ f' \le 0 \tag{15} \]
or that
\[ \partial_{y_2} v > 0. \tag{16} \]
Then, there exist $\omega \in S^1$ and $v_o : \mathbb{R} \to \mathbb{R}$ such that $v(y) = v_o(\omega \cdot y)$ for any $y \in \mathbb{R}^2$.

The remaining part of the paper is devoted to the proofs of Theorems 1–3. For this, some regularity theory for solutions of Eq. (4) will also be needed.

1. Regularity theory for Eq. (4)

This section is devoted to several results we need for the regularity theory of Eq. (4). We do not develop here a complete theory. We recall that
\[ \mu(x)u_x^2 \in L^1(B_R^+) \tag{17} \]
for any $R > 0$, due to (7).

1.1. Regularity for Eq. (4) under assumption (11)

We start with an elementary observation:

Lemma 4. If $n = 2$ and (14) holds, then there exists $C > 0$ in such a way that
\[ \int_{B_{2R}^+ \setminus B_R^+} \mu(x) \le C R^4 \tag{18} \]
for any $R \ge 1$.



Proof. Using (14), we have that
\[ \int_{B_{2R}^+ \setminus B_R^+} \mu(x) \le \int_0^{2R} \int_{B_{2R}} \mu(x)\,dy\,dx \le C_1 R^2 \int_0^{2R} \mu(x)\,dx \le C_2 R^4, \]
for suitable $C_1, C_2 > 0$. $\square$

Though not explicitly needed here, we would like to point out that the natural integrability condition in (7) holds uniformly for bounded solutions. A byproduct of this gives an energy estimate, which we will use in the proof of Theorem 2.

Lemma 5. Let $u$ be a bounded weak solution of (4) under assumptions (S). Then, for any $R > 0$ there exists $C$, possibly depending on $R$, in such a way that
\[ \big\| \mu(x)|\nabla u|^2 \big\|_{L^1(B_R^+)} \le C. \]
Moreover, if
• $n = 2$,
• either (12) or (13) holds,
• (14) holds,
then there exists $C_o > 0$ such that
\[ \int_{B_R^+} \mu(x)|\nabla u|^2 \le C_o R^2 \tag{19} \]
for any $R \ge 1$.

Proof. The proof consists in testing the weak formulation in (8) with $\xi := u\tau^2$, where $\tau$ is a cutoff function such that $0 \le \tau \in C_0^\infty(B_{2R})$, with $\tau = 1$ in $B_R$ and $|\nabla\tau| \le 8/R$, with $R \ge 1$. Note that such a $\xi$ is admissible, since (9) follows from (7). One then gets from (8) that
\[ \int_{\mathbb{R}^{n+1}_+} \mu(x)\big( |\nabla u|^2\tau^2 + 2\tau u\,\nabla u \cdot \nabla\tau \big) + \int_{\mathbb{R}^{n+1}_+} g(x,u)u\tau^2 = \int_{\mathbb{R}^n} f(u)u\tau^2. \]


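Thus, by Cauchy–Schwarz inequality,
\[ \int_{\mathbb{R}^{n+1}_+} \mu(x)|\nabla u|^2\tau^2 \le \frac12 \int_{\mathbb{R}^{n+1}_+} \mu(x)|\nabla u|^2\tau^2 + C_* \int_{\mathbb{R}^{n+1}_+} \mu(x)|\nabla\tau|^2 + \int_{\mathbb{R}^n} |f(u)|\,|u|\,\tau^2 - \int_{\mathbb{R}^{n+1}_+} g(x,u)u\tau^2, \]
for a suitable constant $C_* > 0$. This, recalling (12), (13) and (18), plainly gives the desired result. $\square$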

We now control further derivatives in $y$, thanks to the fact that the operator is independent of the variable $y$:

Lemma 6. Let $u$ be a bounded weak solution of (4) under conditions (S). Suppose that (11) holds. Then, $\mu(x)|\nabla u_{y_j}|^2 \in L^1(B_R^+)$ for every $R > 0$.

Proof. Given $|\eta| < 1$, $\eta \neq 0$, we consider the incremental quotient
\[ u_\eta(y,x) := \frac{u(y_1, \dots, y_j + \eta, \dots, y_n, x) - u(y_1, \dots, y_j, \dots, y_n, x)}{\eta}. \]
Since $f$ is locally Lipschitz,
\[ \big| [f(u)]_\eta \big| \le C, \tag{20} \]
for some $C > 0$, due to (11). Analogously, from (6) and (11), for any $R > 0$ there exists $C_R > 0$ such that
\[ \big| [g(x,u)]_\eta \big| \le C_R \tag{21} \]
for any $x \in (0,R)$. Let now $\xi$ be as requested in (8). Then, (8) gives that
\[
\begin{aligned}
\int_{\mathbb{R}^{n+1}_+} \Big( \mu(x)\nabla u_\eta \cdot \nabla\xi + [g(x,u)]_\eta\,\xi \Big) - \int_{\partial\mathbb{R}^{n+1}_+} [f(u)]_\eta\,\xi
&= -\int_{\mathbb{R}^{n+1}_+} \Big( \mu(x)\nabla u \cdot \nabla\xi_{-\eta} + g(x,u)\xi_{-\eta} \Big) + \int_{\partial\mathbb{R}^{n+1}_+} f(u)\xi_{-\eta} \\
&= 0.
\end{aligned}
\]
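The first equality in the last display is a discrete integration by parts: for incremental quotients one has $\int a_\eta\, b = -\int a\, b_{-\eta}$ after a change of variables. The following Python sketch checks this identity on a one-dimensional periodic grid. The periodic setting, the sample functions and all names are assumptions made for illustration; in the paper the integrals are over $\mathbb{R}^{n+1}_+$ with compactly supported test functions.

```python
import math


def quotient(values, shift, eta):
    """Incremental quotient (a(y + eta) - a(y)) / eta on a periodic grid.

    `shift` is the number of grid steps corresponding to eta and may be negative.
    """
    n = len(values)
    return [(values[(i + shift) % n] - values[i]) / eta for i in range(n)]


def discrete_parts_gap(a, b, shift, h):
    """Return |sum(a_eta * b) + sum(a * b_{-eta})| * h, which should vanish."""
    eta = shift * h
    lhs = sum(p * q for p, q in zip(quotient(a, shift, eta), b)) * h
    rhs = -sum(p * q for p, q in zip(a, quotient(b, -shift, -eta))) * h
    return abs(lhs - rhs)


if __name__ == "__main__":
    n = 200
    h = 2.0 * math.pi / n
    grid = [i * h for i in range(n)]
    a = [math.sin(t) for t in grid]
    b = [math.cos(2.0 * t) + 0.3 * math.sin(t) for t in grid]
    print(discrete_parts_gap(a, b, shift=3, h=h))  # zero up to rounding error
```

On the grid the identity is exact (a reindexing of a finite sum), which is why no limit in $\eta$ is needed at this stage of the argument.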



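We now consider a smooth cutoff function $\tau$ such that $0 \le \tau \in C_0^\infty(B_{R+1})$, with $\tau = 1$ in $B_R$ and $|\nabla\tau| \le 2$. Taking $\xi := u_\eta\tau^2$ in the above expression, one gets
\[ 2\int_{\mathbb{R}^{n+1}_+} \mu(x)\tau u_\eta \nabla u_\eta \cdot \nabla\tau + \int_{\mathbb{R}^{n+1}_+} \mu(x)\tau^2|\nabla u_\eta|^2 + \int_{\mathbb{R}^{n+1}_+} [g(x,u)]_\eta\, u_\eta\tau^2 = \int_{\partial\mathbb{R}^{n+1}_+} [f(u)]_\eta\, u_\eta\tau^2. \tag{22} \]
We remark that the above choice of $\xi$ is admissible, since (9) follows from (11) and (17). Now, by Cauchy–Schwarz inequality, we have
\[ \int_{\mathbb{R}^{n+1}_+} \mu(x)\tau u_\eta \nabla u_\eta \cdot \nabla\tau \ge -\frac{\varepsilon}{2} \int_{\mathbb{R}^{n+1}_+} \mu(x)\tau^2|\nabla u_\eta|^2 - \frac{1}{2\varepsilon} \int_{\mathbb{R}^{n+1}_+} \mu(x)u_\eta^2|\nabla\tau|^2 \]
for any $\varepsilon > 0$. Therefore, by choosing $\varepsilon$ suitably small, (22) reads
\[ \int_{\mathbb{R}^{n+1}_+} \mu(x)\tau^2|\nabla u_\eta|^2 \le C \left( \int_{B_{R+1}^+} \mu(x)u_\eta^2 + \int_{B_{R+1}^+} \big|[g(x,u)]_\eta\big|\,|u_\eta| + \int_{\{|y| \le R+1\} \times \{x=0\}} \big|[f(u)]_\eta\big|\,|u_\eta| \right) \]
for some $C > 0$. From (11), (20) and (21), we thus control
\[ \int_{B_R^+} \mu(x)\tau^2|\nabla u_\eta|^2 \]
uniformly in $\eta$. By sending $\eta \to 0$ and using Fatou's lemma, we obtain the desired claim. $\square$

The following is the regularity needed for some subsequent computations: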



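Lemma 7. Let $u$ be $C^2_{\rm loc}$ in the interior of $\mathbb{R}^{n+1}_+$. Suppose that $u$ is a bounded weak solution of (4) under conditions (S) and that (11) holds. Then,
\[ \text{for almost any } x > 0, \text{ the map } \mathbb{R}^n \ni y \mapsto \nabla u(y,x) \text{ is in } W^{1,1}_{\rm loc}(\mathbb{R}^n, \mathbb{R}^n), \tag{23} \]
and
\[ \text{the map } \mathbb{R}^{n+1}_+ \ni (y,x) \mapsto \mu(x) \sum_{j=1}^n \big( |\nabla u_{y_j}|^2 + |u_{y_j}|^2 \big) \text{ is in } L^1(B_r^+), \text{ for any } r > 0. \tag{24} \]
What is more,
\[ \text{the map } \mathbb{R}^{n+1}_+ \ni (y,x) \mapsto \mu(x) \Big( \big|\nabla|\nabla_y u|\big|^2 + |\nabla_y u|^2 \Big) \text{ is in } L^1(B_r^+), \text{ for any } r > 0. \tag{25} \]

Proof. Since $u$ is $C^2_{\rm loc}$ in the interior of $\mathbb{R}^{n+1}_+$, for any $x \in (\epsilon, 1/\epsilon)$ and any $R > 0$
\[ \int_{B_R} \Big( |\nabla u(y,x)| + \sum_{j=1}^n |\nabla u_{y_j}(y,x)| \Big)\,dy \le C \]
for a suitable $C > 0$, possibly depending on $\epsilon$ and $R$, which proves (23). Exploiting Lemma 6, (11) and the local integrability of $\mu(x)$, one obtains (24).

To prove (25), we now perform the following standard approximation argument. Define $\Gamma = (\Gamma_1, \dots, \Gamma_n) := \nabla_y u$, and let $r, \rho > 0$ and $P \in \mathbb{R}^{n+1}_+$ be such that $B_{r+\rho}(P) \subset \mathbb{R}^{n+1}_+$. Fix also $i \in \{1, \dots, n+1\}$. Then, for any $\epsilon > 0$,
\[ \left| \frac{\sum_{j=1}^n \Gamma_j \partial_i\Gamma_j}{\sqrt{\epsilon^2 + \sum_{j=1}^n \Gamma_j^2}} \right| \le \frac{2|\Gamma|\,|\partial_i\Gamma|}{\epsilon + |\Gamma|} \le 2|\partial_i\Gamma| \in L^1(B_r(P)), \]
\[ \lim_{\epsilon \to 0^+} \frac{\sum_{j=1}^n \Gamma_j \partial_i\Gamma_j}{\sqrt{\epsilon^2 + \sum_{j=1}^n \Gamma_j^2}} = \chi_{\{\Gamma \neq 0\}} \frac{\sum_{j=1}^n \Gamma_j \partial_i\Gamma_j}{|\Gamma|}, \]
\[ \sqrt{\epsilon^2 + \sum_{j=1}^n \Gamma_j^2} \le \epsilon + |\Gamma| \in L^1(B_r(P)) \quad\text{and}\quad \lim_{\epsilon \to 0^+} \sqrt{\epsilon^2 + \sum_{j=1}^n \Gamma_j^2} = |\Gamma|, \]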



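thanks to (24). As standard, we denote by $\chi_A$, here and in the sequel, the characteristic function of a set $A$. Therefore, by the Dominated Convergence Theorem,
\[
\begin{aligned}
\int_{\mathbb{R}^{n+1}_+} \psi\,\chi_{\{\Gamma \neq 0\}} \frac{\sum_{j=1}^n \Gamma_j \partial_i\Gamma_j}{|\Gamma|}
&= \lim_{\epsilon \to 0^+} \int_{\mathbb{R}^{n+1}_+} \psi\, \frac{\sum_{j=1}^n \Gamma_j \partial_i\Gamma_j}{\sqrt{\epsilon^2 + \sum_{j=1}^n \Gamma_j^2}} \\
&= \lim_{\epsilon \to 0^+} \int_{\mathbb{R}^{n+1}_+} \psi\, \partial_i \sqrt{\epsilon^2 + \sum_{j=1}^n \Gamma_j^2} \\
&= -\lim_{\epsilon \to 0^+} \int_{\mathbb{R}^{n+1}_+} (\partial_i\psi) \sqrt{\epsilon^2 + \sum_{j=1}^n \Gamma_j^2} \\
&= -\int_{\mathbb{R}^{n+1}_+} (\partial_i\psi)\,|\Gamma|
\end{aligned}
\]
for any $\psi \in C_0^\infty(B_r(P))$. Thus, since $P$, $r$ and $\rho$ can be arbitrarily chosen, we have that
\[ \partial_i|\Gamma| = \chi_{\{\Gamma \neq 0\}} \frac{\sum_{j=1}^n \Gamma_j \partial_i\Gamma_j}{|\Gamma|} \]
weakly and almost everywhere in $\mathbb{R}^{n+1}_+$. Accordingly,
\[ \big|\nabla|\nabla_y u|\big|^2 = \big|\nabla|\Gamma|\big|^2 = \sum_{i=1}^{n+1} \big(\partial_i|\Gamma|\big)^2 \le \sum_{i=1}^{n+1} \left( \frac{\sum_{j=1}^n \Gamma_j \partial_i\Gamma_j}{|\Gamma|} \right)^2 \le \sum_{i=1}^{n+1} |\partial_i\Gamma|^2 = \sum_{i=1}^{n+1} \sum_{j=1}^n (\partial_i u_{y_j})^2 = \sum_{j=1}^n |\nabla u_{y_j}|^2. \]
Then, (24) implies (25). $\square$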

1.2. Verification of assumption (11)

In this section, we show that (11) is always satisfied in the important case $g := 0$, $\mu(x) := x^\alpha$, with $\alpha \in (-1,1)$. More precisely, we state the following result, the proof of which can be found in [10]:



Lemma 8. Let $u$ be a bounded weak solution of (3) and assume that $f$ is locally Lipschitz. Then there exists a constant $C > 0$ depending on $R$ and $\beta \in (0,1)$ such that
• the function $u$ is Hölder-continuous of exponent $\beta$ and
\[ \|u\|_{C^\beta(B_R^+)} \le C, \]
• for all $j = 1, \dots, n$, the function $u_{y_j}$ is Hölder-continuous of exponent $\beta$ and
\[ \|u_{y_j}\|_{C^\beta(B_R^+)} \le C. \tag{26} \]

We can now prove the following gradient bound, which says that (11) holds for bounded solutions of Eq. (3):

Lemma 9. Let $u$ be a bounded weak solution of (3) and assume that $f$ is locally Lipschitz. Then, given $R > 0$, there exists $C_R > 0$ such that
\[ \|\nabla_y u\|_{L^\infty(\mathbb{R}^n \times (0,R))} \le C_R. \]

Proof. From (26), $\nabla_y u$ is bounded in, say, $\mathbb{R}^{n+1}_+ \cap \{0 \le x \le 3\}$. Now, in $\mathbb{R}^{n+1}_+ \cap \{x > 3\}$, Eq. (3) is nondegenerate and therefore the gradient bound follows from standard elliptic theory. $\square$

2. Proof of Theorem 1

Besides a few technicalities, the proof of Theorem 1 consists simply in plugging the right test function into the stability condition (10) and in using the linearization of (4) to get rid of the unpleasant terms. The rigorous details of the proof are as follows. By (23), we have that

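\[ \int_{\mathbb{R}^{n+1}_+} \mu(x)\nabla u_{y_j} \cdot \Psi = \int_0^\infty \mu(x) \int_{\mathbb{R}^n} \nabla u_{y_j} \cdot \Psi\,dy\,dx = -\int_{\mathbb{R}^{n+1}_+} \mu(x)\nabla u \cdot \Psi_{y_j} \]
for any $j = 1, \dots, n$ and any $\Psi \in C^\infty(\mathbb{R}^{n+1}_+, \mathbb{R}^n)$ supported in $B_R$. Thus, making use of (8), we conclude that
\[
\begin{aligned}
\int_{\mathbb{R}^{n+1}_+} \mu(x)\nabla u_{y_j} \cdot \nabla\psi
&= -\int_{\mathbb{R}^{n+1}_+} \mu(x)\nabla u \cdot \nabla\psi_{y_j} \\
&= -\int_{\partial\mathbb{R}^{n+1}_+} f(u)\psi_{y_j} + \int_{\mathbb{R}^{n+1}_+} g(x,u)\psi_{y_j} \\
&= \int_{\partial\mathbb{R}^{n+1}_+} \big(f(u)\big)_{y_j}\psi - \int_{\mathbb{R}^{n+1}_+} g_u(x,u)u_{y_j}\psi \\
&= \int_{\partial\mathbb{R}^{n+1}_+} f'(u)u_{y_j}\psi - \int_{\mathbb{R}^{n+1}_+} g_u(x,u)u_{y_j}\psi
\end{aligned}
\tag{27}
\]
for any $j = 1, \dots, n$ and any $\psi \in C^\infty(\mathbb{R}^{n+1}_+)$ supported in $B_R$. A density argument (recall (5) and see, e.g., Lemma 3.4, Theorem 2.4 and (2.9) in [8]), via (24), implies that (27) holds for $\psi := u_{y_j}\phi^2$, where $\phi$ is as in the statement of Theorem 1, therefore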

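\[
\begin{aligned}
\int_{\partial B_R^+} f'(u)|\nabla_y u|^2\phi^2
&= \sum_{j=1}^n \int_{B_R^+} \mu(x)\nabla u_{y_j} \cdot \nabla\big(u_{y_j}\phi^2\big) + \sum_{j=1}^n \int_{B_R^+} g_u(x,u)u_{y_j}^2\phi^2 \\
&= \sum_{j=1}^n \int_{B_R^+} \mu(x)\Big( |\nabla u_{y_j}|^2\phi^2 + u_{y_j}\nabla u_{y_j} \cdot \nabla\phi^2 \Big) + \sum_{j=1}^n \int_{B_R^+} g_u(x,u)u_{y_j}^2\phi^2 \\
&= \int_{B_R^+} \mu(x)\Big( \sum_{j=1}^n |\nabla u_{y_j}|^2\phi^2 + \phi\nabla\phi \cdot \nabla|\nabla_y u|^2 \Big) + \int_{B_R^+} g_u(x,u)|\nabla_y u|^2\phi^2.
\end{aligned}
\tag{28}
\]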

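Now, we make use of (10) by taking $\xi := |\nabla_y u|\phi$ (note that (11) and (25) imply (9) and so they make it possible to use here such a test function). We thus obtain
\[ 0 \le \int_{B_R^+} \mu(x)\Big( \big|\nabla|\nabla_y u|\big|^2\phi^2 + |\nabla_y u|^2|\nabla\phi|^2 + 2|\nabla_y u|\phi\nabla\phi \cdot \nabla|\nabla_y u| \Big) + \int_{B_R^+} g_u(x,u)|\nabla_y u|^2\phi^2 - \int_{\partial B_R^+} f'(u)|\nabla_y u|^2\phi^2. \]
This and (28) imply that
\[
\begin{aligned}
&\int_{\mathbb{R}^{n+1}_+} \mu(x)\phi^2 \Big[ \sum_{j=1}^n (\partial_x u_{y_j})^2 - \big|\partial_x|\nabla_y u|\big|^2 \Big] + \int_{\mathbb{R}^{n+1}_+} \mu(x)\phi^2 \Big[ \sum_{j=1}^n |\nabla_y u_{y_j}|^2 - \big|\nabla_y|\nabla_y u|\big|^2 \Big] \\
&\qquad + \int_{\mathbb{R}^{n+1}_+} \mu(x)\phi\nabla\phi \cdot \Big( \nabla|\nabla_y u|^2 - 2|\nabla_y u|\nabla|\nabla_y u| \Big) \\
&= \int_{B_R^+} \mu(x)\phi^2 \Big[ \sum_{j=1}^n |\nabla u_{y_j}|^2 - \big|\nabla|\nabla_y u|\big|^2 \Big] + \int_{\mathbb{R}^{n+1}_+} \mu(x)\phi\nabla\phi \cdot \Big( \nabla|\nabla_y u|^2 - 2|\nabla_y u|\nabla|\nabla_y u| \Big) \\
&\le \int_{\mathbb{R}^{n+1}_+} \mu(x)|\nabla\phi|^2|\nabla_y u|^2.
\end{aligned}
\tag{29}
\]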

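Let now $r, \rho > 0$ and $P \in \mathbb{R}^{n+1}_+$ be such that $B_{r+\rho}(P) \subset \mathbb{R}^{n+1}_+$. We consider $\gamma$ to be either $|\nabla_y u|$ or $u_{y_j}$. In force of (24) and (25), we see that $\gamma$ is in $W^{1,2}(B_r(P))$, and so in $W^{1,1}_{\rm loc}(B_r(P))$. Thus, by Stampacchia's theorem (see, e.g., Theorem 6.19 in [30]), $\nabla\gamma = 0$ for almost any $(y,x) \in B_r(P)$ such that $\gamma(y,x) = 0$. Hence, since $P$, $r$ and $\rho$ can be chosen arbitrarily, we have that $\nabla|\nabla_y u| = 0 = \nabla u_{y_j}$ for almost every $(y,x)$ such that $\nabla_y u(y,x) = 0$. Accordingly, (29) may be written as
\[ \int_{\mathcal{R}^{n+1}_+} \mu(x)\phi^2 \Big[ \sum_{j=1}^n (\partial_x u_{y_j})^2 - \big|\partial_x|\nabla_y u|\big|^2 \Big] + \int_{\mathcal{R}^{n+1}_+} \mu(x)\phi^2 \Big[ \sum_{j=1}^n |\nabla_y u_{y_j}|^2 - \big|\nabla_y|\nabla_y u|\big|^2 \Big] \le \int_{\mathbb{R}^{n+1}_+} \mu(x)|\nabla\phi|^2|\nabla_y u|^2. \]
Therefore, by standard differential geometry formulas (see, for example, equation (2.10) in [23]), we obtain
\[ \int_{\mathcal{R}^{n+1}_+} \mu(x)\phi^2 \Big[ \sum_{j=1}^n (\partial_x u_{y_j})^2 - \big|\partial_x|\nabla_y u|\big|^2 \Big] + \int_{\mathcal{R}^{n+1}_+} \mu(x)\phi^2 \Big( \mathcal{K}^2|\nabla_y u|^2 + \big|\nabla_L|\nabla_y u|\big|^2 \Big) \le \int_{\mathbb{R}^{n+1}_+} \mu(x)|\nabla\phi|^2|\nabla_y u|^2. \tag{30} \]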



We now note that, on $\mathcal{R}^{n+1}_+$,
\[ \big|\partial_x|\nabla_y u|\big|^2 = \left| \frac{\nabla_y u \cdot \nabla_y u_x}{|\nabla_y u|} \right|^2 \le |\nabla_y u_x|^2 = \sum_{j=1}^n (\partial_x u_{y_j})^2. \tag{31} \]
This and (30) complete the proof of Theorem 1.

3. Proof of Theorem 2

The strategy for proving Theorem 2 is to test the geometric formula of Theorem 1 against an appropriate capacity-type function to make the left-hand side vanish. This would give that the curvature of the level sets for fixed $x > 0$ vanishes and so that these level sets are flat, as desired (for this, the vanishing of the tangential gradient term is also useful to take care of the possible plateaus of $u$, where the level sets are not smooth manifolds: see Section 2.4 in [23] for further considerations).

Some preparation is needed for the proof of Theorem 2. Indeed, Theorem 2 will follow from the subsequent Theorem 12, which is valid for any dimension $n$ and without the restriction in either (12) or (13). We will use the notation $X := (y,x)$ for points in $\mathbb{R}^{n+1}_+$. Given $\rho_1 \le \rho_2$, we also define

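\[ A_{\rho_1,\rho_2} := \{ X \in \mathbb{R}^{n+1}_+ \ \text{s.t.}\ |X| \in [\rho_1,\rho_2] \}. \]

Lemma 10. Let $R > 0$ and $h : B_R^+ \to \mathbb{R}$ be a nonnegative measurable function. For any $\rho \in (0,R)$, let
\[ \eta(\rho) := \int_{B_\rho^+} h. \]
Then,
\[ \int_{A_{\sqrt{R},R}} \frac{h(X)}{|X|^2}\,dX \le 2\int_{\sqrt{R}}^R t^{-3}\eta(t)\,dt + \frac{\eta(R)}{R^2}. \]

Proof. By Fubini's theorem,
\[
\begin{aligned}
\int_{A_{\sqrt{R},R}} \frac{h(X)}{2|X|^2}\,dX
&= \int_{A_{\sqrt{R},R}} \int_{|X|}^R t^{-3}h(X)\,dt\,dX + \int_{A_{\sqrt{R},R}} \frac{h(X)}{2R^2}\,dX \\
&= \int_{\sqrt{R}}^R \int_{A_{\sqrt{R},t}} t^{-3}h(X)\,dX\,dt + \frac{1}{2R^2} \int_{A_{\sqrt{R},R}} h(X)\,dX \\
&\le \int_{\sqrt{R}}^R \int_{B_t^+} t^{-3}h(X)\,dX\,dt + \frac{1}{2R^2} \int_{B_R^+} h(X)\,dX,
\end{aligned}
\]
from which we obtain the desired result. $\square$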
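To see Lemma 10 at work, one can take the simplest case $h \equiv 1$ on the half-plane ($n + 1 = 2$), for which $\eta(t) = \pi t^2/2$ grows quadratically, like the energy in (19). Both sides of the lemma can then be computed in polar coordinates, and both grow like $\log R$. The Python sketch below evaluates them with a midpoint rule. The choice of $h$, the dimension and the helper names are illustrative assumptions, not part of the paper.

```python
import math


def midpoint(f, a, b, steps=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h


def lemma10_sides(R):
    """Both sides of Lemma 10 for h = 1 on the half-plane (n + 1 = 2), R > 1."""

    def eta(t):
        # Area of the half-disc B_t^+.
        return math.pi * t * t / 2.0

    root = math.sqrt(R)
    # Left side in polar coordinates: (1 / r^2) integrated against pi * r dr.
    lhs = midpoint(lambda r: math.pi / r, root, R)
    rhs = 2.0 * midpoint(lambda t: eta(t) / t ** 3, root, R) + eta(R) / R ** 2
    return lhs, rhs


if __name__ == "__main__":
    for R in (10.0, 100.0, 1000.0):
        lhs, rhs = lemma10_sides(R)
        print(R, lhs, rhs, lhs <= rhs)
```

In this example the left side equals $(\pi/2)\log R$ and the right side exceeds it by exactly $\pi/2$, so the inequality is sharp up to an additive constant.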

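Now we recall the following result of [11], dealing with the Poisson kernel associated to the fractional Laplacian:

Lemma 11. The function
\[ P(y,x) = C_{n,\alpha} \frac{x^{1-\alpha}}{(x^2 + |y|^2)^{\frac{n+1-\alpha}{2}}} \tag{32} \]
is a solution of
\[ \begin{cases} -\operatorname{div}(x^\alpha \nabla P) = 0 & \text{on } \mathbb{R}^n \times (0,+\infty), \\ P = \delta_0 & \text{on } \mathbb{R}^n \times \{0\}, \end{cases} \tag{33} \]
where $\alpha \in (-1,1)$ and $C_{n,\alpha}$ is a normalizing constant such that
\[ \int_{\mathbb{R}^n} P(y,x)\,dy = 1. \]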
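Two features of the kernel in (32) can be checked numerically in the case $n = 1$: the mass of $x^{1-\alpha}(x^2+y^2)^{-(2-\alpha)/2}$ over $y \in \mathbb{R}$ does not depend on $x$ (so a single constant $C_{n,\alpha}$ normalizes it for every $x$), and the interior equation in (33) holds. The sketch below does this with a midpoint rule and central finite differences. The restriction to $n = 1$, the step sizes and the function names are assumptions made for illustration only.

```python
import math


def kernel(y, x, alpha):
    """Unnormalized kernel of (32) for n = 1: x^(1-alpha) / (x^2 + y^2)^((2-alpha)/2)."""
    return x ** (1.0 - alpha) / (x * x + y * y) ** ((2.0 - alpha) / 2.0)


def mass(x, alpha, steps=20000):
    """Midpoint-rule approximation of the integral of kernel(., x, alpha) over R.

    Uses the substitution y = x * tan(theta) with theta in (-pi/2, pi/2).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2.0 + (k + 0.5) * h
        y = x * math.tan(theta)
        jacobian = x / math.cos(theta) ** 2
        total += kernel(y, x, alpha) * jacobian * h
    return total


def residual(y, x, alpha, h=1e-3):
    """Central-difference approximation of div(x^alpha grad P) at (y, x), x > 2h."""

    def flux_x(xx):
        # Approximates x^alpha * dP/dx at height xx.
        return xx ** alpha * (kernel(y, xx + h, alpha) - kernel(y, xx - h, alpha)) / (2.0 * h)

    p_yy = (kernel(y + h, x, alpha) - 2.0 * kernel(y, x, alpha) + kernel(y - h, x, alpha)) / (h * h)
    d_flux = (flux_x(x + h) - flux_x(x - h)) / (2.0 * h)
    return x ** alpha * p_yy + d_flux


if __name__ == "__main__":
    print(mass(0.5, 0.5), mass(2.0, 0.5))  # same value up to rounding
    print(residual(0.5, 1.0, 0.5))         # small, of the order of the discretization error
```

For $\alpha = 0$ the mass equals $\pi$, which recovers the normalization $1/\pi$ of the classical Poisson kernel of the half-plane.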

The following is the main symmetry result, from which Theorem 2 will easily follow:

Theorem 12. Let $u$ be as requested in Theorem 1. Assume furthermore that there exists $C_o \ge 1$ in such a way that
\[ \int_{B_R^+} \mu(x)|\nabla u|^2 \le C_o R^2 \tag{34} \]
for any $R \ge C_o$. Then there exist $\omega : (0,+\infty) \to S^{n-1}$ and $u_o : \mathbb{R} \times (0,+\infty) \to \mathbb{R}$ such that
\[ u(y,x) = u_o\big(\omega(x) \cdot y, x\big) \tag{35} \]
for any $(y,x) \in \mathbb{R}^{n+1}_+$. Also,
\[ \text{if } g := 0 \text{ and } \mu(x) := x^\alpha \text{ where } \alpha \in (-1,1), \text{ then } \omega \text{ is constant.} \tag{36} \]



Proof. From Lemma 10 (applied here with $h(X) := \mu(x)|\nabla u(X)|^2$) and (34), we obtain
\[ \int_{A_{\sqrt{R},R}} \frac{\mu(x)|\nabla u(X)|^2}{|X|^2} \le C_1 \log R \tag{37} \]
for a suitable $C_1$, as long as $R$ is large enough. Now we define
\[ \phi_R(X) := \begin{cases} \log R & \text{if } |X| \le \sqrt{R}, \\ 2\log(R/|X|) & \text{if } \sqrt{R} < |X| < R, \\ 0 & \text{if } |X| \ge R, \end{cases} \]
and we observe that
\[ |\nabla\phi_R| \le \frac{C_2\,\chi_{A_{\sqrt{R},R}}}{|X|}, \]
for a suitable $C_2 > 0$. Thus, plugging $\phi_R$ inside the geometric inequality of Theorem 1, we obtain
\[ (\log R)^2 \int_{B_{\sqrt{R}}^+ \cap \mathcal{R}^{n+1}_+} \mu(x)\Big( \mathcal{K}^2|\nabla_y u|^2 + \big|\nabla_L|\nabla_y u|\big|^2 \Big) \le C_3 \int_{A_{\sqrt{R},R}} \frac{\mu(x)|\nabla_y u|^2}{|X|^2} \]
for large $R$. Dividing by $(\log R)^2$, employing (37) and taking $R$ arbitrarily large, we see that
\[ \mathcal{K}^2|\nabla_y u|^2 + \big|\nabla_L|\nabla_y u|\big|^2 \tag{38} \]
vanishes identically on $\mathcal{R}^{n+1}_+$, that is $\mathcal{K} = 0 = |\nabla_L|\nabla_y u||$ on $\mathcal{R}^{n+1}_+$. Then, (35) follows by Lemma 2.11 of [23] (applied to the function $y \mapsto u(y,x)$, for any fixed $x > 0$).

We now prove (36). For this, since $S^{n-1}$ is compact, we take a sequence $x_j \to 0^+$ and $\omega \in S^{n-1}$ in such a way that $\omega_j := \omega(x_j) \to \omega$. Then, by Lemma 8 and (35),
\[ v(y) := \lim_{j \to +\infty} u(y,x_j) = \lim_{j \to +\infty} u_o(\omega_j \cdot y, x_j) = v_o(\omega \cdot y) \]
for a suitable function $v_o$. Following [11], we now consider the Poisson kernel in (32) and we define
\[ u^\star(y,x) := \int_{\mathbb{R}^n} P(\xi,x)v(y-\xi)\,d\xi = \int_{\mathbb{R}^n} P(\xi,x)v_o(\omega \cdot y - \omega \cdot \xi)\,d\xi. \]
By construction, $u^\star(y,x) = u^\star_o(\omega \cdot y, x)$ for a suitable function $u^\star_o$. We also consider the function $U := u - u^\star$. Note that $\operatorname{div}(x^\alpha\nabla U) = 0$ in $\mathbb{R}^{n+1}_+$, thanks to [11] (recall Lemma 11). Furthermore, $U$ is bounded, since so is $u$, and $U(y,0) = 0$.


Therefore, by a Liouville-type result,³ we conclude that $U$ vanishes identically. Hence, $u(y,x) = u^\star_o(\omega \cdot y, x)$, which gives⁴ (36). $\square$

We now complete the proof of Theorem 2. We observe that, under the assumptions of Theorem 2, estimate (34) holds, thanks to (19). Consequently, the hypotheses of Theorem 2 imply the ones of Theorem 12, from which the claim in Theorem 2 follows.

4. Proof of Theorem 3

We use Theorem 2 to prove Theorem 3. For this, given a function $v$ satisfying (1), we select an extension⁵ $u$ satisfying (3) by the Poisson kernel in (32). These are the details of the proof of Theorem 3. Let $v$ be a bounded solution of (1) and consider the function
\[ u(y,x) = \int_{\mathbb{R}^n} P(y-z,x)v(z)\,dz = \int_{\mathbb{R}^n} P(\xi,x)v(y-\xi)\,d\xi. \tag{39} \]
Note that since $P(\cdot,x) \in L^1(\mathbb{R}^n)$ and $v \in L^\infty(\mathbb{R}^n)$, and by the embedding $L^1 * L^\infty \subset L^\infty$, we have that $u$ is bounded in $\mathbb{R}^{n+1}_+$ if $v$ is bounded in $\mathbb{R}^n$.

³ We would like to give further details on such a Liouville result (the argument is taken from p. 431 of [13]). First of all, by identifying $U$ with its odd reflection, we have that $\operatorname{div}(|x|^\alpha\nabla U) = 0$ in $\mathbb{R}^{n+1}$. Hence, by Proposition 2.6 in [13], for any $r > 0$,
\[ \sup_{B_r} \left| U_{xx} + \frac{\alpha}{x}U_x \right| \le \frac{C\,\|U\|_{L^\infty(\mathbb{R}^{n+1}_+)}}{r^2}, \]
for a suitable $C > 0$. By taking $r$ as large as we wish, we see that
\[ U_{xx} + \frac{\alpha}{x}U_x = 0 \quad\text{in } \mathbb{R}^{n+1}. \]
For a fixed $y \in \mathbb{R}^n$, this ODE may be easily solved explicitly, giving that
\[ U(y,x) = c_1(y)x^{1-\alpha} + c_2(y), \quad\text{for any } y \in \mathbb{R}^n \text{ and any } x > 0, \]
for suitable $c_1, c_2 : \mathbb{R}^n \to \mathbb{R}$. Since $U$ is bounded, $c_1(y) = 0$. Since $U(y,0) = 0$, $c_2(y) = 0$. More Liouville-type results for operators with Muckenhoupt weights are in [28].

⁴ Though we do not pursue such a generality in this paper, we point out that, when $\{\nabla_y u = 0\} = \emptyset$, then (36) may also be obtained via the following argument: by keeping track of the term in (31), one does not only obtain (38) from (30), but also that
\[ \left| \frac{\nabla_y u \cdot \nabla_y u_x}{|\nabla_y u|} \right|^2 = |\nabla_y u_x|^2. \]
Therefore, $\nabla_y u_x$ is parallel to $\nabla_y u$. This and Lemma A.1 in [16] imply that $\omega(x)$ is constant.

⁵ The extension is not, in general, unique. For instance, both the functions $u := 0$ and $u := x^{1-\alpha}$ satisfy $\operatorname{div}(x^\alpha\nabla u) = 0$ in $\mathbb{R}^{n+1}_+$ with $u = 0$ on $\partial\mathbb{R}^{n+1}_+$.
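The explicit solution of the ODE $U_{xx} + (\alpha/x)U_x = 0$ quoted in the Liouville argument, and the non-uniqueness example $u := x^{1-\alpha}$, can both be sanity-checked with finite differences. The Python sketch below does so for the two-parameter family $c_1x^{1-\alpha} + c_2$ at a fixed $y$. The helper names, the sample constants and the step size are assumptions made for illustration.

```python
def general_solution(c1, c2, alpha):
    """Return the function x -> c1 * x^(1 - alpha) + c2, defined for x > 0."""
    return lambda x: c1 * x ** (1.0 - alpha) + c2


def ode_residual(U, x, alpha, h=1e-4):
    """Central-difference approximation of U'' + (alpha / x) * U' at x > h."""
    first = (U(x + h) - U(x - h)) / (2.0 * h)
    second = (U(x + h) - 2.0 * U(x) + U(x - h)) / (h * h)
    return second + alpha / x * first


if __name__ == "__main__":
    U = general_solution(2.0, -1.0, 0.5)
    for x in (0.5, 1.0, 3.0):
        print(x, ode_residual(U, x, 0.5))  # small, of the order of the discretization error
    # Since 1 - alpha > 0, the profile x^(1 - alpha) is unbounded as x grows,
    # which is why boundedness of U forces c1 = 0.
    print(general_solution(1.0, 0.0, 0.5)(1e8))
```

Since $x^\alpha \partial_x(x^{1-\alpha}) = 1 - \alpha$ is constant, the same computation shows that $x^{1-\alpha}$ solves $\operatorname{div}(x^\alpha\nabla u) = 0$ while vanishing at $x = 0$.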



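We now prove the following regularity result.

Lemma 13. Let $v$ be bounded and $C^2_{\rm loc}(\mathbb{R}^n)$. Let $u$ be given by (39). Then, for all $R > 0$ there exists $C_R > 0$ such that
\[ \|x^\alpha u_x\|_{L^\infty(B_R^+)} \le C_R. \]

Proof. Since $P$ has unit mass, we have the relation
\[ u(y,x) - v(y) = C_{n,\alpha} \int_{\mathbb{R}^n} \frac{x^{1-\alpha}\big(v(y-\xi) - v(y)\big)}{(x^2 + |\xi|^2)^{(n+1-\alpha)/2}}\,d\xi. \]
Therefore,
\[ x^\alpha u_x = x^\alpha \partial_x\big( u(y,x) - v(y) \big) = C_{n,\alpha} \int_{\mathbb{R}^n} \frac{\big[(1-\alpha)|\xi|^2 - nx^2\big]\big(v(y-\xi) - v(y)\big)}{(x^2 + |\xi|^2)^{(n+3-\alpha)/2}}\,d\xi. \]
This bounds the quantity $|x^\alpha u_x|$, up to a multiplicative constant, by
\[ \int_{\mathbb{R}^n} \frac{|v(y-\xi) - v(y)|}{(x^2 + |\xi|^2)^{(n+1-\alpha)/2}}\,d\xi, \]
which is controlled by
\[ \int_{\mathbb{R}^n} \frac{|v(y-\xi) - v(y)|}{|\xi|^{n+1-\alpha}}\,d\xi \le \int_{|\xi| \ge 1} \frac{2\|v\|_{L^\infty(\mathbb{R}^n)}}{|\xi|^{n+1-\alpha}}\,d\xi + \int_{|\xi| \le 1} \frac{\|\nabla v\|_{L^\infty(B_1(y))}}{|\xi|^{n-\alpha}}\,d\xi. \]
The last two terms are summable and one gets the bound
\[ \|x^\alpha u_x\|_{L^\infty(B_R^+)} \le C\big( \|v\|_{L^\infty(\mathbb{R}^n)} + \|\nabla v\|_{L^\infty(B_{R+1})} \big), \]
as desired. $\square$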
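The formula for $x^\alpha u_x$ in the proof of Lemma 13 comes from differentiating the kernel profile in $x$: one has $x^\alpha\,\partial_x\big[x^{1-\alpha}(x^2+|\xi|^2)^{-(n+1-\alpha)/2}\big] = \big[(1-\alpha)|\xi|^2 - nx^2\big](x^2+|\xi|^2)^{-(n+3-\alpha)/2}$. A short Python sketch comparing this closed form with a central finite difference is given below. The function names, sample points and step size are illustrative assumptions.

```python
def kernel_profile(xi_norm, x, alpha, n):
    """x^(1-alpha) / (x^2 + |xi|^2)^((n+1-alpha)/2), the kernel of (32) without C_{n,alpha}."""
    return x ** (1.0 - alpha) / (x * x + xi_norm * xi_norm) ** ((n + 1.0 - alpha) / 2.0)


def weighted_derivative_closed_form(xi_norm, x, alpha, n):
    """Closed form of x^alpha * d/dx kernel_profile, as used in the proof of Lemma 13."""
    numerator = (1.0 - alpha) * xi_norm * xi_norm - n * x * x
    return numerator / (x * x + xi_norm * xi_norm) ** ((n + 3.0 - alpha) / 2.0)


def weighted_derivative_numeric(xi_norm, x, alpha, n, h=1e-5):
    """Central-difference approximation of x^alpha * d/dx kernel_profile, for x > h."""
    slope = (kernel_profile(xi_norm, x + h, alpha, n) - kernel_profile(xi_norm, x - h, alpha, n)) / (2.0 * h)
    return x ** alpha * slope


if __name__ == "__main__":
    for point in [(1.0, 0.5, 0.5, 2), (0.3, 1.2, -0.4, 3), (2.0, 0.1, 0.9, 1)]:
        exact = weighted_derivative_closed_form(*point)
        approx = weighted_derivative_numeric(*point)
        print(point, exact, approx)
```

The numerator changes sign where $(1-\alpha)|\xi|^2 = nx^2$, which is why the proof estimates its absolute value by a multiple of $x^2 + |\xi|^2$ rather than using a sign.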

We now complete the proof of Theorem 3 via the following argument. We take $u$ as defined in (39) and we observe that (11) and (17) are satisfied, thanks to the local integrability of $x^{-\alpha}$ and Lemmata 9 and 13. Also, $u$ is stable, because of either (15) or (16). Indeed, if (15) holds, then (10) is obvious since $f' \le 0 =: g$ in this case. If, on the other hand, (16) holds, then
\[ u_{y_2} = P * v_{y_2} > 0 \quad\text{in } \mathbb{R}^{n+1}_+, \]
and $u_{y_2}(y,0) = v_{y_2}(y) > 0$ on $\partial\mathbb{R}^{n+1}_+$, thanks to Lemma 8.

Therefore, given $\xi : B_R^+ \to \mathbb{R}$ which is bounded, locally Lipschitz in the interior of $\mathbb{R}^{n+1}_+$, which vanishes on $\mathbb{R}^{n+1}_+ \setminus B_R$ and such that (9) holds, we use (27) with $\psi := \xi^2/u_{y_2}$ (here, $j := 2$, $g := 0$, $\mu := x^\alpha$ and (24) make the choice of such a $\psi$ admissible), and we get
\[ \int_{\partial\mathbb{R}^{n+1}_+} f'(u)\xi^2 = \int_{\mathbb{R}^{n+1}_+} \left( 2x^\alpha\xi\,\frac{\nabla u_{y_2} \cdot \nabla\xi}{u_{y_2}} - x^\alpha\xi^2\,\frac{|\nabla u_{y_2}|^2}{u_{y_2}^2} \right). \]
This, by Cauchy–Schwarz inequality, gives (10) and so $u$ is stable. Then, we apply Theorem 2 to get that $u(y,x) = u_o(\omega \cdot y, x)$ for any $y \in \mathbb{R}^2$ and any $x > 0$, for an appropriate direction $\omega$. By Lemma 8, $u$ is continuous up to $\{x = 0\}$ and so $u(y,0) = u_o(\omega \cdot y, 0)$. Since, by (33) and (39),
\[ u|_{\partial\mathbb{R}^{n+1}_+} = v, \]

the proof of Theorem 3 is complete.

Acknowledgments

This collaboration has started on the occasion of a very pleasant visit of the authors at the University of Texas at Austin. We would like to thank the Department of Mathematics for its kind hospitality. E.V. has been partially supported by MIUR Project Metodi variazionali ed equazioni differenziali nonlineari and FIRB Project Analysis and Beyond. We thank an anonymous referee for her or his deep and helpful comments.

References

[1] Giovanni Alberti, Luigi Ambrosio, Xavier Cabré, On a long-standing conjecture of E. De Giorgi: Symmetry in 3D for general nonlinearities and a local minimality property, Acta Appl. Math. 65 (1–3) (2001) 9–33, special issue dedicated to Antonio Avantaggiati on the occasion of his 70th birthday.
[2] Giovanni Alberti, Guy Bouchitté, Pierre Seppecher, Phase transition with the line-tension effect, Arch. Ration. Mech. Anal. 144 (1) (1998) 1–46.
[3] Luigi Ambrosio, Xavier Cabré, Entire solutions of semilinear elliptic equations in R3 and a conjecture of De Giorgi, J. Amer. Math. Soc. 13 (4) (2000) 725–739 (in electronic).
[4] Henri Berestycki, Luis Caffarelli, Louis Nirenberg, Further qualitative properties for elliptic equations in unbounded domains, Ann. Sc. Norm. Super. Pisa Cl. Sci. (4) 25 (1–2) (1998) 69–94, 1997, dedicated to Ennio De Giorgi.
[5] Jean Bertoin, Lévy Processes, Cambridge Tracts in Math., vol. 121, Cambridge Univ. Press, Cambridge, 1996.
[6] Xavier Cabré, Antonio Capella, Regularity of radial minimizers and extremal solutions of semilinear elliptic equations, J. Funct. Anal. 238 (2) (2006) 709–733.
[7] M. Chipot, M. Chlebík, M. Fila, I. Shafrir, Existence of positive solutions of a semilinear elliptic equation in Rn+ with a nonlinear boundary condition, J. Math. Anal. Appl. 223 (2) (1998) 429–471.
[8] Valeria Chiadò Piat, Francesco Serra Cassano, Relaxation of degenerate variational integrals, Nonlinear Anal. 22 (4) (1994) 409–424.
[9] Luis Caffarelli, Jean-Michel Roquejoffre, Yannick Sire, Free boundaries with fractional Laplacians, 2007, in preparation.
[10] X. Cabré, Y. Sire, Semilinear equations with fractional Laplacians, 2007, in preparation.
[11] Luis Caffarelli, Luis Silvestre, An extension problem related to the fractional Laplacian, Comm. Partial Differential Equations 32 (8) (2007) 1245.
[12] Xavier Cabré, Joan Solà-Morales, Layer solutions in a half-space for boundary reactions, Comm. Pure Appl. Math. 58 (12) (2005) 1678–1732.
[13] Luis A. Caffarelli, Sandro Salsa, Luis Silvestre, Regularity estimates for the solution and the free boundary of the obstacle problem for the fractional Laplacian, Invent. Math. 171 (2) (2008) 425–461.
[14] Rama Cont, Peter Tankov, Financial Modelling with Jump Processes, Chapman & Hall/CRC Financ. Math. Ser., Chapman & Hall/CRC, Boca Raton, FL, 2004.
[15] Luis Caffarelli, Alexis Vasseur, Drift diffusion equations with fractional diffusion and the quasi-geostrophic equation, preprint, 2006.
[16] Milena Chermisi, Enrico Valdinoci, Fibered nonlinearities for p(x)-Laplace equations, preprint, http://arxiv.org/abs/0808.1835, 2008.
[17] Ennio De Giorgi, Convergence problems for functionals and operators, in: Proceedings of the International Meeting on Recent Methods in Nonlinear Analysis, Rome, 1978, Pitagora, Bologna, 1979, pp. 131–188.
[18] G. Duvaut, J.-L. Lions, Inequalities in Mechanics and Physics, Springer-Verlag, Berlin, 1976, translated from the French by C.W. John, in Grundlehren Math. Wiss. 219.
[19] Alberto Farina, Propriétés qualitatives de solutions d’équations et systèmes d’équations non-linéaires, Habilitation à diriger des recherches, Paris VI, 2002.
[20] Doris Fischer-Colbrie, Richard Schoen, The structure of complete stable minimal surfaces in 3-manifolds of nonnegative scalar curvature, Comm. Pure Appl. Math. 33 (2) (1980) 199–211.
[21] E. Fabes, D. Jerison, C. Kenig, The Wiener test for degenerate elliptic equations, Ann. Inst. Fourier (Grenoble) 32 (3) (1982) 151–182, vi.
[22] Eugene B. Fabes, Carlos E. Kenig, Raul P. Serapioni, The local regularity of solutions of degenerate elliptic equations, Comm. Partial Differential Equations 7 (1) (1982) 77–116.
[23] Alberto Farina, Berardino Sciunzi, Enrico Valdinoci, Bernstein and De Giorgi type problems: New results via a geometric approach, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 7 (2008).
[24] N. Ghoussoub, C. Gui, On a conjecture of De Giorgi and some related problems, Math. Ann. 311 (3) (1998) 481–491.
[25] Nassif Ghoussoub, Changfeng Gui, On De Giorgi’s conjecture in dimensions 4 and 5, Ann. of Math. (2) 157 (1) (2003) 313–334.
[26] Enrico Giusti, Minimal Surfaces and Functions of Bounded Variation, Monogr. Math., vol. 80, Birkhäuser-Verlag, Basel, 1984.
[27] María del Mar González, Gamma convergence of an energy functional related to the fractional Laplacian, preprint, 2008.
[28] Juha Heinonen, Tero Kilpeläinen, Olli Martio, Nonlinear Potential Theory of Degenerate Elliptic Equations, Dover Publications, Inc., Mineola, NY, 2006, unabridged republication of the 1993 original.
[29] N.S. Landkof, Foundations of Modern Potential Theory, Springer-Verlag, New York, 1972, translated from the Russian by A.P. Doohovskoy, in Grundlehren Math. Wiss. 180.
[30] Elliott H. Lieb, Michael Loss, Analysis, Grad. Stud. Math., vol. 14, Amer. Math. Soc., Providence, RI, 1997.
[31] William F. Moss, John Piepenbrink, Positive solutions of elliptic equations, Pacific J. Math. 75 (1) (1978) 219–226.
[32] Benjamin Muckenhoupt, Weighted norm inequalities for the Hardy maximal function, Trans. Amer. Math. Soc. 165 (1972) 207–226.
[33] Ovidiu Savin, Phase transitions: Regularity of flat level sets, Ann. of Math. (2008), in press.
[34] Ovidiu Savin, Enrico Valdinoci, Elliptic PDEs with fibered nonlinearities, J. Geom. Anal. 19 (2) (2009).
[35] Peter Sternberg, Kevin Zumbrun, Connectivity of phase boundaries in strictly convex domains, Arch. Ration. Mech. Anal. 141 (4) (1998) 375–400.
[36] Peter Sternberg, Kevin Zumbrun, A Poincaré inequality with applications to volume-constrained area-minimizing surfaces, J. Reine Angew. Math. 503 (1998) 63–85.


Infinite matrices with “few” non-zero entries and without non-trivial invariant subspaces

Gleb Sirotkin
Department of Mathematical Sciences, Northern Illinois University, DeKalb, IL 60115, USA

Received 18 January 2008; accepted 7 January 2009
Available online 30 January 2009
Communicated by D. Voiculescu

Abstract

In this paper we investigate a family of infinite matrices that act on $\ell_1$. We derive a condition sufficient to guarantee that a matrix has no non-trivial closed invariant subspaces. As a result, a simplest known operator on $\ell_1$ without invariant subspaces is obtained. All entries of the matrix of the example but one are non-negative. Published by Elsevier Inc.

Keywords: Operator; Invariant subspace; Transitive operator

1. Introduction

This paper produces another operator without closed non-trivial invariant subspaces. The earlier examples were given by P. Enflo [2], C. Read [3,4,5] and B. Beauzamy [1]. The result presented in the paper has a very simple matrix and a relatively direct proof. The construction is based on the one in [6] and continues the author’s work in [7]. The matrix might have all non-negative entries but one, which brings us closer to a negative solution of the invariant subspace problem for positive operators. It remains unknown if there exists a positive operator on a Banach lattice without non-trivial invariant subspaces. Although a direct replica of Read’s proof with some additions would have worked, we present an inverted version of the proof, which feels more natural.

E-mail address: [email protected]. 0022-1236/$ – see front matter Published by Elsevier Inc. doi:10.1016/j.jfa.2009.01.015


G. Sirotkin / Journal of Functional Analysis 256 (2009) 1865–1874

All the operators considered in this paper will be linear operators. We are only interested in bounded linear operators whenever we discuss an operator on a normed space. All vector spaces considered in the paper could be over the real or the complex field.

First, we define an operator $T : c_{00} \to c_{00}$ on the space of finite number sequences. We do it by defining the orbit of the first basic vector. Then we estimate the norms of some powers of $T : (c_{00}, \|\cdot\|_1) \to (c_{00}, \|\cdot\|_1)$. It follows that $T$ can be turned into a contraction such that any unit vector $x$ will have an essential part that shrinks significantly under some powers of $T$. The rest of the vector $x$ is going to have finite support. At the end, it is shown that there is a polynomial in $T$, divisible by the appropriate power of $T$, that maps the rest of $x$ close to the first element of the basis.

2. Color game and partition of the index set

Consider the following game. Two people play on a board which is the set of natural numbers $\{\alpha : \alpha \in \mathbb{N}\}$. They alternately move the same piece in the positive direction. Each field numbered by $\alpha$ is marked by a pair of natural numbers $(m,n)$ with $m \ge n$ and this assignment satisfies two rules. One, the pair $(m,m)$ should appear and the inequality $(m,m) < (m+1,m+1)$ should hold for every $m$. Two, the only pairs between $(m,m)$ and $(m+1,m+1)$ are pairs $(m,n)$ with $m > n$, though any pair $(m,n)$ could appear more than once. This coding by pairs of numbers is actually equivalent to coloring the set of natural numbers by infinitely many colors. From now on we are free to replace any number $\alpha$ by the corresponding pair $(m,n)$.

Player one (P1) chooses a number $n$ and a starting position $(r,s)$ with $r > n$. Player two (P2) must move each time to the first available pair $(a,b)$ with $b \ge n$. P2 wins if, after her move, she ends up at $(a,b)$ with $a > b \ge n$. If not, that is if $a = b$, it is P1’s turn to move. P1 moves only within pairs with the same second number. If P1 cannot or does not want to move, he passes the turn to P2. If P2 wins for every initial choice of P1, we call such a playing board winning.

Equivalently, for a pair $\alpha = (m,n)$ define $\lambda(\alpha)$ to be the largest $n'$ such that the pair $(m,n')$ exists and is greater than $\alpha$. If no such $n'$ exists, set $\lambda(\alpha) = 0$. Then the board is winning if and only if there is no infinite sequence $(\alpha_k)$ with $\alpha_k = (m_k,n_k)$ such that $n_{2k} = m_{2k-1} + 1$ and $n_{2k+1} = m_{2k}$ for each $k$ and the sequence $(\lambda(\alpha_{2k-1}))$ is bounded.

Example 2.1. If we order the pairs $(m,n)$ by setting
\[ (m,m) < (m,m-1) < (m,m-2) < \cdots < (m,2) < (m,1) < (m+1,m+1) \]
for every $m$, then the associated playing board is winning. Furthermore, P2 wins in, at most, two moves. Indeed, if P2 does not win after her first move, she ends up at $(r,r)$ with $r > n$. Then P1 moves to some $(t,r)$ with $t \ge r > n$. According to the described ordering, the next move of P2 ends at $(t,r-1)$ and P2 wins due to $r - 1 \ge n$.

Example 2.2. If we order the pairs $(m,n)$ by setting
\[ (m,m) < (m,1) < (m,2) < \cdots < (m,m-2) < (m,m-1) < (m+1,m+1) \]
for every $m$, then the associated playing board is losing. If the initial choice of P1 is any $n$ and any $(m,m-1)$ with $m \ge n$, then P1 can always move to $(r,r-1)$ in his turn. Thus, P2 is always forced to move to $(r+1,r+1)$.

It turns out that any winning board can be used to construct an operator on $\ell_1$ without nontrivial invariant subspaces. So, let us start by fixing some winning board. For every $\alpha = (m,n)$


1867

we reserve notations α˜ for (n, n) and αˆ for (m, m). Then we fix two increasing sequences (uα )α=(m,n) and (vα )α=(m,n) , satisfying uα < 2uα vα < uα+1 := vα + uα˜ . These sequences partition set N into the blocks [uα , uα+1 ) ∩ N, which we will refer to as α-block. Every such block we consider as a disjoint union of two “intervals”: “Diagonal” [D]α := [uα , vα ] ∩ N, “returning Back” (B)α := (vα , uα+1 ) ∩ N. Next, we will use this partition to define operator T . 3. Definition of operator T Let (ei )∞ i=1 denote the standard basis of 1 . For each n 1, Xn will denote the linear span span{ei : i = 1, . . . , n}. We are going to define sequence (zi )∞ i=1 using the standard basis. After that we will define operator T by simply setting T zi = zi+1 . If i ∈ [D]α , we set ei = Fi zi or zi =

1 Fi e i .

If i ∈ (B)α , we set ei = Gi (Hα zi − zi−vα ) or zi =

1 Gi Hα ei

+

1 Hα zi−vα .

Clearly, if the $F$-, $G$- and $H$-coefficients are all non-zero, we have $\operatorname{span}\{z_i : i = 1, \ldots, n\} = \operatorname{span}\{e_i : i = 1, \ldots, n\} = X_n$. For a subset $U \subset \mathbb{N}$, we denote by $\pi_U$ and $\tau_U$ the linear projections on $c_{00}$ such that $\pi_U(e_i) = e_i$, $\tau_U(z_i) = z_i$ if $i \in U$ and zero otherwise. The values of the coefficients for any $\alpha$-block are determined in the following order:
\[ G_{v_\alpha+1} \to G_{v_\alpha+2} \to \cdots \to G_{u_{\alpha+1}-1} \to F_{u_\alpha} \to F_{u_\alpha+1} \to \cdots \to F_{v_\alpha} \to H_\alpha. \]
In setting the value of any number in this linear diagram, we are free to use every number which is already defined. Using the introduced vectors $z_i$ we define the linear map $T : c_{00} \to c_{00}$ by $T z_i = z_{i+1}$. So, any collection of non-zero $F$-, $G$- and $H$-coefficients provides us with an infinite matrix corresponding to this right shift. Such a matrix is easy to describe explicitly.

If $i$ and $i + 1$ both lie in $[D]_\alpha$, then
\[ T e_i = F_i T z_i = F_i z_{i+1} = \frac{F_i}{F_{i+1}}\, e_{i+1}. \]
If $i$ and $i + 1$ both lie in $(B)_\alpha$, then
\[ T e_i = G_i T (H_\alpha z_i - z_{i - v_\alpha}) = G_i (H_\alpha z_{i+1} - z_{i - v_\alpha + 1}) = \frac{G_i}{G_{i+1}}\, e_{i+1}. \]
If $i$ is in $[D]_\alpha$ and $i + 1$ lies in $(B)_\alpha$, that is, $i = v_\alpha$, then
\[ T e_{v_\alpha} = F_{v_\alpha} z_{v_\alpha+1} = \frac{F_{v_\alpha}}{G_{v_\alpha+1} H_\alpha}\, e_{v_\alpha+1} + \frac{F_{v_\alpha}}{H_\alpha}\, z_1 = \frac{F_{v_\alpha}}{G_{v_\alpha+1} H_\alpha}\, e_{v_\alpha+1} + \frac{F_{v_\alpha}}{F_1 H_\alpha}\, e_1. \]
If $i$ lies in $(B)_\alpha$ and $i + 1$ does not, then $i = v_\alpha + u_{\tilde\alpha} - 1 = u_{\alpha+1} - 1$. Thus, $i - v_\alpha + 1 = u_{\tilde\alpha}$ lies in $[D]_{\tilde\alpha}$, and so
\[ T e_{u_{\alpha+1}-1} = G_{u_{\alpha+1}-1} (H_\alpha z_{u_{\alpha+1}} - z_{u_{\tilde\alpha}}) = \frac{G_{u_{\alpha+1}-1} H_\alpha}{F_{u_{\alpha+1}}}\, e_{u_{\alpha+1}} - \frac{G_{u_{\alpha+1}-1}}{F_{u_{\tilde\alpha}}}\, e_{u_{\tilde\alpha}}. \]
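The four cases above can be checked mechanically on a finite truncation. The following sketch is only an illustration under stated assumptions: it uses the ordering of Example 2.1 with $v_\alpha = 2u_\alpha$ and $u_{(1,1)} = 1$, picks arbitrary non-zero rational coefficients (the particular values of `F`, `G`, `H` are hypothetical and carry no meaning), and computes $T e_i$ exactly from $T z_i = z_{i+1}$.

```python
from fractions import Fraction


def build_blocks(rows):
    """Blocks (u, v, u_tilde) for the ordering of Example 2.1 with v = 2u."""
    blocks, diag_u, u = [], {}, 1
    for m in range(1, rows + 1):
        for n in range(m, 0, -1):
            if n == m:
                diag_u[m] = u            # u of the diagonal pair (m, m)
            v = 2 * u
            blocks.append((u, v, diag_u[n]))
            u = v + diag_u[n]            # u_{alpha+1} = v_alpha + u_{alpha~}
    return blocks, u - 1                 # u - 1 is the last index covered


def shift_entries(rows):
    """Exact non-zero entries {(row, col): value} of T in the basis (e_i)."""
    blocks, size = build_blocks(rows)
    F = lambda i: Fraction(i + 1)        # hypothetical non-zero coefficients
    G = lambda i: Fraction(2 * i + 1)
    H = Fraction(3)
    z, e_in_z = {}, {}                   # z_i expanded in e; e_i expanded in z
    for u, v, u_tilde in blocks:
        for i in range(u, v + 1):        # i in [D]: e_i = F_i z_i
            z[i] = {i: 1 / F(i)}
            e_in_z[i] = {i: F(i)}
        for i in range(v + 1, v + u_tilde):  # i in (B): e_i = G_i (H z_i - z_{i-v})
            z[i] = {i: 1 / (G(i) * H)}
            for j, c in z[i - v].items():
                z[i][j] = z[i].get(j, 0) + c / H
            e_in_z[i] = {i: G(i) * H, i - v: -G(i)}
    entries = {}
    for i in range(1, size):             # T z_k = z_{k+1}, applied term by term
        column = {}
        for k, c in e_in_z[i].items():
            for j, d in z[k + 1].items():
                column[j] = column.get(j, 0) + c * d
        entries.update({(j, i): val for j, val in column.items() if val != 0})
    return entries, blocks, size


entries, blocks, size = shift_entries(2)
off_subdiagonal = sorted(pos for pos in entries if pos[0] != pos[1] + 1)
print(size, off_subdiagonal)
```

With two rows of the board the truncation has 18 indices, and the only entries off the subdiagonal sit in positions (1, 6) and (3, 8), that is, at $(1; v_\alpha)$ and $(u_{\tilde\alpha}; v_\alpha + u_{\tilde\alpha} - 1)$ for $\alpha = (2, 2)$. Exact rational arithmetic is used so that the cancellations inside $(B)_\alpha$ show up as exact zeros rather than rounding noise.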

Summarizing the discussion above, we can say that the only non-zero entries of the matrix corresponding to $T$ are those at positions $(i + 1; i)$, $(1; v_\alpha)$, and $(u_{\tilde\alpha}; v_\alpha + u_{\tilde\alpha} - 1)$. We can also deduce a necessary condition for such a matrix to represent a bounded operator on $\ell_1$. The proof is a straightforward application of the expansions of $T e_i$ above.

Corollary 3.1. Let $T : c_{00} \to c_{00}$ be an operator defined as above. Suppose there exists a number $A$ such that $\|T e_i\| \le A$ holds for every $i$. Then the function $G(\tilde\alpha) = |F_{u_{\tilde\alpha}}| \max\{1, A^{2u_{\tilde\alpha}}\}$ satisfies
\[ \sup\{|G_i| : i \in (B)_\beta,\ \tilde\beta \le \tilde\alpha\} \le G(\tilde\alpha). \]

It is convenient to introduce the smallest rate of change of coefficients in any $\alpha$-block. There are two different cases, though. In Corollary 3.1 we saw that $G$-coefficients in $\alpha$-blocks with bounded $\tilde\alpha$ must have a common upper bound. In addition, $G_{v_\alpha+1}$ and $G_{v_\beta+1}$ should be of comparable size if $\tilde\alpha = \tilde\beta$. It follows that the values of the $G$-coefficients in the $\alpha$-block are essentially determined earlier, in the $\tilde\alpha$-block. Thus, if $\alpha \ne \tilde\alpha$, we set
\[ R(\alpha) := \frac{|F_{u_\alpha}|}{|H_{\alpha-1} G_{u_\alpha-1}|} \wedge \min\left\{ \frac{|F_{u_\alpha}|}{|G_{u_{\alpha+1}-1}|},\ \frac{|H_\alpha|}{|F_{v_\alpha}|},\ \text{and } \frac{|F_i|}{|F_{i-1}|} \text{ where } u_\alpha < i \le v_\alpha \right\}, \]
where the first ratio guarantees that the $F$-coefficients in the $\alpha$-block are $R(\alpha)$ times larger than the coefficients in the $(\alpha - 1)$-block. Otherwise, if $\alpha = \tilde\alpha$, we include the ratios $|G_i| / |G_{i-1}|$ and $|F_{u_{\tilde\alpha}}| / |G_i|$ for all $i - 1, i \in (B)_\beta$ with $\tilde\beta = \tilde\alpha$ in the formula as well. It should be clear that, given any function $R_1(\alpha)$, we can define all coefficients in such a way that the corresponding function $R(\alpha)$ satisfies $R(\alpha) \ge R_1(\alpha)$.

Example 3.2. One of the simplest choices leading to a transitive operator is the following. If $\alpha = (m, n)$ with $m \ge n$, then we set
\[ v_\alpha = 2u_\alpha; \qquad G_{v_\alpha+1} = n; \qquad G_{i+1} = G_i R(\tilde\alpha) \ \text{ if } i, i + 1 \in (B)_\alpha; \]
\[ F_{u_\alpha} = G_{u_{\alpha+1}-1} R(\alpha); \qquad F_{i+1} = F_i R(\alpha) \ \text{ if } i, i + 1 \in [D]_\alpha; \qquad \text{and} \qquad H_\alpha = F_{v_\alpha} R(\alpha). \]

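Operator $T$, generated by such coefficients, has no closed non-trivial invariant subspaces provided $R(\alpha)$ grows sufficiently fast. Notice that with this definition, $G_i = G_{i - v_\alpha + v_{\tilde\alpha}}$ if $i \in (B)_\alpha$.

4. Norm estimates and their consequences

We begin this section by discussing the norms of $T^K e_i$ for all $i$ and some not very large $K$. This will allow us to derive conditions sufficient for $T$ to be continuous. Let us agree that from now on $R(\alpha) \ge 2$ for every $\alpha$.

If $i \in [D]_\alpha$ and $i + K \in [D]_\beta$ with $\alpha \le \beta$, then
\[ \|T^K e_i\| = \|T^K (F_i z_i)\| = |F_i| \frac{\|e_{i+K}\|}{|F_{i+K}|} = \frac{|F_i|}{|F_{i+K}|} \le \frac{1}{R(\beta)}. \tag{$\ast$} \]
If $i, i + K \in (B)_\alpha$, then
\[ \|T^K e_i\| = \|G_i T^K (H_\alpha z_i - z_{i - v_\alpha})\| = \|G_i (H_\alpha z_{i+K} - z_{i+K-v_\alpha})\| = \frac{|G_i|}{|G_{i+K}|} \le \frac{1}{R(\tilde\alpha)}. \tag{$\ast\ast$} \]
Let $i \in (B)_\alpha$, $i + K \in [D]_{\alpha+1}$, and $i + K - v_\alpha \in [D]_{\tilde\alpha}$. Then
\[ \|T^K e_i\| = \|G_i (H_\alpha z_{i+K} - z_{i+K-v_\alpha})\| = \frac{|G_i H_\alpha|}{|F_{i+K}|} + \frac{|G_i|}{|F_{i+K-v_\alpha}|} \le \frac{2}{R(\tilde\alpha)}. \tag{$\ast\ast\ast$} \]
The last case of interest to us is when $i \in [D]_\alpha$ and $i + K \in (B)_\beta$. Notice that in such a case, for $k = i + K - v_\beta$, both $v_\beta + 1$ and $v_\beta + k$ are in $(B)_\beta$. Therefore,
\[ \|T^{k-1} e_{v_\beta+1}\| \le \frac{1}{R(\tilde\beta)} \le 1. \]
If, in addition, we have $\|T^{k-1} e_1\| \le 1$, then
\[ \|T^K e_i\| = \|T^{k-1} (F_i z_{v_\beta+1})\| \le \frac{|F_i|}{|H_\beta G_{v_\beta+1}|} \|T^{k-1} e_{v_\beta+1}\| + \frac{|F_i|}{|H_\beta|} \|T^{k-1} e_1\| \le \frac{2}{R(\beta)}. \tag{$\ast\ast\ast\ast$} \]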

Corollary 4.1. Let $T : c_{00} \to c_{00}$ be an operator defined as above. Suppose there exists a number $A$ such that $R(\alpha) \ge 2/A^2$ for every $\alpha$. Then $T$ can be extended to a continuous operator on $X$ with $\|T\| \le A$. In particular, our current agreement $R(\alpha) \ge 2$ implies $\|T\| \le 1$.

Proof. The assumptions of the corollary, together with the inequalities preceding it, say that $\|T e_i\| \le A$ for every $i$. The only delicate point is inequality $(\ast\ast\ast\ast)$ with its additional assumption $\|T^{k-1} e_1\| \le 1$. Nevertheless, it can easily be adjusted to $\|T^{k-1} e_1\| \le A$, which holds by simple induction. The conclusion follows. □

We should point out that there is only one fraction in the conditions above that has a numerator with greater index than the denominator; it is the second fraction in inequality $(\ast\ast\ast)$. This means that as soon as we have defined $F_{u_{\tilde\alpha}}$, we have specified the upper bound for all $G$-coefficients in all intervals $(B)_\beta$ with $\tilde\beta = \tilde\alpha$. This can be done, for instance, as in Example 3.2.

Our next two corollaries identify parts of any vector that are shrunk by some power of the operator $T$. They show that if block $\alpha$ is viewed as the present, then any part of the vector $x$ that contains only future coefficients can be neglected.

Corollary 4.2. Let $T : c_{00} \to c_{00}$ be an operator defined as above. Suppose that for some function $h$ and a given pair $\alpha = (m, n)$ the inequality $R(\beta) \ge 2n\,h(\alpha)$ holds for every $\beta > \alpha$. Then for any "future" support $U$, that is,
\[ U \subset \bigcup_{\beta > \alpha} [D]_\beta \ \cup \bigcup_{\tilde\beta > \alpha} (B)_\beta, \]
any number $K$ satisfying $1 \le K \le v_\alpha + u_{\tilde\alpha} - 1$, and any vector $x$ with $\|x\| \le 1$, we have $\|T^K \pi_U x\|$
α. Then for any set U such that U⊂

(B)β ,

˜ βαβ

any number K satisfying uαˆ K vα + uα˜ − 1, and any vector x with x 1, we have K T (τU πU )x
α [D]β ) ∪ ( β>α (B)β ) Next, we observe that disjoint sets V = βαβ ˜ ˜ satisfy N = [1, vα ] ∪ V ∪ W for every α. Denoting Uα = [1, vα ] ∪ (B)β , ˜ βαβ



we can write that x = πUα x + πW x = τ[1,vα ] πUα x + τV πV x + πW x. The last two corollaries showed that the second and the third terms can be shrunk by some powers of T . Moreover, for collection of polynomials p(T ) h(α), and deg p(t) < vα + uα˜ , Pα,K = p(t): t K p(t), TK we obtain the following fact. Proposition 4.4. Let T , α = (m, n), and h(α) be as described above and K satisfy uαˆ K vα + uα˜ − 1, as in the last corollary. Suppose that R(β) 2nh(α) holds for every β > α. Then, for any polynomial p(t) from Pα,K and for any unit vector x, we have p(T )(x − τ[1,v ] πU x) < 2 . α α n Proof. Of course, this holds due to the fact that x − τ[1,vα ] πU x = τV πV x + πW x and the two previous corollaries. 2 Hence, if for every n and every unit vector x we want a polynomial p(t) such that p(T )x − z1 is of order 1/n, it is enough to find β = (r, s), h(β), and K such that s n and p(T )τ[1,vβ ] πUβ x − z1 is of order 1/n for some p ∈ Pβ,K . 5. Operator T in Xvα +uα˜ −1 For this section, let us set K = uα . As our next step, we are going to show that some useful fixed part of vector p(T )τ[1,vα ] πUα x is 1/n-small uniformly for all p ∈ Pα,uα . As we can see, vector τ[1,vα ] πUα x lies in finite dimensional space Yα := Xvα +uα˜ −1 . If i ∈ (B)β is such that β α and τ[1,vα ] ei = 0, then τ[1,vα ] ei = |Gi−vβ |zi−vβ |Gi−vβ |. Thus, for a unit vector x we can use corollary 3.1 to estimate: τ[1,vα ] πUα x π[1,vα ] x + G(α)π ˆ ˆ ∪βαβ (B)β x 1 + G(α). ˜ Hence, all vectors τ[1,vα ] πUα x with x = 1 lie in a compact subset of space Yα . Let us take a closer look at the behavior N of T in this Nspace. For convenience, let us consider 1 -norm on vectors zi : | N i=1 λi zi | = i=1 |λi | i=1 λi zi and its quantitative relation to the usual norm in Yα : f (α) := sup |x|: x ∈ Yα , x = 1 . vα +uα˜ −1 vα +uα˜ −1 k Also, we introduce similar function g(α) = sup{ k=u |γk | : p(t) = k=u γk t ∈ α α Pα,uα }. Notice that h(α) g(α) due to the fact that h(α)t uα ∈ Pα,uα . 
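Now, everything is set for the following.

Remark 5.1. Given $\alpha = (m, n)$ and $h(\alpha)$, if $p(t) \in P_{\alpha, u_\alpha}$ and the functions $f(\alpha)$, $g(\alpha)$ are defined as above, then $\|\tau_{[D]_{\alpha+1}} p(T) y\| \le \frac{1}{n}$ for every $y \in Y_\alpha$ with $\|y\| < G(\hat\alpha) + 1$, provided $R(\beta) > f(\alpha) g(\alpha) (G(\hat\alpha) + 1) n$ for every $\beta > \alpha$. In particular, $\|\tau_{[D]_{\alpha+1}} p(T) \tau_{[1, v_\alpha]} \pi_{U_\alpha} x\| \le \frac{1}{n}$ for every unit vector $x \in \ell_1$.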



Indeed, for every p(t) ∈ Pα,uα and y ∈ Yα , we can write +uα˜ +uα˜ vα vα τ[D] T k y τ[D] p(T )y |γ | |γk |zuα+1 |y| k α+1 α+1 k=0

k=0

ˆ + 1) f (α)g(α)y f (α)g(α)(G(α) . Fuα+1 R(β)

Since τ[D]α+1 = π[D]α+1 and π(vα+1 ,∞) p(T )y = 0 for every y ∈ Yα , we can disregard any part of p(T )y which is not in Yα . In other words, we replace operator T with its truncated version. Consider Tα defined by Tα zi = zi+1 if i < vα + uα˜ − 1 or zero otherwise. For every vector y v +u −2 in Yα with τ[1,uα ) y = 0, the span of {Tα α α˜ y, . . . , Tαuα y} includes vector zvα +1 . Therefore, there is a polynomial p(t) such that p(Tα )y − z1 = Hα zvα +1 − z1 = Gv1 +1 n1 . Thus, for α any compact set Cα = {y ∈ Yα : 0 < mα y G(α) ˆ + 1} of such vectors, we can choose finitely many polynomials p1 , p2 , . . . , pq satisfying the following. For any vector from the set Cα there is at least one polynomial pi such that pi (Tα )y − z1 n1 . Notice, that t uα divides every pi ) and if we set h(α) = max{ pTi (T uα : i = 1, . . . , q}, then every pi is in Pα,uα . As our last step, we describe how to find mα , so that for every n and every unit vector x ∈ 1 , there would exist α = (r, s) with r s n such that τ[1,uα ) πUα x mα . Remark 5.2. The rest of this section will be done mostly for the winning board given in Example 2.1. The proof for arbitrary winning board will be discussed along the lines. Clearly, for a given unit vector x and number n, we can find m > n such that π[1,uα ) x > 34 with α = (m, m). If, for such α, τ[1,uα ) πUα x is not sufficiently separated from zero, then we can write

τ[1,uα ) πUα x π[1,uα ) x − τ[1,uα ) π(B)β x. ˜ βαβ

If the left-hand side is less than 12 , then there should be some β = (r, s) such that β˜ α and τ[1,uα ) π(B)β x > 21rs hold. By τ[1,uα ) ei G(α) for every i ∈ (B)β , it follows that π(B)β x > 1 2rs G(α) . On a different winning board powers of 2 in the denominator should be modified. The found β is the position chosen by player P1. The second player moves to the first available (j, k) with k n. If number k is less than n, we might have problems showing that p(T )x − z1 < n1 . If j = k, then we win because of the following. There are no pairs (w, w) between β and (j, k), so, for every γ (j, k) with γ˜ (j, k), we have γ˜ β. It follows that no (B)γ from U(j,k) affects (B)β . Therefore, we obtain separation from zero by τ[1,u(j,k) ) π(B)β x = π(B)β x. The discussion of the case j = k follows. Lemma 5.3. Let x be a unit vector and assume that for some δ > 0 and β = (r, s) we have δ . π(B)β x > δ, then there is α = (j, k) with k r such that τ[1,uα ) πUα x > 2j G( α) ˜ Proof. Suppose that α = (r + 1, r + 1) is not good (P2 moves to (r + 1, r + 1)). Then we can write Uα as the disjoint union:



(vβˆ , vα ] Uα = [1, vβˆ ] ∪ (B)γ (B)γ ˆ γˆ γ˜ β
r+1 τ[1,uα ) πUα x π(vβˆ ,uα ) x − τ[1,uα ) π(B)γ x 8 2 G(α) δ−

γ˜ =α

τ[1,uα ) π(B)γ x.

γ˜ =α

This leads to the existence of γ = (w, r + 1) such that τ[0,uα ) π(B)γ x 2δw holds. As before, δ we can write π(B)γ x 2w G( γ˜ ) . For an arbitrary board, the argument remains valid with appropriate correction of powers of 2. The chosen γ is the next move of player P1. Finally, we claim that pair α = γ + 1 = (w, r) will work for us. It follows, as above, from the fact that τ[1,uγ +1 ) π(B)γ x = π(B)γ x. Thus, τ[1,uγ +1 ) πUγ +1 x π(B)γ x

δ δ 2w G(γ˜ ) 2w G(α) ˜

The argument for a different board is similar. As her next move, P2 moves to the first pair (a, b) with b n. If a = b, we got the separation. If a = b, we repeat the steps of this lemma. Since the board is winning, we will eventually stop. 2 Summarizing the discussion above, for any n and unit vector x, we have found α = (r, s) with r s n such that τ[1,uα ) πUα x is at least r 3 1 2 (of course, appropriate modification of all 2 [G(α)] ˜

powers is due for a different winning board). Assuming the $m_\alpha$ are found, we can compute $h(\alpha)$, $g(\alpha) \ge h(\alpha)$, and $f(\alpha)$. If $R(\beta) > f(\alpha) g(\alpha) (G(\hat\alpha) + 1) n$ for every $\beta > \alpha$, then all facts about the operator $T$ proved in the paper hold. Hence, any such operator $T$ has no non-trivial invariant subspaces.

6. Almost positive transitive operator

The simplicity of the described matrix allows us to give an example of a transitive operator with all entries positive except for one. We set $F_1$, all $G_i$, and all $H_\alpha$ to be negative, and all other $F_i$ to be positive. Using the formulas at the end of Section 3, we see that, indeed, all but one of the entries of $T$ are positive.

Acknowledgments

The author wishes to thank Professor V.G. Troitsky for fruitful discussions about the subject of the paper and the anonymous referee for useful suggestions on how to improve the paper.



References

[1] B. Beauzamy, An operator with no invariant subspace: Simplification of the example of P. Enflo, Integral Equations Operator Theory 8 (3) (1985) 314–384 (French).
[2] P. Enflo, On the invariant subspace problem for Banach spaces, in: Seminaire Maurey–Schwarz (1975–1976), Acta Math. 158 (1987) 213–313.
[3] C.J. Read, A solution to the invariant subspace problem, Bull. London Math. Soc. 16 (1984) 337–401.
[4] C.J. Read, A solution to the invariant subspace problem on the space l1, Bull. London Math. Soc. 17 (1985) 305–317.
[5] C.J. Read, A short proof concerning the invariant subspace problem, J. London Math. Soc. 34 (2) (1986) 335–348.
[6] C.J. Read, Quasinilpotent operators and the invariant subspace problem, J. London Math. Soc. 56 (2) (1997) 595–606.
[7] G. Sirotkin, A modification of Read’s transitive operator, J. Operator Theory 55 (1) (2006) 153–167.


Singular quasilinear and Hessian equations and inequalities Nguyen Cong Phuc a,∗ , Igor E. Verbitsky b,1 a Department of Mathematics, Louisiana State University, Baton Rouge, LA 70803, USA b Department of Mathematics, University of Missouri, Columbia, MO 65211, USA

Received 30 January 2008; accepted 14 January 2009 Available online 29 January 2009 Communicated by H. Brezis

Abstract

We solve the existence problem in the renormalized, or viscosity sense, and obtain global pointwise estimates of solutions for quasilinear and Hessian equations with measure coefficients and data, including the following model problems:
\[ -\Delta_p u = \sigma u^q + \mu, \qquad F_k[-u] = \sigma u^q + \mu, \qquad u \ge 0, \]
on $\mathbb{R}^n$, or on a bounded domain $\Omega \subset \mathbb{R}^n$. Here $\Delta_p$ is the $p$-Laplacian defined by $\Delta_p u = \operatorname{div}(\nabla u |\nabla u|^{p-2})$, and $F_k[u]$ is the $k$-Hessian, i.e., the sum of the $k \times k$ principal minors of the Hessian matrix $D^2 u$ ($k = 1, 2, \ldots, n$); $\sigma$ and $\mu$ are general nonnegative measurable functions (or measures) on $\Omega$.

© 2009 Elsevier Inc. All rights reserved.

Keywords: Quasilinear equations; Fully nonlinear equations; Power source terms; p-Laplacian; k-Hessian; Wolff’s potential; Weighted norm inequalities

1. Introduction

Let $\sigma$, $\omega$ be arbitrary nonnegative locally integrable functions, or more generally, nonnegative locally finite measures on a domain $\Omega$ in $\mathbb{R}^n$. By $L^q_{\sigma,\mathrm{loc}}(\Omega)$, $q > 0$, we denote the space of measurable functions $f$ such that $|f|^q$ is locally integrable with respect to $\sigma$ in $\Omega$.

* Corresponding author.

E-mail addresses: [email protected] (N.C. Phuc), [email protected] (I.E. Verbitsky). 1 Supported in part by NSF grant DMS-0556309.



N.C. Phuc, I.E. Verbitsky / Journal of Functional Analysis 256 (2009) 1875–1906

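In this paper we consider the following quasilinear and Hessian equations with nonlinear source terms and measure coefficients and data:
\[ -\Delta_p u = \sigma u^q + \omega, \qquad u \ge 0, \qquad u \text{ is } p\text{-superharmonic in } \Omega, \tag{1.1} \]
and
\[ F_k[-u] = \sigma u^q + \omega, \qquad u \ge 0, \qquad -u \text{ is } k\text{-convex in } \Omega, \tag{1.2} \]
where $u \in L^q_{\sigma,\mathrm{loc}}(\Omega)$, $q > 0$, $1 < p < \infty$, and $k = 1, \ldots, n$. Here $\Delta_p$ is the $p$-Laplacian defined by $\Delta_p u = \operatorname{div}(|\nabla u|^{p-2} \nabla u)$, and $F_k[u]$ denotes the $k$-Hessian
\[ F_k[u] = \sum_{1 \le i_1 < \cdots < i_k \le n} \lambda_{i_1} \cdots \lambda_{i_k}, \]
where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of the Hessian matrix $D^2 u$.

The results below are stated in terms of Wolff's potential
\[ W_{\alpha,s}\mu(x) = \int_0^\infty \left[ \frac{\mu(B_t(x))}{t^{n - \alpha s}} \right]^{\frac{1}{s-1}} \frac{dt}{t}, \qquad x \in \mathbb{R}^n, \tag{1.3} \]
defined for a nonnegative measure $\mu$, $\alpha > 0$, and $s > 1$. Here we set $\alpha = 1$, $s = p$ for the $p$-Laplacian, and $\alpha = \frac{2k}{k+1}$, $s = k + 1$ for the $k$-Hessian. These potentials were originally introduced by Hedberg and Wolff as a tool for solving certain hard problems of nonlinear potential theory and Sobolev spaces (see [1, Sections 4.9 and 9.13]). Their importance for quasilinear and Hessian equations was discovered subsequently by Kilpeläinen and Malý [21], and Labutin [22].

In particular, it was shown in [24] that if the equation of Lane–Emden type,
\[ -\Delta_p u = u^q + \omega, \qquad u \ge 0, \qquad u \text{ is } p\text{-superharmonic in } \mathbb{R}^n, \tag{1.4} \]
has a solution, then $W_{1,p}\omega(x) < +\infty$ a.e., and there exists a constant $C = C(n, p, q) > 0$ such that
\[ W_{1,p}\big[(W_{1,p}\omega)^q\,dx\big](x) \le C\, W_{1,p}\omega(x) \qquad dx\text{-a.e.}, \tag{1.5} \]
where $dx$ stands for Lebesgue measure on $\mathbb{R}^n$. Conversely, there exists a constant $C_0 = C_0(n, p, q) > 0$ such that if $W_{1,p}\omega(x) < +\infty$ a.e., and if (1.5) holds for some $C \le C_0$, then Eq. (1.4) admits a solution $u$ such that
\[ c_1 W_{1,p}\omega(x) \le u(x) \le c_2 W_{1,p}\omega(x), \tag{1.6} \]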

where $c_1$, $c_2$ are positive constants depending only on $n$, $p$, $q$. Analogous results are obtained in [24] for the corresponding Dirichlet problem on a bounded domain $\Omega \subset \mathbb{R}^n$, as well as for Hessian equations. Furthermore, several equivalent characterizations of (1.5), along with simpler sufficient and necessary conditions, are given there.

However, for variable coefficients $\sigma$ in the source term, the global existence problem for (1.1), (1.2) and control of the corresponding solutions are much harder due to a nonlinear interplay between $\sigma$, the inhomogeneous term $\omega$, and the operator on the left-hand side. Such a phenomenon, observed previously in the semilinear case in [10,23], becomes substantially more complicated for quasilinear and fully nonlinear operators. Some partial results towards the solution of this problem were obtained in [25]. In the present paper, we develop new techniques to establish criteria for the solvability of (1.1) and (1.2) with a pair of general weights $\sigma$ and $\omega$. Our results are new even in the special case of power weights $\sigma(x) = |x|^\gamma$, $\gamma > -n$ (see [25], and the literature cited there).

As we will demonstrate below, an analogue of (1.5), namely the pointwise condition
\[ W_{\alpha,s}\big[(W_{\alpha,s}\omega)^q\,d\sigma\big](x) \le C\, W_{\alpha,s}\omega(x) \qquad d\sigma\text{-a.e.}, \tag{1.7} \]
with $\alpha = 1$, $s = p$ (or $\alpha = \frac{2k}{k+1}$, $s = k + 1$), is still necessary and sufficient, with a gap only in the best constants, for the solvability of (1.1), or (1.2), respectively.

An important part of our approach, which distinguishes it from that of [24], is concerned with nonlinear two-weight trace inequalities of the type
\[ \int_{\mathbb{R}^n} \big[ W_{\alpha,s}(g\,d\omega)(y) \big]^q\,d\sigma(y) \le C \int_{\mathbb{R}^n} g^{\frac{q}{s-1}}\,d\omega, \tag{1.8} \]
where $g \in L^{\frac{q}{s-1}}_\omega(\mathbb{R}^n)$, $g \ge 0$, and its "testing" counterpart
\[ \int_B \big[ W_{\alpha,s}\omega_B(y) \big]^q\,d\sigma(y) \le C\,\omega(B), \tag{1.9} \]

where $B$ are balls in $\mathbb{R}^n$, and $d\omega_B = \chi_B\,d\omega$. The preceding inequality is deduced from (1.8) by letting $g = \chi_B$.

It turns out that (1.8) is intimately connected with Eqs. (1.1) and (1.2). We will show that (1.8), and hence (1.9), with $\alpha = 1$, $s = p$ (or $\alpha = \frac{2k}{k+1}$, $s = k + 1$), is necessary for the existence of a solution to (1.1), or (1.2), respectively. Actually, it follows from our earlier work [24] that, for $\sigma \equiv 1$, (1.8) or (1.9), with the appropriate $\alpha$ and $s$, is also sufficient for the solvability of (1.1) and (1.2). In the case where $\sigma$ is a general nonnegative measure, we will show below that (1.8), or (1.9), together with the additional pointwise condition
\[ \left[ \int_0^r \left( \frac{\sigma(B_t(x))}{t^{n - \alpha s}} \right)^{\frac{1}{s-1}} \frac{dt}{t} \right] \cdot \left[ \int_r^\infty \left( \frac{\omega(B_t(x))}{t^{n - \alpha s}} \right)^{\frac{1}{s-1}} \frac{dt}{t} \right]^{\frac{q}{s-1} - 1} \le C, \tag{1.10} \]
where $C$ does not depend on $x \in \mathbb{R}^n$ and $r > 0$, with $\alpha = 1$, $s = p$ (or $\alpha = \frac{2k}{k+1}$, $s = k + 1$), is sufficient for the solvability of (1.1), or (1.2), respectively. Remarkably, (1.10) with the appropriate $\alpha$, $s$ is also necessary for the solvability of (1.1) and (1.2), as was shown in [25]. This gives a complete solution to the existence problem, with a gap only in the best constants in the sufficiency and necessity parts. Such characterizations of solvability were obtained earlier [23] for semilinear equations where $p = 2$ or $k = 1$.

Analogous results for equations of the type (1.1), (1.2) on bounded domains $\Omega$ in $\mathbb{R}^n$ with Dirichlet boundary conditions are obtained in Theorems 3.11 and 4.9 below as well. In this case, we use a truncated version of Wolff’s potentials defined for $\alpha > 0$, $s > 1$, and a nonnegative measure $\mu$ on $\Omega$ by
\[ W^r_{\alpha,s}\mu(x) = \int_0^r \left[ \frac{\mu(B_t(x))}{t^{n - \alpha s}} \right]^{\frac{1}{s-1}} \frac{dt}{t}, \]

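where $x \in \Omega$ and $0 < r < 2\operatorname{diam}(\Omega)$.

In the present paper, we use extensively dyadic models [24,32], and various global and local Wolff’s potential estimates obtained in [12,13,21,22,24,25], along with fundamental weak continuity theorems for quasilinear and Hessian equations [29–31]. This makes it possible to give a unified treatment for both quasilinear equations and equations of Monge–Ampère type with general coefficients and data.

Our main results are summarized in the following theorems. In Theorem 1.1, we assume that $A : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is a vector-valued mapping that satisfies $A(x, \xi) \cdot \xi \approx |\xi|^p$, where $1 < p < n$. The precise structural conditions imposed on $A$ along with the notion of $A$-superharmonic functions are introduced in the next section. This includes the standard case $A(x, \xi) = |\xi|^{p-2}\xi$, which corresponds to the usual $p$-Laplacian $\Delta_p$, and the corresponding notion of $p$-superharmonic functions.

Theorem 1.1. Let $\omega$ and $\sigma$ be nonnegative locally finite measures on $\mathbb{R}^n$ and let $q > p - 1 > 0$. If the equation
\[ -\operatorname{div} A(x, \nabla u) = \sigma u^q + \omega \tag{1.11} \]
has a solution $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$, $u \ge 0$, then there exists a constant $C = C(n, p, q, \alpha, \beta) > 0$ such that statements (i), (ii) and (iii) below are valid.

(i) The inequality
\[ \int_{\mathbb{R}^n} \big[ W_{1,p}(g\,d\omega)(y) \big]^q\,d\sigma(y) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega \tag{1.12} \]
holds for all $g \in L^{\frac{q}{p-1}}_\omega(\mathbb{R}^n)$, $g \ge 0$, and
\[ \left[ \int_0^r \left( \frac{\sigma(B_t(x))}{t^{n-p}} \right)^{\frac{1}{p-1}} \frac{dt}{t} \right] \cdot \left[ \int_r^\infty \left( \frac{\omega(B_t(x))}{t^{n-p}} \right)^{\frac{1}{p-1}} \frac{dt}{t} \right]^{\frac{q}{p-1} - 1} \le C \tag{1.13} \]
holds for all $x \in \mathbb{R}^n$ and $r > 0$.

(ii) The inequality
\[ \int_B \big[ W_{1,p}\omega_B(y) \big]^q\,d\sigma(y) \le C\,\omega(B) \tag{1.14} \]
holds for all balls $B \subset \mathbb{R}^n$, and (1.13) holds for all $x \in \mathbb{R}^n$ and $r > 0$.

(iii) For each $x \in \mathbb{R}^n$,
\[ W_{1,p}\big[(W_{1,p}\omega)^q\,d\sigma\big](x) \le C\, W_{1,p}\omega(x). \tag{1.15} \]

Conversely, there exists a constant $C_0 = C_0(n, p, q, \alpha, \beta) > 0$ such that if any one of statements (i), (ii), (iii) holds with $C \le C_0$, then Eq. (1.11) has a solution $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$, $u \ge 0$, such that
\[ M_1 W_{1,p}\omega(x) \le u(x) \le M_2 W_{1,p}\omega(x). \]

Theorem 1.2. Let $\omega$ and $\sigma$ be nonnegative locally finite measures on $\mathbb{R}^n$, and let $1 \le k < \frac{n}{2}$, $q > k$. If the equation
\[ F_k[-u] = \sigma u^q + \omega \tag{1.16} \]
has a solution $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$, $u \ge 0$, then there exists a constant $C = C(n, k, q) > 0$ such that statements (i), (ii) and (iii) below are valid.

(i) The inequality
\[ \int_{\mathbb{R}^n} \big[ W_{\frac{2k}{k+1},\,k+1}(g\,d\omega)(y) \big]^q\,d\sigma(y) \le C \int_{\mathbb{R}^n} g^{\frac{q}{k}}\,d\omega \]
holds for all $g \in L^{\frac{q}{k}}_\omega(\mathbb{R}^n)$, $g \ge 0$, and
\[ \left[ \int_0^r \left( \frac{\sigma(B_t(x))}{t^{n-2k}} \right)^{\frac{1}{k}} \frac{dt}{t} \right] \cdot \left[ \int_r^\infty \left( \frac{\omega(B_t(x))}{t^{n-2k}} \right)^{\frac{1}{k}} \frac{dt}{t} \right]^{\frac{q}{k} - 1} \le C \tag{1.17} \]
holds for all $x \in \mathbb{R}^n$ and $r > 0$.

(ii) The inequality
\[ \int_B \big[ W_{\frac{2k}{k+1},\,k+1}\omega_B(y) \big]^q\,d\sigma(y) \le C\,\omega(B) \]
holds for all balls $B \subset \mathbb{R}^n$, and (1.17) holds for all $x \in \mathbb{R}^n$ and $r > 0$.

(iii) For each $x \in \mathbb{R}^n$,
\[ W_{\frac{2k}{k+1},\,k+1}\big[(W_{\frac{2k}{k+1},\,k+1}\omega)^q\,d\sigma\big](x) \le C\, W_{\frac{2k}{k+1},\,k+1}\omega(x). \]

Conversely, there exists a constant $C_0 = C_0(n, k, q) > 0$ such that if any one of statements (i), (ii), (iii) holds with $C \le C_0$, then Eq. (1.16) has a solution $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$, $u \ge 0$, such that
\[ M_1 W_{\frac{2k}{k+1},\,k+1}\omega(x) \le u(x) \le M_2 W_{\frac{2k}{k+1},\,k+1}\omega(x). \]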
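Since both theorems are phrased in terms of Wolff potentials, a small numerical sketch may help fix the definition. The code below approximates the truncated potential $W^r_{\alpha,s}\mu(x)$ for a measure that is a finite sum of point masses. This is purely illustrative: the function name, the midpoint rule in the variable $\log t$, and the use of open balls are assumptions of the sketch, not part of the paper.

```python
import math


def wolff_potential(masses, dists, n, alpha, s, r, steps=20000):
    """Midpoint-rule approximation of the truncated Wolff potential at a point x.

    The measure is a finite sum of point masses; dists[i] is the distance
    from x to the i-th mass. Integration is done in the variable tau = log t.
    """
    d_min = min(dists)
    if r <= d_min:
        return 0.0                        # mu(B_t(x)) = 0 for every t < r
    lo, hi = math.log(d_min), math.log(r)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        t = math.exp(lo + (k + 0.5) * h)
        ball = sum(m for m, d in zip(masses, dists) if d < t)   # mu(B_t(x))
        # dt / t becomes d(tau), so the weight of each cell is simply h
        total += (ball / t ** (n - alpha * s)) ** (1.0 / (s - 1)) * h
    return total


# One mass m = 2 at distance d = 0.5, with n = 3, alpha = 1, s = p = 2, r = 4.
value = wolff_potential([2.0], [0.5], n=3, alpha=1, s=2, r=4.0)
print(value)
```

For a single mass $m$ at distance $d < r$ the integral can be computed by hand: with $\gamma = (n - \alpha s)/(s - 1)$ it equals $m^{1/(s-1)} (d^{-\gamma} - r^{-\gamma})/\gamma$, which for the parameters above is $2(2 - 0.25) = 3.5$, so the printed value should be close to 3.5.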

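2. A-superharmonic functions

In this section, we recall for later use some facts on $A$-superharmonic functions, most of which can be found in [18,20,21,31]. Let $\Omega$ be an open set in $\mathbb{R}^n$, and let $p > 1$. (We will be mostly interested in the case $\Omega = \mathbb{R}^n$ and $1 < p < n$.) Let us assume that $A : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is a vector-valued mapping which satisfies the following structural properties:
\[ \text{the mapping } x \mapsto A(x, \xi) \text{ is measurable for all } \xi \in \mathbb{R}^n, \tag{2.1} \]
\[ \text{the mapping } \xi \mapsto A(x, \xi) \text{ is continuous for a.e. } x \in \mathbb{R}^n, \tag{2.2} \]
and there are constants $0 < \alpha \le \beta < \infty$ such that for a.e. $x$ in $\mathbb{R}^n$, and for all $\xi$ in $\mathbb{R}^n$,
\[ |A(x, \xi)| \le \beta |\xi|^{p-1}, \qquad A(x, \xi) \cdot \xi \ge \alpha |\xi|^p, \tag{2.3} \]
\[ \big[ A(x, \xi_1) - A(x, \xi_2) \big] \cdot (\xi_1 - \xi_2) > 0, \quad \text{if } \xi_1 \ne \xi_2, \tag{2.4} \]
\[ A(x, \lambda \xi) = \lambda |\lambda|^{p-2} A(x, \xi), \quad \text{if } \lambda \in \mathbb{R} \setminus \{0\}. \tag{2.5} \]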
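As a quick sanity check, the model case $A(x, \xi) = |\xi|^{p-2}\xi$ can be tested against (2.3)–(2.5) on random vectors. The sketch below is only a numerical spot check under stated assumptions (the helper names, the sampling ranges, and the tolerances are ours); it takes $\alpha = \beta = 1$ in (2.3), which is what one obtains for this particular $A$.

```python
import math
import random


def A(xi, p):
    """Model flux A(x, xi) = |xi|^(p-2) xi; it does not depend on x."""
    size = math.sqrt(sum(c * c for c in xi))
    if size == 0.0:
        return [0.0] * len(xi)
    return [size ** (p - 2) * c for c in xi]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def check_structure(p, dim=3, trials=1000, seed=0):
    """Spot-check (2.3)-(2.5) with alpha = beta = 1 on random vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        xi1 = [rng.uniform(-2, 2) for _ in range(dim)]
        xi2 = [rng.uniform(-2, 2) for _ in range(dim)]
        lam = rng.choice([-1, 1]) * rng.uniform(0.1, 3)
        a1, a2 = A(xi1, p), A(xi2, p)
        # (2.3): |A| = |xi|^(p-1) and A . xi = |xi|^p for this model case
        assert math.isclose(norm(a1), norm(xi1) ** (p - 1), rel_tol=1e-9)
        assert math.isclose(dot(a1, xi1), norm(xi1) ** p, rel_tol=1e-9)
        # (2.4): strict monotonicity
        diff = [u - v for u, v in zip(xi1, xi2)]
        assert dot([u - v for u, v in zip(a1, a2)], diff) > 0
        # (2.5): homogeneity of degree p - 1 with the sign of lambda
        lhs = A([lam * c for c in xi1], p)
        rhs = [lam * abs(lam) ** (p - 2) * c for c in a1]
        assert all(math.isclose(l, r, rel_tol=1e-9, abs_tol=1e-12)
                   for l, r in zip(lhs, rhs))
    return True


print(check_structure(1.5), check_structure(3.0))
```

Both a value of $p$ below 2 and one above 2 are tried, since the factor $|\xi|^{p-2}$ is singular at the origin in the first case and degenerate in the second.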



For $u \in W^{1,p}_{\mathrm{loc}}(\Omega)$, we define the divergence of $A(x, \nabla u)$ in the sense of distributions, i.e., if $\varphi \in C_0^\infty(\Omega)$, then
\[ \operatorname{div} A(x, \nabla u)(\varphi) = -\int_\Omega A(x, \nabla u) \cdot \nabla\varphi\,dx. \]
It is well known that every solution $u \in W^{1,p}_{\mathrm{loc}}(\Omega)$ to the equation
\[ -\operatorname{div} A(x, \nabla u) = 0 \tag{2.6} \]
has a continuous representative. Such continuous solutions are said to be $A$-harmonic in $\Omega$. If $u \in W^{1,p}_{\mathrm{loc}}(\Omega)$ and
\[ \int_\Omega A(x, \nabla u) \cdot \nabla\varphi\,dx \ge 0 \]
for all nonnegative $\varphi \in C_0^\infty(\Omega)$, i.e., $-\operatorname{div} A(x, \nabla u) \ge 0$ in the distributional sense, then $u$ is called a supersolution to (2.6) in $\Omega$.

A lower semicontinuous function $u : \Omega \to (-\infty, \infty]$ is called $A$-superharmonic if $u$ is not identically infinite in each component of $\Omega$, and if for all open sets $D$ such that $\overline{D} \subset \Omega$, and all functions $h \in C(\overline{D})$, $A$-harmonic in $D$, it follows that $h \le u$ on $\partial D$ implies $h \le u$ in $D$.

We recall here the fundamental connection between supersolutions of (2.6) and $A$-superharmonic functions [18].

Proposition 2.1. (See [18].)

(i) If $u \in W^{1,p}_{\mathrm{loc}}(\Omega)$ is such that $-\operatorname{div} A(x, \nabla u) \ge 0$, then there is an $A$-superharmonic function $v$ such that $u = v$ a.e. Moreover,
\[ v(x) = \operatorname*{ess\,lim\,inf}_{y \to x} v(y), \qquad x \in \Omega. \tag{2.7} \]

(ii) If $v$ is $A$-superharmonic, then (2.7) holds. Moreover, if $v \in W^{1,p}_{\mathrm{loc}}(\Omega)$, then $-\operatorname{div} A(x, \nabla v) \ge 0$.

(iii) If $v$ is $A$-superharmonic and locally bounded, then $v \in W^{1,p}_{\mathrm{loc}}(\Omega)$, and $-\operatorname{div} A(x, \nabla v) \ge 0$.



A consequence of the above proposition is that if $u$ and $v$ are two $A$-superharmonic functions on $\Omega$ for which $u \le v$ a.e. on $\Omega$, then $u \le v$ everywhere on $\Omega$.

Note that an $A$-superharmonic function $u$ does not necessarily belong to $W^{1,p}_{\mathrm{loc}}(\Omega)$, but its truncation $\min\{u, k\}$ does, for every integer $k$, by Proposition 2.1(iii). Using this we set
\[ Du = \lim_{k \to \infty} \nabla\big[\min\{u, k\}\big], \]
defined a.e. If either $u \in L^\infty_{\mathrm{loc}}(\Omega)$ or $u \in W^{1,1}_{\mathrm{loc}}(\Omega)$, then $Du$ coincides with the regular distributional gradient of $u$. In general we have the following gradient estimates [20] (see also [18,31]).

Proposition 2.2. (See [20].) Suppose $u$ is $A$-superharmonic in $\Omega$ and $1 \le q < \frac{n}{n-1}$. Then both $|Du|^{p-1}$ and $A(\cdot, Du)$ belong to $L^q_{\mathrm{loc}}(\Omega)$. Moreover, if $p > 2 - \frac{1}{n}$, then $Du$ is the distributional gradient of $u$.

We can now extend the definition of the divergence of $A(x, \nabla u)$ to those $u$ which are merely $A$-superharmonic in $\Omega$. For such $u$ we set
\[ -\operatorname{div} A(x, \nabla u)(\varphi) = \int_\Omega A(x, Du) \cdot \nabla\varphi\,dx, \]
for all $\varphi \in C_0^\infty(\Omega)$. Note that by Proposition 2.2 and the dominated convergence theorem,
\[ -\operatorname{div} A(x, \nabla u)(\varphi) = \lim_{k \to \infty} \int_\Omega A\big(x, \nabla\min\{u, k\}\big) \cdot \nabla\varphi\,dx \ge 0, \]
whenever $\varphi \in C_0^\infty(\Omega)$ and $\varphi \ge 0$.

Since $-\operatorname{div} A(x, \nabla u)$ is a nonnegative distribution in $\Omega$ for an $A$-superharmonic $u$, it follows that there is a positive (not necessarily finite) Radon measure denoted by $\mu[u]$ such that
\[ -\operatorname{div} A(x, \nabla u) = \mu[u] \quad \text{in } \Omega. \]
Conversely, given a nonnegative finite measure $\mu$ in a bounded domain $\Omega$, there is an $A$-superharmonic function $u$ (not necessarily unique) such that $-\operatorname{div} A(x, \nabla u) = \mu$ in $\Omega$ and $\min\{u, k\} \in W^{1,p}_0(\Omega)$ for all integers $k$; see [20].

The following weak continuity result in [31] will be used later to prove the existence of $A$-superharmonic solutions to quasilinear equations.

Theorem 2.3. (See [31].) Suppose that $\{u_n\}$ is a sequence of nonnegative $A$-superharmonic functions in $\Omega$ that converges a.e. to an $A$-superharmonic function $u$. Then the sequence of measures $\{\mu[u_n]\}$ converges to $\mu[u]$ weakly, i.e.,
\[ \lim_{n \to \infty} \int_\Omega \varphi\,d\mu[u_n] = \int_\Omega \varphi\,d\mu[u] \]
for all $\varphi \in C_0^\infty(\Omega)$.



In [20] and [21] the following two-sided pointwise potential estimate for $A$-superharmonic functions was established, which serves as a major tool in our study of quasilinear equations of Lane–Emden type.

Theorem 2.4. (See [21].) Let $u$ be an $A$-superharmonic function in $\mathbb{R}^n$ with $\inf_{\mathbb{R}^n} u = 0$. If $\mu = -\operatorname{div} A(x, \nabla u)$, then
\[ \frac{1}{K} W_{1,p}\mu(x) \le u(x) \le K\, W_{1,p}\mu(x) \]
for all $x \in \mathbb{R}^n$, where $K$ is a positive constant depending only on $n$, $p$ and the structural constants $\alpha$ and $\beta$.

3. Quasilinear equations

In this section, we study the solvability problem for the quasilinear equation
\[ -\operatorname{div} A(x, \nabla u) = \sigma u^q + \omega \tag{3.1} \]
in a bounded domain $\Omega \subset \mathbb{R}^n$ or in the entire space $\mathbb{R}^n$, where $A(x, \xi) \cdot \xi \approx |\xi|^p$ is given as in Section 2. Here we assume $p > 1$, $q > p - 1$, and $\omega$ is a nonnegative locally finite measure on $\mathbb{R}^n$.

The solvability of (3.1) in $\mathbb{R}^n$ is understood in the "potential-theoretic" sense, i.e., $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$, $u \ge 0$, is a solution to (3.1) if $u$ is $A$-superharmonic, and
\[ -\operatorname{div} A(x, \nabla u)(\varphi) = \int u^q \varphi\,d\sigma + \int \varphi\,d\omega \]
for all test functions $\varphi \in C_0^\infty(\mathbb{R}^n)$. Here, as discussed in Section 2,
\[ -\operatorname{div} A(x, \nabla u)(\varphi) = \lim_{k \to \infty} \int A\big(x, \nabla\min\{u, k\}\big) \cdot \nabla\varphi\,dx \]
for each $\varphi \in C_0^\infty(\mathbb{R}^n)$. On the other hand, to deal with (3.1) in bounded domains we will use the notion of renormalized solutions.

We first prove a necessary condition for the solvability of (3.1) in $\mathbb{R}^n$ in terms of a certain nonlinear weighted norm inequality.

Theorem 3.1. Let $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$ be a nonnegative $A$-superharmonic function for which $-\operatorname{div} A(x, \nabla u) = \sigma u^q + \omega$ in $\mathbb{R}^n$, where $1 < p < n$ and $q > p - 1$. Then
\[ \int_{\mathbb{R}^n} \big[ W_{1,p}(g\,d\omega) \big]^q\,d\sigma \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega \tag{3.2} \]
for all $g \in L^{\frac{q}{p-1}}_\omega(\mathbb{R}^n)$, $g \ge 0$, and for a constant $C$ depending only on $n$, $p$, $q$, and the structural constants $\alpha$, $\beta$.



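Proof. Let $\mu = \sigma u^q + \omega$. From the lower Wolff potential estimate, Theorem 2.4, there is a constant $C = C(n, p, \alpha, \beta) > 0$ such that $u(x) \ge C\, W_{1,p}\mu(x)$ for all $x \in \mathbb{R}^n$. From this we obtain
\[ (W_{1,p}\mu)^q\,d\sigma \le C\,d\mu, \tag{3.3} \]
where $C = C(n, p, q, \alpha, \beta)$. Moreover, it is also easily checked that all constants $C$ appearing in the rest of the proof depend only on $n$, $p$, $q$, and $\alpha$, $\beta$, but not on the measures $\sigma$ and $\omega$. Inequality (3.3) now gives
\[ \int_{\mathbb{R}^n} (W_{1,p}\mu)^q (M_\mu g)^{\frac{q}{p-1}}\,d\sigma \le C \int_{\mathbb{R}^n} (M_\mu g)^{\frac{q}{p-1}}\,d\mu, \]
which holds for all $g \in L^{\frac{q}{p-1}}_\mu$. Here $M_\mu$ denotes the centered Hardy–Littlewood maximal function defined for a locally $\mu$-integrable function $f$ by
\[ M_\mu f(x) = \sup_{r > 0} \frac{\int_{B_r(x)} |f|\,d\mu}{\mu(B_r(x))}. \]
Since $M_\mu$ is bounded on $L^s_\mu(\mathbb{R}^n)$, $s > 1$ (see, e.g., [16]), from the preceding inequality we obtain
\[ \int_{\mathbb{R}^n} (W_{1,p}\mu)^q (M_\mu g)^{\frac{q}{p-1}}\,d\sigma \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\mu \]
for all $g \in L^{\frac{q}{p-1}}_\mu(\mathbb{R}^n)$, $g \ge 0$. From this inequality and the estimate
\[ \big[ W_{1,p}\mu(x) \big]^q \big[ M_\mu g(x) \big]^{\frac{q}{p-1}} \ge C \big[ W_{1,p}(g\,d\mu)(x) \big]^q \]
we deduce
\[ \int_{\mathbb{R}^n} \big[ W_{1,p}(g\,d\mu)(x) \big]^q\,d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\mu. \tag{3.4} \]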

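We now continue the proof of Theorem 3.1 with the cases $1 < p \le 2$ and $p > 2$ being treated separately.

The case $1 < p \le 2$. We first rewrite (3.4) in the form

\[
\int_{\mathbb{R}^n} \Big[\sum_{j\in\mathbb{Z}} \Big(\frac{\int_{B(x,2^{-j})} g\,d\mu}{(2^{-j})^{n-p}}\Big)^{\frac{1}{p-1}}\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\mu.
\]

Since $1 < p \le 2$, by duality of spaces with mixed norms, $L^{\frac{q}{p-1}}_\sigma(\ell^{\frac{1}{p-1}}) = \big[L^{\frac{q}{q-p+1}}_\sigma(\ell^{\frac{1}{2-p}})\big]^*$, this is equivalent to

\[
\int_{\mathbb{R}^n} \sum_{j\in\mathbb{Z}} \frac{\int_{B(x,2^{-j})} g\,d\mu}{(2^{-j})^{n-p}}\,\phi_j(x)\,d\sigma(x) \le C\,\|g\|_{L^{\frac{q}{p-1}}_\mu}\,\|\phi\|_{L^{\frac{q}{q-p+1}}_\sigma(\ell^{\frac{1}{2-p}})}
\]

for all $\phi = \{\phi_j\}_{j\in\mathbb{Z}} \in L^{\frac{q}{q-p+1}}_\sigma(\ell^{\frac{1}{2-p}})$. Using Fubini's theorem, the above inequality can be rewritten as

\[
\int_{\mathbb{R}^n} \Big[\int_{\mathbb{R}^n} \sum_{j\in\mathbb{Z}} \frac{\chi_{B(x,2^{-j})}(y)}{(2^{-j})^{n-p}}\,\phi_j(x)\,d\sigma(x)\Big]\,g(y)\,d\mu(y) \le C\,\|g\|_{L^{\frac{q}{p-1}}_\mu}\,\|\phi\|_{L^{\frac{q}{q-p+1}}_\sigma(\ell^{\frac{1}{2-p}})}. \tag{3.5}
\]

Again by duality (3.5) is equivalent to

\[
\int_{\mathbb{R}^n} \Big[\int_{\mathbb{R}^n} \sum_{j\in\mathbb{Z}} \frac{\chi_{B(x,2^{-j})}(y)}{(2^{-j})^{n-p}}\,\phi_j(x)\,d\sigma(x)\Big]^{\frac{q}{q-p+1}} d\mu(y) \le C\,\|\phi\|^{\frac{q}{q-p+1}}_{L^{\frac{q}{q-p+1}}_\sigma(\ell^{\frac{1}{2-p}})}. \tag{3.6}
\]

Since $\omega \le \mu$, inequality (3.6), and hence inequality (3.4), holds also for $\omega$ in place of $\mu$, i.e.,

\[
\int_{\mathbb{R}^n} \big[W_{1,p}(g\,d\omega)(x)\big]^q\,d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega
\]

for all $g \in L^{\frac{q}{p-1}}_\omega(\mathbb{R}^n)$, $g \ge 0$, which gives (3.2).

The case $p > 2$. Since $Q \subset B_{\sqrt{n}\,\ell(Q)}(x)$ for every cube $Q$ that contains $x$, we see that

\[
\sum_{Q\in\mathcal{D}} \Big(\frac{\int_Q g\,d\mu}{\ell(Q)^{n-p}}\Big)^{\frac{1}{p-1}} \chi_Q(x) \le C\,W_{1,p}(g\,d\mu)(x).
\]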

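Thus from (3.4) we obtain

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} \Big(\frac{\int_Q g\,d\mu}{\ell(Q)^{n-p}}\Big)^{\frac{1}{p-1}} \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\mu. \tag{3.7}
\]

We next choose a number $s$ such that $s > \max\{1, \frac{1}{p-1}, \frac{1}{q-p+1}\}$ and replace $g$ by $g^{s(p-1)}$ in (3.7) to get

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} \Big(\frac{\int_Q g^{s(p-1)}\,d\mu}{\ell(Q)^{n-p}}\Big)^{\frac{s}{s(p-1)}} \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{qs}\,d\mu.
\]

Since $s(p-1) > 1$, from this and Hölder's inequality we obtain

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} \Big(c_Q\,\frac{\int_Q g\,d\mu}{\mu(Q)}\Big)^s \chi_Q(x)\Big]^{\frac{1}{s}\,qs} d\sigma(x) \le C \int_{\mathbb{R}^n} g^{qs}\,d\mu, \tag{3.8}
\]

where we set $c_Q = \big[\ell(Q)^{p-n}\mu(Q)\big]^{\frac{1}{s(p-1)}}$. By duality (3.8) is equivalent to

\[
\int_{\mathbb{R}^n} \sum_{Q\in\mathcal{D}} c_Q\,\frac{\int_Q g\,d\mu}{\mu(Q)}\,\chi_Q(x)\,\phi_Q(x)\,d\sigma(x) \le C\,\|g\|_{L^{qs}_\mu}\,\|\phi\|_{L^{(qs)'}_\sigma(\ell^{s'})} \tag{3.9}
\]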

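for all $\phi = \{\phi_Q\}_{Q\in\mathcal{D}} \in L^{(qs)'}_\sigma(\ell^{s'})$. Here and in what follows for $r > 1$ we denote by $r'$ its conjugate, i.e., $r' = \frac{r}{r-1}$. Observe that inequality (3.9) can be rewritten in the form

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} c_Q\,\frac{\int_Q \phi_Q\,d\sigma}{\mu(Q)}\,\chi_Q(y)\Big]\,g(y)\,d\mu(y) \le C\,\|g\|_{L^{qs}_\mu}\,\|\phi\|_{L^{(qs)'}_\sigma(\ell^{s'})},
\]

which again by duality is equivalent to

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} c_Q\,\frac{\int_Q \phi_Q\,d\sigma}{\mu(Q)}\,\chi_Q(y)\Big]^{(qs)'} d\mu(y) \le C\,\|\phi\|^{(qs)'}_{L^{(qs)'}_\sigma(\ell^{s'})}. \tag{3.10}
\]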

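Applying Proposition 2.2 in [13] we see that this inequality is equivalent to

\[
\sum_{Q\in\mathcal{D}} \lambda_Q \Big[\frac{1}{\mu(Q)} \sum_{Q'\subset Q} \lambda_{Q'}\Big]^{(qs)'-1} \le C\,\|\phi\|^{(qs)'}_{L^{(qs)'}_\sigma(\ell^{s'})}, \tag{3.11}
\]

where $\lambda_Q = c_Q \int_Q \phi_Q\,d\sigma = \big[\ell(Q)^{p-n}\mu(Q)\big]^{\frac{1}{s(p-1)}} \int_Q \phi_Q\,d\sigma$. Note that $\frac{1}{s(p-1)} - (qs)' + 1 > 0$ by the choice of $s$. Thus (3.11), and hence (3.8), holds also for $\omega$ in place of $\mu$, i.e.,

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} s_Q \Big(\frac{\int_Q g\,d\omega}{\omega(Q)}\Big)^s \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{qs}\,d\omega, \tag{3.12}
\]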

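which holds for all $g \in L^{qs}_\omega(\mathbb{R}^n)$, $g \ge 0$. We set $s_Q = \big[\ell(Q)^{p-n}\omega(Q)\big]^{\frac{1}{p-1}}$.

Let $M^d_\mu$ denote the dyadic Hardy–Littlewood maximal function defined for each locally $\mu$-integrable function $f$ by

\[
M^d_\mu f(x) = \sup_{Q \ni x} \frac{\int_Q |f|\,d\mu}{\mu(Q)},
\]

where the supremum is taken over all dyadic cubes that contain $x$. By replacing $g$ with $\big[M^d_\omega(g^{s(p-1)})\big]^{\frac{1}{s(p-1)}}$ in (3.12) we obtain

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} s_Q \Big(\frac{\int_Q g^{s(p-1)}\,d\omega}{\omega(Q)}\Big)^{\frac{1}{p-1}} \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} \big[M^d_\omega(g^{s(p-1)})\big]^{\frac{q}{p-1}}\,d\omega. \tag{3.13}
\]

Since $q > p-1$, the boundedness of $M^d_\omega$ on $L^s_\omega$, $s > 1$ (see, e.g., [26]), and (3.13) then yield

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} s_Q \Big(\frac{\int_Q g^{s(p-1)}\,d\omega}{\omega(Q)}\Big)^{\frac{1}{p-1}} \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{qs}\,d\omega.
\]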

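Thus replacing $g$ with $g^{\frac{1}{s(p-1)}}$ in the above inequality we get

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} \Big(\frac{\int_Q g\,d\omega}{\ell(Q)^{n-p}}\Big)^{\frac{1}{p-1}} \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega, \tag{3.14}
\]

which holds for all $g \in L^{\frac{q}{p-1}}_\omega(\mathbb{R}^n)$, $g \ge 0$. Note that a shifted version of (3.14), namely,

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D}} \Big(\frac{\int_{Q_t} g\,d\omega}{\ell(Q_t)^{n-p}}\Big)^{\frac{1}{p-1}} \chi_{Q_t}(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega, \tag{3.15}
\]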

holds for the same constant $C > 0$ independent of $t \in \mathbb{R}^n$ as well. Here $Q_t = Q + t$ for each $t \in \mathbb{R}^n$.

We next introduce the truncated version $W^R_{1,p}$, $R > 0$, of Wolff's potentials defined for each nonnegative measure $\nu$ by

\[
W^R_{1,p}\nu(x) = \int_0^R \Big[\frac{\nu(B_t(x))}{t^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dt}{t}. \tag{3.16}
\]
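To make the definition of the truncated potential concrete, here is a small numerical sketch. It assumes that the measure is given only through a callable `ball_mass(t)` returning $\nu(B_t(x))$; the function name, the quadrature rule (midpoint rule in the variable $\log t$) and the lower cutoff `t_min` are choices made for this illustration and are not taken from the paper.

```python
import math


def truncated_wolff_potential(ball_mass, n, p, R, steps=20000, t_min=1e-8):
    """Approximate W^R_{1,p} nu(x) = int_0^R (nu(B_t(x)) / t^(n-p))^(1/(p-1)) dt/t.

    ball_mass(t) must return nu(B_t(x)) for the fixed point x.
    With t = exp(s) the measure dt/t becomes ds, so a midpoint rule on
    [log t_min, log R] is used. The part of the integral below t_min is
    ignored, which is an approximation.
    """
    a, b = math.log(t_min), math.log(R)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = math.exp(a + (i + 0.5) * h)
        total += (ball_mass(t) / t ** (n - p)) ** (1.0 / (p - 1))
    return total * h


if __name__ == "__main__":
    # nu = unit point mass at the origin, x at distance d = 0.5 from it:
    # nu(B_t(x)) is 1 when t > d and 0 otherwise.
    d, R = 0.5, 10.0
    approx = truncated_wolff_potential(lambda t: 1.0 if t > d else 0.0, n=3, p=2, R=R)
    exact = 1.0 / d - 1.0 / R  # closed form for n = 3, p = 2
    print(approx, exact)
```

For a unit point mass at distance $d < R$ from $x$ and $1 < p < n$, direct integration of (3.16) gives $\frac{p-1}{n-p}\big(d^{-\frac{n-p}{p-1}} - R^{-\frac{n-p}{p-1}}\big)$, which for $n = 3$, $p = 2$ reduces to $1/d - 1/R$, the truncated Newtonian kernel.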

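It was established in [12, p. 399], that

\[
W^R_{1,p}(g\,d\omega)(x) \le \frac{C}{R^n} \int_{|t|\le cR} \sum_{Q\in\mathcal{D}} \Big(\frac{\int_{Q_t} g\,d\omega}{\ell(Q_t)^{n-p}}\Big)^{\frac{1}{p-1}} \chi_{Q_t}(x)\,dt,
\]

which holds for constants $C, c > 0$ depending only on $n$. From this estimate, Hölder's inequality and the fact that $q > p-1 > 1$ we get

\[
\big[W^R_{1,p}(g\,d\omega)(x)\big]^q \le \frac{C}{R^n} \int_{|t|\le cR} \Big[\sum_{Q\in\mathcal{D}} \Big(\frac{\int_{Q_t} g\,d\omega}{\ell(Q_t)^{n-p}}\Big)^{\frac{1}{p-1}} \chi_{Q_t}(x)\Big]^q dt.
\]

Thus in view of (3.15), applying Fubini's theorem and letting $R \to \infty$, we obtain (3.2). □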



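In the next theorem we establish a pointwise inequality which will turn out to be necessary and sufficient for the solvability of (3.1) in $\mathbb{R}^n$.

Theorem 3.2. Let $\sigma$ and $\omega$ be nonnegative locally finite measures on $\mathbb{R}^n$ for which

\[
\int_B \big[W_{1,p}\omega_B(y)\big]^q\,d\sigma(y) \le M\,\omega(B) \tag{3.17}
\]

for all balls $B \subset \mathbb{R}^n$, and

\[
\Big[\int_0^r \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big] \cdot \Big[\int_r^\infty \Big(\frac{\omega(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^{\frac{q}{p-1}-1} \le M \tag{3.18}
\]

for $x \in \mathbb{R}^n$ and $0 < r < \infty$. Then

\[
W_{1,p}\big[(W_{1,p}\omega)^q\,d\sigma\big](x) \le C\,W_{1,p}\omega(x)
\]

for a constant $C > 0$ depending only on $p$, $q$ and $M$.

Proof. Let $d\nu = (W_{1,p}\omega)^q\,d\sigma$ and let $x$ be as in the theorem. We have to show that for a constant $C = C(p,q,M)$,

\[
W_{1,p}\nu(x) = \int_0^\infty \Big[\frac{\nu(B_r(x))}{r^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dr}{r} \le C\,W_{1,p}\omega(x).
\]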

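For $r > 0$ we write $\omega = \omega_1 + \omega_2$ where $d\omega_1 = \chi_{B_{2r}(x)}\,d\omega$ and $d\omega_2 = (1 - \chi_{B_{2r}(x)})\,d\omega$. From (3.17) we have

\[
\begin{aligned}
\nu(B_r(x)) &= \int_{B_r(x)} \big[W_{1,p}\omega(y)\big]^q\,d\sigma(y) \\
&\le C \int_{B_r(x)} \Big(\big[W_{1,p}\omega_1(y)\big]^q + \big[W_{1,p}\omega_2(y)\big]^q\Big)\,d\sigma \\
&\le C\,\omega(B_{2r}(x)) + C \int_{B_r(x)} \big[W_{1,p}\omega_2(y)\big]^q\,d\sigma,
\end{aligned}
\]

where $C = C(p,q,M)$. On the other hand, for $y \in B_r(x)$,

\[
W_{1,p}\omega_2(y) = \int_r^\infty \Big[\frac{\omega_2(B_t(y))}{t^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dt}{t} \le \int_r^\infty \Big[\frac{\omega(B_{2t}(x))}{t^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dt}{t}, \tag{3.19}
\]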



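since $B_t(y) \subset B_{2t}(x)$ for $y \in B_r(x)$ and $t \ge r$. Thus,

\[
\int_{B_r(x)} \big[W_{1,p}\omega_2(y)\big]^q\,d\sigma \le \Big[\int_r^\infty \Big(\frac{\omega(B_{2t}(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^q \sigma(B_r(x)).
\]

From this inequality and (3.19) we get

\[
W_{1,p}\nu(x) \le C \int_0^\infty \Big[\frac{\omega(B_{2r}(x))}{r^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dr}{r} + C\,I(x), \tag{3.20}
\]

where

\[
I(x) = \int_0^\infty \Big[\int_r^\infty \Big(\frac{\omega(B_{2t}(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^{\frac{q}{p-1}} \Big[\frac{\sigma(B_r(x))}{r^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dr}{r}.
\]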

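Using integration by parts we can write

\[
I(x) = \frac{q}{p-1} \int_0^\infty \Big[\int_r^\infty \Big(\frac{\omega(B_{2t}(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^{\frac{q}{p-1}-1} \Big[\frac{\omega(B_{2r}(x))}{r^{n-p}}\Big]^{\frac{1}{p-1}} \times \Big[\int_0^r \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big] \frac{dr}{r},
\]

which by (3.18) gives

\[
I(x) \le M \int_0^\infty \Big[\frac{\omega(B_{2r}(x))}{r^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dr}{r}.
\]

Thus in view of (3.20) we obtain for a constant $C = C(p,q,M)$,

\[
W_{1,p}\nu(x) \le C \int_0^\infty \Big[\frac{\omega(B_{2r}(x))}{r^{n-p}}\Big]^{\frac{1}{p-1}} \frac{dr}{r} \le C\,W_{1,p}\omega(x),
\]

which completes the proof of the theorem. □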

As we will employ the notion of renormalized solutions below, we now recall two of its equivalent definitions established in [15]. In what follows, for a measure $\mu$ of bounded total variation on a bounded open set $\Omega$ we will denote by $\mu_0$ its continuous part (with respect to the capacity $\mathrm{cap}_{1,p}(\cdot,\Omega)$), and by $\mu_s$ its singular part, which concentrates on a set of zero capacity. Here the capacity $\mathrm{cap}_{1,p}(\cdot,\Omega)$ is defined by

\[
\mathrm{cap}_{1,p}(K,\Omega) = \inf\Big\{\int_\Omega |\nabla\varphi|^p\,dx:\ \varphi \in C_0^\infty(\Omega),\ \varphi \ge 1 \text{ on } K\Big\}
\]

for each compact set $K \subset \Omega$. Thus we can write $\mu = \mu_0 + \mu_s^+ - \mu_s^-$, where $\mu_s^+$ and $\mu_s^-$ are the positive and negative parts of $\mu_s$, respectively.

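Definition 3.3. Let $\mu$ be a measure of bounded total variation on $\Omega$. Then $u$ is said to be a renormalized solution of

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u) = \mu & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{3.21}
\]

if the following conditions hold:

(a) The function $u$ is measurable and finite almost everywhere, and $T_k(u)$ belongs to $W_0^{1,p}(\Omega)$ for every $k > 0$, where for $k > 0$ and $s \in \mathbb{R}$, $T_k(s) = \max\{-k, \min\{s, k\}\}$.

(b) The gradient $Du$ of $u$ satisfies $|Du|^{p-1} \in L^q(\Omega)$ for all $q < \frac{n}{n-1}$, where $Du$ is defined by $Du\,\chi_{\{|u|<k\}} = \nabla T_k(u)$ for all $k > 0$.

(c) If $w$ belongs to $W_0^{1,p}(\Omega) \cap L^\infty(\Omega)$ and if there exist $k > 0$, $w^{+\infty}$ and $w^{-\infty}$ in $W^{1,r}(\Omega) \cap L^\infty(\Omega)$, with $r > n$, such that

\[
w = w^{+\infty} \ \text{a.e. on the set } \{u > k\}, \qquad w = w^{-\infty} \ \text{a.e. on the set } \{u < -k\},
\]

then

\[
\int_\Omega \mathcal{A}(x,Du)\cdot\nabla w\,dx = \int_\Omega w\,d\mu_0 + \int_\Omega w^{+\infty}\,d\mu_s^+ - \int_\Omega w^{-\infty}\,d\mu_s^-.
\]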
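The truncation $T_k$ in condition (a) is elementary, and a short sketch may help fix ideas. The function name `truncate` and the sample function $u(x) = 1/|x|$ are illustrative choices, not taken from the paper.

```python
def truncate(s, k):
    """Truncation at level k > 0: T_k(s) = max(-k, min(s, k))."""
    if k <= 0:
        raise ValueError("k must be positive")
    return max(-k, min(s, k))


if __name__ == "__main__":
    # u(x) = 1/|x| is unbounded near 0, but T_k(u) is bounded by k.
    u = lambda x: 1.0 / abs(x)
    for x in (0.01, 0.5, 2.0):
        print(x, u(x), truncate(u(x), 5.0))
```

The point of the definition is that $u$ itself need not belong to a Sobolev space, while every truncation $T_k(u)$ is required to lie in $W_0^{1,p}(\Omega)$.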

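Definition 3.4. Let $\mu$ be a measure of bounded total variation on $\Omega$. Then $u$ is a renormalized solution of (3.21) if $u$ satisfies (a) and (b) in Definition 3.3, and if the following conditions hold:

(c) For every $k > 0$ there exist two nonnegative measures $\lambda_k^+$ and $\lambda_k^-$ which are continuous with respect to the capacity $\mathrm{cap}_{1,p}(\cdot,\Omega)$ and concentrate on the sets $\{u = k\}$ and $\{u = -k\}$, respectively, such that $\lambda_k^+ \to \mu_s^+$ and $\lambda_k^- \to \mu_s^-$ in the narrow topology of measures, i.e.,

\[
\int_\Omega \varphi\,d\lambda_k^+ \to \int_\Omega \varphi\,d\mu_s^+ \quad\text{and}\quad \int_\Omega \varphi\,d\lambda_k^- \to \int_\Omega \varphi\,d\mu_s^-,
\]

for every bounded continuous function $\varphi$ on $\Omega$.

(d) For every $k > 0$,

\[
\int_{\{|u|<k\}} \mathcal{A}(x,Du)\cdot\nabla\varphi\,dx = \int_{\{|u|<k\}} \varphi\,d\mu_0 + \int_\Omega \varphi\,d\lambda_k^+ - \int_\Omega \varphi\,d\lambda_k^-
\]

[…]

[…] $q > p - 1 > 0$,

\[
W^{2R}_{1,p}\big[(W^{2R}_{1,p}\omega)^q\,d\sigma\big] \le C\,W^{2R}_{1,p}\omega \quad \sigma\text{-a.e.}, \tag{3.22}
\]

where $R = \operatorname{diam}(\Omega)$ and

\[
C \le \Big(\frac{q-p+1}{q\,K\max\{1, 2^{p'-2}\}}\Big)^{q(p'-1)} \frac{p-1}{q-p+1}, \tag{3.23}
\]

where $K$ is the constant in Theorem 3.6. Then there is a renormalized solution $u \in L^q_\sigma(\Omega)$ to the Dirichlet problem

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u) = \sigma u^q + \omega & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{3.24}
\]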

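such that $u(x) \le M\,W^{2R}_{1,p}\omega(x)$ for all $x$ in $\Omega$, where the constant $M$ depends only on $n$, $p$, $q$, and the structural constants $\alpha$, $\beta$.

Proof. Using Lemma 3.7 we can construct a nondecreasing sequence of functions $\{u_j\}_{j\ge 0}$ for which

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u_0) = \omega & \text{in } \Omega, \\ u_0 = 0 & \text{on } \partial\Omega \end{cases}
\]

and

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u_j) = \sigma u_{j-1}^q + \omega & \text{in } \Omega, \\ u_j = 0 & \text{on } \partial\Omega \end{cases}
\]

in the renormalized sense for each $j \ge 1$. By Theorem 3.6 we have

\[
u_0 \le K\,W^{2R}_{1,p}\omega, \qquad u_m \le K\,W^{2R}_{1,p}\big(u_{m-1}^q\,d\sigma + \omega\big).
\]

Thus using (3.22), (3.23) and arguing by induction as in the proof of Theorem 5.3 in [24] we obtain a constant $M > 0$ such that $u_j \le M\,W^{2R}_{1,p}\omega$ for all $j \ge 0$. Therefore, the sequence $\{u_j\}_{j\ge 0}$ converges pointwise increasingly to a nonnegative function $u$ for which $u \le M\,W^{2R}_{1,p}\omega$.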



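Thus from the stability result in [15] we see that $u$ is a renormalized solution of (3.24), which proves the theorem. □

To obtain an existence result similar to that of Theorem 3.8 for equations in the entire space $\mathbb{R}^n$ we will need the following technical lemma.

Lemma 3.9. Let $\mu$ and $\nu$ be nonnegative locally finite measures on $\mathbb{R}^n$ such that $\mu \le \nu$ and $W_{1,p}\nu < \infty$ a.e. Suppose that $u$ is a nonnegative $\mathcal{A}$-superharmonic function for which

\[
-\operatorname{div}\mathcal{A}(x,\nabla u) = \mu \quad \text{in } \mathbb{R}^n,
\]

and $u$ is a pointwise a.e. limit of a subsequence of the sequence $\{u_k\}_{k\ge 1}$, where $u_k$ are renormalized solutions of

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u_k) = \mu_{B_k} & \text{in } B_{k+1}, \\ u_k = 0 & \text{on } \partial B_{k+1}. \end{cases} \tag{3.25}
\]

Then there exists an $\mathcal{A}$-superharmonic function $v$ for which $v \ge u$ and

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla v) = \nu & \text{in } \mathbb{R}^n, \\ \inf_{\mathbb{R}^n} v = 0. \end{cases} \tag{3.26}
\]

Moreover, $v$ is also a pointwise a.e. limit of a subsequence of the sequence $\{v_k\}_{k\ge 1}$, where $v_k$ are renormalized solutions of

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla v_k) = \nu_{B_k} & \text{in } B_{k+1}, \\ v_k = 0 & \text{on } \partial B_{k+1}. \end{cases} \tag{3.27}
\]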

Proof. By Lemma 3.7, we choose a sequence of functions $\{v_k\}$ satisfying equation (3.27) such that $v_k \ge u_k$ for each $k \in \mathbb{N}$. Then by Theorem 3.6 we have

\[
v_k \le K\,W_{1,p}\nu. \tag{3.28}
\]

Thus there is a subsequence of $\{v_k\}$ that converges a.e. to an $\mathcal{A}$-superharmonic function $v \ge u$ (see [20]). By Theorem 2.3 we have

\[
-\operatorname{div}\mathcal{A}(x,\nabla v) = \nu \quad \text{in } \mathbb{R}^n.
\]

On the other hand, from (3.28) we have

\[
v \le K\,W_{1,p}\nu \quad \text{a.e.},
\]

which by Theorem 2.4 gives

\[
v \le C\,\big(v - \inf_{\mathbb{R}^n} v\big).
\]

Hence $\inf_{\mathbb{R}^n} v = 0$. This completes the proof of the lemma. □



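Theorem 3.10. Let $\omega$ and $\sigma$ be nonnegative locally finite measures on $\mathbb{R}^n$ such that $W_{1,p}\omega < \infty$ a.e., and for some $q > p - 1 > 0$,

\[
W_{1,p}\big[(W_{1,p}\omega)^q\,d\sigma\big] \le C\,W_{1,p}\omega \quad \sigma\text{-a.e.} \tag{3.29}
\]

with

\[
C \le \Big(\frac{q-p+1}{q\,K\max\{1, 2^{p'-2}\}}\Big)^{q(p'-1)} \frac{p-1}{q-p+1}, \tag{3.30}
\]

where $K$ is the constant in Theorem 2.4. Then there is a function $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$, $\mathcal{A}$-superharmonic in $\mathbb{R}^n$, such that

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u) = \sigma u^q + \omega, \\ \inf_{\mathbb{R}^n} u = 0, \end{cases} \tag{3.31}
\]

and

\[
c_1\,W_{1,p}\omega \le u \le c_2\,W_{1,p}\omega \tag{3.32}
\]

for constants $c_1, c_2 > 0$ depending only on $n$, $p$, $q$, and the structural constants $\alpha$, $\beta$.

Proof. Using Lemma 3.9 we can construct inductively a sequence of nonnegative $\mathcal{A}$-superharmonic functions $\{u_j\}$ such that $u_j \le u_{j+1}$, where

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u_0) = \omega, \\ \inf_{\mathbb{R}^n} u_0 = 0, \end{cases}
\]

and

\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u_j) = \sigma (u_{j-1})^q + \omega, \\ \inf_{\mathbb{R}^n} u_j = 0 \end{cases}
\]

for each $j \ge 1$. Note that $u_j \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$ by Theorem 2.4 and condition (3.29). Also from Theorem 2.4, conditions (3.29), (3.30) and arguing by induction we have

\[
u_j \le \frac{K\max\{1, 2^{p'-2}\}\,q}{q-p+1}\,W_{1,p}\omega.
\]

Thus by weak continuity (Theorem 2.3) $u_j \uparrow u$ for an $\mathcal{A}$-superharmonic function $u \ge 0$ that satisfies equation (3.31) and estimate (3.32). □

We are now in a position to prove Theorem 1.1 stated in Section 1.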

Proof of Theorem 1.1. Suppose first that $u \in L^q_{\sigma,\mathrm{loc}}(\mathbb{R}^n)$, $u \ge 0$, is a solution of (1.11). By Theorem 3.1 we obtain inequality (1.12) which holds for all $g \in L^{\frac{q}{p-1}}_\omega(\mathbb{R}^n)$, $g \ge 0$, and with a constant $C = C(n,p,q,\alpha,\beta)$. Consequently, letting $g = \chi_B$ in (1.12) we deduce inequality (1.14) also with $C = C(n,p,q,\alpha,\beta)$. On the other hand, by Theorem 2.4 in [25] we have inequality (1.13). Thus statements (i) and (ii) have been verified. Note that statement (iii) follows from statement (ii) and Theorem 3.2. Therefore, from Theorem 3.10 we obtain the last statement of the theorem. Note that the constant $C$ in Theorem 3.10 can be chosen independently of the measures $\sigma$, $\omega$, and so can the constant $C_0$ in Theorem 1.1. □

We next present similar results for quasilinear equations (1.11) on a bounded domain $\Omega \subset \mathbb{R}^n$ with the Dirichlet boundary condition.

Theorem 3.11. Let $p > 1$, $q > p - 1$, and $R = \operatorname{diam}(\Omega)$. Suppose that $\omega$ and $\sigma$ are nonnegative finite measures on $\Omega$ such that $\operatorname{supp}(\omega) \Subset \Omega$ and in the case $1 < p \le n$ we assume $\sigma \in L^s(\Omega \setminus K)$ for some $s > \frac{n}{p}$ and a compact set $K \Subset \Omega$. If the equation

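\[
\begin{cases} -\operatorname{div}\mathcal{A}(x,\nabla u) = \sigma u^q + \omega & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega \end{cases} \tag{3.33}
\]

has a nonnegative solution $u \in L^q_{\sigma,\mathrm{loc}}(\Omega)$, then there exists a constant $C = C(n,p,q,\alpha,\beta,u,\sigma,\omega) > 0$ such that statements (i), (ii) and (iii) below are valid.

(i) The inequality

\[
\int_{\mathbb{R}^n} \big[W^{2R}_{1,p}(g\,d\omega)(y)\big]^q\,d\sigma(y) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega \tag{3.34}
\]

holds for all $g \in L^{\frac{q}{p-1}}_\omega(\Omega)$, $g \ge 0$, and

\[
\Big[\int_0^r \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big] \cdot \Big[\int_r^{2R} \Big(\frac{\omega(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^{\frac{q}{p-1}-1} \le C \tag{3.35}
\]

holds for all $x \in \Omega$ and $0 < r < 2R$.

(ii) The inequality

\[
\int_B \big[W^{2R}_{1,p}\omega_B(y)\big]^q\,d\sigma(y) \le C\,\omega(B) \tag{3.36}
\]

holds for all balls $B \subset \mathbb{R}^n$, and (3.35) holds for all $x \in \Omega$ and $0 < r < 2R$.

(iii) For all $x \in \Omega$,

\[
W^{2R}_{1,p}\big[(W^{2R}_{1,p}\omega)^q\,d\sigma\big](x) \le C\,W^{2R}_{1,p}\omega(x). \tag{3.37}
\]

Conversely, there exists a constant $C_0 = C_0(n,p,q,\alpha,\beta) > 0$ such that if any one of statements (i), (ii), (iii) holds with $C \le C_0$, then Eq. (3.33) has a nonnegative renormalized solution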

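$u \in L^q_\sigma(\Omega)$, for any nonnegative finite measures $\sigma$, $\omega$ on $\Omega$. Moreover, $u$ satisfies the following pointwise estimate: $u(x) \le M\,W^{2R}_{1,p}\omega(x)$. Here we implicitly extend $\sigma$ and $\omega$ to the whole space $\mathbb{R}^n$ in such a way that $\sigma(\mathbb{R}^n \setminus \Omega) = \omega(\mathbb{R}^n \setminus \Omega) = 0$.

Proof. We first adapt the proof of Theorem 3.1 to derive inequality (3.34). Suppose that $u \in L^q_{\sigma,\mathrm{loc}}(\Omega)$, $u \ge 0$, is a solution of (3.33), where $\omega$ is compactly supported in $\Omega$. Let $r_0 = \operatorname{dist}(\operatorname{supp}(\omega), \partial\Omega)$ and $\Omega' = \{x \in \Omega: \operatorname{dist}(x, \operatorname{supp}(\omega)) < \frac{r_0}{2}\}$. In what follows we will need the following obvious inequality

\[
C_1 \sum_{j\ge 0} \Big[\frac{\mu(B(x, \frac{r}{2}2^{-j}))}{(\frac{r}{2}2^{-j})^{n-p}}\Big]^{\frac{1}{p-1}} \le W^r_{1,p}\mu(x) \le C_2 \sum_{j\ge 0} \Big[\frac{\mu(B(x, r2^{-j}))}{(r2^{-j})^{n-p}}\Big]^{\frac{1}{p-1}}.
\]

From the lower Wolff potential estimate (see [20,21]) we have

\[
u(x) \ge C\,W^{\frac{\delta(x)}{3}}_{1,p}\mu(x), \quad x \in \Omega,
\]

where $\mu = \sigma u^q + \omega$ and $\delta(x) = \operatorname{dist}(x, \partial\Omega)$. As in the proof of Theorem 3.1, from this we obtain the following analogue of (3.4):

\[
\int_{\mathbb{R}^n} \Big[\sum_{j\ge 0} \Big(\frac{\int_{B(x, \frac{\delta(x)}{6}2^{-j})} g\,d\mu}{(\frac{\delta(x)}{6}2^{-j})^{n-p}}\Big)^{\frac{1}{p-1}}\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\mu,
\]

which holds for all $g \ge 0$, $g \in L^{\frac{q}{p-1}}_\mu(\mathbb{R}^n)$. This implies that for all $g \ge 0$, $g \in L^{\frac{q}{p-1}}_\mu(\mathbb{R}^n)$ and $\operatorname{supp}(g) \subset \Omega'$ we have

\[
\int_{\mathbb{R}^n} \Big[\sum_{j\ge 0} \Big(\frac{\int_{B(x, \frac{r_0}{48}2^{-j})} g\,d\mu}{(\frac{r_0}{48}2^{-j})^{n-p}}\Big)^{\frac{1}{p-1}}\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\mu. \tag{3.38}
\]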

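Thus in the case $1 < p \le 2$ arguing as before we see that this is equivalent to an analogue of (3.6):

\[
\int_{\Omega'} \Big[\int_{\mathbb{R}^n} \sum_{j\ge 0} \frac{\chi_{B(x, \frac{r_0}{48}2^{-j})}(y)}{(\frac{r_0}{48}2^{-j})^{n-p}}\,\phi_j(x)\,d\sigma(x)\Big]^{\frac{q}{q-p+1}} d\mu(y) \le C\,\|\phi\|^{\frac{q}{q-p+1}}_{L^{\frac{q}{q-p+1}}_\sigma(\ell^{\frac{1}{2-p}})},
\]

which gives

\[
\int_{\mathbb{R}^n} \big[W^{\frac{r_0}{48}}_{1,p}(g\,d\omega)(x)\big]^q\,d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega,
\]

for all $g \ge 0$, $g \in L^{\frac{q}{p-1}}_\omega$, since $\omega \le \mu$ and $\operatorname{supp}(\omega) \subset \Omega'$.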

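[…] The case $p > 2$. As before, from (3.38) we get

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D},\ \ell(Q) \le \frac{r_0}{48\sqrt{n}}} \Big(\frac{\int_Q g\,d\mu}{\ell(Q)^{n-p}}\Big)^{\frac{1}{p-1}} \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\mu
\]

for all $g \ge 0$, $g \in L^{\frac{q}{p-1}}_\mu$ and $\operatorname{supp}(g) \subset \Omega'$. Thus, with the same notation as in (3.10), we obtain the following localized version of (3.10)

\[
\int_{\Omega'} \Big[\sum_{Q\in\mathcal{D},\ \ell(Q) \le \frac{r_0}{48\sqrt{n}}} c_Q\,\frac{\int_Q \phi_Q\,d\sigma}{\mu(Q)}\,\chi_Q(y)\Big]^{(qs)'} d\mu(y) \le C\,\|\phi\|^{(qs)'}_{L^{(qs)'}_\sigma(\ell^{s'})}.
\]

Let $Q_i$, $i = 1, \dots, N(r_0)$, be dyadic cubes that intersect $\operatorname{supp}(\omega)$ and that satisfy $\frac{r_0}{96\sqrt{n}} < \ell(Q_i) \le \frac{r_0}{48\sqrt{n}}$. Since $Q_i \subset \Omega'$, from the above inequality we get

\[
\sum_{i=1}^{N(r_0)} \int_{Q_i} \Big[\sum_{Q\subset Q_i} c_Q\,\frac{\int_Q \phi_Q\,d\sigma}{\mu(Q)}\,\chi_Q(y)\Big]^{(qs)'} d\mu(y) \le C\,\|\phi\|^{(qs)'}_{L^{(qs)'}_\sigma(\ell^{s'})}.
\]

As before, by applying Proposition 2.2 in [13] we see that this inequality is equivalent to the following analogue of (3.11)

\[
\sum_{i=1}^{N(r_0)} \sum_{Q\subset Q_i} \lambda_Q \Big[\frac{1}{\mu(Q)} \sum_{Q'\subset Q} \lambda_{Q'}\Big]^{(qs)'-1} \le C\,\|\phi\|^{(qs)'}_{L^{(qs)'}_\sigma(\ell^{s'})},
\]

where $\lambda_Q$ and $s$ are as in (3.11). Since $\omega \le \mu$ and $\operatorname{supp}(\omega) \subset \bigcup_{i=1}^{N(r_0)} Q_i$, arguing as before we obtain an inequality similar to (3.12):

\[
\int_{\mathbb{R}^n} \Big[\sum_{Q\in\mathcal{D},\ \ell(Q) \le \frac{r_0}{48\sqrt{n}}} s_Q \Big(\frac{\int_Q g\,d\omega}{\omega(Q)}\Big)^s \chi_Q(x)\Big]^q d\sigma(x) \le C \int_{\mathbb{R}^n} g^{qs}\,d\omega, \tag{3.40}
\]

for every nonnegative $g$ in $L^{qs}_\omega(\mathbb{R}^n)$ and $s_Q = \big[\ell(Q)^{p-n}\omega(Q)\big]^{\frac{1}{p-1}}$. Now using (3.40) and adapting the argument after (3.12) in the proof of Theorem 3.1 we can find a constant $c = c(n)$ such that

\[
\int_{\mathbb{R}^n} \big[W^{\frac{r_0}{c}}_{1,p}(g\,d\omega)(y)\big]^q\,d\sigma(y) \le C \int_{\mathbb{R}^n} g^{\frac{q}{p-1}}\,d\omega \tag{3.41}
\]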

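for all $g \ge 0$, $g \in L^{\frac{q}{p-1}}_\omega(\mathbb{R}^n)$ and $p > 2$. In view of (3.39) we see that (3.41) holds for any $1 < p < \infty$. Thus from the estimate

\[
W^{2R}_{1,p}(g\,d\omega) \le W^{\frac{r_0}{c}}_{1,p}(g\,d\omega) + \int_{\frac{r_0}{c}}^{2R} \Big(\frac{\int_\Omega g\,d\omega}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}
\]

with $R = \operatorname{diam}(\Omega)$ and Hölder's inequality we get the desired inequality (3.34).

We next prove (3.35). Since both $\sigma$ and $\omega$ are finite we may assume that $0 < r < \frac{R_0}{128}$, where $R_0 = \min\{r_0, \operatorname{dist}(K, \partial\Omega)\}$. Note that for $x \in \Omega$ such that $\delta(x) \le \frac{1}{2}R_0$ we have

\[
\Big[\int_0^r \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big] \Big[\int_r^{2R} \Big(\frac{\omega(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^{\frac{q}{p-1}-1} = \Big[\int_0^r \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big] \Big[\int_{\frac{R_0}{4}}^{2R} \Big(\frac{\omega(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^{\frac{q}{p-1}-1} \le C
\]

since

\[
\int_0^{\frac{R_0}{4}} \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t} \le C
\]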

due to the fact that in the case 1 pn . Thus it is enough to consider x in the set E = {x ∈ Ω: δ(x) > r

σ (Bt (x)) t n−p

1 p−1

dt t

δ(x) 8

From Theorem 2.4 in [25] we have

μ(Bt (x)) t n−p

1 p−1

dt t

q p−1 −1

C,

(3.42)

C.

(3.43)

r

0

where 0 < r
0

R0 on x ∈ Ω: δ(x) > . 4



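Hence from (3.43) we obtain

\[
\Big[\int_0^{\frac{R_0}{128}} \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big] \Big[\sigma\Big(B\Big(x, \frac{R_0}{32}\Big)\Big)\Big]^{q-p+1} \le C(u, R_0).
\]

We next cover the set $E$ by a finite number of open balls $\{B_{\frac{R_0}{128}}(x_i)\}_{i=1}^N$. For $x \in B_{\frac{R_0}{128}}(x_i)$ we have

\[
B_{\frac{R_0}{128}}(x) \subset B_{\frac{R_0}{64}}(x_i) \subset B_{\frac{R_0}{32}}(x).
\]

Thus if $x \in E \cap B_{\frac{R_0}{128}}(x_i)$ with $\sigma(B_{\frac{R_0}{64}}(x_i)) = 0$ then

\[
\int_0^{\frac{R_0}{128}} \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t} = 0,
\]

and if $x \in E \cap B_{\frac{R_0}{128}}(x_i)$ with $\sigma(B_{\frac{R_0}{64}}(x_i)) > 0$ we have

\[
\int_0^{\frac{R_0}{128}} \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t} \le m^{-q+p-1}\,C(u, R_0),
\]

where

\[
m = \min\Big\{\sigma\big(B_{\frac{R_0}{64}}(x_i)\big):\ \sigma\big(B_{\frac{R_0}{64}}(x_i)\big) > 0\Big\} > 0.
\]

Therefore, for any $x \in E$ we obtain

\[
\int_0^{\frac{R_0}{128}} \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t} \le C(u, R_0, \sigma).
\]

It is then easy to see from this inequality and (3.42) that

\[
\Big[\int_0^r \Big(\frac{\sigma(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big] \cdot \Big[\int_r^{2R} \Big(\frac{\omega(B_t(x))}{t^{n-p}}\Big)^{\frac{1}{p-1}} \frac{dt}{t}\Big]^{\frac{q}{p-1}-1} \le C(u, R_0, R, \sigma, \omega)
\]

for all $0 < r < \frac{R_0}{128}$ and $x \in E$. This completes the proof of (3.35) and hence that of statement (i). Note that statement (ii) follows from statement (i), as inequality (3.36) is a trivial consequence of (3.34) with $g = \chi_B$. Moreover, by modifying the proof of Theorem 3.2 we see that (ii) implies (iii).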



Finally, for finite measures $\sigma$ and $\omega$ on $\Omega$ we observe that in the implication (i) ⇒ (ii) ⇒ (iii) if the constants $C$ in (i) depend only on $n$, $p$, $q$, and $\alpha$, $\beta$ then so do the constants $C$ in (ii) and (iii). Thus from Theorem 3.8 we obtain the last statement of the theorem. □

4. Hessian equations

In this section, we study a fully nonlinear counterpart of the theory presented in the previous sections. Here the notion of $k$-subharmonic ($k$-convex) functions associated with the fully nonlinear $k$-Hessian operator $F_k$, $k = 1, \dots, n$, introduced by Trudinger and Wang in [28–30] will play a role similar to that of $\mathcal{A}$-superharmonic functions in the quasilinear theory.

Let $\Omega$ be an open set in $\mathbb{R}^n$, $n \ge 2$. For $k = 1, \dots, n$ and $u \in C^2(\Omega)$, the $k$-Hessian operator $F_k$ is defined by

\[
F_k[u] = S_k\big(\lambda(D^2u)\big),
\]

where $\lambda(D^2u) = (\lambda_1, \dots, \lambda_n)$ denotes the eigenvalues of the Hessian matrix of second partial derivatives $D^2u$, and $S_k$ is the $k$th symmetric function on $\mathbb{R}^n$ given by

\[
S_k(\lambda) = \sum_{1 \le i_1 < \cdots < i_k \le n} \lambda_{i_1} \cdots \lambda_{i_k}.
\]

[…] such that

\[
u(x) \le K\,W^{2\operatorname{diam}(\Omega)}_{\frac{2k}{k+1},\,k+1}\,\omega(x)
\]

for every $x$ in $\Omega$.
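For readers who want to experiment with the $k$th symmetric function $S_k$ that defines the $k$-Hessian operator, the following is a minimal sketch. It takes the eigenvalues as a plain list and uses a hypothetical helper name `elementary_symmetric`; computing the eigenvalues of $D^2u$ is outside the scope of the sketch.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(lam, k):
    """S_k(lam): sum over 1 <= i_1 < ... < i_k <= n of lam[i_1] * ... * lam[i_k]."""
    if not 1 <= k <= len(lam):
        raise ValueError("k must satisfy 1 <= k <= n")
    return sum(prod(c) for c in combinations(lam, k))


if __name__ == "__main__":
    eigenvalues = [1, 2, 3]
    for k in (1, 2, 3):
        print(k, elementary_symmetric(eigenvalues, k))
```

With this definition $S_1$ is the trace and $S_n$ the determinant, so $F_1[u] = \Delta u$ and $F_n[u] = \det D^2u$ is the Monge–Ampère operator.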

Theorem 4.3. Let $u \ge 0$ be such that $-u \in \Phi^k(\mathbb{R}^n)$, where $1 \le k < \frac{n}{2}$. If $\mu = \mu_k[-u]$ and $\inf_{\mathbb{R}^n} u = 0$ then for all $x \in \mathbb{R}^n$,

\[
\frac{1}{K}\,W_{\frac{2k}{k+1},\,k+1}\,\mu(x) \le u(x) \le K\,W_{\frac{2k}{k+1},\,k+1}\,\mu(x)
\]

for a constant $K > 0$ depending only on $n$ and $k$.

We now recall an existence result for Hessian equations with measure data established in [28,29] for bounded uniformly $(k-1)$-convex domains $\Omega$ in $\mathbb{R}^n$, i.e., $\partial\Omega \in C^2$ and $H_j(\partial\Omega) > 0$, $j = 1, \dots, k-1$, where $H_j(\partial\Omega)$ denotes the $j$-mean curvature of the boundary $\partial\Omega$.

Lemma 4.4. (See [28,29].) Let $\Omega$ be a bounded uniformly $(k-1)$-convex domain in $\mathbb{R}^n$. Suppose that $\mu = \mu' + f$ where $\mu'$ is a nonnegative measure compactly supported in $\Omega$, and $f \ge 0$, $f \in L^s(\Omega)$ with $s > \frac{n}{2k}$ if $1 \le k \le \frac{n}{2}$, and $s = 1$ if $\frac{n}{2} < k \le n$. Then there exists $u \ge 0$ such that $-u \in \Phi^k(\Omega)$ and $u$ is continuous near $\partial\Omega$ which satisfies the equation

\[
\begin{cases} \mu_k[-u] = \mu & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{4.1}
\]



The uniqueness of solutions to (4.1) remains an open problem for general measure data $\mu$. However, if $\mu$ is continuous with respect to the capacity $\mathrm{cap}_k(\cdot,\Omega)$, i.e., $\mu(E) = 0$ whenever $\mathrm{cap}_k(E,\Omega) = 0$, where $\mathrm{cap}_k(\cdot,\Omega)$ is defined by

\[
\mathrm{cap}_k(K,\Omega) = \sup\big\{\mu_k[u](K):\ u \in \Phi^k(\Omega) \text{ and } -1 < u < 0\big\}
\]

for each compact set $K \subset \Omega$, then the uniqueness follows from the following comparison principle which was established in [30].

Theorem 4.5. (See [30].) Suppose $u, v \in \Phi^k(\Omega)$ are such that the measures $\mu_k[u]$ and $\mu_k[v]$ are continuous with respect to $\mathrm{cap}_k(\cdot,\Omega)$, and $u \le v$ continuously on $\partial\Omega$. If $\mu_k[u] \ge \mu_k[v]$, then $u \le v$ on $\Omega$.

From Theorem 4.5 and the weak continuity result (Theorem 4.1) one gets the following analogue of Lemma 3.7, which was also proved earlier in [24].

Lemma 4.6. (See [24].) Let $\Omega$, $\mu$ and $u$ be as in Lemma 4.4. Suppose that $\nu$ is a measure similar to $\mu$, i.e., $\nu = \nu' + g$ where $\nu'$ is a nonnegative measure compactly supported in $\Omega$, and $g \ge 0$, $g \in L^s(\Omega)$ with $s > \frac{n}{2k}$ if $1 \le k \le \frac{n}{2}$, and $s = 1$ if $\frac{n}{2} < k \le n$. Then there exists a function $v$ such that $-v \in \Phi^k(\Omega)$, $v \ge u$ and

\[
\begin{cases} \mu_k[-v] = \mu + \nu & \text{in } \Omega, \\ v = 0 & \text{on } \partial\Omega. \end{cases}
\]

The following technical lemma will be needed in the proof of Theorem 4.8 below to construct a solution to Hessian equations. It is the Hessian counterpart of Lemma 3.9.

Lemma 4.7. Let $\mu$, $\nu$ be nonnegative locally finite measures on $\mathbb{R}^n$ for which $\mu \le \nu$ and $W_{\frac{2k}{k+1},\,k+1}\,\nu < \infty$ a.e. Suppose that $u$ is a nonnegative function for which $-u \in \Phi^k(\mathbb{R}^n)$, $\mu_k[-u] = \mu$, and $u$ is a pointwise a.e. limit of a subsequence of the sequence $\{u_m\}$, where $-u_m \in \Phi^k(B_{m+1})$ and

\[
\begin{cases} \mu_k[-u_m] = \mu_{B_m} & \text{in } B_{m+1}, \\ u_m = 0 & \text{on } \partial B_{m+1}. \end{cases}
\]

Then there exists a function $v$ for which $-v \in \Phi^k(\mathbb{R}^n)$, $v \ge u$ and

\[
\begin{cases} \mu_k[-v] = \nu, \\ \inf_{\mathbb{R}^n} v = 0. \end{cases} \tag{4.2}
\]

Moreover, $v$ is also a pointwise a.e. limit of a subsequence of the sequence $\{v_m\}$, where $-v_m \in \Phi^k(B_{m+1})$ and

\[
\begin{cases} \mu_k[-v_m] = \nu_{B_m} & \text{in } B_{m+1}, \\ v_m = 0 & \text{on } \partial B_{m+1}. \end{cases} \tag{4.3}
\]



Proof. By Lemma 4.6 we choose a sequence of functions $\{v_m\}$ satisfying equation (4.3) such that $v_m \ge u_m$. Note that

\[
v_m \le K\,W_{\frac{2k}{k+1},\,k+1}\,\nu \quad \text{on } B_{m+1}, \tag{4.4}
\]

by Theorem 4.2. Thus we can find a subsequence $\{v_{m_k}\}$ that converges a.e. to a function $v$ such that $-v \in \Phi^k(\mathbb{R}^n)$ and $v \ge u$. From (4.4) we have

\[
v \le K\,W_{\frac{2k}{k+1},\,k+1}\,\nu \quad \text{a.e.},
\]

which by Theorem 4.3 gives

\[
v \le C\,\big(v - \inf_{\mathbb{R}^n} v\big).
\]

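Thus $\inf_{\mathbb{R}^n} v = 0$. Finally, from (4.4) and weak continuity we see that $v$ satisfies equation (4.2). □

We next obtain an existence result for fully nonlinear equations on $\mathbb{R}^n$ which is an analogue of Theorem 3.10 in the quasilinear case.

Theorem 4.8. Let $\omega$, $\sigma$ be nonnegative locally finite measures on $\mathbb{R}^n$ for which $W_{\frac{2k}{k+1},\,k+1}\,\omega < \infty$ a.e. Let $1 \le k$ […]

[…] $k$, and $R = \operatorname{diam}(\Omega)$. Suppose that $\omega$ and $\sigma$ are nonnegative finite measures on $\Omega$ such that $\operatorname{supp}(\omega) \Subset \Omega$ and in the case $1 \le k \le \frac{n}{2}$ we assume $\sigma \in L^s(\Omega \setminus K)$ for some $s > \frac{n}{2k}$ and a compact set $K \Subset \Omega$. If the equation

\[
\begin{cases} F_k[-u] = \sigma u^q + \omega & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega \end{cases} \tag{4.9}
\]

has a nonnegative solution $u \in L^q_{\sigma,\mathrm{loc}}(\Omega)$, then there exists a constant $C > 0$ such that statements (i), (ii) and (iii) below are valid.

(i) The inequality

\[
\int_{\mathbb{R}^n} \big[W^{2R}_{\frac{2k}{k+1},\,k+1}(g\,d\omega)(y)\big]^q\,d\sigma(y) \le C \int_{\mathbb{R}^n} g^{\frac{q}{k}}\,d\omega
\]

holds for all $g \in L^{\frac{q}{k}}_\omega(\Omega)$, $g \ge 0$, and

\[
\Big[\int_0^r \Big(\frac{\sigma(B_t(x))}{t^{n-2k}}\Big)^{\frac{1}{k}} \frac{dt}{t}\Big] \cdot \Big[\int_r^{2R} \Big(\frac{\omega(B_t(x))}{t^{n-2k}}\Big)^{\frac{1}{k}} \frac{dt}{t}\Big]^{\frac{q}{k}-1} \le C \tag{4.10}
\]

holds for all $x \in \Omega$ and $0 < r < 2R$.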



(ii) The inequality

\[
\int_B \big[W^{2R}_{\frac{2k}{k+1},\,k+1}\,\omega_B(y)\big]^q\,d\sigma(y) \le C\,\omega(B)
\]

holds for all balls $B \subset \mathbb{R}^n$, and (4.10) holds for all $x \in \Omega$ and $0 < r < 2R$.

(iii) For each $x \in \Omega$,

\[
W^{2R}_{\frac{2k}{k+1},\,k+1}\big[(W^{2R}_{\frac{2k}{k+1},\,k+1}\,\omega)^q\,d\sigma\big](x) \le C\,W^{2R}_{\frac{2k}{k+1},\,k+1}\,\omega(x).
\]

Conversely, there exists a constant $C_0 = C_0(n,k,q) > 0$ such that if $\Omega$ is uniformly $(k-1)$-convex, and if any one of statements (i), (ii) and (iii) holds with $C \le C_0$, then Eq. (4.9) has a nonnegative solution $u \in L^q_\sigma(\Omega)$ such that

\[
u(x) \le M\,W^{2R}_{\frac{2k}{k+1},\,k+1}\,\omega(x),
\]

provided $\sigma = \sigma' + f$ and $\omega = \omega' + g$, where $\sigma'$, $\omega'$ are nonnegative measures compactly supported in $\Omega$, and $f$, $g$ are nonnegative functions in $L^s(\Omega)$ with $s > \frac{n}{2k}$ if $1 \le k \le \frac{n}{2}$, and $s = 1$ if $\frac{n}{2} < k \le n$. Here the boundary condition in (4.9) is understood in the classical sense.

References

[1] D.R. Adams, L.I. Hedberg, Function Spaces and Potential Theory, Springer-Verlag, Berlin, 1996.
[2] D.R. Adams, M. Pierre, Capacitary strong type estimates in semilinear problems, Ann. Inst. Fourier (Grenoble) 41 (1991) 117–135.
[3] P. Baras, M. Pierre, Critère d'existence des solutions positives pour des équations semi-linéaires non monotones, Ann. Inst. H. Poincaré 3 (1985) 185–212.
[4] H. Berestycki, I. Capuzzo-Dolcetta, L. Nirenberg, Superlinear indefinite elliptic problems and nonlinear Liouville theorems, Topol. Methods Nonlinear Anal. 4 (1994) 59–78.
[5] M.F. Bidaut-Véron, Local and global behavior of solutions of quasilinear equations of Emden–Fowler type, Arch. Ration. Mech. Anal. 107 (1989) 293–324.
[6] M.F. Bidaut-Véron, Necessary conditions of existence for an elliptic equation with source term and measure data involving p-Laplacian, in: Proc. 2001 Luminy Conf. on Quasilinear Elliptic and Parabolic Equations and Systems, Electron. J. Differ. Equ. Conf. 8 (2002) 23–34.
[7] M.F. Bidaut-Véron, Removable singularities and existence for a quasilinear equation with absorption or source term and measure data, Adv. Nonlinear Stud. 3 (2003) 25–63.
[8] M.F. Bidaut-Véron, S. Pohozaev, Nonexistence results and estimates for some nonlinear elliptic problems, J. Anal. Math. 84 (2001) 1–49.
[9] I. Birindelli, F. Demengel, Some Liouville theorems for the p-Laplacian, in: Proc. 2001 Luminy Conf. on Quasilinear Elliptic and Parabolic Equations and Systems, Electron. J. Differ. Equ. Conf. 8 (2002) 35–46.
[10] H. Brezis, X. Cabré, Some simple nonlinear PDE's without solutions, Boll. Unione Mat. Ital. Ser. B (8) (1998) 223–262.
[11] L. Caffarelli, L. Nirenberg, J. Spruck, The Dirichlet problem for nonlinear second-order elliptic equations. III. Functions of the eigenvalues of the Hessian, Acta Math. 155 (1985) 261–301.
[12] C. Cascante, J.M. Ortega, I.E. Verbitsky, Trace inequalities of Sobolev type in the upper triangle case, Proc. London Math. Soc. 3 (2000) 391–414.
[13] C. Cascante, J.M. Ortega, I.E. Verbitsky, Nonlinear potentials and two weight trace inequalities for general dyadic and radial kernels, Indiana Univ. Math. J. 53 (2004) 845–882.
[14] K.-S. Chou, X.-J. Wang, Variational theory for Hessian equations, Comm. Pure Appl. Math. 54 (2001) 1029–1064.
[15] G. Dal Maso, F. Murat, A. Orsina, A. Prignet, Renormalized solutions of elliptic equations with general measure data, Ann. Sc. Norm. Super. Pisa (4) 28 (1999) 741–808.
[16] R. Fefferman, Strong differentiation with respect to measures, Amer. J. Math. 103 (1981) 33–40.
[17] N. Grenon, Existence results for semilinear elliptic equations with small measure data, Ann. Inst. H. Poincaré Anal. Non Linéaire 19 (2002) 1–11.
[18] J. Heinonen, T. Kilpeläinen, O. Martio, Nonlinear Potential Theory of Degenerate Elliptic Equations, Oxford Univ. Press, Oxford, 1993.
[19] N.M. Ivochkina, Solution of the Dirichlet problem for some equations of the Monge–Ampère type, Mat. Sb. 128 (1985) 403–415 (in Russian).
[20] T. Kilpeläinen, J. Malý, Degenerate elliptic equations with measure data and nonlinear potentials, Ann. Sc. Norm. Super. Pisa, Cl. Sci. 19 (1992) 591–613.
[21] T. Kilpeläinen, J. Malý, The Wiener test and potential estimates for quasilinear elliptic equations, Acta Math. 172 (1994) 137–161.
[22] D.A. Labutin, Potential estimates for a class of fully nonlinear elliptic equations, Duke Math. J. 111 (2002) 1–49.
[23] N.J. Kalton, I.E. Verbitsky, Nonlinear equations and weighted norm inequalities, Trans. Amer. Math. Soc. 351 (1999) 3441–3497.
[24] N.C. Phuc, I.E. Verbitsky, Quasilinear and Hessian equations of Lane–Emden type, Ann. of Math. 168 (2008) 859–914.
[25] N.C. Phuc, I.E. Verbitsky, Local integral estimates and removable singularities for quasilinear and Hessian equations with nonlinear source terms, Comm. Partial Differential Equations 31 (2006) 1779–1791.
[26] E.T. Sawyer, A characterization of a two-weight norm inequality for maximal operator, Studia Math. 75 (1982) 1–11.
[27] J. Serrin, H. Zou, Cauchy–Liouville and universal boundedness theorems for quasilinear elliptic equations and inequalities, Acta Math. 189 (2002) 79–142.
[28] N.S. Trudinger, X.J. Wang, Hessian measures I, Topol. Methods Nonlinear Anal. 10 (1997) 225–239.
[29] N.S. Trudinger, X.J. Wang, Hessian measures II, Ann. of Math. 150 (1999) 579–604.
[30] N.S. Trudinger, X.J. Wang, Hessian measures III, J. Funct. Anal. 193 (2002) 1–23.
[31] N.S. Trudinger, X.J. Wang, On the weak continuity of elliptic operators and applications to potential theory, Amer. J. Math. 124 (2002) 369–410.
[32] I.E. Verbitsky, Superlinear equations, potential theory and weighted norm inequalities, in: Nonlinear Analysis, Function Spaces and Applications, Proc. Spring School 5, Prague, May 31–June 6, 1998, pp. 1–47.
[33] L. Véron, Elliptic equations involving measures, in: Stationary Partial Differential Equations, vol. I, Handbook of Partial Differential Equations, North-Holland, Amsterdam, 2004, pp. 593–712.


Optimal and better transport plans Mathias Beiglböck b,∗,1 , Martin Goldstern a , Gabriel Maresch a,2 , Walter Schachermayer b,3 a Institut für Diskrete Mathematik und Geometrie, Technische Universität Wien, Wiedner Hauptstraße 8-10/104,

1040 Wien, Austria b Fakultät für Mathematik, Universität Wien, Nordbergstraße 15, 1090 Wien, Austria

Received 27 May 2008; accepted 16 January 2009

Communicated by C. Villani

Abstract We consider the Monge–Kantorovich transport problem in a purely measure theoretic setting, i.e. without imposing continuity assumptions on the cost function. It is known that transport plans which are concentrated on c-monotone sets are optimal, provided the cost function c is either lower semi-continuous and finite, or continuous and may possibly attain the value ∞. We show that this is true in a more general setting, in particular for merely Borel measurable cost functions provided that {c = ∞} is the union of a closed set and a negligible set. In a previous paper Schachermayer and Teichmann considered strongly cmonotone transport plans and proved that every strongly c-monotone transport plan is optimal. We establish that transport plans are strongly c-monotone if and only if they satisfy a “better” notion of optimality called robust optimality. © 2009 Elsevier Inc. All rights reserved. Keywords: Monge–Kantorovich problem; c-Cyclically monotone; Strongly c-monotone; Measurable cost function


E-mail address: [email protected] (M. Beiglböck). 1 Supported by the Austrian Science Fund (FWF) under grant S9612. 2 Supported by the Austrian Science Fund (FWF) under grant Y328 and P18308. 3 Supported by the Austrian Science Fund (FWF) under grant P19456, from the Vienna Science and Technology Fund

(WWTF) under grant MA13 and from the Christian Doppler Research Association (CDG). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.013


M. Beiglböck et al. / Journal of Functional Analysis 256 (2009) 1907–1927

1. Introduction

We consider the Monge–Kantorovich transport problem $(\mu, \nu, c)$ for Borel probability measures $\mu$, $\nu$ on Polish spaces $X$, $Y$ and a Borel measurable cost function $c : X \times Y \to [0, \infty]$. As standard references on the theory of mass transport we mention [1,9,14,15]. By $\Pi(\mu, \nu)$ we denote the set of all probability measures on $X \times Y$ with $X$-marginal $\mu$ and $Y$-marginal $\nu$. For a Borel measurable cost function $c : X \times Y \to [0, \infty]$ the transport costs of a given transport plan $\pi \in \Pi(\mu, \nu)$ are defined by

\[
I_c[\pi] := \int_{X\times Y} c(x,y)\,d\pi. \tag{1}
\]

π is called a finite transport plan if $I_c[\pi] < \infty$. A nice interpretation of the Monge–Kantorovich transport problem is given by Cédric Villani in Chapter 3 of the impressive monograph [15]: “Consider a large number of bakeries, producing breads, that should be transported each morning to cafés where consumers will eat them. The amount of bread that can be produced at each bakery, and the amount that will be consumed at each café are known in advance, and can be modeled as probability measures (there is a “density of production” and a “density of consumption”) on a certain space, which in our case would be Paris (equipped with the natural metric such that the distance between two points is the length of the shortest path joining them). The problem is to find in practice where each unit of bread should go, in such a way as to minimize the total transport cost.” We are interested in optimal transport plans, i.e. minimizers of the functional $I_c[\cdot]$, and their characterization via the notion of c-monotonicity.

Definition 1.1. A Borel set Γ ⊆ X × Y is called c-monotone if

$$\sum_{i=1}^{n} c(x_i, y_i) \le \sum_{i=1}^{n} c(x_i, y_{i+1}) \tag{2}$$

for all pairs $(x_1, y_1), \ldots, (x_n, y_n) \in \Gamma$, using the convention $y_{n+1} := y_1$. A transport plan π is called c-monotone if there exists a c-monotone Γ with π(Γ) = 1. In the literature (e.g. [1,3,7,8,13]) the following characterization was established under various continuity assumptions on the cost function. Our main result states that those assumptions are not required.

Theorem 1. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and let c : X × Y → [0, ∞] be a Borel measurable cost function.
a. Every finite optimal transport plan is c-monotone.
b. Every finite c-monotone transport plan is optimal if there exist a closed set F and a μ ⊗ ν-null set N such that {(x, y): c(x, y) = ∞} = F ∪ N.
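For a finite set of pairs, the condition in Definition 1.1 can be checked by brute force. The following is a minimal Python sketch, not part of the paper: the function name `is_c_monotone` and the toy cost are illustrative assumptions, and the cost is assumed to be finite on the given pairs. Only cycles of distinct pairs are tested, which suffices because every cyclic sequence with repetitions decomposes into such cycles.

```python
import itertools


def is_c_monotone(pairs, c):
    """Brute-force check of inequality (2) on a finite set of pairs.

    pairs: iterable of (x, y) tuples (a finite candidate set Gamma)
    c:     cost function c(x, y), assumed finite on the given pairs
    """
    pairs = list(pairs)
    for n in range(2, len(pairs) + 1):
        # every ordered selection of n distinct pairs is one candidate cycle
        for cycle in itertools.permutations(pairs, n):
            direct = sum(c(x, y) for x, y in cycle)
            # x_i is sent to y_{i+1}, with the convention y_{n+1} = y_1
            rerouted = sum(c(cycle[i][0], cycle[(i + 1) % n][1]) for i in range(n))
            if direct > rerouted:
                return False
    return True


def squared_distance(x, y):
    return (x - y) ** 2


print(is_c_monotone([(0, 0), (1, 1), (2, 2)], squared_distance))
print(is_c_monotone([(0, 1), (1, 0)], squared_distance))
```

The search is exponential in the number of pairs, so this is only meant to make the definition concrete on toy examples.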



Thus in the case of a cost function which does not attain the value ∞, the equivalence of optimality and c-monotonicity is valid without any restrictions beyond the obvious measurability conditions inherent in the formulation of the problem. The subsequent construction due to Ambrosio and Pratelli in [1, Example 3.5] shows that if c is allowed to attain ∞, the implication “c-monotone ⇒ optimal” does not hold without some additional assumption as in Theorem 1.b.

Example 1.2 (Ambrosio and Pratelli). Let X = Y = [0, 1], equipped with Lebesgue measure λ = μ = ν. Pick α ∈ [0, 1) irrational. Set

$$\Gamma_0 = \{(x, x) : x \in X\}, \qquad \Gamma_1 = \{(x, x \oplus \alpha) : x \in X\},$$

where ⊕ is addition modulo 1. Let c : X × Y → [0, ∞] be such that c = a ∈ [0, ∞) on $\Gamma_0$, c = b ∈ [0, ∞) on $\Gamma_1$ and c = ∞ otherwise. It is then easy to check that $\Gamma_0$ and $\Gamma_1$ are c-monotone sets. Using the maps $f_0, f_1 : X \to X \times Y$, $f_0(x) = (x, x)$, $f_1(x) = (x, x \oplus \alpha)$, one defines the transport plans $\pi_0 = f_{0\#}\lambda$, $\pi_1 = f_{1\#}\lambda$ supported by $\Gamma_0$ respectively $\Gamma_1$. Then $\pi_0$ and $\pi_1$ are finite c-monotone transport plans, but as $I_c[\pi_0] = a$, $I_c[\pi_1] = b$, it depends on the choice of a and b which transport plan is optimal. Note that in contrast to the assumption in Theorem 1.b the set {(x, y) ∈ X × Y : c = ∞} is open.

We want to remark that rather trivial (folkloristic) examples show that an optimal transport plan need not exist if the cost function does not satisfy proper continuity assumptions.

Example 1.3. Consider the task to transport points on the real line (equipped with the Lebesgue measure) from the interval [0, 1) to [1, 2) where the cost of moving one point to another is the squared distance between these points ($X = [0, 1)$, $Y = [1, 2)$, $c(x, y) = (x - y)^2$, μ = ν = λ). The simplest way to achieve this transport is to shift every point by 1. This results in transport costs of 1 and one easily checks that all other transport plans are more expensive. If we now alter the cost function to be 2 whenever two points have distance 1, i.e. if we set

$$\tilde c(x, y) = \begin{cases} 2 & \text{if } y = x + 1, \\ c(x, y) & \text{otherwise,} \end{cases}$$

it becomes impossible to find a transport plan π ∈ Π(μ, ν) with total transport costs $I_{\tilde c}[\pi] = 1$, but it is still possible to achieve transport costs arbitrarily close to 1. (For instance, shift [0, 1 − ε) to [1 + ε, 2) and [1 − ε, 1) to [1, 1 + ε) for small ε > 0.)

1.1. History of the problem

The notion of c-monotonicity originates in convex analysis. The well-known Rockafellar Theorem (see for instance [11, Theorem 3] or [14, Theorem 2.27]) and its generalization, Rüschendorf’s Theorem (see [12, Lemma 2.1]), characterize c-monotonicity in $\mathbb{R}^n$ in terms of integrability. The definitions of c-concave functions and super-differentials can be found for instance in [14, Section 2.4].

Theorem (Rockafellar). A non-empty set $\Gamma \subseteq \mathbb{R}^n \times \mathbb{R}^n$ is cyclically monotone (that is, c-monotone with respect to the squared euclidean distance) if and only if there exists an l.s.c. concave function $\varphi : \mathbb{R}^n \to \mathbb{R}$ such that Γ is contained in the super-differential ∂(ϕ).



Theorem (Rüschendorf). Let X, Y be abstract spaces and c : X × Y → [0, ∞] arbitrary. Let Γ ⊆ X × Y be c-monotone. Then there exists a c-concave function ϕ : X → Y such that Γ is contained in the c-super-differential $\partial^c(\varphi)$.

Important results of Gangbo and McCann [3] and Brenier [14, Theorem 2.12] use these potentials to establish uniqueness of the solutions of the Monge–Kantorovich transport problem in $\mathbb{R}^n$ for different types of cost functions subject to certain regularity conditions.

Optimality implies c-monotonicity: This is evident in the discrete case if X and Y are finite sets. For suppose that π is a transport plan for which c-monotonicity is violated on pairs $(x_1, y_1), \ldots, (x_n, y_n)$ where all points $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ carry positive mass. Then we can reduce costs by sending the mass α > 0, for α sufficiently small, from $x_i$ to $y_{i+1}$ instead of $y_i$, that is, we replace the original transport plan π with

$$\pi_\beta = \pi + \alpha \sum_{i=1}^{n} \delta_{(x_i, y_{i+1})} - \alpha \sum_{i=1}^{n} \delta_{(x_i, y_i)}. \tag{3}$$

(Here we are using the convention $y_{n+1} = y_1$.) Gangbo and McCann [3, Theorem 2.3] show how continuity assumptions on the cost function can be exploited to extend this to an abstract setting. Hence one achieves: Let X and Y be Polish spaces equipped with Borel probability measures μ, ν. Let c : X × Y → [0, ∞] be an l.s.c. cost function. Then every finite optimal transport plan is c-monotone. Using measure theoretic tools, as developed in the beautiful paper by Kellerer [6], we are able to extend this to Borel measurable cost functions (Theorem 1.a) without any additional regularity assumption.

c-Monotonicity implies optimality: In the case of finite spaces X, Y this again is nothing more than an easy exercise [14, Exercise 2.21]. The problem gets harder in the infinite setting. It was first proved in [3] that for X, Y compact subsets of $\mathbb{R}^n$ and c a continuous cost function, c-monotonicity implies optimality. In a more general setting this was shown in [1, Theorem 3.2] for l.s.c. cost functions which additionally satisfy the moment conditions

$$\mu\Big(\Big\{x : \int_Y c(x, y)\, d\nu < \infty\Big\}\Big) > 0, \qquad \nu\Big(\Big\{y : \int_X c(x, y)\, d\mu < \infty\Big\}\Big) > 0.$$

Further research into this direction was initiated by the following problem posed by Villani in [14, Problem 2.25]: For $X = Y = \mathbb{R}^n$ and $c(x, y) = \|x - y\|^2$, the squared euclidean distance, does c-monotonicity of a transport plan imply its optimality? A positive answer to this question was given independently by Pratelli in [8] and by Schachermayer and Teichmann in [13]. Pratelli proves the result for countable spaces and shows that it extends to the Polish case by means of approximation if the cost function c : X × Y → [0, ∞]



is continuous. The paper [13] pursues a different approach: the notion of strong c-monotonicity is introduced. From this property optimality follows fairly easily, and the main part of the paper is concerned with the fact that strong c-monotonicity follows from the usual notion of c-monotonicity in the Polish setting if c is assumed to be l.s.c. and finitely valued.

Part (b) of Theorem 1 unifies these statements: Pratelli’s result follows from the fact that for continuous c : X × Y → [0, ∞] the set $\{c = \infty\} = c^{-1}[\{\infty\}]$ is closed; the Schachermayer–Teichmann result follows since for finite c the set {c = ∞} is empty.

Similar to [13] our proofs are based on the concept of strong c-monotonicity. In Section 1.2 we present robust optimality, which is a variant of optimality that we shall show to be equivalent to strong c-monotonicity. As not every optimal transport plan is also robustly optimal, this accounts for the somewhat provocative concept of “better than optimal” transport plans alluded to in the title of this paper. Correspondingly the notion of strong c-monotonicity is in fact stronger than ordinary c-monotonicity (at least if c is allowed to assume the value ∞).

1.2. Strong notions

It turns out that optimality of a transport plan is intimately connected with the notion of strong c-monotonicity introduced in [13].

Definition 1.4. A Borel set Γ ⊆ X × Y is strongly c-monotone if there exist Borel measurable functions ϕ : X → [−∞, ∞) and ψ : Y → [−∞, ∞) such that ϕ(x) + ψ(y) ≤ c(x, y) for all (x, y) ∈ X × Y and ϕ(x) + ψ(y) = c(x, y) for all (x, y) ∈ Γ. A transport plan π ∈ Π(μ, ν) is strongly c-monotone if π is concentrated on a strongly c-monotone Borel set Γ.

Strong c-monotonicity implies c-monotonicity since

$$\sum_{i=1}^{n} c(x_{i+1}, y_i) \ge \sum_{i=1}^{n} \big[\varphi(x_{i+1}) + \psi(y_i)\big] = \sum_{i=1}^{n} \big[\varphi(x_i) + \psi(y_i)\big] = \sum_{i=1}^{n} c(x_i, y_i) \tag{4}$$

whenever $(x_1, y_1), \ldots, (x_n, y_n) \in \Gamma$. If there are integrable functions ϕ and ψ witnessing that π is strongly c-monotone, then for every $\tilde\pi \in \Pi(\mu, \nu)$ we can estimate:

$$I_c[\pi] = \int_{\Gamma} c(x, y)\, d\pi = \int_{\Gamma} \big[\varphi(x) + \psi(y)\big]\, d\pi = \int_{X} \varphi(x)\, d\mu + \int_{Y} \psi(y)\, d\nu = \int_{X \times Y} \big[\varphi(x) + \psi(y)\big]\, d\tilde\pi \le I_c[\tilde\pi].$$

Thus in this case strong c-monotonicity implies optimality. However, there is no reason why the Borel measurable functions ϕ, ψ appearing in Definition 1.4 should be integrable. In [13, Proposition 2.1] it is shown that for l.s.c. cost functions there is a way of truncating which also allows one to handle non-integrable functions ϕ and ψ. The proof extends to merely Borel measurable functions; hence we have:



Proposition 1.5. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and let c : X × Y → [0, ∞] be Borel measurable. Then every finite transport plan which is strongly c-monotone is optimal.

No new ideas are required to extend [13, Proposition 2.1] to the present setting, but since Proposition 1.5 is a crucial ingredient of several proofs in this paper we provide an outline of the argument in Section 3. As it will turn out, strongly c-monotone transport plans even satisfy a “better” notion of optimality, called robust optimality.

Definition 1.6. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and let c : X × Y → [0, ∞] be a Borel measurable cost function. A transport plan π ∈ Π(μ, ν) is robustly optimal if, for any Polish space Z and any finite Borel measure λ ≥ 0 on Z, there exists a Borel measurable extension $\tilde c : (X \cup Z) \times (Y \cup Z) \to [0, \infty]$ satisfying

$$\tilde c(a, b) = \begin{cases} c(a, b) & \text{for } a \in X,\ b \in Y, \\ 0 & \text{for } a, b \in Z, \end{cases}$$

[…]

The main ingredient in the proof of Proposition 2.1 is the following duality theorem due to Kellerer (see [6, Lemma 1.8(a), Corollary 2.18]).

Theorem (Kellerer). Let $X_1, \ldots, X_n$, n ≥ 2, be Polish spaces equipped with Borel probability measures $\mu_1, \ldots, \mu_n$ and assume that $c : X = X_1 \times \cdots \times X_n \to \mathbb{R}$ is Borel measurable and that $\overline{c} := \sup_X c$, $\underline{c} := \inf_X c$ are finite. Set

$$I(c) = \inf\Big\{ \int_X c\, d\pi : \pi \in \Pi(\mu_1, \ldots, \mu_n) \Big\},$$

$$S(c) = \sup\Big\{ \sum_{i=1}^{n} \int_{X_i} \varphi_i\, d\mu_i : c(x_1, \ldots, x_n) \ge \sum_{i=1}^{n} \varphi_i(x_i),\ \frac{\overline{c}}{n} - (\overline{c} - \underline{c}) \le \varphi_i \le \frac{\overline{c}}{n} \Big\}.$$

Then I(c) = S(c).

Proof of Proposition 2.1. Observe that $-I(-1_B) = P(B)$ and that

$$-S(-1_B) = \inf\Big\{ \sum_{i=1}^{n} \int_{X_i} \chi_i\, d\mu_i : 1_B(x_1, \ldots, x_n) \le \sum_{i=1}^{n} \chi_i(x_i),\ 0 \le \chi_i \le 1 \Big\}. \tag{7}$$


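By Kellerer’s Theorem $-S(-1_B) = -I(-1_B)$. Thus it remains to show that $-S(-1_B) \ge \frac{1}{n} L(B)$. Fix functions $\chi_1, \ldots, \chi_n$ as in (7). Then for $(x_1, \ldots, x_n) \in B$ one has $1 = 1_B(x_1, \ldots, x_n) \le \sum_{i=1}^{n} \chi_i(x_i)$ and hence there exists some i such that $\chi_i(x_i) \ge \frac{1}{n}$. Thus $B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}\big[\{\chi_i \ge \tfrac{1}{n}\}\big]$. It follows that

$$-S(-1_B) \ge \inf\Big\{ \sum_{i=1}^{n} \int_{X_i} \chi_i\, d\mu_i : B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}\big[\{\chi_i \ge \tfrac{1}{n}\}\big],\ 0 \le \chi_i \le 1 \Big\}$$

$$\ge \inf\Big\{ \sum_{i=1}^{n} \frac{1}{n}\, \mu_i\big(\{\chi_i \ge \tfrac{1}{n}\}\big) : B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}\big[\{\chi_i \ge \tfrac{1}{n}\}\big] \Big\} \ge \frac{1}{n} L(B).$$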

From this we deduce that either L(B) = 0 or there exists $\pi \in \Pi(\mu_1, \ldots, \mu_n)$ such that π(B) > 0. The last assertion of Proposition 2.1 now follows from the following lemma due to Richárd Balka and Márton Elekes (private communication). □

Lemma 2.2. Suppose that L(B) = 0 for a Borel set $B \subseteq X_1 \times \cdots \times X_n$. Then B is an L-shaped null set.

Proof. Fix ε > 0 and Borel sets $B_1^{(k)}, \ldots, B_n^{(k)}$ with $\mu_i(B_i^{(k)}) \le \varepsilon 2^{-k}$ such that for each k

$$B \subseteq p_{X_1}^{-1}\big[B_1^{(k)}\big] \cup \cdots \cup p_{X_n}^{-1}\big[B_n^{(k)}\big].$$

Let $B_i := \bigcup_{k=1}^{\infty} B_i^{(k)}$ for i = 2, . . . , n, such that

$$B \subseteq p_{X_1}^{-1}\big[B_1^{(k)}\big] \cup p_{X_2}^{-1}[B_2] \cup \cdots \cup p_{X_n}^{-1}[B_n]$$

for each k ∈ N. Thus with $B_1 := \bigcap_{k=1}^{\infty} B_1^{(k)}$,

$$B \subseteq p_{X_1}^{-1}[B_1] \cup p_{X_2}^{-1}[B_2] \cup \cdots \cup p_{X_n}^{-1}[B_n].$$

Hence we can assume from now on that $\mu_1(B_1) = 0$ and that $\mu_i(B_i)$ is arbitrarily small for i = 2, . . . , n. Iterating this argument in the obvious way we get the statement. □

Remark 2.3. In the case n = 2 it was shown in [6, Proposition 3.3] that L(B) = P(B) for every Borel set $B \subseteq X_1 \times X_2$. However, for n > 2, equality does not hold true, cf. [6, Example 3.4].

Definition 2.4. Let X, Y be Polish spaces. For a Borel measurable cost function c : X × Y → [0, ∞], n ∈ N and ε > 0 we set

$$B_{n,\varepsilon} := \Big\{ (x_i, y_i)_{i=1}^{n} \in (X \times Y)^n : \sum_{i=1}^{n} c(x_i, y_i) \ge \sum_{i=1}^{n} c(x_i, y_{i+1}) + \varepsilon \Big\}. \tag{8}$$



The definition of the sets $B_{n,\varepsilon}$ is implicitly given in [3, Theorem 2.3]. The idea behind it is that $(x_i, y_i)_{i=1}^{n} \in B_{n,\varepsilon}$ tells us that transport costs can be reduced if “$x_i$ is transported to $y_{i+1}$ instead of $y_i$” (recall the conventions $x_{n+1} = x_1$ resp. $y_{n+1} = y_1$). In what follows we make this statement precise and give a coordinate free formulation. Denote by $\sigma, \tau : (X \times Y)^n \to (X \times Y)^n$ the shifts defined via

$$\sigma : (x_i, y_i)_{i=1}^{n} \mapsto (x_{i+1}, y_{i+1})_{i=1}^{n}, \tag{9}$$

$$\tau : (x_i, y_i)_{i=1}^{n} \mapsto (x_i, y_{i+1})_{i=1}^{n}. \tag{10}$$
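The two shifts in (9) and (10) can be made concrete on finite tuples. The following minimal Python sketch is an illustration only; the function names `sigma` and `tau` and the placeholder labels are assumptions of this sketch. It models an element of $(X \times Y)^n$ as a tuple of pairs and uses cyclic indexing for the conventions $x_{n+1} = x_1$, $y_{n+1} = y_1$.

```python
def sigma(t):
    """Shift (9): move every pair one position, (x_i, y_i) -> (x_{i+1}, y_{i+1})."""
    return t[1:] + t[:1]


def tau(t):
    """Shift (10): keep x_i and pair it with y_{i+1} (cyclically)."""
    n = len(t)
    return tuple((t[i][0], t[(i + 1) % n][1]) for i in range(n))


def iterate(f, t, times):
    """Apply the map f to t the given number of times."""
    for _ in range(times):
        t = f(t)
    return t


# hypothetical labels standing in for points of X and Y
t = (("x1", "y1"), ("x2", "y2"), ("x3", "y3"))
print(sigma(tau(t)) == tau(sigma(t)))
print(iterate(sigma, t, 3) == t, iterate(tau, t, 3) == t)
```

On this toy tuple one can check by hand that applying either shift n = 3 times returns the original tuple and that the order in which the two shifts are applied does not matter.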

Observe that $\sigma^n = \tau^n = \mathrm{Id}_{(X \times Y)^n}$ and that σ and τ commute. Also note that the set $B_{n,\varepsilon}$ from (8) is σ-invariant (i.e. $\sigma(B_{n,\varepsilon}) = B_{n,\varepsilon}$), but in general not τ-invariant. Denote by $p_i : (X \times Y)^n \to X \times Y$ the projection on the ith component of the product. The projections $p_X : X \times Y \to X$, (x, y) ↦ x and $p_Y : X \times Y \to Y$, (x, y) ↦ y are defined as usual and there will be no danger of confusion.

Lemma 2.5. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν. Let π be a transport plan. Then one of the following alternatives holds:
a. π is c-monotone,
b. there exist n ∈ N, ε > 0 and a measure κ ∈ Π(π, . . . , π) such that $\kappa(B_{n,\varepsilon}) > 0$. Moreover κ can be taken to be both σ and τ invariant.

Proof. Suppose that $B_{n,\varepsilon}$ is an L-shaped null set for all n ∈ N and every ε > 0. Then there are Borel sets $S_{n,\varepsilon}^{1}, \ldots, S_{n,\varepsilon}^{n} \subseteq X \times Y$ of full π-measure such that

$$\big(S_{n,\varepsilon}^{1} \times \cdots \times S_{n,\varepsilon}^{n}\big) \cap B_{n,\varepsilon} = \emptyset$$

and π is concentrated on the c-monotone set

$$S = \bigcap_{k=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcap_{i=1}^{n} S_{n,1/k}^{i}.$$

If there exist n ∈ N and ε > 0 such that $B_{n,\varepsilon}$ is not an L-shaped null set, we apply Proposition 2.1 to conclude the existence of a measure κ ∈ Π(π, . . . , π) with $\kappa(B_{n,\varepsilon}) > 0$. To achieve the desired invariance, simply replace κ by

$$\frac{1}{n^2} \sum_{i,j=1}^{n} \big(\sigma^i \circ \tau^j\big)_{\#} \kappa. \tag{11}$$

□

We are now in the position to prove Theorem 1.a, i.e. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and let c : X × Y → [0, ∞] be a Borel measurable cost function. If π is a finite optimal transport plan, then π is c-monotone. Proof. Suppose by contradiction that π is optimal, Ic [π] < ∞ but π is not c-monotone. Then by Lemma 2.5 there exist n ∈ N, ε > 0 and an invariant measure κ ∈ Π(π, . . . , π) which gives



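mass α > 0 to the Borel set $B_{n,\varepsilon} \subseteq (X \times Y)^n$. Consider now the restriction of κ to $B_{n,\varepsilon}$ defined via $\hat\kappa(A) := \kappa(A \cap B_{n,\varepsilon})$ for Borel sets $A \subseteq (X \times Y)^n$. $\hat\kappa$ is σ-invariant since both the measure κ and the Borel set $B_{n,\varepsilon}$ are σ-invariant. Denote the marginal of $\hat\kappa$ in the first coordinate (X × Y) of $(X \times Y)^n$ by $\hat\pi$. Due to σ-invariance we have

$$p_{i\#}\hat\kappa = p_{i\#}(\sigma_{\#}\hat\kappa) = (p_i \circ \sigma)_{\#}\hat\kappa = p_{i+1\,\#}\hat\kappa,$$

i.e. all marginals coincide and we have $\hat\kappa \in \Pi(\hat\pi, \ldots, \hat\pi)$. Furthermore, since $\hat\kappa \le \kappa$, the same is true for the marginals, i.e. $\hat\pi \le \pi$. Denote the marginal of $\tau_{\#}\hat\kappa$ in the first coordinate (X × Y) of $(X \times Y)^n$ by $\hat\pi_\beta$. As σ and τ commute, $\tau_{\#}\hat\kappa$ is σ-invariant, so the marginals in the other coordinates coincide with $\hat\pi_\beta$. An easy calculation shows that $\hat\pi$ and $\hat\pi_\beta$ have the same marginals in X resp. Y:

$$p_{X\#}\hat\pi_\beta = p_{X\#}\, p_{i\#}(\tau_{\#}\hat\kappa) = (p_X \circ p_i \circ \tau)_{\#}\hat\kappa = (p_X \circ p_i)_{\#}\hat\kappa = p_{X\#}\hat\pi,$$

$$p_{Y\#}\hat\pi_\beta = p_{Y\#}\, p_{i\#}(\tau_{\#}\hat\kappa) = (p_Y \circ p_i \circ \tau)_{\#}\hat\kappa = (p_Y \circ p_{i+1})_{\#}\hat\kappa = p_{Y\#}\hat\pi.$$

The equality of the total masses is proved similarly:

$$\alpha = \hat\pi_\beta(X \times Y) = (p_i \circ \tau)_{\#}\hat\kappa(X \times Y) = p_{i\#}\hat\kappa(X \times Y) = \hat\pi(X \times Y).$$

Next we compute the transport costs associated to $\hat\pi_\beta$:

$$\int_{X \times Y} c\, d\hat\pi_\beta = \int_{(X \times Y)^n} c \circ p_1\, d(\tau_{\#}\hat\kappa) \qquad \text{(marginal property)}$$

$$= \frac{1}{n} \sum_{i=1}^{n} \int_{(X \times Y)^n} c \circ p_i\, d(\tau_{\#}\hat\kappa) \qquad (\sigma\text{-invariance})$$

$$= \frac{1}{n} \sum_{i=1}^{n} \int_{(X \times Y)^n} (c \circ p_i \circ \tau)\, d\hat\kappa \qquad \text{(push-forward)}$$

$$= \frac{1}{n} \sum_{i=1}^{n} \int_{B_{n,\varepsilon}} (c \circ p_i \circ \tau)\, d\kappa \qquad (\text{definition of } \hat\kappa)$$

$$\le \frac{1}{n} \int_{B_{n,\varepsilon}} \Big[ \sum_{i=1}^{n} (c \circ p_i) - \varepsilon \Big]\, d\kappa \qquad (\text{definition of } B_{n,\varepsilon})$$

$$= \int_{X \times Y} c\, d\hat\pi - \varepsilon \frac{\alpha}{n} \qquad (\text{definition of } \hat\pi).$$

To improve the transport plan π we define

$$\pi_\beta := (\pi - \hat\pi) + \hat\pi_\beta. \tag{12}$$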



Recall that $\pi - \hat\pi$ is a positive measure, so $\pi_\beta$ is a positive measure. As $\hat\pi$ and $\hat\pi_\beta$ have the same total mass, $\pi_\beta$ is a probability measure. Furthermore $\hat\pi$ and $\hat\pi_\beta$ have the same marginals, so $\pi_\beta$ is indeed a transport plan. We have

$$I_c[\pi_\beta] = I_c[\pi] + \int_{X \times Y} c\, d(\hat\pi_\beta - \hat\pi) \le I_c[\pi] - \varepsilon \frac{\alpha}{n} < I_c[\pi]. \tag{13}$$

□
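In the finite setting, the improvement step in (12) and (13) reduces to the rerouting described in (3). The following minimal Python sketch illustrates this on a toy example; the helper names `transport_cost`, `marginals` and `reroute` and the data are assumptions of this sketch and do not come from the paper. A plan is stored as a dictionary mapping pairs (x, y) to mass, and the mass α is assumed not to exceed the mass on any pair of the cycle.

```python
def transport_cost(plan, cost):
    """Total cost of a discrete plan given as {(x, y): mass}."""
    return sum(mass * cost[x][y] for (x, y), mass in plan.items())


def marginals(plan):
    """Return the X-marginal and the Y-marginal of a discrete plan."""
    mu, nu = {}, {}
    for (x, y), mass in plan.items():
        mu[x] = mu.get(x, 0.0) + mass
        nu[y] = nu.get(y, 0.0) + mass
    return mu, nu


def reroute(plan, cycle, alpha):
    """Move mass alpha from (x_i, y_i) to (x_i, y_{i+1}) along the cycle, as in (3)."""
    new_plan = dict(plan)
    n = len(cycle)
    for i, (x_i, y_i) in enumerate(cycle):
        y_next = cycle[(i + 1) % n][1]  # convention y_{n+1} = y_1
        new_plan[(x_i, y_i)] = new_plan.get((x_i, y_i), 0.0) - alpha
        new_plan[(x_i, y_next)] = new_plan.get((x_i, y_next), 0.0) + alpha
    # drop pairs that no longer carry mass
    return {pair: mass for pair, mass in new_plan.items() if mass > 0}


# toy data: the diagonal plan violates c-monotonicity for this cost
cost = [[1.0, 0.0], [0.0, 1.0]]
plan = {(0, 0): 0.5, (1, 1): 0.5}
better = reroute(plan, [(0, 0), (1, 1)], alpha=0.5)
print(transport_cost(plan, cost), transport_cost(better, cost))
print(marginals(plan) == marginals(better))
```

The rerouted plan keeps both marginals and has strictly smaller cost, which is the discrete counterpart of inequality (13).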

3. Connecting c-monotonicity and strong c-monotonicity

The Ambrosio–Pratelli example (Example 1.2) shows that c-monotonicity need not imply strong c-monotonicity in general. Subsequently we shall present a condition which ensures that this implication is valid. A c-monotone transport plan resists the attempt of enhancement by means of cyclical rerouting. This, however, may be due to the fact that cyclical rerouting is a priori impossible due to infinite transport costs on certain routes.

Continuing Villani’s interpretation, a situation where rerouting in this consortium of bakeries and cafés is possible in a satisfactory way is as follows: Suppose that bakery $x = x_0$ is able to produce one more croissant than it already does and that café $\tilde y$ is short of one croissant. It might not be possible to transport the additional croissant itself to the café in need, as the costs $c(x, \tilde y)$ may be infinite. Nevertheless it might be possible to find another bakery $x_1$ (which usually supplies café $y_1$) such that bakery x can transport (with finite costs!) the extra croissant to $y_1$; this leaves us with a now unused item from bakery $x_1$, which can be transported to $\tilde y$ with finite costs. Of course we allow not only one, but finitely many intermediate pairs $(x_1, y_1), \ldots, (x_n, y_n)$ of bakeries/cafés to achieve this relocation of the additional croissant.

In the Ambrosio–Pratelli example we can reroute from a point $(x, x \oplus \alpha) \in \Gamma_1$ to a point $(\tilde x, \tilde x \oplus \alpha) \in \Gamma_1$ only if there exists n ∈ N such that $x \oplus (n\alpha) = \tilde x$. In particular, irrationality of α implies that if we can redirect with finite costs from $(x, x \oplus \alpha)$ to $(\tilde x, \tilde x \oplus \alpha)$ we never can redirect back from $(\tilde x, \tilde x \oplus \alpha)$ to $(x, x \oplus \alpha)$.

Definition 3.1. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν, let c : X × Y → [0, ∞] be a Borel measurable cost function and Γ ⊆ X × Y a Borel measurable set on which c is finite. We define
a. $(x, y) \lesssim (\tilde x, \tilde y)$ if there exist pairs $(x_0, y_0), \ldots, (x_n, y_n) \in \Gamma$ such that $(x, y) = (x_0, y_0)$ and $(\tilde x, \tilde y) = (x_n, y_n)$ and $c(x_1, y_0), \ldots, c(x_n, y_{n-1}) < \infty$.
b. $(x, y) \approx (\tilde x, \tilde y)$ if $(x, y) \lesssim (\tilde x, \tilde y)$ and $(x, y) \gtrsim (\tilde x, \tilde y)$.
We call (Γ, c) connecting if c is finite on Γ and $(x, y) \approx (\tilde x, \tilde y)$ for all $(x, y), (\tilde x, \tilde y) \in \Gamma$.

These relations were introduced in [15, Chapter 5, p. 75] and appear in a construction due to Stefano Bianchini. When there is any danger of confusion we will write $\lesssim_{c,\Gamma}$ and $\approx_{c,\Gamma}$, indicating the dependence on Γ and c. Note that $\lesssim$ is a pre-order, i.e. a transitive and reflexive relation, and that ≈ is an equivalence relation. We will also need the projections $\lesssim_X$, $\approx_X$ resp. $\lesssim_Y$, $\approx_Y$ of these relations onto the set $p_X[\Gamma] \subseteq X$ resp. $p_Y[\Gamma] \subseteq Y$. The projection is defined in the obvious way: $x \lesssim_X \tilde x$ if there exist $y, \tilde y$ such that $(x, y), (\tilde x, \tilde y) \in \Gamma$ and $(x, y) \lesssim (\tilde x, \tilde y)$ holds.
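For a finite set Γ the relations of Definition 3.1 amount to reachability in a directed graph: one can step from the pair $(x_i, y_i)$ to the pair $(x_j, y_j)$ whenever $c(x_j, y_i) < \infty$. The following minimal Python sketch computes the pre-order by a transitive closure; the function names `reachability` and `is_connecting`, the index-based encoding and the toy cost matrices are assumptions of this illustration.

```python
import math


def reachability(cost, gamma):
    """reach[i][j] is True iff pair i precedes pair j in the pre-order of Definition 3.1.

    cost:  matrix with cost[x][y] in [0, inf], finite on gamma
    gamma: list of index pairs (x, y)
    """
    m = len(gamma)
    # one step from pair i to pair j is allowed iff c(x_j, y_i) is finite
    reach = [[cost[gamma[j][0]][gamma[i][1]] < math.inf for j in range(m)]
             for i in range(m)]
    for i in range(m):
        reach[i][i] = True  # reflexivity
    for k in range(m):  # Warshall-style transitive closure
        for i in range(m):
            for j in range(m):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach


def is_connecting(cost, gamma):
    """True iff any two pairs of gamma are equivalent, i.e. reachable both ways."""
    reach = reachability(cost, gamma)
    m = len(gamma)
    return all(reach[i][j] and reach[j][i] for i in range(m) for j in range(m))


inf = math.inf
gamma = [(0, 0), (1, 1), (2, 2)]
one_way = [[0, inf, inf], [1, 0, inf], [inf, 1, 0]]  # rerouting only "forwards"
finite = [[0, 1, 4], [1, 0, 1], [4, 1, 0]]           # all routes have finite cost
print(is_connecting(one_way, gamma), is_connecting(finite, gamma))
```

The first cost matrix is a finite caricature of the one-directional rerouting in the Ambrosio–Pratelli example: every pair can be rerouted forwards but never back, so (Γ, c) is not connecting.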



The other relations are defined analogously. The projections of $\lesssim$ are again pre-orders and the projections of ≈ are again equivalence relations, provided c is finite on Γ. The equivalence classes of ≈ and its projections are compatible in the sense that $[(x, y)]_{\approx} = ([x]_{\approx_X} \times [y]_{\approx_Y}) \cap \Gamma$. The elementary proofs of these facts are left to the reader. The main objective of this section is to prove Proposition 3.2, based on several lemmas which will be introduced throughout the section.

Proposition 3.2. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and let c : X × Y → [0, ∞] be a Borel measurable cost function. Let π be a finite transport plan. Assume that there exists a c-monotone set Γ ⊆ X × Y with π(Γ) = 1 on which c is finite, such that (Γ, c) is connecting. Then π is strongly c-monotone.

In the proof of Proposition 3.2 we will establish the existence of the functions ϕ, ψ using the construction given in [12], see also [14, Chapter 2] and [1, Theorem 3.2]. As we do not impose any continuity assumptions on the cost function c, we cannot prove the Borel measurability of ϕ and ψ by using limiting procedures similar to the methods used in [1,12–14]. Instead we will use the following projection theorem, a proof of which can be found in [2, Theorem III.23] by analysts or in [5, Section 29.B] by readers who have some interest in set theory.

Proposition 3.3.⁴ Let X and Y be Polish spaces, A ⊆ X a Borel measurable set and f : X → Y a Borel measurable map. Then B := f(A) is universally measurable, i.e. B is measurable with respect to the completion of every σ-finite Borel measure on Y.

The system of universally measurable sets is a σ-algebra. If X is a Polish space, we call a function f : X → [−∞, ∞] universally measurable if the pre-image of every Borel set is universally measurable.

Lemma 3.4. Let X be a Polish space and μ a finite Borel measure on X. If ϕ : X → [−∞, ∞) is universally measurable, then there exists a Borel measurable function $\tilde\varphi : X \to [-\infty, \infty)$ such that $\tilde\varphi \le \varphi$ everywhere and $\varphi = \tilde\varphi$ almost everywhere.

Proof. Let $(I_n)_{n=1}^{\infty}$ be an enumeration of the intervals [a, b) with endpoints in Q and denote the completion of μ by $\tilde\mu$. Then for each n ∈ N, $\varphi^{-1}[I_n]$ is $\tilde\mu$-measurable and hence the union of a Borel set $B_n$ and a $\tilde\mu$-null set $N_n$. Let N be a Borel null set which covers $\bigcup_{n=1}^{\infty} N_n$. Let $\tilde\varphi(x) = \varphi(x) - \infty \cdot 1_N(x)$. Clearly $\tilde\varphi(x) \le \varphi(x)$ for all x ∈ X and $\varphi(x) = \tilde\varphi(x)$ for $\tilde\mu$-almost all x ∈ X. Furthermore, $\tilde\varphi$ is Borel measurable since $(I_n)_{n=1}^{\infty}$ is a generator of the Borel σ-algebra on [−∞, ∞) and for each n ∈ N we have that $\tilde\varphi^{-1}[I_n] = B_n \setminus N$ is a Borel set. □

The following definition of the functions $\varphi_n$, n ∈ N, resp. ϕ is reminiscent of the construction in [12].

4 Sets which are images of Borel sets under measurable functions are called analytic in descriptive set theory. Lusin first noticed that analytic sets are universally measurable. Details can be found for instance in [5].



Lemma 3.5. Let X, Y be Polish spaces, c : X × Y → [0, ∞] a Borel measurable cost function and Γ ⊆ X × Y a Borel set. Fix $(x_0, y_0) \in \Gamma$ and assume that c is finite on Γ. For n ∈ N, define $\varphi_n : X \times \Gamma^n \to (-\infty, \infty]$ by

$$\varphi_n(x; x_1, y_1, \ldots, x_n, y_n) = c(x, y_n) - c(x_n, y_n) + \sum_{i=0}^{n-1} \big[ c(x_{i+1}, y_i) - c(x_i, y_i) \big]. \tag{14}$$

Then the map ϕ : X → [−∞, ∞] defined by

$$\varphi(x) = \inf\big\{ \varphi_n(x; x_1, y_1, \ldots, x_n, y_n) : n \ge 1,\ (x_i, y_i)_{i=1}^{n} \in \Gamma^n \big\} \tag{15}$$

is universally measurable.

Proof. First note that the Borel σ-algebra on [−∞, ∞] is generated by intervals of the form [−∞, α), thus it is sufficient to determine the pre-images of those sets under ϕ. We have

$$\varphi(x) < \alpha \iff \exists n \in \mathbb{N}\ \exists (x_1, y_1), \ldots, (x_n, y_n) \in \Gamma : \varphi_n(x; x_1, y_1, \ldots, x_n, y_n) < \alpha.$$

The set $\varphi_n^{-1}[[-\infty, \alpha)]$ is Borel measurable. Hence

$$\varphi^{-1}\big[[-\infty, \alpha)\big] = \bigcup_{n \in \mathbb{N}} p_X\big[\varphi_n^{-1}[[-\infty, \alpha)]\big]$$

is the countable union of projections of Borel sets. Since projections of Borel sets are universally measurable by Proposition 3.3, $\varphi^{-1}[[-\infty, \alpha)]$ belongs also to the σ-algebra of universally measurable sets. □

Lemma 3.6. Let X, Y be Polish spaces and c : X × Y → [0, ∞] a Borel measurable cost function. Suppose Γ is c-monotone, c is finite on Γ and (Γ, c) is connecting. Fix $(x_0, y_0) \in \Gamma$. Then the map ϕ from (15) is finite on $p_X[\Gamma]$. Furthermore

$$\varphi(x) \le \varphi(x') + c(x, y) - c(x', y) \qquad \forall x \in X,\ (x', y) \in \Gamma. \tag{16}$$

Proof. Fix $x \in p_X[\Gamma]$. Since $x_0 \lesssim x$ (recall Definition 3.1), we can find $x_1, y_1, \ldots, x_n, y_n$ such that $\varphi_n(x; x_1, y_1, \ldots, x_n, y_n) < \infty$. Hence ϕ(x) < ∞. Proving ϕ(x) > −∞ involves some wrestling with notation but, not very surprisingly, it comes down to applying the fact that $x \lesssim x_0$. Let $a_1 = x$ and choose $b_1, a_2, b_2, \ldots, a_m, b_m$ such that $(a_1, b_1), \ldots, (a_m, b_m) \in \Gamma$ and $c(a_2, b_1), \ldots, c(a_m, b_{m-1}), c(x_0, b_m) < \infty$. Assume now that $x_1, y_1, \ldots, x_n, y_n$ are given such that $\varphi_n(x; x_1, y_1, \ldots, x_n, y_n) < \infty$. Put $x_{n+i} = a_i$ and $y_{n+i} = b_i$ for i ∈ {1, . . . , m}. Due to c-monotonicity of Γ and the finiteness of all involved terms we have:

$$0 \le c(x_0, y_{n+m}) - c(x_{n+m}, y_{n+m}) + \sum_{i=0}^{n+m-1} \big[ c(x_{i+1}, y_i) - c(x_i, y_i) \big],$$

which, after regrouping, yields

$$\alpha := -\Big[ c(x_0, b_m) - c(a_m, b_m) + \sum_{i=1}^{m-1} \big( c(a_{i+1}, b_i) - c(a_i, b_i) \big) \Big] \le c(x, y_n) - c(x_n, y_n) + \sum_{i=0}^{n-1} \big[ c(x_{i+1}, y_i) - c(x_i, y_i) \big]. \tag{17}$$

Note that the right-hand side of (17) is just $\varphi_n(x; x_1, y_1, \ldots, x_n, y_n)$. Thus passing to the infimum we see that ϕ(x) ≥ α > −∞. To prove the remaining inequality, observe that the right-hand side of (16) can be written as

$$\inf\big\{ \varphi_n(x; x_1, y_1, \ldots, x_n, y_n) : n \ge 1,\ (x_i, y_i)_{i=1}^{n} \in \Gamma^n \text{ and } (x_n, y_n) = (x', y) \big\},$$

whereas the left-hand side of (16) is the same, without the restriction $(x_n, y_n) = (x', y)$. □

Lemma 3.7. Let X, Y be Polish spaces and c : X × Y → [0, ∞] a Borel measurable cost function. Let $X_0 \subseteq X$ be a non-empty Borel set and let $\varphi : X_0 \to \mathbb{R}$ be a Borel measurable function. Then the c-transform ψ : Y → [−∞, ∞), defined as

$$\psi(y) := \inf_{x \in X_0} \big[ c(x, y) - \varphi(x) \big], \tag{18}$$

is universally measurable.

Proof. As in the proof of Lemma 3.4 we consider the set $\psi^{-1}[[-\infty, \alpha)]$:

$$\psi(y) < \alpha \iff \exists x \in X_0 : c(x, y) - \varphi(x) < \alpha.$$

Note that the set $\{(x, y) \in X_0 \times Y : c(x, y) - \varphi(x) < \alpha\}$ is Borel. Thus

$$\psi^{-1}\big[[-\infty, \alpha)\big] = p_Y\big[\{(x, y) \in X_0 \times Y : c(x, y) - \varphi(x) < \alpha\}\big]$$

is the projection of a Borel set, hence universally measurable. □
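For a finite c-monotone set Γ, the infimum in (15) is a shortest-path problem and the c-transform (18) is a finite minimum, so the potentials can be computed directly. The following minimal Python sketch does this under simplifying assumptions that are not part of the paper: all costs are finite, Γ is given as a list of index pairs, and the function name `rueschendorf_potentials` is illustrative. The step from pair i to pair j has weight $c(x_j, y_i) - c(x_i, y_i)$, matching the summands in (14), and c-monotonicity rules out negative cycles, so Bellman–Ford relaxation suffices.

```python
import math


def rueschendorf_potentials(cost, gamma, base=0):
    """Compute phi as in (15) and its c-transform psi as in (18) on finite data.

    cost:  matrix of finite costs, cost[x][y]
    gamma: list of index pairs (x, y), assumed c-monotone
    base:  index in gamma of the fixed pair (x_0, y_0)
    """
    m = len(gamma)
    # dist[j]: cheapest chain from (x_0, y_0) to (x_j, y_j), the sum in (14)
    dist = [math.inf] * m
    dist[base] = 0.0
    for _ in range(m - 1):  # Bellman-Ford relaxation rounds
        for i, (x_i, y_i) in enumerate(gamma):
            for j, (x_j, _) in enumerate(gamma):
                step = cost[x_j][y_i] - cost[x_i][y_i]
                if dist[i] + step < dist[j]:
                    dist[j] = dist[i] + step
    n_x, n_y = len(cost), len(cost[0])
    # phi(x): append the final term c(x, y_n) - c(x_n, y_n) of (14) and minimise
    phi = [min(dist[j] + cost[x][y_j] - cost[x_j][y_j]
               for j, (x_j, y_j) in enumerate(gamma))
           for x in range(n_x)]
    # psi(y): infimum over X_0 = p_X[Gamma], as in (18)
    x0_set = sorted({x for x, _ in gamma})
    psi = [min(cost[x][y] - phi[x] for x in x0_set) for y in range(n_y)]
    return phi, psi


# toy data: squared distance on three points, Gamma the diagonal
cost = [[(x - y) ** 2 for y in range(3)] for x in range(3)]
gamma = [(0, 0), (1, 1), (2, 2)]
phi, psi = rueschendorf_potentials(cost, gamma)
print(phi, psi)
```

On this toy example one can verify that ϕ(x) + ψ(y) ≤ c(x, y) for all index pairs, with equality on Γ, which is exactly the condition of Definition 1.4.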

We are now able to prove the main result of this section.

Proof of Proposition 3.2. Let Γ ⊆ X × Y be a c-monotone Borel set such that π(Γ) = 1 and the pair (Γ, c) is connecting. Let ϕ be the map from Lemma 3.5. Using Lemmas 3.4 and 3.6, and eventually passing to a subset of full π-measure, we may assume that ϕ is Borel measurable, that $X_0 := p_X[\Gamma]$ is a Borel set and that

$$c(x', y) - \varphi(x') \le c(x, y) - \varphi(x) \qquad \forall x \in X_0,\ (x', y) \in \Gamma. \tag{19}$$

Note that (19) follows from (16) in Lemma 3.6. Here we consider $x \in X_0$ in order to ensure that ϕ(x) is finite on $X_0$. Now consider the c-transform

$$\psi(y) := \inf_{x \in X_0} \big[ c(x, y) - \varphi(x) \big], \tag{20}$$



which by Lemma 3.7 is universally measurable. Fix $y \in p_Y[\Gamma]$. Using (19) we see that the infimum in (20) is attained at a point $x_0 \in X_0$ satisfying $(x_0, y) \in \Gamma$. This implies that ϕ(x) + ψ(y) = c(x, y) on Γ and ϕ(x) + ψ(y) ≤ c(x, y) on $p_X[\Gamma] \times p_Y[\Gamma]$. To guarantee this inequality on the whole product X × Y, one has to redefine ϕ and ψ to be −∞ on the complement of $p_X[\Gamma]$ resp. $p_Y[\Gamma]$. Applying Lemma 3.4 once more, we find that there exists a Borel set N ⊆ Y of zero ν-measure, such that $\tilde\psi(y) = \psi(y) - \infty \cdot 1_N(y)$ is Borel measurable. Finally, replace Γ by Γ ∩ (X × (Y \ N)) and ψ by $\tilde\psi$. □

We conclude this section by proving that every strongly c-monotone transport plan is optimal (Proposition 1.5). Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and let c : X × Y → [0, ∞] be Borel measurable. Then every finite transport plan which is strongly c-monotone is optimal.

Proof. Let $\pi_0$ be a strongly c-monotone transport plan. Then, according to the definition, there exist Borel functions ϕ(x) and ψ(y) taking values in [−∞, ∞) such that

$$\varphi(x) + \psi(y) \le c(x, y) \tag{21}$$

everywhere on X × Y and equality holds $\pi_0$-a.e. We define the truncations $\varphi_n = (n \wedge (\varphi \vee -n))$, $\psi_n = (n \wedge (\psi \vee -n))$ and let $\xi_n(x, y) := \varphi_n(x) + \psi_n(y)$ resp. $\xi(x, y) := \varphi(x) + \psi(y)$. Note that $\varphi_n, \psi_n, \xi_n, \xi$ are Borel measurable. By elementary considerations which are left to the reader, we get pointwise monotone convergence $\xi_n \uparrow \xi$ on the set {ξ ≥ 0} resp. $\xi_n \downarrow \xi$ on the set {ξ ≤ 0}. Let $\pi_1$ be an arbitrary finite transport plan; to compare $I_c[\pi_0]$ and $I_c[\pi_1]$ we make the following observations:

a. By monotone convergence

$$\int_{\{\xi \ge 0\}} \xi_n\, d\pi_i \uparrow \int_{\{\xi \ge 0\}} \xi\, d\pi_i \le I_c[\pi_i] < \infty \quad \text{and} \quad \int_{\{\xi \le 0\}} \xi_n\, d\pi_i \downarrow \int_{\{\xi \le 0\}} \xi\, d\pi_i$$

[…]

Proof of Lemma 4.1. As $\approx_{\Gamma,c}$ is an equivalence relation and π is concentrated on Γ, the sets $C_i$, i ∈ I, are a partition of X modulo μ-null sets. Likewise the sets $D_i$, i ∈ I, form a partition of Y modulo ν-null sets. In particular the quantities

$$p_i := \mu(C_i) = \nu(D_i) = \pi(C_i \times D_i), \qquad i \in I, \tag{25}$$

add up to 1. Without loss of generality we may assume that $p_i > 0$ for all i ∈ I. We define

$$p_{ij} := \frac{\pi_0(C_i \times D_j)}{\mu(C_i)}, \qquad i, j \in I. \tag{26}$$

5 Such a matrix P is often called a stochastic matrix while p is a stochastic vector.


Then

$$\sum_{j \in I} p_{i_0 j} = \frac{\pi_0(C_{i_0} \times Y)}{\mu(C_{i_0})} = 1$$

for each $i_0 \in I$. By the condition on the marginals of $\pi_0$ we have for the ith component of p · P

$$(p \cdot P)_i = \sum_{j \in I} \mu(C_j)\, \frac{\pi_0(C_j \times D_i)}{\mu(C_j)} = \pi_0(X \times D_i) = \nu(D_i) = p_i,$$

i.e. p · P = p. Hence P satisfies the assumptions of Lemma 4.2. We claim that $p_{ii} = 1$ for all i ∈ I. Suppose not. Pick $i_0 \in I$ such that $p_{i_0 i_0} < 1$. Then there exists some index $i_1 \ne i_0$ such that $p_{i_0 i_1} > 0$. Pick a finite sequence $i_0, i_1, \ldots, i_n = i_0$ according to Lemma 4.2. Fix k ∈ {1, . . . , n − 1}. Then $\pi_0(C_{i_k} \times D_{i_{k+1}}) = p_{i_k i_{k+1}}\, \mu(C_{i_k}) > 0$. Since $\pi_0$ is a finite transport plan, there exist $x_k \in C_{i_k} \cap p_X[\Gamma]$ and $y'_{k+1} \in D_{i_{k+1}} \cap p_Y[\Gamma]$ such that $c(x_k, y'_{k+1}) < \infty$. Choose $y_k \in D_{i_k}$ and $x'_{k+1} \in C_{i_{k+1}}$ such that $(x_k, y_k), (x'_{k+1}, y'_{k+1}) \in \Gamma$. Then

$$(x_0, y_0) \gtrsim (x'_1, y'_1) \approx (x_1, y_1) \gtrsim (x'_2, y'_2) \approx (x_2, y_2) \gtrsim \cdots \gtrsim (x'_n, y'_n) \approx (x_0, y_0).$$

But this implies that $(x_0, y_0) \approx (x_1, y_1)$, contradicting the assumption that $(C_{i_0} \times D_{i_0}) \cap \Gamma$ and $(C_{i_1} \times D_{i_1}) \cap \Gamma$ are different equivalence classes of $\approx_{\Gamma,c}$. Hence we have indeed $p_{ii} = 1$ for all i ∈ I, thus $\pi_0(C_i \times D_i) = \mu(C_i)$, which implies $\pi_0\big(\bigcup_{i \in I} C_i \times D_i\big) = 1$. □

Lemma 4.3. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and let c : X × Y → [0, ∞] be a Borel measurable cost function which is μ ⊗ ν-a.e. finite. For every finite transport plan π and every Borel set Γ ⊆ X × Y with π(Γ) = 1 on which c is finite, there exist Borel sets O ⊆ X, U ⊆ Y such that Γ′ = Γ ∩ (O × U) has full π-measure and (Γ′, c) is connecting.

Proof. By Fubini’s Theorem, for μ-almost all x ∈ X the set {y: c(x, y) < ∞} has full ν-measure and for ν-almost all y ∈ Y the set {x: c(x, y) < ∞} has full μ-measure. In particular the set of points $(x_0, y_0)$ such that both $\mu(\{x: c(x, y_0) < \infty\}) = 1$ and $\nu(\{y: c(x_0, y) < \infty\}) = 1$ has full π-measure. Fix such a pair $(x_0, y_0) \in \Gamma$ and let $O = \{x \in X: c(x, y_0) < \infty\}$, $U = \{y \in Y: c(x_0, y) < \infty\}$. Then Γ′ = Γ ∩ (O × U) has full π-measure and for every (x, y) ∈ Γ′ both quantities $c(x, y_0)$ and $c(x_0, y)$ are finite. Hence $x \approx_X x_0$ for every $x \in p_X[\Gamma']$. Similarly we obtain $y \approx_Y y_0$ for every $y \in p_Y[\Gamma']$. Hence (Γ′, c) is connecting. □

Finally we prove the statement of Theorem 1.b: Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and c : X × Y → [0, ∞] a Borel measurable cost function. Every finite c-monotone transport plan is optimal if there exist a closed set F and a μ ⊗ ν-null set N such that {(x, y): c(x, y) = ∞} = F ∪ N.

Proof. Let π be a finite c-monotone transport plan and pick a c-monotone Borel set Γ ⊆ X × Y with π(Γ) = 1 on which c is finite.

Let $O_n$, $U_n$, n ∈ N, be open sets such that $\bigcup_{n \in \mathbb{N}} (O_n \times U_n) = (X \times Y) \setminus F$. Fix n ∈ N and interpret $\pi{\restriction}O_n \times U_n$ as a transport plan on the spaces $(O_n, \mu_n)$ and $(U_n, \nu_n)$, where $\mu_n$ and $\nu_n$ are the marginals corresponding to $\pi{\restriction}O_n \times U_n$. Apply Lemma 4.3 to $\Gamma \cap (O_n \times U_n)$ and the cost function $c{\restriction}O_n \times U_n$ to find $O'_n \subseteq O_n$, $U'_n \subseteq U_n$ and $\Gamma_n = \Gamma \cap (O'_n \times U'_n)$ with $\pi(\Gamma_n) = \pi(\Gamma \cap (O_n \times U_n))$ such that $(\Gamma_n, c)$ is connecting. Then $\tilde\Gamma = \bigcup_{n \in \mathbb{N}} \Gamma_n$ is a subset of Γ of full measure, and every equivalence class of $\approx_{\tilde\Gamma,c}$ can be written in the form $\big(\big(\bigcup_{n \in N} O'_n\big) \times \big(\bigcup_{n \in N} U'_n\big)\big) \cap \Gamma$ for some non-empty index set $N \subseteq \mathbb{N}$. Thus there are at most countably many equivalence classes, which we can write in the form $(C_i \times D_i) \cap \Gamma$, i ∈ I, where I = {1, . . . , n} or I = N. Note that by shrinking the sets $C_i$, $D_i$, i ∈ I, we can assume that $C_i \cap C_j = D_i \cap D_j = \emptyset$ for i ≠ j. Assume now that we are given another finite

transport plan π0 . Apply Lemma 4.1 to π, π0 and Γ˜ to achieve that π0 is concentrated on i∈I Ci × Di . For i ∈ I we consider the restricted problem of transporting μ Ci to ν Di . We know that π Ci × Di is optimal for this task by Propositions 1.5 and 3.2, hence Ic [π] Ic [π0 ]. 2 Remark 4.4. In fact the following somewhat more general (but also more complicated to state) result holds true: Assume that {(x, y): c(x, y) = ∞} ⊆ F ∪ N where F is closed and N is a μ ⊗ ν-null set. Then every c-monotone transport plan π with π(F ∪ N ) = 0 is optimal. 5. Completing the picture First we give the proof of Theorem 2. Let X, Y be Polish spaces equipped with Borel probability measures μ, ν and c : X ×Y → [0, ∞] a Borel measurable cost function. For a finite transport plan π the following assertions are equivalent: a. π is robustly optimal. b. π is strongly c-monotone. Proof. a ⇒ b: Let Z and λ = 0 be according to the definition of robust optimality. As π˜ = ˜ Borel set (IdZ × IdZ )# λ + π is optimal, Theorem 1.a ensures the existence of a c-monotone Γ˜ ⊆ (X ∪ Z) × (Y ∪ Z) such that c˜ is finite on Γ˜ and π˜ is concentrated on Γ˜ . Note that (z, z) ∈ Γ˜ for λ-a.e. z ∈ Z. We claim that for λ-a.e. z ∈ Z and all (x, y) ∈ Γ = Γ˜ ∩ (X × Y ) the relation (x, y) ≈Γ˜ ,c˜ (z, z)

(27)

holds true. Indeed, since c˜ is finite on Z × Y we have c(z, y) < ∞ hence (x, y) Γ˜ ,c˜ (z, z). Analogously finiteness of c˜ on X × Z implies c(x, z) < ∞ such that also (z, z) Γ˜ ,c˜ (x, y). By ˜ is connecting. Applying Proposition 3.2 to the spaces X ∪Z and Y ∪Z transitivity of ≈Γ˜ ,c˜ , (Γ˜ , c) ˜ we get that π˜ is strongly c-monotone, ˜ i.e. there exist ϕ˜ and ψ˜ such that ϕ(a) ˜ + ψ(b) c(a, ˜ b) for (a, b) ∈ (X ∪ Z) × (Y ∪ Z) and equality holds π˜ -almost everywhere. By restricting ϕ˜ and ψ˜ to X resp. Y we see that π is strongly c-monotone. b ⇒ a: Let Z be a Polish space and let λ be a finite Borel measure on Z. We extend c to c˜ : (X ∪ Z) × (Y ∪ Z) → [0, ∞] via ⎧ c(a, b) for (a, b) ∈ X × Y, ⎪ ⎨ max(ϕ(a), 0) for (a, b) ∈ X × Z, c(a, ˜ b) = ⎪ ⎩ max(ψ(b), 0) for (a, b) ∈ Z × Y, 0 otherwise.



Define
\[ \tilde\varphi(a):=\begin{cases}\varphi(a) & \text{for } a\in X,\\ 0 & \text{for } a\in Z\end{cases} \qquad\text{and}\qquad \tilde\psi(b):=\begin{cases}\psi(b) & \text{for } b\in Y,\\ 0 & \text{for } b\in Z.\end{cases} \]
Then $\tilde\varphi$ resp. $\tilde\psi$ are extensions of $\varphi$ resp. $\psi$ to $X\cup Z$ resp. $Y\cup Z$ which satisfy $\tilde\varphi(a)+\tilde\psi(b)\le\tilde c(a,b)$, and equality holds on $\tilde\Gamma=\Gamma\cup\{(z,z): z\in Z\}$. Hence $\tilde\Gamma$ is strongly $\tilde c$-monotone. Since $\tilde\pi$ is concentrated on $\tilde\Gamma$, $\tilde\pi$ is optimal by Proposition 1.5. □

Next consider Theorem 3. Let $X, Y$ be Polish spaces equipped with Borel probability measures $\mu,\nu$ and let $c: X\times Y\to[0,\infty]$ be Borel measurable and $\mu\otimes\nu$-a.e. finite. For a finite transport plan $\pi$ the following assertions are equivalent:
(1) $\pi$ is optimal.
(2) $\pi$ is $c$-monotone.
(3) $\pi$ is robustly optimal.
(4) $\pi$ is strongly $c$-monotone.
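The equivalence of (1) and (2) can be illustrated in a toy finite setting. The following is a minimal sketch, not the authors' construction: it assumes that "c-monotone" is read as cyclical monotonicity of the support, that both marginals are uniform on four points, and that only permutation plans are compared. The points `xs`, `ys` and the quadratic cost are hypothetical illustration data.

```python
from itertools import permutations

def plan_cost(c, sigma):
    """Total cost of the plan that sends point i to point sigma[i]."""
    return sum(c[i][sigma[i]] for i in range(len(sigma)))

def is_c_monotone(c, sigma):
    """Check cyclical monotonicity of the support {(i, sigma[i])}.

    For every cyclic chain of support points, keeping the pairing must not
    cost more than shifting the targets one step along the chain.
    """
    n = len(sigma)
    for r in range(2, n + 1):
        for chain in permutations(range(n), r):
            kept = sum(c[i][sigma[i]] for i in chain)
            shifted = sum(c[chain[(k + 1) % r]][sigma[chain[k]]] for k in range(r))
            if kept > shifted:
                return False
    return True

# Hypothetical data: four source and four target points on the line with
# quadratic cost (all values are exactly representable in binary).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.5, 1.5, 2.5, 3.5]
cost = [[(x - y) ** 2 for y in ys] for x in xs]

plans = list(permutations(range(4)))
best = min(plan_cost(cost, s) for s in plans)
optimal = [s for s in plans if plan_cost(cost, s) == best]
monotone = [s for s in plans if is_c_monotone(cost, s)]
```

In this finite example the lists `optimal` and `monotone` coincide, which mirrors (1) ⇔ (2); the cost here is everywhere finite, so the example says nothing about the infinite-cost phenomena the paper is concerned with.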

Proof. By Theorem 2, (3) and (4) are equivalent and they trivially imply (1) and (2), which are equivalent by Theorem 1. It remains to see that (2) ⇒ (4). Let $\pi$ be a finite $c$-monotone transport plan. Pick a $c$-monotone Borel set $\Gamma\subseteq X\times Y$ such that $c$ is finite on $\Gamma$ and $\pi(\Gamma)=1$. By Lemma 4.3 there exists a Borel set $\Gamma'\subseteq\Gamma$ such that $\pi(\Gamma')=1$ and $(\Gamma',c)$ is connecting, hence Proposition 3.2 applies. □

Finally the example below shows that the ($\mu\otimes\nu$-a.e.) finiteness of the cost function is essential to be able to pass from the "weak properties" (optimality, $c$-monotonicity) to the "strong properties" (robust optimality, strong $c$-monotonicity).

Example 5.1 (Optimality does not imply strong c-monotonicity). Let $X=Y=[0,1]$ and equip both spaces with Lebesgue measure $\lambda=\mu=\nu$. Define $c$ to be $\infty$ above the diagonal and $1-\sqrt{x-y}$ for $y\le x$. The optimal (in this case the only finite) transport plan is the Lebesgue measure $\pi$ on the diagonal $\Delta$. We claim that $\pi$ is not strongly $c$-monotone. Striving for a contradiction we assume that there exist $\varphi$ and $\psi$ witnessing the strong $c$-monotonicity. Let $\Delta_1$ be the full-measure subset of $\Delta$ on which $\varphi+\psi=c$, and write $p_X[\Delta_1]$ for the projection of $\Delta_1$. We claim that
\[ \forall x,x'\in p_X[\Delta_1]:\ \text{if } x<x', \text{ then } \varphi(x)-\varphi(x')\ge\sqrt{x'-x}, \tag{28} \]
which will yield a contradiction when combined with the fact that $p_X[\Delta_1]$ is dense. Our claim (28) follows directly from
\[ \varphi(x')+\psi(x)\le c(x',x)=1-\sqrt{x'-x} \quad\text{and}\quad \varphi(x)+\psi(x)=c(x,x)=1. \tag{29} \]
Now let $x<x+a$ be elements of $p_X[\Delta_1]$, let $b:=\varphi(x)-\varphi(x+a)$, and let $n\in\mathbb N$ be a sufficiently large number, say satisfying $n>2b^2/a^2$. Using the fact that $p_X[\Delta_1]$ is dense, we can find real numbers $x=x_0<x_1<\cdots<x_n=x+a$ in $p_X[\Delta_1]$ satisfying $x_k-x_{k-1}<\frac{2}{n}$ for $k=1,\dots,n$.

Let $\varepsilon_k:=x_k-x_{k-1}$ for $k=1,\dots,n$. Then we have $\varepsilon_k<\frac{2}{n}<\frac{a^2}{b^2}$ for all $k$, hence $\sqrt{\varepsilon_k}>\frac{b}{a}\,\varepsilon_k$. So we get
\[ b=\varphi(x)-\varphi(x+a)=\sum_{k=1}^n\bigl(\varphi(x_{k-1})-\varphi(x_k)\bigr)\ge\sum_{k=1}^n\sqrt{\varepsilon_k}>\sum_{k=1}^n\frac{b}{a}\,\varepsilon_k=\frac{b}{a}\sum_{k=1}^n\varepsilon_k=b, \]

a contradiction. (By letting c = 0 below the diagonal the argument could be simplified, but then we would lose lower semi-continuity of c.)

Acknowledgment

The authors are indebted to the extremely careful referee who noticed many inaccuracies resp. mistakes and whose insightful suggestions led to a more accessible presentation of several results in this paper.

References

[1] L. Ambrosio, A. Pratelli, Existence and stability results in the L1 theory of optimal transportation, in: Optimal Transportation and Applications, Martina Franca, 2001, in: Lecture Notes in Math., vol. 1813, Springer, Berlin, 2003, pp. 123–160.
[2] C. Castaing, M. Valadier, Convex Analysis and Measurable Multifunctions, Lecture Notes in Math., vol. 580, Springer-Verlag, Berlin, 1977.
[3] W. Gangbo, R.J. McCann, The geometry of optimal transportation, Acta Math. 177 (2) (1996) 113–161.
[4] O. Kallenberg, Foundations of Modern Probability, Probab. Appl., Springer-Verlag, New York, 1997.
[5] A.S. Kechris, Classical Descriptive Set Theory, Grad. Texts in Math., vol. 156, Springer-Verlag, New York, 1995.
[6] H.G. Kellerer, Duality theorems for marginal problems, Z. Wahrsch. Verw. Gebiete 67 (4) (1984) 399–432.
[7] M. Knott, C. Smith, On Hoeffding–Fréchet bounds and cyclic monotone relations, J. Multivariate Anal. 40 (2) (1992) 328–334.
[8] A. Pratelli, On the sufficiency of c-cyclical monotonicity for optimality of transport plans, Math. Z. 258 (3) (2008) 677–690.
[9] S.T. Rachev, L. Rüschendorf, Mass Transportation Problems, vol. I, Probab. Appl., Springer-Verlag, New York, 1998.
[10] D. Ramachandran, Perfect measures and related topics, in: Handbook of Measure Theory, vols. I, II, North-Holland, Amsterdam, 2002, pp. 765–786.
[11] R.T. Rockafellar, Characterization of the subdifferentials of convex functions, Pacific J. Math. 17 (1966) 497–510.
[12] L. Rüschendorf, On c-optimal random variables, Statist. Probab. Lett. 27 (3) (1996) 267–270.
[13] W. Schachermayer, J. Teichmann, Characterization of optimal transport plans for the Monge–Kantorovich problem, Proc. Amer. Math. Soc. 137 (2) (2009) 519–529.
[14] C. Villani, Topics in Optimal Transportation, Grad. Stud. Math., vol. 58, American Mathematical Society, Providence, RI, 2003.
[15] C. Villani, Optimal Transport Old and New, Grundlehren Math. Wiss., vol. 338, Springer-Verlag, Berlin, 2009.


Dynamics for the energy critical nonlinear Schrödinger equation in high dimensions Dong Li a , Xiaoyi Zhang a,b,∗ a Institute for Advanced Study, 1st Einstein Drive, Princeton, NJ 08540, United States b Academy of Mathematics and System Sciences, Beijing 100080, China

Received 2 June 2008; accepted 10 December 2008 Available online 8 January 2009 Communicated by I. Rodnianski

Abstract

In [T. Duyckaerts, F. Merle, Dynamic of threshold solutions for energy-critical NLS, preprint, arXiv:0710.5915 [math.AP]], T. Duyckaerts and F. Merle studied the variational structure near the ground state solution W of the energy critical NLS and classified the solutions with the threshold energy E(W) in dimensions d = 3, 4, 5 under the radial assumption. In this paper, we extend the results to all dimensions d ≥ 6. The main issue in high dimensions is the non-Lipschitz continuity of the nonlinearity, which we get around by making full use of the decay property of W.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Energy critical; Schrödinger equation; Variational structure; Ground state

1. Introduction

We consider the Cauchy problem of the focusing energy critical nonlinear Schrödinger equation:
\[ iu_t+\Delta u+|u|^{\frac{4}{d-2}}u=0,\qquad u(0,x)=u_0(x), \tag{1.1} \]

* Corresponding author at: Institute for Advanced Study, 1st Einstein Drive, Princeton, NJ 08540, United States.

E-mail address: [email protected] (X. Zhang). 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.007

D. Li, X. Zhang / Journal of Functional Analysis 256 (2009) 1928–1961


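where $u(t,x)$ is a complex function on $\mathbb R\times\mathbb R^d$, $d\ge 3$, and $u_0\in\dot H^1_x(\mathbb R^d)$. The name "energy critical" refers to the fact that the scaling
\[ u(t,x)\mapsto u_\lambda(t,x)=\lambda^{-\frac{d-2}{2}}\,u\bigl(\lambda^{-2}t,\lambda^{-1}x\bigr) \tag{1.2} \]
leaves both the equation and the energy invariant. Here, the energy is defined by
\[ E\bigl(u(t)\bigr)=\frac12\bigl\|\nabla u(t)\bigr\|_2^2-\frac{d-2}{2d}\bigl\|u(t)\bigr\|_{\frac{2d}{d-2}}^{\frac{2d}{d-2}}, \tag{1.3} \]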

and is conserved in time. We refer to the first part as "kinetic energy" and the second part as "potential energy."

From the classical local theory [4], for any $u_0\in\dot H^1_x(\mathbb R^d)$, there exists a unique maximal-lifespan solution of (1.1) on a time interval $(-T_-,T_+)$ such that the local scattering size
\[ S_I(u)=\|u\|_{L_{t,x}^{\frac{2(d+2)}{d-2}}(I\times\mathbb R^d)}<\infty \]
for any compact interval $I\subset(-T_-,T_+)$. If $S_{[0,T_+)}(u)=\infty$, we say $u$ blows up forward in time. Likewise $u$ blows up backward in time if $S_{(-T_-,0]}(u)=\infty$. We also recall the fact that the non-blowup of $u$ in one direction implies scattering in that direction.

For the defocusing energy critical NLS, the global well-posedness and scattering was established in [2,5,16,18,21]. In the focusing case, depending on the size of the kinetic energy of the initial data, both scattering and blowup may occur. One can refer to [3] for scattering of small kinetic energy solutions and [7] for the existence of finite time blowup solutions. The threshold between blowup and scattering is believed to be determined by the ground state solution of Eq. (1.1):
\[ W(x)=\Bigl(1+\frac{|x|^2}{d(d-2)}\Bigr)^{-\frac{d-2}{2}}, \]
which solves the static NLS
\[ \Delta W+W^{\frac{d+2}{d-2}}=0. \]
This was verified by Kenig and Merle [9] in dimensions $d=3,4,5$ in the spherically symmetric case and by Killip and Visan [13] in all dimensions $d\ge 5$ without the radial assumption. To summarize, we have the following

Theorem 1.1 (Global well-posedness and scattering [9,13]). Let $u=u(t,x)$ be the maximal-lifespan solution of (1.1) on $I\times\mathbb R^d$ in dimension $d\ge 3$; in the case when $d=3,4$, we also require that $u$ is spherically symmetric. If
\[ E_*:=\sup_{t\in I}\bigl\|\nabla u(t)\bigr\|_2<\|\nabla W\|_2, \]
then $I=\mathbb R$ and the scattering size of $u$ is finite,
\[ S_I(u)=\|u\|_{L_{t,x}^{\frac{2(d+2)}{d-2}}(I\times\mathbb R^d)}<C(E_*). \]
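As a sanity check on the ground state formula above, the radial form of the static equation, W'' + ((d−1)/r) W' + W^((d+2)/(d−2)) = 0, can be tested numerically. This is a minimal sketch: the finite-difference step `h`, the sample radii and the tolerance are arbitrary choices, not values from the paper.

```python
import math

def W(r, d):
    """Ground state profile W(r) = (1 + r^2 / (d (d - 2)))^(-(d - 2) / 2)."""
    return (1.0 + r * r / (d * (d - 2))) ** (-(d - 2) / 2.0)

def radial_laplacian(f, r, d, h=1e-3):
    """Central-difference approximation of f'' + (d - 1) / r * f' at radius r > h."""
    f_plus, f_mid, f_minus = f(r + h), f(r), f(r - h)
    second = (f_plus - 2.0 * f_mid + f_minus) / (h * h)
    first = (f_plus - f_minus) / (2.0 * h)
    return second + (d - 1) / r * first

def static_residual(r, d):
    """Residual of Delta W + W^{p_c} = 0 at radius r, with p_c = (d + 2) / (d - 2)."""
    p_c = (d + 2) / (d - 2)
    return radial_laplacian(lambda s: W(s, d), r, d) + W(r, d) ** p_c

def max_residual(d, radii=(0.5, 1.0, 2.0, 5.0)):
    """Largest absolute residual over a few sample radii."""
    return max(abs(static_residual(r, d)) for r in radii)
```

For the dimensions d ≥ 6 considered in this paper, `max_residual(d)` should be at the level of the discretisation error only.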



As a consequence of this theorem and the coercive property of $W$ [9], they also proved

Corollary 1.2. (See [9,13].) Let $d\ge 3$ and $u_0\in\dot H^1_x(\mathbb R^d)$. In dimension $d=3,4$ we also require that $u_0$ is spherically symmetric. If
\[ E(u_0)<E(W),\qquad \|\nabla u_0\|_2\le\|\nabla W\|_2, \]
then the corresponding solution $u=u(t,x)$ exists globally and scatters in both time directions.

Theorem 1.1 and Corollary 1.2 confirmed that the threshold between blowup and scattering is given by the ground state $W$. Our purpose in this paper is not to investigate the global well-posedness and scattering theory below the threshold. Instead, we aim to continue the study in [6] of what happens if the solution has the threshold energy $E(W)$. In that paper, T. Duyckaerts and F. Merle carried out a very detailed study of the dynamical structure around the ground state solution $W$. They were able to give the characterization of solutions with the threshold energy in dimensions $d=3,4,5$ under the radial assumption. Note that the energy-critical problem here can be compared with the focusing mass critical problem
\[ iu_t+\Delta u=-|u|^{\frac4d}u. \]
There the ground state solution $Q$ satisfies the equation
\[ \Delta Q-Q+Q^{1+\frac4d}=0, \]
and the mass of $Q$ turns out to be the threshold between blowup and scattering. The characterization of the minimal mass blowup solution was established in [10,11,15,22].

In this paper, we aim to extend the results in [6] to all dimensions $d\ge 6$. Although the whole framework designed for low dimensions can also be used for the high dimensional setting, there are a couple of places where the arguments break down in high dimensions. Roughly speaking, this is caused by the non-smoothness of the nonlinearity; more precisely, in high dimensions, the nonlinearity $|u|^{4/(d-2)}u$ is no longer Lipschitz continuous in the usual Strichartz space $\dot S^1$ (see Section 2 for the definition). This reminds us of the similar problem one encounters in establishing the stability theory for the high dimensional energy critical problem, where this was gotten around by using exotic Strichartz estimates (see, for example, [19]).¹ However, the exotic Strichartz trick will inevitably cause the loss of derivatives and one cannot go back to the natural energy space $H^1_x$. On the other hand, the $H^1_x$ regularity is heavily used in the spectral analysis around the ground state $W$ (see for example the proof of Proposition 5.9 in [6]).

¹ The main ingredient of the exotic Strichartz trick is as follows: instead of using the spaces $\dot S^1$, we use a space which has the same scaling but lower regularity. The nonlinearity can be shown to be Lipschitz continuous in such spaces. (See lecture notes [12], Section 3 for more details.)

To solve this problem, we will use a different technique where the decay property of $W$ is fully considered. When constructing the threshold solutions $W^\pm$ (see Theorem 1.3 below), we transform the problem into solving a perturbation equation with respect to $W$ using the fixed point argument. Although the nonlinearity of the perturbed equation is not Lipschitz continuous for general functions, it is for perturbations which are much smaller than $W$. The reason is that if we restrict ourselves to the regime $|z|<1$, we can expand the real analytic function $|1+z|^{\frac{4}{d-2}}(1+z)$ (which corresponds to the form of the energy critical nonlinearity) and get the Lipschitz continuity. This consideration leads us to working in the space of functions which have much better decay than $W$. The weighted Sobolev space $H^{m,m}$ (see (3.4) for the definition) turns out to be a good candidate for this purpose. By doing this, besides proving the existence of the threshold solutions $W^\pm$, we can actually show that the difference $W^\pm-W$ has very high regularity and good decay properties. This property also helps us in the next step, where we have to show that, after extracting the linear term, the perturbed nonlinearity is superlinear with respect to the perturbations. The superlinearity is needed to show the rigidity of the threshold solutions $W^\pm$. This time again we make use of the decay estimate of $W$. We split $\mathbb R^d$ into the regime where the solution dominates $W$ and its complement. In the first regime, we can transform some portion of $W$ to increase the power of the solution, and get the superlinearity (cf. Lemma 2.3). In the regime where the solution is dominated by $W$, we simply use the real analytic expansion. The fact that the difference $W^\pm-W$ has enough decay in space and time plays a crucial role in the whole analysis.

In all, the material in this paper allows us to extend the argument in [6] to all dimensions $d\ge 6$. With some suitable modifications, the same technique can be used to treat the high dimensional energy critical nonlinear wave equation, and we will address this problem elsewhere [14]. For NLS we have the following

Theorem 1.3. Let $d\ge 6$. There exists a spherically symmetric global solution $W^-$ of (1.1) with $E(W^-)=E(W)$ such that
\[ \bigl\|\nabla W^-(t)\bigr\|_2<\|\nabla W\|_2,\qquad \forall t\in\mathbb R. \]

Moreover, $W^-$ scatters in the negative time direction and blows up in the positive time direction, in which $W^-$ is asymptotically close to $W$:
\[ \lim_{t\to+\infty}\bigl\|W^-(t)-W\bigr\|_{\dot H^1_x}=0. \]
There also exists a spherically symmetric solution $W^+$ with $E(W^+)=E(W)$ such that
\[ \bigl\|\nabla W^+(t)\bigr\|_2>\|\nabla W\|_2,\qquad \forall t\in\mathbb R. \]
Moreover, in the positive time direction, $W^+$ blows up at infinite time and is asymptotically close to $W$:
\[ \lim_{t\to+\infty}\bigl\|W^+(t)-W\bigr\|_{\dot H^1_x}=0. \]
In the negative time direction, $W^+$ blows up at finite time.

Next, we classify solutions with the threshold energy. Since the equation is invariant under several symmetries, we can determine the solution only modulo these symmetries. In the spherically symmetric setting, when we say $u=v$ up to symmetries, we mean there exist $\theta_0,t_0\in\mathbb R$, $\lambda_0>0$ such that
\[ u(t,x)=e^{i\theta_0}\lambda_0^{-\frac{d-2}{2}}\,v\Bigl(\frac{t+t_0}{\lambda_0^2},\frac{x}{\lambda_0}\Bigr). \]

With this convention we have

Theorem 1.4. Let $d\ge 6$, and let $u_0\in\dot H^1_x(\mathbb R^d)$ be spherically symmetric and such that $E(u_0)=E(W)$. Let $u$ be the corresponding maximal-lifespan solution of (1.1) on $I\times\mathbb R^d$. We have
(1) If $\|\nabla u_0\|_2<\|\nabla W\|_2$, then either $u=W^-$ up to symmetries or $u$ scatters in both time directions.
(2) If $\|\nabla u_0\|_2=\|\nabla W\|_2$, then $u=W$ up to symmetries.
(3) If $\|\nabla u_0\|_2>\|\nabla W\|_2$ and $u_0\in L^2_x(\mathbb R^d)$, then either $|I|$ is finite or $u=W^+$ up to symmetries.

The proof of Theorems 1.3 and 1.4 will follow roughly the same strategy as in [6]. Here we make a remark about the proof of Theorem 1.4. The second point is a direct application of the variational characterization of $W$ (see the last section for more details). To prove (1) and (3), in [6], a large portion of the work was devoted to showing the exponential convergence of the solution to $W$, which, after several minor changes, also works for higher dimensions. For this reason, we do not repeat that part of the argument and build our starting point on the following

Proposition 1.5 (Exponential convergence to W [6]). Suppose $u_0$, $u$ satisfy the same conditions as in Theorem 1.4 and $u$ blows up on $I$ forward in time. If $\|\nabla u_0\|_2>\|\nabla W\|_2$, we assume $[0,\sup I)=[0,\infty)$. If $\|\nabla u_0\|_2\le\|\nabla W\|_2$, then the solution exists globally and $I=\mathbb R$. In all cases there exist $\theta_0\in\mathbb R$, $\gamma_0>0$, $\mu_0>0$ such that
\[ \bigl\|u(t)-W_{[\theta_0,\mu_0]}\bigr\|_{\dot H^1}\le Ce^{-\gamma_0t},\qquad \forall t\ge 0, \tag{1.4} \]
where
\[ W_{[\theta_0,\mu_0]}(x)=e^{i\theta_0}\mu_0^{-\frac{d-2}{2}}\,W\Bigl(\frac{x}{\mu_0}\Bigr). \]

This paper is organized as follows. In Section 2, we introduce some notations and collect some basic estimates. Section 3 is devoted to proving Theorem 1.3. In Section 4, we give the proof of Theorem 1.4 by assuming Proposition 1.5.

2. Preliminaries

We use $X\lesssim Y$ or $Y\gtrsim X$ whenever $X\le CY$ for some constant $C>0$. We use $O(Y)$ to denote any quantity $X$ such that $|X|\lesssim Y$. We use the notation $X\sim Y$ whenever $X\lesssim Y\lesssim X$. We will add subscripts to $C$ to indicate the dependence of $C$ on the parameters. For example, $C_{i,j}$ means that the constant $C$ depends on $i,j$. The dependence of $C$ upon dimension will be suppressed. We use the 'Japanese bracket' convention $\langle x\rangle:=(1+|x|^2)^{1/2}$.



Throughout this paper, we will use $p_c$ to denote the total power of the nonlinearity:
\[ p_c=\frac{d+2}{d-2}. \]
We write $L^q_tL^r_x$ to denote the Banach space with norm
\[ \|u\|_{L^q_tL^r_x(\mathbb R\times\mathbb R^d)}:=\Bigl(\int_{\mathbb R}\Bigl(\int_{\mathbb R^d}|u(t,x)|^r\,dx\Bigr)^{q/r}dt\Bigr)^{1/q}, \]
with the usual modifications when $q$ or $r$ are equal to infinity, or when the domain $\mathbb R\times\mathbb R^d$ is replaced by a smaller region of spacetime such as $I\times\Omega$. When $q=r$ we abbreviate $L^q_tL^q_x$ as $L^q_{t,x}$. For a positive integer $k$, we use $W^{k,p}$ to denote the space with the norm
\[ \|u\|_{W^{k,p}}=\sum_{0\le j\le k}\bigl\|\nabla^ju\bigr\|_{L^p_x}; \]
when $p=2$, we write $W^{k,2}$ as $H^k$.

2.1. Strichartz estimates

Let the dimension $d\ge 6$. We say a couple $(q,r)$ is admissible if $2\le q\le\infty$ and
\[ \frac2q+\frac dr=\frac d2. \]
Let $I$ be a time slab. We denote $\dot S^0(I)=\bigcap_{(q,r)\text{ admissible}}L^q_tL^r_x(I\times\mathbb R^d)$ and $\dot N^0(I)$ as its dual space. We will use $\dot S^1(I)$ and $\dot N^1(I)$ to denote the space of functions $u$ such that $\nabla u\in\dot S^0(I)$ and $\nabla u\in\dot N^0(I)$ respectively. By Sobolev embedding, it is easy to verify that
\[ \|u\|_{L^q_tL^r_x}\lesssim\|u\|_{\dot S^1} \tag{2.1} \]
for all $\dot H^1$ admissible pairs $(q,r)$ in the sense that $2\le q\le\infty$ and $\frac2q+\frac dr=\frac d2-1$. Two typical $\dot H^1$ admissible pairs are $(\infty,\frac{2d}{d-2})$ and $(2,\frac{2d}{d-4})$. Other pairs will also be used in this paper without mentioning this embedding. With the notations above, we record the standard Strichartz estimates as follows.

Lemma 2.1. (See Strichartz estimates [8,17].) Let $k=0,1$. Let $I$ be an interval, $t_0\in I$, $u_0\in\dot H^k$ and $f\in\dot N^k(I)$. Then the function $u$ defined by
\[ u(t):=e^{i(t-t_0)\Delta}u_0-i\int_{t_0}^te^{i(t-t')\Delta}f(t')\,dt' \]
obeys the estimate
\[ \|u\|_{\dot S^k(I)}\lesssim\|u_0\|_{\dot H^k}+\|f\|_{\dot N^k(I)}. \]
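The two admissibility conditions of Section 2.1 are plain arithmetic on exponents and can be checked exactly with rational numbers. The sketch below uses hypothetical helper names (`scaling_sum`, `is_admissible`, `is_h1_admissible`) and the convention that `None` stands for an infinite exponent.

```python
from fractions import Fraction

def scaling_sum(q, r, d):
    """Return 2/q + d/r exactly; q or r equal to None stands for infinity."""
    inv_q = Fraction(0) if q is None else 1 / Fraction(q)
    inv_r = Fraction(0) if r is None else 1 / Fraction(r)
    return 2 * inv_q + d * inv_r

def is_admissible(q, r, d):
    """Admissible pair: 2 <= q <= infinity and 2/q + d/r = d/2."""
    return (q is None or q >= 2) and scaling_sum(q, r, d) == Fraction(d, 2)

def is_h1_admissible(q, r, d):
    """H^1-admissible pair: 2 <= q <= infinity and 2/q + d/r = d/2 - 1."""
    return (q is None or q >= 2) and scaling_sum(q, r, d) == Fraction(d, 2) - 1

def typical_pairs_ok(d):
    """Check the two typical H^1-admissible pairs quoted in the text."""
    return (is_h1_admissible(None, Fraction(2 * d, d - 2), d)
            and is_h1_admissible(2, Fraction(2 * d, d - 4), d))
```

For example, `typical_pairs_ok(6)` confirms that the pairs (∞, 2d/(d−2)) and (2, 2d/(d−4)) satisfy the second condition in dimension 6, and `is_admissible(2, Fraction(2 * d, d - 2), d)` checks the endpoint pair used later in the proof of Lemma 2.3.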



2.2. Derivation of the perturbation equation near W

Let $u$ be the solution of (1.1) and $v=u-W$; then $v$ satisfies the equation
\[ i\partial_tv+\Delta v+\Gamma(v)+iR(v)=0, \]
where
\[ \Gamma(v)=\frac{p_c+1}{2}W^{p_c-1}v+\frac{p_c-1}{2}W^{p_c-1}\bar v, \]
\[ iR(v)=|v+W|^{p_c-1}(v+W)-W^{p_c}-\frac{p_c+1}{2}W^{p_c-1}v-\frac{p_c-1}{2}W^{p_c-1}\bar v. \]
Define the linear operator $\mathcal L$ by
\[ \mathcal L(v)=-i\Delta v-i\,\frac{p_c+1}{2}W^{p_c-1}v-i\,\frac{p_c-1}{2}W^{p_c-1}\bar v. \]
We write the equation for $v$ equivalently as
\[ \partial_tv+\mathcal L(v)+R(v)=0. \]
For the spectral properties of $\mathcal L$, we need the following lemma from [6].

Lemma 2.2. (See [6].) The operator $\mathcal L$ admits two eigenfunctions $Y_+,Y_-\in\mathcal S(\mathbb R^d)$ with real eigenvalues
\[ \mathcal LY_+=e_0Y_+,\qquad \mathcal LY_-=-e_0Y_-, \]
and
\[ Y_+=\bar Y_-,\qquad e_0>0. \]

Proof. See Lemma 5.1 of [6]. □
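The coefficients $(p_c+1)/2$ and $(p_c-1)/2$ in $\Gamma(v)$ are the first-order Taylor coefficients of $z\mapsto|1+z|^{p_c-1}(1+z)$ at $z=0$ with respect to $z$ and $\bar z$. The following sketch checks this numerically by verifying that the remainder after subtracting the linear part is $O(|z|^2)$; the sample radii and angles are arbitrary choices.

```python
import cmath
import math

def P(z, p_c):
    """The model nonlinearity |1 + z|^(p_c - 1) (1 + z)."""
    return abs(1 + z) ** (p_c - 1) * (1 + z)

def linear_part(z, p_c):
    """Claimed first-order expansion: 1 + (p_c + 1)/2 z + (p_c - 1)/2 conj(z)."""
    return 1 + (p_c + 1) / 2 * z + (p_c - 1) / 2 * z.conjugate()

def remainder_ratio(z, p_c):
    """|P(z) - linear part| / |z|^2, which should stay bounded as z -> 0."""
    return abs(P(z, p_c) - linear_part(z, p_c)) / abs(z) ** 2

def max_remainder_ratio(d, radii=(1e-2, 1e-3), angles=8):
    """Largest remainder ratio over small circles around 0, for p_c = (d + 2)/(d - 2)."""
    p_c = (d + 2) / (d - 2)
    return max(
        remainder_ratio(cmath.rect(eps, 2 * math.pi * k / angles), p_c)
        for eps in radii
        for k in range(angles)
    )
```

A bounded value of `max_remainder_ratio(d)` for shrinking radii is consistent with $iR(v)$ containing only terms of order two and higher in $v/W$ where $|v|$ is small compared with $W$.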

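2.3. Basic estimates

We will use the following lemma many times throughout this paper.

Lemma 2.3. Let $I$ be a time slab. We have
\[ \bigl\||u|^{p_c-1}\nabla v\bigr\|_{\dot N^0(I)}\lesssim\|u\|_{\dot S^1(I)}^{p_c-1}\|v\|_{\dot S^1(I)}, \]
\[ \bigl\|\nabla W|v|^{p_c-1}\bigr\|_{\dot N^0(I;|v|>\frac14W)}\lesssim\|v\|_{\dot S^1(I)}^{\frac{d+3/2}{d-2}}, \]
\[ \bigl\|W^{p_c-2}\nabla W\,v\bigr\|_{\dot N^0(I;|v|>\frac14W)}\lesssim\|v\|_{\dot S^1(I)}^{\frac{d+3/2}{d-2}}, \]
\[ \bigl\|W^{p_c-3}\nabla W\,v^2\bigr\|_{\dot N^0(I;|v|\le\frac14W)}\lesssim\|v\|_{\dot S^1(I)}^{\frac{d+3/2}{d-2}}, \]
\[ \bigl\|W^{-1}\nabla W|v|^{p_c}\bigr\|_{\dot N^0(I;|v|\le\frac14W)}\lesssim\|v\|_{\dot S^1(I)}^{\frac{d+3/2}{d-2}}. \]
Here $\dot N^0(I;|v|>\frac14W)$ denotes $\dot N^0(I\times\Omega)$, where $\Omega=\{x: |v(x)|>\frac14W(x)\}$. Similar conventions apply to $\dot N^0(I;|v|\le\frac14W)$.

Proof. The first one follows directly from Hölder's inequality; we have
\[ \bigl\||u|^{p_c-1}\nabla v\bigr\|_{\dot N^0(I)}\le\bigl\||u|^{p_c-1}\nabla v\bigr\|_{L^2_tL_x^{\frac{2d}{d+2}}(I\times\mathbb R^d)}\lesssim\|u\|^{p_c-1}_{L^\infty_tL_x^{\frac{2d}{d-2}}(I\times\mathbb R^d)}\|\nabla v\|_{L^2_tL_x^{\frac{2d}{d-2}}(I\times\mathbb R^d)}\lesssim\|u\|^{p_c-1}_{\dot S^1(I)}\|v\|_{\dot S^1(I)}. \]
Now we verify the second one. Noting $|\nabla W(x)|\lesssim\langle x\rangle^{-(d-1)}$, we have
\[ \begin{aligned} \bigl\|\nabla W|v|^{p_c-1}\bigr\|_{\dot N^0(I;|v|>\frac14W)} &\lesssim\bigl\||v|^{p_c-1}\langle x\rangle^{-(d-\frac52)}\langle x\rangle^{-\frac32}\bigr\|_{\dot N^0(I;|v|>\frac14W)}\\ &\lesssim\bigl\||v|^{\frac{d+3/2}{d-2}}\langle x\rangle^{-\frac32}\bigr\|_{L^2_tL_x^{\frac{2d}{d+2}}(I\times\mathbb R^d)}\\ &\lesssim\bigl\|\langle x\rangle^{-\frac32}\bigr\|_{L_x^{\frac{4d}{5}}}\bigl\||v|^{\frac{d+3/2}{d-2}}\bigr\|_{L^2_tL_x^{\frac{4d}{2d-1}}(I\times\mathbb R^d)}\\ &\lesssim\|v\|^{\frac{d+3/2}{d-2}}_{L_t^{\frac{2d+3}{d-2}}L_x^{\frac{4d(d+3/2)}{(d-2)(2d-1)}}(I\times\mathbb R^d)}\\ &\lesssim\|v\|^{\frac{d+3/2}{d-2}}_{\dot S^1(I)}. \end{aligned} \]
To see the third one, we use the bound $W^{p_c-2}|\nabla W|\lesssim\langle x\rangle^{-5}$ to control
\[ W^{p_c-2}|\nabla W||v|\lesssim\langle x\rangle^{-\frac32}|v|^{\frac{d+3/2}{d-2}},\qquad |v|>\frac14W; \]
the same argument as in proving the second one yields the desired estimate. We verify the fourth inequality:
\[ \begin{aligned} \bigl\|W^{p_c-3}\nabla W\,v^2\bigr\|_{\dot N^0(I;|v|\le\frac14W)} &\lesssim\bigl\|W^{\frac{d+3}{d-2}-2}v^2\bigr\|_{\dot N^0(I;|v|\le\frac14W)}\\ &\lesssim\bigl\|W^{\frac{3}{2(d-2)}}|v|^{\frac{d+3/2}{d-2}}\bigr\|_{L^2_tL_x^{\frac{2d}{d+2}}(I\times\mathbb R^d)}\\ &\lesssim\bigl\|W^{\frac{3}{2(d-2)}}\bigr\|_{L_x^{\frac{4d}{5}}}\|v\|^{\frac{d+3/2}{d-2}}_{L_t^{\frac{2d+3}{d-2}}L_x^{\frac{2d(2d+3)}{(d-2)(2d-1)}}(I\times\mathbb R^d)}\\ &\lesssim\|v\|^{\frac{d+3/2}{d-2}}_{\dot S^1(I)}. \end{aligned} \]
The last one follows from the bound
\[ W^{-1}|\nabla W||v|^{p_c}\lesssim\langle x\rangle^{-\frac32}|v|^{\frac{d+3/2}{d-2}} \]
for $|v|\le\frac14W$ and Hölder's inequality, as in the second one. □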
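The exponent bookkeeping in the proof of Lemma 2.3 can be verified exactly. The sketch below checks, for the second estimate, that the Hölder exponents add up, that the resulting space-time pair is $\dot H^1$ admissible in the sense of Section 2.1, and that $\langle x\rangle^{-3/2}$ is integrable to the power $4d/5$. The function name `lemma_2_3_exponents` is an illustrative choice.

```python
from fractions import Fraction

def lemma_2_3_exponents(d):
    """Return three booleans for dimension d (reciprocals of exponents are used)."""
    power = Fraction(2 * d + 3, 2 * (d - 2))   # the power (d + 3/2) / (d - 2) of |v|
    target = Fraction(d + 2, 2 * d)            # 1/r for L_x^{2d/(d+2)}
    weight = Fraction(5, 4 * d)                # 1/r for L_x^{4d/5}
    v_part = Fraction(2 * d - 1, 4 * d)        # 1/r for L_x^{4d/(2d-1)}

    # Hoelder in space: 1/(2d/(d+2)) = 1/(4d/5) + 1/(4d/(2d-1)).
    holder_ok = weight + v_part == target

    # || |v|^power ||_{L^2_t L^s_x} = ||v||^power_{L^{2 power}_t L^{s power}_x}.
    inv_q = Fraction(1, 2) / power
    inv_r = v_part / power
    h1_admissible = inv_q <= Fraction(1, 2) and 2 * inv_q + d * inv_r == Fraction(d, 2) - 1

    # <x>^{-3/2} lies in L^{4d/5}(R^d) exactly when (3/2) * (4d/5) > d.
    weight_integrable = Fraction(3, 2) / weight > d
    return holder_ok, h1_admissible, weight_integrable
```

All three entries come out true for every $d\ge 6$, which matches the chain of inequalities in the second estimate.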

3. The existence of $W^-$, $W^+$

As in [6], we will construct the threshold solutions $W^-$, $W^+$ as the limit of a sequence of near solutions $W^a_k(t,x)$ in the positive time direction. It follows from this construction that both $W^-$ and $W^+$ approach the ground state $W$ exponentially fast as $t\to+\infty$. On the other hand, the asymptotic behaviors of $W^-$ and $W^+$ are quite different in the negative time direction (see Remark 3.10). We begin with the following result:

Lemma 3.1. (See [6].) Let $a\in\mathbb R$. There exist functions $\{\Phi^a_j\}_{j\ge 1}$ in $\mathcal S(\mathbb R^d)$ such that $\Phi^a_1=aY_+$ (see Lemma 2.2 for the definition of $Y_+$) and the function
\[ W^a_k(t,x)=W(x)+\sum_{j=1}^ke^{-je_0t}\Phi^a_j(x) \]
is a near solution of Eq. (1.1) in the sense that
\[ (i\partial_t+\Delta)W^a_k+\bigl|W^a_k\bigr|^{\frac{4}{d-2}}W^a_k=\varepsilon^a_k, \]
where the error $\varepsilon^a_k$ is exponentially small in $\mathcal S(\mathbb R^d)$. More precisely, for all integers $J,M\ge 0$, there exists a constant $C_{J,M}$ such that
\[ \bigl|\langle x\rangle^M\nabla^J\varepsilon^a_k(x)\bigr|\le C_{J,M}\,e^{-(k+1)e_0t}. \]

Remark 3.2. Since all $\Phi_j$ are Schwartz functions, we have the following properties for the difference
\[ v_k=W^a_k-W=\sum_{j=1}^ke^{-je_0t}\Phi^a_j(x). \tag{3.1} \]
For any $j,m\ge 0$, there exists $C_{k,j,m}>0$ such that
\[ \bigl|\langle x\rangle^m\nabla^jv_k(t,x)\bigr|\le C_{k,j,m}\,e^{-e_0t}. \tag{3.2} \]



Next we show that there exists a unique genuine solution $W^a(t,x)$ of (1.1) which can be approximated by the near solutions $W^a_k(t,x)$ constructed above. The existence and uniqueness of the solution $W^a$ can be transformed to that of $h:=W^a-W^a_k$, which satisfies the equation
\[ i\partial_th+\Delta h=-\Gamma(h)-iR(v_k+h)+iR(v_k)+i\varepsilon^a_k. \tag{3.3} \]
Remark that this is the first place where the proof in [6] breaks down in higher dimensions. In [6], for dimensions $d=3,4,5$, they made use of the fact that the nonlinearity $G(h):=-\Gamma(h)-iR(v_k+h)+iR(v_k)$ is Lipschitz² in $\dot S^1$ to construct the solution to (3.3) by the fixed point argument. In higher dimensions $d\ge 6$, the Lipschitz continuity does not hold anymore. However, since $v_k$ is small compared with $W$, we can use the real analytic expansion for the complex function $|1+z|^{p_c-1}(1+z)$ to show that $R(v_k+h)-R(v_k)$ is actually Lipschitz in $h$ once $h$ is small. This observation motivates us to construct the solution in a certain space consisting of functions which decay much faster than $W$. It turns out that the weighted Sobolev space $H^{m,m}$ with the norm
\[ \|f\|_{H^{m,m}}=\sum_{0\le j\le m}\bigl\|\langle x\rangle^{m-j}\nabla^jf\bigr\|_2 \tag{3.4} \]
for large $m$ serves this purpose. We have several properties for $H^{m,m}$.

² More precisely, $\|G(h_1)-G(h_2)\|_{\dot N^1}\lesssim\|h_1-h_2\|_{\dot S^1}$.

Lemma 3.3 (Linear estimate in $H^{m,m}$). For any $m\ge 1$, there exists a constant $C$ depending on $m$ such that³
\[ \bigl\|e^{it\Delta}u_0\bigr\|_{H^{m,m}}\lesssim e^{C|t|}\|u_0\|_{H^{m,m}},\qquad \forall t\in\mathbb R. \tag{3.5} \]
Let $t_0>0$, $\alpha>2C$ and $\Sigma_{t_0}$ be the space with the norm
\[ \|u\|_{\Sigma_{t_0}}=\sup_{t\ge t_0}e^{\alpha t}\|u(t)\|_{H^{m,m}}; \tag{3.6} \]
then the following holds:
\[ \Bigl\|\int_t^\infty e^{i(t-\tau)\Delta}F(\tau)\,d\tau\Bigr\|_{\Sigma_{t_0}}\lesssim\frac{1}{\alpha-C}\|F\|_{\Sigma_{t_0}}. \tag{3.7} \]

³ Certainly the estimate (3.5) is not optimal. For example, one can improve it to $\|e^{it\Delta}u_0\|_{H^{m,m}}\lesssim(1+|t|)^m\|u_0\|_{H^{m,m}}$. However, the rough estimate (3.5) is enough for our use.

Proof. (3.5) follows directly from the standard energy method. To obtain (3.7), we use (3.5) to estimate
\[ \begin{aligned} \Bigl\|\int_t^\infty e^{i(t-\tau)\Delta}F(\tau)\,d\tau\Bigr\|_{H^{m,m}} &\le\int_t^\infty\bigl\|e^{i(t-\tau)\Delta}F(\tau)\bigr\|_{H^{m,m}}\,d\tau\\ &\lesssim\int_t^\infty e^{C(\tau-t)}\|F(\tau)\|_{H^{m,m}}\,d\tau\\ &\lesssim\int_t^\infty e^{C(\tau-t)}e^{-\alpha\tau}\,d\tau\;\|F\|_{\Sigma_{t_0}}\\ &\lesssim\frac{1}{\alpha-C}\,e^{-\alpha t}\|F\|_{\Sigma_{t_0}}. \end{aligned} \]
(3.7) now follows immediately. □
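The last step of the proof of Lemma 3.3 rests on the elementary identity $\int_t^\infty e^{C(\tau-t)}e^{-\alpha\tau}\,d\tau=e^{-\alpha t}/(\alpha-C)$ for $\alpha>C$. The following sketch compares a midpoint-rule approximation with this closed form; the truncation length and step count are arbitrary numerical choices.

```python
import math

def tail_integral(t, alpha, C, length=40.0, steps=100_000):
    """Midpoint rule for the integral of e^{C (tau - t)} e^{-alpha tau} over [t, t + length]."""
    h = length / steps
    total = 0.0
    for k in range(steps):
        tau = t + (k + 0.5) * h
        total += math.exp(C * (tau - t) - alpha * tau)
    return total * h

def closed_form(t, alpha, C):
    """Exact value e^{-alpha t} / (alpha - C) of the integral over [t, infinity)."""
    return math.exp(-alpha * t) / (alpha - C)

def relative_error(t, alpha, C):
    """Relative gap between the numerical and the closed-form value."""
    exact = closed_form(t, alpha, C)
    return abs(tail_integral(t, alpha, C) - exact) / exact
```

With, say, `alpha = 3.0` and `C = 1.0` (so that the hypothesis α > 2C of the lemma holds), `relative_error(0.5, 3.0, 1.0)` is at the level of the quadrature error, illustrating where the factor $1/(\alpha-C)$ in (3.7) comes from.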

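Lemma 3.4 (Embedding in $H^{m,m}$). Let $k_1,k_2$ be non-negative integers; then for any $m\ge k_1+k_2+\frac d2+1$, we have
\[ \bigl\|\langle x\rangle^{k_1}\nabla^{k_2}f\bigr\|_\infty\lesssim\|f\|_{H^{m,m}}, \]
where the implicit constant depends only on $k_1,k_2$.

Proof. Denote by $[\frac d2+]$ the smallest integer strictly bigger than $\frac d2$. By the Sobolev embedding
\[ \|f\|_\infty\lesssim\|f\|_{H^{[\frac d2+]}}, \]
we have
\[ \begin{aligned} \bigl\|\langle x\rangle^{k_1}\nabla^{k_2}f\bigr\|_\infty &\lesssim\bigl\|\langle x\rangle^{k_1}\nabla^{k_2}f\bigr\|_2+\bigl\|\nabla^{[\frac d2+]}\bigl(\langle x\rangle^{k_1}\nabla^{k_2}f\bigr)\bigr\|_2\\ &\lesssim\|f\|_{H^{m,m}}+\sum_{0\le j\le[\frac d2+]}\bigl\|\nabla^j\langle x\rangle^{k_1}\,\nabla^{[\frac d2+]-j+k_2}f\bigr\|_2\\ &\lesssim\|f\|_{H^{m,m}}+\sum_{0\le j\le[\frac d2+]}\bigl\|\langle x\rangle^{k_1-j}\,\nabla^{[\frac d2+]-j+k_2}f\bigr\|_2\\ &\lesssim\|f\|_{H^{m,m}}. \end{aligned} \]
The lemma is proved. □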

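Lemma 3.5 (Bilinear estimate in $H^{m,m}$). We have
\[ \|fg\|_{H^{m,m}}\lesssim\|f\|_{W^{m,\infty}}\|g\|_{H^{m,m}}, \tag{3.8} \]
with the implicit constant depending only on $m$.

Proof.
\[ \begin{aligned} \|fg\|_{H^{m,m}} &\lesssim\sum_{0\le j\le m}\bigl\|\langle x\rangle^{m-j}\nabla^j(fg)\bigr\|_2\\ &\lesssim\sum_{0\le j\le m}\sum_{0\le k\le j}\bigl\|\langle x\rangle^{m-j}\nabla^{j-k}f\,\nabla^kg\bigr\|_2\\ &\lesssim\sum_{0\le j\le m}\sum_{0\le k\le j}\bigl\|\langle x\rangle^{m-k}\nabla^kg\;\langle x\rangle^{k-j}\nabla^{j-k}f\bigr\|_2\\ &\lesssim\sum_{0\le j\le m}\sum_{0\le k\le j}\bigl\|\langle x\rangle^{m-k}\nabla^kg\bigr\|_2\bigl\|\nabla^{j-k}f\bigr\|_\infty\\ &\lesssim\|g\|_{H^{m,m}}\|f\|_{W^{m,\infty}}. \end{aligned} \]

Lemma 3.6. Let $C>0$, $j\ge 2$ and $m\ge\frac d2+1+\frac{Cj}{j-1}$; then
\[ \bigl\|\langle x\rangle^{Cj}h^j\bigr\|_{H^{m,m}}\lesssim j^m\|h\|^j_{H^{m,m}}, \]
where the implicit constant depends only on $m$.

Proof. From the definition and the chain rule, we estimate
\[ \begin{aligned} \bigl\|\langle x\rangle^{Cj}h^j\bigr\|_{H^{m,m}} &\lesssim\sum_{0\le l\le m}\bigl\|\langle x\rangle^{m-l}\nabla^l\bigl(\langle x\rangle^{Cj}h^j\bigr)\bigr\|_2\\ &\lesssim\sum_{0\le l\le m}\ \sum_{l_0+\cdots+l_\alpha=l}j^l\,\bigl\|\langle x\rangle^{m-l+Cj-l_0}\,\nabla^{l_1}h\,\nabla^{l_2}h\cdots\nabla^{l_\alpha}h\;h^{j-\alpha}\bigr\|_2\\ &\lesssim j^m\sum_{0\le l\le m}\ \sum_{l_0+\cdots+l_\alpha=l}\bigl\|\langle x\rangle^{m-l+Cj-l_0}\langle x\rangle^{-(m-l_1+m-l_2-\frac d2-1+\cdots+m-l_\alpha-\frac d2-1+(j-\alpha)(m-\frac d2-1))}\bigr\|_\infty\\ &\qquad\cdot\bigl\|\langle x\rangle^{m-l_1}\nabla^{l_1}h\bigr\|_2\cdot\bigl\|\langle x\rangle^{m-l_2-\frac d2-1}\nabla^{l_2}h\bigr\|_\infty\cdots\bigl\|\langle x\rangle^{m-l_\alpha-\frac d2-1}\nabla^{l_\alpha}h\bigr\|_\infty\cdot\bigl\|\langle x\rangle^{m-\frac d2-1}h\bigr\|_\infty^{j-\alpha}. \end{aligned} \]
Since $m>\frac d2+1+\frac{Cj}{j-1}$, it is not difficult to verify that the exponent of $\langle x\rangle$ is non-positive in the first factor of the last expression. This combined with Lemma 3.4 shows that
\[ \bigl\|\langle x\rangle^{Cj}h^j\bigr\|_{H^{m,m}}\lesssim j^m\|h\|^j_{H^{m,m}}. \tag{3.9} \]
□

We will prove the following: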



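Proposition 3.7. Let $a\in\mathbb R$. Let $Y_+$ and $W^a_k=W^a_k(t,x)$ be the same as in Lemma 3.1. Assume $m\ge 3d$ is fixed. Then there exist $k_0>0$ and a unique solution $W^a(t,x)$ of the equation in (1.1) which satisfies the following: for any $k\ge k_0$, there exists $t_k\ge 0$ such that for all $t\ge t_k$,
\[ \bigl\|W^a(t)-W^a_k(t)\bigr\|_{H^{m,m}}\le e^{-\alpha t},\qquad \alpha=\Bigl(k+\frac12\Bigr)e_0. \tag{3.10} \]
Moreover, we have
\[ \bigl\|W^a(t)-W-ae^{-e_0t}Y_+\bigr\|_{H^{m,m}}\le e^{-\frac32e_0t}. \tag{3.11} \]

Proof. Let $h=W^a-W^a_k$; then $W^a$ is the solution of (1.1) as long as $h$ is a solution of Eq. (3.3) which tends to 0 as $t\to\infty$. From Duhamel's formula, it is equivalent to solve the following integral equation
\[ h(t)=i\int_t^\infty e^{i(t-s)\Delta}\bigl(-\Gamma(h)-iR(h+v_k)+iR(v_k)+i\varepsilon^a_k\bigr)(s)\,ds=:\Phi\bigl(h(t)\bigr). \tag{3.12} \]
Define the space $\Sigma_{t_k}$ by $\|f\|_{\Sigma_{t_k}}=\sup_{t\ge t_k}e^{\alpha t}\|f(t)\|_{H^{m,m}}$ and introduce the unit ball
\[ B_k=\bigl\{f=f(t,x):\ \|f\|_{\Sigma_{t_k}}\le 1\bigr\}. \]
We shall show that $\Phi$ is a contraction on $B_k$. Taking $h\in B_k$, we compute the $H^{m,m}$ norm of $\Phi(h(t))$:
\[ \bigl\|\Phi\bigl(h(t)\bigr)\bigr\|_{H^{m,m}}\le\int_t^\infty\bigl\|e^{i(t-s)\Delta}\Gamma\bigl(h(s)\bigr)\bigr\|_{H^{m,m}}\,ds \tag{3.13} \]
\[ \qquad+\int_t^\infty\bigl\|e^{i(t-s)\Delta}\bigl(R(h+v_k)-R(v_k)\bigr)(s)\bigr\|_{H^{m,m}}\,ds \tag{3.14} \]
\[ \qquad+\int_t^\infty\bigl\|e^{i(t-s)\Delta}\varepsilon^a_k(s)\bigr\|_{H^{m,m}}\,ds. \tag{3.15} \]
To estimate (3.13), we use Lemmas 3.5, 3.3 to get
\[ \begin{aligned} (3.13) &\lesssim\int_t^\infty e^{C|t-s|}\bigl\|\Gamma\bigl(h(s)\bigr)\bigr\|_{H^{m,m}}\,ds\\ &\lesssim\int_t^\infty e^{C|t-s|}\bigl\|W^{p_c-1}\bigr\|_{W^{m,\infty}}\|h(s)\|_{H^{m,m}}\,ds\\ &\lesssim\int_t^\infty e^{C|t-s|}e^{-\alpha s}\|h\|_{\Sigma_{t_k}}\,ds\\ &\lesssim e^{-\alpha t}\|h\|_{\Sigma_{t_k}}\int_t^\infty e^{-(\alpha-C)(s-t)}\,ds\\ &\lesssim\frac{1}{\alpha-C}\,e^{-\alpha t}\|h\|_{\Sigma_{t_k}}. \end{aligned} \tag{3.16} \]
Since $\alpha=(k+\frac12)e_0$, by taking $k_0$ sufficiently large, we have
\[ (3.13)\le\frac{1}{100}e^{-\alpha t}\|h\|_{\Sigma_{t_k}}\le\frac{1}{100}e^{-\alpha t} \tag{3.17} \]
for all $k\ge k_0$. Now we deal with (3.15). Note that by Lemma 3.1, $\varepsilon^a_k(t)=O(e^{-(k+1)e_0t})$ in $\mathcal S(\mathbb R^d)$. This implies
\[ \bigl\|\varepsilon^a_k(t)\bigr\|_{H^{m,m}}\le C_k\,e^{-(k+1)e_0t}. \]
Thus,
\[ \begin{aligned} (3.15) &\lesssim\int_t^\infty e^{C|t-s|}\bigl\|\varepsilon^a_k(s)\bigr\|_{H^{m,m}}\,ds\\ &\lesssim C_k\int_t^\infty e^{C|t-s|}e^{-(k+1)e_0s}\,ds\\ &\lesssim C_k\,\frac{e^{-\frac12e_0t}}{(k+1)e_0-C}\,e^{-(k+\frac12)e_0t}\le\frac{1}{100}e^{-\alpha t} \end{aligned} \tag{3.18} \]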

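if $t\ge t_k$ and $t_k$ is sufficiently large. It remains to estimate (3.14). The reason that we can take $m$ derivatives is that both $v_k$ and $h$ are small compared to $W$. Indeed, by Remark 3.2, we have
\[ \bigl|v_k(t,x)\bigr|<\frac12W(x),\qquad \forall t\ge t_k,\ x\in\mathbb R^d. \tag{3.19} \]
Moreover, since $h\in\Sigma_{t_k}$ and $m\ge 3d$, by Lemma 3.4 we have
\[ \bigl\|\langle x\rangle^{d-2}h(t)\bigr\|_\infty\lesssim\|h(t)\|_{H^{m,m}}\le e^{-\alpha t}\|h\|_{\Sigma_{t_k}}. \]
As a consequence, we have
\[ \bigl|h(t,x)\bigr|\lesssim e^{-\alpha t}\langle x\rangle^{-(d-2)}\|h\|_{\Sigma_{t_k}}\le\frac14W(x). \tag{3.20} \]
Using (3.19) and (3.20) together with the expansion for the real analytic function $P(z)=|1+z|^{p_c-1}(1+z)$ for $|z|\le\frac34$, which takes the form
\[ P(z)=1+\frac{p_c+1}{2}z+\frac{p_c-1}{2}\bar z+\sum_{j_1+j_2\ge 2}a_{j_1,j_2}z^{j_1}\bar z^{j_2}, \tag{3.21} \]
we write
\[ \begin{aligned} i\bigl(R(v_k+h)-R(v_k)\bigr) &=W^{p_c}\Bigl(\Bigl|1+\frac{v_k+h}{W}\Bigr|^{p_c-1}\Bigl(1+\frac{v_k+h}{W}\Bigr)-\Bigl|1+\frac{v_k}{W}\Bigr|^{p_c-1}\Bigl(1+\frac{v_k}{W}\Bigr)-\frac{p_c+1}{2}\frac hW-\frac{p_c-1}{2}\frac{\bar h}{W}\Bigr)\\ &=W^{p_c}\sum_{j_1+j_2\ge 2}a_{j_1,j_2}\Bigl(\Bigl(\frac{v_k+h}{W}\Bigr)^{j_1}\Bigl(\frac{\bar v_k+\bar h}{W}\Bigr)^{j_2}-\Bigl(\frac{v_k}{W}\Bigr)^{j_1}\Bigl(\frac{\bar v_k}{W}\Bigr)^{j_2}\Bigr)\\ &=O\Bigl(\sum_{j\ge 2,\ 1\le i\le j}a_jC_{i,j}W^{p_c-j}\,O\bigl(v_k^{j-i}h^i\bigr)\Bigr), \end{aligned} \tag{3.22} \]
where the last equality follows from using the binomial expansion and regrouping coefficients, and we have the bounds
\[ a_j=O\Bigl(\frac{p_c(p_c-1)\cdots(p_c-j+1)}{j!}\Bigr)\ \text{and}\ |a_j|\lesssim 1,\qquad C_{i,j}=O\Bigl(\frac{j!}{i!(j-i)!}\Bigr)\ \text{and}\ C_{i,j}\lesssim 2^j. \]
The notation $O(v_k^{j-i}h^i)$ denotes terms of the form
\[ v_k^{\alpha_1}\bar v_k^{\alpha_2}h^{\beta_1}\bar h^{\beta_2} \]
with $\alpha_1+\alpha_2=j-i$, $\beta_1+\beta_2=i$. Now we use this expression to estimate $\|R(v_k+h)-R(v_k)\|_{H^{m,m}}$. Using Lemmas 3.5 and 3.6, we have
\[ \begin{aligned} \bigl\|R(v_k+h)-R(v_k)\bigr\|_{H^{m,m}} &\lesssim\sum_{j\ge 2,\ 1\le i\le j}|a_j|\,C_{i,j}\bigl\|W^{p_c-j}v_k^{j-i}h^i\bigr\|_{H^{m,m}}\\ &\lesssim\sum_{j\ge 2}2^j\bigl\|W^{-j}v_k^{j-1}h\bigr\|_{H^{m,m}}+\sum_{j\ge 2,\ 2\le i\le j}2^j\bigl\|(W^{-1}v_k)^{j-i}\,W^{-i}h^i\bigr\|_{H^{m,m}}\\ &\lesssim\sum_{j\ge 2}2^j\bigl\|W^{-j}v_k^{j-1}\bigr\|_{W^{m,\infty}}\|h\|_{H^{m,m}}+\sum_{j\ge 2,\ 2\le i\le j}2^ji^m\bigl\|(W^{-1}v_k)^{j-i}\bigr\|_{W^{m,\infty}}\|h\|^i_{H^{m,m}}. \end{aligned} \]
Applying Remark 3.2 and in view of $h\in\Sigma_{t_k}$, we have
\[ \bigl\|W^{-j}v_k^{j-1}\bigr\|_{W^{m,\infty}}\lesssim j^mC_{k,m}\,e^{-(j-1)e_0t},\qquad \bigl\|(W^{-1}v_k)^{j-i}\bigr\|_{W^{m,\infty}}\lesssim j^mC_{k,m}\,e^{-(j-i)e_0t}. \]
Noting moreover that $\|h\|_{H^{m,m}}\le e^{-\alpha t}\|h\|_{\Sigma_{t_k}}$, we estimate
\[ \begin{aligned} \bigl\|R(v_k+h)(t)-R(v_k)(t)\bigr\|_{H^{m,m}} &\lesssim\sum_{j\ge 2,\ 1\le i\le j}2^jj^{2m}C_{k,m}\bigl(e^{-\alpha t}\|h\|_{\Sigma_{t_k}}\bigr)^i\bigl(e^{-e_0t}\bigr)^{j-i}\\ &\lesssim e^{-\alpha t}\|h\|_{\Sigma_{t_k}}\sum_{j\ge 2,\ 1\le i\le j}2^jj^{2m}C_{k,m}\,e^{-(\alpha(i-1)+(j-i)e_0)t}\\ &\lesssim e^{-\alpha t}\|h\|_{\Sigma_{t_k}}\sum_{j\ge 2}2^jj^{2m}C_{m,k}\,e^{-(j-1)e_0t_k}\\ &\le\frac{1}{100}e^{-\alpha t}\|h\|_{\Sigma_{t_k}}. \end{aligned} \]
The last inequality comes from the fact that we can choose $t_k$ large enough such that the series converges. Now we are ready to estimate (3.14). Using Lemma 3.3, we have
\[ \begin{aligned} (3.14) &\lesssim\int_t^\infty e^{C|t-s|}\bigl\|R(v_k+h)(s)-R(v_k)(s)\bigr\|_{H^{m,m}}\,ds\\ &\le\frac{1}{100}\|h\|_{\Sigma_{t_k}}\int_t^\infty e^{C(s-t)}e^{-\alpha s}\,ds\\ &\le\frac{1}{100}e^{-\alpha t}\|h\|_{\Sigma_{t_k}}\le\frac{1}{100}e^{-\alpha t}. \end{aligned} \tag{3.23} \]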

Collecting the estimates (3.17), (3.18) and (3.23), we obtain Φ h(t)

H m,m

1 −αt e 10

1 , 10

(3.24)

for all k k0 and t tk . Therefore Φ(h)

Σtk

which shows that Φ maps Bk to itself. Next we show that Φ is a contraction. Taking h1 and h2 in Σtk , we compute

1944


Φ h1 (t) − Φ h2 (t)

H m,m

∞

i(t−s) e Γ h1 (s) − h2 (s)

H m,m

(3.25)

ds

t

∞ + ei(t−s) R(vk + h1 ) − R(vk + h2 ) (s)

H m,m

(3.26)

ds.

t

The estimate of (3.25) is the same as (3.16), we omit the details. To estimate (3.26), we write −i R(vk + h1 ) − R(vk + h2 )

vk + h1 j1 v¯k + h¯ 1 j2 vk + h2 j1 v¯k + h¯ 2 j2 pc −j aj1 ,j2 W − = W W W W j 2

=O

aj Ci,j W

pc −j

j −i i O (h1 − h2 )vk h ,

j 2, 1ij −1

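where the constants $a_j$, $C_{i,j}$ are the same as in (3.22). We are in the same situation as before. Therefore, we obtain
\[
\big\|\Phi(h_1)(t) - \Phi(h_2)(t)\big\|_{H^{m,m}} \le \frac{1}{10}\, e^{-\alpha t}\, \|h_1 - h_2\|_{\Sigma_{t_k}},
\qquad \forall k \ge k_0,\ t \ge t_k, \tag{3.27}
\]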

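which shows that $\Phi$ is a contraction in $B_k$. This proves the existence and uniqueness of the solution to the equation in (1.1) such that (3.10) holds. It only remains to show that $W^a(t,x)$ is independent of $k$. Indeed, let $k_1 < k_2$ and let $W^a$, $\tilde W^a$ be the corresponding solutions such that
\begin{align*}
\big\|W^a(t) - W^a_{k_1}(t)\big\|_{H^{m,m}} &\le e^{-(k_1+\frac12)e_0 t}, \qquad \forall t \ge t_{k_1}, \\
\big\|\tilde W^a(t) - W^a_{k_2}(t)\big\|_{H^{m,m}} &\le e^{-(k_2+\frac12)e_0 t}, \qquad \forall t \ge t_{k_2}.
\end{align*}
Without loss of generality we also assume $t_{k_1} \le t_{k_2}$; then the triangle inequality gives that
\[
\big\|\tilde W^a(t) - W^a_{k_1}(t)\big\|_{H^{m,m}}
\le \big\|\tilde W^a(t) - W^a_{k_2}(t)\big\|_{H^{m,m}} + \big\|W^a_{k_2}(t) - W^a_{k_1}(t)\big\|_{H^{m,m}}
\le e^{-(k_1+\frac12)e_0 t},
\]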

k1 j 0 such that for all t t0 and all 2 p ∞, we have l l a x 1 ∇ 2 w (t) as long as l1 + l2 +

d 2

1

p

Lx

e − 2 e0 t ,

(3.28)

+ 1 m. In particular, a w ˙ 1

1

S ([t,∞))

e − 2 e0 t .

Proof. Let k0 be the same as in Proposition 3.7, then by Remark 3.2 we have a w (t)

H m,m

w a (t) − vk0 (t)H m,m + vk0 (t)H m,m −j e t a −(k0 + 12 )e0 t 0 e + e Φj . 1j k0

H m,m

Thus for t0 sufficiently large and t t0 , we obtain a w (t)

2

H m,m

e − 3 e0 t .

An application of Sobolev embedding gives that for any p with 2 p ∞, l l a x 1 ∇ 2 w (t)

p

Lx

provided l1 + l2 +

d 2

1 w a H m,m e− 2 e0 t ,

+ 1 < m. The corollary is proved.

2

Before finishing this section, we make the following two remarks.

Remark 3.9. In the next section we shall show that for $a$, $b$ such that $ab > 0$, $W^a$ is just a time translation of $W^b$. This will allow us to define $W^{\pm}$ as $W^{\pm 1}$ and to classify the solutions with threshold energy.

The second remark concerns the behavior of $W^{\pm}$ in the negative time direction.

Remark 3.10. From the construction of $W^{\pm}(t)$, it is clear that both approach the ground state $W$ exponentially fast as $t \to +\infty$. For the behavior of $W^{\pm}$ in the negative time direction, we can apply the same argument as in [6] (see Corollaries 3.2, 4.2 for instance) to conclude that $W^-$



scatters as $t \to -\infty$ and $W^+$ blows up in finite time. To get the blowup of $W^+$, we need the crucial property $W^+ \in L^2_x$, which is now available as we are in dimensions $d \ge 6$.

4. Classification of the solution

Our purpose in this section is to prove Theorem 1.4. Following the argument in [6], the key step is to establish the following

Theorem 4.1. Let $\gamma_0 > 0$. Assume $u$ is the solution of the equation in (1.1) satisfying $E(u) = E(W)$ and
\[
\|u(t) - W\|_{\dot H^1} \le C e^{-\gamma_0 t}, \qquad \forall t \ge 0, \tag{4.1}
\]
then there exists $a \in \mathbb{R}$ such that $u = W^a$.

As a corollary of Theorem 4.1, we see that modulo time translation, all the $\{W^a,\ a > 0\}$ and all the $\{W^a,\ a < 0\}$ are the same.

Corollary 4.2. For any $a \ne 0$, there exists $T_a \in \mathbb{R}$ such that
\[
\begin{cases}
W^a(t) = W^+(t + T_a), & \text{if } a > 0, \\
W^a(t) = W^-(t + T_a), & \text{if } a < 0.
\end{cases} \tag{4.2}
\]

We now prove Theorem 4.1. The strategy is the following: we first prove that there exists $a \in \mathbb{R}$ such that $u(t) - W^a(t)$ has enough decay, then we use the decay estimate to show that $u(t) - W^a(t)$ is actually identically zero. To this end, we have to input the condition (4.1) and upgrade it to the desired decay estimate. At this point, we need the following crucial result from [6].

Lemma 4.3. Let $h$ be the solution of the equation
\[
\partial_t h + \mathcal{L} h = \varepsilon. \tag{4.3}
\]
And for $t \ge 0$,
\[
\|h(t)\|_{\dot H^1} \le C e^{-c_0 t}, \qquad
\|\varepsilon\|_{\dot N^1([t,\infty))} \le C e^{-c_1 t}, \qquad
\|\varepsilon(t)\|_{L_x^{\frac{2d}{d+2}}} \le C e^{-c_1 t},
\]
where $c_0 < c_1$. Then the following statements hold true:

• If $c_0 < c_1$ or $e_0 < c_0 < c_1$, then
\[
\|h\|_{\dot S^1([t,\infty))} \le C_\eta\, e^{-(c_1 - \eta)t}. \tag{4.4}
\]

• If $c_0 \le e_0 < c_1$, then there exists $a \in \mathbb{R}$ such that
\[
\big\|h - a e^{-e_0 t}\mathcal{Y}_+\big\|_{\dot S^1([t,\infty))} \le C_\eta\, e^{-(c_1 - \eta)t}. \tag{4.5}
\]

Let $v = u - W$, then (4.1) gives that
\[
\|v(t)\|_{\dot H^1} \lesssim e^{-\gamma_0 t}, \qquad t \ge 0. \tag{4.6}
\]

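Without loss of generality we assume $\gamma_0 < e_0$. We first show that this decay rate can be upgraded to $e^{-e_0 t}$. More precisely, we have

Proposition 4.4. Let $v = u - W$, then there exists $t_0 > 0$ such that for all $t \ge t_0$,
\begin{align}
\|v\|_{\dot S^1([t,\infty))} \le C e^{-e_0 t}, \qquad
\|R(v)\|_{\dot N^1([t,\infty))} &\le C e^{-\frac{d+\frac32}{d-2} e_0 t}, \tag{4.7} \\
\|R(v)(t)\|_{L_x^{\frac{2d}{d+2}}} &\le C e^{-p_c \gamma_0 t}. \tag{4.8}
\end{align}
In particular, there exists $a \in \mathbb{R}$ such that
\[
\big\|v - a e^{-e_0 t}\mathcal{Y}_+\big\|_{\dot S^1([t,\infty))} \le C_\eta\, e^{-\left(\frac{d+\frac32}{d-2} - \eta\right) e_0 t}. \tag{4.9}
\]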

Proof. First we show that (4.9) is a consequence of (4.7), (4.8). To see this, note that $v$ satisfies the equation
\[
i\partial_t v + \Delta v + \Gamma(v) + iR(v) = 0, \tag{4.10}
\]
or equivalently,
\[
\partial_t v + \mathcal{L} v = -R(v). \tag{4.11}
\]
Applying Lemma 4.3 with $h = v$, $\varepsilon = -R(v)$, $c_0 = e_0$, $c_1 = \frac{d+\frac32}{d-2} e_0$ and using the estimates (4.7) and (4.8), we obtain (4.9). So we only need to establish (4.7) and (4.8). This will be done in two steps. In the first step, we prove that the Strichartz norm of $v$ decays like $e^{-\gamma_0 t}$ and the dual Strichartz norm of $R(v)$ decays even faster. In the second step, we iterate this process and upgrade the decay estimate by using Lemma 4.3 finitely many times.

Step 1. We prove there exists $t_0 > 0$ such that for $t \ge t_0$,
\[
\|v\|_{\dot S^1([t,\infty))} \lesssim e^{-\gamma_0 t}, \qquad
\|R(v)\|_{\dot N^1([t,\infty))} \lesssim e^{-\frac{d+\frac32}{d-2}\gamma_0 t}. \tag{4.12}
\]

Let $\tau$ be a small constant to be chosen later. Using the Strichartz estimate on the time interval $[t, t+\tau]$, we have
\[
\|v\|_{\dot S^1([t,t+\tau])} \lesssim \|v(t)\|_{\dot H^1} + \|\Gamma(v)\|_{\dot N^1([t,t+\tau])} + \|R(v)\|_{\dot N^1([t,t+\tau])}. \tag{4.13}
\]



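For the linear term, we have
\begin{align}
\|\Gamma(v)\|_{\dot N^1([t,t+\tau])}
&\lesssim \big\|W^{p_c-1}\nabla v\big\|_{L^1_t L^2_x([t,t+\tau]\times\mathbb{R}^d)}
 + \big\|W^{p_c-2}\nabla W\, v\big\|_{L^1_t L^2_x([t,t+\tau]\times\mathbb{R}^d)} \notag \\
&\lesssim \tau\, \big\|W^{p_c-1}\big\|_{L^\infty_x} \|\nabla v\|_{L^\infty_t L^2_x([t,t+\tau]\times\mathbb{R}^d)}
 + \tau\, \big\|W^{p_c-2}\nabla W\big\|_{L^d_x} \|v\|_{L^\infty_t L_x^{\frac{2d}{d-2}}([t,t+\tau]\times\mathbb{R}^d)} \notag \\
&\lesssim \tau\, \|v\|_{\dot S^1([t,t+\tau])}. \tag{4.14}
\end{align}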
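The exponents in the Hölder step of (4.14) can be checked directly. The following is only a bookkeeping sketch of that step, not an additional claim:
\[
\frac{1}{2} = \frac{1}{d} + \frac{d-2}{2d}
\quad\Longrightarrow\quad
\big\|W^{p_c-2}\nabla W\, v\big\|_{L^2_x} \le \big\|W^{p_c-2}\nabla W\big\|_{L^d_x}\, \|v\|_{L_x^{\frac{2d}{d-2}}},
\]
and the factor $\tau$ arises from Hölder in time on an interval of length $\tau$: $\|g\|_{L^1_t([t,t+\tau])} \le \tau\, \|g\|_{L^\infty_t([t,t+\tau])}$.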

This is good for us. Now we deal with the term $R(v)$. In lower dimensions, it is easy to see that $R(v)$ is superlinear in $v$. In higher dimensions ($d \ge 6$), this is trickier. Here we will rely heavily on the fact that $W$ has nice decay to show that $R(v)$ is essentially superlinear in $v$. We claim that for any time interval $I$,
\[
\|R(v)\|_{\dot N^1(I)} \lesssim \|v\|_{\dot S^1(I)}^{p_c} + \|v\|_{\dot S^1(I)}^{\frac{d+\frac32}{d-2}}. \tag{4.15}
\]

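Assume the claim is true for the moment. By (4.6), (4.13)–(4.15), we have
\[
\|v\|_{\dot S^1([t,t+\tau])} \lesssim e^{-\gamma_0 t} + \tau\, \|v\|_{\dot S^1([t,t+\tau])} + \|v\|_{\dot S^1([t,t+\tau])}^{p_c} + \|v\|_{\dot S^1([t,t+\tau])}^{\frac{d+\frac32}{d-2}}. \tag{4.16}
\]
Taking $\tau$ small enough, a continuity argument shows that there exists $t_0 > 0$ such that for all $t \ge t_0$,
\[
\|v\|_{\dot S^1([t,t+\tau])} \lesssim e^{-\gamma_0 t}. \tag{4.17}
\]
Therefore, we have
\[
\|v\|_{\dot S^1([t,\infty))}
\le \sum_{j\ge 0} \|v\|_{\dot S^1([t+\tau j,\, t+\tau(j+1)])}
\lesssim \sum_{j\ge 0} e^{-\gamma_0 (t+\tau j)}
\lesssim e^{-\gamma_0 t}\, \frac{1}{1 - e^{-\gamma_0 \tau}}
\lesssim e^{-\gamma_0 t}.
\]
Plugging this estimate into (4.15), we have proved (4.12). Now it remains to prove the claim (4.15). Recall that
\[
iR(v) = |v+W|^{p_c-1}(v+W) - W^{p_c} - \frac{p_c+1}{2} W^{p_c-1} v - \frac{p_c-1}{2} W^{p_c-1} \bar v
= W^{p_c} J\Big(\frac{v}{W}\Big),
\]
where
\[
J(z) = |1+z|^{p_c-1}(1+z) - 1 - \frac{p_c+1}{2} z - \frac{p_c-1}{2} \bar z.
\]
We write $\nabla R(v)$ as
\begin{align}
i\nabla R(v)
&= p_c W^{p_c-1}\nabla W\, J\Big(\frac{v}{W}\Big)
 + W^{p_c} J_z\Big(\frac{v}{W}\Big) \nabla\Big(\frac{v}{W}\Big)
 + W^{p_c} J_{\bar z}\Big(\frac{v}{W}\Big) \nabla\Big(\frac{\bar v}{W}\Big) \notag \\
&= \frac{p_c+1}{2}\Big[ |v+W|^{p_c-1}\nabla(v+W) - W^{p_c-1}\nabla W - (p_c-1)W^{p_c-2}\nabla W\, v - W^{p_c-1}\nabla v \Big] \notag \\
&\quad + \frac{p_c-1}{2}\Big[ |v+W|^{p_c-3}(v+W)^2\nabla(\bar v+W) - W^{p_c-1}\nabla W - (p_c-1)W^{p_c-2}\nabla W\, \bar v - W^{p_c-1}\nabla \bar v \Big]. \tag{4.18}
\end{align}
Note moreover that for $|z| < 1$, $J(z)$ is real analytic in $z$. Thus for $|z| \le \frac34$, we have
\[
|J(z)| \lesssim |z|^2, \qquad |J_z(z)|,\ |J_{\bar z}(z)| \lesssim |z|. \tag{4.19}
\]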
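The quadratic bound in (4.19) can be sanity-checked numerically. The sketch below is illustrative only: it assumes the energy-critical exponent $p_c = (d+2)/(d-2)$, samples $J$ on a polar grid in the disk $|z| \le 3/4$, and reports the largest observed ratio $|J(z)|/|z|^2$. The function names and grid sizes are hypothetical choices, and a finite grid is of course no substitute for the analytic argument.

```python
import cmath
import math


def critical_exponent(d):
    """Energy-critical power p_c = (d + 2) / (d - 2) (assumed convention)."""
    return (d + 2) / (d - 2)


def J(z, p):
    """J(z) = |1+z|^(p-1) (1+z) - 1 - (p+1)/2 z - (p-1)/2 conj(z)."""
    w = 1 + z
    return abs(w) ** (p - 1) * w - 1 - (p + 1) / 2 * z - (p - 1) / 2 * z.conjugate()


def max_quadratic_ratio(d, radius=0.75, n_r=60, n_theta=180):
    """Largest sampled value of |J(z)| / |z|^2 over 0 < |z| <= radius."""
    p = critical_exponent(d)
    worst = 0.0
    for a in range(1, n_r + 1):
        r = radius * a / n_r
        for b in range(n_theta):
            z = cmath.rect(r, 2 * math.pi * b / n_theta)
            worst = max(worst, abs(J(z, p)) / r ** 2)
    return worst


if __name__ == "__main__":
    for d in (6, 7, 10):
        print(d, max_quadratic_ratio(d))
```

The ratio stays bounded on the sampled disk, which is the content of $|J(z)| \lesssim |z|^2$; the implied constant depends on $d$ through $p_c$.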

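To estimate $\|\nabla R(v)\|_{\dot N^0}$, we split $\mathbb{R}^d$ into the regimes $\{x : |v(x)| \le \frac14 W(x)\}$ and $\{x : |v(x)| > \frac14 W(x)\}$. In the first regime, we use the expression (4.18), the estimate (4.19) and Lemma 2.3 to get
\begin{align*}
\|\nabla R(v)\|_{\dot N^0(I;\,|v|\le\frac14 W)}
&\lesssim \Big\|W^{p_c-1}\nabla W\, \frac{v^2}{W^2}\Big\|_{\dot N^0(I;\,|v|\le\frac14 W)}
 + \Big\|W^{p_c}\, \frac{v}{W}\, \nabla\Big(\frac{v}{W}\Big)\Big\|_{\dot N^0(I;\,|v|\le\frac14 W)} \\
&\lesssim \big\|W^{p_c-3}\nabla W\, v^2\big\|_{\dot N^0(I;\,|v|\le\frac14 W)}
 + \big\|W^{p_c-2}\, v\nabla v\big\|_{\dot N^0(I;\,|v|\le\frac14 W)} \\
&\lesssim \|v\|_{\dot S^1(I)}^{p_c} + \|v\|_{\dot S^1(I)}^{\frac{d+\frac32}{d-2}}.
\end{align*}
In the second regime, we use the second equality in (4.18). Using the triangle inequality and Lemma 2.3, and noting that the conjugate counterpart gives the same contribution, we obtain
\begin{align*}
\|\nabla R(v)\|_{\dot N^0(I;\,|v|>\frac14 W)}
&\lesssim \big\|\big(|v+W|^{p_c-1} - W^{p_c-1}\big)\nabla v\big\|_{\dot N^0(I;\,|v|>\frac14 W)}
 + \big\|\big(|v+W|^{p_c-1} - W^{p_c-1}\big)\nabla W\big\|_{\dot N^0(I;\,|v|>\frac14 W)} \\
&\quad + \big\|W^{p_c-2}\nabla W\, v\big\|_{\dot N^0(I;\,|v|>\frac14 W)} \\
&\lesssim \big\||v|^{p_c-1}\nabla v\big\|_{\dot N^0(I)}
 + \big\|\nabla W\, |v|^{p_c-1}\big\|_{\dot N^0(I;\,|v|>\frac14 W)}
 + \big\|W^{p_c-2}\nabla W\, v\big\|_{\dot N^0(I;\,|v|>\frac14 W)} \\
&\lesssim \|v\|_{\dot S^1(I)}^{p_c} + \|v\|_{\dot S^1(I)}^{\frac{d+\frac32}{d-2}}.
\end{align*}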



Combining the two estimates together, (4.15) is verified. Finally, we quickly show that
\[
\|R(v)(t)\|_{L_x^{\frac{2d}{d+2}}} \lesssim e^{-p_c \gamma_0 t}.
\]
Indeed, noting that $R(v) = -iW^{p_c} J\big(\frac{v}{W}\big)$ and $|J(z)| \lesssim |z|^{p_c}$, we estimate
\[
\|R(v)(t)\|_{L_x^{\frac{2d}{d+2}}}
\lesssim \big\||v(t)|^{p_c}\big\|_{L_x^{\frac{2d}{d+2}}}
\lesssim \|v(t)\|_{L_x^{\frac{2d}{d-2}}}^{p_c}
\lesssim \|v(t)\|_{\dot H^1_x}^{p_c}
\lesssim e^{-p_c \gamma_0 t}.
\]

Step 2. Iteration. Since $v$ satisfies the equation (4.11), we can use Lemma 4.3 with $h = v$ and $\varepsilon = -R(v)$ to get
\[
\|v(t)\|_{\dot H^1} \le C\big(e^{-e_0 t} + e^{-\frac{d+1}{d-2}\gamma_0 t}\big).
\]
If $\frac{d+1}{d-2}\gamma_0 \ge e_0$, then by repeating the same arguments as above, we have
\[
\|v\|_{\dot S^1([t,\infty))} \lesssim e^{-e_0 t},
\]
and the proposition is proved. Otherwise, we are in the same situation as in the first step with $\gamma_0$ now replaced by $\frac{d+1}{d-2}\gamma_0$. Iterating this process finitely many times, we obtain the proposition. □

Based on this result, we now show that $u - W^a$ decays arbitrarily fast.

Proposition 4.5. For any $m > 0$, there exists $t_m > 0$ such that
\[
\|u - W^a\|_{\dot S^1([t,\infty))} \le e^{-mt}, \qquad \forall t \ge t_m. \tag{4.20}
\]
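The phrase "finitely many times" in Step 2 of the proof of Proposition 4.4 can be made concrete. The sketch below is purely illustrative: it assumes the only thing that matters is the bookkeeping $\gamma \mapsto \frac{d+1}{d-2}\gamma$ until the rate reaches $e_0$, and the function name and sample values are hypothetical.

```python
def decay_upgrade_rates(gamma0, e0, d):
    """Successive decay rates gamma0, c*gamma0, c^2*gamma0, ... with
    c = (d + 1) / (d - 2), stopping at the first rate that is >= e0."""
    if gamma0 <= 0 or e0 <= 0 or d < 3:
        raise ValueError("need gamma0 > 0, e0 > 0 and d >= 3")
    factor = (d + 1) / (d - 2)  # strictly greater than 1 for every d >= 3
    rates = [gamma0]
    while rates[-1] < e0:
        rates.append(factor * rates[-1])
    return rates


if __name__ == "__main__":
    rates = decay_upgrade_rates(gamma0=0.1, e0=1.0, d=6)
    print(len(rates) - 1, rates)
```

Since the factor exceeds 1, the loop terminates after roughly $\log(e_0/\gamma_0)/\log\frac{d+1}{d-2}$ rounds, which is why only finitely many applications of Lemma 4.3 are needed.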

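Proof. Step 1. We first remark that as a consequence of Proposition 4.4, we can prove
\[
\|u - W^a\|_{\dot S^1([t,\infty))} \le e^{-\frac{d}{d-2} e_0 t}, \qquad \forall t \ge t_0.
\]
Indeed, by the triangle inequality and recalling that $v = u - W$, we estimate
\[
\|u - W^a\|_{\dot S^1([t,\infty))}
\le \big\|v - a e^{-e_0 t}\mathcal{Y}_+\big\|_{\dot S^1([t,\infty))}
 + \|w^a - v_{k_0}\|_{\dot S^1([t,\infty))}
 + \big\|v_{k_0} - a e^{-e_0 t}\mathcal{Y}_+\big\|_{\dot S^1([t,\infty))}.
\]
For the first term, we use (4.9) to get
\[
\big\|v - a e^{-e_0 t}\mathcal{Y}_+\big\|_{\dot S^1([t,\infty))} \le \frac12\, e^{-\frac{d}{d-2} e_0 t}.
\]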

For the last two terms, we use the definition of vk (see (3.1)) and Proposition 3.7 to obtain for any 2 p ∞:



∇ vk − ae−e0 t Y+ p e− 54 e0 t , 0 L x

a ∇ w − vk 0

p

Lx

w a − vk0 H m,m W a − Wka0 H m,m 1

e−(k0 + 2 )e0 t . Integrating in the time variable over [t, ∞), we have a d 1 3 1 −e0 t w − vk ˙ 1 Y+ S˙ 1 ([t,∞)) e− 2 e0 t e− d−2 e0 t . 0 S ([t,∞)) + vk0 − ae 2 2 Hence u − W a ˙ 1

d

S ([t,∞))

e− d−2 e0 t .

Step 2. We will prove (4.20) by induction. More precisely, suppose there exists $t_{m_1} > 0$ such that
\[
\|u - W^a\|_{\dot S^1([t,\infty))} \le e^{-m_1 t}, \qquad \forall t \ge t_{m_1}; \tag{4.21}
\]
we aim to prove for $t$ large enough that
\[
\|u - W^a\|_{\dot S^1([t,\infty))} \le e^{-\frac{e_0}{2(d-2)} t}\, e^{-m_1 t}. \tag{4.22}
\]
From Step 1, we can assume that (4.21) holds with $m_1 \ge \frac{d}{d-2} e_0$. Let $h = u - W^a$, then $h$ solves the equation
\[
\partial_t h + \mathcal{L} h = -R(h + w^a) + R(w^a), \tag{4.23}
\]
with
\[
\|h\|_{\dot S^1([t,\infty))} \le e^{-m_1 t}, \qquad m_1 > e_0.
\]
An application of Lemma 4.3 gives (4.22) immediately if we establish the following:
\begin{align}
\big\|R(h + w^a)(t) - R(w^a)(t)\big\|_{L_x^{\frac{2d}{d+2}}} &\lesssim e^{-\frac{e_0}{d-2} t - m_1 t}, \tag{4.24} \\
\big\|R(h + w^a) - R(w^a)\big\|_{\dot N^1([t,\infty))} &\lesssim e^{-\frac{7}{4(d-2)} e_0 t}\, \|h\|_{\dot S^1([t,\infty))} \le e^{-\frac{e_0}{d-2} t - m_1 t}, \tag{4.25}
\end{align}
for $t$ large enough. The remaining part of the proof is devoted to showing (4.24) and (4.25). The idea is similar to the proof of (4.15). We split the space into two regimes. In the regime where $h$ is large, we use the decay estimate of $W$ to show that $R(h + w^a) - R(w^a)$ is superlinear in $h$. In the



regime where $h$ is small, we simply use the real analytic expansion of the complex function $P(z) = |1+z|^{p_c-1}(1+z)$. However, the argument here is more involved than the proof of (4.15). We first show (4.24). To begin with, we recall the exact form of $R(h + w^a) - R(w^a)$. We have
\begin{align}
i\big(R(h + w^a) - R(w^a)\big)
&= |w^a + h + W|^{p_c-1}(w^a + h + W) - |w^a + W|^{p_c-1}(w^a + W) \notag \\
&\quad - \frac{p_c+1}{2} W^{p_c-1} h - \frac{p_c-1}{2} W^{p_c-1} \bar h. \tag{4.26}
\end{align}
By the triangle inequality, we estimate
\begin{align}
\big\|R(h + w^a)(t) - R(w^a)(t)\big\|_{L_x^{\frac{2d}{d+2}}}
&\le \big\|R(h + w^a)(t) - R(w^a)(t)\big\|_{L_x^{\frac{2d}{d+2}}(|h| > \frac14 W)} \tag{4.27} \\
&\quad + \big\|R(h + w^a)(t) - R(w^a)(t)\big\|_{L_x^{\frac{2d}{d+2}}(|h| \le \frac14 W)}. \tag{4.28}
\end{align}

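For (4.27), we use the fact that $|w^a(t,x)| \le \frac12 W(x)$, which follows from Corollary 3.8, to estimate
\begin{align}
\big\|R(h + w^a)(t) - R(w^a)(t)\big\|_{L_x^{\frac{2d}{d+2}}(|h| > \frac14 W)}
&\lesssim \big\||h|^{p_c-1}\, |w^a + W|\big\|_{L_x^{\frac{2d}{d+2}}(|h| > \frac14 W)}
 + \big\||w^a + W + h|^{p_c-1}\, h\big\|_{L_x^{\frac{2d}{d+2}}(|h| > \frac14 W)} \notag \\
&\lesssim \big\||h(t)|^{p_c-1} h(t)\big\|_{L_x^{\frac{2d}{d+2}}}
\lesssim \|h(t)\|_{L_x^{\frac{2d}{d-2}}}^{p_c} \notag \\
&\lesssim e^{-m_1 p_c t}. \tag{4.29}
\end{align}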

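For (4.28), we use $P(z) = |1+z|^{p_c-1}(1+z)$ to rewrite (4.26) as
\begin{align}
& i\big(R(h + w^a) - R(w^a)\big) \tag{4.30} \\
&\quad = W^{p_c}\Big[ P\Big(\frac{h + w^a}{W}\Big) - P\Big(\frac{w^a}{W}\Big) - \frac{p_c+1}{2}\frac{h}{W} - \frac{p_c-1}{2}\frac{\bar h}{W} \Big]. \tag{4.31}
\end{align}
Note that
\[
\frac{|w^a + h|}{W} \le \frac34, \qquad \frac{|w^a|}{W} \le \frac12. \tag{4.32}
\]
We use the expansion for $P(z)$ (see (3.21)) to write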


i R h + wa − R wa =

j1 ¯ a j1 a j2 h + w¯ a j2 w¯ w − W W W j −i i aj Ci,j W pc −1−j ∇W w a h ,

aj1 ,j2

j1 +j2 2

=O


h + wa W

j 2, 1ij

where |aj | 1 and Ci,j 2j . Therefore by triangle inequality we have R h + w a (t) − R w a (t)

j −i 2j W pc −j w a (t) h(t)i

2j h(t)

j 2

+

j 2, 2ij

h(t) j 2

+

2d Lxd−2

j 2, 2ij

p −j a j −1 W c w (t)

(4.34)

d

Lx2

p j −i 2j h(t) c 2d W pc −j w a (t) h(t)i−pc L∞ (|h| 1 W ) x

Lxd−2

2d Lxd−2

4

−1 a j −1 W w (t) d Lx2

(j −1)

p i−p j −i 2j h(t) c 2d hW −1 L∞ c w a W −1 L∞ Lxd−2

x

j −e (j −1)t p h(t)H˙ 1 2 e 0 + h(t)H˙c1 x

2d

Lxd+2 (|h(t)| 14 W )

j 2, 1ij

(4.33)

2d

Lxd+2 (|h| 14 W )

j 2

x

x

j 2, 2ij

2j

i−pc 1 e−e0 (j −i)t 4

e−(e0 +m1 )t . Collecting estimates (4.29) and (4.33) we obtain (4.24). Next we prove (4.25). To this end, we take the gradient and regroup the term, we have i∇ R h + w a − R w a =

p −1 pc + 1 a w + h + W c − W pc −1 ∇h 2 p −1 p −1 a + w + h + W c − w a + W c ∇ w a + W pc −2 + (pc − 1)W ∇W h p −3 2 pc − 1 a w + h + W c w a + h + W − W pc −1 ∇ h¯ 2 p −3 a 2 + w + h + W c w a + h + W p −3 2 − w a + W c w a + W ∇ w¯ a + W

+ (pc − 1)W pc −2 ∇W h¯ . +



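By Lemma 2.3, Corollary 3.8 and the triangle inequality we have
\begin{align}
\big\|\nabla\big(R(h + w^a) - R(w^a)\big)\big\|_{\dot N^0([t,\infty);\,|h| > \frac14 W)}
&\lesssim \big\||h + w^a|^{p_c-1}\nabla h\big\|_{\dot N^0([t,\infty);\,|h| > \frac14 W)}
 + \big\||h|^{p_c-1}\nabla(w^a + W)\big\|_{\dot N^0([t,\infty);\,|h| > \frac14 W)} \notag \\
&\quad + \big\|W^{p_c-2}\nabla W\, h\big\|_{\dot N^0([t,\infty);\,|h| > \frac14 W)} \notag \\
&\lesssim \|w^a\|_{\dot S^1([t,\infty))}^{p_c-1}\, \|h\|_{\dot S^1([t,\infty))}
 + \|h\|_{\dot S^1([t,\infty))}^{p_c}
 + \|h\|_{\dot S^1([t,\infty))}^{\frac{d+\frac32}{d-2}} \notag \\
&\lesssim e^{-\frac{e_0}{2}(p_c-1)t}\, \|h\|_{\dot S^1([t,\infty))}. \tag{4.35}
\end{align}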

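To get the estimate in the regime where $|h|$ is small, we adopt the form (4.30). By the chain rule we have
\begin{align}
i\nabla\big(R(h + w^a) - R(w^a)\big)
&= p_c W^{p_c-1}\nabla W \Big[ P\Big(\frac{h + w^a}{W}\Big) - P\Big(\frac{w^a}{W}\Big) - \frac{p_c+1}{2}\frac{h}{W} - \frac{p_c-1}{2}\frac{\bar h}{W} \Big] \tag{4.36} \\
&\quad + W^{p_c}\, \nabla\Big[ P\Big(\frac{h + w^a}{W}\Big) - P\Big(\frac{w^a}{W}\Big) - \frac{p_c+1}{2}\frac{h}{W} - \frac{p_c-1}{2}\frac{\bar h}{W} \Big]. \tag{4.37}
\end{align}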

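In view of (4.32), we can use the expansion for $P(z)$ (see (3.21)) to write (4.36) as
\begin{align}
(4.36) &= p_c W^{p_c-1}\nabla W \sum_{j_1+j_2\ge 2} a_{j_1,j_2} \Big[ \Big(\frac{h + w^a}{W}\Big)^{j_1} \Big(\frac{\bar h + \bar w^a}{W}\Big)^{j_2} - \Big(\frac{w^a}{W}\Big)^{j_1} \Big(\frac{\bar w^a}{W}\Big)^{j_2} \Big] \notag \\
&= O\Big( \sum_{j\ge 2,\ 1\le i\le j} a_j C_{i,j}\, W^{p_c-1-j}\, \nabla W\, (w^a)^{j-i} h^i \Big), \tag{4.38}
\end{align}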

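where the constants $a_j$, $C_{i,j}$ are the same as those in (3.22). Now we deal with the second term (4.37). Applying the chain rule and regrouping the terms, we eventually get
\begin{align}
(4.37) &= \frac{p_c+1}{2} W^{p_c-1} \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-1} - 1 \Big) \nabla h \tag{4.39} \\
&\quad - \frac{p_c+1}{2} W^{p_c-2}\nabla W \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-1} - 1 \Big) h \tag{4.40} \\
&\quad + \frac{p_c+1}{2} W^{p_c-1} \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-1} - \Big|1 + \frac{w^a}{W}\Big|^{p_c-1} \Big) \nabla w^a \tag{4.41} \\
&\quad - \frac{p_c+1}{2} W^{p_c-2}\nabla W \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-1} - \Big|1 + \frac{w^a}{W}\Big|^{p_c-1} \Big) w^a \tag{4.42} \\
&\quad + \frac{p_c-1}{2} W^{p_c-1} \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-3} \Big(1 + \frac{h + w^a}{W}\Big)^2 - 1 \Big) \nabla \bar h \tag{4.43} \\
&\quad + \frac{p_c-1}{2} W^{p_c-1} \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-3} \Big(1 + \frac{h + w^a}{W}\Big)^2 - \Big|1 + \frac{w^a}{W}\Big|^{p_c-3} \Big(1 + \frac{w^a}{W}\Big)^2 \Big) \nabla \bar w^a \tag{4.44} \\
&\quad - \frac{p_c-1}{2} W^{p_c-2}\nabla W \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-3} \Big(1 + \frac{h + w^a}{W}\Big)^2 - 1 \Big) \bar h \tag{4.45} \\
&\quad - \frac{p_c-1}{2} W^{p_c-2}\nabla W \Big( \Big|1 + \frac{h + w^a}{W}\Big|^{p_c-3} \Big(1 + \frac{h + w^a}{W}\Big)^2 - \Big|1 + \frac{w^a}{W}\Big|^{p_c-3} \Big(1 + \frac{w^a}{W}\Big)^2 \Big) \bar w^a. \tag{4.46}
\end{align}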
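For (4.39) and (4.40) we use the fact that $\big||1+z|^{p_c-1} - 1\big| \lesssim |z|^{p_c-1}$ to bound them as
\[
|(4.39)| + |(4.40)| \lesssim |h + w^a|^{p_c-1}\, |\nabla h| + W^{-1}|\nabla W| \cdot |h + w^a|^{p_c-1} \cdot |h|.
\]
For (4.41) we use the expansion
\[
|1+z|^{p_c-1} = 1 + \frac{p_c-1}{2} z + \frac{p_c-1}{2} \bar z + \sum_{j_1+j_2\ge 2} b_{j_1,j_2}\, z^{j_1} \bar z^{j_2}
\]
to write
\begin{align*}
(4.41) &= \frac{p_c^2-1}{4} W^{p_c-2} \big( h\nabla w^a + \bar h\nabla w^a \big) \\
&\quad + \frac{p_c+1}{2} W^{p_c-1}\nabla w^a \sum_{j_1+j_2\ge 2} b_{j_1,j_2} \Big[ \Big(\frac{h + w^a}{W}\Big)^{j_1} \Big(\frac{\bar h + \bar w^a}{W}\Big)^{j_2} - \Big(\frac{w^a}{W}\Big)^{j_1} \Big(\frac{\bar w^a}{W}\Big)^{j_2} \Big] \\
&= O\big( W^{p_c-2}\, h\nabla w^a \big) + O\Big( \sum_{j\ge 2,\ 1\le i\le j} b_j C_{i,j}\, W^{p_c-1-j}\, \nabla w^a\, O\big((w^a)^{j-i} h^i\big) \Big),
\end{align*}
where in the last equality we use the same conventions as in (3.22). In particular, the constants satisfy $|b_j| \lesssim 1$ and $C_{i,j} \lesssim 2^j$. We therefore have the bound
\[
|(4.41)| \lesssim W^{p_c-2}\, |\nabla w^a|\, |h| + \sum_{j\ge 2,\ 1\le i\le j} 2^j\, W^{p_c-1-j}\, |\nabla w^a|\, |w^a|^{j-i}\, |h|^i.
\]
Similarly for (4.42) we have
\[
|(4.42)| \lesssim W^{p_c-3}\, |\nabla W|\, |w^a|\, |h| + \sum_{j\ge 2,\ 1\le i\le j} 2^j\, W^{p_c-2-j}\, |\nabla W|\, |w^a|^{j+1-i}\, |h|^i.
\]
Collecting all the estimates and noticing that (4.43) through (4.46) are just complex conjugates of (4.39) through (4.42), we can therefore bound (4.37) as follows:
\begin{align}
|(4.37)| &\lesssim |h + w^a|^{p_c-1}\, |\nabla h| \tag{4.47} \\
&\quad + W^{-1}|\nabla W| \cdot |h + w^a|^{p_c-1} \cdot |h| \tag{4.48} \\
&\quad + W^{p_c-2}\, |\nabla w^a|\, |h| \tag{4.49} \\
&\quad + W^{p_c-3}\, |\nabla W|\, |w^a|\, |h| \tag{4.50} \\
&\quad + \sum_{j\ge 2,\ 1\le i\le j} 2^j\, W^{p_c-1-j}\, |\nabla w^a|\, |w^a|^{j-i}\, |h|^i \tag{4.51} \\
&\quad + \sum_{j\ge 2,\ 1\le i\le j} 2^j\, W^{p_c-2-j}\, |\nabla W|\, |w^a|^{j+1-i}\, |h|^i. \tag{4.52}
\end{align}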

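Now our task is reduced to bounding the $\dot N^0$ norm of (4.38) and (4.47) through (4.52). We start from (4.47). Using Lemma 2.3 and Corollary 3.8 we have
\begin{align}
\|(4.47)\|_{\dot N^0([t,\infty);\,|h| \le \frac14 W)}
&\lesssim \|w^a\|_{\dot S^1([t,\infty))}^{p_c-1}\, \|h\|_{\dot S^1([t,\infty))} + \|h\|_{\dot S^1([t,\infty))}^{p_c} \notag \\
&\lesssim e^{-\frac{e_0}{2}(p_c-1)t}\, \|h\|_{\dot S^1([t,\infty))}. \tag{4.53}
\end{align}
Similarly we have
\begin{align}
\|(4.48)\|_{\dot N^0([t,\infty);\,|h| \le \frac14 W)}
&\lesssim \|w^a\|_{\dot S^1([t,\infty))}^{\frac{7}{2(d-2)}}\, \|h\|_{\dot S^1([t,\infty))} + \|h\|_{\dot S^1([t,\infty))}^{\frac{d+\frac32}{d-2}} \notag \\
&\lesssim e^{-\frac{7}{4(d-2)} e_0 t}\, \|h\|_{\dot S^1([t,\infty))}. \tag{4.54}
\end{align}
For (4.49), (4.50), we use Hölder's inequality and Corollary 3.8 to get
\begin{align}
\|(4.49)\|_{\dot N^0([t,\infty);\,|h| \le \frac14 W)}
&\lesssim \big\|W^{p_c-2}\, \nabla w^a\, h\big\|_{L^2_s L_x^{\frac{2d}{d+2}}([t,\infty))} \notag \\
&\lesssim \big\|W^{p_c-2}\, \nabla w^a\big\|_{L^\infty_s L_x^{\frac{d}{3}}([t,\infty))}\, \|h\|_{L^2_s L_x^{\frac{2d}{d-4}}([t,\infty))} \notag \\
&\lesssim e^{-\frac{e_0}{2} t}\, \|h\|_{\dot S^1([t,\infty))}, \tag{4.55} \\
\|(4.50)\|_{\dot N^0([t,\infty);\,|h| \le \frac14 W)}
&\lesssim \|h\|_{L^2_s L_x^{\frac{2d}{d-4}}([t,\infty))}\, \big\|W^{p_c-3}\, \nabla W\, w^a\big\|_{L^\infty_s L_x^{\frac{d}{3}}([t,\infty))} \notag \\
&\lesssim e^{-\frac{e_0}{2} t}\, \|h\|_{\dot S^1([t,\infty))}. \tag{4.56}
\end{align}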

Now we are left with the estimates of the summation terms (4.38), (4.51) and (4.52). We first treat (4.38). We have (4.38) ˙ 0

N ([t,∞);|h| 14 W )

j 2, 1ij

d+3 j −i i−1 2j W d−2 −j w a h h

2d

L2s Lxd+2 ([t,∞);|h| 14 W )

(4.57)


d+3 2j W d−2 −j hj


(4.58)

2d

L2s Lxd+2 ([t,∞);|h| 14 W )

j 2

+

d+3 j −i i−1 2j W d−2 −j w a h h

2d

L2s Lxd+2 ([t,∞);|h| 14 W )

j 2, 1ij −1

.

(4.59)

For (4.58) we have by Lemma 2.3, j d+3 −2 2 (4.58) 2 W d−2 h

d L2s Lxd+2 ([t,∞);|h| 14 W )

j 2

2 h j

j 2

h

d+ 32 d−2 S˙ 1 ([t,∞)

·

2−j j −2 W h

1 L∞ s,x ([t,∞);|h| 4 W )

j −2 1 4

d+ 32 d−2 S˙ 1 ([t,∞)

7m1

e− 2(d−2) t hS˙ 1 ([t,∞) . For (4.59) we estimate

(4.59)

2j h

j 2, 1ij −1

2d L2s Lxd+4 ([t,∞))

d+3 j −i × W d−2 −1+i−j w a

−1 i−1 W h ∞

Ls,x ([t,∞))

d

3 L∞ s Lx ([t,∞))

j 2, 1ij −1

i−1 1 1 2 hS˙ 1 ([t,∞)) e− 2 (j −i)e0 t 4 j

e0

hS˙ 1 ([t,∞)) e− 2 t

j 2, 1ij −1

hS˙ 1 ([t,∞)) e

e − 20 t

2j

i−1 e0 1 e− 2 (j −i−1)t 4 e0

2−j 4j −i−1 e− 2 (j −i−1)t

j 2, 1ij −1

e

e − 20 t

hS˙ 1 ([t,∞)) .

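This ends the estimate of (4.38). Using the fact that $|\nabla w^a| \lesssim |\nabla W|$ and $|w^a| \lesssim W$, (4.50) and (4.52) can be bounded by (4.38), and thus satisfy the same estimate
\[
\|(4.50)\|_{\dot N^0([t,\infty);\,|h| \le \frac14 W)} + \|(4.52)\|_{\dot N^0([t,\infty);\,|h| \le \frac14 W)} \lesssim e^{-\frac{e_0}{2} t}\, \|h\|_{\dot S^1([t,\infty))}. \tag{4.60}
\]
Collecting the estimates (4.35), (4.53) through (4.60), we have
\[
\big\|R(h + w^a) - R(w^a)\big\|_{\dot N^1([t,\infty))} \lesssim e^{-\frac{7}{4(d-2)} e_0 t}\, \|h\|_{\dot S^1([t,\infty))} \le e^{\left(-m_1 - \frac{e_0}{d-2}\right)t}.
\]
Thus (4.25) is proved and we conclude the proof of the proposition. □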



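As the last step of the argument, we show that any solution $h$ of Eq. (4.23) which has enough exponential decay must be identically 0. This would imply $u = W^a$ and we can conclude the proof of Theorem 4.1. To this end, we have

Proposition 4.6. Let $h$ be the solution of Eq. (4.23) satisfying the following: for every $m > 0$, there exists $t_m > 0$ such that
\[
\|h\|_{\dot S^1([t,\infty))} \le e^{-mt}, \qquad \forall t > t_m. \tag{4.61}
\]
Then $h \equiv 0$.

Proof. Note first that in an equivalent form, $h$ satisfies
\[
i\partial_t h + \Delta h = -\Gamma(h) + i\big(-R(h + w^a) + R(w^a)\big),
\]
hence the following Duhamel formula holds:
\[
h(t) = i\int_t^\infty e^{i(t-s)\Delta} \big(-\Gamma(h) - iR(h + w^a) + iR(w^a)\big)(s)\,ds,
\]
since $\|h(t)\|_{\dot H^1} \to 0$ as $t \to \infty$. Using the Strichartz estimate we then have
\[
\|h\|_{\dot S^1([t,\infty))} \lesssim \|\Gamma(h)\|_{\dot N^1([t,\infty))} + \big\|R(h + w^a) - R(w^a)\big\|_{\dot N^1([t,\infty))}.
\]
Denote $\|h\|_{\Sigma_t} := \sup_{s\ge t} e^{ms}\|h\|_{\dot S^1([s,\infty))}$. For $\eta > 0$ small enough we have
\begin{align*}
\|\Gamma(h)\|_{\dot N^1([t,\infty))}
&\le \sum_{j\ge 0} \|\Gamma(h)\|_{\dot N^1([t+\eta j,\, t+\eta(j+1)])}
\lesssim \sum_{j\ge 0} \eta\, \|h\|_{\dot S^1([t+\eta j,\, t+(j+1)\eta])} \\
&\lesssim \sum_{j\ge 0} \eta\, e^{-m(t+\eta j)}\, \|h\|_{\Sigma_{t_m}}
\lesssim e^{-mt}\, \|h\|_{\Sigma_{t_m}}\, \frac{\eta}{1 - e^{-\eta m}}
\le \frac{2}{m}\, e^{-mt}\, \|h\|_{\Sigma_{t_m}}.
\end{align*}
From the estimate (4.25), we get
\[
\big\|R(w^a + h) - R(w^a)\big\|_{\dot N^1([t,\infty))} \le \frac{1}{10}\, e^{-mt}\, \|h\|_{\Sigma_{t_m}}.
\]
Combining these two estimates, we get for $m$ large enough that
\[
\|h\|_{\Sigma_{t_m}} \le \frac12\, \|h\|_{\Sigma_{t_m}}, \tag{4.62}
\]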


which implies that $h = 0$ on $[t_m, \infty)$. Recalling that $h = u - W^a$, we obtain $u = W^a$ on $[t_m, \infty)$. Therefore $u \equiv W^a$ by uniqueness of solutions to (1.1). The proposition is proved and we have Theorem 4.1. □

Proof of Corollary 4.2. The proof is almost the same as that of Corollary 6.6 in [6]. Let $a \ne 0$ and let $T_a$ be such that $|a| e^{-e_0 T_a} = 1$. By (3.11) we have
\[
\big\|W^a(t + T_a) - W \mp e^{-e_0 t}\mathcal{Y}_+\big\|_{H^{m,m}} \lesssim e^{-\frac32 e_0 t}. \tag{4.63}
\]
Moreover $W^a(\cdot + T_a)$ satisfies the assumption in Theorem 4.1, thus there exists $a'$ such that $W^a(\cdot + T_a) = W^{a'}$. By (4.63), $a' = 1$ if $a > 0$ and $a' = -1$ if $a < 0$. Corollary 4.2 is proved. □

Finally, we give the proof of the main Theorem 1.4.

Proof of Theorem 1.4. We first note that (2) is just the variational characterization of $W$. More precisely we have

Theorem 4.7. (See [1,20].) Let $c(d)$ denote the sharp constant in the Sobolev embedding
\[
\|f\|_{\frac{2d}{d-2}} \le c(d)\, \|\nabla f\|_2.
\]
Then equality holds iff $f$ is $W$ up to symmetries. More precisely, there exists $(\theta_0, \lambda_0, x_0) \in \mathbb{R} \times \mathbb{R}^+ \times \mathbb{R}^d$ such that
\[
f(x) = e^{i\theta_0}\, \lambda_0^{-\frac{d-2}{2}}\, W\Big(\frac{x - x_0}{\lambda_0}\Big).
\]
In particular, if $u_0$ satisfies
\[
E(u_0) = E(W), \qquad \|\nabla u_0\|_2 = \|\nabla W\|_2,
\]
then $u_0$ coincides with $W$ up to symmetries, hence the corresponding solution $u$ coincides with $W$ up to symmetries.

It remains for us to show (1), (3). We first prove (1). Let $u$ be the maximal-lifespan solution of (1.1) on $I$ satisfying $E(u) = E(W)$, $\|\nabla u_0\|_2 < \|\nabla W\|_2$. Then by Proposition 1.5, we have $I = \mathbb{R}$. Assume that $u$ blows up forward in time. Applying Proposition 1.5 again, we conclude that there exist $\theta_0$, $\mu_0$, $\gamma_0$ such that
\[
\big\|u(t) - W_{[\theta_0,\mu_0]}\big\|_{\dot H^1} \lesssim e^{-\gamma_0 t}.
\]
This implies
\[
\big\|u_{[-\theta_0,\mu_0^{-1}]}(t) - W\big\|_{\dot H^1} \lesssim e^{-\gamma_0 \mu_0^2 t},
\]
where
\[
u_{[-\theta_0,\mu_0^{-1}]}(t,x) = e^{-i\theta_0}\, \mu_0^{\frac{d-2}{2}}\, u\big(\mu_0^2 t,\ \mu_0 x\big)
\]



is also a solution of Eq. (1.1). By Theorem 4.1 with $\gamma_0$ now replaced by $\gamma_0 \mu_0^2$, we conclude there exists $a < 0$ such that
\[
u_{[-\theta_0,\mu_0^{-1}]} = W^a.
\]
Using Corollary 4.2, we get
\[
u(t,x) = e^{i\theta_0}\, \mu_0^{-\frac{d-2}{2}}\, W^-\big(\mu_0^{-2} t + T_a,\ \mu_0^{-1} x\big).
\]
This shows that $u = W^-$ up to symmetries. The proof of (3) is similar so we omit it. This ends the proof of Theorem 1.4. □

Acknowledgments

Both authors were supported by the National Science Foundation under agreement No. DMS0635607. Dong Li was also supported by start-up funding from the Mathematics Department of the University of Iowa. X. Zhang was also supported by NSF grant No. 10601060 and project 973 in China.

References

[1] T. Aubin, Équations différentielles non linéaires et problème de Yamabe concernant la courbure scalaire, J. Math. Pures Appl. (9) 55 (3) (1976) 269–296.
[2] J. Bourgain, Global well-posedness of defocusing 3D critical NLS in the radial case, J. Amer. Math. Soc. 12 (1999) 145–171.
[3] T. Cazenave, F.B. Weissler, Some remarks on the nonlinear Schrödinger equation in the subcritical case, in: New Methods and Results in Nonlinear Field Equations, Bielefeld, 1987, in: Lecture Notes in Phys., vol. 347, Springer, Berlin, 1989, pp. 59–69.
[4] T. Cazenave, Semilinear Schrödinger Equations, Courant Lect. Notes Math., vol. 10, Amer. Math. Soc., 2003.
[5] J. Colliander, M. Keel, G. Staffilani, H. Takaoka, T. Tao, Global well-posedness and scattering for the energy-critical nonlinear Schrödinger equation in $\mathbb{R}^3$, Ann. of Math. 167 (2007) 767–865.
[6] T. Duyckaerts, F. Merle, Dynamic of threshold solutions for energy-critical NLS, preprint, arXiv:0710.5915 [math.AP].
[7] R.T. Glassey, On the blowing up of solution to the Cauchy problem for nonlinear Schrödinger operators, J. Math. Phys. 8 (1977) 1794–1797.
[8] M. Keel, T. Tao, Endpoint Strichartz estimates, Amer. J. Math. 120 (1998) 955–980.
[9] C. Kenig, F. Merle, Global well-posedness, scattering, and blowup for the energy-critical, focusing, non-linear Schrödinger equation in the radial case, Invent. Math. 166 (2006) 645–675.
[10] S. Keraani, On the blow-up phenomenon of the critical nonlinear Schrödinger equation, J. Funct. Anal. 235 (2006) 171–192.
[11] R. Killip, D. Li, M. Visan, X. Zhang, Characterization of minimal-mass blowup solutions to the focusing mass-critical NLS, preprint, math.ap/0804.1124.
[12] R. Killip, M. Visan, Nonlinear Schrödinger equations at critical regularity, preprint.
[13] R. Killip, M. Visan, The focusing energy-critical nonlinear Schrödinger equation in dimensions five and higher, Clay Lecture Notes, in press, math.ap/0804.1018.
[14] D. Li, X. Zhang, Dynamics for the energy critical nonlinear wave equation in dimensions $d \ge 6$, preprint.
[15] F. Merle, Determination of blow-up solutions with minimal mass for nonlinear Schrödinger equation with critical power, Duke Math. J. 69 (1993) 427–453.
[16] E. Ryckman, M. Visan, Global well-posedness and scattering for the defocusing energy-critical nonlinear Schrödinger equation in $\mathbb{R}^{1+4}$, Amer. J. Math. 129 (2007) 1–60.
[17] R.S. Strichartz, Restriction of Fourier transform to quadratic surfaces and decay of solutions of wave equations, Duke Math. J. 44 (1977) 705–774.
[18] T. Tao, Global well-posedness and scattering for the higher-dimensional energy-critical non-linear Schrödinger equation for radial data, New York J. Math. 11 (2005) 57–80.
[19] T. Tao, M. Visan, Stability of energy-critical nonlinear Schrödinger equations in high dimensions, Electron. J. Differential Equations 118 (2005) 1–28.
[20] G. Talenti, Best constant in Sobolev inequality, Ann. Mat. Pura Appl. (4) 110 (1976) 353–372.
[21] M. Visan, The defocusing energy-critical nonlinear Schrödinger equation in higher dimensions, Duke Math. J. 138 (2007) 281–374.
[22] M. Weinstein, The nonlinear Schrödinger equation—Singularity formation, stability and dispersion, in: The Connection Between Infinite-Dimensional and Finite-Dimensional Dynamical Systems, Boulder, 1987, in: Contemp. Math., vol. 99, Amer. Math. Soc., Providence, RI, 1989, pp. 213–232.


Multiscale Young measures in homogenization of continuous stationary processes in compact spaces and applications

Luigi Ambrosio a, Hermano Frid b,∗,1, Jean Silva b

a Scuola Normale Superiore, Piazza dei Cavalieri 7, 56126 Pisa, Italy
b Instituto de Matemática Pura e Aplicada (IMPA), Estrada Dona Castorina, 110, Rio de Janeiro, RJ 22460-320, Brazil

Received 2 June 2008; accepted 4 December 2008
Available online 16 December 2008
Communicated by Cedric Villani

Abstract We introduce a framework for the study of nonlinear homogenization problems in the setting of stationary continuous processes in compact spaces. The latter are functions f ◦ T : Rn × Q → Q with f ◦ T (x, ω) = f (T (x)ω) where Q is a compact (Hausdorff topological) space, f ∈ C(Q) and T (x) : Q → Q, x ∈ Rn , is an n-dimensional continuous dynamical system endowed with an invariant Radon probability measure μ. It can be easily shown that for almost all ω ∈ Q the realization f (T (x)ω) belongs to an algebra with mean value, that is, an algebra of functions in BUC(Rn ) containing all translates of its elements and such that each of its elements possesses a mean value. This notion was introduced by Zhikov and Krivenko [V.V. Zhikov, E.V. Krivenko, Homogenization of singularly perturbed elliptic operators, Mat. Zametki 33 (1983) 571–582, English transl. in Math. Notes 33 (1983) 294–300]. We then establish the existence of multiscale Young measures in the setting of algebras with mean value, where the compactifications of Rn provided by such algebras plays an important role. These parametrized measures are useful in connection with the existence of correctors in homogenization problems. We apply this framework to the homogenization of a porous medium type equation in Rn with a stationary continuous process as a stiff oscillatory external source. This application seems to be new even in the classical context of periodic homogenization. © 2008 Elsevier Inc. All rights reserved.


E-mail addresses: [email protected] (L. Ambrosio), [email protected] (H. Frid), [email protected] (J. Silva).
1 H. Frid gratefully acknowledges the support of CNPq, grant number 306137/2006-2, and FAPERJ, grant number E-26/152.192-2002.

0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.001

L. Ambrosio et al. / Journal of Functional Analysis 256 (2009) 1962–1997


Keywords: Stochastic homogenization; Stationary ergodic processes; Two-scale Young measures; Algebras with mean value; Ergodic algebras; Porous medium equation

1. Introduction Continuous dynamical systems in compact spaces constitute a classical matter going back to pioneering works of Birkhoff, von Neumann, Khintchine, Kolmogorov, Markov, Hopf, Krylov and Bogolyubov, among others, during the 1930’s. They provide a natural setting for stochastic homogenization problems which extends the setting of periodic and almost periodic functions and also combines topological and measure theoretic features that usually allow a better understanding of the involved questions. Following a series of important papers on stochastic homogenization of linear differential operators by Zhikov et al. [35–37] (see also [17]), Zhikov and Krivenko [38] introduced the notion of algebras with mean value which captures the essential properties of typical realizations of continuous stationary processes defined by continuous dynamical systems in compact spaces endowed with an invariant probability measure. More specifically, let Q be a compact (Hausdorff topological) space and T (x) : Q → Q, x ∈ Rn , be an n-dimensional continuous dynamical system, that is, T (0)ω = ω, T (x + y)ω = T (x)T (y)ω, for all ω ∈ Q, and the mapping T : Rn × Q → Q given by T (x, ω) = T (x)ω is continuous. A classical result of Krylov and Bogolyubov [20] establishes the existence of an invariant (regular) probability measure μ on Q for T (x); that is μ(T (x)E) = μ(E) for Borelian E. So we may assume that Q is endowed with such an invariant probability measure. A stationary continuous process is a mapping (x, ω) → f (T (x)ω) where f ∈ C(Q) and {T (x)}x∈Rn is an n-dimensional continuous dynamical system on a compact space Q endowed with some invariant measure. The dynamical system (endowed with an invariant measure) is said to be ergodic if whenever f ∈ L2 (Q) satisfies f (T (x)ω) = f (ω) for μ-a.e. ω ∈ Q, for all x ∈ Rn , then f is equivalent to a constant. 
Given any f ∈ C(Q), by means of the well-known Birkhoff ergodic theorem, one easily shows that for almost all ω ∈ Q the realization f (T (x)ω) belongs to a linear subspace A ⊆ BUC(Rn ), where BUC(Rn ) is the space of bounded uniformly continuous functions in Rn , with the following properties: (i) A is an algebra, i.e., if f, g ∈ A then f g ∈ A; (ii) if f ∈ A, then its translates f (· + t), t ∈ Rn , also belong to A; (iii) every f ∈ A possesses a mean value. A linear subspace of BUC(Rn ) satisfying these three properties is called an algebra with mean value (algebra w.m.v., for short). Given an algebra w.m.v. A we may define the associated generalized Besicovitch space B 2 as the completion of A with respect to the semi-norm provided by the square root of the mean value of |f |2 for f ∈ A. The algebra w.m.v. A is said to be ergodic if whenever f ∈ B 2 satisfies f (· + x) = f (·) in B 2 for all x ∈ Rn , then f is equivalent in B 2 to a constant. It can be shown that for almost all ω ∈ Q the realization f (T (x)ω) just mentioned belongs to an ergodic algebra, even if the dynamical system is not ergodic. We then follow the approach in [2], defining vector valued algebras with mean value and establishing the existence of multiscale Young measures in the setting of vector valued algebras with mean value. For that, as in the case of almost periodic functions, we make essential use of the fact that associated with any algebra w.m.v. A there is a compact space K such that any f ∈ A may be viewed as an element of C(K), which follows from a classical theorem of Stone, as is shown below (cf. Theorem 4.1). Such compact space associated with the algebra w.m.v. provides the additional parameter of the multiscale (two-scale) Young measures. The latter are useful tools for the search of corrector functions in nonlinear homogenization problems.



We show how this framework can be applied in the homogenization of nonlinear partial differential equations by considering the homogenization problem for a porous medium type equation with a stationary continuous process as a stiff oscillatory external source. In this general context we need to restrict the initial data to prepared ones, that is, those which satisfy an associated stationary equation in the oscillatory variable. Multiscale Young measures have been introduced in periodic problems by W. E [15] as a broader tool extending the previous concept of multiscale convergence introduced by Nguetseng [25] and further developed by Allaire [1]. It refines to multiple scale analysis the classical concept of Young measures introduced in [33], so fundamentally useful, especially after its striking applications in connection with problems concerning compactness of solution operators for nonlinear partial differential equations by Tartar [32], Murat [23], DiPerna [12–14], etc. This paper links multiscale Young measures to the recently growing interest in the more general setting of homogenization of random stationary ergodic processes (see, e.g., [7,10,17,18,22,28,31]). The extension of the multiscale Young measures from the periodic setting to the almost periodic one was carried out in [2] where applications to nonlinear transport equations, scalar conservation laws with oscillatory external sources, Hamilton–Jacobi equations and fully nonlinear elliptic equations are provided. In this connection, we recall that the two-scale convergence has been extended to the context of almost periodic homogenization and, more generally, to generalized Besicovitch spaces in [9] (see also, e.g., [26,27]). We also recall that the method of two-scale convergence was extended to the context of stochastic homogenization, under separability assumption, in [6]. The applications in the cited references [6,9,26,27] are basically to linear or monotone operators. 
This paper is organized as follows. In Section 2, we recall some concepts needed to state the well-known Birkhoff ergodic theorem, which will be used in later sections. We also recall the definition of continuous dynamical systems and the classical theorem of Krylov and Bogolyubov, and give some elementary examples. In Section 3 we recall the definition of algebras with mean value introduced in [38]. The purpose of Section 4 is to establish the connection between algebras with mean value and continuous dynamical systems in compact spaces. We also analyse the characterization of $AP(\mathbb{R}^n)$ by means of the properties of the associated compact spaces. In Section 5 we introduce the vector-valued algebras w.m.v. which are needed in the construction of the multiscale Young measures in the context of algebras w.m.v. In Section 6 we establish the theorem on the existence of multiscale Young measures from homogenization in algebras w.m.v. In Section 7 we apply the general framework established in the earlier sections to the homogenization problem of a porous medium type equation in $\mathbb{R}^n$ with a stationary continuous process as a stiff oscillatory external source, and oscillatory initial data satisfying a stationary equation in the oscillatory variable. We also include Appendix A where we state without proof some basic results that are needed in Section 7.

2. Stationary processes

We begin this section by recalling the definition of n-dimensional dynamical system in a probability measure space, as a preparation for the statement of the Birkhoff ergodic theorem.

Definition 2.1 (n-dimensional dynamical system). Let $(Q, \mathcal{M}(Q), \mu)$ be any probability measure space. An n-dimensional dynamical system on $Q$ is a family of mappings $T(x) : Q \to Q$, $x \in \mathbb{R}^n$, which satisfies the following conditions:



(i) (Group property) $T(0) = I$, where $I$ is the identity mapping on $Q$, and $T(x + y) = T(x)T(y)$ for all $x, y \in \mathbb{R}^n$;
(ii) (Invariance) The mappings $T(x) : Q \to Q$ are measurable and μ-measure preserving, i.e., $\mu(T(x)(E)) = \mu(E)$ for every $x \in \mathbb{R}^n$ and every $E \in \mathcal{M}(Q)$;
(iii) (Measurability) Given any $F \in \mathcal{M}(Q)$ the set $\{(x, \omega) \in \mathbb{R}^n \times Q : T(x)\omega \in F\} \subseteq \mathbb{R}^n \times Q$ is measurable with respect to the product σ-algebra $\mathcal{L}^n \otimes \mathcal{M}(Q)$, where $\mathcal{L}^n$ is the σ-algebra of Lebesgue measurable sets.

As usual, for $p \ge 1$ we denote by $L^p(Q)$ the space of the (equivalence classes of) measurable functions $f : Q \to \mathbb{R}$ such that $|f|^p$ is μ-integrable on $Q$, and by $L^\infty(Q)$ the space of the μ-essentially bounded measurable functions. For $f \in L^p(Q)$ and $f \in L^\infty(Q)$ respectively we denote

$$\|f\|_p := \Big(\int_Q |f|^p \, d\mu\Big)^{1/p}, \qquad \|f\|_\infty := \operatorname*{ess\,sup}_{\omega \in Q} |f(\omega)|.$$

An n-dimensional dynamical system $T(x) : Q \to Q$ induces an n-parameter group of transformations $T(x) : L^2(Q) \to L^2(Q)$ defined by $T(x)f(\omega) := f(T(x)\omega)$, $f \in L^2(Q)$. It follows that the operator $T(x) : L^2(Q) \to L^2(Q)$ is unitary for each $x \in \mathbb{R}^n$. Moreover, it is a consequence of the Lebesgue Dominated Convergence theorem (see [17, p. 223]) that the group $T(x)$ is strongly continuous, i.e.,

$$\lim_{x \to 0} \|T(x)f - f\|_2 = 0, \qquad \forall f \in L^2(Q). \tag{2.1}$$

Definition 2.2 (Ergodic dynamical system). Let $(Q, \mathcal{M}(Q), \mu)$ be any probability measure space and let $T(x) : Q \to Q$, $x \in \mathbb{R}^n$, be an n-dimensional dynamical system on $Q$. An $\mathcal{M}(Q)$-measurable function $f : Q \to \mathbb{R}$ is called invariant if $f(T(x)\omega) = f(\omega)$ μ-almost everywhere in $Q$, for all $x \in \mathbb{R}^n$. A dynamical system is said to be ergodic if every invariant function is μ-equivalent to a constant in $Q$.

If $f$ is a measurable function in $Q$, for a fixed $\omega \in Q$ the function $x \mapsto f(T(x)\omega)$, $x \in \mathbb{R}^n$, is called a realization of $f$ and the map $(x, \omega) \mapsto f(T(x)\omega)$ is called a stationary process. The process is said to be stationary ergodic if the dynamical system is ergodic. We will make use of the well-known Birkhoff ergodic theorem. In order to state it we need to introduce the notion of mean value for functions defined in $\mathbb{R}^n$.

Definition 2.3. Let $g \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$. A number $M(g)$ is called the mean value of $g$ if

$$\lim_{\varepsilon \to 0} \int_A g(\varepsilon^{-1} x) \, dx = |A| \, M(g) \tag{2.2}$$

for any Lebesgue measurable bounded set $A \subseteq \mathbb{R}^n$, where $|A|$ stands for the Lebesgue measure of $A$. This is equivalent to saying that $g(\varepsilon^{-1} x)$ converges, in the duality with $L^\infty$ and compactly supported functions, to the constant $M(g)$. Also, if $A_t := \{x \in \mathbb{R}^n : t^{-1} x \in A\}$ for $t > 0$ and $|A| \neq 0$, (2.2) may be written as

$$\lim_{t \to \infty} \frac{1}{t^n |A|} \int_{A_t} g(x) \, dx = M(g). \tag{2.3}$$
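As a rough numerical illustration of the mean value in the form (2.3), the following sketch averages an almost periodic function over the growing sets $A_t = [-t, t]$ (so $n = 1$ and $A = [-1, 1]$). The sample function `g`, the helper `mean_value` and the quadrature step are assumptions made for illustration only; they do not come from the paper.

```python
import math

def mean_value(g, t, step=0.01):
    """Midpoint-rule approximation of (1/|A_t|) * integral of g over A_t = [-t, t].

    This is the quantity whose limit as t -> infinity defines M(g) in (2.3),
    for n = 1 and A = [-1, 1].
    """
    n = int(round(2 * t / step))
    h = 2 * t / n
    total = 0.0
    for k in range(n):
        total += g(-t + (k + 0.5) * h)
    return total * h / (2 * t)

def g(x):
    # Hypothetical almost periodic sample: cos^2 has frequency 2, the sine
    # has frequency sqrt(2); the two are incommensurable.
    return math.cos(x) ** 2 + math.sin(math.sqrt(2) * x)

# cos^2 averages to 1/2 and the sine averages to 0, so M(g) = 1/2.
approx = mean_value(g, t=1000.0)
```

For this particular `g` the exact average over $[-t, t]$ is $1/2 + \sin(2t)/(4t)$, so the error decays like $1/t$.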

We now recall the Birkhoff ergodic theorem (see [11]).

Theorem 2.1 (Birkhoff ergodic theorem). Let $f \in L^p(Q)$, $p \ge 1$. Then for almost all $\omega \in Q$ the realization $g(x) = f(T(x)\omega)$ possesses a mean value in the sense of (2.2). Moreover, the mean value $M(f(T(\cdot)\omega))$ is invariant and

$$\int_Q f(\omega) \, d\mu = \int_Q M\big(f(T(\cdot)\omega)\big) \, d\mu.$$

In particular, if the system $T(x)$ is ergodic, then

$$M\big(f(T(\cdot)\omega)\big) = \int_Q f \, d\mu \qquad \text{for μ-almost all } \omega \in Q.$$
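The ergodic case of Theorem 2.1 can be checked numerically on a simple system. The sketch below assumes a one-dimensional flow on the 2-torus with irrational slope, $T(x)(\omega_1, \omega_2) = (\omega_1 + x, \omega_2 + \sqrt{2}\,x) \pmod 1$, which is ergodic for Lebesgue measure. The system, the function `f` and all names are illustrative choices, not taken from the paper.

```python
import math

SLOPE = math.sqrt(2)  # irrational slope: the linear flow on the 2-torus is ergodic

def f(w1, w2):
    # Continuous function on the 2-torus whose integral against Lebesgue measure is 1.
    return 1.0 + math.cos(2 * math.pi * w1) * math.cos(2 * math.pi * w2)

def realization_average(w1, w2, t, step=0.005):
    """Average over [0, t] of the realization x -> f(T(x)w),
    where T(x)(w1, w2) = (w1 + x, w2 + SLOPE * x) mod 1."""
    n = int(round(t / step))
    h = t / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += f((w1 + x) % 1.0, (w2 + SLOPE * x) % 1.0)
    return total / n

# Two different starting points; both averages should approach the space average 1.
averages = [realization_average(w1, w2, t=500.0) for (w1, w2) in [(0.0, 0.0), (0.3, 0.71)]]
```

Here the time average does not depend on the starting point $\omega$, which is what ergodicity predicts.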

Throughout the remainder of this paper we will be dealing with continuous n-dimensional dynamical systems $T(x)$ on compact topological spaces, whose definition we recall now.

Definition 2.4. Let $Q$ be a compact topological space. A continuous n-dimensional dynamical system on $Q$ is a family of mappings $T(x) : Q \to Q$, $x \in \mathbb{R}^n$, which satisfies the following conditions:

(i) $T(0) = I$, where $I$ is the identity mapping on $Q$, and $T(x + y) = T(x)T(y)$ for all $x, y \in \mathbb{R}^n$;
(ii) the mapping $(x, \omega) \mapsto T(x)\omega$ is continuous from $\mathbb{R}^n \times Q$ to $Q$.

Henceforth by compact space we will always mean a compact Hausdorff topological space. Moreover, in compact spaces $Q$ we shall always consider Radon measures. By a Radon measure μ we mean that μ is defined on the σ-algebra $\mathcal{B}(Q)$ of Borel sets, it is σ-additive and regular, in the sense that

$$\mu(B) = \inf\{\mu(A) : A \supset B,\ A \text{ open}\} \qquad \text{for all } B \in \mathcal{B}(Q),$$

and

$$\mu(B) = \sup\{\mu(K) : K \subseteq B,\ K \text{ compact}\} \qquad \text{for all } B \in \mathcal{B}(Q).$$



We recall that for a Radon probability measure μ on a compact space $Q$, the space $C(Q)$ is dense in the spaces $L^p(Q, \mu)$ of Borel functions whose pth power of the absolute value is μ-integrable, $1 \le p < \infty$ (see [29, p. 69]). A well-known theorem of Krylov and Bogolyubov [20] (see also [24]) asserts that for any continuous dynamical system $T(x) : Q \to Q$, $x \in \mathbb{R}^n$, there exist invariant Borel probability measures when $Q$ is a compact metric space. The result holds more generally when $Q$ is any compact Hausdorff topological space, and the proof of the more general statement is essentially the same as that of Krylov and Bogolyubov, with minor adaptations.

Theorem 2.2 (Krylov–Bogolyubov). Let $Q$ be a compact Hausdorff topological space and let $T(x) : Q \to Q$, $x \in \mathbb{R}^n$, be an n-dimensional continuous dynamical system on $Q$. Then, there exists a probability Radon measure μ on $Q$ invariant under $T(x)$, $x \in \mathbb{R}^n$.

Let $T(x)$, $x \in \mathbb{R}^n$, be any continuous dynamical system on a compact space $Q$, and let μ be any probability Radon measure invariant under $T(x)$, $x \in \mathbb{R}^n$. Then, if we choose as $\mathcal{M}(Q)$ the Borel σ-algebra, $T(x)$ can be viewed as an n-dimensional dynamical system according to Definition 2.1. To prove this fact, the only nontrivial property to be checked is (iii). The class of Borel sets $E \subseteq Q$ such that $\{(x, \omega) \in \mathbb{R}^n \times Q : T(x)\omega \in E\}$ belongs to the product σ-algebra $\mathcal{L}^n \otimes \mathcal{B}(Q)$ contains the class of open sets and it is a σ-algebra; therefore, it coincides with $\mathcal{B}(Q)$.

Definition 2.5 (Continuous stationary process). Given a compact space $Q$, an n-dimensional continuous dynamical system $T(x) : Q \to Q$, $x \in \mathbb{R}^n$, and an invariant Radon probability measure μ in $Q$, by a continuous stationary process we mean any map $(x, \omega) \mapsto f(T(x)\omega)$ with $f \in C(Q)$.

We next give two basic examples of this setting.

2.1. Periodic functions

In this case $Q$ is the torus $(S^1)^n$ and $T(x) : Q \to Q$ is defined as $T(x)\omega := \omega + x \pmod 1$, where we adopt the usual equivalence between $S^1$ and $[0, 1]$ with the identification $0 \equiv 1$.
We easily verify that T (x) is a continuous dynamical system. The Lebesgue measure is invariant and it is also easy to see that T (x) is ergodic. Observe that C(Q) is isometrically isomorphic to the space of continuous periodic functions with period 1 in each coordinate variable. 2.2. Almost periodic functions This case was extensively studied in [2]. The basic fact here is that the space of almost periodic functions is a closed subalgebra of the space of bounded uniformly continuous functions in Rn which induces a compactification of Rn , called Bohr compactification, Gn , which turns out to be a topological group with respect to the extension to Gn of the addition operation in Rn . Hence, in Gn the Haar measure is defined and is invariant with respect to the translations T (x) : Gn → Gn , T (x)ω = ω +x. In [2] it is shown that such maps T (x) form an ergodic continuous n-dimensional dynamical system. We leave the more general example of the algebras with mean value to be thoroughly considered in the next three sections, since the deep understanding of its relationship with continuous dynamical systems acting on compact spaces is a central point of this work.



3. Algebras with mean value

The concept of algebra with mean value was introduced in [38] (see also [17]) as a generalization of the concept of almost periodic functions $AP(\mathbb{R}^n)$ and the corresponding Besicovitch spaces $BAP^p(\mathbb{R}^n)$, $1 \le p \le \infty$ (cf. [2]), motivated by the reduction of problems of stochastic homogenization to problems of individual homogenization, in the terminology adopted in [17].

Notation. As usual, we denote by $BUC(\mathbb{R}^n)$ the space of the bounded uniformly continuous real-valued functions in $\mathbb{R}^n$.

Definition 3.1. Let $A$ be a linear subspace of $BUC(\mathbb{R}^n)$. We say that $A$ is an algebra with mean value (or algebra w.m.v., in short), if the following conditions are satisfied:

(A) If $f$ and $g$ belong to $A$, then the product $fg$ belongs to $A$.
(B) $A$ is invariant with respect to translations $\tau_y$ in $\mathbb{R}^n$.
(C) Any $f \in A$ possesses a mean value.
(D) $A$ is closed in $BUC(\mathbb{R}^n)$ and contains the unity, i.e., the function $e(x) := 1$ for $x \in \mathbb{R}^n$.

Remark 3.1. The definition of algebra w.m.v. as given in [17] contains only conditions (A)–(C). However, since the closure of a linear subspace $A$ in $BUC(\mathbb{R}^n)$ satisfying (A)–(C) also satisfies (A)–(C) and adding the unit to such an $A$ one obtains a linear subspace of $BUC(\mathbb{R}^n)$ still satisfying (A)–(C), the inclusion of condition (D) does not imply any restriction in the theory, and we do that here just for convenience.

For the development of the homogenization theory in algebras $A$ with mean value, as is done in [17,38] (see also [9]), in similarity with the case of almost periodic functions, one introduces, for $1 \le p < \infty$, the space $B^p$ as the abstract completion of $A$ with respect to the Besicovitch semi-norm

$$|f|_p^p := \limsup_{L \to \infty} \frac{1}{(2L)^n} \int_{[-L,L]^n} |f|^p \, dx.$$

Both the action of translations and the mean value extend by continuity to $B^p$, and we will keep using the notation $\tau_y f$ and $M(f)$ even when $f \in B^p$ and $y \in \mathbb{R}^n$. Furthermore, for $p > 1$ the product in $A$ extends to a bilinear operator from $B^p \times B^q$ into $B^1$, with $q$ equal to the dual exponent of $p$, satisfying $|fg|_1 \le |f|_p \, |g|_q$. In particular, the operator $M(fg)$ provides a nonnegative definite bilinear form on $B^2$.

Remark 3.2. A classical argument going back to Besicovitch [5] (see also [17, p. 239]) shows that the elements of $B^p$ can be represented by functions in $L^p_{\mathrm{loc}}(\mathbb{R}^n)$, $1 \le p < \infty$.
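The Besicovitch semi-norm and the Hölder-type inequality $|fg|_1 \le |f|_p |g|_q$ can be explored numerically for $n = 1$. The sketch below replaces the lim sup by the average at a single large $L$, which is an approximation, and uses hypothetical sample functions. None of the names come from the paper.

```python
import math

def besicovitch_seminorm(f, p, L, step=0.01):
    """Approximate ((2L)^-1 * integral over [-L, L] of |f|^p dx)^(1/p) for n = 1.

    A single large L is used as a proxy for the lim sup in the definition.
    """
    n = int(round(2 * L / step))
    h = 2 * L / n
    total = 0.0
    for k in range(n):
        total += abs(f(-L + (k + 0.5) * h)) ** p
    return (total / n) ** (1.0 / p)

def f(x):
    # Hypothetical almost periodic sample with two incommensurable frequencies.
    return math.cos(x) + math.cos(math.sqrt(2) * x)

def g(x):
    return math.sin(x)

L = 1000.0
norm_f = besicovitch_seminorm(f, 2, L)   # |f|_2; M(f^2) = 1/2 + 1/2 = 1
norm_g = besicovitch_seminorm(g, 2, L)   # |g|_2; M(g^2) = 1/2
norm_fg = besicovitch_seminorm(lambda x: f(x) * g(x), 1, L)  # |fg|_1
```

With $p = q = 2$ the inequality reduces to Cauchy–Schwarz, which already holds for the discrete averages computed on the common grid.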



Since there is an obvious inclusion between this family of spaces, we may define the space B ∞ as follows: B

∞

= f∈

1p 0, there is a finite set {t1 , . . . , tN } such that for all t ∈ Rn , f (· + t) − f (· + tj )∞ < ε, for some j ∈ {1, . . . , N}. Therefore, we conclude that {f (· + t)} is precompact in BUC(Rn ). 3. To prove the converse, let Gn be the Bohr compactification of Rn , that is, the compactification of Rn induced by the whole algebra of the almost periodic functions (see [11]; also [2]). In order to take advantage of the properties of exponential functions we consider algebras of complex valued functions; the passage to the real valued case is immediate. Let then A be a subalgebra of almost periodic functions. 4. It is well known that the family F := {eiλ·x : λ ∈ Rn } form a fundamental set in the space of almost periodic functions AP(Rn ), in the sense that any f ∈ AP(RN ) may be approximated in the sup-norm by finite linear combinations of elements of F . 5. Suppose first that A is a subalgebra of AP(Rn ) generated by any subset F ⊆ F , and let K be the associated compactification of Rn . We are going to apply Lemma 4.1 with X1 = K × K, X2 = K, R1 = Rn × Rn , R2 = Rn and W : R1 → R2 the addition operation + in RN . For any eiλ·x ∈ F , we have eiλ·(x+y) = eiλ·x eiλ·y . Clearly, eiλ·x eiλ·y is the restriction to Rn × Rn of a continuous function in K × K. Since F is a fundamental set for A, Lemma 4.1 implies that + may be extended continuously to K × K. It is also immediate to verify that this extension preserves the properties of an abelian group. 6. Now, we consider the case where A is a general subalgebra of AP(Rn ), not necessarily generated by some subset of F . We first interpret K as a quotient space of Gn as follows. In Gn we consider the equivalence relation z1 ∼ z2 if f (z1 ) = f (z2 ) for all f ∈ A, where f denotes



the unique continuous extension of f to Gn . The quotient space K˜ = Gn /∼, endowed with the ˜ quotient topology, is a compact space. The functions f , f ∈ A, pass to the quotient and f ∈ C(K) ˜ Hence, for all f ∈ A. Moreover, the family {f : f ∈ A} distinguishes between the points of K. ˜ ˜ there is an isometric isomorphism between A and C(K) and so K is homeomorphic to K and we may identify these spaces. 7. Now, observe that if z1 , z2 , σ1 , σ2 ∈ Gn with z1 ∼ z2 and σ1 ∼ σ2 , then z1 + σ1 ∼ z2 + σ2 . Indeed, given any f ∈ A, by the invariance of A by translations we have f (z1 + σ1 ) = f (z2 + σ1 ) = f (σ1 + z2 ) = f (σ2 + z2 ). We have seen above that for any f ∈ A, f (x + y) is the restriction to Rn × Rn of a continuous function on Gn × Gn . Now, as we have just proved, this function may pass to the quotient Gn /∼ × Gn /∼ ≡ K × K. Hence, f (x + y) is the restriction to Rn × Rn of a continuous function on K × K for all f ∈ A. Therefore, an application of Lemma 4.1 with X1 = K × K, X2 = K, R1 = Rn × Rn , R2 = Rn and W = +, gives that + may be continuously extended to K × K and it is again a trivial matter to prove that the abelian group properties are preserved by this extension. 8. The fact that the measure m of K induced by the mean value on A is the Haar measure is a straightforward consequence of the uniqueness of the Haar measure. 2 5. Vector-valued algebras with mean value In this section we extend the notion of algebra with mean value to vector-valued functions. We begin with the following definition. Definition 5.1. Let A ⊆ BUC(Rn ) be an algebra with mean value and let E be a Banach space. We denote by A(Rn ; E) the space of functions f ∈ BUC(Rn ; E) satisfying the following conditions: (i) For all L ∈ E ∗ , Lf := L, f belongs to A; (ii) The family F := {Lf : L ∈ E ∗ , L 1} is relatively compact in A. 
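For bounded Borel sets $Q \subseteq \mathbb{R}^n$ and $f \in BUC(\mathbb{R}^n; E)$, it is easily checked by an approximation with Riemann sums that $L \mapsto \int_Q \langle L, f \rangle \, dx$ defines a linear functional on $E^*$, continuous for the weak topology $\sigma(E^*, E)$; as a consequence, there exists a unique element of $E$, that we shall denote by $\int_Q f \, dx$, satisfying

$$\Big\langle L, \int_Q f \, dx \Big\rangle = \int_Q \langle L, f \rangle \, dx \qquad \forall L \in E^*.$$

For similar reasons, if $f \in A(\mathbb{R}^n; E)$ the integrals $\fint_{Q_t} f \, dx$ weakly converge in $E$, as $t \to +\infty$, to a vector, that we shall denote by $\fint_{\mathbb{R}^n} f \, dx$, characterized by

$$\Big\langle L, \fint_{\mathbb{R}^n} f \, dx \Big\rangle = \fint_{\mathbb{R}^n} \langle L, f \rangle \, dx \qquad \forall L \in E^*.$$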



Theorem 5.1. Let A ⊆ BUC(Rn ) be an algebra with mean value. Let E be a Banach space and let K be the compact space associated with A. There is an isometric isomorphism between A(Rn ; E) and C(K; E). Denoting by g → g the canonical map from A to C(K), the isomorphism associates to f ∈ A(Rn ; E) the map f˜ ∈ C(K; E) satisfying L, f = L, f˜ ∈ C(K)

∀L ∈ E ∗ .

(5.1)

Moreover f E ∈ A for each f ∈ A(Rn ; E). Proof. 1. For any z ∈ K we consider the map L → Lf (z). This is a linear map on E ∗ ; we claim that the compactness of F implies that this map is continuous with respect to the topology σ (E ∗ , E). 2. Indeed, by the well-known Krein–Šmulian theorem (see, e.g., [11, p. 429]) it suffices to check the continuity of this linear functional when restricted to bounded closed balls. Now, if Li → L in the w ∗ -topology, then the maps Lif converge to Lf pointwise and compactness yields that they converge also in A. As a consequence Lif converge uniformly in K to Lf . 3. Hence, for any z ∈ K we can find an element of E, that we denote by f˜(z), such that Lf (z) = L, f˜(z) for any L ∈ E ∗ . This proves (5.1) and it remains to show that f˜ is a continuous map. This is again an argument based on the compactness of the family F := {Lf : L ∈ E ∗ , L 1}: if zi → z then, by the compactness of F , Lf (zi ) → Lf (z) uniformly with respect to L in the unit ball of E ∗ . As a consequence f˜(zi ) → f˜(z) in E. 4. Now we prove that f → f˜ is an isometry between A(Rn ; E) and C(K; E). This map is clearly an isomorphism. Moreover, for each x ∈ Rn we obtain from (5.1) that f˜(x)E = f (x)E . Since f˜E ∈ C(K) we have that f E ∈ A and so f˜E = f E . Consequently, f → f˜ is an isometry. 2 Definition 5.2. Given a compact space K, a probability Radon measure m on K and a Banach space E, for 1 p < ∞, we define the space Lp (K; E) as the completion of C(K; E) with respect to the norm · p , defined as usual: 1/p p f p := f E dm . K

We also define L∞ (K; E) as the space of the functions f : K → E such that f ∈ Lp (K; E) for all p ∈ [1, ∞) and sup1p 1; (3) L1 (Ω; A(Rn ; C(K))). Remark 6.1. A similar result holds, with minor adaptations in the proof, for families {uε }ε>0 ⊆ L1 (Ω; Rm ) that satisfy the condition

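$$\lim_{R \to \infty} \limsup_{\varepsilon \to 0} \big|\{|u_\varepsilon| > R\}\big| = 0.$$

This happens, for instance, when a uniform bound in $L^p(\Omega; \mathbb{R}^m)$ is available. In this special case, the representation formula (6.1) is valid for all $\Phi(z, x, \lambda) \in A(\mathbb{R}^n; C_0(\Omega, C(\mathbb{R}^m)))$ such that

$$\lim_{|\lambda| \to \infty} \frac{|\Phi(z, x, \lambda)|}{|\lambda|^p} = 0 \qquad \text{uniformly for } (z, x) \in \mathbb{R}^n \times \Omega.$$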

This extension is analogous to the well-known one in the classical theory of Young measures (see, e.g. [3,4,30], etc.). As in the classical theory of Young measures we have the following consequence of Theorem 6.1.



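Theorem 6.2. Let $\Omega \subseteq \mathbb{R}^n$ be a bounded open set, let $\{u_\varepsilon\} \subseteq L^\infty(\Omega; \mathbb{R}^m)$ be uniformly bounded and let $\nu_{z,x}$ be a two-scale Young measure generated by a subnet $\{u_{\varepsilon(d)}\}_{d \in D}$, according to Theorem 6.1. Assume that $U$ belongs either to $L^1(\Omega; A(\mathbb{R}^n; \mathbb{R}^m))$ or to $B^p(\mathbb{R}^n; C(\bar{\Omega}; \mathbb{R}^m))$ for some $p > 1$. Then

$$\nu_{z,x} = \delta_{U(z,x)}$$

if and only if

$$\lim_{D} \Big\| u_{\varepsilon(d)}(x) - U\Big(\frac{x}{\varepsilon(d)}, x\Big) \Big\|_{L^1(\Omega)} = 0. \tag{6.2}$$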

7. Porous medium type equations with oscillating external sources: The Cauchy problem

Let $Q$ be a compact space and let $T(x) : Q \to Q$ be an ergodic n-dimensional continuous dynamical system on $Q$ with an invariant probability measure μ on $Q$. We consider the following stochastic homogenization problem

$$\begin{cases} \partial_t u - \Delta f(u) = -\dfrac{1}{\varepsilon^2} \Delta_z V\big(T(\tfrac{x}{\varepsilon})\omega\big), & (x, t, \omega) \in \mathbb{R}^{n+1}_+ \times Q, \\[4pt] u(x, 0) = u_0\big(T(\tfrac{x}{\varepsilon})\omega, x\big), & (x, \omega) \in \mathbb{R}^n \times Q, \end{cases}$$

where $V, \Delta V \in C(Q)$ and $u_0 \in L^\infty(\mathbb{R}^n; C(Q))$. Here we denote $\mathbb{R}^{n+1}_+ := \mathbb{R}^n \times (0, \infty)$. As usual, $\Delta = \sum_{i=1}^n \partial_{x_i}^2$ is the Laplace operator and we denote $\Delta_z = \sum_{i=1}^n \partial_{z_i}^2$, where $z$ represents the oscillatory variable $x/\varepsilon$. Here, by $\Delta V \in C(Q)$ we mean that the function $\tilde{V}(x, \omega) := V(T(x)\omega)$ satisfies $(\Delta_x \tilde{V})(x, \omega) = h(T(x)\omega)$, for some $h \in C(Q)$. Since, by Theorem 3.1, almost all realizations of functions in $C(Q)$ belong to an ergodic algebra, for simplicity of notation, here and henceforth, we consider the equivalent individual homogenization problem with oscillatory functions belonging to an ergodic algebra, which in this case reduces to the problem

$$\begin{cases} \partial_t u - \Delta f(u) = -\dfrac{1}{\varepsilon^2} \Delta_z V\big(\tfrac{x}{\varepsilon}\big), & (x, t) \in \mathbb{R}^{n+1}_+, \\[4pt] u(x, 0) = u_0\big(\tfrac{x}{\varepsilon}, x\big), & x \in \mathbb{R}^n. \end{cases} \tag{7.1}$$

So, let $A(\mathbb{R}^n)$ be an ergodic algebra, $K$ be the compact space given by Theorem 4.1 such that $A(\mathbb{R}^n) \cong C(K)$, and $m$ be the associated invariant probability measure on $K$. We make the following assumptions:

(A1) The function $f$ in (7.1) is in $C^2(\mathbb{R})$ and satisfies $f'(u) > 0$ for $u \in \mathbb{R}$.
(A2) $V \in A(\mathbb{R}^n)$ and

$$u_0(z, x) = g\big(\varphi_0(x) + V(z)\big), \tag{7.2}$$

with $g := f^{-1}$, for some $\varphi_0 \in L^\infty(\mathbb{R}^n)$. In particular $u_0 \in L^\infty(\mathbb{R}^n; A(\mathbb{R}^n))$.
(A3) $\Delta V \in A(\mathbb{R}^n)$.

Let $g$ be as above and let $\bar{f}$ be implicitly defined by the equation

$$p = \fint_{\mathbb{R}^n} g\big(\bar{f}(p) + V(z)\big) \, dz. \tag{7.3}$$
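To make the implicit definition (7.3) concrete, the following sketch solves it by bisection in a simple periodic case. The nonlinearity $f(u) = \sinh u$ (so $g = \operatorname{asinh}$), the potential $V(z) = \cos(2\pi z)$, the bracketing interval and all function names are assumptions chosen for illustration. The mean value over $\mathbb{R}^n$ is replaced by an average over one period.

```python
import math

def cell_average(func, n=2000):
    # Midpoint rule on one period [0, 1]; stands in for the mean value in (7.3)
    # when V is 1-periodic.
    return sum(func((k + 0.5) / n) for k in range(n)) / n

def f_bar(p, g, V, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve p = average_z g(q + V(z)) for q = f_bar(p) by bisection.

    Relies on g being increasing, so the right-hand side is increasing in q.
    The bracket [lo, hi] is an assumption that must contain the root.
    """
    def residual(q):
        return cell_average(lambda z: g(q + V(z))) - p
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g = math.asinh                                # inverse of f(u) = sinh(u), f' = cosh > 0
V = lambda z: math.cos(2 * math.pi * z)       # hypothetical oscillating potential

q0 = f_bar(0.0, g, V)                         # by symmetry of V and oddness of g, q0 = 0
q1 = f_bar(1.0, g, lambda z: 0.0)             # with V = 0, f_bar reduces to f
```

When $V \equiv 0$ the homogenized nonlinearity coincides with $f$, which gives a simple sanity check for the solver.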



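In the sequel we shall identify $V$ with the corresponding function in $C(K)$, and define

$$\psi_\alpha(z) := g\big(V(z) + \alpha\big), \qquad z \in K.$$

Notice that $\psi_\alpha(x/\varepsilon)$ is a steady state solution of (7.1): indeed, $f(\psi_\alpha(x/\varepsilon)) = V(x/\varepsilon) + \alpha$, so that $\Delta_x f(\psi_\alpha(x/\varepsilon)) = \varepsilon^{-2} (\Delta_z V)(x/\varepsilon)$.

Theorem 7.1. Suppose assumptions (A1)–(A3) hold and let $u_\varepsilon$ denote the unique weak solution of (7.1). Let $\bar{u}$ be the unique weak solution of

$$\begin{cases} \partial_t \bar{u} - \Delta \bar{f}(\bar{u}) = 0, & (x, t) \in \mathbb{R}^{n+1}_+, \\[2pt] \bar{u}(x, 0) = \fint_{\mathbb{R}^n} u_0(z, x) \, dz, & x \in \mathbb{R}^n, \end{cases} \tag{7.4}$$

and set

$$U(z, x, t) := g\big(\bar{f}(\bar{u}(x, t)) + V(z)\big). \tag{7.5}$$

Then, as $\varepsilon \to 0$, we have $u_\varepsilon \to \bar{u}$ in the weak star topology of $L^\infty(\mathbb{R}^{n+1}_+)$ and

$$\lim_{\varepsilon \to 0} \Big\| u_\varepsilon - U\Big(\frac{x}{\varepsilon}, x, t\Big) \Big\|_{L^1_{\mathrm{loc}}(\mathbb{R}^{n+1}_+)} = 0. \tag{7.6}$$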

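Proof. 1. First, we observe that the weak solutions $u_\varepsilon$, $\varepsilon > 0$, of (7.1) are bounded uniformly with respect to ε in $L^\infty(\mathbb{R}^{n+1}_+)$. For this, we note that if $\alpha_1, \alpha_2$ are such that $\alpha_1 \le \varphi_0(x) \le \alpha_2$ for $x \in \mathbb{R}^n$, we have

$$g\Big(V\Big(\frac{x}{\varepsilon}\Big) + \alpha_1\Big) \le u_0\Big(\frac{x}{\varepsilon}, x\Big) \le g\Big(V\Big(\frac{x}{\varepsilon}\Big) + \alpha_2\Big) \qquad \text{for all } x \in \mathbb{R}^n.$$

By the monotonicity of the solution operator of (7.1) (see Theorem A.3), we get

$$g\Big(V\Big(\frac{x}{\varepsilon}\Big) + \alpha_1\Big) \le u_\varepsilon(x, t) \le g\Big(V\Big(\frac{x}{\varepsilon}\Big) + \alpha_2\Big) \qquad \text{for all } (x, t) \in \mathbb{R}^{n+1}_+.$$

Thus, in the sequel, we denote by $\mathcal{K}$ a closed interval containing the image of all the functions $u_\varepsilon$, $\varepsilon > 0$. Let $\nu_{z,x,t} \in \mathcal{M}(\mathcal{K})$, with $(z, x, t) \in K \times \mathbb{R}^{n+1}_+$, be the two-scale space time Young measures associated with a subnet of $\{u_\varepsilon\}_{\varepsilon > 0}$ with test functions oscillating only on the space variable. Following [16] and [2], the theorem will be proved by adapting DiPerna's method in [14], that is, by showing that $\nu_{z,x,t}$ is a Dirac measure for almost all $(z, x, t) \in K \times \mathbb{R}^{n+1}_+$. Since we are going to show that $\nu_{z,x,t}$ does not depend on the chosen subnet (so that, a posteriori, a full limit as $\varepsilon \to 0$ occurs), in order to simplify our notation we will write $\lim_{\varepsilon \to 0}$ without indicating the subnet.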



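Observe that, for every $\alpha \in \mathbb{R}$, the weak solutions $u_\varepsilon$ and $\psi_\alpha(\frac{x}{\varepsilon})$ satisfy (see Theorem A.3)

$$\int_{\mathbb{R}^{n+1}_+} \Big\{ \Big| u_\varepsilon(x, t) - \psi_\alpha\Big(\frac{x}{\varepsilon}\Big) \Big| \phi_t + \Big| f(u_\varepsilon(x, t)) - f\Big(\psi_\alpha\Big(\frac{x}{\varepsilon}\Big)\Big) \Big| \Delta\phi \Big\} \, dx \, dt + \int_{\mathbb{R}^n} \Big| u_0\Big(\frac{x}{\varepsilon}, x\Big) - \psi_\alpha\Big(\frac{x}{\varepsilon}\Big) \Big| \phi(x, 0) \, dx \ge 0, \tag{7.7}$$

for all $0 \le \phi \in C_c^\infty(\mathbb{R}^{n+1})$. In (7.7), we take $\phi(x, t) = \varepsilon^2 \varphi(\frac{x}{\varepsilon}) \psi(x, t)$ with $0 \le \psi \in C_c^\infty(\mathbb{R}^{n+1}_+)$, $\varphi, \Delta\varphi \in A(\mathbb{R}^n)$ and $\varphi \ge 0$. Observe that

$$\Delta\phi = \Delta\varphi\Big(\frac{x}{\varepsilon}\Big) \psi(x, t) + 2\varepsilon \nabla\varphi\Big(\frac{x}{\varepsilon}\Big) \cdot \nabla\psi(x, t) + \varepsilon^2 \varphi\Big(\frac{x}{\varepsilon}\Big) \Delta\psi(x, t).$$

Letting $\varepsilon \to 0$ and using Theorem 6.1, we get

$$\int_{\mathbb{R}^{n+1}_+} \int_K \psi(x, t) \big\langle \nu_{z,x,t}, |f(\cdot) - f(\psi_\alpha(z))| \big\rangle \Delta\varphi(z) \, dm(z) \, dx \, dt \ge 0.$$

Now apply the inequality above to $\|\varphi\|_\infty \pm \varphi$ to obtain

$$\int_{\mathbb{R}^{n+1}_+} \int_K \psi(x, t) \big\langle \nu_{z,x,t}, |f(\cdot) - V(z) - \alpha| \big\rangle \Delta\varphi(z) \, dm(z) \, dx \, dt = 0 \tag{7.8}$$

for all $\varphi$ such that $\varphi, \Delta\varphi \in A(\mathbb{R}^n)$ and all $0 \le \psi \in C_c^\infty(\mathbb{R}^{n+1}_+)$.

2. As in [16], we define a new family of parametrized measures $\mu_{z,x,t}$ supported on a compact set $\mathcal{K}' \supset \{f(\lambda) - V(z) : (\lambda, z) \in \mathcal{K} \times K\}$, where $\mathcal{K}$ is the closed interval containing the values of all the $u_\varepsilon$, by

$$\langle \mu_{z,x,t}, \theta \rangle := \big\langle \nu_{z,x,t}, \theta\big(f(\cdot) - V(z)\big) \big\rangle, \qquad \theta \in C(\mathbb{R}). \tag{7.9}$$

In this way, Eq. (7.8) can also be rephrased as

$$\int_{\mathbb{R}^{n+1}_+} \int_K \psi(x, t) \langle \mu_{z,x,t}, \theta \rangle \Delta\varphi(z) \, dm(z) \, dx \, dt = 0, \tag{7.10}$$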

where θ (λ) = |λ − α|. On the other hand, inserting in the integral equation defining weak solution of (7.1) with a test function as above, we easily get letting ε → 0 that (7.10) holds when θ is any affine function. Therefore, we deduce that (7.10) holds for finite linear combinations of affine functions and functions of the form | · −α|, α ∈ R. Since these combinations generate the piecewise affine functions, we finally conclude that (7.10) holds for all θ ∈ C(R).



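Set $F(z) := \int_{\mathbb{R}^{n+1}_+} \psi(x, t) \langle \mu_{z,x,t}, \theta \rangle \, dx \, dt$ and observe that $\int_K F(z) \Delta\varphi(z) \, dm(z) = 0$, for all $\varphi$ such that $\varphi, \Delta\varphi \in A(\mathbb{R}^n)$. Then, we can apply Lemma 3.2 to obtain that $F$ is equivalent to a constant for all $\theta \in C(\mathbb{R})$. Using this fact and defining

$$\mu_{x,t} := \int_K \mu_{z,x,t} \, dm(z) \in \mathcal{M}(\mathcal{K}'),$$

where $\mathcal{K}'$ is the compact set supporting the measures $\mu_{z,x,t}$, we have, in particular,

$$\int_{\mathbb{R}^{n+1}_+} \psi(x, t) \langle \mu_{z,x,t}, \theta \rangle \, dx \, dt = \int_K \int_{\mathbb{R}^{n+1}_+} \psi(x, t) \langle \mu_{z,x,t}, \theta \rangle \, dx \, dt \, dm(z) = \int_{\mathbb{R}^{n+1}_+} \psi(x, t) \langle \mu_{x,t}, \theta \rangle \, dx \, dt,$$

for a.e. $z \in K$, for all $\theta \in C(\mathbb{R})$. Hence,

$$\begin{aligned} \int_{\mathbb{R}^{n+1}_+} \Big\langle \mu_{x,t}, \int_K W(z, \cdot) \, dm(z) \Big\rangle \psi(x, t) \, dx \, dt &= \sum_i m(K_i) \int_{\mathbb{R}^{n+1}_+} \langle \mu_{x,t}, \theta_i \rangle \psi(x, t) \, dx \, dt \\ &= \sum_i m(K_i) \int_{\mathbb{R}^{n+1}_+} \langle \mu_{z,x,t}, \theta_i \rangle \psi(x, t) \, dx \, dt \\ &= \sum_i \int_K \int_{\mathbb{R}^{n+1}_+} \langle \mu_{z,x,t}, \theta_i \rangle \chi_{K_i}(z) \psi(x, t) \, dx \, dt \, dm(z) \\ &= \int_{\mathbb{R}^{n+1}_+} \int_K \langle \mu_{z,x,t}, W(z, \cdot) \rangle \psi(x, t) \, dm(z) \, dx \, dt \end{aligned} \tag{7.11}$$

for any function $W(z, \lambda) = \sum_i \theta_i(\lambda) \chi_{K_i}(z)$, where $\theta_i \in C(\mathcal{K}')$, $K_i$ is any Borel subset of $K$, and $\chi_{K_i}$ is the characteristic function of $K_i$. By approximation (7.11) holds for any $W \in C(K \times \mathcal{K}')$.

3. From (7.7), taking the limit as $\varepsilon \to 0$, passing to a subnet if necessary, we get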

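$$\int_{\mathbb{R}^{n+1}_+} \int_K \Big\{ \big\langle \nu_{z,x,t}, |\cdot - \psi_\alpha(z)| \big\rangle \varphi_t + \big\langle \nu_{z,x,t}, |f(\cdot) - f(\psi_\alpha(z))| \big\rangle \Delta\varphi \Big\} \, dm(z) \, dx \, dt + \int_{\mathbb{R}^n} \int_K |u_0(z, x) - \psi_\alpha(z)| \, \varphi(x, 0) \, dm(z) \, dx \ge 0 \tag{7.12}$$

for all $\alpha \in \mathbb{R}$ and for all $0 \le \varphi \in C_c^\infty(\mathbb{R}^{n+1})$.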



We define $I(\rho, \alpha)$ and $G(\rho, \alpha)$ by

$$I(\rho, \alpha) := \int_K \big| g(\rho + V(z)) - g(\alpha + V(z)) \big| \, dm(z), \tag{7.13}$$

$$G(\rho, \alpha) := |\rho - \alpha|. \tag{7.14}$$

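Now, setting $\theta(\rho) = |g(\rho + V(z)) - g(\alpha + V(z))|$, we have

$$\begin{aligned} \int_{\mathbb{R}^{n+1}_+} \int_K \big\langle \nu_{z,x,t}, |\cdot - \psi_\alpha(z)| \big\rangle \varphi_t \, dm(z) \, dx \, dt &= \int_{\mathbb{R}^{n+1}_+} \int_K \big\langle \nu_{z,x,t}, \theta\big(f(\cdot) - V(z)\big) \big\rangle \varphi_t \, dm(z) \, dx \, dt \\ &= \int_{\mathbb{R}^{n+1}_+} \int_K \big\langle \mu_{z,x,t}, |g(\cdot + V(z)) - g(\alpha + V(z))| \big\rangle \varphi_t \, dm(z) \, dx \, dt. \end{aligned}$$

Using (7.11), we obtain

$$\begin{aligned} \int_{\mathbb{R}^{n+1}_+} \int_K \big\langle \nu_{z,x,t}, |\cdot - \psi_\alpha(z)| \big\rangle \varphi_t \, dm(z) \, dx \, dt &= \int_{\mathbb{R}^{n+1}_+} \int_K \big\langle \mu_{z,x,t}, |g(\cdot + V(z)) - g(\alpha + V(z))| \big\rangle \varphi_t \, dm(z) \, dx \, dt \\ &= \int_{\mathbb{R}^{n+1}_+} \Big\langle \mu_{x,t}, \int_K |g(\cdot + V(z)) - g(\alpha + V(z))| \, dm(z) \Big\rangle \varphi_t \, dx \, dt \\ &= \int_{\mathbb{R}^{n+1}_+} \langle \mu_{x,t}, I(\cdot, \alpha) \rangle \varphi_t \, dx \, dt. \end{aligned} \tag{7.15}$$

Analogously,

$$\int_{\mathbb{R}^{n+1}_+} \int_K \big\langle \nu_{z,x,t}, |f(\cdot) - f(\psi_\alpha(z))| \big\rangle \Delta\varphi(x, t) \, dm(z) \, dx \, dt = \int_{\mathbb{R}^{n+1}_+} \langle \mu_{x,t}, G(\cdot, \alpha) \rangle \Delta\varphi(x, t) \, dx \, dt. \tag{7.16}$$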



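Using (7.15) and (7.16) in (7.12), we have

$$\int_{\mathbb{R}^{n+1}_+} \Big\{ \langle \mu_{x,t}, I(\cdot, \alpha) \rangle \varphi_t + \langle \mu_{x,t}, G(\cdot, \alpha) \rangle \Delta\varphi \Big\} \, dx \, dt + \int_{\mathbb{R}^n} \int_K |u_0(z, x) - \psi_\alpha(z)| \, \varphi(x, 0) \, dm(z) \, dx \ge 0, \tag{7.17}$$

for all $0 \le \varphi \in C_c^\infty(\mathbb{R}^{n+1})$ and all $\alpha \in \mathbb{R}$. Now, choosing $\varphi(x, t) = \delta_h(t) \phi(x)$, with $0 \le \phi \in C_c^\infty(\mathbb{R}^n)$ and $\delta_h(t) = \max\{\frac{h - |t|}{h}, 0\}$ for $h > 0$ in (7.17), we obtain

$$\lim_{h \to 0} \frac{1}{h} \int_0^h \int_{\mathbb{R}^n} \langle \mu_{x,t}, I(\cdot, \alpha) \rangle \phi \, dx \, dt \le \int_{\mathbb{R}^n} \int_K |u_0(z, x) - \psi_\alpha(z)| \, \phi \, dm(z) \, dx. \tag{7.18}$$

Using the flexibility provided by $\phi$ in (7.18), we deduce that the same inequality holds if $\alpha \in L^\infty(\mathbb{R}^n)$ and $\phi = \chi_{B_R}$, $R > 0$. By (7.2), we have that $\varphi_0(x) = f(u_0(z, x)) - V(z)$ is independent of $z$. Taking $\alpha(x) = \varphi_0(x)$ and recalling that $u_0(z, x) = g(\alpha(x) + V(z))$, we have $\alpha(x) = \bar{f}(\bar{u}(x, 0))$. Using this and $\psi_\alpha(z) = g(\alpha + V(z))$ in (7.18), we obtain that

$$\lim_{h \to 0} \frac{1}{h} \int_0^h \int_{B_R} \big\langle \mu_{x,t}, I\big(\cdot, \bar{f}(\bar{u}(x, 0))\big) \big\rangle \, dx \, dt = 0, \qquad \forall R > 0. \tag{7.19}$$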

4. By using the Remark A.1 with u1 = uε and u2 (x) = ψα ( xε ), for all 0 ϕ ∈ Cc∞ (Rn+1 + ) we get

ψα ( xε )

−

uε (x, t) ϕt dx dt

Bϑδ

Rn+1 +

x x ∇ f uε (x, t) − f ψα · ∇ϕ dx dt Hδ f uε (x, t) − f ψα ε ε

+ Rn+1 +

2 x =− ∇ f uε (x, t) − f ψα ε Rn+1 +

× Hδ

x f uε (x, t) − f ψα ϕ dx dt. ε

(7.20)



2 Now, we let α = ξ(y, s) := f¯(u(y, ¯ s)), take 0 φ ∈ Cc∞ ((Rn+1 + ) ), integrate in y, s, and send δ → 0, to get

x x · ∇x φ dx dt dy ds φt + ∇x f uε (x, t) − f ψξ(y,s) −uε (x, t) − ψξ(y,s) ε ε

2 (Rn+1 + )

= − lim

δ→0 2 (Rn+1 + )

× Hδ

2 ∇x f uε (x, t) − f ψξ(y,s) x ε

x f uε (x, t) − f ψξ(y,s) φ dx dt dy ds. ε

Then we use Theorem 6.1 on multiscale Young measures to obtain, as ε → 0,

− μx,t , I ·, ξ(y, s) φt − μx,t , G ·, ξ(y, s) x φ dx dt dy ds

2 (Rn+1 + )

= − lim lim

ε→0 δ→ 0 2 (Rn+1 + )

2 ∇x f uε (x, t) − f ψξ(y,s) x ε

x φ dx dt dy ds. × Hδ f uε (x, t) − f ψξ(y,s) ε 5. Observe that ∇y [f (ψξ(y,s) ( xε ))] = ∇y [V ( xε ) + ξ(y, s)] = ∇y ξ(y, s). Hence x x · ∇x Hδ (f uε (x, t) − f ψξ(y,s) φ dx dt, ∇y f ψξ(y,s) ε ε

0= Rn+1 +

which implies that x x ∇y f uε (x, t) − f ψξ(y,s) · ∇x f uε (x, t) − f ψξ(y,s) ε ε Rn+1 +

x f uε (x, t) − f ψξ(y,s) φ dx dt ε x =− · ∇x φ ∇y f uε (x, t) − f ψξ(y,s) ε × Hδ

Rn+1 +

x × Hδ f uε (x, t) − f ψξ(y,s) dx dt. ε

(7.21)



Integrating in y, s and letting δ → 0, we have divy ∇x φ dx dt dy ds f uε (x, t) − f ψξ(y,s) x ε

2 (Rn+1 + )

= lim

δ→0 2 (Rn+1 + )

× Hδ

x x ∇y f uε (x, t) − f ψξ(y,s) · ∇x f uε (x, t) − f ψξ(y,s) ε ε

x f uε (x, t) − f ψξ(y,s) φ dx dt dy ds. ε

By Theorem 6.1, as ε → 0, we get

μx,t , G ·, ξ(y, s) divy ∇x φ dx dt dy ds

2 (Rn+1 + )

= lim lim

ε→0 δ→0 2 (Rn+1 + )

x x ∇y f (uε ) − f ψξ · ∇x f (uε ) − f ψξ ε ε

x φ dx dt dy ds. × Hδ f (uε ) − f ψξ ε

(7.22)

Similarly, we have also that f (uε (x, t)) − f (ψξ(y,s) ( xε )) = f (uε (x, t)) − V ( xε ) − ξ(y, s) and thus x x ∇x f uε (x, t) − f ψξ(y,s) = ∇x f uε (x, t) − V , ε ε is independent of y. Hence, by integrating first in (y, s) and then (x, t), proceeding as above in obtaining (7.22), yields the equality

μx,t , G ·, ξ(y, s) divx ∇y φ dx dt dy ds

2 (Rn+1 + )

= lim lim

ε→0 δ→0 2 (Rn+1 + )

× Hδ

x x · ∇y f (uε ) − f ψξ ∇x f (uε ) − f ψξ ε ε

x f (uε ) − f ψξ φ dx dt dy ds ε

where uε and ξ are functions of x, t and y, s, respectively.

(7.23)



6. Let u¯ be the weak solution of (7.4). From (A.5) in Theorem A.1, we have l − u(y, ¯ s)φs + sgn f¯(l) − f¯ u(y, ¯ s) ∇y f¯(u) ¯ · ∇y φ dy ds Rn+1 +

= lim

δ→0 Rn+1 +

2 ∇y f¯(u) ¯ s) φ dy ds, ¯ H f¯(l) − f¯ u(y, δ

¯ Now, let k := f (l) and notice K g(ξ(y, s) + V (z)) dm(z). Thus, l − u(y, ¯ s) φs dy ds =

g k + V (z) − g ξ(y, s) + V (z) dm(z)φs dy ds

= Rn+1 +

=

g k + V (z) − g ξ(y, s) + V (z) dm(z) φs dy ds

K

I k, ξ(y, s) φs dy ds.

Rn+1 +

Also,

sgn f¯(l) − f¯ u(y, ¯ s) ∇y f¯(u) ¯ · ∇y φ dy ds

Rn+1 +

¯ s) · ∇y φ dy ds = ∇y f¯(l) − f¯ u(y,

=− Rn+1 +

=

k − ξ(y, s)y φ dy ds

Rn+1 +

G k, ξ(y, s) y φ dy ds.

Rn+1 +

Besides, since ∇y ξ(y, s) = ∇y [f (ψξ(y,s) ( xε ))], we have 2 ∇y f¯(u) ¯ s) φ dy ds ¯ Hδ f¯(l) − f¯ u(y, Rn+1 +

= Rn+1 +

∇y ξ(y, s)2 H k − ξ(y, s) φ dy ds δ

2 H k − ξ(y, s) φ dy ds. ∇y f ψξ(y,s) x = δ ε Rn+1 +

(7.24)

that l = K g(f¯(l) + V (z)) dm(z) and that u(y, ¯ s) =

K Rn+1 +

Rn+1 +

for all l ∈ R.



Using the two previous equalities in (7.24) we obtain I k, ξ(y, s) φs + G k, ξ(y, s) y φ dy ds Rn+1 +

2 H k − ξ(y, s) φ dy ds, ∇y f ψξ(y,s) x δ ε

= lim

δ→0 Rn+1 +

2 for all k ∈ R and all 0 φ ∈ Cc∞ ((Rn+1 + ) ). x We take k = f (uε (x, t)) − V ( ε ) in the above equality and integrate in x, t to get x , ξ(y, s) φs I f uε (x, t) − V ε 2 (Rn+1 + )

x , ξ(y, s) y φ dx dt dy ds + G f uε (x, t) − V ε 2 ∇y f uε (x, t) − f ψξ(y,s) x = lim δ→0 ε 2 (Rn+1 + )

x φ dx dt dy ds. × Hδ f uε (x, t) − f ψξ(y,s) ε Applying Theorem 6.1, letting ε → 0, we obtain x lim , ξ(y, s) φs dx dt dy ds I f uε (x, t) − V ε→0 ε 2 (Rn+1 + )

νz,x,t , I f (·) − V (z), ξ(y, s) φs dm(z) dx dt dy ds

= 2 K (Rn+1 + )

μz,x,t , I ·, ξ(y, s) φs dm(z) dx dt dy ds

= 2 K (Rn+1 + )

μx,t , I ·, ξ(y, s) φs dx dt dy ds.

= 2 (Rn+1 + )

Similarly

lim

ε→0 2 (Rn+1 + )

x , ξ(y, s) y φ dx dt dy ds G f uε (x, t) − V ε

= 2 (Rn+1 + )

μx,t , G ·, ξ(y, s) y φ dx dt dy ds.

(7.25)



Using the last two equalities in (7.25), we get μx,t , I ·, ξ(y, s) φs + μx,t , G ·, ξ(y, s) y φ dx dt dy ds 2 (Rn+1 + )

= lim lim

ε→0 δ→0 2 (Rn+1 + )

2 ∇y f uε (x, t) − f ψξ(y,s) x ε

x φ dx dt dy ds. × Hδ f uε (x, t) − f ψξ(y,s) ε 7. We now prove that μx,t , I ·, ξ(x, t) ϕt + μx,t , G ·, ξ(x, t) ϕ dx dt 0,

(7.26)

(7.27)

Rn+1 +

for all 0 ϕ ∈ Cc∞ (Rn+1 + ). By subtracting (7.21) from (7.22), we deduce that − μx,t , I (·, ξ ) φt − μx,t , G(·, ξ ) (x φ + divy ∇x φ) dx dt dy ds 2 (Rn+1 + )

= − lim lim

ε→0 δ→0 2 (Rn+1 + )

2 ∇x f (uε ) − f ψξ x ε

x x · ∇x f (uε ) − f ψξ + ∇y f (uε ) − f ψξ ε ε x × Hδ f (uε ) − f ψξ φ dx dt dy ds, ε

(7.28)

where uε = uε (x, t), ξ = ξ(y, s). The sum of (7.26) and (7.23) gives μx,t , I (·, ξ ) φs + μx,t , G(·, ξ ) (y φ + divx ∇y φ) dx dt dy ds 2 (Rn+1 + )

= lim lim

ε→0 δ→0 2 (Rn+1 + )

2 ∇y f (uε ) − f ψξ x ε

x x · ∇x f (uε ) − f ψξ + ∇y f (uε ) − f ψξ ε ε x × Hδ f (uε ) − f ψξ φ dx dt dy ds. ε

(7.29)



Finally, taking the difference between (7.28) and (7.29) we obtain

− μx,t , I (·, ξ ) (φt + φs ) − μx,t , G(·, ξ ) (x + divy ∇x + divx ∇y + y )φ dx dt dy ds

2 (Rn+1 + )

= − lim lim

ε→0 δ→0 2 (Rn+1 + )

∇x f (uε ) − f ψξ x ε

2 x H f (uε ) − f ψξ x φ dx dt dy ds 0. + ∇y f (uε ) − f ψξ δ ε ε

(7.30)

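Now, we take $\phi(x, t, y, s) := \varphi(\frac{x+y}{2}, \frac{t+s}{2}) \rho_j(\frac{x-y}{2}) \theta_j(\frac{t-s}{2})$, where $0 \le \varphi \in C_c^\infty(\mathbb{R}^{n+1}_+)$, and $\rho_j$, $\theta_j$ are classical approximations of the identity in $\mathbb{R}^n$ and $\mathbb{R}$, respectively, as in the doubling of variables method, and observe that

$$(\Delta_x + \operatorname{div}_y \nabla_x + \operatorname{div}_x \nabla_y + \Delta_y) \phi = \rho_j\Big(\frac{x-y}{2}\Big) \theta_j\Big(\frac{t-s}{2}\Big) \Delta_x \varphi\Big(\frac{x+y}{2}, \frac{t+s}{2}\Big).$$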
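The identity for the doubled test function can be sanity-checked numerically in one space dimension, where the operator reduces to $(\partial_x + \partial_y)^2$. The sketch below uses Gaussians in place of the compactly supported mollifiers and a hypothetical smooth $\varphi$; only smoothness matters for the identity, and none of these specific choices come from the paper.

```python
import math

J = 4.0  # hypothetical concentration parameter playing the role of j

def rho_theta(x, t, y, s):
    # Product rho_j((x-y)/2) * theta_j((t-s)/2); Gaussians stand in for mollifiers.
    rho = J * math.exp(-(J * (x - y) / 2) ** 2)
    theta = J * math.exp(-(J * (t - s) / 2) ** 2)
    return rho * theta

def phi_big(x, t, y, s):
    # phi(x,t,y,s) = varphi((x+y)/2, (t+s)/2) * rho_j * theta_j,
    # with the sample choice varphi(a, b) = sin(a) * exp(-b).
    varphi = math.sin((x + y) / 2) * math.exp(-(t + s) / 2)
    return varphi * rho_theta(x, t, y, s)

def lhs(x, t, y, s, h=1e-3):
    # (d_x + d_y)^2 phi, computed as a second difference along the diagonal
    # direction (1, 1) in the (x, y) variables.
    return (phi_big(x + h, t, y + h, s) - 2 * phi_big(x, t, y, s)
            + phi_big(x - h, t, y - h, s)) / h ** 2

def rhs(x, t, y, s):
    # rho_j * theta_j * (second x-derivative of varphi) at ((x+y)/2, (t+s)/2).
    d2_varphi = -math.sin((x + y) / 2) * math.exp(-(t + s) / 2)
    return d2_varphi * rho_theta(x, t, y, s)

point = (0.7, 0.4, 0.5, 0.3)
gap = abs(lhs(*point) - rhs(*point))
```

The check works because $\partial_x + \partial_y$ annihilates any function of $x - y$ and acts on $\varphi(\frac{x+y}{2}, \cdot)$ as the derivative in its first argument.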

Substituting such a test function in the inequality in (7.30) and letting $j \to \infty$, we obtain (7.27), proving the assertion.

8. To conclude the proof, we set $\varphi(x,t) = \delta_h(t)\Lambda(x)$ in (7.27), with $0 \le \delta_h \in C_c^\infty(\mathbb{R}_+)$ as in step 3 above and $\Lambda$ given by (A.13). We define

\[
\gamma(t) := \int_{\mathbb{R}^n} \big\langle \mu_{x,t},\ I\big(\cdot,\xi(x,t)\big)\big\rangle\,\Lambda(x)\,dx
\]

and observe that $G(\cdot,\cdot) \le C\,I(\cdot,\cdot)$. Then, using the properties of the weight function $\Lambda$, proceeding in a standard way and letting $h \to 0$, we arrive at

\[
\gamma(t) \le C \int_0^t \gamma(s)\,ds \quad \text{for a.e. } t \ge 0.
\]

Hence, Gronwall's lemma implies $\gamma(t) = 0$ for a.e. $t \ge 0$ which, by the definition of $\gamma$, means that $\langle \mu_{x,t}, I(\cdot,\xi(x,t))\rangle = 0$ for a.e. $(x,t) \in \mathbb{R}^{n+1}_+$, and so $\langle \mu_{x,t}, G(\cdot,\xi(x,t))\rangle = 0$ for a.e. $(x,t) \in \mathbb{R}^{n+1}_+$. Therefore, $\mu_{x,t}$ is the Dirac mass concentrated at $\xi(x,t)$ for a.e. $(x,t) \in \mathbb{R}^{n+1}_+$. Recalling the definition of $\mu_{x,t}$ we have also that $\mu_{z,x,t}$ is the Dirac mass concentrated at $\xi(x,t)$ for a.e. $(z,x,t)$, and thus, $\nu_{z,x,t}$ is the Dirac mass concentrated at $g(\bar f(\bar u(x,t)) + V(z))$ for a.e. $(z,x,t)$. Hence, we can apply Theorem 6.1 to conclude (7.6).

Finally, the fact that the whole sequence $u_\varepsilon$ converges in the weak star topology of $L^\infty(\mathbb{R}^{n+1}_+)$ to $\bar u$ follows from (7.6) observing that, for any $\varphi \in C_c(\mathbb{R}^{n+1}_+)$, we have

\[
\begin{aligned}
\lim_{\varepsilon\to0} \int_{\mathbb{R}^{n+1}_+} U\big(\tfrac{x}{\varepsilon}, x, t\big)\,\varphi(x,t)\,dx\,dt
&= \int_{\mathbb{R}^{n+1}_+}\int_K U(z,x,t)\,\varphi(x,t)\,dm(z)\,dx\,dt \\
&= \int_{\mathbb{R}^{n+1}_+} \Big( \int_K g\big(\bar f(\bar u(x,t)) + V(z)\big)\,dm(z) \Big)\,\varphi(x,t)\,dx\,dt \\
&= \int_{\mathbb{R}^{n+1}_+} \bar u(x,t)\,\varphi(x,t)\,dx\,dt,
\end{aligned}
\]

by the definitions of $\bar f$ and $U$. □
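The Gronwall step used in step 8 can be illustrated numerically. The sketch below is not part of the original argument: it assumes, for illustration only, that $\gamma$ is bounded by a hypothetical constant `M` on the time interval considered. Inserting $\gamma \le M$ into $\gamma(t) \le C\int_0^t \gamma(s)\,ds$ and iterating $k$ times gives $\gamma(t) \le M (Ct)^k / k!$, which tends to $0$ as $k \to \infty$.

```python
import math


def iterated_gronwall_bound(M: float, C: float, t: float, k: int) -> float:
    """Bound on gamma(t) after k iterations of gamma(t) <= C * int_0^t gamma(s) ds,
    starting from the (assumed, hypothetical) a priori bound gamma <= M."""
    return M * (C * t) ** k / math.factorial(k)


def integrate_bound(M: float, C: float, t: float, k: int, steps: int = 4000) -> float:
    """Midpoint-rule value of C * int_0^t (k-th bound)(s) ds, which should
    reproduce the (k+1)-th bound at time t."""
    h = t / steps
    total = sum(iterated_gronwall_bound(M, C, (i + 0.5) * h, k) for i in range(steps))
    return C * total * h


if __name__ == "__main__":
    for k in (0, 5, 20, 60):
        print(k, iterated_gronwall_bound(1.0, 3.0, 2.0, k))
```

The bounds decrease to zero for every fixed `t`, which is why the integral inequality forces $\gamma = 0$ almost everywhere.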

Acknowledgments

H. Frid gratefully acknowledges the support of CNPq, grant 306137/2006-2, and FAPERJ, grant E-26/152.192-2002.

Appendix A. Some basic results about the nondegenerate porous medium equation

In this section we state some results about the porous medium equation which are used in Section 7. Most of them follow from more general results in [8], and in these cases we simply refer to [8] for the proof. More specifically, we consider the Cauchy problem for the following quasilinear parabolic equation

\[
u_t - \Delta f(u) = h(x), \qquad (x,t) \in \mathbb{R}^{n+1}_+ := \mathbb{R}^n \times (0,\infty),
\tag{A.1}
\]

with initial data given by

\[
u(x,0) = u_0(x), \qquad x \in \mathbb{R}^n,
\tag{A.2}
\]

where we assume that $f \in C^2(\mathbb{R})$ with $f'(u) > 0$ for $u \in \mathbb{R}$, and $h, u_0 \in L^\infty(\mathbb{R}^n)$. Observe that here we only consider the simpler nondegenerate case, i.e., $f' > 0$, since this is the context we are interested in for our application in Section 7. For smooth $u_0$ and $h$, the existence and uniqueness of a solution $u \in C^{2,1}(\mathbb{R}^n \times [0,\infty))$ is well known (see, e.g., [21]). For $u_0, h \in L^\infty(\mathbb{R}^n)$, we use Aubin's compactness lemma to prove the convergence in $L^2$, to a limit function $u$, of the solutions $u^j$ obtained by approximating $u_0$ and $h$ by smooth functions $u_0^j$, $h^j$. We then deduce that the limit function $u$ so obtained satisfies $u_t, \nabla u, \nabla^2 u \in L^p_{\mathrm{loc}}(\mathbb{R}^n \times (0,\infty))$, for all $p \in (1,\infty)$, combining the Nash–De Giorgi theorem and linear theory (see, e.g., [21]). It is also easy to verify that $u$ satisfies Eq. (A.1) almost everywhere in $\mathbb{R}^n \times (0,\infty)$, and it is in fact a weak solution in the sense of Definition A.1 below. Hence, in this case, uniqueness follows from the doubling of variables method as in [8] (see Theorem A.3 below).

Definition A.1. A function $u \in L^\infty(\mathbb{R}^{n+1}_+)$ is said to be a weak solution of the problem (A.1), (A.2) if the following hold:

(1) $f(u(x,t)) \in L^2_{\mathrm{loc}}\big((0,\infty); H^1_{\mathrm{loc}}(\mathbb{R}^n)\big)$;

(2) Given $\varphi \in C_c^\infty(\mathbb{R}^{n+1})$, we have

\[
\int_{\mathbb{R}^{n+1}_+} \big\{ u\varphi_t - \nabla f(u)\cdot\nabla\varphi + h\varphi \big\}\,dx\,dt + \int_{\mathbb{R}^n} u_0\,\varphi(x,0)\,dx = 0.
\tag{A.3}
\]
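Let $H_\delta : \mathbb{R} \to \mathbb{R}$ be the approximation of the sgn function given by

\[
H_\delta(s) :=
\begin{cases}
1, & \text{for } s > \delta, \\
\dfrac{s}{\delta}, & \text{for } |s| \le \delta, \\
-1, & \text{for } s < -\delta,
\end{cases}
\]

and let $(H_\delta)_+$ and $(H_\delta)_-$ denote its nonnegative part and nonpositive part, respectively; $(H_\delta)_+(s) := \max\{H_\delta(s), 0\}$, $(H_\delta)_-(s) := \max\{-H_\delta(s), 0\}$. Given a nondecreasing Lipschitz continuous function $\vartheta : \mathbb{R} \to \mathbb{R}$ and $k \in \mathbb{R}$, we define

\[
B_\vartheta^k(\lambda) := \int_k^\lambda \vartheta\big(f(r)\big)\,dr.
\]

Let us denote

\[
\vartheta_\delta(\lambda) := H_\delta\big(\lambda - f(k)\big) \quad \text{and} \quad (\vartheta_\delta)_+(\lambda) := (H_\delta)_+\big(\lambda - f(k)\big).
\]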
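The definitions above can be sketched in a few lines of Python. This is an illustration only: the function names, the midpoint quadrature used for $B_\vartheta^k$, and the sample choice $f(r) = r$ in the usage lines are assumptions made here, not part of the paper.

```python
from typing import Callable


def H(s: float, delta: float) -> float:
    """H_delta(s): 1 for s > delta, s/delta for |s| <= delta, -1 for s < -delta."""
    if s > delta:
        return 1.0
    if s < -delta:
        return -1.0
    return s / delta


def H_plus(s: float, delta: float) -> float:
    """Nonnegative part (H_delta)_+ = max{H_delta, 0}."""
    return max(H(s, delta), 0.0)


def H_minus(s: float, delta: float) -> float:
    """Nonpositive part (H_delta)_- = max{-H_delta, 0}."""
    return max(-H(s, delta), 0.0)


def B_theta(lam: float, k: float, f: Callable[[float], float],
            theta: Callable[[float], float], steps: int = 2000) -> float:
    """Midpoint-rule approximation of B_theta^k(lam) = int_k^lam theta(f(r)) dr."""
    h = (lam - k) / steps
    return sum(theta(f(k + (i + 0.5) * h)) for i in range(steps)) * h


if __name__ == "__main__":
    f = lambda r: r          # hypothetical increasing f, for illustration only
    k, delta = 0.0, 0.01
    theta_delta = lambda v: H(v - f(k), delta)
    # For small delta this is close to |lam - k|, the limit used in (A.5).
    print(B_theta(2.0, k, f, theta_delta))
```

As $\delta \to 0$, $H_\delta$ tends to sgn and $B^k_{\vartheta_\delta}(\lambda)$ tends to $|\lambda - k|$ when $f$ is strictly increasing, which is the mechanism behind the passage from (A.4) to (A.5).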

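The following two results are essentially adaptations of more general ones established in [8] and state important properties of weak solutions of (A.1), (A.2).

Theorem A.1. Let $u$ be a weak solution of the Cauchy problem (A.1), (A.2), with $h, u_0 \in L^\infty(\mathbb{R}^n)$. Then,

\[
\begin{aligned}
&\int_{\mathbb{R}^{n+1}_+} \Big\{ -B^k_{\vartheta_\delta}(u)\,\varphi_t + H_\delta\big(f(u) - f(k)\big)\,\nabla f(u)\cdot\nabla\varphi - H_\delta\big(f(u) - f(k)\big)\,h\,\varphi \Big\}\,dx\,dt \\
&\quad= -\int_{\mathbb{R}^{n+1}_+} |\nabla f(u)|^2\,H_\delta'\big(f(u) - f(k)\big)\,\varphi\,dx\,dt
\end{aligned}
\tag{A.4}
\]

for all $k \in \mathbb{R}$ and all $0 \le \varphi \in C_c^\infty(\mathbb{R}^{n+1}_+)$. Moreover, letting $\delta \to 0$ in (A.4) and using the strict increasing monotonicity of $f$, we obtain

\[
\begin{aligned}
&\int_{\mathbb{R}^{n+1}_+} \Big\{ -|u - k|\,\varphi_t + \nabla\big|f(u) - f(k)\big|\cdot\nabla\varphi - \operatorname{sgn}(u - k)\,h\,\varphi \Big\}\,dx\,dt \\
&\quad= -\lim_{\delta\to0} \int_{\mathbb{R}^{n+1}_+} |\nabla f(u)|^2\,H_\delta'\big(f(u) - f(k)\big)\,\varphi\,dx\,dt,
\end{aligned}
\tag{A.5}
\]

for all $k \in \mathbb{R}$ and all $0 \le \varphi \in C_c^\infty(\mathbb{R}^{n+1}_+)$. We have similar identities with $B^k_{\vartheta_\delta}$, $H_\delta$ replaced by $B^k_{(\vartheta_\delta)_+}$, $(H_\delta)_+$, respectively, in (A.4) and $|u - k|$, $|f(u) - f(k)|$ replaced by $(u - k)_+$, $(f(u) - f(k))_+$, respectively, in (A.5).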

For the next result we assume that there is $V \in W^{2,\infty}(\mathbb{R}^n)$ such that, in (A.1), $h = \Delta V$. In particular, (A.1) admits stationary solutions, namely,

\[
\psi_\alpha(x) := f^{-1}\big(V(x) + \alpha\big), \qquad \alpha \in \mathbb{R}.
\]

The following theorem follows from (A.4), by using doubling of variables, the fact that $u_2$ is stationary, and the trick of completing the square in [8, Theorem 13, p. 339]. Because of its central role in the proof of Theorem 7.1 we will give its detailed proof.

Theorem A.2. Let $u_1$, $u_2$ be weak solutions of the Cauchy problem for (A.1) with initial data $u_{01}, u_{02} \in L^\infty(\mathbb{R}^n)$. Assume $h = \Delta V$ for some $V \in W^{2,\infty}(\mathbb{R}^n)$ and that $u_2 = u_{02}$ is a stationary solution. Then,

\[
\begin{aligned}
&-\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ B^{u_2(y)}_{\vartheta_\delta}\big(u_1(x,t)\big)\,(\phi_t + \phi_s) + H_\delta\big(f(u_1(x,t)) - f(u_2(y))\big)\,\big(h(x) - h(y)\big)\,\phi \Big\}\,dx\,dt\,dy\,ds \\
&\quad+ \int_{(\mathbb{R}^{n+1}_+)^2} H_\delta\big(f(u_1(x,t)) - f(u_2(y))\big)\,(\nabla_x + \nabla_y)\big[f(u_1(x,t)) - f(u_2(y))\big]\cdot(\nabla_x + \nabla_y)\phi\,dx\,dt\,dy\,ds \\
&= -\int_{(\mathbb{R}^{n+1}_+)^2} \Big|(\nabla_x + \nabla_y)\big[f(u_1(x,t)) - f(u_2(y))\big]\Big|^2\,H_\delta'\big(f(u_1(x,t)) - f(u_2(y))\big)\,\phi\,dx\,dt\,dy\,ds,
\end{aligned}
\tag{A.6}
\]

for all $0 \le \phi \in C_c^\infty\big((\mathbb{R}^{n+1}_+)^2\big)$.

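Proof. Let $u_1 = u_1(x,t)$ and $u_2 = u_2(y)$. By (A.4) applied to $u_1$, we have

\[
\begin{aligned}
&\int_{\mathbb{R}^{n+1}_+} \Big\{ -B^k_{\vartheta_\delta}(u_1)\,\phi_t + H_\delta\big(f(u_1) - f(k)\big)\,\nabla_x f(u_1)\cdot\nabla_x\phi - H_\delta\big(f(u_1) - f(k)\big)\,h(x)\,\phi \Big\}\,dx\,dt \\
&\quad= -\int_{\mathbb{R}^{n+1}_+} |\nabla_x f(u_1)|^2\,H_\delta'\big(f(u_1) - f(k)\big)\,\phi\,dx\,dt
\end{aligned}
\]

for all $k \in \mathbb{R}$. Setting $k = u_2(y)$ and integrating in $y, s$, we obtain

\[
\begin{aligned}
&\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ -B^{u_2}_{\vartheta_\delta}(u_1)\,\phi_t + H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_x f(u_1)\cdot\nabla_x\phi - H_\delta\big(f(u_1) - f(u_2)\big)\,h(x)\,\phi \Big\}\,dx\,dt\,dy\,ds \\
&\quad= -\int_{(\mathbb{R}^{n+1}_+)^2} |\nabla_x f(u_1)|^2\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi\,dx\,dt\,dy\,ds.
\end{aligned}
\tag{A.7}
\]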

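Now, applying (A.4) to $u_2$, taking $k = u_1(x,t)$ and integrating in $x, t$, we obtain

\[
\begin{aligned}
&\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ B^{u_1}_{\vartheta_\delta}(u_2)\,\phi_s + H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_y f(u_2)\cdot\nabla_y\phi - H_\delta\big(f(u_1) - f(u_2)\big)\,h(y)\,\phi \Big\}\,dx\,dt\,dy\,ds \\
&\quad= \int_{(\mathbb{R}^{n+1}_+)^2} |\nabla_y f(u_2)|^2\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi\,dx\,dt\,dy\,ds.
\end{aligned}
\]

Since $B^{u_1}_{\vartheta_\delta}(u_2)$ and $B^{u_2}_{\vartheta_\delta}(u_1)$ are independent of $s$, we can write the trivial equality where both members are null

\[
\int_{(\mathbb{R}^{n+1}_+)^2} B^{u_1}_{\vartheta_\delta}(u_2)\,\phi_s\,dx\,dt\,dy\,ds = \int_{(\mathbb{R}^{n+1}_+)^2} B^{u_2}_{\vartheta_\delta}(u_1)\,\phi_s\,dx\,dt\,dy\,ds.
\]

Combining the two previous equalities yields

\[
\begin{aligned}
&\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ B^{u_2}_{\vartheta_\delta}(u_1)\,\phi_s + H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_y f(u_2)\cdot\nabla_y\phi - H_\delta\big(f(u_1) - f(u_2)\big)\,h(y)\,\phi \Big\}\,dx\,dt\,dy\,ds \\
&\quad= \int_{(\mathbb{R}^{n+1}_+)^2} |\nabla_y f(u_2)|^2\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi\,dx\,dt\,dy\,ds.
\end{aligned}
\tag{A.8}
\]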

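Now, note that

\[
\begin{aligned}
0 &= \int_{\mathbb{R}^{n+1}_+} \nabla_y f(u_2)\cdot\nabla_x\Big[H_\delta\big(f(u_1) - f(u_2)\big)\,\phi\Big]\,dx\,dt \\
&= \int_{\mathbb{R}^{n+1}_+} \Big\{ \nabla_y f(u_2)\cdot\nabla_x f(u_1)\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi + H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_y f(u_2)\cdot\nabla_x\phi \Big\}\,dx\,dt
\end{aligned}
\]

and so we have

\[
\begin{aligned}
&\int_{(\mathbb{R}^{n+1}_+)^2} H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_y f(u_2)\cdot\nabla_x\phi\,dx\,dt\,dy\,ds \\
&\quad= -\int_{(\mathbb{R}^{n+1}_+)^2} \nabla_y f(u_2)\cdot\nabla_x f(u_1)\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi\,dx\,dt\,dy\,ds.
\end{aligned}
\tag{A.9}
\]

Analogously,

\[
\begin{aligned}
&\int_{(\mathbb{R}^{n+1}_+)^2} H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_x f(u_1)\cdot\nabla_y\phi\,dx\,dt\,dy\,ds \\
&\quad= \int_{(\mathbb{R}^{n+1}_+)^2} \nabla_y f(u_2)\cdot\nabla_x f(u_1)\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi\,dx\,dt\,dy\,ds.
\end{aligned}
\tag{A.10}
\]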

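Adding (A.7) and (A.10) yields

\[
\begin{aligned}
&\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ -B^{u_2}_{\vartheta_\delta}(u_1)\,\phi_t - H_\delta\big(f(u_1) - f(u_2)\big)\,h(x)\,\phi + H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_x f(u_1)\cdot(\nabla_x + \nabla_y)\phi \Big\}\,dx\,dt\,dy\,ds \\
&\quad= -\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ |\nabla_x f(u_1)|^2 - \nabla_x f(u_1)\cdot\nabla_y f(u_2) \Big\}\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi\,dx\,dt\,dy\,ds.
\end{aligned}
\tag{A.11}
\]

Further, multiplying (A.8) and (A.9) by $-1$ and adding them gives

\[
\begin{aligned}
&-\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ B^{u_2}_{\vartheta_\delta}(u_1)\,\phi_s - H_\delta\big(f(u_1) - f(u_2)\big)\,h(y)\,\phi + H_\delta\big(f(u_1) - f(u_2)\big)\,\nabla_y f(u_2)\cdot(\nabla_x + \nabla_y)\phi \Big\}\,dx\,dt\,dy\,ds \\
&\quad= -\int_{(\mathbb{R}^{n+1}_+)^2} \Big\{ |\nabla_y f(u_2)|^2 - \nabla_x f(u_1)\cdot\nabla_y f(u_2) \Big\}\,H_\delta'\big(f(u_1) - f(u_2)\big)\,\phi\,dx\,dt\,dy\,ds.
\end{aligned}
\tag{A.12}
\]

Finally, adding (A.11) and (A.12), and noting that $(\nabla_x + \nabla_y)\big[f(u_1) - f(u_2)\big] = \nabla_x f(u_1) - \nabla_y f(u_2)$, we obtain (A.6), concluding the proof. □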



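Remark A.1. From the equality in Theorem A.2, using test functions $\phi(x,t,y,s) := \varphi\big(\tfrac{x+y}{2}, \tfrac{t+s}{2}\big)\,\rho_j\big(\tfrac{x-y}{2}\big)\,\theta_j\big(\tfrac{t-s}{2}\big)$, where $0 \le \varphi \in C_c^\infty(\mathbb{R}^{n+1}_+)$, and $\rho_j$, $\theta_j$ are classical approximations of the identity in $\mathbb{R}^n$ and $\mathbb{R}$, respectively, as in the doubling of variables method, we get

\[
\begin{aligned}
&-\int_{\mathbb{R}^{n+1}_+} B^{u_2(x)}_{\vartheta_\delta}\big(u_1(x,t)\big)\,\varphi_t\,dx\,dt \\
&\quad+ \int_{\mathbb{R}^{n+1}_+} H_\delta\big(f(u_1(x,t)) - f(u_2(x))\big)\,\nabla\big[f(u_1(x,t)) - f(u_2(x))\big]\cdot\nabla\varphi\,dx\,dt \\
&= -\int_{\mathbb{R}^{n+1}_+} \Big|\nabla\big[f(u_1(x,t)) - f(u_2(x))\big]\Big|^2\,H_\delta'\big(f(u_1(x,t)) - f(u_2(x))\big)\,\varphi\,dx\,dt,
\end{aligned}
\]

for all $0 \le \varphi \in C_c^\infty(\mathbb{R}^{n+1}_+)$.

Let the weight function $\Lambda : \mathbb{R}^n \to \mathbb{R}$ be defined by

\[
\Lambda(x) := e^{-\sqrt{1 + |x|^2}}.
\tag{A.13}
\]

An important feature of the weight function $\Lambda$ is that

\[
|D_i\Lambda(x)| \le \Lambda(x), \ \text{for } i = 1, \ldots, n, \quad \text{and} \quad |\Delta\Lambda(x)| \le (n+1)\,\Lambda(x), \ \text{for } x \in \mathbb{R}^n.
\tag{A.14}
\]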
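The bounds (A.14) can be checked directly. Writing $r = \sqrt{1 + |x|^2}$, one has $D_i\Lambda = -(x_i/r)\Lambda$ and $\Delta\Lambda = \Lambda\,\big(|x|^2/r^2 - n/r + |x|^2/r^3\big)$. The following sketch evaluates these closed forms at a few sample points; the function names and the sample points are choices made here for illustration.

```python
import math
from typing import List, Sequence


def weight(x: Sequence[float]) -> float:
    """Lambda(x) = exp(-sqrt(1 + |x|^2))."""
    return math.exp(-math.sqrt(1.0 + sum(c * c for c in x)))


def weight_gradient(x: Sequence[float]) -> List[float]:
    """Closed form D_i Lambda(x) = -(x_i / r) * Lambda(x), with r = sqrt(1 + |x|^2)."""
    r = math.sqrt(1.0 + sum(c * c for c in x))
    lam = math.exp(-r)
    return [-c / r * lam for c in x]


def weight_laplacian(x: Sequence[float]) -> float:
    """Closed form Laplacian: Lambda * (|x|^2/r^2 - n/r + |x|^2/r^3)."""
    n = len(x)
    s = sum(c * c for c in x)
    r = math.sqrt(1.0 + s)
    return math.exp(-r) * (s / r**2 - n / r + s / r**3)


def satisfies_A14(x: Sequence[float]) -> bool:
    """Check |D_i Lambda| <= Lambda and |Laplacian Lambda| <= (n + 1) * Lambda at x."""
    lam = weight(x)
    grad_ok = all(abs(g) <= lam for g in weight_gradient(x))
    lap_ok = abs(weight_laplacian(x)) <= (len(x) + 1) * lam
    return grad_ok and lap_ok


if __name__ == "__main__":
    for point in ([0.0], [0.3, -1.2], [5.0, 4.0, -3.0], [100.0, 0.0]):
        print(point, satisfies_A14(point))
```

These two bounds are what allow $\Lambda$ to be used as a test weight on the whole space: derivatives falling on $\Lambda$ are controlled by $\Lambda$ itself, which is the ingredient behind the Gronwall-type estimates (A.17) and (A.18).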

The next result establishes the existence and $L^1$-stability of weak solutions of (A.1), (A.2). The proof, which we will omit, is obtained by combining ideas in Volpert and Hudjaev [34], more specifically the use of the weight function $\Lambda$, and the extension of the doubling of variables method of Kruzhkov [19] to degenerate quasilinear parabolic equations obtained by Carrillo [8].

Theorem A.3. Assume $f \in C^2(\mathbb{R})$, with $f'(u) > 0$ for all $u \in \mathbb{R}$, and $h, u_0 \in L^\infty(\mathbb{R}^n)$. Then we have the following:

(i) There exists a weak solution $u \in L^\infty(\mathbb{R}^{n+1}_+)$ of the problem (A.1), (A.2).

(ii) If $u_1, u_2 \in L^\infty(\mathbb{R}^{n+1}_+)$ are weak solutions of (A.1) with initial data $u_{01}, u_{02} \in L^\infty(\mathbb{R}^n)$, respectively, then

\[
\int_{\mathbb{R}^{n+1}_+} \Big\{ \big(u_1(x,t) - u_2(x,t)\big)_+\,\phi_t + \big(f(u_1(x,t)) - f(u_2(x,t))\big)_+\,\Delta\phi \Big\}\,dx\,dt + \int_{\mathbb{R}^n} \big(u_{01}(x) - u_{02}(x)\big)_+\,\phi(x,0)\,dx \ \ge\ 0,
\tag{A.15}
\]

for all $0 \le \phi \in C_c^\infty(\mathbb{R}^{n+1})$, from which we obtain

\[
\int_{\mathbb{R}^{n+1}_+} \Big\{ \big|u_1(x,t) - u_2(x,t)\big|\,\phi_t + \big|f(u_1(x,t)) - f(u_2(x,t))\big|\,\Delta\phi \Big\}\,dx\,dt + \int_{\mathbb{R}^n} \big|u_{01}(x) - u_{02}(x)\big|\,\phi(x,0)\,dx \ \ge\ 0,
\tag{A.16}
\]

for all $0 \le \phi \in C_c^\infty(\mathbb{R}^{n+1})$.

(iii) Therefore, there is a constant $c > 0$, depending only on $n$ and $f$, such that for a.e. $t \ge 0$ we have

\[
\int_{\mathbb{R}^n} \big(u_1(x,t) - u_2(x,t)\big)_+\,\Lambda(x)\,dx \ \le\ e^{ct} \int_{\mathbb{R}^n} \big(u_{01}(x) - u_{02}(x)\big)_+\,\Lambda(x)\,dx.
\tag{A.17}
\]

In particular, we also have

\[
\int_{\mathbb{R}^n} \big|u_1(x,t) - u_2(x,t)\big|\,\Lambda(x)\,dx \ \le\ e^{ct} \int_{\mathbb{R}^n} \big|u_{01}(x) - u_{02}(x)\big|\,\Lambda(x)\,dx.
\tag{A.18}
\]

References

[1] G. Allaire, Homogenization and two-scale convergence, SIAM J. Math. Anal. 23 (6) (1992) 1482–1518.
[2] L. Ambrosio, H. Frid, Multiscale Young measures in almost periodic homogenization and applications, Arch. Ration. Mech. Anal. (2009), in press.
[3] L. Ambrosio, N. Fusco, D. Pallara, Functions of Bounded Variation and Free Discontinuity Problems, Oxford University Press, 2000.
[4] J.M. Ball, A version of the fundamental theorem for Young measures, in: M. Rascle, D. Serre, M. Slemrod (Eds.), Partial Differential Equations and Continuum Models of Phase Transitions, in: Lecture Notes in Phys., vol. 344, Springer-Verlag, 1989, pp. 207–215.
[5] A.S. Besicovitch, Almost Periodic Functions, Cambridge University Press, 1932.
[6] A. Bourgeat, A. Mikelic, S. Wright, Stochastic two-scale convergence in the mean and applications, J. Reine Angew. Math. 456 (1994) 19–51.
[7] L. Caffarelli, P.E. Souganidis, C. Wang, Homogenization of nonlinear, uniformly elliptic and parabolic partial differential equations in stationary ergodic media, Comm. Pure Appl. Math. 58 (3) (2005) 319–361.
[8] J. Carrillo, Entropy solutions for nonlinear degenerate problems, Arch. Ration. Mech. Anal. 147 (4) (1999) 269–361.
[9] J. Casado-Diaz, I. Gayte, The two-scale convergence method applied to generalized Besicovitch spaces, Proc. R. Soc. Lond. Ser. A 458 (2002) 2925–2946.
[10] G. Dal Maso, L. Modica, Nonlinear stochastic homogenization, Ann. Mat. Pura Appl. (1985) 347–389.
[11] N. Dunford, J.T. Schwartz, Linear Operators. Parts I and II, Interscience Publishers, Inc., New York, 1958, 1963.
[12] R.J. DiPerna, Convergence of approximate solutions to conservation laws, Arch. Ration. Mech. Anal. 82 (1983) 27–70.
[13] R.J. DiPerna, Convergence of the viscosity method for isentropic gas dynamics, Comm. Math. Phys. 91 (1983) 1–30.
[14] R.J. DiPerna, Measure-valued solutions to conservation laws, Arch. Ration. Mech. Anal. 88 (1985) 223–270.
[15] W. E, Homogenization of linear and nonlinear transport equations, Comm. Pure Appl. Math. 45 (1992) 301–326.
[16] W. E, D. Serre, Correctors for the homogenization of conservation laws with oscillatory forcing terms, Asymptot. Anal. 5 (1992) 311–316.
[17] V.V. Jikov, S.M. Kozlov, O.A. Oleinik, Homogenization of Differential Operators and Integral Functionals, Springer-Verlag, Berlin–Heidelberg, 1994.
[18] S.M. Kozlov, The method of averaging and walks in inhomogeneous environments, Uspekhi Mat. Nauk 40 (2) (1985) 61–120.
[19] S.N. Kruzhkov, First order quasilinear equations in several independent variables, Math. USSR Sb. 10 (1970) 217–243.
[20] N.M. Krylov, N.N. Bogolyubov, La théorie générale de la mesure et son application à l'étude des systèmes dynamiques de la mécanique non linéaire, Ann. of Math. (2) 38 (1937) 65–113.
[21] O.A. Ladyženskaja, V.A. Solonnikov, N.N. Ural'ceva, Linear and Quasi-linear Equations of Parabolic Type, Amer. Math. Soc., Providence, RI, 1988.
[22] P.-L. Lions, P.E. Souganidis, Correctors for the homogenization of the Hamilton–Jacobi equations in the stationary ergodic setting, Comm. Pure Appl. Math. 56 (2003) 1501–1524.
[23] F. Murat, Compacité par compensation, Ann. Sc. Norm. Super. Pisa Cl. Sci. (4) 5 (1978) 489–507.
[24] V.V. Nemytskii, V.V. Stepanov, Qualitative Theory of Differential Equations, Princeton University Press, Princeton, NJ, 1960.
[25] G. Nguetseng, A general convergence result for a functional related to the theory of homogenization, SIAM J. Math. Anal. 20 (3) (1989) 608–623.
[26] G. Nguetseng, Homogenization structures and applications, Z. Anal. Anwend. 22 (1) (2003) 73–107.
[27] G. Nguetseng, H. Nnang, Homogenization of nonlinear monotone operators beyond the periodic setting, Electron. J. Differential Equations 2003 (36) (2003) 1–24.
[28] G. Papanicolaou, S.R.S. Varadhan, Boundary value problems with rapidly oscillating random coefficients, in: J. Fritz, J.L. Lebowitz, D. Szasz (Eds.), Proc. Colloq. on Random Fields, Rigorous Results in Statistical Mechanics and Quantum Field Theory, in: Colloq. Math. Soc. Janos Bolyai, vol. 10, 1979, pp. 835–873.
[29] W. Rudin, Real and Complex Analysis, McGraw–Hill, New York, 1966.
[30] M.E. Schonbek, Convergence of solutions to nonlinear dispersive equations, Comm. Partial Differential Equations 7 (1982) 959–1000.
[31] P.E. Souganidis, Stochastic homogenization of Hamilton–Jacobi equations and some application, Asymptot. Anal. 20 (1999) 141–178.
[32] L. Tartar, Compensated compactness and applications to partial differential equations, in: R.J. Knops (Ed.), Nonlinear Analysis and Mechanics, in: Res. Notes Math., vol. 4, Pitman Press, New York, 1979, pp. 136–211.
[33] L.C. Young, Lectures on Calculus of Variations and Optimal Control Theory, Saunders, 1969.
[34] A.I. Vol'pert, S.I. Hudjaev, Cauchy's problem for degenerate second order quasilinear parabolic equations, Math. USSR Sb. 7 (3) (1969) 365–387.
[35] V.V. Zhikov, S.M. Kozlov, O.A. Oleĭnik, Kha T'en Ngoan, Averaging and G-convergence of differential operators, Uspekhi Mat. Nauk 34 (5) (1979) 65–133 (in Russian), English transl. in Russian Math. Surveys 34 (1981) 69–147.
[36] V.V. Zhikov, S.M. Kozlov, O.A. Oleĭnik, G-convergence of parabolic operators, Uspekhi Mat. Nauk 36 (1) (1981) 11–58 (in Russian), English transl. in Russian Math. Surveys 36 (1981) 9–60.
[37] V.V. Zhikov, S.M. Kozlov, O.A. Oleĭnik, Averaging of parabolic operators, Tr. Mosk. Mat. Obs. 45 (1982) 182–236 (in Russian), English transl. in Trans. Moscow Math. Soc. 1 (1984) 189–241.
[38] V.V. Zhikov, E.V. Krivenko, Homogenization of singularly perturbed elliptic operators, Mat. Zametki 33 (1983) 571–582, English transl. in Math. Notes 33 (1983) 294–300.


On Schrödinger semigroups and related topics

Mustapha Mokhtar-Kharroubi

Département de Mathématiques, Université de Franche-Comté, 16 Route de Gray, 25030 Besançon, France

Received 4 June 2008; accepted 17 November 2008
Available online 14 January 2009
Communicated by C. Villani

Abstract

This paper deals with two related subjects. In the first part, we give generation theorems, relying on (weak) compactness arguments, for perturbed positive semigroups in general ordered Banach spaces with additive norm on the positive cone. The second part provides new functional analytic developments on semigroup theory for Schrödinger operators in $L^p$ spaces with $(L^1)$ $\Delta$-bounded potentials without restriction on the $(L^1)$ $\Delta$-bound. In particular, our formalism enlarges a priori the classical Kato class and its subsequent refinements. The connection with form-perturbation theory is also dealt with.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Schrödinger operator; Kato class; Positive semigroups; Weak compactness; Essential self-adjointness; Form-perturbation

M. Mokhtar-Kharroubi / Journal of Functional Analysis 256 (2009) 1998–2025

0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.11.012

1. Introduction

This paper provides some results on semigroup theory and on Schrödinger operators. The first part deals with new generation theorems (of perturbative type) of positive semigroups in general ordered Banach spaces with additive norm on the positive cone. We note that such Banach spaces cover $L^1(\mu)$ spaces (or spaces of bounded measures) and also some other ordered Banach spaces of practical interest but without lattice structure, such as the Banach space of trace class operators in a Hilbert space, where our results can apply to quantum dynamical semigroups. Our generation theorems rely on (weak) compactness arguments. The second part of the paper provides new functional analytic developments on Schrödinger operators $-\Delta - V$ in $L^p(\mathbb{R}^N)$ where $\Delta$ is the Laplacian and $V$ is an unbounded multiplication operator by a positive measurable function $V$ which does not necessarily fall within known classes of potentials (we use the same symbol for the function $V$ and the multiplication operator by $V$). In the literature, this picture refers to Schrödinger operators with negative potentials, i.e. $-V \le 0$ is the potential; we have also dropped the classical coefficient $\tfrac12$ in front of the Laplacian. We note that the case of positive (or equivalently bounded below) potentials is well understood under very weak assumptions (e.g. $V$ locally integrable outside a closed set with zero Lebesgue measure) and a general m-accretivity theory is available. On the other hand, it is well known that the treatment of negative potentials requires some "smallness" condition; typically the relative bound of $V$ with respect to $\Delta$ (the $\Delta$-bound) must be small enough or, at least, the potential must be form-small in $L^2$ with respect to $-\Delta$. The state of the art by the end of the sixties is comprehensively covered by Schechter's monograph [35]. The subsequent literature, influenced to some extent by some seminal papers such as [16,18,36,38], is really considerable: for the next two decades (and without any pretense to completeness), we refer to the classical books [33, Chapter X], [20, Chapter V] and to the papers [1,9,14,17,19,21–25,30,37,39,40,43] and references therein; some works rely also on probabilistic tools, e.g. [1,9,14,22]. (We mention that complex potentials are also dealt with; see e.g. [8].) An extensive list of references on Schrödinger operators is given in the more recent review paper by B. Simon [41]. We note that our paper deals essentially with one aspect of the subject which is more or less connected to essential self-adjointness. To this end, we provide a systematic semigroup theory for Schrödinger operators in $L^p$ spaces. Since the understanding of positive potentials, i.e.
absorption semigroups, is essentially complete (see [3,43] and references therein), the present paper focuses mainly on negative potentials; we point out that we could as well add a positive singular (e.g. $L^1_{\mathrm{loc}}$) potential, see Remark 12. We provide a general theory which improves in several respects the literature on the subject; in particular the concept of "smallness" of negative potentials with respect to the Laplacian is finely revisited. The general philosophy behind this work is that a great deal of mathematical properties of Schrödinger operators on $L^p$ spaces is in a sense "already contained" in the $L^1$ theory; this gives the $L^1$ setting a special status. The role of $L^1$ appears also (but differently) in the context of spectral theory of Schrödinger operators [12]. Our general strategy is the following: We give first a very general generation theorem for $A_1 := \Delta + V$ in $L^1$-spaces for $\Delta$-bounded potential in $L^1$-sense (as a consequence of a perturbation result by W. Desch [13]). This theorem relies on the optimal assumption

\[
\delta := \lim_{\lambda\to+\infty} r_\sigma\big[V(\lambda - \Delta)^{-1}\big] < 1
\tag{1}
\]

which is much weaker than the usual "$\Delta$-smallness" assumption

\[
\lim_{\lambda\to+\infty} \big\|V(\lambda - \Delta)^{-1}\big\|_{\mathcal{L}(L^1)} < 1
\]

occurring in the literature. In particular, thanks to the functional analytic results of the first part of the paper, this theorem may be used (i.e. (1) holds) under suitable weak compactness assumptions. This allows us to enlarge a priori the known classes of potentials. As far as we know, the idea to get round "$\Delta$-smallness" assumptions by means of weak compactness arguments in $L^1$ appears here for the first time and improves our understanding of certain aspects of Schrödinger operators even though it is known (see e.g. [1,40]) that most physical potentials have zero bound. (We point out also that for the potentials with radial symmetry, $\Delta$-boundedness in $L^1$



sense implies zero $\Delta$-bound of the potential; see [40, Proposition A.2.5] and the comment coming after.) By a classical symmetry argument, in the spirit of [21], the corresponding semigroup interpolates on $L^p$ spaces and provides us with "Schrödinger semigroups" $\{S_p(t);\ t \ge 0\}$ with generators $A_p$ in $L^p$ spaces. In particular, we capture a very general self-adjoint semi-bounded Schrödinger operator $A_2$ in $L^2$. We also provide connections between our formalism and form-perturbation theory in $L^2$ by showing that (1) implies that $V$ is form-bounded with respect to $-\Delta$ in $L^2$ with relative form-bound less than or equal to $\delta$. In particular, under (1), we show that $A_2$ is nothing but the form-sum of $\Delta$ and $V$. A conjecture on a characterization of form-smallness in $L^2$ of negative potentials $V$ with respect to $-\Delta$ in terms of (1) is also given. The domains of the generators $A_p$ are precisely characterized and practical cores are given. Our formalism is general, self-contained and most of our results are new. We note also that Schrödinger operators with magnetic fields can be dealt with by combining additional domination arguments [29]. Finally, we mention that, without extra mathematical cost, some of our results can be stated in abstract $L^p(\mu)$ spaces for general positive symmetric semigroups (see Remark 25); however, for the simplicity of statements, we have preferred to restrict ourselves to the Laplacian in the whole space.

This paper is an expanded version (with applications) of the preprint [28]. It is organized as follows. In Section 2, we recall a perturbation theorem by W. Desch [13] in ordered Banach spaces $X$ with additive norm on $X_+$. In Section 3, we prove a result on the spectral radius of positive operators in ordered Banach spaces (without necessarily a lattice structure) and show how Desch's perturbation theorem applies under suitable (weak) compactness assumptions. In Section 4, we show how Desch's perturbation theorem, applied to the Schrödinger operator in $L^1(\mathbb{R}^N)$

\[
A_1 : f \in D(\Delta_1) \to \Delta f + Vf \in L^1(\mathbb{R}^N), \qquad D(\Delta_1) = \big\{\varphi \in L^1(\mathbb{R}^N);\ \Delta\varphi \in L^1(\mathbb{R}^N)\big\},
\tag{2}
\]

provides an optimal generation theorem under assumption (1). In particular, we show that if $V = V_1 + V_2$ where the relative bound of $V_2$ with respect to $\Delta$ is less than 1 and $V_1$ is $\Delta$-weakly compact then $A_1$ is a generator. We also characterize this weak compactness assumption in terms of the potential $V_1$. In particular, such an assumption is satisfied if $V_1$ is "small at infinity" in the sense

\[
\sup_{y\in\mathbb{R}^N} \int_{|x|\ge c} G_1(x - y)\,V_1(x)\,dx \to 0 \quad \text{as } c \to \infty
\]

(e.g. if $y \in \mathbb{R}^N \to \int_{|x|\ge c} G_1(x - y)\,V_1(x)\,dx$ is continuous and goes to zero as $|y| \to \infty$) where $G_1(\cdot)$ is the Bessel kernel of $(1 - \Delta)^{-1}$ and, for any ball $B$ with finite radius centered at zero,

\[
\lim_{|\Omega|\to0,\ \Omega\subset B}\ \sup_{x\in B} \int_\Omega g_N(x - y)\,V_1(y)\,dy \to 0,
\]

where $g_N(\cdot)$ denotes a fundamental solution of the Laplacian and $|\Omega|$ is the Lebesgue measure of $\Omega$. (We note that the last assumption on $V_1$ is weaker than the membership to the local class $K_N^{\mathrm{loc}}$ defined in [1, p. 210]; see Remark 20 below.) Since no condition on the bound of $V_1$ with respect to $\Delta$ is required, this result enlarges a priori the classical Kato class and subsequent refinements; see Remarks 9–11 and 20, Theorems 13, 15, 17 and Proposition 19. By a symmetry argument, the generation result in $L^1(\mathbb{R}^N)$ provides generation results in $L^p(\mathbb{R}^N)$ ($1 \le p < \infty$) and the generator $A_p$ (i.e. the Schrödinger operator) of the corresponding Schrödinger semigroup $\{S_p(t);\ t \ge 0\}$ turns out to be the closure of $\Delta + V : \Xi_p \to L^p(\mathbb{R}^N)$ where

\[
\Xi_p := \big\{ f \in L^p(\mathbb{R}^N) \cap D(\Delta_1);\ \Delta f + Vf \in L^p(\mathbb{R}^N) \big\}.
\]

In particular, the closure $A_2$ of $\Delta + V : \Xi_2 \to L^2(\mathbb{R}^N)$ is a self-adjoint semi-bounded operator. We also show (in Section 5) that (1) implies that $V$ is form-bounded with respect to $-\Delta$ in $L^2$ with relative form-bound less than or equal to $\delta$ and prove that $A_2$ is nothing but the form-sum of $\Delta$ and $V$. We note that, a priori, $V$ is not $(L^p)$ $\Delta$-bounded for $p > 1$. We show that $C_c^\infty(\mathbb{R}^N)$ (the $C^\infty$ functions with compact supports) is a core for $A_1$; this result is known if the $(L^1)$ relative $\Delta$-bound of $V$ is $< 1$ [43]. Note that the potential $V$ being (only) locally integrable, $C_c^\infty(\mathbb{R}^N)$ is not a priori contained in the domain of $A_p$ for $p > 1$. The fact that $C_c^\infty(\mathbb{R}^N)$ is a core for $A_2$ if $V \in L^2_{\mathrm{loc}}(\mathbb{R}^N)$ and the $(L^1)$ relative $\Delta$-bound of $V$ is zero is a classical result by T. Kato [18]; this result was generalized later to any $p > 1$ and $V \in L^p_{\mathrm{loc}}(\mathbb{R}^N)$ such that the $(L^1)$ relative $\Delta$-bound of $V$ is $< 1$ [25]. In our general setting with $p > 1$, we show that if $V \in L^p_{\mathrm{loc}}(\mathbb{R}^N)$ then $C_c^\infty(\mathbb{R}^N)$ is a core for $A_p$ under the following regularity assumption: For all $g \in \mathcal{S}(\mathbb{R}^N)$ (the Schwartz class) the solution $f$ to the problem

\[
\lambda f - \Delta f - Vf = g, \qquad f \in D(A_1)
\]

(for large $\lambda$) has a gradient $\nabla f \in L^p(\mathbb{R}^N)$. (Note that this regularity assumption is always true in one dimension.) In particular, this assumption is satisfied for $p = 2$ since we show (in Section 5) by form-perturbation theory that $D(A_2) \subset W^{1,2}(\mathbb{R}^N)$. Since $D(A_1) \subset W^{1,1}(\mathbb{R}^N)$, it follows that if $V \in L^2_{\mathrm{loc}}(\mathbb{R}^N)$ then $D(A_p) \subset W^{1,p}(\mathbb{R}^N)$ for all $1 \le p \le 2$ and consequently $C_c^\infty(\mathbb{R}^N)$ is a core for $A_p$. However if $p > 2$ or if $1 < p < 2$ with $V \in L^p_{\mathrm{loc}}(\mathbb{R}^N) \setminus L^2_{\mathrm{loc}}(\mathbb{R}^N)$ then the above regularity hypothesis seems to require further assumptions; thus we show that this hypothesis is satisfied for $1 \le p < \frac{N}{N-2}$ if the potential is "smooth" in the following sense: for each $\varphi \in D(\Delta_1) \cap L^\infty(\mathbb{R}^N)$

\[
\sup_{|h|\le1,\ h\ne0} \Big\| \frac{V_h - V}{|h|}\,\varphi \Big\|_{L^1(\mathbb{R}^N)} < \infty,
\]

where $V_h : x \to V(x + h)$; the last condition itself is satisfied if for instance the potential $V$ belongs to $W^{1,r}(\mathbb{R}^N) + W^{1,s}(\mathbb{R}^N) + BV(\mathbb{R}^N)$ for some $r$ and $s$ such that $1 \le r, s \le \infty$ where $BV(\mathbb{R}^N)$ denotes the space of $L^1$ functions whose first order distributional derivatives are bounded measures (the role of decompositions like $W^{1,r}(\mathbb{R}^N) + W^{1,s}(\mathbb{R}^N)$ is to cover potentials whose local Sobolev regularity is different from their Sobolev regularity at infinity while the role of $BV(\mathbb{R}^N)$ is to take into account some discontinuous potentials). It is also possible to drop the condition $p < \frac{N}{N-2}$ by imposing suitable conditions on the potential; see Remark 32. Finally, in Section 6 we show that the semigroups $\{S_p(t);\ t \ge 0\}$ are holomorphic; this result is known if the relative bound of $V$ is small; see e.g. [21,23,37], [33, p. 253].



Some spectral properties of the Schrödinger semigroups considered here and in [29] will be given in a forthcoming paper. The author thanks C. Villani and F. Murat for helpful remarks on the $L^1$-Laplacian.

2. Desch's theorem

The starting point is a remarkable perturbation result by W. Desch [13] which already has relevant applications to neutron transport with singular cross-sections (see [27, Chapter 9]).

Theorem 1. (See [13, Desch's theorem].) Let $X$ be an ordered Banach space such that the norm is additive on the positive cone $X_+$. Let $T : D(T) \subset X \to X$ be the generator of a positive $c_0$-semigroup $\{U(t);\ t \ge 0\}$ on $X$. Let $B : D(T) \subset X \to X$ be a positive operator, i.e. $B : D(T) \cap X_+ \to X_+$, such that the resolvent $(\lambda - T - B)^{-1}$ exists for large $\lambda$ and is positive (i.e. leaves invariant the positive cone). Then $T + B : D(T) \to X$ generates a positive semigroup $\{V(t);\ t \ge 0\}$ on $X$.

The peculiarity of this result is that the mere existence (and positivity) of the resolvent of the perturbed operator $T + B$ is sufficient to assert that the latter is a generator. Desch's theorem [13] was given initially in AL-spaces, i.e. in Banach lattices with an additive norm on the positive cone. Actually, this theorem is true without the lattice assumption as it was pointed out in a remark of [4, p. 113]; a detailed proof of this is given in [28]. For more information on Desch's theorem and related topics we refer the reader to [27, Chapter 8] and [5, Chapter 5] where the following standard result can also be found.

Lemma 2. Let $X$ be an ordered Banach space with a generating and normal positive cone $X_+$. Let $T : D(T) \subset X \to X$ be a resolvent positive operator with spectral bound $s(T)$ and let $B : D(T) \subset X \to X$ be a positive operator (i.e. $B : D(T) \cap X_+ \to X_+$). Then, for $\lambda > s(T)$, the following assertions are equivalent:

(i) $r_\sigma[B(\lambda - T)^{-1}] < 1$.

(ii) $\lambda \in \rho(T + B)$ and $(\lambda - T - B)^{-1} \ge 0$.

If one of these conditions is satisfied then $T + B$ is resolvent positive, $\lambda > s(T + B)$ and

\[
(\lambda - T - B)^{-1} = (\lambda - T)^{-1} \sum_{j=0}^{\infty} \big[B(\lambda - T)^{-1}\big]^j.
\]
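The series in Lemma 2 can be seen at work in a finite-dimensional toy case. The sketch below is only an illustration under assumptions made here: $X = \mathbb{R}^2$ with the usual cone, and hypothetical $2 \times 2$ matrices `T` (nonnegative off-diagonal entries, so it generates a positive semigroup) and `B` (entrywise nonnegative). It compares the truncated series $(\lambda - T)^{-1}\sum_j [B(\lambda - T)^{-1}]^j$ with the directly computed inverse of $\lambda - T - B$.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def inverse2(m: Matrix) -> Matrix:
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def shifted(lam: float, m: Matrix) -> Matrix:
    """The matrix lam * I - m for a 2x2 matrix m."""
    return [[lam - m[0][0], -m[0][1]], [-m[1][0], lam - m[1][1]]]


def resolvent_by_series(T: Matrix, B: Matrix, lam: float, terms: int = 100) -> Matrix:
    """Truncation of (lam - T)^{-1} * sum_{j>=0} [B (lam - T)^{-1}]^j."""
    R = inverse2(shifted(lam, T))
    K = matmul(B, R)
    power = [[1.0, 0.0], [0.0, 1.0]]   # K^0
    total = [[1.0, 0.0], [0.0, 1.0]]   # partial sum, starting with the j = 0 term
    for _ in range(terms):
        power = matmul(power, K)
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return matmul(R, total)


# Hypothetical data chosen for this illustration only.
T = [[-1.0, 0.5], [0.2, -2.0]]
B = [[0.3, 0.1], [0.0, 0.2]]
LAM = 1.0

if __name__ == "__main__":
    print(resolvent_by_series(T, B, LAM))
    print(inverse2(shifted(LAM, [[T[i][j] + B[i][j] for j in range(2)] for i in range(2)])))
```

In this toy case $B(\lambda - T)^{-1}$ has small spectral radius, so the series converges quickly, and every term is entrywise nonnegative, which is how positivity of $(\lambda - T - B)^{-1}$ arises in (ii).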


2003

3. (Weak) compactness and generation We show here how (weak) compactness arguments provide useful generation results of perturbative type in ordered Banach spaces with additive norm on the positive cone. Before this, we give first a new result on the spectral radius of positive operators in general ordered Banach spaces which is needed in the sequel (this result is of course well known in Banach lattices). Let X be an ordered Banach space with norm and positive cone X+ . We assume that the positive cone is generating, i.e. X = X+ − X+ . We recall (see e.g. [6, Proposition 1.1.2]) that by a Baire category argument, there exists a constant γ > 0 such that each x ∈ X has a decomposition x = x1 − x2 ; x1 , x2 ∈ X+ with x1 , x2 γ x .

(3)

For each positive bounded linear operator C : X → X we define C + :=

sup

x 1, x∈X+

Cx .

We note that Cx C + x , ∀x ∈ X+ and C1 C2 + C1 + C2 + for positive operators C1 , C2 . Finally, C + C and this inequality might a priori be strict if X has not a lattice structure. Lemma 3. Let C be a positive bounded linear operator in an ordered Banach space with a generating cone. Then 1 1 lim C n +n = inf C n +n

(4)

rσ + (C) = rσ (C),

(5)

n→∞

n0

and

1

where rσ + (C) := limn→∞ C n +n . Moreover, if X+ is normal and if C1 and C2 are positive operators such that C1 C2 then rσ (C1 ) rσ (C2 ). Proof. The proof of (4) is the same as the standard one for a spectral radius (see e.g. [44, p. 212]) and is omitted. Let x ∈ X be arbitrary and let a decomposition x = x1 − x2 with x1 , x2 ∈ X+ satisfying (3), then Cx = Cx1 − Cx2 Cx1 + Cx2 C + x1 + C + x2 = C + x1 + x2 2γ C + x so C 2γ C + . Thus n 1 n 1 1 C n C n (2γ ) n1 C n n + +

2004


which ends the proof of (5) by letting $n \to \infty$. Finally, $X_+$ is normal if and only if there exists $\alpha \ge 1$ such that the norm is $\alpha$-monotone, i.e. for $x, y \in X_+$ with $x \le y$ we have $\|x\| \le \alpha \|y\|$ (see e.g. [32, Proposition A.2.2, p. 266]). Let $C_1 \le C_2$. Then $C_1^n x \le C_2^n x$, $\forall x \in X_+$, $\forall n$, and

$$\|C_1^n\|_+^{1/n} \le \alpha^{1/n} \|C_2^n\|_+^{1/n}, \quad \forall n,$$

so that $r_{\sigma+}(C_1) \le r_{\sigma+}(C_2)$ and consequently (5) ends the proof. □
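The content of Lemma 3 can be illustrated in the simplest setting, $X = \mathbb{R}^2$ with the $\ell^1$ norm and the cone of entrywise nonnegative vectors. This is only a sketch under assumptions that are not in the text: the matrices `C1`, `C2` and the exponent 60 are arbitrary choices, and since this space is a lattice, $\|C\|_+$ coincides with $\|C\|$ (the largest column sum). The code approximates $r_{\sigma+}(C)$ by $\|C^n\|_+^{1/n}$, compares it with the largest eigenvalue, and checks monotonicity for $C_1 \le C_2$.

```python
from math import sqrt

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm_plus(c):
    """||C||_+ on (R^d, l1): sup of ||Cx||_1 over x >= 0 with ||x||_1 <= 1.

    For an entrywise nonnegative matrix the sup is attained at a basis
    vector, so it equals the largest column sum.
    """
    n = len(c)
    return max(sum(c[i][j] for i in range(n)) for j in range(n))

def gelfand_plus(c, n):
    """Approximate r_{sigma+}(C) by ||C^n||_+^(1/n)."""
    power = c
    for _ in range(n - 1):
        power = matmul(power, c)
    return norm_plus(power) ** (1.0 / n)

# Illustrative matrices (assumptions, not taken from the paper).
C1 = [[1.0, 2.0], [3.0, 4.0]]
C2 = [[1.0, 2.5], [3.0, 4.5]]        # C1 <= C2 entrywise
rho_C1 = (5.0 + sqrt(33.0)) / 2.0    # largest root of x^2 - 5x - 2 = 0

approx_C1 = gelfand_plus(C1, 60)
approx_C2 = gelfand_plus(C2, 60)
print(approx_C1, rho_C1, approx_C2)
```

The convergence in (4) is slow (the error behaves like a constant raised to the power $1/n$), which is why the comparison with the eigenvalue is only approximate for moderate $n$.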

We note that, according to Lemma 3, we can replace $r_\sigma$ by $r_{\sigma+}$ in Lemma 2, so that (i) is satisfied once $\|B(\lambda - T)^{-1}\|_+ < 1$; this is useful when a lattice structure is lacking. We are now ready to show a basic result.

Theorem 4. Let $X$ be an ordered Banach space with a generating positive cone $X_+$ such that the norm is additive on $X_+$. Let $T : D(T) \subset X \to X$ be the generator of a positive $c_0$-semigroup on $X$ and let $B : D(T) \to X$ be a linear positive operator. We assume that there exist $\lambda > s(T)$ and an integer $n$ such that $[B(\lambda - T)^{-1}]^n$ is a compact operator. Then $T + B$ is a generator of a positive $c_0$-semigroup.

Proof. Note that an additive norm on $X_+$ is monotone, i.e. if $x, y \in X_+$ and $x \le y$ then $\|x\| \le \|y\|$. This shows that for $\bar\lambda \ge \lambda$

$$\big\|[B(\bar\lambda - T)^{-1}]^{n+1}\big\|_+ = \big\|B(\bar\lambda - T)^{-1}[B(\bar\lambda - T)^{-1}]^{n}\big\|_+ \le \big\|B(\bar\lambda - T)^{-1}[B(\lambda - T)^{-1}]^{n}\big\|_+ .$$

On the other hand,

$$\big\|B(\bar\lambda - T)^{-1}[B(\lambda - T)^{-1}]^{n}\big\|_+ \le \big\|B(\bar\lambda - T)^{-1}[B(\lambda - T)^{-1}]^{n}\big\| .$$

Note that the positivity of $B : D(T) \to X$ implies that $B(\lambda - T)^{-1}$ is positive and therefore bounded. Thus $B$ is $T$-bounded and

$$\big\|[B(\bar\lambda - T)^{-1}]^{n+1}\big\|_+ \le \|B\|_{L(D(T),X)} \big\|(\bar\lambda - T)^{-1}[B(\lambda - T)^{-1}]^{n}\big\|_{L(X,D(T))} .$$

Let $N(\cdot)$ be the $T$-graph norm. We note that $\|(\bar\lambda - T)^{-1}x\| \to 0$ as $\bar\lambda \to \infty$ and, for $x \in D(T)$, $T(\bar\lambda - T)^{-1}x = (\bar\lambda - T)^{-1}Tx \to 0$ as $\bar\lambda \to \infty$. Since $T(\bar\lambda - T)^{-1} = -I + \bar\lambda(\bar\lambda - T)^{-1}$ is uniformly bounded for large $\bar\lambda$ then, for all $x \in X$, $T(\bar\lambda - T)^{-1}x \to 0$ as $\bar\lambda \to \infty$. Thus

$$\forall x \in X, \quad N\big((\bar\lambda - T)^{-1}x\big) \to 0 \ \text{as}\ \bar\lambda \to \infty;$$

the convergence being uniform on compact subsets of $X$. Finally the compactness of $[B(\lambda - T)^{-1}]^n$ shows that

$$\big\|(\bar\lambda - T)^{-1}[B(\lambda - T)^{-1}]^{n}\big\|_{L(X,D(T))} \to 0 \quad \text{as}\ \bar\lambda \to \infty.$$

Thus $\|[B(\bar\lambda - T)^{-1}]^{n+1}\|_+ \to 0$ as $\bar\lambda \to \infty$ and consequently $r_{\sigma+}[B(\bar\lambda - T)^{-1}] < 1$ for $\bar\lambda$ large enough by Lemma 3. Hence $T + B$ is resolvent positive by Lemma 2 and consequently Desch's theorem (Theorem 1) ends the proof. □

Remark 5. Theorem 4 is known in AL spaces with $n = 1$ [34, p. 19].

We provide now an important consequence of Theorem 4 for AL spaces.

Theorem 6. Let $X$ be an AL space and $T : D(T) \subset X \to X$ be the generator of a positive $c_0$-semigroup on $X$. Let $B : D(T) \subset X \to X$ be a positive weakly compact operator where $D(T)$ is endowed with the graph norm. Then $T + B$ is a generator of a positive $c_0$-semigroup.

Proof. The weak compactness of $B : D(T) \subset X \to X$ amounts to the weak compactness of $B(\lambda - T)^{-1}$. Since the product of two weakly compact operators on an AL space is a compact operator [2, Corollary 19.9, p. 337], $[B(\lambda - T)^{-1}]^2$ is compact and Theorem 4 ends the proof. □

We also give a useful (and simple) improvement of Theorem 6.

Corollary 7. Let $X$ be an AL space and $T : D(T) \subset X \to X$ be the generator of a positive $c_0$-semigroup on $X$. Let $B_i : D(T) \subset X \to X$ ($i = 1, 2$) be two positive operators. We assume that $B_1 : D(T) \to X$ is weakly compact and $\lim_{\lambda\to\infty} \|B_2(\lambda - T)^{-1}\| < 1$. Then $T + B_1 + B_2$ is a generator of a positive $c_0$-semigroup.

Proof. According to Lemma 2, $r_\sigma[B_2(\lambda - T)^{-1}] < 1$ for large $\lambda$ so that $T + B_2 : D(T) \to X$ is resolvent positive. By Theorem 1, $T + B_2$ is a generator of a positive semigroup. Since $B_1$ is $T$-weakly compact, or equivalently $(T + B_2)$-weakly compact, Theorem 6 ends the proof. □

4. On Schrödinger operators with negative potentials

Let $\{H_p(t);\ t \ge 0\}$ be the heat semigroup on $L^p(\mathbb{R}^N)$ ($1 \le p \le \infty$),

$$H_p(t)f = \frac{1}{(4\pi t)^{N/2}} \int_{\mathbb{R}^N} e^{-\frac{|x-y|^2}{4t}} f(y)\,dy, \quad f \in L^p(\mathbb{R}^N).$$

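We denote by $\Delta_p$ its (Laplacian) generator with domain $D(\Delta_p) = \{\phi \in L^p(\mathbb{R}^N);\ \Delta\phi \in L^p(\mathbb{R}^N)\}$. The resolvent of the generator is given by

$$(\lambda - \Delta_p)^{-1} f = \int_0^{+\infty} e^{-\lambda t} H_p(t)f\,dt = \int_{\mathbb{R}^N} G_\lambda(x - y) f(y)\,dy,$$

where $G_\lambda$ is defined by $\hat{G}_\lambda(\zeta) = \dfrac{(2\pi)^{-N/2}}{\lambda + |\zeta|^2}$. It is clear that $G_\lambda(\cdot)$ is $C^\infty$ on $\mathbb{R}^N - \{0\}$ and $G_\lambda(z) > 0$ for $z \ne 0$.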
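As a sanity check on the two expressions for the resolvent, the Laplace transform of the heat kernel can be compared numerically with a closed form in dimension $N = 1$. The sketch below rests on a standard fact that is not stated in the text: for $N = 1$ the kernel is $G_\lambda(x) = e^{-\sqrt{\lambda}|x|}/(2\sqrt{\lambda})$. The function names, the quadrature rule, and its parameters are illustrative choices.

```python
from math import exp, pi, sqrt

def heat_kernel_1d(t, x):
    """Gaussian kernel of the heat semigroup on R (N = 1)."""
    return exp(-x * x / (4.0 * t)) / sqrt(4.0 * pi * t)

def resolvent_kernel_numeric(lam, x, s_max=12.0, steps=40000):
    """Midpoint rule for the integral over t > 0 of e^{-lam t} * heat kernel.

    The substitution t = s^2 removes the t^{-1/2} singularity at t = 0.
    """
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        t = s * s
        total += exp(-lam * t) * heat_kernel_1d(t, x) * 2.0 * s
    return total * h

def resolvent_kernel_closed(lam, x):
    """Closed form of G_lambda for N = 1 (assumed standard formula)."""
    return exp(-sqrt(lam) * abs(x)) / (2.0 * sqrt(lam))

print(resolvent_kernel_numeric(2.0, 0.7), resolvent_kernel_closed(2.0, 0.7))
```

Both values are positive, in line with $G_\lambda(z) > 0$; in one dimension $G_\lambda$ is even bounded at the origin, which is not the case for $N \ge 2$.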

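4.1. Generation results

Because of the positivity of $V$, it is an elementary fact that $V(\lambda - \Delta_1)^{-1}$ is a bounded operator on $L^1(\mathbb{R}^N)$ for some (or equivalently all) $\lambda > 0$, i.e. $V$ is $\Delta$-bounded in $L^1(\mathbb{R}^N)$, if and only if

$$x \mapsto \int_{\mathbb{R}^N} G_\lambda(x - y) V(y)\,dy \in L^\infty(\mathbb{R}^N). \tag{6}$$

In such a case

$$\big\|V(\lambda - \Delta_1)^{-1}\big\|_{L(L^1(\mathbb{R}^N))} = \sup_{x \in \mathbb{R}^N} \int_{\mathbb{R}^N} G_\lambda(x - y) V(y)\,dy.$$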
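The norm formula above is easy to evaluate in a toy case. The following sketch works in dimension $N = 1$ with a hypothetical bounded potential $V = 3 \cdot 1_{[-1,1]}$ (an assumption chosen only for illustration) and uses the one-dimensional kernel $G_\lambda(z) = e^{-\sqrt{\lambda}|z|}/(2\sqrt{\lambda})$. For this symmetric $V$ the supremum is attained at $x = 0$, where it equals $3(1 - e^{-\sqrt{\lambda}})/\lambda$, so the norm tends to zero as $\lambda \to \infty$.

```python
from math import exp, sqrt

def green_1d(lam, z):
    """Kernel of (lam - d^2/dx^2)^{-1} on R (the case N = 1)."""
    return exp(-sqrt(lam) * abs(z)) / (2.0 * sqrt(lam))

HEIGHT = 3.0  # hypothetical potential V = HEIGHT on [-1, 1], zero elsewhere

def smoothed_potential(lam, x, steps=2000):
    """Midpoint rule for the integral of G_lam(x - y) V(y) over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        y = -1.0 + (i + 0.5) * h
        total += green_1d(lam, x - y) * HEIGHT
    return total * h

def relative_norm(lam, grid=121):
    """Approximate the sup over x by a maximum over a grid on [-3, 3]."""
    xs = [-3.0 + 6.0 * k / (grid - 1) for k in range(grid)]
    return max(smoothed_potential(lam, x) for x in xs)

def closed_form(lam):
    """Value at x = 0, where the sup is attained for this symmetric V."""
    return HEIGHT * (1.0 - exp(-sqrt(lam))) / lam

norms = {lam: relative_norm(lam) for lam in (1.0, 4.0, 100.0)}
print(norms)
```

The computed norms decrease with $\lambda$, which is the behaviour one expects from a bounded potential; the potentials of interest in this section are those for which this decay fails.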

It is known [18,40] (see also [43, Proposition 5.1]) that (6) can be expressed in terms of a fundamental solution $g_N$ of the Laplacian on $\mathbb{R}^N$ by

$$x \mapsto \int_{|x-y| \le 1} |g_N(x - y)|\,V(y)\,dy \in L^\infty(\mathbb{R}^N). \tag{7}$$

In dimension $N = 1$, the class of potentials $V$ satisfying (7) is called the Kato class $K_1$. In dimension $N \ge 2$, the Kato class, denoted $K_N$, refers rather to the subclass of potentials such that

$$\lim_{\alpha \downarrow 0^+}\ \operatorname*{ess\,sup}_{x \in \mathbb{R}^N} \int_{|x-y| \le \alpha} |g_N(x - y)|\,V(y)\,dy = 0. \tag{8}$$

Throughout this paper, the potential $V$ is assumed to be $\Delta$-bounded in $L^1(\mathbb{R}^N)$, i.e. $V$ is assumed to satisfy (7). Then it is well known (as a consequence of (6) for instance) that $V \in L^1_{loc}(\mathbb{R}^N)$. We start with a basic observation:

Theorem 8. Let (7) be satisfied. Then $A_1 := \Delta_1 + V : D(\Delta_1) \to L^1(\mathbb{R}^N)$ is a generator of a positive semigroup $\{S_1(t);\ t \ge 0\}$ in $L^1(\mathbb{R}^N)$ if and only if

$$\lim_{\lambda \to +\infty} r_\sigma\big[V(\lambda - \Delta_1)^{-1}\big] < 1. \tag{9}$$
Proof. Note that $\lambda > 0 \mapsto r_\sigma[V(\lambda - \Delta_1)^{-1}]$ is nonincreasing and then the limit (9) exists. If $r_\sigma[V(\lambda - \Delta_1)^{-1}] < 1$ for large $\lambda$ then $(\lambda - \Delta_1 - V)^{-1}$ exists and is positive so that $A_1 := \Delta_1 + V : D(\Delta_1) \to L^1(\mathbb{R}^N)$ is a generator of a positive semigroup $\{S_1(t);\ t \ge 0\}$ by Desch's theorem. Conversely, if $\Delta_1 + V : D(\Delta_1) \to L^1(\mathbb{R}^N)$ is a generator of a positive semigroup then $(\lambda - \Delta_1 - V)^{-1}$ exists and is positive for large $\lambda$ and, by Lemma 2, $r_\sigma[V(\lambda - \Delta_1)^{-1}] < 1$ for large $\lambda$. □

Remark 9. One sees that for $\Delta$-bounded potentials in the $L^1$ sense, assumption (9) is optimal for the $L^1$ generation theory.

Remark 10. It is well known (see [1, Theorem 4.14], [40, Proposition A.2.3], [21, Lemma 11]) that the parameter

$$c_N(V) := \lim_{\lambda \to \infty} \big\|V(\lambda - \Delta_1)^{-1}\big\|_{L(L^1(\mathbb{R}^N))} \tag{10}$$

provides the relative bound of $V$ with respect to $\Delta_1$ and this limit can be "computed":

$$c_N(V) = \lim_{\alpha \downarrow 0^+}\ \operatorname*{ess\,sup}_{x \in \mathbb{R}^N} \int_{|x-y| \le \alpha} |g_N(x - y)|\,V(y)\,dy,$$

i.e. the class of potentials $V$ satisfying $c_N(V) = 0$ is the Kato class. The weaker assumption $c_N(V) < 1$ appears for instance in [14,24,43]. Actually, Theorem 8 suggests considering rather $\lim_{\lambda \to +\infty} r_\sigma[V(\lambda - \Delta_1)^{-1}]$ as the relevant parameter; see Section 5 below for more information.

Remark 11. Note that a priori $\|V_1(\lambda - \Delta_1)^{-1}\|_{L(L^1(\mathbb{R}^N))}$ need not go to 0 as $\lambda \to +\infty$ so that the potential $V_1$ does not belong a priori to the Kato class. Further, a priori we may have $c_N(V_1) > 1$ without preventing the generation property; see Remark 16. A similar phenomenon arises in neutron transport theory; see [27, Chapter 9].

Remark 12. We can of course add a bounded potential to the generator $\Delta_1 + V$ without changing the conclusions in Theorem 8. We can also add a negative term $\tilde{V} \in L^1_{loc}$ to $V$ without changing the conclusions in Theorem 8. Indeed, $\tilde{V}$ acts as an absorption term which decreases the resolvent, i.e.

$$\big(\lambda - (\Delta_1 + \tilde{V})\big)^{-1} \le (\lambda - \Delta_1)^{-1},$$

so that

$$r_\sigma\big[V\big(\lambda - (\Delta_1 + \tilde{V})\big)^{-1}\big] \le r_\sigma\big[V(\lambda - \Delta_1)^{-1}\big] < 1$$

for large $\lambda$. Actually, the meaning of $\Delta_1 + \tilde{V} + V$ is "$(\Delta_1 + \tilde{V})$" $+\,V$ on the domain of "$(\Delta_1 + \tilde{V})$", where "$(\Delta_1 + \tilde{V})$" is defined by a suitable truncation and monotonic limiting procedure. We refer to [3,43] and references therein for a systematic treatment of absorption semigroups.

Theorems 13, 15 and 17 below provide concrete realizations of Theorems 4 and 8.

Theorem 13. Let (7) be satisfied. We assume that there exists an integer $k$ such that $[V(\lambda - \Delta_1)^{-1}]^k$ is a compact operator. Then $A_1 := \Delta_1 + V : D(\Delta_1) \to L^1(\mathbb{R}^N)$ is a generator of a positive semigroup $\{S_1(t);\ t \ge 0\}$ in $L^1(\mathbb{R}^N)$.



Proof. This is simply a particular case of (the abstract) Theorem 4. □
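The mechanism behind Theorems 4, 8 and 13 is that a spectral radius condition $r_\sigma[B(\lambda - T)^{-1}] < 1$ makes the perturbed resolvent positive through a Neumann series. A finite-dimensional sketch of this mechanism follows; it is not part of the original argument, and the $2 \times 2$ matrices `T`, `B` and the value `lam = 1.0` are illustrative assumptions. It checks that $(\lambda - T)^{-1} \sum_j [B(\lambda - T)^{-1}]^j$ agrees with $(\lambda - T - B)^{-1}$ and that the result is entrywise positive.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def inverse2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def shifted(lam, m):
    """Return lam * I - m."""
    return [[lam - m[0][0], -m[0][1]], [-m[1][0], lam - m[1][1]]]

T = [[-2.0, 1.0], [1.0, -3.0]]   # off-diagonal entries >= 0: exp(tT) is positive
B = [[0.5, 0.2], [0.1, 0.4]]     # positive perturbation
lam = 1.0

R = inverse2(shifted(lam, T))    # (lam - T)^{-1}
K = matmul2(B, R)                # B (lam - T)^{-1}

# Partial sums of sum_j K^j; the series converges because r(K) < 1 here.
series = [[1.0, 0.0], [0.0, 1.0]]
term = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(80):
    term = matmul2(term, K)
    series = [[series[i][j] + term[i][j] for j in range(2)] for i in range(2)]

neumann = matmul2(R, series)     # (lam - T)^{-1} sum_j K^j
TB = [[T[i][j] + B[i][j] for j in range(2)] for i in range(2)]
direct = inverse2(shifted(lam, TB))   # (lam - T - B)^{-1}
print(neumann, direct)
```

In infinite dimensions the difficulty is of course to verify the spectral radius condition, which is what the compactness assumptions of Theorems 4 and 13 are for.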

Remark 14. Note that for $k = 1$ we find again a known result corresponding to $c_N(V) = 0$; see e.g. [40, Proposition A.2.3].

Theorem 15. Let $V = V_1 + V_2$ with $V_1, V_2 \ge 0$. Let (7) be satisfied and $c_N(V_2) < 1$. If $V_1$ is such that the (bounded) subset of $L^1(\mathbb{R}^N)$

$$\big\{G_1(x - \cdot)V_1(\cdot);\ x \in \mathbb{R}^N\big\}$$

is equi-integrable then $A_1 := \Delta_1 + V : D(\Delta_1) \to L^1(\mathbb{R}^N)$ is a generator of a positive semigroup $\{S_1(t);\ t \ge 0\}$ in $L^1(\mathbb{R}^N)$.

Proof. Note that $c_N(V_2) < 1$ amounts to $\|V_2(\lambda - \Delta_1)^{-1}\|_{L(L^1(\mathbb{R}^N))} < 1$ for large $\lambda$. On the other hand, according to Corollary 7, it suffices to show that $V_1(\lambda - \Delta_1)^{-1} : L^1(\mathbb{R}^N) \to L^1(\mathbb{R}^N)$ is weakly compact, i.e.

$$\big\{V_1(\lambda - \Delta_1)^{-1}f;\ \|f\|_{L^1(\mathbb{R}^N)} \le 1\big\}$$

is equi-integrable. According to the general criteria of equi-integrability (see e.g. [15]), this is equivalent to

$$\int_\Omega V_1(x)\,\big|(\lambda - \Delta_1)^{-1}f(x)\big|\,dx \to 0 \quad \text{as}\ |\Omega| \to 0$$

($|\Omega|$ is the Lebesgue measure of $\Omega$) uniformly in $f$ in the unit ball of $L^1(\mathbb{R}^N)$ and

$$\int_{E_j^c} V_1(x)\,\big|(\lambda - \Delta_1)^{-1}f(x)\big|\,dx \to 0$$

uniformly in $f$ in the unit ball of $L^1(\mathbb{R}^N)$ as $j \to \infty$ for any increasing sequence $\{E_j\}$ of measurable subsets of $\mathbb{R}^N$ (with finite measure) such that $\bigcup_j E_j = \mathbb{R}^N$. Actually, we can restrict ourselves to nonnegative $f$. Thus, the estimate

$$\int_\Omega V_1(x)\,(\lambda - \Delta_1)^{-1}f(x)\,dx = \int_{\mathbb{R}^N} f(y) \left(\int_\Omega G_\lambda(x - y)V_1(x)\,dx\right) dy$$

shows, for $\lambda = 1$, that

$$\sup_{f \in L^1_+(\mathbb{R}^N),\ \|f\|_{L^1} \le 1}\ \int_\Omega V_1(x)\,(1 - \Delta_1)^{-1}f(x)\,dx = \operatorname*{ess\,sup}_{y \in \mathbb{R}^N} \int_\Omega G_1(x - y)V_1(x)\,dx.$$

Similarly

$$\sup_{f \in L^1_+(\mathbb{R}^N),\ \|f\|_{L^1} \le 1}\ \int_{E_j^c} V_1(x)\,(1 - \Delta_1)^{-1}f(x)\,dx = \operatorname*{ess\,sup}_{y \in \mathbb{R}^N} \int_{E_j^c} G_1(x - y)V_1(x)\,dx$$

and consequently the weak compactness of $V_1(\lambda - \Delta_1)^{-1}$ is equivalent to our equi-integrability assumption. □

Remark 16. We note that the size of the set $\{G_1(x - \cdot)V_1(\cdot);\ x \in \mathbb{R}^N\}$ is nothing but

$$\sup_{x \in \mathbb{R}^N} \int G_1(x - y)V_1(y)\,dy = \big\|V_1(1 - \Delta_1)^{-1}\big\|_{L(L^1(\mathbb{R}^N))}.$$
Thus, Theorem 15 shows that under an equi-integrability assumption, the size of the set $\{G_1(x - \cdot)V_1(\cdot);\ x \in \mathbb{R}^N\}$ is irrelevant and, a priori, $c_N(V_1)$ need not be small.

We give a slightly different version of Theorem 15 which separates the local role of $V_1$ from its role at infinity.

Theorem 17. Let $V = V_1 + V_2$ ($V_1, V_2 \ge 0$) satisfy (7) and let $c_N(V_2) < 1$. Let

$$\int_{|x| \ge c} G_1(x - y)V_1(x)\,dx \to 0 \quad \text{as}\ c \to \infty \tag{11}$$

uniformly in $y \in \mathbb{R}^N$. We assume that

$$V_1(1 - \Delta_1)^{-1} : L^1(\mathbb{R}^N) \to L^1_{loc}(\mathbb{R}^N) \ \text{is weakly compact.} \tag{12}$$

Then $A_1 := \Delta_1 + V : D(\Delta_1) \to L^1(\mathbb{R}^N)$ is a generator of a positive semigroup $\{S_1(t);\ t \ge 0\}$ in $L^1(\mathbb{R}^N)$. Moreover, (11) is satisfied if, for $c$ large enough, the function

$$F_c : y \in \mathbb{R}^N \mapsto \int_{|x| \ge c} G_1(x - y)V_1(x)\,dx$$

is continuous and goes to zero as $|y| \to \infty$.

Proof. We have

$$\int_{|x| \ge c} \big|V_1(1 - \Delta_1)^{-1}f\big|\,dx \le \int_{|x| \ge c} V_1(x) \int_{\mathbb{R}^N} G_1(x - y)\,|f(y)|\,dy\,dx = \int_{\mathbb{R}^N} |f(y)| \left(\int_{|x| \ge c} G_1(x - y)V_1(x)\,dx\right) dy \le \sup_{y \in \mathbb{R}^N} \int_{|x| \ge c} G_1(x - y)V_1(x)\,dx\ \|f\|_{L^1}$$

so that

$$\int_{|x| \ge c} \big|V_1(1 - \Delta_1)^{-1}f\big|\,dx \to 0 \quad \text{as}\ c \to \infty$$

uniformly in $\|f\|_{L^1} \le 1$. This shows that χ{|x| 0 such that $F_{c_0}(y) \le \varepsilon$ for $|y| > \gamma$ and consequently $F_{c_j}(y) \le F_{c_0}(y) \le \varepsilon$ for $|y| > \gamma$ for all $j$. This shows that $F_{c_j}(y) \to 0$ uniformly in $y \in \mathbb{R}^N$ as $j \to \infty$. □

Remark 18. Assumption (11) expresses that the potential $V_1$ is "small at infinity" in some averaged sense:

(i) For instance, for $N \ge 2$, (11) is satisfied if $V_1 \in L^q(B^{ext})$ for some $q \in\ ]\frac{N}{2}, \infty[$, where $B^{ext}$ is the exterior of $B := \{x;\ |x| \le c\}$ (or if $V_1(x) \to 0$ as $|x| \to \infty$). Indeed, define $V_c$ by: $V_c = 0$ on $B$ and $V_c = V_1$ on $B^{ext}$. Then $F_c = G_\lambda * V_c \in C_0(\mathbb{R}^N)$ since $V_c \in L^q(\mathbb{R}^N)$ for some $q \in\ ]\frac{N}{2}, \infty[$ and $G_\lambda(\cdot) \in L^s(\mathbb{R}^N)$ for all $s \in [1, \frac{N}{N-2}[$ (see [45, p. 65]), so that we can choose $s = q^*$, the conjugate exponent of $q$. On the other hand, if $V_1(x) \to 0$ as $|x| \to \infty$, one sees directly that

$$\int_{|x| \ge c} G_1(x - y)V_1(x)\,dx \le \|G_1\|_{L^1} \sup_{|x| \ge c} V_1(x) \to 0 \quad \text{as}\ c \to \infty.$$

(ii) Assumption (11) can also be checked by noting that $G_\lambda * V_c \in C_0(\mathbb{R}^N)$ if $V_c$ is a tempered distribution such that $\hat{G}_\lambda(\zeta)\hat{V}_c(\zeta) \in L^1(\mathbb{R}^N)$, i.e. $\dfrac{\hat{V}_c(\zeta)}{\lambda + |\zeta|^2} \in L^1(\mathbb{R}^N)$. This condition on the potential is a priori different from that given in (i).

(iii) We note that in dimension $N = 1$, (12) is always satisfied; actually $V(1 - \Delta_1)^{-1} : L^1(\mathbb{R}) \to L^1_{loc}(\mathbb{R})$ is compact because $(1 - \Delta_1)^{-1}$ maps continuously $L^1(\mathbb{R})$ into $W^{2,1}(\mathbb{R})$, the restriction to a bounded interval $[a, b] \subset \mathbb{R}$ of a bounded set of $W^{2,1}(\mathbb{R})$ is relatively compact in $C([a, b])$ (equipped with the supremum norm) and $V \in L^1_{loc}(\mathbb{R})$. Thus, for $N = 1$, $V$ is $\Delta_1$-compact if (11) is satisfied, e.g. if there exists $q \in [1, +\infty[$ such that $(V)^q$ is integrable at infinity, or if $V(x) \to 0$ as $|x| \to \infty$.

We show now how to check assumption (12) in dimension $N \ge 2$.


Proposition 19. We assume that for any ball $B$ centered at zero and any $c > 0$

$$\lim_{|\Omega| \to 0,\ \Omega \subset B}\ \sup_{|x| \le c} \int_\Omega |g_N(x - y)|\,V_1(y)\,dy = 0, \tag{13}$$

where $|\Omega|$ is the Lebesgue measure of $\Omega$. Then $V_1(1 - \Delta_1)^{-1} : L^1(\mathbb{R}^N) \to L^1_{loc}(\mathbb{R}^N)$ is weakly compact.

Proof. Let $b > 0$ be a constant and $B$ be the ball with radius $b$ centered at the origin. Let us show that $V_1(1 - \Delta_1)^{-1} : L^1(\mathbb{R}^N) \to L^1(B)$ is weakly compact. By the Dunford–Pettis criterion, we have to check that

V1 (1 − 1 )−1 f dx → 0 Ω

as |Ω| → 0 (Ω ⊂ B) uniformly in f L1 (R N ) 1. Let δ > 0 be fixed. We note that

V1 (1 − 1 )−1 f dx

Ω

V1 (x) G1 (x − y) f (y) dy dx

Ω

RN

f (y)

= RN

G1 (x − y)V1 (x) dx dy

Ω

f (y)

= |y|b+δ

G1 (x − y)V1 (x) dx dy

Ω

f (y)

+

|y| 0 such that G1 (x − y) c |gN (x − y)|, ∀x, y so that



f (y)

|y| 0 such that (λ − 1 − V )−1 g

Lp

c g Lp ,

g ∈ L1 R N ∩ L p R N .

Let $f \in D(A_p)$. There exists $g \in L^p(\mathbb{R}^N)$ such that $f = (\lambda - A_p)^{-1}g$. For any sequence $(g_k) \subset L^1(\mathbb{R}^N) \cap L^p(\mathbb{R}^N)$ such that $g_k \to g$ in $L^p(\mathbb{R}^N)$ we have $f_k := (\lambda - \Delta_1 - V)^{-1}g_k \in L^p(\mathbb{R}^N) \cap D(\Delta_1)$ and

$$f_k \to f := (\lambda - A_p)^{-1}g \quad \text{in}\ L^p(\mathbb{R}^N).$$

Thus

$$\lambda f_k - \Delta_1 f_k - V f_k \to \lambda f - A_p f \quad \text{in}\ L^p(\mathbb{R}^N),$$

$\Delta_1 f_k + V f_k \in L^p(\mathbb{R}^N)$ and $\Delta_1 f_k + V f_k \to A_p f$ in $L^p(\mathbb{R}^N)$, i.e. $(f_k) \subset \Xi$ and $(f_k, \Delta_1 f_k + V f_k) \to (f, A_p f)$ in $L^p(\mathbb{R}^N) \times L^p(\mathbb{R}^N)$. □

Corollary 24. Let the conditions in Theorem 8 be satisfied. Then the closure of $\Delta + V : \Xi_2 \to L^2(\mathbb{R}^N)$ is self-adjoint and semi-bounded.

Proof. We know that $A_2$, the closure of $\Delta + V : \Xi_2 \to L^2(\mathbb{R}^N)$, generates a self-adjoint semigroup. □

Remark 25. Theorem 8 could be stated in abstract $L^1(\mu)$ spaces (with a $\sigma$-finite measure $\mu$) where the heat semigroup is replaced by any positive semigroup $\{\tilde{H}_1(t);\ t \ge 0\}$ having the symmetry property

$$\int \big(\tilde{H}_1(t)g_1\big)\,g_2\,dx = \int g_1\,\tilde{H}_1(t)g_2\,dx, \quad \forall g_1, g_2 \in L^1(\mu) \cap L^\infty(\mu).$$

Then the perturbed semigroup $\{\tilde{S}_1(t);\ t \ge 0\}$ will inherit this symmetry and interpolate on $L^p(\mu)$ spaces. We can then derive an abstract version of Theorem 23. For instance, we could replace the Laplacian operator by a general symmetric second order elliptic operator with variable coefficients and with suitable boundary conditions in a domain $\Omega \subset \mathbb{R}^N$.

4.2. On the domain of the generator

This subsection provides some additional information related to whether $C_c^\infty(\mathbb{R}^N)$ is a core for $A_p$. (Other results are given in Section 5 below.) We start with the case $p = 1$.

Theorem 26. Under the general assumptions of Theorem 8, $C_c^\infty(\mathbb{R}^N)$ is a core for $A_1$.

Proof. Note that $D(A_1) = D(\Delta_1)$ and $D(\Delta_1) \subset W^{1,1}(\mathbb{R}^N)$ (see e.g. [45, p. 65]). Let $f \in D(A_1)$; then $Vf \in L^1(\mathbb{R}^N)$ since $V$ is $\Delta_1$-bounded. Let $\varphi \in C_c^\infty(\mathbb{R}^N)$ with $\varphi(0) = 1$ and let $\varphi_n(x) = \varphi(\frac{x}{n})$. Then $f_n := f\varphi_n \in D(A_1)$, $f_n \to f$ in $L^1(\mathbb{R}^N)$, $Vf_n = (Vf)\varphi_n \to Vf$ in $L^1(\mathbb{R}^N)$ and

$$\Delta f_n = (\Delta\varphi_n)f + 2\nabla\varphi_n \cdot \nabla f + (\Delta f)\varphi_n \to \Delta f \quad \text{in}\ L^1(\mathbb{R}^N).$$
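The convergence of the three terms in the last display can be spelled out as follows. This is a sketch of the routine estimates, written under the assumptions already made in the proof ($f \in D(\Delta_1) \subset W^{1,1}(\mathbb{R}^N)$, cutoff $\varphi_n(x) = \varphi(x/n)$ with $\varphi(0) = 1$); it uses only the scalings $\nabla\varphi_n(x) = n^{-1}(\nabla\varphi)(x/n)$ and $\Delta\varphi_n(x) = n^{-2}(\Delta\varphi)(x/n)$.

```latex
\begin{align*}
\|(\Delta\varphi_n) f\|_{L^1}
  &\le n^{-2}\,\|\Delta\varphi\|_{L^\infty}\,\|f\|_{L^1} \longrightarrow 0,\\
\|2\,\nabla\varphi_n\cdot\nabla f\|_{L^1}
  &\le 2\,n^{-1}\,\|\nabla\varphi\|_{L^\infty}\,\|\nabla f\|_{L^1} \longrightarrow 0,\\
\|(\Delta f)\varphi_n - \Delta f\|_{L^1}
  &= \int_{\mathbb{R}^N} |\Delta f(x)|\,|\varphi_n(x) - 1|\,dx \longrightarrow 0 .
\end{align*}
```

The last limit follows from dominated convergence, since $\varphi_n(x) \to \varphi(0) = 1$ pointwise and $|\varphi_n - 1| \le 1 + \|\varphi\|_{L^\infty}$.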

Then the elements of $D(A_1)$ with compact supports form a core of $A_1$. Now, let $f \in D(A_1)$ with compact support and let $f_n := f * g_n$ where $(g_n)_n$ is a standard mollifier sequence. Then $f_n \in C_c^\infty(\mathbb{R}^N)$, $f_n \to f$ in $L^1(\mathbb{R}^N)$ and

$$\Delta f_n = \Delta(f * g_n) = (\Delta f) * g_n \to \Delta f \quad \text{in}\ L^1(\mathbb{R}^N)$$

so that $f_n \to f$ in the $\Delta_1$-graph norm; it follows that $Vf_n \to Vf$ in $L^1(\mathbb{R}^N)$ since $V$ is $\Delta_1$-bounded. Thus $f_n \in C_c^\infty(\mathbb{R}^N)$, $f_n \to f$ in $L^1(\mathbb{R}^N)$ and $A_1 f_n \to A_1 f$ in $L^1(\mathbb{R}^N)$. □

Remark 27. The proof above is partly inspired by similar ideas scattered in the literature (e.g. [40, Theorem B.1.5] or [43, Theorem 7]). Thus, Theorem 26 is given in [43] under the assumption $c_N(V) < 1$ and its proof also uses other technical results from [14] which are unnecessary here.

According to a classical result by T. Kato [18], $C_c^\infty(\mathbb{R}^N)$ is a core for $A_2$ if $V \in L^2_{loc}(\mathbb{R}^N)$ and $c_N(V) = 0$. Actually, it was shown later [25] that for $p > 1$ and $V \in L^p_{loc}(\mathbb{R}^N)$, $C_c^\infty(\mathbb{R}^N)$ is a core for $A_p$ provided that $c_N(V) < 1$. We treat here our general case differently, under a technical assumption (which is always true for $p = 2$; see Section 5) that we discuss in Corollary 31 below for general $p$. The general strategy of the proof is similar to that of Theorem 26 but combines additional technical arguments.

Theorem 28. Let the general assumptions of Theorem 8 be satisfied. Let $p > 1$ and $V \in L^p_{loc}(\mathbb{R}^N)$. We assume that for each $g \in S(\mathbb{R}^N)$ (the Schwartz space) the solution $f$ to the problem

$$\lambda f - \Delta f - Vf = g, \quad f \in D(A_1) \tag{18}$$

(which exists for $\lambda$ large enough in all $L^q$ spaces) has a gradient $\nabla f \in L^p(\mathbb{R}^N)$. Then $C_c^\infty(\mathbb{R}^N)$ is a core for $A_p$.

Proof. We observe first that since $g$ is bounded then the solution $f$ in (18) is also bounded. Indeed, this is a consequence of the fact, noted in the proof of Theorem 23, that $[(\lambda - \Delta_1 - V)']^{-1}$ in $L^\infty(\mathbb{R}^N)$ coincides with $(\lambda - \Delta_1 - V)^{-1}$ on $L^1(\mathbb{R}^N) \cap L^\infty(\mathbb{R}^N)$. Let $\lambda > 0$ be large enough, i.e. $\lambda$ such that $r_\sigma[V(\lambda - \Delta_1)^{-1}] < 1$. We already know that

$$\Xi_p := \big\{f \in L^p(\mathbb{R}^N) \cap D(\Delta_1);\ \Delta f + Vf \in L^p(\mathbb{R}^N)\big\}$$

is a core of $A_p$. Actually, by inspecting the proof of Theorem 23, one sees that the sequence $(g_k) \subset L^1(\mathbb{R}^N) \cap L^p(\mathbb{R}^N)$ which approximates $g$ in $L^p(\mathbb{R}^N)$ is arbitrary. Therefore we can choose $(g_k)$ in the Schwartz space $S(\mathbb{R}^N)$ so that (according to our extra assumption) we may replace $\Xi_p$ by another core

$$\tilde{\Xi}_p := \big\{f \in L^\infty(\mathbb{R}^N) \cap W^{1,p}(\mathbb{R}^N) \cap D(\Delta_1);\ \Delta f + Vf \in L^p(\mathbb{R}^N)\big\}.$$

Let $f \in \tilde{\Xi}_p$. Let $\varphi \in C_c^\infty(\mathbb{R}^N)$ with $\varphi(0) = 1$, $\varphi_n(x) = \varphi(\frac{x}{n})$ and $f_n := f\varphi_n$. Clearly $f_n \in L^p(\mathbb{R}^N)$ and $f_n \to f$ in $L^p(\mathbb{R}^N)$. Moreover (see the proof of Theorem 26) $f_n \in D(\Delta_1)$. On the other hand

$$\Delta f_n + Vf_n = (\Delta\varphi_n)f + 2\nabla\varphi_n \cdot \nabla f + (\Delta f)\varphi_n + Vf_n = (\Delta\varphi_n)f + 2\nabla\varphi_n \cdot \nabla f + [\Delta f + Vf]\varphi_n \in L^p(\mathbb{R}^N)$$

because $\nabla f \in [L^p(\mathbb{R}^N)]^N$ and $\Delta f + Vf \in L^p(\mathbb{R}^N)$. Thus $(f_n) \subset \tilde{\Xi}_p$. Moreover

$$\Delta f_n + Vf_n = (\Delta\varphi_n)f + 2\nabla\varphi_n \cdot \nabla f + [\Delta f + Vf]\varphi_n \to \Delta f + Vf \quad \text{in}\ L^p(\mathbb{R}^N).$$

Hence the elements of $\tilde{\Xi}_p$ with compact supports form a core for $A_p$. Let now $f \in \tilde{\Xi}_p$ with a compact support and let $f_n := f * g_n$ where $(g_n)_n$ is a standard mollifier sequence. Then $f_n \in C_c^\infty(\mathbb{R}^N)$, $f_n \to f$ in $L^p(\mathbb{R}^N)$. We have also

$$\Delta f_n + Vf_n = (\Delta f) * g_n + V(f * g_n) = [(\Delta f) + Vf] * g_n + V(f * g_n) - (Vf) * g_n.$$

Note that $(\Delta f) + Vf \in L^p(\mathbb{R}^N)$ and then $[(\Delta f) + Vf] * g_n \to (\Delta f) + Vf$ in $L^p(\mathbb{R}^N)$. On the other hand

$$V(f * g_n) - (Vf) * g_n = V[f * g_n - f] + Vf - (Vf) * g_n.$$

Note that $f \in L^\infty(\mathbb{R}^N)$ and $f$ is compactly supported so that $Vf \in L^p(\mathbb{R}^N)$ because $V \in L^p_{loc}(\mathbb{R}^N)$, and then $Vf - (Vf) * g_n \to 0$ in $L^p(\mathbb{R}^N)$. On the other hand, since $f$ is bounded then $f * g_n - f$ is uniformly bounded too and its support is included in a bounded set independent of $n$. Moreover, since $f * g_n \to f$ in all $L^q$ spaces with $q < \infty$ then, by extracting a subsequence if necessary, we can assume that $f * g_n - f \to 0$ a.e. and then, by the dominated convergence theorem, $V[f * g_n - f] \to 0$ in $L^p(\mathbb{R}^N)$. Finally

$$\Delta f_n + Vf_n \to (\Delta f) + Vf \quad \text{in}\ L^p(\mathbb{R}^N)$$

and we are done. □

Remark 29. Note that, in dimension $N = 1$, the solution $f$ to (18) belongs to $W^{2,1}(\mathbb{R})$ so that the assumption concerning (18) is automatically satisfied.

Remark 30. We note that it is known that if $c_N(V) < 1$ and $p = 2$ then $V$ is form bounded with relative bound $< 1$ and $D(A_2) \subset W^{1,2}(\mathbb{R}^N)$ (see e.g. [20, Lemma 4.8a, p. 350], [23], [14, Lemma 4] and [40, (2), p. 459]). A more general result is given in Theorem 39 below.

We give now sufficient (smoothness) conditions on the potential to check the main assumption in Theorem 28. (Such conditions are unnecessary if $1 < p \le 2$ and $V \in L^2_{loc}(\mathbb{R}^N)$; see Corollary 40.)



Corollary 31. Let the general assumptions of Theorem 8 be satisfied and let N 2. We assume that for each ϕ ∈ D(1 ) ∩ L∞ (R N ) Vh − V ϕ sup 1 N 1, this last estimate characterizes the mem1,p N bership of f to W (R ) (see e.g. [7, Proposition IX.3, p. 153]). Finally, we note that if (for instance) V ∈ W 1,r (R N ) for some r ∈ [1, ∞] then

Vh − V |h| |h|1, h=0 sup

N2 with r 2 then Theorem 33(ii) applies and we find again a classical result [16, Theorem 5.1]. 5. Connection with form-perturbation theory In this section we provide connections between our formalism and standard form-perturbation theory. Theorem 35. Let δ := limλ→+∞ rσ [V (λ − 1 )−1 ] < 1 be satisfied and let V ∈ L2loc (R N ). Then V is form-bounded with respect to − in L2 (R N ) with relative form-bound δ. In particular, if limλ→+∞ rσ [V (λ − 1 )−1 ] = 0 then the relative form-bound of V with respect to − is equal to zero. Proof. Let c be such that 1 < c
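0. Note first that $\lambda\phi_n - \Delta\phi_n - V\phi_n = (\lambda - A_2)\phi_n$ implies

$$\lambda \int \phi_n \varphi + \int \nabla\phi_n \cdot \nabla\varphi - \int V\phi_n \varphi = \int (\lambda - A_2)\phi_n\,\varphi, \quad \varphi \in W^{1,2}(\mathbb{R}^N). \tag{21}$$

On the other hand, according to (20),

$$\int V\phi_n^2\,dx \le \alpha \int |\nabla\phi_n|^2\,dx + \alpha s_{\frac{1}{\alpha}} \|\phi_n\|^2, \quad \delta < \alpha < 1,$$

so that

$$\big(\lambda - \alpha s_{\frac{1}{\alpha}}\big)\|\phi_n\|^2 + (1 - \alpha) \int |\nabla\phi_n|^2\,dx \le \big((\lambda - A_2)\phi_n, \phi_n\big)$$

and the choice $\lambda > \alpha s_{\frac{1}{\alpha}}$ shows that $\{\nabla\phi_n\}_n$ is bounded in $L^2(\mathbb{R}^N)$ and then standard arguments show that $\phi \in W^{1,2}(\mathbb{R}^N)$. Taking a subsequence if necessary, we can pass to the limit in (21) and obtain

$$-\int \nabla\phi \cdot \nabla\varphi + \int V\phi\varphi = (A_2\phi, \varphi), \quad \phi \in D(A_2),\ \varphi \in W^{1,2}(\mathbb{R}^N),$$

which characterizes the form-sum of $\Delta$ and $V$. □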

Corollary 40. Let $V \in L^2_{loc}(\mathbb{R}^N)$. Then $D(A_p) \subset W^{1,p}(\mathbb{R}^N)$ and $C_c^\infty(\mathbb{R}^N)$ is a core for $A_p$ for all $p \in [1, 2]$.

Proof. We already know this result for $p = 1$ (Theorem 26) and for $p = 2$ (Lemmas 37 and 38). The bounded operators $(\lambda - A_1)^{-1}$ and $(\lambda - A_2)^{-1}$ (respectively in $L^1(\mathbb{R}^N)$ and $L^2(\mathbb{R}^N)$) coincide on $L^1(\mathbb{R}^N) \cap L^2(\mathbb{R}^N)$ and, for all $1 \le j \le N$, the bounded operators $\partial_j(\lambda - A_1)^{-1}$ and $\partial_j(\lambda - A_2)^{-1}$ (respectively in $L^1(\mathbb{R}^N)$ and $L^2(\mathbb{R}^N)$) coincide on $L^1(\mathbb{R}^N) \cap L^2(\mathbb{R}^N)$, where $\partial_j := \frac{\partial}{\partial x_j}$. Then the Riesz–Thorin interpolation theorem shows that $\partial_j(\lambda - A_1)^{-1}$ interpolates to $L^p(\mathbb{R}^N)$ spaces with $p \in [1, 2]$, showing thus that $D(A_p) \subset W^{1,p}(\mathbb{R}^N)$; finally Theorem 28 ends the proof. □



p

Remark 41. If p > 2 or if 1 < p < 2 and V ∈ Lloc (R N )\L2loc (R N ), Corollary 40 above does not apply a priori but we can still prove that Cc∞ (R N ) is a core for Ap under an additional “smoothness” assumption on V (see Corollary 31). We end this section with a remark and a conjecture. It is known (see [26, Theorem 3.2]) that if δ 2 is the conjugate exponent of c1 (β)) with a suitable estimate depending on the choice of β. Thus, the case p = 1 seems to be out of reach of this method even if δ = 0. Suppose now that additionally V is -bounded in L1 (R N ) and that we can show that the resolvent (λ − ( V ))−1 interpolates on Lp (R N ) with p ∈ [1, 2] in such a way that it acts in L1 (R N ) as (λ − 1 − V )−1 (i.e. (λ − ( V ))−1 maps L1 (R N ) into D(1 )). Then, by Lemma 2, we can assert that limλ→∞ rσ [V (λ − 1 )−1 ] < 1. If δ < 1 (note that Vc := cV this argument works for all potentials Vc := cV where c is such that c is still form-small with respect to − in L2 (R N )) then c lim rσ V (λ − 1 )−1 < 1 λ→∞

for all c such that c δ < 1 and therefore δ δ. This formal observation (combined to Theorem 35) suggests a plausible conjecture. Conjecture 42. Let V ∈ L2loc (R N ) be nonnegative and -bounded in L1 (R N ). Then V is formsmall with respect to − in L2 (R N ) (i.e. δ < 1) if and only if limλ→+∞ rσ [V (λ − 1 )−1 ] < 1. Our conjecture is somewhat “corroborated” by the following result given by E.B. Davies and A.M. Hinz [11] (the authors note that the basic idea of this result goes back to [1,40,42]): Let H0 be a nonnegative self-adjoint operator in L2 (R N ) such that e−tH0 is an integral operator with a “heat kernel” bound (e.g. H0 = −); let the quadratic form bound V εH0 + β(ε),

∀ε > 0,

(s) := hold, i.e. V is form-bounded with respect to H0 with relative form-bound zero, and let β infs>0 {εs + β(ε)}. If

+∞ β (s) ds < ∞ s2 c

($c > 0$ is a constant) then $\lim_{\lambda \to +\infty} \|V(\lambda + H_0)^{-1}\|_{L(L^1)} = 0$, i.e. $V$ is ($L^1$) $H_0$-bounded with relative (operator) bound zero.

6. On holomorphy of Schrödinger semigroups

It is known that the Schrödinger semigroups $\{S_p(t);\ t \ge 0\}$ are holomorphic if the relative bound of $V$ is small, see e.g. [21,23,37], [33, p. 253] (see also [10,31] for more information). We show here the $L^1$-holomorphy in our general setting and extend it to general $L^p$ spaces by standard duality and interpolation arguments.



Theorem 43. Under the general assumption of Theorem 8, the Schrödinger semigroups $\{S_p(t);\ t \ge 0\}$ are holomorphic.

Proof. Consider first the case $p = 1$. In this case $A_1 = \Delta_1 + V$. Since $\Delta_1$ generates a holomorphic semigroup, $V$ is $\Delta_1$-bounded and $\Delta_1 + V$ has a positive resolvent then, by [4, Theorem 1.1], $\Delta_1 + V$ generates a holomorphic semigroup $\{S_1(t);\ t \ge 0\}$. (This argument is not linked to $L^1$ but rather to the holomorphy of the unperturbed semigroup; in particular, it works in $L^p$ spaces provided that $V$ is $\Delta_p$-bounded and $\Delta_p + V$ has a positive resolvent, i.e. $r_\sigma(V(\lambda - \Delta_p)^{-1}) < 1$ for large $\lambda$.) We argue now as in [21]: the holomorphy of $\{S_1(t);\ t \ge 0\}$ is characterized by the existence of $M > 0$ and $\omega$ large enough such that

$$\big\|(\lambda - A_1)^{-1}\big\| \le \frac{M}{|\lambda|} \quad (\operatorname{Re}\lambda \ge \omega).$$

The dual operator in $L^\infty(\mathbb{R}^N)$ satisfies the same estimate

$$\big\|(\lambda - A_1')^{-1}\big\| \le \frac{M}{|\lambda|} \quad (\operatorname{Re}\lambda \ge \omega).$$

But we know that $(\lambda - A_1)^{-1}$ and $(\lambda - A_1')^{-1}$ coincide on $L^1(\mathbb{R}^N) \cap L^\infty(\mathbb{R}^N)$ and then, by the Riesz–Thorin interpolation theorem,

$$\big\|(\lambda - A_p)^{-1}\big\| \le \frac{M}{|\lambda|} \quad (\operatorname{Re}\lambda \ge \omega),$$

which shows that $\{S_p(t);\ t \ge 0\}$ is holomorphic. □

Remark 44. If $V \in L^p(\mathbb{R}^N)$ for some $p > \frac{N}{2}$ then, by Sobolev imbeddings ($W^{2,p}$ being the domain of $\Delta_p$), $V$ is $\Delta_p$-bounded and $\Delta_p + V$ has a positive resolvent so that $A_p = \Delta_p + V$ generates a holomorphic semigroup (see [4] for the details). This provides a different proof of Theorem 33(i) and (at the same time) the holomorphy property; see [23,24] and references therein for more information on $\Delta_p$-bounded potentials.

References

[1] M. Aizenman, B. Simon, Brownian motion and Harnack inequality for Schrödinger operators, Comm. Pure Appl. Math. 35 (1982) 209–273.
[2] C.D. Aliprantis, O. Burkinshaw, Positive Operators, Academic Press, New York, 1985.
[3] W. Arendt, C.J.K. Batty, Absorption semigroups and Dirichlet boundary conditions, Math. Ann. 295 (1993) 427–448.
[4] W. Arendt, A. Rhandi, Perturbation of positive semigroups, Arch. Math. 56 (1991) 107–119.
[5] J. Banasiak, L. Arlotti, Perturbations of Positive Semigroups with Applications, Springer Monogr. Math., Springer, London, 2006.
[6] C.J.K. Batty, D.W. Robinson, Positive one-parameter semigroups on ordered Banach spaces, Acta Appl. Math. 1 (1984) 221–296.
[7] H. Brezis, Analyse fonctionnelle. Théorie et applications, Masson, Paris, 1983.
[8] H. Brezis, T. Kato, Remarks on the Schrödinger operator with singular complex potentials, J. Math. Pures Appl. 58 (9) (1979) 137–151.
[9] R. Carmona, Regularity properties of Schrödinger and Dirichlet semigroups, J. Funct. Anal. 33 (1979) 259–296.
[10] E.B. Davies, $L^p$ spectral independence and $L^1$ analyticity, J. London Math. Soc. 52 (2) (1994) 177–184.
[11] E.B. Davies, A.M. Hinz, Kato class potentials for higher order elliptic operators, J. London Math. Soc. 58 (2) (1998) 669–678.
[12] E.B. Davies, B. Simon, $L^1$-properties of intrinsic Schrödinger semigroups, J. Funct. Anal. 65 (1986) 126–146.
[13] W. Desch, Perturbations of positive semigroups in AL-spaces, unpublished manuscript, 1988.
[14] A. Devinatz, Schrödinger operators with singular potentials, J. Operator Theory 4 (1980) 25–35.
[15] N. Dunford, J.T. Schwartz, Linear Operators, Part I: General Theory, Interscience, New York, 1958.
[16] W.G. Faris, Essential self-adjointness of operators in ordered Hilbert spaces, Comm. Math. Phys. 30 (1973) 23–34.
[17] W. Herbst, D. Sloan, Perturbation of translation invariant positivity preserving semigroups on $L^2(\mathbb{R}^N)$, Trans. Amer. Math. Soc. 236 (1978) 325–360.
[18] T. Kato, Schrödinger operators with singular potentials, Israel J. Math. 13 (1973) 135–148.
[19] T. Kato, A second look at the essential self-adjointness of Schrödinger operators, in: C.P. Enz, J. Mehra (Eds.), Physical Reality and Mathematical Description, Reidel Publishing Company, Dordrecht-Holland, 1974, pp. 193–201.
[20] T. Kato, Perturbation Theory for Linear Operators, Springer, 1976.
[21] T. Kato, $L^p$-theory of Schrödinger operators with a singular potential, in: R. Nagel, U. Schlotterbeck, M.P.H. Wolf (Eds.), Aspects of Positivity in Functional Analysis, Mathematics Studies, North-Holland, 1986, pp. 63–78.
[22] H.P. McKean, $-\Delta$ plus a bad potential, J. Math. Phys. 18 (6) (1977) 1277–1279.
[23] V.F. Kovalenko, M.A. Perelmuter, Yu.A. Semenov, Schrödinger operators with $L_w^{l/2}(\mathbb{R}^l)$-potentials, J. Math. Phys. 22 (5) (1981) 1033–1044.
[24] V.F. Kovalenko, Yu.A. Semenov, Some problems on expansion in generalized eigenfunctions of the Schrödinger operator with strongly singular potentials, Russian Math. Surveys 33 (1978) 119–157.
[25] V.F. Kovalenko, Yu.A. Semenov, Towards the theory of Schrödinger operators I, translation from Ukrain. Mat. Zh. 41 (2) (1989) 273–278.
[26] V.A. Liskevich, Yu.A. Semenov, Some problems on Markov semigroups, in: M. Demuth, E. Schrohe, B. Schulze, J. Sjöstrand (Eds.), Schrödinger Operators, Markov Semigroups, Wavelet Analysis, Operator Algebras, Akademie, Berlin, 1996, pp. 163–217.
[27] M. Mokhtar-Kharroubi, Mathematical Topics in Neutron Transport Theory. New Aspects, Ser. Adv. Math. Appl. Sci., vol. 46, World Scientific, 1997.
[28] M. Mokhtar-Kharroubi, On resolvent positive operators in ordered Banach spaces with additive norm and application to semigroup theory, Prépublication du Laboratoire de Mathématiques de Besançon, No. 36, 2007.
[29] M. Mokhtar-Kharroubi, On perturbation theory of holomorphic semigroups, Extension of Kato class potentials for higher order elliptic operators and for Schrödinger operators with magnetic fields, in preparation.
[30] N. Okazawa, An $L^p$ theory for Schrödinger operators with nonnegative potentials, J. Math. Soc. Japan 36 (4) (1984) 675–688.
[31] E.M. Ouhabaz, Gaussian estimates and holomorphy of semigroups, Proc. Amer. Math. Soc. 123 (5) (1995) 1465–1474.
[32] B. de Pagter, Ordered Banach spaces, in: Ph. Clément, H.J.A.M. Heijmans, S. Angenent, C.J. van Duijn, B. de Pagter (Eds.), One-Parameter Semigroups, North-Holland, Amsterdam, 1987, pp. 269–279.
[33] M. Reed, B. Simon, Methods of Modern Mathematical Physics, vol. II, Academic Press, 1975.
[34] A. Rhandi, Perturbations positives des équations d'évolution et applications, PhD thesis, Besançon University, 1990.
[35] M. Schechter, Spectra of Partial Differential Operators, North-Holland, 1971.
[36] M. Schechter, Hamiltonians for singular potentials, Indiana Univ. Math. J. 22 (5) (1972) 483–503.
[37] Yu.A. Semenov, Schrödinger operators with $L^p_{loc}$-potentials, Comm. Math. Phys. 53 (1977) 277–284.
[38] B. Simon, Essential self-adjointness of Schrödinger operators with positive potentials, Math. Ann. 201 (1973) 211–220.
[39] B. Simon, Maximal and minimal Schrödinger forms, J. Operator Theory 1 (1979) 37–47.
[40] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc. 7 (3) (1982) 447–526.
[41] B. Simon, Schrödinger operators in the twentieth century, J. Math. Phys. 41 (6) (2000) 3523–3555.
[42] N.S. Trudinger, Linear elliptic operators with measurable coefficients, Ann. Sc. Norm. Super. Pisa Cl. Sci. 27 (3) (1973) 265–308.
[43] J. Voigt, Absorption semigroups, their generators, and Schrödinger semigroups, J. Funct. Anal. 67 (1986) 167–205.
[44] K. Yosida, Functional Analysis, Springer, 1978.
[45] W.P. Ziemer, Weakly Differentiable Functions, Springer-Verlag, 1989.


Topological free entropy dimension in unital C∗-algebras Don Hadwin, Junhao Shen ∗,1 Department of Mathematics and Statistics, University of New Hampshire, Durham, NH 03824, United States Received 21 August 2007; accepted 30 January 2009 Available online 13 February 2009 Communicated by D. Voiculescu

Abstract The notion of topological free entropy dimension of n-tuple of elements in a unital C∗ algebra was introduced by Voiculescu. In the paper, we compute topological free entropy dimension of one self-adjoint element and topological free orbit dimension of one self-adjoint element in a unital C∗ algebra. We also calculate the values of topological free entropy dimensions of any families of self-adjoint generators of some unital C∗ algebras, including irrational rotation C∗ algebra, UHF algebra, and minimal tensor product of two reduced C∗ algebras of free groups. © 2009 Elsevier Inc. All rights reserved. Keywords: Topological free entropy dimension; C∗ algebra

* Corresponding author.
E-mail addresses: [email protected] (D. Hadwin), [email protected] (J. Shen).
1 Author is supported by an NSF grant.

D. Hadwin, J. Shen / Journal of Functional Analysis 256 (2009) 2027–2068

1. Introduction

The theory of free probability and free entropy has been developed by Voiculescu since the 1980s. It has played a crucial role in the recent study of finite von Neumann algebras (see [1,3–15,19,20,23–26]). The analogue of free entropy dimension in the C∗ algebra context, the notion of topological free entropy dimension of an n-tuple of elements in a unital C∗ algebra, was also introduced by Voiculescu in [27]. After introducing this concept, Voiculescu discussed some of its properties, including subadditivity and change of variables, in [27].

In this paper, we add one basic property to the list: the topological free entropy dimension of one variable. More specifically, suppose $x$ is a self-adjoint element in a unital C∗ algebra $\mathcal{A}$ and $\sigma(x)$ is the spectrum of $x$ in $\mathcal{A}$. Then the topological free entropy dimension of $x$ is equal to $1 - \frac{1}{n}$, where $n$ is the cardinality of the set $\sigma(x)$ (see Theorem 4.3.1).

In [27], Voiculescu showed that (i) if $x_1, \dots, x_n$ is a family of free semicircular elements in a unital C∗ algebra with a tracial state, then $\delta_{\mathrm{top}}(x_1, \dots, x_n) = n$, where $\delta_{\mathrm{top}}(x_1, \dots, x_n)$ is the topological free entropy dimension of $x_1, \dots, x_n$; (ii) if $x_1, \dots, x_n$ is the universal n-tuple of self-adjoint contractions, then $\delta_{\mathrm{top}}(x_1, \dots, x_n) = n$. Apart from these two cases, very little has been known about the values of topological free entropy dimension in other C∗ algebras. Using the inequality between topological free entropy dimension and Voiculescu's free dimension capacity, we obtain an upper bound for the topological free entropy dimension of a unital C∗ algebra with a unique tracial state (see Theorem 5.2.2). A lower bound is also obtained for infinite dimensional simple unital C∗ algebras with a unique tracial state (see Theorem 5.3.7). As a corollary, the topological free entropy dimension of every finite family of self-adjoint generators of an irrational rotation C∗ algebra, a UHF algebra, or $C^*_{\mathrm{red}}(F_2) \otimes_{\min} C^*_{\mathrm{red}}(F_2)$ is equal to 1 (see Theorems 5.4.1, 5.4.2, 5.4.4). For these C∗ algebras, the value of the topological free entropy dimension is independent of the choice of generators.

The rest of the paper is devoted to the study of another invariant associated with an n-tuple of elements in a C∗ algebra. This invariant, called topological free orbit dimension, is an analogue of free orbit dimension in finite von Neumann algebras (see [11]).
We show that the topological free orbit dimension of a self-adjoint element $x$ in a unital C∗ algebra is equal to, according to some measurement, the packing dimension of the spectrum of $x$ (see Theorem 7.3.1).

The organization of the paper is as follows. In Section 2, we recall the definition of topological free entropy dimension. Some technical lemmas are proved in Section 3. In Section 4, we compute the topological free entropy dimension of one self-adjoint element in a unital C∗ algebra. In Section 5, we study the relationship between topological free entropy dimension and free capacity dimension of a unital C∗ algebra. Then we show that the topological free entropy dimension of every finite family of generators of an infinite dimensional simple unital C∗ algebra with a unique tracial state is always greater than or equal to 1. The concept of topological free orbit dimension of an n-tuple of elements in a C∗ algebra is introduced in Section 6. Its value for one variable is computed in Section 7.

2. Definitions and preliminaries

In this section, we recall Voiculescu's definition of topological free entropy dimension of an n-tuple of elements in a unital C∗ algebra.

2.1. A covering of a set in a metric space

Suppose $(X, d)$ is a metric space and $K$ is a subset of $X$. A family of balls in $X$ is called a covering of $K$ if the union of these balls covers $K$ and the centers of these balls lie in $K$.

2.2. Covering numbers in complex matrix algebra $(M_k(\mathbb{C}))^n$

Let $M_k(\mathbb{C})$ be the $k \times k$ full matrix algebra with entries in $\mathbb{C}$, and let $\tau_k$ be the normalized trace on $M_k(\mathbb{C})$, i.e., $\tau_k = \frac{1}{k}\mathrm{Tr}$, where $\mathrm{Tr}$ is the usual trace on $M_k(\mathbb{C})$. Let $\mathcal{U}(k)$ denote the group



of all unitary matrices in $M_k(\mathbb{C})$. Let $M_k(\mathbb{C})^n$ denote the direct sum of $n$ copies of $M_k(\mathbb{C})$. Let $M_k^{s.a.}(\mathbb{C})$ be the subset of $M_k(\mathbb{C})$ consisting of all self-adjoint matrices, and let $(M_k^{s.a.}(\mathbb{C}))^n$ be the direct sum of $n$ copies of $M_k^{s.a.}(\mathbb{C})$. Let $\|\cdot\|$ be the operator norm on $M_k(\mathbb{C})^n$ defined by
\[ \|(A_1, \dots, A_n)\| = \max\{\|A_1\|, \dots, \|A_n\|\} \]
for all $(A_1, \dots, A_n)$ in $M_k(\mathbb{C})^n$. Let $\|\cdot\|_2$ denote the trace norm induced by $\tau_k$ on $M_k(\mathbb{C})^n$, i.e.,
\[ \|(A_1, \dots, A_n)\|_2 = \sqrt{\tau_k(A_1^* A_1) + \cdots + \tau_k(A_n^* A_n)} \]
for all $(A_1, \dots, A_n)$ in $M_k(\mathbb{C})^n$.

For every $\omega > 0$, we define the $\omega$-$\|\cdot\|$-ball $\mathrm{Ball}(B_1, \dots, B_n; \omega, \|\cdot\|)$ centered at $(B_1, \dots, B_n)$ in $M_k(\mathbb{C})^n$ to be the subset of $M_k(\mathbb{C})^n$ consisting of all $(A_1, \dots, A_n)$ in $M_k(\mathbb{C})^n$ such that
\[ \|(A_1, \dots, A_n) - (B_1, \dots, B_n)\| < \omega. \]

Definition 2.2.1. Suppose that $\Sigma$ is a subset of $M_k(\mathbb{C})^n$. We define the covering number $\nu_\infty(\Sigma, \omega)$ to be the minimal number of $\omega$-$\|\cdot\|$-balls that constitute a covering of $\Sigma$ in $M_k(\mathbb{C})^n$.

For every $\omega > 0$, we define the $\omega$-$\|\cdot\|_2$-ball $\mathrm{Ball}(B_1, \dots, B_n; \omega, \|\cdot\|_2)$ centered at $(B_1, \dots, B_n)$ in $M_k(\mathbb{C})^n$ to be the subset of $M_k(\mathbb{C})^n$ consisting of all $(A_1, \dots, A_n)$ in $M_k(\mathbb{C})^n$ such that
\[ \|(A_1, \dots, A_n) - (B_1, \dots, B_n)\|_2 < \omega. \]

Definition 2.2.2. Suppose that $\Sigma$ is a subset of $M_k(\mathbb{C})^n$. We define the covering number $\nu_2(\Sigma, \omega)$ to be the minimal number of $\omega$-$\|\cdot\|_2$-balls that constitute a covering of $\Sigma$ in $M_k(\mathbb{C})^n$.

2.3. Noncommutative polynomials

In this article, we always assume that $\mathcal{A}$ is a unital C∗-algebra. Let $x_1, \dots, x_n, y_1, \dots, y_m$ be self-adjoint elements in $\mathcal{A}$. Let $\mathbb{C}\langle X_1, \dots, X_n, Y_1, \dots, Y_m\rangle$ be the unital noncommutative polynomials in the indeterminates $X_1, \dots, X_n, Y_1, \dots, Y_m$. Let $\{P_r\}_{r=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \dots, X_n, Y_1, \dots, Y_m\rangle$ with rational complex coefficients. (Here "rational complex coefficients" means that the real and imaginary parts of all coefficients of $P_r$ are rational numbers.)

Remark 2.3.1. We always assume that $1 \in \mathbb{C}\langle X_1, \dots, X_n, Y_1, \dots, Y_m\rangle$.

2.4. Voiculescu's norm-microstates space

For all integers $r, k \ge 1$, real numbers $R, \varepsilon > 0$ and noncommutative polynomials $P_1, \dots, P_r$, we define
\[ \Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n, y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r) \]
to be the subset of $(M_k^{s.a.}(\mathbb{C}))^{n+m}$ consisting of all those
\[ (A_1, \dots, A_n, B_1, \dots, B_m) \in (M_k^{s.a.}(\mathbb{C}))^{n+m} \]
satisfying
\[ \max\{\|A_1\|, \dots, \|A_n\|, \|B_1\|, \dots, \|B_m\|\} \le R \]
and
\[ \big|\, \|P_j(A_1, \dots, A_n, B_1, \dots, B_m)\| - \|P_j(x_1, \dots, x_n, y_1, \dots, y_m)\| \,\big| \le \varepsilon, \quad \forall\, 1 \le j \le r. \]
Remark 2.4.1. In the definition of norm-microstates space, we use the following assumption. If
\[ P_j(x_1, \dots, x_n, y_1, \dots, y_m) = \alpha_0 \cdot I_{\mathcal{A}} + \sum_{s=1}^{N} \sum_{1 \le i_1, \dots, i_s \le n+m} \alpha_{i_1 \cdots i_s}\, z_{i_1} \cdots z_{i_s}, \]
where $z_1, \dots, z_{n+m}$ denotes $x_1, \dots, x_n, y_1, \dots, y_m$ and $\alpha_0, \alpha_{i_1 \cdots i_s}$ are in $\mathbb{C}$, then
\[ P_j(A_1, \dots, A_n, B_1, \dots, B_m) = \alpha_0 \cdot I_k + \sum_{s=1}^{N} \sum_{1 \le i_1, \dots, i_s \le n+m} \alpha_{i_1 \cdots i_s}\, Z_{i_1} \cdots Z_{i_s}, \]
where $Z_1, \dots, Z_{n+m}$ denotes $A_1, \dots, A_n, B_1, \dots, B_m$ and $I_k$ is the identity matrix in $M_k(\mathbb{C})$.

Remark 2.4.2. In the original definition of norm-microstates space in [27], the parameter $R$ was not introduced. Note the following observation: Let $R > \max\{\|x_1\|, \dots, \|x_n\|, \|y_1\|, \dots, \|y_m\|\}$. When $r$ is large enough so that $\{X_1, \dots, X_n, Y_1, \dots, Y_m\} \subset \{P_1, \dots, P_r\}$ and $0 < \varepsilon < R - \max\{\|x_1\|, \dots, \|x_n\|, \|y_1\|, \dots, \|y_m\|\}$, we have
\[ \Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n, y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r) = \Gamma^{(\mathrm{top})}(x_1, \dots, x_n, y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r) \]
for all $k \ge 1$, where $\Gamma^{(\mathrm{top})}(x_1, \dots, x_n, y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r)$ is the norm-microstates space defined in [27]. Thus our definition agrees with the one in [27] for large $R$, $r$ and small $\varepsilon$. In the later sections, since we need to construct the ultraproduct of some matrix algebras, it will be convenient for us to include the parameter "$R$" in the definition of norm-microstates space.

Define the norm-microstates space of $x_1, \dots, x_n$ in the presence of $y_1, \dots, y_m$, denoted by
\[ \Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n : y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r), \]
as the projection of $\Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n, y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r)$ onto the space $(M_k^{s.a.}(\mathbb{C}))^n$ via the mapping
\[ (A_1, \dots, A_n, B_1, \dots, B_m) \mapsto (A_1, \dots, A_n). \]
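To make the membership conditions concrete, here is a minimal sketch restricted to a special case that is not singled out in the text at this point: one self-adjoint variable $x$ with finite spectrum, tested against a diagonal matrix. It assumes the conditions read "norm at most $R$" and "norms of $P_j$ differ by at most $\varepsilon$", and it uses the standard facts that $\|P(x)\| = \max_{\lambda \in \sigma(x)} |P(\lambda)|$ for self-adjoint $x$ and that $\|P(A)\|$ is the largest $|P(a_i)|$ over the diagonal entries $a_i$. The function names and the coefficient-list encoding of polynomials are illustrative choices, not notation from the paper.

```python
def poly_norm_on_points(coeffs, points):
    """Largest |P(t)| over a finite set of real points.

    coeffs[i] is the coefficient of t**i, so [0, 1, -1] encodes t - t**2.
    """
    return max(abs(sum(c * t**i for i, c in enumerate(coeffs))) for t in points)


def in_microstate_space(eigs, spectrum, R, eps, polys):
    """Check a diagonal self-adjoint matrix (given by its diagonal `eigs`)
    against the two norm-microstate conditions for a self-adjoint x whose
    spectrum is the finite set `spectrum` (hypothetical one-variable setup)."""
    # Condition 1: operator norm of the matrix is at most R.
    if max(abs(a) for a in eigs) > R:
        return False
    # Condition 2: | ||P_j(A)|| - ||P_j(x)|| | <= eps for every test polynomial.
    return all(
        abs(poly_norm_on_points(p, eigs) - poly_norm_on_points(p, spectrum)) <= eps
        for p in polys
    )


# x is a projection (spectrum {0, 1}); test polynomials t, 1 - t, t - t**2.
spectrum = [0.0, 1.0]
polys = [[0, 1], [1, -1], [0, 1, -1]]
print(in_microstate_space([0, 0, 1, 1], spectrum, R=2.0, eps=0.1, polys=polys))
print(in_microstate_space([0, 0.5, 1], spectrum, R=2.0, eps=0.1, polys=polys))
```

A diagonal matrix whose entries all lie in the spectrum and hit every spectral value passes for every $\varepsilon$, which is the mechanism used later in the lower-bound arguments; an entry far from the spectrum is detected by a polynomial such as $t - t^2$.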



2.5. Voiculescu's topological free entropy dimension (see [27])

Define
\[ \nu_\infty\big(\Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n : y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r),\, \omega\big) \]
to be the covering number of the set $\Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n : y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r)$ by $\omega$-$\|\cdot\|$-balls in the metric space $(M_k^{s.a.}(\mathbb{C}))^n$ equipped with the operator norm.

Definition 2.5.1. Define
\[ \delta_{\mathrm{top}}(x_1, \dots, x_n : y_1, \dots, y_m; \omega) = \sup_{R > 0}\ \inf_{\varepsilon > 0,\, r \in \mathbb{N}}\ \limsup_{k \to \infty} \frac{\log\big(\nu_\infty(\Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n : y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r), \omega)\big)}{-k^2 \log \omega}. \]
The topological free entropy dimension of $x_1, \dots, x_n$ in the presence of $y_1, \dots, y_m$ is defined by
\[ \delta_{\mathrm{top}}(x_1, \dots, x_n : y_1, \dots, y_m) = \limsup_{\omega \to 0^+} \delta_{\mathrm{top}}(x_1, \dots, x_n : y_1, \dots, y_m; \omega). \]

Remark 2.5.2. By definition, $\delta_{\mathrm{top}}$ does not depend on the order of the polynomials $P_1, P_2, P_3, \dots$.

Remark 2.5.3. Let $M > \max\{\|x_1\|, \dots, \|x_n\|, \|y_1\|, \dots, \|y_m\|\}$ be some positive number. By Remark 2.4.2, we know
\[ \delta_{\mathrm{top}}(x_1, \dots, x_n : y_1, \dots, y_m) = \limsup_{\omega \to 0^+}\ \inf_{\varepsilon > 0,\, r \in \mathbb{N}}\ \limsup_{k \to \infty} \frac{\log\big(\nu_\infty(\Gamma_M^{(\mathrm{top})}(x_1, \dots, x_n : y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r), \omega)\big)}{-k^2 \log \omega}. \]

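2.6. C∗ algebra ultraproduct and von Neumann algebra ultraproduct

Suppose $\{M_{k_m}(\mathbb{C})\}_{m=1}^\infty$ is a sequence of complex matrix algebras where $k_m$ goes to infinity as $m$ approaches infinity. Let $\gamma$ be a free ultrafilter in $\beta(\mathbb{N}) \setminus \mathbb{N}$. We can introduce a unital C∗ algebra $\prod_{m=1}^\infty M_{k_m}(\mathbb{C})$ as follows:
\[ \prod_{m=1}^\infty M_{k_m}(\mathbb{C}) = \Big\{ (Y_m)_{m=1}^\infty \ \Big|\ \forall m \ge 1,\ Y_m \in M_{k_m}(\mathbb{C}) \text{ and } \sup_{m \ge 1} \|Y_m\| < \infty \Big\}. \]
We can also introduce the norm closed two sided ideals $I_\infty$ and $I_2$ as follows:
\[ I_\infty = \Big\{ (Y_m)_{m=1}^\infty \in \prod_{m=1}^\infty M_{k_m}(\mathbb{C}) \ \Big|\ \lim_{m \to \gamma} \|Y_m\| = 0 \Big\}, \]
\[ I_2 = \Big\{ (Y_m)_{m=1}^\infty \in \prod_{m=1}^\infty M_{k_m}(\mathbb{C}) \ \Big|\ \lim_{m \to \gamma} \|Y_m\|_2 = 0 \Big\}. \]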



Definition 2.6.1. The C∗ algebra ultraproduct of $\{M_{k_m}(\mathbb{C})\}_{m=1}^\infty$ along the ultrafilter $\gamma$, denoted by $\prod_{m=1}^{\gamma} M_{k_m}(\mathbb{C})$, is defined to be the quotient algebra of $\prod_{m=1}^\infty M_{k_m}(\mathbb{C})$ by the ideal $I_\infty$. The image of $(Y_m)_{m=1}^\infty \in \prod_{m=1}^\infty M_{k_m}(\mathbb{C})$ in the quotient algebra is denoted by $[(Y_m)_m]$.

Definition 2.6.2. The von Neumann algebra ultraproduct of $\{M_{k_m}(\mathbb{C})\}_{m=1}^\infty$ along the ultrafilter $\gamma$, also denoted by $\prod_{m=1}^{\gamma} M_{k_m}(\mathbb{C})$ if no confusion arises, is defined to be the quotient algebra of $\prod_{m=1}^\infty M_{k_m}(\mathbb{C})$ by the ideal $I_2$. The image of $(Y_m)_{m=1}^\infty \in \prod_{m=1}^\infty M_{k_m}(\mathbb{C})$ in the quotient algebra is denoted by $[(Y_m)_m]$.

Remark 2.6.3. The von Neumann algebra ultraproduct $\prod_{m=1}^{\gamma} M_{k_m}(\mathbb{C})$ is a finite factor (see [16]).

2.7. Topological free entropy dimension of elements in a non-unital C∗ algebra

Topological free entropy dimension can also be defined for an n-tuple of elements in a non-unital C∗ algebra. Suppose that $\mathcal{A}$ is a non-unital C∗-algebra. Let $x_1, \dots, x_n, y_1, \dots, y_m$ be self-adjoint elements in $\mathcal{A}$. Let $\mathbb{C}\langle X_1, \dots, X_n, Y_1, \dots, Y_m\rangle \ominus \mathbb{C}$ be the noncommutative polynomials in the indeterminates $X_1, \dots, X_n, Y_1, \dots, Y_m$ without constant terms. Let $\{P_r\}_{r=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \dots, X_n, Y_1, \dots, Y_m\rangle \ominus \mathbb{C}$ with rational complex coefficients. Then the norm-microstates space
\[ \Gamma_R^{(\mathrm{top})}(x_1, \dots, x_n : y_1, \dots, y_m; k, \varepsilon, P_1, \dots, P_r) \]
can be defined similarly as in Section 2.4, and the topological free entropy dimension
\[ \delta_{\mathrm{top}}(x_1, \dots, x_n : y_1, \dots, y_m) \]
can be defined similarly as in Section 2.5. In this paper, we focus on the case when $\mathcal{A}$ is a unital C∗ algebra.

3. Some technical lemmas

3.1.

Suppose $x$ is a self-adjoint element in a unital C∗ algebra $\mathcal{A}$. Let $\sigma(x)$ be the spectrum of $x$ in $\mathcal{A}$. Let $\{P_r\}_{r=1}^\infty$ be the collection of all polynomials in $\mathbb{C}\langle X\rangle$ with rational coefficients.

Theorem 3.1.1. Let $R > \|x\|$. For any $\omega > 0$, we have the following result.

(1) There are some integer $n \ge 1$ and distinct real numbers $\lambda_1, \lambda_2, \dots, \lambda_n$ in $\sigma(x)$ satisfying (i) $|\lambda_i - \lambda_j| \ge \omega$ for all $1 \le i \ne j \le n$; and (ii) for any $\lambda$ in $\sigma(x)$, there is some $\lambda_j$ with $1 \le j \le n$ such that $|\lambda - \lambda_j| \le \omega$.

(2) There is some $r_0 \in \mathbb{N}$ such that the following holds: when $r \ge r_0$, for any $k \in \mathbb{N}$ and any $A$ in $\Gamma_R^{(\mathrm{top})}(x; k, 1/r, P_1, \dots, P_r)$, there are positive integers $1 \le k_1, \dots, k_n \le k$ with $k_1 + k_2 + \cdots + k_n = k$ and a unitary matrix $U$ in $M_k(\mathbb{C})$ satisfying
\[ \left\| U^* A U - \begin{pmatrix} \lambda_1 I_{k_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{k_2} & \cdots & 0 \\ \cdots & \cdots & \ddots & \cdots \\ 0 & 0 & \cdots & \lambda_n I_{k_n} \end{pmatrix} \right\| \le 2\omega, \]
where $I_{k_j}$ is the $k_j \times k_j$ identity matrix in $M_{k_j}(\mathbb{C})$ for $1 \le j \le n$.
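Part (1) of the theorem asks for a set of spectral points that is both $\omega$-separated and an $\omega$-net. As a hedged illustration, the sketch below runs a greedy selection on a finite sample of real numbers standing in for $\sigma(x)$; the sampling and the name `greedy_net` are assumptions made for the example, and for an infinite compact spectrum the same selection would be applied to the spectrum itself rather than to a sample.

```python
def greedy_net(points, omega):
    """Greedily pick points that are pairwise at least `omega` apart.

    Every point that is skipped lies strictly within `omega` of some chosen
    point, so the chosen points also form an omega-net of the input.
    """
    net = []
    for p in sorted(points):
        if all(abs(p - q) >= omega for q in net):
            net.append(p)
    return net


# Finite sample standing in for a spectrum contained in [0, 1].
sample = [i / 100 for i in range(101)]
omega = 0.25
net = greedy_net(sample, omega)

separated = all(abs(a - b) >= omega for a in net for b in net if a != b)
covering = all(any(abs(p - q) <= omega for q in net) for p in sample)
print(len(net), separated, covering)
```

The number of selected points plays the role of $n$ in the theorem; it is finite because the spectrum is bounded.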



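Proof. The proof of part (1) is trivial. We will only prove part (2). Assume that the result in (2) does not hold. Then there is some $\omega > 0$ so that the following holds: for all $m \ge 1$, there are an integer $k_m \ge 1$ and some $A_m$ in $\Gamma_R^{(\mathrm{top})}(x; k_m, \frac{1}{m}, P_1, \dots, P_m)$ such that
\[ \left\| U^* A_m U - \begin{pmatrix} \lambda_1 I_{s_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{s_2} & \cdots & 0 \\ \cdots & \cdots & \ddots & \cdots \\ 0 & 0 & \cdots & \lambda_n I_{s_n} \end{pmatrix} \right\| > 2\omega \tag{$*$} \]
for every $1 \le s_1, \dots, s_n \le k_m$ with $s_1 + \cdots + s_n = k_m$ and every unitary matrix $U$ in $M_{k_m}(\mathbb{C})$. First we prove the following claims.

Claim 3.1.2. For any $\rho > 0$, there is some $m_0 \in \mathbb{N}$ such that, $\forall m \ge m_0$ and $\forall \tilde\lambda \in \sigma(A_m)$, we have $\mathrm{dist}(\tilde\lambda, \sigma(x)) < \rho$.

Proof. Suppose that the result of the claim does not hold. Then there are some $\rho > 0$, a sequence of positive integers $\{m_j\}_{j=1}^\infty$ and a sequence of real numbers $\{\tilde\lambda_j\}_{j=1}^\infty$ satisfying
\[ \forall j \ge 1, \quad \tilde\lambda_j \in \sigma(A_{m_j}) \quad \text{and} \quad \mathrm{dist}(\tilde\lambda_j, \sigma(x)) \ge \rho. \tag{3.1.1} \]
Note that each $\tilde\lambda_j \in \sigma(A_{m_j})$ and $A_{m_j}$ is a self-adjoint $k_{m_j} \times k_{m_j}$ matrix. It follows that there is a unit vector $\xi_j \in \mathbb{C}^{k_{m_j}}$ so that
\[ \|\xi_j\| = 1 \quad \text{and} \quad A_{m_j} \xi_j = \tilde\lambda_j \xi_j. \tag{3.1.2} \]
Moreover, each $|\tilde\lambda_j| \le \|A_{m_j}\| \le R$. Thus, there is an accumulation point of the sequence $\{\tilde\lambda_j\}_{j=1}^\infty$. By passing to a subsequence, we may assume that
\[ \tilde\lambda_j \to \tilde\lambda \quad \text{for some real number } \tilde\lambda \in \mathbb{R}. \tag{3.1.3} \]
It follows that
\[ \lim_{j \to \infty} \|A_{m_j} \xi_j - \tilde\lambda \xi_j\| = 0. \tag{3.1.4} \]
Let $\gamma$ be a free ultrafilter in $\beta(\mathbb{N}) \setminus \mathbb{N}$. Let $\mathcal{B} = \prod_{j=1}^{\gamma} M_{k_{m_j}}(\mathbb{C})$ be the C∗ algebra ultraproduct of $\{M_{k_{m_j}}(\mathbb{C})\}_{j=1}^\infty$ along the ultrafilter $\gamma$, i.e. $\prod_{j=1}^{\gamma} M_{k_{m_j}}(\mathbb{C})$ is the quotient algebra of the C∗ algebra $\prod_{j=1}^\infty M_{k_{m_j}}(\mathbb{C})$ by $I_\infty$, where $I_\infty = \{(B_{m_j})_{j=1}^\infty \in \prod_{j=1}^\infty M_{k_{m_j}}(\mathbb{C}) \mid \lim_{j \to \gamma} \|B_{m_j}\| = 0\}$. Let $a = [(A_{m_j})_{j=1}^\infty]$ be a self-adjoint element in $\mathcal{B}$. Since $\{P_r\}_{r=1}^\infty$ is the collection of all polynomials in $\mathbb{C}\langle X\rangle$ with rational coefficients, by mapping $x$ to $a$, there is a unital $*$-isomorphism from the C∗ subalgebra generated by $\{I_{\mathcal{A}}, x\}$ in $\mathcal{A}$ onto the C∗ subalgebra generated by $\{I_{\mathcal{B}}, a\}$ in $\mathcal{B}$. Thus $\sigma(x) = \sigma(a)$.

Let $\mathcal{H}_\gamma$ be the ultraproduct of the Hilbert spaces $\{\mathbb{C}^{k_{m_j}}\}_{j=1}^\infty$ along the ultrafilter $\gamma$. It follows from the definition of the ultraproduct of C∗ algebras that $\prod_{j=1}^{\gamma} M_{k_{m_j}}(\mathbb{C})$ can also be viewed naturally as a subalgebra of $B(\mathcal{H}_\gamma)$, namely
\[ [(B_{m_j})_{j=1}^\infty]\,[(\eta_j)_{j=1}^\infty] = [(B_{m_j} \eta_j)_{j=1}^\infty] \]
for all $[(\eta_j)_{j=1}^\infty] \in \mathcal{H}_\gamma$ and $[(B_{m_j})_{j=1}^\infty] \in \prod_{j=1}^{\gamma} M_{k_{m_j}}(\mathbb{C})$. By Eq. (3.1.2), $[(\xi_j)_{j=1}^\infty]$ is a unit vector in $\mathcal{H}_\gamma$. By (3.1.4), $[(A_{m_j})_{j=1}^\infty]\,[(\xi_j)_{j=1}^\infty] = \tilde\lambda\,[(\xi_j)_{j=1}^\infty]$. Hence $\tilde\lambda$ is an eigenvalue of $a$, and thus is in $\sigma(a) = \sigma(x)$, which contradicts (3.1.1) and (3.1.3). This ends the proof of the claim. □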

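Claim 3.1.3. For any $\rho > 0$, there is some $m_0 \in \mathbb{N}$ such that, $\forall m \ge m_0$ and $\forall \lambda \in \sigma(x)$, we have $\mathrm{dist}(\lambda, \sigma(A_m)) < \rho$.

Proof. Suppose that the result of the claim does not hold. Then there are some $\rho > 0$, a sequence of positive integers $\{m_j\}_{j=1}^\infty$ and a sequence of real numbers $\{\tilde\lambda_j\}_{j=1}^\infty$ satisfying
\[ \forall j \ge 1, \quad \tilde\lambda_j \in \sigma(x) \quad \text{and} \quad \mathrm{dist}(\tilde\lambda_j, \sigma(A_{m_j})) \ge \rho. \tag{3.1.5} \]
It follows that $(\tilde\lambda_j I_{k_{m_j}} - A_{m_j})^{-1}$ is in $M_{k_{m_j}}(\mathbb{C})$, satisfying
\[ \big\|(\tilde\lambda_j I_{k_{m_j}} - A_{m_j})^{-1}\big\| \le \frac{1}{\rho}. \tag{3.1.6} \]
Note that each $\tilde\lambda_j$ is in $\sigma(x)$. Thus $|\tilde\lambda_j| \le \|x\|$. Therefore there is an accumulation point of the sequence $\{\tilde\lambda_j\}_{j=1}^\infty$. By passing to a subsequence, we may assume that
\[ \tilde\lambda_j \to \tilde\lambda \quad \text{for some real number } \tilde\lambda \in \sigma(x). \tag{3.1.7} \]
Let $\gamma$ be a free ultrafilter in $\beta(\mathbb{N}) \setminus \mathbb{N}$. Let $\mathcal{B} = \prod_{j=1}^{\gamma} M_{k_{m_j}}(\mathbb{C})$ be the C∗ algebra ultraproduct of $\{M_{k_{m_j}}(\mathbb{C})\}_{j=1}^\infty$ along the ultrafilter $\gamma$. Let $a = [(A_{m_j})_{j=1}^\infty]$ be a self-adjoint element in $\mathcal{B}$. By mapping $x$ to $a$, there is a unital $*$-isomorphism from the C∗ subalgebra generated by $\{I_{\mathcal{A}}, x\}$ in $\mathcal{A}$ onto the C∗ subalgebra generated by $\{I_{\mathcal{B}}, a\}$ in $\mathcal{B}$. Thus $\sigma(x) = \sigma(a)$. By Eqs. (3.1.6) and (3.1.7), we know that
\[ \lim_{j \to \infty} \big\|(\tilde\lambda I_{k_{m_j}} - A_{m_j})(\tilde\lambda_j I_{k_{m_j}} - A_{m_j})^{-1} - I_{k_{m_j}}\big\| = 0, \]
or
\[ [(\tilde\lambda I_{k_{m_j}} - A_{m_j})_{j=1}^\infty]\,[((\tilde\lambda_j I_{k_{m_j}} - A_{m_j})^{-1})_{j=1}^\infty] = (\tilde\lambda I_{\mathcal{B}} - a)\,[((\tilde\lambda_j I_{k_{m_j}} - A_{m_j})^{-1})_{j=1}^\infty] = I_{\mathcal{B}}. \]
Together with (3.1.6), this implies that $\tilde\lambda \notin \sigma(a) = \sigma(x)$, which contradicts the fact that $\tilde\lambda \in \sigma(x)$ from (3.1.7). This ends the proof of the claim. □

We continue the proof of the theorem. Put $\rho = \omega$ in Claims 3.1.2 and 3.1.3. By Claims 3.1.2 and 3.1.3, there is some $m_0 \in \mathbb{N}$ such that $\forall m \ge m_0$, the Hausdorff distance between $\sigma(A_m)$ and $\sigma(x)$ is less than $\omega$. Therefore, together with the choice of the numbers $\lambda_1, \dots, \lambda_n$ in part (1), we know (i) $\forall m \ge m_0$ and $\forall \tilde\lambda \in \sigma(A_m)$, there is some $1 \le j \le n$ satisfying $|\tilde\lambda - \lambda_j| \le 2\omega$; and (ii) $\forall m \ge m_0$ and $\forall\, 1 \le j \le n$, there is some $\tilde\lambda \in \sigma(A_m)$ satisfying $|\tilde\lambda - \lambda_j| \le 2\omega$. That is, $\forall m \ge m_0$, there are some $1 \le s_1, \dots, s_n \le k_m$ with $s_1 + \cdots + s_n = k_m$ and a unitary matrix $U$ in $M_{k_m}(\mathbb{C})$ such that
\[ \left\| U^* A_m U - \begin{pmatrix} \lambda_1 I_{s_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{s_2} & \cdots & 0 \\ \cdots & \cdots & \ddots & \cdots \\ 0 & 0 & \cdots & \lambda_n I_{s_n} \end{pmatrix} \right\| \le 2\omega. \]
This contradicts the inequality ($*$), and the proof of the theorem is completed. □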

3.2.

In this subsection, we will use the following notation.

(i) Let $n, m$ be some positive integers with $n \ge m$.
(ii) Let $\delta, \theta$ be some positive numbers.
(iii) Let $\{\lambda_1, \lambda_2, \dots, \lambda_m\} \cup \{\lambda_{m+1}, \dots, \lambda_n\}$ be a family of real numbers such that $|\lambda_i - \lambda_j| \ge \theta$ for all $1 \le i < j \le m$.
(iv) Let $k$ be a positive integer such that $k - (n - m)$ is divisible by $m$. We let
\[ t = \frac{k - n + m}{m}. \]
(v) We let $B = \mathrm{diag}(\lambda_{m+1}, \dots, \lambda_n)$ be a diagonal matrix in $M_{n-m}(\mathbb{C})$ and
\[ A = \mathrm{diag}(\lambda_1 I_t, \lambda_2 I_t, \dots, \lambda_m I_t, B) \]
be a block-diagonal matrix in $M_k(\mathbb{C})$, where $I_t$ is the identity matrix in $M_t(\mathbb{C})$.
(vi) We let $A$ be defined as above and
\[ \Omega(A) = \{ U A U^* \mid U \text{ is in } \mathcal{U}(k) \} \]
be the unitary orbit of $A$ in $M_k(\mathbb{C})$.
(vii) Assume that $\{e_{ij}\}_{i,j=1}^k$ is a canonical system of matrix units of $M_k(\mathbb{C})$. We let
\[ V_1 = \mathrm{span}\{ e_{ij}, e_{ji} \mid pt + 1 \le i \le (p+1)t < j \le mt,\ 0 \le p < m \} \]
and
\[ V_2 = M_k(\mathbb{C}) \ominus V_1 \]
be linear subspaces of $M_k(\mathbb{C})$. Note that
\[ \dim_{\mathbb{R}} V_2 = 2mt^2 + 4m(n-m)t + 2(n-m)^2. \]
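The dimension formula in item (vii) can be checked by direct counting. The sketch below assumes the index condition reads $pt + 1 \le i \le (p+1)t < j \le mt$ with $0 \le p < m$, so that $V_1$ is spanned by the matrix units lying strictly off the diagonal blocks within the leading $mt \times mt$ corner; the function names are illustrative only.

```python
def real_dim_V2(m, n, t):
    """Count matrix units outside V_1 and double the count (real dimension)."""
    k = m * t + (n - m)
    in_v1 = set()
    for p in range(m):
        for i in range(p * t + 1, (p + 1) * t + 1):
            for j in range((p + 1) * t + 1, m * t + 1):
                in_v1.add((i, j))  # e_ij
                in_v1.add((j, i))  # e_ji
    # V_2 is the orthogonal complement, so its complex dimension is k^2 - |V_1|.
    return 2 * (k * k - len(in_v1))


def real_dim_formula(m, n, t):
    """Closed form stated in the text."""
    return 2 * m * t * t + 4 * m * (n - m) * t + 2 * (n - m) ** 2


for (m, n, t) in [(2, 3, 2), (3, 3, 4), (1, 4, 5)]:
    print((m, n, t), real_dim_V2(m, n, t), real_dim_formula(m, n, t))
```

The count agrees with the closed form because $V_2$ consists of $m$ diagonal blocks of size $t \times t$ together with the last $n - m$ rows and columns.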



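Lemma 3.2.1. We follow the notation as above. Suppose
\[ \|U_1 A U_1^* - U_2 A U_2^*\|_2 \le \delta \]
for some unitary matrices $U_1$ and $U_2$ in $\mathcal{U}(k)$. Then the following hold.

(1) There exists some $S \in V_2$ such that $\|S\|_2 \le 1$ and
\[ \|U_1 - U_2 S\|_2 \le \frac{\delta}{\theta}. \]
(2) If $n = m$, then there is a unitary matrix $W$ in $V_2$ such that
\[ \|U_1 - U_2 W\|_2 \le \frac{3\delta}{\theta}. \]

Proof. Assume that
\[ U_2^* U_1 = \begin{pmatrix} U_{11} & U_{12} & \cdots & U_{1,m+1} \\ U_{21} & U_{22} & \cdots & U_{2,m+1} \\ \cdots & \cdots & \cdots & \cdots \\ U_{m+1,1} & U_{m+1,2} & \cdots & U_{m+1,m+1} \end{pmatrix} \in M_k(\mathbb{C}), \]
where $U_{i,j}$ is a $t \times t$ matrix, $U_{i,m+1}$ a $t \times (n-m)$ matrix, $U_{m+1,j}$ an $(n-m) \times t$ matrix for $1 \le i, j \le m$, and $U_{m+1,m+1}$ is an $(n-m) \times (n-m)$ matrix.

(1) Let
\[ S = \begin{pmatrix} U_{11} & 0 & \cdots & 0 & U_{1,m+1} \\ 0 & U_{22} & \cdots & 0 & U_{2,m+1} \\ \cdots & \cdots & \ddots & \cdots & \cdots \\ 0 & 0 & \cdots & U_{m,m} & U_{m,m+1} \\ U_{m+1,1} & U_{m+1,2} & \cdots & U_{m+1,m} & U_{m+1,m+1} \end{pmatrix}. \]
It is easy to see that $S$ is in $V_2$, $\|S\|_2 \le 1$ and
\[ \delta^2 \ge \|U_1 A U_1^* - U_2 A U_2^*\|_2^2 = \frac{1}{k} \mathrm{Tr}\big( (U_2^* U_1 A - A U_2^* U_1)(U_2^* U_1 A - A U_2^* U_1)^* \big) \ge \frac{1}{k} \sum_{1 \le i \ne j \le m} \mathrm{Tr}\big( |\lambda_i - \lambda_j|^2\, U_{ij} U_{ij}^* \big) \ge \frac{1}{k} \cdot \theta^2 \sum_{1 \le i \ne j \le m} \mathrm{Tr}\big( U_{ij} U_{ij}^* \big). \]
Hence
\[ \|U_1 - U_2 S\|_2^2 = \|U_2^* U_1 - S\|_2^2 = \frac{1}{k} \sum_{1 \le i \ne j \le m} \mathrm{Tr}\big( U_{ij} U_{ij}^* \big) \le \frac{\delta^2}{\theta^2}. \]
It follows that
\[ \|U_1 - U_2 S\|_2 \le \frac{\delta}{\theta}. \]

(2) If $n = m$, then $V_2 = M_t(\mathbb{C}) \oplus M_t(\mathbb{C}) \oplus \cdots \oplus M_t(\mathbb{C})$. By the construction of $S$, we can assume $S = WH$ is a polar decomposition of $S$ in $V_2$ for some unitary matrix $W$ and positive matrix $H$ in $V_2$. Again by the construction of $S$, we know that $\|S\| \le 1$, whence $\|H\| \le 1$. From the proven fact that $\|U_2^* U_1 - S\|_2 \le \frac{\delta}{\theta}$, we know that
\[ \|H^2 - I\|_2 = \|S^* S - I\|_2 \le \frac{2\delta}{\theta}. \]
Thus
\[ \|H - I\|_2 \le \|H^2 - I\|_2 \le \frac{2\delta}{\theta}. \]
It follows that
\[ \|U_1 - U_2 W\|_2 \le \|U_1 - U_2 W H\|_2 + \|U_2 W H - U_2 W\|_2 = \|U_1 - U_2 S\|_2 + \|H - I\|_2 \le \frac{3\delta}{\theta}. \qquad \Box \]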

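We list some results on the covering numbers from [21] as a sublemma here.

Sublemma 3.2.2. The following statements are true.

(1) There are universal constants $d_1, D_1 > 0$ so that the following holds: For any positive integer $p$, let $\mathbb{R}^p$ be the euclidean space of dimension $p$ with the euclidean norm $\|\cdot\|_e$, and let $B(r)$ be the ball of radius $r$ in $\mathbb{R}^p$ w.r.t. $\|\cdot\|_e$. For any $\omega > 0$, let $\nu(B(r), \omega)$ be the minimal number of $\omega$-$\|\cdot\|_e$-balls needed to cover $B(r)$. Then
\[ \Big(\frac{d_1 r}{\omega}\Big)^p \le \nu(B(r), \omega) \le \Big(\frac{D_1 r}{\omega}\Big)^p. \]
(2) There are universal constants $d_2, D_2 > 0$ so that the following holds: For any positive integer $k$, let $\mathcal{U}(k)$ be the group of all $k \times k$ unitary matrices and $\|\cdot\|_2$ be the normalized Hilbert–Schmidt norm on $\mathcal{U}(k)$. For any $\omega > 0$, let $\nu_2(\mathcal{U}(k), \omega)$ be the minimal number of $\omega$-$\|\cdot\|_2$-balls in $\mathcal{U}(k)$ needed to cover $\mathcal{U}(k)$. Then
\[ \Big(\frac{d_2}{\omega}\Big)^{k^2} \le \nu_2(\mathcal{U}(k), \omega) \le \Big(\frac{D_2}{\omega}\Big)^{k^2}. \]
(3) There is a universal constant $C_1 > 0$ so that the following holds: For any positive integer $k$, let $\mathcal{U}(k)$ be the group of all $k \times k$ unitary matrices and $\mu$ be the normalized Haar measure on $\mathcal{U}(k)$, i.e. $\mu(\mathcal{U}(k)) = 1$. For any $\omega > 0$ and $T$ in $M_k(\mathbb{C})$, let
\[ BU(T, \omega, \|\cdot\|_2) = \{ U \in \mathcal{U}(k) \mid \|U - T\|_2 \le \omega \} \]
be a subset of $\mathcal{U}(k)$. Then the Haar measure of $BU(T, \omega, \|\cdot\|_2)$ in $\mathcal{U}(k)$ is bounded by
\[ \mu\big(BU(T, \omega, \|\cdot\|_2)\big) \le (C_1 \cdot 2\omega)^{k^2}. \]

Proof. (1) This is a well-known result (for example, see Lemma 1 in [21]).

(2) This is the combined statement of Proposition 7 in [21] and the remark in the second-to-last paragraph of the introduction of [21].

(3) Roughly, this is a restatement of the result in (2), and it also follows from [21]. We sketch its proof here. Let $\mu$ be the normalized Haar measure on the unitary group $\mathcal{U}(k)$, i.e. $\mu(\mathcal{U}(k)) = 1$. Let $\{W_\lambda\}_{\lambda \in \Lambda}$ be a family of unitary matrices in $\mathcal{U}(k)$ such that $\{BU(W_\lambda, 2\omega, \|\cdot\|_2)\}_{\lambda \in \Lambda}$ is a maximal disjoint collection of $2\omega$-$\|\cdot\|_2$-balls in $\mathcal{U}(k)$. Since $\mu$ is the Haar measure on $\mathcal{U}(k)$, we can let $\alpha(\omega) = \mu(BU(W_\lambda, 2\omega, \|\cdot\|_2)) > 0$ for any $\lambda \in \Lambda$. Then
\[ |\Lambda| \cdot \alpha(\omega) \le \mu(\mathcal{U}(k)) = 1, \]
where $|\Lambda|$ is the cardinality of $\Lambda$. On the other hand, the maximality of the collection $\{BU(W_\lambda, 2\omega, \|\cdot\|_2)\}_{\lambda \in \Lambda}$ in $\mathcal{U}(k)$ implies that
\[ \mathcal{U}(k) \subseteq \bigcup_{\lambda \in \Lambda} BU(W_\lambda, 4\omega, \|\cdot\|_2), \]
whence
\[ |\Lambda| \ge \nu_2(\mathcal{U}(k), 4\omega), \]
where $\nu_2(\mathcal{U}(k), 4\omega)$ is the minimal number of $4\omega$-$\|\cdot\|_2$-balls in $\mathcal{U}(k)$ needed to cover $\mathcal{U}(k)$. Therefore, by the result in (2), we have
\[ \alpha(\omega) \le \frac{1}{|\Lambda|} \le \frac{1}{\nu_2(\mathcal{U}(k), 4\omega)} \le \Big(\frac{4\omega}{d_2}\Big)^{k^2}, \]
where $d_2 > 0$ is a universal constant. Let $T$ be any element in $M_k(\mathbb{C})$. If $BU(T, \omega, \|\cdot\|_2) \cap \mathcal{U}(k) = \emptyset$, then the result in (3) is trivially true. If $BU(T, \omega, \|\cdot\|_2) \cap \mathcal{U}(k) \ne \emptyset$, then take any $W \in BU(T, \omega, \|\cdot\|_2) \cap \mathcal{U}(k)$. Thus
\[ BU(T, \omega, \|\cdot\|_2) \subseteq BU(W, 2\omega, \|\cdot\|_2). \]
It follows that
\[ \mu\big(BU(T, \omega, \|\cdot\|_2)\big) \le \mu\big(BU(W, 2\omega, \|\cdot\|_2)\big) = \alpha(\omega) \le \Big(\frac{4\omega}{d_2}\Big)^{k^2}. \]
Let $C_1 = 2/d_2$ and we have the desired result. □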

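Lemma 3.2.3. We follow the notation in Lemma 3.2.1. We have the following results.

(1) Let $\mu$ be the normalized Haar measure on the unitary group $\mathcal{U}(k)$, i.e. $\mu(\mathcal{U}(k)) = 1$. For every $U \in \mathcal{U}(k)$, let
\[ \Sigma(U) = \Big\{ W \in \mathcal{U}(k) \ \Big|\ \exists S \in V_2 \text{ such that } \|S\|_2 \le 1 \text{ and } \|W - U S\|_2 \le \frac{\delta}{\theta} \Big\} \]
be a subset of $\mathcal{U}(k)$. Then there are universal constants $C, C_1 > 0$ such that
\[ \mu(\Sigma(U)) \le (C_1 \cdot 4\delta/\theta)^{k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{2mt^2 + 4m(n-m)t + 2(n-m)^2}. \]
(2) When $n = m$, for every $U \in \mathcal{U}(k)$, let
\[ \tilde\Sigma(U) = \Big\{ W \in \mathcal{U}(k) \ \Big|\ \exists \text{ a unitary matrix } W_1 \text{ in } V_2 \text{ such that } \|W - U W_1\|_2 \le \frac{3\delta}{\theta} \Big\} \]
be a subset of $\mathcal{U}(k)$. Then there is a universal constant $\tilde C > 0$ such that
\[ \mu(\tilde\Sigma(U)) \le (C_1 \cdot 8\delta/\theta)^{k^2} \cdot \Big(\frac{\tilde C\theta}{\delta}\Big)^{nt^2}, \]
where $C_1$ is the same constant as in (1).

Proof. (1) Note that the linear space $V_2$ can be viewed as a Euclidean space of real dimension $2mt^2 + 4m(n-m)t + 2(n-m)^2$. Note that $\|\cdot\|_2$ is the normalized Hilbert–Schmidt norm on $V_2$. By part (1) of Sublemma 3.2.2, there is a universal constant $C > 0$ so that the following is true: there is a family of elements $\{S_\lambda\}_{\lambda \in \Lambda}$ in $V_2$ such that (i) $\{\mathrm{Ball}(S_\lambda, \delta/\theta, \|\cdot\|_2)\}_{\lambda \in \Lambda}$ is a covering of the set $\{V \in V_2 \mid \|V\|_2 \le 1\}$ and (ii)
\[ |\Lambda| \le \Big(\frac{C\theta}{\delta}\Big)^{2mt^2 + 4m(n-m)t + 2(n-m)^2}, \]
where $|\Lambda|$ is the cardinality of $\Lambda$. For each $T$ in $M_k(\mathbb{C})$ and $\omega > 0$, let
\[ BU(T, \omega, \|\cdot\|_2) = \{ W \in \mathcal{U}(k) \mid \|W - T\|_2 \le \omega \} \]
be a subset of $\mathcal{U}(k)$. Then by the construction of $\Sigma(U)$, we know that
\[ \Sigma(U) \subseteq \bigcup_{\lambda \in \Lambda} BU(U S_\lambda, 2\delta/\theta, \|\cdot\|_2). \]
By part (3) of Sublemma 3.2.2, we know that there is a universal constant $C_1 > 0$ such that
\[ \mu\big(BU(U S_\lambda, 2\delta/\theta, \|\cdot\|_2)\big) \le (C_1 \cdot 4\delta/\theta)^{k^2}, \quad \forall \lambda \in \Lambda. \]
Hence
\[ \mu(\Sigma(U)) \le \sum_{\lambda \in \Lambda} \mu\big(BU(U S_\lambda, 2\delta/\theta, \|\cdot\|_2)\big) \le (C_1 \cdot 4\delta/\theta)^{k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{2mt^2 + 4m(n-m)t + 2(n-m)^2}. \]

(2) When $n = m$, we know that $V_2 \cap \mathcal{U}(k) \cong \mathcal{U}(t) \oplus \cdots \oplus \mathcal{U}(t)$. By part (2) of Sublemma 3.2.2, there is a family of elements $\{V_\lambda\}_{\lambda \in \Lambda}$ in $V_2 \cap \mathcal{U}(k)$ such that (i) $\{BU(V_\lambda, \delta/\theta, \|\cdot\|_2)\}_{\lambda \in \Lambda}$ is a covering of the set $V_2 \cap \mathcal{U}(k)$ and (ii)
\[ |\Lambda| \le \Big(\frac{\tilde C\theta}{\delta}\Big)^{nt^2}, \]
where $\tilde C > 0$ is a universal constant. The rest is similar to the proof of part (1) and we skip it. □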

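Lemma 3.2.4. Let $A$, $\Omega(A)$ be defined as in (vi) at the beginning of this subsection.

(1) The covering number of $\Omega(A)$ by the $\frac{1}{2}\delta$-$\|\cdot\|_2$-balls in $M_k(\mathbb{C})$ is bounded by
\[ \nu_2\Big(\Omega(A), \frac{1}{2}\delta\Big) \ge (C_1 \cdot 4\delta/\theta)^{-k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{-(2mt^2 + 4m(n-m)t + 2(n-m)^2)}. \]
(2) If $n = m$, then
\[ \nu_2\Big(\Omega(A), \frac{1}{2}\delta\Big) \ge (C_1 \cdot 8\delta/\theta)^{-k^2} \cdot \Big(\frac{\tilde C\theta}{\delta}\Big)^{-mt^2}. \]

Proof. (1) Recall that, for every $U \in \mathcal{U}(k)$, we define
\[ \Sigma(U) = \Big\{ W \in \mathcal{U}(k) \ \Big|\ \exists S \in V_2 \text{ such that } \|S\|_2 \le 1,\ \|W - U S\|_2 \le \frac{\delta}{\theta} \Big\} \subset \mathcal{U}(k). \]
Let $\mu$ be the normalized Haar measure on the unitary group $\mathcal{U}(k)$, i.e. $\mu(\mathcal{U}(k)) = 1$. Pick any $U_1$ in $\mathcal{U}(k)$. Then by the preceding lemma,
\[ \mu(\Sigma(U_1)) \le (C_1 \cdot 4\delta/\theta)^{k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{2mt^2 + 4m(n-m)t + 2(n-m)^2}, \]
where $C, C_1 > 0$ are universal constants. If
\[ (C_1 \cdot 4\delta/\theta)^{k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{2mt^2 + 4m(n-m)t + 2(n-m)^2} < 1, \]
then there is some $U_2 \in \mathcal{U}(k) \setminus \Sigma(U_1)$. Now
\[ \mu(\Sigma(U_1) \cup \Sigma(U_2)) \le \mu(\Sigma(U_1)) + \mu(\Sigma(U_2)) \le 2 \cdot (C_1 \cdot 4\delta/\theta)^{k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{2mt^2 + 4m(n-m)t + 2(n-m)^2}. \]
If
\[ 2 \cdot (C_1 \cdot 4\delta/\theta)^{k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{2mt^2 + 4m(n-m)t + 2(n-m)^2} < 1, \]
then there is a $U_3 \in \mathcal{U}(k) \setminus (\Sigma(U_1) \cup \Sigma(U_2))$. Continue this process until we find a sequence of unitary matrices $\{U_i\}_{i=1}^N$ in $\mathcal{U}(k)$ such that
\[ N \ge (C_1 \cdot 4\delta/\theta)^{-k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{-(2mt^2 + 4m(n-m)t + 2(n-m)^2)} \]
and
\[ U_i \text{ is not contained in } \bigcup_{j=1}^{i-1} \Sigma(U_j), \quad \forall i = 2, \dots, N. \tag{3.2.1} \]
From the definition of each $\Sigma(U_j)$ and the fact (3.2.1), it follows that
\[ \|U_i - U_j S\|_2 > \frac{\delta}{\theta}, \quad \forall\, 1 \le j < i \le N,\ \forall S \in V_2 \text{ with } \|S\|_2 \le 1. \]
By Lemma 3.2.1, we know that
\[ \|U_i A U_i^* - U_j A U_j^*\|_2 > \delta, \quad \forall\, 1 \le j < i \le N. \]
Recall $\Omega(A) = \{U A U^* \mid U \in \mathcal{U}(k)\}$. Hence
\[ \nu_2\Big(\Omega(A), \frac{1}{2}\delta\Big) \ge N \ge (C_1 \cdot 4\delta/\theta)^{-k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{-(2mt^2 + 4m(n-m)t + 2(n-m)^2)}. \]

(2) The proof is similar to that of (1). □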

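3.3.

As a summary, we have the following theorem.

Theorem 3.3.1. Let $n \ge m$, $\delta, \theta > 0$ and $\{\lambda_1, \lambda_2, \dots, \lambda_m\} \cup \{\lambda_{m+1}, \dots, \lambda_n\}$ be a family of real numbers such that
\[ |\lambda_i - \lambda_j| \ge \theta \quad \text{for all } 1 \le i < j \le m. \]
Let $k$ be a positive integer such that $k - (n - m)$ is divisible by $m$ and
\[ t = \frac{k - n + m}{m}. \]
Let $B = \mathrm{diag}(\lambda_{m+1}, \dots, \lambda_n)$ be a diagonal matrix in $M_{n-m}(\mathbb{C})$ and
\[ A = \mathrm{diag}(\lambda_1 I_t, \lambda_2 I_t, \dots, \lambda_m I_t, B) \]
be a block-diagonal matrix in $M_k(\mathbb{C})$, where $I_t$ is the identity matrix in $M_t(\mathbb{C})$. We let
\[ \Omega(A) = \{ U^* A U \mid U \text{ is in } \mathcal{U}(k) \} \]
be the unitary orbit of $A$ in $M_k(\mathbb{C})$. Then the covering number of $\Omega(A)$ by the $\frac{1}{2}\delta$-$\|\cdot\|$-balls in $M_k(\mathbb{C})$ is bounded by
\[ \nu_\infty\Big(\Omega(A), \frac{1}{2}\delta\Big) \ge (C_1 \cdot 4\delta/\theta)^{-k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{-(2mt^2 + 4m(n-m)t + 2(n-m)^2)}, \]
where $C, C_1 > 0$ are some universal constants. When $n = m$, we have
\[ \nu_\infty\Big(\Omega(A), \frac{1}{2}\delta\Big) \ge (C_1 \cdot 8\delta/\theta)^{-k^2} \cdot \Big(\frac{\tilde C\theta}{\delta}\Big)^{-nt^2}, \]
where $\tilde C > 0$ is a universal constant and $C_1$ is as above.

Proof. Note that
\[ \nu_\infty\Big(\Omega(A), \frac{\delta}{2}\Big) \ge \nu_2\Big(\Omega(A), \frac{\delta}{2}\Big), \quad \forall \delta > 0. \]
The result follows directly from the preceding lemma. □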

3.4.

The following proposition, whose proof is skipped, is a direct extension of Lemma 3.2.4.

Proposition 3.4.1. Let $m, k$ be some positive integers and $\theta, \delta$ be some positive numbers. Let $T_1, T_2, \dots, T_{m+1}$ be a partition of the set $\{1, 2, \dots, k\}$, i.e. $\bigcup_{i=1}^{m+1} T_i = \{1, 2, \dots, k\}$ and $T_i \cap T_j = \emptyset$ for $1 \le i \ne j \le m + 1$. Let $\lambda_1, \dots, \lambda_k$ be real numbers such that, if $1 \le j_1 \ne j_2 \le m$, then
\[ |\lambda_{i_1} - \lambda_{i_2}| > \theta, \quad \forall i_1 \in T_{j_1},\ \forall i_2 \in T_{j_2}. \]
Let $A = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_k)$ be a self-adjoint matrix in $M_k(\mathbb{C})$ and
\[ \Omega(A) = \{ U^* A U \mid U \in \mathcal{U}(k) \} \]
be the unitary orbit of $A$ in $M_k(\mathbb{C})$. Let $s_j$ be the cardinality of the set $T_j$ for $1 \le j \le m + 1$. Then the covering number of $\Omega(A)$ by the $\frac{1}{2}\delta$-$\|\cdot\|_2$-balls in $M_k(\mathbb{C})$ is bounded by
\[ \nu_2\Big(\Omega(A), \frac{1}{2}\delta\Big) \ge (C_1 \cdot 4\delta/\theta)^{-k^2} \cdot \Big(\frac{C\theta}{\delta}\Big)^{-2s_1^2 - 2s_2^2 - \cdots - 2s_{m+1}^2 - 4(s_1 + \cdots + s_m)s_{m+1}}, \]
where $C, C_1 > 0$ are some universal constants.

4. Topological free entropy dimension of one variable

Suppose $x$ is a self-adjoint element of a unital C∗ algebra $\mathcal{A}$. Let $\{P_r\}_{r=1}^\infty$ be the collection of all polynomials in $\mathbb{C}\langle X\rangle$ with rational coefficients. In this section, we compute the topological free entropy dimension of $x$.

4.1. Upper bound

Proposition 4.1.1. Suppose $x$ in $\mathcal{A}$ is a self-adjoint element with the spectrum $\sigma(x)$. Then
\[ \delta_{\mathrm{top}}(x) \le 1 - \frac{1}{n}, \]
where $n$ is the cardinality of $\sigma(x)$. Here we assume that $\frac{1}{\infty} = 0$.

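Proof. By [27], we know that the inequality always holds when $n$ is infinity. We need only to show that
\[ \delta_{\mathrm{top}}(x) \le 1 - \frac{1}{n} \]
when $n < \infty$. Assume that $\lambda_1, \dots, \lambda_n$ are the distinct elements in $\sigma(x)$. Let $R > \|x\|$. By Theorem 3.1.1, for every $\omega > 0$, there are $r_0 > 0$ and $\varepsilon_0 > 0$ such that, for all $r \ge r_0$, $\varepsilon \le \varepsilon_0$, any $k \ge 1$ and any $A \in \Gamma_R^{(\mathrm{top})}(x; k, \varepsilon, P_1, \dots, P_r)$, there are some $1 \le k_1, \dots, k_n \le k$ with $k_1 + \cdots + k_n = k$ and a unitary matrix $U$ in $M_k(\mathbb{C})$ satisfying
\[ \left\| A - U \begin{pmatrix} \lambda_1 I_{k_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{k_2} & \cdots & 0 \\ 0 & 0 & \cdots & \lambda_n I_{k_n} \end{pmatrix} U^* \right\| \le 2\omega. \tag{$**$} \]
Denote
\[ \Omega(k_1, \dots, k_n) = \left\{ U \begin{pmatrix} \lambda_1 I_{k_1} & 0 & \cdots & 0 & 0 \\ 0 & \lambda_2 I_{k_2} & \cdots & 0 & 0 \\ 0 & 0 & \cdots & \lambda_{n-1} I_{k_{n-1}} & 0 \\ 0 & 0 & \cdots & 0 & \lambda_n I_{k_n} \end{pmatrix} U^* \ \middle|\ U \text{ is in } \mathcal{U}(k) \right\}. \]
By Corollary 12 in [22] or Theorem 3 in [2] (see also the proof of Theorem 3.1 in [25]), there is a family of unitary matrices $\{U_\lambda\}_{\lambda \in \Lambda}$ in $\mathcal{U}(k)$ giving an $\omega$-net in $\mathcal{U}(k)/(\mathcal{U}(k_1) \oplus \cdots \oplus \mathcal{U}(k_n))$ with respect to the distance obtained from the operator norm and satisfying
\[ |\Lambda| \le \Big(\frac{C_2}{\omega}\Big)^{k^2 - \sum_{i=1}^n k_i^2}, \]
where $C_2 > 1$ is a universal constant depending only on $n$. Since
\[ \Omega(k_1, \dots, k_n) = \left\{ U \begin{pmatrix} \lambda_1 I_{k_1} & \cdots & 0 \\ \cdots & \cdots & \cdots \\ 0 & \cdots & \lambda_n I_{k_n} \end{pmatrix} U^* \ \middle|\ U \in \mathcal{U}(k)/(\mathcal{U}(k_1) \oplus \cdots \oplus \mathcal{U}(k_n)) \right\}, \]
the covering number of $\Omega(k_1, \dots, k_{n-1}, k_n)$ by $2R\omega$-$\|\cdot\|$-balls in $M_k(\mathbb{C})$ is bounded by
\[ \nu_\infty\big(\Omega(k_1, \dots, k_{n-1}, k_n), 2R\omega\big) \le |\Lambda| \le \Big(\frac{C_2}{\omega}\Big)^{k^2 - \sum_{i=1}^n k_i^2}. \tag{4.1.1} \]
For each $k \in \mathbb{N}$, let $\mathcal{I}(k)$ be the set consisting of all those $(k_1, \dots, k_n)$ in $\mathbb{Z}^n$ such that $1 \le k_1, \dots, k_n \le k$ and $k_1 + \cdots + k_n = k$. Then the cardinality of the set $\mathcal{I}(k)$ is equal to
\[ \frac{(k-1)!}{(n-1)!\,(k-n)!}. \tag{4.1.2} \]
Note that
\[ \sum_{i=1}^n k_i^2 \ge k^2/n \tag{4.1.3} \]
for all $1 \le k_1, \dots, k_n \le k$ with $k_1 + \cdots + k_n = k$; and by ($**$),
\[ \Gamma_R^{(\mathrm{top})}(x; k, \varepsilon, P_1, \dots, P_r) \]
is contained in the $2\omega$-neighborhood of the set
\[ \bigcup_{(k_1, \dots, k_n) \in \mathcal{I}(k)} \Omega(k_1, \dots, k_n). \]
Put (4.1.1), (4.1.2) and (4.1.3) together. It follows that the covering number of the set $\Gamma_R^{(\mathrm{top})}(x; k, \varepsilon, P_1, \dots, P_r)$ by $(2R+2)\omega$-$\|\cdot\|$-balls in $M_k(\mathbb{C})$ is bounded by
\[ \nu_\infty\big(\Gamma_R^{(\mathrm{top})}(x; k, \varepsilon, P_1, \dots, P_r), (2R+2)\omega\big) \le \sum_{(k_1, \dots, k_n) \in \mathcal{I}(k)} \nu_\infty\big(\Omega(k_1, \dots, k_{n-1}, k_n), 2R\omega\big) \le \frac{(k-1)!}{(n-1)!\,(k-n)!} \cdot \Big(\frac{C_2}{\omega}\Big)^{k^2 - k^2/n}. \]
Thus, by Remark 2.5.3 we have
\[ \delta_{\mathrm{top}}(x) \le \limsup_{\omega \to 0^+} \limsup_{k \to \infty} \frac{\log\Big( \frac{(k-1)!}{(n-1)!\,(k-n)!} \cdot \big(\frac{C_2}{\omega}\big)^{k^2 - k^2/n} \Big)}{-k^2 \log((2R+2)\omega)} = 1 - \frac{1}{n}. \qquad \Box \]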
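The counting steps in this proof can be checked numerically. The sketch below is only an illustration under stated assumptions: the values chosen for $R$ and $C_2$ are placeholders (the paper gives no numerical values), `compositions` enumerates the index set $\mathcal{I}(k)$ by brute force for small $k$, and `upper_ratio` evaluates the final quotient using `math.lgamma` for the logarithm of the factorial term.

```python
import math
from itertools import product


def compositions(k, n):
    """All (k_1, ..., k_n) with 1 <= k_i <= k and k_1 + ... + k_n = k."""
    return [c for c in product(range(1, k + 1), repeat=n) if sum(c) == k]


def log_num_compositions(k, n):
    """log of (k-1)! / ((n-1)! (k-n)!), computed via log-gamma."""
    return math.lgamma(k) - math.lgamma(n) - math.lgamma(k - n + 1)


def upper_ratio(k, n, omega, R=1.0, C2=2.0):
    """Quotient bounding delta_top(x) from above; R and C2 are placeholders."""
    log_bound = log_num_compositions(k, n) + (k * k - k * k / n) * math.log(C2 / omega)
    return log_bound / (-k * k * math.log((2 * R + 2) * omega))


# Cardinality formula (4.1.2) and inequality (4.1.3) for k = 7, n = 3.
comps = compositions(7, 3)
print(len(comps), math.comb(6, 2))
print(all(sum(ki * ki for ki in c) >= 7 * 7 / 3 for c in comps))

# The quotient approaches 1 - 1/n as k grows and omega shrinks (slowly in omega).
for omega in (1e-3, 1e-30, 1e-300):
    print(omega, upper_ratio(10**6, 2, omega))
```

The factorial term contributes only $O(\log k)$ to the numerator, so it disappears after dividing by $k^2$; the remaining dependence on $C_2$ and $R$ decays like $1/|\log \omega|$.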

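4.2. Lower bound

We follow the notation from the last subsection.

Proposition 4.2.1. Suppose that $x$ is a self-adjoint element with finite spectrum $\sigma(x)$ in $\mathcal{A}$. Then
\[ \delta_{\mathrm{top}}(x) \ge 1 - \frac{1}{n}, \]
where $n$ is the cardinality of the set $\sigma(x)$.

Proof. Suppose that $\lambda_1, \dots, \lambda_n$ are the distinct elements in $\sigma(x)$. Let $\theta > 0$ be a fixed number such that
\[ |\lambda_i - \lambda_j| > \theta, \quad \forall\, 1 \le i \ne j \le n. \]
Assume $k = nt$ for some positive integer $t$. Let
\[ A_k = \mathrm{diag}(\lambda_1 I_t, \dots, \lambda_n I_t) \]
be a diagonal matrix in $M_k(\mathbb{C})$, where $I_t$ is the $t \times t$ identity matrix. By the choice of $A_k$ we know that, for all $R > \|x\|$, $r \ge 1$ and $\varepsilon > 0$,
\[ A_k \in \Gamma_R^{(\mathrm{top})}(x; k, \varepsilon, P_1, \dots, P_r), \]
whence
\[ \Omega(A_k) \subseteq \Gamma_R^{(\mathrm{top})}(x; k, \varepsilon, P_1, \dots, P_r), \]
where $\Omega(A_k) = \{U^* A_k U \mid U \in \mathcal{U}(k)\}$. For any $\omega > 0$, applying Theorem 3.3.1 for $n = m$ and $\delta = 2\omega$, we have
\[ \nu_\infty\big(\Gamma_R^{(\mathrm{top})}(x; k, \varepsilon, P_1, \dots, P_r), \omega\big) \ge (C_1 \cdot 8\delta/\theta)^{-k^2} \cdot \Big(\frac{\tilde C\theta}{\delta}\Big)^{-nt^2} = (16 C_1 \omega/\theta)^{-k^2} \cdot \Big(\frac{\tilde C\theta}{2\omega}\Big)^{-k^2/n} \]
for some universal constants $C_1, \tilde C > 0$. Note that $\theta$ is a fixed number. By the definition of topological free entropy dimension, we obtain
\[ \delta_{\mathrm{top}}(x) \ge 1 - \frac{1}{n}. \qquad \Box \]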
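Proposition 4.2.2. Suppose that $x$ is a self-adjoint element in $\mathcal{A}$ with infinite spectrum. Then $\delta_{top}(x)\ge 1$.

Proof. For any $0<\theta<1$, there are a positive integer $m$ and real numbers $\lambda_1, \ldots, \lambda_m$ in $\sigma(x)$ satisfying (i) $|\lambda_i-\lambda_j|\ge\theta$ for any $1\le i\ne j\le m$; and (ii) for any $\lambda$ in $\sigma(x)$, there is some $\lambda_j$ with $1\le j\le m$ and $|\lambda-\lambda_j|\le\theta$. By the spectral theorem, for any $R>\|x\|$, $r\ge 1$ and $\varepsilon>0$, there are some positive integer $n\ge m$ and real numbers $\lambda_{m+1}, \ldots, \lambda_n$ in $\sigma(x)$ satisfying: for every $t\ge 1$ the matrix
\[
A=\operatorname{diag}(\lambda_1 I_t, \lambda_2 I_t, \ldots, \lambda_m I_t, \lambda_{m+1}, \ldots, \lambda_n)
\]
is in $\Gamma_R^{(top)}(x; k, \varepsilon, P_1, \ldots, P_r)$, where $k=mt+n-m$. Hence
\[
\Omega(A)\subseteq\Gamma_R^{(top)}(x; k, \varepsilon, P_1, \ldots, P_r),
\]
where $\Omega(A)=\{U^*AU \mid U\in\mathcal{U}(k)\}$. For any $\omega>0$, let $\delta=2\omega$. By Theorem 3.3.1, we know that
\[
\nu_\infty\bigl(\Gamma_R^{(top)}(x; k, \varepsilon, P_1, \ldots, P_r), \omega\bigr)
\ge (C_1\cdot 4\delta/\theta)^{-k^2}\cdot\Bigl(\frac{C\theta}{\delta}\Bigr)^{-(2mt^2+4m(n-m)t+2(n-m)^2)},
\qquad \forall\, t\in\mathbb{N},
\]
for some universal constants $C, C_1>0$. Thus
\[
\limsup_{k\to\infty}\frac{\log\bigl(\nu_\infty(\Gamma_R^{(top)}(x; k, \varepsilon, P_1, \ldots, P_r), \omega)\bigr)}{-k^2\log\omega}
\ge \frac{\log(8C_1)-\log\theta}{\log\omega}+1+\frac{2}{m}\cdot\frac{\log(C/2)+\log\theta}{\log\omega}-\frac{2}{m}.
\]
Note that $\theta$, $m$ are numbers independent of $r$, $\varepsilon$ and $\omega$. Then
\[
\delta_{top}(x)\ge 1-\frac{2}{m}.
\]
When $\theta$ goes to $0$, $m$ goes to infinity as $\sigma(x)$ has infinitely many elements. Therefore $\delta_{top}(x)\ge 1$. $\Box$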
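The arithmetic behind the bound $1-2/m$ in the proof of Proposition 4.2.2 can be spelled out as follows. This is only a sketch of the computation, taking the estimate from Theorem 3.3.1 exactly as quoted in the proof, with $\delta=2\omega$, $k=mt+n-m$ and $0<\omega<1$.

```latex
\begin{align*}
\frac{\log \nu_\infty(\,\cdot\,,\omega)}{-k^2\log\omega}
  &\ge \frac{\log(8C_1\omega/\theta)}{\log\omega}
     + \frac{2mt^2+4m(n-m)t+2(n-m)^2}{k^2}\cdot\frac{\log\bigl(C\theta/(2\omega)\bigr)}{\log\omega},\\[4pt]
\frac{\log(8C_1\omega/\theta)}{\log\omega}
  &= 1+\frac{\log(8C_1)-\log\theta}{\log\omega},\\[4pt]
\frac{\log\bigl(C\theta/(2\omega)\bigr)}{\log\omega}
  &= \frac{\log(C/2)+\log\theta}{\log\omega}-1,\\[4pt]
\lim_{t\to\infty}\frac{2mt^2+4m(n-m)t+2(n-m)^2}{(mt+n-m)^2}
  &= \frac{2}{m}.
\end{align*}
```

Letting $t\to\infty$ and then $\omega\to 0^+$ removes every term with $\log\omega$ in the denominator and leaves $1-2/m$.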

4.3. Topological free entropy dimension in one variable case

By Propositions 4.1.1, 4.2.1 and 4.2.2, we have the following result.

Theorem 4.3.1. Suppose $x$ is a self-adjoint element in a unital C∗-algebra $\mathcal{A}$. Then
\[
\delta_{top}(x)=1-\frac{1}{n},
\]
where $n$ is the cardinality of the spectrum of $x$ in $\mathcal{A}$. Here we assume that $\frac{1}{\infty}=0$.
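As a small illustration of Theorem 4.3.1, the following sketch tabulates the value $1-1/n$ for a given spectrum size. The function name and the use of `math.inf` to encode an infinite spectrum are conventions chosen here for illustration only.

```python
from fractions import Fraction
import math

def delta_top_single(spectrum_size):
    """Value 1 - 1/n from Theorem 4.3.1.

    spectrum_size is the number n of points in the spectrum of x;
    math.inf encodes an infinite spectrum, with the convention 1/inf = 0.
    """
    if spectrum_size == math.inf:
        return Fraction(1)
    if spectrum_size < 1:
        raise ValueError("the spectrum of an element is non-empty")
    return 1 - Fraction(1, spectrum_size)

if __name__ == "__main__":
    # A scalar multiple of the identity (n = 1), a non-trivial projection
    # (spectrum {0, 1}, n = 2), and an element with infinite spectrum.
    for n in (1, 2, 5, math.inf):
        print(n, delta_top_single(n))
```

For instance, a non-trivial projection has spectrum $\{0,1\}$, so the formula gives $1/2$.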

5. Topological free entropy dimension of n-tuple in unital C∗ algebras

5.1. An equivalent definition of topological free entropy dimension by Voiculescu

Suppose that $\mathcal{A}$ is a unital C∗-algebra and $x_1, \ldots, x_n, y_1, \ldots, y_m$ are self-adjoint elements in $\mathcal{A}$. Let $\{P_r\}_{r=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \ldots, X_n, Y_1, \ldots, Y_m\rangle$ with rational coefficients. For every $R, \varepsilon>0$ and positive integers $r, k$, let
\[
\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_m; k, \varepsilon, P_1, \ldots, P_r)
\]
be Voiculescu's norm-microstate space defined in Section 2.4. For any $\omega>0$, define
\[
\nu_2\bigl(\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_m; k, \varepsilon, P_1, \ldots, P_r), \omega\bigr)
\]
to be the covering number of the set $\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_m; k, \varepsilon, P_1, \ldots, P_r)$ by $\omega$-$\|\cdot\|_2$-balls in the metric space $(M_k^{s.a.}(\mathbb{C}))^n$ equipped with the normalized Hilbert–Schmidt norm (see Definition 2.2.2).

Definition 5.1.1. Define
\[
\tilde{\delta}_{top}(x_1, \ldots, x_n : y_1, \ldots, y_m; \omega)
= \sup_{R>0}\ \inf_{\varepsilon>0,\, r\in\mathbb{N}}\ \limsup_{k\to\infty}
\frac{\log\bigl(\nu_2(\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_m; k, \varepsilon, P_1, \ldots, P_r), \omega)\bigr)}{-k^2\log\omega}
\]
and
\[
\tilde{\delta}_{top}(x_1, \ldots, x_n : y_1, \ldots, y_m)
= \limsup_{\omega\to 0^+}\tilde{\delta}_{top}(x_1, \ldots, x_n : y_1, \ldots, y_m; \omega).
\]



The following proposition was shown by Voiculescu in Section 6 of [27].

Proposition 5.1.2. (From [27].) Suppose that $\mathcal{A}$ is a unital C∗-algebra and $x_1, \ldots, x_n, y_1, \ldots, y_m$ are self-adjoint elements in $\mathcal{A}$. Then
\[
\tilde{\delta}_{top}(x_1, \ldots, x_n : y_1, \ldots, y_m)=\delta_{top}(x_1, \ldots, x_n : y_1, \ldots, y_m),
\]
where $\delta_{top}(x_1, \ldots, x_n : y_1, \ldots, y_m)$ is the topological free entropy dimension of $x_1, \ldots, x_n$ in the presence of $y_1, \ldots, y_m$.

5.2. Upper-bound of topological free entropy dimension in a unital C∗ algebra

Let us recall Voiculescu's definition of free dimension capacity in [27].

Definition 5.2.1. Suppose that $\mathcal{A}$ is a unital C∗-algebra with a family of self-adjoint generators $x_1, \ldots, x_n$. Suppose that $TS(\mathcal{A})$ is the set consisting of all tracial states of $\mathcal{A}$. If $TS(\mathcal{A})\ne\emptyset$, define Voiculescu's free dimension capacity $\kappa\delta(x_1, \ldots, x_n)$ of $x_1, \ldots, x_n$ as follows:
\[
\kappa\delta(x_1, \ldots, x_n)=\sup_{\tau\in TS(\mathcal{A})}\delta_0(x_1, \ldots, x_n : \tau),
\]
where $\delta_0(x_1, \ldots, x_n : \tau)$ is Voiculescu's (von Neumann algebra) free entropy dimension of $x_1, \ldots, x_n$ in $\langle\mathcal{A}, \tau\rangle$.

The relationship between the topological free entropy dimension of a unital C∗-algebra with a unique tracial state and its free dimension capacity is indicated by the following result.

Theorem 5.2.2. Suppose that $\mathcal{A}$ is a unital C∗-algebra with a family of self-adjoint generators $x_1, \ldots, x_n$. Suppose that $TS(\mathcal{A})$ is the set consisting of all tracial states of $\mathcal{A}$. If $TS(\mathcal{A})$ is a set with a single element, then
\[
\delta_{top}(x_1, \ldots, x_n)\le\kappa\delta(x_1, \ldots, x_n).
\]

To prove the preceding theorem, we need the following sublemma.

Sublemma 5.2.3. Suppose that $\mathcal{A}$ is a unital C∗-algebra with a family of self-adjoint generators $x_1, \ldots, x_n$. Suppose that $TS(\mathcal{A})\ne\emptyset$ is the set consisting of all tracial states of $\mathcal{A}$. Let $\{P_r\}_{r=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \ldots, X_n\rangle$ with rational coefficients. Let $R>\max\{\|x_1\|, \ldots, \|x_n\|\}$ be a positive number. Then for any $m\ge 1$, there is an $r\in\mathbb{N}$ such that
\[
\Gamma_R^{(top)}\Bigl(x_1, \ldots, x_n; k, \frac{1}{r}, P_1, \ldots, P_r\Bigr)
\subseteq\bigcup_{\tau\in TS(\mathcal{A})}\Gamma_R\Bigl(x_1, \ldots, x_n; k, m, \frac{1}{m}; \tau\Bigr),
\qquad\forall\, k\ge 1,
\]
where $\Gamma_R(x_1, \ldots, x_n; k, m, \frac{1}{m}; \tau)$ is the microstate space of $x_1, \ldots, x_n$ with respect to $\tau$ (see [24]).



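Proof of Sublemma 5.2.3. We will prove the result by contradiction. Suppose, to the contrary, that there is some $m_0\ge 1$ so that the following holds: for any $r\in\mathbb{N}$, there are some $k_r\ge 1$ and some
\[
\bigl(A_1^{(r)}, A_2^{(r)}, \ldots, A_n^{(r)}\bigr)\in\Gamma_R^{(top)}\Bigl(x_1, \ldots, x_n; k_r, \frac{1}{r}, P_1, \ldots, P_r\Bigr)
\]
satisfying
\[
\bigl(A_1^{(r)}, A_2^{(r)}, \ldots, A_n^{(r)}\bigr)\notin\bigcup_{\tau\in TS(\mathcal{A})}\Gamma_R\Bigl(x_1, \ldots, x_n; k_r, m_0, \frac{1}{m_0}; \tau\Bigr). \tag{5.2.1}
\]
Let $\alpha$ be a free ultrafilter in $\beta(\mathbb{N})\setminus\mathbb{N}$. Let $\mathcal{N}=\prod_{r=1}^{\alpha}M_{k_r}(\mathbb{C})$ be the von Neumann algebra ultraproduct of $\{M_{k_r}(\mathbb{C})\}_{r=1}^\infty$ along the ultrafilter $\alpha$, i.e. $\prod_{r=1}^{\alpha}M_{k_r}(\mathbb{C})$ is the quotient algebra of the C∗-algebra $\prod_{r=1}^{\infty}M_{k_r}(\mathbb{C})$ by $\mathcal{I}_2$, the $0$-ideal of the trace $\tau_\alpha$, where $\tau_\alpha((B_r)_{r=1}^\infty)=\lim_{r\to\alpha}\frac{\operatorname{Tr}(B_r)}{k_r}$. Note that $\mathcal{I}_\infty\subseteq\mathcal{I}_2$ (see Definitions 2.6.1 and 2.6.2). Thus there is a ∗-homomorphism from $\prod_{r=1}^{\infty}M_{k_r}(\mathbb{C})/\mathcal{I}_\infty$ onto $\mathcal{N}=\prod_{r=1}^{\infty}M_{k_r}(\mathbb{C})/\mathcal{I}_2$. Let, for each $1\le j\le n$, $a_j=[(A_j^{(r)})_{r=1}^\infty]$ be a self-adjoint element in $\mathcal{N}$. By mapping $x_j$ to $a_j$, there is a unital ∗-homomorphism $\psi$ from the C∗-algebra $\mathcal{A}$ onto the C∗-subalgebra generated by $\{a_1, \ldots, a_n\}$ in $\mathcal{N}$. Let $\tau_0$ be the tracial state on $\mathcal{A}$ which is induced by $\tau_\alpha$ on $\psi(\mathcal{A})$, i.e.
\[
\tau_0(x)=\tau_\alpha(\psi(x)),\qquad\forall\, x\in\mathcal{A}.
\]
It follows that when $r$ is large enough,
\[
\bigl(A_1^{(r)}, A_2^{(r)}, \ldots, A_n^{(r)}\bigr)\in\Gamma_R\Bigl(x_1, \ldots, x_n; k_r, m_0, \frac{1}{m_0}; \tau_0\Bigr),
\]
which contradicts (5.2.1). This completes the proof of Sublemma 5.2.3. $\Box$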

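Proof of Theorem 5.2.2. Let $R>\max\{\|x_1\|, \ldots, \|x_n\|\}$. Let $\tau$ be the unique trace of $\mathcal{A}$. By Sublemma 5.2.3, for any $m\ge 1$, there is $r\in\mathbb{N}$ such that
\[
\Gamma_R^{(top)}\Bigl(x_1, \ldots, x_n; k, \frac{1}{r}, P_1, \ldots, P_r\Bigr)
\subseteq\Gamma_R\Bigl(x_1, \ldots, x_n; k, m, \frac{1}{m}; \tau\Bigr),\qquad\forall\, k\ge 1.
\]
Therefore, for any $1>\omega>0$, we have
\[
\nu_2\Bigl(\Gamma_R^{(top)}\Bigl(x_1, \ldots, x_n; k, \frac{1}{r}, P_1, \ldots, P_r\Bigr), \omega\Bigr)
\le\nu_2\Bigl(\Gamma_R\Bigl(x_1, \ldots, x_n; k, m, \frac{1}{m}; \tau\Bigr), \omega\Bigr),\qquad\forall\, k\ge 1.
\]
Now it is easy to check that
\[
\tilde{\delta}_{top}(x_1, \ldots, x_n)\le\delta_0(x_1, \ldots, x_n : \tau)=\kappa\delta(x_1, \ldots, x_n).
\]
By Proposition 5.1.2, we know that
\[
\delta_{top}(x_1, \ldots, x_n)\le\kappa\delta(x_1, \ldots, x_n). \qquad\Box
\]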

Remark 5.2.4. Combining Theorem 5.2.2 with the results in [11] or [14], we can compute an upper bound of the topological free entropy dimension for a large class of unital C∗-algebras. For example, $\delta_{top}(x_1, \ldots, x_n)\le 1$ if $x_1, \ldots, x_n$ is a family of self-adjoint operators that generates an irrational rotation algebra $\mathcal{A}$.

5.3. Lower-bound of topological free entropy dimension in a unital C∗ algebra

In this subsection, we assume that $\mathcal{A}$ is a finitely generated, infinite dimensional, unital simple C∗-algebra with a unique tracial state $\tau$. Assume that $x_1, \ldots, x_n$ is a family of self-adjoint generators of $\mathcal{A}$. Let $H$ be the Hilbert space $L^2(\mathcal{A}, \tau)$. Without loss of generality, we may assume that $\mathcal{A}$ acts on the Hilbert space $H$ by the GNS representation. Let $\mathcal{M}$ be the von Neumann algebra generated by $\mathcal{A}$ on $H$. Thus $\mathcal{M}$ is a diffuse von Neumann algebra with a tracial state $\tau$. For each positive integer $m$, there is a family of mutually orthogonal projections $p_1, \ldots, p_m$ in $\mathcal{M}$ such that $\tau(p_j)=1/m$ for $1\le j\le m$. Let
\[
y_m=1\cdot p_1+2\cdot p_2+\cdots+m\cdot p_m=\sum_{j=1}^{m}j\cdot p_j\in\mathcal{M}. \tag{5.3.1}
\]
Let $\{P_r\}_{r=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \ldots, X_n\rangle$ with rational coefficients. Thus $\{P_r(x_1, \ldots, x_n)\}_{r=1}^\infty$ is dense in $\mathcal{M}$ with respect to the strong operator topology. Hence, for each $m\ge 1$, there are a positive integer $r_m$ and a self-adjoint element $P_{r_m}(x_1, \ldots, x_n)$ in $\mathcal{A}$ such that
\[
\|y_m-P_{r_m}(x_1, \ldots, x_n)\|_2\le\frac{1}{m^3}, \tag{5.3.2}
\]
where $\|a\|_2=\sqrt{\tau(a^*a)}$ for all $a\in\mathcal{M}$.

Lemma 5.3.1. Let $\mathcal{A}$ be a finitely generated, infinite dimensional, unital simple C∗-algebra with a unique tracial state $\tau$. Assume that $x_1, \ldots, x_n$ is a family of self-adjoint generators of $\mathcal{A}$. Let $H$, $\mathcal{M}$, $\{P_r\}_{r=1}^\infty$ be defined as above. For each $m\ge 1$, let $y_m$ and $P_{r_m}(x_1, \ldots, x_n)$ be chosen as above. Then
\[
\delta_{top}(x_1, \ldots, x_n)\ge\delta_{top}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n\bigr).
\]

Proof. Let $R>\max\{\|P_{r_m}(x_1, \ldots, x_n)\|, \|x_1\|, \ldots, \|x_n\|\}$. Then there exists a positive constant $D>1$ such that
\[
\|P_{r_m}(A_1, \ldots, A_n)-P_{r_m}(B_1, \ldots, B_n)\|\le D\,\|(A_1, \ldots, A_n)-(B_1, \ldots, B_n)\|
\]
for all $A_1, \ldots, A_n, B_1, \ldots, B_n$ in $M_k(\mathbb{C})$ satisfying $0\le\|A_1\|, \ldots, \|A_n\|, \|B_1\|, \ldots, \|B_n\|\le R$. Let $\{Q_j\}_{j=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle Z, X_1, \ldots, X_n\rangle$ with rational coefficients.

Let $\omega>0$, $r_0\in\mathbb{N}$ and $\varepsilon_0>0$. By the definition of topological free entropy dimension, we know there are some $\varepsilon_1>0$ and $j_0$ in $\mathbb{N}$ such that if $\varepsilon<\varepsilon_1$, $j\ge j_0$, then any
\[
(B, A_1, \ldots, A_n)\in\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n), x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_j\bigr)
\]
satisfies
\[
\|B-P_{r_m}(A_1, \ldots, A_n)\|\le\omega/4
\]
and
\[
(A_1, \ldots, A_n)\in\Gamma_R^{(top)}(x_1, \ldots, x_n; k, \varepsilon_0, P_1, \ldots, P_{r_0}).
\]
Let $\{\mathrm{Ball}(A_1^\lambda, \ldots, A_n^\lambda; \frac{\omega}{4D}, \|\cdot\|)\}_{\lambda\in\Lambda}$ be a covering of $\Gamma_R^{(top)}(x_1, \ldots, x_n; k, \varepsilon_0, P_1, \ldots, P_{r_0})$ by $\frac{\omega}{4D}$-$\|\cdot\|$-balls such that
\[
\operatorname{Card}(\Lambda)=\nu_\infty\Bigl(\Gamma_R^{(top)}(x_1, \ldots, x_n; k, \varepsilon_0, P_1, \ldots, P_{r_0}), \frac{\omega}{4D}\Bigr).
\]
Then if $\varepsilon<\varepsilon_1$, $j\ge j_0$, for any
\[
(B : A_1, \ldots, A_n)\in\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_j\bigr),
\]
there is some $\lambda_0\in\Lambda$ such that
\[
\bigl\|(A_1, \ldots, A_n)-(A_1^{\lambda_0}, \ldots, A_n^{\lambda_0})\bigr\|\le\frac{\omega}{4D}.
\]
Moreover
\[
\bigl\|B-P_{r_m}(A_1^{\lambda_0}, \ldots, A_n^{\lambda_0})\bigr\|
\le\|B-P_{r_m}(A_1, \ldots, A_n)\|+\bigl\|P_{r_m}(A_1, \ldots, A_n)-P_{r_m}(A_1^{\lambda_0}, \ldots, A_n^{\lambda_0})\bigr\|
\le\frac{\omega}{4}+D\cdot\frac{\omega}{4D}\le\omega/2.
\]
This means
\[
\nu_\infty\Bigl(\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_j\bigr), \omega\Bigr)
\le\operatorname{Card}(\Lambda)
=\nu_\infty\Bigl(\Gamma_R^{(top)}(x_1, \ldots, x_n; k, \varepsilon_0, P_1, \ldots, P_{r_0}), \frac{\omega}{4D}\Bigr)
\]
for each $j\ge j_0$ and $\varepsilon<\varepsilon_1$. Note that $D$ only depends on $R$ and $r_m$, not on $\omega$. By the definition of $\delta_{top}$ and Remark 2.5.3, we have
\[
\delta_{top}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n\bigr)\le\delta_{top}(x_1, \ldots, x_n). \qquad\Box
\]

In the spirit of Voiculescu's finite approximation property in von Neumann algebras, we also introduce the "matrix-norm approximation property" in the context of C∗-algebras.



Definition 5.3.2. Suppose $\mathcal{A}$ is a unital C∗-algebra with a family of self-adjoint generators $x_1, \ldots, x_n$. Suppose $\{P_r\}_{r=1}^\infty$ is the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \ldots, X_n\rangle$ with rational coefficients. If for any $R>\max\{\|x_1\|, \ldots, \|x_n\|\}$, $r>0$, $\varepsilon>0$, there is a sequence of positive integers $k_1<k_2<\cdots$ such that
\[
\Gamma_R^{(top)}(x_1, \ldots, x_n; k_s, \varepsilon, P_1, \ldots, P_r)\ne\emptyset,\qquad\forall\, s\ge 1,
\]
then $\mathcal{A}$ is said to have the matrix-norm approximation property.

Lemma 5.3.3. Let $\mathcal{A}$ be a finitely generated, infinite dimensional, unital simple C∗-algebra with a unique tracial state $\tau$. Assume that $\mathcal{A}$ has the matrix-norm approximation property. Assume that $x_1, \ldots, x_n$ is a family of self-adjoint generators of $\mathcal{A}$. Let $H$, $\mathcal{M}$, $\{P_r\}_{r=1}^\infty$ be defined as above. Let $m$ be a positive integer. Let $y_m$ and $P_{r_m}(x_1, \ldots, x_n)$ be chosen as above satisfying (5.3.1) and (5.3.2). Let $\{Q_j\}_{j=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle Z, X_1, \ldots, X_n\rangle$ with rational coefficients. Let $R>\max\{\|P_{r_m}(x_1, \ldots, x_n)\|, \|x_1\|, \ldots, \|x_n\|\}$. Then there is a positive integer $p$ so that the following holds: $\forall\, r\ge p$, $\forall\, k\ge 3r$, if
\[
(B, A_1, \ldots, A_n)\in\Gamma_R^{(top)}\Bigl(P_{r_m}(x_1, \ldots, x_n), x_1, \ldots, x_n; k, \frac{1}{r}, Q_1, \ldots, Q_r\Bigr),
\]
then there are

(i) some $1\le k_1, \ldots, k_m\le k$ with $\frac{1}{m}-\frac{1}{r}\le\frac{k_j}{k}\le\frac{1}{m}+\frac{1}{r}$ for each $1\le j\le m$ and $k_1+\cdots+k_m=k$, and
(ii) a unitary matrix $U$ in $\mathcal{U}(k)$,

satisfying
\[
\left\|B-U\begin{pmatrix}
1\cdot I_{k_1} & 0 & \cdots & 0\\
0 & 2\cdot I_{k_2} & \cdots & 0\\
\cdots & \cdots & \ddots & \cdots\\
0 & 0 & \cdots & m\cdot I_{k_m}
\end{pmatrix}U^*\right\|_2\le\frac{2}{m^3}.
\]

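Proof. We will prove the result by contradiction. Assume, to the contrary, that for all $p\ge 1$ there are some $r_p\ge p$, $k_p\ge 3r_p$ and some
\[
\bigl(B^{(p)}, A_1^{(p)}, \ldots, A_n^{(p)}\bigr)\in\Gamma_R^{(top)}\Bigl(P_{r_m}(x_1, \ldots, x_n), x_1, \ldots, x_n; k_p, \frac{1}{r_p}, Q_1, \ldots, Q_{r_p}\Bigr)
\]
satisfying
\[
\left\|B^{(p)}-U\begin{pmatrix}
1\cdot I_{s_1} & 0 & \cdots & 0\\
0 & 2\cdot I_{s_2} & \cdots & 0\\
\cdots & \cdots & \ddots & \cdots\\
0 & 0 & \cdots & m\cdot I_{s_m}
\end{pmatrix}U^*\right\|_2>\frac{2}{m^3}, \tag{5.3.3}
\]
for all $1\le s_1, \ldots, s_m\le k_p$ with $\frac{1}{m}-\frac{1}{r_p}\le\frac{s_j}{k_p}\le\frac{1}{m}+\frac{1}{r_p}$ for each $1\le j\le m$ and $s_1+\cdots+s_m=k_p$, and all unitary matrices $U$ in $\mathcal{U}(k_p)$.

Let $\alpha$ be a free ultrafilter in $\beta(\mathbb{N})\setminus\mathbb{N}$. Let $\mathcal{N}=\prod_{p=1}^{\alpha}M_{k_p}(\mathbb{C})$ be the von Neumann algebra ultraproduct of $\{M_{k_p}(\mathbb{C})\}_{p=1}^\infty$ along the ultrafilter $\alpha$, i.e. $\prod_{p=1}^{\alpha}M_{k_p}(\mathbb{C})$ is the quotient of the C∗-algebra $\prod_{p=1}^{\infty}M_{k_p}(\mathbb{C})$ by $\mathcal{I}_2$, the $0$-ideal of the trace $\tau_\alpha$, where $\tau_\alpha((\tilde{A}_p)_{p=1}^\infty)=\lim_{p\to\alpha}\frac{\operatorname{Tr}(\tilde{A}_p)}{k_p}$. Then $\mathcal{N}$ is a II$_1$ factor with a tracial state $\tau_\alpha$. We let $\|a\|_{2,\tau_\alpha}=(\tau_\alpha(a^*a))^{1/2}$ for $a\in\mathcal{N}$.

Let, for each $1\le j\le n$, $a_j=[(A_j^{(p)})_{p=1}^\infty]$ be a self-adjoint element in $\mathcal{N}$. By mapping $x_j$ to $a_j$, there is a unital ∗-homomorphism $\psi$ from the C∗-algebra $\mathcal{A}$ onto the C∗-subalgebra $\mathcal{B}$ generated by $\{a_1, \ldots, a_n\}$ in $\mathcal{N}$. Since $\mathcal{A}$ is a simple C∗-algebra and $\psi(I_{\mathcal{A}})=I_{\mathcal{B}}$, $\psi$ is in fact a ∗-isomorphism from $\mathcal{A}$ onto $\mathcal{B}$. Since $\mathcal{A}$ has a unique trace $\tau$,
\[
\tau(x)=\tau_\alpha(\psi(x)),\qquad\forall\, x\in\mathcal{A}. \tag{5.3.4}
\]
On the other hand, it follows from the choice of the von Neumann algebra $\mathcal{M}$ that on the unit ball of $\mathcal{M}$ (here $\mathcal{M}$ acts on the Hilbert space $L^2(\mathcal{A}, \tau)$) the strong operator topology coincides with the topology induced by the trace norm $\|\cdot\|_2$ on $\mathcal{M}$. Similarly, it follows from the definition of the von Neumann algebra ultraproduct that, on the unit ball of $\mathcal{N}=\prod_{p=1}^{\alpha}M_{k_p}(\mathbb{C})$ (here $\mathcal{N}$ acts on the Hilbert space $L^2(\mathcal{N}, \tau_\alpha)$), the strong operator topology coincides with the topology induced by the trace norm $\|\cdot\|_{2,\tau_\alpha}$ on $\mathcal{N}$. Combining these facts with (5.3.4), we know that $\psi:\mathcal{A}\to\mathcal{B}$ induces a trace-preserving ∗-isomorphism (still denoted by $\psi$) from $\mathcal{M}$ onto the von Neumann subalgebra generated by $a_1, \ldots, a_n$ in $\mathcal{N}$. Therefore,
\[
\|y_m-P_{r_m}(x_1, \ldots, x_n)\|_2
=\bigl\|\psi\bigl(y_m-P_{r_m}(x_1, \ldots, x_n)\bigr)\bigr\|_{2,\tau_\alpha}
=\bigl\|\psi(y_m)-\bigl[(B^{(p)})_{p=1}^\infty\bigr]\bigr\|_{2,\tau_\alpha}
\le\frac{1}{m^3}; \tag{5.3.5}
\]
and
\[
\psi(y_m)=\sum_{j=1}^{m}j\,q_j
\]
for a family $\{q_1, \ldots, q_m\}$ of mutually orthogonal projections in $\mathcal{N}$ with $\tau_\alpha(q_j)=1/m$, $1\le j\le m$. Notice that $\mathcal{N}$ is a II$_1$ factor with the tracial state $\tau_\alpha$. If $\{\tilde{q}_1, \ldots, \tilde{q}_m\}$ is another family of mutually orthogonal projections in $\mathcal{N}$ with $\tau_\alpha(\tilde{q}_j)=1/m$ for all $1\le j\le m$, then there is a unitary element $u$ in $\mathcal{N}$ such that $q_j=u\tilde{q}_ju^*$ for all $1\le j\le m$, whence $\psi(y_m)=\sum_{j=1}^{m}j\,q_j=u\bigl(\sum_{j=1}^{m}j\,\tilde{q}_j\bigr)u^*$. In other words, if $s_1^{(p)}, \ldots, s_m^{(p)}$, $p=1, 2, \ldots$, are positive integers satisfying $1\le s_1^{(p)}, \ldots, s_m^{(p)}\le k_p$, $\frac{1}{m}-\frac{1}{r_p}\le\frac{s_j^{(p)}}{k_p}\le\frac{1}{m}+\frac{1}{r_p}$ for each $1\le j\le m$ and $s_1^{(p)}+\cdots+s_m^{(p)}=k_p$, then there is a unitary $u=[(U_p)_{p=1}^\infty]$ in $\mathcal{N}$ with each $U_p\in\mathcal{U}(k_p)$ such that
\[
\psi(y_m)=\bigl[(U_p)_{p=1}^\infty\bigr]
\left[\left(\begin{pmatrix}
1\cdot I_{s_1^{(p)}} & 0 & \cdots & 0\\
0 & 2\cdot I_{s_2^{(p)}} & \cdots & 0\\
\cdots & \cdots & \ddots & \cdots\\
0 & 0 & \cdots & m\cdot I_{s_m^{(p)}}
\end{pmatrix}\right)_{p=1}^{\infty}\right]
\bigl[(U_p^*)_{p=1}^\infty\bigr]\in\mathcal{N}.
\]
This contradicts inequalities (5.3.3) and (5.3.5). $\Box$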



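The following lemma is well known (for example, see Lemma 4.1 in [24]).

Lemma 5.3.4. Suppose $A$ and $B$ are self-adjoint matrices in $M_k^{s.a.}(\mathbb{C})$ with lists of eigenvalues $\lambda_1\le\lambda_2\le\cdots\le\lambda_k$ and $\mu_1\le\mu_2\le\cdots\le\mu_k$ respectively. Then
\[
\sum_{j=1}^{k}|\lambda_j-\mu_j|^2\le\operatorname{Tr}\bigl((A-UBU^*)^2\bigr),
\]
where $U$ is any unitary matrix in $\mathcal{U}(k)$.

Lemma 5.3.5. Let $r, m$ be positive integers with $4<m<r$. Suppose $k_1, \ldots, k_m$ is a family of positive integers such that $\frac{1}{m}-\frac{1}{r}\le\frac{k_j}{k}\le\frac{1}{m}+\frac{1}{r}$ for all $1\le j\le m$ and $k_1+\cdots+k_m=k$. If $A$ is a self-adjoint matrix in $M_k(\mathbb{C})$ such that, for some unitary matrix $U$ in $\mathcal{U}(k)$,
\[
\left\|A-U\begin{pmatrix}
1\cdot I_{k_1} & 0 & \cdots & 0\\
0 & 2\cdot I_{k_2} & \cdots & 0\\
\cdots & \cdots & \ddots & \cdots\\
0 & 0 & \cdots & m\cdot I_{k_m}
\end{pmatrix}U^*\right\|_2\le\frac{2}{m^3}, \tag{5.3.6}
\]
then, for any $\omega>0$ we have
\[
\nu_2(\Omega(A), \omega)\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-\frac{56k^2}{m}}
\]
for some universal constants $C_1, C>0$, where $\Omega(A)=\{W^*AW \mid W\in\mathcal{U}(k)\}$.

Proof. Suppose that $\lambda_1\le\lambda_2\le\cdots\le\lambda_k$ are the eigenvalues of $A$. For each $1\le j\le m$, let
\[
T_j=\Bigl\{i\in\mathbb{N}\ \Big|\ \Bigl(\sum_{t=0}^{j-1}k_t\Bigr)+1\le i\le\sum_{t=0}^{j}k_t\ \text{ and }\ |\lambda_i-j|\le\frac{1}{m}\Bigr\} \tag{5.3.7}
\]
and
\[
\hat{T}_j=\Bigl\{\Bigl(\sum_{t=0}^{j-1}k_t\Bigr)+1,\ \Bigl(\sum_{t=0}^{j-1}k_t\Bigr)+2,\ \ldots,\ \sum_{t=0}^{j}k_t\Bigr\}\setminus T_j;
\]
here we assume that $k_0=0$. Let $B=\operatorname{diag}(1\cdot I_{k_1}, \ldots, m\cdot I_{k_m})$ be a diagonal matrix in $M_k(\mathbb{C})$. By inequality (5.3.6), Lemma 5.3.4 and the definition of $\hat{T}_j$, we have
\[
\Bigl(\frac{2}{m^3}\Bigr)^2k\ge\operatorname{Tr}\bigl((A-UBU^*)^2\bigr)\ge\sum_{i\in\hat{T}_j}|\lambda_i-j|^2\ge\Bigl(\frac{1}{m}\Bigr)^2\operatorname{card}(\hat{T}_j),\qquad\forall\, 1\le j\le m,
\]
where $\operatorname{card}(\hat{T}_j)$ is the cardinality of the set $\hat{T}_j$. Thus
\[
\operatorname{card}(\hat{T}_j)\le\frac{4k}{m^4}\qquad\text{for } 1\le j\le m.
\]
Let $s_j=\operatorname{card}(T_j)$ for $1\le j\le m$, whence
\[
s_j\le k_j\le\frac{k}{m}+\frac{k}{r},\qquad\forall\, 1\le j\le m.
\]
Let
\[
T_{m+1}=\{1, 2, \ldots, k\}\setminus\Bigl(\bigcup_{j=1}^{m}T_j\Bigr)=\bigcup_{j=1}^{m}\hat{T}_j
\]
and let $s_{m+1}$ be the cardinality of the set $T_{m+1}$. Thus
\[
s_{m+1}=k-s_1-\cdots-s_m=\sum_{j=1}^{m}\operatorname{card}(\hat{T}_j)\le\sum_{j=1}^{m}\frac{4k}{m^4}=\frac{4k}{m^3}.
\]
It is not hard to see that $T_1, \ldots, T_{m+1}$ is a partition of the set $\{1, 2, \ldots, k\}$. Moreover, if $1\le j_1\ne j_2\le m$, then for any $i_1\in T_{j_1}$ and $i_2\in T_{j_2}$, we have by (5.3.7)
\[
|\lambda_{i_1}-\lambda_{i_2}|\ge|j_2-j_1|-|\lambda_{i_2}-j_2|-|\lambda_{i_1}-j_1|\ge 1-\frac{2}{m}\ge\frac{1}{2}.
\]
Applying Proposition 3.4.1 for such $T_1, \ldots, T_m, T_{m+1}$, $\theta=1/2$ and $\omega=\delta/2$, we have
\begin{align*}
\nu_2(\Omega(A), \omega)
&\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-2s_1^2-\cdots-2s_m^2-2s_{m+1}^2-4(s_1+\cdots+s_m)s_{m+1}}\\
&\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-2\bigl(k_1^2+\cdots+k_m^2+(\frac{4k}{m^3})^2+2k\cdot\frac{4k}{m^3}\bigr)}\\
&\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-2\bigl((\frac{k}{m}+\frac{k}{r})^2+\cdots+(\frac{k}{m}+\frac{k}{r})^2+\frac{16k^2}{m^6}+\frac{8k^2}{m^3}\bigr)}\\
&\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-\frac{56k^2}{m}},
\end{align*}
for some universal constants $C, C_1>0$. $\Box$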
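The final exponent estimate in the proof of Lemma 5.3.5 is plain arithmetic and can be checked exactly. The sketch below divides the third-line exponent by $k^2$ and compares it with $56/m$ using rational arithmetic; the function names are introduced here for illustration only.

```python
from fractions import Fraction

def exponent_over_k2(m, r):
    """Exponent 2*( m*(k/m + k/r)^2 + 16k^2/m^6 + 8k^2/m^3 ) divided by k^2.

    This is the exponent on the third line of the chain of inequalities,
    with the m equal terms (k/m + k/r)^2 collected into one product.
    """
    block_terms = m * (Fraction(1, m) + Fraction(1, r)) ** 2
    remainder_terms = Fraction(16, m ** 6) + Fraction(8, m ** 3)
    return 2 * (block_terms + remainder_terms)

def bound_holds(m, r):
    """Check the last step: the exponent is at most 56*k^2/m."""
    return exponent_over_k2(m, r) <= Fraction(56, m)

if __name__ == "__main__":
    # The lemma assumes 4 < m < r.
    for m, r in [(5, 6), (10, 11), (50, 1000)]:
        print(m, r, float(exponent_over_k2(m, r)), float(Fraction(56, m)), bound_holds(m, r))
```

Since $r>m$ gives $(k/m+k/r)^2\le 4k^2/m^2$, the collected terms are at most $4k^2/m+24k^2/m$, which is where the constant $56$ comes from; the check shows the bound is far from tight.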



Lemma 5.3.6. Let $\mathcal{A}$ be a finitely generated, infinite dimensional, simple unital C∗-algebra with a unique tracial state $\tau$. Assume that $\mathcal{A}$ has the matrix-norm approximation property. Assume that $x_1, \ldots, x_n$ is a family of self-adjoint generators of $\mathcal{A}$. Let $H$, $\mathcal{M}$, $\{P_r\}_{r=1}^\infty$ be defined as above. Let $m$ be a positive integer. Let $y_m$ and $P_{r_m}(x_1, \ldots, x_n)$ be chosen as above satisfying (5.3.1) and (5.3.2). Let $\{Q_j\}_{j=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle Z, X_1, \ldots, X_n\rangle$ with rational coefficients. Let $R>\max\{\|P_{r_m}(x_1, \ldots, x_n)\|, \|x_1\|, \ldots, \|x_n\|\}$. When $r$ is large enough and $\varepsilon$ is small enough, for any $\omega>0$, we have for some universal constants $C, C_1>0$,
\[
\nu_2\Bigl(\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r\bigr), \omega\Bigr)
\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-\frac{56k^2}{m}},
\]
if $\Gamma_R^{(top)}(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r)\ne\emptyset$.

Proof. By Lemma 5.3.3, when $r$ is large enough and $\varepsilon$ is small enough, the following holds: $\forall\, k\ge 3r$, if
\[
(B, A_1, \ldots, A_n)\in\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n), x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r\bigr)\ne\emptyset,
\]
then there are some $1\le k_1, \ldots, k_m\le k$ with $\frac{1}{m}-\frac{1}{r}\le\frac{k_j}{k}\le\frac{1}{m}+\frac{1}{r}$ for each $1\le j\le m$ and $k_1+\cdots+k_m=k$, and a unitary matrix $U$ in $\mathcal{U}(k)$ satisfying
\[
\left\|B-U\begin{pmatrix}
1\cdot I_{k_1} & 0 & \cdots & 0\\
0 & 2\cdot I_{k_2} & \cdots & 0\\
\cdots & \cdots & \ddots & \cdots\\
0 & 0 & \cdots & m\cdot I_{k_m}
\end{pmatrix}U^*\right\|_2\le\frac{2}{m^3}.
\]
Combining with Lemma 5.3.5, we know that if
\[
B\in\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r\bigr)\ne\emptyset,
\]
then, for any $\omega>0$,
\[
\nu_2(\Omega(B), \omega)\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-\frac{56k^2}{m}},
\]
where $\Omega(B)=\{W^*BW \mid W\in\mathcal{U}(k)\}$. Note that $\Omega(B)\subseteq\Gamma_R^{(top)}(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r)$. It follows that, for any $\omega>0$,
\[
\nu_2\Bigl(\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r\bigr), \omega\Bigr)
\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-\frac{56k^2}{m}},
\]
if $\Gamma_R^{(top)}(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r)\ne\emptyset$. $\Box$


Now we have the following result.

Theorem 5.3.7. Let $\mathcal{A}$ be a finitely generated, infinite dimensional, simple unital C∗-algebra with a unique tracial state $\tau$. Assume that $x_1, \ldots, x_n$ is a family of self-adjoint generators of $\mathcal{A}$. If $\mathcal{A}$ has the matrix-norm approximation property, then
\[
\delta_{top}(x_1, \ldots, x_n)\ge 1.
\]

Proof. Let $H$ be the Hilbert space $L^2(\mathcal{A}, \tau)$. Without loss of generality, we may assume that $\mathcal{A}$ acts on the Hilbert space $H$. Let $\mathcal{M}$ be the von Neumann algebra generated by $\mathcal{A}$ on $H$. Then $\mathcal{M}$ is a diffuse von Neumann algebra with a tracial state $\tau$. For each positive integer $m$, there is a family of mutually orthogonal projections $p_1, \ldots, p_m$ in $\mathcal{M}$ such that $\tau(p_j)=1/m$ for $1\le j\le m$. Let
\[
y_m=1\cdot p_1+2\cdot p_2+\cdots+m\cdot p_m=\sum_{j=1}^{m}j\cdot p_j.
\]
Let $\{P_r\}_{r=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \ldots, X_n\rangle$ with rational coefficients. Thus $\{P_r(x_1, \ldots, x_n)\}_{r=1}^\infty$ is dense in $\mathcal{M}$ with respect to the strong operator topology. Hence, for each $m\ge 1$, there are a positive integer $r_m$ and a self-adjoint element $P_{r_m}(x_1, \ldots, x_n)$ in $\mathcal{A}$ such that
\[
\|y_m-P_{r_m}(x_1, \ldots, x_n)\|_2\le\frac{1}{m^3}.
\]
Let $\{Q_j\}_{j=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle Z, X_1, \ldots, X_n\rangle$ with rational coefficients. Let $R>\max\{\|P_{r_m}(x_1, \ldots, x_n)\|, \|x_1\|, \ldots, \|x_n\|\}$. By Lemma 5.3.6, for any $\omega>0$, when $r$ is large enough and $\varepsilon$ is small enough, we have for some universal constants $C_1, C>0$
\[
\nu_2\Bigl(\Gamma_R^{(top)}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r\bigr), \omega\Bigr)
\ge(8C_1\omega)^{-k^2}\cdot\Bigl(\frac{2C}{\omega}\Bigr)^{-\frac{56k^2}{m}},
\]
if $\Gamma_R^{(top)}(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n; k, \varepsilon, Q_1, \ldots, Q_r)\ne\emptyset$. Since $\mathcal{A}$ has the matrix-norm approximation property, we have
\[
\tilde{\delta}_{top}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n\bigr)\ge 1-\frac{56}{m}.
\]
By Proposition 5.1.2, we get
\[
\delta_{top}\bigl(P_{r_m}(x_1, \ldots, x_n) : x_1, \ldots, x_n\bigr)\ge 1-\frac{56}{m}.
\]
By Lemma 5.3.1,
\[
\delta_{top}(x_1, \ldots, x_n)\ge 1-\frac{56}{m}.
\]
Since $m$ is an arbitrary positive integer, we obtain $\delta_{top}(x_1, \ldots, x_n)\ge 1$. $\Box$

5.4. Values of topological free entropy dimensions in some unital C∗ algebras

In this subsection, we compute the values of topological free entropy dimensions in some unital C∗-algebras by using the results from the preceding subsection.

Theorem 5.4.1. Let $\mathcal{A}_\theta$ be an irrational rotation C∗-algebra. Then
\[
\delta_{top}(x_1, \ldots, x_n)=1,
\]
where $x_1, \ldots, x_n$ is a family of self-adjoint operators that generates $\mathcal{A}_\theta$.

Proof. Note that $\mathcal{A}_\theta$ is an infinite dimensional, unital simple C∗-algebra with a unique tracial state $\tau$. By [25] or [11,14] and Theorem 5.2.2, we know that $\delta_{top}(x_1, \ldots, x_n)\le 1$. It follows from [18] that $\mathcal{A}_\theta$ has the matrix-norm approximation property. Therefore, by Theorem 5.3.7, $\delta_{top}(x_1, \ldots, x_n)\ge 1$. Hence $\delta_{top}(x_1, \ldots, x_n)=1$. $\Box$

Theorem 5.4.2. Let $\mathcal{A}$ be a UHF algebra (uniformly hyperfinite C∗-algebra). Then
\[
\delta_{top}(x_1, \ldots, x_n)=1,
\]
where $x_1, \ldots, x_n$ is a family of self-adjoint operators that generates $\mathcal{A}$.

Proof. By [17], we know that $\mathcal{A}$ is generated by two self-adjoint elements. We also know that $\mathcal{A}$ is an infinite dimensional, unital simple C∗-algebra with a unique tracial state $\tau$. By [25] or [11,14] and Theorem 5.2.2, we know that $\delta_{top}(x_1, \ldots, x_n)\le 1$. It is easy to check that $\mathcal{A}$ has the matrix-norm approximation property. Therefore, by Theorem 5.3.7, $\delta_{top}(x_1, \ldots, x_n)\ge 1$. Hence $\delta_{top}(x_1, \ldots, x_n)=1$. $\Box$



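Recall that for any sequence $(\mathcal{A}_m)_{m=1}^\infty$ of C∗-algebras, we can introduce two C∗-algebras
\[
\prod_{m}\mathcal{A}_m=\Bigl\{(a_m)_{m=1}^\infty\ \Big|\ a_m\in\mathcal{A}_m,\ \sup_{m\in\mathbb{N}}\|a_m\|<\infty\Bigr\},
\qquad
\sum_{m}\mathcal{A}_m=\Bigl\{(a_m)_{m=1}^\infty\ \Big|\ a_m\in\mathcal{A}_m,\ \lim_{m\to\infty}\|a_m\|=0\Bigr\}.
\]
The norm in the quotient C∗-algebra $\prod_m\mathcal{A}_m/\sum_m\mathcal{A}_m$ is given by
\[
\bigl\|\rho\bigl((a_m)_{m=1}^\infty\bigr)\bigr\|=\limsup_{m\to\infty}\|a_m\|,
\]
where $\rho$ is the quotient map from $\prod_m\mathcal{A}_m$ onto $\prod_m\mathcal{A}_m/\sum_m\mathcal{A}_m$.

If $\mathcal{A}$ is an exact C∗-algebra, then the sequence
\[
0\to\mathcal{A}\otimes_{\min}\sum_{m}M_m(\mathbb{C})\to\mathcal{A}\otimes_{\min}\prod_{m}M_m(\mathbb{C})\to\mathcal{A}\otimes_{\min}\Bigl(\prod_{m}M_m(\mathbb{C})\Big/\sum_{m}M_m(\mathbb{C})\Bigr)\to 0
\]
is exact. Therefore, we have the following natural identification:
\[
\mathcal{A}\otimes_{\min}\Bigl(\prod_{m}M_m(\mathbb{C})\Big/\sum_{m}M_m(\mathbb{C})\Bigr)
=\Bigl(\mathcal{A}\otimes_{\min}\prod_{m}M_m(\mathbb{C})\Bigr)\Big/\Bigl(\mathcal{A}\otimes_{\min}\sum_{m}M_m(\mathbb{C})\Bigr).
\]
On the other hand, we have the natural embedding
\[
\mathcal{A}\otimes_{\min}\prod_{m}M_m(\mathbb{C})\subseteq\prod_{m}M_m(\mathcal{A})
\]
and the identification
\[
\mathcal{A}\otimes_{\min}\sum_{m}M_m(\mathbb{C})=\sum_{m}M_m(\mathcal{A}).
\]
Thus we have for any exact C∗-algebra $\mathcal{A}$ a natural embedding
\[
\psi:\ \mathcal{A}\otimes_{\min}\Bigl(\prod_{m}M_m(\mathbb{C})\Big/\sum_{m}M_m(\mathbb{C})\Bigr)\subseteq\prod_{m}M_m(\mathcal{A})\Big/\sum_{m}M_m(\mathcal{A}).
\]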

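Lemma 5.4.3. Suppose that $\mathcal{A}$ and $\mathcal{B}$ are unital C∗-algebras and $\rho$ is a unital embedding
\[
\rho:\ \mathcal{A}\to\prod_{m}M_m(\mathcal{B})\Big/\sum_{m}M_m(\mathcal{B}).
\]
Suppose that $x_1, \ldots, x_n$ is a family of elements in $\mathcal{A}$. Suppose $r$ is a positive integer and $\{P_j(x_1, \ldots, x_n)\}_{j=1}^{r}$ is a family of noncommutative polynomials of $x_1, \ldots, x_n$. Then there are some $k\in\mathbb{N}$ and $a_1^{(k)}, \ldots, a_n^{(k)}$ in $M_k(\mathcal{B})$ so that
\[
\Bigl|\,\bigl\|P_j\bigl(a_1^{(k)}, \ldots, a_n^{(k)}\bigr)\bigr\|-\|P_j(x_1, \ldots, x_n)\|\,\Bigr|\le\frac{1}{r},\qquad\forall\, 1\le j\le r.
\]

Proof. Let
\[
\rho(x_i)=\bigl[(x_i^{(m)})_m\bigr]\in\prod_{m}M_m(\mathcal{B})\Big/\sum_{m}M_m(\mathcal{B}),\qquad\forall\, 1\le i\le n,
\]
where each $x_i^{(m)}\in M_m(\mathcal{B})$. By the definition of $\prod_mM_m(\mathcal{B})/\sum_mM_m(\mathcal{B})$, we have
\[
\limsup_{m\to\infty}\bigl\|P\bigl(x_1^{(m)}, \ldots, x_n^{(m)}\bigr)\bigr\|_{M_m(\mathcal{B})}=\|P(x_1, \ldots, x_n)\|_{\mathcal{A}},\qquad\forall\, P\in\mathbb{C}\langle X_1, \ldots, X_n\rangle.
\]
Thus there are some positive integers $m_1, m_2$ with $m_1\le m_2$ such that
\[
\Bigl|\,\sup_{m_1\le l\le m_2}\bigl\|P_j\bigl(x_1^{(l)}, \ldots, x_n^{(l)}\bigr)\bigr\|_{M_l(\mathcal{B})}-\|P_j(x_1, \ldots, x_n)\|_{\mathcal{A}}\,\Bigr|\le\frac{1}{r},\qquad\forall\, 1\le j\le r.
\]
Let $k=\sum_{j=m_1}^{m_2}j$ and
\[
a_i^{(k)}=\bigoplus_{l=m_1}^{m_2}x_i^{(l)}\in M_k(\mathcal{B}),\qquad\forall\, 1\le i\le n.
\]
Then, we have
\[
\Bigl|\,\bigl\|P_j\bigl(a_1^{(k)}, \ldots, a_n^{(k)}\bigr)\bigr\|_{M_k(\mathcal{B})}-\|P_j(x_1, \ldots, x_n)\|_{\mathcal{A}}\,\Bigr|\le\frac{1}{r},\qquad\forall\, 1\le j\le r. \qquad\Box
\]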
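The selection of $m_1$ and $m_2$ in the proof of Lemma 5.4.3 can be illustrated on a toy sequence of norms. The sketch below treats a single polynomial and a finite sample of values $\|P(x_1^{(m)}, \ldots, x_n^{(m)})\|$ with a known limsup `target`; the sample data and the function names are hypothetical and serve only to show the bookkeeping (the window, the size $k=\sum_{j=m_1}^{m_2}j$, and the fact that the norm of a block-diagonal element is the largest block norm).

```python
def find_window(norms, target, r):
    """Pick (m1, m2) as in the proof, for one polynomial and a finite sample.

    norms[m - 1] plays the role of ||P(x^(m))||; target plays the role of
    the limsup ||P(x)||. Returns None if the sample is too short.
    """
    tol = 1.0 / r
    # m1: first level after the last sampled norm exceeding target + 1/r.
    m1 = 1
    for m, value in enumerate(norms, start=1):
        if value > target + tol:
            m1 = m + 1
    # m2: first level >= m1 whose norm is at least target - 1/r.
    for m in range(m1, len(norms) + 1):
        if norms[m - 1] >= target - tol:
            return m1, m
    return None

def direct_sum_size(m1, m2):
    """Size k = m1 + (m1 + 1) + ... + m2 of the block-diagonal matrix."""
    return sum(range(m1, m2 + 1))

def direct_sum_norm(block_norms):
    """Norm of a block-diagonal element: the largest block norm."""
    return max(block_norms)

if __name__ == "__main__":
    sample = [3.0, 0.2, 1.4, 0.7, 0.95, 0.4, 1.02]  # hypothetical norms
    m1, m2 = find_window(sample, target=1.0, r=10)
    print(m1, m2, direct_sum_size(m1, m2), direct_sum_norm(sample[m1 - 1:m2]))
```

In the lemma itself the same window must work for all of $P_1, \ldots, P_r$ at once, which is possible because there are only finitely many polynomials.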

Theorem 5.4.4. Let $p\ge 2$ be a positive integer and $F_p$ be the free group on $p$ generators. Let $C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)$ be the minimal tensor product of two reduced C∗-algebras of the free group $F_p$. Then
\[
\delta_{top}(x_1, \ldots, x_n)=1,
\]
where $x_1, \ldots, x_n$ is any family of self-adjoint generators of $C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)$.

Proof. Note that $C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)$ is an infinite dimensional, unital simple C∗-algebra with a unique tracial state. By the result from [5] or [11,14] and Theorems 5.2.2 and 5.3.7, to show $\delta_{top}(x_1, \ldots, x_n)=1$, we need only show that $C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)$ has the matrix-norm approximation property. Therefore, it suffices to show the following: Let $R>\max\{\|x_1\|, \ldots, \|x_n\|\}$. For any $r\ge 1$, there is some $k\in\mathbb{N}$ so that
\[
\Gamma_R^{(top)}\Bigl(x_1, \ldots, x_n; k, \frac{1}{r}, P_1, \ldots, P_r\Bigr)\ne\emptyset.
\]
By the result from [9], we know there is a unital embedding
\[
\phi_1:\ C^*_{red}(F_p)\to\prod_{m}M_m(\mathbb{C})\Big/\sum_{m}M_m(\mathbb{C}),
\]
which induces a unital embedding
\[
\phi_2:\ C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)\to C^*_{red}(F_p)\otimes_{\min}\Bigl(\prod_{m}M_m(\mathbb{C})\Big/\sum_{m}M_m(\mathbb{C})\Bigr).
\]
Note that $C^*_{red}(F_p)$ is an exact C∗-algebra. From the explanation preceding Lemma 5.4.3 it follows that there is a unital embedding
\[
\phi_3:\ C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)\to\prod_{m}M_m\bigl(C^*_{red}(F_p)\bigr)\Big/\sum_{m}M_m\bigl(C^*_{red}(F_p)\bigr).
\]
By Lemma 5.4.3, for a family of elements $x_1, \ldots, x_n$ in $C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)$ and $r\ge 1$, there are some $m\in\mathbb{N}$ and some $a_1^{(m)}, \ldots, a_n^{(m)}$ in $M_m(C^*_{red}(F_p))$ so that $\max\{\|a_1^{(m)}\|, \ldots, \|a_n^{(m)}\|\}<R$ and
\[
\Bigl|\,\bigl\|P_j\bigl(a_1^{(m)}, \ldots, a_n^{(m)}\bigr)\bigr\|-\|P_j(x_1, \ldots, x_n)\|\,\Bigr|\le\frac{1}{2r},\qquad\forall\, 1\le j\le r.
\]
On the other hand, by the existence of the embedding
\[
\phi_1:\ C^*_{red}(F_p)\to\prod_{m}M_m(\mathbb{C})\Big/\sum_{m}M_m(\mathbb{C}),
\]
it follows that there is a unital embedding
\[
\phi_4:\ M_m\bigl(C^*_{red}(F_p)\bigr)=M_m(\mathbb{C})\otimes_{\min}C^*_{red}(F_p)\to M_m(\mathbb{C})\otimes_{\min}\Bigl(\prod_{m'}M_{m'}(\mathbb{C})\Big/\sum_{m'}M_{m'}(\mathbb{C})\Bigr).
\]
But
\[
M_m(\mathbb{C})\otimes_{\min}\Bigl(\prod_{m'}M_{m'}(\mathbb{C})\Big/\sum_{m'}M_{m'}(\mathbb{C})\Bigr)=\prod_{m'}M_{mm'}(\mathbb{C})\Big/\sum_{m'}M_{mm'}(\mathbb{C}).
\]
Hence for such $a_1^{(m)}, \ldots, a_n^{(m)}$ in $M_m(C^*_{red}(F_p))$ and $r\ge 1$, by Lemma 5.4.3, there are some $k\in\mathbb{N}$ and $A_1, \ldots, A_n$ in $M_k(\mathbb{C})$ so that $\max\{\|A_1\|, \ldots, \|A_n\|\}<R$ and
\[
\Bigl|\,\bigl\|P_j\bigl(a_1^{(m)}, \ldots, a_n^{(m)}\bigr)\bigr\|-\|P_j(A_1, \ldots, A_n)\|\,\Bigr|\le\frac{1}{2r},\qquad\forall\, 1\le j\le r.
\]
Altogether, we have
\[
\bigl|\,\|P_j(x_1, \ldots, x_n)\|-\|P_j(A_1, \ldots, A_n)\|\,\bigr|\le\frac{1}{r},\qquad\forall\, 1\le j\le r,
\]
which implies that $C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)$ has the matrix-norm approximation property. Hence
\[
\delta_{top}(x_1, \ldots, x_n)=1
\]
for any family of self-adjoint elements $x_1, \ldots, x_n$ that generates $C^*_{red}(F_p)\otimes_{\min}C^*_{red}(F_p)$. $\Box$

Theorem 5.4.5. Suppose that $\mathcal{K}$ is the C∗-algebra consisting of all compact operators on an infinite dimensional separable Hilbert space $H$. Suppose $\mathcal{A}=\mathbb{C}\oplus\mathcal{K}$ is the unitization of $\mathcal{K}$. If $x_1, \ldots, x_n$ is a family of self-adjoint elements that generate $\mathcal{A}$ as a C∗-algebra, then
\[
\delta_{top}(x_1, \ldots, x_n)=0.
\]

Proof. By [17], we know that the unital C∗-algebra $\mathcal{A}$ is generated by two self-adjoint elements in $\mathcal{A}$. Note that $\mathcal{A}$ has a unique trace $\tau$, which is defined by $\tau(\lambda, x)=\lambda$, $\forall\,(\lambda, x)\in\mathcal{A}$. By Theorem 5.2.2, we have $\delta_{top}(x_1, \ldots, x_n)=0$, where $x_1, \ldots, x_n$ is a family of self-adjoint generators of $\mathcal{A}$. $\Box$

6. Topological free orbit dimension of C∗ algebras

Assume that $\mathcal{A}$ is a unital C∗-algebra. Let $x_1, \ldots, x_n, y_1, \ldots, y_m$ be self-adjoint elements in $\mathcal{A}$. Let $\mathbb{C}\langle X_1, \ldots, X_n, Y_1, \ldots, Y_m\rangle$ be the noncommutative polynomials in the indeterminates $X_1, \ldots, X_n, Y_1, \ldots, Y_m$. Let $\{P_r\}_{r=1}^\infty$ be the collection of all noncommutative polynomials in $\mathbb{C}\langle X_1, \ldots, X_n, Y_1, \ldots, Y_m\rangle$ with rational coefficients.

6.1. Unitary orbits of balls in $M_k(\mathbb{C})^n$

We let $M_k(\mathbb{C})$ be the $k\times k$ full matrix algebra with entries in $\mathbb{C}$, and $\mathcal{U}(k)$ be the group of all unitary matrices in $M_k(\mathbb{C})$. Let $M_k(\mathbb{C})^n$ denote the direct sum of $n$ copies of $M_k(\mathbb{C})$. Let $M_k^{s.a.}(\mathbb{C})$ be the real subspace of $M_k(\mathbb{C})$ consisting of all self-adjoint matrices of $M_k(\mathbb{C})$. Let $(M_k^{s.a.}(\mathbb{C}))^n$ be the direct sum of $n$ copies of $M_k^{s.a.}(\mathbb{C})$.

For every $\omega>0$, we define the $\omega$-orbit-$\|\cdot\|$-ball $\mathcal{U}(B_1, \ldots, B_n; \omega)$ centered at $(B_1, \ldots, B_n)$ in $M_k(\mathbb{C})^n$ to be the subset of $M_k(\mathbb{C})^n$ consisting of all $(A_1, \ldots, A_n)$ in $M_k(\mathbb{C})^n$ such that there exists some unitary matrix $W$ in $\mathcal{U}(k)$ satisfying
\[
\bigl\|(A_1, \ldots, A_n)-(WB_1W^*, \ldots, WB_nW^*)\bigr\|<\omega.
\]

6.2. Norm-microstate space

For all integers $r, k\ge 1$, real numbers $R, \varepsilon>0$ and noncommutative polynomials $P_1, \ldots, P_r$, we let
\[
\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_m; k, \varepsilon, P_1, \ldots, P_r)
\]
be as defined in Section 2.4.

6.3. Topological free orbit dimension

Definition 6.3.1. For $\omega>0$, we define the covering number
\[
o_\infty\bigl(\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_p; k, \varepsilon, P_1, \ldots, P_r), \omega\bigr)
\]
to be the minimal number of $\omega$-orbit-$\|\cdot\|$-balls that cover $\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_p; k, \varepsilon, P_1, \ldots, P_r)$ with the centers of these $\omega$-orbit-$\|\cdot\|$-balls in $\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_p; k, \varepsilon, P_1, \ldots, P_r)$.

For each function $f:\mathbb{N}\times\mathbb{N}\times\mathbb{R}^+\to\mathbb{R}$, we define
\[
k_f(x_1, \ldots, x_n : y_1, \ldots, y_p; \omega, R)
=\inf_{r\in\mathbb{N},\,\varepsilon>0}\ \limsup_{k\to\infty}\ f\Bigl(o_\infty\bigl(\Gamma_R^{(top)}(x_1, \ldots, x_n : y_1, \ldots, y_p; k, \varepsilon, P_1, \ldots, P_r), \omega\bigr), k, \omega\Bigr),
\]
and
\[
k_f(x_1, \ldots, x_n : y_1, \ldots, y_p; \omega)=\sup_{R>0}k_f(x_1, \ldots, x_n : y_1, \ldots, y_p; \omega, R),
\]
\[
k_f(x_1, \ldots, x_n : y_1, \ldots, y_p)=\limsup_{\omega\to 0^+}k_f(x_1, \ldots, x_n : y_1, \ldots, y_p; \omega),
\]
where $k_f(x_1, \ldots, x_n : y_1, \ldots, y_p)$ is called the topological $f(\cdot)$-free-orbit-dimension of $x_1, \ldots, x_n$ in the presence of $y_1, \ldots, y_p$.

6.4. Topological free entropy dimension and topological free orbit dimension

The following result follows directly from the definitions of topological free entropy dimension and topological free orbit dimension of an $n$-tuple of self-adjoint elements in a C∗-algebra.

Theorem 6.4.1. Suppose that $\mathcal{A}$ is a unital C∗-algebra and $x_1, \ldots, x_n$ is a family of self-adjoint elements of $\mathcal{A}$. Let $f:\mathbb{N}\times\mathbb{N}\times\mathbb{R}^+\to\mathbb{R}$ be defined by
\[
f(s, k, \omega)=\frac{\log s}{-k^2\log\omega}
\]
for $s, k\in\mathbb{N}$, $\omega>0$. Then
\[
\delta_{top}(x_1, \ldots, x_n)\le k_f(x_1, \ldots, x_n)+1,
\]
where $\delta_{top}(x_1, \ldots, x_n)$ is the topological free entropy dimension.

Proof. The proof is similar to that of Lemma 1 in [11]. $\Box$



7. Topological free orbit dimension in one variable We recall the packing number of a set in a metric space as follows. Definition 7.0.1. Suppose that X is a metric space with a metric distance d. (i) The packing number of a set K by ω-nets in X, denoted by P (K, ω), is the maximal cardinality of the subsets F in K satisfying for all a, b in F either a = b or d(a, b) ω. (ii) The packing dimension of the set K in X, denoted by dim(K), is defined by dim(K) = lim sup ω→0+

log(P (K, ω)) . − log ω

7.1. Upper-bound of the topological free orbit dimension of one variable Suppose that x = x ∗ is a self-adjoint element in a unital C∗ algebra A and σ (x) is the spectrum of x in A. For any ω > 0, let m = P (K, ω) be the packing number of σ (x) in R. Thus there exists a family of elements λ1 , . . . , λm in σ (x) such that (i) |λi − λj | ω for all 1 i = j m; and (ii) for any λ in σ (x), there is some λj with 1 j m satisfying |λ − λj | ω. Lemma 7.1.1. For any given R > x, when r is large enough and is small enough, we have (top)

lim sup

log o∞ (ΓR

k→∞

(x; k, , P1 , . . . , Pr ), 3ω) m. log k

Proof. By Theorem 3.1.1, there exist some $r_0\ge 1$ and $\varepsilon_0>0$ such that the following holds: when $r\ge r_0$, $\varepsilon\le\varepsilon_0$, for any $k\in\mathbb{N}$ and $A$ in $\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r)$, there are positive integers $1\le k_1,\dots,k_m\le k$ with $k_1+\cdots+k_m=k$ and a unitary matrix $U$ in $M_k(\mathbb{C})$ satisfying
\[ \left\| U^*AU - \begin{pmatrix} \lambda_1 I_{k_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{k_2} & \cdots & 0 \\ \cdots & \cdots & \ddots & \cdots \\ 0 & 0 & \cdots & \lambda_m I_{k_m} \end{pmatrix} \right\| \le 2\omega, \tag{7.1.1} \]
where $I_{k_j}$ is the $k_j\times k_j$ identity matrix for $1\le j\le m$. Let
\[ \Omega(k_1,\dots,k_m) = \left\{ U^* \begin{pmatrix} \lambda_1 I_{k_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{k_2} & \cdots & 0 \\ \cdots & \cdots & \ddots & \cdots \\ 0 & 0 & \cdots & \lambda_m I_{k_m} \end{pmatrix} U \ \Big|\ U \text{ is in } \mathcal{U}_k \right\}. \]
Let $J(k)$ be the set consisting of all those $(k_1,\dots,k_m)\in\mathbb{N}^m$ with $k_1+\cdots+k_m=k$. Then the cardinality of the set $J(k)$ is equal to
\[ \frac{(k-1)!}{(m-1)!\,(k-m)!}. \]
Then, by (7.1.1), $\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r)$ is contained in the $2\omega$-neighborhood of the set
\[ \bigcup_{(k_1,\dots,k_m)\in J(k)} \Omega(k_1,\dots,k_m). \]
Therefore we have
\[ o_\infty\big(\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r),3\omega\big) \le o_\infty\Big(\bigcup_{(k_1,\dots,k_m)\in J(k)}\Omega(k_1,\dots,k_m),\omega\Big) \le |J(k)| = \frac{(k-1)!}{(m-1)!\,(k-m)!}. \]
Therefore,
\[ \limsup_{k\to\infty} \frac{\log o_\infty\big(\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r),3\omega\big)}{\log k} \le \limsup_{k\to\infty} \frac{\log\frac{(k-1)!}{(m-1)!\,(k-m)!}}{\log k} = m-1. \]
□
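The counting step in this proof can be checked numerically. The sketch below is an illustration only (the function names are hypothetical and not from the paper): it counts the tuples in $J(k)$ by brute force, compares with the binomial coefficient $\binom{k-1}{m-1}$, and evaluates the quotient $\log|J(k)|/\log k$, which approaches $m-1$ slowly.

```python
from math import comb, log

def count_compositions(k, m):
    """Number of tuples (k_1, ..., k_m) of positive integers with k_1 + ... + k_m = k."""
    if m == 1:
        return 1 if k >= 1 else 0
    # Fix the first entry and count the ways to split the remainder into m - 1 parts.
    return sum(count_compositions(k - first, m - 1) for first in range(1, k - m + 2))

def growth_exponent(k, m):
    """The quotient log |J(k)| / log k, with |J(k)| = C(k - 1, m - 1)."""
    return log(comb(k - 1, m - 1)) / log(k)

print(count_compositions(5, 3), comb(4, 2))   # both counts agree
print(growth_exponent(10 ** 6, 3))            # approaches m - 1 = 2 as k grows
```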

7.2. Lower-bound

Recall that $x$ is a self-adjoint element in a unital C∗-algebra $A$ and $\sigma(x)$ is the spectrum of $x$ in $A$. For any $\omega>0$, let $m = P(\sigma(x),\omega)$ be the packing number of $\sigma(x)$ in $\mathbb{R}$. Thus there exists a family of elements $\lambda_1,\dots,\lambda_m$ in $\sigma(x)$ such that (i) $|\lambda_i-\lambda_j|\ge\omega$ for all $1\le i\ne j\le m$; and (ii) for any $\lambda$ in $\sigma(x)$, there is some $\lambda_j$ with $1\le j\le m$ satisfying $|\lambda-\lambda_j|\le\omega$.

Lemma 7.2.1. We have
\[ \limsup_{k\to\infty} \frac{\log o_\infty\big(\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r),\frac{\omega}{3}\big)}{\log k} \ge m-1. \]

Proof. For any $R>\|x\|$, $r\ge 1$ and $\varepsilon>0$, by functional calculus, there are $\lambda_{m+1},\dots,\lambda_n$ in $\sigma(x)$ such that for every $t_1,\dots,t_m\in\mathbb{N}$, the matrix
\[ A = \mathrm{diag}(\lambda_1 I_{2nt_1},\lambda_2 I_{2nt_2},\dots,\lambda_m I_{2nt_m},\lambda_1,\dots,\lambda_m,\dots,\lambda_n) \tag{7.2.1} \]
is in $\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r)$, where $k = 2nt_1+\cdots+2nt_m+n$. In other words, for any $k\in\mathbb{N}$ with $2n\mid(k-n)$, we let $J(k)$ be the set consisting of all those $(t_1,\dots,t_m)\in\mathbb{N}^m$ with $2nt_1+\cdots+2nt_m = k-n$. Then
\[ A = \mathrm{diag}(\lambda_1 I_{2nt_1},\lambda_2 I_{2nt_2},\dots,\lambda_m I_{2nt_m},\lambda_1,\dots,\lambda_m,\dots,\lambda_n) \text{ is in } \Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r), \quad \forall (t_1,\dots,t_m)\in J(k). \tag{7.2.2} \]
By the definition of $J(k)$, the cardinality of the set $J(k)$ is equal to
\[ \frac{\big(\frac{k-n}{2n}-1\big)!}{\big(\frac{k-n}{2n}-m\big)!\,(m-1)!}. \]

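Assume that $B_1$ and $B_2$ are two self-adjoint matrices in $M_k(\mathbb{C})$ whose eigenvalues are $\alpha_1\le\alpha_2\le\cdots\le\alpha_k$ and $\beta_1\le\beta_2\le\cdots\le\beta_k$ respectively. We can introduce the quantity $\delta(B_1,B_2)$ as follows:
\[ \delta(B_1,B_2) = \max_{1\le i\le k} |\alpha_i-\beta_i|. \]
Let $(s_1,\dots,s_m)$ and $(t_1,\dots,t_m)$ be two distinct elements in $J(k)$ and
\[ A_1 = \mathrm{diag}(\lambda_1 I_{2nt_1},\lambda_2 I_{2nt_2},\dots,\lambda_m I_{2nt_m},\lambda_1,\dots,\lambda_m,\dots,\lambda_n), \]
\[ A_2 = \mathrm{diag}(\lambda_1 I_{2ns_1},\lambda_2 I_{2ns_2},\dots,\lambda_m I_{2ns_m},\lambda_1,\dots,\lambda_m,\dots,\lambda_n) \]
be two diagonal self-adjoint matrices in $M_k(\mathbb{C})$. By the definition, we know that $\delta(A_1,A_2)\ge\omega$. By Weyl's inequality on the eigenvalues of two self-adjoint matrices in [28], we have
\[ \|A_1 - WA_2W^*\| \ge \delta(A_1,A_2) \ge \omega \]
for any $W$ in $\mathcal{U}(k)$. Combining with (7.2.2), we have
\[ o_\infty\Big(\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r),\frac{\omega}{3}\Big) \ge |J(k)| = \frac{\big(\frac{k-n}{2n}-1\big)!}{\big(\frac{k-n}{2n}-m\big)!\,(m-1)!}, \quad \forall k\ge 1 \text{ with } 2n\mid(k-n). \]
Hence
\[ \limsup_{k\to\infty} \frac{\log o_\infty\big(\Gamma_R^{(top)}(x;k,\varepsilon,P_1,\dots,P_r),\frac{\omega}{3}\big)}{\log k} \ge \limsup_{k\to\infty} \frac{\log\frac{(\frac{k-n}{2n}-1)!}{(\frac{k-n}{2n}-m)!\,(m-1)!}}{\log k} = m-1. \]
□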

7.3. Topological free orbit dimension of one self-adjoint element

Theorem 7.3.1. Suppose that $x$ is a self-adjoint element in a unital C∗-algebra $A$ and $\sigma(x)$ is the spectrum of $x$ in $A$. Let $\dim(\sigma(x))$ be the packing dimension of the set $\sigma(x)$ in $\mathbb{R}$. Let $f : \mathbb{N}\times\mathbb{N}\times\mathbb{R}^+\to\mathbb{R}$ be defined by
\[ f(s,k,\omega) = \frac{\log\frac{\log s}{\log k}}{-\log\omega} \]
for $s,k\in\mathbb{N}$, $\omega>0$. Then
\[ k_f(x) = \dim\sigma(x). \]

Proof. The result follows directly from Lemmas 7.1.1, 7.2.1 and Definition 7.0.1. □

Theorem 7.3.2. Suppose that $x$ is a self-adjoint element in a unital C∗-algebra $A$. Let $f : \mathbb{N}\times\mathbb{N}\times\mathbb{R}^+\to\mathbb{R}$ be defined by
\[ f(s,k,\omega) = \frac{\log s}{-k^2\log\omega} \]
for $s,k\in\mathbb{N}$, $\omega>0$. Then $k_f(x) = 0$.

Proof. The result follows directly from Lemma 7.1.1 and Definition 7.0.1. □

References

[1] N. Brown, K. Dykema, K. Jung, Free entropy dimension in amalgamated free products, math.OA/0609080.
[2] M. Dostál, D. Hadwin, An alternative to free entropy for free group factors, in: International Workshop on Operator Algebra and Operator Theory, Linfen, 2001, Acta Math. Appl. Sin. Engl. Ser. 19 (3) (2003) 419–472.
[3] K. Dykema, Two applications of free entropy, Math. Ann. 308 (3) (1997) 547–558.
[4] L. Ge, Applications of free entropy to finite von Neumann algebras, Amer. J. Math. 119 (2) (1997) 467–485.
[5] L. Ge, Applications of free entropy to finite von Neumann algebras, II, Ann. of Math. (2) 147 (1) (1998) 143–157.
[6] L. Ge, S. Popa, On some decomposition properties for factors of type II1, Duke Math. J. 94 (1) (1998) 79–101.
[7] L. Ge, J. Shen, Free entropy and property T factors, Proc. Natl. Acad. Sci. USA 97 (18) (2000) 9881–9885 (electronic).
[8] L. Ge, J. Shen, On free entropy dimension of finite von Neumann algebras, Geom. Funct. Anal. 12 (3) (2002) 546–566.
[9] U. Haagerup, S. Thorbjørnsen, A new application of random matrices: $\mathrm{Ext}(C^*_{red}(F_2))$ is not a group, Ann. of Math. (2) 162 (2) (2005) 711–775.
[10] D. Hadwin, Free entropy and approximate equivalence in von Neumann algebras, in: Operator Algebras and Operator Theory, Shanghai, 1997, in: Contemp. Math., vol. 228, Amer. Math. Soc., Providence, RI, 1998, pp. 111–131.
[11] D. Hadwin, J. Shen, Free orbit dimension of finite von Neumann algebras, J. Funct. Anal. 249 (2007) 75–91.
[12] K. Jung, The free entropy dimension of hyperfinite von Neumann algebras, Trans. Amer. Math. Soc. 355 (12) (2003) 5053–5089 (electronic).
[13] K. Jung, A free entropy dimension lemma, Pacific J. Math. 211 (2) (2003) 265–271.
[14] K. Jung, Strongly 1-bounded von Neumann algebras, arXiv:math.OA/0510576.
[15] K. Jung, D. Shlyakhtenko, All generating sets of all property T von Neumann algebras have free entropy dimension 1, arXiv:math.OA/0603669.
[16] D. McDuff, Central sequences and the hyperfinite factor, Proc. London Math. Soc. (3) 21 (1970) 443–461.
[17] C. Olsen, W. Zame, Some C∗-algebras with a single generator, Trans. Amer. Math. Soc. 215 (1976) 205–217.
[18] M. Pimsner, D. Voiculescu, Imbedding the irrational rotation C∗-algebra into an AF-algebra, J. Operator Theory 4 (2) (1980) 201–210.
[19] M. Stefan, Indecomposability of free group factors over nonprime subfactors and abelian subalgebras, Pacific J. Math. 219 (2) (2005) 365–390.
[20] M. Stefan, The primality of subfactors of finite index in the interpolated free group factors, Proc. Amer. Math. Soc. 126 (8) (1998) 2299–2307.
[21] S. Szarek, Nets of Grassmann manifold and orthogonal group, in: Proceedings of Research Workshop on Banach Space Theory, Iowa City, Iowa, 1981, Univ. Iowa, Iowa City, IA, 1982, pp. 169–185.
[22] S. Szarek, Metric entropy of homogeneous spaces, in: Quantum Probability, in: Banach Center Publ., vol. 43, Polish Acad. Sci., Warsaw, 1998, pp. 395–410.
[23] D. Voiculescu, Circular and semicircular systems and free product factors, in: Operator Algebras, Unitary Representations, Enveloping Algebras, and Invariant Theory, Paris, 1989, in: Progr. Math., vol. 92, Birkhäuser, Boston, MA, 1990, pp. 45–60.
[24] D. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory II, Invent. Math. 118 (1994) 411–440.
[25] D. Voiculescu, The analogues of entropy and of Fisher’s information measure in free probability theory III: The absence of Cartan subalgebras, Geom. Funct. Anal. 6 (1996) 172–199.
[26] D. Voiculescu, Free entropy dimension 1 for some generators of property T factors of type II1, J. Reine Angew. Math. 514 (1999) 113–118.
[27] D. Voiculescu, The topological version of free entropy, Lett. Math. Phys. 62 (1) (2002) 71–82.
[28] H. Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung), Math. Ann. 71 (4) (1912) 441–479 (in German).


On the characterization of the smoothness of skew-adjoint potentials in periodic Dirac operators

T. Kappeler a,∗,1 , F. Serier b , P. Topalov c

a University of Zürich, Winterthurerstrasse 190, CH-8057 Zürich, Switzerland
b École Centrale de Lyon - DMI - Institut Camille Jordan UMR CNRS 5208, 36, avenue Guy de Collongue, 69134 Ecully cedex, France
c Northeastern University, 360 Huntington Avenue, Boston, MA 02115, USA

Received 29 March 2008; accepted 21 January 2009 Available online 11 February 2009 Communicated by C. Kenig

Abstract In this paper we consider periodic Dirac operators with skew-adjoint potentials in a large class of weighted Sobolev spaces. We characterize the smoothness of such potentials by asymptotic properties of the periodic spectrum of the corresponding Dirac operators. © 2009 Elsevier Inc. All rights reserved. Keywords: Focusing NLS; Periodic Dirac operators; Decay of gap lengths

1. Introduction

Consider the focusing nonlinear Schrödinger equation (fNLS)
\[ i\partial_t\psi = -\partial_x^2\psi - 2|\psi|^2\psi \tag{1} \]


E-mail addresses: [email protected] (T. Kappeler), [email protected] (F. Serier), [email protected] (P. Topalov). 1 Supported in part by the Swiss National Science Foundation and the European Community through the FPG Marie Curie RTN ENIGMA (M RTN-CT-2004-5652). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.027


T. Kappeler et al. / Journal of Functional Analysis 256 (2009) 2069–2112

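with periodic boundary conditions, i.e. $\psi(x+1,t) = \psi(x,t)$ for $x\in\mathbb{R}$, $t\in\mathbb{R}$. The fNLS equation is known to be integrable – see e.g. [23]. According to [23] it admits a Lax pair formalism. Indeed, recall that (1) can be written in Hamiltonian form. Let $L^2 := L^2(\mathbb{T};\mathbb{C})$ denote the standard Hilbert space of $L^2$-integrable complex-valued functions on the circle $\mathbb{T} = \mathbb{R}/\mathbb{Z}$, and define $\mathcal{L}^2 := L^2\times L^2$. Introduce the Poisson bracket, defined for $C^1$-functionals $F,G$ on $\mathcal{L}^2$ as follows
\[ \{F,G\}(\varphi_1,\varphi_2) = i\int_0^1 \big(\partial_{\varphi_1}F\,\partial_{\varphi_2}G - \partial_{\varphi_2}F\,\partial_{\varphi_1}G\big)\,dx \]
where $\partial_{\varphi_i}F$ denotes the $L^2$-gradient of $F$ with respect to $\varphi_i$, $i=1,2$. The Hamiltonian system with Hamiltonian
\[ \mathcal{H}(\varphi_1,\varphi_2) := \int_0^1 \big(\partial_x\varphi_1\,\partial_x\varphi_2 + \varphi_1^2\varphi_2^2\big)\,dx \]
is given by
\[ \partial_t(\varphi_1,\varphi_2) = i\big(-\partial_{\varphi_2}\mathcal{H},\ \partial_{\varphi_1}\mathcal{H}\big) \tag{2} \]
and (1) is obtained by restricting (2) to the invariant subspace $i\mathcal{L}^2_R$ of $\mathcal{L}^2$,
\[ i\mathcal{L}^2_R := \big\{(\varphi_1,\varphi_2)\in\mathcal{L}^2 : \varphi_2 = -\overline{\varphi_1}\big\}. \]
With $(\varphi_1,\varphi_2) = (\psi,-\overline{\psi})$, Eq. (1) can be written as
\[ \partial_t\psi = i\,\partial_{\overline{\psi}}\mathcal{H}_f \tag{3} \]
where
\[ \mathcal{H}_f(\psi) = \int_0^1 \big(-\partial_x\psi\,\partial_x\overline{\psi} + \psi^2\overline{\psi}^2\big)\,dx. \tag{4} \]
We remark that when restricting (2) to the invariant subspace $\mathcal{L}^2_R$ of $\mathcal{L}^2$,
\[ \mathcal{L}^2_R := \big\{(\varphi_1,\varphi_2)\in\mathcal{L}^2 : \varphi_2 = \overline{\varphi_1}\big\}, \]
one obtains the defocusing nonlinear Schrödinger equation (dNLS) for $\psi := \varphi_1$,
\[ \partial_t\psi = -i\,\partial_{\overline{\psi}}\mathcal{H}_d = -i\big(-\partial_x^2\psi + 2|\psi|^2\psi\big) \tag{5} \]
where
\[ \mathcal{H}_d(\psi) = \int_0^1 \big(\partial_x\psi\,\partial_x\overline{\psi} + \psi^2\overline{\psi}^2\big)\,dx. \tag{6} \]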

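Eq. (2) admits the Lax pair representation
\[ \partial_t L(\varphi) = \big[A(\varphi),L(\varphi)\big] \]
where $\varphi = (\varphi_1,\varphi_2)$, $L = L(\varphi)$ is the Dirac operator – also referred to as Zakharov–Shabat operator
\[ L(\varphi) := i\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\partial_x + \begin{pmatrix} 0 & \varphi_1 \\ \varphi_2 & 0 \end{pmatrix} \]
and
\[ A(\varphi) := i\begin{pmatrix} -2\partial_x^2 + \varphi_1\varphi_2 & -\partial_x\varphi_1 - 2\varphi_1\partial_x \\ \partial_x\varphi_2 + 2\varphi_2\partial_x & 2\partial_x^2 - \varphi_1\varphi_2 \end{pmatrix}. \]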

The periodic spectrum of $L(\varphi)$ is conserved along any solution of (2). It plays an important role in analyzing the equations (fNLS) and (dNLS). Note that for potentials $\varphi = (\varphi_1,\varphi_2)$ in $\mathcal{L}^2_R$, the Dirac operator $L(\varphi)$ is symmetric with respect to the $L^2$-inner product on $L^2\times L^2$. Its spectral properties – unlike in the case where $\varphi$ is in $i\mathcal{L}^2_R$ – have been studied in detail. In this paper we present new results relating properties of the periodic spectrum of $L(\varphi)$ with the smoothness of $\varphi$ for potentials $\varphi = (\varphi_1,\varphi_2)$ in $i\mathcal{L}^2_R$, i.e. for $\varphi$ with $\varphi_2 = -\overline{\varphi_1}$. Note that in this case, $L(\varphi)$ is no longer symmetric. To state our results we first have to introduce some notation and to recall some well-known results. For $\varphi\in\mathcal{L}^2$, denote by $\mathrm{spec}(L(\varphi))$ the spectrum of the operator $L(\varphi)$ with domain
\[ \mathrm{dom}\,L(\varphi) := \big\{F\in H^1_{loc}(\mathbb{R},\mathbb{C})\times H^1_{loc}(\mathbb{R},\mathbb{C}) : F(1) = \pm F(0)\big\} \]
where $H^1_{loc}(\mathbb{R},\mathbb{C})$ denotes the standard Sobolev space of complex-valued functions which together with their derivative are locally $L^2$-integrable. The spectrum $\mathrm{spec}(L(\varphi))$ coincides with the spectrum of the operator $L(\varphi)$ considered on $[0,2]$ with periodic boundary conditions. The following proposition is well known – see e.g. [9, Proposition I.6].

Proposition 1.1. For any $\varphi\in\mathcal{L}^2$, the periodic spectrum $\mathrm{spec}(L(\varphi))$ consists of a sequence of pairs of eigenvalues $\lambda_k^+(\varphi),\lambda_k^-(\varphi)$ in $\mathbb{C}$, $k\in\mathbb{Z}$, listed with multiplicities, such that
\[ \lambda_k^\pm(\varphi) = k\pi + \ell^2(k) \]
locally uniformly in $\varphi$, i.e. $(\lambda_k^\pm(\varphi)-k\pi)_{k\in\mathbb{Z}}\in\ell^2(\mathbb{Z};\mathbb{C})$ and the sequences are locally uniformly bounded with respect to $\varphi$.



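Here, $\ell^2(\mathbb{Z};\mathbb{C})$ denotes the standard $\ell^2$-sequence space. It is straightforward to check that for $\varphi$ in $i\mathcal{L}^2_R$, the periodic eigenvalues $(\lambda_k^\pm(\varphi))_{k\in\mathbb{Z}}$ can be listed in such a way that $\mathrm{Im}\,\lambda_k^+(\varphi)\ge 0$ and $\lambda_k^-(\varphi) = \overline{\lambda_k^+(\varphi)}$ for any $k\in\mathbb{Z}$, and $(\lambda_k^+(\varphi))_{k\in\mathbb{Z}}$ is lexicographically ordered, $\cdots\preccurlyeq\lambda_k^+\preccurlyeq\lambda_{k+1}^+\preccurlyeq\cdots$, i.e.
\[ \mathrm{Re}\,\lambda_k^+ < \mathrm{Re}\,\lambda_{k+1}^+ \quad\text{or}\quad \big[\mathrm{Re}\,\lambda_k^+ = \mathrm{Re}\,\lambda_{k+1}^+ \text{ and } \mathrm{Im}\,\lambda_k^+\le\mathrm{Im}\,\lambda_{k+1}^+\big]. \]
For any $\varphi\in i\mathcal{L}^2_R$, we introduce $(\gamma_k(\varphi))_{k\in\mathbb{Z}}$ by setting $\gamma_k(\varphi) := i(\lambda_k^- - \lambda_k^+)$. Note that
\[ \gamma_k(\varphi) = 2\,\mathrm{Im}\,\lambda_k^+\in\mathbb{R}_{\ge 0}. \]
Unlike for $\varphi\in\mathcal{L}^2_R$, the numbers $\gamma_k(\varphi)$ do not have the interpretation of lengths of gaps in the spectrum of $L(\varphi)$ on $\mathbb{R}$ if $\varphi\in i\mathcal{L}^2_R$; they are simply the spacing of the pair of eigenvalues $\lambda_k^+(\varphi)$, $\lambda_k^-(\varphi)$. Nevertheless we will refer to $\gamma_k(\varphi)$ as the $k$th gap length. According to Proposition 1.1, it follows that $(\gamma_k(\varphi))_{k\in\mathbb{Z}}\in\ell^2(\mathbb{Z};\mathbb{R})$.

Finally let us introduce the weighted Sobolev spaces. A weight is a function $\omega:\mathbb{Z}\to\mathbb{R}$, $m\mapsto\omega(m)\equiv\omega_m$ satisfying
\[ 1\le\omega(0)\le\omega(m) \quad\text{and}\quad \omega(-m) = \omega(m) \quad \forall m\ge 1. \]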

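A weight $\omega$ is said to be sub-multiplicative if $\omega(m)\le\omega(m-l)\,\omega(l)$ for all $m,l\in\mathbb{Z}$. We denote the set of all weights by $\mathcal{W}$ and by $\mathcal{M}$ the subset of sub-multiplicative weights (cf. [11,12]). Given any weight $\omega\in\mathcal{W}$, the $\omega$-norm $\|f\|_\omega$ of an $L^2$-integrable complex-valued 1-periodic function $f:\mathbb{R}\to\mathbb{C}$,
\[ f = \sum_{m\in\mathbb{Z}} f_{2m}\,e^{2m\pi ix}, \]
is defined by
\[ \|f\|_\omega = \Big(\sum_{m\in\mathbb{Z}} \omega_{2m}^2\,|f_{2m}|^2\Big)^{1/2} \]
and
\[ H^\omega\equiv H^\omega(\mathbb{T};\mathbb{C}) := \big\{f\in L^2(\mathbb{T};\mathbb{C}) : \|f\|_\omega<\infty\big\} \]
denotes the Hilbert space of all such functions with finite $\omega$-norm. Note that for the trivial weight $\omega^{(0)}\equiv 1$, one has $H^{\omega^{(0)}} = L^2(\mathbb{T};\mathbb{C})$. Further, we introduce $\mathcal{H}^\omega := H^\omega\times H^\omega$.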
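The definitions of sub-multiplicativity and of the $\omega$-norm can be explored numerically. The sketch below is illustrative only: the helper names and the two sample weights $(1+|m|)^r$ and $e^{a|m|}$ are assumptions made here, and the sub-multiplicativity test is a finite spot check rather than a proof.

```python
from math import exp, sqrt

def power_weight(r):
    """omega(m) = (1 + |m|)^r, an illustrative weight for r >= 0."""
    return lambda m: (1 + abs(m)) ** r

def exponential_weight(a):
    """omega(m) = exp(a |m|), an illustrative weight for a > 0."""
    return lambda m: exp(a * abs(m))

def is_submultiplicative(w, bound=40, tol=1e-12):
    """Spot check of w(m) <= w(m - l) w(l) for |m|, |l| <= bound (not a proof)."""
    rng = range(-bound, bound + 1)
    return all(w(m) <= w(m - l) * w(l) * (1 + tol) for m in rng for l in rng)

def omega_norm(coeffs, w):
    """||f||_omega for f = sum_m f_{2m} e^{2 m pi i x}; `coeffs` maps m to f_{2m}."""
    return sqrt(sum(w(2 * m) ** 2 * abs(c) ** 2 for m, c in coeffs.items()))

f = {0: 1.0, 1: 0.5j, -1: 0.5j}          # a trigonometric polynomial
print(omega_norm(f, power_weight(0)))     # plain L^2 norm
print(omega_norm(f, power_weight(1)))     # weighted norm with omega(m) = 1 + |m|
```

A weight such as $m\mapsto e^{m^2/100}$ fails the spot check, which shows that symmetry and $\omega\ge 1$ alone do not imply sub-multiplicativity.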



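For any $\varphi = (\varphi_1,\varphi_2)\in\mathcal{H}^\omega$, the norm $\|\varphi\|_\omega$ is given by
\[ \|\varphi\|_\omega = \big(\|\varphi_1\|_\omega^2 + \|\varphi_2\|_\omega^2\big)^{1/2}. \]
Here are some examples of relevant weights. All of them are elements of $\mathcal{M}$. Let $\langle n\rangle = 1+|n|$. The Sobolev weights
\[ \langle n\rangle^r, \quad r\in\mathbb{R}_{\ge 0}, \]
give rise to the usual Sobolev spaces $H^r = H^r(\mathbb{T};\mathbb{C})$ of 1-periodic, complex-valued functions. The Abel weights
\[ \langle n\rangle^r e^{a|n|}, \quad r\in\mathbb{R}_{\ge 0},\ a\in\mathbb{R}_{>0}, \]
define spaces $H^{r,a} = H^{r,a}(\mathbb{T};\mathbb{C})$ of functions in $H^r$ which are analytic on the complex strip $|\mathrm{Im}\,z|<\frac{a}{2\pi}$ and have traces in $H^r$ on the lines $\mathrm{Im}\,z = \pm\frac{a}{2\pi}$. The Gevrey weights
\[ \langle n\rangle^r e^{a|n|^\sigma}, \quad r\in\mathbb{R}_{\ge 0},\ a\in\mathbb{R}_{>0},\ 0<\sigma<1, \]
give rise to the so-called Gevrey spaces $H^{r,a,\sigma} = H^{r,a,\sigma}(\mathbb{T};\mathbb{C})$. They are all subspaces of $C^\infty(\mathbb{T};\mathbb{C})$. Obviously, $H^{r,a}\subset H^{r,a,\sigma}\subset H^r$. Since $\log\omega_n$ is sub-additive and nonnegative, the limit
\[ \chi(\omega) := \lim_{n\to\infty}\frac{\log\omega_n}{n} \]
exists and is nonnegative [17, No. 98]. We call a weight $\omega$ in $\mathcal{W}$ exponential if $\chi(\omega)>0$ and sub-exponential if $\chi(\omega)=0$, $(\omega_n)_{n\ge 0}$ is non-decreasing, and $(\frac{1}{n}\log\omega_n)_{n\ge 1}$ converges to 0 in an eventually monotone manner. Clearly, Abel weights are exponential, while Sobolev and Gevrey weights are sub-exponential. Yet another example of a sub-exponential weight is given by
\[ \langle n\rangle^r\exp\Big(\frac{a|n|}{1+(\log\langle n\rangle)^\alpha}\Big), \quad r\in\mathbb{R}_{\ge 0} \text{ and } a,\alpha\in\mathbb{R}_{>0}, \]
which is lighter than the Abel and heavier than the Gevrey weights.

To measure the decay of sequences such as $(\gamma_k(\varphi))_{k\in\mathbb{Z}}$ we introduce the notion of weighted $\ell^2$-sequences. A sequence $z = (z_n)_{n\in\mathbb{Z}}$ in $\mathbb{C}$ is said to be in $h^\omega\equiv h^\omega(\mathbb{Z};\mathbb{C})$ if its $\omega$-norm is finite,
\[ \|z\|_\omega := \Big(\sum_{k\in\mathbb{Z}}\omega_{2k}^2\,|z_k|^2\Big)^{1/2}<\infty. \]
First we state the following known result.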



Theorem 1.1. Let $\omega$ be a weight in $\mathcal{M}$. If $\varphi\in i\mathcal{L}^2_R$ satisfies $\|\varphi\|_\omega<\infty$ then $\|(\gamma_k(\varphi))_{k\in\mathbb{Z}}\|_\omega<\infty$.

In fact, Theorem 1.1 is known to hold for any $\varphi\in\mathcal{L}^2$ and was first proved in [6,8] for weights of the form $\omega(n) = \langle n\rangle^\delta\eta(n)$ with $\delta>0$ and $\eta\in\mathcal{M}$ and, in the generality stated, in [2,4,5]. In this paper we are concerned with the question of whether the converse of Theorem 1.1 holds. The main results of this paper are the following two theorems.

Theorem 1.2. Let $\omega$ be a sub-exponential weight in $\mathcal{M}$. Then for any $\varphi\in i\mathcal{L}^2_R$ with $\|(\gamma_k(\varphi))_{k\in\mathbb{Z}}\|_\omega<\infty$, it follows that $\|\varphi\|_\omega<\infty$.

For exponential weights, we get a slightly weaker result.

Theorem 1.3. Let $\omega\in\mathcal{M}$ be an exponential weight. Then for any $\varphi\in i\mathcal{L}^2_R$ with $(\gamma_k(\varphi))_{k\in\mathbb{Z}}\in h^\omega$ it follows that $\varphi\in\mathcal{H}^{\omega^\rho}$ for some $0<\rho\le 1$.

Theorem 1.2 and Theorem 1.3 lead to the following application. An element $\varphi\in\mathcal{L}^2$ is said to be a finite gap potential if $\{n\in\mathbb{Z}\mid\gamma_n(\varphi)\ne 0\}$ is finite. We denote the set of all finite gap potentials by $\mathcal{G}$ and set $\mathcal{G}^\omega := \mathcal{G}\cap\mathcal{H}^\omega$.

Corollary 1.1. For any weight $\omega\in\mathcal{M}$, the set $\mathcal{G}^\omega\cap i\mathcal{L}^2_R$ is dense in $\mathcal{H}^\omega\cap i\mathcal{L}^2_R$.

We remark that Corollary 1.1 does not follow from the fact that $\mathcal{G}^\omega$ is dense in $\mathcal{H}^\omega$. The latter result was first proved in [7] for weights of the form $\omega(n) = \langle n\rangle^\delta\eta(n)$ with $\delta>0$ and $\eta\in\mathcal{M}$ and, in the generality stated in Corollary 1.1, in [5]. In the case $\omega = 1$, Corollary 1.1 is due to Tkachenko [21]. In subsequent work we will use the theorems stated above to construct Birkhoff coordinates for the fNLS equation – see [10].

There is a vast literature on the topic of relating the regularity of a periodic potential with asymptotic properties of spectral data of the corresponding Schrödinger operator or Dirac operator – see e.g. [15]. Instead of giving a survey of the results obtained so far, we limit ourselves to brief comments on the papers most closely related to the present article. The methods employed in this paper were first developed for Hill’s operator, $-\frac{d^2}{dx^2}+q$, with $q$ being a 1-periodic real-valued potential in $L^2$. It is well known that the periodic spectrum of $-\frac{d^2}{dx^2}+q$ (considered on the interval $[0,2]$) is discrete. When listed in increasing order (and with multiplicities) the eigenvalues $(\lambda_k)_{k\ge 0}$ satisfy
\[ \lambda_0<\lambda_1\le\lambda_2<\lambda_3\le\lambda_4<\cdots. \]
By Floquet theory, the open intervals $(-\infty,\lambda_0),(\lambda_1,\lambda_2),(\lambda_3,\lambda_4),\dots$ are gaps in the spectrum of $-\frac{d^2}{dx^2}+q$, when considered on the whole real line. In [11,12], Kappeler and Mityagin presented a novel approach for relating the decay of the gap lengths, $\gamma_k := \lambda_{2k}-\lambda_{2k-1}$, to the regularity of the potential $q\in H^\omega(\mathbb{T})$ based on a Lyapunov–Schmidt decomposition in Fourier space. For $q$ in the weighted Sobolev space $H^\omega$ with $\omega\in\mathcal{M}$, the sequence $(\gamma_k)_{k\ge 1}$ can be proved to be in the weighted $\ell^2$-sequence space $h^\omega$.



In subsequent work, Djakov and Mityagin [1,5] succeeded in proving that for Gevrey weights of the form
\[ \omega(n) = \langle n\rangle^a e^{|n|^\sigma} \quad (a\ge 0,\ 0<\sigma<1) \]
any $q\in L^2(\mathbb{T})$ with $(\gamma_k)_{k\ge 1}\in h^\omega$ actually is in $H^\omega$. A slightly weaker result holds for exponential weights. Their proofs however are rather complicated. In [18], Pöschel obtained the most general results in this direction. He used a new functional analytic approach which considerably simplifies the one of Djakov and Mityagin. See also the Epilog in [18]. In [14], Kappeler, Serier, and Topalov, using still another approach involving flows related to angle variables, presented a very short proof of the analogous result for potentials in certain spaces of distributions.

In the same way, similar results for the Dirac operator $L(\varphi)$ with $\varphi = (\varphi_1,\varphi_2)$ in $\mathcal{L}^2_R$ were obtained. When listed in increasing order (and with multiplicities) the eigenvalues $(\lambda_k^\pm)_{k\in\mathbb{Z}}$ of $L(\varphi)$ with $\varphi\in\mathcal{L}^2_R$ satisfy
\[ \cdots<\lambda_{-1}^-\le\lambda_{-1}^+<\lambda_0^-\le\lambda_0^+<\lambda_1^-\le\lambda_1^+<\cdots. \]
By Floquet theory, the open intervals $(\lambda_k^-,\lambda_k^+)$, $k\in\mathbb{Z}$, are gaps in the spectrum of $L(\varphi)$, when considered on the whole real line. For any weight of the form $\omega(n) = \langle n\rangle^\delta\omega_1(n)$ with $\omega_1\in\mathcal{M}$ and $\delta>0$ it was proved in [6,8] that for an arbitrary $\varphi$ in $\mathcal{H}^\omega$, the sequence $(\lambda_k^+-\lambda_k^-)_{k\in\mathbb{Z}}$ is in the weighted sequence space $h^\omega$. In [2,4,5], it was shown that the latter result holds for an arbitrary weight $\omega\in\mathcal{M}$ and that for a sub-exponential weight $\omega$, an element $\varphi\in\mathcal{L}^2_R$ with $(\gamma_k)_{k\in\mathbb{Z}}\in h^\omega$ is in fact an element in $\mathcal{H}^\omega$. A slightly weaker result holds for exponential weights – see [5].

More generally, the Schrödinger operator has been considered for complex-valued potentials and the Zakharov–Shabat system for arbitrary potentials $\varphi = (\varphi_1,\varphi_2)$ in $\mathcal{L}^2$ – see e.g. [16,19,20,22]. It turns out that the characterization of the regularity of potentials in terms of the corresponding spectra can be extended to this more general situation. However, additional spectral information of the operators involved is needed for such a characterization. See [3], and, for improved results, [18] for the Schrödinger operator. Concerning Dirac operators see [2,4]. Most of the results mentioned in this short survey of recent advances in this topic – but not the results obtained in [14] – are discussed in the survey article [5].

We prove Theorem 1.2 and Theorem 1.3 using the new functional analytic approach developed for the Schrödinger equation by Jürgen Pöschel [18]. To make the paper self-contained we include for the convenience of the reader a proof of Theorem 1.1. In Sections 2 and 3 we provide the set-up while the proofs of Theorems 1.1–1.3 and Corollary 1.1 are given in the subsequent sections.

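2. Set-up

For an arbitrary weight $\omega\in\mathcal{W}$, we introduce the weighted Sobolev space of 2-periodic functions
\[ H^\omega_* := \Big\{f\in L^2([0,2];\mathbb{C}),\ f = \sum_{m\in\mathbb{Z}} f_m e_m : \|f\|_\omega<\infty\Big\} \]
where
\[ e_m(x) = \exp(m\pi ix), \quad m\in\mathbb{Z}, \]
and
\[ \|f\|_\omega^2 = \sum_{m\in\mathbb{Z}}\omega_m^2\,|f_m|^2. \]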

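Similarly, we define the spaces $L^2_*$, $\mathcal{H}^\omega_*$ and $\mathcal{L}^2_*$ as the 2-periodic versions of the spaces $L^2$, $\mathcal{H}^\omega$ and $\mathcal{L}^2$. For $(\varphi_1,\varphi_2) = (0,0)$, the periodic eigenvalues are given by $\lambda_k^+ = \lambda_k^- = k\pi$ with $k\in\mathbb{Z}$ and a basis of corresponding eigenfunctions in $\mathcal{L}^2_*$ is given by
\[ \Big\{\begin{pmatrix} 0 \\ e_k \end{pmatrix},\ \begin{pmatrix} e_{-k} \\ 0 \end{pmatrix},\ k\in\mathbb{Z}\Big\}. \]
The latter basis is orthonormal with respect to the inner product on $\mathcal{L}^2_*$
\[ \Big\langle\begin{pmatrix} f_1 \\ f_2 \end{pmatrix},\begin{pmatrix} g_1 \\ g_2 \end{pmatrix}\Big\rangle = \frac{1}{2}\int_0^2\big(f_1(x)\overline{g_1(x)} + f_2(x)\overline{g_2(x)}\big)\,dx. \tag{7} \]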

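More generally, we will denote by $\mathrm{spec}(\varphi)$ the set of periodic eigenvalues (with multiplicities) of the operator $L(\varphi_1,\varphi_2)$, where $\varphi = (\varphi_1,\varphi_2)$. Given $\varphi\in\mathcal{L}^2$ there exists $N\ge 1$ so that for any integer $|n|\ge N$, the set $\mathrm{spec}(\varphi)\cap\{\lambda\in\mathbb{C}: |\lambda-n\pi|<\pi/2\}$ contains exactly one isolated pair of eigenvalues $\{\lambda_n^+,\lambda_n^-\}$, and so that $\mathrm{spec}(\varphi)$ contains $4N-2 = 4(N-1)+2$ remaining eigenvalues which are contained in the open disk of center 0 and radius $(N-1/2)\pi$, see e.g. [6]. We then define for any $n\in\mathbb{Z}$,
\[ \gamma_n(\varphi) = \lambda_n^+(\varphi)-\lambda_n^-(\varphi). \]
We refer to the spacing $\gamma_n(\varphi)$ of the pair of eigenvalues $\{\lambda_n^+(\varphi),\lambda_n^-(\varphi)\}$ as the $n$th gap length of $\varphi$. If $\gamma_n(\varphi) = 0$, we speak of a collapsed gap, otherwise of an open gap. By a slight abuse of notation, we will often write $(f_1,f_2)$ for $\begin{pmatrix} f_1 \\ f_2 \end{pmatrix}$. Consider the splitting
\[ \mathcal{L}^2_* = \mathcal{P}_n\oplus\mathcal{Q}_n \]
where
\[ \mathcal{P}_n = \mathrm{span}\big\{(0,e_n),(e_{-n},0)\big\} \quad\text{and}\quad \mathcal{Q}_n = \overline{\mathrm{span}}\big\{(0,e_k),(e_{-k},0): k\ne n\big\}. \]
Denote by $P_n$, $Q_n$,
\[ P_n:\mathcal{L}^2_*\to\mathcal{P}_n, \quad Q_n:\mathcal{L}^2_*\to\mathcal{Q}_n, \]
the corresponding orthogonal projections. We write the eigenvalue equation $Lf = \lambda f$, where $\lambda\in\mathbb{C}$ and $f = (f_1,f_2)\in\mathcal{L}^2_*$, as
\[ A_\lambda f = \Phi f \tag{8} \]
where
\[ A_\lambda f := \Big(\lambda - i\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\frac{d}{dx}\Big)f \]
and
\[ \Phi f := \begin{pmatrix} 0 & \varphi_1 \\ \varphi_2 & 0 \end{pmatrix}f = (\varphi_1 f_2,\varphi_2 f_1). \tag{9} \]
With $f = u+v = P_n f + Q_n f$, Eq. (8) takes the form
\[ A_\lambda u = P_n\Phi(u+v), \tag{10a} \]
and
\[ A_\lambda v = Q_n\Phi(u+v) \tag{10b} \]
referred to as the P-equation, respectively Q-equation. For any $n\in\mathbb{Z}$, introduce the strip $U_n$ of the complex plane,
\[ U_n = \Big\{\lambda\in\mathbb{C}: |\mathrm{Re}(\lambda)-n\pi|\le\frac{\pi}{2}\Big\}. \]
Note that for any $\lambda\in U_n$, $A_\lambda|_{\mathcal{Q}_n}:\mathcal{Q}_n\to\mathcal{Q}_n$ is an unbounded operator with bounded inverse. Instead of (10a)–(10b), we can consider a slightly different set of equations. Note that in Eq. (10a) the unknown $v$ only appears in the form $w := \Phi v$. By multiplying (10b) by $\Phi A_\lambda^{-1}$ one gets
\[ \Phi v = \Phi A_\lambda^{-1}Q_n\Phi u + \Phi A_\lambda^{-1}Q_n\Phi v \]
or $(\mathrm{Id}-T_n)w = T_n(\Phi u)$, where
\[ T_n\equiv T_n(\lambda) := \Phi A_\lambda^{-1}Q_n:\mathcal{L}^2_*\to\mathcal{L}^2_*. \tag{10c} \]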



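We then obtain instead of (10a)–(10b),
\[ A_\lambda u - P_n\Phi u = P_n w, \tag{10d} \]
\[ (\mathrm{Id}-T_n)w = T_n(\Phi u). \tag{10e} \]
By a slight abuse of notation, we also refer to (10e) as the Q-equation. Given a solution $(u,w)$ of (10d)–(10e), $v$ is then obtained from (10b),
\[ v = A_\lambda^{-1}Q_n(\Phi u+w). \]
In the following, we solve (10e) for $w$, viewing $T_n(\Phi u)$ as an inhomogeneous term. We then substitute the solution $w$ into (10d). This leads to a linear equation for $u$, which we study in Section 3. In particular, we will determine the $\lambda$’s in $U_n$ for which this equation admits a nontrivial solution. For the remainder of this section, we study the Q-equation
\[ (\mathrm{Id}-T_n)w = T_n(\Phi u) \]
where $u$ is viewed as a parameter and is an arbitrary element in $\mathcal{P}_n$. Our goal is to show that $(\mathrm{Id}-T_n):\mathcal{L}^2_*\to\mathcal{L}^2_*$ is invertible for $|n|$ sufficiently large. For $\lambda\in U_n$, the operator $A_\lambda^{-1}$ is bounded on $\mathcal{Q}_n$, and is given by
\[ A_\lambda^{-1}\sum_{k\ne n}\big(\alpha_k(0,e_k)+\beta_k(e_{-k},0)\big) = \sum_{k\ne n}\frac{1}{\lambda-k\pi}\big(\alpha_k(0,e_k)+\beta_k(e_{-k},0)\big) \]
since for any $k\in\mathbb{Z}$ and any $\lambda\in\mathbb{C}$, we have
\[ A_\lambda\begin{pmatrix} 0 \\ e_k \end{pmatrix} = (\lambda-k\pi)\begin{pmatrix} 0 \\ e_k \end{pmatrix} \quad\text{and}\quad A_\lambda\begin{pmatrix} e_{-k} \\ 0 \end{pmatrix} = (\lambda-k\pi)\begin{pmatrix} e_{-k} \\ 0 \end{pmatrix} \]
and for any $k\ne n$
\[ \min_{\lambda\in U_n}|\lambda-k\pi| = \Big(|k-n|-\frac{1}{2}\Big)\pi\ge|k-n|\ge 1. \]
Any $f = (f_1,f_2)$ in $\mathcal{L}^2_*$ admits a Fourier expansion
\[ f = \sum_{k\in\mathbb{Z}}\Big(\hat f_1(-k)\begin{pmatrix} e_{-k} \\ 0 \end{pmatrix}+\hat f_2(k)\begin{pmatrix} 0 \\ e_k \end{pmatrix}\Big) \]
where for $j = 1,2$,
\[ f_j = \sum_{k\in\mathbb{Z}}\hat f_j(k)e_k \quad\text{and}\quad \hat f_j(k) = \frac{1}{2}\int_0^2 f_j(x)\overline{e_k(x)}\,dx. \tag{11} \]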



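As by (9), $\Phi(0,e_k) = (\varphi_1 e_k,0)$ and $\Phi(e_{-k},0) = (0,\varphi_2 e_{-k})$, we then get
\[ T_n f = \Phi A_\lambda^{-1}Q_n f = \sum_{k\ne n}\frac{\hat f_2(k)(\varphi_1 e_k,0)+\hat f_1(-k)(0,\varphi_2 e_{-k})}{\lambda-k\pi}. \]
Hence
\[ T_n f = \begin{pmatrix} 0 & T_n^+ \\ T_n^- & 0 \end{pmatrix}\begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = \big(T_n^+ f_2,\ T_n^- f_1\big) \tag{12a} \]
where $T_n^\pm\equiv T_n^\pm(\lambda)$ are linear operators on $L^2_*$ defined for any $h\in L^2_*$ by
\[ T_n^+ h := \sum_l\Big(\sum_{k\ne n}\frac{\hat\varphi_1(l-k)}{\lambda-k\pi}\hat h(k)\Big)e_l \tag{12b} \]
and
\[ T_n^- h := \sum_l\Big(\sum_{k\ne n}\frac{\hat\varphi_2(l+k)}{\lambda-k\pi}\hat h(-k)\Big)e_l. \tag{12c} \]
For a linear operator $T:\mathcal{L}^2_*\to\mathcal{L}^2_*$, we denote by $\|T\|$ its operator norm.

Lemma 2.1. For any $\varphi = (\varphi_1,\varphi_2)\in\mathcal{L}^2$ and $\lambda\in U_n$ with $n\in\mathbb{Z}$ arbitrary, $T_n(\lambda)$ is a bounded operator on $\mathcal{L}^2_*$. More precisely
\[ \sup_{\lambda\in U_n}\|T_n(\lambda)\|\le 2\|\varphi\|. \]

Proof. Let $n\in\mathbb{Z}$. Using the definition of $T_n^+$ and Young’s inequality, we get for any $\lambda\in U_n$ and $f = (f_1,f_2)\in\mathcal{L}^2_*$
\[ \|T_n^+ f_2\|^2\le\sum_l\Big(\sum_{k\ne n}|\hat\varphi_1(l-k)|\frac{|\hat f_2(k)|}{|k-n|}\Big)^2\le\|\varphi_1\|^2\Big(\sum_{k\ne n}\frac{|\hat f_2(k)|}{|k-n|}\Big)^2. \]
Hence
\[ \|T_n^+ f_2\|^2\le\|\varphi_1\|^2\|f_2\|^2\sum_{k\ne n}\frac{1}{|k-n|^2}\le 4\|\varphi_1\|^2\|f_2\|^2. \]
A similar estimate holds for $T_n^- f_1$ and we conclude that
\[ \|T_n f\|^2\le 4\big(\|\varphi_1\|^2\|f_2\|^2+\|\varphi_2\|^2\|f_1\|^2\big)\le 4\|\varphi\|^2\|f\|^2. \]
□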
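The two numerical facts behind this proof are that $\sum_{k\ne n}|k-n|^{-2} = \pi^2/3\le 4$ and that points of the strip $U_n$ stay at distance at least $|k-n|$ from $k\pi$ for $k\ne n$. The following sketch merely checks both numerically; the function names are illustrative and the cutoffs are arbitrary choices.

```python
from math import pi

def inverse_square_sum(cutoff):
    """Partial sum of 1/|k - n|^2 over k != n with |k - n| <= cutoff (independent of n)."""
    return sum(2.0 / j ** 2 for j in range(1, cutoff + 1))

def strip_distance(j):
    """Distance (j - 1/2) pi from the strip U_n to k pi when |k - n| = j >= 1."""
    return (j - 0.5) * pi

print(inverse_square_sum(10 ** 5), pi ** 2 / 3)   # the partial sums increase to pi^2/3 < 4
print(all(strip_distance(j) >= j for j in range(1, 1000)))
```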



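To prove that for $|n|$ sufficiently large $\mathrm{Id}-T_n(\lambda)$ is invertible for any $\lambda\in U_n$, we will first show that $\|T_n(\lambda)^2\| = o(1)$ as $|n|\to\infty$. Hence $\mathrm{Id}-T_n(\lambda)^2$ is invertible for $|n|$ sufficiently large. In view of the identity
\[ (\mathrm{Id}-T_n)(\mathrm{Id}+T_n)\big(\mathrm{Id}-T_n^2\big)^{-1} = \mathrm{Id} \tag{13} \]
we then conclude that $\mathrm{Id}-T_n(\lambda)$ is invertible as well. First note that by (12a)
\[ T_n^2 f = \begin{pmatrix} T_n^+T_n^- & 0 \\ 0 & T_n^-T_n^+ \end{pmatrix}\begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = \big(T_n^+T_n^- f_1,\ T_n^-T_n^+ f_2\big). \tag{14} \]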
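Identity (13) together with the off-diagonal structure (12a) can be illustrated by a scalar toy model, which is an assumption made purely for illustration: replace $T_n^+$ and $T_n^-$ by complex numbers $a$ and $b$, so that $T = \begin{pmatrix} 0 & a \\ b & 0 \end{pmatrix}$ and $T^2 = ab\,\mathrm{Id}$. The sketch below, with hypothetical helper names, inverts $\mathrm{Id}-T$ through $(\mathrm{Id}+T)(\mathrm{Id}-T^2)^{-1}$ and verifies the result.

```python
def resolvent_from_square(a, b):
    """(Id - T)^{-1} for T = [[0, a], [b, 0]], computed as (Id + T)(Id - T^2)^{-1}.

    Here T^2 = a*b*Id, so Id - T^2 is the scalar (1 - a*b) and must be nonzero.
    """
    s = 1 - a * b
    return [[1 / s, a / s], [b / s, 1 / s]]

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

a, b = 0.3 + 0.1j, -0.2j
id_minus_T = [[1, -a], [-b, 1]]
print(matmul2(id_minus_T, resolvent_from_square(a, b)))  # numerically the identity matrix
```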

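In view of (12b) and (12c), we obtain for any $h\in L^2_*$
\[ T_n^+T_n^- h = \sum_{l\in\mathbb{Z}}\Big(\sum_{k,m\ne n}\frac{\hat\varphi_1(l-k)}{\lambda-k\pi}\frac{\hat\varphi_2(k+m)}{\lambda-m\pi}\hat h(-m)\Big)e_l \tag{15a} \]
and
\[ T_n^-T_n^+ h = \sum_{l\in\mathbb{Z}}\Big(\sum_{k,m\ne n}\frac{\hat\varphi_2(l+k)}{\lambda-k\pi}\frac{\hat\varphi_1(-k-m)}{\lambda-m\pi}\hat h(m)\Big)e_l. \tag{15b} \]
To estimate the norms of $T_n^+T_n^-$ and $T_n^-T_n^+$ it is useful to introduce some more notation. For any given $\varphi\in\mathcal{L}^2_*$, let
\[ \phi = \sum_{k\in\mathbb{Z}}\phi(k)e_k, \quad \phi(k) = \max\big\{|\hat\varphi_1(\pm k)|,|\hat\varphi_2(\pm k)|\big\}. \tag{16} \]
Then, $\phi\in L^2_*$ and
\[ \frac{1}{2}\|\varphi\|\le\|\phi\|\le 2\|\varphi\|. \tag{17} \]
Further, for any $h\in L^2_*$ and $n\ne 0$ we denote by $R_n(h)$ the remainder term
\[ R_n(h) = \sum_{|j|\ge|n|}\hat h(j)e_j. \tag{18} \]
In addition, we introduce the convolution operators $X_n^\pm\equiv X_n^\pm(\phi)$, defined by the same expressions as in the definition of $T_n^\pm(\lambda)$, but with $\hat\varphi_i(k)$ replaced by $\phi(k)$ and $(\lambda-k\pi)$ replaced by $|n-k|$, i.e. for any $h\in L^2_*$,
\[ X_n^+ h := \sum_{l\in\mathbb{Z}}\Big(\sum_{k\ne n}\frac{\phi(l-k)}{|n-k|}\hat h(k)\Big)e_l \tag{19a} \]
and
\[ X_n^- h := \sum_{l\in\mathbb{Z}}\Big(\sum_{k\ne n}\frac{\phi(l+k)}{|n-k|}\hat h(-k)\Big)e_l. \tag{19b} \]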



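Then $X_n^\pm$ are bounded operators on $L^2_*$. Indeed, by a computation as in the proof of Lemma 2.1,
\[ \|X_n^\pm\|\le\|\phi\|\Big(\sum_{k\ne n}\frac{1}{|k-n|^2}\Big)^{1/2}\le 2\|\phi\|. \tag{20} \]
Further
\[ X_n^+X_n^- h = \sum_{l\in\mathbb{Z}}\Big(\sum_{k,m\ne n}\frac{\phi(l-k)\phi(k+m)}{|n-k||n-m|}\hat h(-m)\Big)e_l \]
and
\[ X_n^-X_n^+ h = \sum_{l\in\mathbb{Z}}\Big(\sum_{k,m\ne n}\frac{\phi(l+k)\phi(k+m)}{|n-k||n-m|}\hat h(m)\Big)e_l. \tag{21} \]
It is easy to see that the Hilbert–Schmidt norm $\|T_n^-(\lambda)T_n^+(\lambda)\|_{HS}$ of $T_n^-(\lambda)T_n^+(\lambda)$ is bounded by the Hilbert–Schmidt norm of $X_n^-X_n^+$,
\[ \sup_{\lambda\in U_n}\|T_n^-(\lambda)T_n^+(\lambda)\|_{HS}\le\|X_n^-X_n^+\|_{HS} \]
and, similarly, that
\[ \sup_{\lambda\in U_n}\|T_n^+(\lambda)T_n^-(\lambda)\|_{HS}\le\|X_n^+X_n^-\|_{HS}. \]
Further, one verifies in a straightforward way that
\[ \|X_n^-X_n^+\|_{HS} = \|X_n^+X_n^-\|_{HS} \]
and hence, as
\[ \|T_n(\lambda)^2\|_{HS}^2 = \|T_n^+(\lambda)T_n^-(\lambda)\|_{HS}^2+\|T_n^-(\lambda)T_n^+(\lambda)\|_{HS}^2, \]
it follows that
\[ \sup_{\lambda\in U_n}\|T_n(\lambda)^2\|_{HS}\le 2\|X_n^-X_n^+\|_{HS}. \]

Lemma 2.2. For any $|n|\ge 5$, the operators $X_n^-X_n^+, X_n^+X_n^-: L^2_*\to L^2_*$ are Hilbert–Schmidt operators and
\[ \|X_n^-X_n^+\|_{HS}\le 7\|\phi\|\Big(\frac{\|\phi\|}{|n|^{1/2}}+\|R_n(\phi)\|\Big). \tag{22} \]
As a consequence,
\[ \sup_{\lambda\in U_n}\|T_n(\lambda)^2\|_{HS}\le 14\|\phi\|\Big(\frac{\|\phi\|}{|n|^{1/2}}+\|R_n(\phi)\|\Big). \tag{23} \]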



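Proof. By the definition of the Hilbert–Schmidt norm,
\[ \|X_n^-X_n^+\|_{HS}^2 = \sum_{j\in\mathbb{Z}}\|X_n^-X_n^+e_j\|^2. \]
Hence by (21)
\[ \|X_n^-X_n^+\|_{HS}^2 = \sum_{j\ne n}\sum_l\Big|\sum_{k\ne n}\frac{\phi(l+k)\phi(k+j)}{|n-k||n-j|}\Big|^2. \]
The latter sum is split up into $\Sigma_{n,1}+\Sigma_{n,2}+\Sigma_{n,3}$ and the three sums $\Sigma_{n,i}$, $1\le i\le 3$, are estimated separately. First let us consider $\Sigma_{n,1}$ given by
\[ \Sigma_{n,1} := \sum_{|j-n|>\frac{|n|}{2}}\sum_{l\in\mathbb{Z}}\Big(\sum_{k\ne n}\frac{\phi(l+k)\phi(k+j)}{|n-k||n-j|}\Big)^2. \]
By the Cauchy–Schwarz inequality,
\[ \Sigma_{n,1}\le\sum_{|j-n|>\frac{|n|}{2}}\frac{4}{|n-j|^2}\sum_l\sum_{k\ne n}\phi(l+k)^2\phi(k+j)^2\le 4\|\phi\|^4\sum_{|j-n|>\frac{|n|}{2}}\frac{1}{|n-j|^2}. \]
Note that for any $|n|\ge 5$,
\[ \sum_{|l|>\frac{|n|}{2}}\frac{1}{l^2}\le 2\sum_{l>\frac{|n|}{2}}\frac{1}{l(l-1)}\le\frac{2}{\frac{|n|-1}{2}}\le\frac{5}{|n|} \]
and therefore
\[ \Sigma_{n,1}\le\frac{20}{|n|}\|\phi\|^4. \]
Next we look at the sum
\[ \Sigma_{n,2} := \sum_{j\ne n}\sum_{l\in\mathbb{Z}}\Big(\sum_{|k-n|>\frac{|n|}{2}}\frac{\phi(l+k)\phi(k+j)}{|n-k||n-j|}\Big)^2 \]
which can be estimated in a similar way as $\Sigma_{n,1}$,
\[ \Sigma_{n,2}\le\frac{20}{|n|}\|\phi\|^4. \]
Finally, $\Sigma_{n,3}$ is given by
\[ \Sigma_{n,3} := \sum_{1\le|j-n|\le\frac{|n|}{2}}\frac{1}{|n-j|^2}\sum_{l\in\mathbb{Z}}\Big(\sum_{1\le|k-n|\le\frac{|n|}{2}}\frac{\phi(l+k)\phi(k+j)}{|n-k|}\Big)^2. \]
By Cauchy’s inequality
\[ \Sigma_{n,3}\le 4\sum_{1\le|j-n|\le\frac{|n|}{2}}\frac{1}{|n-j|^2}\sum_l\sum_{1\le|k-n|\le\frac{|n|}{2}}\phi(l+k)^2\phi(k+j)^2. \]
Using that for $1\le|j-n|\le|n|/2$ and $1\le|k-n|\le|n|/2$
\[ |k+j|\ge 2|n|-|k-n|-|j-n|\ge|n| \]
one then gets
\[ \Sigma_{n,3}\le 16\|\phi\|^2\|R_n(\phi)\|^2. \]
Combining the estimates for $\Sigma_{n,1}$, $\Sigma_{n,2}$, and $\Sigma_{n,3}$ leads to inequality (22). □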
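The elementary tail estimate used for $\Sigma_{n,1}$ and $\Sigma_{n,2}$, namely $\sum_{|l|>|n|/2} l^{-2}\le\frac{4}{|n|-1}\le\frac{5}{|n|}$ for $|n|\ge 5$, can be confirmed numerically. The sketch below is an illustration with a hypothetical function name; it evaluates the infinite tail as $\pi^2/3$ minus a finite head.

```python
from math import pi

def tail_sum(n):
    """Sum of 1/l^2 over integers l with |l| > |n|/2, as pi^2/3 minus the finite head."""
    head = sum(1.0 / l ** 2 for l in range(1, abs(n) // 2 + 1))
    return pi ** 2 / 3 - 2 * head

for n in (5, 10, 100):
    # tail, intermediate bound 4/(|n| - 1), final bound 5/|n|
    print(n, tail_sum(n), 4 / (abs(n) - 1), 5 / abs(n))
```

Together with the bound $16\|\phi\|^2\|R_n(\phi)\|^2$ for $\Sigma_{n,3}$, this gives $\|X_n^-X_n^+\|_{HS}^2\le\frac{40}{|n|}\|\phi\|^4+16\|\phi\|^2\|R_n(\phi)\|^2$, and the constant 7 in (22) follows since $\sqrt{40}<7$ and $4<7$.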

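According to Lemma 2.2, for any given $\varphi\in\mathcal{L}^2$, there exists $N_0\equiv N_0(\varphi)$ in $\mathbb{N}$ such that for any $|n|\ge N_0$ and any $\lambda\in U_n$, the operator $T_n(\lambda)^2$ is a contraction. More precisely, we choose $N_0\ge 5$ such that, for any $|n|\ge N_0$ and $\sigma\in\{\pm\}$
\[ \|X_n^{-\sigma}X_n^{\sigma}\|_{HS}\le\frac{1}{4} \]
and hence
\[ \sup_{\lambda\in U_n}\|T_n(\lambda)^2\|_{HS}\le\frac{1}{2}. \tag{24} \]
As a consequence, for any $|n|\ge N_0$ and any $\lambda\in U_n$, the Q-equation (10e) can be solved,
\[ w = \big(\mathrm{Id}-T_n(\lambda)\big)^{-1}T_n(\lambda)(\Phi u). \tag{25} \]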

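3. Lyapunov–Schmidt reduction

Substituting the solution (25) into (10d) one gets
\[ A_\lambda u = P_n\Phi u+P_n(\mathrm{Id}-T_n)^{-1}T_n(\Phi u), \]
or, with $(\mathrm{Id}-T_n)^{-1} = \mathrm{Id}+(\mathrm{Id}-T_n)^{-1}T_n$,
\[ A_\lambda u = P_n(\mathrm{Id}-T_n)^{-1}(\Phi u). \]
We write the latter equation as
\[ S_n(\lambda)u = 0 \quad\text{where}\quad S_n(\lambda) := \Big(A_\lambda-P_n\big(\mathrm{Id}-T_n(\lambda)\big)^{-1}\Phi\Big)\Big|_{\mathcal{P}_n}. \tag{26a} \]
Thus, any $\lambda$ in $U_n$ is a periodic or antiperiodic eigenvalue of $L(\varphi)$ if and only if it is a root of $\det S_n(\lambda)$. Recall that $u$ is an element of $\mathcal{P}_n$, a 2-dimensional subspace with $\mathcal{L}^2_*$-orthonormal basis $\{(0,e_n),(e_{-n},0)\}$. Any linear operator $T$ from $\mathcal{P}_n$ to $\mathcal{P}_n$ can be represented with respect to the basis $\{(0,e_n),(e_{-n},0)\}$ as a matrix,
\[ \begin{pmatrix} \langle T(0,e_n),(0,e_n)\rangle & \langle T(e_{-n},0),(0,e_n)\rangle \\ \langle T(0,e_n),(e_{-n},0)\rangle & \langle T(e_{-n},0),(e_{-n},0)\rangle \end{pmatrix}. \]
As
\[ A_\lambda\big|_{\mathcal{P}_n} = (\lambda-n\pi)\cdot\mathrm{Id}_{\mathcal{P}_n} \]
the matrix representation of $S_n(\lambda)$
\[ \begin{pmatrix} S_{n,11} & S_{n,12} \\ S_{n,21} & S_{n,22} \end{pmatrix} \]
has entries
\[ S_{n,11}\equiv S_{n,11}(\lambda) := (\lambda-n\pi)-\big\langle\big(\mathrm{Id}-T_n(\lambda)\big)^{-1}\Phi(0,e_n),(0,e_n)\big\rangle, \tag{26b} \]
\[ S_{n,22}\equiv S_{n,22}(\lambda) := (\lambda-n\pi)-\big\langle\big(\mathrm{Id}-T_n(\lambda)\big)^{-1}\Phi(e_{-n},0),(e_{-n},0)\big\rangle, \tag{26c} \]
and
\[ S_{n,12}\equiv S_{n,12}(\lambda) := -\big\langle\big(\mathrm{Id}-T_n(\lambda)\big)^{-1}\Phi(e_{-n},0),(0,e_n)\big\rangle, \tag{26d} \]
\[ S_{n,21}\equiv S_{n,21}(\lambda) := -\big\langle\big(\mathrm{Id}-T_n(\lambda)\big)^{-1}\Phi(0,e_n),(e_{-n},0)\big\rangle. \tag{26e} \]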

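Let us investigate the matrix $S_n(\lambda)$.

Lemma 3.1. Let $\varphi\in\mathcal{L}^2$ and $|n|\ge N_0$ with $N_0$ given as in (24). Then for any $\lambda\in U_n$
(i) $S_{n,11}(\lambda) = S_{n,22}(\lambda)$.
(ii) If $\varphi_2 = \pm\overline{\varphi_1}$, then $S_{n,12}(\lambda) = \pm\overline{S_{n,21}(\overline{\lambda})}$.
(iii) If $\varphi_2 = \pm\overline{\varphi_1}$, then $S_{n,11}(\lambda) = \overline{S_{n,11}(\overline{\lambda})}$.

Proof. Statement (i) has been proved in [6]. For the convenience of the reader we include its proof here. In view of the matrix representation (12a) of $T_n(\lambda)$ and the identities (13) and (14), we have
\[ S_{n,11}-(\lambda-n\pi) = -\big\langle T_n^-\big(\mathrm{Id}-T_n^+T_n^-\big)^{-1}\varphi_1 e_n,e_n\big\rangle \tag{27} \]
and
\[ S_{n,22}-(\lambda-n\pi) = -\big\langle T_n^+\big(\mathrm{Id}-T_n^-T_n^+\big)^{-1}\varphi_2 e_{-n},e_{-n}\big\rangle. \]
Writing $(\mathrm{Id}-T_n^-T_n^+)^{-1}$ as a Neumann series, the latter formula takes the form
\[ S_{n,22}-(\lambda-n\pi) = -\sum_{k\ge 0}\big\langle T_n^+\big(T_n^-T_n^+\big)^k(\varphi_2 e_{-n}),e_{-n}\big\rangle \tag{28} \]
where by (15b),
\[ \big(T_n^-T_n^+\big)^k f = \sum_{l\in\mathbb{Z}}\beta_{k,l}^+(f)e_l \]
with
\[ \beta_{k,l}^+(f) = \sum_{l_i\ne n}\frac{\hat\varphi_2(l+l_{2k})\hat\varphi_1(-l_{2k}-l_{2k-1})\cdots\hat\varphi_2(l_3+l_2)\hat\varphi_1(-l_2-l_1)}{(\lambda-l_{2k}\pi)(\lambda-l_{2k-1}\pi)\cdots(\lambda-l_2\pi)(\lambda-l_1\pi)}\hat f(l_1). \tag{29} \]
Thus, according to (12b), we can write
\[ S_{n,22}-(\lambda-n\pi) = -\sum_{k\ge 0}\beta_k^+ \]
where
\[ \beta_k^+ = \big\langle T_n^+\big(T_n^-T_n^+\big)^k(\varphi_2 e_{-n}),e_{-n}\big\rangle = \sum_{l\ne n}\frac{\hat\varphi_1(-n-l)}{\lambda-l\pi}\beta_{k,l}^+(\varphi_2 e_{-n}). \tag{30} \]
Similarly, using (15a) we have
\[ S_{n,11}-(\lambda-n\pi) = -\sum_{k\ge 0}\beta_k^- \]
where
\[ \beta_k^- = \big\langle T_n^-\big(T_n^+T_n^-\big)^k(\varphi_1 e_n),e_n\big\rangle = \sum_{l\ne n}\frac{\hat\varphi_2(n+l)}{\lambda-l\pi}\beta_{k,l}^-(\varphi_1 e_n) \tag{31} \]
and
\[ \beta_{k,l}^-(f) = \sum_{l_i\ne n}\frac{\hat\varphi_1(-l-l_{2k})\hat\varphi_2(l_{2k}+l_{2k-1})\cdots\hat\varphi_1(-l_3-l_2)\hat\varphi_2(l_2+l_1)}{(\lambda-l_{2k}\pi)(\lambda-l_{2k-1}\pi)\cdots(\lambda-l_2\pi)(\lambda-l_1\pi)}\hat f(-l_1). \]
Comparing (30) and (31) one sees that
\[ \beta_k^- = \beta_k^+ \quad \forall k\ge 0, \]
and therefore, $S_{n,11} = S_{n,22}$. This proves statement (i). Next we prove item (ii). Again, making use of the matrix representation of $T_n(\lambda)$, we obtain
\[ S_{n,12}(\lambda) = -\big\langle\big(\mathrm{Id}-T_n^-T_n^+\big)^{-1}\varphi_2 e_{-n},e_n\big\rangle, \tag{32a} \]
\[ S_{n,21}(\lambda) = -\big\langle\big(\mathrm{Id}-T_n^+T_n^-\big)^{-1}\varphi_1 e_n,e_{-n}\big\rangle. \tag{32b} \]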



If ϕ2 = ±ϕ1 then −1 Sn,12 (λ) = ∓ Id − Tn− (λ)Tn+ (λ) ϕ1 en , en . ˆ Note that for arbitrary h ∈ L2∗ one has h(k) = h(−k), k ∈ Z. Hence, by (15a) and (15b), Tn− (λ)Tn+ (λ)h =

ϕ2 (l + k) ϕ1 (−k − m) h(m) el λ − kπ λ − mπ l k,m =n

ϕ2 (−l − k) ϕ1 (k + m) = c.c. h(−m) e−l λ − kπ λ − mπ l

k,m =n

l

k,m =n

ϕ1 (−l − k) ϕ2 (k + m) = c.c. h(−m) e−l . λ − kπ λ − mπ Hence Tn− (λ)Tn+ (λ)h = Tn+ (λ)Tn− (λ)h.

(33)

Here c.c.(z) denotes the complex conjugate of the complex number z. Thus −1 −1 Id − Tn− (λ)Tn+ (λ) ϕ1 en = Id − Tn+ (λ)Tn− (λ) ϕ1 en and it follows that −1 Sn,12 (λ) = ∓ Id − Tn+ (λ)Tn− (λ) ϕ1 en , en = ±Sn,21 (λ) as claimed. (We note that the case ϕ2 = ϕ1 has been treated in [4].) It remains to prove item (iii). In view of the definition of Tn+ and the fact that for arbitrary ˆ h(k) = h(−k), k ∈ Z, it follows that for any ϕ = (ϕ1 , ϕ2 ) ∈ L2 with ϕ2 = ±ϕ1 h ∈ L2∗ one has Tn+ (λ)h =

ϕ1 (l − k) l

=

λ − kπ

k =n

h(k) el

±ϕ2 (l − k)

h(−k) e−l

λ − kπ

ϕ2 (k − l) = ± c.c. h(−k) e−l λ − kπ l

k =n

l

k =n

= ±Tn− (λ)h. Combined with (33) we then get for ϕ = (ϕ1 , ϕ2 ) ∈ L2 with ϕ2 = ±ϕ1



−1 Sn,22 (λ) − (λ − nπ) = − Tn+ (λ) Id − Tn− (λ)Tn+ (λ) (±ϕ1 en ), e−n −1 = − c.c. Tn− (λ) Id − Tn+ (λ)Tn− (λ) ϕ1 en , en = Sn,11 (λ) − (λ − nπ). In view of item (i) it then follows that Sn,11 (λ) = Sn,11 (λ).

2
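The Neumann-series step used in the proof above can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the 2×2 matrix `K` is a hypothetical finite-dimensional stand-in for $T_n^-T_n^+$, chosen so that its Hilbert–Schmidt norm is below 1/2; it is not derived from the operators of the paper. The sketch compares a partial sum of $\sum_{k \ge 0} K^k$ with the exact inverse of $\mathrm{Id} - K$ and checks the resolvent identity $(\mathrm{Id} - K)^{-1} = \mathrm{Id} + K(\mathrm{Id} - K)^{-1}$.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def hs_norm(A):
    """Hilbert-Schmidt (Frobenius) norm."""
    return sum(x * x for row in A for x in row) ** 0.5

def max_diff(A, B):
    """Largest entrywise absolute difference."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def neumann_partial_sum(K, terms):
    """Return Id + K + K^2 + ... + K^(terms-1)."""
    total = identity(len(K))
    power = identity(len(K))
    for _ in range(terms - 1):
        power = matmul(power, K)
        total = matadd(total, power)
    return total

def inverse_2x2(A):
    """Exact inverse of a 2 x 2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

# Hypothetical stand-in for T_n^- T_n^+ (illustration only).
K = [[0.2, 0.1], [-0.1, 0.3]]

exact = inverse_2x2([[1.0 - K[0][0], -K[0][1]],
                     [-K[1][0], 1.0 - K[1][1]]])
approx = neumann_partial_sum(K, 60)
neumann_error = max_diff(approx, exact)

# Resolvent identity: (Id - K)^{-1} = Id + K (Id - K)^{-1}.
identity_36_error = max_diff(matadd(identity(2), matmul(K, exact)), exact)
```

Because the norm of `K` is below 1/2, the tail of the series after 60 terms is far below floating-point precision, so `neumann_error` is at rounding level.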

Given $λ ∈ U_n$, $ϕ ∈ L^2$, and $|n| \ge N_0$ as in (24), introduce
\[
a_n(λ) = (λ - nπ) - S_{n,11}(λ). \tag{34}
\]
By Lemma 3.1, $a_n \equiv a_n(λ) = (λ - nπ) - S_{n,22}(λ)$ and by (27)
\[
a_n = \langle T_n^-(\mathrm{Id} - T_n^+T_n^-)^{-1}(ϕ_1 e_n), e_n\rangle. \tag{35}
\]
Further let
\[
b_n \equiv b_n(λ) = -S_{n,12}(λ) \quad\text{and}\quad c_n \equiv c_n(λ) = -S_{n,21}(λ).
\]
Writing
\[
(\mathrm{Id} - T_n^-T_n^+)^{-1} = \mathrm{Id} + T_n^-T_n^+(\mathrm{Id} - T_n^-T_n^+)^{-1} \tag{36}
\]
and using (32a) one sees that
\[
b_n = \hat ϕ_2(2n) + \langle T_n^-T_n^+(\mathrm{Id} - T_n^-T_n^+)^{-1}(ϕ_2 e_{-n}), e_n\rangle. \tag{37}
\]
Similarly one gets from (32b) the identity
\[
c_n = \hat ϕ_1(-2n) + \langle T_n^+T_n^-(\mathrm{Id} - T_n^+T_n^-)^{-1}(ϕ_1 e_n), e_{-n}\rangle. \tag{38}
\]
Note that by the definitions (12b)–(12c) of $T_n^{\pm}(λ)$, the functions $λ \mapsto a_n(λ)$, $λ \mapsto b_n(λ)$, and $λ \mapsto c_n(λ)$ are analytic on $U_n$.

4. Proof of Theorem 1.1

In this section we prove Theorem 1.1. We note that in Proposition 4.1 we derive a slightly stronger auxiliary result than needed for the proof of Theorem 1.1. It will be used later. In this section we follow work first developed in [6,8] and then refined in [5]. Note however that the proof of Lemma 36 in [5] has a gap on p. 710, as Lemma 32 in [5] cannot be applied to the expression $Σ_4(n)$ given by (2.117) of [5]. However, it turns out that the method developed in [4] can be applied. To obtain the estimate for the sequence $(γ_n(ϕ))_{n∈\mathbb{Z}}$ claimed in Theorem 1.1, we first need to establish bounds for the coefficients $a_n(λ)$, $b_n(λ)$, and $c_n(λ)$.



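Lemma 4.1. Let $ϕ = (ϕ_1, ϕ_2) ∈ L^2$ and let $N_0$ be given by (24). Then for any $n ∈ \mathbb{Z}$ with $|n| \ge N_0$,
\[
\sup_{λ ∈ U_n} |a_n(λ)| \le 2\|φ\| \Big( \frac{\|φ\|}{|n|} + R_n(φ) \Big), \tag{39a}
\]
\[
\sup_{λ ∈ U_n} |b_n(λ) - \hat ϕ_2(2n)| \le 4\|φ\|^2 \Big( \frac{\|φ\|}{|n|} + R_n(φ) \Big), \tag{39b}
\]
\[
\sup_{λ ∈ U_n} |c_n(λ) - \hat ϕ_1(-2n)| \le 4\|φ\|^2 \Big( \frac{\|φ\|}{|n|} + R_n(φ) \Big) \tag{39c}
\]
where $\|φ\|$ and $R_n(φ)$ are given by (16) and (18), respectively.

Remark 4.1. With Proposition 4.1 and Corollary 4.1 we will improve the estimates (39b)–(39c).

Proof. Substituting
\[
(\mathrm{Id} - T_n^-(λ)T_n^+(λ))^{-1} = \mathrm{Id} + (\mathrm{Id} - T_n^-(λ)T_n^+(λ))^{-1} T_n^-(λ)T_n^+(λ)
\]
in (35) yields $a_n(λ) = Σ_1 + Σ_2$ where
\[
Σ_1 := \langle T_n^-(ϕ_1 e_n), e_n\rangle \quad\text{and}\quad Σ_2 := \langle T_n^- H(ϕ_1 e_n), e_n\rangle \tag{40}
\]
and where
\[
H \equiv H(λ; n) := (\mathrm{Id} - T_n^-(λ)T_n^+(λ))^{-1} T_n^-(λ)T_n^+(λ)
\]
is a Hilbert–Schmidt operator on $L^2_*$. By (24), for $|n| \ge N_0$,
\[
\sup_{λ ∈ U_n} \|T_n^+(λ)T_n^-(λ)\|_{HS} \le 1/2
\]
and thus
\[
\sup_{λ ∈ U_n} \|H(λ; n)\|_{HS} \le 1. \tag{41}
\]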
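The terms $Σ_1$ and $Σ_2$ in (40) are estimated separately. According to (12c),
\[
Σ_1 = \sum_{k \ne n} \frac{\hat ϕ_2(k + n)\,\hat ϕ_1(-k - n)}{λ - kπ}
\]
and by Cauchy's inequality we have, for any $λ ∈ U_n$,
\[
|Σ_1| \le \|φ\| \Big( \sum_{k \ne n} \frac{|φ(n + k)|^2}{|n - k|^2} \Big)^{1/2}. \tag{42}
\]
To estimate $Σ_2$, we use the matrix representation $(H_{i,j})_{i,j ∈ \mathbb{Z}}$ of $H$ and write
\[
Σ_2 = \sum_{k \ne n} \sum_{j} H_{-k,-j}\, \frac{\hat ϕ_2(n + k)\,\hat ϕ_1(-j - n)}{λ - kπ}.
\]
By the Cauchy–Schwarz inequality and the estimate (41),
\[
|Σ_2| \le \|H\|_{HS} \Big( \sum_{k \ne n} \sum_{j} \frac{|φ(n + k)|^2\, |φ(n + j)|^2}{|n - k|^2} \Big)^{1/2} \le \|φ\| \Big( \sum_{k \ne n} \frac{|φ(n + k)|^2}{|n - k|^2} \Big)^{1/2}. \tag{43}
\]
Combining the estimates (42)–(43) for $Σ_1$ and $Σ_2$ leads to
\[
\sup_{λ ∈ U_n} |a_n(λ)| \le 2\|φ\| \Big( \sum_{k \ne n} \frac{|φ(n + k)|^2}{|n - k|^2} \Big)^{1/2}.
\]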

For n N0 , one has φ(k + n)2 k =n

|n − k|2

φ(k + n)2 +

1 φ(k + n)2 n2 k . λ∈∂Dn 3 λ∈∂Dn 48 As det Sn (λ) and (λ − nπ − an (λ))2 are both analytic on Un , by Rouché’s theorem, they have the same number of zeros in Dn when counted with their multiplicities. By the same argument, one shows that (λ − nπ − an (λ))2 and (λ − nπ)2 have the same numbers of zeros in Dn counted with their multiplicities. Note that λ = nπ is a double root of (λ − nπ)2 in Dn , hence det Sn (λ) has two roots in Dn , which we denote by ξ+ and ξ− . Next, we want to estimate the difference ξ+ − ξ− . By Cauchy’s estimate, |∂λ an |Dn

π/48 1 |an |Un = . dist(Dn , ∂Un ) π/6 8

We have

ξ+ − nπ − an (ξ+ ) = ± bn (ξ+ )cn (ξ+ ), ξ− − nπ − an (ξ− ) = ± bn (ξ− )cn (ξ− ),

and thus |ξ+ − ξ− | an (ξ+ ) − an (ξ− ) + bn (ξ+ )cn (ξ+ ) + bn (ξ− )cn (ξ− ) 1/2 |ξ+ − ξ− | sup ∂λ an (λ) + 2 sup bn (λ)cn (λ) . λ∈Dn

λ∈Un

Hence, by (45), 1/2 7 |ξ+ − ξ− | 2 sup bn (λ)cn (λ) 8 λ∈Un and therefore 1/2 |ξ+ − ξ− | 3 sup bn (λ)cn (λ) . λ∈Un

2

(45)



We now improve the estimates for bn and cn of Lemma 4.1. To this end, we need to introduce additional notations. Given n with |n| N0 we write (Id − Tn− Tn+ )−1 as a Neumann series. Then formula (37) for bn (λ, ϕ) takes the form bn (λ, ϕ) − ϕ2 (2n) =

k Tn− Tn+ (ϕ2 e−n ), en . k1

By (29) we know that for any k 1, and |n| N0 , − + k Tn Tn (ϕ2 e−n ), en ϕ2 (n + l2k )ϕ1 (−l2k − l2k−1 ) · · · ϕ2 (l3 + l2 )ϕ1 (−l2 − l1 ) ϕ2 (l1 + n). = (λ − l2k π)(λ − l2k−1 π) · · · (λ − l2 π)(λ − l1 π) li =n

Recall that we have introduce the convolution operators Xn± (19a)–(19b) for which one gets the identity − + k Xn Xn (φe−n ), en φ(n + l2k )φ(l2k + l2k−1 ) · · · φ(l3 + l2 )φ(l2 + l1 ) = φ(l1 + n). |n − l2k ||n − l2k−1 | · · · |n − l2 ||n − l1 |

(46)

li =n

Hence, for any k 1 and n ∈ Z, with |n| N0 k k sup Tn− Tn+ (ϕ2 e−n ), en Xn− Xn+ (φe−n ), en

λ∈Un

and thus sup bn (λ, ϕ) − ϕ2 (2n) Bn (φ)

(47)

λ∈Un

where Bn (φ) =

k Xn− Xn+ (φe−n ), en . k1

Similarly, by formula (38) for cn (λ, ϕ), cn (λ, ϕ) − ϕ1 (−2n) =

k Tn+ Tn− (ϕ1 en ), e−n k1

and by the identity (31), for any k 1 and n ∈ Z, + − k Tn Tn (ϕ1 en ), e−n ϕ1 (−n − l2k )ϕ2 (l2k + l2k−1 ) · · · ϕ1 (−l3 − l2 )ϕ2 (l2 + l1 ) = ϕ1 (−l1 − n). (λ − l2k π)(λ − l2k−1 π) · · · (λ − l2 π)(λ − l1 π) li =n

(48)



Inspecting (46) one concludes that + − k k Xn Xn (φen ), e−n = Xn− Xn+ (φe−n ), en and therefore sup cn (λ, ϕ) − ϕ1 (−2n) Bn (φ).

(49)

λ∈Un

Next we need to introduce a special class of weights. We say that the weight $ω = (ω(n))_{n ∈ \mathbb{Z}}$ in $\mathcal{W}$ is slowly growing if there exists a constant $C_ω \ge 1$ so that
\[
ω(2n) \le C_ω\, ω(n) \quad \forall n ∈ \mathbb{Z}, \tag{50a}
\]
and
\[
1 \le ω(0) \le ω(1) = ω(-1) \le ω(2) = ω(-2) \le \cdots. \tag{50b}
\]
Note that $ω$ might not be in $\mathcal{M}$, i.e. $ω$ might not be sub-multiplicative. However, $C_ω ω$ is sub-multiplicative, hence in $\mathcal{M}$, as
\[
ω(n + m) \le C_ω \max\big(ω(n), ω(m)\big) \le C_ω\, ω(n)\, ω(m)
\]
where we used that $ω(0) \ge 1$, and $ω(0) \le ω(n) = ω(|n|) \le ω(|n| + 1)$ for any $|n| \ge 1$.

Proposition 4.1. Let $ω$ be a slowly growing weight. Then for any $ϕ ∈ H^ω$ there exists $N_2 \equiv N_2(φ, ω) \ge N_1(φ)$, with $N_1(φ)$ given by Lemma 4.2, so that for any $N \ge N_2$,

1/2 2

2

ω(2n) Bn (φ)

|n|N

2 √ φ2ω 2 RN (φ)ω 4 2Cω + 1/2 . ω(N )2 N

(51)

The constant N2 can be chosen locally uniformly with respect to ϕ. Proof. By the definition (48) of Bn (φ), ∞ − + k Xn Xn (φe−n ), en Bn (φ) |n|N =

.

|n|N

k=1

Hence by the triangle inequality in 2 , ∞ ω(2n)Bn (φ) ω(2n) X − X + k (φe−n ), en n n |n|N |n|N k=1

or |n|N

1/2 2

2

ω(2n) Bn (φ)

∞ k=1

|n|N

k 2 ω(2n) Xn− Xn+ (φe−n ), en 2

1/2 .

(52)



To estimate the right-hand side of the latter inequality note that by (46),

k 2 ω(2n)2 Xn− Xn+ (φe−n ), en

|n|N

=

|n|N

2 φ(n + l2k )φ(l2k + l2k−1 ) · · · φ(l2 + l1 ) φ(l1 + n) ω(2n) |n − l2k | · · · |n − l1 | 2

(53)

l∈I2k,n

where I2k,n := l = (l1 , . . . , l2k ) li ∈ Z \ {n} ∀ 1 i 2k . To estimate to right-hand side of (53) we want to distinguish between the cases |li − n| > |n|/2 and |li − n| |n|/2. Hence we decompose the index set I2k,n I2k,n =

σ I2k,n

σ

where σ = (σ1 , . . . , σ2k )

with σi ∈ {0, 1}

and

|n| |n| σ I2k,n if σi = 0; |li − n| > if σi = 1 . = l ∈ I2k,n |li − n| 2 2 Again by the triangle inequality in 2 , one gets

ω(2n)2

|n|N

2 1/2 φ(n + l2k )φ(l2k + l2k−1 ) · · · φ(l2 + l1 ) φ(l1 + n) |n − l2k | · · · |n − l1 | σ σ l∈I2k,n

σ ∈Z2k 2

|n|N

ω(2n)2

2 1/2 φ(n + l2k )φ(l2k + l2k−1 ) · · · φ(l2 + l1 ) φ(l1 + n) . |n − l2k | · · · |n − l1 | σ l∈I2k,n

For any given σ = (σ1 , . . . , σ2k ) ∈ Z2k 2 we get by the Cauchy inequality

2 φ(n + l2k )φ(l2k + l2k−1 ) · · · φ(l2 + l1 ) σ φ(l1 + n) Aσ2k,n · B2k,n (φ) |n − l | · · · |n − l | 2k 1 σ l∈I2k,n

where Aσ2k,n =

σ l∈I2k,n

1 |n − l2k | · · · |n − l1 |

2



and

σ (φ) := B2k,n

φ(n + l2k )2 φ(l2k + l2k−1 )2 · · · φ(l2 + l1 )2 φ(l1 + n)2 .

(54)

σ l∈I2k,n

Our aim is to estimate for any given σ ∈ Z2k 2

σ ω(2n)2 Aσ2k,n · B2k,n (φ).

(55)

|n|N σ (φ) are estimated separately. Let us first consider Aσ . With |σ | = The terms Aσ2k,n and B2k,n 2k,n 2k σ , i=1 i

Aσ2k,n

0 |n| 2

1 |n − j |2

|σ |

4

2k−|σ |

5 |n|

|σ |

or Aσ2k,n 52k |n|−|σ | .

(56)

σ . Note that if σ = 0, then Next we consider B2k,n 1

|n| > |n| |l1 + n| = 2n − (n − l1 ) 2|n| − 2

(57a)

and similarly, if σ2k = 0, then |l2k + n| > |n|.

(57b)

If 1 i 2k − 1 and σi = 0 as well as σi+1 = 0 then |n| |n|. |li + li+1 | = 2n − (n − li ) − (n − li+1 ) 2|n| − 2 2

(57c)

Denote by τ (σ ) the number of arguments n + l2k , l2k + l2k−1 , . . . , l2 + l1 , l1 + n of the φ’s in σ (φ) with range contained in {j ∈ Z||j | |n|}. Then according to the expression (54) for B2k,n (57a)–(57c) τ (σ ) (1 − σ1 ) + (1 − σ2k ) +

2k−1

(1 − σi )(1 − σi+1 ).

i=1

Clearly, one has τ (σ ) 2k + 1 − 2|σ | +

2k−1 i=1

σi σi+1 2k + 1 − 2|σ |.

(58)



Use that Cω ω is a sub-multiplicative weight to see that ω(2n) (Cω )2k ω(n + l1 )ω(l1 + l2 ) · · · ω(l2k−1 + l2k )ω(l2k + n). Hence for σ ∈ Z2k 2 with τ (σ ) = 0 and φω (j ) := ω(j )φ(j ) we get in view of (56)

σ ω(2n)2 Aσ2k,n · B2k,n (φ)

|n|N

2k 5Cω2 |n|N

1 φω (n + l2k )2 · · · φω (l1 + n)2 |n||σ | σ l∈I2k,n

2k 1 2k+1 φ2ω 5Cω2 |σ | N

2 2k φ2ω 2k+1 5Cω √ N where for the latter inequality we used that in the case τ (σ ) = 0, (58) implies 2|σ | 2k + 1. In the case τ (σ ) 1 we use (50a) and (50b) to get


|n|N

52k ω(2n)2 φ(n + l2k )2 · · · φ(l1 + n)2 |σ | N σ |n|N

l∈I2k,n

2 2 τ (σ )−1 2k+1−τ (σ ) 52k 2 φ2 Cω RN (φ)ω RN (φ) |σ | N

2 RN (φ)2ω τ (σ )−1 2k+1−τ (σ ) 52k φ2 |σ | Cω2 RN (φ)ω . 2 N ω(N)

As φ φω and 2|σ | 2k + 1 − τ (σ ) by (58) we see that for σ with τ (σ ) 1


|n|N

2 RN (φ)2ω τ (σ )−1 φ2ω 2k+1−τ (σ ) 52k Cω2 RN (φ)ω . √ ω(N )2 N Combining the two cases we conclude that for any σ ∈ Z2k 2 ,

(59)



1/2 2

ω(2n)

σ · B2k,n (φ)

Aσ2k,n

|n|N

2 φ2 1/2 2 k RN (φ)2ω φ2ω k 5Cω Cω RN (φ)ω + √ ω + . √ ω(N )2 N N

(60)

Choose N2 ≡ N2 (φ) N1 so large that for any N N2 Cω2

2 ω RN (φ)2 + φ √ ω N

1 . 60

(61)

2k As (Z2k 2 ) = 2 , we then get for any N N2 , by combining (52), (53), (55) and (60)

1/2 ω(2n)2 Bn (φ)2

|n|N

which is the claimed estimate.

∞ 1 2k 2 k RN (φ)2ω φ2ω k √ 2 5Cω + √ ω(N )2 N 60 k=1

RN (φ)2ω φ2ω 4Cω2 + √ ω(N )2 N

2

As any weight ω ∈ M is sub-multiplicative one has (see (46), (48) and (59)) ω(2n)Bn (φ) Bn (φω ),

(62)

where φω (j ) = ω(j )φ(j ) for any j ∈ Z. Applying Proposition (4.1) to the constant weight ω ≡ 1 and φω ∈ L2 then yields the following Corollary 4.1. Let ω ∈ M be an arbitrary weight. Then for any ϕ ∈ Hω there exists N2 ≡ N2 (φω ) so that for any N N2 |n|N

1/2 ω(2n)2 Bn (φ)2

2 φ2 4 RN (φ)ω + √ ω . N

The constant N2 can be chosen locally uniformly with respect to ϕ. Proof. The claimed estimate follows from Proposition 4.1 with ω ≡ 1 and φω ∈ L2 , taking into account that RN (φ) = RN (φω ) and φω = φω . 2 ω Given an arbitrary element ϕ in L2 , the following result describes the decay properties of the sequence (γn (ϕ))n∈Z in terms of the regularity of ϕ. Theorem 4.1. Let ω be an arbitrary weight in M. Then for any ϕ ∈ Hω , one has (γn (ϕ))n∈Z ∈ hω (Z).



Proof. Let ϕ be in Hω , and n ∈ Z with |n| N1 and N1 given as in Lemma 4.2. Then by Lemma 4.2, det Sn (λ) has exactly two roots in Dn and none in Un \ Dn . Note that the union of all the (closed) strips Un covers the complex plane. Further there exists N3 N2 with N2 given as in Corollary 4.1 so that • spec(ϕ) ∩ {λ ∈ C: |λ − kπ| < π/2} contains exactly one isolated pair of eigenvalues − {λ+ k , λk }, for any |k| N3 ; • spec(ϕ) \ {λ± k , |k| N3 } is contained in {λ ∈ C: |λ| < (N3 − 1/2)π} (cf. e.g. [6]). Hence, for any n with |n| N3 , the two roots of det Sn (λ) in Dn found above are the periodic eigenvalues λ± n . By Lemma 4.2 γn (ϕ)2 = |ξ+ − ξ− |2 9 sup |bn cn | 9 sup |bn |2 + 9 sup |cn |2 . 2 λ∈Un 2 λ∈Un λ∈Un Notice that by (47), ω(2n)|bn | ω(2n)ϕ2 (2n) + ω(2n)bn − ϕ2 (2n) ω(2n)ϕ2 (2n) + ω(2n)B(φ, n). As ϕ ∈ Hω , the first term of the latter expression is in 2 (Z), and by Corollary 4.1 the second one is in 2 (Z) as well. By the same arguments, we obtain similar estimates for cn . This establishes Theorem 4.1. 2 Proof of Theorem 1.1. As iL2R ⊂ L2 , Theorem 1.1 is a special case of Theorem 4.1.

2

5. Adapted Fourier coefficients

To prove Theorem 1.2 and Theorem 1.3 we want to adapt the arguments of Pöschel [18], developed for KdV, to the focusing NLS equation. In a first step we slightly perturb the Fourier coefficients of a given element $ϕ ∈ L^2$ so that they can be compared with the sequence of gap lengths $(γ_n(ϕ))_{n ∈ \mathbb{Z}}$. It turns out that the matrix $S_n(λ, ϕ)$, introduced in (26a),
\[
S_n(λ, ϕ) = \begin{pmatrix} λ - nπ - a_n(λ, ϕ) & -b_n(λ, ϕ) \\ -c_n(λ, ϕ) & λ - nπ - a_n(λ, ϕ) \end{pmatrix} \tag{63}
\]
encodes all the information needed to prove Theorems 1.2 and 1.3. Following [18] we want to choose $λ ∈ U_n$ so that the diagonal terms of $S_n(λ, ϕ)$ vanish. For any real $M > 0$ denote by $B_M^ω$ the closed ball of radius $M$ in $H^ω$,
\[
B_M^ω := \{ f ∈ H^ω : \|f\|_ω \le M \}
\]
and let
\[
\mathcal{B}_M^ω = B_M^ω \times B_M^ω.
\]



Lemma 5.1. Let $M \ge 1$ and let $ω$ be a weight of the form $ω(k) = w(k)v(k)$, $k ∈ \mathbb{Z}$, where $w$ and $v$ are weights with $v ∈ \mathcal{M}$ and $\lim_{k \to \infty} w(k) = \infty$. Then there exists $N_4 \equiv N_4(M, w)$ such that for any $n ∈ \mathbb{Z}$ with $|n| \ge N_4$, there is a unique analytic function $σ_n : \mathcal{B}_M^ω \to \mathbb{C}$ such that
(i) $\sup_{ϕ ∈ \mathcal{B}_M^ω} |σ_n(ϕ) - nπ| \le \frac{π}{48}$,
(ii) $σ_n(ϕ_1, \pm\bar ϕ_1) ∈ \mathbb{R}$ for any $ϕ_1 ∈ B_M^ω$, and
(iii) $σ_n(ϕ) = nπ + a_n(σ_n(ϕ), ϕ)$ identically on $\mathcal{B}_M^ω$.

Proof. For any given n ∈ Z, consider the fixed point problem for the map T T σ := nπ + an σ (·), · ω → C satisfying acting on the set E of all analytic functions σ : BM ω := sup |σ − nπ| |σ − nπ|BM ω ϕ∈BM

π 48

and ω . σ (ϕ1 , ±ϕ1 ) ∈ R ∀ϕ1 ∈ BM

An example of a map σ in E is 1

π σ (ϕ) = nπ + 48M 2

ϕ1 (t)ϕ2 (t) dt. 0

Hence E = ∅. Now choose N4 = N4 (M, w) 1 so that for any n with |n| N4 1 |n|1/2

+

1 w(n)

1 . 96M 2

ω and any |n| N By Lemma 2.2 it then follows that for any ϕ ∈ BM 4

− + X X n

n

7φ2ω HS

1 1 + 1/2 w(n) |n|

whereas by Lemma 4.1, π sup an (λ) . 48 λ∈Un ω into the closed disc Since any σ in E maps BM

π Dn = λ: |λ − nπ| ⊂ Un , 48

1 2

(64)



we have for any |n| N4 and any σ ∈ E π ω an σ (·), · ω |an |U ×B ω |T σ − nπ|BM . n BM M 48 The latter inequality and Lemma 3.1(iii) show that T maps E into E. Endow E with the metric ω . Then T is a contraction, d(σ1 , σ2 ) = |σ1 − σ2 |BM 1 ω d(σ1 , σ2 ) d(σ1 , σ2 ) d T (σ1 ), T (σ2 ) = T (σ1 ) − T (σ2 )B ω |∂λ an |Dn ×BM M 23 as by Cauchy’s estimate ω |∂λ an |Dn ×BM

ω |an |Un ×BM

dist(Dn , ∂Un )

π/48 1 = . π/2 − (π/48) 23

Hence, T admits a unique fixed point in E, denoted by σn . By construction, σn satisfies items (i)–(iii). 2 To investigate the function σn given by Lemma 5.1, apply Lemma 5.1 for M 1 and v = 1 and denote the unique analytic function σn by σnw . For any weight ω = wv with v ∈ M, one has ω(k) = w(k)v(k) w(k)

∀k ∈ Z,

ω ⊂ B w . Hence by the uniqueness property, the function σ given by Lemma 5.1 for and thus, BM n M ω, ω and M 1 is the restriction of σnw to BM

σn = σnw Bω . M

For $M \ge 1$ and $ω = wv$ as above, choose $N_4 = N_4(M, w)$ as in Lemma 5.1. Then for any $n ∈ \mathbb{Z}$ with $|n| \ge N_4$ and $ϕ = (ϕ_1, ϕ_2) ∈ \mathcal{B}_M^ω$,
\[
S_n(σ_n(ϕ), ϕ) = \begin{pmatrix} 0 & -b_n(σ_n(ϕ), ϕ) \\ -c_n(σ_n(ϕ), ϕ) & 0 \end{pmatrix}.
\]
By (64) and Lemma 4.1, we know that for $|n| \ge N_4$, the coefficient $b_n(σ_n(ϕ), ϕ)$ is close to $\hat ϕ_2(2n)$ whereas $c_n(σ_n(ϕ), ϕ)$ is close to $\hat ϕ_1(-2n)$. We now define the perturbed Fourier series $F_M(ϕ_1, ϕ_2)$ on $\mathcal{B}_M^ω$ by
\[
F_M(ϕ_1, ϕ_2) = \Big( \sum_{|n| \le N_M} \hat ϕ_1(2n)\, e_{2n} + \sum_{|n| > N_M} c_{-n}(σ_{-n}(ϕ), ϕ)\, e_{2n},\ \sum_{|n| \le N_M} \hat ϕ_2(2n)\, e_{2n} + \sum_{|n| > N_M} b_n(σ_n(ϕ), ϕ)\, e_{2n} \Big), \tag{65}
\]
where the integer $N_M$ satisfies $N_M \ge \max\{N_4(M, w), N_4(2M, w)\}$, and will be chosen according to the following proposition.



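Proposition 5.1. For any $M \ge 1$ and for any unbounded slowly growing weight $w$, there exists $N_M \equiv N_M(w)$ such that $F_M$ maps $\mathcal{B}_M^w$ into $H^w$. Moreover, for any weight $ω$ of the form
\[
ω(k) = w(k)v(k), \quad k ∈ \mathbb{Z},
\]
where $v$ is any given weight in $\mathcal{M}$, the restriction of $F_M$ to $\mathcal{B}_M^ω$ is an analytic diffeomorphism onto its image
\[
F_M\big|_{\mathcal{B}_M^ω} : \mathcal{B}_M^ω \longrightarrow F_M(\mathcal{B}_M^ω) \subset H^ω \tag{66}
\]
such that for all $ϕ ∈ \mathcal{B}_M^ω$,
\[
\tfrac{1}{2}\|ϕ\|_ω \le \|F_M(ϕ)\|_ω \le 2\|ϕ\|_ω. \tag{67}
\]
Moreover, for any $ε ∈ \{1, i\}$,
\[
F_M\big(\mathcal{B}_M^ω ∩ εL^2_{\mathbb{R}}\big) = F_M\big(\mathcal{B}_M^ω\big) ∩ εL^2_{\mathbb{R}}. \tag{68}
\]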

ω Proof. Let M 1 and let w be an unbounded, slowly growing weight. Since σn maps B2M into Un for |n| N4 (2M, w) and any weight ω of the form ω = wv as described above, the coω . Moreover, according efficients c−n (σ−n (ϕ), ϕ) and bn (σn (ϕ), ϕ) are well defined for ϕ in B2M to (47), we have ω(2n)bn (σn , ϕ) − ϕ2 (2n) ω(2n)Bn (φ) w(2n)v(2n)Bn (φ). (69)

Since v is sub-multiplicative, one concludes by (62) that ω(2n)bn (σn , ϕ) − ϕ2 (2n) w(2n)Bn (φv ).

(70)

In the same way, one gets from (49) and (62) that ω(2n)cn (σn , ϕ) − ϕ1 (−2n) w(2n)Bn (φv ).

(71)

Similarly as in (61) choose NM = NM (w) N4 (2M, w) so that for any n NM

1 φv w 2 . Cw Rn (φv ) w + √ 60 n Then, as in the proof of Proposition 4.1 |n|NM

1/2 2

2

w(2n) Bn (φv )

√ 4 2Cw 2 (2M)2

1 1 +√ 2 w(NM ) NM

where we used that φv w = φvw = φω 2M.

(72)



As in the proof of Theorem 4.1, it then follows that (bn (σn , ϕ))n∈Z belongs to hω . Similarly, we ω . see that (cn (σn , ϕ))n∈Z is in hω . Hence, the adapted Fourier series FM is well defined on B2M In view of the definition (65) and the estimates (69)–(71), the following estimate then holds ω with ω = wv as above for any ϕ in B2M FM (ϕ) − ϕ = ω

2 2 1/2 ω(2n) bn (σn , ϕ) − ϕ2 (2n) + c−n (σ−n , ϕ) − ϕ1 (2n) 2

|n|>NM

√

1/2 2 w(2n)2 Bn (φv )2 . |n|>NM

Thus one has FM (ϕ) − ϕ 32Cw 2 M 2 ω

1 1 . +√ w(NM )2 NM

ω ) ⊆ Hω . Moreover, according to the latter estimate, we can choose N In particular, FM (B2M M larger, if necessary, so that

FM (ϕ) − ϕ M ω 16

ω . ∀ϕ ∈ B2M

(73)

Cauchy’s estimate (cf for instance [13, Lemma A.2]) then yields sup dϕ FM − Idω

ω ϕ∈BM

1 1 sup FM (ϕ) − ϕ ω , ω M ϕ∈B 8

(74)

2M

ω → Hω is a local diffeomorphism. To see that F is one-to-one note that for any hence FM : BM M ω ϕ, ϕ ∈ BM

FM (ϕ) − FM ( ϕ ) − (ϕ − ϕ )ω sup dψ FM − Idω ϕ − ϕ ω ω ψ∈BM

1 ϕ ω . ϕ − 8 ϕ ), one has ϕ − ϕ ω 18 ϕ − ϕ ω which implies ϕ = ϕ . Note that for Thus if FM (ϕ) = FM ( ω any ϕ ∈ BM FM (ϕ) ϕω + FM (ϕ) − ϕ 1 + sup dψ FM − Id ϕω ω ω ω ω ψ∈BM

where for the latter inequality we used that FM (0) = 0. It then follows that ϕω FM (ϕ)ω + FM (ϕ) − ϕ ω FM (ϕ)ω + sup dψ FM − Idω ϕω . ω ψ∈BM



ω Hence by (74) it follows that for any ϕ ∈ BM

7 9 ϕω FM (ϕ)ω ϕω , 8 8 proving the estimate (67). Finally the statement (68) follows from Lemma 3.1(ii)–(iii) and Lemma 5.1(ii). 2 6. Regularity: Abstract case Proposition 6.1. Let M 1 and let ω be a weight of the form ω = wv where v ∈ M and w is an w unbounded slowly growing weight. Then for any ϕ ∈ BM one has ω FM (ϕ) ∈ BM/2

⇒

ω ϕ ∈ BM .

In particular, ϕ is in Hω . w Proof. By Proposition 5.1, the map FM is an analytic diffeomorphism between BM and its ω image. In addition, it is an analytic diffeomorphism between BM and its image. By Lemma 6.1 ω ) contains B ω . Thus, if ψ := F (ϕ) ∈ B ω w below, FM (BM M M/2 M/2 for some ϕ ∈ BM , then

−1 ω ϕ = FM (ψ) ∈ BM ⊂ Hω .

This proves the claimed statement.

2

ω ) contains B ω . Lemma 6.1. Under the same assumptions as in Proposition 6.1, FM (BM M/2 ω and its image Proof. First note that by Proposition 5.1, FM is a diffeomorphism between BM ω ω IM := FM (BM ). In particular,

ω ω ∂IM . = FM ∂BM

(75)

ω ω . We claim that in this case, Assume that BM/2 is not contained in IM ω ω ∩ UM/2 = ∅ ∂IM

(76)

ω ω . To see it, assume that (76) is not true and let X := where UM/2 denotes the interior of BM/2 ω ω IM ∩ UM/2 . Note that FM (0) = 0, and hence 0 ∈ X, i.e. X is not empty. As FM is a diffeomorω is closed and therefore X is closed in the relative topology of U ω . On the other phism, IM M/2 ω ∩ Uω ω hand, by our assumption, ∂IM M/2 = ∅ and thus X is open. As UM/2 is connected it then ω , contradicting the assumption ∂I ω ∩ U ω = ∅. This proves (76). follows that X = UM/2 M M/2 ω ω such that ψ = F (ϕ). It follows from (75) and (76) that there exists ψ ∈ UM/2 and ϕ ∈ ∂BM M In particular,

FM (ϕ) < M/2. ω



Combining the latter inequality with estimate (67), we get M/2 = ϕω /2 FM (ϕ)ω < M/2. 2

This contradiction proves the statement of the lemma.

w for some M 1 where w is an unbounded slowly growing weight. Proposition 6.2. Let ϕ ∈ BM Suppose that

FM (ϕ) ∈ Hω for some weight ω = wv where v ∈ M. Then the following statements hold: (i) If v is sub-exponential, then ϕ ∈ Hω . (ii) If v is exponential, then ϕ ∈ Hwv for some ε > 0 where vε is the weight defined by vε (k) = eε|k| ,

k ∈ Z.

To deduce Proposition 6.2 from Proposition 6.1, we argue as in [18], and introduce a modified weight. The following lemma can be found in [18, Lemma 9]. Lemma 6.2. If v ∈ M is either sub-exponential or exponential, then ωε := min{vε , v} is a weight in M for any ε > 0 sufficiently small. Proof of Proposition 6.2. Without loss of generality we may assume that M 8ϕw ,

(77)

since the assumptions of the proposition and the conclusions continue to hold if M is increased. For ψ = FM (ϕ) we have by the estimate (67) ψw 2ϕw .

(78)

On the other hand, ψ ∈ Hω by assumption so ψω < ∞. By the definition (65) of FM we have that FM (0) = 0. Hence it remains to consider the case ψ = 0. Choose N so large that TN ψω ψw , where TN ψ is the tail of the Fourier series of ψ = (ψ1 , ψ2 )

ψ1 (2n)e2n , ψ2 (2n)e2n . TN ψ = |n|>N

|n|>N



By Lemma 6.2, ωε = min{vε , v} is a weight in M for 0 < ε 1/2N sufficiently small. Then ψ2wωε = ψ − TN ψ2wωε + TN ψ2wωε ψ − TN ψ2wvε + TN ψ2ω e2N ε ψ − TN ψ2w + ψ2w e2N ε + 1 ψ2w 4ψ2w , or, taking into account (77), (78), ψwωε 2ψw 4ϕw

M . 2

wωε Thus, ψ ∈ BM/2 , whence wωε −1 (ψ) ∈ BM ⊂ Hwωε ϕ = FM

by Proposition 6.1. The claim follows by noting that Hwωε = Hω if v is sub-exponential, and Hwωε = Hwvε if v is an exponential weight and ε > 0 is sufficiently small. Indeed, if v is subexponential, then log v(n) = 0. |n|→∞ |n|

χ(v) = lim

Thus, for any ε > 0, there exists Nε > 0 such that for all |n| Nε 0

log v(n) ε |n|

or

v(n) exp ε|n| = vε (n).

In other words, ωε (n) = min(v(n), vε (n)) = v(n) for |n| Nε which means that Hwωε = Hω . Similarly if v is an exponential weight, i.e. χ(v) > 0, then one has ωε (n) = vε (n) for any 0 < ε < χ(v) and |n| large enough and thus Hwωε = Hwvε . 2 7. Lower bound for γn − The following result gives a two sided bound for the gap length |γn (ϕ)| = |λ+ n (ϕ) − λn (ϕ)| assuming a two sided bound for the quotient of cn and bn .

Lemma 7.1. Let $M \ge 1$, let $w$ be an unbounded increasing weight and let $ω$ be a weight of the form
\[
ω(k) = w(k)v(k), \quad k ∈ \mathbb{Z}, \quad\text{with } v ∈ \mathcal{M}.
\]
Further assume that $n ∈ \mathbb{Z}$ with $|n| \ge N_4(M)$ (where $N_4(M)$ is given as in Lemma 5.1), and $ϕ ∈ \mathcal{B}_M^ω$. If $b_n(σ_n, ϕ) \ne 0$ and
\[
\frac{1}{9} \le \left| \frac{c_n(σ_n, ϕ)}{b_n(σ_n, ϕ)} \right| \le 9, \tag{79}
\]
then
\[
|b_n(σ_n, ϕ)\, c_n(σ_n, ϕ)| \le |γ_n(ϕ)|^2 \le 9\, |b_n(σ_n, ϕ)\, c_n(σ_n, ϕ)|.
\]

Proof. Write $\det(S_n(λ)) = g_+(λ)\, g_-(λ)$ where
\[
g_{\pm}(λ) := λ - nπ - a_n(λ) \pm \sqrt{b_n(λ, ϕ)\, c_n(λ, ϕ)}. \tag{80}
\]
Set $σ_n \equiv σ_n(ϕ)$ with $σ_n(ϕ)$ given by Lemma 5.1 and introduce
\[
ξ_n \equiv ξ_n(ϕ) = \sqrt{b_n(σ_n, ϕ)\, c_n(σ_n, ϕ)} \quad\text{and}\quad r_n := |ξ_n|. \tag{81}
\]
As by assumption $b_n(σ_n, ϕ) \ne 0$ and $c_n(σ_n, ϕ) \ne 0$, one has $r_n > 0$. For $λ$ near $σ_n$, we can choose a fixed sign of the square root $\sqrt{b_n(λ, ϕ)\, c_n(λ, ϕ)}$. Indeed, let $D_n^0$ be the disk
\[
D_n^0 := \{ λ ∈ \mathbb{C} : |λ - σ_n| \le 2r_n \}.
\]
By Lemma 5.1, we have $|σ_n - nπ| \le π/48$ and by the estimate (44), $r_n \le π/48$. Hence

π Dn0 ⊆ λ ∈ C: |λ − nπ| < = Dn . 3 We claim that bn (λ, ϕ) and cn (λ, ϕ) do not vanish for λ ∈ Dn0 . To see it, write |cn (σn , ϕ)| as 1/2 cn (σn , ϕ) = bn (σn , ϕ)cn (σn , ϕ)1/2 cn (σn , ϕ) . b (σ , ϕ) n

n

The estimate (79) then implies that 1 rn cn (σn , ϕ) 3rn . 3

(82a)

1 rn bn (σn , ϕ) 3rn . 3

(82b)

Similarly, one gets

Moreover, for any λ ∈ Dn0 cn (σn , ϕ) − cn (λ, ϕ) |∂λ cn |

Dn0 2rn



and, by Cauchy’s estimate |cn |Un dist(Dn0 , ∂Un )

|∂λ cn |Dn0

where we used again the estimates |σn − nπ|

π 2

π/48 1 , − |σn − nπ| − 2rn 21

π 48 ,

and rn

cn (σn , ϕ) − cn (λ, ϕ)

Dn0

π 48 .

(83)

Hence

2 rn . 21

Similarly, we get |∂λ bn |Dn0

1 21

2 and bn (σn , ϕ) − bn (λ, ϕ)D 0 rn n 21

(84)

as well as |∂λ an |Dn0

1 . 21

(85)

Combined with the estimate (82a) we then conclude that for any λ ∈ Dn0 5 rn = 21

2 2 1 65 − rn cn (λ, ϕ) , bn (λ, ϕ) 3 + rn = rn 3 21 21 21

or bn c

cn , b 0

n Dn

n Dn0

65 = 13. 5

(86)

In bn (λ) and cn (λ) do not vanish on Dn0 . Hence λ → √ particular, the latter estimate shows that 0 bn (λ, ϕ)cn (λ, ϕ) is differentiable on Dn . Using (83), (84) and (86), we finally obtain ∂λ bn (λ, ϕ)cn (λ, ϕ)

Dn0

4 1 √ · 13 |∂λ bn |Dn0 + |∂λ cn |Dn0 . 2 21

(87)

Now, we are in a position to compare g± with h± , defined by h± (λ) := λ − nπ − an (σn , ϕ) ± ξn on the disk Dn± := λ: |λ − σn ± ξn | rn /2 ⊂ Dn0 .

(88)

Then |λ − σn |Dn± |λ − σn ± ξn |Dn± + |ξn |

3rn . 2

(89)



By the definition (80) of $g_{\pm}$, the definition (81) of $ξ_n$ and the estimates (85) and (87) one has
\[
|h_{\pm}(λ) - g_{\pm}(λ)|_{D_n^{\pm}} \le |a_n(λ) - a_n(σ_n)|_{D_n^{\pm}} + \Big| \sqrt{b_n(λ, ϕ)\, c_n(λ, ϕ)} - ξ_n \Big|_{D_n^{\pm}} \le \Big( \frac{1}{21} + \frac{4}{21} \Big) |λ - σ_n|_{D_n^{\pm}} < \frac{r_n}{2} = |h_{\pm}(λ)|_{\partial D_n^{\pm}}
\]
where, for the latter identity, we used the definition (88) and the identity $h_{\pm}(λ) = λ - σ_n \pm ξ_n$, which follows from the definition of $σ_n$, $a_n(σ_n, ϕ) = σ_n - nπ$. Since $h_{\pm}$ has the unique root $λ = σ_n \mp ξ_n$ in $D_n^{\pm}$, it follows by Rouché's theorem that the unique root $λ_n^{\pm}$ of $g_{\pm}$ within $D_n$ must be contained in $D_n^{\pm}$. Hence
\[
|λ_n^{\pm} - σ_n \pm ξ_n| \le \frac{r_n}{2}.
\]
Writing
\[
λ_n^+ - λ_n^- = [λ_n^+ - σ_n + ξ_n] - [λ_n^- - σ_n - ξ_n] - 2ξ_n
\]
we then conclude that
\[
r_n = 2|ξ_n| - r_n \le |λ_n^+ - λ_n^-| \le 2|ξ_n| + r_n = 3r_n
\]
as claimed.
2

In the focusing and defocusing case, the two sided estimate (79) can be easily verified. Indeed, according to Lemma 3.1, for any ϕ ∈ L2 with ϕ = (ϕ1 , ±ϕ1 ) bn (λ, ϕ) = ±cn (λ, ϕ). By Lemma 5.1(ii), σn (ϕ1 , ±ϕ1 ) is real and thus bn σn (ϕ), ϕ = ±cn σn (ϕ), ϕ . Hence if bn (σn (ϕ), ϕ) = 0, then for ϕ ∈ L2 with ϕ = (ϕ1 , ±ϕ1 ) cn (σn (ϕ), ϕ) b (σ (ϕ), ϕ) = 1. n n We note that the conclusion of the lemma above continues to hold for ϕ = (ϕ1 , ±ϕ1 ) even if bn (σn (ϕ), ϕ) = 0. Indeed, as cn (σn (ϕ), ϕ) = ±bn (σn (ϕ), ϕ) = 0, the matrix Sn (σn (ϕ), ϕ) is zero − (cf (63)) and consequently, λ+ n (ϕ) = λn (ϕ) = σn (ϕ) is a double periodic eigenvalue, i.e. the nth gap is collapsed.



8. Regularity

It turns out that the converse of Theorem 4.1 is not true. First we need to exclude exponential weights, and second we need to restrict to potentials $ϕ$ in $L^2_{\mathbb{R}} ∪ iL^2_{\mathbb{R}}$.

Theorem 8.1. Let $ω$ be any weight of the form $ω = wv$ where $w$ is an unbounded slowly growing weight and $v ∈ \mathcal{M}$. Then the following statements hold.
(i) Assume that $v$ is a sub-exponential weight. Then for any $ϕ ∈ L^2_{\mathbb{R}} ∪ iL^2_{\mathbb{R}}$,
\[
ϕ ∈ H^w \ \text{and}\ (γ_n(ϕ))_{n ∈ \mathbb{Z}} ∈ h^ω \ \Rightarrow\ ϕ ∈ H^ω.
\]
(ii) Assume that $v$ is an exponential weight. Then for any $ϕ ∈ L^2_{\mathbb{R}} ∪ iL^2_{\mathbb{R}}$, there exists $ε > 0$ so that
\[
ϕ ∈ H^w \ \text{and}\ (γ_n(ϕ))_{n ∈ \mathbb{Z}} ∈ h^{wv} \ \Rightarrow\ ϕ ∈ H^{wv_ε}
\]
where $v_ε$ is the weight defined by $v_ε(k) = e^{ε|k|}$, $k ∈ \mathbb{Z}$.

Proof. Suppose that $ϕ ∈ H^w ∩ (L^2_{\mathbb{R}} ∪ iL^2_{\mathbb{R}})$ satisfies
\[
\sum_{n ∈ \mathbb{Z}} ω(2n)^2\, |γ_n(ϕ)|^2 < \infty.
\]
Let $M$ be large enough so that $ϕ ∈ \mathcal{B}_M^w$, and consider $ψ = (ψ_1, ψ_2)$ with $\hat ψ_1(2k) = c_{-k}(σ_{-k}(ϕ), ϕ)$ and $\hat ψ_2(2k) = b_k(σ_k(ϕ), ϕ)$ for $|k| > N_M$, where $N_M$ is given as in Proposition 5.1. As $ϕ$ is in $L^2_{\mathbb{R}} ∪ iL^2_{\mathbb{R}}$, we know that
\[
b_k(σ_k(ϕ), ϕ) = \pm\overline{c_k(σ_k(ϕ), ϕ)}.
\]
So Lemma 7.1 applies, leading to
\[
|\hat ψ_1(-2k)| = |\hat ψ_2(2k)| \le |γ_k(ϕ)|, \quad |k| > N_M.
\]
Hence $ψ = F_M(ϕ) ∈ H^ω$, and the claimed result follows from Proposition 6.2.
2

9. Proofs of Theorems 1.2, 1.3, and Corollary 1.1

To prove Theorem 1.2 and Theorem 1.3, we want to apply Theorem 8.1. The following lemma, which can be found in [5, Lemma 48], allows us to get rid of the weight function $w$ in Theorem 8.1.

Lemma 9.1. If $z = (z_k)_{k ∈ \mathbb{Z}} ∈ \ell^2$, then there exists an unbounded slowly increasing weight $w = (w(k))_{k ∈ \mathbb{Z}}$ such that $z ∈ h^w$.

Proof. Without loss of generality, we can assume that the sequence $z$ has infinitely many nonzero terms. Consider the sequence $ρ = (ρ_n)_{n \ge 1}$, $ρ_n = \frac{1}{\sqrt{2}\, r_n}$, where
\[
r_n := \Big( \sum_{|k| \ge n} |z_k|^2 \Big)^{1/4}.
\]
Clearly, $ρ$ is an unbounded increasing sequence such that
\[
\sum_{n=1}^{\infty} |ρ_n|^2 \big( |z_n|^2 + |z_{-n}|^2 \big) < \infty.
\]
Indeed,
\[
ρ_n^2 \big( |z_n|^2 + |z_{-n}|^2 \big) = \frac{|z_n|^2 + |z_{-n}|^2}{2r_n^2} \le \frac{r_n^4 - r_{n+1}^4}{r_n^2 + r_{n+1}^2} = r_n^2 - r_{n+1}^2
\]
and thus for any $N \ge 1$,
\[
\sum_{n=1}^{N} ρ_n^2 \big( |z_n|^2 + |z_{-n}|^2 \big) \le r_1^2 \le \|z\|.
\]

Let us introduce the sequence $d = (d_n)_{n \ge 1}$ defined by
\[
d_1 = ρ_1 \quad\text{and}\quad d_{2n+1} = d_{2n} = \min(2d_n, ρ_{2n}) \quad \forall n \ge 1.
\]
Then the sequence $d$ satisfies
\[
d_{2n} \le 2d_n \quad \forall n \ge 1.
\]
Note that dn ρn , for any n 1, as d1 = ρ1 and for any n 1, d2n+1 = d2n ρ2n ρ2n+1 where for the latter inequality we used that (ρn )n1 is increasing. Moreover (dn )n1 is increasing. Indeed, d1 d2 and d2n = d2n+1 , n 1, by construction whereas for any n 1, d2n+1 min(2dn+1 , ρ2n+2 )(= d2n+2 ) as d2n+1 ρ2n ρ2n+2 and d2n+1 = d2n 2dn 2dn+1 where we used induction to get the latter inequality. To see that (dn )n1 is unbounded, suppose the contrary. Then, there exists d∞ such that dn d∞

as n → ∞.



Since (ρn )n1 is unbounded and increasing, there exists N 1 so that ρ2N > 2d∞

and

2 d∞ dN d∞ 3

and hence 4 d2N = min(2dN , ρ2N ) = 2dN d∞ > d∞ , 3 contradicting our assumption that dn d∞ , ∀n 1. Finally, we define the weight w by w(0) = 1,

w(k) = 1 + d|k|

∀k ∈ Z \ {0}.

Note that w(k) 1 + ρ|k| , ∀k ∈ Z \ {0} as d|k| ρ|k| . Hence, w is an unbounded, slowly growing weight so that z ∈ hw . 2 Now, we are in position to prove our main results. Proof of Theorems 1.2 and 1.3. Let ϕ ∈ iL2R and assume that (γn (ϕ)) ∈ hω for some weight ω ∈ M. Consider the sequence z = (zn )n∈Z given by 2 2 2 1/2 . zn = ϕ1 (2n) + ϕ2 (2n) + ω(2n)2 γn (ϕ) By the assumptions, this sequence is in 2 . By Lemma 9.1 there exists a slowly increasing unbounded weight w for which z ∈ hw . Thus ϕ ∈ Hw and (γn (ϕ))n∈Z ∈ hwω . Theorem 8.1 then implies that Theorem 1.2 holds. To complete the proof of Theorem 1.3 we argue as in the proof of Proposition 6.2. Let ω be an exponential weight in M, i.e. χ(ω) > 0. Applying Theorem 8.1 to wω it follows that, ϕ ∈ Hvε for some (small) ε > 0. Since χ(ω) > 0, there exists N 1 so that |n| N , 0 N , BM M R FM (ϕ) ∈ TN := span (0, e2k ), (e2k , 0) |k| N .



+ Then for any |n| > N , the matrix Sn (σn (ϕ), ϕ) vanishes and thus λ− n (ϕ) = λn (ϕ) = σn (ϕ). In ω ∩ iL2 ) ∩ T particular γn (ϕ) = 0. Hence ϕ is an N -gap potential. The union of the sets FM (BM N R ω ∩ iL2 ). Since, by Proposition 5.1, for N > NM is dense in FM (BM R

ω ω FM : B M ∩ iL2R ∩ iL2R → FM BM ω ∩ iL2 is dense in B ω ∩ iL2 . Since is a diffeomorphism, the family of N -gap potentials in BM M R R M is arbitrary, this proves Corollary 1.1 for such weights. Let ϕ = (ϕ1 , ϕ2 ) ∈ Hω where ω is an arbitrary weight in M. Then

\[
z_n = ω(2n) \big( |\hat ϕ_1(2n)|^2 + |\hat ϕ_2(2n)|^2 \big)^{1/2}
\]
is in $\ell^2$. By Lemma 9.1, $ϕ ∈ H^{wω}$ for some unbounded, slowly increasing weight $w$. By the arguments above the finite gap potentials are dense in $H^{wω}$, and $H^{wω}$ is a dense subspace of $H^ω$. □

References

[1] P. Djakov, B. Mityagin, Smoothness of Schrödinger operator potential in the case of Gevrey type asymptotics of the gaps, J. Funct. Anal. 195 (1) (2002) 89–128.
[2] P. Djakov, B. Mityagin, Spectra of 1-D periodic Dirac operators and smoothness of potentials, C. R. Math. Acad. Sci. Soc. R. Can. 25 (4) (2003) 121–125.
[3] P. Djakov, B. Mityagin, Spectral triangles of Schrödinger operators with complex potentials, Selecta Math. (N.S.) 9 (4) (2003) 495–528.
[4] P. Djakov, B. Mityagin, Instability zones of a periodic 1D Dirac operator and smoothness of its potential, Commun. Math. Phys. 259 (1) (2005) 139–183.
[5] P. Djakov, B. Mityagin, Instability zones of periodic 1-dimensional Schrödinger and Dirac operators, Russian Math. Surveys 61 (4) (2006) 663–766.
[6] B. Grébert, T. Kappeler, Estimates on periodic and Dirichlet eigenvalues for the Zakharov–Shabat system, Asymptot. Anal. 25 (3–4) (2001) 201–237; Erratum, Asymptot. Anal. 29 (2002) 183.
[7] B. Grébert, T. Kappeler, Density of finite gap potentials for the Zakharov–Shabat system, Asymptot. Anal. 33 (1) (2003) 1–8.
[8] B. Grébert, T. Kappeler, B. Mityagin, Gap estimates of the spectrum of the Zakharov–Shabat system, Appl. Math. Lett. 11 (4) (1998) 95–97.
[9] B. Grébert, T. Kappeler, J. Pöschel, Normal form theory for the NLS equation: A preliminary report, preprint, http://www.math.sciences.univ-nantes.fr/~grebert/publications.html, 2002.
[10] T. Kappeler, P. Lohrmann, P. Topalov, N.T. Zung, On Birkhoff coordinates for the focusing NLS, Commun. Math. Phys. 285 (3) (2009) 1087–1107.
[11] T. Kappeler, B. Mityagin, Gap estimates of the spectrum of Hill's equation and action variables for KdV, Trans. Amer. Math. Soc. 351 (2) (1999) 619–646.
[12] T. Kappeler, B. Mityagin, Estimates for periodic and Dirichlet eigenvalues of the Schrödinger operator, SIAM J. Math. Anal. 33 (1) (2001) 113–152.
[13] T. Kappeler, J. Pöschel, KdV & KAM, Ergeb. Math. Grenzgeb. (3), vol. 45, Springer, 2003.
[14] T. Kappeler, F. Serier, P. Topalov, On the symplectic phase space of KdV, Proc. Amer. Math. Soc. 136 (5) (2008) 1691–1698.
[15] V.A. Marchenko, Sturm–Liouville Operators and Applications, Oper. Theory Adv. Appl., vol. 22, Birkhäuser, 1986.
[16] L.A. Pastur, V.A. Tkachenko, Spectral theory of Schrödinger operators with periodic complex-valued potentials, Funct. Anal. Appl. 22 (2) (1988) 156–158.
[17] G. Pólya, G. Szegö, Problems and Theorems in Analysis. I. Series, Integral Calculus, Theory of Functions, Classics Math., Springer, 1998, reprint of the 1978 English translation.
[18] J. Pöschel, Hill's potentials in weighted Sobolev spaces and their spectral gaps, preprint (link: http://www.poschel.de/pbl/w-gaps.pdf).
[19] J.-J. Sansuc, V.A. Tkachenko, Spectral parametrization of non-selfadjoint Hill's operators, J. Differential Equations 125 (2) (1996) 366–384.
[20] V.A. Tkachenko, Spectral analysis of a nonselfadjoint Hill operator, Sov. Math. Dokl. 45 (1) (1992) 78–82.
[21] V.A. Tkachenko, Non-selfadjoint periodic Dirac operators with finite band spectra, Integral Equations Operator Theory 36 (3) (2000) 325–348.
[22] V.A. Tkachenko, Nonselfadjoint periodic Dirac operators, in: Operator Theory, System Theory and Related Topics, Beer-Sheva/Rehovot, 1997, in: Operator Theory Adv. Appl., vol. 123, Birkhäuser, 2001, pp. 485–512.
[23] V.E. Zakharov, A.B. Shabat, A scheme for integrating the nonlinear equations of mathematical physics by the method of the inverse scattering problem I, Funct. Anal. Appl. 8 (1974) 226–235.


Uniqueness in E (X, ω) Sławomir Dinew Institute of Mathematics, Jagiellonian University, ul. Łojasiewicza 6, 30-348 Kraków, Poland Received 26 May 2008; accepted 23 January 2009 Available online 11 February 2009 Communicated by L. Gross

Abstract We prove uniqueness for the Dirichlet problem for the complex Monge–Ampère equation on compact Kähler manifolds in the case of probability measures vanishing on pluripolar sets. The proof uses the mass concentration technique due to Kołodziej coupled with inequalities for mixed Monge–Ampère measures and the comparison principle. © 2009 Elsevier Inc. All rights reserved. Keywords: Monge–Ampère operator; Cegrell classes; Kähler manifold

S. Dinew / Journal of Functional Analysis 256 (2009) 2113–2122

E-mail address: [email protected]. 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2009.01.019

1. Introduction

Pluripotential theory on compact Kähler manifolds has turned out to be a very effective tool in complex geometry and dynamics. Despite the fact that it deals with (a priori) non-smooth functions, its techniques have been used with success in purely geometrical problems. For example, the L∞ estimate from [18] gives enough flexibility for studying various limiting problems in geometry which were inaccessible with the standard (restrictive) PDE techniques. However, the ideas considered soon opened new problems of independent interest. In a series of articles Guedj, Zeriahi and collaborators [15,16,14,2,10] and Kołodziej [18,19] laid down the foundations of the theory. In particular the complex Monge–Ampère operator was defined on (non-smooth) ω-psh functions (see the next section for definitions of all the notions appearing in the introduction), and its maximal domain of definition was explored. By analogy with the works of Błocki [4,5] and Cegrell [8,9], many results from the “flat” theory (i.e. the one in domains in $\mathbb{C}^n$) were adjusted to the Kähler manifold case. However, from the very beginning some problems turned out to be unexpectedly difficult in the new setting. One of them is the problem of uniqueness.

This problem consists of the following: let φ, ψ be ω-psh functions for which the Monge–Ampère operator is well defined. Assume that $\omega_\phi^n = \mu = \omega_\psi^n$. The question is under what assumptions on the positive measure μ and/or on the functions φ, ψ one can conclude that φ − ψ is constant. Note that there are known examples of failure of uniqueness in general (see [3] for one such example), hence some assumptions are necessary.

The first result in this direction is due to E. Calabi [7]. He proved that if φ, ψ are smooth and $\omega_\phi$, $\omega_\psi$ are Kähler forms (i.e. strictly positive) then uniqueness does hold. These are natural assumptions from a geometer’s perspective and the proof is quite easy in this case. However, both smoothness and strict positivity are crucial in this approach, hence it gives no insight into what to do in general. The next step was made by Bedford and Taylor [1], who proved uniqueness for bounded φ, ψ provided the underlying manifold is $\mathbb{P}^n$. Their main idea was to control the $L^2$ norm of the gradient of the difference of φ and ψ. Using a different technique Kołodziej [19] proved uniqueness for bounded functions on an arbitrary compact Kähler manifold modulo additional mild assumptions on the measure μ. The “bounded” case was finally settled by Błocki [3]. The proof has some common points with the one in [1], but is much easier and more transparent. Furthermore, the proof gives some stability results showing that when one perturbs the measure on the right-hand side slightly, the normalized solution is in a way close to the original one.

By developing the theory of Cegrell classes in the Kähler manifold setting [15,16] (see also [12]) the domain of definition of the operator was enlarged to include many unbounded functions. Guedj and Zeriahi [16] observed that Błocki’s argument, with suitable modifications, can be carried over to prove uniqueness in the class E^1(X, ω).
Recently Demailly and Pali [11] proved uniqueness in the same class for more general forms which are only semi-positive. The most general result so far was proven very recently by Błocki (see [6]), who showed that uniqueness does hold in the class $\mathcal{E}^{1-\frac{1}{2^{n-1}}}(X, \omega)$, n = dim X.

Simultaneously the picture in the flat theory was made much clearer by Cegrell, who proved in [9] that uniqueness holds provided the measure μ does not charge pluripolar sets. The proof, however, relied heavily on tools that are not available in the compact setting. Nevertheless it is natural to expect that uniqueness in this class (called E(X, ω)) should also hold. In fact in [10] this point is an important obstruction to further understanding of the domain of definition of the Monge–Ampère operator. The class E(X, ω) deserves special interest, due to the following result proven in [16]:

Theorem 1.1. Let μ be a probability Borel measure that vanishes on pluripolar sets. Then there exists (at least one) ρ ∈ E(X, ω) such that
\[ \omega_\rho^n = \mu, \qquad \sup_X \rho = 0. \]

It is therefore important to study what happens in E(X, ω) \ E^1(X, ω), so that one can better understand the action of the complex Monge–Ampère operator. Let us state our main result, which solves the uniqueness problem in E(X, ω) completely:


Theorem 1.2. Let φ, ψ ∈ E(X, ω) be such that $\omega_\phi^n = \omega_\psi^n$. Then φ − ψ is constant.

2. Preliminaries

Throughout the note we shall work on a fixed compact n-dimensional Kähler manifold X equipped with a fundamental Kähler form ω (that is, a d-closed strictly positive globally defined form) given in local coordinates by
\[ \omega = \frac{i}{2} \sum_{k,j=1}^{n} g_{k\bar j}\, dz^k \wedge d\bar z^j. \]
We assume that the metric is normalized so that
\[ \int_X \omega^n = 1. \]

Recall that
\[ \mathrm{PSH}(X, \omega) := \bigl\{ \phi \in L^1(X, \omega) : dd^c\phi \ge -\omega,\ \phi \in C^{\uparrow}(X) \bigr\}, \]
where $C^{\uparrow}(X)$ denotes the space of upper semicontinuous functions and, as usual, d is the standard operator of exterior differentiation while $d^c := \frac{i}{2\pi}(\bar\partial - \partial)$. We call the functions that belong to PSH(X, ω) ω-plurisubharmonic (ω-psh for short). We shall often use the handy notation $\omega_\phi := \omega + dd^c\phi$. Also, for the sake of brevity, we shall denote sets {z ∈ X | u(z) > −j} simply by {u > −j}.

The ω-psh functions are locally standard plurisubharmonic functions minus a (smooth) potential for the form ω. This allows one to use classical local results from pluripotential theory. In particular the Monge–Ampère operator $\omega_\phi^n := \omega_\phi \wedge \cdots \wedge \omega_\phi$ is well defined for bounded ω-psh functions. This approach was used in [18,19,12].

We recall below the definition of the class E(X, ω). For every u ∈ PSH(X, ω), $(\omega + dd^c \max(u, -j))^n$ is a well-defined probability measure. By [16] the sequence of measures $\chi_{\{u>-j\}} (\omega + dd^c \max(u, -j))^n$ is always increasing and one defines
\[ \mathcal{E}(X, \omega) := \Bigl\{ u \in \mathrm{PSH}(X, \omega) \;\Big|\; \lim_{j\to\infty} \int_X \chi_{\{u>-j\}} \bigl(\omega + dd^c \max(u, -j)\bigr)^n = 1 \Bigr\}. \]
These functions are a priori unbounded, but the integral assumption ensures that the Monge–Ampère measure has no mass on {u = −∞}. Then one defines
\[ (\omega + dd^c u)^n := \lim_{j\to\infty} \chi_{\{u>-j\}} \bigl(\omega + dd^c \max(u, -j)\bigr)^n. \]
In particular Monge–Ampère measures of functions from E(X, ω) do not charge pluripolar sets. We refer to [16] for a discussion of that notion.

The class E^1(X, ω), or more generally E^p(X, ω), p > 0, is defined by
\[ \mathcal{E}^p(X, \omega) := \Bigl\{ \phi \in \mathcal{E}(X, \omega) \;\Big|\; \int_X |\phi|^p\, \omega_\phi^n < \infty \Bigr\}. \]
Since ω-psh functions are upper semicontinuous, they are bounded from above, hence one usually considers only nonpositive ω-psh functions from E^p(X, ω), which often comes in handy in technical details. Note that originally the classes E^p were defined (similarly to the Cegrell classes in the flat theory) with the use of a sequence of bounded functions $\phi_j$, $\phi_j \searrow \phi$, such that $\sup_j \int_X |\phi_j|^p\, \omega_{\phi_j}^n < \infty$. The results from [16,12] have shown that actually one can take just the sequence $\phi_j := \max(\phi, -j)$, hence both definitions are coherent.

One can also define “local” classes in a manner similar to the one from [4,5]. We define the class D(X, ω) by
\[ \mathcal{D}(X, \omega) := \bigl\{ \phi \in \mathrm{PSH}(X, \omega) \;\big|\; \forall z \in X\ \exists\, U_z \text{ open},\ z \in U_z,\ \rho + \phi \in \mathcal{D}(U_z) \bigr\}, \]
where ρ is a local potential in $U_z$ for ω and $\mathcal{D}(U_z)$ is the maximal domain of definition of the Monge–Ampère operator in $U_z$ (see [4,5]). Note, however, that the “local” and global definitions yield different classes, as shown in [16] (this was also studied in [10]). This is in sharp contrast with the “flat” theory. Define also $\mathcal{D}_a(X, \omega)$ by
\[ \mathcal{D}_a(X, \omega) := \bigl\{ \phi \in \mathcal{D}(X, \omega) \;\big|\; \omega_\phi^n(A) = 0\ \ \forall A \subset X,\ A \text{ pluripolar} \bigr\}. \]
It is known that $\mathcal{D}_a(X, \omega) \subset \mathcal{E}(X, \omega)$, while $\mathcal{D}(X, \omega) \not\subset \mathcal{E}(X, \omega) \not\subset \mathcal{D}(X, \omega)$.

Please note that the terminology for the Cegrell classes varies in the literature, partially due to the mentioned differences between the “local” and “global” settings. In particular the class D(X, ω) is denoted by E(X, ω) in [17] or [13]. The class E(X, ω) in turn differs in some aspects from the class E(Ω) in the “flat” setting (for example a function in E(Ω) may have a Monge–Ampère measure that charges points).

The first result that we shall need, an inequality for mixed Monge–Ampère measures, was shown in a special case in [19] and in full generality in [13].

Theorem 2.1. Let u, v ∈ E(X, ω) be ω-psh functions, μ be a positive measure that does not charge pluripolar sets and f, g ∈ L^1(dμ). If
\[ (\omega + dd^c u)^n \ge f\, d\mu, \qquad (\omega + dd^c v)^n \ge g\, d\mu \]
as measures, then
\[ (\omega + dd^c u)^k \wedge (\omega + dd^c v)^{n-k} \ge f^{\frac{k}{n}} g^{\frac{n-k}{n}}\, d\mu, \qquad \forall k \in \{1, \ldots, n-1\}. \]

Corollary 2.2. If φ, ψ ∈ E(X, ω) and $\omega_\phi^n = \omega_\psi^n$, then also for every t ∈ (0, 1) we have
\[ \omega_{t\phi + (1-t)\psi}^n = \omega_\phi^n = \omega_\psi^n. \]


The second result we shall need is somewhat nonstandard, so although it is similar to the usual comparison principle, we give a detailed proof.

Theorem 2.3 (“Partial” comparison principle). Suppose T is a positive closed (k, k) current on X of the form $\omega_{\phi_1} \wedge \cdots \wedge \omega_{\phi_k}$, $\phi_j \in \mathcal{E}(X, \omega)$ for all j ∈ {1, …, k}, where 0 ≤ k ≤ n − 1. Let furthermore u, v ∈ E(X, ω). Then
\[ \int_{\{u<v\}} \omega_v^{n-k} \wedge T \le \int_{\{u<v\}} \omega_u^{n-k} \wedge T. \]

−∞;

(1.38)

T

• All other coefficients are of the form T− :=

O , B

T+ (τ ) = T− (τ¯ )

and R− := −

T− ¯ R+ , T¯+

(1.39)

where O is the outer function in the unit disk D, such that 2 |O|2 + R+ (τ ) = 1 a.e. on T

(1.40)

T∓ () = −ieic± T∓ ().

(1.41)

and

2166

F. Peherstorfer et al. / Journal of Functional Analysis 256 (2009) 2157–2210

Also
\[ \frac{1}{\nu_+(\zeta_k)}\, \frac{1}{\nu_-(\zeta_k)} = \left| \Bigl( \frac{1}{T_\pm} \Bigr)'(\zeta_k) \right|^2. \tag{1.42} \]

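Now, we define the Faddeev–Marchenko Hilbert space.

Definition 1.10. Set
\[ \alpha_+ := \{R_+, \nu_+\}. \tag{1.43} \]
An element f of the space $L^2_{\alpha_+}$ is a function on $\mathbb{T} \cup Z$ such that
\[ \|f\|^2_{\alpha_+} = \sum_{\zeta_k \in Z} |f(\zeta_k)|^2\, \nu_+(\zeta_k) + \frac{1}{2} \int_{\mathbb{T}} \begin{bmatrix} \overline{f(\tau)} & \overline{\bar\tau f(\bar\tau)} \end{bmatrix} \begin{bmatrix} 1 & \overline{R_+(\tau)} \\ R_+(\tau) & 1 \end{bmatrix} \begin{bmatrix} f(\tau) \\ \bar\tau f(\bar\tau) \end{bmatrix} dm \tag{1.44} \]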

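is finite.

Theorem 1.11. For $A \in A_{SB}(E)$ the system
\[ \bigl\{ e^+(n, \zeta) \bigr\}_{n=-\infty}^{\infty} \tag{1.45} \]
forms an orthonormal basis in the associated space $L^2_{\alpha_+}$. Therefore, the map
\[ F^+ : l^2(\mathbb{Z}) \to L^2_{\alpha_+} \quad \text{such that} \quad F^+ |n\rangle := e^+(n, \zeta) \tag{1.46} \]
is unitary. Moreover, $F^+ A (F^+)^*$ is the multiplication operator by v.

(1.46) is called the scattering representation of A. Note that simultaneously we have the representation
\[ F^- : l^2(\mathbb{Z}) \to L^2_{\alpha_-} \quad \text{such that} \quad F^- |{-n-1}\rangle := e^-(n, \zeta). \tag{1.47} \]

Theorem 1.12. The scattering representations (1.46), (1.47) determine each other by
\[ \begin{aligned} T_\pm(\tau)\, (F^\pm \tilde f)(\tau) &= \bar\tau\, (F^\mp \tilde f)(\bar\tau) + R_\mp(\tau)\, (F^\mp \tilde f)(\tau), & \tau &\in \mathbb{T}, \\ (F^\pm \tilde f)(\zeta_k) &= -\Bigl( \frac{1}{T_\pm} \Bigr)'(\zeta_k)\, \nu_\mp(\zeta_k)\, (F^\mp \tilde f)(\zeta_k), & \zeta_k &\in Z, \end{aligned} \tag{1.48} \]
for $\tilde f \in l^2(\mathbb{Z})$, and have the following analytic properties
\[ (B T_\pm)\, F^\pm\, l^2(\mathbb{Z}_\pm) \subset H^2. \tag{1.49} \]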



Finally, let us mention the important Wronskian identity. Put formally e+ (2m, τ ) = eic+ bm bm¯ L¯ (2m, τ ), ρ2m e+ (2m + 1, τ ) + a¯ 2m e+ (2m, τ ) = bm bm¯ L (2m, τ ),

(1.50)

and L (2m + 1, τ ), e+ (2m + 1, τ ) = bm bm+1 ¯ L¯ (2m + 1, τ ). ρ2m+1 e+ (2m + 2, τ ) + a2m+1 e+ (2m + 1, τ ) = eic+ bm bm+1 ¯

(1.51)

Then τ¯ L¯ (n, τ¯ ) L (n, τ ) ¯

τ¯ L (n, τ¯ ) d log v(τ ) = . L (n, τ ) dτ

(1.52)

1.3. Inverse scattering: a brief discussion

The unimodular constant $e^{ic_+}$ and the pair $\alpha_+$ (1.43) are called the scattering data. A fundamental question is how to recover the CMV matrix from the scattering data. When can this be done? Do we have a uniqueness theorem? We say that the scattering data are in the Szegö over Blaschke class, $\alpha_+ \in A_{SB}(E)$, if

• $R_+$ has the properties (1.38),
• $\nu_+$ is a discrete measure supported on Z (1.24).

Let us point out that we did not even assume that the measure $\nu_+$ is finite. In short: to every set of scattering data of this class we can associate the system of reflection/transmission coefficients by (1.39)–(1.41), the dual measure $\nu_-$ (1.42) and the constant $e^{ic_-}$ (1.41) in such a way that there exists a CMV matrix from $A_{SB}(E)$ which satisfies Theorem 1.6 and Corollary 1.7 with these data. To this end we associate with $\alpha_+$ the Faddeev–Marchenko space $L^2_{\alpha_+}$, define a Hardy type subspace $\check H^2_{\alpha_+}$ in it, and, similarly to (1.18), construct the orthonormal basis (at this point the constant $e^{ic_+}$ is required). Then the multiplication operator (with respect to this basis) is the CMV matrix, and the claim of Theorem 1.6 is a Szegö kind result on the asymptotics of this orthonormal system.

For a brief explanation of the uniqueness problem we would like to use the following analogy. For the measure $d\mu = w(\tau)\, dm(\tau)$, with $\log w \in L^1$, we can define the Hardy space $\check H^2_\mu$ as the closure of $H^\infty$ (or polynomials) in the $L^2_\mu$-sense. On the other hand, let us define the outer function φ such that $|\phi|^2 = w$ and then define
\[ \hat H^2_\mu := \Bigl\{ f = \frac{g}{\phi} : g \in H^2 \Bigr\}. \tag{1.53} \]
According to the Beurling theorem [12] these two Hardy spaces are the same. But, as we shall see, in the Faddeev–Marchenko setting their counterparts $\check H^2_{\alpha_+}$ and $\hat H^2_{\alpha_+}$ do not necessarily coincide. Indeed, for the data $\alpha_+$ uniqueness in the inverse scattering takes place if and only if $\check H^2_{\alpha_+} = \hat H^2_{\alpha_+}$.


1.4. Hardy subspaces in the Faddeev–Marchenko space. Duality

Let $\alpha_+ \in A_{SB}(E)$. Define $T_\pm$, the dual data $e^{ic_-}$, $\alpha_- := \{R_-, \nu_-\}$, and set
\[ \begin{bmatrix} T_+ f^+ \\ T_- f^- \end{bmatrix}(\tau) = \begin{bmatrix} T_+ & 0 \\ R_+ & 1 \end{bmatrix}(\tau) \begin{bmatrix} f^+(\tau) \\ \bar\tau f^+(\bar\tau) \end{bmatrix} = \begin{bmatrix} 1 & R_- \\ 0 & T_- \end{bmatrix}(\tau) \begin{bmatrix} \bar\tau f^-(\bar\tau) \\ f^-(\tau) \end{bmatrix} \tag{1.54} \]
for $\tau \in \mathbb{T}$ and
\[ f^-(\zeta_k) = -\Bigl( \frac{1}{T_-} \Bigr)'(\zeta_k)\, \nu_+(\zeta_k)\, f^+(\zeta_k) \tag{1.55} \]
for $\zeta_k \in Z$. It is evident that in this way we define a unitary map from $L^2_{\alpha_+}$ to $L^2_{\alpha_-}$; in fact, due to (1.54),
\[ \frac{1}{2} \int_{\mathbb{T}} \begin{bmatrix} \overline{f^+(\tau)} & \overline{\bar\tau f^+(\bar\tau)} \end{bmatrix} \begin{bmatrix} 1 & \overline{R_+(\tau)} \\ R_+(\tau) & 1 \end{bmatrix} \begin{bmatrix} f^+(\tau) \\ \bar\tau f^+(\bar\tau) \end{bmatrix} dm = \frac{\|T_+ f^+\|^2 + \|T_- f^-\|^2}{2}, \tag{1.56} \]
where on the RHS we have the standard $L^2$ norm on $\mathbb{T}$.

The key point is duality not only between these two spaces but, what is more important, between the corresponding Hardy subspaces. Let us introduce two versions of Hardy subspaces (in general, they are not equivalent!). The first one, $\check H^2_{\alpha_+}$, is basically the closure of $H^\infty$ with respect to the given norm. More precisely, let $\mathcal{B} = \{B_N\}$, where $B_N$ is a divisor of B such that $B/B_N$ is a finite Blaschke product. Then
\[ f := B_N\, g, \qquad g \in H^\infty,\ B_N \in \mathcal{B}, \tag{1.57} \]
belongs to $L^2_{\alpha_+}$, and by $\check H^2_{\alpha_+}$ we denote the closure in $L^2_{\alpha_+}$ of functions of the form (1.57).

Let us point out that every element f of $\check H^2_{\alpha_+}$ is such that Of belongs to the standard $H^2$, see (1.56). Therefore, in fact, f(ζ) has an analytic continuation from $\mathbb{T}$ into the disk $\mathbb{D}$. Moreover, the value of f at $\zeta_k$ given by this continuation and the value $f(\zeta_k)$, which is defined for all $\zeta_k \in Z$ since f is a function from $L^2_{\alpha_+}$, coincide. The second space also consists of functions from $L^2_{\alpha_+}$ having an analytic continuation into $\mathbb{D}$.

Definition 1.13. A function $f \in L^2_{\alpha_+}$ belongs to $\hat H^2_{\alpha_+}$ if $g(\tau) := (B T_+ f)(\tau)$, $\tau \in \mathbb{D}$, belongs to the standard $H^2$ and
\[ f(\zeta_k) = \Bigl( \frac{g}{B T_+} \Bigr)(\zeta_k), \qquad \zeta_k \in Z, \]
where on the RHS g and $B T_+$ are defined by their analytic continuation into $\mathbb{D}$.

The following theorem clarifies the relations between the two Hardy spaces.

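Theorem 1.14. Let $f^+ \in L^2_{\alpha_+} \ominus \check H^2_{\alpha_+}$ and let $f^- \in L^2_{\alpha_-}$ be defined by (1.54), (1.55). Then $f^- \in \hat H^2_{\alpha_-}$. In short, we write
\[ \bigl( \hat H^2_{\alpha_-} \bigr)^+ = L^2_{\alpha_+} \ominus \check H^2_{\alpha_+}. \tag{1.58} \]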

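1.5. Main results on inverse scattering

Both $\check H^2_{\alpha_+}$ and $\hat H^2_{\alpha_+}$ are spaces of analytic functions in $\mathbb{D}$ with reproducing kernels, which we denote by $\check k_{\alpha_+,\zeta_0} = \check k_{\alpha_+}(\zeta, \zeta_0)$ and $\hat k_{\alpha_+,\zeta_0} = \hat k_{\alpha_+}(\zeta, \zeta_0)$ respectively. We put
\[ \check K_{\alpha_+,\zeta_0} = \frac{\check k_{\alpha_+,\zeta_0}}{\|\check k_{\alpha_+,\zeta_0}\|}, \qquad \hat K_{\alpha_+,\zeta_0} = \frac{\hat k_{\alpha_+,\zeta_0}}{\|\hat k_{\alpha_+,\zeta_0}\|}. \tag{1.59} \]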

Define the following shift operation on the scattering data

(n) (n) n α+ = R+ , ν+ := bn bn¯ R+ , bn bn¯ ν+ ,

n ∈ Z.

(1.60)

Theorem 1.15. Let Kα+ ,ζ0 denote one of the normalized kernel in (1.59). The system of functions +

e (n, τ ) =

bm bm¯ Kα+n ,¯ (τ )eic+ ,

n = 2m,

bm bm+1 Kα+n , (τ ), ¯

n = 2m + 1,

(1.61)

forms an orthonormal basis in Hˇ α2+ and Hˆ α2+ respectively, if n ∈ Z+ and in the whole L2α+ if n ∈ Z. With respect to this basis the multiplication operator by v(τ ) is the CMV matrix A ∈ ASB (E) with coefficients given by (1.29). Moreover, the scattering data given by Proposition 1.9 and the dual orthonormal system T− (τ )e− (−1 − n, τ ) := τ¯ e+ (n, τ¯ ) + R+ (τ )e+ (n, τ )

(1.62)

correspond to A in the sense of Theorem 1.6 and Corollary 1.7. An important observation is the following Proposition 1.16. Let A ∈ ASB (E), and let α+ and F + correspond to this matrix. Then Hˇ α2+ ⊂ F + l 2 (Z+ ) ⊂ Hˆ α2+ .

(1.63)

Due to this observation the direct scattering result Theorem 1.6 can be proved as a corollary of the inverse scattering Theorem 1.15. Concerning the uniqueness problem: Theorem 1.17. The scattering data α+ , eic+ determine A ∈ ASB (E) if and only if ¯ ) ¯ = kˇα± (, )kˇα −1 (, ∓

1 1 . |T± ()|2 (1 − ||2 )2

(1.64)



Corollary 1.18. Let $A \in A_{SB}(E)$ and W be its spectral density (1.25). If
\[ \int_E W^{-1}(t)\, dm(t) < \infty, \tag{1.65} \]
then there is no other CMV matrix of $A_{SB}(E)$ class corresponding to the same scattering data.

In fact, (1.65) means that $e^\pm(n, \tau) \in L^2$ for n = −1, 0 and, therefore, for all $n \in \mathbb{Z}$. In this case there exist the decompositions
\[ e^\pm(n, \tau) = \sum_{l \ge n} M^\pm_{l,n}\, e_{l,c_\pm}(\tau). \tag{1.66} \]
The following matrix
\[ M_+ = \begin{bmatrix} M^+_{0,0} & 0 & 0 & \cdots \\ M^+_{1,0} & M^+_{1,1} & 0 & \cdots \\ M^+_{2,0} & M^+_{2,1} & M^+_{2,2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \tag{1.67} \]
yields the transformation (Gelfand–Levitan–Marchenko) operator, acting in $l^2(\mathbb{Z}_+)$. Similarly we define $M_- : l^2(\mathbb{Z}_-) \to l^2(\mathbb{Z}_-)$ (for details see Sect. 8). Note that under condition (1.65) they are not necessarily bounded.

We present necessary and sufficient conditions for the scattering data to determine the CMV matrix and for both transformation operators $M_\pm$ to be bounded. For $\theta \in \Theta_{SB}(E)$ consider the following two conditions:

(i) for all arcs I ⊂ E
\[ \sup_I\, \langle w \rangle_I\, \langle w^{-1} \rangle_I < \infty, \tag{1.68} \]
where
\[ w(t) := \frac{1 - |\theta(t)|^2}{|1 - \theta(t)|^2} \qquad \text{and} \qquad \langle w \rangle_I := \frac{1}{|I|} \int_I w(t)\, dm(t). \]

(ii) for all arcs of the form I = (eiξ , eiξ0 ) or I = (e−iξ0 , e−iξ ), I ⊂ T \ E, 1 d log v iηk sup e < ∞, √ |I Ik |wIk d log θ I iη e

k ∈Y ∩I

where Y = {eiηk ∈ T \ E: θ (eiηk ) = 1}, and iξ i(2ξ −η ) (e 0 , e 0 k ), Ik = (e−i(2ξ0 −ηk ) , e−iξ0 ),

ηk > 0, ηk < 0.

(1.69)
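Condition (i) asks that the product of the arc averages of w and of 1/w stay bounded over all arcs. As a rough numerical illustration only, the following minimal sketch evaluates the largest such product for a weight sampled at equally spaced points of the circle. The function names, the discretisation, and the model weight |sin(πx)|^a are assumptions made for this illustration and do not come from the paper; the sketch also ranges over all arcs of the sampled circle rather than only arcs inside E.

```python
import math


def arc_average_product(w):
    """Return the max over arcs I of <w>_I * <1/w>_I for a sampled weight.

    w: positive samples of a weight at equally spaced points of the circle.
    An "arc" here is any run of consecutive samples, wrapping around.
    """
    n = len(w)
    inv = [1.0 / x for x in w]
    worst = 0.0
    for start in range(n):
        sum_w = 0.0
        sum_inv = 0.0
        for length in range(1, n + 1):
            idx = (start + length - 1) % n
            sum_w += w[idx]
            sum_inv += inv[idx]
            # product of the two arc averages for the current arc
            worst = max(worst, (sum_w / length) * (sum_inv / length))
    return worst


def power_weight(n, exponent):
    """Hypothetical model weight |sin(pi x)|**exponent at x = (k + 1/2)/n."""
    return [abs(math.sin(math.pi * (k + 0.5) / n)) ** exponent
            for k in range(n)]


if __name__ == "__main__":
    for a in (0.0, 0.5, 0.9):
        print(a, arc_average_product(power_weight(128, a)))
```

By the Cauchy–Schwarz inequality each product of averages is at least 1, with equality for constant weights, so the returned value measures how far the sampled weight is from being flat on arcs.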



Theorem 1.19. Let $A \in A_{SB}(E)$ with the associated Schur functions $\theta_\pm$ and the scattering data $\{R_+, \nu_+, e^{ic_+}\}$. Then the following statements are equivalent.

(1) The Schur functions $\theta_\pm$ satisfy (i), (ii).
(2) The scattering data $\{R_+, \nu_+, e^{ic_+}\}$ determine a CMV matrix of $A_{SB}(E)$ class uniquely and both related transformation operators are bounded.

The importance of the A2 condition in inverse scattering/spectral problems was mentioned e.g. in [22, Chapter 2, Section 4] and [1]. In Section 9 we propose the following sufficient condition given directly in terms of the scattering data. With $\nu_+$ we associate the measure $\tilde\nu_+$ by
\[ \tilde\nu_+(\zeta_k) = \frac{1}{|B'(\zeta_k)|^2\, \nu_+(\zeta_k)} \tag{1.70} \]
and with the reflection coefficient $R_+$ the Szegö function
\[ \tilde R_+(\tau) = R_+(\tau)\, B(\tau)^2. \tag{1.71} \]

Theorem 1.20. Let $\tilde\nu_+$ be a Carleson measure in $\mathbb{D}$ and let $\tilde R_+$ satisfy the following modification of the A2 condition
\[ \sup_I \frac{1}{|I|} \int_I \frac{|\tilde R_+ - \langle \tilde R_+ \rangle_I|^2 + \bigl(1 - |\langle \tilde R_+ \rangle_I|^2\bigr)}{1 - |\tilde R_+|^2}\, dm < \infty. \tag{1.72} \]
Then the data $\alpha_+ = \{R_+, \nu_+\}$ determine the CMV matrix uniquely for any $e^{ic_+}$. Moreover, both GLM transformation operators are bounded.

The theorem shows that the class of data, compared with the classical Faddeev–Marchenko one, is widely extended (indeed an infinite set of mass points is allowed and the reflection coefficient is very far from being necessarily a continuous function).

Remark 1.21. Concerning (1.72): the criteria for the strong regularity of J-inner functions and γ-generating matrices, see [2], look similar.

Remark 1.22. As we clarified in a discussion with A. Kheifets, our condition is optimal among all conditions on the scattering data which have the following two properties: (a) the condition is stable with respect to the involution $R_+(\tau) \to -R_+(\tau)$; (b) the assumption on $R_+$ depends on the support of $\nu_+$ but not on the corresponding masses.

2. Proof of the duality theorem

The main goal of the lemma below is to clarify notation that could be a bit confusing. We believe that the diagram given in it, and the proof, will help to avoid misunderstanding: the ±-mappings $L^2_{\alpha_+} \overset{\pm}{\longleftrightarrow} L^2_{\alpha_-}$, defined by (1.54), (1.55), actually depend on the data $\{R_\pm, \nu_\pm\}$, although we do not indicate this dependence explicitly.


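Lemma 2.1. Let w(ζ) be an inner meromorphic function in $\mathbb{D}$ such that $w(\zeta_k) \ne 0$, $w(\zeta_k) \ne \infty$ for all $\zeta_k \in Z$. Put $w_*(\zeta) := \overline{w(\bar\zeta)}$. The following diagram is commutative
\[ \begin{array}{ccc} L^2_{\{w w_* R_+,\; w w_* \nu_+\}} & \xrightarrow{\;\; w \;\;} & L^2_{\alpha_+} \\[4pt] {\scriptstyle +}\,\Big\updownarrow\,{\scriptstyle -} & & {\scriptstyle +}\,\Big\updownarrow\,{\scriptstyle -} \\[4pt] L^2_{\{w^{-1} w_*^{-1} R_-,\; w^{-1} w_*^{-1} \nu_-\}} & \xrightarrow{\;\; w_*^{-1} \;\;} & L^2_{\alpha_-} \end{array} \tag{2.1} \]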

Here the horizontal arrows are related to the unitary multiplication operators and the vertical arrows are related to two different ±-duality mappings. Proof. Note that both w and w∗−1 are well defined on T ∪ Z. Evidently, wf ∈ L2α+ means that f ∈ L2{ww∗ R+ ,ww∗ ν+ } . Also, since |w(τ )| = 1, τ ∈ T, we have that {w −1 w∗−1 R− , w −1 w∗−1 ν− } are minus-scattering data for {ww∗ R+ , ww∗ ν+ } if α− corresponds to α+ . In other words the T± -functions remain the same for both sets of scattering data. Then we use definitions (1.54), (1.55). 2 Proof of Theorem 1.14. Let us mention that f + ∈ L2α+ implies (T− f − )(τ ) = R+ (τ )f + (τ ) + τ¯ f + (τ¯ ) ∈ L2 ,

τ ∈ T.

Since

f + , Bh α = R+ (τ )f + (τ ) + τ¯ f + (τ¯ ), τ¯ B(τ¯ )h(τ¯ ) , +

h ∈ H 2,

it follows from f + ∈ L2α+ Hˇ α2+ that (BT− f − )(τ ) = g(τ ) := B(τ ) R+ (τ )f + (τ ) + τ¯ f + (τ¯ ) ∈ H 2 . Now we calculate the scalar product

f +,

B(τ ) τ − ζk

= f + (ζk )B (ζk )ν+ (ζk ) + BT− f − , α+

1 1 − τ ζ¯k

= f + (ζk )B (ζk )ν+ (ζk ) + g(ζk ) = 0. Therefore, by (1.55) we get −

f (ζk ) =

g (ζk ), BT−

ζk ∈ Z.

For the converse direction we calculate the scalar product of f + ∈ Hˆ α2+ with a function of the form BN g, BN ∈ B, g ∈ H 2 and use the fact that BT− f − ∈ H 2 . 2



3. Reproducing kernels We prove several propositions concerning specific properties of the reproducing kernels in Hˇ α2+ and Hˆ α2+ . The multiplication operator by v is playing an essential role in these constructions. Lemma 3.1. Let kˇα+ (ζ, ) and kˆα+ (ζ, ) denote the reproducing kernels of the spaces Hˇ α2+ and Hˆ α2+ respectively. Then − kˇα+ (ζ, ) =

ˆ ¯ 1 kα−−1 (ζ, ) 1 − ζ , 2 ¯ kˆ −1 (, (ζ − )(1 ¯ − || ) T− () ¯ ) ¯

(3.1)

α−

and, therefore, kˇα+ (, )kˆα −1 (, ¯ ) ¯ = −

1 1 . |T− ()| ¯ 2 (1 − ||2 )2

(3.2)

Proof. First we note that the following one-dimensional spaces coincide:

− ˆ ¯ . (kˇα+ (ζ, ) = b−1 ¯ kα −1 (ζ, ) −

It follows immediately from Theorem 1.14, but let us give a formal proof. Starting with the orthogonal decomposition

kˇα+ (ζ, ) = Hˇ α2+ b Hˇ α21

+

we have − − −

kˇα+ (ζ, ) = Hˇ α2+ b Hˇ α21 , +

or, due to (2.1),

− 2 − − ˇ kˇα+ (ζ, ) = Hˇ α2+ b−1 . ¯ Hα 1 +

Now we use Theorem 1.14

2 − ˆ2 kα+ (ζ, ) = L2α− Hˆ α2− b−1 ¯ L −1 H −1 2 ˆ = b−1 ¯ Hα −1 −

b¯ Hˆ α2−

.

α−

α−

Thus − ˆ kα+ (ζ, ) = Cb−1 ¯ ¯ kα −1 (ζ, ). −

(3.3)



The essential part of the lemma deals with the constant C. We calculate the scalar product kˇα+ (τ, ),

B 1 − τ ¯

. α+

B 2 2 On the one hand, since 1−ζ ¯ belongs to the intersection of Lα+ with H , we can use the reproducing property of kˇα+ :

kˇα+ (τ, ),

B 1 − τ ¯

= α+

B() B() ¯ = . 1 − ||2 1 − ||2

(3.4)

On the other hand we can reduce the given scalar product to the scalar product in the standard H 2 . Since B(ζk ) = 0, the ν-component vanishes and we get 1 2

!

1 R+

ˇ B(τ ) " kα+ (τ, ) B¯ R¯ + ¯ (τ ) = T− (τ )(kˇα+ (τ, ))− , . , 1−τ B( τ ¯ ) 1 τ − ¯ τ¯ kˇα+ (τ¯ , ) t−¯

Substituting here (3.3) we get

¯ b¯ (τ ) C (BT− )(τ )kˆα −1 (τ, ), −

1 . τ − ¯

1 Since (BT− )(ζ )kˆα −1 (ζ, ) ¯ belongs to H 2 and b¯ (ζ ) ζ −1 ¯ = eic 1−ζ is collinear to the reproduc− ing kernel here, we get recalling (3.4)

e−ic C(BT− )() ¯ kˆα −1 (, ¯ ) ¯ = −

B() ¯ . 1 − ||2

Thus (3.1) is proved. Comparing the norms of that vectors and taking into account that the −-map is an isometry we get (3.2). 2 Consider the multiplication operator by v −1 , acting in + L2α+ = Hˆ α2− ⊕ Hˇ α2+ .

(3.5)

Lemma 3.2. The multiplication operator by v −1 acts as a unitary operator from kˆα+− (ζ, ) ⊕ Hˇ α2+

(3.6)

¯ ⊕ Hˇ α2+ . kˆα+− (ζ, )

(3.7)

to


Proof. It is evident that the multiplication by v −1 =

b¯ b


acts from

f ∈ Hˆ α2− : f () = 0 = b Hˆ α21

−

to

f ∈ Hˆ α2− : f () ¯ = 0 = b¯ Hˆ α21 . −

Therefore it acts in their orthogonal complements (3.6), (3.7).

2

Recall the definition of the characteristic function of a unitary node and its functional model, see e.g. [16,24] and references therein. Let U be a unitary operator acting from K ⊕ E1 to K ⊕ E2 , where K, E1 , E2 are Hilbert spaces. We assume that E1 and E2 are finite-dimensional spaces (actually, in this section we need dim E1 = dim E2 = 1). The characteristic function is defined by Θ(w) := PE2 U (IK⊕E1 − wPK U )−1 | E1 .

(3.8)

It is a contractive-valued operator function holomorphic in the unit disk. We make the specific assumption that Θ(w) has an analytic continuation in the exterior of the unit disk through a certain arc (a, b) ⊂ T due to the symmetry principle: −1 ∗ 1 Θ(w) = Θ . w¯ For f ∈ K define F (w) := PE2 U (I − wPK U )−1 f.

(3.9)

This E2 -valued holomorphic vector function belongs to the functional space KΘ with the following properties. • F (w) ∈ H 2 (E2 ), moreover it has an analytic continuation through the arc (a, b). • F∗ (w) := Θ ∗ (w)F ( w1¯ ) ∈ H−2 (E1 ). ∗ • For almost every w ∈ T the vector FF∗ (w) belongs to the image of the operator ΘI ΘI (w), and therefore the scalar product ! " F∗ I Θ ∗ [−1] F∗ , Θ I F F E1 ⊕E2

has sense and does not depend on the choice of the preimage (the first term in the above scalar product). Moreover " ! I Θ ∗ [−1] F∗ F∗ , dm < ∞. (3.10) Θ I F F T

E1 ⊕E2

The integral in (3.10) represents the square of the norm of F in KΘ .



Note that in the model space PK U |K becomes a certain “standard” operator f → F (w)

⇒

PK Uf →

F (w) − F (0) , w

(3.11)

see (3.9). The following simple identity is a convenient tool in the forthcoming calculation. Lemma 3.3. For a unitary U : K ⊕ E1 → K ⊕ E2 U ∗ PE2 U (I − wPK U )−1 = I + (w − U ∗ )PK U (I − wPK U )−1 .

(3.12)

Proof. Since IK⊕E2 = PK + PE2 and U is unitary we have U ∗ PE2 U = (I − wPK U ) + (w − U ∗ )PK U. Then we multiply this identity by (I − wPK U )−1 .

2
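Identity (3.12) and the definition (3.8) can be checked numerically in the smallest case dim K = dim E1 = dim E2 = 1, where U is a 2×2 unitary matrix. The sketch below is only an illustration under that assumption: the parametrisation of U, the choice of K as the first coordinate and E1 = E2 as the second, and all function names are hypothetical and not taken from the paper.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
P_K = [[1, 0], [0, 0]]  # projection onto K (first coordinate, assumed)
P_E = [[0, 0], [0, 1]]  # projection onto E1 = E2 (second coordinate, assumed)


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def unitary(t, alpha, beta):
    """A 2x2 unitary matrix built from three real parameters."""
    c, s = math.cos(t), math.sin(t)
    return [[cmath.exp(1j * alpha) * c, -cmath.exp(1j * beta) * s],
            [cmath.exp(-1j * beta) * s, cmath.exp(-1j * alpha) * c]]


def resolvent(u, w):
    """(I - w P_K U)^{-1}."""
    return inverse(add(I2, scale(-w, mul(P_K, u))))


def lemma_residual(u, w):
    """Largest entry of LHS - RHS in identity (3.12)."""
    r = resolvent(u, w)
    us = adjoint(u)
    lhs = mul(mul(mul(us, P_E), u), r)
    w_minus_us = add(scale(w, I2), scale(-1, us))
    rhs = add(I2, mul(mul(mul(w_minus_us, P_K), u), r))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


def characteristic(u, w):
    """Theta(w) = P_E U (I - w P_K U)^{-1} restricted to E1 (a scalar here)."""
    return mul(u, resolvent(u, w))[1][1]
```

For |w| < 1 the residual should vanish up to rounding, and |Θ(w)| should not exceed 1, in line with the statement that the characteristic function is contractive in the unit disk.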

Theorem 3.4. Let e1 , e2 be the normalized vectors of the one-dimensional spaces (3.6) and (3.7)

e1 (ζ ) =

ˇ + ¯ T+ () ¯ kˆα− (ζ, ) 1 kα+−1 (ζ, ) , = −i b¯ kˇ −1 (, |T+ ()| ¯ ˆ ¯ ) ¯ (, ) k α− α +

e2 (ζ ) =

ˇ + ¯ T+ () kˆα− (ζ, ) 1 kα+−1 (ζ, ) . =i b kˇ −1 (, ) |T+ ()| ˆ ( , ¯ ) ¯ k α− α

(3.13)

+

Then the reproducing kernel of Hˇ α2+ is of the form (ve2 )(ζ )(ve2 )(ζ0 ) − e1 (ζ )e1 (ζ0 ) . kˇα+ (ζ, ζ0 ) = 1 − v(ζ )v(ζ0 )

(3.14)

Proof. First, we are going to find the characteristic function of the multiplication operator by v −1 with respect to decompositions (3.6) and (3.7) and the corresponding functional representation of this node. By (3.13) we fixed ‘basises’ in the one-dimensional spaces. So, instead of the operator we get the matrix, in fact the scalar function θ (w): Θ(w)e1 := PE2 U (I − wPK U )−1 e1 = e2 θ (w).

(3.15)

Let us substitute (3.15) into (3.12) v(ζ )e2 (ζ )θ (w) = e1 (ζ ) + w − v(ζ ) PK U (I − wPK U )−1 e1 (ζ ).

(3.16)



Recall an important property of kˆα+− (ζ, ): it has an analytic continuation in D with the only pole at ¯ (see Lemma 3.1). Therefore all terms in (3.16) are analytic in ζ and we can chose ζ such that v(ζ ) = w. Then we obtain the characteristic function in terms of the reproducing kernels θ v(ζ ) =

kˇα −1 (ζ, ) ¯ e1 (ζ ) = + . v(ζ )e2 (ζ ) kˇ −1 (ζ, ) α

(3.17)

+

Similarly for f ∈ K = Hˇ α2+ we define the scalar function F (w) by PE2 U (I − wPK U )−1 f = e2 F (w).

(3.18)

Using again (3.12) we get v(ζ )e2 (ζ )F (w) = f (ζ ) + w − v(ζ ) PK U (I − wPK U )−1 f (ζ ). Therefore, F v(ζ ) =

f (ζ ) . v(ζ )e2 (ζ )

(3.19)

Now we are in a position to get (3.14). Indeed, by (3.18) and (3.19) we proved that the vector −1 ∗ PK I − v(ζ0 )U ∗ PK U e2 v(ζ0 )e2 (ζ0 ) is the reproducing kernel of K = Hˇ α2+ with respect to ζ0 , |v(ζ0 )| < 1. Using the Darboux identity PE2 U (I − wPK U )−1 PK (I − w¯ 0 U ∗ PK )−1 U ∗ | E2 =

I − Θ(w)Θ ∗ (w0 ) 1 − w w¯ 0

(in the given setting it is a simple and pleasant exercise) we obtain I − θ (v(ζ ))θ (v(ζ0 )) v(ζ0 )e2 (ζ0 ) kˇα+ (ζ, ζ0 ) = v(ζ )e2 (ζ ) 1 − v(ζ )v(ζ0 )

(3.20)

for |v(ζ )| < 1, |v(ζ0 )| < 1. By (3.17) we have (3.14) that, by analyticity, holds for all |ζ | < 1, |ζ0 | < 1. 2 Corollary 3.5. The following Wronskian-kind identity is satisfied for the reproducing kernels (T− e2− )(ζ ) (T− e1− )(ζ ) = − log v(ζ ) , |ζ | < 1. (3.21) e (ζ ) e1 (ζ ) 2 Proof. We multiply kˇα−+ (ζ, ζ¯0 ) by bζ0 (ζ ) and calculate the resulting function of ζ at ζ = ζ0 . By (3.1) we get

bζ0 (ζ )kˇα−+ (ζ, ζ¯0 ) ζ =ζ = eic 0

1 . T− (ζ0 )(1 − |ζ0 |2 )

(3.22)



Now we make the same calculation but using representation (3.14). We have e1− (ζ ) v(ζ )e2− (ζ ) , ˇkα− (ζ, ζ¯0 ) = −v(ζ0 ) + v(ζ ) − v(ζ0 ) e1 (ζ¯0 ) v(ζ¯0 )e2 (ζ¯0 ) or, after multiplication by bζ0 (ζ ),

bζ0 (ζ )kˇα−+ (ζ, ζ¯0 ) ζ =ζ = eic 0

v(ζ )e2− (ζ ) −v(ζ0 ) v (ζ0 )(1 − |ζ0 |2 ) e1 (ζ¯0 )

. v(ζ¯0 )e2 (ζ¯0 ) e1− (ζ )

In combination with (3.22), we get −

v(ζ0 )e2− (ζ0 ) v (ζ0 ) = v(ζ0 )T− (ζ0 ) e1 (ζ¯0 )

. −1 v (ζ0 )e2 (ζ¯0 ) e1− (ζ0 )

Due to the symmetry kˆα− (ζ, ζ0 ) = kˆα− (ζ¯ , ζ¯0 ), we have e2 (ζ¯0 ) = e1 (ζ0 ). Thus (3.21) is proved. 2 Corollary 3.6. Let τ ∈ T, then e2 (τ )2 − e1 (τ )2 = d log v(τ ) . d log τ

(3.23)

Proof. All terms of (3.21) have boundary values. Recall that − (T− e1,2 )(τ ) = (R+ e1,2 )(τ ) + τ¯ e1,2 (τ¯ ),

Then use again the symmetry of the reproducing kernel.

τ ∈ T.

2

4. A recurrence relation for the reproducing kernels and the Schur parameters

Let
\[
K_\alpha(\zeta,\zeta_0) := \frac{k_\alpha(\zeta,\zeta_0)}{\sqrt{k_\alpha(\zeta_0,\zeta_0)}}, \tag{4.1}
\]
where $k_\alpha(\zeta,\zeta_0)$ denotes one of the reproducing kernels $\hat k_{\alpha_\pm}(\zeta,\zeta_0)$ or $\check k_{\alpha_\pm}(\zeta,\zeta_0)$.

Theorem 4.1. Both systems

Kα (ζ, ), b (ζ )Kα 1 (ζ, ) and

Kα (ζ, ), b¯ (ζ )Kα 1 (ζ, ) form an orthonormal basis in the two-dimensional space spanned by Kα (ζ, ) and Kα (ζ, ). Moreover



Kα (ζ, ) = a(α)Kα (ζ, ) + ρ(α)b (ζ )Kα 1 (ζ, ), Kα (ζ, ) = a(α)Kα (ζ, ) + ρ(α)b¯ (ζ )Kα 1 (ζ, ),

(4.2)

where a(α) = a =

Kα (, ) , Kα (, )

ρ(α) = ρ =

1 − |a|2 .

(4.3)

Proof. The first claim is evident, therefore Kα (ζ, ) = c1 Kα (ζ, ) + c2 b (ζ )Kα 1 (ζ, ). Putting ζ = we get c1 = a. Due to orthogonality we have 1 = |a|2 + |c2 |2 . Now, put ζ = . Taking into account that Kα (, ) = K α (, ) and that by normalization b () > 0 we proved that c2 , being positive, is equal to 1 − |a|2 . Note that simultaneously we proved that ρ(α) = b ()

Kα 1 (, ) . Kα (, )

2

Corollary 4.2. A recurrence relation for the reproducing kernels generated by a shift of the scattering data is of the form b (ζ ) Kα 1 (ζ, ),

−Kα 1 (ζ, ) = Kα (ζ, ),

−Kα (ζ, )

1 ρ

1 a¯

a 1

v 0

0 . 1

(4.4)

Proof. Recalling v = b /b¯ , we write b (ζ ) Kα 1 (ζ, ), Then, use (4.2).

−Kα 1 (ζ, ) = b¯ (ζ )Kα 1 (ζ, ),

v −b (ζ )Kα 1 (ζ, ) 0

0 . 1

2

Corollary 4.3. Let θα (v) :=

Kα (ζ, ) . Kα (ζ, )

Then the Schur parameters of the function eic θα (v), are

∞ eic a α n n=0 .

(4.5)
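The Schur parameters in Corollary 4.3 are the numbers produced by the classical Schur algorithm: for a Schur function $\theta$ one sets $\gamma_0 = \theta(0)$, passes to $\theta_1(v) = \dfrac{\theta(v)-\gamma_0}{v\,(1-\bar\gamma_0\theta(v))}$, and repeats. The following Python sketch illustrates this on truncated Taylor series. The function names and the power-series representation are illustrative assumptions and are not taken from the paper.

```python
def series_mul(a, b, n):
    """Return the first n Taylor coefficients of the product a(z) * b(z)."""
    out = [0j] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[: n - i]):
            out[i + j] += ai * bj
    return out


def series_inv(a, n):
    """Return the first n Taylor coefficients of 1 / a(z); requires a[0] != 0."""
    inv = [0j] * n
    inv[0] = 1 / a[0]
    for k in range(1, n):
        s = sum(a[j] * inv[k - j] for j in range(1, min(k, len(a) - 1) + 1))
        inv[k] = -s / a[0]
    return inv


def schur_parameters(coeffs, count):
    """Run up to `count` steps of the Schur algorithm on a truncated series."""
    f = [complex(c) for c in coeffs]
    params = []
    for _ in range(count):
        gamma = f[0]                      # gamma_n = f_n(0)
        params.append(gamma)
        n = len(f) - 1                    # each step consumes one coefficient
        if abs(gamma) >= 1 or n == 0:
            break
        num = f[1:]                       # (f_n(z) - gamma_n) / z
        den = [-gamma.conjugate() * c for c in f[:n]]
        den[0] += 1                       # 1 - conj(gamma_n) * f_n(z)
        f = series_mul(num, series_inv(den, n), n)
    return params


def from_parameters(params, terms):
    """Inverse direction: Taylor series of the function with these parameters."""
    f = [complex(params[-1])] + [0j] * (terms - 1)
    for gamma in reversed(params[:-1]):
        gamma = complex(gamma)
        zf = [0j] + f[: terms - 1]        # z * f_{n+1}(z)
        num = list(zf)
        num[0] += gamma                   # gamma_n + z f_{n+1}(z)
        den = [gamma.conjugate() * c for c in zf]
        den[0] += 1                       # 1 + conj(gamma_n) z f_{n+1}(z)
        f = series_mul(num, series_inv(den, terms), terms)
    return f
```

As a usage check, building a series with `from_parameters([0.5, -0.3 + 0.2j, 0.1], 8)` and feeding it to `schur_parameters(..., 3)` should return the same three numbers up to rounding.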



Proof. Let us note that (4.4) implies θα (v) =

a(α) + vθα 1 (v) 1 + a(α)vθα 1 (v)

and that |a(α)| < 1. Then we iterate this relation. Also, multiplication by eic ∈ T of a Schur class function evidently leads to multiplication by eic of all Schur parameters. 2 Theorem 4.4. The multiplication operator with respect to the basis (1.61) is CMV. Proof. Recall (1.12), from which we can see that the decomposition of the vector v(ζ )Kα (ζ, ) is of the form v(ζ )Kα (ζ, ) ¯ = c0

Kα −2 (ζ, ) ¯ K −1 (ζ, ) + c1 α + c2 Kα (ζ, ) ¯ + c3 b¯ (ζ )Kα 1 (ζ, ). b (ζ )b¯ (ζ ) b (ζ )

Multiplying by the denominator b (ζ )b¯ (ζ ) we get b2 (ζ )Kα (ζ, ) ¯ = c0 Kα −2 (ζ, ) ¯ + c1 Kα −1 (ζ, )b¯ (ζ ) + c2 Kα (ζ, )b ¯ (ζ )b¯ (ζ ) + c3 Kα 1 (ζ, )b (ζ )b2¯ (ζ ).

(4.6)

First we put ζ = . ¯ By the definition of ρ(α) we have c0 = b2 () ¯

¯ ) ¯ Kα (, = ρ α −1 ρ α −2 . Kα −2 (, ¯ ) ¯

Putting ζ = in (4.6) and using the definition of a(α), we have c1 = −c0

a(α −2 ) Kα −2 (, ) ¯ = −ρ(αμ)ρ α −2 = −ρ α −1 a α −2 . −2 Kα −1 (, )b¯ () ρ(α )

Doing in the same way we can find a representation for c2 that would involve derivatives of the reproducing kernels. However, we can find c2 in terms of a and ρ calculating the scalar product c2 = b2 (ζ )Kα (ζ, ), ¯ b (ζ )b¯ (ζ )Kα (ζ, ) ¯ . Since b (ζ ) is unimodular, using (4.2), we get c2 =

¯ − a(α −1 )Kα −1 (ζ, ) Kα −1 (ζ, ) , b¯ (ζ )Kα (ζ, ) . ρ(α −1 )

Recall that kα (ζ, ) = Kα (ζ, )Kα (, ) is the reproducing kernel. Thus c2 = −

a(α −1 ) −1 a(α −1 ) b¯ ()Kα (, ) = − ρ α a(α) = −a α −1 a(α). −1 −1 ρ(α ) Kα −1 (, ) ρ(α )



And, similar, c3 =

¯ − a(α −1 )Kα −1 (ζ, ) 2 Kα −1 (ζ, ) , b¯ (ζ )Kα 1 (ζ, ) . ρ(α −1 )

Thus c3 = −

a(α −1 ) −1 a(α −1 ) b2¯ ()Kα 1 (, ) = − ρ α ρ(α) = −a α −1 ρ(α). −1 −1 ρ(α ) Kα −1 (, ) ρ(α ) K

(ζ,)

To find the decomposition of the vector v(ζ ) αb−1 (ζ ) is even simpler. Note that all other columns of the CMV matrix, starting from these two, can be obtained by the two step shift of the scattering data. □

5. From the spectral data to the scattering data: a special representation of the Schur function

In this section we use Theorem D of [34], see also [32]. For the reader's convenience we formulate it here.

Theorem 5.1. Let $r(v)$ be a function meromorphic in $\Omega$ with the property
\[
\frac{r(v(\zeta)) + \overline{r(v(\zeta))}}{i(\zeta - \bar\zeta)} \ge 0. \tag{5.1}
\]

If the poles $\{t_j\}$ of $r(v)$ (due to (5.1) they lie on $\mathbb T \setminus E$) satisfy the Blaschke condition (1.24), then $r(v(\zeta))$ is of bounded characteristic in $\mathbb D$, and in addition the inner (in the Beurling sense) factor of $r(v(\zeta))$ is a quotient of Blaschke products, i.e., it does not have a singular inner factor.

Note the evident fact: if $r(v)$ is of bounded characteristic in $\Omega$ then for the poles $\{t_j\}$ the Blaschke condition (1.24) holds.

Proposition 5.2. If $A$ belongs to $A_{SB}(E)$ then the associated Schur functions $\theta_\pm$ are of bounded characteristic in $\Omega$ and
\[
\log\bigl(1 - |\theta_\pm(v(\tau))|^2\bigr) \in L^1. \tag{5.2}
\]

Proof. We use the formula (see (1.22))
\[
r_A(v) := \Bigl\langle 0 \Bigm| \frac{A+v}{A-v} \Bigm| 0 \Bigr\rangle = \frac{1 + v\theta_+(v)\theta_-(v)}{1 - v\theta_+(v)\theta_-(v)}.
\]
Since $A \in A_{SB}(E)$ and $r_A(v)$ is a resolvent function, its poles satisfy the Blaschke condition. Now we note that
\[
r_+(v) + r_-(v) = \frac{1 + v\theta_+(v)}{1 - v\theta_+(v)} + \frac{1 + \theta_-(v)}{1 - \theta_-(v)} = \frac{2\bigl(1 - v\theta_+(v)\theta_-(v)\bigr)}{\bigl(1 - v\theta_+(v)\bigr)\bigl(1 - \theta_-(v)\bigr)}.
\]
Since zeros and poles of the last function interlace we get that poles of $r_\pm$ also satisfy the Blaschke condition. By Theorem 5.1 they are of bounded characteristic in $\Omega$. Hence $\theta_\pm$ are also in this class. By (1.23) we get (5.2). □

Definition 5.3. A function $\theta(v)$ belongs to the class $\Theta_{SB}(E)$ if it is a function of bounded characteristic in $\Omega$ with the following properties:
\[
\frac{1 - \theta(v(\zeta))\overline{\theta(v(\zeta))}}{i(\zeta - \bar\zeta)} \ge 0 \tag{5.3}
\]
and
\[
\log\bigl(1 - |\theta(v(\tau))|^2\bigr) \in L^1. \tag{5.4}
\]

Denote
\[
\mathbb T_- = \{\tau \in \mathbb T: \operatorname{Im}\tau < 0\}, \qquad \mathbb D_- = \{\zeta \in \mathbb D: \operatorname{Im}\zeta < 0\}.
\]

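Proposition 5.4. Functions of the class $\Theta_{SB}(E)$ possess the following parametric representation
\[
\theta(v(\zeta)) = e^{ic}\prod_{k}\frac{\bar\lambda_k}{\lambda_k}\,\frac{\lambda_k - \zeta}{\bar\lambda_k - \zeta}\,\frac{1 - \lambda_k\zeta}{1 - \bar\lambda_k\zeta}\;
\exp\Bigl\{-\int_{\mathbb T_-}\Bigl(\frac{\tau+\zeta}{\tau-\zeta} - \frac{\bar\tau+\zeta}{\bar\tau-\zeta}\Bigr)\bigl(d\mu(\tau) - \log\rho(\tau)\,dm(\tau)\bigr)\Bigr\}, \tag{5.5}
\]
where
• $\{\lambda_k\} \subset \mathbb D_-$ is a Blaschke sequence,
• $\mu$ is a singular measure on the (open) set $\mathbb T_-$,
• $\rho$, $0 \le \rho \le 1$, is such that
\[
\int_{\mathbb T_-}\log\bigl\{(1 - \rho(\tau))\rho(\tau)\bigr\}\,dm(\tau) > -\infty. \tag{5.6}
\]

Proof. First we note the symmetry
\[
\theta(v(\bar\zeta)) = \frac{1}{\overline{\theta(v(\zeta))}} \tag{5.7}
\]
and then use the parametric representation of functions of bounded characteristic and (5.4). In the opposite direction, to prove (5.3) we can use directly representation (5.5) or note that $\theta(v(\zeta))$ is of the Smirnov class in the domain $\mathbb D_-$ and then use the maximum principle. □

Remark 5.5. $\theta_\pm \in \Theta_{SB}(E)$ implies (1.24) and (1.26), but the spectral measure $d\Sigma$ is not necessarily absolutely continuous on $E$.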



Example 5.6. On the other hand for every θ+ ∈ ΘSB (E) there exists θ− ∈ ΘSB (E) such that the associated A belongs to ASB (E). Put, for instance, 1 − ζ ¯ , θ− v(ζ ) = eic− 1 − ζ

(5.8)

that corresponds to the constant Schur parameters (see Theorem 1.2). Since for every > 0 sup

{ζ ∈D− : Im ζ 0, L () > 0,

(5.11)



such that L¯ (ζ ) and L (ζ ) are of Smirnov class in D with the mutually simple inner parts, and the (Wronskian) identity L (τ ) L¯ (τ ) d log v(τ ) (5.12) L (τ ) L (τ ) = d log τ , τ ∈ T, ¯ holds. Proof. By (5.12) and (5.11) we have L (τ )2 1 − θ v(τ ) 2 = d log v(τ ) . d log τ

(5.13)

Due to (5.4) we can define the outer function O such that O (τ )2 = 1 − θ v(τ ) 2 −1 d log v(τ ) , d log τ and the outer function O¯ such that O¯ (τ )2 = O (τ )2 θ v(τ ) 2 ,

O () > 0,

O¯ () ¯ > 0.

We represent the inner part of the function θ (v(ζ )) as the ratio of the inner holomorphic functions I¯ (ζ ) , I (ζ )

I¯ () ¯ > 0, I () > 0.

Finally we put L¯ (ζ ) := I¯ (ζ )O¯ (ζ ),

L (ζ ) := I (ζ )O (ζ ).

Then the left- and right-hand sides of (5.11) coincide up to an unimodular constant and this defines eic . By (5.13) relation (5.12) also holds. It is evident that L¯ (ζ ) and L¯ (ζ ) as functions of the Smirnov class are defined uniquely. Note that due to uniqueness and property (5.7) we have L¯ (ζ¯ ) = L (ζ ). That is (5.12) can be written in the form similar to (1.52) L¯ (τ¯ ) L (τ¯ ) d log v(τ ) 2 (5.14) L (τ ) L (τ ) = d log τ . ¯ Theorem 5.8. Let θ ∈ ΘSB (E) and let {ak }∞ k=0 be the sequence of its Schur parameters. Represent the Schur iterates as L¯ (n, ζ ) θ (n) v(ζ ) = eicn . L (n, ζ )

(5.15)

Then eicn = eic and L¯ (n, ζ ) = e−ic an L (n, ζ ) + ρn b (ζ )L¯ (n + 1, ζ ), L (n, ζ ) = eic a¯ n L¯ (n, ζ ) + ρn b¯ (ζ )L (n + 1, ζ ).

(5.16)



Proof. By definition L¯ L¯ − (e−ic a0 )L b¯ θ − a0 b¯ = eic = eic1 (1) . −ic 1 − θ a¯ 0 b b L L − (e a0 )L¯ (1)

θ (1) =

By the uniqueness of representation (5.11) we get

(1)

ρ˜1 b L¯

(1)

= [ L¯

L ]

a0 = eic

L¯ () , L ()

ρb ˜ ¯ L

−eic a¯ 0 , 1

1 −e−ic a0

(5.17)

with ρ˜1 = ρe ˜ i(c1 −c) . Using

we have in particular (1) ρ˜1 b ()L ¯ ¯ () ¯ = L¯ () ¯ 1 − |a0 |2 ,

2 ρb ˜ ¯ ()L(1) () = L () 1 − |a0 | .

That is, both ρ, ˜ ρ˜1 are positive and therefore ρ˜1 = ρ˜ and eic1 = eic . From (5.17) we have the matrix identity ρ˜

(1)

b (τ¯ )L¯ (τ¯ ) (1)

b (τ )L¯ (τ )

(1)

b¯ (τ¯ )L (τ¯ )

L¯ (τ¯ ) = (1) L b¯ (τ )L (τ ) ¯ (τ )

L (τ¯ ) L (τ )

1 −e−ic a0

−eic a¯ 0 . 1

(5.18)

Finally using (5.14) we have ρ˜ 2 = 1 − |a0 |2 . Hence ρ˜ = ρ0 . Thus (5.16) holds for n = 0 and we can iterate this procedure. 2 Lemma 5.9. For the spectral density W the following factorization holds dm(v(τ )) 2 = ρ−1 W −1 v(τ ) Φ(τ )Φ ∗ (τ ), dm(τ )

(5.19)

where $ Φ(τ ) =

1 b (τ ) L−, (τ ) (1) −eic− L−,¯ (τ )

−eic+ L+,¯ (τ )

%

(−1) 1 b (τ ) L+, (τ )

(5.20)

.

Proof. Due to (5.11) θ+ (v) = eic+

L+,¯ (ζ ) , L+, (ζ )

(1)

(1)

θ− (v) = eic−

L−,¯ (ζ ) (1)

.

(5.21)

L−, (ζ )

Besides, due to (5.13) 2 1 − θ+ v(τ ) =

dm(v(τ )) 1 2 |L+, (τ )| dm(τ )

(5.22)



and (1) 2 1 − θ− v(τ ) =

1

dm(v(τ )) . dm(τ )

(1) |L−, (τ )|2

(5.23)

By definition (1.23) and (5.23), (5.22), we have W −1 (v(τ ))

(1) |L−, (τ )|2 2 dm(v(τ )) = dm(τ ) I + R(v) 0

0

2 . |L+, (τ )|2 I + R∗ (v(τ ))

By definition (1.22) (1) 2 θ = I − vA∗−1 − 0 I + R(v)

0 . θ+

Therefore we get (5.19) with b¯ ρ−1 Φ(τ ) = b ×

(1)

0

L−, 0

L+,

(τ ) − A∗−1

(1) eic− L−,¯

0

0

eic+ L+,¯

a−1 ρ−1

ρ−1 −a¯ −1

(5.24)

(τ ).

By (5.16) (−1)

(−1)

ρ−1 b¯ (ζ )L+, (ζ ) = L+, (ζ ) − eic+ a¯ −1 L+,¯ (ζ ) (−1) (−1) = L+, (ζ ) − eic+ a¯ −1 e−ic+ a−1 L+, (ζ ) + ρ−1 b (ζ )L+,¯ (ζ ) (−1)

= (ρ−1 )2 L+, (ζ ) − ρ−1 eic+ a¯ −1 b (ζ )L+,¯ (ζ ), that is (−1)

b¯ (ζ )L+, (ζ ) + eic+ a¯ −1 b (ζ )L+,¯ (ζ ) = ρ−1 L+, (ζ )

(5.25)

and similarly (1)

(1)

(1)

(1)

b¯ (ζ )L−, (ζ ) + eic− a¯ −,−1 b (ζ )L−,¯ (ζ ) = ρ−,−1 L−, (ζ ), (1)

(5.26)

(1)

Note that a−,−1 = −a¯ −1 (generally a−,k = −a¯ −k−2 ). Thus, using (5.26), (5.25), we get (5.20) from (5.24). The lemma is proved. 2 Lemma 5.10. Define

R− S(τ ) = T+ Then (1.38)–(1.41) hold true.

T− = −Φ −1 (τ )τ¯ Φ(τ¯ ). R+

(5.27)



Proof. S(τ ) is unitary-valued since W (v(τ¯ )) = W (v(τ )). Due to L±,¯ (ζ¯ ) = L±, (ζ ) and A¯ n = A∗n = A−1 n we get directly from (5.24)

e−ic− Φ(τ¯ ) = −vA−1 Φ(τ ) 0

0

(5.28)

.

e−ic+

And, therefore, the following symmetry property of S S(τ¯ ) = − =

eic− 0

eic− 0

0 eic+ 0 eic+

S(τ )

e−ic− 0

e

e−ic− 0 0 −ic+ .

Φ(τ )−1 τ¯ Φ(τ¯ )

0

e−ic+ (5.29)

is proved. Let us show that T+ (τ ) = T− (τ¯ ). We have

R− T+

1 (−1) −τ¯ b L+, T− = R+ Δ eic− L(1) −,¯

eic+ L+,¯ 1 b L−,

(τ )

1 b L−, (1) −eic− L−,¯

−eic+ L+,¯

(τ¯ ),

(5.30)

(v −1 ) . ρ−1 Δ

(5.31)

1 (−1) b L+,

where Δ = det Φ. Therefore

T+ = −eic− τ¯

1 b (τ¯ ) L−, (τ¯ ) 1 L−, (τ ) b (τ )

(1) L−,¯ (τ¯ ) (1) L (τ ) −,¯

Δ −1 (1) (1) v L−, (τ¯ ) − eic− a−1 L−,¯ (τ¯ ) −1 (1) v L−, (τ ) − eic− a−1 L(1) −,¯ (τ ) = −eic− τ¯ ρ−1 Δ

(1) L−,¯ (τ¯ ) L(1) −,¯ (τ )

= −eic−

−1

) i(c+ −c− ) T . Due to (5.29) T (τ¯ ) = ei(c+ −c− ) T (τ ), therefore the Similarly T− = −eic+ (v + + + ρ−1 Δ = e ∗ symmetry S (τ¯ ) = S(τ ) is completely proved. Note also that (5.31) implies the following normalization

T+ () = eic−

b¯ ()b () (−1)

ρ−1 L+, ()L−, ()

= eic−

b () (−1)

(1)

.

(5.32)

L+, ()L−, ()

That is, T+ () = −ieic− |T+ ()|. Finally we have to prove that T± is a ratio of an outer function and a Blaschke product. In other words, by (5.31), we need to show that the inner part of the Smirnov class function ρ−1 b2 Δ = det

−eic+ b L+,¯

L−, (1)

(−1)

−eic− ρ−1 b L−,¯ ρ−1 L+, L−, −eic+ b L+,¯ = ic− −ic ic − + −e (L−,¯ + a¯ −1 e L−, ) b¯ L+, + a¯ −1 e b L+,¯



−eic+ b L+,¯ b¯ L+,

L−, = ic− −e L

−,¯

= b¯ L−, L+, − ei(c+ +c− ) b L+,¯ L−,¯

(5.33)

is a Blaschke product (actually related to the spectrum of the associated CMV matrix). Since b¯ L−, L+, + ei(c+ +c− ) b L+,¯ L−,¯ 1 + vθ+ θ− = i(c +c ) + − 1 − vθ+ θ− b¯ L−, L+, − e b L+,¯ L−,¯ by Theorem 5.1 the inner part of this fraction is a ratio of two Blaschke products. Thus, any other inner divisor of the inner part of ρ−1 b2 Δ should simultaneously divide the inner part of the numerator b¯ L−, L+, + ei(c+ +c− ) b L+,¯ L−,¯ . That is, b¯ L−, L+, and ei(c+ +c− ) b L+,¯ L−,¯ possess a nontrivial common inner factor in this case. But they are coprime since the inner part of the first function is supported in the upper half plane and of the second one in the lower part, see Proposition 5.4. 2 Lemma 5.11. For every ζk ∈ Z the following two vectors are collinear

(1) eic− L−,¯

1 b L−,

(ζk ) = −

1 T−

(−1) (ζk )ν+ (ζk ) b1 L+,

eic+ L+,¯ (ζk ).

(5.34)

Moreover ν+ (ζk ) > 0. Proof. By definition (1.21) and (1.22) we have (1) θ (v) ∗ tk Σ(tk ) = (tk − v) I − vA−1 − 0

0 θ+ (v)

−1 . v=tk

Since ρ−1 vΦ(τ ) = I

− vA∗−1

(1)

θ− (v) 0

0 θ+ (v)

(1)

L−, 0

0 L+,

(τ ),

we get tk Σ(tk ) = =

(1)

L−, 0 (1)

L−, 0 (1)

L−, = 0

0

L+,

(tk − v) −1 (ζk ) Φ (τ ) ρ−1 v τ =ζk

1 (−1) b L+, (1) L+, eic− L−,¯ 1 (−1) 0 b L+, (ζk ) (1) L+, eic− L−,¯

0

(ζk )

eic+ L+,¯ 1 b L−,

eic+ L+,¯ 1 b L−,

(tk − v) (ζk ) ρ−1 vΔ τ =ζk

(ζk )

−v (ζk ) , ρ−1 tk Δ (ζk )

(5.35)



−1

) or, using T− = −eic+ (v ρ−1 Δ ,

(1)

L−, Σ(tk ) = 0

0

L+,

(ζk )

1 (−1) b L+, (1) eic− L−,¯

eic+ L+,¯

1 b L−,

ic+ −1 e (ζk ) − (ζk ) . T−

(5.36)

From this formula we conclude that the vector in the RHS (5.34) does not vanish. Otherwise, by L+, (ζk ) = L+,¯ (ζk ), we have Σ(tk ) = 0, which is impossible. On the other hand the rank of the second matrix in (5.36) is one, therefore (5.34) is proved. Now, making use of (5.34) and the symmetry of T− , we get from (5.36) $ % (−1) 1 1 L (ζ ) k (−1) ic+ L Σ(tk ) = b (ζk ) +, (5.37) +,¯ (ζk ) ν+ (ζk ), b (ζk ) L+, (ζk ) e −ic e + L+,¯ (ζk ) here Σ(tk ) 0 implies ν+ (ζk ) > 0.

2

Remark 5.12. Similarly −

1 T+

(ζk )ν− (ζk ) eic− L(1) −,¯

1 b L−,

(−1) (ζk ) = b1 L+,

eic+ L+,¯ (ζk ).

(5.38)

Therefore (1.42) holds for ν± defined by (5.34) and (5.38). 6. From the spectral representation to the scattering representation In this section an essential part of Theorem 1.6 will be proved. Theorem 6.1. Let A ∈ ASB (E). Define S by (5.27) and ν± by (5.34) and (5.38). Then bm (ζ )bm¯ (ζ )L±,¯ (n, ζ )eic± , n = 2m, ± e (n, ζ ) = (ζ )L±, (n, ζ ), n = 2m + 1, bm (ζ )bm+1 ¯

(6.1)

is an orthonormal basis in L2α± , α± = {R± , ν± }. The proof is based on the following lemma. Lemma 6.2. For f + ∈ L2α+ $ v(τ )+w

+ + v(τ )−w f , e (−1, τ ) α+ v(τ )+w + + v(τ )−w f , e (0, τ ) α+

%

=

t +w dΣ(t)f˜(t), t −w

(6.2)

where + 1 f (τ ), t = v(τ ), τ ∈ T− , Φ f− Δ(τ ) + f + (ζk ) e (−1, ζk ) ˜ f (tk ) := , tk = v(ζk ), ζk ∈ Z. e+ (0, ζk ) |e+ (−1, ζk )|2 + |e+ (0, ζk )|2 f˜(t) :=

(6.3) (6.4)



Proof. Note that in this notations (see (5.37)) + e (−1, ζk ) + Σ(tk ) = e (−1, ζk ) e+ (0, ζk ) and (see (5.27)) − e (−1, τ ) −e− (0, τ )

−e+ (0, τ ) e+ (−1, τ )

R− T+

e+ (0, ζk ) ν+ (ζk ),

− T− e (−1, τ¯ ) (τ ) = −τ¯ −e− (0, τ¯ ) R+

−e+ (0, τ¯ ) . e+ (−1, τ¯ )

(6.5)

(6.6)

Therefore, by the definition of the scalar product in L2α+ , we have v(τ )+w + + v(τ )−w f , e (−1, τ ) v(τ )+w + + v(τ )−w f , e (0, τ ) e+ (−1, ζk )f + (ζk ) v(ζk ) + w ν+ (ζk ) = e+ (0, ζk )f + (ζk ) v(ζk ) − w ζk ∈ Z

+ T−

T+ (τ )e+ (−1, τ ) T− (τ )e− (0, τ )

T+ (τ )e+ (0, τ ) T− (τ )e− (−1, τ )

∗

v(τ ) + w T+ f + (τ ) dm(τ ). T− f − v(τ ) − w

Using (6.5), definition (6.4) and Φ

−1

1 e+ (−1, τ ) (τ ) = Δ e− (0, τ )

e+ (0, τ ) , e− (−1, τ )

(6.7)

we get v(τ )+w + + v(τ )−w f , e (−1, τ ) v(τ )+w + + v(τ )−w f , e (0, τ ) tk + w = Σ(tk )f˜(tk ) t −w tk ∈X k

−1 ∗ |T+ |2 f + v(τ ) + w (τ ) dm(τ ) + Δ(τ ) Φ (τ ) 2 − |T− | f v(τ ) − w T−

+ tk + w 1 t +w f ˜ = (τ ) dm(t), Σ(tk )f (tk ) + W (t) Φ f− tk − w t −w Δ tk ∈X

E

since W = (Φ −1 )∗ Φ −1 |v2 | and |T− |2 = |T+ |2 = ρ−1

|v |2 2 |Δ|2 . ρ−1

2

Proof of Theorem 6.1. It was shown that e+ (−1, τ ), e+ (0, τ ) form a cyclic subspace for the multiplication operator by v(τ ) in L2α+ , moreover, the resolvent matrix function + ∗ v(τ ) + w + E , E v(τ ) − w

E+

c−1 := e+ (−1, τ )c−1 + e+ (0, τ )c0 , c0



coincides with R(w) (1.20). Therefore the operator F + : l 2 (Z) → L2α+ , defined by −1 F + (A − w)−1 |n = v(τ ) − w e+ (n, τ ),

n = −1, 0,

is unitary. Recurrences (5.16) imply (1.50), (1.51), and therefore, (1.29). Thus F + |n = e+ (n, τ ), and the theorem is proved.

n ∈ Z,

2

Proof of Proposition 1.16. Note that the functions $e^+(n,\tau)$ are in the Smirnov class for $n \in \mathbb Z_+$. Therefore $(BT_+)(\tau)e^+(n,\tau) \in L^2$ implies $(BT_+)(\tau)e^+(n,\tau) \in H^2$. Thus $F^+(\mathbb Z_+) \subset \hat H^2_{\alpha_+}$. In the same way $F^-(\mathbb Z_-) \subset \hat H^2_{\alpha_-}$. Therefore, due to the duality Theorem 1.14, $\check H^2_{\alpha_+} \subset F^+(\mathbb Z_+)$. □

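Remark 6.3. Let us note the following fact:
\[
\lim_{n\to-\infty} F^+(\mathbb Z_{+,n}) = L^2_{\alpha_+}, \qquad \lim_{n\to\infty} F^+(\mathbb Z_{+,n}) = \{0\}, \tag{6.8}
\]
where $\mathbb Z_{+,n} := \{m \in \mathbb Z,\ m \ge n\}$. Also, in the standard way,
\[
l^{n,+}(\zeta,\zeta_0) := \sum_{m=n}^{\infty} e^+(m,\zeta)\overline{e^+(m,\zeta_0)}, \qquad \zeta,\zeta_0 \in \mathbb D,
\]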

is the reproducing kernel in F + (Z+,n ). In particular, ¯ l + (ζ, ) L+,¯ (ζ ) = , + l (, ¯ ) ¯

l + (ζ, ) L+, (ζ ) = , l + (, )

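where $l^+(\zeta,\zeta_0) := l^{0,+}(\zeta,\zeta_0)$.

Proof of (1.32). Let
\[
\delta_k(\tau) = \begin{cases} \dfrac{1}{\nu_+(\zeta_k)}, & \tau = \zeta_k,\\[1ex] 0, & \tau \in (\mathbb T \cup Z) \setminus \{\zeta_k\}. \end{cases}
\]
Then
\[
\langle f^+(\tau), \delta_k(\tau)\rangle_{\alpha_+} = f^+(\zeta_k) \tag{6.9}
\]
for every $f^+ \in F^+(\mathbb Z_{+,n})$. Therefore the projection of $\delta_k(\tau)$ onto $F^+(\mathbb Z_{+,n})$ is the reproducing kernel $l^{n,+}(\tau,\zeta_k)$. Since by (6.8)
\[
\|\delta_k(\tau)\| = \lim_{n\to-\infty}\|l^{n,+}(\tau,\zeta_k)\|,
\]
we get by (6.9)
\[
\frac{1}{\nu_+(\zeta_k)} = \lim_{n\to-\infty}\sum_{m=n}^{\infty}|e^+(m,\zeta_k)|^2 = \sum_{m=-\infty}^{\infty}|e^+(m,\zeta_k)|^2.
\]
□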

Thus, to complete the proof of Theorem 1.6, we have to show asymptotics (1.30). 7. Asymptotics In this section we prove the main claim of Theorem 1.6. Recall briefly the notations. With A ∈ ASB (E) we associate the Schur functions θ+ , θ−(1) . They belong to ΘSB (E) and, therefore, possess the special representation (5.11). We put e+ (0, τ ) = eic+ L+,¯ (τ ),

ρ−1 e+ (−1, τ ) =

1 L+, (τ ) + eic+ a¯ −1 L+,¯ (τ ), v(τ )

e− (0, τ ) = eic− L−,¯ (τ ),

ρ−1 e− (−1, τ ) =

1 (1) (1) L (τ ) − eic− a−1 L−,¯ (τ ). v(τ ) −,

(1)

(7.1)

The scattering matrix $S$ is defined by (6.6) and the measures $\nu_\pm$ on $Z$ are defined by
\[
\nu_\pm(\zeta_k)\bigl\{|e^\pm(-1,\zeta_k)|^2 + |e^\pm(0,\zeta_k)|^2\bigr\} = \operatorname{tr}\Sigma(t_k).
\]
Our goal is to prove the asymptotics (1.30), (1.35) for the systems defined by the recurrence relations (1.29), (1.34) with the initial data $e^\pm(n,\tau)$, $n = -1, 0$. It is a standard fact that such asymptotics can be obtained from the convergence of a certain system of analytic functions at just one fixed point of their domain. More specifically, our first step is a reduction to the convergence of the reproducing kernels $L_{\pm,\zeta_0}(n,\zeta_0)$ to the standard one $K_{\zeta_0}(\zeta_0)$ at a fixed point of the unit disk.

Lemma 7.1. Let $\chi_{\mathbb T}(\tau)$, $\tau \in \mathbb T \cup Z$, be the characteristic function of the set $\mathbb T$. Then (1.30), (1.35) can be deduced from
\[
\lim_{n\to\infty}\bigl\|\chi_{\mathbb T}\bigl(L_{\pm,\zeta_0}(n,\tau) - K_{\zeta_0}(\tau)\bigr)\bigr\|_{\alpha_\pm^{(n)}} = 0, \qquad \zeta_0 \in \mathbb D. \tag{7.2}
\]

Proof. Using definition (6.6) and the recurrence relations for $e^\pm(n,\tau)$ we have
\[
\bar\tau e^\pm(n,\bar\tau) + R_\pm(\tau)e^\pm(n,\tau) = T_\mp(\tau)e^\mp(-n-1,\tau).
\]
Therefore conditions (1.30), (1.35) are equivalent to
\[
\begin{aligned}
T_\pm(\tau)e^\pm(n,\tau) &= T_\pm(\tau)e_{n,c_\pm}(\tau) + o(1),\\
\bar\tau e^\pm(n,\bar\tau) + R_\pm(\tau)e^\pm(n,\tau) &= \bar\tau e_{n,c_\pm}(\bar\tau) + R_\pm(\tau)e_{n,c_\pm}(\tau) + o(1)
\end{aligned}
\]
in $L^2$ as $n \to \infty$. That is,
\[
\lim_{n\to\infty}\bigl\|\chi_{\mathbb T}\bigl(e^\pm(n,\tau) - e_{n,c_\pm}(\tau)\bigr)\bigr\|_{L^2_{\alpha_+}} = 0. \tag{7.3}
\]

Then we use (6.1), (1.18) and definition (1.60) to rewrite (7.3) into the form ' ' lim 'χT L±,¯ (n, τ ) − K¯ (τ ) 'α (n) = 0, n→∞ ± ' ' ' ' 2 lim χT L±, (n, τ ) − K (τ ) α (n) = 0. ±

n→∞

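Lemma 7.2. Assume that
\[
\lim_{n\to\infty} L_{\pm,\zeta_0}(n,\zeta_0) = K_{\zeta_0}(\zeta_0), \qquad \zeta_0 \in \mathbb D. \tag{7.4}
\]
Then
\[
\lim_{n\to\infty}\bigl\|L_{\pm,\zeta_0}(n,\tau) - \chi_{\mathbb T}K_{\zeta_0}(\tau)\bigr\|_{\alpha_\pm^{(n)}} = 0. \tag{7.5}
\]

Proof. For $\epsilon > 0$ choose $N$ such that $\operatorname{Re} B_N(\zeta_0) \ge 1 - \epsilon$. Note that $B_N K_{\zeta_0} \in \check H^2_{\alpha_\pm^{(n)}}$ and consider
\[
\bigl\|L_{\pm,\zeta_0}(n,\tau) - (B_N K_{\zeta_0})(\tau)\bigr\|^2_{\alpha_\pm^{(n)}}, \qquad
\bigl\|(B_N K_{\zeta_0})(\tau) - \chi_{\mathbb T}K_{\zeta_0}(\tau)\bigr\|^2_{\alpha_\pm^{(n)}}.
\]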

For the first term we have ' ' 'L±,ζ (n, τ ) − (BN Kζ )(τ )'2 (n) = 2 − 2 Re (BN Kζ0 )(ζ0 ) 0 0 α± L±,ζ0 (n, ζ0 ) − Re P− bn bn¯ R± BN Kζ0 , τ¯ (BN Kζ0 )(τ¯ ) L2 . Note that for any two functions f, g ∈ L2 the following limit exists lim P− bn bn¯ f, g L2 = 0.

(7.6)

n→∞

Therefore, if (7.4) is satisfied then
\[
\limsup_{n\to\infty}\bigl\|L_{\pm,\zeta_0}(n,\tau) - (B_N K_{\zeta_0})(\tau)\bigr\|^2_{\alpha_\pm^{(n)}} \le 2\epsilon. \tag{7.7}
\]



Similarly, ' ' '(BN Kζ )(τ ) − χT Kζ (τ )'2 (n) 0 0 α± 2 = |BN Kζ0 |2 (ζk )ν± (ζk )bn (ζk ) Z

+ 2 − 2 Re BN (ζ0 ) + Re P− bn bn¯ R± (BN − 1)Kζ0 , τ¯ (BN − 1)Kζ0 (τ¯ ) L2 . Note that the sum over Z here contains just a fixed finite number of nonvanishing terms, and therefore it goes to zero as n → ∞. Thus, taking also into account (7.6), we get '2 ' lim sup'(BN Kζ0 )(τ ) − χT Kζ0 (τ )'α (n) 2. ±

n→∞

(7.8)

Combining (7.7) and (7.8) we have
\[
\limsup_{n\to\infty}\bigl\|L_{\pm,\zeta_0}(n,\tau) - \chi_{\mathbb T}K_{\zeta_0}(\tau)\bigr\|^2_{\alpha_\pm^{(n)}} \le 8\epsilon.
\]
Since $\epsilon > 0$ is arbitrary the lemma is proved. □
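The constant 8 in the last estimate of the proof comes from the elementary inequality $\|x+y\|^2 \le 2\|x\|^2 + 2\|y\|^2$. A short worked step, assuming that the bounds in (7.7) and (7.8) are both $2\epsilon$ and that all norms are taken in the space with data $\alpha_\pm^{(n)}$:

```latex
\begin{align*}
\|L_{\pm,\zeta_0}(n,\cdot) - \chi_{\mathbb T}K_{\zeta_0}\|^2
 &\le 2\,\|L_{\pm,\zeta_0}(n,\cdot) - B_N K_{\zeta_0}\|^2
    + 2\,\|B_N K_{\zeta_0} - \chi_{\mathbb T}K_{\zeta_0}\|^2 ,\\
\limsup_{n\to\infty}\|L_{\pm,\zeta_0}(n,\cdot) - \chi_{\mathbb T}K_{\zeta_0}\|^2
 &\le 2\cdot 2\epsilon + 2\cdot 2\epsilon = 8\epsilon .
\end{align*}
```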

Remark 7.3. Note that, in addition to (7.2), (7.5) contains lim

n→∞

L±,ζ (n, ζk )2 b (ζk )2n ν± (ζk ) = 0. 0 Z

In the proof of (7.4) we follow the line that was suggested in [26] and then improved in [37] and [19]. Actually, the general idea is very simple. There are two natural steps in approximation of the given spectral data by “regular” ones. First, to substitute the given measure ν+ by a finitely supported νN,+ . Second, to substitute R+ by qR+ with 0 < q < 1. Then the corresponding data produce the Hardy space which is topologically equivalent to the standard H 2 . In particular Kˇ αN,q,+ (ζ0 , ζ0 ) = Kˆ αN,q,+ (ζ0 , ζ0 ). Lemma 7.4. Let Z contain a finite number of points and R+ L∞ < 1. Then the limit (7.4) exists. Basically, it follows from (7.6) and |b (ζk )|n → 0. It is a fairly easy task and we omit a proof here. Further, due to Hˇ α2q,+ ⊂ Hˇ α2+ ⊂ F + (Z+ ) ⊂ Hˆ α2+ ⊂ Hˆ α2N,+ we have the evident estimations for the corresponding reproducing kernels Kˇ αq,+ (ζ0 , ζ0 ) Lα+ (ζ0 , ζ0 ) Kˆ αN,+ (ζ0 , ζ0 ). And the key point is that, due to the duality principle, (3.2) holds. It allow us to use the left or right side estimation whenever it is convenient for us.



Theorem 7.5. Let A ∈ ASB (E). For ζ0 ∈ D the limit (7.4) exists, and therefore (1.30), (1.35) hold true. Proof. O(ζ ) Recall Lemma 2.1 on the relation between ± mappings and the notations |T± (ζ )| = B(ζ ) , where B is a Blaschke product and O is an outer function (1.39). We have Lα (n) (ζ0 , ζ0 ) Kˆ α (n) (ζ0 , ζ0 ) Kˆ α (n) (ζ0 , ζ0 ) = ±

±

N,±

K 2 (ζ0 , ζ0 ) 1 |TN,± (ζ0 )| Kˇ (−n−1) (ζ0 , ζ0 ) αN,∓

|Oq (ζ0 )| K 2 (ζ0 , ζ0 ) 1 = Kˆ (n) (ζ0 , ζ0 ). |TN,± (ζ0 )| Kˇ (−n−1) (ζ0 , ζ0 ) |O(ζ0 )| αN,q,±

(7.9)

αN,q,∓

And from the other side Lα (n) (ζ0 , ζ0 ) Kˇ α (n) (ζ0 , ζ0 ) Kˇ α (n) (ζ0 , ζ0 ) = ±

±

q,±

K 2 (ζ0 , ζ0 ) 1 |Tq,± (ζ0 )| Kˆ (−n−1) (ζ0 , ζ0 ) αq,∓

1 |Tq,± (ζ0 )| Kˆ

K 2 (ζ

0 , ζ0 )

(−n−1)

αq,N,∓

(ζ0 , ζ0 )

= BN (ζ0 )Kˇ α (n)

q,N,±

(ζ0 , ζ0 ).

(7.10)

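Passing to the limit in (7.9) and (7.10) we get
\[
B_N(\zeta_0)K(\zeta_0,\zeta_0) \le \liminf_{n\to\infty} L_{\alpha_\pm^{(n)}}(\zeta_0,\zeta_0) \le \limsup_{n\to\infty} L_{\alpha_\pm^{(n)}}(\zeta_0,\zeta_0) \le \frac{|O_q(\zeta_0)|}{|O(\zeta_0)|}K(\zeta_0,\zeta_0). \tag{7.11}
\]
Since
\[
\lim_{N\to\infty} B_N(\zeta_0) = 1 \quad\text{and}\quad \lim_{q\to1} O_q(\zeta_0) = O(\zeta_0),
\]
(7.11) implies (7.4) and thus asymptotics (1.30), (1.35) are proved. □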

8. Hilbert transform

Recall definition (1.67) of the transformation operator. In terms of the decomposition (1.66) the operator $M_-: l^2(\mathbb Z_-) \to l^2(\mathbb Z_-)$ is defined by
\[
M_- = \iota^*\begin{bmatrix}
M^-_{0,0} & 0 & 0 & \dots\\
M^-_{1,0} & M^-_{1,1} & 0 & \dots\\
M^-_{2,0} & M^-_{2,1} & M^-_{2,2} & \dots\\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}\iota, \tag{8.1}
\]



where ι : l 2 (Z− ) → l 2 (Z+ ), ι|m = | − 1 − m. Also, the shifted transformation operator is of the form ⎡ M(n) +

− Mn,n

⎢ M− ⎢ 1+n,n =⎢ − ⎢ M2+n,0+n ⎣ .. .

0

0

− M1+n,1+n

0

...

− M2+n,1+n .. .

− M2+n,2+n .. .

0

0

− M1+n,1+n

0

⎤

...⎥ ⎥ ⎥ ...⎥ ⎦ .. .

(8.2)

for even n and ⎡ (n) M+

⎢ =⎣

⎤

A A

⎡

− Mn,n

⎢ M− 1+n,n ⎥⎢ − ⎦⎢ ⎢ M2+n,0+n .. ⎣ . .. .

− M2+n,1+n .. .

− M2+n,2+n .. .

for odd n, where Sn : l 2 (Z+ ) → l 2 (Z+,n ), Sn |m = |n + m and A = matrix Aa with constant coefficients.

...

⎤

...⎥ ⎥ ∗ ∗ ⎥ . . . ⎥ S n A1 Sn ⎦ .. . ρ ρ −a

a¯

(8.3)

is related to the

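Lemma 8.1. $M_+$ is bounded if and only if
\[
\int_{\mathbb T}|F(\tau)|^2\,dm(\tau) \le C\|F\|^2_{\alpha_+} \tag{8.4}
\]
is satisfied for all $F \in F^+(\mathbb Z_+)$. If $M_+^{(n)}$ is bounded for a certain $n = n_0$ then it is bounded for all $n \in \mathbb Z$.

Proof. (8.4) follows directly from (1.66). Next, $\|M_+^{(n+1)}\| \le \|M_+^{(n)}\|$. So the only thing required to be proved is that $\|M_+^{(n)}\| < \infty$ implies $\|M_+^{(n-1)}\| < \infty$. It follows from the recurrence (1.29). □

Let
\[
\begin{aligned}
\frac{1+\theta(v)}{1-\theta(v)} &= i\operatorname{Im}\frac{1+\theta(0)}{1-\theta(0)} + \int_{\mathbb T}\frac{t+v}{t-v}\,d\sigma(t)\\
&= i\operatorname{Im}\frac{1+\theta(0)}{1-\theta(0)} + \sum_{t_k \in \mathbb T\setminus E}\frac{t_k+v}{t_k-v}\,\sigma_k + \int_E\frac{t+v}{t-v}\bigl(w(t)\,dm(t) + d\sigma_s(t)\bigr),
\end{aligned} \tag{8.5}
\]
where $\sigma_s$ is a singular measure on $E$ and
\[
w(t) = \frac{1 - |\theta(t)|^2}{|1 - \theta(t)|^2}. \tag{8.6}
\]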
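The density (8.6) is the boundary value of the real part of the function on the left of (8.5). The underlying one-line computation, for any complex number $\theta \ne 1$, can be sketched as follows (it uses only that $\theta - \bar\theta$ is purely imaginary):

```latex
\[
\operatorname{Re}\frac{1+\theta}{1-\theta}
 = \operatorname{Re}\frac{(1+\theta)(1-\bar\theta)}{|1-\theta|^2}
 = \operatorname{Re}\frac{1-|\theta|^2+\theta-\bar\theta}{|1-\theta|^2}
 = \frac{1-|\theta|^2}{|1-\theta|^2}.
\]
```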



Then
\[
\frac{1 - \theta(v)\overline{\theta(v_0)}}{(1 - \theta(v))(1 - \overline{\theta(v_0)})} = \int_{\mathbb T}\frac{1 - v\bar v_0}{(t - v)(\bar t - \bar v_0)}\,d\sigma(t). \tag{8.7}
\]

Lemma 8.2. Let θ ∈ ΘSB (E). Put θ+ = θ and select θ− as in Example 5.6, so that the associate CMV matrix A belongs to ASB (E). Then t ic dσ (t)f (t) (8.8) F (ζ ) := L(ζ, ) − e L(ζ, ) ¯ b¯ (ζ )t − b (ζ ) T

is a unitary map from L2dσ to F + (Z+,1 ). Proof. Put f (t) =

L(,ζ0 )−eic L(,ζ ¯ 0) . b (ζ¯0 )−tb¯ (ζ¯0 )

l 1,+ (ζ, ζ0 ) =

Note that ¯ ζ )L(ζ0 , ) ¯ L(, ζ )L(ζ0 , ) − L(,

(8.9)

b¯ (ζ )b¯ (ζ0 ) − b (ζ )b (ζ0 )

is the reproducing kernel in F + (Z+,1 ). By (8.7) we have

f 2L2 = F 2α+ . σ

Thus the map is an isometry. Since the set of such functions is dense, it is unitary.

2

Proposition 8.3. The transformation operator $M_+^{(1)}$ is bounded if and only if
\[
\int_E |(Hf)(v)|^2\,\frac{dm(v)}{w(v)} \le C\|f\|^2_{L^2_\sigma}, \qquad f \in L^2_\sigma, \tag{8.10}
\]
where
\[
(Hf)(v) := \int_{\mathbb T}\frac{t}{t-v}\,d\sigma(t)f(t). \tag{8.11}
\]

Proof. We use (8.4). Then, by (8.8) and (5.12), we have
\[
\int_E \frac{|1-\theta(v)|^2}{1-|\theta(v)|^2}\,|(Hf)(v)|^2\,dm(v) \le C\|F\|^2_{\alpha_+} = C\|f\|^2_{L^2_\sigma}.
\]
Thus (8.10) is proved. □

We give necessary and sufficient conditions on the measure $\sigma$ that guarantee (8.10). Let us reformulate our problem and change the notations slightly. Obviously we can straighten up by fractional linear transformation the arc $E$ and point part of $\sigma$ in such a way that $E$ becomes the segment $[-2,2]$, points $\{\zeta_k\}$ are transformed to $\{x_k\}$ accumulating only to $-2$ and $2$, the measure $\sigma$ goes to $\tilde\sigma$, and $d\tilde\sigma = \tilde w\,dx$ on $[-2,2]$, $\tilde\sigma(x_k) = \sigma_k$. It is easy to see that inequality (8.10) becomes equivalent to the following one:
\[
\int_{-2}^{2}|(Hf)(y)|^2\,\frac{dy}{\tilde w(y)} \le C\|f\|^2_{L^2_{\tilde\sigma}}, \qquad \forall f \in L^2(d\tilde\sigma), \tag{8.12}
\]
where
\[
(Hf)(y) := \int_{-2}^{2}\frac{f(x)}{x-y}\,d\tilde\sigma(x). \tag{8.13}
\]

If we choose all $f$'s from $L^2([-2,2], \tilde w\,dx)$ we get that (8.12) is equivalent to $\tilde w \in A_2[-2,2]$. In fact, with such test functions $f$ (8.12) becomes
\[
\int_{-2}^{2}\frac{dy}{\tilde w(y)}\,\Bigl|\int_{-2}^{2}\frac{f(x)\tilde w(x)\,dx}{x-y}\Bigr|^2 \le C\int_{-2}^{2}|f(x)|^2\tilde w(x)\,dx, \qquad \forall f \in L^2(\tilde w\,dx). \tag{8.14}
\]
Put $F := f\tilde w$. Then the previous estimate becomes the boundedness of
\[
H: L^2([-2,2], \tilde w^{-1}dx) \to L^2([-2,2], \tilde w^{-1}dx).
\]
This is of course equivalent to $\tilde w^{-1} \in A_2[-2,2]$, namely, to
\[
\sup_{I,\ I \subset [-2,2]}\langle\tilde w\rangle_I\,\langle\tilde w^{-1}\rangle_I < \infty, \tag{8.15}
\]
where $\langle\tilde w\rangle_I := \frac{1}{|I|}\int_I \tilde w\,dx$. This is obviously the same as $\tilde w \in A_2[-2,2]$.

Notice that it is easy to prove that $\tilde w \in A_2[-2,2]$ if and only if $\tilde w$ is a restriction onto $[-2,2]$ of an $A_2$ weight on the whole real line.

Lemma 8.4. Condition (8.10) implies that the measure $\sigma$ is absolutely continuous on the arc $E$ and moreover $w \in A_2$.

Therefore, to prove Theorem 1.19 we have to answer the following question: what is the property of the singular part on $\mathbb T \setminus E$? To continue with (8.12) we write it down now for all $f \in L^2(X, d\tilde\sigma)$:
\[
\int_{-2}^{2}\frac{dy}{\tilde w(y)}\,\Bigl|\int_X\frac{f(x)\,d\tilde\sigma(x)}{x-y}\Bigr|^2 \le C\int_X|f(x)|^2\,d\tilde\sigma(x), \qquad \forall f \in L^2(X, d\tilde\sigma). \tag{8.16}
\]
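The $A_2$ condition (8.15) can be probed numerically on a grid. The sketch below is only a discrete proxy, under the assumption that the weight is sampled at cell midpoints and that only grid-aligned intervals are tested. The function name and parameters are hypothetical and do not come from the paper.

```python
def a2_characteristic(weight, a=-2.0, b=2.0, cells=200):
    """Approximate sup_I <w>_I <1/w>_I over grid-aligned intervals I in [a, b].

    The weight is sampled at cell midpoints, so this is only a discrete
    proxy for the supremum in the A_2 condition.
    """
    h = (b - a) / cells
    samples = [weight(a + (i + 0.5) * h) for i in range(cells)]
    # Prefix sums of w and 1/w so that each interval average costs O(1).
    cum_w, cum_inv = [0.0], [0.0]
    for s in samples:
        cum_w.append(cum_w[-1] + s)
        cum_inv.append(cum_inv[-1] + 1.0 / s)
    best = 0.0
    for i in range(cells):
        for j in range(i + 1, cells + 1):
            m = j - i  # number of cells in the interval
            value = (cum_w[j] - cum_w[i]) * (cum_inv[j] - cum_inv[i]) / (m * m)
            best = max(best, value)
    return best
```

For a weight such as $\sqrt{4-x^2}$ the value should stay bounded as `cells` grows, while for a weight like $(2-x)^2$, which is not in $A_2$ near the endpoint $2$, it should keep growing under refinement.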



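Let us write down the dual inequality. Fix $g \in L^2([-2,2], \tilde w\,dx)$; we have that (8.16) is equivalent to
\[
\sup_{\|g\|_{L^2(\tilde w)} \le 1}\Bigl|\int_{-2}^{2}g(y)\int_X\frac{f(x)\,d\tilde\sigma(x)}{x-y}\,dy\Bigr| \le C\Bigl(\int_X|f|^2\,d\tilde\sigma\Bigr)^{1/2}.
\]
Thus, we can conclude that (8.16) is equivalent to the following inequality:
\[
\int_X\Bigl|\int_{-2}^{2}\frac{g(y)\,dy}{y-x}\Bigr|^2 d\tilde\sigma \le C\int_{-2}^{2}|g|^2\tilde w\,dx, \qquad \forall g \in L^2([-2,2], \tilde w\,dx). \tag{8.17}
\]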

To understand necessary and sufficient conditions for (8.17) we introduce the Smirnov class $E^2(\Omega)$, where $\Omega = \mathbb C \setminus [-2,2]$. Recall that this is the class of analytic functions $f$ on $\Omega$ having the property that $\int_{\gamma_n}|f(z)|^2|dz| \le C$ for a sequence of smooth contours converging to $[-2,2]$ (the class does not depend on the sequence of contours). Let us denote by $\phi(z)$ the outer function in $\Omega$ such that $\tilde w = |\phi|^2$ on the boundary $[-2,2]$ of $\Omega$ (the same boundary value on both sides of $[-2,2]$), $\phi(\infty) > 0$. The fact that $\tilde w \in A_2[-2,2]$ is sufficient that $\phi$ exists (as $\tilde w \in A_2[-2,2]$ obviously ensures $\int_{-2}^{2}\frac{|\log\tilde w(x)|}{\sqrt{4-x^2}}\,dx < \infty$, and the latter condition means the existence of an outer function in $\Omega$ with absolute value $\tilde w$ on the boundary).

Lemma 8.5. Let $\tilde w \in A_2[-2,2]$, $\int_{-2}^{2}|g|^2\tilde w\,dx < \infty$, and let
\[
G(z) = \int_{-2}^{2}\frac{g(t)\,dt}{t-z}.
\]
Then $G(z)\phi(z) \in E^2(\Omega)$.

Proof. Consider
\[
G_+(x) := \lim_{y\to0+}\int_{-2}^{2}\frac{g(t)\,dt}{t-x-iy}, \qquad
G_-(x) := \lim_{y\to0-}\int_{-2}^{2}\frac{g(t)\,dt}{t-x-iy}.
\]
The jump formula says that $G_+(x) - G_-(x) = c\cdot g(x)$ for a.e. $x$. On the other hand, $G_+(x) + G_-(x) = c\cdot Hg(x)$ for a.e. $x$. We conclude that both $G_+, G_- \in L^2([-2,2], \tilde w\,dx)$ if and only if both $g, Hg \in L^2([-2,2], \tilde w\,dx)$. The latter is the same as $Hg \in L^2([-2,2], \tilde w\,dx)$ (because of course $g \in L^2([-2,2], \tilde w\,dx)$ by assumption). We conclude that both boundary values are in $L^2([-2,2], \tilde w\,dx)$ if and only if $Hg$ is. But the latter condition is equivalent to (we discussed this already) $\tilde w \in A_2[-2,2]$.
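For orientation, the constants $c$ in the two relations above can be made explicit through the Sokhotski–Plemelj formulas. This is a sketch under the assumption that $Hg$ denotes the principal-value integral $\mathrm{p.v.}\int_{-2}^{2} g(t)\,dt/(t-x)$; the text itself only writes $c$ and does not fix the normalization.

```latex
\[
G_\pm(x) = \mathrm{p.v.}\!\int_{-2}^{2}\frac{g(t)\,dt}{t-x} \;\pm\; i\pi\,g(x)
\quad\text{for a.e. } x\in(-2,2),
\]
\[
G_+(x)-G_-(x) = 2\pi i\,g(x), \qquad G_+(x)+G_-(x) = 2\,(Hg)(x).
\]
```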



We are left to prove that $G_+, G_- \in L^2([-2,2], \tilde w\,dx)$ implies that $G(z)\phi(z) \in E^2(\Omega)$. Actually these claims are equivalent, and this does not depend on $A_2$ anymore. Notice that our function $G(z)$ is a Cauchy integral of an $L^1(-2,2)$ function, and, as such, belongs to the Smirnov class $E^p(\Omega)$ for any $p \in (0,1)$. For any outer function $h$ in $\Omega$ and for any analytic function $G$, say, from $E^{1/2}(\Omega)$ we have that $Gh \in E^2(\Omega)$ if and only if $(Gh)_+ \in L^2(-2,2)$, $(Gh)_- \in L^2(-2,2)$. This is a corollary of the famous theorem of Smirnov (see [28]) that says the following: if in a domain $\Omega$ one has a holomorphic function $f$ which is the ratio of two bounded holomorphic functions such that the denominator does not have a singular inner part (the class of such functions is denoted by $N$), and if $f|_{\partial\Omega} \in L^q(\partial\Omega)$, then $f \in E^q(\Omega)$. In our case one should only see that any $G \in E^{1/2}(\Omega)$ and any outer function $h$ are functions from $N$. Then we apply this observation to our $G$ and to the outer function $h = \phi$, and we see that the requirement $G(z)\phi(z) \in E^2(\Omega)$ is equivalent to $G_+, G_- \in L^2([-2,2], \tilde w\,dx)$. Thus we are done. □

Remark 8.6. A little bit more is proved. Namely, given a weight $\tilde w$ on $[-2,2]$, we can claim that for every $g$ such that $\int_{-2}^{2}|g|^2\tilde w\,dx < \infty$ we have that the function
\[
G(z) = \int_{-2}^{2}\frac{g(t)\,dt}{t-z}
\]
satisfies $G(z)\phi(z) \in E^2(\Omega)$ if and only if $\tilde w \in A_2[-2,2]$. We need this claim only in the “if” direction.

Lemma 8.5 is very helpful as it allows us to write yet another inequality equivalent to (8.17):
\[
\sum_{x_k \in X}|G\phi(x_k)|^2\,\frac{\sigma_k}{|\phi(x_k)|^2} \le C\int_{-2}^{2}|G\phi(x)|^2\,dx. \tag{8.18}
\]

We want to see now that when $g$ runs over the whole of $L^2(\tilde w)$, the function $G\phi$ runs over the whole of $E^2(\Omega)$ (recall that $G(z) := \int_{-2}^{2}\frac{g(t)\,dt}{t-z}$). Lemma 8.5 gives one direction: if $g \in L^2(\tilde w)$ then $G\phi \in E^2(\Omega)$.

Let us show the other inclusion. So suppose $F \in E^2(\Omega)$. Consider $G(z) = \frac{F(z)}{\phi(z)}$. We want to represent it as follows:
\[
\frac{F(z)}{\phi(z)} = \int_{-2}^{2}\frac{f(t)\,dt}{t-z}, \qquad f \in L^2(\tilde w). \tag{8.19}
\]

To do that notice that both boundary value functions $\bigl(\frac{F}{\phi}\bigr)_+$, $\bigl(\frac{F}{\phi}\bigr)_-$ are in $L^2(\tilde w)$. Here we use again the fact that $\tilde w \in A_2[-2,2]$. So these two boundary value functions are in $L^1$. And $\frac{F(z)}{\phi(z)} \in N$ of course. We use again Smirnov's theorem (see [28]) to conclude that $\frac{F(z)}{\phi(z)} \in E^1(\Omega)$.

F. Peherstorfer et al. / Journal of Functional Analysis 256 (2009) 2157–2210

Then put
\[
f(t) := c\cdot\Bigl(\Bigl(\frac{F}{\phi}\Bigr)_+(t) - \Bigl(\frac{F}{\phi}\Bigr)_-(t)\Bigr).
\]
It is in $L^2(\tilde w)$ and so is in $L^1$. We apply the Cauchy integral theorem to the function $\frac{F(z)}{\phi(z)}$ from $E^1(\Omega)$. We get exactly (8.19) if the constant $c$ is chosen correctly.

All this reasoning shows that in (8.18) when $g$ runs over the whole of $L^2(\tilde w)$, the function $G\phi$ runs over the whole of $E^2(\Omega)$. Therefore (8.18) can be rewritten as follows:
\[
\sum_{x_k \in X}|F(x_k)|^2\,\frac{\sigma_k}{|\phi(x_k)|^2} \le C\int_{-2}^{2}|F(x)|^2\,dx, \qquad \forall F \in E^2(\Omega). \tag{8.20}
\]

This is very nice because (8.20) is a familiar Carleson measure condition, only not in the Hardy class $H^2$ in the unit disk, but for its full analog $E^2$ in $\Omega = \mathbb{C} \setminus [-2,2]$. The transfer from the disc to $\Omega$ is obvious:

Lemma 8.7. Let $D_\varepsilon$ denote two discs centered at $-2$ and $2$ and of radius $\varepsilon$. Then a measure $d\mu$ in $\Omega$ satisfies
\[ \int |F(z)|^2\,d\mu(z) \le C \int_{-2}^{2} |F(x)|^2\,dx \]
for all $F \in E^2(\Omega)$ if and only if
\[ \int_{D_\varepsilon} \frac{d\mu(z)}{\sqrt{|z^2-4|}} \le C\sqrt{\varepsilon}. \tag{8.21} \]

Proof. Let $\psi$ be a conformal map from the disc $\mathbb{D}$ onto $\Omega$. If $F \in E^2(\Omega)$ then $F \circ \psi \cdot (\psi')^{1/2} \in H^2$. We apply the Carleson measure theorem to the new measure $\tilde\mu := \psi^{-1} * \mu$ in the disc and see that $\tilde\mu/|\psi'|$ is a usual Carleson measure (see [12]). Coming back to $\Omega$ gives (8.21). □

Immediately we obtain the following necessary and sufficient condition for (8.16) (or (8.17)) to hold:
\[ \sum_{k:\,|x_k \pm 2| \le \tau} \frac{\sigma_k}{|\phi(x_k)|^2 \sqrt{x_k^2-4}} \le C\sqrt{\tau}, \quad \forall \tau > 0. \tag{8.22} \]
The condition (8.22) plus $\tilde w \in A_2$ give the full necessary and sufficient condition for (8.12) to hold, and so for the $L^2$ boundedness of the operators of transformation.

However we want to simplify (8.22). The problem with this condition as it is shown now lies in the fact that we have to compute the outer function $\phi$ with given absolute value $\sqrt{\tilde w}$ on $[-2,2]$. This might not be easy in general. We want to use the fact that $\tilde w \in A_2[-2,2]$ once again to replace $\phi(x_k)$ by a simpler expression. We need one more lemma.

Lemma 8.8. Let $\tilde w \in A_2[-2,2]$ and let $x > 2$. There are two constants $0 < c < C < \infty$ independent of $x$ such that
\[ c\,\frac{1}{x-2} \int_{4-x}^{2} \tilde w\,dt \le \phi^2(x) \le C\,\frac{1}{x-2} \int_{4-x}^{2} \tilde w\,dt. \tag{8.23} \]

Proof. Let $P_z(s)$ stand for the Poisson kernel for the domain $\Omega$ with pole at $z \in \Omega$. It is easy to write its formula using the conformal mapping onto the disc, but we prefer to write its asymptotic behavior when $z > 2$ and $z - 2$ is small:
\[ P_z(s) \asymp \frac{\sqrt{z-2}}{\sqrt{2-s}\,(z-s)}. \tag{8.24} \]
Notice that it is sufficient to prove only the right inequality in (8.23). In fact, the left one then follows from the right one applied to $\tilde w^{-1}$ if one uses $\tilde w \in A_2$. So let $\delta$ be a number close to 0, but $\delta > 0$. There exists such a $\delta$ that $\tilde w^{\delta}$ is in $A_{1+a}$, $a > 0$.

φ (x) = e 2

#2

˜ −2 log wPx (s) ds

( 2 −2

)1 √ δ x −2 ds . w √ 2−s x −s ˜δ

1

We can split the last integral into two: 2 I := 4−x

√

1 x−2 ds C √ w √ 2−s x −s x −2 ˜δ

1

2 4−x

1 ˜δ√ w ds 2−s

and 4−x √ 2 x −2 δ ˜ · . . . ds C w ˜δ w ds. II := 3 (x − s) 2 −2

−2

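It is easy to take care of II. In fact, it is well known that for any $A_{1+a}[-2,2]$ weight $u$
\[ \int_{-2}^{2} \frac{(x-2)^a}{(x-s)^{1+a}}\,u(s)\,ds \le C_a\,\frac{1}{x-2} \int_{4-x}^{2} u(s)\,ds. \]
But it is false to claim that for any $A_2$ weight $u$ one has
\[ \frac{1}{\sqrt{x-2}} \int_{4-x}^{2} u(s)\,\frac{1}{\sqrt{2-s}}\,ds \le C\,\frac{1}{x-2} \int_{4-x}^{2} u(s)\,ds. \]
Just take $u$ to be equal to $\frac{1}{\sqrt{2-s}}$ for all $s < 2$ and close to 2. Therefore term I is more difficult than term II. But not much. Use the Hölder inequality:
\[ I^{1/\delta} \le C \Big( \frac{1}{\sqrt{x-2}} \int_{4-x}^{2} \frac{\tilde w^{\delta}}{\sqrt{2-s}}\,ds \Big)^{1/\delta} \le C \cdot \frac{1}{x-2} \int_{4-x}^{2} \tilde w\,ds \cdot \Big( \frac{1}{x-2} \int_{4-x}^{2} \frac{ds}{(2-s)^{\frac{1}{2}\cdot\frac{1}{1-\delta}}} \Big)^{\frac{1-\delta}{\delta}} \cdot (x-2)^{\frac{1}{2\delta}} \le C\,\frac{1}{x-2} \int_{4-x}^{2} \tilde w\,ds. \]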

As a result we get $|\phi^2(x)| \le C\,\frac{1}{x-2} \int_{4-x}^{2} \tilde w\,ds$, which is the right inequality of the lemma. We already noticed that the left inequality follows from the right one (using the $A_2$ property and applying what we proved to $\frac{1}{\tilde w}$). Hence the lemma is completely proved. □

Now we can rewrite (8.22) in an equivalent form.

Proposition 8.9. Let $x_k \to 2$ (we consider accumulation to the point 2 only, accumulation to $-2$ is symmetric). Consider the condition
\[ \sum_{k:\,x_k - 2 \le \tau} \frac{\sigma_k}{\int_{4-x_k}^{2} \tilde w(s)\,ds}\,\sqrt{x_k-2} \le C\sqrt{\tau}, \quad \forall \tau > 0. \tag{8.25} \]
Then (if points accumulate only to 2) (8.25) plus $\tilde w \in A_2[-2,2]$ are equivalent to (8.12). If points accumulate to both $\pm 2$ we need to add an obvious symmetric condition near $-2$. Thus Theorem 1.19 is completely proved.

Remark 8.10 (Step backward, step forward). Let
\[ \tilde\theta(v) := \frac{b_0 + v\theta^{(1)}(v)}{1 + \bar b_0 v\theta^{(1)}(v)}, \quad b_0 \in \mathbb{D}. \tag{8.26} \]

Then
\[ \tilde\theta(v) = \frac{1 + \bar c_0\,\theta(0)}{1 + c_0\,\overline{\theta(0)}} \cdot \frac{c_0 + \theta(v)}{1 + \bar c_0\,\theta(v)}, \tag{8.27} \]
where $c_0 = \frac{b_0 - a_0}{1 - \bar a_0 b_0}$ is actually an arbitrary point in $\mathbb{D}$. Obviously multiplication of $\tilde\theta$ by $e^{ic}$ does not change the norm of the transformation operator. Thus an arbitrary fraction-linear transformation
\[ \tilde\theta(v) := e^{ic}\,\frac{c_0 + \theta(v)}{1 + \bar c_0\,\theta(v)} \tag{8.28} \]
preserves the $A_2$ (1.68) and "Carleson" (1.69) conditions.
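The identity (8.27) can be checked numerically. The sketch below assumes (this is not restated in the present section) that $\theta$ is obtained from $\theta^{(1)}$ by the usual Schur step $\theta(v) = \frac{a_0 + v\theta^{(1)}(v)}{1 + \bar a_0 v\theta^{(1)}(v)}$ with $a_0 = \theta(0)$. The helper names and the sample values are purely illustrative.

```python
def mobius_step(a, s):
    """Schur-type step (a + s) / (1 + conj(a) * s); s stands for v * theta1(v)."""
    return (a + s) / (1 + a.conjugate() * s)


def theta_tilde_via_theta(a0, b0, theta):
    """Right-hand side of (8.27), written through theta with theta(0) = a0."""
    c0 = (b0 - a0) / (1 - a0.conjugate() * b0)
    phase = (1 + c0.conjugate() * a0) / (1 + c0 * a0.conjugate())
    return phase * (c0 + theta) / (1 + c0.conjugate() * theta), c0, phase


# Hypothetical sample values inside the unit disc.
a0 = 0.3 + 0.4j      # assumed first Schur parameter, a0 = theta(0)
b0 = -0.5 + 0.2j     # replacement parameter from (8.26)
s = 0.1 - 0.6j       # value of v * theta1(v) at some fixed v

theta = mobius_step(a0, s)      # assumed form of theta(v)
direct = mobius_step(b0, s)     # (8.26)
via, c0, phase = theta_tilde_via_theta(a0, b0, theta)
err = abs(direct - via)
print(err, abs(phase), abs(c0))
```

Under this assumption the prefactor in (8.27) is unimodular, which is why it can be absorbed into the factor $e^{ic}$ of (8.28).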



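9. Sufficient condition in terms of scattering data

Proof of Theorem 1.20. Let
\[ W = \begin{bmatrix} 1 & \bar R_+ \\ R_+ & 1 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} \bar B & 0 \\ 0 & B \end{bmatrix}. \]
Condition (1.72) means that the matrix weight $\mathbf{B}W\mathbf{B}^*$ is in $A_2$, see Appendix A. First we prove that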

f − 2 Q f − 2L2

α−

(9.1)

for f − (t) ∈ Hˆ α2− . In fact even f − 2 Q f − 2R− . Recall ¯ 2, T− (τ )f − (τ ) = τ¯ f + (τ¯ ) + R+ (τ )f + (τ ) ∈ BH where f + ∈ L2α+ Hˇ α2+ . Therefore 0 f + (τ ) = P+ BW τ¯ f + (τ¯ ) B(τ )T− (τ )f − (τ )

and + + f (τ ) f (τ ) −1 ∗ BW B P+ BW , P+ BW = f − 2 . τ¯ f + (τ¯ ) τ¯ f (+ τ¯ )

(9.2)

Due to the A2 condition we get + f (τ ) f + (τ ) , = Q f − 2R− .

f Q W τ¯ f + (τ¯ ) τ¯ f + (τ¯ ) − 2

(9.3)

Now we will prove the second part of the claim, that is, ' + '2 ' ' 'f ' Q'f + '2 2

Lα+

(9.4)

for f + (t) ∈ Hˆ α2+ . Since (9.1) holds then Hˆ α2− = Hˇ α2− (moreover Hˆ α2− ⊂ H 2 ). Evidently, this implies Hˆ α2+ = Hˇ α2+ . Indeed, + + + Hˆ α2+ = L2α− Hˇ α2− = L2α− Hˆ α2− = L2α+ Hˆ α2− = Hˇ α2+ . Therefore (9.4) is guarantied by the inequality f (τ ) f (τ ) f (ζk )2 ν(ζk ) f, f Q W , + τ¯ f (t¯) τ¯ f (τ¯ )

(9.5)



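for functions of the form $f = f_1 + Bf_2$, where
\[ f_1(\zeta) = \sum_{k=1}^{N} \frac{B(\zeta)}{(\zeta - \zeta_k)B'(\zeta_k)}\,f(\zeta_k), \quad f_2 \in H^2. \]
Note that $f_1$ and $Bf_2$ are orthogonal with respect to the standard metric in $H^2$, i.e.,
\[ \|f\|^2 = \|f_1\|^2 + \|f_2\|^2. \]
Let us calculate the matrix of the metric in $\check H^2_{\alpha_+}$ which is generated by this decomposition. With $W$ and the diagonal matrix $\mathbf{B} = \operatorname{diag}(\bar B, B)$ as at the beginning of this section,
\[ \langle Bf_2, Bf_2 \rangle_{L^2_{\alpha_+}} = \frac{1}{2} \Big\langle \mathbf{B}W\mathbf{B}^* \begin{bmatrix} f_2(\tau) \\ \bar\tau f_2(\bar\tau) \end{bmatrix}, \begin{bmatrix} f_2(\tau) \\ \bar\tau f_2(\bar\tau) \end{bmatrix} \Big\rangle = \langle (I + H_2)f_2, f_2 \rangle, \tag{9.6} \]
where $H_2$ is the Hankel operator generated by the symbol $\tilde R_+$, $H_2 f_2 = P_+ \bar\tau (\tilde R_+ f_2)(\bar\tau)$. Similarly
\[ \langle f_1, f_1 \rangle_{L^2_{\alpha_+}} = \langle (I + H_1)f_1, f_1 \rangle + \delta(f_1, f_1). \tag{9.7} \]
Here $\delta$ is the quadratic form corresponding to the scalar product in $L^2_{\nu_+}$. Finally,
\[ \langle f_2, f_1 \rangle_{L^2_{\alpha_+}} = \langle Tf_2, f_1 \rangle, \tag{9.8} \]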
where T is the truncated Toeplitz operator Tf2 = P+ BP− τ¯ (R˜ + f2 )(τ¯ ). In these terms, according to (9.5) and the above (9.6)–(9.8), we have to show that there exists 1 (= Q ) > 0 such that

I 0

0 I + H1 + δ T∗ I

T . I + H2

(9.9)

By (1.72) we have H2 < 1. Therefore we can substitute (9.9) by

I 0

I + H1 + δ 0 T∗ I + H2

T . I + H2

It is equivalent to

I + H1 + δ − T∗

T 0 (1 − )(I + H2 )

(9.10)

2206


or

T (1 − )(I + H1 + δ − ) T∗ (I + H2 ) I + H1 (1 − )(δ − ) − (I + H1 ) 0 T 0. = + T∗ 0 0 (I + H2 )

(9.11)

Since the first term in the RHS (9.11) is nonnegative it is enough to find such that 3− (I + H1 ) + I I δ. 1− 1− Note that the last inequality is the same as

3−

f1 2 f1 2L2 . ν+ 1−

(9.12)

Due to the (well-known) lemma below, the Carleson condition for ν˜ + , given by (1.70), implies

f1 2 Q f1 2L2 .

(9.13)

ν+

Thus (9.12) and consequently the whole theorem is proved.

2

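Lemma 9.1. The following condition
\[ \|f\|^2_{K_B} \le Q\,\|f\|^2_{L^2_\nu}, \quad \forall f \in K_B := H^2 \ominus BH^2, \tag{9.14} \]
is satisfied if and only if $\tilde\nu$,
\[ \tilde\nu(\zeta_k) = \frac{1}{|B'(\zeta_k)|^2\,\nu(\zeta_k)}, \]
is a Carleson measure.

Proof. A function $f \in K_B$ can be represented in the form
\[ f = \sum_k \frac{B(\zeta)}{(\zeta - \zeta_k)B'(\zeta_k)}\,f(\zeta_k), \]
so (9.14) is equivalent to the boundedness of the operator
\[ A\{x_k\} = \sum_k \frac{B(\zeta)}{(\zeta - \zeta_k)B'(\zeta_k)}\,\frac{x_k}{\sqrt{\nu(\zeta_k)}} : l^2 \to K_B. \]
Note that for $f = A\{x_k\}$
\[ \langle f, g \rangle = \sum_k \frac{x_k \bar y_k}{\sqrt{\nu(\zeta_k)}} \quad \text{for } g = \sum_k \frac{y_k}{1 - \zeta\bar\zeta_k}. \]
That is,
\[ A^*(g) = \Big\{ \frac{y_k}{\sqrt{\nu(\zeta_k)}} \Big\} \]
and (9.14) can be rewritten into the form
\[ \sum_k \frac{|y_k|^2}{\nu(\zeta_k)} \le Q\,\|g\|^2_{K_B}. \tag{9.15} \]
Note that
\[ g = \sum_k \frac{y_k}{1 - \zeta\bar\zeta_k} \;\to\; \tilde g = \sum_k \frac{B(\zeta)\,y_k}{\zeta - \zeta_k} \]
is a unitary mapping, and thus we get from (9.15)
\[ \sum_k |\tilde g(\zeta_k)|^2\,\frac{1}{|B'(\zeta_k)|^2\,\nu(\zeta_k)} \le Q\,\|\tilde g\|^2_{K_B} \tag{9.16} \]
for all $\tilde g \in K_B$. □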

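Appendix A. Attachment

The space $L_\theta$ is defined as the set of 2D vector functions with the scalar product
\[ \|f\|^2 = \int_{\mathbb{T}} \Big\langle \begin{bmatrix} 1 & \theta \\ \bar\theta & 1 \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}, \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \Big\rangle\,dm(t). \tag{A.1} \]
By $K_\theta$ we denote its subspace
\[ K_\theta = L_\theta \ominus \begin{bmatrix} H^2_- \\ H^2_+ \end{bmatrix}. \]

Lemma A.1. The vector
\[ \begin{bmatrix} 1 \\ -\overline{\theta(\mu)} \end{bmatrix} \frac{1}{1 - t\bar\mu} \]
is the reproducing kernel in $K_\theta$.

Proposition A.2. Let $E = \{t : |\theta| \ne 1\}$. For $f \in K_\theta$
\[ \int_E \frac{|f_1 + \theta f_2|^2}{1 - |\theta|^2}\,dm \le C\,\|f\|^2 \tag{A.2} \]
(compare (8.10)) if and only if
\[ \int_E \Big\langle \begin{bmatrix} 1 & \theta \\ \bar\theta & 1 \end{bmatrix}^{-1} Hf, Hf \Big\rangle\,dm(t) \le C \int_{\mathbb{T}} \Big\langle \begin{bmatrix} 1 & \theta \\ \bar\theta & 1 \end{bmatrix} f, f \Big\rangle\,dm(t), \tag{A.3} \]
where $(Hf)(z)$ is the Hilbert Transform
\[ (Hf)(z) = \int_{\mathbb{T}} \begin{bmatrix} 1 & \theta \\ \bar\theta & 1 \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \frac{t}{t - z}\,dm(t). \]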

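Proof. By definition
\[ Hf(z) = \int_{\mathbb{T}} \begin{bmatrix} 1 & \theta \\ \bar\theta & 1 \end{bmatrix} \Big\{ \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} + \begin{bmatrix} h_- \\ h_+ \end{bmatrix} \Big\} \frac{t}{t - z}\,dm(t) = P_+ \Big\{ \begin{bmatrix} f_1 + \theta f_2 \\ \bar\theta f_1 + f_2 \end{bmatrix} + \begin{bmatrix} 1 \\ \bar\theta \end{bmatrix} h_- + \begin{bmatrix} \theta \\ 1 \end{bmatrix} h_+ \Big\}(z) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} (f_1 + \theta f_2)(z) + \begin{bmatrix} \theta \\ 1 \end{bmatrix}(z)\,h_+(z). \tag{A.4} \]
Therefore
\[ \int_E \Big\langle \frac{1}{1 - |\theta|^2} \begin{bmatrix} 1 & -\theta \\ -\bar\theta & 1 \end{bmatrix} Hf, Hf \Big\rangle\,dm = \int_E \Big\langle \begin{bmatrix} 1 \\ -\bar\theta \end{bmatrix} \frac{f_1 + \theta f_2}{1 - |\theta|^2} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} h_+, \begin{bmatrix} 1 \\ 0 \end{bmatrix} (f_1 + \theta f_2) + \begin{bmatrix} \theta \\ 1 \end{bmatrix} h_+ \Big\rangle\,dm = \int_E \frac{|f_1 + \theta f_2|^2}{1 - |\theta|^2}\,dm + \int_E |h_+|^2\,dm. \tag{A.5} \]
That is, (A.3) is equivalent to
\[ \int_E \frac{|f_1 + \theta f_2|^2}{1 - |\theta|^2}\,dm + \int_E |h_+|^2\,dm \le C \Big( \|f\|^2 + \int_{\mathbb{T}} |h_+|^2\,dm \Big). \tag{A.6} \]
□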

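Lemma A.3. Condition (A.3) implies
\[ \sup_I \frac{1}{|I|} \int_I \frac{|\theta - \langle\theta\rangle_I|^2 + (1 - |\langle\theta\rangle_I|^2)}{1 - |\theta|^2}\,dm < \infty, \tag{A.7} \]
where for an arc $I \subset E$ we put
\[ \langle\theta\rangle_I := \frac{1}{|I|} \int_I \theta\,dm. \tag{A.8} \]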


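Proof. In particular (A.3) implies that the matrix weight $\begin{bmatrix} 1 & \theta \\ \bar\theta & 1 \end{bmatrix}$ is in $A_2$ on $E$. And this means
\[ \frac{1}{|I|} \int_I \frac{1}{1 - |\theta|^2} \begin{bmatrix} 1 & -\theta \\ -\bar\theta & 1 \end{bmatrix} dm \le C \begin{bmatrix} 1 & \langle\theta\rangle_I \\ \overline{\langle\theta\rangle_I} & 1 \end{bmatrix}^{-1}, \tag{A.9} \]
or
\[ \frac{1}{|I|} \int_I \frac{1}{1 - |\theta|^2} \begin{bmatrix} 1 - |\theta|^2 + |\theta - \langle\theta\rangle_I|^2 & (\langle\theta\rangle_I - \theta)\sqrt{1 - |\langle\theta\rangle_I|^2} \\ \overline{(\langle\theta\rangle_I - \theta)}\sqrt{1 - |\langle\theta\rangle_I|^2} & 1 - |\langle\theta\rangle_I|^2 \end{bmatrix} dm \le C, \tag{A.10} \]
which is equivalent to (A.7). □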

References [1] D. Arov, H. Dym, On matricial Nehari problems, J -inner matrix functions and the Muckenhoupt condition, J. Funct. Anal. 181 (2001) 227–299. [2] D. Arov, H. Dym, Criteria for the strong regularity of J -inner functions and γ -generating matrices, J. Math. Anal. Appl. 280 (2003) 387–399. [3] R. Beals, P. Deift, C. Tomei, Direct and Inverse Scattering on the Line, Math. Surveys Monogr., vol. 28, Amer. Math. Soc., Providence, RI, 1988. [4] K. Case, J. Geronimo, Scattering theory and polynomials orthogonal on the real line, Trans. Amer. Math. Soc. 258 (1980) 467–494. [5] M. Christ, A. Kiselev, Scattering and wave operators for one-dimensional Schrödinger operators with slowly decaying nonsmooth potentials, Geom. Funct. Anal. 12 (2002) 1174–1234. [6] D. Damanik, R. Killip, B. Simon, Perturbation of orthogonal polynomials with periodic recursion coefficients, preprint, 2006. [7] P.A. Deift, R. Killip, On the absolutely continuous spectrum of one-dimensional Schrödinger operators with square summable potentials, Comm. Math. Phys. 203 (1999) 341–347. [8] S.A. Denisov, On Rakhmanov’s theorem for Jacobi matrices, Proc. Amer. Math. Soc. 132 (2004) 847–852. [9] S.A. Denisov, On the existence of wave operators for some Dirac operators with square summable potential, Geom. Funct. Anal. 14 (3) (2004) 529–534. [10] I. Egorova, J. Michor, G. Teschl, Scattering theory for Jacobi operators with quasi-periodic background, Comm. Math. Phys. 264 (2006) 811–842. [11] I. Egorova, J. Michor, G. Teschl, Scattering theory for Jacobi operators with steplike quasi-periodic background, Inverse Problems 23 (2007) 905–918. [12] John B. Garnett, Bounded Analytic Functions, Revised first ed., Grad. Texts in Math., vol. 236, Springer, New York, 2007, xiv+459 pp. [13] F. Gesztesy, M. Zinchenko, Weyl–Titchmarsh theory for CMV operators associated with orthogonal polynomials on the unit circle, J. Approx. Theory 139 (2006) 172–213. [14] F. Gesztesy, M. 
Zinchenko, A Borg-type theorem associated with orthogonal polynomials on the unit circle, J. London Math. Soc. 74 (2006) 757–777. [15] G. Guseinov, The determination of an infinite Jacobi matrix from the scattering data, Soviet Math. Dokl. 17 (1976) 596–600. [16] A. Kheifets, P. Yuditskii, An analysis and extension of V.P. Potapov’s approach to interpolation problems, in: Matrix and Operator Valued Functions, in: Oper. Theory Adv. Appl., vol. 72, Birkhäuser, Basel, 1994, pp. 133–161. [17] A. Kheifets, F. Peherstorfer, P. Yuditskii, On scattering for CMV matrices, arXiv: 0706.2970v1, 2007. [18] R. Killip, B. Simon, Sum rules for Jacobi matrices and their applications to spectral theory, Ann. of Math. (2) 158 (2003) 253–321.



[19] S. Kupin, F. Peherstorfer, A. Volberg, P. Yuditskii, Inverse scattering problem for a special class of canonical systems and non-linear Fourier integral. Part I: Asymptotics of eigenfunctions, in: Oper. Theory Adv. Appl., vol. 186, Birkhäuser Verlag, Basel, 2008, pp. 285–323. [20] N. Makarov, A. Poltoratski, Beurling–Malliavin theory for Toeplitz kernels, preprint, 2007, arXiv: math/0702497. [21] V. Marchenko, Sturm–Liouville Operators and Applications, Birkhäuser, Basel, 1986. [22] V. Marchenko, Nonlinear Equations and Operator Algebras, Math. Appl. (Soviet Ser.), vol. 17, D. Reidel Publishing Co., Dordrecht–Boston, MA, 1988. [23] P. van Moerbekke, D. Mumford, The spectrum of difference operators and algebraic curves, Acta Math. 143 (1–2) (1979) 93–154. [24] N. Nikolskii, Treatise on the Shift Operator, Springer-Verlag, Berlin, 1986. [25] F. Peherstorfer, P. Yuditskii, Asymptotics of orthonormal polynomials in the presence of a denumerable set of mass points, Proc. Amer. Math. Soc. 129 (2001) 3213–3230. [26] F. Peherstorfer, P. Yuditskii, Asymptotic behavior of polynomials orthonormal on a homogeneous set, J. Anal. Math. 89 (2003) 113–154. [27] F. Peherstorfer, P. Yuditskii, Finite difference operators with a finite-band spectrum, in: Oper. Theory Adv. Appl., vol. 186, Birkhäuser Verlag, Basel, 2008, pp. 345–387. [28] I.I. Privalov, Graniˇcnye Svo˘ıstva Analitiˇceskih Funkci˘ı (Boundary Properties of Analytic Functions), second ed., Gosudarstv. Izdat. Tehn.-Teor. Lit., Moscow–Leningrad, 1950, 336 pp. (in Russian). [29] Ch. Remling, The absolutely continuous spectrum of Jacobi matrices, preprint, 2007, arXiv: 0706.1101. [30] B. Simon, Orthogonal Polynomials on the Unit Circle, Part 1: Classical Theory, Amer. Math. Soc. Colloq. Publ., Amer. Math. Soc., Providence, RI, 2005. [31] B. Simon, Orthogonal Polynomials on the Unit Circle, Part 2: Spectral Theory, Amer. Math. Soc. Colloq. Publ., Amer. Math. Soc., Providence, RI, 2005. [32] B. 
Simon, A canonical factorization for meromorphic Herglotz functions on the unit disc and sum rules for Jacobi matrices, J. Funct. Anal. 214 (2004) 396–409. [33] B. Simon, CMV matrices: Five years after, J. Comput. Appl. Math. 208 (1) (2007) 120–154. [34] M. Sodin, P. Yuditskii, Almost periodic Jacobi matrices with homogeneous spectrum, infinite-dimensional Jacobi inversion, and Hardy spaces of character-automorphic functions, J. Geom. Anal. 7 (3) (1997) 387–435. [35] T. Tao, Ch. Thiele, Nonlinear Fourier Analysis, IAS/Park City Mathematics Series, Graduate Summer School 2003, in press. [36] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Math. Surveys Monogr., vol. 72, Amer. Math. Soc., Providence, RI, 2000. [37] A. Volberg, P. Yuditskii, On the inverse scattering problem for Jacobi matrices with the spectrum on an interval, a finite system of intervals or a Cantor set of positive length, Comm. Math. Phys. 226 (2002) 567–605.


A general two-scale criteria for logarithmic Sobolev inequalities Tony Lelièvre a,b,∗ a CERMICS, École des Ponts, 77455 Marne-La-Vallée cedex 2, France b INRIA Rocquencourt, MICMAC project, Domaine de Voluceau, B.P. 105, 78153 Le Chesnay cedex, France

Received 7 June 2008; accepted 25 September 2008 Available online 10 October 2008 Communicated by C. Villani

Abstract We present a general criteria to prove that a probability measure satisfies a logarithmic Sobolev inequality, knowing that some of its marginals and associated conditional laws satisfy a logarithmic Sobolev inequality. This is a generalization of a result by N. Grunewald et al. [N. Grunewald, F. Otto, C. Villani, M.G. Westdickenberg, A two-scale approach to logarithmic Sobolev inequalities and the hydrodynamic limit, Ann. Inst. H. Poincaré Probab. Statist., in press]. © 2008 Elsevier Inc. All rights reserved. Keywords: Logarithmic Sobolev inequality; Two-scale criteria

* Address for correspondence: CERMICS, École Nationale des Ponts et Chaussées, 77455 Marne-La-Vallée cedex 2, France. E-mail address: [email protected].

T. Lelièvre / Journal of Functional Analysis 256 (2009) 2211–2221

1. Motivation and notation

The motivation behind this work is molecular dynamics (in the canonical statistical ensemble), and more precisely, (i) the analysis of numerical methods for the computation of free energy differences [8] (see Remark 1 below) and (ii) the derivation of effective dynamics on coarse-grained variables [7]. In both cases, it appears that estimates based on entropies for measures related to the Boltzmann–Gibbs measure are a useful tool. One important question is the following: what is the link between the logarithmic Sobolev inequality (LSI) constant of the Boltzmann–Gibbs measure for the original variables (microscopic level) and the LSI constant of the Boltzmann–Gibbs measure for some coarse-grained variables (macroscopic level)? The aim of this work is to give an answer, which is a generalization to non-linear coarse-graining operators of results in [6,9].

Let $D$ be a domain of $\mathbb{R}^n$ representing the configuration space of the system under consideration, and $V : D \to \mathbb{R}$ a potential, associating to each configuration an energy. Let us consider a function (representing the coarse-grained variables, also called the reaction coordinates) $\xi : D \to M$, with $M \subset \mathbb{R}^p$ (and $1 \le p < n$). Let us introduce the Gram matrix $G : D \to \mathbb{R}^{p \times p}$ of the derivative $\nabla\xi : D \to \mathbb{R}^{p \times n}$: $G = \nabla\xi\,\nabla\xi^T$, i.e., componentwise, $\forall \alpha, \beta \in \{1, \ldots, p\}$,
\[ G_{\alpha,\beta} = \nabla\xi_\alpha \cdot \nabla\xi_\beta. \tag{1} \]

We suppose that $\xi$ is such that

(H1) $\xi$ is a smooth function such that $\det G \ne 0$ on $D$.

The submanifolds $\Sigma_z = \{x \in D,\ \xi(x) = z\}$ are then smooth submanifolds of $D$ of codimension $p$. We denote by $\sigma_{\Sigma_z}$ the surface measure on $\Sigma_z$, i.e. the Lebesgue measure on $\Sigma_z$ induced by the Lebesgue measure in the ambient Euclidean space $D$. The submanifold $\Sigma_z$ naturally has a (complete and locally compact) Riemannian structure induced by the Euclidean structure of the ambient space $D$.

Let us define the density $\psi_0$ (with respect to the Lebesgue measure on $D$) of the Boltzmann–Gibbs probability measure $d\mu_0(x) = \psi_0(x)\,dx$ associated to the potential $V$:
\[ \psi_0 = Z^{-1} \exp(-V), \]
where $Z = \int_D \exp(-V)$. We denote by $\psi_0^\xi$ the density (with respect to the Lebesgue measure on $M$) of the image $d\mu_0^\xi(z) = \psi_0^\xi(z)\,dz$ of the measure $\mu_0$ by $\xi$:
\[ \psi_0^\xi(z) = Z^{-1} \int_{\Sigma_z} \exp(-V)\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}. \]
Let us introduce then the conditional measure $\mu_{0,z}$ of $\mu_0$ at a fixed value $z$ of $\xi$:
\[ d\mu_{0,z} = \frac{Z^{-1} \exp(-V)\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi_0^\xi(z)}. \]
Let us introduce the effective potential $A_0$ associated to $\xi$ (also called free energy), defined by
\[ A_0(z) = -\ln \psi_0^\xi(z). \tag{2} \]

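The following expression for the derivative of $A_0$ (also called the mean force) is obtained:
\[ \nabla A_0(z) = \int_{\Sigma_z} F\,d\mu_{0,z}, \tag{3} \]
where $F$ is defined by: $\forall \alpha \in \{1, \ldots, p\}$,
\[ F_\alpha = \sum_{\beta=1}^{p} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla V - \operatorname{div}\Big( \sum_{\beta=1}^{p} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \Big), \tag{4} \]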

where $G^{-1}_{\alpha,\beta}$ denotes the $(\alpha,\beta)$-component of the inverse of the matrix $G$. All these results can be derived using the co-area formula (see Lemma 2.2 below), using similar computations as in Lemma 2.3 below.

Let us also introduce the following projection operators. For any $x \in D$, we denote by
\[ P(x) = \mathrm{Id} - \sum_{\alpha,\beta=1}^{p} G^{-1}_{\alpha,\beta} \nabla\xi_\alpha \otimes \nabla\xi_\beta(x) \tag{5} \]
the orthogonal projection operator onto the tangent space $T_x\Sigma_{\xi(x)}$ to $\Sigma_{\xi(x)}$ at point $x$, and by
\[ Q(x) = \mathrm{Id} - P(x) = \sum_{\alpha,\beta=1}^{p} G^{-1}_{\alpha,\beta} \nabla\xi_\alpha \otimes \nabla\xi_\beta(x) \tag{6} \]
the orthogonal projection operator onto the normal space $N_x\Sigma_{\xi(x)}$ to $\Sigma_{\xi(x)}$ at point $x$. We denote by $\otimes$ the tensor product: for two vectors $u, v \in \mathbb{R}^n$, $u \otimes v$ is an $n \times n$ matrix with components $(u \otimes v)_{i,j} = u_i v_j$.

For any two probability measures $\mu$ and $\nu$ such that $\mu$ is absolutely continuous with respect to $\nu$ (this property being denoted $\mu \ll \nu$ in the following), we introduce the relative entropy
\[ H(\mu|\nu) = \int \ln\Big(\frac{d\mu}{d\nu}\Big)\,d\mu. \]
Let us also introduce the Fisher information. For any two probability measures $\mu$ and $\nu$ such that $\mu \ll \nu$,
\[ I(\mu|\nu) = \int \Big| \nabla \ln\Big(\frac{d\mu}{d\nu}\Big) \Big|^2\,d\mu. \tag{7} \]
In (7) and in the following, $|\cdot|$ denotes the Euclidean norm (in $\mathbb{R}^n$ or in $\mathbb{R}^p$). In the case $\nu$ is a probability measure on the (Riemannian) submanifold $\Sigma_z$, $\nabla$ actually denotes the gradient on $\Sigma_z$ in (7), namely
\[ \nabla_{\Sigma_z} = P\nabla. \tag{8} \]
We recall the definition of the Logarithmic Sobolev Inequality (LSI).

Definition 1.1. The probability measure $\nu$ satisfies a logarithmic Sobolev inequality with constant $\rho > 0$ (in short: LSI($\rho$)) if for all probability measures $\mu$ such that $\mu \ll \nu$,
\[ H(\mu|\nu) \le \frac{1}{2\rho}\,I(\mu|\nu). \]

The main result of this paper states conditions under which a LSI holds for $\mu_0$, assuming that a LSI holds for the conditional probability measure $\mu_{0,z}$ (this is (H2)) and for the marginal $\mu_0^\xi$ (this is (H3)).

Theorem 1.2. In addition to (H1), let us assume (recall that the local mean force $F$ is defined by (4)):

(H2) $V$ and $\xi$ are such that $\exists \rho > 0$, for all $z \in M$, the conditional measure $\mu_{0,z}$ satisfies LSI($\rho$).
(H3) $V$ and $\xi$ are such that $\exists r > 0$, the measure $d\mu_0^\xi = \psi_0^\xi(z)\,dz$ satisfies LSI($r$).
(H4) $V$ and $\xi$ are sufficiently differentiable functions such that $\exists m > 0$, $G \ge m\,\mathrm{Id}$ and (a) $\|\nabla_{\Sigma_z} F\|_{L^\infty} \le M < \infty$ or (b) $\|F\|_{L^\infty} \le \frac{M}{\sqrt{\rho}} < \infty$.

Then $\mu_0$ satisfies LSI($R$) with
\[ R = \frac{1}{2} \Big( rm + \frac{M^2 m}{\rho} + \rho - \sqrt{\Big( rm + \frac{M^2 m}{\rho} + \rho \Big)^2 - 4rm\rho} \Big). \tag{9} \]

In (H4), $G \ge m\,\mathrm{Id}$ should be understood in the following sense: for any vector $u \in \mathbb{R}^p$, $u^T G u \ge m|u|^2$. In (H4)(a) or (H4)(b), the $L^\infty$ norm is with respect to $x \in D$: $\|F\|_{L^\infty} = \sup_{x \in D} |F|$ and $\|\nabla_{\Sigma_z} F\|_{L^\infty} = \sup_{x \in D} |\nabla_{\Sigma_z} F|$, where $|\cdot|$ here denotes the operator norm on the matrix $\nabla_{\Sigma_z} F$ associated to the Euclidean norm on the vectors: $|\nabla_{\Sigma_z} F(x)| = \sup_{u \in T_x\Sigma_z} \frac{|\nabla F(x)u|}{|u|}$.

Assumption (H4)(a) is an assumption on the coupling in the following sense. Assume that $V(x) = \frac{1}{2} x^T H x$ for some symmetric positive matrix $H \in \mathbb{R}^{n \times n}$ (so that $\mu_0$ is a Gaussian law), and that $\xi(x_1, \ldots, x_n) = (x_1, \ldots, x_p)$ (so that $G = \mathrm{Id}$). Then, $\nabla_{\Sigma_z} F = 0$ is equivalent to the fact that the covariance $\mathrm{Cov}((X_1, \ldots, X_p), (X_{p+1}, \ldots, X_n)) = 0$, where $(X_1, \ldots, X_n)$ is a random variable with law $\mu_0$, and thus equivalent to the fact that the projected variables $(X_1, \ldots, X_p)$ are decoupled from the variables $(X_{p+1}, \ldots, X_n)$ with values in the submanifolds $\Sigma_z$. In this case of Gaussian laws and a linear function $\xi$, it can be checked that (9) is optimal in the sense that $R$ is actually the largest constant for which a LSI holds for $\mu_0$ (see [9]).

Theorem 1.2 is a generalization of [6, Theorem 3] where a similar result is proven for a linear function $\xi$. In [6], this result is used to derive a quantitative estimate of the difference between the projection of a microscopic dynamics on coarse variables and an effective dynamics on these coarse variables.
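The optimality claim in the Gaussian case can be illustrated numerically in dimension two. The sketch below takes $V(x) = \frac{1}{2}x^T H x$ on $\mathbb{R}^2$ and $\xi(x_1, x_2) = x_1$. It assumes the standard fact that a centered Gaussian law with precision matrix $H$ satisfies LSI with optimal constant $\lambda_{\min}(H)$ in the normalization of Definition 1.1, and it uses the values $\rho = H_{22}$, $r = \det H / H_{22}$, $m = 1$, $M = |H_{12}|$, which are worked out by hand for this example (they are not stated in the text). The function names and the sample matrix are illustrative.

```python
import math


def two_scale_constant(rho, r, m, M):
    """Constant R of formula (9)."""
    s = r * m + M * M * m / rho + rho
    return 0.5 * (s - math.sqrt(s * s - 4.0 * r * m * rho))


def gaussian_check(h11, h12, h22):
    """Compare (9) with lambda_min(H) for V(x) = x^T H x / 2 and xi(x) = x1."""
    det = h11 * h22 - h12 * h12
    rho = h22            # law of x2 given x1 is Gaussian with variance 1/h22
    r = det / h22        # marginal of x1 is Gaussian with variance h22/det
    m = 1.0              # G = Id for a coordinate projection
    M = abs(h12)         # here F = h11*x1 + h12*x2, so its tangential gradient is h12
    R = two_scale_constant(rho, r, m, M)
    tr = h11 + h22
    lam_min = 0.5 * (tr - math.sqrt(tr * tr - 4.0 * det))
    return R, lam_min


R, lam_min = gaussian_check(2.0, 0.5, 1.0)
print(R, lam_min)
```

In this example $rm + M^2 m/\rho + \rho$ equals the trace of $H$ and $rm\rho$ equals its determinant, so (9) reduces to the formula for the smallest eigenvalue of a $2 \times 2$ symmetric matrix.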

Remark 1 (Application to free energy calculation methods). As mentioned in Section 1, Theorem 1.2 is an important result in the framework of molecular dynamics, in particular for the computation of free energy differences [4]. Let us explain this in more detail. A central problem in molecular dynamics is the computation of the free energy $A_0$ associated to the reaction coordinate $\xi$. A naive method consists in using a simple gradient dynamics
\[ dX_t = -\nabla V(X_t)\,dt + \sqrt{2}\,dW_t \]
to sample the measure $\mu_0$, and thus to consider $\xi(X_t)$ to approximate $A_0$. Typically, this method does not work in practice because of the metastable features of the potential $V$: the law of the process $X_t$ (and thus of $\xi(X_t)$) needs a very long time to reach its stationary state. Mathematically, this is related to the fact that the Logarithmic Sobolev constant of $\mu_0$ is typically rather small. Methods have been developed to circumvent this problem. Among them are adaptive methods, for which it can be checked (see [8]) that the rate of convergence to equilibrium is essentially the Logarithmic Sobolev constant $\rho$ of the conditional measures $\mu_{0,z}$ (see (H2) above), under the assumption of a bounded coupling (see (H4)(a) above). A similar rate of convergence is obtained for the thermodynamic integration method, for example. To compare quantitatively the naive method with the more advanced methods such as adaptive methods, a natural question is thus: how are the Logarithmic Sobolev constant of the conditional measures $\mu_{0,z}$ and the Logarithmic Sobolev constant of the measure $\mu_0$ related? Theorem 1.2 gives one answer to this question.

2. Proof

To prove the result, we need to introduce some additional notation. Let $\psi$ be a probability density function on $D$. We denote the total entropy by $E = H(\psi|\psi_0)$, and the macroscopic entropy by

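\[ E_M = H\big(\psi^\xi \big| \psi_0^\xi\big), \]
where
\[ \psi^\xi(z) = \int_{\Sigma_z} \psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}. \tag{10} \]
We denote the conditioned probability measures of $\psi$ at a fixed value $z$ of the reaction coordinate by
\[ d\mu_z = \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi(z)}, \tag{11} \]
the "local entropy" by
\[ e_m(z) = H(\mu_z|\mu_{0,z}) = \int_{\Sigma_z} \ln\Big( \frac{\psi}{\psi^\xi(z)} \Big/ \frac{\psi_0}{\psi_0^\xi(z)} \Big)\,d\mu_z, \]
and finally the microscopic entropy by
\[ E_m = \int_M e_m(z)\,\psi^\xi(z)\,dz. \]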

It is straightforward to obtain the following result, which can be seen as a property of extensivity of the entropy:

Lemma 2.1. It holds $E = E_M + E_m$.

We will need the co-area formula (see [1,5]):

Lemma 2.2. For any smooth function $\phi : \mathbb{R}^n \to \mathbb{R}$,
\[ \int_{\mathbb{R}^n} \phi(x)\,\big(\det G(x)\big)^{1/2}\,dx = \int_{\mathbb{R}^p} \int_{\Sigma_z} \phi\,d\sigma_{\Sigma_z}\,dz, \tag{12} \]

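where $G$ is defined by (1).

Remark 2 (Co-area formula and conditioning). The co-area formula shows that if the random variable $X$ has law $\psi(x)\,dx$ in $\mathbb{R}^n$, then $\xi(X)$ has law $\psi^\xi(z)\,dz$, where $\psi^\xi$ is defined by (10), and the law of $X$ conditioned to a fixed value $z$ of $\xi(X)$ is $\mu_z$, where $\mu_z$ is defined by (11). Indeed, for any bounded functions $f$ and $g$,
\[ \mathbb{E}\big( f(\xi(X))\,g(X) \big) = \int_{\mathbb{R}^n} f(\xi(x))\,g(x)\,\psi(x)\,dx \]
\[ = \int_{\mathbb{R}^p} \int_{\Sigma_z} f \circ \xi\; g\,\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}\,dz \]
\[ = \int_{\mathbb{R}^p} f(z)\,\frac{\int_{\Sigma_z} g\,\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\int_{\Sigma_z} \psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}} \int_{\Sigma_z} \psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}\,dz \]
\[ = \int_{\mathbb{R}^p} f(z) \int_{\Sigma_z} g\,d\mu_z\;\psi^\xi(z)\,dz. \]
The measure $(\det G)^{-1/2}\,d\sigma_{\Sigma_z}$ is sometimes denoted by $\delta_{\xi(x)-z}$ in the literature.

From the co-area formula, we get:

Lemma 2.3. The derivative of $\psi^\xi$ reads: $\forall \alpha \in \{1, \ldots, p\}$,
\[ \partial_{z_\alpha} \psi^\xi(z) = \int_{\Sigma_z} \sum_{\beta=1}^{p} \Big( G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla\psi + \operatorname{div}\big( G^{-1}_{\alpha,\beta} \nabla\xi_\beta \big)\,\psi \Big) (\det G)^{-1/2}\,d\sigma_{\Sigma_z}. \]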


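Proof. For any smooth test function $g : M \to \mathbb{R}^p$, we obtain (using the co-area formula (12) and an integration by parts)¹:
\[ \int_M \psi^\xi \operatorname{div} g = \int_D \psi\,(\operatorname{div} g) \circ \xi \]
\[ = \int_D \psi\,G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla(g_\alpha \circ \xi) \]
\[ = -\int_D \operatorname{div}\big( \psi\,G^{-1}_{\alpha,\beta} \nabla\xi_\beta \big)\,g_\alpha \circ \xi \]
\[ = -\int_M g_\alpha(z) \int_{\Sigma_z} \Big( G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla\psi + \operatorname{div}\big( G^{-1}_{\alpha,\beta} \nabla\xi_\beta \big)\,\psi \Big) (\det G)^{-1/2}\,d\sigma_{\Sigma_z}\,dz, \]
which yields the result. □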

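A corollary of Lemma 2.3 applied with $\psi = \psi_0$ is Eq. (3). Let us now introduce the mean force associated with $\psi$ (compare with (3)):
\[ D(z) = \int_{\Sigma_z} F\,d\mu_z, \]
where the probability measure $\mu_z$ is defined in (11). Notice that, in general, $D \ne -\nabla \ln \psi^\xi$, and $\operatorname{curl} D \ne 0$. We need a measure of the difference between $D$ and $\nabla A_0$, in terms of the difference between $\psi$ and $\psi_0$:

Lemma 2.4. The difference between $D$ and $\nabla A_0$ can be expressed in terms of $\psi$ and $\psi_0$ as follows: for $\alpha \in \{1, \ldots, p\}$, for all $z \in M$,
\[ (D_\alpha - \partial_{z_\alpha} A_0)(z) = \int_{\Sigma_z} \sum_{\beta=1}^{p} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} - \partial_{z_\alpha} \ln\Big( \frac{\psi^\xi}{\psi_0^\xi} \Big). \]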

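Proof. Using Lemma 2.3 and the definition of $D$, it holds:
\[ \int_{\Sigma_z} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} - \partial_{z_\alpha} \ln\Big( \frac{\psi^\xi}{\psi_0^\xi} \Big) \]
\[ = \frac{1}{\psi^\xi} \int_{\Sigma_z} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z} + \int_{\Sigma_z} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla V\,\frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} - \partial_{z_\alpha} \ln \psi^\xi + \partial_{z_\alpha} \ln \psi_0^\xi \]
\[ = -\int_{\Sigma_z} \operatorname{div}\big( G^{-1}_{\alpha,\beta} \nabla\xi_\beta \big) \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} + \int_{\Sigma_z} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla V\,\frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} - \partial_{z_\alpha} A_0 \]
\[ = D_\alpha - \partial_{z_\alpha} A_0. \]
□

¹ In all the following proofs, we use the summation convention on repeated Greek indices going from 1 to $p$.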

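From Lemma 2.4, the following estimates are obtained:

Lemma 2.5. Let us assume (H2) and (H4). Then for all $z \in M$,
\[ |D(z) - \nabla A_0(z)| \le M \sqrt{\frac{2}{\rho}\,e_m(z)}. \]

Proof. If we suppose (H4)(b), then we have:
\[ |D(z) - \nabla A_0(z)| = \Big| \int_{\Sigma_z} F\,d\mu_z - \int_{\Sigma_z} F\,d\mu_{0,z} \Big| \le \|F\|_{L^\infty}\,\|\mu_z - \mu_{0,z}\|_{TV} \le \frac{M}{\sqrt{\rho}}\,\|\mu_z - \mu_{0,z}\|_{TV}, \]
where $\|\mu_z - \mu_{0,z}\|_{TV}$ denotes the total variation norm of the signed measure $(\mu_z - \mu_{0,z})$. The result then follows from the Csiszár–Kullback inequality (see for example [2]):
\[ \|\mu_z - \mu_{0,z}\|_{TV} \le \sqrt{2H(\mu_z|\mu_{0,z})}. \]
Let us now assume (H4)(a). For any coupling measure $\pi \in \Pi(\mu_z, \mu_{0,z})$ defined on $\Sigma_z \times \Sigma_z$ (namely any probability measure on $\Sigma_z \times \Sigma_z$ such that its marginals are $\mu_z$ and $\mu_{0,z}$), it holds:
\[ |D(z) - \nabla A_0(z)| = \Big| \int_{\Sigma_z \times \Sigma_z} \big( F(x) - F(x') \big)\,\pi(dx, dx') \Big| \]
\[ \le \|\nabla_{\Sigma_z} F\|_{L^\infty} \int_{\Sigma_z \times \Sigma_z} d_{\Sigma_z}(x, x')\,\pi(dx, dx') \]
\[ \le M \int_{\Sigma_z \times \Sigma_z} d_{\Sigma_z}(x, x')\,\pi(dx, dx'), \]
where $d_{\Sigma_z}$ denotes the geodesic distance on $\Sigma_z$: $\forall x, y \in \Sigma_z$,
\[ d_{\Sigma_z}(x, y) = \inf\Big\{ \sqrt{\int_0^1 |\dot w(t)|^2\,dt}\ \Big|\ w \in C^1([0,1], \Sigma_z),\ w(0) = x,\ w(1) = y \Big\}. \]
Taking now the infimum over all $\pi \in \Pi(\mu_z, \mu_{0,z})$, we obtain
\[ |D(z) - \nabla A_0(z)| \le M\,W(\mu_z, \mu_{0,z}), \]
where $W(\mu_z, \mu_{0,z})$ denotes the Wasserstein distance with linear cost (see for example [2]). It is known that if $\mu_{0,z}$ satisfies a LSI (which is (H2)), then we have the following Talagrand inequality (see [3,10]):
\[ W(\mu_z, \mu_{0,z}) \le \sqrt{\frac{2}{\rho}\,H(\mu_z|\mu_{0,z})}. \]
This implies the result. □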

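Lemma 2.6. Let us assume (H2). Then it holds
\[ E_m \le \frac{1}{2\rho} \int_D \Big| \nabla_{\Sigma_z} \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi. \]

Proof. Notice that the Fisher information of $\mu_z$ with respect to $\mu_{0,z}$ can be written as
\[ I(\mu_z|\mu_{0,z}) = \int_{\Sigma_z} \Big| \nabla_{\Sigma_z} \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi(z)}. \]
Therefore, using (H2), it follows:
\[ E_m = \int_M e_m\,\psi^\xi\,dz \le \int_M \frac{1}{2\rho} \int_{\Sigma_z} \Big| \nabla_{\Sigma_z} \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi(z)}\,\psi^\xi\,dz, \]
which yields the result, using the co-area formula (12). □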
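The decomposition $E = E_M + E_m$ of Lemma 2.1, which is the starting point of the next step, has an elementary discrete analogue that may help intuition. The sketch below is only an illustration on a finite state space (it replaces the co-area formula by plain sums): rows of a table play the role of the values $z$ of $\xi$, and the entries of a row play the role of the points of $\Sigma_z$. The function names and the sample tables are illustrative.

```python
import math


def rel_entropy(p, q):
    """Relative entropy H(p|q) of two finite probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def decompose(joint_p, joint_q):
    """Return (E, E_M, E_m) for tables joint[z][x]; rows index the coarse variable z."""
    flat_p = [v for row in joint_p for v in row]
    flat_q = [v for row in joint_q for v in row]
    total = rel_entropy(flat_p, flat_q)            # total entropy E
    marg_p = [sum(row) for row in joint_p]         # discrete analogue of psi^xi
    marg_q = [sum(row) for row in joint_q]         # discrete analogue of psi_0^xi
    macro = rel_entropy(marg_p, marg_q)            # macroscopic entropy E_M
    micro = 0.0                                    # microscopic entropy E_m
    for row_p, row_q, mp, mq in zip(joint_p, joint_q, marg_p, marg_q):
        cond_p = [v / mp for v in row_p]           # analogue of mu_z
        cond_q = [v / mq for v in row_q]           # analogue of mu_{0,z}
        micro += mp * rel_entropy(cond_p, cond_q)  # local entropy weighted by the marginal
    return total, macro, micro


P = [[0.10, 0.20], [0.30, 0.40]]
Q = [[0.25, 0.25], [0.25, 0.25]]
E, E_M, E_m = decompose(P, Q)
print(E, E_M + E_m)
```

The two printed numbers agree up to rounding, which is the discrete counterpart of the extensivity property.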

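We are now in position to prove Theorem 1.2. We have (using (H2), (H3), Lemmas 2.1, 2.4, and the inequality $(a + b)^2 \le (1 + \varepsilon)a^2 + (1 + \varepsilon^{-1})b^2$, for a positive $\varepsilon$ to be fixed later on):
\[ E = E_m + E_M \le \frac{1}{2\rho} \int_D \Big| \nabla_{\Sigma_z} \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi + \frac{1}{2r} \int_M \Big| \nabla \ln\Big( \frac{\psi^\xi}{\psi_0^\xi} \Big) \Big|^2 \psi^\xi \]
\[ \le \frac{1}{2\rho} \int_D \Big| \nabla_{\Sigma_z} \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi + \frac{1+\varepsilon}{2r} \int_M |D - \nabla A_0|^2\,\psi^\xi + \frac{1+\varepsilon^{-1}}{2r} \int_M \sum_{\alpha=1}^{p} \Big| \int_{\Sigma_z} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} \Big|^2 \psi^\xi. \]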

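Using the Cauchy–Schwarz inequality:
\[ \Big| \int_{\Sigma_z} G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} \Big|^2 \le \int_{\Sigma_z} \Big| G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \frac{\psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}}{\psi^\xi} \]
and Lemma 2.5, we thus obtain
\[ E \le \frac{1}{2\rho} \int_D \Big| \nabla_{\Sigma_z} \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi + \frac{(1+\varepsilon)M^2}{r\rho} \int_M e_m\,\psi^\xi + \frac{1+\varepsilon^{-1}}{2r} \int_M \int_{\Sigma_z} \sum_{\alpha=1}^{p} \Big| G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot \nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}\,dz. \]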

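For any vector $u \in \mathbb{R}^n$, notice that $|Qu|^2 = G^{-1}_{\alpha,\beta}\,\nabla\xi_\alpha \cdot u\;\nabla\xi_\beta \cdot u$, and that $|u|^2 = |Pu|^2 + |Qu|^2$ (where $P$ and $Q$ are the projection operators defined by (5) and (6)). Using (H4), we thus have:
\[ \sum_{\alpha=1}^{p} \big| G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot u \big|^2 = G^{-1}_{\alpha,\beta} \nabla\xi_\beta \cdot u\; G^{-1}_{\alpha,\gamma} \nabla\xi_\gamma \cdot u \le \frac{1}{m}\,G^{-1}_{\beta,\gamma}\,\nabla\xi_\beta \cdot u\;\nabla\xi_\gamma \cdot u = \frac{1}{m}\,|Qu|^2. \]
Applying this inequality with $u = \nabla \ln\big( \frac{\psi}{\psi_0} \big)$ and using Lemma 2.6, we get:
\[ E \le \frac{1}{2\rho} \int_D \Big| P\nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi + \frac{(1+\varepsilon)M^2}{r\rho}\,E_m + \frac{1+\varepsilon^{-1}}{2rm} \int_M \int_{\Sigma_z} \Big| Q\nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi\,(\det G)^{-1/2}\,d\sigma_{\Sigma_z}\,dz \]
\[ \le \Big( \frac{1}{2\rho} + \frac{(1+\varepsilon)M^2}{2r\rho^2} \Big) \int_D \Big| P\nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi + \frac{1+\varepsilon^{-1}}{2rm} \int_D \Big| Q\nabla \ln\Big( \frac{\psi}{\psi_0} \Big) \Big|^2 \psi. \]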

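This shows that $\mu_0$ satisfies a LSI with constant $R$, where
\[ R = \frac{1}{2} \Big[ \max\Big( \frac{1}{2\rho} + \frac{(1+\varepsilon)M^2}{2r\rho^2},\ \frac{1+\varepsilon^{-1}}{2rm} \Big) \Big]^{-1} = \min\Big( \frac{\rho^2}{\rho + (1+\varepsilon)M^2/r},\ \frac{rm}{1+\varepsilon^{-1}} \Big). \]
Optimizing in $\varepsilon$, namely solving
\[ \frac{\rho^2}{\rho + (1+\varepsilon)M^2/r} = \frac{rm}{1+\varepsilon^{-1}}, \]
concludes the proof.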
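The final optimization step can be checked numerically. Eliminating $\varepsilon$ between the two branches gives the quadratic $R^2 - \big(rm + \frac{M^2 m}{\rho} + \rho\big)R + rm\rho = 0$, whose smaller root is (9), attained at $\varepsilon^* = R/(rm - R)$. This elimination is a hand computation, not taken from the text, and the sketch below verifies it for one set of hypothetical constants; the function names and the coarse grid search are illustrative only.

```python
import math


def lsi_bound(eps, rho, r, m, M):
    """The constant obtained in the proof for a given eps > 0."""
    return min(rho ** 2 / (rho + (1.0 + eps) * M ** 2 / r),
               r * m / (1.0 + 1.0 / eps))


def closed_form(rho, r, m, M):
    """Formula (9): smaller root of R^2 - (rm + M^2 m/rho + rho) R + rm*rho = 0."""
    s = r * m + M ** 2 * m / rho + rho
    return 0.5 * (s - math.sqrt(s * s - 4.0 * r * m * rho))


def optimal_eps(rho, r, m, M):
    """Value of eps at which the two branches of the min coincide."""
    R = closed_form(rho, r, m, M)
    return R / (r * m - R)


# Hypothetical constants, chosen only for the check.
rho, r, m, M = 1.0, 1.75, 1.0, 0.5
R = closed_form(rho, r, m, M)
eps_star = optimal_eps(rho, r, m, M)
at_star = lsi_bound(eps_star, rho, r, m, M)
# Coarse grid search over eps in (0, 20] as an independent sanity check.
grid_best = max(lsi_bound(k / 1000.0, rho, r, m, M) for k in range(1, 20001))
print(R, at_star, grid_best)
```

The first branch of the minimum decreases in $\varepsilon$ and the second increases, so the best constant is reached where they cross; the grid search approaches the closed form from below.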

Acknowledgments We would like to thank F. Otto and M. Westdickenberg for fruitful discussions on the subject. Part of this work was completed during a stay of the author at the Hausdorff Institute of Mathematics (HIM) in Bonn. The hospitality of the HIM is acknowledged. References [1] L. Ambrosio, N. Fusco, D. Pallara, Functions of Bounded Variation and Free Discontinuity Problems, Oxford Sci. Publ., Oxford Univ. Press, 2000. [2] C. Ané, S. Blachère, D. Chafaï, P. Fougères, I. Gentil, F. Malrieu, C. Roberto, G. Scheffer, Sur les inégalités de Sobolev logarithmiques, Soc. Math. France, 2000 (in French). [3] S. Bobkov, F. Götze, Exponential integrability and transportation cost related to logarithmic Sobolev inequalities, J. Funct. Anal. 163 (1) (1999) 1–28. [4] C. Chipot, A. Pohorille (Eds.), Free Energy Calculations, Springer Ser. Chem. Phys., vol. 86, Springer, 2007. [5] L.C. Evans, R.F. Gariepy, Measure Theory and Fine Properties of Functions, Stud. Adv. Math., CRC Press, 1992. [6] N. Grunewald, F. Otto, C. Villani, M.G. Westdickenberg, A two-scale approach to logarithmic Sobolev inequalities and the hydrodynamic limit, Ann. Inst. H. Poincaré Probab. Statist., in press. [7] F. Legoll, T. Lelièvre, Effective dynamics using conditional expectations, in preparation. [8] T. Lelièvre, M. Rousset, G. Stoltz, Long-time convergence of an adaptive biasing force method, Nonlinearity 21 (2008) 1155–1181. [9] F. Otto, M.G. Reznikoff, A new criterion for the logarithmic Sobolev inequality and two applications, J. Funct. Anal. 243 (2007) 121–157. [10] F. Otto, C. Villani, Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality, J. Funct. Anal. 173 (2) (2000) 361–400.


Calculus of principal series Whittaker functions on GL(3, C)

Miki Hirano a,∗, Takayuki Oda b

a Department of Mathematics, Faculty of Science, Ehime University, 2-5 Bunkyocho, Matsuyama, Ehime 790-8577, Japan
b Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro, Tokyo 153-8914, Japan

Received 9 June 2008; accepted 2 October 2008
Available online 12 November 2008
Communicated by P. Delorme

Abstract

In this paper, we discuss the Whittaker functions for the non-spherical principal series representations of GL(3, C). In particular, we give explicit formulas for these functions.
© 2008 Elsevier Inc. All rights reserved.

Keywords: Whittaker functions; Whittaker models

* Corresponding author.
E-mail addresses: [email protected] (M. Hirano), [email protected] (T. Oda).
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.10.011
M. Hirano, T. Oda / Journal of Functional Analysis 256 (2009) 2222–2267

1. Introduction

The global Whittaker functions of automorphic representations on GL(n) arise in the theory of automorphic L-functions, as developed by Jacquet, Piatetski-Shapiro, and Shalika (cf. [2]). Recently, many of the local investigations in this area have been aimed at handling the unramified p-adic cases and the archimedean cases. Compared with the explicit formula of Whittaker functions for unramified principal series representations of GL(n) over p-adic fields [19], the history of explicit integral expressions of Whittaker functions for principal series representations of GL(n) over the archimedean fields R and C is longer and more involved. The classical case GL(2) is found in the literature of automorphic forms such as Jacquet, Langlands [11] and Weil [27]. The first works beyond this point seem to be those of Vinogradov, Takhtajan [24], Proskurin [17], and Bump [1]. They obtained the explicit integral formula of archimedean Whittaker functions for class one principal series representations of GL(3) by evaluating Jacquet's integral [10] or by solving the differential equations for them. Further investigation of the class one Whittaker functions was developed gradually in the papers of Stade and Ishii (cf. [8,9,20–22]).

In contrast to the class one case referred to above, investigations of explicit integral formulas of the archimedean Whittaker functions for non-spherical principal series representations have begun only rather recently (Manabe, Ishii, Oda [12]). The reason for this delay is not clear. But it is true that the discussion of the non-spherical cases, which is not a trivial extension of the spherical cases, is more time-demanding and requires some new ideas.

In this paper, we discuss the Whittaker functions with minimal K-types belonging to general principal series representations of GL(3, C). We need two new ideas in this paper. One is the use of the Gelfand–Zelevinsky basis of simple K-modules in order to treat the Whittaker functions, which are vector-valued, in contrast to the spherical cases. This basis is defined in the paper [3] and is recognized as the (classical limit of the) dual of the canonical basis in quantum groups investigated by Kashiwara and Lusztig. The other is the use of the Dirac–Schmid operators in our construction of the differential equations satisfied by the Whittaker functions. These operators are elements in U(g_C) defined by the injectors of the minimal K-type τ into the tensor product p_C ⊗ τ, and their explicit descriptions require the Clebsch–Gordan coefficients. The main results are explicit formulas for Whittaker functions in Section 7.
In more detail, the results are explicit formulas for the secondary Whittaker functions, two equivalent integral representations for the primary Whittaker function, and the factorization theorem of the primary function by the secondaries. These formulas are a natural extension of the class one case and can be handled as easily as those for class one functions. We expect that our results are applicable to deeper investigations of automorphic forms on GL(3). Also, based on the explicit formulas, we derive in Section 8 an inductive procedure to write Whittaker functions on GL(3, C) in terms of those on GL(2, C), which we call a propagation formula. This is an analogue of the formula in the real cases by Ishii, Stade [9] and Hina, Ishii, Oda [5]. It seems not only to show the similarity between the real and the complex cases in the non-class one situations but also to give a hint on a basis of $\mathfrak{gl}_n$-modules which is suitable for an explicit description of general principal series Whittaker functions on GL(n, C).

2. Preliminaries

2.1. Groups and algebras

Let $G = GL(3,\mathbf{C})$ be the complex general linear group of degree 3. We view $G$ as a real reductive group and denote the imaginary unit by $J$; $J^2 = -1$. The center $Z_G$ of $G$ is $\{ru1_3 \mid r \in \mathbf{R}_{>0},\ u \in U(1)\} \cong \mathbf{C}^\times$. Here $1_3$ is the unit matrix of degree 3. For a Cartan involution $\theta(g) = {}^t\bar g^{-1}$, $g \in G$, of $G$, its fixed part $K = \{g \in G \mid \theta(g) = g\} = U(3)$, the unitary group of degree 3, is a maximal compact subgroup of $G$.

Let $\mathfrak{g} = \mathfrak{gl}(3,\mathbf{C})$ be the Lie algebra of $G$. If we denote the differential of $\theta$ again by $\theta$, then we have $\theta(X) = -{}^t\bar X$ for $X \in \mathfrak{g}$. Let $\mathfrak{k}$ and $\mathfrak{p}$ be the $+1$ and the $-1$ eigenspaces of $\theta$ in $\mathfrak{g}$, respectively. Then $\mathfrak{k} = \mathfrak{u}(3)$ is the Lie algebra of $K$ and $\mathfrak{g}$ has a Cartan decomposition $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$. In general, for a Lie algebra $\mathfrak{l}$, its complexification is denoted by $\mathfrak{l}_{\mathbf C}$.

For $1 \le i, j \le 3$, let $E_{ij}$ (respectively $E'_{ij}$) in $\mathfrak{g}$ be the matrix unit with its $(i,j)$-entry 1 (respectively $J$) and the remaining entries 0. Moreover put $H_{ij} = E_{ii} - E_{jj}$, $H'_{ij} = E'_{ii} - E'_{jj}$, $I_3 = E_{11} + E_{22} + E_{33}$, and $I'_3 = E'_{11} + E'_{22} + E'_{33}$. Then we have $\mathfrak{k} = Z_{\mathfrak k} \oplus \mathfrak{k}_0$ and $\mathfrak{p} = Z_{\mathfrak p} \oplus \mathfrak{p}_0$ with

\[
Z_{\mathfrak k} = \mathbf{R}I'_3, \qquad
\mathfrak{k}_0 = \mathbf{R}H'_{12} \oplus \mathbf{R}H'_{23} \oplus \bigoplus_{i<j} \mathbf{R}(E_{ij} - E_{ji}) \oplus \bigoplus_{i<j} \mathbf{R}(E'_{ij} + E'_{ji}),
\]

and

\[
Z_{\mathfrak p} = \mathbf{R}I_3, \qquad
\mathfrak{p}_0 = \mathbf{R}H_{12} \oplus \mathbf{R}H_{23} \oplus \bigoplus_{i<j} \mathbf{R}(E_{ij} + E_{ji}) \oplus \bigoplus_{i<j} \mathbf{R}(E'_{ij} - E'_{ji}).
\]

In the complexifications $\mathfrak{k}_{\mathbf C}$ and $\mathfrak{p}_{\mathbf C}$, we use the following symbols:

\[
I_3^{\mathfrak k} = -\sqrt{-1}\,I'_3, \qquad
H_{ij}^{\mathfrak k} = \sqrt{-1}\,H'_{ij}, \qquad
E_{ij}^{\mathfrak k} = \frac{1}{2}\bigl\{(E_{ij} - E_{ji}) - \sqrt{-1}\,(E'_{ij} + E'_{ji})\bigr\}
\]

in $\mathfrak{k}_{\mathbf C}$ and

\[
I_3^{\mathfrak p} = I_3, \qquad
H_{ij}^{\mathfrak p} = H_{ij}, \qquad
E_{ij}^{\mathfrak p} = \frac{1}{2}\bigl\{(E_{ij} + E_{ji}) - \sqrt{-1}\,(E'_{ij} - E'_{ji})\bigr\}
\]

in $\mathfrak{p}_{\mathbf C}$.

Put $\mathfrak{a} = Z_{\mathfrak p} \oplus \mathbf{R}H_{12} \oplus \mathbf{R}H_{23}$. Then $\mathfrak{a}$ is a maximal abelian subalgebra of $\mathfrak{p}$. Also, if we put $\mathfrak{n} = \bigoplus_{i<j}(\mathbf{R}E_{ij} \oplus \mathbf{R}E'_{ij})$, then $\mathfrak{n}$ is the direct sum of all the positive restricted root spaces with respect to $(\mathfrak{g}, \mathfrak{a})$ and we have an Iwasawa decomposition $\mathfrak{g} = \mathfrak{n} \oplus \mathfrak{a} \oplus \mathfrak{k}$ of $\mathfrak{g}$. Moreover, we have an Iwasawa decomposition $G = NAK$ of $G$, where $A$ and $N$ are the analytic subgroups with Lie algebras $\mathfrak{a}$ and $\mathfrak{n}$, respectively, that is,

\[
A = \{\mathrm{diag}(a_1, a_2, a_3) \in G \mid a_i \in \mathbf{R}_{>0},\ i = 1, 2, 3\},
\qquad
N = \left\{ \begin{pmatrix} 1 & x_1 & x_2 \\ 0 & 1 & x_3 \\ 0 & 0 & 1 \end{pmatrix} \in G \;\middle|\; x_i \in \mathbf{C},\ i = 1, 2, 3 \right\}.
\]

Consider the centralizer $M$ of $A$ in $K$,

\[
M = \{k \in K \mid kak^{-1} = a,\ a \in A\}
  = \{\mathrm{diag}(u_1, u_2, u_3) \mid u_i \in U(1),\ i = 1, 2, 3\} \cong U(1)^3.
\]

Then the upper triangular subgroup $P = NAM$ is a minimal parabolic subgroup of $G$ and the right-hand side gives its Langlands decomposition. Namely, $N$ is the unipotent radical of $P$ and $AM$ is a Levi subgroup whose split component is $A$.
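As a numerical illustration of the Iwasawa decomposition G = NAK described above, the following minimal sketch factors an invertible 3×3 complex matrix by running the Gram–Schmidt process on its rows from the bottom row upwards. The function names `iwasawa` and `matmul` are hypothetical helpers introduced only for this example; they are not part of the paper.

```python
import math

def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x[i][m] * y[m][j] for m in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def iwasawa(g):
    """Factor an invertible square complex matrix as g = n * diag(a) * k.

    n is upper triangular with 1 on the diagonal, a is a list of positive
    reals, and k is unitary (illustrative helper, not from the paper).
    """
    size = len(g)
    k = [None] * size                        # rows of the unitary factor
    r = [[0j] * size for _ in range(size)]   # upper triangular factor n * diag(a)
    for i in reversed(range(size)):          # Gram-Schmidt, bottom row first
        v = list(g[i])
        for j in range(i + 1, size):
            # component of row i of g along the unit row k[j]
            r[i][j] = sum(x * y.conjugate() for x, y in zip(g[i], k[j]))
            v = [x - r[i][j] * y for x, y in zip(v, k[j])]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        r[i][i] = norm
        k[i] = [x / norm for x in v]
    a = [r[i][i] for i in range(size)]
    n = [[r[i][j] / a[j] for j in range(size)] for i in range(size)]
    return n, a, k
```

Multiplying the three factors back together with `matmul` should reproduce the input matrix up to rounding error.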


2.2. Representations of K

According to the theory of highest weight, the equivalence classes of irreducible continuous representations of the maximal compact subgroup $K = U(3)$ of $G$ are parameterized by the set of highest weights

\[
\Lambda = \{\mu = (\mu_1, \mu_2, \mu_3) \mid \mu_i \in \mathbf{Z},\ \mu_1 \ge \mu_2 \ge \mu_3\}.
\]

The representation of $K$ corresponding to a highest weight $\mu \in \Lambda$ is denoted by $(\tau_\mu, V_\mu)$. The dimension of $V_\mu$ is given by the Weyl dimension formula (cf. [26, Theorem 2.4.1.6]).

Lemma 2.1.

\[
\dim_{\mathbf C} V_\mu = \frac{1}{2}(\mu_1 - \mu_2 + 1)(\mu_2 - \mu_3 + 1)(\mu_1 - \mu_3 + 2).
\]

In the following, we often use the symbol $e_i$ for $1 \le i \le 3$, which means the unit vector of degree 3 with its $i$th component 1 and the remaining components 0, in order to write an element in $\mathbf{Z}^3$.

2.3. Principal series representations of G

The (irreducible) characters of $M \cong U(1)^3$ are exhausted by

\[
\sigma_n(\mathrm{diag}(u_1, u_2, u_3)) = u_1^{n_1} u_2^{n_2} u_3^{n_3}, \qquad n = (n_1, n_2, n_3) \in \mathbf{Z}^3.
\]

Since the Lie algebra $\mathfrak{a}$ of $A$ has a system of generators consisting of the diagonal matrix units $\{E_{ii} \mid i = 1, 2, 3\}$, each linear form $\nu \in \mathrm{Hom}_{\mathbf R}(\mathfrak{a}, \mathbf{C})$ can be identified with the complex vector $(\nu_1, \nu_2, \nu_3) \in \mathbf{C}^3$ via $\nu_i = \nu(E_{ii})$ for $1 \le i \le 3$. The adjoint action of $A$ on the Lie algebra $\mathfrak{n}$ of $N$ induces the action $e^{2\rho}$ on the top degree wedge product $\bigwedge^6_{\mathbf R}\mathfrak{n}$. Here $\rho$ is the half-sum of the positive restricted roots, i.e.,

\[
e^{\rho}(\mathrm{diag}(a_1, a_2, a_3)) = \left(\frac{a_1}{a_3}\right)^2, \qquad \mathrm{diag}(a_1, a_2, a_3) \in A.
\]

Let us take a character $\sigma_n$ of $M$ parameterized by $n = (n_1, n_2, n_3) \in \mathbf{Z}^3$ and an element $\nu$ in $\mathfrak{a}^*_{\mathbf C}$ identified with $(\nu_1, \nu_2, \nu_3) \in \mathbf{C}^3$. Then the induced representation

\[
\pi = \pi(\nu, \sigma_n) = \mathrm{Ind}_P^G(1_N \otimes e^{\nu+\rho} \otimes \sigma_n)
\]

of $G$ from the parabolic subgroup $P = NAM$ is called the principal series representation of $G$. The representation $\pi$ is a Hilbert representation (i.e., a Banach representation on a Hilbert space) with the representation space

\[
L^2_{(M,\sigma_n)}(K) = \{f \in L^2(K) \mid f(mk) = \sigma_n(m)f(k),\ m \in M,\ k \in K\},
\]

and the action of $G$ on $L^2_{(M,\sigma_n)}(K)$ is given by

\[
(\pi(x)f)(k) = a(kx)^{\nu+\rho} f(\kappa(kx)), \qquad k \in K,\ x \in G.
\]

Here $g = n(g)a(g)\kappa(g) \in G$ is the Iwasawa decomposition of $g \in G$. If we put $\tilde\nu = \nu_1 + \nu_2 + \nu_3$ and $\tilde n = n_1 + n_2 + n_3$, the central character of $\pi$ is given by

\[
Z_G \ni ru1_3 \mapsto r^{\tilde\nu} u^{\tilde n}, \qquad r \in \mathbf{R}_{>0},\ u \in U(1).
\]

The K-types of the principal series representation $\pi = \pi(\nu, \sigma_n)$ are understood via the right $K$-action on $L^2_{(M,\sigma_n)}(K)$. A standard argument using the Frobenius reciprocity for induced representations leads to the following proposition.

Proposition 2.2. Let $\pi = \pi(\nu, \sigma_n)$ be a principal series representation with data $(\nu, \sigma_n)$. A necessary and sufficient condition for a representation $\tau_\mu$ of $K$ corresponding to a highest weight $\mu = (\mu_1, \mu_2, \mu_3) \in \Lambda$ to be a constituent of the restriction $\pi|_K$ of $\pi$ to $K$ is that the convex closure of the subset

\[
\{(\mu_i, \mu_j, \mu_k) \in \mathbf{Z}^3 \mid (i, j, k) \text{ are permutations of } (1, 2, 3)\}
\]

in $\mathbf{R}^3$ contains the point $n = (n_1, n_2, n_3)$. In particular, if $m = (n_a, n_b, n_c)$ is the dominant permutation of $n$ (namely $n_a \ge n_b \ge n_c$), then the representation $\tau_m$ is the minimal K-type of $\pi$ and occurs with multiplicity one in $\pi|_K$.

2.4. Unitary characters of N

Since the set $\{E_{ij}, E'_{ij} \mid 1 \le i < j \le 3\}$ gives a system of generators of $\mathfrak{n}$, a non-degenerate character $\eta = \eta_{c_1,c_2}$ of $N$ can be specified by

\[
\eta(E_{12}) = 2\pi\sqrt{-1}\,\mathrm{Re}(c_1), \quad
\eta(E'_{12}) = 2\pi\sqrt{-1}\,\mathrm{Im}(c_1), \quad
\eta(E_{23}) = 2\pi\sqrt{-1}\,\mathrm{Re}(c_2), \quad
\eta(E'_{23}) = 2\pi\sqrt{-1}\,\mathrm{Im}(c_2),
\]

with two non-zero complex numbers $c_1, c_2 \in \mathbf{C}^\times$. Then we have

\[
\eta\begin{pmatrix} 1 & x_1 & x_2 \\ & 1 & x_3 \\ & & 1 \end{pmatrix}
 = \exp\bigl(2\pi\sqrt{-1}\,\mathrm{Re}(\bar c_1 x_1 + \bar c_2 x_3)\bigr), \qquad x_i \in \mathbf{C}.
\]
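The closed formula for η can be checked numerically: since the (1,2) and (2,3) entries of a product of two unipotent upper triangular matrices are the sums of the corresponding entries of the factors, η should be multiplicative on N and take values on the unit circle. The sketch below is only an illustration; `unipotent`, `matmul` and `eta` are hypothetical helper names.

```python
import cmath
import math

def unipotent(x1, x2, x3):
    """Element of N with entries x1, x2 in the first row and x3 in the second."""
    return [[1, x1, x2], [0, 1, x3], [0, 0, 1]]

def matmul(x, y):
    """Plain 3x3 matrix product."""
    return [[sum(x[i][m] * y[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def eta(n, c1, c2):
    """Character eta_{c1,c2}(n) = exp(2*pi*sqrt(-1)*Re(conj(c1)*x1 + conj(c2)*x3))."""
    x1, x3 = n[0][1], n[1][2]
    phase = (complex(c1).conjugate() * x1 + complex(c2).conjugate() * x3).real
    return cmath.exp(2j * math.pi * phase)
```

For instance, `eta(matmul(n1, n2), c1, c2)` should agree with `eta(n1, c1, c2) * eta(n2, c1, c2)` for any two elements `n1`, `n2` built with `unipotent`.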

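3. Whittaker functions

For a finite-dimensional representation $(\tau, V_\tau)$ of $K$ and a non-degenerate character $\eta$ of $N$, we denote by $C^\infty_{\eta,\tau}(N\backslash G/K)$ the space consisting of smooth functions $\varphi : G \to V_\tau$ satisfying the condition

\[
\varphi(ngk) = \eta(n)\tau(k)^{-1}\varphi(g), \qquad (n, g, k) \in N \times G \times K.
\]

Then the function $\varphi \in C^\infty_{\eta,\tau}(N\backslash G/K)$ is determined by its restriction $\varphi|_A$ to $A$, because of the Iwasawa decomposition $G = NAK$ of $G$. Moreover, let $C^\infty\mathrm{Ind}_N^G(\eta)$ be the representation of $G$ induced from $\eta$ as $C^\infty$-induction. Here the representation space of $C^\infty\mathrm{Ind}_N^G(\eta)$ is

\[
C^\infty_\eta(N\backslash G) = \{\varphi \in C^\infty(G) \mid \varphi(ng) = \eta(n)\varphi(g),\ (n, g) \in N \times G\},
\]

on which $G$ acts via right translation. If we denote by $(\tau^*, V_{\tau^*})$ the contragredient representation of $(\tau, V_\tau)$ and by $\langle\cdot,\cdot\rangle$ the canonical bilinear form on $V_{\tau^*} \times V_\tau$, then the relation

\[
\iota(v^*)(g) = \langle v^*, F^{[\iota]}(g)\rangle, \qquad v^* \in V_{\tau^*},\ g \in G,
\]

defines an association from $\iota \in \mathrm{Hom}_K(\tau^*, C^\infty\mathrm{Ind}_N^G(\eta))$ to $F^{[\iota]} \in C^\infty_{\eta,\tau}(N\backslash G/K)$, which gives an isomorphism $\mathrm{Hom}_K(\tau^*, C^\infty\mathrm{Ind}_N^G(\eta)) \cong C^\infty_{\eta,\tau}(N\backslash G/K)$.

For an (irreducible) admissible representation $(\pi, H_\pi)$ of $G$, we choose a K-type $(\tau^*, V_{\tau^*})$ in $\pi$ which occurs with multiplicity one and fix an injective $K$-homomorphism $i \in \mathrm{Hom}_K(\tau^*, \pi|_K)$. Let

\[
I_{\eta,\pi} = \mathrm{Hom}_{(\mathfrak{g}_{\mathbf C},K)}(\pi, C^\infty\mathrm{Ind}_N^G(\eta))
\]

be the intertwining space between the $(\mathfrak{g}_{\mathbf C}, K)$-modules $\pi$ and $C^\infty\mathrm{Ind}_N^G(\eta)$ consisting of all $K$-finite vectors. For each $T \in I_{\eta,\pi}$, we define an element $T_i \in C^\infty_{\eta,\tau}(N\backslash G/K)$ by

\[
T(i(v^*))(g) = \langle v^*, T_i(g)\rangle, \qquad v^* \in V_{\tau^*},\ g \in G.
\]

Then we call the subspace

\[
\mathrm{Wh}(\pi, \eta, \tau) = \bigcup_{i \in \mathrm{Hom}_K(\tau^*, \pi|_K)} \{T_i \in C^\infty_{\eta,\tau}(N\backslash G/K) \mid T \in I_{\eta,\pi}\}
\]

of $C^\infty_{\eta,\tau}(N\backslash G/K)$ the space of Whittaker functions with respect to $(\pi, \eta, \tau)$. Moreover, we denote by $I^\circ_{\eta,\pi}$ the subspace of $I_{\eta,\pi}$ consisting of the intertwining operators whose images in $C^\infty_\eta(N\backslash G)$ are moderate growth functions [25, §8.1] and define the subspace

\[
\mathrm{Wh}(\pi, \eta, \tau)^{\mathrm{mod}} = \bigcup_{i \in \mathrm{Hom}_K(\tau^*, \pi|_K)} \{T_i \in C^\infty_{\eta,\tau}(N\backslash G/K) \mid T \in I^\circ_{\eta,\pi}\}
\]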

of $\mathrm{Wh}(\pi, \eta, \tau)$. An element in $\mathrm{Wh}(\pi, \eta, \tau)^{\mathrm{mod}}$ is called a Whittaker function of moderate growth.

4. A small U(3) machine

4.1. Gelfand–Zelevinsky basis

Let $(\tau_\mu, V_\mu)$ be an irreducible representation of $K = U(3)$ associated with a highest weight $\mu = (\mu_1, \mu_2, \mu_3) \in \Lambda$. The representation space $V_\mu$ of $\tau_\mu$ has a Gelfand–Zelevinsky basis (or a proper basis) defined and studied in the paper of Gelfand and Zelevinsky [3]. Like the Gelfand–Tsetlin basis, this basis can be parameterized by the set $G(\mu)$ of G-patterns belonging to $\mu$. Here a G-pattern $M \in G(\mu)$ belonging to $\mu$ is a triangle

\[
M = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \alpha_1\ \alpha_2 \\ \beta \end{smallmatrix}\right)
\]

consisting of 6 integers satisfying the inequalities

\[
\mu_1 \ge \alpha_1 \ge \mu_2 \ge \alpha_2 \ge \mu_3 \quad\text{and}\quad \alpha_1 \ge \beta \ge \alpha_2.
\]
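Since the G-patterns belonging to μ index a basis of V_μ, their number should agree with the Weyl dimension formula of Lemma 2.1. The following minimal sketch enumerates the patterns directly from the interlacing inequalities and also computes the weight (β, α1 + α2 − β, μ1 + μ2 + μ3 − α1 − α2) that the paper attaches to f(M) in Lemma 4.1. The function names are hypothetical and chosen only for this illustration.

```python
def g_patterns(mu):
    """Yield all G-patterns (alpha1, alpha2, beta) belonging to mu."""
    mu1, mu2, mu3 = mu
    for alpha1 in range(mu2, mu1 + 1):          # mu1 >= alpha1 >= mu2
        for alpha2 in range(mu3, mu2 + 1):      # mu2 >= alpha2 >= mu3
            for beta in range(alpha2, alpha1 + 1):  # alpha1 >= beta >= alpha2
                yield (alpha1, alpha2, beta)

def weyl_dimension(mu):
    """Dimension of V_mu according to Lemma 2.1."""
    mu1, mu2, mu3 = mu
    return (mu1 - mu2 + 1) * (mu2 - mu3 + 1) * (mu1 - mu3 + 2) // 2

def weight(mu, pattern):
    """Weight of the basis vector indexed by a G-pattern."""
    alpha1, alpha2, beta = pattern
    return (beta, alpha1 + alpha2 - beta, sum(mu) - alpha1 - alpha2)
```

For μ = e1 − e3 = (1, 0, −1) this produces eight patterns, two of which have weight (0, 0, 0), in line with the 8-dimensional representation used in Section 4.2.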

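A Gelfand–Zelevinsky basis for $\mathfrak{gl}_3$ has the ambiguity of scalar multiples. In the paper [3], a normalization of this basis was defined and the explicit action of $\mathfrak{gl}_3$ on it was given. We denote this normalized Gelfand–Zelevinsky basis by $\{f(M)\}_{M \in G(\mu)}$ and call it simply the GZ-basis.

In order to describe the explicit action of $\mathfrak{k}_{\mathbf C}$ on the GZ-basis, we introduce some notation for G-patterns. For a G-pattern $M = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \alpha_1\ \alpha_2 \\ \beta \end{smallmatrix}\right) \in G(\mu)$ belonging to $\mu \in \Lambda$ and a triangular array $I = \left(\begin{smallmatrix} i_{13}\ i_{23}\ i_{33} \\ i_{12}\ i_{22} \\ i_{11} \end{smallmatrix}\right)$ of integers, we define the shift $M(I)$ of $M$ by $I$ as

\[
M(I) = \left(\begin{smallmatrix} \mu_1 + i_{13}\ \ \mu_2 + i_{23}\ \ \mu_3 + i_{33} \\ \alpha_1 + i_{12}\ \ \alpha_2 + i_{22} \\ \beta + i_{11} \end{smallmatrix}\right).
\]

If the vector $(i_{13}\ i_{23}\ i_{33})$ is zero, we omit the top row of $I$, that is, $M(I)$ is written as $M\binom{i_{12}\ i_{22}}{i_{11}}$. We use a convenient symbol $M[k]$ defined by $M\binom{k\ \,{-k}}{0}$. Put

\[
\delta(M) = \alpha_1 + \alpha_2 - \mu_2 - \beta,
\]

and define the characteristic functions $\chi_+^{(i)}(M)$ and $\chi_-^{(i)}(M)$ of the sets $\{M \mid \delta(M) > i\}$ and $\{M \mid \delta(M) < -i\}$, respectively. If $i = 0$, we write $\chi_\pm^{(0)}(M)$ simply as $\chi_\pm(M)$. Moreover, we introduce 'piecewise-linear' functions $C_1(M)$ and $\bar C_1(M)$ by

\[
C_1(M) = \min\{\beta - \alpha_2,\ \alpha_1 - \mu_2\} =
\begin{cases} \beta - \alpha_2, & \text{if } \delta(M) \ge 0, \\ \alpha_1 - \mu_2, & \text{if } \delta(M) \le 0, \end{cases}
\]

\[
\bar C_1(M) = \min\{\mu_2 - \alpha_2,\ \alpha_1 - \beta\} =
\begin{cases} \mu_2 - \alpha_2, & \text{if } \delta(M) \ge 0, \\ \alpha_1 - \beta, & \text{if } \delta(M) \le 0, \end{cases}
\]

and put $C_2(M) = C_1(M)\bar C_1(M)$. Also we define the functions

\[
\begin{aligned}
D(M) &= -\mu_1 + \alpha_1 - \delta(M), \\
E(M) &= \bar C_1(M)\bigl\{\mu_1 - \mu_3 + 1 - C_1(M)\bigr\}, \\
F(M) &= -C_2(M) - \chi_-(M)\bigl\{(\mu_1 - \alpha_1)(\alpha_2 - \mu_3) - (\mu_1 - \mu_3 + 1)\delta(M)\bigr\},
\end{aligned}
\]

and their duals

\[
\begin{aligned}
\bar D(M) &= -\alpha_2 + \mu_3 + \delta(M), \\
\bar E(M) &= C_1(M)\bigl\{\mu_1 - \mu_3 + 1 - \bar C_1(M)\bigr\}, \\
\bar F(M) &= -C_2(M) - \chi_+(M)\bigl\{(\mu_1 - \alpha_1)(\alpha_2 - \mu_3) + (\mu_1 - \mu_3 + 1)\delta(M)\bigr\}.
\end{aligned}
\]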
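The quantities δ(M), C_1(M) and C̄_1(M) depend only on μ_2 and the entries α_1, α_2, β of the pattern, so they are easy to tabulate. The sketch below, with hypothetical function names, computes them and lets one confirm on examples that the minimum and the case distinction by the sign of δ(M) give the same value (for δ(M) = 0 both branches coincide).

```python
def g_patterns(mu):
    """Yield all G-patterns (alpha1, alpha2, beta) belonging to mu."""
    mu1, mu2, mu3 = mu
    for alpha1 in range(mu2, mu1 + 1):
        for alpha2 in range(mu3, mu2 + 1):
            for beta in range(alpha2, alpha1 + 1):
                yield (alpha1, alpha2, beta)

def delta(mu, pattern):
    """delta(M) = alpha1 + alpha2 - mu2 - beta."""
    alpha1, alpha2, beta = pattern
    return alpha1 + alpha2 - mu[1] - beta

def c1(mu, pattern):
    """C_1(M) as a minimum."""
    alpha1, alpha2, beta = pattern
    return min(beta - alpha2, alpha1 - mu[1])

def c1_bar(mu, pattern):
    """The dual function written with a bar in the text, as a minimum."""
    alpha1, alpha2, beta = pattern
    return min(mu[1] - alpha2, alpha1 - beta)

def c1_by_cases(mu, pattern):
    """C_1(M) via the sign of delta(M)."""
    alpha1, alpha2, beta = pattern
    return beta - alpha2 if delta(mu, pattern) >= 0 else alpha1 - mu[1]

def c1_bar_by_cases(mu, pattern):
    """Dual function via the sign of delta(M)."""
    alpha1, alpha2, beta = pattern
    return mu[1] - alpha2 if delta(mu, pattern) >= 0 else alpha1 - beta
```

At the highest weight pattern (α_1, α_2, β) = (μ_1, μ_2, μ_1) one finds δ = 0, C_1 = μ_1 − μ_2 and C̄_1 = 0.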

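Lemma 4.1. Let $V_\mu$ be an irreducible finite-dimensional representation of $\mathfrak{k}_{\mathbf C}$ corresponding to a highest weight $\mu \in \Lambda$ and $\{f(M)\}_{M \in G(\mu)}$ be the GZ-basis of $V_\mu$. If we take the subalgebra consisting of diagonal matrices as the Cartan subalgebra, the actions of the elements $E^{\mathfrak k}_{ii}$ in the Cartan subalgebra and the root vectors $E^{\mathfrak k}_{ij}$ on $\{f(M)\}_{M \in G(\mu)}$ are given as follows:

\[
\begin{aligned}
E^{\mathfrak k}_{ii} f(M) &= w_i f(M), \\
E^{\mathfrak k}_{12} f(M) &= (\alpha_1 - \beta) f\bigl(M\tbinom{0\ 0}{1}\bigr) + \chi_+(M)(\mu_2 - \alpha_2) f\bigl(M\tbinom{0\ 0}{1}[-1]\bigr), \\
E^{\mathfrak k}_{21} f(M) &= (\beta - \alpha_2) f\bigl(M\tbinom{0\ 0}{-1}\bigr) + \chi_-(M)(\alpha_1 - \mu_2) f\bigl(M\tbinom{0\ 0}{-1}[-1]\bigr), \\
E^{\mathfrak k}_{23} f(M) &= (\mu_1 - \alpha_1) f\bigl(M\tbinom{1\ 0}{0}\bigr) + \chi_-(M)\bigl\{\mu_1 - \alpha_1 - \delta(M)\bigr\} f\bigl(M\tbinom{1\ 0}{0}[-1]\bigr), \\
E^{\mathfrak k}_{32} f(M) &= (\alpha_2 - \mu_3) f\bigl(M\tbinom{0\ {-1}}{0}\bigr) + \chi_+(M)\bigl\{\alpha_2 - \mu_3 + \delta(M)\bigr\} f\bigl(M\tbinom{0\ {-1}}{0}[-1]\bigr), \\
E^{\mathfrak k}_{13} f(M) &= (\mu_1 - \alpha_1) f\bigl(M\tbinom{1\ 0}{1}\bigr) - \bar C_1(M) f\bigl(M\tbinom{1\ 0}{1}[-1]\bigr), \\
E^{\mathfrak k}_{31} f(M) &= -(\alpha_2 - \mu_3) f\bigl(M\tbinom{0\ {-1}}{-1}\bigr) + C_1(M) f\bigl(M\tbinom{0\ {-1}}{-1}[-1]\bigr).
\end{aligned}
\]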

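Here, $(w_1, w_2, w_3) = (\beta,\ \alpha_1 + \alpha_2 - \beta,\ \mu_1 + \mu_2 + \mu_3 - \alpha_1 - \alpha_2)$ is the weight of the vector $f(M)$ associated with a G-pattern $M = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \alpha_1\ \alpha_2 \\ \beta \end{smallmatrix}\right)$, and we stipulate that the corresponding vector $f(M')$ is zero if a shift $M'$ of $M$ appearing in the above formulas violates the conditions of G-patterns.

4.2. $\mathfrak{p}_{\mathbf C}$ as a K-module

Let $\mathfrak{p} = Z_{\mathfrak p} \oplus \mathfrak{p}_0$ be the $(-1)$-eigenspace for the Cartan involution $\theta$ in $\mathfrak{g}$ as explained in Section 2.1. It is well known that the complexification $\mathfrak{p}_{\mathbf C}$ of $\mathfrak{p}$ is a $K$-module via the adjoint action and has the irreducible decomposition $\mathfrak{p}_{\mathbf C} = Z_{\mathfrak p,\mathbf C} \oplus \mathfrak{p}_{0,\mathbf C}$, where $Z_{\mathfrak p,\mathbf C}$ and $\mathfrak{p}_{0,\mathbf C}$ are isomorphic to the trivial representation $V_{(0,0,0)}$ and the 8-dimensional representation $V_{e_1-e_3}$ corresponding to the highest weight $e_1 - e_3$, respectively. The correspondence between the GZ-basis $\{f(M)\}_{M \in G(e_1-e_3)}$ of $V_{e_1-e_3}$ and the elements in $\mathfrak{p}_{0,\mathbf C}$ is given by the following lemma.

Table 1
The adjoint actions of $E^{\mathfrak k}_{ij}$ on $\mathfrak{p}_{0,\mathbf C}$.

\[
\begin{array}{c|ccccccccc}
 & E^{\mathfrak k}_{11} & E^{\mathfrak k}_{22} & E^{\mathfrak k}_{33} & E^{\mathfrak k}_{12} & E^{\mathfrak k}_{21} & E^{\mathfrak k}_{23} & E^{\mathfrak k}_{32} & E^{\mathfrak k}_{13} & E^{\mathfrak k}_{31} \\ \hline
H^{\mathfrak p}_{12} & 0 & 0 & 0 & -2E^{\mathfrak p}_{12} & 2E^{\mathfrak p}_{21} & E^{\mathfrak p}_{23} & -E^{\mathfrak p}_{32} & -E^{\mathfrak p}_{13} & E^{\mathfrak p}_{31} \\
H^{\mathfrak p}_{23} & 0 & 0 & 0 & E^{\mathfrak p}_{12} & -E^{\mathfrak p}_{21} & -2E^{\mathfrak p}_{23} & 2E^{\mathfrak p}_{32} & -E^{\mathfrak p}_{13} & E^{\mathfrak p}_{31} \\
E^{\mathfrak p}_{12} & E^{\mathfrak p}_{12} & -E^{\mathfrak p}_{12} & 0 & 0 & H^{\mathfrak p}_{21} & -E^{\mathfrak p}_{13} & 0 & 0 & E^{\mathfrak p}_{32} \\
E^{\mathfrak p}_{21} & -E^{\mathfrak p}_{21} & E^{\mathfrak p}_{21} & 0 & H^{\mathfrak p}_{12} & 0 & 0 & E^{\mathfrak p}_{31} & -E^{\mathfrak p}_{23} & 0 \\
E^{\mathfrak p}_{23} & 0 & E^{\mathfrak p}_{23} & -E^{\mathfrak p}_{23} & E^{\mathfrak p}_{13} & 0 & 0 & H^{\mathfrak p}_{32} & 0 & -E^{\mathfrak p}_{21} \\
E^{\mathfrak p}_{32} & 0 & -E^{\mathfrak p}_{32} & E^{\mathfrak p}_{32} & 0 & -E^{\mathfrak p}_{31} & H^{\mathfrak p}_{23} & 0 & E^{\mathfrak p}_{12} & 0 \\
E^{\mathfrak p}_{13} & E^{\mathfrak p}_{13} & 0 & -E^{\mathfrak p}_{13} & 0 & E^{\mathfrak p}_{23} & 0 & -E^{\mathfrak p}_{12} & 0 & H^{\mathfrak p}_{31} \\
E^{\mathfrak p}_{31} & -E^{\mathfrak p}_{31} & 0 & E^{\mathfrak p}_{31} & -E^{\mathfrak p}_{32} & 0 & E^{\mathfrak p}_{21} & 0 & H^{\mathfrak p}_{13} & 0
\end{array}
\]

Lemma 4.2. We have an isomorphism $V_{e_1-e_3} \cong \mathfrak{p}_{0,\mathbf C}$ by the following correspondence between their bases:

\[
\begin{aligned}
f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 1 \end{smallmatrix}\right) &\leftrightarrow -E^{\mathfrak p}_{12}, &
f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 1 \end{smallmatrix}\right) &\leftrightarrow E^{\mathfrak p}_{13}, &
f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 0 \end{smallmatrix}\right) &\leftrightarrow E^{\mathfrak p}_{23}, \\
f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right) &\leftrightarrow -E^{\mathfrak p}_{32}, &
f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ -1 \end{smallmatrix}\right) &\leftrightarrow E^{\mathfrak p}_{21}, &
f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right) &\leftrightarrow E^{\mathfrak p}_{31},
\end{aligned}
\]

\[
f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \leftrightarrow \frac{1}{3}\bigl(H^{\mathfrak p}_{12} + H^{\mathfrak p}_{13}\bigr) = \frac{1}{3}\bigl(2H^{\mathfrak p}_{12} + H^{\mathfrak p}_{23}\bigr),
\qquad
f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) \leftrightarrow \frac{1}{3}\bigl(H^{\mathfrak p}_{31} + H^{\mathfrak p}_{32}\bigr) = -\frac{1}{3}\bigl(H^{\mathfrak p}_{12} + 2H^{\mathfrak p}_{23}\bigr).
\]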

Proof. We can find the table of the adjoint action of $\mathfrak{k}_{\mathbf C}$ on the elements in $\mathfrak{p}_{0,\mathbf C}$ (see Table 1) by direct computation. Comparing this with the action of the simple root vectors of $\mathfrak{k}_{\mathbf C}$ on the GZ-basis $\{f(M)\}_{M \in G(e_1-e_3)}$ of $V_{e_1-e_3}$ in Lemma 4.1, we have the assertion. □

By the isomorphism in Lemma 4.2, we identify the tensor product $\mathfrak{p}_{0,\mathbf C} \otimes V_\mu$ with $V_{e_1-e_3} \otimes V_\mu$ for a general irreducible representation $V_\mu$ of $K$.

4.3. Injectors and their Clebsch–Gordan coefficients

For the 8-dimensional representation $(\tau_{e_1-e_3}, V_{e_1-e_3})$ of $K = U(3)$, we consider the tensor product with a general irreducible representation $(\tau_\mu, V_\mu)$ associated with a highest weight $\mu = (\mu_1, \mu_2, \mu_3) \in \Lambda$. The tensor product $V_{e_1-e_3} \otimes V_\mu$ has the following irreducible decomposition:

\[
V_{e_1-e_3} \otimes V_\mu \cong \bigoplus_{i \ne j} V_{\mu+e_i-e_j} \oplus V_\mu^{\oplus 2}.
\]
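A quick consistency check of this decomposition is to compare dimensions on both sides with the Weyl dimension formula of Lemma 2.1. The sketch below follows the conventions stated with the decomposition: a summand with non-dominant μ + e_i − e_j is dropped, and V_μ is counted twice for μ1 > μ2 > μ3 and once if exactly one equality holds. As an assumption of this sketch (not taken from the text), V_μ is counted zero times when μ1 = μ2 = μ3, that is, the multiplicity is taken to be the number of strict inequalities. The function names are hypothetical.

```python
def weyl_dimension(mu):
    """Dimension of V_mu according to Lemma 2.1."""
    mu1, mu2, mu3 = mu
    return (mu1 - mu2 + 1) * (mu2 - mu3 + 1) * (mu1 - mu3 + 2) // 2

def is_dominant(mu):
    """True if mu1 >= mu2 >= mu3."""
    return mu[0] >= mu[1] >= mu[2]

def decomposition_dimension(mu):
    """Total dimension of the right-hand side of the decomposition."""
    total = 0
    for i in range(3):
        for j in range(3):
            if i == j:
                continue
            shifted = list(mu)
            shifted[i] += 1
            shifted[j] -= 1
            if is_dominant(shifted):        # non-dominant summands do not appear
                total += weyl_dimension(shifted)
    # assumed multiplicity of V_mu: number of strict inequalities in mu
    multiplicity = (mu[0] > mu[1]) + (mu[1] > mu[2])
    return total + multiplicity * weyl_dimension(mu)
```

For example, μ = (1, 0, −1) gives 27 + 10 + 10 + 1 + 2·8 = 64 = 8·8.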


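Here, if the weight $\mu + e_i - e_j$ is not dominant, the corresponding irreducible component $V_{\mu+e_i-e_j}$ does not appear, and if either $\mu_1 = \mu_2$ or $\mu_2 = \mu_3$ holds, the irreducible component $V_\mu$ occurs with multiplicity one in $V_{e_1-e_3} \otimes V_\mu$. In particular, the intertwining space $\mathrm{Hom}(V_{e_1-e_3} \otimes V_\mu, V_\mu)$ has dimension 2 if $\mu_1 > \mu_2 > \mu_3$. An explicit description of the projectors from $V_{e_1-e_3} \otimes V_\mu$ onto its irreducible components with respect to the GZ-basis is given in our previous paper [6]. Now we give an explicit formula for the injectors from the irreducible component $V_\mu$ into $V_{e_1-e_3} \otimes V_\mu$. The injectors we construct here are based on the following lemma given in our previous paper [6, Lemma 3.9].

Lemma 4.3. Let $L^{(1)} = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \mu_1\ \mu_2 \\ \mu_1 \end{smallmatrix}\right) \in G(\mu)$ be the G-pattern giving the highest weight vector $f(L^{(1)})$ in $V_\mu$, and let us define two vectors $v_1 = v_1^{(1)}$ and $v_2 = v_2^{(1)}$ in $V_{e_1-e_3} \otimes V_\mu$ by the formulas

\[
v_1 = f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(1)}\bigr)
 - f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(1)}\tbinom{0\ {-1}}{0}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 1 \end{smallmatrix}\right) \otimes f\bigl(L^{(1)}\tbinom{0\ {-1}}{-1}\bigr),
\]

\[
v_2 = f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(1)}\bigr)
 - f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 1 \end{smallmatrix}\right) \otimes f\bigl(L^{(1)}\tbinom{0\ 0}{-1}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 1 \end{smallmatrix}\right) \otimes f\bigl(L^{(1)}\tbinom{-1\ 0}{-1}\bigr).
\]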

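Then, if $\mu_1 > \mu_2 > \mu_3$, each of $v_1$ and $v_2$ generates a representation isomorphic to $V_\mu$ in $V_{e_1-e_3} \otimes V_\mu$ and gives the highest weight vector in the respective space. If $\mu_1 = \mu_2$ (respectively $\mu_2 = \mu_3$), then the vector $v_2$ (respectively $v_1$) is not valid.

Since each of the vectors $v_1$ and $v_2$ defined in the above lemma is a highest weight vector for a representation isomorphic to $V_\mu$ in $V_{e_1-e_3} \otimes V_\mu$, two injectors which map the highest weight vector $f(L^{(1)})$ in $V_\mu$ to $v_1$ and $v_2$ can be constructed. The following theorem, which is the main theorem of this subsection, gives an explicit description of such injectors.

Theorem 4.4. Let $M = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \alpha_1\ \alpha_2 \\ \beta \end{smallmatrix}\right) \in G(\mu)$ be a G-pattern belonging to $\mu$. Then, for $i = 1, 2$, the following formulas give injective $K$-homomorphisms $\iota_i$ from $V_\mu$ into $V_{e_1-e_3} \otimes V_\mu$ satisfying $\iota_i(f(L^{(1)})) = v_i$ with the G-pattern $L^{(1)} = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \mu_1\ \mu_2 \\ \mu_1 \end{smallmatrix}\right)$:

(1)
\[
\begin{aligned}
&(\mu_1-\mu_3+1)(\mu_2-\mu_3)\,\iota_1(f(M)) \\
&= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right) \otimes \Bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ 0}{1}\bigr) + E(M) f\bigl(M\tbinom{1\ 0}{1}[-1]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ (\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ 0}{0}\bigr) - F(M) f\bigl(M\tbinom{1\ 0}{0}[-1]\bigr) + \chi_-(M)C_2(M) f\bigl(M\tbinom{1\ 0}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ -1 \end{smallmatrix}\right) \otimes \Bigl\{ (\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{1}\bigr) - \bigl\{E(M) - \chi_+(M)(\mu_1-\alpha_1)(\alpha_2-\mu_3)\bigr\} f\bigl(M\tbinom{1\ {-1}}{1}[-1]\bigr) \\
&\qquad\qquad - \chi_+(M)E(M) f\bigl(M\tbinom{1\ {-1}}{1}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ -2(\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{0}\bigr) \\
&\qquad\qquad + \bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) - C_2(M) + E(M) + \chi_-(M)\delta(M)(\mu_1-\mu_3+1) \bigr\} f\bigl(M\tbinom{1\ {-1}}{0}[-1]\bigr) \\
&\qquad\qquad - C_2(M) f\bigl(M\tbinom{1\ {-1}}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{0}\bigr) \\
&\qquad\qquad + \bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) - C_2(M) + (\alpha_1-\mu_3+1)(\alpha_2-\mu_3) + \chi_-(M)\delta(M)(\mu_1-\mu_3+1) \bigr\} f\bigl(M\tbinom{1\ {-1}}{0}[-1]\bigr) \\
&\qquad\qquad - 2C_2(M) f\bigl(M\tbinom{1\ {-1}}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 1 \end{smallmatrix}\right) \otimes \Bigl\{ (\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{-1}\bigr) - F(M) f\bigl(M\tbinom{1\ {-1}}{-1}[-1]\bigr) + \chi_-(M)C_2(M) f\bigl(M\tbinom{1\ {-1}}{-1}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ -(\alpha_1-\mu_3+1)(\alpha_2-\mu_3) f\bigl(M\tbinom{0\ {-1}}{0}\bigr) + \bigl\{ C_2(M) - \chi_+(M)(\alpha_1-\mu_3+1)(\alpha_2-\mu_3) \bigr\} f\bigl(M\tbinom{0\ {-1}}{0}[-1]\bigr) \\
&\qquad\qquad + \chi_+(M)C_2(M) f\bigl(M\tbinom{0\ {-1}}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 1 \end{smallmatrix}\right) \otimes \Bigl\{ (\alpha_1-\mu_3+1)(\alpha_2-\mu_3) f\bigl(M\tbinom{0\ {-1}}{-1}\bigr) - C_2(M) f\bigl(M\tbinom{0\ {-1}}{-1}[-1]\bigr) \Bigr\}.
\end{aligned}
\]

(2)
\[
\begin{aligned}
&(\mu_1-\mu_3+1)(\mu_1-\mu_2)\,\iota_2(f(M)) \\
&= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right) \otimes \Bigl\{ (\mu_1-\alpha_1)(\mu_1-\alpha_2+1) f\bigl(M\tbinom{1\ 0}{1}\bigr) - C_2(M) f\bigl(M\tbinom{1\ 0}{1}[-1]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ -(\mu_1-\alpha_1)(\mu_1-\alpha_2+1) f\bigl(M\tbinom{1\ 0}{0}\bigr) + \bigl\{ C_2(M) - \chi_-(M)(\mu_1-\alpha_1)(\mu_1-\alpha_2+1) \bigr\} f\bigl(M\tbinom{1\ 0}{0}[-1]\bigr) \\
&\qquad\qquad + \chi_-(M)C_2(M) f\bigl(M\tbinom{1\ 0}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ -1 \end{smallmatrix}\right) \otimes \Bigl\{ (\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{1}\bigr) - \bar F(M) f\bigl(M\tbinom{1\ {-1}}{1}[-1]\bigr) + \chi_+(M)C_2(M) f\bigl(M\tbinom{1\ {-1}}{1}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ -2(\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{0}\bigr) \\
&\qquad\qquad + \bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) - C_2(M) + \bar E(M) - \chi_+(M)\delta(M)(\mu_1-\mu_3+1) \bigr\} f\bigl(M\tbinom{1\ {-1}}{0}[-1]\bigr) \\
&\qquad\qquad - C_2(M) f\bigl(M\tbinom{1\ {-1}}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{0}\bigr) \\
&\qquad\qquad + \bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) - C_2(M) + (\mu_1-\alpha_2+1)(\mu_1-\alpha_1) - \chi_+(M)\delta(M)(\mu_1-\mu_3+1) \bigr\} f\bigl(M\tbinom{1\ {-1}}{0}[-1]\bigr) \\
&\qquad\qquad - 2C_2(M) f\bigl(M\tbinom{1\ {-1}}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 1 \end{smallmatrix}\right) \otimes \Bigl\{ (\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{1\ {-1}}{-1}\bigr) + \bigl\{ -\bar E(M) + \chi_-(M)(\mu_1-\alpha_1)(\alpha_2-\mu_3) \bigr\} f\bigl(M\tbinom{1\ {-1}}{-1}[-1]\bigr) \\
&\qquad\qquad - \chi_-(M)\bar E(M) f\bigl(M\tbinom{1\ {-1}}{-1}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ (\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{0\ {-1}}{0}\bigr) - \bar F(M) f\bigl(M\tbinom{0\ {-1}}{0}[-1]\bigr) + \chi_+(M)C_2(M) f\bigl(M\tbinom{0\ {-1}}{0}[-2]\bigr) \Bigr\} \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 1 \end{smallmatrix}\right) \otimes \Bigl\{ -(\mu_1-\alpha_1)(\alpha_2-\mu_3) f\bigl(M\tbinom{0\ {-1}}{-1}\bigr) + \bar E(M) f\bigl(M\tbinom{0\ {-1}}{-1}[-1]\bigr) \Bigr\}.
\end{aligned}
\]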

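Proof. If we put $M = L^{(1)}$ in the formulas, then we have $\iota_i(f(L^{(1)})) = v_i$ for each $i$. Thus, to prove the formulas, it suffices to check that each of these injectors $\iota_i$ gives a $\mathfrak{gl}_3$-homomorphism. The essential part to be checked is the commutativity $\iota_i \cdot E_{kl} = E_{kl} \cdot \iota_i$ for the simple root vectors $E_{kl}$ with $|k - l| = 1$. This can be shown by a direct but long computation. We leave it to the reader. □

A generic irreducible representation $\tau_\mu$ corresponding to a highest weight $\mu = (\mu_1, \mu_2, \mu_3) \in \Lambda$ has six extremal weight vectors. Here an extremal weight means a weight given by a permutation of $\mu$. Each extremal weight vector is annihilated by the action of three different root vectors. If we evaluate the formulas in Theorem 4.4 at the G-patterns which give the extremal weight vectors in $V_\mu$, all extremal vectors in $\iota_i(V_\mu)$ are obtained. The explicit description of the five extremal weight vectors other than the highest weight vector in $\iota_i(V_\mu)$ is given as follows.

Corollary 4.5.

(1) For the G-pattern $L^{(2)} = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \mu_1\ \mu_2 \\ \mu_2 \end{smallmatrix}\right) \in G(\mu)$ giving the extremal weight vector $f(L^{(2)})$ of weight $(\mu_2, \mu_1, \mu_3)$ in $V_\mu$, we have

\[
\begin{aligned}
v_1^{(2)} = \iota_1\bigl(f(L^{(2)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(2)}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 1 \end{smallmatrix}\right) \otimes f\bigl(L^{(2)}\tbinom{0\ {-1}}{-1}\bigr) \\
&\quad - f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ f\bigl(L^{(2)}\tbinom{0\ {-1}}{0}\bigr) + f\bigl(L^{(2)}\tbinom{-1\ 0}{0}\bigr) \Bigr\}, \\
v_2^{(2)} = \iota_2\bigl(f(L^{(2)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ -1 \end{smallmatrix}\right) \otimes f\bigl(L^{(2)}\tbinom{0\ 0}{1}\bigr)
 - \Bigl\{ f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \Bigr\} \otimes f\bigl(L^{(2)}\bigr) \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(2)}\tbinom{-1\ 0}{0}\bigr).
\end{aligned}
\]

(2) For the G-pattern $L^{(3)} = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \mu_1\ \mu_3 \\ \mu_1 \end{smallmatrix}\right) \in G(\mu)$ giving the extremal weight vector $f(L^{(3)})$ of weight $(\mu_1, \mu_3, \mu_2)$ in $V_\mu$, we have

\[
\begin{aligned}
v_1^{(3)} = \iota_1\bigl(f(L^{(3)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(3)}\tbinom{0\ 1}{0}\bigr)
 - \Bigl\{ f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \Bigr\} \otimes f\bigl(L^{(3)}\bigr) \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 1 \end{smallmatrix}\right) \otimes f\bigl(L^{(3)}\tbinom{0\ 0}{-1}\bigr), \\
v_2^{(3)} = \iota_2\bigl(f(L^{(3)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(3)}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 1 \end{smallmatrix}\right) \otimes f\bigl(L^{(3)}\tbinom{-1\ 0}{-1}\bigr) \\
&\quad - f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 1 \end{smallmatrix}\right) \otimes \Bigl\{ f\bigl(L^{(3)}\tbinom{0\ 0}{-1}\bigr) + f\bigl(L^{(3)}\tbinom{-1\ 1}{-1}\bigr) \Bigr\}.
\end{aligned}
\]

(3) For the G-pattern $L^{(4)} = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \mu_1\ \mu_3 \\ \mu_3 \end{smallmatrix}\right) \in G(\mu)$ giving the extremal weight vector $f(L^{(4)})$ of weight $(\mu_3, \mu_1, \mu_2)$ in $V_\mu$, we have

\[
\begin{aligned}
v_1^{(4)} = \iota_1\bigl(f(L^{(4)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right) \otimes f\bigl(L^{(4)}\tbinom{0\ 1}{1}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(4)}\bigr) \\
&\quad - f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ -1 \end{smallmatrix}\right) \otimes \Bigl\{ f\bigl(L^{(4)}\tbinom{0\ 0}{1}\bigr) + f\bigl(L^{(4)}\tbinom{-1\ 1}{1}\bigr) \Bigr\}, \\
v_2^{(4)} = \iota_2\bigl(f(L^{(4)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ -1 \end{smallmatrix}\right) \otimes f\bigl(L^{(4)}\tbinom{0\ 0}{1}\bigr)
 - \Bigl\{ f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \Bigr\} \otimes f\bigl(L^{(4)}\bigr) \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ 0 \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(4)}\tbinom{-1\ 0}{0}\bigr).
\end{aligned}
\]

(4) For the G-pattern $L^{(5)} = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \mu_2\ \mu_3 \\ \mu_2 \end{smallmatrix}\right) \in G(\mu)$ giving the extremal weight vector $f(L^{(5)})$ of weight $(\mu_2, \mu_3, \mu_1)$ in $V_\mu$, we have

\[
\begin{aligned}
v_1^{(5)} = \iota_1\bigl(f(L^{(5)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(5)}\tbinom{0\ 1}{0}\bigr)
 - \Bigl\{ f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \Bigr\} \otimes f\bigl(L^{(5)}\bigr) \\
&\quad + f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 1 \end{smallmatrix}\right) \otimes f\bigl(L^{(5)}\tbinom{0\ 0}{-1}\bigr), \\
v_2^{(5)} = \iota_2\bigl(f(L^{(5)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right) \otimes f\bigl(L^{(5)}\tbinom{1\ 0}{1}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(5)}\bigr) \\
&\quad - f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right) \otimes \Bigl\{ f\bigl(L^{(5)}\tbinom{1\ 0}{0}\bigr) + f\bigl(L^{(5)}\tbinom{0\ 1}{0}\bigr) \Bigr\}.
\end{aligned}
\]

(5) For the G-pattern $L^{(6)} = \left(\begin{smallmatrix} \mu_1\ \mu_2\ \mu_3 \\ \mu_2\ \mu_3 \\ \mu_3 \end{smallmatrix}\right) \in G(\mu)$ giving the lowest weight vector $f(L^{(6)})$ of weight $(\mu_3, \mu_2, \mu_1)$ in $V_\mu$, we have

\[
\begin{aligned}
v_1^{(6)} = \iota_1\bigl(f(L^{(6)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(6)}\bigr)
 - f\left(\begin{smallmatrix} e_1-e_3 \\ 1\ {-1} \\ -1 \end{smallmatrix}\right) \otimes f\bigl(L^{(6)}\tbinom{0\ 0}{1}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right) \otimes f\bigl(L^{(6)}\tbinom{0\ 1}{1}\bigr), \\
v_2^{(6)} = \iota_2\bigl(f(L^{(6)})\bigr) &= f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(6)}\bigr)
 - f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right) \otimes f\bigl(L^{(6)}\tbinom{1\ 0}{0}\bigr)
 + f\left(\begin{smallmatrix} e_1-e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right) \otimes f\bigl(L^{(6)}\tbinom{1\ 0}{1}\bigr).
\end{aligned}
\]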

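4.4. The realization of $\tau_\mu$ in $L^2(K)$

Let $(\tau_\mu, V_\mu)$ be a representation of $K$ associated with a highest weight $\mu = (\mu_1, \mu_2, \mu_3) \in \Lambda$. In this subsection, we give a natural construction of $\tau_\mu$ in $L^2(K)$. To do this, it suffices to investigate $\tau_{(p,0,-q)}$ with $p = \mu_1 - \mu_2$ and $q = \mu_2 - \mu_3$ instead of $\tau_\mu$, since there is an isomorphism $\tau_\mu \cong \tau_{(p,0,-q)} \otimes \det^{\mu_2}$. Here $\det^{\mu_2} = \tau_{(\mu_2,\mu_2,\mu_2)}$ is the character of $K = U(3)$ given by $X \mapsto (\det X)^{\mu_2}$. First, we remark the following lemma, which is easy to prove, say, by utilizing the harmonic polynomial model (cf. [23] for example).

Lemma 4.6.

\[
\tau_{(p,0,0)} \otimes \tau_{(0,0,-q)} \cong \bigoplus_{i=0}^{\min\{p,q\}} \tau_{(p-i,0,-q+i)}.
\]

In particular, $\tau_{(p,0,-q)}$ occurs in $\tau_{(p,0,0)} \otimes \tau_{(0,0,-q)}$ with multiplicity one.

Now we give a natural construction of the representation $\tau_{(p,0,-q)}$ in $L^2(K)$. In the tautological representation

\[
K \ni k \mapsto s(k) = (s_{ij}(k))_{1 \le i,j \le 3} \in U(3) \subset G
\]

of $K$, we can consider each of the matrix coefficients $s_{ij}$ as an $L^2$-function on $K$. Then, for each fixed $1 \le i \le 3$, the set $\{s_{i1}, s_{i2}, s_{i3}\}$ of the matrix coefficients generates a representation isomorphic to $\tau_{e_1}$ in $L^2_{(M,\sigma_{e_i})}(K)$. The correspondence to the GZ-basis is given as follows:

\[
s_{i1} \leftrightarrow f\left(\begin{smallmatrix} e_1 \\ 1\ 0 \\ 1 \end{smallmatrix}\right), \qquad
s_{i2} \leftrightarrow f\left(\begin{smallmatrix} e_1 \\ 1\ 0 \\ 0 \end{smallmatrix}\right), \qquad
s_{i3} \leftrightarrow f\left(\begin{smallmatrix} e_1 \\ 0\ 0 \\ 0 \end{smallmatrix}\right)
\]

(see Table 2).

Table 2
The actions of the root vectors $E^{\mathfrak k}_{ij}$ on $\{s_{ij}(k)\}$.

\[
\begin{array}{c|cccccc}
 & E^{\mathfrak k}_{12} & E^{\mathfrak k}_{21} & E^{\mathfrak k}_{23} & E^{\mathfrak k}_{32} & E^{\mathfrak k}_{13} & E^{\mathfrak k}_{31} \\ \hline
s_{i1} & 0 & s_{i2} & 0 & 0 & 0 & s_{i3} \\
s_{i2} & s_{i1} & 0 & 0 & s_{i3} & 0 & 0 \\
s_{i3} & 0 & 0 & s_{i2} & 0 & s_{i1} & 0
\end{array}
\]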


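Similarly, if we consider the complex conjugates $\bar s_{ij}$ of the matrix coefficients, that is, the matrix coefficients of the representation

\[
K \ni k \mapsto \bar s(k) = (\bar s_{ij}(k))_{1 \le i,j \le 3} \in U(3) \subset G,
\]

as $L^2$-functions on $K$, the set $\{\bar s_{i1}, \bar s_{i2}, \bar s_{i3}\}$ generates a representation isomorphic to $\tau_{-e_3}$ in $L^2_{(M,\sigma_{-e_i})}(K)$ for each fixed $i$. The correspondence to the GZ-basis is given by

\[
\bar s_{i3} \leftrightarrow f\left(\begin{smallmatrix} -e_3 \\ 0\ 0 \\ 0 \end{smallmatrix}\right), \qquad
-\bar s_{i2} \leftrightarrow f\left(\begin{smallmatrix} -e_3 \\ 0\ {-1} \\ 0 \end{smallmatrix}\right), \qquad
\bar s_{i1} \leftrightarrow f\left(\begin{smallmatrix} -e_3 \\ 0\ {-1} \\ -1 \end{smallmatrix}\right)
\]

(see Table 3).

Table 3
The actions of the root vectors $E^{\mathfrak k}_{ij}$ on $\{\bar s_{ij}(k)\}$.

\[
\begin{array}{c|cccccc}
 & E^{\mathfrak k}_{12} & E^{\mathfrak k}_{21} & E^{\mathfrak k}_{23} & E^{\mathfrak k}_{32} & E^{\mathfrak k}_{13} & E^{\mathfrak k}_{31} \\ \hline
\bar s_{i3} & 0 & 0 & 0 & -\bar s_{i2} & 0 & -\bar s_{i1} \\
\bar s_{i2} & 0 & -\bar s_{i1} & -\bar s_{i3} & 0 & 0 & 0 \\
\bar s_{i1} & -\bar s_{i2} & 0 & 0 & 0 & -\bar s_{i3} & 0
\end{array}
\]

Since we have the isomorphisms $\tau_{(p,0,0)} \cong \mathrm{Sym}^p\,\tau_{e_1}$ and $\tau_{(0,0,-q)} \cong \mathrm{Sym}^q\,\tau_{-e_3}$, the facts discussed above immediately lead to the following lemma.

Lemma 4.7. Let $p, q \in \mathbf{Z}_{\ge 0}$.

(1) For each fixed $1 \le i \le 3$, the function $s_{i1}^{\,p} \in L^2_{(M,\sigma_{pe_i})}(K)$ generates a representation isomorphic to $\tau_{(p,0,0)}$ by its right translations and becomes its highest weight vector.
(2) For each fixed $1 \le i \le 3$, the function $\bar s_{i3}^{\,q} \in L^2_{(M,\sigma_{-qe_i})}(K)$ generates a representation isomorphic to $\tau_{(0,0,-q)}$ by its right translations and becomes its highest weight vector.
(3) For each fixed $1 \le i, j \le 3$ such that $i \ne j$, the function $s_{i1}^{\,p}\bar s_{j3}^{\,q} \in L^2_{(M,\sigma_{pe_i-qe_j})}(K)$ generates a representation isomorphic to $\tau_{(p,0,-q)}$ by its right translations and becomes its highest weight vector.

In the above realization, the highest weight vector $f(L^{(1)})$ in $V_{(p,0,-q)}$ corresponds to the function $s_{i1}^{\,p}\bar s_{j3}^{\,q}$ in $L^2_{(M,\sigma_{pe_i-qe_j})}(K)$. The next lemma gives the correspondence between the extremal weight vectors $f(L^{(k)})$ for $1 \le k \le 6$ in $V_{(p,0,-q)}$, together with their neighbors, and the functions in $L^2_{(M,\sigma_{pe_i-qe_j})}(K)$.

Lemma 4.8. Let $\mu = (p, 0, -q)$, and for $1 \le k \le 6$ let $L^{(k)}$ be the G-patterns belonging to the highest weight $\mu$ defined in Lemma 4.3 and Corollary 4.5 which give the extremal vectors in $V_\mu$. In the above embedding of $V_\mu$ in $L^2_{(M,\sigma_{pe_i-qe_j})}(K)$, we have the following correspondence with the GZ-basis.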

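(1)
\[
\begin{aligned}
f\bigl(L^{(1)}\bigr) &\leftrightarrow s_{i1}^{\,p}\bar s_{j3}^{\,q}, &
f\bigl(L^{(1)}\tbinom{0\ 0}{-1}\bigr) &\leftrightarrow s_{i1}^{\,p-1}s_{i2}\bar s_{j3}^{\,q}, \\
f\bigl(L^{(1)}\tbinom{0\ {-1}}{0}\bigr) &\leftrightarrow -s_{i1}^{\,p}\bar s_{j3}^{\,q-1}\bar s_{j2}, &
f\bigl(L^{(1)}\tbinom{0\ {-1}}{-1}\bigr) &\leftrightarrow s_{i1}^{\,p}\bar s_{j1}\bar s_{j3}^{\,q-1}, \\
f\bigl(L^{(1)}\tbinom{-1\ 0}{-1}\bigr) &\leftrightarrow s_{i1}^{\,p-1}s_{i3}\bar s_{j3}^{\,q}.
\end{aligned}
\]

(2)
\[
\begin{aligned}
f\bigl(L^{(2)}\bigr) &\leftrightarrow s_{i2}^{\,p}\bar s_{j3}^{\,q}, &
f\bigl(L^{(2)}\tbinom{0\ 0}{1}\bigr) &\leftrightarrow s_{i2}^{\,p-1}s_{i1}\bar s_{j3}^{\,q}, \\
f\bigl(L^{(2)}\tbinom{0\ {-1}}{-1}\bigr) &\leftrightarrow s_{i2}^{\,p}\bar s_{j3}^{\,q-1}\bar s_{j1}, &
f\bigl(L^{(2)}\tbinom{0\ {-1}}{0}\bigr) &\leftrightarrow s_{i2}^{\,p-1}s_{i1}\bar s_{j3}^{\,q-1}\bar s_{j1}, \\
f\bigl(L^{(2)}\tbinom{-1\ 0}{0}\bigr) &\leftrightarrow s_{i2}^{\,p-1}s_{i3}\bar s_{j3}^{\,q}.
\end{aligned}
\]

(3)
\[
\begin{aligned}
f\bigl(L^{(3)}\bigr) &\leftrightarrow s_{i1}^{\,p}(-\bar s_{j2})^{q}, &
f\bigl(L^{(3)}\tbinom{0\ 1}{0}\bigr) &\leftrightarrow s_{i1}^{\,p}(-\bar s_{j2})^{q-1}\bar s_{j3}, \\
f\bigl(L^{(3)}\tbinom{-1\ 0}{-1}\bigr) &\leftrightarrow s_{i1}^{\,p-1}s_{i3}(-\bar s_{j2})^{q}, &
f\bigl(L^{(3)}\tbinom{0\ 0}{-1}\bigr) &\leftrightarrow s_{i1}^{\,p}(-\bar s_{j2})^{q-1}\bar s_{j1}, \\
f\bigl(L^{(3)}\tbinom{-1\ 1}{-1}\bigr) &\leftrightarrow s_{i1}^{\,p-1}s_{i3}(-\bar s_{j2})^{q-1}\bar s_{j3}.
\end{aligned}
\]

(4)
\[
\begin{aligned}
f\bigl(L^{(4)}\bigr) &\leftrightarrow s_{i2}^{\,p}\bar s_{j1}^{\,q}, &
f\bigl(L^{(4)}\tbinom{0\ 1}{1}\bigr) &\leftrightarrow s_{i2}^{\,p}\bar s_{j1}^{\,q-1}\bar s_{j3}, \\
f\bigl(L^{(4)}\tbinom{-1\ 0}{0}\bigr) &\leftrightarrow s_{i2}^{\,p-1}s_{i3}\bar s_{j1}^{\,q}, &
f\bigl(L^{(4)}\tbinom{0\ 0}{1}\bigr) &\leftrightarrow s_{i2}^{\,p-1}s_{i1}\bar s_{j1}^{\,q}, \\
f\bigl(L^{(4)}\tbinom{-1\ 1}{1}\bigr) &\leftrightarrow s_{i2}^{\,p-1}s_{i3}\bar s_{j1}^{\,q-1}\bar s_{j3}.
\end{aligned}
\]

(5)
\[
\begin{aligned}
f\bigl(L^{(5)}\bigr) &\leftrightarrow s_{i3}^{\,p}(-\bar s_{j2})^{q}, &
f\bigl(L^{(5)}\tbinom{1\ 0}{1}\bigr) &\leftrightarrow s_{i3}^{\,p-1}s_{i1}(-\bar s_{j2})^{q}, \\
f\bigl(L^{(5)}\tbinom{0\ 0}{-1}\bigr) &\leftrightarrow s_{i3}^{\,p}(-\bar s_{j2})^{q-1}\bar s_{j1}, &
f\bigl(L^{(5)}\tbinom{1\ 0}{0}\bigr) &\leftrightarrow s_{i3}^{\,p-1}s_{i1}(-\bar s_{j2})^{q-1}\bar s_{j1}, \\
f\bigl(L^{(5)}\tbinom{0\ 1}{0}\bigr) &\leftrightarrow s_{i3}^{\,p}(-\bar s_{j2})^{q-1}\bar s_{j3}.
\end{aligned}
\]

(6)
\[
\begin{aligned}
f\bigl(L^{(6)}\bigr) &\leftrightarrow s_{i3}^{\,p}\bar s_{j1}^{\,q}, &
f\bigl(L^{(6)}\tbinom{1\ 0}{0}\bigr) &\leftrightarrow s_{i3}^{\,p-1}s_{i2}\bar s_{j1}^{\,q}, \\
f\bigl(L^{(6)}\tbinom{0\ 0}{1}\bigr) &\leftrightarrow s_{i3}^{\,p}\bar s_{j1}^{\,q-1}(-\bar s_{j2}), &
f\bigl(L^{(6)}\tbinom{1\ 0}{1}\bigr) &\leftrightarrow s_{i3}^{\,p-1}s_{i1}\bar s_{j1}^{\,q}, \\
f\bigl(L^{(6)}\tbinom{0\ 1}{1}\bigr) &\leftrightarrow s_{i3}^{\,p}\bar s_{j1}^{\,q-1}\bar s_{j3}.
\end{aligned}
\]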


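Proof. First we prove the correspondence in the assertion (1). From Lemma 4.1, we have

\[
E^{\mathfrak k}_{21} f\bigl(L^{(1)}\bigr) = p \cdot f\bigl(L^{(1)}\tbinom{0\ 0}{-1}\bigr), \qquad
E^{\mathfrak k}_{32} f\bigl(L^{(1)}\bigr) = q \cdot f\bigl(L^{(1)}\tbinom{0\ {-1}}{0}\bigr).
\]

On the other hand, by using the actions given in Tables 2 and 3 we obtain

\[
E^{\mathfrak k}_{21}\bigl(s_{i1}^{\,p}\bar s_{j3}^{\,q}\bigr) = p \cdot s_{i1}^{\,p-1}s_{i2}\bar s_{j3}^{\,q}, \qquad
E^{\mathfrak k}_{32}\bigl(s_{i1}^{\,p}\bar s_{j3}^{\,q}\bigr) = -q \cdot s_{i1}^{\,p}\bar s_{j3}^{\,q-1}\bar s_{j2}.
\]

These give the second and the third correspondences in the assertion (1). The fourth and the fifth ones are obtained by the equations

\[
\begin{aligned}
E^{\mathfrak k}_{31} f\bigl(L^{(1)}\bigr) &= -q \cdot f\bigl(L^{(1)}\tbinom{0\ {-1}}{-1}\bigr) + p \cdot f\bigl(L^{(1)}\tbinom{-1\ 0}{-1}\bigr), \\
E^{\mathfrak k}_{32} f\bigl(L^{(1)}\tbinom{0\ 0}{-1}\bigr) &= q \cdot f\bigl(L^{(1)}\tbinom{0\ {-1}}{-1}\bigr) + (q+1) \cdot f\bigl(L^{(1)}\tbinom{-1\ 0}{-1}\bigr),
\end{aligned}
\]

from Lemma 4.1 and

\[
\begin{aligned}
E^{\mathfrak k}_{31}\bigl(s_{i1}^{\,p}\bar s_{j3}^{\,q}\bigr) &= -q \cdot s_{i1}^{\,p}\bar s_{j1}\bar s_{j3}^{\,q-1} + p \cdot s_{i1}^{\,p-1}s_{i3}\bar s_{j3}^{\,q}, \\
E^{\mathfrak k}_{32}\bigl(s_{i1}^{\,p-1}s_{i2}\bar s_{j3}^{\,q}\bigr) &= s_{i1}^{\,p-1}s_{i3}\bar s_{j3}^{\,q} - q \cdot s_{i1}^{\,p-1}s_{i2}\bar s_{j2}\bar s_{j3}^{\,q-1} \\
&= q \cdot s_{i1}^{\,p}\bar s_{j1}\bar s_{j3}^{\,q-1} + (q+1) \cdot s_{i1}^{\,p-1}s_{i3}\bar s_{j3}^{\,q},
\end{aligned}
\]

from Tables 2 and 3 together with the relation

\[
s_{i1}\bar s_{j1} + s_{i2}\bar s_{j2} + s_{i3}\bar s_{j3} = 0, \qquad i \ne j,
\]

which comes from the unitarity. The correspondences given in the other assertions are obtained similarly, once the correspondences for the extremal weight vectors are given. Lemma 4.1 gives the following relations between the extremal weight vectors in $V_\mu$:

\[
\begin{aligned}
\bigl(E^{\mathfrak k}_{21}\bigr)^p f\bigl(L^{(1)}\bigr) &= p!\, f\bigl(L^{(2)}\bigr), &
\bigl(E^{\mathfrak k}_{32}\bigr)^q f\bigl(L^{(1)}\bigr) &= q!\, f\bigl(L^{(3)}\bigr), \\
\bigl(E^{\mathfrak k}_{31}\bigr)^q f\bigl(L^{(2)}\bigr) &= q!\,(-1)^q f\bigl(L^{(4)}\bigr), &
\bigl(E^{\mathfrak k}_{31}\bigr)^p f\bigl(L^{(3)}\bigr) &= p!\, f\bigl(L^{(5)}\bigr), \\
\bigl(E^{\mathfrak k}_{32}\bigr)^p f\bigl(L^{(4)}\bigr) &= p!\, f\bigl(L^{(6)}\bigr), &
\bigl(E^{\mathfrak k}_{21}\bigr)^q f\bigl(L^{(5)}\bigr) &= q!\, f\bigl(L^{(6)}\bigr).
\end{aligned}
\]

By considering the corresponding actions in $L^2_{(M,\sigma_{pe_i-qe_j})}(K)$, we have the correspondences for the extremal weight vectors in the assertions (2)–(6). □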

5. $(\mathfrak{g}_{\mathbf C}, K)$-module structure

Let $\pi = \pi(\nu, \sigma_n)$ be an irreducible principal series representation with data $\nu = (\nu_1, \nu_2, \nu_3) \in \mathbf{C}^3$ and $n = (n_1, n_2, n_3) \in \mathbf{Z}^3$, and let $\tau^* = \tau_m$ be the minimal K-type of $\pi$. Here $m = (m_1, m_2, m_3) \in \Lambda$ is the dominant permutation of $n$. In this section, we explain some equations for weight vectors in the minimal K-type $\tau_m$ of $\pi$, which are determined from the $(\mathfrak{g}_{\mathbf C}, K)$-module structure of $\pi$. Although we need only a partial result here, we can describe the whole $(\mathfrak{g}_{\mathbf C}, K)$-module structure of the principal series representation as in the case of Sp(2, R) (see [15]), Sp(3, R) (see [13]), and SL(3, R) (see [14]).

5.1. Differential equations for generators of $Z(\mathfrak{g}_{\mathbf C})$

For a Lie algebra $\mathfrak{l}$ over $\mathbf{C}$, let us denote the universal enveloping algebra of $\mathfrak{l}$ by $U(\mathfrak{l})$ and its center by $Z(\mathfrak{l})$. It is well known that an element $C$ in $Z(\mathfrak{g}_{\mathbf C})$ acts as a scalar on the $K$-finite vectors in $\pi$. Thus, if we take an injection $j \in \mathrm{Hom}_K(\tau_m, \pi|_K)$, then each element of the GZ-basis $\{f(M)\}_{M \in G(m)}$ of $V_m$ satisfies the equation

\[
C \cdot j(f(M)) = \chi_C\, j(f(M)), \tag{1}
\]

for a scalar $\chi_C$. Now we construct a set of generators of $Z(\mathfrak{g}_{\mathbf C})$. To do this, we use the Capelli elements in $U(\mathfrak{g})$ given in the following lemma (cf. [7, §11]).

Lemma 5.1. Define three elements

\[
\begin{aligned}
Cp_{1,\mathbf R} &= I_3, \\
Cp_{2,\mathbf R} &= (E_{11} - 1)E_{22} + E_{22}(E_{33} + 1) + (E_{11} - 1)(E_{33} + 1) - E_{23}E_{32} - E_{13}E_{31} - E_{12}E_{21}, \\
Cp_{3,\mathbf R} &= (E_{11} - 1)E_{22}(E_{33} + 1) + E_{12}E_{23}E_{31} + E_{13}E_{21}E_{32} \\
&\quad - (E_{11} - 1)E_{23}E_{32} - E_{13}E_{22}E_{31} - E_{12}E_{21}(E_{33} + 1)
\end{aligned}
\]

in $U(\mathfrak{g})$. Then the set $\{Cp_{k,\mathbf R} \mid 1 \le k \le 3\}$ is a system of independent generators of $Z(\mathfrak{g})$.

The complexification $\mathfrak{g}_{\mathbf C}$ of the Lie algebra $\mathfrak{g}$ can be identified with $\mathfrak{g} \oplus \mathfrak{g}$ in such a way that $X \in \mathfrak{g}$ corresponds to the element $X \oplus \bar X$, where $\bar X$ is the complex conjugate of $X$. Hence the universal enveloping algebra $U(\mathfrak{g}_{\mathbf C})$ of $\mathfrak{g}_{\mathbf C}$ is isomorphic to $U(\mathfrak{g}) \otimes_{\mathbf C} U(\mathfrak{g})$. From this identification and Lemma 5.1, we have the following lemma, which gives a set of generators of $Z(\mathfrak{g}_{\mathbf C})$.

Lemma 5.2. For $1 \le k \le 3$, put $Cp_k^{(1)} = Cp_{k,\mathbf R} \otimes 1$ and $Cp_k^{(2)} = 1 \otimes Cp_{k,\mathbf R}$ in $U(\mathfrak{g}) \otimes_{\mathbf C} U(\mathfrak{g})$. Then the set $\{Cp_k^{(i)} \mid 1 \le i \le 2,\ 1 \le k \le 3\}$ gives a system of independent generators of $Z(\mathfrak{g}_{\mathbf C})$, considered as a subalgebra of $U(\mathfrak{g}) \otimes_{\mathbf C} U(\mathfrak{g})$.

For the generators $Cp_k^{(i)}$ given in Lemma 5.2, we give their expressions as elements of $U(\mathfrak{g}_{\mathbf C})$.

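M. Hirano, T. Oda / Journal of Functional Analysis 256 (2009) 2222–2267

Lemma 5.3. As elements of $U(\mathfrak{g}_{\mathbf C})$, the generators $Cp_k^{(i)}$ of $Z(\mathfrak{g}_{\mathbf C})$ are given as follows.
\begin{align*}
Cp_1^{(1)} &= \tfrac12\bigl(I_3^{\mathfrak p} + I_3^{\mathfrak k}\bigr), \qquad
Cp_1^{(2)} = \tfrac12\bigl(I_3^{\mathfrak p} - I_3^{\mathfrak k}\bigr),\\
Cp_2^{(1)} &= \tfrac14\Bigl\{\bigl(E_{11}^{\mathfrak p} + E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{22}^{\mathfrak p} + E_{22}^{\mathfrak k}\bigr)
 + \bigl(E_{22}^{\mathfrak p} + E_{22}^{\mathfrak k}\bigr)\bigl(E_{33}^{\mathfrak p} + E_{33}^{\mathfrak k} + 2\bigr)\\
&\qquad + \bigl(E_{11}^{\mathfrak p} + E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{33}^{\mathfrak p} + E_{33}^{\mathfrak k} + 2\bigr)
 - \bigl(E_{23}^{\mathfrak p} + E_{23}^{\mathfrak k}\bigr)\bigl(E_{32}^{\mathfrak p} + E_{32}^{\mathfrak k}\bigr)\\
&\qquad - \bigl(E_{13}^{\mathfrak p} + E_{13}^{\mathfrak k}\bigr)\bigl(E_{31}^{\mathfrak p} + E_{31}^{\mathfrak k}\bigr)
 - \bigl(E_{12}^{\mathfrak p} + E_{12}^{\mathfrak k}\bigr)\bigl(E_{21}^{\mathfrak p} + E_{21}^{\mathfrak k}\bigr)\Bigr\},\\
Cp_2^{(2)} &= \tfrac14\Bigl\{\bigl(E_{11}^{\mathfrak p} - E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{22}^{\mathfrak p} - E_{22}^{\mathfrak k}\bigr)
 + \bigl(E_{22}^{\mathfrak p} - E_{22}^{\mathfrak k}\bigr)\bigl(E_{33}^{\mathfrak p} - E_{33}^{\mathfrak k} + 2\bigr)\\
&\qquad + \bigl(E_{11}^{\mathfrak p} - E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{33}^{\mathfrak p} - E_{33}^{\mathfrak k} + 2\bigr)
 - \bigl(E_{32}^{\mathfrak p} - E_{32}^{\mathfrak k}\bigr)\bigl(E_{23}^{\mathfrak p} - E_{23}^{\mathfrak k}\bigr)\\
&\qquad - \bigl(E_{31}^{\mathfrak p} - E_{31}^{\mathfrak k}\bigr)\bigl(E_{13}^{\mathfrak p} - E_{13}^{\mathfrak k}\bigr)
 - \bigl(E_{21}^{\mathfrak p} - E_{21}^{\mathfrak k}\bigr)\bigl(E_{12}^{\mathfrak p} - E_{12}^{\mathfrak k}\bigr)\Bigr\},\\
Cp_3^{(1)} &= \tfrac18\Bigl\{\bigl(E_{11}^{\mathfrak p} + E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{22}^{\mathfrak p} + E_{22}^{\mathfrak k}\bigr)\bigl(E_{33}^{\mathfrak p} + E_{33}^{\mathfrak k} + 2\bigr)\\
&\qquad + \bigl(E_{12}^{\mathfrak p} + E_{12}^{\mathfrak k}\bigr)\bigl(E_{23}^{\mathfrak p} + E_{23}^{\mathfrak k}\bigr)\bigl(E_{31}^{\mathfrak p} + E_{31}^{\mathfrak k}\bigr)
 + \bigl(E_{13}^{\mathfrak p} + E_{13}^{\mathfrak k}\bigr)\bigl(E_{21}^{\mathfrak p} + E_{21}^{\mathfrak k}\bigr)\bigl(E_{32}^{\mathfrak p} + E_{32}^{\mathfrak k}\bigr)\\
&\qquad - \bigl(E_{11}^{\mathfrak p} + E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{23}^{\mathfrak p} + E_{23}^{\mathfrak k}\bigr)\bigl(E_{32}^{\mathfrak p} + E_{32}^{\mathfrak k}\bigr)
 - \bigl(E_{13}^{\mathfrak p} + E_{13}^{\mathfrak k}\bigr)\bigl(E_{22}^{\mathfrak p} + E_{22}^{\mathfrak k}\bigr)\bigl(E_{31}^{\mathfrak p} + E_{31}^{\mathfrak k}\bigr)\\
&\qquad - \bigl(E_{12}^{\mathfrak p} + E_{12}^{\mathfrak k}\bigr)\bigl(E_{21}^{\mathfrak p} + E_{21}^{\mathfrak k}\bigr)\bigl(E_{33}^{\mathfrak p} + E_{33}^{\mathfrak k} + 2\bigr)\Bigr\},\\
Cp_3^{(2)} &= \tfrac18\Bigl\{\bigl(E_{11}^{\mathfrak p} - E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{22}^{\mathfrak p} - E_{22}^{\mathfrak k}\bigr)\bigl(E_{33}^{\mathfrak p} - E_{33}^{\mathfrak k} + 2\bigr)\\
&\qquad + \bigl(E_{21}^{\mathfrak p} - E_{21}^{\mathfrak k}\bigr)\bigl(E_{32}^{\mathfrak p} - E_{32}^{\mathfrak k}\bigr)\bigl(E_{13}^{\mathfrak p} - E_{13}^{\mathfrak k}\bigr)
 + \bigl(E_{31}^{\mathfrak p} - E_{31}^{\mathfrak k}\bigr)\bigl(E_{12}^{\mathfrak p} - E_{12}^{\mathfrak k}\bigr)\bigl(E_{23}^{\mathfrak p} - E_{23}^{\mathfrak k}\bigr)\\
&\qquad - \bigl(E_{11}^{\mathfrak p} - E_{11}^{\mathfrak k} - 2\bigr)\bigl(E_{32}^{\mathfrak p} - E_{32}^{\mathfrak k}\bigr)\bigl(E_{23}^{\mathfrak p} - E_{23}^{\mathfrak k}\bigr)
 - \bigl(E_{31}^{\mathfrak p} - E_{31}^{\mathfrak k}\bigr)\bigl(E_{22}^{\mathfrak p} - E_{22}^{\mathfrak k}\bigr)\bigl(E_{13}^{\mathfrak p} - E_{13}^{\mathfrak k}\bigr)\\
&\qquad - \bigl(E_{21}^{\mathfrak p} - E_{21}^{\mathfrak k}\bigr)\bigl(E_{12}^{\mathfrak p} - E_{12}^{\mathfrak k}\bigr)\bigl(E_{33}^{\mathfrak p} - E_{33}^{\mathfrak k} + 2\bigr)\Bigr\}.
\end{align*}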

Proof. From the definition, the elements $E_{ij}^{\mathfrak k}$ and $E_{ij}^{\mathfrak p}$ in $\mathfrak{g}_{\mathbf C}$ correspond to the elements $E_{ij} \oplus (-E_{ji})$ and $E_{ij} \oplus E_{ji}$ in $\mathfrak{g} \oplus \mathfrak{g}$, respectively. Therefore, we have the correspondence between $E_{ij}^{\mathfrak p} + E_{ij}^{\mathfrak k}$ (respectively $E_{ij}^{\mathfrak p} - E_{ij}^{\mathfrak k}$) and $2E_{ij} \oplus 0$ (respectively $0 \oplus 2E_{ji}$). The assertion can be obtained from the above correspondences and the definition of the generators $Cp_k^{(i)}$ by direct computation. □

For each $C = Cp_k^{(i)}$, the scalar value $\chi_C$ in Eq. (1) can be obtained by considering the evaluation of the left-hand side at the identity.



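Lemma 5.4.
\begin{align*}
\chi_{Cp_1^{(1)}} &= \frac12 \sum_{1 \le i \le 3} (\nu_i + n_i), &
\chi_{Cp_1^{(2)}} &= \frac12 \sum_{1 \le i \le 3} (\nu_i - n_i),\\
\chi_{Cp_2^{(1)}} &= \frac14 \sum_{1 \le i < j \le 3} (\nu_i + n_i)(\nu_j + n_j), &
\chi_{Cp_2^{(2)}} &= \frac14 \sum_{1 \le i < j \le 3} (\nu_i - n_i)(\nu_j - n_j),\\
\chi_{Cp_3^{(1)}} &= \frac18 \prod_{1 \le i \le 3} (\nu_i + n_i), &
\chi_{Cp_3^{(2)}} &= \frac18 \prod_{1 \le i \le 3} (\nu_i - n_i).
\end{align*}

Proof. We evaluate the actions of $Cp_k^{(i)}$ on the representation space $L^2_{(M,\sigma_n)}(K)$ of $\pi = \pi(\nu, \sigma_n)$ at the identity using their expressions in Lemma 5.3. Then the elements $E_{11}^{\mathfrak p}$, $E_{22}^{\mathfrak p}$, and $E_{33}^{\mathfrak p}$ in $\mathfrak{a}_{\mathbf C}$ act by the scalars $\nu_1 + 2$, $\nu_2$, and $\nu_3 - 2$, respectively. Also, the elements $E_{11}^{\mathfrak k}$, $E_{22}^{\mathfrak k}$, and $E_{33}^{\mathfrak k}$ in $\mathfrak{m}_{\mathbf C}$ act by the scalars $n_1$, $n_2$, and $n_3$, respectively. Moreover, $E_{ij}^{\mathfrak p} + E_{ij}^{\mathfrak k} = E_{ij} - \sqrt{-1}E'_{ij}$ and $E_{ji}^{\mathfrak p} - E_{ji}^{\mathfrak k} = E_{ij} + \sqrt{-1}E'_{ij}$ belong to $\mathfrak{n}_{\mathbf C}$ for $i < j$, and thus their actions are zero. From the above facts, the eigenvalues $\chi_{Cp_k^{(i)}}$ can be calculated as in the assertion. □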
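The evaluation at the identity used in this proof can be replayed with exact rational arithmetic. The following is only a sketch under stated assumptions: the function names are hypothetical, the data $\nu$ are taken rational instead of complex, and only the diagonal factors of Lemma 5.3 are kept, since the proof shows that the terms containing elements of $\mathfrak{n}_{\mathbf C}$ act by zero.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """k-th elementary symmetric polynomial of the given values."""
    return sum(prod(c) for c in combinations(values, k))


def capelli_eigenvalues(nu, n):
    """Eigenvalues as stated in Lemma 5.4, keyed by (k, i) for Cp_k^{(i)}."""
    plus = [a + b for a, b in zip(nu, n)]
    minus = [a - b for a, b in zip(nu, n)]
    chi = {}
    for k in (1, 2, 3):
        chi[(k, 1)] = Fraction(elementary_symmetric(plus, k)) / 2**k
        chi[(k, 2)] = Fraction(elementary_symmetric(minus, k)) / 2**k
    return chi


def diagonal_evaluation(nu, n, sign):
    """Diagonal part of Cp_2^{(i)} and Cp_3^{(i)} from Lemma 5.3 at the identity.

    E_ii^p act by (nu1 + 2, nu2, nu3 - 2) and E_ii^k act by (n1, n2, n3);
    sign = +1 corresponds to i = 1 and sign = -1 to i = 2.
    """
    x1 = (nu[0] + 2) + sign * n[0] - 2   # E11^p +/- E11^k - 2
    x2 = nu[1] + sign * n[1]             # E22^p +/- E22^k
    x3 = (nu[2] - 2) + sign * n[2] + 2   # E33^p +/- E33^k + 2
    degree_two = Fraction(x1 * x2 + x2 * x3 + x1 * x3) / 4
    degree_three = Fraction(x1 * x2 * x3) / 8
    return degree_two, degree_three
```

The shifts $\pm 2$ in the scalars of $E_{11}^{\mathfrak p}$ and $E_{33}^{\mathfrak p}$ cancel against the constants $\mp 2$ in Lemma 5.3, which is why the eigenvalues depend only on $\nu_i \pm n_i$.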

5.2. The Dirac–Schmid eigen-equations

For $i = 1, 2$, let $\iota_i$ be the injectors from $V_m$ into $\mathfrak{p}_{0,\mathbf C} \otimes V_m \cong V_{e_1 - e_3} \otimes V_m$ defined in Theorem 4.4, and fix an injection $j \in \mathrm{Hom}_K(\tau_m, \pi|_K)$. Since $\tau_m$ occurs with multiplicity one in $\pi|_K$, the composition
\[
V_m \xrightarrow{\ \iota_i\ } \mathfrak{p}_{0,\mathbf C} \otimes V_m \xrightarrow{\ \alpha\ } \pi(\mathfrak{p}_{0,\mathbf C})\, j(V_m) \subset L^2_{(M,\sigma_n)}(K)
\]
is a scalar multiple of $j$, where $\alpha$ is the evaluation map. Thus, if we write
\[
\iota_i\bigl(f(M)\bigr) = \sum_{M' \in G(m)} X^{(i)}_{M,M'} \otimes f(M'), \qquad X^{(i)}_{M,M'} \in \mathfrak{p}_{0,\mathbf C},
\]
for the GZ-basis $\{f(M)\}_{M \in G(m)}$ in $V_m$, then we have the following system of equations:
\[
\sum_{M' \in G(m)} X^{(i)}_{M,M'} \cdot j\bigl(f(M')\bigr) = \lambda_i\, j\bigl(f(M)\bigr), \qquad M \in G(m), \tag{2}
\]
for a scalar $\lambda_i$. We call this system of equations (2) the Dirac–Schmid eigen-equations. Here, if $m_1 = m_2$ (respectively $m_2 = m_3$), then the Dirac–Schmid eigen-equation (2) for $i = 2$ (respectively $i = 1$) is not valid. The scalar values $\lambda_i$ in the Dirac–Schmid eigen-equations (2) are given as follows.

Lemma 5.5. If $m = (n_a, n_b, n_c)$, then we have
\[
\lambda_1 = \nu_c - \tfrac13 \tilde\nu, \qquad \lambda_2 = \nu_a - \tfrac13 \tilde\nu.
\]
Here $\tilde\nu = \nu_1 + \nu_2 + \nu_3$.


Table 4

(a, b, c)      (1, 2, 3)   (2, 1, 3)   (1, 3, 2)   (2, 3, 1)   (3, 1, 2)   (3, 2, 1)
G-pattern M    L^(1)       L^(2)       L^(3)       L^(4)       L^(5)       L^(6)

Proof. Let us assume that n1 > n2 > n3 , i.e. m = n. Then the Dirac–Schmid equation (2) for i = 1 and the G-pattern M = L(1) becomes the following equation in L2(M,σm ) (K): 1 p p p H31 + H32 s11 s33 q det(S)n2 (k) 3 p p p p − E23 −s11 s32 s33 q−1 det(S)n2 (k) + E13 s11 s31 s33 q−1 det(S)n2 (k) p = λ1 s11 s33 q det(S)n2 (k), with p = na − nb = n1 − n2 and q = nb − nc = n2 − n3 . Here we use the identification of p0,C with Ve1 −e3 in Lemma 4.2, the expression of the highest weight vector v1 = ι1 (f (L(1) )) in Lemma 4.3, and the correspondence between the GZ-basis of Vm and L2(M,σn ) (K) in Lemma 4.8. If we evaluate this equation at the identity k = e after computing the actions of the elements in ˜ U (gC ), we obtain λ1 = ν3 − 13 ν˜ . Similarly we have λ2 = ν1 − 13 ν. For the other cases of m, the scalar-values λi can be obtained by evaluating the Dirac–Schmid equation (2) at the G-patterns corresponding to the other extremal weight vectors as in Table 4. For our later computation, we define λ3 by the relation λ1 + λ2 + λ3 = 0, that is, λ3 = νb − 13 ν˜ if m = (na , nb , nc ). 2 6. Whittaker realization Let π = π(ν, σn ) be an irreducible principal series representation with data ν = (ν1 , ν2 , ν3 ) ∈ C3 and n = (n1 , n2 , n3 ) ∈ Z3 , and let τ ∗ = τm associated to the dominant permutation m = (m1 , m2 , m3 ) ∈ Λ of n be the minimal K-type of π , as in the previous section. Moreover let η = ηc1 ,c2 be a non-degenerate unitary character of N specified by the parameters c1 , c2 ∈ C× . In this section, we write the Whittaker realization, i.e. the realization in the space Cη∞ (N \G), of Eqs. (1) and (2) explicitly. 6.1. Preliminaries ∞ (N \G/K) is expressed as A Whittaker function φ ∈ Wh(π, η, τ ) ⊂ Cη,τ

T j (v ∗ ) (g) = v ∗ , φ(g) ,

v ∗ ∈ Vτ ∗ , g ∈ G,

with an intertwining operator T ∈ Iπ,η and an injective K-homomorphism j ∈ HomK (τ ∗ , π|K ), by definition. Now, for each G-pattern M ∈ G(m) belonging to m, we define a function φ(M) in Cη∞ (N\G) by taking the element f (M) of the GZ-basis {f (M)}M∈G(m) for Vm as v ∗ ∈ Vτ ∗ = Vm in the above equation, that is, φ(M; g) = T j f (M) (g) = f (M), φ(g) ,

g ∈ G.

We call this function φ(M) the M-component of a Whittaker function φ.



Whittaker functions are determined by their $A$-radial parts (i.e. their restrictions to $A$) because of the Iwasawa decomposition of $G$. Moreover, the values of Whittaker functions on the center $Z_G$ of $G$ are given by the central character of $\pi$, i.e.,
\[
\phi(rug) = r^{\tilde\nu} u^{\tilde n} \phi(g), \qquad \phi \in \mathrm{Wh}(\pi, \eta, \tau),\ r \in \mathbf{R}_{>0},\ u \in U(1),\ g \in G.
\]
Therefore, we can describe Whittaker functions as functions of two variables with the coordinates
\[
y_1 = \frac{a_1}{a_2}, \qquad y_2 = \frac{a_2}{a_3}
\]
for $\mathrm{diag}(a_1, a_2, a_3) = a_3 \cdot \mathrm{diag}(y_1 y_2, y_2, 1) \in A$, which correspond to simple roots of $(\mathfrak{a}, \mathfrak{g})$. Also, we denote the Euler operator with respect to $y_i$ by $\partial_i = y_i \frac{\partial}{\partial y_i}$.

6.2. Differential equations

Let $\phi \in \mathrm{Wh}(\pi, \eta, \tau)$ be a Whittaker function determined by an intertwining operator $T \in I_{\pi,\eta}$ and an injection $j \in \mathrm{Hom}_K(\tau^*, \pi|_K)$, and let $\phi(M)$ be its $M$-component. For each $M \in G(m)$, we consider the image of both sides of Eq. (1) under $T$:
\[
T\bigl(C \cdot j(f(M))\bigr)(g) = T\bigl(\chi_C\, j(f(M))\bigr)(g), \qquad g \in G.
\]
Then the intertwining property of $T$ leads to the differential equation
\[
C\phi(M; y) = \chi_C\, \phi(M; y), \qquad y = (y_1, y_2), \tag{3}
\]
for the $A$-radial part of $\phi(M)$. Similarly, the Dirac–Schmid eigen-equation (2) leads to the differential equation
\[
\sum_{M' \in G(m)} X^{(i)}_{M,M'}\, \phi(M'; y) = \lambda_i\, \phi(M; y), \qquad y = (y_1, y_2). \tag{4}
\]

In this subsection, we write Eqs. (3) and (4) explicitly. First, we observe the following fundamental lemmas.

Lemma 6.1. Let $f \in C^\infty_{\eta,\tau}(N \backslash G / K)$. For $X \in U(\mathfrak{k}_{\mathbf C})$, $Y \in U(\mathfrak{n}_{\mathbf C})$, $Z \in U(\mathfrak{a}_{\mathbf C})$, and $a \in A$, we have
\[
\bigl((\mathrm{Ad}(a^{-1})Y)\, Z X f\bigr)(a) = \eta(Y)\, \tau(-X)\, (Zf)(a).
\]

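Lemma 6.2. Let $\phi = \phi(y) \in \mathrm{Wh}(\pi, \eta, \tau)|_A$.

(1) The actions of the elements $H_{12}^{\mathfrak p}$, $H_{23}^{\mathfrak p}$, and $I_3^{\mathfrak p}$ in $\mathfrak{a}_{\mathbf C}$ on $\phi$ are the following differentials:
\[
H_{12}^{\mathfrak p}\phi = (2\partial_1 - \partial_2)\phi, \qquad
H_{23}^{\mathfrak p}\phi = (-\partial_1 + 2\partial_2)\phi, \qquad
I_3^{\mathfrak p}\phi = \tilde\nu\, \phi.
\]
Thus, for $E_{ii}^{\mathfrak p}$ we have
\[
E_{11}^{\mathfrak p}\phi = \Bigl(\partial_1 + \frac{\tilde\nu}{3}\Bigr)\phi, \qquad
E_{22}^{\mathfrak p}\phi = \Bigl(-\partial_1 + \partial_2 + \frac{\tilde\nu}{3}\Bigr)\phi, \qquad
E_{33}^{\mathfrak p}\phi = \Bigl(-\partial_2 + \frac{\tilde\nu}{3}\Bigr)\phi.
\]

(2) The actions of the elements $E_{ij}^{\mathfrak p} + E_{ij}^{\mathfrak k}$ and $E_{ji}^{\mathfrak p} - E_{ji}^{\mathfrak k}$ with $i < j$ in $\mathfrak{n}_{\mathbf C}$ on $\phi$ are the following multiplications:
\begin{align*}
\bigl(E_{12}^{\mathfrak p} + E_{12}^{\mathfrak k}\bigr)\phi &= 2\pi\sqrt{-1}\, c_1 y_1 \phi, &
\bigl(E_{21}^{\mathfrak p} - E_{21}^{\mathfrak k}\bigr)\phi &= 2\pi\sqrt{-1}\, \bar c_1 y_1 \phi,\\
\bigl(E_{23}^{\mathfrak p} + E_{23}^{\mathfrak k}\bigr)\phi &= 2\pi\sqrt{-1}\, c_2 y_2 \phi, &
\bigl(E_{32}^{\mathfrak p} - E_{32}^{\mathfrak k}\bigr)\phi &= 2\pi\sqrt{-1}\, \bar c_2 y_2 \phi,
\end{align*}
and
\[
\bigl(E_{13}^{\mathfrak p} + E_{13}^{\mathfrak k}\bigr)\phi = \bigl(E_{31}^{\mathfrak p} - E_{31}^{\mathfrak k}\bigr)\phi = 0.
\]

The proof is omitted (cf. [12]). By using the above lemmas together with Lemma 4.1, which gives the actions of the elements $E_{ij}^{\mathfrak k}$ in $\mathfrak{k}_{\mathbf C}$, the following explicit description of Eq. (3) is obtained from Lemmas 5.3 and 5.4.

Proposition 6.3. Let $\phi(M)$ be the $M$-component of a Whittaker function $\phi \in \mathrm{Wh}(\pi, \eta, \tau)$ and put $\phi(M; y) = y_1^2 y_2^2\, \tilde\phi(M; y)$. Then the differential equations (3) for the Capelli elements $C = Cp_k^{(i)}$ with $k = 2, 3$ and $i = 1, 2$ are given as follows. Let $(w_1, w_2, w_3) = (\beta,\ \alpha_1 + \alpha_2 - \beta,\ m_1 + m_2 + m_3 - \alpha_1 - \alpha_2)$ be the weight of a G-pattern
\[
M = \begin{pmatrix} m_1\ \ m_2\ \ m_3 \\ \alpha_1\ \ \alpha_2 \\ \beta \end{pmatrix}.
\]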

(1)

(1) For C = Cp2 , we have ν˜ ν˜ − ∂1 + ∂2 + + w2 ∂1 + + w1 3 3 ν˜ ν˜ + −∂1 + ∂2 + + w2 −∂2 + + w3 3 3 ν˜ ν˜ + ∂1 + + w1 −∂2 + + w3 3 3 √ 2 ˜ (νi + ni )(νj + nj ) φ(M; y) − 2π −1 |c1 |2 y12 + |c2 |2 y22 − 1i<j 3

√ − 4π −1c¯2 y2 (α2 − m3 )φ˜ M 0 0−1 ; y +χ+ (M) α2 − m3 + δ(M) φ˜ M −10 0 ; y √ 00 ;y − 4π −1c¯1 y1 (β − α2 )φ˜ M −1 1 ; y = 0. +χ− (M)(α1 − m2 )φ˜ M −1 −1 (2)

(2) For C = Cp2 , we have ν˜ ν˜ ∂1 + − w1 −∂1 + ∂2 + − w2 3 3 ν˜ ν˜ + −∂1 + ∂2 + − w2 − ∂2 + − w3 3 3



ν˜ ν˜ + ∂1 + − w1 −∂2 + − w3 3 3 √ 2 2 2 2 2 ˜ (νi − ni )(νj − nj ) φ(M; y) − 2π −1 |c1 | y1 + |c2 | y2 − 1i<j 3

√ + 4π −1c2 y2 (m1 − α1 )φ˜ M 100 ; y +χ− (M) m1 − α1 − δ(M) φ˜ M 001 ; y √ + 4π −1c1 y1 (α1 − β)φ˜ M 010 ; y +χ+ (M)(m2 − α2 )φ˜ M −11 1 ; y = 0. (1)

(3) For C = Cp3 , we have

ν˜ ν˜ ν˜ + w1 −∂1 + ∂2 + + w2 −∂2 + + w3 3 3 3 √ 2 2 √ ν˜ ν˜ ∂1 + + w1 − 2π −1|c1 |y1 − ∂2 + + w3 − 2π −1|c2 |y2 3 3 ˜ (νi + ni ) φ(M; y) − ∂1 +

1i3

√ √ + 2 · 2π −1c¯1 y1 · 2π −1c¯2 y2 −1 0 ; y + C1 (M)φ˜ M −1 ;y × −(α2 − m3 )φ˜ M 0−1 −1 √ ν˜ − 2 · 2π −1c¯2 y2 ∂1 + + w1 3 × (α2 − m3 )φ˜ M 0 0−1 ; y + χ+ (M) α2 − m3 + δ(M) φ˜ M −10 0 ; y √ ν˜ − 2 · 2π −1c¯1 y1 −∂2 + + w3 3 00 1 × (β − α2 )φ˜ M −1 ; y + χ− (M)(α1 − m2 )φ˜ M −1 ; y = 0. −1 (2)

(4) For C = Cp3 , we have

ν˜ ν˜ ν˜ ∂1 + − w1 −∂1 + ∂2 + − w2 −∂2 + − w3 3 3 3 √ 2 2 √ ν˜ ν˜ ∂1 + − w1 − 2π −1|c1 |y1 −∂2 + − w3 − 2π −1|c2 |y2 3 3 ˜ (νi − ni ) φ(M; y) − 1i3



√ √ − 2 · 2π −1c1 y1 · 2π −1c2 y2 × (m1 − α1 )φ˜ M 110 ; y − C¯ 1 (M)φ˜ M 011 ; y √ ν˜ + 2 · 2π −1c2 y2 ∂1 + − w1 3 × (m1 − α1 )φ˜ M 100 ; y + χ− (M) m1 − α1 − δ(M) φ˜ M 001 ; y √ ν˜ + 2 · 2π −1c1 y1 −∂2 + − w3 3 × (α1 − β)φ˜ M 010 ; y + χ+ (M)(m2 − α2 )φ˜ M −11 1 ; y = 0. (2)

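If we evaluate the above equations from $Cp_k^{(2)}$ with $k = 2, 3$ at the G-pattern
\[
L^{(1)} = \begin{pmatrix} m_1\ \ m_2\ \ m_3 \\ m_1\ \ m_2 \\ m_1 \end{pmatrix}
\]
associated with the highest weight vector $f(L^{(1)})$ in $V_m$, we obtain the following system of differential equations for the $L^{(1)}$-component of Whittaker functions.

Corollary 6.4. Let $\phi(L^{(1)})$ be the $L^{(1)}$-component of a Whittaker function $\phi \in \mathrm{Wh}(\pi, \eta, \tau)$. Then the function $\tilde\phi(L^{(1)}) = y_1^{-2} y_2^{-2} \phi(L^{(1)})$ satisfies the following two differential equations:

(1)
\[
\Bigl\{\partial_1^2 + \partial_2^2 - \partial_1\partial_2 - p(\partial_1 - \lambda_2) - q(\partial_2 + \lambda_1) + (\lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1)
+ (2\pi\sqrt{-1})^2\bigl(|c_1|^2 y_1^2 + |c_2|^2 y_2^2\bigr)\Bigr\}\, \tilde\phi(L^{(1)}; y) = 0.
\]
Here $p = m_1 - m_2$ and $q = m_2 - m_3$.

(2)
\begin{align*}
\Bigl\{&\Bigl(\partial_1 + \frac{\tilde\nu}{3} - m_1\Bigr)\Bigl(-\partial_1 + \partial_2 + \frac{\tilde\nu}{3} - m_2\Bigr)\Bigl(-\partial_2 + \frac{\tilde\nu}{3} - m_3\Bigr)\\
&- (2\pi\sqrt{-1}\,|c_2| y_2)^2 \Bigl(\partial_1 + \frac{\tilde\nu}{3} - m_1\Bigr)
- (2\pi\sqrt{-1}\,|c_1| y_1)^2 \Bigl(-\partial_2 + \frac{\tilde\nu}{3} - m_3\Bigr)\\
&- \Bigl(\lambda_2 + \frac{\tilde\nu}{3} - m_1\Bigr)\Bigl(\lambda_3 + \frac{\tilde\nu}{3} - m_2\Bigr)\Bigl(\lambda_1 + \frac{\tilde\nu}{3} - m_3\Bigr)\Bigr\}\, \tilde\phi(L^{(1)}; y) = 0.
\end{align*}

Proof. In Eqs. (2) and (4) in the above proposition evaluated at $M = L^{(1)}$, all terms on the left-hand side except the $L^{(1)}$-component $\tilde\phi(L^{(1)})$ vanish, since the highest weight vector $f(L^{(1)})$ in $V_m$ satisfies $E_{ij}^{\mathfrak k} f(L^{(1)}) = 0$ with $i < j$. Then direct computation leads to the equations in the corollary. Here we remark the equations
\[
\sum_{i<j} \Bigl(\frac{\tilde\nu}{3} - m_i\Bigr)\Bigl(\frac{\tilde\nu}{3} - m_j\Bigr) - \sum_{i<j} (\nu_i - n_i)(\nu_j - n_j)
= -(\lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1) - p\lambda_2 + q\lambda_1,
\]
and
\[
\prod_{i} (\nu_i - n_i) = \Bigl(\lambda_2 - m_1 + \frac{\tilde\nu}{3}\Bigr)\Bigl(\lambda_3 - m_2 + \frac{\tilde\nu}{3}\Bigr)\Bigl(\lambda_1 - m_3 + \frac{\tilde\nu}{3}\Bigr),
\]
which can be shown by using the definition of $\lambda_i$ in Lemma 5.5. □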
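The two identities remarked in this proof are polynomial identities in $\nu$ and $n$, so they can be spot-checked with exact arithmetic. The sketch below is illustrative only: the function name is hypothetical, rational sample values stand in for complex $\nu$, and the dominant permutation is obtained by sorting $n$ in decreasing order, with $\lambda_1, \lambda_2, \lambda_3$ defined as in Lemma 5.5 and its proof.

```python
from fractions import Fraction


def corollary_identities_hold(nu, n):
    """Check both identities from the proof of Corollary 6.4 for given data."""
    nu_tilde_third = Fraction(sum(nu)) / 3
    # Indices (a, b, c), 0-based, such that m = (n_a, n_b, n_c) is dominant.
    a, b, c = sorted(range(3), key=lambda i: -n[i])
    m = (n[a], n[b], n[c])
    lam1 = nu[c] - nu_tilde_third
    lam2 = nu[a] - nu_tilde_third
    lam3 = nu[b] - nu_tilde_third
    p, q = m[0] - m[1], m[1] - m[2]

    def e2(x):
        """Second elementary symmetric polynomial of three values."""
        return x[0] * x[1] + x[0] * x[2] + x[1] * x[2]

    shifts = [nu_tilde_third - mi for mi in m]       # nu~/3 - m_i
    diffs = [nu[i] - n[i] for i in range(3)]         # nu_i - n_i

    first = e2(shifts) - e2(diffs) == -e2((lam1, lam2, lam3)) - p * lam2 + q * lam1
    second = diffs[0] * diffs[1] * diffs[2] == (
        (lam2 + shifts[0]) * (lam3 + shifts[1]) * (lam1 + shifts[2])
    )
    return first and second
```

Both identities rest on $\lambda_1 + \lambda_2 + \lambda_3 = 0$ and on the fact that $(\nu_a - n_a, \nu_b - n_b, \nu_c - n_c)$ equals $(\lambda_2, \lambda_3, \lambda_1)$ shifted by $\tilde\nu/3 - m_i$.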

Similarly to Eq. (3), we can describe the explicit form of the Dirac–Schmid eigen-equation (4) for each G-pattern M. However we need only the following partial result in our later discussion. Proposition 6.5. Let φ(M) be the M-component of a Whittaker function φ ∈ Wh(π, η, τ ) and ˜ put φ(M; y) = y12 y22 φ(M; y). (1) If m2 = m3 , the Dirac–Schmid eigen-equation (4) for i = 1 at M = L(1) p = m1 − m2 is given by

0 0 −k

with 0 k

˜ (∂2 + λ1 )φ(M; y) √ = −2π −1c¯2 y2 φ˜ M 0 0−1 ; y + χ+ (M)φ˜ M −10 0 ; y . (2) If m1 = m2 , the Dirac–Schmid eigen-equation (4) for i = 2 at M = L(1) q = m2 − m3 is given by

0 −k 0

with 0 k

˜ (∂1 − λ2 )φ(M; y) √ 00 1 = −2π −1c¯1 y1 φ˜ M −1 ; y + χ− (M)φ˜ M −1 ;y . −1 Proof. Assume 0 0 m2 = m3 . If we evaluate the formula (1) of the injector ι1 in Theorem 4.4 at M = L(1) −k , then we have ι1 f (M) = f

e1 − e3 00 0

−f +f

⊗ f (M)

e1 − e3 10 0 e1 − e3 10 1

⊗ f M 0 0−1 + χ+ (M)f M −10 0

−1 ⊗ f M 0−1 .

By using the correspondence between Ve1 −e3 and p0,C in Lemma 4.2 and the fundamental lemmas on the actions of U (gC ) on the space of Whittaker functions given in the top of this subsection, the above injection formula leads the following equation for the M-components of a Whittaker function φ ∈ Wh(π, η, τ ):



(∂2 + λ1 )φ(M; y) 0 −1 √ k φ M 0 ; y + χ+ (M)φ M −10 0 ; y = − 2φ −1c¯2 y2 − E23 −1 k ;y . − E13 φ M 0−1 Thus we have the equation in the assertion (1), because of the equations −1 k k E23 f M 0 0−1 + χ+ (M)f M −10 0 = −E13 = f (M), f M 0−1 which are obtained from Lemma 4.1. The assertion (2) can be obtained similarly.

2

7. Explicit formulas Let us take π = π(ν, σn ), τ ∗ = τm , and η = ηc1 ,c2 as in the previous section. In this section, we discuss explicit descriptions for the Whittaker functions with respect to (π, η, τ ) which is our main theme in this paper. 7.1. Preliminaries Let φ ∈ Wh(π, η, τ ) be a Whittaker function with respect to (π, η, τ ). Then the set {φ(M)}M∈G(m) of M-components of φ satisfies the system of Eqs. (3) and (4) in Section 6.2. Before studying explicit formulas, we observe the following lemma concerning this system of equations. Lemma 7.1. A Whittaker function φ ∈ Wh(π, η, τ ) is determined by its L(1) -component φ(L(1) ). That is, all M-components φ(M) of φ ∈ Wh(π, η, τ ) are uniquely determined from φ(L(1) ) by the equations in Propositions 6.3 and 6.5. Proof. To prove this assertion, we may give an effective procedure for determining all M˜ ) means zero if M ˜ ˜ (1) ). In the following, we stipulate that φ(M components φ(M) from φ(L violates the conditions of G-patterns. 00 from the equations in First, we can find the components φ˜ L(1) 0 0−1 and φ˜ L(1) −1 Proposition 6.5 for k = 0: √ (∂2 + λ1 )φ˜ L(1) ; y = −2π −1c¯2 y2 φ˜ L(1) 0 0−1 ; y , √ 00 ;y . (∂1 − λ2 )φ˜ L(1) ; y = −2π −1c¯1 y1 φ˜ L(1) −1 −1 Next let us take 1 k m1 − m2 and assume that the components φ˜ L(1) 0−i0 , φ˜ L(1) 0−i , (1) −1 0 ˜ for 0 i k − 1 are all known. Then, in Eq. (1) of Proposition 6.3 evaland φ L −i 00 0 0 , the only unknown function is φ˜ L(1) −k with the coefficient uated for M = L(1) −k+1 0 0 √ (1) -component is determined. Moreover Eq. (3) −4π −1c¯1 y1 (m1 −m2 −k +1). Thus the L −k 00 0 0 (1) and Eq. (1) in Proposition 6.5 for M = L(1) −k have in Proposition 6.3 for M = L −k+1 the unknown terms



√ √ 2 · 2π −1c¯1 y1 · 2π −1c¯2 y2 −1 0 + (m1 − m2 − k + 1)φ˜ L(1) −1 , × −(m2 − m3 )φ˜ L(1) 0−k −k and √ −1 0 −2π −1c¯2 y2 φ˜ L(1) 0−k + φ˜ L(1) −1 , −k respectively, and thus these two unknown components are determined fromthese two equa −i tions. Similarly, for fixed 1 k m2 − m3 , if the components φ˜ L(1) 0 0−i , φ˜ L(1) 0−1 , (1) 0 −k (1) −1 −i+1 ˜ ˜ for 0 i k − 1 are all given, then the three components φ L , and φ L 0 (1) −1 −k+1 (1) 0 −k −1 ˜ ˜ , and φ L can be determined from Eqs. (1) and (3) in Proposition 6.3 φ L −1 −1 ˜ and Eq. (2) in Proposition 6.5. Therefore the M-components φ(M) corresponding to the weights (m1 − i, m2 + i − j, m3 + j ) for 0 i m1 − m2 and j = 0, 1 and (m1 − j, m2 − i + j, m3 + i) for 0 i m2 − m3 and j = 0, 1 can be determined. To determine the remaining M-components, we need only Eqs. (1) and (3) in Proposition 6.3. This process is done one by one from the larger pair (w1 − w3 , |δ(M)|) in lexicographical order, where (w1 , w2 , w3 ) is the weight corresponding to G-pattern M. We leave the details for the reader. 2 The proof of this lemma shows that all M-components φ(M) of a Whittaker function φ are moderate growth functions if and only if φ(L(1) ) is. Thus a Whittaker function is in the space Wh(π, η, τ )mod if and only if its L(1) -component is a moderate growth function. 7.2. The highest weight components of Whittaker functions According to Lemma 7.1 in the previous subsection, we may consider their L(1) -components in order to determine Whittaker functions, which satisfy the holonomic system of partial differential equations in Corollary 6.4. In this subsection, we describe the space of solutions for this holonomic system explicitly. The holonomic system of partial differential equations in Corollary 6.4 has regular singularities along 2 divisors y1 = 0 and y2 = 0 which are of simple normal crossing at (y1 , y2 ) = (0, 0), in the sense of [16]. 
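First, we consider the power series solutions of this system at the point $(y_1, y_2) = (0, 0)$, which give the $L^{(1)}$-components of the secondary Whittaker functions with respect to $(\pi, \eta, \tau)$. For a power series
\[
(\pi|c_1|y_1)^{\gamma_1} (\pi|c_2|y_2)^{\gamma_2} \sum_{k,l=0}^{\infty} c^{\gamma}_{k,l}\, (\pi|c_1|y_1)^k (\pi|c_2|y_2)^l, \qquad \gamma = (\gamma_1, \gamma_2) \in \mathbf{C}^2, \tag{5}
\]
with a characteristic index $\gamma = (\gamma_1, \gamma_2)$, it is easy to see that the holonomic system in Corollary 6.4 can be translated into the following system of difference equations for the coefficients $\{c^{\gamma}_{k,l}\}$.

Lemma 7.2. The power series (5) satisfies the holonomic system in Corollary 6.4 if and only if the coefficients $\{c^{\gamma}_{k,l}\}$ satisfy the following system of difference equations:

(1)
\begin{align*}
\bigl\{&(\gamma_1 + k)^2 + (\gamma_2 + l)^2 - (\gamma_1 + k)(\gamma_2 + l)\\
&- p(\gamma_1 + k - \lambda_2) - q(\gamma_2 + l + \lambda_1) + (\lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1)\bigr\}\, c^{\gamma}_{k,l}
- 4c^{\gamma}_{k-2,l} - 4c^{\gamma}_{k,l-2} = 0,
\end{align*}

(2)
\begin{align*}
\Bigl\{&\Bigl(\gamma_1 + k + \frac{\tilde\nu}{3} - m_1\Bigr)\Bigl(-\gamma_1 + \gamma_2 - k + l + \frac{\tilde\nu}{3} - m_2\Bigr)\Bigl(-\gamma_2 - l + \frac{\tilde\nu}{3} - m_3\Bigr)\\
&- \Bigl(\lambda_2 + \frac{\tilde\nu}{3} - m_1\Bigr)\Bigl(\lambda_3 + \frac{\tilde\nu}{3} - m_2\Bigr)\Bigl(\lambda_1 + \frac{\tilde\nu}{3} - m_3\Bigr)\Bigr\}\, c^{\gamma}_{k,l}\\
&+ 4\Bigl(\gamma_1 + k + \frac{\tilde\nu}{3} - m_1\Bigr) c^{\gamma}_{k,l-2}
+ 4\Bigl(-\gamma_2 - l + \frac{\tilde\nu}{3} - m_3\Bigr) c^{\gamma}_{k-2,l} = 0.
\end{align*}

Here we understand $c^{\gamma}_{k,l} = 0$ if $k < 0$ or $l < 0$.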
Observe that all coefficients ck,l are determined inductively from an initial non-zero coeffiγ cients c0,0 by the first difference equation in Lemma 7.2. The characteristic indices γ can be found by putting k = l = 0 in the equations in Lemma 7.2. (i)

(i)

Lemma 7.3. The set of characteristic indices {γ (i) = (γ1 , γ2 ) | 1 i 6} of the holonomic system of partial differential equations in Corollary 6.4 at (y1 , y2 ) = (0, 0) is given as follows: γ (1) = (λ2 , −λ1 ), γ (3) = (λ2 , −λ3 + q), γ (5) = (λ3 + p, −λ2 + p + q),

γ (2) = (λ3 + p, −λ1 ), γ (4) = (λ1 + p + q, −λ3 + q), γ (6) = (λ1 + p + q, −λ2 + p + q). (i)

Now, for each 1 i 6, we define the coefficients {Ck,l }k,l0 by (i)

Ck,l =

⎧ ⎨ 4(−1)k +l ⎩

k !·l !

a1 2

− k , a22 − k , a23 − l , a24 − l b 2 −k −l

0,

if (k, l) = (2k , 2l ), otherwise,

with the parameters (i)

(i)

a1 = a3 = b = −γ1 − γ2 + p + q,

(i)

(i)

a2 = −2γ1 + γ2 + p,

Here we use the notation

α , . . . , αr 1 β1 , . . . , βs

=

r i=1

s " (αi ) (βi ). i=1

Since (x + 1) = x(x) for x ∈ / Z0 , we have the relations

(i)

(i)

a4 = γ1 − 2γ2 + q.

2252


−1 a2 b a1 −k −k −k −l · (−k ) , 2 2 2 −1 a4 b a3 (i) (i) Ck,l−2 = Ck,l · (−l ) −l −l −k −l , 2 2 2

(i) Ck−2,l

(i) = Ck,l

if (k, l) = (2k , 2l ), and thus, (i) (i) (i) 4 Ck−2,l + Ck,l−2 = Ck,l k 2 − kl + l 2 − a2 k − a4 l . (i)
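The last identity, $4(C_{k-2,l} + C_{k,l-2}) = C_{k,l}(k^2 - kl + l^2 - a_2 k - a_4 l)$, can be checked numerically from the gamma-function definition of the coefficients. The sketch below is a floating-point illustration with hypothetical function names; it takes generic real values of $a_2$ and $a_4$ (so that no gamma factor hits a pole) and uses $a_1 = a_3 = b = a_2 + a_4$, which follows from the parameter definitions above.

```python
import math


def coefficient(kp, lp, a2, a4):
    """C_{2k', 2l'} for parameters a1 = a3 = b = a2 + a4; zero for negative indices."""
    if kp < 0 or lp < 0:
        return 0.0
    b = a2 + a4
    gammas = (math.gamma(b / 2 - kp) * math.gamma(a2 / 2 - kp)
              * math.gamma(b / 2 - lp) * math.gamma(a4 / 2 - lp)
              / math.gamma(b / 2 - kp - lp))
    sign = (-1) ** (kp + lp)
    return 4 * sign / (math.factorial(kp) * math.factorial(lp)) * gammas


def recurrence_holds(kp, lp, a2, a4, rel_tol=1e-9):
    """Compare both sides of the identity at (k, l) = (2k', 2l')."""
    k, l = 2 * kp, 2 * lp
    lhs = 4 * (coefficient(kp - 1, lp, a2, a4) + coefficient(kp, lp - 1, a2, a4))
    rhs = coefficient(kp, lp, a2, a4) * (k * k - k * l + l * l - a2 * k - a4 * l)
    return math.isclose(lhs, rhs, rel_tol=rel_tol, abs_tol=1e-12)
```

Since the polynomial factor on the right equals the coefficient of $c^{\gamma}_{k,l}$ in Lemma 7.2 (1) once $\gamma$ is a characteristic index, the same check covers the first difference equation.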

This identity shows that the coefficients $\{C^{(i)}_{k,l}\}$ satisfy the first difference equation for $\gamma = \gamma^{(i)}$ in Lemma 7.2. Therefore we can state the following proposition on an explicit formula for the $L^{(1)}$-components of secondary Whittaker functions with respect to $(\pi, \eta, \tau)$.

Proposition 7.4. For each $1 \le i \le 6$, we define the function $\tilde\varphi_3^{(i)}(L^{(1)}; y)$ by the power series (5) with the above coefficients $\{C^{(i)}_{k,l}\}$, that is,
\[
\tilde\varphi_3^{(i)}(L^{(1)}; y) = (\pi|c_1|y_1)^{\gamma_1^{(i)}} (\pi|c_2|y_2)^{\gamma_2^{(i)}} \sum_{k',l'=0}^{\infty} C^{(i)}_{2k',2l'}\, (\pi|c_1|y_1)^{2k'} (\pi|c_2|y_2)^{2l'}.
\]
Then the set $\{\tilde\varphi_3^{(i)}(L^{(1)})\}$ gives the complete system of linearly independent solutions for the holonomic system of differential equations in Corollary 6.4 at $y = (0, 0)$.

Next, we consider a solution with the moderate growth property for the holonomic system of partial differential equations in Corollary 6.4. As we mentioned in the previous subsection, a Whittaker function $\phi$ is of moderate growth if and only if its $L^{(1)}$-component $\phi(L^{(1)})$ is. Therefore the local multiplicity one theorem for Whittaker models (cf. [18,25]) tells us that the holonomic system in Corollary 6.4 has a solution of moderate growth, unique up to scalar multiples, which gives the $L^{(1)}$-component of the primary Whittaker function. Here we give two integral expressions of this unique solution of moderate growth.

Proposition 7.5.

(1) Put
\[
\tilde\varphi_3^{\mathrm{mod}}(L^{(1)}; y) = \frac{1}{(2\pi\sqrt{-1})^2} \int_{s_1} \int_{s_2} V_3(L^{(1)}; s_1, s_2)\, (\pi|c_1|y_1)^{-s_1} (\pi|c_2|y_2)^{-s_2}\, ds_1\, ds_2.
\]
Here
\[
V_3(L^{(1)}; s_1, s_2) = \Gamma\!\left[\begin{matrix} \frac{s_1 + \lambda_1 + p + q}{2},\ \frac{s_1 + \lambda_2}{2},\ \frac{s_1 + \lambda_3 + p}{2},\ \frac{s_2 - \lambda_1}{2},\ \frac{s_2 - \lambda_2 + p + q}{2},\ \frac{s_2 - \lambda_3 + q}{2} \\[2pt] \frac{s_1 + s_2 + p + q}{2} \end{matrix}\right],
\]
and the paths of integration are the vertical lines from $\mathrm{Re}\, s_i - \sqrt{-1}\,\infty$ to $\mathrm{Re}\, s_i + \sqrt{-1}\,\infty$ with large enough real parts. Then, up to scalar multiples, the function $\tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ gives a unique solution with the moderate growth property for the holonomic system of partial differential equations in Corollary 6.4.


(2) The function $\tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ has the following integral expression of Euler type:
\[
\tilde\varphi_3^{\mathrm{mod}}(L^{(1)}; y) = 2^4 (\pi|c_1|y_1)^{\frac{-\lambda_3 + p + q}{2}} (\pi|c_2|y_2)^{\frac{\lambda_3 + p + q}{2}}
\int_0^{\infty} K_A\Bigl(2\pi|c_1|y_1\sqrt{1 + \tfrac{1}{v}}\Bigr) K_A\bigl(2\pi|c_2|y_2\sqrt{1 + v}\bigr)\, v^{B}\, \frac{dv}{v}.
\]
Here $K_\nu(z)$ is the modified Bessel function of the second kind and the parameters $A$ and $B$ are given by
\[
A = \frac{\lambda_1 - \lambda_2 + p + q}{2}, \qquad B = \frac{3\lambda_3 + p - q}{4}.
\]

(3) The function $\tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ has the following factorization by the power series $\tilde\varphi_3^{(i)}(L^{(1)})$ defined in Proposition 7.4:
\[
\tilde\varphi_3^{\mathrm{mod}}(L^{(1)}; y) = \sum_{i=1}^{6} \tilde\varphi_3^{(i)}(L^{(1)}; y).
\]

Proof. The Stirling formula for the gamma function shows that the double Mellin–Barnes integral defining the function $\tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ converges absolutely and also defines a moderate growth function of $y$. The second assertion follows from Lemma 7.1 in the paper [12]. Moving the integration paths in the definition of $\tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ to the left, we have the third assertion after the standard residue calculus. The factorization in the third assertion means that the function $\tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ satisfies the holonomic system in Corollary 6.4. Therefore, $\tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ gives a unique solution with the moderate growth property for the system, up to scalar multiples. □

7.3. Explicit formulas of Whittaker functions

As we asserted in Lemma 7.1, all $M$-components of a Whittaker function are determined from its $L^{(1)}$-component, whose explicit formulas are given in the previous subsection. In this subsection, we give explicit formulas for all components of Whittaker functions with respect to $(\pi, \eta, \tau)$. For simplicity, we assume $c_1 = c_2 = \sqrt{-1}$ in the following discussion.

First, we consider the power series solutions of the holonomic system of differential equations (3) and (4) at $(y_1, y_2) = (0, 0)$, which we call the secondary Whittaker functions. That is, we give a family $\{\tilde\phi(M; y)\}_{M \in G(m)}$ of power series
\[
\tilde\phi(M; y) = (\pi y_1)^{\gamma_1(M)} (\pi y_2)^{\gamma_2(M)} \sum_{k,l=0}^{\infty} c^{\gamma(M)}_{k,l}\, (\pi y_1)^k (\pi y_2)^l, \tag{6}
\]
with a characteristic index $\gamma(M) = (\gamma_1(M), \gamma_2(M)) \in \mathbf{C}^2$ satisfying the differential equations in Propositions 6.3 and 6.5.

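Table 5
The indices $(u_i, v_i)$ in $\gamma^{(i)}(M)$.

i             1        2        3        4        5        6
(u_i, v_i)    (2, 1)   (3, 1)   (2, 3)   (1, 3)   (3, 2)   (1, 2)

Now, for each G-pattern
\[
M = \begin{pmatrix} m_1\ \ m_2\ \ m_3 \\ \alpha_1\ \ \alpha_2 \\ \beta \end{pmatrix} \in G(m)
\]
and each $1 \le i \le 6$, we define the characteristic index $\gamma^{(i)}(M) = (\gamma_1^{(i)}(M), \gamma_2^{(i)}(M))$ and the set of coefficients $\{C^{(i)}_{k,l}(M)\}_{k,l \ge 0}$ as follows. Put
\begin{align*}
\zeta_1^{(1)}(M) &= \lambda_1 - m_3 + \beta, & \zeta_1^{(2)}(M) &= -\lambda_1 + m_1 - \beta - \delta(M),\\
\zeta_2^{(1)}(M) &= \lambda_2 + m_1 - \beta, & \zeta_2^{(2)}(M) &= -\lambda_2 - m_3 + \beta + \delta(M),\\
\zeta_3^{(1)}(M) &= \lambda_3 + \alpha_1 - \alpha_2 - \delta(M), & \zeta_3^{(2)}(M) &= -\lambda_3 + m_1 - m_3 - \alpha_1 + \alpha_2.
\end{align*}
Then we define $\gamma^{(i)}(M) = (\zeta^{(1)}_{u_i}(M), \zeta^{(2)}_{v_i}(M))$ with the index $(u_i, v_i)$ given in Table 5. Here we observe that $\gamma^{(i)}(L^{(1)}) = \gamma^{(i)}$ is the characteristic index given in Lemma 7.3. Moreover we define the coefficients $\{C^{(i)}_{k,l}(M)\}_{k,l \ge 0}$ by
\[
C^{(i)}_{k,l}(M) =
\begin{cases}
\dfrac{4(-1)^{k'+l'}}{k'!\, l'!}\, \Gamma\!\left[\begin{matrix} \frac{a_1}{2} - k',\ \frac{a_2}{2} - k',\ \frac{a_3}{2} - l',\ \frac{a_4}{2} - l' \\[2pt] \frac{b}{2} - k' - l' \end{matrix}\right], & \text{if } (k, l) = (2k', 2l'),\\[10pt]
0, & \text{otherwise}.
\end{cases}
\]
Here the parameters $a_j$ ($1 \le j \le 4$) and $b$ are given by
\begin{align*}
a_1 &= \zeta^{(1)}_{u}(M) - \zeta^{(1)}_{u_i}(M), & a_2 &= \zeta^{(1)}_{u'}(M) - \zeta^{(1)}_{u_i}(M),\\
a_3 &= \zeta^{(2)}_{v}(M) - \zeta^{(2)}_{v_i}(M), & a_4 &= \zeta^{(2)}_{v'}(M) - \zeta^{(2)}_{v_i}(M),
\end{align*}
and $b = -\zeta^{(1)}_{u_i}(M) - \zeta^{(2)}_{v_i}(M) + \zeta^{(1)}_3(M) + \zeta^{(2)}_3(M)$, with the indices $u$, $u'$, $v$, and $v'$ satisfying
\[
u < u', \qquad v < v', \qquad \{u, u', u_i\} = \{v, v', v_i\} = \{1, 2, 3\}.
\]
We write the power series with the characteristic index $\gamma^{(i)}(M)$ and the coefficients $\{C^{(i)}_{k,l}(M)\}_{k,l \ge 0}$ defined above as $\tilde\varphi_3^{(i)}(M; y)$, i.e.,
\[
\tilde\varphi_3^{(i)}(M; y) = (\pi y_1)^{\gamma_1^{(i)}(M)} (\pi y_2)^{\gamma_2^{(i)}(M)} \sum_{k',l'=0}^{\infty} C^{(i)}_{2k',2l'}(M)\, (\pi y_1)^{2k'} (\pi y_2)^{2l'}.
\]
When $M = L^{(1)}$, this power series coincides with the one (for $c_1 = c_2 = \sqrt{-1}$) defined in Proposition 7.4.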

Theorem 7.6. Let π = π(ν, σn ) be an irreducible principal series representation with data ν = (ν1 , ν2 , ν3 ) ∈ C3 and n = (n1 , n2 , n3 ) ∈ Z3 , and let τ ∗ = τm associated to the dominant permutation m = (m1 , m2 , m3 ) ∈ Λ of n be the minimal K-type of π . Moreover √ let η be a non-degenerate unitary character of N specified by the parameters c1 = c2 = −1. For each



(i)

1 i 6, let ϕ3 ∈ Wh(π, η, τ ) be the secondary Whittaker function whose L(1) -component is ϕ3(i) (L(1) ) = y12 y22 ϕ˜ 3(i) (L(1) ) defined in Proposition 7.4. Then, for each G-pattern M, the M(i) (i) (i) (i) component of ϕ3 is given by ϕ3 (M) = y12 y22 ϕ˜3 (M) with the power series ϕ˜3 (M) defined above. Proof. We can obtain this assertion similarly to Proposition 7.4, that is, by showing directly for γ (M) (i) each 1 i 6 the set {Ck,l (M)} satisfies the difference equations for the coefficients {ck,l } of the power series (6) which is equivalent with Propositions 6.3 and 6.5. In the case of δ(M) > 0 and γ (M) = γ (1) (M), since (1) (1) γ (1) M 0 0−1 = γ (1) M −10 0 = γ1 (M), γ2 (M) + 1 , (1) (1) 00 = γ1 (M) + 1, γ2 (M) , γ (1) M −1 γ (M)

the difference equation for {ck,l γ1 (M) + k +

} equivalent to Eq. (1) in Proposition 6.3 is given by

ν˜ + w1 3

−γ1 (M) + γ2 (M) − k + l +

ν˜ + w2 3

ν˜ ν˜ + −γ1 (M) + γ2 (M) − k + l + + w2 −γ2 (M) − l + + w3 3 3 ν˜ ν˜ + γ1 (M) + k + + w1 −γ2 (M) − l + + w3 3 3 γ (M) (νi + ni )(νj + nj ) ck,l −

1i<j 3 γ (M)

γ (M)

+ 4ck−2,l + 4ck,l−2

γ M 0 −1 − 4(α2 − m3 )ck,l−2 0 − 4 α2 00 γ M − 4(β − α2 )ck−2,l −1 = 0.

γ M − m3 + δ(M) ck,l−2

−1 0 0

(1)

Then direct computation shows that the coefficients {Ck.l (M)} satisfy the above difference equation by using the relations −1 a1 a2 b (1) −k −k −k −l Ck,l (M), 2 2 2 −1 a4 b (1) (1) a3 Ck,l−2 (M) = −l −l −l −k −l Ck,l (M), 2 2 2 −1 a2 b (1) (1) Ck,l−2 M 0 0−1 = −l − k − k − l Ck,l (M), 2 2 (1) Ck−2,l (M) = −k



(1) Ck,l−2

M

−1 0 0

= −l

a4 − k 2

(1) (1) 00 = −k Ck,l (M), Ck−2,l M −1

b − k − l 2

−1

(1)

Ck,l (M),

if $(k, l) = (2k', 2l')$, where
\begin{align*}
a_1 &= \zeta_1^{(1)}(M) - \zeta_2^{(1)}(M), & a_2 &= \zeta_3^{(1)}(M) - \zeta_2^{(1)}(M),\\
a_3 &= \zeta_2^{(2)}(M) - \zeta_1^{(2)}(M), & a_4 &= \zeta_3^{(2)}(M) - \zeta_1^{(2)}(M),
\end{align*}
and $b = -\zeta_2^{(1)}(M) - \zeta_1^{(2)}(M) + \zeta_3^{(1)}(M) + \zeta_3^{(2)}(M)$. The other cases can be shown similarly and we omit the details. □

Finally, we state our main result for the primary Whittaker functions with respect to $(\pi, \eta, \tau)$, i.e. the unique solution of moderate growth for the holonomic system of differential equations (3) and (4). If we write such a solution as $\phi \in \mathrm{Wh}(\pi, \eta, \tau)^{\mathrm{mod}}$, then the family $\{\tilde\phi(M; y)\}_{M \in G(m)}$ determined by the $M$-components $\phi(M; y) = y_1^2 y_2^2\, \tilde\phi(M; y)$ of $\phi$ is the unique solution of moderate growth for the differential equations in Propositions 6.3 and 6.5. Also, $\phi$ is given by a linear combination of the six secondary Whittaker functions $\varphi_3^{(i)}$ in Theorem 7.6. The following theorem can be proved in the same way as Proposition 7.5.

Theorem 7.7. Let $\pi = \pi(\nu, \sigma_n)$, $\tau^* = \tau_m$, and $\eta$ be the representations as in Theorem 7.6. Moreover let $\varphi_3^{\mathrm{mod}} \in \mathrm{Wh}(\pi, \eta, \tau)^{\mathrm{mod}}$ be the primary Whittaker function whose $L^{(1)}$-component is $\varphi_3^{\mathrm{mod}}(L^{(1)}) = y_1^2 y_2^2\, \tilde\varphi_3^{\mathrm{mod}}(L^{(1)})$ defined in Proposition 7.5. Then, for each G-pattern $M \in G(m)$, we have the following assertions on the $M$-component $\varphi_3^{\mathrm{mod}}(M) = y_1^2 y_2^2\, \tilde\varphi_3^{\mathrm{mod}}(M)$ of $\varphi_3^{\mathrm{mod}}$.

(1) The function $\tilde\varphi_3^{\mathrm{mod}}(M)$ has the following integral expressions:
\begin{align*}
\tilde\varphi_3^{\mathrm{mod}}(M; y) &= \frac{1}{(2\pi\sqrt{-1})^2} \int_{s_1} \int_{s_2} V_3(M; s_1, s_2)\, (\pi y_1)^{-s_1} (\pi y_2)^{-s_2}\, ds_1\, ds_2\\
&= 2^4 (\pi y_1)^{\frac{-\lambda_3 + m_1 - m_3}{2}} (\pi y_2)^{\frac{\lambda_3 + m_1 - m_3}{2}}\\
&\quad \times \int_0^{\infty} K_A\Bigl(2\pi y_1 \sqrt{1 + \tfrac{1}{v}}\Bigr) K_{A + \delta(M)}\bigl(2\pi y_2 \sqrt{1 + v}\bigr)\, v^{B} (1 + v)^{C}\, \frac{dv}{v}.
\end{align*}
Here, in the integral of Mellin–Barnes type, the paths of integration are the vertical lines from $\mathrm{Re}\, s_i - \sqrt{-1}\,\infty$ to $\mathrm{Re}\, s_i + \sqrt{-1}\,\infty$ with large enough real parts, and the integrand $V_3(M; s_1, s_2)$ is defined by
\[
V_3(M; s_1, s_2) = \Gamma\!\left[\begin{matrix} \frac{s_1 + \zeta_1^{(1)}(M)}{2},\ \frac{s_1 + \zeta_2^{(1)}(M)}{2},\ \frac{s_1 + \zeta_3^{(1)}(M)}{2},\ \frac{s_2 + \zeta_1^{(2)}(M)}{2},\ \frac{s_2 + \zeta_2^{(2)}(M)}{2},\ \frac{s_2 + \zeta_3^{(2)}(M)}{2} \\[2pt] \frac{s_1 + s_2 + \zeta_3^{(1)}(M) + \zeta_3^{(2)}(M)}{2} \end{matrix}\right].
\]
Also, in the integral of Euler type, the parameters $A$, $B$ and $C$ are given by
\[
A = \frac{\zeta_1^{(1)}(M) - \zeta_2^{(1)}(M)}{2}, \qquad
B = \frac{2\zeta_3^{(1)}(M) - \zeta_1^{(1)}(M) - \zeta_2^{(1)}(M)}{4}, \qquad
C = \frac{|\delta(M)|}{2}.
\]

(2) The function $\tilde\varphi_3^{\mathrm{mod}}(M)$ has the following factorization by the power series $\tilde\varphi_3^{(i)}(M)$:
\[
\tilde\varphi_3^{\mathrm{mod}}(M; y) = \sum_{i=1}^{6} \tilde\varphi_3^{(i)}(M; y).
\]
8. Propagation formula

Based on our main result in the previous section, we give here an expression of Whittaker functions on $GL(3, \mathbf{C})$ in terms of those on $GL(2, \mathbf{C})$, which we call a propagation formula. This is an analogue of the formula in the class one case obtained by Ishii and Stade [9].

8.1. Principal series Whittaker functions on $GL(2, \mathbf{C})$

In this subsection, we derive an explicit formula for principal series Whittaker functions on $GL(2, \mathbf{C})$ by a computation similar to the case of $GL(3, \mathbf{C})$. Let $G' = GL(2, \mathbf{C})$ be the complex general linear group of degree 2 and $G' = N'A'K'$ be its Iwasawa decomposition, where $K' = U(2)$ is a maximal compact subgroup of $G'$ and
\[
A' = \left\{ \begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix} \,\middle|\, a_i \in \mathbf{R}_{>0},\ i = 1, 2 \right\}, \qquad
N' = \left\{ n(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \,\middle|\, x \in \mathbf{C} \right\}.
\]
The center $Z_{G'}$ of $G'$ is $\{ru1_2 \mid r \in \mathbf{R}_{>0},\ u \in U(1)\} \cong \mathbf{C}^\times$. The upper triangular subgroup of $G'$ is $P' = N'A'M'$, where $M'$ is the centralizer of $A'$ in $K'$ given by
\[
M' = \left\{ \begin{pmatrix} u_1 & 0 \\ 0 & u_2 \end{pmatrix} \,\middle|\, u_i \in U(1),\ i = 1, 2 \right\} \cong U(1)^2.
\]

Next, we recall the representations of $K'$, $G'$, and $N'$ which we need in order to describe the Whittaker functions. We can parameterize the equivalence classes of irreducible continuous representations of $K' = U(2)$ by the set of highest weights
\[
\Lambda' = \bigl\{ \mu' = (\mu'_1, \mu'_2) \mid \mu' \in \mathbf{Z}^2,\ \mu'_1 \ge \mu'_2 \bigr\}.
\]
The representation space $V_{\mu'}$ of the representation $\tau_{\mu'}$ associated with $\mu' = (\mu'_1, \mu'_2) \in \Lambda'$ has the (normalized) GZ-basis $\{f(M')\}_{M' \in G(\mu')}$ as in the case of $U(3)$. Here
\[
G(\mu') = \left\{ M' = \begin{pmatrix} \mu'_1\ \ \mu'_2 \\ \alpha' \end{pmatrix} \,\middle|\, \alpha' \in \mathbf{Z},\ \mu'_1 \ge \alpha' \ge \mu'_2 \right\}.
\]
The explicit action of the complexification $\mathfrak{k}'_{\mathbf C}$ of the Lie algebra $\mathfrak{k}'$ of $K'$ on the GZ-basis is given as follows. Let us put

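\[
E_{ij}^{\mathfrak k} = \frac12 \bigl\{ (E_{ij} - E_{ji}) - \sqrt{-1}\,(E'_{ij} + E'_{ji}) \bigr\},
\]
for the matrix unit $E_{ij}$ (respectively $E'_{ij}$) with its $(i, j)$-entry 1 (respectively $J$) and the remaining entries 0. Then
\begin{align*}
E_{ii}^{\mathfrak k} f(M') &= w'_i\, f(M'), \qquad i = 1, 2,\\
E_{12}^{\mathfrak k} f(M') &= (\mu'_1 - \alpha')\, f\bigl(M'(1)\bigr),\\
E_{21}^{\mathfrak k} f(M') &= (\alpha' - \mu'_2)\, f\bigl(M'(-1)\bigr).
\end{align*}
Here $(w'_1, w'_2) = (\alpha', \mu'_1 + \mu'_2 - \alpha')$ is the weight of the vector $f(M')$ associated with a G-pattern $M' = \begin{pmatrix} \mu'_1\ \ \mu'_2 \\ \alpha' \end{pmatrix}$ and $M'(i) = \begin{pmatrix} \mu'_1\ \ \mu'_2 \\ \alpha' + i \end{pmatrix}$. Moreover, we stipulate that the corresponding vector $f(M'(i))$ is zero if $M'(i)$ appearing in the above formulas violates the conditions of G-patterns.

A principal series representation
\[
\pi' = \pi'(\nu', \sigma_{n'}) = \mathrm{Ind}_{P'}^{G'}\bigl(1_{N'} \otimes e^{\nu' + \rho'} \otimes \sigma_{n'}\bigr)
\]
of $G'$ with data $\nu' = (\nu'_1, \nu'_2) \in \mathbf{C}^2$ and $n' = (n'_1, n'_2) \in \mathbf{Z}^2$ induced from the minimal parabolic subgroup $P' = N'A'M'$ is defined similarly to the case of $GL(3, \mathbf{C})$. Here, the half-sum $\rho'$ of the positive restricted roots is given by
\[
e^{\rho'}\bigl(\mathrm{diag}(a_1, a_2)\bigr) = \frac{a_1}{a_2}, \qquad \mathrm{diag}(a_1, a_2) \in A'.
\]
The central character of $\pi'$ is
\[
Z_{G'} \ni ru1_2 \mapsto r^{\tilde\nu'} u^{\tilde n'}, \qquad r \in \mathbf{R}_{>0},\ u \in U(1),
\]
with $\tilde\nu' = \nu'_1 + \nu'_2$ and $\tilde n' = n'_1 + n'_2$, and the minimal $K'$-type of $\pi'$ is the representation $(\tau_{m'}, V_{m'})$ associated with the dominant permutation $m' = (m'_1, m'_2) \in \Lambda'$ of $n'$. Finally, we take a nondegenerate character $\eta'$ of $N'$ defined by
\[
\eta'\bigl(n(x)\bigr) = \exp\bigl(2\pi\sqrt{-1}\, \mathrm{Im}(x)\bigr).
\]
As in the case of $GL(3, \mathbf{C})$, for each element $C$ in the center $Z(\mathfrak{g}'_{\mathbf C})$ of the universal enveloping algebra of $\mathfrak{g}'_{\mathbf C}$, each $M'$-component $\phi(M')$ of a Whittaker function $\phi \in \mathrm{Wh}(\pi', \eta', \tau')$ satisfies a differential equation
\[
C\phi(M') = \chi_C\, \phi(M') \tag{7}
\]
with an eigenvalue $\chi_C$. We can give the following explicit description of the differential equation (7) in terms of the coordinate
\[
y = \frac{a_1}{a_2}, \qquad \text{for } \mathrm{diag}(a_1, a_2) = a_2 \cdot \mathrm{diag}(y, 1) \in A',
\]
by computations similar to the case of $GL(3, \mathbf{C})$.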



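Proposition 8.1. Let $\phi(M')$ be the $M'$-component of a Whittaker function $\phi \in \mathrm{Wh}(\pi', \eta', \tau')$ and put $\phi(M') = y\,\tilde{\phi}(M')$. Then the differential equations (7) for the Capelli elements of $\mathfrak{gl}_2$ are given as follows. Let us denote the Euler operator with respect to $y$ by $\partial = y\frac{d}{dy}$ and let $(w_1, w_2) = (\alpha', m'_1 + m'_2 - \alpha')$ be the weight of a G-pattern $M' = \begin{pmatrix} m'_1 \quad m'_2 \\ \alpha' \end{pmatrix}$.

(1)
$$\left\{ \left( \partial + \frac{\tilde{\nu}'}{2} + w_1 \right)\left( -\partial + \frac{\tilde{\nu}'}{2} + w_2 \right) - \left( 2\pi\sqrt{-1}\, y \right)^2 - (\nu'_1 + n'_1)(\nu'_2 + n'_2) \right\} \tilde{\phi}(M'; y) - 4\pi y\,(\alpha' - m'_2)\, \tilde{\phi}(M'(-1); y) = 0.$$

(2)
$$\left\{ \left( \partial + \frac{\tilde{\nu}'}{2} - w_1 \right)\left( -\partial + \frac{\tilde{\nu}'}{2} - w_2 \right) - \left( 2\pi\sqrt{-1}\, y \right)^2 - (\nu'_1 - n'_1)(\nu'_2 - n'_2) \right\} \tilde{\phi}(M'; y) - 4\pi y\,(m'_1 - \alpha')\, \tilde{\phi}(M'(1); y) = 0.$$

In particular, the second equation at the G-pattern $L' = \begin{pmatrix} m'_1 \quad m'_2 \\ m'_1 \end{pmatrix}$ associated with the highest weight vector $f(L')$ in $V_{m'}$ gives the following differential equation for $\tilde{\phi}(L')$:

$$\left\{ \left( \partial + \frac{\tilde{\nu}'}{2} - m'_1 \right)\left( -\partial + \frac{\tilde{\nu}'}{2} - m'_2 \right) - \left( 2\pi\sqrt{-1}\, y \right)^2 - (\nu'_1 - n'_1)(\nu'_2 - n'_2) \right\} \tilde{\phi}(L'; y) = 0.$$

If we put

$$\lambda'_1 = \nu'_b - \frac{\tilde{\nu}'}{2}, \qquad \lambda'_2 = \nu'_a - \frac{\tilde{\nu}'}{2}$$

for $m' = (n'_a, n'_b)$, then we have the relations $\lambda'_1 + \lambda'_2 = 0$ and

$$(\nu'_1 \pm n'_1)(\nu'_2 \pm n'_2) = \left( \lambda'_2 + \frac{\tilde{\nu}'}{2} \pm m'_1 \right)\left( \lambda'_1 + \frac{\tilde{\nu}'}{2} \pm m'_2 \right),$$

and thus we can write the above equation for $\tilde{\phi}(L')$ as

$$\left\{ \partial^2 - (m'_1 - m'_2)(\partial + \lambda'_1) + \lambda'_1\lambda'_2 + \left( 2\pi\sqrt{-1}\, y \right)^2 \right\} \tilde{\phi}(L'; y) = 0.$$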



As solutions for the differential equations in Proposition 8.1, explicit formulas of the $M'$-components of Whittaker functions are given in the next theorem.

Theorem 8.2. Let $\pi' = \pi(\nu', \sigma_{n'})$ be an irreducible principal series representation with data $\nu' = (\nu'_1, \nu'_2)$ and $n' = (n'_1, n'_2)$, and let $(\tau')^* = \tau_{m'}$ associated to the dominant permutation $m' = (m'_1, m'_2) \in \Lambda'$ of $n'$ be the minimal $K'$-type of $\pi'$. Moreover let $\eta'$ be a non-degenerate unitary character of $N'$ defined above.

(1) For each G-pattern $M' = \begin{pmatrix} m'_1 \quad m'_2 \\ \alpha' \end{pmatrix}$ and $i = 1, 2$, we put

$$\gamma^{(i)}(M') = \begin{cases} \lambda'_2 + m'_1 - \alpha', & i = 1, \\ \lambda'_1 - m'_2 + \alpha', & i = 2, \end{cases}$$

and define the coefficients $\{C^{(i)}_{2k}(M')\}_{k \ge 0}$ by

$$C^{(i)}_{2k}(M') = \frac{2(-1)^k}{k!}\, \Gamma\left( \frac{a}{2} - k \right),$$

with the parameter $a = (-1)^i\left( \gamma^{(1)}(M') - \gamma^{(2)}(M') \right)$. Then the power series

$$\tilde{\varphi}_2^{(i)}(M'; y) = (\pi y)^{\gamma^{(i)}(M')} \sum_{k=0}^{\infty} C^{(i)}_{2k}(M')\,(\pi y)^{2k} = 2(\pi y)^{\frac{m'_1 - m'_2}{2}}\, I^*_{-\frac{a}{2}}(2\pi y),$$

for $i = 1, 2$ give the complete system of linearly independent solutions at $y = 0$ for the equations in Proposition 8.1. Here

$$I^*_\nu(z) = \frac{-\pi}{\sin \nu\pi} \cdot I_\nu(z) = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!}\, \Gamma(-k - \nu) \left( \frac{z}{2} \right)^{2k + \nu},$$

with the modified Bessel function $I_\nu(z)$ of the first kind.

(2) Let $\tilde{\varphi}_2^{\mathrm{mod}}(M')$ be the unique (up to constant multiples) solution with the moderate growth property for the differential equations in Proposition 8.1. Then we have

$$\tilde{\varphi}_2^{\mathrm{mod}}(M'; y) = \frac{1}{2\pi\sqrt{-1}} \int_s V_2(M'; s)\,(\pi y)^{-s}\, ds = 4(\pi y)^A K_B(2\pi y).$$

Here, the path of integration is the vertical line from $\operatorname{Re} s - \sqrt{-1}\,\infty$ to $\operatorname{Re} s + \sqrt{-1}\,\infty$ with large enough real part, and the integrand $V_2(M'; s)$ is defined by

$$V_2(M'; s) = \Gamma\left[ \frac{s + \gamma^{(1)}(M')}{2},\ \frac{s + \gamma^{(2)}(M')}{2} \right],$$

and the parameters $A$ and $B$ are given by

$$A = \frac{m'_1 - m'_2}{2}, \qquad B = \frac{\lambda'_1 - \lambda'_2 + w_1 - w_2}{2}.$$

(3) The function $\tilde{\varphi}_2^{\mathrm{mod}}(M')$ has the factorization

$$\tilde{\varphi}_2^{\mathrm{mod}}(M'; y) = \sum_{i=1}^{2} \tilde{\varphi}_2^{(i)}(M'; y).$$

8.2. Some formulas

Here we recall some formulas which are fundamental to derive our propagation formula (see [4] for example). First, among many formulas for the gamma function $\Gamma(s)$, the following are useful in our discussion: the reflection formula

$$\Gamma[s,\ 1 - s] = \frac{\pi}{\sin \pi s}, \qquad s \in \mathbb{C},$$

Gauss' summation formula

$$\Gamma\begin{bmatrix} c,\ c - a - b \\ c - a,\ c - b \end{bmatrix} = {}_2F_1(a, b; c \mid 1),$$

and Barnes' lemma

$$\frac{1}{2\pi\sqrt{-1}} \int_z \Gamma[z + a,\ z + b,\ -z + c,\ -z + d]\, dz = \Gamma\begin{bmatrix} a + c,\ a + d,\ b + c,\ b + d \\ a + b + c + d \end{bmatrix}. \tag{8}$$

Here, in Gauss' summation formula, ${}_2F_1$ means Gauss' hypergeometric function, and in Barnes' lemma, the path of integration is the vertical line from $\operatorname{Re} z - \sqrt{-1}\,\infty$ to $\operatorname{Re} z + \sqrt{-1}\,\infty$ with large enough real part. In particular, combining the reflection formula and Gauss' summation formula we can derive the following two formulas: for non-negative integers $m, n \in \mathbb{Z}_{\ge 0}$, we have

$$\Gamma\begin{bmatrix} p - m,\ p - q - m,\ q - n \\ p - m - n \end{bmatrix} = \sum_{k=0}^{n} \frac{n!}{k!\,(n - k)!}\, \Gamma[q - k,\ p - q - m + k], \tag{9}$$

and

$$\Gamma\begin{bmatrix} p - m,\ p - n \\ p - m - n \end{bmatrix} = \sum_{k=0}^{\min\{m, n\}} \frac{(-1)^k\, m!\, n!}{k!\,(m - k)!\,(n - k)!}\, \Gamma(p - k). \tag{10}$$
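Formulas (9) and (10) are finite sums, so they can be spot-checked numerically. The sketch below is only an illustration, not part of the original argument. It assumes that $\Gamma[a_1, \dots; b_1, \dots]$ denotes the quotient $\prod_i \Gamma(a_i) / \prod_j \Gamma(b_j)$, it reads (9) and (10) as written above, and the helper names `gamma_ratio`, `formula9_sides` and `formula10_sides` are hypothetical. The sample parameters are chosen real and away from the poles of the gamma function.

```python
import math


def gamma_ratio(upper, lower):
    """Assumed meaning of Gamma[upper; lower]: product of Gamma(a) over
    `upper` divided by the product of Gamma(b) over `lower`."""
    num = math.prod(math.gamma(a) for a in upper)
    den = math.prod(math.gamma(b) for b in lower)
    return num / den


def formula9_sides(p, q, m, n):
    """Return (lhs, rhs) of formula (9) for real p, q and integers m, n >= 0."""
    lhs = gamma_ratio([p - m, p - q - m, q - n], [p - m - n])
    rhs = sum(
        math.comb(n, k) * gamma_ratio([q - k, p - q - m + k], [])
        for k in range(n + 1)
    )
    return lhs, rhs


def formula10_sides(p, m, n):
    """Return (lhs, rhs) of formula (10) for real p and integers m, n >= 0."""
    lhs = gamma_ratio([p - m, p - n], [p - m - n])
    rhs = sum(
        (-1) ** k
        * math.factorial(m)
        * math.factorial(n)
        / (math.factorial(k) * math.factorial(m - k) * math.factorial(n - k))
        * math.gamma(p - k)
        for k in range(min(m, n) + 1)
    )
    return lhs, rhs


if __name__ == "__main__":
    print(formula9_sides(7.3, 4.1, 2, 3))
    print(formula10_sides(6.7, 2, 3))
```

For small cases the identities can also be confirmed by hand: with $m = n = 1$, formula (10) reduces to $\Gamma(p) - \Gamma(p - 1) = (p - 2)\,\Gamma(p - 1)$.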



Table 6 Expressions of the parameter b by aj ’s. i

1

2

3

4

5

6

Arbitrary M δ(M) 0 δ(M) 0

a2 + a4 a1 a3

a4 a1 a2 + a3

a2 a1 + a3 a4

a2 a3 a1 + a4

a4 a1 + a3 a2

a2 + a4 a3 a1

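Next, for the modified Bessel function $K_\nu(z)$ of the second kind, we need two integral expressions. One is the integral expression of Mellin–Barnes type

$$K_\nu(z) = \frac{1}{4} \cdot \frac{1}{2\pi\sqrt{-1}} \int_s \Gamma\left[ \frac{s + \nu}{2},\ \frac{s - \nu}{2} \right] \left( \frac{z}{2} \right)^{-s} ds.$$

Here, the path of integration is the vertical line from $\operatorname{Re} s - \sqrt{-1}\,\infty$ to $\operatorname{Re} s + \sqrt{-1}\,\infty$ with large enough real part. Another is that of Euler type

$$K_\nu(z) = \frac{1}{2} \int_0^\infty \exp\left( -\frac{z\,(t + t^{-1})}{2} \right) t^\nu\, \frac{dt}{t},$$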

which is valid only for $\operatorname{Re} z > 0$.

8.3. Propagation formula

Let $\pi = \pi(\nu, \sigma_n)$ be an irreducible principal series representation of $G = GL(3, \mathbb{C})$ with data $\nu = (\nu_1, \nu_2, \nu_3) \in \mathbb{C}^3$ and $n = (n_1, n_2, n_3) \in \mathbb{Z}^3$, and let $\eta$ be a non-degenerate unitary character of $N$ specified by the parameters $c_1 = c_2 = \sqrt{-1}$ as in Section 7.3. For simplicity, we assume that the parameter $n$ satisfies the regularity condition $n_1 > n_2 > n_3$. Then the minimal $K$-type of $\pi$ is given by $(\tau_m, V_m) = (\tau_n, V_n)$. Also, we note that

$$(\lambda_1, \lambda_2, \lambda_3) = \left( \nu_3 - \frac{\tilde{\nu}}{3},\ \nu_1 - \frac{\tilde{\nu}}{3},\ \nu_2 - \frac{\tilde{\nu}}{3} \right)$$

under this condition of $n$.

First, we discuss the propagation formula for the secondary Whittaker functions. For each $1 \le i \le 6$ and a G-pattern $M = \begin{pmatrix} m_1 \quad m_2 \quad m_3 \\ \alpha_1 \quad \alpha_2 \\ \beta \end{pmatrix} \in G(m)$, let $\varphi_3^{(i)}(M) = y_1^2 y_2^2\, \tilde{\varphi}_3^{(i)}(M)$ be the $M$-component of the secondary Whittaker function $\varphi_3^{(i)} \in \mathrm{Wh}(\pi, \eta, \tau)$ defined in Section 7.3. Then, we remark that the parameter $b$ in the coefficient $C^{(i)}_{2k', 2l'}(M)$ of the power series $\tilde{\varphi}_3^{(i)}(M)$ can be expressed by the parameters $a_j$'s as in Table 6. By using these relations of the parameters and the formulas for $\Gamma(s)$ given in the previous subsection, we can show the following theorem.
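The Euler-type integral for $K_\nu(z)$ recalled in Section 8.2 is easy to sanity-check numerically for real $z > 0$. The sketch below is an illustration only. It assumes the substitution $t = e^x$, which turns $\frac{1}{2}\int_0^\infty \exp(-z(t + t^{-1})/2)\, t^\nu\, dt/t$ into $\int_0^\infty e^{-z\cosh x}\cosh(\nu x)\, dx$, and it compares the result with the standard closed form $K_{1/2}(z) = \sqrt{\pi/(2z)}\, e^{-z}$. The function name `bessel_k_euler`, the step size, and the cutoff rule are choices made for this example.

```python
import math


def bessel_k_euler(nu, z, step=1e-3):
    """Approximate K_nu(z) for real nu and real z > 0 via the Euler-type integral.

    After t = exp(x) the integral becomes int_0^inf exp(-z cosh x) cosh(nu x) dx,
    which is evaluated here with the composite trapezoidal rule.
    """
    if z <= 0:
        raise ValueError("this sketch only covers real z > 0")
    # Beyond the cutoff, exp(-z cosh x) <= exp(-z - 60), which is negligible.
    cutoff = math.acosh(1.0 + 60.0 / z)
    n = int(cutoff / step) + 1
    h = cutoff / n

    def integrand(x):
        return math.exp(-z * math.cosh(x)) * math.cosh(nu * x)

    total = 0.5 * (integrand(0.0) + integrand(cutoff))
    for i in range(1, n):
        total += integrand(i * h)
    return total * h


if __name__ == "__main__":
    z = 1.0
    print(bessel_k_euler(0.5, z), math.sqrt(math.pi / (2 * z)) * math.exp(-z))
```

Because the integrand depends on $\nu$ only through $\cosh(\nu x)$, the symmetry $K_{-\nu} = K_\nu$ used implicitly in Theorem 8.2 is visible directly in this form.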



Table 7 The correspondence between (δ(M), i) and (type; p, q; j ). i

1

2

3

4

5

6

δ(M) 0 δ(M) 0 δ(M) = 0

(II; 2, 3; 1) (I; 1, 4; 1) (III; 2, 4; 1)

(III; 2, 3; 1) (I; 1, 3; 2) (II; 2, 4; 1)

(II; 1, 4; 2) (III; 1, 3; 1) (I; 2, 3; 1)

(III; 1, 4; 2) (II; 1, 3; 1) (I; 2, 4; 2)

(I; 2, 3; 1) (III; 1, 3; 2) (II; 1, 4; 2)

(I; 1, 4; 2) (II; 2, 3; 2) (III; 2, 4; 2)

Table 8 The correspondence between δ(M) and the data (ν , n , M ). ν

n

δ(M) 0

(ν2 , ν3 )

(m2 , m3 )

δ(M) 0

(ν1 , ν2 )

(m1 , m2 )

δ(M) = 0

(ν1 , ν3 )

(m1 , m3 )

M m2 m3 α

m1 2m2 α

m1 1m3 β

(i)

Theorem 8.3. Let the notations be as above. Then the coefficients C2k ,2l (M) of the power series (i)

ϕ˜3 (M) can be expressed as one of the following, depending on M and i: (I)

(i) C2k ,2l (M) =

k ap 2(−1)k +l +k (j ) aq − k − l , + k C2k (M ), l !(k − k)! 2 2 k=0

(II)

(i) C2k ,2l (M) =

l aq ap 2(−1)k +l +k (j ) − k + k, − l C2k (M ), k !(l − k)! 2 2 k=0

(III)

(i) C2k ,2l (M) =

min{k ,l }

k=0

ap 2(−1)k +l (j ) aq −k , − l C2k (M ). (k − k)!(l − k)! 2 2

Here, C2k (M ) is a coefficient of the power series ϕ˜2 (M ) for a triple (π (ν , σn ), η , τm ) defined in Theorem 8.2 and the corresponding types I, II, III, and parameters (p, q; j ) are given in Tables 7 and 8. (j )

(j )

(1)

Proof. Let us consider the case of δ(M) 0 and i = 1. Then the coefficient C2k ,2l (M) given by

(1) C2k ,2l (M) =

4(−1)k +l k! · l!

a1 2

− k,

a2 2

− k , a23 − l , b 2 −k −l

a4 2

− l

,

has the expression (1) C2k ,2l (M) =

l 2(−1)k 2(−1)k +l +k a3 a1 a4 a4 · + k, − − k − l − k , k !(l − k)! 2 2 2 k! 2 k=0

by the formula (9) with the parameters (p, q; m, n) = ( a21 , a24 ; k , l ), since b = a2 + a4 = a1 . If m m (2) (2) we put ν = (ν2 , ν3 ) and M = 2α2 3 , the parameter a4 = ζ3 (M) − ζ1 (M) = −ν2 + ν3 −



m2 − m3 + 2α2 can be written as −γ (1) (M ) + γ (2) (M ), and thus, we have the assertion in this case. The other cases can be proved similarly. 2 (i)

Corollary 8.4. The power series ϕ˜ 3 (M) has the following expression, depending on M and i: (i)

(i)

ϕ˜3 (M; y) = 2(πy1 )γ1 ×

∞

(M)

(i)

(πy2 )γ2

(M)

C2k (M )(πy1 )k+ (j )

ap 2

aq

(πy2 )k+ 2 Iκ∗1 (2πy1 )Iκ∗2 (2πy2 ).

k=0

Here the correspondence between (δ(M), i) and (type; p, q; j ; ν , n , M ) is given by Tables 7, and 8, and ⎧ ap aq ⎪ ⎨ (k − 2 , −k − 2 ), type I, a a (κ1 , κ2 ) = (−k − 2p , k − 2q ), type II, ⎪ ⎩ a a (k − 2p , k − 2q ), type III. (i) Proof. Let us assume that the coefficients C2k ˜3(i) (M) have the ,2l (M) of the power series ϕ expression of type I in Theorem 8.3. Then we have

∞ k ap 2(−1)k +l +k (j ) aq − k − l , + k C2k (M ) !(k − k)! l 2 2

(i) ϕ˜3 (M; y) =

k ,l =0 k=0

(i)

(M)+2k

(i)

(M)+2(k +k)

× (πy1 )γ1 =

∞ k ,l ,k=0

(i)

(πy2 )γ2 (M)+2l ap aq 2(−1)k +l (j ) −k − k + , −l + k + C2k (M ) l! · k! 2 2

× (πy1 )γ1

(i)

(πy2 )γ2

(M)+2l

.

Here the second equality comes from the substitution $k' \to k' + k$. Thus we have the assertion by the definition of $I^*_\nu(z)$ in Theorem 8.2. The assertions for the other types can be obtained similarly. □

Next, we consider the formula for the primary Whittaker functions. Let $\varphi_3^{\mathrm{mod}} \in \mathrm{Wh}(\pi, \eta, \tau)^{\mathrm{mod}}$ be the primary Whittaker function with the $M$-components $\varphi_3^{\mathrm{mod}}(M) = y_1^2 y_2^2\, \tilde{\varphi}_3^{\mathrm{mod}}(M)$ for each G-pattern $M \in G(m)$ given in Theorem 7.7.

Theorem 8.5. Let $\pi$, $\tau^* = \tau_m$, and $\eta$ be as above. The integrand $V_3(M; s_1, s_2)$ in the Mellin–Barnes type integral expression for the $M$-component $\tilde{\varphi}_3^{\mathrm{mod}}(M)$ in Theorem 7.7 has the following expression:

$$V_3(M; s_1, s_2) = \Gamma\left[ \frac{s_1 + \zeta_j^{(1)}(M)}{2},\ \frac{s_2 + \zeta_j^{(2)}(M)}{2} \right] \times \frac{1}{2\pi\sqrt{-1}} \int_z \Gamma\left[ \frac{z + s_1 + \mu_1}{2},\ \frac{z + s_2 + \mu_2}{2} \right] V_2(M'; -z)\, dz,$$


Table 9 The correspondence between δ(M) and the data (j, μ1 , μ2 , ν , n , M ). j

μ1

μ2

δ(M) 0

2

δ(M) 0

1

δ(M) = 0

3

λ − 22 − α2 + β λ − 21 + α1 − β λ − 23

λ2 2 + m1 − α1 λ1 2 + α2 − m3 λ3 2

ν

n

(ν2 , ν3 )

(m2 , m3 )

(ν1 , ν2 )

(m1 , m2 )

(ν1 , ν3 )

(m1 , m3 )

M m2 m3 α

m1 2m2 α

m1 1m3 β

where V2 (M ; s) is the integrand of the integral expression of ϕ˜2mod (M ) in Theorem 8.2 for a triple (π (ν , σn ), M ∈ G(m ) and the path of integration is the vertical √η , τm ) and a G-pattern √ line from Re z − −1 ∞ to Re z + −1 ∞ with large enough real part. The parameters and the representations are given in Table 9. Proof. Assume δ(M) 0. Since ζ1(1) (M) + ζ1(2) (M) = ζ3(1) (M) + ζ3(2) (M), Barnes’ lemma (8) leads the equation (1) (2) s1 + ζ2 (M) s2 + ζ2 (M) , V3 (M; s1 , s2 ) = 2 2 # z + s1 + μ1 z + s2 + μ2 −z + μ3 −z + μ4 1 , , , dz, × √ 2 2 2 2 2π −1

z

where the parameters $\mu_1$ and $\mu_2$ are given in the assertion and $\mu_3$ and $\mu_4$ are

$$\mu_3 = \frac{-\nu_2 + \nu_3}{2} + \alpha_2 - m_3, \qquad \mu_4 = \frac{\nu_2 - \nu_3}{2} - \alpha_2 + m_2.$$

3 Here we use the relations λ1 + λ22 = −ν22+ν3 and λ3 + λ22 = ν2 −ν 2 . Thus we have the assertion in this case. (1) (2) (1) In the case of δ(M) 0 (respectively δ(M) = 0), the relation ζ2 (M) + ζ2 (M) = ζ3 (M) + ζ3(2) (M) (respectively ζ1(1) (M) + ζ1(2) (M) = ζ2(1) (M) + ζ2(2) (M) = ζ3(1) (M) + ζ3(2) (M)) brings the assertion by similar computation. 2

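Corollary 8.6. We have the following expression of $\tilde{\varphi}_3^{\mathrm{mod}}(M)$:

$$\tilde{\varphi}_3^{\mathrm{mod}}(M; y) = \frac{2^4}{2\pi\sqrt{-1}} \int_z (\pi y_1)^{\frac{z}{2} + a_1} K_{-\frac{z}{2} + A_1}(2\pi y_1)\, (\pi y_2)^{\frac{z}{2} + a_2} K_{\frac{z}{2} - A_2}(2\pi y_2)\, V_2(M'; -z)\, dz.$$

Here

$$a_k = \frac{1}{2}\left( \zeta_j^{(k)}(M) + \mu_k \right), \qquad A_k = \zeta_j^{(k)}(M) - a_k, \qquad k = 1, 2,$$

and the parameters and the representations are given in Theorem 8.5.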



Proof. Using the first integral expression of $K_\nu(z)$ of Mellin–Barnes type in the previous subsection, we can get the corollary from Theorem 8.5 together with the integral expression of Mellin–Barnes type for $\tilde{\varphi}_3^{\mathrm{mod}}(M)$ in Theorem 7.7. □

Corollary 8.7. We have the following expression of $\tilde{\varphi}_3^{\mathrm{mod}}(M)$:

$$\tilde{\varphi}_3^{\mathrm{mod}}(M; y) = 4\pi^{a_1 + a_2}\, y_1^{a_1 + A_1}\, y_2^{a_2 - A_2} \int_0^\infty\!\!\int_0^\infty \exp\left\{ -\pi\left( y_1^2 u_1 + \frac{1}{u_1} + y_2^2 u_2 + \frac{1}{u_2} \right) \right\} u_1^{A_1} u_2^{-A_2}\, \tilde{\varphi}_2^{\mathrm{mod}}\left( M';\ y_2\sqrt{\frac{u_2}{u_1}} \right) \frac{du_1}{u_1} \frac{du_2}{u_2}.$$

Here the parameters and the representations are given in Theorem 8.5.

Proof. By applying the second integral expression of $K_\nu(z)$ in the previous subsection to the expression of $\tilde{\varphi}_3^{\mathrm{mod}}(M)$ in Corollary 8.6, we have

$$\tilde{\varphi}_3^{\mathrm{mod}}(M; y) = \frac{4}{2\pi\sqrt{-1}} \int_0^\infty\!\!\int_0^\infty\!\!\int_z \exp\left\{ -\pi y_1\left( u_1 + \frac{1}{u_1} \right) - \pi y_2\left( u_2 + \frac{1}{u_2} \right) \right\} u_1^{A_1} u_2^{-A_2} \times (\pi y_1)^{a_1} (\pi y_2)^{a_2} \left( \pi^2 y_1 y_2 \frac{u_2}{u_1} \right)^{\frac{z}{2}} V_2(M'; -z)\, dz\, \frac{du_1}{u_1} \frac{du_2}{u_2}.$$

Then we can get the assertion by the substitutions $u_1 \to u_1 y_1$, $u_2 \to u_2 y_2$, and $z \to -z$ in the above integrals. □

References

[1] D. Bump, Automorphic Forms on GL(3, R), Lecture Notes in Math., vol. 1083, Springer-Verlag, 1984.
[2] J. Cogdell, H. Kim, M. Ram Murty, Lectures on Automorphic L-Functions, Fields Inst. Monogr., vol. 20, Amer. Math. Soc., 2004.
[3] I. Gelfand, A. Zelevinsky, Canonical basis in irreducible representations of gl3 and its applications, in: Group Theoretical Methods in Physics, vol. II, VNU Sci. Press, 1986, pp. 127–146.
[4] I.S. Gradshteyn, I.M. Ryzhik, Tables of Integrals, Series, and Products, fifth ed., Academic Press, 1994.
[5] T. Hina, T. Ishii, T. Oda, Principal Series Whittaker Functions on SL(4, R), Kokyuroku Bessatsu, in press.
[6] M. Hirano, T. Oda, Integral switching engine for special Clebsch–Gordan coefficients for the representations of gl3 with respect to Gelfand–Zelevinsky basis, preprint.
[7] R. Howe, T. Umeda, The Capelli identity, the double commutant theorem, and multiplicity-free actions, Math. Ann. 290 (1991) 565–619.
[8] T. Ishii, A remark on Whittaker functions on SL(n, R), Ann. Inst. Fourier 55 (2005) 483–492.
[9] T. Ishii, E. Stade, New formulas for Whittaker functions on GL(n, R), J. Funct. Anal. 244 (2007) 289–314.
[10] H. Jacquet, Fonctions de Whittaker associées aux groupes de Chevalley, Bull. Soc. Math. France 95 (1967) 243–309.
[11] H. Jacquet, R. Langlands, Automorphic Forms on GL(2), Lecture Notes in Math., vol. 114, Springer-Verlag, 1970.
[12] H. Manabe, T. Ishii, T. Oda, Principal series Whittaker functions on SL(3, R), Japan J. Math. (N.S.) 30 (2004) 183–226.
[13] T. Miyazaki, The (g, K)-module structures of principal series representations of Sp(3, R), master thesis, University of Tokyo.



[14] T. Miyazaki, The structures of standard (g, K)-modules of SL(3, R), preprint.
[15] T. Oda, The standard (g, K)-modules of Sp(2, R), I. The case of principal series, preprint.
[16] T. Oshima, A definition of boundary values of solutions of partial differential equations with regular singularities, Publ. Res. Inst. Math. Sci. 19 (1983) 1203–1230.
[17] N. Proskurin, Automorphic functions and Bass–Milnor–Serre homomorphism, I, II, J. Soviet Math. 29 (1985) 1160–1191, 1192–1219.
[18] J.A. Shalika, The multiplicity one theorem for GLn, Ann. of Math. (2) 100 (1974) 171–193.
[19] T. Shintani, On an explicit formula for class-1 "Whittaker functions" on GLn over P-adic fields, Proc. Japan Acad. Ser. A 52 (1976) 180–182.
[20] E. Stade, On explicit integral formulas for GL(n, R)-Whittaker functions, Duke Math. J. 60 (1989) 695–729.
[21] E. Stade, Mellin transforms of Whittaker functions on GL(4, R) and GL(4, C), Manuscripta Math. 87 (1995) 511–526.
[22] E. Stade, Mellin transforms of GL(n, R) Whittaker functions, Amer. J. Math. 123 (2001) 121–161.
[23] M. Takeuchi, Modern Spherical Functions, Iwanami Shoten, 1975 (in Japanese).
[24] A. Vinogradov, L. Takhtajan, Theory of Eisenstein series for the group SL(3, R) and its application to a binary problem, J. Soviet Math. 18 (1982) 293–324.
[25] N. Wallach, Asymptotic expansions of generalized matrix entries of representations of real reductive groups, in: Lecture Notes in Math., vol. 1024, Springer-Verlag, 1983, pp. 287–369.
[26] G. Warner, Harmonic Analysis on Semi-simple Lie Groups I, Springer-Verlag, 1972.
[27] A. Weil, Dirichlet Series and Automorphic Forms, Lecture Notes in Math., vol. 189, Springer-Verlag, 1971.


Global minimizers for a p-Ginzburg–Landau-type energy in $\mathbb{R}^2$

Yaniv Almog a, Leonid Berlyand b, Dmitry Golovaty c, Itai Shafrir d,∗

a Department of Mathematics, Louisiana State University, Baton Rouge, LA 70803, USA
b Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA
c Department of Theoretical and Applied Mathematics, The University of Akron, Akron, OH 44325, USA
d Department of Mathematics, Technion–Israel Institute of Technology, 32000 Haifa, Israel

Received 10 June 2008; accepted 28 September 2008
Available online 16 October 2008
Communicated by H. Brezis

Abstract

Given a $p > 2$, we prove existence of global minimizers for a p-Ginzburg–Landau-type energy over maps on $\mathbb{R}^2$ with degree $d = 1$ at infinity. For the analogous problem on the half-plane we prove existence of a global minimizer when $p$ is close to 2. The key ingredient of our proof is the degree reduction argument that allows us to construct a map of degree $d = 1$ from an arbitrary map of degree $d > 1$ without increasing the p-Ginzburg–Landau energy.
© 2008 Elsevier Inc. All rights reserved.

Keywords: p-Ginzburg–Landau energy; Global minimizer

E-mail address: [email protected] (I. Shafrir).
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.09.020

1. Introduction

For a given $p > 2$ consider the Ginzburg–Landau-type energy

$$E_p(u) = \int_{\mathbb{R}^2} |\nabla u|^p + \frac{1}{2}\left( 1 - |u|^2 \right)^2 \tag{1.1}$$

Y. Almog et al. / Journal of Functional Analysis 256 (2009) 2268–2290


over the class of maps $u \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$ that satisfy $E_p(u) < \infty$ and have a degree $d$ "at infinity." The last statement can be made precise by observing that any map $u \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$ with $E_p(u) < \infty$ satisfies

• $u \in C^{\alpha}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$ where $\alpha = 1 - 2/p$ (Morrey's lemma [11]).
• $\lim_{|x| \to \infty} |u(x)| = 1$ (Lemma 2.1 below).

Therefore, there exists an $R > 0$ such that the degree $\deg\left( \frac{u}{|u|}, \partial B_r(0) \right)$ is well defined for every $r \ge R$ and is independent of $r$. We use this value as the definition of the degree, $\deg(u)$. For any integer $d \in \mathbb{Z}$, introduce the class of maps

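$$\mathcal{E}_d = \left\{ u \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2) : E_p(u) < \infty,\ \deg(u) = d \right\}$$

and define

$$I_p(d) = \inf_{u \in \mathcal{E}_d} E_p(u). \tag{1.2}$$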
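The degree at infinity used in this definition can be approximated numerically as a winding number: sample $u/|u|$ on a large circle $\partial B_r(0)$ and add up the small phase increments between consecutive samples. The sketch below is only an illustration under assumptions made for this example: the map is identified with a complex-valued function of $z = x_1 + i x_2$, the sample map has the form $f(|z|)\,e^{i d \arg z}$ with $f(r) = \min(r, 1)$, and the function names are hypothetical.

```python
import cmath
import math


def sample_map(d):
    """Return the map z -> f(|z|) * exp(i d arg z) with f(r) = min(r, 1)."""
    def v(z):
        return min(abs(z), 1.0) * cmath.exp(1j * d * cmath.phase(z))
    return v


def degree_on_circle(u, radius, samples=4096):
    """Winding number of u/|u| along the circle |z| = radius, counterclockwise.

    The circle is sampled at equally spaced angles and the phase increments
    between consecutive samples are summed; u must not vanish on the circle.
    """
    total = 0.0
    prev = u(complex(radius, 0.0))
    for k in range(1, samples + 1):
        cur = u(radius * cmath.exp(2j * math.pi * k / samples))
        if prev == 0 or cur == 0:
            raise ValueError("u vanishes on the circle; the degree is undefined")
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))


if __name__ == "__main__":
    for d in (-1, 1, 3):
        print(d, degree_on_circle(sample_map(d), radius=5.0))
```

For this sample map the computed value does not depend on the radius, in line with the independence of $r$ noted above; the number of samples must be large compared with $|d|$ so that each phase increment stays well below $\pi$.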

The set $\mathcal{E}_d$ is nonempty, as can be readily seen, e.g., by verifying that the map $v(re^{i\theta}) = f(r)e^{id\theta}$ with

$$f(r) = \begin{cases} r, & r < 1, \\ 1, & r \ge 1, \end{cases}$$

is in $\mathcal{E}_d$. A natural question then is whether the infimum in (1.2) is attained. Our main result provides an affirmative answer when $d = \pm 1$; we are uncertain as to whether this conclusion remains true for $|d| \ge 2$.

Theorem 1. For $d = \pm 1$ there exists a map realizing the infimum $I_p(d)$ in (1.2).

Note that the problem (1.2) is meaningless for the standard Ginzburg–Landau energy $E_2$ because it is not even clear how the class $\mathcal{E}_d$ should be defined when $p = 2$ and $d \ne 0$. In fact, by a result of Cazenave (described in [7]), the constant solutions $u = e^{i\alpha}$ with $\alpha \in \mathbb{R}$ are the only finite energy solutions of the associated Euler–Lagrange equation, $-\Delta u = (1 - |u|^2)u$. Clearly, the degree of these solutions is zero. The natural questions for $p = 2$ are concerned with local minimizers, i.e., those maps that are minimizers of the energy functional $E_2$ on $B_R(0)$ with respect to $C_0^\infty(B_R(0))$-perturbations for every $R > 0$. These questions were first addressed in [7]. Subsequently, Mironescu [10], relying on a result of Sandier [12], characterized these local minimizers completely by showing that, up to a translation and a rotation, they are all of the form $f(r)e^{i\theta}$. Here $f(r)$ is the unique solution of the ODE obtained by imposing rotational invariance on the Euler–Lagrange equation.

Next we turn our attention to the analogous problem on the upper half-plane

$$\mathbb{R}^2_+ = \left\{ (x_1, x_2) \in \mathbb{R}^2 : x_2 > 0 \right\}.$$



Once again, for $p > 2$, we are interested in minimizers of the energy functional

$$E_p(u) = \int_{\mathbb{R}^2_+} |\nabla u|^p + \frac{1}{2}\left( 1 - |u|^2 \right)^2,$$

but this time over all maps in $W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2_+, \mathbb{R}^2)$, satisfying the boundary condition

$$|u(x_1, 0)| = 1, \qquad \forall x_1 \in \mathbb{R}, \tag{1.3}$$

along with a degree condition at infinity. Here the definition of the degree can be given by a small modification of the argument we employed in the $\mathbb{R}^2$-case: we observe that the degree of $\frac{u}{|u|}$ on $\partial(B_R(0) \cap \mathbb{R}^2_+)$ does not depend on $R$ for sufficiently large $R$ and define $\deg(u)$ to be this integer value. For any $d \in \mathbb{Z}$ we set

$$\mathcal{E}_d^+ = \left\{ u \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2_+, \mathbb{R}^2) : |u(x_1, 0)| = 1,\ E_p(u) < \infty,\ \deg(u) = d \right\} \tag{1.4}$$

and define

$$I_p^+(d) = \inf_{u \in \mathcal{E}_d^+} E_p(u). \tag{1.5}$$

Again, we study the question of existence of a minimizer for (1.5), but we are only able to prove a result analogous to Theorem 1 when $p$ is sufficiently close to 2.

Theorem 2. For $d = \pm 1$ there exists $p_0 > 2$ such that for all $p \in (2, p_0)$ the infimum $I_p^+(d)$ is attained.

Recall that minimization problems with degree boundary conditions for the classical Ginzburg–Landau energy ($p = 2$) on perforated bounded domains were studied in [1–4]. Our study of the problem on a half-plane was motivated by the results in [2,3] regarding the behavior of minimizing sequences when the $H^1$-capacity of the domain is sufficiently small and the minimizing sequences develop vortices approaching the boundary of the domain.

The main tool we use in the proofs of Theorems 1 and 2 is a "degree reduction" proposition proved in Section 2. In this proposition we show how we can transform any given map $u$ of degree $D \ge 2$ (on either $\mathbb{R}^2$ or $\mathbb{R}^2_+$) to a new map $\tilde{u}$ of degree 1 so that $E_p(\tilde{u}) = E_p(u)$. Loosely speaking, the proposition establishes the intuitively clear result that "less degree implies less energy" for the infima. In Section 3 we use the degree reduction argument to prove Theorem 1. In Section 4 we study the limit $p \to 2^+$ in the half-plane case and obtain some results needed to prove Theorem 2. The proof of this theorem is given in Section 5.

2. A key proposition

Here we prove a key proposition that is the main ingredient of the proof of Theorem 1. A variant of it will also be used in the proof of Theorem 2. Before stating the proposition, we provide some basic properties of maps with finite energy.



Lemma 2.1. Let $u \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$ be any map with $E_p(u) < \infty$. Then $u \in C^\alpha(\mathbb{R}^2, \mathbb{R}^2)$ with $\alpha = 1 - 2/p$ and

$$\lim_{|x| \to \infty} |u(x)| = 1. \tag{2.1}$$

The analogous result holds for $u \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2_+, \mathbb{R}^2)$ satisfying $E_p(u) < \infty$.

Proof. The first assertion is a direct consequence of Morrey's inequality [11] which asserts that, upon modifying $u$ on a set of measure zero,

$$|u(x) - u(y)| \le C \|\nabla u\|_{L^p(\mathbb{R}^2)} |x - y|^\alpha, \qquad \forall x, y \in \mathbb{R}^2, \tag{2.2}$$

for some constant $C > 0$ depending only on $p$. To prove (2.1) we employ the same argument used in the proof of the analogous result [7] in the case $p = 2$. Suppose that there exists a sequence $|x^{(n)}| \to \infty$ with $|u(x^{(n)})| \le 1 - \delta$ for some $\delta > 0$. Then, by (2.2),

$$\int_{B_1(x^{(n)})} \left( 1 - |u|^2 \right)^2 \ge \eta > 0,$$

for all $n$ and some constant $\eta$. But this contradicts our assumption that

$$\int_{\mathbb{R}^2} \left( 1 - |u|^2 \right)^2 \le 2E_p(u) < \infty.$$

In the case of $u \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2_+, \mathbb{R}^2)$ it suffices to extend $u$ to a map $U \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$ via reflection across the line $\{x_2 = 0\}$, and to apply the previous argument. □

Next we state and prove the main result of this section.

Proposition 1. Let $D \ge 2$ be an integer. Then, for each $u \in \mathcal{E}_D$, there exists $\tilde{u} \in \mathcal{E}_1$ such that $E_p(\tilde{u}) = E_p(u)$ and

$$\tilde{u}_1(x) = u_1(x) \quad \text{and} \quad |\tilde{u}_2(x)| = |u_2(x)|, \qquad \forall x \in \mathbb{R}^2. \tag{2.3}$$

Proof. By Lemma 2.1 there exists $R_0 > 0$ such that $|u(x)| \ge \frac{1}{2}$ for $|x| \ge R_0$. By Fubini's theorem we can find $r \in (R_0, R_0 + 1)$ such that $\int_{\partial B_r(0)} |\nabla u|^p \le E_p(u)$. Therefore, by Hölder's inequality,

$$\int_{\partial B_r(0)} \left| \frac{\partial u}{\partial \tau} \right| d\tau < \infty. \tag{2.4}$$

Here $\frac{\partial u}{\partial \tau}$ denotes the tangential derivative of $u$ along $\partial B_r(0)$. We start by constructing a map $\tilde{u} \in W^{1,p}(B_r(0), \mathbb{R}^2)$ (of course, also $\tilde{u} \in C(\overline{B_r(0)}, \mathbb{R}^2)$) satisfying (2.3) on $B_r(0)$ and such that $\deg(\tilde{u}, \partial B_r(0)) = 1$. Thus, until stated otherwise, we consider $u$ on $B_r(0)$ only. Since $u = (u_1, u_2)$ is continuous, we can represent the set $B_r(0) \cap \{u_2 \ne 0\}$ as a union of its (countably many) disjoint components,

$$\{u_2 \ne 0\} \cap B_r(0) = \bigcup_{j \in I_+} \omega_j^+ \cup \bigcup_{j \in I_-} \omega_j^-,$$

where $\{\omega_j^+\}_{j \in I_+}$ are the components of the set $B_r(0) \cap \{u_2 > 0\}$, while $\{\omega_j^-\}_{j \in I_-}$ are the components of $B_r(0) \cap \{u_2 < 0\}$. Each index set $I_\pm$ is either a finite set of integers $\{1, \dots, N\}$, or the set $\mathbb{N}$ of all positive integers. Note that both $I_+$ and $I_-$ are nonempty, because otherwise $u$ would take values in a half-plane and its degree would be zero, contrary to our assumption. Denote

$$u^+_{2,j} = \chi_{\omega_j^+} |u_2|, \qquad \forall j \in I_+$$

and

$$u^-_{2,j} = \chi_{\omega_j^-} |u_2|, \qquad \forall j \in I_-.$$

Then, on $B_r(0)$,

$$u_2 = \sum_{j \in I_+} u^+_{2,j} - \sum_{j \in I_-} u^-_{2,j}. \tag{2.5}$$

Next we claim that for each $\omega_j^\pm$,

$$u^\pm_{2,j} \in W^{1,p}(B_r(0)). \tag{2.6}$$

We pay special attention to cases where $\partial\omega_j^\pm \cap \partial B_r(0)$ is nonempty. We begin by applying a standard argument (cf. [8]) to construct an extension $w$ of $u_2$ such that $w \in C_c(\mathbb{R}^2) \cap W^{1,p}(\mathbb{R}^2)$. Let $Q$ denote the connected component of the set $\{w \ne 0\}$ that contains $\omega_j^\pm$. Then, $w \in W_0^{1,p}(Q) \cap C(\overline{Q})$, and by defining an extension map $\tilde{w}$ which is identically zero on $\mathbb{R}^2 \setminus Q$ we obtain that $\tilde{w} \in W^{1,p}(\mathbb{R}^2) \cap C_c(\mathbb{R}^2)$ (note that no regularity assumption on $Q$ is required for this to hold, see Remarque 20 after Théorème IX.17 in [6]). Since $\tilde{w} = \chi_{\omega_j^\pm} u_2$ on $B_r(0)$, we deduce immediately that (2.6) holds.

It follows from (2.6) that for every pair of maps, $\gamma_+ : I_+ \to \{-1, +1\}$, $\gamma_- : I_- \to \{-1, +1\}$, the function

$$u_2^{(\gamma_+, \gamma_-)} = \sum_{j \in I_+} \gamma_+(j)\, u^+_{2,j} + \sum_{j \in I_-} \gamma_-(j)\, u^-_{2,j} \tag{2.7}$$

belongs to $W^{1,p}(B_r(0))$ and the map $\tilde{u} = (u_1, u_2^{(\gamma_+, \gamma_-)})$ satisfies (2.3) on $B_r(0)$. We now show that it is possible to choose $\gamma_+$ and $\gamma_-$ in such a way that the resulting $\tilde{u}$ will have degree equal to 1.

First, we claim that one can assume that $I_+$ is finite. Indeed, if $I_+ = \mathbb{N}$, then we define a sequence of maps $v^{(N)} = (v_1^{(N)}, v_2^{(N)})$ by

$$v_2^{(N)} = \sum_{j=1}^{N} u^+_{2,j} - \sum_{j=N+1}^{\infty} u^+_{2,j} - \sum_{j \in I_-} u^-_{2,j} \quad \text{and} \quad v_1^{(N)} = u_1,$$


where $N \ge 1$. By dominated convergence, it can be easily seen that $v^{(N)} \to u$ in $W^{1,p}(B_r(0))$, hence also in $C(\overline{B_r(0)})$. By the continuity of the degree, we obtain

$$\lim_{N \to \infty} \deg\left( v^{(N)}, \partial B_r(0) \right) = \deg\left( u, \partial B_r(0) \right) = D.$$

Therefore, for sufficiently large $N$, we have $\deg(v^{(N)}) = D$. Since we can replace $u$ by $v^{(N)}$, the claim follows. We will assume in the sequel that $u$ is such that $I_+ = \{1, \dots, N\}$ for some $N \in \mathbb{N}$.

Next, we claim that one can effectively assume that $N = 1$. From now on, we assume the positive orientation (i.e., counterclockwise) of $\partial B_r(0)$. The map $U = \frac{u}{|u|}$ is well defined on $\partial B_r(0)$ and, thanks to (2.4), it belongs to $W^{1,1}(\partial B_r(0), S^1)$. For $j = 1, \dots, N$ set

$$A_j = \overline{\omega_j^+} \cap \partial B_r(0) \quad \text{and} \quad a_j = \int_{A_j} U \wedge U_\tau\, d\tau,$$

so that $a_j$ equals the change of phase of $U$ on $A_j$. Further, denote

$$b = \int_{\partial B_r(0) \setminus \bigcup_{j=1}^{N} A_j} U \wedge U_\tau\, d\tau.$$

Clearly

$$2\pi D = b + \sum_{j=1}^{N} a_j. \tag{2.8}$$

But since replacing $u = u_1 + iu_2$ by its complex conjugate $\bar{u} = u_1 - iu_2$ on $\bigcup_{j=1}^{N} \omega_j^+$ (without changing $u$ elsewhere) would result in a map of degree zero (since it takes its values only in the lower half-plane), we must also have

$$0 = -\sum_{j=1}^{N} a_j + b. \tag{2.9}$$

From (2.8), (2.9) we get

$$b = \sum_{j=1}^{N} a_j = \pi D. \tag{2.10}$$

It follows from (2.10) that there exists $j_0$ for which

$$b + a_{j_0} - \sum_{j \ne j_0} a_j = 2a_{j_0} > 0. \tag{2.11}$$
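The bookkeeping in (2.8)–(2.11) can be illustrated on an explicit example. The sketch below is not the construction of the proof; it assumes the model map $u(z) = z^D$ restricted to the unit circle, for which $\{u_2 > 0\}$ meets the circle in $D$ arcs, each contributing $a_j = \pi$, while the remaining arcs contribute $b = \pi D$. Flipping the sign of $u_2$ on all positive arcs except a chosen set should then give degree equal to the number of arcs kept. The function names and the sampling scheme are choices made for this example.

```python
import cmath
import math


def winding_number(values):
    """Winding number around 0 of a closed loop given by nonzero complex samples."""
    total = 0.0
    n = len(values)
    for k in range(n):
        total += cmath.phase(values[(k + 1) % n] / values[k])
    return round(total / (2 * math.pi))


def flipped_boundary_values(D, kept_arcs, samples=3600):
    """Sample (u_1, v_2) on the unit circle for the model map u(z) = z^D.

    The positive arcs of u_2 = sin(D theta) are indexed 0, ..., D-1.
    On arcs listed in `kept_arcs` the value u_2 is kept; on every other
    positive arc it is replaced by -u_2, so that v_2 = -|u_2| there.
    """
    values = []
    for k in range(samples):
        theta = 2 * math.pi * (k + 0.5) / samples
        u1, u2 = math.cos(D * theta), math.sin(D * theta)
        if u2 > 0:
            arc = int(D * theta // (2 * math.pi))  # index of the positive arc
            if arc not in kept_arcs:
                u2 = -u2
        values.append(complex(u1, u2))
    return values


if __name__ == "__main__":
    D = 3
    print(winding_number(flipped_boundary_values(D, {0, 1, 2})))  # original map
    print(winding_number(flipped_boundary_values(D, {1})))        # one arc kept
    print(winding_number(flipped_boundary_values(D, set())))      # all arcs flipped
```

In this model, keeping a single arc $j_0$ gives total phase change $b + a_{j_0} - \sum_{j \ne j_0} a_j = \pi D + \pi - (D - 1)\pi = 2\pi$, which is the identity (2.11) with $a_{j_0} = \pi$.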



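From (2.11) we deduce that the map $v = (v_1, v_2) \in W^{1,p}(B_r(0))$ with $v_1 = u_1$ and $v_2$ given by

$$v_2(x) = \begin{cases} -|u_2(x)|, & x \notin \omega_{j_0}^+, \\ u_2(x), & x \in \omega_{j_0}^+, \end{cases}$$

has degree $d > 0$. If $d = 1$ then the proposition is proved, thus we assume in the sequel that $d \ge 2$. Consider the set $\Omega^- = \{x \in B_r(0) : v_2(x) < 0\}$ and write it as a disjoint countable union of its components,

$$\Omega^- = \bigcup_{j \in J} \Omega_j^-.$$

Set

$$V = \frac{v}{|v|} \quad \text{on } \partial B_r(0).$$

By (2.4)

$$\int_{\partial B_r(0)} \left| \frac{\partial V}{\partial \tau} \right| d\tau < \infty. \tag{2.12}$$

Define

$$G_+ = \{x \in \partial B_r(0) : v_2(x) > 0\} \quad \text{and} \quad G_- = \{x \in \partial B_r(0) : v_2(x) < 0\}.$$

As above

$$\int_{G_+} V \wedge V_\tau\, d\tau = \int_{G_-} V \wedge V_\tau\, d\tau = \pi d. \tag{2.13}$$

We can write each of $G_+$ and $G_-$ (which are (relatively) open subsets of $\partial B_r(0)$) as a countable union of open segments on the circle $\partial B_r(0)$:

$$G_+ = \bigcup_{i \in K_+} J_+^i \quad \text{and} \quad G_- = \bigcup_{i \in K_-} J_-^i.$$

Clearly, each segment $J_-^i$ satisfies

$$J_-^i \subset \overline{\Omega^-_{\zeta(i)}} \quad \text{for a unique } \zeta(i) \in J. \tag{2.14}$$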


Of course, also $J_+^i \subset \overline{\omega_{j_0}^+}$ for each $i$. Since for each segment $J_\pm^i$, $v_2(\partial J_\pm^i) = 0$, i.e., $V(\partial J_\pm^i) \subset \{-1, 1\}$, we clearly have

$$\int_{J_\pm^i} V \wedge V_\tau\, d\tau \in \{-\pi, 0, \pi\}. \tag{2.15}$$

Invoking (2.12) we deduce that the number of intervals $J_\pm^i$ for which $\int_{J_\pm^i} V \wedge V_\tau\, d\tau \ne 0$ is finite. We denote them (ordered according to the positive orientation) by

$$J_+^{j_1}, \dots, J_+^{j_{\kappa_+}} \quad \text{and} \quad J_-^{l_1}, \dots, J_-^{l_{\kappa_-}}.$$

Then our assumption that $d \ge 2$ in conjunction with (2.15) and (2.13) implies that $\kappa_+ \ge 2$. Given an $s = 1, \dots, \kappa_+$, denote by $r\exp(i\theta_{1,j_s})$ and $r\exp(i\theta_{2,j_s})$ the end points of $J_+^{j_s}$ so that

$$J_+^{j_s} = \left\{ re^{i\theta} : \theta \in (\theta_{1,j_s}, \theta_{2,j_s}) \right\}.$$

We claim that there exists at least one pair of two consecutive segments, without loss of generality $J_+^{j_1}$ and $J_+^{j_2}$, such that for the intermediate segment $I = \{re^{i\theta} : \theta \in (\theta_{2,j_1}, \theta_{1,j_2})\}$ we have

$$\delta := \int_I V \wedge V_\tau\, d\tau \ne 0.$$

Indeed, this follows immediately from the fact that the total change of phase of $V$ over all such intermediate segments equals $\pi d$ by (2.13). Next, set

$$I_+ = \{x \in I : v_2(x) > 0\} \quad \text{and} \quad I_- = \{x \in I : v_2(x) < 0\}.$$

From the definitions of $J_+^{j_s}$ and $\delta$ it follows that

$$\int_{I_-} V \wedge V_\tau = \delta \quad \text{and} \quad \int_{I_+} V \wedge V_\tau = 0. \tag{2.16}$$

Furthermore, it is easy to see that

(i) $V(r\exp(i\theta_{2,j_1})) = \pm 1$ and $V(r\exp(i\theta_{1,j_2})) = -V(r\exp(i\theta_{2,j_1}))$,
(ii) $\delta = \pm\pi$,
(iii) $I \cap \bigcup_{s=1}^{\kappa_-} J_-^{l_s} \ne \emptyset$, and there is an odd number of segments $J_-^{l_{\sigma_1}}, \dots, J_-^{l_{\sigma_{2k+1}}}$ such that $J_-^{l_{\sigma_i}} \subset I$ for every $i = 1, \dots, 2k + 1$ and some $k \in \mathbb{N}$ (see Fig. 1).



Fig. 1. In this example, there are five "negative segments," $J_-^{l_{\sigma_1}}, \dots, J_-^{l_{\sigma_5}}$, between the two "positive segments" $J_+^{j_1}$ and $J_+^{j_2}$.

Consider the components $\Omega^-_{\zeta(l_{\sigma_i})}$ corresponding to $J_-^{l_{\sigma_i}}$ for $i = 1, \dots, 2k + 1$ (see (2.14)). We now claim that, for each $i = 1, \dots, 2k + 1$, the set $\Omega^-_{\zeta(l_{\sigma_i})}$ satisfies

$$\overline{\Omega^-_{\zeta(l_{\sigma_i})}} \cap G_- \subset I. \tag{2.17}$$

Indeed, assume, for contradiction, that (2.17) does not hold for some $i$. Then, there exists a segment $J_-^j \subset \overline{\Omega^-_{\zeta(l_{\sigma_i})}} \cap (G_- \setminus I)$. But this would imply the existence of a curve starting at a point on $J_-^j$ and ending at a point on $J_-^{l_{\sigma_i}}$ whose interior is contained in $\Omega^-_{\zeta(l_{\sigma_i})}$. The existence of such a curve clearly contradicts the connectedness of $\omega_{j_0}^+$, and (2.17) follows.

Finally, we define the map $\tilde{u} = \tilde{u}_1 + i\tilde{u}_2$ on $B_r(0)$ as follows. Set $\tilde{u}_1 = u_1$ and

$$\tilde{u}_2(x) = \begin{cases} v_2(x), & x \in \bigcup_{i=1}^{2k+1} \Omega^-_{\zeta(l_{\sigma_i})}, \\ |v_2(x)|, & \text{otherwise.} \end{cases}$$

From the above it follows that $\tilde{U} = \frac{\tilde{u}}{|\tilde{u}|} = \tilde{U}_1 + i\tilde{U}_2$ satisfies

$$\int_{\partial B_r(0)} \tilde{U} \wedge \tilde{U}_\tau = 2\delta = \pm 2\pi.$$

Therefore, either $\tilde{u}$ or its complex conjugate $\tilde{u}_1 - i\tilde{u}_2$ has degree 1 as required.



Finally, we use the above construction to define a map $\tilde{u}$ on $\mathbb{R}^2$ possessing the property stated in the proposition. Choose a sequence $\{R_n\}_{n=1}^\infty$ with $R_n \to \infty$, so we may assume that $R_n > R_0$ for all $n$. For each $n$ we may find $r_n \in (R_n, R_n + 1)$ satisfying (2.4) with $r = r_n$ and repeat the above construction to get a map $\tilde{u}^{(n)} \in W^{1,p}(B_{r_n}(0), \mathbb{R}^2)$ satisfying

$$\tilde{u}_1^{(n)}(x) = u_1(x) \quad \text{and} \quad |\tilde{u}_2^{(n)}(x)| = |u_2(x)|, \qquad \forall x \in B_{r_n}(0), \tag{2.18}$$

and $\deg(\tilde{u}^{(n)}, \partial B_{r_n}(0)) = 1$. Note that (2.18) implies

$$|\nabla \tilde{u}^{(n)}(x)| = |\nabla u(x)|, \qquad \text{a.e. in } B_{r_n}(0). \tag{2.19}$$

By (2.18), (2.19) the sequence $\{\tilde{u}^{(n)}\}$ is bounded in $W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$, and by passing to a subsequence we may assume that $\tilde{u}^{(n)}$ converges to a map $\tilde{u} \in W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$, weakly in $W^{1,p}_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$, hence also in $C_{\mathrm{loc}}(\mathbb{R}^2, \mathbb{R}^2)$. Clearly, $\tilde{u}$ satisfies the assertion of the proposition. □

By using exactly the same method, we can prove an analogous result for the half-plane.

Proposition 2. Let $D \ge 2$ be an integer. Then, for each $u \in \mathcal{E}_D^+$ there exists $\tilde{u} \in \mathcal{E}_1^+$ such that $E_p(\tilde{u}) = E_p(u)$, where

$$\tilde{u}_1(x) = u_1(x) \quad \text{and} \quad |\tilde{u}_2(x)| = |u_2(x)|, \qquad \forall x \in \mathbb{R}^2_+.$$

3. Existence of minimizers in $\mathbb{R}^2$

In this section we study the existence of minimizers on $\mathbb{R}^2$ and prove Theorem 1. The main difficulty we face here is to show that the (weak) limit of a minimizing sequence must satisfy the degree condition. Our main tool in overcoming this difficulty is Proposition 1.

Proof of Theorem 1. Clearly, without any loss of generality, we can consider the case $d = 1$. Let $\{u_n\}_{n=1}^\infty$ be a minimizing sequence in $\mathcal{E}_1$ for $I_p(1)$, i.e.,

$$\lim_{n \to \infty} E_p(u_n) = I_p(1).$$

By (2.2), there exists a constant $\lambda_0 > 0$ such that

$$|u_n(x_0)| \le \frac{1}{2} \ \Rightarrow\ |u_n(x)| \le \frac{3}{4}, \qquad \forall x \in B_{\lambda_0}(x_0),\ \forall n \in \mathbb{N}. \tag{3.1}$$

Consider the set

$$S_n = \left\{ x \in \mathbb{R}^2 : |u_n(x)| \le \frac{1}{2} \right\}. \tag{3.2}$$

Next, borrowing an argument from [5], we show that $S_n$ can be covered by a finite number of "bad disks." Starting from a point $x_{1,n} \in S_n$, we choose a point $x_{2,n} \in S_n \setminus B_{5\lambda_0}(x_{1,n})$ (if this set is nonempty) and then, by recurrence, $x_{k,n} \in S_n \setminus \bigcup_{j=1}^{k-1} B_{5\lambda_0}(x_{j,n})$ (if this set is nonempty). This selection process must stop after a finite number of iterations (bounded uniformly in $n$), because $\int_{\mathbb{R}^2} (1 - |u_n|^2)^2 \le C$, and the disks $\{B_{\lambda_0}(x_{j,n})\}_{j=1}^{k}$ are mutually disjoint at each step, while

$$\int_{B_{\lambda_0}(x_{j,n})} \left( 1 - |u_n|^2 \right)^2 \ge \frac{\pi\lambda_0^2}{16}, \tag{3.3}$$

by (3.1). Passing to a further subsequence (if necessary), we find that the number of disks is independent of $n$, i.e.,

$$S_n \subset \bigcup_{j=1}^{m} B_{5\lambda_0}(x_{j,n}),$$

where $\{x_{j,n}\}_{j=1}^{m} \subset S_n$ and the disks $\{B_{\lambda_0}(x_{j,n})\}_{j=1}^{m}$ are mutually disjoint. By replacing $u_n(x)$ with $u_n(x + x_{1,n})$, we may assume that

$$x_{1,n} = 0, \qquad \forall n \in \mathbb{N}. \tag{3.4}$$

From (2.2) and (3.4) it follows that $\{u_n\}$ is bounded in $C^{\alpha}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$. Therefore, by passing to a subsequence and relabeling, we may assume that $\{u_n\}$ converges in $C_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ and weakly in $W^{1,p}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ to a map $u \in W^{1,p}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$. By weak lower semicontinuity and the local uniform convergence it follows that

\[ E_p(u) \le \lim_{n\to\infty} E_p(u_n) = I_p(1). \tag{3.5} \]

It remains to show that $u \in E_1$. Let

\[ R := \sup_{n \ge 1} \max\bigl\{ |x_{j,n}| : 1 \le j \le m \bigr\} \in (0, \infty]. \tag{3.6} \]

We distinguish two cases: (i) $R < \infty$. (ii) $R = \infty$. In the case (i), we clearly have

\[ |u_n(x)| \ge \tfrac12, \qquad |x| \ge R + 5\lambda_0,\ \forall n \in \mathbb{N}. \]

By the local uniform convergence, $\deg(u, \partial B_r(0)) = \deg(u_n, \partial B_r(0)) = 1$ for each $r \ge R + 5\lambda_0$, i.e., $u \in E_1$, and we conclude from (3.5) that $u$ is a minimizer for (1.2).



Next, we show that the case (ii) is impossible. Assume by negation that the case (ii) holds. Then, by passing to a subsequence, we may assume the following: the index set $J = \{1, \dots, m\}$ is a union of $K \ge 2$ disjoint subsets $J_1, \dots, J_K$ such that the (generalized) limit $l_{j_1,j_2} := \lim_{n\to\infty} |x_{j_1,n} - x_{j_2,n}| \in (0, \infty]$ exists for every pair of distinct indices $j_1, j_2 \in \{1, \dots, m\}$ and

\[ l_{j_1,j_2} < \infty \;\Leftrightarrow\; \exists k \in \{1, \dots, K\} \text{ such that } j_1, j_2 \in J_k. \]

For every $k \in \{1, \dots, K\}$ and each $n$ we define

\[ \delta_{k,n} = \max\bigl\{ |x_{j_1,n} - x_{j_2,n}| : j_1, j_2 \in J_k \bigr\}. \tag{3.7} \]

Note that $\Delta_k = \sup_n \delta_{k,n} < \infty$ for every $k \in \{1, \dots, K\}$. For $j \in \{1, \dots, m\}$ we denote by $\sigma(j)$ the index in $\{1, \dots, K\}$ such that $j \in J_{\sigma(j)}$. Defining

\[ \rho_n = \inf\bigl\{ |x_{j_1,n} - x_{j_2,n}| : j_1, j_2 \in \{1, \dots, m\} \text{ such that } \sigma(j_1) \ne \sigma(j_2) \bigr\}, \]

we have $\lim_{n\to\infty} \rho_n = \infty$.

Fix any $k \in \{1, \dots, K\}$ and any $j_k \in \sigma^{-1}(k)$. Define the sequence $\{v_n^{(k)}\}$ by $v_n^{(k)}(x) = u_n(x + x_{j_k,n})$. For any $r_1 > 0$ we have $r_1 < \rho_n/2$ for a sufficiently large $n \in \mathbb{N}$. If we take $r_1 > \Delta_k$, then the degree $d_{k,n} = \deg(v_n^{(k)}, \partial B_r(0))$ does not depend on $r$ for $r_1 < r < \rho_n/2$. Passing to a further subsequence, we may assume that $d_{k,n} = d_k$ for all $n \in \mathbb{N}$ and further, that $v_n^{(k)} \to v_k$ in $C_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ and weakly in $W^{1,p}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$, for some $v_k \in E_{d_k}$. In case $d_k \ne 0$ we have

\[ I_p(d_k) \le E_p(v_k). \tag{3.8} \]

Note that (3.8) is obviously true when $d_k = 0$ because $I_p(0) = 0$. However, thanks to (3.3), we have that

\[ \frac{\pi\lambda_0^2}{32} \le E_p(v_k). \tag{3.9} \]

Set $\mathcal{K} = \{ k \in \{1, \dots, K\} : d_k \ne 0 \}$, and denote its complement in $\{1, \dots, K\}$ by $\mathcal{K}^c$. Note that by the properties of the degree

\[ 1 = \sum_{k=1}^{K} d_k = \sum_{k \in \mathcal{K}} d_k, \]

so that, in particular, $\mathcal{K} \ne \emptyset$. By weak lower semicontinuity, the aforementioned convergence, and (3.8), (3.9) we obtain

\[ \frac{\pi\lambda_0^2}{32}\,|\mathcal{K}^c| + \sum_{k \in \mathcal{K}} I_p(d_k) \le \sum_{k=1}^{K} E_p(v_k) \le \lim_{n\to\infty} E_p(u_n) = I_p(1). \tag{3.10} \]

Using Proposition 1 in (3.10) yields

\[ \frac{\pi\lambda_0^2}{32}\,|\mathcal{K}^c| + |\mathcal{K}|\, I_p(1) \le I_p(1), \]

from which it is clear that $\mathcal{K}^c = \emptyset$ and thus $\mathcal{K}$ must be a singleton, i.e., $K = 1$, a contradiction. □

4. Limiting behaviour of global minimizers when $p \to 2$

Throughout this section we denote by $u_p$ a global minimizer realizing $I_p(1)$ for $p > 2$ (the existence is guaranteed by Theorem 1) satisfying

\[ u_p(0) = 0. \tag{4.1} \]

The condition (4.1) can always be fulfilled by an appropriate translation. The following proposition is needed in Section 5 where we study the existence problem for minimizers on $\mathbb{R}^2_+$.

Proposition 3. Let $\{u_p\}_{p>2}$ be a family of minimizers satisfying (4.1). Then, for every sequence $p_n \to 2^+$ we have, up to a subsequence,

\[ u_{p_n} \rightharpoonup \tilde u \quad \text{weakly in } H^1_{loc}(\mathbb{R}^2), \tag{4.2} \]

where $\tilde u$ is a degree-one solution of the classical Ginzburg–Landau equation

\[ -\Delta \tilde u = \bigl(1 - |\tilde u|^2\bigr)\tilde u \tag{4.3} \]

on $\mathbb{R}^2$. Furthermore,

\[ \lim_{p \to 2^+} \int_{\mathbb{R}^2} \bigl(1 - |u_p|^2\bigr)^2 = 2\pi. \tag{4.4} \]

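To prove this proposition we need the following Pohozaev-type identity that will also be used later on in Section 5.

Lemma 4.1. For every $p > 2$ we have

\[ \int_{\mathbb{R}^2} \bigl(1 - |u_p|^2\bigr)^2 = \frac{2(p-2)}{p}\, I_p(1). \tag{4.5} \]

Proof. Let $\lambda > 0$ and set $w_\lambda(x) = u_p(\lambda x)$ and

\[ F(\lambda) := E_p(w_\lambda) = \lambda^{p-2} \int_{\mathbb{R}^2} |\nabla u_p|^p + \frac{1}{2\lambda^2} \int_{\mathbb{R}^2} \bigl(1 - |u_p|^2\bigr)^2. \]

Since $F$ has a local minimum at $\lambda = 1$, we must have $F'(1) = 0$. Thus

\[ (p-2) \int_{\mathbb{R}^2} |\nabla u_p|^p = \int_{\mathbb{R}^2} \bigl(1 - |u_p|^2\bigr)^2, \]

and (4.5) follows. □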
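The scaling argument in the proof of Lemma 4.1 can be checked numerically. The following is a minimal sketch, not part of the original paper: it treats the two integrals as plain numbers `grad_term` (for $\int |\nabla u_p|^p$) and `potential_term` (for $\int (1-|u_p|^2)^2$), both hypothetical names, imposes the stationarity relation $(p-2)A = B$, and confirms that $F'(1) \approx 0$ and that $B = \frac{2(p-2)}{p}(A + B/2)$.

```python
def scaled_energy(lam, p, grad_term, potential_term):
    """F(lam) = lam**(p-2) * A + B / (2 * lam**2), with A, B the two integrals."""
    return lam ** (p - 2) * grad_term + potential_term / (2.0 * lam ** 2)


def stationary_split(p, potential_term):
    """Given B, return (A, A + B/2) where A is fixed by the relation (p-2) * A = B."""
    grad_term = potential_term / (p - 2)
    return grad_term, grad_term + 0.5 * potential_term


def derivative_at_one(p, grad_term, potential_term, h=1e-6):
    """Central-difference approximation of F'(1)."""
    upper = scaled_energy(1.0 + h, p, grad_term, potential_term)
    lower = scaled_energy(1.0 - h, p, grad_term, potential_term)
    return (upper - lower) / (2.0 * h)


def pohozaev_rhs(p, energy):
    """Right-hand side 2(p-2)/p * energy of the identity in Lemma 4.1."""
    return 2.0 * (p - 2) / p * energy
```

With the illustrative values $p = 2.5$ and $B = 3$, `stationary_split` gives $A = 6$ and total energy $7.5$, and `pohozaev_rhs(2.5, 7.5)` returns $B$ up to rounding.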

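An upper bound for $I_p(1)$ is given by the next lemma.

Lemma 4.2. We have

\[ I_p(1) \le \frac{2\pi}{p-2} + 3\pi, \qquad \forall p > 2. \tag{4.6} \]

Proof. Define a function $f(r)$ by

\[ f(r) = \begin{cases} \dfrac{r}{\sqrt2}, & 0 \le r \le \sqrt2, \\[4pt] 1, & \sqrt2 < r. \end{cases} \]

A direct computation gives

\[ I_p(1) \le E_p\bigl(f e^{i\theta}\bigr) \le 3\pi + 2\pi \int_{\sqrt2}^{\infty} r^{1-p}\, dr = 3\pi + 2\pi\, \frac{2^{1-p/2}}{p-2}, \]

and (4.6) follows. □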
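The "direct computation" in the proof of Lemma 4.2 can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it uses the energy $E_p(u) = \int |\nabla u|^p + \frac12 \int (1-|u|^2)^2$, integrates the inner disk $r < \sqrt2$ with a midpoint rule, and uses the closed form for the outer region. The function names are hypothetical. By our own calculation the inner contribution equals $7\pi/3$, which is consistent with the bound $3\pi$ used in the proof.

```python
import math

SQRT2 = math.sqrt(2.0)


def profile(r):
    """Radial profile f from the proof: linear up to sqrt(2), then equal to 1."""
    return r / SQRT2 if r <= SQRT2 else 1.0


def inner_energy(p, steps=20000):
    """Midpoint-rule value of E_p(f e^{i theta}) restricted to the disk r < sqrt(2)."""
    h = SQRT2 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        f = profile(r)
        grad_sq = 0.5 + (f / r) ** 2  # f'(r)^2 + f(r)^2 / r^2 for u = f(r) e^{i theta}
        density = grad_sq ** (p / 2.0) + 0.5 * (1.0 - f * f) ** 2
        total += 2.0 * math.pi * r * density * h
    return total


def outer_energy(p):
    """Closed form of 2*pi * int_{sqrt(2)}^inf r^(1-p) dr; the potential term vanishes there."""
    return 2.0 * math.pi * 2.0 ** (1.0 - p / 2.0) / (p - 2.0)


def trial_energy(p):
    """Energy of the trial map f(r) e^{i theta}."""
    return inner_energy(p) + outer_energy(p)


def lemma_bound(p):
    """Right-hand side of (4.6)."""
    return 2.0 * math.pi / (p - 2.0) + 3.0 * math.pi
```

For any $p > 2$ one expects `trial_energy(p) <= lemma_bound(p)`, since $2^{1-p/2} < 1$.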

Remark 4.1. Although our main interest is in the limit $p \to 2$, we note that the result of Lemma 4.2 provides a uniform bound in the limit $p \to \infty$ as well.

Proof of Proposition 3. First, we show that the maps $\{u_p\}_{p>2}$ are uniformly bounded in $H^1_{loc}(\mathbb{R}^2)$. The Euler–Lagrange equation associated with (1.1) is

\[ \frac{p}{2}\, \nabla \cdot \bigl( |\nabla u_p|^{p-2} \nabla u_p \bigr) + u_p \bigl(1 - |u_p|^2\bigr) = 0. \tag{4.7} \]

Let $\eta \in C_0^{\infty}(\mathbb{R}_+, [0,1])$ be a cutoff function satisfying

\[ \eta(r) = \begin{cases} 1, & r < \tfrac12, \\ 0, & r > 1, \end{cases} \qquad |\eta'| \le 4. \]

Fix any $x_0 \in \mathbb{R}^2$. Using the identity

\[ \nabla u_p \cdot \nabla\bigl(\eta^2 u_p\bigr) = \bigl|\nabla(\eta u_p)\bigr|^2 - |u_p|^2 |\nabla\eta|^2, \tag{4.8} \]

we obtain, upon multiplying (4.7) by $\eta^2(|x - x_0|)\, u_p(x)$ and integrating over $\mathbb{R}^2$,

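\[ \int_{B_{1/2}(x_0)} |\nabla u_p|^p \le \int_{B_1(x_0)} |\nabla u_p|^{p-2} \bigl|\nabla(\eta u_p)\bigr|^2 = \int_{B_1(x_0)} |\nabla\eta|^2 |u_p|^2 |\nabla u_p|^{p-2} + \frac{2}{p} \int_{B_1(x_0)} \eta^2 |u_p|^2 \bigl(1 - |u_p|^2\bigr). \]

Since we have $\|u_p\|_{\infty} \le 1$ for every $p$ (otherwise, replacing $u_p(x)$ by $\frac{u_p(x)}{|u_p(x)|}$ on the set $\{x : |u_p(x)| > 1\}$ would yield a map with a lower energy), we conclude, using the Hölder inequality, that

\[ \int_{B_{1/2}(x_0)} |\nabla u_p|^p \le C \Bigl( 1 + \Bigl( \int_{B_1(x_0)} |\nabla u_p|^p \Bigr)^{(p-2)/p} \Bigr). \tag{4.9} \]

Here and for the remainder of the proof, $C$ denotes a constant independent of $p > 2$. Inserting (4.6) into (4.9) yields

\[ \int_{B_{1/2}(x_0)} |\nabla u_p|^p \le C \bigl( (p-2)^{-(p-2)/p} + 1 \bigr), \]

hence

\[ \int_{B_{1/2}(x_0)} |\nabla u_p|^p \le C, \]

uniformly in $p > 2$. Applying the Hölder inequality once again and using a covering argument we find that

\[ \int_{B_R(0)} |\nabla u_p|^2 \le C(R), \qquad \forall p > 2,\ \forall R > 0. \tag{4.10} \]

Thanks to (4.10), there exists a sequence $p_n \to 2^+$ such that $u_{p_n} \rightharpoonup \tilde u$ weakly in $H^1_{loc}(\mathbb{R}^2)$. We now verify that $\tilde u$ satisfies (4.3). To this end, choose an arbitrary test function $\varphi \in C_c^{\infty}(\mathbb{R}^2)$. By (4.7) we have for each $n$,

\[ \frac{p_n}{2} \int_{\mathbb{R}^2} |\nabla u_{p_n}|^{p_n - 2} \nabla u_{p_n} \cdot \nabla\varphi = \int_{\mathbb{R}^2} u_{p_n} \bigl(1 - |u_{p_n}|^2\bigr) \varphi. \tag{4.11} \]

Using (4.10) and the Rellich–Kondrachov compact embedding theorem, we deduce that $\{u_{p_n}\}$ is relatively compact in $L^q_{loc}(\mathbb{R}^2)$ for every $q > 2$. By passing, if necessary, to a further subsequence, we then deduce that

\[ \lim_{n\to\infty} \int_{\mathbb{R}^2} u_{p_n} \bigl(1 - |u_{p_n}|^2\bigr) \varphi = \int_{\mathbb{R}^2} \tilde u \bigl(1 - |\tilde u|^2\bigr) \varphi. \tag{4.12} \]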


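Next, we claim that

\[ \lim_{n\to\infty} \frac{p_n}{2} \int_{\mathbb{R}^2} |\nabla u_{p_n}|^{p_n - 2} \nabla u_{p_n} \cdot \nabla\varphi = \int_{\mathbb{R}^2} \nabla\tilde u \cdot \nabla\varphi. \tag{4.13} \]

Clearly, (4.13) would follow if we can show that

\[ \lim_{n\to\infty} \int_{\mathbb{R}^2} \bigl( |\nabla u_{p_n}|^{p_n - 2} - 1 \bigr) \nabla u_{p_n} \cdot \nabla\varphi = 0. \tag{4.14} \]

For any $p > 2$ define the function

\[ g_p(t) = \bigl(t^{p-2} - 1\bigr)\, t \quad \text{on } t \in [0, \infty). \tag{4.15} \]

An elementary computation shows that, for any $\beta > 1$,

\[ \max_{t \in [0,\beta]} |g_p(t)| = \max\Bigl\{ |g_p(\beta)|,\ \Bigl| g_p\Bigl( \bigl(\tfrac{1}{p-1}\bigr)^{\frac{1}{p-2}} \Bigr) \Bigr| \Bigr\} \to 0, \quad \text{as } p \to 2. \]

It follows that

\[ \lim_{n\to\infty} \int_{\{|\nabla u_{p_n}(x)| \le \beta\}} \bigl( |\nabla u_{p_n}|^{p_n - 2} - 1 \bigr) \nabla u_{p_n} \cdot \nabla\varphi = 0. \tag{4.16} \]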
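The elementary computation behind the maximum of $|g_p|$ on $[0,\beta]$ can be illustrated as follows. This is a sketch with hypothetical function names: the interior critical point $t^* = (1/(p-1))^{1/(p-2)}$ comes from solving $g_p'(t) = (p-1)t^{p-2} - 1 = 0$, and a brute-force grid search is included only as an independent cross-check.

```python
def g(p, t):
    """g_p(t) = (t**(p-2) - 1) * t, as in (4.15)."""
    return (t ** (p - 2.0) - 1.0) * t


def critical_point(p):
    """Interior critical point of g_p: the solution of (p-1) * t**(p-2) = 1."""
    return (1.0 / (p - 1.0)) ** (1.0 / (p - 2.0))


def max_abs_g(p, beta):
    """Maximum of |g_p| on [0, beta] via the endpoint and the critical point."""
    return max(abs(g(p, beta)), abs(g(p, critical_point(p))))


def grid_max_abs_g(p, beta, steps=100000):
    """Brute-force maximum of |g_p| over a uniform grid on [0, beta]."""
    return max(abs(g(p, beta * i / steps)) for i in range(steps + 1))
```

Evaluating `max_abs_g(p, beta)` for a fixed `beta` and values of `p` decreasing to 2 shows the maximum shrinking to zero, which is what (4.16) relies on.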

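Let $R > 0$ be such that $\operatorname{supp}(\varphi) \subset B_R(0)$ and set $A_{n,\beta} = \{ x \in B_R(0) : |\nabla u_{p_n}(x)| > \beta \}$. By (4.10), we have

\[ \mu(A_{n,\beta}) \le \frac{C(R)}{\beta^2}. \]

Therefore,

\[ \Bigl| \int_{A_{n,\beta}} \bigl( |\nabla u_{p_n}|^{p_n - 2} - 1 \bigr) \nabla u_{p_n} \cdot \nabla\varphi \Bigr| \le C(R) \int_{A_{n,\beta}} \bigl( |\nabla u_{p_n}|^{p_n - 1} + |\nabla u_{p_n}| \bigr) \le C(R) \Bigl( \int_{B_R(0)} |\nabla u_{p_n}|^2 \Bigr)^{\frac{p_n - 1}{2}} \mu(A_{n,\beta})^{\frac{3 - p_n}{2}} \le C(R)\, \beta^{p_n - 3}. \tag{4.17} \]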

Since we may choose $\beta$ to be arbitrarily large, we deduce (4.14) from (4.16), (4.17), and (4.13) follows. Consequently, (4.3) follows from (4.12), (4.13). Finally, we need to identify the degree of $\tilde u$. Combining Lemma 4.1 with Lemma 4.2 we get that

\[ \limsup_{n\to\infty} \int_{\mathbb{R}^2} \bigl(1 - |u_{p_n}|^2\bigr)^2 \le 2\pi. \]

Since $u_{p_n} \to \tilde u$ in $L^4_{loc}(\mathbb{R}^2)$, we obtain

\[ \int_{\mathbb{R}^2} \bigl(1 - |\tilde u|^2\bigr)^2 \le 2\pi. \]

From the quantization result of [7] it follows that there are only two possibilities

\[ \int_{\mathbb{R}^2} \bigl(1 - |\tilde u|^2\bigr)^2 = 2\pi \text{ or } 0, \tag{4.18} \]

corresponding to the degrees $\pm 1$ or $0$, respectively. We now establish an improved regularity result for $\{u_p\}$. We make use of Theorem 4.1 in [9]. While it is not clearly stated there, it is possible to verify by examining the proof provided in [9] that all of the estimates in [9] are uniform in $p$ when $p \to 2^+$. It follows that there exist a $q > 2$ and a constant $C > 0$ such that

\[ \int_{B_1(y)} |\nabla u_{p_n}|^q \le C, \tag{4.19} \]

for each $n$ and for each disk $B_1(y)$ in $\mathbb{R}^2$. From (4.19) and Morrey's lemma we deduce that

\[ \bigl| u_{p_n}(x) - u_{p_n}(y) \bigr| \le C |x - y|^{1 - 2/q}, \qquad \forall x, y \in \mathbb{R}^2,\ \forall n \in \mathbb{N}, \tag{4.20} \]

i.e., the family $\{u_{p_n}\}$ is equicontinuous on $\mathbb{R}^2$. Therefore, $\tilde u(0) = 0$, the integral in (4.18) cannot vanish, and (4.4) follows. Finally, using the equicontinuity again, $\deg(\tilde u) = 1$. □

Combining (4.4) with (4.5) we obtain the following result.

Corollary 4.1. We have

\[ \lim_{p \to 2^+} (p-2)\, I_p(1) = 2\pi \quad\text{and}\quad \lim_{p \to 2^+} I_p(1) = \infty. \]


5. Existence of minimizers in $\mathbb{R}^2_+$

In this section we study the problem of existence of minimizers in $\mathbb{R}^2_+$ under the degree condition at infinity. In contrast with the case of the entire plane, here we are only able to prove the existence of minimizers of degree $\pm 1$ when $p$ is restricted to some right semi-neighborhood $(2, p_0)$ of $p = 2$. A major difference between the two cases is due to the different asymptotic behaviour of the energies when $p \to 2^+$. While in the $\mathbb{R}^2$-case the energy blows up in that limit, i.e., $\lim_{p \to 2^+} I_p(1) = +\infty$ (Corollary 4.1), in the $\mathbb{R}^2_+$-case the energy $I_p^+(1)$ remains bounded when $p \to 2^+$. The latter result is demonstrated in the following lemma.

Lemma 5.1. We have $\lim_{p \to 2^+} I_p^+(1) = 2\pi$.

Proof. Let

\[ u_\lambda(z) = \frac{z - \lambda i}{z + \lambda i}, \]

where $0 < \lambda \le 1/2$ and $z = x_1 + i x_2$. We obtain an upper bound for $\limsup_{p \to 2^+} I_p^+(1)$ by introducing a smooth test function satisfying

\[ \tilde u_\lambda(z) = \begin{cases} u_\lambda, & |z| \le 1, \\ 1, & |z| \ge 2, \end{cases} \qquad\text{and}\qquad |\nabla \tilde u_\lambda(z)| \le C\lambda, \quad 1 \le |z| \le 2, \]

for $\lambda < 1/2$. As $u_\lambda$ is a conformal mapping of $\mathbb{R}^2_+$ on $B_1(0)$, we have

\[ \int_{\mathbb{R}^2_+} |\nabla u_\lambda|^2 = 2\pi. \]

Hence,

\[ \lim_{\lambda \to 0} \int_{\mathbb{R}^2_+} |\nabla \tilde u_\lambda|^2 = 2\pi. \]
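The properties of the map $u_\lambda$ used above can be spot-checked with complex arithmetic. The sketch below is an illustration only, with hypothetical function names: it verifies that $u_\lambda$ vanishes at $\lambda i$, has modulus one on the real axis, has modulus below one in the upper half-plane, and satisfies $u_\lambda(z) - 1 = -2\lambda i/(z + \lambda i)$. The last identity is our own remark, showing why the correction needed to glue $u_\lambda$ to the constant $1$ on $1 \le |z| \le 2$ is of order $\lambda$.

```python
def u_lambda(z, lam):
    """Moebius map z -> (z - lam*i) / (z + lam*i) of the upper half-plane onto the unit disk."""
    return (z - lam * 1j) / (z + lam * 1j)


def distance_to_one(z, lam):
    """|u_lambda(z) - 1|, which equals 2*lam / |z + lam*i|."""
    return abs(u_lambda(z, lam) - 1.0)


def sample_moduli(lam, points):
    """Moduli |u_lambda(z)| at the given sample points."""
    return [abs(u_lambda(z, lam)) for z in points]
```

For instance, `sample_moduli(0.3, [2 + 0j, 1 + 2j, 0.3j])` returns values equal to one, below one, and zero, in that order, up to rounding.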

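As $1 - \tilde u_\lambda$ is compactly supported, we have

\[ \lim_{\lambda \to 0^+} \lim_{p \to 2^+} E_p(\tilde u_\lambda) = 2\pi, \]

from which we obtain that $\limsup_{p \to 2^+} I_p^+(1) \le 2\pi$.

Next, we prove the lower bound. Fix any $u \in E_1^+$ (see (1.4)) and for each $\beta \in (0,1)$ set $\Omega_\beta = \{ x \in \mathbb{R}^2_+ : |u(x)| < \beta \}$. Clearly,

\[ E_p(u) \ge \frac12 \int_{\Omega_\beta} \bigl(1 - |u|^2\bigr)^2 \ge \frac12 \bigl(1 - \beta^2\bigr)^2 \mu(\Omega_\beta), \]

and hence

\[ \mu(\Omega_\beta) \le \frac{2 E_p(u)}{(1 - \beta^2)^2}. \tag{5.1} \]

Consider any connected component $\omega$ of $\Omega_\beta$. If $\omega$ contains a point $x_0$ where $u(x_0) = 0$ then $B_r(x_0) \subset \omega$ for some $r > 0$, which depends only on the modulus of continuity of $u$. It follows in particular that the number of the components $\omega$ with $\deg(u, \partial\omega) \ne 0$ is finite. Denoting the union of these components by $A$, we obtain that the image of $A$ under $u$ is the disk $B_\beta(0)$, hence

\[ \int_{\Omega_\beta} |\nabla u|^2 \ge \int_A |\nabla u|^2 \ge 2 \int_A |u_{x_1} \wedge u_{x_2}| \ge 2\pi\beta^2. \tag{5.2} \]

The Hölder inequality implies that

\[ \int_{\Omega_\beta} |\nabla u|^p \ge \frac{\bigl( \int_{\Omega_\beta} |\nabla u|^2 \bigr)^{p/2}}{\mu(\Omega_\beta)^{(p-2)/2}}. \tag{5.3} \]

Combining (5.1), (5.2), and (5.3) we obtain

\[ E_p(u) \ge \int_{\Omega_\beta} |\nabla u|^p \ge \frac{(2\pi)^{p/2} (1 - \beta^2)^{p-2} \beta^p}{(2 E_p(u))^{(p-2)/2}}. \]

Consequently,

\[ E_p(u) \ge 2^{\frac{2}{p}}\, \pi \beta^2 \bigl(1 - \beta^2\bigr)^{\frac{2(p-2)}{p}}. \]

Letting $p \to 2$, we obtain

\[ \lim_{p \to 2^+} E_p(u) \ge 2\pi\beta^2, \qquad \forall \beta < 1, \]

and the desired lower bound follows. □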
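The algebra leading from the implicit inequality to the explicit lower bound can be verified numerically. This is a sketch with hypothetical function names: `implicit_rhs` is the right-hand side $(2\pi)^{p/2}(1-\beta^2)^{p-2}\beta^p/(2E)^{(p-2)/2}$ as a function of a trial energy value $E$, and `explicit_bound` is $2^{2/p}\pi\beta^2(1-\beta^2)^{2(p-2)/p}$. The explicit bound is exactly the value of $E$ at which the implicit inequality becomes an equality, and it tends to $2\pi\beta^2$ as $p \to 2^+$.

```python
import math


def implicit_rhs(energy, p, beta):
    """(2*pi)^(p/2) * (1 - beta^2)^(p-2) * beta^p / (2*energy)^((p-2)/2)."""
    numerator = (2.0 * math.pi) ** (p / 2.0) * (1.0 - beta ** 2) ** (p - 2.0) * beta ** p
    return numerator / (2.0 * energy) ** ((p - 2.0) / 2.0)


def explicit_bound(p, beta):
    """2^(2/p) * pi * beta^2 * (1 - beta^2)^(2*(p-2)/p)."""
    return 2.0 ** (2.0 / p) * math.pi * beta ** 2 * (1.0 - beta ** 2) ** (2.0 * (p - 2.0) / p)


def limit_value(beta):
    """Value 2*pi*beta^2 approached by the explicit bound as p -> 2+."""
    return 2.0 * math.pi * beta ** 2
```

Any $E$ satisfying $E \ge$ `implicit_rhs(E, p, beta)` must satisfy $E \ge$ `explicit_bound(p, beta)`, because the right-hand side is decreasing in $E$.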

Proof of Theorem 2. By Corollary 4.1 and Lemma 5.1 there exists a $p_0 > 2$ such that

\[ I_p^+(1) < I_p(1), \qquad \forall p \in (2, p_0). \tag{5.4} \]

Next, we show that the theorem holds with this value of $p_0$; thus we assume in the sequel that $p \in (2, p_0)$. As in the proof of Theorem 1, we consider a minimizing sequence $\{u_n\}$ for $I_p^+(1)$. Our argument is very similar to the one used in the proof of Theorem 1, with the only new difficulty related to the possibility of a "vortex" whose distance to $\partial\mathbb{R}^2_+$ goes to infinity with $n$. Eq. (5.4) is needed precisely in order to exclude this possibility. As in (3.2) we set for each $n$

\[ S_n = \bigl\{ x \in \mathbb{R}^2_+ : |u_n(x)| \le \tfrac12 \bigr\}. \]


With $\lambda_0$ defined as in (3.1), we can find (along the lines of the proof of Theorem 1) a collection of mutually disjoint disks $\{B_{\lambda_0/5}(x_{j,n})\}_{j=1}^{m}$ such that

\[ \{x_{j,n}\}_{j=1}^{m} \subset S_n \quad\text{and}\quad S_n \subset \bigcup_{j=1}^{m} B_{\lambda_0}(x_{j,n}), \]

where $m$ is independent of $n$ (upon passing to a further subsequence, if necessary). In what follows, the coordinates of $x_{j,n}$ are denoted by $(x_{j,n})_1$ and $(x_{j,n})_2$. Note that $B_{\lambda_0}(x_{j,n}) \subset \mathbb{R}^2_+$ because of (3.1) and the boundary condition (1.3). Next, we divide the index set $J = \{1, \dots, m\}$ into $K \ge 1$ disjoint subsets $J_1, \dots, J_K$ so that the distance $|x_{j_1,n} - x_{j_2,n}|$ remains bounded as $n$ goes to $\infty$ if and only if $j_1$ and $j_2$ belong to the same $J_i$ (cf. Theorem 1). Now we can subdivide the index set $\{1, \dots, K\}$ into two disjoint subsets:

\[ \mathcal{K}_1 = \bigl\{ k : (x_{j,n})_2 \to \infty,\ j \in J_k \bigr\}, \qquad \mathcal{K}_2 = \bigl\{ k : \{(x_{j,n})_2\}_{n=1}^{\infty} \text{ is bounded},\ j \in J_k \bigr\}. \]

Note that one of the sets $\mathcal{K}_1$, $\mathcal{K}_2$ may be empty. By passing to a subsequence we may further assume that $\lim_{n\to\infty} (x_{j,n})_2$ exists for every $k \in \mathcal{K}_2$ and $j \in J_k$. For each $k \in \{1, \dots, K\}$ we fix an arbitrary $j_k \in J_k$ and define

\[ v_n^{(k)}(x) = u_n(x + x_{j_k,n}) \quad \text{on } A_{j_k,n} := \mathbb{R}^2_+ - x_{j_k,n}. \]

Consider first the case $k \in \mathcal{K}_1$. Then, the limit of the sets $\{A_{j_k,n}\}_{n \ge 1}$, as $n \to \infty$, is $\mathbb{R}^2$. Further, there exists an $R > 0$ such that, for every $r \ge R$, the degree $d_{k,n} = \deg(v_n^{(k)}, \partial B_r(0))$ does not depend on $r$ and $n$ and may be denoted by $d_k$. This statement follows (for a subsequence) from the equicontinuity of $\{v_n^{(k)}\}_{n=1}^{\infty}$ on $\partial B_R(0)$. Passing to a further subsequence, we obtain that $v_n^{(k)} \to v_k$ in $C_{loc}(\mathbb{R}^2, \mathbb{R}^2)$, and weakly in $W^{1,p}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$, where the map $v_k \in E_{d_k}$. Clearly

\[ \sup_{R>0} \lim_{n\to\infty} E_p\bigl(v_n^{(k)}; B_R(0)\bigr) \ge \sup_{R>0} E_p\bigl(v_k; B_R(0)\bigr) = E_p(v_k) \ge \min\Bigl\{ I_p(d_k),\ \frac{\pi\lambda_0^2}{32} \Bigr\}, \tag{5.5} \]

here

\[ E_p(u; D) := \int_D |\nabla u|^p + \frac12 \bigl(1 - |u|^2\bigr)^2, \]

for every $D \subset \mathbb{R}^2$. When $k \in \mathcal{K}_2$ the limit of $\{A_{j_k,n}\}_{n \ge 1}$, as $n \to \infty$, is the half-plane

\[ H_t = \bigl\{ y \in \mathbb{R}^2 : y_2 > t \bigr\} \quad \text{with } t = -\lim_{n\to\infty} (x_{j,n})_2. \]

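Similar to the previous case, for $r \ge R$, we find that

\[ d_{n,k} = \deg\bigl( v_n^{(k)}, \partial( B_r(0) \cap A_{j_k,n} ) \bigr) = d_k \]

is independent of $r$ and $n$. As in (5.5) we obtain

\[ \sup_{R>0} \lim_{n\to\infty} E_p\bigl(v_n^{(k)}; B_R(0) \cap A_{j_k,n}\bigr) \ge \sup_{R>0} E_p\bigl(v_k; B_R(0) \cap A_{j_k,n}\bigr) \ge \min\Bigl\{ I_p^+(d_k),\ \frac{\pi\lambda_0^2}{32} \Bigr\}. \tag{5.6} \]

Obviously, by construction,

\[ \sum_{k=1}^{K} d_k = 1. \tag{5.7} \]

Using (5.5), (5.6) we deduce that

\[ I_p^+(1) = \lim_{n\to\infty} E_p(u_n) \ge \sum_{k \in \mathcal{K}_1} \min\Bigl\{ I_p(d_k),\ \frac{\pi\lambda_0^2}{32} \Bigr\} + \sum_{k \in \mathcal{K}_2} \min\Bigl\{ I_p^+(d_k),\ \frac{\pi\lambda_0^2}{32} \Bigr\}. \tag{5.8} \]

By Proposition 1 and (5.4) we have

\[ I_p(d) \ge I_p(1) > I_p^+(1), \qquad \forall d \ne 0, \]

which together with (5.8), (5.7), and Proposition 2 gives $\mathcal{K}_1 = \emptyset$ and $\mathcal{K}_2 = \{k_0\}$. Furthermore, it follows that $d_{k_0} = 1$. Choosing $j_0 \in J_{k_0}$ and defining a new sequence by $\tilde u_n(x) = u_n\bigl(x + (x_{j_0,n})_1\bigr)$, $n \ge 1$, we conclude (again, after passing to a subsequence) that

\[ \tilde u_n \to u \ \text{ in } C_{loc}\bigl(\mathbb{R}^2_+, \mathbb{R}^2\bigr) \quad\text{and}\quad \tilde u_n \rightharpoonup u \ \text{ weakly in } W^{1,p}_{loc}\bigl(\mathbb{R}^2_+, \mathbb{R}^2\bigr). \]

It follows that $u \in E_1^+$ and $E_p(u) = I_p^+(1)$. □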

We conclude this section by providing an upper bound for the distance of the zeros of a minimizer from $\partial\mathbb{R}^2_+$.

Proposition 4. For $p \in (2, p_0)$, let $v_p$ denote a minimizer realizing the minimum in (1.5). Let $v_p(x_p) = 0$ and assume without loss of generality that $x_p = (0, r_p)$. Then, there exists a positive constant $C$ such that

\[ r_p < C (p-2)^{1/2}, \qquad \forall p \in (2, p_0). \tag{5.9} \]


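Proof. For each $p \in (2, p_0)$ we set $\tilde r_p = \min(r_p, 1)$ and define a rescaled map $\tilde v_p(x)$ on $B_1(0)$ by $\tilde v_p(x) = v_p(\tilde r_p x + x_p)$. From the identity

\[ \tilde r_p^{\,2-p} \int_{B_1(0)} |\nabla \tilde v_p|^p + \tilde r_p^{\,2} \int_{B_1(0)} \frac12 \bigl(1 - |\tilde v_p|^2\bigr)^2 = E_p\bigl(v_p; B_{\tilde r_p}(x_p)\bigr), \]

it follows that $\tilde v_p$ is a minimizer for the energy

\[ \tilde E_p(v) = \int_{B_1(0)} |\nabla v|^p + \tilde r_p^{\,p} \int_{B_1(0)} \frac12 \bigl(1 - |v|^2\bigr)^2, \]

over the maps $v \in W^{1,p}(B_1(0), \mathbb{R}^2)$ satisfying $v = \tilde v_p$ on $\partial B_1(0)$. By Lemma 5.1 we have

\[ \int_{B_1(0)} |\nabla \tilde v_p|^p = \tilde r_p^{\,p-2} \int_{B_{\tilde r_p}(x_p)} |\nabla v_p|^p \le \int_{B_{\tilde r_p}(x_p)} |\nabla v_p|^p \le C, \]

so we can again apply the same method as in the proof of the Giaquinta–Giusti regularity result from [9] in order to deduce a uniform bound for the Hölder semi-norm $[\tilde v_p]_{C^\beta(B_{1/2}(0))} \le c_0$, with $\beta = 1 - 2/q$ for some $q > 2$. Rescaling back we get

\[ [v_p]_{C^\beta(B_{\tilde r_p/2}(x_p))} \le \frac{c_0}{\tilde r_p^{\,\beta}}. \tag{5.10} \]

It follows from (5.10) that

\[ |v_p(x)| \le \frac{c_0}{\tilde r_p^{\,\beta}}\, |x - x_p|^\beta, \qquad x \in B_{\tilde r_p/2}(x_p), \]

and we deduce easily that

\[ \int_{\mathbb{R}^2_+} \bigl(1 - |v_p|^2\bigr)^2 \ge \int_{B_{\tilde r_p/2}(x_p)} \bigl(1 - |v_p|^2\bigr)^2 \ge c_1 \tilde r_p^{\,2}, \tag{5.11} \]

for some positive constant $c_1$. Finally, we note that the Pohozaev identity (4.5) also holds for minimizers on $\mathbb{R}^2_+$, i.e.,

\[ \int_{\mathbb{R}^2_+} \bigl(1 - |v_p|^2\bigr)^2 = \frac{2(p-2)}{p}\, I_p^+(1). \tag{5.12} \]

Combining (5.11), (5.12) with Lemma 5.1 yields (5.9). □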

Acknowledgments

The research of the authors was supported by the following resources: Y.A. by NSF grant DMS-0604467, L.B. by NSF grant DMS-0708324, D.G. by NSF grant DMS-0407361 and I.S. by the Steigman Research Fund.

References

[1] L. Berlyand, D. Golovaty, On uniqueness of vector-valued minimizers of the Ginzburg–Landau functional in annular domains, Calc. Var. Partial Differential Equations 14 (2002) 213–232.
[2] L. Berlyand, P. Mironescu, Ginzburg–Landau minimizers with prescribed degrees: Dependence on domain, C. R. Math. Acad. Sci. Paris 337 (2003) 375–380.
[3] L. Berlyand, P. Mironescu, Ginzburg–Landau minimizers with prescribed degrees. Capacity of the domain and emergence of vortices, J. Funct. Anal. 239 (2006) 76–99.
[4] L. Berlyand, K. Voss, Symmetry breaking in annular domains for a Ginzburg–Landau superconductivity model, in: Proceedings of IUTAM 99/4 Symposium, Sydney, Australia, Kluwer Acad. Publ., Dordrecht, 1999.
[5] F. Bethuel, H. Brezis, F. Hélein, Ginzburg–Landau Vortices, Birkhäuser, Basel, 2004.
[6] H. Brezis, Analyse fonctionnelle. Théorie et applications, Collect. Maîtrise Math. Appl., Masson, Paris, 1983.
[7] H. Brezis, F. Merle, T. Rivière, Quantization effects for $-\Delta u = u(1 - |u|^2)$ in $\mathbb{R}^2$, Arch. Ration. Mech. Anal. 126 (1994) 35–58.
[8] L.C. Evans, Partial Differential Equations, first ed., Amer. Math. Soc., Providence, RI, 1998.
[9] M. Giaquinta, E. Giusti, On the regularity of the minima of variational integrals, Acta Math. 148 (1982) 31–46.
[10] P. Mironescu, Les minimiseurs locaux pour l'équation de Ginzburg–Landau sont à symétrie radiale, C. R. Acad. Sci. Paris Sér. I Math. 323 (1996) 593–598.
[11] C.B. Morrey Jr., Multiple Integrals in the Calculus of Variations, Grundlehren Math. Wiss., vol. 130, Springer-Verlag, New York, 1966.
[12] E. Sandier, Locally minimising solutions of $-\Delta u = u(1 - |u|^2)$ in $\mathbb{R}^2$, Proc. Roy. Soc. Edinburgh Sect. A 128 (1998) 349–358.


Finite rank Toeplitz operators: Some extensions of D. Luecking's theorem

Alexey Alexandrov a, Grigori Rozenblum b,c,∗

a Petersburg Department of Steklov Institute of Mathematics, Russian Academy of Sciences, 27, Fontanka, St. Petersburg 191023, Russia
b Department of Mathematics, Chalmers University of Technology, S-412 96 Gothenburg, Sweden
c Department of Mathematics, University of Gothenburg, S-412 96 Gothenburg, Sweden

Received 12 June 2008; accepted 12 November 2008
Available online 28 November 2008
Communicated by L. Gross

Abstract

The recent theorem by D. Luecking about finite rank Bergman–Toeplitz operators is extended to weights being distributions with compact support and to the spaces of harmonic functions. © 2008 Elsevier Inc. All rights reserved.

Keywords: Bergman spaces; Bargmann spaces; Toeplitz operators

1. Introduction and the main result

Toeplitz operators play an important role in many branches of analysis. A significant recent development in the theory of such operators is related to the proof, given by D. Luecking [7], of the finite rank conjecture. Let $\mathcal{B}^2$ be the Bergman space of $L^2$-functions analytical in a bounded domain $\Omega \subset \mathbb{C}^1$ with Lebesgue measure and $P$ be the orthogonal projection in $L^2(\Omega)$ onto $\mathcal{B}^2$. For a regular complex Borel measure $\mu$ with compact support, the Toeplitz operator with weight $\mu$,

\[ u \mapsto T_\mu u = P u\mu, \qquad u \in \mathcal{B}^2, \tag{1.1} \]

* Corresponding author at: Department of Mathematics, Chalmers University of Technology, S-412 96 Gothenburg, Sweden. E-mail addresses: [email protected] (A. Alexandrov), [email protected] (G. Rozenblum).



A. Alexandrov, G. Rozenblum / Journal of Functional Analysis 256 (2009) 2291–2303

can be correctly defined. According to the finite rank conjecture, if $T_\mu$ has finite rank then the measure is a finite combination of point masses, exactly as many as the rank. The conjecture can be naturally extended to not necessarily bounded domains with a rather wide class of measures. The nontrivial history of this conjecture is described in [7,10]. Immediately after the preprint containing the proof appeared, work began on extending and applying this result. On the one hand, the theorem by Luecking was extended to the multi-dimensional case, see [1,10] (by different methods). On the other hand, interesting applications to the theory of Toeplitz operators appeared, see [2,3,5,6], as well as to Function Theory, see [1]. The finite rank result turns out to be useful also in Mathematical Physics, more exactly, in the spectral analysis of the perturbed Landau Hamiltonian, see [9], as well as the discussion and further references in [10].

A number of natural questions arise around Luecking's theorem. First, it is interesting to find out whether the finite rank property still holds when the analytical Bergman space is replaced by some other space of smooth functions, also closed in $L^2$. In [1] such a generalization was found for the space of $n$-harmonic functions in a domain in $\mathbb{C}^n$, and in [5] the finite rank property was, in the complex dimension 1, extended to the $L^2$-closed span of certain, not too sparse, sets of monomials $z^{n_k}$, $n_k \in \mathbb{Z}_+$. At the same time, for the problems arising in Mathematical Physics, it is important to generalize the results to the case when the weight measure $\mu$ is replaced by a distribution with compact support.

In the present paper we deal with these questions. First, in the complex dimension 1, for the analytical Bergman space, we describe the procedure of reducing the finite rank problem for a distribution to the same problem for an absolutely continuous measure $\mu$, which is already taken care of.
Thus, the finite rank problem finds its solution also for distributional weights. We note that the reduction above seems to be necessary. The initial proof with measure weight was critically based upon a lemma on the density of symmetric polynomials of a special form in the space of symmetric continuous functions of many variables, proved by an ingenious use of the Stone–Weierstrass theorem. The distributional case requires a similar density result in the space of differentiable functions, where no proper analogue of the Stone–Weierstrass theorem exists. Moreover, the density result itself turns out to be false for differentiable functions. We present an example demonstrating this. Therefore our approach seems to be at the moment the only one able to treat the distributional case.

The results on the finite rank problem for distributional weights are further extended to the multi-dimensional case. We use a modification of the induction on dimension presented in [10]. It seems that the approach to proving the multi-dimensional Luecking theorem proposed in [1], using a Stone–Weierstrass argument, would not work for distributions, for the reasons given above.

Finally, we consider the finite rank problem for the Bergman space of harmonic functions. The result follows immediately from the one in the analytical case in an even dimension, since the space of harmonic functions contains the space of $n$-harmonic functions, where the finite rank property is an obvious consequence of the one in the analytical case, see [1]. Quite different is the situation in an odd dimension ($\ge 3$), where no direct coupling of harmonic functions to analytical ones exists. Here we are able to handle only the case of a measure acting as weight, using a sort of dimension-reduction argument and some Harmonic Analysis technique. We also give an example, not disproving the finite rank conjecture directly, but just hinting that the situation here with distributions might be considerably more delicate than the one with measures.

The results of the paper were obtained when the first author enjoyed the hospitality of the Department of Mathematics of Chalmers University of Technology in Gothenburg, Sweden, supported by the grant from the Swedish Royal Academy of Sciences, for which he expresses his gratitude.



2. Setting

Let $\tau$ be a positive measure in a domain $\Omega \subset \mathbb{C}^d$ such that $0 < \int_{\mathbb{C}^d} |P|\, d\tau < +\infty$ for every polynomial $P$ of the complex variables $(z_1, \dots, z_d)$, $P \not\equiv 0$. We consider the space $L^2(\Omega, \tau)$ and the subspace $A(\Omega, \tau) \subset L^2(\Omega, \tau)$ consisting of analytical functions. It is a closed subspace, and we denote by $P_A(\Omega, \tau)$ the orthogonal projection onto $A(\Omega, \tau)$. Further on, as soon as the domain and the measure are fixed, we suppress them in the notations. The typical examples here are the Bergman spaces, for the case of a bounded $\Omega$ with (say) Lebesgue measure, and the Fock–Bargmann spaces for $\Omega = \mathbb{C}^d$, $\tau$ being the Gaussian measure. The projection $P_A$ is an integral operator with the reproducing kernel $P(z, w)$, infinitely smooth, analytical in $z$ and anti-analytical in $w$ in the domain $\Omega$.

Let $F$ be a distribution with compact support in $\Omega$, $F \in \mathcal{E}'(\Omega)$. We denote by $\langle F, \varphi \rangle$ the action of the distribution $F$ on the function $\varphi \in C^{\infty}(\Omega)$. Then, for $u \in A(\Omega, \tau)$, the expression

\[ (T_F u)(z) = \langle F, P(z, \cdot) u(\cdot) \rangle, \qquad z \in \Omega, \tag{2.1} \]

defines an analytical function $(T_F u)(z) \in A(\Omega, \tau)$. The corresponding operator $u \mapsto T_F u$ is a natural generalization of the Toeplitz operator $u \mapsto P F u$, $u \in A(\Omega, \tau)$, for the case when $F$ is a bounded measurable function with compact support in $\Omega$. The operator $T_F$ is bounded in $A$. Its sesquilinear form can be described as

\[ (T_F u, v) = \langle F, u \bar v \rangle, \qquad u, v \in A. \tag{2.2} \]

In the special case when the distribution $F$ is, in fact, a complex Borel measure $\mu$ with compact support in $\Omega$, the operator $T_F$ can be described as

\[ (T_F u)(z) = \int_\Omega P(z, w) u(w)\, d\mu(w), \tag{2.3} \]

and the sesquilinear form is given by

\[ (T_F u, v) = \int u \bar v\, d\mu. \tag{2.4} \]

Suppose that the operator $T_F$ has finite rank, $\operatorname{rank}(T_F) = m < \infty$. This means, in particular, that for any, finite or infinite, system of functions $f_\alpha \in A$, the system of functions $g_\alpha = T_F f_\alpha$ is linearly dependent and $\operatorname{rank}\{g_\alpha\} \le m$. This is correct, in particular, if we take as $f_\alpha$ the system of polynomials $f_\alpha = z^\alpha$, $\alpha = (\alpha_1, \dots, \alpha_d) \in (\mathbb{Z}_+)^d$. Therefore the infinite matrix

\[ A_F = (a_{\alpha\beta}), \qquad a_{\alpha\beta} = \bigl(T_F z^\alpha, z^\beta\bigr) = \langle F, z^\alpha \bar z^\beta \rangle \tag{2.5} \]

has finite rank, $\operatorname{rank}(A_F) \le m$. It is important that the matrix $A_F$ does not depend on the domain $\Omega$ or the measure $\tau$, but it depends only on the distribution $F$. Of course, the rank of $A_F$ does not change if we make a unitary transformation of $\mathbb{C}^d$ with corresponding change of complex coordinates. We notice also, following [10], that if $g$ is a function analytical and bounded in some polydisk neighborhood of $\operatorname{supp} F$ and $F_g$ is the distribution $|g|^2 F$ then $\operatorname{rank} A_{F_g} \le \operatorname{rank} A_F$. To show this,

we consider first a polynomial $g_l$ of degree $l$. The matrix $A_{F_{g_l}}$ is obtained by building linear combinations of rows and columns of $A_F$, therefore the rank does not increase, $\operatorname{rank} A_{F_{g_l}} \le \operatorname{rank} A_F$. We pass to a general analytical function $g$ using approximations by Taylor polynomials, convergent, together with all derivatives, uniformly on any compact in the polydisk.

In a similar way, we consider Toeplitz operators in spaces of harmonic functions. Supposing for simplicity that the measure $\tau$ is absolutely continuous with respect to Lebesgue measure, with bounded positive density, we denote by $\mathcal{H}(\Omega, \tau)$ the subspace in $L^2(\Omega, \tau)$ consisting of harmonic functions in a domain $\Omega \subset \mathbb{R}^d$ and by $Q$ the orthogonal projection $Q : L^2(\Omega, \tau) \to \mathcal{H}(\Omega, \tau)$; this projection is an integral operator with kernel $Q(x, y)$, $x, y \in \Omega$, the kernel being a harmonic function in each variable $x$ and $y$. With a distribution $F$ having compact support in $\Omega$ we associate, similarly to (2.1), the Toeplitz operator $T_F^{\mathcal{H}} : u \mapsto T_F^{\mathcal{H}} u$, $T_F^{\mathcal{H}} u(x) = \langle F, Q(x, \cdot) u(\cdot) \rangle$. The expression for the action of the operator for the case when $F$ is a Borel measure and the expressions for the sesquilinear form are analogous to (2.3), (2.2), (2.4). Similar to the case of analytical functions, we associate with the distribution $F$ the matrix $H_F$, with entries being $\langle F, f_\alpha f_\beta \rangle$, where $f_\alpha$ is some system of harmonic polynomials in $\mathbb{R}^d$. Again, the rank of the infinite matrix $H_F$ does not exceed the rank of the operator $T_F^{\mathcal{H}}$. We, however, may not include, as we have done for analytical functions, the multiplicative functional parameter $g$, since harmonic functions do not possess a multiplicative structure.

3. Finite rank operators in dimension 1

The aim of this section is to give a proof of the following result generalizing the Luecking theorem.

Theorem 3.1. Let $F$ be a distribution with compact support in the domain $\Omega \subset \mathbb{C}^1$. Suppose that the operator $T_F$ has finite rank $m$. Then there exist finitely many points $z_q \in \Omega$, $q = 1, \dots, m_0$, $m_0 \le m$, and differential operators $L_q = L_q(\partial_x, \partial_y)$, $q = 1, \dots, m_0$, such that $F = \sum_{q=1}^{m_0} L_q \delta(z - z_q)$.

We start with some observations about distributions in $\mathcal{E}'(\mathbb{C})$. For such a distribution we denote by $\operatorname{psupp} F$ the complement of the unbounded component of the complement of $\operatorname{supp} F$.

Lemma 3.2. Let $F \in \mathcal{E}'(\mathbb{C})$. Then the following two statements are equivalent:

(a) there exists a distribution $G \in \mathcal{E}'(\mathbb{C})$ such that $\frac{\partial G}{\partial \bar z} = F$, moreover $\operatorname{supp} G \subset \operatorname{psupp} F$;
(b) $F$ is orthogonal to all polynomials of $z$ variable, i.e. $\langle F, z^k \rangle = 0$ for all $k \in \mathbb{Z}_+$.

Proof. The implication (a) $\Rightarrow$ (b) follows from the relation

\[ \langle F, z^k \rangle = \Bigl\langle \frac{\partial G}{\partial \bar z}, z^k \Bigr\rangle = -\Bigl\langle G, \frac{\partial z^k}{\partial \bar z} \Bigr\rangle = 0. \tag{3.1} \]

We prove that (b) $\Rightarrow$ (a). Put $G := F * \frac{1}{\pi z} \in \mathcal{S}'(\mathbb{C})$, the convolution being well-defined since $F$ has compact support. Since $\frac{1}{\pi z}$ is the fundamental solution of the Cauchy–Riemann operator $\frac{\partial}{\partial \bar z}$, we have $\frac{\partial G}{\partial \bar z} = F$ (cf., for example, [4, Theorem 1.2.2]). By the ellipticity of the Cauchy–Riemann operator, $\operatorname{singsupp} G \subset \operatorname{singsupp} F \subset \operatorname{supp} F$; in particular, this means that $G$ is a smooth function outside $\operatorname{psupp} F$, moreover, $G$ is analytic outside $\operatorname{psupp} F$ (by $\operatorname{singsupp} F$ we denote the singular support of the distribution $F$, see, e.g., [4], the complement of the largest open set where $F$ coincides with a smooth function). Additionally,

\[ G(z) = \Bigl\langle F, \frac{1}{\pi(z - w)} \Bigr\rangle = \pi^{-1} \sum_{k=0}^{\infty} z^{-k-1} \langle F, w^k \rangle = 0 \]

if $|z| > R$ and $R$ is sufficiently large. By analyticity this implies $G(z) = 0$ for all $z$ outside $\operatorname{psupp} F$. □

Proof of Theorem 3.1. The distribution in question $F$, as any distribution with compact support, is of finite order, therefore it belongs to some Sobolev space, $F \in H^s$ for certain $s \in \mathbb{R}^1$. If $s \ge 0$, $F$ is a function and must be zero by Luecking's theorem. So, suppose that $s < 0$. Consider the first $m + 1$ columns in the matrix $A_F$, i.e.

\[ a_{kl} = \bigl(T_F z^k, z^l\bigr) = \langle F, z^k \bar z^l \rangle, \qquad l = 0, \dots, m;\ k = 0, 1, \dots. \tag{3.2} \]
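The linear dependence of the first $m+1$ columns can be made concrete in the simplest situation, which the theorem itself does not need since Luecking's result already covers it: a sketch assuming $F = \sum_q w_q \delta_{z_q}$ is a finite sum of $m$ weighted point masses. In that case $a_{kl} = \sum_q w_q z_q^k \bar z_q^l$, and the coefficients $c_l$ of the polynomial $h(w) = \prod_q (w - \bar z_q)$ annihilate every row, because $\sum_l a_{kl} c_l = \sum_q w_q z_q^k h(\bar z_q) = 0$. All function names below are hypothetical.

```python
def moment_matrix(points, weights, size):
    """Entries a[k][l] = sum_q w_q * z_q**k * conj(z_q)**l for k, l < size."""
    return [
        [sum(w * z ** k * z.conjugate() ** l for z, w in zip(points, weights))
         for l in range(size)]
        for k in range(size)
    ]


def annihilating_coefficients(points):
    """Coefficients c_0, ..., c_m (lowest degree first) of prod_q (w - conj(z_q))."""
    coeffs = [1 + 0j]
    for z in points:
        root = z.conjugate()
        longer = [0j] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            longer[i] += -root * c   # contribution of the constant factor -root
            longer[i + 1] += c       # contribution of the factor w
        coeffs = longer
    return coeffs


def column_combination(matrix, coeffs):
    """For each row k, the sum over l of a[k][l] * c_l using the first len(coeffs) columns."""
    return [sum(row[l] * coeffs[l] for l in range(len(coeffs))) for row in matrix]
```

For three points the list returned by `annihilating_coefficients` has four entries with leading coefficient one, so it gives a nontrivial dependence among the first four columns, in line with the role played by $h_1(\bar z)$ in the proof.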

Since the rank of the matrix $A_F$ is not greater than $m$, the columns are linearly dependent; in other words, there exist coefficients $c_0, \dots, c_m$, not all zero, such that $\sum_{l=0}^{m} a_{kl} c_l = 0$ for any $k \ge 0$. This relation can be written as

\[ \langle F, z^k h_1(\bar z) \rangle = \langle h_1(\bar z) F, z^k \rangle = 0, \qquad h_1(\bar z) = \sum_{l=0}^{m} c_l \bar z^l. \tag{3.3} \]

Therefore the distribution $h_1(\bar z) F \in H^s$ satisfies the conditions of Lemma 3.2 and hence there exists a compactly supported distribution $F^{(1)}$ such that $\frac{\partial F^{(1)}}{\partial \bar z} = h_1 F$. By the ellipticity of the Cauchy–Riemann operator, the distribution $F^{(1)}$ is less singular than $F$, $F^{(1)} \in H^{s+1}$. At the same time,

\[ \langle F^{(1)}, z^k \bar z^l \rangle = (l+1)^{-1} \Bigl\langle F^{(1)}, \frac{\partial (z^k \bar z^{l+1})}{\partial \bar z} \Bigr\rangle = -(l+1)^{-1} \langle h_1(\bar z) F, z^k \bar z^{l+1} \rangle = -(l+1)^{-1} \langle F, z^k \bar z^{l+1} h_1(\bar z) \rangle, \tag{3.4} \]

and therefore the rank of the matrix $A_{F^{(1)}}$ does not exceed the rank of the matrix $A_F$. We repeat this procedure sufficiently many (say, $N = [-s] + 1$) times and arrive at the distribution $F^{(N)}$ in $L^2$, for which the corresponding matrix $A_{F^{(N)}}$ has finite rank. By Luecking's theorem, this may happen only if $F^{(N)} = 0$.

Now we go back to the initial distribution $F$. Since, by our construction, $\frac{\partial F^{(N)}}{\partial \bar z} = h_N(\bar z) F^{(N-1)}$, we have that $h_N(\bar z) F^{(N-1)} = 0$ and therefore $\operatorname{supp} F^{(N-1)}$ is a subset of the set of zeroes of the polynomial $h_N(\bar z)$. On the next step, since $\frac{\partial F^{(N-1)}}{\partial \bar z} = h_{N-1}(\bar z) F^{(N-2)}$, we obtain that $\operatorname{supp} F^{(N-2)}$ lies in the union of sets of zeroes of polynomials $h_{N-1}(\bar z)$ and $h_N(\bar z)$. After having gone all the way back to $F$, we obtain that its support is a finite set of points lying in the union of zero sets of polynomials $h_j$. A distribution with such support must be a linear combination of $\delta$-distributions in these points and their derivatives, $F = \sum_q L_q \delta(z - z_q)$, where $L_q = L_q(D)$ is some differential operator.

Finally, to show that the number of points $z_q$ does not exceed $m$, we construct for each of them the interpolating polynomial $f_q(z)$ such that $L_q(-D) |f_q|^2 \ne 0$ at the point $z_q$ while at the points $z_{q'}$, $q' \ne q$, the polynomial $f_q$ has zero of sufficiently high order, higher than the order of $L_{q'}$, so that $L_{q'}(f_q g)(z_{q'}) = 0$ for any smooth function $g$. With such choice of polynomials, the matrix with entries $\langle F, f_q \bar f_{q'} \rangle$ is the diagonal



matrix with nonzero entries on the diagonal, and therefore its size (which equals the number of the points $z_q$) cannot be greater than the rank of the whole matrix $A_F$, i.e., cannot be greater than $m$. □

We note here that an attempt to extend the original proof of Luecking's theorem to the distributional case would probably meet certain complications. Let us recall the crucial place in [7]. A matrix of the type (2.5) is also considered there, with a measure $\mu$ in place of the distribution $F$. Then, for a given $N$, the measure $\mu_N = \otimes^N \mu$ on $\mathbb{C}^N$ is introduced, and Lemma 5.1 is established, stating that if the Toeplitz operator $T_\mu$ has rank smaller than $N$, then for all symmetric polynomials $H_1(Z)$, $H_2(Z)$ of the multi-dimensional complex variable $Z = (z_1, z_2, \dots, z_N) \in \mathbb{C}^N$,

$$\int H_1(Z)\overline{H_2(Z)}\,|V(Z)|^2\, d\mu_N = 0, \tag{3.5}$$

where $V(Z)$ is the Vandermonde function, $V(Z) = \prod_{i<j}(z_i - z_j)$. To derive the finite rank result from Lemma 5.1, the following property is needed: the algebra generated by the functions of the form $H_1(Z)\overline{H_2(Z)}$ is dense (in the sense of uniform convergence on compacts) in the space of symmetric continuous functions. This latter property is proved in [7] by an ingenious reduction to the Stone–Weierstrass theorem. Now, if $\mu = F$ is a distribution that is not a measure, the analogue of the reasoning in [7] would require a similar density property, not in the sense of uniform convergence on compacts but in a stronger sense: uniform convergence together with derivatives up to some fixed order (depending on the order of the distribution $F$). The Stone–Weierstrass theorem seems not to help here since it deals with uniform convergence only. Moreover, the required more general density statement itself is wrong, which follows from the construction below.

Proposition 3.3. The algebra generated by the functions having the form $H_1(Z)\overline{H_2(Z)}$, where $H_1$, $H_2$ are symmetric polynomials of the variables $Z = (z_1, \dots, z_N)$, is not dense in the sense of uniform $C^l$-convergence on compact sets in the space of $C^l$-differentiable symmetric functions, as long as $l \ge N(N-1)$.

Proof. We introduce the notations $D_j = \partial/\partial z_j$, $\bar D_j = \partial/\partial \bar z_j$. Consider the differential operator $V(D) =$ […]

[…] $\varepsilon > 0$, sufficiently small, so that the $2\varepsilon$-neighborhoods of $\zeta_q(1)$ are disjoint, and consider the functions $\varphi_q(z') \in C^\infty(\mathbb{C}^{d-1})$, $q = 1, \dots$, such that $\operatorname{supp} \varphi_q$ lies in the $\varepsilon$-neighborhood of the point $\zeta_q(1)$ and $\varphi_q(z') = 1$ in the $\varepsilon/2$-neighborhood of $\zeta_q(1)$. We fix an analytic function $g(z)$ and consider for any $q$ the distribution $\Phi_q(t, g) \in \mathcal{E}'(\mathbb{C}^d)$, $\Phi_q(t, g) = |1 + tg|^2 \varphi_q(Z')F = \varphi_q(Z')F_{1+tg}$. For $t = 0$, $\Phi_q(t, g) = \varphi_q(Z')F$, the point $\zeta_q(1)$ belongs to the support of $\pi_*\Phi_q(0, g)$, and therefore for some function $u \in C^\infty(\mathbb{C}^{d-1})$, $\langle \pi_*\Phi_q(0, g), u\rangle \ne 0$.

By continuity, for $|t|$ small enough, we still have $\langle \pi_*\Phi_q(t, g), u\rangle \ne 0$, which means that the $\varepsilon$-neighborhood of the point $\zeta_q(1)$ contains at least one point in the support of the distribution $G(1 + tg)$. Altogether, we have not less than $m_0$ points of the support of $G(1 + tg)$ in the union of the $\varepsilon$-neighborhoods of the points $\zeta_j(1)$. However, recall that the support of $G(1 + tg)$ can never contain more than $m_0$ points, so we deduce that for $t$ small enough there are no points of the support of $G(1 + tg)$ outside the $\varepsilon$-neighborhoods of the points $\zeta_q(1)$, so

$$\operatorname{supp} G(1 + tg) \cap \{Z' : |Z' - \zeta_q| > \varepsilon\} = \varnothing \tag{4.3}$$

for $|t|$ small enough (depending on $g$). Now we introduce a function $\psi \in C^\infty(\mathbb{C}^{d-1})$ that equals 1 outside the $2\varepsilon$-neighborhoods of the points $\zeta_q(1)$ and vanishes in the $\varepsilon$-neighborhoods of these points. By (4.3), the distribution $\psi G(1 + tg)$ equals zero for any $g$, for $t$ small enough. In particular, applying this distribution to the function $u = 1$, we obtain

$$\langle \psi G(1 + tg), 1\rangle = \langle \psi F, |1 + tg|^2\rangle = \langle \psi F, 1 + 2t \operatorname{Re} g + t^2 |g|^2\rangle = 0. \tag{4.4}$$

By the arbitrariness of $t$ in a small interval, (4.4) implies that $\langle \psi F, |g|^2\rangle = 0$ for any $g$. Now we take $g$ in the form $g = g_1 + g_2$, where $g_1$, $g_2$ are again functions analytic in a polydisk neighborhood of $\operatorname{supp} F$. Then we have

$$\langle \psi F, |g_1|^2 + 2\operatorname{Re}(g_1 \bar g_2) + |g_2|^2\rangle = \langle \psi F, 2\operatorname{Re}(g_1 \bar g_2)\rangle = 0.$$

Replacing here $g_1$ by $i g_1$, we obtain $\langle \psi F, 2\operatorname{Im}(g_1 \bar g_2)\rangle = 0$, and thus

$$\langle \psi F, g_1 \bar g_2\rangle = 0. \tag{4.5}$$

Any polynomial $p(Z, \bar Z)$ can be represented as a linear combination of functions of the form $g_1 \bar g_2$, so (4.5) gives

$$\langle \psi F, p(Z, \bar Z)\rangle = 0. \tag{4.6}$$



Now we take any function $f \in C^\infty(\mathbb{C}^d)$ supported in the neighborhood $V$ of $\operatorname{supp} F$ such that $f = 0$ on the support of $\psi$. We can approximate $f$ by polynomials of the form $p(Z, \bar Z)$ uniformly on $V$ in the sense of $C^l$, where $l$ is the order of the distribution $F$. Passing to the limit in (4.6), we obtain $\langle \psi F, f\rangle = \langle F, f\rangle = 0$. The latter relation shows that $\operatorname{supp} F \subset \bigcup_q \{Z : |Z' - \zeta_q(1)| < 2\varepsilon\}$. Since $\varepsilon > 0$ is arbitrary, this implies that $\operatorname{supp} F$ lies in the union of the affine subspaces $Z' = \zeta_j$, $j = 1, \dots, m_0$, of complex dimension 1. Now we repeat the same reasoning, having chosen instead of $Z = (z_1, Z')$ another decomposition of the complex variable $Z$: $Z = (Z'', z_d)$. We obtain that for some points $\xi_k \in \mathbb{C}^{d-1}$, no more than $m$ of them, the support of $F$ lies in the union of the subspaces $Z'' = \xi_k$. Taken together, this means that $\operatorname{supp} F$ actually lies in the intersection of these two systems of subspaces, which consists of no more than $m^2$ points $Z_s$. The number of points is finally reduced to $m_0 \le m$ in the same way as in Theorem 3.1, by choosing a special system of interpolation functions. □

5. Harmonic functions

The aim of this section is to establish finite rank results for Toeplitz operators corresponding to the Bergman spaces of harmonic functions. The main difference from the analytic case is that the space of harmonic functions does not possess a multiplicative structure. Therefore, in the process of dimension reduction, similar to the one we used in the proof of Theorem 4.1, we are not able to introduce the functional parameter (denoted by $g$ there). As a result, we can prove the finite rank theorem only in the case of $F$ being a measure and not a more singular distribution. In order to justify this shortcoming, we conclude the section by presenting an example of a singular distribution with rather large support (and thus non-discrete) that projects to a discrete measure, whatever the direction of the projection. Thus, a considerable part of $F$ becomes invisible after being projected. This example, although not directly contradicting the finite rank property, indicates that the reduction of dimension might not be sufficient to prove the result.

We start with the even-dimensional case. Here the problem with harmonic spaces reduces easily to the analytic case (in fact, we could have used a reference to [1] instead).

Theorem 5.1. Let $d = 2n$ be an even integer. Suppose that for a certain distribution $F \in \mathcal{E}'(\mathbb{R}^d)$ the matrix $H_F$ defined in Section 2 has rank $m < \infty$. Then the distribution $F$ is a sum of $m_0 \le m$ terms, each supported at one point: $F = \sum_q L_q \delta(x - x_q)$, $x_q \in \mathbb{R}^d$, where $L_q$ are differential operators in $\mathbb{R}^d$.

Proof. We identify the space $\mathbb{R}^d$ with the complex space $\mathbb{C}^n$.
Since the functions $z^\alpha$, $\bar z^\beta$ are harmonic, the matrix $A_F$ can be considered as a submatrix of $H_F$, and therefore it has rank not greater than $m$. It remains to apply Proposition 4.2 to establish that the distribution $F$ has the required form, with no more than $m$ points $x_q$. □

The odd-dimensional case requires considerably more work. We will again use a kind of dimension reduction; however, unlike the analytic case, we will need projections of the distribution to one-dimensional subspaces. Let $S$ denote the unit sphere in $\mathbb{R}^d$, $S = \{\zeta \in \mathbb{R}^d : |\zeta| = 1\}$, and let $\sigma$ be the Lebesgue measure on $S$. For $\zeta \in S$, we denote by $L_\zeta$ the one-dimensional subspace in $\mathbb{R}^d$ passing through $\zeta$, $L_\zeta = \zeta\mathbb{R}^1$. For a distribution $F \in \mathcal{E}'(\mathbb{R}^d)$, we define the distribution $F_\zeta \in \mathcal{E}'(\mathbb{R}^1)$ by setting $\langle F_\zeta, \phi\rangle = \langle F, \phi_\zeta\rangle$, where $\phi_\zeta \in C^\infty(\mathbb{R}^d)$ is $\phi_\zeta(x) = \phi(x \cdot \zeta)$. The distribution $F_\zeta$ can be understood as the result of projecting $F$ to $L_\zeta$ with further transplantation of the projection, $\pi_*^{L_\zeta}F$, from the line $L_\zeta$ to the standard line $\mathbb{R}^1$. The Fourier transform $\mathcal{F}F_\zeta$ of $F_\zeta$ is closely related to $\mathcal{F}F$:

$$\mathcal{F}(F_\zeta)(t) = (\mathcal{F}F)(t\zeta). \tag{5.1}$$

Further on, we will restrict ourselves to the case when the distribution $F$ is a finite complex Borel measure $\mu$. Here we will use the notation $\mu_\zeta$ instead of $F_\zeta$. We need to recall certain facts in harmonic analysis. In the one-dimensional case, they were proved by N. Wiener as long ago as 1919; the multi-dimensional version seems to be folklore; however, the formulations we found in the literature, see [8], are slightly weaker than the ones we need. Let $\mu$ be a finite complex Borel measure in $\mathbb{R}^d$. We define

$$\|\mu\| = \Big(\sum_{\xi \in \mathbb{R}^d} |\mu(\{\xi\})|^2\Big)^{1/2}.$$

Of course, $\|\mu\|$ is finite for a finite measure, and it vanishes if and only if $\mu$ has no atoms.

Lemma 5.2. Let $\mu$ be a finite Borel measure in $\mathbb{R}^d$ and $h$ be a function in $L_1(\mathbb{R}^d)$. Denote by $\mathcal{F}\mu$ the Fourier transform of $\mu$. Then

$$\lim_{R\to\infty} R^{-d}\int_{\mathbb{R}^d} h(R^{-1}\xi)\,\mathcal{F}\mu(\xi)\, d\xi = \mu(\{0\})\int_{\mathbb{R}^d} h(\xi)\, d\xi. \tag{5.2}$$

Proof. By the Plancherel identity, we have

$$\lim_{R\to\infty} R^{-d}\int_{\mathbb{R}^d} h(R^{-1}\xi)\,\mathcal{F}\mu(\xi)\, d\xi = \lim_{R\to\infty}\int_{\mathbb{R}^d} (\mathcal{F}h)(Rx)\, d\mu(x).$$

Now note that $\mathcal{F}h(0) = \int_{\mathbb{R}^d} h(\xi)\, d\xi$ and $\lim_{R\to\infty}(\mathcal{F}h)(Rx) = 0$ for $x \ne 0$ by the Riemann–Lebesgue lemma. The proof is completed by applying the Lebesgue dominated convergence theorem. □

Corollary 5.3. Under the conditions of Lemma 5.2,

$$\lim_{R\to\infty} R^{-d}\int_{\mathbb{R}^d} h(R^{-1}\xi)\,|\mathcal{F}\mu(\xi)|^2\, d\xi = \|\mu\|^2\int_{\mathbb{R}^d} h(\xi)\, d\xi. \tag{5.3}$$

Proof. We define the measure $\check\mu$ by $\check\mu(E) = \overline{\mu(-E)}$ for any Borel set $E$ and introduce $\nu = \mu * \check\mu$. Then $\mathcal{F}\nu = |\mathcal{F}\mu|^2$ and $\nu(\{0\}) = \|\mu\|^2$. It remains to apply Lemma 5.2 to the measure $\nu$. □

We are going to use Corollary 5.3 to relate the properties of the family of measures $\mu_\zeta$, $\zeta \in S$, to the properties of $\mu$.
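The one-dimensional case of (5.3) can be checked numerically on a purely atomic measure. The sketch below is only an illustration under stated assumptions: the measure (three atoms with hypothetical positions and weights) and the choice $h = \tfrac12 1_{[-1,1]}$ are not taken from the paper, and the Fourier transform is taken as $\mathcal{F}\mu(r) = \sum_j a_j e^{-i r x_j}$. With this $h$, the left-hand side of (5.3) becomes the average of $|\mathcal{F}\mu|^2$ over $[-R, R]$, which has a closed form.

```python
import math

def averaged_energy(atoms, R):
    """Return (1/(2R)) * integral over [-R, R] of |F mu(r)|^2 dr
    for the atomic measure mu = sum_j a_j * delta_{x_j}.

    The average of exp(-i r d) over [-R, R] is sin(R d) / (R d)
    (and 1 when d == 0), so the integral is evaluated in closed form.
    """
    total = 0.0
    for xj, aj in atoms:
        for xk, ak in atoms:
            d = xj - xk
            kernel = 1.0 if d == 0 else math.sin(R * d) / (R * d)
            # Imaginary parts cancel between the (j, k) and (k, j) terms.
            total += (aj * ak.conjugate()).real * kernel
    return total

# Hypothetical example measure: (position, weight) pairs.
atoms = [(0.0, 1.0), (1.0, -0.5), (2.5, 0.25 + 0.5j)]
norm_squared = sum(abs(a) ** 2 for _, a in atoms)  # the quantity ||mu||^2

for R in (1e1, 1e3, 1e6):
    print(R, averaged_energy(atoms, R), norm_squared)
```

As $R$ grows, the off-diagonal terms decay like $1/R$, so the average approaches $\|\mu\|^2$, in agreement with (5.3) for $\int h = 1$.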



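Lemma 5.4. Let $\mu$ be a finite compactly supported complex Borel measure on $\mathbb{R}^d$. Then the following two statements are equivalent:
(a) The measure $\mu$ is continuous, i.e., $\mu(\{x\}) = 0$ for any $x \in \mathbb{R}^d$.
(b) The measure $\mu_\zeta$ is continuous for $\sigma$-almost all $\zeta \in S$.

Proof. We take a function $h(\xi)$, depending only on $|\xi|$, $h(\xi) = H(|\xi|)$, such that $\int_{\mathbb{R}^d} h(\xi)\, d\xi = 1$. So, $\int_{\mathbb{R}} |r|^{d-1} H(|r|)\, dr = \frac{2}{\sigma(S)}$. By Corollary 5.3, used in dimension 1 for $\mu_\zeta$,

$$\|\mu_\zeta\|^2 = \lim_{R\to\infty}\frac{\sigma(S)}{2R}\int_{\mathbb{R}} |R^{-1}r|^{d-1} H(R^{-1}|r|)\,|(\mathcal{F}\mu_\zeta)(r)|^2\, dr.$$

In what follows we apply the Lebesgue dominated convergence theorem to justify passing to the limit:

$$\begin{aligned}
\frac{1}{\sigma(S)}\int_S \|\mu_\zeta\|^2\, d\sigma(\zeta)
&= \int_S \lim_{R\to\infty}\frac{1}{2R}\int_{\mathbb{R}} |R^{-1}r|^{d-1} H(R^{-1}|r|)\,|(\mathcal{F}\mu_\zeta)(r)|^2\, dr\, d\sigma(\zeta) \\
&= \lim_{R\to\infty}\frac{1}{2R^d}\int_S\int_{\mathbb{R}} |r|^{d-1} H(R^{-1}|r|)\,|(\mathcal{F}\mu_\zeta)(r)|^2\, dr\, d\sigma(\zeta) \\
&= \lim_{R\to\infty}\frac{1}{R^d}\int_S\int_0^\infty r^{d-1} H(R^{-1}r)\,|(\mathcal{F}\mu)(r\zeta)|^2\, dr\, d\sigma(\zeta) \\
&= \lim_{R\to\infty}\frac{1}{R^d}\int_{\mathbb{R}^d} h(R^{-1}\xi)\,|(\mathcal{F}\mu)(\xi)|^2\, d\xi = \|\mu\|^2.
\end{aligned} \tag{5.4}$$

Hence, $\|\mu\| = 0$ if and only if $\|\mu_\zeta\| = 0$ for almost all $\zeta \in S$. □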

Corollary 5.5. For a finite complex Borel measure $\mu$ with compact support in $\mathbb{R}^d$ the following three statements are equivalent:
(a) $\mu$ is discrete;
(b) $\mu_\zeta$ is discrete for all $\zeta \in S$;
(c) $\mu_\zeta$ is discrete for $\sigma$-almost all $\zeta \in S$.

Proof. The implications (a) ⇒ (b) and (b) ⇒ (c) are obvious. To establish (c) ⇒ (a), we denote by $\mu_c$ the continuous part of $\mu$. Then statement (c) means that $(\mu_c)_\zeta$ is discrete for $\sigma$-almost all $\zeta \in S$. On the other hand, by Lemma 5.4 applied to $\mu_c$, the measure $(\mu_c)_\zeta$ is continuous for $\sigma$-almost all $\zeta \in S$. Being both discrete and continuous, the measure $(\mu_c)_\zeta$ is zero for $\sigma$-almost all $\zeta \in S$. Passing to the Fourier transform, we obtain $(\mathcal{F}\mu_c)(r\zeta) = 0$ for all $r$ for $\sigma$-almost all $\zeta \in S$. Now, since the Fourier transform $\mathcal{F}\mu_c$ is smooth, this means that $\mu_c = 0$. □

Now we return to our finite rank problem.



Theorem 5.6. Let $d \ge 3$ be an odd integer, $d = 2n + 1$. Let $\mu$ be a finite complex Borel measure in $\mathbb{R}^d$ with compact support. Suppose that the matrix $H_\mu$ has finite rank $m$. Then $\operatorname{supp} \mu$ consists of no more than $m$ points.

Proof. Fix some $\zeta \in S$ and choose some $(d-1) = 2n$-dimensional linear subspace $L \subset \mathbb{R}^d$ containing $L_\zeta$. We choose the co-ordinate system $x = (x_1, \dots, x_d)$ in $\mathbb{R}^d$ so that the subspace $L$ coincides with $\{x : x_d = 0\}$. The even-dimensional real space $L$ can be considered as the $n$-dimensional complex space $\mathbb{C}^n$ with co-ordinates $z = (z_1, \dots, z_n)$, $z_j = x_{2j-1} + i x_{2j}$, $j = 1, \dots, n$. The functions $(z, x_d) \mapsto z^\alpha$, $(z, x_d) \mapsto \bar z^\beta$, $\alpha, \beta \in (\mathbb{Z}_+)^n$, are harmonic polynomials in $\mathbb{C}^n \times \mathbb{R}^1$. Moreover, by definition, $\langle \mu, z^\alpha \bar z^\beta\rangle = \langle \pi_*^{\mathbb{C}^n}\mu, z^\alpha \bar z^\beta\rangle$. Hence, the matrix $A_{\pi_*^{\mathbb{C}^n}\mu}$ is a submatrix of the matrix $H_\mu$, and the former has rank not greater than the latter, $\operatorname{rank}(A_{\pi_*^{\mathbb{C}^n}\mu}) \le m$.

So we can apply Theorem 4.1 and obtain that the measure $\pi_*^{\mathbb{C}^n}\mu$ is discrete and its support contains not more than $m$ points. Now we project the measure $\pi_*^{\mathbb{C}^n}\mu$ to the real one-dimensional linear subspace $L_\zeta$ in $L$. We obtain the same measure as if we had projected $\mu$ to $L_\zeta$ from the very beginning, and not in two steps, i.e., $\pi_*^{L_\zeta}\mu$. As a projection of a discrete measure, $\pi_*^{L_\zeta}\mu$ is discrete and has no more than $m$ points in the support. By our definition of the measure $\mu_\zeta$ as $\pi_*^{L_\zeta}\mu$ transplanted to $\mathbb{R}^1$, this means that $\mu_\zeta$ is discrete. Due to the arbitrariness of the choice of $\zeta \in S$, we obtain that all measures $\mu_\zeta$ are discrete. Now we can apply Corollary 5.5 and obtain that the measure $\mu$ is itself discrete. Finally, in order to show that the number of points in $\operatorname{supp} \mu$ does not exceed $m$, we choose $\zeta \in S$ such that no two points in $\operatorname{supp} \mu$ project to the same point in $L_\zeta$. Then the point masses of $\mu$ cannot cancel each other under the projection, and thus $\operatorname{card} \operatorname{supp} \mu = \operatorname{card} \operatorname{supp} \mu_\zeta \le m$. The number of points in the support of $\mu$ is estimated in the same way as in Theorems 3.1 and 4.1. □

The analysis of the reasoning in the proof shows that the only essential obstacle to extending Theorem 5.6 to the case of distributions is the limitation set by Corollary 5.5. If we were able to prove this corollary for distributions, all other steps in the proof of Theorem 5.6 would go through without essential changes. However, it turns out that not only can the proof of Corollary 5.5 not be carried over to the distributional case, but the corollary itself becomes wrong. The example that we present does not disprove Theorem 5.6 for distributions; however, it indicates that the proof, if it exists, should involve some other ideas.

Example 5.7. Let $d \ge 2$. We consider the Schwartz distribution $F \in \mathcal{S}'(\mathbb{R}^d)$ that has $\cos|\xi|$ as its Fourier transform. By the Paley–Wiener theorem, since $\mathcal{F}F$ is an entire function of exponential type, $F$ has compact support, $F \in \mathcal{E}'(\mathbb{R}^d)$. By (5.1) and spherical symmetry, for any $\zeta \in S$, $F_\zeta = \mathcal{F}^{-1}(\cos\tau) = \frac12(\delta_1 + \delta_{-1})$. If $F$ were a measure, then, by Corollary 5.5, it would be discrete. This, however, is impossible since $F$, together with $\mathcal{F}F$, is rotationally invariant; being both discrete and rotationally invariant, $F$ would have to be supported at the origin, which contradicts the above expression for $F_\zeta$. The construction also shows that $F$ is the unique distribution that has $\frac12(\delta_1 + \delta_{-1})$ as its one-dimensional projections.
Of course, we could have directly checked that $F$ is not a measure, using the fact that $F$ is, actually, the solution $u(x, t)$, $t = 1$, of the wave equation $u_{tt} - \Delta_x u = 0$ with initial conditions $u(\cdot, 0) = \delta$, $u_t(\cdot, 0) = 0$. Moreover, from the classical Poisson formulas it follows that $\operatorname{supp} F$ is the sphere $\{|x| = 1\}$ for odd $d$ and the ball $\{|x| \le 1\}$ for even $d$. Note, however, that in neither case does $F$ generate a finite rank Toeplitz operator.



6. Discussion

In the process of exploring the finite rank conjecture, a number of interesting open questions arise. The case of analytic functions is studied completely. However, in the case of harmonic functions the finite rank conjecture is open for weights being distributions that are not measures. The complete solution of this problem would follow from a positive answer to the next question. Let $d \ge 3$, $F \in \mathcal{E}'(\mathbb{R}^d)$. Suppose that $\pi_*^{H}F$ is a distribution with finite support for every subspace $H \subset \mathbb{R}^d$ with $\dim H = d - 1$. Is it true that the support of $F$ is finite? As Example 5.7 shows, the answer is negative if we consider subspaces of dimension 1 instead.

Further possible versions of the finite rank conjecture may involve some other elliptic equations playing the part of the Cauchy–Riemann or the Laplace equations in the problem. The first interesting candidate for study here is the Helmholtz operator $H_E u = \Delta u + Eu$, $E > 0$. Let $P_{\mathcal{H}_E}$ be the orthogonal projection from $L_2(\Omega)$ to the subspace $\mathcal{H}_E(\Omega)$ consisting of solutions of the Helmholtz equation. With a function (or a compactly supported distribution) $F$ we associate the Toeplitz operator $T_F : u \mapsto P_{\mathcal{H}_E} uF$, $u \in \mathcal{H}_E(\Omega)$. Which restrictions on $F$ are imposed by the condition that the operator $T_F$ has finite rank? The question is of a certain importance for scattering theory. It is easy to show that if $T_F$ is zero then $F$ must be zero. However, it is unclear at the moment how to handle the case of a positive rank. For the Toeplitz operator corresponding to the projection onto the subspace of solutions of a general elliptic equation, even the case of rank 0 is unresolved.

References

[1] B.R. Choe, On higher dimensional Luecking's theorem, http://math.korea.ac.kr/~choebr/papers/luecking.pdf.
[2] B.R. Choe, H. Koo, J.L. Young, Finite sums of Toeplitz products on the polydisk, http://math.korea.ac.kr/~choebr/papers/finitesum_polydisk.pdf.
[3] Ž. Čučković, I. Louchihi, Finite rank commutators and semicommutators of Toeplitz operators with quasihomogeneous symbols, http://www.math.utoledo.edu/~ilouhic2/PDF/rank.pdf.
[4] L. Hörmander, The Analysis of Linear Partial Differential Operators, vol. 1, Springer, 1983.
[5] T. Le, A refined Luecking's theorem and finite-rank products of Toeplitz operators, arXiv: 0802.3925.
[6] T. Le, Finite rank products of Toeplitz operators in several complex variables, http://individual.utoronto.ca/trieule/Data/FiniteRankToeplitzProducts.pdf.
[7] D. Luecking, Finite rank Toeplitz operators on the Bergman space, Proc. Amer. Math. Soc. 136 (5) (2008) 1717–1723.
[8] P. Mattila, Spherical averages of Fourier transforms of measures with finite energy; dimension of intersections and distance sets, Mathematika 34 (2) (1987) 207–228.
[9] S. Raikov, G. Warzel, Quasi-classical versus non-classical spectral asymptotics for magnetic Schrödinger operators with decreasing electric potentials, Rev. Math. Phys. 14 (2002) 1051–1072.
[10] G. Rozenblum, N. Shirokov, Finite rank Bergman–Toeplitz and Bargmann–Toeplitz operators in many dimensions, arXiv: 0802.0192.


A change of variable formula for the 2D fractional Brownian motion of Hurst index bigger or equal to 1/4 Ivan Nourdin Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie, Boîte courrier 188, 4 Place Jussieu, 75252 Paris Cedex 5, France Received 13 June 2008; accepted 2 October 2008 Available online 30 October 2008 Communicated by Paul Malliavin

Abstract We prove a change of variable formula for the 2D fractional Brownian motion of index H bigger or equal to 1/4. For H strictly bigger than 1/4, our formula coincides with that obtained by using the rough paths theory. For H = 1/4 (the more interesting case), there is an additional term that is a classical Wiener integral against an independent standard Brownian motion. © 2008 Elsevier Inc. All rights reserved. Keywords: Fractional Brownian motion; Weak convergence; Change of variable formula

E-mail address: [email protected].
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.10.005
I. Nourdin / Journal of Functional Analysis 256 (2009) 2304–2320

1. Introduction and main result

In [4], Coutin and Qian have shown that the rough paths theory of Lyons [13] can be applied to the 2D fractional Brownian motion $B = (B^{(1)}, B^{(2)})$ under the condition that its Hurst parameter $H$ (supposed to be the same for the two components) is strictly bigger than 1/4. Since this seminal work, several authors have recovered this fact by using different routes (see e.g. Feyel and de La Pradelle [7], Friz and Victoir [8] or Unterberger [18] to cite but a few). On the other hand, it is still an open problem to bypass this restriction on $H$. Rough paths theory is purely deterministic in essence. Actually, its random aspect comes only when it is applied to a sample path of a given stochastic process (like a Brownian motion, a fractional Brownian motion, etc.). In particular, it does not allow one to produce any new randomness. As such, the second point of Theorem 1.2 just below shows, in a sense, that it seems difficult to reach the case $H = 1/4$ by using exclusively the tools of rough paths theory.

Before stating our main result, we need some preliminaries. Let $W$ be a standard (1D) Brownian motion, independent of $B$. We assume that $B$ and $W$ are defined on the same probability space $(\Omega, \mathcal{F}, P)$ with $\mathcal{F} = \sigma\{B\} \vee \sigma\{W\}$. Let $(X_n)$ be a sequence of $\sigma\{B\}$-measurable random variables, and let $X$ be an $\mathcal{F}$-measurable random variable. In the sequel, we will write $X_n \xrightarrow{\text{stably}} X$ if $(Z, X_n) \xrightarrow{\text{law}} (Z, X)$ for every bounded and $\sigma\{B\}$-measurable random variable $Z$. In particular, we see that stable convergence implies convergence in law. Moreover, it is easily checked that convergence in probability implies stable convergence. We refer to [11] for an exhaustive study of this notion. Now, let us introduce the following object:

Definition 1.1. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a continuously differentiable function, and fix a time $t > 0$. Provided it exists, we define $\int_0^t \nabla f(B_s) \cdot dB_s$ to be the limit in probability, as $n \to \infty$, of

$$\begin{aligned}
I_n(t) := {}& \sum_{k=0}^{\lfloor nt\rfloor - 1} \frac{\frac{\partial f}{\partial x}(B^{(1)}_{k/n}, B^{(2)}_{k/n}) + \frac{\partial f}{\partial x}(B^{(1)}_{(k+1)/n}, B^{(2)}_{k/n})}{2}\,\big(B^{(1)}_{(k+1)/n} - B^{(1)}_{k/n}\big) \\
&+ \sum_{k=0}^{\lfloor nt\rfloor - 1} \frac{\frac{\partial f}{\partial y}(B^{(1)}_{k/n}, B^{(2)}_{k/n}) + \frac{\partial f}{\partial y}(B^{(1)}_{k/n}, B^{(2)}_{(k+1)/n})}{2}\,\big(B^{(2)}_{(k+1)/n} - B^{(2)}_{k/n}\big).
\end{aligned} \tag{1.1}$$

If $I_n(t)$ defined by (1.1) does not converge in probability but converges stably, we denote the limit by $\int_0^t \nabla f(B_s) \cdot d^{\star}B_s$. Our main result is as follows:

Theorem 1.2. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a function belonging to $C^8$ and verifying $(H_8)$, see (3.1) below. Let also $B = (B^{(1)}, B^{(2)})$ denote a 2D fractional Brownian motion of Hurst index $H \in (0, 1)$, and $t > 0$ be a fixed time.

1. If $H > 1/4$ then $\int_0^t \nabla f(B_s) \cdot dB_s$ is well defined, and we have

$$f(B_t) = f(0) + \int_0^t \nabla f(B_s) \cdot dB_s. \tag{1.2}$$

2. If $H = 1/4$ then only $\int_0^t \nabla f(B_s) \cdot d^{\star}B_s$ is well defined, and we have

$$f(B_t) \overset{\text{Law}}{=} f(0) + \int_0^t \nabla f(B_s) \cdot d^{\star}B_s + \frac{\sigma_{1/4}}{\sqrt 2}\int_0^t \frac{\partial^2 f}{\partial x \partial y}(B_s)\, dW_s. \tag{1.3}$$

Here, $\sigma_{1/4}$ is the universal constant defined below by (1.5), and $\int_0^t \partial^2 f/\partial x \partial y\,(B_s)\, dW_s$ denotes a classical Wiener integral with respect to the independent Brownian motion $W$.



3. If $H < 1/4$ then the integral $\int_0^t B_s \cdot d^{\star}B_s$ does not exist. Therefore, it is not possible to write a change of variable formula for $B^{(1)}_t B^{(2)}_t$ using the integral defined in Definition 1.1.

Remark 1.3.
1. Due to the definition of stable convergence, we can freely move each component in (1.3) from the right-hand side to the left (or from the left-hand side to the right).

2. Whenever $\beta$ denotes a one-dimensional fractional Brownian motion with Hurst index in $(0, 1/2)$, it is easily checked, for any fixed $t > 0$, that $\sum_{k=0}^{\lfloor nt\rfloor - 1} \beta_{k/n}(\beta_{(k+1)/n} - \beta_{k/n})$ does not converge in law. (Indeed, on one hand, we have

$$\beta^2_{\lfloor nt\rfloor/n} = \sum_{k=0}^{\lfloor nt\rfloor - 1}\big(\beta^2_{(k+1)/n} - \beta^2_{k/n}\big) = 2\sum_{k=0}^{\lfloor nt\rfloor - 1}\beta_{k/n}\big(\beta_{(k+1)/n} - \beta_{k/n}\big) + \sum_{k=0}^{\lfloor nt\rfloor - 1}\big(\beta_{(k+1)/n} - \beta_{k/n}\big)^2$$

and, on the other hand, it is well known (see e.g. [12]) that

$$n^{2H-1}\sum_{k=0}^{\lfloor nt\rfloor - 1}\big(\beta_{(k+1)/n} - \beta_{k/n}\big)^2 \xrightarrow[n\to\infty]{L^2} t.$$

These two facts imply immediately that

$$\sum_{k=0}^{\lfloor nt\rfloor - 1}\beta_{k/n}\big(\beta_{(k+1)/n} - \beta_{k/n}\big) = \frac12\Big(\beta^2_{\lfloor nt\rfloor/n} - \sum_{k=0}^{\lfloor nt\rfloor - 1}\big(\beta_{(k+1)/n} - \beta_{k/n}\big)^2\Big)$$

does not converge in law.) On the other hand, whenever $H > 1/6$, the quantity

$$\sum_{k=0}^{\lfloor nt\rfloor - 1}\frac12\big(f(\beta_{k/n}) + f(\beta_{(k+1)/n})\big)\big(\beta_{(k+1)/n} - \beta_{k/n}\big)$$

converges in $L^2$ for any regular enough function $f : \mathbb{R} \to \mathbb{R}$, see [3,9]. This last fact roughly explains why there is a “symmetric” part in the Riemann sum (1.1).

3. We stress that it is still an open problem to know if each individual integral

$$\int_0^t \frac{\partial f}{\partial x}(B_s)\, d^{(\star)}B^{(1)}_s \quad\text{and}\quad \int_0^t \frac{\partial f}{\partial y}(B_s)\, d^{(\star)}B^{(2)}_s$$

could be defined separately. Indeed, in the first two points of Theorem 1.2, we “only” prove that their sum, that is $\int_0^t \nabla f(B_s) \cdot d^{(\star)}B_s$, is well defined.

4. Let us give a quicker proof of (1.3) in the particular case where $f(x, y) = xy$. Let $\beta$ be a one-dimensional fractional Brownian motion of index 1/4. The classical Breuer–Major theorem [1] yields:

$$\frac{1}{\sqrt n}\sum_{k=0}^{\lfloor n\cdot\rfloor - 1}\big(\sqrt n(\beta_{(k+1)/n} - \beta_{k/n})^2 - 1\big) \overset{\text{Law}}{=} \frac{1}{\sqrt n}\sum_{k=0}^{\lfloor n\cdot\rfloor - 1}\big((\beta_{k+1} - \beta_k)^2 - 1\big) \xrightarrow[n\to\infty]{\text{stably}} \sigma_{1/4} W. \tag{1.4}$$


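Here, the convergence is stable and holds in the Skorohod space $D$ of càdlàg functions on $[0, \infty)$. Moreover, $W$ still denotes a standard Brownian motion independent of $\beta$ (the independence is a consequence of the central limit theorem for multiple stochastic integrals proved in [17]) and the constant $\sigma_{1/4}$ is given by

$$\sigma_{1/4} := \sqrt{\frac12\sum_{k\in\mathbb{Z}}\Big(\sqrt{|k+1|} + \sqrt{|k-1|} - 2\sqrt{|k|}\Big)^2} < \infty. \tag{1.5}$$

Now, let $\widetilde\beta$ be another fractional Brownian motion of index 1/4, independent of $\beta$. From (1.4), we get

$$\Big(\frac{1}{\sqrt n}\sum_{k=0}^{\lfloor nt\rfloor - 1}\big(\sqrt n(\beta_{(k+1)/n} - \beta_{k/n})^2 - 1\big),\ \frac{1}{\sqrt n}\sum_{k=0}^{\lfloor nt\rfloor - 1}\big(\sqrt n(\widetilde\beta_{(k+1)/n} - \widetilde\beta_{k/n})^2 - 1\big)\Big) \xrightarrow[n\to\infty]{\text{stably}} \sigma_{1/4}(W, \widetilde W)$$

for $(W, \widetilde W)$ a 2D standard Brownian motion, independent of the 2D fractional Brownian motion $(\beta, \widetilde\beta)$. In particular, by difference, we have

$$\frac12\sum_{k=0}^{\lfloor n\cdot\rfloor - 1}\Big((\beta_{(k+1)/n} - \beta_{k/n})^2 - (\widetilde\beta_{(k+1)/n} - \widetilde\beta_{k/n})^2\Big) \xrightarrow[n\to\infty]{\text{stably}} \frac{\sigma_{1/4}}{2}(W - \widetilde W) \overset{\text{Law}}{=} \frac{\sigma_{1/4}}{\sqrt 2} W.$$

Now, set $B^{(1)} = (\beta + \widetilde\beta)/\sqrt 2$ and $B^{(2)} = (\beta - \widetilde\beta)/\sqrt 2$. It is easily checked that $B^{(1)}$ and $B^{(2)}$ are two independent fractional Brownian motions of index 1/4. Moreover, we can rewrite the previous convergence as

$$\sum_{k=0}^{\lfloor n\cdot\rfloor - 1}\big(B^{(1)}_{(k+1)/n} - B^{(1)}_{k/n}\big)\big(B^{(2)}_{(k+1)/n} - B^{(2)}_{k/n}\big) \xrightarrow[n\to\infty]{\text{stably}} \frac{\sigma_{1/4}}{\sqrt 2} W, \tag{1.6}$$

with $B^{(1)}$, $B^{(2)}$ and $W$ independent. On the other hand, for any $a, b, c, d \in \mathbb{R}$: $bd - ac = a(d - c) + c(b - a) + (b - a)(d - c)$. Choosing $a = B^{(1)}_{k/n}$, $b = B^{(1)}_{(k+1)/n}$, $c = B^{(2)}_{k/n}$ and $d = B^{(2)}_{(k+1)/n}$, and summing for $k$ over $0, \dots, \lfloor nt\rfloor - 1$, we obtain

$$\begin{aligned}
B^{(1)}_{\lfloor nt\rfloor/n} B^{(2)}_{\lfloor nt\rfloor/n} = {}& \sum_{k=0}^{\lfloor nt\rfloor - 1}\Big[B^{(1)}_{k/n}\big(B^{(2)}_{(k+1)/n} - B^{(2)}_{k/n}\big) + B^{(2)}_{k/n}\big(B^{(1)}_{(k+1)/n} - B^{(1)}_{k/n}\big)\Big] \\
&+ \sum_{k=0}^{\lfloor nt\rfloor - 1}\big(B^{(1)}_{(k+1)/n} - B^{(1)}_{k/n}\big)\big(B^{(2)}_{(k+1)/n} - B^{(2)}_{k/n}\big).
\end{aligned} \tag{1.7}$$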
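The constant in (1.5) can be approximated by truncating the series. The following is a minimal sketch, assuming (1.5) reads $\sigma_{1/4}^2 = \tfrac12\sum_{k\in\mathbb{Z}}(\sqrt{|k+1|} + \sqrt{|k-1|} - 2\sqrt{|k|})^2$; the function names and the cutoff are illustrative choices, not part of the paper. The summand decays like $|k|^{-3}$, so a symmetric truncation converges quickly.

```python
import math

def second_difference(k):
    """sqrt|k+1| + sqrt|k-1| - 2*sqrt|k|, the summand of (1.5) before squaring."""
    return math.sqrt(abs(k + 1)) + math.sqrt(abs(k - 1)) - 2.0 * math.sqrt(abs(k))

def sigma_quarter(cutoff):
    """Approximate sigma_{1/4} by summing over |k| <= cutoff (an assumed truncation)."""
    series = sum(second_difference(k) ** 2 for k in range(-cutoff, cutoff + 1))
    return math.sqrt(0.5 * series)

for cutoff in (10, 100, 10000):
    print(cutoff, sigma_quarter(cutoff))
```

The dominant contribution is the term $k = 0$, which equals 4; the terms $k = \pm 1$ contribute $(\sqrt 2 - 2)^2$ each, and the remaining tail is small.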
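Identity (1.7) is purely algebraic, so it holds path by path for any two sequences started at zero. The sketch below checks it on two simulated Gaussian random walks; these are stand-ins chosen only for illustration (they are not fractional Brownian motions), and the function name is hypothetical.

```python
import random

def product_rule_terms(x, y):
    """Return the three sums on the right-hand side of the discrete
    product rule x[N]*y[N] - x[0]*y[0] = first + second + cross."""
    n_steps = len(x) - 1
    first = sum(x[k] * (y[k + 1] - y[k]) for k in range(n_steps))
    second = sum(y[k] * (x[k + 1] - x[k]) for k in range(n_steps))
    cross = sum((x[k + 1] - x[k]) * (y[k + 1] - y[k]) for k in range(n_steps))
    return first, second, cross

# Two illustrative paths started at 0 (plain random walks, not fBm).
rng = random.Random(0)
x, y = [0.0], [0.0]
for _ in range(1000):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
    y.append(y[-1] + rng.gauss(0.0, 1.0))

first, second, cross = product_rule_terms(x, y)
print(x[-1] * y[-1], first + second + cross)
```

The `cross` term is the discrete counterpart of the sum in (1.6); it is the only term whose limit carries the additional Brownian motion $W$ when $H = 1/4$.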



Hence, passing to the limit using (1.6), we get the desired conclusion in (1.3), in the particular case where $f(x, y) = xy$. Note that the second term in the right-hand side of (1.7) is the discrete analogue of the 2-covariation introduced by Errami and Russo in [6].

5. We could prove (1.3) at a functional level (note that it has precisely been done for $f(x, y) = xy$ in the proof just above). But, in order to keep the length of this paper within limits, we defer this rather technical investigation to future analysis.

6. In the very recent work [15], Réveillac and I proved the following result (see also Burdzy and Swanson [2] for similar results in the case where $\beta$ is replaced by the solution of the stochastic heat equation driven by a space/time white noise). If $\beta$ denotes a one-dimensional fractional Brownian motion of index 1/4 and if $g : \mathbb{R} \to \mathbb{R}$ is regular enough, then

$$\frac{1}{\sqrt n}\sum_{k=0}^{n-1} g(\beta_{k/n})\big(\sqrt n(\beta_{(k+1)/n} - \beta_{k/n})^2 - 1\big) \xrightarrow[n\to\infty]{\text{stably}} \frac14\int_0^1 g''(\beta_s)\, ds + \sigma_{1/4}\int_0^1 g(\beta_s)\, dW_s \tag{1.8}$$

for $W$ a standard Brownian motion independent of $\beta$. Compare with Proposition 3.3 below. In particular, by choosing $g$ identically one in (1.8), it agrees with (1.4).

7. The fractional Brownian motion of index 1/4 has a remarkable physical interpretation in terms of particle systems. Indeed, if one considers an infinite number of particles, initially placed on the real line according to a Poisson distribution, performing independent Brownian motions and undergoing “elastic” collisions, then the trajectory of a fixed particle (after rescaling) converges to a fractional Brownian motion of index 1/4. See Harris [10] for heuristic arguments, and Dürr, Goldstein and Lebowitz [5] for precise results.

Now, the rest of the note is entirely devoted to the proof of Theorem 1.2. Section 2 contains some preliminaries and fixes the notation. Some technical results are postponed to Section 3. Finally, the proof of Theorem 1.2 is given in Section 4.

2. Preliminaries and notation

We shall now provide a short description of the tools of Malliavin calculus that will be needed in the following sections. The reader is referred to the monographs [14,16] for any unexplained notion or result.

Let $B = (B^{(1)}_t, B^{(2)}_t)_{t\in[0,T]}$ be a 2D fractional Brownian motion with Hurst parameter belonging to $(0, 1/2)$. We denote by $\mathfrak{H}$ the Hilbert space defined as the closure of the set of step $\mathbb{R}^2$-valued functions on $[0, T]$, with respect to the scalar product induced by

$$\big\langle (1_{[0,t_1]}, 1_{[0,t_2]}), (1_{[0,s_1]}, 1_{[0,s_2]})\big\rangle_{\mathfrak{H}} = R_H(t_1, s_1) + R_H(t_2, s_2), \qquad s_i, t_i \in [0, T],\ i = 1, 2,$$

where $R_H(t, s) = \frac12\big(t^{2H} + s^{2H} - |t - s|^{2H}\big)$. The mapping $(1_{[0,t_1]}, 1_{[0,t_2]}) \mapsto B^{(1)}_{t_1} + B^{(2)}_{t_2}$ can be extended to an isometry between $\mathfrak{H}$ and the Gaussian space associated with $B$. Also, $\mathcal{H}$ will denote the Hilbert space defined as the closure of the set of step $\mathbb{R}$-valued functions on $[0, T]$, with respect to the scalar product induced by

$$\langle 1_{[0,t]}, 1_{[0,s]}\rangle_{\mathcal{H}} = R_H(t, s), \qquad s, t \in [0, T].$$


The mapping $1_{[0,t]} \mapsto B^{(i)}_t$ ($i$ equals 1 or 2) can be extended to an isometry between $\mathcal{H}$ and the Gaussian space associated with $B^{(i)}$. Consider the set of all smooth cylindrical random variables, i.e. of the form

$$F = f\big(B(\varphi_1), \dots, B(\varphi_k)\big), \qquad \varphi_i \in \mathfrak{H},\ i = 1, \dots, k, \tag{2.1}$$

where $f \in C^\infty$ is bounded with bounded derivatives. The derivative operator $D$ of a smooth cylindrical random variable of the above form is defined as the $\mathfrak{H}$-valued random variable

$$DF = \sum_{i=1}^{k}\frac{\partial f}{\partial x_i}\big(B(\varphi_1), \dots, B(\varphi_k)\big)\varphi_i =: (D_{B^{(1)}}F, D_{B^{(2)}}F).$$

In particular, we have

$$D_{B^{(i)}}B^{(j)}_t = \delta_{ij} 1_{[0,t]} \quad\text{for } i, j \in \{1, 2\}, \text{ and } \delta_{ij} \text{ the Kronecker symbol.}$$

By iteration, one can define the $m$th derivative $D^m F$ (which is a symmetric element of $L^2(\Omega, \mathfrak{H}^{\otimes m})$) for $m \ge 2$. As usual, for any $m \ge 1$, the space $\mathbb{D}^{m,2}$ denotes the closure of the set of smooth random variables with respect to the norm $\|\cdot\|_{m,2}$ defined by the relation

$$\|F\|^2_{m,2} = E|F|^2 + \sum_{i=1}^{m} E\|D^i F\|^2_{\mathfrak{H}^{\otimes i}}.$$

The derivative $D$ verifies the chain rule. Precisely, if $\varphi : \mathbb{R}^n \to \mathbb{R}$ belongs to $C^1$ with bounded derivatives and if $F_i$, $i = 1, \dots, n$, are in $\mathbb{D}^{1,2}$, then $\varphi(F_1, \dots, F_n) \in \mathbb{D}^{1,2}$ and

$$D\varphi(F_1, \dots, F_n) = \sum_{i=1}^{n}\frac{\partial \varphi}{\partial x_i}(F_1, \dots, F_n)\, DF_i.$$

The $m$th derivative $D^m_{B^{(i)}}$ ($i$ equals 1 or 2) verifies the following Leibniz rule: for any $F, G \in \mathbb{D}^{m,2}$ such that $FG \in \mathbb{D}^{m,2}$, we have

$$\big(D^m_{B^{(i)}}FG\big)_{t_1,\dots,t_m} = \sum \big(D^r_{B^{(i)}}F\big)_{s_1,\dots,s_r}\big(D^{m-r}_{B^{(i)}}G\big)_{u_1,\dots,u_{m-r}}, \qquad t_i \in [0, T],\ i = 1, \dots, m, \tag{2.2}$$

where the sum runs over any subset $\{s_1, \dots, s_r\} \subset \{t_1, \dots, t_m\}$ and where we write $\{t_1, \dots, t_m\} \setminus \{s_1, \dots, s_r\} =: \{u_1, \dots, u_{m-r}\}$. The divergence operator $\delta$ is the adjoint of the derivative operator. If a random variable $u \in L^2(\Omega, \mathfrak{H})$ belongs to $\operatorname{dom} \delta$, the domain of the divergence operator, then $\delta(u)$ is defined by the duality relationship $E\big(F\delta(u)\big) = E\langle DF, u\rangle_{\mathfrak{H}}$ for every $F \in \mathbb{D}^{1,2}$.



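For every $q \ge 1$, let $\mathcal{H}_q$ be the $q$th Wiener chaos of $B$, that is, the closed linear subspace of $L^2(\Omega, \mathcal{A}, P)$ generated by the random variables $\{H_q(B(h)),\ h \in \mathfrak{H},\ \|h\|_{\mathfrak{H}} = 1\}$, where $H_q$ is the $q$th Hermite polynomial given by $H_q(x) = (-1)^q e^{x^2/2}\frac{d^q}{dx^q}\big(e^{-x^2/2}\big)$. The mapping

$$I_q(h^{\otimes q}) = H_q\big(B(h)\big) \tag{2.3}$$

provides a linear isometry between the symmetric tensor product $\mathfrak{H}^{\odot q}$ (equipped with the modified norm $\frac{1}{\sqrt{q!}}\|\cdot\|_{\mathfrak{H}^{\otimes q}}$) and $\mathcal{H}_q$. The following duality formula holds

$$E\big(F I_q(f)\big) = E\langle D^q F, f\rangle_{\mathfrak{H}^{\otimes q}}, \tag{2.4}$$

for any $f \in \mathfrak{H}^{\odot q}$ and $F \in \mathbb{D}^{q,2}$. In particular, we have

$$E\big(F I^{(i)}_q(g)\big) = E\langle D^q_{B^{(i)}}F, g\rangle_{\mathcal{H}^{\otimes q}}, \qquad i = 1, 2, \tag{2.5}$$

for any $g \in \mathcal{H}^{\odot q}$ and $F \in \mathbb{D}^{q,2}$, where, for simplicity, we write $I^{(i)}_q(g)$ whenever the corresponding $q$th multiple integral is only with respect to $B^{(i)}$. Finally, we mention the following particular case (actually, the only one we will need in the sequel) of the classical multiplication formula: if $f, g \in \mathcal{H}$, $q \ge 1$ and $i \in \{1, 2\}$, then

$$I^{(i)}_q\big(f^{\otimes q}\big) I^{(i)}_q\big(g^{\otimes q}\big) = \sum_{r=0}^{q} r!\binom{q}{r}^2 I^{(i)}_{2q-2r}\big(f^{\otimes (q-r)} \otimes g^{\otimes (q-r)}\big)\langle f, g\rangle^r_{\mathcal{H}}. \tag{2.6}$$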
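When $f = g$ has unit norm, (2.3) turns the multiplication formula (2.6) into an identity between Hermite polynomials, $H_q(x)^2 = \sum_{r=0}^{q} r!\binom{q}{r}^2 H_{2q-2r}(x)$, which is easy to test numerically. The sketch below assumes that specialisation; it evaluates $H_q$ through the standard three-term recurrence $H_{q+1}(x) = xH_q(x) - qH_{q-1}(x)$ rather than through the derivative formula, and the function names are illustrative.

```python
from math import comb, factorial

def hermite(q, x):
    """Probabilists' Hermite polynomial H_q(x), computed with the
    recurrence H_{n+1}(x) = x*H_n(x) - n*H_{n-1}(x), H_0 = 1, H_1 = x."""
    h_prev, h_curr = 1.0, float(x)
    if q == 0:
        return h_prev
    for n in range(1, q):
        h_prev, h_curr = h_curr, x * h_curr - n * h_prev
    return h_curr

def product_formula_rhs(q, x):
    """Right-hand side of (2.6) for f = g of unit norm, evaluated at B(f) = x."""
    return sum(
        factorial(r) * comb(q, r) ** 2 * hermite(2 * q - 2 * r, x)
        for r in range(q + 1)
    )

for q in (1, 2, 3):
    x = 0.7
    print(q, hermite(q, x) ** 2, product_formula_rhs(q, x))
```

For $q = 2$ this reads $(x^2 - 1)^2 = H_4(x) + 4H_2(x) + 2$, the case relevant to the squared increments in (1.4).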

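3. Some technical results

In this section, we collect some crucial results for the proof of (1.3), the only case which is difficult. Here and in the rest of the paper, we set

$$\Delta B^{(i)}_{k/n} := B^{(i)}_{(k+1)/n} - B^{(i)}_{k/n}, \qquad \delta_{k/n} := 1_{[k/n,(k+1)/n]} \quad\text{and}\quad \varepsilon_{k/n} := 1_{[0,k/n]},$$

for any $i \in \{1, 2\}$ and $k \in \{0, \dots, n - 1\}$. In the sequel, for $g : \mathbb{R}^2 \to \mathbb{R}$ belonging to $C^q$, we will need assumptions of the type:

$$(H_q)\qquad \sup_{s\in[0,1]} E\Big|\frac{\partial^{a+b} g}{\partial x^a \partial y^b}\big(B^{(1)}_s, B^{(2)}_s\big)\Big|^p < \infty \quad\text{for all } p \ge 1 \text{ and all integers } a, b \ge 0 \text{ such that } a + b \le q. \tag{3.1}$$

We begin with the following technical lemma:

Lemma 3.1. Let $\beta$ be a 1D fractional Brownian motion of Hurst index 1/4. We have
(i) $|E(\beta_r(\beta_t - \beta_s))| \le \sqrt{|t - s|}$ for any $0 \le r, s, t \le 1$,
(ii) $\sum_{k,l=0}^{n-1} |\langle \varepsilon_{l/n}, \delta_{k/n}\rangle_{\mathcal{H}}| = O(n)$ as $n \to \infty$,
(iii) $\sum_{k,l=0}^{n-1} |\langle \delta_{l/n}, \delta_{k/n}\rangle_{\mathcal{H}}|^r = O(n^{1-r/2})$ as $n \to \infty$, for any $r \ge 1$,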

(iv) (v)

n−1

k=0 | εk/n , δk/n H n−1 2 k=0 | εk/n , δk/n H

+ −

2311

1 √ | = O(1), 2 n n→∞ √ 1 = O(1/ n). 4n | n→∞
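The orders of magnitude in Lemma 3.1 can be checked numerically from the explicit covariance of a fractional Brownian motion of Hurst index 1/4, namely $E(\beta_s\beta_t)=\frac12(\sqrt s+\sqrt t-\sqrt{|t-s|})$. The following is only a sketch under that assumption; the helper names `eps_delta`, `rho` and `lemma_sums` are illustrative and do not come from the paper.

```python
from math import sqrt


def eps_delta(l, k, n):
    """<eps_{l/n}, delta_{k/n}> = E[beta_{l/n} (beta_{(k+1)/n} - beta_{k/n})] for H = 1/4."""
    return (sqrt(k + 1) - sqrt(k) - sqrt(abs(k + 1 - l)) + sqrt(abs(k - l))) / (2 * sqrt(n))


def rho(x):
    """Normalised covariance of increments: <delta_{l/n}, delta_{k/n}> = rho(l - k) / sqrt(n)."""
    return 0.5 * (sqrt(abs(x + 1)) + sqrt(abs(x - 1)) - 2 * sqrt(abs(x)))


def lemma_sums(n, r=2):
    """Return the four sums appearing in items (ii) to (v) of Lemma 3.1."""
    s_ii = sum(abs(eps_delta(l, k, n)) for k in range(n) for l in range(n))
    s_iii = sum(abs(rho(l - k) / sqrt(n)) ** r for k in range(n) for l in range(n))
    s_iv = sum(abs(eps_delta(k, k, n) + 1 / (2 * sqrt(n))) for k in range(n))
    s_v = sum(abs(eps_delta(k, k, n) ** 2 - 1 / (4 * n)) for k in range(n))
    return s_ii, s_iii, s_iv, s_v


if __name__ == "__main__":
    for n in (50, 100, 200):
        s_ii, s_iii, s_iv, s_v = lemma_sums(n)
        # Ratios that should stay bounded as n grows (here r = 2, so n^{1 - r/2} = 1).
        print(n, s_ii / n, s_iii, s_iv, s_v * sqrt(n))
```

In item (iv) the sum telescopes, so the third printed quantity equals 1/2 up to rounding for every $n$.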

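Proof. (i) We have
\[ E\big(\beta_r(\beta_t-\beta_s)\big)=\frac12\big(\sqrt t-\sqrt s\big)+\frac12\big(\sqrt{|s-r|}-\sqrt{|t-r|}\big). \]
Using the classical inequality $\big|\sqrt{|b|}-\sqrt{|a|}\big|\le\sqrt{|b-a|}$, the desired result follows.

(ii) Observe that
\[ \langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}=\frac{1}{2\sqrt n}\Big(\sqrt{k+1}-\sqrt k-\sqrt{|k+1-l|}+\sqrt{|k-l|}\Big). \]
Consequently, for any fixed $l\in\{0,\dots,n-1\}$, we have
\[ \sum_{k=0}^{n-1}\big|\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}\big|\le\frac12+\frac{1}{2\sqrt n}\left(\sum_{k=0}^{l-1}\big(\sqrt{l-k}-\sqrt{l-k-1}\big)+1+\sum_{k=l+1}^{n-1}\big(\sqrt{k-l+1}-\sqrt{k-l}\big)\right)=\frac12+\frac{1}{2\sqrt n}\big(\sqrt l+\sqrt{n-l}\big), \]
from which we deduce that $\sup_{0\le l\le n-1}\sum_{k=0}^{n-1}|\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}|=O(1)$ as $n\to\infty$. It follows that
\[ \sum_{k,l=0}^{n-1}\big|\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}\big|\le n\sup_{0\le l\le n-1}\sum_{k=0}^{n-1}\big|\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}\big|=O(n)\quad\text{as } n\to\infty. \]

(iii) We have, by noting $\rho(x)=\frac12\big(\sqrt{|x+1|}+\sqrt{|x-1|}-2\sqrt{|x|}\big)$:
\[ \sum_{k,l=0}^{n-1}\big|\langle\delta_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}\big|^{r}=n^{-r/2}\sum_{k,l=0}^{n-1}|\rho(l-k)|^{r}\le n^{1-r/2}\sum_{k\in\mathbb{Z}}|\rho(k)|^{r}. \]
Since $\sum_{k\in\mathbb{Z}}|\rho(k)|^{r}<\infty$ if $r\ge 1$, the desired conclusion follows.

(iv) is a consequence of the following identity combined with a telescopic sum argument:
\[ \langle\varepsilon_{k/n},\delta_{k/n}\rangle_{\mathfrak{H}}+\frac{1}{2\sqrt n}=\frac{1}{2\sqrt n}\big(\sqrt{k+1}-\sqrt k\big). \]

(v) We have
\[ \langle\varepsilon_{k/n},\delta_{k/n}\rangle^{2}_{\mathfrak{H}}-\frac{1}{4n}=\frac{1}{4n}\big(\sqrt{k+1}-\sqrt k\big)\big(\sqrt{k+1}-\sqrt k-2\big). \]
Thus, the desired bound is immediately checked by combining a telescoping sum argument with the fact that
\[ \big|\sqrt{k+1}-\sqrt k-2\big|=\left|\frac{1}{\sqrt{k+1}+\sqrt k}-2\right|\le 2. \]
□

Also the following lemma will be useful in the sequel:

Lemma 3.2. Let $\alpha\ge 0$ and $q\ge 2$ be two integers, $g:\mathbb{R}^2\to\mathbb{R}$ be any function belonging to $C^{2q}$ and verifying $(H_{2q})$ defined by (3.1), and $B=(B^{(1)},B^{(2)})$ be a 2D fractional Brownian motion of Hurst index 1/4. Set
\[ V_n=n^{-q/4}\sum_{k=0}^{n-1}g\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(1)}_{k/n}\big)^{\alpha}H_q\big(n^{1/4}\Delta B^{(2)}_{k/n}\big), \]
where $H_q$ denotes the $q$th Hermite polynomial defined by $H_q(x)=(-1)^q e^{x^2/2}\frac{d^q}{dx^q}\big(e^{-x^2/2}\big)$. Then, the following bound is in order:
\[ E\big(|V_n|^2\big)=O\big(n^{1-q/2-\alpha/2}\big)\quad\text{as } n\to\infty. \]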

Proof. We can write E |Vn |2 = n−q/2

n−1

(1) (2) (1) (2) (1) α (1) α E g Bk/n , Bk/n g Bl/n , Bl/n Bk/n Bl/n

k,l=0

(2) (2) × Hq n1/4 Bk/n Hq n1/4 Bl/n =

(2.3)

n−1

⊗q ⊗q (1) (2) (1) (2) (1) α (1) α E g Bk/n , Bk/n g Bl/n , Bl/n Bk/n Bl/n Iq(2) δk/n Iq(2) δl/n

k,l=0

2 q n−1 (1) (2) (1) (2) q = r! E g Bk/n , Bk/n g Bl/n , Bl/n (2.6) r r=0

k,l=0

⊗q−r (1) α (1) α (2) ⊗q−r Bl/n I2q−2r δk/n ⊗ δl/n δk/n , δl/n rH × Bk/n 2 q n−1

2q−2r (1) (2) (1) (2) q r! E DB (2) g Bk/n , Bk/n g Bl/n , Bl/n (2.5) r =

r=0

k,l=0

⊗q−r (1) α (1) α ⊗q−r × Bk/n Bl/n , δk/n ⊗ δl/n H⊗2q−2r δk/n , δl/n rH 2 q q = r! (2.2) r r=0

a+b=2q−2r

a n−1 (a + b)! d g (1) (2) d b g (1) (2) B ,B B ,B E a!b! dy a k/n k/n dy b l/n l/n k,l=0

(3.2)



⊗a ⊗q−r (1) α (1) α ⊗b ⊗q−r εl/n (2q − 2r)! εk/n ⊗ × Bk/n Bl/n , δk/n ⊗ δl/n H⊗2q−2r δk/n , δl/n rH . (3.3) Now, observe that, uniformly in k, l ∈ {0, . . . , n − 1}:

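\[ \big\langle\varepsilon_{k/n}^{\otimes a}\otimes\varepsilon_{l/n}^{\otimes b},\ \delta_{k/n}^{\otimes q-r}\otimes\delta_{l/n}^{\otimes q-r}\big\rangle_{\mathfrak{H}^{\otimes 2q-2r}}=O\big(n^{-(q-r)}\big)\quad\text{as } n\to\infty,\ \text{see Lemma 3.1(i)}, \]
\[ E\left\{\frac{\partial^a g}{\partial y^a}\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\frac{\partial^b g}{\partial y^b}\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\big(\Delta B^{(1)}_{k/n}\big)^{\alpha}\big(\Delta B^{(1)}_{l/n}\big)^{\alpha}\right\}=O\big(n^{-\alpha/2}\big)\quad\text{as } n\to\infty,\ \text{use } (H_{2q}), \]
and, also:
\[ \sum_{k,l=0}^{n-1}\big|\langle\delta_{k/n},\delta_{l/n}\rangle_{\mathfrak{H}}\big|^{r}=O\big(n^{1-r/2}\big)\quad\text{for any fixed } r\ge 1,\ \text{see Lemma 3.1(iii)}. \]
Finally, the desired conclusion is obtained by plugging these three bounds into (3.3), after having separated the cases r = 0 and r = 1. □

The independent Brownian motion appearing in (1.3) comes from the following proposition.

Proposition 3.3. Let $(\beta,\tilde\beta)$ be a 2D fractional Brownian motion of Hurst index 1/4. Consider two functions $g,\tilde g:\mathbb{R}^2\to\mathbb{R}$ belonging to $C^4$, and assume that they both verify $(H_4)$ defined by (3.1). Then
\[ (G_n,\tilde G_n):=\left(\frac{1}{\sqrt n}\sum_{k=0}^{n-1}g(\beta_{k/n},\tilde\beta_{k/n})\big(\sqrt n(\Delta\beta_{k/n})^2-1\big),\ \frac{1}{\sqrt n}\sum_{k=0}^{n-1}\tilde g(\beta_{k/n},\tilde\beta_{k/n})\big(\sqrt n(\Delta\tilde\beta_{k/n})^2-1\big)\right) \]
\[ \xrightarrow[n\to\infty]{\text{stably}}\left(\sigma_{1/4}\int_0^1 g(\beta_s,\tilde\beta_s)\,dW_s+\frac14\int_0^1\frac{\partial^2 g}{\partial x^2}(\beta_s,\tilde\beta_s)\,ds,\ \sigma_{1/4}\int_0^1\tilde g(\beta_s,\tilde\beta_s)\,d\tilde W_s+\frac14\int_0^1\frac{\partial^2\tilde g}{\partial y^2}(\beta_s,\tilde\beta_s)\,ds\right), \]
where $(W,\tilde W)$ is a 2D standard Brownian motion independent of $(\beta,\tilde\beta)$, and $\sigma_{1/4}$ is defined by (1.5).

In the particular case where $g(x,y)=g(x)$ and $\tilde g(x,y)=\tilde g(y)$, the conclusion of the proposition follows directly from (1.8). In the general case, the proof consists only in extending literally the proof of (1.8) contained in [15]. Details are left to the reader.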



4. Proof of Theorem 1.2

We are now in position to prove our main result, that is Theorem 1.2.

Proof of the third point (case H < 1/4). Firstly, observe that (1.4) is actually a particular case of the following result, which is valid for any fractional Brownian motion $\beta$ with Hurst index $H$ belonging to (0, 3/4):
\[ \frac{1}{\sqrt n}\sum_{k=0}^{\lfloor n\cdot\rfloor-1}\big(n^{2H}(\Delta\beta_{k/n})^2-1\big)\xrightarrow[n\to\infty]{\text{stably}}\sigma_H W \]
with $W$ an independent Brownian motion and $\sigma_H>0$ an (explicit) constant. By mimicking the proof contained in the fourth point of Remark 1.3, we get, here, for any $H\in(0,3/4)$,
\[ n^{2H-1/2}\sum_{k=0}^{\lfloor n\cdot\rfloor-1}\Delta B^{(1)}_{k/n}\Delta B^{(2)}_{k/n}\xrightarrow[n\to\infty]{\text{stably}}\frac{\sigma_H}{\sqrt2}\,W. \tag{4.1} \]
But, see (1.7), the existence of $\int_0^{\cdot}B_s\cdot dB_s$ would imply in particular that $\sum_{k=0}^{\lfloor n\cdot\rfloor-1}\Delta B^{(1)}_{k/n}\Delta B^{(2)}_{k/n}$ converges in law as $n\to\infty$, which is in contradiction with (4.1) for H < 1/4. The proof of the third point is done.

Proof of the second point (case H = 1/4). For the simplicity of the exposition, we assume from now on that t = 1, the general case being of course similar up to cumbersome notation. For any $a,b,c,d\in\mathbb{R}$, by the classical Taylor formula, we can expand $f(b,d)$ as (compare with (1.7)):
\begin{align*}
& f(a,c)+\partial_1f(a,c)(b-a)+\partial_2f(a,c)(d-c)+\tfrac12\partial_{11}f(a,c)(b-a)^2+\tfrac12\partial_{22}f(a,c)(d-c)^2\\
&\quad+\tfrac16\partial_{111}f(a,c)(b-a)^3+\tfrac16\partial_{222}f(a,c)(d-c)^3+\tfrac1{24}\partial_{1111}f(a,c)(b-a)^4+\tfrac1{24}\partial_{2222}f(a,c)(d-c)^4\\
&\quad+\partial_{12}f(a,c)(b-a)(d-c)+\tfrac12\partial_{112}f(a,c)(b-a)^2(d-c)+\tfrac12\partial_{122}f(a,c)(b-a)(d-c)^2\\
&\quad+\tfrac16\partial_{1112}f(a,c)(b-a)^3(d-c)+\tfrac14\partial_{1122}f(a,c)(b-a)^2(d-c)^2+\tfrac16\partial_{1222}f(a,c)(b-a)(d-c)^3 \tag{4.2}
\end{align*}
plus a remainder term. Here, as usual, the notation $\partial_{1\ldots12\ldots2}f$ (where the index 1 is repeated $k$ times and the index 2 is repeated $l$ times) means that $f$ is differentiated $k$ times with respect to the first component and $l$ times with respect to the second one. By combining (4.2) with the following identity, available for any $h:\mathbb{R}\to\mathbb{R}$ belonging to $C^4$:



\begin{align*}
& h'(a)(b-a)+\tfrac12h''(a)(b-a)^2+\tfrac16h'''(a)(b-a)^3+\tfrac1{24}h''''(a)(b-a)^4\\
&\quad=\frac{h'(a)+h'(b)}{2}(b-a)-\tfrac1{12}h'''(a)(b-a)^3-\tfrac1{24}h''''(a)(b-a)^4+\text{some remainder},
\end{align*}
we get that $f(b,d)$ can also be expanded as
\begin{align*}
& f(a,c)+\tfrac12\big(\partial_1f(a,c)+\partial_1f(b,c)\big)(b-a)-\tfrac1{12}\partial_{111}f(a,c)(b-a)^3-\tfrac1{24}\partial_{1111}f(a,c)(b-a)^4\\
&\quad+\tfrac12\big(\partial_2f(a,c)+\partial_2f(a,d)\big)(d-c)-\tfrac1{12}\partial_{222}f(a,c)(d-c)^3-\tfrac1{24}\partial_{2222}f(a,c)(d-c)^4\\
&\quad+\partial_{12}f(a,c)(b-a)(d-c)+\tfrac12\partial_{112}f(a,c)(b-a)^2(d-c)+\tfrac12\partial_{122}f(a,c)(b-a)(d-c)^2\\
&\quad+\tfrac16\partial_{1112}f(a,c)(b-a)^3(d-c)+\tfrac14\partial_{1122}f(a,c)(b-a)^2(d-c)^2+\tfrac16\partial_{1222}f(a,c)(b-a)(d-c)^3 \tag{4.3}
\end{align*}
plus a remainder term. By setting $a=B^{(1)}_{k/n}$, $b=B^{(1)}_{(k+1)/n}$, $c=B^{(2)}_{k/n}$ and $d=B^{(2)}_{(k+1)/n}$ in (4.3), and by summing the obtained expression for $k$ over $0,\dots,n-1$, we deduce that the conclusion in Theorem 1.2 is a consequence of the following convergences:

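\begin{align}
S_n^{(1)}&:=\sum_{k=0}^{n-1}\partial_{111}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(1)}_{k/n}\big)^3\xrightarrow[n\to\infty]{L^2}-\frac32\int_0^1\partial_{1111}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.4}\\
S_n^{(2)}&:=\sum_{k=0}^{n-1}\partial_{1111}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(1)}_{k/n}\big)^4\xrightarrow[n\to\infty]{L^2}3\int_0^1\partial_{1111}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.5}\\
S_n^{(3)}&:=\sum_{k=0}^{n-1}\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(2)}_{k/n}\big)^3\xrightarrow[n\to\infty]{L^2}-\frac32\int_0^1\partial_{2222}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.6}\\
S_n^{(4)}&:=\sum_{k=0}^{n-1}\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(2)}_{k/n}\big)^4\xrightarrow[n\to\infty]{L^2}3\int_0^1\partial_{2222}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.7}\\
S_n^{(5)}&:=\sum_{k=0}^{n-1}\partial_{12}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(1)}_{k/n}\Delta B^{(2)}_{k/n}\xrightarrow[n\to\infty]{\text{stably}}\frac{\sigma_{1/4}}{\sqrt2}\int_0^1\partial_{12}f\big(B^{(1)}_s,B^{(2)}_s\big)\,dW_s+\frac14\int_0^1\partial_{1122}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.8}\\
S_n^{(6)}&:=\sum_{k=0}^{n-1}\partial_{112}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(1)}_{k/n}\big)^2\Delta B^{(2)}_{k/n}\xrightarrow[n\to\infty]{L^2}-\frac12\int_0^1\partial_{1122}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.9}\\
S_n^{(7)}&:=\sum_{k=0}^{n-1}\partial_{122}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(1)}_{k/n}\big(\Delta B^{(2)}_{k/n}\big)^2\xrightarrow[n\to\infty]{L^2}-\frac12\int_0^1\partial_{1122}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.10}\\
S_n^{(8)}&:=\sum_{k=0}^{n-1}\partial_{1122}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(1)}_{k/n}\big)^2\big(\Delta B^{(2)}_{k/n}\big)^2\xrightarrow[n\to\infty]{L^2}\int_0^1\partial_{1122}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds, \tag{4.11}\\
S_n^{(9)}&:=\sum_{k=0}^{n-1}\partial_{1112}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\big(\Delta B^{(1)}_{k/n}\big)^3\Delta B^{(2)}_{k/n}\xrightarrow[n\to\infty]{\text{Prob}}0, \tag{4.12}\\
S_n^{(10)}&:=\sum_{k=0}^{n-1}\partial_{1222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(1)}_{k/n}\big(\Delta B^{(2)}_{k/n}\big)^3\xrightarrow[n\to\infty]{\text{Prob}}0. \tag{4.13}
\end{align}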
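The one-dimensional identity used to pass from (4.2) to (4.3) can be verified exactly on polynomials: when $h$ has degree at most 4, both sides equal $h(b)-h(a)$ and the remainder vanishes. The sketch below does this with rational arithmetic; the test polynomial and the function names are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def derivatives(coeffs, x, order=4):
    """Values h(x), h'(x), ..., h^(order)(x) for h with the given coefficients (lowest degree first)."""
    vals = []
    c = list(coeffs)
    for _ in range(order + 1):
        vals.append(sum(ci * x ** i for i, ci in enumerate(c)))
        c = [i * ci for i, ci in enumerate(c)][1:]   # differentiate once
    return vals


def taylor_side(coeffs, a, b):
    """h'(a)(b-a) + h''(a)(b-a)^2/2 + h'''(a)(b-a)^3/6 + h''''(a)(b-a)^4/24."""
    _, h1, h2, h3, h4 = derivatives(coeffs, a)
    d = b - a
    return h1 * d + h2 * d ** 2 / 2 + h3 * d ** 3 / 6 + h4 * d ** 4 / 24


def trapezoid_side(coeffs, a, b):
    """(h'(a)+h'(b))/2 (b-a) - h'''(a)(b-a)^3/12 - h''''(a)(b-a)^4/24."""
    _, h1a, _, h3, h4 = derivatives(coeffs, a)
    h1b = derivatives(coeffs, b)[1]
    d = b - a
    return (h1a + h1b) / 2 * d - h3 * d ** 3 / 12 - h4 * d ** 4 / 24


if __name__ == "__main__":
    h = [3, -1, 4, 1, -5]                 # h(x) = 3 - x + 4x^2 + x^3 - 5x^4
    a, b = Fraction(1, 3), Fraction(7, 5)
    print(taylor_side(h, a, b) == trapezoid_side(h, a, b))
```

For a polynomial of degree 5 or more the two sides differ, which is exactly the remainder term neglected in (4.3).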

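Note that the term corresponding to the remainder in (4.3) converges in probability to zero due to the fact that $B$ has a finite quartic variation.

Proof of (4.4), (4.6), (4.9) and (4.10). By Lemma 3.2 with $q=3$ and $\alpha=0$, and by using the basic fact that
\[ \big(\Delta B^{(2)}_{k/n}\big)^3=n^{-3/4}H_3\big(n^{1/4}\Delta B^{(2)}_{k/n}\big)+\frac{3}{\sqrt n}\Delta B^{(2)}_{k/n}, \tag{4.14} \]
we immediately see that (4.6) is a consequence of the following convergence:
\[ \frac{1}{\sqrt n}\sum_{k=0}^{n-1}\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(2)}_{k/n}\xrightarrow[n\to\infty]{L^2}-\frac12\int_0^1\partial_{2222}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds. \tag{4.15} \]
So, let us prove (4.15). We have, on one hand:
\begin{align*}
&E\left|\frac{1}{\sqrt n}\sum_{k=0}^{n-1}\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(2)}_{k/n}\right|^2\\
&=\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Delta B^{(2)}_{k/n}\Delta B^{(2)}_{l/n}\Big\}\\
&=\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)I^{(2)}_2(\delta_{k/n}\otimes\delta_{l/n})\Big\}\\
&\quad+\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\delta_{k/n},\delta_{l/n}\rangle_{\mathfrak{H}}\\
&=\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{22222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{k/n},\delta_{k/n}\rangle_{\mathfrak{H}}\langle\varepsilon_{k/n},\delta_{l/n}\rangle_{\mathfrak{H}}\\
&\quad+\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{k/n},\delta_{k/n}\rangle_{\mathfrak{H}}\langle\varepsilon_{l/n},\delta_{l/n}\rangle_{\mathfrak{H}}\\
&\quad+\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}\langle\varepsilon_{k/n},\delta_{l/n}\rangle_{\mathfrak{H}}\\
&\quad+\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{22222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}\langle\varepsilon_{l/n},\delta_{l/n}\rangle_{\mathfrak{H}}\\
&\quad+\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\delta_{k/n},\delta_{l/n}\rangle_{\mathfrak{H}}\\
&=a^{(n)}+b^{(n)}+c^{(n)}+d^{(n)}+e^{(n)}.
\end{align*}
Using Lemma 3.1(i) and (ii), we have that $a^{(n)}$, $c^{(n)}$ and $d^{(n)}$ tend to zero as $n\to\infty$. Using Lemma 3.1(iii), we have that $e^{(n)}$ tends to zero as $n\to\infty$. Finally, observe that
\begin{align*}
b^{(n)}&=\frac{1}{4n^2}\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\\
&\quad-\frac{1}{2n\sqrt n}\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\Big(\langle\varepsilon_{l/n},\delta_{l/n}\rangle_{\mathfrak{H}}+\frac{1}{2\sqrt n}\Big)\\
&\quad+\frac1n\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{k/n},\delta_{k/n}\rangle_{\mathfrak{H}}\Big(\langle\varepsilon_{l/n},\delta_{l/n}\rangle_{\mathfrak{H}}+\frac{1}{2\sqrt n}\Big).
\end{align*}
Therefore, using Lemma 3.1(i) and (iv), we have
\[ E\left|\frac{1}{\sqrt n}\sum_{k=0}^{n-1}\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(2)}_{k/n}\right|^2=E\left|\frac{1}{2n}\sum_{k=0}^{n-1}\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\right|^2+o(1). \tag{4.16} \]
On the other hand, we have
\begin{align*}
&E\left\{\frac{1}{\sqrt n}\sum_{k=0}^{n-1}\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(2)}_{k/n}\cdot\frac{(-1)}{2n}\sum_{l=0}^{n-1}\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\right\}\\
&=-\frac{1}{2n\sqrt n}\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Delta B^{(2)}_{k/n}\Big\}\\
&=-\frac{1}{2n\sqrt n}\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{k/n},\delta_{k/n}\rangle_{\mathfrak{H}}\\
&\quad-\frac{1}{2n\sqrt n}\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{22222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}\\
&=\frac{1}{4n^2}\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\\
&\quad-\frac{1}{2n\sqrt n}\sum_{k,l=0}^{n-1}E\Big\{\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\Big(\langle\varepsilon_{k/n},\delta_{k/n}\rangle_{\mathfrak{H}}+\frac{1}{2\sqrt n}\Big)\\
&\quad-\frac{1}{2n\sqrt n}\sum_{k,l=0}^{n-1}E\Big\{\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\partial_{22222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\Big\}\langle\varepsilon_{l/n},\delta_{k/n}\rangle_{\mathfrak{H}}.
\end{align*}
We immediately have that the second (see Lemma 3.1(iv)) and the third (see Lemma 3.1(ii)) terms in the previous expression tend to zero as $n\to\infty$. That is
\[ E\left\{\frac{1}{\sqrt n}\sum_{k=0}^{n-1}\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(2)}_{k/n}\cdot\frac{(-1)}{2n}\sum_{l=0}^{n-1}\partial_{2222}f\big(B^{(1)}_{l/n},B^{(2)}_{l/n}\big)\right\}=E\left|\frac{1}{2n}\sum_{k=0}^{n-1}\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\right|^2+o(1). \tag{4.17} \]
We have proved, see (4.16) and (4.17), that
\[ E\left|\frac{1}{\sqrt n}\sum_{k=0}^{n-1}\partial_{222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(2)}_{k/n}+\frac{1}{2n}\sum_{k=0}^{n-1}\partial_{2222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\right|^2\xrightarrow[n\to\infty]{}0. \]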

This implies (4.15). The proof of (4.4) follows directly from (4.6) by exchanging the roles played by $B^{(1)}$ and $B^{(2)}$. On the other hand, by combining Lemma 3.2 with the following basic identity:
\[ \big(\Delta B^{(2)}_{k/n}\big)^2=\frac{1}{\sqrt n}H_2\big(n^{1/4}\Delta B^{(2)}_{k/n}\big)+\frac{1}{\sqrt n}, \]
we see that (4.10) is also a direct consequence of (4.15). Finally, (4.9) is obtained from (4.10) by exchanging the roles played by $B^{(1)}$ and $B^{(2)}$.

Proof of (4.5), (4.7) and (4.11). By combining Lemma 3.2 with the identity
\[ \big(\Delta B^{(1)}_{k/n}\big)^4=\frac1n H_4\big(n^{1/4}\Delta B^{(1)}_{k/n}\big)+\frac6n H_2\big(n^{1/4}\Delta B^{(1)}_{k/n}\big)+\frac3n, \]
we see that (4.7) is easily obtained through a Riemann sum argument. We can use the same arguments in order to prove (4.5). Finally, to obtain (4.11), it suffices to combine Lemma 3.2 with the identity
\[ \big(\Delta B^{(1)}_{k/n}\big)^2\big(\Delta B^{(2)}_{k/n}\big)^2=\frac1n+\frac{1}{\sqrt n}\big(\Delta B^{(1)}_{k/n}\big)^2H_2\big(n^{1/4}\Delta B^{(2)}_{k/n}\big)+\frac1n H_2\big(n^{1/4}\Delta B^{(1)}_{k/n}\big). \]

Proof of (4.12) and (4.13). We only prove (4.13), the proof of (4.12) being obtained from (4.13) by exchanging the roles played by $B^{(1)}$ and $B^{(2)}$. By combining (4.14) with Lemma 3.2, it suffices to prove that
\[ \frac{1}{\sqrt n}\sum_{k=0}^{n-1}\partial_{1222}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(1)}_{k/n}\Delta B^{(2)}_{k/n}\xrightarrow[n\to\infty]{\text{Prob}}0. \]

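But this last convergence follows directly from Proposition 3.3. Therefore, the proof of (4.13) is done.

Proof of (4.8). We combine Proposition 3.3 with the idea developed in the third comment that we have addressed just after the statement of Theorem 1.2. Indeed, we have
\begin{align*}
\sum_{k=0}^{n-1}\partial_{12}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(1)}_{k/n}\Delta B^{(2)}_{k/n}
&=\frac{1}{2\sqrt n}\sum_{k=0}^{n-1}\partial_{12}f\Big(\frac{\beta_{k/n}+\tilde\beta_{k/n}}{\sqrt2},\frac{\beta_{k/n}-\tilde\beta_{k/n}}{\sqrt2}\Big)\big(\sqrt n(\Delta\beta_{k/n})^2-1\big)\\
&\quad-\frac{1}{2\sqrt n}\sum_{k=0}^{n-1}\partial_{12}f\Big(\frac{\beta_{k/n}+\tilde\beta_{k/n}}{\sqrt2},\frac{\beta_{k/n}-\tilde\beta_{k/n}}{\sqrt2}\Big)\big(\sqrt n(\Delta\tilde\beta_{k/n})^2-1\big),
\end{align*}
for $\beta=(B^{(1)}+B^{(2)})/\sqrt2$ and $\tilde\beta=(B^{(1)}-B^{(2)})/\sqrt2$. Note that $(\beta,\tilde\beta)$ is also a 2D fractional Brownian motion of Hurst index 1/4. Hence, using Proposition 3.3 with $g(x,y)=\tilde g(x,y)=\partial_{12}f\big(\frac{x+y}{\sqrt2},\frac{x-y}{\sqrt2}\big)$, we get
\begin{align*}
\sum_{k=0}^{n-1}\partial_{12}f\big(B^{(1)}_{k/n},B^{(2)}_{k/n}\big)\Delta B^{(1)}_{k/n}\Delta B^{(2)}_{k/n}
&\xrightarrow[n\to\infty]{\text{stably}}\frac{\sigma_{1/4}}{2}\int_0^1\partial_{12}f\big(B^{(1)}_s,B^{(2)}_s\big)\,d(W-\tilde W)_s+\frac14\int_0^1\partial_{1122}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds\\
&\overset{\text{Law}}{=}\frac{\sigma_{1/4}}{\sqrt2}\int_0^1\partial_{12}f\big(B^{(1)}_s,B^{(2)}_s\big)\,dW_s+\frac14\int_0^1\partial_{1122}f\big(B^{(1)}_s,B^{(2)}_s\big)\,ds,
\end{align*}
for $(W,\tilde W)$ a 2D standard Brownian motion independent of $(\beta,\tilde\beta)$. The proof of (4.8) is done.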
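The Hermite identities invoked in the proofs above, namely (4.14) for the cube and the companion identities for the square and the fourth power of an increment, are purely algebraic and can be spot-checked numerically. The sketch below evaluates $H_q$ through the recurrence $H_{n+1}(x)=xH_n(x)-nH_{n-1}(x)$, which is assumed here as the standard recurrence for the Hermite polynomials of Section 2; `delta` stands for an arbitrary real value of an increment and the function names are illustrative.

```python
def hermite_value(q, x):
    """Value of the Hermite polynomial H_q at x, via H_{n+1} = x H_n - n H_{n-1}."""
    prev, cur = 1.0, x
    if q == 0:
        return prev
    for n in range(1, q):
        prev, cur = cur, x * cur - n * prev
    return cur


def square_via_h2(delta, n):
    """(delta)^2 written as n^{-1/2} H_2(n^{1/4} delta) + n^{-1/2}."""
    return (hermite_value(2, n ** 0.25 * delta) + 1) / n ** 0.5


def cube_via_h3(delta, n):
    """(delta)^3 written as in (4.14)."""
    return n ** -0.75 * hermite_value(3, n ** 0.25 * delta) + 3 / n ** 0.5 * delta


def fourth_via_h4(delta, n):
    """(delta)^4 written as (H_4 + 6 H_2 + 3) / n evaluated at n^{1/4} delta."""
    y = n ** 0.25 * delta
    return (hermite_value(4, y) + 6 * hermite_value(2, y) + 3) / n


if __name__ == "__main__":
    delta, n = 0.37, 50
    print(abs(square_via_h2(delta, n) - delta ** 2))
    print(abs(cube_via_h3(delta, n) - delta ** 3))
    print(abs(fourth_via_h4(delta, n) - delta ** 4))
```

All three differences are at the level of floating-point rounding, for any choice of `delta` and `n`.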



Proof of the first point (case H > 1/4). The proof can be done by following exactly the same strategy as in the step above. The only difference is that, using a version of Lemma 3.2 together with computations similar to those allowing one to obtain (4.15), the limits in (4.4)–(4.11) are, here, all equal to zero (for the sake of simplicity, the technical details are left to the reader). Therefore, we can deduce (1.2) by using (4.3).

Acknowledgments

I wrote the revised version of this work while I was visiting The Banff Centre (Canada), on the occasion of the workshop “Differential Equations Driven by Fractional Brownian Motion as Random Dynamical Systems: Qualitative Properties,” from September 28 until October 5, 2008. I heartily thank David Nualart, Björn Schmalfuß and Frederi Viens for the invitation and the generous support.

References

[1] P. Breuer, P. Major, Central limit theorems for nonlinear functionals of Gaussian fields, J. Multivariate Anal. 13 (3) (1983) 425–441.
[2] K. Burdzy, J. Swanson, A change of variable formula with Itô correction term, preprint, 2008, available on ArXiv.
[3] P. Cheridito, D. Nualart, Stochastic integral of divergence type with respect to fractional Brownian motion with Hurst parameter H in (0, 1/2), Ann. Inst. H. Poincaré Probab. Statist. 41 (2005) 1049–1081.
[4] L. Coutin, Z. Qian, Stochastic rough path analysis and fractional Brownian motion, Probab. Theory Related Fields 122 (2002) 108–140.
[5] D. Dürr, S. Goldstein, J.L. Lebowitz, Asymptotics of particle trajectories in infinite one-dimensional systems with collision, Comm. Pure Appl. Math. 38 (1985) 573–597.
[6] M. Errami, F. Russo, n-Covariation, generalized Dirichlet processes and calculus with respect to finite cubic variation processes, Stochastic Process. Appl. 104 (2003) 259–299.
[7] D. Feyel, A. de La Pradelle, Curvilinear integrals along enriched paths, Electron. J. Probab. 11 (2006) 860–892.
[8] P. Friz, N. Victoir, Differential equations driven by Gaussian signals I, Ann. Inst. H. Poincaré Probab. Statist., 2009, in press, available on ArXiv.
[9] M. Gradinaru, I. Nourdin, F. Russo, P. Vallois, m-Order integrals and Itô’s formula for non-semimartingale processes: The case of a fractional Brownian motion with any Hurst index, Ann. Inst. H. Poincaré Probab. Statist. 41 (2005) 781–806.
[10] T.E. Harris, Diffusions with collisions between particles, J. Appl. Probab. 2 (1965) 323–338.
[11] J. Jacod, A.N. Shiryayev, Limit Theorems for Stochastic Processes, Springer, Berlin, 1987.
[12] R. Klein, E. Giné, On quadratic variation of processes with Gaussian increments, Ann. Probab. 3 (1975) 716–721.
[13] T. Lyons, Differential equations driven by rough signals, Rev. Mat. Iberoamericana 14 (2) (1998) 215–310.
[14] P. Malliavin, Stochastic Analysis, Springer, Berlin, 1997.
[15] I. Nourdin, A. Réveillac, Asymptotic behavior of weighted quadratic variations of fractional Brownian motion: The critical case H = 1/4, preprint, 2008, available on ArXiv.
[16] D. Nualart, The Malliavin Calculus and Related Topics of Probability and Its Applications, second ed., Springer, Berlin, 2006.
[17] G. Peccati, C.A. Tudor, Gaussian limits for vector-valued multiple stochastic integrals, in: Sémin. Probab., 38, in: Lecture Notes in Math., vol. 1857, Springer, Berlin, 2005, pp. 247–262.
[18] J. Unterberger, Stochastic calculus for fractional Brownian motion with Hurst exponent H > 1/4: A rough path method by analytic extension, Ann. Probab., 2008, in press.


On an extreme class of real interpolation spaces

Fernando Cobos a,∗,1,2, Luz M. Fernández-Cabrera b,1,2, Thomas Kühn c,1, Tino Ullrich a,3,4

a Departamento de Análisis Matemático, Facultad de Matemáticas, Universidad Complutense de Madrid, Plaza de Ciencias 3, 28040 Madrid, Spain
b Sección Departamental de Matemática Aplicada, Escuela de Estadística, Universidad Complutense de Madrid, 28040 Madrid, Spain
c Mathematisches Institut, Fakultät für Mathematik und Informatik, Universität Leipzig, Johannisgasse 26, D-04103, Germany

Received 17 June 2008; accepted 15 December 2008
Available online 23 December 2008
Communicated by Alain Connes

Dedicated to the memory of Professor Pedro Matos

Abstract We investigate the limit class of interpolation spaces that comes up by the choice θ = 0 in the definition of the real method. These spaces arise naturally interpolating by the J -method associated to the unit square. Their duals coincide with the other extreme spaces obtained by the choice θ = 1. We also study the behavior of compact operators under these two extreme interpolation methods. Moreover, we establish some interpolation formulae for function spaces and for spaces of operators. © 2008 Elsevier Inc. All rights reserved. Keywords: Extreme interpolation spaces; Real interpolation; J -functional; K-functional; Interpolation methods associated to polygons; Compact operators; Lorentz–Zygmund function spaces; Spaces of operators


E-mail addresses: [email protected] (F. Cobos), [email protected] (L.M. Fernández-Cabrera), [email protected] (T. Kühn), [email protected] (T. Ullrich). 1 Supported in part by the Spanish Ministerio de Educación y Ciencia (MTM2007-62121). 2 Supported in part by CAM-UCM (Grupo de Investigación 910348). 3 Supported by the German Academic Exchange Service DAAD (D/06/48125). 4 Now at: Hausdorff Center for Mathematics, Endenicher Allee 62, D-53115 Bonn, Germany. 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.013


F. Cobos et al. / Journal of Functional Analysis 256 (2009) 2321–2366

Fig. 1.1.

1. Introduction

Interpolation theory is a very useful tool not only in functional analysis, operator theory, harmonic analysis and partial differential equations, but also in some other more distant areas of mathematics. Monographs by Butzer and Berens [6], Bergh and Löfström [5], Triebel [36], Beauzamy [2], Bennett and Sharpley [4], Connes [18] and Amrein, Boutet de Monvel and Georgescu [1] illustrate this fact.

The real interpolation method $(A_0,A_1)_{\theta,q}$ is particularly useful due to its flexibility. It has several equivalent definitions, the most important being those given by Peetre’s K- and J-functionals, which allow one to use it in different contexts. Let $A_0,A_1$ be Banach spaces with $A_0$ continuously embedded in $A_1$ and $1\le q\le\infty$. When $\theta$ runs in (0, 1), the spaces $(A_0,A_1)_{\theta,q}$ form a “continuous scale” of spaces joining $A_0$ with $A_1$. If we imagine $A_0$ and $A_1$ sitting on the endpoints of the segment [0, 1] then we can think of $(A_0,A_1)_{\theta,q}$ as the space located at the point $\theta$ (see Fig. 1.1). This picture is also connected with the definitions of the K- and J-functionals: having in mind that $A_j$ is sitting on the point $j$, we modify the norm of $A_0+A_1=A_1$ by inserting the weight $t^j$ in front of $\|\cdot\|_{A_j}$ and the outcome is the K-functional (see Section 2 for the precise definition). The J-functional is generated similarly but working with the intersection $A_0\cap A_1=A_0$.

The geometrical elements involved in the construction of the real interpolation space are more visible when we consider its extensions to finite families (N-tuples) of Banach spaces. So, in the extension proposed by Cobos and Peetre [17], spaces of the N-tuple $\{A_1,\dots,A_N\}$ are thought of as sitting on the vertices of a convex polygon $\Pi=P_1\cdots P_N$ in the plane $\mathbb{R}^2$. K- and J-functionals with two parameters $t,s$ are defined by inserting the weight $t^{x_j}s^{y_j}$ in front of the norm of $A_j$, where $P_j=(x_j,y_j)$ is the vertex on which $A_j$ is sitting. Then, for any point $(\alpha,\beta)$ in the interior of $\Pi$, K- and J-spaces are introduced by using an $(\alpha,\beta)$-weighted $L_q$-norm. For the special choice of $\Pi$ as the simplex, these constructions give back (the first non-trivial case of) spaces introduced by Sparr [35], and if $\Pi$ coincides with the unit square they recover spaces studied by Fernandez [23].

In developing the theory of interpolation methods associated to polygons, there is a case that sometimes is harder and may give rise to unexpected results. This is the case when the interior point $(\alpha,\beta)$ is in any diagonal of $\Pi$ (see [22,16]). A recent result on this matter of Cobos, Fernández-Cabrera and Martín [11] shows that if $A_0\hookrightarrow A_1$ and we interpolate using the unit square the 4-tuple obtained by sitting $A_0$ on the vertices (0, 0) and (1, 1), and $A_1$ on (1, 0) and (0, 1), then when $(\alpha,\beta)$ lies on the diagonal $\beta=1-\alpha$, K-spaces coincide with limit interpolation spaces $(A_0,A_1)_{1,q;K}$. The case of the J-spaces was left open in [11].

Accordingly, we investigate here spaces that arise using the J-method when $(\alpha,\beta)$ lies on the diagonals. It turns out that if $\beta=\alpha$ then they correspond to the extreme choice $\theta=0$ in the construction of the real interpolation space realized as a J-space. This new class of interpolation spaces, that we call $(0,q;J)$-spaces, are not far from $A_0$. In fact, if $0<\theta_0<\theta_1<1$, $X=(A_0,A_1)_{\theta_0,q}$ and $Y=(A_0,A_1)_{\theta_1,q}$, then $(X,Y)_{0,q;J}$ has a similar description to $X$ but instead of multiplying the K-functional by $t^{-\theta_0}$, we have to multiply by $t^{-\theta_0}(1+\log t)^{-1/q'}$ where $1/q+1/q'=1$ (see Section 3 for the precise result). We show that $(0,q;J)$-spaces can be equivalently described by means of the K-functional and we identify some concrete $(0,q;J)$-spaces generated by couples of function spaces and of spaces of operators.

Results that we establish on $(0,q;J)$-spaces exhibit a number of important changes in comparison with the theory of the real method. For example, referring to norm estimates for interpolated operators, instead of the well-known inequality for the real method
\[ \|T\|_{(A_0,A_1)_{\theta,q},(B_0,B_1)_{\theta,q}}\le\|T\|^{1-\theta}_{A_0,B_0}\|T\|^{\theta}_{A_1,B_1}, \]
interpolated operators by the $(0,q;J)$-method satisfy
\[ \|T\|_{(A_0,A_1)_{0,q;J},(B_0,B_1)_{0,q;J}}\lesssim\|T\|_{A_0,B_0}\left(1+\Big(\log\frac{\|T\|_{A_1,B_1}}{\|T\|_{A_0,B_0}}\Big)_+\right). \]
As for compact operators, a result of Cwikel [19] and Cobos, Kühn and Schonbek [15] shows that if any restriction of the operator is compact, then the interpolated operator by the real method is compact as well. However, this is not the case for the $(0,q;J)$-method as we show in Section 6. It turns out that compactness of $T:A_1\to B_1$ is not enough to imply that the interpolated operator is compact. Nevertheless, compactness of $T:A_0\to B_0$ does it.

We also establish here new results on $(1,q;K)$-spaces. Among others, we show that, contrary to the case of the $(0,q;J)$-method, if $T:A_0\to B_0$ is compact then the interpolated operator by the $(1,q;K)$-method might fail to be compact, but compactness of $T:A_1\to B_1$ implies that the interpolated operator is also compact. Furthermore, we prove that if $1<q<\infty$ and $1/q+1/q'=1$ then the dual of a $(0,q;J)$-space coincides with a $(1,q';K)$-space and, conversely, the dual of a $(1,q;K)$-space is a $(0,q';J)$-space.

The paper is organized as follows. In Section 2 we review some basic concepts from real interpolation and we recall definitions of function spaces and spaces of operators that we shall need later. In Section 3 we introduce $(0,q;J)$-spaces and we show some basic properties and examples. The description of $(0,q;J)$-spaces in terms of the K-functional is given in Section 4, where we also determine some more concrete $(0,q;J)$-spaces. In Section 5 we show that $(0,q;J)$-spaces arise by interpolation using the unit square and the point $(\alpha,\beta)$ in the diagonal $\beta=\alpha$. Compactness results for interpolated operators are established in Section 6. Section 7 is devoted to $(1,q;K)$-spaces and in the final Section 8 we prove the duality theorems.

2. Preliminaries

Let $\bar A=(A_0,A_1)$ be a couple of Banach spaces with $A_0\hookrightarrow A_1$, where the symbol $\hookrightarrow$ means continuous inclusion. For each $t>0$, Peetre’s K- and J-functionals are defined by
\[ K(t,a)=K(t,a;A_0,A_1)=\inf\big\{\|a_0\|_{A_0}+t\|a_1\|_{A_1}:\ a=a_0+a_1,\ a_j\in A_j\big\},\qquad a\in A_1, \]
and
\[ J(t,a)=J(t,a;A_0,A_1)=\max\big\{\|a\|_{A_0},\,t\|a\|_{A_1}\big\},\qquad a\in A_0. \]
For $0<\theta<1$ and $1\le q\le\infty$, the real interpolation space $(A_0,A_1)_{\theta,q}$ realized as a K-space consists of all elements $a\in A_1$ having a finite norm
\[ \|a\|_{\theta,q;K}=\begin{cases}\Big(\int_0^\infty\big(t^{-\theta}K(t,a)\big)^q\,\frac{dt}{t}\Big)^{1/q} & \text{if } 1\le q<\infty,\\[4pt] \sup_{t>0}\big\{t^{-\theta}K(t,a)\big\} & \text{if } q=\infty\end{cases} \]
(see [6,5,36,4]). According to the equivalence theorem (see [5, Theorem 3.3.1]), the space $(A_0,A_1)_{\theta,q}$ can be equivalently described by means of the J-functional as the collection of all those $a\in A_1$ which can be represented as $a=\int_0^\infty u(t)\,dt/t$ (convergence in $A_1$), where $u(t)$ is a strongly measurable function with values in $A_0$ and

∞ 1/q −θ

q dt t J t, u(t) 0: μ x ∈ Ω: f (x) > s t ,



and we put $f^{**}(t)=(1/t)\int_0^t f^*(s)\,ds$. For $1<p<\infty$, $1\le q\le\infty$ and $b\in\mathbb{R}$, or $p=\infty$, $1\le q\le\infty$ and $b<-1/q$, the Lorentz–Zygmund function space $L_{p,q}(\log L)_b$ is defined to be the set of all (equivalence classes of) measurable functions $f$ on $\Omega$ which have a finite norm
\[ \|f\|_{L_{p,q}(\log L)_b}=\left(\int_0^{\mu(\Omega)}\Big[t^{1/p}\big(1+|\log t|\big)^b f^{**}(t)\Big]^q\,\frac{dt}{t}\right)^{1/q} \]
(with the obvious modification if $q=\infty$). We refer to [3,4,20,33] for basic properties of Lorentz–Zygmund function spaces. Note that $L_{p,p}(\log L)_b=L_p(\log L)_b$ are the Zygmund spaces and $L_{p,q}(\log L)_0=L_{p,q}$ are the Lorentz spaces with the Lebesgue spaces $L_p=L_{p,p}$ as special case. It is well known that
\[ K(t,f;L_\infty,L_1)=t\int_0^{1/t}f^*(s)\,ds=f^{**}(1/t). \]
This formula yields
\[ (L_\infty,L_1)_{\theta,q}=L_{p,q} \tag{2.2} \]
provided that $0<\theta=\frac1p<1$ and $1\le q\le\infty$. On the other hand, interpolating with the function parameter $\varrho_{\theta,b}(t)=t^\theta(1+|\log t|)^{-b}$, $t>0$, we get
\[ (L_\infty,L_1)_{\varrho_{\theta,b},q}=L_{p,q}(\log L)_b \]
where $0<\theta=\frac1p<1$, $1\le q\le\infty$ and $b\in\mathbb{R}$. Using this last formula one can describe Lorentz–Zygmund spaces in terms of Lorentz spaces (see [12,30,10]).

We shall also work with weighted spaces. Given any $\sigma$-finite measure space $(\Omega,\mu)$, by a weight $w(x)$ on $\Omega$ we mean any positive measurable function on $\Omega$. The weighted $L_p$ space $L_p(w)$ consists of all (equivalence classes of) measurable functions $f$ on $\Omega$ such that $\|f\|_{L_p(w)}=\|wf\|_{L_p}<\infty$.

For our later considerations, we shall also need some spaces of operators. Let $H$ be a Hilbert space and let $\mathcal{L}(H)$ be the Banach space of all bounded linear operators acting from $H$ into $H$. For $T\in\mathcal{L}(H)$, the singular numbers of $T$ are
\[ s_n(T)=\inf\big\{\|T-R\|_{H,H}:\ R\in\mathcal{L}(H)\ \text{with}\ \operatorname{rank}R<n\big\},\qquad n\in\mathbb{N}. \]
For $1\le p\le\infty$, the Schatten p-class $\mathcal{L}_p(H)$ is formed by all those $T\in\mathcal{L}(H)$ having a finite norm
\[ \|T\|_{\mathcal{L}_p(H)}=\left(\sum_{n=1}^\infty s_n(T)^p\right)^{1/p}. \]
See [26,31].



The so-called Macaev ideal $\mathcal{L}_M(H)$ consists of all $T\in\mathcal{L}(H)$ such that
\[ \|T\|_{\mathcal{L}_M(H)}=\sup_{n\in\mathbb{N}}\left((1+\log n)^{-1}\sum_{k=1}^{n}s_k(T)\right)<\infty. \]
See [26,18].

Some remarks concerning notation. For a real number $a$ we put $a_+=\max\{a,0\}$. As usual, given two quantities $X,Y$ depending on certain parameters, we write $X\lesssim Y$ if there is a constant $c>0$ independent of all parameters such that $X\le cY$. Notation $X\sim Y$ means $X\lesssim Y$ and $Y\lesssim X$. Given two sequences $(b_n)$, $(d_n)$ of non-negative real numbers, notation $b_n\lesssim d_n$ has a similar meaning: there is $c>0$ such that $b_n\le cd_n$ for all $n\in\mathbb{N}$. If $b_n\lesssim d_n$ and $d_n\lesssim b_n$, we write $b_n\sim d_n$.

3. A class of extreme interpolation spaces

In Section 2 we have defined spaces $(A_0,A_1)_{\theta,q}$ for $0<\theta<1$ and $1\le q\le\infty$. As one can see in [6, Proposition 3.2.7], if we take $\theta=0$ in the definition then the J-spaces are meaningful only if $q=1$. However, as we shall show in this section, working with ordered couples, the norm given in (2.1) still makes sense for $\theta=0$ and any $1\le q\le\infty$, and it leads to interesting spaces.

Definition 3.1. Let $A_0,A_1$ be Banach spaces with $A_0\hookrightarrow A_1$ and let $1\le q\le\infty$. The space $(A_0,A_1)_{0,q;J}$ is formed by all those elements $a\in A_1$ for which there exists a strongly measurable function $u(t)$ with values in $A_0$ such that
\[ a=\int_1^\infty u(t)\,\frac{dt}{t}\qquad(\text{convergence in } A_1) \tag{3.1} \]
and
\[ \left(\int_1^\infty J\big(t,u(t)\big)^q\,\frac{dt}{t}\right)^{1/q}<\infty. \]
The discretization $t=\lambda^n$ does not change the space but produces an equivalent norm. It is also worth mentioning that the condition $\sum_{n=1}^\infty J(2^n,u_n)^q<\infty$ implies absolute convergence of the series $\sum_{n=1}^\infty u_n$ in $A_1$. This can be checked easily by using Hölder’s inequality.

Given any Banach space $A$ and $s>0$, we denote by $sA$ the space $A$ with the norm $s\|\cdot\|_A$.

Lemma 3.3. Let $A$ be a Banach space, let $t>0$ and $1\le q\le\infty$. Put
\[ \eta_q(t)=\sup\left\{\sum_{n=1}^\infty\min\{1,t/2^n\}|\xi_n|:\ \Big(\sum_{n=1}^\infty|\xi_n|^q\Big)^{1/q}=1\right\}. \]
Then
\[ \Big(A,\frac1tA\Big)_{0,q;J}=\frac{1}{\eta_q(t)}A \]
and the norm of $\frac{1}{\eta_q(t)}A$ is equal to the discrete norm $\|\cdot\|_{0,q}$.

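Proof. The argument is based on an idea used in [9] to determine the characteristic function of the abstract J-method. Take any $a\in A$. It is easy to check that $K(t,a;A,\frac1tA)=\|a\|_A$. Hence, given any discrete representation $a=\sum_{n=1}^\infty u_n$ we have
\begin{align*}
\|a\|_A=K(t,a)&\le\sum_{n=1}^\infty K(t,u_n)\le\sum_{n=1}^\infty\min\{1,t/2^n\}J(2^n,u_n)\\
&=\big\|\big(J(2^n,u_n)\big)\big\|_{\ell_q}\sum_{n=1}^\infty\min\{1,t/2^n\}\frac{J(2^n,u_n)}{\|(J(2^n,u_n))\|_{\ell_q}}\le\big\|\big(J(2^n,u_n)\big)\big\|_{\ell_q}\,\eta_q(t).
\end{align*}
Therefore, $(1/\eta_q(t))\|a\|_A\le\|a\|_{0,q}$.

To establish the converse inequality, take any $\varepsilon>0$ and find $(\xi_n)_n\in\ell_q$ such that $\|(\xi_n)_n\|_{\ell_q}=1$ and $\eta_q(t)-\varepsilon\le\sum_{n=1}^\infty\min\{1,t/2^n\}|\xi_n|$. Given any $a\in A$, we can represent $a$ by
\[ a=\sum_{n=1}^\infty\min\{1,t/2^n\}\frac{|\xi_n|}{D}\,a\qquad\text{where } D=\sum_{n=1}^\infty\min\{1,t/2^n\}|\xi_n|. \]
We have for every $n\in\mathbb{N}$,
\[ J\Big(2^n,\min\Big\{1,\frac{t}{2^n}\Big\}\frac{|\xi_n|}{D}a;A,\frac1tA\Big)=\min\Big\{1,\frac{t}{2^n}\Big\}\max\Big\{1,\frac{2^n}{t}\Big\}\frac{|\xi_n|}{D}\|a\|_A=\frac{|\xi_n|}{D}\|a\|_A. \]
Whence $\|a\|_{0,q}\le(1/D)\|a\|_A$ and therefore $(\eta_q(t)-\varepsilon)\|a\|_{0,q}\le\|a\|_A$. Since $\varepsilon>0$ is arbitrary, we conclude that $\|a\|_{0,q}\le(1/\eta_q(t))\|a\|_A$. □

Next we compute the function $\eta_q$.

Lemma 3.4. Let $1\le q\le\infty$ and $1/q+1/q'=1$. For $t\ge 2$, we have
\[ \eta_q(t)\sim(\log t)^{1/q'}. \]

Proof. Let $n=[\log_2 t]$, where $[\cdot]$ is the greatest integer function. Put $\mu_m=(\log_2 t)^{-1/q}$ if $1\le m\le n$ and $\mu_m=0$ otherwise. Then $\|(\mu_m)\|_{\ell_q}\le 1$, so
\[ \eta_q(t)\ge\sum_{m=1}^\infty\min\{1,t/2^m\}|\mu_m|=\sum_{m=1}^{n}(\log_2 t)^{-1/q}\gtrsim(\log_2 t)^{1/q'}. \]
Conversely, given any $(\xi_m)\in\ell_q$ with $\|(\xi_m)\|_{\ell_q}\le 1$ we have by Hölder’s inequality
\[ \sum_{m=1}^\infty\min\{1,t/2^m\}|\xi_m|\le\left(\sum_{m=1}^\infty\min\{1,t/2^m\}^{q'}\right)^{1/q'}\le\left(\sum_{m=1}^{n}1\right)^{1/q'}+t\left(\sum_{m=n+1}^\infty 2^{-mq'}\right)^{1/q'}\lesssim n^{1/q'}+t2^{-n}\lesssim(\log t)^{1/q'}. \]
□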
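The following result refers to interpolation of vector-valued sequence spaces. Given any sequence $(G_n)$ of Banach spaces, any sequence $(\gamma_n)$ of positive numbers and $1\le q\le\infty$, we put
\[ \ell_q(\gamma_nG_n)=\Big\{x=(x_n):\ x_n\in G_n\ \text{and}\ \|x\|_{\ell_q(\gamma_nG_n)}=\big\|\big(\gamma_n\|x_n\|_{G_n}\big)\big\|_{\ell_q}<\infty\Big\}. \]
When $\gamma_n=1$ for any $n\in\mathbb{N}$, we write simply $\ell_q(G_n)$. If all $G_n$ coincide with the scalar field $\mathbb{K}$ ($\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$) then we get a weighted $\ell_q$ space that we denote by $\ell_q(\gamma_n)$.

Theorem 3.5. Let $1\le q\le\infty$ and let $(A_n)$, $(B_n)$ be two sequences of Banach spaces with $A_n\hookrightarrow B_n$ for every $n\in\mathbb{N}$ and $\sup\{\|I_n\|_{A_n,B_n}:\ n\in\mathbb{N}\}<\infty$, so $\ell_q(A_n)\hookrightarrow\ell_q(B_n)$. We have with equivalent norms
\[ \big(\ell_q(A_n),\ell_q(B_n)\big)_{0,q;J}=\ell_q\big((A_n,B_n)_{0,q;J}\big). \]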

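Proof. Take any $x = (x_n)$ in $(\ell_q(A_n), \ell_q(B_n))_{0,q;J}$. We write $x$ as a sum $x = \sum_{m=1}^{\infty} u^m$ with $u^m = (u^m_n) \in \ell_q(A_n)$, where the series converges in $\ell_q(B_n)$. Thus, we have $x_n = \sum_{m=1}^{\infty} u^m_n$ in $B_n$. This gives

$$\|x_n\|_{(A_n,B_n)_{0,q;J}} \le \Big(\sum_{m=1}^{\infty} J(2^m, u^m_n; A_n, B_n)^q\Big)^{1/q} \le \Big(\sum_{m=1}^{\infty} \big(\|u^m_n\|^q_{A_n} + 2^{mq}\|u^m_n\|^q_{B_n}\big)\Big)^{1/q}.$$

Hence,

$$\Big(\sum_{n=1}^{\infty} \|x_n\|^q_{(A_n,B_n)_{0,q;J}}\Big)^{1/q} \le \Big(\sum_{m,n=1}^{\infty} \big(\|u^m_n\|^q_{A_n} + 2^{mq}\|u^m_n\|^q_{B_n}\big)\Big)^{1/q} \lesssim \Big(\sum_{m=1}^{\infty} J\big(2^m, u^m; \ell_q(A_n), \ell_q(B_n)\big)^q\Big)^{1/q}.$$

This shows that

$$\big(\ell_q(A_n), \ell_q(B_n)\big)_{0,q;J} \hookrightarrow \ell_q\big((A_n, B_n)_{0,q;J}\big).$$

Conversely, take $x = (x_n) \in \ell_q((A_n, B_n)_{0,q;J})$. For any $\varepsilon > 0$ we can find a representation $x_n = \sum_{m=1}^{\infty} u^m_n$ with

$$\Big(\sum_{m=1}^{\infty} J(2^m, u^m_n; A_n, B_n)^q\Big)^{1/q} \le (1+\varepsilon)\,\|x_n\|_{(A_n,B_n)_{0,q;J}}.$$

Put $u^m = (u^m_n)_{n=1}^{\infty}$. We have

$$\begin{aligned}
\Big(\sum_{m=1}^{\infty} J\big(2^m, u^m; \ell_q(A_n), \ell_q(B_n)\big)^q\Big)^{1/q} &\sim \Big(\sum_{n,m=1}^{\infty} \big(\|u^m_n\|^q_{A_n} + 2^{mq}\|u^m_n\|^q_{B_n}\big)\Big)^{1/q} \\
&\sim \Big(\sum_{n=1}^{\infty}\sum_{m=1}^{\infty} J(2^m, u^m_n; A_n, B_n)^q\Big)^{1/q} \\
&\le (1+\varepsilon)\Big(\sum_{n=1}^{\infty} \|x_n\|^q_{(A_n,B_n)_{0,q;J}}\Big)^{1/q}.
\end{aligned}$$

So, the series $\sum_{m=1}^{\infty} u^m$ is convergent in $\ell_q(B_n)$. Since each coordinate of $x$ coincides with the corresponding coordinate of $\sum_{m=1}^{\infty} u^m$, we obtain that $x = \sum_{m=1}^{\infty} u^m$. Therefore, $x$ belongs to $(\ell_q(A_n), \ell_q(B_n))_{0,q;J}$ and $\|x\|_{(\ell_q(A_n),\ell_q(B_n))_{0,q;J}} \lesssim \|x\|_{\ell_q((A_n,B_n)_{0,q;J})}$. This completes the proof. □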

As a direct consequence of Theorem 3.5 and Lemmata 3.3 and 3.4 we obtain the following result.

Corollary 3.6. Let $(G_n)$ be a sequence of Banach spaces, let $\lambda > 1$, $1 \le q \le \infty$ and $1/q + 1/q' = 1$. Then we have with equivalent norms

$$\big(\ell_q(G_n), \ell_q(\lambda^{-n} G_n)\big)_{0,q;J} = \ell_q\big(n^{-1/q'} G_n\big).$$

Next we determine the spaces that arise applying the $(0,q;J)$-method to two spaces obtained by using the real method.

Theorem 3.7. Let $A_0$, $A_1$ be Banach spaces with $A_0 \hookrightarrow A_1$, let $0 < \theta_0 < \theta_1 < 1$, $1 \le q \le \infty$ and $1/q + 1/q' = 1$. Then

$$\big((A_0,A_1)_{\theta_0,q}, (A_0,A_1)_{\theta_1,q}\big)_{0,q;J} = \Big\{ a \in A_1 : \|a\| = \Big(\sum_{n=1}^{\infty} \big[n^{-1/q'}\, 2^{-\theta_0 n} K(2^n, a; A_0, A_1)\big]^q\Big)^{1/q} < \infty \Big\}.$$

[…] $> 0$ such that for any $a \in Z_{0,q}$, $\|a\| = \|ja\|_{\ell_q(n^{-1/q'}F_n)} \le M\|a\|_{0,q;J}$.

Conversely, for each $n \in \mathbb{N}$, let $G_n$ be the space $A_0$ normed by $2^{-\theta_0 n} J(2^n, \cdot\,; A_0, A_1)$ and let $\pi$ be the operator that associates to each sequence $(u_n)$ its sum $\pi(u_n) = \sum_{n=1}^{\infty} u_n$ in $A_1$. According to the discrete characterization of $(A_0, A_1)_{\theta_k,q}$ viewed as a $J$-space, we have that

$$\pi : \ell_q\big(2^{-(\theta_k - \theta_0)n} G_n\big) \to (A_0, A_1)_{\theta_k,q}$$

is bounded for $k = 0, 1$. Interpolating and using Corollary 3.6 we get that

$$\pi : \ell_q\big(n^{-1/q'} G_n\big) \to Z_{0,q}$$

is bounded. Therefore, there is a constant $M_1 > 0$ such that for any $(u_n) \in \ell_q(n^{-1/q'} G_n)$ we have that $a = \sum_{n=1}^{\infty} u_n$ belongs to $Z_{0,q}$ with

$$\|a\|_{0,q} \le M_1 \|(u_n)\|_{\ell_q(n^{-1/q'} G_n)}. \tag{3.4}$$

Now take any $a \in Z_{0,q}$, then $a \in (A_0, A_1)_{\theta_1,q}$ and so $t^{-1}K(t, a; A_0, A_1) \to 0$ as $t \to \infty$. Adapting the proof of the so-called fundamental lemma (see [5, Lemma 3.3.2]), we can find a representation $a = \sum_{n=1}^{\infty} u_n$ with $(u_n) \subset A_0$ and $J(2^n, u_n; A_0, A_1) \le M_2 K(2^n, a; A_0, A_1)$ for any $n \in \mathbb{N}$. By (3.4), we deduce that

$$\begin{aligned}
\|a\|_{0,q} &\le M_1 \Big(\sum_{n=1}^{\infty} \big[n^{-1/q'}\, 2^{-\theta_0 n} J(2^n, u_n; A_0, A_1)\big]^q\Big)^{1/q} \\
&\le M_1 M_2 \Big(\sum_{n=1}^{\infty} \big[n^{-1/q'}\, 2^{-\theta_0 n} K(2^n, a; A_0, A_1)\big]^q\Big)^{1/q} = M_1 M_2 \|a\|
\end{aligned}$$

and the result follows. □

Note that the space that comes out in Theorem 3.7 does not depend on $\theta_1$. Next we write down a concrete case for function spaces.

Corollary 3.8. Let $(\Omega, \mu)$ be a finite measure space, let $1 < p_1 < p_0 < \infty$, $1 \le q \le \infty$ and $1/q + 1/q' = 1$. Then

$$(L_{p_0,q}, L_{p_1,q})_{0,q;J} = L_{p_0,q}(\log L)_{-1/q'}$$

with equivalence of norms.

Proof. As we pointed out in (2.2), we have that

$$K(t, f; L_\infty, L_1) = t\int_0^{1/t} f^*(s)\,ds = f^{**}(1/t)$$

and $L_{p_j,q} = (L_\infty, L_1)_{\theta_j,q}$ for $\theta_j = 1/p_j$, $j = 0, 1$. Then, by Theorem 3.7, we obtain

$$\begin{aligned}
\|f\|_{(L_{p_0,q},L_{p_1,q})_{0,q;J}} &\sim \Big(\int_1^{\infty} \big[(1+\log t)^{-1/q'}\, t^{-1/p_0} f^{**}(1/t)\big]^q\,\frac{dt}{t}\Big)^{1/q} \\
&= \Big(\int_0^{1} \big[(1+|\log t|)^{-1/q'}\, t^{1/p_0} f^{**}(t)\big]^q\,\frac{dt}{t}\Big)^{1/q} \\
&\sim \|f\|_{L_{p_0,q}(\log L)_{-1/q'}}.
\end{aligned}$$

□

4. Description of the extreme spaces using the K-functional

In this section we shall show that $(0,q;J)$-spaces have an equivalent description by using the $K$-functional. The $K$-representation makes it easier to characterize some important extreme spaces and yields a good norm estimate for interpolated operators. Since $(A_0, A_1)_{0,1;J} = A_0$, we only pay attention to the case $1 < q \le \infty$.

Definition 4.1. Let $A_0$, $A_1$ be Banach spaces with $A_0 \hookrightarrow A_1$. For $1 < q \le \infty$, the space $(A_0, A_1)_{\log,q;K}$ is formed by all elements $a \in A_1$ having a finite norm

$$\|a\|_{\log,q;K} = \begin{cases} \Big(\displaystyle\int_1^{\infty} \Big[\frac{K(t,a)}{1+\log t}\Big]^q \frac{dt}{t}\Big)^{1/q} & \text{if } 1 < q < \infty, \\[2mm] \displaystyle\sup_{t>1} \Big\{\frac{K(t,a)}{1+\log t}\Big\} & \text{if } q = \infty. \end{cases}$$

It is not hard to check that the norm $\|\cdot\|_{\log,q;K}$ is equivalent to

$$\|a\|_{\log,q} = \begin{cases} \Big(\displaystyle\sum_{n=1}^{\infty} \Big[\frac{K(2^n,a)}{n}\Big]^q\Big)^{1/q} & \text{if } 1 < q < \infty, \\[2mm] \displaystyle\sup_{n \ge 1} \Big\{\frac{K(2^n,a)}{n}\Big\} & \text{if } q = \infty. \end{cases}$$

Theorem 4.2. Let $A_0$, $A_1$ be Banach spaces with $A_0 \hookrightarrow A_1$ and let $1 < q \le \infty$. Then $(A_0, A_1)_{0,q;J} = (A_0, A_1)_{\log,q;K}$ with equivalence of norms.

Proof. We start with the case $q = \infty$. Let $a \in (A_0, A_1)_{0,\infty;J}$ with $a =$ […]

$$K(s,a) \le \int_1^{\infty} K\big(s, u(t)\big)\,\frac{dt}{t} \le \sup\ […]$$

[…] $p > 1$, $\Omega$ is a bounded open set of $\mathbb{R}^N$ ($N \ge 2$) with Lipschitz boundary and $f$ belongs to $L^1(\Omega)$. We prove that these renormalized solutions converge pointwise, up to "subsequences," to a function $u$. With a suitable definition of solution we also prove that $u$ is a solution to a "limit problem." Moreover we analyze the situation occurring when more regular data $f$ are considered. © 2009 Elsevier Inc. All rights reserved.

Keywords: Nonlinear elliptic equations; Renormalized solutions; 1-Laplace operator; $L^1$-data

1. Introduction

In the present paper we study the behaviour, when $p$ goes to 1, of the renormalized solutions to the problems

$$\begin{cases} -\operatorname{div}\big(|\nabla u_p|^{p-2}\nabla u_p\big) = f & \text{in } \Omega, \\ u_p = 0 & \text{on } \partial\Omega, \end{cases} \tag{1.1}$$

* Corresponding author. E-mail address: [email protected] (A. Mercaldo). 0022-1236/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.025

A. Mercaldo et al. / Journal of Functional Analysis 256 (2009) 2387–2416

where $p > 1$, $\Omega$ is a bounded open set of $\mathbb{R}^N$ ($N \ge 2$) with Lipschitz boundary and $f$ belongs to $L^1(\Omega)$. The notion of renormalized solution was introduced in order to extend the classical setting of monotone operators (see [31]) and so be able to define a notion of solution to problems whose data do not belong to the dual space $W^{-1,p'}(\Omega)$ (as, for instance, the case of $L^1$-data). The main interest is not to get a solution to (1.1) in the sense of distributions but to have a concept which allows one to obtain existence (see [10] and [11] to this end) and uniqueness. Renormalized solutions were adapted to second order elliptic problems by P.-L. Lions and F. Murat in [32] (see also [35] or [36]); both existence and uniqueness of such a solution are proved if the datum $f$ belongs to $L^1(\Omega) + W^{-1,p'}(\Omega)$. In [18] and [19] such a notion has been extended to the case where the right-hand side is a Radon measure with bounded total variation; the authors proved an existence result and a partial uniqueness result. We refer to [19] for an exhaustive treatment of renormalized solutions. An equivalent notion, the concept of entropy solution, was introduced in [9] (see also [12]). For such a solution both existence and uniqueness have been proved when $f$ belongs to $L^1(\Omega) + W^{-1,p'}(\Omega)$. Other approaches to define suitable generalized solutions can be found in [20] and [37] (see also [1] where symmetrization techniques are used).

Our purpose is to study the renormalized solutions $u_p$ with two objectives. First, we will study the behaviour of $u_p$ when $p$ goes to 1, proving that, up to a subsequence (considering that $u_p$ is a sequence), $u_p \to u$ pointwise in $\Omega$, $|\nabla u_p|^{p-2}\nabla u_p \rightharpoonup z$ in $L^q(\Omega)$, $1 \le q$ […]
t < +∞.

(2.1)

t>0

We define $\mathcal{M}(\Omega)$ as the space of all Radon measures with bounded total variation on $\Omega$ and we denote by $|\mu|$ the total variation of $\mu \in \mathcal{M}(\Omega)$. The space of all functions of finite variation, that is the space of those $u \in L^1(\Omega)$ whose distributional gradient belongs to $\mathcal{M}(\Omega)$, is denoted by $BV(\Omega)$. It is endowed with the norm defined by $\|u\|_{BV(\Omega)} = \int_\Omega |u| + |Du|(\Omega)$, for any $u \in BV(\Omega)$. Since $\Omega$ has Lipschitz boundary, if $u$ belongs to $BV(\Omega)$, then the function

$$u_0 = \begin{cases} u & \text{in } \Omega, \\ 0 & \text{in } \mathbb{R}^N \setminus \Omega, \end{cases}$$

belongs to $BV(\mathbb{R}^N)$ and $|Du_0|(\mathbb{R}^N) = \int_{\partial\Omega} |u|\,d\mathcal{H}^{N-1} + |Du|(\Omega)$. We explicitly point out that $|Du_0|(\mathbb{R}^N)$ defines an equivalent norm on $BV(\Omega)$, which we will use in the sequel. Throughout the paper, with an abuse of notation, we still denote $u_0$ by $u$.

We will denote by $S_{N,p}$ the best constant in the Sobolev inequality (cf. [39]), that is,

$$\|u\|_{p^*} \le S_{N,p}\,\big\||\nabla u|\big\|_p, \qquad \text{for all } u \in W_0^{1,p}(\Omega).$$

We will also write $S_N$ instead of $S_{N,1}$. It is well known (cf. [39]) that

$$\lim_{p \to 1} S_{N,p} = S_N. \tag{2.2}$$

We will denote by $W^{-1,\infty}(\Omega)$ the dual space of $W_0^{1,1}(\Omega)$; its norm is given by

$$\|\mu\|_{W^{-1,\infty}(\Omega)} = \sup\Big\{ \langle \mu, \phi \rangle_{W^{-1,\infty}(\Omega), W_0^{1,1}(\Omega)} : \int_\Omega |\nabla\phi| \le 1 \Big\}. \tag{2.3}$$

Following [13] we define $\mathcal{DM}_\infty(\Omega)$ as the space of all vector fields $z \in L^\infty(\Omega; \mathbb{R}^N)$ whose divergence in the sense of distributions is a Radon measure, i.e.,

$$z \in \mathcal{DM}_\infty(\Omega) \iff \operatorname{div} z \in \mathcal{M}(\Omega) \cap W^{-1,\infty}(\Omega).$$

Then $\mu = \operatorname{div} z$ satisfies the following condition: there exists a constant $C > 0$ such that

$$|\mu(B)| \le C R^{N-1}, \qquad \text{for all (open or closed) balls } B \subset \Omega \text{ with radius } R. \tag{2.4}$$

It is well known that if $|\mu|(B) \le C R^{N-1}$ for all balls $B \subset \Omega$ with radius $R$, then $\mu$ can be extended from $W_0^{1,1}(\Omega)$ to $BV(\Omega)$, see [40, Theorem 5.12.4]. (While the extension of a functional is not necessarily unique, a particular extension to $BV(\Omega)$ will be singled out: namely, that given by the integral, with respect to $\mu$, of the precise representative of each $u \in BV(\Omega)$; see [2,23,24,40].) These measures are called David measures in [34].

3. The asymptotic behaviour

Consider the nonlinear elliptic problem

$$\begin{cases} -\operatorname{div}\big(|\nabla u_p|^{p-2}\nabla u_p\big) = f & \text{in } \Omega, \\ u_p = 0 & \text{on } \partial\Omega, \end{cases} \tag{3.1}$$

where $\Omega$ is a bounded open subset of $\mathbb{R}^N$ with Lipschitz boundary, $p$ is a real number $p > 1$ and $f$ is a function belonging to $L^1(\Omega)$. In this section, we will study the behaviour, as $p$ goes to 1, of renormalized solutions $u_p$ to problem (3.1). For $k > 0$, denote by $T_k : \mathbb{R} \to \mathbb{R}$ the usual truncation at level $k$, that is

$$T_k(s) = \begin{cases} s, & |s| \le k, \\ k\,\operatorname{sign}(s), & |s| > k, \end{cases} \qquad \forall s \in \mathbb{R}.$$

We may extend this definition to infinite values: $T_k(\pm\infty) = \pm k$. Consider a measurable function $u : \Omega \to \mathbb{R}$ which is finite almost everywhere and satisfies $T_k(u) \in W_0^{1,p}(\Omega)$ for every $k > 0$. Then there exists (see e.g. [9, Lemma 2.1]) a unique measurable function $v : \Omega \to \mathbb{R}^N$ such that

$$\nabla T_k(u) = v\,\chi_{\{|u| \le k\}} \qquad \text{almost everywhere in } \Omega,\ \forall k > 0. \tag{3.2}$$
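As a small illustration of the truncation $T_k$, including its extension to infinite values, the following sketch may help. The function name `truncate` is a hypothetical choice made for this example, and the code acts on real numbers rather than on functions on $\Omega$.

```python
import math

def truncate(s, k):
    """Truncation at level k > 0: T_k(s) = s if |s| <= k, and k*sign(s) otherwise.

    math.copysign also covers s = +inf and s = -inf, which gives T_k(+-inf) = +-k.
    """
    if k <= 0:
        raise ValueError("the truncation level k must be positive")
    if abs(s) <= k:
        return s
    return math.copysign(k, s)

if __name__ == "__main__":
    for value in (-math.inf, -3.0, -0.25, 0.0, 0.5, 7.0, math.inf):
        print(value, truncate(value, 1.0))
```

Since $T_k$ is 1-Lipschitz and equals the identity on $[-k, k]$, composing with it leaves a function unchanged where $|u| \le k$ and flattens it elsewhere, which is consistent with the gradient formula (3.2).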

Remark 3.1. We point out that although truncations can be applied to functions that are infinite on a set of positive measure, their gradient cannot be defined by the above expression.

Definition 3.1. Assume that $1 < p < N$. Let $u_p : \Omega \to \mathbb{R}$ be measurable and almost everywhere finite on $\Omega$. We say that $u_p$ is a renormalized solution of (3.1) if it satisfies the following conditions:

$$T_k(u_p) \in W_0^{1,p}(\Omega), \qquad \forall k > 0; \tag{3.3}$$

$$|u_p| \in L^{\frac{N(p-1)}{N-p},\infty}(\Omega); \tag{3.4}$$

the gradient $\nabla u_p$ introduced in (3.2) satisfies:

$$|\nabla u_p| \in L^{\frac{N(p-1)}{N-1},\infty}(\Omega), \tag{3.5}$$

$$\lim_{n \to +\infty} \frac{1}{n} \int_{\{n \le |u_p| < 2n\}} |\nabla u_p|^p = 0; \tag{3.6}$$

[…] 0.

L1 (Ω) ,

(3.8)

Ω

Remark 3.3. If $u_p$ is a renormalized solution to problem (3.1), then $u_p$ is also a distributional solution in the sense that it satisfies the equality (see, for instance, [19])

$$\int_\Omega |\nabla u_p|^{p-2}\nabla u_p \cdot \nabla\varphi = \int_\Omega f\varphi, \tag{3.9}$$

for any $\varphi \in C_0^\infty(\Omega)$.

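The main result of this section is given by the following theorem:

Theorem 3.1. For every fixed $p \in\, ]1, N[$, let $u_p$ denote the renormalized solution to problem (3.1). Then, there exist a measurable function $u$ and a vector field $z$ belonging to $L^{\frac{N}{N-1},\infty}(\Omega; \mathbb{R}^N)$ such that, up to a subsequence,

$$u_p \to u \qquad \text{a.e. in } \Omega, \tag{3.10}$$

and

$$|\nabla u_p|^{p-2}\nabla u_p \rightharpoonup z \qquad \text{weakly in } L^q(\Omega; \mathbb{R}^N), \text{ for every } 1 \le q < \tfrac{N}{N-1}. \tag{3.11}$$

[…] $> 1$. For every fixed $k > 0$, denote $h = k^{1/(p-1)}$. Then, Sobolev's embedding theorem and (3.8) imply

$$\begin{aligned}
\big|\{|u_p|^{p-1} \ge k\}\big| = \big|\{|u_p| \ge k^{1/(p-1)}\}\big| &\le \frac{1}{k^{p^*/(p-1)}} \int_\Omega |T_h(u_p)|^{p^*} \\
&\le \frac{S_{N,p}^{p^*}}{k^{p^*/(p-1)}}\, \|\nabla T_h(u_p)\|_p^{p^*} \\
&\le \frac{S_{N,p}^{p^*}}{k^{p^*/(p-1)}}\, h^{\frac{p^*}{p}}\, \|f\|_{L^1(\Omega)}^{\frac{p^*}{p}} = S_{N,p}^{p^*} \Big(\frac{\|f\|_{L^1(\Omega)}}{k}\Big)^{\frac{N}{N-p}},
\end{aligned}$$

where $S_{N,p}$ denotes the best constant in the Sobolev embedding theorem and $p^* = \frac{Np}{N-p}$. Therefore

$$\big|\{|u_p|^{p-1} \ge k\}\big| \le S_{N,p}^{p^*} \Big(\frac{\|f\|_{L^1(\Omega)}}{k}\Big)^{\frac{N}{N-p}}. \tag{3.12}$$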

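We now prove the boundedness of the sequence $(|\nabla u_p|^{p-1})_{p>1}$ in the Marcinkiewicz space $L^{\frac{N}{N-1},\infty}(\Omega)$. Indeed, for every fixed $k > 0$ and $\eta > 0$, we have

$$\{|\nabla u_p|^{p-1} \ge \eta\} \subset \{|u_p| \ge k\} \cup \{|\nabla T_k(u_p)|^{p-1} \ge \eta\}.$$

Using (3.12) (with $k^{p-1}$ in place of $k$) and (3.8), we obtain

$$\begin{aligned}
\big|\{|\nabla u_p|^{p-1} \ge \eta\}\big| &\le \big|\{|u_p| \ge k\}\big| + \big|\{|\nabla T_k(u_p)| \ge \eta^{1/(p-1)}\}\big| \\
&\le S_{N,p}^{p^*} \Big(\frac{\|f\|_{L^1(\Omega)}}{k^{p-1}}\Big)^{\frac{N}{N-p}} + \frac{1}{\eta^{p/(p-1)}} \int_\Omega |\nabla T_k(u_p)|^p \\
&\le S_{N,p}^{p^*}\, \|f\|_{L^1(\Omega)}^{\frac{N}{N-p}} \Big(\frac{1}{k}\Big)^{\frac{N(p-1)}{N-p}} + \frac{k\,\|f\|_{L^1(\Omega)}}{\eta^{\frac{p}{p-1}}}.
\end{aligned}$$

Now choosing

$$k = S_{N,p}^{\frac{N}{N-1}}\, \|f\|_{L^1(\Omega)}^{\frac{1}{N-1}}\, \eta^{\frac{N-p}{(N-1)(p-1)}}$$

in the previous inequality, we obtain

$$\big|\{|\nabla u_p|^{p-1} \ge \eta\}\big| \le 2 \Big(\frac{S_{N,p}\,\|f\|_{L^1(\Omega)}}{\eta}\Big)^{\frac{N}{N-1}}, \tag{3.13}$$

for any $\eta > 0$. From (3.13), since $S_N = \lim_{p \to 1} S_{N,p}$, it follows that

$$\big|\{|\nabla u_p|^{p-1} \ge \eta\}\big| \le 2 \Big(\frac{(S_N + 1)\,\|f\|_{L^1(\Omega)}}{\eta}\Big)^{\frac{N}{N-1}}, \tag{3.14}$$

for $p$ close to 1. For each $1 \le q < \frac{N}{N-1}$, by estimate (3.14), we deduce that, up to subsequences, there exists a vector field $z_q$ belonging to $L^q(\Omega; \mathbb{R}^N)$ such that

$$|\nabla u_p|^{p-2}\nabla u_p \rightharpoonup z_q \qquad \text{weakly in } L^q(\Omega; \mathbb{R}^N).$$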
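The choice of $k$ leading to (3.13) balances the two terms of the preceding estimate: with that value of $k$, each term equals $\big(S_{N,p}\|f\|_{L^1(\Omega)}/\eta\big)^{N/(N-1)}$, which explains the factor 2. The sketch below checks this exponent arithmetic numerically for sample values. The function name and the sample parameters are hypothetical and serve only as a sanity check of the algebra, not as part of the proof.

```python
import math

def balanced_terms(N, p, S, f_norm, eta):
    """Evaluate both terms of the level-set estimate at the balancing value of k.

    N, p      : dimension and exponent, with 1 < p < N
    S, f_norm : stand-ins for S_{N,p} and the L^1 norm of f
    eta       : level of the set {|grad u_p|^(p-1) >= eta}
    Returns (first term, second term, right-hand side of (3.13) without the factor 2).
    """
    p_star = N * p / (N - p)
    k = (S ** (N / (N - 1))
         * f_norm ** (1 / (N - 1))
         * eta ** ((N - p) / ((N - 1) * (p - 1))))
    first = S ** p_star * (f_norm / k ** (p - 1)) ** (N / (N - p))
    second = k * f_norm / eta ** (p / (p - 1))
    target = (S * f_norm / eta) ** (N / (N - 1))
    return first, second, target

if __name__ == "__main__":
    print(balanced_terms(N=3, p=1.5, S=0.7, f_norm=2.0, eta=1.3))
```

All three returned numbers should agree up to rounding, so the sum of the two terms is twice the common value, as stated in (3.13).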

Finally, by a diagonal argument we may find a limit that does not depend on $q$; hence (3.11) is proved. Observe also that (3.14) and (3.11) imply $z \in L^{\frac{N}{N-1},\infty}(\Omega; \mathbb{R}^N)$.

Step 2. Pointwise convergence of $(u_p)_p$.

We will prove that, up to a subsequence,

$$u_p \to u \qquad \text{a.e. in } \Omega, \tag{3.15}$$

where $u$ is a measurable function in $\Omega$. Following [37], first consider $\Psi(s) = s/(1+|s|)$, which is a strictly increasing and bounded real function. Moreover

$$\Big|\int_0^{u_p} \Psi'(s)^p\,ds\Big| \le \int_0^{|u_p|} \Psi'(s)\,ds = \Psi(|u_p|) \le 1.$$

So that if, for each $k > 0$, we take

$$\varphi(x) = \int_0^{T_k(u_p(x))} \Psi'(s)^p\,ds,$$

and

$$h_n(s) = \begin{cases} 1, & \text{if } |s| \le n, \\ -\frac{1}{n}|s| + 2, & \text{if } n \le |s| \le 2n, \\ 0, & \text{if } |s| \ge 2n, \end{cases}$$

in (3.7), then

$$-\frac{1}{n} \int_{\{n \le |u_p| \le 2n\}} |\nabla u_p|^p\, \varphi\, \operatorname{sign}(u_p) + \int_\Omega \Psi'\big(T_k(u_p)\big)^p\, |\nabla T_k(u_p)|^p\, h_n(u_p) = \int_\Omega f\, h_n(u_p)\, \varphi \le \int_\Omega |f|.$$

By letting $n$ go to infinity and applying (3.6), we get

$$\int_\Omega \big|\nabla \Psi\big(T_k(u_p)\big)\big|^p = \int_\Omega \Psi'\big(T_k(u_p)\big)^p\, |\nabla T_k(u_p)|^p \le \int_\Omega |f|.$$

By Fatou's Lemma, when $k$ goes to infinity we obtain

$$\int_\Omega |\nabla \Psi(u_p)|^p \le \int_\Omega |f|.$$

Thus, Hölder's inequality implies that the sequence $(\Psi(u_p))_p$ is bounded in $W_0^{1,1}(\Omega)$ and so a subsequence, also denoted by $(\Psi(u_p))_p$, converges *-weakly in $BV(\Omega)$. As a consequence, it also converges strongly in $L^1(\Omega)$ and a.e. Since $\Psi$ is strictly increasing, the sequence $(u_p)_p$ tends a.e. to a measurable function $u$. We point out that, when $\lim_{p \to 1} \Psi(u_p) = \pm 1$, we have $u = \pm\infty$. □

Remark 3.4. We remark that when the datum $f$ is more regular, we may find better regularity on $z$. Indeed it is well known that, if $f \in L^m(\Omega)$, with $1 < m < N$, then the sequence $(|\nabla u_p|^{p-2}\nabla u_p)_p$ is bounded in $L^{m^*}(\Omega; \mathbb{R}^N)$ and so $z \in L^{m^*}(\Omega; \mathbb{R}^N)$. Observe also the regularity enjoyed by $z$ in Example 4.1 below.

4. The limit problem

In this section we will show that the limit function $u$ whose existence has been proved in the previous section is a solution (in the sense of Definition 4.1 below) to a boundary value problem associated to the "limit equation" of the equation in (3.1), which can formally be written

$$\begin{cases} -\operatorname{div}\Big(\dfrac{Du}{|Du|}\Big) = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{4.1}$$

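with $f \in L^1(\Omega)$. We begin by introducing the notion of solution to such a problem, which needs some preliminaries. Let $u : \Omega \to \mathbb{R}$ be a measurable function on $\Omega$, such that $T_k(u) \in BV(\Omega)$ for any $k > 0$. Let $z \in L^{\frac{N}{N-1},\infty}(\Omega; \mathbb{R}^N)$ be a vector field satisfying

$$-\operatorname{div}(z) = f \qquad \text{in } \mathcal{D}'(\Omega),$$

and zχ{|u| […] 0; i.e., denoting zk = zχ{|u| […] k} ) : $C_0^\infty(\Omega) \to \mathbb{R}$ (4.3) by

$$\big\langle (z_k, DT_k(u)), \varphi \big\rangle = -\int_\Omega T_k(u)\,\varphi\,d\mu_k - \int_\Omega T_k(u)\, z_k \cdot \nabla\varphi, \tag{4.4}$$

$$\big\langle (z, D\chi_{\{|u|>k\}}), \varphi \big\rangle = \int_{\{|u|>k\}} f\varphi - \int_{\{|u|>k\}} z \cdot \nabla\varphi, \tag{4.5}$$

for any $\varphi \in C_0^\infty(\Omega)$. Since $T_k(u) \in L^\infty(\Omega) \cap BV(\Omega) \subset L^1(\Omega, \mu_k)$ (we point out that a particular extension of $\mu_k$ to bounded $BV$-functions has been chosen), $z_k \in L^\infty(\Omega; \mathbb{R}^N)$, $f \in L^1(\Omega)$, $z \in L^{\frac{N}{N-1},\infty}(\Omega; \mathbb{R}^N)$ and $\varphi \in C_0^\infty(\Omega)$, all terms in (4.4) and (4.5) make sense.

Definition 4.1. We say that a measurable function $u : \Omega \to \mathbb{R}$ is a solution to problem (4.1) if the following conditions hold:

$$T_k(u) \in BV(\Omega), \qquad \text{for all } k > 0; \tag{4.6}$$

there exists a vector field $z \in L^{\frac{N}{N-1},\infty}(\Omega; \mathbb{R}^N)$ such that

$$-\operatorname{div} z = f \qquad \text{in } \mathcal{D}'(\Omega); \tag{4.7}$$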

for almost every k > 0, the distribution (z, Dχ{|u|>k} ) is a Radon measure and the vector field zk = zχ{|u| h .

(4.16)

By (3.14), as p goes to 1, we have (up to subsequences) |∇up |p−2 ∇up χBp,h,k ∩{|up | 0 and k > 0. Moreover, by definition of the set Bp,h,k we have |∇up |p−2 ∇up χ(Ω\B

p,h,k )∩{|up | 0 and k > 0, wk = fh,k + gh,k

(4.20)

with fh,k ∞ 1 and

|gh,k |

M . h

Ω

Therefore, we obtain (see [4], and also Step 3 of Proposition 4.1 in [33]) |wk |∞ 1,

(4.21)

for all k > 0. Since limp→1 up (x) = u(x) almost everywhere in Ω, it follows that χ{|up | 0 is countable. Therefore, by (3.11) and (4.15), we conclude wk = zχ{|u| 0. Observe that, by applying lim wk = lim zχ{|u|k} )). Step 5. Study of (zk , DTk (u)). As pointed out in the previous step, −divzk is a Radon measure. Therefore by Proposition A.2 in Appendix A, since |zk |∞ 1, we have zk , DTk (u) DTk (u)

as measures in Ω.

(4.25)

Now we prove that in fact equality holds in (4.25). Denote, for every φ ∈ C0∞ (Ω),

(z, Dχ{u>k} ), φ =

fφ −

{u>k}

(z, Dχ{−u>k} ), φ =

z · ∇φ,

{u>k}

fφ −

{−u>k}

z · ∇φ.

(4.26)

{−u>k}

By Proposition A.4 in Appendix A, these distributions are Radon measure concentrated in {u = k} and {−u = k}, respectively. Therefore, by (4.4) and (4.9), we obtain zk , DTk (u) , φ =

f Tk (u)φ +

{|u|k} ) −

Tk (u)z · ∇φ.

{|u|k}

fφ +k {−u>k}

f φ,


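$$\int_\Omega T_k(u)\,\varphi\,d(z, D\chi_{\{|u|>k\}}) = k\,\big\langle (z, D\chi_{\{u>k\}}), \varphi \big\rangle - k\,\big\langle (z, D\chi_{\{-u>k\}}), \varphi \big\rangle$$

and

$$\int_{\{|u| \le k\}} T_k(u)\, z \cdot \nabla\varphi = \int_\Omega T_k(u)\, z \cdot \nabla\varphi - k \int_{\{u>k\}} z \cdot \nabla\varphi + k \int_{\{-u>k\}} z \cdot \nabla\varphi;$$

it follows that

$$\big\langle (z_k, DT_k(u)), \varphi \big\rangle = \int_\Omega f\, T_k(u)\,\varphi - \int_\Omega T_k(u)\, z \cdot \nabla\varphi. \tag{4.27}$$

We denote, for almost every $k > 0$,

$$\big\langle (z, DT_k(u)), \varphi \big\rangle = \int_\Omega f\, T_k(u)\,\varphi - \int_\Omega T_k(u)\, z \cdot \nabla\varphi, \qquad \varphi \in C_0^\infty(\Omega). \tag{4.28}$$

In Appendix A, Proposition A.1, we prove that the distribution defined by the above expression is actually a Radon measure. From (4.27) we deduce that

$$(z_k, DT_k(u)) = (z, DT_k(u)), \qquad \text{for almost all } k > 0, \tag{4.29}$$

and therefore by (4.25),

$$(z, DT_k(u)) \le |DT_k(u)| \qquad \text{as measures in } \Omega. \tag{4.30}$$

Now we prove that

$$|DT_k(u)| \le (z, DT_k(u)) \qquad \text{as measures in } \Omega. \tag{4.31}$$

Denote for $n > k$,

$$h_n^k(s) = \begin{cases} 0, & |s| \ge k + 2n, \\ \dfrac{(k + 2n - |s|)\,k\,\operatorname{sign} s}{n}, & k + n < |s| < k + 2n, \\ T_k(s), & |s| \le k + n. \end{cases}$$

Obviously $h_n^k$ tends to $T_k(s)$ as $n \to +\infty$. Let $\varphi$ be a nonnegative function belonging to $C_0^\infty(\Omega)$. By choosing $h_n^k(u_p)\varphi$ as test function in (3.7) and letting $n$ go to infinity, we get

$$\int_\Omega |\nabla T_k(u_p)|^p\,\varphi + \int_\Omega |\nabla u_p|^{p-2}\nabla u_p \cdot \nabla\varphi\; T_k(u_p) = \int_\Omega T_k(u_p)\,\varphi f.$$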



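By Young's inequality we have

$$\begin{aligned}
\int_\Omega |\nabla T_k(u_p)|\,\varphi &\le \frac{1}{p} \int_\Omega |\nabla T_k(u_p)|^p\,\varphi + \frac{p-1}{p} \int_\Omega \varphi \\
&= \frac{1}{p} \int_\Omega T_k(u_p)\,\varphi f - \frac{1}{p} \int_\Omega |\nabla u_p|^{p-2}\nabla u_p \cdot \nabla\varphi\; T_k(u_p) + \frac{p-1}{p} \int_\Omega \varphi.
\end{aligned}$$

This implies

$$\int_\Omega |\nabla T_k(u_p)|\,\varphi + \frac{1}{p} \int_\Omega |\nabla u_p|^{p-2}\nabla u_p \cdot \nabla\varphi\; T_k(u_p) \le \frac{1}{p} \int_\Omega T_k(u_p)\,\varphi f + \frac{p-1}{p} \int_\Omega \varphi.$$

Now we let $p$ go to 1 and we obtain

$$\int_\Omega \varphi\, d|DT_k(u)| + \int_\Omega z \cdot \nabla\varphi\; T_k(u) \le \int_\Omega T_k(u)\,\varphi f,$$

for every nonnegative $\varphi \in C_0^\infty(\Omega)$. On the other hand, by (4.28), it follows that $\langle |DT_k(u)|, \varphi \rangle \le \langle (z, DT_k(u)), \varphi \rangle$, for every $\varphi \in C_0^\infty(\Omega)$ with $\varphi \ge 0$. This yields (4.31), and by (4.29) and (4.30) we arrive at (4.10). We explicitly observe that the previous arguments imply

$$\lim_{p \to 1} \int_\Omega |\nabla T_k(u_p)|^p\,\varphi = \big\langle (z, DT_k(u)), \varphi \big\rangle = \big\langle |DT_k(u)|, \varphi \big\rangle, \qquad \forall \varphi \in C_0^\infty(\Omega).$$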

Step 6. Proof of (4.12). We begin by observing that, since zχ{|u| N .

We will write $\int_{\partial\Omega} [z, \nu]\,v\,d\mathcal{H}^{N-1}$ instead of $\langle z, v \rangle_{\partial\Omega}$. To define $\int_{\partial\Omega} [z\chi_{\{u=+\infty\}}, \nu]\,v\,d\mathcal{H}^{N-1}$ and $\int_{\partial\Omega} [z\chi_{\{u=-\infty\}}, \nu]\,v\,d\mathcal{H}^{N-1}$, we need to know an expression for $-\operatorname{div}(z\chi_{\{u=+\infty\}})$ and $-\operatorname{div}(z\chi_{\{u=-\infty\}})$, respectively. It is easy to check that

$$-\operatorname{div}(z\chi_{\{u=+\infty\}}) = f\chi_{\{u=+\infty\}} - (z, D\chi_{\{u=+\infty\}}),$$

$$-\operatorname{div}(z\chi_{\{u=-\infty\}}) = f\chi_{\{u=-\infty\}} - (z, D\chi_{\{u=-\infty\}}),$$

−div(zχ{|u| […] 4 and later improved in [65] for $s > 0$.

We observe that a convolution operator $C_a$ associated with a summable sequence $a = (a(j))_{j \in \mathbb{Z}}$ is the operator associated with the infinite matrix $A = (a(j - j'))_{j,j' \in \mathbb{Z}}$ in the Sjöstrand class. Therefore Theorem 1.1 follows from Theorem 1.2. We conjecture that the equivalence of $\ell^p$-stability for different $p \in [1, \infty]$ holds for any infinite matrix in the Schur class $\mathcal{A}$. Some progress on the above conjecture is made in [65] under the additional assumption that the infinite matrix has rows supported in balls of bounded radii.

For a continuous function $f$ on $\mathbb{R}$, we define the modulus of continuity $\omega_\delta(f)$ by

$$\omega_\delta(f)(x) = \sup_{|y| \le \delta} |f(x+y) - f(x)|. \tag{1.7}$$

The modulus of continuity is a delicate tool in mathematical analysis to measure the regularity of a function [27,69]. For $1 \le p \le \infty$, let $L^p$ be the space of all $p$-integrable functions on $\mathbb{R}^d$ with standard norm $\|\cdot\|_p$,

φ(· − j )

Lp = φ: φLp :=

j ∈Z

(d + 1)2 and with s > 0, is established in [3] and [65], respectively.

Given a Banach algebra $\mathcal{B}$, we say that a subalgebra $\mathcal{A}$ of $\mathcal{B}$ is inverse-closed if, whenever an operator $T \in \mathcal{A}$ has an inverse $T^{-1}$ in $\mathcal{B}$, that inverse belongs to $\mathcal{A}$ [23,30,50,53,64]. Inverse-closedness was first studied for periodic functions with absolutely convergent Fourier series: Wiener's lemma states that if a periodic function $f$ on the real line does not vanish and has absolutely convergent Fourier series, i.e., $f(x) = \sum_{j \in \mathbb{Z}} a(j) e^{-ijx}$ and $\sum_{j \in \mathbb{Z}} |a(j)| < \infty$, then $f^{-1}$ has absolutely convergent Fourier series too [70].

C.E. Shin, Q. Sun / Journal of Functional Analysis 256 (2009) 2417–2439

An equivalent formulation of the above Wiener's lemma involving matrix algebras is that the commutative Banach algebra

$$\tilde{\mathcal{W}} := \Big\{ \big(a(j - j')\big)_{j,j' \in \mathbb{Z}} : \sum_{j \in \mathbb{Z}} |a(j)| < \infty \Big\} \tag{2.7}$$

is an inverse-closed Banach subalgebra of $\mathcal{B}(\ell^2(\mathbb{Z}))$ [70]. The classical Wiener's lemma and its various generalizations (see, e.g., [6–8,10,23,28,35,36,39,42,53,55]) are important and have numerous applications in numerical analysis, wavelet theory, frame theory, and sampling theory. For example, the classical Wiener's lemma and its weighted variation [39] were used to establish the decay property at infinity for dual generators of a shift-invariant space [1,43]; the Wiener's lemma for matrices associated with twisted convolution was used in the study of the decay properties of the dual Gabor frame for $L^2$ [7,35,36]; Jaffard's result [42] for infinite matrices with polynomial decay was used in numerical analysis [17,57,58], wavelet analysis [42], time–frequency analysis [31–33] and sampling [4,24,33,62]; and Sjöstrand's result [55] for infinite matrices was used in the study of pseudo-differential operators, Gabor frames and sampling [7,34,55,60]. Consequently, many papers are devoted to the Wiener's lemma for infinite matrices with various off-diagonal decay conditions (see [5–7,10,12,28,36,39,42,55,59,61] and also [37] for a short historical review).

The Wiener's lemma for the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda)$ of infinite matrices $(a(\lambda, \lambda'))_{\lambda,\lambda' \in \Lambda}$ says that $\mathcal{C}(\Lambda, \Lambda)$ is an inverse-closed subalgebra of $\mathcal{B}(\ell^2(\Lambda))$ where $\Lambda$ is a relatively-separated subset of $\mathbb{R}^d$ [55]. This together with the equivalence of $\ell^p$-stability for different $p$ in Theorem 2.1 proves that $\mathcal{C}(\Lambda, \Lambda)$ is an inverse-closed subalgebra of $\mathcal{B}(\ell^p(\Lambda))$ for any $1 \le p \le \infty$.

Corollary 2.4. Let $1 \le p \le \infty$ and $\Lambda$ be a relatively-separated subset of $\mathbb{R}^d$. Then the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda)$ is an inverse-closed subalgebra of $\mathcal{B}(\ell^p(\Lambda))$, i.e., if $A \in \mathcal{C}(\Lambda, \Lambda)$ has bounded inverse in $\mathcal{B}(\ell^p(\Lambda))$, then $A^{-1} \in \mathcal{C}(\Lambda, \Lambda)$.

Before we start the proof of Theorem 2.1, let us consider necessary conditions on the relatively-separated subsets $\Lambda$ and $\Lambda'$ such that there exists a matrix $A = (a(\lambda, \lambda'))_{\lambda \in \Lambda, \lambda' \in \Lambda'}$ in the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda')$ which has $\ell^p$-stability for some $1 \le p \le \infty$. A similar conclusion is obtained in [60] for sampling signals with finite rate of innovation, and in [51] for slanted matrices.

Proposition 2.5. Let $\Lambda$, $\Lambda'$ be relatively-separated subsets of $\mathbb{R}^d$. If there exists a matrix $A = (a(\lambda, \lambda'))_{\lambda \in \Lambda, \lambda' \in \Lambda'}$ in the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda')$ which has $\ell^p$-stability for some $1 \le p \le \infty$, then there exists a positive number $R_0$ such that for any bounded set $K$ the cardinality of the set $\Lambda \cap B(K, R_0)$ is larger than or equal to the cardinality of the set $\Lambda' \cap K$, where $B(K, R)$ is the set of all points in $\mathbb{R}^d$ with distance to $K$ less than $R$.

Proof. We show the above result on relatively-separated sets $\Lambda$ and $\Lambda'$ by an argument similar to the proof of the necessary condition on the sampling set of a stable sampling and reconstruction process in [60]. Let $K$ be a compact subset of $\mathbb{R}^d$ and $\ell^p(\Lambda' \cap K)$ be the space of all sequences in $\ell^p(\Lambda')$ supported on $\Lambda' \cap K$. For a sequence $c \in \ell^p(\Lambda' \cap K)$, it follows from the property of the matrix $A$ in the Sjöstrand class that the $\ell^p$ norm of the sequence $Ac$ outside of $B(K, R) \cap \Lambda$ is less than $\varepsilon(R)\|c\|_{\ell^p(\Lambda')}$, where $\varepsilon(R)$ (independent of the compact set $K$) tends to zero as $R$ tends to infinity. Thus there exists a positive constant $R_0$ such that the $\ell^p$ norm of the sequence $Ac$ inside $B(K, R_0) \cap \Lambda$ is equivalent to the $\ell^p$ norm of the sequence $c$. This implies that the submatrix obtained by selecting the columns in $\Lambda \cap B(K, R_0)$ and rows in $\Lambda' \cap K$ of the matrix $A$ has full rank, which proves the desired conclusion on the relatively-separated subsets $\Lambda$ and $\Lambda'$. □

The proof of Theorem 2.1 is inspired by the commutator technique developed in [55] and the norm equivalence technique for a finite-dimensional space in [3]. To prove Theorem 2.1, we need several lemmas. First we recall a known result about the boundedness of infinite matrices in the Sjöstrand class.

Lemma 2.6. (See [62].) Let $1 \le p \le \infty$, $\Lambda$ and $\Lambda'$ be two relatively-separated subsets of $\mathbb{R}^d$, and $A = (a(\lambda, \lambda'))_{\lambda \in \Lambda, \lambda' \in \Lambda'}$ be an infinite matrix in the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda')$. Then the infinite matrix $A$ is a bounded operator from $\ell^p(\Lambda')$ to $\ell^p(\Lambda)$. Moreover there exists an absolute constant $C$ (that depends on $d$ and $p$ only) such that

$$\|Ac\|_{\ell^p(\Lambda)} \le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p}\, \|A\|_{\mathcal{C}}\, \|c\|_{\ell^p(\Lambda')} \qquad \text{for all } c \in \ell^p(\Lambda'). \tag{2.8}$$

Define the cut-off function

$$\psi(x) = \min\big(\max(2 - \|x\|_\infty, 0), 1\big) = \begin{cases} 1 & \text{if } \|x\|_\infty \le 1, \\ 2 - \|x\|_\infty & \text{if } 1 < \|x\|_\infty < 2, \\ 0 & \text{if } \|x\|_\infty \ge 2. \end{cases} \tag{2.9}$$

Then

$$0 \le \psi(x) \le 1 \ \text{ for all } x \in \mathbb{R}^d, \quad \text{and} \quad |\psi(x) - \psi(y)| \le \|x - y\|_\infty \ \text{ for all } x, y \in \mathbb{R}^d. \tag{2.10}$$
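The properties (2.10) of the cut-off function, together with the two-sided bound $2^d \le \sum_{k \in \mathbb{Z}^d} \psi(x-k)^2 \le 4^d$ that follows from comparing $\psi$ with the indicators of $[-1,1)^d$ and $[-2,2)^d$, can be checked on random samples. The sketch below is only illustrative: the names `psi` and `sum_psi_squared`, the dimension, and the sample sizes are hypothetical choices.

```python
import itertools
import math
import random

def psi(x):
    """Cut-off function (2.9): min(max(2 - ||x||_inf, 0), 1) for a point x in R^d."""
    sup_norm = max(abs(t) for t in x)
    return min(max(2.0 - sup_norm, 0.0), 1.0)

def sum_psi_squared(x, reach=3):
    """Sum of psi(x - k)^2 over integer points k near x.

    Only k with ||x - k||_inf < 2 contribute, so offsets up to `reach`
    around floor(x) cover every non-zero term.
    """
    base = [math.floor(t) for t in x]
    total = 0.0
    for offsets in itertools.product(range(-reach, reach + 1), repeat=len(x)):
        shifted = [t - (b + o) for t, b, o in zip(x, base, offsets)]
        total += psi(shifted) ** 2
    return total

if __name__ == "__main__":
    rng = random.Random(0)
    d = 2
    for _ in range(5):
        x = [rng.uniform(-5, 5) for _ in range(d)]
        print([round(t, 3) for t in x], round(sum_psi_squared(x), 3))
```

In dimension $d = 2$ every printed sum should lie between $2^d = 4$ and $4^d = 16$.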

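For $n \in \mathbb{Z}^d$ and $N \in \mathbb{N}$, define the multiplication operator $\Psi_n^N : \ell^p(\Lambda) \to \ell^p(\Lambda)$ by

$$\Psi_n^N c = \Big(\psi\Big(\frac{\lambda - n}{N}\Big)\, c(\lambda)\Big)_{\lambda \in \Lambda} \qquad \text{for } c = \big(c(\lambda)\big)_{\lambda \in \Lambda} \in \ell^p(\Lambda), \tag{2.11}$$

where $\Lambda$ is a relatively-separated subset of $\mathbb{R}^d$. The multiplication operator $\Psi_n^N$ can also be thought of as a diagonal matrix $\operatorname{diag}(\psi((\lambda - n)/N))_{\lambda \in \Lambda}$. For an infinite matrix $A = (a(\lambda, \lambda'))_{\lambda \in \Lambda, \lambda' \in \Lambda'}$ and any $s \ge 0$, define the truncation matrix

$$A_s = \big(a_s(\lambda, \lambda')\big)_{\lambda \in \Lambda, \lambda' \in \Lambda'}, \tag{2.12}$$

where $a_s(\lambda, \lambda') = a(\lambda, \lambda')$ if $\|\lambda - \lambda'\|_\infty < s$ and $a_s(\lambda, \lambda') = 0$ otherwise. For the truncation matrices $A_s$, $s \ge 0$, of an infinite matrix $A$ in the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda')$, we have

$$\|A - A_s\|_{\mathcal{C}} \ \text{ is a decreasing function of } s \text{ with } \lim_{s \to +\infty} \|A - A_s\|_{\mathcal{C}} = 0. \tag{2.13}$$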



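Lemma 2.7. Let $1 \le p, q \le \infty$, $1 \le N \in \mathbb{N}$, $\Lambda$ and $\Lambda'$ be two relatively-separated subsets of $\mathbb{R}^d$, and $A = (a(\lambda, \lambda'))_{\lambda \in \Lambda, \lambda' \in \Lambda'}$ be an infinite matrix in the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda')$. Then there exists an absolute constant $C$ (that depends on $d$, $p$, $q$ only) such that

$$\begin{aligned}
\Big\| \Big( \big\|(A_N \Psi_n^N - \Psi_n^N A_N)c\big\|_{\ell^p(\Lambda)} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \le{}& C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \min_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \\
&\times \Big\| \Big( \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \qquad \text{for all } c \in \ell^q(\Lambda').
\end{aligned} \tag{2.14}$$

Proof. Observing that $A_N \Psi_n^N - \Psi_n^N A_N = (A_N \Psi_n^N - \Psi_n^N A_N)\Psi_n^{6N}$, we obtain from Lemma 2.6 that

$$\big\|(A_N \Psi_n^N - \Psi_n^N A_N)c\big\|_{\ell^p(\Lambda)} \le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p}\, \big\|A_N \Psi_n^N - \Psi_n^N A_N\big\|_{\mathcal{C}}\, \big\|\Psi_n^{6N} c\big\|_{\ell^p(\Lambda')} \tag{2.15}$$

for any $c \in \ell^p(\Lambda')$. We note from (2.9), (2.10) and (2.13) that

$$\begin{aligned}
\big\|A_N \Psi_n^N - \Psi_n^N A_N\big\|_{\mathcal{C}} &\le \Big\| \Big( a_N(\lambda, \lambda')\big(\psi_n^N(\lambda') - \psi_n^N(\lambda)\big) \Big)_{\lambda \in \Lambda, \lambda' \in \Lambda'} \Big\|_{\mathcal{C}} \\
&\le \min_{0 \le s \le N} \Big( \big\| \big(a_N(\lambda, \lambda') - a_s(\lambda, \lambda')\big)_{\lambda \in \Lambda, \lambda' \in \Lambda'} \big\|_{\mathcal{C}} + \frac{s}{N} \big\| \big(a_s(\lambda, \lambda')\big)_{\lambda \in \Lambda, \lambda' \in \Lambda'} \big\|_{\mathcal{C}} \Big) \\
&\le \min_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big).
\end{aligned} \tag{2.16}$$

Then we combine (2.15) and (2.16) to yield

$$\big\|(A_N \Psi_n^N - \Psi_n^N A_N)c\big\|_{\ell^p(\Lambda)} \le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \min_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \big\|\Psi_n^{6N} c\big\|_{\ell^p(\Lambda')} \tag{2.17}$$

for any $c \in \ell^p(\Lambda')$. Thus for $1 \le q \le \infty$, we get from (2.9), (2.10) and (2.17) that

$$\begin{aligned}
\Big\| \Big( \big\|(A_N \Psi_n^N - \Psi_n^N A_N)c\big\|_{\ell^p(\Lambda)} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \le{}& C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \min_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \\
&\times \Big\| \Big( \big\|\Psi_n^{6N} c\big\|_{\ell^p(\Lambda')} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
\le{}& C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \min_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \\
&\times \sum_{j \in \mathbb{Z}^d,\ \|j\|_\infty \le 6} \Big\| \Big( \big\|\Psi_{n+2jN}^N c\big\|_{\ell^p(\Lambda')} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
\le{}& C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \min_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \\
&\times \Big\| \Big( \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)}.
\end{aligned}$$

This proves the estimate (2.14). □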

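Lemma 2.8. Let $1 \le N \in \mathbb{N}$, $1 \le p, q \le \infty$, $\Lambda$ and $\Lambda'$ be two relatively-separated subsets of $\mathbb{R}^d$, and $A = (a(\lambda, \lambda'))_{\lambda \in \Lambda, \lambda' \in \Lambda'}$ be an infinite matrix in the Sjöstrand class $\mathcal{C}(\Lambda, \Lambda')$. Then there exists a positive constant $C$ (that depends only on $d$, $p$, $q$) such that

$$\Big\| \Big( \big\|\Psi_n^N A c\big\|_{\ell^p(\Lambda)} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p}\, \|A\|_{\mathcal{C}}\, \Big\| \Big( \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \tag{2.18}$$

holds for any sequence $c \in \ell^q(\Lambda')$.

Proof. By (2.9) and (2.10), we have that

$$4^d = \sum_{k \in \mathbb{Z}^d} \chi_{[-2,2)^d}(x - k) \ge \sum_{k \in \mathbb{Z}^d} \big(\psi(x - k)\big)^2 \ge \sum_{k \in \mathbb{Z}^d} \chi_{[-1,1)^d}(x - k) = 2^d \qquad \text{for all } x \in \mathbb{R}^d. \tag{2.19}$$

Combining (2.19) and Lemma 2.6, we obtain that

$$\begin{aligned}
\big\|\Psi_n^N A c\big\|_{\ell^p(\Lambda)} &\le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \sum_{n' \in N\mathbb{Z}^d} \big\|\Psi_n^N A \Psi_{n+n'}^N\big\|_{\mathcal{C}}\, \big\|\Psi_{n+n'}^N c\big\|_{\ell^p(\Lambda')} \\
&\le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \sum_{n' \in N\mathbb{Z}^d} \Big( \sum_{k \in \mathbb{Z}^d,\ \|k - n'\|_\infty \le 4N} a(k) \Big) \big\|\Psi_{n+n'}^N c\big\|_{\ell^p(\Lambda')}
\end{aligned} \tag{2.20}$$

holds for $n \in N\mathbb{Z}^d$ and $c \in \ell^q(\Lambda')$, where

$$a(k) = \sup_{\lambda \in \Lambda, \lambda' \in \Lambda'} |a(\lambda, \lambda')|\, \chi_{k + [0,1)^d}(\lambda - \lambda').$$

From (2.20) we get that

$$\begin{aligned}
\Big\| \Big( \big\|\Psi_n^N A c\big\|_{\ell^p(\Lambda)} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} &\le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p} \Big( \sum_{n' \in N\mathbb{Z}^d}\ \sum_{k \in \mathbb{Z}^d,\ \|k - n'\|_\infty \le 4N} a(k) \Big) \\
&\qquad \times \Big\| \Big( \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\le C\, R(\Lambda)^{1/p}\, R(\Lambda')^{1-1/p}\, \|A\|_{\mathcal{C}}\, \Big\| \Big( \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')} \Big)_{n \in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)}
\end{aligned}$$

for $1 \le q \le \infty$. Then the estimate (2.18) follows. □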

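Now let us start to prove Theorem 2.1.

Proof of Theorem 2.1. Let $N \ge 1$ be a sufficiently large integer determined later, $n \in N\mathbb{Z}^d$, the multiplication operator $\Psi_n^N$ be as in (2.11), and the truncation matrix $A_N$ be as in (2.12). By the assumption on the infinite matrix $A$, there exists a positive constant $C_0$ such that

$$\big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')} \le C_0\, \big\|A \Psi_n^N c\big\|_{\ell^p(\Lambda)} \tag{2.21}$$

for any sequence $c \in \ell^q(\Lambda')$, $n \in N\mathbb{Z}^d$ and $1 \le N \in \mathbb{N}$. By (2.8), (2.13), (2.14), (2.18) and (2.21), we get

$$\begin{aligned}
\Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')}^q \Big)^{1/q} \le{}& C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|A \Psi_n^N c\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q} \\
\le{}& C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|(A - A_N)\Psi_n^N c\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q} + C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|(A_N \Psi_n^N - \Psi_n^N A_N)c\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q} \\
&+ C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N (A_N - A)c\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q} + C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N A c\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q} \\
\le{}& C_0 C\, R(\Lambda)^{1/p} R(\Lambda')^{1-1/p} \Big( \|A - A_N\|_{\mathcal{C}} + \inf_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \Big) \\
&\times \Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')}^q \Big)^{1/q} + C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N A c\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q} \\
\le{}& C_0 C\, R(\Lambda)^{1/p} R(\Lambda')^{1-1/p} \inf_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \\
&\times \Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')}^q \Big)^{1/q} + C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N A c\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q},
\end{aligned} \tag{2.22}$$

where $1 \le q < \infty$. Note that for any infinite matrix $A \in \mathcal{C}(\Lambda, \Lambda')$

$$0 \le \lim_{N \to \infty} \inf_{0 \le s \le N} \Big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \Big) \le \lim_{N \to \infty} \Big( \|A - A_{\sqrt{N}}\|_{\mathcal{C}} + N^{-1/2}\|A\|_{\mathcal{C}} \Big) = 0$$

by (2.13). Therefore by selecting $N$ sufficiently large in (2.22), we have that

$$\Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')}^q \Big)^{1/q} \le 2C_0 \Big( \sum_{n \in N\mathbb{Z}^d} \big\|\Psi_n^N (Ac)\big\|_{\ell^p(\Lambda)}^q \Big)^{1/q}. \tag{2.23}$$

By the equivalence of different norms on a finite-dimensional space, there exists a positive constant $C$ (that depends on $p$, $q$, $d$ only) such that

$$C^{-1} \big(R(\Lambda') N^d\big)^{\min(1/p - 1/q,\, 0)}\, \big\|\Psi_n^N c\big\|_{\ell^q(\Lambda')} \le \big\|\Psi_n^N c\big\|_{\ell^p(\Lambda')} \tag{2.24}$$

and

$$\big\|\Psi_n^N A c\big\|_{\ell^p(\Lambda)} \le C \big(R(\Lambda) N^d\big)^{\max(1/p - 1/q,\, 0)}\, \big\|\Psi_n^N A c\big\|_{\ell^q(\Lambda)} \tag{2.25}$$

hold for all sequences $c \in \ell^q(\Lambda')$, $n \in N\mathbb{Z}^d$ and $1 \le N \in \mathbb{N}$. Therefore combining (2.23), (2.24) and (2.25), we conclude that

$$\|c\|_{\ell^q(\Lambda')} \le C \big(R(\Lambda') N^d\big)^{-\min(1/p - 1/q,\, 0)} \big(R(\Lambda) N^d\big)^{\max(1/p - 1/q,\, 0)}\, \|Ac\|_{\ell^q(\Lambda)} \tag{2.26}$$

for any $c \in \ell^q(\Lambda')$, and the conclusion for $1 \le q < \infty$ follows. The conclusion for $q = \infty$ can be proved by a similar argument. We omit the details here. □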
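The proof relies on the fact that $\inf_{0 \le s \le N} \big( \|A - A_s\|_{\mathcal{C}} + \frac{s}{N}\|A\|_{\mathcal{C}} \big)$ tends to zero as $N \to \infty$. A toy computation can make this visible. The sketch below is an illustration under explicit assumptions that are not taken from the paper: the matrix is of convolution type with hypothetical entries $a(j) = (1 + |j|)^{-2}$ on $\mathbb{Z}$, and the $\ell^1$ sum of the entries with $|j| \ge s$ is used as a stand-in for $\|A - A_s\|_{\mathcal{C}}$. The function name `commutator_factor` and the parameter `cutoff` are likewise hypothetical.

```python
def commutator_factor(N, cutoff=100_000):
    """Toy version of min over 0 <= s <= N of (tail(s) + (s/N) * total).

    Assumptions (illustrative only): a(j) = (1 + |j|)**-2 for integer j,
    tail(s) = sum of a(j) over |j| >= s, total = tail(0),
    and the infinite sums are truncated at |j| <= cutoff.
    """
    weights = [(1.0 + j) ** -2 for j in range(cutoff + 1)]  # a(j) for j >= 0
    total = weights[0] + 2.0 * sum(weights[1:])             # a(-j) = a(j)
    best = total            # s = 0: tail(0) = total and the second term vanishes
    tail = total
    for s in range(1, N + 1):
        # passing from s - 1 to s removes the entries with |j| = s - 1
        tail -= weights[0] if s == 1 else 2.0 * weights[s - 1]
        best = min(best, tail + (s / N) * total)
    return best

if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(N, round(commutator_factor(N), 4))
```

In this toy case the printed values decrease roughly like $N^{-1/2}$, in line with the choice $s = \sqrt{N}$ made in the proof.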

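3. Stability for localized synthesis operators

In this section, we consider the stability of the synthesis operator

$$S_\Phi : \ell^p(\Lambda) \ni \big(c(\lambda)\big)_{\lambda \in \Lambda} \longmapsto \sum_{\lambda \in \Lambda} c(\lambda)\varphi_\lambda \in V_p(\Phi, \Lambda) \tag{3.1}$$

associated with a family $\Phi = \{\varphi_\lambda : \lambda \in \Lambda\}$ of functions on $\mathbb{R}^d$, where

$$V_p(\Phi, \Lambda) := \Big\{ \sum_{\lambda \in \Lambda} c(\lambda)\varphi_\lambda : \big(c(\lambda)\big)_{\lambda \in \Lambda} \in \ell^p(\Lambda) \Big\}, \qquad 1 \le p \le \infty, \tag{3.2}$$

see [62]. The synthesis operator $S_\Phi$ appears in the study of spline approximation and operator approximation [27,54], wavelet analysis [18,25,48,49], Gabor analysis [31] and sampling [1,60], and one of the basic assumptions for the synthesis operator $S_\Phi$ is the $\ell^p$-stability, i.e., there exists a positive constant $C$ such that

$$C^{-1}\|c\|_{\ell^p(\Lambda)} \le \|S_\Phi c\|_p \le C\|c\|_{\ell^p(\Lambda)} \qquad \text{for all } c \in \ell^p(\Lambda). \tag{3.3}$$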



The main results of this section are Theorem 3.1 (a generalization of Theorem 1.3) about the equivalence of the $\ell^p$-stability of the synthesis operator $S_\Phi$ for different $1 \le p \le \infty$, and Corollary 3.3 about the well localization of the inverse of the synthesis operator $S_\Phi$.

Theorem 3.1. Let $\Lambda$ be a relatively-separated subset of $\mathbb{R}^d$, and let $\Phi = \{\phi_\lambda, \lambda \in \Lambda\}$ be a family of functions with the property that

\[
\sup_{\lambda\in\Lambda} \|\phi_\lambda(\cdot + \lambda)\|_{W_1} < \infty \tag{3.4}
\]

and

\[
\lim_{\delta\to 0} \sup_{\lambda\in\Lambda} \|\omega_\delta(\phi_\lambda)(\cdot + \lambda)\|_{W_1} = 0. \tag{3.5}
\]

If the synthesis operator $S_\Phi$ in (3.1) has $\ell^p$-stability for some $1 \le p \le \infty$, then it has $\ell^q$-stability for any $1 \le q \le \infty$.

For $\Phi = \{\phi_n(\cdot - j)\}_{1\le n\le N,\, j\in\mathbb{Z}^d}$ generated by integer shifts of finitely many functions $\phi_1, \dots, \phi_N$, we have the following corollary for the synthesis operator $S_\Phi$ associated with $\Phi$. In the statement of the following result, we do not include the regularity condition (3.5) because $\lim_{\delta\to 0} \|\omega_\delta(f)\|_{W_1} = 0$ for any continuous function $f$ in the Wiener amalgam space $W_1$ [1].

Corollary 3.2. Let $\phi_1, \dots, \phi_N$ be continuous functions in the Wiener amalgam space $W_1$, and for $1 \le p \le \infty$ define

\[
V_p(\phi_1, \dots, \phi_N) := \Big\{ \sum_{n=1}^{N} \sum_{j\in\mathbb{Z}^d} c_n(j)\phi_n(\cdot - j) : (c_n(j))_{1\le n\le N,\, j\in\mathbb{Z}^d} \in \big(\ell^p(\mathbb{Z}^d)\big)^N \Big\}.
\]

If the synthesis operator $L_{\phi_1,\dots,\phi_N} : (\ell^p(\mathbb{Z}^d))^N \to V_p(\phi_1, \dots, \phi_N)$ defined by

\[
L_{\phi_1,\dots,\phi_N} : (c_n(j))_{1\le n\le N,\, j\in\mathbb{Z}^d} \longmapsto \sum_{n=1}^{N} \sum_{j\in\mathbb{Z}^d} c_n(j)\phi_n(\cdot - j)
\]

has $\ell^p$-stability for some $p \in [1, \infty]$, i.e., there exists a positive constant $C$ such that

\[
C^{-1}\|c\|_{(\ell^p(\mathbb{Z}^d))^N} \le \|L_{\phi_1,\dots,\phi_N} c\|_p \le C\|c\|_{(\ell^p(\mathbb{Z}^d))^N} \quad \text{for all } c \in \big(\ell^p(\mathbb{Z}^d)\big)^N,
\]

then the synthesis operator $L_{\phi_1,\dots,\phi_N}$ has $\ell^q$-stability for any $q \in [1, \infty]$.

The result in the above corollary is established in [43] under the weak assumption that $\phi_1, \dots, \phi_N \in L^\infty$.

Note that the synthesis operator $S_\Phi$ has $\ell^2$-stability if and only if the matrix $A = (a(\lambda, \lambda'))_{\lambda,\lambda'\in\Lambda}$ has $\ell^2$-stability, where $a(\lambda, \lambda') = \int_{\mathbb{R}^d} \phi_\lambda(x)\phi_{\lambda'}(x)\,dx$ for $\lambda, \lambda' \in \Lambda$. This observation together with the equivalence in Theorem 3.1 for the synthesis operator $S_\Phi$ and the Wiener's lemma in [55] for the Sjöstrand class of infinite matrices leads to the following result.


Corollary 3.3. Let $1 \le p \le \infty$, let $\Lambda$ be a relatively-separated subset of $\mathbb{R}^d$, and let $\Phi = \{\phi_\lambda, \lambda \in \Lambda\}$ satisfy (3.4) and (3.5). If the synthesis operator $S_\Phi$ has $\ell^p$-stability, then there exists another family $\tilde\Phi = \{\tilde\phi_\lambda, \lambda \in \Lambda\}$ of functions satisfying (3.4) and (3.5) such that the inverse of the synthesis operator $S_\Phi$ is given by

\[
(S_\Phi)^{-1} f = \Big( \int_{\mathbb{R}^d} f(x)\tilde\phi_\lambda(x)\,dx \Big)_{\lambda\in\Lambda} \quad \text{for all } f \in V_p(\Phi, \Lambda).
\]

The conclusion in the above corollary with $p = 2$ is established in [62] without the regularity assumption (3.5). The conclusion in the above corollary for general $1 \le p \le \infty$ gives a partial answer to a problem in [62, Remark 5.3]. To prove Theorem 3.1, we recall a result in [62].

Lemma 3.4. Let $1 \le p \le \infty$, let $\Lambda$ be a relatively-separated subset of $\mathbb{R}^d$, and let $\Phi = \{\phi_\lambda, \lambda \in \Lambda\}$ satisfy (3.4). Then there exists a positive constant $C$ (that depends on $d$ and $p$ only) such that

\[
\|S_\Phi c\|_p \le C R(\Lambda)^{1-1/p} \sup_{\lambda\in\Lambda} \|\phi_\lambda(\cdot + \lambda)\|_{W_1} \|c\|_{\ell^p(\Lambda)}. \tag{3.6}
\]

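Now we start to prove Theorem 3.1.

Proof of Theorem 3.1. Let $1 \le p, q \le \infty$. By the $\ell^p$-stability of the synthesis operator $S_\Phi$, there exists a positive constant $C_0$ such that

\[
\|c\|_{\ell^p(\Lambda)} \le C_0 \|S_\Phi c\|_p. \tag{3.7}
\]

For $1 \le n \in \mathbb{N}$, define the operator $P_n$ on $L^p$ by

\[
P_n f(x) = 2^{nd} \sum_{\lambda'\in 2^{-n}\mathbb{Z}^d} \phi_0\big(2^n(x - \lambda')\big) \cdot \int_{\mathbb{R}^d} f(y)\phi_0\big(2^n(y - \lambda')\big)\,dy, \quad f \in L^p(\mathbb{R}^d), \tag{3.8}
\]

where $\phi_0$ is the characteristic function on $[0, 1)^d$, and let $\Phi_n = \{P_n\phi_\lambda, \lambda \in \Lambda\}$. Then

\[
|\phi_\lambda(x) - P_n\phi_\lambda(x)| \le \omega_{2^{-n}}(\phi_\lambda)(x) \quad \text{for all } x \in \mathbb{R}^d \text{ and } \lambda \in \Lambda. \tag{3.9}
\]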
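As a rough numerical illustration of (3.8) and (3.9), the following Python sketch works in dimension $d = 1$ on a sampled function, where $P_n$ reduces to averaging over dyadic cells of width $2^{-n}$. The grid, the test function, and the helper names `dyadic_average` and `local_oscillation` are assumptions made for this illustration only; the discrete oscillation is merely a stand-in for the modulus of continuity $\omega_{2^{-n}}$.

```python
import numpy as np

def dyadic_average(samples, cells):
    """Discrete analogue of P_n: replace each cell of samples by its mean."""
    block = samples.size // cells
    means = samples.reshape(cells, block).mean(axis=1)
    return np.repeat(means, block)

def local_oscillation(samples, width):
    """Max over 1 <= k < width of |f[i +/- k] - f[i]| (discrete stand-in for omega)."""
    osc = np.zeros(samples.size)
    for k in range(1, width):
        diff = np.abs(samples[k:] - samples[:-k])
        osc[:-k] = np.maximum(osc[:-k], diff)  # compare f[i] with f[i + k]
        osc[k:] = np.maximum(osc[k:], diff)    # compare f[i] with f[i - k]
    return osc

# Hypothetical test function sampled on [0, 1) with 1024 points; n = 4 gives 16 cells.
x = np.linspace(0.0, 1.0, 1024, endpoint=False)
f = np.sin(2 * np.pi * x) + 0.3 * np.cos(6 * np.pi * x)
cells = 2 ** 4
Pf = dyadic_average(f, cells)
osc = local_oscillation(f, f.size // cells)

idempotent = np.allclose(dyadic_average(Pf, cells), Pf)          # P_n^2 = P_n
pointwise_bound = bool(np.all(np.abs(f - Pf) <= osc + 1e-12))    # analogue of (3.9)
```

The same averaging operator is a contraction in every discrete $\ell^p$ norm by Jensen's inequality, which is the property used later when $P_n$ is applied inside norm estimates.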

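From (3.7), (3.9) and Lemma 3.4 it follows that

\[
\|S_\Phi c\|_p \le \|S_{\Phi_n} c\|_p + \|S_{\Phi - \Phi_n} c\|_p \le \|S_{\Phi_n} c\|_p + C R(\Lambda)^{1-1/p} \sup_{\lambda\in\Lambda} \|\omega_{2^{-n}}(\phi_\lambda)(\cdot + \lambda)\|_{W_1} \|c\|_{\ell^p(\Lambda)}. \tag{3.10}
\]

Combining (3.5) and (3.10) leads to the existence of a sufficiently large integer $n_0$ such that

\[
\|c\|_{\ell^p(\Lambda)} \le 2C_0 \|S_{\Phi_{n_0}} c\|_p. \tag{3.11}
\]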

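Define $A_{n_0} = (a_{n_0}(\lambda', \lambda))_{\lambda'\in 2^{-n_0}\mathbb{Z}^d,\, \lambda\in\Lambda}$ by

\[
a_{n_0}(\lambda', \lambda) = 2^{n_0 d} \int_{\mathbb{R}^d} \phi_\lambda(y)\phi_0\big(2^{n_0}(y - \lambda')\big)\,dy. \tag{3.12}
\]

Since

\[
P_{n_0}\phi_\lambda = \sum_{\lambda'\in 2^{-n_0}\mathbb{Z}^d} a_{n_0}(\lambda', \lambda)\phi_0\big(2^{n_0}(\cdot - \lambda')\big)
\]

and

\[
\Big\| \sum_{\lambda'\in 2^{-n_0}\mathbb{Z}^d} a(\lambda')\phi_0\big(2^{n_0}(\cdot - \lambda')\big) \Big\|_p = 2^{-n_0 d/p} \|a\|_{\ell^p(2^{-n_0}\mathbb{Z}^d)} \tag{3.13}
\]

for any $a = (a(\lambda'))_{\lambda'\in 2^{-n_0}\mathbb{Z}^d} \in \ell^p(2^{-n_0}\mathbb{Z}^d)$, Eq. (3.11) can be rewritten in the following matrix formulation:

\[
\|c\|_{\ell^p(\Lambda)} \le 2C_0 2^{-n_0 d/p} \|A_{n_0} c\|_{\ell^p(2^{-n_0}\mathbb{Z}^d)}. \tag{3.14}
\]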
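The norm identity (3.13) can be checked numerically in a simple case. The sketch below assumes $d = 1$, finitely many nonzero coefficients, and finite $p$; the function name `step_function_lp_norm` and the oversampling parameter are hypothetical choices for this illustration. Because the function is piecewise constant on cells of width $2^{-n}$, the Riemann sum is exact up to rounding.

```python
import numpy as np

def step_function_lp_norm(coeffs, n, p, oversample=8):
    """L^p norm of sum_k coeffs[k] * 1_[k 2^-n, (k+1) 2^-n), computed by a Riemann sum."""
    cell = 2.0 ** (-n)
    values = np.repeat(np.asarray(coeffs, dtype=float), oversample)
    dx = cell / oversample
    return (np.sum(np.abs(values) ** p) * dx) ** (1.0 / p)

rng = np.random.default_rng(1)
a = rng.standard_normal(50)      # arbitrary coefficient sequence
n = 3

# Pairs (left side, right side) of (3.13) for a few exponents p, with d = 1.
checks = [
    (step_function_lp_norm(a, n, p), 2.0 ** (-n / p) * np.linalg.norm(a, ord=p))
    for p in (1.0, 2.0, 3.5)
]
```

The scaling factor $2^{-n/p}$ is simply the $p$-th root of the cell volume, which is why the matrix formulation (3.14) carries the factor $2^{-n_0 d/p}$.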

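By (3.4), it holds that

\[
\begin{aligned}
\sum_{j\in\mathbb{Z}^d} \sup_{\lambda'\in 2^{-n_0}\mathbb{Z}^d,\, \lambda\in\Lambda} |a_{n_0}(\lambda', \lambda)|\, \chi_{j+[0,1)^d}(\lambda' - \lambda)
&\le 2^{n_0 d} \sum_{j\in\mathbb{Z}^d} \sup_{\lambda'\in 2^{-n_0}\mathbb{Z}^d,\, \lambda\in\Lambda} \chi_{j+[0,1)^d}(\lambda' - \lambda) \cdot \int_{\mathbb{R}^d} h(y - \lambda)\phi_0\big(2^{n_0}(y - \lambda')\big)\,dy \\
&\le \sum_{j\in\mathbb{Z}^d} \sup_{y\in j+[0,2)^d} h(y) \le 2^d \|h\|_{W_1} < \infty,
\end{aligned}
\]

where $h(x) = \sup_{\lambda\in\Lambda} |\phi_\lambda(x + \lambda)|$, which means that the infinite matrix $A_{n_0}$ in (3.12) belongs to the Sjöstrand class $\mathcal{C}(2^{-n_0}\mathbb{Z}^d, \Lambda)$,

\[
A_{n_0} \in \mathcal{C}\big(2^{-n_0}\mathbb{Z}^d, \Lambda\big). \tag{3.15}
\]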

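By (3.14), (3.15) and Theorem 2.1, the infinite matrix $A_{n_0}$ has the $\ell^q$-stability, i.e., there exists a positive constant $C_1$ such that

\[
\|c\|_{\ell^q(\Lambda)} \le C_1 \|A_{n_0} c\|_{\ell^q(2^{-n_0}\mathbb{Z}^d)} \quad \text{for all } c \in \ell^q(\Lambda). \tag{3.16}
\]

For any $c = (c(\lambda))_{\lambda\in\Lambda} \in \ell^q(\Lambda)$,

\[
\|A_{n_0} c\|_{\ell^q(2^{-n_0}\mathbb{Z}^d)} = 2^{n_0 d/q} \Big\| \int_{\mathbb{R}^d} K_{n_0}(\cdot, y)(S_\Phi c)(y)\,dy \Big\|_q \le 2^{n_0 d/q} \|S_\Phi c\|_q \tag{3.17}
\]

by (3.13), where $K_{n_0}(x, y) = 2^{n_0 d} \sum_{\lambda'\in 2^{-n_0}\mathbb{Z}^d} \phi_0(2^{n_0}(x - \lambda'))\phi_0(2^{n_0}(y - \lambda'))$. The $\ell^q$-stability of the synthesis operator $S_\Phi$ then follows from (3.16) and (3.17). □

4. $L^p$-stability for localized integral operators

In this section, we consider the $L^p$-stability of integral operators

\[
Tf(x) := \int_{\mathbb{R}^d} K_T(x, y) f(y)\,dy, \quad f \in L^p(\mathbb{R}^d), \tag{4.1}
\]

whose kernels $K_T$ are enveloped by convolution kernels with certain decay at infinity, i.e.,

\[
|K_T(x, y)| \le h(x - y) \quad \text{for all } x, y \in \mathbb{R}^d, \tag{4.2}
\]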
where $h$ is a function in the Wiener amalgam space $W_1$ [8,15,44,46,47,63]. Examples of the integral operators of the form (4.1) include projection operators on wavelet spaces [19,21,22,25,43,62], frame operators associated with Gabor systems in the time–frequency space [5,7,31,35], and reconstruction operators in sampling theory [1,60,62]. An integral operator $T$ with kernel $K_T$ enveloped by a convolution kernel in the Wiener amalgam space defines a bounded operator on $L^p$. The above class $\mathcal{C}_1$ of localized integral operators is a non-unital algebra. The new algebra $\mathcal{IC}_1 = \{\lambda I + T : \lambda \in \mathbb{C}, T \in \mathcal{C}_1\}$ obtained by adding the identity operator $I$ on $L^p$ to the algebra $\mathcal{C}_1$ is a unital Banach subalgebra of $\mathcal{B}(L^p)$, $1 \le p \le \infty$ [63]. In this section, we discuss the $L^p$-stability of the localized integral operators in $\mathcal{IC}_1$ with additional regularity on kernels. The main results of this section are Theorem 4.1 (a slight generalization of Theorem 1.4), and Corollary 4.2 concerning the well localization of the inverse of a localized integral operator.

Theorem 4.1. Let $0 < \alpha \le 1$, let $D$ be a positive constant, and let $T$ be an integral operator of the form (4.1) with its kernel $K_T$ satisfying

\[
\sup_{y\in\mathbb{R}^d} \|K_T(y, \cdot + y)\|_{W_1} \le D \tag{4.3}
\]

and

\[
\sup_{y\in\mathbb{R}^d} \|\omega_\delta(K_T)(y, \cdot + y)\|_{W_1} \le D\delta^\alpha \quad \text{for all } \delta \in (0, 1). \tag{4.4}
\]

If $I + T$ has $L^p$-stability for some $1 \le p \le \infty$, then it has $L^q$-stability for all $1 \le q \le \infty$.

The above result for $p = 2$ follows from the Wiener's lemma for localized integral operators, see [63]. Applying the $L^p$ equivalence in Theorem 4.1 for different $p$, we can extend the Wiener's lemma in [63] to $p \ne 2$.


Corollary 4.2. Let $1 \le p \le \infty$, $0 \ne \lambda \in \mathbb{C}$, and let $T$ be an integral operator with its kernel $K_T$ satisfying (4.3) and (4.4). If $\lambda I + T$ has bounded inverse on $L^p$, then $(\lambda I + T)^{-1} = \lambda^{-1} I + \tilde T$ for some integral operator $\tilde T$ with kernel satisfying (4.3) and (4.4).

Recall that any integral operator having its kernel satisfying (4.3) and (4.4) does not have bounded inverse in $L^p$, $1 \le p < \infty$ [63]. Then from Corollary 4.2 we have the following result on the spectra of localized integral operators on $L^p$.

Corollary 4.3. Let $T$ be an integral operator with its kernel $K_T$ satisfying (4.3) and (4.4). Then

\[
\sigma_p(T) = \sigma_q(T) \quad \text{for all } 1 \le p, q < \infty, \tag{4.5}
\]

where $\sigma_p(T)$ denotes the spectrum of the operator $T$ on $L^p$.

Now we start to prove Theorem 4.1.

Proof of Theorem 4.1. By the $L^p$-stability of the operator $I + T$, there exists a positive constant $C_0$ such that

\[
\|f\|_p \le C_0 \|(I + T)f\|_p \quad \text{for all } f \in L^p. \tag{4.6}
\]

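For $1 \le n \in \mathbb{N}$, let $T_n = P_n T P_n$ with kernel $K_{T_n}$, where $P_n$ is given in (3.8). Then

\[
K_{T_n}(x, y) = \sum_{\lambda,\lambda'\in 2^{-n}\mathbb{Z}^d} a_n(\lambda, \lambda')\phi_0\big(2^n(x - \lambda)\big)\phi_0\big(2^n(y - \lambda')\big) \tag{4.7}
\]

and

\[
|K_{T_n}(x, y) - K_T(x, y)| \le C\omega_{2^{-n}}(K_T)(x, y) \quad \text{for all } x, y \in \mathbb{R}^d, \tag{4.8}
\]

where $\phi_0$ is the characteristic function on $[0, 1)^d$ and

\[
a_n(\lambda, \lambda') = 2^{2nd} \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} \phi_0\big(2^n(s - \lambda)\big) K_T(s, t)\phi_0\big(2^n(t - \lambda')\big)\,ds\,dt \tag{4.9}
\]

for $\lambda, \lambda' \in 2^{-n}\mathbb{Z}^d$. Therefore we have from (4.4) and (4.8) that for any $f \in L^r$ with $1 \le r \le \infty$,

\[
\|(T - T_n)f\|_r \le C \sup_{y\in\mathbb{R}^d} \|\omega_{2^{-n}}(K_T)(y, \cdot + y)\|_{W_1} \|f\|_r \le C 2^{-n\alpha}\|f\|_r. \tag{4.10}
\]

By (4.4), (4.6) and (4.10), there exists a sufficiently large integer $n_0$ such that for all $n \ge n_0$,

\[
\|f\|_p \le 2C_0 \|(I + T_n)f\|_p \quad \text{for all } f \in L^p. \tag{4.11}
\]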


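Let

\[
A_n := \big(a_n(\lambda, \lambda')\big)_{\lambda,\lambda'\in 2^{-n}\mathbb{Z}^d}, \tag{4.12}
\]

where $a_n(\lambda, \lambda')$, $\lambda, \lambda' \in 2^{-n}\mathbb{Z}^d$, are given in (4.9). Applying (4.11) to

\[
f_n := \sum_{\lambda\in 2^{-n}\mathbb{Z}^d} c_n(\lambda)\phi_0\big(2^n(\cdot - \lambda)\big) \quad \text{with } (c_n(\lambda))_{\lambda\in 2^{-n}\mathbb{Z}^d} \in \ell^p(2^{-n}\mathbb{Z}^d),
\]

and noting

\[
\|f_n\|_p = 2^{-nd/p}\|c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \tag{4.13}
\]

and

\[
\|(I + T_n)f_n\|_p = 2^{-nd/p}\big\|(I + 2^{-nd}A_n)c_n\big\|_{\ell^p(2^{-n}\mathbb{Z}^d)}, \tag{4.14}
\]

we obtain the uniform $\ell^p$-stability of the matrix $I + 2^{-nd}A_n$, i.e.,

\[
\|c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \le 2C_0 \big\|(I + 2^{-nd}A_n)c_n\big\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \tag{4.15}
\]

holds for any $c_n \in \ell^p(2^{-n}\mathbb{Z}^d)$ and $n \ge n_0$. Define $A_{n,s} = (a_{n,s}(\lambda, \lambda'))_{\lambda,\lambda'\in 2^{-n}\mathbb{Z}^d}$, where

\[
a_{n,s}(\lambda, \lambda') = \begin{cases} a_n(\lambda, \lambda') & \text{if } \|\lambda - \lambda'\|_\infty < s, \\ 0 & \text{otherwise.} \end{cases}
\]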
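The band truncation $A_{n,s}$ just defined can be sketched in Python as follows. This is only an illustration under stated assumptions: a finite piece of the lattice $2^{-n}\mathbb{Z}$ with $d = 1$, a hypothetical matrix with polynomial off-diagonal decay, and the maximal row/column sum (`schur_norm`) used as a simple stand-in for the Sjöstrand norm $\|\cdot\|_{\mathcal{C}}$, which it is not.

```python
import numpy as np

def truncate_band(A, points, s):
    """Keep a(lam, lam') when ||lam - lam'||_inf < s and set it to 0 otherwise."""
    dist = np.abs(points[:, None, :] - points[None, :, :]).max(axis=2)
    return np.where(dist < s, A, 0.0)

def schur_norm(A):
    """Largest absolute row or column sum, a simple bound for the l^p operator norm."""
    absA = np.abs(A)
    return max(absA.sum(axis=0).max(), absA.sum(axis=1).max())

# Hypothetical example: the lattice 2^{-n} Z intersected with [0, 4), n = 2, d = 1.
n = 2
points = (np.arange(0, 4 * 2 ** n) * 2.0 ** (-n)).reshape(-1, 1)
gap = np.abs(points[:, None, 0] - points[None, :, 0])
A = 1.0 / (1.0 + gap) ** 3          # entries decay polynomially off the diagonal

# Truncation error for increasing band widths s.
errors = [schur_norm(A - truncate_band(A, points, s)) for s in (0.5, 1.0, 2.0, 4.0)]
```

In this toy setting the error is nonincreasing in $s$ and vanishes once $s$ exceeds the diameter of the point set, mirroring the role of the tail sums of $h$ in the estimates that follow.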

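Then for $s \ge 0$,

\[
\begin{aligned}
\|A_n - A_{n,s}\|_{\mathcal{C}}
&\le \sum_{j\in\mathbb{Z}^d \text{ with } \|j\|_\infty \ge s-1}\ \sup_{\lambda,\lambda'\in 2^{-n}\mathbb{Z}^d} |a_n(\lambda, \lambda')|\, \chi_{j+[0,1)^d}(\lambda - \lambda') \\
&\le 2^{2nd} \sum_{j\in\mathbb{Z}^d \text{ with } \|j\|_\infty \ge s-1}\ \sup_{\lambda,\lambda'\in 2^{-n}\mathbb{Z}^d} \chi_{j+[0,1)^d}(\lambda - \lambda') \times \int_{2^{-n}[0,1)^d}\int_{2^{-n}[0,1)^d} |K_T(\lambda + s, \lambda' + t)|\,ds\,dt \\
&\le \sum_{j\in\mathbb{Z}^d \text{ with } \|j\|_\infty \ge s-1}\ \sup_{x\in j+[-1,2)^d} h(x) \\
&\le 3^d \sum_{j\in\mathbb{Z}^d \text{ with } \|j\|_\infty \ge s-3}\ \sup_{x\in j+[0,1)^d} h(x),
\end{aligned}
\tag{4.16}
\]

where $h(x) = \sup_{y\in\mathbb{R}^d} |K_T(x + y, y)|$. Thus

\[
\begin{aligned}
\inf_{0\le s\le N} \Big( \|A_n - A_{n,s}\|_{\mathcal{C}} + \frac{s}{N}\|A_n\|_{\mathcal{C}} \Big)
&\le \|A_n - A_{n,\sqrt{N}}\|_{\mathcal{C}} + N^{-1/2}\|A_n\|_{\mathcal{C}} \\
&\le C \Big( \sum_{j\in\mathbb{Z}^d \text{ with } \|j\|_\infty \ge \sqrt{N}-3}\ \sup_{x\in j+[0,1)^d} h(x) + N^{-1/2} \sum_{j\in\mathbb{Z}^d}\ \sup_{x\in j+[0,1)^d} h(x) \Big).
\end{aligned}
\tag{4.17}
\]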

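Let $N$ be a sufficiently large integer chosen later and the multiplication operator $\Psi_j^N$ be as in the proof of Theorem 2.1. Then for $1 \le q \le \infty$, using an argument similar to the one in the proof of Theorem 2.1, we obtain from (2.13), (4.15), (4.17), and Lemmas 2.6–2.8 that

\[
\begin{aligned}
&\Big\| \big( \|\Psi_j^N c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\quad \le 2C_0 \Big\| \big( \|(I + 2^{-nd}A_n)\Psi_j^N c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\quad \le 2^{-nd+1}C_0 \Big\| \big( \|(A_n - A_{n,N})\Psi_j^N c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\qquad + 2^{-nd+1}C_0 \Big\| \big( \|(A_{n,N}\Psi_j^N - \Psi_j^N A_{n,N})c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\qquad + 2^{-nd+1}C_0 \Big\| \big( \|\Psi_j^N (A_{n,N} - A_n)c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\qquad + 2C_0 \Big\| \big( \|\Psi_j^N (I + 2^{-nd}A_n)c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\quad \le C_0 C \Big( \|A_n - A_{n,N}\|_{\mathcal{C}} + \inf_{0\le s\le N} \Big( \|A_n - A_{n,s}\|_{\mathcal{C}} + \frac{s}{N}\|A_n\|_{\mathcal{C}} \Big) \Big) \times \Big( \sum_{j\in N\mathbb{Z}^d} \|\Psi_j^N c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)}^q \Big)^{1/q} \\
&\qquad + 2C_0 \Big( \sum_{j\in N\mathbb{Z}^d} \|\Psi_j^N (I + 2^{-nd}A_n)c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)}^q \Big)^{1/q} \\
&\quad \le C C_0 \Big( \sum_{j\in\mathbb{Z}^d \text{ with } \|j\|_\infty \ge \sqrt{N}-3}\ \sup_{x\in j+[0,1)^d} h(x) + N^{-1/2} \sum_{j\in\mathbb{Z}^d}\ \sup_{x\in j+[0,1)^d} h(x) \Big) \Big\| \big( \|\Psi_j^N c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)} \\
&\qquad + 2C_0 \Big\| \big( \|\Psi_j^N (I + 2^{-nd}A_n)c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N\mathbb{Z}^d} \Big\|_{\ell^q(N\mathbb{Z}^d)},
\end{aligned}
\tag{4.18}
\]

where $h(x) = \sup_{y\in\mathbb{R}^d} |K_T(y, x + y)|$ and the uppercase letter $C$ denotes an absolute constant independent of $n \ge n_0$ and $N \ge 1$ but may be different at different occurrences. By (4.3) and (4.18) there exists a sufficiently large integer $N_0$ (independent of $n \ge n_0$) such that

\[
\Big\| \big( \|\Psi_j^{N_0} c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N_0\mathbb{Z}^d} \Big\|_{\ell^q(N_0\mathbb{Z}^d)} \le 4C_0 \Big\| \big( \|\Psi_j^{N_0} (I + 2^{-nd}A_n)c_n\|_{\ell^p(2^{-n}\mathbb{Z}^d)} \big)_{j\in N_0\mathbb{Z}^d} \Big\|_{\ell^q(N_0\mathbb{Z}^d)} \tag{4.19}
\]

holds for any $c_n \in \ell^q(2^{-n}\mathbb{Z}^d)$ and $n \ge n_0$.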


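Combining (2.24), (2.25) and (4.19) yields

\[
\|c_n\|_{\ell^q(2^{-n}\mathbb{Z}^d)} \le C_1 2^{nd|1/p-1/q|} \big\|(I + 2^{-nd}A_n)c_n\big\|_{\ell^q(2^{-n}\mathbb{Z}^d)} \quad \text{for all } c_n \in \ell^q(2^{-n}\mathbb{Z}^d), \tag{4.20}
\]

where $C_1$ is a positive constant independent of $n \ge n_0$. This together with (4.13) and (4.14) proves that

\[
\|f_n\|_q \le C_1 2^{nd|1/p-1/q|} \|(I + T_n)f_n\|_q \tag{4.21}
\]

holds for any $f_n = \sum_{\lambda\in 2^{-n}\mathbb{Z}^d} c_n(\lambda)\phi_0(2^n(\cdot - \lambda))$ with $(c_n(\lambda))_{\lambda\in 2^{-n}\mathbb{Z}^d} \in \ell^q(2^{-n}\mathbb{Z}^d)$. Note that for any $f \in L^q$,

\[
P_n f = 2^{nd} \sum_{\lambda\in 2^{-n}\mathbb{Z}^d} \phi_0\big(2^n(\cdot - \lambda)\big) \cdot \int_{\mathbb{R}^d} f(y)\phi_0\big(2^n(y - \lambda)\big)\,dy \tag{4.22}
\]

with

\[
\Big\| \Big( 2^{nd} \int_{\mathbb{R}^d} f(y)\phi_0\big(2^n(y - \lambda)\big)\,dy \Big)_{\lambda\in 2^{-n}\mathbb{Z}^d} \Big\|_{\ell^q(2^{-n}\mathbb{Z}^d)} \le 2^{nd/q}\|f\|_q < \infty. \tag{4.23}
\]

Therefore it follows from (4.21) that

\[
\|P_n f\|_q \le C_1 2^{nd|1/p-1/q|} \|(I + T_n)P_n f\|_q \quad \text{for all } f \in L^q. \tag{4.24}
\]

By (4.13), (4.22) and (4.23), we have

\[
\|P_n f\|_q \le \|f\|_q \quad \text{for all } f \in L^q. \tag{4.25}
\]

This implies that

\[
\|f\|_q \le \|f - P_n f\|_q + \|P_n f\|_q \le 3\|f\|_q \quad \text{for all } f \in L^q. \tag{4.26}
\]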

Noting that $P_n^2 = P_n$, $(I - P_n)(I + T)f = (I - P_n)f + (I - P_n)T(I - P_n)f + (I - P_n)TP_n f$ and $P_n(I + T)f = P_n(I + T)P_n f + P_n T(I - P_n)^2 f$, we obtain from the second inequality of (4.26) that for any $f \in L^q$,

\[
\begin{aligned}
\|(I + T)f\|_q
&\ge \frac{1}{3}\|(I - P_n)(I + T)f\|_q + \frac{1}{3}\|P_n(I + T)f\|_q \\
&\ge \frac{1}{3}\|(I - P_n)f\|_q + \frac{1}{3}\|P_n(I + T)P_n f\|_q - \frac{1}{3}\|(I - P_n)T(I - P_n)f\|_q \\
&\quad - \frac{1}{3}\|(I - P_n)TP_n f\|_q - \frac{1}{3}\|P_n T(I - P_n)^2 f\|_q.
\end{aligned}
\tag{4.27}
\]

We note that $(I - P_n)T$ and $T(I - P_n)$ are integral operators with their kernels bounded by $\omega_{2^{-n}}(K_T)$, where $K_T$ is the kernel of the integral operator $T$. Therefore, similarly to the argument in (4.10), we have

\[
\|(I - P_n)Tf\|_r + \|T(I - P_n)f\|_r \le C 2^{-n\alpha}\|f\|_r \quad \text{for all } f \in L^r, \tag{4.28}
\]

where $1 \le r \le \infty$ and $C$ is a positive constant independent of $n \ge n_0$. For those $1 \le q \le \infty$ satisfying $|1/q - 1/p| < \alpha/d$, we get from (4.24)–(4.28) that

\[
\begin{aligned}
\|(I + T)f\|_q
&\ge \Big( \frac{1}{3} - C 2^{-n\alpha} \Big)\|f - P_n f\|_q + \Big( (3C_1)^{-1} 2^{-nd|1/p-1/q|} - C 2^{-n\alpha} \Big)\|P_n f\|_q \\
&\ge \frac{1}{4}\|f - P_n f\|_q + (4C_1)^{-1} 2^{-nd|1/p-1/q|}\|P_n f\|_q \\
&\ge (4C_1)^{-1} 2^{-nd|1/p-1/q|}\|f\|_q \quad \text{for all } f \in L^q
\end{aligned}
\tag{4.29}
\]

if we let the integer $n$ be chosen to be sufficiently large. This proves that $I + T$ has $L^q$-stability if $|1/p - 1/q| < \alpha/d$. Using the above argument iteratively leads to the conclusion that $I + T$ has $L^q$-stability for all $1 \le q \le \infty$. □

Acknowledgments

The authors thank Professors Akram Aldroubi, Radu Balan, Ilya Krishtal, Deguang Han, Wai-Shing Tang, and Romain Tessera for their discussion and suggestions in preparing the manuscript.

References

[1] A. Aldroubi, K. Gröchenig, Nonuniform sampling and reconstruction in shift-invariant space, SIAM Review 43 (2001) 585–620. [2] A. Aldroubi, Q. Sun, W.-S. Tang, p-Frames and shift invariant subspaces of L^p, J. Fourier Anal. Appl. 7 (2001) 1–21. [3] A. Aldroubi, A. Baskakov, I. Krishtal, Slanted matrices, Banach frames, and sampling, J. Funct. Anal. 255 (2008) 1667–1691. [4] N. Atreas, J.J. Benedetto, C. Karinakas, Local sampling for regular wavelet and Gabor expansions, Sampl. Theory Signal Image Process. 2 (2003) 1–24. [5] R. Balan, The noncommutative Wiener lemma, linear independence, and special properties of the algebra of time–frequency shift operators, Trans. Amer. Math. Soc. 360 (2008) 3921–3941. [6] R. Balan, I. Krishtal, An almost periodic noncommutative Wiener’s lemma, preprint, 2008. [7] R. Balan, P.G. Casazza, C. Heil, Z. Landau, Density, overcompleteness and localization of frames I. Theory, J. Fourier Anal. Appl. 12 (2006) 105–143; R. Balan, P.G. Casazza, C. Heil, Z. Landau, Density, overcompleteness and localization of frames II. Gabor system, J. Fourier Anal. Appl. 12 (2006) 309–344. [8] B.A. Barnes, The spectrum of integral operators on Lebesgue spaces, J. Operator Theory 18 (1987) 115–132. [9] B.A. Barnes, When is the spectrum of a convolution operator on L^p independent of p? Proc. Edinb. Math. Soc. 33 (1990) 327–332. [10] A.G. Baskakov, Wiener’s theorem and asymptotic estimates for elements of inverse matrices, Funktsional. Anal. i Prilozhen. 24 (1990) 64–65; translation in: Funct. Anal. Appl. 24 (1990) 222–224.


[11] L. Berg, G. Plonka, Spectral properties of two-slanted matrices, Results Math. 35 (1999) 201–215. [12] S. Bochner, R.S. Phillips, Absolutely convergent Fourier expansions for non-commutative normed rings, Ann. of Math. 43 (1942) 409–418. [13] A. Boulkhemair, Remarks on a Wiener type pseudo-differential algebra and Fourier integral operators, Math. Res. Lett. 4 (1997) 53–67. [14] A. Boulkhemair, L2 estimates for Weyl quantization, J. Funct. Anal. 165 (1999) 173–204. [15] L.H. Brandenburg, On identifying maximal ideals in Banach algebras, J. Math. Anal. Appl. 50 (1975) 489–510. [16] A. Cavaretta, W. Dahmen, C.A. Micchelli, Stationary subdivision, Mem. Amer. Math. Soc. 453 (1991) 1–186. [17] O. Christensen, T. Strohmer, The finite section method and problems in frame theory, J. Approx. Theory 133 (2005) 221–237. [18] C.K. Chui, An Introduction to Wavelets, Academic Press, New York, 1992. [19] C.K. Chui, Q. Sun, Affine frame decompositions and shift-invariant spaces, Appl. Comput. Harmon. Anal. 20 (2006) 74–107. [20] C.K. Chui, W. He, J. Stöckler, Nonstationary tight wavelet frames, II: Unbounded intervals, Appl. Comput. Harmon. Anal. 18 (2005) 25–66. [21] A. Cohen, N. Dyn, Nonstationary subdivision schemes and multiresolution analysis, SIAM J. Math. Anal. 27 (1996) 1745–1769. [22] R. Coifman, M. Maggioni, Diffusion wavelets, Appl. Comput. Harmon. Anal. 21 (2006) 53–94. [23] A. Connes, C ∗ algebres et geometrie differentielle, C. R. Acad. Sci. Paris Sér. A–B 290 (1980) A599–A604. [24] E. Cordero, K. Gröchenig, Localization of frames II, Appl. Comput. Harmon. Anal. 17 (2004) 29–47. [25] I. Daubechies, Ten Lectures on Wavelets, CBMF Conf. Ser. Appl. Math., vol. 61, SIAM, Philadelphia, PA, 1992. [26] C. de Boor, A bound on the L∞ -norm of the L2 -approximation by splines in terms of a global mesh ratio, Math. Comp. 30 (1976) 687–694. [27] R.A. Devore, G.G. Lorentz, Constructive Approximation, Springer-Verlag, Berlin, 1993. [28] S. 
Demko, Inverse of band matrices and local convergences of spline projections, SIAM J. Numer. Anal. 14 (1977) 616–619. [29] B. Farrell, T. Strohmer, Inverse-closedness of a Banach algebra of integral operators on the Heisenberg group, J. Operator Theory, in press. [30] I.M. Gelfand, D.A. Raikov, G.E. Silov, Commutative Normed Rings, Chelsea, New York, 1964. [31] K. Gröchenig, Foundation of Time–Frequency Analysis, Birkhäuser Boston, Boston, MA, 2001. [32] K. Gröchenig, Localized frames are finite unions of Riesz sequences, Adv. Comput. Math. 18 (2003) 149–157. [33] K. Gröchenig, Localization of frames, Banach frames, and the invertibility of the frame operator, J. Fourier Anal. Appl. 10 (2004) 105–132. [34] K. Gröchenig, Time–frequency analysis of Sjöstrand’s class, Rev. Mat. Iberoamericana 22 (2006) 703–724. [35] K. Gröchenig, M. Leinert, Wiener’s lemma for twisted convolution and Gabor frames, J. Amer. Math. Soc. 17 (2003) 1–18. [36] K. Gröchenig, M. Leinert, Symmetry of matrix algebras and symbolic calculus for infinite matrices, Trans. Amer. Math. Soc. 358 (2006) 2695–2711. [37] K. Gröchenig, Z. Rzeszotnik, T. Strohmer, Quantitative estimates for the finite section method, preprint. [38] K. Gröchenig, T. Strohmer, Pseudodifferential operators on locally compact abelian groups and Sjöstrand’s symbol class, J. Reine Angew. Math. 613 (2007) 121–146. [39] C.C. Graham, O.C. McGehee, Essay in Commutative Harmonic Analysis, Springer-Verlag, Berlin, 1979. [40] F. Herau, Melin–Hömander inequality in a Wiener type pseudo-differential algebra, Ark. Math. 39 (2001) 311–338. [41] A. Hulanicki, On the spectrum of convolution operators on groups with polynomial growth, Invent. Math. 17 (1972) 135–142. [42] S. Jaffard, Properiétés des matrices bien localisées prés de leur diagonale et quelques applications, Ann. Inst. H. Poincaré Anal. Non Linéaire 7 (1990) 461–476. [43] R.-Q. Jia, C.A. Micchelli, Using the refinement equations for the construction of pre-wavelets. II. 
Powers of two, in: Curves and Surfaces, Chamonix-Mont-Blanc, 1990, Academic Press, Boston, MA, 1991, pp. 209–246. [44] K. Jörgens, Linear Integral Operators, Pitman, London, 1982. [45] J. Kovacevic, P.L. Dragotti, V. Goyal, Filter bank frame expansions with erasures, IEEE Trans. Inform. Theory 48 (2002) 1439–1450. [46] V.G. Kurbatov, Functional Differential Operators and Equations, Kluwer Acad. Publ., Dordrecht, 1999. [47] V.G. Kurbatov, Some algebras of operators majorized by a convolution, Funct. Differ. Equ. 8 (2001) 323–333. [48] S.G. Mallat, A Wavelet Tour of Signal Processing, Academic Press, New York, 1999. [49] Y. Meyer, Ondelettes et Operateurs, Herman, Paris, 1990.



[50] M.A. Naimark, Normed Algebras, Wolters–Noordhoff, Groningen, 1972. [51] G.E. Pfander, On the invertibility of “rectangular” bi-infinite matrices and applications in time–frequency analysis, Linear Algebra Appl. 429 (2008) 331–345. [52] T. Pytlik, On the spectral radius of elements in group algebras, Bull. Acad. Polon. Sci. Ser. Sci. Math. 21 (1973) 899–902. [53] M.A. Rieffel, Projective modules over higher-dimensional noncommutative tori, Canad. J. Math. 40 (1988) 257– 338. [54] L.L. Schumaker, Spline Functions: Basic Theory, Wiley, New York, 1981. [55] J. Sjöstrand, An algebra of pseudodifferential operators, Math. Res. Lett. 1 (1994) 185–192. [56] J. Sjöstrand, Wiener type algebra of pseudodifferential operators, in: Sémin. Équ. Dériv. Partielles, École Polytech., Palaiseau, 1995, Semin. 1994–1995, Exp. No. 4, 19 pp. [57] T. Strohmer, Rates of convergence for the approximation of shift-invariant systems in 2 (Z), J. Fourier Anal. Appl. 5 (2000) 519–616. [58] T. Strohmer, Four short stories about Toeplitz matrix calculations, Linear Algebra Appl. 343/344 (2002) 321–344. [59] Q. Sun, Wiener’s lemma for infinite matrices with polynomial off-diagonal decay, C. R. Math. Acad. Sci. Paris 340 (2005) 567–570. [60] Q. Sun, Non-uniform sampling and reconstruction for signals with finite rate of innovations, SIAM J. Math. Anal. 38 (2006) 1389–1422. [61] Q. Sun, Wiener’s lemma for infinite matrices, Trans. Amer. Math. Soc. 359 (2007) 3099–3123. [62] Q. Sun, Frames in spaces with finite rate of innovations, Adv. Comput. Math. 28 (2008) 301–329. [63] Q. Sun, Wiener’s lemma for localized integral operators, Appl. Comput. Harmon. Anal. 25 (2008) 148–167. [64] M. Takesaki, Theory of Operator Algebra I, Springer-Verlag, Berlin, 1979. [65] R. Tessera, Finding left inverse for operators on p (Zd ) with polynomial decay, preprint, 2008. [66] J. Toft, Subalgebra to a Wiener type algebra of pseudo-differential operators, Ann. Inst. Fourier (Grenoble) 51 (2001) 1347–1383. [67] J. 
Toft, Continuity properties in non-commutative convolution algebras with applications in pseudo-differential calculus, Bull. Sci. Math. 126 (2002) 115–142. [68] J. Toft, Continuity properties for modulation spaces with applications to pseudo-differential calculus I, J. Funct. Anal. 207 (2004) 399–429. [69] H. Triebel, Theory of Function Spaces, Birkhäuser, Basel, 1983. [70] N. Wiener, Tauberian theorem, Ann. of Math. 33 (1932) 1–100. [71] G. Yu, Higher index theory of elliptic operators and geometry of groups, in: Proc. Internat. Congress of Mathematicians, Madrid, Spain, 2006, Amer. Math. Soc., Providence, RI, 2007, pp. 1624–1639.


A martingale approach to minimal surfaces

Robert W. Neel

Department of Mathematics, Columbia University, Room 506, New York, NY, USA

Received 18 June 2008; accepted 29 June 2008. Available online 25 July 2008. Communicated by Daniel W. Stroock

Abstract

We provide a probabilistic approach to studying minimal surfaces in R^3. After a discussion of the basic relationship between Brownian motion on a surface and minimality of the surface, we introduce a way of coupling Brownian motions on two minimal surfaces. This coupling is then used to study two classes of results in minimal surface theory: maximum principle-type results, such as weak and strong halfspace theorems and the maximum principle at infinity, and Liouville theorems. © 2008 Elsevier Inc. All rights reserved.

Keywords: Minimal surface; Halfspace theorem; Maximum principle at infinity; Liouville theorem; Brownian motion; Coupling

doi:10.1016/j.jfa.2008.06.033

1. Introduction

The past several years have seen sustained interest in the theory of minimal surfaces in R^3. We present an approach to studying minimal surfaces using Brownian motion and the methods of martingale theory. We begin with a discussion of the basic relationship between Brownian motion on a minimal surface and the coordinate functions and the Gauss map, particularly in the cases when the surface is either stochastically complete with bounded curvature or (geodesically) complete and properly immersed. We then introduce a way of coupling Brownian motions on a pair of minimal surfaces such that the particles are “encouraged” to couple in finite time. We apply this coupling to two classes of results for minimal surfaces: maximum principle-type theorems, by which we mean weak and strong halfspace theorems and the maximum principle at infinity, and Liouville theorems.

R.W. Neel / Journal of Functional Analysis 256 (2009) 2440–2472


In both cases, we discuss the relationship between the results we obtain from the coupling and the existing results and conjectures obtained by non-stochastic methods of geometric analysis. Further results (beyond what we have been able to prove here) using the coupling appear to be possible, subject to obtaining a better understanding of the behavior of the process when the particles are close. The reader who is primarily interested in the geometric consequences of the coupling is encouraged to proceed to Section 5, where the topics just mentioned are discussed in detail.

In addition to the particular results we are able to prove, there are a few more general reasons for introducing this approach to minimal surfaces, aside from the subjective claim that probabilistic methods are intuitively appealing. First, it provides a common framework for studying maximum principle-type theorems and Liouville theorems. Second, there is reason to think that such a probabilistic framework might make it easier to extend results from minimal surfaces to minimal surfaces-with-boundary. This is because Brownian motion can simply be stopped at the boundary, and prior to hitting the boundary it is governed by the local geometry just as in the boundary-less case. For one example of such an extension, see Theorems 5.1 and 5.3 and their proofs.

2. Brownian motion on a minimal surface

2.1. Basic results

We begin by considering Brownian motion on a connected minimal surface $M$. This is interesting in its own right, and it will also be important for understanding coupled Brownian motion later on. There are several ways to think of Brownian motion on a manifold. For our purposes, we will think of it as the solution to the martingale problem corresponding to one-half the Laplacian on $M$; we now explain what this means in more detail. In general, Brownian motion may only be defined until an explosion time, which we denote by $e$. Let $(C[0, e(\omega)), \mathcal{B})$ be the space of continuous paths on $M$, which we allow to escape to infinity at some time $e(\omega) \in (0, \infty]$ depending on the path $\omega$, with the Borel $\sigma$-algebra; we give the space the topology of uniform convergence on compacts. Let $\{\mathcal{B}_{t\wedge e}, t \ge 0\}$ be the filtration generated by these continuous paths. Then Brownian motion started at $x \in M$ is a probability measure on $(C[0, e(\omega)), \mathcal{B})$, which we denote $P_x$, such that $P_x[\omega(0) = x] = 1$ and

\[
\Big( h(\omega(t \wedge e)) - \int_0^{t\wedge e} \frac{1}{2}\Delta h(\omega(s))\,ds;\ \mathcal{B}_{t\wedge e} \Big) \quad \text{is a } P_x\text{-martingale,}
\]

for any smooth, compactly supported function $h$ on $M$. Throughout this paper, all of our martingales will be continuous, and having established this convention we will refer to them simply as “martingales.”

Minimal surfaces are characterized by the fact that the restrictions of the coordinate functions to the surface are harmonic. From a probabilistic point of view, this means that the compositions of the coordinate functions with Brownian motion are local martingales. More concretely, consider any Euclidean coordinate system $(x_1, x_2, x_3)$ on R^3. The gradient of any coordinate $x_i$ (in $M$) is the projection of the gradient of $x_i$ in R^3 onto the tangent plane of $M$. If we denote the

unit normal to $M$ by $m = (m_1, m_2, m_3)$, then it is easy to check that $\langle \nabla x_i, \nabla x_j \rangle = \delta_{ij} - m_i m_j$, where $\delta_{ij}$ is the Kronecker delta function. Let $B_t$ be Brownian motion on $M$. We will adopt the usual convention of writing the process obtained by composing a function with Brownian motion as $f \circ B_t = f_t$, and when there is no possibility of confusion we will sometimes omit the subscript. We see that the quadratic variations and cross variations of the coordinate processes are given by

\[
d\langle x_{i,t}, x_{j,t} \rangle = (\delta_{ij} - m_i m_j)\,dt. \tag{2.1}
\]
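The matrix $(\delta_{ij} - m_i m_j)$ in (2.1) is the orthogonal projection onto the tangent plane. A small Python check of this linear-algebra fact is sketched below; the function name `covariation_matrix` and the sample normal direction are hypothetical choices for the illustration.

```python
import numpy as np

def covariation_matrix(normal):
    """Matrix (delta_ij - m_i m_j) for the unit vector m proportional to `normal`."""
    m = np.asarray(normal, dtype=float)
    m = m / np.linalg.norm(m)
    return np.eye(3) - np.outer(m, m)

Q = covariation_matrix([1.0, 2.0, 2.0])   # hypothetical normal direction
m = np.array([1.0, 2.0, 2.0]) / 3.0       # the corresponding unit normal

is_projection = np.allclose(Q @ Q, Q)      # Q is idempotent
kills_normal = np.allclose(Q @ m, 0.0)     # no variation in the normal direction
total_rate = float(np.trace(Q))            # sum of the three quadratic variation rates
```

The trace equals 2, so the quadratic variations of the three coordinate processes grow at total rate $2\,dt$, while each individual coordinate has rate $1 - m_i^2 \le 1$.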

This determines the evolution of the coordinate processes in terms of the normal vector. Another consequence of this equation is that the coordinate processes $x_{i,t\wedge e}$ are true (as opposed to local) martingales.

Next, we wish to understand the evolution of the normal vector under Brownian motion on $M$. For any smooth surface in R^3, we can define the Gauss map $G : M \to S^2$ by identifying the normal vector at every point of $M$ with the corresponding point in $S^2$. It is well known that, if $M$ is minimal, the Gauss map is conformal with conformal factor $-\sqrt{|K|}$, where $K$ is the Gauss curvature. Recall that a map is conformal with conformal factor $c$ (where $c$ can vary over the domain) if its differential preserves angles and scales lengths by a factor of $c$, where $c < 0$ means that the map is orientation-reversing. On a minimal surface $K \le 0$, so the Gauss map is orientation-reversing and distorts area by a factor of $|K|$. This implies that $G_t$ is time-changed Brownian motion on $S^2$, with the changed time given by $\int_0^t |K_s|\,ds$.

The introduction of the Gauss map raises the question of orientability of our minimal surfaces. We do not wish to restrict ourselves to orientable surfaces. To accomplish this, observe that the Gauss sphere process is well defined (given a choice of normal vector at the starting point) whether or not the surface is orientable, because given any path there is a unique continuous choice of normal along it. This allows us to state Theorem 2.1 below, for example, without regard to orientability. From the geometric point of view, this corresponds to the fact that Brownian motion lifts to the orientation cover of the surface. With this in mind, we freely ignore questions of orientability in what follows.

In order to say more, we need to place more restrictions on our minimal surface. A natural assumption is that $M$ is (geodesically) complete. An example of a theorem which can be proved in this generality is a result of Osserman that the Gauss map of a non-planar complete minimal surface is dense in $S^2$ (see [11, Theorem 8.1]). However, as mentioned above, we are more interested in global control of the immersion and in Liouville properties of the surface. If we place no restrictions on $M$ beyond completeness, then not much can be said in this direction. In [10], Nadirashvili showed that there exists a complete, minimal, conformal immersion of the disk into the unit ball.

Another natural assumption is that of stochastic completeness. Probabilistically, this means that Brownian motion almost surely exists for all time, that is, it does not explode by “going off to infinity” in finite time. Analytically, stochastic completeness means that the Cauchy initial value problem for one-half the Laplacian with bounded initial data has a unique bounded solution for all time. Let $\rho = \sqrt{x_1^2 + x_2^2 + x_3^2}$. Then a straightforward computation shows (see [14, Section 5.2.2]) that $\frac{1}{2}\Delta\rho^2 = 2$. It follows from the defining property of the martingale problem that

\[
E\big[\rho_{t\wedge e}^2\big] = \rho_0^2 + 2E[t \wedge e] \quad \text{for all } t \in [0, \infty). \tag{2.2}
\]

If $M$ is stochastically complete (which is equivalent to $e$ being identically infinite), this equation holds with $t \wedge e$ replaced by $t$. This obviously prevents $M$ from being contained in a ball. Moreover, because the quadratic variation of a single coordinate is no greater than $t$, it also implies that $M$ cannot be contained in an infinite cylinder (that is, a set of the form $x_1^2 + x_2^2 \le C$ for some $C > 0$ and some choice of orthonormal coordinates on R^3). In other words, if $M$ is a stochastically complete minimal surface, then there is at most one element of RP^2 such that the projection of $M$ onto that line is not the entire line.

If we assume only that $M$ is stochastically complete (or even complete and stochastically complete), the preceding is as much as we can say in this direction. Jorge and Xavier [5] show how to construct complete, minimal, conformal immersions of the disk into the “slab” $\{(x_1, x_2, x_3) : -1 \le x_1 \le 1\}$. It is relatively easy to see that their construction can be performed in such a way that the resulting minimal surface is also stochastically complete. Thus the above result that, for a stochastically complete minimal surface $M$, there is at most one line in RP^2 such that the projection of $M$ onto that line is not the entire line is sharp.

2.2. Bounded curvature and the weak halfspace theorem

In [16], Xavier proved the weak halfspace theorem for complete minimal surfaces of bounded curvature; that is, any complete minimal surface of bounded curvature which is not a plane is not contained in any halfspace. Our goal here is to prove a differential version of the weak halfspace theorem, namely, that the Gauss process accumulates infinite occupation time in every open subset of the Gauss sphere, almost surely. Xavier’s weak halfspace theorem as just mentioned is an obvious corollary, given the relationship between the normal vector and the quadratic variation of the coordinate processes in Eq. (2.1).
It is well known that any complete manifold with bounded curvature is stochastically complete, and thus the assumptions in the theorem below are weaker than in Xavier’s weak halfspace theorem.

Theorem 2.1. Let M be a minimal surface, and assume either that M is recurrent or that M is stochastically complete and has bounded curvature. If M is not flat (that is, K is not identically zero), then the corresponding Gauss sphere process almost surely accumulates infinite occupation time in each open subset of $S^2$.

Proof. First, we observe that recurrent minimal surfaces are easy to handle. Any recurrent manifold is necessarily stochastically complete. Now suppose that M is a recurrent, non-flat minimal surface. Because M is not flat, there are at least two points in $S^2$ in the image of the Gauss map. By recurrence, the normal vector visits a neighborhood of each of these points infinitely often, and thus we see that the Gauss sphere process has infinite quadratic variation. Since it is time-changed Brownian motion, the Gauss process visits every open subset of $S^2$ infinitely often. For any open subset of $S^2$, we can choose some component of its pre-image in M. By recurrence, Brownian motion on M accumulates infinite occupation time in that pre-image almost surely, and we conclude that the Gauss sphere process accumulates infinite occupation time in every open set of $S^2$ almost surely.

Now assume that M is transient, not flat, and stochastically complete with bounded curvature. Thus its universal cover $\tilde{M}$ is conformally equivalent to the unit disk, by the uniformization theorem. It is obviously enough to prove the result for $\tilde{M}$. We will write the metric with respect to the usual Cartesian coordinates on the disk as $\lambda\delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta function (so



lengths are scaled by $\sqrt{\lambda}$ while area is scaled by λ). Here, of course, λ is smooth and positive on the open unit disk D; also, λ determines the time-change taking Euclidean Brownian motion on the disk to Brownian motion on $\tilde{M}$. In particular, the stochastic completeness of $\tilde{M}$ is equivalent to the statement that the integral of λ along paths of the Euclidean Brownian motion (until the first hitting time of the boundary of the disk) is almost surely infinite.

Let f and g be the Weierstrass data for $\tilde{M}$, as described in Lemmas 8.1 and 8.2 of [11]. In particular, g is the stereographic projection of the Gauss map and thus meromorphic, and f is holomorphic. Further, we have that (see [11, Eq. (8.7) and Lemma 9.1], and note that our definition of λ differs from Osserman’s by a power of two)

\[ \lambda = \left( \frac{|f|\,(1 + |g|^2)}{2} \right)^2 \quad \text{and} \quad -K = \left( \frac{4|g'|}{|f|\,(1 + |g|^2)^2} \right)^2 . \]
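The two formulas above can be sanity-checked numerically. The sketch below is not part of the original argument: it assumes the illustrative Weierstrass data f = 1, g(z) = z (Enneper's surface, chosen only as an example), and compares the closed-form −K with the value obtained from the conformal metric through the standard identity −K = Δ(log λ)/(2λ) for a metric of the form $\lambda\delta_{ij}$. All function names are hypothetical.

```python
import math

def lam(f, g):
    """Conformal factor lambda = (|f| (1 + |g|^2) / 2)^2."""
    return (abs(f) * (1 + abs(g) ** 2) / 2) ** 2

def neg_K(f, g, dg):
    """Closed form -K = (4 |g'| / (|f| (1 + |g|^2)^2))^2."""
    return (4 * abs(dg) / (abs(f) * (1 + abs(g) ** 2) ** 2)) ** 2

def neg_K_from_metric(z, h=1e-3):
    """-K = Laplacian(log lambda) / (2 lambda), using a five-point
    finite-difference Laplacian. Assumed example data: f = 1, g(z) = z."""
    def u(w):
        return math.log(lam(1.0, w))
    lap = (u(z + h) + u(z - h) + u(z + 1j * h) + u(z - 1j * h) - 4 * u(z)) / h ** 2
    return lap / (2 * lam(1.0, z))

z = 0.3 + 0.4j                      # a point of the unit disk, |z| = 0.5
closed_form = neg_K(1.0, z, 1.0)    # g'(z) = 1 for g(z) = z
numeric = neg_K_from_metric(z)
print(closed_form, numeric)
```

For this choice of data the closed form reduces to 16/(1 + |z|²)⁴, and the finite-difference value should agree with it up to discretization error.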

We wish to show that the integral of −K along Brownian paths of $\tilde{M}$ is almost surely infinite. Note that $g_t$ (which, we recall, is $g \circ B_t$ where $B_t$ is Brownian motion on $\tilde{M}$) is a complex martingale with quadratic variation given by the integral of −K along $B_t$. Thus, showing that −K has infinite integral along $B_t$ is equivalent to showing that $g_t$ almost surely does not converge, which in turn is the same as showing that g does not have non-tangential limits at the boundary of the disk, except possibly on a set of Lebesgue measure zero. This last equivalence is a consequence of Doob’s form of the Fatou theorem.

We proceed by contradiction, assuming that with probability 2p > 0, $g_t$ does converge. We may assume that our Brownian motion begins at the center of the disk, and thus that the hitting measure of the boundary is proportional to its Lebesgue measure. Possibly after a rotation (of our coordinates on $\mathbb{R}^3$), we can assume that $g_t$ has limits with absolute value less than C with probability p, for some positive constant C. Because of the above formula for −K, bounded curvature implies that there is a constant C′ such that |g′/f| < C′ whenever |g| < 2C. Because a meromorphic function on the disk has a non-tangential limit at a boundary point if and only if it is non-tangentially bounded at that point (again by Doob’s version of the Fatou theorem), it follows that g′/f has a non-tangential limit with absolute value less than C′ on a set of boundary points of Lebesgue measure 2πp. The set of such points where the limit is zero must have Lebesgue measure zero since otherwise g′/f would be identically zero, which would mean that g′ was identically zero, contradicting the assumption that $\tilde{M}$ is not flat. So there is a set Φ of boundary points of Lebesgue measure 2πp where g has non-tangential limits of absolute value less than C and g′/f has non-tangential limits with absolute value in (0, C′).
For any Euclidean Brownian path hitting the boundary at Φ, we see that the integral of λ along the path being infinite almost surely implies that the integral of $|f|^2$ along the path is infinite, using the above formula for λ in terms of the Weierstrass data. This in turn almost surely implies that the integral of $|g'|^2$ along the path is infinite. However, the integral of $|g'|^2$ is the quadratic variation of $g_t$ (which, we recall, is a complex martingale). We conclude that $g_t$ almost surely does not converge along Brownian paths which hit the boundary at Φ. This is a contradiction, and thus we have shown that the integral of −K along Brownian paths of $\tilde{M}$ is almost surely infinite.

Because the Gauss sphere process is time-changed Brownian motion on the sphere with the time-change given by the integral of −K, this shows that the Gauss sphere process hits every open subset of $S^2$ infinitely often. Now consider $B_\epsilon$, an open ball in $S^2$ of radius ε, and let $B_{\epsilon/2}$ be an open ball with the same center and half the radius. Because −K is bounded from above, there exists some δ > 0 such that every time the Gauss sphere process hits $B_{\epsilon/2}$, it spends time δ



in $B_\epsilon$ with probability at least 1/2. An easy application of the Borel–Cantelli lemma then shows that the process spends an infinite amount of time in $B_\epsilon$, almost surely. As this argument applies to any open ball in $S^2$, the theorem is proved. □

We note that this theorem does not require that M is complete. This also explains why the hypothesis of the theorem is that M is not flat, rather than that M is not a plane. For example, let M be the universal cover of a plane minus two points, with the obvious immersion. Then M is a flat, stochastically complete minimal surface, but M is transient and, in particular, not a plane.

2.3. Proper immersion and the weak halfspace theorem

A weak halfspace theorem can also be proved for properly immersed minimal surfaces; namely, if M is a complete, properly immersed minimal surface, and M is not a plane, then M is not contained in any halfspace. This was first done by Hoffman and Meeks [4], using a geometric construction comparing M to the lower half of a catenoid. As noted in [7] (see the discussion surrounding Theorems 1.3 and 1.4 in [7]), it is also a simple consequence of Theorem 3.1 of [2], which is proved using elementary harmonic function methods.

Before continuing, we make an observation about terminology. Any properly immersed minimal surface is necessarily complete, and thus the phrase “complete, properly immersed minimal surface” is somewhat redundant. Nonetheless, we will employ this phrase in order to highlight the parallel between such surfaces and stochastically complete minimal surfaces of bounded curvature, since we are viewing both conditions as strengthenings of the corresponding assumptions of (geodesic or stochastic) completeness.

We claim that Eq. (2.2) implies that any complete, properly immersed minimal surface is stochastically complete. Brownian motion on a complete manifold explodes in finite time if and only if it exits every compact set in finite time.
Properness of the immersion means that exiting every compact set of M is the same as exiting every compact set of $\mathbb{R}^3$, which is the same as $\rho_t^2$ blowing up in finite time. However, Eq. (2.2) implies that the expectation of $\rho_t^2$ remains finite at all times. This justifies our claim. Thus the assumption that M is properly immersed (and, necessarily, complete) is a strengthening of the assumption that M is stochastically complete.

Our probabilistic proof is similar, both in spirit and technique, to the harmonic function approach mentioned above. Also, we note that we are unable to give a stronger, “differential,” version as we did for the bounded curvature case (more on this below).

Theorem 2.2. Let M be a complete, properly immersed minimal surface. If M is not flat, then M is not contained in any halfspace.

Proof. Assume that M is not flat. The case when M is recurrent is already covered by Theorem 2.1, so we assume that M is transient. We begin by showing that the integral of $1 - m_3^2$ along Brownian motion on M blows up almost surely. We will proceed by contradiction; assume that the integral of $1 - m_3^2$ is bounded

with probability p > 0. Let $r = \sqrt{x_1^2 + x_2^2}$. Then the semi-martingale decomposition of the $r_t$ process is

\[ dr_t = \sqrt{1 - (\partial_r \cdot m)^2}\, dW_t + \frac{1}{2r_t}\left[ m_3^2 + (\partial_r \cdot m)^2 \right] dt + dL_t, \]



where $W_t$ is some Brownian motion, and $L_t$ is an increasing process which increases only when $r_t = 0$. Also, $\partial_r$ is the $\mathbb{R}^3$-gradient of r (since we can, at least locally, identify M with its image in $\mathbb{R}^3$, there is no problem in thinking of Brownian motion on M as a martingale in $\mathbb{R}^3$ and writing the coefficients of the semi-martingale decomposition in terms of the geometry of $\mathbb{R}^3$). This is the geometer’s convention of identifying vector fields with first-order differential operators, and we will adopt this convention freely in what follows. We will assume, for simplicity, that we start our Brownian motion at a point with $r_0 > 0$. We wish to compare the $r_t$ process with the 2-dimensional Bessel process generated by the same Brownian motion, that is, the strong solution of

\[ d\tilde{r}_t = dW_t + \frac{1}{2\tilde{r}_t}\, dt \quad \text{with } \tilde{r}_0 = r_0 > 0. \]

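Then the process $(r - \tilde{r})_t$ has the semi-martingale decomposition

\[ d(r - \tilde{r})_t = \left[ \sqrt{1 - (\partial_r \cdot m)^2} - 1 \right] dW_t + \left[ \frac{1}{2r_t}\left( m_3^2 + (\partial_r \cdot m)^2 \right) - \frac{1}{2\tilde{r}_t} \right] dt + dL_t . \]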

We will also write this in integrated form as $(r - \tilde{r})_t = M_t + A_t + L_t$. We wish to control the size of excursions of $r_t$ above $\tilde{r}_t$. Introduce the stopping time

\[ \sigma(C) = \inf\left\{ t \ge 0 : \int_0^t \left( 1 - m_3^2 \right) d\tau \ge C \right\}. \]

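We will assume that C is chosen large enough so that σ(C) = ∞ with probability at least 3p/4. The quadratic variation of M satisfies

\[ \langle M \rangle_{t \wedge \sigma(C)} = \int_0^{t \wedge \sigma(C)} \left[ 2\left( 1 - \sqrt{1 - (\partial_r \cdot m)^2} \right) - (\partial_r \cdot m)^2 \right] d\tau \le 2 \int_0^{t \wedge \sigma(C)} \left( 1 - \sqrt{1 - (\partial_r \cdot m)^2} \right) d\tau . \]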

Further, because $(\partial_r \cdot m)^2 \le 1 - m_3^2$, we see that

\[ 1 - \sqrt{1 - (\partial_r \cdot m)^2} \le 1 - |m_3| \le 1 - m_3^2 . \]

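It follows that $\langle M \rangle_{t \wedge \sigma(C)} \le 2C$. Thus whatever else it is, $M_{t \wedge \sigma(C)}$ is a continuous process with both $\sup_{t \ge 0}\{M_{t \wedge \sigma(C)}\}$ and $\inf_{t \ge 0}\{M_{t \wedge \sigma(C)}\}$ almost surely finite. Now introduce η(t) as the random (but not stopping) time defined by

\[ \eta(t) = \sup\{ \tau \le t : r_\tau \le \tilde{r}_\tau + 1 \}; \]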



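that is, η(t) is the last time before t when r has been at or below $\tilde{r} + 1$. Since integrals of the non-martingale terms make sense with random times that are not stopping times, we can write

\[ (r - \tilde{r})_{t \wedge \sigma(C)} \le 1 + M_{t \wedge \sigma(C)} - M_{\eta(t \wedge \sigma(C))} + \int_{\eta(t \wedge \sigma(C))}^{t \wedge \sigma(C)} \left[ \frac{1}{2r_\tau}\left( m_3^2 + (\partial_r \cdot m)^2 \right) - \frac{1}{2\tilde{r}_\tau} \right] d\tau + \int_{\eta(t \wedge \sigma(C))}^{t \wedge \sigma(C)} dL_\tau . \]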

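Note that $m_3^2 + (\partial_r \cdot m)^2 \le 1$, and thus that the first integrand is always non-positive on the interval of integration. Next, because $\tilde{r}_t$ is a 2-dimensional Bessel process, it is almost surely positive for all time. Thus the $L_t$ term never increases when $r_t \ge \tilde{r}_t$, and thus never increases on the interval of integration. It follows that

\[ (r - \tilde{r})_{t \wedge \sigma(C)} \le 1 + M_{t \wedge \sigma(C)} - M_{\eta(t \wedge \sigma(C))} . \]

Then we conclude that

\[ \sup_{t \ge 0}\{ r_{t \wedge \sigma(C)} - \tilde{r}_{t \wedge \sigma(C)} \} \le 1 + \sup_{t \ge 0}\{ M_{t \wedge \sigma(C)} \} - \inf_{t \ge 0}\{ M_{t \wedge \sigma(C)} \} < \infty \]

almost surely. To continue, note that because

\[ \langle x_3 \rangle_{t \wedge \sigma(C)} = \int_0^{t \wedge \sigma(C)} \left( 1 - m_3^2 \right) d\tau \le C, \]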

we see that $\sup_{t \ge 0}\{ |x_{3, t \wedge \sigma(C)}| \} < \infty$ almost surely.

Recall that with probability 3p/4 > 0, we have σ(C) = ∞. Thus with probability 3p/4, we have that $\sup_{t \ge 0}\{ r_t - \tilde{r}_t \} < \infty$ and $\sup_{t \ge 0}\{ |x_{3,t}| \} < \infty$. Now $\tilde{r}_t$ is recurrent for the set $\tilde{r} < 1$. It follows that, on the set of paths with σ(C) = ∞, we know that $r_t$ returns infinitely often to some bounded interval (depending on the particular path). Since a subset of $\mathbb{R}^3$ with r and $x_3$ bounded is contained in a compact cylinder, it follows that the set of paths with σ(C) = ∞ returns infinitely often to a compact cylinder (the exact cylinder depending on the path). Because the immersion is proper, the intersection of any compact subset of $\mathbb{R}^3$ with M is also compact. Thus, almost every path with σ(C) = ∞ returns infinitely often to some compact subset of M. This implies that, with probability 3p/4, Brownian motion on M is recurrent. This contradicts our assumption that M is transient, and we conclude that the integral of $1 - m_3^2$ is almost surely unbounded.

This means that $x_{3,t}$ is a martingale with almost surely unbounded quadratic variation. Thus M is not contained in the halfspace $x_3 > 0$. Since our choice of Euclidean coordinates on $\mathbb{R}^3$ was arbitrary, it follows that M cannot be contained in any halfspace. □

Note that the computation showing that Brownian motion on M is recurrent on some compact subset of $\mathbb{R}^3$ does not use the fact that the process is restricted to a surface. In particular, consider any $\mathbb{R}^3$-valued (continuous) martingale Y such that the diffusion matrix of Y at each instant is given by the diffusion matrix for Brownian motion on some plane (if Y is Brownian motion



on M, then this plane is the tangent plane to M at Y). Further, suppose that one of the coordinates of Y has finite quadratic variation with probability p > 0. Then the proof shows that, up to a set of probability zero, those paths for which a coordinate of Y is bounded are recurrent on some compact subset of $\mathbb{R}^3$ (with the particular subset depending on the path). That the corresponding plane-field is integrable (in the sense of the Frobenius theorem) and the resulting surface complete and properly immersed in the case of Brownian motion on a minimal surface is only used to show that recurrence on $\mathbb{R}^3$ implies that Brownian motion on M is recurrent.

One consequence of this is that the theorem also applies to complete, properly immersed branched minimal surfaces. In particular, there are at most countably many branch points, so Brownian motion almost surely avoids them all. We then see that the above proof applies in this case as well.

As mentioned, we do not have a “differential” version of this result, as we do in the bounded curvature case. In particular, nothing in this theorem rules out the Gauss process spending only finite time in any closed set in $\mathbb{RP}^2$ not containing the North/South pole. As we will see later, this difference will mean that we can prove more in the bounded curvature case than in the properly immersed case.

3. Coupled Brownian motion

To address more sophisticated issues, we will need to do more than study a single Brownian motion on a minimal surface. In particular, for any two stochastically complete minimal surfaces M and N (where we allow the possibility that they are the same surface), we wish to couple Brownian motions on the surfaces such that the $\mathbb{R}^3$-distance between the two particles goes to zero as efficiently as possible. We will do so by constructing a diffusion on the product space that, pointwise, gives a favorable evolution for the distance between the particles, in a sense to be made precise below.

3.1. Pointwise specification of an optimal coupling

We begin by determining what the coupling should be “instantaneously.” Let $x = x_t$ be the position of the Brownian motion on M and $y = y_t$ be the position of the Brownian motion on N. Let r = r(x, y) be the $\mathbb{R}^3$-distance between x and y. We assume that r(x, y) > 0, since we stop the process if the particles successfully couple. The instantaneous evolution of r will depend only on r and the positions of the tangent planes $T_xM$ and $T_yN$ (and the coupling). We wish to understand this dependence in detail and use it to see how to choose the coupling. As such, we will consider x and y to be given, fixed points and construct our analysis around them.

We can choose orthonormal coordinates $(z_1, z_2, z_3)$ for $\mathbb{R}^3$ which are well suited to the current configuration (we use $z_i$ for our coordinates here instead of $x_i$ in order to avoid overburdening the notation, considering that we use x to denote a point in M). Let $\partial_{z_3}$ lie in the direction of x − y, where we view y and x as elements of $\mathbb{R}^3$ under the corresponding immersions of N and M. Let $\partial_{z_2}$ lie in $T_xM$. (As long as $T_xM$ is not perpendicular to x − y, this will determine $\partial_{z_2}$ up to sign.) Finally, $\partial_{z_1}$ is chosen to complete the orthonormal frame. Such a choice of orthonormal frame determines orthonormal coordinates on $\mathbb{R}^3$ up to translation. Since we are only interested in the relative positions of x and y, any choice of coordinates $\{z_i\}$ corresponding to this choice of frame $\{\partial_{z_i}\}$ will work.



Given such coordinates, $T_xM$ is determined by the angle it makes with the $z_1z_2$-plane; call this angle θ. We choose an orthonormal basis $\{\partial_\alpha, \partial_\beta\}$ for $T_xM$ such that, at x,

\[ \nabla_M z_1 = \cos\theta\, \partial_\alpha, \quad \nabla_M z_2 = \partial_\beta, \quad \text{and} \quad \nabla_M z_3 = \sin\theta\, \partial_\alpha, \]

where $\nabla_M z_i$ is the gradient of $z_i$ restricted to M. In other words, $\partial_\beta$ is in the $\partial_{z_2}$ direction, while $\partial_\alpha$ lies in the intersection of $T_xM$ and the $z_1z_3$-plane.

In order to specify the direction of $T_yN$, we will need two angles. Let ϕ be the angle $T_yN$ makes with the $z_1z_2$-plane. Let ψ be the angle between the intersection of $T_yN$ with the $z_1z_2$-plane and $\partial_{z_2}$. Then we can choose an orthonormal basis $\{\partial_a, \partial_b\}$ for $T_yN$ such that, at y,

\[ \nabla_N z_1 = \cos\varphi \cos\psi\, \partial_a + \sin\psi\, \partial_b, \quad \nabla_N z_2 = -\cos\varphi \sin\psi\, \partial_a + \cos\psi\, \partial_b, \quad \text{and} \quad \nabla_N z_3 = \sin\varphi\, \partial_a . \]

This means that $\partial_b$ lies in the intersection of $T_yN$ and the $z_1z_2$-plane, while $\partial_a$ lies along the projection of $\partial_{z_3}$ onto $T_yN$. Note that, after possibly reflecting some of the $z_i$ and exchanging the roles of M and N, we can, and will, assume that

\[ \theta \in [0, \pi/2], \quad \varphi \in [0, \pi/2], \quad \text{and} \quad \psi \in [0, \pi]. \]

Further, whenever θ or ϕ is zero, we can, and will, take ψ to be zero. With these conventions, there is a unique choice of (θ, ϕ, ψ) for every (x, y) ∈ (M × N) \ {r = 0}. Since the coupling at each point will depend only on these three angles, we will refer to them as the configuration of the system at this instant. It is clear that the map into the configuration space is smooth near any point where all three angles are in the interior of their ranges, but in general will be discontinuous if any of them is at the boundary of its range. For the moment, we are concerned only with determining the coupling pointwise. Later, when we consider the induced martingale problem on (M × N) \ {r = 0}, we will have to account for the behavior of the configuration as $(x, y) = (x_t, y_t)$ varies.

With the product metric on M × N, again using the fact that x and y can be viewed as elements of $\mathbb{R}^3$, and viewing the coordinate $z_i$ as a function on $\mathbb{R}^3$ that gives the ith coordinate, we have

\[ \begin{aligned} \nabla_{(M \times N)} z_1(x - y) &= \cos\theta\, \partial_\alpha - \cos\varphi \cos\psi\, \partial_a - \sin\psi\, \partial_b, \\ \nabla_{(M \times N)} z_2(x - y) &= \partial_\beta + \cos\varphi \sin\psi\, \partial_a - \cos\psi\, \partial_b, \quad \text{and} \\ \nabla_{(M \times N)} z_3(x - y) &= \sin\theta\, \partial_\alpha - \sin\varphi\, \partial_a . \end{aligned} \tag{3.1} \]

Now we wish to see what the above computations mean for the evolution of r under some (to be determined) coupling of Brownian motions on M and N. Our orthonormal basis $\{\partial_\alpha, \partial_\beta\}$ for $T_xM$ uniquely determines normal coordinates (α, β) in a neighborhood of x in M, and we similarly have normal coordinates (a, b) in a neighborhood of y in N. Further, (α, β, a, b) gives product normal coordinates in a neighborhood of (x, y) in M × N. Since the $z_{i,t} = z_i(x_t - y_t)$ are martingales, their instantaneous evolution is determined by the rate of change of their quadratic variations. The above expressions for the gradients mean that, at this instant with $(x_t, y_t) = (x, y)$, we have



\[ \begin{aligned} d\langle z_1 \rangle_t &= \left[ \cos^2\theta + \cos^2\varphi \cos^2\psi + \sin^2\psi \right] dt - 2\cos\theta \left[ \cos\varphi \cos\psi\, d\langle \alpha, a \rangle_t + \sin\psi\, d\langle \alpha, b \rangle_t \right], \\ d\langle z_2 \rangle_t &= \left[ 1 + \cos^2\varphi \sin^2\psi + \cos^2\psi \right] dt - 2 \left[ \cos\psi\, d\langle \beta, b \rangle_t - \cos\varphi \sin\psi\, d\langle \beta, a \rangle_t \right], \quad \text{and} \\ d\langle z_3 \rangle_t &= \left[ \sin^2\theta + \sin^2\varphi \right] dt - 2 \sin\theta \sin\varphi\, d\langle \alpha, a \rangle_t . \end{aligned} \]

Here we have used that the marginals are Brownian motions, so that the only unknown quantities are the four cross-variations above, which relate the Brownian motions on M and N. It is these four cross-variations which we think of as determining the coupling at (x, y).

The $z_i$ were chosen so that $z_1(x - y) = z_2(x - y) = 0$ and $z_3(x - y) = r$. Thus, the semi-martingale decomposition of $r = \sqrt{z_1^2 + z_2^2 + z_3^2}$, at this instant with $(x_t, y_t) = (x, y)$, is easy to compute using the above equations for the quadratic variations of the $z_i$. Itô’s formula implies that, at this instant with $(x_t, y_t) = (x, y)$, the bounded variation part is growing at rate $(d\langle z_1 \rangle_t + d\langle z_2 \rangle_t)/2r$ while the quadratic variation of the martingale part is growing at rate $d\langle z_3 \rangle_t$.

We introduce one more bit of notation. Let f and g be such that the martingale part has quadratic variation given by the integral of f along paths and the bounded variation part is given by the integral of $g/2r_t$ along paths, plus $r_0$. Then f and g are non-negative functions of the configuration (θ, ϕ, ψ) and the coupling, and they determine the evolution of $r_t$.

We return to discussing how the Brownian motions should be coupled at (x, y). We wish to consider couplings which make the cross-variations as large (in absolute value) as possible, since it is intuitively clear that such couplings give the most control over f and g. We are working within the framework of the martingale problem, but the type of couplings we are considering can perhaps be explained best from the point of view of stochastic differential equations. From this perspective, we are saying that the Brownian motion on N should be driven by the Brownian motion on M (at least at the instant under consideration), or vice versa, since the roles of the two surfaces are symmetric. At any rate, such couplings are, at a point, parametrized by isometries between $T_xM$ and $T_yN$. In terms of our orthonormal bases, all such maps can be written in the form

\[ \partial_a = A \cos\sigma\, \partial_\alpha + A \sin\sigma\, \partial_\beta, \quad \partial_b = -\sin\sigma\, \partial_\alpha + \cos\sigma\, \partial_\beta, \]

where σ ∈ [0, 2π) and A is ±1. Note that we are simply relating the tangent planes by an element of O(2).
Our choice of orthonormal bases for the tangent planes determines coordinates for O(2) as above, where A determines whether we are in the orientation-preserving or orientation-reversing component and σ then parametrizes the relevant component, which is diffeomorphic to $S^1$ (it is a slight extension of the term “coordinates” to call σ and A coordinates, but the meaning is clear and no harm is done). In terms of σ and A, the four cross-variation terms relating the Brownian motions on M and N, at the instant when $(x_t, y_t) = (x, y)$, are given by

\[ d\langle \alpha, a \rangle_t = A \cos\sigma\, dt, \quad d\langle \alpha, b \rangle_t = -\sin\sigma\, dt, \quad d\langle \beta, a \rangle_t = A \sin\sigma\, dt, \quad \text{and} \quad d\langle \beta, b \rangle_t = \cos\sigma\, dt. \tag{3.2} \]



Then, at the instant when $(x_t, y_t) = (x, y)$, the instantaneous evolution of $r_t$ is determined by the equations

\[ f = \sin^2\theta + \sin^2\varphi - 2A \sin\theta \sin\varphi \cos\sigma \]

and

\[ g = 2 + \cos^2\theta + \cos^2\varphi - 2\cos\sigma \left[ A \cos\theta \cos\varphi \cos\psi + \cos\psi \right] + 2 \sin\sigma \left[ A \cos\varphi \sin\psi + \cos\theta \sin\psi \right]. \]

We are now in a position to indicate what the criterion is for our coupling to be “optimal.” We wish to consider the $r_t$ process on a new time-scale, namely the time-scale in which its quadratic variation grows at rate 1. In this time-scale, the martingale part is just a Brownian motion, and the drift coefficient is given (at the instant for which $(x_t, y_t) = (x, y)$) by $(g/f)/(2r)$. Thus, the time-changed process will be dominated by a two-dimensional Bessel process if g ≤ f, and the domination will be strict whenever this inequality is strict. Note that, unlike g/f, this inequality makes sense even if f = 0. That the two-dimensional Bessel process is the critical case will be seen in the following analysis (we mention it here for future reference). That the two-dimensional Bessel process is critical is obviously significant in our effort to make r hit zero as efficiently as possible, in light of the well-known relationship between the dimension of a Bessel process and its long-time behavior. Motivated by this, we will choose our coupling so as to maximize

\[ f - g = -2\cos^2\theta - 2\cos^2\varphi + 2\cos\sigma \left[ A \cos\theta \cos\varphi \cos\psi + \cos\psi - A \sin\theta \sin\varphi \right] - 2 \sin\sigma \left[ A \cos\varphi \sin\psi + \cos\theta \sin\psi \right]. \tag{3.3} \]
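The expressions for f and g can be cross-checked against the gradients in (3.1) and the cross-variations in (3.2). The following is a minimal sketch, not taken from the source: it assumes the coordinate ordering (α, β, a, b), stores the covariance structure as a 4 × 4 matrix, and computes the quadratic-variation rate of each $z_i$ as the quadratic form of its gradient. The function names are hypothetical.

```python
import math

def covariance(A, sigma):
    """Covariance structure in the coordinates (alpha, beta, a, b), from (3.2)."""
    c, s = math.cos(sigma), math.sin(sigma)
    return [
        [1.0, 0.0, A * c, -s],
        [0.0, 1.0, A * s, c],
        [A * c, A * s, 1.0, 0.0],
        [-s, c, 0.0, 1.0],
    ]

def qv_rate(v, a):
    """Growth rate of the quadratic variation of a function with gradient v."""
    return sum(v[i] * a[i][j] * v[j] for i in range(4) for j in range(4))

def f_g_from_gradients(theta, phi, psi, A, sigma):
    """f = rate for z_3, g = sum of the rates for z_1 and z_2, via (3.1)."""
    a = covariance(A, sigma)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    cs, ss = math.cos(psi), math.sin(psi)
    grad_z1 = [ct, 0.0, -cp * cs, -ss]
    grad_z2 = [0.0, 1.0, cp * ss, -cs]
    grad_z3 = [st, 0.0, -sp, 0.0]
    return qv_rate(grad_z3, a), qv_rate(grad_z1, a) + qv_rate(grad_z2, a)

def f_g_closed_form(theta, phi, psi, A, sigma):
    """The displayed formulas for f and g."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    cs, ss = math.cos(psi), math.sin(psi)
    f = st ** 2 + sp ** 2 - 2 * A * st * sp * math.cos(sigma)
    g = (2 + ct ** 2 + cp ** 2
         - 2 * math.cos(sigma) * (A * ct * cp * cs + cs)
         + 2 * math.sin(sigma) * (A * cp * ss + ct * ss))
    return f, g

# Compare the two computations on a handful of arbitrary configurations.
max_gap = 0.0
for A in (1, -1):
    for k in range(12):
        args = (0.1 + 0.12 * k, 1.5 - 0.11 * k, 0.25 * k, A, 0.5 * k)
        f1, g1 = f_g_from_gradients(*args)
        f2, g2 = f_g_closed_form(*args)
        max_gap = max(max_gap, abs(f1 - f2), abs(g1 - g2))
print(max_gap)
```

The printed gap should be at the level of floating-point rounding, which is consistent with the displayed f and g being the quadratic forms of the gradients in (3.1) under the coupling (3.2).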

If we assume for a moment that A is fixed, then it is clear how to choose σ in order to maximize this expression. In particular, we can view the two bracketed expressions as being the two components of a vector in the plane, in which case the two terms depending on σ become the dot product of this vector with a unit vector making angle σ with the positive horizontal axis. Thus we should choose σ so that these vectors are parallel. Let $\sigma_+$ be this optimal choice when A = 1 and $\sigma_-$ be this optimal choice when A = −1. Writing explicit formulas for $\sigma_+$ and $\sigma_-$ is not important (although it could easily be done in terms of inverse tangents); instead we can write directly that the maximum of f − g arising from the optimal choice of $\sigma_+$ or $\sigma_-$ is

\[ -2\cos^2\theta - 2\cos^2\varphi + 2 \Big[ \cos^2\psi \left( 1 + \cos^2\theta \cos^2\varphi \right) + \sin^2\psi \left( \cos^2\theta + \cos^2\varphi \right) + \sin^2\theta \sin^2\varphi - 2 \cos\theta \cos\varphi \cos\psi \sin\theta \sin\varphi + 2A \left( \cos\theta \cos\varphi - \cos\psi \sin\theta \sin\varphi \right) \Big]^{1/2}, \]

where we have used the fact that $A^2 = 1$. Further, it is now clear how to choose A; we want A to be equal to the sign of the expression it multiplies. Doing so shows that the maximum of f − g, that is, the value realized by the optimal coupling, is

\[ -2\cos^2\theta - 2\cos^2\varphi + 2 \Big[ \cos^2\psi \left( 1 + \cos^2\theta \cos^2\varphi \right) + \sin^2\psi \left( \cos^2\theta + \cos^2\varphi \right) + \sin^2\theta \sin^2\varphi - 2 \cos\theta \cos\varphi \cos\psi \sin\theta \sin\varphi + 2 \left| \cos\theta \cos\varphi - \cos\psi \sin\theta \sin\varphi \right| \Big]^{1/2}. \]
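The maximization over σ described above can be checked numerically. The sketch below is illustrative only and uses hypothetical function names: for fixed A it takes σ so that (cos σ, sin σ) is parallel to the vector formed by the two bracketed expressions in (3.3) (with the sign of the second component reversed, because of the minus sign in front of sin σ), and compares the resulting value of f − g with the closed-form expression and with a brute-force search over a grid of σ values.

```python
import math

def brackets(theta, phi, psi, A):
    """The two bracketed expressions in (3.3)."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    cs, ss = math.cos(psi), math.sin(psi)
    x = A * ct * cp * cs + cs - A * st * sp
    y = A * cp * ss + ct * ss
    return x, y

def f_minus_g(theta, phi, psi, A, sigma):
    """f - g as in (3.3)."""
    x, y = brackets(theta, phi, psi, A)
    base = -2 * math.cos(theta) ** 2 - 2 * math.cos(phi) ** 2
    return base + 2 * math.cos(sigma) * x - 2 * math.sin(sigma) * y

def optimal_sigma(theta, phi, psi, A):
    """Angle sigma for which (cos sigma, sin sigma) is parallel to (x, -y)."""
    x, y = brackets(theta, phi, psi, A)
    return math.atan2(-y, x)

def max_closed_form(theta, phi, psi, A):
    """Closed-form maximum of f - g over sigma, for fixed A."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    cs, ss = math.cos(psi), math.sin(psi)
    inner = (cs ** 2 * (1 + ct ** 2 * cp ** 2) + ss ** 2 * (ct ** 2 + cp ** 2)
             + st ** 2 * sp ** 2 - 2 * ct * cp * cs * st * sp
             + 2 * A * (ct * cp - cs * st * sp))
    return -2 * ct ** 2 - 2 * cp ** 2 + 2 * math.sqrt(max(inner, 0.0))

worst_exact = 0.0   # gap between closed form and value at optimal_sigma
worst_grid = 0.0    # gap between closed form and grid search
n = 2000
for A in (1, -1):
    for theta, phi, psi in [(0.4, 0.9, 1.0), (1.2, 0.2, 2.5), (0.7, 0.7, 0.0)]:
        closed = max_closed_form(theta, phi, psi, A)
        at_opt = f_minus_g(theta, phi, psi, A, optimal_sigma(theta, phi, psi, A))
        grid = max(f_minus_g(theta, phi, psi, A, 2 * math.pi * k / n) for k in range(n))
        worst_exact = max(worst_exact, abs(closed - at_opt))
        worst_grid = max(worst_grid, abs(closed - grid))
print(worst_exact, worst_grid)
```

The first gap should be at rounding level; the second is limited by the grid spacing in σ.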



Let $\Sigma_+$ be the region (in (M × N) \ {r = 0}) where cos θ cos ϕ − cos ψ sin θ sin ϕ is non-negative, and let $\Sigma_-$ be the region where this expression is non-positive. Then $\Sigma_+$ is the set where the maximum of f − g is realized by a coupling coming from the orientation-preserving component of O(2), and $\Sigma_-$ is the analogous set for the orientation-reversing component of O(2). Here the notions of orientation-preserving and orientation-reversing are defined relative to the standard choice of orthonormal bases for $T_xM$ and $T_yN$ given above (which we recall are not continuous when the configuration is at the boundary of its range). Further, let $\Sigma_0$ be the set where cos θ cos ϕ − cos ψ sin θ sin ϕ is zero; this is the set on which there are two possible choices for an optimal coupling, one from each component of O(2). We will return to questions of how the choice of optimal coupling varies over (M × N) \ {r = 0} below. In particular, we will see in Lemma 4.1 that $\Sigma_0$ is defined invariantly, that is, it does not depend on the particular identification of maps from $T_xM$ to $T_yN$ with O(2).

Next, we wish to show that the optimal coupling is good enough for our purposes. In particular, we wish to show that the above expression for the maximum of f − g is non-negative for any values of the parameters, which will show that the time-changed $r_t$ process is dominated by a two-dimensional Bessel process (assuming that a coupling satisfying this pointwise specification exists, a question which we continue to postpone). We also wish to determine the values of the parameters for which the expression is zero, since those will be the configurations where the process looks instantaneously like a two-dimensional Bessel process, rather than being strictly dominated by one. Observe that the non-negativity of f − g is equivalent to the inequality

\[ \cos^2\psi \left( 1 + \cos^2\theta \cos^2\varphi \right) + \sin^2\psi \left( \cos^2\theta + \cos^2\varphi \right) + \sin^2\theta \sin^2\varphi - 2 \cos\theta \cos\varphi \cos\psi \sin\theta \sin\varphi + 2 \left| \cos\theta \cos\varphi - \cos\psi \sin\theta \sin\varphi \right| \ge \left( \cos^2\theta + \cos^2\varphi \right)^2, \tag{3.4} \]

and positivity is equivalent to strict inequality.

We begin by considering the case when ψ = 0 (this occurs when x − y and the normal vectors to $T_xM$ and $T_yN$ are coplanar, as vectors in $\mathbb{R}^3$). Then the above simplifies to

\[ 1 + \cos^2\theta \cos^2\varphi + \sin^2\theta \sin^2\varphi - 2 \cos\theta \cos\varphi \sin\theta \sin\varphi + 2 \left| \cos(\theta + \varphi) \right| \ge \left( \cos^2\theta + \cos^2\varphi \right)^2 . \]

After simplifying the left-hand side and taking the square root of both sides, we see that this is equivalent to

\[ \left| \cos(\theta + \varphi) \right| + 1 \ge \cos^2\theta + \cos^2\varphi . \]

One then can show that this inequality holds, and that one has equality exactly when θ = ϕ ≤ π/4 or θ + ϕ = π/2.

Recall that whenever either θ or ϕ equals zero, we can assume that ψ = 0. Thus we now consider the case when all three angles are positive. First we assume that θ + ϕ ≥ π/2. We can rewrite inequality (3.4) as

\[ \cos^2\theta \sin^2\theta + \cos^2\varphi \sin^2\varphi - 2\cos^2\theta \cos^2\varphi + \left( \cos^2\psi + 1 \right) \sin^2\theta \sin^2\varphi - 2 \cos\psi \cos\theta \cos\varphi \sin\theta \sin\varphi + 2 \left| \cos\theta \cos\varphi - \cos\psi \sin\theta \sin\varphi \right| \ge 0 . \]



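Now θ + ϕ ≥ π/2 implies that

\[ \sin\theta \sin\varphi \ge \cos\theta \cos\varphi \quad \text{and} \quad \cos^2\theta + \cos^2\varphi \le 1 . \]

It follows that the sum of the first three terms is non-negative. Along with the fact that

\[ \cos^2\psi + 1 > 2\cos\psi \quad \text{for } \psi \in (0, \pi], \]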

it follows that the sum of the fourth and fifth terms is strictly positive. Since the last term is non-negative (it is an absolute value), we conclude that the inequality holds strictly.

We continue to assume that all three angles are positive, and now we consider the case θ + ϕ < π/2. We can rewrite inequality (3.4) as

\[ 2 \cos\theta \cos\varphi \left[ 1 - \left( \cos\theta \cos\varphi + \sin\theta \sin\varphi \cos\psi \right) \right] + \cos^2\theta \sin^2\theta + \cos^2\varphi \sin^2\varphi + \left( \cos^2\psi + 1 \right) \sin^2\theta \sin^2\varphi - 2 \sin\theta \sin\varphi \cos\psi \ge 0 . \]

We know that this holds for ψ = 0. We compute the derivative of the left-hand side with respect to ψ as

\[ \frac{\partial}{\partial\psi} [\text{left-hand side}] = 2 \sin\psi \sin\theta \sin\varphi \left( 1 + \cos\theta \cos\varphi - \cos\psi \sin\theta \sin\varphi \right). \]

This is positive as soon as ψ is positive, and we conclude that the desired inequality holds strictly when all three angles are positive.

This shows that, under the optimal choice of A and σ, f − g is always non-negative. Further, f − g is positive except in the following configurations. (That these configurations are the only ones with f − g = 0 is something we computed above. The additional details we are about to provide in these cases follow from elementary trigonometry and the above equations.) When ψ = 0 and θ = ϕ < π/4, we have f = g = 0. When ψ = 0 and θ + ϕ = π/2, there are two possible couplings giving the maximum value of f − g, one orientation-preserving and one orientation-reversing. For both choices of optimal coupling, f and g are equal and positive (except when θ = ϕ = π/4), although their shared value is larger under the orientation-reversing coupling. Finally, the case ψ = 0 and θ = ϕ = π/4 lies at the border between the above configurations. Here, there are again two possibilities for the optimal coupling. The orientation-preserving choice gives f = g = 0, while the orientation-reversing coupling gives f = g = 2.

Recall that our goal is to produce a coupling for which the particles are encouraged to couple in finite time.
Toward this end, it is not so much the specific coupling that matters to us as its qualitative features, especially the domination by a time-changed two-dimensional Bessel process. In particular, the distance between the particles, which we called r, is a semi-martingale under any coupling, and this semi-martingale is characterized by the corresponding versions of f and g. We call any coupling with the feature that its corresponding versions of f and g satisfy all of the properties of f and g listed in the previous paragraph (that is, the domination by a time-changed two-dimensional Bessel process is strict except for the configurations given, for which strictness fails in the ways described) an adequate coupling. As this terminology suggests, the coupling we have already described is the optimal one from our point of view, but any adequate coupling is still good enough for our exploration of minimal surfaces. We will see why this matters in the next section.
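The non-negativity of the optimal value of f − g, and the two families of equality cases at ψ = 0, can be probed numerically. The following is a rough sketch under stated assumptions: the function name is hypothetical, the grid resolution is arbitrary, and a grid check of this kind is only a plausibility test, not a substitute for the argument in the text.

```python
import math

def max_f_minus_g(theta, phi, psi):
    """Closed-form maximum of f - g over sigma and A (the value at the optimal coupling)."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    cs, ss = math.cos(psi), math.sin(psi)
    inner = (cs ** 2 * (1 + ct ** 2 * cp ** 2) + ss ** 2 * (ct ** 2 + cp ** 2)
             + st ** 2 * sp ** 2 - 2 * ct * cp * cs * st * sp
             + 2 * abs(ct * cp - cs * st * sp))
    return -2 * ct ** 2 - 2 * cp ** 2 + 2 * math.sqrt(max(inner, 0.0))

# Smallest value over a grid of configurations with
# theta, phi in [0, pi/2] and psi in [0, pi].
n = 40
grid_min = min(
    max_f_minus_g(i * math.pi / (2 * n), j * math.pi / (2 * n), k * math.pi / n)
    for i in range(n + 1) for j in range(n + 1) for k in range(n + 1)
)

equal_angles = max_f_minus_g(0.3, 0.3, 0.0)                 # psi = 0, theta = phi < pi/4
complementary = max_f_minus_g(0.5, math.pi / 2 - 0.5, 0.0)  # psi = 0, theta + phi = pi/2
generic = max_f_minus_g(0.4, 0.9, 1.0)                      # all three angles positive
print(grid_min, equal_angles, complementary, generic)
```

On this grid the minimum should be zero up to rounding, the two ψ = 0 test configurations should give zero, and the generic configuration should give a strictly positive value.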



3.2. The martingale problem

The computations in the previous section describe what the optimal coupling should be at a single point (x, y) ∈ (M × N) \ {r = 0}; more specifically, they specify the desired covariance structure at each point. The next task is to show that this gives a global description of the coupling. (Dealing with existence is a separate question. Here we just want to see that this pointwise choice of σ and A leads to a well defined operator such that the corresponding martingale problem gives the coupling we are looking for.)

As we have already indicated in the previous section, we think of a coupled Brownian motion as a process on the product space such that the M and N marginals are Brownian motions. More specifically, we can formulate this in the language of the martingale problem, and we begin by briefly recalling what the martingale problem is (for a more detailed explanation, consult the standard text [15]). Let $\tilde{L}$ be a second-order operator on M × N. For simplicity, we consider the case when the solution is not allowed to explode (which will be the case for coupled Brownian motion on stochastically complete manifolds); extending this to allow for explosion is easily accomplished by stopping the process at the explosion time, as we did for Brownian motion at the beginning of Section 2.1. Then using the notation from the beginning of Section 2.1, a solution to the martingale problem corresponding to $\tilde{L}$, starting at $(x_0, y_0)$, is a probability measure $P_{(x_0, y_0)}$ on (C[0, ∞), B) such that

\[ P_{(x_0, y_0)}\big[ \omega(0) = (x_0, y_0) \big] = 1 \quad \text{and} \quad \left( h(\omega(t)) - \int_0^t \tilde{L}h(\omega(s))\, ds;\ \mathcal{B}_t \right) \text{ is a } P_{(x_0, y_0)}\text{-martingale,} \]

for any smooth, compactly supported function h on M × N. Further, we note that the martingale problem is compatible with stopping times. For example, if we let ζ be the first hitting time of {r = 0}, then we can consider solutions to the martingale problem up until ζ by stopping the process at ζ. In this case, the operator $\tilde{L}$ need not be defined at {r = 0}, and the class of test functions h can be restricted to those having compact support on (M × N) \ {r = 0}.

Next, we review some basic facts about martingale problems that we will use throughout what follows. There are two basic situations for which a solution to the martingale problem is known to exist. If $\tilde{L}$ is $C^2$, then there is a unique solution to the corresponding martingale problem, at least until possible explosion (see [15, Chapter 5]). If $\tilde{L}$ is locally bounded and uniformly elliptic, then there is a strong Markov family of solutions to the corresponding martingale problem, at least until possible explosion (see [15, Exercises 7.3.2 and 12.4.3]). Further, both of these results can be localized (see [15, Chapter 10]). Indeed, we have already taken advantage of this fact in transferring the above results to the manifold setting. Localization also means that a global solution can be constructed by patching together local solutions.

As indicated, specifying a martingale problem corresponds to specifying a second-order operator, the diffusion operator. At any point (x, y) ∈ (M × N) \ {r = 0}, we have product normal coordinates (α, β, a, b) as described in the previous section. The operator at (x, y) in the (α, β, a, b) coordinates is given by a symmetric, non-negative definite matrix $[a_{i,j}]|_{(x,y)}$ such that the operator applied to a function f is given by $\frac{1}{2}\sum_{i,j} a_{i,j}\,\partial_i\partial_j f$. Let

\[ O = \begin{bmatrix} A\cos\sigma & A\sin\sigma \\ -\sin\sigma & \cos\sigma \end{bmatrix}, \]

let $O^T$ be its transpose, and let I be the 2 × 2 identity matrix. Then we have

\[ [a_{i,j}]|_{(x,y)} = \begin{bmatrix} 1 & 0 & A\cos\sigma & -\sin\sigma \\ 0 & 1 & A\sin\sigma & \cos\sigma \\ A\cos\sigma & A\sin\sigma & 1 & 0 \\ -\sin\sigma & \cos\sigma & 0 & 1 \end{bmatrix} = \begin{bmatrix} I & O^T \\ O & I \end{bmatrix} = \begin{bmatrix} I \\ O \end{bmatrix} \times \begin{bmatrix} I & O^T \end{bmatrix}. \tag{3.5} \]
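The structural claims about the matrix in Eq. (3.5) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument; the function names are hypothetical, and the angle 0.7 is an arbitrary test value. It builds the matrix from the block column [I; O], so non-negativity is automatic, and it exhibits two independent null vectors, which together with the identity block in the upper-left corner gives rank exactly two.

```python
import math

def coupling_matrix(A, sigma):
    """Matrix of Eq. (3.5) in (alpha, beta, a, b) coordinates, built as B B^T with B = [I; O]."""
    c, s = math.cos(sigma), math.sin(sigma)
    O = [[A * c, A * s], [-s, c]]
    B = [[1.0, 0.0], [0.0, 1.0], O[0], O[1]]  # 4x2 block column [I; O]
    return [[sum(B[i][k] * B[j][k] for k in range(2)) for j in range(4)]
            for i in range(4)]

def null_vectors(A, sigma):
    """Two independent vectors v with v_(alpha,beta) + O^T v_(a,b) = 0."""
    c, s = math.cos(sigma), math.sin(sigma)
    return [[-A * c, -A * s, 1.0, 0.0], [s, -c, 0.0, 1.0]]

def apply(M, v):
    """Matrix-vector product for a 4x4 list-of-lists matrix."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def quadratic_form(M, v):
    """v^T M v; non-negative for every v because M = B B^T."""
    return sum(v[i] * w for i, w in enumerate(apply(M, v)))

for A in (1, -1):
    M = coupling_matrix(A, 0.7)
    kernel_ok = all(abs(x) < 1e-12 for v in null_vectors(A, 0.7) for x in apply(M, v))
    print(A, kernel_ok)
```

Both orientation classes (A = 1 and A = −1) are treated by the same code, since O is orthogonal in either case.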

It is easy to see that $[a_{i,j}]|_{(x,y)}$ is symmetric and non-negative definite, and also that it has rank two. Carrying out this procedure at every point shows that any pointwise choice of A and σ does in fact give rise to a well-defined operator. Let L+ be the operator determined by the optimal choice of coupling from the orientation-preserving component of O(2), that is, the operator given by (A, σ) = (1, σ+), and let L− be the analogous operator arising from the orientation-reversing component of O(2), given by (A, σ) = (−1, σ−). Then define L to be equal to L+ on the interior of Σ+ and L− on the interior of Σ−. This determines L uniquely except on Σ0, where there are two possibilities. We will have more to say about Σ0 later, but for now we will leave open whether we choose the orientation-preserving or orientation-reversing possibility.

Now suppose we have a solution to the martingale problem corresponding to L. We want to compute the semi-martingale decomposition of any function h on M × N \ {r = 0} composed with the solution process. The bounded variation process is just the integral of Lh along paths, while the quadratic variation of the martingale part is given by the integral of $\nabla h \cdot [a_{i,j}] \cdot \nabla h$ along paths. Alternatively, the growth of quadratic variation of the martingale part at each point (or more generally, the joint variation coming from two different functions) is determined by a bilinear form Γ(·,·). In the same normal coordinates around (x, y), we can write a vector $v \in T_{(x,y)}(M \times N)$ as $v = v_\alpha\partial_\alpha + v_\beta\partial_\beta + v_a\partial_a + v_b\partial_b$. Then Γ(v, v) is determined by its expression in these coordinates. This is most easily done by using the inner product structure on the tangent space. Also, we divide the formula for Γ into two cases, the case where (A, σ) = (1, σ+), corresponding to the diffusion operator at our point being L+, and the case where (A, σ) = (−1, σ−), corresponding to L−. We denote the corresponding bilinear forms by Γ+ and Γ−.
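The expressions for the two bilinear forms can be read off from the factorization in Eq. (3.5). As a short worked step (the block notation $v_{\alpha\beta} = (v_\alpha, v_\beta)^T$ and $v_{ab} = (v_a, v_b)^T$ is introduced here only for this computation, and Γ is taken to be the quadratic form of the matrix $[a_{i,j}]$):

```latex
\begin{align*}
\Gamma(v,v) &= v^T [a_{i,j}]\, v
  = \bigl| \begin{bmatrix} I & O^T \end{bmatrix} v \bigr|^2
  = \bigl| v_{\alpha\beta} + O^T v_{ab} \bigr|^2 \\
 &= (v_\alpha + A\cos\sigma\, v_a - \sin\sigma\, v_b)^2
  + (v_\beta + A\sin\sigma\, v_a + \cos\sigma\, v_b)^2 .
\end{align*}
```

Each squared term is the inner product of v with a fixed vector, and substituting (A, σ) = (1, σ+) or (−1, σ−) gives the two cases.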
Then we have

\[ \Gamma_+(v,v) = \langle v, \partial_\alpha + \cos\sigma_+\,\partial_a - \sin\sigma_+\,\partial_b \rangle^2 + \langle v, \partial_\beta + \sin\sigma_+\,\partial_a + \cos\sigma_+\,\partial_b \rangle^2, \]
\[ \Gamma_-(v,v) = \langle v, \partial_\alpha - \cos\sigma_-\,\partial_a - \sin\sigma_-\,\partial_b \rangle^2 + \langle v, \partial_\beta - \sin\sigma_-\,\partial_a + \cos\sigma_-\,\partial_b \rangle^2. \]

Note that Γ at a point depends only on first-order information at that point.

We now verify that a solution to the martingale problem corresponding to L gives the coupling that it is supposed to. First of all, any solution has marginals which are Brownian motions. This follows from projecting $[a_{i,j}]|_{(x,y)}$ down to M (which means taking the upper-left 2 × 2 submatrix), which shows that the marginal diffusion operator is expressed, in normal coordinates, as the 2 × 2 identity matrix. Thus, the marginal diffusion operator is one-half the Laplacian and the



marginal process is Brownian motion. The same holds for the marginal process on N. Second, we can check that any solution induces a process $r_t = r(x_t, y_t)$ with the desired semi-martingale decomposition. Again, choose (α, β, a, b) normal coordinates around a point (x, y). Since r does not depend on a particular choice of Euclidean coordinates, we can choose z1, z2 and z3 as at the beginning of the previous section. The coordinate processes will be martingales, and we see that $[a_{i,j}]|_{(x,y)}$ is chosen so that their covariance structure is as desired. Once we know that the coordinate processes have the desired semi-martingale decompositions, the decomposition for $r_t$ follows.

Ideally, we would produce a solution to the martingale problem for L, and use that in our study of minimal surfaces. Unfortunately, L is everywhere degenerate and discontinuous at Σ0. Thus, the existence of a solution is not guaranteed by standard theorems. More to the point, we do not have sufficient control of the configuration angles near Σ0 to prove existence for arbitrary minimal surfaces. To get around this difficulty, we will modify our operator. The idea is that anywhere the coupling is strictly dominated by a time-changed two-dimensional Bessel process we have room to adjust it while remaining dominated by such a Bessel process. In particular, this will allow us to produce an operator that is easier to deal with but such that any solution to the corresponding martingale problem will still be adequate (although no longer optimal, in the sense that a solution corresponding to L would be).

To make this precise, let Σe ⊂ Σ0 be the set of points where ψ = 0 (recall our convention that anywhere we can choose ψ to be zero, we do) and θ + ϕ = π/2. Thus Σe is precisely the set of points in Σ0 where the coupling looks instantaneously like a time-changed two-dimensional Bessel process, and where we have no room to perturb the operator while keeping the coupling adequate.
It follows that, for any (x, y) ∈ Σ0 \ Σe, the value of f − g (recall that f is the time-derivative of the quadratic variation and g/2r is the time-derivative of the bounded variation part of $r_t$ under any solution to the martingale problem) achieved by the optimal coupling is positive. Further, this value depends continuously on the cross-variations (which determine how the Brownian motions are coupled at a point), so a sufficiently small change in the operator will decrease f − g (which we now view as depending on the operator as well) but maintain its positivity. We cannot make the operator continuous with an arbitrarily small perturbation, but we can make it elliptic. In particular, we can replace the cross-variations prescribed by Eq. (3.2) (where we choose σ and A to correspond to the optimal coupling) with the cross-variations prescribed by

\[ d\langle \alpha_t, a_t \rangle = (1 - \hat{\epsilon})A\cos\sigma\,dt, \qquad d\langle \alpha_t, b_t \rangle = (1 - \hat{\epsilon})(-\sin\sigma)\,dt, \]
\[ d\langle \beta_t, a_t \rangle = (1 - \hat{\epsilon})A\sin\sigma\,dt, \qquad\text{and}\qquad d\langle \beta_t, b_t \rangle = (1 - \hat{\epsilon})\cos\sigma\,dt. \]
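The effect of damping the cross-variations can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a generic scale factor c ∈ [0, 1] on the off-diagonal blocks (a hypothetical parameter standing in for whatever function of the perturbation parameter multiplies the cross-variations), and the function names are made up for this example. Because O is orthogonal, the determinant of the resulting 4 × 4 matrix is (1 − c²)², so the matrix is non-degenerate exactly when c < 1, while the diagonal blocks (the marginals) stay equal to the identity.

```python
import math

def scaled_coupling_matrix(A, sigma, c):
    """4x4 matrix [[I, c*O^T], [c*O, I]]; c is an assumed generic damping factor."""
    co, si = math.cos(sigma), math.sin(sigma)
    O = [[A * co, A * si], [-si, co]]
    M = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for i in range(2):
        for j in range(2):
            M[2 + i][j] = c * O[i][j]  # lower-left block  c * O
            M[j][2 + i] = c * O[i][j]  # upper-right block c * O^T
    return M

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n = len(M)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n):
                M[r][k] -= factor * M[col][k]
    return det

for c in (1.0, 0.9, 0.5, 0.0):
    print(c, determinant(scaled_coupling_matrix(1, 0.7, c)))
```

The case c = 1 recovers the degenerate rank-two matrix of Eq. (3.5), and c = 0 corresponds to independent Brownian motions.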

In terms of the corresponding operator characterizing the martingale problem, in (α, β, a, b) normal coordinates at a point, it is given by the following matrix:

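\[ \begin{bmatrix} I & \sqrt{1 - \hat{\epsilon}^2}\,O^T \\ \sqrt{1 - \hat{\epsilon}^2}\,O & I \end{bmatrix} = \begin{bmatrix} I & 0 \\ \sqrt{1 - \hat{\epsilon}^2}\,O & \hat{\epsilon} I \end{bmatrix} \times \begin{bmatrix} I & \sqrt{1 - \hat{\epsilon}^2}\,O^T \\ 0 & \hat{\epsilon} I \end{bmatrix}, \]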

where I and O are as in Eq. (3.5). Heuristically, from the stochastic differential equation point of view, this means that Brownian motion on N is only partially being driven by Brownian motion on M, and “the rest of” the Brownian motion on N is being driven by an independent source, where the ratio between these two driving sources is governed by $\hat{\epsilon}$. In particular, $\hat{\epsilon} = 0$ gives us the same coupling as before, while $\hat{\epsilon} = 1$ means that the Brownian motions on M and N are independent. For any $\hat{\epsilon} \in [0, 1]$, the resulting marginals are still Brownian motions on M and N. Also, the value of f − g achieved by the coupling is smooth in $\hat{\epsilon}$. Finally, note that if $\hat{\epsilon} > 0$, then the corresponding operator will be elliptic.

It follows that we can choose $\hat{\epsilon} \in [0, 1/2)$ to depend smoothly on (x, y) ∈ M × N \ {r = 0} in such a way that $\hat{\epsilon} = \hat{\epsilon}(x, y)$ is zero outside of some neighborhood of Σ0, zero on Σe, positive on Σ0 \ Σe, and so that the corresponding coupling is adequate (which also requires $\hat{\epsilon}$ to be zero on the set of points where ψ = 0 and θ = ϕ < π/4). Call the resulting operator L. Of course, this perturbed operator depends on our precise choice of a function $\hat{\epsilon}$. We will simply assume that we have chosen some $\hat{\epsilon}$ satisfying the criteria just mentioned, and any will do. It follows from our construction that, for any point (x, y) ∈ Σ0 \ Σe, the perturbed operator is uniformly elliptic in some neighborhood of (x, y), and also that it has the same smoothness properties as the optimal operator L everywhere. The advantage of using the perturbed operator instead of the optimal one is that now we really only need to worry about the behavior of the process near the smaller set of discontinuities Σe, which we will find much more manageable. It is the martingale problem for the perturbed operator for which we will show there exists a solution. As mentioned, this will be sufficient for our purposes, since the resulting coupling will be adequate. Finally, we comment that our study of the optimal operator L remains relevant, since we will primarily approach the perturbed operator as a perturbed version of L. This is especially true at Σe, since the two operators are equal there and thus we will continue to use our expressions for Γ.

4. Existence of an adequate coupling

4.1. Preliminary results

The fact that we have defined the optimal operator L, and hence its perturbed version, in a different system of coordinates at each point obscures the question of how smooth the operator is.
This is not too hard to get around near points where the configuration (θ, ϕ, ψ) is in the interior of its range, since one can see that all of the coordinates (both on $\mathbb{R}^3$ and on M × N) introduced in the previous section will be smooth near such a point. However, these canonical coordinate systems can be discontinuous when the configuration is at the boundary of its range. This means, for example, that the identification of O(2) with isometries between the tangent spaces can also change discontinuously. One could try to introduce other, better behaved coordinates near such points. However, this turns out not to be necessary. The following lemma clarifies the behavior of L.

Lemma 4.1. Using the above notation, Σ0 is locally the zero level-set of a smooth function, and L is smooth at any point not in Σ0.

Proof. As already mentioned, the perturbed operator is smooth at a point if and only if the optimal operator L is. Thus we will prove the lemma for the optimal operator L, since we have more explicit formulas for it. We begin by considering the expression of an operator in local, smooth coordinates. In particular, choose smooth product coordinates (x, y) on some product neighborhood $S = S_M \times S_N$ of a point $(x_0, y_0)$ ∈ (M × N) \ {r = 0}, and smooth orthonormal frames for both M and N on $S_M$ and $S_N$. As above, this choice of orthonormal frames on M and N determines an identification of the isometries between tangent planes and O(2). Suppose the operator under consideration is determined by a choice of element of O(2) at each point (x, y) ∈ S, in the fashion described above. Then if the map from S to O(2) is smooth, so is the corresponding diffusion operator.



This is just a consequence of the change of variables formula and the fact that all of the functions involved are smooth. Thus, in order to show that L is smooth at such a point $(x_0, y_0)$, it is enough to show that, with respect to coordinates and frames as above, the map from S into O(2) which determines L is smooth.

Suppose that, with respect to the induced identification of the isometries between tangent planes and O(2), L is determined by elements of the orientation-preserving component of O(2) on S. Let s ∈ [0, 2π) be the natural coordinate on the orientation-preserving component. Then Eq. (3.3) implies that

\[ (f - g)(x, y, s) = \alpha(x, y)\cos s + \beta(x, y)\sin s + \gamma(x, y) \]

for some functions α, β, and γ. Here f and g are the functions determining the semi-martingale decomposition of $r_t$, under the coupling corresponding to any choice of s (assuming such a coupling exists). In particular, this equation gives f − g for any choice of s; the value of f − g for L is obtained by letting s be the “optimal choice,” as determined above, at each (x, y). Note that the images of x and y in $\mathbb{R}^3$ and the corresponding tangent planes vary smoothly, as do the projections of the tangent planes onto the direction of x − y (since r ≠ 0) and its orthogonal complement. Since f and g depend smoothly on these projections and on the coupling, it follows that f − g is a smooth function of (x, y, s). Further, Eq. (3.3) makes it clear that γ(x, y) is always non-positive, and zero only when (x, y) is such that θ = ϕ = 0 in the standard configuration. Since we showed in the previous section that the maximum of f − g is always non-negative, and since this maximum is equal to $\sqrt{\alpha^2 + \beta^2} + \gamma$ on S (recall that we are assuming that the maximum is achieved by a coupling in the orientation-preserving component on S), it follows that $\alpha^2 + \beta^2$ is positive except possibly where θ = ϕ = 0. Where θ = ϕ = 0, this can be checked by hand, and thus $\alpha^2 + \beta^2$ is positive everywhere in S.

Next, observe that because f − g is smooth in x and y for every fixed s ∈ [0, 2π), we can see that α, β, and γ are smooth functions of x and y. This implies that $\alpha^2 + \beta^2$ is positive and smooth on S, and thus, possibly by shrinking S, we can assume that $\alpha^2 + \beta^2$ is smooth and uniformly positive on S. We know that, for given x and y, the choice of s that maximizes f − g, which we will denote $s_+(x, y)$, is the angular component of the polar representation of $(\alpha(x, y), \beta(x, y)) \in \mathbb{R}^2$. Since the components of this vector are smooth and the length of the vector is uniformly positive on S, it follows that $s_+$ is smooth on S. Since $s_+$ determines L on S, it follows that L is smooth on S. Obviously, an analogous argument applies if we assume that L is realized by an element of the orientation-reversing component of O(2) on S.

Now suppose that we choose a point $(x_0, y_0)$ ∈ Σ0, and we again choose smooth coordinates and smooth orthonormal frames on a neighborhood S as above. At $(x_0, y_0)$, there are two possible choices of optimal couplings, one from each connected component of O(2). Let L+ be the operator on S obtained from the optimal choice of coupling in the orientation-preserving component, and let L− be the analogous operator for the orientation-reversing component (these need not be the same as L+ and L−, as these were defined for a particular choice of orthonormal frames). Since the value of f − g achieved by each of these two couplings is positive at $(x_0, y_0)$, the above argument implies that, after possibly shrinking S, we can assume that both L+ and L− are smooth on S. Further, the values of f − g realized by L+ and L− are also smooth on S. Since Σ0 is the set where these two values are equal, it follows that Σ0 is the zero level-set of a smooth function on S.
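The polar-angle description of the maximizing angle used in this argument is easy to check numerically. The following is a minimal sketch with made-up values of α, β, and γ (they are not taken from any particular configuration of surfaces), and the function names are hypothetical. It confirms that the angular component of (α, β) maximizes α cos s + β sin s + γ over s, with maximum value √(α² + β²) + γ.

```python
import math

def optimal_angle(alpha, beta):
    """Angular component in [0, 2*pi) of the polar representation of (alpha, beta)."""
    return math.atan2(beta, alpha) % (2 * math.pi)

def f_minus_g(alpha, beta, gamma, s):
    """Value of alpha*cos(s) + beta*sin(s) + gamma for a given angle s."""
    return alpha * math.cos(s) + beta * math.sin(s) + gamma

alpha, beta, gamma = 0.3, -0.4, -0.1   # illustrative values only
s_plus = optimal_angle(alpha, beta)
best = f_minus_g(alpha, beta, gamma, s_plus)
grid_max = max(f_minus_g(alpha, beta, gamma, 2 * math.pi * i / 1000) for i in range(1000))
print(best, math.hypot(alpha, beta) + gamma, grid_max)
```

The formula for the angle breaks down only where (α, β) = (0, 0), which is why the proof needs α² + β² to be uniformly positive on S.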



Because Σ0 is locally the zero level-set of a smooth function, it is a closed set. Thus, if we choose a point $(x_0, y_0)$ in the complement of Σ0, we can take our neighborhood S to be disjoint from Σ0. Then the optimal coupling belongs to the same connected component of O(2) throughout S (that the coupling can only “switch” components of O(2), with respect to smooth frames, at Σ0 follows from the smoothness of f − g and the fact that Σ0 is the only set where there are two possible choices for the optimal coupling), and the above argument shows that L is smooth on S. □

This proof also gives a description of the discontinuities of L and of its perturbed version. We know that L can only be discontinuous at Σ0. In a neighborhood of a point in Σ0, both L+ and L− (using the notation from the proof, and assuming that we have chosen some smooth orthonormal frames on M and N) are smooth. The discontinuity in L arises from the fact that we “switch” from having L given by L+ to having L given by L−. In particular, if we let h be a smooth function such that Σ0 is locally the zero level-set of h, and if the gradient of h is non-zero at $(x_0, y_0)$ ∈ Σ0, then Σ0 is locally a hypersurface, and L is given by L+ or L− depending on which side of the hypersurface we are on, at least locally. An analogous description applies to the perturbed operator, with L+ and L− replaced by their perturbed versions.

One consequence of this lemma is that if we start our coupled Brownian motion, corresponding to the perturbed operator, in the complement of Σ0, smoothness implies that we have a unique solution at least until the first time the process hits Σ0. Further, the ellipticity of the perturbed operator gives us what we need for existence on Σ0 \ Σe, which is the reason for introducing it. Near Σe, we need the following lemma. Recall that the signed distance to a smooth hypersurface is the smooth function, defined in some neighborhood of the hypersurface, the absolute value of which is the distance to the hypersurface.

Lemma 4.2. Let (x, y) be a point of Σe.
Then there is a neighborhood S of (x, y) such that one of the following holds: (1) Σ0 ∩ S is a smooth hypersurface, and if v0 is the gradient of the signed distance to Σ0 , then either Γ+ (v0 , v0 ) or Γ− (v0 , v0 ) is positive at (x, y). (2) Σe ∩ S is contained in some smooth hypersurface H , and if vH is the gradient of the signed distance to H , then both Γ+ (vH , vH ) and Γ− (vH , vH ) are positive at (x, y). Proof. We begin by introducing some notation. We let ∂α = ∂α + A cos σ ∂a − sin σ ∂b

and ∂β = ∂β + A sin σ ∂a + cos σ ∂b ,

so that $\Gamma_\pm(v, v) = \langle v, \partial_\alpha \rangle^2 + \langle v, \partial_\beta \rangle^2$ for the appropriate choice of A and σ. We need to look at the first-order derivatives of θ, ϕ, and ψ with respect to ∂α and ∂β when ψ = 0 and θ + ϕ = π/2. Fortunately, the derivatives of (x − y)/|x − y| are particularly simple in this case, as we see from Eq. (3.1) and the fact that both σ+ and σ− are zero when ϕ = 0. As for the normal vectors m and n, their derivatives are constrained only by the fact that the Gauss map is anti-conformal. Thus, we can describe the derivatives of m by $k_1 \ge 0$ and $s_1 \in [0, 2\pi)$ in the following way. We know that ∂α and ∂β form an orthonormal basis for $T_x M$ and thus also for $T_m S^2$. Then the general anti-conformal map between the two can be written as

\[ \partial_\alpha(m) = k_1(\cos s_1\,\partial_\alpha + \sin s_1\,\partial_\beta) \qquad\text{and}\qquad \partial_\beta(m) = k_1(\sin s_1\,\partial_\alpha - \cos s_1\,\partial_\beta). \]



An analogous description of the derivatives of n can be given in terms of $k_2 \ge 0$ and $s_2 \in [0, 2\pi)$ with respect to ∂a and ∂b. Because the roles of M and N are symmetric, we can assume that ϕ ≥ θ. Recall that Σ0 is the zero level-set of cos θ cos ϕ − cos ψ sin θ sin ϕ, at least when the configuration is in the interior of its range. Next, note that this continues to hold if we allow ψ to take negative values. Thus, if we do not insist on using only configurations in the canonical range mentioned above, all three angles are smooth functions near (x, y), if we assume that θ > 0, and Σ0 is the zero level-set of a smooth function. So we first consider the case where θ > 0. In light of the above discussion, a few simple computations show that under these conditions, that is, whenever ψ = 0, 0 < θ ≤ π/4, and ϕ = π/2 − θ, we have

\[ \partial_\alpha(\theta + \varphi) = \frac{2}{r}(\cos\theta - A\sin\theta) - k_1\cos s_1 - Ak_2\cos s_2, \qquad \partial_\beta(\theta + \varphi) = -k_1\sin s_1 - k_2\sin s_2, \]
\[ \partial_\alpha\psi = -\frac{k_1}{\theta}\sin s_1 + \frac{Ak_2}{\frac{\pi}{2} - \theta}\sin s_2, \qquad\text{and}\qquad \partial_\beta\psi = \frac{k_1}{\theta}\cos s_1 - \frac{k_2}{\frac{\pi}{2} - \theta}\cos s_2. \]

We begin by determining when the first possibility in the lemma holds. It is easy to see that v0 at (x, y) will be a non-zero multiple of the gradient of θ + ϕ. As long as ∂α(θ + ϕ) for at least one choice of A or ∂β(θ + ϕ) is not zero, either Γ+(v0, v0) or Γ−(v0, v0) will be positive at (x, y). We conclude that the first condition holds unless all three of the following equations are satisfied (at (x, y)):

\[ -k_1\sin s_1 = k_2\sin s_2, \qquad \frac{2}{r}\cos\theta = k_1\cos s_1, \qquad\text{and}\qquad \frac{2}{r}\sin\theta = -k_2\cos s_2. \]
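As a worked step, these three conditions can be recovered by requiring that the α-derivative of θ + ϕ vanish for both values of A and that the β-derivative vanish. The sketch below assumes the derivative formula $\partial_\alpha(\theta+\varphi) = \frac{2}{r}(\cos\theta - A\sin\theta) - k_1\cos s_1 - Ak_2\cos s_2$ for the case ψ = 0, ϕ = π/2 − θ:

```latex
\begin{align*}
A = 1:  &\quad \tfrac{2}{r}(\cos\theta - \sin\theta) - k_1\cos s_1 - k_2\cos s_2 = 0, \\
A = -1: &\quad \tfrac{2}{r}(\cos\theta + \sin\theta) - k_1\cos s_1 + k_2\cos s_2 = 0. \\
\text{Adding:} &\quad \tfrac{4}{r}\cos\theta - 2k_1\cos s_1 = 0
   \;\Longrightarrow\; \tfrac{2}{r}\cos\theta = k_1\cos s_1, \\
\text{Subtracting:} &\quad -\tfrac{4}{r}\sin\theta - 2k_2\cos s_2 = 0
   \;\Longrightarrow\; \tfrac{2}{r}\sin\theta = -k_2\cos s_2 .
\end{align*}
```

The remaining condition, $-k_1\sin s_1 = k_2\sin s_2$, is just the vanishing of $\partial_\beta(\theta+\varphi) = -k_1\sin s_1 - k_2\sin s_2$, which does not involve A.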

To complete the proof, for the θ > 0 case, we need to show that the second possibility holds whenever all three of the equations are satisfied. We know that Σe is contained in the zero level-set of ψ, at least near (x, y). Note that, if the above three equations are satisfied, then $k_1\cos s_1 > 0$ and $k_2\cos s_2 < 0$. This implies that ∂βψ > 0, for both choices of A, since it does not depend on A. Thus we can let H be the zero level-set of ψ, and the second possibility holds.

We now consider a point (x, y) ∈ Σe where θ = 0. Here, ψ is not continuous, and θ (which, we recall, is the distance in $S^2$ between m and (x − y)/|x − y|) has the usual non-differentiability of a distance function at its zero level-set. First, observe that ψ = 0 requires that m, n, and (x − y)/|x − y| all lie on the same geodesic in $S^2$. If we let γ be the great circle through (x − y)/|x − y| and n, then γ varies smoothly near (x, y). Further, if we let h be the signed distance between m and γ, then h is smooth near (x, y) as well. It follows that Σe is contained in the zero level-set of h, near (x, y). Because of this, it is natural to ask what conditions must be satisfied in order for the second possibility in the lemma to hold. To first-order, (x − y)/|x − y| will only move along γ. It follows that if $k_1 \neq 0$, either ∂αh, for both values of A, or ∂βh will be non-zero. In this case, possibility two of the lemma holds, with H the zero level-set of h. If $k_1 = 0$, then h is zero to first-order. In other words, m, n, and (x − y)/|x − y| remain collinear (in the sense of lying on the same great circle) to first-order. This suggests the following approach. If we allow ϕ to take values greater than π/2 and θ to take negative values (this is well defined if we restrict θ to the great circle), then these two angles determine the configuration, up to first-order. It is easy to check that the characterization of Σ0 as points with θ + ϕ = π/2



(when h = 0) extends to these values of ϕ and θ. Further, both angles are differentiable and our earlier formulas specialize to

\[ \partial_\alpha(\theta + \varphi) = \frac{2}{r} - Ak_2\cos s_2 \qquad\text{and}\qquad \partial_\beta(\theta + \varphi) = -k_2\sin s_2. \]

It is immediate that these cannot both be zero for both values of A. Since we know a priori, by Lemma 4.1, that Σ0 is the zero level-set of a smooth function, it follows that the first possibility in the lemma holds. □

The next two lemmas concern the case of an operator on $\mathbb{R}^d$ which, in standard coordinates $(z_1, \ldots, z_d)$, can be written in terms of a (measurable) locally bounded function a taking values in the set of symmetric, non-negative definite matrices and a (measurable) locally bounded function b taking values in $\mathbb{R}^d$ as

\[ \tilde{L} = \frac{1}{2}\sum_{i,j=1}^{d} a_{i,j}(z)\,\partial_{z_i}\partial_{z_j} + \sum_{i=1}^{d} b_i(z)\,\partial_{z_i}, \tag{4.1} \]

where the $a_{i,j}$ and the $b_i$ are the obvious components of a and b. Further, we choose C > 0 so that $|a_{i,j}|$ and $|b_i|$ are all less than C on the ball $B_R$ of radius R centered at the origin. Let $\tilde{P}$ be a measurable, strong Markov family of solutions to the martingale problem associated to $\tilde{L}$, with $\tilde{P}_x$ denoting the solution started at x (we assume that such a family of solutions exists).

Lemma 4.3. Let $\tilde{L}$ and $\tilde{P}_x$ be as above, and suppose further that a is uniformly elliptic on $B_R$ (so that all of its eigenvalues are bounded below by some c > 0). Then the expected occupation time under $\tilde{P}_x$ from time 0 to T has a density $G_T(x, y)$ on $B_{R/2}$, and this density obeys the estimate

\[ \sup_{x \in \mathbb{R}^d} \bigl\| G_T(x, \cdot) \bigr\|_{L^q(B_{R/2})} \le A, \]

where A and q are positive numbers, with q > d/(d − 1), that depend only on d, R, T, c, and C.

Proof. Because a and b are locally bounded, a standard localization argument implies that it is enough to prove the lemma when the process is stopped at the first exit time of $B_R$. Let $\sigma_1$ be the first exit time of $B_R$ (which is zero if $x \notin B_R$); we denote the corresponding density by $G_{T \wedge \sigma_1}(x, y)$. An application of Girsanov’s theorem (see [15, Section 6.4]) shows that it is enough to consider the case when b ≡ 0. Then the estimate for $G_{T \wedge \sigma_1}(x, y)$, as well as the fact that such a density exists, follows directly from Corollary 2.4 of [3]. □

For the next lemma, we assume that the rectangle $(0, 1) \times (-1, 1)^{d-1}$ is contained in $B_R$. Because we can rescale the coordinates, this is no loss of generality.

Lemma 4.4. Let $\tilde{L}$ and $\tilde{P}_x$ be as above. Suppose that $a_{1,1}$ is bounded from below by a positive constant c on the rectangle $(0, 1) \times (-1, 1)^{d-1}$. Then the expected occupation time of $(0, \delta) \times (-1/2, 1/2)^{d-1}$ from time 0 to T under $\tilde{P}_x$ goes to zero with δ, at a rate which depends only on d, R, T, c, and C.



Proof. Again, the local boundedness of a and b means that it is sufficient to prove the lemma when the process is stopped at $\sigma_1$, the first exit from $B_R$. Let ξ(x) be a smooth, non-negative, even function such that $\xi(x) \le |x|$, $|\xi'(x)| \le 1$, $\xi''(x) \ge 0$, and $\xi''(x) = 1$ on (−1/2, 1/2). Then let $\xi_\delta(x) = \delta^2\xi(x/\delta)$. Because $\xi_\delta(z_1 - \delta/2)$ is bounded on $B_R$, its expectation at time $T \wedge \sigma_1$ is bounded from above by kδ for some constant k depending only on d and R (assuming δ < 1). On the other hand, Itô’s rule implies that the expectation of $\xi_\delta(z_1 - \delta/2)$ at time $T \wedge \sigma_1$ is at least c/2 times the expected occupation time of $(0, \delta) \times (-1/2, 1/2)^{d-1}$, minus $(T \wedge \sigma_1)C\delta$. It follows that the expected occupation time of $(0, \delta) \times (-1/2, 1/2)^{d-1}$ is less than

\[ \frac{k + C(T \wedge \sigma_1)}{c/2}\,\delta. \]

Thus we have proved that the lemma holds for the process stopped at $\sigma_1$. □
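The Itô step in this proof can be spelled out as follows. This is a sketch under the assumptions that ξ is non-negative with ξ″ ≥ 0 everywhere, ξ″ = 1 on (−1/2, 1/2), and |ξ′| ≤ 1, and that $\xi_\delta(x) = \delta^2\xi(x/\delta)$; the shorthand $u$ and $\mathrm{Occ}_\delta$ is introduced here only for this computation.

```latex
% u(z) = \xi_\delta(z_1 - \delta/2) depends on z_1 only, so
%   \tilde{L}u = \tfrac12 a_{1,1}\,\xi_\delta''(z_1 - \delta/2) + b_1\,\xi_\delta'(z_1 - \delta/2).
% Here \xi_\delta''(x) = \xi''(x/\delta), which equals 1 when z_1 \in (0,\delta),
% and |\xi_\delta'(x)| = \delta\,|\xi'(x/\delta)| \le \delta.
\begin{align*}
k\delta \;\ge\; \tilde{E}_x\bigl[u(z(T\wedge\sigma_1))\bigr]
  &= u(x) + \tilde{E}_x\Bigl[\int_0^{T\wedge\sigma_1} \tilde{L}u(z(s))\,ds\Bigr] \\
  &\ge \frac{c}{2}\,\mathrm{Occ}_\delta \;-\; C\delta\,\tilde{E}_x[T\wedge\sigma_1],
\end{align*}
% where Occ_\delta is the expected occupation time of
% (0,\delta)\times(-1/2,1/2)^{d-1} up to time T\wedge\sigma_1.  Rearranging,
\[
\mathrm{Occ}_\delta \;\le\; \frac{k + C\,\tilde{E}_x[T\wedge\sigma_1]}{c/2}\,\delta
 \;\le\; \frac{k + CT}{c/2}\,\delta .
\]
```

The second inequality uses u(x) ≥ 0, the lower bound $a_{1,1} \ge c$ on the rectangle, and the non-negativity of $a_{1,1}\xi_\delta''$ elsewhere.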

4.2. Proof of existence

We are now in a position to prove the existence of an adequate coupling, namely, the existence of a solution to the martingale problem corresponding to any choice of operator L as described above.

Theorem 4.5. Let M and N be any stochastically complete minimal surfaces. For any points $x_0 \in M$ and $y_0 \in N$ with $r(x_0, y_0) \neq 0$, there exists an adequate coupling of Brownian motions started at $x_0$ and $y_0$ defined until the first time $r_t = r(x_t, y_t)$ hits zero, in particular, the coupling corresponding to any choice of operator L as described above. Further, given any such L, the corresponding coupling is unique until the first time it hits Σ0.

Proof. Existence of a solution to the martingale problem for L starting from $(x_0, y_0)$ is not guaranteed by standard results, so we proceed by an approximation argument. Consider the family of operators $L^{(j)} = L + \Delta/j$, where Δ is the Laplacian on the product manifold M × N. In order to deal with stopping the process at {r = 0}, we will need a second level of approximation. For all small, positive ε, let $\eta_\epsilon$ be a smooth function from M × N into [0, 1] that is 1 on $\{r \le \epsilon/2\}$ and 0 on $\{r \ge \epsilon\}$. Then we let

\[ L^{(j,k)} = (1 - \eta_{1/k})L^{(j)} + \eta_{1/k}\Bigl(\frac{1}{2} + \frac{1}{j}\Bigr)\Delta. \]
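As a small arithmetic check on this interpolation, one can verify that the marginal generator on each factor does not depend on the cutoff. The sketch below assumes that the cutoff-weighted term is (1/2 + 1/j) times the product Laplacian, which is the form consistent with the marginals being (j + 2)/2j times the Laplacian; the function name and the sample values of j and of the cutoff are hypothetical.

```python
from fractions import Fraction

def marginal_factor(j, eta):
    """Coefficient of the Laplacian in a marginal of L^(j,k) where the cutoff equals eta."""
    base = Fraction(1, 2) + Fraction(1, j)    # marginal of L^(j) = L + Delta/j
    cutoff = Fraction(1, 2) + Fraction(1, j)  # ASSUMED coefficient of the eta-weighted term
    return (1 - eta) * base + eta * cutoff

for eta in (Fraction(0), Fraction(1, 4), Fraction(1)):
    print(eta, marginal_factor(3, eta), Fraction(3 + 2, 2 * 3))
```

Because the two coefficients agree, the convex combination is constant in the cutoff, so each marginal is a Brownian motion run at a constant speed.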

We note that $L^{(j,k)}$ is defined on all of M × N, unlike L, which is not defined on {r = 0}. In order to apply standard theorems in martingale theory, it will be helpful to introduce global coordinates and thus work on $\mathbb{R}^4$. As usual, we can pass to the universal covers of M and N, and so we assume that they are simply connected. Because they have non-positive curvature, normal coordinates around any point give global coordinates, and map each surface diffeomorphically to $\mathbb{R}^2$. This extends to the product manifold in the obvious way. In these global coordinates, each $L^{(j,k)}$ can be written in the form shown in Eq. (4.1) with locally bounded, locally uniformly elliptic coefficients. It follows that for each $L^{(j,k)}$ we can find a measurable, strong Markov family of solutions to the corresponding martingale problem (see the beginning of Section 3.2 for the definition of the martingale problem), which we denote $P^{(j,k)}_{(x_0,y_0)}$. (A priori, these solutions are



defined only up to explosion, but we will see in a moment that they never explode.) In order to simplify the notation, we will assume some starting point $(x_0, y_0)$ has been chosen and simply denote the corresponding measures by $P^{(j,k)}$ whenever there is no possibility of confusion.

Consider the marginal processes on M and N. In each case, the marginal distribution of $P^{(j,k)}$ solves the martingale problem for (j + 2)/2j times the Laplacian. This is just time-changed (by a constant factor) Brownian motion on a stochastically complete manifold. It follows that the $P^{(j,k)}$ processes never explode, since M and N are both stochastically complete and a process on a product manifold blows up if and only if one of the marginals blows up.

Next, we need to show that any sequence of $\{P^{(j,k)}\}$ has a weakly convergent subsequence, that is, that this family of measures is pre-compact. All of our processes start from the same point, so Theorem 1.3.1 of [15] asserts that $\{P^{(j,k)}\}$ is pre-compact if and only if, for every ρ > 0 and T < ∞, we have

\[ \lim_{\delta \searrow 0}\, \inf_{j,k}\, P^{(j,k)}\Bigl[ \sup_{\substack{0 \le s \le t \le T \\ t - s \le \delta}} \bigl|\omega(t) - \omega(s)\bigr|_{\mathbb{R}^4} \le \rho \Bigr] = 1. \tag{4.2} \]

Recall that the marginals on M and N are time-changed (by a constant factor) Brownian motions, and thus we see that the family of marginals on M and the family of marginals on N both possess the property described by Eq. (4.2). For any path on the product space, the increment $|\omega(t) - \omega(s)|_{\mathbb{R}^4}$ is bounded by the sum of the increments of the projections onto M and N, by the triangle inequality. It follows that the family $\{P^{(j,k)}\}$ possesses the property described by Eq. (4.2), and thus any sequence of $\{P^{(j,k)}\}$ has a weakly convergent subsequence.

Consider any sequence (j(l), k(l)) such that j(l) → ∞ and $k(l) \ge k_0$ for some positive integer $k_0$. The corresponding sequence of measures has a convergent subsequence, so after possibly re-indexing our sequence, we can assert that $P^{(j(l),k(l))}$ converges to a limit we call $P^{k_0}$ (obviously, the limit depends in general on the sequence (j(l), k(l)) and not just on $k_0$, but we will see that this notation will be sufficient for our purposes). Further, let $\zeta_\epsilon$ be the first hitting time of the set $\{r \le \epsilon\}$; in particular, $\zeta_0 = \zeta$, which we have previously defined as the first hitting time of {r = 0}. We wish to prove that $P^{k_0}$ is a solution to the martingale problem corresponding to L until $\zeta_{1/k_0}$. It is easy to see that $P^{k_0}[\omega(0) = (x_0, y_0)] = 1$, and so it remains to prove that $P^{k_0}$ has the desired martingale property. For this, it is enough to show that

\[ E\Bigl[ F\Bigl( h(\omega(t)) - h(\omega(s)) - \int_s^t Lh(\omega(u))\,du \Bigr) \Bigr] = 0 \]

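for 0 ≤ s < t, any bounded, continuous, $\mathcal{B}_s$-measurable function F from C[0, ∞) to $\mathbb{R}$, and any smooth h compactly supported on $(M \times N) \setminus \{r \le 1/k_0\}$, where the expectation is with respect to $P^{k_0}$. This in turn will follow if we show that

\[ E^{(l)}\Bigl[ |F| \int_s^t \bigl| \bigl(L^{(j(l))} - L\bigr)h(\omega(u)) \bigr|\,du \Bigr] + \Bigl| E^{(l)}\Bigl[ F \int_s^t Lh(\omega(u))\,du \Bigr] - E\Bigl[ F \int_s^t Lh(\omega(u))\,du \Bigr] \Bigr| \tag{4.3} \]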



goes to zero as l goes to infinity, where $E^{(l)}$ is expectation with respect to $P^{(j(l),k(l))}$. The first term goes to zero because F is bounded and $(L^{(j(l))} - L)h$ converges to zero uniformly, using the fact that h is smooth and compactly supported on $(M \times N) \setminus \{r \le 1/k_0\}$ and $L^{(j(l))}$ converges to L on $(M \times N) \setminus \{r \le 1/k_0\}$. For the second term, we note that, by a partition of unity argument, it is sufficient to show that for any point $(x, y) \in (M \times N) \setminus \{r \le 1/k_0\}$ there is an open neighborhood S of the point such that the second term goes to zero for h supported on S (and thus S should be taken to be disjoint from $\{r \le 1/k_0\}$). For the purposes of such an argument, we see that there are three types of points.

First, suppose that (x, y) is contained in the complement of Σ0. Then we can choose S also to be contained in this complement. If h is smooth and supported on S, then so is Lh, using the fact that L is smooth on the complement of Σ0. Then the second term of Eq. (4.3) goes to zero by the definition of weak convergence (recall that F is continuous).

Next, suppose that (x, y) ∈ Σ0 \ Σe. Then L is uniformly elliptic and bounded on any sufficiently small neighborhood of (x, y). Moreover, we can choose coordinates centered at (x, y) so that $L^{(j(l))}$ and $P^{(j(l),k(l))}$ satisfy the assumptions of Lemma 4.3 for some choice of constants R, c, and C independent of l, and with d = 4 and T > t. Also, we can choose S, our neighborhood of (x, y), to be contained in $B_{R/2}$ (in these coordinates) and disjoint from $\{r \le 1/k_0\}$. Let q > 4/3 be the constant from Lemma 4.3, and let q′ be its Hölder conjugate. Then we can find a sequence of continuous functions $\xi_m$ supported on $B_{R/2}$ which approximate Lh in $L^{q'}(B_{R/2})$. By weak convergence, we know that

\[ E^{(l)}\Bigl[ F \int_s^t \xi_m(\omega(u))\,du \Bigr] \to E\Bigl[ F \int_s^t \xi_m(\omega(u))\,du \Bigr] \]

as l → ∞. Further, Lemma 4.3 and the Hölder inequality imply that

\[ E^{(l)}\Bigl[ F \int_s^t \xi_m(\omega(u))\,du \Bigr] \to E^{(l)}\Bigl[ F \int_s^t Lh(\omega(u))\,du \Bigr] \]

as $m \to \infty$, uniformly in $l$. Because this convergence is uniform in $l$, combining these two equations shows that the second term of Eq. (4.3) goes to zero as desired.

The final type of point we need to consider is $(x, y) \in \Sigma_e$. This divides further into two cases, depending on which of the possibilities in Lemma 4.2 holds. Suppose that the second possibility in Lemma 4.2 holds. Let $\Gamma$ be the bilinear form that gives the cross-variation of vector fields under L. We know that the operators corresponding to the optimal orientation-preserving and reversing couplings are smooth, and that L agrees with L at $(x, y)$. It follows that $\Gamma(v_H, v_H)$ is bounded below by a positive constant on an open neighborhood $S$ of $(x, y)$; we also assume that $S$ is disjoint from $\{r \le 1/k_0\}$. Thus, we can find coordinates $(z_1, \ldots, z_4)$ centered at $(x, y)$ such that $L^{(j(l))}$ and $P^{(j(l),k(l))}$ satisfy the assumptions of Lemma 4.4 on both sides of $H$ for some choice of constants $R$, $c$, and $C$ independent of $l$, with $d = 4$ and $T > t$; and where these coordinates are such that $(-1, 1)^4 \subset S \subset B_R$ and $z_1$ restricted to $S$ is the signed distance to $H$. In particular, Lemma 4.4 implies that the occupation time of $(-\delta, \delta) \times (-1/2, 1/2)^3$ goes to zero in $\delta$ at a rate that can be taken to be independent of $l$.

We can assume that the support of $h$ is contained in $S' = (-1/2, 1/2)^4$. For any $\delta \in (0, 1/2)$, we can find a mollified version of $Lh$, which we denote $\xi_\delta$, with the following properties: $\xi_\delta$ is supported on $S$; $\xi_\delta$ is continuous in a $\delta/2$-neighborhood of $\Sigma_e$; the $\xi_\delta$ are bounded uniformly in $\delta$; $Lh$ and $\xi_\delta$ are equal outside of a $\delta$-neighborhood of $\Sigma_e$; and the discontinuities of $\xi_\delta$ are contained in an open set where $L$ is uniformly elliptic (here we mean uniformly in space, not in $\delta$). We claim that, for any $\delta$,

\[
E^{(l)}\Big[ F \int_s^t \xi_\delta(\omega(u)) \, du \Big] \to E\Big[ F \int_s^t \xi_\delta(\omega(u)) \, du \Big]
\tag{4.4}
\]

as $l \to \infty$. To see this, note that a bump function argument shows that $\xi_\delta$ can be written as the sum of a bounded function supported on a subset of $S$ where $L$ is uniformly elliptic and a continuous function supported on $S$. The argument given above for $Lh$ supported on a set where $L$ is uniformly elliptic shows that we have the desired convergence for the first term in the decomposition of $\xi_\delta$, and weak convergence applies directly to the second term in the decomposition. The claim follows.

Next, note that the difference in the expectation of $\xi_\delta$ and $Lh$ is bounded by a constant that does not depend on $l$ times the occupation time of $(-\delta, \delta) \times (-1/2, 1/2)^3$, which we have already seen goes to zero with $\delta$, uniformly in $l$. Thus we have that

\[
E^{(l)}\Big[ F \int_s^t \xi_\delta(\omega(u)) \, du \Big] \to E^{(l)}\Big[ F \int_s^t Lh(\omega(u)) \, du \Big]
\]

as $\delta \searrow 0$, uniformly in $l$. Because this convergence is uniform in $l$, combining this with Eq. (4.4) shows that the second term of Eq. (4.3) goes to zero as desired.

Now suppose the first possibility in Lemma 4.2 holds. The argument is similar to the previous case. Without loss of generality, we can assume that $\Gamma_+(v_0, v_0)$ is positive at $(x, y)$, since the roles of $\Gamma_+$ and $\Gamma_-$ are symmetric. It follows that $\Gamma(v_0, v_0)$ is uniformly positive "on one side" of $\Sigma_0$ in $S$, where $S$ is some open neighborhood of $(x, y)$, disjoint from $\{r \le 1/k_0\}$. Then $S \setminus \Sigma_0$ is naturally divided into two disjoint, connected open sets, say $S_1$ and $S_2$, and $\Gamma(v_0, v_0)$ is bounded below by a positive constant on one of them, which we can assume is $S_1$. Then we can find coordinates $(z_1, \ldots, z_4)$ centered at $(x, y)$ such that $L^{(j(l))}$ and $P^{(j(l),k(l))}$ satisfy the assumptions of Lemma 4.4 for some choice of constants $R$, $c$, and $C$ independent of $l$, with $d = 4$ and $T > t$; and where these coordinates are such that $(0, 1) \times (-1, 1)^3 \subset S \subset B_R$ and $z_1$ restricted to $S$ is a constant (non-zero) multiple of the signed distance to $\Sigma_0$, with positive $z_1$ corresponding to $S_1$ (this implies that $\Sigma_0 \cap S$ is $\{z_1 = 0\} \cap S$). In particular, we note that, in the notation of Lemma 4.4, this implies that $a_{1,1}$ is bounded below by a positive constant on $S_1 = \{z_1 > 0\} \cap S$.

We can assume that the support of $h$ is contained in $S' = (-1/2, 1/2)^4$. In contrast to the previous cases, here the definition of $L$ on $\Sigma_e$ matters. In particular, we assume that $L|_{\Sigma_0}$ is chosen to be the limit when approached from within $S_2$ (this is equivalent to having it agree with the optimal orientation-preserving coupling on $\Sigma_0$ if $L$ corresponds to the orientation-preserving coupling on $S_2$, and similarly for the orientation-reversing case). We will say more about this assumption below. This implies that, for each $\delta \in (0, 1/2)$, we can find a mollified version of $Lh$, which we again denote $\xi_\delta$, with the properties that $\xi_\delta$ is continuous, the $\xi_\delta$ are bounded uniformly in $\delta$, and $Lh$ and $\xi_\delta$ are equal outside of $(0, \delta) \times (-1/2, 1/2)^3$. Weak convergence means that

\[
E^{(l)}\Big[ F \int_s^t \xi_\delta(\omega(u)) \, du \Big] \to E\Big[ F \int_s^t \xi_\delta(\omega(u)) \, du \Big]
\]

as $l \to \infty$. In addition, Lemma 4.4 and our choice of $\xi_\delta$ imply that

\[
E^{(l)}\Big[ F \int_s^t \xi_\delta(\omega(u)) \, du \Big] \to E^{(l)}\Big[ F \int_s^t Lh(\omega(u)) \, du \Big]
\]

as $\delta \searrow 0$, uniformly in $l$. Because this convergence is uniform in $l$, combining these two equations shows that the second term of Eq. (4.3) goes to zero as desired.

The only aspect of the argument in the preceding paragraph that needs comment is the possibility of globally defining $L$ on $\Sigma_e$. However, comparing the arguments in the cases when the first or second possibility of Lemma 4.2 holds, we see that the choice of $L$ on $\Sigma_e$ only matters near points $(x, y) \in \Sigma_e$ where the second possibility does not hold for any $H$ and where either $\Gamma_+(v_0, v_0)$ or $\Gamma_-(v_0, v_0)$ is zero at $(x, y)$ (which, of course, must be true if the second possibility does not hold, since otherwise we could just take $H$ to be $\Sigma_0$). Near any such point $(x, y)$, we know that there exists a neighborhood $S$ of $(x, y)$ such that $\Sigma_0 \cap S$ is a smooth hypersurface. If we choose smooth orthonormal frames for $M$ and $N$ on $S$, then the set of points where $L$ must correspond to the orientation-preserving coupling and the set of points where $L$ must correspond to the orientation-reversing coupling are closed, disjoint sets. It follows that we can make a global choice of $L$ (on $(M \times N) \setminus \{r = 0\}$) such that $L|_{\Sigma_0}$ is what it must be for the above argument to work at all points $(x, y) \in \Sigma_e$ where only the first possibility of Lemma 4.2 holds. From now on, we will assume that $L$ has been defined on $\Sigma_0$ in a way that satisfies the above description.

As this is all that is needed, we have proved that $P^{k_0}$ is a solution to the martingale problem corresponding to $L$ starting at $(x_0, y_0)$ until $\zeta_{1/k_0}$. To continue, note that the limit of a convergent sequence $P^{(j(l),k(l))}$ depends only on the tail of the sequence. Thus we have actually shown that if $j(l) \to \infty$ and $\liminf k(l) \ge k_0$, then the limit of a convergent subsequence is a solution to the martingale problem corresponding to $L$ until $\zeta_{1/k_0}$. Choose $(j(l), k(l))$ so that $j(l)$ and $k(l)$ both go to infinity with $l$, pass to a convergent subsequence, and call the limit $P$. Then because $\liminf k(l) = \infty$, $P$ is a solution to the martingale problem corresponding to $L$ until $\zeta_{1/k}$ for every positive integer $k$. Since $\zeta_{1/k} \nearrow \zeta$ as $k \to \infty$, it follows that $P$ is a solution to the martingale problem corresponding to $L$ until $\zeta$. The proof of the existence of a solution to the martingale problem for $L$ starting from any point $(x_0, y_0) \in (M \times N) \setminus \{r = 0\}$ and stopped at the first hitting time of $\{r = 0\}$ is complete.

To prove the final assertion of the theorem, we recall that $L$ is smooth on the complement of $\Sigma_0$. Thus the martingale problem corresponding to $L$ has a unique solution until the first hitting time of $\Sigma_0$. By uniqueness, the solution just constructed (and, moreover, any solution to the martingale problem for $L$) must agree with this solution until the first hitting time of $\Sigma_0$. □
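The step "combining these two equations", used several times in the proof above, is a three-term splitting. The display below is a sketch of that splitting for a generic approximant $\xi$ (standing for $\xi_m$ or $\xi_\delta$). The shorthand $I(g)$ is notation introduced only for this illustration, and the treatment of the third term assumes that the limit measure inherits the same approximation estimate as the $P^{(j(l),k(l))}$, which is an assumption of this sketch rather than a statement taken from the text.

```latex
% Shorthand (introduced here): I(g) = F \int_s^t g(\omega(u)) \, du
\begin{align*}
\bigl| E^{(l)}[I(Lh)] - E[I(Lh)] \bigr|
  &\le \bigl| E^{(l)}[I(Lh)] - E^{(l)}[I(\xi)] \bigr|   % small, uniformly in l
   + \bigl| E^{(l)}[I(\xi)] - E[I(\xi)] \bigr|          % -> 0 as l -> infinity, xi fixed
   + \bigl| E[I(\xi)] - E[I(Lh)] \bigr| .               % small (assumed for the limit measure)
\end{align*}
```

One first fixes $\xi$ so that the first and third terms are small, which is possible for the first term precisely because the approximation is uniform in $l$, and then lets $l \to \infty$ to dispose of the middle term.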



5. Applications of the coupling

5.1. Strong halfspace-type theorems

A strong halfspace theorem states that two minimal surfaces, satisfying some condition, either intersect or are parallel planes. Hoffman and Meeks [4] proved a strong halfspace theorem for (complete) properly immersed minimal surfaces. Their proof used geometric measure theory to show that two such non-intersecting minimal surfaces are separated by a stable minimal surface, which must then be a plane (since planes are the only stable minimal surfaces in $\mathbb{R}^3$ by a result of Schoen [13]). This reduces the problem to the corresponding weak halfspace theorem. Rosenberg [12] proved a strong halfspace theorem for complete minimal surfaces of bounded curvature, as did Bessa, Jorge, and Oliveira-Filho [1] (this latter paper also gives a "mixed" strong halfspace theorem in which one minimal surface is properly immersed and the other is complete with bounded curvature).

The ultimate goal of introducing our coupled Brownian motions is to show that the particles couple, either with positive probability or with probability one. If the particles couple with positive probability, then the minimal surfaces on which they move obviously intersect. This gives a potential method for proving strong halfspace theorems or similar results. The issue is proving that the particles couple. The distance between the particles, under our coupling, is dominated, after time change, by a two-dimensional Bessel process. Recall that a two-dimensional Bessel process comes arbitrarily close to zero, while a Bessel process of dimension less than two strikes zero in finite time almost surely.
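The dichotomy just recalled can be read off from the scale function of the Bessel process. The following is a standard computation, sketched here for convenience; the symbols $\rho$, $W$, $s$, $T_x$ and the levels $0 < \varepsilon < a < R$ are notation introduced for this illustration only.

```latex
% Bessel process of dimension d, away from zero:
\[ d\rho_t = dW_t + \frac{d-1}{2\rho_t}\,dt . \]
% A scale function: s(x) = \log x if d = 2, and s(x) = x^{2-d} if d \neq 2.
% With \rho_0 = a and T_x the first hitting time of the level x:
\[ P_a\bigl(T_\varepsilon < T_R\bigr) = \frac{s(R) - s(a)}{s(R) - s(\varepsilon)} . \]
% Dimension d = 2:
\[ P_a\bigl(T_\varepsilon < T_R\bigr) = \frac{\log R - \log a}{\log R - \log \varepsilon} . \]
% Dimension d < 2 (let \varepsilon \to 0, using s(0) = 0):
\[ P_a\bigl(T_0 < T_R\bigr) = 1 - \Bigl(\frac{a}{R}\Bigr)^{2-d} . \]
```

For $d = 2$ the probability tends to one as $R \to \infty$ with $\varepsilon$ fixed, so every level $\varepsilon > 0$ is reached, and it tends to zero as $\varepsilon \to 0$ with $R$ fixed, so zero itself is not reached. For $d < 2$ the last expression is positive for every $R$ and tends to one as $R \to \infty$.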
Thus, heuristically, we see that the only obstacles to our particles coupling are that the distance between them might converge, corresponding to the process accumulating only finite quadratic variation, or that the process might look too much like a two-dimensional Bessel process when the particles are close, causing them to come arbitrarily close but never to couple. The rate at which the quadratic variation grows and the ratio of the drift to the dispersion are both determined by the configuration of the tangent planes, that is, by the angles $\theta$, $\varphi$, and $\psi$ introduced above. Unfortunately, it is not clear how to get the necessary control of the evolution of these angles to prove the full strong halfspace theorem for either properly immersed or bounded curvature minimal surfaces. Instead, we have the following partial result.

Theorem 5.1. Let $M$ be a minimal surface that is either recurrent or stochastically complete with bounded curvature, and let $N$ be a stochastically complete minimal surface. Then if $M$ is not flat, $\mathrm{dist}(M, N) = 0$.

Proof. Consider Brownian motion on $M$ and $N$, coupled as described above (for arbitrary starting points), and assume that $M$ is not flat. Then the particles will become arbitrarily close (whether or not they meet) as long as the distance between them does not converge to some positive limit. To show that this cannot happen, we proceed by contradiction. Assume that, with positive probability, $r_t$ converges to a positive value (and thus that the process continues for all time). The vector $x_t - y_t$ is an $\mathbb{R}^3$-valued martingale, and it is easy to see that its length can only converge if the vector itself converges. This, in turn, means that the direction of the vector in $S^2$ must converge. However, because $M$ is recurrent or stochastically complete with bounded curvature and is not flat, we know from Theorem 2.1 that, up to a set of probability zero, any path that continues for all time has a normal vector that spends an infinite amount of time in every open set of $S^2$. We conclude that, along such a path, the system spends an infinite amount of time where θ ∈ (3π/4, π/2). The rate of growth of quadratic variation is bounded from below by a positive constant on this set, and thus the $r_t$-process must accumulate infinite quadratic variation. This contradicts our assumption that it converges, and the proof is complete. □

Note that this proof shows that any Brownian motion on $M$ almost surely becomes arbitrarily close to $N$ and vice versa. Further, in contrast to Rosenberg's result, only one of the minimal surfaces needs to be stochastically complete and have bounded curvature (or be recurrent); the other need only be stochastically complete. On the other hand, the weakness of this theorem is obvious. We have only shown that the distance between the surfaces is zero, not that they actually intersect. As the proof makes clear, the difficulty is controlling the process for small $r$, to rule out both the possibility that $r$ converges to zero and the possibility that it becomes arbitrarily close to zero without ever hitting it. One might hope that better understanding of the process for small $r$ would allow the theorem to be strengthened to conclude that the surfaces intersect.

In the properly immersed case, we do not even have the analogue of Theorem 5.1. This is a consequence of our inability to control the long-term behavior of the normal vector to any extent greater than that implied by the weak halfspace theorem.

5.2. Maximum principle at infinity

In a more positive vein, one nice feature of the use of coupled Brownian motions is that one expects results to extend naturally to the case of minimal surfaces with boundary, as mentioned above. By a minimal surface with boundary, we mean a surface with boundary together with an immersion which is minimal on the interior and extends continuously to the boundary. Our approach requires extending Theorem 4.5 to the case when one or both of $M$ and $N$ are allowed to have boundary. To do this, assume that $M$ has non-empty boundary.
Then it is fairly straightforward to show that the first hitting time of the boundary of $M$ is almost surely continuous, with respect to Brownian motion on $M$, on the set where it is finite. The same is true of $N$ in case it has non-empty boundary. The first hitting time of the boundary of $M \times N$, which we denote $\eta$, is the minimum of the first hitting times of the boundaries of $M$ and $N$. All of the processes $P^{(j,k)}$ that we introduced in the proof of Theorem 4.5 and all of their weak limits have marginals that are time-changed (by a constant factor) Brownian motion, so $s \wedge \eta$ and $t \wedge \eta$ (for fixed times $0 < s \le t$) are bounded and almost surely continuous with respect to all of these measures. Because of this boundedness and almost sure continuity, the weak convergence argument we used in the proof of Theorem 4.5 to show that the limit measure is a solution to the martingale problem for $L$ is compatible with stopping all of the processes at the boundary. Thus we see that we can solve the martingale problem for $L$, stopped at the boundary, and this gives an adequate coupling, stopped at the boundary, in the case when one or both of our minimal surfaces has boundary. (The reason we had to use an approximation argument earlier when stopping the process at $\zeta$ is that it is not clear that $\zeta$ is almost surely continuous.)

The model theorem from (non-stochastic) geometric analysis is the following version of the maximum principle at infinity, recently proved by Meeks and Rosenberg [8].

Theorem 5.2. Let $M$ and $N$ be disjoint, complete, properly immersed minimal surfaces-with-boundary, at least one of which has non-empty boundary. Then the distance between them satisfies

\[ \mathrm{dist}(M, N) = \min\{\mathrm{dist}(M, \partial N), \mathrm{dist}(\partial M, N)\}. \]



This is a generalization of the strong halfspace theorem for properly immersed minimal surfaces. It is proved by similar methods, although the addition of the boundary makes things more difficult. We expect the analogue in the bounded curvature case to be true, although to our knowledge no (previous) work has been done in that direction. However, we have the following version of the maximum principle at infinity for minimal surfaces-with-boundary of bounded curvature.

Theorem 5.3. Let $M$ and $N$ be stochastically complete minimal surfaces-with-boundary, at least one of which has non-empty boundary, and such that $\mathrm{dist}(M, N) > 0$. If $M$ has bounded curvature or is recurrent, and is not flat, then

\[ \mathrm{dist}(M, N) = \min\{\mathrm{dist}(M, \partial N), \mathrm{dist}(\partial M, N)\}. \]

Proof. Suppose $M$ and $N$ satisfy the hypotheses of the theorem, and that $\mathrm{dist}(M, N) = a > 0$. Consider any point $(x_0, y_0)$ in the interior of $M \times N$. We run a coupled Brownian motion starting at $(x_0, y_0)$, stopped when it hits the boundary. With one caveat, it is clear from our proof of Theorem 5.1 that $r_t$ almost surely hits the boundary in finite time, having accumulated only finite quadratic variation, since otherwise the process would hit a level below $a$ with positive probability. The caveat is that the proof of Theorem 5.1 uses Theorem 2.1, which we have not proved for surfaces-with-boundary. However, we can prove that any Brownian path on $M$ with an infinite lifetime has a normal vector which accumulates infinite occupation time in every open set of $S^2$, up to a set of probability zero, as follows. If $M$ is recurrent this is clear for the same reasons as before, so assume that $M$ is transient (meaning the interior of $M$ is transient). Let $\tilde{M}$ be the universal cover of the interior of $M$, so that $\tilde{M}$ is conformally equivalent to the unit disk. Then $\tilde{M}$ can be described by Weierstrass data just as before, and $\tilde{M}$ also has bounded curvature.
Further, we see that the argument in the proof of Theorem 2.1 can now be applied to the set of paths with infinite lifetime, so we conclude that any such path has a normal vector which accumulates infinite occupation time in every open set of $S^2$, up to a set of probability zero. This establishes our claim that $r_t$ almost surely hits the boundary of $M \times N$ in finite time.

To complete the proof, first suppose that, with positive probability, $r_t$ accumulates no quadratic variation. It follows that this set of paths produces points $(x, y)$ on the boundary with $\mathrm{dist}(x, y) = \mathrm{dist}(x_0, y_0)$. For the other case, suppose that $r_t$ almost surely accumulates positive quadratic variation. Then, by comparison with a two-dimensional Bessel process, there is some $\varepsilon > 0$ such that $r_t$ hits the level $\mathrm{dist}(x_0, y_0) - \varepsilon$ with probability at least 1/2. Since the process must stop before $r_t$ gets below level $a$, comparison with a two-dimensional Bessel process shows that there is a positive probability that the paths along which $r_t$ hits $\mathrm{dist}(x_0, y_0) - \varepsilon$ are stopped at the boundary before $r_t$ increases to $\mathrm{dist}(x_0, y_0) - \varepsilon/2$. This produces points $(x, y)$ on the boundary with $\mathrm{dist}(x, y) < \mathrm{dist}(x_0, y_0)$.

It follows from the above that, for any point $(x_0, y_0)$ in the interior of $M \times N$, there is a point $(x, y)$ on the boundary of $M \times N$ such that $\mathrm{dist}(x, y) \le \mathrm{dist}(x_0, y_0)$. We conclude that $\mathrm{dist}(M, N) = \min\{\mathrm{dist}(M, \partial N), \mathrm{dist}(\partial M, N)\}$, and the theorem is proved. □

5.3. Liouville theorems

In the previous section, we coupled Brownian motions on two different surfaces in an attempt to control the distance between these surfaces. However, one can also consider coupling two Brownian motions started at different points on the same surface. This gives an approach to proving that there are no non-constant bounded harmonic functions on certain classes of minimal surfaces. In particular, suppose that $M$ is an embedded (by which we mean injectively immersed) minimal surface such that Brownian motions started from any two points couple almost surely. By embeddedness, the fact that they couple in the extrinsic distance also means that they couple in the intrinsic distance. Then the standard representation of harmonic functions as integrals of Brownian motion with respect to bounded stopping times shows that any bounded harmonic function must be constant.

Our efforts in this direction are guided by the following conjecture, which appears to go back to Sullivan (see [7, Conjecture 1.6] and the surrounding discussion).

Conjecture 5.4. A complete, properly embedded minimal surface admits no non-constant, positive harmonic functions.

Though the full conjecture remains open, various special cases are known. Of course, any class of surfaces which are recurrent satisfies the conjecture. For example, Theorem 3.5 of [2] states that any complete, properly embedded minimal surface with two limit ends (see the introduction of the paper just cited for a discussion of ends and limit ends) is recurrent. As for results that apply to transient surfaces, in [9], Meeks, Pérez, and Ros prove the conjecture under the additional assumption that the surface possesses one of various symmetries (such as being triply periodic or, more generally, having a sufficiently large group of ambient isometries). We provide another partial result, under the additional assumption that $M$ has bounded curvature. Also note that, as indicated above, our result only prohibits non-constant bounded (rather than positive) harmonic functions.

Theorem 5.5. Let $M$ be a complete, properly embedded minimal surface of bounded curvature. Then $M$ has no non-constant bounded harmonic functions.

Proof. Choose any two distinct points $x_0$ and $y_0$ in $M$, and run our adequate coupling of Brownian motions from these points.
Our proof of Theorem 5.1 shows that $r_t$ either hits zero in finite time or else spends an infinite amount of time in every neighborhood of zero. To prove the theorem, it is enough to prove that $r_t$ almost surely hits zero in finite time, as discussed above.

As a consequence of their maximum principle at infinity (which we stated above), Meeks and Rosenberg were also able to prove that any properly embedded minimal surface with bounded Gauss curvature has a fixed size tubular neighborhood (see the first paragraph of Section 5 of [8] along with Theorem 5.3). With bounded curvature, the existence of such a tubular neighborhood implies that there is some $a > 0$ such that, whenever $r_t = r(x_t, y_t) \le a$, the distance between $x_t$ and $y_t$ with respect to the metric on $M$, which we denote $\mathrm{dist}_M(x_t, y_t)$, is less than or equal to $2a$. Further, if this property holds for some particular $a$, then it also holds for any smaller $a$.

Because the curvature is bounded and the embedding is minimal, the entire second fundamental form of the embedding is uniformly bounded. This means that any fixed sized (with respect to the metric on $M$) neighborhood of a point is uniformly comparable to the tangent plane at that point. More concretely, for any $\varepsilon > 0$, we can choose $a$ small enough so that, whenever $\mathrm{dist}_M(x, y) < 2a$, the resulting configuration satisfies

\[
\theta \in \Big[ \frac{\pi}{2} - \varepsilon, \frac{\pi}{2} \Big], \qquad
\varphi \in \Big[ \frac{\pi}{2} - \varepsilon, \frac{\pi}{2} \Big], \qquad
\psi \in [0, \varepsilon].
\]

One consequence of this is that, by choosing $a$ small enough, we can guarantee that the set of points with $r \le 2a$ is disjoint from $\Sigma_0$. In fact, because we also know that the curvature is bounded, the subset of $M \times N$ where $r \le 2a$ is a positive distance (in the product metric) from $\Sigma_0$. Thus we can assume that our modified coupling, given by L, agrees with the optimal coupling (given by L) on the set where $r \le 2a$. Under the optimal coupling, $\theta = \varphi = \pi/2$, $\psi = 0$ corresponds to $r_t$ evolving like the standard mirror coupling in the plane, $dr_t = 2\,dB_t$. So by taking $\varepsilon$ and $a$ small enough, the configuration can be made arbitrarily close to that of the standard mirror coupling on the plane, which is just time-changed Brownian motion, whenever $\mathrm{dist}_M(x_t, y_t) < 2a$.

The other consequence that we need of the fact that the set $\{r \le 2a\}$ is disjoint from $\Sigma_0$ is that the $(x_t, y_t)$-process, started at any point in $\{(x, y) : r(x, y) \le 2a\}$, is unique at least until its first exit time from this set. In particular, the process has the strong Markov property on any time interval during which it is contained in this set.

The above shows that the rate of growth of quadratic variation of $r_t$ is bounded from below on $\{(x, y) : r(x, y) \le 2a\}$, so if $r_t$ does not strike zero in finite time, it hits the level $a$ and then leaves the set $\{(x, y) : r(x, y) \le 2a\}$ infinitely many times. The point now is to argue that each time $r_t$ hits $a$ it has some probability, bounded from below, of hitting 0 before it hits $2a$. Then we wish to use the (almost) independence of these events to conclude that, almost surely, on one of these occasions $r_t$ must hit 0. As indicated above, this proves the theorem. The rest of the proof is devoted to making this argument precise.

Note that a one-dimensional Bessel process (which corresponds to $\theta = \varphi = \pi/2$, $\psi = 0$) has the property that, if started at some $l$, it hits zero before it hits $2l$ with probability 1/2.
In light of the above, we can choose $a$ so that whenever $r_t \le 2a$, $r_t$ is dominated by a time-changed Bessel process of dimension $d$, where $d$ is such that a Bessel process of dimension $d$ started at $a$ strikes zero before $2a$ with some probability $p > 0$, and so that the time-change satisfies the estimate $d\tau/dt \ge 3/4$. Let $t_1$ be the first time that $r_t$ hits the level $a$, and let $\rho^{(1)}_{\tau(t)}$ be the comparison Bessel process started from level $a$ at time $t_1$, given by

\[
\rho^{(1)}_{\tau(t)} = a + W_{\tau(t)} - W_{\tau(t_1)} + \int_{\tau(t_1)}^{\tau(t)} \frac{d-1}{2\rho^{(1)}_{\tau(s)}} \, ds,
\qquad
\tau(t) = \int_0^t f \, ds,
\]

where $W_\tau$ is a Brownian motion. As usual, we stop $(x_t, y_t)$ and $\rho^{(1)}_{\tau(t)}$ if $r_t$ hits zero. Let $\sigma_1$ be the first time after $t_1$ that $\rho^{(1)}_{\tau(t)}$ hits $2a$; by convention, $\sigma_1 = \infty$ if $r_t$ strikes zero before $\rho^{(1)}_{\tau(t)}$ hits $2a$.

We now iterate this procedure. For $n \ge 2$, let $t_n$ be the first time after $\sigma_{n-1}$ that $r_t$ hits $a$, and let $\sigma_n$ be the first time after $t_n$ that either $r_t$ hits 0 or $\rho^{(n)}_{\tau(t)}$ hits $2a$. Here $\rho^{(n)}_{\tau(t)}$ is defined by the same equation as $\rho^{(1)}_{\tau(t)}$, except that it is begun from level $a$ at time $t_n$.

Now consider the processes $\{\rho^{(n)}_{\tau(t)},\ t_n \le t \le t_{n+1}\}$. Recall that, starting from any point with $r = a$, the $(x_t, y_t)$-process is unique at least until $r_t$ hits $2a$. Using this and the fact that the $[t_n, t_{n+1}]$ are disjoint, we see that this collection of processes is independent, and that each one enjoys the strong Markov property during its lifetime. Thus, the probability that $\sigma_n$ is finite (that is, the probability that $r_t$ drops to level $a$ and escapes back up to $2a$ without striking zero $n$ times) is less than $(1 - p)^n$. Since this goes to zero as $n \to \infty$, we see that, almost surely, $r_t$ hits zero in finite time, completing the proof. □
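The probability $p$ and the final estimate of the proof above can be made explicit. The display below is a sketch under the assumption that the comparison process is an exact Bessel process of dimension $d < 2$, for which $x \mapsto x^{2-d}$ is a scale function; the hitting times $T_0$ and $T_{2a}$ are notation introduced for this illustration.

```latex
% Bessel process of dimension d < 2 started at a:
% probability of reaching 0 before 2a, from the scale function s(x) = x^{2-d}.
\[
p = P_a\bigl(T_0 < T_{2a}\bigr)
  = \frac{(2a)^{2-d} - a^{2-d}}{(2a)^{2-d} - 0}
  = 1 - 2^{d-2} .
\]
% d = 1 gives p = 1/2, the value quoted for the one-dimensional Bessel process;
% p decreases to 0 as d increases to 2.
% Iterating over the n excursions from level a:
\[
P(\sigma_n < \infty) \le (1-p)^n = 2^{n(d-2)} \longrightarrow 0 \qquad (n \to \infty).
\]
```

In particular $p$ does not depend on $a$, which is consistent with the remark in the proof that the value 1/2 holds for a one-dimensional Bessel process started at any level $l$.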



Acknowledgments

The author gratefully acknowledges support from a Clay Liftoff Fellowship and an NSF Postdoctoral Research Fellowship. I would like to thank Dan Stroock for numerous helpful discussions about Brownian motion and geometry and for comments on an earlier draft of this paper. I would also like to thank Ioannis Karatzas for advice on solving martingale problems (in particular, the proof of Theorem 6.1 of [6] provided an outline for the proof of Theorem 4.5 below) and Michel Émery for comments on an earlier draft of this paper. Finally, I am grateful to Bob Finn for introducing me to minimal surfaces several years ago.

References

[1] G. Pacelli Bessa, Luquésio P. Jorge, G. Oliveira-Filho, Half-space theorems for minimal surfaces with bounded curvature, J. Differential Geom. 57 (3) (2001) 493–508.
[2] Pascal Collin, Robert Kusner, William H. Meeks III, Harold Rosenberg, The topology, geometry and conformal structure of properly embedded minimal surfaces, J. Differential Geom. 67 (2) (2004) 377–393.
[3] E.B. Fabes, D.W. Stroock, The Lp-integrability of Green's functions and fundamental solutions for elliptic and parabolic equations, Duke Math. J. 51 (4) (1984) 997–1016.
[4] D. Hoffman, W.H. Meeks III, The strong halfspace theorem for minimal surfaces, Invent. Math. 101 (2) (1990) 373–377.
[5] Luquésio P. de M. Jorge, Frederico Xavier, A complete minimal surface in R3 between two parallel planes, Ann. of Math. (2) 112 (1) (1980) 203–206.
[6] Ioannis Karatzas, Gittins indices in the dynamic allocation problem for diffusion processes, Ann. Probab. 12 (1) (1984) 173–192.
[7] William H. Meeks III, Minimal surfaces in flat three-dimensional spaces, in: The Global Theory of Minimal Surfaces in Flat Spaces, Martina Franca, 1999, in: Lecture Notes in Math., vol. 1775, Springer, Berlin, 2002, pp. 1–14.
[8] William H. Meeks III, Harold Rosenberg, Maximum principles at infinity, J. Differential Geom. 79 (1) (2008) 141–165.
[9] William H. Meeks III, Joaquín Pérez, Antonio Ros, Liouville-type properties for embedded minimal surfaces, Comm. Anal. Geom. 14 (4) (2006) 703–723.
[10] Nikolai Nadirashvili, Hadamard's and Calabi–Yau's conjectures on negatively curved and minimal surfaces, Invent. Math. 126 (3) (1996) 457–465.
[11] Robert Osserman, A Survey of Minimal Surfaces, second ed., Dover Publications Inc., New York, 1986.
[12] Harold Rosenberg, Intersection of minimal surfaces of bounded curvature, Bull. Sci. Math. 125 (2) (2001) 161–168.
[13] Richard Schoen, Estimates for stable minimal surfaces in three-dimensional manifolds, in: Seminar on Minimal Submanifolds, in: Ann. of Math. Stud., vol. 103, Princeton Univ. Press, Princeton, NJ, 1983, pp. 111–126.
[14] Daniel W. Stroock, An Introduction to the Analysis of Paths on a Riemannian Manifold, Math. Surveys Monogr., vol. 74, Amer. Math. Soc., Providence, RI, 2000.
[15] Daniel W. Stroock, S.R. Srinivasa Varadhan, Multidimensional Diffusion Processes, reprint of the 1997 edition, Classics Math., Springer-Verlag, Berlin, 2006.
[16] Frederico Xavier, Convex hulls of complete minimal surfaces, Math. Ann. 269 (2) (1984) 179–182.


The cubic fourth-order Schrödinger equation

Benoit Pausader

Department of Mathematics, University of Cergy-Pontoise, CNRS UMR 8088, 2, avenue Adolphe Chauvin, 95302 Cergy-Pontoise cedex, France

Received 19 June 2008; accepted 11 November 2008
Available online 28 November 2008
Communicated by I. Rodnianski

Abstract

Fourth-order Schrödinger equations have been introduced by Karpman and Shagalov to take into account the role of small fourth-order dispersion terms in the propagation of intense laser beams in a bulk medium with Kerr nonlinearity. In this paper we investigate the cubic defocusing fourth-order Schrödinger equation $i\partial_t u + \Delta^2 u + |u|^2 u = 0$ in arbitrary space dimension $\mathbb{R}^n$ for arbitrary initial data. We prove that the equation is globally well-posed when $n \le 8$ and ill-posed when $n \ge 9$, with the additional important information that scattering holds true when $5 \le n \le 8$.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Fourth-order dispersive equation; Scattering; Energy-critical equation

E-mail address: [email protected].
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.11.009

B. Pausader / Journal of Functional Analysis 256 (2009) 2473–2517

1. Introduction

Fourth-order Schrödinger equations have been introduced by Karpman [15] and Karpman and Shagalov [16] to take into account the role of small fourth-order dispersion terms in the propagation of intense laser beams in a bulk medium with Kerr nonlinearity. Such fourth-order Schrödinger equations have been studied from the mathematical viewpoint in Fibich, Ilan and Papanicolaou [8], who describe various properties of the equation in the subcritical regime, with part of their analysis relying on very interesting numerical developments. Related references are by Ben-Artzi, Koch, and Saut [3], who gave sharp dispersive estimates for the biharmonic Schrödinger operator, Guo and Wang [11], who proved global well-posedness and scattering in $H^s$ for small data, Hao, Hsiao and Wang [12,13], who discussed the Cauchy problem in a high-regularity setting, and Segata [35], who proved scattering in the case where the space dimension is one. We refer also to Pausader [28,29], where the energy-critical case for radially symmetric initial data is discussed. The defocusing case, as in (1.1) below, is discussed in Pausader [28] for radially symmetric initial data. The focusing case, following the beautiful results of Kenig and Merle [18,19], is settled in Pausader [29], still for radially symmetric initial data. Finally, we note that related equations appear in Fibich, Ilan and Schochet [9], Huo and Jia [14] and Segata [33,34].

We focus in this paper on the study of the initial value problem for the cubic fourth-order defocusing equation in arbitrary space dimension $\mathbb{R}^n$, $n \ge 1$, without assuming radial symmetry for the initial data. The equation is written as

\[ i\partial_t u + \Delta^2 u + |u|^2 u = 0, \tag{1.1} \]

where $u : I \times \mathbb{R}^n \to \mathbb{C}$ is a complex valued function, and $u|_{t=0} = u_0$ is in $H^2$, the space of $L^2$ functions whose first and second derivatives are in $L^2$. The equation is critical when $n = 8$ because of the criticality of the Sobolev embedding $H^2 \subset L^4$ in this dimension, and it enjoys rescaling invariance leaving the energy and $\dot H^2$-norm unchanged. Let $\mathcal{S}$ be the space of Schwartz functions. The theorem we prove in this paper provides a complete picture of global well-posedness for (1.1). It is stated as follows.

Theorem 1.1. Assume $1 \le n \le 8$. Then for any $u_0 \in H^2$ there exists a global solution $u \in C(\mathbb{R}, H^2)$ of (1.1) with initial data $u(0) = u_0$. Moreover, for any $t \in \mathbb{R}$, the mapping $u(0) \mapsto u(t)$ is analytic from $H^2$ into itself. On the contrary, if $n \ge 9$ then the Cauchy problem for (1.1) is ill-posed in $H^2$ in the sense that for any $\varepsilon > 0$, there exist $u_0 \in \mathcal{S}$, $t_\varepsilon \in (0, \varepsilon)$, and $u \in C([0, \varepsilon], H^2)$ a solution of (1.1) with initial data $u_0$ such that $\|u_0\|_{H^2} < \varepsilon$ while $\|u(t_\varepsilon)\|_{H^2} > \varepsilon^{-1}$. Besides, if $5 \le n \le 8$, then scattering holds true in $H^2$ for (1.1) and the scattering operator is analytic.

The fourth-order dispersion scaling property leads to the heuristic that smooth solutions of the free homogeneous equation have an $L^\infty$-norm which decays like $t^{-n/4}$. However, the situation is not so transparent: all frequency parts of the function have an $L^\infty$-norm that decays much faster, like $t^{-n/2}$, but at a rate which depends on the frequency. Uniformly, the rate of decay $t^{-n/4}$ is the best possible, but it is not optimal when the solution is localized in frequency. As one will see, there are various differences between the dispersion behaviors of second-order Schrödinger equations and of (1.1).

Our paper is organised as follows. We fix notations in Section 2 and recall preliminary results from Pausader [28] in Section 3. In Section 4, we prove that the Cauchy problem is ill-posed when $n \ge 9$.
In order to do so we use a low-dispersion regime argument which was essentially given in Christ, Colliander and Tao [6]. We also refer to Lebeau [24,25], Alazard and Carles [1], Carles [4] and Thomann [38,39] for other results in different settings. Starting from Section 5 we focus on the energy-critical case, and so on the $n = 8$ part of our theorem (the equation is subcritical when $n \le 7$). We prove in Section 5, using important ideas of concentration compactness developed in Kenig and Merle [18] and Killip, Tao and Visan [23], that any failure of global well-posedness implies the existence of some special solutions satisfying three possible scenarii. The remaining part of the analysis consists in excluding these hypothetical special solutions working at the level of $\dot H^2$-solutions. The first scenario is that there is a self-similar-like solution. It is not consistent with conservation of energy, conservation of local mass and compactness up to rescaling. We exclude this scenario in Section 6. The two other scenarii are that there is a soliton-like solution or that there is a low-to-high cascade-like solution. In these two scenarii the solution is away from the $L^2$-like region, namely we have that $h \le 1$ with respect to the notation of Theorem 5.1. We use this to prove an interaction Morawetz estimate in Sections 7 and 8, following previous analysis from Colliander, Keel, Staffilani, Takaoka and Tao [7], Ryckman and Visan [32] and Visan [40]. The estimate we prove is not an a priori estimate. A major difficulty is that the estimate scales like the $\dot H^{1/4}$-norm and thus creates a 7/4-difference in scaling with the $\dot H^2$-norm control we have. In Section 9, we exclude soliton-like solutions by proving that they are not consistent with the frequency-localized interaction Morawetz estimates and compactness up to rescaling. The last scenario is excluded in Section 10 by proving that any low-to-high cascade-like solution has an unexpected $L^2$-regularity. Then, conservation of the $L^2$-norm, frequency-localized interaction Morawetz estimates and conservation of energy allow us to exclude the existence of low-to-high cascade-like solutions. Finally, in Section 11, we prove the scattering part of Theorem 1.1. As a remark, with the arguments we develop here and adaptations of the analysis in Visan [40], global well-posedness and scattering in Theorem 1.1 continue to hold true when $n \ge 8$ and the cubic nonlinearity is replaced by the $n$-dimensional energy-critical nonlinearity with total power $(n + 4)/(n - 4)$. We also refer to Miao, Xu and Zhao [27] for another proof in high dimensions $n \ge 9$ following previous work by Killip and Visan [22]. For radially symmetrical data, see Pausader [28], this is also true in any dimension $n \ge 5$.

2. Notations

We fix notations we use throughout the paper. In what follows, we write $A \lesssim B$ to signify that there exists a constant $C$ depending only on $n$ such that $A \le CB$.
When the constant $C$ depends on other parameters, we indicate this by a subscript, for example, $A \lesssim_u B$ means that the constant may depend on $u$. Similar notations hold for $\gtrsim$. Similarly we write $A \simeq B$ when $A \lesssim B \lesssim A$. We let $L^q = L^q(\mathbb{R}^n)$ be the usual Lebesgue spaces, and $L^r(I, L^q)$ be the space of measurable functions from an interval $I \subset \mathbb{R}$ to $L^q$ whose $L^r(I, L^q)$ norm is finite, where

$$\|u\|_{L^r(I,L^q)} = \left(\int_I \|u(t)\|_{L^q}^r\,dt\right)^{1/r}.$$

When there is no risk of confusion we may write $L^qL^r$ instead of $L^q(I, L^r)$. Two important conserved quantities of Eq. (1.1) are the mass and the energy. The mass is defined by

$$M(u) = \int_{\mathbb{R}^n} |u(x)|^2\,dx \tag{2.1}$$

and the energy is defined by

$$E(u) = \int_{\mathbb{R}^n} \left(\frac{|\Delta u(x)|^2}{2} + \frac{|u(x)|^4}{4}\right)dx. \tag{2.2}$$
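The rescaling invariance mentioned in the Introduction can be checked directly on (2.1) and (2.2). The following is a short verification sketch; the notation $u_\lambda$ is introduced here only for illustration and is not taken from the paper, and the rescaling is the one appearing in (4.11) with $\nu = 1$.

```latex
% Rescaling that preserves (1.1): for \lambda > 0, set
%   u_\lambda(t,x) = \lambda^2 u(\lambda^4 t, \lambda x).
% (The notation u_\lambda is an assumption made for this sketch.)
\begin{align*}
\|\Delta u_\lambda(t)\|_{L^2}^2
  &= \lambda^{4}\lambda^{4}\lambda^{-n}\,\|\Delta u(\lambda^4 t)\|_{L^2}^2
   = \lambda^{8-n}\,\|\Delta u(\lambda^4 t)\|_{L^2}^2,\\
\|u_\lambda(t)\|_{L^4}^4
  &= \lambda^{8}\lambda^{-n}\,\|u(\lambda^4 t)\|_{L^4}^4
   = \lambda^{8-n}\,\|u(\lambda^4 t)\|_{L^4}^4,\\
M(u_\lambda(t))
  &= \lambda^{4}\lambda^{-n}\,M(u(\lambda^4 t))
   = \lambda^{4-n}\,M(u(\lambda^4 t)).
\end{align*}
% Hence E(u_\lambda) = \lambda^{8-n} E(u): the energy is scale invariant
% exactly when n = 8, while the mass is not invariant in that dimension.
```

In particular, for $n = 8$ the energy and the $\dot H^2$-norm are unchanged by this rescaling, which is the sense in which (1.1) is energy-critical in that dimension.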

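In what follows we let $\mathcal{F}f = \hat f$ be the Fourier transform of $f$ given by

$$\hat f(\xi) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} f(y)\,e^{i\langle y,\xi\rangle}\,dy$$

for all $\xi \in \mathbb{R}^n$. The biharmonic Schrödinger semigroup is defined for any tempered distribution $g$ by

$$e^{it\Delta^2}g = \mathcal{F}^{-1}e^{it|\xi|^4}\mathcal{F}g. \tag{2.3}$$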
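As a quick consistency check of (2.3), one can verify on the Fourier side that $u(t) = e^{it\Delta^2}g$ solves the free equation. The sketch below assumes $g$ is a Schwartz function so that all manipulations are justified; it only uses the fact that $\Delta$ acts as multiplication by $-|\xi|^2$ on the Fourier side.

```latex
\begin{align*}
\hat u(t,\xi) &= e^{it|\xi|^4}\,\hat g(\xi),\\
i\partial_t \hat u(t,\xi) &= -|\xi|^4\,\hat u(t,\xi),\\
\widehat{\Delta^2 u}(t,\xi) &= (-|\xi|^2)^2\,\hat u(t,\xi) = |\xi|^4\,\hat u(t,\xi),\\
\text{so}\quad i\partial_t u + \Delta^2 u &= 0 .
\end{align*}
% Since |e^{it|\xi|^4}| = 1, Plancherel gives, for every real s,
%   \| e^{it\Delta^2} g \|_{\dot H^s} = \| g \|_{\dot H^s}.
```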

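Let $\psi \in C_c^\infty(\mathbb{R}^n)$ be supported in the ball $B(0, 2)$, and such that $\psi = 1$ in $B(0, 1)$. For any dyadic number $N = 2^k$, $k \in \mathbb{Z}$, we define the following Littlewood–Paley operators:

$$\widehat{P_{\le N}f}(\xi) = \psi(\xi/N)\hat f(\xi),\qquad \widehat{P_{>N}f}(\xi) = \bigl(1 - \psi(\xi/N)\bigr)\hat f(\xi),\qquad \widehat{P_Nf}(\xi) = \bigl(\psi(\xi/N) - \psi(2\xi/N)\bigr)\hat f(\xi). \tag{2.4}$$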
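Two elementary consequences of (2.4) are used implicitly whenever Littlewood–Paley pieces are summed. They are recorded here as a sketch; the symbol $m_N$ for the multiplier of $P_N$ is introduced only for this illustration.

```latex
% m_N(\xi) := \psi(\xi/N) - \psi(2\xi/N)   (hypothetical name for the symbol of P_N)
\begin{align*}
P_N &= P_{\le N} - P_{\le N/2}, \qquad P_{>N} = \mathrm{Id} - P_{\le N},\\
m_N(\xi) &= 0 \quad\text{if } |\xi|\le N/2 \text{ or } |\xi|\ge 2N,\\
\sum_{j=0}^{K} P_{2^{-j}N} &= P_{\le N} - P_{\le 2^{-K-1}N}.
\end{align*}
% First line: \psi(2\xi/N) = \psi(\xi/(N/2)) is the symbol of P_{\le N/2}.
% Second line: \psi = 1 on B(0,1) and \psi = 0 outside B(0,2).
% Third line: the sum telescopes.
```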

Similarly we define P 0 such that for any initial data $u_0 \in \dot H^2$, and any interval $I = [0, T]$, if

$$\|e^{it\Delta^2}u_0\|_{W(I)} < \delta, \tag{3.5}$$


then there exists a unique solution $u \in C(I, \dot H^2)$ of (1.1) with initial data $u_0$. This solution has conserved energy, and satisfies $u \in \dot S^2(I)$. Moreover,

$$\|u\|_{\dot S^2(I)} \lesssim \|u_0\|_{\dot H^2} + \delta^3, \tag{3.6}$$

and if $u_0 \in H^2$, then $u \in \dot S^0(I) \cap \dot S^2(I)$, $\|u\|_{\dot S^0(I)} \lesssim \|u_0\|_{L^2}$, and $u$ has conserved mass. Besides, in this case, the solution depends continuously on the initial data in the sense that there exists $\delta_0$, depending on $\delta$, such that, for any $\delta_1 \in (0, \delta_0)$, if $\|v_0 - u_0\|_{H^2} \le \delta_1$, and if we let $v$ be the local solution of (1.1) with initial data $v_0$, then $v$ is defined on $I$ and $\|u - v\|_{\dot S^0(I)} \lesssim \delta_1$.

In addition to Proposition 3.2 we also have Proposition 3.3.

Proposition 3.3. Let $n = 8$, $I \subset \mathbb{R}$ be a compact time interval such that $0 \in I$, and $\tilde u$ be an approximate solution of (1.1) in the sense that

$$i\partial_t \tilde u + \Delta^2 \tilde u + |\tilde u|^2\tilde u = e \tag{3.7}$$

for some $e \in N(I)$. Assume that $\|\tilde u\|_{Z(I)} < +\infty$ and $\|\tilde u\|_{L^\infty(I,\dot H^2)} < +\infty$. There exists $\delta_0 > 0$, $\delta_0 = \delta_0(\Lambda, \|\tilde u\|_{Z(I)}, \|\tilde u\|_{L^\infty(I,\dot H^2)})$, such that if $\|e\|_{N(I)} \le \delta$, and $u_0 \in \dot H^2$ satisfies

$$\|\tilde u(0) - u_0\|_{\dot H^2} \le \Lambda \quad\text{and}\quad \bigl\|e^{it\Delta^2}\bigl(\tilde u(0) - u_0\bigr)\bigr\|_{W(I)} \le \delta \tag{3.8}$$

for some $\delta \in (0, \delta_0]$, then there exists $u \in C(I, \dot H^2)$ a solution of (1.1) such that $u(0) = u_0$. Moreover, $u$ satisfies

$$\|u - \tilde u\|_{W(I)} \le C\delta,\qquad \|u - \tilde u\|_{\dot S^2} \le C(\Lambda + \delta),\qquad\text{and}\qquad \|u\|_{\dot S^2} \le C, \tag{3.9}$$

where $C = C(\Lambda, \|\tilde u\|_{Z(I)}, \|\tilde u\|_{L^\infty(I,\dot H^2)})$ is a nondecreasing function of its arguments.

In our analysis, we need to consider $\dot H^2$-solutions. These solutions do not satisfy conservation of mass. However the next proposition shows that there is still something remaining from that conservation law for these solutions. Proposition 3.4 shows that the local mass of a solution of (1.1) varies slowly in time provided that the radius $R$ is sufficiently large. We define the local mass $M(u, B(x_0, R))$ over the ball $B(x_0, R)$ of a function $u \in L^2_{loc}$ by

$$M\bigl(u, B(x_0, R)\bigr) = \int_{\mathbb{R}^n} |u(x)|^2\,\psi^4\bigl((x - x_0)/R\bigr)\,dx, \tag{3.10}$$

where $\psi$ is as in (2.4). Proposition 3.4, from Pausader [28], is stated as follows.

Proposition 3.4. Let $n \ge 5$, and $u \in C(I, \dot H^2)$ be a solution of (1.1). Then we have that

$$\bigl|\partial_t M\bigl(u(t), B(x_0, R)\bigr)\bigr| \lesssim \frac{E(u)^{3/4}}{R}\,M\bigl(u(t), B(x_0, R)\bigr)^{1/4} \tag{3.11}$$

for all $t \in I$. We refer to Pausader [28] for a proof of the above propositions.

4. Ill-posedness results

In this section we use a quantitative analysis of the small dispersion regime to prove ill-posedness results for the cubic equation when $n > 8$. The idea is that now the equation is supercritical with respect to the regularity-setting in which we work, namely $H^2$. Hence one can always use rescaling arguments to make any “separation-mechanism” between two different solutions happen sooner and sooner while making the $H^2$-norm smaller and smaller. It remains then to find two solutions whose distance goes to $\infty$ as time evolves. To achieve this, we follow the proof in Christ, Colliander and Tao [6] by considering the small dispersion regime. See also Lebeau [24,25] for previous results, and Alazard and Carles [1], Carles [4] and Thomann [38,39] for instability results in different contexts. Before we prove our theorem, we need the following lemma concerning the small dispersion regime.

Lemma 4.1. Let $k > n/2$. Then, for any $\phi \in \mathcal{S}$, there exists $c > 0$ such that for any $\nu \in (0, 1)$, there exists a unique solution $w^\nu \in C([-T, T], H^k)$ of the problem

$$i\partial_t w + \nu^4\Delta^2 w + |w|^2 w = 0 \tag{4.1}$$

with initial data $w^\nu(0) = \phi$, where $T = c|\log \nu|^c$. Besides, the solution satisfies $w^\nu \in C([-T, T], H^p)$ for any $p$, and

$$\|w^\nu - w^0\|_{L^\infty([-T,T],H^k)} \lesssim_{\phi,k} \nu^3, \tag{4.2}$$

where

$$w^0(t, x) = \phi(x)\exp\bigl(i|\phi(x)|^2 t\bigr) \tag{4.3}$$

is a solution of the ODE formally obtained by setting $\nu = 0$ in (4.1).

Proof. Letting $u = w^\nu - w^0$, we see that $u$ solves the Cauchy problem

$$i\partial_t u + \nu^4\Delta^2 u = -\nu^4\Delta^2 w^0 + |w^0|^2 w^0 - |w^0 + u|^2(w^0 + u) \tag{4.4}$$

with $u(0) = 0$. Let $k > n/2$ be given. Since $w^0 \in C^\infty(\mathcal{S})$, standard developments ensure that there exists a unique solution $u \in C([-t, t], H^k)$ to (4.4), and that $u$ can be continued as long as $\|u\|_{H^k}$ remains bounded. Besides, $u \in C([-t, t], H^p)$ for any $p \ge 0$ (in the sense that $t$ does not depend on $p$). Consequently, it suffices to prove that there exists $c > 0$ such that for any $s < c|\log \nu|^c$, we have that $\|u(s)\|_{H^k} \lesssim \nu^3$. Now, taking derivatives $\partial^\alpha$ of Eq. (4.4), multiplying by $\partial^\alpha \bar u$, taking the imaginary part and integrating, for all $\alpha$ such that $|\alpha| \le k$, we get that

$$\partial_s\|u(s)\|_{H^k}^2 \lesssim \|u\|_{H^k}\Bigl(\nu^4\|\Delta^2 w^0(s)\|_{H^k} + \bigl\||w^0 + u|^2(w^0 + u) - |w^0|^2 w^0\bigr\|_{H^k}\Bigr). \tag{4.5}$$

By (4.3) we see that, for $p \ge 0$,

$$\|w^0\|_{H^p} \lesssim_{\phi,p} \langle t\rangle^p. \tag{4.6}$$

Independently, since $H^k$ is an algebra, we get that

$$\bigl\||w^0 + u|^2(w^0 + u) - |w^0|^2 w^0\bigr\|_{H^k} \lesssim \Bigl\|\sum_{j=0}^{2} O\bigl((w^0)^j u^{3-j}\bigr)\Bigr\|_{H^k} \lesssim \|u\|_{H^k}\bigl(1 + \|w^0\|_{H^k} + \|u\|_{H^k}\bigr)^2. \tag{4.7}$$

Now, using (4.5)–(4.7), we see that, in the sense of distributions,

$$\partial_s\|u(s)\|_{H^k} \lesssim_{\phi,k} \nu^4\bigl(1 + |s|^{k+4}\bigr) + \|u(s)\|_{H^k}\bigl(1 + |s|^k + \|u(s)\|_{H^k}\bigr)^2. \tag{4.8}$$

An application of Gronwall's lemma gives the bound

$$\|u(s)\|_{H^k} \lesssim_{k,\phi} \nu^4\exp\bigl(C(1 + |s|^C)\bigr) \tag{4.9}$$

for all $s$ such that $\|u(s)\|_{H^k} \le 1$. By (4.9) we see that $\|u(s)\|_{H^k} \le 1$ holds for all times $|s| \le c|\log \nu|^c$, $c > 0$ sufficiently small. This gives (4.2) and finishes the proof of Lemma 4.1. □

Now, we are in position to prove the main theorem of this section, which states that the flow map $u_0 \mapsto u(t)$ from $H^2$ into $H^2$, which maps the initial data to the associated solution, fails to be continuous at 0. As a remark, note that (4.10) is false when $n \le 8$ since the $H^2$-norm controls the energy.

Theorem 4.1. Let $n > 8$. Given $\varepsilon > 0$, there exists a solution $u \in C([0, \varepsilon], H^2)$ such that

$$\|u(0)\|_{H^2} < \varepsilon \quad\text{and}\quad \|u(t_\varepsilon)\|_{H^2} > \varepsilon^{-1}, \tag{4.10}$$

for some $t_\varepsilon \in (0, \varepsilon)$. Besides, we can choose $u$ such that $u(0) \in \mathcal{S}$ and $u \in C([0, \varepsilon], H^k)$ for any $k > 0$.

Proof. For $\phi \in \mathcal{S}$ and $\nu \in (0, 1]$, we let $w^\nu$ be the solution of Eq. (4.1) with initial data $w^\nu(0) = \phi$. By Lemma 4.1, we see that for $|s| \le c|\log \nu|^c$, (4.2) holds true for $w^0$ as in (4.3). Now, for $\lambda \in (0, \infty)$, we let

$$u^{(\nu,\lambda)}(t, x) = \lambda^2 w^\nu\bigl(\lambda^4 t, \lambda\nu x\bigr). \tag{4.11}$$
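The rescaling (4.11) is designed so that the factor $\nu^4$ in (4.1) is absorbed by the spatial dilation. A brief verification, written as a sketch with the shorthand $(s, y) = (\lambda^4 t, \lambda\nu x)$ (a notation introduced here for convenience), reads:

```latex
\begin{align*}
i\partial_t u^{(\nu,\lambda)}(t,x) &= \lambda^{6}\,(i\partial_s w^\nu)(s,y),\\
\Delta^2 u^{(\nu,\lambda)}(t,x) &= \lambda^{2}(\lambda\nu)^{4}\,(\Delta^2 w^\nu)(s,y)
   = \lambda^{6}\nu^{4}\,(\Delta^2 w^\nu)(s,y),\\
|u^{(\nu,\lambda)}|^2u^{(\nu,\lambda)}(t,x) &= \lambda^{6}\,(|w^\nu|^2 w^\nu)(s,y),\\
\text{hence}\quad
\bigl(i\partial_t + \Delta^2\bigr)u^{(\nu,\lambda)} + |u^{(\nu,\lambda)}|^2u^{(\nu,\lambda)}
 &= \lambda^{6}\bigl(i\partial_s w^\nu + \nu^4\Delta^2 w^\nu + |w^\nu|^2w^\nu\bigr)(s,y) = 0 .
\end{align*}
```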

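Then $u^{(\nu,\lambda)}$ solves (1.1) with initial data $u^{(\nu,\lambda)}(0, x) = \lambda^2\phi(\lambda\nu x)$. A simple calculation gives

$$\begin{aligned}
\|u^{(\nu,\lambda)}(0)\|_{H^2}^2 &= \frac{\lambda^4(\lambda\nu)^{-2n}}{(2\pi)^n}\int_{\mathbb{R}^n}\bigl|\hat\phi\bigl(\xi/(\lambda\nu)\bigr)\bigr|^2\bigl(1 + |\xi|^2\bigr)^2\,d\xi\\
&\lesssim \lambda^4(\lambda\nu)^{-n}\Bigl(\int_{\mathbb{R}^n}|\hat\phi(\eta)|^2|\lambda\nu\eta|^4\,d\eta + \int_{\mathbb{R}^n}|\hat\phi(\eta)|^2\,d\eta\Bigr)\\
&\lesssim_\phi \lambda^4(\lambda\nu)^{4-n},
\end{aligned} \tag{4.12}$$

provided that $\lambda\nu \ge 1$. Now, given $\varepsilon > 0$, and $\nu > 0$, we fix

$$\lambda = \lambda_{\nu,\varepsilon} = \bigl(\varepsilon^2\nu^{n-4}\bigr)^{-\frac{1}{n-8}} \tag{4.13}$$

such that $\lambda^4(\lambda\nu)^{4-n} = \varepsilon^2$, and $\lambda\nu = (\varepsilon\nu^2)^{-\frac{2}{n-8}} > 1$. Independently, by (4.3), we see that

$$\|w^0(t)\|_{\dot H^2} \simeq_\phi t^2 + O(t),$$

and, consequently, using (4.2), we get that for $|s| \le c|\log \nu|^c$ sufficiently large independently of $\nu$, there holds that

$$\|w^\nu(s)\|_{\dot H^2} \gtrsim_\phi s^2. \tag{4.14}$$

Consequently, using (4.11), (4.13) and (4.14) we get that

$$\|u^{(\nu,\lambda)}(\lambda^{-4}t)\|_{H^2}^2 \ge \|u^{(\nu,\lambda)}(\lambda^{-4}t)\|_{\dot H^2}^2 = \lambda^4(\lambda\nu)^{4-n}\|w^\nu(t)\|_{\dot H^2}^2 \gtrsim_\phi \varepsilon^2 t^4 \tag{4.15}$$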

for $t$ sufficiently large. Now, given $\varepsilon$, we let $\nu > 0$ be sufficiently small such that

$$\varepsilon^2 t_\nu^4 > \varepsilon^{-2},\quad\text{for } t_\nu = c|\log \nu|^c,\quad\text{and}\quad \varepsilon^{\frac{16-n}{n-8}}\,\nu^{\frac{4(n-4)}{n-8}} < \varepsilon. \tag{4.16}$$

We choose $\lambda = \lambda_{\nu,\varepsilon}$ as in (4.13). Using (4.16), we get that $t_\varepsilon = \lambda^{-4}t_\nu < \varepsilon$, and then (4.12) and (4.15) give (4.10). This finishes the proof. □

5. Reduction to three scenarii

From now on we start with the analysis of the energy-critical case $n = 8$. In this section we prove that the analysis can be reduced to the study of some very special solutions. In order to do so, we borrow ideas from previous works developed in the context of Schrödinger and wave equations by Bahouri and Gerard [2], Kenig and Merle [18], Keraani [21], Killip, Tao and Visan [23], and Tao, Visan and Zhang [37]. We refer also to Pausader [30] for a similar result


developed in the context of the $L^2$-critical fourth-order Schrödinger equation. For any $E > 0$, we let

$$\Lambda(E) = \sup\bigl\{\|u\|_{Z(I)}^6 :\ E(u) \le E\bigr\}, \tag{5.1}$$

where the supremum is taken over all maximal-lifespan solutions $u \in C(I, \dot H^2)$ of (1.1) satisfying $E(u) \le E$. In light of Proposition 3.2 and of the Strichartz estimates (3.2), we know that there exists $\delta > 0$ such that, for any $E \le \delta$, $\Lambda(E) \lesssim_\delta E < +\infty$. Besides, $\Lambda$ is clearly an increasing function of $E$. Hence, we can define

$$E_{max} = \sup\bigl\{E > 0 :\ \Lambda(E) < \infty\bigr\}. \tag{5.2}$$

The goal in Sections 5–10 is to prove that $E_{max} = +\infty$. Theorem 5.1 below is a first step in this direction.

Theorem 5.1. Suppose that $E_{max} < +\infty$. There exists $u \in C(I, \dot H^2)$ a maximal-lifespan solution of energy exactly $E_{max}$ such that the $Z(I')$-norm of $u$ is infinite for $I' = (T_*, 0)$ and $I' = (0, T^*)$, where $I = (T_*, T^*)$. Besides, there exist two smooth functions $h : I \to \mathbb{R}_+^*$ and $x : I \to \mathbb{R}^n$ such that

$$K = \bigl\{g_{(h(t),x(t))}u(t) :\ t \in I\bigr\} \tag{5.3}$$

is precompact in $\dot H^2$, where the transformation $g(t) = g_{(h(t),x(t))}$ is as in (2.11). Furthermore, one can assume that one of the following three scenarii holds true:

(soliton-like solution) there holds $I = \mathbb{R}$ and $h(t) = 1$ for all $t$;
(double low-to-high cascade) there holds $\liminf_{t \to \bar T} h(t) = 0$ for $\bar T = T_*, T^*$, and $h(t) \le 1$ for all $t$;
(self-similar solution) there holds $I = (0, +\infty)$ and $h(t) = t^{1/4}$ for all $t$.

As a remark, since $E(u) = E_{max}$, the solution $u$ in Theorem 5.1 is such that $u \ne 0$. Assuming Propositions 6.1, 9.1 and 10.1 which exclude the three scenarii in Theorem 5.1, the following corollary holds true.

Corollary 5.1. For any $E > 0$, there exists $C = C(E)$ such that, for any $u_0 \in \dot H^2$ satisfying $E(u_0) \le E$, if $u \in C(I, \dot H^2)$ is the maximal solution of (1.1) with initial data $u(0) = u_0$, then $I = \mathbb{R}$ and $\|u\|_{\dot S^2(\mathbb{R})} \le C$.

Proof. First, using [28, Proposition 2.6], we see that a bound on the $Z$-norm of $u$ implies a bound on the $\dot S^2$-norm of $u$. Hence if Corollary 5.1 is false, then $E_{max} < +\infty$. Applying Theorem 5.1, we find a maximal solution satisfying one of the three scenarii in Theorem 5.1. Then, using Propositions 6.1, 9.1 and 10.1, we get a contradiction. Hence $E_{max} = +\infty$. □

Now we prove Theorem 5.1.

Proof of Theorem 5.1. In several ways the proof is similar to the one developed in the $L^2$-critical case in Pausader [30]. We prove the more general statement that Theorem 5.1 holds true

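in any dimension $n \ge 5$ when (1.1) is replaced by the $\dot H^2$-critical equation. In particular, this is the case when $n = 8$. Therefore, in this proof, (1.1) always refers to the energy-critical equation in dimension $n$, and the energy $E$ and $\Lambda$ must be replaced by

$$E(u) = \int_{\mathbb{R}^n}\Bigl(\frac{1}{2}|\Delta u(x)|^2 + \frac{n-4}{2n}|u(x)|^{\frac{2n}{n-4}}\Bigr)dx \quad\text{and}\quad \Lambda(E) = \sup\bigl\{\|u\|_Z^{\frac{2(n+4)}{n-4}} :\ E(u) \le E\bigr\},$$

where the supremum is taken over all maximal solutions of the energy-critical equation of energy less than or equal to $E$. Besides, the definition of $\tau$ and $g$ as in (2.10) and (2.11) and Propositions 3.2 and 3.3 refer to their $n$-dimensional energy-critical counterparts. A consequence of the refined Sobolev inequality in Gerard, Meyer and Oru [10] and of the Strichartz estimates (3.2) is that, for any $u_0 \in \dot H^2$,

$$\begin{aligned}
\|e^{it\Delta^2}u_0\|_{Z(\mathbb{R})} &\lesssim \bigl\|e^{it\Delta^2}|\nabla|u_0\bigr\|^{\frac{n-4}{n-2}}_{L^{\frac{2(n+4)}{n-2}}L^{\frac{2(n+4)}{n-2}}}\,\bigl\|e^{it\Delta^2}|\nabla|u_0\bigr\|^{\frac{2}{n-2}}_{L^\infty L^{\frac{2n}{n-2}}}\\
&\lesssim \|u_0\|^{\frac{n-4}{n-2}}_{\dot H^2}\,\bigl\|e^{it\Delta^2}|\nabla|u_0\bigr\|^{\frac{2}{n}}_{L^\infty\dot H^1}\,\bigl\|e^{it\Delta^2}|\nabla|u_0\bigr\|^{\frac{4}{n(n-2)}}_{L^\infty\dot B^1_{2,\infty}}\\
&\lesssim \|u_0\|^{\frac{n^2-2n-4}{n(n-2)}}_{\dot H^2}\,\|u_0\|^{\frac{4}{n(n-2)}}_{\dot B^2_{2,\infty}},
\end{aligned} \tag{5.4}$$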

where for $s = 1, 2$, $\dot B^s_{2,\infty}$ is a standard homogeneous Besov space. Now, thanks to (5.4), we may follow the analysis in Bahouri and Gerard [2] and Keraani [21]. In the following, we call scale-core a sequence $(h_k, t_k, x_k)$ such that for every $k$, $h_k > 0$, $t_k \in \mathbb{R}$ and $x_k \in \mathbb{R}^n$. Mimicking the proof in Keraani [21] we obtain that for $(v_k)_k$ a bounded sequence in $\dot H^2$, there exists a sequence $(V^\alpha)_\alpha$ in $\dot H^2$, and scale-cores $(h_k^\alpha, t_k^\alpha, x_k^\alpha)$ such that for any $\alpha \ne \beta$,

$$\Bigl|\log\frac{h_k^\alpha}{h_k^\beta}\Bigr| + (h_k^\alpha)^4\bigl|t_k^\alpha - t_k^\beta\bigr| + h_k^\alpha\bigl|x_k^\alpha - x_k^\beta\bigr| \to +\infty \tag{5.5}$$

as $k \to +\infty$, with the property that, up to a subsequence, for any $A \ge 1$,

$$v_k = \sum_{\alpha=1}^{A} g_{(h_k^\alpha,x_k^\alpha)}\,e^{-i(h_k^\alpha)^4 t_k^\alpha\Delta^2}V^\alpha + w_k^A \tag{5.6}$$

for all $k$, where $w_k^A \in \dot H^2$ for all $k$ and $A$, and

$$\lim_{A\to+\infty}\limsup_{k\to+\infty}\bigl\|e^{it\Delta^2}w_k^A\bigr\|_Z = 0. \tag{5.7}$$

Moreover, we have the following estimates:


$$\bigl\|e^{it\Delta^2}v_k\bigr\|_Z^{\frac{2(n+4)}{n-4}} = \sum_{\alpha=1}^{+\infty}\bigl\|e^{it\Delta^2}V^\alpha\bigr\|_Z^{\frac{2(n+4)}{n-4}} + o(1) \quad\text{and}\quad E(v_k) = \sum_{\alpha=1}^{A}E\bigl(e^{-i(h_k^\alpha)^4 t_k^\alpha\Delta^2}V^\alpha\bigr) + \|w_k^A\|_{\dot H^2}^2 + o(1) \tag{5.8}$$

for all $k$, where $o(1) \to 0$ as $k \to +\infty$. Let $(V, (h_k)_k, (t_k)_k, (x_k)_k)$ be such that $V \in \dot H^2$, and $(h_k, t_k, x_k) \in \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R}^n$ is a scale-core such that $h_k^4 t_k$ has a limit $l \in [-\infty, +\infty]$ as $k \to +\infty$. We say that $U$ is the nonlinear profile associated to $(V, (h_k)_k, (t_k)_k, (x_k)_k)$ if $U$ is a solution of (1.1) defined on a neighborhood of $-l$, and

$$\bigl\|U(-h_k^4 t_k) - e^{-ih_k^4 t_k\Delta^2}V\bigr\|_{\dot H^2} \to 0$$

as $k \to +\infty$. Using the analysis in Pausader [28], it is easily seen that a nonlinear profile always exists and is unique. Besides if

$$E(U) = \lim_k E\bigl(e^{-ih_k^4 t_k\Delta^2}V\bigr) \tag{5.9}$$

is such that $E(U) < E_{max}$, then the associated nonlinear profile $U$ is globally defined, and $\|U\|_{\dot S^2(\mathbb{R})} \lesssim_{E(U)} 1$. Now, we enter more specifically into the proof of Theorem 5.1. A consequence of Proposition 3.3 is that there exists a sequence of nonlinear solutions $u_k$ such that $E(u_k) < E_{max}$, $E(u_k) \to E_{max}$, and

$$\|u_k\|_{Z(-\infty,0)},\ \|u_k\|_{Z(0,+\infty)} \to +\infty. \tag{5.10}$$

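We let $((h_k^\alpha)_k, (t_k^\alpha)_k, (x_k^\alpha)_k) = (h^\alpha, z^\alpha)$, $V^\alpha$, and $w^A$ be given by (5.6) applied to the sequence $(v_k = u_k(0))_k$. Passing to subsequences, and using a diagonal extraction argument, we can assume that, for all $\alpha$, $(h_k^\alpha)^4 t_k^\alpha$ has a limit in $[-\infty, \infty]$. We let $U^\alpha$ be the nonlinear profile associated to $(V^\alpha, h^\alpha, z^\alpha)$. Suppose first that there exists $\alpha$ such that $0 < E(U^\alpha) < E_{max}$. Then, applying (5.8) and (5.9), we see that there exists $\varepsilon > 0$ such that for any $\beta$, $E(U^\beta) < E_{max} - \varepsilon$, and we get that all the nonlinear profiles are globally defined. Letting $W_k^A(t) = e^{it\Delta^2}w_k^A$, we remark that

$$p_k^A = \sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha + W_k^A$$

satisfies (3.7) with

$$e = e_k^A = f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha + W_k^A\Bigr) - \sum_{\alpha=1}^{A}f\bigl(\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\bigr)$$

and initial data $p_k^A(0) = u_k(0) + o_A(1)$, where $f(x) = |x|^{\frac{8}{n-4}}x$. First, we claim that

$$\limsup_k\Bigl\|\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr\|_Z \lesssim_{E_{max},\varepsilon} 1 \tag{5.11}$$

independently of $A$. Indeed, we remark that when $(h_k^\alpha, t_k^\alpha, x_k^\alpha)$ and $(h_k^\beta, t_k^\beta, x_k^\beta)$ satisfy (5.5), then for any $u$, $v$ with finite $Z$-norm, there holds that

$$\Bigl\||\tau_{(h_k^\beta,t_k^\beta,x_k^\beta)}v|^{\frac{n+12}{n-4}}\,\tau_{(h_k^\alpha,t_k^\alpha,x_k^\alpha)}u\Bigr\|_{L^1(\mathbb{R},L^1)} \to 0 \tag{5.12}$$

as $k \to +\infty$, where $\tau_{(h_k,t_k,x_k)}$ is as in (2.10). Now, since $\Lambda$ is sublinear around 0, and bounded on $[0, E_{max} - \varepsilon]$, using (5.8) and (5.12), we get that

$$\begin{aligned}
\limsup_k\Bigl\|\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr\|_Z &= \Bigl(\sum_{\alpha=1}^{A}\|U^\alpha\|_Z^{\frac{2(n+4)}{n-4}}\Bigr)^{\frac{n-4}{2(n+4)}} \le \Bigl(\sum_{\alpha=1}^{A}\Lambda\bigl(E(U^\alpha)\bigr)\Bigr)^{\frac{n-4}{2(n+4)}}\\
&\lesssim_{E_{max},\varepsilon} \Bigl(\sum_{\alpha=1}^{A}E(U^\alpha)\Bigr)^{\frac{n-4}{2(n+4)}} \lesssim_{E_{max},\varepsilon} 1.
\end{aligned}$$

Using again (5.12), we get that

$$\Bigl\|f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr) - \sum_{\alpha=1}^{A}f\bigl(\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\bigr)\Bigr\|_{L^2(\mathbb{R},L^2)} = o_A(1) \tag{5.13}$$

as $k \to +\infty$. On the other hand, using the blow-up criterion in Pausader [28, Proposition 2.6], and the bound $\|U^\alpha\|_Z \le \Lambda(E(U^\alpha)) \le \Lambda(E_{max} - \varepsilon)$, we get that, for any $\alpha$, $\|U^\alpha\|_M \lesssim_{E_{max},\varepsilon} 1$. Using the Leibniz and chain rules for fractional derivatives in Kato [17] and Visan [40, Appendix A], we obtain that

$$\Bigl\|f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr) - \sum_{\alpha=1}^{A}f\bigl(\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\bigr)\Bigr\|_{L^2\bigl(\mathbb{R},\dot H^{\frac{n+8}{n+4},\frac{2n(n+4)}{n^2+6n+16}}\bigr)} \lesssim_{A,E_{max},\varepsilon} 1. \tag{5.14}$$

Interpolating between (5.13) and (5.14), we get that

$$\Bigl\|f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr) - \sum_{\alpha=1}^{A}f\bigl(\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\bigr)\Bigr\|_N = o_A(1). \tag{5.15}$$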


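Now, we claim that, letting $s_k^A = \sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha$, there holds that

$$\limsup_k\|s_k^A\|_M \lesssim_{E_{max},\varepsilon} 1, \tag{5.16}$$

independently of $A$. Indeed, $s_k^A$ satisfies the equation

$$i\partial_t s_k^A + \Delta^2 s_k^A + \sum_{\alpha=1}^{A}f\bigl(\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\bigr) = 0,$$

with initial data

$$s_k^A(0) = \sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha(0) = \sum_{\alpha=1}^{A}g_{(h_k^\alpha,x_k^\alpha)}\,e^{-i(h_k^\alpha)^4 t_k^\alpha\Delta^2}V^\alpha + o_A(1),$$

and consequently (5.8) and (5.9) give that

$$\|s_k^A(0)\|_{\dot H^2}^2 \le 2E\bigl(s_k^A(0)\bigr) \lesssim_{E_{max}} 1 + o_A(1).$$

Using the Strichartz estimates (3.2), (5.11) and (5.15), we get that

$$\begin{aligned}
\|s_k^A\|_M &\lesssim \|s_k^A(0)\|_{\dot H^2} + \Bigl\|\sum_{\alpha=1}^{A}f\bigl(\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\bigr)\Bigr\|_N\\
&\lesssim E\bigl(s_k^A(0)\bigr)^{\frac12} + o_A(1) + \Bigl\|f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr)\Bigr\|_N\\
&\lesssim_{E_{max}} 1 + o_A(1) + \|s_k^A\|_Z^{\frac{8}{n-4}}\|s_k^A\|_W\\
&\lesssim_{E_{max}} 1 + o_A(1) + \|s_k^A\|_Z^{\frac{8}{n-4}}\|s_k^A\|_Z^{\frac12}\|s_k^A\|_M^{\frac12}\\
&\lesssim_{E_{max},\varepsilon} 1 + o_A(1) + \|s_k^A\|_M^{\frac12}\\
&\lesssim_{E_{max},\varepsilon} 1 + o_A(1)
\end{aligned} \tag{5.17}$$

and (5.17) proves (5.16). Independently,

$$\begin{aligned}
\Bigl\|f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha + W_k^A\Bigr) - f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr)\Bigr\|_{L^2(\mathbb{R},L^2)} &\lesssim \|W_k^A\|_Z\Bigl(\|W_k^A\|_Z^{\frac{8}{n-4}} + \Bigl\|\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr\|_Z^{\frac{8}{n-4}}\Bigr)\\
&\lesssim_{E_{max},\varepsilon} \|W_k^A\|_Z\bigl(\|W_k^A\|_Z^{\frac{8}{n-4}} + 1\bigr) \lesssim_{E_{max},\varepsilon} \|W_k^A\|_Z
\end{aligned} \tag{5.18}$$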

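and again, using (5.16) and the product and Leibniz rules for fractional derivatives, we get that

$$\Bigl\|f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha + W_k^A\Bigr) - f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr)\Bigr\|_{L^2\bigl(\mathbb{R},\dot H^{\frac{n+8}{n+4},\frac{2n(n+4)}{n^2+6n+16}}\bigr)} \lesssim_{E_{max},\varepsilon} 1. \tag{5.19}$$

Interpolating between (5.18) and (5.19), we obtain that

$$\Bigl\|f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha + W_k^A\Bigr) - f\Bigl(\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr)\Bigr\|_N \lesssim_{E_{max},\varepsilon} \|W_k^A\|_Z^{\frac{4}{n+8}} \tag{5.20}$$

and (5.7), (5.15) and (5.20) show that

$$\limsup_k\|e_k^A\|_N = o(1) \tag{5.21}$$

as $A \to +\infty$. Independently,

$$\|p_k^A\|_W \le \Bigl\|\sum_{\alpha=1}^{A}\tau_{(h_k^\alpha,z_k^\alpha)}U^\alpha\Bigr\|_W + \|W_k^A\|_W \lesssim_{E_{max},\varepsilon} 1 + o_A(1). \tag{5.22}$$

Now using Proposition 3.3, (5.21) and (5.22), since $p_k^A(0) = u_k(0) + o_A(1)$, we get that

$$\limsup_k\|u_k\|_Z^{\frac{2(n+4)}{n-4}} \lesssim \lim_{A\to+\infty}\limsup_k\|p_k^A\|_Z^{\frac{2(n+4)}{n-4}} \lesssim \sum_\alpha\|U^\alpha\|_Z^{\frac{2(n+4)}{n-4}} \lesssim_{E_{max},\varepsilon} \sum_\alpha E(U^\alpha) \lesssim_{E_{max},\varepsilon} 1$$

and this contradicts (5.10). Now, suppose that for all $\alpha$, we have that $V^\alpha = 0$. Then Strichartz estimates (3.2) and (5.8) give that

$$\bigl\|e^{it\Delta^2}u_k(0)\bigr\|_W \lesssim \bigl\|e^{it\Delta^2}u_k(0)\bigr\|_M^{\frac12}\bigl\|e^{it\Delta^2}u_k(0)\bigr\|_Z^{\frac12} \lesssim_{E_{max}} \bigl\|e^{it\Delta^2}u_k(0)\bigr\|_Z^{\frac12} \to 0$$

as $k \to +\infty$, and Proposition 3.2 gives that $\|u_k\|_Z \to 0$, which contradicts (5.10). Consequently, we know that there exists a scale-core $(h_k, t_k, y_k)$, and $V \in \dot H^2$ such that

$$u_k(0) = g_{(h_k,y_k)}\,e^{-it_kh_k^4\Delta^2}V + w_k,$$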

where $E(w_k) \to 0$. Now, up to passing to a subsequence, we can assume that $t_kh_k^4 \to l \in [-\infty, +\infty]$. If $l \in \mathbb{R}$, then, replacing $V$ by $e^{-il\Delta^2}V$, we can assume that $l = 0$, and changing slightly $w_k$, we can assume that for any $k$, $t_k = 0$. We then get that $u_k(0) = g_{(h_k,y_k)}V + o(1)$ in $\dot H^2$, and in particular $E(V) = E_{max}$. Otherwise, by time reversal symmetry, we can assume that $l = -\infty$, and then, we find that

$$\bigl\|e^{it\Delta^2}u_k(0)\bigr\|_{Z([0,+\infty))} \le \bigl\|\tau_{(h_k,t_k,y_k)}e^{it\Delta^2}V\bigr\|_{Z([0,+\infty))} + \bigl\|e^{it\Delta^2}w_k\bigr\|_{Z([0,+\infty))} \le \bigl\|e^{it\Delta^2}V\bigr\|_{Z([-h_k^4t_k,+\infty))} + o(1) = o(1),$$

and by standard developments, we get that, for $k$ sufficiently large, $\|u_k\|_{Z(\mathbb{R}_+)}$ remains bounded. Once again, this contradicts (5.10). Let $U$ be the maximal nonlinear solution of (1.1) with initial data $V$, defined on $I = (-T_*, T^*)$. Suppose, for example that $T^* = +\infty$, and that $\|U\|_{Z(\mathbb{R}_+)} < +\infty$. Then, using Proposition 3.3 on $\mathbb{R}_+$ with $v = U$, and $u = \tau_{(h_k^{-1},0,-y_k)}u_k$, we see that $\|u_k\|_{Z(\mathbb{R}_+)}$ is bounded uniformly in $k$, which is a contradiction with (5.10). Consequently, we have that $\|U\|_{Z(0,T^*)} = \|U\|_{Z(-T_*,0)} = +\infty$ and $E(U) = E_{max}$.

Now, we prove the compactness property of $U$. In the sequel, we let $N_{min} > 0$ be sufficiently small so that $\|u\|_{\dot H^2} \le N_{min}$ implies $E(u) < E_{max}/4$. Proceeding as above, it is easily proved by contradiction that for any $\varepsilon > 0$, there exist $t_1, \dots, t_j$, $j = j(\varepsilon)$, such that for any time $t \in (-T_*, T^*)$, there exist $i = i(t)$, and $g(t) = g_{(h(t),y(t))}$ with the property that $\|u(t_i) - g(t)u(t)\|_{\dot H^2} \le \varepsilon$. Let us apply this with $\varepsilon = N_{min}$. We get a function $g(t) = g_{(h(t),y(t))}$, and a finite set of times $t_1, \dots, t_j$ such that for any $t$, there exists $i$ satisfying $\|u(t_i) - g(t)u(t)\|_{\dot H^2} \le N_{min}$. We claim that $K = \{g(t)u(t) : t \in (-T_*, T^*)\}$ is precompact in $\dot H^2$. Suppose by contradiction that this is not true. Then, there exist $\varepsilon > 0$, and a sequence $s_k$ such that for any $k$ and $p$,

$$\|g(s_k)u(s_k) - g(s_p)u(s_p)\|_{\dot H^2} > \varepsilon. \tag{5.23}$$

According to what we said above, and passing to a subsequence, we can assume that there exist two times $\bar t$, $\bar t'$, and a sequence $g'_k = g_{(h'_k,y'_k)}$ such that, for any $k$,

$$\|u(\bar t) - g(s_k)u(s_k)\|_{\dot H^2} < N_{min}, \quad\text{and}\quad \|u(\bar t') - g'_ku(s_k)\|_{\dot H^2} < \frac{\varepsilon}{4}. \tag{5.24}$$

Passing to a subsequence, it is easily seen that $(h'_k)^{-1}h(s_k)$ remains in a compact subset of $(0, \infty)$ and that $y(s_k) - h(s_k)^{-1}h'_ky'_k$ remains in a compact subset of $\mathbb{R}^n$. Hence, up to considering a subsequence, we can find $g_\infty$ such that $g(s_k)(g'_k)^{-1} \to g_\infty$ strongly. Now, using (5.24) and the fact that $g_{(h,y)}$ is an isometry on $\dot H^2$ for all $(h, y)$, we get that

$$\begin{aligned}
\|g(s_k)u(s_k) - g(s_{k+1})u(s_{k+1})\|_{\dot H^2} &\le \|g(s_k)u(s_k) - g_\infty u(\bar t')\|_{\dot H^2} + \|g_\infty u(\bar t') - g(s_{k+1})u(s_{k+1})\|_{\dot H^2}\\
&= \|g'_ku(s_k) - g'_kg(s_k)^{-1}g_\infty u(\bar t')\|_{\dot H^2} + \|g'_{k+1}u(s_{k+1}) - g'_{k+1}g(s_{k+1})^{-1}g_\infty u(\bar t')\|_{\dot H^2}\\
&\le \frac{\varepsilon}{2} + o(1).
\end{aligned}$$

Clearly, this contradicts (5.23) and proves the compactness property of $K$. The remaining part follows the line of the work in Tao, Visan and Zhang [37] and Killip, Tao and Visan [23]. However, in order to obtain a low-to-high cascade (instead of a high-to-low cascade), we make the following slight modification. We use the notations in Killip, Tao and Visan [23], except for $h(t) = N(t)^{-1}$. In case $\mathrm{Osc}(\kappa)$ is unbounded, instead of $a$, we introduce the quantity

$$b(t_0) = \inf\Bigl\{\frac{h(t_0)}{\inf_{t\le t_0}h(t)},\ \frac{h(t_0)}{\inf_{t\ge t_0}h(t)}\Bigr\}.$$

Then, if $\sup_{t_0\in J}b(t_0) = +\infty$, we can find intervals on which the solution presents an arbitrarily large relative peak. In particular it becomes possible to find a solution satisfying the low-to-high cascade scenario. Finally, in case $\sup_{t_0\in J}b(t_0) < +\infty$, the solution has arbitrarily large oscillation, but no relative peak. Mimicking the proof in Killip, Tao and Visan [23], but changing future (resp. past)-focusing time into future (resp. past)-defocusing time, one can find a solution behaving as in the self-similar case scenario. Theorem 5.1 follows. □

6. The self-similar case

In this section, we deal with the easiest case in Theorem 5.1, namely, the self-similar-like solution. We prove that it is not consistent with conservation of the energy, compactness up to rescaling, and almost conservation of the local $L^2$-norm as expressed in (3.11). More precisely, we prove that the following proposition holds true.

Proposition 6.1. Let $u \in C(I, \dot H^2)$ be a maximal-lifespan solution such that $K = \{g(t)u(t) : t \in I\}$ is precompact in $\dot H^2$ for some function $g$ as in (2.11). If $n = 8$, and $I \ne \mathbb{R}$, then $u = 0$. In particular, the self-similar scenario in Theorem 5.1 does not hold true.

Proof. Let $u \in C(I, \dot H^2)$ be a solution as above, with $I \ne \mathbb{R}$, and let $v(t) = g(t)u(t)$. Without loss of generality, we can assume that $\inf I = 0$ and that $(0, 2) \subset I$. Fix $0 < t < 1$. First, using Hölder's inequality, we get that, for any $\delta > 0$,

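$$\int_{B(-h(t)x(t),\delta)}|u(t, x)|^2\,dx \lesssim_{E_{max}} \delta^4. \tag{6.1}$$

Independently, let $x_0 \in \mathbb{R}^n$, $R > \delta > 0$, $D = B(x_0, R) \setminus B(-h(t)x(t), \delta)$, and $D' = B(x(t) + x_0/h(t), R/h(t)) \setminus B(0, \delta/h(t))$. Using Hölder's inequality once again, we get that

$$\begin{aligned}
\int_D|u(t, x)|^2\,dx &= h(t)^4\int_{D'}|v(t, x)|^2\,dx\\
&\le h(t)^4\Bigl(\int_{|x|\ge\delta/h(t)}|v(t, x)|^4\,dx\Bigr)^{\frac12}\Bigl(\int_{B\bigl(\frac{x_0}{h(t)}+x(t),\frac{R}{h(t)}\bigr)}dx\Bigr)^{\frac12}\\
&\lesssim \varrho\bigl(\delta/h(t)\bigr)^{\frac12}R^4,
\end{aligned} \tag{6.2}$$

where $\varrho$ is given by

$$\varrho(R) = \sup_{t\in I}\int_{|x|\ge R}|v(t, x)|^4\,dx.$$

A consequence of the compactness of $K$ as in Theorem 5.1 is that

$$\varrho(R) \to 0,\quad\text{as } R \to +\infty. \tag{6.3}$$

Combining (6.1) and (6.2), we get that for any ball $B_R$ of radius $R > \delta$,

$$\int_{B_R}|u(t, x)|^2\,dx \lesssim_{E_{max}} \delta^4 + R^4\varrho\bigl(\delta/h(t)\bigr)^{\frac12}. \tag{6.4}$$

Using almost conservation of local mass, as expressed in (3.11), and (6.4), we get, for any $x_0 \in \mathbb{R}^8$ and any $R > 4$, that the following bound at time 1 holds true

$$\begin{aligned}
M\bigl(u(1), B(x_0, R)\bigr)^{\frac34} &\lesssim_{E_{max}} \frac{1}{R} + M\bigl(u(t), B(x_0, 2R)\bigr)^{\frac34}\\
&\lesssim_{E_{max}} \frac{1}{R} + \Bigl(\delta^4 + R^4\varrho\bigl(\delta/h(t)\bigr)^{\frac12}\Bigr)^{\frac34},
\end{aligned} \tag{6.5}$$

where the local mass is as in (3.10). Letting $t \to 0$ and using (6.3), and then letting $\delta \to 0$, we get with (6.5) that

$$M\bigl(u(1), B(x_0, R)\bigr) \lesssim_{E_{max}} R^{-\frac43}. \tag{6.6}$$

Letting $R \to \infty$ in (6.6), we obtain

$$\|u(1)\|_{L^2} = 0. \tag{6.7}$$

Clearly (6.7) contradicts $u \ne 0$. This proves Proposition 6.1. □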
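The step from (3.11) to (6.5) in the proof above amounts to integrating a differential inequality. The following sketch spells it out; it assumes (3.11) in the form $|\partial_t M| \lesssim E^{3/4}M^{1/4}/R$, writes $m(t) = M(u(t), B(x_0, R))$ (a shorthand introduced here), and is formal in that it differentiates $m^{3/4}$ where $m > 0$.

```latex
\begin{align*}
\Bigl|\frac{d}{dt}\,m(t)^{3/4}\Bigr|
  &= \frac34\,m(t)^{-1/4}\,|m'(t)|
   \lesssim \frac{E(u)^{3/4}}{R},\\
m(1)^{3/4}
  &\le m(t)^{3/4} + \int_t^1\Bigl|\frac{d}{ds}\,m(s)^{3/4}\Bigr|\,ds
   \lesssim m(t)^{3/4} + \frac{E(u)^{3/4}}{R}\qquad(0<t<1).
\end{align*}
% If the first term on the right can be made arbitrarily small, then
%   m(1)^{3/4} \lesssim E(u)^{3/4}/R,  i.e.  m(1) \lesssim E(u)\,R^{-4/3},
% which is the decay rate appearing in (6.6).
```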

7. An interaction Morawetz estimate

To deal with the remaining two scenarii in Theorem 5.1, in which there is no prescribed finite-time blow-up, we need a new ingredient that bounds the amount of nonlinear presence of the solution at a given scale. Natural candidates to achieve this are Morawetz estimates and in our case, interaction Morawetz estimates. In light of Theorem 5.1, we need to work exclusively with $\dot H^2$-solutions. Interaction Morawetz estimates scale like the $\dot H^{1/4}$-norm. Because of this 7/4-difference in scaling, following Colliander, Keel, Staffilani, Takaoka and Tao [7], Ryckman and Visan [32] and Visan [40], we seek frequency-localized interaction Morawetz estimates. This is the purpose of Sections 7 and 8. In Section 7 we derive an a priori interaction estimate that applies to all solutions $u \in C(H^2)$, and in Section 8 we use it to obtain a frequency-localized version of these estimates. The frequency localized version applies only to the special $\dot H^2$-solutions given by Theorem 5.1. We prove here that the following proposition holds true.

Proposition 7.1. Let $n \ge 7$ and let $u \in C([T_1, T_2], H^2)$ be a solution of (3.1), with forcing term $h \in \dot{\bar S}^2([T_1, T_2]) + \dot{\bar S}^0([T_1, T_2])$. Then the following estimate holds true:

$$\begin{aligned}
&\sum_{j=1}^{n}\int_{T_1}^{T_2}\int_{\mathbb{R}^{2n}}\{h, u\}_m(t, y)\,\frac{(x - y)_j}{|x - y|}\,\{\partial_ju, u\}_m(t, x)\,dx\,dy\,dt\\
&\quad+ \sum_{j=1}^{n}\int_{T_1}^{T_2}\int_{\mathbb{R}^{2n}}|u(t, y)|^2\,\frac{(x - y)_j}{|x - y|}\,\{h, u\}_p^j(t, x)\,dx\,dy\,dt\\
&\quad+ \int_{T_1}^{T_2}\int_{\mathbb{R}^{2n}}\frac{|u(t, x)|^2|u(t, y)|^2}{|x - y|^5}\,dx\,dy\,dt\\
&\qquad\lesssim \sup_{t=T_1,T_2}\|u(t)\|_{L^2}^2\,\|u(t)\|_{\dot H^{1/2}}^2,
\end{aligned} \tag{7.1}$$

where $\{\ ,\ \}_m$ and $\{\ ,\ \}_p$ are the mass and momentum brackets. In this proposition, the mass and momentum brackets are defined by

$$\{f, g\}_m = \mathrm{Im}(f\bar g) \quad\text{and}\quad \{f, g\}_p = \mathrm{Re}(f\nabla\bar g - g\nabla\bar f). \tag{7.2}$$
In addition to Proposition 7.1, in order to exploit the bound given in (7.1), we also prove that the following lemma holds true. Lemma 7.1. Assume n 6. Then 1 2 − n−5 − n−5 2 |∇| 4 u 4 2 N |PN u| L

L4

N

1 n−5 |∇|− 2 |u|2 L2 2 ,

(7.3)

3 for all u ∈ H˙ 2 such that |∇|− 2 |u|2 ∈ L2 , where the summation is over all dyadic numbers.

Proof. The equivalence of norms is classical. We first claim that for any g ∈ S, and any n 6, − n−5 |∇| 4 g

L4

1 n−5 |∇|− 2 |g|2 L2 2 .

n−5

(7.4)

We prove (7.4). Let φ(ξ ) = |ξ |− 4 (ψ(ξ ) − ψ(2ξ )) where ψ is as in (2.4). Using the Cauchy– Schwartz inequality we get that for any dyadic N , n−5 n−5 PN |∇|− 4 g (x) = N − 4 g ∗ F −1 φ(ξ/N ) (x) 3n+5 ˇ =N 4 g(x − y)φ(Ny) dy Rn


N

3n+5 4

dy g(x − y)2 φ(Ny) ˇ

1 2

Rn

N

n+5 4

dy φ(Ny) ˇ

2493

1 2

Rn

dy g(x − y)2 φ(Ny) ˇ

1 2

(7.5)

Rn

uniformly in N . Since φ ∈ S, for any y ∈ Rn , we get n+5 n+5 −2n ˇ N |y| 2 φ(Ny) N |y| 2 1 + N |y| 1, N

(7.6)

N

where the summation is over all dyadic numbers N . Consequently, using (7.5), (7.6) and the fact that φˇ ∈ S, we get that n+5 2 PN |∇|− n−5 g(x − y)2 φ(Ny) dy ˇ 4 g (x) N 2 N

N

Rn

Rn

|g(x − y)|2 |y|

n+5 2

n+5 2 ˇ N |y| φ(Ny) dy

N

n−5 |∇|− 2 |g|2 (x),

(7.7)

and using the Littlewood–Paley theorem, (7.7) gives (7.4) for g smooth. Density arguments then give (7.3). This ends the proof of Lemma 7.1. 2 Proof of Proposition 7.1. Since the estimate we want to prove is linear, we can assume that u is smooth and use density arguments to recover the general case. We adopt the convention that repeated indices are summed. Given some real function a, we define the Morawetz action centered at 0 by Ma0 (t) = 2

∂j a(x) Im u(t, ¯ x)∂j u(t, x) dx.

(7.8)

Rn

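Following the computation in Pausader [28], we get that

$$
\partial_t M_a^0(t) = 2\int_{\mathbb R^n} \Big( 2\,\partial_j u\,\partial_k \bar u\,\partial_{jk}\Delta a - \tfrac12\,\Delta^3 a\,|u|^2 - 4\,\partial_{jk} a\,\partial_{ik} u\,\partial_{ij}\bar u + \Delta^2 a\,|\nabla u|^2 + \partial_j a\,\{u,h\}_p^j \Big)\,dx. \tag{7.9}
$$

Similarly, we define the Morawetz action centered at $y$, $M_a^y(t) = M_{a_y}^0(t)$ for $a_y(x) = |x-y|$. Finally, we define the interaction Morawetz action by the following formula:

$$
M^i(t) = \int_{\mathbb R^n} |u(t,y)|^2\,M_a^y(t)\,dy
= 2\operatorname{Im}\Big( \int_{\mathbb R^n}\int_{\mathbb R^n} |u(t,y)|^2\,\frac{x-y}{|x-y|}\cdot\nabla u(t,x)\,\bar u(t,x)\,dx\,dy \Big). \tag{7.10}
$$

We can directly estimate

$$
|M^i(t)| \lesssim \|u\|_{L^\infty L^2}^2\,\|u\|_{L^\infty \dot H^{1/2}}^2. \tag{7.11}
$$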

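Now, we get an estimate on the variation of $M^i$ by writing that

$$
\begin{aligned}
\partial_t M^i ={}& 2\int_{\mathbb R^n} \{u,h\}_m(y)\,M_a^y\,dy + 4\operatorname{Im}\int_{\mathbb R^n} \partial_j u(y)\,\partial_{jk}\bar u(y)\,\partial_k M_a^y\,dy \\
&+ 2\operatorname{Im}\int_{\mathbb R^n} \bar u(y)\,\nabla u(y)\cdot\nabla\Delta M_a^y\,dy + \int_{\mathbb R^n} |u(y)|^2\,\partial_t M_a^y\,dy.
\end{aligned}
\tag{7.12}
$$

This gives that

$$
\begin{aligned}
\partial_t M^i ={}& 4\int_{\mathbb R^n\times\mathbb R^n} \operatorname{Im}\big(\bar u(y)\,\partial_j u(y)\big)\,\partial_j^y\partial_k^x\Delta a(x-y)\,\operatorname{Im}\big(\partial_k u(x)\,\bar u(x)\big)\,dx\,dy \\
&+ 8\int_{\mathbb R^n\times\mathbb R^n} \operatorname{Im}\big(\partial_i u(y)\,\partial_{ij}\bar u(y)\big)\,\partial_j^y\partial_k^x a(x-y)\,\operatorname{Im}\big(\partial_k u(x)\,\bar u(x)\big)\,dx\,dy \\
&+ 4\int_{\mathbb R^n\times\mathbb R^n} \{u,h\}_m(y)\,\partial_k^x a(x-y)\,\operatorname{Im}\big(\partial_k u(x)\,\bar u(x)\big)\,dx\,dy \\
&+ 4\int_{\mathbb R^n\times\mathbb R^n} |u(y)|^2\,\partial_{jk}^x\Delta a(x-y)\,\partial_j u(x)\,\partial_k\bar u(x)\,dx\,dy \\
&- \int_{\mathbb R^n\times\mathbb R^n} |u(y)|^2\,\Delta^3 a(x-y)\,|u(x)|^2\,dx\,dy \\
&- 8\int_{\mathbb R^n\times\mathbb R^n} |u(y)|^2\,\partial_{jk}^x a(x-y)\,\partial_{ik} u(x)\,\partial_{ij}\bar u(x)\,dx\,dy \\
&+ 2\int_{\mathbb R^n\times\mathbb R^n} |u(y)|^2\,\Delta^2 a(x-y)\,|\nabla u(x)|^2\,dx\,dy \\
&+ 2\int_{\mathbb R^n\times\mathbb R^n} |u(y)|^2\,\partial_j^x a(x-y)\,\{u,h\}_p^j(x)\,dx\,dy,
\end{aligned}
\tag{7.13}
$$

where $\partial_j^x$ denotes derivation with respect to $x_j$, and $\partial_k^y$ derivation with respect to $y_k$. Most of the terms in (7.13) have the right sign if we let $a(z) = |z|$. Now we focus on the first two terms in (7.13). In the sequel, we let $z = x - y$. Using the fact that $\operatorname{Re}(AB) = \operatorname{Re}(A)\operatorname{Re}(B) - \operatorname{Im}(A)\operatorname{Im}(B)$, we get the equality: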

y x Im u(y)∂ ¯ ¯ dx dy j u(y) ∂j ∂k a(z) Im ∂k u(x)u(x)

R2n

=−

1 4

u(y)2 3 a(z)u(x)2 dx dy − R (∇u ⊗ u); (∇u ⊗ u) ,

(7.14)

R2n

where we let R be the bilinear form on S(Rn , Rn ) ⊗ S(Rn , R) defined by x ¯ αj (y)δ(y) ∂j k a(z) γ¯k (x)β(x) dx dy. R ( α ⊗ β); (γ ⊗ δ) = Re

(7.15)

R2n

For the second term, we proceed as follows:

x y Im ∂ij u(x)∂ ¯ ¯ dx dy i u(x) ∂j ∂k a(z) Im ∂k u(y)u(y)

R2n

1 = 4

∇u(x)2 2 a(z)u(y)2 dx dy + Q (∇∂i u ⊗ u); (∇u ⊗ ∂i u) ,

(7.16)

Rn

where we define the quadratic form Q on S(Rn , Rn ) ⊗ S(Rn , R) by zj zk 1 ¯ γ¯j (y)β(y) dy dx. δj k − Q ( α ⊗ β); (γ ⊗ δ) = Re αk (x)δ(x) |z| |z|2

(7.17)

R2n

As one can check by computing the Fourier transform of its kernel, Q is nonnegative. Hence, applying the Cauchy–Schwartz inequality, we get 1 1 Q (∇∂i u ⊗ u); (∇u ⊗ ∂i u) Q (∇∂i u ⊗ u)2 2 Q (∇u ⊗ ∂i u)2 2 1 1 Q (∇∂i u ⊗ u)2 + Q (∇u ⊗ ∂i u)2 2 2

(7.18)

and if R and Q are as in (7.15) and (7.17), we observe that Q (∇u ⊗ ∂i u)2 = Q (∇∂i u ⊗ u)2 − R (∇u ⊗ u)2 x + 2 Re ∂k u(x)u(x) ¯ ∂ij k a(z) ∂ij u(y)u(y) ¯ dx dy R2n

= Q (∇∂i u ⊗ u)2 + R (∇u ⊗ u)2 2 + Re u(x) ∂ijx a(z) ∂i u(y)∂ ¯ j u(y) dx dy. R2n

(7.19)



Consequently, applying (7.14), (7.16), (7.18) and (7.19), we get that

y x Im u(y)∂ ¯ ¯ dx dy j u(y) ∂j ∂k a(x − y) Im ∂k u(x)u(x)

4 Rn ×Rn

y x Im ∂i u(y)∂ij u(y) ¯ ∂j ∂k a(x − y) Im ∂k u(x)u(x) ¯ dx dy

+8 Rn ×Rn

u(y)2 3 a(z) u(x)2 dx dy + 8Q (∇∂i u ⊗ u)2

− R2n

u(y)2 2 a(z) ∇u(x)2 dx dy

+2 R2n

+ 4 Re

u(x)2 ∂ x a(z) ∂i u(y)∂ ¯ j u(y) dx dy. ij

(7.20)

R2n

Now, for $e \in \mathbb R^n$ a vector, and $u$ a function, we define

$$
\nabla_e u = (e\cdot\nabla u)\,\frac{e}{|e|^2} \quad\text{and}\quad \nabla_e^\perp u = \nabla u - \nabla_e u.
$$

Then, applying the Cauchy–Schwartz inequality, we get that Q (∇∂i u, u)2 =

∂ij u(x)u(x) ¯

R2n

=

(x − y)j (x − y)k 1 ∂ik u(y)u(y) δj k − ¯ dx dy |x − y| |x − y|2

⊥ ∂ u(y)] · [∇ ⊥ ∂ u(x) [u(x)∇x−y u(y)] ¯ i x−y i ¯

|x − y|

R2n

u(x)2

2 1 ⊥ ∇x−y ∂i u(y) dx dy |x − y|

u(x)2

(x − y)j (x − y)j 1 ∂ik u(y)∂ δj k − ¯ ij u(y) dx dy. (7.21) |x − y| |x − y|2

R2n

dx dy

R2n

Finally, (7.13), (7.20), and (7.21) give ∂t M −2 i

u(y)2 3 a(x − y) u(x)2 dx dy

Rn ×Rn

+4 Rn ×Rn

{u, h}m (y)∂kx a(x − y) Im ∂k u(x)u(x) ¯ dx dy



u(y)2 ∂ x a(x − y){u, h}jp (x) dx dy

+2

j

Rn ×Rn

u(y)2 ∂ x a(x − y) ∂j u(x)∂k u(x) ¯ dx dy jk

+ Rn ×Rn

u(y)2 2 a(x − y) ∇u(x)2 dx dy.

+4

(7.22)

Rn ×Rn

Let T 1 and T 2 be the last two terms in (7.22). Then 1 (T 1 + T 2) = − 4(n − 1)

R2n

(x − y)j (x − y)k |u(y)|2 (n − 1)δ ∂j u(x)∂k u(x) − 6 ¯ dx dy jk |x − y|3 |x − y|2

which is nonpositive when $n \ge 7$. Finally, (7.22) and this remark give (7.1). □
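The role of the choice $a(z) = |z|$ in the signs above can be made concrete with the standard radial identity $\Delta |z|^s = s(s+n-2)|z|^{s-2}$ on $\mathbb R^n\setminus\{0\}$. The following worked computation is only an illustrative sketch added for the reader, not part of the original argument; it records the derivatives of $a$ and makes no claim about the numerical coefficients appearing in (7.22).

```latex
% Derivatives of a(z) = |z| on R^n \ {0}, obtained by applying
% \Delta |z|^s = s (s + n - 2) |z|^{s-2} three times.
\begin{align*}
\partial_{jk} a(z) &= \frac{1}{|z|}\Big(\delta_{jk} - \frac{z_j z_k}{|z|^2}\Big), \\
\Delta a(z)   &= \frac{n-1}{|z|}, \\
\Delta^2 a(z) &= -\frac{(n-1)(n-3)}{|z|^{3}}, \\
\Delta^3 a(z) &= \frac{3(n-1)(n-3)(n-5)}{|z|^{5}}.
\end{align*}
```

In particular $\partial_{jk}a$ is the kernel appearing in (7.17), and $\Delta^3 a$ is a positive multiple of $|z|^{-5}$ as soon as $n \ge 6$, which is consistent with the weight $|x-y|^{-5}$ in (7.1).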

8. A frequency-localized interaction Morawetz estimate

The preceding interaction Morawetz estimate is ill-suited for $\dot H^2$-solutions. In order to exploit such an estimate in the context of $\dot H^2$-solutions, we need to localize it at high frequencies. The difficulty then is to deal with an inequality that scales like the $\dot H^{1/4}$-norm, while using only bounds that scale like the $\dot H^2$-norm. To overcome this difference of 7/4 derivatives, we split the solution into high and low frequencies and develop an intricate bootstrap argument to get the inequality. This is made possible because we restrict ourselves to the case of the special solutions obtained in Theorem 5.1. More precisely, we prove that the following proposition holds true.

Proposition 8.1. Let n = 8. Let u ∈ C(I, H˙ 2 ) be a maximal lifespan-solution of (1.1) such that K = {g(t)u(t): t ∈ I } is precompact in H˙ 2 and such that ∀t ∈ I , h(t) h(0) = 1. Then, for any ε > 0, there exists N = N (ε) such that −3 |∇| 2 |P1 u|2

L2 (I,L2 )

P1 u

3

− S˙ 2 (I )

ε,

ε,

and P1 uS˙ 2 (I ) ε

(8.1)

up to replacing u by g(N,0) u. Proof. We can assume that 0 < ε < ε0 for some ε0 > 0 sufficiently small to be chosen later on. We remark that for N a dyadic number and for all time, PN g −1 (t) g(t)u(t)

H˙ 2

= PN h(t) g(t)u(t) H˙ 2 .

(8.2)

Hence, by compactness of K, and since h 1, we have that PN uL∞ H˙ 2 → 0 as N → 0. Let N be such that ε Pε−4 N uL∞ H˙ 2 . 2



Replacing K by Kg(ε4 N −1 ,0) , and modifying slightly h, one can assume that P1 uL∞ (I,H˙ 2 ) ε,

and

P1 uL∞ (I,H˙ s ) P1· 0 is sufficiently small. Using again (8.10), we get the second inequality in (8.5). Now we turn to the control on uh . Still using the Strichartz estimates (3.3) and Sobolev’s inequality, we get that uh

3

− S˙ 2

uh (0) ε+

3

− H˙ 2

+

3 −5 |∇| 2 P1 O uj u3−j h l

j =0

P1 O uj u3−j h l

j =2,3

16 L2 L 15

+

8

L2 L 5

|∇|− 52 P1 O uj u3−j h l

j =0,1

8

L2 L 5

. (8.11)


B. Pausader / Journal of Functional Analysis 256 (2009) 2473–2517 3

By convolution estimate, letting cN = N − 4 |PN uh |, we get that |uh |2 uh

M1 M2 M3

O(PM1 uh PM2 uh PM3 uh )

cM3

M1 M2 M3

2 cM

M3 M2

3

4

cM2

M2 M1

3 2

3 M12 PM1 uh

3 sup M 2 |PM uh | .

(8.12)

M

M

Consequently, using the Bernstein’s properties (2.5), (7.3) and (8.12), we get that P1 |uh |2 uh

16 L2 L 15

|uh |2 uh

1 2 2 3 − 32 2 M |P u | M h 4 4 sup |∇| 2 PM uh L∞ L 167

−3 |∇| 2 |uh |2

16

L2 L 15

M

L L

M

L2 L2

uh L∞ H˙ 2

Emax η.

(8.13)

Note that instead of using the pointwise evaluation of uh = PM uh , we can replace uh by an arbitrary Schwartz function, get the bound, and then use density arguments to recover (8.13). When j = 2, we proceed as follows, using Sobolev’s inequality, the Bernstein’s properties (2.5), (8.3) and the estimate for ul in (8.5), 2 O u ul h

16

L2 L 15

ul L4 L8 uh

L4 L

16 7

uh

1 3 εul S˙ 2 |∇|− 2 uh 2

8 L2 L 3

5

8

L∞ L 3

3 12 |∇| 2 uh ∞

L L2

1

ε 2 uh 2 − 3 . S˙

(8.14)

2

When j = 1, we proceed similarly to get −5 |∇| 2 P1 O u2 uh l

8 L2 L 5

O u2l uh

ul 2L4 L8 uh ε3 , and finally,

8

L2 L 5 8

L∞ L 3

(8.15)


−5 |∇| 2 P1 |ul |2 ul

8 L2 L 5

|∇||ul |2 ul


8

L2 L 5

ul 3S˙ 2 ε3 .

(8.16)

Combining (8.11) and (8.13)–(8.16), we get that uh

5

3

− S˙ 2

1

ε + η + ε 3 + ε 2 uh 2 − 3 S˙

2

η. This ends the proof of (8.5). As a consequence of conservation of energy, (8.3), (8.5) and Hardy– Littlewood–Sobolev’s inequality, we get the following estimates on J = J (2). Namely, uh

4

5

5

L2 L2

Emax η 5 ,

uh

2

Emax η 3 ,

8

L3 L 3

|uh |2 ∗ |x|−1

L3 L24

uh 2

uh

4

9

L 2 L3 4

24

L6 L 11

Emax η 9 ,

2

Emax ε 3 η 3

and (8.17)

Now that we have good Strichartz control on the high and low frequencies, we can control the error terms arising in the frequency-localized interaction Morawetz estimates. First, we treat the terms arising from the mass bracket. We claim that on J = J (2), as defined above, we have that

(x − y)j {∂j u, u}m (t, x) dx dy ε 2 η2 . P1 |u|2 u , u m (t, y) (8.18) |x − y| J R2n

Exploiting cancellations, we write

P1 |u|2 u , uh m = P1 |u|2 u − |uh |2 uh , uh m

− P 0 such that for any R 1,

T R

fs 1{fs 1} L3 (BR ) ds C

γ

R3

0

β f0 (v) dv −

β fT (v) dv +

R3

1 + fs δL3−ε ds

0

T

C 1+

T

fs δL3−ε ds . 0

Step 2. For α > 0, we define gt (v) = ft (v)(1 + |v|)γ −α 1{ft 1} . We have, using Step 1, T gs L3 ds

T

gs L3 ({2k −1|v|2k+1 −1}) ds

0 k0

0

T

2k(γ −α) fs 1{fs 1} L3 (B k+1 ) ds 2

0 k0

C2

−γ

2

−αk

0

T

C 1+

fs δL3−ε 0

fs δL3−ε

1+

k0

T

ds .

ds


N. Fournier, H. Guérin / Journal of Functional Analysis 256 (2009) 2542–2560

Step 3. We now prove if α > 0 is small enough, for some constant C, ft 1{ft 1} L3−ε C 1 + gt L3 . We consider a nonnegative function h with hL3−ε =

2−ε

h R3

R3

h(v) dv 1. By Hölder’s inequality, for ε ∈ (0, 1)

1/(3−ε) (1 + |v|)(γ −α) 3(2−ε)/2 (v) h(v) dv (1 + |v|)(γ −α)

γ −α 3 1 + |v| h(v) dv

R3

(3.5)

γ −α 3 1 + |v| h(v) dv

2−ε 2(3−ε)

3(α−γ )(2−ε)/ε 1 + |v| h(v) dv

ε 2(3−ε)

R3

2−ε 2(3−ε)

1 + |v|3(α−γ )(2−ε)/ε h(v) dv .

R3

R3

Then, for h = ft 1{ft 1} , setting r = 3(α − γ )(2 − ε)/ε, and recalling that gt (v) = ft (v)(1 + |v|)γ −α 1{ft 1} 3(2−ε) ft 1{ft 1} L3−ε 1 + mr (ft ) gt L2(3−ε) 1 + mr (ft ) 1 + gt L3 . 3

But by assumption, mq (f0 ) < ∞ for some q > 3|γ |(2 − ε)/ε, whence, by Proposition 8, sup[0,T ] mq (ft ) < ∞. Choosing α > 0 small enough, in order that q r, we deduce (3.5).

Step 4. Using that R3 fs (v) dv = 1, Steps 2 and 3, we obtain, for some constant C (depending in particular on T ), T

T fs L3−ε ds C +

0

T fs 1{fs 1} L3−ε ds C + C

0

fs δL3−ε ds 0

T C +C

δ fs L3−ε

.

0

But one may choose $p \in (3/(3+\gamma),\, 3-\varepsilon)$ (recall Step 1) such that $\delta = \frac{(p-1)(3-\varepsilon)}{p(2-\varepsilon)} \in (0,1)$ (choose $p$ very close to $3/(3+\gamma)$ and use that by assumption, $3/(3+\gamma) < 3-\varepsilon$, whence $\varepsilon < \frac{6+3\gamma}{3+\gamma}$). As a consequence, $\int_0^T \|f_s\|_{L^{3-\varepsilon}}\,ds \le x_0$, where $x_0$ is the largest solution of $x = C + Cx^\delta$. Following carefully the proof above, one may check that $C$, and thus $x_0$, depends only on $T$, $f_0$, $q$, $\gamma$, $\varepsilon$. □

3.5. Soft potentials

We now would like to obtain a result which includes the case of very soft potentials, that is $\gamma \in (-3,-2]$.



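Proposition 11. Let $\gamma \in (-3,0)$ and $p \in (\frac{3}{3+\gamma}, +\infty)$. Let $f$ be a weak solution of the Landau equation with $f_0 \in \mathcal P_2(\mathbb R^3) \cap L^p(\mathbb R^3)$. Then there exists a time $T^* > 0$ depending on $\gamma$, $p$ and $\|f_0\|_{L^p}$ such that for any $T \in [0, T^*)$, at least formally, $\sup_{[0,T]} \|f_t\|_{L^p} < \infty$.

Proof. Let us consider the function $\beta(x) = x^p$. Since $p > 1$, we have $\beta'' \ge 0$ and $\phi_\beta(x) = (p-1)x^p$. Using (3.1), neglecting all nonnegative terms, and using Remark 1, since $p > \frac{3}{3+\gamma}$, there exists a constant $C_{\gamma,p} > 0$ such that

$$
\begin{aligned}
\frac{d}{dt}\|f_t\|_{L^p}^p
&\le 2(\gamma+3)(p-1)\int_{\mathbb R^3\times\mathbb R^3} |v-v_*|^\gamma\,f_t(v_*)\,f_t^p(v)\,dv\,dv_* \\
&\lesssim \|f_t\|_{L^p}^p\,J_\gamma(f_t)
\le C_{\gamma,p}\big(1+\|f_t\|_{L^p}\big)\|f_t\|_{L^p}^p
\le C_{\gamma,p}\big(1+\|f_t\|_{L^p}^{2p}\big).
\end{aligned}
$$

Thus for all $0 \le t < T^* := \big(\frac{\pi}{2} - \arctan\|f_0\|_{L^p}^p\big)/C_{\gamma,p}$, we have $\|f_t\|_{L^p}^p \le \tan\big(\arctan\|f_0\|_{L^p}^p + C_{\gamma,p}\,t\big)$, which concludes the proof. □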
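The last step of the proof of Proposition 11 is a scalar comparison argument. As a sketch, writing $y(t)$ for the quantity $\|f_t\|_{L^p}^p$ and $C$ for the constant $C_{\gamma,p}$ (both names are shorthand introduced here), and assuming $y$ is differentiable with $y' \le C(1+y^2)$, the bound by the tangent function follows from differentiating $\arctan y$:

```latex
% Comparison argument for y(t) = \|f_t\|_{L^p}^p, assuming y' \le C (1 + y^2).
\begin{align*}
\frac{d}{dt}\arctan y(t) &= \frac{y'(t)}{1+y(t)^2} \le C, \\
\arctan y(t) &\le \arctan y(0) + Ct, \\
y(t) &\le \tan\big(\arctan y(0) + Ct\big)
  \quad\text{as long as } \arctan y(0) + Ct < \tfrac{\pi}{2},
\end{align*}
% that is, for 0 \le t < T^* = (\pi/2 - \arctan y(0)) / C.
```

The restriction $\arctan y(0) + Ct < \pi/2$ is exactly what produces the finite time $T^*$ in the statement.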

Proof of Corollary 4(ii). We only have to prove the existence, since the uniqueness immediately follows from Theorem 3 and Remark 1. We thus assume that $\gamma \in (-3,0)$, $p > 3/(3+\gamma)$, and consider an initial condition $f_0 \in \mathcal P_2(\mathbb R^3) \cap L^p(\mathbb R^3)$ (which implies that $H(f_0) < \infty$). Then Villani [14] has shown the existence of a weak solution $(f_t)_{t\in[0,T]} \in L^\infty([0,T], \mathcal P_2)$, for arbitrary $T$. But using the a priori estimate of Proposition 11, we deduce that this solution can be built in such a way that it belongs to $L^\infty_{loc}([0,T_*), L^p)$. This concludes the proof. □

References

[1] R.A. Adams, Sobolev Spaces, Academic Press, New York, London, Toronto, 1975.
[2] R. Alexandre, L. Desvillettes, C. Villani, B. Wennberg, Entropy dissipation and long range interactions, Arch. Ration. Mech. Anal. 152 (4) (2000) 327–355.
[3] A.G. Bhatt, R.L. Karandikar, Invariant measures and evolution equations for Markov processes characterized via martingale problems, Ann. Probab. 21 (4) (1993) 2246–2268.
[4] L. Desvillettes, C. Villani, On the spatially homogeneous Landau equation for hard potentials. Part I: Existence, uniqueness and smoothness, Comm. Partial Differential Equations 25 (1–2) (2000) 179–259.
[5] L. Desvillettes, C. Mouhot, Stability and uniqueness for the spatially homogeneous Boltzmann equation with long-range interactions, Arch. Ration. Mech. Anal., submitted for publication.
[6] J. Fontbona, H. Guérin, S. Méléard, Measurability of optimal transportation and convergence rate for Landau type interacting particle systems, Probab. Theory Related Fields (2008), in press.
[7] N. Fournier, C. Mouhot, On the well-posedness of the spatially homogeneous Boltzmann equation with a moderate angular singularity, Preprint, 2007.
[8] N. Fournier, H. Guérin, On the uniqueness for the spatially homogeneous Boltzmann equation with a strong angular singularity, J. Stat. Phys. 131 (4) (2008) 749–781.
[9] H. Guérin, Solving Landau equation for some soft potentials through a probabilistic approach, Ann. Appl. Probab. 13 (2) (2003) 515–539.
[10] S.T. Rachev, L. Rüschendorf, Mass Transportation Problems. Vol. I. Theory, Probab. Appl. (N.Y.), Springer-Verlag, New York, 1998.
[11] H. Tanaka, Probabilistic treatment of the Boltzmann equation of Maxwellian molecules, Z. Wahrsch. Verw. Geb. 46 (1) (1978–1979) 67–105.
[12] C. Villani, Contribution à l’étude mathématique des équations de Boltzmann et de Landau en théorie cinétique des gaz et des plasmas, Thèse de doctorat, Université Paris 9-Dauphine, 1998.
[13] C. Villani, On the spatially homogeneous Landau equation for Maxwellian molecules, Math. Models Methods Appl. Sci. 8 (6) (1998) 957–983.
[14] C. Villani, On a new class of weak solutions to the spatially homogeneous Boltzmann and Landau equations, Arch. Ration. Mech. Anal. 143 (3) (1998) 273–307.
[15] C. Villani, A review of mathematical topics in collisional kinetic theory, in: Handbook of Mathematical Fluid Dynamics, vol. I, North-Holland, Amsterdam, 2002, pp. 71–305.
[16] C. Villani, Topics in Optimal Transportation, Grad. Stud. Math., vol. 58, Amer. Math. Soc., 2003.
[17] J.B. Walsh, An introduction to stochastic partial differential equations, in: École d’été de Probabilités de Saint-Flour XIV, in: Lecture Notes in Math., vol. 1180, 1986, pp. 265–437.


On maximal $L^p$-regularity

Frédéric Bernicot a, Jiman Zhao b,∗,1

a Université Paris-Sud XI, 91405 Orsay Cedex, France
b School of Mathematical Sciences, Beijing Normal University, Key Laboratory of Mathematics and Complex Systems, Ministry of Education, Beijing 100875, PR China

Received 7 July 2008; accepted 15 January 2009. Available online 6 February 2009. Communicated by C. Kenig

Abstract

The aim of this paper is to propose weak assumptions to prove maximal $L^q$ regularity for the Cauchy problem

$$
\frac{du}{dt}(t) - Lu(t) = f(t).
$$

Mainly we only require “off-diagonal” estimates on the real semigroup $(e^{tL})_{t>0}$ to obtain maximal $L^q$ regularity. The main idea is to use one kind of Hardy space $H^1$ adapted to this problem and then use interpolation results. These techniques permit us to prove weighted maximal regularity too.

© 2009 Elsevier Inc. All rights reserved.

Keywords: Maximal regularity; Hardy spaces; Atomic decomposition

Contents

1. Introduction
2. Preliminaries
3. An application to maximal $L^q$ regularity on Lebesgue spaces
4. Proof of Theorem 3.4


E-mail addresses: [email protected] (F. Bernicot), [email protected] (J. Zhao). 1 J. Zhao is supported by SRF for ROCS, SEM, China and supported by NSF of China (Grant: 10871048).



F. Bernicot, J. Zhao / Journal of Functional Analysis 256 (2009) 2561–2586

5. Other results
5.1. Maximal regularity on $L^p$ for $p \ge 2$
5.2. Study of our Hardy spaces
Acknowledgments
References

1. Introduction

Let $(Y, d_Y, \nu)$ be a space of homogeneous type. Let $L$ be the infinitesimal generator of an analytic semigroup of operators on $L^p := L^p(Y)$ and $J = (0,l]$, $l > 0$, or $J = (0,+\infty)$ (in the second case, one has to assume that $L$ generates a bounded analytic semigroup). Consider the Cauchy problem

$$
\begin{cases}
\dfrac{du}{dt}(t) - Lu(t) = f(t), & t \in J, \\[4pt]
u(0) = 0,
\end{cases}
\tag{CP}
$$

where $f : J \to B$ is given, where $B$ is a Banach space. If $e^{tL}$ is the semigroup generated by $L$, $u$ is formally given by

$$
u(t) = \int_0^t e^{(t-s)L} f(s)\,ds.
$$
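The word "formally" can be unpacked by differentiating the integral formula. The computation below is a sketch that ignores all questions of domains and regularity, and the scalar case at the end (with $L = -\lambda$, $\lambda > 0$, and $f \equiv 1$) is an illustrative example chosen here, not one taken from the paper.

```latex
% Formal check that the integral formula solves (CP).
\begin{align*}
u(t) &= \int_0^t e^{(t-s)L} f(s)\,ds, \qquad u(0) = 0, \\
\frac{du}{dt}(t) &= e^{(t-t)L} f(t) + \int_0^t L\,e^{(t-s)L} f(s)\,ds
                 = f(t) + L u(t).
\end{align*}
% Scalar illustration: L = -\lambda with \lambda > 0 and f \equiv 1 give
\begin{align*}
u(t) = \int_0^t e^{-\lambda (t-s)}\,ds = \frac{1 - e^{-\lambda t}}{\lambda},
\qquad
u'(t) = e^{-\lambda t} = 1 - \lambda u(t).
\end{align*}
```

The second term in the derivative, $\int_0^t L e^{(t-s)L} f(s)\,ds$, is the quantity whose boundedness is at stake in maximal regularity.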

For fixed $q \in (1,+\infty)$, one says that there is maximal $L^q$ regularity on $B = L^p$ for the problem if for every $f \in L^q(J, L^p)$, $\frac{\partial u}{\partial t}$ (or $Lu$) belongs to $L^q(J, L^p)$. It is known that the property of maximal $L^q$-regularity does not depend on $q \in (1,\infty)$. For maximal $L^q$ regularity, we refer the reader to the works of P. Cannarsa and V. Vespri [7], T. Coulhon and X.T. Duong [9,10], L. de Simon [11], M. Hieber and J. Prüss [12] and D. Lamberton [13], among others. The literature is so vast that we do not give exhaustive references. However, we emphasize that in all these works the authors obtain maximal regularity under the assumption that the heat kernel (the kernel of the semigroup) admits pointwise estimates with Gaussian decay. Such assumptions imply that the semigroup extends consistently to all Lebesgue spaces $L^p$ for $p \in (1,\infty)$. In recent years, problems associated with semigroups that do not satisfy this property have been studied. For example, Gaussian estimates have been successfully generalized to “off-diagonal estimates” for studying the boundedness of Riesz transforms on a manifold (see [1]). That is why we look for weaker assumptions, expressed through “off-diagonal” estimates on the semigroup, that guarantee maximal $L^q$-regularity. In this direction, there is a first work of S. Blunck and P.C. Kunstmann [6]. The authors obtained the following result (using the R-boundedness of the complex semigroup and the recent characterization of L. Weis [14]):



Theorem 1.1. Let $\delta$ be the homogeneous dimension of $Y$. Assume that $(e^{zL})_z$ is a bounded analytic semigroup on $L^2$ and let $p_0 < 2 < q$ be exponents. Suppose there are coefficients $(g(k))_{k\ge1}$ such that for all balls $Q$ of radius $r_Q$ and all integers $k \ge 0$, we have

$$
\big\| 1_Q\, e^{r_Q^2 L}\, 1_{(k+1)Q\setminus kQ} \big\|_{L^{p_0}\to L^q} \le \nu(Q)^{\frac1q - \frac1{p_0}}\, g(k) \tag{1.1}
$$

with

$$
\sum_{k=1}^{\infty} k^{\delta-1} g(k) < \infty. \tag{1.2}
$$

Then for all $r \in (p_0, 2]$, $L$ has maximal regularity on $L^r$.

Now we come to our results. We look for similar results with some improvements. First, as the conclusion concerns only exponents $r \in (p_0, 2]$, we would like not to require assumption (1.1) with an exponent $q > 2$. In addition, we want to understand how important assumption (1.2) is. In our result, we give a similar assumption, which does not seem to be comparable to this one. However, our proof (which uses very different techniques) permits us to obtain simultaneously positive and new results for “weighted” maximal regularity. In [2], the authors consider the Cauchy problem (CP) with $-L$ equal to the Laplacian operator on some Riemannian manifolds, or a sublaplacian on some Lie groups, or some second order elliptic operators on a domain. We show the boundedness of the operator of maximal regularity $f \mapsto Lu$ and its adjoint on appropriate Hardy spaces. In this paper, we apply the general theory of our paper [5] to maximal regularity in an abstract setting. In [5], we construct Hardy spaces through an atomic (or molecular) decomposition which keep the main properties of the (already known) Hardy spaces $H^1$. We prove some results about continuity from these spaces into $L^1$ and some results about interpolation between these spaces and the Lebesgue spaces. Now we will use this theory to study maximal regularity. Here is our main result:

Theorem 1.2. Let $L$ be a generator of a bounded analytic semigroup $T := (e^{tL})_{t>0}$ on $L^2(Y)$ such that $T$, $(tLe^{tL})_{t>0}$ and $(t^2L^2e^{tL})_{t>0}$ satisfy “$L^2 - L^2$ off-diagonal decay” (precisely, belong to the class $O_4(L^2 - L^2)$, see Definition 3.2). For an exponent $p_0 \in (1,2]$, we assume some weak “$L^{p_0} - L^2$ off-diagonal decay” (we require (3.5), see Proposition 3.9). Then for all exponents $p \in (p_0, 2]$, the operator $T$ admits a continuous extension on $L^p(Y)$ and so $L$ has maximal $L^p$-regularity. In addition we can have weighted results: let $\omega \in A_\infty(Y)$ be a weight on $Y$. Then for all exponents $p \in (p_0, 2)$ satisfying $\omega \in A_{p/p_0} \cap RH_{(2/p)'}$, $L$ has maximal $L^p(\omega)$-regularity.

Remark 1.3. Our weak “$L^{p_0} - L^2$ off-diagonal decay” is similar to assumption (1.1) of Theorem 1.1 but is not comparable to it. What is important is that we only require information on the real semigroup. In addition, the answer concerning weighted results is totally new and does not seem accessible by the techniques of [6] used to prove Theorem 1.1.



The plan is as follows: in Section 2, we recall the abstract results concerning Hardy spaces. Then in Section 3, we explain the application to the maximal regularity problem: how to define an adapted Hardy space. We conclude in Section 4 by checking the abstract assumptions for this application. In the last section we give results for exponents $p \ge 2$ and study the Hardy spaces adapted to this problem of maximal regularity.

2. Preliminaries

In this section, we give an overview of some basic facts which we will use in the sequel. For more details concerning abstract Hardy spaces, see [5]. Let $(X, d, \mu)$ be a space of homogeneous type. That is, $d$ is a quasi-distance on the space $X$ and $\mu$ a Borel measure which satisfies the doubling property:

$$
\exists A > 0,\ \exists \delta > 0,\ \forall x \in X,\ \forall r > 0,\ \forall t \ge 1, \qquad \mu\big(B(x,tr)\big) \le A\,t^{\delta}\,\mu\big(B(x,r)\big), \tag{2.1}
$$

where B(x, r) is the open ball with center x ∈ X and radius r > 0. We call δ the homogeneous dimension of X. Let Q be a ball, for i 0, we write Si (Q) the scaled corona around the ball Q: d(x, c(Q)) i i+1 , Si (Q) := x, 2 1 + 0}, and B := (BQ )Q∈Q a collection of L2 -bounded linear operators, indexed by the collection Q. We assume that these operators BQ are uniformly bounded on L2 : there exists a constant 0 < A < ∞ so that: ∀f ∈ L2 , ∀Q ball,

BQ (f ) A (δ) f 2 . 2

(2.2)
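As a concrete illustration of the doubling property (2.1), the simplest space of homogeneous type is $\mathbb R^n$ with the Euclidean distance and Lebesgue measure, where (2.1) holds with equality for $A = 1$ and $\delta = n$. The following Python sketch checks this numerically; the function names `ball_volume` and `doubling_ratio` are hypothetical helpers introduced here, and the Euclidean setting is an assumption made only for this example.

```python
import math


def ball_volume(n: int, r: float) -> float:
    """Lebesgue measure of the Euclidean ball of radius r in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n


def doubling_ratio(n: int, r: float, t: float) -> float:
    """Ratio mu(B(x, t r)) / mu(B(x, r)); it does not depend on the center x."""
    return ball_volume(n, t * r) / ball_volume(n, r)


if __name__ == "__main__":
    for n in (1, 2, 3, 8):
        for t in (1.0, 2.0, 7.5):
            # In R^n the doubling inequality (2.1) is an equality with A = 1, delta = n.
            assert math.isclose(doubling_ratio(n, 0.3, t), t ** n, rel_tol=1e-9)
    print("doubling property checked with A = 1 and delta = n")
```

In this model case the homogeneous dimension $\delta$ coincides with the topological dimension $n$; in a general space of homogeneous type only the inequality is available.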

Now, we recall some definitions and theorems of [5]. The $\epsilon$-molecules (or atoms) are defined as follows.

Definition 2.1. (See [5].) Let $\epsilon > 0$ be a fixed parameter. A function $m \in L^1_{loc}$ is called an $\epsilon$-molecule associated to a ball $Q$ if there exists a real function $f_Q$ such that $m = B_Q(f_Q)$, with

$$
\forall i \ge 0, \qquad \|f_Q\|_{2,S_i(Q)} \le \big(\mu(2^iQ)\big)^{-1/2}\, 2^{-\epsilon i}.
$$

We call $m = B_Q(f_Q)$ an atom if in addition we have $\operatorname{supp}(f_Q) \subset Q$. So an atom is exactly an $\infty$-molecule. Using this definition, we can define the “finite” molecular (atomic) Hardy space.



Definition 2.2. (See [5].) A measurable function $h$ belongs to the “finite” molecular Hardy space $H^1_{F,\epsilon,mol}$ if there exists a finite decomposition

$$
h = \sum_i \lambda_i m_i \quad \mu\text{-a.e.},
$$

where for all $i$, $m_i$ is an $\epsilon$-molecule and $\lambda_i$ are real numbers satisfying

$$
\sum_{i\in\mathbb N} |\lambda_i| < \infty.
$$

We define the norm

$$
\|h\|_{H^1_{F,\epsilon,mol}} := \inf_{h = \sum_i \lambda_i m_i}\ \sum_i |\lambda_i|,
$$

where we take the infimum over all the finite atomic decompositions. Similarly we define the “finite” atomic space $H^1_{F,ato}$, replacing $\epsilon$-molecules by atoms. We will use the following theorem for studying maximal regularity.

Proposition 2.3. (See [5].) Let $T$ be an $L^2$-bounded sublinear operator satisfying the following “off-diagonal” estimates: for all balls $Q$, for all $k \ge 0$, $j \ge 2$, there exists some coefficient $\alpha_{j,k}(Q)$ such that for every $L^2$-function $f$ supported in $S_k(Q)$

$$
\Big( \frac{1}{\mu(2^{j+k+1}Q)} \int_{S_j(2^kQ)} \big|T(B_Q(f))\big|^2\,d\mu \Big)^{1/2}
\le \alpha_{j,k}(Q)\, \Big( \frac{1}{\mu(2^{k+1}Q)} \int_{S_k(Q)} |f|^2\,d\mu \Big)^{1/2}. \tag{2.3}
$$

If the coefficients $\alpha_{j,k}$ satisfy

$$
\Lambda := \sup_{k\ge0}\ \sup_{Q\ \mathrm{ball}}\ \sum_{j\ge2} \frac{\mu(2^{j+k+1}Q)}{\mu(2^{k+1}Q)}\,\alpha_{j,k}(Q) < \infty, \tag{2.4}
$$

then for all $\epsilon > 0$ there exists a constant $C = C(\epsilon)$ such that

$$
\forall f \in H^1_{F,\epsilon,mol}, \qquad \|T(f)\|_1 \le C\,\|f\|_{H^1_{F,\epsilon,mol}}.
$$

Definition 2.4. (See [5].) We let $A_Q$ be the operator $Id - B_Q$. For $\sigma \in [2,\infty]$ we define the maximal operator:

$$
\forall x \in X, \qquad M_\sigma(f)(x) := \sup_{\substack{Q\ \mathrm{ball}\\ x\in Q}} \Big( \frac{1}{\mu(Q)} \int_Q \big|A_Q^*(f)\big|^\sigma\,d\mu \Big)^{1/\sigma}. \tag{2.5}
$$

We use duality, so we write $A_Q^*$ for the adjoint operator. The standard “Hardy–Littlewood” maximal operator is defined by: for $s > 0$,

$$
\forall x \in X, \qquad M_{HL,s}(f)(x) := \sup_{\substack{Q\ \mathrm{ball}\\ x\in Q}} \Big( \frac{1}{\mu(Q)} \int_Q |f|^s\,d\mu \Big)^{1/s}.
$$

The main result of [5] is the following one about interpolation between $L^2$ and the Hardy spaces:

Theorem 2.5. (See [5].) Let $\sigma \in (2,\infty]$. Assume that we have an implicit constant such that for all functions $h \in L^2$

$$
M_\sigma(h) \lesssim M_{HL,2}(h).
$$

Let $T$ be an $L^2$-bounded, linear operator. Assume that $T$ is continuous from $H^1_{F,ato}$ (or $H^1_{F,\epsilon,mol}$) into $L^1$. Then for all exponents $p \in (\sigma', 2]$ there exists a constant $C = C(p)$ such that:

$$
\forall f \in L^2 \cap L^p, \qquad \|T(f)\|_p \le C\,\|f\|_p.
$$

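We have boundedness in weighted spaces too. We recall the definition of Muckenhoupt’s weights and reverse Hölder classes:

Definition 2.6. A nonnegative function $\omega$ on $X$ belongs to the class $A_p$ for $1 < p < \infty$ if

$$
\sup_{Q\ \mathrm{ball}} \Big( \frac{1}{\mu(Q)} \int_Q \omega\,d\mu \Big) \Big( \frac{1}{\mu(Q)} \int_Q \omega^{-1/(p-1)}\,d\mu \Big)^{p-1} < \infty.
$$

A nonnegative function $\omega$ on $X$ belongs to the class $RH_q$ for $1 < q < \infty$, if there is a constant $C$ such that for every ball $Q \subset X$

$$
\Big( \frac{1}{\mu(Q)} \int_Q \omega^q\,d\mu \Big)^{1/q} \le C\, \Big( \frac{1}{\mu(Q)} \int_Q \omega\,d\mu \Big).
$$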
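To see Definition 2.6 at work, one can take the model case $X = \mathbb R^n$ with Lebesgue measure and the power weight $\omega(x) = |x|^a$; both choices are assumptions made for this illustration only. The computation below is a sketch restricted to balls $B_r$ centered at the origin, where the two averages can be evaluated in polar coordinates.

```latex
% Power weight \omega(x) = |x|^a on R^n, balls B_r centered at 0.
% For a > -n:
\begin{align*}
\frac{1}{|B_r|}\int_{B_r} |x|^{a}\,dx &= \frac{n}{n+a}\, r^{a}, \\
% and for a < n(p-1), so that -a/(p-1) > -n:
\Big(\frac{1}{|B_r|}\int_{B_r} |x|^{-\frac{a}{p-1}}\,dx\Big)^{p-1}
  &= \Big(\frac{n}{n-\frac{a}{p-1}}\Big)^{p-1} r^{-a}.
\end{align*}
% The product of the two averages is independent of r:
\begin{align*}
\frac{n}{n+a}\,\Big(\frac{n}{n-\frac{a}{p-1}}\Big)^{p-1} < \infty
\qquad\text{whenever } -n < a < n(p-1).
\end{align*}
```

This is consistent with the classical fact that $|x|^a \in A_p(\mathbb R^n)$ exactly when $-n < a < n(p-1)$; the sketch only treats centered balls and does not prove the general statement.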

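We use the following notation of [3]: Let $\omega \in A_\infty$ be a weight on $X$ and $0 < p_0 < q_0 \le \infty$ be two exponents; we introduce the set

$$
W_\omega(p_0, q_0) := \big\{ p \in (p_0, q_0),\ \omega \in A_{p/p_0} \cap RH_{(q_0/p)'} \big\}.
$$

Then we have the following result:

Theorem 2.7. Let $\sigma \in (2,\infty]$. Assume that we have an implicit constant such that for all $h \in L^2$,

$$
M_\sigma(h) \lesssim M_{HL,2}(h).
$$

Let $T$ be an $L^2$-bounded, linear operator such that for all balls $Q$ and for all functions $f$ supported in $Q$

$$
\forall j \ge 2, \qquad \Big( \frac{1}{\mu(2^{j+1}Q)} \int_{S_j(Q)} \big|T(B_Q(f))\big|^2\,d\mu \Big)^{1/2} \le \alpha_j(Q)\, \Big( \frac{1}{\mu(Q)} \int_Q |f|^2\,d\mu \Big)^{1/2}, \tag{2.6}
$$

with coefficients $\alpha_j(Q)$ satisfying

$$
\sup_{Q\ \mathrm{ball}}\ \sum_{j\ge0} \frac{\mu(2^{j+1}Q)}{\mu(Q)}\,\alpha_j(Q) < \infty.
$$

Let $\omega \in A_\infty$ be a weight. Then for all exponents $p \in W_\omega(\sigma', 2)$, there exists a constant $C$ such that

$$
\forall f \in L^2 \cap L^p(\omega), \qquad \|T(f)\|_{p,\omega\,d\mu} \le C\,\|f\|_{p,\omega\,d\mu}.
$$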

Remark 2.8. From Proposition 2.3, (2.6) implies the $H^1_{F,ato} - L^1$ boundedness of $T$. However, the proof for weighted results requires (2.6) and not only this boundedness of the operator $T$.

Now we give some results concerning the Hardy spaces. Assume that $B$ satisfies some decay estimates: for $M > n/2$ an integer (with $n$ the homogeneous dimension of $X$), there exists a constant $C$ such that

$$
\forall i \ge 0,\ \forall k \ge 0,\ \forall f \in L^2,\ \operatorname{supp}(f) \subset 2^kQ, \qquad \|B_Q(f)\|_{2,S_i(2^kQ)} \le C\,2^{-Mi}\,\|f\|_{2,2^kQ}. \tag{2.7}
$$

Then we have the following results:

Proposition 2.9. (See [5].) The spaces $H^1_{ato}$ and $H^1_{\epsilon,mol}$ are Banach spaces. And

$$
\forall \epsilon > 0, \qquad H^1_{ato} \hookrightarrow H^1_{\epsilon,mol} \hookrightarrow L^1.
$$

Therefore $L^\infty \subset (H^1_{\epsilon,mol})^* \subset (H^1_{ato})^*$.

We denote by $H^1_{CW}$ the classical Hardy space (of Coifman–Weiss) (see [8]). As we noted in [5], it corresponds to our Hardy space $H^1_{ato}$ or $H^1_{\epsilon,mol}$ when the operators $B_Q$ exactly correspond to the oscillation operators.

Proposition 2.10. (See [5].) Let $\epsilon \in (0,\infty]$. The inclusion $H^1_{\epsilon,mol} \subset H^1_{CW}$ is equivalent to the fact that for all $Q \in \mathcal Q$, $(A_Q)^*(1_X) = 1_X$ in $(Mol_{\epsilon,Q})^*$. In this case for all $\epsilon' \ge \epsilon$ we have the inclusions $H^1_{ato} \subset H^1_{\epsilon',mol} \subset H^1_{\epsilon,mol} \subset H^1_{CW}$.



3. An application to maximal $L^q$ regularity on Lebesgue spaces

In this section, we apply the previous general theory to maximal $L^q$ regularity for the Cauchy problem. We first define an operator $T$:

Definition 3.1. With $L$ the generator of the semigroup, we define the operator:

$$
Tf(t,x) = \int_0^t \big[ L e^{(t-s)L} f(s,\cdot) \big](x)\,ds.
$$

Let $p, q \in (1,\infty)$ be two exponents. We know that maximal $L^q$ regularity on $L^p(Y)$ is equivalent to the fact that $T$ is bounded on $L^p(J \times Y)$. That is why we study this operator. Of course, the problem of maximal $L^q$ regularity is completely understood through the abstract result of L. Weis in [14], which uses R-boundedness. Here we want to remain as concrete as possible and look for practicable assumptions. We define operators $B_Q$ and Hardy spaces adapted to the operator $T$. Then, using interpolation, we prove $L^p$ boundedness of this operator. In particular, we will see that the $H^1_{F,\epsilon,mol} - L^1$ continuity of the operator $T$ depends only on $L^2$ assumptions. It is only when we want to deduce $L^p$ estimates that we need stronger assumptions, which imply the R-boundedness used in [14]. Now we describe the choice of the collection $B$ adapted to this operator. Then we will check that assumption (2.2) and the one about $M_{q_0}$ are satisfied. To finish the proof, we will show the $H^1_{F,\epsilon,mol} - L^1$ boundedness of $T$ in Theorem 4.1. Equip $X = J \times Y$ with the parabolic quasi-distance $d$ and the measure $\mu$ defined by:

$$
d\big((t_1,y_1),(t_2,y_2)\big) = \max\Big\{ d_Y(y_1,y_2),\ \sqrt{|t_1-t_2|} \Big\} \quad\text{and}\quad d\mu = dt \otimes d\nu.
$$
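The parabolic scaling can be checked by hand in a model case. The following Python sketch assumes $Y = \mathbb R^d$ with the Euclidean distance and Lebesgue measure, and ignores the truncation of the time variable to $J$; the names `parabolic_distance` and `parabolic_ball_measure` are hypothetical helpers introduced here. Under these assumptions a parabolic ball of radius $r$ is a time interval of length $2r^2$ times a Euclidean ball of radius $r$, so its measure scales like $r^{d+2}$.

```python
import math


def parabolic_distance(p, q):
    """d((t1, y1), (t2, y2)) = max(d_Y(y1, y2), sqrt(|t1 - t2|)).

    Assumption: Y = R^d with the Euclidean distance; a point is a pair
    (t, y) where y is a tuple of floats.
    """
    (t1, y1), (t2, y2) = p, q
    return max(math.dist(y1, y2), math.sqrt(abs(t1 - t2)))


def parabolic_ball_measure(d: int, r: float) -> float:
    """Measure dt x dy of a parabolic ball of radius r in R x R^d.

    The ball is (t0 - r^2, t0 + r^2) x B(y0, r); truncation of the time
    variable to the interval J is ignored in this sketch.
    """
    euclidean = math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d
    return 2.0 * r ** 2 * euclidean


if __name__ == "__main__":
    d = 3
    ratio = parabolic_ball_measure(d, 2.0) / parabolic_ball_measure(d, 1.0)
    # Doubling the radius multiplies the measure by 2^(d + 2).
    assert math.isclose(ratio, 2.0 ** (d + 2), rel_tol=1e-12)
    print("measure scales like r^(d+2)")
```

The exponent $d + 2$ obtained here is the model-case counterpart of the homogeneous dimension $\delta + 2$ of $X$.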

If we write $\delta$ for the homogeneous dimension of the space $(Y, d_Y, \nu)$, then the space $X$ is of homogeneous type with homogeneous dimension $\delta + 2$. We choose $\varphi \in \mathcal S(\mathbb R^+)$ such that $\int_{\mathbb R^+} \varphi(t)\,dt = 1$ and $\varphi(t) := 0$ for all $t < 0$ ($\varphi$ does not need to be continuous at 0). In fact we shall use only the fast decay of $\varphi$ and we will never consider its regularity. In addition, we have added a condition on the support. This is a “physical” heuristic: this condition permits us to define $A_Q(f)(t,x)$ by (3.2) using only $(f(\sigma,y))_{\sigma\le t}$, which corresponds to the “past information” about $f$. However, we do not really need this assumption in the sequel. For each ball $Q$ of $X$, we write $r_Q$ for its radius and we define the operator $B_Q$ as:

$$
B_Q = B_{r_Q^2} \quad\text{with}\quad B_r(f) := f - A_r(f), \tag{3.1}
$$

where the operator $A_r$ is defined by:

$$
A_r(f)(t,x) := \int_{\sigma=0}^{+\infty} \varphi_r(t-\sigma)\,\big[ e^{rL} f(\sigma,\cdot) \big](x)\,d\sigma. \tag{3.2}
$$



Here we write $\varphi_r$ for the $L^1(\mathbb R)$-normalized function $\varphi_r(t) := r^{-1}\varphi(t/r)$. In fact, the integral over $\sigma \in [0,\infty)$ reduces to $[0,t]$, due to the fact that $\varphi$ is supported in $\mathbb R^+$. Now, to check the abstract assumption on the Hardy space and to be able to interpolate our operator $T$, we will use some conditions on our semigroup $e^{tL}$. We refer the reader to the work of P. Auscher and J.M. Martell [4] for a precise study of off-diagonal estimates. Here we define exactly the decay which will be required later.

Definition 3.2. Let $T := (T_t)_{t\in J}$ be a collection of $L^2(Y)$-bounded operators and $p$ a positive integer. We will say that $T$ satisfies off-diagonal $L^2 - L^2$ estimates at order $p$ if there exists a bounded decreasing function $\gamma$ satisfying

$$
\forall\, 0 \le k \le p, \qquad \sup_{u\ge0}\ \gamma(u)(1+u)^k < \infty, \tag{3.3}
$$

such that for all balls $B \subset Y$ of radius $r$, for all functions $f$ supported in $B$, then

$$
\Big( \frac{1}{\nu(2^{j+1}B)} \int_{S_j(B)} \big|T_{u^2}(f)\big|^2\,d\nu \Big)^{1/2}
\le \frac{\nu(B)}{\nu(2^{j+1}B)}\ \gamma\Big( \frac{2^{j+1} r}{u} \Big)\, \Big( \frac{1}{\nu(B)} \int_B |f|^2\,d\nu \Big)^{1/2}. \tag{3.4}
$$

We also write $T \in O_p(L^2 - L^2)$.

Remark 3.3. This condition is satisfied for $p = \infty$ if the kernel $K_t$ of the operator $T_t$ admits some Gaussian estimates like

$$
|K_t(x,y)| \lesssim \frac{1}{\nu(B(x,t^{1/2}))}\, e^{-\rho\, d(x,y)^2/t},
$$

with $\rho > 0$.

We will prove the following result in the next section:

Theorem 3.4. Let $L$ be a generator of a bounded analytic semigroup $T := (e^{tL})_{t>0}$ on $L^2(Y)$ such that $T$, $(tLe^{tL})_{t>0}$ and $(t^2L^2e^{tL})_{t>0}$ belong to $O_4(L^2 - L^2)$. Then for all $\epsilon > 0$ the operator $T$ is continuous from $H^1_{F,\epsilon,mol}(X)$ to $L^1(X)$.

Remark 3.5. We recall that the semigroup $(e^{tL})_{t>0}$ is supposed to be analytic on $L^2$. Using the Cauchy formula, if $(e^{zL})_z$ satisfies the $L^2 - L^2$ off-diagonal estimates of $O_4(L^2 - L^2)$ for the complex variable $z$ belonging to a complex cone, then $(tLe^{tL})_{t>0}$ and $(t^2L^2e^{tL})_{t>0}$ belong to $O_4(L^2 - L^2)$.

We finish this section by explaining how we can use this result to obtain a positive answer for the maximal regularity problem. We want to apply the abstract results recalled in the previous section. First we have to check assumption (2.2):

Proposition 3.6. There is a constant $0 < A < \infty$ so that for all $r > 0$ the operator $A_r$ is $L^2(X)$-bounded and we have: $\|A_r\|_{L^2\to L^2} \le A$.



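Proof. By definition the semigroup $e^{rL}$ is $L^2(Y)$-bounded, so we have the following estimates:

\begin{align*}
\|A_r(f)\|_2 &= \left\| \, \left\| \int_{\sigma=0}^{+\infty} \varphi_r(t-\sigma)\, e^{rL} f(\sigma,.) \, d\sigma \right\|_{2,d\nu} \right\|_{2,dt} \\
&\lesssim \left\| \, \left\| \int_{\sigma=0}^{+\infty} \varphi_r(t-\sigma)\, f(\sigma,.) \, d\sigma \right\|_{2,d\nu} \right\|_{2,dt} \\
&\lesssim \left\| \, \left\| \int_{\sigma=-\infty}^{t} \varphi_r(\sigma)\, f(t-\sigma,.) \, d\sigma \right\|_{2,d\nu} \right\|_{2,dt} \\
&\lesssim \|\varphi_r\|_1 \, \|f\|_{2,d\mu} \lesssim \|f\|_{2,d\mu}.
\end{align*}

So we have proved that $A_r$ is $L^2(X)$-bounded and that its bound is uniform in $r > 0$. □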
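The last two steps of the proof rest on Young's inequality together with the scale invariance $\|\varphi_r\|_1 = \|\varphi\|_1$. The sketch below checks the discrete analogue on a time grid. It is only an illustration under stated assumptions: the profile `phi`, the grid step `h` and the sample data are hypothetical choices, and the semigroup factor $e^{rL}$ is left out, so only the scalar convolution in time is modelled.

```python
import math

def phi(t):
    # Hypothetical profile (not from the paper): t * exp(-t) for t > 0, zero otherwise.
    # It is supported in R^+ and its integral over R equals 1.
    return t * math.exp(-t) if t > 0 else 0.0

def kernel_values(r, h, n):
    """Samples of phi_r(t) = phi(t / r) / r on the grid t = 0, h, ..., (n - 1) * h."""
    return [phi(m * h / r) / r for m in range(n)]

def l1_norm(values, h):
    """Riemann-sum L^1 norm on a grid of step h."""
    return h * sum(abs(v) for v in values)

def l2_norm(values, h):
    """Riemann-sum L^2 norm on a grid of step h."""
    return math.sqrt(h * sum(v * v for v in values))

def averaged(f_values, r, h):
    """Riemann-sum version of t -> integral over [0, t] of phi_r(t - sigma) f(sigma) d sigma."""
    n = len(f_values)
    kernel = kernel_values(r, h, n)
    return [h * sum(kernel[i - j] * f_values[j] for j in range(i + 1)) for i in range(n)]

def check_uniform_bound(f_values, h, radii):
    """Return (r, ||phi_r * f||_2, ||phi_r||_1 * ||f||_2) for each scale r."""
    results = []
    for r in radii:
        lhs = l2_norm(averaged(f_values, r, h), h)
        rhs = l1_norm(kernel_values(r, h, len(f_values)), h) * l2_norm(f_values, h)
        results.append((r, lhs, rhs))
    return results
```

Calling `check_uniform_bound` with several values of `r` shows the left-hand side staying below the right-hand side at every scale, which is the discrete counterpart of the uniform bound in Proposition 3.6.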

Theorem 3.7. The operator $T$ is $L^2(X)$-bounded.

This fact was proved in [11], because it is equivalent to maximal $L^2$-regularity on $L^2(Y)$. Applying Theorem 2.5, we have:

Theorem 3.8. Let $L$ be a generator of a bounded analytic semigroup $\mathcal{T} := (e^{tL})_{t>0}$ on $L^2(Y)$ such that $\mathcal{T}$, $(tLe^{tL})_{t>0}$ and $(t^2L^2e^{tL})_{t>0}$ belong to $\mathcal{O}_4(L^2-L^2)$. Let us assume that for some $q_0 \in (2,\infty]$

\[ M_{q_0} \lesssim M_{HL,2}. \tag{3.5} \]

Then for every exponent $p \in (q_0', 2]$, where $q_0'$ denotes the conjugate exponent of $q_0$, the operator $T$ admits a continuous extension on $L^p(Y)$ and so $L$ has maximal $L^p$-regularity. In addition we can have weighted results: let $\omega \in A_\infty(Y)$ be a weight on $Y$. Then for all exponents $p \in \mathcal{W}_\omega(q_0', 2)$, $L$ has maximal $L^p(\omega)$-regularity.

Proof. The first part of the theorem is a direct consequence of Theorems 2.5 and 3.4, as the required assumptions were checked above. The second part, about weighted results, is an application of Theorem 2.7 with the following property. For $\omega \in A_\infty(Y)$ a weight on the space $Y$, we write $\tilde\omega$ for the associated weight on $X = J \times Y$ defined by the tensor product $\tilde\omega := 1_{\mathbb{R}} \otimes \omega$: for every ball $Q \subset X$ of radius $r_Q$, we can write $Q = I \times Q_Y$ with $I$ an interval of length $r_Q^2$ and $Q_Y \subset Y$ a ball of radius $r_Q$, and

\[ \tilde\omega(Q) := r_Q^2 \, \omega(Q_Y). \]

With this definition, it is easy to check that for exponents $p, q \in [1,\infty]$:

\[ \tilde\omega \in A_p(X) \iff \omega \in A_p(Y) \qquad \text{and} \qquad \tilde\omega \in RH_q(X) \iff \omega \in RH_q(Y). \]

So we have $\mathcal{W}_\omega(q_0', 2) = \mathcal{W}_{\tilde\omega}(q_0', 2)$. □

We now want to study the main assumption (3.5). As an example, we give another, stronger assumption, expressed through "off-diagonal" estimates.

Proposition 3.9. We recall the maximal operator

\[ M_{q_0}(f)(\sigma,x) := \sup_{\substack{Q \text{ ball} \\ (\sigma,x)\in Q}} \left( \frac{1}{\mu(Q)} \int_Q \big| A_Q^*(f) \big|^{q_0} \, d\mu \right)^{1/q_0}. \]

Assume that the semigroup $(e^{tL})_{t>0}$ satisfies the following "$L^{q_0}-L^2$ off-diagonal" estimates: there exist coefficients $(\beta_j)_{j\ge 0}$ satisfying

\[ \sum_{j\ge 0} 2^j \beta_j < \infty \tag{3.6} \]

such that for all balls $B$ (of radius $r_B$) and for all functions $f \in L^2(Y)$ we have

\[ \left( \frac{1}{\nu(B)} \int_B \big| e^{r_B^2 L^*}(f) \big|^{q_0} \, d\nu \right)^{1/q_0} \lesssim \sum_{j\ge 0} \beta_j \left( \frac{1}{\nu(2^jB)} \int_{2^jB} |f|^2 \, d\nu \right)^{1/2}. \tag{3.7} \]

Then $M_{q_0}$ is bounded by the Hardy–Littlewood maximal operator $M_{HL,2}$ on $X$, so (3.5) is satisfied.

Proof. Let $Q$ be a ball containing the point $(\sigma,x) \in X$ and let $r_Q$ be its radius. For $f, g \in L^2(X)$ we have:

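\begin{align*}
\langle A_Q(f), g \rangle &:= \int_{(t,x)\in X} \int_{\sigma=0}^{+\infty} \varphi_{r_Q^2}(t-\sigma)\, e^{r_Q^2 L}\big[f(\sigma,.)\big](x)\, g(t,x) \, d\sigma \, dt \, d\nu(x) \\
&= \int_{(t,x)\in X} \int_{\sigma=0}^{+\infty} \varphi_{r_Q^2}(t-\sigma)\, f(\sigma,x)\, \big(e^{r_Q^2 L}\big)^*\big[g(t,.)\big](x) \, d\sigma \, dt \, d\nu(x).
\end{align*}

So we conclude that

\[ A_Q^*(g)(\sigma,x) := \int_{t\in\mathbb{R}^+} \varphi_{r_Q^2}(t-\sigma)\, \big(e^{r_Q^2 L}\big)^*\big[g(t,.)\big](x) \, dt. \tag{3.8} \]

By using the Minkowski inequality, we also have that

\[ \left( \int_Q \big| A_Q^*(g) \big|^{q_0} \, d\mu \right)^{1/q_0} \le \int_{t\in\mathbb{R}^+} \Big\| \varphi_{r_Q^2}(t-\sigma)\, \big(e^{r_Q^2 L}\big)^*\big[g(t,.)\big](x)\, 1_Q(\sigma,x) \Big\|_{q_0,\, d\nu(x)\, d\sigma} \, dt. \]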



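By definition of the parabolic quasi-distance, we can write $Q = I \times B$ with $I$ an interval of length $r_Q^2$ and $B$ a ball of $Y$ of radius $r_Q$. Then we have:

\[ \left( \int_Q \big| A_Q^*(g) \big|^{q_0} \, d\mu \right)^{1/q_0} \le \int_{t\in\mathbb{R}^+} \big\| \varphi_{r_Q^2}(t-\sigma)\, 1_I(\sigma) \big\|_{q_0,d\sigma} \, \Big\| 1_B(x)\, \big(e^{r_Q^2 L}\big)^*\big[g(t,.)\big](x) \Big\|_{q_0,d\nu(x)} \, dt. \]

With the assumption (3.7), we obtain

\[ \left( \int_Q \big| A_Q^*(g) \big|^{q_0} \, d\mu \right)^{1/q_0} \lesssim \sum_{j\ge 0} \int_{t\in\mathbb{R}^+} \big\| \varphi_{r_Q^2}(t-\sigma)\, 1_I(\sigma) \big\|_{q_0,d\sigma} \, \beta_j \, \frac{\nu(B)^{1/q_0}}{\nu(2^jB)^{1/2}} \, \big\| g(t,x)\, 1_{2^jB}(x) \big\|_{2,d\nu(x)} \, dt. \]

Now we decompose the integration over $t$:

\[ \left( \int_Q \big| A_Q^*(g) \big|^{q_0} \, d\mu \right)^{1/q_0} \lesssim \sum_{j\ge 0} \sum_{k\ge 0} \int_{t\in S_k(I)} \big\| \varphi_{r_Q^2}(t-\sigma)\, 1_I(\sigma) \big\|_{q_0,d\sigma} \, \beta_j \, \frac{\nu(B)^{1/q_0}}{\nu(2^jB)^{1/2}} \, \big\| g(t,x)\, 1_{2^jB}(x) \big\|_{2,d\nu(x)} \, dt. \]

With the Cauchy–Schwarz inequality, we have

\begin{align*}
\left( \int_Q \big| A_Q^*(g) \big|^{q_0} \, d\mu \right)^{1/q_0} &\lesssim \sum_{j\ge 0} \sum_{k\ge 0} r_Q^{-2} \big(1+2^k\big)^{-l} r_Q^{2/q_0} \, \beta_j \, \frac{\nu(B)^{1/q_0}}{\nu(2^jB)^{1/2}} \, \big(2^k r_Q^2\big)^{1/2} \, \big\| g(t,x)\, 1_{2^kI\times 2^jB}(t,x) \big\|_{2,\,dt\,d\nu(x)} \\
&\lesssim r_Q^{-1+2/q_0} \sum_{j\ge 0} \sum_{k\ge 0} \big(1+2^k\big)^{-l+1/2} \beta_j \, \frac{\nu(B)^{1/q_0}}{\nu(2^jB)^{1/2}} \, \big\| g(t,x)\, 1_{2^kI\times 2^jB}(t,x) \big\|_{2,\,dt\,d\nu(x)}.
\end{align*}

Here $l$ is an integer as large as we want, due to the fast decay of $\varphi$. Using the Hardy–Littlewood maximal operator, we have

\[ \big\| g(t,x)\, 1_{2^kI\times 2^jB}(t,x) \big\|_{2,\,dt\,d\nu(x)} \le \mu\big( \max\{2^j, 2^{k/2}\}\, Q \big)^{1/2} \, \inf_Q M_{HL,2}(g). \]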



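So we obtain

\[ \left( \int_Q \big| A_Q^*(g) \big|^{q_0} \, d\mu \right)^{1/q_0} \lesssim r_Q^{-1+2/q_0} \sum_{j\ge 0} \sum_{k\ge 0} \big(1+2^k\big)^{-l+1/2} \beta_j \, \frac{\nu(B)^{1/q_0}}{\nu(2^jB)^{1/2}} \, \mu\big( \max\{2^j, 2^{k/2}\}\, Q \big)^{1/2} \, \inf_Q M_{HL,2}(g). \]

We now estimate the sum over the parameters $j$ and $k$. We distinguish the following two cases. Write

\[ S_1 := r_Q^{-1+2/q_0} \sum_{j \ge k/2 \ge 0} \big(1+2^k\big)^{-l+1/2} \beta_j \, \frac{\nu(B)^{1/q_0}}{\nu(2^jB)^{1/2}} \, \mu\big(2^jQ\big)^{1/2} \]

and

\[ S_2 := r_Q^{-1+2/q_0} \sum_{k/2 \ge j \ge 0} \big(1+2^k\big)^{-l+1/2} \beta_j \, \frac{\nu(B)^{1/q_0}}{\nu(2^jB)^{1/2}} \, \mu\big(2^{k/2}Q\big)^{1/2}. \]

We must estimate these two sums. For the first, we use that $\mu(Q) = |I|\,\nu(B) = r_Q^2\,\nu(B)$ to obtain

\begin{align*}
S_1 &\lesssim \sum_{j \ge k/2 \ge 0} 2^j \big(1+2^k\big)^{-l+1/2} \beta_j \, \frac{\mu(Q)^{1/q_0}}{\mu(2^jQ)^{1/2}} \, \mu\big(2^jQ\big)^{1/2} \\
&\lesssim \mu(Q)^{1/q_0} \sum_{j \ge k/2 \ge 0} 2^j \big(1+2^k\big)^{-l+1/2} \beta_j \\
&\lesssim \mu(Q)^{1/q_0} \sum_{j\ge 0} 2^j \beta_j \lesssim \mu(Q)^{1/q_0}.
\end{align*}

In the last inequality, we have used the assumption (3.6) on the coefficients $(\beta_j)_j$. For the second sum, we have (with the doubling property of $\mu$ and $l$ large enough)

\begin{align*}
S_2 &\lesssim r_Q^{2/q_0}\, \nu(B)^{1/q_0} \sum_{k/2 \ge j \ge 0} \big(1+2^k\big)^{-l+1/2} r_Q^{-1} \beta_j \left( \frac{\mu(2^{k/2}Q)}{\nu(2^jB)} \right)^{1/2} \\
&\lesssim \mu(Q)^{1/q_0} \sum_{k/2 \ge j \ge 0} \big(1+2^k\big)^{-l+1/2} r_Q^{-1} \beta_j \left( \frac{\mu(2^jQ)}{\nu(2^jB)} \right)^{1/2} 2^{(k/2-j)(\delta+2)/2} \\
&\lesssim \mu(Q)^{1/q_0} \sum_{k/2 \ge j \ge 0} \big(1+2^k\big)^{-l+1/2} \beta_j \, 2^j \, 2^{(k/2-j)(\delta+2)/2} \\
&\lesssim \mu(Q)^{1/q_0} \sum_{j\ge 0} \big(1+2^j\big)^{-l+4+\delta/2} \beta_j \, 2^{-j(\delta/2+1)} \lesssim \mu(Q)^{1/q_0}.
\end{align*}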



So we have proved that there exists a constant $C$ (independent of $g$ and $Q$) such that

\[ \left( \int_Q \big| A_Q^*(g) \big|^{q_0} \, d\mu \right)^{1/q_0} \le C \mu(Q)^{1/q_0} \, \inf_Q M_{HL,2}(g). \]

We can therefore conclude that $M_{q_0}(g) \lesssim M_{HL,2}(g)$. □

We have described "off-diagonal" estimates, at dyadic scales, which imply (3.5). We would like to finish this section by comparing our result with that of S. Blunck and P.C. Kunstmann [6]. In their paper, the authors use their assumptions (1.1) and (1.2) to obtain, inside their proof, the following inequality: for all balls $Q_Y$ of $Y$ and all functions $f \in L^2(Y)$,

\[ \left( \frac{1}{\nu(Q_Y)} \int_{Q_Y} \big| e^{-r_{Q_Y}^2 L} f \big|^2 \, d\nu \right)^{1/2} \lesssim \sum_{k=0}^{\infty} k^{-1-\epsilon} \left( \frac{1}{\nu(kQ_Y)} \int_{kQ_Y} |f|^{p_0} \, d\nu \right)^{1/p_0}. \tag{3.9} \]

With this inequality, a simple computation gives that for all functions $f \in L^2(X)$ and all balls $Q$ of $X$,

\[ \left( \frac{1}{\mu(Q)} \int_Q \big| A_Q(f) \big|^2 \, d\mu \right)^{1/2} \lesssim \inf_Q M_{HL,p_0}(f). \tag{3.10} \]

It is surprising that their assumption (3.10) does not seem to be comparable with ours, (3.5). The two assumptions are quite different in the sense that they require different kinds of "off-diagonal" estimates; however, they seem to be dual to each other.

4. Proof of Theorem 3.4

This section is devoted to the proof of a technical result: Theorem 3.4. Let us first restate it.

Theorem 4.1. Let $L$ be a generator of a bounded analytic semigroup on $L^2(Y)$. Assume that $(e^{tL})_{t>0}$, $(tLe^{tL})_{t>0}$ and $(t^2L^2e^{tL})_{t>0}$ belong to $\mathcal{O}_4(L^2-L^2)$. Then there exist coefficients $\alpha_{j,k}$ such that for all balls $Q \subset X$, for all $k \ge 0$, $j \ge 2$ and for all functions $f$ supported in $S_k(Q)$,

\[ \left( \frac{1}{\mu(2^{j+k+1}Q)} \int_{S_j(2^kQ)} \big| T B_Q(f) \big|^2 \, d\mu \right)^{1/2} \le \alpha_{j,k} \left( \frac{1}{\mu(2^{k+1}Q)} \int_{S_k(Q)} |f|^2 \, d\mu \right)^{1/2}. \tag{4.1} \]

In addition the coefficients $\alpha_{j,k}$ (independent of $Q$) satisfy

\[ \Lambda := \sup_Q \sup_{k\ge 0} \sum_{j\ge 2} \frac{\mu(2^{j+k+1}Q)}{\mu(2^{k+1}Q)} \, \alpha_{j,k} < \infty. \tag{4.2} \]

With Theorem 2.3, these estimates imply the $H^1_{F,\epsilon,\mathrm{mol}}(X) - L^1(X)$ boundedness of $T$ for all $\epsilon > 0$.


Proof. We write $r = r_Q$ and $(t_0, x_0)$ for the radius and the center of the ball $Q$, so that $B_Q$ is defined as $B_{r^2}$. The function $f$ is fixed. The parameters $j$ and $k$ are fixed too. We write $Q$ as the product $Q = I \times B$ with $I$ an interval of length $r_Q^2$ and $B$ a ball of $Y$ of radius $r_Q$. We have

\begin{align*}
T B_{r^2}(f)(t,x) &= T(f)(t,x) - T A_{r^2}(f)(t,x) \\
&= \int_0^t \big[ L e^{(t-s)L} f(s,.) \big](x) \, ds - \int_0^t \big[ L e^{(t-s)L} A_{r^2} f(s,.) \big](x) \, ds,
\end{align*}

where

\[ \big[ L e^{(t-s)L} A_{r^2} f(s,.) \big](x) = L e^{(t-s)L} \left[ \int_{\sigma=0}^{+\infty} \varphi_{r^2}(s-\sigma)\, e^{r^2L} f(\sigma,.) \, d\sigma \right](x). \]

So we obtain $T(B_{r^2} f)(t,x) =$

$\varphi_{r^2}(s-\sigma)$ […] and $(t^2 L^{2*} e^{tL^*})_{t>0}$ belong to $\mathcal{O}_4(L^2-L^2)$. Then $T^*$ is $H^1_{F,\epsilon,\mathrm{mol}}$ bounded for every $\epsilon > 0$, with the Hardy space $H^1_{\epsilon,\mathrm{mol}} := H^1_{\epsilon,\mathrm{mol},(B_Q^*)_{Q\in\mathcal{Q}}}$ (which is the Hardy space constructed with the dual operators $B_Q^*$).

Proof. The adjoint operator $T^*$ is given by

\[ T^* f(t,x) = \int_{s=t}^{Z} \big[ L^* e^{(s-t)L^*} f(s,.) \big](x) \, ds. \]

The parameter $Z$ depends on the time interval $J$; it is defined by

\[ Z := \begin{cases} \infty & \text{if } J = (0,\infty), \\ l & \text{if } J = (0,l). \end{cases} \]

The argument of the previous theorem can be repeated, and we omit the details. We refer the reader to [2] for a precise study of $T$ and $T^*$ for $L = -\Delta$, the negative of the Laplacian. □



So now we can apply our general theory to obtain the following result:

Theorem 5.2. Let $L$ be a generator of a bounded analytic semigroup on $L^2(Y)$ such that $(e^{tL^*})_{t>0}$, $(tL^*e^{tL^*})_{t>0}$ and $(t^2L^{2*}e^{tL^*})_{t>0}$ belong to $\mathcal{O}_4(L^2-L^2)$. Let us assume that for some $q_0 \in (2,\infty]$, for all balls $Q \subset X$ and all functions $f \in L^2(X)$, we have

\[ \left( \frac{1}{\mu(Q)} \int_Q \big| A_Q(f) \big|^{q_0} \, d\mu \right)^{1/q_0} \lesssim \inf_Q M_{HL,2}(f). \]

Then for all $p \in (q_0', 2]$, where $q_0'$ denotes the conjugate exponent of $q_0$, the operator $T^*$ is $L^p(X)$-bounded and so $T$ is $L^{p'}$-bounded. We also have maximal regularity on $L^p(Y)$ for all $p \in [2, q_0)$.

Proof. We use Theorem 2.5 for the operator $T^*$, whose hypotheses are satisfied thanks to Theorem 5.1. □

5.2. Study of our Hardy spaces

To finish this paper, we show some results on our Hardy space. First we have the off-diagonal decay (2.7).

Proposition 5.3. Assume that $(e^{tL})_{t>0} \in \mathcal{O}_p(L^2-L^2)$ for an integer $p \ge 2$. For $B_Q$ defined by (3.1) and (3.2), we have that for all balls $Q \subset X$

\[ \forall i \ge 0,\ \forall k \ge 0,\ \forall f \in L^2(2^kQ), \qquad \big\| B_Q(f) \big\|_{2,S_i(2^kQ)} \le C\, 2^{-Mi} \, \|f\|_{2,2^kQ} \tag{5.1} \]

with the exponent $M = p - 1$.

Proof. By definition we just have to prove the decay for the operator $A_Q$. Let $r$ be the radius of $Q$. As previously, we write $Q = I \times B$ where $I$ is an interval of length $r^2$ and $B$ is a ball in $Y$ of radius $r$. Recall that

\[ A_Q(f)(t,x) := \int_{\sigma=0}^{+\infty} \varphi_{r^2}(t-\sigma)\, e^{r^2L}\big[f(\sigma,.)\big](x) \, d\sigma. \]

For $i \le 1$, we just use the $L^2(X)$-boundedness of $A_Q$ to prove (5.1). Then for $i \ge 2$ and $(\sigma,y) \in 2^kQ$, if $(t,x) \in S_i(2^kQ)$ we have that $d((x,t),(\sigma,y)) \gtrsim 2^{k+i} r$ and, by using the definition of the parabolic quasi-distance, we conclude that $x \in S_i(2^kB)$ or $t \in S_{2i}(2^{2k}I)$. We study the two cases.

First, for $x \in S_i(2^kB)$, the off-diagonal estimate (3.4) gives, for all $\sigma > 0$,

\[ \big\| e^{r^2L} f(\sigma,.) \big\|_{2,S_i(2^kB)} \lesssim \frac{\nu(2^kB)}{\nu(2^{i+k}B)} \, \gamma\big(2^{i+k+1}\big) \left( \frac{\nu(2^{i+k}B)}{\nu(2^kB)} \right)^{1/2} \big\| f(\sigma,.) \big\|_{2,2^kB}. \]

So by the Minkowski inequality, we obtain



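\begin{align*}
\big\| A_Q(f)(t,.) \big\|_{2,S_i(2^kB)} &\lesssim \int_{\sigma=0}^{+\infty} \left( 1 + \frac{|t-\sigma|}{r^2} \right)^{-N} \frac{\nu(2^kB)}{\nu(2^{i+k}B)} \, \gamma\big(2^{i+k}\big) \left( \frac{\nu(2^{i+k}B)}{\nu(2^kB)} \right)^{1/2} \big\| f(\sigma,.) \big\|_{2,2^kB} \, \frac{d\sigma}{r^2} \\
&\lesssim \frac{1}{r} \left( \frac{\nu(2^kB)}{\nu(2^{i+k}B)} \right)^{1/2} \gamma\big(2^{i+k}\big) \, \|f\|_{2,2^kQ} \\
&\lesssim \frac{1}{r} \, \gamma\big(2^{i+k}\big) \, \|f\|_{2,2^kQ}.
\end{align*}

Then we integrate over $t \in 2^{2(i+k)}I$ to obtain

\[ \big\| A_Q(f) \big\|_{2,\,2^{2(i+k)}I\times S_i(2^kB)} \lesssim 2^{i+k} \, \gamma\big(2^{i+k}\big) \, \|f\|_{2,2^kQ}. \]

For the second case, we have $|t-\sigma| \simeq 2^{2(i+k)} r^2$. By using the $L^2(Y)$-boundedness of the semigroup,

\[ \big\| e^{r^2L} f(\sigma,.) \big\|_{2,2^{i+k}B} \lesssim \big\| f(\sigma,.) \big\|_{2,2^kB}. \]

So by the Minkowski inequality, we obtain

\begin{align*}
\big\| A_Q(f)(t,.) \big\|_{2,2^{k+i}B} &\lesssim \int_{\sigma\in 2^kI} \big( 1 + 2^{2(k+i)} \big)^{-N} \big\| f(\sigma,.) \big\|_{2,2^kB} \, \frac{d\sigma}{r^2} \\
&\lesssim \frac{1}{r} \, 2^{-2(k+i)(N-1)} \, \|f\|_{2,2^kQ}.
\end{align*}

So we can conclude that

\[ \big\| A_Q(f) \big\|_{2,\,S_{2i}(2^{2k}I)\times 2^{i+k}B} \lesssim 2^{-(N-2)(k+i)} \, \|f\|_{2,2^kQ}. \]

With these two cases, we can conclude (for $N$ any large enough integer) that

\[ \big\| A_Q(f) \big\|_{2,S_i(2^kQ)} \lesssim \Big[ 2^{-(N-2)i} + 2^{i+k} \, \gamma\big(2^{i+k}\big) \Big] \|f\|_{2,2^kQ}, \]

which, together with the decay of $\gamma$, proves the result. □

With this decay $M > \frac{\delta+2}{2}$ (if $p > (\delta+4)/2$), we have shown that the Hardy spaces $H^1_{\mathrm{ato}}(X)$ and $H^1_{\epsilon,\mathrm{mol}}(X)$ are included in the space $L^1(X)$ (see Proposition 2.9). In fact we can improve this result, by comparing it with the classical Hardy space of Coifman–Weiss on $X$.

Proposition 5.4. Let $\epsilon > 0$. The inclusion $H^1_{\mathrm{ato}}(X) \subset H^1_{\epsilon,\mathrm{mol}}(X) \subset H^1_{CW}(X)$ is equivalent to the fact that for every ball $Q$, $\big(e^{r_Q^2 L}\big)^*(1_Y) = 1_Y$ (in the sense of Proposition 2.10).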



Proof. We use the notation of Proposition 2.10. By this proposition, we know that $H^1_{\epsilon,\mathrm{mol}}(X) \subset H^1_{CW}(X)$ is equivalent to the fact that for all balls $Q$ of $X$, $A_Q^*(1_X) = 1_X$ in the sense of $(\mathrm{Mol}_{\epsilon,Q})^*$. Let $Q = B((t_Q, c_Q), r_Q)$ be fixed. By (3.8) we know that

\[ A_Q^*(g)(\sigma,x) := \int_{t\in\mathbb{R}^+} \varphi_{r_Q^2}(t-\sigma)\, \big(e^{r_Q^2 L}\big)^*\big[g(t,.)\big](x) \, dt. \]

As $\int_{\mathbb{R}} \varphi(t)\,dt = 1$, we formally obtain

\[ A_Q^*(1_X)(\sigma,x) = \big(e^{r_Q^2 L}\big)^*(1_Y)(x). \]

This equality can be rigorously verified by defining $\big(e^{r_Q^2 L}\big)^*(1_Y)(x)$ as the continuous linear form on the space

\[ \mathrm{Mol}_{\epsilon,r_Q}(Y) := \big\{ f \in L^1(Y),\ \|f\|_{\mathrm{Mol}_{\epsilon,r_Q}(Y)} < \infty \big\}, \]

where

\[ \|f\|_{\mathrm{Mol}_{\epsilon,r_Q}(Y)} := \sup_{i\ge 0} \|f\|_{2,S_i(Q_Y)} \, \nu\big(2^iQ_Y\big)^{1/2} \, 2^{\epsilon i}. \]

Here we write $Q_Y = B(c_Q, r_Q) = \{ y \in Y,\ d_Y(y, c_Q) \le r_Q \}$ for the ball in $Y$. Then the equivalence is a consequence of Proposition 2.10. □

In the paper [2], the authors have shown that when $-L$ is the Laplacian on $X$, a complete Riemannian manifold with the doubling property and a Poincaré inequality, the operator $T$ is bounded on $H^1_{CW}(X)$ (not just bounded into $L^1(X)$). This is a better result than the one here because Proposition 5.4 applies (see [2]), so

\[ H^1_{\mathrm{ato}}(X) \subset H^1_{\epsilon,\mathrm{mol}}(X) \subset H^1_{CW}(X) \subset L^1(X). \]

But the $H^1_{CW}$-boundedness uses stronger hypotheses than ours, in a specific situation.

Acknowledgments

The authors are indebted to Professor Pascal Auscher for suggesting this topic, for numerous discussions and for valuable advice that improved this paper. The authors are very grateful for the referee's helpful comments and advice. The second author would like to express many thanks to the Department of Mathematics, Paris-Sud University, for its hospitality.



References

[1] P. Auscher, On necessary and sufficient conditions for Lp estimates of Riesz transforms associated to elliptic operators on Rn and related estimates, Mem. Amer. Math. Soc. 186 (871) (2007).
[2] P. Auscher, F. Bernicot, J. Zhao, Maximal regularity and Hardy spaces, Collect. Math. 59 (1) (2008) 103–127.
[3] P. Auscher, J.M. Martell, Weighted norm inequalities, off-diagonal estimates and elliptic operators, Part I: General operator theory and weights, Adv. in Math. 212 (2007) 225–276.
[4] P. Auscher, J.M. Martell, Weighted norm inequalities, off-diagonal estimates and elliptic operators, Part II: Off-diagonal estimates on spaces of homogeneous type, J. Evol. Equ. 7 (2007) 265–316.
[5] F. Bernicot, J. Zhao, New abstract Hardy spaces, J. Funct. Anal. 255 (2008) 1761–1796.
[6] S. Blunck, P. Kunstmann, Weighted norm estimates and maximal regularity, Adv. Differential Equations 7 (12) (2002) 1513–1532.
[7] P. Cannarsa, V. Vespri, On maximal Lp regularity for the abstract Cauchy problem, Boll. Unione Mat. Ital. B (6) 5 (1) (1986) 165–175.
[8] R. Coifman, G. Weiss, Extensions of Hardy spaces and their use in analysis, Bull. Amer. Math. Soc. 83 (1977) 569–645.
[9] T. Coulhon, X.T. Duong, Riesz transforms for 1 ≤ p ≤ 2, Trans. Amer. Math. Soc. 351 (2) (1999) 1151–1169.
[10] T. Coulhon, X.T. Duong, Maximal regularity and kernel bounds: Observations on a theorem by Hieber and Prüss, Adv. Differential Equations 5 (1–3) (2000) 343–368.
[11] L. de Simon, Un'applicazione della teoria degli integrali singolari allo studio delle equazioni differenziali lineari astratte del primo ordine, Rend. Sem. Mat. Univ. Padova 34 (1964) 205–223.
[12] M. Hieber, J. Prüss, Heat kernels and maximal Lp–Lq estimates for parabolic evolution equations, Comm. Partial Differential Equations 22 (9–10) (1997) 1647–1669.
[13] D. Lamberton, Équations d'évolution linéaires associées à des semi-groupes de contractions dans les espaces Lp, J. Funct. Anal. 72 (2) (1987) 252–262.
[14] L. Weis, Operator-valued Fourier multiplier theorems and maximal Lp-regularity, Math. Ann. 319 (2001) 735–758.


Positive commutators, Fermi golden rule and the spectrum of zero temperature Pauli–Fierz Hamiltonians

Sylvain Golénia

Mathematisches Institut der Universität Erlangen-Nürnberg, Bismarckstr. 1 1/2, 91054 Erlangen, Germany

Received 7 July 2008; accepted 5 December 2008
Available online 21 January 2009
Communicated by L. Gross

In honor of the 60th birthday of Vladimir Georgescu

Abstract

We perform the spectral analysis of a zero temperature Pauli–Fierz system for small coupling constants. Under the Fermi golden rule hypothesis, we show that the embedded eigenvalues of the uncoupled system disappear and establish a limiting absorption principle above this level of energy. We rely on a positive commutator approach introduced by Skibsted and pursued by Georgescu–Gérard–Møller. We complete some results obtained so far by Dereziński–Jakšić on one side and by Bach–Fröhlich–Sigal–Soffer on the other side.

© 2008 Elsevier Inc. All rights reserved.

Keywords: Fermi golden rule; Positive commutator estimates; Quantum field theory; Absence of eigenvalue; Limiting absorption principle

S. Golénia / Journal of Functional Analysis 256 (2009) 2587–2620
0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.12.016

1. Introduction

Pauli–Fierz operators are often used in quantum physics as generators of the approximate dynamics of a (small) quantum system interacting with a free Bose gas. They typically describe a non-relativistic atom interacting with a field of massless scalar bosons. Pauli–Fierz operators also appear in solid state physics, where they are used to describe the interaction of phonons with a quantum system with finitely many degrees of freedom. This paper is devoted to the justification of second-order perturbation theory for a large class of perturbations. For positive temperature systems, this property is related to the return to equilibrium, see for instance [10] and references therein.

This question has been studied in many places, see for instance [3–5,9,11,13,22] for zero temperature systems and [9,25,27] for positive temperature. We also mention [12,15,24,31], where certain spectral properties are studied using positive commutator techniques. Here, we focus on the zero temperature setting. In [3], the analysis is initiated using analytic deformation techniques. In [5] and in [9], a kind of Mourre estimate approach is introduced. The former enlarges the class of perturbations studied in [3], and the latter introduces another class. These two classes do not fully overlap, owing to the choice of the conjugate operator. In this paper, we enlarge the class of perturbations used in [9] for the question of the Virial theorem (one-commutator theory) and also for the limiting absorption principle (two-commutator theory).

Now, we present the model. For the sake of simplicity and as in [9], we start with an n-level atom. It is described by a self-adjoint matrix $K$ acting on a finite dimensional Hilbert space $\mathcal{K}$. Let $(k_i)_{i=0,\dots,n}$ be its eigenvalues, with $k_i < k_{i+1}$. On the other hand, we have the bosonic field $\Gamma_s(\mathfrak{h})$ with the 1-particle Hilbert space $\mathfrak{h} := L^2(\mathbb{R}^d, dk)$. The Hamiltonian is given by the second quantization $d\Gamma(\omega)$ of $\omega$, where $\omega(k) = |k|$, see Section 2.1. This is a massless and zero temperature system. The free operator is given by $H_0 = K \otimes 1_{\Gamma(\mathfrak{h})} + 1_{\mathcal{K}} \otimes d\Gamma(\omega)$ on $\mathcal{K} \otimes \Gamma(\mathfrak{h})$. Its spectrum is $[k_0, \infty)$. It has no singularly continuous spectrum. Its point spectrum is the same as that of $K$, with the same multiplicity. Let $\alpha \in \mathcal{B}(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$ be a form-factor and $\phi(\alpha)$ the field operator associated to it, see Section 2.2. Under the condition

(I0) $(1 \otimes \omega^{-1/2})\alpha \in \mathcal{B}(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$,

we define the interacting Hamiltonian on $\mathcal{K} \otimes \Gamma(\mathfrak{h})$ by

\[ H_\lambda := K \otimes 1_{\Gamma(\mathfrak{h})} + 1_{\mathcal{K}} \otimes d\Gamma(\omega) + \lambda\phi(\alpha), \quad \text{where } \lambda \in \mathbb{R}. \tag{1.1} \]

The operator is self-adjoint with domain $\mathcal{K} \otimes \mathcal{D}(d\Gamma(\omega))$. We now focus on a selected eigenvalue $k_{i_0}$, with $i_0 > 0$. The aim of this paper is to give hypotheses on the form-factor $\alpha$ which ensure that $H_\lambda$ has no eigenvalue in a neighborhood of $k_{i_0}$ for $\lambda$ small enough (and non-zero). First, we have to ensure that the perturbation given by the field operator really couples the system at energy $k_{i_0}$; we have to avoid form factors like $\alpha(x) = 1 \otimes b$ for all $x \in \mathcal{K}$ and some $b \in \mathfrak{h}$, see Section 6. This is where second-order perturbation theory enters, namely the hypothesis of Fermi golden rule for the couple $(H_0, \alpha)$ at energy $k_{i_0}$:

\[ \operatorname*{w-lim}_{\varepsilon\to 0^+} P\phi(\alpha)\overline{P} \, \operatorname{Im}(H_0 - k_{i_0} + i\varepsilon)^{-1} \, \overline{P}\phi(\alpha)P > 0, \quad \text{on } P\mathcal{H}, \tag{1.2} \]

where $P := P_{k_{i_0}} \otimes P_\Omega$ and $\overline{P} := 1 - P$. At first sight, this is rather implicit. We make it explicit in Appendix A. This condition involves the form factor, the eigenvalues of $H_0$ lower than $k_{i_0}$ and its eigenfunctions. There, we also explain why the ground state energy is tacitly excluded.

In this paper, we establish an extended Mourre estimate, in the spirit of [16,31]; this is an extended version of the positive commutator technique initiated by E. Mourre, see [1,28] and [19,21] for recent developments. Due to the method, we make further hypotheses on the form-factor. To formulate them, we shall take advantage of polar coordinates and of the unitary map

\[ T : L^2(\mathbb{R}^d, dk) \longrightarrow L^2(\mathbb{R}^+, dr) \otimes L^2(S^{d-1}, d\theta) =: \tilde{\mathfrak{h}}, \qquad u \longmapsto Tu := \big( (r,\theta) \mapsto r^{(d-1)/2} u(r\theta) \big). \tag{1.3} \]
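The unitarity of the map (1.3) is the usual change to polar coordinates; the following short derivation is a sketch, assuming that $d\theta$ denotes the surface measure on the unit sphere $S^{d-1}$, so that $dk = r^{d-1}\,dr\,d\theta$. The weight $r^{(d-1)/2}$ in the definition of $T$ is exactly what absorbs the Jacobian.

```latex
\begin{align*}
\|Tu\|_{\tilde{\mathfrak{h}}}^2
  &= \int_{S^{d-1}} \int_0^\infty \big| r^{(d-1)/2}\, u(r\theta) \big|^2 \, dr \, d\theta \\
  &= \int_{S^{d-1}} \int_0^\infty |u(r\theta)|^2 \, r^{d-1} \, dr \, d\theta \\
  &= \int_{\mathbb{R}^d} |u(k)|^2 \, dk
   = \|u\|_{\mathfrak{h}}^2 .
\end{align*}
```

In these coordinates the dispersion relation $\omega(k) = |k|$ becomes multiplication by $r$ in the first factor, which is why the hypotheses on the form-factor are stated in terms of the radial derivative only.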

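We identify $\mathfrak{h}$ and $\tilde{\mathfrak{h}}$ through this transformation. We write $\partial_r$ for $\partial_r \otimes 1$. We first give meaning to the commutator via:

(I1a) $\alpha \in \mathcal{B}(\mathcal{K}, \mathcal{K} \otimes \dot{H}^1(\mathbb{R}^+) \otimes L^2(S^{d-1}))$, $\ 1 \otimes \omega^{-1/2}\partial_r\alpha \in \mathcal{B}(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$.

Here, the dot means the completion of $C_c^\infty(\mathbb{R}^+)$ under the norm given by the space. We denote by $\|\cdot\|_2$ the $L^2$ norm. Recall that the norm of $H^1$ is given by $\|\cdot\|_2 + \|\partial_r \cdot\|_2$.

We explain the method on a formal level. We start by choosing a conjugate operator so as to obtain some positivity of the commutator. We choose $A := 1_{\mathcal{K}} \otimes d\Gamma(i\partial_{|r|})$. Note that this operator is not self-adjoint, only maximal symmetric. We set $N := 1_{\mathcal{K}} \otimes d\Gamma(\mathrm{Id})$, the number operator. Thanks to (I1a), one obtains

\[ [H_\lambda, iA] = \underbrace{N + 1_{\mathcal{K}} \otimes P_\Omega}_{\ge 1} + \underbrace{\lambda\phi(\partial_r\alpha) - 1_{\mathcal{K}} \otimes P_\Omega}_{H_\lambda\text{-bounded}} =: M + S. \]

Consider a compact interval $J$. Since $d\Gamma(\omega)$ is non-negative, we have:

\[ E_J(H_0) = \sum_{k_i \le \sup(J)} P_{k_i} \otimes E_{J-k_i}\big(d\Gamma(\omega)\big). \tag{1.4} \]

We infer that $(1_{\mathcal{K}} \otimes P_\Omega)E_J(H_0) = 0$ if and only if $J$ contains no eigenvalues of $K$. We evaluate the commutator at an energy $J$ which contains $k_{i_0}$ and no other $k_i$. Thus,

\[ M + E_J(H_0)\, S\, E_J(H_0) \ge \big( 1 + (-1 + O(\lambda)) \big) E_J(H_0) = O(\lambda) E_J(H_0), \tag{1.5} \]

since $\phi(i\partial_r\alpha)$ is $H_0$-bounded. We keep $M$ outside the spectral measure as it is not $H_\lambda$-bounded. Note that we have no control on the sign of $O(\lambda)$ so far.

We have not yet used the Fermi golden rule assumption. We follow an idea of [5] and set

\[ B_\varepsilon := \operatorname{Im}\Big( \big( (H_0 - k_{i_0})^2 + \varepsilon^2 \big)^{-1} \overline{P}\phi(\alpha)P \Big). \]

Observe that (1.2) implies that there exists $c > 0$ such that

\[ P[H_\lambda, i\lambda B_\varepsilon]P = \frac{\lambda^2}{\varepsilon} \, P\phi(\alpha)\overline{P} \, \operatorname{Im}(H_0 - k_{i_0} + i\varepsilon)^{-1} \, \overline{P}\phi(\alpha)P \ge \frac{c\lambda^2}{\varepsilon} P \]

holds true for $\varepsilon$ small enough. Let $\hat{A} := A + \lambda B_\varepsilon$ and $\hat{S} := S + \lambda[H_\lambda, iB_\varepsilon]$. We have $[H_\lambda, i\hat{A}] = M + \hat{S}$. We go back to (1.5) and infer:

\[ M + E_J(H_0)\, \hat{S}\, E_J(H_0) \ge \big( c\lambda^2/\varepsilon + O(\lambda) \big) E_J(H_0) + \text{error terms}. \tag{1.6} \]

By taking $\varepsilon := \varepsilon(\lambda)$, one hopes to obtain the positivity of the constant in front of $E_J(H_0)$, to control the error terms and to replace the spectral measure by that of $H_\lambda$. Using the Feshbach method and a more involved choice of conjugate operator, we show in Section 6 that there are $\lambda_0, c', \eta > 0$ so that

\[ M + E_J(H_\lambda)\, \hat{S}\, E_J(H_\lambda) \ge c' |\lambda|^{1+\eta} E_J(H_\lambda), \quad \text{for all } |\lambda| \le \lambda_0, \tag{1.7} \]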

in the sense of forms on $\mathcal{D}(N^{1/2})$. One would like to deduce from (1.7) that there is no eigenvalue in $J$. To apply a Virial theorem, one has at least to check that the eigenvectors of $H_\lambda$ are in the domain of $N^{1/2}$. One may proceed as in [27]. In this article, we follow [15,31] and construct a sequence of approximate conjugate operators $\hat{A}_n$ such that $[H_\lambda, i\hat{A}_n]$ is $H_\lambda$-bounded and converges to $[H_\lambda, i\hat{A}]$, and such that one may apply the Virial theorem with $\hat{A}_n$. To justify these steps, we make a new assumption:

(I1b) $1_{\mathcal{K}} \otimes \omega^{-a}\alpha \in \mathcal{B}(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$, for some $a > 1$.

We now give our first result, based on the Virial theorem, see Proposition 4.11.

Theorem 1.1. Let $I$ be an open interval containing $k_{i_0}$ and no other $k_i$. Assume the Fermi golden rule hypothesis (1.2) at energy $k_{i_0}$. Suppose that (I0), (I1a) and (I1b) are satisfied. Then, there is $\lambda_0 > 0$ such that $H_\lambda$ has no eigenvalue in $I$, for all $|\lambda| \in (0, \lambda_0)$.

We now give more information on the resolvent $R_\lambda(z) := (H_\lambda - z)^{-1}$ as the imaginary part of $z$ tends to 0. We show that it extends to an operator in some weighted spaces around the real axis. This is a standard result in the Mourre theory, when one supposes some 2-commutators-like hypothesis, see [1]. Here, as the commutator is not $H_\lambda$-bounded, one relies on an adapted theory. We use [15], which is a refined version of [31]. We check the hypotheses (M1)–(M5) given in Appendix C and deduce a limiting absorption principle, thanks to Theorem C.8. Using again (1.3), we state our class of form factors:

(I2) $\alpha \in \mathcal{B}(\mathcal{K}, \mathcal{K} \otimes \dot{B}_2^{1,1}(\mathbb{R}^+) \otimes L^2(S^{d-1}))$.

Recall that the dot denotes the completion of $C_c^\infty$. One choice of norm for $B_2^{1,1}$ is:

\[ \|f\|_{B_2^{1,1}(\mathbb{R}^+)} = \|f\|_2 + \int_0^1 \big\| f(2t + \cdot) - 2f(t + \cdot) + f(\cdot) \big\|_2 \, \frac{dt}{t^2}. \]

We refer to [1,32] for Besov spaces and real interpolation. To express the weights, let $\tilde{b}$ be the square root of the Dirichlet Laplacian on $L^2(\mathbb{R}^+, dr)$. Using (1.3), we define $b := 1_{\mathcal{K}} \otimes T^{-1}\tilde{b}T$ in $\mathcal{H}$. Set $P_s := 1_{\mathcal{K}} \otimes (d\Gamma(b) + 1)^{-s}(N + 1)^{1/2}$.

Theorem 1.2. Let $I$ be an open interval containing $k_{i_0}$ and no other $k_i$. Assume the Fermi golden rule hypothesis (1.2) at energy $k_{i_0}$. Suppose that (I0), (I1a) and (I2) are satisfied (but not necessarily (I1b)). Then there is $\lambda_0 > 0$ such that $H_\lambda$ has no eigenvalue in $I$, for all $|\lambda| \in (0, \lambda_0)$. Moreover, $H_\lambda$ has no singularly continuous spectrum in $I$. For each compact interval $J$ included in $I$, and for all $s \in (1/2, 1]$, the limits

\[ P_s^* R_\lambda(x \pm i0) P_s := \lim_{y\to 0^+} P_s^* R_\lambda(x \pm iy) P_s \]

exist in norm uniformly in $x \in J$. Moreover the maps $J \ni x \mapsto P_s^* R_\lambda(x \pm i0) P_s$ are Hölder continuous of order $s - 1/2$ for the norm topology of $\mathcal{B}(\mathcal{H})$.

To our knowledge, the condition (I2) is new, even for the question far from the thresholds. We believe it to be optimal in the Besov scale associated to $L^2$ for the limiting absorption principle.

We now compare our result with the literature. In [9, Theorem 6.3], the absence of embedded eigenvalues is shown by proving a limiting absorption principle with the weights $1_{\mathcal{K}} \otimes (d\Gamma(b) + 1)^{-s}$, for $s > 1/2$, without any contribution in $N$. The authors suppose essentially (I0) and that $\alpha \in \mathcal{B}(\mathcal{K}, \mathcal{K} \otimes \dot{H}^s(\mathbb{R}^+) \otimes L^2(S^{d-1}))$, for $s > 1$. The class of perturbations is chosen in relation with the weights. Their strategy is to take advantage of the Fermi golden rule at the level of the limiting absorption principle, with the help of the Feshbach method. The drawback is that they are limited by the relation between weights and class of form-factors, and they cannot give a Virial-type theorem. On the other hand, their method covers some positive temperature systems, a question we do not deal with. Their method also leads to fewer domain problems. We mention that they do not suppose the second condition of (I1a). Therefore, concerning merely the disappearance of the eigenvalues, the conditions (I1a) and (I1b) do not imply that $\alpha$ is better than $\dot{H}^1(\mathbb{R}^+)$ in the Sobolev scale. Hence, Theorem 1.1 is a new result. We point out that the condition (I2) is weaker than the one used in [9]. The weights obtained in the limiting absorption principle are also better than the ones given in [9]. We mention that one could improve them by using some Besov spaces, see [15]. To simplify the presentation, we do not present them here. We believe they could hardly be reached by the method exposed in [9], due to the interplay between weights and form-factors.

In [16] and in [31], the aim is to show that the point spectrum is locally finite, i.e. without clusters and of finite multiplicity. There, the authors use a Virial theorem. Between the eigenvalues, one shows a limiting absorption principle, using a hypothesis on the second commutator that is stronger than (I2), see Section 4.5. In our approach, we use the Virial theorem and the limiting absorption principle independently. In particular, if one is interested only in the limiting absorption principle, one does not need to suppose the more restrictive condition (I1b) but only (I0), (I1a) and (I2). This is due to the fact that we show a strict Mourre estimate, i.e. without compact contribution.

In [5], a version of Theorems 1.1 and 1.2 is proved for a different class of perturbations. The authors use the second quantization of the generator of dilations: $A_{\mathrm{dil}} := i 1_{\mathcal{K}} \otimes d\Gamma(k \cdot \nabla_k + \nabla_k \cdot k)$, which is a self-adjoint operator. One motivation is that

\[ [H_\lambda, iA_{\mathrm{dil}}] = 1_{\mathcal{K}} \otimes d\Gamma(\omega) + O(\lambda). \]



Then, one modifies the conjugate operator in the same way as we do, but the choice of parameters is more involved. Note that the commutator is $H_\lambda$-bounded if and only if the dimension of $\mathcal{K}$ is finite. When the dimension is not finite, as in QED models, $\theta(H_\lambda)[H_\lambda, iA_{\mathrm{dil}}]\theta(H_\lambda)$ is bounded when the support of $\theta$ contains only a finite number of eigenvalues of $K$. This approach leads to fewer domain questions than in this paper, but one relies on another alternate Mourre theory, see [30].

We point out that this choice of conjugate operator has proved to be better suited to the infrared singularities present in QED. By choosing a function $G_x$ acting in $\mathfrak{h}$ and depending on $x \in X$, where $\mathcal{K} = L^2(X)$, one may consider

\[ \phi(\alpha) = \int \Big( G_x(k) \otimes a^*(k) + \overline{G_x(k)} \otimes a(k) \Big) \, dk, \]

where $a$ and $a^*$ are the standard photon annihilation and creation operators. They are operator-valued distributions in $\mathfrak{h}$. In QED, the behavior of $\omega^{-1}G_x$ near $k = 0$ determines the infrared problem. In this case, one has $G_x(k) \approx |k|^{-1/2}$ in the vicinity of $k = 0$. Since applying $A_{\mathrm{dil}}$ recreates this singularity, this somehow explains why the generator of dilations is efficient with infrared problems. The choice of conjugate operator $A$ is inferior in this regard. The first condition of (I1a) requires $\alpha$ to be bounded; this can be fulfilled if the atomic part has a particular shape, by using some gauge transformations, see for instance [16, Section 2.4] and [9, Section 1.6]. After this, one considers $G_x(k) \approx |k|^{1/2}$. The problem lies in the second condition of (I1a), which cannot be checked. Although one is not able to recover the physical case, this choice of conjugate operator remains popular in the literature. From a mathematical standpoint, note that the classes of perturbations induced by the two operators do not cover one another.

We now give the plan of the paper. In Section 2, we recall some definitions and properties of Pauli–Fierz models. In Section 3, we construct the conjugate operators. In Section 4, we prove the regularity properties so that one may apply the Mourre theory. The Virial theorem is discussed in Section 4.4. In Section 5, we establish the extended Mourre estimate far from the thresholds for small coupling constants; we explain in Remark 5.3 why the method should be improved to obtain the result above a threshold. In Section 6, we establish the extended Mourre estimate above the thresholds under the hypothesis of a Fermi golden rule. In Appendix A, we explain how to check the Fermi golden rule and why this hypothesis is compatible with the hypotheses (I0), (I1a), (I1b) and (I2). In Appendix B, we gather some properties of $C_0$-semigroups, and in Appendix C we recall the properties of the $C^1$ class in this setting and the hypotheses needed to apply the extended Mourre theory.

Notation. Given a Borel set $J$, we denote by $E_J(A)$ the spectral measure associated to a self-adjoint operator $A$ at energy $J$. Given Hilbert spaces $\mathcal{H}$, $\mathcal{K}$, we denote by $\mathcal{B}(\mathcal{H}, \mathcal{K})$ the set of bounded operators from $\mathcal{H}$ to $\mathcal{K}$. We simply write $\mathcal{B}(\mathcal{H})$ when $\mathcal{H} = \mathcal{K}$. We denote by $\sigma(H)$ the spectrum of $H$. We set $\langle x \rangle := (1 + x^2)^{1/2}$. We denote by $\|\cdot\|_{\mathcal{H}}$ and by $\langle \cdot, \cdot \rangle_{\mathcal{H}}$ the norm and the scalar product of $\mathcal{H}$, respectively. We omit the indices when no confusion arises. We denote by w-lim and s-lim the weak and strong limit, respectively. A dot over a Besov or a Sobolev space denotes the closure of the set $C_c^\infty$ of smooth functions with compact support, with respect to the norm of the space.



2. The Pauli–Fierz model

Pauli–Fierz operators are often used in quantum physics as generators of the approximate dynamics of a (small) quantum system interacting with a free Bose gas. Typically, they describe a non-relativistic atom interacting with a field of massless scalar bosons. The quantum system is given by a (separable) complex Hilbert space $\mathscr{K}$. The Hamiltonian of the system is a self-adjoint operator $K$, which is bounded from below. We suppose that $K$ has some discrete spectrum. One may consider purely discrete spectrum, as in [16], or not, as in [31]. To avoid cluttering the presentation, we take $\mathscr{K} = \operatorname{Ran} E_I(K)$, where $I$ contains a finite number of eigenvalues, and consider the restriction of $K$ to this space. Hence, we restrict the analysis to a self-adjoint matrix $K$ acting in a finite-dimensional Hilbert space $\mathscr{K}$. This corresponds to analyzing $n$-level atoms. Doing so, we avoid some minor domain problems, which are already discussed in detail in [16,31], and gain in clarity of presentation.

2.1. The bosonic field

We refer to [6,7,29] for a more thorough discussion of these matters. The bosonic field is described by the Hilbert space $\Gamma(\mathfrak{h})$, where $\mathfrak{h}$ is a Hilbert space. We recall its construction. Set $\mathfrak{h}^{0\otimes} = \mathbb{C}$ and $\mathfrak{h}^{n\otimes} = \mathfrak{h}\otimes\cdots\otimes\mathfrak{h}$. Given a closed operator $A$, we define the closed operator $A^{n\otimes}$ on $\mathfrak{h}^{n\otimes}$ by $A^{0\otimes} = 1$ if $n=0$ and by $A\otimes\cdots\otimes A$ otherwise. Let $S_n$ be the group of permutations of $n$ elements. For each $\sigma\in S_n$, one defines the action on $\mathfrak{h}^{n\otimes}$ by $\sigma(f_{i_1}\otimes\cdots\otimes f_{i_n}) = f_{\sigma^{-1}(i_1)}\otimes\cdots\otimes f_{\sigma^{-1}(i_n)}$, where $(f_i)$ is a basis of $\mathfrak{h}$. The action extends by linearity to a unitary operator on $\mathfrak{h}^{n\otimes}$. The definition is independent of the choice of the basis. On $\mathfrak{h}^{n\otimes}$, we set

$$\Pi_n := \frac{1}{n!}\sum_{\sigma\in S_n}\sigma \quad\text{and}\quad \Gamma_n(\mathfrak{h}) := \Pi_n\mathfrak{h}^{n\otimes}. \tag{2.1}$$

Note that $\Pi_n$ is an orthogonal projection. We call $\Gamma_n(\mathfrak{h})$ the $n$-particle bosonic space. The bosonic space is defined by

$$\Gamma(\mathfrak{h}) := \bigoplus_{n=0}^{\infty}\Gamma_n(\mathfrak{h}).$$
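As a hedged illustration (not part of the original construction), the following Python sketch realizes the symmetrization projection $\Pi_n$ of (2.1) on $(\mathbb{C}^d)^{\otimes n}$ in a fixed orthonormal basis. A tensor is encoded as a dictionary from index tuples to rational coefficients; this encoding and the names `symmetrize`, `inner` and `symmetric_dimension` are assumptions made only for this example. The sketch checks that $\Pi_n$ is idempotent and symmetric, and recovers $\dim\Gamma_n(\mathbb{C}^d)=\binom{n+d-1}{n}$ as the trace of $\Pi_n$.

```python
from fractions import Fraction
from itertools import permutations, product
from math import comb, factorial


def symmetrize(tensor, n):
    """Apply Pi_n = (1/n!) * (sum over all permutations of the n tensor factors).

    `tensor` maps index tuples (i_1, ..., i_n) to coefficients; the tuple
    (i_1, ..., i_n) stands for the basis vector f_{i_1} x ... x f_{i_n}.
    """
    out = {}
    for idx, coeff in tensor.items():
        for perm in permutations(range(n)):
            new_idx = tuple(idx[p] for p in perm)  # permute the factors
            out[new_idx] = out.get(new_idx, 0) + Fraction(coeff) / factorial(n)
    return {k: v for k, v in out.items() if v != 0}


def inner(s, t):
    """Inner product of two tensors with real (rational) coefficients."""
    return sum(c * t.get(k, 0) for k, c in s.items())


def symmetric_dimension(d, n):
    """Trace of Pi_n on (C^d)^{n tensor}, i.e. the dimension of the n-particle bosonic space."""
    total = Fraction(0)
    for idx in product(range(d), repeat=n):
        total += symmetrize({idx: 1}, n).get(idx, 0)
    return total


x = {(0, 1, 1): 1, (1, 0, 0): 2}
y = {(1, 1, 0): 5, (0, 0, 0): 7}
px = symmetrize(x, 3)
print(px == symmetrize(px, 3))                               # idempotent
print(inner(px, y) == inner(x, symmetrize(y, 3)))            # symmetric
print(symmetric_dimension(2, 3) == comb(3 + 2 - 1, 3))       # dimension count
```

Exact rational arithmetic is used so that the projection identities can be compared with `==` rather than up to rounding.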

We denote by $\Omega$ the vacuum, the element $(1,0,0,\ldots)$, and by $P_\Omega : \Gamma(\mathfrak{h})\to\Gamma_0(\mathfrak{h})$ the projection associated to it. We define $\Gamma_{\rm fin}(\mathfrak{h})$ as the set of finite particle vectors, i.e. $\Psi = (\Psi_1,\Psi_2,\ldots)$ such that $\Psi_n = 0$ for $n$ large enough. We now define the second quantized operators. We recall that a densely defined operator $A$ is closable if and only if its adjoint $A^*$ is densely defined. Given a closable operator $q$ in $\mathfrak{h}$, we define $\Gamma_{\rm fin}(q)$ acting from $\Gamma_{\rm fin}(\mathcal{D}(q))$ into $\Gamma_{\rm fin}(\mathfrak{h})$ by

$$\Gamma_{\rm fin}(q)\big|_{\Pi_n(\mathcal{D}(q)^{n\otimes})} := q\otimes\cdots\otimes q.$$

Since $q$ is closable, $q^*$ is densely defined. Using that $\Gamma_{\rm fin}(q^*)\subset\Gamma_{\rm fin}(q)^*$, we see that $\Gamma_{\rm fin}(q)$ is closable and we denote by $\Gamma(q)$ its closure. Note that $\Gamma(q)$ is bounded if and only if $\|q\|\le 1$.



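Let $b$ be a closable operator on $\mathfrak{h}$. We define $d\Gamma_{\rm fin}(b) : \Gamma_{\rm fin}(\mathcal{D}(b))\to\Gamma_{\rm fin}(\mathfrak{h})$ by

$$d\Gamma_{\rm fin}(b)\big|_{\Pi_n(\mathcal{D}(b)^{n\otimes})} := \sum_{j=1}^{n} 1\otimes\cdots\otimes 1\otimes\underbrace{b}_{j\text{th}}\otimes 1\otimes\cdots\otimes 1.$$

As above, $d\Gamma_{\rm fin}(b)$ is closable and $d\Gamma(b)$ denotes its closure. We link the two objects.

Lemma 2.1. Let $\mathbb{R}^+\ni t\mapsto w_t\in\mathcal{B}(\mathfrak{h})$ be a $C_0$-semigroup of contractions (resp. of isometries), with generator $a$. Then $\mathbb{R}^+\ni t\mapsto\Gamma(w_t)\in\mathcal{B}(\Gamma(\mathfrak{h}))$ is a $C_0$-semigroup of contractions (resp. of isometries) whose generator is $d\Gamma(a)$.

Proof. It is easy to see that $W_t := \Gamma(w_t)$ is a $C_0$-semigroup of contractions (resp. of isometries). Let $A$ be its generator. One immediately gets $d\Gamma_{\rm fin}(a)\subset A$. Since $\Gamma_{\rm fin}(\mathcal{D}(a))$ is dense in $\Gamma(\mathfrak{h})$ and invariant under $W_t$, the Nelson lemma gives that $\Gamma_{\rm fin}(\mathcal{D}(a))$ is dense in $\mathcal{D}(A)$ for the graph norm and also that $d\Gamma(a) = A$. □

2.2. The interacting system

Let $\omega$ be a self-adjoint operator in $\mathfrak{h}$ and $\mathscr{K}$ a finite-dimensional Hilbert space. One defines the free Hamiltonian $H_0$ acting on the Hilbert space $\mathscr{H} := \mathscr{K}\otimes\Gamma(\mathfrak{h})$ by

$$H_0 := K\otimes 1_{\Gamma(\mathfrak{h})} + 1_{\mathscr{K}}\otimes d\Gamma(\omega). \tag{2.2}$$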

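We recall also the definition of the number operator $N := 1_{\mathscr{K}}\otimes d\Gamma(\mathrm{Id})$. We now define the interaction. Let $\alpha$ be an element of $\mathcal{B}(\mathscr{K},\mathscr{K}\otimes\mathfrak{h})$. This is a form-factor. We define $b(\alpha)$ on $\mathscr{H}$ by $b(\alpha) : \mathscr{K}\otimes\mathfrak{h}^{n\otimes}\to\mathscr{K}\otimes\mathfrak{h}^{(n-1)\otimes}$, where $b(\alpha)(\Psi\otimes\phi_1\otimes\cdots\otimes\phi_n) := \alpha^*(\Psi\otimes\phi_1)\otimes\phi_2\otimes\cdots\otimes\phi_n$ for $n\ge 1$, and by $0$ otherwise. This operator is bounded and its norm is given by $\|\alpha\|_{\mathcal{B}(\mathscr{K},\mathscr{K}\otimes\mathfrak{h})}$. We define the annihilation operator on $\mathscr{K}\otimes\Gamma(\mathfrak{h})$ with domain $\mathscr{K}\otimes\Gamma_{\rm fin}(\mathfrak{h})$ by $a(\alpha) := (N+1)^{1/2}b(\alpha)(1\otimes\Pi)$, where $\Pi := \sum_n\Pi_n$, see (2.1). As above, it is closable and its closure is also denoted by $a(\alpha)$. Its adjoint is the creation operator. It acts as $a^*(\alpha) = b^*(\alpha)(N+1)^{1/2}$ on $\mathscr{H}$. Note that $b^*(\alpha)(\psi\otimes\phi_1\otimes\cdots\otimes\phi_n) = (\alpha\psi)\otimes\phi_1\otimes\cdots\otimes\phi_n$. The (Segal) field operator is defined by

$$\phi(\alpha) := \frac{1}{\sqrt{2}}\big(a(\alpha)+a^*(\alpha)\big).$$

We consider its closure on $\mathscr{K}\otimes\mathcal{D}(N^{1/2})$. We have the two elementary estimates:

$$\big\|(N+1)^{-1/2}a^{(*)}(\alpha)\big\|\le\|\alpha\|,\qquad \big\|(N+1)^{-1/2}\phi(\alpha)\big\|\le\sqrt{2}\,\|\alpha\|. \tag{2.3}$$

An assertion containing $(*)$ holds with and without $*$.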



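We give the following $N_\tau$-estimate and refer to [9, Proposition 4.1] for a proof of (i). Point (ii) is a direct consequence of the Kato–Rellich lemma. This kind of estimate goes back to [20]. See also [3]. We refer to [18, Appendix A] and [16, Proposition 3.7] for unbounded $K$.

Proposition 2.2. Let $\omega$ be a non-negative, injective, self-adjoint operator on $\mathfrak{h}$. Let $\beta\in\mathcal{B}(\mathscr{K},\mathscr{K}\otimes\mathcal{D}(\omega^{-1/2}))$.

(i) Then $\phi(\beta)\in\mathcal{B}(\mathscr{K}\otimes\mathcal{D}(d\Gamma(\omega)^{1/2}),\mathscr{H})$ and for any $\Phi\in\mathcal{D}(d\Gamma(\omega)^{1/2})$,

$$\|\phi(\beta)\Phi\|^2\le\|\beta\|^2_{\mathcal{B}(\mathscr{K},\mathscr{K}\otimes\mathfrak{h})}\|\Phi\|^2 + 2\big\|\omega^{-1/2}\beta\big\|^2_{\mathcal{B}(\mathscr{K},\mathscr{K}\otimes\mathfrak{h})}\big\langle\Phi, 1_{\mathscr{K}}\otimes d\Gamma(\omega)\Phi\big\rangle. \tag{2.4}$$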

(ii) The field operator $\phi(\alpha)$ is $H_0$-operator bounded with relative bound $\varepsilon$, for all $\varepsilon>0$. Hence, $H_\lambda := H_0+\lambda\phi(\alpha)$, for $\lambda\in\mathbb{R}$, defines a self-adjoint operator with domain $\mathcal{D}(H_\lambda) = \mathscr{K}\otimes\mathcal{D}(d\Gamma(\omega))$ and is essentially self-adjoint on any core of $H_0$.

2.3. The zero-temperature Pauli–Fierz model

We now specialize our model to the zero-temperature physical setting. The one-particle space is given by $\mathfrak{h} := L^2(\mathbb{R}^d, dk)$, where $k$ is the boson momentum. The one-particle kinetic energy is the operator of multiplication by $\omega(k) := |k|$. Consider a self-adjoint matrix $K$ on a finite-dimensional Hilbert space $\mathscr{K}$ and denote by $(k_i)_{i=0,\ldots,n}$, with $k_i<k_{i+1}$, its eigenvalues. We denote by $P_{k_i}$ the projection onto the $i$th eigenspace. The spectrum of $d\Gamma(\omega)$ in $\Gamma(\mathfrak{h})$ is $[0,\infty)$ and, due to the vacuum part, $0$ is the only eigenvalue. Its multiplicity is one. The spectrum of $H_0$ given by (2.2) is $[k_0,\infty)$. The eigenvalues are given by $(k_i)_{i=0,\ldots,n}$ and have the same multiplicity as those of $K$. The singularly continuous component of the spectrum is empty. Here, the $(k_i)_{i=0,\ldots,n}$ also play the rôle of thresholds. We consider a form-factor $\alpha$ satisfying hypothesis (I0). By applying Proposition 2.2, the operator $H_\lambda$, given by (1.1), is self-adjoint and $\mathcal{D}(H_\lambda) = \mathscr{K}\otimes\mathcal{D}(d\Gamma(\omega))$. Since we study form factors in $\mathcal{B}(\mathscr{K},\mathscr{K}\otimes\mathfrak{h})$, we exclude some possible singularities of the form-factor from the very beginning. However, if the atomic part has a particular shape, one may use some gauge transformations and gain in singularity, see for instance [16, Section 2.4] and [9, Section 1.6]. Nevertheless, it is an open question whether there exists a gauge transformation that allows one to cover the physical form factor studied in [3,5] under our conditions. Conversely, the classes of perturbations studied in the latter do not fully cover ours.

3. The conjugate operators

In this paper, we analyze the spectrum of the Pauli–Fierz Hamiltonian $H_\lambda$ described in Section 2.3 using commutator techniques. We study the behavior of the embedded eigenvalues of $H_\lambda$ under small coupling constants and establish some refined spectral properties. To do so, we establish a version of the Mourre estimate, see Appendix C.2. Hence, we start by constructing the conjugate operator. We follow ideas similar to those of [16,24,31]. Later, we modify it by a finite rank perturbation, in the spirit of [5]. Unlike in the standard Mourre theory, the conjugate operator is not self-adjoint but only maximal symmetric. We refer to Appendix C.1 for a discussion of 1-commutator properties in this setting. We point out that one may avoid working with maximal symmetric operators by symmetrizing the space and thus gluing non-physical free bosons, see [9, Section 5.2]. With our method, this trick leads to some domain problems, which will be treated elsewhere. We point out that the real drawback of this choice of conjugate operator is that the commutator is not $H_\lambda$-bounded, unlike in the standard Mourre theory and in [3,5,12,13]. Some difficulties appear when applying the Virial theorem. To overcome them, we follow ideas of [16,31] and construct a series of approximate conjugate operators. One may also proceed as in [27].

3.1. The semigroup on the 1-particle space

Fix a decreasing $\chi\in C_c^\infty(\mathbb{R}^+;[0,1])$ such that $\chi(x) = 1$ for $x\le 1$ and $\chi(x)=0$ for $x\ge 2$. Set $\tilde\chi := 1-\chi$. We consider the following vector fields on $\mathbb{R}^+$:

$$m_n(t) := \begin{cases}\tilde\chi(nt), & \text{for } n\in\mathbb{N},\\ 1, & \text{for } n=\infty,\end{cases}\qquad\text{and}\qquad s_n(t) = \frac{m_n(t)}{t}. \tag{3.1}$$

Note that $m_n$ converges increasingly to $m_\infty$, almost everywhere, as $n$ goes to infinity. As in [31] and in [16], the rôle of $m_\infty$ is to ensure the positivity of the commutator and that of $m_n$ is to guarantee the Virial theorem. We define the associated vector fields in $\mathbb{R}^d$ as follows:

$$\vec{s}_n(k) := s_n(|k|)\,k,\qquad\text{for } k\in\mathbb{R}^d \text{ and } n\in\mathbb{N}^*\cup\{\infty\}. \tag{3.2}$$

We shall construct the $C_0$-semigroup of isometries associated to the vector fields $\vec{s}_n$ on $\mathfrak{h} = L^2(\mathbb{R}^d)$ and identify the generators. We define

$$a_n := -\frac{1}{2}\big(\vec{s}_n\cdot D_k + D_k\cdot\vec{s}_n\big) \tag{3.3}$$

on $C_c^\infty(\mathbb{R}^d\setminus\{0\})$ for all $n\in\mathbb{N}^*\cup\{\infty\}$, where $D_k = i\nabla$. These operators are closable as the domains of their adjoints are dense. In the sequel, we denote their closures by the same symbol. We work in polar coordinates. We identify $\mathfrak{h}$ and $\tilde{\mathfrak{h}}$ through the transformation (1.3). Given an operator $B$ in $\mathfrak{h}$, we denote by $\tilde B$ the corresponding operator acting in $\tilde{\mathfrak{h}}$, given by $\tilde B := TBT^{-1}$. We have:

Proposition 3.1. For $n$ finite, $a_n$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^d\setminus\{0\})$, and $a_\infty$ is maximal symmetric with deficiency indices $(N,0)$. Here, $N=\infty$ for $d\ge 2$ and $N=2$ for $d=1$. The operator $a_n$ generates a $C_0$-semigroup of isometries denoted by $\{w_{n,t}\}_{t\in\mathbb{R}^+}$. In polar coordinates, the domains are given by

$$\mathcal{D}(\tilde a_n)\supset\mathcal{D}(\tilde a_\infty) = \dot H^1(\mathbb{R}^+)\otimes L^2(S^{d-1}),\quad\text{for all } n\in\mathbb{N}^*,$$
$$\mathcal{D}(\tilde a_\infty^*) = H^1(\mathbb{R}^+)\otimes L^2(S^{d-1}),$$

where $\dot H^1(\mathbb{R}^+)$ is the closure of $C_c^\infty(\mathbb{R}^+)$ under the norm $\|\cdot\|+\|\partial_r\cdot\|$ and where $H^1(\mathbb{R}^+)$ is the Sobolev space of first order.



See Appendix B for an overview of $C_0$-semigroups. For $n$ finite, the $C_0$-semigroup extends to a $C_0$-group since $a_n$ is self-adjoint.

Proof. When $n$ is finite, it is well known that $a_n$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^d)$; this follows by studying the $C_0$-group associated to the flow defined by the smooth vector field $\vec s_n$. The density follows from the Nelson lemma. See for instance [1, Proposition 4.2.3]. Hence, for $n$ finite, it remains to show that $C_c^\infty(\mathbb{R}^d\setminus\{0\})$ is a core for $a_n$. Straightforwardly, for $n\in\mathbb{N}^*\cup\{\infty\}$, one gets

$$\tilde a_n := Ta_nT^{-1} = i\Big(m_n(\cdot)\partial_r+\frac12(m_n)'(\cdot)\Big)\otimes 1,\qquad\text{where } m_n(r) := rs_n(r). \tag{3.4}$$

We extend $m_n$ to $\mathbb{R}$ by setting $m_n(-r) := m_n(r)$ for $r>0$ and extend it by continuity at $0$. Let $\phi_{n,t}$ be the flow generated by the smooth vector field $m_n$ on $\mathbb{R}$. In other words, $\phi_{n,t} := \phi_n(t,\cdot)$ is the unique solution of $(\partial_t\phi_n)(t,r) = m_n(\phi_n(t,r))$, where $\phi_n(0,r) = r$. Since $m_n$ is globally Lipschitz, $\phi_{n,t}$ exists for all times $t$. Moreover, $\phi_{n,t}$ is a smooth diffeomorphism of $\mathbb{R}$ with inverse $\phi_{n,-t}$ for all $t\in\mathbb{R}$. Let $\tilde\phi_{n,t}$ be the restriction of $\phi_{n,t}$ from $\mathbb{R}^{+*}$ to $\mathbb{R}^{+*}$. Let $\Omega_{n,t}$ be the domain of this restriction, i.e. the set of $r>0$ such that $\phi_{n,t}(r)>0$. One has $\Omega_{n,t} = \mathbb{R}^{+*}$ for $t\ge 0$ as $m_n(r)$ is positive. For the same reason, $t\mapsto\Omega_{n,t}$ is increasing. Note also that we have $\Omega_{n,-t} = \phi_{n,t}(\mathbb{R}^{+*})$ for $t\ge 0$. For $u\in\tilde{\mathfrak{h}}$, we set:

$$(\tilde w_{n,t}u)(r,\theta) := 1_{\Omega_{n,-t}}(r)\sqrt{\phi'_{n,-t}(r)}\,u\big(\phi_{n,-t}(r),\theta\big),\qquad\text{for } t\ge 0. \tag{3.5}$$

A change of variable gives that $\tilde w_{n,t}$ is an isometry of $L^2(\mathbb{R}^+)$ with range $L^2(\Omega_{n,-t})$ for all $t\ge 0$. Since $\phi_{n,t}$ is a smooth flow, $\{\tilde w_{n,t}\}_{t\ge 0}$ is a $C_0$-semigroup of isometries. The adjoint $C_0$-semigroup is given by

$$(\tilde w_{n,t}^*u)(r,\theta) := 1_{\mathbb{R}^{+*}}(r)\sqrt{\phi'_{n,t}(r)}\,u\big(\phi_{n,t}(r),\theta\big),\qquad\text{for } t\ge 0. \tag{3.6}$$

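This is not a semigroup of isometries when $n=\infty$. We compute the generator of the semigroup $\{\tilde w_{n,t}\}_{t\ge 0}$. Take $u\in C_c^\infty(\tilde{\mathfrak{h}})$. We have $\tilde w_{n,t}u\in C_c^\infty(\Omega_{n,-t}\times S^{d-1})$. For $r\in\Omega_{n,-t}$, we get

$$-\frac{d}{dt}(\tilde w_{n,t}u)(r,\theta) = \Big(\tilde w_{n,t}\Big(m_n(\cdot)\partial_r+\frac12(m_n)'(\cdot)\Big)u\Big)(r,\theta).$$

Denoting by the same symbol the closure of $\tilde a_n$ on $C_c^\infty(\mathbb{R}^{+*}\times S^{d-1})$, we obtain

$$-i\frac{d}{dt}\tilde w_{n,t}u = \tilde w_{n,t}\tilde a_nu.$$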

The closed operator is a priori only a restriction of the generator of the semigroup (in the sense of inclusion of graphs of operators). Now, since $\tilde w_{n,t}$ stabilizes $C_c^\infty(\tilde{\mathfrak h})$ for all $t\ge0$, the Nelson lemma gives that this space is a core for the generator of the $C_0$-semigroup $\{\tilde w_{n,t}\}_{t\ge0}$. Since the latter is an extension of $\tilde a_n$, we have shown that $\tilde a_n$ is indeed the generator. One may formally write $\tilde w_{n,t} = e^{it\tilde a_n}$. The domain of $\tilde a_n$ contains $\dot H^1(\mathbb R^+)\otimes L^2(S^{d-1})$. It is easy to see that equality holds for $n=\infty$. Considering the spectrum of $a_n$, we derive that the deficiency indices of the closure of $a_n$ on $C_c^\infty(\mathbb R^d\setminus\{0\})$ are of the form $(N,0)$. For $n$ finite these indices are equal, and we infer the essential self-adjointness of $a_n$ on $C_c^\infty(\mathbb{R}^d\setminus\{0\})$. At this point, one sees the real difference between the cases $n$ finite and $n$ infinite. On one hand $m_\infty\equiv1$, and on the other hand, for finite $n$, $m_n(r)$ tends to $0$ as $r$ tends to $0$. The domain of the adjoint of $\tilde a_\infty$ is different. Indeed,

$$(\tilde a_\infty^*u)(r,\theta) = i\Big(m_\infty(r)(\partial_ru)(r,\theta)+\frac12(m_\infty)'(r)u(r,\theta)\Big), \tag{3.7}$$

where $u\in\mathcal D(\tilde a_\infty^*) = H^1(\mathbb R^+)\otimes L^2(S^{d-1})$. Moreover, when $n=\infty$, the deficiency indices are then $(\infty,0)$, as the dimension of $L^2(S^{d-1})$ is infinite. □
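To make the case $n=\infty$ concrete, here is a minimal sketch under simplifying assumptions: only the radial variable is kept, and functions on the half-line are replaced by finitely many samples on a uniform grid. Since $m_\infty\equiv1$, the flow is $\phi_{\infty,t}(r)=r+t$, so (3.5) reduces to the right translation $(w_tu)(r)=1_{r>t}\,u(r-t)$ and (3.6) to the left translation $u(r+t)$. The names `shift_right`, `shift_left` and `norm_sq` are hypothetical and serve only this illustration; the right shift below preserves the norm only for samples that vanish in the last `k` cells of the finite grid.

```python
def shift_right(samples, k):
    """Discrete analogue of (w_t u)(r) = 1_{r>t} u(r - t): move samples k cells away from r = 0."""
    return [0] * k + list(samples[:len(samples) - k])


def shift_left(samples, k):
    """Discrete analogue of the adjoint (w_t^* u)(r) = u(r + t): the first k cells are discarded."""
    return list(samples[k:]) + [0] * k


def norm_sq(samples, h=1):
    """Squared L^2 norm approximated by a Riemann sum with cell width h."""
    return h * sum(x * x for x in samples)


u = [1, 2, 3, 0, 0, 0]  # supported away from the right end of the grid

# The right shift is an isometry and satisfies the semigroup law.
print(norm_sq(shift_right(u, 2)) == norm_sq(u))
print(shift_right(shift_right(u, 1), 1) == shift_right(u, 2))

# w_t^* w_t = 1, but w_t w_t^* is only a projection: the adjoint loses the mass near r = 0.
print(shift_left(shift_right(u, 2), 2) == u)
print(shift_right(shift_left(u, 2), 2) == u)
print(norm_sq(shift_left(u, 2)) < norm_sq(u))
```

The failure of $w_tw_t^*=1$ is the discrete counterpart of the statement that the adjoint semigroup is not isometric when $n=\infty$, which is why $a_\infty$ is maximal symmetric rather than self-adjoint.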

3.2. The $C_0$-semigroup on the Fock space

Thanks to Proposition 3.1 and Lemma 2.1, we define the $C_0$-semigroups on the whole Hilbert space. We set:

$$W_{n,t} := 1_{\mathscr K}\otimes\Gamma(w_{n,t})\quad\text{and}\quad W_{n,t}^* = 1_{\mathscr K}\otimes\Gamma(w_{n,t}^*),\qquad\text{for } t\ge0. \tag{3.8}$$

Clearly, $\{W_{n,t}\}_{t\ge0}$ is a $C_0$-semigroup of isometries. Let $A_\infty$ be the generator of $\{W_{\infty,t}\}_{t\ge0}$. In the same way, for $n$ finite, we set

$$A_n := 1_{\mathscr K}\otimes d\Gamma(a_n). \tag{3.9}$$

This is the generator of the $C_0$-group $1_{\mathscr K}\otimes\Gamma(e^{ita_n})$ by Lemma 2.1. Recall that the rôle of the $A_n$ is to ensure a Virial theorem, see Proposition 4.11. In Section 5, we see that the operator $A_\infty$ alone is not enough to deal with threshold energies, as the system could be uncoupled. One needs to take into account the Fermi golden rule. One way is to follow [9] and to take advantage of it in the limiting absorption principle. Another way is to modify the conjugate operator by a finite rank perturbation so as to obtain more positivity above the thresholds, by letting the Fermi golden rule appear in the commutator, see Section 6. This idea comes from [5]. We follow it. Choose an eigenvalue $k_{i_0}$ of $K$ and assume that (6.1) holds true at energy $k_{i_0}$ for the couple $(H_0,\alpha)$. Let $P$ be the projector $P_{k_{i_0}}\otimes P_\Omega$. For $\varepsilon<\varepsilon_0$, we define

$$\hat A_n := A_n+\lambda\theta B_\varepsilon,\qquad\text{for } n\in\mathbb N^*\cup\{\infty\},$$

where $B_\varepsilon := \operatorname{Im}(\bar R_\varepsilon^2\phi(\alpha)P)$, $R_\varepsilon := ((H_0-k_{i_0})^2+\varepsilon^2)^{-1/2}$ and $\bar R_\varepsilon := \bar PR_\varepsilon$, with $\bar P := 1-P$. Note that the conjugate operator depends on the two parameters $\lambda\in\mathbb R$, from the coupling constant, and $\varepsilon>0$, from the Fermi golden rule hypothesis, and on an extra technical parameter $\theta>0$. For the sake of clarity, we do not write these dependences explicitly. Using Proposition B.5 and the fact that $B_\varepsilon$ is bounded, one gets that $\hat A_\infty$ is the generator of a $C_0$-semigroup. A bit more is true.


Lemma 3.2. The operator $\hat A_\infty$ is maximal symmetric on $\mathcal D(A_\infty)$ and is the generator of a $C_0$-semigroup of isometries, denoted by $\{\hat W_{n,t}\}_{t\ge0}$. For $n$ finite, the operator $\hat A_n$ is self-adjoint on the domain $\mathcal D(A_n)$.

Proof. The second point is obvious. We concentrate on the first one. By Proposition 3.1, $A_\infty$ is maximal symmetric with deficiency indices $(N,0)$ for some $N\ne0$. Since $B_\varepsilon$ is bounded, there is $c<0$ such that $\|B_\varepsilon(A_\infty-z)^{-1}\|<1$ for all $z\in\mathbb C$ with $\operatorname{Im}(z)\le c$. Since $(I+B_\varepsilon(A_\infty-z)^{-1})(A_\infty-z) = A_\infty+B_\varepsilon-z$ on the domain of $A_\infty$, we get that the spectrum of $\hat A_\infty$ is contained in the upper half-plane $\mathbb R+i[c,\infty)$. Now, since $B_\varepsilon$ is symmetric, so is $\hat A_\infty$. If both deficiency indices of $\hat A_\infty$ were non-zero, then its spectrum would be $\mathbb C$. Therefore, the deficiency indices of $\hat A_\infty$ are $(N',0)$ for some non-negative $N'$. Note that $N'\ne0$ by the Kato–Rellich theorem applied to $\hat A_\infty$, since $B_\varepsilon$ is bounded. Hence, $\hat A_\infty$ is maximal symmetric on $\mathcal D(A_\infty)$ and its spectrum is $\mathbb R+i[0,\infty)$. It is automatically the generator of a $C_0$-semigroup of isometries. □

4. Smoothness with respect to the $C_0$-semigroup

In Section 4.1, we recall a general result. In Section 4.2, we give some 1-commutator properties for $A_n$. We check the hypotheses (M1)–(M4) of Appendix C.2. We identify the spaces and operators appearing therein in Lemma 4.3. In Section 4.3, we extend these properties to $\hat A_n$, using Proposition B.5 and Lemma 4.5. The Virial theorem is discussed in Section 4.4. Finally, second commutator assumptions and the hypothesis (M5) are discussed in Section 4.5.

4.1. A general result

In order to check the $C^1$ properties and the b-stability, see Definition B.3, and to be able to deduce hypotheses (M1)–(M5) of Appendix C.2, we recall [16, Proposition 4.10]. We formulate it for bounded $K$. First fix a $C_0$-semigroup of isometries $\mathbb R^+\ni t\mapsto v_t\in\mathcal B(\mathfrak h)$ with generator $a$. By Lemma 2.1, $V_t := 1_{\mathscr K}\otimes\Gamma(v_t)$ is a $C_0$-semigroup of isometries with generator $A = 1_{\mathscr K}\otimes d\Gamma(a)$. Let $b\ge0$ be a self-adjoint operator on $\mathfrak h$, and $K$ as in (2.2). Set

$$B := K\otimes1_{\Gamma(\mathfrak h)}+1_{\mathscr K}\otimes d\Gamma(b),\qquad\mathscr G_B := \mathcal D(|B|^{1/2}) = \mathscr K\otimes\mathcal D(d\Gamma(b)^{1/2}).$$

Proposition 4.1. Let $\omega$ and $b\ge0$ act in $\mathfrak h$. Then,

(i) The space $\mathscr G_B$ is b-stable under $\{V_t\}_{t\in\mathbb R^+}$ (resp. $\{V_t^*\}_{t\in\mathbb R^+}$), if

$$v_t^*bv_t\le C_tb,\qquad\text{resp. } v_tbv_t^*\le C_tb,\quad\text{with } \sup C_t<\infty. \tag{4.1}$$

0 1. 2 We point out that if one knows that $\omega^{-1}\alpha\in\mathcal B(\mathscr K,\mathscr K\otimes C_0(\mathbb R^+)\otimes L^2(S^{d-1}))$, one may relax (I1b) and take $a=1$. Here $C_0(\mathbb R^+)$ denotes the continuous functions vanishing at $0$ and at $+\infty$.

Lemma 4.9. Assume $n$ finite, (I0) and (I1a). Then, $\{\hat W_{n,t}\}_{t\in\mathbb R}$ b-stabilizes the form domain of $H_\lambda$.

Proof. First we apply Proposition 4.1(i) with $v_t = w_{n,t}$ and $b=\omega$. As we have a $C_0$-group, by taking $t$ negative we obtain the result for the adjoint. As in the proof of Proposition 3.1, we denote by $\phi_{n,t}:\mathbb R^d\to\mathbb R^d$ the flow generated by the smooth vector field $\vec s_n$. Since $m_n(0)=0$, we have

$$|\phi_{n,t}(k)-k| = |\phi_{n,t}(k)-\phi_{n,0}(k)|\le\int_0^{|t|}\big|m_n(\phi_{n,s}(k))-m_n(0)\big|\,ds\le\|\nabla m_n\|_\infty\int_0^{|t|}|\phi_{n,s}(k)|\,ds,\qquad\text{for all } t\in\mathbb R. \tag{4.8}$$



By the Gronwall lemma, we infer that there is $C$ such that $|\phi_{n,t}(k)|\le C|k|$ for all $t\in[-1,1]$. Plugging this back into (4.8), we obtain $|\phi_{n,t}(k)-k|\le C|tk|$ for all $t\in[-1,1]$. Now using (3.5) and (3.6), we infer $e^{-ita_n}\omega e^{ita_n} = \omega(\phi_{n,t}(\cdot))$. Since $m_n$ is globally Lipschitz, there is $C$ such that

$$\big|\omega(\phi_{n,t}(k))-\omega(k)\big|\le C|t|\,\omega(k),\qquad\text{for all } t\in[-1,1]. \tag{4.9}$$

Hence, we satisfy the hypothesis (4.1) and D(|Hλ |1/2 ) is b-stable under {Wn,t }t∈R . We now take care about {Wˆ n,t }t∈R . Let An be the generator of {Wn,t }t∈R in D(|Hλ |1/2 ). As in (6.6), set Aˆn := An + λθ Bε . By Lemma 4.5(ii) and the fact that Bε is with values in the 0 and 1 particles space, we get Bε bounded in D(|Hλ |1/2 ). Thanks to Proposition B.5, Aˆn is the 1/2 generator of a C0 -group in D(|Hλ | ). We name it {Wˆ n,t }t∈R . By duality and interpolation, it } extends to a C0 -group in H . Comparing the generators, we obtain that {Wˆ n,t t∈R is really the restriction of {Wˆ n,t }t∈R and this gives the result. 2 Lemma 4.10. Assume n finite, (I0) and (I1a). Then Hλ ∈ C 1 (Aˆ n ). Moreover: [Hλ , i Aˆ n ]◦ = Mn + Sˆn ,

(4.10)

holds true in the sense of forms on D(|Hλ |1/2 ). Proof. Using again (4.9), we check (4.2). We get [H0 , iAn ]◦ = 1K ⊗ dΓ ([ω, ian ]◦ ) in the sense of form on D(|Hλ |1/2 ). By computing [ω, ian ]◦ on the core Cc∞ (Rd \ {0}), we obtain [ω, ian ]◦ = mn . Now, by Lemma 4.2, we can use Proposition 4.1(iii) and deduce [Hλ , iAn ]◦ = Mn + Sn in the sense of forms on D(|Hλ |1/2 ). Finally, by Lemma 6.7, [Hλ , Bε ]◦ is of finite rank, we also obtain (4.10) on the same domain. Now, Hλ ∈ C 1 (Aˆ n ; D(|Hλ |1/2 ), D(|Hλ |1/2 )∗ ) by Lemma 4.9 and Proposition C.6. We apply [16, Lemma 6.3] to get Hλ ∈ C 1 (Aˆ n ). 2 Therefore, the Virial theorem holds true when Aˆ n is the conjugate operator and when n is finite. However, there is no Mourre estimate for Aˆ n but only one for Aˆ ∞ . To overcome this problem, we take advantage of the monotone convergence of [H0 , iAn ]◦ to [H0 , iA∞ ]◦ and of the uniformity given in Lemma 4.8 to prove: Proposition 4.11 (Virial theorem). Assume (I0) and (I1). Let u be an eigenfunction of Hλ then u ∈ D(N 1/2 ) and u, (M∞ + Sˆ∞ )u = 0, as a quadratic form on D(N 1/2 ) ∩ D(Hλ ). Proof. First, Mn is a bounded form for Hλ . Note that 0 mn m implies 0 dΓ (mn ) dΓ (m) for all n. Now, since mn is increasing and converges to m as n goes to infinity, monotone convergence gives 0 g, Mn g g, M∞ g

and g, Mn g −→ g, M∞ g, n→∞

for all g ∈ D(M∞ ) ∩ D(Hλ ). Using some Cauchy sequences, this holds true also in the sense 1/2 of forms for g ∈ D(M∞ ) ∩ D(Hλ ). By authorizing the value +∞ on the two r.h.s. when



1/2 g∈ / D(M∞ ), one allows g ∈ D(Hλ ). On the other hand, Lemma 4.2 gives that Sˆn tends to Sˆ∞ as a quadratic form on D(H ). Let H˙ be the closure of quadratic form u, Hˆ λ u defined on D(M∞ ) ∩ D(H ). It is given 1/2 by the quadratic form u, (M∞ + Sˆ∞ )u defined on D(M∞ ) ∩ D(H ). Take now an eigenfunction u of Hλ . By Lemma 4.10 and the Virial theorem, see [1, Proposition 7.2.10], we get 1/2

u, (Mn + Sˆn )u = 0. By letting n go to infinity and noticing that D(M∞ ) = K ⊗ D(N 1/2 ), we get the result. 2

4.5. Estimation on the second commutator In this section, we discuss the second commutator hypothesis (I2) so as to obtain a limiting absorption principle through the Theorem C.8. We stress we forgo the hypothesis (I1b) in this section. We start with the important remark. Lemma 4.12. We have C 2 (A∞ , G , G ∗ ) = C 2 (Aˆ ∞ , G , G ∗ ). Proof. It is enough to show one inclusion. Using Proposition C.6 and the invariance of G and G ∗ given in Lemmata 4.4 and 4.7, one may work directly with A∞ and Aˆ ∞ . Let H ∈ B(G , G ∗ ) be in C 2 (A∞ , G , G ∗ ). One justifies the next expansion, by working in the form sense on D((A∗∞ )2 |G ) × D((A∞ )2 |G ). This is legal by using Lemma 4.5. We have: [H, Aˆ ∞ ], Aˆ ∞ = [H, A∞ ], A∞ + [H, A∞ ], λθ Bε + [H, λθ Bε ], A∞ + [H, λθ Bε ], λθ Bε . The first term is in B(G , G ∗ ) by hypothesis. For the second one, note that [H, A∞ ] ∈ B(G , G ∗ ) since H is C 1 (A∞ , G , G ∗ ). For the third one, we expand the commutator inside, use again that H ∈ C 1 (A∞ , G , G ∗ ) and finish with Lemma 4.5(iii). For the last one, one expands it and use Lemma 4.5(ii). 2 We start by discussing the C 2 theory used in [16,31] and check the point (M5 ). Through the isomorphism given by (1.3), we suppose the stronger (I2 ) α ∈ B(K , K ⊗ H˙ 2 (R+ ) ⊗ L2 (S d−1 )). This hypothesis is stronger than α ∈ B(K , K ⊗ H˙ s (R+ ) ⊗ L2 (S d−1 )) for s > 1, the one used in [9, Theorem 6.3]. Lemma 4.13. Assume (I0), (I1a) and (I2 ). Then Hλ ∈ C 2 (Aˆ ∞ , G , G ∗ ) and ◦ 2 ◦ [Hˆ λ , i Aˆ ∞ ]◦ = λφ a∞ α + λθ [Hλ , Bε ]◦ , iA + λ2 θ 2 Hˆ λ , Bε . Therefore, the hypothesis (M5 ) is fulfilled. Proof. We use Proposition 4.1(ii) and (iii) for the operator H := N − λφ(ia∞ α). Point (ii) is trivially satisfied. The hypothesis (I2) and Proposition 2.2 give (4.3). We obtain H ∈ C 1 (A∞ ; G , G ∗ ). 2



We now work with the hypothesis (I2) which is weaker than the one used in [9]. Thanks to Lemma 4.12, we have C 1,1 (A∞ , G , G ∗ ) := C 2 (A∞ , G , G ∗ ), B(G , G ∗ ) 1/2,1 = C 2 (Aˆ ∞ , G , G ∗ ), B(G , G ∗ ) 1/2,1 =: C 1,1 (Aˆ ∞ , G , G ∗ ). We refer to [1,32] for real interpolation. We obtain: Lemma 4.14. Assume (I0), (I1a) and (I2). Then Hλ ∈ C 1,1 (Aˆ ∞ , G , G ∗ ) and the hypothesis (M5) is fulfilled. Proof. By Lemma 4.13, we have H0 ∈ C 2 (Aˆ ∞ , G , G ∗ ). It is enough to show that φ(α) ∈ C 1,1 (A∞ , G , H ). By [9, Lemma 2.7], we have W∞,t φ(α) = φ(w∞,t α)W∞,t for t 0. By Proposition 2.2 and b 1 and since {W∞,t } b-preserves G , we get

1

W∞,t [W∞,t , φ(α)]

0

dt B(G ,H ) t 2

1

φ w∞,t [w∞,t , α] W∞,2t

dt

B(G ,H ) t 2

0

1 C

w∞,t [w∞,t , α]

dt B(K ,K ⊗h) t 2

.

0 2 )), B(K , K ⊗ h)) The latter is finite if and only if α belongs to (B(K , D(a∞ 1/2,1 . On the other hand, using the isomorphism (1.3) and Proposition 3.1, this space is the same as ˜ 1/2,1 . Finally, using [32, Section 2.10.4], (B(K , K ⊗ H˙ 2 (R+ ) ⊗ L2 (S d−1 )), B(K , K ⊗ h)) this is equivalent to the fact that α satisfies (I2). 2

5. A Mourre estimate far from the thresholds

5.1. The result

The aim of this part is to show a Mourre estimate far from the thresholds for small coupling constants. This is a well-known result, see [3,9] for instance. For the sake of completeness, we give a proof of the estimate. Doing so, we point out, in Remark 5.3, where the lack of positivity occurs above the thresholds. We use the approach based on the theory described in Appendix C. To obtain information just above the thresholds without supposing the Fermi golden rule, one should add a compact term in (5.1), see [15,31].

Theorem 5.1. Let $I_0$ be a compact interval containing no element of $\sigma(K)$. Suppose also that (I0) and (I1a) are satisfied. Then, for every open interval $I\subset I_0$:

(i) There are $M_\infty\ge1$ and $S_\infty$, a $|H_\lambda|^{1/2}$-bounded operator, such that $[H_\lambda,iA_\infty]^\circ = M_\infty+S_\infty$ holds in the sense of forms on $\mathcal D(N^{1/2})$.
(ii) The conditions (M1)–(M4) are satisfied.
(iii) There is $\lambda_0>0$ such that the following extended Mourre estimate

$$M_\infty+S_\infty\ge a(\lambda)E_I(H_\lambda)-b(\lambda)E_{I^c}(H_\lambda)\langle H_\lambda\rangle \tag{5.1}$$

holds true in the sense of forms on $\mathcal D(N^{1/2})$, for all $|\lambda|\le\lambda_0$. Here, $a(\lambda)$ is positive and can be written as $(1+O(\lambda))$. Besides, $b(\lambda)$ is also positive.
(iv) If (I1b) holds true, then $H_\lambda$ has no eigenvalue in $I$, for all $|\lambda|\le\lambda_0$.
(v) If (I2) holds true (and not necessarily (I1b)), then $H_\lambda$ has no eigenvalue in the interior of $I$, for all $|\lambda|\le\lambda_0$. Moreover, one obtains the estimates of the resolvent given in Theorem 1.2.

Proof. By Lemma 4.3, we have the first point. Point (ii) is shown in Section 4.2. Point (iii) follows from Proposition 5.2. Indeed, since $S_\infty$ is form bounded with respect to $H_\lambda$, we have that for all $\eta>0$

$$E_I(H_\lambda)S_\infty E_{I^c}(H_\lambda)+E_{I^c}(H_\lambda)S_\infty E_I(H_\lambda)\ge-\eta E_I(H_\lambda)S_\infty\langle H_\lambda\rangle^{-1}S_\infty E_I(H_\lambda)-\eta^{-1}E_{I^c}(H_\lambda)\langle H_\lambda\rangle. \tag{5.2}$$

Point (iv) follows from the Virial theorem, Proposition 4.11. Finally, Theorem C.8 gives point (v); the space $\mathscr G$ appearing therein is identified in Lemma 4.3. □

5.2. The inequality

Here we establish the extended Mourre estimate away from the thresholds. We use only (I0) and (I1a) and do not assume any Fermi golden rule.

Proposition 5.2. Let $I_0$ be a compact interval such that $\sigma(K)\cap I_0=\emptyset$. Let $I$ be an open interval included in $I_0$. Let $M_\infty := N+1\otimes P_\Omega\ge1$ and let $S_\infty := -1\otimes P_\Omega-\lambda\phi(ia_\infty\alpha)$. For $\lambda$ small enough,

$$M_\infty+E_I(H_\lambda)S_\infty E_I(H_\lambda)\ge\big(1+O(\lambda)\big)E_I(H_\lambda) \tag{5.3}$$

holds true in the sense of forms on $\mathcal D(N^{1/2})$.

Proof. Let $J$ be a compact set containing $I$ and contained in the interior of $I_0$. Note that (1.4) gives $E_J(H_0)1_{\mathscr K}\otimes P_\Omega=0$. By Proposition 2.2, we derive:

$$E_J(H_0)S_\infty E_J(H_0) = -\lambda E_J(H_0)\phi(ia_\infty\alpha)E_J(H_0) = O(\lambda)E_J(H_0). \tag{5.4}$$

As $M_\infty\ge1$, it remains to prove that $E_I(H_\lambda)S_\infty E_I(H_\lambda) = O(\lambda)E_I(H_\lambda)$. We insert $E_J(H_0)+E_{J^c}(H_0)$ on the right and on the left of $S_\infty$. By (5.4), all four terms are actually $O(\lambda)E_I(H_\lambda)$. Indeed, Proposition 2.2 gives for instance that

$$E_I(H_\lambda)E_J(H_0)S_\infty E_{J^c}(H_0)E_I(H_\lambda) = O(\lambda)E_I(H_\lambda).$$

For the right-hand side, take $h\in C_c^\infty(J)$ such that $h|_I=1$. We have $E_I(H_\lambda)E_{J^c}(H_0) = E_I(H_\lambda)\big(h(H_\lambda)-h(H_0)\big)E_{J^c}(H_0) = O(\lambda)$, by Lemma 5.4. □



Remark 5.3. This proof does not work over one of the thresholds $\{k_i\}_{i=0,\ldots,n}$. Here, we use in a crucial way that $E_J(H_0)1\otimes P_\Omega=0$. However, when $\sigma(K)\cap I=\{k_i\}$, this expression is never $0$ and is of norm $1$. A crude estimate would give

$$M_\infty+E_I(H_\lambda)S_\infty E_I(H_\lambda)\ge O(\lambda)E_I(H_\lambda). \tag{5.5}$$

We have no control on the sign. This is no surprise, as we know that one may uncouple the two parts of the system and an eigenvalue can remain, see Section 6. To control the sign, one needs to gain some positivity just above $P_{k_i}\otimes P_\Omega$. This is the rôle of the Fermi golden rule and of the operator $B_\varepsilon$.

Here we have used the elementary:

Lemma 5.4. Let $h\in C_c^\infty(\mathbb R)$ and $s\le1/2$. Let $V$ be a symmetric operator which is $H_0$-form bounded with constant lower than $1$. Then, there is $C$ such that $\|\langle H_0\rangle^s(h(H_0)-h(H_0+\lambda V))\|\le C|\lambda|$.

6. A Mourre estimate at the thresholds

In this section we study the absence of eigenvalues above one of the thresholds. From a physical point of view, as soon as the interaction is on, one expects the embedded eigenvalues to disappear into the complex plane and to turn into resonances. This is however not mathematically true, as one may uncouple the bosonic field and the atom. Take for instance $\omega$ bounded and $\alpha\in\mathcal B(\mathscr K,\mathscr K\otimes\mathfrak h)$ given by $\alpha(x) := 1\otimes b$ for all $x\in\mathscr K$, where $\omega b\in\mathfrak h$. After a dressing transformation, see for instance [8, Theorem 3.5], the operator $H_\lambda$ is unitarily equivalent to the free operator $K\otimes1_{\Gamma(\mathfrak h)}+1_{\mathscr K}\otimes d\Gamma(\tilde\omega_\lambda)$, for some $\tilde\omega_\lambda\in\mathcal B(\mathfrak h)$. Therefore, $H_\lambda$ has the same eigenvalues as $H_0$ for all $\lambda$. Note that it is no restriction to suppose that $\omega$ is bounded, thanks to the exponential law, see for instance [7, Section 3.2]. We couple the two systems through a Fermi golden rule assumption.

6.1. The Fermi golden rule hypothesis

We choose one eigenvalue $k_{i_0}$ of $H_{\rm el}$ for $i_0>0$. Let $P := P_{k_{i_0}}\otimes P_\Omega$ and let $\bar P := 1-P$. Note that $P$ is of finite rank. We give an implicit hypothesis on $\alpha$ and explain how to check it in Appendix A.

Definition 6.1. We say that the Fermi golden rule holds true at energy $k$ for a couple $(H_0,\alpha)$ if there exist positive $\varepsilon_0$, $c_1$ and $c_2$ such that

$$c_1P\ge P\phi(\alpha)\bar P\operatorname{Im}(H_0-k+i\varepsilon)^{-1}\bar P\phi(\alpha)P\ge c_2P \tag{6.1}$$

holds true in the sense of forms, for all $\varepsilon_0>\varepsilon>0$. Due to the Fock space structure, one may omit $\bar P$ in (6.1), but we keep it to emphasize the link with hypotheses of this type in other fields (as for Schrödinger operators). Since $P$ is of finite rank, this property follows from (1.2).



The upper and the lower bounds of (6.1) are crucial in our analysis. We shall keep track of the lower bound in the sequel so as to emphasize the gain of positivity it provides. We set some notation:

$$R_\varepsilon := \big((H_0-k_{i_0})^2+\varepsilon^2\big)^{-1/2},\qquad\bar R_\varepsilon := \bar PR_\varepsilon\qquad\text{and}\qquad F_\varepsilon := \bar R_\varepsilon^2. \tag{6.2}$$

Note that $\varepsilon R_\varepsilon^2 = \operatorname{Im}(H_0-k_{i_0}+i\varepsilon)^{-1}$ and that $R_\varepsilon$ commutes with $P$. We get:

$$(c_1/\varepsilon)P\ge P\phi(\alpha)F_\varepsilon\phi(\alpha)P\ge(c_2/\varepsilon)P, \tag{6.3}$$

for $\varepsilon_0>\varepsilon>0$. It follows that

$$\|R_\varepsilon\| = 1/\varepsilon\qquad\text{and}\qquad\|P\phi(\alpha)\bar R_\varepsilon\|\le c_1^{1/2}\varepsilon^{-1/2}. \tag{6.4}$$

As pointed out in Remark 5.3, we seek some more positivity for the commutator above the energy of $P = P_{k_i}\otimes P_\Omega$. We proceed as in [5] and set

$$B_\varepsilon := \operatorname{Im}\big(\bar R_\varepsilon^2\phi(\alpha)P\big).$$

It is a finite rank operator, see Lemma 4.5 for more properties. Observe now that we gain some positivity as soon as $\lambda\ne0$:

$$P[H_\lambda,i\lambda B_\varepsilon]P = \lambda^2P\phi(\alpha)F_\varepsilon\phi(\alpha)P\ge(c_2\lambda^2/\varepsilon)P. \tag{6.5}$$

It is therefore natural to modify our conjugate operator. We set

$$\hat A_n := A_n+\lambda\theta B_\varepsilon,\qquad\text{for } n\in\mathbb N^*\cup\{\infty\}. \tag{6.6}$$

It depends on the two parameters $\lambda\in\mathbb R$, $\varepsilon>0$ and on an extra technical parameter $\theta>0$. For the sake of clarity, we do not write these dependences explicitly. Heuristically, the operator $A_\infty$ gives the positivity around the threshold and $B_\varepsilon$ completes it just above. We mention that $\hat A_\infty$ is maximal symmetric and generates a semigroup of isometries, see Lemma 3.2.

6.2. Main result

We prove the extended Mourre estimate over the threshold $k_{i_0}$. This is the heart of the paper. The proof relies on the Feshbach method. We exploit the freedom we have so far on $\varepsilon$ and $\theta$: set $\varepsilon := \varepsilon(\lambda)$ and $\theta := \theta(\lambda)$ and suppose that $\lambda = o(\varepsilon)$, $\varepsilon = o(\theta)$ and $\theta = o(1)$ as $\lambda$ tends to $0$. We summarize this as:

$$|\lambda|\ll\varepsilon\ll\theta\ll1,\qquad\text{as }\lambda\text{ tends to }0. \tag{6.7}$$

In [5], this condition is more involved and the size of the interval comes into play. We stress that the conjugate operator $\hat A_\infty$ depends on these three parameters.



Theorem 6.2. Let $I_0$ be a compact interval containing $k_{i_0}$ and no other $k_i$. Assume that the Fermi golden rule hypothesis (6.1) and (6.7) hold true. Suppose also that (I0) and (I1a) are satisfied. Then, for every open interval $I\subset I_0$:

(i) There are $M_\infty\ge1$ and $\hat S_\infty$, a $|H_\lambda|^{1/2}$-bounded operator, such that $[H_\lambda,i\hat A_\infty]^\circ = M_\infty+\hat S_\infty$ holds in the sense of forms on $\mathcal D(N^{1/2})$.
(ii) There is $\lambda_0>0$ such that the following extended Mourre estimate

$$M_\infty+\hat S_\infty\ge a(\lambda)E_I(H_\lambda)-b(\lambda)E_{I^c}(H_\lambda)\langle H_\lambda\rangle \tag{6.8}$$

holds true in the sense of forms on $\mathcal D(N^{1/2})$, for all $\lambda\in(0,\lambda_0)$. Here, one has $a(\lambda) = \lambda^2\theta c_2/(5\varepsilon)$ and $b(\lambda)>0$.
(iii) If (I1b) holds true, then $H_\lambda$ has no eigenvalue in $I$.
(iv) If (I2) holds true (and not necessarily (I1b)), then $H_\lambda$ has no eigenvalue in the interior of $I$, for all $|\lambda|\le\lambda_0$. Moreover, one obtains the estimates of the resolvent given in Theorem 1.2.

Remark 6.3. By taking $\theta$ and $\varepsilon$ as powers of $\lambda$, one may take $a(\lambda) = \lambda^{1+\eta}/5$, for some $\eta>0$. We do not reach the power $1$ as expected in Remark 5.3. This is due to the non-linearity in $\lambda$ of the conjugate operator. Note also that this is very small, and so one does not expect a fast propagation of the state, i.e. the eigenvalue turns into a resonance. See for instance [4,22] for some lifetime estimates.

The proof of this theorem requires several steps and is given in Section 6.4. We first apply the Feshbach method and deal with the unperturbed spectral measure in Proposition 6.5. Next, in Proposition 6.8, we change the spectral measure.

6.3. The infrared decomposition

As suggested by (6.5), one expects to have to split the space with the projector $P$ to take advantage of this positivity. To do so, we use the Feshbach method. As our result is local in energy, we fix a compact interval $J$ which contains the selected eigenvalue $k_{i_0}$ and no others. We consider the Hilbert space $\mathscr H_J := E_J(H_0)\mathscr H$. Let $\mathscr H_J^v := P\mathscr H_J$ and let $\mathscr H_J^{\bar v}$ be its orthogonal complement in $\mathscr H_J$. The index $v$ stands for vacuum. Given $H$ bounded in $\mathscr H_J = \mathscr H_J^{\bar v}\oplus\mathscr H_J^v$, we write it as a matrix according to this decomposition:

$$H = \begin{pmatrix}H^{\bar v\bar v}&H^{\bar vv}\\H^{v\bar v}&H^{vv}\end{pmatrix}. \tag{6.9}$$

We recall the Feshbach method, see [3] and see also [9, Section 3.2] for more results of this kind.

Proposition 6.4. Assume that $z\notin\sigma(H^{\bar v\bar v})$. We define

$$G_v(z) := z1^{vv}-H^{vv}-H^{v\bar v}\big(z1^{\bar v\bar v}-H^{\bar v\bar v}\big)^{-1}H^{\bar vv}.$$

Then, $z\in\sigma(H)$ if and only if $0\in\sigma(G_v(z))$.
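The Feshbach criterion can be checked by hand in the simplest possible situation. The sketch below is an illustration under strong assumptions, not the setting of the paper: both blocks are one-dimensional, so $H$ is a real symmetric $2\times2$ matrix with vacuum entry `a`, complementary entry `d` and real coupling `b`. The function names `feshbach_map` and `char_poly` are hypothetical. In this case $G_v(z) = z-a-b^2/(z-d)$ and $\det(z-H) = (z-d)\,G_v(z)$, so for $z\ne d$ the zeros of $G_v$ are exactly the eigenvalues of $H$.

```python
from fractions import Fraction


def feshbach_map(z, a, b, d):
    """Scalar Feshbach map G_v(z) = z - a - b (z - d)^{-1} b for H = [[d, b], [b, a]].

    Requires z != d, i.e. z outside the spectrum of the complementary block.
    """
    z, a, b, d = map(Fraction, (z, a, b, d))
    if z == d:
        raise ValueError("z must not lie in the spectrum of the complementary block")
    return z - a - b * b / (z - d)


def char_poly(z, a, b, d):
    """det(z - H) for the same 2x2 matrix."""
    z, a, b, d = map(Fraction, (z, a, b, d))
    return (z - a) * (z - d) - b * b


# Example: a = 0, d = 3, b = 2 gives det(z - H) = z^2 - 3z - 4 = (z - 4)(z + 1).
a, b, d = 0, 2, 3
for z in (4, -1, 1):
    print(z, feshbach_map(z, a, b, d), char_poly(z, a, b, d))
```

For this example the map vanishes at the two eigenvalues and not at $z=1$, in line with Proposition 6.4.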



The reader should keep in mind that $J$ will be chosen slightly bigger than the interval $I$. This loss comes from the change of spectral measure from $H_0$ to $H_\lambda$. The aim of the section is to show the following proposition about $\hat S_\infty$, see (4.7).

Proposition 6.5. Let $J$ be a compact interval containing $k_{i_0}$ and no other $k_i$. Suppose that the Fermi golden rule (6.1) and (6.7) hold. Then
\[
E_J(H_0) \hat S_\infty E_J(H_0) \geq \big( c_2 \lambda^2 \theta \varepsilon^{-1}/3 - 1 \big) E_J(H_0) \tag{6.10}
\]
holds true in the sense of forms, for $\lambda$ small enough.

We go through a series of lemmata and give the proof at the end of the section. The $-1$ on the r.h.s. seems at first sight disturbing, as we seek some positivity. It will be balanced when we add the operator $M_\infty \geq 1$, see Section 6.4. In the first place, we estimate the parts of $\hat S_\infty$.

Lemma 6.6. With respect to the decomposition (6.9), as $\lambda$ goes to 0, we have
\[
E_J(H_0) \big( \lambda \phi(a_\infty \alpha) - P \big) E_J(H_0) = \begin{pmatrix} O(\lambda) & O(\lambda) \\ O(\lambda) & -1 \end{pmatrix}.
\]

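Proof. The part in $P$ follows directly from (1.4). The one in $\alpha$ results from Proposition 2.2 and the fact that $P \phi(a_\infty \alpha) P = 0$. □

Lemma 6.7. Suppose that the Fermi golden rule (6.1) holds true. Then, the form $[H_\lambda, B_\varepsilon]$ defined on $\mathcal{D}(H_\lambda) \times \mathcal{D}(H_\lambda)$ extends to a finite rank operator on $\mathcal{H}$, denoted by $[H_\lambda, B_\varepsilon]^\circ$. As $\lambda$ tends to 0, we have
\[
\big\| [H_\lambda, \lambda\theta B_\varepsilon]^\circ \big\|_{\mathcal{B}(\mathcal{H})} \leq O\big(\lambda\theta\varepsilon^{-1/2}\big) + O\big(\lambda^2\theta\varepsilon^{-3/2}\big). \tag{6.11}
\]
Besides, with respect to the decomposition (6.9), we have:
\[
E_J(H_0) [H_0, \lambda\theta B_\varepsilon]^\circ E_J(H_0) = \begin{pmatrix} 0 & O(\lambda\theta\varepsilon^{-1/2}) \\ O(\lambda\theta\varepsilon^{-1/2}) & 0 \end{pmatrix}
\]
and
\[
E_J(H_0) [\lambda\phi(\alpha), \lambda\theta B_\varepsilon]^\circ E_J(H_0) = \begin{pmatrix} O(\lambda^2\theta\varepsilon^{-3/2}) & O(\lambda^2\theta\varepsilon^{-3/2}) \\ O(\lambda^2\theta\varepsilon^{-3/2}) & \lambda^2\theta F_\varepsilon \end{pmatrix}.
\]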

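Proof. We give some estimates independent of $J$. We expand the commutators; this can be justified by considering the commutator in the form sense on $\mathcal{D}(H_\lambda)$. First,
\[
[d\Gamma(\omega), R_\varepsilon^2 \phi(\alpha) P] = [H_0 - k_{i_0}, R_\varepsilon^2 \phi(\alpha) P] = \bar P (H_0 - k_{i_0}) R_\varepsilon R_\varepsilon \phi(\alpha) P + \bar P R_\varepsilon R_\varepsilon \phi(\alpha) P (H_0 - k_{i_0}) = \bar P\, O\big(\varepsilon^{-1/2}\big) P + 0. \tag{6.12}
\]
Indeed, the first term derives from (6.4) and $(H_0 - k_{i_0}) R_\varepsilon = O(1)$. For the second one, note that $(H_0 - k_{i_0}) P = 0$. We turn to the second estimate and apply Proposition 2.2. We get $\phi(\alpha) R_\varepsilon = \phi(\alpha) R_1 R_1^{-1} R_\varepsilon = O(\varepsilon^{-1})$. By (6.4), we have
\[
[\phi(\alpha), R_\varepsilon^2 \phi(\alpha) P] = P F_\varepsilon P + \bar P \phi(\alpha) R_\varepsilon R_\varepsilon \phi(\alpha) P + \bar P R_\varepsilon R_\varepsilon \phi(\alpha) P \phi(\alpha) (P + \bar P) = P F_\varepsilon P + \bar P\, O\big(\varepsilon^{-3/2}\big) P + \bar P\, O\big(\varepsilon^{-3/2}\big) \bar P. \tag{6.13}
\]
Gathering lines (6.12) and (6.13), we get (6.11). We finish by adding $E_J(H_0)$. □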

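We go into the Feshbach method and conclude.

Proof of Proposition 6.5. We set $C_\lambda := E_J(H_0) \hat S_\infty E_J(H_0)$. First observe that for all $\mu \leq -3/4$, $C_\lambda^{\bar v \bar v} - \mu$ is invertible in $\mathcal{B}(\mathcal{H}^{\bar v})$ and $\| (C_\lambda^{\bar v \bar v} - \mu)^{-1} \|_{\mathcal{B}(\mathcal{H}^{\bar v})} \leq 2$. Indeed, from Lemmas 6.6 and 6.7, we have that $C_\lambda^{\bar v \bar v}$ is bounded from below by $O(\lambda^2\theta\varepsilon^{-3/2}) + O(\lambda)$. This is bigger than $-1/2$ by (6.7), for $\lambda$ small enough. We now estimate from below the internal energy of $C_\lambda$, uniformly in $\mu \leq -3/4$. By Lemmata 6.6 and 6.7, the first part and the Fermi golden rule (6.3), we infer
\[
C_\lambda^{vv} - C_\lambda^{v \bar v} \big( C_\lambda^{\bar v \bar v} - \mu \big)^{-1} C_\lambda^{\bar v v} + 1 \geq c_2 \lambda^2 \theta \varepsilon^{-1} + \Big( O\big(\lambda\theta\varepsilon^{-1/2}\big) + O\big(\lambda^2\theta\varepsilon^{-3/2}\big) + O(\lambda) \Big)^2
\]
\[
= c_2 \lambda^2 \theta \varepsilon^{-1} \Big( 1 + O(\theta) + O\big(\lambda\theta\varepsilon^{-1}\big) + O\big(\varepsilon^{1/2}\big) + O\big(\lambda^2\theta\varepsilon^{-2}\big) + O\big(\lambda\varepsilon^{-1/2}\big) + O\big(\theta^{-1}\varepsilon\big) \Big) \geq c_2 \lambda^2 \theta \varepsilon^{-1}/2,
\]
for $\lambda$ small enough.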
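The passage from the squared error term to the six relative error terms in the estimate above is pure exponent bookkeeping, and it can be verified mechanically. The following sketch assumes only that the three error terms are the monomials $\lambda\theta\varepsilon^{-1/2}$, $\lambda^2\theta\varepsilon^{-3/2}$ and $\lambda$, and that the leading term is $\lambda^2\theta\varepsilon^{-1}$; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# A monomial lambda^a * theta^b * eps^c is stored as the exponent triple (a, b, c).
half = Fraction(1, 2)
error_terms = [
    (Fraction(1), Fraction(1), -half),        # lambda * theta * eps^(-1/2)
    (Fraction(2), Fraction(1), -3 * half),    # lambda^2 * theta * eps^(-3/2)
    (Fraction(1), Fraction(0), Fraction(0)),  # lambda
]
leading = (Fraction(2), Fraction(1), Fraction(-1))  # lambda^2 * theta * eps^(-1)

def times(p, q):
    """Product of two monomials: exponents add."""
    return tuple(x + y for x, y in zip(p, q))

def over(p, q):
    """Quotient of two monomials: exponents subtract."""
    return tuple(x - y for x, y in zip(p, q))

# All pairwise products appearing in the square, divided by the leading term.
relative_terms = {
    over(times(p, q), leading)
    for p, q in combinations_with_replacement(error_terms, 2)
}

for a, b, c in sorted(relative_terms):
    print(f"lambda^{a} * theta^{b} * eps^{c}")
```

The six triples produced correspond to $\theta$, $\lambda\theta\varepsilon^{-1}$, $\varepsilon^{1/2}$, $\lambda^2\theta\varepsilon^{-2}$, $\lambda\varepsilon^{-1/2}$ and $\theta^{-1}\varepsilon$.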

We have used (6.7) for the last line. We are now able to conclude. Since $J$ contains $k_{i_0}$ and no other $k_i$, we have $E_J(H_0) P_\Omega = P$ by (1.4). Let $\mu < c_2 \lambda^2 \theta \varepsilon^{-1}/2 - 1$. Note that $\mu \leq -3/4$ for $\lambda$ small enough by (6.7). Thanks to the previous lower bound, we can apply Proposition 6.4 with respect to the decomposition (6.9) for $C_\lambda$ and with $z = \mu$ to get the result. □

6.4. The extended Mourre estimate

At the end of the section, we establish the extended Mourre estimate. We start by enhancing Proposition 6.5.

Proposition 6.8. Let $I$ be a compact interval containing $k_{i_0}$ and no other $k_i$. Assume the Fermi golden rule (6.1) and (6.7). Then,
\[
E_I(H_\lambda) \hat S_\infty E_I(H_\lambda) \geq \big( c_2 \lambda^2 \theta \varepsilon^{-1}/4 - 1 \big) E_I(H_\lambda)
\]
holds true in the sense of forms for $\lambda$ small enough.

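Proof. Let $J$ be a compact interval as in Proposition 6.5 such that $I$ is included in its interior and contains $k_{i_0}$. By (6.7), it is enough to prove
\[
E_I(H_\lambda) \big( \lambda\phi(a_\infty\alpha) + [H_\lambda, i\lambda\theta B_\varepsilon]^\circ - P_\Omega \big) E_I(H_\lambda) \geq \Big( c_2 \lambda^2 \theta \varepsilon^{-1}/3 + O\big(\lambda^2\big) + O\big(\lambda^2\theta\varepsilon^{-1/2}\big) + O\big(\lambda^3\theta\varepsilon^{-3/2}\big) - 1 \Big) E_I(H_\lambda). \tag{6.14}
\]
We start with the left-hand side of (6.14) and introduce $E_J(H_0) + E_{J^c}(H_0)$ on the right and on the left of $[H_\lambda, i\lambda\theta B_\varepsilon]^\circ + \lambda\phi(a_\infty\alpha) - P_\Omega$. Note that both spectral measures are bounded in $\mathcal{D}(H_0)$, endowed with the graph norm. We need to control the mixed terms. Using Lemma 5.4 and (6.11), we get
\[
E_I(H_\lambda) E_{J^c}(H_0) [H_\lambda, i\lambda\theta B_\varepsilon]^\circ E_J(H_0) E_I(H_\lambda) = \Big( O\big(\lambda^2\theta\varepsilon^{-1/2}\big) + O\big(\lambda^3\theta\varepsilon^{-3/2}\big) \Big) E_I(H_\lambda),
\]
and a better term for $E_I(H_\lambda) E_{J^c}(H_0) [H_\lambda, i\lambda\theta B_\varepsilon]^\circ E_{J^c}(H_0) E_I(H_\lambda)$. Since the term $\phi(a_\infty\alpha) \langle H_0 \rangle^{-1/2}$ is bounded in $\mathcal{H}$ by Proposition 2.2, Lemma 5.4 gives
\[
E_I(H_\lambda) E_{J^c}(H_0) \lambda\phi(a_\infty\alpha) E_J(H_0) E_I(H_\lambda) = O\big(\lambda^2\big) E_I(H_\lambda),
\]
and a better term for the full-mixed term. As $H_0$ commutes with $P_\Omega$, we infer $E_I(H_\lambda) E_{J^c}(H_0) P_\Omega E_J(H_0) E_I(H_\lambda) = 0$. Now using Proposition 6.5 we obtain
\[
E_I(H_\lambda) \big( [H_\lambda, i\lambda\theta B_\varepsilon]^\circ + \lambda\phi(a_\infty\alpha) - P_\Omega \big) E_I(H_\lambda) \geq \big( c_2 \lambda^2 \theta \varepsilon^{-1}/3 - 1 \big) E_I(H_\lambda) E_J(H_0) E_I(H_\lambda) + \Big( O\big(\lambda^2\big) + O\big(\lambda^2\theta\varepsilon^{-1/2}\big) + O\big(\lambda^3\theta\varepsilon^{-3/2}\big) \Big) E_I(H_\lambda).
\]
Finally, the estimate (6.14) follows by noticing that $E_I(H_\lambda) E_J(H_0) E_I(H_\lambda)$ is equal to $(1 + O(\lambda^2)) E_I(H_\lambda)$, again by Lemma 5.4. □

We are now able to prove the announced result.

Proof of Theorem 6.2. The operators $M_\infty$ and $\hat S_\infty$ are given in (4.4) and (4.7). Points (i) and (ii) are given in Section 4.3. By Proposition 6.8 and since $M_\infty \geq 1$,
\[
M_\infty + E_I(H_\lambda) \hat S_\infty E_I(H_\lambda) \geq \big( c_2 \lambda^2 \theta \varepsilon^{-1}/4 \big) E_I(H_\lambda)
\]
holds true in the form sense on $\mathcal{D}(N^{1/2})$. Then, (5.2) gives (iii). The point (iv) follows from the Virial theorem, Proposition 4.11. Finally Theorem C.8 gives point (v). Indeed, the space $\mathcal{G}$ appearing therein is identified in Lemma 4.3. It remains to notice that the spaces (C.4) given for $\hat A_\infty$ and $A_\infty$ are the same. This follows from the fact that these operators have the same domain in $\mathcal{G}^*$, by Lemma 4.5, and that the spaces $\mathcal{G}_s^*$ are given by complex interpolation. □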


Acknowledgments

I express my gratitude to Jan Dereziński who encouraged me in pursuing these ideas. I would also like to thank Volker Bach, Alain Joye, Christian Gérard, Vladimir Georgescu, Wolfgang Spitzer, Claude-Alain Pillet and Zied Ammari for some useful discussions. This work was partially supported by the Postdoctoral Training Program HPRN-CT-2002-0277.

Appendix A. Level shift operator

In this paper, we never assume explicitly that the eigenvalue under study differs from the ground state energy of $H_0$. It is well known, however, that the ground state is expected to persist when the perturbation is switched on, see for instance [2,3,17]. This would contradict the hypothesis made on the Fermi golden rule. Therefore, in this section, we explain how one may check the Fermi golden rule assumption (6.1) and why it is not fulfilled at the ground state energy. This also explains the compatibility with (I0)–(I2). The computations are standard and we keep them simple. See also [3,10,25].

Let $(e_i)$ be an orthonormal basis of eigenvectors of $K$ relative to the eigenvalues $k_i$. To simplify the computation, say that $k_{i_0}$ is simple. Since $k_{i_0}$ is simple and since $\phi(\alpha)(e_{i_0} \otimes \Omega) = \alpha(e_{i_0}) \in \mathcal{K} \otimes \mathfrak{h}$, (6.1) is equivalent to:
\[
c_1 \geq \big\langle \alpha(e_{i_0}), \operatorname{Im}(H_0 - k_{i_0} + i\varepsilon)^{-1} \alpha(e_{i_0}) \big\rangle \geq c_2 > 0, \quad \text{for } 0 < \varepsilon \leq \varepsilon_0.
\]
We have $\alpha(e_{i_0}) = \sum_{i=1,\ldots,n} e_i \otimes f_{i,i_0} \in \mathcal{K} \otimes \mathfrak{h}$, where $f_{i,i_0} = \langle e_i \otimes 1_{\mathfrak{h}}, \alpha(e_{i_0}) \rangle$. As $\mathfrak{h} = L^2(\mathbb{R}^d, dk)$, we write $f_{i,i_0}$ as a function of $k$. We go into polar coordinates, see (1.3), and infer
\[
c_1 \geq \sum_{i=1,\ldots,n} \int_0^\infty \int_{S^{d-1}} \frac{\varepsilon\, |f_{i,i_0}|^2(r\theta)\, r^{d-1}}{(r + \lambda_i - \lambda_{i_0})^2 + \varepsilon^2} \, d\sigma \, dr \geq c_2 > 0.
\]
Suppose now that $(r, \theta) \mapsto |f_{i,i_0}|^2(r\theta)\, r^{d-1}$ is continuous and in $L^1$. Then by dominated convergence, we let $\varepsilon$ go to zero and get:
\[
c_1 \geq \sum_{i=1,\ldots,i_0} c_i (\lambda_{i_0} - \lambda_i)^{d-1} \int_{S^{d-1}} |f_{i,i_0}|^2\big(\theta(\lambda_{i_0} - \lambda_i)\big) \, d\sigma \geq c_2 > 0. \tag{A.1}
\]
Here note that, up to the constant $c_i$, $r \mapsto \varepsilon/((r + \lambda_i - \lambda_{i_0})^2 + \varepsilon^2)$ is a Dirac sequence if and only if $\lambda_i \leq \lambda_{i_0}$. To satisfy the Fermi golden rule, it is enough to have a non-zero term in (A.1). When $d \geq 2$, we stress that the sum is taken up to $i_0 - 1$ and therefore is empty at the ground state energy. When the 1-particle space is over $\mathbb{R}$, it cannot be satisfied at this level of energy either. Indeed, one would obtain a contradiction with the hypothesis (I0) and the continuity of $(r, \theta) \mapsto |f_{i,i_0}|^2(r\theta)$.
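The Dirac-sequence claim used to pass to the limit in (A.1) can be illustrated numerically. The sketch below is not taken from the paper: it assumes a hypothetical test profile $g(r) = e^{-r}$ in place of $|f_{i,i_0}|^2 r^{d-1}$, writes $a$ for $\lambda_{i_0} - \lambda_i$, and evaluates the radial integral after the substitution $r = a + \varepsilon \tan t$, which turns the Lorentzian weight into the constant 1.

```python
import math

def lorentz_smear(g, a, eps, n=20000):
    """Midpoint-rule value of int_0^inf eps * g(r) / ((r - a)^2 + eps^2) dr.

    With r = a + eps * tan(t) the integrand becomes g(a + eps * tan(t)),
    integrated over t from atan(-a / eps) (r = 0) to pi / 2 (r = infinity).
    """
    t_low = math.atan(-a / eps)
    step = (math.pi / 2 - t_low) / n
    total = 0.0
    for j in range(n):
        t = t_low + (j + 0.5) * step
        total += g(a + eps * math.tan(t))
    return total * step

def profile(r):
    """Hypothetical smooth, integrable test profile."""
    return math.exp(-r)

eps = 1e-4
inside = lorentz_smear(profile, 1.0, eps)    # a > 0: full Dirac mass, close to pi * g(a)
boundary = lorentz_smear(profile, 0.0, eps)  # a = 0: half the mass, close to (pi / 2) * g(0)
outside = lorentz_smear(profile, -1.0, eps)  # a < 0: the limit vanishes
print(inside, boundary, outside)
```

In this sketch the limit is non-zero only when $a \geq 0$, which is the numerical counterpart of the condition $\lambda_i \leq \lambda_{i_0}$; the factor $\pi$ (or $\pi/2$ at $a = 0$) plays the role of the constant $c_i$.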



Appendix B. Properties of $C_0$-semigroups

In this section, we gather various facts about $C_0$-semigroups that we use throughout this article. Let $\mathcal{H}$ be a Hilbert space. Recall that w-lim denotes the weak limit.

Definition B.1. We say that $\mathbb{R}^+ \ni t \mapsto W_t$, with $W_t \in \mathcal{B}(\mathcal{H})$, is a $C_0$-semigroup if
(1) $W_0 = \mathrm{Id}$ and $W_{s+t} = W_s W_t$, for all $s, t \geq 0$,
(2) w-$\lim_{t \to 0^+} W_t = \mathrm{Id}$.

Automatically, this implies that $\mathbb{R}^+ \ni t \mapsto W_t$ is strongly continuous, see [23, Theorem 10.6.5]. We keep the convention of [16] and define the generator of $\{W_t\}_{t \geq 0}$ as being the closed densely defined operator $A$ defined on
\[
\mathcal{D}(A) := \Big\{ u \in \mathcal{H} \ \Big|\ \lim_{t \to 0^+} (it)^{-1} (W_t - \mathrm{Id}) u \text{ exists} \Big\}.
\]
We denote this limit by $Au$. Formally, one reads $W_t = e^{itA}$. The map $\mathbb{R}^+ \ni t \mapsto W_t^*$ being weakly continuous, $\{W_t^*\}_{t \geq 0}$ is also a $C_0$-semigroup. Its generator is $-A^*$. We recall the Nelson lemma, see for instance [6, Corollary 3.1.7].

Lemma B.2 (Nelson lemma). Let $\mathcal{D}$ be a dense subspace of $\mathcal{H}$ contained in the domain of the generator of a $C_0$-semigroup $\{W_t\}_{t \geq 0}$. If $W_t \mathcal{D} \subset \mathcal{D}$ then $\mathcal{D}$ is a core for the generator of $\{W_t\}_{t \geq 0}$.

Let $\mathcal{G}$ and $\mathcal{H}$ be two Hilbert spaces such that $\mathcal{G} \subset \mathcal{H}$ continuously and densely. Using the Riesz isomorphism, we identify $\mathcal{H}$ with $\mathcal{H}^*$, where the latter is the set of anti-linear forms acting on $\mathcal{H}$. We infer the following scale of spaces $\mathcal{G} \subset \mathcal{H} \simeq \mathcal{H}^* \subset \mathcal{G}^*$ with continuous and dense embeddings. In order to define the restriction of $W_t$ to $\mathcal{G}$, we set:

Definition B.3. Let $\{W_t\}_{t \geq 0}$ be a $C_0$-semigroup on $\mathcal{H}$. We say that $\mathcal{G}$ is b-stable (boundedly stable) under the action of $\{W_t\}_{t \geq 0}$ if
(i) $W_t \mathcal{G} \subset \mathcal{G}$, for all $t \in \mathbb{R}^+$,
(ii) $\sup_{t \in [0,1]} \| W_t u \|_{\mathcal{G}}$ is bounded for all $u \in \mathcal{G}$.

Remark B.4. Note that unlike for $C_0$-groups, the second condition is required to ensure the continuity at 0. These two conditions are equivalent to the fact that $\{W_t|_{\mathcal{G}}\}_{t \geq 0}$ is a $C_0$-semigroup on $\mathcal{G}$.

Assuming that $\mathcal{G}$ is b-stable under the action of $\{W_t\}_{t \geq 0}$, we denote by $A_{\mathcal{G}}$ its generator. Thus, $A_{\mathcal{G}}$ is the restriction of $A$ and its domain is given by
\[
\mathcal{D}(A_{\mathcal{G}}) = \{ u \in \mathcal{G} \cap \mathcal{D}(A) \mid Au \in \mathcal{G} \}.
\]
If $\mathcal{G}^*$ is also b-stable under $\{W_t^*\}_{t \geq 0}$, we denote by $A_{\mathcal{G}^*}$ the generator of $\{W_t\}_{t \geq 0}$ extended to $\mathcal{G}^*$. As above $A$ is a restriction of $A_{\mathcal{G}^*}$ and thanks to the Nelson lemma, we have that $A$ is the closure of $A_{\mathcal{G}}$ in $\mathcal{H}$ and that $A_{\mathcal{G}^*}$ is the closure of $A$ in $\mathcal{G}^*$. We drop the subscript $\mathcal{G}$ when no confusion can arise. We recall the following perturbation result, see [26, Theorem IX.2.1].

Proposition B.5. Let $B$ be a bounded operator in a Hilbert space $\mathcal{H}$. Then $A$ is the generator of a $C_0$-semigroup if and only if $A + B$ is also one.

Appendix C. The Mourre method

C.1. The $C^1$ class

Given a self-adjoint operator $A$, the so-called $C^1(A)$ class of regularity is a key notion within Mourre's theory, see [1] and [14]. It guarantees some properties of domains and that the commutator of an operator $H$ with $A$ is $H$-bounded. In this paper, we have to deal with maximal symmetric conjugate operators and thus have to extend the standard class exposed in detail in [1, Section 6.2]. As some refinements appear, we present an overview of the properties and refer to [15, Section 2] for proofs.

Within this section, we consider a closed densely defined operator $A$ acting in a Hilbert space $\mathcal{H}$. Note that this implies that $\mathcal{D}(A^*)$ is dense in $\mathcal{H}$. We first define the class of bounded operators belonging to $C^1(A)$. Let $S \in \mathcal{B}(\mathcal{H})$. We denote by $[S, A]$ the sesquilinear form defined on $\mathcal{D}(A^*) \times \mathcal{D}(A)$ by

\[
\langle u, [S, A] v \rangle := \langle A^* u, S v \rangle - \langle S^* u, A v \rangle, \quad \text{for } u \in \mathcal{D}(A^*),\ v \in \mathcal{D}(A).
\]

Definition C.1. An operator $S \in \mathcal{B}(\mathcal{H})$ belongs to $C^1(A)$ if the sesquilinear form $[S, A]$ is continuous for the topology of $\mathcal{H} \times \mathcal{H}$. We denote by $[S, A]^\circ$ the unique bounded operator in $\mathcal{H}$ extending this form.

We now extend the definition to unbounded operators by asking the resolvent $R(z) := (S - z)^{-1}$ to be $C^1(A)$. Let us make the statement precise. We first recall that, given $S$ a closed densely defined operator on $\mathcal{H}$, the $A$-regular resolvent set of $S$ is the set $\rho(S, A)$ of all $z \in \mathbb{C} \setminus \sigma(S)$ such that $R(z)$ is of class $C^1(A)$.

Definition C.2. Let $S$ be a closed and densely defined operator on $\mathcal{H}$. We say that $S$ is of class $C^1(A)$ if there are a constant $C$ and a sequence of complex numbers $z_\nu \in \rho(S, A)$ such that $|z_\nu| \to \infty$ and $\| R(z_\nu) \| \leq C |z_\nu|^{-1}$. If $S$ is of class $C^1(A)$ and $\rho(S, A) = \mathbb{C} \setminus \sigma(S)$ then we say that $S$ is of full class $C^1(A)$.

In many cases these two definitions coincide. Indeed, given $S \in C^1(A)$, one shows that if $A$ is regular or if $S$ is self-adjoint with a spectral gap then $S$ is in the full class $C^1(A)$. We recall that a closed densely defined operator $B$ is regular if there are a constant $C$ and $\alpha_n \in \mathbb{C} \setminus \sigma(B)$ such that $\| (B - \alpha_n)^{-1} \| \leq C |\alpha_n|^{-1}$ and such that $|\alpha_n| \to \infty$. The generators of $C_0$-semigroups are regular, for instance.

Definition C.3. Let $A$ and $S$ be two closed and densely defined operators in $\mathcal{H}$. We define $[A, S]$ as the sesquilinear form acting on $(\mathcal{D}(A^*) \cap \mathcal{D}(S^*)) \times (\mathcal{D}(A) \cap \mathcal{D}(S))$ and given by
\[
\langle u, [A, S] v \rangle := \langle A^* u, S v \rangle - \langle S^* u, A v \rangle.
\]

Proposition C.4. Let $S \in C^1(A)$. Then $\mathcal{D}(A^*) \cap \mathcal{D}(S^*)$ and $\mathcal{D}(A) \cap \mathcal{D}(S)$ are cores for $S$ and $S^*$ respectively and the form $[A, S]$ has a unique extension to a continuous sesquilinear form denoted by $[A, S]^\circ$ on $\mathcal{D}(S^*) \times \mathcal{D}(S)$. Moreover,
\[
[A, R(z)]^\circ = -R(z) [A, S]^\circ R(z), \quad \text{for all } z \in \rho(S, A),
\]
where on the right-hand side, $[A, S]^\circ$ is considered as an element of $\mathcal{B}(\mathcal{D}(S), \mathcal{D}(S^*)^*)$.

We stress that the fact that $[A, S]$ extends to an element of $\mathcal{B}(\mathcal{D}(S), \mathcal{D}(S^*)^*)$ is not enough to ensure $S \in C^1(A)$, see [14]. Some compatibility conditions have to be added, see [15, Proposition 2.21]. This can also be bypassed by knowing some invariance under a $C_0$-semigroup generated by $A$.

Definition C.5. Let $\{W_{1,t}\}_{t \in \mathbb{R}^+}$, $\{W_{2,t}\}_{t \in \mathbb{R}^+}$ be two $C_0$-semigroups on the Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ with generators $A_1$ and $A_2$. We say that $S \in \mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)$ is of class $C^1(A_1, A_2)$ if:
\[
\| W_{2,t} S - S W_{1,t} \|_{\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)} \leq c\, t, \quad 0 \leq t \leq 1.
\]
If $\mathcal{G} \subset \mathcal{H}$ are two Hilbert spaces continuously and densely embedded and if a $C_0$-semigroup $\{W_t\}_{t \in \mathbb{R}^+}$, with generator $A$ on $\mathcal{H}$, b-stabilizes $\mathcal{G}$ and $\mathcal{G}^*$, we denote the class $C^1(A_{\mathcal{G}}, A_{\mathcal{G}^*})$ by $C^1(A; \mathcal{G}, \mathcal{G}^*)$. We have the following result.

Proposition C.6. $S \in C^1(A_1, A_2)$ if and only if the sesquilinear form ${}_2[S, A]_1$ on $\mathcal{D}(A_2^*) \times \mathcal{D}(A_1)$ defined by $\langle u_2, {}_2[S, A]_1 u_1 \rangle := \langle S^* u_2, A_1 u_1 \rangle - \langle A_2^* u_2, S u_1 \rangle$ is bounded for the topology of $\mathcal{H}_2 \times \mathcal{H}_1$. Let ${}_2[S, A]_1^\circ$ be the closure of this form in $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)$. We have:
\[
{}_2[S, A]_1^\circ = \operatorname*{s-lim}_{t \to 0^+} (it)^{-1} (S W_{1,t} - W_{2,t} S).
\]
Note that for $S \in \mathcal{B}(\mathcal{H})$, with $\mathcal{H}_i = \mathcal{H}$ and $W_{i,t} = W_t$, one has $S \in C^1(A_1, A_2)$ if and only if $S \in C^1(A)$.

C.2. Regularity assumptions for the limiting absorption principle

In this part, we recall a set of assumptions presented in [15] so as to ensure a limiting absorption principle, see Theorem C.8. Consider $H$ a self-adjoint operator, $H'$ symmetric, closed and densely defined, and $A$ closed and densely defined. These operators are linked by $H' = [H, iA]$ in a sense defined below. Denote also $\mathcal{D} := \mathcal{D}(H) \cap \mathcal{D}(H')$ endowed with the intersection topology, namely the topology associated to the norm $\| \cdot \| + \| H \cdot \| + \| H' \cdot \|$. We start with some assumptions on $H$ and on $H'$.

(M1) $H$ is of full class $C^1(H')$, $\mathcal{D} = \mathcal{D}(H) \cap \mathcal{D}(H'^*)$ and this is a core for $H'$.
(M2) There are $I \subset \mathbb{R}$ open and bounded and $a, b > 0$ such that
\[
H' \geq a 1_I(H) - b 1_{I^c}(H) \langle H \rangle \tag{C.1}
\]
holds true in the sense of forms on $\mathcal{D}$.


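The last one is the strict Mourre estimate. In order to check the first hypothesis, we rely on [15, Lemma 2.26], see also [31, Lemma 2.6]:

Lemma C.7. Let $H$, $M$ be self-adjoint operators such that $H \in C^1(M)$ and that $\mathcal{D}(H) \cap \mathcal{D}(M)$ is a core of $M$. Let $R$ be a symmetric operator such that $\mathcal{D}(R) \supset \mathcal{D}(H)$. Let $H'$ be the closure of $M + R$ defined on $\mathcal{D}(R) \cap \mathcal{D}(M)$. Then $H$ is of full class $C^1(H')$, $\mathcal{D}(H) \cap \mathcal{D}(H')$ is a core for $H'$ and $\mathcal{D}(H) \cap \mathcal{D}(H') = \mathcal{D}(H) \cap \mathcal{D}(H'^*) = \mathcal{D}(H) \cap \mathcal{D}(M)$.

Assuming (M2), one chooses $c > 0$ such that $H' + c \langle H \rangle \geq \langle H \rangle$ (take for instance $c = b + 1$). Since $H' + c \langle H \rangle$ is symmetric and positive, it possesses a Friedrichs extension $G \geq \langle H \rangle$. We name the form domain of $G$:
\[
\mathcal{G} := \mathcal{D}(G^{1/2}), \quad \text{endowed with the graph norm } \| \cdot \|_{\mathcal{G}}. \tag{C.2}
\]
Note that $\mathcal{G}$ is also obtained by completing the space $\mathcal{D}$ with respect to the norm $\| u \|_{\mathcal{G}} = \sqrt{\langle u, (H' + c \langle H \rangle) u \rangle}$. We identify these spaces in Lemma 4.3. We now recall the dual norm $\| \cdot \|_{\mathcal{G}^*}$ of $\mathcal{G}$. Given $u \in \mathcal{H}$, we set
\[
\| u \|_{\mathcal{G}^*} := \sup_{v \in \mathcal{D},\ \| v \|_{\mathcal{G}} \leq 1} |\langle u, v \rangle| = \| G^{-1/2} u \|. \tag{C.3}
\]
Using the Riesz isomorphism, we identify $\mathcal{H}$ with $\mathcal{H}^*$, the space of anti-linear forms on $\mathcal{H}$. The space $\mathcal{G}^*$ is given by the completion of $\mathcal{H}$ with respect to the norm $\| \cdot \|_{\mathcal{G}^*}$. We get the following scale of spaces:
\[
\mathcal{D} \subset \mathcal{G} \subset \mathcal{H} \simeq \mathcal{H}^* \subset \mathcal{G}^* \subset \mathcal{D}^*,
\]
with dense and continuous embeddings. We turn to the assumptions concerning the conjugate operator $A$ and higher commutators. Suppose $A$ to be the generator of $\{W_t\}_{t \in \mathbb{R}^+}$.

(M3) The $C_0$-semigroup $\{W_t\}_{t \in \mathbb{R}^+}$ consists of isometries and b-stabilizes $\mathcal{G}$ and $\mathcal{G}^*$,
(M4) $H \in C^1(A; \mathcal{G}, \mathcal{G}^*)$,
(M5) $H \in C^{1,1}(A; \mathcal{G}, \mathcal{G}^*)$.

The hypothesis (M4) implies that
\[
\lim_{t \to 0^+} \frac{1}{t} \big( \langle H u, W_t u \rangle - \langle u, W_t H u \rangle \big) = \langle u, H' u \rangle, \quad \text{for all } u \in \mathcal{D}.
\]
The hypothesis (M5) means that $H \in \mathcal{B}(\mathcal{G}, \mathcal{G}^*)$ and that
\[
\int_0^1 \big\| [W_t, [W_t, H]] \big\|_{\mathcal{B}(\mathcal{G}, \mathcal{G}^*)} \frac{dt}{t^2} < \infty.
\]
This is equivalent to the fact that $H$ belongs to $(C^2(A; \mathcal{G}, \mathcal{G}^*), \mathcal{B}(\mathcal{G}, \mathcal{G}^*))_{1/2,1}$. We refer to [1,32] for real interpolation.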

One may also consider the stronger $H' \in C^1(A; \mathcal{G}, \mathcal{G}^*)$, i.e.

(M5′) $H \in C^2(A; \mathcal{G}, \mathcal{G}^*)$.

We now give the result. Let $A_{\mathcal{G}^*}$ be the generator of $\{W_t\}_{t \in \mathbb{R}^+}$ in $\mathcal{G}^*$. For $s \in (0, 1)$, we set:
\[
\mathcal{G}_s^* := \mathcal{D}(|A_{\mathcal{G}^*}|^s) \quad \text{and} \quad \mathcal{G}_{-s} := (\mathcal{G}_s^*)^*. \tag{C.4}
\]
Here, the absolute value is taken with respect to the Hilbert structure of $\mathcal{G}^*$. Given an interval $J$, we define $J_0^\pm := \{ \lambda \pm i\mu \mid \lambda \in J \text{ and } \mu > 0 \}$. Finally, set $R(z) := (H - z)^{-1}$. From [15], we obtain:

Theorem C.8. Assume that (M1)–(M5) hold true. Let $J$ be a compact interval included in $I$. Then if $z \in J_0^\pm$, $R(z)$ induces a bounded operator in $\mathcal{B}(\mathcal{G}_s^*, \mathcal{G}_{-s})$, for all $s \in (1/2, 1]$. Moreover the limit $R(\lambda \pm i0) = \lim_{\mu \to \pm 0} R(\lambda + i\mu)$ exists in the norm topology of $\mathcal{B}(\mathcal{G}_s^*, \mathcal{G}_{-s})$, locally uniformly in $\lambda \in J$, and the maps $\lambda \mapsto R(\lambda \pm i0) \in \mathcal{B}(\mathcal{G}_s^*, \mathcal{G}_{-s})$ are Hölder continuous of order $s - 1/2$.

This theorem can be improved by considering weights in some Besov spaces related to the conjugate operator. We refer to [15] for more details. Note that the theory exposed in [15] is formulated with the hypothesis (M5′) but, as mentioned in [15] and proceeding as in [1] for instance, the hypothesis (M5) is enough to apply the theory.

References

[1] W.O. Amrein, A. Boutet de Monvel, V. Georgescu, C0-groups, Commutator Methods and Spectral Theory of N-body Hamiltonians, Birkhäuser, 1996.
[2] A. Arai, M. Hirokawa, On the existence and uniqueness of ground states of a generalized spin-boson model, J. Funct. Anal. 151 (1997) 455–503.
[3] V. Bach, J. Fröhlich, I.M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (2) (1998) 299–395.
[4] V. Bach, J. Fröhlich, I.M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207 (2) (1999) 249–290.
[5] V. Bach, J. Fröhlich, I.M. Sigal, A. Soffer, Positive commutators and spectrum of Pauli–Fierz hamiltonian of atoms and molecules, Comm. Math. Phys. 207 (3) (1999) 557–587.
[6] O. Bratteli, D. Robinson, Operator Algebras and Quantum Statistical Mechanics, vol. 1, Springer-Verlag, 1979.
[7] J.C. Baez, I.E. Segal, Z. Zhou, Introduction to Algebraic and Constructive Quantum Field Theory, Princeton University Press, Princeton, NJ, 1991.
[8] J. Dereziński, Van Hove Hamiltonians – Exactly solvable models of the infrared and ultraviolet problem, Ann. Henri Poincaré 4 (4) (2003) 713–738.
[9] J. Dereziński, V. Jakšić, Spectral theory of Pauli–Fierz operators, J. Funct. Anal. 180 (2) (2001) 243–327.
[10] J. Dereziński, V. Jakšić, Return to equilibrium for Pauli–Fierz systems, Ann. Henri Poincaré 4 (4) (2003) 739–793.
[11] J. Faupin, J.S. Møller, E. Skibsted, Fermi golden rule for a class of Pauli–Fierz models. An abstract approach, in preparation.
[12] J. Fröhlich, M. Griesemer, I.M. Sigal, Mourre estimate and spectral theory for the standard model of non-relativistic QED, preprint mp_arc06-316.
[13] J. Fröhlich, A. Pizzo, On the absence of excited eigenstates of atoms in QED, preprint mp_arc07-97.
[14] V. Georgescu, C. Gérard, On the virial theorem in quantum mechanics, Comm. Math. Phys. 208 (2) (1999) 275–281.
[15] V. Georgescu, C. Gérard, J.S. Møller, Commutators, C0-semigroups and resolvent estimates, J. Funct. Anal. 216 (2) (2004) 303–361.
[16] V. Georgescu, C. Gérard, J.S. Møller, Spectral theory of massless Pauli–Fierz models, Comm. Math. Phys. 249 (1) (2004) 29–78.
[17] C. Gérard, On the existence of ground states for massless Pauli–Fierz Hamiltonians, Ann. Henri Poincaré 1 (3) (2000) 443–459.
[18] C. Gérard, On the scattering theory of massless Nelson models, Rev. Math. Phys. 14 (11) (2002) 1165–1280.
[19] C. Gérard, A proof of the abstract limiting absorption principle by energy estimates, J. Funct. Anal. 254 (11) (2008) 2707–2724.
[20] J. Glimm, A. Jaffe, Quantum Physics. A Functional Integral Point of View, vol. XX, Springer-Verlag, New York–Heidelberg–Berlin, 1981.
[21] S. Golénia, T. Jecko, A new look at Mourre's commutator theory, Complex Anal. Oper. Theory 1 (3) (2007) 399–422.
[22] D. Hasler, I. Herbst, M. Huber, On the lifetime of quasi-stationary states in non-relativistic QED, Ann. Henri Poincaré, in press.
[23] E. Hille, R.S. Phillips, Functional Analysis and Semi-Groups, Amer. Math. Soc., Providence, RI, 1957.
[24] M. Hübner, H. Spohn, Spectral properties of the spin-boson Hamiltonian, Ann. Inst. H. Poincaré Phys. Théor. 62 (3) (1995) 289–323.
[25] V. Jakšić, C.A. Pillet, On a model for quantum friction. I, Fermi's golden rule and dynamics at zero temperature, Ann. Inst. H. Poincaré Phys. Théor. 62 (1) (1995) 47–68.
[26] T. Kato, Perturbation Theory for Linear Operators, Classics Math., vol. XXI, Springer-Verlag, Berlin, 1995.
[27] M. Merkli, Positive commutators in non-equilibrium statistical mechanics, Comm. Math. Phys. 223 (2001) 327–362.
[28] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 78 (1981) 391–408.
[29] M. Reed, B. Simon, Methods of Modern Mathematical Physics, Tome I–IV, Academic Press, 1979.
[30] J. Sahbani, The conjugate operator method for locally regular Hamiltonians, J. Oper. Theory 38 (2) (1997) 297–322.
[31] E. Skibsted, Spectral analysis of N-body systems coupled to a bosonic field, Rev. Math. Phys. 10 (7) (1998) 989–1026.
[32] H. Triebel, Interpolation Theory. Function Spaces. Differential Operators, Deutscher Verlag der Wissenschaften, 1978.


The intrinsic hypoelliptic Laplacian and its heat kernel on unimodular Lie groups

Andrei Agrachev a, Ugo Boscain b, Jean-Paul Gauthier c, Francesco Rossi a,∗

a SISSA, via Beirut 2-4, 34014 Trieste, Italy
b CNRS CMAP, École Polytechnique, Route de Saclay, 91128 Palaiseau Cedex, France
c Laboratoire LSIS, Université de Toulon, France
∗ Corresponding author.

Received 8 July 2008; accepted 8 January 2009

Communicated by J.-M. Coron

doi:10.1016/j.jfa.2009.01.006

Abstract

We present an invariant definition of the hypoelliptic Laplacian on sub-Riemannian structures with constant growth vector, using Popp's volume form introduced by Montgomery. This definition generalizes that of the Laplace–Beltrami operator in Riemannian geometry. In the case of left-invariant problems on unimodular Lie groups we prove that it coincides with the usual sum of squares. We then extend a method (first used by Hulanicki on the Heisenberg group) to compute explicitly the kernel of the hypoelliptic heat equation on any unimodular Lie group of type I. The main tool is the noncommutative Fourier transform. We then study some relevant cases: SU(2), SO(3), SL(2) (with the metrics inherited from the Killing form), and the group SE(2) of rototranslations of the plane.

© 2009 Elsevier Inc. All rights reserved.

Keywords: Hypoelliptic Laplacian; Generalized Fourier transform; Heat equation

1. Introduction

The study of the properties of the heat kernel in a sub-Riemannian manifold has drawn increasing attention since the pioneering work of Hörmander [28].

A. Agrachev et al. / Journal of Functional Analysis 256 (2009) 2621–2655

Since then, many estimates and properties of the kernel in terms of the sub-Riemannian distance have been provided (see [7,8,12,13,20,30,35,42,47] and references therein). In most cases the hypoelliptic Laplacian appearing in the heat equation is the sum of squares of the vector fields forming an orthonormal frame for the sub-Riemannian structure. In other cases it is built as the divergence of the horizontal gradient, where the divergence is defined using any $C^\infty$ volume form on the manifold (see for instance [46]).

The Laplacians obtained in these ways are not intrinsic, in the sense that they do not depend only on the sub-Riemannian distance. Indeed, when the Laplacian is built as the sum of squares, it depends on the choice of the orthonormal frame, while when it is defined as the divergence of the horizontal gradient, it depends on the choice of the volume form.

The first question we address in this paper is the definition of an invariant hypoelliptic Laplacian. As far as we know, this question was raised for the first time in a paper by Brockett [11]. Many details can be found in Montgomery's book [38].

To define the intrinsic hypoelliptic Laplacian, we proceed as in Riemannian geometry. In Riemannian geometry the invariant Laplacian (called the Laplace–Beltrami operator) is defined as the divergence of the gradient, where the gradient is obtained via the Riemannian metric and the divergence via the Riemannian volume form. In sub-Riemannian geometry, we define the invariant hypoelliptic Laplacian as the divergence of the horizontal gradient. The horizontal gradient of a function is the natural generalization of the gradient in Riemannian geometry and it is a vector field belonging to the distribution. The divergence is computed with respect to the sub-Riemannian volume form, which can be defined for every sub-Riemannian structure with constant growth vector. This definition depends only on the sub-Riemannian structure.

The sub-Riemannian volume form, called Popp's measure, was first introduced in Montgomery's book [38], where its relation with the Hausdorff measure is also discussed. The definition of the sub-Riemannian volume form is simple in the 3D contact case, and a bit more delicate in general.

We then prove that for the wide class of unimodular Lie groups (i.e. the groups where the right- and left-Haar measures coincide) the hypoelliptic Laplacian is the sum of squares for any choice of a left-invariant orthonormal basis. We recall that all compact and all nilpotent Lie groups are unimodular.

In the second part of the paper, we present a method to compute explicitly the kernel of the hypoelliptic heat equation on a wide class of left-invariant sub-Riemannian structures on Lie groups. We then apply this method to the most important 3D Lie groups: SU(2), SO(3), and SL(2) with the metric defined by the Killing form, the Heisenberg group $H_2$, and the group of rototranslations of the plane SE(2). These groups are unimodular, hence the hypoelliptic Laplacian is the sum of squares. The interest in studying SU(2), SO(3) and SL(2) comes from some recent results of the authors. Indeed, in [9] the complete description of the cut and conjugate loci for these groups was obtained. These results, together with those presented in this paper, open new perspectives for the clarification of the relation between the presence of the cut locus and the properties of the heat kernel, in line with the result of Neel and Stroock [39] in Riemannian geometry. Up to now the only case in which both the cut locus and the heat kernel have been known explicitly was the Heisenberg group [21,22,29].¹

¹ The Heisenberg group is in a sense a very degenerate example. For instance, in this case the cut locus coincides globally with the conjugate locus (set of points where geodesics lose local optimality) and many properties that one expects to be distinct for more generic situations cannot be distinguished. The application of our method to the Heisenberg group $H_2$ provides in a few lines the Gaveau–Hulanicki formula [21,29].

The interest in the hypoelliptic heat kernel on SE(2) comes from a model of human vision. It was recognized in [15,41] that the visual cortex V1 solves a nonisotropic diffusion problem on the group SE(2) while reconstructing a partially hidden or corrupted image. The study of the cut locus on SE(2) is a work in progress. Preliminary results can be found in [37]. The method is based upon the generalized (noncommutative) Fourier transform (GFT, for short), that permits to disintegrate2 a function from a Lie group G to R on its components on (the class of) non-equivalent unitary irreducible representations of G. This technique permits to transform the hypoelliptic heat equation into an equation in the dual of the group,3 that is particularly simple since the GFT disintegrate the right-regular representations and the hypoelliptic Laplacian is built with left-invariant vector fields (to which a one parameter group of right-translations is associated). Unless we are in the abelian case, the dual of a Lie group in general is not a group. In the compact case it is a so called Tannaka category [25,27] and it is a discrete set. In the nilpotent case it has the structure of Rn for some n. In the general case it can have a quite complicated structure. However, under certain hypotheses (see Section 3), it is a measure space if endowed with the so called Plancherel measure. Roughly speaking, the GFT is an isometry between L2 (G, C) (the set of complex-valued square integrable functions over G, with respect to the Haar measure) and the set of Hilbert–Schmidt operators with respect to the Plancherel measure. The difficulties of applying our method in specific cases rely mostly on two points: (i) Computing the tools for the GFT, i.e. the non-equivalent irreducible representations of the group and the Plancherel measure. 
This is a difficult problem in general; however, for certain classes of Lie groups there are suitable techniques (for instance the Kirillov orbit method for nilpotent Lie groups [33], or methods for semidirect products). For the groups discussed in this paper, the sets of non-equivalent irreducible representations (and hence the GFT) are well known (see for instance [43]).

(ii) Finding the spectrum of an operator (the GFT of the hypoelliptic Laplacian). Depending on the structure of the group and on its dimension, this problem gives rise to a matrix equation, an ODE or a PDE.

Then one can express the kernel of the hypoelliptic heat equation in terms of eigenfunctions of the GFT of the hypoelliptic Laplacian, or in terms of the kernel of the transformed equation. For the cases treated in this paper, see Table 1 (the symbol $\sqcup$ means disjoint union).

Table 1

| Group | Dual of the group | GFT of the hypoelliptic Laplacian | Eigenfunctions of the GFT of the hypoelliptic Laplacian |
|---|---|---|---|
| $H_2$ | $\mathbb{R}$ | $\frac{d^2}{dx^2} - \lambda^2 x^2$ (quantum harmonic oscillator) | Hermite polynomials |
| SU(2) | $\mathbb{N}$ | Linear finite-dimensional operator related to the quantum angular momentum | Complex homogeneous polynomials in two variables |
| SO(3) | $\mathbb{N}$ | Linear finite-dimensional operator related to the orbital quantum angular momentum | Spherical harmonics |
| SL(2) | $\mathbb{R}^+ \sqcup \mathbb{R}^+ \sqcup \mathbb{Z}$ | Continuous: linear operator on analytic functions with domain $\{\lvert x \rvert = 1\} \subset \mathbb{C}$. Discrete: linear operator on analytic functions with domain $\{\lvert x \rvert < 1\} \subset \mathbb{C}$ | Complex monomials |
| SE(2) | $\mathbb{R}^+$ | $\frac{d^2}{d\theta^2} - \lambda^2 \cos^2(\theta)$ (Mathieu's equation) | $2\pi$-periodic Mathieu functions |

The idea of using the GFT to compute the hypoelliptic heat kernel is not new: it was already used on the Heisenberg group in [29] at the same time as the Gaveau formula was published in [21], and on all step 2 nilpotent Lie groups in [5,16]. See also the related work [34].

The structure of the paper is the following: in Section 2 we recall some basic definitions from sub-Riemannian geometry and we construct the sub-Riemannian volume form. We then give the definition of the hypoelliptic Laplacian on a regular sub-Riemannian manifold, and we show that the hypothesis of regularity cannot be dropped in general. For this purpose, we show that the invariant hypoelliptic Laplacian defined on the Martinet sub-Riemannian structure is singular. We then move to left-invariant sub-Riemannian structures on Lie groups and we show that a Lie group is unimodular if and only if the invariant hypoelliptic Laplacian is the sum of squares. We also provide an example of a 3D non-unimodular Lie group for which the invariant hypoelliptic Laplacian is not the sum of squares. The section ends with the proof that the invariant hypoelliptic Laplacian can be expressed as

$$\Delta_{sr} = -\sum_{i=1}^{m} L_{X_i}^{*} L_{X_i},$$

where the formal adjoint $L_{X_i}^{*}$ is built with the sub-Riemannian volume form, providing a connection with existing literature (see e.g. [31]). The invariant hypoelliptic Laplacian is then the sum of squares when the $L_{X_i}$ are skew-adjoint.⁴

In Section 3 we recall basic tools of the GFT and we describe our general method to compute the heat kernel of the hypoelliptic Laplacian on unimodular Lie groups of type I. We provide two useful formulas, one in the case where the GFT of the hypoelliptic Laplacian has discrete spectrum, and the other in the case where the GFT of the hypoelliptic heat equation admits a kernel.

In Section 4 we apply our method to compute the kernel on $H_2$, SU(2), SO(3), SL(2) and SE(2). For the Heisenberg group we use the formula involving the kernel of the transformed equation (the Mehler kernel). For the other groups we use the formula in terms of eigenvalues and eigenvectors of the GFT of the hypoelliptic Laplacian.

The application of our method to higher dimensional sub-Riemannian problems and in particular to the nilpotent Lie groups (2, 3, 4) (the Engel group) and (2, 3, 5) is the subject of a forthcoming paper.

² One could also say decompose (possibly continuously).

³ In this paper, by the dual of the group, we mean the support of the Plancherel measure on the set of non-equivalent unitary irreducible representations of $G$; we thus ignore the singular representations.

⁴ This point of view makes it possible to give an alternative proof of the fact that the invariant hypoelliptic Laplacian on left-invariant structures on unimodular Lie groups is the sum of squares. As a matter of fact, left-invariant vector fields are formally skew-adjoint with respect to the right-Haar measure. On Lie groups the invariant volume form is left-invariant, hence is proportional to the left-Haar measure, and is in turn proportional to the right-Haar measure on unimodular groups.



2. The hypoelliptic Laplacian

In this section we give a definition of the hypoelliptic Laplacian $\Delta_{sr}$ on a regular sub-Riemannian manifold $M$.

2.1. Sub-Riemannian manifolds

We start by recalling the definition of sub-Riemannian manifold.

Definition 1. An $(n, m)$-sub-Riemannian manifold is a triple $(M, \blacktriangle, g)$, where

• $M$ is a connected smooth manifold of dimension $n$;
• $\blacktriangle$ is a smooth distribution of constant rank $m \le n$ satisfying the Hörmander condition, i.e. $\blacktriangle$ is a smooth map that associates to $q \in M$ an $m$-dimensional subspace $\blacktriangle(q)$ of $T_q M$, and for all $q \in M$ we have

$$\operatorname{span}\big\{ [X_1, [\ldots [X_{k-1}, X_k] \ldots]](q) \;\big|\; X_i \in \mathrm{Vec}_H(M) \big\} = T_q M, \tag{1}$$

where $\mathrm{Vec}_H(M)$ denotes the set of horizontal smooth vector fields on $M$, i.e. $\mathrm{Vec}_H(M) = \{ X \in \mathrm{Vec}(M) \mid X(p) \in \blacktriangle(p)\ \forall p \in M \}$;
• $g_q$ is a Riemannian metric on $\blacktriangle(q)$ that is smooth as a function of $q$.

When $M$ is an orientable manifold, we say that the sub-Riemannian manifold is orientable.

Remark 2. Usually sub-Riemannian manifolds are defined with $m < n$. In our definition we decided to include the Riemannian case $m = n$, since all our results hold in that case. Notice that if $m = n$ then condition (1) is automatically satisfied.

A Lipschitz continuous curve $\gamma : [0, T] \to M$ is said to be horizontal if $\dot\gamma(t) \in \blacktriangle(\gamma(t))$ for almost every $t \in [0, T]$. Given a horizontal curve $\gamma : [0, T] \to M$, the length of $\gamma$ is

$$l(\gamma) = \int_0^T \sqrt{g_{\gamma(t)}\big(\dot\gamma(t), \dot\gamma(t)\big)}\, dt. \tag{2}$$

The distance induced by the sub-Riemannian structure on $M$ is the function

$$d(q_0, q_1) = \inf\big\{ l(\gamma) \;\big|\; \gamma(0) = q_0,\ \gamma(T) = q_1,\ \gamma \text{ horizontal} \big\}. \tag{3}$$

The hypothesis of connectedness of $M$ and the Hörmander condition guarantee the finiteness and the continuity of $d(\cdot,\cdot)$ with respect to the topology of $M$ (Chow's Theorem, see for instance [2]). The function $d(\cdot,\cdot)$ is called the Carnot–Carathéodory distance and gives $M$ the structure of a metric space (see [6,23]).



It is a standard fact that $l(\gamma)$ is invariant under reparameterization of the curve $\gamma$. Moreover, if an admissible curve $\gamma$ minimizes the so-called energy functional

$$E(\gamma) = \int_0^T g_{\gamma(t)}\big(\dot\gamma(t), \dot\gamma(t)\big)\, dt$$

with $T$ fixed (and fixed initial and final point), then $v = \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))}$ is constant and $\gamma$ is also a minimizer of $l(\cdot)$. On the other hand, a minimizer $\gamma$ of $l(\cdot)$ such that $v$ is constant is a minimizer of $E(\cdot)$ with $T = l(\gamma)/v$.

A geodesic for the sub-Riemannian manifold is a curve $\gamma : [0, T] \to M$ such that for every sufficiently small interval $[t_1, t_2] \subset [0, T]$, $\gamma|_{[t_1, t_2]}$ is a minimizer of $E(\cdot)$. A geodesic for which $g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))$ is (constantly) equal to one is said to be parameterized by arclength.

Locally, the pair $(\blacktriangle, g)$ can be given by assigning a set of $m$ smooth vector fields spanning $\blacktriangle$ and that are orthonormal for $g$, i.e.

$$\blacktriangle(q) = \operatorname{span}\{X_1(q), \ldots, X_m(q)\}, \qquad g_q\big(X_i(q), X_j(q)\big) = \delta_{ij}. \tag{4}$$

In this case, the set $\{X_1, \ldots, X_m\}$ is called a local orthonormal frame for the sub-Riemannian structure. When $(\blacktriangle, g)$ can be defined as in (4) by $m$ vector fields defined globally, we say that the sub-Riemannian manifold is trivializable.

Given an $(n, m)$-trivializable sub-Riemannian manifold, the problem of finding a curve minimizing the energy between two fixed points $q_0, q_1 \in M$ is naturally formulated as the optimal control problem

$$\dot q(t) = \sum_{i=1}^{m} u_i(t) X_i\big(q(t)\big), \qquad u_i(\cdot) \in L^\infty([0, T], \mathbb{R}), \qquad \int_0^T \sum_{i=1}^{m} u_i^2(t)\, dt \to \min, \qquad q(0) = q_0, \quad q(T) = q_1. \tag{5}$$
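As a rough numerical illustration of the control system (5), the sketch below integrates $\dot q = u_1 X_1 + u_2 X_2$ with explicit Euler steps and accumulates the length (2) and the energy of the resulting horizontal curve. It assumes the Martinet frame $X_1 = \partial_x + \frac{y^2}{2}\partial_z$, $X_2 = \partial_y$ of Remark 14; the function names, the step count and the two control laws are illustrative choices, not part of the original text.

```python
import math

def martinet_frame(q):
    """Orthonormal frame X1 = d/dx + (y^2/2) d/dz, X2 = d/dy at q = (x, y, z)."""
    _, y, _ = q
    return [(1.0, 0.0, 0.5 * y * y), (0.0, 1.0, 0.0)]

def integrate(frame, controls, q0, T, steps=2000):
    """Explicit Euler scheme for q' = sum_i u_i(t) X_i(q).

    Returns the end point, the length  int sqrt(sum u_i^2) dt
    and the energy  int sum u_i^2 dt  (left Riemann sums).
    """
    dt = T / steps
    q = list(q0)
    length = energy = 0.0
    for k in range(steps):
        u = controls(k * dt)
        fields = frame(q)
        speed2 = sum(ui * ui for ui in u)   # g(q', q') in an orthonormal frame
        length += math.sqrt(speed2) * dt
        energy += speed2 * dt
        q = [q[j] + dt * sum(ui * X[j] for ui, X in zip(u, fields))
             for j in range(len(q))]
    return q, length, energy

T = 2.0
# Unit-speed controls: the curve is parameterized by arclength.
end_unit, len_unit, en_unit = integrate(
    martinet_frame, lambda t: (math.cos(t), math.sin(t)), (0.0, 0.0, 0.0), T)
# Controls with varying speed (hypothetical example).
end_var, len_var, en_var = integrate(
    martinet_frame, lambda t: (1.0 + t, 0.5), (0.0, 0.0, 0.0), T)
```

For unit-speed controls the computed length and energy both equal $T$, so $l(\gamma)^2 = T\,E(\gamma)$; for controls of varying speed the Cauchy–Schwarz inequality gives $l(\gamma)^2 < T\,E(\gamma)$, which is why energy minimizers have constant speed.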

It is a standard fact that this optimal control problem is equivalent to the minimum time problem with controls $u_1, \ldots, u_m$ satisfying $u_1(t)^2 + \cdots + u_m(t)^2 \le 1$ in $[0, T]$. When the manifold is analytic and the orthonormal frame can be assigned through $m$ analytic vector fields, we say that the sub-Riemannian manifold is analytic.

We end this section with the definition of the small flag of the distribution $\blacktriangle$:

Definition 3. Let $\blacktriangle$ be a distribution and define through the recursive formula

$$\blacktriangle^1 := \blacktriangle, \qquad \blacktriangle^{n+1} := \blacktriangle^n + [\blacktriangle^n, \blacktriangle],$$

where $\blacktriangle^{n+1}(q_0) := \blacktriangle^n(q_0) + [\blacktriangle^n(q_0), \blacktriangle(q_0)] = \{ X_1(q_0) + [X_2, X_3](q_0) \mid X_1(q), X_2(q) \in \blacktriangle^n(q),\ X_3(q) \in \blacktriangle(q)\ \forall q \in M \}$. The small flag of $\blacktriangle$ is the sequence $\blacktriangle^1 \subset \blacktriangle^2 \subset \cdots \subset \blacktriangle^n \subset \cdots$. A sub-Riemannian manifold is said to be regular if for each $n = 1, 2, \ldots$ the dimension of $\blacktriangle^n(q_0) = \{ f(q_0) \mid f(q) \in \blacktriangle^n(q)\ \forall q \in M \}$ does not depend on the point $q_0 \in M$.
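The regularity condition can be probed numerically. The following sketch, written under the assumption that iterated brackets of the frame fields span the small flag at each point, approximates Lie brackets by central differences and computes the dimensions of the first three levels of the flag for the Martinet distribution of Remark 14 ($X_1 = \partial_x + \frac{y^2}{2}\partial_z$, $X_2 = \partial_y$). All helper names and tolerances are illustrative.

```python
def bracket(X, Y, q, h=1e-4):
    """Lie bracket [X, Y](q) = DY(q) X(q) - DX(q) Y(q), by central differences."""
    n = len(q)

    def directional(F, v):
        plus = F([q[i] + h * v[i] for i in range(n)])
        minus = F([q[i] - h * v[i] for i in range(n)])
        return [(plus[i] - minus[i]) / (2 * h) for i in range(n)]

    dY_X = directional(Y, X(q))
    dX_Y = directional(X, Y(q))
    return [dY_X[i] - dX_Y[i] for i in range(n)]

def rank(vectors, tol=1e-6):
    """Rank of a list of vectors by Gaussian elimination with partial pivoting."""
    rows = [list(v) for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[pivot][c]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Martinet frame on R^3 with coordinates q = (x, y, z).
X1 = lambda q: [1.0, 0.0, 0.5 * q[1] ** 2]
X2 = lambda q: [0.0, 1.0, 0.0]
X12 = lambda q: bracket(X1, X2, q)

def flag_dimensions(q):
    """Dimensions of the first three levels of the small flag at q."""
    level1 = [X1(q), X2(q)]
    level2 = level1 + [X12(q)]
    level3 = level2 + [bracket(X1, X12, q), bracket(X2, X12, q)]
    return rank(level1), rank(level2), rank(level3)

generic = flag_dimensions([0.3, 1.0, -0.2])    # a point with y != 0
singular = flag_dimensions([0.3, 0.0, -0.2])   # a point on the plane y = 0
```

Away from the plane $\{y = 0\}$ the second level already fills the tangent space, whereas on $\{y = 0\}$ a further bracket is needed: the dimension of the second level depends on the point, so the Martinet structure is not regular.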



A 3D sub-Riemannian manifold is said to be a 3D contact manifold if $\blacktriangle$ has dimension 2 and $\blacktriangle^2(q_0) = T_{q_0} M$ for any point $q_0 \in M$. In this paper we always deal with regular sub-Riemannian manifolds.

2.1.1. Left-invariant sub-Riemannian manifolds

In this section we present a natural sub-Riemannian structure that can be defined on Lie groups. Throughout the paper, we use the notation for Lie groups of matrices. For general Lie groups, by $gv$ with $g \in G$ and $v \in \mathbf{L}$, we mean $(L_g)_*(v)$ where $L_g$ is the left-translation of the group.

Definition 4. Let $G$ be a Lie group with Lie algebra $\mathbf{L}$ and $\mathbf{p} \subseteq \mathbf{L}$ a subspace of $\mathbf{L}$ satisfying the Lie bracket generating condition

$$\operatorname{Lie} \mathbf{p} := \operatorname{span}\big\{ [p_1, [p_2, \ldots, [p_{n-1}, p_n]]] \;\big|\; p_i \in \mathbf{p} \big\} = \mathbf{L}.$$

Endow $\mathbf{p}$ with a positive definite quadratic form $\langle \cdot, \cdot \rangle$. Define a sub-Riemannian structure on $G$ as follows:

• the distribution is the left-invariant distribution $\blacktriangle(g) := g\mathbf{p}$;
• the quadratic form $g$ on $\blacktriangle$ is given by $g_g(v_1, v_2) := \langle g^{-1} v_1, g^{-1} v_2 \rangle$.

In this case we say that $(G, \blacktriangle, g)$ is a left-invariant sub-Riemannian manifold.

Remark 5. Observe that all left-invariant manifolds $(G, \blacktriangle, g)$ are regular.

In the following we define a left-invariant sub-Riemannian manifold by choosing a set of $m$ vectors $\{p_1, \ldots, p_m\}$ forming an orthonormal basis for the subspace $\mathbf{p} \subseteq \mathbf{L}$ with respect to the metric defined in Definition 4, i.e. $\mathbf{p} = \operatorname{span}\{p_1, \ldots, p_m\}$ and $\langle p_i, p_j \rangle = \delta_{ij}$. We thus have $\blacktriangle(g) = g\mathbf{p} = \operatorname{span}\{g p_1, \ldots, g p_m\}$ and $g_g(g p_i, g p_j) = \delta_{ij}$. Hence every left-invariant sub-Riemannian manifold is trivializable.

The problem of finding the minimal energy between the identity and a point $g_1 \in G$ in fixed time $T$ becomes the left-invariant optimal control problem

$$\dot g(t) = g(t) \Big( \sum_i u_i(t) p_i \Big), \qquad u_i(\cdot) \in L^\infty([0, T], \mathbb{R}), \qquad \int_0^T \sum_i u_i^2(t)\, dt \to \min, \qquad g(0) = \mathrm{Id}, \quad g(T) = g_1. \tag{6}$$

Remark 6. This problem admits a solution, see for instance Chapter 5 of [10].

2.2. Definition of the hypoelliptic Laplacian on a sub-Riemannian manifold

In this section we define the intrinsic hypoelliptic Laplacian on a regular orientable sub-Riemannian manifold $(M, \blacktriangle, g)$. This definition generalizes that of the Laplace–Beltrami operator on an orientable Riemannian manifold, that is $\Delta\phi := \operatorname{div} \operatorname{grad} \phi$, where grad is the unique operator from $C^\infty(M)$ to $\mathrm{Vec}(M)$ satisfying $g_q(\operatorname{grad}\phi(q), v) = d\phi_q(v)$ $\forall q \in M$, $v \in T_q M$, and the divergence of a vector field $X$ is the unique function satisfying $\operatorname{div} X\, \mu = L_X \mu$ where $\mu$ is the Riemannian volume form.

We first define the sub-Riemannian gradient of a function, that is a horizontal vector field.

Definition 7. Let $(M, \blacktriangle, g)$ be a sub-Riemannian manifold: the horizontal gradient is the unique operator $\operatorname{grad}_{sr}$ from $C^\infty(M)$ to $\mathrm{Vec}_H(M)$ satisfying $g_q(\operatorname{grad}_{sr}\phi(q), v) = d\phi_q(v)$ $\forall q \in M$, $v \in \blacktriangle(q)$.

One can easily check that if $\{X_1, \ldots, X_m\}$ is a local orthonormal frame for $(M, \blacktriangle, g)$, then $\operatorname{grad}_{sr}\phi = \sum_{i=1}^{m} (L_{X_i}\phi) X_i$.

The question of defining a sub-Riemannian volume form is more delicate. We start by considering the 3D contact case.

Proposition 8. Let $(M, \blacktriangle, g)$ be an orientable 3D contact sub-Riemannian structure and $\{X_1, X_2\}$ a local orthonormal frame. Let $X_3 = [X_1, X_2]$ and $dX_1, dX_2, dX_3$ the dual basis, i.e. $dX_i(X_j) = \delta_{ij}$. Then $\mu_{sr} := dX_1 \wedge dX_2 \wedge dX_3$ is an intrinsic volume form, i.e. it is invariant under an orientation preserving change of orthonormal frame.

Proof. Consider two different orthonormal frames with the same orientation $\{X_1, X_2\}$ and $\{Y_1, Y_2\}$. We have to prove that $dX_1 \wedge dX_2 \wedge dX_3 = dY_1 \wedge dY_2 \wedge dY_3$ with $X_3 = [X_1, X_2]$, $Y_3 = [Y_1, Y_2]$. We have

$$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} \cos(f(q)) & \sin(f(q)) \\ -\sin(f(q)) & \cos(f(q)) \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix},$$

for some real-valued smooth function $f$. A direct computation shows that

$$Y_3 = X_3 + f_1 X_1 + f_2 X_2 \tag{7}$$

where $f_1$ and $f_2$ are two smooth functions depending on $f$.

We first prove that $dX_1 \wedge dX_2 = dY_1 \wedge dY_2$. Since the change of variables $\{X_1, X_2\} \to \{Y_1, Y_2\}$ is norm-preserving, we have $dX_1 \wedge dX_2(v, w) = dY_1 \wedge dY_2(v, w)$ when $v, w \in \blacktriangle$. Consider now any vector $v = v_1 X_1 + v_2 X_2 + v_3 X_3 = v_1' Y_1 + v_2' Y_2 + v_3' Y_3$: as a consequence of (7), we have $v_3 = v_3'$. Take another vector $w = w_1 X_1 + w_2 X_2 + w_3 X_3 = w_1' Y_1 + w_2' Y_2 + w_3' Y_3$ and compute

$$dX_1 \wedge dX_2(v, w) = dX_1 \wedge dX_2(v - v_3 X_3, w - w_3 X_3) = dY_1 \wedge dY_2(v - v_3 X_3, w - w_3 X_3) = dY_1 \wedge dY_2(v, w),$$

because the vectors $v - v_3 X_3$, $w - w_3 X_3$ are horizontal. Hence the two 2-forms coincide.



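From (7) we also have $dY_3 = dX_3 + f_1\, dX_1 + f_2\, dX_2$ for some smooth functions $f_1, f_2$. Hence we have $dY_1 \wedge dY_2 \wedge dY_3 = dX_1 \wedge dX_2 \wedge dY_3 = dX_1 \wedge dX_2 \wedge (dX_3 + f_1\, dX_1 + f_2\, dX_2) = dX_1 \wedge dX_2 \wedge dX_3$, where the last identity is a consequence of the skew-symmetry of differential forms. $\square$

Remark 9. Indeed, even if in the 3D contact case there is no scalar product in $T_q M$, it is possible to define a natural volume form, since on $\blacktriangle$ the scalar product is defined by $g$ and formula (7) guarantees the existence of a natural scalar product in $(\blacktriangle + [\blacktriangle, \blacktriangle])/\blacktriangle$.

The previous result generalizes to any regular orientable sub-Riemannian structure, as presented below.

2.2.1. Definition of the intrinsic volume form

Let $0 = E_0 \subset E_1 \subset \cdots \subset E_k = E$ be a filtration of an $n$-dimensional vector space $E$. Let $e_1, \ldots, e_n$ be a basis of $E$ such that $E_i = \operatorname{span}\{e_1, \ldots, e_{n_i}\}$. Obviously, the wedge product $e_1 \wedge \cdots \wedge e_n$ depends only on the residue classes $\bar e_j = (e_j + E_{i_j}) \in E_{i_j+1}/E_{i_j}$, where $n_{i_j} < j \le n_{i_j+1}$, $j = 1, \ldots, n$. This property induces a natural (i.e. independent of the choice of the basis) isomorphism of 1-dimensional spaces:

$$\bigwedge^{n} E \cong \bigwedge^{n} \Big( \bigoplus_{i=1}^{k} E_i / E_{i-1} \Big).$$

Now consider the filtration

$$0 \subset \blacktriangle^1(q) \subset \cdots \subset \blacktriangle^k(q) = T_q M, \qquad \dim \blacktriangle^i(q) = n_i.$$

Let $X_1, \ldots, X_i$ be smooth sections of $\blacktriangle = \blacktriangle^1$; then the vector

$$\big( [X_1, [\ldots, X_i] \ldots](q) + \blacktriangle^{i-1}(q) \big) \in \blacktriangle^i(q)/\blacktriangle^{i-1}(q)$$

depends only on $X_1(q) \otimes \cdots \otimes X_i(q)$. We thus obtain a well-defined surjective linear map

$$\beta_i : \blacktriangle(q)^{\otimes i} \to \blacktriangle^i(q)/\blacktriangle^{i-1}(q), \qquad X_1(q) \otimes \cdots \otimes X_i(q) \mapsto \big( [X_1, [\ldots, X_i] \ldots](q) + \blacktriangle^{i-1}(q) \big).$$

The Euclidean structure on $\blacktriangle(q)$ induces a Euclidean structure on $\blacktriangle(q)^{\otimes i}$ by the standard formula:

$$\langle \xi_1 \otimes \cdots \otimes \xi_i, \eta_1 \otimes \cdots \otimes \eta_i \rangle = \langle \xi_1, \eta_1 \rangle \cdots \langle \xi_i, \eta_i \rangle, \qquad \xi_j, \eta_j \in \blacktriangle(q),\ j = 1, \ldots, i.$$

Then the formula

$$|v| = \min\big\{ |\bar\xi| : \bar\xi \in \beta_i^{-1}(v) \big\}, \qquad v \in \blacktriangle^i(q)/\blacktriangle^{i-1}(q),$$

defines a Euclidean norm on $\blacktriangle^i(q)/\blacktriangle^{i-1}(q)$. Let $\nu_i$ be the volume form on $\blacktriangle^i(q)/\blacktriangle^{i-1}(q)$ associated with the Euclidean structure:

$$\langle \nu_i, v_1 \wedge \cdots \wedge v_{m_i} \rangle = \det\nolimits^{\frac{1}{m_i}} \big\{ \langle v_j, v_{j'} \rangle \big\}_{j, j' = 1}^{m_i},$$

where $m_i = n_i - n_{i-1} = \dim(\blacktriangle^i(q)/\blacktriangle^{i-1}(q))$. Finally, the intrinsic volume form $\mu_{sr}$ on $T_q M$ is the image of $\nu_1 \wedge \cdots \wedge \nu_k$ under the natural isomorphism

$$\bigwedge^{n} \Big( \bigoplus_{i=1}^{k} \blacktriangle^i(q)/\blacktriangle^{i-1}(q) \Big)^{*} \cong \bigwedge^{n} (T_q M)^{*}.$$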

Remark 10. To our knowledge, the construction given above was first presented by Brockett [11, pp. 16–17] in the 2-step case, then by Montgomery [38, Section 10.5] in the general case, where the measure $\mu_{sr}$ is called Popp's measure. Montgomery also observed that a sub-Riemannian volume form was the only missing ingredient to get an intrinsic definition of hypoelliptic Laplacian.⁵

⁵ Montgomery did not use Popp's measure to get the intrinsic definition of the hypoelliptic Laplacian since there are, a priori, two natural measures on a regular sub-Riemannian manifold: Popp's measure and the Hausdorff measure (see [36,40]). However, for left-invariant sub-Riemannian manifolds, both of them are proportional to the left-Haar measure. See Remark 16.

Once the volume form is defined, the divergence of a vector field $X$ is defined as in Riemannian geometry, i.e. it is the function $\operatorname{div}_{sr} X$ satisfying $\operatorname{div}_{sr} X\, \mu_{sr} = L_X \mu_{sr}$. We are now ready to define the intrinsic hypoelliptic Laplacian.

Definition 11. Let $(M, \blacktriangle, g)$ be an orientable regular sub-Riemannian manifold. Then the intrinsic hypoelliptic Laplacian is $\Delta_{sr}\phi := \operatorname{div}_{sr} \operatorname{grad}_{sr} \phi$.

Consider now an orientable regular sub-Riemannian structure $(M, \blacktriangle, g)$ and let $\{X_1, \ldots, X_m\}$ be a local orthonormal frame. We want to find an explicit expression for the operator $\Delta_{sr}$. If $n = m$ then $\Delta_{sr}$ is the Laplace–Beltrami operator. Otherwise consider $n - m$ vector fields $X_{m+1}, \ldots, X_n$ such that $\{X_1(q), \ldots, X_m(q), X_{m+1}(q), \ldots, X_n(q)\}$ is a basis of $T_q M$ for all $q$ in a certain open set $U$. The volume form $\mu_{sr}$ is $\mu_{sr} = f(q)\, dX_1 \wedge \cdots \wedge dX_n$, with $dX_i$ the dual basis of $X_1, \ldots, X_n$: then we can find other $n - m$ vector fields, that we still call $X_{m+1}, \ldots, X_n$, for which we have $\mu_{sr} = dX_1 \wedge \cdots \wedge dX_n$. Recall that $\Delta_{sr}\phi$ satisfies $(\Delta_{sr}\phi)\mu_{sr} = L_X \mu_{sr}$ with $X = \operatorname{grad}_{sr}\phi$. We have

$$L_X \mu_{sr} = \sum_{i=1}^{m} (-1)^{i+1} \Big\{ d\langle d\phi, X_i \rangle \wedge dX_1 \wedge \cdots \wedge \widehat{dX_i} \wedge \cdots \wedge dX_n + \langle d\phi, X_i \rangle\, d\big( dX_1 \wedge \cdots \wedge \widehat{dX_i} \wedge \cdots \wedge dX_n \big) \Big\}.$$

Applying standard results of differential calculus, we have $d(\langle d\phi, X_i \rangle) \wedge dX_1 \wedge \cdots \wedge \widehat{dX_i} \wedge \cdots \wedge dX_n = (-1)^{i+1} L_{X_i}^2 \phi\, \mu_{sr}$ and $d(dX_1 \wedge \cdots \wedge \widehat{dX_i} \wedge \cdots \wedge dX_n) = (-1)^{i+1} \operatorname{Tr}(\operatorname{ad} X_i)\, \mu_{sr}$, where the adjoint map is

$$\operatorname{ad} X_i : \mathrm{Vec}(U) \to \mathrm{Vec}(U), \qquad X \mapsto [X_i, X],$$

and by $\operatorname{Tr}(\operatorname{ad} X_i)$ we mean $\sum_{j=1}^{n} dX_j([X_i, X_j])$. Finally, we find the expression

$$\Delta_{sr}\phi = \sum_{i=1}^{m} \Big( L_{X_i}^2 \phi + L_{X_i}\phi\, \operatorname{Tr}(\operatorname{ad} X_i) \Big). \tag{8}$$

Notice that the formula depends on the choice of the vector fields $X_{m+1}, \ldots, X_n$. The hypoellipticity of $\Delta_{sr}$ (i.e. given $U \subset M$ and $\phi : U \to \mathbb{C}$ such that $\Delta_{sr}\phi \in C^\infty$, then $\phi$ is $C^\infty$) follows from the Hörmander theorem (see [28]):

Theorem 12. Let $L$ be a differential operator on a manifold $M$, that locally in a neighborhood $U$ is written as $L = \sum_{i=1}^{m} L_{X_i}^2 + L_{X_0}$, where $X_0, X_1, \ldots, X_m$ are $C^\infty$ vector fields. If $\operatorname{Lie}_q\{X_0, X_1, \ldots, X_m\} = T_q M$ for all $q \in U$, then $L$ is hypoelliptic.

Indeed, $\Delta_{sr}$ is written locally as $\sum_{i=1}^{m} L_{X_i}^2 + L_{X_0}$ with the first-order term $L_{X_0} = \sum_{i=1}^{m} \operatorname{Tr}(\operatorname{ad} X_i) L_{X_i}$. Moreover by hypothesis we have that $\operatorname{Lie}_q\{X_1, \ldots, X_m\} = T_q M$, hence the Hörmander theorem applies.

Remark 13. Notice that in the Riemannian case, i.e. for $m = n$, $\Delta_{sr}$ coincides with the Laplace–Beltrami operator.

Remark 14. The hypothesis that the sub-Riemannian manifold is regular is crucial for the construction of the invariant volume form. For instance for the Martinet metric on $\mathbb{R}^3$, that is the sub-Riemannian structure for which $L_1 = \partial_x + \frac{y^2}{2}\partial_z$ and $L_2 = \partial_y$ form an orthonormal basis, one gets on $\mathbb{R}^3 \setminus \{y = 0\}$

$$\Delta_{sr} = (L_1)^2 + (L_2)^2 - \frac{1}{y} L_2.$$

This is not surprising at all. As a matter of fact, even the Laplace–Beltrami operator is singular in almost-Riemannian geometry (see [3] and references therein). For instance, for the Grushin metric on $\mathbb{R}^2$, that is the singular Riemannian structure for which $L_1 = \partial_x$ and $L_2 = x\,\partial_y$ form an orthonormal frame, one gets on $\mathbb{R}^2 \setminus \{x = 0\}$

$$\Delta_{LB} = (L_1)^2 + (L_2)^2 - \frac{1}{x} L_1.$$
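The singular first-order terms of Remark 14 can be recovered numerically from the definition $\Delta\phi = \operatorname{div}(\operatorname{grad}\phi) = \sum_i \big( L_i^2\phi + \operatorname{div}(L_i)\, L_i\phi \big)$. The sketch below computes the divergence of each frame field with respect to a volume form $\rho\, dx^1 \wedge \cdots \wedge dx^n$ by central differences. The densities $\rho = 1/|y|$ (Martinet) and $\rho = 1/|x|$ (Grushin) are our own short computation from the dual-frame construction and are stated here as assumptions; the function names are illustrative.

```python
def divergence(field, density, q, h=1e-5):
    """Divergence of `field` at q with respect to the volume form
    density(q) * dx^1 ^ ... ^ dx^n, i.e. (1/rho) * sum_i d_i(rho * X^i),
    approximated by central differences."""
    total = 0.0
    for i in range(len(q)):
        plus = list(q)
        minus = list(q)
        plus[i] += h
        minus[i] -= h
        total += (density(plus) * field(plus)[i]
                  - density(minus) * field(minus)[i]) / (2 * h)
    return total / density(q)

# Martinet structure on R^3 \ {y = 0}; density assumed to be 1/|y|.
martinet_L1 = lambda q: [1.0, 0.0, 0.5 * q[1] ** 2]
martinet_L2 = lambda q: [0.0, 1.0, 0.0]
martinet_density = lambda q: 1.0 / abs(q[1])

# Grushin structure on R^2 \ {x = 0}; Riemannian density assumed to be 1/|x|.
grushin_L1 = lambda q: [1.0, 0.0]
grushin_L2 = lambda q: [0.0, q[0]]
grushin_density = lambda q: 1.0 / abs(q[0])

point3 = [0.3, 2.0, -1.0]
point2 = [0.5, 0.1]
div_martinet = (divergence(martinet_L1, martinet_density, point3),
                divergence(martinet_L2, martinet_density, point3))
div_grushin = (divergence(grushin_L1, grushin_density, point2),
               divergence(grushin_L2, grushin_density, point2))
```

At the sample points the computed divergences are approximately $(0, -1/y)$ for the Martinet frame and $(-1/x, 0)$ for the Grushin frame, matching the coefficients $-\frac{1}{y}L_2$ and $-\frac{1}{x}L_1$ in the two formulas above.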



2.3. The hypoelliptic Laplacian on Lie groups

In the case of left-invariant sub-Riemannian manifolds, there is an intrinsic global expression of $\Delta_{sr}$.

Corollary 15. Let $(G, \blacktriangle, g)$ be a left-invariant sub-Riemannian manifold generated by the orthonormal basis $\{p_1, \ldots, p_m\} \subset \mathbf{L}$. Then the hypoelliptic Laplacian is

$$\Delta_{sr}\phi = \sum_{i=1}^{m} \Big( L_{X_i}^2 \phi + L_{X_i}\phi\, \operatorname{Tr}(\operatorname{ad} p_i) \Big) \tag{9}$$

where $L_{X_i}$ is the Lie derivative w.r.t. the field $X_i = g p_i$.

Proof. If $m \le n$, we can find $n - m$ vectors $\{p_{m+1}, \ldots, p_n\}$ such that $\{p_1, \ldots, p_n\}$ is a basis for $\mathbf{L}$. Choose the fields $X_i := g p_i$ and follow the computation given above: we find formula (9). In this case the adjoint map is intrinsically defined and the trace does not depend on the choice of $X_{m+1}, \ldots, X_n$. $\square$

The formula above reduces to the sum of squares in the wide class of unimodular Lie groups. We recall that on a Lie group of dimension $n$, there always exist a left-invariant $n$-form $\mu_L$ and a right-invariant $n$-form $\mu_R$ (called respectively left- and right-Haar measures), that are unique up to a multiplicative constant. These forms have the properties that

$$\int_G f(ag)\, \mu_L(g) = \int_G f(g)\, \mu_L(g), \qquad \int_G f(ga)\, \mu_R(g) = \int_G f(g)\, \mu_R(g), \qquad \text{for every } f \in L^1(G, \mathbb{R}) \text{ and } a \in G,$$

where $L^1$ is intended with respect to the left-Haar measure in the first identity and with respect to the right-Haar measure in the second one. The group is called unimodular if $\mu_L$ and $\mu_R$ are proportional.

Remark 16. Notice that for left-invariant sub-Riemannian manifolds the intrinsic volume form and the Hausdorff measure $\mu_H$ are left-invariant, hence they are proportional to the left-Haar measure $\mu_L$. On unimodular Lie groups one can assume $\mu_{sr} = \mu_L = \mu_R = \alpha\mu_H$, where $\alpha > 0$ is a constant that is unknown even for the simplest among the genuine sub-Riemannian structures, i.e. the Heisenberg group.

Proposition 17. Under the hypotheses of Corollary 15, if $G$ is unimodular then $\Delta_{sr}\phi = \sum_{i=1}^{m} L_{X_i}^2 \phi$.

Proof. Consider the modular function $\Psi$, that is the unique function such that

$$\int_G f(h^{-1} g)\, \mu_R(g) = \Psi(h) \int_G f(g)\, \mu_R(g)$$

for all measurable $f$. It is well known that $\Psi(g) = \det(\operatorname{Ad}_g)$ and that $\Psi(g) \equiv 1$ if and only if $G$ is unimodular.

Consider a curve $\gamma(t)$ such that $\dot\gamma$ exists for $t = t_0$: then $\gamma(t) = g_0 e^{(t - t_0)\eta + o(t - t_0)}$ with $g_0 = \gamma(t_0)$ and for some $\eta \in \mathbf{L}$. We have

$$\frac{d}{dt}\Big|_{t = t_0} \det(\operatorname{Ad}_{\gamma(t)}) = \operatorname{Tr}\Big( (\operatorname{Ad}_{g_0})^{-1} \frac{d}{ds}\Big|_{s = 0} \operatorname{Ad}_{g_0 e^{s\eta + o(s)}} \Big) = \operatorname{Tr}\big( \operatorname{Ad}_{g_0^{-1}} \operatorname{Ad}_{g_0} \operatorname{ad}_\eta \big) = \operatorname{Tr}(\operatorname{ad}_\eta). \tag{10}$$

Now choose the curve $\gamma(t) = g_0 e^{t p_i}$ and observe that $\det(\operatorname{Ad}_{\gamma(t)}) \equiv 1$, then $\operatorname{Tr}(\operatorname{ad} p_i) = 0$. The conclusion follows from (9). $\square$

All the groups treated in this paper (i.e. $H_2$, SU(2), SO(3), SL(2) and SE(2)) are unimodular. Hence the invariant hypoelliptic Laplacian is the sum of squares. A kind of inverse result holds:

Proposition 18. Let $(G, \blacktriangle, g)$ be a left-invariant sub-Riemannian manifold generated by the orthonormal basis $\{p_1, \ldots, p_m\} \subset \mathbf{L}$. If the hypoelliptic Laplacian satisfies $\Delta_{sr}\phi = \sum_{i=1}^{m} L_{X_i}^2 \phi$, then $G$ is unimodular.

Proof. We start by observing that $\Delta_{sr}\phi = \sum_{i=1}^{m} L_{X_i}^2 \phi$ if and only if $\operatorname{Tr}(\operatorname{ad} p_i) = 0$ for all $i = 1, \ldots, m$.

Fix $g \in G$: due to the Lie bracket generating condition, the control system (6) is controllable, so there exists a choice of piecewise constant controls $u_i : [0, T] \to \mathbb{R}$ such that the corresponding solution $\gamma(\cdot)$ is a horizontal curve steering Id to $g$. Then $\dot\gamma$ is defined for all $t \in [0, T]$ except for a finite set $E$ of switching times.

Consider now the modular function along $\gamma$, i.e. $\Psi(\gamma(t))$, that is a continuous function, differentiable for all $t \in [0, T] \setminus E$. We compute its derivative using (10): we have $\frac{d}{dt}\big|_{t = t_0} \det(\operatorname{Ad}_{\gamma(t)}) = \operatorname{Tr}(\operatorname{ad}_\eta)$ with $\eta = \gamma(t_0)^{-1} \dot\gamma(t_0)$. Due to the horizontality of $\gamma$, we have $\eta = \sum_{i=1}^{m} a_i p_i$, hence $\operatorname{Tr}(\operatorname{ad}_\eta) = \sum_{i=1}^{m} a_i \operatorname{Tr}(\operatorname{ad} p_i) = 0$. Then the modular function is piecewise constant along $\gamma$. Recalling that it is continuous, we have that it is constant. Varying along all $g \in G$ and recalling that $\Psi(\mathrm{Id}) = 1$, we have $\Psi \equiv 1$, hence $G$ is unimodular. $\square$

2.3.1. The hypoelliptic Laplacian on a non-unimodular Lie group

In this section we present a non-unimodular Lie group endowed with a left-invariant sub-Riemannian structure. We then compute the explicit expression of the intrinsic hypoelliptic Laplacian: from Proposition 18 we already know that it is the sum of squares plus a first-order term.

Consider the Lie group

$$A^+(\mathbb{R}) \oplus \mathbb{R} := \left\{ \begin{pmatrix} a & 0 & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; a > 0,\ b, c \in \mathbb{R} \right\}.$$

It is the direct sum of the group $A^+(\mathbb{R})$ of affine transformations on the real line $x \mapsto ax + b$ with $a > 0$ and the additive group $(\mathbb{R}, +)$. Indeed, observe that

$$\begin{pmatrix} a & 0 & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ d \\ 1 \end{pmatrix} = \begin{pmatrix} ax + b \\ c + d \\ 1 \end{pmatrix}.$$

The group is non-unimodular; indeed a direct computation gives $\mu_L = \frac{1}{a^2}\, da\, db\, dc$ and $\mu_R = \frac{1}{a}\, da\, db\, dc$. Its Lie algebra $\mathfrak{a}(\mathbb{R}) \oplus \mathbb{R}$ is generated by

$$p_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad p_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

for which the following commutation rules hold: $[p_1, p_2] = k$, $[p_2, k] = 0$, $[k, p_1] = -k$.

We define a trivializable sub-Riemannian structure on $A^+(\mathbb{R}) \oplus \mathbb{R}$ as presented in Section 2.1.1: consider the two left-invariant vector fields $X_i(g) = g p_i$ with $i = 1, 2$ and define

$$\blacktriangle(g) = \operatorname{span}\{X_1(g), X_2(g)\}, \qquad g_g\big(X_i(g), X_j(g)\big) = \delta_{ij}.$$
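The commutation rules of this example, and the traces $\operatorname{Tr}(\operatorname{ad} p_i)$ that enter formula (9), can be checked with a few lines of matrix arithmetic. The sketch below uses plain nested lists; the helper names (`coordinates`, `trace_ad`) are illustrative and the coordinate extraction is specific to the basis $\{p_1, p_2, k\}$.

```python
def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

p1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
p2 = [[0, 0, 1], [0, 0, 1], [0, 0, 0]]
k = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
basis = [p1, p2, k]
zero = [[0, 0, 0] for _ in range(3)]

def coordinates(M):
    """Coefficients (a, b, c) of M = a*p1 + b*p2 + c*k, for M in span{p1, p2, k}."""
    a, b = M[0][0], M[1][2]
    return [a, b, M[0][2] - b]

def trace_ad(p):
    """Tr(ad p): sum over the basis e_j of the e_j-coordinate of [p, e_j]."""
    return sum(coordinates(commutator(p, e))[j] for j, e in enumerate(basis))

minus_k = [[-x for x in row] for row in k]
traces = [trace_ad(p) for p in basis]
```

The traces come out as $\operatorname{Tr}(\operatorname{ad} p_1) = 1$ and $\operatorname{Tr}(\operatorname{ad} p_2) = \operatorname{Tr}(\operatorname{ad} k) = 0$. The non-zero trace of $\operatorname{ad} p_1$ reflects the fact that the group is not unimodular, and it is the only coefficient that contributes a first-order term in (9).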

Using (9), one gets the following expression for the hypoelliptic Laplacian:

$$\Delta_{sr}\phi = L_{X_1}^2 \phi + L_{X_2}^2 \phi + L_{X_1}\phi.$$

2.4. The intrinsic hypoelliptic Laplacian in terms of the formal adjoints of the vector fields

In the literature another common definition of hypoelliptic Laplacian can be found (see for instance [31]):

$$\Delta^{*} = -\sum_{i=1}^{m} L_{X_i}^{*} L_{X_i}, \tag{11}$$

where $\{X_1, \ldots, X_m\}$ is a set of vector fields satisfying the Hörmander condition and the formal adjoint $L_{X_i}^{*}$ is computed with respect to a given volume form. This expression clearly simplifies to the sum of squares when the vector fields are formally skew-adjoint, i.e. $L_{X_i}^{*} = -L_{X_i}$.

In this section we show that our definition of intrinsic hypoelliptic Laplacian coincides locally with (11), when $\{X_1, \ldots, X_m\}$ is an orthonormal frame for the sub-Riemannian manifold and the formal adjoints of the vector fields are computed with respect to the sub-Riemannian volume form. We then show that left-invariant vector fields on a Lie group $G$ are formally skew-adjoint with respect to the right-Haar measure, providing an alternative proof of the fact that for unimodular Lie groups the intrinsic hypoelliptic Laplacian is the sum of squares.

Proposition 19. Locally, the intrinsic hypoelliptic Laplacian $\Delta_{sr}$ can be written as $-\sum_{i=1}^{m} L_{X_i}^{*} L_{X_i}$, where $\{X_1, \ldots, X_m\}$ is a local orthonormal frame, and $L_{X_i}^{*}$ is the formal adjoint of the Lie derivative $L_{X_i}$ of the vector field $X_i$, i.e.

$$\big( \phi_1, L_{X_i}^{*} \phi_2 \big) = \big( \phi_2, L_{X_i} \phi_1 \big), \qquad \text{for every } \phi_1, \phi_2 \in C_c^\infty(M, \mathbb{R}),\ i = 1, \ldots, m, \tag{12}$$

and the scalar product is the one of $L^2(M, \mathbb{R})$ with respect to the invariant volume form, i.e. $(\phi_1, \phi_2) := \int_M \phi_1 \phi_2\, \mu_{sr}$.

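Proof. Given a volume form $\mu$ on $M$, a definition of divergence of a smooth vector field $X$ (equivalent to $L_X \mu = \operatorname{div}(X)\mu$) is

$$\int_M \operatorname{div}(X)\, \phi\, \mu = -\int_M L_X \phi\, \mu, \qquad \text{for every } \phi \in C_c^\infty(M, \mathbb{R});$$

see for instance [45]. We are going to prove that

$$\Delta_{sr}\phi = -\sum_{i=1}^{m} L_{X_i}^{*} L_{X_i} \phi, \qquad \text{for every } \phi \in C_c^\infty(M, \mathbb{R}); \tag{13}$$

indeed, multiplying the left-hand side of (13) by $\psi \in C_c^\infty(M)$ and integrating with respect to $\mu_{sr}$ we have

$$\int_M (\Delta_{sr}\phi)\, \psi\, \mu_{sr} = \int_M \operatorname{div}_{sr}(\operatorname{grad}_{sr}\phi)\, \psi\, \mu_{sr} = \int_M \operatorname{div}_{sr}\Big( \sum_{i=1}^{m} (L_{X_i}\phi) X_i \Big) \psi\, \mu_{sr} = -\int_M \sum_{i=1}^{m} (L_{X_i}\phi)(L_{X_i}\psi)\, \mu_{sr}.$$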

For the right-hand side we get the same expression. Since $\psi$ is arbitrary, the conclusion follows. Then, by density, one concludes that $\Delta_{sr}\phi = -\sum_{i=1}^{m} L_{X_i}^{*} L_{X_i} \phi$, for every $\phi \in C^2(M, \mathbb{R})$. $\square$

Proposition 20. Let $G$ be a Lie group and $X$ a left-invariant vector field on $G$. Then $L_X$ is formally skew-adjoint with respect to the right-Haar measure.

Proof. Let $\phi \in C_c^\infty(G, \mathbb{R})$ and $X = gp$ ($p \in \mathbf{L}$, $g \in G$). Since $X$ is left-invariant and $\mu_R$ is right-invariant, we have

$$\int_G (L_X\phi)(g_0)\, \mu_R(g_0) = \int_G \frac{d}{dt}\Big|_{t=0} \phi\big(g_0 e^{tp}\big)\, \mu_R(g_0) = \frac{d}{dt}\Big|_{t=0} \int_G \phi\big(g_0 e^{tp}\big)\, \mu_R(g_0) = \frac{d}{dt}\Big|_{t=0} \int_G \phi(g')\, \mu_R\big(g' e^{-tp}\big) = \frac{d}{dt}\Big|_{t=0} \int_G \phi(g')\, \mu_R(g') = 0.$$

Hence, for every $\phi_1, \phi_2 \in C_c^\infty(G, \mathbb{R})$ we have

$$0 = \int_G L_X(\phi_1\phi_2)\, \mu_R = \int_G L_X(\phi_1)\, \phi_2\, \mu_R + \int_G \phi_1\, (L_X\phi_2)\, \mu_R = (\phi_2, L_X\phi_1) + (\phi_1, L_X\phi_2)$$

and the conclusion follows. $\square$

For unimodular groups we can assume $\mu_{sr} = \mu_L = \mu_R$ (cf. Remark 16) and left-invariant vector fields are formally skew-adjoint with respect to $\mu_{sr}$. This argument provides an alternative proof of the fact that on unimodular Lie groups the hypoelliptic Laplacian is the sum of squares.

3. The generalized Fourier transform on unimodular Lie groups

Let $f \in L^1(\mathbb{R}, \mathbb{R})$: its Fourier transform is defined by the formula

$$\hat f(\lambda) = \int_{\mathbb{R}} f(x) e^{-ix\lambda}\, dx.$$

If $f \in L^1(\mathbb{R}, \mathbb{R}) \cap L^2(\mathbb{R}, \mathbb{R})$ then $\hat f \in L^2(\mathbb{R}, \mathbb{R})$ and one has

$$\int_{\mathbb{R}} |f(x)|^2\, dx = \int_{\mathbb{R}} |\hat f(\lambda)|^2\, \frac{d\lambda}{2\pi},$$

called the Parseval or Plancherel equation. By density of $L^1(\mathbb{R}, \mathbb{R}) \cap L^2(\mathbb{R}, \mathbb{R})$ in $L^2(\mathbb{R}, \mathbb{R})$, this equation expresses the fact that the Fourier transform is an isometry between $L^2(\mathbb{R}, \mathbb{R})$ and itself. Moreover, the following inversion formula holds:

$$f(x) = \int_{\mathbb{R}} \hat f(\lambda) e^{ix\lambda}\, \frac{d\lambda}{2\pi},$$

where the equality is intended in the $L^2$ sense. It has been known for more than 50 years that the Fourier transform generalizes to a wide class of locally compact groups (see for instance [14,19,25,26,32,44]). Next we briefly present this generalization for groups satisfying the following hypothesis:

(H0) $G$ is a unimodular Lie group of type I.

For the definition of groups of type I see [18]. For our purposes it is sufficient to recall that all groups treated in this paper (i.e. $H_2$, SU(2), SO(3), SL(2) and SE(2)) are of type I. Actually, both the real connected semisimple and the real connected nilpotent Lie groups are of type I [17,24] and even though not all solvable groups are of type I, this is the case for the group of the rototranslations of the plane SE(2) [43]. In the following, the $L^p$ spaces $L^p(G, \mathbb{C})$ are intended with respect to the Haar measure $\mu := \mu_L = \mu_R$.

Let $G$ be a Lie group satisfying (H0) and $\hat G$ be the dual⁶ of the group $G$, i.e. the set of all equivalence classes of unitary irreducible representations of $G$. Let $\lambda \in \hat G$: in the following we indicate by $\mathfrak{X}^\lambda$ a choice of an irreducible representation in the class $\lambda$. By definition, $\mathfrak{X}^\lambda$ is a map that to an element of $G$ associates a unitary operator acting on a complex separable Hilbert space $\mathcal{H}^\lambda$:

$$\mathfrak{X}^\lambda : G \to U(\mathcal{H}^\lambda), \qquad g \mapsto \mathfrak{X}^\lambda(g).$$

⁶ See footnote 3.

The index $\lambda$ for $\mathcal{H}^\lambda$ indicates that in general the Hilbert space can vary with $\lambda$.

Definition 21. Let $G$ be a Lie group satisfying (H0), and $f \in L^1(G, \mathbb{C})$. The generalized (or noncommutative) Fourier transform (GFT) of $f$ is the map (indicated in the following as $\hat f$ or $\mathcal{F}(f)$) that to each element of $\hat G$ associates the linear operator on $\mathcal{H}^\lambda$:

$$\hat f(\lambda) := \mathcal{F}(f) := \int_G f(g)\, \mathfrak{X}^\lambda(g^{-1})\, d\mu. \tag{14}$$

Notice that since $f$ is integrable and $\mathfrak{X}^\lambda$ unitary, $\hat f(\lambda)$ is a bounded operator.

Remark 22. $\hat f$ can be seen as an operator from $\int_{\hat G}^{\oplus} \mathcal{H}^\lambda$ to itself. We also use the notation $\hat f = \int_{\hat G}^{\oplus} \hat f(\lambda)$.

In general $\hat G$ is not a group and its structure can be quite complicated. In the case in which $G$ is abelian then $\hat G$ is a group; if $G$ is nilpotent then $\hat G$ has the structure of $\mathbb{R}^n$ for some $n$; if $G$ is compact then it is a Tannaka category (moreover, in this case each $\mathcal{H}^\lambda$ is finite dimensional). Under the hypothesis (H0) one can define on $\hat G$ a positive measure $dP(\lambda)$ (called the Plancherel measure) such that for every $f \in L^1(G, \mathbb{C}) \cap L^2(G, \mathbb{C})$ one has

$$\int_G |f(g)|^2\, \mu(g) = \int_{\hat G} \operatorname{Tr}\big( \hat f(\lambda) \circ \hat f(\lambda)^{*} \big)\, dP(\lambda).$$

By density of $L^1(G, \mathbb{C}) \cap L^2(G, \mathbb{C})$ in $L^2(G, \mathbb{C})$, this formula expresses the fact that the GFT is an isometry between $L^2(G, \mathbb{C})$ and $\int_{\hat G}^{\oplus} \mathrm{HS}^\lambda$, the set of Hilbert–Schmidt operators with respect to the Plancherel measure. Moreover, it is obvious that:

Proposition 23. Let $G$ be a Lie group satisfying (H0) and $f \in L^1(G, \mathbb{C}) \cap L^2(G, \mathbb{C})$. We have, for each $g \in G$,

$$f(g) = \int_{\hat G} \operatorname{Tr}\big( \hat f(\lambda) \circ \mathfrak{X}^\lambda(g) \big)\, dP(\lambda) \tag{15}$$

where the equality is intended in the $L^2$ sense.

It is immediate to verify that, given two functions $f_1, f_2 \in L^1(G, \mathbb{C})$ and defining their convolution as

$$(f_1 * f_2)(g) = \int_G f_1(h) f_2(h^{-1} g)\, dh, \tag{16}$$

then the GFT maps the convolution into the noncommutative product:

$$\mathcal{F}(f_1 * f_2)(\lambda) = \hat f_2(\lambda) \hat f_1(\lambda). \tag{17}$$

Another important property is that if $\delta_{\mathrm{Id}}(g)$ is the Dirac function at the identity over $G$, then

$$\hat\delta_{\mathrm{Id}}(\lambda) = \mathrm{Id}_{\mathcal{H}^\lambda}. \tag{18}$$
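Properties (17) and (18), including the reversed order of the factors, can be tested on a toy model. The sketch below replaces the Lie group by the finite dihedral group $D_3$ (integrals become finite sums) and uses its two-dimensional unitary irreducible representation; this finite stand-in, the encoding of group elements as pairs and all function names are assumptions made purely for illustration.

```python
import math

# Elements of D3 are pairs (k, e) standing for r^k s^e, where r is a rotation
# of order 3 and s a reflection, with s r s = r^(-1).
ELEMENTS = [(k, e) for k in range(3) for e in range(2)]
IDENTITY = (0, 0)

def mul(g, h):
    (a, b), (c, d) = g, h
    return ((a + (c if b == 0 else -c)) % 3, (b + d) % 2)

def inv(g):
    return next(h for h in ELEMENTS if mul(g, h) == IDENTITY)

def rep(g):
    """Two-dimensional unitary irreducible representation: r^k s^e -> R^k S^e."""
    k, e = g
    c, s = math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)
    sign = -1.0 if e else 1.0          # S = diag(1, -1)
    return [[c, -s * sign], [s, c * sign]]

def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def gft(f):
    """Finite-group analogue of (14): sum over g of f(g) * rep(g^-1)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for g in ELEMENTS:
        M = rep(inv(g))
        for i in range(2):
            for j in range(2):
                out[i][j] += f[g] * M[i][j]
    return out

def convolve(f1, f2):
    """Finite-group analogue of (16): (f1 * f2)(g) = sum_h f1(h) f2(h^-1 g)."""
    return {g: sum(f1[h] * f2[mul(inv(h), g)] for h in ELEMENTS)
            for g in ELEMENTS}

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

f1 = {g: float(i + 1) for i, g in enumerate(ELEMENTS)}
f2 = {g: float((3 * i) % 5 - 2) for i, g in enumerate(ELEMENTS)}
delta = {g: 1.0 if g == IDENTITY else 0.0 for g in ELEMENTS}
r_ind = {g: 1.0 if g == (1, 0) else 0.0 for g in ELEMENTS}   # indicator of r
s_ind = {g: 1.0 if g == (0, 1) else 0.0 for g in ELEMENTS}   # indicator of s
```

With these definitions, `gft(convolve(f1, f2))` agrees with `matmul(gft(f2), gft(f1))`, while for the indicator functions of $r$ and $s$ the products taken in the two orders differ, so the order of the factors in (17) matters as soon as the group is not abelian.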

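In the following, a key role is played by the differential of the representation $\mathfrak{X}^\lambda$, that is the map

$$d\mathfrak{X}^\lambda : X \mapsto d\mathfrak{X}^\lambda(X) := \frac{d}{dt}\Big|_{t=0} \mathfrak{X}^\lambda\big(e^{tp}\big), \tag{19}$$

where $X = gp$ ($p \in \mathbf{L}$, $g \in G$) is a left-invariant vector field over $G$. By Stone's theorem (see for instance [44, p. 6]) $d\mathfrak{X}^\lambda(X)$ is a (possibly unbounded) skew-adjoint operator on $\mathcal{H}^\lambda$. We have the following:

Proposition 24. Let $G$ be a Lie group satisfying (H0) and $X$ be a left-invariant vector field over $G$. The GFT of $X$, i.e. $\hat X = \mathcal{F} L_X \mathcal{F}^{-1}$, splits into the Hilbert sum of operators $\hat X^\lambda$, each one of which acts on the set $\mathrm{HS}^\lambda$ of Hilbert–Schmidt operators over $\mathcal{H}^\lambda$:

$$\hat X = \int_{\hat G}^{\oplus} \hat X^\lambda.$$

Moreover,

$$\hat X^\lambda \Xi = d\mathfrak{X}^\lambda(X) \circ \Xi, \qquad \text{for every } \Xi \in \mathrm{HS}^\lambda, \tag{20}$$

i.e. the GFT of a left-invariant vector field acts as a left-translation over $\mathrm{HS}^\lambda$.

Proof. Consider the operator $R_{e^{tp}}$ of right-translation of a function by $e^{tp}$, $p \in \mathbf{L}$, i.e. $(R_{e^{tp}} f)(g_0) = f(g_0 e^{tp})$, and compute its GFT:

$$\mathcal{F}(R_{e^{tp}} f)(\lambda) = \mathcal{F}\big( f(g_0 e^{tp}) \big)(\lambda) = \int_G f\big(g_0 e^{tp}\big)\, \mathfrak{X}^\lambda\big(g_0^{-1}\big)\, \mu(g_0) = \int_G f(g')\, \mathfrak{X}^\lambda\big(e^{tp}\big)\, \mathfrak{X}^\lambda\big(g'^{-1}\big)\, \mu\big(g' e^{-tp}\big) = \mathfrak{X}^\lambda\big(e^{tp}\big) \hat f(\lambda),$$

where in the last equality we use the right-invariance of the Haar measure. Hence the GFT acts as a left-translation on $\mathrm{HS}^\lambda$ and it disintegrates the right-regular representations. It follows that

$$\hat R_{e^{tp}} = \mathcal{F} R_{e^{tp}} \mathcal{F}^{-1} = \int_{\hat G}^{\oplus} \mathfrak{X}^\lambda\big(e^{tp}\big).$$

Passing to the infinitesimal generators, with $X = gp$, the conclusion follows. $\square$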

Remark 25. From the fact that the GFT of a left-invariant vector field acts as a left-translation, it follows that $\hat X^\lambda$ can be interpreted as an operator over $\mathcal{H}^\lambda$.

3.1. Computation of the kernel of the hypoelliptic heat equation

In this section we provide a general method to compute the kernel of the hypoelliptic heat equation on a left-invariant sub-Riemannian manifold $(G, \blacktriangle, g)$ such that $G$ satisfies the assumption (H0).

We begin by recalling some existence results (for the semigroup of evolution and for the corresponding kernel) in the case of the sum of squares. We recall that for all the examples treated in this paper, the invariant hypoelliptic Laplacian is the sum of squares.

Let $G$ be a unimodular Lie group and $(G, \blacktriangle, g)$ a left-invariant sub-Riemannian manifold generated by the orthonormal basis $\{p_1, \ldots, p_m\}$, and consider the hypoelliptic heat equation

$$\partial_t \phi(t, g) = \Delta_{sr}\phi(t, g). \tag{21}$$

Since $G$ is unimodular, $\Delta_{sr} = L_{X_1}^2 + \cdots + L_{X_m}^2$, where $L_{X_i}$ is the Lie derivative w.r.t. the vector field $X_i := g p_i$ ($i = 1, \ldots, m$). Following Varopoulos [47, pp. 20–21, 106], since $\Delta_{sr}$ is a sum of squares, it is a symmetric operator that we identify with its Friedrichs (self-adjoint) extension, that is the infinitesimal generator of a (Markov) semigroup $e^{t\Delta_{sr}}$. Thanks to the left-invariance of $X_i$ (with $i = 1, \ldots, m$), $e^{t\Delta_{sr}}$ admits a right-convolution kernel $p_t(\cdot)$, i.e.

$$e^{t\Delta_{sr}}\phi_0(g) = \phi_0 * p_t(g) = \int_G \phi_0(h)\, p_t(h^{-1} g)\, \mu(h) \tag{22}$$

is the solution for $t > 0$ to (21) with initial condition $\phi(0, g) = \phi_0(g) \in L^1(G, \mathbb{R})$ with respect to the Haar measure.

Since the operator $\partial_t - \Delta_{sr}$ is hypoelliptic, the kernel is a $C^\infty$ function of $(t, g) \in \mathbb{R}^+ \times G$. Notice that $p_t(g) = e^{t\Delta_{sr}}\delta_{\mathrm{Id}}(g)$.

The main results of the paper are based on the following key fact.

Theorem 26. Let $G$ be a Lie group satisfying (H0) and $(G, \blacktriangle, g)$ a left-invariant sub-Riemannian manifold generated by the orthonormal basis $\{p_1, \ldots, p_m\}$. Let $\Delta_{sr} = L_{X_1}^2 + \cdots + L_{X_m}^2$ be the intrinsic hypoelliptic Laplacian where $L_{X_i}$ is the Lie derivative w.r.t. the vector field $X_i := g p_i$. Let $\{\mathfrak{X}^\lambda\}_{\lambda \in \hat G}$ be the set of all non-equivalent classes of irreducible representations of the group $G$, each acting on a Hilbert space $\mathcal{H}^\lambda$, and $dP(\lambda)$ be the Plancherel measure on the dual space $\hat G$. We have the following:


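(i) The GFT of $\Delta_{sr}$ splits into the Hilbert sum of operators $\hat\Delta^\lambda_{sr}$, each one of which leaves $H^\lambda$ invariant:

$$\hat\Delta_{sr} = \mathcal{F} \Delta_{sr} \mathcal{F}^{-1} = \int_{\hat G}^{\oplus} \hat\Delta^\lambda_{sr}\, dP(\lambda), \quad \text{where } \hat\Delta^\lambda_{sr} = \sum_{i=1}^{m} \big(\hat X_i^\lambda\big)^2. \tag{23}$$

(ii) The operator $\hat\Delta^\lambda_{sr}$ is self-adjoint and it is the infinitesimal generator of a contraction semigroup $e^{t\hat\Delta^\lambda_{sr}}$ over $HS^\lambda$, i.e. $e^{t\hat\Delta^\lambda_{sr}} \Xi_0^\lambda$ is the solution for $t > 0$ to the operator equation $\partial_t \Xi^\lambda(t) = \hat\Delta^\lambda_{sr} \Xi^\lambda(t)$ in $HS^\lambda$, with initial condition $\Xi^\lambda(0) = \Xi_0^\lambda$.

(iii) The hypoelliptic heat kernel is

$$p_t(g) = \int_{\hat G} \operatorname{Tr}\big(e^{t\hat\Delta^\lambda_{sr}}\, \mathfrak{X}^\lambda(g)\big)\, dP(\lambda), \quad t > 0. \tag{24}$$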

Proof. Following Varopoulos as above, and using Proposition 24, (i) follows. Item (ii) follows from the splitting (23) and from the fact that the GFT is an isometry between $L^2(G, \mathbb{C})$ (the set of square integrable functions from $G$ to $\mathbb{C}$ with respect to the Haar measure) and the set $\int_{\hat G}^{\oplus} HS^\lambda$ of Hilbert–Schmidt operators with respect to the Plancherel measure. Item (iii) is obtained by applying the inverse GFT to $e^{t\hat\Delta^\lambda_{sr}} \Xi_0^\lambda$ and the convolution formula (17). The integral is convergent by the existence theorem for $p_t$, see [47, p. 106]. □

Remark 27. As a consequence of Remark 25, it follows that $\hat\Delta^\lambda_{sr}$ and $e^{t\hat\Delta^\lambda_{sr}}$ can be considered as operators on $H^\lambda$.

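In the case when each $\hat\Delta^\lambda_{sr}$ has discrete spectrum, the following corollary gives an explicit formula for the hypoelliptic heat kernel in terms of its eigenvalues and eigenvectors.

Corollary 28. Under the hypotheses of Theorem 26, if in addition for every $\lambda$, $\hat\Delta^\lambda_{sr}$ (considered as an operator over $H^\lambda$) has discrete spectrum and $\{\psi_n^\lambda\}$ is a complete set of eigenfunctions of norm one with the corresponding set of eigenvalues $\{\alpha_n^\lambda\}$, then

$$p_t(g) = \int_{\hat G} \Big( \sum_n e^{\alpha_n^\lambda t} \big\langle \psi_n^\lambda, \mathfrak{X}^\lambda(g) \psi_n^\lambda \big\rangle \Big)\, dP(\lambda) \tag{25}$$

where $\langle \cdot, \cdot \rangle$ is the scalar product in $H^\lambda$.

Proof. Recall that $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$ and that $\operatorname{Tr}(A) = \sum_{i \in I} \langle e_i, A e_i \rangle$ for any complete set $\{e_i\}_{i \in I}$ of orthonormal vectors. Hence $\operatorname{Tr}\big(e^{t\hat\Delta^\lambda_{sr}} \mathfrak{X}^\lambda(g)\big) = \sum_n \big\langle \psi_n^\lambda, \mathfrak{X}^\lambda(g)\, e^{t\hat\Delta^\lambda_{sr}} \psi_n^\lambda \big\rangle$. Observe that $\partial_t \psi_n^\lambda = \hat\Delta^\lambda_{sr} \psi_n^\lambda = \alpha_n^\lambda \psi_n^\lambda$, hence $e^{t\hat\Delta^\lambda_{sr}} \psi_n^\lambda = e^{\alpha_n^\lambda t} \psi_n^\lambda$, from which the result follows. □

The following corollary gives a useful formula for the hypoelliptic heat kernel in the case in which for all $\lambda \in \hat G$ each operator $e^{t\hat\Delta^\lambda_{sr}}$ admits a convolution kernel $Q_t^\lambda(\cdot, \cdot)$. Here by $\psi^\lambda$ we mean an element of $H^\lambda$.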



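Corollary 29. Under the hypotheses of Theorem 26, if for all $\lambda \in \hat G$ we have $H^\lambda = L^2(X^\lambda, d\theta^\lambda)$ for some measure space $(X^\lambda, d\theta^\lambda)$ and

$$\big[e^{t\hat\Delta^\lambda_{sr}} \psi^\lambda\big](\theta) = \int_{X^\lambda} \psi^\lambda(\bar\theta)\, Q_t^\lambda(\theta, \bar\theta)\, d\bar\theta,$$

then

$$p_t(g) = \int_{\hat G} \int_{X^\lambda} \mathfrak{X}^\lambda(g)\, Q_t^\lambda(\theta, \bar\theta)\Big|_{\theta = \bar\theta}\, d\bar\theta\, dP(\lambda),$$

where in the last formula $\mathfrak{X}^\lambda(g)$ acts on $Q_t^\lambda(\theta, \bar\theta)$ as a function of $\theta$.

Proof. From (24), we have

$$p_t(g) = \int_{\hat G} \operatorname{Tr}\big(e^{t\hat\Delta^\lambda_{sr}}\, \mathfrak{X}^\lambda(g)\big)\, dP(\lambda) = \int_{\hat G} \operatorname{Tr}\big(\mathfrak{X}^\lambda(g)\, e^{t\hat\Delta^\lambda_{sr}}\big)\, dP(\lambda).$$

We have to compute the trace of the operator

$$\Theta = \mathfrak{X}^\lambda(g)\, e^{t\hat\Delta^\lambda_{sr}} : \psi^\lambda(\theta) \mapsto \mathfrak{X}^\lambda(g) \big[e^{t\hat\Delta^\lambda_{sr}} \psi^\lambda\big](\theta) = \mathfrak{X}^\lambda(g) \int_{X^\lambda} \psi^\lambda(\bar\theta)\, Q_t^\lambda(\theta, \bar\theta)\, d\bar\theta = \int_{X^\lambda} K(\theta, \bar\theta)\, \psi^\lambda(\bar\theta)\, d\bar\theta \tag{26}$$

where $K(\theta, \bar\theta) = \mathfrak{X}^\lambda(g)\, Q_t^\lambda(\theta, \bar\theta)$ is a function of $\theta, \bar\theta$ and $\mathfrak{X}^\lambda(g)$ acts on $Q_t^\lambda(\theta, \bar\theta)$ as a function of $\theta$. The trace of $\Theta$ is $\int_{X^\lambda} K(\bar\theta, \bar\theta)\, d\bar\theta$ and the conclusion follows. □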

4. Explicit expressions on 3D unimodular Lie groups

4.1. The hypoelliptic heat equation on $H_2$

In this section we apply the method presented above to solve the hypoelliptic heat equation (21) on the Heisenberg group. This kernel, via the GFT, was first obtained by Hulanicki (see [29]). We present it as an application of Corollary 29, since in this case an expression for the kernel of the GFT of this equation is known. We write the Heisenberg group as the 3D group of matrices

$$H_2 = \left\{ \begin{pmatrix} 1 & x & z + \frac{1}{2} xy \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; x, y, z \in \mathbb{R} \right\}$$



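endowed with the standard matrix product. $H_2$ is indeed $\mathbb{R}^3$,

$$(x, y, z) \sim \begin{pmatrix} 1 & x & z + \frac{1}{2} xy \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix},$$

endowed with the group law

$$(x_1, y_1, z_1) \cdot (x_2, y_2, z_2) = \Big( x_1 + x_2,\ y_1 + y_2,\ z_1 + z_2 + \frac{1}{2}(x_1 y_2 - x_2 y_1) \Big).$$

A basis of its Lie algebra is $\{p_1, p_2, k\}$ where

$$p_1 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad p_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{27}$$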

They satisfy the following commutation rules: $[p_1, p_2] = k$, $[p_1, k] = [p_2, k] = 0$, hence $H_2$ is a 2-step nilpotent group. We define a left-invariant sub-Riemannian structure on $H_2$ as presented in Section 2.1.1: consider the two left-invariant vector fields $X_i(g) = g p_i$ with $i = 1, 2$ and define

$$\blacktriangle(g) = \operatorname{span}\{X_1(g), X_2(g)\}, \qquad g_g(X_i(g), X_j(g)) = \delta_{ij}.$$

Writing the group $H_2$ in coordinates $(x, y, z)$ on $\mathbb{R}^3$, we have the following expression for the Lie derivatives of $X_1$ and $X_2$:

$$L_{X_1} = \partial_x - \frac{y}{2}\, \partial_z, \qquad L_{X_2} = \partial_y + \frac{x}{2}\, \partial_z.$$

The Heisenberg group is unimodular, hence the hypoelliptic Laplacian $\Delta_{sr}$ is the sum of squares:

$$\Delta_{sr} \phi = \big( L_{X_1}^2 + L_{X_2}^2 \big) \phi. \tag{28}$$
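The matrix model, the group law on $\mathbb{R}^3$ and the commutation rules of the basis (27) can be cross-checked numerically. The following is only an illustrative sketch: the helper names `heis`, `group_law` and `bracket` are ours and do not appear in the paper, and the sample points are arbitrary.

```python
import numpy as np

def heis(x, y, z):
    """Matrix of the Heisenberg group element with coordinates (x, y, z)."""
    return np.array([[1.0, x, z + 0.5 * x * y],
                     [0.0, 1.0, y],
                     [0.0, 0.0, 1.0]])

def group_law(a, b):
    """Group law on R^3 as stated in the text."""
    x1, y1, z1 = a
    x2, y2, z2 = b
    return (x1 + x2, y1 + y2, z1 + z2 + 0.5 * (x1 * y2 - x2 * y1))

def bracket(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

# Basis (27) of the Lie algebra.
p1 = np.array([[0., 1., 0.], [0., 0., 0.], [0., 0., 0.]])
p2 = np.array([[0., 0., 0.], [0., 0., 1.], [0., 0., 0.]])
k = np.array([[0., 0., 1.], [0., 0., 0.], [0., 0., 0.]])

# Arbitrary sample points (illustrative values only).
a, b = (0.3, -1.2, 0.7), (2.0, 0.5, -0.4)

# The matrix product must reproduce the group law in coordinates.
law_ok = np.allclose(heis(*a) @ heis(*b), heis(*group_law(a, b)))

# [p1, p2] = k and k is central, so the algebra is 2-step nilpotent.
brackets_ok = (np.array_equal(bracket(p1, p2), k)
               and not bracket(p1, k).any()
               and not bracket(p2, k).any())
print(law_ok, brackets_ok)
```

Both flags should be true for any choice of sample points, since the identity is algebraic.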

Remark 30. It is interesting to notice that all left-invariant sub-Riemannian structures that one can define on the Heisenberg group are isometric.

In the next proposition we present the structure of the dual group of $H_2$. For details and proofs see for instance [33].

Proposition 31. The dual space of $H_2$ is $\hat G = \{\mathfrak{X}^\lambda \mid \lambda \in \mathbb{R}\}$, where

$$\mathfrak{X}^\lambda(g): \quad H \to H, \qquad \psi(\theta) \mapsto e^{i\lambda(z - y\theta + \frac{xy}{2})}\, \psi(\theta - x),$$

whose domain is $H = L^2(\mathbb{R}, \mathbb{C})$, endowed with the standard product

$$\langle \psi_1, \psi_2 \rangle := \int_{\mathbb{R}} \psi_1(\theta)\, \overline{\psi_2(\theta)}\, d\theta$$

where $d\theta$ is the Lebesgue measure.

The Plancherel measure on $\hat G$ is $dP(\lambda) = \frac{|\lambda|}{4\pi^2}\, d\lambda$, where $d\lambda$ is the Lebesgue measure on $\mathbb{R}$.
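Remark 32. Notice that in this example the domain $H$ of the representation $\mathfrak{X}^\lambda$ does not depend on $\lambda$.

4.1.1. The kernel of the hypoelliptic heat equation

Consider the representation $\mathfrak{X}^\lambda$ of $H_2$ and let $\hat X_i^\lambda$ be the corresponding representations of the differential operators $L_{X_i}$ with $i = 1, 2$. Recall that $\hat X_i^\lambda$ are operators on $H$. From formula (19) we have

$$\big[\hat X_1^\lambda \psi\big](\theta) = -\frac{d}{d\theta} \psi(\theta), \qquad \big[\hat X_2^\lambda \psi\big](\theta) = -i\lambda\theta\, \psi(\theta),$$

hence

$$\big[\hat\Delta^\lambda_{sr} \psi\big](\theta) = \Big( \frac{d^2}{d\theta^2} - \lambda^2 \theta^2 \Big) \psi(\theta).$$

The GFT of the hypoelliptic heat equation is thus

$$\partial_t \psi = \Big( \frac{d^2}{d\theta^2} - \lambda^2 \theta^2 \Big) \psi.$$

The kernel of this equation is known (see for instance [4]) and it is called the Mehler kernel (its computation is very similar to the computation of the kernel for the harmonic oscillator in quantum mechanics):

$$Q_t^\lambda(\theta, \bar\theta) := \sqrt{\frac{\lambda}{2\pi \sinh(2\lambda t)}}\, \exp\Big( -\frac{1}{2} \frac{\lambda \cosh(2\lambda t)}{\sinh(2\lambda t)} \big(\theta^2 + \bar\theta^2\big) + \frac{\lambda \theta \bar\theta}{\sinh(2\lambda t)} \Big).$$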

Using Corollary 29 and after straightforward computations, one gets the kernel of the hypoelliptic heat equation on the Heisenberg group:

$$p_t(x, y, z) = \frac{1}{(2\pi t)^2} \int_{\mathbb{R}} \frac{2\tau}{\sinh(2\tau)} \exp\Big( -\frac{\tau (x^2 + y^2)}{2t \tanh(2\tau)} \Big) \cos\Big( 2 \frac{z\tau}{t} \Big)\, d\tau. \tag{29}$$

This formula differs from the one by Gaveau [21] by some numerical factors, since he studies the equation

$$\partial_t \phi = \frac{1}{2} \big( (\partial_x + 2y\, \partial_z)^2 + (\partial_y - 2x\, \partial_z)^2 \big) \phi.$$

The Gaveau formula is recovered from (29) with $t \to t/2$ and $z \to z/4$. Moreover, a multiplicative factor $\frac{1}{4}$ should be added, because from the change of variables one gets that the Haar measure is $\frac{1}{4}\, dx\, dy\, dz$ instead of $dx\, dy\, dz$ as used by Gaveau.
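As a sanity check on (29), the integral can be evaluated numerically. At the origin the integrand reduces to $2\tau / \sinh(2\tau)$, whose integral over $\mathbb{R}$ is $\pi^2/4$, so (29) gives $p_t(0,0,0) = 1/(16 t^2)$. The sketch below is not part of the paper: the function name, the truncation `tau_max` and the grid size are our own assumptions, and the quadrature is a plain uniform Riemann sum.

```python
import numpy as np

def heisenberg_heat_kernel(t, x, y, z, tau_max=30.0, n=60001):
    """Approximate formula (29) with a uniform Riemann sum in tau.

    tau_max and n are illustrative choices; the integrand decays like
    exp(-2|tau|), so truncating at |tau| = 30 is harmless here.
    """
    tau = np.linspace(-tau_max, tau_max, n)
    h = tau[1] - tau[0]
    small = np.abs(tau) < 1e-8
    safe = np.where(small, 1.0, tau)  # placeholder value avoids 0/0 at tau = 0
    # 2*tau/sinh(2*tau) -> 1 and tau/tanh(2*tau) -> 1/2 as tau -> 0.
    ratio = np.where(small, 1.0, 2.0 * safe / np.sinh(2.0 * safe))
    decay = np.where(small, 0.5, safe / np.tanh(2.0 * safe))
    integrand = (ratio
                 * np.exp(-decay * (x * x + y * y) / (2.0 * t))
                 * np.cos(2.0 * z * tau / t))
    return integrand.sum() * h / (2.0 * np.pi * t) ** 2

value_at_origin = heisenberg_heat_kernel(1.0, 0.0, 0.0, 0.0)
print(value_at_origin, 1.0 / 16.0)
```

The same routine can be used to plot $p_t$ along the $z$ axis or in the $(x, y)$ plane; only the closed-form value at the origin is checked here.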



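4.2. The hypoelliptic heat equation on SU(2)

In this section we solve the hypoelliptic heat equation (21) on the Lie group

$$SU(2) = \left\{ \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} \;\middle|\; \alpha, \beta \in \mathbb{C},\ |\alpha|^2 + |\beta|^2 = 1 \right\}.$$

A basis of the Lie algebra $su(2)$ is $\{p_1, p_2, k\}$ where$^7$

$$p_1 = \frac{1}{2} \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \qquad p_2 = \frac{1}{2} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad k = \frac{1}{2} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}. \tag{30}$$

We define a trivializable sub-Riemannian structure on SU(2) as presented in Section 2.1.1: consider the two left-invariant vector fields $X_i(g) = g p_i$ with $i = 1, 2$ and define

$$\blacktriangle(g) = \operatorname{span}\{X_1(g), X_2(g)\}, \qquad g_g(X_i(g), X_j(g)) = \delta_{ij}.$$

The group SU(2) is unimodular, hence the hypoelliptic Laplacian $\Delta_{sr}$ has the following expression:

$$\Delta_{sr} \psi = \big( L_{X_1}^2 + L_{X_2}^2 \big) \psi. \tag{31}$$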

In the next proposition we present the structure of the dual group of SU(2). For details and proofs see for instance [43].

Proposition 33. The dual space of SU(2) is $\hat G = \{\mathfrak{X}^n \mid n \in \mathbb{N}\}$. The domain $H^n$ of $\mathfrak{X}^n$ is the space of homogeneous polynomials of degree $n$ in two variables $(z_1, z_2)$ with complex coefficients, $H^n := \big\{ \sum_{k=0}^{n} a_k z_1^k z_2^{n-k} \mid a_k \in \mathbb{C} \big\}$, endowed with the scalar product

$$\Big\langle \sum_{k=0}^{n} a_k z_1^k z_2^{n-k},\ \sum_{k=0}^{n} b_k z_1^k z_2^{n-k} \Big\rangle := \sum_{k=0}^{n} k!\,(n-k)!\; a_k \bar b_k.$$

The representation $\mathfrak{X}^n$ is

$$\mathfrak{X}^n(g): \quad H^n \to H^n, \qquad \sum_{k=0}^{n} a_k z_1^k z_2^{n-k} \mapsto \sum_{k=0}^{n} a_k w_1^k w_2^{n-k}$$

with $(w_1, w_2) = (z_1, z_2)\, g = (\alpha z_1 - \bar\beta z_2,\ \beta z_1 + \bar\alpha z_2)$.

The Plancherel measure on $\hat G$ is $dP(n) = (n + 1)\, d\mu(n)$, where $d\mu$ is the counting measure. Notice that an orthonormal basis of $H^n$ is $\{\psi_k^n\}_{k=0}^{n}$ with

$$\psi_k^n := \frac{z_1^k z_2^{n-k}}{\sqrt{k!\,(n-k)!}}.$$

7 See [43, p. 67]: $p_1 = X_1$, $p_2 = X_2$, $k = X_3$.



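4.2.1. The kernel of the hypoelliptic heat equation

Consider the representations $\hat X_i^n$ of the differential operators $L_{X_i}$ with $i = 1, 2$: they are operators on $H^n$, whose action on the basis $\{\psi_k^n\}_{k=0}^{n}$ of $H^n$ is (using formula (19))

$$\hat X_1^n \psi_k^n = \frac{i}{2} \big\{ k\, \psi_{k-1}^n + (n - k)\, \psi_{k+1}^n \big\}, \qquad \hat X_2^n \psi_k^n = \frac{1}{2} \big\{ k\, \psi_{k-1}^n - (n - k)\, \psi_{k+1}^n \big\},$$

hence $\hat\Delta^n_{sr} \psi_k^n = \big( k^2 - kn - \frac{n}{2} \big) \psi_k^n$. Thus, the basis $\{\psi_k^n\}_{k=0}^{n}$ is a complete set of eigenfunctions of norm one for the operator $\hat\Delta^n_{sr}$. We are now able to compute the kernel of the hypoelliptic heat equation using formula (25).

Proposition 34. The kernel of the hypoelliptic heat equation on $(SU(2), \blacktriangle, g)$ is

$$p_t(g) = \sum_{n=0}^{\infty} (n + 1) \sum_{k=0}^{n} e^{(k^2 - kn - \frac{n}{2}) t} A^{n,k}(g), \tag{32}$$

where

$$A^{n,k}(g) := \big\langle \psi_k^n, \mathfrak{X}^n(g) \psi_k^n \big\rangle = \sum_{l=0}^{\min\{k, n-k\}} \binom{k}{k-l} \binom{n-k}{l}\, \bar\alpha^{k-l} \alpha^{n-k-l} \big( |\alpha|^2 - 1 \big)^l$$

with $g = \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}$.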

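Proof. The formula $p_t(g) = \sum_{n=0}^{\infty} (n + 1) \sum_{k=0}^{n} e^{(k^2 - kn - \frac{n}{2}) t} \big\langle \psi_k^n, \mathfrak{X}^n(g) \psi_k^n \big\rangle$ is given by applying formula (25) in the SU(2) case. We now prove the explicit expression for $\big\langle \psi_k^n, \mathfrak{X}^n(g) \psi_k^n \big\rangle$: a direct computation gives

$$\mathfrak{X}^n(g) \psi_k^n = \sum_{s=0}^{n} \psi_s^n \left[ \sum_{l=\max\{0, s-k\}}^{\min\{s, n-k\}} \binom{k}{s-l} \binom{n-k}{l}\, \alpha^{s-l} (-\bar\beta)^{k-s+l} \beta^l \bar\alpha^{n-k-l} \right] \frac{\sqrt{s!\,(n-s)!}}{\sqrt{k!\,(n-k)!}}.$$

Observe that $\{\psi_k^n\}$ is an orthonormal frame for the inner product: hence

$$\big\langle \psi_k^n, \mathfrak{X}^n(g) \psi_k^n \big\rangle = \big\langle \psi_k^n, \psi_k^n \big\rangle \sum_{l=0}^{\min\{k, n-k\}} \binom{k}{k-l} \binom{n-k}{l}\, \alpha^{k-l} (-\bar\beta)^l \beta^l \bar\alpha^{n-k-l}.$$

The result easily follows. □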

Remark 35. Notice that, like the sub-Riemannian distance (computed in [9]), $p_t(g)$ does not depend on $\beta$. This is due to the cylindrical symmetry of the distribution around the subgroup $\{e^{ck} \mid c \in \mathbb{R}\}$.
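The eigenvalues $k^2 - kn - \frac{n}{2}$ used in Proposition 34 can be checked by assembling the matrices of $\hat X_1^n$ and $\hat X_2^n$ from their stated action on the basis and squaring them. This is a minimal sketch under that assumption; the function names are ours, not the paper's.

```python
import numpy as np

def su2_generators(n):
    """Matrices of X1-hat and X2-hat on psi_0..psi_n (column k = image of psi_k)."""
    X1 = np.zeros((n + 1, n + 1), dtype=complex)
    X2 = np.zeros((n + 1, n + 1), dtype=complex)
    for k in range(n + 1):
        if k >= 1:          # coefficient of psi_{k-1}
            X1[k - 1, k] = 0.5j * k
            X2[k - 1, k] = 0.5 * k
        if k <= n - 1:      # coefficient of psi_{k+1}
            X1[k + 1, k] = 0.5j * (n - k)
            X2[k + 1, k] = -0.5 * (n - k)
    return X1, X2

def su2_laplacian(n):
    """Matrix of the sum of squares X1-hat^2 + X2-hat^2 on H^n."""
    X1, X2 = su2_generators(n)
    return X1 @ X1 + X2 @ X2

def expected_eigenvalues(n):
    """Eigenvalues k^2 - k n - n/2 for k = 0..n, as stated in the text."""
    return np.array([k * k - k * n - n / 2 for k in range(n + 1)])

for n in range(6):
    diagonal = np.allclose(su2_laplacian(n), np.diag(expected_eigenvalues(n)))
    print(n, diagonal)
```

The off-diagonal contributions of the two squares cancel, which is why the sum is diagonal in this basis even though neither square is.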



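4.3. The hypoelliptic heat equation on SO(3)

Let $g$ be an element of $SO(3) = \{A \in \mathrm{Mat}(\mathbb{R}, 3) \mid A A^T = \mathrm{Id},\ \det(A) = 1\}$. A basis of the Lie algebra $so(3)$ is $\{p_1, p_2, k\}$ where$^8$

$$p_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad p_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{33}$$

We define a trivializable sub-Riemannian structure on SO(3) as presented in Section 2.1.1: consider the two left-invariant vector fields $X_i(g) = g p_i$ with $i = 1, 2$ and define

$$\blacktriangle(g) = \operatorname{span}\{X_1(g), X_2(g)\}, \qquad g_g(X_i(g), X_j(g)) = \delta_{ij}.$$

The group SO(3) is unimodular, hence the hypoelliptic Laplacian $\Delta_{sr}$ has the following expression:

$$\Delta_{sr} \psi = \big( L_{X_1}^2 + L_{X_2}^2 \big) \psi. \tag{34}$$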

We present now the structure of the dual group of SO(3). For details and proofs see [43]. First consider the domain $H^r$, that is the space of complex-valued polynomials of $r$th degree in three real variables $x, y, z$ that are homogeneous and harmonic:

$$H^r = \{ f(x, y, z) \mid \deg(f) = r,\ f \text{ homogeneous},\ \Delta f = 0 \}.$$

Notice that a homogeneous polynomial $f \in H^r$ is uniquely determined by its values on $S^2 = \{(x, y, z) \mid x^2 + y^2 + z^2 = 1\}$, as $f(\rho x, \rho y, \rho z) = \rho^r f(x, y, z)$. Define $\tilde f(\alpha, \beta) := f(\sin(\alpha) \cos(\beta), \sin(\alpha) \sin(\beta), \cos(\alpha))$. Then endow $H^r$ with the scalar product

$$\langle f_1(x, y, z), f_2(x, y, z) \rangle := \frac{1}{4\pi} \int_{S^2} \tilde f_1(\alpha, \beta)\, \overline{\tilde f_2(\alpha, \beta)}\, \sin\alpha\, d\alpha\, d\beta.$$

In the following proposition we present the structure of the dual group.

Proposition 36. The dual space of SO(3) is $\hat G = \{\mathfrak{X}^r \mid r \in \mathbb{N}\}$. Given $g \in SO(3)$, the unitary representation $\mathfrak{X}^r(g)$ is

$$\mathfrak{X}^r(g): \quad H^r \to H^r, \qquad f(x, y, z) \mapsto f(x_1, y_1, z_1)$$

with $(x_1, y_1, z_1) = (x, y, z)\, g$. The Plancherel measure on SO(3) is $dP(r) = (2r + 1)\, d\mu(r)$, where $d\mu$ is the counting measure.

8 See [43, p. 88]: $p_1 = Z_1$, $p_2 = Z_2$, $k = Z_3$.



An orthonormal basis for $H^r$ is given by $\{\phi_s^r\}_{s=-r}^{r}$ with $\tilde\phi_s^r(\alpha, \beta) := e^{is\beta} P_r^{-s}(\cos(\alpha))$, where $P_r^s(x)$ are the Legendre polynomials.$^9$

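4.3.1. The kernel of the hypoelliptic heat equation

Consider the representations $\hat X_i^r$ of the differential operators $L_{X_i}$ with $i = 1, 2$: using formula (19) we find the following expressions in spherical coordinates$^{10}$

$$\hat X_1^r \psi = \sin(\beta) \frac{\partial\psi}{\partial\alpha} + \cot(\alpha) \cos(\beta) \frac{\partial\psi}{\partial\beta}, \qquad \hat X_2^r \psi = -\cos(\beta) \frac{\partial\psi}{\partial\alpha} + \cot(\alpha) \sin(\beta) \frac{\partial\psi}{\partial\beta},$$

hence

$$\hat\Delta^r_{sr} \psi = \frac{\partial^2\psi}{\partial\alpha^2} + \cot^2(\alpha) \frac{\partial^2\psi}{\partial\beta^2} + \cot(\alpha) \frac{\partial\psi}{\partial\alpha} \tag{35}$$

and its action on the basis $\{\phi_s^r\}_{s=-r}^{r}$ of $H^r$ is

$$\hat\Delta^r_{sr} \phi_s^r = \big( s^2 - r(r + 1) \big) \phi_s^r. \tag{36}$$

Hence the basis $\{\phi_s^r\}_{s=-r}^{r}$ is a complete set of eigenfunctions of norm one for the operator $\hat\Delta^r_{sr}$. We compute the kernel of the hypoelliptic heat equation, using (25).

Proposition 37. The kernel of the hypoelliptic heat equation on $(SO(3), \blacktriangle, g)$ is

$$p_t(g) = \sum_{r=0}^{\infty} (2r + 1) \sum_{s=-r}^{r} e^{(s^2 - r(r+1)) t} \big\langle \phi_s^r, \mathfrak{X}^r(g) \phi_s^r \big\rangle. \tag{37}$$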

4.3.2. The heat kernel on SO(3) via the heat kernel on SU(2)

In this section we verify that the heat kernel on SO(3) given in (37) can be easily retrieved from the one on SU(2) given in (32). In the following, all the objects relative to SO(3) are underlined, e.g. $\underline{g} \in SO(3)$, $\underline{p}_i \in so(3)$, the representations $\underline{\mathfrak{X}}^r$ acting on $\underline{H}^r$ with basis $\underline{\phi}_s^r$.

Consider the isomorphism of Lie algebras $\mathrm{ad}: su(2) \to so(3)$ defined by $\mathrm{ad}\, p_1 = \underline{p}_1$, $\mathrm{ad}\, p_2 = \underline{p}_2$, $\mathrm{ad}\, k = \underline{k}$: it gives the matrix expression of the adjoint map on $su(2)$ with respect to the basis $\{p_1, p_2, k\}$. There is a corresponding homomorphism of groups $\mathrm{Ad}: SU(2) \to SO(3)$ given by $\mathrm{Ad}(\exp v) = \exp(\mathrm{ad}(v))$. It is a covering map of SO(3) by SU(2), such that for each matrix $\underline{g} \in SO(3)$ the preimage is given by two opposite matrices $g, -g \in SU(2)$.

Proposition 38. The following relation holds between the kernel $\underline{p}_t$ on SO(3) given in (37) and the kernel $p_t$ on SU(2) given in (32):

$$\forall\, \underline{g} \in SO(3),\ g \in \mathrm{Ad}^{-1}(\underline{g}): \qquad \underline{p}_t(\underline{g}) = \frac{p_t(g) + p_t(-g)}{2}.$$

9 Recall that $P_r^s(x)$ is defined by $P_r^s(x) := \frac{(1 - x^2)^{s/2}}{r!\, 2^r} \frac{d^{r+s} (x^2 - 1)^r}{dx^{r+s}}$.

10 I.e. $x = \rho \sin(\alpha) \cos(\beta)$, $y = \rho \sin(\alpha) \sin(\beta)$, $z = \rho \cos(\alpha)$.



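Proof. Observe the following key facts (see e.g. [43, II, §7]):

• on SU(2): $\mathfrak{X}^n(-g) \phi = (-1)^n\, \mathfrak{X}^n(g) \phi$;
• the representation $\underline{\mathfrak{X}}^r$ of SO(3) and the representation $\mathfrak{X}^{2r}$ of SU(2) are unitarily related, i.e. the following relation holds: $\forall g \in SU(2)$, $\underline{\phi}_s^r \in \underline{H}^r$

$$T^r \big[ \underline{\mathfrak{X}}^r(\mathrm{Ad}\, g)\, \underline{\phi}_s^r \big] = \mathfrak{X}^{2r}(g) \big[ T^r \underline{\phi}_s^r \big] \tag{38}$$

where the map

$$T^r: \quad \underline{H}^r \to H^{2r}, \qquad \underline{\phi}_s^r \mapsto \phi_{r+s}^{2r}$$

is a unitary isomorphism.

Then we have explicitly

$$\frac{p_t(g) + p_t(-g)}{2} = \frac{\sum_{n=0}^{\infty} (n + 1) \big(1 + (-1)^n\big) \sum_{k=0}^{n} e^{(k^2 - kn - \frac{n}{2}) t} A^{n,k}(g)}{2} = \sum_{r=0}^{\infty} (2r + 1) \sum_{s=-r}^{r} e^{(s^2 - r(r+1)) t} \big\langle \phi_{r+s}^{2r}, \mathfrak{X}^{2r}(g) \phi_{r+s}^{2r} \big\rangle$$

after the substitution $n = 2r$, $k = r + s$ (the terms with odd $n$ vanish). Using Eq. (38), we have $\big\langle \phi_{r+s}^{2r}, \mathfrak{X}^{2r}(g) \phi_{r+s}^{2r} \big\rangle = \big\langle \underline{\phi}_s^r, \underline{\mathfrak{X}}^r(\mathrm{Ad}\, g)\, \underline{\phi}_s^r \big\rangle$, from which the result directly follows. □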
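The index bookkeeping in this proof is easy to get wrong, so here is a small check that, with $n = 2r$ and $k = r + s$, the SU(2) exponent $k^2 - kn - \frac{n}{2}$ becomes the SO(3) exponent $s^2 - r(r+1)$ and the weight $(n+1)(1 + (-1)^n)/2$ becomes $2r + 1$. The function names and the range `r_max` are our own illustrative choices.

```python
def su2_exponent(n, k):
    """Exponent k^2 - k n - n/2 appearing in (32)."""
    return k * k - k * n - n / 2

def so3_exponent(r, s):
    """Exponent s^2 - r(r+1) appearing in (37)."""
    return s * s - r * (r + 1)

def weights_match(r_max=20):
    """Check exponents and Plancherel weights under n = 2r, k = r + s."""
    for r in range(r_max + 1):
        n = 2 * r
        # For even n the factor (n+1)(1+(-1)^n)/2 equals 2r+1.
        if (n + 1) * (1 + (-1) ** n) // 2 != 2 * r + 1:
            return False
        for s in range(-r, r + 1):
            if su2_exponent(n, r + s) != so3_exponent(r, s):
                return False
    return True

def odd_terms_vanish(n_max=41):
    """For odd n the factor 1 + (-1)^n is zero, so those terms drop out."""
    return all((1 + (-1) ** n) == 0 for n in range(1, n_max + 1, 2))

print(weights_match(), odd_terms_vanish())
```

Only the even-$n$ representations of SU(2) survive the averaging over $\pm g$, which is what one expects since exactly those descend to SO(3).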

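4.4. The hypoelliptic heat equation on SL(2)

In this section we solve the hypoelliptic heat equation (21) on the Lie group

$$SL(2) = \{ g \in \mathrm{Mat}(\mathbb{R}, 2) \mid \det(g) = 1 \}.$$

A basis of the Lie algebra $sl(2)$ is

$$p_1 = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad p_2 = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad k = \frac{1}{2} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

We define a trivializable sub-Riemannian structure on SL(2) as presented in Section 2.1.1: consider the two left-invariant vector fields $X_i(g) = g p_i$ with $i = 1, 2$ and define

$$\blacktriangle(g) = \operatorname{span}\{X_1(g), X_2(g)\}, \qquad g_g(X_i(g), X_j(g)) = \delta_{ij}.$$

The group SL(2) is unimodular, hence the hypoelliptic Laplacian $\Delta_{sr}$ has the following expression:

$$\Delta_{sr} \psi = \big( L_{X_1}^2 + L_{X_2}^2 \big) \psi. \tag{39}$$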


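It is well known that SL(2) and

$$SU(1, 1) = \left\{ \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix} \;\middle|\; |\alpha|^2 - |\beta|^2 = 1 \right\}$$

are isomorphic Lie groups via the isomorphism

$$\Pi: \quad SL(2) \to SU(1, 1), \qquad g \mapsto G = C g C^{-1}, \qquad \text{with } C = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix}.$$

This isomorphism also induces an isomorphism of Lie algebras $d\Pi: sl(2) \to su(1, 1)$ defined by $d\Pi(p_1) = p_1'$, $d\Pi(p_2) = p_2'$, $d\Pi(k) = k'$ with

$$p_1' = \frac{1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad p_2' = \frac{1}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad k' = \frac{1}{2} \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}.$$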
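The conjugation by $C$ can be verified directly: it should send a real matrix of determinant one to a matrix of the form $\begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix}$ with $|\alpha|^2 - |\beta|^2 = 1$, and it should map the basis of $sl(2)$ to the three matrices listed above. The sketch below assumes only the matrices given in the text; the helper names `Pi` and `in_su11` and the sample element `g` are ours.

```python
import numpy as np

C = np.array([[1, -1j], [1, 1j]]) / np.sqrt(2)
C_inv = np.linalg.inv(C)

def Pi(g):
    """Conjugation g -> C g C^{-1} (also used for Lie algebra elements)."""
    return C @ g @ C_inv

def in_su11(G, tol=1e-9):
    """Check the SU(1,1) shape [[a, b], [conj(b), conj(a)]] with |a|^2 - |b|^2 = 1."""
    alpha, beta = G[0, 0], G[0, 1]
    return (abs(G[1, 0] - np.conj(beta)) < tol
            and abs(G[1, 1] - np.conj(alpha)) < tol
            and abs(abs(alpha) ** 2 - abs(beta) ** 2 - 1) < tol)

# Arbitrary element of SL(2, R): det = 2*2 - 3*1 = 1.
g = np.array([[2.0, 3.0], [1.0, 2.0]])

# Basis of sl(2) and the claimed images in su(1,1).
p1 = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
p2 = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
k = 0.5 * np.array([[0, -1], [1, 0]], dtype=complex)
p1_img = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
p2_img = 0.5 * np.array([[0, -1j], [1j, 0]])
k_img = 0.5 * np.array([[-1j, 0], [0, 1j]])

group_ok = in_su11(Pi(g))
algebra_ok = (np.allclose(Pi(p1), p1_img)
              and np.allclose(Pi(p2), p2_img)
              and np.allclose(Pi(k), k_img))
print(group_ok, algebra_ok)
```

Because $\Pi$ is a conjugation, the same map serves for the group and for its Lie algebra, which is why `Pi` is reused for the basis elements.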

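This isomorphism naturally induces the definitions of a left-invariant sub-Riemannian structure and of the hypoelliptic Laplacian on SU(1, 1). We present here the structure of the dual of the group SU(1, 1), observing that the isomorphism of groups induces an isomorphism of representations. For details and proofs, see [43].

The dual space $\hat G$ of SU(1, 1) contains two continuous parts and a discrete part: $\hat G = \hat G_C \cup \hat G_D$ with $\hat G_C = \{\mathfrak{X}^{j,s} \mid j \in \{0, \frac{1}{2}\},\ s = \frac{1}{2} + iv,\ v \in \mathbb{R}^+\}$ and $\hat G_D = \{\mathfrak{X}^n \mid n \in \frac{1}{2}\mathbb{Z},\ |n| \ge 1\}$.

We define the domain $H_C$ of the continuous representation $\mathfrak{X}^{j,s}$: it is the Hilbert space of $L^2$ complex-valued functions on the unit circle $S^1 = \{x \in \mathbb{C} \mid |x| = 1\}$ with respect to the normalized Lebesgue measure $\frac{dx}{2\pi}$, endowed with the standard scalar product $\langle f, g \rangle := \int_{S^1} f(x)\, \overline{g(x)}\, \frac{dx}{2\pi}$. An orthonormal basis is given by the set $\{\psi_m\}_{m \in \mathbb{Z}}$ with $\psi_m(x) := x^{-m}$.

Proposition 39. The continuous part of the dual space of SU(1, 1) is

$$\hat G_C = \Big\{ \mathfrak{X}^{j,s} \;\Big|\; j \in \Big\{0, \frac{1}{2}\Big\},\ s = \frac{1}{2} + iv,\ v \in \mathbb{R}^+ \Big\}.$$

Given $G \in SU(1, 1)$, the unitary representation $\mathfrak{X}^{j,s}(G)$ is

$$\mathfrak{X}^{j,s}(G): \quad H_C \to H_C, \qquad \psi(x) \mapsto |\bar\beta x + \bar\alpha|^{-2s} \left( \frac{\bar\beta x + \bar\alpha}{|\bar\beta x + \bar\alpha|} \right)^{2j} \psi\left( \frac{\alpha x + \beta}{\bar\beta x + \bar\alpha} \right),$$

with $G^{-1} = \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix}$.

The Plancherel measure on $\hat G_C$ is

$$dP\Big(j, \frac{1}{2} + iv\Big) = \begin{cases} \dfrac{v}{2\pi} \operatorname{Tanh}(\pi v)\, dv, & j = 0, \\[2mm] \dfrac{v}{2\pi} \operatorname{Cotanh}(\pi v)\, dv, & j = \frac{1}{2}, \end{cases}$$

where $dv$ is the Lebesgue measure on $\mathbb{R}$.

Remark 40. Notice that in this example the domain of the representation $H_C$ does not depend on $j, s$.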



Now we turn our attention to the description of principal discrete representations.$^{11}$ We first define the domain $H^n$ of these representations $\mathfrak{X}^n$: consider the space $L_n$ of $L^2$ complex-valued functions on the unit disc $D = \{x \in \mathbb{C} \mid |x| < 1\}$ with respect to the measure $d\mu^*(z) = \frac{2|n| - 1}{\pi} (1 - |z|^2)^{2n-2}\, dz$, where $dz$ is the Lebesgue measure. $L_n$ is a Hilbert space if endowed with the scalar product $\langle f, g \rangle := \int_D f(z)\, \overline{g(z)}\, d\mu^*(z)$. Then define the space $H^n$ with $n > 0$ as the Hilbert space of holomorphic functions of $L_n$, while $H^n$ with $n < 0$ is the Hilbert space of antiholomorphic functions of $L_{-n}$. An orthonormal basis for $H^n$ with $n > 0$ is given by $\{\psi_m^n\}_{m \in \mathbb{N}}$ with $\psi_m^n(z) = \Big( \frac{\Gamma(2n + m)}{\Gamma(2n)\, \Gamma(m + 1)} \Big)^{\frac{1}{2}} z^m$, where $\Gamma$ is the Gamma function. An orthonormal basis for $H^n$ with $n < 0$ is given by $\{\psi_m^n\}_{m \in \mathbb{N}}$ with $\psi_m^n(z) = \overline{\psi_m^{-n}(z)}$.

Proposition 41. The discrete part of the dual space of SU(1, 1) is $\hat G_D = \{\mathfrak{X}^n \mid n \in \frac{1}{2}\mathbb{Z},\ |n| \ge 1\}$. Given $G \in SU(1, 1)$, the unitary representation $\mathfrak{X}^n(G)$ is

$$\mathfrak{X}^n(G): \quad H^n \to H^n, \qquad \psi(x) \mapsto (\bar\beta x + \bar\alpha)^{-2|n|}\, \psi\left( \frac{\alpha x + \beta}{\bar\beta x + \bar\alpha} \right),$$

with $G^{-1} = \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix}$.

The Plancherel measure on $\hat G_D$ is $dP(n) = \frac{2|n| - 1}{4\pi}\, d\mu(n)$, where $d\mu$ is the counting measure.

4.4.1. The kernel of the hypoelliptic heat equation

In this section we compute the representations of the differential operators $L_{X_i}$ with $i = 1, 2$ and give the explicit expression of the kernel of the hypoelliptic heat equation.

We first study the continuous representations $\hat X_i^{j,s}$, for both the families $j = 0, \frac{1}{2}$. Their actions on the basis $\{\psi_m\}_{m \in \mathbb{Z}}$ of $H_C$ are

$$\hat X_1^{j,s} \psi_m = \frac{s - m - j}{2}\, \psi_{m-1} + \frac{s + m + j}{2}\, \psi_{m+1},$$

$$\hat X_2^{j,s} \psi_m = i\, \frac{s - m - j}{2}\, \psi_{m-1} - i\, \frac{s + m + j}{2}\, \psi_{m+1}.$$

Hence

$$\hat\Delta^{j,s}_{sr} \psi_m = -\Big( (m + j)^2 + v^2 + \frac{1}{4} \Big) \psi_m.$$

Moreover, the set $\{\psi_m\}_{m \in \mathbb{Z}}$ is a complete set of eigenfunctions of norm one for the operator $\hat\Delta^{j,s}_{sr}$.

Remark 42. Notice that the operators $\hat X_i^{j,s}$ are only defined on the space of $C^\infty$ vectors, i.e. the vectors $v \in H_C$ such that the map $g \mapsto [\mathfrak{X}^{j,s}(g)]\, v$ is a $C^\infty$ mapping. This restriction is not crucial for the following treatment.

11 There exist also the so-called complementary discrete representations, on which the Plancherel measure is vanishing. Hence they do not contribute to the GFT of a function defined on SU(1, 1). For details, see for instance [43].



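We now turn our attention to the discrete representations in both cases $n > 0$ (holomorphic functions) and $n < 0$ (antiholomorphic functions). Consider the discrete representation $\mathfrak{X}^n$ of SU(1, 1) and let $\hat X_i^n$ be the representations of the differential operators $L_{X_i}$ with $i = 1, 2$. Their actions on the basis $\{\psi_m^n\}_{m \in \mathbb{N}}$ of $H^n$ are

$$\hat X_1^n \psi_m^n = \frac{\sqrt{(2|n| + m)(m + 1)}}{2}\, \psi_{m+1}^n - \frac{\sqrt{(2|n| + m - 1)\, m}}{2}\, \psi_{m-1}^n,$$

$$\hat X_2^n \psi_m^n = -i\, \frac{\sqrt{(2|n| + m)(m + 1)}}{2}\, \psi_{m+1}^n - i\, \frac{\sqrt{(2|n| + m - 1)\, m}}{2}\, \psi_{m-1}^n.$$

Hence $\hat\Delta^n_{sr} \psi_m^n = -(|n| + 2m|n| + m^2)\, \psi_m^n$, thus the basis $\{\psi_m^n\}_{m \in \mathbb{N}}$ is a complete set of eigenfunctions of norm one for the operator $\hat\Delta^n_{sr}$. We now compute the kernel of the hypoelliptic heat equation using formula (25).

Proposition 43. The kernel of the hypoelliptic heat equation on $(SL(2), \blacktriangle, g)$ is

$$\begin{aligned} p_t(g) ={} & \int_0^{+\infty} \frac{v}{2\pi} \operatorname{Tanh}(\pi v) \sum_{m \in \mathbb{Z}} e^{-t(m^2 + v^2 + \frac{1}{4})} \big\langle \psi_m, \mathfrak{X}^{0,s}(G) \psi_m \big\rangle\, dv \\ & + \int_0^{+\infty} \frac{v}{2\pi} \operatorname{Cotanh}(\pi v) \sum_{m \in \mathbb{Z}} e^{-t(m^2 + m + v^2 + \frac{1}{2})} \big\langle \psi_m, \mathfrak{X}^{\frac{1}{2},s}(G) \psi_m \big\rangle\, dv \\ & + \sum_{n \in \frac{1}{2}\mathbb{Z},\ |n| \ge 1} \frac{2|n| - 1}{4\pi} \sum_{m \in \mathbb{N}} e^{-t(|n| + 2m|n| + m^2)} \big\langle \psi_m^n, \mathfrak{X}^n(G) \psi_m^n \big\rangle, \end{aligned} \tag{40}$$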
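where $G = \Pi(g^{-1}) \in SU(1, 1)$.

4.5. The hypoelliptic heat kernel on SE(2)

Consider the group of rototranslations of the plane

$$SE(2) = \left\{ \begin{pmatrix} \cos(\alpha) & -\sin(\alpha) & x_1 \\ \sin(\alpha) & \cos(\alpha) & x_2 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; \alpha \in \mathbb{R}/2\pi,\ x_i \in \mathbb{R} \right\}.$$

In the following we often denote an element of SE(2) as $g = (\alpha, x_1, x_2)$. A basis of the Lie algebra of SE(2) is $\{p_0, p_1, p_2\}$, with

$$p_0 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad p_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad p_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. \tag{41}$$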

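We define a trivializable sub-Riemannian structure on SE(2) as presented in Section 2.1.1: consider the two left-invariant vector fields $X_i(g) = g p_i$ with $i = 0, 1$ and define

$$\blacktriangle(g) = \operatorname{span}\{X_0(g), X_1(g)\}, \qquad g_g(X_i(g), X_j(g)) = \delta_{ij}.$$

The group SE(2) is unimodular, hence the hypoelliptic Laplacian $\Delta_{sr}$ has the following expression:

$$\Delta_{sr} \psi = \big( L_{X_0}^2 + L_{X_1}^2 \big) \psi. \tag{42}$$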

Remark 44. As for the Heisenberg group, all left-invariant sub-Riemannian structures that one can define on SE(2) are isometric.

In the following proposition we present the structure of the dual of SE(2).

Proposition 45. The dual space of SE(2) is $\hat G = \{\mathfrak{X}^\lambda \mid \lambda \in \mathbb{R}^+\}$. Given $g = (\alpha, x_1, x_2) \in SE(2)$, the unitary representation $\mathfrak{X}^\lambda(g)$ is

$$\mathfrak{X}^\lambda(g): \quad H \to H, \qquad \psi(\theta) \mapsto e^{i\lambda(x_1 \cos(\theta) - x_2 \sin(\theta))}\, \psi(\theta + \alpha),$$

where the domain $H$ of the representation $\mathfrak{X}^\lambda(g)$ is $H = L^2(S^1, \mathbb{C})$, the Hilbert space of $L^2$ functions on the circle $S^1 \subset \mathbb{R}^2$ with respect to the Lebesgue measure $d\theta$, endowed with the scalar product $\langle \psi_1, \psi_2 \rangle = \int_{S^1} \psi_1(\theta)\, \overline{\psi_2(\theta)}\, d\theta$.

The Plancherel measure on $\hat G$ is $dP(\lambda) = \lambda\, d\lambda$, where $d\lambda$ is the Lebesgue measure on $\mathbb{R}$.

Remark 46. Notice that in this example the domain of the representation $H$ does not depend on $\lambda$.

4.5.1. The kernel of the hypoelliptic heat equation

Consider the representations $\hat X_i^\lambda$ of the differential operators $L_{X_i}$ with $i = 0, 1$: they are operators on $H$, whose action on $\psi \in H$ is (using formula (19))

$$\big[\hat X_0^\lambda \psi\big](\theta) = \frac{d\psi(\theta)}{d\theta}, \qquad \big[\hat X_1^\lambda \psi\big](\theta) = i\lambda \cos(\theta)\, \psi(\theta),$$

hence

$$\big[\hat\Delta^\lambda_{sr} \psi\big](\theta) = \frac{d^2\psi(\theta)}{d\theta^2} - \lambda^2 \cos^2(\theta)\, \psi(\theta).$$

We have to find a complete set of eigenfunctions of norm one for $\hat\Delta^\lambda_{sr}$. Using $\cos^2(x) = \frac{1}{2}(1 + \cos(2x))$, an eigenfunction $\psi$ with eigenvalue $E$ is a $2\pi$-periodic function satisfying Mathieu's equation

$$\frac{d^2\psi}{dx^2} + \big( a - 2q \cos(2x) \big) \psi = 0 \tag{43}$$

with $a = -\frac{\lambda^2}{2} - E$ and $q = \frac{\lambda^2}{4}$. For details about Mathieu functions see for instance [1, Chapter 20].
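The reduction to Mathieu's equation can be illustrated numerically by writing $\frac{d^2}{d\theta^2} - \lambda^2 \cos^2(\theta)$ in the Fourier basis $e^{im\theta}$, where it has diagonal entries $-m^2 - \frac{\lambda^2}{2}$ and entries $-\frac{\lambda^2}{4}$ coupling $m$ to $m \pm 2$. The sketch below is our own illustration, not the method of the paper: the truncation `m_max` and the function names are assumptions. It recovers the lowest characteristic value via $a = -\frac{\lambda^2}{2} - E$, which for small $q$ should be close to the second-order perturbative value $-q^2/2$.

```python
import numpy as np

def se2_laplacian_fourier(lam, m_max=20):
    """Truncated matrix of d^2/dtheta^2 - lam^2 cos^2(theta) on e^{i m theta}, |m| <= m_max."""
    ms = np.arange(-m_max, m_max + 1)
    A = np.diag(-(ms.astype(float) ** 2) - lam ** 2 / 2.0)
    off = -lam ** 2 / 4.0          # from cos^2 = 1/2 + (e^{2i theta} + e^{-2i theta}) / 4
    for i in range(len(ms) - 2):
        A[i, i + 2] = off
        A[i + 2, i] = off
    return A

def top_eigenvalue(lam, m_max=20):
    """Largest eigenvalue E of the truncated operator (eigvalsh sorts ascending)."""
    return np.linalg.eigvalsh(se2_laplacian_fourier(lam, m_max))[-1]

def mathieu_a0(lam, m_max=20):
    """Lowest characteristic value a_0(q), q = lam^2/4, via a = -lam^2/2 - E."""
    return -lam ** 2 / 2.0 - top_eigenvalue(lam, m_max)

lam = 1.0
q = lam ** 2 / 4.0
print(top_eigenvalue(lam), mathieu_a0(lam), -q ** 2 / 2.0)
```

All eigenvalues of the truncated matrix are non-positive, consistent with the operator generating a contraction semigroup; a dedicated Mathieu-function routine would be the natural choice for production use.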



Remark 47. Notice that we consider only $2\pi$-periodic solutions of (43) since $H = L^2(S^1, \mathbb{C})$.

There exists an ordered discrete set $\{a_n(q)\}_{n=0}^{+\infty}$ of distinct real numbers ($a_n < a_{n+1}$) such that the equation $\frac{d^2 f}{dx^2} + (a_n - 2q \cos(2x)) f = 0$ admits a unique even $2\pi$-periodic solution of norm one. This function $ce_n(x, q)$ is called an even Mathieu function.

Similarly, there exists an ordered discrete set $\{b_n(q)\}_{n=1}^{+\infty}$ of distinct real numbers ($b_n < b_{n+1}$) such that the equation $\frac{d^2 f}{dx^2} + (b_n - 2q \cos(2x)) f = 0$ admits a unique odd $2\pi$-periodic solution of norm one. This function $se_n(x, q)$ is called an odd Mathieu function.

The set $B^\lambda := \{ce_n(x, \frac{\lambda^2}{4})\}_{n=0}^{+\infty} \cup \{se_n(x, \frac{\lambda^2}{4})\}_{n=1}^{+\infty}$ is a complete set of $2\pi$-periodic eigenfunctions of norm one for the operator $\hat\Delta^\lambda_{sr}$. The eigenvalue for $ce_n(x, \frac{\lambda^2}{4})$ is $a_n^\lambda := -\frac{\lambda^2}{2} - a_n(\frac{\lambda^2}{4})$. The eigenvalue for $se_n(x, \frac{\lambda^2}{4})$ is $b_n^\lambda := -\frac{\lambda^2}{2} - b_n(\frac{\lambda^2}{4})$. We can now compute the explicit expression of the hypoelliptic kernel on SE(2).

Proposition 48. The kernel of the hypoelliptic heat equation on $(SE(2), \blacktriangle, g)$ is

$$p_t(g) = \int_0^{+\infty} \lambda \left( \sum_{n=0}^{+\infty} e^{a_n^\lambda t} \big\langle ce_n(\theta), \mathfrak{X}^\lambda(g)\, ce_n(\theta) \big\rangle + \sum_{n=1}^{+\infty} e^{b_n^\lambda t} \big\langle se_n(\theta), \mathfrak{X}^\lambda(g)\, se_n(\theta) \big\rangle \right) d\lambda. \tag{44}$$

The function (44) is real for all $t > 0$.

Proof. The formula (44) is given by writing the formula (25) in the SE(2) case. We have to prove that $p_t(g)$ is real: we claim that $\langle ce_n, \mathfrak{X}^\lambda(g)\, ce_n \rangle$ is real. In fact, write the scalar product with $g = (\alpha, x, y)$:

$$\big\langle ce_n, \mathfrak{X}^\lambda(g)\, ce_n \big\rangle = \int_0^{2\pi} e^{i\lambda(x \cos(\theta) - y \sin(\theta))}\, ce_n(\theta)\, ce_n(\theta + \alpha)\, d\theta.$$

Its imaginary part is $\int_0^{2\pi} \sin\big(\lambda(x \cos(\theta) - y \sin(\theta))\big)\, ce_n(\theta)\, ce_n(\theta + \alpha)\, d\theta$. Its integrand assumes opposite values at $\theta$ and $\theta + \pi$: indeed

$$\sin\big( \lambda (x \cos(\theta + \pi) - y \sin(\theta + \pi)) \big) = \sin\big( \lambda (-x \cos(\theta) + y \sin(\theta)) \big) = -\sin\big( \lambda (x \cos(\theta) - y \sin(\theta)) \big),$$

while $ce_n(\theta + \pi) = (-1)^n ce_n(\theta)$ as a property of Mathieu functions. Thus, the integral over $[0, 2\pi]$ is null. With similar observations it is possible to prove that $\langle se_n(\theta), \mathfrak{X}^\lambda(g)\, se_n(\theta) \rangle$ is real. Thus, $p_t(g)$ is an integral of a sum of real functions, hence it is real. □

Acknowledgments

The authors are grateful to Giovanna Citti and to Fulvio Ricci for helpful discussions.



References [1] M. Abramowitz, I.A. Stegun (Eds.), Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables, National Bureau of Standards Applied Mathematics, Washington, 1964. [2] A.A. Agrachev, Yu.L. Sachkov, Control Theory from the Geometric Viewpoint, Encyclopaedia Math. Sci., vol. 87, Springer-Verlag, 2004. [3] A.A. Agrachev, U. Boscain, M. Sigalotti, A Gauss–Bonnet-like formula on two-dimensional almost-Riemannian manifolds, Discrete Contin. Dyn. Syst. 20 (4) (2008) 801–822. [4] R. Beals, Solutions fondamentales exactes (Exact kernels), in: Journées “Équations aux Dérivées Partielles”, Exp. No. I, Saint-Jean-de-Monts, 1998, Univ. Nantes, Nantes, 1998, 9 pp. (in French). [5] R. Beals, B. Gaveau, P. Greiner, Hamilton–Jacobi theory and the heat kernel on Heisenberg groups, J. Math. Pures Appl. (9) 79 (7) (2000) 633–689. [6] A. Bellaiche, The tangent space in sub-Riemannian geometry, in: A. Bellaiche, J.-J. Risler (Eds.), Sub-Riemannian Geometry, in: Progr. Math., vol. 144, Birkhäuser, Basel, 1996, pp. 1–78. [7] A. Bonfiglioli, E. Lanconelli, F. Uguzzoni, Stratified Lie Groups and Potential Theory for Their Sub-Laplacians, Springer Monogr. Math., Springer-Verlag, Berlin, 2007. [8] U. Boscain, S. Polidoro, Gaussian estimates for hypoelliptic operators via optimal control, Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei (9) Mat. Appl. 18 (4) (2007) 333–342. [9] U. Boscain, F. Rossi, Invariant Carnot–Carathéodory metrics on S 3 , SO(3), SL(2) and Lens spaces, SIAM J. Control Optim. 47 (4) (2008) 1851–1878. [10] A. Bressan, B. Piccoli, Introduction to the Mathematical Theory of Control, AIMS Ser. Appl. Math., vol. 2, 2007. [11] R. Brockett, Control theory and singular Riemannian geometry, in: New Directions in Applied Mathematics, Cleveland, Ohio, 1980, Springer-Verlag, New York, Berlin, 1982, pp. 11–27. [12] R. 
Brockett, Nonlinear control theory and differential geometry, in: Proceedings of the International Congress of Mathematicians, vols. 1, 2, Warsaw, 1984, pp. 1357–1368. [13] L. Capogna, D. Danielli, S.D. Pauls, J.T. Tyson, An Introduction to the Heisenberg Group and the Sub-Riemannian Isoperimetric Problem, Progr. Math., vol. 259, Birkhäuser Verlag, Basel, 2007. [14] G.S. Chirikjian, A.B. Kyatkin, Engineering Applications of Noncommutative Harmonic Analysis, CRC Press, Boca Raton, FL, 2001. [15] G. Citti, A. Sarti, A Cortical Based Model of Perceptual Completion in the Roto-Translation Space, AMS Acta, 2004. [16] J. Cygan, Heat kernels for class 2 nilpotent groups, Studia Math. 64 (3) (1979) 227–238. [17] J. Dixmier, Sur les représentations unitaires des groupes de Lie algébriques, Ann. Inst. Fourier (Grenoble) 7 (1957) 315–328. [18] J. Dixmier, Sur les représentations unitaires des groupes de Lie nilpotents. I, Amer. J. Math. 81 (1959) 160–170. [19] M. Duflo, Analyse harmonique sur les groupes algébriques complexes: formule de Plancherel (d’après M. Andler) et conjecture de M. Vergne, Bourbaki seminar, vol. 1982/1983, pp. 279–291. [20] G.B. Folland, E.M. Stein, Estimates for the ∂¯b complex and analysis on the Heisenberg group, Comm. Pure Appl. Math. 27 (1974) 429–522. [21] B. Gaveau, Principe de moindre action, propagation de la chaleur et estimées sous elliptiques sur certains groupes nilpotents, Acta Math. 139 (1–2) (1977) 95–153. [22] V. Gershkovich, A. Vershik, Nonholonomic manifolds and nilpotent analysis, J. Geom. Phys. 5 (1988) 407–452. [23] M. Gromov, Carnot–Carathéodory spaces seen from within, in: A. Bellaiche, J.-J. Risler (Eds.), Sub-Riemannian Geometry, in: Progr. Math., vol. 144, Birkhäuser, Basel, 1996, pp. 79–323. [24] Harish-Chandra, Representations of a semisimple Lie group on a Banach space. I, Trans. Amer. Math. Soc. 75 (1953) 185–243. [25] E. Hewitt, K.A. Ross, Abstract Harmonic Analysis, vol. II: Structure and Analysis for Compact Groups. 
Analysis on Locally Compact Abelian Groups, Grundlehren Math. Wiss., Band 152, Springer-Verlag, New York, Berlin, 1970. [26] E. Hewitt, K.A. Ross, Abstract Harmonic Analysis, vol. I. Structure of Topological Groups, Integration Theory, Group Representations, second ed., Grundlehren Math. Wiss., vol. 115, Springer-Verlag, Berlin, New York, 1979. [27] H. Heyer, Dualität lokalkompakter Gruppen, Lecture Notes in Math., vol. 150, Springer-Verlag, Berlin, New York, 1970. [28] L. Hörmander, Hypoelliptic second order differential equations, Acta Math. 119 (1967) 147–171. [29] A. Hulanicki, The distribution of energy in the Brownian motion in the Gaussian field and analytic-hypoellipticity of certain subelliptic operators on the Heisenberg group, Studia Math. 56 (2) (1976) 165–173.



[30] D. Jerison, A. Sanchez-Calle, Estimates for the heat kernel for the sum of squares of vector fields, Indiana Univ. Math. J. 35 (1986) 835–854. [31] D. Jerison, A. Sanchez-Calle, Subelliptic, second order differential operators, in: Complex Analysis, III, in: Lecture Notes in Math., vol. 1277, Springer-Verlag, Berlin, 1987, pp. 46–77. [32] A.A. Kirillov, Elements of the Theory of Representations, Grundlehren Math. Wiss., Band 220, Springer-Verlag, Berlin, New York, 1976. [33] A.A. Kirillov, Lectures on the Orbit Method, Grad. Stud. Math., vol. 64, American Mathematical Society, 2004. [34] A. Klingler, New derivation of the Heisenberg kernel, Comm. Partial Differential Equations 22 (11–12) (1997) 2051–2060. [35] R. Léandre, Minoration en temps petit de la densité d’une diffusion dégénérée, J. Funct. Anal. 74 (1987) 399–414. [36] J. Mitchell, On Carnot–Carathéodory metrics, J. Differential Geom. 21 (1) (1985) 35–45. [37] I. Moiseev, Yu.L. Sachkov, Maxwell strata in sub-Riemannian problem on the group of motions of a plane, arXiv: 0807.4731v1. [38] R. Montgomery, A Tour of Sub-Riemannian Geometries, Their Geodesics and Applications, Math. Surveys Monogr., vol. 91, Amer. Math. Soc., Providence, RI, 2002. [39] R. Neel, D. Stroock, Analysis of the cut locus via the heat kernel, in: Surveys in Differential Geometry, IX, International Press, Somerville, MA, 2004, pp. 337–349. [40] P. Pansu, Métriques de Carnot–Carathéodory et quasiisométries des espaces symétriques de rang un, Ann. of Math. (2) 129 (1) (1989) 1–60. [41] J. Petitot, Vers une Neuro-géométrie. Fibrations corticales, structures de contact et contours subjectifs modaux, in: Numéro spécial de Mathématiques, in: Inform. Sci. Humaines, vol. 145, EHESS, Paris, 1999, pp. 5–101. [42] L.P. Rothschild, E.M. Stein, Hypoelliptic differential operators and nilpotent groups, Acta Math. 137 (3–4) (1976) 247–320. [43] M. 
Sugiura, Unitary Representations and Harmonic Analysis: An Introduction, second ed., North-Holland Math. Library, vol. 44, North-Holland Publishing, Amsterdam, 1990. [44] M.E. Taylor, Noncommutative Harmonic Analysis, Math. Surveys Monogr., vol. 22, American Mathematical Society, Providence, RI, 1986. [45] M.E. Taylor, Partial Differential Equations. I. Basic Theory, Appl. Math. Sci., vol. 115, Springer-Verlag, New York, 1996. [46] T.J.S. Taylor, Off diagonal asymptotics of hypoelliptic diffusion equations and singular Riemannian geometry, Pacific J. Math. 136 (1989) 379–399. [47] N. Varopoulos, L. Saloff-Coste, T. Coulhon, Analysis and Geometry on Groups, Cambridge Tracts in Math., vol. 100, Cambridge University Press, Cambridge, 1992.


Continuous model for homopolymers

M. Cranston a,∗, L. Koralov b, S. Molchanov c, B. Vainberg c

a Department of Mathematics, University of California, Irvine, CA 92697, USA
b Department of Mathematics, University of Maryland, College Park, MD 20742, USA
c Department of Mathematics, University of North Carolina, Charlotte, NC 28223, USA

Received 10 July 2008; accepted 14 July 2008 Available online 28 August 2008 Communicated by Paul Malliavin

Abstract We consider the model for the distribution of a long homopolymer in a potential field. The typical shape of the polymer depends on the temperature parameter. We show that at a critical value of the temperature the transition occurs from a globular to an extended phase. For various values of the temperature, including those at or near the critical value, we consider the limiting behavior of the polymer when its size tends to infinity. © 2008 Elsevier Inc. All rights reserved. Keywords: Gibbs measure; Phase transition; Homopolymer

1. Introduction

The goal of this paper is to analyze various critical phenomena for a model of long homogeneous polymer chains in an attracting potential field. The model exhibited here demonstrates a phase transition from a densely packed globular phase at low temperatures to an extended phase at higher temperatures. In the latter phase, the thermal fluctuations overcome the attraction between monomers and the chain takes on the shape of a 3d random walk or Brownian motion with a typical scale $O(\sqrt{T})$, where $T$ is the length of the polymer. A real-life example of this phenomenon is that of albumen (egg white). We describe a rough picture of this situation.

* Corresponding author.

E-mail addresses: [email protected] (M. Cranston), [email protected] (L. Koralov), [email protected] (S. Molchanov), [email protected] (B. Vainberg). 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.07.019

M. Cranston et al. / Journal of Functional Analysis 256 (2009) 2656–2696


The physical reality is more complex, as several types of protein with different critical points are present. In a simplified version, however, at room temperature the albumen is in the globular state and, as a result, it forms a viscous, translucent liquid. At higher temperatures (around 60–65 °C) there is a transition of the albumen to a diffusive (extended) state, resulting in an opaque semi-solid material. While this transition may be reversible for an individual polymer, in the aggregate the polymer strands in the diffusive state become interwoven, form chemical bonds with each other, and cannot return to the globular state when the temperature is decreased.

It is worthwhile recalling Gibbs' philosophy of phase transitions. Start with a system of finite size $T$. The configuration space $\Sigma_T = \{x(\cdot)\}$ denotes all possible states $x(\cdot)$ of the system. The space $\Sigma_T$ is equipped with a reference measure $P_{0,T}$ which corresponds to infinite absolute temperature (in our case, the inverse temperature $\beta = 0$). The configurations satisfy boundary conditions which reflect the interaction of the finite system with its environment. This system is endowed with a Hamiltonian $H_T$ giving the energy $H_T(x)$ of the state $x$. For $\beta > 0$, the Gibbs measure $P_{\beta,T}$ is given by the density

$$\frac{dP_{\beta,T}}{dP_{0,T}}(x) = \frac{\exp(-\beta H_T(x))}{Z_{\beta,T}}, \tag{1}$$

where

$$Z_{\beta,T} = \int_{\Sigma_T} \exp\bigl(-\beta H_T(x)\bigr)\, dP_{0,T}. \tag{2}$$
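As an illustration of (1) and (2), the following minimal Python sketch builds a Gibbs measure on a toy configuration space with three states. The state names, reference probabilities and energies are hypothetical values chosen only for the example; they are not part of the model studied in this paper.

```python
import math

def gibbs_measure(reference, energy, beta):
    """Return (gibbs, Z) for the density exp(-beta * H) / Z with respect
    to a finite reference measure, as in (1) and (2)."""
    # reference: dict state -> P_0 probability; energy: dict state -> H(state)
    unnormalized = {s: math.exp(-beta * energy[s]) * p
                    for s, p in reference.items()}
    Z = sum(unnormalized.values())                        # analogue of (2)
    gibbs = {s: w / Z for s, w in unnormalized.items()}   # analogue of (1)
    return gibbs, Z

# Hypothetical toy data: three states with a reference measure and energies.
reference = {"a": 0.5, "b": 0.3, "c": 0.2}
energy = {"a": 0.0, "b": -1.0, "c": 2.0}

gibbs0, Z0 = gibbs_measure(reference, energy, beta=0.0)  # infinite temperature
gibbs1, Z1 = gibbs_measure(reference, energy, beta=1.0)
```

At $\beta = 0$ the sketch returns the reference measure itself, while for $\beta > 0$ the mass shifts toward states of lower energy.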

When $T < \infty$, the measure $P_{\beta,T}$ and the thermodynamic quantities associated to $P_{\beta,T}$ are analytic functions of $\beta$. Now let $T \to \infty$. In typical situations, there is a critical value $\beta_{cr}$ such that for $\beta > \beta_{cr}$, there exists a unique limiting measure $P_\beta$ on $\Sigma$, the space of infinite configurations, and this limiting measure is independent of the boundary conditions on $\Sigma_T$. Moreover, $P_\beta$ and its relevant thermodynamic quantities are still analytic functions of $\beta$ for $\beta > \beta_{cr}$. One manifestation of the phase transition is the non-uniqueness for $\beta < \beta_{cr}$ of the limiting measure as $T \to \infty$, as it depends on the boundary conditions on $\Sigma_T$. Another is the non-analyticity of thermodynamic quantities associated to $P_\beta$ as a function of $\beta$. The mathematical characterization of the phase transition in terms of non-uniqueness of the limiting Gibbs measure traces its history to the works of Dobrushin [4] and Ruelle [9].

Modern physical theories predict that near the critical point $\beta = \beta_{cr}$ the limiting Gibbs measure $P_\beta$ must be invariant with respect to renormalizations of the system (self-similarity). This idea is related to the two-parametric scaling by Fisher [5] for $\beta$ near $\beta_{cr}$. Another important fact is that critical behavior as $\beta \to \beta_{cr}$ of the physical system demonstrates universality, that is, the same behavior holds for a wide class of Hamiltonians. The most essential part of the present paper is the detailed description of the polymer chain near the critical point and the establishment of the physical ideas of universality and self-similarity for our particular model of homopolymers.

2. Description of the model and results

A continuous function $x : [0, T] \to \mathbb{R}^d$, $x(0) = 0$, will be thought of as a realization of the polymer. The parameter $t \in [0, T]$ can be intuitively understood as the length along the polymer



(although the functions $x = x(t)$ are not differentiable and the genuine notion of length cannot be defined). We assume that for $\beta = 0$, the polymer is distributed according to the Wiener measure $P_{0,T}$ on $\Sigma_T = C([0, T], \mathbb{R}^d)$. For an infinitely smooth compactly supported potential $v \in C_0^\infty(\mathbb{R}^d)$ and a coupling constant $\beta \geq 0$, the polymer is distributed according to the Gibbs measure $P_{\beta,T}$, whose density with respect to $P_{0,T}$ is

$$\frac{dP_{\beta,T}}{dP_{0,T}}(x) = \frac{\exp\bigl(\beta \int_0^T v(x(t))\,dt\bigr)}{Z_{\beta,T}}, \qquad x \in C([0, T], \mathbb{R}^d). \tag{3}$$

In other words, the Hamiltonian $H_T$ is given by $H_T = -\int_0^T v(x(s))\,ds$. The normalizing factor $Z_{\beta,T}$, called the partition function, is given by

$$Z_{\beta,T} = \int_{C([0,T],\mathbb{R}^d)} \exp\Bigl(\beta \int_0^T v(x(t))\,dt\Bigr)\, dP_{0,T}(x) = E_{0,T}\, e^{-\beta H_T}. \tag{4}$$
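The partition function (4) is an expectation over Brownian paths, so it can be approximated by simulation. The sketch below is a crude Monte Carlo estimate under stated assumptions: the path is discretized with Gaussian increments, the time integral is replaced by a left-endpoint Riemann sum, and `bump` is a hypothetical choice of smooth compactly supported potential. None of these choices come from the paper.

```python
import math
import random

def bump(x):
    """Hypothetical potential: smooth, non-negative, supported in |x| < 1."""
    r2 = sum(xi * xi for xi in x)
    return math.exp(-1.0 / (1.0 - r2)) if r2 < 1.0 else 0.0

def estimate_partition_function(beta, T, v, n_paths=2000, n_steps=200,
                                d=3, seed=0):
    """Monte Carlo estimate of Z_{beta,T} = E exp(beta * int_0^T v(x(t)) dt)
    for Brownian motion in R^d started at the origin."""
    rng = random.Random(seed)
    dt = T / n_steps
    sigma = math.sqrt(dt)            # standard deviation of one increment
    total = 0.0
    for _ in range(n_paths):
        x = [0.0] * d
        integral = 0.0
        for _ in range(n_steps):
            integral += v(x) * dt    # left-endpoint Riemann sum
            x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        total += math.exp(beta * integral)
    return total / n_paths
```

For $\beta = 0$ the estimate equals 1, and for a non-negative potential it lies between 1 and $\exp(\beta T \max v)$, which gives a quick sanity check on any implementation.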

It will usually be assumed that the potential is non-negative and not identically equal to zero. We shall be interested in the prevalent behavior of the polymer with respect to the measure $P_{\beta,T}$ as $T \to \infty$.

More realistic models would include pairwise interaction of monomers and interaction with an external field (here the latter is represented by $\beta \int_0^T v(x(t))\,dt$). Pairwise interaction would be modeled by introduction of a term in the exponential in (3) of the form $\gamma \int_0^t \int_0^t f(x(s) - x(u))\,ds\,du$, where $f$ is a non-constant function, compactly supported in a neighborhood of the origin, and $\gamma > 0$. Self-repulsion would then be modeled by the requirement $f \leq 0$, while self-attraction would occur in the case $f \geq 0$. Models involving self-interactions are highly complex and phase transitions are difficult to establish. We will discuss only the mean field model and exclude self-interaction. Under this simplifying assumption, we will give a complete physical description.

We shall see that there are two qualitatively different cases corresponding to different values of $\beta$. Namely, for all sufficiently large values of $\beta$ there is a limiting distribution for $x(T)$ with respect to $P_{\beta,T}$. Moreover, for each positive constant $s$ and each function $S(T)$ such that $S(T) \to \infty$ and $T - S(T) \to \infty$ as $T \to \infty$, the family of processes $x(S(T) + t)$, $t \in [0, s]$, with respect to either measure $P_{\beta,T}$ or $P_{\beta,T}(\cdot \mid x(T) = 0)$, converges to a Markov process as $T \to \infty$. The generator of the limiting Markov process and its invariant measure are written out explicitly in Theorem 8.3. Since $x(S(T))$ and $x(T)$ converge to limiting distributions and thus typically remain bounded as $T \to \infty$, we shall say that the polymer is in the globular state.

If $\beta > 0$ is sufficiently small and $d \geq 3$, then the family of processes $x(tT)/\sqrt{T}$, $0 \leq t \leq 1$, defined on $(C([0, T], \mathbb{R}^d), P_{\beta,T})$, converges to a Brownian motion on the interval $[0, 1]$ (Theorem 9.2). In this case we shall say that the polymer is in the diffusive state. Similarly, the family of processes $x(tT)/\sqrt{T}$, $0 \leq t \leq 1$, defined on $(C([0, T], \mathbb{R}^d), P_{\beta,T}(\cdot \mid x(T) = 0))$, converges to a Brownian bridge on the interval $[0, 1]$.

We shall see that there is a number $\beta_{cr}$ (called the critical value of the coupling constant) such that the polymer is in the diffusive state for $\beta < \beta_{cr}$ and in the globular state for $\beta > \beta_{cr}$. The value of $\beta_{cr}$ and the behavior of the polymer when $\beta$ is near $\beta_{cr}$ depend on the dimension $d$ and on the potential. In particular, we shall see that $\beta_{cr} = 0$ for $d = 1, 2$ and $\beta_{cr} > 0$ for $d \geq 3$.



Of particular interest is the behavior of the polymer when $\beta = \beta_{cr}$. In this case the appropriate scaling is the same as in the diffusive case, that is, we study the family of processes $x(tT)/\sqrt{T}$, $0 \leq t \leq 1$. We shall find the limit of this family as $T \to \infty$. It turns out to be a Markov process with a non-Gaussian, spherically symmetric transition function (Theorem 10.6). The transition function of the limiting Markov process will be written out explicitly.

A remarkable fact is the connection between this limiting process and a process derived at a critical value of a 0-range potential for $d = 3$ in [3]. In that paper we considered a one-parameter family (with parameter $\gamma$) of self-adjoint extensions of the Laplacian with domain $C_c^\infty(\mathbb{R}^3 \setminus \{0\})$. This family of operators, introduced by Bethe and Peierls as a model for a diplon, is now well understood. An excellent exposition can be found in [1]. There exist non-trivial, closed self-adjoint extensions only for $d \leq 3$. The corresponding theory of polymer (Gibbs) measures based on these Hamiltonians is interesting only in $d = 3$ since for $d = 1, 2$ the polymer exhibits no phase transition and is always in the globular phase. The heat kernel given at (79) is identical with the heat kernel for the $d = 3$, 0-range self-adjoint extension of the Laplacian at the critical parameter value $\gamma = 0$. As an aside we repeat that there are no closed, self-adjoint extensions of the Laplacian with domain $C_c^\infty(\mathbb{R}^d \setminus \{0\})$, other than the Laplacian itself, when $d \geq 4$.

In order to determine whether the polymer is in the globular or diffusive state for a given $\beta$, we shall look at the rate of growth of the partition function $Z_{\beta,T}$. Namely, let

$$\lambda_0(\beta) = \lim_{T \to \infty} \frac{\ln Z_{\beta,T}}{T}.$$

It will be demonstrated that the limit exists and is equal to the supremum of the spectrum of the operator $H_\beta = \frac{1}{2}\Delta + \beta v : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$. The infimum of the set of $\beta$ for which $\lambda_0(\beta) > 0$ is equal to $\beta_{cr}$. It will be seen that $\lambda_0(\beta_{cr}) = 0$ is an eigenvalue of $H_{\beta_{cr}}$ in dimensions $d \geq 5$, and corresponds to a ground state of $H_{\beta_{cr}}$ in dimensions $d = 3, 4$.

The paper is organized as follows. In Section 3 we consider finite $T$ and show that $\{x(t),\ 0 \leq t \leq T\}$ is a time-inhomogeneous Markov process with respect to the measures $P_{\beta,T}$ and $P_{\beta,T}(\cdot \mid x(T) = 0)$. In Section 4 we prove the existence of the critical value of the coupling constant. In Section 5 we analyze the properties of the resolvent of the operator $H_\beta$ which, in particular, will be needed to study the asymptotic properties of the partition function. In Section 6 we shall examine the asymptotics of $\lambda_0(\beta)$ when $\beta \downarrow \beta_{cr}$ and show that

$$\lambda_0(\beta) \sim \begin{cases} c_3 (\beta - \beta_{cr})^2, & d = 3, \\ c_4 (\beta - \beta_{cr}) / \ln\bigl(1/(\beta - \beta_{cr})\bigr), & d = 4, \\ c_d (\beta - \beta_{cr}), & d \geq 5. \end{cases}$$

These asymptotics demonstrate universality in that they depend only on dimension. The constants $c_d$, $d \geq 3$, are not universal, however. In Section 7 we find the asymptotics, as $T \to \infty$, of $Z_{\beta,T}$. In particular, when $\beta > \beta_{cr}$, we shall find that $Z_{\beta,T} \sim k_\beta e^{\lambda_0(\beta) T}$ for some constant $k_\beta$, while for $\beta < \beta_{cr}$, $Z_{\beta,T}$ has a finite limit as $T \to \infty$. Finally, when $\beta = \beta_{cr}$, it turns out that $Z_{\beta,T} \sim k_3 T^{1/2}$ for $d = 3$, $Z_{\beta,T} \sim k_4 T / \ln T$ for $d = 4$, while $Z_{\beta,T} \sim k_d T$ for $d \geq 5$. We also give asymptotics of the solutions to the parabolic equation $\partial u / \partial t = H_\beta u$. In Sections 8–10, we describe the behavior of the polymer for $\beta > \beta_{cr}$, $\beta < \beta_{cr}$ and $\beta = \beta_{cr}$, respectively, establishing the convergence results mentioned above.



Some of the results presented above have been obtained by Cranston and Molchanov in [2] for the discrete model with the potential concentrated at one point. The analysis was based on explicit formulas for the solution of the parabolic equation with such a potential. The current results demonstrate that the behavior of the polymer is "universal" with respect to the choice of the potential. Another essential feature of this paper is the detailed analysis of the behavior of the polymer when $\beta = \beta_{cr}$. We refer the reader to the review of Lifschitz, Grosberg and Khokhlov [7] for a wealth of information and ideas on polymer chains.

3. Time-inhomogeneous Markov property

First we define $p_\beta$ as the fundamental solution of the heat equation

$$\frac{\partial p_\beta}{\partial t}(t, y, x) = \frac{1}{2}\Delta_x p_\beta(t, y, x) + \beta v(x) p_\beta(t, y, x), \qquad p_\beta(0, y, x) = \delta(x - y). \tag{5}$$

In this section we shall prove that with respect to the measure $P_{\beta,T}$, the process $\{x(t),\ 0 \leq t \leq T\}$ is a time-inhomogeneous Markov process. Since we shall point out the link between non-uniqueness of Gibbs measures and phase transitions, it will be necessary to also consider the transition mechanism for the process $\{x(t),\ 0 \leq t \leq T\}$ under the conditional measure $P_{\beta,T}(\cdot \mid x(T) = 0)$. Namely, we will show that the free boundary condition corresponding to the measure $P_{\beta,T}$ and the pinned boundary condition corresponding to the measure $P_{\beta,T}(\cdot \mid x(T) = 0)$ lead to different Gibbs measures in the limit.

Let $Z_{\beta,t}(x) = E_x \exp\bigl(\beta \int_0^t v(x_s)\,ds\bigr)$, where $E_x$ is the expectation with respect to the measure induced by the Brownian motion starting at $x$. Thus $Z_{\beta,t}(0) = Z_{\beta,t}$, where $Z_{\beta,t}$ is the partition function introduced in the previous section.

Theorem 3.1. The process $\{x(t),\ 0 \leq t \leq T\}$ is a time-inhomogeneous Markov process with respect to the measure $P_{\beta,T}$. Its transition density is given by

$$q_\beta^T\bigl((s, y), (t, x)\bigr) = p_\beta(t - s, y, x)\, Z_{\beta,T-t}(x)\, \bigl(Z_{\beta,T-s}(y)\bigr)^{-1}. \tag{6}$$

The transition density $q_\beta^T((s, y), (t, x))$ solves the parabolic equation

$$\frac{\partial}{\partial s} q_\beta^T\bigl((s, y), (t, x)\bigr) + \frac{1}{2}\Delta_y q_\beta^T\bigl((s, y), (t, x)\bigr) + \nabla_y \ln Z_{\beta,T-s}(y) \cdot \nabla_y q_\beta^T\bigl((s, y), (t, x)\bigr) = 0. \tag{7}$$

With respect to the conditional measure $P_{\beta,T}(\cdot \mid x(T) = 0)$, the process $\{x(t),\ 0 \leq t \leq T\}$ is a time-inhomogeneous Markov process with transition density

$$q_\beta^{(T,0)}\bigl((s, y), (t, x)\bigr) = p_\beta(t - s, y, x)\, p_\beta(T - t, x, 0)\, \bigl(p_\beta(T - s, y, 0)\bigr)^{-1}. \tag{8}$$

While this result is not used directly in later sections, it provides some intuition on the nature of the limiting processes when we consider the limit T → ∞.
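The structure of the transition density in Theorem 3.1 can be seen in a finite-state, discrete-time analogue. In the following sketch (an illustration only, not the continuous model of the paper) a Markov chain with transition matrix `P` replaces Brownian motion, and the Gibbs weight is $\exp(\beta \sum_j v(X_j))$; the chain, the potential `v` and the value of `beta` are hypothetical. The tilted one-step probabilities have the same product form as the transition density of the theorem, with the free kernel weighted by a ratio of partition functions, and they depend on the number of steps remaining.

```python
import math

def partition_functions(P, v, beta, n_steps):
    """Z[k][y] = E_y exp(beta * sum_{j<k} v(X_j)) for the chain with matrix P."""
    n = len(P)
    Z = [[1.0] * n]                                   # Z_0 = 1
    for _ in range(n_steps):
        prev = Z[-1]
        Z.append([math.exp(beta * v[y]) *
                  sum(P[y][x] * prev[x] for x in range(n))
                  for y in range(n)])
    return Z

def transition_matrix(P, v, beta, Z, remaining):
    """One-step transition probabilities of the Gibbs-tilted chain when
    `remaining` >= 1 steps are left before the end of the path."""
    n = len(P)
    return [[math.exp(beta * v[y]) * P[y][x] * Z[remaining - 1][x]
             / Z[remaining][y]
             for x in range(n)] for y in range(n)]

# Hypothetical three-state example with an attractive site at state 0.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
v = [1.0, 0.0, 0.0]
beta = 0.7

Z = partition_functions(P, v, beta, n_steps=5)
Q1 = transition_matrix(P, v, beta, Z, remaining=1)   # last step of the path
Q5 = transition_matrix(P, v, beta, Z, remaining=5)   # five steps before the end
```

Each row of `Q1` and `Q5` sums to one by construction, and the two matrices differ, which is the discrete counterpart of time-inhomogeneity.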


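Proof. The Feynman–Kac formula gives that for $0 < t \leq T$,

$$P_{\beta,T}\bigl(x(t) \in dx\bigr) = \frac{p_\beta(t, 0, x)\, E_x \exp\bigl(\beta \int_0^{T-t} v(x_s)\,ds\bigr)}{Z_{\beta,T}}\, dx. \tag{9}$$

Similarly, for $0 = t_0 < t_1 < t_2 < \cdots < t_n \leq T$ and $x_0 = 0$,

$$P_{\beta,T}\bigl(x(t_1) \in dx_1, \ldots, x(t_n) \in dx_n\bigr) = \frac{\prod_{i=0}^{n-1} p_\beta(t_{i+1} - t_i, x_i, x_{i+1})\; E_{x_n} \exp\bigl(\beta \int_0^{T-t_n} v(x_s)\,ds\bigr)}{Z_{\beta,T}}\, dx_1\, dx_2 \ldots dx_n.$$

So, if we set for $0 \leq s < t \leq T$,

$$q_\beta^T\bigl((s, y), (t, x)\bigr) = p_\beta(t - s, y, x)\, Z_{\beta,T-t}(x)\, \bigl(Z_{\beta,T-s}(y)\bigr)^{-1},$$

then

$$P_{\beta,T}\bigl(x(t_1) \in dx_1, \ldots, x(t_n) \in dx_n\bigr) = \prod_{i=0}^{n-1} q_\beta^T\bigl((t_i, x_i), (t_{i+1}, x_{i+1})\bigr)\, dx_1 \ldots dx_n.$$

Since $q_\beta^T((s, y), (t, x)) > 0$ and

$$\int_{\mathbb{R}^d} q_\beta^T\bigl((s, y), (t, x)\bigr)\, dx = 1,$$

this means that $\{x(t),\ 0 \leq t \leq T\}$ under the measure $P_{\beta,T}$ is a time-inhomogeneous Markov process with transition probabilities $q_\beta^T$. Turning the equation for $q_\beta^T$ around and solving for $p_\beta$ yields

$$p_\beta(t - s, y, x) = \frac{q_\beta^T\bigl((s, y), (t, x)\bigr)\, Z_{\beta,T-s}(y)}{Z_{\beta,T-t}(x)}.$$

Using the fact that

$$\frac{\partial}{\partial s} p_\beta(t - s, y, x) + \frac{1}{2}\Delta_y p_\beta(t - s, y, x) + \beta v(y) p_\beta(t - s, y, x) = 0,$$

we derive that $q_\beta^T$ satisfies the equation

$$\begin{aligned} &\frac{Z_{\beta,T-s}(y)}{Z_{\beta,T-t}(x)} \frac{\partial}{\partial s} q_\beta^T\bigl((s, y), (t, x)\bigr) + q_\beta^T\bigl((s, y), (t, x)\bigr) \frac{\frac{\partial}{\partial s} Z_{\beta,T-s}(y)}{Z_{\beta,T-t}(x)} + \beta v(y)\, q_\beta^T\bigl((s, y), (t, x)\bigr) \frac{Z_{\beta,T-s}(y)}{Z_{\beta,T-t}(x)} \\ &\quad + \frac{1}{2} \frac{Z_{\beta,T-s}(y)}{Z_{\beta,T-t}(x)} \Delta_y q_\beta^T\bigl((s, y), (t, x)\bigr) + \frac{1}{2} \frac{q_\beta^T\bigl((s, y), (t, x)\bigr)\, \Delta_y Z_{\beta,T-s}(y)}{Z_{\beta,T-t}(x)} + \frac{\nabla_y Z_{\beta,T-s}(y)}{Z_{\beta,T-t}(x)} \cdot \nabla_y q_\beta^T\bigl((s, y), (t, x)\bigr) = 0. \end{aligned} \tag{10}$$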
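The simplification of (10) rests on one fact that the text leaves implicit: by the Feynman–Kac formula, $u(t, y) = Z_{\beta,t}(y)$ solves $\partial u / \partial t = \frac{1}{2}\Delta u + \beta v u$ with $u(0, \cdot) = 1$. A sketch of the resulting cancellation, writing $q = q_\beta^T((s, y), (t, x))$ and $Z = Z_{\beta,T-s}(y)$ as shorthand (these abbreviations are introduced here only for readability):

```latex
% Backward equation for Z = Z_{\beta,T-s}(y), from Feynman--Kac:
\frac{\partial Z}{\partial s} + \frac{1}{2}\Delta_y Z + \beta v(y) Z = 0 .

% The three terms of (10) that carry q without derivatives therefore cancel:
\frac{q}{Z_{\beta,T-t}(x)}
  \Bigl( \frac{\partial Z}{\partial s} + \beta v(y) Z + \frac{1}{2}\Delta_y Z \Bigr) = 0 .

% What remains, after multiplying by Z_{\beta,T-t}(x)/Z, is
\frac{\partial q}{\partial s} + \frac{1}{2}\Delta_y q
  + \frac{\nabla_y Z}{Z} \cdot \nabla_y q = 0 ,
\qquad \frac{\nabla_y Z}{Z} = \nabla_y \ln Z .
```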



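Simplifying this leads to the following parabolic equation for $q_\beta^T$,

$$\frac{\partial}{\partial s} q_\beta^T\bigl((s, y), (t, x)\bigr) + \frac{1}{2}\Delta_y q_\beta^T\bigl((s, y), (t, x)\bigr) + \nabla_y \ln Z_{\beta,T-s}(y) \cdot \nabla_y q_\beta^T\bigl((s, y), (t, x)\bigr) = 0. \tag{11}$$

Next we consider the pinned case, for $0 = t_0 < t_1 < \cdots < t_n < t_{n+1} = T$ and $x_0 = x_{n+1} = 0$. Then,

$$\begin{aligned} P_{\beta,T}\bigl(x(t_1) \in dx_1, \ldots, x(t_n) \in dx_n \mid x(T) = 0\bigr) &= \frac{P_{\beta,T}\bigl(x(t_1) \in dx_1, \ldots, x(t_n) \in dx_n,\ x(T) = 0\bigr)}{P_{\beta,T}\bigl(x(T) = 0\bigr)} \\ &= \frac{\prod_{i=0}^{n} p_\beta(t_{i+1} - t_i, x_i, x_{i+1})}{p_\beta(T, 0, 0)}\, dx_1 \ldots dx_n. \end{aligned} \tag{12}$$

Now set for $0 \leq s < t \leq T$,

$$q_\beta^{(T,0)}\bigl((s, y), (t, x)\bigr) = p_\beta(t - s, y, x)\, p_\beta(T - t, x, 0)\, \bigl(p_\beta(T - s, y, 0)\bigr)^{-1}. \tag{13}$$

Then

$$P_{\beta,T}\bigl(x(t_1) \in dx_1, \ldots, x(t_n) \in dx_n \mid x(T) = 0\bigr) = \prod_{i=0}^{n-1} q_\beta^{(T,0)}\bigl((t_i, x_i), (t_{i+1}, x_{i+1})\bigr)\, dx_1 \ldots dx_n. \tag{14}$$

Since $q_\beta^{(T,0)}((s, y), (t, x)) > 0$ and

$$\int_{\mathbb{R}^d} q_\beta^{(T,0)}\bigl((s, y), (t, x)\bigr)\, dx = 1,$$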
this means that $\{x(t),\ 0 \leq t \leq T\}$ under the conditional measure $P_{\beta,T}(\cdot \mid x(T) = 0)$ is a time-inhomogeneous Markov process with transition densities $q_\beta^{(T,0)}$. □

We shall see below that in the globular phase $\beta > \beta_{cr}$ the drift term $\nabla_x \ln Z_{\beta,T-s}(x)$ has a non-trivial limit as $T \to \infty$. This means that for $\beta > \beta_{cr}$, the Gibbs measure corresponds to a stationary Markov process in the $T \to \infty$ limit. On the other hand, this limit will vanish for $\beta < \beta_{cr}$. This explains the nature of the diffusive state for high temperature.

4. Critical value of the coupling constant

Let

$$H_\beta = \frac{1}{2}\Delta + \beta v : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d), \qquad v = v(x) \in C_0^\infty(\mathbb{R}^d),\ \beta \geq 0.$$

We shall always assume that $v(x)$ is non-negative and compactly supported, although many results do not require these restrictions or can be modified to be valid without these restrictions. We shall also assume that $v$ is not identically equal to zero. It is well known that the spectrum of $H_\beta$ consists of the absolutely continuous part $(-\infty, 0]$ and at most a finite number of non-negative eigenvalues:

$$\sigma(H_\beta) = (-\infty, 0] \cup \{\lambda_j\}, \qquad 0 \leq j \leq N,\ \lambda_j = \lambda_j(\beta) \geq 0.$$

We enumerate the eigenvalues in decreasing order. Thus, if $\{\lambda_j\} \neq \emptyset$, then $\lambda_0 = \max \lambda_j$.

Lemma 4.1. There exists $\beta_{cr} \geq 0$ (which will be called the critical value of $\beta$) such that $\sup \sigma(H_\beta) = 0$ for $\beta \leq \beta_{cr}$ and $\sup \sigma(H_\beta) = \lambda_0(\beta) > 0$ for $\beta > \beta_{cr}$. For $\beta > \beta_{cr}$ the eigenvalue $\lambda_0(\beta)$ is a strictly increasing and continuous function of $\beta$. Moreover, $\lim_{\beta \downarrow \beta_{cr}} \lambda_0(\beta) = 0$ and $\lim_{\beta \uparrow \infty} \lambda_0(\beta) = \infty$.

Proof. The form $(H_\beta \psi, \psi)$ is positive on a function $\psi$ supported on $\mathrm{supp}(v)$ if $\beta$ is large enough. Thus $\sup \sigma(H_\beta) > 0$ for sufficiently large $\beta$. On the other hand, $\sigma(H_\beta) = (-\infty, 0]$ when $\beta = 0$. Let $\beta_{cr} = \sup\{\beta : \sup \sigma(H_\beta) = 0\}$. It is clear that $\sup \sigma(H_\beta) = 0$ for $\beta < \beta_{cr}$ since the operator $H_\beta$ depends monotonically on $\beta$. Other statements easily follow from the fact that for each $\psi$ the form $(H_\beta \psi, \psi)$ depends continuously and monotonically on $\beta$. □

Remark. As will be shown below, $\beta_{cr} = 0$ for $d = 1, 2$, and $\beta_{cr} > 0$ for $d \geq 3$. Thus we do not talk about phase transition for $d = 1, 2$ since we do not consider negative values of $\beta$. For $d \geq 3$, by the Cwikel–Lieb–Rozenblum estimate [8],

$$\#\{\lambda_i(\beta) \geq 0\} \leq c_d\, \beta^{d/2} \int_{\mathbb{R}^d} |v(x)|^{d/2}\, dx.$$

This implies that there are no eigenvalues for sufficiently small values of $\beta$ if $d \geq 3$, that is, $\beta_{cr} > 0$. It is also well known (see [8]) that $\sup \sigma(H_\beta) > 0$ for $d = 1, 2$ if $\beta > 0$, $v \geq 0$ and $v$ is not identically zero. These statements will also be proved below without referring to the Cwikel–Lieb–Rozenblum estimate.

5. Analytic properties of the resolvent

The resolvent of the operator $H_\beta$ will be considered in the spaces of square-integrable and continuous functions. The resolvent $R_\beta(\lambda) = (H_\beta - \lambda)^{-1} : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ is a meromorphic operator-valued function on $\mathbb{C}' = \mathbb{C} \setminus (-\infty, 0]$. Denote the kernel of $R_\beta(\lambda)$ by $R_\beta(\lambda, x, y)$. If $\beta = 0$, the kernel depends on the difference $x - y$ and will be denoted by $R_0(\lambda, x - y)$. The kernel $R_0(\lambda, x)$ can be expressed through the Hankel function $H_\nu^{(1)}$:

$$R_0(1, x) = c\, |x|^{1 - \frac{d}{2}}\, H^{(1)}_{\frac{d}{2} - 1}\bigl(i \sqrt{2}\, |x|\bigr), \tag{15}$$

and

$$R_0(\lambda, x) = c\, k^{d-2}\, \bigl(k|x|\bigr)^{1 - \frac{d}{2}}\, H^{(1)}_{\frac{d}{2} - 1}\bigl(i \sqrt{2}\, k |x|\bigr), \qquad k = \sqrt{\lambda},\ \mathrm{Re}\, k > 0. \tag{16}$$



In particular,

$$R_0(\lambda, x) = \frac{e^{-\sqrt{2} k |x|}}{-\sqrt{2}\, k}, \quad d = 1; \qquad R_0(\lambda, x) = \frac{e^{-\sqrt{2} k |x|}}{-2\pi |x|}, \quad d = 3.$$
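We shall say that $f \in L^2_{\exp}(\mathbb{R}^d)$ if $f$ is measurable and

$$\|f\|_{L^2_{\exp}(\mathbb{R}^d)} = \Bigl( \int_{\mathbb{R}^d} f^2(x)\, e^{|x|^2}\, dx \Bigr)^{\frac{1}{2}} < \infty.$$

Similarly, we shall say that $f \in C_{\exp}(\mathbb{R}^d)$ if $f$ is continuous and

$$\|f\|_{C_{\exp}(\mathbb{R}^d)} = \sup_{x \in \mathbb{R}^d} \bigl| f(x) \bigr|\, e^{|x|^2} < \infty.$$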

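Note that $R_0(\lambda)$, $\lambda \in \mathbb{C}'$, is a bounded operator not only in $L^2(\mathbb{R}^d)$ but also from $C_{\exp}(\mathbb{R}^d)$ to $C(\mathbb{R}^d)$, where $C(\mathbb{R}^d)$ is the space of bounded continuous functions on $\mathbb{R}^d$. Denote

$$A(\lambda) = v(x) R_0(\lambda) : L^2_{\exp}(\mathbb{R}^d) \to L^2_{\exp}(\mathbb{R}^d) \quad \text{and} \quad C_{\exp}(\mathbb{R}^d) \to C_{\exp}(\mathbb{R}^d). \tag{17}$$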

The well-known properties of the Hankel functions together with (15) and (16) imply the following lemma (see [10] for a similar statement for general elliptic operators).

Lemma 5.1. Consider the operator $A(\lambda)$ in the spaces $L^2_{\exp}(\mathbb{R}^d)$ and $C_{\exp}(\mathbb{R}^d)$.

(1) The operator $A(\lambda)$ is analytic in $\lambda \in \mathbb{C}'$. It admits an analytic extension as an entire function of $\sqrt{\lambda}$ if $d$ is odd, except $d = 1$, when it has a pole (with respect to $\sqrt{\lambda}$) at the origin. The operator $A(\lambda)$ has the form $A(\lambda) = A_1(\lambda) + \ln \lambda\, A_2(\lambda)$ if $d$ is even, where $A_1$ and $A_2$ are entire functions.

(2) $A_2(0) = 0$ if $d \geq 4$ ($d$ is even), and therefore $A(0) = \lim_{\lambda \to 0,\, \lambda \in \mathbb{C}'} A(\lambda)$ exists and is a bounded operator for all $d \geq 3$.

(3) The operator $A(\lambda)$ is compact for all $\lambda \in \mathbb{C}' \cup \{0\}$ ($\lambda \neq 0$ if $d = 1$ or $2$).

(4) For each $\varepsilon > 0$, we have $\|A(\lambda)\| = O(1/|\lambda|)$ as $\lambda \to \infty$, $|\arg \lambda| \leq \pi - \varepsilon$.

(5) The operator $A(\lambda)$ has the following asymptotic behavior as $\lambda \to 0$, $\lambda \in \mathbb{C}'$:

$$A(\lambda) = \begin{cases} -v P_1 / \sqrt{\lambda} + O(1), & d = 1, \\ -v P_2 \ln(1/\lambda) + O(1), & d = 2, \\ -v (P_3 + Q_3 \sqrt{\lambda}) + O(|\lambda|), & d = 3, \\ -v (P_4 + Q_4 \lambda \ln(1/\lambda)) + O(|\lambda|), & d = 4, \\ -v (P_d + Q_d \lambda) + O(|\lambda|^{3/2}), & d \geq 5, \end{cases}$$

where the operators $P_d$, $d \geq 1$, and $Q_d$, $d \geq 3$, have the following kernels:

$$P_1(x, y) = \frac{1}{\sqrt{2}}, \qquad P_2(x, y) = \frac{1}{\pi},$$

$$P_3(x, y) = \frac{1}{2\pi |x - y|}, \qquad Q_3(x, y) = -\frac{1}{\sqrt{2}\, \pi},$$

$$P_4(x, y) = \frac{1}{\pi^2 |x - y|^2}, \qquad Q_4(x, y) = -\frac{1}{2\pi^2},$$

$$P_d(x, y) = \frac{a_d}{|x - y|^{d-2}}, \qquad Q_d(x, y) = \frac{-a_d}{(d - 4) |x - y|^{d-4}}, \qquad a_d > 0,\ d \geq 5.$$

Proof. Let $d$ be odd. From (15), (16) and (17) it follows that the kernel $A(\lambda, x, y) = v(x) R_0(\lambda, x - y)$ of the operator $A(\lambda)$ is an entire function of $k = \sqrt{\lambda}$ if $d \geq 3$ (but has a pole at $k = 0$ if $d = 1$). The kernel has a weak singularity at $x = y$ and an exponential estimate at infinity. To be more exact,

$$\bigl| A(k^2, x, y) \bigr| + \Bigl| \frac{\partial A(k^2, x, y)}{\partial k} \Bigr| \leq C(d, k)\, |v(x)|\, e^{|k(x - y)|} \bigl( |x - y| + |x - y|^{-(d-1)} \bigr), \tag{18}$$

where $C(d, k)$ has a singularity at $k = 0$ if $d = 1$. Since

$$\bigl\| A(k^2) \bigr\|_{C_{\exp}(\mathbb{R}^d)} \leq \sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} e^{|x|^2 - |y|^2} \bigl| A(k^2, x, y) \bigr|\, dy, \qquad \Bigl\| \frac{d}{dk} A(k^2) \Bigr\|_{C_{\exp}(\mathbb{R}^d)} \leq \sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} e^{|x|^2 - |y|^2} \Bigl| \frac{\partial A(k^2, x, y)}{\partial k} \Bigr|\, dy,$$

the estimate (18) immediately leads to the analyticity in $k = \sqrt{\lambda}$ of the operator $A(\lambda)$ in the space $C_{\exp}(\mathbb{R}^d)$. In order to get the same result in the space $L^2_{\exp}(\mathbb{R}^d)$, we represent $A(\lambda)$ in the form $B_1 + B_2$, where the kernel $B_1(\lambda, x, y)$ of the operator $B_1$ is equal to $\chi(x - y) A(\lambda, x, y)$. Here $\chi$ is the indicator function of the unit ball. Since

$$\bigl| \chi(x) R_0(k^2, x) \bigr| + \Bigl| \frac{d}{dk} \chi(x) R_0(k^2, x) \Bigr| \in L^1(\mathbb{R}^d), \tag{19}$$

the convolution with $\chi(x) R_0(k^2, x)$ is an operator analytic in $k$ in the space $L^2(\mathbb{R}^d)$. Then $B_1$ (which is the convolution followed by multiplication by $v(x)$) is an analytic operator in the space $L^2_{\exp}(\mathbb{R}^d)$. The product of the kernel of the operator $B_2$ and $e^{|x|^2 - |y|^2}$ is square integrable in $(x, y)$. The same is true for the derivative in $k$ of the kernel of $B_2$ multiplied by $e^{|x|^2 - |y|^2}$. Thus $B_2$ is also analytic in $k$. This completes the proof of the analyticity of $A(\lambda)$ when $d$ is odd. The case of even $d$ is similar. One needs only to take into account that $R_0(\lambda, x)$ has a logarithmic branching point at $\lambda = 0$ in this case.

The second statement of the lemma follows immediately from (15), (16) and (17).

To prove the compactness of $A(\lambda)$, we note that the estimate (18) is valid not only for $A(k^2, x, y)$ and $\partial A(k^2, x, y)/\partial k$, but also for $\nabla_x A(k^2, x, y)$. Thus the arguments above lead to the boundedness of the operators $\frac{\partial}{\partial x_i} A(\lambda)$ (the composition of $A(\lambda)$ with the differentiation).


Since the supports of functions $A(\lambda) f$ belong to the support of $v$, the standard Sobolev embedding theorems imply the compactness of the operator $A(\lambda)$ in both the spaces $L^2_{\exp}(\mathbb{R}^d)$ and $C_{\exp}(\mathbb{R}^d)$.

In order to prove the fourth statement of the lemma, we observe that the $L^2(\mathbb{R}^d)$ norm of the resolvent $R_0(\lambda)$ does not exceed $1/|\mathrm{Im}\, \lambda|$ (the inverse distance from the spectrum). Since $A(\lambda)$ is obtained from $R_0(\lambda)$ after multiplying it by a bounded function with compact support, the $L^2_{\exp}(\mathbb{R}^d)$ norm of $A(\lambda)$ does not exceed $c/|\mathrm{Im}\, \lambda|$, where $c$ is a positive constant which depends on $v$. The norm of $A(\lambda)$ in the space $C_{\exp}(\mathbb{R}^d)$ can be estimated by $\sup_{x \in \mathbb{R}^d} |v(x) e^{|x|^2}| \int_{\mathbb{R}^d} |R_0(\lambda, x)|\, dx$, which is of order $O(1/|\lambda|)$ as $\lambda \to \infty$, $|\arg \lambda| \leq \pi - \varepsilon$, due to (16). The remaining statements also easily follow from (15) and (16). □

Note that for $d \geq 3$, there exists the limit

$$R_0(0, x - y) := \lim_{\lambda \to 0,\, \lambda \in \mathbb{C}'} R_0(\lambda, x - y) = -a_d\, |x - y|^{2-d},$$
which is a fundamental solution of the operator $\frac{1}{2}\Delta$. The operator with this kernel will be denoted by $R_0(0)$. While $R_0(\lambda)$, $\lambda \in \mathbb{C}'$, acts in $L^2(\mathbb{R}^d)$ and $C(\mathbb{R}^d)$, the operator $R_0(0)$ only maps $C_{\exp}(\mathbb{R}^d)$ to $C(\mathbb{R}^d)$ if $d < 5$. The following lemma follows from formulas (15) and (16) similarly to Lemma 5.1.

Lemma 5.2. For $d \geq 3$, the operator $R_0(\lambda)$ considered as an operator from $C_{\exp}(\mathbb{R}^d)$ to $C(\mathbb{R}^d)$ is analytic in $\lambda \in \mathbb{C}'$. It is uniformly bounded in $\mathbb{C}'$. For each $\varepsilon > 0$, it is of order $O(1/|\lambda|)$ as $\lambda \to \infty$, $|\arg \lambda| \leq \pi - \varepsilon$. It has the following asymptotic behavior as $\lambda \to 0$, $\lambda \in \mathbb{C}'$:

$$R_0(\lambda) = \begin{cases} R_0(0) + O(\sqrt{|\lambda|}), & d = 3, \\ R_0(0) + O(|\lambda \ln \lambda|), & d = 4, \\ R_0(0) + O(|\lambda|), & d \geq 5. \end{cases}$$

The following lemma is simply a resolvent identity. It plays an important role in our future analysis.

Lemma 5.3. For $\lambda \in \mathbb{C}'$, we have the following relation between the meromorphic operator-valued functions,

$$R_\beta(\lambda) = R_0(\lambda) - R_0(\lambda) \bigl( I + \beta v(x) R_0(\lambda) \bigr)^{-1} \beta v(x) R_0(\lambda). \tag{20}$$

Remark. Note that (20) can be written as

$$R_\beta(\lambda) = R_0(\lambda) - R_0(\lambda) \bigl( I + \beta A(\lambda) \bigr)^{-1} \beta v(x) R_0(\lambda). \tag{21}$$

From here it also follows that

$$R_\beta(\lambda) = R_0(\lambda) \bigl( I + \beta A(\lambda) \bigr)^{-1}, \tag{22}$$

which should be understood as an identity between operators meromorphic in $\lambda$ acting from $L^2_{\exp}(\mathbb{R}^d)$ to $L^2(\mathbb{R}^d)$ and from $C_{\exp}(\mathbb{R}^d)$ to $C(\mathbb{R}^d)$. In the lattice case considered in [2], the operator $A(\lambda)$ has rank one and

$$R_\beta(\lambda, x, y) = R_0(\lambda, x, y) / \bigl( 1 - \beta I(\lambda) \bigr),$$

where $I(\lambda)$ is an analytic function of $\sqrt{\lambda}$ related to $A(\lambda)$. This exact formula is the key to all the results in [2].
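The resolvent identities of Lemma 5.3 and the remark following it are purely algebraic, so they can be checked on matrices. The sketch below does this for $2 \times 2$ matrices: `H0` is a hypothetical negative definite stand-in for $\frac{1}{2}\Delta$, `V` a diagonal stand-in for multiplication by $v$, and `beta`, `lam` are arbitrary values. It illustrates the algebra only and says nothing about the analytic questions (domains, meromorphy) treated in the text.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(m):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def lin_comb(a, b, s=1.0):
    """Return a + s * b entrywise."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

I2 = [[1.0, 0.0], [0.0, 1.0]]
H0 = [[-2.0, 1.0], [1.0, -2.0]]    # hypothetical stand-in for (1/2) Laplacian
V = [[1.0, 0.0], [0.0, 0.5]]       # hypothetical stand-in for the potential v
beta, lam = 0.8, 1.5

R0 = mat_inv(lin_comb(H0, I2, -lam))                         # (H_0 - lam)^{-1}
R_beta = mat_inv(lin_comb(lin_comb(H0, V, beta), I2, -lam))  # (H_beta - lam)^{-1}
A = mat_mul(V, R0)                                           # A = v R_0
inv_term = mat_inv(lin_comb(I2, A, beta))                    # (I + beta A)^{-1}

rhs_short = mat_mul(R0, inv_term)                            # R_0 (I + beta A)^{-1}
rhs_long = lin_comb(R0, mat_mul(mat_mul(R0, inv_term), A), -beta)
```

Both `rhs_short` and `rhs_long` reproduce `R_beta` up to rounding, reflecting the factorization $H_\beta - \lambda = (I + \beta v R_0(\lambda))(H_0 - \lambda)$.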

The kernels of the operators $I + \beta A(\lambda)$ (both in spaces $L^2_{\exp}(\mathbb{R}^d)$ and $C_{\exp}(\mathbb{R}^d)$) are described by the following lemma.

Lemma 5.4.

(1) The operator-valued function $(I + \beta A(\lambda))^{-1}$ is meromorphic in $\mathbb{C}'$. It has a pole at $\lambda \in \mathbb{C}'$ if and only if $\lambda$ is an eigenvalue of $H_\beta$. These poles are of the first order.

(2) Let $\lambda_i(\beta)$ be a positive eigenvalue of $H_\beta$. There is a one-to-one correspondence between the kernel of the operator $I + \beta A(\lambda_i)$ and the eigenspace of the operator $H_\beta$ corresponding to the eigenvalue $\lambda_i$. Namely, if $(I + \beta A(\lambda_i)) h = 0$, then $\psi = -R_0(\lambda_i) h$ is an eigenfunction of $H_\beta$ and $h = \beta v \psi$.

(3) If $d \geq 3$, there is a one-to-one correspondence between the kernel of the operator $I + \beta A(0)$ and the solution space of the problem

$$H_\beta(\psi) = \frac{1}{2}\Delta \psi + \beta v(x) \psi = 0, \qquad \psi(x) = O\bigl(|x|^{2-d}\bigr), \quad \frac{\partial \psi}{\partial r}(x) = O\bigl(|x|^{1-d}\bigr), \quad \text{as } r = |x| \to \infty. \tag{23}$$

Namely, if $(I + \beta A(0)) h = 0$ for $h \in L^2_{\exp}(\mathbb{R}^d)$, then $h \in C_{\exp}(\mathbb{R}^d)$, $\psi = -R_0(0) h$ is a solution of (23) and $h = \beta v \psi$.

Remark. The relations (23) are an analogue of the eigenvalue problem for zero eigenvalue and the eigenfunction $\psi$ which does not necessarily belong to $L^2(\mathbb{R}^d)$ (see Lemma 5.6 below). We shall call a non-zero solution of (23) a ground state.

Proof. The operator $A(\lambda)$, $\lambda \in \mathbb{C}'$, is analytic, compact, and tends to zero as $\lambda \to +\infty$ by Lemma 5.1. Therefore $(I + \beta A(\lambda))^{-1}$ is meromorphic by the analytic Fredholm theorem. If $\lambda \in \mathbb{C}'$ is a pole of $(I + \beta A(\lambda))^{-1}$, then it is also a pole of the same order of $R_\beta(\lambda)$, as follows from (22), since the kernel of $R_0(\lambda)$ is trivial. Therefore the pole is simple and coincides with one of the eigenvalues $\lambda_i$.

Note that $\lambda$ is a pole of $(I + \beta A(\lambda))^{-1}$ if and only if the kernel of $I + \beta A(\lambda)$ is non-trivial. Let $h \in L^2_{\exp}(\mathbb{R}^d)$ be such that $\|h\|_{L^2_{\exp}(\mathbb{R}^d)} \neq 0$ and $(I + \beta v R_0(\lambda)) h = 0$. Then $\psi := -R_0(\lambda) h \in L^2(\mathbb{R}^d)$ and $(\frac{1}{2}\Delta - \lambda + \beta v) \psi = 0$, that is, $\psi$ is an eigenfunction of $H_\beta$. Conversely, let $\psi \in L^2(\mathbb{R}^d)$ be an eigenfunction corresponding to an eigenvalue $\lambda_i$, that is,

$$\Bigl( \frac{1}{2}\Delta - \lambda_i \Bigr) \psi + \beta v \psi = 0. \tag{24}$$



Denote $h = \beta v \psi$. Then $(\frac{1}{2}\Delta - \lambda_i) \psi = -h$. Thus $\psi = -R_0(\lambda_i) h$ and (24) implies that $h$ satisfies $(I + \beta v R_0(\lambda_i)) h = 0$. Note that $h \in C^\infty(\mathbb{R}^d)$, $h$ vanishes outside $\mathrm{supp}(v)$, and therefore belongs to the kernel of $I + \beta A(\lambda_i)$. This completes the proof of the first two statements.

Similar arguments can be used to prove the last statement. If $h \in L^2_{\exp}(\mathbb{R}^d)$ is such that $\|h\|_{L^2_{\exp}(\mathbb{R}^d)} \neq 0$ and $(I + \beta A(0)) h = 0$, then $h$ has compact support and the integral operator $R_0(0)$ can be applied to $h$. It is clear that $\psi := -R_0(0) h$ satisfies (23) and, since $h$ has compact support, $h \in C_{\exp}(\mathbb{R}^d)$. In order to prove that any solution of (23) corresponds to an eigenvector of $I + \beta A(0)$, one only needs to show that the solution $\psi$ of the problem (23) can be represented in the form $\psi = -R_0(0) h$ with $h = \beta v \psi$. The latter follows from the Green formula

$$\psi(x) = -\bigl( R_0(0) (\beta v \psi) \bigr)(x) + \int_{|y| = a} \Bigl[ R_0(0, x - y)\, \psi_r(y) - \frac{\partial}{\partial r} R_0(0, x - y)\, \psi(y) \Bigr] ds, \qquad |x| < a,$$

after passing to the limit as $a \to \infty$. □

Lemma 5.4 can be improved for $\lambda = \lambda_0(\beta)$. Due to the monotonicity and continuity of $\lambda = \lambda_0(\beta)$ for $\beta > \beta_{cr}$, we can define the inverse function

$$\beta = \beta(\lambda) : [0, \infty) \to [\beta_{cr}, \infty). \tag{25}$$

We shall prove that the operator $-A(\lambda)$, $\lambda > 0$, has a non-negative kernel and has a positive simple eigenvalue such that all the other eigenvalues are smaller in absolute value. Such an eigenvalue is called the principal eigenvalue.

Lemma 5.5. The operator $-A(\lambda)$, $\lambda > 0$ (in the spaces $L^2_{\exp}(\mathbb{R}^d)$ and $C_{\exp}(\mathbb{R}^d)$), has the principal eigenvalue. This eigenvalue is equal to $1/\beta(\lambda)$ and the corresponding eigenfunction can be taken to be positive in the interior of $\mathrm{supp}(v)$ and equal to zero outside of $\mathrm{supp}(v)$. If $d \geq 3$, then the same is true for the operator $-A(0)$ (in particular, $\beta_{cr} > 0$).

Remark 1. Let $d \geq 3$. Lemmas 5.4 and 5.5 imply that the ground state of the operator $H_\beta$ for $\beta = \beta_{cr}$ (defined by (23)) is defined uniquely up to a multiplicative constant and corresponds to the principal eigenvalue of $-A(0)$. The ground state (with $\lambda = 0$) does not exist if $\beta < \beta_{cr}$.

Remark 2. Let $d \geq 3$. From Lemma 5.1 it follows that

$$\lim_{\lambda \to 0,\ \lambda \in \mathbb{C}'} A(\lambda) = A(0).$$

Therefore for all $\lambda \in \mathbb{C}'$ with $|\lambda|$ sufficiently small, the operator $-A(\lambda)$ has a simple eigenvalue whose real part is larger than the absolute values of the other eigenvalues. We shall denote this eigenvalue by $1/\beta(\lambda)$, thus extending the domain of the function $\beta(\lambda)$ (see (25)) from $[0, \infty)$ to $[0, \infty) \cup (U \cap \mathbb{C}')$, where $U$ is a sufficiently small neighborhood of zero.

Proof of Lemma 5.5. By Lemma 5.4 it is sufficient to consider the case of $L^2_{\exp}(\mathbb{R}^d)$. The maximum principle for the operator $(\frac{1}{2}\Delta - \lambda)$, $\lambda > 0$, implies that the kernel of the operator $R_0(\lambda)$, $\lambda > 0$, is negative. Thus, by (17), for all $y$ the kernel of $-A(\lambda)$ is positive when $x$ is in the interior of $\mathrm{supp}(v)$ and zero otherwise. Thus $-A(\lambda)$, $\lambda > 0$, has the principal eigenvalue (see [6]). On the other hand, by Lemma 5.4, $1/\beta(\lambda)$ is a positive eigenvalue of $-A(\lambda)$. Note that this is the largest positive eigenvalue of $-A(\lambda)$. Indeed, if $\mu = 1/\beta' > 1/\beta(\lambda)$ is an eigenvalue of $-A(\lambda)$, then $\lambda$ is one of the eigenvalues $\lambda_i$ of $H_{\beta'}$ by Lemma 5.4. Therefore, $\lambda_i(\beta') = \lambda_0(\beta)$ for $\beta' < \beta$. This contradicts the monotonicity of $\lambda_0(\beta)$. Hence the statement of the lemma concerning the case $\lambda > 0$ holds.

For $d \geq 3$, the kernel of $-A(0)$ is equal to $v P_d$ and has the same properties as the kernel of $-A(\lambda)$, $\lambda > 0$. Thus $-A(0)$ has the principal eigenvalue. Since $A(\lambda) \to A(0)$ as $\lambda \downarrow 0$, the principal eigenvalue $1/\beta(\lambda)$ converges to the principal eigenvalue $\mu < \infty$ of $-A(0)$. On the other hand, $\beta(\lambda)$ is a continuous function, and therefore $\mu = 1/\beta_{cr}$, which proves the statement concerning the case $\lambda = 0$. □

The relationship between ground states and eigenfunctions of $H_\beta$ is explained by the following lemma.

Lemma 5.6. Let $\beta = \beta_{cr}$. If $d = 3$ or $d = 4$, then $H_\beta$ has a unique ground state (up to a multiplicative constant), but $\lambda = 0$ is not an eigenvalue. If $d \geq 5$, then $\lambda = 0$ is a simple eigenvalue of $H_\beta$ and the sets of ground states and eigenfunctions coincide.

Proof. The ground states belong to $L^2(\mathbb{R}^d)$ if and only if $d \geq 5$. In order to complete the proof of the lemma, it remains to show that any eigenfunction of $H_\beta$ with zero eigenvalue satisfies (23). Thus, it is enough to prove that if $\frac{1}{2}\Delta \psi + \beta v(x) \psi = 0$ and $\psi \in L^2(\mathbb{R}^d)$, then $\psi = -R_0(0) h$ with $h = \beta v \psi$. From $\frac{1}{2}\Delta \psi + \beta v(x) \psi - \lambda \psi = -\lambda \psi$ we obtain $\psi = -R_0(\lambda)(h + \lambda \psi)$. Obviously $R_0(\lambda) h \to R_0(0) h$ in $L^2(\mathbb{R}^d)$ as $\lambda \downarrow 0$ since $h \in L^2_{\exp}(\mathbb{R}^d)$. Now the lemma will be proved if we show that

$$\bigl\| \lambda R_0(\lambda) \psi \bigr\|^2_{L^2(\mathbb{R}^d)} = \int_{\mathbb{R}^d} \Bigl( \frac{2\lambda\, |\hat{\psi}(\sigma)|}{\sigma^2 + 2\lambda} \Bigr)^2 d\sigma \to 0 \quad \text{as } \lambda \downarrow 0.$$

The latter follows from the dominated convergence theorem. □

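The following lemma summarizes some facts about the operator $(I + \beta A(\lambda))^{-1}$ proved above. It also describes the structure of the singularity of the operator $(I + \beta A(\lambda))^{-1}$ for $\lambda$ and $\beta$ in a neighborhood of $\lambda = 0$, $\beta = \beta_{cr}$.

Lemma 5.7. Let $d \ge 3$ and $\beta \ge 0$. The operator $(I + \beta A(\lambda))^{-1}$ (considered in $L^2_{\exp}(\mathbb{R}^d)$ and $C_{\exp}(\mathbb{R}^d)$) is meromorphic in $\lambda \in \mathbb{C}'$ and has poles of the first order at eigenvalues of the operator $H_\beta$. For each $\varepsilon > 0$ and some $\Lambda = \Lambda(\beta)$, the operator is uniformly bounded in $\lambda \in \mathbb{C}'$, $|\arg\lambda| \le \pi - \varepsilon$, $|\lambda| \ge \Lambda$. If $\beta = \beta_{cr}$, then the operator $(I + \beta A(\lambda))^{-1}$ is analytic in $\lambda \in \mathbb{C}'$ and uniformly bounded in $\lambda \in \mathbb{C}'$, $|\arg\lambda| \le \pi - \varepsilon$, $|\lambda| \ge \varepsilon$. If $\beta < \beta_{cr}$, then the operator $(I + \beta A(\lambda))^{-1}$ is analytic in $\lambda \in \mathbb{C}'$ and uniformly bounded in $\lambda \in \mathbb{C}'$, $|\arg\lambda| \le \pi - \varepsilon$. There are $\lambda_0 > 0$ and $\delta_0 > 0$ such that for $\lambda \in \mathbb{C}' \cup \{0\}$, $|\lambda| \le \lambda_0$, $|\beta - \beta_{cr}| \le \delta_0$, $\beta \ne \beta(\lambda)$, we have the representation

$$\bigl(I + \beta A(\lambda)\bigr)^{-1} = \frac{\beta(\lambda)}{\beta(\lambda) - \beta}\bigl(B + S_d(\lambda)\bigr) + C(\lambda, \beta). \tag{26}$$

Here $\beta(\lambda)$ is defined in Remark 2 following Lemma 5.5, $B$ is the one-dimensional operator with the kernel

$$B(x, y) = \frac{v(x)\psi(x)\psi(y)}{\int_{\mathbb{R}^d} v(x)\psi^2(x)\,dx}, \tag{27}$$

where $\psi$ is a ground state defined in the remark following Lemma 5.4, and

$$S_3(\lambda) = O\bigl(\sqrt{|\lambda|}\bigr), \quad S_4(\lambda) = O\bigl(|\lambda \ln\lambda|\bigr), \quad S_d(\lambda) = O(|\lambda|),\ d \ge 5, \quad \text{as } \lambda \to 0,\ \lambda \in \mathbb{C}', \tag{28}$$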

$S_d(0) = 0$, $d \ge 3$, and $C(\lambda, \beta)$ is bounded uniformly in $\lambda$ and $\beta$.

Proof. The analytic properties of $(I + \beta A(\lambda))^{-1}$ follow from Lemma 5.4. By Lemma 5.1, the norm of $A(\lambda)$ decays at infinity when $\lambda \to \infty$, $|\arg\lambda| \le \pi - \varepsilon$. Therefore there is $\Lambda > 0$ such that the operator $(I + \beta A(\lambda))^{-1}$ is bounded for $|\arg\lambda| \le \pi - \varepsilon$, $|\lambda| \ge \Lambda$. If $\beta \le \beta_{cr}$, then $(I + \beta A(\lambda))^{-1}$ does not have poles in $\lambda \in \mathbb{C}'$, and therefore $\Lambda$ can be taken to be arbitrarily small. If $\beta < \beta_{cr}$, then $(I + \beta A(0))$ is invertible by Lemma 5.5. By Lemma 5.1, the operators $A(\lambda)$ tend to $A(0)$ when $\lambda \to 0$, $\lambda \in \mathbb{C}'$. Therefore $(I + \beta A(\lambda))^{-1}$, $\lambda \in \mathbb{C}'$, are bounded in a neighborhood of zero.

It remains to justify (26). For $d \ge 3$, let $h_\lambda$ be an eigenvector corresponding to the eigenvalue $1/\beta(\lambda)$ of the operator $-A(\lambda)$, $\lambda \in [0, \infty) \cup (U \cap \mathbb{C}')$. By Lemma 5.5 and the second remark following it, this eigenvector is defined up to a multiplicative constant. Let $A^*(\lambda)$ be the operator in $L^2_{\exp}(\mathbb{R}^d)$ or $C_{\exp}(\mathbb{R}^d)$ with the kernel $A^*(\lambda, x, y) = A(\lambda, y, x)\,e^{|y|^2 - |x|^2}$. Similarly to Lemma 5.5, it is not difficult to show that $1/\beta(\lambda)$ is an eigenvalue for the operator $-A^*(\lambda)$ and that its real part exceeds the absolute values of the other eigenvalues. The corresponding eigenvector $h^*_\lambda$ is uniquely defined up to a multiplicative constant. Moreover, we can take $h_\lambda$ and $h^*_\lambda$ such that

$$v(x)\,e^{|x|^2} h^*_\lambda(x) = h_\lambda(x). \tag{29}$$

Note that $h_\lambda$ and $h^*_\lambda$ can be chosen in such a way that

$$\|h_\lambda - h_0\|,\ \|h^*_\lambda - h^*_0\| \le k\,\|A(\lambda) - A(0)\| \tag{30}$$

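for some $k > 0$ and all sufficiently small $|\lambda|$, where the norms on both sides of (30) are either in the space $L^2_{\exp}(\mathbb{R}^d)$ or $C_{\exp}(\mathbb{R}^d)$. Recall that $A(\lambda) \to A(0)$ as $\lambda \to 0$, $\lambda \in \mathbb{C}'$, by Lemma 5.1. Using this and the fact that $1/\beta_{cr}$ is the principal eigenvalue for $-A(0)$, it is easy to show that there are $\lambda_1 > 0$ and $\delta_1 > 0$ such that for $\lambda \in \mathbb{C}' \cup \{0\}$, $|\lambda| \le \lambda_1$, the eigenvalue $1/\beta(\lambda)$ of the operator $-A(\lambda)$ is the unique eigenvalue whose distance from $1/\beta_{cr}$ does not exceed $\delta_1$. Take $0 < \lambda_0 < \lambda_1$ and $0 < \delta_0 < \delta_1$ such that for $\lambda \in \mathbb{C}' \cup \{0\}$, $|\lambda| \le \lambda_0$, the distance between $1/\beta(\lambda)$ and $1/\beta_{cr}$ does not exceed $\delta_0$. Then for $\lambda \in \mathbb{C}' \cup \{0\}$, $|\lambda| \le \lambda_0$ and $\beta$ such that $|1/\beta - 1/\beta_{cr}| \le \delta_0$, the operator-valued function

$$F(z) = \frac{(A(\lambda) + zI)^{-1}}{z - (1/\beta)}$$

is meromorphic inside the circle $\gamma = \{z : |z - 1/\beta_{cr}| = \delta_1\}$. It has two poles: one at $z = 1/\beta$ and the other at $z = 1/\beta(\lambda)$. The residue at the first pole is equal to $(A(\lambda) + I/\beta)^{-1}$. In order to find the residue at the second pole, recall that it is a simple pole for $(A(\lambda) + zI)^{-1}$, and therefore

$$\bigl(A(\lambda) + zI\bigr)^{-1} = T_{-1}(\lambda)\Bigl(z - \frac{1}{\beta(\lambda)}\Bigr)^{-1} + T_0(\lambda) + T_1(\lambda)\Bigl(z - \frac{1}{\beta(\lambda)}\Bigr) + \cdots$$

for some operators $T_{-1}, T_0, T_1, \ldots$ and all $z$ in a neighborhood of $1/\beta(\lambda)$. From here and the fact that the kernels of $A(\lambda) + I/\beta(\lambda)$ and $A^*(\lambda) + I/\beta(\lambda)$ are one-dimensional and coincide with $\operatorname{span}\{h_\lambda\}$ and $\operatorname{span}\{h^*_\lambda\}$, respectively, it easily follows that

$$T_{-1}(\lambda)f = \frac{\langle f, h^*_\lambda\rangle_{L^2_{\exp}(\mathbb{R}^d)}}{\langle h_\lambda, h^*_\lambda\rangle_{L^2_{\exp}(\mathbb{R}^d)}}\,h_\lambda, \qquad f \in L^2_{\exp}(\mathbb{R}^d) \ \text{(in particular if } f \in C_{\exp}(\mathbb{R}^d)\text{)}.$$

From (30) and Lemma 5.1 it follows that $S_d(\lambda) := T_{-1}(\lambda) - T_{-1}(0)$ satisfies (28). The residue of $F(z)$ at $z = 1/\beta(\lambda)$ is equal to

$$\frac{\beta(\lambda)\beta}{\beta - \beta(\lambda)}\bigl(T_{-1}(0) + S_d(\lambda)\bigr).$$

Integrating $F(z)$ over the contour $\gamma$, we obtain

$$\bigl(A(\lambda) + I/\beta\bigr)^{-1} + \frac{\beta(\lambda)\beta}{\beta - \beta(\lambda)}\bigl(T_{-1}(0) + S_d(\lambda)\bigr) = \frac{1}{2\pi i}\int_\gamma \frac{(A(\lambda) + zI)^{-1}}{z - (1/\beta)}\,dz.$$

The right-hand side of this formula is uniformly bounded, which completes the proof of the lemma if we show that $T_{-1}(0) = B$. Thus it remains to prove that

$$\frac{h_0(x)\,e^{|y|^2} h^*_0(y)}{\langle h_0, h^*_0\rangle_{L^2_{\exp}(\mathbb{R}^d)}} = \frac{v(x)\psi(x)\psi(y)}{\int_{\mathbb{R}^d} v(x)\psi^2(x)\,dx}.$$

The latter follows from the relation $h_0 = \beta v\psi$ (see Lemma 5.4) and (29). □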

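Formula (22) and Lemmas 5.2 and 5.7 imply the following result.

Lemma 5.8. Let $d \ge 3$ and $\beta \ge 0$. The operator $R_\beta(\lambda)$ (considered as an operator from $C_{\exp}(\mathbb{R}^d)$ to $C(\mathbb{R}^d)$) is meromorphic in $\lambda \in \mathbb{C}'$ and has poles of the first order at eigenvalues of the operator $H_\beta$. For each $\varepsilon > 0$ and some $\Lambda = \Lambda(\beta)$, the operator is uniformly bounded in $\lambda \in \mathbb{C}'$, $|\arg\lambda| \le \pi - \varepsilon$, $|\lambda| \ge \Lambda$. It is of order $O(1/|\lambda|)$ as $\lambda \to \infty$, $|\arg\lambda| \le \pi - \varepsilon$. If $\beta = \beta_{cr}$, then the operator $R_\beta(\lambda)$ is analytic in $\lambda \in \mathbb{C}'$ and uniformly bounded in $\lambda \in \mathbb{C}'$, $|\arg\lambda| \le \pi - \varepsilon$, $|\lambda| \ge \varepsilon$. If $\beta < \beta_{cr}$, then the operator $R_\beta(\lambda)$ is analytic in $\lambda \in \mathbb{C}'$ and uniformly bounded in $\lambda \in \mathbb{C}'$, $|\arg\lambda| \le \pi - \varepsilon$.

There are $\lambda_0 > 0$ and $\delta_0 > 0$ such that for $\lambda \in \mathbb{C}'$, $0 < |\lambda| \le \lambda_0$, $|\beta - \beta_{cr}| \le \delta_0$, $\beta \ne \beta(\lambda)$, we have the representation

$$R_\beta(\lambda) = \frac{\beta(\lambda)}{\beta(\lambda) - \beta}\bigl(R_0(0)B + S_d(\lambda)\bigr) + C(\lambda, \beta), \tag{31}$$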

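where $\beta(\lambda)$ is defined in Remark 2 following Lemma 5.5 and $B$ is given by (27), $S_d$, $d \ge 3$, satisfy (28), and $C(\lambda, \beta)$ is bounded uniformly in $\lambda$ and $\beta$.

6. The behavior of the principal eigenvalue for $\beta \downarrow \beta_{cr}$

In Lemma 5.5 we showed that $\beta_{cr} > 0$ for $d \ge 3$. The following theorem implies, in particular, that $\beta_{cr} = 0$ for $d = 1$ or $2$.

Theorem 6.1. For $d = 1, 2$ (when $\beta_{cr} = 0$) the eigenvalue $\lambda_0(\beta)$ has the following behavior as $\beta \downarrow \beta_{cr}$:

$$\lambda_0(\beta) \sim \tfrac12 c_1^2\beta^2, \qquad c_1 = \int_{\mathbb{R}^d} v(x)\,dx, \qquad d = 1, \tag{32}$$

$$\lambda_0(\beta) \sim \exp\Bigl(-\frac{c_2}{\beta}\Bigr), \qquad c_2 = \frac{\pi}{c_1}, \qquad d = 2. \tag{33}$$

In dimensions $d \ge 3$ the eigenvalue $\lambda_0(\beta)$ has the following behavior as $\beta \downarrow \beta_{cr}$:

$$\lambda_0(\beta) \sim c_3(\beta - \beta_{cr})^2, \qquad d = 3, \tag{34}$$

$$\lambda_0(\beta) \sim c_4(\beta - \beta_{cr})/\ln\bigl(1/(\beta - \beta_{cr})\bigr), \qquad d = 4, \tag{35}$$

$$\lambda_0(\beta) \sim c_d(\beta - \beta_{cr}), \qquad d \ge 5, \tag{36}$$

where $c_d \ne 0$, $d \ge 3$, depend on $v$ and will be indicated in the proof.

Proof. Since we are interested in the behavior of $\lambda_0(\beta)$ for $\beta \downarrow \beta_{cr}$ and $\lambda_0(\beta) \downarrow 0$ when $\beta \downarrow \beta_{cr}$ by Lemma 4.1, we shall study the behavior of $\beta(\lambda)$ as $\lambda \downarrow 0$ (or, more generally, as $\lambda \to 0$, $\lambda \in \mathbb{C}'$). The arguments below are based on Lemma 5.1.

First consider the case $d = 1$. For $\lambda \to 0$, $\lambda \in \mathbb{C}'$, the eigenvalue problem for $-A(\lambda)$ can be written in the form

$$\bigl(vP_1 + O(\sqrt\lambda)\bigr)h_\lambda = \frac{\sqrt\lambda}{\beta(\lambda)}\,h_\lambda. \tag{37}$$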

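Note that the kernel of $vP_1$ is positive when $x$ is an interior point of $\operatorname{supp}(v)$. Therefore $vP_1$ has a principal eigenvalue. In fact, the operator $vP_1$ is one-dimensional and the eigenvalue is equal to $c_1/\sqrt2$, where $c_1 = \int_{\mathbb{R}^d} v(x)\,dx$. Since this eigenvalue is simple and the operator in the left-hand side of (37) is analytic in $\sqrt\lambda$, both $h_\lambda$ and $\sqrt\lambda/\beta(\lambda)$ are analytic functions of $\sqrt\lambda$ in a neighborhood of the origin and

$$\lim_{\lambda \to 0,\ \lambda \in \mathbb{C}'} \sqrt\lambda/\beta(\lambda) = c_1/\sqrt2.$$

Therefore, $\beta_{cr} = 0$, $\beta(\lambda)$ is analytic in $\sqrt\lambda$, and $\beta(\lambda) \sim \sqrt{2\lambda}/c_1$ as $\lambda \to 0$, $\lambda \in \mathbb{C}'$, which proves (32). The same arguments in the case $d = 2$ lead to the relation

$$\lim_{\lambda \to 0,\ \lambda \in \mathbb{C}'} \bigl(\beta(\lambda)\,|\ln\lambda|\bigr)^{-1} = c_1/\pi.$$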

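This implies that $\beta_{cr} = 0$ and (33) holds.

In the case $d = 3$ the eigenvalue problem for $-A(\lambda)$ takes the form

$$\bigl(-A(0) + \sqrt\lambda\,v(x)Q_3 + O(\lambda)\bigr)h_\lambda = \frac{1}{\beta(\lambda)}\,h_\lambda. \tag{38}$$

As in the one-dimensional case, $1/\beta(\lambda)$ and $h_\lambda$ are analytic functions of $\sqrt\lambda$. Now $1/\beta_{cr}$ is equal to the principal eigenvalue of $-A(0)$. Recall that $h_0$ is the principal eigenfunction of $-A(0)$ and $h^*_0$ is the principal eigenfunction of $-A^*(0)$. Standard perturbation arguments imply that

$$\frac{1}{\beta(\lambda)} = \frac{1}{\beta_{cr}} - \gamma\sqrt\lambda + O(\lambda), \qquad \lambda \to 0,\ \lambda \in \mathbb{C}', \tag{39}$$

where

$$\gamma = \frac{\langle -vQ_3h_0, h^*_0\rangle_{L^2_{\exp}(\mathbb{R}^d)}}{\langle h_0, h^*_0\rangle_{L^2_{\exp}(\mathbb{R}^d)}} > 0, \tag{40}$$

which implies (34) with $c_3 = 1/(\gamma^2\beta_{cr}^4)$. Note that $\gamma > 0$ since the kernel of the operator $vQ_3$ is negative and the principal eigenfunctions $h_0$, $h^*_0$ can be chosen to be positive inside $\operatorname{supp}(v)$. The formula for $\gamma$ can be simplified. We choose $h_0 = \beta v\psi$ (see Lemma 5.4) and $h^*_0$ defined in (29). Then

$$\gamma = \frac{\bigl(\int_{\mathbb{R}^3} v(x)\psi(x)\,dx\bigr)^2}{\sqrt2\,\pi\int_{\mathbb{R}^3} v(x)\psi^2(x)\,dx}, \qquad d = 3. \tag{41}$$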
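As a numerical sanity check of (34) and (41), the following sketch treats a hypothetical example that is not taken from the text: the radial square well $v = 1$ for $|x| < a$, $v = 0$ otherwise, in $d = 3$. This potential is not smooth, so it lies outside the standing assumptions and serves only as an illustration. For it, the s-wave reduction $u = r\psi$ gives the matching condition $k\cot(ka) = -\kappa$ with $k = \sqrt{2(\beta - \lambda)}$, $\kappa = \sqrt{2\lambda}$, the threshold is $\beta_{cr} = \pi^2/(8a^2)$, and the threshold ground state is proportional to $\sin(k_0 r)/r$ inside the ball with $k_0 = \pi/(2a)$. The function names are illustrative.

```python
import math

def beta_cr(a):
    """Threshold coupling for the (assumed) square well v = 1 on |x| < a in R^3."""
    return math.pi ** 2 / (8.0 * a * a)

def lambda0(beta, a, iters=200):
    """Principal eigenvalue of (1/2)Δ + βv for β slightly above beta_cr(a).

    Solves k*cos(k a) + kappa*sin(k a) = 0 (equivalent to k cot(ka) = -kappa)
    by bisection on [0, β - β_cr], where the left side changes sign.
    """
    def g(lam):
        k = math.sqrt(2.0 * (beta - lam))
        kappa = math.sqrt(2.0 * lam)
        return k * math.cos(k * a) + kappa * math.sin(k * a)

    lo, hi = 0.0, beta - beta_cr(a)   # g(lo) < 0 < g(hi) near the threshold
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def c3_from_formula_41(a):
    """c_3 = 1/(γ² β_cr⁴) with γ from (41), using ψ = sin(k0 r)/r inside the well."""
    k0 = math.pi / (2.0 * a)
    int_v_psi = 4.0 * math.pi / k0 ** 2    # ∫ vψ dx  = 4π ∫_0^a r sin(k0 r) dr
    int_v_psi2 = 2.0 * math.pi * a         # ∫ vψ² dx = 4π ∫_0^a sin²(k0 r) dr
    gamma = int_v_psi ** 2 / (math.sqrt(2.0) * math.pi * int_v_psi2)
    return 1.0 / (gamma ** 2 * beta_cr(a) ** 4)

if __name__ == "__main__":
    a, eps = 1.0, 1e-3
    lam = lambda0(beta_cr(a) + eps, a)
    # Ratio of the computed eigenvalue to the leading term c_3 (β - β_cr)² of (34).
    print(lam / (c3_from_formula_41(a) * eps ** 2))
```

Under these assumptions the constant from (41) reduces to $c_3 = a^2/2$, and the printed ratio should approach one as $\beta - \beta_{cr}$ decreases.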

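Let $d = 4$. Then instead of (38) we get

$$\bigl(-A(0) + \lambda\ln(1/\lambda)\,vQ_4 + O(\lambda)\bigr)h_\lambda = \frac{1}{\beta(\lambda)}\,h_\lambda. \tag{42}$$

From here it follows that

$$\frac{1}{\beta(\lambda)} = \frac{1}{\beta_{cr}} - \gamma\lambda\ln(1/\lambda) + O(\lambda), \qquad \lambda \to 0,\ \lambda \in \mathbb{C}', \tag{43}$$

where $1/\beta_{cr}$ is the principal eigenvalue of $-A(0)$ and $\gamma$ is given by (40) with $Q_3$ replaced by $Q_4$. Thus (35) holds with $c_4 = 1/(\gamma\beta_{cr}^2)$.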



For $d \ge 5$ we get

$$\bigl(-A(0) + \lambda vQ_d + O(\lambda^{3/2})\bigr)h_\lambda = \frac{1}{\beta(\lambda)}\,h_\lambda.$$

From here it follows that

$$\frac{1}{\beta(\lambda)} = \frac{1}{\beta_{cr}} - \gamma\lambda + O(\lambda^{3/2}), \qquad \lambda \to 0,\ \lambda \in \mathbb{C}',$$

where $1/\beta_{cr}$ is the principal eigenvalue of $-A(0)$ and $\gamma$ is given by (40) with $Q_3$ replaced by $Q_d$. Thus (36) holds with $c_d = 1/(\gamma\beta_{cr}^2)$. □

7. Asymptotics of the partition function, solutions, and fundamental solutions

We shall need the following notation. Recall from (5) that by $p_\beta(t, y, x)$ we denote the fundamental solution of the parabolic problem

$$\frac{\partial p_\beta(t, y, x)}{\partial t} = \tfrac12\Delta_x p_\beta(t, y, x) + \beta v(x)p_\beta(t, y, x), \qquad p_\beta(0, y, x) = \delta(x - y).$$

For a given $f \in L^2(\mathbb{R}^d)$, let

$$u_\beta(t, x) = \int_{\mathbb{R}^d} p_\beta(t, y, x)f(y)\,dy$$

be the solution of the Cauchy problem with the initial data $f$. The partition function is defined as the integral of the fundamental solution

$$Z_{\beta,t}(x) = \int_{\mathbb{R}^d} p_\beta(t, x, y)\,dy = \int_{\mathbb{R}^d} p_\beta(t, y, x)\,dy.$$

Note that the partition function defined in (4) is simply $Z_{\beta,T} = Z_{\beta,T}(0)$. Also note that $Z_{\beta,t}(x)$ is the solution of the Cauchy problem with initial data equal to one:

$$\frac{\partial Z_{\beta,t}(x)}{\partial t} = \tfrac12\Delta Z_{\beta,t}(x) + \beta v(x)Z_{\beta,t}(x), \qquad Z_{\beta,0}(x) \equiv 1.$$

For $\beta > \beta_{cr}$, let $\psi_\beta$ be the positive eigenfunction for the operator $H_\beta$ with eigenvalue $\lambda_0(\beta)$ normalized by the condition $\|\psi_\beta\|_{L^2(\mathbb{R}^d)} = 1$. This function is defined uniquely by Lemma 5.4 and is equal to $-R_0(\lambda)h_\lambda$, where $\lambda = \lambda_0(\beta)$ and $h_\lambda$ is the principal eigenfunction for the operator $-A(\lambda)$. Note that $\psi_\beta$ decays exponentially at infinity.

For $a \in \mathbb{R}$, let $\Gamma(a)$ be the following contour in the complex plane:

$$\Gamma(a) = \{a - s + is,\ s \ge 0\} \cup \{a - s - is,\ s \ge 0\}.$$



We choose the direction along $\Gamma(a)$ in such a way that the imaginary coordinate increases. The following lemma is an important tool for investigating the asymptotics of $Z_{\beta,T}$.

Lemma 7.1. Let $a > \lambda_0(\beta)$. Then for $f \in L^2(\mathbb{R}^d)$ (or $f \in C_{\exp}(\mathbb{R}^d)$) and $t > 0$,

$$u_\beta(t, x) = -\frac{1}{2\pi i}\int_{\Gamma(a)} e^{\lambda t}\bigl(R_\beta(\lambda)f\bigr)(x)\,d\lambda, \tag{44}$$

which holds in $L^2(\mathbb{R}^d)$ (or $C(\mathbb{R}^d)$). This formula remains valid if the initial function $f$ is identically equal to one and $R_\beta(\lambda)f$ is understood by substituting $f \equiv 1$ into (20) with $R_0(\lambda)1 = -1/\lambda$. More precisely,

$$Z_{\beta,t}(x) - 1 = -\frac{1}{2\pi i}\int_{\Gamma(a)} \frac{e^{\lambda t}}{\lambda}\bigl(R_\beta(\lambda)(\beta v)\bigr)(x)\,d\lambda \tag{45}$$

in $L^2(\mathbb{R}^d)$ and $C(\mathbb{R}^d)$.

Proof. First, let $f \in L^2(\mathbb{R}^d)$. We solve the Cauchy problem for $u_\beta$ using the Laplace transform with respect to $t$. This leads to (44) with $\Gamma(a)$ replaced by the line $\{\lambda : \operatorname{Re}\lambda = a\}$. The integral over this line is equal to the integral over $\Gamma(a)$ since the resolvent is analytic between these contours and its norm decays as $|\lambda|^{-1}$ when $|\lambda| \to \infty$.

Now let $f \equiv 1$. Then $w(t, x) = Z_{\beta,t}(x) - 1$ is the solution of the problem

$$\frac{\partial w(t, x)}{\partial t} = \tfrac12\Delta w(t, x) + \beta v(x)w(t, x) + \beta v(x), \qquad w(0, x) \equiv 0.$$

By the Duhamel formula and (44),

$$w(t, x) = -\frac{1}{2\pi i}\int_0^t\int_{\Gamma(a)} e^{\lambda(t-s)}\bigl(R_\beta(\lambda)\beta v\bigr)(x)\,d\lambda\,ds = -\frac{1}{2\pi i}\int_{\Gamma(a)} \frac{e^{\lambda t} - 1}{\lambda}\bigl(R_\beta(\lambda)\beta v\bigr)(x)\,d\lambda = -\frac{1}{2\pi i}\int_{\Gamma(a)} \frac{e^{\lambda t}}{\lambda}\bigl(R_\beta(\lambda)\beta v\bigr)(x)\,d\lambda,$$

since in the domain $\Gamma^+(a)$ to the right of the contour $\Gamma(a)$, the operator $R_\beta(\lambda) : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ is analytic and decays as $|\lambda|^{-1}$ at infinity, so that the term with $-1$ in the numerator integrates to zero. This justifies (45) in the $L^2(\mathbb{R}^d)$ sense.

It remains to show that the right-hand side of (44) is continuous for $f \in C_{\exp}(\mathbb{R}^d)$ and the right-hand side of (45) is continuous. Since $\beta v \in C_0^\infty$, the integrands are continuous in $(t, x)$ for each $\lambda \in \Gamma(a)$. It remains to note that the integrals converge uniformly when $x \in \mathbb{R}^d$, $t \ge t_0 > 0$. This is due to the fact that $\|R_\beta(\lambda)f\|_{C(\mathbb{R}^d)},\ \|R_\beta(\lambda)\beta v\|_{C(\mathbb{R}^d)} \le C_d(a)$, as follows from Lemma 5.8. □

In order to state the next theorem we shall need the following notation. As in part (3) of Lemma 5.4, it is not difficult to show that for $d \ge 3$, $0 \le \beta < \beta_{cr}$ and $f \in C_0^\infty(\mathbb{R}^d)$ there is a unique solution of the problem



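$$H_\beta(\varphi) = \tfrac12\Delta\varphi + \beta v(x)\varphi = f, \qquad \varphi = O\bigl(|x|^{2-d}\bigr), \quad \frac{\partial\varphi}{\partial r}(x) = O\bigl(|x|^{1-d}\bigr) \quad \text{as } r = |x| \to \infty. \tag{46}$$

This solution is given by $\varphi = R_0(0)(I + \beta A(0))^{-1}f$. For $f = -\beta v$, we denote this solution by $\varphi_\beta$.

Theorem 7.2.

(1) For $\beta > \beta_{cr}$ there is $\varepsilon > 0$ such that we have the following asymptotics for the partition function:

$$Z_{\beta,t}(x) - 1 = \exp\bigl(\lambda_0(\beta)t\bigr)\Bigl(\|\psi_\beta\|_{L^1(\mathbb{R}^d)}\psi_\beta(x) + O\bigl(\exp(-\varepsilon t)\bigr)\Bigr) \quad \text{as } t \to \infty,$$

which holds in $L^2(\mathbb{R}^d)$ and in $C(\mathbb{R}^d)$, where $\psi_\beta$ is the positive eigenfunction for the operator $H_\beta$ with eigenvalue $\lambda_0(\beta)$ normalized by the condition $\|\psi_\beta\|_{L^2(\mathbb{R}^d)} = 1$.

(2) For $\beta = \beta_{cr}$ we have the following asymptotics for the partition function:

$$Z_{\beta,t}(x) = k_3t^{1/2}\psi(x) + O(1) \quad \text{as } t \to \infty, \quad d = 3,$$

$$Z_{\beta,t}(x) = k_4\frac{t}{\ln t}\psi(x) + O\Bigl(\frac{t}{\ln^2 t}\Bigr) \quad \text{as } t \to \infty, \quad d = 4,$$

$$Z_{\beta,t}(x) = k_dt\psi(x) + O(\sqrt t) \quad \text{as } t \to \infty, \quad d \ge 5,$$

which holds in $C(\mathbb{R}^d)$. Here $k_d$, $d \ge 3$, are positive constants and $\psi$ is the positive ground state for $H_{\beta_{cr}}$ normalized by the condition $\|\beta_{cr}v\psi\|_{L^2_{\exp}(\mathbb{R}^d)} = 1$.

(3) If $0 \le \beta < \beta_{cr}$, then

$$\lim_{t \to \infty} Z_{\beta,t}(x) = 1 + \varphi_\beta(x)$$

in $C(\mathbb{R}^d)$.

Proof. (1) Note that the resolvent $R_\beta(\lambda)$ has only one pole between the contours $\Gamma(a)$ and $\Gamma(\lambda_0(\beta) - \varepsilon)$ if $\varepsilon$ is less than the distance from $\lambda_0$ to the rest of the spectrum. This pole is at the point $\lambda_0(\beta)$ and the residue is the integral operator with the kernel $-\psi_\beta(x)\psi_\beta(y)$. Therefore from (45) it follows that

$$Z_{\beta,t}(x) - 1 = \frac{e^{\lambda_0(\beta)t}\psi_\beta(x)}{\lambda_0(\beta)}\int_{\mathbb{R}^d} \beta v(y)\psi_\beta(y)\,dy - \frac{1}{2\pi i}\int_{\Gamma(\lambda_0(\beta) - \varepsilon)} \frac{e^{\lambda t}}{\lambda}\bigl(R_\beta(\lambda)\beta v\bigr)(x)\,d\lambda. \tag{47}$$

Since $(\frac12\Delta + \beta v - \lambda_0(\beta))\psi_\beta = 0$, we have $\beta v\psi_\beta = (\lambda_0(\beta) - \frac12\Delta)\psi_\beta$, and the integral in the first term of the right-hand side of (47) is equal to $\lambda_0(\beta)\|\psi_\beta\|_{L^1(\mathbb{R}^d)}$. Thus the first term on the right-hand side coincides with the main term of the asymptotics stated in the theorem.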



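It remains to show that the second term on the right-hand side of (47) is exponentially smaller than the first term. This is due to the fact that the norm of the operator $R_\beta(\lambda)$ is of order $1/|\lambda|$ at infinity for $\lambda \in \Gamma(\lambda_0(\beta) - \varepsilon)$.

(2) Let $d = 3$. First, let us analyze (31) when $\beta = \beta_{cr}$ and $\lambda \to 0$, $\lambda \in \mathbb{C}'$. By (39), the factor $\beta(\lambda)/(\beta(\lambda) - \beta)$ in the right-hand side of (31) is equal to $(\beta_{cr}\gamma\sqrt\lambda)^{-1} + O(1)$ as $\lambda \to 0$, $\lambda \in \mathbb{C}'$, where $\gamma > 0$ is given by (40). We choose the same ground state $\psi$ specified in the statement of Theorem 7.2. Then from (27) and Lemma 5.4 it follows that

$$R_0(0)B(\beta_{cr}v) = \frac{\int_{\mathbb{R}^d} v(x)\psi(x)\,dx}{\int_{\mathbb{R}^d} v(x)\psi^2(x)\,dx}\,R_0(0)(\beta_{cr}v\psi) = -\frac{\int_{\mathbb{R}^d} v(x)\psi(x)\,dx}{\int_{\mathbb{R}^d} v(x)\psi^2(x)\,dx}\,\psi. \tag{48}$$

Now, by Lemma 5.8 and (34), (39),

$$R_{\beta_{cr}}(\lambda)(\beta_{cr}v) = \frac{-\int_{\mathbb{R}^d} v(x)\psi(x)\,dx}{\gamma\beta_{cr}\sqrt\lambda\int_{\mathbb{R}^d} v(x)\psi^2(x)\,dx}\,\psi + D(\lambda) = \frac{-k_3\psi}{\sqrt\lambda} + D(\lambda), \qquad k_3 > 0, \tag{49}$$

where the remainder $D(\lambda)$ is of order $O(1)$ when $\lambda \to 0$, $\lambda \in \mathbb{C}'$. Note that $D(\lambda)$ is bounded on $\Gamma^+(0)$ since the left-hand side and the first term on the right-hand side of (49) are bounded on $\Gamma^+(0)$ outside a neighborhood of zero. Next, we apply (45) with $a$ replaced by $1/t$ and use the expression (49) to obtain

$$Z_{\beta,t}(x) - 1 = \frac{1}{2\pi i}\int_{\Gamma(1/t)} \frac{e^{\lambda t}}{\lambda}\Bigl(\frac{k_3\psi}{\sqrt\lambda} + D(\lambda)\Bigr)d\lambda. \tag{50}$$

Let us change the variables in the integral, $\lambda t = z$. Thus

$$Z_{\beta,t}(x) - 1 = \frac{1}{2\pi i}\int_{\Gamma(1)} \frac{e^z}{z}\Bigl(\frac{\sqrt t\,k_3\psi}{\sqrt z} + D\Bigl(\frac zt\Bigr)\Bigr)dz.$$

The contribution to the integral from the term containing $D(z/t)$ is bounded, while the contribution from the first term is equal to $k_3t^{1/2}\psi(x)$, as claimed in the theorem. One needs only to note that $k_3 > 0$ since

$$\frac{1}{2\pi i}\int_{\Gamma(1)} z^{-3/2}e^z\,dz = \frac{1}{\pi i}\int_{\Gamma(1)} z^{-1/2}e^z\,dz = \frac{2}{\pi}\int_0^\infty \sigma^{-1/2}e^{-\sigma}\,d\sigma = \frac{2}{\sqrt\pi} > 0.$$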
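The value $2/\sqrt\pi$ of the contour integral above can be checked numerically. The sketch below is an illustration only: the helper name `wedge_integral` and the quadrature parameters are assumptions, not part of the text. It parametrizes the upper ray of $\Gamma(1)$ as $z = 1 - s + is$, uses the symmetry $F(\bar z) = \overline{F(z)}$ of the integrand (principal branch) to recover the lower ray, and applies composite Simpson quadrature on a truncated range.

```python
import cmath
import math

def wedge_integral(F, a=1.0, s_max=60.0, n=60000):
    """Approximate (1/(2πi)) ∫_{Γ(a)} F(z) dz for F with F(conj z) = conj F(z).

    Γ(a) = {a - s ± i s, s >= 0}, traversed so that Im z increases.
    The lower ray contributes minus the conjugate of the upper-ray integral,
    so the total equals 2i * Im(upper) and the result is Im(upper) / π.
    n must be even (composite Simpson rule); s is truncated at s_max.
    """
    h = s_max / n
    total = 0j
    for j in range(n + 1):
        s = j * h
        w = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += w * F(complex(a - s, s))
    upper = total * (h / 3.0) * complex(-1.0, 1.0)   # dz = (-1 + i) ds
    return upper.imag / math.pi

if __name__ == "__main__":
    value = wedge_integral(lambda z: cmath.exp(z) * z ** (-1.5))
    print(value, 2.0 / math.sqrt(math.pi))
```

The same helper applied to $e^z/z$ returns approximately one, which is the residue computation used in part (3) of the proof.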

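If $d = 4$, then (35), (43) imply that $\beta(\lambda) - \beta_{cr} \sim \beta_{cr}^2\gamma\lambda\ln(1/\lambda)$ as $\lambda \to 0$, $\lambda \in \mathbb{C}'$. This leads to the following analog of (50):

$$Z_{\beta,t}(x) - 1 = \frac{1}{2\pi i}\int_{\Gamma(1/t)} \frac{e^{\lambda t}}{\lambda}\Bigl(\frac{k_4\psi(x)}{\lambda\ln(1/\lambda)} + D(\lambda)\Bigr)d\lambda, \qquad k_4 > 0,$$

where $D(\lambda)$ is of order $O(1/|\lambda\ln^2\lambda|)$ when $\lambda \to 0$, $\lambda \in \mathbb{C}'$, and is bounded at infinity. After the change of variables $\lambda t = z$, we obtain

$$Z_{\beta,t}(x) - 1 = \frac{1}{2\pi i}\int_{\Gamma(1)} \frac{e^z}{z}\Bigl(\frac{tk_4\psi(x)}{z(\ln t - \ln z)} + D\Bigl(\frac zt\Bigr)\Bigr)dz,$$

which easily leads to the second part of the theorem in the case $d = 4$. The treatment of the case $d \ge 5$ is similar.

(3) We apply (45) with $a$ replaced by $1/t$ to obtain

$$Z_{\beta,t}(x) - 1 = -\frac{1}{2\pi i}\int_{\Gamma(1/t)} \frac{e^{\lambda t}}{\lambda}\bigl(R_\beta(\lambda)(\beta v)\bigr)(x)\,d\lambda = -\frac{1}{2\pi i}\int_{\Gamma(1)} \frac{e^z}{z}\Bigl(R_\beta\Bigl(\frac zt\Bigr)(\beta v)\Bigr)(x)\,dz. \tag{51}$$

Note that by Lemma 5.2 and since $1/\beta$ is not an eigenvalue of $-A(0)$ we have

$$\lim_{\lambda \to 0,\ \lambda \in \mathbb{C}'} R_\beta(\lambda)(\beta v) = \lim_{\lambda \to 0,\ \lambda \in \mathbb{C}'} R_0(\lambda)\bigl(I + \beta A(\lambda)\bigr)^{-1}(\beta v) = R_0(0)\bigl(I + \beta A(0)\bigr)^{-1}(\beta v) = -\varphi_\beta.$$

Since the difference between $R_\beta(z/t)(\beta v)$ and $-\varphi_\beta$ is bounded on $\Gamma(1)$, one can pass to the limit $t \to \infty$ under the integral sign in (51), which leads to

$$\lim_{t \to \infty} Z_{\beta,t}(x) = 1 + \frac{\varphi_\beta(x)}{2\pi i}\int_{\Gamma(1)} \frac{e^z}{z}\,dz = 1 + \varphi_\beta(x). \qquad □$$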

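The third part of Theorem 7.2 establishes the existence of $\lim_{t \to \infty} Z_{\beta,t}(x)$ for $\beta < \beta_{cr}$. Next we examine the behavior of this quantity as $\beta \uparrow \beta_{cr}$.

Lemma 7.3. There are positive constants $b_d$, $d \ge 3$, such that

$$\lim_{t \to \infty} Z_{\beta,t}(x) - 1 = \frac{b_d}{\beta_{cr} - \beta}\,\psi(x) + O(1) \quad \text{as } \beta \uparrow \beta_{cr}$$

is valid in $C(\mathbb{R}^d)$, where $\psi$ is the positive ground state for $H_{\beta_{cr}}$ normalized by the condition $\|\beta_{cr}v\psi\|_{L^2_{\exp}(\mathbb{R}^d)} = 1$.

Proof. By the third part of Theorem 7.2, we only need to find the asymptotics as $\beta \uparrow \beta_{cr}$ of $\varphi_\beta = -R_0(0)(I + \beta A(0))^{-1}(\beta v)$. From (26) with $\lambda = 0$ and $\beta(0) = \beta_{cr}$ and (48) it follows that

$$\varphi_\beta = -R_0(0)\bigl(I + \beta A(0)\bigr)^{-1}(\beta v) = \frac{-\beta_{cr}}{\beta_{cr} - \beta}\,R_0(0)B(\beta_{cr}v) + O(1) = \frac{b_d}{\beta_{cr} - \beta}\,\psi + O(1)$$

for some positive constant $b_d$. □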



8. Behavior of the polymer for $\beta > \beta_{cr}$

In this section we shall assume that $\beta > \beta_{cr}$ is fixed. A result similar to the first part of Theorem 7.2 is valid for the solution of the Cauchy problem and for the fundamental solution.

Theorem 8.1. Let $f \in L^2(\mathbb{R}^d)$ (or $f \in C_{\exp}(\mathbb{R}^d)$). For $\beta > \beta_{cr}$ there is $\varepsilon > 0$ such that we have the following asymptotics for the solution $u_\beta$ of the Cauchy problem with the initial data $f$:

$$u_\beta(t) = \exp\bigl(\lambda_0(\beta)t\bigr)\Bigl(\langle\psi_\beta, f\rangle_{L^2(\mathbb{R}^d)}\psi_\beta + q_f(t)\Bigr), \tag{52}$$

which holds in $L^2(\mathbb{R}^d)$ (or in $C(\mathbb{R}^d)$), where $\|q_f(t)\| \le c\|f\|\exp(-\varepsilon t)$ for some $c$ and all sufficiently large $t$. We have the following asymptotics for the fundamental solution of the parabolic equation:

$$p_\beta(t, y, x) = \exp\bigl(\lambda_0(\beta)t\bigr)\bigl(\psi_\beta(y)\psi_\beta(x) + q(t, y, x)\bigr), \tag{53}$$

where $\lim_{t \to \infty}\|q(t, y, x)\| = 0$, uniformly in $y$, and (53) holds in $L^2(\mathbb{R}^d)$ and in $C(\mathbb{R}^d)$ for each $y$ fixed.

Proof. The proof of (52) is the same as the proof of the first part of Theorem 7.2, and therefore we omit it. Let $f_\beta^{\delta,y}(x) = p_\beta(\delta, y, x)$ be the fundamental solution of the parabolic problem at time $\delta$. Note that $f_\beta^{\delta,y} \in L^2(\mathbb{R}^d)$ for all $\delta > 0$ and all $y$, and $f_\beta^{\delta,y} \in C_{\exp}(\mathbb{R}^d)$ for all sufficiently small $\delta > 0$ and all $y$. Denote the solution of the parabolic equation with the initial data $f_\beta^{\delta,y}$ by $u_\beta^{\delta,y}(t, x)$. Then

$$p_\beta(t, y, x) = u_\beta^{\delta,y}(t - \delta, x) = \exp\bigl(\lambda_0(\beta)(t - \delta)\bigr)\Bigl(\langle\psi_\beta, f_\beta^{\delta,y}\rangle_{L^2(\mathbb{R}^d)}\psi_\beta(x) + q^\delta(t, y, x)\Bigr),$$

where $\|q^\delta(t, y, x)\| \le c\|f_\beta^{\delta,y}\|\exp(-\varepsilon(t - \delta))$ for some $c$ and all sufficiently large $t$. Note that $\langle\psi_\beta, f_\beta^{\delta,y}\rangle_{L^2(\mathbb{R}^d)}$ can be made arbitrarily close to $\psi_\beta(y)$ uniformly in $y$, by choosing a sufficiently small $\delta$, and $\|f_\beta^{\delta,y}\|$ is uniformly bounded in $y$ for any fixed $\delta$. This justifies (53). □

Next, let us study the distribution of the end of the polymer with respect to the measure $P_{\beta,T}$ as $T \to \infty$.

Theorem 8.2. The distribution of $x(T)$ with respect to the measure $P_{\beta,T}$ converges, weakly, as $T \to \infty$, to the distribution with the density $\psi_\beta/\|\psi_\beta\|_{L^1(\mathbb{R}^d)}$.

Proof. The density of $x(T)$ with respect to the Lebesgue measure is equal to

$$\frac{p_\beta(T, 0, x)}{Z_{\beta,T}(0)} = \frac{\exp(\lambda_0(\beta)T)\bigl(\psi_\beta(0)\psi_\beta(x) + q(T, 0, x)\bigr)}{\exp(\lambda_0(\beta)T)\bigl(\|\psi_\beta\|_{L^1(\mathbb{R}^d)}\psi_\beta(0) + o(1)\bigr)}, \tag{54}$$
(54)



where $q$ is the same as in (53). When $T \to \infty$, the right-hand side of (54) converges to $\psi_\beta(x)/\|\psi_\beta\|_{L^1(\mathbb{R}^d)}$ uniformly in $x$ by Theorem 8.1. This justifies the weak convergence. □

Now let us examine the behavior of the polymer in a region separated both from zero and $T$. Let $S(T)$ be such that

$$\lim_{T \to \infty} S(T) = \lim_{T \to \infty}\bigl(T - S(T)\bigr) = +\infty. \tag{55}$$

Let $s > 0$ be fixed. Consider the process $y^T(t) = x(S(T) + t)$, $0 \le t \le s$.

Theorem 8.3. The distribution of the process $y^T(t)$ with respect to either of the measures $P_{\beta,T}$ or $P_{\beta,T}(\cdot \mid x(T) = 0)$ converges as $T \to \infty$, weakly in the space $C([0, s], \mathbb{R}^d)$, to the distribution of a stationary Markov process with invariant density $\psi_\beta^2$ and the generator

$$L_\beta g = \tfrac12\Delta g + \frac{(\nabla\psi_\beta, \nabla g)}{\psi_\beta}.$$

Remark. Let

$$r_\beta(t, y, x) = \frac{p_\beta(t, y, x)\psi_\beta(x)}{\psi_\beta(y)}\exp\bigl(-\lambda_0(\beta)t\bigr). \tag{56}$$

Note that $r_\beta(t, y, x)$ is the fundamental solution for the operator $\partial/\partial t - L^*_\beta$, where $L^*_\beta$ is the formal adjoint to $L_\beta$. Thus $r_\beta$ is the transition density for the Markov process with the generator $L_\beta$. Also note that $L^*_\beta\psi_\beta^2 = 0$, and thus $\psi_\beta^2$ is the invariant density for the Markov process.

Proof of Theorem 8.3. We shall only consider the measure $P_{\beta,T}$ since the arguments for the measure $P_{\beta,T}(\cdot \mid x(T) = 0)$ are completely analogous. First, let us prove the convergence of the finite-dimensional distributions. For $y \in \mathbb{R}^d$ and a Borel set $A \in \mathcal{B}(\mathbb{R}^d)$, let

$$R(t, y, A) = \int_A r_\beta(t, y, x)\,dx,$$

with $r_\beta$ given by (56). Note that $R$ is a Markov transition function since

$$\int_{\mathbb{R}^d} r_\beta(t, y, x)\,dx \equiv 1.$$
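The two properties of $r_\beta$ used here (unit total mass and invariance of $\psi_\beta^2$) have an elementary finite-dimensional analogue, sketched below as an illustration only. In this sketch, which is an assumption and not part of the text, a symmetric matrix `P` with positive entries plays the role of the kernel $p_\beta(t, \cdot, \cdot)$ at a fixed time, its Perron eigenvalue plays the role of $\exp(\lambda_0(\beta)t)$, and its Perron vector plays the role of $\psi_\beta$.

```python
import math

def perron(P, iters=2000):
    """Perron eigenvalue and unit-norm eigenvector of a positive matrix by power iteration."""
    n = len(P)
    psi = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        nxt = [sum(P[i][j] * psi[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in nxt))   # ||P psi|| with ||psi|| = 1
        psi = [x / lam for x in nxt]
    return lam, psi

def doob_transform(P):
    """Discrete analogue of (56): R[y][x] = P[y][x] * psi[x] / (lam * psi[y])."""
    lam, psi = perron(P)
    n = len(P)
    R = [[P[y][x] * psi[x] / (lam * psi[y]) for x in range(n)] for y in range(n)]
    return R, psi

if __name__ == "__main__":
    # Hypothetical symmetric kernel with positive entries.
    P = [[2.0, 1.0, 0.5],
         [1.0, 3.0, 1.0],
         [0.5, 1.0, 1.5]]
    R, psi = doob_transform(P)
    print([sum(row) for row in R])                                   # row sums
    print([sum(psi[y] ** 2 * R[y][x] for y in range(3)) - psi[x] ** 2
           for x in range(3)])                                       # invariance defect
```

Each row of `R` sums to one because `psi` is an eigenvector of `P`, and the vector of squares `psi[x]**2` is invariant because `P` is symmetric, mirroring the continuous statements.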

The generator of the corresponding Markov process is $L_\beta$ and the invariant density is $\psi_\beta^2$. Let $0 \le t_1 < \cdots < t_n \le s$. The density of the random vector $(y^T(t_1), \ldots, y^T(t_n))$ with respect to the Lebesgue measure on $\mathbb{R}^{dn}$ is equal to

$$\rho^T(x_1, \ldots, x_n) = p_\beta\bigl(S(T) + t_1, 0, x_1\bigr)\,p_\beta(t_2 - t_1, x_1, x_2)\cdots p_\beta(t_n - t_{n-1}, x_{n-1}, x_n)\,Z_{\beta,T-S(T)-t_n}(x_n)\bigl(Z_{\beta,T}(0)\bigr)^{-1}.$$

We replace here all factors $p_\beta$, except the first one, by $r_\beta$ using (56). We replace the first factor and the factors $Z$ by their asymptotic expansions given in Theorems 8.1 and 7.2, respectively. This leads to

$$\rho^T(x_1, \ldots, x_n) = \psi_\beta^2(x_1)\,r_\beta(t_2 - t_1, x_1, x_2)\cdots r_\beta(t_n - t_{n-1}, x_{n-1}, x_n) + o(1), \qquad T \to \infty,$$
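where the remainder tends to zero uniformly in $(x_1, \ldots, x_n)$. By the remark made after the statement of the theorem, this justifies the convergence of the finite-dimensional distributions of $y^T$ to those of the Markov process.

It remains to justify the tightness of the family of measures induced by the processes $y^T$. From the convergence of the one-dimensional distributions it follows that for any $\eta > 0$ there is $a > 0$ such that

$$P_{\beta,T}\bigl(|y^T(0)| > a\bigr) \le \eta \tag{57}$$

for all sufficiently large $T$. For a continuous function $x : [0, T] \to \mathbb{R}^d$, $x(0) = 0$, let

$$m_T(x, \delta) = \sup_{|t_1 - t_2| \le \delta,\ S(T) \le t_1, t_2 \le S(T) + s} |x(t_1) - x(t_2)|.$$

Let us prove that for each $\varepsilon, \eta > 0$, there is $\delta > 0$ such that

$$P_{\beta,T}\bigl(m_T(x, \delta) > \varepsilon\bigr) \le \eta \tag{58}$$

for all sufficiently large $T$. Observe that

$$P_{\beta,T}\bigl(m_T(x, \delta) > \varepsilon\bigr) = \bigl(Z_{\beta,T}(0)\bigr)^{-1} E_{0,T}\Bigl[\exp\Bigl(\int_0^{S(T)+s} \beta v(x(t))\,dt\Bigr)\chi_{\{m_T(x,\delta) > \varepsilon\}}\,Z_{\beta,T-S(T)-s}\bigl(x(S(T) + s)\bigr)\Bigr]$$

$$\le \bigl(Z_{\beta,T}(0)\bigr)^{-1}\sup_{x \in \mathbb{R}^d} Z_{\beta,T-S(T)-s}(x)\,E_{0,T}\Bigl[\exp\Bigl(\int_0^{S(T)+s} \beta v(x(t))\,dt\Bigr)\chi_{\{m_T(x,\delta) > \varepsilon\}}\Bigr]$$

$$\le \exp\Bigl(s\beta\sup_{x \in \mathbb{R}^d} v(x)\Bigr)\bigl(Z_{\beta,T}(0)\bigr)^{-1}\sup_{x \in \mathbb{R}^d} Z_{\beta,T-S(T)-s}(x)\,E_{0,T}\Bigl[\exp\Bigl(\int_0^{S(T)} \beta v(x(t))\,dt\Bigr)\chi_{\{m_T(x,\delta) > \varepsilon\}}\Bigr]$$

$$\le \exp\Bigl(s\beta\sup_{x \in \mathbb{R}^d} v(x)\Bigr)\bigl(Z_{\beta,T}(0)\bigr)^{-1}\sup_{x \in \mathbb{R}^d} Z_{\beta,T-S(T)-s}(x)\,\sup_{x \in \mathbb{R}^d} p_\beta\bigl(S(T), 0, x\bigr)\,C(\delta, \varepsilon),$$

where $C(\delta, \varepsilon)$ is the probability that for a $d$-dimensional Brownian motion $W_t$, $0 \le t \le s$, we have

$$\sup_{|t_1 - t_2| \le \delta,\ 0 \le t_1, t_2 \le s} |W(t_1) - W(t_2)| > \varepsilon.$$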

Note that

$$\exp\Bigl(s\beta\sup_{x \in \mathbb{R}^d} v(x)\Bigr)\bigl(Z_{\beta,T}(0)\bigr)^{-1}\sup_{x \in \mathbb{R}^d} Z_{\beta,T-S(T)-s}(x)\,\sup_{x \in \mathbb{R}^d} p_\beta\bigl(S(T), 0, x\bigr)$$

is bounded, as follows from Theorems 7.2 and 8.1, while $C(\delta, \varepsilon)$ can be made arbitrarily small by selecting a sufficiently small $\delta$. This justifies (58). Since the inequalities (57) and (58) hold for all sufficiently large $T$, by choosing different $a$ and $\delta$, we can make sure that they hold for all $T$. Thus the family of measures induced by the processes $y^T$ is tight. □

Remark. If instead of (55) we assume that $S(T) = 0$, the result of Theorem 8.3 will hold with the only difference that the initial distribution for the limiting Markov process will now be concentrated at zero, instead of being the invariant distribution.

9. Behavior of the polymer for $\beta < \beta_{cr}$

First, we shall study the asymptotic behavior of the solution $u_\beta(t, x)$ of the Cauchy problem and of the fundamental solution $p_\beta(t, y, x)$ when $t \to \infty$, $|y| \le \varepsilon^{-1}$, $\varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t$, and $\varepsilon > 0$ is small but fixed. Recall that $\varphi_\beta$ was defined before Theorem 7.2.

Lemma 9.1. Let $d \ge 3$, $0 \le \beta < \beta_{cr}$, $\varepsilon > 0$ and $f \in C_{\exp}(\mathbb{R}^d)$, $f \ge 0$. We have the following asymptotics for the solution $u_\beta$ of the Cauchy problem with the initial data $f$:

$$u_\beta(t, x) = (2\pi t)^{-d/2}\exp\bigl(-|x|^2/2t\bigr)\Bigl(\langle 1 + \varphi_\beta, f\rangle_{L^2(\mathbb{R}^3)} + q_f(t, x)\Bigr), \tag{59}$$

where for some constant $C_\beta(\varepsilon)$ we have

$$\sup_{\varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t} |q_f(t, x)| \le C_\beta(\varepsilon)\,t^{-1/2}\|f\|_{C_{\exp}(\mathbb{R}^3)}, \qquad t \ge 1.$$

We have the following asymptotics for the fundamental solution of the parabolic equation:

$$p_\beta(t, y, x) = (2\pi t)^{-d/2}\exp\bigl(-|x|^2/2t\bigr)\bigl(1 + \varphi_\beta(y) + q(t, y, x)\bigr), \tag{60}$$

where

$$\lim_{t \to \infty}\ \sup_{|y| \le \varepsilon^{-1},\ \varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t} |q(t, y, x)| = 0.$$

Proof. Note that (60) follows from (59) since the fundamental solution at time $t$ is equal to the solution with the initial data $p_\beta(\delta, y, \cdot)$ evaluated at time $t - \delta$ (the same argument was used in the proof of Theorem 8.1). Therefore it is sufficient to prove (59). For the sake of transparency of exposition, we shall consider only the case $d = 3$. From Lemma 5.8 it follows that we can put $a = 0$ in (44) when $\beta < \beta_{cr}$. Thus using (22) and the explicit formula for $R_0(\lambda)$, we obtain

$$u_\beta(t, x) = -\frac{1}{2\pi i}\int_{\Gamma(0)} e^{\lambda t}\bigl(R_\beta(\lambda)f\bigr)(x)\,d\lambda = \frac{1}{2\pi i}\int_{\Gamma(0)}\int_{\mathbb{R}^3} e^{\lambda t}\,\frac{e^{-\sqrt{2\lambda}|x - y|}}{2\pi|x - y|}\,g(\lambda, y)\,dy\,d\lambda, \tag{61}$$

where

$$g(\lambda) = \bigl(I + \beta A(\lambda)\bigr)^{-1}f. \tag{62}$$

By Lemma 5.1, $A(\lambda)$ is an entire function of $\sqrt\lambda$. By the analytic Fredholm theorem, $(I + \beta A(\lambda))^{-1}$ is a meromorphic function of $\sqrt\lambda$, since $A(\lambda)$ tends to zero as $\lambda \to +\infty$, $\operatorname{Im}(\lambda) = 0$. It does not have a pole at zero, as follows from Lemma 5.4 and Remark 1 following Lemma 5.5. Therefore, by the Taylor formula, for all sufficiently small $|\lambda|$, $\lambda \in \Gamma(0)$, and some $c > 0$, we have

$$g(\lambda) = g_0 + g_1(\lambda), \qquad \|g_1(\lambda)\|_{C_{\exp}(\mathbb{R}^3)} \le c\sqrt{|\lambda|}\,\|f\|_{C_{\exp}(\mathbb{R}^3)}, \tag{63}$$
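where $g_0 = (I + \beta A(0))^{-1}f$. Since $\|(I + \beta A(\lambda))^{-1}\|_{C_{\exp}(\mathbb{R}^3)}$ is bounded on $\Gamma(0)$, formula (63) is valid for all $\lambda \in \Gamma(0)$, and not only in a neighborhood of zero. Let $u_\beta^{(1)}(t, x)$ be given by (61) with $g$ replaced by $g_1$. Then

$$u_\beta^{(1)}(t, x) = \frac{1}{2\pi i}\int_{\Gamma(0)}\int_{|y| \le \varepsilon\sqrt t/2} e^{\lambda t}\,\frac{e^{-\sqrt{2\lambda}|x - y|}}{2\pi|x - y|}\,g_1(\lambda, y)\,dy\,d\lambda + \frac{1}{2\pi i}\int_{\Gamma(0)}\int_{|y| > \varepsilon\sqrt t/2} e^{\lambda t}\,\frac{e^{-\sqrt{2\lambda}|x - y|}}{2\pi|x - y|}\,g_1(\lambda, y)\,dy\,d\lambda = I_1 + I_2.$$

We change the variable $\lambda t = \zeta$ and use the estimate $1/|x - y| < 2/(\varepsilon\sqrt t)$ in $I_1$. This implies

$$|I_1| \le \frac{c\|f\|_{C_{\exp}(\mathbb{R}^3)}}{2\pi^2\varepsilon t^2}\int_{\Gamma(0)}\int_{|y| \le \varepsilon\sqrt t/2} \Bigl|\sqrt\zeta\,e^{\zeta - \sqrt{2\zeta}\,|x - y|/\sqrt t}\Bigr|\,e^{-|y|^2}\,dy\,|d\zeta| \le \frac{C(\varepsilon)\|f\|_{C_{\exp}(\mathbb{R}^3)}}{t^2}, \qquad \varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t.$$

In $I_2$ we change the variables $\lambda t = \zeta$, $x = \sqrt t\,z$, $y = \sqrt t\,u$ and use the estimate $e^{-|y|^2} \le e^{-(\varepsilon\sqrt t/2)^2}$. This leads to the exponential decay of $|I_2|$ as $t \to \infty$. Hence

$$u_\beta(t, x) = \frac{1}{2\pi i}\int_{\mathbb{R}^3}\int_{\Gamma(0)} e^{\lambda t}\,\frac{e^{-\sqrt{2\lambda}|x - y|}}{2\pi|x - y|}\,g_0(y)\,d\lambda\,dy + r_1(t, x), \tag{64}$$

where the remainder $r_1(t, x)$ satisfies

$$\sup_{\varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t} |r_1(t, x)| = \|f\|_{C_{\exp}(\mathbb{R}^3)}\,O\bigl(t^{-2}\bigr) \quad \text{as } t \to \infty. \tag{65}$$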

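The integral over $\Gamma(0)$ in (64) can be evaluated, and we obtain

$$u_\beta(t, x) = \frac{1}{(2\pi t)^{3/2}}\int_{\mathbb{R}^3} e^{-\frac{|x - y|^2}{2t}}g_0(y)\,dy + r_1(t, x).$$

Since $\|g_0\|_{C_{\exp}(\mathbb{R}^3)} \le C\|f\|_{C_{\exp}(\mathbb{R}^3)}$ for some constant $C$, we have

$$u_\beta(t, x) = \frac{1}{(2\pi t)^{3/2}}\,e^{-\frac{|x|^2}{2t}}\int_{\mathbb{R}^3} g_0(y)\,dy + r_2(t, x),$$

where $r_2$ satisfies (65) with $r_1$ replaced by $r_2$. In order to prove (59), it remains to show that

$$\int_{\mathbb{R}^3} g_0(x)\,dx = \int_{\mathbb{R}^3}\bigl(1 + \varphi_\beta(x)\bigr)f(x)\,dx. \tag{66}$$

Since $(I + \beta vR_0(0))g_0 = f$, we have $g_0 = f - \beta vR_0(0)g_0$. Recall that $\varphi_\beta$ is the solution of (46) with $f = -\beta v$. Thus

$$\int_{\mathbb{R}^3} g_0(x)\,dx = \int_{\mathbb{R}^3} f(x)\,dx + \int_{\mathbb{R}^3}\Bigl(\tfrac12\Delta\varphi_\beta + \beta v\varphi_\beta\Bigr)R_0(0)g_0\,dx.$$

Since $\varphi_\beta, R_0(0)g_0 = O(1/|x|)$ and their derivatives are of order $O(|x|^{-2})$ as $|x| \to \infty$, the Green formula implies

$$\int_{\mathbb{R}^3} \tfrac12\Delta\varphi_\beta\,R_0(0)g_0\,dx = \int_{\mathbb{R}^3} \varphi_\beta\,\tfrac12\Delta R_0(0)g_0\,dx = \int_{\mathbb{R}^3} \varphi_\beta g_0\,dx.$$

Hence

$$\int_{\mathbb{R}^3} g_0(x)\,dx = \int_{\mathbb{R}^3} f(x)\,dx + \int_{\mathbb{R}^3} \varphi_\beta\bigl(I + \beta vR_0(0)\bigr)g_0\,dx,$$

which implies (66). □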

Next, let us study the distribution of the polymer with respect to the measure $P_{\beta,T}$ as $T \to \infty$. Consider the process $y^T(t) = x(tT)/\sqrt T$, $0 \le t \le 1$.

Theorem 9.2. Let $d \ge 3$ and $0 \le \beta < \beta_{cr}$. With respect to $P_{\beta,T}$, the distribution of the process $y^T(t)$ converges as $T \to \infty$, weakly in the space $C([0, 1], \mathbb{R}^d)$, to the distribution of the $d$-dimensional Brownian motion. With respect to $P_{\beta,T}(\cdot \mid x(T) = 0)$, the distribution of the process $y^T(t)$ converges as $T \to \infty$, weakly in the space $C([0, 1], \mathbb{R}^d)$, to the distribution of the $d$-dimensional Brownian bridge.

Proof. We shall only prove the first statement since the proof of the second one is completely similar. First, let us prove the convergence of the finite-dimensional distributions. Clearly $P_{\beta,T}(y^T(0) = 0) = 1$. Let $0 < t_1 < \cdots < t_n \le 1$. The density of the random vector $(y^T(t_1), \ldots, y^T(t_n))$ with respect to the Lebesgue measure on $\mathbb{R}^{dn}$ is equal to

$$\rho^T(x_1, \ldots, x_n) = T^{\frac{dn}{2}}\,p_\beta\bigl(t_1T, 0, x_1T^{\frac12}\bigr)\,p_\beta\bigl((t_2 - t_1)T, x_1T^{\frac12}, x_2T^{\frac12}\bigr)\cdots p_\beta\bigl((t_n - t_{n-1})T, x_{n-1}T^{\frac12}, x_nT^{\frac12}\bigr)\bigl(Z_{\beta,T}(0)\bigr)^{-1}.$$

By Lemma 9.1,

$$p_\beta\bigl(t_1T, 0, x_1T^{\frac12}\bigr) = T^{-d/2}(2\pi t_1)^{-d/2}\bigl(1 + \varphi_\beta(0)\bigr)\exp\bigl(-|x_1|^2/2t_1\bigr)\bigl(1 + r(T, x_1)\bigr),$$

where

$$\lim_{T \to \infty}\ \sup_{\varepsilon \le |x_1| \le \varepsilon^{-1}} |r(T, x_1)| = 0. \tag{67}$$

Note that $p_\beta \ge p_0$ since $v$ is non-negative, and $\lim_{T \to \infty}(Z_{\beta,T}(0)) = (1 + \varphi_\beta(0))$ by Theorem 7.2. Therefore,

$$\rho^T(x_1, \ldots, x_n) \ge (2\pi t_1)^{-\frac d2}e^{-\frac{|x_1|^2}{2t_1}}\bigl(1 + q(T, x_1)\bigr)\bigl(2\pi(t_2 - t_1)\bigr)^{-\frac d2}e^{-\frac{|x_2 - x_1|^2}{2(t_2 - t_1)}}\cdots\bigl(2\pi(t_n - t_{n-1})\bigr)^{-\frac d2}e^{-\frac{|x_n - x_{n-1}|^2}{2(t_n - t_{n-1})}} = \rho^W_{t_1, \ldots, t_n}(x_1, \ldots, x_n)\bigl(1 + q(T, x_1)\bigr), \tag{68}$$

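where $\rho^W_{t_1, \ldots, t_n}(x_1, \ldots, x_n)$ is the density of the Gaussian vector $(W(t_1), \ldots, W(t_n))$, where $W$ is a $d$-dimensional Brownian motion, and $q(T, x_1)$ satisfies (67) with $q$ instead of $r$. Since $\varepsilon$ was an arbitrary positive number, this implies the convergence of the finite-dimensional distributions of $y^T$ to the finite-dimensional distributions of the Brownian motion. Indeed, the estimate from below for $\rho^T(x_1, \ldots, x_n)$ in (68) is sufficient since we know a priori that $\rho^W_{t_1, \ldots, t_n}(x_1, \ldots, x_n)$ is the density of a probability measure.

It remains to prove tightness of the family of processes $y^T$, $T \ge 1$. For a continuous function $x : [0, T] \to \mathbb{R}^d$, let

$$m(x, \delta) = \sup_{|t_1 - t_2| \le \delta T,\ 0 \le t_1, t_2 \le T} |x(t_1) - x(t_2)|/\sqrt T,$$

$$m'(x, \delta, \varepsilon) = \sup_{|t_1 - t_2| \le \delta T,\ 0 \le t_1, t_2 \le T,\ |x(t_1)| \ge \varepsilon\sqrt T} |x(t_1) - x(t_2)|/\sqrt T.$$

The tightness will follow if we show that for each $\varepsilon, \eta > 0$ there is $\delta > 0$ such that

$$P_{\beta,T}\bigl(m(x, \delta) > \varepsilon\bigr) \le \eta$$

for all sufficiently large $T$. Note that $m(x, \delta) > \varepsilon$ implies that $m'(x, \delta, \varepsilon/4) > \varepsilon/4$. Therefore, it is sufficient to show that

$$P_{\beta,T}\bigl(m'(x, \delta, \varepsilon/4) > \varepsilon/4\bigr) \le \eta. \tag{69}$$

Fix $\varepsilon > 0$. For a continuous function $x : [0, T] \to \mathbb{R}^d$, let

$$\tau = \min\bigl(T,\ \inf\{t \ge 0 : |x(t)| = \varepsilon\sqrt T/4\}\bigr).$$

Let $E_\delta$ be the event that $m(x, \delta) > \varepsilon/4$ and $E'_\delta$ the event that $m'(x, \delta, \varepsilon/4) > \varepsilon/4$. For $0 \le s \le T$, let $E^s_\delta$ be the event that a continuous function $x : [0, T - s] \to \mathbb{R}^d$ satisfies

$$\sup_{|t_1 - t_2| \le \delta T,\ 0 \le t_1, t_2 \le T - s} |x(t_1) - x(t_2)|/\sqrt T > \varepsilon/4.$$

Then

$$P_{\beta,T}(E'_\delta) = \bigl(Z_{\beta,T}(0)\bigr)^{-1}E_{0,T}\Bigl[\exp\Bigl(\int_0^T \beta v(x(t))\,dt\Bigr)\chi_{E'_\delta}\Bigr] \le \bigl(Z_{\beta,T}(0)\bigr)^{-1}E_{0,T}\Bigl[\exp\Bigl(\int_0^\tau \beta v(x(t))\,dt\Bigr)E^{x(\tau)}_{0,T-\tau}\Bigl(\chi_{E^\tau_\delta}\exp\Bigl(\int_0^{T-\tau} \beta v(x(t))\,dt\Bigr)\Bigr)\Bigr],$$

where $E^x_{0,T}$ denotes the expectation with respect to the measure induced by the Brownian motion starting at the point $x$. Since

$$E_{0,T}\exp\Bigl(\int_0^\tau \beta v(x(t))\,dt\Bigr) \le Z_{\beta,T}(0)$$

and

$$E^{x(\tau)}_{0,T-\tau}\Bigl(\chi_{E^\tau_\delta}\exp\Bigl(\int_0^{T-\tau} \beta v(x(t))\,dt\Bigr)\Bigr) \le \sup_{x \in \mathbb{R}^d,\ |x| = \varepsilon\sqrt T/4} E^x_{0,T}\Bigl(\chi_{E_\delta}\exp\Bigl(\int_0^T \beta v(x(t))\,dt\Bigr)\Bigr),$$

it is sufficient to estimate

$$\sup_{x \in \mathbb{R}^d,\ |x| = \varepsilon\sqrt T/4} E^x_{0,T}\Bigl(\chi_{E_\delta}\exp\Bigl(\int_0^T \beta v(x(t))\,dt\Bigr)\Bigr). \tag{70}$$

Let $E'$ be the event that a trajectory starting at $x$ reaches the support of $v$ before time $T$. Note that

$$\lim_{T \to \infty}\ \sup_{x \in \mathbb{R}^d,\ |x| = \varepsilon\sqrt T/4} P^x_{0,T}(E') = 0$$

since $d \ge 3$. The expression in (70) is estimated from above by

$$\sup_{x \in \mathbb{R}^d,\ |x| = \varepsilon\sqrt T/4}\Bigl(E^x_{0,T}\Bigl(\chi_{E'}\exp\Bigl(\int_0^T \beta v(x(t))\,dt\Bigr)\Bigr) + P^x_{0,T}(E_\delta)\Bigr).$$

The second term does not depend on $T$ due to the scaling invariance of the Brownian motion, and can be made arbitrarily small by selecting a sufficiently small $\delta$. Due to the Markov property of the Brownian motion, the first term is estimated from above by

$$\sup_{x \in \mathbb{R}^d,\ |x| = \varepsilon\sqrt T/4} P^x_{0,T}(E') \cdot \sup_{x \in \operatorname{supp}(v)} Z_{\beta,T}(x),$$

and thus tends to zero when $T \to \infty$. □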

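10. Behavior of the polymer for $\beta = \beta_{cr}$

In this section we assume that $d = 3$. Again, we start with the asymptotic behavior of the solution $u_\beta(t, x)$ of the Cauchy problem and of the fundamental solution $p_\beta(t, y, x)$ when $t \to \infty$, $|y| \le \varepsilon^{-1}$, $\varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t$, and $\varepsilon > 0$ is small but fixed. Recall that $\psi$ is the positive ground state for $H_{\beta_{cr}}$ normalized by the condition $\|\beta_{cr}v\psi\|_{L^2_{\exp}(\mathbb{R}^3)} = 1$ (see the remark following Lemma 5.4 and Theorem 7.2). For $f \in C_{\exp}(\mathbb{R}^3)$, define

$$\alpha(f) = \varkappa\int_{\mathbb{R}^3} \psi(x)f(x)\,dx, \qquad \varkappa = \frac{1}{\sqrt{2\pi}\,\beta_{cr}\int_{\mathbb{R}^3} v(x)\psi(x)\,dx}.$$

We can formally apply this to $f$ being the $\delta$-function centered at a point $y$, and thus define $\alpha(\delta_y(x)) = \varkappa\psi(y)$.

Theorem 10.1. Let $d = 3$, $\beta = \beta_{cr}$, $\varepsilon > 0$ and $f \in C_{\exp}(\mathbb{R}^3)$, $f \ge 0$. We have the following asymptotics for the solution $u_\beta$ of the Cauchy problem with the initial data $f$:

$$u_\beta(t, x) = \frac{1}{|x|\sqrt t}\exp\bigl(-|x|^2/2t\bigr)\bigl(\alpha(f) + q_f(t, x)\bigr), \tag{71}$$

where for some constant $C_\beta(\varepsilon)$ we have

$$\sup_{\varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t} |q_f(t, x)| \le C_\beta(\varepsilon)\,t^{-1/2}\|f\|_{C_{\exp}(\mathbb{R}^3)}, \qquad t \ge 1.$$

We have the following asymptotics for the fundamental solution of the parabolic equation:

$$p_\beta(t, y, x) = \frac{\varkappa}{|x|\sqrt t}\exp\bigl(-|x|^2/2t\bigr)\bigl(\psi(y) + q(t, y, x)\bigr), \tag{72}$$

where

$$\lim_{t \to \infty}\ \sup_{|y| \le \varepsilon^{-1},\ \varepsilon\sqrt t \le |x| \le \varepsilon^{-1}\sqrt t} |q(t, y, x)| = 0.$$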

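Proof. As in Lemma 9.1, formula (72) follows from (71). Lemma 5.7 implies

$$\bigl(I + \beta_{cr}A(\lambda)\bigr)^{-1} = \frac{\beta_{cr}}{\beta(\lambda) - \beta_{cr}}\,B + O(1), \qquad \lambda\to0,\ \lambda\in\mathbb{C}',$$

where $B$ is the one-dimensional operator with the kernel

$$B(x,y) = \frac{v(x)\psi(x)\psi(y)}{\int_{\mathbb{R}^3} v(x)\psi^2(x)\,dx}.$$

From here, (34) and (39) we get

$$\bigl(I + \beta_{cr}A(\lambda)\bigr)^{-1} = \frac{1}{\beta_{cr}\gamma\sqrt{\lambda}}\,B + O(1), \qquad \lambda\to0,\ \lambda\in\mathbb{C}',$$

where $\gamma$ is defined in (40), (41). Hence, for any $f \in C_{\exp}(\mathbb{R}^3)$ and $\lambda\to0$, $\lambda\in\mathbb{C}'$,

$$h(\lambda,x) := \bigl(I + \beta_{cr}A(\lambda)\bigr)^{-1}f = \frac{\tilde\alpha(f)}{\sqrt{\lambda}}\,v(x)\psi(x) + g_1(\lambda), \qquad \tilde\alpha(f) = \frac{\sqrt{2}\,\pi\int_{\mathbb{R}^3}\psi(x)f(x)\,dx}{\beta_{cr}\bigl(\int_{\mathbb{R}^3} v(x)\psi(x)\,dx\bigr)^2}, \tag{73}$$

where $\|g_1(\lambda)\| \le c\,\|f\|_{C_{\exp}(\mathbb{R}^3)}$ for some constant $c$. Now, similarly to (61), we have

$$u_\beta(t,x) = \frac{-1}{2\pi i}\int_{\Gamma(0)} e^{\lambda t}R_\beta(\lambda)f(x)\,d\lambda = \frac{1}{2\pi i}\int_{\Gamma(0)}\int_{\mathbb{R}^3} e^{\lambda t}\,\frac{e^{-\sqrt{2\lambda}|x-y|}}{2\pi|x-y|}\,h(\lambda,y)\,dy\,d\lambda.$$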

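The integral with $g_1(\lambda)$ instead of $h$ can be estimated similarly to the estimate on $u_\beta$ in the case of $\beta < \beta_{cr}$. This leads to the following analogue of (64):

$$u_\beta(t,x) = \frac{\tilde\alpha(f)}{2\pi i}\int_{\mathbb{R}^3}\int_{\Gamma(0)} e^{\lambda t}\,\frac{e^{-\sqrt{2\lambda}|x-y|}}{2\pi\sqrt{\lambda}\,|x-y|}\,v(y)\psi(y)\,d\lambda\,dy + r_1(t,x),$$

where the remainder $r_1(t,x)$ satisfies

$$\sup_{\varepsilon\sqrt{t} \le |x| \le \varepsilon^{-1}\sqrt{t}} \bigl|r_1(t,x)\bigr| = \|f\|_{C_{\exp}(\mathbb{R}^3)}\,O\bigl(t^{-3/2}\bigr) \quad\text{as } t\to\infty. \tag{74}$$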

We evaluate the integral over $\Gamma(0)$:

$$\frac{1}{2\pi i}\int_{\Gamma(0)} \frac{e^{\lambda t - \sqrt{2\lambda}|x-y|}}{\sqrt{2\lambda}}\,d\lambda = \frac{1}{\sqrt{2\pi t}}\,e^{-\frac{|x-y|^2}{2t}}. \tag{75}$$

This equality simply means that the inverse Laplace transform of the Green function of the one-dimensional Helmholtz equation coincides with the fundamental solution of the corresponding heat equation. Thus,

$$u_\beta(t,x) = \frac{\tilde\alpha(f)}{2\pi^{3/2}\sqrt{t}}\int_{\mathbb{R}^3} \frac{1}{|x-y|}\,e^{-\frac{|x-y|^2}{2t}}\,v(y)\psi(y)\,dy + r_1(t,x).$$

This implies (71) since $v$ has compact support. □
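Identity (75) can be sanity-checked numerically in its forward form: the Laplace transform in $t$ of the one-dimensional heat kernel $e^{-a^2/2t}/\sqrt{2\pi t}$, with $a = |x-y| > 0$, should equal $e^{-\sqrt{2\lambda}a}/\sqrt{2\lambda}$ for real $\lambda > 0$. The following is only an illustrative sketch; the function names, the Simpson quadrature, and the truncation of the integral are choices made here and are not part of the paper.

```python
import math

def heat_kernel_1d(t, a):
    """Right-hand side of (75) as a function of t, with a = |x - y| > 0."""
    return math.exp(-a * a / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def laplace_of_heat_kernel(lam, a, upper=12.0, steps=20000):
    """Simpson approximation of int_0^inf exp(-lam*t) * heat_kernel_1d(t, a) dt.

    The substitution t = s**2 removes the 1/sqrt(t) factor, so the integrand
    becomes 2/sqrt(2*pi) * exp(-lam*s^2 - a^2/(2*s^2)) on s in [0, upper].
    The tail beyond `upper` is neglected (assumes lam is not tiny).
    """
    def g(s):
        if s == 0.0:
            return 0.0  # exp(-a^2/(2 s^2)) tends to 0 as s -> 0 when a > 0
        return 2.0 / math.sqrt(2.0 * math.pi) * math.exp(
            -lam * s * s - a * a / (2.0 * s * s))

    h = upper / steps  # steps must be even for Simpson's rule
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def helmholtz_green_1d(lam, a):
    """exp(-sqrt(2*lam)*a) / sqrt(2*lam), the integrand of (75) without exp(lam*t)."""
    return math.exp(-math.sqrt(2.0 * lam) * a) / math.sqrt(2.0 * lam)

if __name__ == "__main__":
    for lam, a in [(1.0, 1.0), (0.5, 2.0), (3.0, 0.3)]:
        print(lam, a, laplace_of_heat_kernel(lam, a), helmholtz_green_1d(lam, a))
```

The two printed columns should agree to many digits, which is the real-$\lambda$ counterpart of the contour identity used in the proof.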

The next theorem concerns the fundamental solution when both $y$ and $x$ are at a distance of order $\sqrt{t}$ away from the origin. Note that now there are two terms in the asymptotic expansion for the fundamental solution which are of the same order in $t$. The main terms have the order $t^{-3/2}$ when $t\to\infty$, compared with $t^{-1}$ in the case considered in Theorem 10.1 (where $y$ was bounded).

Theorem 10.2. Let $d = 3$, $\beta = \beta_{cr}$, $\varepsilon > 0$. We have the following asymptotics for the fundamental solution of the parabolic equation:

$$p_\beta(t,y,x) = p_0(t,y,x) + \frac{1}{(2\pi)^{3/2}|y||x|\sqrt{t}}\,e^{-(|y|+|x|)^2/2t}\bigl(1 + q(t,y,x)\bigr), \tag{76}$$

where

$$\lim_{t\to\infty}\ \sup_{\varepsilon\sqrt{t} \le |y|,|x| \le \varepsilon^{-1}\sqrt{t}} \bigl|q(t,y,x)\bigr| = 0. \tag{77}$$

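Proof. Let $p_\beta(t,y,x) = p_0(t,y,x) + u$. Then $u_t = H_\beta u + \beta v p_0$, $u|_{t=0} = 0$, and therefore by the Duhamel formula

$$u(t,y,x) = \int_0^t\int_{\mathbb{R}^3} p_\beta(t-s,z,x)\,\beta v(z)\,p_0(s,y,z)\,dz\,ds.$$

Using (72), we get

$$u(t,y,x) = \int_0^t\int_{\mathbb{R}^3} \frac{\varkappa\exp\bigl(-|x|^2/2(t-s)\bigr)}{|x|\sqrt{t-s}}\,\psi(z)\beta v(z)\,p_0(s,y,0)\,dz\,ds + h_1 + h_2 \tag{78}$$

with

$$h_1 = \int_0^t\int_{\mathbb{R}^3} \frac{\varkappa\exp\bigl(-|x|^2/2(t-s)\bigr)}{|x|\sqrt{t-s}}\,\psi(z)\beta v(z)\bigl(p_0(s,y,z) - p_0(s,y,0)\bigr)\,dz\,ds,$$

$$h_2 = \int_0^t\int_{\mathbb{R}^3} \frac{\varkappa\exp\bigl(-|x|^2/2(t-s)\bigr)}{|x|\sqrt{t-s}}\,q(t-s,z,x)\,\beta v(z)\,p_0(s,y,z)\,dz\,ds,$$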

where $q$ is the same as in (72). The integral on the right-hand side of (78) (let us denote it by $w$) is a convolution of two functions and can be evaluated using the Laplace transform (see (75)). It gives the second term on the right-hand side of (76). The contribution from the other two terms can be shown to satisfy (77). Let us prove the statement about $w$. In fact,

$$w = \varkappa_1\, w_1 * p_0(t,y,0), \qquad \varkappa_1 = \frac{\sqrt{2\pi}\,\varkappa}{|x|}\int_{\mathbb{R}^3}\beta v(z)\psi(z)\,dz = \frac{1}{|x|}, \qquad w_1 = \frac{1}{\sqrt{2\pi t}}\exp\bigl(-|x|^2/2t\bigr).$$

The Laplace transform $\tilde w_1(\lambda)$ of the function $w_1$ is equal to $e^{-\sqrt{2\lambda}|x|}/\sqrt{2\lambda}$ (see (75)), and the Laplace transform of $p_0(t,y,0)$ is equal to $e^{-\sqrt{2\lambda}|y|}/2\pi|y|$. Thus

$$\tilde w(\lambda) = \frac{e^{-\sqrt{2\lambda}(|x|+|y|)}}{2\pi|x||y|\sqrt{2\lambda}}.$$

It remains to apply (75) one more time. □

As in Section 9, we shall study the limit, as $T\to\infty$, of the family of processes $y^T(t) = x(tT)/\sqrt{T}$, $0 \le t \le 1$. For $0 \le s < t \le 1$, $y, x \in \mathbb{R}^3$, define

$$p^T_\beta(s,t,y,x) = p_\beta\bigl(T(t-s),\, y\sqrt{T},\, x\sqrt{T}\bigr),$$

$$\bar p_\beta(s,t,0,x) = \lim_{T\to\infty} T\,p^T_\beta(s,t,0,x) = \lim_{T\to\infty} T\,p_\beta\bigl(T(t-s),\, 0,\, x\sqrt{T}\bigr), \qquad x \neq 0,$$

$$\bar p_\beta(s,t,y,x) = \lim_{T\to\infty} T^{3/2}p^T_\beta(s,t,y,x) = \lim_{T\to\infty} T^{3/2}p_\beta\bigl(T(t-s),\, y\sqrt{T},\, x\sqrt{T}\bigr), \qquad y, x \neq 0.$$

By Theorems 10.1 and 10.2,

$$\bar p_\beta(s,t,0,x) = \frac{\varkappa\psi(0)}{|x|\sqrt{t-s}}\exp\bigl(-|x|^2/2(t-s)\bigr), \qquad x \neq 0,$$

$$\bar p_\beta(s,t,y,x) = p_0(t-s,y,x) + \frac{1}{(2\pi)^{3/2}|y||x|\sqrt{t-s}}\exp\bigl(-(|y|+|x|)^2/2(t-s)\bigr), \qquad y, x \neq 0. \tag{79}$$


For $0 < t_1 < \cdots < t_n \le 1$, let the density of the random vector $(y^T(t_1), \dots, y^T(t_n))$ with respect to the Lebesgue measure on $\mathbb{R}^{dn}$ be denoted by $\rho^T(x_1, \dots, x_n)$. For $0 \le s < t \le 1$ and $y, x \in \mathbb{R}^3$, define

$$Q^T(s,t,y,x) = p^T_\beta(s,t,y,x)\int_{\mathbb{R}^3} p^T_\beta(t,1,x,z)\,dz\,\Bigl(\int_{\mathbb{R}^3} p^T_\beta(s,1,y,z)\,dz\Bigr)^{-1}, \qquad t < 1,$$

$$Q^T(s,1,y,x) = p^T_\beta(s,1,y,x)\Bigl(\int_{\mathbb{R}^3} p^T_\beta(s,1,y,z)\,dz\Bigr)^{-1}.$$

Thus

$$\rho^T(x_1, \dots, x_n) = Q^T(0,t_1,0,x_1)\,Q^T(t_1,t_2,x_1,x_2)\cdots Q^T(t_{n-1},t_n,x_{n-1},x_n).$$

In order to find the limit of the finite-dimensional distributions of $y^T$, we need to identify the limit of $Q^T$ as $T\to\infty$. For $0 \le s < t \le 1$, $y \in \mathbb{R}^3$ and $x \in \mathbb{R}^3\setminus\{0\}$, define

$$Q(s,t,y,x) = \lim_{T\to\infty} Q^T(s,t,y,x). \tag{80}$$

By Theorems 10.1 and 10.2,

$$Q(s,t,y,x) = \bar p_\beta(s,t,y,x)\int_{\mathbb{R}^3}\bar p_\beta(t,1,x,z)\,dz\,\Bigl(\int_{\mathbb{R}^3}\bar p_\beta(s,1,y,z)\,dz\Bigr)^{-1}, \qquad t < 1, \tag{81}$$

$$Q(s,1,y,x) = \bar p_\beta(s,1,y,x)\Bigl(\int_{\mathbb{R}^3}\bar p_\beta(s,1,y,z)\,dz\Bigr)^{-1}. \tag{82}$$
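A quick numerical sanity check of the normalisation built into (81) is possible in the case $y = 0$. Writing $a = t - s$, $b = 1 - t$, $r = |x|$ and integrating out the angles, the closed forms in (79) reduce $\int_{\mathbb{R}^3} Q(s,t,0,x)\,dx$ to the one-dimensional integral $\frac{1}{\sqrt{a(a+b)}}\int_0^\infty e^{-r^2/2a}\bigl(r + r\,v_2(r)\bigr)\,dr$, where $1 + v_2$ is the $z$-integral of the second line of (79). The sketch below evaluates this with Simpson's rule; the reduction, the helper names and the quadrature parameters are assumptions of this illustration, not taken from the paper.

```python
import math

def r_times_v2(r, b):
    """r * v_2(r), where 1 + v_2(r) is the z-integral of (79) at |x| = r, b = 1 - t.

    Uses int_0^inf rho * exp(-(r + rho)^2 / (2b)) d rho
        = b * exp(-r^2/(2b)) - r * sqrt(pi*b/2) * erfc(r / sqrt(2b)).
    """
    inner = (b * math.exp(-r * r / (2.0 * b))
             - r * math.sqrt(math.pi * b / 2.0) * math.erfc(r / math.sqrt(2.0 * b)))
    return 4.0 * math.pi * inner / ((2.0 * math.pi) ** 1.5 * math.sqrt(b))

def total_mass(s, t, steps=20000):
    """Approximate the integral of Q(s, t, 0, x) over x for 0 <= s < t < 1."""
    a, b = t - s, 1.0 - t
    upper = 12.0 * math.sqrt(a)  # exp(-r^2/(2a)) is negligible beyond this radius

    def integrand(r):
        return math.exp(-r * r / (2.0 * a)) * (r + r_times_v2(r, b))

    h = upper / steps  # steps must be even for Simpson's rule
    acc = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        acc += (4 if i % 2 else 2) * integrand(i * h)
    return acc * h / 3.0 / math.sqrt(a * (a + b))

if __name__ == "__main__":
    for s, t in [(0.0, 0.5), (0.2, 0.7), (0.1, 0.9)]:
        print(s, t, total_mass(s, t))
```

Each printed value should be close to 1, in line with the claim that no mass escapes to the origin or to infinity in the limit.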

We additionally define $Q(s,t,y,0) = 0$. Using (80), (81) and (82), we can identify the limit of the densities $\rho^T(x_1, \dots, x_n)$ for $x_2, \dots, x_n \neq 0$. In order to identify the weak limit of the finite-dimensional distributions of the processes $y^T$, we are going to show that the limit of the densities is the density of a probability distribution, i.e. the mass does not escape to the origin or infinity. This is done in Lemma 10.4, where we show that $Q$ serves as the transition density for a Markov process. First, however, we show that $Q$ satisfies a Fokker–Planck type equation on $\mathbb{R}^3\setminus\{0\}$. Let

$$g(t,x) = \ln\int_{\mathbb{R}^3}\bar p_\beta(t,1,x,z)\,dz, \qquad 0 \le t < 1,\ |x| > 0.$$

Let $L$ be the differential operator acting on $C^2(\mathbb{R}^3\setminus\{0\})$ according to the formula

$$(Lf)(t,x) = \frac{1}{2}\Delta_x f(t,x) + \frac{\partial g(t,x)}{\partial r}\,\frac{\partial f}{\partial r}(t,x), \qquad |x| > 0, \tag{83}$$

and let $L^*$ be the formal adjoint of $L$, i.e.

$$L^*v = \frac{1}{2}\Delta_x v - \frac{1}{r^2}\,\frac{\partial\bigl[r^2(\partial g/\partial r)\,v\bigr]}{\partial r}.$$

Lemma 10.3. For $0 \le s < 1$ and $y \in \mathbb{R}^3$, the function $Q(s,t,y,x)$ satisfies the equation

$$\frac{\partial Q(s,t,y,x)}{\partial t} = L^*Q(s,t,y,x), \qquad |x| > 0,\ s < t < 1. \tag{84}$$
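Proof. Let us consider the case when $y \neq 0$ (the other case is similar). Let

$$v_1(s,t,y,x) = \frac{1}{(2\pi)^{3/2}|y||x|\sqrt{t-s}}\exp\bigl(-(|y|+|x|)^2/2(t-s)\bigr),$$

$$v_2(t,x) = \int_{\mathbb{R}^3}\frac{1}{(2\pi)^{3/2}|x||z|\sqrt{1-t}}\exp\bigl(-(|x|+|z|)^2/2(1-t)\bigr)\,dz.$$

Observe that

$$\Bigl(\frac{\partial}{\partial t} - \frac{1}{2}\Delta_x\Bigr)v_1 = 0, \qquad \Bigl(\frac{\partial}{\partial t} + \frac{1}{2}\Delta_x\Bigr)v_2 = 0. \tag{85}$$

For fixed $s$ and $y$, the function $Q(s,t,y,x)$ is proportional to

$$u(t,x) = \bigl(p_0(t-s,y,x) + v_1(s,t,y,x)\bigr)\bigl(1 + v_2(t,x)\bigr).$$

By (85),

$$\Bigl(\frac{\partial}{\partial t} - \frac{1}{2}\Delta_x\Bigr)u = -\Bigl(\frac{\partial p_0}{\partial r} + \frac{\partial v_1}{\partial r}\Bigr)\frac{\partial v_2}{\partial r} + 2(p_0 + v_1)\frac{\partial v_2}{\partial t}. \tag{86}$$

For any two functions $A$ and $B$ we have

$$\Bigl(A\frac{\partial}{\partial r} + B\Bigr)u = A\Bigl(\frac{\partial p_0}{\partial r} + \frac{\partial v_1}{\partial r}\Bigr)(1 + v_2) + A(p_0 + v_1)\frac{\partial v_2}{\partial r} + B(p_0 + v_1)(1 + v_2). \tag{87}$$

Thus

$$\Bigl(\frac{\partial}{\partial t} - \frac{1}{2}\Delta_x + A\frac{\partial}{\partial r} + B\Bigr)u = \Bigl(\frac{\partial p_0}{\partial r} + \frac{\partial v_1}{\partial r}\Bigr)\Bigl(-\frac{\partial v_2}{\partial r} + A(1 + v_2)\Bigr) + (p_0 + v_1)\Bigl(2\frac{\partial v_2}{\partial t} + A\frac{\partial v_2}{\partial r} + B(1 + v_2)\Bigr) = 0$$

if

$$A = \frac{\partial v_2}{\partial r}(1 + v_2)^{-1}, \qquad B = -\Bigl(2\frac{\partial v_2}{\partial t} + A\frac{\partial v_2}{\partial r}\Bigr)(1 + v_2)^{-1}.$$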


Since $g(t,x) = \ln(1 + v_2)$ and $2\,\partial v_2/\partial t = -\partial^2 v_2/\partial r^2 - 2r^{-1}\,\partial v_2/\partial r$ (see (85)), it is easy to check that the operator on the left-hand side of the equation for $u$ is $\frac{\partial}{\partial t} - L^*$, and this justifies (84). □

Lemma 10.4. The function $Q(s,t,y,x)$, $0 \le s < t \le 1$, $y, x \in \mathbb{R}^3$, is the transition density for a Markov process on $\mathbb{R}^3$.

Proof. To show the existence of a Markov process, we need to verify that

$$\int_{\mathbb{R}^3} Q(t_1,t_2,x_1,x_2)\,dx_2 = 1, \qquad t_1 < t_2, \tag{88}$$

and

$$\int_{\mathbb{R}^3} Q(t_1,t_2,x_1,x_2)\,Q(t_2,t_3,x_2,x_3)\,dx_2 = Q(t_1,t_3,x_1,x_3), \qquad t_1 < t_2 < t_3. \tag{89}$$

Let us assume that (88) has been demonstrated, and prove (89). The Chapman–Kolmogorov identity for $p_\beta$, after the change of variables $z = x_2\sqrt{T}$, gives

$$T^{5/2+\alpha}\int_{\mathbb{R}^3} p^T_\beta(t_1,t_2,x_1,x_2)\,p^T_\beta(t_2,t_3,x_2,x_3)\,dx_2 = T^{1+\alpha}\,p^T_\beta(t_1,t_3,x_1,x_3), \qquad t_1 < t_2 < t_3,$$

where $\alpha = 1/2$ if $x_1 \neq 0$ and $\alpha = 0$ otherwise. For $x_3 \neq 0$ we take the limit, as $T\to\infty$, on both sides of this relation. The integrand on the left-hand side converges to $\bar p_\beta(t_1,t_2,x_1,x_2)\,\bar p_\beta(t_2,t_3,x_2,x_3)$; however, the convergence is not necessarily uniform in $x_2$, and we can only conclude by the Fatou lemma that

$$\int_{\mathbb{R}^3\setminus\{0\}}\bar p_\beta(t_1,t_2,x_1,x_2)\,\bar p_\beta(t_2,t_3,x_2,x_3)\,dx_2 \le \bar p_\beta(t_1,t_3,x_1,x_3), \qquad t_1 < t_2 < t_3,\ x_3 \neq 0.$$

From (81) and (82) it now follows that

$$\int_{\mathbb{R}^3\setminus\{0\}} Q(t_1,t_2,x_1,x_2)\,Q(t_2,t_3,x_2,x_3)\,dx_2 \le Q(t_1,t_3,x_1,x_3), \qquad t_1 < t_2 < t_3,\ x_3 \neq 0.$$

Note that both sides of this inequality are continuous in $x_3 \in \mathbb{R}^3\setminus\{0\}$. Due to (88), the integrals in $x_3$ over $\mathbb{R}^3\setminus\{0\}$ are equal to one for the expressions on both sides of this inequality. Therefore,

$$\int_{\mathbb{R}^3\setminus\{0\}} Q(t_1,t_2,x_1,x_2)\,Q(t_2,t_3,x_2,x_3)\,dx_2 = Q(t_1,t_3,x_1,x_3), \qquad t_1 < t_2 < t_3,\ x_3 \neq 0,$$

and thus (88) implies (89).



Now let us verify (88). Put $s = t_1$, $\tau = t_2$, $y = x_1$ and $x = x_2$. Again, we shall consider the case $y \neq 0$, the other case being similar. Moreover, we can assume that $\tau < 1$, since the case $\tau = 1$ can be treated by taking the limit $\tau \uparrow 1$. On a formal level, (88) follows from (84) by integrating both sides of (84) over $\Omega = [s,\tau]\times\mathbb{R}^3 \subset \mathbb{R}^4_{t,x}$:

$$\int_{\mathbb{R}^3} Q(s,\tau,y,x)\,dx - \lim_{t\downarrow s}\int_{\mathbb{R}^3} Q(s,t,y,x)\,dx = \langle L^*Q, 1\rangle_{L^2(\Omega)} = \langle Q, L1\rangle_{L^2(\Omega)}. \tag{90}$$

One needs only to note that

$$\lim_{t\downarrow s}\int_{\mathbb{R}^3} Q(s,t,y,x)\,dx = 1, \tag{91}$$

and that the operator $L$ applied to the function identically equal to one gives zero. The latter implies that the left-hand side in (90) is zero, and (91) implies that the second term on the left-hand side of (90) is one. In order to make relations (90) rigorous we note that $Q(s,t,y,x)$ is infinitely smooth in $(t,x)$ when $x \neq 0$ and decays exponentially as $|x|\to\infty$. However, it has a singularity at $x = 0$. Thus the integrals over $\mathbb{R}^3$ and $\Omega$ in (90) must be understood as limits of the corresponding integrals over the region $|x| > \varepsilon$ as $\varepsilon\to0$. Let us examine the singularities of $Q$ and of the coefficients of $L^*$ at the origin. Relation (79) implies that

$$\bar p_\beta(s,t,y,x) = \frac{a}{r} + O(r), \qquad r = |x| \to 0,\ a = a(s,t,y). \tag{92}$$

It is important that (92) does not contain a term of order $O(1)$. From (92), (83) and (81) it follows that

$$\frac{\partial g(t,x)}{\partial r} = -\frac{1}{r} + O(r), \qquad Q(s,t,y,x) = \frac{c}{r^2} + O(1), \qquad \frac{\partial Q(s,t,y,x)}{\partial r} = -\frac{2c}{r^3} + O(1), \tag{93}$$

where $r\to0$, $c = c(s,t,y)$. Since $Q$ has a weak singularity at $x = 0$, the integral of the left-hand side of (84) over $\Omega_\varepsilon = \Omega\cap\{x : |x| > \varepsilon\}$ converges to the left-hand side of (90). Hence, in order to prove (88), it remains to show that

$$\int_{\Omega_\varepsilon} L^*Q\,dt\,dx \to 0, \qquad \varepsilon\to0.$$

The integral above is equal to

$$\int_s^\tau\int_{|x|=\varepsilon}\Bigl(-\frac{1}{2}\frac{\partial Q}{\partial r} + \frac{\partial g}{\partial r}\,Q\Bigr)\,d\sigma\,dt, \tag{94}$$

where $d\sigma$ is the element of the surface area of the sphere $|x| = \varepsilon$. The convergence of (94) to zero follows immediately from (93). □

Lemma 10.5. The family of processes $y^T(t)$, $T \ge 1$, is tight.

We shall prove this lemma below. First, however, we formulate the main result of this section.

Theorem 10.6. The distributions of the processes $y^T(t)$ converge as $T\to\infty$, weakly in the space $C([0,1],\mathbb{R}^3)$, to the distribution of the 3-dimensional Markov process with continuous trajectories. The transition densities for the limiting Markov process are given by (81) and (82).

Proof. The convergence of the finite-dimensional distributions of $y^T(t)$ to those of the Markov process follows from (80) and Lemma 10.4. Since the family $y^T(t)$ is tight, there is a modification of the Markov process which has continuous trajectories. □

Proof of Lemma 10.5. To prove tightness it is enough to demonstrate that for each $\eta, \varepsilon > 0$ there are $0 < \delta < 1$ and $T_0 \ge 1$ such that for all $u \in [0,1]$ we have

$$P_{\beta,T}\Bigl(\sup_{u \le s \le \min(u+\delta,1)}\bigl|y^T(s) - y^T(u)\bigr| > \varepsilon\Bigr) \le \delta\eta, \qquad T \ge T_0. \tag{95}$$

Let $\eta, \varepsilon > 0$ be fixed. Let $E_\delta$ be the event that a continuous function $x : [0,T]\to\mathbb{R}^3$ satisfies

$$\sup_{t \le \delta T}\bigl|x(t) - x(0)\bigr|/\sqrt{T} > \varepsilon/8.$$

Using arguments similar to those leading to (70), we can show that (95) follows from

$$\sup_{x\in\mathbb{R}^d,\ |x|=\varepsilon\sqrt{T}/4}\mathrm{E}^x_{0,T}\Bigl(\chi_{E_\delta}\exp\Bigl(\int_0^T\beta v(x(t))\,dt\Bigr)\Bigr) \le \delta\eta, \qquad T \ge T_0. \tag{96}$$

Let

$$\tau = \min\bigl(\delta T,\ \inf\{t \ge 0 : |x(t) - x(0)| = \varepsilon\sqrt{T}/8\}\bigr).$$

The expectation in (96) can be estimated as follows:

$$\mathrm{E}^x_{0,T}\Bigl(\chi_{E_\delta}\exp\Bigl(\int_0^T\beta v(x(t))\,dt\Bigr)\Bigr) \le \mathrm{E}^x_{0,T}\Bigl(\chi_{E_\delta}\,\mathrm{E}^{x(\tau)}_{0,T-\tau}\exp\Bigl(\int_0^{T-\tau}\beta v(x(t))\,dt\Bigr)\Bigr).$$

We claim that

$$\mathrm{E}^{x(\tau)}_{0,T-\tau}\exp\Bigl(\int_0^{T-\tau}\beta v(x(t))\,dt\Bigr) \le \sup_{x\in\mathbb{R}^d,\ |x| \ge \varepsilon\sqrt{T}/8}\mathrm{E}^x_{0,T}\exp\Bigl(\int_0^T\beta v(x(t))\,dt\Bigr) \le c(\varepsilon) \tag{97}$$

for some constant $c(\varepsilon)$ for all sufficiently large $T$. It then remains to choose $\delta$ such that $\mathrm{E}^x_{0,T}(\chi_{E_\delta}) \le \delta\eta/c(\varepsilon)$, and the estimate (96) will follow. The second inequality in (97) easily follows from part (2) of Theorem 7.2 and the fact that the probability of reaching the support of $v$ before time $T$ by a Brownian path starting at a distance $\varepsilon\sqrt{T}/8$ away from the origin is of order $O(T^{-1/2})$ if $d = 3$. □

References

[1] S. Albeverio, F. Gesztesy, R. Høegh-Krohn, H. Holden, Solvable Models in Quantum Mechanics, second ed., Amer. Math. Soc., Providence, RI, 2005.
[2] M. Cranston, S. Molchanov, Analysis of a homopolymer, submitted for publication.
[3] M. Cranston, S. Molchanov, A solvable homopolymer model, submitted for publication.
[4] R. Dobrushin, On the Way to the Mathematical Foundations of Statistical Mechanics, Lecture Notes in Math., vol. 1567, Springer-Verlag, New York, 1993.
[5] J.W. Essam, M.E. Fisher, J. Chem. Phys. 38 (1963) 802, 74.
[6] M. Krasnoselski, Positive Solutions of Operator Equations, Noordhoff, Groningen, 1964.
[7] I.M. Lifschitz, A.Y. Grosberg, A.R. Khokhlov, Some problems of the statistical physics of polymer chains with volume interaction, Rev. Modern Phys. 50 (3) (July 1978).
[8] M. Reed, B. Simon, Methods of Modern Mathematical Physics, vol. 4, Analysis of Operators, Academic Press, New York, 1978.
[9] D. Ruelle, Thermodynamic Formalism. The Mathematical Structures of Equilibrium Statistical Mechanics, second ed., Cambridge Math. Lib., Cambridge Univ. Press, Cambridge, 2004.
[10] B. Vainberg, On the short-wave asymptotic behavior of solutions of stationary problems and the asymptotic behavior as t → ∞ of solutions of non-stationary problems, Russian Math. Surveys 30 (2) (1975) 1–58.


The Dunkl intertwining operator ✩

M. Maslouhi, E.H. Youssfi ∗

LATP, U.M.R. C.N.R.S. 6632, CMI, Université de Provence, 39 Rue F-Joliot-Curie, 13453 Marseille Cedex 13, France

Received 11 July 2008; accepted 22 September 2008
Available online 11 October 2008
Presented by the Editors

Abstract

We consider Dunkl theory associated to a general Coxeter group G. A new characterization of the regularity of the weight k is given and a new construction, devoid of Koszul complex theory, of the Dunkl intertwining operator V_k is established. We apply our results to derive sharp estimates of the Dunkl kernel. We give an explicit formula in the case of orthogonal positive root systems. © 2008 Elsevier Inc. All rights reserved.

Keywords: Dunkl operator; Intertwining operator; Dunkl kernel

1. Introduction and preliminaries

In this paper we give final solutions to several open problems in Dunkl theory associated to general Coxeter groups. We consider Dunkl operators associated to an arbitrary finite reflection group G and a general weight k. See Dunkl and Xu [1] or Roesler [4] for a complete history of the subject. The main goal of this work is first to give a new characterization of the regularity of the weight k, which in turn was proved to be the necessary and sufficient condition for the existence of $V_k$, see Opdam [3] and Dunkl, de Jeu and Opdam [2]. Second, we establish a new and concrete construction of $V_k$, devoid of Koszul complex theory, which allows us to prove that $f \mapsto (V_k f)(x)$ is extendable to the Dunkl algebra and then obtain sharp estimates on the Dunkl kernel. We finally give an explicit formula in the case of orthogonal positive root systems.

✩ This research is partially supported by the French ANR DYNOP, Blanc 07-198398.


E-mail addresses: [email protected] (M. Maslouhi), [email protected] (E.H. Youssfi). 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.09.018


M. Maslouhi, E.H. Youssfi / Journal of Functional Analysis 256 (2009) 2697–2709

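Fix a root system $R$ and consider the finite group $G$ generated by the reflections $\sigma_\alpha$, where $\alpha \in R$ and

$$\sigma_\alpha(x) = x - 2\,\frac{\langle x,\alpha\rangle}{\|\alpha\|^2}\,\alpha, \qquad x \in \mathbb{R}^d.$$

Here $\langle\cdot,\cdot\rangle$ denotes the canonical inner product in the space $\mathbb{R}^d$ and $\|\cdot\|$ its Euclidean norm. We extend the form $\langle\cdot,\cdot\rangle$ to a bilinear form on $\mathbb{C}^d\times\mathbb{C}^d$ and assign the same notation to this extension. The action of $G$ on functions is defined by

$$(g\cdot f)(x) := f(g\cdot x), \qquad x \in \mathbb{R}^d.$$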

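Set $R_+ := \{\alpha \in R : \langle\alpha,\beta\rangle > 0\}$ for some $\beta \in \mathbb{R}^d$ such that $\langle\alpha,\beta\rangle \neq 0$ for all $\alpha \in R$. Let $k$ be a parameter function on $R$, that is, $k : R \to \mathbb{C}$ is $G$-invariant. For $\xi \in \mathbb{R}^d$, the Dunkl operator $T_\xi$ on $\mathbb{R}^d$ associated to the group $G$ and the parameter function $k$ is given by

$$T_\xi(k)f(x) = \partial_\xi f(x) + \sum_{\alpha\in R_+} k(\alpha)\langle\alpha,\xi\rangle\,\frac{f(x) - f(\sigma_\alpha x)}{\langle\alpha,x\rangle}, \qquad x \in \mathbb{R}^d,$$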

where $f \in C^1(\mathbb{R}^d)$. Note that the definition of $T_\xi(k)$ does not depend on the choice of $\beta$. In the sequel we will write $T_\xi$ in place of $T_\xi(k)$. Consider the vector space $M$ of all parameter functions and let

$$M^{reg} := \Bigl\{k \in M : \bigcap_{\xi\in\mathbb{R}^d}\ker T_\xi(k) = \mathbb{C}\cdot 1\Bigr\}$$

be the set of regular parameter functions. Let $\Pi^d := \mathbb{C}[\mathbb{R}^d]$ denote the $\mathbb{C}$-algebra of polynomial functions on $\mathbb{R}^d$ and $P_n$, $n \in \mathbb{N}$, its subspace consisting of all homogeneous polynomials of degree $n$. An important result in Dunkl theory, established in [2], states that $k \in M^{reg}$ if and only if there exists a unique isomorphism $V_k$ of $\Pi^d$ satisfying

$$V_k(P_n) \subset P_n, \qquad V_k(1) = 1 \quad\text{and}\quad T_\xi V_k = V_k\partial_\xi, \qquad \forall\xi \in \mathbb{R}^d. \tag{1.1}$$

The operator $V_k$ is called the intertwining operator. It follows from (1.1) that $V_k$ satisfies

$$g\cdot V_k(p) = V_k(g\cdot p), \qquad \forall g \in G. \tag{1.2}$$

Next, consider the operator $A$ defined on functions $f : \mathbb{R}^d \to \mathbb{C}$ by

$$A(f) := \sum_{\alpha\in R_+} k(\alpha)\,\sigma_\alpha\cdot f. \tag{1.3}$$

It is clear that for every non-negative integer $n$, the space $P_n$ is invariant under the action of $A$. We let $A_n$ denote the restriction of $A$ to $P_n$ and set

$$\gamma = \gamma(k) := \sum_{\alpha\in R_+} k(\alpha). \tag{1.4}$$

A straightforward computation shows that

$$T_x(p)(x) = \bigl[\bigl((n+\gamma)I - A_n\bigr)p\bigr](x)$$

for all $p \in P_n$. If the operator $(n+\gamma)I - A_n$ is invertible on $P_n$, we denote by $H_n$ its inverse. Our first result is the following.

Theorem A. If $k \in M$, then $k \in M^{reg}$ if and only if $(n+\gamma)I - A_n$ is invertible for all positive integers $n$. In addition, the intertwining operator $V_k$ is given by

$$V_k(p)(x) = (\partial_x H)^n(p), \qquad \text{for all } x \in \mathbb{R}^d \text{ and } p \in P_n,$$

where $H$ is the operator defined on $\Pi^d$ whose restriction to each $P_n$ is the operator $H_n = ((n+\gamma)I - A_n)^{-1}$, and $(\partial_x H)^m$ is defined by induction by setting $(\partial_x H)^0 := \mathrm{Id}$, $(\partial_x H)^1(p) = \partial_x H(p)$ and for all positive integers $m$,

$$(\partial_x H)^m(p) = \partial_x H\bigl((\partial_x H)^{m-1}p\bigr), \qquad x \in \mathbb{R}^d,\ p \in \Pi^d. \tag{1.5}$$
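To see the recipe of Theorem A at work, consider the rank-one case $G = \mathbb{Z}_2$ acting on $\mathbb{R}$ by $x \mapsto -x$, with a single positive root and parameter $k$. Then $\gamma = k$, $A_n x^n = k(-1)^n x^n$, and $H_n$ acts on $x^n$ as the scalar $1/(n + k - k(-1)^n)$, so $(\partial_x H)^n$ sends $y^n$ to $c_n x^n$ with $c_n = \prod_{j=1}^n j/(j + k - k(-1)^j)$. The sketch below, with names chosen here for illustration, checks in exact rational arithmetic that the resulting operator intertwines the rank-one Dunkl operator $Tf(x) = f'(x) + k\,(f(x) - f(-x))/x$ with the ordinary derivative.

```python
from fractions import Fraction

def dunkl_T(coeffs, k):
    """Rank-one Dunkl operator on a polynomial given by its coefficient list.

    T x^n = n x^(n-1) + k (1 - (-1)^n) x^(n-1), so odd powers pick up 2k.
    """
    out = []
    for n in range(1, len(coeffs)):
        extra = 2 * k if n % 2 == 1 else 0
        out.append(coeffs[n] * (n + extra))
    return out

def derivative(coeffs):
    """Ordinary derivative on coefficient lists."""
    return [n * coeffs[n] for n in range(1, len(coeffs))]

def H_eigenvalue(n, k):
    """Scalar by which H_n = ((n + gamma) I - A_n)^(-1) acts on x^n (gamma = k)."""
    return 1 / (n + k - k * (-1) ** n)

def V_coefficient(n, k):
    """c_n with (d_x H)^n (y^n) = c_n x^n: apply H_j, then differentiate, for j = n..1."""
    c = Fraction(1)
    for j in range(n, 0, -1):
        c *= j * H_eigenvalue(j, k)
    return c

def V(coeffs, k):
    """Candidate intertwining operator acting degree by degree."""
    return [c * V_coefficient(n, k) for n, c in enumerate(coeffs)]

if __name__ == "__main__":
    k = Fraction(3, 7)
    p = [Fraction(c) for c in (2, -1, 5, 0, 3, 4)]
    print(dunkl_T(V(p, k), k) == V(derivative(p), k))
    print([V_coefficient(n, k) for n in range(5)])
```

In this example the invertibility condition of Theorem A reads $n + k - k(-1)^n \neq 0$, which only fails for odd $n$ with $k = -n/2$.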

We equip each subspace $P_n$ with the norm

$$\|f\|_S := \sup_{\|x\|=1}\bigl|f(x)\bigr|, \qquad f \in P_n.$$

In Lemma 3.1 below we estimate the norm of each operator $H_n$ on $P_n$ endowed with the norm above. These estimates are key ingredients to extend $V_k$ to the Dunkl algebra defined in Section 3 and prove our second main result.

Theorem B. There is a positive constant $\delta = \delta(k)$ such that

$$\bigl|(V_k p)(x)\bigr| \le \delta^n\|x\|^n\|p\|_S$$

for all homogeneous polynomials $p$ of degree $n$ and all $x \in \mathbb{R}^d$.

2. New construction for $V_k$

Define the matrix $B$ by

$$B = 2\sum_{\alpha\in R_+}\frac{k(\alpha)}{\|\alpha\|^2}\,\alpha\otimes\alpha, \tag{2.1}$$

where $\alpha\otimes\alpha$ is the rank one symmetric matrix given by

$$(\alpha\otimes\alpha)(x) := \langle\alpha,x\rangle\alpha, \qquad x \in \mathbb{R}^d.$$

We observe that $\sigma_\alpha = \mathrm{Id} - \frac{2}{\|\alpha\|^2}\,\alpha\otimes\alpha$ for all $\alpha \in R$ and by a direct calculation we see that

$$T_\xi(p) = p\bigl((I+B)\xi\bigr), \qquad \text{for all } p \in P_1. \tag{2.2}$$
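Formula (2.2) is easy to test on a concrete root system. The sketch below takes, purely as an illustration, the root system $B_2$ in $\mathbb{R}^2$ with positive roots $e_1$, $e_2$, $e_1 + e_2$, $e_1 - e_2$ and one parameter value per root length (the numerical values of $k$, the linear form and the test points are arbitrary choices made here). It evaluates $T_\xi p$ directly from the definition of the Dunkl operator for a linear form $p(x) = \langle a,x\rangle$ and compares with $p((I+B)\xi)$, using exact rational arithmetic.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, x):
    """sigma_alpha(x) = x - 2 <x, alpha> / |alpha|^2 * alpha."""
    c = 2 * dot(x, alpha) / dot(alpha, alpha)
    return tuple(xi - c * ai for xi, ai in zip(x, alpha))

def dunkl_T_linear(a, xi, x, roots, k):
    """T_xi p at the point x for the linear form p(y) = <a, y>.

    For linear p the directional derivative d_xi p is the constant <a, xi>.
    Requires <alpha, x> != 0 for every positive root alpha.
    """
    value = dot(a, xi)
    for alpha in roots:
        diff = dot(a, x) - dot(a, reflect(alpha, x))
        value += k[alpha] * dot(alpha, xi) * diff / dot(alpha, x)
    return value

def apply_I_plus_B(xi, roots, k):
    """(I + B) xi with B = 2 * sum_alpha k(alpha) / |alpha|^2 * alpha (x) alpha."""
    out = list(xi)
    for alpha in roots:
        c = 2 * k[alpha] * dot(alpha, xi) / dot(alpha, alpha)
        out = [o + c * ai for o, ai in zip(out, alpha)]
    return tuple(out)

# Positive roots of B_2 (illustrative choice) and a G-invariant parameter function.
SHORT = [(F(1), F(0)), (F(0), F(1))]
LONG = [(F(1), F(1)), (F(1), F(-1))]
ROOTS = SHORT + LONG
K = {alpha: F(2, 3) for alpha in SHORT}
K.update({alpha: F(-1, 5) for alpha in LONG})

if __name__ == "__main__":
    a, xi, x = (F(4), F(-7)), (F(1, 2), F(3)), (F(3), F(5))
    print(dunkl_T_linear(a, xi, x, ROOTS, K), dot(a, apply_I_plus_B(xi, ROOTS, K)))
```

Both printed numbers coincide, and the left one does not depend on the evaluation point $x$, as expected since $T_\xi p$ is a constant for $p \in P_1$.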



Theorem 2.1. Suppose $k \in M^{reg}$. Then the matrix $I + B$ is invertible and satisfies

$$V_k(p)(x) = p\bigl((I+B)^{-1}x\bigr)$$

for all polynomials $p \in P_1$ and all $x \in \mathbb{R}^d$.

Proof. Let $p \in P_1$ and $\xi \in \mathbb{R}^d$. Since $V_k(p) \in P_1$ it follows from (2.2) that

$$T_\xi V_k(p) = V_k(p)\bigl((I+B)\xi\bigr).$$

On the other hand, by the intertwining property of $V_k$ we have that

$$T_\xi V_k(p) = V_k(\partial_\xi p) = \partial_\xi p = p(\xi).$$

Hence

$$V_k(p)\bigl((I+B)\xi\bigr) = p(\xi) \tag{2.3}$$

for all $p \in P_1$ and $\xi \in \mathbb{R}^d$. Suppose that $\xi \in \mathbb{R}^d$ satisfies $(I+B)\xi = 0$. Then from (2.3) we have

$$0 = V_k(p)(0) = p(\xi)$$

for all $p \in P_1$ and hence $\xi = 0$. □

The next lemma is useful for the sequel.

Lemma 2.2. Suppose that $n \ge 2$ and let $p \in P_n$ satisfy $T_x(k)p(x) = 0$ for all $x \in \mathbb{R}^d$. Then for all integers $m \ge 1$ and all $x, \xi \in \mathbb{R}^d$ we have

$$m\bigl(\partial_\xi^{m-1}T_\xi p\bigr)(x) = -\bigl(\partial_\xi^m T_x p\bigr)(x).$$

Proof. To simplify notation, we will suppress the parameter function $k$ and write $T_\xi$ in place of $T_\xi(k)$, where $\xi \in \mathbb{R}^d$. Since $(T_x p)(x) = 0$ for all $x \in \mathbb{R}^d$, by differentiation with respect to $x$ in the direction $\xi \in \mathbb{R}^d$ we have

$$T_\xi(p)(x) = -\bigl(\partial_\xi(T_x p)\bigr)(x)$$

for all $x, \xi$ and thus the lemma follows for $m = 1$. The general case follows by an easy induction on $m \ge 1$ using the same argument consisting of a differentiation with respect to $x$ in the direction $\xi \in \mathbb{R}^d$. □

Theorem 2.3. Suppose $k \in M^{reg}$ and let $p \in P_n$, $n \ge 1$, be such that

$$T_x(k)p(x) = 0 \quad\text{for all } x \in \mathbb{R}^d. \tag{2.4}$$

Then $p = 0$.

Proof. The case $n = 1$ is obvious, so we may and will assume that $n \ge 2$. Since $V_k$ commutes with the action of the Weyl group $G$ it follows that if $p$ satisfies (2.4) then $V_k(p)$ satisfies the same equation. Taking $m = n - 1$ in Lemma 2.2 we get

$$(n-1)\bigl(\partial_\xi^{n-2}T_\xi p\bigr)(x) = -\bigl(\partial_\xi^{n-1}T_x p\bigr)(x) = -\bigl(\partial_\xi^{n-2}T_x p\bigr)(\xi)$$

for all $\xi, x \in \mathbb{R}^d$. Applying $V_k$ with respect to the variable $x$ to both sides, and using Theorem 2.1 and the intertwining property of $V_k$, yields

$$(n-1)\bigl(T_\xi^{n-2}V_k T_\xi p\bigr)(x) = -\bigl(\partial_\xi^{n-2}T_{(I+B)^{-1}x}\,p\bigr)(\xi)$$

for all $\xi, x \in \mathbb{R}^d$. Now if we choose $x \in \mathbb{R}^d$ such that $(I+B)^{-1}x = \xi$, then the latter equality gives

$$(n-1)\bigl(T_\xi^{n-2}V_k T_\xi p\bigr)\bigl((I+B)\xi\bigr) = -\bigl(\partial_\xi^{n-2}T_\xi p\bigr)(\xi) = -\frac{n!}{2}\,(T_\xi p)(\xi) = 0 \tag{2.5}$$

for all $\xi \in \mathbb{R}^d$. But from (2.2) we have that

$$\bigl(T_\xi^{n-2}V_k T_\xi p\bigr)\bigl((I+B)\xi\bigr) = T_\xi\bigl(T_\xi^{n-2}V_k T_\xi p\bigr) = T_\xi^{n-1}V_k T_\xi p.$$

This fact, combined with (2.5) and the intertwining property of $V_k$, applied to $V_k(p)$ instead of $p$ shows that

$$(n-1)\,T_\xi^{n-1}V_k T_\xi V_k p = V_k^2\,\partial_\xi^n p = 0$$

for all $\xi \in \mathbb{R}^d$. The theorem now follows since $V_k(1) = 1$ and $\partial_\xi^n p = n!\,p(\xi)$. □

Now we give a new construction of the intertwining operator $V_k$. The remaining part of this section is devoted to the proof of Theorem A. So, whenever the operator $(n+\gamma)I - A_n$ is invertible on $P_n$ for all positive integers $n$, we define $\widetilde V = \widetilde V(k)$ by its restrictions $\widetilde V_n$ to $P_n$ as

$$\widetilde V_n(p)(x) := (\partial_x H)^n(p), \qquad x \in \mathbb{R}^d,\ p \in P_n,$$

where $(\partial_x H)^n(p)$ is as in the statement of Theorem A, with the understanding that $\widetilde V_0(\lambda) := \lambda$ for all constant polynomials $\lambda \in \mathbb{C}$.

Lemma 2.4. Suppose that $(n+\gamma)I - A_n$ is invertible for all positive integers $n$. Then for all $w \in R$ and $n \in \mathbb{N}_0$ we have

$$\widetilde V_n\circ\sigma_w = \sigma_w\circ\widetilde V_n.$$

Proof. First it is easy to check that $A_n$ commutes with the action of the $\sigma_\alpha$, $\alpha \in R$, and consequently so does $H_n$ for all non-negative integers $n$. We need to show that

$$\widetilde V_n(\sigma_w\cdot p)(x) = \widetilde V_n(p)(\sigma_w x)$$

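for all $p \in P_n$ and all $x \in \mathbb{R}^d$. Our claim is obvious for $n = 0$. Suppose that the claim holds for some non-negative integer $n$ and let $p \in P_{n+1}$. By definition we have

$$\widetilde V_{n+1}(\sigma_w p)(x) = \widetilde V_n\bigl(\partial_x H_{n+1}(\sigma_w p)\bigr)(x) = \widetilde V_n\bigl(\partial_x\,\sigma_w H_{n+1}(p)\bigr)(x) = \widetilde V_n\bigl(\sigma_w\,\partial_{\sigma_w x}H_{n+1}(p)\bigr)(x),$$

so that from the induction hypothesis we get

$$\widetilde V_{n+1}(\sigma_w p)(x) = \widetilde V_n\bigl(\partial_{\sigma_w x}H_{n+1}(p)\bigr)(\sigma_w x) = \widetilde V_{n+1}(p)(\sigma_w x)$$

and thus the proof of the lemma is complete. □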

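Lemma 2.5. Suppose that $(n+\gamma)I - A_n$ is invertible for all positive integers $n$. Then

$$T_\xi\widetilde V(p) = \widetilde V(\partial_\xi p)$$

for all $\xi \in \mathbb{R}^d$ and all polynomials $p \in \Pi^d$.

Proof. It suffices to show the result for homogeneous polynomials $p \in P_n$. We will proceed by induction on $n \in \mathbb{N}_0$. The cases $n = 0, 1$ are obvious. Let $n \ge 1$ and suppose that the lemma holds for all degrees up to $n$. Again for simplicity, we will assume that $\|\alpha\|^2 = 2$ for all $\alpha \in R$. Let $p \in P_n$ and $\xi, x \in \mathbb{R}^d$. Then we have

$$\partial_\xi(A_n p)(x) = \sum_{\alpha\in R_+} k(\alpha)\,\partial_{\sigma_\alpha\xi}\,p(\sigma_\alpha x) = \sum_{\alpha\in R_+} k(\alpha)(\partial_\xi p)(\sigma_\alpha x) - \sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle(\partial_\alpha p)(\sigma_\alpha x) = A_{n-1}(\partial_\xi p)(x) - B_{n,\xi}(p)(x),$$

where

$$B_{n,\xi}(p)(x) := \sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle(\partial_\alpha p)(\sigma_\alpha x).$$

On the other hand, since $H_{n+1}^{-1} = (n+1+\gamma)\mathrm{Id} - A_{n+1}$, we see that

$$\partial_\xi\circ H_{n+1}^{-1} = (n+1+\gamma)\partial_\xi - \partial_\xi\circ A_{n+1} = (n+1+\gamma)\partial_\xi - A_n\circ\partial_\xi + B_{n+1,\xi} = H_n^{-1}\circ\partial_\xi + \partial_\xi + B_{n+1,\xi}.$$

Therefore,

$$H_n\circ\partial_\xi = \partial_\xi\circ H_{n+1} + L_{n+1}, \tag{2.6}$$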

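where $L_{n+1} := H_n\circ(\partial_\xi + B_{n+1,\xi})\circ H_{n+1}$. Since by definition we have

$$\widetilde V_{n+1}(p)(z) = \widetilde V_n\bigl(\partial_z(H_{n+1}p)\bigr)(z), \qquad \text{for all } z \in \mathbb{R}^d,$$

it follows that

$$\partial_\xi\bigl(\widetilde V_{n+1}(p)\bigr)(x) = \widetilde V_n\bigl(\partial_\xi(H_{n+1}p)\bigr)(x) + \partial_\xi\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x).$$

Using (2.6) this can be written in the form

$$\partial_\xi\bigl(\widetilde V_{n+1}(p)\bigr)(x) = \widetilde V_n\bigl(H_n\partial_\xi(p)\bigr)(x) - \widetilde V_n(L_{n+1}p)(x) + \partial_\xi\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x).$$

A similar argument shows that for all $\alpha \in R_+$ and $t \in [0,1]$, writing $x_t := x - t\langle x,\alpha\rangle\alpha$,

$$\begin{aligned}
\partial_\alpha\bigl(\widetilde V_{n+1}(p)\bigr)(x_t)
&= \widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(x_t) + \partial_\alpha\bigl[\widetilde V_n\bigl(\partial_{x_t}(H_{n+1}p)\bigr)\bigr](x_t)\\
&= \widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(x_t) + \partial_\alpha\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x_t) - t\langle x,\alpha\rangle\,\partial_\alpha\bigl[\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)\bigr](x_t)\\
&= \widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(x_t) + \partial_\alpha\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x_t) + t\,\frac{d}{dt}\Bigl[\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(x_t)\Bigr].
\end{aligned}$$

Therefore, integrating the last term by parts and using $x_1 = \sigma_\alpha x$,

$$\begin{aligned}
\int_0^1\partial_\alpha\bigl(\widetilde V_{n+1}(p)\bigr)(x_t)\,dt
&= \int_0^1\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(x_t)\,dt + \int_0^1\partial_\alpha\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x_t)\,dt + \int_0^1 t\,\frac{d}{dt}\Bigl[\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(x_t)\Bigr]dt\\
&= \widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(\sigma_\alpha x) + \int_0^1\partial_\alpha\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x_t)\,dt.
\end{aligned}$$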


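Hence

$$\begin{aligned}
T_\xi\bigl(\widetilde V_{n+1}(p)\bigr)(x)
&= \partial_\xi\bigl(\widetilde V_{n+1}(p)\bigr)(x) + \sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle\int_0^1\partial_\alpha(\widetilde V_{n+1}p)\bigl(x - t\langle\alpha,x\rangle\alpha\bigr)\,dt\\
&= \widetilde V_n\bigl(H_n(\partial_\xi p)\bigr)(x) - \widetilde V_n\bigl(L_{n+1}(p)\bigr)(x) + \partial_\xi\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x)\\
&\quad + \sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle\,\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(\sigma_\alpha x) + \sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle\int_0^1\partial_\alpha\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr]\bigl(x - t\langle x,\alpha\rangle\alpha\bigr)\,dt\\
&= \widetilde V_n\bigl(H_n\partial_\xi(p)\bigr)(x) - \widetilde V_n\bigl(L_{n+1}(p)\bigr)(x) + T_\xi\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x) + \sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle\,\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(\sigma_\alpha x).
\end{aligned}$$

By the induction hypothesis we have

$$T_\xi\bigl[\widetilde V_n\bigl(\partial_x(H_{n+1}p)\bigr)\bigr](x) = \widetilde V_{n-1}\bigl(\partial_x\partial_\xi(H_{n+1}p)\bigr)(x).$$

In view of Lemma 2.4 we know that $\widetilde V_n$ commutes with the $\sigma_\alpha$. Thus

$$\sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle\,\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(\sigma_\alpha x) = \widetilde V_n\bigl(B_{n+1,\xi}(H_{n+1}p)\bigr)(x).$$

Therefore,

$$\begin{aligned}
T_\xi\bigl(\widetilde V_{n+1}(p)\bigr)(x)
&= \widetilde V_{n-1}\bigl(\partial_x\partial_\xi(H_{n+1}p)\bigr)(x) + \widetilde V_n\bigl(H_n(\partial_\xi p)\bigr)(x) - \widetilde V_n\bigl(L_{n+1}(p)\bigr)(x) + \sum_{\alpha\in R_+} k(\alpha)\langle\xi,\alpha\rangle\,\widetilde V_n\bigl(\partial_\alpha(H_{n+1}p)\bigr)(\sigma_\alpha x)\\
&= \widetilde V_{n-1}\bigl(\partial_x\partial_\xi(H_{n+1}p)\bigr)(x) + \widetilde V_n\bigl(H_n\partial_\xi(p)\bigr)(x) - \widetilde V_n\bigl(L_{n+1}(p)\bigr)(x) + \widetilde V_n\bigl(B_{n+1,\xi}(H_{n+1}p)\bigr)(x).
\end{aligned}$$

Applying again (2.6) we get

$$\widetilde V_{n-1}\bigl(\partial_x\partial_\xi(H_{n+1}p)\bigr)(x) = \widetilde V_{n-1}\bigl(\partial_x H_n(\partial_\xi p)\bigr)(x) - \widetilde V_{n-1}\bigl(\partial_x(L_{n+1}p)\bigr)(x) = \widetilde V_n(\partial_\xi p)(x) - \widetilde V_{n-1}\bigl(\partial_x(L_{n+1}p)\bigr)(x)$$

and hence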


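$$\begin{aligned}
T_\xi\bigl(\widetilde V_{n+1}(p)\bigr)(x)
&= \widetilde V_n(\partial_\xi p)(x) - \widetilde V_{n-1}\bigl(\partial_x(L_{n+1}p)\bigr)(x) + \widetilde V_n\bigl(H_n(\partial_\xi p)\bigr)(x) - \widetilde V_n(L_{n+1}p)(x) + \widetilde V_n\bigl(B_{n+1,\xi}(H_{n+1}p)\bigr)(x)\\
&= \widetilde V_n(\partial_\xi p)(x) - \widetilde V_n\bigl((\partial_\xi + B_{n+1,\xi})(H_{n+1}p)\bigr)(x) + \widetilde V_n\bigl(H_n(\partial_\xi p)\bigr)(x) - \widetilde V_n\bigl(L_{n+1}(p)\bigr)(x) + \widetilde V_n\bigl(B_{n+1,\xi}(H_{n+1}p)\bigr)(x)\\
&= \widetilde V_n\bigl(\partial_\xi p - \partial_\xi(H_{n+1}p) + H_n\partial_\xi p - L_{n+1}(p)\bigr)(x)\\
&= \widetilde V_n(\partial_\xi p)(x),
\end{aligned}$$

where the latter equality holds in view of (2.6). This completes the proof of the lemma. □

Theorem 2.6. Suppose that $(n+\gamma)I - A_n$ is invertible for all positive integers $n$. Then we have $V_k = \widetilde V$.

Proof. The result follows from [2, Corollary 3.5] and Lemma 2.5. □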

Corollary 2.7. A parameter function $k$ is in $M^{reg}$ if and only if $(n+\gamma(k))I - A_n$ is invertible for all positive integers $n$.

Proof. By Theorem 2.6 and [2, Corollary 3.5] we see that if $k$ is a parameter function such that $(n+\gamma(k))I - A_n$ is invertible for all positive integers $n$, then $k \in M^{reg}$. Conversely, suppose that $k \in M^{reg}$ and that there exists $n \in \mathbb{N}$ such that $(n+\gamma)\mathrm{Id} - A_n$ is not invertible. Then for some $p \in P_n$, $p \neq 0$, we have $T_x(k)(p)(x) = 0$ for all $x \in \mathbb{R}^d$. This is impossible by virtue of Theorem 2.3. □

Proof of Theorem A. By the previous corollary we know that a parameter function $k$ is in $M^{reg}$ if and only if $(n+\gamma(k))I - A_n$ is invertible for all positive integers $n$. The theorem now follows from Lemma 2.5 and the uniqueness of $V_k$. □

3. Extension of $V_k$ to the Dunkl algebra

Let $B_r$ be the closed ball centered at 0 with radius $r > 0$ in $\mathbb{R}^d$. We consider the Dunkl algebra $A_r$ consisting of homogeneous series $f : B_r \to \mathbb{C}$ given by $f = \sum_{n=0}^{+\infty} f_n$, where each $f_n$ is a homogeneous polynomial of degree $n$, and

$$\|f\| = \sum_{n=0}^{+\infty} r^n\|f_n\|_S < +\infty.$$

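Furnished with the pointwise multiplication and the complex conjugation as an involution, this normed algebra is a commutative Banach ∗-algebra. See [4].

Lemma 3.1. There is a positive constant $\delta = \delta(k)$ such that

$$\|H_n\|_S \le \frac{\delta}{n} \qquad \text{for all } n \ge 1.$$

Proof. Choose a positive integer $n_0 \ge 2$ such that

$$|n+\gamma| > \sum_{\alpha\in R_+}|k(\alpha)| \qquad\text{and}\qquad \frac{1}{\bigl|1 + \frac{\gamma}{n}\bigr| - \frac{1}{n}\sum_{\alpha\in R_+}|k(\alpha)|} \le 2$$

for all $n \ge n_0$. Since for these integers we also have $\|A_n\| \le \sum_{\alpha\in R_+}|k(\alpha)|$, it follows that

$$H_n = \sum_{j=0}^{\infty}\frac{A_n^j}{(n+\gamma)^{j+1}}$$

and

$$\|H_n\| \le \sum_{j=0}^{\infty}\frac{\|A_n\|^j}{|n+\gamma|^{j+1}} \le \sum_{j=0}^{\infty}\frac{\bigl(\sum_{\alpha\in R_+}|k(\alpha)|\bigr)^j}{|n+\gamma|^{j+1}} = \frac{1}{|n+\gamma| - \sum_{\alpha\in R_+}|k(\alpha)|} \le \frac{2}{n}$$

for all $n \ge n_0$. It suffices now to choose a constant $\delta \ge 2$ that satisfies $\|H_j\| \le \frac{\delta}{j}$ for all $j = 1, \dots, n_0 - 1$. □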
Proof of Theorem B. Fix an integer $n \ge 1$ and $p \in P_n$. By virtue of Theorem A we have $V_k(p)(x) = (\partial_x H)^n(p)$ for all $x \in \mathbb{R}^d$. Appealing to [1, Theorem 2.5.3] and Lemma 3.1, an induction process shows that for all $x \in \mathbb{R}^d$ we have

$$\bigl|V_k(p)(x)\bigr| \le \delta^n\|x\|^n\|p\|_S. \tag{3.1}$$

Note that the above inequality holds also for $n = 0$ since $V_k(1) = 1$. □

This allows us to extend $V_k$ to the algebra $A_r$ by setting

$$V_k(f)(x) = \sum_{n=0}^{+\infty} V_k(f_n)(x), \qquad f = \sum_{n=0}^{+\infty} f_n,$$

and as a direct consequence of the above result we have the following:

Theorem 3.2. Suppose that $f = \sum_{n=0}^{+\infty} f_n$ is a member of the Dunkl algebra $A_r$. Then

$$\bigl|(V_k f)(x)\bigr| \le \sum_{n=0}^{+\infty}\delta^n\|x\|^n\|f_n\|_S$$

for all $x \in \mathbb{R}^d$ satisfying $\|x\| \le \frac{r}{\delta}$.

For fixed $y \in \mathbb{C}^d$, the map $x \mapsto e^{\langle x,y\rangle}$ is in the algebra $A_r$ for all $r > 0$. The Dunkl kernel is then obtained by

$$E_k(x,y) = V_k\bigl(e^{\langle\cdot,y\rangle}\bigr)(x), \qquad x \in \mathbb{R}^d,\ y \in \mathbb{C}^d.$$

For $\nu = (\nu_1, \dots, \nu_d) \in \mathbb{N}_0^d$ set

$$\partial^\nu = \partial_{e_1}^{\nu_1}\cdots\partial_{e_d}^{\nu_d} \qquad\text{and}\qquad |\nu| = \nu_1 + \cdots + \nu_d.$$

A direct consequence of Theorem A and [1, Theorem 2.5.3] is the following estimates for the Dunkl kernel.

Corollary 3.3. For all $\nu \in \mathbb{N}_0^d$ we have

$$\bigl|\partial^\nu E_k(x,y)\bigr| \le \|\delta y\|^{|\nu|}e^{\delta\|x\|\|y\|}$$

for all $x, y \in \mathbb{R}^d$, where the differentiation is with respect to $x \in \mathbb{R}^d$.

4. The case of orthogonal positive roots system

First we point out that an explicit expression of $V_k$ was given in [5] in the case where the positive roots system $R_+$ is the canonical basis of $\mathbb{R}^d$. In this section we shall establish an expression of $V_k$ when $R_+$ is any orthogonal family in $\mathbb{R}^d$, which will generalize [5]. Let $R_+ = \{\alpha_1, \dots, \alpha_m\}$ and consider the function $h : \mathbb{R}^m\times\mathbb{R}^d \to \mathbb{R}^d$ defined by

$$h(t,x) := x + \sum_{j=1}^{m}(t_j - 1)\frac{\langle x,\alpha_j\rangle}{\|\alpha_j\|^2}\,\alpha_j, \tag{4.1}$$

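where $t = (t_1, \dots, t_m) \in \mathbb{R}^m$ and $x \in \mathbb{R}^d$.

Theorem 4.1. If $R_+$ is orthogonal, then

$$V_k(p)(x) = \int_{[-1,1]^m} p\bigl(h(t,x)\bigr)\prod_{j=1}^{m} c_j(1+t_j)\bigl(1-t_j^2\bigr)^{k_j-1}\,dt, \qquad x \in \mathbb{R}^d,$$

for all polynomials $p$, where the constants $c_j$ normalize each factor to have integral one over $[-1,1]$ (as required by $V_k(1) = 1$):

$$c_j := \frac{\Gamma(k_j + 1/2)}{\Gamma(k_j)\,\Gamma(1/2)}, \qquad j = 1, \dots, m.$$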

Proof. For simplicity, we will assume that $\|\alpha\| = 1$ for all $\alpha \in R_+$, so that $R_+$ is orthonormal; the proof in the general case is the same. Set

$$\theta_j(t) := (t_1, \dots, t_{j-1}, -t_j, t_{j+1}, \dots, t_m), \qquad t = (t_1, \dots, t_m) \in \mathbb{R}^m,$$

$j = 1, \dots, m$. Then $\theta_j$ is just the reflection in $\mathbb{R}^m$ with respect to the hyperplane $(\mathbb{R}e_j)^\perp$ orthogonal to the vector $e_j$ of the canonical basis $(e_1, \dots, e_m)$ of $\mathbb{R}^m$. Since $\{\alpha_1, \dots, \alpha_m\}$ is orthonormal, a short computation shows that for all $x \in \mathbb{R}^d$ and $t \in \mathbb{R}^m$,

$$h(t,\sigma_{\alpha_j}x) = h\bigl(\theta_j(t),x\bigr), \qquad j = 1, \dots, m. \tag{4.2}$$

Set $\Phi(t) := \prod_{j=1}^{m} c_j(1+t_j)(1-t_j^2)^{k_j-1}$ and $\Phi_j(t) := \frac{\Phi(t)}{1+t_j}$. Then the operator

$$V(p)(x) := \int_{[-1,1]^m} p\bigl(h(t,x)\bigr)\Phi(t)\,dt \tag{4.3}$$
satisfies V (1) = 1 and V (Pn ) ⊂ Pn . In order to show that V = Vk , we only need to prove that Tξ V (p) = V (∂ξ p) for all ξ ∈ Rd . A little computing shows that

∂ξ p h(t, x) h(t, ξ ) (t) dt

∂ξ V (p)(x) = [−1,1]m

∂ξ p h(t, x) (t) dt

= [−1,1]m

+

m

∂αj p h(t, x) (tj − 1)ξ, αj (t) dt.

j =1 [−1,1]m

Since

∂ ∂tj

(p(h(t, x))) = ∂αj p(h(t, x))(tj − 1)ξ, αj , integration by parts yields

∂αj p h(t, x) (tj − 1)ξ, αj (t) dt

[−1,1]m

ξ, αj = x, αj

[−1,1]m

= −2kj cj

ξ, αj x, αj

∂ p h(t, x) (tj − 1)(t) dt ∂tj

p h(t, x) tj j (t) dt.

[−1,1]m

Therefore,

∂ξ p h(t, x) (t) dt

∂ξ V (p)(x) = [−1,1]m

−2

m j =1

kj c j

ξ, αj x, αj

[−1,1]m

p h(t, x) tj j (t) dt.


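By the change of variables $t \mapsto \theta_j(t)$ we see that for each $j = 1, \dots, m$, we have

$$\int_{[-1,1]^m}p\bigl(h(\theta_j(t),x)\bigr)\Phi(t)\,dt = \int_{[-1,1]^m}\frac{1-t_j}{1+t_j}\,p\bigl(h(t,x)\bigr)\Phi(t)\,dt$$

so that by (4.2)

$$\begin{aligned}
\sum_{j=1}^{m}k_j\langle\xi,\alpha_j\rangle\,\frac{V(p)(x) - V(p)(\sigma_{\alpha_j}x)}{\langle x,\alpha_j\rangle}
&= \sum_{j=1}^{m}k_j\,\frac{\langle\xi,\alpha_j\rangle}{\langle x,\alpha_j\rangle}\int_{[-1,1]^m}\Bigl(p\bigl(h(t,x)\bigr) - p\bigl(h(t,\sigma_{\alpha_j}x)\bigr)\Bigr)\Phi(t)\,dt\\
&= \sum_{j=1}^{m}k_j\,\frac{\langle\xi,\alpha_j\rangle}{\langle x,\alpha_j\rangle}\int_{[-1,1]^m}\Bigl(p\bigl(h(t,x)\bigr) - p\bigl(h(\theta_j(t),x)\bigr)\Bigr)\Phi(t)\,dt\\
&= \sum_{j=1}^{m}k_j\,\frac{\langle\xi,\alpha_j\rangle}{\langle x,\alpha_j\rangle}\int_{[-1,1]^m}\Bigl(1 - \frac{1-t_j}{1+t_j}\Bigr)p\bigl(h(t,x)\bigr)\Phi(t)\,dt\\
&= 2\sum_{j=1}^{m}k_j\,\frac{\langle\xi,\alpha_j\rangle}{\langle x,\alpha_j\rangle}\int_{[-1,1]^m}p\bigl(h(t,x)\bigr)\,t_j\,\Phi_j(t)\,dt.
\end{aligned}$$

This, combined with the last expression of $\partial_\xi V(p)(x)$, yields

$$T_\xi V(p) = \int_{[-1,1]^m}\partial_\xi p\bigl(h(t,x)\bigr)\Phi(t)\,dt = V(\partial_\xi p)$$

and hence completes the proof. □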

References [1] C.F. Dunkl, Yuan Xu, Orthogonal Polynomials of Several Variables, Cambridge Univ. Press, 2001. [2] C.F. Dunkl, M.F.E. de Jeu, E. Opdam, Singular polynomials for finite reflection group, Trans. Amer. Math. Soc. 346 (1994) 237–256. [3] E. Opdam, Dunkl operators, Bessel functions and the discriminant of finite Coxeter group, Compos. Math. 85 (1993) 333–373. [4] M. Roesler, Dunkl operators: Theory and applications, in: Lecture Notes in Math., vol. 1817, Springer, 2003, pp. 93– 165. [5] Y. Xu, Orthogonal polynomials for a family of product weight functions on the spheres, Canad. J. Math. 49 (1997) 175–192.


The classification problem for von Neumann factors

Roman Sasyk a, Asger Törnquist b,∗

a Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Avenue, Ottawa, Ontario, K1N 6N5 Canada
b Department of Mathematics, University of Toronto, 40 St. George Street, Room 6092, Toronto, Ontario, M5R 2E4 Canada

Received 13 July 2008; accepted 14 November 2008
Available online 10 December 2008
Communicated by Alain Connes

Abstract

We prove that it is not possible to classify separable von Neumann factors of types II$_1$, II$_\infty$ or III$_\lambda$, $0 \le \lambda \le 1$, up to isomorphism by a Borel measurable assignment of “countable structures” as invariants. In particular the isomorphism relation of type II$_1$ factors is not smooth. We also prove that the isomorphism relation for von Neumann II$_1$ factors is analytic, but is not Borel. © 2008 Elsevier Inc. All rights reserved.

Keywords: Von Neumann algebras; Classification; Borel reducibility; Turbulence

1. Introduction

The purpose of this paper is to apply the notion of Borel reducibility of equivalence relations, developed extensively in descriptive set theory in recent years, and the deformation-rigidity techniques of Sorin Popa, to study the global structure of the set of factors on a separable Hilbert space. Recall that if $E$, $F$ are equivalence relations on standard Borel spaces $X$ and $Y$, respectively, we say that $E$ is Borel reducible to $F$, written $E \le_B F$, if there is a Borel $f : X \to Y$ such that

$$x E y \iff f(x) F f(y),$$


0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.11.010

R. Sasyk, A. Törnquist / Journal of Functional Analysis 256 (2009) 2710–2724


in other words, if it is possible to classify the points of $X$ up to $E$ equivalence by a Borel assignment of invariants that are $F$-equivalence classes. We write E 0 . Then the random operator $H_\omega$ is ergodic with respect to translations in $\mathbb{Z}^d$ and thus has all the basic properties of ergodic operators, see e.g. [3]. In particular, the integrated density of states (IDS)

$$N(E) = \lim_{L\to\infty} \frac{1}{|\Lambda_L|}\, \mathbb{E}\Bigl(\operatorname{tr} \chi_{(-\infty,E]}\bigl(H^X_{\omega,\Lambda_L}\bigr)\Bigr) \tag{27}$$

exists for all energies $E \in \mathbb{R}$. Here $\Lambda_L = (\tfrac12, L + \tfrac12)^d$ and $H^X_{\omega,\Lambda_L}$ is the restriction of $H_\omega$ to $L^2(\Lambda_L)$ with boundary condition $X \in \{P, N, D\}$, as periodic (P), Neumann (N) and Dirichlet (D) boundary conditions all give the same limit in (27).


J. Baker et al. / Journal of Functional Analysis 256 (2009) 2725–2740

The spectrum $\sigma(H_\omega)$ is almost surely deterministic, i.e. $\Sigma = \sigma(H_\omega)$ for almost every $\omega$, and given by the growth points of the non-decreasing function $N(E)$. It can be characterized in terms of the spectra of those $H_\omega$ for which the configuration $\omega$ is periodic,

$$\Sigma = \overline{\bigcup_{\omega} \sigma(H_\omega)}, \tag{28}$$

where the union is taken over all periodic $\omega$ such that $\omega_i \in \operatorname{supp}\mu$ for all $i$. This corresponds to a well-known result for the Anderson model, e.g. [3], and is found with the same proof. If the support of the distribution $\mu$ contains all the corners $\{(\pm d_{\max}, \dots, \pm d_{\max})\}$ of the cube $[-d_{\max}, d_{\max}]^d$, then it follows from (28) and Proposition 2.1 that $\min \Sigma = E_0 = \min \sigma(H_{\omega_{\min}})$.

For large classes of random Schrödinger operators it is known that the IDS vanishes rapidly at the bottom of the spectrum $E_0$, for example one has Lifshits tail behavior

$$N(E) \sim e^{-c|E-E_0|^{-d/2}} \tag{29}$$

for Anderson models with sign-definite single-site potential, see e.g. [9,13] or [15] for proper statements, proofs and references to the original literature. It turns out that for the random displacement model the behavior of the IDS at the bottom of the spectrum is much more subtle. Here we will present several results for the one-dimensional displacement model, which were obtained in [1]. We will generally assume that alternative (i) holds. We expect quite different phenomena to appear for the multi-dimensional displacement model as well as for the case that the single-site potential satisfies alternative (ii). More discussion of this is included in Section 5.

In the first result we will consider the one-dimensional Bernoulli displacement model, i.e. the case where the distribution of the displacements $\omega_i$ is given by

$$\mu = \tfrac12\,\delta_{d_{\max}} + \tfrac12\,\delta_{-d_{\max}}. \tag{30}$$

It turns out that in this case the low-energy asymptotics of the IDS is at the opposite extreme of Lifshits tails:

Theorem 4.1. Let $H_\omega$ be the one-dimensional symmetric Bernoulli displacement model given by (1), (2) and (30) and assume that alternative (i) holds. Then there exist $C > 0$ and $\varepsilon > 0$ such that

$$N(E) \ge \frac{C}{\log^2(E - E_0)} \tag{31}$$

for $E \in (E_0, E_0 + \varepsilon)$.

As $N(E_0) = 0$, this means that $N(E)$ has infinite upper derivative at $E = E_0$, i.e. the density of states $n(E) = N'(E)$ has a strong singularity at the bottom of the spectrum. This is opposed to the case of Lifshits tails which would yield $n(E_0) = 0$. In fact, (31) says that the IDS is not even Hölder-continuous at $E = E_0$, an even stronger singularity than one gets for the Laplacian $H_0 = -d^2/dx^2$, where the IDS has a van Hove singularity $C|E|^{1/2}$. For general one-dimensional ergodic Schrödinger operators (and for discrete ergodic Schrödinger operators also in higher dimension) the IDS is log-Hölder-continuous at all energies, i.e.

$$|N(E) - N(E')| \le \frac{C}{\bigl|\log|E - E'|\bigr|} \tag{32}$$

for $E$ close to $E'$, see [4,5]. Craig and Simon constructed examples of quasi-periodic potentials which show that the bound (32) is optimal. As far as we know, the result in Theorem 4.1 provides the first known example of a random potential (with finite correlation length) where for at least one energy the IDS is not Hölder-continuous and, in fact, close to the minimal possible regularity for ergodic operators given by (32).

Proof. We will use the standard lower bound, e.g. [3],

$$N(E) \ge \frac{1}{L}\, \mathbb{P}\bigl(E_0(H^D_{\omega,L}) < E\bigr), \tag{33}$$

which holds for arbitrary $L$, to be chosen later depending on $E$. Here $H^D_{\omega,L}$ is short for $H^D_{\omega,\Lambda_L}$, $\Lambda_L = (1/2, L + 1/2)$.

To show that $E_0(H^D_{\omega,L}) < E$ we will find $\psi_\omega \in D(H^D_{\omega,L})$ with $\|\psi_\omega\| = 1$ and $\langle \psi_\omega, H^D_{\omega,L}\psi_\omega\rangle < E$. To construct $\psi_\omega$, let displacements $\omega = (\omega_1, \dots, \omega_L)$ be given and let $u_N$ be the solution of $-u'' + V_\omega u = E_0 u$ with $u_N(\tfrac12) = 1$, $u_N'(\tfrac12) = 0$. Choose cut-off functions $\theta_L \in C_0^\infty(\mathbb{R})$ with $0 \le \theta_L \le 1$, $\operatorname{supp}\theta_L \subset [1, L]$, $\theta_L(x) = 1$ for $3/2 \le x \le L - 1/2$, and $\|\theta_L'\|_\infty$ and $\|\theta_L''\|_\infty$ uniformly bounded in $L$. As the $\omega_i$ have distribution (30), we have $\omega_i \in \{\pm d_{\max}\}$ for all $i$. Thus the restriction of $-d^2/dx^2 + V_\omega$ to $(i - 1/2, i + 1/2)$ has Neumann ground state energy $E_0$ for all $i \in \{1, \dots, L\}$, which implies that

$$u_N'(i + 1/2) = 0 \quad \text{for all } i \in \{1, \dots, L\}. \tag{34}$$

We choose $\psi_\omega := \theta_L u_N / \|\theta_L u_N\|$ and calculate

$$\langle \psi_\omega, H^D_{\omega,L}\psi_\omega\rangle - E_0 = \frac{\langle \theta_L u_N, -\theta_L'' u_N\rangle - 2\langle \theta_L u_N, \theta_L' u_N'\rangle}{\|\theta_L u_N\|^2} \le \frac{\tilde\beta\,\bigl(1 + u_N^2(L + 1/2)\bigr)}{\int_{3/2}^{L-1/2} u_N^2(x)\,dx} \le \frac{\beta\,\bigl(1 + u_N^2(L + 1/2)\bigr)}{\sum_{i=1}^{L} u_N^2(i + 1/2)}, \tag{35}$$

where $\tilde\beta > 0$ and $\beta > 0$ can be chosen uniformly in $\omega$ and $L$. Here we have repeatedly used standard a priori upper and lower bounds on solutions of $-u'' + Vu = Eu$, for example that $r(x) \sim r(x+1)$ and $\int_x^{x+1} u^2 \sim r^2(x)$, where $r(x) = (u^2(x) + u'^2(x))^{1/2}$ is the Prüfer amplitude of $u$ and constants can be chosen uniform as long as $E$ and $\|V\|_\infty$ vary in a bounded interval, see e.g. [16,17] for more details. We also use that by (34) the Prüfer amplitude of $u_N$ at the points $i + 1/2$ coincides with $u_N(i + 1/2)$. Thus

$$\mathbb{P}\bigl(E_0(H^D_{\omega,L}) < E\bigr) \ge \mathbb{P}\Bigl(\beta\,\frac{1 + u_N^2(L + 1/2)}{\sum_{i=1}^{L} u_N^2(i + 1/2)} < E - E_0\Bigr). \tag{36}$$

Another consequence of $\omega_i \in \{\pm d_{\max}\}$ is that $u_N$ satisfies (16) for every $i$ with a positive $r \ne 1$, using that we are in alternative (i). Assume without restriction that $r > 1$ (if $0 < r < 1$ then we can do the following construction from “right to left,” choosing $u_N(L + 1/2) = 1$, $u_N'(L + 1/2) = 0$) and set

$$X_i = \frac{\log\bigl(u_N(i + 1/2)/u_N(i - 1/2)\bigr)}{\log r}, \tag{37}$$

$i = 1, \dots, L$. The $X_i$ are independent symmetric Bernoulli random variables with values $\pm 1$, and

$$u_N^2(i + 1/2) = e^{2 S_i \log r}, \tag{38}$$

where $S_i = X_1 + \dots + X_i$. If $Y := \max_{i=1,\dots,L} S_i$, then it is a consequence of the reflection principle for symmetric random walks, e.g. [7], that

$$\mathbb{P}\bigl(Y \ge \sqrt{L},\ S_L \le 0\bigr) = \mathbb{P}\bigl(S_L \ge 2\sqrt{L}\bigr). \tag{39}$$
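The reflection identity (39) can be checked exactly for small walks. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, the level $\sqrt{L}$ is replaced by an integer `a` (so the check assumes $L$ is a perfect square), and probabilities are computed with exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def prob_max_and_end(L, a, b):
    """Exact P(max_{i<=L} S_i >= a and S_L <= b) for a symmetric +-1 walk.

    Dynamic programming over states (current position, level a already hit).
    Assumes a >= 1, so the starting point S_0 = 0 does not count as a hit.
    """
    states = {(0, False): Fraction(1)}
    for _ in range(L):
        nxt = {}
        for (pos, hit), p in states.items():
            for step in (-1, 1):
                new_pos = pos + step
                key = (new_pos, hit or new_pos >= a)
                nxt[key] = nxt.get(key, Fraction(0)) + p / 2
        states = nxt
    return sum(p for (pos, hit), p in states.items() if hit and pos <= b)


def prob_tail(L, c):
    """Exact P(S_L >= c), using S_L = 2k - L with k ~ Binomial(L, 1/2)."""
    return sum(Fraction(comb(L, k), 2 ** L)
               for k in range(L + 1) if 2 * k - L >= c)


if __name__ == "__main__":
    L, a = 16, 4  # a plays the role of sqrt(L)
    lhs = prob_max_and_end(L, a, 0)
    rhs = prob_tail(L, 2 * a)
    print(lhs, rhs, lhs == rhs)
```

The two sides agree because every path that reaches level `a` and ends at or below 0 can be reflected at its first visit to `a` into a path ending at or above `2a`, and this map is a bijection.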

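The right-hand side of (39) converges to $(2\pi)^{-1/2}\int_2^\infty \exp(-y^2/2)\,dy > 0$ as $L \to \infty$ by the central limit theorem.

Let $A_L := \{\omega \mid Y \ge \sqrt{L} \text{ and } S_L \le 0\}$. If $Y \ge \sqrt{L}$, then $\sum_{i=1}^{L} u_N^2(i + 1/2) \ge \exp(2\sqrt{L}\log r)$. Also, $S_L \le 0$ means $u_N^2(L + 1/2) \le 1$. Thus (36) implies

$$\mathbb{P}\bigl(E_0(H^D_{\omega,L}) < E\bigr) \ge \mathbb{P}\Bigl(\beta\,\frac{1 + u_N^2(L + 1/2)}{\sum_{i=1}^{L} u_N^2(i + 1/2)} < E - E_0 \Bigm| A_L\Bigr)\,\mathbb{P}(A_L) = \mathbb{P}(A_L) \ge c_0 > 0 \tag{40}$$

if $2\beta\exp(-2\sqrt{L}\log r) < E - E_0$ and $L$ sufficiently large. This determines the choice of $L \in \mathbb{N}$ for given $E$ such that

$$\frac{1}{2\beta}\,e^{-2\sqrt{L}\log r} < E - E_0 \le \frac{1}{2\beta}\,e^{-2\sqrt{L-1}\log r}. \tag{41}$$

Thus $L \sim \bigl(\frac{\log 2\beta(E - E_0)}{\log r}\bigr)^2$. From (33) and (40) we have $N(E) \ge c_0/L$, which, for $E - E_0$ sufficiently small, takes the form (31). □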
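The last step of the proof can be spelled out as follows. This is only a sketch: $\kappa$ is a hypothetical name for whichever positive constant multiplies the exponentials in the choice of $L$, and $L$ is taken to be the smallest integer with $\kappa e^{-2\sqrt{L}\log r} < E - E_0$, so that $E - E_0 \le \kappa e^{-2\sqrt{L-1}\log r}$. Taking logarithms in the second inequality gives

$$2\sqrt{L-1}\,\log r \le \log\frac{\kappa}{E - E_0}, \qquad\text{hence}\qquad L \le 1 + \Bigl(\frac{\log(\kappa/(E - E_0))}{2\log r}\Bigr)^2.$$

Inserting this into $N(E) \ge c_0/L$ yields

$$N(E) \ge \frac{c_0}{1 + \bigl(\log(\kappa/(E - E_0))/(2\log r)\bigr)^2} \ge \frac{C}{\log^2(E - E_0)}$$

for $E - E_0$ small enough, since $\log(\kappa/(E - E_0))$ and $|\log(E - E_0)|$ differ only by the additive constant $\log\kappa$, which is negligible as $E \downarrow E_0$.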

As mentioned above, (31) says in particular that the IDS is not Hölder-continuous at E = E0 . This is only possible if the distribution μ is concentrated in the extreme points dmax and −dmax , as is demonstrated by our next result.



Theorem 4.2. Suppose that the distribution $\mu$ of the $\omega_i$ in the one-dimensional displacement model (1), (2) satisfies

$$\mu\bigl((-d_{\max}, d_{\max})\bigr) > 0. \tag{42}$$

Then the IDS $N(E)$ is Hölder-continuous at $E = E_0$.

This result may not be optimal. We expect that under the conditions of Theorem 4.2 one can show that $N(E) \le C_\alpha |E - E_0|^\alpha$ near $E_0$ for arbitrary $\alpha > 0$. But, as long as the distribution $\mu$ is chosen symmetric and not too small at $\pm d_{\max}$, one does not get Lifshits tail decay as in (29). To make this precise, define the Lifshits exponent $\gamma$ at $E_0$ by

$$\gamma = \lim_{E \downarrow E_0} \frac{\log(-\log N(E))}{\log(E - E_0)} \tag{43}$$

whenever this limit exists. Note that $\gamma \le 0$. If $\gamma < 0$, then it determines the asymptotics of the IDS in the sense that, up to logarithmic corrections, $N(E) \sim C_1 \exp(-C_2 (E - E_0)^\gamma)$ as $E \downarrow E_0$.

Theorem 4.3. Assume that the distribution $\mu$ is symmetric and satisfies

$$\mu\bigl([d_{\max} - \epsilon, d_{\max}] \cup [-d_{\max}, -d_{\max} + \epsilon]\bigr) \ge C_1 \epsilon^N \tag{44}$$

for some positive $C_1$ and $N$ and all $\epsilon > 0$. Also assume that the single-site potential $q$ is uniformly Hölder-continuous, i.e. that $|q(x) - q(y)| \le C_2 |x - y|^\rho$ for some $C_2$ and $\rho > 0$ and all $x$, $y$. Then $\gamma = 0$.

In the following we sketch the proofs of Theorems 4.2 and 4.3, referring for additional details to [1]. To prove Theorem 4.3 we follow the general strategy of the proof of Theorem 4.1, starting with (33) and using the test function $\psi_\omega := \theta_L u_N / \|\theta_L u_N\|$. However, the construction of $u_N$ needs to be modified as follows: On each interval $[i - 1/2, i + 1/2]$, $i \in \{1, \dots, L\}$, $u_N$ is chosen to coincide with a constant multiple of the positive ground state of the Neumann problem for $-d^2/dx^2 + q(x - i - \omega_i)$ on $[i - 1/2, i + 1/2]$. Scaling constants are chosen such that $u_N(1/2) = 1$ and $u_N$ is continuously differentiable throughout $[1/2, L + 1/2]$. As we now generally have $E_0(\omega_i) \ne E_0$, this leads to extra terms in the bound

|θL ψω , −θL

ψω | + 2|θL ψω , θL ψω | 1 N (E) P L θL ψω 2 L θL ψω 2 + (E0 (ωi ) − E0 ) < E − E0 , θL ψω 2

(45)

i=1

i here θL ψω 21 := i−1 θL2 ψω2 . Due to the symmetry of μ, the numbers log uN (i + 1/2), i = 1, . . . , L, are still a symmetric random walk (but not Bernoulli). Versions of the reflection principle and central limit theorem for



general symmetric random walks and a choice of L as in (41) (with a suitable positive constant replacing log r) lead to the bound L θL ψω 2i C N (E) P (E0 (ωi ) − E0 ) < E − E0 L θL ψω 2 i=1 L C (E0 (ωi ) − E0 ) < E − E0 P L i=1

L E − E0 C P E0 (ω1 ) − E0 < . L L

(46)

In Lemma 2.1 of [2] the continuity of E0 (·) was shown. The proof given there provides the bound E0 (a1 ) − E0 (a2 )p C

q(x − a1 ) − q(x − a2 )p dx

for any p 2. Uniform Hölder continuity of q gives Hölder continuity of E0 (·). Using E0 = E0 (dmax ) = E0 (−dmax ) and (44) we see that P(E0 (ω1 ) − E0 < δ) C1 (δ/C)N/ρ . We plug into (46) N (E)

C E − E0 N/ρ . L L

From this bound, having chosen L through (41), a calculation shows that the Lifshits exponent vanishes. The proof of Theorem 4.2 is based on the standard upper bound, e.g. [3], N N (E) CP E0 Hω,L E .

(47)

s0 e−2C1 (L+1) E − E0 s0 e−2C1 L

(48)

We choose L through

with constants s0 and C1 to be determined later. By the calculation done in (9), i L 2 N i−1 |ψω | E0 Hω,L E0 (ωi ) L+1/2 , |ψω |2 i=1 1/2

(49)

N . By a priori bounds (e.g. [16]) there exists C > 0 such where ψω is the ground state of Hω,L 1 that

e

−C1 L

i i−1

|ψω |2 dx eC1 L



uniformly in L ∈ N, i ∈ {1, . . . , L} and all configurations ω. Using this C1 in (48) we further estimate L E0 (ωi ) − E0 N s0 e−γ0 L . P E0 Hω,L E P L i=1

Here the last step is a large deviations bound, which is applicable with suitably chosen $s_0 > 0$ and $\gamma_0 > 0$ due to the assumption (42). Note for this that $E_0(\omega_i) - E_0$ are non-negative random variables which are strictly positive with positive probability. With this $s_0$ in (48) it follows that $e^{-\gamma_0 L} \le C(E - E_0)^\alpha$, where $\alpha := s_0/4C_1$. This completes the proof.

5. Concluding remarks

With the above results we have only started to touch the various possibilities for the low-energy asymptotics of the IDS in the random displacement model. There are several other regimes which we have not considered yet:

(i) For one-dimensional random displacement models with non-symmetric distribution, in particular the case $\mu = p\,\delta_{d_{\max}} + (1 - p)\,\delta_{-d_{\max}}$ with $p \ne 1/2$, we expect that the IDS might have Lifshits tails.

(ii) It would be most interesting to decide if the uniqueness of the minimizing periodic configuration established in Theorem 2.4(b) leads to Lifshits tails of the IDS at $E_0$ for the multi-dimensional random displacement model with general (or suitable) distributions $\mu$. Beyond uniqueness of the minimizing configuration this would require quantitative results on the probability that other configurations have ground state energy near $E_0$. In this context we mention the recent work of Klopp and Nakamura [11] on sign-indefinite Anderson models, where some phenomena similar to those found by us for the random displacement model appear. In particular, they find that Lifshits tail as well as van Hove asymptotics of the IDS at the bottom of the spectrum are both possible in their model, depending on the choice of single-site potential and distribution of the random parameters. They have informed us about work in preparation [12] which, when combined with the uniqueness result Theorem 2.4(b) above, should indeed lead to Lifshits-type asymptotics of the IDS for multi-dimensional random displacement models as considered here. Their methods require that the distribution $\mu$ of the displacements has discrete support containing the corners of $[-d_{\max}, d_{\max}]^d$.

(iii) We have used the non-overlap condition $\operatorname{supp}\mu \subset [-d_{\max}, d_{\max}]^d$, $d_{\max} + r = 1/2$, mostly for technical reasons. In particular, it is crucial for the Neumann-bracketing arguments used in [2] and also Section 3 above. However, relaxing this condition will also lead to different phenomena. We mention the recent work by Fukushima [8] who studies the random displacement model (1), (2) for positive $q$ and displacements with unbounded distribution $\mu$. In this case it is easily seen that the almost sure spectrum is $[0, \infty)$, due to the presence of large empty regions in typical single-site configurations (while in our setting the spectral minimum would be strictly positive). Under this condition Fukushima establishes Lifshits tails of the IDS at 0. Another interesting task would be to look at intermediate cases, where $\operatorname{supp}\mu$ is bounded but not small, allowing overlapping finite clusters of single-site potentials, but no large empty regions.

(iv) Under alternative (ii) all random configurations give the same ground state energy. This is an example of a random operator with a stable spectral boundary (as opposed to fluctuation boundaries). In other examples of this type, for a discussion see Sections 6B and 9 of [13], this



has been found to lead to van Hove behavior of the IDS, i.e. $N(E) \sim (E - E_0)^{d/2}$ as for the unperturbed Laplacian. We also expect this here.

Acknowledgments

M.L. would like to acknowledge partial support through NSF grant DMS-0600037. G.S. was partially supported through NSF grant DMS-0653374.

References

[1] J. Baker, Spectral properties of displacement models, PhD thesis, University of Alabama at Birmingham, 2007, electronically available at http://www.mhsl.uab.edu/dt/2007p/baker.pdf.
[2] J. Baker, M. Loss, G. Stolz, Minimizing the ground state energy in a randomly deformed lattice, Comm. Math. Phys. 283 (2008) 397–415.
[3] R. Carmona, J. Lacroix, Spectral Theory of Random Schrödinger Operators, Birkhäuser, Basel, 1990.
[4] W. Craig, B. Simon, Subharmonicity of the Lyapunov exponent, Duke Math. J. 50 (1983) 551–560.
[5] W. Craig, B. Simon, Log Hölder continuity of the integrated density of states for stochastic Jacobi matrices, Comm. Math. Phys. 90 (1983) 207–218.
[6] D. Damanik, R. Sims, G. Stolz, Localization for one-dimensional, continuum, Bernoulli–Anderson models, Duke Math. J. 114 (2002) 59–100.
[7] W. Feller, An Introduction to Probability Theory and Its Applications, vol. I, John Wiley & Sons, Inc., 1968.
[8] R. Fukushima, Brownian survival and Lifshitz tail in perturbed lattice disorder, arXiv:0807.2486.
[9] W. Kirsch, An invitation to random Schrödinger operators, arXiv:0709.3707.
[10] F. Klopp, Localization for semiclassical continuous random Schrödinger operators. II. The random displacement model, Helv. Phys. Acta 66 (1993) 810–841.
[11] F. Klopp, S. Nakamura, Spectral extrema and Lifshitz tails for non monotonous alloy type models, arXiv:0804.4079.
[12] F. Klopp, S. Nakamura, private communication.
[13] L. Pastur, A. Figotin, Spectra of Random and Almost-Periodic Operators, Springer, Berlin, 1992.
[14] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc. 7 (1982) 447–526.
[15] P. Stollmann, Caught by Disorder: Bound States in Random Media, Prog. Math. Phys., vol. 20, Birkhäuser, Boston, 2001.
[16] G. Stolz, On the absolutely continuous spectrum of perturbed Sturm–Liouville operators, J. Reine Angew. Math. 416 (1991) 1–23.
[17] G. Stolz, Bounded solutions and absolute continuity of Sturm–Liouville operators, J. Math. Anal. Appl. 169 (1992) 210–228.
[18] G. Stolz, Strategies in localization proofs for one-dimensional random Schrödinger operators, in: Spectral and Inverse Spectral Theory, Goa, 2000, Proc. Indian Acad. Sci. Math. Sci. 112 (2002) 229–243.


Note

A note on the paper “Optimizing improved Hardy inequalities” by S. Filippas and A. Tertikas Roberta Musina 1 Dipartimento di Matematica ed Informatica, Università di Udine, via delle Scienze 206, 33100 Udine, Italy Received 14 July 2008; accepted 22 August 2008 Available online 3 September 2008 Communicated by H. Brezis

Keywords: Hardy inequality; Caffarelli–Kohn–Nirenberg inequalities

In this short note we prove that Theorem A in [3] is incorrect. Indeed, the following proposition holds.

Proposition 0.1. Let $n \ge 2$ and let $B^n$ be the unit ball in $\mathbb{R}^n$. If $q > 2$ then

$$I := \inf_{w \in C_c^\infty(B^n \setminus \{0\})} \frac{\int_{B^n} |\nabla w|^2\,dx - \bigl(\frac{n-2}{2}\bigr)^2 \int_{B^n} |x|^{-2} w^2\,dx}{\bigl(\int_{B^n} |x|^{-n + q\frac{n-2}{2}}\,|\log|x||^{-1-q/2}\,|w|^q\,dx\bigr)^{2/q}} = 0.$$

Proof. In order to simplify computations we notice that

$$I = \inf_{u \in C_c^\infty(B^n \setminus \{0\})} \frac{\int_{B^n} |x|^{2-n} |\nabla u|^2\,dx}{\bigl(\int_{B^n} |x|^{-n}\,|\log|x||^{-1-q/2}\,|u|^q\,dx\bigr)^{2/q}}. \tag{0.1}$$

DOI of original article: 10.1006/jfan.2001.3900. 1 Fax: +39 0432558499. 0022-1236/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.jfa.2008.08.009


R. Musina / Journal of Functional Analysis 256 (2009) 2741–2745

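To prove (0.1) in case $n \ge 3$ it is sufficient to use the standard functional change

$$w \in C_c^\infty(B^n \setminus \{0\}) \mapsto u(x) := |x|^{\frac{n-2}{2}} w(x) \in C_c^\infty(B^n \setminus \{0\}).$$

Direct computations and the divergence theorem lead to the following identities:

$$\int_{B^n} |x|^{2-n} |\nabla u|^2\,dx = \int_{B^n} |\nabla w|^2\,dx - \Bigl(\frac{n-2}{2}\Bigr)^2 \int_{B^n} |x|^{-2} w^2\,dx,$$

$$\int_{B^n} |x|^{-n}\,|\log|x||^{-1-q/2}\,|u|^q\,dx = \int_{B^n} |x|^{-n + q\frac{n-2}{2}}\,|\log|x||^{-1-q/2}\,|w|^q\,dx.$$

Therefore (0.1) holds for any $n \ge 2$. Fix a nontrivial smooth radially symmetric map $u_1$ with compact support in $B^n \setminus \{0\}$. Our aim is to test the infimum $I$ in (0.1) with the following smooth map on $B^n$:

$$u_\alpha(r\sigma) = u_1(r^\alpha)\,\varphi(\sigma),$$

where $\alpha > 0$ and $\varphi$ is any smooth map on $S^{n-1}$. We set

$$c_1 := \int_{B^n} |x|^{2-n} |\nabla u_1|^2\,dx, \qquad c_2 := \int_{B^n} |x|^{-n} |u_1|^2\,dx, \qquad c_3 := \Bigl(\int_{B^n} |x|^{-n}\,|\log|x||^{-1-q/2}\,|u_1|^q\,dx\Bigr)^{2/q},$$

and we denote by $\nabla_\sigma \varphi$ the gradient of $\varphi$ on $S^{n-1}$. Since

$$\int_{B^n} |x|^{2-n} |\nabla u_\alpha|^2\,dx = \alpha\Bigl(c_1 \int_{S^{n-1}} |\varphi|^2\,d\sigma + \alpha^{-2} c_2 \int_{S^{n-1}} |\nabla_\sigma \varphi|^2\,d\sigma\Bigr),$$

$$\Bigl(\int_{B^n} |x|^{-n}\,|\log|x||^{-1-q/2}\,|u_\alpha|^q\,dx\Bigr)^{2/q} = \alpha\,c_3 \Bigl(\int_{S^{n-1}} |\varphi|^q\,d\sigma\Bigr)^{2/q},$$

then

$$I \le \frac{c_1 \int_{S^{n-1}} |\varphi|^2\,d\sigma + \alpha^{-2} c_2 \int_{S^{n-1}} |\nabla_\sigma \varphi|^2\,d\sigma}{c_3 \bigl(\int_{S^{n-1}} |\varphi|^q\,d\sigma\bigr)^{2/q}}.$$

Letting $\alpha \to +\infty$ we infer

$$I \le c_1 c_3^{-1}\,\frac{\int_{S^{n-1}} |\varphi|^2\,d\sigma}{\bigl(\int_{S^{n-1}} |\varphi|^q\,d\sigma\bigr)^{2/q}}.$$

Since $q > 2$, then

$$\inf_{\varphi \in C^\infty(S^{n-1})} \frac{\int_{S^{n-1}} |\varphi|^2\,d\sigma}{\bigl(\int_{S^{n-1}} |\varphi|^q\,d\sigma\bigr)^{2/q}} = 0,$$

and the conclusion readily follows. □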
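One way to see that the last infimum vanishes is to concentrate $\varphi$ on a small cap. The following is a sketch only; the cut-off functions $\varphi_\epsilon$ and the constants $c$, $C$ are illustrative names that do not appear in the original. Fix a point of $S^{n-1}$ and choose smooth $\varphi_\epsilon$ with $0 \le \varphi_\epsilon \le 1$, equal to 1 on the geodesic ball of radius $\epsilon$ around that point and vanishing outside the ball of radius $2\epsilon$. Since geodesic balls of radius $\rho$ in $S^{n-1}$ have measure comparable to $\rho^{n-1}$ for small $\rho$,

$$\int_{S^{n-1}} |\varphi_\epsilon|^2\,d\sigma \le C\epsilon^{n-1}, \qquad \int_{S^{n-1}} |\varphi_\epsilon|^q\,d\sigma \ge c\,\epsilon^{n-1},$$

so that

$$\frac{\int_{S^{n-1}} |\varphi_\epsilon|^2\,d\sigma}{\bigl(\int_{S^{n-1}} |\varphi_\epsilon|^q\,d\sigma\bigr)^{2/q}} \le C c^{-2/q}\,\epsilon^{(n-1)(1 - 2/q)} \longrightarrow 0 \quad \text{as } \epsilon \to 0,$$

because $n \ge 2$ and $q > 2$ make the exponent $(n-1)(1 - 2/q)$ strictly positive.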

1. Remarks

The infimum in Proposition 0.1 is positive provided that $q = 2$. More precisely, the following improved Hardy inequality holds (see also [1,3,4] and Remark 1.2 for the case $n = 2$).

Proposition 1.1. Let $n \ge 2$ and let $B^n$ be the unit ball in $\mathbb{R}^n$. Then

$$\int_{B^n} |w_r|^2\,dx \ge \Bigl(\frac{n-2}{2}\Bigr)^2 \int_{B^n} |x|^{-2} w^2\,dx + \frac14 \int_{B^n} |x|^{-2}\,|\log|x||^{-2}\,|w|^2\,dx \tag{1.1}$$

for any $w \in C_c^\infty(B^n)$, where $w_r(x) := \nabla w(x) \cdot \frac{x}{|x|}$ is the radial derivative of $w$ at $x$.

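Proof. Set $\mathcal{C}^n := (0, +\infty) \times S^{n-1}$. For any smooth map $v = v(s, \sigma) \in C_c^\infty(\mathcal{C}^n)$ and for any fixed $\sigma \in S^{n-1}$ we write down the Hardy inequality for the smooth map $v(\cdot, \sigma) : (0, +\infty) \to \mathbb{R}$:

$$\int_0^{+\infty} |v_s(s, \sigma)|^2\,ds \ge \frac14 \int_0^{+\infty} s^{-2}\,|v(s, \sigma)|^2\,ds.$$

Integrating on $S^{n-1}$ we get

$$\int_{\mathcal{C}^n} |v_s(s, \sigma)|^2\,ds\,d\sigma \ge \frac14 \int_{\mathcal{C}^n} s^{-2}\,|v(s, \sigma)|^2\,ds\,d\sigma \qquad \forall v \in C_c^\infty(\mathcal{C}^n). \tag{1.2}$$

One can identify maps $w \in C_c^\infty(B^n \setminus \{0\})$ with maps $v \in C_c^\infty(\mathcal{C}^n)$ by composing the functional transform of the proof of Proposition 0.1 with the Emden–Fowler transform. More precisely, we set

$$w(r\sigma) = r^{\frac{2-n}{2}}\,v(|\log r|, \sigma), \qquad \text{for } r \in (0, 1),\ \sigma \in S^{n-1}.$$

Since

$$\int_{B^n} |w_r|^2\,dx - \Bigl(\frac{n-2}{2}\Bigr)^2 \int_{B^n} |x|^{-2} w^2\,dx = \int_{\mathcal{C}^n} |v_s|^2\,ds\,d\sigma,$$

$$\int_{B^n} |x|^{-2}\,|\log|x||^{-2}\,|w|^2\,dx = \int_{\mathcal{C}^n} s^{-2}\,|v|^2\,ds\,d\sigma,$$

from (1.2) one infers that (1.1) holds for any $w \in C_c^\infty(B^n \setminus \{0\})$. The conclusion follows by a density argument. For example, it suffices to approximate any map $w \in C_c^\infty(B^n)$ with

$$w_h(x) := \eta_h(|x|)\,w(x), \qquad \eta_h(r) = \begin{cases} 0 & \text{if } r \le h^{-2}, \\ \dfrac{\log(h^2 r)}{|\log h|} & \text{if } h^{-2} < r < h^{-1}, \\ 1 & \text{if } h^{-1} \le r < 1. \end{cases}$$

□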
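The one-dimensional Hardy inequality at the heart of this proof can be probed on an explicit family of test functions. The sketch below is illustrative only: the family $v_a(s) = s^a e^{-s}$ with $a > 1/2$ and the helper names are assumptions, and these functions are not compactly supported (they merely vanish at 0 and decay at infinity). The integrals are evaluated in closed form through $\int_0^\infty s^p e^{-2s}\,ds = \Gamma(p+1)/2^{p+1}$.

```python
from math import gamma


def moment(p):
    """Closed form of the integral of s**p * exp(-2*s) over (0, inf), p > -1."""
    return gamma(p + 1) / 2 ** (p + 1)


def hardy_quotient(a):
    """Return (int |v'|^2 ds) / (int s^-2 |v|^2 ds) for v(s) = s**a * exp(-s).

    Requires a > 1/2 so that s^-2 |v|^2 is integrable near s = 0.
    """
    if a <= 0.5:
        raise ValueError("a must exceed 1/2")
    # v'(s) = (a*s**(a-1) - s**a) * exp(-s); expand the square term by term.
    numerator = a * a * moment(2 * a - 2) - 2 * a * moment(2 * a - 1) + moment(2 * a)
    denominator = moment(2 * a - 2)
    return numerator / denominator


if __name__ == "__main__":
    for a in (2.0, 1.0, 0.6, 0.51, 0.501):
        print(a, hardy_quotient(a))
```

For this family the quotient works out to $a/2$, so it always stays above $1/4$ and approaches $1/4$ only in the limit $a \downarrow 1/2$, where the test function leaves the admissible class. This is consistent with $1/4$ being the best constant in the Hardy inequality without being attained.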

Remark 1.2. If $n = 2$ the weight $|x|^{-2}$ is not summable. In this case (1.1) becomes

$$\int_{B^2} |w_r|^2\,dx \ge \frac14 \int_{B^2} |x|^{-2}\,|\log|x||^{-2}\,|w|^2\,dx \qquad \forall w \in C_c^\infty(B^2).$$

This inequality was first proved by J. Leray [4] in 1933.

Remark 1.3. The constant 1/4 in (1.1) is sharp, and it is not achieved.

We conclude this note by showing that inequality (1.8) in [3] is true for radially symmetric maps.

Proposition 1.4. Let $n \ge 2$ and let $B^n$ be the unit ball in $\mathbb{R}^n$. Then for any $q > 2$ there exists $c > 0$ such that

$$\int_{B^n} |\nabla w|^2\,dx \ge \Bigl(\frac{n-2}{2}\Bigr)^2 \int_{B^n} |x|^{-2} w^2\,dx + c\,\Bigl(\int_{B^n} |x|^{-n + q\frac{n-2}{2}}\,|\log|x||^{-1-q/2}\,|w|^q\,dx\Bigr)^{2/q} \tag{1.3}$$

for any $w \in C_c^\infty(B^n)$ radially symmetric about the origin.

Proof. For $w \in C_c^\infty(B^n \setminus \{0\})$ radially symmetric we use once again the transform

$$w(r) = r^{\frac{2-n}{2}}\,v(|\log r|).$$

A direct computation shows that (1.3) is equivalent to

$$\int_0^{+\infty} |v_s|^2\,ds \ge c\,\Bigl(\int_0^{+\infty} s^{-1-q/2}\,|v|^q\,ds\Bigr)^{2/q}, \tag{1.4}$$

with $c > 0$ independent of $v$. To prove (1.4) we notice that

$$\int_0^{+\infty} s^2\,\Bigl|\frac{d}{ds}\bigl(s^{-1} v\bigr)\Bigr|^2\,ds = \int_0^{+\infty} |v_s|^2\,ds.$$
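The identity between the weighted integral of $\frac{d}{ds}(s^{-1}v)$ and the integral of $|v_s|^2$ follows by expanding the square and integrating by parts. A short sketch, assuming $v \in C_c^\infty((0, +\infty))$ so that all boundary terms vanish:

$$s^2\,\Bigl|\frac{d}{ds}\bigl(s^{-1} v\bigr)\Bigr|^2 = \bigl|v_s - s^{-1} v\bigr|^2 = |v_s|^2 - s^{-1}\,(v^2)_s + s^{-2} v^2,$$

$$\int_0^{+\infty} s^{-1}\,(v^2)_s\,ds = \int_0^{+\infty} s^{-2}\,v^2\,ds,$$

so the last two terms cancel after integration and

$$\int_0^{+\infty} s^2\,\Bigl|\frac{d}{ds}\bigl(s^{-1} v\bigr)\Bigr|^2\,ds = \int_0^{+\infty} |v_s|^2\,ds.$$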


On the other hand, the Caffarelli–Kohn–Nirenberg inequalities in [2] imply that

$$\int_0^{+\infty} s^2\,\Bigl|\frac{d}{ds}\bigl(s^{-1} v\bigr)\Bigr|^2\,ds \ge c\,\Bigl(\int_0^{+\infty} s^{-1+\frac{q}{2}}\,\bigl|s^{-1} v\bigr|^q\,ds\Bigr)^{2/q},$$

and (1.4) readily follows. Thus (1.3) holds for any radially symmetric map in $C_c^\infty(B^n \setminus \{0\})$. To conclude the proof one can use a density argument, as in the proof of Proposition 1.1. □

Remark 1.5. The inequalities in Propositions 1.1 and 1.4 are related to some elliptic critical problems on the unit ball that will be investigated in [5].

References

[1] Adimurthi, N. Chaduri, N. Ramaswamy, An improved Hardy–Sobolev inequality and its applications, Proc. Amer. Math. Soc. 130 (2002) 489–505.
[2] L. Caffarelli, R. Kohn, L. Nirenberg, First order interpolation inequalities with weights, Compos. Math. 53 (1984) 259–275.
[3] S. Filippas, A. Tertikas, Optimizing improved Hardy inequalities, J. Funct. Anal. 192 (2002) 186–233.
[4] J. Leray, Étude de diverses équations integrales nonlinéaire et de quelques problèmes que pose l’hydrodinamique, J. Math. Pures Appl. 9 (12) (1933) 1–82.
[5] R. Musina, Nonexistence and existence results for some critical Dirichlet’s problems on the unit ball, in preparation.
