Communications In Mathematical Physics - Volume 292


Author: M. Aizenman (Chief Editor)


Commun. Math. Phys. 292, 1–28 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0839-8

Communications in

Mathematical Physics

Right Limits and Reflectionless Measures for CMV Matrices

Jonathan Breuer, Eric Ryckman, Maxim Zinchenko

Mathematics 253-37, California Institute of Technology, Pasadena, CA 91125-0001, USA. E-mail: [email protected]; [email protected]; [email protected]

Received: 24 November 2008 / Accepted: 10 February 2009
Published online: 29 May 2009 – © Springer-Verlag 2009

Abstract: We study CMV matrices by focusing on their right-limit sets. We prove a CMV version of a recent result of Remling dealing with the implications of the existence of absolutely continuous spectrum, and we study some of its consequences. We further demonstrate the usefulness of right limits in the study of weak asymptotic convergence of spectral measures and ratio asymptotics for orthogonal polynomials by extending and refining earlier results of Khrushchev. To demonstrate the analogy with the Jacobi case, we recover corresponding previous results of Simon using the same approach.

1. Introduction

This paper considers some issues in the spectral theory of CMV matrices viewed through the lens of the notion of right limits. In particular, a central theme will be the fact that one may use the properties of right limits of a given CMV matrix to deduce relations between the asymptotics of its entries and its spectral measure.

CMV matrices (see Definition 1.2 below) were named after Cantero, Moral and Velázquez [4] and may be described as the unitary analog of Jacobi matrices: they arise naturally in the theory of orthogonal polynomials on the unit circle (OPUC) in much the same way that Jacobi matrices arise in the theory of orthogonal polynomials on the real line (OPRL).

Two related topics will be at the focus of our discussion. The first is the extension to the CMV setting of a collection of results, proven recently by Remling [33], describing various consequences of the existence of absolutely continuous spectrum of Jacobi matrices. The second topic is the simplification of various elements of Khrushchev's theory of weak limits of spectral measures, through the understanding that the matrices at the center of attention have right limits in a very special class.

As we shall see, these two subjects are intimately connected through the notion of reflectionless whole-line CMV matrices. This is a concept that has been extensively investigated in recent years, in the context of both CMV and Jacobi matrices ([6,7,9–11,14,17–22,24–29,32,33,38–42]) and was seen to have numerous applications in their spectral theory. There are various definitions of this notion, all of which turn out to be equivalent in the Jacobi matrix case. We shall show that this is not true in the CMV case. In particular, we construct an example of a whole-line CMV matrix that is not reflectionless in the spectral-theoretic sense, while all of its diagonal spectral measures are reflectionless in the measure-theoretic sense. We will show, however, that this may only happen for a very limited class of CMV matrices. Their existence in the CMV case, together with Remling's Theorem (Theorem 1.4 below), provides for a simple proof of Khrushchev's Theorem (Theorem 1.9 below).

We should remark that ours is not the first paper to deal with right limits of CMV matrices. For other examples and related results, see for instance [15,23].

In order to describe our results, some notation is needed: given a probability measure, $\mu$, on the boundary of the unit disc, $\partial\mathbb{D}$, we let $\{\Phi_n(z)\}_{n=0}^\infty$ and $\{\varphi_n(z)\}_{n=0}^\infty$ denote the monic orthogonal and the orthonormal polynomials one gets by applying the Gram–Schmidt procedure to $1, z, z^2, \dots$ (we assume throughout that the support of $\mu$ is an infinite set so the polynomial sequences are indeed infinite). The $\Phi_n$ satisfy the Szegő recurrence equation:

$$\Phi_{n+1}(z) = z\Phi_n(z) - \bar\alpha_n \Phi_n^*(z), \tag{1.1}$$

where $\{\alpha_n\}_{n=0}^\infty$ is a sequence of parameters satisfying $|\alpha_n| < 1$ and $\Phi_n^*(z) = z^n\,\overline{\Phi_n(1/\bar z)}$. We call $\{\alpha_n\}_{n=0}^\infty$ the Verblunsky coefficients associated with $\mu$.

As is well known [36], the sequence $\{\alpha_n\}_{n=0}^\infty$ may be used to construct a semi-infinite 5-diagonal matrix, $C$, (called the CMV matrix) such that the operator of multiplication by $z$ on $L^2(\partial\mathbb{D}, d\mu)$ is unitarily equivalent to the operator $C$ on $\ell^2(\mathbb{Z}_+)$ ($\mathbb{Z}_+ = \{0, 1, 2, \dots\}$). Explicitly, $C$ is given by

$$
C = \begin{pmatrix}
\bar\alpha_0 & \bar\alpha_1\rho_0 & \rho_1\rho_0 & 0 & 0 & \dots \\
\rho_0 & -\bar\alpha_1\alpha_0 & -\rho_1\alpha_0 & 0 & 0 & \dots \\
0 & \bar\alpha_2\rho_1 & -\bar\alpha_2\alpha_1 & \bar\alpha_3\rho_2 & \rho_3\rho_2 & \dots \\
0 & \rho_2\rho_1 & -\rho_2\alpha_1 & -\bar\alpha_3\alpha_2 & -\rho_3\alpha_2 & \dots \\
0 & 0 & 0 & \bar\alpha_4\rho_3 & -\bar\alpha_4\alpha_3 & \dots \\
\dots & \dots & \dots & \dots & \dots & \dots
\end{pmatrix} \tag{1.2}
$$

with $\rho_n = \left(1 - |\alpha_n|^2\right)^{1/2}$.

Now, for a probability measure $\mu$ on $\partial\mathbb{D}$, let $d\mu(\theta) = w(\theta)\,d\theta + d\mu_{\mathrm{sing}}(\theta)$ be the decomposition into its absolutely continuous and singular parts (with respect to Lebesgue measure). If $C$ is the corresponding CMV matrix, we define the essential support of the absolutely continuous spectrum of $C$ to be the set $\Sigma_{ac}(C) \equiv \{\theta \mid w(\theta) > 0\}$. Clearly, $\Sigma_{ac}(C)$ is only defined up to sets of Lebesgue measure zero, so the symbol and the name should be understood as representing elements in an equivalence class of sets rather than a particular set. For the sake of simplicity we will ignore this point in our discussion.

The first part of this paper deals with proving the analog of Remling's Theorem (Theorem 1.4 in [33]) for CMV matrices and deriving some consequences. In a nutshell, Remling's Theorem for CMV matrices says that for any given CMV matrix, $C$, all of its right limits are reflectionless on $\Sigma_{ac}(C)$ (see Definitions 1.2 and 1.3 and Theorem 1.4 below). Here is a consequence that will also provide the link to Khrushchev's theory (the Jacobi analog was stated and proved in [33]):


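Theorem 1.1. Let $\{\alpha_n\}_{n=0}^\infty$ and $\{\tilde\alpha_n\}_{n=0}^\infty$ be two sequences of Verblunsky coefficients such that (with $\Delta_n$ denoting the difference of $\alpha_n$ and $\tilde\alpha_n$):

(i) $|\alpha_n| < 1$, $|\tilde\alpha_n| < 1$ for all $n$.

(ii) There exist sequences $\{m_j\}_{j=1}^\infty$, $\{n_j\}_{j=1}^\infty$ with $n_j - m_j \to \infty$ so that $\lim_{j\to\infty}\ \sup_{m_j \le n}$ … $0$.

Furthermore, let $C$ and $\tilde C$ denote the CMV matrices of $\{\alpha_n\}_{n=0}^\infty$ and $\{\tilde\alpha_n\}_{n=0}^\infty$ respectively. Then

$$\left|\Sigma_{ac}(C) \cap \Sigma_{ac}(\tilde C)\right| = 0, \tag{1.3}$$

where $|\cdot|$ denotes Lebesgue measure. In particular, if $C$ is associated with a sequence of Verblunsky coefficients satisfying

$$\forall k \ge 1 \quad \lim_{n\to\infty} \alpha_n\alpha_{n+k} = 0, \qquad \limsup_{n\to\infty} |\alpha_n| > 0, \tag{1.4}$$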

then $C$ has purely singular spectrum.

In order to state Remling's Theorem we need some more terminology.

Definition 1.2. Given a sequence of Verblunsky coefficients $\{\alpha_n\}_{n=0}^\infty$, a doubly-infinite sequence of parameters $\{\tilde\alpha_n\}_{n\in\mathbb{Z}}$ with $|\tilde\alpha_n| \le 1$ is called a right limit of $\{\alpha_n\}_{n=0}^\infty$ if there is a sequence of integers $n_j \to \infty$ such that $\forall n \in \mathbb{Z}$, $\tilde\alpha_n = \lim_{j\to\infty} \alpha_{n+n_j}$.

Since a sequence of Verblunsky coefficients is always bounded, by compactness it always has at least one (and perhaps many) right limits. Given a doubly infinite sequence $\{\tilde\alpha_n\}_{n\in\mathbb{Z}}$, one may also define a corresponding unitary matrix on $\ell^2(\mathbb{Z})$, extending the half-line matrices to the left and top (see (3.1) for the precise form). We call such a matrix the corresponding whole-line CMV matrix and denote it by $E$. For this reason, we shall often refer to a doubly infinite sequence of numbers $\{\tilde\alpha_n\}_{n\in\mathbb{Z}}$ with $|\tilde\alpha_n| \le 1$ as a (doubly infinite) sequence of Verblunsky coefficients. If $\{\tilde\alpha_n\}_{n\in\mathbb{Z}}$ is a right limit of $\{\alpha_n\}_{n=0}^\infty$, we refer to the corresponding whole-line CMV matrix as a right limit of the half-line CMV matrix associated with $\{\alpha_n\}_{n=0}^\infty$.

Recall that any probability measure $\mu$ on $\partial\mathbb{D}$ may be naturally associated with a Schur function $f$ (an analytic function on $\mathbb{D}$ satisfying $\sup_{z\in\mathbb{D}} |f(z)| \le 1$) and a Carathéodory function $F$ (an analytic function on $\mathbb{D}$ satisfying $F(0) = 1$ and $\operatorname{Re} F(z) > 0$ on $\mathbb{D}$). This is given by

$$F(z) = \frac{1 + z f(z)}{1 - z f(z)} = \int_0^{2\pi} \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\mu(\theta). \tag{1.5}$$

The correspondence is 1-1 and onto. By a classical result, $\lim_{r\uparrow 1} F(re^{i\theta})$ and $\lim_{r\uparrow 1} f(re^{i\theta})$ exist Lebesgue a.e. on $\partial\mathbb{D}$. We denote them by $F(e^{i\theta})$ and $f(e^{i\theta})$ respectively and, when there is no danger of confusion, simply by $F(z)$ or $f(z)$ for $z \in \partial\mathbb{D}$.

Given a Schur function, $f$, let $f_0 = f$ and define a sequence of Schur functions $f_n$ and parameters $\gamma_n \in \overline{\mathbb{D}}$ by

$$z f_{n+1}(z) = \frac{f_n(z) - \gamma_n}{1 - \bar\gamma_n f_n(z)}, \qquad \gamma_n = f_n(0).$$



If for some $n$, $|\gamma_n| = 1$, we stop and the Schur function is a finite Blaschke product. Otherwise, we continue. It is known [36] that this process (known as the Schur algorithm) sets up a 1-1 correspondence between Schur functions $f$ and parameter sequences $\gamma_n \in \overline{\mathbb{D}}$, so given any sequence $\gamma_n \in \overline{\mathbb{D}}$ there is a unique Schur function $f(z; \gamma_0, \gamma_1, \dots)$ associated to it in the above way. The $\gamma$'s are frequently termed the Schur parameters associated to $f$ (or equivalently, to $\mu$ or $F$). Geronimus's Theorem [8] says that $\gamma_n = \alpha_n$ (the Verblunsky coefficients of $\mu$ appearing above). Finally, note that by definition

$$f_n(z; \alpha_0, \alpha_1, \dots) = f(z; \alpha_n, \alpha_{n+1}, \dots). \tag{1.6}$$

For a doubly infinite sequence of Verblunsky coefficients, $\{\alpha_n\}_{n\in\mathbb{Z}}$ (some of which may lie on $\partial\mathbb{D}$), we define two sequences of Schur functions:

$$f_+(z, n) = f(z; \alpha_n, \alpha_{n+1}, \dots) \quad\text{and}\quad f_-(z, n) = f(z; -\bar\alpha_{n-1}, -\bar\alpha_{n-2}, \dots), \tag{1.7}$$

where as usual, if one of the $\alpha$'s lies in $\partial\mathbb{D}$ then we stop the Schur algorithm at that point and the corresponding Schur function is a finite Blaschke product.

Definition 1.3. Let $\{\alpha_n\}_{n\in\mathbb{Z}}$ be a doubly-infinite sequence of Verblunsky coefficients and let $E$ be the associated whole-line CMV matrix. Given a Borel set $A \subseteq \partial\mathbb{D}$, we will say $E$ is reflectionless on $A$ if for all $n \in \mathbb{Z}$,

$$z f_+(z, n) = \overline{f_-(z, n)}$$

for Lebesgue almost every $z \in A$. By the Schur algorithm, one can easily see that "for all $n \in \mathbb{Z}$" may be replaced with "for some $n \in \mathbb{Z}$."

Remark. The analogous definition for whole-line Jacobi matrices involves a similar relationship between the left and right m-functions.

The following is the CMV version of Remling's Theorem:

Theorem 1.4. (Remling's Theorem for CMV matrices) Let $C$ be a half-line CMV matrix, and let $\Sigma_{ac}(C)$ be the essential support of the absolutely continuous part of the spectral measure. Then every right limit of $C$ is reflectionless on $\Sigma_{ac}(C)$.

Remling's proof in the Jacobi case relies on previous work by Breimesser and Pearson [1,2] concerning convergence of boundary values for Herglotz functions. We will prove Theorem 1.4 using the analogous theory for Schur functions.

For a CMV matrix, $C$, recall that its essential spectrum, $\sigma_{ess}(C)$, is its spectrum with the isolated points removed. The following extension of a celebrated theorem of Rakhmanov is a simple corollary of Theorem 1.4:

Theorem 1.5. Assume $\Sigma_{ac}(C) = \sigma_{ess}(C) = A$, where $\sigma_{ess}(C)$ is the essential spectrum of $C$. Then for any right limit $E$ of $C$, $\sigma(E) = A$ and $E$ is reflectionless on $A$.

Remark. In the case that $A$ is a finite union of intervals, the class of whole-line CMV matrices, $T_A$, described in the above theorem is called the isospectral torus of $A$. Much the same as in the Jacobi case ([33, Sect. 6], [42, Sect. 2]) the torus structure is naturally introduced by considering the gaps $\partial\mathbb{D}\setminus A = \cup_{j=1}^{\ell} (a_j, b_j)$.
The torus is obtained by taking two copies of each interval, "gluing" them at the edges and taking the Cartesian product. It then turns out that each point in this torus corresponds to a unique matrix in $T_A$. More formally, one identifies CMV matrices $E \in T_A$ with $\ell$-tuples $(\hat t_1, \dots, \hat t_\ell)$, $\hat t_j = (t_j, \varepsilon_j)$, where $t_j \in [a_j, b_j]$, $\varepsilon_j \in \{\pm 1\}$, and we identify $(t_j, 1)$ and $(t_j, -1)$ if $t_j = a_j$ or $b_j$. As [28, Thm. 1.4] has a detailed discussion of this identification in a slightly more general context (see also [12] for the case of a single arc spectrum), we simply sketch the argument here.

The key is a circular analogue of Craig's formula [6] for Carathéodory functions, $F$, with $\operatorname{Im} F = 0$ a.e. on $A$, $\operatorname{Re} F = 0$ on $\partial\mathbb{D}\setminus A$, and having no zeros in $\partial\mathbb{D}\setminus A$. By this formula, every such $F$ has a single singularity in each gap (a simple pole in the interior of a gap or a square root singularity at an edge), and is uniquely determined up to a positive multiplicative constant by the locations of these singularities $t_j \in [a_j, b_j]$ (note that $F$ is not necessarily normalized to have $F(0) = 1$, and in general $F(0)$ is not real). Given the $\varepsilon_j$'s, $F$ can be uniquely decomposed into a sum of two Carathéodory functions $F_\pm$ having no common poles, $F_+(0) > 0$, and $F_+ = \overline{F_-}$ a.e. on $A$. This is done by assigning each pole of $F$ to either $F_+$ or $F_-$ according to the sign of the corresponding $\varepsilon_j$'s (the "gluing" comes from the fact that both will have a square root singularity at $t_j$ if it is at an edge). After normalizing $F$ and $F_\pm$ by $F_+(0) = 1$, it follows that every $\ell$-tuple $(\hat t_1, \dots, \hat t_\ell)$ corresponds to a unique pair $F_\pm$. These, in turn, define two Schur functions $f_\pm$ by

$$F_+(z) = \frac{1 + z f_+(z)}{1 - z f_+(z)} \quad\text{and}\quad F_-(z) = \frac{1 + f_-(z)}{1 - f_-(z)}$$

with $z f_+ = \overline{f_-}$ a.e. on $A$, which then determine a CMV matrix $E \in T_A$. Conversely, it is easy to see by reversing the above steps that every element of $T_A$ corresponds to a unique $\ell$-tuple.

If $A = \partial\mathbb{D}$, the isospectral torus is known to consist of a single point: the CMV matrix with Verblunsky coefficients all equal to zero [12]. Thus, one gets Rakhmanov's Theorem [30,31] as a corollary.

Proof of Theorem 1.5. Let $E$ be a right limit of $C$ and $\{\delta_n\}_{n\in\mathbb{Z}}$ be the standard orthonormal basis for $\ell^2(\mathbb{Z})$. For $\psi = \sum_{n\in\mathbb{Z}} 2^{-|n|}\delta_n$, let $d\mu_\psi(\theta) = w_\psi(\theta)\,d\theta + d\mu_{\psi,\mathrm{sing}}$ be the spectral measure of $\psi$ and $E$. Let $\Sigma_{ac}(E) = \{\theta \mid w_\psi(\theta) > 0\}$ (defined, again, up to sets of Lebesgue measure zero). By Theorem 1.4, $A \subseteq \Sigma_{ac}(E)$ (up to a set of Lebesgue measure zero), since the reflectionless condition implies positivity of the real part of the Carathéodory function associated with $d\mu_\psi$. Also, $\sigma(E) \subseteq \sigma_{ess}(C) = A$ by approximate-eigenvector arguments (see for instance [23]). Since obviously $\Sigma_{ac}(E) \subseteq \sigma(E)$, we have equality throughout. The reflectionless condition now follows from Theorem 1.4.

Remark. Using Theorem 1.4 and a bit of work, one can also derive parts of Kotani theory for ergodic CMV matrices (see for instance [37, Sect. 10.11]). Remling also obtains deterministic versions of these results for Jacobi matrices (see [33, Thm's 1.1 and 1.2]). His proofs extend directly to the CMV case we are considering, so we will not pursue this here.

Corresponding to the notion of reflectionless operators, there is also the notion of reflectionless measures:

Definition 1.6. A probability measure $\mu$ on $\partial\mathbb{D}$ is said to be reflectionless on a Borel set $A \subseteq \partial\mathbb{D}$ if the corresponding Carathéodory function $F$ has $\operatorname{Im} F(e^{i\theta}) = 0$ for Lebesgue a.e. $e^{i\theta} \in A$.

Remark. The analogous definition for measures on the real line involves the vanishing of the real part of the Borel (a.k.a. Cauchy or Stieltjes) transform of $\mu$ (see for instance [43]).



Remark. There is also a natural dynamical notion for when an operator is reflectionless. For the relationship between this and spectral theory see [3].

Reflectionless Jacobi matrices and reflectionless measures on $\mathbb{R}$ are related in the following way: given a whole-line Jacobi matrix, $H$, let $\mu_n$ be the spectral measure of $H$ and $\delta_n$ ($\delta_n \in \ell^2(\mathbb{Z})$ is defined by $\delta_n(j) = \delta_{n,j}$ with $\delta_{n,j}$ the Kronecker delta). Then $H$ is reflectionless on $A \subseteq \mathbb{R}$ if and only if $\mu_n$ are reflectionless on $A$ for all $n \in \mathbb{Z}$ (again, see [43]). A fact we would like to emphasize in this paper is that the analogous statement does not hold for CMV matrices.

Example 1.7. Fix $j_0 \in \mathbb{Z}$ and some $0 < |\beta| < 1$, and let $\{\alpha_n\}_{n\in\mathbb{Z}}$ be the sequence of Verblunsky coefficients defined by

$$\alpha_n = \begin{cases} \beta & n = j_0 \\ 0 & \text{otherwise.} \end{cases}$$

Let $E$ be the CMV matrix for these $\alpha$'s. From the Schur algorithm we see

$$f(z; 0, 0, \dots) = 0 \quad\text{and}\quad f(z; \beta, 0, 0, \dots) = \beta, \tag{1.8}$$

so $E$ is not reflectionless anywhere. On the other hand, let $\mu_n$ be the spectral measure of $E$ and $\delta_n$, and let $f(z, n)$ be its corresponding Schur function. It is shown in [13] (see also [18] for the analogous formula in the half-line case) that

$$f(z, n) = f_+(z, n)\, f_-(z, n), \qquad z \in \mathbb{D},\ n \in \mathbb{Z}. \tag{1.9}$$

Thus, for any $n \in \mathbb{Z}$, (1.8) implies $d\mu_n(\theta) = \frac{d\theta}{2\pi}$. In particular, $\mu_n$ is reflectionless on all of $\partial\mathbb{D}$ while $E$ is not reflectionless on any subset of positive Lebesgue measure.
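The computation in Example 1.7 can be checked numerically. The following is a minimal sketch, assuming that all but finitely many parameters vanish so that the Schur algorithm can be run backwards from the zero function; the helper names (`schur_function`, `f_plus`, `f_minus`) and the truncation `depth` are illustrative choices, and the conjugated parameters in `f_minus` follow the convention $f_-(z,n) = f(z; -\bar\alpha_{n-1}, \dots)$, which is itself an assumption about (1.7).

```python
def schur_function(z, params):
    """Evaluate f(z; params[0], params[1], ..., 0, 0, ...).

    Inverts one Schur step, z*f_{n+1} = (f_n - g)/(1 - conj(g)*f_n),
    starting from f = 0 behind the last listed parameter.
    Assumes every parameter lies in the open unit disc.
    """
    f = 0j
    for gamma in reversed(list(params)):
        g = complex(gamma)
        f = (g + z * f) / (1 + g.conjugate() * z * f)
    return f


def alpha(n, j0, beta):
    """Verblunsky coefficients of Example 1.7: beta at n = j0, zero elsewhere."""
    return complex(beta) if n == j0 else 0j


def f_plus(z, n, j0, beta, depth=40):
    # f(z; alpha_n, alpha_{n+1}, ...), truncated after `depth` parameters
    return schur_function(z, [alpha(n + k, j0, beta) for k in range(depth)])


def f_minus(z, n, j0, beta, depth=40):
    # f(z; -conj(alpha_{n-1}), -conj(alpha_{n-2}), ...), same truncation
    return schur_function(
        z, [-alpha(n - 1 - k, j0, beta).conjugate() for k in range(depth)]
    )
```

For every $n$ at least one of the two functions vanishes identically, so the product in (1.9) is zero, while at $n = j_0$ the boundary relation $z f_+ = \overline{f_-}$ fails because $f_+ = \beta$ and $f_- = 0$.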

We will show, however, that this is the only example of such behavior:

Theorem 1.8. Let $E$ be the whole-line CMV matrix corresponding to the sequence $\{\alpha_n\}_{n\in\mathbb{Z}}$, satisfying $\alpha_n \neq 0$ for at least two different $n \in \mathbb{Z}$. Then $E$ is reflectionless on $A \subseteq \partial\mathbb{D}$ if and only if $\mu_n$ is reflectionless on $A$ for all $n$.

The connection between the above result and Khrushchev's theory of weak limits comes from the fact that, together with Example 1.7, Theorem 1.1 provides for a particularly simple proof of the following theorem of Khrushchev.

Theorem 1.9 (Khrushchev [18]). Let $C$ be a CMV matrix with Verblunsky coefficients $\{\alpha_n\}_{n=0}^\infty$ and measure $\mu$, and let $d\mu_n(\theta) = |\varphi_n(e^{i\theta})|^2\, d\mu(\theta)$. Then

$$d\mu_n(\theta) \to \frac{d\theta}{2\pi}$$

weakly if and only if

$$\forall k \ge 1 \quad \lim_{n\to\infty} \alpha_n\alpha_{n+k} = 0.$$

Furthermore, these conditions imply that either $\alpha_n \to 0$ or $\mu$ is purely singular.



This theorem is naturally a part of a larger discussion dealing with weak limits of $\mu_n$. In particular, Khrushchev's theory deals with the cases in which such weak limits exist. We will show that the analysis of these cases becomes simple when performed using right limits. The reason for this is that the $\mu_n$ above are actually the spectral measures of $C$ and $\delta_n$ and, along a proper subsequence, these converge weakly to the corresponding spectral measures of the right limit. Thus, if $\mu_n$ converges weakly to $\nu$ as $n \to \infty$, all of the diagonal measures of any right limit are $\nu$. This leads naturally to

Definition 1.10. We say that a whole-line CMV matrix, $E$, belongs to Khrushchev Class if $\mu_n = \mu_m$ for all $n, m \in \mathbb{Z}$, where $\mu_j$ is the spectral measure of $E$ and $\delta_j$.

By the discussion above,

Proposition 1.11. If $C$ is a CMV matrix such that the sequence $\mu_n$ has a weak limit as $n \to \infty$ then all right limits of $C$ belong to Khrushchev Class.

Thus, Khrushchev theory reduces to the analysis of Khrushchev Class. Since Simon analyzed the analogous Jacobi case [35], we feel the following is fitting:

Definition 1.12. We say that a whole-line Jacobi matrix, $H$, belongs to Simon Class if $\mu_n = \mu_m$ for all $n, m \in \mathbb{Z}$, where $\mu_j$ is the spectral measure of $H$ and $\delta_j$.

The final section of this paper will be devoted to the analysis of these two classes. In particular, we rederive all of the main results of [35] and even extend some of those of [19]. We conclude with an amusing (and easy) fact:

Proposition 1.13. Any $H$ in the Simon Class is either periodic, and so reflectionless on its spectrum, or decomposes into a direct sum of finite (in fact $2 \times 2$) matrices, and so has pure point spectrum of infinite multiplicity. Similarly, any $E$ in the Khrushchev Class that does not belong to the class introduced in Example 1.7 is either reflectionless on its spectrum or has pure point spectrum.

The rest of this paper is structured as follows. Section 2 contains the proof of Theorems 1.4 and 1.1 as well as an application to random perturbations of CMV matrices. Section 3 contains a proof of Theorems 1.8 and 1.9, and Sect. 4 contains our analysis of the operators in the Khrushchev and Simon Classes and their relevance to Khrushchev's theory of weak limits and ratio asymptotics.

2. The Proof of Remling's Theorem for OPUC

Our proof will parallel that of Remling [33] quite closely, so we will content ourselves with presenting the parts that differ significantly, but only sketching those parts that are similar. We will first need some definitions. Let $z \in \mathbb{D}$ and let $S \subset \partial\mathbb{D}$ be a Borel set, and define

$$\omega_z(S) = \int_S \operatorname{Re}\left[\frac{e^{i\theta} + z}{e^{i\theta} - z}\right] \frac{d\theta}{2\pi}.$$

(Here, and numerous times below, we have made use of the standard identification of $\partial\mathbb{D}$ with $[0, 2\pi)$ in that the integration is actually over the set $\{\theta \in [0, 2\pi) : e^{i\theta} \in S\}$. We trust this will not cause any confusion.) If $f : \mathbb{D} \to \mathbb{D}$ is a Schur function, define

$$\omega_{f(e^{i\theta})}(S) = \lim_{r\uparrow 1} \omega_{f(re^{i\theta})}(S).$$



As $z \mapsto \omega_{f(z)}(S)$ is a non-negative harmonic function in $\mathbb{D}$, Fatou's Theorem implies that this limit exists for (Lebesgue) almost every $\theta$. Given Schur functions $f_n(z)$ and $f(z)$, we will say that $f_n$ converges to $f$ in the sense of Pearson if for all Borel sets $A, S \subseteq \partial\mathbb{D}$,

$$\lim_{n\to\infty} \int_A \omega_{f_n(e^{i\theta})}(S)\, \frac{d\theta}{2\pi} = \int_A \omega_{f(e^{i\theta})}(S)\, \frac{d\theta}{2\pi}.$$

(We note here that in [1,2,33] this mode of convergence was called convergence in value distribution. However, since this term had already been used in [26] for a completely different concept, we will use the above name instead.) The next lemma relates this type of convergence to a more standard one:

Lemma 2.1. Let $f$, $f_n$, $n \in \mathbb{N}$, be Schur functions. Then $f_n$ converges to $f$ in the sense of Pearson if and only if $f_n(z)$ converges to $f(z)$ uniformly on compact subsets of $\mathbb{D}$.

Of course, in this case it is well-known that the associated spectral measures then converge weakly as well.

Proof. We simply sketch the proof since the full details may be found in [33]. For the forward implication, we may use compactness to pick a subsequence where $g(z) := \lim_{k\to\infty} f_{n_k}$ exists (uniformly on compact subsets of $\mathbb{D}$) and defines an analytic function. By uniqueness of limits in the sense of Pearson, we then must have $g = f$. For the opposite direction, one may either use spectral averaging (as in [33]), or simply appeal to Lemma 2.4 below.

The basic result behind Theorem 1.4 is the following analog of a result of Breimesser and Pearson [1]:

Theorem 2.2. Let $C$ be a half-line CMV matrix. For all Borel sets $S \subseteq \partial\mathbb{D}$ and $A \subseteq \Sigma_{ac}(C)$ we have

$$\lim_{n\to\infty} \left| \int_A \omega_{f_+(e^{i\theta}, n)}(S)\, \frac{d\theta}{2\pi} - \int_A \omega_{e^{i\theta} f_-(e^{i\theta}, n)}(S^*)\, \frac{d\theta}{2\pi} \right| = 0,$$

where $S^* = \{\bar z : z \in S\}$.

Assuming Theorem 2.2 for a moment, we can prove Theorem 1.4:

Proof of Theorem 1.4. Let $E$ be a right limit of $C$, so there is a sequence $n_j \uparrow \infty$ such that $\lim_{j\to\infty} \alpha_{n+n_j}(C) = \alpha_n(E)$ for the corresponding sequences of Verblunsky coefficients. Thus, if $f_\pm(z)$ are the Schur functions of $E$ defined by (1.7) for $n = 0$, then $f_\pm(z, n_j) \to f_\pm(z)$ as $j \to \infty$ uniformly on compact subsets of $\mathbb{D}$. By Lemma 2.1 and Theorem 2.2 we now have

$$\int_A \omega_{f_+(e^{i\theta})}(S)\, \frac{d\theta}{2\pi} = \int_A \omega_{e^{i\theta} f_-(e^{i\theta})}(S^*)\, \frac{d\theta}{2\pi}$$

for all Borel sets $A \subseteq \Sigma_{ac}(C)$, $S \subseteq \partial\mathbb{D}$. Now Lebesgue's differentiation theorem and the fact that $\omega_z(S^*) = \omega_{\bar z}(S)$ shows $f_+(e^{i\theta}) = e^{-i\theta}\, \overline{f_-(e^{i\theta})}$ almost everywhere on $\Sigma_{ac}(C)$. Thus, $E$ is reflectionless on $\Sigma_{ac}(C)$.
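The symmetry $\omega_z(S^*) = \omega_{\bar z}(S)$ used in the last step is easy to test numerically when $S$ is an arc. The sketch below is only an illustration: it approximates $\omega_z(S)$ by a midpoint rule, and the function name `harmonic_measure`, the arc representation as a pair of angles, and the step count are assumptions made for this example.

```python
import cmath
import math


def harmonic_measure(z, arc, steps=4000):
    """Midpoint-rule approximation of omega_z(S) for the arc
    S = {e^{i theta} : a <= theta <= b}, with arc = (a, b) and |z| < 1."""
    a, b = arc
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        e = cmath.exp(1j * (a + (k + 0.5) * h))
        # Re[(e^{i theta} + z)/(e^{i theta} - z)] is the Poisson kernel at z
        total += ((e + z) / (e - z)).real
    return total * h / (2 * math.pi)
```

For instance, with `S = (0.3, 1.7)` the conjugate arc is `(-1.7, -0.3)`, and `harmonic_measure(z, (-1.7, -0.3))` agrees with `harmonic_measure(z.conjugate(), (0.3, 1.7))` up to rounding; at `z = 0` the function returns the normalized arc length.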



We now turn to the proof of Theorem 2.2. We will need a few preparatory results.

Lemma 2.3. For any Schur function $f(z)$, Borel set $S \subseteq \partial\mathbb{D}$, and $z \in \mathbb{D}$ we have

$$\omega_{f(z)}(S) = \int_0^{2\pi} \omega_{f(e^{i\theta})}(S)\, d\omega_z(e^{i\theta}).$$

In particular, for any Borel set $A \subseteq \partial\mathbb{D}$,

$$\int_A \omega_{f(re^{i\theta})}(S)\, \frac{d\theta}{2\pi} = \int_0^{2\pi} \omega_{f(e^{i\theta})}(S)\, \omega_{re^{i\theta}}(A)\, \frac{d\theta}{2\pi}.$$

Proof. For the first statement, just note that both sides are harmonic functions of $z$ with the same boundary values. The second statement follows by writing

$$d\omega_{re^{i\theta}}(e^{i\varphi}) = \frac{1 - r^2}{1 + r^2 - 2r\cos(\varphi - \theta)}\, \frac{d\varphi}{2\pi}$$

and applying Fubini's theorem.

Lemma 2.4. Let $A \subseteq \partial\mathbb{D}$ be a Borel subset. Then

$$\lim_{r\uparrow 1}\ \sup_{f, S} \left| \int_A \omega_{f(re^{i\theta})}(S)\, \frac{d\theta}{2\pi} - \int_A \omega_{f(e^{i\theta})}(S)\, \frac{d\theta}{2\pi} \right| = 0,$$

where the supremum is taken over all Schur functions $f(z)$ and all Borel sets $S \subseteq \partial\mathbb{D}$.

Proof. This follows from Lemma 2.3 and analyzing (the $f$-independent quantity)

$$\int_A \omega_{re^{i\theta}}(A^c)\, \frac{d\theta}{2\pi}.$$

For more details, see Lemma A.1 in [33] whose proof is nearly identical.

We will need a notion of pseudohyperbolic distance on $\mathbb{D}$. Given $w_1, w_2 \in \mathbb{D}$ define

$$\gamma(w_1, w_2) = \frac{|w_1 - w_2|}{\sqrt{1 - |w_1|^2}\,\sqrt{1 - |w_2|^2}}.$$

This is an increasing function of the hyperbolic distance on $\mathbb{D}$. As such, if $F : \mathbb{D} \to \mathbb{D}$ is analytic, then $\gamma(F(w_1), F(w_2)) \le \gamma(w_1, w_2)$ and if $F$ is an automorphism with respect to hyperbolic distance on $\mathbb{D}$ (written "$F \in \operatorname{Aut}(\mathbb{D})$") then we have equality above. Taking $F(z)$ to be the analytic function whose real part is $\omega_z(S)$, we see that for all $z, \zeta \in \mathbb{D}$ and all Borel sets $S \subseteq \partial\mathbb{D}$,

$$|\omega_z(S) - \omega_\zeta(S)| \le \frac{|\omega_z(S) - \omega_\zeta(S)|}{\sqrt{1 - |\omega_z(S)|^2}\,\sqrt{1 - |\omega_\zeta(S)|^2}} \le \gamma(F(z), F(\zeta)) \le \gamma(z, \zeta). \tag{2.1}$$



Now let $\{\alpha_n\}_{n\in\mathbb{Z}}$ be a sequence of Verblunsky coefficients (some of which may lie on $\partial\mathbb{D}$). Recall the two sequences of Schur functions defined by (1.7):

$$f_+(z, n) = f(z; \alpha_n, \alpha_{n+1}, \dots), \qquad f_-(z, n) = f(z; -\bar\alpha_{n-1}, -\bar\alpha_{n-2}, \dots).$$

Since the Schur algorithm terminates at any $\alpha_k \in \partial\mathbb{D}$, we see that for a half-line sequence of $\alpha$'s (recall $\alpha_{-1} = -1$) we have $f_-(z, n = 0) = -\bar\alpha_{-1} = 1$. Viewing matrix arithmetic projectively (that is, identifying an automorphism of $\mathbb{D}$ with its coefficient matrix, see for instance [33]), the Schur algorithm shows $f_\pm(z, n + 1) = T_\pm(z, \alpha_n) f_\pm(z, n)$, where

$$T_+(z, \alpha) = \begin{pmatrix} 1 & -\alpha \\ -z\bar\alpha & z \end{pmatrix} \quad\text{and}\quad T_-(z, \alpha) = \begin{pmatrix} z & -\bar\alpha \\ -z\alpha & 1 \end{pmatrix}.$$

By elementary manipulations we see that for any z ∈ C, 1 0 1 0 T+ (z, α) = . T (z, α) 0 z − 0 z

(2.2)

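We will let $P_\pm(z, n) = T_\pm(z, \alpha_{n-1}) \cdots T_\pm(z, \alpha_0)$ so that $f_\pm(z, n) = P_\pm(z, n) f_\pm(z, n = 0)$. We have the following mapping properties of $T_\pm(z, \alpha)$:

Lemma 2.5. Let $\alpha \in \mathbb{D}$.
(1) If $z \in \partial\mathbb{D}$, then $T_\pm(z, \alpha) \in \operatorname{Aut}(\mathbb{D})$.
(2) If $z \in \mathbb{D}$, then $T_-(z, \alpha) : \mathbb{D} \to \mathbb{D}$ and $\gamma(T_-(z, \alpha)w_1, T_-(z, \alpha)w_2) \le |z|\,\gamma(w_1, w_2)$ for all $w_1, w_2 \in \mathbb{D}$.

Proof. Let

$$S(\alpha) = \begin{pmatrix} 1 & -\alpha \\ -\bar\alpha & 1 \end{pmatrix} \quad\text{and}\quad M(z) = \begin{pmatrix} z & 0 \\ 0 & 1 \end{pmatrix}$$

so that $T_+(z, \alpha) = M(z^{-1})S(\alpha)$ and $T_-(z, \alpha) = S(\bar\alpha)M(z)$. Because $\alpha \in \mathbb{D}$ we have $S(\alpha) \in \operatorname{Aut}(\mathbb{D})$. If $z \in \partial\mathbb{D}$ then $M(z) \in \operatorname{Aut}(\mathbb{D})$ as well, while a straightforward calculation shows that if $z \in \mathbb{D}$ then $\gamma(M(z)w_1, M(z)w_2) \le |z|\,\gamma(w_1, w_2)$. This proves (1) and (2).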
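The two facts behind Lemma 2.5, invariance of $\gamma$ under disc automorphisms and contraction by the factor $|z|$ under multiplication by $z$, can be illustrated with a few lines of code. This is a sketch under stated assumptions: `pseudo_dist` implements $\gamma$ with the square roots in the denominator, `S` and `M` act as Möbius maps, and `T_minus` composes them with a conjugated parameter. That conjugation is an assumed reading of the matrix $T_-$ and does not affect either inequality.

```python
import math


def pseudo_dist(w1, w2):
    """gamma(w1, w2) = |w1 - w2| / (sqrt(1 - |w1|^2) * sqrt(1 - |w2|^2))."""
    return abs(w1 - w2) / math.sqrt((1 - abs(w1) ** 2) * (1 - abs(w2) ** 2))


def S(alpha, w):
    """Mobius action of the matrix [[1, -alpha], [-conj(alpha), 1]] on w."""
    return (w - alpha) / (1 - alpha.conjugate() * w)


def M(z, w):
    """Mobius action of the matrix [[z, 0], [0, 1]]: multiplication by z."""
    return z * w


def T_minus(z, alpha, w):
    # Assumed form: a disc automorphism applied after multiplication by z.
    return S(alpha.conjugate(), M(z, w))
```

With `z` inside the disc, `pseudo_dist(T_minus(z, a, w1), T_minus(z, a, w2))` is at most `abs(z) * pseudo_dist(w1, w2)`; with `abs(z) == 1` the two distances coincide up to rounding.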



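With these preliminaries in hand we are ready for the proof of Theorem 2.2. We emphasize again that we are following the proof of Theorem 3.1 from [33].

Proof of Theorem 2.2. Subdivide $A = A_0 \cup A_1 \cup \dots \cup A_N$ in such a way that
1. $|A_0| < \varepsilon$.
2. On $\cup_{k=1}^N A_k$, $\lim_{r\uparrow 1} f_+(re^{i\theta}, 0)$ exists and lies in $\mathbb{D}$.
3. For each $1 \le k \le N$, there is a point $m_k \in \mathbb{D}$ such that $\gamma\left(f_+(e^{i\theta}, 0), m_k\right) < \varepsilon$ for all $e^{i\theta} \in A_k$.

The construction of such a decomposition is identical to that given in [33], so we do not review it here. To deal with $A_0$, we note that for any $z \in \mathbb{D}$ and any Borel set $S \subseteq \partial\mathbb{D}$, we have $|\omega_z(S)| \le 1$. Thus

$$\left| \int_{A_0} \omega_{f_+(e^{i\theta}, n)}(S)\, \frac{d\theta}{2\pi} - \int_{A_0} \omega_{e^{i\theta} f_-(e^{i\theta}, n)}(S^*)\, \frac{d\theta}{2\pi} \right| < 2\varepsilon.$$

Now we consider $A_1, \dots, A_N$. Notice that if $e^{i\theta} \in \cup_{k=1}^N A_k$, then for all $n \in \mathbb{N}$ we also have that $\lim_{r\uparrow 1} f_+(re^{i\theta}, n)$ exists and lies in $\mathbb{D}$. As $P_+(e^{i\theta}, n) \in \operatorname{Aut}(\mathbb{D})$ we see $\gamma\left(f_+(e^{i\theta}, n), P_+(e^{i\theta}, n)m_k\right) < \varepsilon$ for all $e^{i\theta} \in A_k$ and all $n \in \mathbb{N}$. Using (2.1) and integrating we find

$$\left| \int_{A_k} \omega_{f_+(e^{i\theta}, n)}(S)\, \frac{d\theta}{2\pi} - \int_{A_k} \omega_{P_+(e^{i\theta}, n)m_k}(S)\, \frac{d\theta}{2\pi} \right| < \varepsilon|A_k|.$$

By (2.2) and the fact that $\omega_z(S^*) = \omega_{\bar z}(S)$, we can rewrite this as

$$\left| \int_{A_k} \omega_{f_+(e^{i\theta}, n)}(S)\, \frac{d\theta}{2\pi} - \int_{A_k} \omega_{e^{i\theta} P_-(e^{i\theta}, n)(e^{-i\theta} m_k)}(S^*)\, \frac{d\theta}{2\pi} \right| < \varepsilon|A_k| \tag{2.3}$$

(and notice that because $T_-(z, \alpha)$ factors as a disc automorphism composed with $M(z)$, we have that $z P_-(z, n)(z^{-1} m_k)$ is indeed a Schur function). By Lemma 2.5 there is an $n_0 \in \mathbb{N}$ so that for all $n \ge n_0$, $\gamma\left(z P_-(z, n)(z^{-1} w_k), z f_-(z, n)\right) < \varepsilon$. As before, using (2.1) and integrating shows

$$\left| \int_{A_k} \omega_{z P_-(z, n)(z^{-1} w_k)}(S^*)\, \frac{d\theta}{2\pi} - \int_{A_k} \omega_{z f_-(z, n)}(S^*)\, \frac{d\theta}{2\pi} \right| < \varepsilon|A_k|. \tag{2.4}$$

Now use Lemma 2.4 to find an $r < 1$ so that

$$\left| \int_{A_k} \omega_{f(e^{i\theta})}(S)\, \frac{d\theta}{2\pi} - \int_{A_k} \omega_{f(re^{i\theta})}(S)\, \frac{d\theta}{2\pi} \right| < \varepsilon|A_k|$$

for all Schur functions $f(z)$, all Borel sets $S \subseteq \partial\mathbb{D}$, and $k = 1, \dots, N$. Applying this to (2.3) and (2.4) shows

$$\left| \int_{A_k} \omega_{f_+(e^{i\theta}, n)}(S)\, \frac{d\theta}{2\pi} - \int_{A_k} \omega_{e^{i\theta} f_-(e^{i\theta}, n)}(S^*)\, \frac{d\theta}{2\pi} \right| < 4\varepsilon|A_k|.$$

Now summing in $k$ shows

$$\left| \int_A \omega_{f_+(e^{i\theta}, n)}(S)\, \frac{d\theta}{2\pi} - \int_A \omega_{e^{i\theta} f_-(e^{i\theta}, n)}(S^*)\, \frac{d\theta}{2\pi} \right| < 4\varepsilon|A| + 2\varepsilon$$

for all $n \ge n_0$.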

Next, we illustrate Theorem 1.4 by a simple example of constant coefficients CMV matrices:

Example 2.6. Let $C$ be the half-line CMV matrix associated with the constant Verblunsky coefficients $\alpha_n = a$, $n \ge 0$, for some $a \in (0, 1)$. It follows from the Schur algorithm that the corresponding Schur function $f_a$ satisfies the quadratic equation

$$a z f_a(z)^2 + (1 - z) f_a(z) - a = 0,$$

and hence is given by

$$f_a(z) = \frac{-(1 - z) + \sqrt{(1 - z)^2 + 4a^2 z}}{2az}, \qquad z \in \mathbb{D},$$

where the square root is defined so that $\sqrt{e^{i\theta}} = e^{i\theta/2}$ for $\theta \in (-\pi, \pi)$. Using the Carathéodory function $F_a(z) = \frac{1 + z f_a(z)}{1 - z f_a(z)}$ we compute

$$\Sigma_{ac}(C) = \{e^{i\theta} : \operatorname{Re} F_a(e^{i\theta}) > 0\} = \{e^{i\theta} : |f_a(e^{i\theta})| < 1\} = \{e^{i\theta} : 2\arcsin(a) < \theta < 2\pi - 2\arcsin(a)\}.$$

The half-line CMV matrix $C$ has exactly one right limit $E$ which is the whole-line CMV matrix associated with the constant coefficients $\alpha_n = a$, $n \in \mathbb{Z}$. It follows from (1.7) that the two Schur functions for $E$ are given by $f_+(z, n) = f_a(z)$ and $f_-(z, n) = f_{-a}(z) = -f_a(z)$, $n \in \mathbb{Z}$. Since for all $e^{i\theta} \in \Sigma_{ac}(C)$,

$$f_a(e^{i\theta}) = \frac{i\sin(\theta/2)}{a e^{i\theta/2}} \left(1 - \sqrt{1 - \frac{a^2}{\sin^2(\theta/2)}}\right)$$

and the expression under the square root is positive, one easily verifies the reflectionless property of $E$ on $\Sigma_{ac}(C)$,

$$e^{i\theta} f_+(e^{i\theta}, n) = e^{i\theta} f_a(e^{i\theta}) = -\overline{f_a(e^{i\theta})} = \overline{f_-(e^{i\theta}, n)}, \qquad e^{i\theta} \in \Sigma_{ac}(C),$$

thus confirming the claim of Theorem 1.4.

Note that adding a decaying perturbation to the Verblunsky coefficients of $C$ does not change the uniqueness of the right limit, nor does it change the limiting operator. Moreover, if the decay is sufficiently fast (e.g. $\ell^1$), $\Sigma_{ac}(C)$ does not change either.

The following is one of the reasons reflectionless operators are so useful:

Lemma 2.7. Let $\{\alpha_n\}_{n\in\mathbb{Z}}$, $\{\beta_n\}_{n\in\mathbb{Z}}$ be two sequences of Verblunsky coefficients such that their corresponding whole-line CMV matrices are both reflectionless on some common set $A$ with $|A| > 0$. If $\alpha_n = \beta_n$ for all $n < 0$, then $\alpha_n = \beta_n$ for all $n$.
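Returning to Example 2.6, the boundary-value formula for $f_a$ can be checked directly against the quadratic equation and the reflectionless relation. The sketch below assumes the closed form $f_a(e^{i\theta}) = \frac{i\sin(\theta/2)}{a e^{i\theta/2}}\bigl(1 - \sqrt{1 - a^2/\sin^2(\theta/2)}\bigr)$ on the arc $2\arcsin(a) < \theta < 2\pi - 2\arcsin(a)$; the function names `f_a` and `ac_arc` are illustrative.

```python
import cmath
import math


def ac_arc(a):
    """Endpoints (in theta) of the arc 2*arcsin(a) < theta < 2*pi - 2*arcsin(a)."""
    return 2 * math.asin(abs(a)), 2 * math.pi - 2 * math.asin(abs(a))


def f_a(theta, a):
    """Boundary value f_a(e^{i theta}) on the arc, where sin(theta/2) > |a|.

    Passing -a gives f_{-a} = -f_a, the function playing the role of f_-.
    """
    s = math.sin(theta / 2)
    root = math.sqrt(1 - (a / s) ** 2)  # real and positive on the arc
    return 1j * s / (a * cmath.exp(1j * theta / 2)) * (1 - root)
```

For a sample point on the arc, `z = cmath.exp(1j * theta)` and `f = f_a(theta, a)` satisfy `a*z*f*f + (1 - z)*f - a` close to zero, `abs(f) < 1`, and `z * f` equal to the complex conjugate of `f_a(theta, -a)` up to rounding.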



Proof. By the Schur algorithm, $\{\alpha_n\}_{n}$ … $n)$ with Dirichlet boundary conditions, and $J_n^- = J\left(\{a_{n-j}, b_{n+1-j}\}_{j=1}^\infty\right)$, the half-line Jacobi matrix one gets when restricting $H$ to $\ell^2(j \le n)$ with Dirichlet boundary conditions. $J_n^+$ and $J_n^-$ have spectral measures associated with them which we denote by $\mu_n^+$ and $\mu_n^-$. Finally, for $z \in \mathbb{C}\setminus\mathbb{R}$, let

$$m_\pm(n; z) = \int \frac{d\mu_n^\pm(x)}{x - z}$$

be the corresponding Borel-Stieltjes transforms. We are interested in $H$ for which these are constants in $n$. The reason for this is the fact that if $H$ is a right limit of $J$, then $-\frac{1}{m_-(0; z)}$ is a limit of $\frac{P_{n+1}(z)}{P_n(z)}$ along an appropriate subsequence (see e.g. [33]; note that his $m_-$ is our $-1/m_-$). Thus,

Theorem 4.5. Let $H(\{a_n, b_n\}_{n\in\mathbb{Z}})$ be a whole-line Jacobi matrix. Then the following are equivalent:
(i) $H$ belongs to Simon Class and its spectrum is a single interval.
(ii) $a_n = a$, $b_n = b$ for some numbers, $a \ge 0$ and $b \in \mathbb{R}$ and all $n \in \mathbb{Z}$.
(iii) $m_-(n; z) = m_-(n + 1; z)$ for all $z \in \mathbb{C}\setminus\mathbb{R}$, $n \in \mathbb{Z}$.
(iv) $m_+(n; z) = m_+(n + 1; z)$ for all $z \in \mathbb{C}\setminus\mathbb{R}$, $n \in \mathbb{Z}$.
(v) $m_-(n; z) = m_-(n + 1; z)$ for some $z \in \mathbb{C}\setminus\mathbb{R}$ and all $n \in \mathbb{Z}$.
(vi) $m_+(n; z) = m_+(n + 1; z)$ for some $z \in \mathbb{C}\setminus\mathbb{R}$ and all $n \in \mathbb{Z}$.

Proof. (i) ⇔ (ii) follows from the theory of periodic Jacobi matrices (see [43, Sect. 7.4]). (ii) ⇒ (iii) ⇒ (v) and (ii) ⇒ (iv) ⇒ (vi) are clear by periodicity. Thus we are left with showing (v) ⇒ (ii) and (vi) ⇒ (ii). Writing down the continued fraction expansion for $m_-(n; z)$:

$$-\frac{1}{m_-(n + 1; z)} = z - b_{n+1} + a_n^2\, m_-(n; z), \qquad n \in \mathbb{Z},$$

one sees that (v) implies $m_-(n; z)$ satisfies a quadratic equation. $a_n$ and $b_{n+1}$ are then determined from this equation by taking imaginary and real parts, and so we get (v) ⇒ (ii) (see the proof of Theorem 2.2 in [35] for details). The same can be done for $m_+(n; z)$ to get (vi) ⇒ (ii).

By the above discussion and Theorem 4.5, it follows that $\mu$ is ratio asymptotic if and only if its Jacobi matrix has a unique right limit in Simon Class with constant off-diagonal elements. Moreover, (v) in Theorem 4.5 implies that it is enough to require ratio asymptotics at a single $z \in \mathbb{C}\setminus\mathbb{R}$. This is precisely the content of Theorem 1 in [35]. We shall show below that the same strategy can be applied in the OPUC case in order to get a strengthening of corresponding results by Khrushchev.
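The quadratic equation mentioned in the proof of Theorem 4.5 can be made concrete in the constant-coefficient case. If $m_-(n; z) = m$ does not depend on $n$ and $a_n = a$, $b_n = b$, the continued fraction relation becomes $a^2 m^2 + (z - b)m + 1 = 0$. The sketch below solves this and picks the root with positive imaginary part for $\operatorname{Im} z > 0$, as befits a Borel-Stieltjes transform; the name `constant_m` is hypothetical, and the code assumes $a > 0$ and $\operatorname{Im} z > 0$.

```python
import cmath


def constant_m(z, a, b):
    """Solution m of -1/m = z - b + a^2 * m with Im m > 0.

    Assumes a > 0 and Im z > 0. The two roots of
    a^2 m^2 + (z - b) m + 1 = 0 have product 1/a^2 > 0,
    so exactly one of them lies in the upper half-plane.
    """
    disc = cmath.sqrt((z - b) ** 2 - 4 * a ** 2)
    roots = [(-(z - b) + sign * disc) / (2 * a ** 2) for sign in (1, -1)]
    return max(roots, key=lambda m: m.imag)
```

Given such an `m` at a single non-real `z`, the real and imaginary parts of the relation `-1/m = z - b + a**2 * m` are two real equations that recover `b` and `a**2`, which is the mechanism behind (v) ⇒ (ii).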



4.2. The Khrushchev Class. We now turn to the discussion of the analogous theory for half-line CMV matrices. Namely, we study CMV matrices with the property that $d\mu_n(\theta) = |\varphi_n(e^{i\theta})|^2 d\mu(\theta)$ has a weak limit as $n \to \infty$. Again, as is clear from the discussion in the Introduction, all these right limits belong to Khrushchev Class (recall Definition 1.10) and so the analysis is mainly the analysis of properties of that class. Since nontrivial CMV matrices can have many powers with zero diagonal, the computations are substantially more complicated. Here is the analog of Theorem 4.1:

Theorem 4.6. Let $\mathcal{E}$ be a whole-line CMV matrix and $k \in \mathbb{N} \cup \{\infty\}$. Then the following are equivalent:

(i) $\mathcal{E}$ belongs to Khrushchev Class with $[\mathcal{E}^\ell]_{n,n} = 0$ for $\ell = 1, \dots, k-1$ and all $n \in \mathbb{Z}$, and in the case $k < \infty$, $[\mathcal{E}^k]_{n,n} = c$ for some $c \in \mathbb{D} \setminus \{0\}$ and all $n \in \mathbb{Z}$.

(ii) For $\ell = 1, \dots, k-1$,

$$\int_0^{2\pi} e^{i\ell\theta}\, d\mu_n(\theta) = 0, \quad n \in \mathbb{Z},$$

and if $k < \infty$ then additionally, for some $c \in \mathbb{D} \setminus \{0\}$,

$$\int_0^{2\pi} e^{ik\theta}\, d\mu_n(\theta) = c, \quad n \in \mathbb{Z}.$$

(iii) There exist $n_0 \in \mathbb{N}$, $a, b \in (0, 1]$, and $t \in [0, 2\pi)$ such that in the case $k < \infty$,

$$|\alpha_{n_0+2nk}| = a, \quad |\alpha_{n_0+(2n+1)k}| = b, \quad \alpha_{n_0+nk+j} = 0, \quad \arg(\bar\alpha_{n_0+(n+1)k}\,\alpha_{n_0+nk}) = t, \quad n \in \mathbb{Z},\ j = 1, \dots, k-1,$$

and in the case $k = \infty$,

$$\alpha_j = 0, \quad j \in \mathbb{Z} \setminus \{n_0\}.$$

Remark. In particular, this shows that the constancy of the first $k$ moments, where the $k$th moment is the first nonzero one, implies that $\mathcal{E}$ belongs to Khrushchev Class. Note, however, that the value of the $k$th moment does not determine the element of the class itself (again, not even up to translation; see Theorem 4.8 below). Thus, it makes sense to define $\mathcal{K}(c, k)$, for $k < \infty$, to be the set of all matrices in the Khrushchev Class with $[\mathcal{E}^\ell]_{n,n} = 0$ for all $n \in \mathbb{Z}$, $\ell = 1, \dots, k-1$, and $[\mathcal{E}^k]_{n,n} = c \neq 0$ for all $n \in \mathbb{Z}$. In the case $k = \infty$, let $\mathcal{K}(\infty)$ be the set of all matrices with $[\mathcal{E}^\ell]_{n,n} = 0$ for all $n \in \mathbb{Z}$, $\ell \ge 1$. We note that every CMV matrix $\mathcal{E}$ from the Khrushchev Class belongs to one of $\mathcal{K}(c, k)$, $c \in \mathbb{D} \setminus \{0\}$, $k \in \mathbb{N}$, or to $\mathcal{K}(\infty)$.

Proof. (i) ⇒ (ii): Follows from

$$\int_0^{2\pi} e^{i\ell\theta}\, d\mu_n(\theta) = [\mathcal{E}^\ell]_{n,n} \quad \text{for all } \ell \in \mathbb{N},\ n \in \mathbb{Z}. \tag{4.7}$$

(ii) ⇒ (iii): First, observe that (iii) is equivalent to the following López-type condition: there exists $n_0 \in \mathbb{Z}$ such that for all $n \in \mathbb{Z}$, $\ell = 1, \dots, k$, $j = 0, \dots, \ell - 1$,

$$\bar\alpha_{n_0+n\ell+j}\,\alpha_{n_0+(n-1)\ell+j} = \begin{cases} ab\,e^{it}, & j = 0,\ \ell = k, \\ 0, & \text{otherwise}. \end{cases} \tag{4.8}$$



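We will show that (4.8) holds with $ab\,e^{it} = -c$ by verifying inductively with respect to $\ell$ that

$$-\bar\alpha_{n_0+n\ell+j}\,\alpha_{n_0+(n-1)\ell+j} = \begin{cases} \displaystyle\int_0^{2\pi} e^{i\ell\theta}\, d\mu_{n_0+n\ell}(\theta), & j = 0, \\ 0, & \text{otherwise}, \end{cases} \tag{4.9}$$

for some $n_0 \in \mathbb{Z}$ and all $n \in \mathbb{Z}$, $\ell = 1, \dots, k$, $j = 0, \dots, \ell - 1$. The case $\ell = 1$ trivially follows from (3.1) since the first moment of $\mu_n$ is exactly the diagonal element $\mathcal{E}_{n,n}$ for all $n \in \mathbb{Z}$.

Now suppose (4.9) holds for $\ell = 1, \dots, p-1$ for some $p \le k$. In view of (4.7), we want to compute $[\mathcal{E}^p]_{n,n} = [(\mathcal{L}\mathcal{M})^p]_{n,n}$. To do this, it turns out to be useful to separate the diagonal and off-diagonal elements of $\mathcal{L}$ and $\mathcal{M}$ and identify the contributions to the product. Thus, let the diagonal matrices $X_{-1} = \operatorname{diag}\mathcal{L}$ and $X_1 = \operatorname{diag}\mathcal{M}$ be the diagonals of $\mathcal{L}$ and $\mathcal{M}$ respectively. Furthermore, define $Y_{-1}$ and $Y_1$ through $\mathcal{L} = X_{-1} + Y_{-1}$, $\mathcal{M} = X_1 + Y_1$. Expressed in this notation, our objective is to compute the diagonal elements of

$$\mathcal{E}^p = (X_{-1} + Y_{-1})\,(X_{(-1)^2} + Y_{(-1)^2}) \cdots (X_{(-1)^{2p}} + Y_{(-1)^{2p}}). \tag{4.10}$$

First, it is a direct computation to verify that for any two $s, r \in \mathbb{N}$,

$$\operatorname{diag}\bigl(Y_{(-1)^s}\, Y_{(-1)^{s+1}} \cdots Y_{(-1)^{s+r}}\bigr) = 0. \tag{4.11}$$

Now, using (ii) and the induction hypothesis ((4.9) for $\ell \le p-1$) one verifies that

$$[Y_{(-1)^{j-s}}\, Y_{(-1)^{j-s+1}} \cdots Y_{(-1)^{j-1}}\, X_{(-1)^j}\, Y_{(-1)^{j+1}} \cdots Y_{(-1)^{j+s-1}}\, Y_{(-1)^{j+s}}]_{n,n}
= \begin{cases} \bar\alpha_{n+s}\,\rho_n^2 \cdots \rho_{n+s-1}^2, & n+s+j \text{ is odd}, \\ -\alpha_{n-s-1}\,\rho_{n-s}^2 \cdots \rho_{n-1}^2, & n+s+j \text{ is even} \end{cases}
= \begin{cases} \bar\alpha_{n+s}, & n+s+j \text{ is odd}, \\ -\alpha_{n-s-1}, & n+s+j \text{ is even}, \end{cases} \tag{4.12}$$

for $n, j \in \mathbb{Z}$, $s = 0, \dots, p-1$, and

$$[Y_{(-1)^{j-s}}\, Y_{(-1)^{j-s+1}} \cdots Y_{(-1)^{j-1}}\, X_{(-1)^j}\, Y_{(-1)^{j+1}} \cdots Y_{(-1)^{j+s-1}}\, Y_{(-1)^{j+s}}]_{n,m} = 0 \tag{4.13}$$

whenever $n \neq m$. This identity combined with the induction hypothesis, (ii), (4.7), (4.10), and (4.11) implies that (for notational simplicity we let $\tilde Y_j = Y_{(-1)^j}$, $\tilde X_j = X_{(-1)^j}$)

$$\int_0^{2\pi} e^{ip\theta}\, d\mu_n(\theta) = [\mathcal{E}^p]_{n,n}
= \sum_{\ell=1}^{p} [\tilde Y_1 \cdots \tilde Y_{\ell-1}\, \tilde X_\ell\, \tilde Y_{\ell+1} \cdots \tilde Y_{p+\ell-1}\, \tilde X_{p+\ell}\, \tilde Y_{p+\ell+1} \cdots \tilde Y_{2p}]_{n,n}$$

$$= -\sum_{\ell=1}^{p} \begin{cases} \bar\alpha_{n+p-\ell}\,\alpha_{n-\ell}, & n \text{ is odd}, \\ \bar\alpha_{n+\ell-1}\,\alpha_{n-p+\ell-1}, & n \text{ is even} \end{cases}
= -\sum_{\ell=1}^{p} \bar\alpha_{n+p-\ell}\,\alpha_{n-\ell}, \quad n \in \mathbb{Z}.$$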



The idea behind the computation is that all summands containing no $X$, a single $X$, or two $X$'s that are a distance greater than or less than $p$ apart do not contribute to the diagonal. This follows from the induction hypothesis and (4.11)–(4.13).

Now, observe that the sum in the above equality may have at most one nonzero term. Indeed, if there are no nonzero terms we are done (this may happen only if $p < k$); otherwise let $n_0 \in \mathbb{Z}$ be such that $\bar\alpha_{n_0}\alpha_{n_0-p} \neq 0$. Then combining the induction hypothesis (4.9) with (ii) yields

$$\bar\alpha_{n+\ell}\,\alpha_n = 0, \quad n \in \mathbb{Z},\ \ell = 1, \dots, p-1,$$

which together with $\bar\alpha_{n_0}\alpha_{n_0-p} \neq 0$ implies

$$\alpha_{n_0+np+\ell} = 0, \quad n \in \mathbb{Z},\ \ell = 1, \dots, p-1.$$

Hence, when $p < k$,

$$0 = \int_0^{2\pi} e^{ip\theta}\, d\mu_{n_0+np}(\theta) = -\bar\alpha_{n_0+np}\,\alpha_{n_0+(n-1)p}, \quad n \in \mathbb{Z},$$

and carrying the induction up to $k$,

$$c = \int_0^{2\pi} e^{ik\theta}\, d\mu_{n_0+nk}(\theta) = -\bar\alpha_{n_0+nk}\,\alpha_{n_0+(n-1)k}, \quad n \in \mathbb{Z}.$$

Thus, (4.9) holds for $\ell = k$, and hence (iii) follows from (4.8) with $ab\,e^{it} = -c$.

(iii) ⇒ (i): First, note that if $k = \infty$ then there is at most one nonzero Verblunsky coefficient and hence by Example 1.7, $\mathcal{E}$ is in the Khrushchev Class with $d\mu_n(\theta) = \frac{d\theta}{2\pi}$ for all $n \in \mathbb{Z}$, and hence it follows from (4.7) that $[\mathcal{E}^\ell]_{n,n} = 0$ for all $\ell \in \mathbb{N}$ and $n \in \mathbb{Z}$.

Next suppose $k < \infty$. It follows from (iii) that there are $t_0, t \in [0, 2\pi)$ such that $\alpha_{n_0+n} = |\alpha_{n_0+n}|\,e^{i(t_0+tn)}$ for all $n \in \mathbb{Z}$. Then, using the Schur algorithm, one finds the following relations between the functions $f_\pm$ associated with $\alpha = \{\alpha_n\}_{n \in \mathbb{Z}}$ and $|\alpha| = \{|\alpha_n|\}_{n \in \mathbb{Z}}$, respectively,

$$f_+(z, n_0+n; \alpha) = e^{i(t_0+tn)}\, f_+(e^{-it}z, n_0+n; |\alpha|),$$
$$f_-(z, n_0+n; \alpha) = e^{-i(t_0+t(n-1))}\, f_-(e^{-it}z, n_0+n; |\alpha|), \quad n \in \mathbb{Z},\ z \in \mathbb{D}.$$

Hence by (1.9) the diagonal Schur functions associated with $\alpha$ and $|\alpha|$ are related by

$$f(z, n_0+n; \alpha) = e^{it}\, f(e^{-it}z, n_0+n; |\alpha|), \quad n \in \mathbb{Z},\ z \in \mathbb{D}. \tag{4.14}$$

Now, the conditions in (iii) imply

$$f_+(z, n_0+nk+j; |\alpha|) = z^{k-j}\, f_+(z, n_0+(n+1)k; |\alpha|),$$
$$f_-(z, n_0+nk+j; |\alpha|) = z^{j-1}\, f_-(z, n_0+nk+1; |\alpha|),$$
$$f_+(z, n_0+nk; |\alpha|) = f_+(z, n_0+(n \bmod 2)k; |\alpha|),$$
$$f_-(z, n_0+nk+1; |\alpha|) = -f_+(z, n_0+nk; |\alpha|), \quad n \in \mathbb{Z},\ j = 1, \dots, k,\ z \in \mathbb{D}.$$



These identities together with (1.9) and (4.14) yield

$$f(z, n_0+nk+j; \alpha) = e^{-it(k-2)}\, z^{k-1}\, f_+(e^{-it}z, n_0+(n+1)k; |\alpha|)\, f_-(e^{-it}z, n_0+nk+1; |\alpha|)$$
$$= -e^{-it(k-2)}\, z^{k-1}\, f_+(e^{-it}z, n_0+(n+1)k; |\alpha|)\, f_+(e^{-it}z, n_0+nk; |\alpha|)$$
$$= -e^{-it(k-2)}\, z^{k-1}\, f_+(e^{-it}z, n_0+k; |\alpha|)\, f_+(e^{-it}z, n_0; |\alpha|)$$

for all $n \in \mathbb{Z}$, $j = 1, \dots, k$, $z \in \mathbb{D}$. Hence $f(\cdot, n; \alpha) = f(\cdot, m; \alpha)$, which is equivalent to $\mu_m = \mu_n$ for all $m, n \in \mathbb{Z}$. The presence of the factor $z^{k-1}$ implies that the first $k-1$ moments of $\mu_n$, $n \in \mathbb{Z}$, are zero. This follows from the relationship (1.5) between Schur functions and Carathéodory functions and the fact that Taylor coefficients of $F$ are twice the complex conjugates of the moments of $\mu$. Moreover, since $z^{-k+1} f(z, n_0+nk+j; \alpha)$ is nonzero at the origin, the $k$th moment of $\mu_n$, $n \in \mathbb{Z}$, is nonzero, and hence one gets (i) from (4.7).

Corollary 4.7. Let $\mathcal{C}$ be a half-line CMV matrix and let $\mu$ be its spectral measure. For $n \ge 0$, let $d\mu_n(\theta) = |\varphi_n(e^{i\theta})|^2 d\mu(\theta)$ be the spectral measure of $\mathcal{C}$ and $\delta_n$. If for some $c \in \mathbb{D} \setminus \{0\}$ and $k \in \mathbb{N} \cup \{\infty\}$,

$$\lim_{n \to \infty} \int_0^{2\pi} e^{i\ell\theta}\, d\mu_n(\theta) = \begin{cases} 0, & \ell = 1, \dots, k-1, \\ c, & \ell = k,\ k < \infty, \end{cases} \tag{4.15}$$

then all right limits of $\mathcal{C}$ are in $\mathcal{K}(c, k)$ if $k < \infty$ or in $\mathcal{K}(\infty)$ if $k = \infty$.

The analogy with Corollary 4.2 should be clear. Corollary 4.7 is a variant of Theorem E in [19] with weaker assumptions and weaker conclusions. Our proof is new and based on a completely different approach. We also note that much the same as in the Jacobi case, convergence of the first $k$ moments does not imply weak convergence, but by Corollary 4.7 it does imply the same weak form of weak convergence: convergence holds along any subsequence on which $\mathcal{C}$ has a right limit.

A notable difference between the OPUC and OPRL cases is the fact that multiplication of the Verblunsky coefficients by a constant phase does not change the spectral measures. Thus, even when $\mu_n$ converges weakly, it is not possible to deduce uniqueness of a right limit (even up to a shift). Note that the phase ambiguity is equivalent to a choice of $t_0$ in the proof of Theorem 4.6 and there is no way to determine this $t_0$ from information on $\mu_n$ alone. In the case $k = \infty$ even $|\alpha_{n_0}|$ cannot be determined from the information on the measure and so the indeterminacy is, in a sense, even more severe.

That said, as in the Jacobi case, it is clear that when $k < \infty$ any condition forcing weak convergence of $\mu_n$ (in addition to those in Corollary 4.7) is equivalent to a condition that distinguishes an element of $\mathcal{K}(c, k)$ (up to a shift and multiplication by an arbitrary phase). In particular, a somewhat tedious computation (along the lines of the argument in (ii) ⇒ (iii) above) shows that the following result holds:

Theorem 4.8. Suppose that $k < \infty$, (4.15) holds, and $\lim_{n \to \infty} \int_0^{2\pi} e^{2ik\theta}\, d\mu_n(\theta)$ exists. Then $d\mu_n$ converge weakly and $\mathcal{C}$ has a unique right limit (up to a shift and multiplication by a constant phase) in $\mathcal{K}(c, k)$.
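The pattern in Theorem 4.6 (iii) can be illustrated on a finite section. The sketch below is an illustration only, under stated assumptions: it uses the half-line factorization $\mathcal{C} = \mathcal{L}\mathcal{M}$ with $2 \times 2$ blocks $\Theta(\alpha) = \begin{pmatrix} \bar\alpha & \rho \\ \rho & -\alpha \end{pmatrix}$, $\rho = \sqrt{1 - |\alpha|^2}$ (one common convention, which may differ from the one fixed earlier in the paper), real coefficients following the $k = 2$ pattern $a, 0, b, 0, \dots$, and it inspects only entries away from the truncation boundary. All function names are hypothetical.

```python
import math


def theta(alpha):
    """2x2 block [[conj(alpha), rho], [rho, -alpha]] with rho = sqrt(1 - |alpha|^2)."""
    rho = math.sqrt(1.0 - abs(alpha) ** 2)
    return [[alpha.conjugate(), rho], [rho, -alpha]]


def block_diag(blocks, size):
    """Place square blocks along the diagonal of a size x size zero matrix."""
    mat = [[0j] * size for _ in range(size)]
    pos = 0
    for blk in blocks:
        for i, row in enumerate(blk):
            for j, val in enumerate(row):
                mat[pos + i][pos + j] = val
        pos += len(blk)
    return mat


def matmul(x, y):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def truncated_cmv(alphas):
    """Finite section of C = L M built from an even number of coefficients.

    L uses the even-indexed coefficients; M uses 1, the odd-indexed ones,
    and a trailing 1 that closes the truncation (the last coefficient is unused).
    """
    n = len(alphas)
    l_mat = block_diag([theta(al) for al in alphas[0::2]], n)
    m_mat = block_diag([[[1.0]]] + [theta(al) for al in alphas[1:-1:2]] + [[[1.0]]], n)
    return matmul(l_mat, m_mat)


if __name__ == "__main__":
    a, b = 0.6, 0.3
    alphas = [complex(x) for x in [a, 0.0, b, 0.0] * 3]  # k = 2 pattern, t = 0
    cmv = truncated_cmv(alphas)
    cmv2 = matmul(cmv, cmv)
    for n in range(1, 9):
        print(n, abs(cmv[n][n]), cmv2[n][n])
```

In this example the interior diagonal entries of the matrix vanish and those of its square are constant, equal to $-ab$, which is the value $c$ with $ab\,e^{it} = -c$ and $t = 0$.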



Remark. We note that on the level of Verblunsky coefficients, Theorems 4.6 and 4.8 imply for $k < \infty$,

$$\lim_{n \to \infty} |\alpha_{n_0+2nk+j}| = \begin{cases} a, & j = 0, \\ b, & j = k, \\ 0, & j \in \{1, \dots, k-1, k+1, \dots, 2k-1\}, \end{cases}
\qquad
\lim_{n \to \infty} \bar\alpha_{n_0+(n+1)k}\,\alpha_{n_0+nk} = -c, \tag{4.16}$$

for some $n_0 \in \mathbb{Z}$ and $ab = |c|$, and similarly, Theorem 4.6 and Corollary 4.7 imply for $k = \infty$,

$$\lim_{n \to \infty} |\alpha_{n_0+n}\,\alpha_n| = 0 \quad \text{for any } n_0 \in \mathbb{Z}. \tag{4.17}$$

This extends [19, Thm. E], where the stronger condition of weak convergence for the measures $d\mu_n$ is assumed.

Next, we use right limits to study ratio asymptotics. It is convenient to introduce $\widetilde{\mathcal{K}}(c, 1)$ as the subclass of $\mathcal{K}(c, 1)$ consisting of CMV matrices with Verblunsky coefficients of constant absolute value.

Definition 4.9. Let $\mu$ be a probability measure on the unit circle. We say $\mu$ is ratio asymptotic if

$$\lim_{n \to \infty} \frac{\Phi^*_{n+1}(z)}{\Phi^*_n(z)}$$

exists for all $z \in \mathbb{D}$, where, as usual, $\Phi_n(z)$ is the degree $n$ monic orthogonal polynomial associated to $\mu$. In particular, we say ratio asymptotics holds at $z \in \mathbb{D}$ with limit $G(z)$ if

$$\lim_{n \to \infty} \frac{\Phi^*_{n+1}(z)}{\Phi^*_n(z)} = G(z). \tag{4.18}$$

Theorem 4.10. Let $\Phi_n$ be the monic orthogonal polynomials associated with a half-line CMV matrix $\mathcal{C}$. If either all right limits of $\mathcal{C}$ are in $\mathcal{K}(\infty)$ or $\mathcal{C}$ has a unique right limit (up to a multiplication by a constant phase) in $\widetilde{\mathcal{K}}(c, 1)$, then $\mu$ is ratio asymptotic. Conversely, if ratio asymptotics holds at some point $z_0 \in \mathbb{D} \setminus \{0\}$ with limit $G(z_0) = 1$, then all right limits of $\mathcal{C}$ are in $\mathcal{K}(\infty)$. If ratio asymptotics holds at two points $z_1, z_2 \in \mathbb{D} \setminus \{0\}$ and the limit is not 1 at either point, then $\mathcal{C}$ has a unique right limit (up to multiplication by a constant phase) in $\widetilde{\mathcal{K}}(c, 1)$ for some $c \in \mathbb{D} \setminus \{0\}$.

Proof. First, observe that it follows from the Szegő recursion (1.1) that for all $n \in \mathbb{Z}_+$ and $z \in \mathbb{D}$,

$$1 - \frac{\Phi^*_{n+1}(z)}{\Phi^*_n(z)} = z\alpha_n \frac{\Phi_n(z)}{\Phi^*_n(z)} = z\alpha_n\, f(z; -\bar\alpha_{n-1}, -\bar\alpha_{n-2}, \dots, -\bar\alpha_0, 1). \tag{4.19}$$

We refer to [37, Prop. 9.2.3] for the details on the second equality. Abbreviating $f_n(z) = f(z; -\bar\alpha_{n-1}, -\bar\alpha_{n-2}, \dots, -\bar\alpha_0, 1)$, we see that ratio asymptotics (4.18) at $z \in \mathbb{D} \setminus \{0\}$ is equivalent to $\lim_{n \to \infty} \alpha_n f_n(z) = g(z) \equiv (1 - G(z))/z$. Let $\mathcal{E}$ be a right limit of $\mathcal{C}$ and let $\beta_n$, $f_\pm(\cdot, n)$, $n \in \mathbb{Z}$, be the Verblunsky coefficients and Schur functions associated with $\mathcal{E}$. Then $\beta_n f_-(z, n) = \lim_{j \to \infty} \alpha_{n+n_j} f_{n+n_j}(z)$ for all $n \in \mathbb{Z}$, $z \in \mathbb{D}$, and some sequence $\{n_j\}_{j \in \mathbb{N}}$.



By Theorem 4.6, if $\mathcal{E} \in \mathcal{K}(\infty)$ then at most one $\beta_n$ is nonzero, and hence $\beta_0 f_-(z, 0) = 0$ for all $z \in \mathbb{D}$. If $\mathcal{E} \in \widetilde{\mathcal{K}}(c, 1)$ then $|\beta_n| = \sqrt{|c|}$ and $\bar\beta_{n+1}\beta_n = -c$ for all $n \in \mathbb{Z}$. Then the Schur algorithm implies that $\beta_0 f_-(z, 0)$ is a function that depends only on the value of $c$. Since in both cases $\beta_0 f_-(z, 0)$ is independent of the sequence $n_j$, it follows that $\lim_{n \to \infty} \alpha_n f_n(z) = \beta_0 f_-(z, 0)$ for all $z \in \mathbb{D}$. Thus, by (4.19), ratio asymptotics holds for all $z \in \mathbb{D}$.

Conversely, by (4.19), ratio asymptotics at $z \in \mathbb{D} \setminus \{0\}$ implies $\beta_n f_-(z, n) = g(z)$ for all $n \in \mathbb{Z}$. By the Schur algorithm we have

$$f_-(z, n+1)\,[1 - z g(z)] = z f_-(z, n) - \bar\beta_n \quad \text{for all } n \in \mathbb{Z}. \tag{4.20}$$

If (4.18) holds at $z_0 \neq 0$ and the limit is 1, then by (4.19) $g(z_0) = 0$. Thus, by (4.20) there is at most one nonzero $\beta_n$, since $\beta_{n_0} \neq 0$ implies inductively that $f_-(z_0, n) \neq 0$ and hence $\beta_n = 0$ (since $g(z_0) = 0$) for all $n > n_0$. By Theorem 4.6, $\mathcal{E} \in \mathcal{K}(\infty)$.

Finally consider the case where ratio asymptotics holds at two different points $z_1, z_2 \in \mathbb{D} \setminus \{0\}$ and the limit is not 1 at either point. Then by (4.19), $g(z_1) \neq 0$ and $g(z_2) \neq 0$, and hence $\beta_n \neq 0$ for all $n \in \mathbb{Z}$. We also see that $z_1 g(z_1) \neq z_2 g(z_2)$, since otherwise it follows from (4.20) that $z_1 = z_2$. Thus, multiplying (4.20) by $\beta_{n+1}/(z g(z))$, substituting $z = z_j$, $j = 1, 2$, and subtracting the results then yields

$$\beta_{n+1}\bar\beta_n = \frac{\dfrac{1}{z_1} - g(z_1) - \dfrac{1}{z_2} + g(z_2)}{\dfrac{1}{z_2 g(z_2)} - \dfrac{1}{z_1 g(z_1)}} \quad \text{for all } n \in \mathbb{Z}. \tag{4.21}$$
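For the reader's convenience, the algebra behind (4.21) can be spelled out as follows. This is a sketch of the intermediate steps, using only (4.20) and the relations $\beta_{n+1} f_-(z, n+1) = g(z)$ and $\beta_n f_-(z, n) = g(z)$ stated above; it is not quoted from the original proof.

```latex
% Multiply (4.20) by \beta_{n+1}/(z g(z)) and use \beta_{n+1} f_-(z,n+1) = g(z),
% f_-(z,n) = g(z)/\beta_n:
\[
  \frac{1}{z} - g(z)
  = \frac{\beta_{n+1}}{\beta_n} - \frac{\beta_{n+1}\bar\beta_n}{z\,g(z)} .
\]
% Evaluate at z = z_1 and z = z_2 and subtract; the term \beta_{n+1}/\beta_n cancels:
\[
  \frac{1}{z_1} - g(z_1) - \frac{1}{z_2} + g(z_2)
  = \beta_{n+1}\bar\beta_n
    \left( \frac{1}{z_2\,g(z_2)} - \frac{1}{z_1\,g(z_1)} \right).
\]
% Dividing by the bracket, which is nonzero because z_1 g(z_1) \neq z_2 g(z_2),
% gives the n-independent value of \beta_{n+1}\bar\beta_n.
```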

Similarly, multiplying (4.20) by $|\beta_n|^2 \beta_{n+1}$ and evaluating at $z = z_1$ one obtains

$$|\beta_n|^2 = \frac{z_1 g(z_1)\,\beta_{n+1}\bar\beta_n}{g(z_1)\,(1 - z_1 g(z_1)) + \beta_{n+1}\bar\beta_n} \quad \text{for all } n \in \mathbb{Z}. \tag{4.22}$$

By (4.21) the right-hand side of (4.22) is $n$-independent, and hence $|\beta_n|$ as well as $\beta_{n+1}\bar\beta_n$ are $n$-independent constants uniquely determined by the ratio asymptotics (4.18) at $z_1$ and $z_2$. Thus, $\mathcal{E} \in \widetilde{\mathcal{K}}(c, 1)$ with $c = -\bar\beta_{n+1}\beta_n \neq 0$. Since $c$ is determined by the ratio asymptotics at $z_1$ and $z_2$, all right limits are the same up to multiplication by a constant phase.

Remark. This theorem extends an earlier result of Khrushchev [19, Thm. A].

We conclude with the

Proof of Theorem 1.13. Let $H$ belong to the Simon Class and let $a, b, c$ be as in (iii) of Theorem 4.1. If $a, c > 0$ then $H$ is a periodic whole-line Jacobi matrix, which is well known to be reflectionless on its spectrum ([43]). If $a = 0$ or $c = 0$ then $H$ is a direct sum of identical $2 \times 2$ (or $1 \times 1$) self-adjoint matrices and so has pure point spectrum of infinite multiplicity supported on at most two points.

For $\mathcal{E}$ in the Khrushchev Class with $k < \infty$ the same analysis goes through: as long as $|a|, |b| < 1$ we get a reflectionless operator. If one of them or both are unimodular then it is easy to see that $\mathcal{E}$ is a direct sum of $2 \times 2$ (or $1 \times 1$) matrices. If $k = \infty$ then $\mathcal{E}$ is either reflectionless or belongs to the class of matrices from Example 1.7.

Acknowledgements. We would like to thank Barry Simon for helpful discussions, as well as the referees for their useful comments.



References

1. Breimesser, S.V., Pearson, D.B.: Asymptotic value distribution for solutions of the Schrödinger equation. Math. Phys. Anal. Geom. 3, 385–403 (2000)
2. Breimesser, S.V., Pearson, D.B.: Geometrical aspects of spectral theory and value distribution for Herglotz functions. Math. Phys. Anal. Geom. 6, 29–57 (2003)
3. Breuer, J., Ryckman, E., Simon, B.: Equality of the spectral and dynamical definitions of reflection. (2009, preprint)
4. Cantero, M.J., Moral, L., Velázquez, L.: Five-diagonal matrices and zeros of orthogonal polynomials on the unit circle. Lin. Alg. Appl. 362, 29–56 (2003)
5. De Concini, C., Johnson, R.A.: The algebraic-geometric AKNS potentials. Ergod. Th. Dyn. Syst. 7, 1–24 (1987)
6. Craig, W.: The trace formula for Schrödinger operators on the line. Commun. Math. Phys. 126, 379–407 (1989)
7. Deift, P., Simon, B.: Almost periodic Schrödinger operators III. The absolutely continuous spectrum in one dimension. Commun. Math. Phys. 90, 389–411 (1983)
8. Geronimus, Ya.L.: On polynomials orthogonal on the circle, on trigonometric moment problem, and on allied Carathéodory and Schur functions. Mat. Sb. 15, 99–130 (1944)
9. Gesztesy, F., Krishna, M., Teschl, G.: On isospectral sets of Jacobi operators. Commun. Math. Phys. 181, 631–645 (1996)
10. Gesztesy, F., Makarov, K.A., Zinchenko, M.: Local AC spectrum for reflectionless Jacobi, CMV, and Schrödinger operators. Acta Appl. Math. 103, 315–339 (2008)
11. Gesztesy, F., Yuditskii, P.: Spectral properties of a class of reflectionless Schrödinger operators. J. Funct. Anal. 241, 486–527 (2006)
12. Gesztesy, F., Zinchenko, M.: A Borg-type theorem associated with orthogonal polynomials on the unit circle. J. Lond. Math. Soc. (2) 74, 757–777 (2006)
13. Gesztesy, F., Zinchenko, M.: Weyl–Titchmarsh theory for CMV operators associated with orthogonal polynomials on the unit circle. J. Approx. Theory 139, 172–213 (2006)
14. Gesztesy, F., Zinchenko, M.: Local spectral properties of reflectionless Jacobi, CMV, and Schrödinger operators. J. Diff. Eqs. 246, 78–107 (2009)
15. Golinskii, L., Nevai, P.: Szegő difference equations, transfer matrices and orthogonal polynomials on the unit circle. Commun. Math. Phys. 223, 223–259 (2001)
16. Jakšić, V., Last, Y.: Spectral structure of Anderson type Hamiltonians. Invent. Math. 141, 561–577 (2000)
17. Johnson, R.A.: The recurrent Hill's equation. J. Diff. Eqs. 46, 165–193 (1982)
18. Khrushchev, S.: Schur's algorithm, orthogonal polynomials, and convergence of Wall's continued fractions in L^2(T). J. Approx. Theory 108, 161–248 (2001)
19. Khrushchev, S.: Classification theorems for general orthogonal polynomials on the unit circle. J. Approx. Theory 116, 268–342 (2002)
20. Kotani, S.: Ljapunov indices determine absolutely continuous spectra of stationary random one-dimensional Schrödinger operators. In: Stochastic Analysis, K. Itô (ed.), Amsterdam: North-Holland, 1984, pp. 225–247
21. Kotani, S.: One-dimensional random Schrödinger operators and Herglotz functions. In: Probabilistic Methods in Mathematical Physics, K. Itô, N. Ikeda (eds.), New York: Academic Press, 1987, pp. 219–250
22. Kotani, S., Krishna, M.: Almost periodicity of some random potentials. J. Funct. Anal. 78, 390–405 (1988)
23. Last, Y., Simon, B.: The essential spectrum of Schrödinger, Jacobi, and CMV operators. J. Anal. Math. 98, 183–220 (2006)
24. Melnikov, M., Poltoratski, A., Volberg, A.: Uniqueness theorems for Cauchy integrals. http://arxiv.org/abs/0704.0621v1 [math.cv], 2007
25. Nazarov, F., Volberg, A., Yuditskii, P.: Reflectionless measures with a point mass and singular continuous component. http://arxiv.org/abs/0711.0948v1 [math-ph], 2007
26. Nevanlinna, R.: Analytic Functions. Translated from the second German edition by Phillip Emig, Die Grundlehren der mathematischen Wissenschaften, Band 162, New York-Berlin: Springer-Verlag, 1970
27. Peherstorfer, F., Yuditskii, P.: Asymptotic behavior of polynomials orthonormal on a homogeneous set. J. Anal. Math. 89, 113–154 (2003)
28. Peherstorfer, F., Yuditskii, P.: Almost periodic Verblunsky coefficients and reproducing kernels on Riemann surfaces. J. Approx. Theory 139, 91–106 (2006)
29. Poltoratski, A., Remling, C.: Reflectionless Herglotz functions and Jacobi matrices. To appear in Commun. Math. Phys., DOI:10.1007/s00220-008-0696-x, 2009
30. Rakhmanov, E.A.: On the asymptotics of the ratio of orthogonal polynomials. Math. USSR Sb. 32, 199–213 (1977)
31. Rakhmanov, E.A.: On the asymptotics of the ratio of orthogonal polynomials, II. Math. USSR Sb. 46, 105–117 (1983)
32. Remling, C.: The absolutely continuous spectrum of one-dimensional Schrödinger operators. Math. Phys. Anal. Geom. 10, 359–373 (2007)
33. Remling, C.: The absolutely continuous spectrum of Jacobi matrices. http://arxiv.org/abs/0706.1101v1 [math.SP], 2007
34. Simon, B.: The classical moment problem as a self-adjoint finite difference operator. Adv. Math. 137, 82–203 (1998)
35. Simon, B.: Ratio asymptotics and weak asymptotic measures for orthogonal polynomials on the real line. J. Approx. Theory 126, 198–217 (2004)
36. Simon, B.: Orthogonal Polynomials on the Unit Circle, Part 1: Classical Theory. AMS Colloquium Series, 54.1, Providence, RI: Amer. Math. Soc., 2005
37. Simon, B.: Orthogonal Polynomials on the Unit Circle, Part 2: Spectral Theory. AMS Colloquium Series, 54.2, Providence, RI: Amer. Math. Soc., 2005
38. Sims, R.: Reflectionless Sturm–Liouville equations. J. Comp. Appl. Math. 208, 207–225 (2007)
39. Sodin, M., Yuditskii, P.: Almost periodic Sturm–Liouville operators with Cantor homogeneous spectrum and pseudoextendible Weyl functions. Russ. Acad. Sci. Dokl. Math. 50, 512–515 (1995)
40. Sodin, M., Yuditskii, P.: Almost periodic Sturm–Liouville operators with Cantor homogeneous spectrum. Comment. Math. Helvetici 70, 639–658 (1995)
41. Sodin, M., Yuditskii, P.: Almost-periodic Sturm–Liouville operators with homogeneous spectrum. In: Algebraic and Geometric Methods in Mathematical Physics. A. Boutel de Monvel and A. Marchenko (eds.), Dordrecht: Kluwer, 1996, pp. 455–462
42. Sodin, M., Yuditskii, P.: Almost periodic Jacobi matrices with homogeneous spectrum, infinite dimensional Jacobi inversion, and Hardy spaces of character-automorphic functions. J. Geom. Anal. 7, 387–435 (1997)
43. Teschl, G.: Jacobi Operators and Completely Integrable Nonlinear Lattices. Mathematical Surveys and Monographs, 72, Providence, RI: Amer. Math. Soc., 2000

Communicated by B. Simon




Eigenvalue Estimates for Schrödinger Operators with Complex Potentials Ari Laptev1 , Oleg Safronov2 1 Department of Mathematics, Imperial College London, Huxley Building, 180 Queen’s Gate,

London SW7 2AZ, UK. E-mail: [email protected]

2 University of North Carolina at Charlotte, Mathematics and Statistics, 9201 University City Blvd.,

Charlotte, NC 28223, USA. E-mail: [email protected] Received: 27 November 2008 / Accepted: 8 June 2009 Published online: 31 July 2009 – © Springer-Verlag 2009

Abstract: We discuss properties of eigenvalues of non-self-adjoint Schrödinger operators with complex-valued potential $V$. Among our results are estimates of the sum of powers of imaginary parts of eigenvalues by the $L^p$-norm of $\Im V$.

1. Introduction

Throughout the paper, $f_\pm$ denotes either the positive or the negative part of $f$, which is either a function or a self-adjoint operator. The symbols $\Re z$ and $\Im z$ denote the real and the imaginary part of $z$. If $a$ is a function on $\mathbb{R}^d$, then $a(i\nabla)$ is the operator whose integral kernel is $(2\pi)^{-d} \int e^{i\xi(x-y)} a(\xi)\, d\xi$.

Let $H$ be a non-self-adjoint Schrödinger operator in $L^2(\mathbb{R}^d)$,

$$H = -\Delta + V(x),$$

with a complex-valued potential $V$. We call $\lambda$ an eigenvalue of $H$ if there is a solution of the equation $H\psi = \lambda\psi$ for some $\psi \in L^2$, $\psi \neq 0$. We deal with operators that have countably many eigenvalues lying in the cut plane $\mathbb{C} \setminus [0, \infty)$. We denote them by $\lambda_j$, $j = 1, 2, 3, \dots$ A given number $\lambda \in \mathbb{C} \setminus [0, \infty)$ may occur several times in this list according to the dimension of the generalized eigenspace $\{\psi : (H - \lambda)^k \psi = 0 \text{ for some } k \in \mathbb{N}\}$, which is called the algebraic multiplicity. In principle, a generalized eigenspace could have infinite dimension, but, as we shall see, this will not occur in the situations considered in this paper.

The main result of [9] tells us that for any $t > 0$, the eigenvalues $\lambda_j$ of $H$ lying outside the sector $\{z : |\Im z| < t\, \Re z\}$ satisfy the estimate

$$\sum_j |\lambda_j|^\gamma \le C \int |V(x)|^{\gamma + d/2}\, dx, \quad \gamma \ge 1,$$

where the constant $C$ may depend on $t$, $\gamma$ and $d$.


In this paper we study inequalities for the eigenvalues that might be close to the positive half-line. In particular, our results provide some information about the rate of accumulation of eigenvalues to the set $\mathbb{R}_+ = [0, \infty)$.

Theorem 1. Let $\Re V \ge 0$ be a bounded function. Assume that $\Im V \in L^p(\mathbb{R}^d)$, where $p > d/2$ if $d \ge 2$ and $p \ge 1$ if $d = 1$. Then the eigenvalues $\lambda_j$ of the operator $H = -\Delta + V$ satisfy the estimate

$$\sum_j \left( \frac{\Im \lambda_j}{|\lambda_j + 1|^2 + 1} \right)_+^p \le C \int_{\mathbb{R}^d} (\Im V)_+^p(x)\, dx. \tag{1.1}$$

The constant $C$ can be computed explicitly:

$$C = (2\pi)^{-d} \int_{\mathbb{R}^d} \frac{d\xi}{(\xi^2 + 1)^p}. \tag{1.2}$$
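The integral in (1.2) has a classical closed form, $\int_{\mathbb{R}^d} (\xi^2 + 1)^{-p}\, d\xi = \pi^{d/2}\, \Gamma(p - d/2) / \Gamma(p)$ for $p > d/2$, so that $C = (4\pi)^{-d/2}\, \Gamma(p - d/2)/\Gamma(p)$. The following small sketch evaluates it; the function name `constant_c` is illustrative and not taken from the paper.

```python
import math


def constant_c(d, p):
    """Evaluate C = (2*pi)^(-d) * integral over R^d of (xi^2 + 1)^(-p) d(xi).

    Uses the closed form (4*pi)^(-d/2) * Gamma(p - d/2) / Gamma(p),
    which is valid only when p > d/2 (otherwise the integral diverges).
    """
    if p <= d / 2:
        raise ValueError("the integral converges only for p > d/2")
    return math.gamma(p - d / 2) / ((4.0 * math.pi) ** (d / 2) * math.gamma(p))


if __name__ == "__main__":
    for d, p in [(1, 1.0), (1, 2.0), (3, 2.0)]:
        print(f"d = {d}, p = {p}: C = {constant_c(d, p)}")
```

For $d = p = 1$ this gives $C = 1/2$, since $\int_{\mathbb{R}} (\xi^2 + 1)^{-1}\, d\xi = \pi$.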

Note that the right-hand side of (1.1) is independent of the real part of the potential $V$ and therefore the statement is true for an arbitrary $\Re V \ge 0$ from $L^\infty_{\mathrm{loc}}(\mathbb{R}^d)$ (since we only need to check that (3.2) holds). It is not the case when we try to obtain an estimate of the sum $\sum_j \bigl(\Im \lambda_j / (|\lambda_j + 1|^2 + 1)\bigr)_+^p$, where we allow $p \le d/2$ and where a certain regularity of $\Re V$ is required.

Theorem 2. Let $\Re V \ge 0$ and $\Im V$ be two bounded real valued functions. Assume that $\Im V \in L^p(\mathbb{R}^d)$, where $p > d/4$ if $d \ge 4$ and $p \ge 1$ if $d \le 3$. Then the eigenvalues $\lambda_j$ of the operator $H = -\Delta + V$ satisfy the estimate

$$\sum_j \left( \frac{\Im \lambda_j}{|\lambda_j + 1|^2 + 1} \right)_+^p \le C\, (1 + \|\Re V\|_\infty)^{2p} \int_{\mathbb{R}^d} (\Im V)_+^p(x)\, dx, \tag{1.3}$$

where

$$C = (2\pi)^{-d} \int_{\mathbb{R}^d} \frac{d\xi}{((\xi^2 + 1)^2 + 1)^p}. \tag{1.4}$$

Next Theorems 3-4 give sufficient conditions on V that guarantee convergence of the sum |λ j |γ < ∞ a 0} satisfy the inequality 2γ −1 1 |λ j |γ ≤ C|b (W )| 2r −1 (b + |b (W )| 2r −1 ). (1.5) λ j ∈b

The constant C in this inequality depends on d, γ and r .


Applying the same method we also prove: Theorem 4. Let λ j be the eigenvalues of the operator H = −d 2 /d x 2 + V lying inside the semi-infinite strip a,b = {z : a < z < b, z > 0} with a > 0. Then for any γ > 3/2 and r ∈ (γ − 21 , γ ) the condition V ∈ L 2r (R) ∩ L r (R) implies

2γ −1

1

|λ j |γ ≤ C| a (W )| 2r −1 (b + | a (W )| 2r −1 ),

λ j ∈a,b

where a (W ) = a −1/2

R

W r d x.

The constant in this inequality depends on γ and r . Note that in Theorems 3 and 4 the inequality γ ≥ 3/2 is required in any dimension while in Theorems 1 and 2 the values of p could be smaller in lower dimensions. One should mention that the paper [9] had been motivated by a question of E.B. Davies (see [1] and [7]). He obtained that if d = 1 and V ∈ L 1 (R), then all eigenvalues λ of H which do not belong to R+ satisfy 1 |λ| ≤ 4

2 |V (x)|d x

.
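A standard formal example, not taken from the paper, indicates why the constant $1/4$ cannot be improved. It assumes a delta potential $V(x) = c\,\delta(x)$ with $\Re c < 0$, which lies outside $L^1(\mathbb{R})$, so $\int |V|\, dx$ is read as $|c|$.

```latex
% Bound state of H = -d^2/dx^2 + c\,\delta(x), with c complex and Re c < 0.
\[
  \psi(x) = e^{-\kappa |x|}, \qquad
  \psi'(0^+) - \psi'(0^-) = c\,\psi(0)
  \;\Longrightarrow\; -2\kappa = c, \quad \kappa = -\tfrac{c}{2}, \quad \Re\kappa > 0 .
\]
% Away from the origin -\psi'' = -\kappa^2 \psi, hence
\[
  \lambda = -\kappa^2 = -\frac{c^2}{4}, \qquad
  |\lambda| = \frac{|c|^2}{4} = \frac{1}{4}\Bigl(\int |V(x)|\,dx\Bigr)^{2}.
\]
```

In this formal sense the one-dimensional bound holds with equality.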

The question was raised if a similar estimate holds in dimension $d \ge 2$. The following conjecture seems to be reasonable:

Conjecture. Let $d \ge 2$, $0 < \gamma \le d/2$ and let $V \in L^{d/2+\gamma}(\mathbb{R}^d)$ be a complex-valued potential. Then every eigenvalue $\lambda \notin \mathbb{R}_+$ of the operator $H = -\Delta + V$ satisfies

$$|\lambda|^\gamma \le C \int_{\mathbb{R}^d} |V(x)|^{d/2+\gamma}\, dx, \tag{1.6}$$

with a constant $C$ that does not depend on the potential or on the eigenvalue.

We carefully avoid the case $\gamma > d/2$, since the operator $H$ in this case might have arbitrarily large positive eigenvalues due to the Wigner-Von Neumann example [18] (we are grateful to S. Molchanov for drawing our attention to this circumstance). So far, we are able to prove only the following result related to this conjecture:

Theorem 5. Let $V$ be a function from $L^p(\mathbb{R}^d)$, where $p \ge d/2$ if $d \ge 3$; $p > 1$ if $d = 2$; and $p \ge 1$ if $d = 1$. Then every eigenvalue $\lambda$ of the operator $H = -\Delta + V$ with the property $\Re \lambda > 0$ satisfies the estimate

$$|\Im \lambda|^{p-1} \le |\lambda|^{d/2-1}\, C \int_{\mathbb{R}^d} |V|^p\, dx. \tag{1.7}$$

The constant $C$ in this inequality depends only on $d$ and $p$. Moreover, $C = 1/2$ for $p = d = 1$.



The inequality (1.7) was established in [1] in the case $d = p = 1$. We prove it in higher dimensions and in dimension $d = 1$ for $p > 1$. We also show the elementary estimate (see Theorem 16)

$$|\Im \sqrt{\lambda}|^{2\gamma} \le C \int_{\mathbb{R}^3} |V|^{3/2+\gamma}\, dx, \quad \gamma > 0,\ d = 3,$$

however it is not quite the same as (1.6).

While we are not able to prove Conjecture 1.1, we find some information about the location of eigenvalues of the operator $-\Delta + iV$ with a positive $V \ge 0$, see Theorem 13. In particular, in Theorem 15 we prove that if $d = 3$ and $\int V\, dx$ is small and $\lambda \notin \mathbb{R}_+$ is an eigenvalue of $-\Delta + iV$, then $|\lambda|$ must be large. It might seem that eigenvalues do not exist at all for small values of $\int V\, dx$, however their presence in such cases can be easily established using scaling.

Proposition 1. Let $d \ge 3$. Then there is a sequence of positive functions $V_n \ge 0$ such that the "largest modulus" eigenvalue $\lambda_n \notin \mathbb{R}_+$ of the operator $-\Delta + iV_n$ satisfies $|\lambda_n| \to \infty$ as $n \to \infty$, while $\lim_{n \to \infty} \int V_n(x)\, dx = 0$.

Proof. If $\lambda$ is an eigenvalue of $-\Delta + iV(x)$, then $n^2\lambda$ is an eigenvalue of $-\Delta + n^2 iV(nx)$. It remains to note that $\int n^2 V(nx)\, dx = C n^{2-d}$. The idea of the proof of existence of a non-real eigenvalue of $-\Delta + iV(x)$ at least for one $V \ge 0$ is to start with the one-dimensional case, when V (x) = δ(x) + δ(x − ). In this case, there is an eigenvalue of H that behaves like 1 + iα + O( 2 ) as → 0. If $V$ is spherically symmetric, then the multi-dimensional case can be reduced to the one-dimensional case by separation of variables.
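The scaling step in the proof of Proposition 1 can be written out as follows. This is a sketch under the assumption that $\psi$ is an eigenfunction of $-\Delta + iV$ with eigenvalue $\lambda$; the notation $\psi_n(x) = \psi(nx)$ is introduced here for illustration.

```latex
% Rescaled eigenfunction \psi_n(x) = \psi(nx):
\[
  \bigl(-\Delta + i\,n^2 V(nx)\bigr)\psi_n(x)
  = n^2\,\bigl[(-\Delta\psi)(nx) + i\,V(nx)\,\psi(nx)\bigr]
  = n^2 \lambda\, \psi_n(x).
\]
% Change of variables y = nx in the integral of the rescaled potential:
\[
  \int_{\mathbb{R}^d} n^2 V(nx)\,dx
  = n^{2-d} \int_{\mathbb{R}^d} V(y)\,dy
  \;\longrightarrow\; 0 \quad (n \to \infty,\ d \ge 3).
\]
```

So the constant in $C n^{2-d}$ is $\int V\, dy$, while the eigenvalue $n^2\lambda$ grows in modulus whenever $\lambda \neq 0$.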

Remark. Note that our results also imply that the eigenvalues of $-\Delta + iV$ cannot accumulate to zero in $d = 3$, if $V \ge 0$ is integrable (Corollary 5).

Complex potentials are used in physics (see, for example, [2, 10, 13]) to describe different phenomena. In quantum mechanics and nuclear physics, the imaginary part of the potential is used to describe dissipation. Unlike the selfadjoint case, where the $L^2$-norm of the wave function is constant, the $L^2$-norm of the wave function in systems with a dissipation might change in time. The real part of the potential describes usual scattering whereas the imaginary part describes the absorption.

2. Preliminaries

In what follows, the inner products and the norms in various spaces are denoted by $(\cdot, \cdot)$ and $\|\cdot\|$ respectively.

1. Let $a[\cdot, \cdot]$ be a sesquilinear form in a Hilbert space $\mathfrak{H}$. We assume that its domain $d[a]$ is dense in $\mathfrak{H}$ and $a$ is semibounded from below and closed on $d[a]$. The form $a$ induces the selfadjoint operator $A$ in $\mathfrak{H}$. Fix the value of $\gamma \in \mathbb{R}$ such that $a_\gamma := a + \gamma \ge 1$, i.e.

$$a_\gamma[x, x] = a[x, x] + \gamma \|x\|^2 \ge \|x\|^2, \quad x \in d[a],$$

and denote by $\mathfrak{H}_\gamma[a]$ the (complete) Hilbert space $d[a]$ with the metric form

$$a_\gamma[x, x] = \|(A + \gamma I)^{1/2} x\|^2, \quad x \in d[a].$$


Let $V : \mathfrak{H} \to \mathfrak{H}$ be a selfadjoint linear operator, satisfying $D(|V|^{1/2}) \supset d[a]$ and

$$G := |V|^{1/2} (A + \gamma I)^{-1/2} \in \mathfrak{S}_\infty, \tag{2.1}$$

where $\mathfrak{S}_\infty$ denotes the space of compact operators in $\mathfrak{H}$. Put

$$v[x, y] = \left( \frac{V}{|V|^{1/2}}\, x,\ |V|^{1/2} y \right). \tag{2.2}$$

Then the form $v$ is compact on $d[a]$. This means that the form $v$ is continuous on $\mathfrak{H}_\gamma[a]$ and the corresponding operator $Q$ (determined by the relations $a_\gamma[Qx, y] = v[x, y]$ for $x, y \in d[a]$) is compact on $\mathfrak{H}_\gamma[a]$. Define the operator $H$ by setting

$$H + \gamma I = (A + \gamma I)(I + iQ), \tag{2.3}$$

on the domain $D(H) = (I + iQ)^{-1} D(A)$. It is clear that the operator $H$ can be interpreted as the sum $H = A + iV$.

Proposition 2. The operator $H$ defined in (2.3) is densely defined and closed.

Proof. Let us first prove that $H$ is densely defined. Assume the opposite, that there is a non-zero vector $h \in d[a]$ such that $a_\gamma[(I + iQ)^{-1} u, h] = 0$ for all vectors $u \in D(A)$. Then $a_\gamma[u, (I - iQ)^{-1} h] = 0$ for all $u \in d[a]$, which implies that $(I - iQ)^{-1} h = 0$. The latter relation contradicts the assumption that $h \neq 0$.

In order to prove that $H$ is closed, it is sufficient to observe that $H + \gamma I$ is invertible and prove that the inverse is bounded. But this follows from the relation

$$(H + \gamma I)^{-1} = (I + iQ)^{-1} (A + \gamma I)^{-1},$$

and the fact that $(A + \gamma I)^{-1}$ maps $\mathfrak{H}$ continuously to $\mathfrak{H}_\gamma[a]$.

Remark. The condition that the sesquilinear form $v$ is generated by a self-adjoint operator $V$ is excessive. We can always define $H$ by (2.3), as soon as we know that $v[u, u] = a_\gamma[Qu, u]$, where $Q$ is compact in the space $\mathfrak{H}_\gamma[a]$. This remark allows one to consider the case when the elliptic operator $A = \Delta^2$ is perturbed by a differential operator of first order.

Under the above assumptions, the difference between the resolvents of the operators $A$ and $H$ is compact. Hence, the spectrum $\sigma(H)$ of the operator $H$ is discrete in $\mathbb{C} \setminus \sigma(A)$.

2. Let $H$ be as described above. In order to develop the perturbation theory suitable for non-selfadjoint operators, we consider a contour $C$ which encloses a finite number of eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_m$ of the operator $H$ and has no intersection with the spectrum of $H$. Then the projection onto the span of the corresponding root vectors is given by the formula

$$P = \frac{i}{2\pi} \int_C (H - z)^{-1}\, dz. \tag{2.4}$$

Consequently, if $H_n$ is a family of closed operators in $\mathfrak{H}$ having the property that $\sigma(H_n) \setminus \mathbb{R}$ is discrete and satisfying the condition

$$\|(H_n - z)^{-1} - (H - z)^{-1}\| \to 0, \tag{2.5}$$



as $n \to \infty$, uniformly for $z \in C$, then the sequence of projections $P_n$, defined by (2.4) with $H_n$ instead of $H$, will converge to $P$. That means that the norm $\|P - P_n\|$ will be small for sufficiently large $n$. Now, in order to draw a conclusion about eigenvalues of $H_n$, we can apply the following statement.

Lemma 1 (see, for example, [14]). If $P$ and $P_0$ are two projections such that $\operatorname{rank} P \neq \operatorname{rank} P_0$, then $\|P - P_0\| \geq 1$.

Note that if $H_n$ is a family of closed operators in $\mathcal{H}$ having the property that $\sigma(H_n) \setminus \mathbb{R}$ is discrete and satisfying the condition
$$\|(H_n - z_0)^{-1} - (H - z_0)^{-1}\| \to 0, \tag{2.6}$$

as $n \to \infty$, for some point $z_0$, then (2.5) holds for all $z$ outside of the spectrum of $H$, because of the formula
$$(H - z)^{-1} = \bigl(I - (z - z_0)(H - z_0)^{-1}\bigr)^{-1} (H - z_0)^{-1}$$
and a similar formula for $(H_n - z)^{-1}$. The convergence in (2.5) is uniform on compact subsets of $\mathbb{C} \setminus \sigma(H)$. We conclude that every non-real eigenvalue $\lambda_j$ of the operator $H$ is the limit of a sequence $\lambda_j(n)$ of the eigenvalues of the operators $H_n$. The algebraic multiplicities of the eigenvalues are preserved in the obvious manner, and we omit the discussion of that.

Let us formulate a condition that guarantees (2.5). Let $V_n$ be a sequence of self-adjoint operators in $\mathcal{H}$ such that (2.1) holds with $V_n$ instead of $V$. Let the sesquilinear form $v_n$ be the same as (2.2) with $V$ replaced by $V_n$. Suppose that the sequence of operators $Q_n$ determined by $a_\gamma[Q_n x, y] = v_n[x, y]$ for $x, y \in d[a]$ converges to the operator $Q$ (acting in the space $\mathcal{H}_\gamma[a]$). If we define $H_n$ by the formula $H_n + \gamma I = (A + \gamma I)(I + iQ_n)$, then we will obtain a sequence of operators $H_n$ having the property (2.6) with $z_0 = -\gamma$ and $H$ defined by (2.3).

Proposition 3. Let $H_n + \gamma I = (A + \gamma I)(I + iQ_n)$, where the sequence of compact selfadjoint operators $Q_n$ in $\mathcal{H}_\gamma[a]$ converges to $Q$ (as operators in $\mathcal{H}_\gamma[a]$). Let $H + \gamma I = (A + \gamma I)(I + iQ)$. Then every non-real eigenvalue $\lambda_j$ of the operator $H$ is the limit of a sequence $\lambda_j(n)$ of the eigenvalues of the operators $H_n$ as $n \to \infty$.

Proposition 4. The operators $Q$ and $Q_n$ are unitarily equivalent to the operators $(A + \gamma)^{-1/2} V (A + \gamma)^{-1/2}$ and $(A + \gamma)^{-1/2} V_n (A + \gamma)^{-1/2}$ acting in $\mathcal{H}$. Convergence of $Q_n$ to $Q$ (as operators in $\mathcal{H}_\gamma[a]$) is equivalent to convergence of $(A + \gamma)^{-1/2} V_n (A + \gamma)^{-1/2}$ to $(A + \gamma)^{-1/2} V (A + \gamma)^{-1/2}$.

Proof. Since
$$a_\gamma[(A + \gamma)^{-1/2} u, (A + \gamma)^{-1/2} v] = (u, v), \qquad \forall u, v \in \mathcal{H},$$
we conclude that the operator $U = (A + \gamma)^{-1/2}$ is a unitary mapping from $\mathcal{H}$ to $\mathcal{H}_\gamma[a]$. It remains to prove that $U (A + \gamma)^{-1/2} V (A + \gamma)^{-1/2} U^* = Q$. In order to show this, we simply note that
$$a_\gamma[U (A + \gamma)^{-1/2} V (A + \gamma)^{-1/2} U^* u, v] = v[u, v] = a_\gamma[Qu, v].$$
That proves one of the statements of the proposition. The other statements are obvious.
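The contour formula (2.4) can be tried out numerically in finite dimensions. The following is a minimal sketch, assuming a hypothetical 2x2 non-selfadjoint matrix `H` as a stand-in for the operator and a circle around one of its eigenvalues as the contour; the names `inv2` and `riesz_projection` are illustrative and not taken from the paper.

```python
import cmath

def inv2(m):
    """Inverse of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def riesz_projection(h, center, radius, n_nodes=64):
    """Approximate P = (i / 2 pi) * contour integral of (h - z)^{-1} dz over a
    circle, using the trapezoid rule (illustrative discretisation)."""
    p = [[0j, 0j], [0j, 0j]]
    for k in range(n_nodes):
        w = radius * cmath.exp(2j * cmath.pi * k / n_nodes)  # z - center
        z = center + w
        r = inv2([[h[0][0] - z, h[0][1]], [h[1][0], h[1][1] - z]])
        # dz = i * w * (2 pi / n_nodes), so (i / 2 pi) * dz = -w / n_nodes
        for i in range(2):
            for j in range(2):
                p[i][j] -= r[i][j] * w / n_nodes
    return p

# Hypothetical example: eigenvalues 1 + i (inside the contour) and 3 (outside).
H = [[1 + 1j, 1.0], [0.0, 3.0]]
P = riesz_projection(H, center=1 + 1j, radius=0.5)
trace_P = P[0][0] + P[1][1]  # equals the number of enclosed eigenvalues
```

The trace of the computed projection counts the eigenvalues enclosed by the contour, which is the quantity that Lemma 1 keeps stable under small perturbations of the resolvent.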



3. We already know that $A$ is semibounded from below. Suppose also that the negative spectrum of $A$ is discrete. Then the operator $H$ has only a discrete set of eigenvalues in the left half-plane $\mathbb{C}_{left} = \{z : \operatorname{Re} z < 0\}$. Moreover, suppose that $\lambda_j \in \mathbb{C}_{left}$ are eigenvalues of the operator $H$, and $\tau_j$ are negative eigenvalues of $A$ enumerated in the order of increasing real parts. Then
$$\sum_{1}^{n} |\operatorname{Re} \lambda_j| \leq \sum_{1}^{n} |\tau_j|$$
for all $n$. Indeed, let $P$ be the orthogonal projection onto the span of eigenvectors $x_j$ corresponding to $\lambda_j$, $1 \leq j \leq n$. Then
$$\operatorname{tr} HP = \sum_{1}^{n} \lambda_j.$$
Consequently,
$$\sum_{1}^{n} \operatorname{Re} \lambda_j = \sum_{1}^{n} (A x_j, x_j) \geq \min_P \operatorname{tr} \bigl((A + \gamma)^{1/2} P\bigr)^* A (A + \gamma)^{-1} (A + \gamma)^{1/2} P, \tag{2.7}$$
where the minimum is taken over all orthogonal projections $P$ of rank $n$ with the property $\operatorname{Ran} P \subset d[a]$. Thus,
$$\sum_{1}^{n} \operatorname{Re} \lambda_j \geq \sum_{1}^{n} \tau_j, \tag{2.8}$$
since the minimum in the right-hand side of (2.7) coincides with the sum in the right-hand side of (2.8).

Corollary 1. Let $\gamma > 0$. Then
$$\sum_{1}^{n} (\operatorname{Re} \lambda_j + \gamma)_- \leq \sum_{1}^{n} (\tau_j + \gamma)_-. \tag{2.9}$$

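Proof. We can find a positive integer $m$ such that $\operatorname{Re} \lambda_m + \gamma < 0$, but $\operatorname{Re} \lambda_{m+1} + \gamma \geq 0$. There are two possibilities: either $m \leq n$ or $m > n$. If $m \leq n$, then
$$\sum_{1}^{n} (\operatorname{Re} \lambda_j + \gamma)_- = \sum_{1}^{m} (\operatorname{Re} \lambda_j + \gamma)_- \leq \sum_{1}^{m} (\tau_j + \gamma)_- \leq \sum_{1}^{n} (\tau_j + \gamma)_-.$$
If $m > n$ then (2.9) is just obvious, since, in this case,
$$\sum_{1}^{n} (\operatorname{Re} \lambda_j + \gamma)_- = -\sum_{1}^{n} (\operatorname{Re} \lambda_j + \gamma) \leq -\sum_{1}^{n} (\tau_j + \gamma) \leq \sum_{1}^{n} (\tau_j + \gamma)_-.$$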
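The trace argument behind (2.8) has a direct finite-dimensional analogue: for Hermitian matrices $A$ and $V$, the partial sums of the ordered real parts of the eigenvalues of $A + iV$ dominate the partial sums of the ordered eigenvalues of $A$. A minimal sketch of that analogue, assuming hypothetical 2x2 matrices (the entries and the helper `eig2` are illustrative only):

```python
import cmath

def eig2(m):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - root) / 2, (tr + root) / 2]

# Hypothetical symmetric matrices standing in for A and V.
A = [[-2.0, 1.0], [1.0, 0.5]]
V = [[1.0, 0.3], [0.3, 2.0]]
H = [[A[i][j] + 1j * V[i][j] for j in range(2)] for i in range(2)]

tau = sorted(z.real for z in eig2(A))      # eigenvalues of A, increasing
re_lam = sorted(z.real for z in eig2(H))   # real parts of eigenvalues of H

# Pairs (sum of n smallest Re lambda_j, sum of n smallest tau_j) for n = 1, 2.
partial_sums = [(sum(re_lam[:n]), sum(tau[:n])) for n in (1, 2)]
```

For $n = 2$ both sums equal $\operatorname{tr} A$, so only $n = 1$ gives a strict inequality in this small example.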



Actually, (2.8) holds for all $n$ if and only if (2.9) holds for all $n$ and $\gamma > 0$. Indeed, let us fix $n$ and choose $\gamma$ in (2.9) so small that all terms in both sides of (2.9) are different from zero. Then $\gamma$ cancels out and we obtain (2.8).

4. Let $T$ be a bounded operator in a Hilbert space, whose spectrum outside the unit circle $\{z : |z| > 1\}$ is discrete. Suppose also that the essential spectrum of the operator $(T^* T)^{1/2}$ is contained in $[0, 1]$. Let $\lambda_j$ be the eigenvalues of the operator $T$ lying outside of the unit circle, and let $s_j > 1$ be the eigenvalues of $(T^* T)^{1/2}$. If we enumerate the sequences $|\lambda_j|$ and $s_j$ in the decreasing order, then
$$\prod_{1}^{n} |\lambda_j| \leq \prod_{1}^{n} s_j \tag{2.10}$$
for all values of $n$. One should mention also that, if one of the sequences ends at $j = j_0$, we extend it by setting it equal to 1 for $j > j_0$. This inequality was discovered for compact operators by H. Weyl (see [19]). Weyl's proof carries over to the case of bounded operators. Indeed, let $P$ be the orthogonal projection onto the span of eigenvectors corresponding to $\lambda_j$, $1 \leq j \leq n$. Then for any $\alpha > 0$,
$$\det\bigl(I + P(\alpha T^* T - I)P\bigr) = \alpha^n \prod_{1}^{n} |\lambda_j|^2.$$
Consequently,
$$\alpha^n \prod_{1}^{n} |\lambda_j|^2 \leq \det\bigl(I + P(\alpha T^* T - I)_+ P\bigr) \leq \det\bigl(I + (\alpha T^* T - I)_+^{1/2} P (\alpha T^* T - I)_+^{1/2}\bigr).$$
Since $(\alpha T^* T - I)_+^{1/2} P (\alpha T^* T - I)_+^{1/2} \leq (\alpha T^* T - I)_+$, we can remove the orthogonal projection $P$ in the right-hand side and obtain
$$\alpha^n \prod_{1}^{n} |\lambda_j|^2 \leq \det\bigl(I + (\alpha T^* T - I)_+\bigr) = \prod_{\alpha s_j^2 > 1} \alpha s_j^2.$$
It remains to choose $\alpha = s_n^{-2}$. Note that if the number of $s_j > 1$ is finite, we can take $\alpha = 1$ to obtain that
$$\prod_{1}^{n} |\lambda_j|^2 \leq \prod_{s_j^2 > 1} s_j^2$$
for all $n$.

Corollary 2. Let $\gamma \geq 1$. Then
$$\sum_{1}^{n} (|\lambda_j|^2 - 1)^\gamma \leq \sum_{1}^{n} (s_j^2 - 1)^\gamma$$
for all $n$.



Proof. Our arguments are quite standard and can be compared with the ones in the book by Birman and Solomyak [4], which contains a survey on different inequalities for compact operators. It is sufficient to consider the case $\gamma > 1$, because the proof in the case $\gamma = 1$ is obtained by passing to the limit as $\gamma \to 1$. As a consequence of (2.10), we obtain that
$$\sum_{1}^{n} \log |\lambda_j| \leq \sum_{1}^{n} \log s_j. \tag{2.11}$$
Moreover,
$$\sum_{j=1}^{n} (\log |\lambda_j| - \eta)_+ \leq \sum_{j=1}^{n} (\log s_j - \eta)_+ \tag{2.12}$$
for any $-\infty < \eta < \infty$. (To prove (2.12) one has to repeat the arguments of the proof of Corollary 1.) Note now that the function $\varphi(t) = (e^{2t} - 1)^\gamma$ is representable in the form
$$\varphi(\lambda) = \int_0^\infty (\lambda - t)_+ \varphi''(t)\, dt \quad \text{and} \quad \varphi''(t) \geq 0 \text{ for } t \geq 0.$$
Since $\varphi(\log |\lambda|) = (|\lambda|^2 - 1)^\gamma$, the statement of Corollary 2 for $\gamma > 1$ follows from (2.12).
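Weyl's inequality (2.10) and Corollary 2 can be checked on a small matrix. The sketch below assumes a hypothetical 2x2 matrix `T`; the helpers `eig2` and `singular_values2` are illustrative closed-form routines, and the padding by 1 follows the convention stated after (2.10).

```python
import cmath
import math

def eig2(m):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - root) / 2, (tr + root) / 2]

def singular_values2(m):
    """Singular values of a 2x2 matrix: square roots of the eigenvalues of T*T."""
    (a, b), (c, d) = [[complex(x) for x in row] for row in m]
    p = abs(a) ** 2 + abs(c) ** 2                 # (T*T)_{11}
    q = abs(b) ** 2 + abs(d) ** 2                 # (T*T)_{22}
    r = a.conjugate() * b + c.conjugate() * d     # (T*T)_{12}
    mean, dev = (p + q) / 2, math.hypot((p - q) / 2, abs(r))
    return [math.sqrt(mean + dev), math.sqrt(max(mean - dev, 0.0))]

T = [[2.0, 3.0], [0.0, 1.5]]  # hypothetical example, eigenvalues 2 and 1.5

# Decreasing order; values not exceeding 1 are replaced by 1, as in the text.
lam = sorted((max(abs(z), 1.0) for z in eig2(T)), reverse=True)
s = sorted((max(x, 1.0) for x in singular_values2(T)), reverse=True)

products = [(math.prod(lam[:n]), math.prod(s[:n])) for n in (1, 2)]
GAMMA = 1.5
power_sums = [
    (sum((x * x - 1) ** GAMMA for x in lam[:n]),
     sum((x * x - 1) ** GAMMA for x in s[:n]))
    for n in (1, 2)
]
```

In this example the second singular value lies below 1, so the padded sequence is what makes the $n = 2$ comparison meaningful.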

5. Let $T$ be a compact operator in a Hilbert space and let $n(s, T)$ be the counting function of its s-numbers (eigenvalues of $\sqrt{T^* T}$),
$$n(s, T) = \operatorname{card}\{j : s_j > s\}, \qquad s > 0.$$
Then by the Ky Fan inequality (see [11]) for any pair of compact operators $T_1$ and $T_2$ and $s_1, s_2 > 0$,
$$n(s_1 + s_2, T_1 + T_2) \leq n(s_1, T_1) + n(s_2, T_2).$$
The class of operators $T$ for which
$$[T]_p^p := \sup_{s > 0} s^p\, n(s, T) < \infty$$
is called the weak Neumann-Schatten class $\Sigma_p$. Let $F$ be the Fourier transform
$$F f(\xi) = \int_{\mathbb{R}^d} e^{-i x \xi} f(x)\, dx.$$

Theorem 6 (M. Cwikel [5]). Let $\alpha$ and $\beta$ be the operators of multiplication by the functions $\alpha(\xi)$ and $\beta(x)$. Suppose that $\beta \in L^q(\mathbb{R}^d)$, $q > 2$, and let
$$[\alpha]_q^q = \sup_{t > 0} t^q \operatorname{meas}\{\xi \in \mathbb{R}^d : |\alpha(\xi)| > t\} < \infty.$$
Then the operator $T = \beta F^* \alpha$ (as well as the operator $\alpha F \beta$) is in $\Sigma_q$ and
$$[T]_q^q \leq C\, [\alpha]_q^q \int |\beta(x)|^q\, dx. \tag{2.13}$$
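The Ky Fan inequality for counting functions quoted above is easy to test on matrices. A minimal sketch, assuming two hypothetical real 2x2 matrices and an illustrative grid of thresholds (the helper names are not from the paper):

```python
import math

def singular_values2(m):
    """Singular values of a real 2x2 matrix via the eigenvalues of T^T T."""
    (a, b), (c, d) = m
    p, q, r = a * a + c * c, b * b + d * d, a * b + c * d
    mean, dev = (p + q) / 2, math.hypot((p - q) / 2, r)
    return [math.sqrt(mean + dev), math.sqrt(max(mean - dev, 0.0))]

def counting_function(s, m):
    """n(s, T): the number of singular values of T exceeding s."""
    return sum(1 for x in singular_values2(m) if x > s)

T1 = [[3.0, 1.0], [0.0, 0.2]]   # hypothetical examples
T2 = [[0.1, 0.0], [2.0, 1.0]]
T_SUM = [[T1[i][j] + T2[i][j] for j in range(2)] for i in range(2)]

GRID = [0.05 + 0.17 * k for k in range(25)]
ky_fan_holds = all(
    counting_function(s1 + s2, T_SUM)
    <= counting_function(s1, T1) + counting_function(s2, T2)
    for s1 in GRID
    for s2 in GRID
)
```

The same counting function gives the quantity $\sup_s s^p n(s, T)$ that defines the weak class, since for a finite matrix the supremum is approached just below the singular values.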



Proposition 5 (Birman-Schwinger principle [3,15]). Let $A$ and $V$ be two positive self-adjoint operators acting in the same Hilbert space. Suppose that $V$ is bounded and the operator $\sqrt{V}(A + I)^{-1/2}$ is compact. Then for $E > 0$, the number $N(E)$ of eigenvalues of the operator $A - V$ lying to the left of $-E$ satisfies the relation
$$N(E) = n\bigl(1, \sqrt{V}(A + E)^{-1}\sqrt{V}\bigr).$$
In applications, $A$ is a differential operator with constant coefficients and $V$ is the operator of multiplication by a function. Then applying Theorem 6 to the operator $T = \sqrt{V}(A + I)^{-1/2}$ one obtains sharp inequalities for $N(E)$.

6. In order to state the next result we need to introduce one more Neumann-Schatten class $S_p$ of compact operators. Namely, we say that $T \in S_p$, $p \geq 1$, if
$$\|T\|_p^p := \operatorname{tr}(T^* T)^{p/2} = \sum_j s_j^p < \infty.$$
It is easy to see that $S_p$ is a Banach space. The next theorem gives us a sufficient condition guaranteeing that an operator of the form $\beta(x)\alpha(i\nabla)$ belongs to the class $S_p$.

Theorem 7. Let $\alpha$ and $\beta$ be the operators of multiplication by $\alpha(\xi)$ and $\beta(x)$. Suppose that $\alpha, \beta \in L^p(\mathbb{R}^d)$, where $p \geq 2$. Then $T = \beta F^* \alpha \in S_p$ and
$$\|T\|_p^p \leq (2\pi)^{-d} \int |\alpha(\xi)|^p\, d\xi \int |\beta(x)|^p\, dx. \tag{2.14}$$
This theorem can be found in [16]. See also [12] and [17].

7. We now formulate a statement about eigenvalue estimates for a certain operator with constant coefficients perturbed by a potential $V$. It is one of the consequences of the inequality (2.13).

Proposition 6. Let $\alpha(\xi) = (|\xi|^2 - \mu)^2$, $V(x) \geq 0$ and $p > 1/2$. Suppose that $V \in L^{p + d/4}(\mathbb{R}^d) \cap L^{p + 1/2}(\mathbb{R}^d)$ if $d \geq 2$, or $V \in L^{p + 1/2}(\mathbb{R})$ if $d = 1$. Let $N(E)$ be the number of eigenvalues of the operator $\alpha(i\nabla) - V(x)$ lying to the left of the point $-E$, where $E > 0$. Then
$$N(E) \leq \frac{C}{E^p} \Bigl( \int_{\mathbb{R}^d} V^{p + d/4}\, dx + \mu^{d/2 - 1} \int_{\mathbb{R}^d} V^{p + 1/2}\, dx \Bigr), \quad \text{if } d \geq 2; \tag{2.15}$$
$$N(E) \leq \frac{C}{E^p \mu^{1/2}} \int_{\mathbb{R}} V^{p + 1/2}\, dx, \quad \text{if } d = 1. \tag{2.16}$$

Proof. It is an elementary application of the Cwikel estimate. Indeed, according to the Birman-Schwinger principle $N(E) = n(1, X)$, where $X$ is the compact operator defined by the equality
$$X = \sqrt{V}(\alpha(i\nabla) + E)^{-1}\sqrt{V}.$$
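The counting identity $N(E) = n(1, X)$ of the Birman-Schwinger principle can be illustrated with matrices. The sketch below assumes a hypothetical diagonal positive matrix for $A$ and a positive definite matrix for $V$; it uses the matrix $(A + E)^{-1/2} V (A + E)^{-1/2}$, which has the same nonzero eigenvalues as $\sqrt{V}(A + E)^{-1}\sqrt{V}$, so that no matrix square root of $V$ is needed.

```python
import math

def sym_eig2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, in increasing order."""
    (a, b), (_, d) = m
    mean, dev = (a + d) / 2, math.hypot((a - d) / 2, b)
    return [mean - dev, mean + dev]

A_DIAG = (0.5, 2.0)               # hypothetical positive operator A (diagonal)
V = [[3.0, 1.0], [1.0, 1.0]]      # hypothetical positive definite V

def count_below(energy):
    """N(E): number of eigenvalues of A - V lying to the left of -E."""
    h = [[A_DIAG[0] - V[0][0], -V[0][1]], [-V[1][0], A_DIAG[1] - V[1][1]]]
    return sum(1 for lam in sym_eig2(h) if lam < -energy)

def birman_schwinger_count(energy):
    """n(1, X), computed from (A + E)^{-1/2} V (A + E)^{-1/2}."""
    d = [1 / math.sqrt(a + energy) for a in A_DIAG]
    k = [[d[i] * V[i][j] * d[j] for j in range(2)] for i in range(2)]
    return sum(1 for mu in sym_eig2(k) if mu > 1)

ENERGIES = (0.1, 0.5, 1.0, 2.0, 3.0)
counts = [(count_below(e), birman_schwinger_count(e)) for e in ENERGIES]
```

In this example $A - V$ has a single negative eigenvalue near $-2.77$, so both counts drop from 1 to 0 between $E = 2$ and $E = 3$.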



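Let $\chi$ be the characteristic function of the ball $\{\xi \in \mathbb{R}^d : |\xi|^2 \leq \mu\}$. Let us split $X$ such that $X = X_1 + X_2$, where
$$X_2 = \sqrt{V}(\alpha(i\nabla) + E)^{-1} \chi(i\nabla) \sqrt{V}.$$
According to the Ky Fan inequality,
$$n(1, X) \leq n(1, 2X_1) + n(1, 2X_2). \tag{2.17}$$
Therefore it is sufficient to estimate each term in the right-hand side of (2.17) separately. We begin with the first term. Set $q_1 = p + d/4$. Then according to (2.13),
$$n(1, 2X_1) \leq C_0 \int V^{q_1}\, dx \int_{|\xi|^2 > \mu} \frac{d\xi}{((|\xi|^2 - \mu)^2 + E)^{q_1}} \leq C_1 \int V^{q_1}\, dx \int_\mu^\infty \frac{s^{d/2 - 1}\, ds}{((s - \mu)^2 + E)^{q_1}} \leq C_2 \int V^{q_1}\, dx \int_0^\infty \frac{s^{d/2 - 1}\, ds}{(s^2 + E)^{q_1}} = \frac{C}{E^p} \int_{\mathbb{R}^d} V^{p + d/4}\, dx.$$
In order to estimate the second term in (2.17) we set $q_2 = p + 1/2$. Using (2.13) again we find
$$n(1, 2X_2) \leq C_3 \int V^{q_2}\, dx \int_{|\xi|^2 < \mu} \frac{d\xi}{((|\xi|^2 - \mu)^2 + E)^{q_2}} \ \cdots$$
[...]
$$\int_{-\infty}^{\infty} \frac{d\xi}{((\xi^2 - \mu)^2 + E)^q} \leq \int_{-\infty}^{\infty} \frac{d\xi}{((|\xi| - \sqrt{\mu})^2 \mu + E)^q} = \frac{C}{\sqrt{\mu}\, E^{q - 1/2}},$$
where
$$C = \int_{-\infty}^{\infty} \frac{ds}{(s^2 + 1)^q}.$$
If now $q = p + 1/2$, then by using (2.13) we arrive at
$$N(E) \leq \frac{C \int V^q\, dx}{\sqrt{\mu}\, E^{q - 1/2}} = \frac{C \int V^{p + 1/2}\, dx}{\sqrt{\mu}\, E^p},$$
which means that (2.16) is also proven.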



3. Proof of Theorem 1

The main tool of the proof is the linear fractional mapping that takes the upper half-plane $\{z : \operatorname{Im} z > 0\}$ into the complement of the unit disk $\{z : |z| > 1\}$, given by the formula
$$z \mapsto \frac{z + i + 1}{z - i + 1}.$$
Insert the operator $H = -\Delta + V$ instead of $z$ into this formula, i.e. consider the operator
$$U = (H + I + i)(H + I - i)^{-1} = I + 2i(H + I - i)^{-1}.$$
Obviously $z \notin \mathbb{R}$ is an eigenvalue of the operator $H$ if and only if $(z + i + 1)/(z - i + 1)$ is an eigenvalue of $U$. Clearly $U^* = I - 2i(H^* + I + i)^{-1}$, and therefore
$$U^* U = I + 2i(H + I - i)^{-1} - 2i(H^* + I + i)^{-1} + 4(H^* + I + i)^{-1}(H + I - i)^{-1}.$$
Using the Hilbert identity, we obtain
$$U^* U = I + 2i(H^* + I + i)^{-1}(H^* - H)(H + I - i)^{-1},$$
and since $H^* - H = -2i \operatorname{Im} V$,
$$U^* U = I + 4(H^* + I + i)^{-1} (\operatorname{Im} V) (H + I - i)^{-1}.$$
In particular, this implies
$$U^* U - I \leq 4 Y^* Y,$$
where $Y = \sqrt{(\operatorname{Im} V)_+}\,(H + I - i)^{-1}$. By using Corollary 2 the eigenvalues $\lambda_j$ of the operator $H$ satisfy the inequality
$$\sum_j \Bigl( \Bigl| \frac{\lambda_j + 1 + i}{\lambda_j + 1 - i} \Bigr|^2 - 1 \Bigr)_+^p \leq \operatorname{tr}(U^* U - I)_+^p \leq 4^p \operatorname{tr}(Y^* Y)^p = 4^p \|Y\|_{2p}^{2p}.$$
It follows from this inequality that
$$\sum_j \Bigl( \frac{\operatorname{Im} \lambda_j}{|\lambda_j + 1|^2 + 1} \Bigr)_+^p \leq \|Y\|_{2p}^{2p}. \tag{3.1}$$
Indeed, denote $a = 2 \operatorname{Im} \lambda_j / (|\lambda_j + 1|^2 + 1)$ and suppose that $\operatorname{Im} \lambda_j > 0$. Then
$$\Bigl| \frac{\lambda_j + 1 + i}{\lambda_j + 1 - i} \Bigr|^2 - 1 = \frac{1 + a}{1 - a} - 1 \geq 2a.$$
We come to the conclusion that one needs to estimate the norm of the operator
$$Y = \sqrt{(\operatorname{Im} V)_+}\,(H + I - i)^{-1}$$



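in the class $S_{2p}$. Let us represent this operator in the form
$$Y = \sqrt{(\operatorname{Im} V)_+}\,(-\Delta + I)^{-1/2} B, \quad \text{where } B = (-\Delta + I)^{1/2}(H + I - i)^{-1}.$$
We will show that the operator $B$ is bounded and its norm does not exceed 1. In other words, we will show that
$$\|(-\Delta + I)^{1/2}(H + I - i)^{-1} f\|^2 \leq \|f\|^2, \tag{3.2}$$
for all $f \in L^2$. Denote $u = (H + I - i)^{-1} f$. It is obvious that
$$\int_{\mathbb{R}^d} \bigl(|\nabla u|^2 + (1 + \operatorname{Re} V(x))|u|^2\bigr)\, dx = \operatorname{Re} \int_{\mathbb{R}^d} f \bar{u}\, dx.$$
Due to the condition $\operatorname{Re} V \geq 0$, we obtain from this relation that
$$\int_{\mathbb{R}^d} (|\nabla u|^2 + |u|^2)\, dx \leq \frac{1}{2} \int_{\mathbb{R}^d} (|f|^2 + |u|^2)\, dx.$$
The latter inequality can be written in the form
$$\int_{\mathbb{R}^d} (2|\nabla u|^2 + |u|^2)\, dx \leq \int_{\mathbb{R}^d} |f|^2\, dx.$$
Replacing 2 by a smaller number we will make the inequality weaker. As a result we obtain the estimate
$$\|(-\Delta + I)^{1/2} u\|^2 \leq \|f\|^2. \tag{3.3}$$
It remains to note that (3.3) is equivalent to (3.2).

Let us summarize the results. Since
$$Y = \sqrt{(\operatorname{Im} V)_+}\,(H + I - i)^{-1} = \sqrt{(\operatorname{Im} V)_+}\,(-\Delta + I)^{-1/2} B \quad \text{and} \quad \|B\| \leq 1,$$
we obtain
$$\|Y\|_{2p} \leq \bigl\|\sqrt{(\operatorname{Im} V)_+}\,(-\Delta + I)^{-1/2}\bigr\|_{2p}. \tag{3.4}$$
On the other side, according to Theorem 7,
$$\bigl\|\sqrt{(\operatorname{Im} V)_+}\,(-\Delta + I)^{-1/2}\bigr\|_{2p}^{2p} \leq (2\pi)^{-d} C_0 \int (\operatorname{Im} V)_+^p\, dx,$$
where $C_0 = \int_{\mathbb{R}^d} (\xi^2 + 1)^{-p}\, d\xi$. Combining (3.4) with (3.1), we complete the proof of Theorem 1.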
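The operator identity for $U^* U - I$ used in this proof, and the scalar inequality behind (3.1), can both be verified numerically. This is a minimal sketch assuming a hypothetical Hermitian 2x2 matrix `A` in place of $-\Delta$ and a complex diagonal potential `V_DIAG`; all names and sample values are illustrative.

```python
def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.5], [0.5, 2.0]]            # hypothetical Hermitian stand-in for -Delta
V_DIAG = [0.3 + 0.7j, -0.2 + 0.4j]      # hypothetical complex potential (diagonal)
H = [[A[i][j] + (V_DIAG[i] if i == j else 0) for j in range(2)] for i in range(2)]

# R = (H + I - i)^{-1},  U = I + 2i R
R = inv2([[H[i][j] + (1 - 1j) * IDENTITY[i][j] for j in range(2)] for i in range(2)])
U = [[IDENTITY[i][j] + 2j * R[i][j] for j in range(2)] for i in range(2)]

UstarU = matmul(adjoint(U), U)
IM_V = [[V_DIAG[0].imag, 0.0], [0.0, V_DIAG[1].imag]]
rhs = matmul(matmul(adjoint(R), IM_V), R)   # (H* + I + i)^{-1} Im V (H + I - i)^{-1}
identity_error = max(
    abs(UstarU[i][j] - IDENTITY[i][j] - 4 * rhs[i][j])
    for i in range(2) for j in range(2)
)

def cayley(z):
    return (z + 1 + 1j) / (z + 1 - 1j)

def a_of(z):
    return 2 * z.imag / (abs(z + 1) ** 2 + 1)

SAMPLES = [0.5 + 0.1j, -3 + 2j, 10 + 0.01j]   # points of the upper half-plane
scalar_checks = [(abs(cayley(z)) ** 2 - 1, a_of(z)) for z in SAMPLES]
```

For each sample point the first entry of `scalar_checks` equals $2a/(1 - a)$, which is where the lower bound $2a$ comes from.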



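4. Proof of Theorem 2

The main arguments in the proof of this result remain the same apart from the estimate of the norm $\|Y\|_{2p}$ of the operator $Y$. Recall that
$$\sum_j \Bigl( \frac{\operatorname{Im} \lambda_j}{|\lambda_j + 1|^2 + 1} \Bigr)_+^p \leq \|Y\|_{2p}^{2p}, \tag{4.1}$$
where $Y = \sqrt{(\operatorname{Im} V)_+}\,(H + I - i)^{-1}$. In order to find a bound for the s-numbers of the operator $Y$ we represent it in the form
$$Y = \sqrt{(\operatorname{Im} V)_+}\,(-\Delta + I - i)^{-1}\bigl(I - V(H + I - i)^{-1}\bigr).$$
In the previous section we have found that
$$(H + I - i)^{-1} = (-\Delta + I)^{-1/2} B \quad \text{and} \quad \|B\| \leq 1.$$
Consequently, $\|(H + I - i)^{-1}\| \leq 1$, and this means that
$$\|Y\|_{2p} \leq \bigl\|\sqrt{(\operatorname{Im} V)_+}\,(-\Delta + I - i)^{-1}\bigr\|_{2p} (1 + \|V\|_\infty).$$
By using Theorem 7 we obtain that for any $p > d/4$,
$$\bigl\|\sqrt{(\operatorname{Im} V)_+}\,(-\Delta + I - i)^{-1}\bigr\|_{2p}^{2p} \leq (2\pi)^{-d} C_0 \int (\operatorname{Im} V)_+^p\, dx, \quad \text{where } C_0 = \int_{\mathbb{R}^d} \frac{d\xi}{((\xi^2 + 1)^2 + 1)^p}.$$
Consequently,
$$\|Y\|_{2p}^{2p} \leq (2\pi)^{-d} (1 + \|V\|_\infty)^{2p} C_0 \int (\operatorname{Im} V)_+^p\, dx. \tag{4.2}$$
Consequently (4.1) and (4.2) imply (1.3).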



5. Proof of Theorem 3 and Some Related Results Proof of Theorem 3. Assume that λ j ∈ b are enumerated in the order of decreasing imaginary parts. Note that the theorem would be proven, if instead of the infinite sum in the left-hand side of (1.5), we estimated a partial sum m

2γ −1

1

|λ j |γ ≤ C|b (W )| 2r −1 (b + |b (W )| 2r −1 ),

λ j ∈ b .

(5.1)

j=1

On the other hand, it is sufficient to prove the estimate (5.1) for the case when V ∈ C0∞ (Rd ). Indeed, if V ∈ / C0∞ (Rd ) , then we can always find a sequence Vn of C0∞ -functions that converges to V in L d/2−1+2r (Rd ) ∩ L r (Rd ). Obviously the corresponding sequence of quantities b (Wn ) (here Wn = (|Vn |2 + 4Vn )+ ) will converge to b (W ) in this case. Moreover, due to (trivially modified) Propositions 3 and 4, the non-real eigenvalues λ j of H will be the limits of the sequences of non-real eigenvalues λ j (n) of Hn = −+ Vn , which implies that mj=1 |λ j |γ = limn→∞ mj=1 |λ j (n)|γ . Note that convergence of (− + γ )−1/2 Vn (− + γ )−1/2 to (− + γ )−1/2 V (− + γ )−1/2 is guaranteed by Theorem 7. Corollary 1 plays an essential role in the proof, as well as a trick relating the eigenvalues of the operator H = −+V and the eigenvalues of the operator (−+2i −µ+V )2 , µ > 0, lying to the left of z = −4. Indeed, let λ j be eigenvalues of the operator −+V lying in the hyperbolic domain Dµ = {z : (z + 2)2 − (z − µ)2 ≥ 4, z > 0}, then (λ j − µ + 2i)2 are eigenvalues of the operator (− − µ + 2i + V )2 , and it is easy to see that (λ j − µ + 2i)2 = (λ j − µ)2 − (λ j + 2)2 ≤ −4,

∀λ j ∈ Dµ .

Consequently, due to Corollary 1, n n 2 sj , (λ j − µ + 2i) + 4 ≤ 1

(5.2)

1

where s j are eigenvalues of the operator T1 = (− − µ)2 + V1 (− − µ) + (− − µ)V1 + V12 − V22 − 4V2 , where V1 = V and V2 = V are the real and the imaginary parts of the potential. The inequality (5.2) takes care of all eigenvalues from the domain Dµ . It turns out that we do not need all of them, but only the eigenvalues λ j lying inside the domain µ = {z : (z + 1)2 − (z − µ)2 ≥ 1, z > 0}. Note that the boundaries of both domains Dµ and µ touch the real line at the point z = µ. Note also that µ ⊂ Dµ and therefore this might imply that bounds on eigenvalues lying in µ are better than those in Dµ . It turns out that the imaginary parts of eigenvalues in µ can be estimated in terms of real parts of eigenvalues of the operator (H − µ + 2i)2 + 4. A similar trick was used by Davies in [6] to obtain individual inequalities for the eigenvalues of the operator H .



Let us study the relation between the spectra of the operators H and (H − µ + 2i)2 in more detail. Assume that λ j ∈ µ and λ j > s. Then 2(λ j − s) ≤ (λ j + 1)2 − (λ j − µ)2 − 1 + 2(λ j − s) = (λ j + 2)2 − (λ j − µ)2 − 4 − 2s = −(λ j − µ + 2i)2 − 4 − 2s. Due to Corollary 1 it means that 2 (λ j − s)+ ≤ tr (H − µ + 2i)2 + 4 + 2s ≤ tr (T1 + 2s)− . −

λ j ∈µ

Now, we represent the operator $T_1$ in the form
$$T_1 = \frac{1}{2}(-\Delta - \mu)^2 + \Bigl( \frac{1}{\sqrt{2}}(-\Delta - \mu) + \sqrt{2}\, V_1 \Bigr)^2 - 4V_2 - V_1^2 - V_2^2.$$
Since the operator
$$\Bigl( \frac{1}{\sqrt{2}}(-\Delta - \mu) + \sqrt{2}\, V_1 \Bigr)^2 \geq 0$$
is positive, we obtain that the spectrum of the operator $T_1$ can be estimated by the spectrum of the operator
$$T_2 = \frac{1}{2}(-\Delta - \mu)^2 - |V|^2 - 4V_2.$$
Thus,

2

(λ j − s)+ ≤ tr (T2 + 2s)− .

(5.3)

λ j ∈µ

Let $\tau_j$ be negative eigenvalues of $T_2$. In order to estimate the right-hand side of (5.3) we apply Proposition 6 according to which the number $N(E)$ of eigenvalues of $T_2$ lying to the left of the point $-E$ satisfies the inequality
$$N(E) \leq \frac{C}{E^p} \Bigl( \int_{\mathbb{R}^d} W^{d/4 + p}\, dx + \mu^{d/2 - 1} \int_{\mathbb{R}^d} W^{1/2 + p}\, dx \Bigr) \tag{5.4}$$
with $p > 1/2$ and $d \geq 2$. If now $q > p > 1/2$ then
$$\sum_j |\tau_j|^q = q \int_0^\infty E^{q - 1} N(E)\, dE \leq C \Bigl( \int_{\mathbb{R}^d} W^{d/4 + p}\, dx + \mu^{d/2 - 1} \int_{\mathbb{R}^d} W^{1/2 + p}\, dx \Bigr) |\lambda_1|^{q - p}.$$

From (5.4) it follows that the lowest eigenvalue τ1 satisfies the inequality |τ1 |r −1/2 ≤ C W d/4+r −1/2 d x + µd/2−1 W r d x = Cµ (W ). Rd

Rd



Hence for q > p > 1/2 and r > 1 we arrive at |τ j |q ≤ C W d/4+ p d x + µd/2−1 W 1/2+ p d x |µ (W )|2(q− p)/(2r −1) . Rd

j

Rd

Recall that

2

(λ j − s)+ ≤

λ j ∈µ

and therefore (λ j − s)+ ≤ C λ j ∈µ

(W − 2s)+

+µ

(τ j + 2s)− ,

j

d/4+ p

Rd

d/2−1

Rd

dx

1/2+ p (W −2s)+ dx

|µ (W )|

2(1− p) 2r −1

=: F(s, µ)

(5.5)

with 1/2 < p < 1 and r > 1. Let now b be the semi-infinite strip {z : 0 < z < b, z > 0}. Since the boundary of a domain µ touches the real line parabolically, it is obvious, that for small values of s < ε0 , the set b (s) of √ all points z ∈ b whose z > s > 0 can be covered by not more than m(b) = [Cb/ s] + 1 sets of the form µ . Since µ contains the sector z > |z − µ|, we obtain that the number of domains µ covering √ the√set b (s) can be also estimated by [b/s] + 1 for any s > 0. Finally, note that 1/ s ≥ ε0 /s for s ≥ ε0 . Therefore without loss of generality one can assume that √ m(b) = [Cb/ s] + 1, ∀s > 0. 2

Since λ j ≤ C|b (W )| 2r −1 for any λ j ∈ b , we obtain

(λ j − s)+ ≤

m(b)

λ j ∈b

1

(λ j − s)+ ≤ C

l=1 λ j ∈µl

Obviously

γ

|λ j | = γ (γ − 1)

λ j ∈b

which leads to

λ j ∈b

1

∞

(b + |b (W )| 2r −1 ) F(s, b). √ s

(λ j − s)+ s γ −2 ds,

0

|λ j |γ ≤ (b + |b (W )| 2r −1 )C

∞

s γ −5/2 F(s, b)ds.

0

λ j ∈b

The integral in the right-hand side converges only if γ > 3/2 and by using the notation introduced in (5.5) we finally obtain 2δ 1 |λ j |γ ≤ C|b (W )| 2r −1 (b + |b (W )| 2r −1 ) λ j ∈b

×

Rd

|W |d/4−1/2+γ −δ d x + +bd/2−1

Rd

|W |γ −δ d x ,

where 0 < δ < 1/2. It remains to set r = γ − δ to complete the proof.
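The last step of the proof rests on the elementary identity $t^\gamma = \gamma(\gamma - 1)\int_0^\infty (t - s)_+ s^{\gamma - 2}\, ds$ for $t > 0$ and $\gamma > 1$, applied to each eigenvalue. A minimal numerical sketch of this identity, assuming illustrative values $t = 1.7$ and $\gamma = 2.5$ and a plain midpoint rule:

```python
def layer_cake(t, gamma, n_steps=20000):
    """Midpoint-rule value of gamma * (gamma - 1) * integral_0^t (t - s) s^(gamma - 2) ds.

    The integrand (t - s)_+ vanishes for s > t, so the integral over (0, infinity)
    reduces to an integral over (0, t).
    """
    h = t / n_steps
    total = 0.0
    for k in range(n_steps):
        s = (k + 0.5) * h
        total += (t - s) * s ** (gamma - 2) * h
    return gamma * (gamma - 1) * total

value = layer_cake(1.7, 2.5)
exact = 1.7 ** 2.5
```

For $1 < \gamma < 2$ the integrand has an integrable singularity at $s = 0$, so a plain midpoint rule converges more slowly there; the sample value above avoids that case.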



We have proved inequalities for |λ j |γ with γ > 3/2. However, (5.5) allows us to obtain a bound on eigenvalues belonging to µ with γ = 1. Corollary 3. Let λ j be the eigenvalues of the operator − + V lying inside µ = {z : (z + 1)2 − (z − µ)2 ≥ 1, z > 0} and let d ≥ 2. Then 1− p d/4+ p d/2−1 1/2+ p |λ j | ≤ C W dx + µ W d x ||W ||∞ Rd

j

Rd

for any 1/2 < p < 1. Similarly we can show Corollary 4. Let d = 1 and let λ j be the eigenvalues of the operator −d 2 /d x 2 + V lying inside µ = {z : (z + 1)2 − (z − µ)2 ≥ 1, z > 0}. Then 1− p |λ j | ≤ C||W ||∞ µ−1/2 W 1/2+ p d x Rd

j

for any 1/2 < p < 1. Unfortunately, if d = 1 then in order to obtain similar results we have to avoid the point z = 0 and in this case we deal with the strip a < z < b, a > 0. However, this is no longer true if γ > 7/4. Indeed: Theorem 8. Let λ j be the eigenvalues of the operator H = −d 2 /d x 2 + V lying inside the semi-infinite strip b = {z : 0 < z < b, z > 0}. Then for any γ > 7/4, r ∈ (γ − 21 , γ ) and V ∈ L ∞ (Rd ) such that W = (|V |2 + 4V )+ ∈ L r −1/4 we have

γ

|λ j | ≤

γ −r C||W ||∞ (b

λ j ∈b

1/2 + W ∞ )

r −1/4

Rd

|W |

dx .

Proof. The inequality (5.5) could be easily modified and we can obtain that for 1/2 < p < 1, 1/2+ p 1− p (λ j − s) ≤ C µ−1/2 (W − 2s)+ d x W ∞ =: F(s, µ), (5.6) Rd

λ j ∈µ 1− p

where W ∞ appears when we estimate the lowest eigenvalue λ1 of the operator T2 = 21 (−d 2 /d x 2 − µ)2 − W . Now consider the part of the strip b = {z : 0 < z < b, z > 0} satisfying z > s > 0. We cover it by sets µ , µ ∈ R+ . While doing this, we avoid the value µ = √0 taking µ as large as possible. The optimal choice of such µ would be µ = µ0 = s 2 + 2s, so that µ0 satisfies the equation (s + 1)2 − µ20 = 1. Thus, without √ loss of generality, we can assume that µ ≥ s 2 + 2s. Arguing as in the proof of Theorem 3, we find that√set of all points z ∈ b whose z > s can be covered by not more than m(b) = [Cb/ s] + 1 sets of the form µ .



Since there is no λ j ∈ b satisfying λ j > W ∞ , we obtain

(λ j − s)+ ≤

λ j ∈b

m(b)

1/2

b + ||W ||∞ F(s, µ0 ). √ s

(λ j − s)+ ≤ C

l=1 λ j ∈µl

Therefore

|λ j |γ = γ (γ − 1)

∞

(λ j − s)+ s γ −2 ds

λ j ∈b 0

λ j ∈b

1/2

≤ (b + ||W ||∞ )C

∞

s γ −5/2 F(s,

√

s) ds.

0

The integral in the right-hand side converges only if γ > 7/4 and using (5.6) we arrive at 1− p 1/2 |λ j |γ ≤ C||W ||∞ (b + ||W ||∞ ) |W |γ −5/4+ p d x Rd

λ j ∈b

with 1/2 < p < 1. It remains to set r = γ + p − 1 to complete the proof.

6. Proof of Theorem 5

Theorem 5 has already been proved for $d = p = 1$ (see [1,7]). Consider first the case when $p > \max\{1, d/2\}$. By using the Birman-Schwinger principle, we find that the value $\lambda \notin \mathbb{R}_+$ is an eigenvalue of the operator $H$ if and only if 1 is an eigenvalue of the operator
$$X = |V|^{1/2}(-\Delta - \lambda)^{-1}|V|^{-1/2} V,$$
and thus $\|X\| \geq 1$. Note now that $\|X\| \leq \|X\|_p \leq \|Q\|_{2p}^2$, where $Q = |V|^{1/2}\, |{-\Delta} - \lambda|^{-1/2}$. Using Theorem 7 we obtain that
$$1 \leq \|Q\|_{2p}^{2p} \leq (2\pi)^{-d} \int_{\mathbb{R}^d} |V|^p\, dx \int_{\mathbb{R}^d} \frac{d\xi}{||\xi|^2 - \lambda|^p}.$$
Assuming that $p > d/2$ there is a constant $C$ such that
$$J = \int_{\mathbb{R}^d} \frac{d\xi}{||\xi|^2 - \lambda|^p} = |\lambda|^{d/2 - p} \int_{\mathbb{R}^d} \frac{d\xi}{||\xi|^2 - e^{i\varphi}|^p} \leq C |\lambda|^{d/2 - p} |\sin \varphi|^{1 - p},$$
where $\varphi = \arg \lambda$, and consequently,
$$J \leq C |\operatorname{Im} \lambda|^{1 - p} |\lambda|^{d/2 - 1}.$$



It remains to note that
$$1 \leq (2\pi)^{-d} J \int_{\mathbb{R}^d} |V|^p\, dx.$$
In order to prove Theorem 5 for $p = d/2 > 1$ we use Theorem 6 instead of Theorem 7. Indeed, let
$$a(\xi) = \frac{1}{||\xi|^2 - \lambda|} \quad \text{and} \quad p = d/2 > 1.$$
Then, using homogeneity, we obtain
$$[a]_p^p = [a_0]_p^p, \quad \text{where } a_0 = \frac{1}{|\xi^2 - e^{i\varphi}|}.$$
There is a constant $C > 0$ such that
$$[a]_p^p = [a_0]_p^p \leq C |\sin \varphi|^{1 - p} = C \Bigl| \frac{\operatorname{Im} \lambda}{\lambda} \Bigr|^{1 - p}.$$
It remains to note that, if $\lambda$ is an eigenvalue of $H$, then
$$1 \leq C [a]_p^p \int_{\mathbb{R}^d} |V|^p\, dx, \qquad p = d/2.$$
The proof is complete.

7. Individual Eigenvalue Estimates

Let us now consider a Schrödinger operator $H = -\Delta + iV(x)$ whose potential is purely imaginary. Besides, we assume that $V \geq 0$ and $\lim_{|x| \to \infty} V(x) = 0$. Our first statement concerns the case $d = 3$.

Theorem 9. Let $V \in L^1(\mathbb{R}^3)$, $V \geq 0$, and let $z = k^2 \notin \mathbb{R}_+$ be an eigenvalue of $H = -\Delta + iV(x)$. Then
$$\frac{\operatorname{Re} k}{4\pi} \int_{\mathbb{R}^3} V(x)\, dx \geq 1. \tag{7.1}$$
In particular, this shows that if $\int_{\mathbb{R}^3} V\, dx$ is small, then the real part of the square root of the eigenvalue of $H$ is large. That implies that non-real eigenvalues of $-\Delta + itV$ escape any compact subset of $\mathbb{C}$, as $t \to 0$. It does not necessarily imply that the eigenvalues tend to infinity as $t \to 0$, because they might simply reach the positive real semi-axis for some $t > 0$ (see Theorem 15).

Proof of Theorem 9. By using the Birman-Schwinger principle we find that $z = k^2 \notin \mathbb{R}_+$ is an eigenvalue of the operator $H = -\Delta + iV$ if and only if the operator
$$X = -i \sqrt{V}(-\Delta - z)^{-1} \sqrt{V} \tag{7.2}$$
has an eigenvalue 1.



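Suppose that $\operatorname{Im} z > 0$. Then the real part of the operator $X$ is positive and, consequently, the spectrum of this operator lies in the right half plane. Therefore if $z$ is an eigenvalue of $H$, then
$$\sum_j \operatorname{Re} \zeta_j \geq 1,$$
where $\zeta_j$ are eigenvalues of $X$. On the other side,
$$\sum_j \operatorname{Re} \zeta_j \leq \operatorname{tr} \operatorname{Re} X = \int_{\mathbb{R}^3} \tau(x, x)\, dx,$$
where $\tau(x, y)$ is the integral kernel of the operator $\operatorname{Re} X$. Since the kernel of the operator $(-\Delta - z)^{-1}$ equals
$$g(x, y) = \frac{e^{ik|x - y|}}{4\pi |x - y|},$$
we obtain that the kernel of the operator $\operatorname{Im} (-\Delta - z)^{-1}$ equals $g_0(x, y) = (2i)^{-1}\bigl(g(x, y) - \overline{g(y, x)}\bigr)$, whose diagonal values are
$$g_0(x, x) = \frac{k + \bar{k}}{8\pi} = \frac{\operatorname{Re} k}{4\pi}.$$
Finally
$$\operatorname{tr} \operatorname{Re} X = \int_{\mathbb{R}^3} V(x)\, g_0(x, x)\, dx = \frac{\operatorname{Re} k}{4\pi} \int_{\mathbb{R}^3} V(x)\, dx$$
implies (7.1).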
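The diagonal value of $g_0$ quoted in the proof can be obtained by a short expansion. The following is a sketch under the assumption that $g_0$ is the kernel of the imaginary part of the resolvent, that is, $g_0(x, y) = (2i)^{-1}\bigl(g(x, y) - \overline{g(y, x)}\bigr)$, with $r = |x - y|$ introduced here as shorthand:

```latex
\begin{aligned}
g_0(x,y) &= \frac{1}{2i}\cdot\frac{e^{ikr} - e^{-i\bar{k}r}}{4\pi r},
  \qquad r = |x-y|,\\
e^{ikr} - e^{-i\bar{k}r} &= i\,(k+\bar{k})\,r + O(r^2) \quad (r \to 0),\\
g_0(x,x) &= \lim_{r\to 0}\frac{i\,(k+\bar{k})\,r}{8\pi i\, r}
  = \frac{k+\bar{k}}{8\pi} = \frac{\operatorname{Re} k}{4\pi}.
\end{aligned}
```

The singular parts $1/(4\pi r)$ of $g$ and of its conjugate cancel in the difference, which is why the diagonal of $g_0$ is finite even though the diagonal of $g$ is not.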

Corollary 5. Let $d = 3$ and let $V \in L^1(\mathbb{R}^3)$ be a positive function. Then non-real eigenvalues of $-\Delta + iV$ do not accumulate to zero.

Using the same approach we obtain the following two results in dimensions $d = 1$ and $d = 2$.

Theorem 10. Let $d = 1$ and let $z = k^2 \notin \mathbb{R}_+$ be an eigenvalue of the operator $H = -\Delta + iV$, $V \geq 0$, $V \in L^1(\mathbb{R})$. Then
$$\frac{\operatorname{Re} k}{2|k|^2} \int_{\mathbb{R}} V(x)\, dx \geq 1,$$
which means that $k$ lies inside the circle of radius $4^{-1} \int_{\mathbb{R}} V(x)\, dx$ with the centre at $4^{-1} \int_{\mathbb{R}} V(x)\, dx$.

It is interesting to observe that if $d = 2$ then the eigenvalues do not appear at all if the integral of $V$ is small.



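Theorem 11. Let $d = 2$ and let $z \notin \mathbb{R}_+$ be an eigenvalue of $H = -\Delta + iV$, $V \geq 0$, $V \in L^1(\mathbb{R}^2)$. Then
$$\frac{1}{2} \Bigl( \frac{\pi}{2} + \arctan(\operatorname{Re} z / \operatorname{Im} z) \Bigr) \int V(x)\, dx \geq 1.$$
In particular, the spectrum of $H$ is real if
$$\frac{\pi}{2} \int V(x)\, dx < 1.$$

Proof. In order to prove this statement we just notice that if $X$ is the Birman-Schwinger operator (7.2) defined in the proof of Theorem 9, then
$$\operatorname{tr} \operatorname{Re} X = \int V(x)\, dx \int_{\mathbb{R}^2} \operatorname{Im} \frac{1}{2\pi(|\xi|^2 - z)}\, d\xi.$$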

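The next result deals with some properties of complex eigenvalues of Schrödinger operators in higher dimensions $d \geq 4$.

Theorem 12. Let $d \geq 4$ and let $z \notin \mathbb{R}_+$ be an eigenvalue of $H = -\Delta + iV$ with $V \geq 0$. Then
$$\omega_{d-1}\, |\operatorname{Re} z + 2\|V\|_\infty|^{(d-2)/2}\, (2\pi)^{-d+1} \int V(x)\, dx \geq 2, \tag{7.3}$$
where $\omega_{d-1}$ is the area of the unit sphere $S^{d-1}$.

Proof. If as before $X$ is the Birman-Schwinger operator introduced in (7.2) and $z$ is an eigenvalue of the operator $H$, then $1/2$ is an eigenvalue of the operator $X - 1/2$. Consequently,
$$\operatorname{tr}(\operatorname{Re} X - 1/2)_+ \geq 1/2. \tag{7.4}$$
Indeed, for the eigenvalues $\lambda_j$ of the operator $X$ we have
$$\sum_j (\operatorname{Re} \lambda_j - 1/2)_+ \leq \operatorname{tr}(\operatorname{Re} X - 1/2)_+.$$
Therefore the eigenvalue sum in the left-hand side is not less than $1/2$. Obviously,
$$(\operatorname{Re} X - 1/2)_+ \leq \Bigl( \operatorname{Re} X - \frac{V}{2\|V\|_\infty} \Bigr)_+ = \Bigl( \sqrt{V} \Bigl( \operatorname{Im}(-\Delta - z)^{-1} - \frac{1}{2\|V\|_\infty} \Bigr) \sqrt{V} \Bigr)_+.$$
Consequently, using (7.4) we have
$$1/2 \leq (2\pi)^{-d} \int_{\mathbb{R}^d} V(x)\, dx \int_{\mathbb{R}^d} \Bigl( \frac{\operatorname{Im} z}{(|\xi|^2 - \operatorname{Re} z)^2 + (\operatorname{Im} z)^2} - \frac{1}{2\|V\|_\infty} \Bigr)_+ d\xi.$$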


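The integration in the last integral is carried out over the domain where
$$|\xi|^2 \leq \operatorname{Re} z + \sqrt{(2\|V\|_\infty - \operatorname{Im} z)_+ \operatorname{Im} z} \leq \operatorname{Re} z + 2\|V\|_\infty.$$
Therefore
$$\omega_{d-1}^{-1} \int_{\mathbb{R}^d} \Bigl( \frac{\operatorname{Im} z}{(|\xi|^2 - \operatorname{Re} z)^2 + (\operatorname{Im} z)^2} - \frac{1}{2\|V\|_\infty} \Bigr)_+ d\xi \leq 2^{-1} \pi\, |\operatorname{Re} z + 2\|V\|_\infty|^{(d-2)/2}, \tag{7.5}$$
and we obtain (7.3).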

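We now obtain some results involving $L^p$ norms of potentials with $p > 1$.

Theorem 13. Let $d \geq 3$ and let $V \geq 0$. Suppose that $z \notin \mathbb{R}$ is an eigenvalue of $H = -\Delta + iV$. Then there are positive constants $C_1$ and $C_2$ depending only on $d$ and $\gamma \geq 0$ such that
$$|\operatorname{Im} z|^\gamma \leq \Bigl( C_1 + C_2 \Bigl( \frac{\operatorname{Re} z}{\operatorname{Im} z} \Bigr)^{d/2 - 1} \Bigr) \int V^{d/2 + \gamma}\, dx. \tag{7.6}$$

Proof. Let as before
$$X = -i \sqrt{V}(-\Delta - z)^{-1} \sqrt{V}.$$
If $z$ is an eigenvalue of $H$, then there is at least one eigenvalue of the operator $\operatorname{Re} X$ that is not less than 1. If by $s_j$ we denote the eigenvalues of the operator $\operatorname{Re} X$, then this implies
$$\sup_{s > 0} s^{d/2 + \gamma} \operatorname{card}\{j : s_j > s\} \geq 1.$$
This supremum is related to the norm in the weak Neumann-Schatten class $\Sigma_{d/2 + \gamma}$ and, due to Theorem 6, it can be estimated by
$$\int V^{d/2 + \gamma}\, dx \int_{\mathbb{R}^d} \Bigl( \frac{\operatorname{Im} z}{(\xi^2 - \operatorname{Re} z)^2 + (\operatorname{Im} z)^2} \Bigr)^{d/2 + \gamma} d\xi. \tag{7.7}$$
We conclude the proof by estimating the latter integral,
$$\int_{\mathbb{R}^d} \Bigl( \frac{\operatorname{Im} z}{(\xi^2 - \operatorname{Re} z)^2 + (\operatorname{Im} z)^2} \Bigr)^{d/2 + \gamma} d\xi \leq C \int_{-\infty}^{\infty} \Bigl( \frac{\operatorname{Im} z}{s^2 + (\operatorname{Im} z)^2} \Bigr)^{d/2 + \gamma} |s|^{d/2 - 1}\, ds + C |\operatorname{Re} z|^{d/2 - 1} \int_{-\infty}^{\infty} \Bigl( \frac{\operatorname{Im} z}{s^2 + (\operatorname{Im} z)^2} \Bigr)^{d/2 + \gamma} ds \leq \Bigl( C_1 + C_2 \Bigl( \frac{\operatorname{Re} z}{\operatorname{Im} z} \Bigr)^{d/2 - 1} \Bigr) |\operatorname{Im} z|^{-\gamma}.$$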



Applying this result for the case $\gamma = 0$ we obtain:

Corollary 6. Let $d \geq 3$ and let $C_1$ be the constant in (7.6). If $C_1 \int V^{d/2}\, dx < 1$, then the eigenvalues of $-\Delta + iV$ belong to the conical sector $\{z : 0 \leq \arg z \leq \alpha\}$, where $\alpha$ satisfies the equation
$$\bigl( C_1 + C_2 (\cot \alpha)^{d/2 - 1} \bigr) \int V^{d/2}\, dx = 1.$$

If $\gamma > 0$ then in the proof of Theorem 13 one can apply Theorem 6 even if $d = 2$ and obtain

Theorem 14. Let $d = 2$ and let $V \geq 0$. Suppose that $z \notin \mathbb{R}$ is an eigenvalue of $H = -\Delta + iV$. Then there is a positive constant $C$ depending only on $\gamma > 0$ such that
$$|\operatorname{Im} z|^\gamma \leq C \int V^{1 + \gamma}\, dx, \qquad \gamma > 0.$$

8. Additional Remarks

Concluding this paper, we mention two rather obvious facts that are valid for an arbitrary complex potential $V$. For the sake of simplicity, we restrict our study to the case $d = 3$. As before, $H = -\Delta + V$ is the Schrödinger operator and $\omega_2$ is the area of the unit sphere $S^2$.

Theorem 15. Let $d = 3$. Let $V \in L^\infty \cap L^1$ and let $\omega_2 \|V\|_\infty + 2\|V\|_1 < 8\pi$. Then the spectrum of the operator $H$ is real. The same statement is true if
$$\sup_x \int_{\mathbb{R}^3} \frac{|V(y)|}{|x - y|}\, dy < 4\pi.$$

Theorem 16. Let $d = 3$ and let $z = k^2 \notin \mathbb{R}_+$ be an eigenvalue of the operator $H = -\Delta + V$, $\operatorname{Im} k > 0$. Then there is a positive constant $C$ depending only on $\gamma > 0$, such that
$$(\operatorname{Im} k)^{2\gamma} \leq C \int_{\mathbb{R}^3} |V|^{3/2 + \gamma}\, dx.$$

Proof of both theorems. Suppose that $z = k^2$ is an eigenvalue of the operator $H$. Then the norm of the operator $X = |V|^{1/2}(-\Delta - z)^{-1}|V|^{1/2}$ is not smaller than 1. By Schur's inequality, if $G$ is an integral operator with the kernel $g(x, y)$ then $\|G\|^2 \leq m_1 m_2$, where
$$m_1 = \sup_x \int \frac{|g(x, y)|}{\rho(x, y)}\, dy \quad \text{and} \quad m_2 = \sup_y \int |g(x, y)|\, \rho(x, y)\, dx \tag{8.1}$$
and $\rho$ is a positive weight. Since the kernel of the operator $X$ equals
$$|V(x)|^{1/2}\, \frac{e^{ik|x - y|}}{4\pi |x - y|}\, |V(y)|^{1/2},$$


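then applying (8.1) with the weight $\rho = \sqrt{|V(x)| / |V(y)|}$, we obtain that
$$\|X\| \leq \frac{1}{4\pi} \sup_x \int \frac{e^{-\operatorname{Im} k\, |x - y|}}{|x - y|}\, |V(y)|\, dy.$$
The statement of Theorem 15 follows from the trivial estimate
$$1 \leq \|X\| \leq \frac{1}{4\pi} \sup_x \int_{\mathbb{R}^3} \frac{|V(y)|}{|x - y|}\, dy \leq \frac{1}{8\pi} \bigl( \omega_2 \|V\|_\infty + 2\|V\|_1 \bigr).$$
We obtain the statement of Theorem 16 using the Hölder inequality
$$1 \leq \frac{1}{4\pi} \sup_x \int_{\mathbb{R}^3} \frac{|V(y)|}{|x - y|}\, e^{-\operatorname{Im} k\, |x - y|}\, dy \leq C_0 \|V\|_p \Bigl( \int_{\mathbb{R}^3} \frac{e^{-q \operatorname{Im} k\, |y|}}{|y|^q}\, dy \Bigr)^{1/q} = C\, \frac{\|V\|_p}{(\operatorname{Im} k)^{2\gamma / p}},$$
where $p = 3/2 + \gamma$ and $q = p/(p - 1)$.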
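The weighted Schur test (8.1) has a discrete counterpart in which sums replace integrals. The following minimal sketch assumes a hypothetical two-point "potential" `v` and a symmetric positive kernel `k`, mimicking the structure $|V(x)|^{1/2} K(x, y) |V(y)|^{1/2}$ of the Birman-Schwinger kernel; it compares the bound with the weight $\rho = \sqrt{v_i / v_j}$ to the bound with $\rho = 1$.

```python
import math

v = [4.0, 0.25]                   # hypothetical values of |V| at two points
k = [[1.0, 0.5], [0.5, 1.0]]      # hypothetical symmetric positive kernel
g = [[math.sqrt(v[i]) * k[i][j] * math.sqrt(v[j]) for j in range(2)] for i in range(2)]

def schur_bound(weight):
    """sqrt(m1 * m2) with m1, m2 as in (8.1), sums replacing integrals."""
    m1 = max(sum(abs(g[i][j]) / weight(i, j) for j in range(2)) for i in range(2))
    m2 = max(sum(abs(g[i][j]) * weight(i, j) for i in range(2)) for j in range(2))
    return math.sqrt(m1 * m2)

def norm_sym2(m):
    """Operator norm of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = m
    mean, dev = (a + d) / 2, math.hypot((a - d) / 2, b)
    return max(abs(mean - dev), abs(mean + dev))

norm_g = norm_sym2(g)
weighted = schur_bound(lambda i, j: math.sqrt(v[i] / v[j]))   # rho as in the proof
unweighted = schur_bound(lambda i, j: 1.0)
```

With this weight both $m_1$ and $m_2$ reduce to $\max_i \sum_j k_{ij} v_j$, the discrete analogue of $\sup_x \int K(x, y) |V(y)|\, dy$, and in this example the weighted bound is sharper than the unweighted one.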

Remark. By using similar arguments one can show that
$$(\operatorname{Im} \sqrt{z})^{2\gamma} \Bigl( \frac{|\sqrt{z}|}{\operatorname{Im} \sqrt{z}} \Bigr)^{\gamma + 1/2} \leq C \int_{\mathbb{R}} |V|^{1/2 + \gamma}\, dx, \qquad \gamma \geq 1/2,$$
for eigenvalues $z \notin \mathbb{R}_+$ of the one-dimensional Schrödinger operator $H = -d^2/dx^2 + V$. The constant $C$ in this inequality can be computed explicitly:
$$C = \frac{1}{2} \Bigl( \frac{2\gamma - 1}{2\gamma + 1} \Bigr)^{\gamma - 1/2}.$$

Acknowledgement. The authors would like to thank Grigori Rozenblioum, Rupert Frank, Stanislav Molchanov and Robert Seiringer for their remarks.

References

1. Abramov, A.A., Aslanyan, A., Davies, E.B.: Bounds on complex eigenvalues and resonances. J. Phys. A 34, 57–72 (2001)
2. Austern, N.: The use of complex potentials in nuclear physics. Ann. Phys. 45(1), 113–131 (1967)
3. Birman, M.: On the spectrum of singular boundary value problems. Matem. Sb. 55, 125–174 (1961)
4. Birman, M.Sh., Solomjak, M.Z.: Spectral theory of selfadjoint operators in Hilbert space. Mathematics and its Applications (Soviet Series). Dordrecht: D. Reidel Publishing Co., 1987
5. Cwikel, M.: Weak type estimates for singular values and the number of bound states of Schrödinger operators. Ann. Math. 106, 93–100 (1977)
6. Davies, E.B.: Linear operators and their spectra. Cambridge Studies in Advanced Mathematics 106, Cambridge: Cambridge University Press, 2007
7. Davies, E.B., Nath, J.: Schrödinger operators with slowly decaying potentials. J. Comput. Appl. Math. 148(1), 1–28 (2002)
8. Demuth, M., Hansmann, M., Katriel, G.: On the discrete spectrum of non-selfadjoint operators. Preprint, TU Clausthal, 2008
9. Frank, R.L., Laptev, A., Lieb, E.H., Seiringer, R.: Lieb-Thirring inequalities for Schrödinger operators with complex-valued potentials. Lett. Math. Phys. 77(3), 309–316 (2006)
10. Ge, J.-Y., Zhang, J.: Use of negative complex potential as absorbing potential. J. Chem. Phys. 108(4), 1429–1433 (1998)
11. Fan, K.: Maximum properties and inequalities for the eigenvalues of completely continuous operators. Proc. Nat. Acad. Sci. USA 37, 760–766 (1951)
12. Lieb, E.H., Thirring, W.: Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities. In: Studies in Mathematical Physics (Essays in Honor of Valentine Bargmann), Princeton, NJ: Princeton Univ. Press, 1976, pp. 269–303
13. Nimtz, G., Spieker, H., Brodowsky, H.M.: Tunneling with dissipation. J. Phys. I France 4, 1379–1382 (1994)
14. Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York-London: Academic Press [Harcourt Brace Jovanovich, Publishers], 1978
15. Schwinger, J.: On the bound states of a given potential. Proc. Nat. Acad. Sci. USA 47, 122–129 (1961)
16. Seiler, E., Simon, B.: Bounds in the Yukawa$_2$ quantum field theory: upper bound on the pressure, Hamiltonian bound and linear lower bound. Commun. Math. Phys. 45, 99–114 (1975)
17. Simon, B.: Trace ideals and their applications. London Mathematical Society Lecture Note Series 35, Cambridge: Cambridge University Press, 1979
18. Von Neumann, J., Wigner, E.: Über merkwürdige diskrete Eigenwerte. Physik. Zeitschr. 30, 465 (1929)
19. Weyl, H.: Inequalities between the two kinds of eigenvalues of a linear transformation. Proc. Nat. Acad. Sci. USA 35, 408–411 (1949)

Communicated by B. Simon


Communications in


Entanglement Transmission and Generation under Channel Uncertainty: Universal Quantum Channel Coding

Igor Bjelaković¹,², Holger Boche¹,², Janis Nötzel¹

¹ Heinrich-Hertz-Lehrstuhl für Mobilkommunikation (HFT 6), Technische Universität Berlin, Einsteinufer 25, 10587 Berlin, Germany. E-mail: [email protected]; [email protected]; [email protected]
² Institut für Mathematik, Technische Universität Berlin, Strasse des 17. Juni 136, 10623 Berlin, Germany.

Received: 28 November 2008 / Accepted: 20 May 2009 Published online: 13 August 2009 – © Springer-Verlag 2009

Abstract: We determine the optimal rates of universal quantum codes for entanglement transmission and generation under channel uncertainty. In the simplest scenario the sender and receiver are provided merely with the information that the channel they use belongs to a given set of channels, so that they are forced to use quantum codes that are reliable for the whole set of channels. This is precisely the quantum analog of the compound channel coding problem. We determine the entanglement transmission and entanglement-generating capacities of compound quantum channels and show that they are equal. Moreover, we investigate two variants of that basic scenario, namely the cases of informed decoder or informed encoder, and derive corresponding capacity results.

Contents

1. Introduction
2. Definitions and Main Result
3. One-Shot Results
4. Direct Part of the Coding Theorem for Finitely Many Channels
5. Finite Approximations in the Set of Quantum Channels
6. Direct Parts of the Coding Theorems for General Quantum Compound Channels
7. Converse Parts of the Coding Theorems for General Quantum Compound Channels
8. Continuity of Compound Capacity
9. Entanglement-Generating Capacity of Compound Channels
10. Conclusion and Further Remarks
A. Appendix

1. Introduction The determination of capacities of quantum channels in various settings has been a field of intense work over the last decade. In contrast to classical information theory, to any


quantum channel we can associate in a natural way different notions of capacity depending on what is to be transmitted over the channel and which figure of merit is chosen as the criterion for the success of the particular quantum communication task. For example we may try to determine the maximum number of classical messages that can be reliably distinguished at the output of the channel leading to the notion of classical capacity of a quantum channel. We might as well wish to establish secure classical communication over a quantum channel, giving rise to the definition of a channel’s private capacity. On the other hand, in the realm of quantum communication, one may ask e.g. the question what the maximal amount of entanglement is that we can generate or transmit over a given quantum channel, leading to the notions of entanglement-generating and entanglement transmission capacities. Other examples of quantum capacities are the subspace transmission and average subspace transmission capacities. Such quantum communication tasks are needed, for example, to support computation in quantum circuits or to provide the best possible supply of pure entanglement in a noisy environment. Fortunately, these genuinely quantum mechanical capacities are shown to be equal for perfectly known single user channels [1,21]. First results indicating that coherent information was to play a role in the determination of the quantum capacity of memoryless channels were established by Schumacher and Nielsen [26] and, independently, by Lloyd [23] who was the first to conjecture that indeed the regularized coherent information would give the correct formula for the quantum capacity and gave strong heuristic evidence to his claim. In 1998 Barnum, Knill, and Nielsen and Barnum, Nielsen, and Schumacher [1] gave the first upper bound on the capacity of a memoryless channel in terms of the regularized coherent information. 
Later on, Shor [29] and Devetak [10] offered two independent approaches to the achievability part of the coding theorem. Despite the fact that the regularized coherent information was identified as the capacity of memoryless quantum channels many other approaches to the coding theorem have been offered subsequently, for example Devetak and Winter [11] and Hayden, Shor, and Winter [14]. Of particular interest for our paper are the developments by Klesse [20] and Hayden, Horodecki, Winter, and Yard [13] based on the decoupling idea which can be traced back to Schumacher and Westmoreland [28]. In fact, the main purpose of our work is to show that the decoupling idea can be utilized to prove the existence of reliable universal quantum codes for entanglement transmission and generation. On the other hand, the classical capacity of memoryless quantum channels has been determined in the pioneering work by Holevo [15] and Schumacher and Westmoreland [27]. Their results have been substantially sharpened by Winter [31] and Ogawa and Nagaoka [25] who gave independent proofs of the strong converse to the coding theorem. However, most of the work done so far on quantum channel capacities relies on the assumption that the channel is perfectly known to the sender and receiver. Such a requirement is hardly fulfilled in many situations. In this paper we consider compound quantum channels which are among the simplest non-trivial models with channel uncertainty. A rough description of this communication scenario is that the sender and receiver do not know the memoryless channel they have to use. The prior knowledge they have access to is merely that the actual channel belongs to a set I of channels which in turn is known to the sender and receiver. It is important to notice that we impose no restrictions on the set I, i.e. it can be finite, countably-infinite or uncountable. 
Our intention is to identify the best rates of quantum codes for entanglement transmission and generation that are reliable for the whole set of channels I simultaneously. This is, in some sense, a quantum channel counterpart of the universal quantum data compression result discovered by Jozsa and the Horodecki family [18].


While the classical capacity of compound quantum channels has been determined only recently in [3], in this paper we will focus on entanglement-generating and entanglement transmission capacities of compound quantum channels. Specifically we will determine both of them and show that they are equal. The investigation of their relation to other possible definitions of quantum capacity of compound quantum channels in the spirit of [1,21] will be given elsewhere.

1.1. Related Work. The capacity of compound channels in the classical setting was determined by Wolfowitz [32,33] and Blackwell, Breiman, and Thomasian [5]. The full coding theorem for transmission of classical information via compound quantum channels was proven in [3]. Subsequently, Hayashi [12] obtained a closely related result with a completely different proof technique based on the Schur-Weyl duality from representation theory and the packing lemma from [7]. In our previous paper [4] we determined the entanglement transmission capacity of finite quantum compound channels (i.e. |I| < ∞). Moreover, we were able to prove the coding theorem for arbitrary I with informed decoder. It is important to remark here that we used a different notion of codes in [4], following [20], which is motivated by the theory of quantum error correction. In the cases of an informed decoder and uninformed users this change does not appear to be of importance. In the case of an informed encoder it is of crucial importance in the proof of the direct part of the coding result. In our former paper, the strategy of proof was as follows. First, we derived a modification of Klesse's one-shot coding result [20] that was adapted to arithmetic averages of channels. Application of this theorem combined with a discretization technique based on τ-nets yielded the coding result for quantum compound channels with informed decoder and arbitrary I.
With the help of the channel-estimation technique developed by Datta and Dorlas [9] we were able to show that in the case of a finite compound channel it is asymptotically of no relevance if one spends the first $\sqrt{l}$ transmissions for channel estimation, thus turning an uninformed decoder into an informed decoder. Since for an informed decoder we had already proven the existence of good codes, we were able to obtain the full coding result in the case |I| < ∞. Unfortunately, the speed at which one can gain channel knowledge using the channel estimation technique we employed is highly dependent on the number of channels. Due to this fact, the combination of channel estimation and approximation of general compound channels through finite ones did not seem to work in the other two cases.

In this paper, we use a more direct strategy. First, we derive one-shot coding results for finite compound channels with uninformed users and informed encoder. In order to evaluate the dependence of the derived bounds on the block length we have to project onto typical subspaces of suitable output states of the individual channels. Therefore, it turns out that we effectively end up in the scenario with informed decoder. Now, instead of employing a channel estimation strategy we study the impact of these projections onto the typical subspaces on the entanglement fidelity of the entire encoding-decoding procedure. It turns out that these projections can simply be removed without decreasing the entanglement fidelity too much, and we obtain a universal (i.e. uninformed) decoder for our coding problem. Then, again using the discretization technique based on τ-nets, we can convert these results for finite I to arbitrary compound quantum channels.
Another difference to our previous paper [4] is that we determine the optimal rates in all the scenarios described above for entanglement generation over compound quantum channels and show that they coincide with the entanglement transmission capacities.



1.2. Outline. Section 2 contains the fundamental definitions of codes and capacities for entanglement transmission in all three different settings. Moreover, the reader can find there the statement of our main result. It is followed by a section on one-shot results containing the one-shot result of Klesse [20], as well as our modifications thereof. The modified coding results guarantee the existence of unitary encodings as well as recovery operations for finite arithmetic averaged channels in all three different cases and establish a relation between the rate of the code and its entanglement fidelity. We also give an estimate relating the entanglement fidelity of a coding-decoding procedure to that of a disturbed version, where disturbance means that the application of the channel is followed by a projection. With these one-shot results at hand, in Sect. 4 we are able to prove the existence of codes for entanglement transmission of sufficiently high rates and entanglement fidelity asymptotically approaching one exponentially fast in the case of finite compound channels. Section 5 states the basic properties of finite size nets in the set of quantum channels. They are used to approximate general sets of quantum channels and provide the link between finite and general compound channels. The construction is such that their size depends polynomially on the approximation parameter. We use the coding results for finite compound channels and the properties of finite nets in Sect. 6 to derive sharp lower bounds on the entanglement transmission capacity of general compound channels. This section also contains variants of the BSST Lemma [2], where BSST stands for Bennett, Shor, Smolin, and Thapliyal. The proofs rely heavily on the difference in the polynomial growth of nets versus exponentially fast convergence to the entanglement fidelity one for the codes in the finite setting. The next Sect. 
7 contains the converse parts of the coding theorems for general compound channels. Since the converse must hold for arbitrary encoding schemes and since we explicitly allow the code space to be larger than the input space of the channels, we deviate from the usual structure and instead employ the converse part for the case of entanglement generation that was developed by Devetak [10]. We also use a recent continuity result due to Leung and Smith [22] that bounds the difference in coherent information between nearby channels. In Sect. 8 we show, once again using the work of Leung and Smith [22], that the entanglement transmission capacities of compound quantum channels are continuous with respect to the Hausdorff metric. In the final Sect. 9 we apply the results obtained so far to determine the entanglement-generating capacities of compound quantum channels. Unsurprisingly, they coincide with their counterparts for entanglement transmission.

1.3. Notation and Conventions. All Hilbert spaces are assumed to have finite dimension and are over the field ℂ. S(H) is the set of states, i.e. positive semi-definite operators with trace 1 acting on the Hilbert space H. Pure states are given by projections onto one-dimensional subspaces. A vector of unit length spanning such a subspace will therefore be referred to as a state vector. To each subspace F of H we can associate a unique projection $q_{\mathcal F}$ whose range is the subspace F, and we write $\pi_{\mathcal F}$ for the maximally mixed state on F, i.e. $\pi_{\mathcal F} := q_{\mathcal F}/\mathrm{tr}(q_{\mathcal F})$. The set of completely positive trace preserving (CPTP) maps between the operator spaces B(H) and B(K) is denoted by C(H, K). Thus H plays the role of the input Hilbert space to the channel (traditionally owned by Alice) and K is the channel's output Hilbert space (usually in Bob's possession). $\mathcal C^{\downarrow}(\mathcal H, \mathcal K)$ stands for the set of completely positive trace decreasing maps between B(H) and B(K). U(H)

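will denote in what follows the group of unitary operators acting on H. For a Hilbert space G ⊂ H we will always identify U(G) with a subgroup of U(H) in the canonical way. For any projection q ∈ B(H) we set $q^{\perp} := \mathbf 1_{\mathcal H} - q$. Each projection q ∈ B(H) defines a completely positive trace decreasing map $\mathcal Q$ given by $\mathcal Q(a) := qaq$ for all a ∈ B(H). In a similar fashion any u ∈ U(H) defines a $\mathcal U \in \mathcal C(\mathcal H, \mathcal H)$ by $\mathcal U(a) := u a u^{*}$ for a ∈ B(H).

We use the base two logarithm, which is denoted by log. The von Neumann entropy of a state ρ ∈ S(H) is given by S(ρ) := −tr(ρ log ρ). The coherent information for $\mathcal N \in \mathcal C(\mathcal H, \mathcal K)$ and ρ ∈ S(H) is defined by

$$I_c(\rho, \mathcal N) := S(\mathcal N(\rho)) - S\big((\mathrm{id}_{\mathcal H} \otimes \mathcal N)(|\psi\rangle\langle\psi|)\big),$$

where ψ ∈ H ⊗ H is an arbitrary purification of the state ρ. Following the usual conventions we let $S_e(\rho, \mathcal N) := S\big((\mathrm{id}_{\mathcal H} \otimes \mathcal N)(|\psi\rangle\langle\psi|)\big)$ denote the entropy exchange. A useful equivalent definition of $I_c(\rho, \mathcal N)$ is given in terms of $\mathcal N \in \mathcal C(\mathcal H, \mathcal K)$ and the complementary channel $\hat{\mathcal N} \in \mathcal C(\mathcal H, \mathcal H_e)$, where $\mathcal H_e$ denotes the Hilbert space of the environment: due to Stinespring's dilation theorem $\mathcal N$ can be represented as $\mathcal N(\rho) = \mathrm{tr}_{\mathcal H_e}(v \rho v^{*})$ for ρ ∈ S(H), where $v : \mathcal H \to \mathcal K \otimes \mathcal H_e$ is a linear isometry. The complementary channel $\hat{\mathcal N} \in \mathcal C(\mathcal H, \mathcal H_e)$ to $\mathcal N$ is obtained by tracing out the output space $\mathcal K$ instead of the environment:

$$\hat{\mathcal N}(\rho) := \mathrm{tr}_{\mathcal K}(v \rho v^{*}) \qquad (\rho \in \mathcal S(\mathcal H)).$$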
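The definitions of entropy, entropy exchange and coherent information above can be tried out numerically. The following is a minimal sketch, assuming a qubit channel given by exactly two Kraus operators, so that both the output state and the environment state are 2×2 matrices whose eigenvalues have a closed form. All function names and the dephasing example are illustrative assumptions, not taken from the paper. The environment state is computed entrywise as tr(K_i ρ K_j^*), which is the complementary-channel output for the Stinespring isometry built from the Kraus operators; its entropy equals the entropy exchange.

```python
import math

def matmul(a, b):
    """Product of two small matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def entropy_2x2(rho):
    """Base-2 von Neumann entropy of a Hermitian 2x2 matrix (closed-form eigenvalues)."""
    a, d = complex(rho[0][0]).real, complex(rho[1][1]).real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(rho[0][1]) ** 2)
    return -sum(x * math.log2(x) for x in (mean + radius, mean - radius) if x > 1e-15)

def coherent_information(rho, kraus):
    """I_c(rho, N) = S(N(rho)) - S_e(rho, N) for a qubit channel with two Kraus operators."""
    assert len(kraus) == 2, "this sketch only handles two Kraus operators"
    out = [[0j, 0j], [0j, 0j]]
    for k_op in kraus:  # channel output N(rho) = sum_i K_i rho K_i^*
        term = matmul(matmul(k_op, rho), dagger(k_op))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    # environment state with entries tr(K_i rho K_j^*)
    env = [[trace(matmul(matmul(ki, rho), dagger(kj))) for kj in kraus] for ki in kraus]
    return entropy_2x2(out) - entropy_2x2(env)

def dephasing_kraus(p):
    """Hypothetical example channel: rho -> (1-p) rho + p Z rho Z."""
    s0, s1 = math.sqrt(1 - p), math.sqrt(p)
    return [[[s0, 0], [0, s0]], [[s1, 0], [0, -s1]]]

def binary_entropy(p):
    return 0.0 if p in (0, 1) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

maximally_mixed = [[0.5, 0], [0, 0.5]]
print(coherent_information(maximally_mixed, dephasing_kraus(0.1)))
```

For the dephasing example with maximally mixed input, the output state stays maximally mixed and the environment state is diagonal with entries 1 − p and p, so the sketch reproduces the value 1 − h(p), with h the binary entropy.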

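The coherent information can then be written as

$$I_c(\rho, \mathcal N) = S(\mathcal N(\rho)) - S(\hat{\mathcal N}(\rho)).$$

As a measure of closeness between two states ρ, σ ∈ S(H) we use the fidelity $F(\rho, \sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$. The fidelity is symmetric in the input and for a pure state $\rho = |\phi\rangle\langle\phi|$ we have $F(|\phi\rangle\langle\phi|, \sigma) = \langle\phi, \sigma\phi\rangle$. A closely related quantity is the entanglement fidelity. For ρ ∈ S(H) and $\mathcal N \in \mathcal C^{\downarrow}(\mathcal H, \mathcal K)$ it is given by

$$F_e(\rho, \mathcal N) := \langle\psi, (\mathrm{id}_{\mathcal H} \otimes \mathcal N)(|\psi\rangle\langle\psi|)\psi\rangle,$$

with ψ ∈ H ⊗ H being an arbitrary purification of the state ρ. For the approximation of arbitrary compound channels by finite ones we use the diamond norm $\|\cdot\|_{\diamond}$, which is given by

$$\|\mathcal N\|_{\diamond} := \sup_{n \in \mathbb N}\ \max_{a \in \mathcal B(\mathbb C^n \otimes \mathcal H),\ \|a\|_1 = 1} \|(\mathrm{id}_n \otimes \mathcal N)(a)\|_1,$$

where $\mathrm{id}_n : \mathcal B(\mathbb C^n) \to \mathcal B(\mathbb C^n)$ is the identity channel, and $\mathcal N : \mathcal B(\mathcal H) \to \mathcal B(\mathcal K)$ is any linear map, not necessarily completely positive. The merits of $\|\cdot\|_{\diamond}$ are due to the following facts (cf. [19]). First, $\|\mathcal N\|_{\diamond} = 1$ for all $\mathcal N \in \mathcal C(\mathcal H, \mathcal K)$. Thus, $\mathcal C(\mathcal H, \mathcal K) \subset S_{\diamond}$, where $S_{\diamond}$ denotes the unit sphere of the normed space $(\mathcal B(\mathcal B(\mathcal H), \mathcal B(\mathcal K)), \|\cdot\|_{\diamond})$. Moreover, $\|\mathcal N_1 \otimes \mathcal N_2\|_{\diamond} = \|\mathcal N_1\|_{\diamond}\|\mathcal N_2\|_{\diamond}$ for arbitrary linear maps $\mathcal N_1, \mathcal N_2 : \mathcal B(\mathcal H) \to \mathcal B(\mathcal K)$. We further use the diamond norm to define the function $D_{\diamond}(\cdot, \cdot)$ on $\{(\mathfrak I, \mathfrak I') : \mathfrak I, \mathfrak I' \subset \mathcal C(\mathcal H, \mathcal K)\}$, which is for $\mathfrak I, \mathfrak I' \subset \mathcal C(\mathcal H, \mathcal K)$ given by

$$D_{\diamond}(\mathfrak I, \mathfrak I') := \max\Big\{\sup_{\mathcal N \in \mathfrak I}\ \inf_{\mathcal N' \in \mathfrak I'} \|\mathcal N - \mathcal N'\|_{\diamond},\ \sup_{\mathcal N' \in \mathfrak I'}\ \inf_{\mathcal N \in \mathfrak I} \|\mathcal N - \mathcal N'\|_{\diamond}\Big\}.$$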
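The two-sided sup-inf structure of this distance can be sketched for finite sets. The snippet below is only an illustration under stated assumptions: computing the diamond norm itself requires an optimization that is not attempted here, so the distance between two channels is passed in as a function. In the example, channels are represented by a dephasing probability, and `dephasing_distance` (twice the parameter difference) is an assumed stand-in for the diamond-norm distance of that family; both names are hypothetical.

```python
def hausdorff_distance(set_a, set_b, dist):
    """Two-sided sup-inf distance between two finite, non-empty sets."""
    def one_sided(xs, ys):
        # worst case over xs of the distance to the closest element of ys
        return max(min(dist(x, y) for y in ys) for x in xs)
    return max(one_sided(set_a, set_b), one_sided(set_b, set_a))

def dephasing_distance(p, q):
    """Assumed stand-in for the diamond-norm distance of two dephasing channels."""
    return 2 * abs(p - q)

compound_a = [0.0, 0.1]    # hypothetical finite compound channel, as parameters
compound_b = [0.05, 0.3]   # a second hypothetical set
print(hausdorff_distance(compound_a, compound_b, dephasing_distance))
```

Passing the metric as an argument keeps the sup-inf logic separate from the (much harder) evaluation of the norm.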

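For $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ let $\bar{\mathfrak I}$ denote the closure of $\mathfrak I$ in $\|\cdot\|_{\diamond}$. Then $D_{\diamond}$ defines a metric on $\{(\mathfrak I, \mathfrak I') : \mathfrak I, \mathfrak I' \subset \mathcal C(\mathcal H, \mathcal K),\ \mathfrak I = \bar{\mathfrak I},\ \mathfrak I' = \bar{\mathfrak I}'\}$ which is basically the Hausdorff distance induced by the diamond norm. Obviously, for arbitrary $\mathfrak I, \mathfrak I' \subset \mathcal C(\mathcal H, \mathcal K)$, $D_{\diamond}(\mathfrak I, \mathfrak I') \le \epsilon$ implies that for every $\mathcal N \in \mathfrak I$ ($\mathcal N' \in \mathfrak I'$) there exists $\mathcal N' \in \mathfrak I'$ ($\mathcal N \in \mathfrak I$) such that $\|\mathcal N - \mathcal N'\|_{\diamond} \le 2\epsilon$. If $\mathfrak I = \bar{\mathfrak I}$, $\mathfrak I' = \bar{\mathfrak I}'$ holds we even have $\|\mathcal N - \mathcal N'\|_{\diamond} \le \epsilon$. In this way $D_{\diamond}$ gives a measure of distance between two compound channels. Finally, for any set $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ and $l \in \mathbb N$ we set $\mathfrak I^{\otimes l} := \{\mathcal N^{\otimes l} : \mathcal N \in \mathfrak I\}$.

2. Definitions and Main Result

Let $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$. The memoryless compound channel associated with $\mathfrak I$ is given by the family $\{\mathcal N^{\otimes l} : \mathcal S(\mathcal H^{\otimes l}) \to \mathcal S(\mathcal K^{\otimes l})\}_{l \in \mathbb N,\ \mathcal N \in \mathfrak I}$. In the rest of the paper we will simply write $\mathfrak I$ for that family. Each compound channel can be used in three different scenarios:

1. the informed decoder
2. the informed encoder
3. the case of uninformed users.

In the following three subsections we will give definitions of codes and capacity for these cases.

2.1. The Informed Decoder. An $(l, k_l)$-code for $\mathfrak I$ with informed decoder is a pair $(\mathcal P^l, \{\mathcal R^l_{\mathcal N} : \mathcal N \in \mathfrak I\})$ where:

1. $\mathcal P^l : \mathcal B(\mathcal F_l) \to \mathcal B(\mathcal H)^{\otimes l}$ is a CPTP map for some Hilbert space $\mathcal F_l$ with $k_l = \dim \mathcal F_l$.
2. $\mathcal R^l_{\mathcal N} : \mathcal B(\mathcal K)^{\otimes l} \to \mathcal B(\mathcal F'_l)$ is a CPTP map for each $\mathcal N \in \mathfrak I$, where the Hilbert space $\mathcal F'_l$ satisfies $\mathcal F_l \subset \mathcal F'_l$.

In what follows the operations $\mathcal R^l_{\mathcal N}$ are referred to as recovery (or decoding) operations. Since the decoder knows which channel is actually used during transmission, they are allowed to depend on the channel. Note at this point that we deviate from the standard assumption that $\mathcal F_l = \mathcal F'_l$. We allow $\mathcal F_l \subsetneq \mathcal F'_l$ for convenience only, since it allows more flexibility in code construction. It is readily seen from the definition of achievable rates and capacity below that the assumption $\mathcal F_l \subsetneq \mathcal F'_l$ cannot lead to a higher capacity of $\mathfrak I$ in any of the three cases that we are dealing with.

A non-negative number R is called an achievable rate for $\mathfrak I$ with informed decoder if there is a sequence of $(l, k_l)$-codes such that

1. $\liminf_{l \to \infty} \frac{1}{l} \log k_l \ge R$, and
2. $\lim_{l \to \infty} \inf_{\mathcal N \in \mathfrak I} F_e(\pi_{\mathcal F_l}, \mathcal R^l_{\mathcal N} \circ \mathcal N^{\otimes l} \circ \mathcal P^l) = 1$

holds. The capacity $Q_{ID}(\mathfrak I)$ of the compound channel $\mathfrak I$ with informed decoder is given by

$$Q_{ID}(\mathfrak I) := \sup\{R \in \mathbb R_+ : R \text{ is achievable for } \mathfrak I \text{ with informed decoder}\}.$$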
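The rate condition in this definition is a statement about the growth of the code dimension $k_l$ only. As a small illustration, the sketch below assumes a hypothetical code family with $k_l = \lfloor 2^{lR} \rfloor$ (the function names are not from the paper) and evaluates $\frac{1}{l}\log k_l$ for a few block lengths. It says nothing about the second condition, the entanglement fidelity, which depends on the actual encoders and decoders.

```python
import math

def rate(l, k_l):
    """Rate (1/l) * log2(k_l) of a code with block length l and dimension k_l."""
    return math.log2(k_l) / l

def code_dimensions(target_rate, lengths):
    """Hypothetical code family with k_l = floor(2^(l * R)), and at least 1."""
    return {l: max(1, math.floor(2 ** (l * target_rate))) for l in lengths}

dims = code_dimensions(0.53, [5, 10, 40])
for l, k_l in sorted(dims.items()):
    # the rate approaches the target from below as l grows
    print(l, k_l, rate(l, k_l))
```

Rounding the dimension down costs at most one bit in total, so the rate deficit is at most 1/l and vanishes in the limit inferior.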


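2.2. The Informed Encoder. An $(l, k_l)$-code for $\mathfrak I$ with informed encoder is a pair $(\{\mathcal P^l_{\mathcal N} : \mathcal N \in \mathfrak I\}, \mathcal R^l)$ where:

1. $\mathcal P^l_{\mathcal N} : \mathcal B(\mathcal F_l) \to \mathcal B(\mathcal H)^{\otimes l}$ is a CPTP map for each $\mathcal N \in \mathfrak I$ for some Hilbert space $\mathcal F_l$ with $k_l = \dim \mathcal F_l$. The maps $\mathcal P^l_{\mathcal N}$ are the encoding operations, which we allow to depend on $\mathcal N$ since the encoder knows which channel is in use.
2. $\mathcal R^l : \mathcal B(\mathcal K)^{\otimes l} \to \mathcal B(\mathcal F'_l)$ is a CPTP map, where the Hilbert space $\mathcal F'_l$ satisfies $\mathcal F_l \subset \mathcal F'_l$.

A non-negative number R is called an achievable rate for $\mathfrak I$ with informed encoder if there is a sequence of $(l, k_l)$-codes such that

1. $\liminf_{l \to \infty} \frac{1}{l} \log k_l \ge R$, and
2. $\lim_{l \to \infty} \inf_{\mathcal N \in \mathfrak I} F_e(\pi_{\mathcal F_l}, \mathcal R^l \circ \mathcal N^{\otimes l} \circ \mathcal P^l_{\mathcal N}) = 1$

holds. The capacity $Q_{IE}(\mathfrak I)$ of the compound channel $\mathfrak I$ with informed encoder is given by

$$Q_{IE}(\mathfrak I) := \sup\{R \in \mathbb R_+ : R \text{ is achievable for } \mathfrak I \text{ with informed encoder}\}.$$

2.3. The Case of Uninformed Users. Codes and capacity for the compound channel $\mathfrak I$ with uninformed users are defined in a similar fashion. The only change is that we do not allow the encoding operations to depend on $\mathcal N$. That is, an $(l, k_l)$-code for $\mathfrak I$ is a pair $(\mathcal P^l, \mathcal R^l)$ of CPTP maps $\mathcal P^l \in \mathcal C(\mathcal F_l, \mathcal H^{\otimes l})$, where $\mathcal F_l$ is a Hilbert space with $k_l = \dim \mathcal F_l$, and $\mathcal R^l \in \mathcal C(\mathcal K^{\otimes l}, \mathcal F'_l)$ with $\mathcal F_l \subset \mathcal F'_l$. A non-negative number R is called an achievable rate for $\mathfrak I$ if there is a sequence of $(l, k_l)$-codes such that

1. $\liminf_{l \to \infty} \frac{1}{l} \log k_l \ge R$, and
2. $\lim_{l \to \infty} \inf_{\mathcal N \in \mathfrak I} F_e(\pi_{\mathcal F_l}, \mathcal R^l \circ \mathcal N^{\otimes l} \circ \mathcal P^l) = 1$.

The capacity $Q(\mathfrak I)$ of the compound channel $\mathfrak I$ is given by

$$Q(\mathfrak I) := \sup\{R \in \mathbb R_+ : R \text{ is achievable for } \mathfrak I\}.$$

A first simple consequence of these definitions is the following relation among the capacities of $\mathfrak I$:

$$Q(\mathfrak I) \le \min\{Q_{ID}(\mathfrak I), Q_{IE}(\mathfrak I)\}.$$

2.4. Main Result. With these definitions at our disposal, we are ready now to state the main result of the paper.

Theorem 1. Let $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ be an arbitrary set of quantum channels, where $\mathcal H$ and $\mathcal K$ are finite dimensional Hilbert spaces.

1. Then

$$Q(\mathfrak I) = Q_{ID}(\mathfrak I) = \lim_{l \to \infty} \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})}\ \inf_{\mathcal N \in \mathfrak I} I_c(\rho, \mathcal N^{\otimes l}),$$

and

$$Q_{IE}(\mathfrak I) = \lim_{l \to \infty} \frac{1}{l} \inf_{\mathcal N \in \mathfrak I}\ \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}).$$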

2. Moreover, for the corresponding entanglement-generating capacities $E(\mathfrak I)$, $E_{ID}(\mathfrak I)$, and $E_{IE}(\mathfrak I)$ (defined in Sect. 9) we have $E(\mathfrak I) = E_{ID}(\mathfrak I) = Q(\mathfrak I)$ and $E_{IE}(\mathfrak I) = Q_{IE}(\mathfrak I)$.

The rest of the paper contains a step-by-step proof of Theorem 1.

3. One-Shot Results

In this section we will establish the basic building blocks for the achievability parts of the coding theorems for compound channels with and without channel knowledge. The results are formulated as one-shot statements in order to simplify the notation.
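As a side illustration of the two capacity formulas in Theorem 1, the order of optimization matters: one input state must work for every channel in the first formula, while the input may be tuned to each channel in the second. The sketch below compares the two orders at block length l = 1 on a made-up table of coherent-information values (rows are candidate input states, columns are channels). The numbers and function names are purely hypothetical, and the single-letter values are not the capacities themselves, since Theorem 1 involves a limit over l.

```python
def max_inf(table):
    """Best worst-case value: one input state (row) must serve all channels."""
    return max(min(row) for row in table)

def inf_max(table):
    """Worst best-case value: the input state may be chosen per channel (column)."""
    return min(max(column) for column in zip(*table))

# Hypothetical coherent-information values I_c(rho_r, N_c)
table = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]
print(max_inf(table), inf_max(table))
```

The max-inf value never exceeds the inf-max value, which mirrors the relation Q ≤ Q_IE between the uninformed and informed-encoder capacities.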

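3.1. One-Shot Coding Result for a Single Channel. Before we turn our attention to quantum compound channels we will briefly describe part of the recent developments in coding theory for single (i.e. perfectly known) channels as given in [20] and [13]. Both approaches are based on a decoupling idea which is closely related to approximate error correction. In order to state this decoupling lemma we need some notational preparation. Let ρ ∈ S(H) be given and consider any purification $\psi \in \mathcal H_a \otimes \mathcal H$, $\mathcal H_a = \mathcal H$, of ρ. According to Stinespring's representation theorem any $\mathcal N \in \mathcal C^{\downarrow}(\mathcal H, \mathcal K)$ is given by

$$\mathcal N(\,\cdot\,) = \mathrm{tr}_{\mathcal H_e}\big((\mathbf 1_{\mathcal K} \otimes p_e)\, v(\,\cdot\,)v^{*}\big), \tag{1}$$

where $\mathcal H_e$ is a suitable finite-dimensional Hilbert space, $p_e$ is a projection onto a subspace of $\mathcal H_e$, and $v : \mathcal H \to \mathcal K \otimes \mathcal H_e$ is an isometry. Let us define a pure state on $\mathcal H_a \otimes \mathcal K \otimes \mathcal H_e$ by the formula

$$\psi' := \frac{1}{\sqrt{\mathrm{tr}(\mathcal N(\rho))}}\,(\mathbf 1_{\mathcal H_a \otimes \mathcal K} \otimes p_e)(\mathbf 1_{\mathcal H_a} \otimes v)\psi.$$

We set

$$\rho' := \mathrm{tr}_{\mathcal H_a \otimes \mathcal H_e}(|\psi'\rangle\langle\psi'|), \qquad \rho'_{ae} := \mathrm{tr}_{\mathcal K}(|\psi'\rangle\langle\psi'|),$$

and

$$\rho_a := \mathrm{tr}_{\mathcal K \otimes \mathcal H_e}(|\psi'\rangle\langle\psi'|), \qquad \rho'_e := \mathrm{tr}_{\mathcal H_a \otimes \mathcal K}(|\psi'\rangle\langle\psi'|).$$

The announced decoupling lemma can now be stated as follows.

Lemma 1 (cf. [13,20]). For ρ ∈ S(H) and $\mathcal N \in \mathcal C^{\downarrow}(\mathcal H, \mathcal K)$ there exists a recovery operation $\mathcal R \in \mathcal C(\mathcal K, \mathcal H)$ with

$$F_e(\rho, \mathcal R \circ \mathcal N) \ge w - \|w\rho'_{ae} - w\rho_a \otimes \rho'_e\|_1,$$

where $w = \mathrm{tr}(\mathcal N(\rho))$.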

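The striking implication of Lemma 1 is that if the so-called quantum error $\|\rho'_{ae} - \rho_a \otimes \rho'_e\|_1$ for ρ ∈ S(H) and $\mathcal N \in \mathcal C(\mathcal H, \mathcal K)$ is small then almost perfect error correction is possible via $\mathcal R$. Lemma 1 was Klesse's [20] starting point for his highly interesting proof of the following theorem, which is a one-shot version of the achievability part of the coding theorem. In the statement of the result we will use the following notation:

$$F_{c,e}(\rho, \mathcal N) := \max_{\mathcal R \in \mathcal C(\mathcal K, \mathcal H)} F_e(\rho, \mathcal R \circ \mathcal N),$$

where ρ ∈ S(H) and $\mathcal N \in \mathcal C^{\downarrow}(\mathcal H, \mathcal K)$.

Theorem 2 (Klesse [20]). Let the Hilbert space $\mathcal H$ be given and consider subspaces $\mathcal E \subset \mathcal G \subset \mathcal H$ with $\dim \mathcal E = k$. Then for any $\mathcal N \in \mathcal C^{\downarrow}(\mathcal H, \mathcal K)$ allowing a representation with n Kraus operators we have

$$\int_{\mathcal U(\mathcal G)} F_{c,e}(u\pi_{\mathcal E}u^{*}, \mathcal N)\,du \ge \mathrm{tr}(\mathcal N(\pi_{\mathcal G})) - \sqrt{k \cdot n}\,\|\mathcal N(\pi_{\mathcal G})\|_2,$$

where $\mathcal U(\mathcal G)$ denotes the group of unitaries acting on $\mathcal G$ and du indicates that the integration is with respect to the Haar measure on $\mathcal U(\mathcal G)$.

We will indicate briefly how Klesse [20] derived the direct part of the coding theorem for memoryless quantum channels from Theorem 2. Let ε > 0 be a small parameter and let us choose for each $l \in \mathbb N$ subspaces $\mathcal E_l \subset \mathcal G^{\otimes l} \subset \mathcal H^{\otimes l}$ with $\dim \mathcal E_l =: k_l = 2^{l(I_c(\pi_{\mathcal G}, \mathcal N) - 3\epsilon)}$. To given $\mathcal N \in \mathcal C(\mathcal H, \mathcal K)$ and $\pi_{\mathcal G}$ Klesse constructed a reduced version $\mathcal N_l$ of $\mathcal N^{\otimes l}$ in such a way that $\mathcal N_l$ has a Kraus representation with $n_l \le 2^{l(S_e(\pi_{\mathcal G}, \mathcal N) + \epsilon)}$ Kraus operators. Let $q_l \in \mathcal B(\mathcal K^{\otimes l})$ be the entropy-typical projection of the state $(\mathcal N(\pi_{\mathcal G}))^{\otimes l}$ and set $\mathcal N'_l(\cdot) := q_l \mathcal N_l(\cdot) q_l$. Then we have the following properties (some of which are stated once more for completeness):

1. $k_l = 2^{l(I_c(\pi_{\mathcal G}, \mathcal N) - 3\epsilon)}$,
2. $\mathrm{tr}(\mathcal N'_l(\pi_{\mathcal G}^{\otimes l})) \ge 1 - o(l^0)$,¹
3. $n_l \le 2^{l(S_e(\pi_{\mathcal G}, \mathcal N) + \epsilon)}$, and
4. $\|\mathcal N'_l(\pi_{\mathcal G}^{\otimes l})\|_2^2 \le 2^{-l(S(\mathcal N(\pi_{\mathcal G})) - \epsilon)}$.

An application of Theorem 2 to $\mathcal N'_l$ shows heuristically the existence of a unitary $u \in \mathcal U(\mathcal G^{\otimes l})$ and a recovery operation $\mathcal R^l \in \mathcal C(\mathcal K^{\otimes l}, \mathcal H^{\otimes l})$ with

$$F_e(u\pi_{\mathcal E_l}u^{*}, \mathcal R^l \circ \mathcal N'_l) \ge 1 - o(l^0) - 2^{-l\epsilon/2}.$$

This in turn can be converted into

$$F_e(u\pi_{\mathcal E_l}u^{*}, \mathcal R^l \circ \mathcal N^{\otimes l}) \ge 1 - o(l^0),$$

which is the achievability of $I_c(\pi_{\mathcal G}, \mathcal N)$. The passage from $\pi_{\mathcal G}$ to arbitrary states ρ is then accomplished via the Bennett, Shor, Smolin, and Thapliyal Lemma from [2] and the rest is by regularization.

¹ Here, $o(l^0)$ denotes simply a non-specified sequence tending to 0 as l → ∞, i.e. we (ab)use the Bachmann-Landau little-o notation.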
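For orientation, the exponent bookkeeping behind the exponentially small term in Klesse's argument can be spelled out. The following is a worked restatement of properties 1, 3 and 4 inserted into the error term of Theorem 2, using $I_c = S(\mathcal N(\pi_{\mathcal G})) - S_e(\pi_{\mathcal G}, \mathcal N)$ and a small slack parameter ε > 0. It is a sketch under these assumptions, not part of the original argument.

```latex
\begin{align*}
\sqrt{k_l\, n_l}\;\bigl\|\mathcal N'_l(\pi_{\mathcal G}^{\otimes l})\bigr\|_2
 &\le \Bigl(2^{\,l(I_c(\pi_{\mathcal G},\mathcal N)-3\epsilon)}
        \cdot 2^{\,l(S_e(\pi_{\mathcal G},\mathcal N)+\epsilon)}
        \cdot 2^{-l(S(\mathcal N(\pi_{\mathcal G}))-\epsilon)}\Bigr)^{1/2} \\
 &= 2^{\frac{l}{2}\bigl(I_c(\pi_{\mathcal G},\mathcal N)
        + S_e(\pi_{\mathcal G},\mathcal N)
        - S(\mathcal N(\pi_{\mathcal G})) - \epsilon\bigr)} \\
 &= 2^{-l\epsilon/2},
\end{align*}
% since I_c(\pi_G, N) = S(N(\pi_G)) - S_e(\pi_G, N) makes the entropic terms cancel.
```

The three ε-slacks (−3ε in the rate, +ε in the number of Kraus operators, +ε in the Hilbert-Schmidt bound) leave a net −ε in the exponent, which is why the rate must be backed off by 3ε rather than by a smaller multiple.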

3.2. One-Shot Coding Result for Uninformed Users. Our goal in this section is to establish a variant of Theorem 2 that works for finite sets of channels. Since the entanglement fidelity depends affinely on the channel it is easily seen that for each set $\mathfrak I = \{\mathcal N_1, \ldots, \mathcal N_N\}$ any good coding scheme with uninformed users is also good for the channel

$$\overline{\mathcal N} := \frac{1}{N}\sum_{i=1}^{N} \mathcal N_i$$

and vice versa. Since it is easier to deal with a single channel and we do not lose anything by passing to averages, we will formulate our next theorem for arithmetic averages of completely positive trace decreasing maps instead of the set $\{\mathcal N_1, \ldots, \mathcal N_N\}$.

Theorem 3 (One-Shot Result: Uninformed Users and Averaged Channel). Let the Hilbert space $\mathcal H$ be given and consider subspaces $\mathcal E \subset \mathcal G \subset \mathcal H$ with $\dim \mathcal E = k$. For any choice of $\mathcal N_1, \ldots, \mathcal N_N \in \mathcal C^{\downarrow}(\mathcal H, \mathcal K)$, each allowing a representation with $n_j$ Kraus operators, j = 1, . . . , N, we set

$$\overline{\mathcal N} := \frac{1}{N}\sum_{j=1}^{N} \mathcal N_j,$$

and for any $u \in \mathcal U(\mathcal G)$,

$$\overline{\mathcal N}_u := \frac{1}{N}\sum_{j=1}^{N} \mathcal N_j \circ \mathcal U.$$

Then

$$\int_{\mathcal U(\mathcal G)} F_{c,e}(\pi_{\mathcal E}, \overline{\mathcal N}_u)\,du \ge \mathrm{tr}(\overline{\mathcal N}(\pi_{\mathcal G})) - 2\sum_{j=1}^{N} \sqrt{k n_j}\,\|\mathcal N_j(\pi_{\mathcal G})\|_2,$$

where the integration is with respect to the normalized Haar measure on $\mathcal U(\mathcal G)$.

Remark 1. It is worth noting that the average in this theorem is no longer over maximally mixed states as in Theorem 2, but rather over encoding operations.

Proof. The proof is easily reduced to that of the corresponding theorem in our previous paper [4]. Most of the details can also be seen in the proof of Theorem 4 in the next subsection.

3.3. One-Shot Coding Result for Informed Encoder. Before stating the main result of this section we recall a useful lemma from [4] which will be needed in the proof of Theorem 4.

Lemma 2. Let L and D be N × N matrices with non-negative entries which satisfy

$$L_{jl} \le L_{jj}, \qquad L_{jl} \le L_{ll}, \tag{2}$$

and

$$D_{jl} \le \max\{D_{jj}, D_{ll}\} \tag{3}$$

for all j, l ∈ {1, . . . , N}. Then

$$\frac{1}{N}\sum_{j,l=1}^{N} \sqrt{L_{jl} D_{jl}} \le 2\sum_{j=1}^{N} \sqrt{L_{jj} D_{jj}}.$$
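A quick numerical check of Lemma 2 on a small instance can make the statement concrete. The sketch below is illustrative only: the matrices are hypothetical and chosen to have the shape that appears later in the proof of Theorem 4, namely L built from minima of Kraus-operator counts and D a Gram matrix of non-negative vectors, which satisfies hypothesis (3) by the Cauchy-Schwarz inequality. The function names are assumptions made for this example.

```python
import math

def satisfies_hypotheses(L, D):
    """Check conditions (2) and (3) of Lemma 2 entry by entry."""
    n = len(L)
    return all(
        L[j][l] <= min(L[j][j], L[l][l]) and D[j][l] <= max(D[j][j], D[l][l])
        for j in range(n) for l in range(n)
    )

def lemma2_sides(L, D):
    """Return (left-hand side, right-hand side) of the inequality in Lemma 2."""
    n = len(L)
    lhs = sum(math.sqrt(L[j][l] * D[j][l]) for j in range(n) for l in range(n)) / n
    rhs = 2 * sum(math.sqrt(L[j][j] * D[j][j]) for j in range(n))
    return lhs, rhs

# Hypothetical data: k = dim E, one Kraus-operator count per channel.
k = 2
kraus_counts = [1, 3, 4]
L = [[k * min(a, b) for b in kraus_counts] for a in kraus_counts]

# Gram matrix of non-negative vectors, standing in for Hilbert-Schmidt inner products.
vectors = [[0.5, 0.5, 0.0], [0.2, 0.3, 0.5], [0.0, 0.1, 0.9]]
D = [[sum(x * y for x, y in zip(v, w)) for w in vectors] for v in vectors]

lhs, rhs = lemma2_sides(L, D)
print(satisfies_hypotheses(L, D), lhs, rhs)
```

The check reflects the elementary idea behind the lemma: under (2) and (3) each off-diagonal term is dominated by one of the two corresponding diagonal terms, and summing over pairs produces the factor 2.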

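Proof. The proof of this lemma is elementary. The details can be found in our previous paper [4].

We will focus now on the scenario where the sender or encoder knows which channel is in use. Consequently, the encoding operation can depend on the individual channel. The idea behind the next theorem is that we perform an independent, randomized selection of unitary encoders for each channel in the finite set $\mathfrak I = \{\mathcal N_1, \ldots, \mathcal N_N\}$. This explains why the averaging in (4) is with respect to products of Haar measures instead of averaging over one single Haar measure as in Theorem 3.

Theorem 4 (One-Shot Result: Informed Encoder and Averaged Channel). Let the finite-dimensional Hilbert spaces $\mathcal H$ and $\mathcal K$ be given. Consider subspaces $\mathcal E, \mathcal G_1, \ldots, \mathcal G_N \subset \mathcal H$ with $\dim \mathcal E = k$ such that for all i ∈ {1, . . . , N} the dimension relation $k \le \dim \mathcal G_i$ holds. Let $\mathcal N_1, \ldots, \mathcal N_N \in \mathcal C^{\downarrow}(\mathcal H, \mathcal K)$, each allowing a representation with $n_j$ Kraus operators, j = 1, . . . , N. Let $\{v_i\}_{i=1}^{N} \subset \mathcal U(\mathcal H)$ be any fixed set of unitary operators such that $v_i \mathcal E \subset \mathcal G_i$ holds for every i ∈ {1, . . . , N}. For an arbitrary set $\{u_i\}_{i=1}^{N} \subset \mathcal U(\mathcal H)$, define

$$\overline{\mathcal N}_{u_1,\ldots,u_N} := \frac{1}{N}\sum_{i=1}^{N} \mathcal N_i \circ \mathcal U_i \circ \mathcal V_i.$$

Then

$$\int_{\mathcal U(\mathcal G_1) \times \cdots \times \mathcal U(\mathcal G_N)} F_{c,e}(\pi_{\mathcal E}, \overline{\mathcal N}_{u_1,\ldots,u_N})\,du_1 \ldots du_N \ge \sum_{j=1}^{N}\Big[\frac{1}{N}\mathrm{tr}(\mathcal N_j(\pi_{\mathcal G_j})) - 2\sqrt{k n_j}\,\|\mathcal N_j(\pi_{\mathcal G_j})\|_2\Big], \tag{4}$$

where the integration is with respect to the product of the normalized Haar measures on $\mathcal U(\mathcal G_1), \ldots, \mathcal U(\mathcal G_N)$.

Proof. Our first step in the proof is to show briefly that $F_{c,e}(\pi_{\mathcal E}, \overline{\mathcal N}_{u_1,\ldots,u_N})$ depends measurably on $(u_1, \ldots, u_N) \in \mathcal U(\mathcal G_1) \times \cdots \times \mathcal U(\mathcal G_N)$. For each recovery operation $\mathcal R \in \mathcal C(\mathcal K, \mathcal H)$ we define a function $f_{\mathcal R} : \mathcal U(\mathcal G_1) \times \cdots \times \mathcal U(\mathcal G_N) \to [0, 1]$ by

$$f_{\mathcal R}(u_1, \ldots, u_N) := F_e(\pi_{\mathcal E}, \mathcal R \circ \overline{\mathcal N}_{u_1,\ldots,u_N}).$$

Clearly, $f_{\mathcal R}$ is continuous for each fixed $\mathcal R \in \mathcal C(\mathcal K, \mathcal H)$. Thus, the function

$$F_{c,e}(\pi_{\mathcal E}, \overline{\mathcal N}_{u_1,\ldots,u_N}) = \max_{\mathcal R \in \mathcal C(\mathcal K, \mathcal H)} f_{\mathcal R}(u_1, \ldots, u_N)$$

is lower semicontinuous, and consequently measurable.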

We turn now to the proof of inequality (4). From Lemma 1 we know that there is a recovery operation $\mathcal R$ such that

$$F_e(\pi_{\mathcal E}, \mathcal R \circ \overline{\mathcal N}_{u_1,\ldots,u_N}) \ge w - \|w\rho'_{ae} - w\rho_a \otimes \rho'_e\|_1, \tag{5}$$

where we have used the notation introduced in the paragraph preceding Lemma 1, and

$$w = w(u_1, \ldots, u_N) = \mathrm{tr}(\overline{\mathcal N}_{u_1,\ldots,u_N}(\pi_{\mathcal E})).$$

For each j ∈ {1, . . . , N} let $\{b_{j,i}\}_{i=1}^{n_j}$ be the set of Kraus operators of $\mathcal N_j$. Clearly, for every set $u_1, \ldots, u_N$ of unitary matrices, $\mathcal N_j \circ \mathcal U_j \circ \mathcal V_j$ has Kraus operators $\{a_{j,i}\}_{i=1}^{n_j}$ given by $a_{j,i} = b_{j,i} u_j v_j$. Utilizing the very same calculation that was used in the proof of Theorem 3 in [4], which in turn is almost identical to the corresponding calculation in [20], we can reformulate inequality (5) as

$$F_e(\pi_{\mathcal E}, \mathcal R \circ \overline{\mathcal N}_{u_1,\ldots,u_N}) \ge w - \|D(u_1, \ldots, u_N)\|_1, \tag{6}$$

with $w = \mathrm{tr}(\overline{\mathcal N}_{u_1,\ldots,u_N}(\pi_{\mathcal E}))$ and

$$D(u_1, \ldots, u_N) := \frac{1}{N}\sum_{j,l=1}^{N}\ \sum_{i=1,r=1}^{n_j,n_l} D_{(ij)(rl)}(u_j, u_l) \otimes |e_i\rangle\langle e_r| \otimes |f_j\rangle\langle f_l|,$$

where

$$D_{(ij)(rl)}(u_j, u_l) := \frac{1}{k}\Big(p\,a_{j,i}^{*} a_{l,r}\,p - \frac{1}{k}\mathrm{tr}(p\,a_{j,i}^{*} a_{l,r}\,p)\,p\Big),$$

and $p := k\pi_{\mathcal E}$ is the projection onto $\mathcal E$. Let us define

$$D_{j,l}(u_j, u_l) := \sum_{i=1,r=1}^{n_j,n_l} D_{(ij)(rl)}(u_j, u_l) \otimes |e_i\rangle\langle e_r| \otimes |f_j\rangle\langle f_l|. \tag{7}$$

The triangle inequality for the trace norm yields

$$\begin{aligned}
\|D(u_1, \ldots, u_N)\|_1 &\le \frac{1}{N}\sum_{j,l=1}^{N} \|D_{j,l}(u_j, u_l)\|_1 \\
&\le \frac{1}{N}\sum_{j,l=1}^{N} \sqrt{k \min\{n_j, n_l\}}\,\|D_{j,l}(u_j, u_l)\|_2 \\
&= \frac{1}{N}\sum_{j,l=1}^{N} \sqrt{k \min\{n_j, n_l\}\,\|D_{j,l}(u_j, u_l)\|_2^2},
\end{aligned} \tag{8}$$

where the second line follows from $\|a\|_1 \le \sqrt{d}\,\|a\|_2$, d being the number of non-zero singular values of a. In the next step we will compute $\|D_{j,l}(u_j, u_l)\|_2^2$. We set $p_l := v_l p v_l^{*}$, which defines new projections $\{p_l\}_{l=1}^{N}$ with $\mathrm{supp}(p_l) \subset \mathcal G_l$ for every l ∈ {1, . . . , N}. A glance at (7) shows that

$$(D_{j,l}(u_j, u_l))^{*} = \sum_{i=1,r=1}^{n_j,n_l} (D_{(ij)(rl)}(u_j, u_l))^{*} \otimes |e_r\rangle\langle e_i| \otimes |f_l\rangle\langle f_j|, \tag{9}$$

and consequently we obtain

$$\begin{aligned}
\|D_{j,l}(u_j, u_l)\|_2^2 &= \mathrm{tr}\big((D_{j,l}(u_j, u_l))^{*} D_{j,l}(u_j, u_l)\big) \\
&= \sum_{i=1,r=1}^{n_j,n_l} \mathrm{tr}\big((D_{(ij)(rl)}(u_j, u_l))^{*} D_{(ij)(rl)}(u_j, u_l)\big) \\
&= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_l} \Big\{\mathrm{tr}\big(p\,(a_{j,i}^{*} a_{l,r})^{*}\,p\,a_{j,i}^{*} a_{l,r}\big) - \frac{1}{k}\big|\mathrm{tr}(p\,a_{j,i}^{*} a_{l,r})\big|^2\Big\} \\
&= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_l} \Big\{\mathrm{tr}\big(p_l u_l^{*} b_{l,r}^{*} b_{j,i} u_j p_j u_j^{*} b_{j,i}^{*} b_{l,r} u_l\big) - \frac{1}{k}\big|\mathrm{tr}(p\,v_j^{*} u_j^{*} b_{j,i}^{*} b_{l,r} u_l v_l)\big|^2\Big\}.
\end{aligned} \tag{10}$$

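It is apparent from the last two lines in (10) that $\|D_{j,l}(u_j, u_l)\|_2^2$ depends measurably on $(u_1, \ldots, u_N) \in \mathcal U(\mathcal G_1) \times \cdots \times \mathcal U(\mathcal G_N)$. Let $U_1, \ldots, U_N$ be independent random variables taking values in $\mathcal U(\mathcal G_i)$ according to the normalized Haar measure on $\mathcal U(\mathcal G_i)$ (i ∈ {1, . . . , N}). Then using Jensen's inequality and abbreviating $L_{jl} := k \min\{n_j, n_l\}$ we can infer from (8) that

$$\mathbb E(\|D(U_1, \ldots, U_N)\|_1) \le \frac{1}{N}\sum_{j,l=1}^{N} \sqrt{L_{jl}\,\mathbb E(\|D_{j,l}(U_j, U_l)\|_2^2)}. \tag{11}$$

Note that the expectations on the RHS of (11) are only with respect to pairs of random variables $U_1, \ldots, U_N$. Our next goal is to upper-bound $\mathbb E(\|D_{j,l}(U_j, U_l)\|_2^2)$.

Case j ≠ l. Since the last term in (10) is non-negative and the random variables $U_j$ and $U_l$ are independent we obtain the following chain of inequalities:

$$\begin{aligned}
\mathbb E(\|D_{j,l}(U_j, U_l)\|_2^2) &= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_l} \Big\{\mathbb E\,\mathrm{tr}\big(p_l U_l^{*} b_{l,r}^{*} b_{j,i} U_j p_j U_j^{*} b_{j,i}^{*} b_{l,r} U_l\big) - \frac{1}{k}\mathbb E\big|\mathrm{tr}(p\,v_j^{*} U_j^{*} b_{j,i}^{*} b_{l,r} U_l v_l)\big|^2\Big\} \\
&\le \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_l} \mathbb E\,\mathrm{tr}\big(p_l U_l^{*} b_{l,r}^{*} b_{j,i} U_j p_j U_j^{*} b_{j,i}^{*} b_{l,r} U_l\big) \\
&= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_l} \mathbb E\,\mathrm{tr}\big(U_l p_l U_l^{*} b_{l,r}^{*} b_{j,i} U_j p_j U_j^{*} b_{j,i}^{*} b_{l,r}\big) \\
&= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_l} \mathrm{tr}\big(\mathbb E(U_l p_l U_l^{*})\, b_{l,r}^{*} b_{j,i}\, \mathbb E(U_j p_j U_j^{*})\, b_{j,i}^{*} b_{l,r}\big) \\
&= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_l} \mathrm{tr}\big(k \cdot \pi_{\mathcal G_l}\, b_{l,r}^{*} b_{j,i}\, k \cdot \pi_{\mathcal G_j}\, b_{j,i}^{*} b_{l,r}\big) \\
&= \langle \mathcal N_j(\pi_{\mathcal G_j}), \mathcal N_l(\pi_{\mathcal G_l})\rangle_{HS},
\end{aligned} \tag{12}$$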

where · , · H S denotes the Hilbert-Schmidt inner product, and we used the fact that E(Ul pl Ul∗ ) = k · πGl and E(U j p j U ∗j ) = k · πG j . Case j = l. In this case we obtain

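$$\begin{aligned}
\mathbb{E}(\|D_{j,j}(U_j,U_j)\|_2^2) &= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_j} \Big\{ \mathbb{E}\,\mathrm{tr}\big(p_j U_j^* b_{j,r}^* b_{j,i} U_j p_j U_j^* b_{j,i}^* b_{j,r} U_j\big) - \frac{1}{k}\,\mathbb{E}\,|\mathrm{tr}(p\, v_j^* U_j^* b_{j,i}^* b_{j,r} U_j v_j)|^2 \Big\} \\
&= \frac{1}{k^2}\sum_{i=1,r=1}^{n_j,n_j} \Big\{ \mathbb{E}\,\mathrm{tr}\big(U_j p_j U_j^* b_{j,r}^* b_{j,i} U_j p_j U_j^* b_{j,i}^* b_{j,r}\big) - \frac{1}{k}\,\mathbb{E}\,|\mathrm{tr}(U_j p_j U_j^* b_{j,i}^* b_{j,r})|^2 \Big\}.
\end{aligned} \tag{13}$$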

Thus, the problem reduces to the evaluation of

$$\mathbb{E}\{b_{UpU^*}(x,y)\}, \qquad (x,y \in \mathcal{B}(\mathcal{H})),$$

where $p$ is an orthogonal projection with $\mathrm{tr}(p) = k$ and

$$b_{UpU^*}(x,y) := \mathrm{tr}(UpU^* x^* UpU^* y) - \frac{1}{k}\,\mathrm{tr}(UpU^* x^*)\,\mathrm{tr}(UpU^* y),$$

for a Haar distributed random variable $U$ with values in $\mathfrak{U}(\mathcal{G})$ where $\mathrm{supp}(p) \subset \mathcal{G} \subset \mathcal{H}$. Here we can refer to [20] where the corresponding calculation is carried out via the theory of group invariants and explicit evaluations of appropriate integrals with respect to row-distributions of random unitary matrices. The result is

$$\mathbb{E}\{b_{UpU^*}(x,y)\} = \frac{k^2-1}{d^2-1}\,\mathrm{tr}(p_{\mathcal{G}}\, x^* p_{\mathcal{G}}\, y) + \frac{1-k^2}{d(d^2-1)}\,\mathrm{tr}(p_{\mathcal{G}}\, x^*)\,\mathrm{tr}(p_{\mathcal{G}}\, y), \tag{14}$$

for all $x,y \in \mathcal{B}(\mathcal{H})$ where $p_{\mathcal{G}}$ denotes the projection onto $\mathcal{G}$ with $\mathrm{tr}(p_{\mathcal{G}}) = d$. In Appendix A we will give an elementary derivation of (14) for the sake of completeness.
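As a quick plausibility check of (14), not a substitute for the derivation in Appendix A, one can look at three special cases in which the left-hand side is known without any integration. The cases below follow directly from the definition of $b_{UpU^*}$.

$$\begin{aligned}
k = 1:&\quad UpU^* = |\varphi\rangle\langle\varphi| \ \Rightarrow\ b_{UpU^*}(x,y) = \langle\varphi, x^*\varphi\rangle\langle\varphi, y\varphi\rangle - \langle\varphi, x^*\varphi\rangle\langle\varphi, y\varphi\rangle = 0, \quad\text{and both coefficients in (14) vanish.}\\
k = d:&\quad UpU^* = p_{\mathcal{G}} \ \Rightarrow\ b_{UpU^*}(x,y) = \mathrm{tr}(p_{\mathcal{G}} x^* p_{\mathcal{G}} y) - \frac{1}{d}\,\mathrm{tr}(p_{\mathcal{G}} x^*)\,\mathrm{tr}(p_{\mathcal{G}} y), \quad\text{and (14) gives the coefficients } 1 \text{ and } \frac{1-d^2}{d(d^2-1)} = -\frac{1}{d}.\\
x = y = p_{\mathcal{G}}:&\quad b_{UpU^*}(p_{\mathcal{G}},p_{\mathcal{G}}) = k - \frac{1}{k}\,k^2 = 0, \quad\text{and (14) gives } \frac{(k^2-1)\,d}{d^2-1} + \frac{(1-k^2)\,d^2}{d(d^2-1)} = 0.
\end{aligned}$$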



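Inserting (14) with $x = y = b_{j,i}^* b_{j,r}$ into (13) yields with $d_j := \mathrm{tr}(p_{\mathcal{G}_j})$:

$$\begin{aligned}
\mathbb{E}(\|D_{j,j}(U_j,U_j)\|_2^2) &= \frac{1-\frac{1}{k^2}}{d_j^2-1}\sum_{i=1,r=1}^{n_j,n_j} \Big[ \mathrm{tr}\big(p_{\mathcal{G}_j} b_{j,r}^* b_{j,i}\, p_{\mathcal{G}_j} b_{j,i}^* b_{j,r}\big) - \frac{1}{d_j}\,|\mathrm{tr}(p_{\mathcal{G}_j} b_{j,i}^* b_{j,r})|^2 \Big] \\
&\le \frac{1-\frac{1}{k^2}}{d_j^2-1}\sum_{i=1,r=1}^{n_j,n_j} \mathrm{tr}\big(p_{\mathcal{G}_j} b_{j,r}^* b_{j,i}\, p_{\mathcal{G}_j} b_{j,i}^* b_{j,r}\big) \\
&\le \frac{1}{d_j^2}\sum_{i=1,r=1}^{n_j,n_j} \mathrm{tr}\big(b_{j,r}\, p_{\mathcal{G}_j} b_{j,r}^*\, b_{j,i}\, p_{\mathcal{G}_j} b_{j,i}^*\big) \\
&= \langle \mathcal{N}_j(\pi_{\mathcal{G}_j}), \mathcal{N}_j(\pi_{\mathcal{G}_j})\rangle_{HS}.
\end{aligned}$$

Summarizing, we obtain

$$\mathbb{E}(\|D_{j,j}(U_j,U_j)\|_2^2) \le \langle \mathcal{N}_j(\pi_{\mathcal{G}_j}), \mathcal{N}_j(\pi_{\mathcal{G}_j})\rangle_{HS} = \|\mathcal{N}_j(\pi_{\mathcal{G}_j})\|_2^2. \tag{15}$$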
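The step from the prefactor appearing after inserting (14) into (13) to the prefactor $1/d_j^2$ in the chain leading to (15) rests on $k \le d_j$, which holds because $\mathrm{supp}(p_j) \subset \mathcal{G}_j$. Spelled out (for $d_j \ge 2$), the estimate reads:

$$\frac{1}{k^2}\cdot\frac{k^2-1}{d_j^2-1} = \frac{1-\frac{1}{k^2}}{d_j^2-1} \le \frac{1-\frac{1}{d_j^2}}{d_j^2-1} = \frac{1}{d_j^2}\cdot\frac{d_j^2-1}{d_j^2-1} = \frac{1}{d_j^2}.$$

The remaining factor $1/d_j^2$ is exactly what turns the two projections $p_{\mathcal{G}_j}$ into the maximally mixed states $\pi_{\mathcal{G}_j} = p_{\mathcal{G}_j}/d_j$.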

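Similarly

$$\mathbb{E}(\mathrm{tr}(\mathcal{N}_{U_1,\dots,U_N}(\pi_{\mathcal{E}}))) = \frac{1}{N}\sum_{j=1}^{N} \frac{1}{k}\,\mathbb{E}(\mathrm{tr}(\mathcal{N}_j(U_j p_j U_j^*))) = \frac{1}{N}\sum_{j=1}^{N} \mathrm{tr}(\mathcal{N}_j(\pi_{\mathcal{G}_j})). \tag{16}$$

Equations (6), (8), (12), (15), and (16) show that

$$\mathbb{E}(F_{c,e}(\pi_{\mathcal{E}}, \mathcal{N}_{U_1,\dots,U_N})) \ge \frac{1}{N}\sum_{j=1}^{N} \mathrm{tr}(\mathcal{N}_j(\pi_{\mathcal{G}_j})) - \frac{1}{N}\sum_{j,l=1}^{N} \sqrt{L_{jl} D_{jl}}, \tag{17}$$

where for $j,l \in \{1,\dots,N\}$ we introduced the abbreviation $D_{jl} := \langle \mathcal{N}_j(\pi_{\mathcal{G}_j}), \mathcal{N}_l(\pi_{\mathcal{G}_l})\rangle_{HS}$, and, as before, $L_{jl} = k\min\{n_j,n_l\}$.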



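It is obvious that $L_{jl} \le L_{jj}$ and $L_{jl} \le L_{ll}$ hold. Moreover, the Cauchy-Schwarz inequality for the Hilbert-Schmidt inner product shows that

$$D_{jl} = \langle \mathcal{N}_j(\pi_{\mathcal{G}_j}), \mathcal{N}_l(\pi_{\mathcal{G}_l})\rangle_{HS} \le \|\mathcal{N}_j(\pi_{\mathcal{G}_j})\|_2\,\|\mathcal{N}_l(\pi_{\mathcal{G}_l})\|_2 \le \max\{\|\mathcal{N}_j(\pi_{\mathcal{G}_j})\|_2^2, \|\mathcal{N}_l(\pi_{\mathcal{G}_l})\|_2^2\} = \max\{D_{jj}, D_{ll}\}.$$

Therefore, an application of Lemma 2 allows us to conclude from (17) that

$$\mathbb{E}(F_{c,e}(\pi_{\mathcal{E}}, \mathcal{N}_{U_1,\dots,U_N})) \ge \frac{1}{N}\sum_{j=1}^{N} \mathrm{tr}(\mathcal{N}_j(\pi_{\mathcal{G}_j})) - 2\sum_{j=1}^{N} \sqrt{k n_j}\,\|\mathcal{N}_j(\pi_{\mathcal{G}_j})\|_2,$$

and we are done.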

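3.4. Entanglement Fidelity. The purpose of this subsection is to develop a tool which will enable us to convert a special kind of recovery maps, which depend on the channel, into ones that are universal, at least for finite compound channels. Anticipating constructions in Sect. 4 below, the situation we will be faced with is as follows. For a finite set $\mathfrak{I} = \{\mathcal{N}_1,\dots,\mathcal{N}_N\}$ of channels, block length $l \in \mathbb{N}$, and small $\varepsilon > 0$ we will be able to find one single recovery map $\mathcal{R}^l$ and a unitary encoder $\mathcal{W}^l$ such that for each $i \in \{1,\dots,N\}$,

$$F_e(\pi_{\mathcal{F}_l}, \mathcal{R}^l \circ \mathcal{Q}_{l,i} \circ \mathcal{N}_i^{\otimes l} \circ \mathcal{W}^l) \ge 1 - \varepsilon,$$

where $\mathcal{Q}_{l,i}(\cdot) := q_{l,i}(\cdot)q_{l,i}$ with suitable projections $q_{l,i}$ acting on $\mathcal{K}^{\otimes l}$. Thus we will effectively end up with the recovery maps $\mathcal{R}^l_i := \mathcal{R}^l \circ \mathcal{Q}_{l,i}$. Consequently, it turns out that the decoder is informed. Lemma 3 below shows how to get rid of the maps $\mathcal{Q}_{l,i}$, ensuring the existence of a universal recovery map for the whole set $\mathfrak{I}$ while decreasing the entanglement fidelity only slightly.

Lemma 3. Let $\rho \in \mathcal{S}(\mathcal{H})$ for some Hilbert space $\mathcal{H}$. Let, for some other Hilbert space $\mathcal{K}$, $\mathcal{A} \in \mathcal{C}(\mathcal{H},\mathcal{K})$, $\mathcal{D} \in \mathcal{C}(\mathcal{K},\mathcal{H})$, and let $q \in \mathcal{B}(\mathcal{K})$ be an orthogonal projection.

1. Denoting by $\mathcal{Q}^{\perp}$ the completely positive map induced by $q^{\perp} := \mathbf{1}_{\mathcal{K}} - q$ we have

$$F_e(\rho, \mathcal{D} \circ \mathcal{A}) \ge F_e(\rho, \mathcal{D} \circ \mathcal{Q} \circ \mathcal{A})\,(1 - 2F_e(\rho, \mathcal{D} \circ \mathcal{Q}^{\perp} \circ \mathcal{A})).$$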

(18)

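2. If for some $\varepsilon > 0$ the relation $F_e(\rho, \mathcal{D} \circ \mathcal{Q} \circ \mathcal{A}) \ge 1 - \varepsilon$ holds, then $F_e(\rho, \mathcal{D} \circ \mathcal{Q}^{\perp} \circ \mathcal{A}) \le \varepsilon$, and (18) implies

$$F_e(\rho, \mathcal{D} \circ \mathcal{A}) \ge (1 - \varepsilon)(1 - 2\varepsilon) \ge 1 - 3\varepsilon.$$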

(19)



3. If for some > 0 merely the relation tr{qA(ρ)} ≥ 1 − holds then we can conclude that Fe (ρ, D ◦ A) ≥ Fe (ρ, D ◦ Q ◦ A) − 2.

(20)

The following Lemma 4 contains two inequalities one of which will be needed in the proof of Lemma 3. Lemma 4. Let D ∈ C(K, H) and x1 ⊥ x2 , z 1 ⊥ z 2 be state vectors, x1 , x2 ∈ K, z 1 , z 2 ∈ H. Then | z 1 , D(|x1 x2 |)z 1 | ≤ | z 1 , D(|x1 x1 |)z 1 | · | z 1 , D(|x2 x2 |)z 1 | ≤ 1, (21) and | z 1 , D(|x1 x2 |)z 2 | ≤

| z 1 , D(|x1 x1 |)z 1 | · | z 2 , D(|x2 x2 |)z 2 | ≤ 1.

(22)

We will utilize only (21) in the proof of Lemma 3. But the inequality (22) might prove useful in other contexts so that we state it here for completeness. Proof of Lemma 4. Let dim H = h, dim K = κ. Extend {x1 , x2 } to an orthonormal basis {x1 , x2 , . . . , xκ } of K and {z 1 , z 2 } to an orthonormal basis {z 1 , z 2 , . . . , z h } on H. Since x1 ⊥ x2 and z 1 ⊥ z 2 , this can always be done. By the theorem of Choi [6], a linear map from B(H) to B(K) is completely positive if and only if its Choi matrix is positive. h ij Write D(|xi x j |) = k,l=1 Dkl |z k zl |. Then the Choi matrix of D is, with respect to the bases {x1 , . . . , xk } and {z 1 , . . . , z h }, written as CHOI(D) =

κ

|xi x j | ⊗

i, j=1

h

ij

Dkl |z k zl |.

k,l=1

If CHOI(D) is positive, then all principal minors of CHOI(D) are positive (cf. Corollary 7.1.5 in [17]) and thus ij ii | · |D j j | |Dkl | ≤ |Dkk ll for every suitable choice of i, j, k, l. Thus 12 | z 1 |D(|x1 x2 |)z 2 | = |D12 | 11 | · |D 22 | ≤ |D11 22 = | z 1 , D(|x1 x1 |)z 1 | · | z 2 , D(|x2 x2 |)z 2 |,

and similarly | z 1 , D(|x1 x2 |)z 1 | ≤

| z 1 , D(|x1 x1 |)z 1 | · | z 1 , D(|x2 x2 |)z 1 |.

The fact that D is trace preserving gives us the estimate z i , D(|x j x j |)z i ≤ 1 (i, j suitably chosen) and we are done.



Proof of Lemma 3. Let dim H = h, dim K = κ, |ψ ψ| ∈ Ha ⊗ H be a purification of ρ (w.l.o.g. Ha = H). Set D˜ := idHa ⊗ D, A˜ := idHa ⊗ A, q˜ := 1Ha ⊗ q and, as usual, q˜ ⊥ the orthocomplement of q˜ within Ha ⊗ K. Obviously, ˜ Fe (ρ, D ◦ A) = ψ, D˜ ◦ A(|ψ ψ|)ψ ˜ ˜ q˜ + q˜ ⊥ ]A(|ψ ψ|)[ q˜ + q˜ ⊥ ])ψ = ψ, D([ ˜ ˜ q˜ A(|ψ ψ|) ˜ ˜ q˜ ⊥ A(|ψ ψ|) q˜ ⊥ )ψ = ψ, D( q)ψ ˜ + ψ, D( ⊥ ⊥ ˜ ˜ q˜ A(|ψ ψ|) ˜ ˜ q˜ A(|ψ ψ|) q)ψ ˜ + ψ, D( q˜ )ψ + ψ, D( ˜ q˜ A(|ψ ψ|) ˜ ˜ q˜ A(|ψ ψ|) ˜ ≥ ψ, D( q)ψ ˜ + 2{ ψ, D( q˜ ⊥ )ψ} ˜ q˜ A(|ψ ψ|) ˜ ˜ q˜ A(|ψ ψ|) ˜ ≥ ψ, D( q)ψ ˜ − 2| ψ, D( q˜ ⊥ )ψ| ˜ q˜ A(|ψ ψ|) ˜ q˜ ⊥ )ψ|. = Fe (ρ, D ◦ Q ◦ A) − 2| ψ, D(

(23)

We establish a lower bound on the second term on the RHS of (23). Let ˜ A(|ψ ψ|) =

κ·h

λi |ai ai |,

i=1

where {a1 , . . . , aκ·h } are assumed to form an orthonormal basis. Now every ai can be written as ai = αi xi + βi yi where xi ∈ supp(q) ˜ and yi ∈ supp(q˜ ⊥ ), i ∈ {1, ..., κ · h}, ˜ then are state vectors and αi , βi ∈ C. Define σ := A(|ψ ψ|), σ =

κ·h

λ j (|α j |2 |x j x j | + α j β ∗j |x j y j | + β j α ∗j |y j x j | + |β j |2 |y j y j |).

(24)

j=1

˜ q˜ A(|ψ ψ|) ˜ Set X := | ψ, D( q˜ ⊥ )ψ|. Then ˜ qσ X = | ψ, D( ˜ q ⊥ )ψ| a

=|

κ·h

˜ q|a λi ψ, D( ˜ i ai |q˜ ⊥ )ψ|

i=1

=|

κ·h

˜ i yi |)ψ| λi αi βi∗ ψ, D(|x

i=1

≤

κ·h

˜ i yi |)ψ| |λi αi βi∗ | · | ψ, D(|x

i=1

κ·h b ˜ ˜ i yi |)ψβ ∗ | ≤ | λi | ψ, D(|xi xi |)ψαi λi ψ, D(|y i i=1 c

≤

κ·h i=1

˜ i xi |)ψ λi |αi |2 ψ, D(|x

κ·h

˜ j y j |)ψ. λ j |β j |2 ψ, D(|y

(25)

j=1

˜ Here, a follows from using the convex decomposition of A(|ψ ψ|), b from utilizing inequality (21) from Lemma 4 and c is an application of the Cauchy-Schwarz inequality.



Now, employing the representation (24) it is easily seen that ˜ q˜ A(|ψ ψ|) ˜ Fe (ρ, D ◦ Q ◦ A) = ψ, D( q)ψ ˜ =

κ·h

˜ i , xi |)ψ λi |αi |2 ψ, D(|x

(26)

i=1

and similarly Fe (ρ, D ◦ Q ◦ A) =

κ·h

˜ j y j |)ψ. λ j |β j |2 ψ, D(|y

(27)

j=1

The inequalities (27), (26), (25), and (23) yield Fe (ρ, D ◦ A) ≥ Fe (ρ, D ◦ Q ◦ A) − 2Fe (ρ, D ◦ Q ◦ A)Fe (ρ, D ◦ Q⊥ ◦ A) = Fe (ρ, D ◦ Q ◦ A)(1 − 2Fe (ρ, D ◦ Q⊥ ◦ A))

(28)

which establishes (18). Let us turn now to the other assertions stated in the lemma. Let tr{qA(ρ)} ≥ 1 − . This implies tr(q ⊥ A(ρ)) ≤ . A direct calculation yields tr(q˜ ⊥ σ ) = tr Ha (tr K ((1Ha ⊗ q ⊥ )idHa ⊗ A(|ψ ψ|))) = tr K (q ⊥ A(tr Ha (|ψ ψ|))) = tr K (q ⊥ A(ρ)) ≤ . Using (24), we get the useful inequality ≥ tr(q˜ ⊥ σ ) =

κ·h

λi |βi |2 tr(q˜ ⊥ |yi yi |)

i=1

=

κ·h

λi |βi |2 .

(29)

i=1

Using Lemma 4 and (29) we get X ≤

κ·h i=1

λi |αi |2

κ·h

λ j |β j |2

j=1

≤ , thus by Eq. (23) we have Fe (ρ, D ◦ A) ≥ Fe (ρ, D ◦ Q ◦ A) − 2. In case that Fe (ρ, D ◦ Q ◦ A) ≥ 1 − , we note that the linear maps Q and Q⊥ are elements of C ↓ (K, K) whilst Q + Q⊥ ∈ C(K, K) and since Fe is affine in the operation Fe (ρ, D ◦ Q ◦ A) + Fe (ρ, D ◦ Q⊥ ◦ A) = Fe (ρ, D ◦ (Q + Q⊥ ) ◦ A) ≤ 1



has to hold. This in turn implies Fe (ρ, D ◦ Q⊥ ◦ A) ≤ . Using this, our assumption that Fe (ρ, D ◦ Q ◦ A) ≥ 1 − , and (28) we obtain that Fe (ρ, D ◦ A) ≥ Fe (ρ, D ◦ Q ◦ A)(1 − 2Fe (ρ, D ◦ Q⊥ ◦ A)) ≥ (1 − )(1 − 2) ≥ 1 − 3, which is the claim we made in (19).
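The affinity of $F_e$ in the operation, used in the last step of the proof, can be illustrated on a small example. The sketch below relies on the standard Kraus-operator formula $F_e(\rho,\mathcal{N}) = \sum_i |\mathrm{tr}(\rho K_i)|^2$; the qubit dephasing channel playing the role of $\mathcal{A}$, the trivial choice $\mathcal{D} = \mathrm{id}$, the projection $q = |0\rangle\langle 0|$ and all helper names are assumptions made for illustration only.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def entanglement_fidelity(rho, kraus):
    """F_e(rho, N) = sum_i |tr(rho K_i)|^2 for an operation with Kraus operators K_i."""
    return sum(abs(trace(matmul(rho, k))) ** 2 for k in kraus)

p = 0.1  # hypothetical dephasing probability
identity = [[1.0, 0.0], [0.0, 1.0]]
pauli_z = [[1.0, 0.0], [0.0, -1.0]]
kraus_a = [scale(math.sqrt(1 - p), identity), scale(math.sqrt(p), pauli_z)]

q = [[1.0, 0.0], [0.0, 0.0]]       # projection onto |0>
q_perp = [[0.0, 0.0], [0.0, 1.0]]  # its orthocomplement
rho = [[0.5, 0.0], [0.0, 0.5]]     # maximally mixed qubit state

# Kraus operators of Q o A and Q-perp o A (D is the identity here).
kraus_q = [matmul(q, k) for k in kraus_a]
kraus_q_perp = [matmul(q_perp, k) for k in kraus_a]

fe_a = entanglement_fidelity(rho, kraus_a)
fe_q = entanglement_fidelity(rho, kraus_q)
fe_q_perp = entanglement_fidelity(rho, kraus_q_perp)
fe_sum_map = entanglement_fidelity(rho, kraus_q + kraus_q_perp)
print(fe_a, fe_q, fe_q_perp, fe_sum_map)
```

Note that $\mathcal{Q} + \mathcal{Q}^{\perp}$ is the pinching map, not the identity channel, so the sum of the two fidelities need not equal the fidelity of $\mathcal{D} \circ \mathcal{A}$; it is only bounded by 1.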

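4. Direct Part of the Coding Theorem for Finitely Many Channels

4.1. Typical Projections and Kraus Operators. In this subsection we recall briefly the well-known properties of frequency typical projections and reduced operations. A more detailed description can be found in [4] and references therein.

Lemma 5. There is a real number $c > 0$ such that for every Hilbert space $\mathcal{H}$ there exist functions $h: \mathbb{N} \to \mathbb{R}_+$, $\varphi: (0,1/2) \to \mathbb{R}_+$ with $\lim_{l\to\infty} h(l) = 0$ and $\lim_{\delta\to 0} \varphi(\delta) = 0$ such that for any $\rho \in \mathcal{S}(\mathcal{H})$, $\delta \in (0,1/2)$, $l \in \mathbb{N}$ there is an orthogonal projection $q_{\delta,l} \in \mathcal{B}(\mathcal{H})^{\otimes l}$, called frequency-typical projection, that satisfies

1. $\mathrm{tr}(\rho^{\otimes l} q_{\delta,l}) \ge 1 - 2^{-l(c\delta^2 - h(l))}$,
2. $q_{\delta,l}\,\rho^{\otimes l} q_{\delta,l} \le 2^{-l(S(\rho) - \varphi(\delta))}\, q_{\delta,l}$.

The inequality 2 implies $\|q_{\delta,l}\,\rho^{\otimes l} q_{\delta,l}\|_2^2 \le 2^{-l(S(\rho) - \varphi(\delta))}$. Moreover, setting $d := \dim\mathcal{H}$, $\varphi$ and $h$ are given by

$$h(l) = \frac{d}{l}\log(l+1) \quad \forall l \in \mathbb{N}, \qquad \varphi(\delta) = -\delta\log\frac{\delta}{d} \quad \forall \delta \in (0,1/2).$$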
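To get a feeling for how quickly the correction terms in Lemma 5 vanish, the two functions can be tabulated directly. This is only a sketch: the logarithm is taken to base 2 (an assumption suggested by the powers of 2 in the lemma), the function names are chosen here, and $d = 2$ is an arbitrary example dimension.

```python
import math

def h(l, d):
    """h(l) = (d / l) * log(l + 1) from Lemma 5, with log taken to base 2 (assumption)."""
    return d / l * math.log2(l + 1)

def phi(delta, d):
    """phi(delta) = -delta * log(delta / d) from Lemma 5, base-2 log (assumption)."""
    return -delta * math.log2(delta / d)

d = 2  # example: a qubit
for l in (10, 1000, 10**6):
    print(l, h(l, d))
for delta in (0.25, 0.01, 1e-6):
    print(delta, phi(delta, d))
```

Both quantities decay only like $\log(x)/x$ and $\delta\log(1/\delta)$ respectively, so in the coding theorems $\delta$ is fixed first and the block length $l$ is then taken large.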

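Lemma 6. Let $\mathcal{H}$, $\mathcal{K}$ be finite dimensional Hilbert spaces. There are functions $\gamma: (0,1/2) \to \mathbb{R}_+$, $h': \mathbb{N} \to \mathbb{R}_+$ satisfying $\lim_{\delta\to 0}\gamma(\delta) = 0$ and $h'(l) \searrow 0$ such that for each $\mathcal{N} \in \mathcal{C}(\mathcal{H},\mathcal{K})$, $\delta \in (0,1/2)$, $l \in \mathbb{N}$ and maximally mixed state $\pi_{\mathcal{G}}$ on some subspace $\mathcal{G} \subset \mathcal{H}$ there is an operation $\mathcal{N}_{\delta,l} \in \mathcal{C}^{\downarrow}(\mathcal{H}^{\otimes l},\mathcal{K}^{\otimes l})$, called the reduced operation with respect to $\mathcal{N}$ and $\pi_{\mathcal{G}}$, that satisfies

1. $\mathrm{tr}(\mathcal{N}_{\delta,l}(\pi_{\mathcal{G}}^{\otimes l})) \ge 1 - 2^{-l(c'\delta^2 - h'(l))}$, with a universal positive constant $c' > 0$,
2. $\mathcal{N}_{\delta,l}$ has a Kraus representation with at most $n_{\delta,l} \le 2^{l(S_e(\pi_{\mathcal{G}},\mathcal{N}) + \gamma(\delta) + h'(l))}$ Kraus operators,
3. for every state $\rho \in \mathcal{S}(\mathcal{H}^{\otimes l})$ and every two channels $\mathcal{I} \in \mathcal{C}^{\downarrow}(\mathcal{H}^{\otimes l},\mathcal{H}^{\otimes l})$ and $\mathcal{L} \in \mathcal{C}^{\downarrow}(\mathcal{K}^{\otimes l},\mathcal{H}^{\otimes l})$ the inequality $F_e(\rho, \mathcal{L} \circ \mathcal{N}_{\delta,l} \circ \mathcal{I}) \le F_e(\rho, \mathcal{L} \circ \mathcal{N}^{\otimes l} \circ \mathcal{I})$ is fulfilled.

Setting $d := \dim\mathcal{H}$ and $\kappa := \dim\mathcal{K}$, the function $h': \mathbb{N} \to \mathbb{R}_+$ is given by $h'(l) = \frac{d\cdot\kappa}{l}\log(l+1)$ $\forall l \in \mathbb{N}$ and $\gamma$ by $\gamma(\delta) = -\delta\log\frac{\delta}{d\cdot\kappa}$, $\forall \delta \in (0,1/2)$.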



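4.2. The Case of Uninformed Users. Let us consider a compound channel given by a finite set $\mathfrak{I} := \{\mathcal{N}_1,\dots,\mathcal{N}_N\} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ and a subspace $\mathcal{G} \subset \mathcal{H}$. For every $l \in \mathbb{N}$, we choose a subspace $\mathcal{E}_l \subset \mathcal{G}^{\otimes l}$. As usual, $\pi_{\mathcal{E}_l}$ and $\pi_{\mathcal{G}}$ denote the maximally mixed states on $\mathcal{E}_l$ and $\mathcal{G}$, respectively, while $k_l := \dim\mathcal{E}_l$ gives the dimension of $\mathcal{E}_l$. For $j \in \{1,\dots,N\}$, $\delta \in (0,1/2)$, $l \in \mathbb{N}$ and states $\mathcal{N}_j(\pi_{\mathcal{G}})$, let $q_{j,\delta,l} \in \mathcal{B}(\mathcal{K})^{\otimes l}$ be the frequency-typical projection of $\mathcal{N}_j(\pi_{\mathcal{G}})$ and $\mathcal{N}_{j,\delta,l}$ be the reduced operation associated with $\mathcal{N}_j$ and $\pi_{\mathcal{G}}$ as defined in Subsect. 4.1. These quantities enable us to define a new set of channels that is more adapted to our problem than the original one. We set for an arbitrary unitary operation $u^l \in \mathcal{B}(\mathcal{H}^{\otimes l})$,

$$\hat{\mathcal{N}}^l_{j,u^l,\delta} := \mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_{j,\delta,l} \circ \mathcal{U}^l$$

and, accordingly,

$$\hat{\mathcal{N}}^l_{u^l,\delta} := \frac{1}{N}\sum_{j=1}^{N} \hat{\mathcal{N}}^l_{j,u^l,\delta}.$$

We will show the existence of good codes for the reduced channels $\mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_{j,\delta,l}$ in the limit of large $l \in \mathbb{N}$. An application of Lemma 3 and Lemma 6 will then show that these codes are also good for the original compound channel. Let $U^l$ be a random variable taking values in $\mathfrak{U}(\mathcal{G}^{\otimes l})$ which is distributed according to the Haar measure. Application of Theorem 3 yields

$$\mathbb{E}F_{c,e}(\pi_{\mathcal{E}_l}, \hat{\mathcal{N}}^l_{U^l,\delta}) \ge \mathrm{tr}(\hat{\mathcal{N}}^l_{\delta}(\pi_{\mathcal{G}}^{\otimes l})) - 2\sum_{j=1}^{N} \sqrt{k_l\, n_{j,\delta,l}}\;\|\hat{\mathcal{N}}^l_{j,\delta}(\pi_{\mathcal{G}}^{\otimes l})\|_2, \tag{30}$$

where $n_{j,\delta,l}$ stands for the number of Kraus operators of the reduced operation $\mathcal{N}_{j,\delta,l}$ ($j \in \{1,\dots,N\}$) and

$$\hat{\mathcal{N}}^l_{j,\delta} := \mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_{j,\delta,l}, \qquad \hat{\mathcal{N}}^l_{\delta} := \frac{1}{N}\sum_{j=1}^{N} \hat{\mathcal{N}}^l_{j,\delta}.$$

Notice that $\mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_{j,\delta,l}$ trivially has a Kraus representation containing exactly $n_{j,\delta,l}$ elements. We will use inequality (30) in the proof of the following theorem.

Theorem 5 (Direct Part: Uninformed Users and $|\mathfrak{I}| < \infty$). Let $\mathfrak{I} = \{\mathcal{N}_1,\dots,\mathcal{N}_N\} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ be a compound channel and $\pi_{\mathcal{G}}$ the maximally mixed state associated to a subspace $\mathcal{G} \subset \mathcal{H}$. Then

$$Q(\mathfrak{I}) \ge \min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i).$$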
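To make the lower bound of Theorem 5 concrete, the sketch below evaluates $\min_i I_c(\pi_{\mathcal{G}}, \mathcal{N}_i)$ for a hypothetical compound channel consisting of two qubit dephasing channels, with $\mathcal{G} = \mathcal{H} = \mathbb{C}^2$. It uses the standard expressions $I_c(\rho,\mathcal{N}) = S(\mathcal{N}(\rho)) - S_e(\rho,\mathcal{N})$ and $S_e(\rho,\mathcal{N}) = S(W)$ with $W_{ij} = \mathrm{tr}(K_i \rho K_j^*)$. The helper names, the restriction to two Kraus operators (so that closed-form $2\times 2$ eigenvalues suffice) and the dephasing parameters are assumptions for illustration.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def entropy_2x2(m):
    """Base-2 von Neumann entropy of a 2x2 positive semidefinite Hermitian matrix."""
    a, d, b = complex(m[0][0]).real, complex(m[1][1]).real, m[0][1]
    mean = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return -sum(x * math.log2(x) for x in (mean + rad, mean - rad) if x > 1e-15)

def coherent_information(rho, kraus):
    """I_c(rho, N) = S(N(rho)) - S(W) for a qubit channel with exactly two Kraus operators."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        term = matmul(matmul(k, rho), dagger(k))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    w = [[trace(matmul(matmul(ki, rho), dagger(kj))) for kj in kraus] for ki in kraus]
    return entropy_2x2(out) - entropy_2x2(w)

def dephasing(p):
    """Kraus operators sqrt(1-p) * 1 and sqrt(p) * Z of a qubit dephasing channel."""
    a, b = math.sqrt(1 - p), math.sqrt(p)
    return [[[a, 0.0], [0.0, a]], [[b, 0.0], [0.0, -b]]]

pi_g = [[0.5, 0.0], [0.0, 0.5]]           # maximally mixed state on G = C^2
compound = [dephasing(0.1), dephasing(0.2)]  # hypothetical finite compound channel
bound = min(coherent_information(pi_g, n) for n in compound)
print(bound)
```

For dephasing channels and the maximally mixed input, the coherent information reduces to one minus the binary entropy of the dephasing probability, so the worse of the two channels determines the bound.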

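Proof We show that for every $\varepsilon > 0$ the number $\min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \varepsilon$ is an achievable rate for $\mathfrak{I}$.

1) If $\min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \varepsilon \le 0$, there is nothing to prove.

2) Let $\min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \varepsilon > 0$. Choose $\delta \in (0,1/2)$ and $l_0 \in \mathbb{N}$ satisfying $\gamma(\delta) + \varphi(\delta) + h'(l_0) \le \varepsilon/2$ with functions $\gamma, \varphi, h'$ from Lemmas 5 and 6. Now choose for every $l \in \mathbb{N}$ a subspace $\mathcal{E}_l \subset \mathcal{G}^{\otimes l}$ such that

$$\dim\mathcal{E}_l =: k_l = 2^{l(\min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \varepsilon)}.$$

By $S(\pi_{\mathcal{G}}) \ge I_c(\pi_{\mathcal{G}}, \mathcal{N}_j)$ (see [1]), this is always possible. Obviously,

$$\min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \varepsilon - o(l^0) \le \frac{1}{l}\log k_l \le \min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \varepsilon.$$

We will now give lower bounds on the terms in (30), thereby making use of Lemma 5 and Lemma 6:

$$\mathrm{tr}(\hat{\mathcal{N}}^l_{\delta}(\pi_{\mathcal{G}}^{\otimes l})) \ge 1 - 2^{-l(c\delta^2 - h(l))} - 2^{-l(c'\delta^2 - h'(l))}. \tag{31}$$

A more detailed calculation can be found in [4] or [20]. Further, and additionally using the inequality $\|A + B\|_2^2 \ge \|A\|_2^2 + \|B\|_2^2$ valid for non-negative operators $A, B \in \mathcal{B}(\mathcal{K}^{\otimes l})$ (see [20]), we get the inequality

$$\|\hat{\mathcal{N}}^l_{j,\delta}(\pi_{\mathcal{G}}^{\otimes l})\|_2^2 \le 2^{-l(S(\mathcal{N}_j(\pi_{\mathcal{G}})) - \varphi(\delta))}. \tag{32}$$

From (30), (31), (32) and our specific choice of $k_l$ it follows that

$$\begin{aligned}
\mathbb{E}F_{c,e}(\pi_{\mathcal{E}_l}, \hat{\mathcal{N}}^l_{U^l,\delta}) &\ge 1 - 2^{-l(c\delta^2 - h(l))} - 2^{-l(c'\delta^2 - h'(l))} - 2\sum_{j=1}^{N} \sqrt{2^{l(\frac{1}{l}\log k_l + \gamma(\delta) + \varphi(\delta) + h'(l) - I_c(\pi_{\mathcal{G}}, \mathcal{N}_j))}} \\
&\ge 1 - 2^{-l(c\delta^2 - h(l))} - 2^{-l(c'\delta^2 - h'(l))} - 2N\sqrt{2^{-l(\varepsilon - \gamma(\delta) - \varphi(\delta) - h'(l))}}.
\end{aligned}$$

Since $\varepsilon - \gamma(\delta) - \varphi(\delta) - h'(l) \ge \varepsilon/2$ for every $l \ge l_0$, this shows the existence of at least one sequence of $(l,k_l)$-codes for $\mathfrak{I}$ with uninformed users and

$$\liminf_{l\to\infty} \frac{1}{l}\log k_l = \min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \varepsilon,$$

as well as (using that entanglement fidelity is affine in the channel), for every $l \in \mathbb{N}$,

$$\min_{j \in \{1,\dots,N\}} F_e(\pi_{\mathcal{F}_l}, \mathcal{R}^l \circ \hat{\mathcal{N}}^l_{j,\delta} \circ \mathcal{W}^l) \ge 1 - \frac{1}{3}N\varepsilon_l, \tag{33}$$

where $w^l \in \mathfrak{U}(\mathcal{G}^{\otimes l})$ $\forall l \in \mathbb{N}$ and

$$\varepsilon_l = 3\cdot\Big(2^{-l(c\delta^2 - h(l))} + 2^{-l(c'\delta^2 - h'(l))} + 2N\sqrt{2^{-l(\varepsilon - \gamma(\delta) - \varphi(\delta) - h'(l))}}\Big). \tag{34}$$

Note that $\lim_{l\to\infty}\varepsilon_l = 0$ exponentially fast, as can be seen from our choice of $\delta$ and $l_0$. For every $j \in \{1,\dots,N\}$ and $l \in \mathbb{N}$ we thus have, by property 3 of Lemma 6, construction of $\hat{\mathcal{N}}^l_{j,w^l,\delta}$, and Eq. (33),

$$\begin{aligned}
F_e(\pi_{\mathcal{F}_l}, \mathcal{R}^l \circ \mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_j^{\otimes l} \circ \mathcal{W}^l) &\ge F_e(\pi_{\mathcal{F}_l}, \mathcal{R}^l \circ \mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_{j,\delta,l} \circ \mathcal{W}^l) \\
&= F_e(\pi_{\mathcal{F}_l}, \mathcal{R}^l \circ \hat{\mathcal{N}}^l_{j,w^l,\delta}) \\
&\ge 1 - \frac{1}{3}N\varepsilon_l.
\end{aligned}$$

By the first two parts of Lemma 3, this immediately implies

$$\min_{\mathcal{N}_j \in \mathfrak{I}} F_e(\pi_{\mathcal{F}_l}, \mathcal{R}^l \circ \mathcal{N}_j^{\otimes l} \circ \mathcal{W}^l) \ge 1 - N\varepsilon_l \quad \forall l \in \mathbb{N}. \tag{35}$$

Since $\varepsilon > 0$ was arbitrary, we have shown that $\min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i)$ is an achievable rate.

4.3. The Informed Encoder. In this subsection we shall prove the following theorem:

Theorem 6 (Direct Part: Informed Encoder and $|\mathfrak{I}| < \infty$). For every finite compound channel $\mathfrak{I} = \{\mathcal{N}_1,\dots,\mathcal{N}_N\} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ and any set $\{\pi_{\mathcal{G}_1},\dots,\pi_{\mathcal{G}_N}\}$ of maximally mixed states on subspaces $\{\mathcal{G}_1,\dots,\mathcal{G}_N\}$ with $\mathcal{G}_i \subset \mathcal{H}$ for all $i \in \{1,\dots,N\}$ we have

$$Q_{IE}(\mathfrak{I}) \ge \min_{\mathcal{N}_i \in \mathfrak{I}} I_c(\pi_{\mathcal{G}_i}, \mathcal{N}_i).$$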

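Proof Let a compound channel be given by a finite set $\mathfrak{I} := \{\mathcal{N}_1,\dots,\mathcal{N}_N\} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ and let $\mathcal{G}_1,\dots,\mathcal{G}_N$ be arbitrary subspaces of $\mathcal{H}$. We will prove that for every $\varepsilon > 0$ the value

$$R(\varepsilon) := \min_{1\le i\le N} I_c(\pi_{\mathcal{G}_i}, \mathcal{N}_i) - \varepsilon$$

is achievable. If $R(\varepsilon) \le 0$, there is nothing to prove. Hence we assume $R(\varepsilon) > 0$. For every $l \in \mathbb{N}$ and all $i \in \{1,\dots,N\}$ we choose the following. First, a subspace $\mathcal{E}_l \subset \mathcal{H}^{\otimes l}$ of dimension $k_l := \dim\mathcal{E}_l$ that satisfies $k_l \le \dim\mathcal{G}_i^{\otimes l}$. Second, a set $\{v_1^l,\dots,v_N^l\}$ of unitary operators with the property $v_i^l\mathcal{E}_l \subset \mathcal{G}_i^{\otimes l}$. Again, the maximally mixed states associated to the above mentioned subspaces are denoted by $\pi_{\mathcal{E}_l}$ on $\mathcal{E}_l$ and $\pi_{\mathcal{G}_i}$ on $\mathcal{G}_i$. For $j \in \{1,\dots,N\}$, $\delta \in (0,1/2)$, $l \in \mathbb{N}$ and states $\mathcal{N}_j(\pi_{\mathcal{G}_j})$ let $q_{j,\delta,l} \in \mathcal{B}(\mathcal{K})^{\otimes l}$ be the frequency-typical projection of $\mathcal{N}_j(\pi_{\mathcal{G}_j})$ and $\mathcal{N}_{j,\delta,l}$ be the reduced operation associated with $\mathcal{N}_j$ and $\pi_{\mathcal{G}_j}$ as considered in Sect. 4.1. Let, for the moment, $l \in \mathbb{N}$ be fixed. We define a new set of channels that is more adapted to our problem than the original one. We set, for an arbitrary set $\{u_1^l,\dots,u_N^l\}$ of unitary operators on $\mathcal{H}^{\otimes l}$:

$$\tilde{\mathcal{N}}^l_{j,\delta} := \mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_{j,\delta,l}, \qquad \hat{\mathcal{N}}^l_{j,u_j^l,\delta} := \tilde{\mathcal{N}}^l_{j,\delta} \circ \mathcal{U}_j^l \circ \mathcal{V}_j^l,$$

and, accordingly,

$$\hat{\mathcal{N}}^l_{u_1^l,\dots,u_N^l,\delta} := \frac{1}{N}\sum_{j=1}^{N} \hat{\mathcal{N}}^l_{j,u_j^l,\delta}.$$

We will first show the existence of good unitary encodings and a recovery operation for $\{\tilde{\mathcal{N}}^l_{1,\delta},\dots,\tilde{\mathcal{N}}^l_{N,\delta}\}$. Like in the previous subsection, application of Lemma 3 will enable us to show the existence of reliable encodings and a recovery operation for the original compound channel $\mathfrak{I}$.

Let $U_1^l,\dots,U_N^l$ be independent random variables such that each $U_i^l$ takes on values in $\mathfrak{U}(\mathcal{G}_i^{\otimes l})$ and is distributed according to the Haar measure on $\mathfrak{U}(\mathcal{G}_i^{\otimes l})$ ($i \in \{1,\dots,N\}$). By Theorem 4 we get the lower bound

$$\mathbb{E}F_{c,e}(\pi_{\mathcal{E}_l}, \hat{\mathcal{N}}^l_{U_1^l,\dots,U_N^l,\delta}) \ge \sum_{j=1}^{N}\Big[\frac{1}{N}\,\mathrm{tr}(\tilde{\mathcal{N}}^l_{j,\delta}(\pi_{\mathcal{G}_j}^{\otimes l})) - 2\sqrt{k_l\, n_{j,\delta,l}}\;\|\tilde{\mathcal{N}}^l_{j,\delta}(\pi_{\mathcal{G}_j}^{\otimes l})\|_2\Big], \tag{36}$$

where $n_{j,\delta,l}$ denotes the number of Kraus operators in the operations $\tilde{\mathcal{N}}^l_{j,\delta}$ ($j \in \{1,\dots,N\}$). By Lemmas 5, 6 for every $j \in \{1,\dots,N\}$ the corresponding term in the above sum can be bounded from below through

$$\frac{1}{N}\,\mathrm{tr}(\tilde{\mathcal{N}}^l_{j,\delta}(\pi_{\mathcal{G}_j}^{\otimes l})) \ge \frac{1}{N}\big(1 - 2^{-l(c\delta^2 - h(l))} - 2^{-l(c'\delta^2 - h'(l))}\big)$$

and

$$-2\sqrt{k_l\, n_{j,\delta,l}}\;\|\tilde{\mathcal{N}}^l_{j,\delta}(\pi_{\mathcal{G}_j}^{\otimes l})\|_2 \ge -2\sqrt{k_l\cdot 2^{l(-\min_{1\le j\le N} I_c(\pi_{\mathcal{G}_j}, \mathcal{N}_j) + \gamma(\delta) + \varphi(\delta) + h'(l))}}.$$

Set $k_l := 2^{lR(\varepsilon)}$. Obviously, for any $j \in \{1,\dots,N\}$,

$$k_l\cdot 2^{l(-\min_{1\le j\le N} I_c(\pi_{\mathcal{G}_j}, \mathcal{N}_j))} \le 2^{-l\varepsilon}.$$

This implies

$$\mathbb{E}F_{c,e}(\pi_{\mathcal{E}_l}, \hat{\mathcal{N}}^l_{U_1^l,\dots,U_N^l,\delta}) \ge 1 - 2^{-l(c\delta^2 - h(l))} - 2^{-l(c'\delta^2 - h'(l))} - 2N\sqrt{2^{l(-\varepsilon + \gamma(\delta) + \varphi(\delta) + h'(l))}}.$$

Now choosing both the approximation parameter $\delta$ and an integer $l_0 \in \mathbb{N}$ such that $-\varepsilon + \gamma(\delta) + \varphi(\delta) + h'(l) < -\frac{1}{2}\varepsilon$ holds for every $l \ge l_0$ and setting

$$\varepsilon_l := 2^{-l(c\delta^2 - h(l))} + 2^{-l(c'\delta^2 - h'(l))} + 2N\sqrt{2^{l(-\varepsilon + \gamma(\delta) + \varphi(\delta) + h'(l))}}$$

we see that

$$\mathbb{E}F_{c,e}(\pi_{\mathcal{E}_l}, \hat{\mathcal{N}}^l_{U_1^l,\dots,U_N^l,\delta}) \ge 1 - \varepsilon_l,$$

where again $\varepsilon_l \searrow 0$ and our choice of $\delta$ and $l_0$ again shows that the speed of convergence is exponentially fast. Thus, there exist unitary operators $w_1^l,\dots,w_N^l \subset \mathfrak{U}(\mathcal{H}^{\otimes l})$ and a recovery operation $\mathcal{R}^l$ such that, passing to the individual channels, we have for every $j \in \{1,\dots,N\}$,

$$F_e(\pi_{\mathcal{E}_l}, \mathcal{R}^l \circ \mathcal{Q}_{j,\delta,l} \circ \mathcal{N}_{j,\delta,l} \circ \mathcal{W}_j^l) \ge 1 - N\varepsilon_l.$$

By property 3 of Lemma 6 and Lemma 3, we immediately see that

$$F_e(\pi_{\mathcal{E}_l}, \mathcal{R}^l \circ \mathcal{N}_j^{\otimes l} \circ \mathcal{W}_j^l) \ge 1 - 3N\varepsilon_l \quad \forall j \in \{1,\dots,N\}$$

is valid as well. We finally get the desired result: For every set $\{\pi_{\mathcal{G}_1},\dots,\pi_{\mathcal{G}_N}\}$ of maximally mixed states on subspaces $\mathcal{G}_1,\dots,\mathcal{G}_N \subset \mathcal{H}$ and every $\varepsilon > 0$ there exists a sequence of $(l,k_l)$ codes for $\mathfrak{I}$ with informed encoder with the properties

1. $\liminf_{l\to\infty} \frac{1}{l}\log k_l = \min_{\mathcal{N}_j \in \mathfrak{I}} I_c(\pi_{\mathcal{G}_j}, \mathcal{N}_j) - \varepsilon$,
2. $\min_{\mathcal{N}_j \in \mathfrak{I}} F_e(\pi_{\mathcal{E}_l}, \mathcal{R}^l \circ \mathcal{N}_j^{\otimes l} \circ \mathcal{W}_j^l) \ge 1 - 3N\varepsilon_l$.

Since $\varepsilon > 0$ was arbitrary and $\varepsilon_l \searrow 0$, we are done.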



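5. Finite Approximations in the Set of Quantum Channels

Our goal in this section is to discretize a given set of channels $\mathfrak{I} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ in such a way that the results derived so far for finite sets can be employed to derive general versions of coding theorems for compound channels. The first concept we will need is that of a $\tau$-net in the set $\mathcal{C}(\mathcal{H},\mathcal{K})$ and we will give an upper bound on the cardinality of the best $\tau$-net in that set. Best $\tau$-nets characterize the degree of compactness of $\mathcal{C}(\mathcal{H},\mathcal{K})$.

A $\tau$-net in $\mathcal{C}(\mathcal{H},\mathcal{K})$ is a finite set $\{\mathcal{N}_i\}_{i=1}^{N}$ with the property that for each $\mathcal{N} \in \mathcal{C}(\mathcal{H},\mathcal{K})$ there is at least one $i \in \{1,\dots,N\}$ with $\|\mathcal{N} - \mathcal{N}_i\|_{\diamond} < \tau$. Existence of $\tau$-nets in $\mathcal{C}(\mathcal{H},\mathcal{K})$ is guaranteed by the compactness of $\mathcal{C}(\mathcal{H},\mathcal{K})$. The next lemma contains a crude upper bound on the cardinality of minimal $\tau$-nets.

Lemma 7 For any $\tau \in (0,1]$ there is a $\tau$-net $\{\mathcal{N}_i\}_{i=1}^{N}$ in $\mathcal{C}(\mathcal{H},\mathcal{K})$ with $N \le (\frac{3}{\tau})^{2(d\cdot d')^2}$, where $d = \dim\mathcal{H}$ and $d' = \dim\mathcal{K}$.

Proof The assertion of the lemma follows from the standard volume argument (cf. Lemma 2.6 in [24]). The details can be found in our previous paper [4].

Let $\mathfrak{I} \subseteq \mathcal{C}(\mathcal{H},\mathcal{K})$ be an arbitrary set. Starting from a $\tau/2$-net $\mathfrak{N} := \{\mathcal{N}_i\}_{i=1}^{N}$ with $N \le (\frac{6}{\tau})^{2(d\cdot d')^2}$ as in Lemma 7 we can build a $\tau/2$-net $\mathfrak{I}'_{\tau}$ that is adapted to the set $\mathfrak{I}$, given by

$$\mathfrak{I}'_{\tau} := \big\{\mathcal{N}_i \in \mathfrak{N} : \exists\,\mathcal{N} \in \mathfrak{I} \text{ with } \|\mathcal{N} - \mathcal{N}_i\|_{\diamond} < \tau/2\big\}, \tag{37}$$

i.e. we select only those members of the $\tau/2$-net that are contained in the $\tau/2$-neighborhood of $\mathfrak{I}$. Let $\mathcal{T} \in \mathcal{C}(\mathcal{H},\mathcal{K})$ be the useless channel given by $\mathcal{T}(\rho) := \frac{1}{\dim\mathcal{K}}\mathbf{1}_{\mathcal{K}}$, $\rho \in \mathcal{S}(\mathcal{H})$, and consider

$$\mathfrak{I}_{\tau} := \Big\{\big(1 - \tfrac{\tau}{2}\big)\mathcal{N} + \tfrac{\tau}{2}\mathcal{T} : \mathcal{N} \in \mathfrak{I}'_{\tau}\Big\}, \tag{38}$$

where $\mathfrak{I}'_{\tau}$ is defined in (37). For $\mathfrak{I} \subseteq \mathcal{C}(\mathcal{H},\mathcal{K})$ we set

$$I_c(\rho, \mathfrak{I}) := \inf_{\mathcal{N} \in \mathfrak{I}} I_c(\rho, \mathcal{N}),$$

for $\rho \in \mathcal{S}(\mathcal{H})$. The following lemma collects a few more or less obvious results that will be needed later.

Lemma 8 Let $\mathfrak{I} \subseteq \mathcal{C}(\mathcal{H},\mathcal{K})$. For each positive $\tau \le \frac{1}{e}$ let $\mathfrak{I}_{\tau}$ be the finite set of channels defined in (38).

1. $|\mathfrak{I}_{\tau}| \le (\frac{6}{\tau})^{2(d\cdot d')^2}$ with $d = \dim\mathcal{H}$ and $d' = \dim\mathcal{K}$.

2. For $\mathcal{N} \in \mathfrak{I}$ there is $\mathcal{N}_i \in \mathfrak{I}_{\tau}$ with

$$\|\mathcal{N}^{\otimes l} - \mathcal{N}_i^{\otimes l}\|_{\diamond} < l\tau. \tag{39}$$

Consequently, for $\mathcal{N}$, $\mathcal{N}_i$, and any CPTP maps $\mathcal{P}: \mathcal{B}(\mathcal{F}) \to \mathcal{B}(\mathcal{H})^{\otimes l}$ and $\mathcal{R}: \mathcal{B}(\mathcal{K})^{\otimes l} \to \mathcal{B}(\mathcal{F}')$ the relation

$$|F_e(\rho, \mathcal{R} \circ \mathcal{N}^{\otimes l} \circ \mathcal{P}) - F_e(\rho, \mathcal{R} \circ \mathcal{N}_i^{\otimes l} \circ \mathcal{P})| < l\tau \tag{40}$$

holds for all $\rho \in \mathcal{S}(\mathcal{H}^{\otimes l})$ and $l \in \mathbb{N}$.

3. For all $\rho \in \mathcal{S}(\mathcal{H})$ we have

$$|I_c(\rho, \mathfrak{I}) - I_c(\rho, \mathfrak{I}_{\tau})| \le \tau + 3\tau\log\frac{d}{\tau}. \tag{41}$$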
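The cardinality bound in item 1 of Lemma 8 is huge but only polynomial in $1/\tau$, which is what makes a block-length dependent mesh such as $\tau_l = \min\{1/e, 1/l^2\}$ affordable: the net size then grows sub-exponentially in $l$. The sketch below evaluates $\frac{1}{l}\log$ of the bound for qubit channels; base-2 logarithms and the function names are assumptions made for illustration.

```python
import math

def log2_net_cardinality_bound(tau, d, d_prime):
    """log2 of the bound (6 / tau)^(2 (d d')^2) on |I_tau| from Lemma 8, item 1."""
    return 2 * (d * d_prime) ** 2 * math.log2(6.0 / tau)

def rate_overhead(l, d, d_prime):
    """(1 / l) * log2 of the bound for the mesh tau_l = min(1/e, 1/l^2)."""
    tau_l = min(1.0 / math.e, 1.0 / l ** 2)
    return log2_net_cardinality_bound(tau_l, d, d_prime) / l

for l in (10, 10**3, 10**6):
    print(l, rate_overhead(l, 2, 2))
```

Since the overhead tends to zero, an error bound of the form (net size) times (exponentially small term) still vanishes as $l$ grows.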

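Proof The proofs of the assertions claimed here are either identical to those given in [4] or can be obtained by trivial modifications thereof.

6. Direct Parts of the Coding Theorems for General Quantum Compound Channels

6.1. The Case of Informed Decoder and Uninformed Users. The main step towards the direct part of the coding theorem for quantum compound channels with uninformed users is the following lemma.

Lemma 9 Let $\mathfrak{I} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ be an arbitrary compound channel and let $\pi_{\mathcal{G}}$ be the maximally mixed state associated with a subspace $\mathcal{G} \subset \mathcal{H}$. Then

$$Q(\mathfrak{I}) \ge \inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}).$$

Proof We consider two subspaces $\mathcal{E}_l$, $\mathcal{G}^{\otimes l}$ of $\mathcal{H}^{\otimes l}$ with $\mathcal{E}_l \subset \mathcal{G}^{\otimes l} \subset \mathcal{H}^{\otimes l}$. Let $k_l := \dim\mathcal{E}_l$ and denote, as before, the associated maximally mixed states on $\mathcal{E}_l$ and $\mathcal{G}$ by $\pi_{\mathcal{E}_l}$ and $\pi_{\mathcal{G}}$. If $\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}) \le 0$ there is nothing to prove. Therefore we will suppose in the following that

$$\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}) > 0$$

holds. We will show that for each $\varepsilon \in (0, \inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}))$ the number

$$\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}) - \varepsilon$$

is an achievable rate. For each $l \in \mathbb{N}$ let us choose some $\tau_l > 0$ with $\tau_l \le \frac{1}{e}$, $\lim_{l\to\infty} l\tau_l = 0$, and such that $N_{\tau_l}$ grows sub-exponentially with $l$. E.g. we may choose $\tau_l := \min\{1/e, 1/l^2\}$. We consider, for each $l \in \mathbb{N}$, the finite set of channels $\mathfrak{I}_{\tau_l} := \{\mathcal{N}_1,\dots,\mathcal{N}_{N_{\tau_l}}\}$ associated to $\mathfrak{I}$ given in (38) with the properties listed in Lemma 8. We can conclude from the proof of Theorem 5 that for each $l \in \mathbb{N}$ there is a subspace $\mathcal{F}_l \subset \mathcal{G}^{\otimes l}$ of dimension

$$k_l = 2^{l(\min_{i \in \{1,\dots,N_{\tau_l}\}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \frac{\varepsilon}{2})}, \tag{42}$$

a recovery operation $\mathcal{R}$, and a unitary encoder $\mathcal{W}^l$ such that

$$\min_{i \in \{1,\dots,N_{\tau_l}\}} F_e(\pi_{\mathcal{F}_l}, \mathcal{R} \circ \mathcal{N}_i^{\otimes l} \circ \mathcal{W}^l) \ge 1 - N_{\tau_l}\varepsilon_l, \tag{43}$$

where $\varepsilon_l$ is defined in (34) (with the approximation parameter $\varepsilon$ replaced by $\varepsilon/2$), and we have chosen $l, l_0 \in \mathbb{N}$ with $l \ge l_0$ large enough and $\delta > 0$ small enough to ensure that both

$$\min_{i \in \{1,\dots,N_{\tau_l}\}} I_c(\pi_{\mathcal{G}}, \mathcal{N}_i) - \frac{\varepsilon}{2} > 0$$

and

$$\frac{\varepsilon}{2} - \gamma(\delta) - \varphi(\delta) - h'(l_0) > \varepsilon/4 > 0.$$

By our construction of $\mathfrak{I}_{\tau_l}$ we can find to each $\mathcal{N} \in \mathfrak{I}$ at least one $\mathcal{N}_i \in \mathfrak{I}_{\tau_l}$ with

$$|F_e(\pi_{\mathcal{F}_l}, \mathcal{R} \circ \mathcal{N}_i^{\otimes l} \circ \mathcal{W}^l) - F_e(\pi_{\mathcal{F}_l}, \mathcal{R} \circ \mathcal{N}^{\otimes l} \circ \mathcal{W}^l)| \le l\cdot\tau_l, \tag{44}$$

according to Lemma 8. Moreover, by the last claim of Lemma 8 we obtain the following estimate on the dimension $k_l$ of the subspace $\mathcal{F}_l$:

$$k_l \ge 2^{l(\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N}) - \frac{\varepsilon}{2} - \tau_l - 2\tau_l\log\frac{d}{\tau_l})}. \tag{45}$$

The inequalities (43) and (44) show that

$$\inf_{\mathcal{N} \in \mathfrak{I}} F_e(\pi_{\mathcal{F}_l}, \mathcal{R} \circ \mathcal{N}^{\otimes l} \circ \mathcal{W}^l) \ge 1 - N_{\tau_l}\varepsilon_l - l\tau_l,$$

which in turn with (45) shows that $\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\mathcal{G}}, \mathcal{N})$ is an achievable rate.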

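In order to pass from the maximally mixed state $\pi_{\mathcal{G}}$ to an arbitrary one we have to employ the compound generalization of the Bennett, Shor, Smolin, and Thapliyal Lemma (BSST Lemma for short) from [2] and [16]. For the proof of this generalized BSST Lemma we refer to [4].

Lemma 10 (Compound BSST Lemma). Let $\mathfrak{I} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ be an arbitrary set of channels. For any $\rho \in \mathcal{S}(\mathcal{H})$ let $q_{\delta,l} \in \mathcal{B}(\mathcal{H}^{\otimes l})$ be the frequency-typical projection of $\rho$ and set

$$\pi_{\delta,l} := \frac{q_{\delta,l}}{\mathrm{tr}(q_{\delta,l})} \in \mathcal{S}(\mathcal{H}^{\otimes l}).$$

Then there is a positive sequence $(\delta_l)_{l \in \mathbb{N}}$ satisfying $\lim_{l\to\infty}\delta_l = 0$ with

$$\lim_{l\to\infty} \frac{1}{l}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\delta_l,l}, \mathcal{N}^{\otimes l}) = \inf_{\mathcal{N} \in \mathfrak{I}} I_c(\rho, \mathcal{N}).$$

With these preparations it is easy now to finish the proof of the direct part of the coding theorem for the quantum compound channel with uninformed users. First notice that for each $k \in \mathbb{N}$,

$$Q(\mathfrak{I}^{\otimes k}) = kQ(\mathfrak{I}) \tag{46}$$

holds. For any fixed $\rho \in \mathcal{S}(\mathcal{H}^{\otimes m})$ let $q_{\delta,l} \in \mathcal{B}(\mathcal{H}^{\otimes ml})$ be the frequency-typical projection of $\rho$ and set $\pi_{\delta,l} = \frac{q_{\delta,l}}{\mathrm{tr}(q_{\delta,l})}$. Lemma 9 implies that for any $\delta \in (0,1/2)$ we have

$$Q(\mathfrak{I}^{\otimes ml}) \ge I_c(\pi_{\delta,l}, \mathfrak{I}^{\otimes ml}), \tag{47}$$

for all $m, l \in \mathbb{N}$. Utilizing (46), (47) and Lemma 10 we arrive at

$$\begin{aligned}
Q(\mathfrak{I}) &= \frac{1}{m}\lim_{l\to\infty}\frac{1}{l}\,Q(\mathfrak{I}^{\otimes ml}) \\
&\ge \frac{1}{m}\lim_{l\to\infty}\frac{1}{l}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi_{\delta_l,l}, (\mathcal{N}^{\otimes m})^{\otimes l}) \\
&= \frac{1}{m}\,I_c(\rho, \mathfrak{I}^{\otimes m}).
\end{aligned} \tag{48}$$

From (48) and since $Q_{ID}(\mathfrak{I}) \ge Q(\mathfrak{I})$ trivially holds we get without further ado the direct part of the coding theorem.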



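Theorem 7 (Direct Part: Informed Decoder and Uninformed Users). Let $\mathfrak{I} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ be an arbitrary set. Then

$$Q_{ID}(\mathfrak{I}) \ge Q(\mathfrak{I}) \ge \lim_{l\to\infty}\frac{1}{l}\max_{\rho \in \mathcal{S}(\mathcal{H}^{\otimes l})}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\rho, \mathcal{N}^{\otimes l}). \tag{49}$$

Remark 2. It is quite easy to see that the limit in (49) exists. Indeed it holds that

$$\max_{\rho \in \mathcal{S}(\mathcal{H}^{\otimes l+k})}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\rho, \mathcal{N}^{\otimes l+k}) \ge \max_{\rho \in \mathcal{S}(\mathcal{H}^{\otimes l})}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\rho, \mathcal{N}^{\otimes l}) + \max_{\rho \in \mathcal{S}(\mathcal{H}^{\otimes k})}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\rho, \mathcal{N}^{\otimes k}),$$

which implies the existence of the limit via standard arguments.

6.2. The Informed Encoder. The main result of this section will rely on an appropriate variant of the BSST Lemma. To this end we first recall Holevo's version of that result. For $\delta > 0$, $l \in \mathbb{N}$, and $\rho \in \mathcal{S}(\mathcal{H})$, let $q_{\delta,l} \in \mathcal{B}(\mathcal{H}^{\otimes l})$ denote the frequency typical projection of $\rho^{\otimes l}$. Set

$$\pi_{\delta,l} = \pi_{\delta,l}(\rho) := \frac{q_{\delta,l}}{\mathrm{tr}(q_{\delta,l})}. \tag{50}$$

Moreover, let $\lambda_{\min}(\rho) := \min\{\lambda \in \sigma(\rho) : \lambda > 0\}$, where $\sigma(\rho)$ stands for the spectrum of the density operator $\rho$.

Lemma 11 (BSST Lemma [2,16]). For any $\delta \in (0, \frac{1}{2\dim\mathcal{H}})$, any $\mathcal{N} \in \mathcal{C}(\mathcal{H},\mathcal{K})$, and every $\rho \in \mathcal{S}(\mathcal{H})$ with associated state $\pi_{\delta,l} = \pi_{\delta,l}(\rho) \in \mathcal{S}(\mathcal{H}^{\otimes l})$ we have

$$\Big|\frac{1}{l}S(\mathcal{N}^{\otimes l}(\pi_{\delta,l})) - S(\mathcal{N}(\rho))\Big| \le \theta_l(\delta, \lambda_{\min}(\rho), \lambda_{\min}(\mathcal{N}(\rho))), \tag{51}$$

where

$$\theta_l(\delta, \lambda_{\min}(\rho), \lambda_{\min}(\mathcal{N}(\rho))) = \frac{\dim\mathcal{H}}{l}\log(l+1) - \dim\mathcal{H}\cdot\delta\log\delta - \dim\mathcal{H}\cdot\delta\cdot(\log\lambda_{\min}(\rho) + \log\lambda_{\min}(\mathcal{N}(\rho))). \tag{52}$$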
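The error term (52) is explicit enough to be evaluated directly, which shows how the three contributions behave as $l$ grows and $\delta$ shrinks. The following sketch assumes base-2 logarithms; the function name and the sample arguments are chosen here purely for illustration.

```python
import math

def theta_l(l, delta, lam_rho, lam_out, dim_h):
    """Error term (52): lam_rho and lam_out are the smallest non-zero
    eigenvalues of rho and N(rho); logarithms are base 2 (assumption)."""
    block_term = dim_h / l * math.log2(l + 1)
    delta_term = -dim_h * delta * math.log2(delta)
    spectrum_term = -dim_h * delta * (math.log2(lam_rho) + math.log2(lam_out))
    return block_term + delta_term + spectrum_term

# Qubit example with minimal eigenvalues 1/2 (e.g. maximally mixed states).
print(theta_l(1, 0.125, 0.5, 0.5, 2))
print(theta_l(10**6, 1e-6, 0.5, 0.5, 2))
```

All three terms are non-negative for $\delta < 1$ and eigenvalues at most 1. The last term is the reason why the minimal eigenvalues have to be controlled, for instance by mixing in a small amount of the maximally mixed state.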

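Before we present our extended version of the BSST Lemma we introduce some notation. For $t \in (0, \frac{1}{e})$ and any set $\mathfrak{I} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ let us define

$$\mathfrak{I}^{(t)} := \big\{\mathcal{N}^{(t)} = (1-t)\mathcal{N} + t\mathcal{T}_{\mathcal{K}} : \mathcal{N} \in \mathfrak{I}\big\} = (1-t)\mathfrak{I} + t\mathcal{T}_{\mathcal{K}}, \tag{53}$$

where $\mathcal{T}_{\mathcal{K}} \in \mathcal{C}(\mathcal{H},\mathcal{K})$ is given by $\mathcal{T}_{\mathcal{K}}(x) := \frac{\mathrm{tr}(x)}{\dim\mathcal{K}}\mathbf{1}_{\mathcal{K}}$. On the other hand, to each $\mathcal{N} \in \mathfrak{I} \subset \mathcal{C}(\mathcal{H},\mathcal{K})$ we can associate a complementary channel $\mathcal{N}_c \in \mathcal{C}(\mathcal{H},\mathcal{H}_e)$ where we assume w.l.o.g. that $\mathcal{H}_e = \mathbb{C}^{\dim\mathcal{H}\cdot\dim\mathcal{K}}$. Let $\mathfrak{I}' \subset \mathcal{C}(\mathcal{H},\mathcal{H}_e)$ denote the set of channels complementary to $\mathfrak{I}$ and set

$$\mathfrak{I}'^{(t)} := (\mathfrak{I}')^{(t)} = \big\{\mathcal{N}_c^{(t)} = (1-t)\mathcal{N}_c + t\mathcal{T}_{\mathcal{H}_e} : \mathcal{N}_c \in \mathfrak{I}'\big\} = (1-t)\mathfrak{I}' + t\mathcal{T}_{\mathcal{H}_e}, \tag{54}$$

where $\mathcal{T}_{\mathcal{H}_e} \in \mathcal{C}(\mathcal{H},\mathcal{H}_e)$ is defined in a similar way as $\mathcal{T}_{\mathcal{K}}$. Finally, for $\mathcal{N} \in \mathfrak{I}$ let

$$\rho_{\mathcal{N}} := \arg\max_{\rho \in \mathcal{S}(\mathcal{H})} I_c(\rho, \mathcal{N}),$$

and for $t \in (0, \frac{1}{e})$, $\delta > 0$, and $l \in \mathbb{N}$ define

$$\pi^{(t)}_{\delta,l,\mathcal{N}} := \pi_{\delta,l}\big(\rho^{(t)}_{\mathcal{N}}\big), \tag{55}$$

where we have used the notation from (50) and

$$\rho^{(t)}_{\mathcal{N}} := (1-t)\rho_{\mathcal{N}} + \frac{t}{\dim\mathcal{H}}\mathbf{1}_{\mathcal{H}}. \tag{56}$$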

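Lemma 12 (Uniform BSST-Lemma). 1. Let $l \in \mathbb{N}$, $t \in (0, \frac{1}{l\cdot e})$, and $\delta \in (0, \frac{1}{2\dim\mathcal{H}})$. Then with the notation introduced in the preceding paragraph we have

$$\Big|\frac{1}{l}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi^{(t)}_{\delta,l,\mathcal{N}}, \mathcal{N}^{\otimes l}) - \inf_{\mathcal{N} \in \mathfrak{I}}\max_{\rho \in \mathcal{S}(\mathcal{H})} I_c(\rho, \mathcal{N})\Big| \le \Delta_l(\delta, t),$$

with

$$\Delta_l(\delta, t) = 2\theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{K}}\Big) + 2\theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{H}_e}\Big) - 4t\log\frac{t}{\dim\mathcal{K}\cdot\dim\mathcal{H}_e} - 2lt\log\frac{lt}{\dim\mathcal{K}\cdot\dim\mathcal{H}_e},$$

where $\theta_l(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{K}})$ and $\theta_l(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{H}_e})$ are from Lemma 11.

2. Consequently, choosing suitable positive sequences $(\delta_l)_{l \in \mathbb{N}}$, $(t_l)_{l \in \mathbb{N}}$ with (a) $\lim_{l\to\infty}\delta_l = 0 = \lim_{l\to\infty} lt_l$, and (b) $\lim_{l\to\infty}\delta_l\log t_l = 0$, we see that for $\nu_l := \Delta_l(\delta_l, t_l)$,

$$\Big|\frac{1}{l}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi^{(t_l)}_{\delta_l,l,\mathcal{N}}, \mathcal{N}^{\otimes l}) - \inf_{\mathcal{N} \in \mathfrak{I}}\max_{\rho \in \mathcal{S}(\mathcal{H})} I_c(\rho, \mathcal{N})\Big| \le \nu_l \tag{57}$$

holds with $\lim_{l\to\infty}\nu_l = 0$.

Proof. Our proof strategy is to reduce the claim to the BSST Lemma 11. Let $t > 0$ be small enough to ensure that $l\cdot t \in (0, \frac{1}{e})$ and let $\delta \in (0, \frac{1}{2\dim\mathcal{H}})$ be given. From (53) and (54) we obtain that

$$\lambda_{\min}(\mathcal{N}^{(t)}(\rho)) \ge \frac{t}{\dim\mathcal{K}}, \qquad \lambda_{\min}(\mathcal{N}_c^{(t)}(\rho)) \ge \frac{t}{\dim\mathcal{H}_e} \qquad \forall\rho \in \mathcal{S}(\mathcal{H}), \tag{58}$$

and (56) yields that

$$\lambda_{\min}(\rho^{(t)}_{\mathcal{N}}) \ge \frac{t}{\dim\mathcal{H}} \tag{59}$$

for all $\mathcal{N} \in \mathfrak{I}$. The bounds (58) and (59) along with Lemma 11 show that

$$\Big|\frac{1}{l}S\big((\mathcal{N}^{(t)})^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})\big) - S\big(\mathcal{N}^{(t)}(\rho^{(t)}_{\mathcal{N}})\big)\Big| \le \theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{K}}\Big), \tag{60}$$

and

$$\Big|\frac{1}{l}S\big((\mathcal{N}_c^{(t)})^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})\big) - S\big(\mathcal{N}_c^{(t)}(\rho^{(t)}_{\mathcal{N}})\big)\Big| \le \theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{H}_e}\Big). \tag{61}$$

On the other hand, by definition we have

$$\|\mathcal{N}^{(t)} - \mathcal{N}\|_{\diamond} \le t, \qquad \|(\mathcal{N}^{(t)})^{\otimes l} - \mathcal{N}^{\otimes l}\|_{\diamond} \le l\cdot t, \tag{62}$$

and similarly

$$\|\mathcal{N}_c^{(t)} - \mathcal{N}_c\|_{\diamond} \le t, \qquad \|(\mathcal{N}_c^{(t)})^{\otimes l} - \mathcal{N}_c^{\otimes l}\|_{\diamond} \le l\cdot t, \tag{63}$$

for all $\mathcal{N} \in \mathfrak{I}$. Since $l\cdot t \in (0, \frac{1}{e})$ we obtain from this by Fannes inequality

$$\big|S(\mathcal{N}^{(t)}(\rho^{(t)}_{\mathcal{N}})) - S(\mathcal{N}(\rho^{(t)}_{\mathcal{N}}))\big| \le -t\log\frac{t}{\dim\mathcal{K}}, \tag{64}$$

$$\big|S(\mathcal{N}_c^{(t)}(\rho^{(t)}_{\mathcal{N}})) - S(\mathcal{N}_c(\rho^{(t)}_{\mathcal{N}}))\big| \le -t\log\frac{t}{\dim\mathcal{H}_e}, \tag{65}$$

and

$$\Big|\frac{1}{l}S\big((\mathcal{N}^{(t)})^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})\big) - \frac{1}{l}S\big(\mathcal{N}^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})\big)\Big| \le -l\cdot t\log\frac{l\cdot t}{\dim\mathcal{K}}, \tag{66}$$

as well as

$$\Big|\frac{1}{l}S\big((\mathcal{N}_c^{(t)})^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})\big) - \frac{1}{l}S\big(\mathcal{N}_c^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})\big)\Big| \le -l\cdot t\log\frac{l\cdot t}{\dim\mathcal{H}_e}, \tag{67}$$

for all $\mathcal{N} \in \mathfrak{I}$. Since

$$I_c(\rho^{(t)}_{\mathcal{N}}, \mathcal{N}) = S(\mathcal{N}(\rho^{(t)}_{\mathcal{N}})) - S(\mathcal{N}_c(\rho^{(t)}_{\mathcal{N}}))$$

and

$$I_c(\pi^{(t)}_{\delta,l,\mathcal{N}}, \mathcal{N}^{\otimes l}) = S(\mathcal{N}^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})) - S(\mathcal{N}_c^{\otimes l}(\pi^{(t)}_{\delta,l,\mathcal{N}})),$$

the inequalities (60), (61), (64), (65), (66), (67) and the triangle inequality show that uniformly in $\mathcal{N} \in \mathfrak{I}$ we have

$$\Big|\frac{1}{l}I_c(\pi^{(t)}_{\delta,l,\mathcal{N}}, \mathcal{N}^{\otimes l}) - I_c(\rho^{(t)}_{\mathcal{N}}, \mathcal{N})\Big| \le \theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{K}}\Big) + \theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{H}_e}\Big) - t\log\frac{t}{\dim\mathcal{K}\cdot\dim\mathcal{H}_e} - l\cdot t\log\frac{l\cdot t}{\dim\mathcal{K}\cdot\dim\mathcal{H}_e}. \tag{68}$$

Now, by (56) we have

$$\|\rho^{(t)}_{\mathcal{N}} - \rho_{\mathcal{N}}\|_1 \le t,$$

which implies

$$\|\mathcal{N}(\rho^{(t)}_{\mathcal{N}}) - \mathcal{N}(\rho_{\mathcal{N}})\|_1 \le t, \qquad \|\mathcal{N}_c(\rho^{(t)}_{\mathcal{N}}) - \mathcal{N}_c(\rho_{\mathcal{N}})\|_1 \le t,$$

since the trace distance of two states can only decrease after applying a trace preserving completely positive map to both states. Thus Fannes inequality leads us to the conclusion that

$$\big|I_c(\rho^{(t)}_{\mathcal{N}}, \mathcal{N}) - I_c(\rho_{\mathcal{N}}, \mathcal{N})\big| \le -t\log\frac{t}{\dim\mathcal{K}\cdot\dim\mathcal{H}_e}.$$

This and (68) show that uniformly in $\mathcal{N} \in \mathfrak{I}$,

$$\begin{aligned}
\Big|\frac{1}{l}I_c(\pi^{(t)}_{\delta,l,\mathcal{N}}, \mathcal{N}^{\otimes l}) - I_c(\rho_{\mathcal{N}}, \mathcal{N})\Big| &\le \theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{K}}\Big) + \theta_l\Big(\delta, \frac{t}{\dim\mathcal{H}}, \frac{t}{\dim\mathcal{H}_e}\Big) \\
&\quad - 2t\log\frac{t}{\dim\mathcal{K}\cdot\dim\mathcal{H}_e} - l\cdot t\log\frac{l\cdot t}{\dim\mathcal{K}\cdot\dim\mathcal{H}_e} \\
&=: \frac{\Delta_l(\delta, t)}{2}.
\end{aligned} \tag{69}$$

Finally, it is clear from the uniform estimate in (69) that

$$\begin{aligned}
\Big|\frac{1}{l}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi^{(t)}_{\delta,l,\mathcal{N}}, \mathcal{N}^{\otimes l}) - \inf_{\mathcal{N} \in \mathfrak{I}}\max_{\rho \in \mathcal{S}(\mathcal{H})} I_c(\rho, \mathcal{N})\Big| &= \Big|\frac{1}{l}\inf_{\mathcal{N} \in \mathfrak{I}} I_c(\pi^{(t)}_{\delta,l,\mathcal{N}}, \mathcal{N}^{\otimes l}) - \inf_{\mathcal{N} \in \mathfrak{I}} I_c(\rho_{\mathcal{N}}, \mathcal{N})\Big| \\
&\le \Delta_l(\delta, t),
\end{aligned} \tag{70}$$

which concludes the proof.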

Lemma 12 and Theorem 6 easily imply the following result. Lemma 13. Let I ⊂ C(H, K) be an arbitrary set of quantum channels. Then Q I E (I) ≥ inf

max Ic (ρ, N ).

N ∈I ρ∈S (H)

Proof. Take any set {πGN }N ∈I of maximally mixed states on subspaces GN ⊂ H. In a first step we will show that Q I E (I) ≥ inf Ic (πGN , N ) N ∈I

(71)

holds. Notice that we can assume w.l.o.g. that inf N ∈I Ic (πGN , N ) > 0. Denote, for every τ > 0, by Iτ a τ -net for I as given in (38) of cardinality Nτ :=

2 |Iτ | ≤ ( τ6 )2(d·d ) , where d, d are the dimensions of H, K. Starting from this set Iτ it is easy to construct a finite set I◦τ with the following properties: 1. I◦τ ⊂ I,

2 2. |I◦τ | ≤ ( τ6 )2(d·d ) , and 3. to each N ∈ I there is at least one N ∈ I◦τ with ||N − N ||♦ ≤ 2τ . Let (τl )l∈N be defined by τl := l12 and consider the sets I◦τl , l ∈ N. Take any η ∈ (0, inf N ∈I Ic (πGN , N )) and set R(η) := inf Ic (πGN , N ) − η, N ∈I

86


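and
\[ R_l(\eta) := \min_{\mathcal N \in \mathfrak I^\circ_{\tau_l}} I_c(\pi_{\mathcal G_{\mathcal N}}, \mathcal N) - \eta. \]
Then for every $l \in \mathbb N$,
\[ R_l(\eta) \ge R(\eta) \tag{72} \]
since $\mathfrak I^\circ_{\tau_l} \subset \mathfrak I$. Fix some $\delta' \in (0, 1/2)$ such that $\gamma(\delta') + \varphi(\delta') < \eta/4$. For every $l \in \mathbb N$, choose a subspace $\mathcal E_l \subset \mathcal H^{\otimes l}$ of dimension $k_l(\eta) := \dim \mathcal E_l = \lfloor 2^{l R_l(\eta)} \rfloor$. The proof of Theorem 6 then shows the existence of a recovery operation $\mathcal R^l$ and, for each $\mathcal N \in \mathfrak I^\circ_{\tau_l}$, a unitary encoder $\mathcal W^l_{\mathcal N}$ such that for each $l \in \mathbb N$,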

l ∀ N ∈ I◦τl , Fe (πEl , Rl ◦ N ⊗l ◦ WN

) ≥ 1 − 3 · Nτl · εl 3η

2

2

where εl := 2−l(cδ −h(l)) + 2−l(c δ −h (l)) + 2Nτl 2l(− 4 +h (l)) ). From Lemma 8 along ◦ with the properties of Iτl and our specific choice of (τl )l∈N it follows that there exist l (for every l ∈ N and each N ∈ I), such that unitary encodings WN

2 ∀l ∈ N, N ∈ I. l l ) = 1 and (72) implies for each η ∈ (0, inf Clearly, liml→∞ Fe (πEl , Rl ◦N ⊗l ◦WN N ∈I Ic (πGN , N )) that l ) ≥ 1 − 3 · Nτl · εl − Fe (πEl , Rl ◦ N ⊗l ◦ WN

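\[ \liminf_{l \to \infty} \frac{1}{l} \log k_l(\eta) = \liminf_{l \to \infty} \frac{1}{l} \log \dim \mathcal E_l \ge R(\eta). \]
Consequently $\inf_{\mathcal N \in \mathfrak I} I_c(\pi_{\mathcal G_{\mathcal N}}, \mathcal N)$ is achievable. We proceed by repeated application of the inequality
\[ Q_{IE}(\mathfrak I) \ge \frac{1}{l} Q_{IE}(\mathfrak I^{\otimes l}) \quad (\forall l \in \mathbb N). \tag{73} \]
From (71) and (73) we get that for each $l \in \mathbb N$ and every set $\{\pi^l_{\mathcal N}\}_{\mathcal N \in \mathfrak I}$ of maximally mixed states on subspaces of $\mathcal H^{\otimes l}$,
\[ Q_{IE}(\mathfrak I) \ge \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} I_c(\pi^l_{\mathcal N}, \mathcal N^{\otimes l}). \]
We now make a specific choice of the states $\pi^l_{\mathcal N}$, namely, for every $\mathcal N \in \mathfrak I$ and $l \in \mathbb N$, set $\pi^l_{\mathcal N} := \pi^{(t_l)}_{\delta_l, l, \mathcal N}$ with $\pi^{(t_l)}_{\delta_l, l, \mathcal N}$ taken from the second part of Lemma 12. By an application of the second part of Lemma 12 it follows that
\begin{align*}
Q_{IE}(\mathfrak I) &\ge \lim_{l \to \infty} \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} I_c(\pi^l_{\mathcal N}, \mathcal N^{\otimes l}) \\
&\ge \lim_{l \to \infty} \Big( \inf_{\mathcal N \in \mathfrak I} I_c(\rho_{\mathcal N}, \mathcal N) - \nu_l \Big) \\
&= \inf_{\mathcal N \in \mathfrak I} I_c(\rho_{\mathcal N}, \mathcal N) \\
&= \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H)} I_c(\rho, \mathcal N).
\end{align*}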



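Employing inequality (73) one more time we obtain from Lemma 13 applied to $\mathfrak I^{\otimes l}$,
\[ Q_{IE}(\mathfrak I) \ge \frac{1}{l} Q_{IE}(\mathfrak I^{\otimes l}) \ge \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}). \]
Consequently we obtain the desired achievability result.

Theorem 8 (Direct Part: Informed Encoder). For any $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ we have
\[ Q_{IE}(\mathfrak I) \ge \lim_{l \to \infty} \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}). \tag{74} \]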

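Remark 3. Note that the limit in (74) exists. Indeed, set
\[ C_l(\mathcal N) := \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}). \]
Then it is clear that $C_{l+k}(\mathcal N) \ge C_l(\mathcal N) + C_k(\mathcal N)$ and consequently
\[ \inf_{\mathcal N \in \mathfrak I} C_{l+k}(\mathcal N) \ge \inf_{\mathcal N \in \mathfrak I} \big( C_l(\mathcal N) + C_k(\mathcal N) \big) \ge \inf_{\mathcal N \in \mathfrak I} C_l(\mathcal N) + \inf_{\mathcal N \in \mathfrak I} C_k(\mathcal N), \]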

which implies the existence of the limit in (74).

7. Converse Parts of the Coding Theorems for General Quantum Compound Channels

In this section we prove the converse parts of the coding theorems for general quantum compound channels in the three different settings concerned with entanglement transmission that are treated in this paper. The proofs deviate from the usual approach due to our more general definitions of codes.

7.1. Converse for Informed Decoder and Uninformed Users. We first prove the converse part in the case of a finite compound channel, then use a recent result [22] that gives a more convenient estimate for the difference in coherent information of two nearby channels in order to pass on to the general case. For the converse part in the case of a finite compound channel we need the following lemma that is due to Devetak [10]:

Lemma 14 (cf. [10]). For two states $\sigma, \rho \in \mathcal S(\mathcal H_1 \otimes \mathcal H_2)$, where $\dim(\mathcal H_1 \otimes \mathcal H_2) = b$, with fidelity $f = F(\sigma, \rho)$,
\[ |\Delta S(\rho) - \Delta S(\sigma)| \le \frac{2}{e} + 4 \log(b) \sqrt{1 - f}, \]
where $\Delta S(\cdot) := S(\mathrm{tr}_{\mathcal H_1}[\,\cdot\,]) - S(\cdot)$.

We shall now embark on the proof of the following theorem.



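Theorem 9 (Converse Part: Informed Decoder, Uninformed Users, $|\mathfrak I| < \infty$). Let $\mathfrak I = \{\mathcal N_1, \ldots, \mathcal N_N\} \subset \mathcal C(\mathcal H, \mathcal K)$ be a finite compound channel. The capacities $Q_{ID}(\mathfrak I)$ and $Q(\mathfrak I)$ of $\mathfrak I$ with informed decoder and uninformed users are bounded from above by
\[ Q(\mathfrak I) \le Q_{ID}(\mathfrak I) \le \lim_{l \to \infty} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \min_{\mathcal N_i \in \mathfrak I} \frac{1}{l} I_c(\rho, \mathcal N_i^{\otimes l}). \]

Proof. The inequality $Q(\mathfrak I) \le Q_{ID}(\mathfrak I)$ is obvious from the definition of codes. We give a proof for the second inequality. Let for arbitrary $l \in \mathbb N$ an $(l, k_l)$ code for a compound channel $\mathfrak I = \{\mathcal N_1, \ldots, \mathcal N_N\}$ with informed decoder and the property $\min_{1 \le i \le N} F_e(\pi_{\mathcal F_l}, \mathcal R^l_i \circ \mathcal N_i^{\otimes l} \circ \mathcal P^l) \ge 1 - \varepsilon_l$ be given, where $\varepsilon_l \in [0, 1]$. Let $|\psi_l\rangle\langle\psi_l| \in \mathcal S(\mathcal E_l \otimes \mathcal F_l)$ be a purification of $\pi_{\mathcal F_l}$, where $\mathcal E_l$ is just a copy of $\mathcal F_l$. We use the abbreviation $\mathcal D^l := \frac{1}{N} \sum_{i=1}^N \mathcal R^l_i \circ \mathcal N_i^{\otimes l}$. Obviously, the above code then satisfies
\[ \langle \psi_l, \mathrm{id}_{\mathcal E_l} \otimes \mathcal D^l (\mathrm{id}_{\mathcal E_l} \otimes \mathcal P^l(|\psi_l\rangle\langle\psi_l|)) \psi_l \rangle = \frac{1}{N} \sum_{i=1}^N F_e(\pi_{\mathcal F_l}, \mathcal R^l_i \circ \mathcal N_i^{\otimes l} \circ \mathcal P^l) \ge 1 - \varepsilon_l. \tag{75} \]
Let $\sigma_{\mathcal P^l} := \mathrm{id}_{\mathcal E_l} \otimes \mathcal P^l(|\psi_l\rangle\langle\psi_l|)$ and consider any convex decomposition $\sigma_{\mathcal P^l} = \sum_{i=1}^{(\dim \mathcal F_l)^2} \lambda_i |e_i\rangle\langle e_i|$ of $\sigma_{\mathcal P^l}$ into pure states $|e_i\rangle\langle e_i| \in \mathcal S(\mathcal F_l \otimes \mathcal H^{\otimes l})$. By (75) there is at least one $i \in \{1, \ldots, (\dim \mathcal F_l)^2\}$ such that
\[ \langle \psi_l, \mathrm{id}_{\mathcal E_l} \otimes \mathcal D^l(|e_i\rangle\langle e_i|) \psi_l \rangle \ge 1 - \varepsilon_l \tag{76} \]
holds. Without loss of generality, $i = 1$. Turning back to the individual channels, we get
\[ \langle \psi_l, \mathrm{id}_{\mathcal E_l} \otimes \mathcal R^l_i \circ \mathcal N_i^{\otimes l}(|e_1\rangle\langle e_1|) \psi_l \rangle \ge 1 - N \varepsilon_l \quad \forall i \in \{1, \ldots, N\}. \tag{77} \]
We define the state $\rho^l := \mathrm{tr}_{\mathcal E_l}(|e_1\rangle\langle e_1|) \in \mathcal S(\mathcal H^{\otimes l})$ and note that $|e_1\rangle\langle e_1|$ is a purification of $\rho^l$. Application of recovery operation and individual channels to $\rho^l$ now defines the states $\sigma^l_k := \mathrm{id}_{\mathcal E_l} \otimes \mathcal R^l_k \circ \mathcal N_k^{\otimes l}(|e_1\rangle\langle e_1|)$ ($k \in \{1, \ldots, N\}$), which have independently of $k$ the property
\[ F(|\psi_l\rangle\langle\psi_l|, \sigma^l_k) = \langle \psi_l, \mathrm{id}_{\mathcal E_l} \otimes \mathcal R^l_k \circ \mathcal N_k^{\otimes l}(|e_1\rangle\langle e_1|) \psi_l \rangle \ge 1 - N \varepsilon_l, \]
and thus put us into position for an application of Lemma 14, which together with the data processing inequality for coherent information [26] establishes the following chain of inequalities for every $k \in \{1, \ldots, N\}$:
\begin{align*}
\log \dim \mathcal F_l &= S(\pi_{\mathcal F_l}) = \Delta S(|\psi_l\rangle\langle\psi_l|) \\
&\le \Delta S(\sigma^l_k) + \frac{2}{e} + 4 \log((\dim \mathcal F_l)^2) \sqrt{N \varepsilon_l} \\
&= S(\mathrm{tr}_{\mathcal E_l}(\mathrm{id}_{\mathcal E_l} \otimes \mathcal R^l_k \circ \mathcal N_k^{\otimes l}(|e_1\rangle\langle e_1|))) - S(\mathrm{id}_{\mathcal E_l} \otimes \mathcal R^l_k \circ \mathcal N_k^{\otimes l}(|e_1\rangle\langle e_1|)) + \frac{2}{e} + 4 \log((\dim \mathcal F_l)^2) \sqrt{N \varepsilon_l} \\
&= I_c(\rho^l, \mathcal R^l_k \circ \mathcal N_k^{\otimes l}) + \frac{2}{e} + 4 \log((\dim \mathcal F_l)^2) \sqrt{N \varepsilon_l} \\
&\le I_c(\rho^l, \mathcal N_k^{\otimes l}) + \frac{2}{e} + 4 \log((\dim \mathcal F_l)^2) \sqrt{N \varepsilon_l}. \tag{78}
\end{align*}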



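Thus,
\begin{align*}
\log \dim \mathcal F_l &\le \min_{k \in \{1, \ldots, N\}} I_c(\rho^l, \mathcal N_k^{\otimes l}) + \frac{2}{e} + 4 \log((\dim \mathcal F_l)^2) \sqrt{N \varepsilon_l} \\
&\le \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \min_{k \in \{1, \ldots, N\}} I_c(\rho, \mathcal N_k^{\otimes l}) + \frac{2}{e} + 8 \log(\dim \mathcal F_l) \sqrt{N \varepsilon_l}. \tag{79}
\end{align*}
Let a sequence of $(l, k_l)$ codes for $\mathfrak I$ with informed decoder be given such that $\liminf_{l \to \infty} \frac{1}{l} \log \dim \mathcal F_l = R \in \mathbb R$ and $\lim_{l \to \infty} \varepsilon_l = 0$. Then by (79) we get
\begin{align*}
R = \liminf_{l \to \infty} \frac{1}{l} \log \dim \mathcal F_l &\le \liminf_{l \to \infty} \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \min_{k \in \{1, \ldots, N\}} I_c(\rho, \mathcal N_k^{\otimes l}) + \liminf_{l \to \infty} \frac{1}{l} \frac{2}{e} + \liminf_{l \to \infty} \frac{1}{l} 8 \log(\dim \mathcal F_l) \sqrt{N \varepsilon_l} \\
&= \lim_{l \to \infty} \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \min_{k \in \{1, \ldots, N\}} I_c(\rho, \mathcal N_k^{\otimes l}).
\end{align*}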

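Let us now focus on the general case. We shall prove the following theorem:

Theorem 10 (Converse Part: Informed Decoder, Uninformed Users). Let $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ be a compound channel. The capacities $Q_{ID}(\mathfrak I)$ and $Q(\mathfrak I)$ for $\mathfrak I$ with informed decoder and with uninformed users are bounded from above by
\[ Q(\mathfrak I) \le Q_{ID}(\mathfrak I) \le \lim_{l \to \infty} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \inf_{\mathcal N \in \mathfrak I} \frac{1}{l} I_c(\rho, \mathcal N^{\otimes l}). \]

For the proof of this theorem, we will make use of the following lemma:

Lemma 15 (cf. [22]). Let $\mathcal N, \mathcal N_i \in \mathcal C(\mathcal H, \mathcal K)$ and $d_{\mathcal K} = \dim \mathcal K$. Let $\mathcal H_r$ be an additional Hilbert space, $l \in \mathbb N$ and $\phi \in \mathcal S(\mathcal H_r \otimes \mathcal H^{\otimes l})$. If $\|\mathcal N - \mathcal N_i\|_\diamond \le \varepsilon$, then
\[ |S(\mathrm{id}_{\mathcal H_r} \otimes \mathcal N^{\otimes l}(\phi)) - S(\mathrm{id}_{\mathcal H_r} \otimes \mathcal N_i^{\otimes l}(\phi))| \le l\,(4 \varepsilon \log(d_{\mathcal K}) + 2 h(\varepsilon)). \]
Here, $h(\cdot)$ denotes the binary entropy.

This result immediately implies the following lemma:

Lemma 16. Let $\mathcal H, \mathcal K$ be finite dimensional Hilbert spaces. There is a function $\nu : [0, 1] \to \mathbb R_+$ with $\lim_{x \to 0} \nu(x) = 0$ such that for every $\mathfrak I, \mathfrak I' \subseteq \mathcal C(\mathcal H, \mathcal K)$ with $D_\diamond(\mathfrak I, \mathfrak I') \le \tau \le 1/2$ and every $l \in \mathbb N$ we have the estimates
1. \[ \Big| \frac{1}{l} I_c(\rho, \mathfrak I^{\otimes l}) - \frac{1}{l} I_c(\rho, \mathfrak I'^{\otimes l}) \Big| \le \nu(2\tau) \quad \forall \rho \in \mathcal S(\mathcal H^{\otimes l}). \]
2. \[ \Big| \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}) - \frac{1}{l} \inf_{\mathcal N' \in \mathfrak I'} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N'^{\otimes l}) \Big| \le \nu(2\tau). \]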



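The function $\nu$ is given by $\nu(x) = x + 8x \log(d_{\mathcal K}) + 4h(x)$. Again, $h(\cdot)$ denotes the binary entropy.

Proof of Theorem 10. Again, the first inequality is easily seen to be true from the very definition of codes in the two cases, so we concentrate on the second. Let $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ be a compound channel and let for every $l \in \mathbb N$ an $(l, k_l)$ code for $\mathfrak I$ with informed decoder be given such that $\liminf_{l \to \infty} \frac{1}{l} \log k_l = R$ and $\lim_{l \to \infty} \inf_{\mathcal N \in \mathfrak I} F_e(\pi_{\mathcal F_l}, \mathcal R^l_{\mathcal N} \circ \mathcal N^{\otimes l} \circ \mathcal P^l) = 1$ hold. Take any $0 < \tau \le 1/2$. Then it is easily seen that starting with a $\frac{\tau}{2}$-net in $\mathcal C(\mathcal H, \mathcal K)$ we can find a set $\mathfrak I'_\tau = \{\mathcal N_1, \ldots, \mathcal N_{N_\tau}\} \subset \mathfrak I$ with $N_\tau \le (6/\tau)^{2(\dim \mathcal H \cdot \dim \mathcal K)^2}$ such that for each $\mathcal N \in \mathfrak I$ there is $\mathcal N_i \in \mathfrak I'_\tau$ with $\|\mathcal N - \mathcal N_i\|_\diamond \le \tau$. Clearly, the above sequence of codes satisfies:
1. $\liminf_{l \to \infty} \frac{1}{l} \log k_l = R$, and
2. $\lim_{l \to \infty} \min_{\mathcal N_i \in \mathfrak I'_\tau} F_e(\pi_{\mathcal F_l}, \mathcal R^l_{\mathcal N_i} \circ \mathcal N_i^{\otimes l} \circ \mathcal P^l) = 1$.
From Theorem 9 it is immediately clear then, that
\[ R \le \lim_{l \to \infty} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \min_{\mathcal N_i \in \mathfrak I'_\tau} \frac{1}{l} I_c(\rho, \mathcal N_i^{\otimes l}), \]
and from the first estimate in Lemma 16 we get, by noting that $D_\diamond(\mathfrak I, \mathfrak I'_\tau) \le \tau$ holds,
\[ R \le \lim_{l \to \infty} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \inf_{\mathcal N \in \mathfrak I} \frac{1}{l} I_c(\rho, \mathcal N^{\otimes l}) + \nu(2\tau). \]
Taking the limit $\tau \to 0$ proves the theorem.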

7.2. The Informed Encoder. The case of an informed encoder can be treated in the same manner as the other two cases. We will just state the theorem and very briefly indicate the central ideas of the proof.

Theorem 11 (Converse Part: Informed Encoder). Let $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ be a compound channel. The capacity $Q_{IE}(\mathfrak I)$ for $\mathfrak I$ with informed encoder is bounded from above by
\[ Q_{IE}(\mathfrak I) \le \lim_{l \to \infty} \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}). \]

Proof. The proof of this theorem is a trivial modification of the one for Theorem 10. Again, the first part of the proof is the converse in the finite case, while the second part uses the second estimate in Lemma 16. For the proof in the finite case note the following: due to the data processing inequality, the structure of the proof is entirely independent from the decoder. A change from an informed decoder to an uninformed decoder does not change our estimate. The only important change is that there will be a whole set $\{e^1_{i_1}, \ldots, e^N_{i_N}\}$ of vector states satisfying Eq. (76), one for each channel in $\mathfrak I$. This causes the state $\rho^l$ in Eq. (78) to depend on the channel.



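8. Continuity of Compound Capacity

This section is devoted to a question that has been answered only recently in [22] for single-channel capacities, namely that of continuity of capacities of quantum channels. The question is relevant not only from a mathematical point of view, but might also have a strong impact on applications. It seems a hard task in general to compute the regularized capacity formulas obtained so far for quantum channels. There are, however, cases where the regularized capacity formula can be reduced to a one-shot quantity (see for example [8] and references therein) that can be calculated using standard optimization techniques. Knowing that capacity is a continuous quantity, one could ask how close an arbitrary (compound) channel is to a (compound) channel with one-shot capacity and thereby get an estimate on arbitrary capacities. We will now state the main result of this section.

Theorem 12 (Continuity of Compound Capacity). The compound capacities $Q(\cdot)$, $Q_{ID}(\cdot)$ and $Q_{IE}(\cdot)$ are continuous. To be more precise, let $\mathfrak I, \mathfrak I' \subset \mathcal C(\mathcal H, \mathcal K)$ be two compound channels with $D_\diamond(\mathfrak I, \mathfrak I') \le \varepsilon \le 1/2$. Then
\[ |Q(\mathfrak I) - Q(\mathfrak I')| = |Q_{ID}(\mathfrak I) - Q_{ID}(\mathfrak I')| \le \nu(2\varepsilon), \qquad |Q_{IE}(\mathfrak I) - Q_{IE}(\mathfrak I')| \le \nu(2\varepsilon), \]
where the function $\nu$ is taken from Lemma 16.

Remark 4. Let $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$. Then $D_\diamond(\mathfrak I, \bar{\mathfrak I}) = 0$, implying that the three different capacities of $\mathfrak I$ coincide with those for $\bar{\mathfrak I}$. We may thus define the equivalence relation $\mathfrak I \sim \mathfrak I' \Leftrightarrow \bar{\mathfrak I} = \bar{\mathfrak I}'$ and even use $D_\diamond$ as a metric on the set of equivalence classes without losing any information about our channels.

Proof. Let $D_\diamond(\mathfrak I, \mathfrak I') \le \varepsilon$. By the first estimate in Lemma 16 and the capacity formula $Q_{ID}(\mathfrak I) = Q(\mathfrak I) = \lim_{l \to \infty} \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathfrak I^{\otimes l})$ we get
\begin{align*}
|Q(\mathfrak I) - Q(\mathfrak I')| &= |Q_{ID}(\mathfrak I) - Q_{ID}(\mathfrak I')| \\
&= \Big| \lim_{l \to \infty} \frac{1}{l} \Big( \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathfrak I^{\otimes l}) - \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathfrak I'^{\otimes l}) \Big) \Big| \\
&= \lim_{l \to \infty} \Big| \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathfrak I^{\otimes l}) - \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathfrak I'^{\otimes l}) \Big| \\
&\le \lim_{l \to \infty} \nu(2\varepsilon) = \nu(2\varepsilon).
\end{align*}
For the proof in the case of an informed encoder let us first note that $Q_{IE}(\mathfrak I) = \lim_{l \to \infty} \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l})$ holds. The second estimate in Lemma 16 justifies the following inequality:
\begin{align*}
|Q_{IE}(\mathfrak I) - Q_{IE}(\mathfrak I')| &= \Big| \lim_{l \to \infty} \frac{1}{l} \Big( \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}) - \inf_{\mathcal N' \in \mathfrak I'} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N'^{\otimes l}) \Big) \Big| \\
&= \lim_{l \to \infty} \Big| \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}) - \frac{1}{l} \inf_{\mathcal N' \in \mathfrak I'} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N'^{\otimes l}) \Big| \\
&\le \lim_{l \to \infty} \nu(2\varepsilon) = \nu(2\varepsilon).
\end{align*}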
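To get a feeling for the size of the continuity bound in Theorem 12, the following minimal sketch evaluates $\nu(2\varepsilon)$ with $\nu(x) = x + 8x\log(d_{\mathcal K}) + 4h(x)$ from Lemma 16. It assumes logarithms to base 2; the function names are illustrative and not taken from the paper.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy h(x) in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def nu(x: float, d_K: int) -> float:
    """nu(x) = x + 8 x log(d_K) + 4 h(x); base-2 logarithm is an assumption."""
    return x + 8.0 * x * math.log2(d_K) + 4.0 * binary_entropy(x)


def capacity_gap_bound(eps: float, d_K: int) -> float:
    """Bound nu(2*eps) on the capacity difference of two compound channels
    whose distance is at most eps, where 0 <= eps <= 1/2."""
    if not 0.0 <= eps <= 0.5:
        raise ValueError("the bound is stated for 0 <= eps <= 1/2")
    return nu(2.0 * eps, d_K)


if __name__ == "__main__":
    # The bound shrinks to zero as the two compound channels get closer.
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        print(f"eps = {eps:g}: bound = {capacity_gap_bound(eps, d_K=2):.4f}")
```

The bound is only informative for small $\varepsilon$, since the term $4h(2\varepsilon)$ decays slowly.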

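9. Entanglement-Generating Capacity of Compound Channels

In this last section we will use the results obtained so far to achieve our main goal. Namely, we will determine the entanglement-generating capacity of quantum compound channels. We give the definitions of codes and capacity only for the most interesting case of uninformed users, since the definitions in the remaining cases are analogous. Nevertheless, we will state the coding result in all three cases.

An entanglement-generating $(l, k_l)$-code for the compound channel $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ with uninformed users consists of a pair $(\mathcal R^l, \varphi_l)$, where $\mathcal R^l \in \mathcal C(\mathcal K^{\otimes l}, \mathcal F_l)$ with $k_l = \dim \mathcal F_l$ and $\varphi_l$ is a pure state on $\mathcal F_l \otimes \mathcal H^{\otimes l}$. $R \in \mathbb R_+$ is called an achievable rate for $\mathfrak I$ with uninformed users if there is a sequence of $(l, k_l)$ entanglement-generating codes with
1. $\liminf_{l \to \infty} \frac{1}{l} \log k_l \ge R$, and
2. $\lim_{l \to \infty} \inf_{\mathcal N \in \mathfrak I} F(|\psi_l\rangle\langle\psi_l|, (\mathrm{id}_{\mathcal F_l} \otimes \mathcal R^l \circ \mathcal N^{\otimes l})(|\varphi_l\rangle\langle\varphi_l|)) = 1$,
where $\psi_l$ denotes the standard maximally entangled state on $\mathcal F_l \otimes \mathcal F_l$ and $F(\cdot, \cdot)$ is the fidelity. The entanglement-generating capacity of $\mathfrak I$ with uninformed users is then defined as the least upper bound of all achievable rates and is denoted by $E(\mathfrak I)$. The entanglement-generating capacities $E_{ID}(\mathfrak I)$ and $E_{IE}(\mathfrak I)$ of $\mathfrak I$ with informed decoder or informed encoder are obtained if we allow the decoder or preparator to choose $\mathcal R^l$ or $\varphi_l$ in dependence of $\mathcal N \in \mathfrak I$.

Recall from the proof of Theorem 9 that to each subspace $\mathcal G \subset \mathcal H$ and $\varepsilon > 0$ we always can find a subspace $\mathcal F_l \subset \mathcal G^{\otimes l} \subset \mathcal H^{\otimes l}$, a recovery operation $\mathcal R^l \in \mathcal C(\mathcal K^{\otimes l}, \mathcal F_l)$, and a unitary operation $\mathcal U^l \in \mathcal C(\mathcal H^{\otimes l}, \mathcal H^{\otimes l})$ with
\[ k_l = \dim \mathcal F_l \ge 2^{l(\inf_{\mathcal N \in \mathfrak I} I_c(\pi_{\mathcal G}, \mathcal N) - \frac{\varepsilon}{2} - o(l^0))}, \tag{80} \]
and
\[ \inf_{\mathcal N \in \mathfrak I} F_e(\pi_{\mathcal F_l}, \mathcal R^l \circ \mathcal N^{\otimes l} \circ \mathcal U^l) = 1 - o(l^0). \tag{81} \]
Notice that the maximally entangled state $\psi_l$ in $\mathcal F_l \otimes \mathcal F_l$ purifies the maximally mixed state $\pi_{\mathcal F_l}$ on $\mathcal F_l$, and defining $|\varphi_l\rangle\langle\varphi_l| := \mathcal U^l(|\psi_l\rangle\langle\psi_l|)$, the relation (81) can be rewritten as
\[ \inf_{\mathcal N \in \mathfrak I} F(|\psi_l\rangle\langle\psi_l|, \mathrm{id}_{\mathcal F_l} \otimes \mathcal R^l \circ \mathcal N^{\otimes l}(|\varphi_l\rangle\langle\varphi_l|)) = 1 - o(l^0). \tag{82} \]
This together with (80) shows that
\[ E(\mathfrak I) \ge \inf_{\mathcal N \in \mathfrak I} I_c(\pi_{\mathcal G}, \mathcal N). \tag{83} \]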
Thus, using the compound BSST Lemma 10 and arguing as in the proof of Theorem 7, we can conclude that 1 max inf Ic (ρ, N ⊗l ). E(I) ≥ Q(I) = lim (84) l→∞ l ρ∈S (H⊗l ) N ∈I



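Since $E(\mathfrak I) \le E_{ID}(\mathfrak I)$ holds, it suffices to show
\[ E_{ID}(\mathfrak I) \le Q(\mathfrak I) = \lim_{l \to \infty} \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \inf_{\mathcal N \in \mathfrak I} I_c(\rho, \mathcal N^{\otimes l}) \tag{85} \]
in order to establish the coding theorem for $E_{ID}(\mathfrak I)$ and $E(\mathfrak I)$ simultaneously. The proof of (85) relies on Lemma 14 and the data processing inequality. Indeed, let $R \in \mathbb R_+$ be an achievable entanglement generation rate for $\mathfrak I$ with informed decoder and let $((\mathcal R^l_{\mathcal N})_{\mathcal N \in \mathfrak I}, \varphi_l)_{l \in \mathbb N}$ be a corresponding sequence of $(l, k_l)$-codes, i.e. we have
1. $\liminf_{l \to \infty} \frac{1}{l} \log k_l \ge R$, and
2. $\inf_{\mathcal N \in \mathfrak I} F(|\psi_l\rangle\langle\psi_l|, (\mathrm{id}_{\mathcal F_l} \otimes \mathcal R^l_{\mathcal N} \circ \mathcal N^{\otimes l})(|\varphi_l\rangle\langle\varphi_l|)) = 1 - \varepsilon_l$,
where $\lim_{l \to \infty} \varepsilon_l = 0$ and $\psi_l$ denotes the standard maximally entangled state on $\mathcal F_l \otimes \mathcal F_l$ with Schmidt rank $k_l$. Set $\rho^l := \mathrm{tr}_{\mathcal F_l}(|\varphi_l\rangle\langle\varphi_l|)$ and
\[ \sigma^l_{\mathcal N} := \mathrm{id}_{\mathcal F_l} \otimes \mathcal R^l_{\mathcal N} \circ \mathcal N^{\otimes l}(|\varphi_l\rangle\langle\varphi_l|). \]
Then the data processing inequality and Lemma 14 imply for each $\mathcal N \in \mathfrak I$,
\begin{align*}
I_c(\rho^l, \mathcal N^{\otimes l}) &\ge I_c(\rho^l, \mathcal R^l_{\mathcal N} \circ \mathcal N^{\otimes l}) \\
&= \Delta S(\sigma^l_{\mathcal N}) \\
&\ge \Delta S(|\psi_l\rangle\langle\psi_l|) - \frac{2}{e} - 8 \log(k_l) \sqrt{\varepsilon_l} \\
&= \log k_l - \frac{2}{e} - 8 \log(k_l) \sqrt{\varepsilon_l}.
\end{align*}
Consequently,
\[ (1 - 8\sqrt{\varepsilon_l}) \frac{1}{l} \log k_l \le \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \inf_{\mathcal N \in \mathfrak I} I_c(\rho, \mathcal N^{\otimes l}) + \frac{2}{l e}, \tag{86} \]
and we end up with
\[ R \le \limsup_{l \to \infty} \frac{1}{l} \log k_l \le \lim_{l \to \infty} \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \inf_{\mathcal N \in \mathfrak I} I_c(\rho, \mathcal N^{\otimes l}), \]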

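which implies (85). The expression for $E_{IE}(\mathfrak I)$ is obtained in a similar fashion. We summarize the results in the following theorem.

Theorem 13 (Entanglement-Generating Capacities of $\mathfrak I$). For arbitrary compound channels $\mathfrak I \subset \mathcal C(\mathcal H, \mathcal K)$ we have
\[ E(\mathfrak I) = E_{ID}(\mathfrak I) = Q(\mathfrak I) = \lim_{l \to \infty} \frac{1}{l} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} \inf_{\mathcal N \in \mathfrak I} I_c(\rho, \mathcal N^{\otimes l}), \]
and
\[ E_{IE}(\mathfrak I) = Q_{IE}(\mathfrak I) = \lim_{l \to \infty} \frac{1}{l} \inf_{\mathcal N \in \mathfrak I} \max_{\rho \in \mathcal S(\mathcal H^{\otimes l})} I_c(\rho, \mathcal N^{\otimes l}). \]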



10. Conclusion and Further Remarks

We have demonstrated that universal codes in the sense of compound quantum channels exist, and we determined the best achievable rates. The results are analogous to the well-known results from classical information theory obtained by Wolfowitz [32,33], and Blackwell, Breiman and Thomasian [5]. In contrast to the classical results on compound channels there is, in general, no single-letter description of the quantum capacities for entanglement transmission and generation over compound quantum channels. Notice, however, that for compound channels with classical input and quantum output (cq-channels) a single-letter characterization of the capacity is always possible according to the results of [3].

Natural candidates of compound quantum channels that might admit a single-letter capacity formula are given by sets of quantum channels consisting entirely of degradable channels. While it is quite easy to see from the results in [8] that the degradable compound quantum channels with informed encoder have a single-letter capacity formula for entanglement transmission and generation, the corresponding statement in the uninformed case seems to be less obvious. This and related questions will be addressed in a future work.

Another issue we left open in this paper is the relation of the capacities considered here to other quantum communication tasks, for example to the subspace transmission and average subspace transmission and even to the randomized versions thereof. Again, we hope to come back to this point at some later time.

Acknowledgement. We would like to thank Mary Beth Ruskai and the referee for many helpful suggestions and advice that led to significant improvement of the overall structure and readability of the paper. I.B. is supported by the Deutsche Forschungsgemeinschaft (DFG) via project “Entropie und Kodierung großer Quanten-Informationssysteme” at the TU Berlin. H.B. and J.N. are grateful for the support by TU Berlin through the fund for basic research.

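A. Appendix

Let $\mathcal E$ and $\mathcal G$ be subspaces of $\mathcal H$ with $\mathcal E \subset \mathcal G \subset \mathcal H$, where $k := \dim \mathcal E$, $d_{\mathcal G} := \dim \mathcal G$. $p$ and $p_{\mathcal G}$ will denote the orthogonal projections onto $\mathcal E$ and $\mathcal G$. For a Haar distributed random variable $U$ with values in $\mathcal U(\mathcal G)$ and $x, y \in \mathcal B(\mathcal H)$ we define a random sesquilinear form
\[ b_{UpU^*}(x, y) := \mathrm{tr}(UpU^* x^* UpU^* y) - \frac{1}{k} \mathrm{tr}(UpU^* x^*) \mathrm{tr}(UpU^* y). \]
In this appendix we will give an elementary derivation of the formula
\[ \mathbb E\{b_{UpU^*}(x, y)\} = \frac{k^2 - 1}{d_{\mathcal G}^2 - 1} \mathrm{tr}(p_{\mathcal G} x^* p_{\mathcal G} y) + \frac{1 - k^2}{d_{\mathcal G}(d_{\mathcal G}^2 - 1)} \mathrm{tr}(p_{\mathcal G} x^*) \mathrm{tr}(p_{\mathcal G} y) \tag{87} \]
for all $x, y \in \mathcal B(\mathcal H)$, where the expectation is taken with respect to the random variable $U$. Let us set $p_U := UpU^*$. Since $\mathrm{tr}(p_U x^* p_U y)$ and $\mathrm{tr}(p_U x^*) \mathrm{tr}(p_U y)$ depend sesquilinearly on $(x, y) \in \mathcal B(\mathcal H) \times \mathcal B(\mathcal H)$ it suffices to consider operators of the form
\[ x = |f_1\rangle\langle g_1| \quad \text{and} \quad y = |f_2\rangle\langle g_2| \tag{88} \]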


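with suitable $f_1, f_2, g_1, g_2 \in \mathcal H$. With $x, y$ as in (88) we obtain
\[ \mathrm{tr}(p_U x^* p_U y) = \langle f_1, p_U f_2 \rangle \langle g_2, p_U g_1 \rangle = \langle f_1 \otimes g_2, (U \otimes U)(p \otimes p)(U^* \otimes U^*) f_2 \otimes g_1 \rangle, \tag{89} \]
and
\[ \mathrm{tr}(p_U x^*) \mathrm{tr}(p_U y) = \mathrm{tr}((p_U \otimes p_U)(|g_1\rangle\langle f_1| \otimes |f_2\rangle\langle g_2|)) = \langle f_1 \otimes g_2, (U \otimes U)(p \otimes p)(U^* \otimes U^*) g_1 \otimes f_2 \rangle. \tag{90} \]
Since the range of the random projection $(U \otimes U)(p \otimes p)(U^* \otimes U^*)$ is contained in $\mathcal G \otimes \mathcal G$ we see from (89) and (90) that we may (and will) w.l.o.g. assume that $f_1, f_2, g_1, g_2 \in \mathcal G$. Moreover, (89) and (90) show, due to the linearity of expectation, that the whole task of computing the average in (87) boils down to the determination of
\[ A(p) := \mathbb E((U \otimes U)(p \otimes p)(U^* \otimes U^*)) = \int_{\mathcal U(\mathcal G)} (u \otimes u)(p \otimes p)(u^* \otimes u^*) \, du. \tag{91} \]
Obviously, $A(p)$ is $u \otimes u$-invariant, i.e. $A(p)(u \otimes u) = (u \otimes u) A(p)$ for all $u \in \mathcal U(\mathcal G)$. It is fairly standard (and proven by elementary means in [30]) that then
\[ A(p) = \alpha \Pi_s + \beta \Pi_a, \tag{92} \]
where $\Pi_s$ and $\Pi_a$ denote the projections onto the symmetric and antisymmetric subspaces of $\mathcal G \otimes \mathcal G$. More specifically
\[ \Pi_s := \frac{1}{2}(\mathrm{id} + F), \qquad \Pi_a := \frac{1}{2}(\mathrm{id} - F), \]
with $\mathrm{id}(f \otimes g) = f \otimes g$ and $F(f \otimes g) = g \otimes f$, for all $f, g \in \mathcal G$. Since $\Pi_s$ and $\Pi_a$ are obviously $u \otimes u$-invariant, and $\Pi_s \Pi_a = \Pi_a \Pi_s = 0$ holds, the coefficients $\alpha$ and $\beta$ in (92) are given by
\[ \alpha = \frac{1}{\mathrm{tr}(\Pi_s)} \mathrm{tr}((p \otimes p)\Pi_s) = \frac{2}{d_{\mathcal G}(d_{\mathcal G} + 1)} \mathrm{tr}((p \otimes p)\Pi_s), \tag{93} \]
and
\[ \beta = \frac{1}{\mathrm{tr}(\Pi_a)} \mathrm{tr}((p \otimes p)\Pi_a) = \frac{2}{d_{\mathcal G}(d_{\mathcal G} - 1)} \mathrm{tr}((p \otimes p)\Pi_a), \tag{94} \]
where $d_{\mathcal G} = \dim \mathcal G$ and we have used the facts that
\[ \mathrm{tr}(\Pi_s) = \dim \mathrm{ran}(\Pi_s) = \frac{d_{\mathcal G}(d_{\mathcal G} + 1)}{2} \quad \text{and} \quad \mathrm{tr}(\Pi_a) = \dim \mathrm{ran}(\Pi_a) = \frac{d_{\mathcal G}(d_{\mathcal G} - 1)}{2}. \]
It is easily seen by an explicit computation with a suitable basis that
\[ \mathrm{tr}((p \otimes p)\Pi_s) = \frac{1}{2}(k^2 + k) \quad \text{and} \quad \mathrm{tr}((p \otimes p)\Pi_a) = \frac{1}{2}(k^2 - k). \tag{95} \]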



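For example, choosing any orthonormal basis $\{e_1, \ldots, e_{d_{\mathcal G}}\}$ of $\mathcal G$ with $e_1, \ldots, e_k \in \mathrm{ran}(p)$ we obtain
\begin{align*}
\mathrm{tr}((p \otimes p)\Pi_s) &= \sum_{i,j=1}^{d_{\mathcal G}} \langle e_i \otimes e_j, (p \otimes p)\Pi_s \, e_i \otimes e_j \rangle \\
&= \sum_{i,j=1}^{k} \langle e_i \otimes e_j, (p \otimes p)\Pi_s \, e_i \otimes e_j \rangle \\
&= \frac{1}{2} \Big( \sum_{i,j=1}^{k} \langle e_i, e_i \rangle \langle e_j, e_j \rangle + \langle e_i, e_j \rangle \langle e_j, e_i \rangle \Big) \\
&= \frac{1}{2}(k^2 + k),
\end{align*}
with a similar calculation for $\mathrm{tr}((p \otimes p)\Pi_a)$. Utilizing (93), (94), (95), and (92) we end up with
\[ A(p) = \frac{k^2 + k}{d_{\mathcal G}(d_{\mathcal G} + 1)} \Pi_s + \frac{k^2 - k}{d_{\mathcal G}(d_{\mathcal G} - 1)} \Pi_a. \tag{96} \]
Now, (96), (91), (90), (89), and some simple algebra show that
\[ \mathbb E\Big\{ \mathrm{tr}(UpU^* x^* UpU^* y) - \frac{1}{k} \mathrm{tr}(UpU^* x^*) \mathrm{tr}(UpU^* y) \Big\} = \frac{k^2 - 1}{d_{\mathcal G}^2 - 1} \mathrm{tr}(x^* y) + \frac{1 - k^2}{d_{\mathcal G}(d_{\mathcal G}^2 - 1)} \mathrm{tr}(x^*) \mathrm{tr}(y). \]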
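The "simple algebra" leading from (96) to the final formula can be checked numerically. The sketch below is an illustration under the simplifying assumption $\mathcal G = \mathcal H = \mathbb C^d$ (so $p_{\mathcal G}$ is the identity). It uses the identity $\mathrm{tr}((a \otimes b)F) = \mathrm{tr}(ab)$ for the flip operator $F$ to rewrite both expectations in terms of $A(p)$; the helper names are hypothetical.

```python
import numpy as np


def swap_operator(d):
    """Flip operator F on C^d (x) C^d with F(f (x) g) = g (x) f."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return F


def check_appendix_identities(d, k, seed=0):
    """Return the traces in (95), tr(Pi_s), tr(Pi_a) and both sides of the final
    formula, assuming G = H = C^d and a rank-k projection p (1 <= k <= d, d >= 2)."""
    rng = np.random.default_rng(seed)
    ident = np.eye(d * d)
    F = swap_operator(d)
    Pi_s = (ident + F) / 2
    Pi_a = (ident - F) / 2
    p = np.diag([1.0] * k + [0.0] * (d - k))
    pp = np.kron(p, p)
    tr_s = np.trace(pp @ Pi_s)
    tr_a = np.trace(pp @ Pi_a)
    # A(p) as in (92)-(94): coefficients are ratios of traces.
    A = (tr_s / np.trace(Pi_s)) * Pi_s + (tr_a / np.trace(Pi_a)) * Pi_a
    x = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    y = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    xs = x.conj().T
    xy = np.kron(xs, y)
    # E tr(p_U x* p_U y) = tr(A (x* (x) y) F),  E tr(p_U x*) tr(p_U y) = tr(A (x* (x) y)).
    lhs = np.trace(A @ xy @ F) - np.trace(A @ xy) / k
    rhs = ((k**2 - 1) / (d**2 - 1)) * np.trace(xs @ y) \
        + ((1 - k**2) / (d * (d**2 - 1))) * np.trace(xs) * np.trace(y)
    return {"tr_s": tr_s, "tr_a": tr_a, "dim_s": np.trace(Pi_s),
            "dim_a": np.trace(Pi_a), "lhs": lhs, "rhs": rhs}


if __name__ == "__main__":
    print(check_appendix_identities(d=4, k=2))
```

The check is deterministic: it does not sample Haar unitaries but takes the invariance argument behind (92) for granted and verifies only the trace computations and the final algebra.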

References
1. Barnum, H., Knill, E., Nielsen, M.A.: On Quantum Fidelities and Channel Capacities. IEEE Trans. Inf. Th. 46, 1317–1329 (2000); Barnum, H., Nielsen, M.A., Schumacher, B.: Information transmission through a noisy quantum channel. Phys. Rev. A 57(6), 4153 (1998)
2. Bennett, C.H., Shor, P.W., Smolin, J.A., Thapliyal, A.V.: Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem. IEEE Trans. Inf. Th. 48, 2637–2655 (2002)
3. Bjelaković, I., Boche, H.: Classical Capacities of Averaged and Compound Quantum Channels. IEEE Trans. Inf. Th. 57(7), 3360–3374 (2009)
4. Bjelaković, I., Boche, H., Nötzel, J.: Quantum capacity of a class of compound channels. Phys. Rev. A 78(4), 042331 (2008)
5. Blackwell, D., Breiman, L., Thomasian, A.J.: The capacity of a class of channels. Ann. Math. Stat. 30(4), 1229–1241 (1959)
6. Choi, M.-D.: Completely positive linear maps on complex matrices. Linear Alg. Appl. 10, 285–290 (1975)
7. Csiszár, I., Körner, J.: Information Theory; Coding Theorems for Discrete Memoryless Systems. Budapest: Akadémiai Kiadó, New York: Academic Press Inc., 1981
8. Cubitt, T., Ruskai, M., Smith, G.: The structure of degradable quantum channels. J. Math. Phys. 49(10), 102104 (2008)
9. Datta, N., Dorlas, T.C.: The coding theorem for a class of quantum channels with long-term memory. J. Phys. A: Math. Theor. 40, 8147–8164 (2007)
10. Devetak, I.: The private classical capacity and quantum capacity of a quantum channel. IEEE Trans. Inf. Th. 51(1), 44–55 (2005)
11. Devetak, I., Winter, A.: Distillation of secret key and entanglement from quantum states. Proc. R. Soc. A 461, 207–235 (2005)
12. Hayashi, M.: Universal coding for classical-quantum channel. http://arxiv.org/abs/0805.4092v2 [quant-ph], 2008
13. Hayden, P., Horodecki, M., Winter, A., Yard, J.: A decoupling approach to the quantum capacity. Open. Syst. Inf. Dyn. 15, 7–19 (2008)
14. Hayden, P., Shor, P.W., Winter, A.: Random quantum codes from gaussian ensembles and an uncertainty relation. Open. Syst. Inf. Dyn. 15, 71–89 (2008)
15. Holevo, A.S.: The Capacity of the quantum channel with general signal states. IEEE Trans. Inf. Th. 44(1), 269–273 (1998)
16. Holevo, A.S.: On entanglement-assisted classical capacity. J. Math. Phys. 43(9), 4326–4333 (2002)
17. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge: Cambridge University Press, 1999
18. Jozsa, R., Horodecki, M., Horodecki, P., Horodecki, R.: Universal quantum information compression. Phys. Rev. Lett. 81(8), 1714–1717 (1998)
19. Kitaev, A.Yu., Shen, A.H., Vyalyi, M.N.: Classical and Quantum Computation. Graduate Studies in Mathematics 47, Providence, RI: Amer. Math. Soc., 2002
20. Klesse, R.: Approximate Quantum Error Correction, Random Codes, and Quantum Channel Capacity. Phys. Rev. A 75, 062315 (2007)
21. Kretschmann, D., Werner, R.F.: Tema con variazioni: quantum channel capacity. New J. Phys. 6, 26–59 (2004)
22. Leung, D., Smith, G.: Continuity of quantum channel capacities. http://arxiv.org/abs/0810.4931v1 [quant-ph], 2009
23. Lloyd, S.: Capacity of the noisy quantum channel. Phys. Rev. A 55(3), 1613–1622 (1997)
24. Milman, V.D., Schechtman, G.: Asymptotic Theory of Finite Dimensional Normed Spaces. Lecture Notes in Mathematics 1200, Berlin: Springer-Verlag, corrected second printing, 2001
25. Ogawa, T., Nagaoka, H.: Strong converse to the quantum channel coding theorem. IEEE Trans. Inf. Th. 45(7), 2486–2489 (1999)
26. Schumacher, B., Nielsen, M.A.: Quantum data processing and error correction. Phys. Rev. A 54(4), 2629–2635 (1996)
27. Schumacher, B., Westmoreland, M.D.: Sending classical information via noisy quantum channels. Phys. Rev. A 56(1), 131–138 (1997)
28. Schumacher, B., Westmoreland, M.D.: Approximate quantum error correction. Quant. Inf. Proc. 1, 5–12 (2002)
29. Shor, P.: Quantum error correction. Unpublished talk manuscript. Available at: http://www.msri.org/publications/ln/msri/2002/quantumcrypto/shor/1/
30. Werner, R.F.: Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden-variable model. Phys. Rev. A 40(8), 4277–4281 (1989)
31. Winter, A.: Coding theorem and strong converse for quantum channels. IEEE Trans. Inf. Th. 45(7), 2481–2485 (1999)
32. Wolfowitz, J.: Simultaneous channels. Arch. Rat. Mech. Anal. 4(4), 371–386 (1960)
33. Wolfowitz, J.: Coding Theorems of Information Theory. Erg. Math. Grenzgebiete 31, 3rd Edition, Berlin: Springer-Verlag, 1978

Communicated by M.B. Ruskai

Commun. Math. Phys. 292, 99–129 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0837-x

Meixner Class of Non-Commutative Generalized Stochastic Processes with Freely Independent Values I. A Characterization

Marek Bożejko (1), Eugene Lytvynov (2)

1 Instytut Matematyczny, Uniwersytet Wrocławski, Pl. Grunwaldzki 2/4, 50-384 Wrocław, Poland. E-mail: [email protected]
2 Department of Mathematics, Swansea University, Singleton Park, Swansea SA2 8PP, U.K. E-mail: [email protected]

Received: 1 December 2008 / Accepted: 23 February 2009
Published online: 22 May 2009 – © Springer-Verlag 2009

Abstract: Let $T$ be an underlying space with a non-atomic measure $\sigma$ on it (e.g. $T = \mathbb R^d$ and $\sigma$ is the Lebesgue measure). We introduce and study a class of non-commutative generalized stochastic processes, indexed by points of $T$, with freely independent values. Such a process (field), $\omega = \omega(t)$, $t \in T$, is given a rigorous meaning through smearing out with test functions on $T$, with $\int_T \sigma(dt) f(t) \omega(t)$ being a (bounded) linear operator in a full Fock space. We define a set $\mathrm{CP}$ of all continuous polynomials of $\omega$, and then define a non-commutative $L^2$-space $L^2(\tau)$ by taking the closure of $\mathrm{CP}$ in the norm $\|P\|_{L^2(\tau)} := \|P\Omega\|$, where $\Omega$ is the vacuum in the Fock space. Through the procedure of orthogonalization of polynomials, we construct a unitary isomorphism between $L^2(\tau)$ and a (Fock-space-type) Hilbert space $\mathbb F = \mathbb R \oplus \bigoplus_{n=1}^{\infty} L^2(T^n, \gamma_n)$, with explicitly given measures $\gamma_n$. We identify the Meixner class as those processes for which the procedure of orthogonalization leaves the set $\mathrm{CP}$ invariant. (Note that, in the general case, the projection of a continuous monomial of order $n$ onto the $n$-th chaos need not remain a continuous polynomial.) Each element of the Meixner class is characterized by two continuous functions $\lambda$ and $\eta \ge 0$ on $T$, such that, in the $\mathbb F$ space, $\omega$ has representation $\omega(t) = \partial_t^\dagger + \lambda(t) \partial_t^\dagger \partial_t + \partial_t + \eta(t) \partial_t^\dagger \partial_t^2$, where $\partial_t^\dagger$ and $\partial_t$ are the usual creation and annihilation operators at point $t$.

1. Introduction

In his classical work [30], Meixner searched for all probability measures $\mu$ on $\mathbb R$ with infinite support whose system of monic orthogonal polynomials $(p^{(n)})_{n=0}^{\infty}$ has an (exponential) generating function of the exponential type:
\[ \sum_{n=0}^{\infty} \frac{p^{(n)}(t)}{n!} z^n = \exp(t\psi(z) + \phi(z)) = \sum_{k=0}^{\infty} \frac{1}{k!} (t\psi(z) + \phi(z))^k. \tag{1.1} \]


Meixner discovered that this (essentially) holds if and only if there exist $\lambda \in \mathbb R$ and $\eta \ge 0$ such that the polynomials $(p^{(n)})_{n=0}^{\infty}$ satisfy the recursive relation
\[ t\, p^{(n)}(t) = p^{(n+1)}(t) + \lambda n\, p^{(n)}(t) + (n + \eta n(n-1))\, p^{(n-1)}(t). \tag{1.2} \]
(We refer to [35] for a modern presentation of this result.) From (1.2) one concludes that the measure $\mu$ can be either Gaussian, or Poisson, or gamma, or Pascal (negative binomial), or Meixner. We may now introduce in $L^2(\mathbb R, \mu)$ creation (raising) and annihilation (lowering) operators through $\partial^\dagger p^{(n)} := p^{(n+1)}$ and $\partial p^{(n)} := n p^{(n-1)}$, respectively. Then, by (1.2), the action of the operator of multiplication by $t$ in $L^2(\mathbb R, \mu)$ has a representation
\[ t\,\cdot = \partial^\dagger + \lambda \partial^\dagger \partial + \partial + \eta \partial^\dagger \partial \partial. \tag{1.3} \]
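The recursion (1.2) is easy to explore numerically. The following sketch evaluates $p^{(0)}, \ldots, p^{(n)}$ from (1.2) and checks, for the Gaussian case $\lambda = \eta = 0$, that the resulting polynomials are orthogonal with respect to the standard normal law with squared norms $n!$. The quadrature rule from NumPy and the function names are choices made for this illustration, not part of the paper.

```python
import math

import numpy as np


def meixner_polys(t, n_max, lam, eta):
    """Values p^(0)(t), ..., p^(n_max)(t) generated by recursion (1.2):
    p^(n+1) = (t - lam*n) p^(n) - (n + eta*n*(n-1)) p^(n-1)."""
    t = np.asarray(t, dtype=float)
    polys = [np.ones_like(t)]
    if n_max >= 1:
        polys.append(t.copy())  # case n = 0 of (1.2): t p^(0) = p^(1)
    for n in range(1, n_max):
        a_n = n + eta * n * (n - 1)
        polys.append((t - lam * n) * polys[n] - a_n * polys[n - 1])
    return polys


def gaussian_gram(n_max, nodes=40):
    """Gram matrix of the lam = eta = 0 polynomials under the standard normal law,
    computed with Gauss-Hermite quadrature for the weight exp(-t^2/2)."""
    t, w = np.polynomial.hermite_e.hermegauss(nodes)
    w = w / math.sqrt(2.0 * math.pi)  # normalize weights to a probability measure
    polys = meixner_polys(t, n_max, lam=0.0, eta=0.0)
    return np.array([[np.sum(w * pm * pn) for pn in polys] for pm in polys])


if __name__ == "__main__":
    np.set_printoptions(precision=6, suppress=True)
    print(gaussian_gram(4))
```

For $\lambda = \eta = 0$ the recursion produces the probabilists' Hermite polynomials, which is why the Gram matrix is diagonal with entries $n!$.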

Since Meixner’s laws are infinitely divisible, they appear as distributions of increments of corresponding Lévy processes. These are exactly Brownian motion, Poisson, gamma, Pascal, and Meixner processes. Note the first two of these processes correspond to the case $\eta = 0$, while the latter three correspond to $\eta > 0$. We will refer to all of them as the Meixner class of Lévy processes. From numerous applications of these processes let us mention that, for $\eta > 0$, they naturally appear in the study of a realization of the renormalized square of white noise, see [1,28,36] and the references therein.

In [26] (see also [14,23,24,33]), Meixner-type generalized stochastic processes with independent values were constructed and studied. More precisely, consider a standard triple of the form $S \subset L^2(\mathbb R, dt) \subset S'$, where $S$ is a nuclear space of smooth functions, and $S'$ is the dual of $S$ with respect to the central space $L^2(\mathbb R, dt)$, i.e., $S'$ is a space of generalized functions (distributions). Let $\lambda, \eta$ be parameters as in (1.2), or even more generally, let $\lambda(\cdot)$ and $\eta(\cdot)$ be smooth functions on $\mathbb R$, which give, at each $t \in \mathbb R$, parameters $\lambda(t), \eta(t)$. Then, there exists a probability measure $\mu$ on the space $S'$ which is a generalized stochastic process with independent values (in the sense of [21]), and the operator of multiplication by a monomial $\langle f, \omega \rangle$, $f \in S$, $\omega \in S'$, has a representation
\[ \langle f, \omega \rangle \cdot = \int_{\mathbb R} dt\, f(t) \big( \partial_t^\dagger + \lambda(t) \partial_t^\dagger \partial_t + \partial_t + \eta(t) \partial_t^\dagger \partial_t \partial_t \big), \]
that is,
\[ \omega(t)\,\cdot = \partial_t^\dagger + \lambda(t) \partial_t^\dagger \partial_t + \partial_t + \eta(t) \partial_t^\dagger \partial_t \partial_t \tag{1.4} \]

(compare with (1.3)). In (1.4), the operators $\partial_t^\dagger$ and $\partial_t$ are defined by analogy with the one-dimensional case, although on infinite-dimensional orthogonal polynomials of $\omega \in S'$, so that $\partial_t^\dagger$ and $\partial_t$ are the usual creation and annihilation operators at point $t$. As a result, one has a unitary isomorphism between the $L^2$-space $L^2(S', \mu)$ and some Hilbert space $\mathbb F = \bigoplus_{n=0}^{\infty} \mathbb F^{(n)}$, where $\mathbb F^{(0)} = \mathbb R$, while for each $n \in \mathbb N$, $\mathbb F^{(n)} = L^2_{\mathrm{sym}}(\mathbb R^n, \theta_n)$, the space of all symmetric functions on $\mathbb R^n$ which are square integrable with respect to some measure $\theta_n$ (depending on $\lambda$ and $\eta$). In the special case where $\eta \equiv 0$, the space $\mathbb F$ reduces to the usual symmetric Fock space over $L^2(\mathbb R, dt)$, whereas in the general case the space $\mathbb F$ is wider than the Fock space, which is why, in [26], $\mathbb F$ was called an extended Fock space.

As follows from [13,27], the Meixner class may be characterized among all generalized stochastic processes with independent values as exactly those processes whose orthogonal polynomials remain continuous polynomials. Recall that, in infinite dimensions, orthogonalization of polynomials means: first, decomposing the $L^2$-space into the infinite orthogonal sum of its subspaces generated by polynomials, and second, taking the projection of each monomial of order $n$ onto the $n$-th space. This is why, although the initial monomials are continuous functions of $\omega \in S'$, their orthogonal projections do not need to retain this property. The result of [13,27] also means that it is only for the Meixner-type processes that the multiplication operator $\omega(t)\cdot$ can be represented through the operators $\partial_t^\dagger$, $\partial_t$.

In free probability, Meixner’s systems of polynomials (on $\mathbb R$) were introduced by Anshelevich [2] and Saitoh, Yoshida [34]. (In fact, such polynomials had already occurred in many places in the literature even before [2,34], see [17, p. 62] and [5, p. 864] for bibliographical references.) The free Meixner polynomials $(q^{(n)})_{n=0}^{\infty}$ have a (usual) generating function of the resolvent type:
\[ \sum_{n=0}^{\infty} q^{(n)}(t) z^n = \big( 1 - (t\psi(z) + \phi(z)) \big)^{-1} = \sum_{k=0}^{\infty} (t\psi(z) + \phi(z))^k \tag{1.5} \]

(compare with (1.1)). Recall the following notation from q-analysis: for each q ∈ [−1, 1], we define $[0]_q := 0$ and $[n]_q := 1 + q + q^2 + \cdots + q^{n-1}$ for n ∈ N. In particular, for q = 0, we have $[0]_0 = 0$ and $[n]_0 = 1$ for n ∈ N. Then, by [2], equality (1.5) (essentially) holds if and only if there exist λ ∈ R and η ≥ 0 such that the polynomials $(q^{(n)})_{n=0}^{\infty}$ satisfy the recursive relation

\[
t\,q^{(n)}(t) = q^{(n+1)}(t) + \lambda [n]_0\, q^{(n)}(t) + \bigl([n]_0 + \eta [n]_0 [n-1]_0\bigr) q^{(n-1)}(t),
\tag{1.6}
\]

or, equivalently, equality (1.3) holds in which ∂ † and ∂ are defined through ∂ † q (n) := q (n+1) and ∂q (n) := [n]0 q (n−1) . Each measure of orthogonality of a free Meixner system of polynomials (which has an infinite support) is freely infinitely divisible, and therefore there exist corresponding free Lévy processes. A characterization of these processes in terms of a regression problem was given in [17]. These processes also appeared in the study of a realization of the renormalized square of free white noise [36]. A deep study of free Meixner polynomials of d (d ∈ N) non-commutative variables has been carried out by Anshelevich in [3,5–7]. The aim of the present paper is to introduce and study the Meixner class of noncommutative generalized stochastic processes with freely independent values, or equivalently Meixner-type free polynomials of infinitely-many (non-commutative) variables. We “translate” the aforementioned results of the theory of classical generalized stochastic processes with independent values into the language of free probability. In particular, we derive representation (1.4) for these processes in which ∂t† and ∂t are the creation and annihilation operators, as in the full Fock space, at point t. The main result of the paper—Theorem 4.1—is the characterization of the Meixner class as exactly those noncommutative generalized stochastic processes with freely independent values whose orthogonal polynomials are continuous in ω. It should be stressed that, generally speaking, the orthogonal polynomials we consider resemble one-dimensional free Meixner polynomials only in the infinitesimal sense, i.e., at each point of the underlying space. The paper is organized as follows. We start, in Sect. 2, with a discussion of processes of Gauss–Poisson type. We fix an underlying space T and a non-atomic measure σ on it. 
(Although the most important case is when T is either R or [0, ∞) and σ is the Lebesgue measure, we prefer to deal with a general space to stress that its structure does not play any significant role.) We fix a function λ ∈ C(T ), and consider a process (noise) of the form ω(t) = ∂t† + ∂t + λ(t)∂t† ∂t in the full Fock space over L 2 (T, σ ). This process is given a rigorous meaning through smearing out with a test function f on T . We introduce a free expectation τ and the corresponding (non-commutative) L 2 -space L 2 (τ ). In terms of the expansion through orthogonal polynomials of ω, the space L 2 (τ ) is unitarily isomorphic to the original Fock space. We prove that the procedure of orthogonalization in L 2 (τ ) is equivalent to the procedure of free Wick (normal) ordering of the operators ∂t† and ∂t . This, in particular, generalizes a corresponding result of [16, p. 137], which was proved in the Gaussian case, i.e., when λ ≡ 0 (compare also with [4, p. 186]). We note, however, that in [16], the authors did not use the Wick ordering in the infinitesimal sense, which is only possible when λ ≡ 0. We then derive theorems giving a Wick rule for the product ω(t1 ) · · · ω(tn ), as well as a Wick rule for a product of Wick products. The latter theorems present a free counterpart of results of [29], see also [4, Prop. 6] for a q-case. In Sect. 3, we study (quite) general non-commutative generalized stochastic processes with freely independent values. They are described by assigning to each t ∈ T a compactly supported probability measure µ(t, ds) on R, so that µ(t, {0}) is the diffusion coefficient of the process, while outside zero $\nu(t, ds) := \frac{1}{s^2}\,\mu(t, ds)$ is the Lévy measure of “jumps” at point t (compare with [8–10]). We prove that the set of continuous polynomials of ω is dense in the corresponding space L 2 (τ ), introduce orthogonal polynomials, decompose any element of L 2 (τ ) into an infinite sum of orthogonal polynomials, and thus derive a unitary isomorphism between L 2 (τ ) and an extended full Fock space $\mathbb F = \bigoplus_{n=0}^{\infty} \mathbb F^{(n)}$, where $\mathbb F^{(n)} = L^2(T^n, \gamma_n)$ with some measure $\gamma_n$ on $T^n$. We also present an explicit form of the action of the operators of (left) multiplication by $\langle f, \omega\rangle$ realized in the space $\mathbb F$.
These operators have a clear Jacobi-field structure (compare with [12,13,18,25]). To derive our results, we produce an expansion of L 2 (τ ) in multiple stochastic integrals, by analogy with the Nualart and Schoutens result [31] in the classical case. In fact, Anshelevich [4] extended the result of [31] to the case of general q-Lévy processes. Comparing our result in this section with that of [4], we note that, first, we do not assume the process to be stationary, i.e., we allow the Lévy measure to depend on t, and second, what is much more important, our main results in this section, Theorems 3.3 and 3.4, are new even in the stationary case (when q = 0). Finally, in Sect. 4, we derive the Meixner class of free processes as exactly those non-commutative generalized stochastic processes with freely independent values for which orthogonal polynomials are continuous in ω, and thus we derive a counterpart of formula (1.4) in the free case. In the second part of this paper, which is currently in preparation, we will discuss the generating function for the orthogonal polynomials of ω from the Meixner class and other related problems, and we will also mention some open problems.

2. Free Gauss-Poisson Process

Let T be a locally compact, second countable Hausdorff topological space. Recall that such a space is known to be Polish. A subset of T is called bounded if it is relatively compact in T . We will additionally assume that T does not possess isolated points, i.e., for every t ∈ T , there exists a sequence $\{t_n\}_{n=1}^{\infty} \subset T$ such that $t_n \ne t$ for all n ∈ N, and $t_n \to t$ as n → ∞. We denote by B(T ) the Borel σ -algebra in T , and by B0 (T ) the collection of all relatively compact sets from B(T ). Let D := C0 (T ) denote the set of all real-valued continuous functions on T with compact support. Analogously, we define D(n) := C0 (T n ), n ∈ N, and D(0) := R.



For a real, separable Hilbert space H we denote by F(H) the full Fock space over H, i.e., $\mathcal F(\mathcal H) := \bigoplus_{n=0}^{\infty} \mathcal H^{\otimes n}$, where $\mathcal H^{\otimes 0} := \mathbb R$. As usual, we will identify each $\mathcal H^{\otimes n}$ with the corresponding subspace of F(H). We denote by Ffin (D) the subset of F(H) consisting of all sequences f = ( f (0) , f (1) , . . . , f (n) , 0, 0, . . . ) such that f (i) ∈ D(i) , i = 0, 1, . . . , n, n ∈ N0 := N ∪ {0}. The element Ω := (1, 0, 0, . . . ) ∈ Ffin (D) is called the vacuum. Let σ be a Radon, non-atomic measure on (T, B(T )). We will assume that the measure σ satisfies σ (O) > 0 for each open, non-empty set O in T . Let H := L 2 (T, σ ) be the real L 2 -space over T with respect to the measure σ , and thus we get the Fock space F(H) = F(L 2 (T, σ )). For each f ∈ D, we denote by a + ( f ), a − ( f ), and a 0 ( f ) the corresponding creation, annihilation, and neutral operators, respectively. These are bounded linear operators on F(H) given through

\[
\begin{aligned}
a^+(f)\,g^{(n)} &= f \otimes g^{(n)}, \qquad g^{(n)} \in \mathcal H^{\otimes n},\ n \in \mathbb N_0,\\
a^-(f)\,g_1 \otimes \cdots \otimes g_n &= (f, g_1)_{\mathcal H}\; g_2 \otimes \cdots \otimes g_n,\\
a^0(f)\,g_1 \otimes \cdots \otimes g_n &= (f g_1) \otimes g_2 \otimes \cdots \otimes g_n, \qquad g_1, \dots, g_n \in \mathcal H,\ n \in \mathbb N,\\
a^-(f)\,\Omega &= a^0(f)\,\Omega = 0.
\end{aligned}
\]

The operator a + ( f ) is the adjoint of a − ( f ), whereas a 0 ( f ) is self-adjoint. Note that a + ( f ) and a − (g), f , g ∈ D, satisfy the free commutation relation

\[
a^-(g)\,a^+(f) = (g, f)_{\mathcal H},
\tag{2.1}
\]

where, as usual, a constant is understood as the constant times the identity operator 1. Throughout the paper, we will heavily use the following standard notations. For each t ∈ T , we define ∂t as the annihilation operator at point t. More precisely, we set ∂t Ω := 0, and for each f (n) ∈ D(n) , n ∈ N, we set (∂t f (n) )(t1 , . . . , tn−1 ) := f (n) (t, t1 , . . . , tn−1 ). Clearly, ∂t f (n) ∈ D(n−1) . Extending by linearity, we see that ∂t maps Ffin (D) into itself. If we introduce the “delta-function” δt by $\langle \delta_t, f\rangle = f(t)$ for f ∈ D, then the operator ∂t can be thought of as a − (δt ). Next, we heuristically define ∂t† as the creation operator at point t, i.e., ∂t† is the “adjoint” of ∂t , so that ∂t† = a + (δt ). A rigorous meaning to formulas involving ∂t† will be given through smearing with test functions. In particular, for each f ∈ D, we get:

\[
a^+(f) = \int_T \sigma(dt)\, f(t)\,\partial_t^\dagger, \qquad
a^-(f) = \int_T \sigma(dt)\, f(t)\,\partial_t, \qquad
a^0(f) = \int_T \sigma(dt)\, f(t)\,\partial_t^\dagger \partial_t.
\tag{2.2}
\]

Note that the relation (2.1) can now be written down in the form

\[
\partial_s\, \partial_t^\dagger = \delta(s, t),
\tag{2.3}
\]

where

\[
\int_T \sigma(ds) \int_T \sigma(dt)\, \delta(s, t)\, f^{(2)}(s, t) := \int_T \sigma(dt)\, f^{(2)}(t, t), \qquad f^{(2)} \in \mathcal D^{(2)}.
\tag{2.4}
\]
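The relation (2.1) and the adjointness statements above can be checked numerically in a finite toy model. The sketch below is not taken from the paper and rests on stated assumptions: T is replaced by a finite set {0, ..., d−1} with positive weights `sigma` (so the measure is atomic, unlike in the text), a Fock-space vector is stored as a dictionary from words (t1, ..., tn) to function values, and the helper names `a_plus`, `a_minus`, `a_zero`, `inner`, `fock_inner` are hypothetical.

```python
from collections import defaultdict
from math import prod

# Toy assumption: T = {0, ..., d-1}; sigma[t] > 0 is the weight of the point t.
# A vector of the full Fock space is a dict {word: value}; a word (t1, ..., tn)
# is a point of T^n and the empty word () carries the vacuum component.

def a_plus(f, v):
    """Creation: (a+(f) v)(t, t1, ..., tn) = f(t) * v(t1, ..., tn)."""
    out = defaultdict(float)
    for word, c in v.items():
        for t, ft in enumerate(f):
            out[(t,) + word] += ft * c
    return dict(out)

def a_minus(f, sigma, v):
    """Annihilation: integrate the first variable against f; kills the vacuum."""
    out = defaultdict(float)
    for word, c in v.items():
        if word:
            out[word[1:]] += sigma[word[0]] * f[word[0]] * c
    return dict(out)

def a_zero(f, v):
    """Neutral operator: multiply by f in the first variable; kills the vacuum."""
    return {word: f[word[0]] * c for word, c in v.items() if word}

def inner(f, g, sigma):
    """Inner product of the toy space H = L^2(T, sigma)."""
    return sum(s * a * b for s, a, b in zip(sigma, f, g))

def fock_inner(u, v, sigma):
    """Inner product of the toy full Fock space over H."""
    return sum(prod(sigma[t] for t in word) * c * v.get(word, 0.0)
               for word, c in u.items())
```

In this model, `a_minus(g, sigma, a_plus(f, u))` returns `inner(g, f, sigma)` times `u`, which is the discrete counterpart of (2.1).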



We now fix λ ∈ C(T ), the space of all continuous functions on T , and define, for each f ∈ D, a self-adjoint operator x( f ) := a + ( f ) + a − ( f ) + a 0 (λ f ), so that

\[
x(f) = \int_T \sigma(dt)\, f(t)\,\bigl(\partial_t^\dagger + \partial_t + \lambda(t)\,\partial_t^\dagger \partial_t\bigr).
\tag{2.5}
\]

As we will see below, if λ ≡ 0, then (x( f )) f ∈D is a free Gaussian process, and if λ ≡ 1, then (x( f )) f ∈D is a free (centered) Poisson process. In view of (2.5), we denote ω(t) := ∂t† + ∂t + λ(t)∂t† ∂t , so that (2.5) becomes

\[
x(f) = \int_T \sigma(dt)\, f(t)\,\omega(t).
\]

Thus, ω := (ω(t))t∈T can be interpreted as the corresponding free noise.

Lemma 2.1. The vacuum vector Ω is cyclic for the operator family (x( f )) f ∈D , i.e.,

\[
\operatorname{c.l.s.}\{\Omega,\ x(f_1) \cdots x(f_n)\,\Omega \mid f_1, \dots, f_n \in \mathcal D,\ n \in \mathbb N\} = \mathcal F(\mathcal H).
\]

Here and below, c. l. s. stands for the closed linear span.

Proof. The statement follows by induction from the fact that we have a Jacobi field, i.e., each operator x( f ) has a three-diagonal structure, with a + ( f ), f ∈ D, being the usual creation operators (compare with e.g. [12,25]).

We can naturally extend the definition of x( f ) to the case where f ∈ B0 (T ), the space of all real-valued bounded measurable functions on T with compact support. Let A denote the real algebra generated by (x( f )) f ∈B0 (T ) . We define a free expectation on A by $\tau(a) := (a\,\Omega, \Omega)_{\mathcal F(\mathcal H)}$, a ∈ A. Recall that a set partition π of a set X is a collection of disjoint subsets of X whose union equals X . Let N C(n) denote the collection of all non-crossing partitions of {1, . . . , n}, i.e., all set partitions π = {A1 , . . . , Ak }, k ≥ 1, of {1, . . . , n} such that there do not exist $A_i, A_j \in \pi$, $A_i \ne A_j$, for which the following inequalities hold: x1 < y1 < x2 < y2 for some x1 , x2 ∈ Ai and y1 , y2 ∈ A j . For each n ∈ N, we define a free cumulant C (n) as the n-linear mapping C (n) : B0 (T )n → R defined recurrently by the following formula, which connects the free cumulants with moments:

\[
\tau\bigl(x(f_1)\,x(f_2) \cdots x(f_n)\bigr) = \sum_{\pi \in NC(n)}\ \prod_{A \in \pi} C(A, f_1, \dots, f_n),
\tag{2.6}
\]

where for each A = {a1 , . . . , ak } ⊂ {1, 2, . . . , n}, a1 < a2 < · · · < ak , $C(A, f_1, \dots, f_n) := C^{(k)}(f_{a_1}, \dots, f_{a_k})$.
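The definition of N C(n) used in (2.6) can be made concrete by brute force. The following is a minimal sketch, assuming nothing beyond the definition above: it enumerates all set partitions of {1, ..., n} and keeps those without a crossing x1 < y1 < x2 < y2. The function names are hypothetical, and the approach is only practical for small n.

```python
def set_partitions(n):
    """Yield every set partition of {1, ..., n} as a list of blocks (lists)."""
    def rec(k, blocks):
        if k > n:
            yield [list(b) for b in blocks]
            return
        for b in blocks:          # put k into an existing block
            b.append(k)
            yield from rec(k + 1, blocks)
            b.pop()
        blocks.append([k])        # or open a new block {k}
        yield from rec(k + 1, blocks)
        blocks.pop()
    yield from rec(1, [])

def is_noncrossing(partition):
    """True if no two distinct blocks A, B have x1 < y1 < x2 < y2
    with x1, x2 in A and y1, y2 in B."""
    for A in partition:
        for B in partition:
            if A is not B and any(
                x1 < y1 < x2 < y2
                for x1 in A for x2 in A for y1 in B for y2 in B
            ):
                return False
    return True

def noncrossing_partitions(n):
    return [p for p in set_partitions(n) if is_noncrossing(p)]
```

For small n the number of elements of N C(n) produced this way agrees with the Catalan numbers, a standard fact about non-crossing partitions.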


As easily seen, $C^{(1)} \equiv 0$ and

\[
C^{(n)}(f_1, \dots, f_n) = \int_T f_1(t) \cdots f_n(t)\,\lambda^{n-2}(t)\,\sigma(dt), \qquad f_1, \dots, f_n \in B_0(T),\ n \ge 2.
\tag{2.7}
\]

By (2.6) and (2.7), the expectation τ on A is tracial, i.e., for any a, b ∈ A, τ (ab) = τ (ba).

Proposition 2.1. Let f 1 , . . . , f n ∈ B0 (T ) be such that

\[
f_i f_j = 0 \quad \sigma\text{-a.e. for all } 1 \le i < j \le n.
\tag{2.8}
\]

Then x( f i ), i = 1, . . . , n, are freely independent with respect to τ .

Proof. By (2.7) and (2.8), for each k ≥ 2 and any indices i 1 , . . . , i k ∈ {1, . . . , n} such that $i_l \ne i_m$ for some l, m ∈ {1, . . . , k}, $C^{(k)}(f_{i_1}, \dots, f_{i_k}) = 0$. From this, the statement follows by e.g. [37].

Let B0 (T )C denote the complexification of B0 (T ). We extend C (n) by linearity to the n-linear mapping $C^{(n)} : B_0(T)_{\mathbb C}^n \to \mathbb C$. For each f ∈ B0 (T )C , we denote $C^{(n)}(f) := C^{(n)}(f, \dots, f)$, and define the free cumulant transform $C(f) := \sum_{n=1}^{\infty} C^{(n)}(f)$, provided that the latter series converges absolutely. By (2.7) and the dominated convergence theorem, we get:

Proposition 2.2. Let f ∈ B0 (T )C be such that there exists ε ∈ (0, 1) for which

\[
|f(t)| < \frac{1 - \varepsilon}{|\lambda(t)|} \quad \text{for all } t \in T
\tag{2.9}
\]

(where $\frac{1-\varepsilon}{0} := +\infty$). Then

\[
C(f) = \int_T \frac{f^2(t)}{1 - \lambda(t) f(t)}\,\sigma(dt).
\tag{2.10}
\]
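Formula (2.10) is the geometric series $\sum_{n\ge 2} f^n \lambda^{n-2} = f^2/(1-\lambda f)$ integrated term by term, with (2.9) guaranteeing convergence. The sketch below illustrates this numerically under assumptions that are not in the paper: T = [0, 1] with Lebesgue measure approximated by a midpoint rule, and the particular functions `f` and `lam` chosen so that |λ f| ≤ 0.8.

```python
import math

def midpoint_integral(h, m=2000):
    """Midpoint-rule approximation of the integral of h over [0, 1]."""
    return sum(h((j + 0.5) / m) for j in range(m)) / m

# Illustrative choices (assumptions): |lam(t) * f(t)| <= 0.8 < 1 on [0, 1],
# so condition (2.9) holds with epsilon = 0.2.
def f(t):
    return 0.4 * math.cos(2.0 * math.pi * t)

def lam(t):
    return 1.0 + t

def cumulant(n):
    """C^(n)(f) from (2.7) with f_1 = ... = f_n = f, for n >= 2."""
    return midpoint_integral(lambda t: f(t) ** n * lam(t) ** (n - 2))

# Partial sum of the cumulant series (C^(1) = 0, so the sum starts at n = 2).
series = sum(cumulant(n) for n in range(2, 200))

# Right-hand side of (2.10), evaluated with the same quadrature rule.
closed_form = midpoint_integral(lambda t: f(t) ** 2 / (1.0 - lam(t) * f(t)))
```

Because both sides use the same quadrature nodes, `series` and `closed_form` differ only by the geometric tail and floating-point rounding.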

Remark 2.1. Note that, for f ∈ D, condition (2.9) is equivalent to | f (t)| < 1/|λ(t)| for all t ∈ T .

For each Δ ∈ B0 (T ), we define

\[
x(\Delta) := x(\chi_\Delta) = \int_\Delta \sigma(dt)\,\omega(t),
\]

where $\chi_\Delta$ denotes the indicator function of Δ. Then, by Proposition 2.1, for any mutually disjoint sets $\Delta_1, \dots, \Delta_n \in B_0(T)$, the operators $x(\Delta_1), \dots, x(\Delta_n)$ are freely independent, and so by analogy with the classical case (see e.g. [21]), we can interpret ω as a non-commutative generalized stochastic process with freely independent values. For each f (n) ∈ B0 (T n ), we define a monomial of ω by

\[
\begin{aligned}
\langle f^{(n)}, \omega^{\otimes n}\rangle
&:= \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\,\omega(t_1) \cdots \omega(t_n)\\
&= \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\,\bigl(\partial_{t_1}^\dagger + \partial_{t_1} + \lambda(t_1)\partial_{t_1}^\dagger \partial_{t_1}\bigr)
\times \cdots \times \bigl(\partial_{t_n}^\dagger + \partial_{t_n} + \lambda(t_n)\partial_{t_n}^\dagger \partial_{t_n}\bigr).
\end{aligned}
\tag{2.11}
\]



In fact, the presence of $\partial_{t_i}^\dagger$ in (2.11) just means the creation of a function in the $t_i$-variable, the presence of $\lambda(t_i)\partial_{t_i}^\dagger \partial_{t_i}$ means the identification of the $t_i$-variable with the previous $t_{i-1}$-variable and additional multiplication by $\lambda(t_i)$, whereas the presence of $\partial_{t_i}$ means integration in the $t_i$-variable. For example, for f (4) ∈ B0 (T 4 ) and g (2) ∈ D(2) ,

\[
\biggl(\int_{T^4} \sigma(dt_1) \cdots \sigma(dt_4)\, f^{(4)}(t_1, \dots, t_4)\,\partial_{t_1}^\dagger\, \partial_{t_2}\, \lambda(t_3)\partial_{t_3}^\dagger \partial_{t_3}\, \partial_{t_4}^\dagger\, g^{(2)}\biggr)(s_1, s_2, s_3)
= \int_T \sigma(dt)\,\lambda(t)\, f^{(4)}(s_1, t, t, t)\, g^{(2)}(s_2, s_3).
\]

Using the Cauchy–Schwarz inequality, we easily conclude that (2.11) indeed identifies a bounded linear operator in F(H). In particular, if f (n) = f 1 ⊗ · · · ⊗ f n with f 1 , . . . , f n ∈ B0 (T ), then

\[
\langle f_1 \otimes \cdots \otimes f_n, \omega^{\otimes n}\rangle = \langle f_1, \omega\rangle \cdots \langle f_n, \omega\rangle = x(f_1) \cdots x(f_n).
\]

We will also interpret constants as monomials of order 0. Let P and CP denote the set of all non-commutative polynomials (finite sums of monomials) with kernels f (n) ∈ B0 (T n ) and f (n) ∈ D(n) , respectively. (CP stands for “continuous polynomials.”) Clearly, CP ⊂ P and A ⊂ P.

Lemma 2.2. We have $\mathcal{CP}\,\Omega = \mathcal F_{\mathrm{fin}}(\mathcal D)$.

Proof. Clearly, $\mathcal{CP}\,\Omega \subset \mathcal F_{\mathrm{fin}}(\mathcal D)$. On the other hand, for each f (n) ∈ D(n) ,

\[
f^{(n)} = \langle f^{(n)}, \omega^{\otimes n}\rangle\,\Omega - g^{(n-1)},
\]

where $g^{(n-1)} \in \bigoplus_{i=0}^{n-1} \mathcal D^{(i)}$. From here, by induction, we conclude that $\mathcal F_{\mathrm{fin}}(\mathcal D) \subset \mathcal{CP}\,\Omega$.

We now naturally extend the free expectation τ to the set CP, and define an inner product

\[
(P_1, P_2)_{L^2(\tau)} := \tau(P_2 P_1) = (P_1 \Omega, P_2 \Omega)_{\mathcal F(\mathcal H)}, \qquad P_1, P_2 \in \mathcal{CP}.
\]

Let P ∈ CP and P ≠ 0. Then P ≠ 0 as an element of L 2 (τ ). Indeed, let $P = \sum_{i=0}^{n} \langle f^{(i)}, \omega^{\otimes i}\rangle$, where $f^{(n)} \ne 0$ (we then call P a polynomial of order n). Then the $\mathcal H^{\otimes n}$-th component of PΩ is f (n) , which implies that (P, P) L 2 (τ ) > 0. Hence, we can define a real Hilbert space L 2 (τ ) as the closure of CP with respect to the norm generated by the inner product (·, ·) L 2 (τ ) . As we will see below, we can naturally embed P (and so also A) into L 2 (τ ). Furthermore, we will also show that every element of L 2 (τ ) may be understood as a (generally speaking, unbounded) Hermitian operator in F(H). Let CP(n) denote the subset of CP consisting of all continuous polynomials of order ≤ n. Let MP(n) denote the closure of CP(n) in L 2 (τ ). (MP stands for “measurable polynomials.”) Let $\mathcal{OP}^{(n)} := \mathcal{MP}^{(n)} \ominus \mathcal{MP}^{(n-1)}$, n ∈ N, $\mathcal{OP}^{(0)} := \mathbb R$, where the sign ⊖ denotes the orthogonal difference in L 2 (τ ). (OP stands for “orthogonal polynomials.”) Thus, we get the orthogonal decomposition $L^2(\tau) = \bigoplus_{n=0}^{\infty} \mathcal{OP}^{(n)}$.

Proposition 2.3. Consider a linear operator I : CP → Ffin (D) given by I P = PΩ for P ∈ CP. Then, I extends to a unitary operator I : L 2 (τ ) → F(H). Furthermore, $I\,\mathcal{OP}^{(n)} = \mathcal H^{\otimes n}$.



Proof. The first statement of the proposition directly follows from Lemma 2.2. Next, it follows from the proof of Lemma 2.2 that $I\,\mathcal{CP}^{(n)} = \bigoplus_{i=0}^{n} \mathcal D^{(i)}$, so that $I\,\mathcal{MP}^{(n)} = \bigoplus_{i=0}^{n} \mathcal H^{\otimes i}$. From here, the second statement follows.

For f (n) ∈ D(n) , let P( f (n) ) denote the orthogonal projection of $\langle f^{(n)}, \omega^{\otimes n}\rangle$ onto OP(n) , i.e., by the results proved above, P( f (n) ) = I −1 f (n) .

Theorem 2.1. For each f (n) ∈ D(n) , we have P( f (n) ) ∈ CP.

Before proving Theorem 2.1, we have to introduce some notations. Let N C(n, ±1) denote the collection of all

\[
\kappa = \{(A_1, m_1), \dots, (A_k, m_k)\}, \qquad k \in \mathbb N,
\tag{2.12}
\]

such that π(κ) := {A1 , . . . , Ak } is an element of N C(n), m 1 , . . . , m k ∈ {−1, +1}, and if for some i ∈ {1, . . . , k}, the set Ai has only one element, then m i = 1. For each j ∈ {1, . . . , k}, we will interpret m j as the mark of the element A j of the non-crossing partition π(κ). Finally, we denote by G n the subset of N C(n, ±1) consisting of all κ as in (2.12) such that there do not exist i, j ∈ {1, . . . , k}, i ≠ j, for which min Ai < min A j ≤ max A j < max Ai with m j = +1, i.e., an element of a non-crossing partition with mark +1 cannot be “within” any other element of this partition. (Note that, in [3,4], elements of G n were called extended partitions, with classes labeled +1 called “classes open on the left”.)

Let n ∈ N and let us fix an arbitrary κ ∈ G n as in (2.12). We then define W (κ)ω(t1 ) · · · ω(tn ) as follows. For each i ∈ {1, . . . , k}, let Ai = { j1 , j2 , . . . , jl }, j1 < j2 < · · · < jl . If m i = −1 (and so l ≥ 2), then replace the factors $\omega(t_{j_1}), \omega(t_{j_2}), \dots, \omega(t_{j_l})$ in the product ω(t1 )ω(t2 ) · · · ω(tn ) by the “function” $\lambda^{l-2}(t_{j_1})\,\delta(t_{j_1}, t_{j_2}, \dots, t_{j_l})$. If m i = +1, then leave the factor $\omega(t_{j_1})$ without changes, and if l ≥ 2 then additionally replace the factors $\omega(t_{j_2}), \omega(t_{j_3}), \dots, \omega(t_{j_l})$ in the product ω(t1 )ω(t2 ) · · · ω(tn ) by the “function” $\lambda^{l-1}(t_{j_1})\,\delta(t_{j_1}, t_{j_2}, \dots, t_{j_l})$. Here, analogously to (2.4), we have set, for k ≥ 2,

\[
\int_{T^k} \sigma(dt_1) \cdots \sigma(dt_k)\, f^{(k)}(t_1, \dots, t_k)\,\delta(t_1, t_2, \dots, t_k) := \int_T \sigma(dt)\, f^{(k)}(t, t, \dots, t).
\]

For example, if n = 8, and κ = {({1, 2}, +1), ({3, 4, 8}, +1), ({5, 6, 7}, −1)}, then

\[
W(\kappa)\,\omega(t_1) \cdots \omega(t_8) = \lambda(t_1)\delta(t_1, t_2)\,\lambda^2(t_3)\delta(t_3, t_4, t_8)\,\lambda(t_5)\delta(t_5, t_6, t_7)\,\omega(t_1)\omega(t_3).
\]

Next, we denote by Int(n) the collection of all interval partitions of {1, . . . , n}, i.e., partitions all of whose elements are intervals of consecutive integers. Clearly, Int(n) ⊂ N C(n). We will denote by Int(n, ±1) the corresponding subset of N C(n, ±1). Note that Int(n, ±1) ⊂ G n .
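The conditions defining G n can be restated as a membership test. The sketch below is an illustration under assumptions, not part of the paper: a marked partition κ is encoded as a list of (block, mark) pairs, and the function names `in_G` and `surviving_factors` are hypothetical. `surviving_factors` lists the indices j for which ω(t_j) is left in W(κ)ω(t1)···ω(tn), namely the minimal elements of the blocks with mark +1.

```python
def is_noncrossing(blocks):
    """True if no two distinct blocks cross (x1 < y1 < x2 < y2)."""
    for A in blocks:
        for B in blocks:
            if A is not B and any(
                x1 < y1 < x2 < y2
                for x1 in A for x2 in A for y1 in B for y2 in B
            ):
                return False
    return True

def in_G(kappa, n):
    """Check whether kappa = [(block, mark), ...] belongs to G_n."""
    blocks = [sorted(A) for A, _ in kappa]
    marks = [m for _, m in kappa]
    # The blocks must form a set partition of {1, ..., n}.
    if sorted(x for A in blocks for x in A) != list(range(1, n + 1)):
        return False
    if any(m not in (+1, -1) for m in marks):
        return False
    if not is_noncrossing(blocks):
        return False
    # One-element blocks must carry the mark +1.
    if any(len(A) == 1 and m != +1 for A, m in zip(blocks, marks)):
        return False
    # A block with mark +1 may not lie "within" another block.
    for i, Ai in enumerate(blocks):
        for j, Aj in enumerate(blocks):
            if i != j and marks[j] == +1 and Ai[0] < Aj[0] and Aj[-1] < Ai[-1]:
                return False
    return True

def surviving_factors(kappa):
    """Indices j such that omega(t_j) remains in W(kappa) omega(t_1)...omega(t_n)."""
    return sorted(min(A) for A, m in kappa if m == +1)

# The example from the text with n = 8.
example = [({1, 2}, +1), ({3, 4, 8}, +1), ({5, 6, 7}, -1)]
```

For `example`, the test succeeds and the surviving indices are 1 and 3, matching the factors ω(t1)ω(t3) in the displayed formula. Changing the mark of {5, 6, 7} to +1 would violate the last condition, since that block lies within {3, 4, 8}.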



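Proof of Theorem 2.1. For any f ∈ D, denote by $\langle f, \omega\rangle\,\cdot$ the operator of left multiplication by $\langle f, \omega\rangle$ acting on CP. Clearly, under I , $\langle f, \omega\rangle\,\cdot$ goes over into the operator $\langle f, \omega\rangle$ acting on Ffin (D). Now, for any f 1 , . . . , f n ∈ D, n ≥ 2,

\[
\langle f_1, \omega\rangle\, f_2 \otimes \cdots \otimes f_n = f_1 \otimes \cdots \otimes f_n + (\lambda f_1 f_2) \otimes f_3 \otimes \cdots \otimes f_n + (f_1, f_2)_{\mathcal H}\, f_3 \otimes \cdots \otimes f_n.
\]

Therefore, applying I −1 to the above equality, we get

\[
P(f_1 \otimes \cdots \otimes f_n) = \langle f_1, \omega\rangle\, P(f_2 \otimes \cdots \otimes f_n) - P((\lambda f_1 f_2) \otimes f_3 \otimes \cdots \otimes f_n) - (f_1, f_2)_{\mathcal H}\, P(f_3 \otimes \cdots \otimes f_n).
\tag{2.13}
\]

Let $\mathcal D^{(n)}_{\mathrm{alg}}$ denote the subset of D(n) consisting of finite sums of functions of the form f 1 ⊗ · · · ⊗ f n with f 1 , . . . , f n ∈ D. Then, it follows by induction from (2.13) that, for each $f^{(n)} \in \mathcal D^{(n)}_{\mathrm{alg}}$,

\[
P(f^{(n)}) = \sum_{\kappa \in \mathrm{Int}(n, \pm 1)} c(\kappa) \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\, W(\kappa)\,\omega(t_1) \cdots \omega(t_n),
\tag{2.14}
\]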

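where c(κ) ∈ R (compare with [20, Sect. 4] and [3, Sect. 3]).

Now, let us fix an arbitrary f (n) ∈ D(n) . Choose a sequence $\{f_k^{(n)}\}_{k=1}^{\infty} \subset \mathcal D^{(n)}_{\mathrm{alg}}$ such that the set $\bigcup_{k=1}^{\infty} \operatorname{supp} f_k^{(n)}$ is in B0 (T ), the functions $f_k^{(n)}$ are uniformly bounded, and $f_k^{(n)} \to f^{(n)}$ point-wise as k → ∞. Hence $\langle f_k^{(n)}, \omega^{\otimes n}\rangle \to \langle f^{(n)}, \omega^{\otimes n}\rangle$ in L 2 (τ ), which implies that $P(f_k^{(n)}) \to P(f^{(n)})$ in L 2 (τ ). On the other hand, for each κ ∈ G n ,

\[
\int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f_k^{(n)}(t_1, \dots, t_n)\, W(\kappa)\,\omega(t_1) \cdots \omega(t_n)
\to \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\, W(\kappa)\,\omega(t_1) \cdots \omega(t_n)
\]

in L 2 (τ ) as k → ∞. This implies that (2.14) holds for each f (n) ∈ D(n) , and therefore P( f (n) ) ∈ CP.

For each n ∈ N, we define the (free) Wick product of ω(t1 ), . . . , ω(tn ), denoted by :ω(t1 ) · · · ω(tn ):, as follows: first we formally evaluate the product

\[
\omega(t_1) \cdots \omega(t_n) = \bigl(\partial_{t_1}^\dagger + \partial_{t_1} + \lambda(t_1)\partial_{t_1}^\dagger \partial_{t_1}\bigr) \cdots \bigl(\partial_{t_n}^\dagger + \partial_{t_n} + \lambda(t_n)\partial_{t_n}^\dagger \partial_{t_n}\bigr),
\]

and then remove all the terms containing $\partial_{t_i} \partial_{t_{i+1}}^\dagger$ for some i ∈ {1, . . . , n − 1}. We clearly have the following recursive formula:

\[
\begin{aligned}
{:}\omega(t_1){:} &= \omega(t_1),\\
{:}\omega(t_1) \cdots \omega(t_n){:} &= \partial_{t_1}^\dagger\, {:}\omega(t_2) \cdots \omega(t_n){:} + \lambda(t_1)\,\partial_{t_1}^\dagger \partial_{t_1} \partial_{t_2} \cdots \partial_{t_n} + \partial_{t_1} \partial_{t_2} \cdots \partial_{t_n}, \qquad n \ge 2.
\end{aligned}
\tag{2.15}
\]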



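Furthermore, as easily seen,

\[
{:}\omega(t_1) \cdots \omega(t_n){:} = \partial_{t_1}^\dagger \partial_{t_2}^\dagger \cdots \partial_{t_n}^\dagger
+ \sum_{i=1}^{n} \bigl(\partial_{t_1}^\dagger \cdots \partial_{t_{i-1}}^\dagger\, \partial_{t_i} \cdots \partial_{t_n}
+ \partial_{t_1}^\dagger \cdots \partial_{t_{i-1}}^\dagger\, \lambda(t_i)\,\partial_{t_i}^\dagger \partial_{t_i} \partial_{t_{i+1}} \cdots \partial_{t_n}\bigr).
\tag{2.16}
\]

Theorem 2.2. For each f (n) ∈ D(n) , n ∈ N,

\[
P(f^{(n)}) = \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\; {:}\omega(t_1) \cdots \omega(t_n){:}.
\tag{2.17}
\]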

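Proof. Analogously to the proof of Theorem 2.1, it suffices to prove formula (2.17) in the case f (n) = f 1 ⊗ · · · ⊗ f n with f 1 , . . . , f n ∈ D. Using (2.3) and (2.15), we have:

\[
\begin{aligned}
\omega(t_1)\, {:}\omega(t_2) \cdots \omega(t_n){:}
&= \partial_{t_1}^\dagger\, {:}\omega(t_2) \cdots \omega(t_n){:}
+ \bigl(\lambda(t_1)\partial_{t_1}^\dagger \partial_{t_1} + \partial_{t_1}\bigr)\bigl(\partial_{t_2}^\dagger\, {:}\omega(t_3) \cdots \omega(t_n){:}
+ \lambda(t_2)\partial_{t_2}^\dagger \partial_{t_2} \partial_{t_3} \cdots \partial_{t_n} + \partial_{t_2} \partial_{t_3} \cdots \partial_{t_n}\bigr)\\
&= \partial_{t_1}^\dagger\, {:}\omega(t_2) \cdots \omega(t_n){:}
+ \lambda(t_1)\delta(t_1, t_2)\,\partial_{t_2}^\dagger\, {:}\omega(t_3) \cdots \omega(t_n){:}\\
&\quad + \lambda(t_1)\delta(t_1, t_2)\,\lambda(t_2)\partial_{t_2}^\dagger \partial_{t_2} \partial_{t_3} \cdots \partial_{t_n}
+ \lambda(t_1)\partial_{t_1}^\dagger \partial_{t_1} \partial_{t_2} \cdots \partial_{t_n}\\
&\quad + \delta(t_1, t_2)\, {:}\omega(t_3) \cdots \omega(t_n){:}
+ \lambda(t_1)\delta(t_1, t_2)\,\partial_{t_2} \partial_{t_3} \cdots \partial_{t_n} + \partial_{t_1} \partial_{t_2} \cdots \partial_{t_n}\\
&= {:}\omega(t_1) \cdots \omega(t_n){:} + \lambda(t_1)\delta(t_1, t_2)\, {:}\omega(t_2) \cdots \omega(t_n){:} + \delta(t_1, t_2)\, {:}\omega(t_3) \cdots \omega(t_n){:},
\end{aligned}
\tag{2.18}
\]

the calculations taking rigorous meaning after smearing out with the f (n) as above. By virtue of (2.13), we see that (2.18) implies the statement of the theorem.

Taking Theorem 2.2 into account, for each f (n) ∈ D(n) we will write $\langle f^{(n)}, {:}\omega^{\otimes n}{:}\rangle$ for P( f (n) ). More generally, for each $f^{(n)} \in \mathcal H^{\otimes n}$, we will denote by $\langle f^{(n)}, {:}\omega^{\otimes n}{:}\rangle$ the element of L 2 (τ ) defined as I −1 f (n) . Thus, each element F ∈ L 2 (τ ) admits a unique representation

\[
F = \sum_{n=0}^{\infty} \langle f^{(n)}, {:}\omega^{\otimes n}{:}\rangle,
\]

where $f = (f^{(n)}) \in \mathcal F(\mathcal H)$.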

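Remark 2.2. With each F ∈ L 2 (τ ) one can associate a Hermitian (i.e., densely defined and symmetric, possibly unbounded) operator in F(H) with domain Ffin (D). Indeed, let us fix arbitrary f (n) ∈ D(n) and g (m) ∈ D(m) . By virtue of (2.16),

\[
\langle f^{(n)}, {:}\omega^{\otimes n}{:}\rangle\, g^{(m)} = L^{(1)}(f^{(n)})\,g^{(m)} + \sum_{k=1}^{n \wedge m} \bigl(L_k^{(2)}(f^{(n)}) + L_k^{(3)}(f^{(n)})\bigr)\, g^{(m)},
\]

where

\[
\begin{aligned}
L^{(1)}(f^{(n)}) &= \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\,\partial_{t_1}^\dagger \cdots \partial_{t_n}^\dagger,\\
L_k^{(2)}(f^{(n)}) &= \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\,\partial_{t_1}^\dagger \cdots \partial_{t_{n-k}}^\dagger\, \partial_{t_{n-k+1}} \cdots \partial_{t_n},\\
L_k^{(3)}(f^{(n)}) &= \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\,\partial_{t_1}^\dagger \cdots \partial_{t_{n-k}}^\dagger\, \lambda(t_{n-k+1})\,\partial_{t_{n-k+1}}^\dagger \partial_{t_{n-k+1}} \partial_{t_{n-k+2}} \cdots \partial_{t_n}.
\end{aligned}
\tag{2.19}
\]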



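Note that

\[
L^{(1)}(f^{(n)})\,g^{(m)} \in \mathcal H^{\otimes(n+m)}, \qquad
L_k^{(2)}(f^{(n)})\,g^{(m)} \in \mathcal H^{\otimes(n+m-2k)}, \qquad
L_k^{(3)}(f^{(n)})\,g^{(m)} \in \mathcal H^{\otimes(n+m-2k+1)}.
\tag{2.20}
\]

Using (2.19) and the Cauchy–Schwarz inequality, we conclude that the vectors in (2.20) are well-defined for each $f^{(n)} \in \mathcal H^{\otimes n}$ (independently of the choice of a version of f (n) ), and the F(H)-norm of each such vector is bounded by $C\,\|f^{(n)}\|_{\mathcal H^{\otimes n}}$, where the constant C > 0 depends only on g (m) and is independent of n. Therefore, for each $F = \sum_{n=0}^{\infty} \langle f^{(n)}, {:}\omega^{\otimes n}{:}\rangle \in L^2(\tau)$, we set

\[
F g^{(m)} := \sum_{n=0}^{\infty} L^{(1)}(f^{(n)})\,g^{(m)} + \sum_{k=1}^{m} \sum_{n=k}^{\infty} \bigl(L_k^{(2)}(f^{(n)}) + L_k^{(3)}(f^{(n)})\bigr)\, g^{(m)},
\]

which is a vector in F(H). Indeed, by (2.20),

\[
\biggl\| \sum_{n=0}^{\infty} L^{(1)}(f^{(n)})\,g^{(m)} \biggr\|^2_{\mathcal F(\mathcal H)}
= \sum_{n=0}^{\infty} \bigl\| L^{(1)}(f^{(n)})\,g^{(m)} \bigr\|^2_{\mathcal F(\mathcal H)}
\le C^2 \sum_{n=0}^{\infty} \|f^{(n)}\|^2_{\mathcal F(\mathcal H)} < \infty,
\]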

and analogously we deal with the other sums. Extending F by linearity to the whole Ffin (D), we thus get a Hermitian operator in F(H) with domain Ffin (D).

The following theorem gives a rule of representation of a monomial through a sum of orthogonal polynomials.

Theorem 2.3 (Wick rule for a product of free noises). For each n ∈ N, we have:

\[
\omega(t_1) \cdots \omega(t_n) = \sum_{\kappa \in G_n} {:}W(\kappa)\,\omega(t_1) \cdots \omega(t_n){:},
\tag{2.21}
\]

the formula making sense after smearing out with a function f (n) ∈ D(n) .

Proof. We prove (2.21) by induction. Formula (2.21) trivially holds for n = 1. Assume that it also holds for n − 1, n ≥ 2. Then

\[
\omega(t_1) \cdots \omega(t_n) = \sum_{\kappa \in G_{n-1}} \omega(t_1)\, {:}W(\kappa)\,\omega(t_2) \cdots \omega(t_n){:}
= \sum_{\kappa \in G_{n-1}} \sum_{i=1}^{3} {:}W(\kappa^{(i)})\,\omega(t_1) \cdots \omega(t_n){:}.
\]

Here, κ (i) , i = 1, 2, 3, are the elements of G n that are obtained by first taking the marked partition κ of {2, 3, . . . , n}, and then: for i = 1, by adding {1} as a singleton element with mark +1; for i = 2, by adding 1 to the first (from the left-hand side) element of κ which has mark +1 (if there is no such element, then this term is zero); and for i = 3, by adding 1 to the first element of κ that has mark +1 and changing the mark to −1 (again this term becomes zero if no element of κ has mark +1). From here the statement of the theorem follows.


Remark 2.3. For each f (n) ∈ B0 (T n ),

\[
\sum_{\kappa \in G_n} \int_{T^n} \sigma(dt_1) \cdots \sigma(dt_n)\, f^{(n)}(t_1, \dots, t_n)\; {:}W(\kappa)\,\omega(t_1) \cdots \omega(t_n){:}
\tag{2.22}
\]

is clearly an element of L 2 (τ ), and it follows from Theorem 2.3 and Remark 2.2 that (2.22) is associated with the operator $\langle f^{(n)}, \omega^{\otimes n}\rangle$ (first on Ffin (D), and then it is extended by continuity to the whole F(H)). Thus, we get the inclusion of P into L 2 (τ ).

The following theorem generalizes Theorem 2.3.

Theorem 2.4 (Wick rule for a product of normal products of free noises). For any k1 , . . . , kl ∈ N, l ∈ N, we have

\[
{:}\omega(t_1) \cdots \omega(t_{k_1}){:}\; {:}\omega(t_{k_1+1}) \cdots \omega(t_{k_1+k_2}){:} \cdots {:}\omega(t_{k_1+k_2+\cdots+k_{l-1}+1}) \cdots \omega(t_n){:}
= \sum {:}W(\kappa)\,\omega(t_1)\omega(t_2) \cdots \omega(t_n){:},
\tag{2.23}
\]

where n := k1 + k2 + · · · + kl and the summation in (2.23) is over all κ ∈ G n such that each element of the induced partition π(κ) of {1, . . . , n} contains at most one element of each of the sets {1, . . . , k1 }, {k1 + 1, . . . , k1 + k2 }, …, {k1 + k2 + · · · + kl−1 + 1, . . . , n}. Formula (2.23) makes sense after smearing out with a function f (n) ∈ D(n) .

Proof. Analogously to the proof of Theorem 2.3, it suffices to show that, for any k1 , k2 ∈ N,

\[
\begin{aligned}
{:}\omega(t_1) \cdots \omega(t_{k_1}){:}\; {:}\omega(t_{k_1+1}) \cdots \omega(t_{k_1+k_2}){:}
&= {:}\omega(t_1) \cdots \omega(t_{k_1})\,\omega(t_{k_1+1}) \cdots \omega(t_{k_1+k_2}){:}\\
&\quad + {:}\omega(t_1) \cdots \omega(t_{k_1-1})\,\delta(t_{k_1}, t_{k_1+1})\,\omega(t_{k_1+2}) \cdots \omega(t_{k_1+k_2}){:}\\
&\quad + {:}\omega(t_1) \cdots \omega(t_{k_1-1})\,\lambda(t_{k_1})\,\delta(t_{k_1}, t_{k_1+1})\,\omega(t_{k_1+1}) \cdots \omega(t_{k_1+k_2}){:}.
\end{aligned}
\tag{2.24}
\]

To show (2.24), represent $\,{:}\omega(t_1) \cdots \omega(t_{k_1}){:}\,$ in the form (2.16) and represent $\,{:}\omega(t_{k_1+1}) \cdots \omega(t_{k_1+k_2}){:}\,$ in the form (2.15), then use the free commutation relation (2.3) whenever $\partial_{t_{k_1}} \partial_{t_{k_1+1}}^\dagger$ enters, and finally collect the terms in order to get the right-hand side of (2.24). We leave these long, but quite simple calculations to the interested reader.

3. Non-Commutative Generalized Stochastic Processes with Freely Independent Values

Let the space T and the measure σ be as in Sect. 2. For each t ∈ T , let µ(t, ·) be a probability measure on (R, B(R)) with compact support. We will assume that, for each A ∈ B(R), the mapping $T \ni t \mapsto \mu(t, A)$ is measurable, and for each Δ ∈ B0 (T ) there exists R = R(Δ) > 0 such that, for all t ∈ Δ, the measure µ(t, ·) has support in [−R, R]. We denote T˜ := T × R, and define a measure σ˜ (dt, ds) := σ (dt)µ(t, ds) on (T˜ , B(T˜ )). Clearly,

\[
\tilde\sigma(\Delta \times \mathbb R) < \infty \quad \text{for all } \Delta \in B_0(T).
\tag{3.1}
\]

We denote H := L 2 (T˜ , σ˜ ). Let λ ∈ C(T˜ ) be chosen as λ(t, s) := s. Let L 2 (τ ) be the Hilbert space as in Sect. 2 which corresponds to T˜ , σ˜ , and λ. By Proposition 2.3, we have a unitary operator I : L 2 (τ ) → F(H).



Remark 3.1. In view of (3.1), we will call a subset of T˜ bounded if it is a subset of a set Δ × R, where Δ ∈ B0 (T ). We then define B0 (T˜ ) and C0 (T˜ ) as the set of all bounded measurable functions on T˜ with bounded support, and the set of all bounded continuous functions on T˜ with bounded support, respectively. All the respective definitions and results of Sect. 2 evidently remain true for these spaces.

For each f : T → R and g : R → R, we denote by f ⊗ g the function on T˜ given by ( f ⊗ g)(t, s) := f (t)g(s). If f ∈ D = C0 (T ) and g is continuous, then f ⊗ g is continuous, has bounded support, but is not necessarily bounded. Still we will identify this function with any f ⊗ g¯ ∈ C0 (T˜ ), where g¯ : R → R is continuous, bounded, and coincides with g on [−R, R]. Here R = R(supp f ) > 0, i.e., R is chosen so that, for each t from the support of f , µ(t, ·) has support in [−R, R]. We will analogously proceed in the case where f ∈ B0 (T ). Now, for each f ∈ B0 (T ) we define X ( f ) as the element of L 2 (τ ) given by

\[
X(f) := x(f \otimes 1) = a^+(f \otimes 1) + a^-(f \otimes 1) + a^0(f \otimes s).
\tag{3.2}
\]

(Here and below, if $g(s) = s^l$, l ∈ N0 , we write the function f ⊗ g as $f \otimes s^l$.) Thus,

\[
\begin{aligned}
X(f) &= \int_{\tilde T} \tilde\sigma(dt, ds)\, f(t)\,\bigl(\partial_{(t,s)}^\dagger + \partial_{(t,s)} + s\,\partial_{(t,s)}^\dagger \partial_{(t,s)}\bigr)\\
&= \int_T \sigma(dt)\, f(t) \int_{\mathbb R} \mu(t, ds)\,\bigl(\partial_{(t,s)}^\dagger + \partial_{(t,s)} + s\,\partial_{(t,s)}^\dagger \partial_{(t,s)}\bigr)\\
&= \int_T \sigma(dt)\, f(t)\,\omega(t),
\end{aligned}
\tag{3.3}
\]

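where

\[
\omega(t) := \int_{\mathbb R} \mu(t, ds)\,\bigl(\partial_{(t,s)}^\dagger + \partial_{(t,s)} + s\,\partial_{(t,s)}^\dagger \partial_{(t,s)}\bigr) = \int_{\mathbb R} \mu(t, ds)\,\varpi(t, s)
\tag{3.4}
\]

with

\[
\varpi(t, s) = \partial_{(t,s)}^\dagger + \partial_{(t,s)} + s\,\partial_{(t,s)}^\dagger \partial_{(t,s)}.
\tag{3.5}
\]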

Also for Δ ∈ B0 (T ), we set $X(\Delta) := X(\chi_\Delta)$. By Proposition 2.1, for any f 1 , . . . , f n ∈ B0 (T ) such that f i f j = 0 σ -a.e. for all 1 ≤ i < j ≤ n, X ( f 1 ), . . . , X ( f n ) are freely independent with respect to the state τ . In particular, for any mutually disjoint sets $\Delta_1, \dots, \Delta_n \in B_0(T)$, the operators $X(\Delta_1), \dots, X(\Delta_n)$ are freely independent. Hence, we may interpret ω as a non-commutative generalized stochastic process with freely independent values (compare with [15]).

Remark 3.2. Let us derive an equivalent representation of the free random field (X ( f )) f ∈B0 (T ) . For each t ∈ T , denote c(t) := µ(t, {0}), and let ν(t, ·) denote the measure on R \ {0} given by $\nu(t, ds) := \frac{1}{s^2}\,\mu(t, ds)$. Then, we define a unitary operator

\[
U : \mathcal H \to L^2(T, c(t)\sigma(dt)) \oplus L^2(T \times (\mathbb R \setminus \{0\}), \sigma(dt)\nu(t, ds)) =: \mathcal G
\]

by

\[
\mathcal H \ni f \mapsto U f := \bigl(f(t, 0),\ f(t, s)\,s\bigr) \in \mathcal G.
\]
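That U preserves norms comes down to the identity µ(t, ds) = c(t)δ_0(ds) + s²ν(t, ds): the factor s in the second component of U f absorbs the 1/s² in ν. The sketch below checks this at a single fixed t, under assumptions not taken from the paper: µ(t, ·) is a finitely supported measure given as a dictionary of atoms, and `mu`, `f`, `norm_H_sq`, `norm_G_sq` are hypothetical names. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction as Fr

# Assumed toy data at one fixed t: atoms s -> mu(t, {s}), total mass 1.
mu = {Fr(0): Fr(1, 4), Fr(-1): Fr(1, 4), Fr(2): Fr(1, 2)}

# Values f(t, s) of a test function on the atoms.
f = {Fr(0): Fr(3), Fr(-1): Fr(-2), Fr(2): Fr(1, 3)}

def norm_H_sq(f, mu):
    """Contribution of the fixed t to ||f||^2 in L^2 with respect to mu(t, ds)."""
    return sum(w * f[s] ** 2 for s, w in mu.items())

def norm_G_sq(f, mu):
    """Contribution of the fixed t to ||U f||^2 in the direct sum G."""
    c = mu.get(Fr(0), Fr(0))                                  # c(t) = mu(t, {0})
    nu = {s: w / s ** 2 for s, w in mu.items() if s != 0}     # nu = mu / s^2 off zero
    brownian_part = c * f[Fr(0)] ** 2                         # first component f(t, 0)
    jump_part = sum(nu[s] * (f[s] * s) ** 2 for s in nu)      # second component f(t, s) s
    return brownian_part + jump_part
```

Both functions return the same rational number for this data, which is the pointwise (in t) form of the isometry property of U.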



We naturally extend U to a unitary operator U : F(H) → F(G). As easily seen, for each f ∈ B0 (T ),

\[
U X(f) U^{-1} = a^+(f, 0) + a^-(f, 0) + a^+(0, f \otimes s) + a^0(0, f \otimes s) + a^-(0, f \otimes s).
\tag{3.6}
\]

In (3.6), the operator B( f ) := a + ( f, 0) + a − ( f, 0) describes the Brownian part of the process, while the operator

\[
J(f) := a^+(0, f \otimes s) + a^0(0, f \otimes s) + a^-(0, f \otimes s)
\tag{3.7}
\]

describes the “jump” part of the process. Thus, ν(t, ·) is the Lévy measure of the process at point t, and it describes the value and intensity of “jumps” (compare with e.g. [32] in the bosonic (classical) case, and with [8–10] in the free case).

Analogously to Sect. 2, we define the free cumulants $C^{(n)} : B_0(T)_{\mathbb C}^n \to \mathbb C$ through

\[
\tau\bigl(X(f_1) X(f_2) \cdots X(f_n)\bigr) = \sum_{\pi \in NC(n)}\ \prod_{A \in \pi} C(A, f_1, \dots, f_n), \qquad f_1, f_2, \dots, f_n \in B_0(T),
\]

and then we define the free cumulant transform

\[
C(f) := \sum_{n=1}^{\infty} C^{(n)}(f), \qquad f \in B_0(T)_{\mathbb C}
\]

(we have used obvious notations). By (the proof of) Proposition 2.2 and using the notations introduced in Remark 3.2, we get:

Proposition 3.1. Let f ∈ B0 (T )C be such that there exists ε ∈ (0, 1) for which $|f(t)| < \frac{1-\varepsilon}{R}$ for all t ∈ T , where R = R(supp f ) > 0, i.e., R is such that, for each t ∈ supp f , the measure µ(t, ·) has support in [−R, R]. Then

\[
\begin{aligned}
C(f) &= \int_T \sigma(dt) \int_{\mathbb R} \mu(t, ds)\, \frac{f^2(t)}{1 - s f(t)}\\
&= \int_T \sigma(dt)\, c(t) f^2(t) + \int_T \sigma(dt) \int_{\mathbb R \setminus \{0\}} \nu(t, ds)\, \frac{f^2(t)\, s^2}{1 - s f(t)}\\
&= \int_T \sigma(dt)\, c(t) f^2(t) + \int_T \sigma(dt) \int_{\mathbb R \setminus \{0\}} \nu(t, ds) \sum_{n=2}^{\infty} s^n f^n(t).
\end{aligned}
\]

Next, we have:

Proposition 3.2. The vacuum vector Ω in F(H) is cyclic for the operator family (X ( f )) f ∈D .



Proof. It can be easily shown by approximation that it suffices to prove that is cyclic for the operator family (X ( f )) f ∈B0 (T ) . We first state that the linear span of the set {χ ⊗ s n | ∈ B0 (T ), n ∈ N0 } is dense in L 2 (T˜ , σ˜ ). Indeed, let g ∈ L 2 (T˜ , σ˜ ) be orthogonal to all elements of this set, i.e., σ (dt) µ(t, ds)s n g(t, s) = 0 for all ∈ B0 (T ) and n ∈ N0 . (3.8) R

Since R µ(·, ds)s n g(·, s) ∈ L 1 (, σ ) for each ∈ B0 (T ), we conclude from (3.8) that, for σ -a.e. t ∈ T , µ(t, ds)s n g(t, s) = 0 for all n ∈ N0 . R

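But, for each $t \in T$, $\mu(t,\cdot)$ is a probability measure on $\mathbb R$ with compact support, and hence the set of all polynomials on $\mathbb R$ is dense in $L^2(\mathbb R, \mu(t,\cdot))$. Therefore, for $\sigma$-a.e. $t \in T$ and for $\mu(t,\cdot)$-a.e. $s \in \mathbb R$, $g(t,s) = 0$. Hence $g(t,s) = 0$ for $\tilde\sigma$-a.e. $(t,s) \in \tilde T$. Since the measure $\sigma$ is non-atomic, we can analogously prove the following lemma:

Lemma 3.1. For each $n \in \mathbb N$,
\[ H^{\otimes n} = \operatorname{c.l.s.}\{(\chi_{\Delta_1} \otimes s^{l_1}) \otimes (\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_n} \otimes s^{l_n}) \mid l_1,\dots,l_n \in \mathbb N_0, \text{ for each } j = 1,\dots,n-1:\ \Delta_j \cap \Delta_{j+1} = \emptyset\}. \]

Below, we denote by $M$ the set of all multi-indices of the form $(l_1,\dots,l_i) \in \mathbb N_0^i$, $i \in \mathbb N$.

Lemma 3.2. For each $n \in \mathbb N$, we define the following subsets of $F(H)$:
\begin{align*}
\mathcal R^{(n)} &:= \operatorname{c.l.s.}\{\Omega,\ X(f_1)\cdots X(f_i)\Omega \mid f_1,\dots,f_i \in B_0(T),\ i \in \{1,\dots,n\}\}, \\
\mathcal S^{(n)} &:= \operatorname{c.l.s.}\{\Omega,\ (\chi_{\Delta_1} \otimes s^{l_1}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i}) \mid (l_1,\dots,l_i) \in M,\ l_1 + \cdots + l_i + i \le n, \\
&\qquad\qquad \text{for each } j = 1,\dots,i-1:\ \Delta_j \cap \Delta_{j+1} = \emptyset\}. \tag{3.9}
\end{align*}
Then $\mathcal R^{(n)} = \mathcal S^{(n)}$.

Proof. First, we note by approximation that, for each $n \in \mathbb N$,
\[ \mathcal S^{(n)} = \operatorname{c.l.s.}\{\Omega,\ f^{(i)}(t_1,\dots,t_i)\, s_1^{l_1} \cdots s_i^{l_i} \mid f^{(i)} \in B_0(T^i),\ (l_1,\dots,l_i) \in M,\ l_1 + \cdots + l_i + i \le n\} \tag{3.10} \]
(we are using obvious notation for elements of $F(H)$). From (3.2) and (3.10), the inclusion $\mathcal R^{(n)} \subset \mathcal S^{(n)}$ follows by induction. Next, let us prove that $\mathcal S^{(n)} \subset \mathcal R^{(n)}$. For $n = 1$, this is trivially true. Assume now that this is true for $n \in \{1,\dots,N\}$, and let us show it for $n = N+1$. Thus, we have to show that, for any $\Delta_1,\dots,\Delta_i \in B_0(T)$ such that $\Delta_j \cap \Delta_{j+1} = \emptyset$ for all $j = 1,\dots,i-1$, and any $(l_1,\dots,l_i) \in M$ such that $l_1 + \cdots + l_i + i = N+1$,
\[ (\chi_{\Delta_1} \otimes s^{l_1}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i}) \in \mathcal R^{(N+1)}. \tag{3.11} \]
If $l_1 = 0$, then
\begin{align*}
(\chi_{\Delta_1} \otimes 1) \otimes (\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i})
&= a^+(\chi_{\Delta_1} \otimes 1)\big((\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i})\big) \\
&= X(\Delta_1)\big((\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i})\big).
\end{align*}
Hence, in the case $l_1 = 0$, (3.11) holds. Now, for $l_1 \ge 1$, we have:
\begin{align*}
(\chi_{\Delta_1} \otimes s^{l_1}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i})
&= a^0(\chi_{\Delta_1} \otimes s)\big((\chi_{\Delta_1} \otimes s^{l_1-1}) \otimes (\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i})\big) \\
&= X(\Delta_1)\big((\chi_{\Delta_1} \otimes s^{l_1-1}) \otimes (\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i})\big) \\
&\quad - (\chi_{\Delta_1} \otimes 1) \otimes (\chi_{\Delta_1} \otimes s^{l_1-1}) \otimes (\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i}) \\
&\quad - \int_{\tilde T} \tilde\sigma(dt,ds)\, \chi_{\Delta_1}(t)\, s^{l_1-1}\, (\chi_{\Delta_2} \otimes s^{l_2}) \otimes \cdots \otimes (\chi_{\Delta_i} \otimes s^{l_i}).
\end{align*}
By the results proved above and the induction hypothesis, we therefore conclude that (3.11) holds for $l_1 \ge 1$.

From Lemmas 3.1 and 3.2 the proposition follows.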

For each f (n) ∈ B0 (T n ), we define a monomial of ω by σ (dt1 ) · · · σ (dtn ) f (n) (t1 , . . . , tn )ω1 (t1 ) · · · ωn (tn ) f (n) , ω⊗n : = Tn = σ˜ (dt1 , ds1 ) · · · σ˜ (dtn , dsn ) f (n) (t1 , . . . , tn ) (t1 , s1 ) · · · (tn , sn ) T˜ n

(recall (3.3)–(3.5)). We clearly have, for f (n) = f 1 ⊗ · · · ⊗ f n with f 1 , . . . , f n ∈ B0 (T ): f 1 ⊗ · · · ⊗ f n , ω⊗n = f 1 , ω · · · f n , ω = X ( f 1 ) · · · X ( f n ).

(3.12)

With some abuse of notation, we will denote by $\mathcal P$ and $\mathcal{CP}$ the set of all polynomials in $\omega$ with kernels $f^{(n)} \in B_0(T^n)$ and $f^{(n)} \in D^{(n)}$, respectively. (Note that below we will not use polynomials in the variable, so keeping the same notations as in Sect. 2 for rather different objects should not lead to a contradiction, and will be justified below.) From Proposition 3.2, we now conclude:

Proposition 3.3. The set $\mathcal{CP}$ is dense in $L^2(\tau)$.

Let $\mathcal{CP}^{(n)}$ denote the subset of $\mathcal{CP}$ consisting of all continuous polynomials in $\omega$ of order $\le n$. Let $\mathcal{MP}^{(n)}$ denote the closure of $\mathcal{CP}^{(n)}$ in $L^2(\tau)$. Let $\mathcal{OP}^{(n)} := \mathcal{MP}^{(n)} \ominus \mathcal{MP}^{(n-1)}$, $n \in \mathbb N$, $\mathcal{OP}^{(0)} := \mathbb R$. Thus, we get:

Theorem 3.1. We have the following orthogonal decomposition of $L^2(\tau)$:
\[ L^2(\tau) = \bigoplus_{n=0}^{\infty} \mathcal{OP}^{(n)}. \]

Let us recall that, in the case of a classical Lévy process, Nualart and Schoutens [31] derived an orthogonal decomposition of any square-integrable functional of the process in multiple stochastic integrals with respect to orthogonalized power jump processes (see also [27] and [4] for extensions of this result). Our next aim is to derive a free counterpart of [27,31].

Fix any $t \in T$. Denote by $(p^{(n)}(t,\cdot))_{n\ge 0}$ the system of monic polynomials on $\mathbb R$ which are orthogonal with respect to $\mu(t,\cdot)$. If the support of $\mu(t,\cdot)$ is an infinite set, then by Favard’s theorem, the following recursive formula holds:
\begin{align*}
s\,p^{(0)}(t,s) &= p^{(1)}(t,s) + b^{(0)}(t), \\
s\,p^{(n)}(t,s) &= p^{(n+1)}(t,s) + b^{(n)}(t)\,p^{(n)}(t,s) + a^{(n)}(t)\,p^{(n-1)}(t,s), \qquad n \in \mathbb N, \tag{3.13}
\end{align*}
where $p^{(0)}(t,s) = 1$, $a^{(n)}(t) > 0$ for $n \in \mathbb N$, and $b^{(n)}(t) \in \mathbb R$ for $n \in \mathbb N_0$. If, however, the support of $\mu(t,\cdot)$ is a finite set consisting of $N$ points ($N \in \mathbb N$), then we have a finite system of monic orthogonal polynomials $(p^{(n)}(t,\cdot))_{n=0}^{N-1}$ satisfying (3.13) for $n \le N-2$, and, for $n = N-1$, we have:
\[ s\,p^{(N-1)}(t,s) = b^{(N-1)}(t)\,p^{(N-1)}(t,s) + a^{(N-1)}(t)\,p^{(N-2)}(t,s). \]
For technical reasons, we set, in this case,
\[ p^{(n)}(t,s) := 0, \qquad a^{(n)}(t) := 0, \qquad n \ge N, \]
($b^{(n)}(t)$, $n \ge N$, being arbitrary), so that the recursive relation (3.13) now always holds. For each $l \in \mathbb N_0$, we denote
\[ g^{(l)}(t) := \int_{\mathbb R} \mu(t,ds)\, |p^{(l)}(t,s)|^2, \qquad t \in T, \tag{3.14} \]
and then we define a measure on $(T, B(T))$ by
\[ \sigma^{(l)}(dt) := g^{(l)}(t)\,\sigma(dt). \tag{3.15} \]
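The recursion (3.13) and the convention for finitely supported measures can be illustrated numerically. The following is only a minimal sketch, assuming a fixed point $t$ and a measure $\mu(t,\cdot)$ given by finitely many atoms; the function name `monic_orthogonal` and the three-point test measure in the usage note are hypothetical and not taken from the paper. It runs the standard Stieltjes procedure, in which $b^{(n)} = \langle s\,p^{(n)}, p^{(n)}\rangle / \|p^{(n)}\|^2$ and $a^{(n)} = \|p^{(n)}\|^2 / \|p^{(n-1)}\|^2$.

```python
def monic_orthogonal(nodes, weights, n_max):
    """Monic orthogonal polynomials of a finitely supported probability measure.

    Returns (polys, a, b), where polys[n][k] is p^(n) evaluated at nodes[k], and
    a[n], b[n] are the coefficients of the three-term recursion
        s p^(n) = p^(n+1) + b[n] p^(n) + a[n] p^(n-1)
    (a[0] is an unused placeholder set to 0).
    """
    def inner(u, v):
        # L^2 inner product with respect to the discrete measure
        return sum(w * x * y for w, x, y in zip(weights, u, v))

    p_prev = [0.0] * len(nodes)   # p^(-1) := 0
    p_cur = [1.0] * len(nodes)    # p^(0)  := 1
    polys, a, b = [p_cur], [], []
    norm_prev = 1.0
    for n in range(n_max):
        norm_cur = inner(p_cur, p_cur)
        if norm_cur < 1e-12:
            # p^(n) vanishes on the support: follow the convention
            # p^(n) := 0, a^(n) := 0 (b^(n) is arbitrary, here 0)
            a_n, b_n = 0.0, 0.0
            p_next = [0.0] * len(nodes)
        else:
            b_n = inner([s * p for s, p in zip(nodes, p_cur)], p_cur) / norm_cur
            a_n = norm_cur / norm_prev if n >= 1 else 0.0
            p_next = [(s - b_n) * pc - a_n * pp
                      for s, pc, pp in zip(nodes, p_cur, p_prev)]
            norm_prev = norm_cur
        a.append(a_n)
        b.append(b_n)
        polys.append(p_next)
        p_prev, p_cur = p_cur, p_next
    return polys, a, b
```

For a measure with three atoms in $[-R, R]$, one would expect $p^{(3)}$ to vanish on the support and the coefficients to obey $|b^{(n)}| \le R$ and $a^{(n)} \le R^2$, in line with the bounds quoted from the theory of Jacobi matrices.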

Note that σ (0) = σ . For each (l1 , . . . , li ) ∈ M, we define H(l1 ,...,li ) := L 2 (T i , σ (l1 ) ⊗ · · · ⊗ σ (li ) ).

(3.16)

Then, clearly, the following mapping is an isometry: H(l1 ,...,li ) f (i) → K (l1 ,...,li ) f (i) = (K (l1 ,...,li ) f (i) )(t1 , s1 , . . . , ti , si ) := f (i) (t1 , . . . , ti ) p (l1 ) (t1 , s1 ) · · · p (li ) (ti , si ) ∈ H⊗i . We denote by H(l1 ,...,li ) the range of the isometry K (l1 ,...,li ) .

(3.17)


Lemma 3.3. We have

F(H ) = R ⊕

H(l1 ,...,li ) .


(3.18)

(l1 ,...,li )∈M

Furthermore, for each (l1 , . . . , li ) ∈ M, we have: H(l1 ,...,li ) = c. l. s.{(χ1 × p (l1 ) ) ⊗ · · · ⊗ (χi × p (li ) ) | 1 , . . . , i ∈ B0 (T ), for all j = 1, . . . , i − 1: j ∩ j+1 = ∅}. (3.19) Here, (χ × p (l) )(t, s) := χ (t) p (l) (t, s). Proof. Fix any f (i) ∈ B0 (T i ). Let ∈ B0 (T ) be such that the support of f (i) is a subset of i . Choose R = R() > 0 such that, for each t ∈ , µ(t, ·) has support in [−R, R]. Recall the recursive formula (3.13). We have, for each t ∈ , |b(n) (t)| ≤ R and a (n) (t) ≤ R 2 , which easily follows from the theory of Jacobi matrices (see e.g. [11]). Therefore, by (3.13), each p (n) (t, s) is bounded as a function of (t, s) ∈ × [−R, R]. Therefore, for each f (i) ∈ B0 (T i ), f (i) (t1 , . . . , ti ) p (l1 ) (t1 , s1 ) · · · p (li ) (ti , si ) ∈ H(l1 ,...,li ) . From here equality (3.19) easily follows (recall that the measure σ is non-atomic, which allows us to choose only those sets 1 , . . . , i in (3.19) for which j ∩ j+1 = ∅ for j = 1, . . . , i − 1). Formula (3.18) can now be proven analogously to the proof of Lemma 3.1. Recall that by Proposition 2.3, we have a unitary operator I : L 2 (τ ) → F(H). For each (l1 , . . . , li ) ∈ M, denote H(l1 ,...,li ) . := I −1 H(l1 ,...,li ) . For any ∈ B0 (T ) and l ∈ N0 , denote X (l) () := σ˜ (dt, ds)χ (t) p (l) (t, s) (t, s). T˜

For arbitrary (l1 , . . . , li ) ∈ M and 1 , . . . , i ∈ B0 (T ) such that j ∩ j+1 = ∅ for j = 1, . . . , i − 1, we clearly have: X (l1 ) (1 ) · · · X (li ) (i ) = (χ1 × p (l1 ) ) ⊗ · · · ⊗ (χi × p (li ) ). Therefore, by (3.19), H(l1 ,...,li ) = c. l. s.{X (l1 ) (1 ) · · · X (li ) (i ) | 1 , . . . , i ∈ B0 (T ), for all j = 1, . . . , i − 1: j ∩ j+1 = ∅}. For each f (l1 ,...,li ) ∈ H(l1 ,...,li ) (recall (3.16)), we can easily define a non-commutative multiple stochastic integral f (l1 ,...,li ) (t1 , . . . , ti )X (l1 ) (dt1 ) · · · X (li ) (dti ) (3.20) Ti

as an element of H(l1 ,...,li ) . Indeed, for each f (l1 ,...,li ) of the form f (l1 ,...,li ) (t1 , . . . , ti ) = χ1 (t1 ) · · · χi (ti )



with 1 , . . . , i ∈ B0 (T ) such that j ∩ j+1 = ∅, j = 1, . . . , i − 1, we define (3.20) as X (l1 ) (1 ) · · · X (li ) (i ). We then extend this definition by linearity to the linear span of such functions, and finally we extend it by continuity to obtain a unitary operator H(l1 ,...,li ) f (l1 ,...,li ) → f (l1 ,...,li ) (t1 , . . . , ti )X (l1 ) (dt1 ) · · · X (li ) (dti ) ∈ H(l1 ,...,li ) . Ti

Taking (3.18) into account, we thus derive Theorem 3.2. Denote F := R ⊕

H(l1 ,...,li ) .

(l1 ,...,li )∈M

Then, the following unitary operator gives an orthogonal expansion of L 2 (τ ) in non-commutative multiple stochastic integrals: F F = (c, ( f (l1 ,...,li ) )(l1 ,...,li )∈M ) f (l1 ,...,li ) (t1 , . . . , ti )X (l1 ) (dt1 ) · · · X (li ) (dti ) ∈ L 2 (τ ). → J F := c1 + Ti

(l1 ,...,li )∈M

(3.21) In terms of this orthogonal expansion, we have: L 2 (τ ) = R ⊕ H(l1 ,...,li ) .

(3.22)

(l1 ,...,li )∈M

(Note that, in (3.22), R denotes the space of all operators c1, where c ∈ R.) Remark 3.3. For each l ∈ N0 and ∈ B0 (T ), define Y (l) () ∈ L 2 (τ ) by Y (l) () : = σ˜ (dt, ds)χ (t)s l (t, s) T˜ +

= a (χ ⊗ s l ) + a 0 (χ ⊗ s l+1 ) + a − (χ ⊗ s l ) (recall (3.5)). Clearly, Y (0) () = X (). Recall now the unitary operator U : F(H) → F(G) from Remark 3.2. Then, for each l ∈ N, we have: U Y (l) ()U −1 = a + (0, χ ⊗ s l+1 ) + a 0 (0, χ ⊗ s l+1 ) + a − (0, χ ⊗ s l+1 ) (compare with (3.6) and (3.7)). Hence, by analogy with the classical case (see [31]), Y (l) (·), l ∈ N0 , may be treated as “power jump processes” (recall that s describes the value of “jumps”). For any l1 , l2 ∈ N0 , l1 < l2 , and any 1 , 2 ∈ B0 (T ), τ (Y (l1 ) (1 )X (l2 ) (2 )) = (X (l2 ) (2 ), Y (l1 ) (1 ))F (H) = σ (dt) µ(t, ds) p (l2 ) (t, s)s l1 = 0. 1 ∩2

R

Therefore, X (l) (·), l ∈ N0 , may be thought of as the orthogonalized power jump processes Y (l) (·), l ∈ N0 . The following theorem describes a connection between Theorems 3.1 and 3.2.


Theorem 3.3. For each n ∈ N,

OP(n) =


H(l1 ,...,li ) .

(l1 ,...,li )∈M, l1 +···+li +i=n

Proof. We have to show that, for each n ∈ N, MP(n) = R ⊕

H(l1 ,...,li ) ,

(l1 ,...,li )∈M, l1 +···+li +i≤n

or, equivalently, I MP(n) = Z (n) , where

Z (n) := R ⊕

H(l1 ,...,li ) .

(3.23)

(l1 ,...,li )∈M, l1 +···+li +i≤n

As easily seen, I MP(n) = R(n) (see (3.9)). Hence, by Lemma 3.2 and (3.10), I MP(n) = c. l. s.{, f (i) (t1 , . . . , ti )s1l1 · · · sili | f (i) ∈ B0 (T i ), (l1 , . . . , li ) ∈ M, l1 + · · · + li + i ≤ n}.

(3.24)

Furthermore, (3.19) implies that
\[ Z^{(n)} = \operatorname{c.l.s.}\{\Omega,\ f^{(i)}(t_1,\dots,t_i)\, p^{(l_1)}(t_1,s_1) \cdots p^{(l_i)}(t_i,s_i) \mid f^{(i)} \in B_0(T^i),\ (l_1,\dots,l_i) \in M,\ l_1 + \cdots + l_i + i \le n\}. \tag{3.25} \]
It follows from the proof of Lemma 3.3 that each $p^{(l)}$ has a representation
\[ p^{(l)}(t,s) = \sum_{j=0}^{l} \alpha^{(l,j)}(t)\, s^j, \]
where the $\alpha^{(l,j)}$ are measurable functions on $T$ which are bounded on each $\Delta \in B_0(T)$. By (3.24) and (3.25), we therefore get the inclusion $Z^{(n)} \subset I\mathcal{MP}^{(n)}$.

Next, for each $t \in T$, denote by $P^{(i)}(t,\cdot)$, $i \in \mathbb N_0$, the system of normalized orthogonal polynomials in $L^2(\mathbb R, \mu(t,\cdot))$. We then have an expansion
\[ s^l = \sum_{j=0}^{l} \beta^{(l,j)}(t)\, P^{(j)}(t,s), \tag{3.26} \]
where the functions $\beta^{(l,j)}$ are measurable. For each $\Delta \in B_0(T)$ and each $t \in \Delta$, we have:
\[ \sum_{j=0}^{l} (\beta^{(l,j)}(t))^2 = \|s^l\|^2_{L^2(\mathbb R,\mu(t,ds))} = \|s^l\|^2_{L^2([-R,R],\mu(t,ds))} \le R^{2l}, \]
where $R = R(\Delta)$. Thus, the functions $\beta^{(l,j)}(\cdot)$ are locally bounded on $T$. Define
\[ Y^{(n)} := \operatorname{c.l.s.}\{\Omega,\ f^{(i)}(t_1,\dots,t_i)\, P^{(l_1)}(t_1,s_1) \cdots P^{(l_i)}(t_i,s_i) \mid f^{(i)} \in B_0(T^i),\ (l_1,\dots,l_i) \in M,\ l_1 + \cdots + l_i + i \le n\}. \]
Then, by (3.24) and (3.26), $I\mathcal{MP}^{(n)} \subset Y^{(n)}$. Set $u^{(l)}(t) := \|p^{(l)}(t,\cdot)\|^{-1}_{L^2(\mathbb R,\mu(t,\cdot))}$. (In the case where $p^{(l)}(t,\cdot) = 0$, set $u^{(l)}(t) := 0$.) To show that $I\mathcal{MP}^{(n)} \subset Z^{(n)}$, it only remains to show that, for each $f^{(i)} \in B_0(T^i)$ and each $(l_1,\dots,l_i) \in M$, $l_1 + \cdots + l_i + i \le n$, the function
\[ f^{(i)}(t_1,\dots,t_i)\, u^{(l_1)}(t_1) \cdots u^{(l_i)}(t_i)\, p^{(l_1)}(t_1,s_1) \cdots p^{(l_i)}(t_i,s_i) \]
belongs to $Z^{(n)}$. But this easily follows through approximation of
\[ f^{(i)}(t_1,\dots,t_i)\, u^{(l_1)}(t_1) \cdots u^{(l_i)}(t_i) \]
by functions from $B_0(T^i)$.

Recall that we have constructed the following chain of unitary isomorphisms:
\[ F \xrightarrow{\;J\;} L^2(\tau) \xrightarrow{\;I\;} F(H) \]
(see, in particular, Theorem 3.2). Thus, $K := IJ : F \to F(H)$ is a unitary operator. Note that the restriction of $K$ to each space $H_{(l_1,\dots,l_i)}$ is $K_{(l_1,\dots,l_i)}$, see (3.17). We will preserve the notation $\Omega$ for the vector in $F$ defined as $K^{-1}\Omega$. For each $n \in \mathbb N_0$, we denote
\[ F^{(n)} := J^{-1}\mathcal{OP}^{(n)}, \]
so that $F = \bigoplus_{n=0}^{\infty} F^{(n)}$, and by Theorem 3.3, for each $n \in \mathbb N$,
\[ F^{(n)} = \bigoplus_{(l_1,\dots,l_i)\in M,\ l_1+\cdots+l_i+i=n} H_{(l_1,\dots,l_i)}. \]

For each $f \in B_0(T)$, we will preserve the notation $X(f)$ for the image of this operator under $K^{-1}$, i.e., for the equivalent realization of $X(f)$ in $F$.

Corollary 3.1. For each $f \in B_0(T)$, we have $X(f) = X^+(f) + X^0(f) + X^-(f)$, where $X^+(f) : F^{(n)} \to F^{(n+1)}$, $X^0(f) : F^{(n)} \to F^{(n)}$, and $X^-(f) : F^{(n)} \to F^{(n-1)}$. Furthermore, $X^{\pm}(f) = X_1^{\pm}(f) + X_2^{\pm}(f)$, and for each $(l_1,\dots,l_i) \in M$,
\begin{align*}
H_{(l_1,\dots,l_i)} \ni g &\mapsto (X_1^+(f)g)(t_1,\dots,t_{i+1}) := f(t_1)\,g(t_2,\dots,t_{i+1}) \in H_{(0,l_1,\dots,l_i)}, \tag{3.27} \\
H_{(l_1,\dots,l_i)} \ni g &\mapsto (X_2^+(f)g)(t_1,\dots,t_i) := f(t_1)\,g(t_1,\dots,t_i) \in H_{(l_1+1,l_2,\dots,l_i)}, \tag{3.28} \\
H_{(l_1,\dots,l_i)} \ni g &\mapsto (X_1^-(f)g)(t_1,\dots,t_{i-1}) := \delta_{l_1,0} \int_T \sigma(dt)\, f(t)\,g(t,t_1,\dots,t_{i-1}) \in H_{(l_2,\dots,l_i)}, \\
H_{(l_1,\dots,l_i)} \ni g &\mapsto (X_2^-(f)g)(t_1,\dots,t_i) := (1-\delta_{l_1,0})\,a^{(l_1)}(t_1)\,f(t_1)\,g(t_1,\dots,t_i) \in H_{(l_1-1,\dots,l_i)}, \\
H_{(l_1,\dots,l_i)} \ni g &\mapsto (X^0(f)g)(t_1,\dots,t_i) := b^{(l_1)}(t_1)\,f(t_1)\,g(t_1,\dots,t_i) \in H_{(l_1,\dots,l_i)},
\end{align*}
and $X^0(f)\Omega = X^-(f)\Omega = 0$, $X^+(f)\Omega = f \in H_{(0)}$. Here, $\delta_{l_1,0}$ is equal to 1 if $l_1 = 0$, and equal to 0 otherwise.



Proof. We fix any f ∈ B0 (T ) and g (i) ∈ H(l1 ,...,li ) . Then, by (3.13) and (3.17), we have: a + ( f ⊗ 1) + a − ( f ⊗ 1) + a 0 ( f ⊗ s) K g (i) = a + ( f ⊗ 1) + a − ( f ⊗ 1) + a 0 ( f ⊗ s) g (i) (t1 , . . . , ti ) p (l1 ) (t1 , s1 ) · · · p (li ) (ti , si ) = f (t1 )g (i) (t2 , . . . , ti+1 ) p (0) (t1 , s1 ) p (l1 ) (t2 , s2 ) · · · p (li ) (ti+1 , si+1 ) +δl1 ,0 σ (dt) f (t)g (i) (t, t1 , . . . , ti−1 ) p (l2 ) (t1 , s1 ) · · · p (li ) (ti−1 , si−1 ) T (i)

+ f (t1 )g (t1 , . . . , ti )( p (l1 +1) (t1 , s1 ) + b(l1 ) (t1 ) p (l1 ) (t1 , s1 ) + a (l1 ) (t1 ) p (l1 −1) (t1 , s1 )) × p (l2 ) (t2 , s2 ) · · · p (li ) (ti , si ). Applying the operator K −1 to the above element of F(H), we easily conclude the statement. For f (n) ∈ D(n) , let P( f (n) ) denote the orthogonal projection of f (n) , ω⊗n onto OP(n) . Remark 3.4. By Proposition 3.3, the set CP is dense in L 2 (τ ). From here it follows that the linear span of the set {P( f (n) ) | f (n) ∈ D(n) , n ∈ N0 } is also dense in L 2 (τ ). In fact, for each n ∈ N, the set {P( f (n) ) | f (n) ∈ D(n) } is dense in OP(n) . Indeed, by definition, the set CP(n) is dense in MP(n) . Therefore, the set of all projections of P ∈ CP(n) onto OP(n) is dense in OP(n) . But the projection of each P ∈ CP(n−1) onto OP(n) equals zero, from which the statement follows. Corollary 3.2. Let n ∈ N and let (l1 , . . . , li ) ∈ M, l1 + · · · + li + i = n. For each f (n) ∈ D(n) , the H(l1 ,...,li ) -coordinate of the vector J −1 P( f (n) ) in F(n) is given by f ( t1 , . . . , t1 , t2 , . . . , t2 , . . . , ti , . . . , ti ). (l1 + 1) times (l2 + 1) times

(li + 1) times

Proof. By approximation, it suffices to check the statement in the case where f (n) = f 1 ⊗ · · · ⊗ f n , f 1 , . . . , f n ∈ D. Then, by (3.12), J −1 f (n) , ω⊗n = X ( f 1 ) · · · X ( f n ). Hence, J −1 P( f (n) ) is equal to the projection of X ( f 1 ) · · · X ( f n ) onto F(n) . Therefore, by Corollary 3.1, J −1 P( f (n) ) = X + ( f 1 ) · · · X + ( f n ). The statement now follows from (3.27) and (3.28).

In view of Remark 3.4 and Corollary 3.2, we will now give an equivalent interpretation of the $F^{(n)}$ spaces. So, we fix any $n \in \mathbb N$. For each $(l_1,\dots,l_i) \in M$, $l_1 + \cdots + l_i + i = n$, we define
\begin{align*}
T^{(l_1,\dots,l_i)} := \{(t_1,\dots,t_n) \in T^n \mid\ & t_1 = t_2 = \cdots = t_{l_1+1},\ t_{l_1+2} = t_{l_1+3} = \cdots = t_{l_1+l_2+2},\ \dots, \\
& t_{l_1+l_2+\cdots+l_{i-1}+i} = t_{l_1+l_2+\cdots+l_{i-1}+i+1} = \cdots = t_n, \\
& t_{l_1+1} \ne t_{l_1+l_2+2},\ t_{l_1+l_2+2} \ne t_{l_1+l_2+l_3+3},\ \dots,\ t_{l_1+\cdots+l_{i-1}+i-1} \ne t_n\}.
\end{align*}
The sets $T^{(l_1,\dots,l_i)}$ with $(l_1,\dots,l_i) \in M$, $l_1 + \cdots + l_i + i = n$, form a set partition of $T^n$.
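The partition of $T^n$ described above is purely combinatorial: a point of $T^n$ belongs to the class indexed by the lengths of its maximal runs of equal consecutive coordinates, with the $j$-th run having length $l_j + 1$. The sketch below illustrates this on a small finite stand-in for $T$. This is an assumption made only for illustration (in the paper $\sigma$ is non-atomic, so $T$ is not finite), and the helper names `run_class` and `classes` are hypothetical.

```python
from itertools import product


def run_class(t):
    """Return the multi-index (l_1, ..., l_i) of the class containing the tuple t.

    The j-th maximal run of equal consecutive entries of t has length l_j + 1,
    so that l_1 + ... + l_i + i == len(t).
    """
    runs = []
    length = 1
    for prev, cur in zip(t, t[1:]):
        if cur == prev:
            length += 1
        else:
            runs.append(length - 1)
            length = 1
    runs.append(length - 1)
    return tuple(runs)


def classes(points, n):
    """Group all n-tuples over `points` by their run-length multi-index."""
    out = {}
    for t in product(points, repeat=n):
        out.setdefault(run_class(t), []).append(t)
    return out
```

Since every tuple is assigned exactly one multi-index, the classes are disjoint and cover the whole product set. With at least two distinct points, each composition of $n$ is realized, which gives $2^{n-1}$ classes.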



We define $B(T^{(l_1,\dots,l_i)})$ as the trace $\sigma$-algebra of $B(T^n)$ on $T^{(l_1,\dots,l_i)}$. Now, consider the measurable mapping
\[ T^{(l_1,\dots,l_i)} \ni (t_1,\dots,t_n) \mapsto (t_{l_1+1},\ t_{l_1+l_2+2},\ t_{l_1+l_2+l_3+3},\ \dots,\ t_n) \in T^i. \tag{3.29} \]
Since $\sigma$ is a non-atomic measure, the image of $T^{(l_1,\dots,l_i)}$ under the mapping (3.29) is of full $\sigma^{(l_1)} \otimes \cdots \otimes \sigma^{(l_i)}$ measure. We denote by $\gamma^{(l_1,\dots,l_i)}$ the pre-image of the measure $\sigma^{(l_1)} \otimes \cdots \otimes \sigma^{(l_i)}$ under the mapping (3.29). We then extend $\gamma^{(l_1,\dots,l_i)}$ by zero to the whole space $T^n$. Note that, for different $(l_1,\dots,l_i)$ and $(l_1',\dots,l_j')$ from $M$ for which $l_1 + \cdots + l_i + i = l_1' + \cdots + l_j' + j = n$, the measures $\gamma^{(l_1,\dots,l_i)}$ and $\gamma^{(l_1',\dots,l_j')}$ are concentrated on disjoint sets in $T^n$. We then define a measure on $(T^n, B(T^n))$ as follows:
\[ \gamma_n := \sum_{(l_1,\dots,l_i)\in M,\ l_1+\cdots+l_i+i=n} \gamma^{(l_1,\dots,l_i)}. \tag{3.30} \]

Recall that, by Remark 3.4, the set $\{J^{-1}P(f^{(n)}) \mid f^{(n)} \in D^{(n)}\}$ is dense in $F^{(n)}$, while the set $D^{(n)}$ is clearly dense in $L^2(T^n, \gamma_n)$. Therefore, by Corollary 3.2 the mapping
\[ L^2(T^n, \gamma_n) \supset D^{(n)} \ni f^{(n)} \mapsto J^{-1}P(f^{(n)}) \in F^{(n)} \]
extends to a unitary operator. In terms of this unitary isomorphism, we will, in what follows, identify $F^{(n)}$ with $L^2(T^n, \gamma_n)$, so that the space $F$ becomes
\[ F = \mathbb R \oplus \bigoplus_{n=1}^{\infty} L^2(T^n, \gamma_n). \]
By analogy with [26,27], we call $F$ a free extended Fock space. Since, for each $n \in \mathbb N$, $D^{(n)} \subset L^2(T^n, \gamma_n)$, we have an evident inclusion of $F_{\mathrm{fin}}(D)$ into $F$. Corollaries 3.1 and 3.2 can now be reformulated as the following theorem, which is the main result of this section.

Theorem 3.4. The following mapping
\[ F \supset F_{\mathrm{fin}}(D) \ni (f^{(0)}, f^{(1)}, f^{(2)}, \dots) \overset{J}{\longmapsto} \sum_{n=0}^{\infty} P(f^{(n)}) \in L^2(\tau) \tag{3.31} \]
(the sum being, in fact, finite) extends to the unitary operator $J : F \to L^2(\tau)$. In particular, for any $f^{(n)}, g^{(n)} \in D^{(n)}$, $n \in \mathbb N$,
\begin{align*}
(P(f^{(n)}), P(g^{(n)}))_{L^2(\tau)} &= (f^{(n)}, g^{(n)})_{L^2(T^n,\gamma_n)} \\
&= \sum_{(l_1,\dots,l_i)\in M,\ l_1+\cdots+l_i+i=n} \int_{T^i} (f^{(n)} g^{(n)})(\underbrace{t_1,\dots,t_1}_{l_1+1 \text{ times}},\ \dots,\ \underbrace{t_i,\dots,t_i}_{l_i+1 \text{ times}}) \\
&\qquad \times g^{(l_1)}(t_1) \cdots g^{(l_i)}(t_i)\, \sigma(dt_1) \cdots \sigma(dt_i), \tag{3.32}
\end{align*}
where the functions $g^{(l)}$ are given by (3.14).



For each $f \in D$, $X(f) = X^+(f) + X^0(f) + X^-(f)$, where $X^+(f) : F^{(n)} \to F^{(n+1)}$, $X^0(f) : F^{(n)} \to F^{(n)}$, and $X^-(f) : F^{(n)} \to F^{(n-1)}$. Furthermore, for each $n \in \mathbb N$ and each $g^{(n)} \in D^{(n)}$,
\[ (X^+(f)g^{(n)})(t_1,\dots,t_{n+1}) = f(t_1)\,g^{(n)}(t_2,\dots,t_{n+1}), \qquad (t_1,\dots,t_{n+1}) \in T^{n+1}; \tag{3.33} \]
for each $(l_1,\dots,l_i) \in M$, $l_1 + \cdots + l_i + i = n$, and each $(t_1,\dots,t_n) \in T^{(l_1,\dots,l_i)}$,
\[ (X^0(f)g^{(n)})(t_1,\dots,t_n) = b^{(l_1)}(t_1)\,f(t_1)\,g^{(n)}(t_1,\dots,t_n); \tag{3.34} \]
and for each $(l_1,\dots,l_i) \in M$, $l_1 + \cdots + l_i + i = n-1$, and each $(t_1,\dots,t_{n-1}) \in T^{(l_1,\dots,l_i)}$,
\[ (X^-(f)g^{(n)})(t_1,\dots,t_{n-1}) = \int_T \sigma(dt)\, f(t)\,g^{(n)}(t,t_1,\dots,t_{n-1}) + a^{(l_1+1)}(t_1)\,f(t_1)\,g^{(n)}(t_1,t_1,t_2,\dots,t_{n-1}) \tag{3.35} \]
(the second addend on the right-hand side of (3.35) being equal to zero for $n = 1$). Additionally, $X^+(f)\Omega = f$, $X^0(f)\Omega = 0$, $X^-(f)\Omega = 0$.

Remark 3.5. For the reader’s convenience, let us quickly summarize the constructed spaces and the established unitary isomorphisms. We first have the following commutative diagram: L 2 (τ ) = R ⊕

(l1 ,...,li )∈M

J

F=R⊕

- F (H ) = R ⊕

I

H(l1 ,...,li )

H(l1 ,...,li )

(l1 ,...,li )∈M

K H(l1 ,...,li )

(l1 ,...,li )∈M

Here, the spaces H(l1 ,...,li ) are defined by (3.16), the isomorphism I is established in Proposition 2.3, K is given through (3.17), J is given by (3.21), the spaces H(l1 ,...,li ) and H(l1 ,...,li ) are the images of H(l1 ,...,li ) under K and J , respectively. Furthermore, we have realized each space F(n) = H(l1 ,...,li ) , n ∈ N, (l1 ,...,li )∈M, l1 +···+li +i=n

as L 2 (T n , γn ), and derived the following commutative diagram: L 2 (τ ) =

∞

I

OP(n)

n=0

J F=R⊕

∞

- F(H) = R ⊕

K

∞

H(n)

n=1

L 2 (T n , γn )

n=1

where H(n) :=

H(l1 ,...,li ) , n ∈ N.

(l1 ,...,li )∈M, l1 +···+li +i=n

Formula (3.31) gives the action of J in terms of the latter diagram, while formulas (3.33)–(3.35) give the action of X ( f ) in F.



4. The Free Meixner Class

As we saw in Theorem 2.1, the free Gauss–Poisson processes have the property that, for each $f^{(n)} \in D^{(n)}$, the orthogonal polynomial $P(f^{(n)})$ is a continuous polynomial. We will now search for all the free processes as in Sect. 3 for which this property remains true. So, as in Sect. 3, we fix a free process $(X(f))_{f\in D}$, i.e., a family of bounded linear operators in the free extended Fock space $F$.

Theorem 4.1. The following statements are equivalent:
i) For each $f^{(n)} \in D^{(n)}$, $P(f^{(n)}) \in \mathcal{CP}$.
ii) For each $f \in D$, $X(f)$ maps $F_{\mathrm{fin}}(D)$ into itself.
iii) There exist $\lambda$ and $\eta$ from $C(T)$, $\eta(t) \ge 0$ for all $t \in T$, such that
\[ b^{(l)}(t) = \lambda(t), \quad t \in T,\ l \in \mathbb N_0, \qquad a^{(l)}(t) = \eta(t), \quad t \in T,\ l \in \mathbb N. \]
In this case, for each $f \in D$ and $g^{(n)} \in D^{(n)}$, $n \in \mathbb N$,
\begin{align*}
(X^+(f)g^{(n)})(t_1,\dots,t_{n+1}) &= f(t_1)\,g^{(n)}(t_2,\dots,t_{n+1}), \qquad (t_1,\dots,t_{n+1}) \in T^{n+1}, \tag{4.1} \\
(X^0(f)g^{(n)})(t_1,\dots,t_n) &= \lambda(t_1)\,f(t_1)\,g^{(n)}(t_1,t_2,\dots,t_n), \qquad (t_1,\dots,t_n) \in T^n, \tag{4.2} \\
(X^-(f)g^{(n)})(t_1,\dots,t_{n-1}) &= \int_T \sigma(dt)\, f(t)\,g^{(n)}(t,t_1,\dots,t_{n-1}) \\
&\quad + \eta(t_1)\,f(t_1)\,g^{(n)}(t_1,t_1,t_2,\dots,t_{n-1}), \qquad (t_1,\dots,t_{n-1}) \in T^{n-1} \tag{4.3}
\end{align*}
(the second addend on the right-hand side of (4.3) being equal to zero for $n = 1$).

Proof. Assume that i) holds. Hence, for any $n \in \mathbb N$, there exist linear operators $U_{i,n} : D^{(n)} \to D^{(i)}$, $i = 0, 1, \dots, n$, such that
\[ P(f^{(n)}) = \sum_{i=0}^{n} \langle U_{i,n} f^{(n)}, \omega^{\otimes i} \rangle, \qquad f^{(n)} \in D^{(n)}. \tag{4.4} \]
Applying the orthogonal projection of $L^2(\tau)$ onto $\mathcal{OP}^{(n)}$ to both sides of (4.4), we get $P(f^{(n)}) = P(U_{n,n} f^{(n)})$. Hence, $U_{n,n}$ is the identity operator, so that (4.4) becomes
\[ P(f^{(n)}) = \langle f^{(n)}, \omega^{\otimes n} \rangle + \sum_{i=0}^{n-1} \langle U_{i,n} f^{(n)}, \omega^{\otimes i} \rangle, \qquad f^{(n)} \in D^{(n)}. \tag{4.5} \]
From here it follows that, for any $n \in \mathbb N$, there exist linear operators $V_{i,n} : D^{(n)} \to D^{(i)}$, $i = 0, 1, \dots, n-1$, such that
\[ \langle f^{(n)}, \omega^{\otimes n} \rangle = P(f^{(n)}) + \sum_{i=0}^{n-1} P(V_{i,n} f^{(n)}), \qquad f^{(n)} \in D^{(n)}. \tag{4.6} \]


Indeed, for $n = 1$, (4.6) clearly holds. Assume that (4.6) holds for all $n = 1, \dots, N$, $N \in \mathbb N$. Then, by (4.5) and by (4.6) for $n \le N$, we have, for each $f^{(N+1)} \in D^{(N+1)}$,
\begin{align*}
\langle f^{(N+1)}, \omega^{\otimes(N+1)} \rangle &= P(f^{(N+1)}) - \sum_{i=0}^{N} \langle U_{i,N+1} f^{(N+1)}, \omega^{\otimes i} \rangle \\
&= P(f^{(N+1)}) - \sum_{i=0}^{N} \Big( P(U_{i,N+1} f^{(N+1)}) + \sum_{j=0}^{i-1} P(V_{j,i} U_{i,N+1} f^{(N+1)}) \Big),
\end{align*}
from which (4.6) holds for $n = N+1$.

Now, for each $f \in D$ and $g^{(n)} \in D^{(n)}$, by (4.5) and (4.6),
\begin{align*}
\langle f, \omega \rangle P(g^{(n)}) &= \langle f \otimes g^{(n)}, \omega^{\otimes(n+1)} \rangle + \sum_{i=1}^{n} \langle f \otimes (U_{i-1,n} g^{(n)}), \omega^{\otimes i} \rangle \\
&= P(f \otimes g^{(n)}) + \sum_{j=0}^{n} P(V_{j,n+1}(f \otimes g^{(n)})) \\
&\quad + \sum_{i=1}^{n} \Big( P(f \otimes (U_{i-1,n} g^{(n)})) + \sum_{k=0}^{i-1} P(V_{k,i}(f \otimes (U_{i-1,n} g^{(n)}))) \Big) \\
&= P(f \otimes g^{(n)}) + \sum_{j=0}^{n} P(Z_{j,n+1}(f, g^{(n)})), \tag{4.7}
\end{align*}
where $Z_{j,n+1}(f, g^{(n)}) \in D^{(j)}$. Thus, by (4.7), $\langle f, \omega \rangle P(g^{(n)}) \in \mathcal{CP}^{(n+1)}$, and so ii) holds. (Note that, in view of symmetry, $Z_{j,n+1}(f, g^{(n)}) = 0$ for $j \le n-2$.)

Let us now prove that ii) implies iii). For each $t \in T$, denote $\lambda(t) := b^{(0)}(t)$ and $\eta(t) := a^{(1)}(t)$. Fix any open set $O \in B_0(T)$. Let $f, g \in D$ be such that $f(t) = g(t) = 1$ for all $t \in O$. Then, by (3.34), for each $t \in O$,
\[ (X^0(f)g)(t) = \lambda(t). \tag{4.8} \]
By ii), (4.8) implies that $\lambda(t)$ depends continuously on $t \in O$. Hence, $\lambda \in C(T)$. Next, let a set $O$ and functions $f$, $g$ be as above, and assume additionally that $f \ge 0$ and $g \ge 0$ on $T$. Further, choose any $h \in D$ such that $h \ge 0$ on $T$, with $h(t) = 0$ for all $t \in O$, and $fh \not\equiv 0$. Choose $\varepsilon < 0$ such that
\[ \int_T \sigma(dt)\, f(t)\,(g(t) + \varepsilon h(t)) = 0. \]
Set $g^{(2)}(t_1,t_2) := (g(t_1) + \varepsilon h(t_1))\,g(t_2)$, $(t_1,t_2) \in T^2$. Then, by (3.35), for each $t \in O$, $(X^-(f)g^{(2)})(t) = \eta(t)$, which implies that $\eta$ is continuous on $O$. Hence, $\eta \in C(T)$.



Next, fix any $t \in T$, and let $f \in D$ and $g^{(n)} \in D^{(n)}$, $n \ge 2$, be such that $f(t) = 1$ and $g^{(n)}(t,t,\dots,t) = 1$. By (3.34), for any $(t_1,\dots,t_{n-1}) \in T^{n-1}$ such that $t \ne t_1$, $t_1 \ne t_2$, …, $t_{n-2} \ne t_{n-1}$, we have
\[ (X^0(f)g^{(n)})(t,t_1,t_2,\dots,t_{n-1}) = \lambda(t)\,g^{(n)}(t,t_1,t_2,\dots,t_{n-1}), \tag{4.9} \]
whereas
\[ (X^0(f)g^{(n)})(t,t,\dots,t) = b^{(n-1)}(t). \tag{4.10} \]
By ii),
\[ \lim_{(t_1,t_2,\dots,t_{n-1})\to(t,t,\dots,t)} (X^0(f)g^{(n)})(t,t_1,t_2,\dots,t_{n-1}) = (X^0(f)g^{(n)})(t,t,\dots,t). \]
Hence, by (4.9) and (4.10), $b^{(n-1)}(t) = \lambda(t)$. Thus, for all $t \in T$ and all $n \in \mathbb N_0$, $b^{(n)}(t) = \lambda(t)$. Completely analogously, we then also deduce from (3.35) that, for all $t \in T$ and all $n \in \mathbb N$, $a^{(n)}(t) = \eta(t)$. Formulas (4.2) and (4.3) now follow from (3.34) and (3.35), respectively. Thus, iii) holds.

Finally, we prove that iii) implies i). Analogously to (2.13), we now have, for any $f_1,\dots,f_n \in D$, $n \ge 2$:
\begin{align*}
P(f_1 \otimes \cdots \otimes f_n) &= \langle f_1, \omega \rangle P(f_2 \otimes \cdots \otimes f_n) - P((\lambda f_1 f_2) \otimes f_3 \otimes \cdots \otimes f_n) \\
&\quad - \int_T \sigma(dt)\, f_1(t) f_2(t)\, P(f_3 \otimes \cdots \otimes f_n) - P((\eta f_1 f_2 f_3) \otimes f_4 \otimes \cdots \otimes f_n)
\end{align*}
(compare with (2.13)). From here we conclude statement i) by an easy generalization of the proof of Theorem 2.1.

Remark 4.1. As is easily seen by approximation, formulas (4.1)–(4.3) remain true for any $f \in B_0(T)$ and $g^{(n)} \in F^{(n)} = L^2(T^n, \gamma_n)$.

The set of all free processes as in Theorem 4.1, iii) will be called the Meixner class of free processes. We note that, if, for $t \in T$, $\eta(t) = 0$, then the measure $\mu(t,\cdot)$ is concentrated at one point, namely $\lambda(t)$. Hence, $g^{(0)}(t) = 1$ and $g^{(l)}(t) = 0$ for all $l \in \mathbb N$ (see (3.14) and (3.15)). In particular, if $\eta(t) = 0$ for all $t \in T$, the measure $\gamma_n$ becomes $\sigma^{\otimes n}$ (see (3.30)). Thus, $F = F(H)$ and $X(f) = x(f)$, $f \in D$, where $(x(f))_{f\in D}$ is the free process as in Sect. 2, which corresponds to the function $\lambda \in C(T)$. If, however, $\eta(t) > 0$, then $\mu(t,\cdot)$ has an infinite support. Recall that $\mu(t,\cdot)$ is the measure of orthogonality of monic polynomials $(p^{(n)}(t,\cdot))_{n=0}^{\infty}$ satisfying
\[ s\,p^{(n)}(t,s) = p^{(n+1)}(t,s) + \lambda(t)\,p^{(n)}(t,s) + \eta(t)\,p^{(n-1)}(t,s), \qquad n \in \mathbb N_0, \tag{4.11} \]
where $p^{(-1)}(t,s) := 0$. Hence, $\mu(t,\cdot)$ is Wigner’s semicircle law with mean $\lambda(t)$ and variance $\eta(t)$:
\[ \mu(t,ds) = \chi_{[-2\sqrt{\eta(t)}+\lambda(t),\ 2\sqrt{\eta(t)}+\lambda(t)]}(s)\, (2\pi\eta(t))^{-1} \sqrt{4\eta(t) - (s-\lambda(t))^2}\; ds \]
(compare with [34] and [17]). By (3.14) and (4.11), we have:
\[ g^{(l)}(t) = \eta^l(t), \qquad l \in \mathbb N_0. \tag{4.12} \]
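Formula (4.12) and the semicircle density can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it fixes constant values `lam` and `eta` for $\lambda(t)$ and $\eta(t)$, takes the density to be normalized by $(2\pi\eta)^{-1}$ (the standard normalization of the semicircle law with variance $\eta$), builds the monic polynomials from the recursion (4.11), and approximates their Gram matrix with a midpoint rule. The function names are hypothetical.

```python
import math


def semicircle_density(s, lam, eta):
    """Semicircle density with mean lam and variance eta (zero off the support)."""
    r2 = 4.0 * eta - (s - lam) ** 2
    return math.sqrt(r2) / (2.0 * math.pi * eta) if r2 > 0.0 else 0.0


def monic_values(s, lam, eta, l_max):
    """Values p^(0)(s), ..., p^(l_max)(s) from the constant-coefficient recursion
    s p^(n) = p^(n+1) + lam p^(n) + eta p^(n-1), with p^(-1) = 0 and p^(0) = 1."""
    vals = [1.0]
    prev = 0.0
    for _ in range(l_max):
        nxt = (s - lam) * vals[-1] - eta * prev
        prev = vals[-1]
        vals.append(nxt)
    return vals


def gram(lam, eta, l_max, n_grid=20000):
    """Midpoint-rule approximation of G[i][j] = integral of p^(i) p^(j) d(mu)."""
    lo = lam - 2.0 * math.sqrt(eta)
    h = 4.0 * math.sqrt(eta) / n_grid
    G = [[0.0] * (l_max + 1) for _ in range(l_max + 1)]
    for k in range(n_grid):
        s = lo + (k + 0.5) * h
        w = semicircle_density(s, lam, eta) * h
        p = monic_values(s, lam, eta, l_max)
        for i in range(l_max + 1):
            for j in range(l_max + 1):
                G[i][j] += w * p[i] * p[j]
    return G
```

Up to quadrature error, `gram(lam, eta, l_max)` should be diagonal with entries $\eta^l$, which is the content of (4.12) at a fixed point $t$. The entry `G[0][0]` being close to 1 is a check of the normalization constant.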



Substituting (4.12) into (3.32), we get the explicit form of the inner product in the free extended Fock space $F$. Assume that, for some $\Delta \in B_0(T)$, the functions $\lambda(\cdot)$ and $\eta(\cdot)$ are constant on $\Delta$, i.e., $\lambda(t) = \lambda$, $\eta(t) = \eta$ for all $t \in \Delta$, where $\lambda \in \mathbb R$ and $\eta \ge 0$. Then, by (4.1)–(4.3) (see Remark 4.1), we have:
\[ X(\Delta)\chi_\Delta^{\otimes n} = \chi_\Delta^{\otimes(n+1)} + [n]_0\,\lambda\,\chi_\Delta^{\otimes n} + \big([n]_0\,\sigma(\Delta) + [n]_0[n-1]_0\,\eta\big)\chi_\Delta^{\otimes(n-1)}, \qquad n \in \mathbb N_0, \tag{4.13} \]
where $\chi_\Delta^{\otimes 0} := \Omega$. Denote $P(\chi_\Delta^{\otimes n}) := J\chi_\Delta^{\otimes n}$. Then, by (4.13), $P(\chi_\Delta^{\otimes n}) = q^{(n)}(X(\Delta))$, where $(q^{(n)})_{n=0}^{\infty}$ is the system of monic polynomials on $\mathbb R$ satisfying the recursive relation (1.6). By Favard’s theorem, $(q^{(n)})_{n=0}^{\infty}$ is a system of polynomials which are orthogonal with respect to some probability measure $\rho_{\lambda,\eta,\sigma(\Delta)}$. For an explicit form of this measure, we refer to e.g. [34].

Corollary 4.1. Let $(X(f))_{f\in B_0(T)}$ be as in Theorem 4.1 iii). Then, for each $\Delta \in B_0(T)$, there exists $r = r(\Delta) > 0$ such that, for each $f \in B_0(T)_{\mathbb C}$ satisfying $|f(t)| < r\chi_\Delta(t)$ for all $t \in T$, we have
\[ C(f) = \int_T \sigma(dt)\, 2 f^2(t) \Big[ 1 - \lambda(t) f(t) + \sqrt{(1 - \lambda(t) f(t))^2 - 4 f^2(t)\eta(t)} \Big]^{-1}. \]

Proof. The result directly follows from Proposition 3.1 and the following formula, which holds for $z \in \mathbb C$ from a neighborhood of zero:
\[ \int_{\mathbb R} \mu(t,ds)\, \frac{1}{1 - sz} = 2 \Big[ 1 - \lambda(t) z + \sqrt{(1 - \lambda(t) z)^2 - 4 z^2 \eta(t)} \Big]^{-1}, \]
see [2,34].

Recall that $F_{\mathrm{fin}}(D)$ is a dense subset of $F$. Analogously to Sect. 2, we can therefore interpret smeared, Wick ordered products of operators $\partial_t^\dagger$ and $\partial_t$ as operators in $F$.

Corollary 4.2. Let $(X(f))_{f\in B_0(T)}$ be as in Theorem 4.1 iii). Then, using the same notation as in Sect. 2, we may represent the action of each $X(f)$ in $F$ as follows:
\[ X(f) = \int_T \sigma(dt)\, f(t)\,\omega(t), \]
where
\[ \omega(t) = \partial_t^\dagger + \lambda(t)\,\partial_t^\dagger \partial_t + \partial_t + \eta(t)\,\partial_t^\dagger \partial_t \partial_t. \]

Proof. The statement directly follows from (4.1)–(4.3) if we note that, for each $g^{(n)} \in D^{(n)}$,
\[ \Big( \int_T \sigma(dt)\, f(t)\eta(t)\, \partial_t^\dagger \partial_t \partial_t\, g^{(n)} \Big)(t_1,\dots,t_{n-1}) = \eta(t_1)\,f(t_1)\,g^{(n)}(t_1,t_1,t_2,\dots,t_{n-1}). \]
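The resolvent-type identity used in the proof of Corollary 4.1 can be checked numerically for real $z$ near zero. The following is a rough sketch under stated assumptions: $\lambda(t)$ and $\eta(t)$ are replaced by constants `lam` and `eta`, the measure is the semicircle law with density normalized by $(2\pi\eta)^{-1}$, and the integral is approximated by a midpoint rule. The names `resolvent_integral` and `closed_form` are hypothetical.

```python
import math


def resolvent_integral(z, lam, eta, n_grid=20000):
    """Midpoint-rule approximation of the integral of 1/(1 - s z) against the
    semicircle law with mean lam and variance eta (real z, |s z| < 1 on the support)."""
    lo = lam - 2.0 * math.sqrt(eta)
    h = 4.0 * math.sqrt(eta) / n_grid
    total = 0.0
    for k in range(n_grid):
        s = lo + (k + 0.5) * h
        dens = math.sqrt(max(4.0 * eta - (s - lam) ** 2, 0.0)) / (2.0 * math.pi * eta)
        total += dens * h / (1.0 - s * z)
    return total


def closed_form(z, lam, eta):
    """Right-hand side 2 / (1 - lam z + sqrt((1 - lam z)^2 - 4 z^2 eta))."""
    u = 1.0 - lam * z
    return 2.0 / (u + math.sqrt(u * u - 4.0 * z * z * eta))
```

For small real $z$ the two functions should agree up to quadrature error. Replacing $z$ by $f(t)$ and multiplying by $f^2(t)$ gives the integrand of Corollary 4.1, consistently with Proposition 3.1.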

Acknowledgements. We would like to thank the referee for a careful reading of the manuscript and making very useful comments and suggestions. The authors acknowledge the financial support of the SFB 701 “Spectral structures and topological methods in mathematics”, Bielefeld University. MB was partially supported by the KBN grant no. 1P03A 01330. EL was partially supported by the PTDC/MAT/67965/2006 grant, University of Madeira.



References
1. Accardi, L., Franz, U., Skeide, M.: Renormalized squares of white noise and other non-Gaussian noises as Lévy processes on real Lie algebras. Commun. Math. Phys. 228, 123–150 (2002)
2. Anshelevich, M.: Free martingale polynomials. J. Funct. Anal. 201, 228–261 (2003)
3. Anshelevich, M.: Appell polynomials and their relatives. Int. Math. Res. Not. 2004(65), 3469–3531 (2004)
4. Anshelevich, M.: q-Lévy processes. J. Reine Angew. Math. 576, 181–207 (2004)
5. Anshelevich, M.: Free Meixner states. Commun. Math. Phys. 276, 863–899 (2007)
6. Anshelevich, M.: Orthogonal polynomials with a resolvent-type generating function. Trans. Amer. Math. Soc. 360, 4125–4143 (2008)
7. Anshelevich, M.: Monic non-commutative orthogonal polynomials. Proc. Amer. Math. Soc. 136, 2395–2405 (2008)
8. Barndorff-Nielsen, O.E., Thorbjørnsen, S.: Lévy laws in free probability. Proc. Natl. Acad. Sci. USA 99, 16568–16575 (2002) (electronic)
9. Barndorff-Nielsen, O.E., Thorbjørnsen, S.: Lévy processes in free probability. Proc. Natl. Acad. Sci. USA 99, 16576–16580 (2002) (electronic)
10. Barndorff-Nielsen, O.E., Thorbjørnsen, S.: The Lévy-Itô decomposition in free probability. Probab. Theory Related Fields 131(2), 197–228 (2005)
11. Berezansky, Ju.M.: Expansions in Eigenfunctions of Selfadjoint Operators. Providence, RI: Amer. Math. Soc., 1968
12. Berezansky, Yu.M.: Commutative Jacobi fields in Fock space. Integral Equations Operator Theory 30, 163–190 (1998)
13. Berezansky, Yu.M., Lytvynov, E., Mierzejewski, D.A.: The Jacobi field of a Lévy process. Ukrainian Math. J. 55, 853–858 (2003)
14. Berezansky, Yu.M., Mierzejewski, D.A.: The structure of the extended symmetric Fock space. Methods Funct. Anal. Topology 6(4), 1–13 (2000)
15. Biane, P.: Processes with free increments. Math. Z. 227, 143–174 (1998)
16. Bożejko, M., Kümmerer, B., Speicher, R.: q-Gaussian processes: non-commutative and classical aspects. Commun. Math. Phys. 185, 129–154 (1997)
17. Bożejko, M., Bryc, W.: On a class of free Lévy laws related to a regression problem. J. Funct. Anal. 236, 59–77 (2006)
18. Brüning, E.: When is a field a Jacobi field? A characterization of states on tensor algebras. Publ. Res. Inst. Math. Sci. 22, 209–246 (1986)
19. Donati-Martin, C.: Stochastic integration with respect to q Brownian motion. Probab. Theory Related Fields 125, 77–95 (2003)
20. Effros, E.G., Popa, M.: Feynman diagrams and Wick products associated with q-Fock space. Proc. Natl. Acad. Sci. USA 100, 8629–8633 (2003) (electronic)
21. Gel’fand, I.M., Vilenkin, N.Ya.: Generalized Functions, Vol. IV. New York-London: Academic Press, 1964
22. Hida, T., Kuo, H.-H., Potthoff, J., Streit, L.: White Noise: An Infinite Dimensional Calculus. Dordrecht-Boston-London: Kluwer Acad. Publ., 1993
23. Kondratiev, Yu.G., Lytvynov, E.W.: Operators of gamma white noise calculus. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 3, 303–335 (2000)
24. Kondratiev, Yu.G., da Silva, J.L., Streit, L., Us, G.F.: Analysis on Poisson and gamma spaces. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 1, 91–117 (1998)
25. Lytvynov, E.W.: Multiple Wiener integrals and non-Gaussian white noises: a Jacobi field approach. Meth. Func. Anal. and Topol 1, 61–85 (1995)
26. Lytvynov, E.: Polynomials of Meixner’s type in infinite dimensions: Jacobi fields and orthogonality measures. J. Funct. Anal. 200, 118–149 (2003)
27. Lytvynov, E.: Orthogonal decompositions for Lévy processes with an application to the gamma, Pascal, and Meixner processes. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6, 73–102 (2003)
28. Lytvynov, E.: The square of white noise as a Jacobi field. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 7, 619–629 (2004)
29. Lytvynov, E.W., Rebenko, A.L., Shchepan’uk, G.V.: Wick theorems in non-Gaussian white noise calculus. Rep. Math. Phys. 37, 217–232 (1996)
30. Meixner, J.: Orthogonale Polynomsysteme mit einem besonderen Gestalt der erzeugenden Funktion. J. London Math. Soc. 9, 6–13 (1934)
31. Nualart, D., Schoutens, W.: Chaotic and predictable representations for Lévy processes. Stochastic Process. Appl. 90, 109–122 (2000)
32. Parthasarathy, K.R.: An Introduction to Quantum Stochastic Calculus. Basel: Birkhäuser Verlag, 1992
33. Rodionova, I.: Analysis connected with generating functions of exponential type in one and infinite dimensions. Methods Funct. Anal. Topology 11, 275–297 (2005)
34. Saitoh, N., Yoshida, H.: The infinite divisibility and orthogonal polynomials with a constant recursion formula in free probability theory. Probab. Math. Statist. 21, 159–170 (2001)
35. Schoutens, W.: Stochastic Processes and Orthogonal Polynomials. New York: Springer-Verlag, 2000
36. Śniady, P.: Quadratic bosonic and free white noises. Commun. Math. Phys. 211, 615–628 (2000)
37. Speicher, R.: Free probability theory and non-crossing partitions. Sém. Lothar. Combin. 39, Art. B39c, 38 pp. (1997) (electronic)

Communicated by Y. Kawahigashi




How Hot Can a Heat Bath Get?

Martin Hairer 1,2

1 Mathematics Institute, The University of Warwick, Coventry CV4 7AL, United Kingdom. E-mail: [email protected]
2 Courant Institute, New York University, New York, NY 10012, USA. E-mail: [email protected]

Received: 2 December 2008 / Accepted: 31 March 2009
Published online: 25 June 2009 – © Springer-Verlag 2009

Abstract: We study a model of two interacting Hamiltonian particles subject to a common potential in contact with two Langevin heat reservoirs: one at finite and one at infinite temperature. This is a toy model for ‘extreme’ non-equilibrium statistical mechanics. We provide a full picture of the long-time behaviour of such a system, including the existence/non-existence of a non-equilibrium steady state, the precise tail behaviour of the energy in such a state, as well as the speed of convergence toward the steady state. Despite its apparent simplicity, this model exhibits a surprisingly rich variety of long time behaviours, depending on the parameter regime: if the surrounding potential is ‘too stiff’, then no stationary state can exist. In the softer regimes, the tails of the energy in the stationary state can be either algebraic, fractional exponential, or exponential. Correspondingly, the speed of convergence to the stationary state can be either algebraic, stretched exponential, or exponential. Regarding both types of claims, we obtain matching upper and lower bounds.

Contents
1. Introduction
2. Heuristic Derivation of the Main Results
3. A Potpourri of Test Function Techniques
4. Existence and Non-existence of an Invariant Probability Measure
5. Integrability Properties of the Invariant Measure
6. Convergence Speed Towards the Invariant Measure
7. The Case of a Weak Pinning Potential

1. Introduction

The aim of this work is to provide a detailed investigation of the dynamic and the long-time behaviour of the following model. Consider two point particles moving in a potential $V_1$ and interacting through a harmonic force, that is, the Hamiltonian system with Hamiltonian

$$H(p, q) = \frac{p_0^2 + p_1^2}{2} + V_1(q_0) + V_1(q_1) + V_2(q_0 - q_1), \qquad V_2(q) = \frac{\alpha}{2}\,q^2. \tag{1.1}$$

We assume that the first particle is in contact with a Langevin heat bath at temperature $T > 0$. The second particle is also assumed to have a stochastic force acting on it, but no corresponding friction term, so that it is at ‘infinite temperature’. The corresponding equations of motion are

$$\begin{aligned}
dq_i &= p_i\,dt, \qquad i \in \{0, 1\},\\
dp_0 &= -V_1'(q_0)\,dt + \alpha(q_1 - q_0)\,dt - \gamma p_0\,dt + \sqrt{2\gamma T}\,dw_0(t),\\
dp_1 &= -V_1'(q_1)\,dt + \alpha(q_0 - q_1)\,dt + \sqrt{2\gamma T_\infty}\,dw_1(t),
\end{aligned} \tag{1.2}$$

where $w_0$ and $w_1$ are two independent Wiener processes. Although we use the symbol $T_\infty$ in the diffusion coefficient appearing in the second oscillator, this should not be interpreted as a physical temperature since the corresponding friction term is missing, so that detailed balance does not hold, even if $T_\infty = T$. We also assume without further mention throughout this article that the parameters $\alpha$, $\gamma$, $T$ and $T_\infty$ appearing in the model (1.2) are all strictly positive. The equations of motion (1.2) determine a diffusion on $\mathbb{R}^4$ with generator $\mathcal{L}$ given by

$$\mathcal{L} = X_H - \gamma p_0\,\partial_{p_0} + \gamma\bigl(T\,\partial_{p_0}^2 + T_\infty\,\partial_{p_1}^2\bigr), \tag{1.3}$$

where $X_H$ is the Liouville operator associated to $H$, i.e. the first-order differential operator corresponding to the Hamiltonian vector field. It is easy to show that (1.2) has a unique global solution for every initial condition since the evolution of the total energy is controlled by

$$\mathcal{L}H = \gamma(T + T_\infty) - \gamma p_0^2, \tag{1.4}$$

and so $\mathbf{E}H(t) \le H(0)\exp(\gamma(T + T_\infty)t)$. Schematically, the system under consideration can thus be depicted as follows, where we show the three terms contributing to the change of the total energy:

This model is very closely related to the toy model of heat conduction previously studied by various authors in [EPR99b,EPR99a,EH00,RT00,RT02,EH03,Car07,HM08a], consisting of a chain of N anharmonic oscillators coupled at its endpoints to two heat baths at possibly different temperatures. The main difference is that the present model does not have any friction term on the second particle. This is similar in spirit to the system considered in [MTVE02,DMP+07], where the authors study the stationary state of a ‘resonant duo’ with forcing on one degree of freedom and dissipation on another one. Because of this lack of dissipation, even the existence of a stationary state is not obvious at all in such a system. Indeed, if the coupling constant $\alpha$ is equal to zero, one can easily check that the invariant measure for (1.2) is (formally) given by $\exp\bigl(-(p_0^2/2 + V_1(q_0))/T\bigr)\,dp_0\,dp_1\,dq_0\,dq_1$, which is obviously not integrable. One of the main questions of interest for such a system is therefore to understand the mechanism of energy dissipation. In this sense, this is a prime example of a ‘hypocoercive’ system where the dissipation mechanism does not act on all the degrees of freedom of the system directly, but is transmitted to them indirectly through the dynamic [Vil07,Vil08]. This is somewhat analogous to ‘hypoelliptic’ systems, where it is the smoothing mechanism that is transmitted to all degrees of freedom through the dynamic. The system under consideration happens to be hypoelliptic as well, but this is not going to cause any particular difficulty and will not be the main focus of the present work.

Furthermore, since one of the heat baths is at ‘infinite’ temperature, even if a stationary state exists, one would not necessarily expect it to behave even roughly like $\exp(-\beta H)$ for some effective inverse temperature $\beta$. It is therefore of independent interest to study the tail behaviour of the energy of (1.2) in its stationary state.

In order to simplify our analysis, we are going to limit our investigation to one of the simplest possible cases, where $V_1$ is a perturbation of a homogeneous potential. More precisely, we assume that $V_1$ is an even function of class $C^2$ such that

$$V_1(x) = \frac{|x|^{2k}}{2k} + R_1(x),$$

with a remainder term $R_1$ such that

$$\sup_{m \le 2}\ \sup_{x \in \mathbb{R}\setminus[-1,1]} \frac{|R_1^{(m)}(x)|}{|x|^{2k-1-m}} < \infty.$$

Here, $k \in \mathbb{R}$ is a parameter describing the ‘stiffness’ of the individual oscillators. (In the case $k = 0$, we assume that $V_1(x) = C + R_1(x)$ for some constant $C$.)
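To make the model concrete, the following is a minimal simulation sketch of (1.2). It assumes the purely homogeneous potential $V_1(x) = |x|^{2k}/(2k)$ (that is, $R_1 \equiv 0$), restricts to $k \ge 1/2$ so that the force is defined at the origin, and uses a plain Euler-Maruyama discretisation. The function names, the parameter values and the choice of scheme are illustrative assumptions and are not taken from the article.

```python
import math
import random


def V1(q, k):
    """Homogeneous pinning potential |q|^(2k) / (2k); the remainder R1 is set to zero (assumption)."""
    return abs(q) ** (2 * k) / (2 * k)


def dV1(q, k):
    """Derivative of V1, i.e. sign(q) |q|^(2k-1); assumes k >= 1/2."""
    return math.copysign(abs(q) ** (2 * k - 1), q)


def energy(q0, q1, p0, p1, k, alpha):
    """Total energy H(p, q) from (1.1) with V2(q) = alpha q^2 / 2."""
    kinetic = 0.5 * (p0 * p0 + p1 * p1)
    return kinetic + V1(q0, k) + V1(q1, k) + 0.5 * alpha * (q0 - q1) ** 2


def euler_maruyama(state, k, alpha, gamma, T, T_inf, dt, n_steps, rng):
    """Integrate (1.2) with the Euler-Maruyama scheme (illustrative choice of scheme)."""
    q0, q1, p0, p1 = state
    s0 = math.sqrt(2.0 * gamma * T * dt)       # noise amplitude on particle 0
    s1 = math.sqrt(2.0 * gamma * T_inf * dt)   # noise amplitude on particle 1
    for _ in range(n_steps):
        # Friction acts on particle 0 only; particle 1 is forced but not damped.
        f0 = -dV1(q0, k) + alpha * (q1 - q0) - gamma * p0
        f1 = -dV1(q1, k) + alpha * (q0 - q1)
        q0, q1 = q0 + p0 * dt, q1 + p1 * dt
        p0 = p0 + f0 * dt + s0 * rng.gauss(0.0, 1.0)
        p1 = p1 + f1 * dt + s1 * rng.gauss(0.0, 1.0)
    return q0, q1, p0, p1


if __name__ == "__main__":
    rng = random.Random(0)
    final = euler_maruyama((1.0, 0.0, 0.0, 0.0), k=1.5, alpha=1.0, gamma=1.0,
                           T=1.0, T_inf=2.0, dt=1e-3, n_steps=10_000, rng=rng)
    print("final energy:", energy(*final, k=1.5, alpha=1.0))
```

An explicit scheme of this kind is only adequate for short runs at moderate energies; for the stiff regimes discussed in the article a structure-preserving integrator would be the more natural choice.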

In the case where both ends of the chain are at finite temperature (which would correspond to the situation depicted above), it was shown in [EPR99b,EH00,RT02,Car07] that, provided that the coupling potential $V_2$ grows at least as fast at infinity as the pinning potential $V_1$ and that the latter grows at least linearly (i.e. provided that $\frac12 \le k \le 1$ with our notations), the Markov semigroup associated to the model has a unique invariant measure µ and its transition probabilities converge to µ at exponential speed. One can actually show even more, namely that the Markov semigroup consists of compact operators in some suitably weighted space of functions. Intuitively, the condition that $V_2$ grows at least as fast as $V_1$ can be understood by the fact that in this case, at high energies, the interaction dominates so that no energy can get ‘trapped’ in the system. Therefore, the system is sufficiently stiff so that if the energy of any one of its oscillators is large, then the energy of all of the oscillators must be large after a very short time. As a consequence, the system behaves like a ‘molecule’ at some effective temperature that moves in the global potential $V_1$.

While the arguments presented in [RT02,Car07] do not cover the case of one of the heat baths being at infinite temperature, it is nevertheless possible to show that in this case, the Markov semigroup $P_t$ generated by solutions to (1.2) behaves qualitatively as in the case of finite temperature. In particular, if $V_1$ grows at least linearly at infinity, the system possesses a spectral gap in a space of functions weighted by a weight function ‘close to’ $\exp(\beta_0 H)$ for some $\beta_0 > 0$. This discussion suggests that:

1. If $V_2$ grows at least as fast as $V_1$ at infinity and $V_1$ grows at least linearly, our toy model can sustain arbitrarily large energy currents.
2. In this case, even though the heat bath to the right is at infinite temperature, the system stabilises at some finite ‘effective temperature’, as expressed by the fact that $H$ has finite exponential moments under the invariant measure.

This is in stark contrast with the behaviour encountered when $V_1$ grows faster than $V_2$ at infinity. In this case, the interaction between neighbouring particles is suppressed at high energies, which precisely favours the trapping of energy in the bulk of the chain.
It was shown in [HM08a] that this can lead in many cases to a loss of compactness of the semigroup generated by the dynamic and the appearance of essential spectrum at 1. This is a manifestation of the fact that energy transport is very weak in such systems, due to the appearance of ‘breathers’, localised structures that only decay very slowly [MA94]. In this case, one expects that the long-time behaviour of (1.2) depends much more strongly on the fine details of the model. For example, regarding the finiteness of the ‘temperature’ of the second oscillator, one may introduce the following notions by increasing order of strength:

1. There exists an invariant probability measure µ for (1.2), that is a positive solution to $\mathcal{L}^* \mu = 0$.
2. There exists an invariant probability measure µ and the average energy of the second oscillator is finite under µ.
3. There exists an invariant probability measure µ and the energy of the second oscillator has some finite exponential moment under µ.

We will show that it is possible to find parameters such that the second oscillator does not have finite temperature according to any of these notions of finiteness. On the other hand, it is also possible to find parameters such that it does have finite temperature according to some notions and not to others.

It turns out that, maybe rather surprisingly for such a simple model, there are five different critical values for the strength $k$ of the pinning potential $V_1$ that separate between qualitatively different behaviours regarding both the integrability properties of the invariant measure µ and the speed of convergence of transition probabilities towards it. These critical values are $k = 0$, $k = \frac12$, $k = 1$, $k = \frac43$, and $k = 2$. More precisely, there exists a constant $\hat C > 0$ such that, setting

$$\zeta = \frac{3}{4}\,\frac{\alpha^2 \hat C - T_\infty}{T_\infty}, \qquad \kappa = \frac{2}{k} - 1, \tag{1.5}$$

the results in this article can be summarised as follows:

Theorem 1.1. The integrability properties of the invariant measure µ for (1.2) and the speed of convergence of transition probabilities of (1.2) toward µ can be described by the following table:

Parameter range
$k > 2$
$k = 2$, $T_\infty > \alpha^2 \hat C$
$k = 2$, $T_\infty < \alpha^2 \hat C$
$\frac43 \le k$ …

… 2. The numerics were performed with a Störmer-Verlet scheme that was modified to take into account the damping and the stochastic forcing

Remark 1.4. For $k \in (0, \frac12)$, even the gradient dynamic fails to exhibit a spectral gap. It is therefore not surprising (see for example [HN05]) that in this case we see again subexponential relaxation speeds.

Remark 1.5. This table exhibits a symmetry $\kappa \leftrightarrow k$ and $H^\kappa \leftrightarrow H$ around $k = 1$ (indicated by a grayed row in the table). The reason for this symmetry will be explained in Sect. 2 below. If we had chosen $V_1(x) = K \log x + R_1(x)$ in the case $k = 0$, this symmetry would have extended to this case, via the correspondence $\zeta \leftrightarrow \frac{K}{T + T_\infty} - \frac12$.

Remark 1.6. It follows from (1.6) that the time it takes for the transition probabilities starting from $x$ to satisfy $\|P_t(x, \cdot\,) - \mu\|_{\mathrm{TV}} \le \frac12$, say, is bounded by $H(x)^{2 - \frac{2}{k}}$ for $k \in (1, 2)$ and by $H(x)^{\frac1k - 1}$ for $k \in (0, 1)$. These bounds are expected to be sharp in view of the heuristics given in Sect. 2 below.

Remark 1.7. Instead of considering only distances in total variation between probability measures, we could also have obtained bounds in weighted norms, similarly to [DFG06].


Remark 1.8. The operator (1.3) appears to be very closely related to the kinetic Fokker-Planck operator

$$\mathcal{L}_{V,2} = p\,\partial_q - \nabla V(q)\,\partial_p - \gamma p\,\partial_p + \partial_p^2,$$

for the potential $V(q_0, q_1) = V_1(q_0) + V_1(q_1) + \frac{\alpha}{2}(q_0 - q_1)^2$. The fundamental difference however is that there is a lack of friction on the second degree of freedom. The effect of this is dramatic, since the results from [HN04] (see also [DV01]) show that one has exponential return to equilibrium for the kinetic Fokker-Planck operator in the case $k \ge 1$, which is clearly not the case here.

Finally, the techniques presented in this article also shed some light on the mechanisms at play in the Helffer-Nier conjecture [HN05, Conj. 1.2], namely that the long-time behaviour of the Fokker-Planck operator without inertia

$$\mathcal{L}_{V,1} = -\nabla V(q)\,\partial_q + \partial_q^2,$$

is qualitatively the same as that of the kinetic Fokker-Planck operator. If $V$ grows faster than quadratically at infinity (so that in particular $\mathcal{L}_{V,1}$ has a spectral gap), then the deterministic motion on the energy levels gets increasingly fast at high energies, so that the angular variables get washed out and the heuristics from Sect. 2.1 below suggest that the total energy of the system behaves like the square of an Ornstein-Uhlenbeck process, thus leading to a spectral gap for $\mathcal{L}_{V,2}$ as well. If on the other hand $V$ grows slower than quadratically at infinity, then the motion of the momentum variable happens on a faster timescale at high energies than that of the position variable. The heuristics from Sect. 2.2 below then suggest that the dynamic corresponding to $\mathcal{L}_{V,2}$ is indeed very well approximated at high energies by that corresponding to $\mathcal{L}_{V,1}$. These considerations suggest that any counterexample to the Helffer-Nier conjecture would come from a potential that has very irregular (oscillating) behaviour at infinity, so that neither of these two arguments quite works. On the other hand, any proof of the conjecture would have to carefully glue together both arguments.
The structure of the remainder of this article is the following. First, in Sect. 2, we derive in a heuristic way reduced equations for the energies of the two oscillators. While this section is very far from rigorous, it allows one to understand the results presented above by linking the behaviour of (1.2) to that of the diffusion

$$dX = -\eta X^{\sigma}\,dt + \sqrt{2}\,dW(t), \qquad X \ge 1,$$

for suitable constants $\eta$ and $\sigma$. The remainder of the article is devoted to the proof of Theorem 1.1, which is broken into five sections. In Sect. 3, we introduce the technical tools that are used to obtain the above statements. These tools are technically quite straightforward and are all based on the existence of test functions with certain properties. The whole art is to construct suitable test functions in a relatively systematic manner. This is done by refining the techniques developed in [HM08a] and based on ideas from homogenisation theory. In Sect. 4, we proceed to showing that $k = 2$ and $T_\infty = \alpha^2 \hat C$ is the borderline case for the existence of an invariant measure. In Sect. 5, we then show sharp integrability properties of the invariant measure for the regime $k > 1$ when it exists. This will imply in particular that even though the effective temperature of the first oscillator is always finite (for whatever measure of finiteness), that of the second oscillator need not necessarily be. In particular, note that it follows from Theorem 1.1 that the borderline case for the integrability of the energy of the second oscillator in the invariant measure is given by $k = 2$ and $T_\infty = \frac37 \alpha^2 \hat C$. These two sections form the ‘meat’ of the paper. In Sect. 6, we make use of the integrability results obtained previously in order to obtain bounds both from above and from below on the convergence of transition probabilities towards the invariant measure. The upper bounds are based on a recent criterion from [DFG06,BCG08], while the lower bounds are based on a simple criterion that exploits the knowledge that certain functions of the energy fail to be integrable in the invariant measure. Finally, in Sect. 7, we obtain the results for the case $k \le 1$. While these final results are based on the same techniques as the remainder of the article, the construction of the relevant test functions in this case is inspired by the arguments presented in [RT02,Car07].

1.1. Notations. In the remainder of this article, we will use the symbol $C$ to denote a generic strictly positive constant that, unless stated explicitly, depends only on the details of the model (1.2) and can change from line to line even within the same block of equations.

2. Heuristic Derivation of the Main Results

In this section, we give a heuristic derivation of the results of Theorem 1.1. Since we are interested in the tail behaviour of the energy in the stationary state, an important ingredient of the analysis is to isolate the ‘worst-case’ degree of freedom of (1.2), that would be some degree of freedom $X$ which dominates the behaviour of the energy at infinity.
The aim of this section is to argue that it is always possible to find such a degree of freedom (but what $X$ really describes depends on the details of the model, and in particular on the value of $k$) and that, for large values of $X$, it satisfies asymptotically an equation of the type

$$dX = -\eta X^{\sigma}\,dt + \sqrt{2}\,dW(t), \tag{2.1}$$

for some exponent $\sigma$ and some constant $\eta > 0$. Before we proceed with this programme, let us consider the model (2.1) on the set $\{X \ge 1\}$ with reflected boundary conditions at $X = 1$. The invariant measure µ for (2.1) then has density proportional to $\exp(-\eta X^{\sigma+1}/(\sigma + 1))$ for $\sigma > -1$ and to $X^{-\eta}$ for $\sigma = -1$. In particular, (2.1) admits an invariant probability measure if and only if $\sigma > -1$ or $\sigma = -1$ and $\eta > 1$. For such a model, we have the following result, which is a slight refinement of the results obtained in [Ver00,VK04,Ver06].

Theorem 2.1. The long-time behaviour of (2.1) is described by the following table:

Parameter range     | Integrability of µ                 | Convergence speed                                 | Prefactor
σ < −1              | –                                  | –                                                 | –
σ = −1, η ≤ 1       | –                                  | –                                                 | –
σ = −1, η > 1       | $X^{\eta-1\pm\varepsilon}$         | $t^{\frac{1-\eta}{2}\pm\varepsilon}$              | $X^{\eta+1+\varepsilon}$
−1 < σ < 0          | $\exp(\gamma_\pm X^{\sigma+1})$    | $\exp(-\gamma_\pm t^{(1+\sigma)/(1-\sigma)})$     | $\exp(\delta X^{\sigma+1})$
0 ≤ σ < 1           | $\exp(\gamma_\pm X^{\sigma+1})$    | $\exp(-\gamma_\pm t)$                             | $\exp(\delta X^{1-\sigma})$
σ = 1               | $\exp(\gamma_\pm X^{\sigma+1})$    | $\exp(-\gamma_\pm t)$                             | $X^{\varepsilon}$
σ > 1               | $\exp(\gamma_\pm X^{\sigma+1})$    | $\exp(-\gamma_\pm t)$                             | 1

The entries of this table have the same meaning as in Theorem 1.1, with the exception that the lower bounds on the convergence speed toward the invariant measure hold for all $t > 0$ rather than only for a subsequence.

Proof. The case $0 \le \sigma \le 1$ is very well-known (one can simply apply Theorem 3.4 below with either $V(X) = \exp(\delta X^{1-\sigma})$ for $\delta$ small enough in the case $\sigma < 1$ or with $X^{\varepsilon}$ in the case $\sigma = 1$). The case $\sigma > 1$ follows from the fact that in this case one can find a constant $C > 0$ such that $\mathbf{E}X(1) \le C$, independently of the initial condition. The bounds for $\sigma = -1$ and $\eta > 1$ can be found in [Ver00,FR05,Ver06] (a slightly weaker upper bound can also be found in [RW01]). However, as shown in [BCG08], the upper bound can also be retrieved by using Theorem 3.5 below with a test function behaving like $X^{\eta+1+\varepsilon}$ for an arbitrarily small value of $\varepsilon$. The lower bound on the other hand can be obtained from Theorem 3.6 by using a test function behaving like $X^{\alpha}$, but with $\alpha \gg 1$. (These bounds could actually be slightly improved by choosing test functions of the form $X^{\eta+1}(\log X)^{\beta}$ for the upper bound and $\exp((\log X)^{\beta})$ for the lower bound.) The upper bound for the case $\sigma \in (-1, 0)$ can be found in [VK04] and more recently in [DFG06,BCG08]. This and the corresponding lower bound can be obtained similarly to above from Theorems 3.5 and 3.6 by considering test functions of the form $\exp(a X^{\sigma+1})$ for suitable values of $a$ (small for the upper bound and large for the lower bound).

Returning to the problem of interest, it was already noted in [EH00,RT02] that $k = 1$ is a boundary between two types of completely different behaviours for the dynamic (1.2). The remainder of this section is therefore divided into two subsections where we analyse the behaviour of these two regimes.
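For reference, the form of the invariant density of (2.1) quoted before Theorem 2.1 can be recovered by a short computation. The following is only a sketch: the symbol $\rho$ for the density is introduced here for illustration, and the computation uses the standard fact that reflection at $X = 1$ forces the stationary probability flux to vanish.

```latex
% Model (2.1): dX = -\eta X^\sigma dt + \sqrt{2} dW on {X >= 1}, reflected at X = 1.
% Zero-flux stationary Fokker-Planck equation for the density \rho (notation assumed here):
\[
  \rho'(X) + \eta X^{\sigma}\rho(X) = 0
  \quad\Longrightarrow\quad
  \rho(X) \propto
  \begin{cases}
    \exp\bigl(-\eta X^{\sigma+1}/(\sigma+1)\bigr), & \sigma \neq -1,\\[2pt]
    X^{-\eta}, & \sigma = -1.
  \end{cases}
\]
% Normalisability on [1,\infty): for \sigma < -1 the density tends to a positive constant.
\[
  \int_1^\infty \exp\Bigl(-\tfrac{\eta X^{\sigma+1}}{\sigma+1}\Bigr)\,dX < \infty
  \iff \sigma > -1,
  \qquad
  \int_1^\infty X^{-\eta}\,dX < \infty \iff \eta > 1 .
\]
% Moments in the borderline case \sigma = -1:
\[
  \int_1^\infty X^{a}\,X^{-\eta}\,dX < \infty \iff a < \eta - 1 .
\]
```

The last line is consistent with the entry $X^{\eta-1\pm\varepsilon}$ in the integrability column of the table for $\sigma = -1$, $\eta > 1$.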

2.1. The case k > 1. When $k > 1$, the pinning potential $V_1$ is stronger than the coupling potential $V_2$. Therefore, in this regime, one would expect the dynamic of the two oscillators to approximately decouple at very high energies [HM08a]. This suggests that one should be able to find functions $H_0$ and $H_1$ describing the energies of the two oscillators such that $H_0$ is distributed approximately according to $\exp(-H_0/T)$, while the distribution of $H_1$ has heavier tails since that oscillator is not directly damped. In order to guess the behaviour of $H_1$ at high energies, note first that since $H_0$ is expected to have exponential tails, the regime of interest is that where $H_1$ is very large, while $H_0$ is of order one. In this regime, the second oscillator feels mainly its pinning potential, so that its motion is well approximated by the motion of a single free oscillator moving in the potential $|q|^{2k}/2k$. A simple calculation shows that such a motion is periodic with frequency proportional to $H_1^{\frac12 - \frac{1}{2k}}$ and with amplitude proportional to $H_1^{\frac{1}{2k}}$. In other words, one can find periodic functions $P$ and $Q$ such that in the regime of interest, one has (up to phases)

$$q_1(t) \approx H_1^{\frac{1}{2k}}\,Q\bigl(H_1^{\frac12 - \frac{1}{2k}}\,t\bigr), \qquad p_1(t) \approx H_1^{\frac12}\,P\bigl(H_1^{\frac12 - \frac{1}{2k}}\,t\bigr). \tag{2.2}$$

Let now $\Phi_p$ and $\Phi_q$ be the unique solutions to $\dot\Phi_q = \Phi_p$, $\dot\Phi_p = Q$ that average out to zero over one period. It is apparent from the equations of motion (1.2) that if we assume that (2.2) is a good model for the dynamic of the second oscillator, then the motion of the first oscillator can, at least to lowest order, be described by

$$p_0(t) = \tilde p_0(t) - \alpha H_1^{\frac1k - \frac12}\,\Phi_p\bigl(H_1^{\frac12 - \frac{1}{2k}}\,t\bigr), \qquad q_0(t) = \tilde q_0(t) - \alpha H_1^{\frac{3}{2k} - 1}\,\Phi_q\bigl(H_1^{\frac12 - \frac{1}{2k}}\,t\bigr), \tag{2.3}$$

where the functions $\tilde p_0$ and $\tilde q_0$ do not show any highly oscillatory behaviour anymore. Furthermore, they then satisfy, at least to lowest order, the decoupled Langevin equation

$$d\tilde q_0 \approx \tilde p_0\,dt, \qquad d\tilde p_0 \approx -V_1'(\tilde q_0)\,dt - \alpha \tilde q_0\,dt - \gamma \tilde p_0\,dt + \sqrt{2\gamma T}\,dw_0(t), \tag{2.4}$$

that indeed has $\exp(-H_0/T)$ as invariant measure, provided that we set

$$H_0 = \frac{\tilde p_0^2}{2} + V_1(\tilde q_0) + \frac{\alpha}{2}\,\tilde q_0^2.$$

Let us now return to the question of the behaviour of energy dissipation. The average rate of change of the total energy of our system is described by (1.4). Plugging our ansatz (2.3) into this equation and using the fact that $\Phi_p$ is highly oscillatory and averages out to 0, we obtain

$$\mathcal{L}H \approx \gamma(T + T_\infty) - \gamma \tilde p_0^2 - \gamma \alpha^2 H_1^{\frac2k - 1}\,\langle \Phi_p^2 \rangle,$$

where $\langle \cdot \rangle$ denotes the average over one period. On the other hand, it follows from (2.4) that one has

$$\mathcal{L}H_0 \approx \gamma T - \gamma \tilde p_0^2 \approx C_1 - C_2 H_0, \tag{2.5}$$

so that one expects to obtain for the energy of the second oscillator the expression

$$\mathcal{L}H_1 \approx \gamma T_\infty - \gamma \alpha^2 \langle \Phi_p^2 \rangle\,H_1^{\frac2k - 1}.$$

This suggests that, at least in the regime of interest, and since the $p$-dependence of $H_1$ probably goes like $p_1^2/2$, the energy of the second oscillator follows a decoupled equation of the type

$$dH_1 \approx \bigl(\gamma T_\infty - \gamma \alpha^2 \langle \Phi_p^2 \rangle\,H_1^{\frac2k - 1}\bigr)\,dt + \sqrt{2\gamma T_\infty K H_1}\,dw_1(t), \tag{2.6}$$

where $K$ is the average of $p_1^2$ over one period of the free dynamic at energy 1, which will be shown in (5.10) below to be given by $K = 2k/(1 + k)$. In order to analyse (2.6), it is convenient to introduce the variable $X$ given by $X^2 = 4H_1/(\gamma T_\infty K)$, so that its evolution is given by

$$dX = \Bigl(\frac{2}{K} - 1\Bigr)\frac{1}{X}\,dt - \frac{\sqrt{\gamma}\,\alpha^2 \langle \Phi_p^2 \rangle}{\sqrt{T_\infty K}}\Bigl(\frac{\gamma T_\infty K X^2}{4}\Bigr)^{\frac2k - \frac32}dt + \sqrt{2}\,dw_1(t). \tag{2.7}$$

This shows that there is a transition at $k = 2$. For $k > 2$, we recover (2.1) with $\sigma = -1$ and $\eta = 1 - \frac{2}{K} < 1$, so that one does not expect to have an invariant measure, thus recovering the corresponding statement in Theorem 1.1.
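The passage from (2.6) to (2.7) is an application of Itô's formula. The following sketch spells out the computation; the abbreviation $c$ and the notation $\langle \Phi_p^2 \rangle$ for the period average entering (2.6) are conventions adopted here for illustration.

```latex
% Change of variables X = c H_1^{1/2} with c = 2 / \sqrt{\gamma T_\infty K}  (c is notation assumed here).
% Ito's formula, using d<H_1> = 2 \gamma T_\infty K H_1 dt from (2.6):
\begin{align*}
  dX &= \frac{c}{2}\,H_1^{-1/2}\,dH_1
        - \frac{c}{8}\,H_1^{-3/2}\,\bigl(2\gamma T_\infty K H_1\bigr)\,dt .
\end{align*}
% Insert H_1^{-1/2} = c / X and c^2 = 4 / (\gamma T_\infty K):
\begin{align*}
  \frac{c}{2}\,H_1^{-1/2}\,\gamma T_\infty &= \frac{2}{K}\,\frac{1}{X},
  &
  \frac{c}{8}\,H_1^{-3/2}\,2\gamma T_\infty K H_1 &= \frac{1}{X},\\
  \frac{c}{2}\,H_1^{-1/2}\,\gamma\alpha^2\langle\Phi_p^2\rangle H_1^{\frac2k-1}
     &= \frac{\sqrt{\gamma}\,\alpha^2\langle\Phi_p^2\rangle}{\sqrt{T_\infty K}}\,H_1^{\frac2k-\frac32},
  &
  \frac{c}{2}\,H_1^{-1/2}\sqrt{2\gamma T_\infty K H_1} &= \sqrt{2},
\end{align*}
% with H_1 = \gamma T_\infty K X^2 / 4. Collecting terms:
\begin{align*}
  dX = \Bigl(\frac{2}{K}-1\Bigr)\frac{dt}{X}
       - \frac{\sqrt{\gamma}\,\alpha^2\langle\Phi_p^2\rangle}{\sqrt{T_\infty K}}
         \Bigl(\frac{\gamma T_\infty K X^2}{4}\Bigr)^{\frac2k-\frac32}dt
       + \sqrt{2}\,dw_1(t).
\end{align*}
```

The second drift term scales like $X^{\frac4k - 3}$, so it is subdominant to the $1/X$ term for $k > 2$, of the same order at $k = 2$, and dominant for $k \in (1, 2)$.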


At $k = 2$, we still have $\sigma = -1$, but we obtain

$$\eta = 1 - \frac{2}{K} + \frac{2\alpha^2 \langle \Phi_p^2 \rangle}{T_\infty K} = \frac32\,\frac{\alpha^2 \langle \Phi_p^2 \rangle}{T_\infty} - \frac12,$$

so that one expects to have existence of an invariant probability measure if and only if $T_\infty < \alpha^2 \langle \Phi_p^2 \rangle$. Furthermore, we recover from Theorem 2.1 the integrability results and convergence rates of Theorem 1.1, noting that one has the formal correspondence $\zeta = (\eta - 1)/2$. This correspondence comes from the fact that $H \approx X^2$ in the regime of interest and that $X^{\eta - 1}$ is the borderline for non-integrability with respect to µ in Theorem 2.1. In the regime $k \in (1, 2)$, the first term in the right-hand side of (2.7) is negligible, so that we have the case $\sigma = \frac4k - 3$. Applying Theorem 2.1 then immediately allows one to derive the corresponding integrability and convergence results from Theorem 1.1, noting that one has the formal correspondence $\kappa = (\sigma + 1)/2$.

2.2. The case k < 1. This case is much more straightforward to analyse. When $k < 1$, the coupling potential $V_2$ is stiffer than the pinning potential $V_1$. Therefore, one expects the two particles to behave like a single particle moving in the potential $V_1$. This suggests that the ‘worst case’ degree of freedom should be the centre of mass of the system, thus motivating the change of coordinates

$$Q = \frac{q_0 + q_1}{2}, \qquad q = \frac{q_1 - q_0}{2}.$$

Fixing $Q$ and writing $y = (q, p_0, p_1)$ for the remaining coordinates, we see that there exist matrices $A$ and $B$ and a vector $v$ such that $y$ approximately satisfies the equation

$$dy \approx A y\,dt + V_1'(Q)\,v\,dt + B\,dw(t).$$

Here, we made the approximation $V_1'(q_0) \approx V_1'(q_1) \approx V_1'(Q)$, which is expected to be justified in the regime of interest ($Q$ large and $y$ of order one). This shows that for $Q$ fixed, the law of $y$ is approximately Gaussian with covariance of order one and mean proportional to $V_1'(Q)$. Since $dQ = (p_0 + p_1)/2\,dt$, we thus expect that over sufficiently long time intervals, the dynamic of $Q$ is well approximated by

$$dQ \approx -C_1 V_1'(Q)\,dt + C_2\,dW(t) \approx -C_1 Q|Q|^{2k-2}\,dt + C_2\,dW(t),$$

for some positive constants $C_1$, $C_2$ and some Wiener process $W$. We are therefore reduced again to the case of Theorem 2.1 with $X \propto |Q|$ and $\sigma = 2k - 1$. Since in the regime considered here one has $H \approx X^{2k}$, this immediately allows one to recover the results of Theorem 1.1 for the case $k < 1$.

3. A Potpourri of Test Function Techniques

In this section, we present the abstract results on which all the integrability and non-integrability results in this article are based, as well as the techniques that allow one to obtain upper and lower bounds on convergence rates toward the invariant measure. All of these results without exception are based on the existence of test functions with certain properties. In this sense, we follow to its bitter end the Lyapunov function-based approach advocated in [BCG08,CGWW07,CGGR08] and use it to derive not only upper bounds on convergence rates, but also lower bounds.
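Before going further, the exponent bookkeeping of the heuristics in Sect. 2 can be condensed into a few lines of code. The sketch below simply encodes the correspondences stated there ($\sigma = -1$ for $k \ge 2$, $\sigma = 4/k - 3$ for $k \in (1, 2)$, $\sigma = 2k - 1$ for $k < 1$) together with the convergence-speed column of the table in Theorem 2.1. The function names are hypothetical, and the boundary value $k = 1$ is deliberately left out since the heuristics are not stated for it.

```python
from fractions import Fraction


def reduced_sigma(k):
    """Exponent sigma of the reduced model (2.1) suggested by the heuristics of Sect. 2.

    k >= 2     -> sigma = -1       (the 1/X drift in (2.7) is not negligible)
    1 < k < 2  -> sigma = 4/k - 3  (Sect. 2.1)
    0 < k < 1  -> sigma = 2k - 1   (Sect. 2.2)
    """
    k = Fraction(k)
    if k >= 2:
        return Fraction(-1)
    if 1 < k < 2:
        return Fraction(4) / k - 3
    if 0 < k < 1:
        return 2 * k - 1
    raise ValueError("the heuristics of Sect. 2 are not stated for this value of k")


def convergence_type(sigma):
    """Read off the 'Convergence speed' column of the table in Theorem 2.1.

    Returns (kind, exponent of t inside the exponential, or None).
    """
    sigma = Fraction(sigma)
    if sigma < -1:
        return ("no invariant measure", None)
    if sigma == -1:
        # Algebraic rate t^((1 - eta)/2), available only when eta > 1.
        return ("algebraic", None)
    if sigma < 0:
        return ("stretched exponential", (1 + sigma) / (1 - sigma))
    return ("exponential", Fraction(1))


if __name__ == "__main__":
    for k in (Fraction(1, 4), Fraction(3, 4), Fraction(5, 4), Fraction(3, 2), Fraction(3)):
        sigma = reduced_sigma(k)
        print(k, sigma, convergence_type(sigma))
```

Exact rational arithmetic is used so that the critical values, such as $k = 4/3$ where $\sigma$ changes sign, are hit without rounding error.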


While most of the results in this section are known in the literature (except for the one giving the lower bounds on the convergence of transition probabilities, which appears to be new despite its relative triviality), the main interest of the present article is to provide tools for the construction of suitable test functions in problems where different timescales are present at the regimes relevant for the tail behaviour of the invariant measure. The general framework of this section is that of a Stratonovich diffusion on $\mathbb{R}^n$ with smooth coefficients:

$$dx(t) = f_0(x)\,dt + \sum_{i=1}^{m} f_i(x) \circ dw_i(t), \qquad x(0) = x_0 \in \mathbb{R}^n. \tag{3.1}$$

Here, we assume that $f_j : \mathbb{R}^n \to \mathbb{R}^n$ are $C^\infty$ vector fields on $\mathbb{R}^n$ and the $w_i$ are independent standard Wiener processes. Denote by $\mathcal{L}$ the generator of (3.1), that is the differential operator given by

$$\mathcal{L} = X_0 + \frac12 \sum_{i=1}^{m} X_i^2, \qquad X_j = f_j(x)\,\nabla_x.$$

We make the following two standing assumptions which can easily be verified in the context of the model presented in the introduction:

Assumption 1. There exists a smooth function $H : \mathbb{R}^n \to \mathbb{R}_+$ with compact level sets and a constant $C > 0$ such that the bound $\mathcal{L}H \le C(1 + H)$ holds.

This assumption ensures that (3.1) has a unique global strong solution. We furthermore assume that:

Assumption 2. Hörmander’s ‘bracket condition’ holds at every point in $\mathbb{R}^n$. In other words, consider the families $A_k$ (with $k \ge 0$) of vector fields defined recursively by $A_0 = \{f_1, \ldots, f_m\}$ and

$$A_{k+1} = A_k \cup \{[f_j, g],\ g \in A_k,\ j = 0, \ldots, m\}.$$

Define furthermore the subspaces $A_\infty(x) = \mathrm{span}\{g(x) : \exists k > 0 \text{ with } g \in A_k\}$. We then assume that $A_\infty(x) = \mathbb{R}^n$ for every $x \in \mathbb{R}^n$.

As a consequence of Hörmander’s celebrated ‘sums of squares’ theorem [Hör67,Hör85], this assumption ensures that transition probabilities for (3.1) have smooth densities $p_t(x, y)$ with respect to Lebesgue measure. In our case, Assumption 2 can be seen to hold because the coupling potential is harmonic.

Assumption 3. The origin is reachable for the control problem associated to (3.1). That is, given any $x_0 \in \mathbb{R}^n$ and any $r > 0$ there exists a time $T > 0$ and a smooth control $u \in C^\infty([0, T], \mathbb{R}^m)$ such that the solution to the ordinary differential equation

$$\frac{dz}{dt} = f_0(z(t)) + \sum_{i=1}^{m} f_i(z(t))\,u_i(t), \qquad z(0) = x_0,$$

satisfies $|z(T)| \le r$.
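Returning briefly to Assumption 2, the bracket condition can be checked by hand for the model (1.2). The sketch below assumes that $V_1$ is smooth enough for the brackets to make sense and identifies $f_0$ with the drift of (1.2) (the noise is additive, so the Itô and Stratonovich forms coincide), $f_1 = \sqrt{2\gamma T}\,\partial_{p_0}$ and $f_2 = \sqrt{2\gamma T_\infty}\,\partial_{p_1}$; this labelling is an assumption made here for illustration.

```latex
% Drift vector field of (1.2) in the coordinates (q_0, q_1, p_0, p_1):
\begin{align*}
  f_0 &= p_0\,\partial_{q_0} + p_1\,\partial_{q_1}
        + \bigl(-V_1'(q_0) + \alpha(q_1 - q_0) - \gamma p_0\bigr)\,\partial_{p_0}
        + \bigl(-V_1'(q_1) + \alpha(q_0 - q_1)\bigr)\,\partial_{p_1}.
\end{align*}
% Brackets of the (constant) noise directions with the drift:
\begin{align*}
  [\partial_{p_0}, f_0] &= \partial_{q_0} - \gamma\,\partial_{p_0},
  &
  [\partial_{p_1}, f_0] &= \partial_{q_1}.
\end{align*}
% Hence, at every point of R^4,
\[
  \mathrm{span}\bigl\{\partial_{p_0},\ \partial_{p_1},\ \partial_{q_0} - \gamma\,\partial_{p_0},\ \partial_{q_1}\bigr\}
  = \mathbb{R}^4 .
\]
```

Since the four vector fields on the last line are linearly independent everywhere, the span condition $A_\infty(x) = \mathbb{R}^4$ is already met after one round of brackets in this two-particle setting.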


The fact that Assumption 3 also holds in our case is an immediate consequence of the results in [EPR99a,Hai05]. Assumptions 2 and 3 taken together imply that:

1. The operator $\mathcal{L}$ satisfies a strong maximum principle in the following sense. Let $D \subset \mathbb{R}^n$ be a compact domain with smooth boundary such that $0 \in D$. Let furthermore $u \in C^2(D)$ be such that $\mathcal{L}u(x) \le 0$ for $x$ in the interior of $D$ and $u(x) \ge 0$ for $x \in \partial D$. Then, one has $u(x) \ge 0$ for all $x \in D$, see [Bon69, Theorem 3.2].
2. The Markov semigroup associated to (3.1) admits at most one invariant probability measure [DPZ96]. Furthermore, if such an invariant measure exists, then it has a smooth density with respect to Lebesgue measure.

3.1. Integrability properties of the invariant measure. Throughout the article, we are going to use the following criterion for the existence of an invariant measure with certain integrability properties:

Theorem 3.1. Consider the diffusion (3.1) and let Assumptions 2 and 3 hold. If there exists a $C^2$ function $V : \mathbb{R}^n \to [1, \infty)$ such that $\limsup_{|x| \to \infty} \mathcal{L}V(x) < 0$, then there exists a unique invariant probability measure µ for (3.1). Furthermore, $|\mathcal{L}V|$ is integrable against µ and $\int \mathcal{L}V(x)\,\mu(dx) = 0$.

Proof. The proof is a continuous-time version of the results in [MT93, Chap. 14]. See also for example [HM08a].

The condition given in Theorem 3.1 is actually an if and only if condition, but the other implication does not appear at first sight to be directly useful. However, it is possible to combine the strong maximum principle with a Lyapunov-type criterion to rule out in certain cases the existence of a function $V$ as in Theorem 3.1. This is the content of the next theorem which provides a constructive criterion for the non-existence of an invariant probability measure with certain integrability properties:

Theorem 3.2. Consider the diffusion (3.1) and let Assumptions 1, 2 and 3 hold. Let furthermore $F : \mathbb{R}^n \to [1, \infty)$ be a continuous weight function. Assume that there exist two $C^2$ functions $W_1$ and $W_2$ such that:

• The function $W_1$ grows in some direction, that is $\limsup_{|x| \to \infty} W_1(x) = \infty$.
• There exists $R > 0$ such that $W_2(x) > 0$ for $|x| > R$.
• The function $W_2$ is substantially larger than $W_1$ in the sense that there exists a positive function $H$ with $\lim_{|x| \to \infty} H(x) = +\infty$ and such that
$$\limsup_{R \to \infty} \frac{\sup_{H(x) = R} W_1(x)}{\inf_{H(x) = R} W_2(x)} = 0.$$
• There exists $R > 0$ such that $\mathcal{L}W_1(x) \ge 0$ and $\mathcal{L}W_2(x) \le F(x)$ for $|x| > R$.

Then the Markov process generated by solutions to (3.1) does not admit any invariant measure µ such that $\int F(x)\,\mu(dx) < \infty$.

Proof. The existence of an invariant measure that integrates $F$ is equivalent to the existence of a positive $C^2$ function $V$ such that $\mathcal{L}V \le -F$ outside of some compact set [MT93, Chap. 14]. The proof of the claim is then a straightforward extension of the proof given for the case $F \equiv 1$ by Wonham in [Won66].

Remark 3.3. If one is able to choose $F \equiv 1$ in Theorem 3.2, then its conclusion is that the system under consideration does not admit any invariant probability measure.


3.2. Convergence speed toward the invariant measure: upper bounds. We still assume in this section that we are in the same setting as previously and that Assumptions 2–3 hold. The strongest kind of convergence result that one can hope to obtain is exponential convergence toward a unique invariant measure. In order to formulate a result of this type, given a positive function $V$, we define a weighted norm on measurable functions by
\[
\|\varphi\|_V = \sup_{x\in\mathbb{R}^n} \frac{|\varphi(x)|}{1 + V(x)}.
\]
We denote the corresponding Banach space by $B_b(\mathbb{R}^n;V)$. Furthermore, given a Markov semigroup $P_t$ over $\mathbb{R}^n$, we say that $P_t$ has a spectral gap in $B_b(\mathbb{R}^n;V)$ if there exists a probability measure $\mu$ on $\mathbb{R}^n$ and constants $C$ and $\gamma > 0$ such that the bound
\[
\|P_t\varphi - \mu(\varphi)\|_V \le C e^{-\gamma t}\,\|\varphi - \mu(\varphi)\|_V
\]
holds for every $\varphi \in B_b(\mathbb{R}^n;V)$. We will also say that a $C^2$ function $V\colon \mathbb{R}^n \to \mathbb{R}_+$ is a Lyapunov function for (3.1) if $\lim_{|x|\to\infty} V(x) = \infty$ and there exists a strictly positive constant $c$ such that $\mathcal{L}V \le -cV$ holds outside of some compact set. With this notation, we have the following version of Harris' theorem [MT93] (see also [HM08b] for an elementary proof):

Theorem 3.4. Consider the diffusion (3.1) and let Assumptions 2 and 3 hold. If there exists a Lyapunov function $V$ for (3.1), then $P_t$ admits a spectral gap in $B_b(\mathbb{R}^n;V)$. In particular, (3.1) admits a unique invariant measure $\mu$, $\int V\,d\mu < \infty$, and convergence of transition probabilities towards $\mu$ is exponential with prefactor $V$.

However, there are situations where exponential convergence simply does not take place. In such situations, one cannot hope to be able to find a Lyapunov function as above, but it is still possible in general to find a $\varphi$-Lyapunov function $V$ in the following sense. Given a function $\varphi\colon \mathbb{R}_+ \to \mathbb{R}_+$, we say that a $C^2$ function $V\colon \mathbb{R}^n \to \mathbb{R}_+$ is a $\varphi$-Lyapunov function if the bound $\mathcal{L}V \le -\varphi(V)$ holds outside of some compact set and if $\lim_{|x|\to\infty} V(x) = \infty$. If such a $\varphi$-Lyapunov function exists, upper bounds on convergence rates toward the invariant measure can be obtained by applying the following criterion from [DFG06,BCG08] (see also [FR05]):

Theorem 3.5. Consider the diffusion (3.1) and let Assumptions 2 and 3 hold. Assume that there exists a $\varphi$-Lyapunov function $V$ for (3.1), where $\varphi$ is some increasing smooth concave function that is strictly sublinear. Then (3.1) admits a unique invariant measure $\mu$ and there exists a positive constant $c$ such that for all $x \in \mathbb{R}^n$, the bound
\[
\|P_t(x,\cdot) - \mu\|_{TV} \le c\,V(x)\,\psi(t)
\]
holds, where $\psi(t) = 1/(\varphi\circ H_\varphi^{-1})(t)$ and $H_\varphi(t) = \int_1^t \frac{ds}{\varphi(s)}$.
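To make the rate in Theorem 3.5 concrete, the following minimal sketch evaluates $\psi(t)$ for the power-law choice $\varphi(s) = s^a$ with $0 < a < 1$. This choice of $\varphi$, and the helper names `h_phi`, `h_phi_inverse` and `psi_power`, are assumptions made here for illustration only; they do not appear in the article. In this case $H_\varphi$ can be inverted by hand and one finds $\psi(t) = (1 + (1-a)t)^{-a/(1-a)}$, a polynomial rate.

```python
def h_phi(u: float, a: float) -> float:
    """H_phi(u) = int_1^u s**(-a) ds for the assumed phi(s) = s**a, 0 < a < 1."""
    return (u ** (1.0 - a) - 1.0) / (1.0 - a)


def h_phi_inverse(t: float, a: float) -> float:
    """Inverse of h_phi in its first argument."""
    return (1.0 + (1.0 - a) * t) ** (1.0 / (1.0 - a))


def psi_power(t: float, a: float) -> float:
    """psi(t) = 1 / phi(H_phi^{-1}(t)) with phi(s) = s**a."""
    return 1.0 / h_phi_inverse(t, a) ** a


if __name__ == "__main__":
    for t in (0.0, 1.0, 10.0, 100.0):
        print(t, psi_power(t, 0.5))
```

For $a = 1/2$ this gives $\psi(t) = 1/(1 + t/2)$, so the upper bound decays like $1/t$ under this assumed $\varphi$.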



3.3. Convergence speed toward the invariant measure: lower bounds. In order to obtain lower bounds on the rate of convergence towards the invariant measure $\mu$, we are going to make use of the following mechanism. Suppose that we know of some function $G$ that, on the one hand, has very heavy (non-integrable) tails under the invariant measure of some Markov process but whose moments, on the other hand, do not grow too fast. Then, this should give a lower bound on the speed of convergence towards the invariant measure, since the moment bounds prevent the process from exploring its heavy tails too quickly. This is made precise by the following elementary result:

Theorem 3.6. Let $X_t$ be a Markov process on a Polish space $\mathcal{X}$ with invariant measure $\mu$ and let $G\colon \mathcal{X} \to [1,\infty)$ be such that:
• There exists a function $f\colon [1,\infty) \to [0,1]$ such that the function $\mathrm{Id}\cdot f\colon y \mapsto y f(y)$ is increasing to infinity and such that $\mu(G \ge y) \ge f(y)$ for every $y \ge 1$.
• There exists a function $g\colon \mathcal{X}\times\mathbb{R}_+ \to [1,\infty)$ increasing in its second argument and such that $\mathbf{E}(G(X_t) \mid X_0 = x_0) \le g(x_0,t)$.
Then, one has the bound
\[
\|\mu_t - \mu\|_{TV} \ge \frac12\, f\bigl((\mathrm{Id}\cdot f)^{-1}(2g(x_0,t))\bigr), \tag{3.2}
\]
where $\mu_t$ is the law of $X_t$ with initial condition $x_0 \in \mathcal{X}$.

Proof. It follows from the definition of the total variation distance and from Chebyshev's inequality that, for every $t \ge 0$ and every $y \ge 1$, one has the lower bound
\[
\|\mu_t - \mu\|_{TV} \ge \mu(G(x) \ge y) - \mu_t(G(x) \ge y) \ge f(y) - \frac{g(x_0,t)}{y}.
\]
Choosing $y$ to be the unique solution to the equation $y f(y) = 2g(x_0,t)$, the result follows.

The problem is that in our case, we do not in general have sufficiently good information on the tail behaviour of $\mu$ to be able to apply Theorem 3.6 as it stands. However, it follows immediately from the proof that the bound (3.2) still holds for a subsequence of times $t_n$ converging to $\infty$, provided that the bound $\mu(G \ge y_n) \ge f(y_n)$ holds for a sequence $y_n$ converging to infinity. This observation allows us to obtain the following corollary, which is of more use to us:

Corollary 3.7. Let $X_t$ be a Markov process on a Polish space $\mathcal{X}$ with invariant measure $\mu$ and let $W\colon \mathcal{X} \to [1,\infty)$ be such that $\int W(x)\,\mu(dx) = \infty$. Assume that there exist $F\colon [1,\infty) \to \mathbb{R}$ and $h\colon [1,\infty) \to \mathbb{R}$ such that:
• $h$ is decreasing and $\int_1^\infty h(s)\,ds < \infty$.
• $F\cdot h$ is increasing and $\lim_{s\to\infty} F(s)h(s) = \infty$.
• There exists a function $g\colon \mathcal{X}\times\mathbb{R}_+ \to \mathbb{R}_+$ increasing in its second argument and such that $\mathbf{E}((F\circ W)(X_t) \mid X_0 = x_0) \le g(x_0,t)$.
Then, for every $x_0 \in \mathcal{X}$, there exists a sequence of times $t_n$ increasing to infinity such that the bound $\|\mu_{t_n} - \mu\|_{TV} \ge h\bigl((F\cdot h)^{-1}(g(x_0,t_n))\bigr)$ holds, where $\mu_t$ is the law of $X_t$ with initial condition $x_0 \in \mathcal{X}$.
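As a hedged illustration of the mechanism behind Theorem 3.6, the sketch below evaluates the right-hand side of (3.2) for an assumed polynomial tail $f(y) = y^{-a}$ with $0 < a < 1$, for which $y f(y) = y^{1-a}$ is increasing to infinity and can be inverted explicitly. The tail exponent `a` and the function name `lower_bound` are hypothetical and chosen only for this example; the value `g_val` stands for the moment bound $g(x_0,t) \ge 1$.

```python
def lower_bound(g_val: float, a: float) -> float:
    """Right-hand side of (3.2) for the assumed tail f(y) = y**(-a), 0 < a < 1.

    g_val plays the role of the moment bound g(x0, t) >= 1.
    """
    # Solve y * f(y) = y**(1 - a) = 2 * g_val for y.
    y = (2.0 * g_val) ** (1.0 / (1.0 - a))
    # At this y one has f(y) - g_val / y = f(y) / 2, as in the proof.
    return 0.5 * y ** (-a)


if __name__ == "__main__":
    # Hypothetical moment bound growing linearly in time: g(x0, t) = 1 + t.
    for t in (0.0, 10.0, 100.0):
        print(t, lower_bound(1.0 + t, a=0.5))
```

Under these assumptions the lower bound decays like $g^{-a/(1-a)}$, so slowly growing moments combined with heavy tails force slow convergence.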


Proof. Since $\int W(x)\,\mu(dx) = \infty$, there exists a sequence $w_n$ increasing to infinity such that $\mu(W(x) \ge w_n) \ge 2h(w_n)$, for otherwise we would have the bound
\[
\int W(x)\,\mu(dx) = 1 + \int_1^\infty \mu(W(x) \ge w)\,dw \le 1 + 2\int_1^\infty h(w)\,dw < \infty,
\]
thus leading to a contradiction. Applying Theorem 3.6 with $G = F\circ W$ and $f = 2h\circ F^{-1}$ concludes the proof.

4. Existence and Non-existence of an Invariant Probability Measure

4.1. Non-existence of an invariant probability measure. The aim of this section is to show that (1.2) does not admit any invariant probability measure if $k > 2$, or if $k = 2$ and $T_\infty > \alpha^2\hat C$. Note first that one has an upper bound on the evolution of the total energy of the system given by
\[
\mathcal{L}H = \gamma T + \gamma T_\infty - \gamma p_0^2,
\]
which suggests that $H$ is a natural choice for the function $W_2$ in Wonham's criterion for the non-existence of an invariant probability measure. It therefore remains to find a function $W_1$ that grows to infinity in some direction (not necessarily all), that is dominated by the energy in the sense that
\[
\lim_{E\to\infty} \frac1E \sup_{H(p,q)=E} W_1(p,q) = 0, \tag{4.1}
\]

and such that $\mathcal{L}W_1 \ge 0$ outside of some compact region $K$. In order to construct $W_1$, we use some of the ideas introduced in [HM08a]. The technique used there was to make a change of variables such that, in the new variables, the motion of the 'fast' oscillator decouples from that of the 'slow' oscillator. In the situation at hand, we wish to show that the energy of the second oscillator grows, so that the relevant regime is the one where that energy is very high. One is then tempted to set
\[
W_1 = H^{-\zeta}(H - H_0), \tag{4.2}
\]
for some (typically small) exponent $\zeta \in (0,1)$, where $H_0$ is a multiple of the energy of the first oscillator, expressed in the 'right' set of variables. In order to compute $\mathcal{L}W_1$, we make use of the following 'chain rule' for $\mathcal{L}$:
\[
\mathcal{L}(f\circ g) = (\partial_i f\circ g)\,\mathcal{L}g_i + (\partial^2_{ij} f\circ g)\,\Gamma(g_i,g_j), \tag{4.3}
\]
(summation over repeated indices is implied), where we defined the 'carré du champ' operator
\[
\Gamma(g_i,g_j) = \gamma T\,\partial_{p_0}g_i\,\partial_{p_0}g_j + \gamma T_\infty\,\partial_{p_1}g_i\,\partial_{p_1}g_j.
\]
(Note that it differs by a factor two from the usual definition in order to keep expressions as compact as possible.) This allows us to obtain the identity
\[
\begin{aligned}
\mathcal{L}W_1 ={}& H^{-\zeta}\bigl(\gamma T + \gamma T_\infty - \gamma p_0^2 - \mathcal{L}H_0\bigr) - \gamma\zeta H^{-\zeta-1}(H - H_0)\bigl(T + T_\infty - p_0^2\bigr)\\
& - 2\gamma\zeta H^{-\zeta-1}\bigl(T p_0(p_0 - \partial_{p_0}H_0) + T_\infty p_1(p_1 - \partial_{p_1}H_0)\bigr)\\
& + \gamma\zeta(\zeta+1)H^{-\zeta-2}(H - H_0)\bigl(T p_0^2 + T_\infty p_1^2\bigr).
\end{aligned} \tag{4.4}
\]
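The identity (4.4) comes from applying the chain rule (4.3) to the two-variable function $(a,b) \mapsto a^{-\zeta}(a-b)$ evaluated at $(H,H_0)$. The sketch below is an illustration written for this purpose, not the author's computation: it checks by central finite differences the elementary partial derivatives of that function that produce the coefficients in (4.4). The function names and the test point are arbitrary assumptions.

```python
def f(a: float, b: float, zeta: float) -> float:
    """f(a, b) = a**(-zeta) * (a - b), so that W_1 = f(H, H_0)."""
    return a ** (-zeta) * (a - b)


def df_da(a: float, b: float, zeta: float) -> float:
    """Closed form: a**(-zeta) - zeta * a**(-zeta-1) * (a - b)."""
    return a ** (-zeta) - zeta * a ** (-zeta - 1.0) * (a - b)


def d2f_da2(a: float, b: float, zeta: float) -> float:
    """Closed form in the grouping that appears in (4.4)."""
    return (-2.0 * zeta * a ** (-zeta - 1.0)
            + zeta * (zeta + 1.0) * a ** (-zeta - 2.0) * (a - b))


def d2f_dadb(a: float, b: float, zeta: float) -> float:
    """Mixed derivative: zeta * a**(-zeta-1)."""
    return zeta * a ** (-zeta - 1.0)


def numeric_checks(a=3.0, b=1.0, zeta=0.3, h=1e-4):
    """Return the three finite-difference errors (first, second, mixed)."""
    d1 = (f(a + h, b, zeta) - f(a - h, b, zeta)) / (2 * h)
    d2 = (f(a + h, b, zeta) - 2 * f(a, b, zeta) + f(a - h, b, zeta)) / h ** 2
    dab = (f(a + h, b + h, zeta) - f(a + h, b - h, zeta)
           - f(a - h, b + h, zeta) + f(a - h, b - h, zeta)) / (4 * h ** 2)
    return (abs(d1 - df_da(a, b, zeta)),
            abs(d2 - d2f_da2(a, b, zeta)),
            abs(dab - d2f_dadb(a, b, zeta)))


if __name__ == "__main__":
    print(numeric_checks())
```

Combining these derivatives with $\mathcal{L}H = \gamma(T + T_\infty - p_0^2)$, $\Gamma(H,H) = \gamma(Tp_0^2 + T_\infty p_1^2)$ and $\Gamma(H,H_0) = \gamma(Tp_0\partial_{p_0}H_0 + T_\infty p_1\partial_{p_1}H_0)$ reproduces the three lines of (4.4).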



Following our heuristic calculation in Sect. 2.1, we expect that at high energies, one has $\mathcal{L}H_0 \approx \gamma T - \gamma\tilde p_0^2$, where $\tilde p_0$ denotes the correct variable in which to express the motion of the oscillator. One would then like to first choose our compact set $K$ sufficiently large so that the expression on the first line of (4.4) is larger than $\delta H^{-\zeta}(1 + p_0^2)$ for some $\delta > 0$. Then, by choosing $\zeta$ sufficiently close to zero, one would like to make the remaining terms sufficiently small so that $\mathcal{L}W_1 > 0$ outside of a compact set. This is made precise by the following lemma:

Lemma 4.1. Let $\mathcal{L}$ be as in (1.3). Assume that there exist a $C^2$ function $H_0\colon \mathbb{R}^4 \to \mathbb{R}$ and strictly positive constants $c$ and $C$ such that, outside of some compact subset of $\mathbb{R}^4$, it satisfies the bounds
\[
\mathcal{L}H_0 \le \gamma(T + T_\infty - p_0^2) - c(1 + p_0^2), \qquad |H_0| + |\partial_{p_0}H_0|^2 + |\partial_{p_1}H_0|^2 \le CH.
\]
If the function $H_0$ furthermore satisfies
\[
\limsup_{E\to\infty} \frac1E \inf_{H(x)=E} H_0(x) < 1, \tag{4.5}
\]
then (1.2) admits no invariant probability measure.

Proof. Setting $W_1$ as in (4.2), we see from (4.4) and the assumptions on $H_0$ that there exists a constant $C > 0$ independent of $\zeta \in (0,1)$ such that the bound
\[
\mathcal{L}W_1 \ge cH^{-\zeta}(1 + p_0^2) - \zeta CH^{-\zeta}(1 + p_0^2)
\]
holds outside of some compact set. Choosing $\zeta < c/C$, it follows that $\mathcal{L}W_1 > 0$ outside of some compact subset of $\mathbb{R}^4$. Assumption (4.5) makes sure that $W_1$ grows to $+\infty$ in some direction and rules out the trivial choice $H_0 \propto H$. Since it follows furthermore from the assumptions that $W_1 \le CH^{1-\zeta}$, (4.1) holds so that the assumptions of Wonham's criterion are satisfied.

The remainder of this section is devoted to the construction of such a function $H_0$, thus giving rise to the following result:

Theorem 4.2. There exists a constant $\hat C$ such that, if either $k > 2$, or $k = 2$ and $T_\infty > \alpha^2\hat C$, the model (1.2) admits no invariant probability measure.

Remark 4.3. As will be seen from the construction, the constant $\hat C$ is really equal to the constant 2p from Sect. 2.1.

Proof. As in [HM08a], we define the Hamiltonian
\[
H_f(P,Q) = \frac{P^2}{2} + \frac{|Q|^{2k}}{2k}
\]
of a 'free' oscillator on $\mathbb{R}^2$ and its generator
\[
\mathcal{L}_0 = P\,\partial_Q - Q|Q|^{2k-2}\,\partial_P. \tag{4.6}
\]

These definitions will be used for all of the remainder of this article, except for Sect. 7. The variables (P, Q) should be thought of as ‘dummy variables’ that will be replaced by for example ( p1 , q1 ) or ( p0 , q0 ) when needed.
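Since $\mathcal{L}_0$ in (4.6) is the generator of the Hamiltonian flow of $H_f$, it must annihilate $H_f$ itself, and this fixes the power of $|Q|$ in the $\partial_P$ term to be $2k-2$. The following sketch checks this numerically with central finite differences; the helper names `h_f` and `apply_l0` and the sample point are assumptions made only for this illustration.

```python
def h_f(P: float, Q: float, k: float) -> float:
    """Free-oscillator Hamiltonian H_f(P, Q) = P**2 / 2 + |Q|**(2k) / (2k)."""
    return 0.5 * P * P + abs(Q) ** (2.0 * k) / (2.0 * k)


def apply_l0(F, P: float, Q: float, k: float, h: float = 1e-6) -> float:
    """Apply L_0 = P d/dQ - Q |Q|**(2k-2) d/dP to F at (P, Q).

    Partial derivatives of F are approximated by central differences.
    """
    dF_dQ = (F(P, Q + h) - F(P, Q - h)) / (2.0 * h)
    dF_dP = (F(P + h, Q) - F(P - h, Q)) / (2.0 * h)
    return P * dF_dQ - Q * abs(Q) ** (2.0 * k - 2.0) * dF_dP


if __name__ == "__main__":
    k = 2.0
    value = apply_l0(lambda P, Q: h_f(P, Q, k), 0.7, -1.3, k)
    print(value)  # close to zero, up to finite-difference error
```

Applying the same routine to the coordinate function $Q$ returns $P$, consistent with $\mathcal{L}_0$ encoding $\dot Q = P$.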


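We also define $\Phi$ as the unique centred¹ solution to the Poisson equation $\mathcal{L}_0\Phi = Q - R(P,Q)$, where $R\colon \mathbb{R}^2 \to \mathbb{R}$ is a smooth function averaging out to zero on level sets of $H_f$, and such that $R = 0$ outside of a compact set and $R = Q$ inside an open set containing the origin. The reason for introducing the correction term $R$ is so that the function $\Phi$ is smooth everywhere including the origin, which would not be the case otherwise. It follows from [HM08a, Prop. 3.7] that $\Phi$ scales like $H_f^{\frac1k - \frac12}$ in the sense that, outside a compact set, it can be written as $\Phi = H_f^{\frac1k - \frac12}\,\Phi_0(\omega)$, where $\omega$ is the angle variable conjugate to $H_f$.

Inspired by the formal calculation from Sect. 2.1, we then define $\tilde p_0 = p_0 - \alpha\Phi(p_1,q_1)$, so that the equations of motion for the first oscillator turn into
\[
\begin{aligned}
dq_0 ={}& \tilde p_0\,dt + \alpha\Phi\,dt,\\
d\tilde p_0 ={}& -q_0|q_0|^{2k-2}\,dt - \alpha q_0\,dt - \gamma p_0\,dt + \sqrt{2\gamma T}\,dw_0(t)\\
& + \alpha R\,dt - \alpha^2(q_0 - q_1)\partial_P\Phi\,dt - \alpha\sqrt{2\gamma T_\infty}\,\partial_P\Phi\,dw_1(t) - \alpha\gamma T_\infty\,\partial_P^2\Phi\,dt\\
& + \alpha R_1'(q_1)\partial_P\Phi\,dt - R_1'(q_0)\,dt.
\end{aligned} \tag{4.7}
\]
Here, we omitted the argument $(p_1,q_1)$ from $\Phi$, its partial derivatives, and $R$ in order to make the expressions shorter. Setting
\[
\tilde H_0 = \frac{\tilde p_0^2}{2} + V_{\mathrm{eff}}(q_0) + \theta\tilde p_0 q_0, \qquad V_{\mathrm{eff}}(q) = V_1(q) + \alpha\frac{q^2}{2}, \tag{4.8}
\]
we obtain the following identity:
\[
\begin{aligned}
\mathcal{L}\tilde H_0 ={}& \gamma T - (\gamma-\theta)p_0^2 - \theta|q_0|^{2k} - \alpha\theta|q_0|^2 - \gamma\theta p_0 q_0\\
& + \alpha^2(\gamma-\theta)\Phi^2 + \alpha(\gamma-\theta)\tilde p_0\Phi + \alpha V_{\mathrm{eff}}'(q_0)\Phi\\
& + \alpha\tilde p_0 R - \alpha^2\tilde p_0(q_0 - q_1)\partial_P\Phi - \alpha\gamma T_\infty\tilde p_0\,\partial_P^2\Phi + \alpha^2\gamma T_\infty(\partial_P\Phi)^2\\
& + \theta q_0\bigl(\alpha R - \alpha^2(q_0 - q_1)\partial_P\Phi - \alpha\gamma T_\infty\,\partial_P^2\Phi\bigr)\\
& + (\tilde p_0 + \theta q_0)\,\alpha R_1'(q_1)\partial_P\Phi.
\end{aligned} \tag{4.9}
\]
All the terms on lines 3 to 5 (and also the terms on line 2 provided that $k > 2$) are of the form $f(p_0,q_0)g(p_1,q_1)$ with $g$ a function going to 0 at infinity and $f$ a function such that $f(p_0,q_0)/H_f(p_0,q_0)$ goes to 0 at infinity. It follows that, for every $\varepsilon > 0$, there exists a compact set $K_\varepsilon \subset \mathbb{R}^4$ such that, outside of $K_\varepsilon$, one has the inequality
\[
\begin{aligned}
\mathcal{L}\tilde H_0 \le{}& \gamma T - \gamma p_0^2 + \varepsilon + \Bigl(\theta + \frac{\theta\gamma^2}{4\alpha} + \varepsilon\Bigr)p_0^2 - (\theta - \varepsilon)|q_0|^{2k}\\
& + \alpha^2(\gamma-\theta)\Phi^2 + \alpha(\gamma-\theta)\tilde p_0\Phi + \alpha V_{\mathrm{eff}}'(q_0)\Phi.
\end{aligned} \tag{4.10}
\]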

1 We say that a function on R2 is centred if it averages to 0 along orbits of the Hamiltonian system with Hamiltonian Hf .



Here, we also used the fact that $\gamma\theta p_0 q_0 \le \alpha\theta|q_0|^2 + \frac{\gamma^2\theta}{4\alpha}p_0^2$. If $k > 2$, then the function $\Phi$ also converges to 0 at infinity, so that the bound
\[
\mathcal{L}\tilde H_0 \le \gamma T - \gamma p_0^2 + \varepsilon + \Bigl(\theta + \frac{\theta\gamma^2}{4\alpha} + \varepsilon\Bigr)p_0^2 - (\theta - \varepsilon)|q_0|^{2k}
\]
holds outside of a sufficiently large compact set. It follows that the conditions of Lemma 4.1 are satisfied by $H_0 = (1+\delta)\tilde H_0$ for $\delta > 0$ sufficiently small whenever $T_\infty > 0$, provided that one also chooses both $\theta$ and $\varepsilon$ sufficiently small.

The case $k = 2$ is slightly more subtle and we assume that $k = 2$ for the remainder of this proof. In particular, this implies that $\Phi$ scales like a constant outside of some compact set. This suggests that the term $\Phi^2$ should average out to a constant, whereas the terms $\tilde p_0\Phi$ and $V_{\mathrm{eff}}'(q_0)\Phi$ should average out to zero, modulo some lower-order corrections. It turns out that these corrections will have the unfortunate property that they grow faster than $H_f$ in the $(p_0,q_0)$ variables. On the other hand, we notice that both $\tilde p_0\Phi$ and $V_{\mathrm{eff}}'(q_0)\Phi$ do grow slower than $H_f$ at infinity. As a consequence, it is sufficient to compensate these terms for 'low' values of $(p_0,q_0)$.

Before giving the precise expression for a function $H_0$ that satisfies the assumptions of Lemma 4.1 for the case $k = 2$, we make some preliminary calculations. We denote by $\psi\colon \mathbb{R} \to \mathbb{R}_+$ a smooth decreasing 'cutoff function' such that $\psi(x) = 1$ for $x \le 1$ and $\psi(x) = 0$ for $x \ge 2$. Given a positive constant $E$, we also set
\[
\psi_E(\tilde p_0,q_0) = \psi\Bigl(\frac{H_f(\tilde p_0,q_0)}{E}\Bigr), \qquad \psi_E' = \frac1E\,\psi'\Bigl(\frac{H_f}{E}\Bigr), \qquad \psi_E'' = \frac1{E^2}\,\psi''\Bigl(\frac{H_f}{E}\Bigr).
\]

Definition 4.4. We will say that a function $f\colon \mathbb{R}_+\times\mathbb{R}^4 \to \mathbb{R}$ is negligible if, for every $\varepsilon > 0$, there exists $E_\varepsilon > 0$ and, for every $E > E_\varepsilon$, there exists a compact set $K_{E,\varepsilon} \subset \mathbb{R}^4$ such that the bound $|f(E;p,q)| \le \varepsilon\bigl(1 + H_f(\tilde p_0,q_0)\bigr)$ holds for every $(p,q) \notin K_{E,\varepsilon}$.

With this definition at hand, we introduce the notations
\[
f \lesssim g, \qquad f \sim g, \tag{4.11}
\]
to mean that there exists a negligible function $h$ such that $f \le g + h$ or $f = g + h$ respectively. With this notation, we can rewrite (4.10) as
\[
\mathcal{L}\tilde H_0 \lesssim \gamma T - \gamma_\theta p_0^2 - \theta|q_0|^{2k} + \alpha^2(\gamma-\theta)\Phi^2 + f_\theta\Phi, \tag{4.12}
\]
where we introduced the constant $\gamma_\theta = \gamma - \theta\bigl(1 + \frac{\gamma^2}{4\alpha}\bigr)$ and the function $f_\theta = \alpha(\gamma-\theta)\tilde p_0 + \alpha V_{\mathrm{eff}}'(q_0)$.

Lemma 4.5. Let $a, b \ge 0$ and let $f, g\colon \mathbb{R}^2 \to \mathbb{R}$ be functions that scale like $H_f^a$ and $H_f^{-b}$ respectively. Then, the following functions are negligible:
i) $f(\tilde p_0,q_0)g(p_1,q_1)\psi_E(\tilde p_0,q_0)$, provided that $b > 0$.
ii) $f(\tilde p_0,q_0)g(p_1,q_1)\psi_E'(\tilde p_0,q_0)$, provided that $b > 0$ or $a < 2$.
ii') $f(\tilde p_0,q_0)g(p_1,q_1)\bigl(\psi_E'(\tilde p_0,q_0)\bigr)^2$, provided that $b > 0$ or $a < 3$.
iii) $f(\tilde p_0,q_0)g(p_1,q_1)\psi_E''(\tilde p_0,q_0)$, provided that $b > 0$ or $a < 3$.
iv) $f(\tilde p_0,q_0)g(p_1,q_1)\bigl(1 - \psi_E(\tilde p_0,q_0)\bigr)$, provided that $a < 1$.
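The article only requires the cutoff $\psi$ to be smooth, decreasing, equal to 1 for $x \le 1$ and equal to 0 for $x \ge 2$; it does not fix a formula. The sketch below shows one standard construction with these properties, built from the function $e^{-1/x}$, together with the rescaled cutoff $\psi_E = \psi(H_f/E)$ that enters Lemma 4.5. The specific formula and the names `psi` and `psi_e` are assumptions for illustration, not the author's choice.

```python
import math


def _bump(x: float) -> float:
    """Smooth function vanishing for x <= 0 and positive for x > 0."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0


def psi(x: float) -> float:
    """One possible smooth decreasing cutoff: 1 for x <= 1, 0 for x >= 2."""
    num = _bump(2.0 - x)
    # The denominator never vanishes: 2 - x > 0 or x - 1 > 0 always holds.
    return num / (num + _bump(x - 1.0))


def psi_e(h_f_value: float, energy: float) -> float:
    """Rescaled cutoff psi_E = psi(H_f / E), with H_f passed in as a number."""
    return psi(h_f_value / energy)


if __name__ == "__main__":
    for x in (0.5, 1.0, 1.25, 1.5, 1.75, 2.0, 3.0):
        print(x, psi(x))
```

With this choice, each derivative of $\psi_E$ with respect to $H_f$ carries one factor of $1/E$, which is the mechanism behind the thresholds $a < 2$ and $a < 3$ in cases ii) to iii).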


Proof. We assume without loss of generality that the bounds $f(p,q) \le 1 \vee H_f^a(p,q)$ and $g(p,q) \le 1 \wedge H_f^{-b}(p,q)$ hold for every $(p,q) \in \mathbb{R}^2$. In the case i), we take $E_\varepsilon = 1$ and choose for the complement of $K_{E,\varepsilon}$ the set of points such that either $H_f(\tilde p_0,q_0) \ge 2E$, in which case the expression vanishes, or $H_f(p_1,q_1) \ge (2E)^{a/b}\varepsilon^{-1/b}$, in which case the expression is smaller than $\varepsilon$. The case ii) with $b > 0$ follows exactly like the case i), so we consider the case $a < 2$ and $b = 0$. Since $\psi_E' = 0$ if $H_f(\tilde p_0,q_0) \ge 2E$ and is smaller than $1/E$ otherwise, we have the bound
\[
|f(\tilde p_0,q_0)g(p_1,q_1)\psi_E'(\tilde p_0,q_0)| \le \bigl(1 + H_f(\tilde p_0,q_0)\bigr)E^{(0\vee(a-1))-1}.
\]
Since the exponent of $E$ appearing in this expression is negative provided that $a < 2$, this is shown to be negligible by choosing $E_\varepsilon$ sufficiently large and setting $K_{E,\varepsilon} = \emptyset$. Cases ii') and iii) follow in a nearly identical manner. In the case iv), we use the fact that since $a < 1$, for fixed $\varepsilon > 0$, we can find a constant $C_\varepsilon$ such that $|f(\tilde p_0,q_0)| \le \frac{\varepsilon}{2}H_f(\tilde p_0,q_0) + C_\varepsilon$. We then set $E_\varepsilon = 2C_\varepsilon/\varepsilon$, so that $H_f(\tilde p_0,q_0) \ge E_\varepsilon$ implies $H_f(\tilde p_0,q_0) \ge 2C_\varepsilon/\varepsilon$. Since $g$ is bounded by 1 by assumption and since $1 - \psi_E$ vanishes for $H_f(\tilde p_0,q_0) \le E$, it follows that the expression iv) is uniformly bounded by $\varepsilon H_f(\tilde p_0,q_0)$ for $E \ge E_\varepsilon$.

Remark 4.6. In the case where both $b > 0$ and $a < 1$, the function $f(\tilde p_0,q_0)g(p_1,q_1)$ is negligible, which can be seen from cases i) and iv) above.

Corollary 4.7. In the setting of Lemma 4.5, the following functions are negligible:
v) $f(\tilde p_0,q_0)g(p_1,q_1)\,\partial_{p_0}\psi_E(\tilde p_0,q_0)$, provided that $b > 0$ or $a < 3/2$.
vi) $f(\tilde p_0,q_0)g(p_1,q_1)\,\partial_{p_1}\psi_E(\tilde p_0,q_0)$.
vii) $f(\tilde p_0,q_0)g(p_1,q_1)\,\mathcal{L}\psi_E(\tilde p_0,q_0)$, provided that $b > \frac12 - \frac1k$, or $b = \frac12 - \frac1k$ and $a < 1$.

Proof. We can write
\[
\begin{aligned}
f(\tilde p_0,q_0)g(p_1,q_1)\,\partial_{p_0}\psi_E(\tilde p_0,q_0) &= \tilde p_0 f(\tilde p_0,q_0)g(p_1,q_1)\psi_E',\\
f(\tilde p_0,q_0)g(p_1,q_1)\,\partial_{p_1}\psi_E(\tilde p_0,q_0) &= -\alpha\tilde p_0 f(\tilde p_0,q_0)g(p_1,q_1)\,\partial_P\Phi(p_1,q_1)\psi_E',
\end{aligned}
\]
so that the first two cases can be reduced to case ii) of Lemma 4.5. For case vii), we use the fact that
\[
\mathcal{L}\psi_E = \psi_E'\,\mathcal{L}H_f + \gamma\bigl(T + T_\infty(\partial_P\Phi)^2\bigr)\tilde p_0^2\,\psi_E'', \tag{4.13}
\]
and that $\mathcal{L}H_f$ consists of terms that all scale like $H_f^c(\tilde p_0,q_0)H_f^d(p_1,q_1)$ with $c \le 1$ and $d \le \frac1k - \frac12$ (see (4.9)) to reduce ourselves to cases ii) and iii) of Lemma 4.5.

Before we proceed with the proof of Theorem 4.2, we state two further preliminary results that will turn out to be useful also for the analysis of the case $k \in (1,2)$:

Lemma 4.8. Let $k \in (1,2]$ and let $f\colon \mathbb{R}^2 \to \mathbb{R}$ be a function that scales like $H_f^a$ for some $a \in \mathbb{R}$. Then, the function $g = \mathcal{L}(f(\tilde p_0,q_0))$ consists of terms that are bounded by multiples of $H_f^c(\tilde p_0,q_0)H_f^d(p_1,q_1)$ with either $c \le a + \frac12 - \frac1{2k}$ and $d \le 0$, or $c \le a - \frac1{2k}$ and $d \le \frac1k - \frac12$.



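Proof. It follows from (4.7) that
\[
\begin{aligned}
g ={}& \bigl(-\alpha q_0 - \gamma(\tilde p_0 + \alpha\Phi) + \alpha R - \alpha^2(q_0 - q_1)\partial_P\Phi - \alpha\gamma T_\infty\,\partial_P^2\Phi\bigr)\partial_P f\\
& + \bigl(\alpha R_1'(q_1)\partial_P\Phi - R_1'(q_0)\bigr)\partial_P f + \gamma\bigl(T + T_\infty(\partial_P\Phi)^2\bigr)\partial_P^2 f\\
& + (\tilde p_0 + \alpha\Phi)\,\partial_Q f - q_0|q_0|^{2k-2}\,\partial_P f,
\end{aligned}
\]
from which the claim follows by simple power counting.

Lemma 4.9. Let $k \in (1,2]$ and let $f\colon \mathbb{R}^2 \to \mathbb{R}$ be a function that scales like $H_f^{-b}$ for some $b \in \mathbb{R}$. Then, the function $g = \mathcal{L}(f(p_1,q_1)) - (\mathcal{L}_0 f)(p_1,q_1)$ consists of terms that are bounded by multiples of $H_f^c(\tilde p_0,q_0)H_f^d(p_1,q_1)$ with either $c \le \frac1{2k}$ and $d \le -b - \frac12$, or $c \le 0$ and $d \le -b - \frac12 + \frac1{2k}$.

Proof. It follows from (1.2) that
\[
g = \alpha(q_0 - q_1)\,\partial_P f - R_1'(q_1)\,\partial_P f + \gamma T_\infty\,\partial_P^2 f, \tag{4.14}
\]
from which the claim follows.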

We now return to the proof of Theorem 4.2. We define $\Psi$ as the unique centred solution to the equation $\mathcal{L}_0\Psi = \Phi$. One can see in a similar way as before that $\Psi$ scales like $H_f^{-\frac14}$. Since $\Phi$ scales like a constant, there exists some constant $\hat C$ such that $\Phi^2$ averages to $\hat C$ outside a compact set. While the constant $\hat C$ cannot be expressed in simple terms, it is easy to compute it numerically: $\hat C \approx 0.6354699$.² In particular, there exists a function $\hat R\colon \mathbb{R}_+ \to \mathbb{R}_+$ with compact support and such that $\Phi^2 - \hat C + \hat R(H_f(P,Q))$ is centred. Denote by $\Xi$ the centred solution to the equation
\[
\mathcal{L}_0\Xi = \Phi^2 - \hat C + \hat R(H_f(P,Q)), \tag{4.15}
\]
so that $\Xi$ scales like $H_f^{-\frac14}$, just like $\Psi$ does. With these definitions at hand, we set
\[
H_0 = \tilde H_0 - \bigl(\alpha^2(\gamma-\theta)\,\Xi(p_1,q_1) + f_\theta\,\Psi(p_1,q_1)\bigr)\psi_E(\tilde p_0,q_0), \tag{4.16}
\]
where we used the function $f_\theta$ introduced in (4.12). Recalling that $f_\theta$ consists of terms scaling like $H_f^a(\tilde p_0,q_0)$ with $a \le \frac34$, we obtain from Lemmas 4.9 and 4.5 that
\[
f_\theta\,\mathcal{L}\bigl(\Psi(p_1,q_1)\bigr)\psi_E = f_\theta\Phi - f_\theta\Phi(1 - \psi_E) + f_\theta(\mathcal{L} - \mathcal{L}_0)\Psi\,\psi_E \sim f_\theta\Phi.
\]
Similarly, we obtain that
\[
\mathcal{L}\bigl(\Xi(p_1,q_1)\bigr)\psi_E = \Phi^2 - \hat C - \bigl(\Phi^2 - \hat C\bigr)(1 - \psi_E) + \hat R\,\psi_E \sim \Phi^2 - \hat C.
\]

2 All displayed digits are accurate.


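It therefore follows from (4.12), the facts that $\partial_{p_0}\tilde p_0 = 1$ and $\partial_{p_1}\tilde p_0 = -\alpha\,\partial_P\Phi(p_1,q_1)$, and the multiplication rule for $\mathcal{L}$, that one has the bound
\[
\begin{aligned}
\mathcal{L}H_0 \lesssim{}& \gamma T - \gamma_\theta p_0^2 - \theta|q_0|^{2k} + \alpha^2(\gamma-\theta)\hat C\\
& - \bigl(\alpha^2(\gamma-\theta)\Xi + f_\theta\Psi\bigr)\mathcal{L}\psi_E - \Psi\,\mathcal{L}f_\theta\,\psi_E\\
& + C|\partial_P\Xi\,\partial_{p_1}\psi_E| + C|f_\theta\,\partial_P\Psi\,\partial_{p_1}\psi_E|\\
& + C\bigl|\Psi\,\partial_P f_\theta\bigl(1 + \alpha^2(\partial_P\Phi)^2\bigr)\psi_E'\bigr| + C|\partial_P\Psi\,\partial_P f_\theta\,\partial_P\Phi\,\psi_E|.
\end{aligned}
\]
The terms on the second and third line are negligible by Lemma 4.8 and Corollary 4.7. The terms on the last line are negligible by Lemma 4.5, so that we finally obtain the bound
\[
\mathcal{L}H_0 \lesssim \gamma(T + \alpha^2\hat C) - \gamma_\theta p_0^2 - \theta|q_0|^{2k} - \alpha^2\theta\hat C. \tag{4.17}
\]
Since the constant $\gamma_\theta$ can be made arbitrarily close to $\gamma$ by choosing $\theta$ sufficiently small, we see as before that, provided that $T_\infty > \alpha^2\hat C$, it is possible to choose $\theta$ small enough and $E$ large enough so that replacing $H_0$ by $(1+\delta)H_0$ with $\delta > 0$ sufficiently small again allows us to satisfy the conditions of Lemma 4.1. This concludes the proof of Theorem 4.2.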

4.2. Existence of an invariant measure. Theorem 4.2 has the following converse:

Theorem 4.10. If either $1 < k < 2$, or $k = 2$ and $T_\infty < \alpha^2\hat C$, the model (1.2) admits a unique invariant probability measure $\mu$. The constant $\hat C$ is the same as in Theorem 4.2.

Proof. Somewhat surprisingly, given that the two statements are almost diametrically opposite, it is possible to prove this positive result in a very similar way to the previous negative result by constructing the right kind of Lyapunov function. As before, the case $k = 2$ will be treated somewhat differently.

The case $k = 2$. Similarly to what we did in (4.2), the idea is to look at the function $V = H - cH_0$ for a suitable constant $c$, but this time we choose it in such a way that $\lim_{|(p,q)|\to\infty} V = \infty$ and $\limsup_{|(p,q)|\to\infty} \mathcal{L}V < 0$, so that we can apply Theorem 3.1. Note that, with the same notations as in the proof of Theorem 4.2, one has from (4.9),
\[
\mathcal{L}\tilde H_0 \sim \gamma T - (\gamma-\theta)p_0^2 - \theta|q_0|^{2k} - \alpha\theta|q_0|^2 - \gamma\theta p_0 q_0 + \alpha^2(\gamma-\theta)\Phi^2 + f_\theta\Phi,
\]
so that, provided this time that we choose $\theta < 0$ in the definition of $\tilde H_0$ (and therefore of $H_0$), we have the bound
\[
\begin{aligned}
\mathcal{L}H_0 &\sim \gamma T + \alpha^2(\gamma-\theta)\hat C - (\gamma-\theta)p_0^2 - \theta|q_0|^{2k} - \alpha\theta|q_0|^2 - \gamma\theta p_0 q_0\\
&\gtrsim \gamma T + \alpha^2(\gamma-\theta)\hat C - \gamma_\theta p_0^2 - \theta|q_0|^{2k},
\end{aligned}
\]
where we set $\gamma_\theta = \gamma - \theta\bigl(1 + \frac{\gamma^2}{4\alpha}\bigr)$ as before. Here, the function $H_0$ is as in (4.16) and depends on a large parameter $E$ as above. If we choose $c < 1$, the function
\[
V = H - cH_0 \tag{4.18}
\]



does then indeed grow to infinity in all directions and we have
\[
\mathcal{L}V \lesssim \gamma T(1-c) + \gamma T_\infty - c\alpha^2(\gamma-\theta)\hat C - c|\theta||q_0|^{2k} - (\gamma - c\gamma_\theta)|p_0|^2.
\]
If the assumption $\alpha^2\hat C > T_\infty$ is satisfied, we can find a constant $\beta > 0$ such that $\gamma T(1-c) + \gamma T_\infty - c\alpha^2(\gamma-\theta)\hat C \le -\beta$ for all $\theta$ sufficiently small and all $c$ sufficiently close to 1. By fixing $c$ and making $\theta$ sufficiently small, we can furthermore ensure that $\gamma - c\gamma_\theta > 0$. This shows that, by first choosing $c$ sufficiently close to 1, then making $\theta$ very small and finally choosing $E$ very large, we have constructed a function $V$ satisfying the assumptions of Theorem 3.1, thus concluding the proof in the case $k = 2$.

The case $k < 2$. Even though one would expect this to be the easier case, it turns out to be tricky because the approximate decoupling of the oscillators at high energies is no longer such a good description of the dynamics. The idea is to consider again the variable $\tilde p_0$ introduced previously but, because the function $\Phi$ is now no longer bounded, we are going to multiply certain correction terms by a 'cutoff function'. Since we are following a similar line of proof to the non-existence result and since we expect from (2.5) and (2.6) to be able to find a function $V$ close to $H$ and such that it asymptotically satisfies a bound of the type $\mathcal{L}V \approx -H_f(\tilde p_0,q_0) - H_f^{\frac2k-1}(p_1,q_1)$, this suggests that we should introduce the following notion of a negligible function suited to this particular case:

Definition 4.11. A function $f\colon \mathbb{R}^4 \to \mathbb{R}$ is negligible if, for every $\varepsilon > 0$, there exists a compact set $K_\varepsilon$ such that the bound $|f(p,q)| \le \varepsilon\bigl(H_f(\tilde p_0,q_0) + H_f^{\frac2k-1}(p_1,q_1)\bigr)$ holds for every $(p,q) \notin K_\varepsilon$.

We also introduce the notations $\sim$ and $\lesssim$ similarly to before. For $\theta > 0$, we then set $\hat V = H + \theta\tilde p_0 q_0$, so that (4.7) yields
\[
\begin{aligned}
\mathcal{L}\hat V ={}& \gamma(T + T_\infty) - \gamma p_0^2 + \alpha\theta\tilde p_0\Phi + \theta\tilde p_0^2 - \theta|q_0|^{2k} - \alpha\theta|q_0|^2 - \gamma\theta p_0 q_0\\
& + \theta\alpha q_0\bigl(R - \alpha(q_0 - q_1)\partial_P\Phi - \gamma T_\infty\,\partial_P^2\Phi\bigr)\\
& + \theta q_0\bigl(\alpha R_1'(q_1)\partial_P\Phi - R_1'(q_0)\bigr).
\end{aligned}
\]
It is straightforward to check that all of the terms on the second and third lines are negligible. Using the definition of $\tilde p_0$ and completing the square for the term $\alpha|q_0|^2 + \gamma p_0 q_0$, we thus obtain the bound
\[
\mathcal{L}\hat V \lesssim -\gamma_\theta\tilde p_0^2 - \theta|q_0|^{2k} + c_\theta\tilde p_0\Phi - \alpha_\theta\Phi^2. \tag{4.19}
\]
Here, we defined the constants
\[
\alpha_\theta \overset{\mathrm{def}}{=} \alpha\gamma\Bigl(\alpha - \frac{\gamma\theta}{4}\Bigr), \qquad c_\theta \overset{\mathrm{def}}{=} \alpha\theta - 2\alpha\gamma + \tfrac12\gamma^2\theta,
\]
in order to shorten the expressions.
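The constants in (4.19) can be checked by direct algebra. After completing the square, the quadratic part of $\mathcal{L}\hat V$ in $(p_0,\tilde p_0,\Phi)$ is $-(\gamma - \theta\gamma^2/(4\alpha))p_0^2 + \theta\tilde p_0^2 + \alpha\theta\tilde p_0\Phi$, and substituting $p_0 = \tilde p_0 + \alpha\Phi$ should give $-\gamma_\theta\tilde p_0^2 + c_\theta\tilde p_0\Phi - \alpha_\theta\Phi^2$. The sketch below verifies this at arbitrary sample values; the function names and the numbers are assumptions used only for the check.

```python
def quadratic_before(p_tilde, phi, alpha, gamma, theta):
    """Quadratic terms of L V-hat after completing the square in q_0."""
    p0 = p_tilde + alpha * phi  # definition of p-tilde solved for p_0
    return (-(gamma - theta * gamma ** 2 / (4.0 * alpha)) * p0 ** 2
            + theta * p_tilde ** 2
            + alpha * theta * p_tilde * phi)


def quadratic_after(p_tilde, phi, alpha, gamma, theta):
    """Right-hand side of (4.19) without the |q_0|^{2k} term."""
    gamma_theta = gamma - theta * (1.0 + gamma ** 2 / (4.0 * alpha))
    c_theta = alpha * theta - 2.0 * alpha * gamma + 0.5 * gamma ** 2 * theta
    alpha_theta = alpha * gamma * (alpha - gamma * theta / 4.0)
    return (-gamma_theta * p_tilde ** 2
            + c_theta * p_tilde * phi
            - alpha_theta * phi ** 2)


if __name__ == "__main__":
    args = (0.8, -1.7, 1.3, 0.9, 0.05)
    print(quadratic_before(*args), quadratic_after(*args))
```

The two expressions agree identically in all five variables, which confirms the stated forms of $\gamma_\theta$, $c_\theta$ and $\alpha_\theta$.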


As before, we see that there exists a positive constant $\bar C$ and a function $\bar R\colon \mathbb{R}^2 \to \mathbb{R}_+$ with compact support such that $\Phi^2 - \bar C H_f^{\frac2k-1} + \bar R$ is smooth, centred, and vanishes in a neighbourhood of the origin. Similarly to (4.15), we define $\Xi$ as the unique centred solution to
\[
\mathcal{L}_0\Xi(P,Q) = \Phi^2(P,Q) - \bar C H_f^{\frac2k-1}(P,Q) + \bar R(P,Q),
\]
and $\Psi$ as the unique centred solution to $\mathcal{L}_0\Psi = \Phi$. Note that $\Xi$ scales like $H_f^{\frac5{2k}-\frac32}$ and that $\Psi$ scales like $H_f^{\frac3{2k}-1}$.

At this stage, we would like to define V = Vˆ + αθ ( p1 , q1 ) − cθ p˜ 0 ( p1 , q1 ) in order to compensate for the last two terms in (4.19). The problem is that when applying the generator to p˜ 0 , we obtain an unwanted term of the type q0 |q0 |2k−2 , which grows too fast in the q0 direction. We note however that the term p˜ 0 only needs to be compensated when | p˜ 0 | , which is the regime in which the description (4.7) is expected to be relevant. We therefore consider the same cutoff function ψ as before and we set 1 + Hf ( p˜ 0 , q0 ) ˆ V = V + αθ ( p1 , q1 ) − cθ p˜ 0 ( p1 , q1 )ψ , (4.20) (1 + Hf ( p1 , q1 ))η for a positive exponent η to be determined later. In order to obtain bounds on LV , we make use of the fact that Lemma 4.9 still applies to the present situation. In particular, we can apply it to the function , thus obtaining the bound 2 k −1 LV −Cθ Hf ( p˜ 0 , q0 ) + Hf ( p1 , q1 ) + cθ ( p˜ 0 − L( p˜ 0 ( p1 , q1 )ψ)),

for some constant Cθ , where it is understood that the function ψ is composed with the ratio appearing in (4.20). Using the fact that L0 = by definition and applying the chain rule (4.3) for L, we thus obtain 2 k −1 LV −Cθ Hf ( p˜ 0 , q0 ) + Hf ( p1 , q1 ) −cθ ((L p˜ 0 ) ψ + p˜ 0 Lψ + p˜ 0 (L − L0 ) + p˜ 0 L (ψ − 1)))

−cθ T ∂ p1 p˜ 0 ∂ P ψ + ∂ p1 p˜ 0 ∂ p1 ψ + p˜ 0 ∂ P ∂ p1 ψ − cθ T∞ ∂ p0 ψ. (4.21) We claim that all the terms appearing on the second and the third line of this expression are negligible, thus concluding the proof. The most tricky part of showing this is to obtain bounds on Lψ. Define E 0 = 1 + Hf ( p˜ 0 , q0 ) and E 1 = 1 + Hf ( p1 , q1 ) as a shorthand. Our main tool in bounding LV is then the following result which shows that the terms containing Lψ are negligible: Proposition 4.12. Provided that η ∈ [2 − k, k], there exists a constant C such that η 1 Lψ E 0η ≤ C, ∂ p ψ E 0η ≤ C E − 2 , ∂ p ψ E 0η ≤ C E − 2 . 0 1 1 1 E1 E1 E1



Proof. Define the function f : R2+ → R+ by f (x, y) = ψ((1 + x)/(1 + y)η ). It can then be checked by induction that, for every pair of positive integers m and n with m + n > 0 and for every real number β, there exists a constant C such that the bound |∂xm ∂ yn f | ≤ (1 + x)−m+β (1 + y)−n−ηβ

(4.22)

holds uniformly in x and y. It furthermore follows from (4.7) and (1.2) that 1 1 1 1− 2k k−2 , E1 |LE 0 | ≤ C E 0 + E 0 1 1 1 1 + |LE 1 | ≤ C E 12 2k + E 02k E 12 , 1/2

|∂ p0 E 0 | ≤ C E 0 , 1 2

1 k −1

|∂ p1 E 0 | ≤ C E 0 E 1

|∂ p0 E 1 | = 0, ,

1

|∂ p1 E 1 | ≤ C E 12 .

Combining these two bounds with (4.22) and the chain rule (4.3), the required bounds follow. Let us now return to the bound on LV . It is straightforward to check that 1 1 1 1− 2k k −2 , + E1 |L p˜ 0 | ≤ C E 0 for some constant C, so that 3 5 1− 1 −1 −3 | ( p1 , q1 )L p˜ 0 | ≤ C E 0 2k E 12k + E 12k 2 , which is negligible. Combining Proposition 4.12 with the scaling behaviours of and , one can check in a similar way that the term p˜ 0 Lψ, as well as all the terms appearing on the third line of (4.21) are also negligible. It therefore remains to bound p˜ 0 (L − L0 ) and p˜ 0 L (ψ − 1). It follows from (4.14) that 1 1 3 3 1 2 3 2k − 2 2k 2k 2k k −2 E0 + E1 ≤ C E0 + E1 , (4.23) |(L − L0 ) | ≤ C E 1 1

so that | p˜ 0 (L − L0 ) | is negligible as well. Since we know that L0 scales like E 1k it follows from (4.23) that 1 3 1 −3 1− 1 | p˜ 0 L | ≤ C E 02 E 12k 2 E 02k + E 1 2k .

− 21

,

This term has of course no chance of being negligible: we have to use the fact that it is η multiplied by 1 − ψ. The function 1 − ψ is non-vanishing only when E 0 ≥ E 1 , so that we obtain 1 1 1 3 3 1 1 1 1 + + ( − ) + ( − ) | p˜ 0 L (1 − ψ)| ≤ C E 02k 2 η 2k 2 + E 02 η k 2 .


We see that both exponents are strictly smaller than 1, provided that $\eta > \frac2k - 1$. Combining all of these estimates with (4.21), we see that, provided that $\eta \in (\frac2k - 1, k)$, there exists a constant $C$ such that
\[
\mathcal{L}V \lesssim -C\bigl(E_0 + E_1^{\frac2k-1}\bigr).
\]
In particular, using the scaling of the correction terms entering $V$, we deduce the existence of a constant $c$ such that the bound
\[
\mathcal{L}V \le -cV^{\frac2k-1} \tag{4.24}
\]
holds outside of a sufficiently large compact set (we can choose such a set so that $V$ is positive outside), thus concluding the proof of Theorem 4.10 by applying Theorem 3.1.

5. Integrability Properties of the Invariant Measure

The aim of this section is to explore the integrability properties of the invariant measure $\mu$ when it exists. First of all, we show the completely unsurprising fact that:

Proposition 5.1. For all ranges of parameters for which there exists an invariant measure $\mu$, one has $\int \exp(\beta H(x))\,\mu(dx) = \infty$ for every $\beta > 1/T$.

Proof. Choose $\beta > \beta_2 > 1/T$. Setting $W_2(x) = \exp(\beta_2 H(x))$, we have
\[
\mathcal{L}W_2 = \gamma\beta_2 W_2\bigl(T + T_\infty - p_0^2 + \beta_2(T p_0^2 + T_\infty p_1^2)\bigr) \le \exp(\beta H),
\]
outside of a sufficiently large compact set. Setting similarly $W_1 = \exp(H(x)/T)$, we see immediately from a similar calculation that $\mathcal{L}W_1 \ge 0$, so that the result follows from Theorem 3.2.

Remark 5.2. Actually, one can show similarly a slightly stronger result, namely that there exists some exponent $\alpha < 1$ such that $H^\alpha\exp(H/T)$ is not integrable against $\mu$.

5.1. Energy of the first oscillator. What is maybe slightly more surprising is that the tail behaviour of the distribution of the energy of the first oscillator is not very strongly influenced by the presence of an infinite-temperature heat bath just next to it, provided that we look at the correct set of variables. Indeed, we have:

Proposition 5.3. Let either $\frac32 \le k < 2$, or $k = 2$ and $T_\infty$ be such that there exists an invariant probability measure $\mu$. Then $\int \exp\bigl(\beta H_f(\tilde p_0,q_0)\bigr)\,\mu(dx) < \infty$ for every $\beta < 1/T$.

Remark 5.4. When $k = 2$, $\Phi$ is bounded and the exponential integrability of $H_f(\tilde p_0,q_0)$ is equivalent to that of $H_f(p_0,q_0)$. This is however not the case when $k < 2$.

Remark 5.5. The borderline case $k = \frac32$ is expected to be optimal if we restrict ourselves to the variables $(\tilde p_0,q_0)$. This is because for $k < \frac32$ one would have to add additional correction terms taking into account the nonlinearity of the pinning potential.
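The sign argument in the proof of Proposition 5.1 can be illustrated numerically. By the chain rule (4.3), $\mathcal{L}e^{\beta H} = \gamma\beta e^{\beta H}\bigl(T + T_\infty - p_0^2 + \beta(Tp_0^2 + T_\infty p_1^2)\bigr)$, and the sketch below evaluates the bracket for sample values. The function name `bracket` and all parameter values are arbitrary assumptions chosen for illustration.

```python
def bracket(beta, p0, p1, T, T_inf):
    """Factor multiplying gamma * beta * exp(beta * H) in L exp(beta * H)."""
    return T + T_inf - p0 ** 2 + beta * (T * p0 ** 2 + T_inf * p1 ** 2)


if __name__ == "__main__":
    T, T_inf = 2.0, 5.0  # hypothetical temperatures
    # At beta = 1/T the p0**2 terms cancel and the bracket is positive.
    print(bracket(1.0 / T, 10.0, 3.0, T, T_inf))
    # For beta < 1/T the bracket becomes negative at large p0 (with p1 = 0).
    print(bracket(0.5 / T, 10.0, 0.0, T, T_inf))
```

At $\beta = 1/T$ the bracket reduces to $T + T_\infty + T_\infty p_1^2/T > 0$, which is the statement $\mathcal{L}W_1 \ge 0$ used in the proof.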



The main ingredient in the proof of Proposition 5.3 is the following proposition, which is also going to be very useful for the non-integrability results later in this section.

Proposition 5.6. For every $\theta > 0$, there exist functions $H_0, \hat p_0\colon \mathbb{R}^4 \to \mathbb{R}$ and a constant $C_\theta$ such that
• For every $\varepsilon > 0$, there exists a constant $C_\varepsilon$ such that the bounds
\[
0 \le H_0 \le (1+\varepsilon)H + C_\varepsilon \tag{5.1}
\]
hold.
• Provided that $k \ge \frac32$, for every $\varepsilon > 0$ there exists a constant $C_\varepsilon$ such that the bound
\[
(1-\varepsilon)H_f(\tilde p_0,q_0) - C_\varepsilon \le H_0 \le (1+\varepsilon)H_f(\tilde p_0,q_0) + C_\varepsilon \tag{5.2}
\]
holds.
• One has the bounds
\[
(\partial_{p_0}H_0 - \hat p_0)^2 \le C_\theta + \theta^4 H_0, \tag{5.3a}
\]
\[
(\partial_{p_1}H_0)^2 \le C_\theta + \theta^4 H_0. \tag{5.3b}
\]
• If furthermore $k \ge \frac32$, the bound $\mathcal{L}H_0 \le C_\theta - (\gamma - 2\theta)\hat p_0^2 - \theta H_0$ holds.
• If $k \in (4/3, 3/2)$ then, for every $\delta > (2k-1)\bigl(\frac3k - 2\bigr)$, one has the bound $\mathcal{L}H_0 \le C_\theta - (\gamma - 2\theta)\hat p_0^2 - \theta H_0 + \theta^2 H_f^\delta(p_1,q_1)$.

Remark 5.7. The presence of $\tilde p_0$ rather than $\hat p_0$ in (5.2) is not a typographical mistake.

Proof. We start by defining the differential operator $\mathcal{K}$ acting on functions $F\colon \mathbb{R}^2 \to \mathbb{R}$ as

KF = γ T∞ ∂ P2 F ( p1 , q1 ) + α qˆ0 − q ( p1 , q1 ) − q1 − R1 (q1 ) (∂ P F)( p1 , q1 ), so that KF = L(F( p1 , q1 )) − (L0 F)( p1 , q1 ). Setting pˆ 0 = p0 + p ( p1 , q1 ),

η

qˆ0 = q0 + q ( p1 , q1 )ψ(E 0 /E 1 ),

(5.4)

for some yet to be defined functions p and q and for E i and ψ as in Proposition 4.12, we then obtain

d qˆ0 = pˆ 0 dt + L0 q − p dt +ψKq dt + (ψ − 1)L0 q dt + q Lψ dt + γ T∞ ∂ p1 ψ∂ P q dt

+ 2γ T∞ ψ∂ P q + q ∂ p1 ψ dw1 (t) + 2γ T q ∂ p0 ψ dw0 (t), d pˆ 0 = −Veff (qˆ0 ) dt − γ pˆ 0 dt + 2γ T dw0

+ L0 p − αq1 + γ p dt +K p dt + Veff (qˆ0 ) − Veff (q0 ) dt + 2γ T∞ ∂ P p dw1 (t), (5.5) 2

where we defined as before the effective potential Veff (q) = V1 (q) + α q2 . (1) Let E > 0 and set p as the unique centred solution to

L0 (1) p = α Q 1 − ψ Hf (P, Q)/E ,

(5.6)


where ψ is the same cutoff function already used previously. We then define p by (2) (1) (1) (2) L0 p = γ p and we set p = p + p . This ensures that one has the identity L0 p − αq1 + γ p = R p , 3 where the function R p consists of terms that scale like Hfa with a ≤ 2k − 1. We furthermore set q to be the unique centred solution to L0 q = p . Note that p consists of terms scaling like Hfa with a ≤ k1 − 21 and that q consists of terms scaling like Hfa 3 with a ≤ 2k − 1. The introduction of the parameter E in (5.6) ensures that we can make functions scaling like a negative power of Hf arbitrarily small in the supremum norm. 1

It follows indeed that one has for example |∂ P p | ≤ C E k −1 . With these definitions at hand, it follows from (5.5) that

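\[
\begin{aligned}
d\hat q_0 &= \hat p_0\,dt + \sqrt{2\gamma T_\infty}\,\bigl(\psi\,\partial_P\Phi_q + \Phi_q\,\partial_{p_1}\psi\bigr)\,dw_1(t) + \sqrt{2\gamma T}\,\Phi_q\,\partial_{p_0}\psi\,dw_0(t) \\
&\quad + \psi K\Phi_q\,dt + (\psi-1)\Phi_p\,dt + \Phi_q L\psi\,dt + \gamma T_\infty\,\partial_{p_1}\psi\,\partial_P\Phi_q\,dt, \\
d\hat p_0 &= -V_{\rm eff}'(\hat q_0)\,dt - \gamma\hat p_0\,dt + \sqrt{2\gamma T}\,dw_0 + R_p\,dt + K\Phi_p\,dt \\
&\quad + \bigl(V_{\rm eff}'(\hat q_0) - V_{\rm eff}'(q_0)\bigr)\,dt + \sqrt{2\gamma T_\infty}\,\partial_P\Phi_p\,dw_1(t),
\end{aligned} \tag{5.7}
\]
where $\Phi_p$ and $\Phi_q$ denote the correction functions entering the definitions (5.4) of $\hat p_0$ and $\hat q_0$. Let now $H_0$ be defined by
\[ H_0 = \frac{\hat p_0^2}{2} + V_{\rm eff}(\hat q_0) + \theta\,\hat p_0 \hat q_0 + C_0, \]
where $C_0$ is a sufficiently large constant so that $H_0 \ge 1$. Note that, as a consequence of the definitions of $\hat p_0$ and $\hat q_0$, if k ≥ 3/2 then $|\hat p_0 - \tilde p_0|$ and $|\hat q_0 - q_0|$ are bounded, so that the two-sided bound (5.2) does indeed hold. Showing that the weaker one-sided bound (5.1) holds for every k ∈ [3/2, 2] is straightforward.

Before we turn to the proof of (5.3), let us recall the definitions of $E_0$ and $E_1$ from the proof of the case k < 2 of Theorem 4.10, and define similarly $\hat E_0 = 1 + H_f(\hat p_0, \hat q_0)$. If k ≥ 3/2, then $\hat E_0$ and $E_0$ are equivalent in the sense that they are bounded by multiples of each other. If k < 3/2, this is not the case, but it follows from the definitions of $\Phi_p$ and $\Phi_q$ that
\[ E_0 \le C\bigl(\hat E_0 + E_1^{3-2k}\bigr), \qquad \hat E_0 \le C\bigl(E_0 + E_1^{3-2k}\bigr). \]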

It follows that, provided that we impose the condition η > 3 − 2k, where η is the exponent appearing in (5.4), one has the implications
\[ E_0 \le C E_1^{\eta} \;\Longrightarrow\; \hat E_0 \le \tilde C E_1^{\eta}, \tag{5.8a} \]
\[ E_0 \ge C E_1^{\eta} \;\Longrightarrow\; \hat E_0 \ge \tilde C E_1^{\eta}, \tag{5.8b} \]

for some constant C˜ depending on C. We will assume from now on that the condition η > 3 − 2k is indeed satisfied. Let us now show that (5.3a) holds. We have the identity ∂ p0 H0 − pˆ 0 = θ qˆ0 + Veff (qˆ0 ) + θ pˆ 0 q ∂ p0 ψ. Since the term θ qˆ0 satisfies the required bound, we only need to worry about the second term. It follows from Proposition 4.12 and from the scaling of q that this term is


1

3

η

1− −1− 2 bounded by a multiple of Eˆ 0 2k E 12k . Since the bounds (5.8) hold on the support 3

η

−1− 2k 1/2 of ∂ p0 ψ, this in turn is bounded by a multiple of Eˆ 0 E 12k , so that the requested bound follows, provided again that the condition η > 3 − 2k holds. Turning to (5.3b), we have the identity ∂ p1 H0 = ( pˆ 0 + θ qˆ0 )∂ P p + Veff (qˆ0 ) + θ pˆ 0 ∂ p1 (q ψ).

Making use of the parameter E introduced in (5.6), it follows that the first term is 2k 1/2 1 bounded by Eˆ 0 E k −1 , which can be made sufficiently small by choosing E θ 1−k . In order to bound the second term, we expand the last factor into q ∂ p1 ψ + ψ∂ P q . The first term can be bounded just as we did for ∂ p0 H0 , noting that the bound on ∂ p1 ψ in Proposition 4.12 is better than the bound on ∂ p0 ψ. Using the fact that (5.8a) holds 1

η−3

(1− 1 )

k , which on the support of ψ, the second term yields a bound of the form Eˆ 02 E 1 2 yields the required bound provided that η < 3. It therefore remains to show the bound on LH0 . It follows from (5.7) that one has the identity

LH0 = γ T − (γ − θ ) pˆ 02 − θ |qˆ0 |2k − αθ |qˆ0 |2 − γ θ pˆ 0 qˆ0

2 +γ T∞ (∂ P p )2 + Veff (qˆ0 ) ∂ p1 (q ψ) + 2θ ∂ P p ∂ p1 (q ψ) +γ T Veff (qˆ0 )(q ∂ p0 ψ)2 + 2θ q ∂ p0 ψ +( pˆ 0 + θ qˆ0 ) R p + K p + Veff (qˆ0 ) − Veff (q0 ) + Veff (qˆ0 ) + θ pˆ 0

× ψKq + (ψ − 1) p + q Lψ + γ T∞ ∂ p1 ψ∂ P q . (5.9) We now use the following notion of a negligible function. A function f : R+ × R4 → R is negligible if, for every ε > 0 there exists a constant E ε and, for every E > E ε , there exists a constant Cε such that the bound | f (E; p, q)| ≤ Cε + ε Eˆ 0 + E 1δ holds, where δ is as in the statement of the proposition. (Set δ = 0 for k ≥ 3/2.) With this notation, the required bounds follow if we can show that all the terms appearing in (5.9) are negligible, except for those on the first line. The terms appearing in the second line are all smaller than the last term appearing in ∂ p1 H0 and so they are negligible. Similarly, the terms appearing in the third line are smaller than those appearing in ∂ p1 H0 − pˆ 0 . It is easy to see that the first termon the fourth line is negligible. Concerning the 1 3 −1 , so that this term is also seen to second term, we see that |K p | ≤ C Eˆ 02k + E 12k be negligible by power counting. Note now that the definitions of Veff and qˆ0 imply that one has the bound 3 −1 Veff (qˆ0 ) − Veff (q0 ) ≤ C 1 + |qˆ0 |2k−2 + |q0 |2k−2 Hf2k ( p1 , q1 ) 3 3 (2k−2)( 2k −1) −1 E 12k ≤ C 1 + |qˆ0 |2k−2 + E 1 3 3 1− 1 −1 (2k−1)( 2k −1) . ≤ C Eˆ 0 k E 12k + E 1


η Furthermore, one has Veff (qˆ0 ) = Veff (q0 ), unless Eˆ 0 ≤ E 1 , so that we have the bound 3 1 η( 1 − 1 )+ 3 −1 (2k−1)( 2k −1) . pˆ 0 Veff (qˆ0 ) − Veff (q0 ) ≤ C Eˆ 0 E 1 2 k 2k + Eˆ 02 E 1

The second term is always negligible. Furthermore, if η > (3 − 2k)/(2 − k), the first term is also negligible.

We now turn to the last line in (5.9), writing $\Phi_p$ and $\Phi_q$ for the correction functions from (5.4). In order to bound the term involving $K\Phi_q$, note that the functions $\Phi_q\,\partial_P\Phi_q$, $\partial_P^2\Phi_q$, and $Q\,\partial_P\Phi_q$ are bounded provided that k ≥ 4/3, so that the terms involving these expressions are negligible. Concerning the term $V_{\rm eff}'(\hat q_0)\,\hat q_0\,\partial_P\Phi_q$, we use the fact that $\partial_P\Phi_q$ can be made arbitrarily small by choosing E large enough in (5.6) to conclude that it is also negligible. The term involving $\Phi_p$ is bounded by a multiple of $\hat E_0^{\,1 - \frac{1}{2k} + \frac{1}{\eta}(\frac{1}{k} - \frac{1}{2})}$, so that it is negligible provided that η > 2 − k. The term involving $\Phi_q L\psi$ is bounded similarly, using the fact that $L\psi$ is bounded by Proposition 4.12 and that $\Phi_q$ scales like a smaller power of $H_f$ than $\Phi_p$. Finally, the last term is negligible since $\partial_{p_1}\psi\,\partial_P\Phi_q$ is bounded, thus concluding the proof of Proposition 5.6.

Note that the choice η = 2, for example, satisfies all the conditions that we had to impose on η for k ∈ [4/3, 2]. We are now able to give the

Proof of Proposition 5.3. It follows from (5.2) that if we can show that $\exp(\beta H_0)$ is integrable with respect to µ for every β < 1/T, then the same is also true for $\exp(\beta H_f(\tilde p_0, q_0))$, provided that we restrict ourselves to the range k ≥ 3/2. Before we proceed, we also note that (5.3a) implies that, for θ sufficiently small, one has the bound
\[ (\partial_{p_0} H_0)^2 \le (1+\theta)\hat p_0^2 + \tilde C_\theta + \theta^2 H_0 \]
for some constant $\tilde C_\theta$. Setting $W = \exp(\beta H_0)$, we thus have the bound
\[ \frac{LW}{\beta W} = L H_0 + \gamma\beta\bigl(T(\partial_{p_0}H_0)^2 + T_\infty(\partial_{p_1}H_0)^2\bigr) \le C_\theta - (\gamma - 2\theta)\hat p_0^2 - \theta H_0 + \gamma\beta T(1+\theta)\hat p_0^2 + C\theta^2 H_0 \]
for some constant C independent of θ. Since we assumed that β < 1/T, we can make θ sufficiently small so that $-(\gamma - 2\theta) + \gamma\beta T(1+\theta) < 0$ and $C\theta^2 - \theta < 0$. The claim then follows from Theorem 3.1.

5.2. Integrability and non-integrability in the case k = 2. We next show that if k = 2 and $T_\infty \le \alpha^2\hat C$, then the invariant measure is heavy-tailed in the sense that there exists an exponent ζ such that $\int H^\zeta(x)\,\mu(dx) = \infty$. Our precise result is given by:

Theorem 5.8. If k = 2 and $T_\infty \le \alpha^2\hat C$, one has $\int H^\zeta(x)\,\mu(dx) = \infty$, provided that
\[ \zeta > \zeta_\star \stackrel{\rm def}{=} \frac{3}{4}\,\frac{\alpha^2\hat C - T_\infty}{T_\infty}. \]
Conversely, one has $\int H^\zeta(x)\,\mu(dx) < \infty$ for $\zeta < \zeta_\star$.



Proof. We first show the positive result, namely that $H^\zeta$ is integrable with respect to µ for any $\zeta < \zeta_\star$, where $\zeta_\star$ is the threshold exponent of Theorem 5.8. Fixing such a ζ, our aim is to construct a smooth function W bounded from below such that, for some small value ε > 0, the bound $LW \le -\varepsilon H^\zeta$ holds outside of some compact set. This then immediately implies the required integrability by Theorem 3.1.

Consider the function V defined in (4.18). Note that this function depends on parameters E, θ and c and that, for any given value of ε > 0, it is possible to choose first θ sufficiently small and c sufficiently close to 1, and then E sufficiently large, so that the bound
\[ LV \le \gamma T_\infty - \alpha^2\gamma\hat C + \varepsilon \]
holds outside of some compact set. Let us now turn to the behaviour of $\partial_{p_0}V$ and $\partial_{p_1}V$. It follows from the definitions, Lemma 4.5, and Corollary 4.7 that one has the identity
\[ (\partial_{p_0}V)^2 = (1-c)^2 p_0^2 + R_0, \]
where the function $R_0$ can be bounded by an arbitrarily small multiple of V outside of some sufficiently large compact set. Furthermore, it follows from the definition of V and the construction of $H_0$ that one has the bound $V \ge \frac{1-c}{2}H$ outside of some compact set, so that we have the bound
\[ (\partial_{p_0}V)^2 \le 4(1-c)V + R_0. \]
Ensuring first that 1 − c ≤ ε/8 and then choosing E sufficiently large, it follows that we can ensure that $(\partial_{p_0}V)^2 \le \varepsilon V$ outside of a sufficiently large compact set. It follows in a similar way that, by possibly choosing E even larger, the bound
\[ (\partial_{p_1}V)^2 \le p_1^2 + \varepsilon V \]
holds outside of some compact set. Note now that since
\[ L_0(PQ) = 3P^2 - 4H_f, \tag{5.10} \]

˜ : R2 → R be a centred compactly the function P 2 − 43 Hf is centred. Let furthermore R 4 2 ˜ supported function such that P − 3 Hf + R vanishes in a neighbourhood of the origin ˜ be the centred solution to and let ˜ = P2 − L0

4 ˜ Hf + R, 3

(5.11)

so that we have the identity ˜ p1 , q1 ) = p12 − L(

4 ˜ p1 , q1 ) + α(q0 − q1 ) − R1 (q1 ) ∂ P ˜ ( p1 , q1 ). Hf ( p1 , q1 ) + R( 3

Furthermore, it follows at once from the definition of V and the scaling behaviours of and that the bound Hf ( p1 , q1 ) ≤ (1 + ε)V,


˜ is bounded and scales like holds outside of some compact set. Since furthermore R 3

Hf4 , it follows that the bound 4 ˜ p1 , q1 ) ≥ p12 − (1 + ε)V, L( 3 holds outside of some (possibly larger) compact set. Finally, it follows from the scaling ˜ that the bounds of ˜ ˜ ≤ εV, |LV | ≤ εV and |∂ p1 V ∂ p1 |

(5.12)

hold outside of some sufficiently large compact set. With all these definitions at hand, we consider the function ˜ p1 , q1 ). W = V ζ +1 − γ ζ (ζ + 1)T∞ V ζ (

(5.13)

Note that V is positive outside of a compact set, so that W is well-defined there. Since we do not care about compactly supported modifications of W , we can assume that (5.13) makes sense globally. We then have the identity

2

2 ˜ LW = (ζ + 1)V ζ LV + ζ γ (ζ + 1)V ζ −1 T ∂ p0 V + T∞ ∂ p1 V − T∞ L ˜ . ˜ + γ T∞ ∂ p1 V ∂ p1 − γ ζ 2 (ζ + 1)T∞ V ζ −1 LV Collecting all of the bounds obtained above, this in turn yields the bound 4 ζ 2 ˆ ζ LW ≤ (ζ + 1)V γ T∞ − α γ C + ε + ζ γ (ζ + 1)V T ε + T∞ ε + T∞ (1 + ε) 3 − γ εζ 2 (ζ + 1)T∞ V ζ 4 ≤ γ (ζ + 1) T∞ − α 2 Cˆ + ζ T∞ + K ε V ζ , 3 holding for some constant K > 0 independent of ε outside of some sufficiently large compact set. It follows that if ζ < ζ , it is possible to choose ε sufficiently small so that the prefactor in this expression is negative, thus yielding the desired result. We now prove the ‘negative result’, namely that H ζ is not integrable with respect to µ if ζ > ζ . In order to show this, we are going to apply Wonham’s criterion with W2 = H 1+ζ . It therefore suffices to find a function W1 growing to infinity in some direction, such that LW1 > 0 outside of some compact set, and such that sup

H ( p,q)=E

W1 ( p, q)E −1−ζ → 0

(5.14)

as E → ∞. We are going to construct W1 in a way very similar to the construction in the proof of the positive result above. Fix some arbitrarily small ε > 0 as before. Setting V as above, note first that it follows immediately from (4.17) that, by choosing first θ sufficiently small, then c sufficiently close to 1 and finally E large enough, we can ensure that the bound LV ≥ γ T∞ − α 2 γ Cˆ − ε(1 + p02 )



holds outside of some sufficiently large compact set. Similarly as before, we can also ensure that the bound
\[ (\partial_{p_1}V)^2 \ge p_1^2 - \varepsilon V \]
holds. Fix now some $\tilde\zeta \in (\zeta_\star, \zeta)$, where $\zeta_\star$ is the threshold exponent of Theorem 5.8, and define $W_0$ as in (5.13), but with $\tilde\zeta$ replacing ζ. It follows that the bound
\[ LW_0 \ge \gamma(\tilde\zeta+1)\Bigl(T_\infty - \alpha^2\hat C + \tfrac{4}{3}\tilde\zeta T_\infty - K\varepsilon(1+p_0^2)\Bigr)V^{\tilde\zeta} \]
holds for some constant K > 0 outside of some compact set. The problem is that the right hand side of this expression is not everywhere positive because of the appearance of the term $p_0^2$. This can however be dealt with by setting
\[ W_1 = W_0 - K\varepsilon H^{1+\tilde\zeta}, \tag{5.15} \]
so that
\[ LW_1 \ge \gamma(\tilde\zeta+1)\Bigl(T_\infty - \alpha^2\hat C + \tfrac{4}{3}\tilde\zeta T_\infty - \tilde K\varepsilon\Bigr)V^{\tilde\zeta} \]
for some different constant $\tilde K$. Since $\tilde\zeta > \zeta_\star$, we can ensure that this term is uniformly positive by choosing ε sufficiently small. By possibly making ε even smaller, we can furthermore guarantee that $W_1$ grows in some direction, despite the presence of the term $-K\varepsilon H^{1+\tilde\zeta}$ in (5.15). Finally, the condition (5.14) is guaranteed to hold because we chose $\tilde\zeta < \zeta$.

As a corollary of Theorem 5.8, we obtain:

Corollary 5.9. If k = 2 and $\alpha^2\hat C > T_\infty > \frac{3}{7}\alpha^2\hat C$, then even though the system admits a unique invariant measure µ, the average kinetic energy of the second oscillator is infinite, that is $\int p_1^2\,\mu(dx) = \infty$.

Proof. The proof is very similar to the proof of the "negative part" of Theorem 5.8. However, instead of choosing $W_2 = H^2$, we choose $W_2 = H^2 + K p_1 q_1$ for some constant K. Since this additional term does not change the behaviour of $W_2$ at infinity, the conclusions of Wonham's criterion still apply, showing that $(LW_2)_+$ is not integrable with respect to µ. A simple explicit calculation shows that, provided that K is large enough, there exists a positive constant C such that $LW_2 \le C\bigl(1 + H_f(p_0, q_0) + p_1^2\bigr)$. On the other hand, we know that the expectation of $H_f(p_0, q_0)$ is finite under µ by the remark following Proposition 5.3, so that the expectation of $p_1^2$ under µ necessarily diverges.

5.3. Integrability and non-integrability in the case k < 2. In this case, we show that the exponential of a suitable fractional power of H is integrable with respect to the invariant measure. Our positive result is given by:

Theorem 5.10. For every k ∈ (1, 2) there exists δ > 0 such that
\[ \int_{R^4} \exp\bigl(\delta H^{\frac{2}{k}-1}(x)\bigr)\,\mu(dx) < \infty, \tag{5.16} \]
where µ is the unique invariant measure for (1.2).


Proof. Define $W = \exp(\delta V^\kappa)$ for a (small) constant δ > 0 and an exponent κ ∈ (0, 1] to be determined later (the optimal exponent will turn out to be κ = 2/k − 1). Here, V is the function that was previously defined in (4.20). Since V and H are equivalent in the sense that there exist positive constants $C_1$ and $C_2$ such that $C_1^{-1}V - C_2 \le H \le C_1 V + C_2$, showing the integrability of W implies (5.16) for a possibly different constant δ. Applying the chain rule (4.3), we obtain outside of a sufficiently large compact set the bound
\[ LW = \delta\kappa W\Bigl(V^{\kappa-1}LV + \bigl(\delta\kappa V^{2\kappa-2} + (\kappa-1)V^{\kappa-2}\bigr)\Gamma(V,V)\Bigr) \le \delta\kappa W V^{\kappa-1}\Bigl(LV + 2\delta\kappa V^{\kappa-1}\Gamma(V,V)\Bigr). \tag{5.17} \]
Note now that it follows immediately from (4.20) and Proposition 4.12 that, outside of some compact set, one has the bounds
\[ |\partial_{p_0}V| \le C\bigl(E_0^{1/2} + E_1^{1/2}\bigr) \le C\sqrt{V}, \qquad |\partial_{p_1}V| \le C\bigl(E_0^{1/2} + E_1^{1/2}\bigr) \le C\sqrt{V}, \]
so that $\Gamma(V,V) \le CV$. Combining this with (4.24), we obtain the existence of constants c and C (possibly depending on κ, but not depending on δ) such that
\[ LW \le \delta W V^{\kappa-1}\bigl(C + C\delta V^{\kappa} - cV^{\frac{2}{k}-1}\bigr), \tag{5.18} \]
thus concluding the proof.
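The last step of this proof can be made explicit. The following is only a sketch of the sign argument behind (5.18) for the choice κ = 2/k − 1, treating the two constants written C in (5.18) as a single constant and assuming V > 0 outside a compact set; the thresholds δ ≤ c/(2C) and $V^\kappa \ge 4C/c$ are illustrative choices, not taken from the text.

```latex
% Sketch: sign of the bracket in (5.18) when \kappa = 2/k - 1,
% so that V^{2/k-1} = V^{\kappa}. Thresholds are illustrative.
\begin{align*}
C + C\delta V^{\kappa} - cV^{\frac{2}{k}-1}
  &= C - (c - C\delta)\,V^{\kappa} \\
  &\le C - \tfrac{c}{2}\,V^{\kappa}
  \qquad\text{whenever } \delta \le \tfrac{c}{2C}, \\
\intertext{so that, as soon as $V^{\kappa} \ge 4C/c$,}
LW &\le \delta W V^{\kappa-1}\bigl(C - \tfrac{c}{2}V^{\kappa}\bigr)
   \le -\tfrac{c\,\delta}{4}\, W V^{2\kappa-1} < 0 .
\end{align*}
```

The same computation shows why larger exponents are out of reach with this argument: if κ > 2/k − 1, the positive term $C\delta V^\kappa$ eventually dominates $cV^{2/k-1}$ for every δ > 0.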

We have the following partial converse to Theorem 5.10: Theorem 5.11. Let k ∈ ( 43 , 2). Then, there exists > 0 such that 2 exp H k −1 (x) µ (d x) = ∞,

(5.19)

R4

where µ is the unique invariant measure for (1.2). Proof. We are again going to make use of Wonham’s criterion. Let K˜ be a (sufficiently large) constant, define κ = 2k − 1 ∈ (0, 21 ), set F(x) = exp(H κ (x)), and set W2 (x) = exp K˜ H κ (x) . We then have the bound LW2 = H κ−1 T + T∞ − p02 + H κ−2 (κ − 1 + κ K˜ H κ ) T p02 + T∞ p12 γ κ K˜ W2 ≤ C 1 + H 2κ−1 , for some constant C > 0. In particular, we have LV ≤ F outside of some compact set, provided that we choose > K˜ .



˜ the centred solution to Similarly to 5.11, we denote by ˜ = Hf − L0

k+1 2 ˜ P + R, 2k

˜ ensuring that the right hand side vanishes in for some compactly supported function R a neighbourhood of the origin. Let now K be any constant smaller than K˜ , let M be a (large) positive constant to be determined later, and set def ˜ p1 , q 1 ) = exp(K H1 ), W1 = exp K H κ − 2H κ−1 H0 + M H 2κ−2 ( ˜ was defined above. Note that the where H0 is the function from Proposition 5.6 and properties of H0 imply that, outside of some compact set, one has the bounds 1 ≤ H0 ≤ (1 + ε)H. It is clear that W2 is much larger than W1 at infinity, so that it remains to show that LW1 > 0 outside of a compact set for K sufficiently large. We are actually going to show that there exists a constant C such that (LW1 )/W1 ≥ C H 2κ−1 outside of some compact set. Therefore, we call a function f negligible if, for every ε > 0, there exists a compact set such that | f | ≤ ε H 2κ−1 outside of this set. Note that since we consider the range of parameters such that κ < 21 , bounded functions are not negligible in general. Using the chain rule (4.3), we have the identity LW1 = LH1 + γ K T (∂ p0 H1 )2 + T∞ (∂ p1 H1 )2 K W1 ≥ LH1 + γ K T∞ (∂ p1 H1 )2 . We first turn to the estimate of LH1 . Using again (4.3), we have the identity ˜ LH1 = κ H κ−1 + 2(1 − κ)H κ−2 H0 LH − 2H κ−1 LH0 + M H 2κ−2 L ˜ ˜ + γ M(2κ − 2)(2κ − 3)H 2κ−4 T p02 + T∞ p12 + 2(κ − 1)M H 2κ−3 LH + γ (κ − 1)H κ−3 (κ H + 2(2 − κ)H0 ) T p02 + T∞ p12

+ 2γ (1 − κ)H κ−2 T p0 ∂ p0 H0 + T∞ p1 ∂ p1 H0 ˜ + γ M T∞ (2κ − 2)H 2κ−3 p1 ∂ P . ˜ scales like a power of We see immediately that since κ is strictly positive and since the energy strictly smaller than one, all terms except for the ones on the first line are negligible. Furthermore, it follows from (5.1) that κ κ−1 κ H κ−1 + 2(1 − κ)H κ−2 H0 ≤ 2 − H , 2 say. Combining this with Proposition 5.6 and the fact that the inequality κ > (2k − 1)( k3 − 2) holds in the range of parameters under consideration, we obtain the lower bound κ ˜ − 2 p02 + 2(γ − 2θ ) pˆ 02 + 2θ H0 + M H 2κ−2 L. LH1 H κ−1 γ 2

166

M. Hairer

Using the definition of pˆ 0 , and choosing θ < γ κ/8, we obtain the existence of a constant C such that ˜ LH1 H κ−1 2θ H0 − C Hfκ ( p1 , q1 ) + M H 2κ−2 L0 . ˜ in order to replace L ˜ by L0 . ˜ Here, we also made use of the scaling properties of Note that the constant C appearing in the expression above can be made independent of θ provided that we restrict ourselves to θ ≤ γ κ/16, say. At this point, we make the choice M = 2C and we set c = k+1 2k , so that we have the lower bound M κ Hf ( p1 , q1 ) + M H 2κ−2 (Hf ( p1 , q1 ) − cp12 ) LH1 H κ−1 2θ H0 − 2 M κ−1 κ Hf ( p1 , q1 ) + M H 2κ−2 (H0 + Hf ( p1 , q1 ) − cp12 ), − H 2 where we made use of the fact that, since κ < 1, for every constant C, there is a compact set such that H κ−1 H0 ≥ C H 2κ−2 H0 outside of that compact set. From the definitions of H and H0 , we see that there exists a constant C and a compact set outside of which CH0 + Hf ( p1 , q1 ) > 43 H , say, so that we finally obtain the lower bound M 2κ−1 H − cM H 2κ−2 p12 . 4

LH1

(5.20)

Let us now turn to the term (∂ p1 H1 )2 . We have the identity ˜ p1 ∂ p1 H1 = H κ−2 (κ H + 2(1 − κ)H0 ) p1 + 2M(κ − 1)H 2κ−3 ˜ − 2H κ−1 ∂ p1 H0 + M H 2κ−2 ∂ P . 2

Using the inequality $(a+b)^2 \ge \frac{a^2}{2} - b^2$, as well as the bound (5.3b), it follows that there exists a constant C such that the bound
\[ (\partial_{p_1}H_1)^2 \ge \frac{\kappa^2}{2}H^{2\kappa-2}p_1^2 - 16\theta^4 H^{2\kappa-2}H_0 - CH^{2\kappa-2} \]
holds. Combining this bound with (5.20), we obtain, up to negligible terms, the lower bound
\[ \frac{LW_1}{KW_1} \gtrsim \frac{M}{4}H^{2\kappa-1} + \Bigl(\frac{\gamma K T_\infty\kappa^2}{2} - cM\Bigr)H^{2\kappa-2}p_1^2 - 16\gamma K T_\infty\theta^4 H^{2\kappa-2}H_0. \]
We now choose $K = 2cM/(\gamma T_\infty\kappa^2)$ so that the second term vanishes. The prefactor of the last term is then given by $32Mc\theta^4/\kappa^2$. Choosing θ small enough so that $\theta^4 < \kappa^2/(256c)$, say, we finally obtain the lower bound
\[ LW_1 \ge \frac{MK}{16}H^{2\kappa-1}W_1 > 0, \]
valid outside of some sufficiently large compact set, as required.
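The bookkeeping of constants in this final step can be checked line by line. The sketch below uses only the choices of K and θ made above together with the bound $H_0 \le (1+\varepsilon)H$ stated earlier in the proof; the restriction ε ≤ 1/4 is an illustrative assumption that leaves room for the negligible terms.

```latex
% Constants check for the last step of the proof of Theorem 5.11.
% Uses H_0 \le (1+\varepsilon) H outside a compact set; \varepsilon \le 1/4 is illustrative.
\begin{align*}
K = \frac{2cM}{\gamma T_\infty\kappa^2}
 &\;\Longrightarrow\;
 \frac{\gamma K T_\infty\kappa^2}{2} - cM = 0,
 \qquad
 16\gamma K T_\infty\theta^4 = \frac{32Mc\,\theta^4}{\kappa^2}, \\
\theta^4 < \frac{\kappa^2}{256\,c}
 &\;\Longrightarrow\;
 \frac{32Mc\,\theta^4}{\kappa^2}\,H^{2\kappa-2}H_0
 < \frac{M}{8}\,H^{2\kappa-2}H_0
 \le \frac{M}{8}(1+\varepsilon)\,H^{2\kappa-1}, \\
\frac{M}{4} - \frac{M}{8}(1+\varepsilon)
 &\ge \frac{3M}{32} > \frac{M}{16}
 \qquad\text{for } \varepsilon \le \tfrac14 .
\end{align*}
```

The gap between 3M/32 and M/16 is what absorbs the remaining negligible contributions, such as the term proportional to $H^{2\kappa-2}$.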



6. Convergence Speed Towards the Invariant Measure

In this section, we are concerned with the convergence rates towards the invariant measure in the case 1 < k ≤ 2, where it exists. Our main result will be that k = 4/3 is the threshold separating exponential convergence from stretched exponential convergence.

6.1. Upper bounds. Our main tool for upper bounds will be the integrability bounds obtained in the previous section, together with the results recently obtained in [DFG06, BCG08]. The results obtained in Sect. 5 suggest that it is natural to work in spaces of functions weighted by $\exp(\delta V^\varepsilon)$, where V was defined in (4.18). For ε > 0 and δ > 0, we therefore define the space B(ε, δ) as the closure of the space of all smooth compactly supported functions under the norm
\[ \|\varphi\|_{(\varepsilon,\delta)} = \sup_{x\in R^4} |\varphi(x)|\exp\bigl(-\delta H^\varepsilon(x)\bigr), \]
where we used the letter x to denote the coordinates $(p_0, q_0, p_1, q_1)$. Note that the dual norm on measures is a weighted total variation norm with weight $\exp(\delta H^\varepsilon(x))$. We also say that a Markov semigroup $P_t$ with invariant measure µ has a spectral gap in a Banach space B containing constants if there exist constants C and $\hat\gamma$ such that
\[ \|P_t\varphi - \mu(\varphi)\|_B \le Ce^{-\hat\gamma t}\|\varphi\|_B, \qquad \forall\varphi\in B. \]
As a consequence of the bounds of Sect. 5, we obtain:

Theorem 6.1. Let k ∈ (1, 2] and set κ = 2/k − 1. Then, the semigroup $P_t$ extends to a $C_0$-semigroup on the space B(ε, δ), provided that ε ≤ max{1/2, 1 − κ}. Furthermore:
a. If 1 < k < 4/3 then, for every ε ∈ [1 − κ, κ) and every δ > 0, the semigroup $P_t$ has a spectral gap in B(ε, δ). Furthermore, there exists $\delta_0 > 0$ such that it has a spectral gap in B(κ, δ) for every δ ≤ $\delta_0$. In particular, for every δ > 0 there exist constants C > 0 and $\hat\gamma > 0$ such that the bound
\[ \|P_t(x,\cdot) - \mu\|_{TV} \le C\exp\bigl(\delta H^{1-\kappa}(x)\bigr)e^{-\hat\gamma t} \tag{6.1} \]
holds uniformly over all initial conditions x and all times t ≥ 0.
b. If k = 4/3, then there exists $\delta_0 > 0$ such that the semigroup $P_t$ has a spectral gap in B(1/2, δ) for every δ ≤ $\delta_0$. In particular, there exists δ > 0 such that the convergence result (6.1) holds.
c. For 4/3 < k < 2, there exist positive constants δ, C and $\hat\gamma$ such that the bound
\[ \|P_t(x,\cdot) - \mu\|_{TV} \le C\exp\bigl(\delta H^{\kappa}(x)\bigr)e^{-\hat\gamma t^{\kappa/(1-\kappa)}} \tag{6.2} \]
holds uniformly over all initial conditions x and all times t ≥ 0.
d. For the case k = 2, set $\zeta_\star$ as in Theorem 5.8. Then, for every $T_\infty < \alpha^2\hat C$ and every $\zeta < \zeta_\star$, there exists C > 0 such that the bound
\[ \|P_t(x,\cdot) - \mu\|_{TV} \le C H^{1+\zeta}(x)\,t^{-\zeta} \tag{6.3} \]
holds uniformly over all initial conditions x and all times t ≥ 0.
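For orientation, the exponents appearing in Theorem 6.1 can be rewritten directly in terms of k. The following short computation is only a restatement of κ = 2/k − 1; the sample value k = 3/2 is chosen purely for illustration.

```latex
% Exponents of Theorem 6.1 expressed through k, using \kappa = 2/k - 1.
\begin{align*}
\kappa &= \frac{2-k}{k}, &
1-\kappa &= \frac{2(k-1)}{k}, &
\frac{\kappa}{1-\kappa} &= \frac{2-k}{2(k-1)}, \\
\kappa = \tfrac12 &\iff k = \tfrac43, &
\frac{\kappa}{1-\kappa} < 1 &\iff k > \tfrac43 . \\
\intertext{For instance, for $k = \tfrac32$:}
\kappa &= \tfrac13, &
\frac{\kappa}{1-\kappa} &= \tfrac12, &
&\text{so (6.2) decays like } e^{-\hat\gamma\sqrt{t}} .
\end{align*}
```

In particular the stretching exponent κ/(1 − κ) in (6.2) equals 1 exactly at k = 4/3 and decreases to 0 as k approaches 2, consistent with the polynomial rate (6.3) taking over at k = 2.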


Proof. The set of bounded continuous functions is dense in B(ε, δ) and is mapped into itself by $P_t$. Therefore, in order to show that it extends to a $C_0$-semigroup on B(ε, δ), it remains to verify that:
1. There exists a constant C such that $\|P_t\varphi\|_{(\varepsilon,\delta)} \le C\|\varphi\|_{(\varepsilon,\delta)}$ for every t ∈ [0, 1] and every bounded continuous function ϕ.
2. For every $\varphi \in C_0^\infty$, one has $\lim_{t\to 0}\|P_t\varphi - \varphi\|_{(\varepsilon,\delta)} = 0$.
Using the a priori bounds on the solutions given by the bound $LH \le \gamma(T + T_\infty)$, it is possible to check that the second statement holds for every (ε, δ). The first claim then follows from [MT93] and (5.18).

It remains to show claims a to d. Claims a and b follow immediately from (5.18). To show that claim c also holds, we use the fact that, by using (5.18) in the case ε = 2/k − 1, we can find δ > 0 such that the bound
\[ LW \le -\delta^2 V^{2\kappa-1}W = -\delta^{\frac{k}{2-k}}\,W(\log W)^{2-\frac{1}{\kappa}} \]
holds outside of some compact subset of $R^4$. Since we are considering a regular Markov process, every compact set is petite. This shows that there exists a constant δ such that, in the terminology of [BCG08], W is a ϕ-Lyapunov function for our model with
\[ \varphi(t) = \delta^{\frac{k}{2-k}}\,t(\log t)^{2-\frac{1}{\kappa}}. \]
In particular, this yields the identity
\[ H_\varphi(t) = \int_1^t \frac{ds}{\varphi(s)} = \delta^{-\frac{k}{2-k}}\int_0^{\log t} s^{\frac{1}{\kappa}-2}\,ds = C(\log t)^{\frac{1-\kappa}{\kappa}} \]
for some constant C depending on δ and κ. It follows from the results in [BCG08] that the convergence rate to the invariant measure is given by
\[ \psi(t) = \frac{1}{(\varphi\circ H_\varphi^{-1})(t)} = C\,t^{\frac{1-2\kappa}{1-\kappa}}\,e^{-\gamma t^{\kappa/(1-\kappa)}} \]
for some positive constants C and γ, so that (6.2) follows.

The case k = 2 can be treated in a very similar way. It follows from the first part of the proof of Theorem 5.8 that there exists β > 0 and a function W growing like $H^{1+\zeta}$ at infinity such that one has the bound $LW \le -\beta H^\zeta$ outside of some sufficiently large compact set. Therefore, W is a ϕ-Lyapunov function for $\varphi(t) = \beta t^{\frac{\zeta}{1+\zeta}}$. Following the same calculations as before, we obtain $\psi(t) = Ct^{-\zeta}$, so that the required bound follows at once.

6.2. Lower bounds. In order to be able to use Theorem 3.7, we need upper bounds on the moments of some observable that is not integrable with respect to the invariant measure. This is achieved by the following proposition:

Proposition 6.2. For every α > 0 and every κ ∈ [0, 1/2] there exist constants $C_\alpha$ and $C_\kappa$ such that the bounds
\[ P_t H^\alpha(x) \le \bigl(H(x) + C_\alpha t\bigr)^\alpha, \qquad P_t\exp\bigl(\alpha H^\kappa\bigr)(x) \le \exp\bigl(\alpha H^\kappa(x) + C_\kappa(1+t)^{\kappa/(1-\kappa)}\bigr) \]
hold for every t > 0 and every $x \in R^4$.


Proof. Note first that $LH \le \gamma(T + T_\infty)$ and that
\[ T(\partial_{p_0}H)^2 + T_\infty(\partial_{p_1}H)^2 = Tp_0^2 + T_\infty p_1^2 \le 2(T + T_\infty)H. \tag{6.4} \]
It follows that for α ≥ 1, there exists C > 0 such that one has the bound
\[
\begin{aligned}
\frac{d}{dt}P_t H^\alpha(x) &= P_t\bigl(LH^\alpha\bigr)(x) \\
&= \alpha P_t\bigl(H^{\alpha-1}LH + \gamma(\alpha-1)H^{\alpha-2}(Tp_0^2 + T_\infty p_1^2)\bigr)(x) \\
&\le C\,P_t H^{\alpha-1}(x) \le C\bigl(P_t H^\alpha(x)\bigr)^{1-\frac{1}{\alpha}}.
\end{aligned}
\]
The last inequality follows from the concavity of $x \mapsto x^{1-\frac{1}{\alpha}}$. Setting $C_\alpha = C/\alpha$, the bound on $P_t H^\alpha$ now follows from a simple differential inequality. The corresponding bound for α ∈ (0, 1) follows by a simple application of Jensen's inequality.

The bounds on the exponential of the energy are obtained in a similar way. Set $f_\kappa(x) = x(\log x)^{2-\frac{1}{\kappa}}$ and note that there exists a constant $K_\kappa$ such that, provided that κ ∈ (0, 1/2], $f_\kappa$ is concave for $x \ge \exp(\alpha K_\kappa^\kappa)$. It then follows as before from (6.4) and the bound on LH that there exists a constant C such that
\[
\begin{aligned}
\frac{d}{dt}P_t\exp\bigl(\alpha(K_\kappa + H)^\kappa\bigr)(x) &\le C\,P_t\Bigl((K_\kappa + H)^{2\kappa-1}\exp\bigl(\alpha(K_\kappa + H)^\kappa\bigr)\Bigr)(x) \\
&= C\,P_t f_\kappa\bigl(\exp\alpha(K_\kappa + H)^\kappa\bigr)(x) \\
&\le C f_\kappa\Bigl(P_t\exp\bigl(\alpha(K_\kappa + H)^\kappa\bigr)(x)\Bigr).
\end{aligned} \tag{6.5}
\]
The result then follows again from a simple differential inequality.
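The "simple differential inequality" used for the polynomial moments can be spelled out as follows. This is a sketch for α ≥ 1 only, writing $y(t) = P_t H^\alpha(x)$ (a name introduced here for convenience) and assuming y(t) > 0 so that the fractional powers are differentiable.

```latex
% Sketch: integrating y' \le C y^{1-1/\alpha}, with y(t) = P_t H^\alpha(x), \alpha \ge 1.
\begin{align*}
\frac{d}{dt}\, y(t)^{1/\alpha}
  &= \frac{1}{\alpha}\, y(t)^{\frac{1}{\alpha}-1}\, y'(t)
   \le \frac{C}{\alpha} = C_\alpha, \\
y(t)^{1/\alpha} &\le y(0)^{1/\alpha} + C_\alpha t = H(x) + C_\alpha t, \\
P_t H^\alpha(x) = y(t) &\le \bigl(H(x) + C_\alpha t\bigr)^{\alpha}.
\end{align*}
```

The exponential bound follows the same pattern with $f_\kappa$ in place of the power function, which is where the growth $(1+t)^{\kappa/(1-\kappa)}$ in the exponent comes from.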

As a consequence, we have the following result in the case k = 2:

Theorem 6.3. For every $\zeta > \zeta_\star$ (with $\zeta_\star$ as in Theorem 5.8) and every $x_0 \in R^4$, there exists a constant C and a sequence $t_n$ increasing to infinity such that $\|\mu - \mu_{t_n}\| \ge Ct_n^{-\zeta}$.

Proof. Let $\tilde\zeta \in (\zeta_\star, \zeta)$, and let ε > 0, α > ζ(1 + ε). It then follows from Theorem 5.8 and Proposition 6.2 that the assumptions of Theorem 3.7 are satisfied with $W(x) = H^{\tilde\zeta}(x)$, $h(s) = s^{-1-\varepsilon}$, $F(s) = s^{\alpha/\tilde\zeta}$, and $g(x_0, t) = (H(x_0) + Ct)^\alpha$. Applying Theorem 3.7 yields the lower bound
\[ \|\mu - \mu_{t_n}\| \ge C\,t_n^{-\frac{(1+\varepsilon)\alpha\tilde\zeta}{\alpha - \tilde\zeta - \varepsilon\tilde\zeta}} \]
for some C > 0 and some sequence $t_n$ increasing to infinity. Choosing ε sufficiently small and α sufficiently large, we can ensure that the exponent appearing in this expression is larger than −ζ, so that the claim follows.

Furthermore, we have

Theorem 6.4. Let k ∈ (4/3, 2) and define κ = 2/k − 1. Then, there exists a constant c such that, for every initial condition $x_0 \in R^4$ there exists a constant C and a sequence of times $t_n$ increasing to infinity such that $\|\mu - \mu_{t_n}\| \ge C\exp\bigl(-c\,t_n^{\kappa/(1-\kappa)}\bigr)$.


Proof. We apply Theorem 3.7 in a similar way to above, but it turns out that we do not need to make such 'sharp' choices for h and F. Take $h(s) = s^{-2}$, $F(s) = s^3$, and let $W = \exp(KH^\kappa)$ with the constant K large enough so that W is not integrable with respect to µ. It then follows from Proposition 6.2 that we can choose $g(x, t) = \exp\bigl(3KH^\kappa(x) + C(1+t)^{\kappa/(1-\kappa)}\bigr)$ for a suitable constant C. The requested bound follows at once, noting that $h \circ (F\cdot h) \circ g = 1/g^2$.

7. The Case of a Weak Pinning Potential

In this section, we are going to study the case k ≤ 1, that is when we have either $V_1 \approx V_2$ or $V_1 \ll V_2$ at infinity. This case was studied extensively in the previous works [EH00, RT02, EH03, Car07], but the results and techniques obtained there do not seem to cover the situation at hand where one of the heat baths is at 'infinite temperature'. Furthermore, these works do not cover the case k < 1/2, where one does not have a spectral gap and exponential convergence fails. One further interest of the present work is that, unlike in the above-mentioned works, we are able to work with the generator L instead of having to obtain bounds on the semigroup $P_t$. This makes the argument somewhat cleaner.

We divide this part into two subsections. We first treat the case where one can find a spectral gap, which is relatively easy in the present setting. In the second part, we then treat the case where the spectral gap fails to hold, which follows more closely the heuristics set out in Sect. 2.2. There, we also show that, rather unsurprisingly, no invariant measure exists in the case where k ≤ 0.

7.1. The case k > 1/2. Our aim is to find a modified version $\hat H$ of the energy function H such that, for a sufficiently small constant $\beta_0$, one has $\exp(-\beta_0\hat H)\,L\exp(\beta_0\hat H) \ll 0$ at infinity. This is achieved by the following result:

Theorem 7.1. Let k ∈ (1/2, 1) and let δ ∈ [1/k − 1, 1]. Then, there exist constants c, C > 0, $\beta_0 > 0$ and a function $\hat H : R^4 \to R$ such that
• The bounds $cH \le \hat H \le CH$ hold outside of some compact set.
• For any t > 0, the operator $P_t$ admits a spectral gap in the space of measurable functions weighted by $\exp(\beta_0\hat H^\delta)$.

Remark 7.2. Combining this result with Proposition 5.1 shows the existence of constants c, C > 0 such that $\int\exp(cH)\,d\mu < \infty$, but $\int\exp(CH)\,d\mu = \infty$.

Remark 7.3. The technique used in the proof of Theorem 7.1 is more robust than that used in the previous sections. In particular, it applies to chains of arbitrary length. It would also not be too difficult to modify it to suit the more general class of potentials considered in [RT02, Car07].

Proof. Define the variable $y = (q, p_0, p_1)$ with $q = (q_0 - q_1)/2$ and let A and B be the matrices defined by
\[ A \stackrel{\rm def}{=} \begin{pmatrix} 0 & \tfrac12 & -\tfrac12 \\ -2\alpha & -\gamma & 0 \\ 2\alpha & 0 & 0 \end{pmatrix}, \qquad B \stackrel{\rm def}{=} \sqrt{2\gamma}\begin{pmatrix} 0 & 0 \\ \sqrt{T} & 0 \\ 0 & \sqrt{T_\infty} \end{pmatrix}. \tag{7.1} \]



With this notation, we can write the equations of motion for y following from (1.2) as
\[ dy = Ay\,dt + F(y, Q)\,dt + B\,dw(t), \tag{7.2} \]
where we defined the centre of mass $Q = (q_0 + q_1)/2$ and $F : R^4 \to R^3$ is a vector-valued function whose components are all bounded by $C + |V_1'(q_0)| + |V_1'(q_1)|$ for some constant C. Since det A = −γα < 0 and we know from a simple contradiction argument [RT02, Car07] that the energy of the system converges to zero under the deterministic equation $\dot y = Ay$, we conclude that all eigenvalues of A have strictly negative real part. As a consequence, there exists $\tilde\gamma > 0$ such that the strictly positive definite symmetric quadratic form
\[ \langle y, Sy\rangle \stackrel{\rm def}{=} \int_0^\infty e^{\tilde\gamma t}\,\|e^{At}y\|^2\,dt \tag{7.3} \]
is well-defined. A simple change of variable shows that one then has the bound
\[ \langle e^{At}y, Se^{At}y\rangle \le e^{-\tilde\gamma t}\langle y, Sy\rangle. \tag{7.4} \]

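For any given (small) value ε > 0, let now $G_\varepsilon : R \to R$ be a smooth function such that:
• There exists a constant $C_\varepsilon$ such that the bounds $G_\varepsilon(q)V_1'(q) \le C_\varepsilon - |V_1'(q)|^2$ and $|G_\varepsilon(q)|^2 \le C_\varepsilon + |V_1'(q)|^2$ hold for every q ∈ R.
• One has $|G_\varepsilon'(q)| \le \varepsilon$ for every q ∈ R.
Since we assumed that k < 1, it is possible to construct a function $G_\varepsilon$ satisfying these conditions by choosing $R_\varepsilon$ sufficiently large, setting $G_\varepsilon(q) = -V_1'(q)$ for $|q| \ge 2R_\varepsilon$, $G_\varepsilon(q) = q|R_\varepsilon|^{2k-2}$ for $|q| \le R_\varepsilon$, and interpolating smoothly in between. For large values of $R_\varepsilon$, one can then guarantee that $|G_\varepsilon'(q)| \le CR_\varepsilon^{2k-2}$, which does indeed go to 0 for large values of $R_\varepsilon$. We now define, for a (large) constant ξ to be determined,
\[ \hat H = H + \langle y, Sy\rangle - \xi(p_0 + p_1)\bigl(G_\varepsilon(q_0) + G_\varepsilon(q_1)\bigr). \]
Before we bound $L\hat H$, we note that we have the bound
\[
\begin{aligned}
\bigl(G_\varepsilon(q_0) + G_\varepsilon(q_1)\bigr)\bigl(V_1'(q_0) + V_1'(q_1)\bigr)
&= 2\bigl(G_\varepsilon(q_0)V_1'(q_0) + G_\varepsilon(q_1)V_1'(q_1)\bigr) + \int_{q_0}^{q_1} G_\varepsilon'(q)\,dq\,\bigl(V_1'(q_0) - V_1'(q_1)\bigr) \\
&\le 2C_\varepsilon - 2\bigl(|V_1'(q_0)|^2 + |V_1'(q_1)|^2\bigr) + C\varepsilon(q_0 - q_1)^2
\end{aligned}
\]
for some constant C independent of ε. It therefore follows from (7.4), (7.2), (1.2) and the properties of $G_\varepsilon$ that there exist constants $C_i$ independent of ξ and ε such that we have the bound
\[
\begin{aligned}
L\hat H &\le C_1 - \gamma p_0^2 - \tilde\gamma\langle y, Sy\rangle + 2\langle y, SF(y, Q)\rangle
 + \xi\bigl(G_\varepsilon(q_0) + G_\varepsilon(q_1)\bigr)\bigl(V_1'(q_0) + V_1'(q_1)\bigr) \\
&\quad + \gamma\xi p_0\bigl(G_\varepsilon(q_0) + G_\varepsilon(q_1)\bigr) - \xi(p_0 + p_1)\bigl(G_\varepsilon'(q_0)p_0 + G_\varepsilon'(q_1)p_1\bigr) \\
&\le C_2\bigl(C_\varepsilon + |V_1'(q_0)|^2 + |V_1'(q_1)|^2\bigr) - \Bigl(\frac{\tilde\gamma}{2} - C_3\varepsilon\xi\Bigr)\langle y, Sy\rangle - 2\xi\bigl(|V_1'(q_0)|^2 + |V_1'(q_1)|^2\bigr).
\end{aligned}
\]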


It follows that, by first making ξ sufficiently large and then making ε sufficiently small, it is possible to obtain the bound
$$ L\hat H \le C - \frac{\tilde\gamma}{2}\bigl(1 + \langle y, Sy\rangle + |V_1'(q_0)|^2 + |V_1'(q_1)|^2\bigr), \tag{7.5} $$
for some constant C. (The constant C depends of course on the choice of ξ and of ε, but we assume those to be fixed from now on.) Furthermore, it follows immediately from the definition of $\hat H$ that
$$ \Gamma(\hat H, \hat H) \le C\bigl(1 + \langle y, Sy\rangle + |V_1'(q_0)|^2 + |V_1'(q_1)|^2\bigr) \le C\hat H^{2-\frac1k}, \tag{7.6} $$
where we used the scaling behaviour of $V_1$ in order to obtain the second bound. Set now $W = \exp(\beta_0\hat H^\delta)$ for a constant $\beta_0$ to be determined. It follows from (5.17) that the bound
$$ LW \le \beta_0\delta W\hat H^{\delta-1}\bigl(L\hat H + 2\beta_0\delta\hat H^{\delta-1}\,\Gamma(\hat H, \hat H)\bigr) $$
holds outside of some sufficiently large compact set. Combining this with (7.6) and (7.5), we see that if $\delta \in [\frac1k - 1, 1]$ and $\beta_0$ is sufficiently small, then the bound
$$ LW \le -CW\hat H^{\delta+1-\frac1k} \le -CW $$

holds outside of some compact set. The claim then follows immediately from Theorem 3.4.

The case k = 1 can be shown similarly, but the result that we obtain is slightly stronger in the sense that one has a spectral gap in spaces weighted by $H^\delta$ for any δ > 0:

Theorem 7.4. Let k = 1 and let δ > 0. Then, for any t > 0, the operator $P_t$ admits a spectral gap in the space of measurable functions weighted by $H^\delta$.

Proof. The proof is similar to the above, but this time by setting $\tilde y = (q_0, q_1, p_0, p_1)$,
$$ \tilde A \stackrel{\mathrm{def}}{=} \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -\alpha & \alpha & -\gamma & 0 \\ \alpha & -\alpha & 0 & 0 \end{pmatrix}, \qquad \tilde B \stackrel{\mathrm{def}}{=} \sqrt{2\gamma} \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ \sqrt{T} & 0 \\ 0 & \sqrt{T_\infty} \end{pmatrix}, $$
and noting that $d\tilde y = \tilde A\tilde y\,dt + F(\tilde y)\,dt + \tilde B\,dw(t)$, for some bounded function F. It then suffices to construct $\tilde S$ similarly to above and to set $\hat H = \langle\tilde y, \tilde S\tilde y\rangle$, without requiring any correction term. This yields the existence of constants $C_1$ and $C_2$ such that one has the bounds
$$ L\hat H \le -C_1\hat H, \qquad \Gamma(\hat H, \hat H) \le C_2\hat H, $$
outside of some compact set. The existence of a spectral gap in spaces weighted by $\hat H^\delta$ follows at once. The claim then follows from the fact that $\hat H$ is bounded from above and from below by multiples of H.



7.2. The case k ≤ 1/2. This case is slightly more subtle since the function $V_1'(q)$ is either bounded or even converges to zero at infinity, so that bounds of the type (7.5) are not very useful. We nevertheless have the following result:

Theorem 7.5. Let $k \in (0, \frac12]$. Then, (1.2) admits a unique invariant probability measure µ and there exist constants c, C > 0, $\beta_0 > 0$, and a function $\hat H\colon \mathbf{R}^4 \to \mathbf{R}$ such that
• The bounds $cH \le \hat H \le CH$ hold outside of some compact set.
• If $k = \frac12$, then $P_t$ admits a spectral gap in the space of measurable functions weighted by $\exp(\beta_0\hat H)$.
• If $k < \frac12$, then there exist positive constants C and $\hat\gamma$ such that the bound
$$ \|P_t(x, \cdot\,) - \mu\|_{\mathrm{TV}} \le C\exp(\beta_0 H(x))\,e^{-\hat\gamma t^{k/(1-k)}}, \tag{7.7} $$
holds uniformly over all initial conditions x and all times t ≥ 0.

Proof. Define again y, A and B as in (7.1) but let us be slightly more careful about the remainder term. We define as before the center of mass $Q = (q_0 + q_1)/2$ and the displacement $q = (q_0 - q_1)/2$ and write
$$ V_1'(q_0) = V_1'(Q) + R_0(q, Q), \qquad V_1'(q_1) = V_1'(Q) + R_1(q, Q). $$
With this notation, defining furthermore the vector $\mathbf{1} = (0, 1, 1)$, the equation of motion for $y = (q, p_0, p_1)$ is given by
$$ dy = Ay\,dt - V_1'(Q)\mathbf{1}\,dt + R(Q, y)\,dt + B\,dw(t), \qquad R = (0, -R_0(Q, y), -R_1(Q, y)). $$
This suggests the introduction, for fixed $Q \in \mathbf{R}$, of the reduced generator $L_Q$ acting on functions from $\mathbf{R}^3$ to $\mathbf{R}$ by
$$ L_Q = \langle Ay, \partial_y\rangle - V_1'(Q)\langle\mathbf{1}, \partial_y\rangle + \frac12\langle B^*\partial_y, B^*\partial_y\rangle. $$
Following the usual procedure in the theory of homogenisation, we wish to correct the 'slow variable' Q in order to obtain an effective equation that takes into account the behaviour of the 'fast variable' y. Since the equation of motion for Q is given by $\dot Q = (p_0 + p_1)/2 = \langle\mathbf{1}, y\rangle/2$, this can be achieved by finding a function ψ(Q) such that $\langle\mathbf{1}, y\rangle/2 - \psi(Q)$ is centred with respect to the invariant measure for $L_Q$ and then solving the Poisson equation $L_Q\varphi_Q = \langle\mathbf{1}, y\rangle/2 - \psi(Q)$. Since all the coefficients of $L_Q$ are linear (remember that Q is a constant there), this can be solved explicitly, yielding
$$ \psi(Q) = -\frac{2}{\gamma}V_1'(Q), \qquad \varphi_Q(y) = -\langle a, y\rangle, \qquad a = (1, 1/\gamma, 1/\gamma). $$
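Because $\varphi_Q$ is linear in y, the Poisson equation reduces to two algebraic identities, $-a^{T}A = \mathbf{1}^{T}/2$ and $\langle a, \mathbf{1}\rangle = 2/\gamma$, and the second-order term of $L_Q$ drops out. The sketch below checks both numerically. It assumes the matrix A as read from (7.1); the function names and the sample values of α and γ are hypothetical and only serve the illustration.

```python
import numpy as np


def drift_matrix(alpha, gamma):
    """Matrix A of (7.1) acting on y = (q, p0, p1); an assumption read off the text."""
    return np.array([[0.0, 0.5, -0.5],
                     [-2.0 * alpha, -gamma, 0.0],
                     [2.0 * alpha, 0.0, 0.0]])


def poisson_coefficients(alpha, gamma):
    """Apply L_Q to phi(y) = -<a, y> with a = (1, 1/gamma, 1/gamma).

    Since phi is linear, L_Q phi = <linear, y> + constant * V_1'(Q), where
    linear = -a^T A and constant = <a, 1>.  The Poisson equation in the text
    requires linear = 1/2 * (0, 1, 1) and constant = -psi(Q) / V_1'(Q) = 2 / gamma.
    """
    A = drift_matrix(alpha, gamma)
    one = np.array([0.0, 1.0, 1.0])
    a = np.array([1.0, 1.0 / gamma, 1.0 / gamma])
    linear = -(a @ A)
    constant = a @ one
    return linear, constant


def stationary_mean(alpha, gamma, v1_prime):
    """Mean of y under the invariant measure of L_Q: solves A y = V_1'(Q) * (0, 1, 1)."""
    A = drift_matrix(alpha, gamma)
    return np.linalg.solve(A, v1_prime * np.array([0.0, 1.0, 1.0]))


if __name__ == "__main__":
    linear, constant = poisson_coefficients(alpha=0.7, gamma=1.3)
    print("coefficient of y:", linear)
    print("coefficient of V_1'(Q):", constant)
```

The helper `stationary_mean` gives an independent check of ψ: the mean of $\langle\mathbf{1}, y\rangle/2$ under the invariant measure of $L_Q$ should equal $-(2/\gamma)V_1'(Q)$.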

We now introduce the corrected variable $\hat Q = Q + \langle a, y\rangle$, so that the equations of motion for $\hat Q$ are given by
$$ d\hat Q = -\frac{2}{\gamma}V_1'(Q)\,dt + \langle a, R(Q, y)\rangle\,dt + \sqrt{\frac{2T}{\gamma}}\,dw_0(t) + \sqrt{\frac{2T_\infty}{\gamma}}\,dw_1(t). $$
Defining $\hat\gamma = \frac{2}{\gamma}$, the 'mean temperature' $\hat T = (T + T_\infty)/2$, and
$$ \hat R = \langle a, R(Q, y)\rangle + \hat\gamma\bigl(V_1'(\hat Q) - V_1'(Q)\bigr), \tag{7.8} $$
we thus see that there exists a Wiener process W such that $\hat Q$ satisfies the equation
$$ d\hat Q = -\hat\gamma V_1'(\hat Q)\,dt + \hat R\,dt + \sqrt{2\hat\gamma\hat T}\,dW(t). $$
Setting again S as in (7.3), this suggests that in order to extract the tail behaviour of the invariant measure for (1.2), a good test function would be $V_1(\hat Q) + \langle y, Sy\rangle$. This function however turns out not to be suitable in the regime where $\hat Q$ is large and y is small, because of the constant appearing when applying L to $\langle y, Sy\rangle$. In order to avoid this, let us introduce a smooth increasing function $\chi\colon \mathbf{R}_+ \to [0, 1]$ such that χ(t) = 1 for t ≥ 2 and χ(t) = 0 for t ≤ 1. We also define the function $\langle\hat Q\rangle = \sqrt{1 + \hat Q^2}$, so that $|V_1'(\hat Q)| \le C\langle\hat Q\rangle^{2k-1}$ and similarly for $V_1''(\hat Q)$.

Note that, since we are considering the regime where $V_1'$ is a bounded function, there exists a constant $C_S$ such that
$$ -C_S - 2\tilde\gamma\langle y, Sy\rangle \le L\langle y, Sy\rangle \le C_S - \frac{\tilde\gamma}{2}\langle y, Sy\rangle, $$
where $\tilde\gamma$ is as in (7.4). Furthermore, we note that since all terms contained in $\hat R$ are of the form $V_1'(\hat Q) - V_1'(\hat Q + \langle b, y\rangle)$ for some vector $b \in \mathbf{R}^3$, there exists a constant C such that the bound
$$ |\hat R| \le \begin{cases} C & \text{if } |\hat Q| \le C|y|, \\ C|y|\langle\hat Q\rangle^{2k-2} & \text{if } |\hat Q| \ge C|y|, \end{cases} \tag{7.9} $$
holds for every pair $(\hat Q, y)$. (In particular $\hat R$ is bounded.) We now set
$$ W = \exp(\beta_0\langle y, Sy\rangle) + \exp\bigl(\beta_0\lambda V_1(\hat Q)\bigr), $$
for some positive constants $\beta_0$ and λ to be determined. Since we are only interested in bounds that hold outside of a compact set, we use in the remainder of this proof the notation $f \lesssim g$ to signify that there exists a constant c > 0 such that the bound f ≤ cg holds outside of a sufficiently large compact set. With this notation, one can check in a straightforward way that there exist constants $\beta_i$ depending on $\beta_0$ and λ such that the two-sided bound
$$ \exp(\beta_1 H) \lesssim W \lesssim \exp(\beta_2 H) $$
holds. It follows then from the chain rule that there exist constants $C_i > 0$ such that one has the upper bound
$$ LW \le \beta_0\Bigl(C_S - \bigl(\tfrac{\tilde\gamma}{2} - C_1\beta_0\bigr)\langle y, Sy\rangle\Bigr)\exp(\beta_0\langle y, Sy\rangle) + \beta_0\lambda\Bigl(-(1 - C_2\beta_0\lambda)|V_1'(\hat Q)|^2 + C_3|V_1'(\hat Q)||\hat R| + C_4 V_1''(\hat Q)\Bigr)\exp\bigl(\beta_0\lambda V_1(\hat Q)\bigr). \tag{7.10} $$



Choosing $\beta_0$ sufficiently small, we obtain the existence of a constant $\hat C$ such that the bound
$$ LW \lesssim \bigl(\hat C - \langle y, Sy\rangle\bigr)\exp(\beta_0\langle y, Sy\rangle) + \langle\hat Q\rangle^{2k-1}\bigl(\hat C|\hat R| - \langle\hat Q\rangle^{2k-1}\bigr)\exp\bigl(\beta_0\lambda V_1(\hat Q)\bigr) $$
holds. We now consider three separate cases. In the regime $\lambda V_1(\hat Q) \ge \langle y, Sy\rangle \ge \hat C$ and provided that λ is chosen sufficiently small, it follows from (7.9) that we have the bound
$$ LW \lesssim \langle\hat Q\rangle^{2k-1}\bigl(\hat C|\hat R| - \langle\hat Q\rangle^{2k-1}\bigr)\exp\bigl(\beta_0\lambda V_1(\hat Q)\bigr) \lesssim -\langle\hat Q\rangle^{4k-2}W. $$
In the regime where $\lambda V_1(\hat Q) \ge \langle y, Sy\rangle$ but $\langle y, Sy\rangle \le \hat C$, we similarly have
$$ LW \lesssim \hat C\exp(\beta_0\hat C) - \langle\hat Q\rangle^{4k-2}\exp\bigl(\beta_0\lambda V_1(\hat Q)\bigr) \lesssim -\langle\hat Q\rangle^{4k-2}W. $$
Finally, in the regime where $\lambda V_1(\hat Q) \le \langle y, Sy\rangle$, we have the bound
$$ LW \lesssim -\langle y, Sy\rangle\exp(\beta_0\langle y, Sy\rangle) \lesssim -|y|^2 W. $$
Combining all of these bounds, we have
$$ LW \lesssim -(\log W)^{2-\frac1k}\,W, $$
so that the upper bounds on the transition probabilities follow just as in the proof of Theorem 6.1 with κ replaced by k.

Before we obtain lower bounds on the convergence speed, we show the following non-integrability result:

Lemma 7.6. In the case $k < \frac12$ there exists β > 0 such that $\int\exp(\beta V_1(\hat Q))\,d\mu = \infty$.

Proof. We are going to construct functions $W_1$ and $W_2$ satisfying Wonham's criterion. Let $\hat Q$, S and y be as in the proof of the previous result and set
$$ W_2 = \exp\bigl(2\beta V_1(\hat Q)\bigr) + \exp(\varepsilon\langle y, Sy\rangle), $$
for constants β > 0 and ε > 0 to be determined. It follows from the boundedness of $V_1'$, $V_1''$ and $\hat R$ that, whatever the choice of β, one has
$$ LW_2 \lesssim \exp\bigl(\beta V_1(\hat Q)\bigr), $$
provided that we choose ε sufficiently small. Setting
$$ W_1 = \exp\bigl(\beta V_1(\hat Q)\bigr) - \exp(\varepsilon\langle y, Sy\rangle), $$
we have similarly to (7.10) the bound
$$ LW_1 \ge \beta\Bigl((C_2\beta - 1)|V_1'(\hat Q)|^2 - C_3|V_1'(\hat Q)||\hat R| + C_4 V_1''(\hat Q)\Bigr)\exp\bigl(\beta V_1(\hat Q)\bigr) + \varepsilon\bigl((\tilde\gamma/2 - C_1\varepsilon)\langle y, Sy\rangle - C_S\bigr)\exp(\varepsilon\langle y, Sy\rangle), $$
so that an analysis similar to before shows that $LW_1 \ge 0$ outside of some compact set, provided that $\varepsilon < \tilde\gamma/(2C_1)$ and $\beta > 1/C_2$, thus concluding the proof.

Remark 7.7. The proof of Lemma 7.6 does not require k > 0. It therefore shows that there exists no invariant probability measure for (1.2) if k ≤ 0.


We now use this result in order to obtain the following lower bound on the convergence of the transition probabilities towards the invariant measure:

Theorem 7.8. Let $k \in (0, \frac12)$. Then, there exists a constant c such that, for every initial condition $x_0 \in \mathbf{R}^4$ there exists a constant C and a sequence of times $t_n$ increasing to infinity such that $\|\mu - \mu_{t_n}\| \ge C\exp(-c\,t_n^{k/(1-k)})$.

Proof. We use the same notations as above. Let β be sufficiently large so that the function $\exp(\beta V_1(\hat Q))$ is not integrable with respect to the invariant measure. We also fix some small ε > 0 and we set
$$ W = \exp\bigl(\beta V_1(\hat Q)\bigr) + \exp(\varepsilon\langle y, Sy\rangle). $$
We then obtain in a very similar way to before the upper bound
$$ LW \le \beta\Bigl((C_2\beta - 1)|V_1'(\hat Q)|^2 + C_3|V_1'(\hat Q)||\hat R| + C_4 V_1''(\hat Q)\Bigr)\exp\bigl(\beta V_1(\hat Q)\bigr) + \varepsilon\bigl(C_S - (\tilde\gamma/2 - C_1\varepsilon)\langle y, Sy\rangle\bigr)\exp(\varepsilon\langle y, Sy\rangle). $$
It follows again from a similar analysis that there exists a constant C > 0 such that the bound
$$ LW \le C(\log W)^{2-\frac1k}\,W $$
holds outside of some compact set. As in the proof of Proposition 6.2, this implies the existence of a constant C > 0 such that one has the pointwise bound
$$ P_t W \le W\exp\bigl(C(1 + t)^{k/(1-k)}\bigr). $$
Combining this with Lemma 7.6, the rest of the proof is identical to that of Theorem 6.4.
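The exponent k/(1−k) in this bound can be made plausible by a comparison argument. This is a heuristic sketch only, not the argument of Proposition 6.2: replace the inequality by the scalar equation $\dot w = C(\log w)^{2-1/k}w$ and set $u = \log w$. Then $\dot u = Cu^{2-1/k}$, and with $p = 1/k - 1$ one finds $\frac{d}{dt}u^p = pC$, so $u(t) = (u_0^p + pCt)^{k/(1-k)}$. The function names and sample values below are hypothetical.

```python
def growth_exponent(k):
    """Exponent k / (1 - k) governing the growth of log W in the comparison ODE."""
    return k / (1.0 - k)


def log_w_closed_form(u0, C, k, t):
    """Closed-form solution of u' = C * u**(2 - 1/k), u(0) = u0 > 0, for 0 < k < 1/2."""
    p = 1.0 / k - 1.0
    return (u0 ** p + p * C * t) ** (1.0 / p)


def log_w_euler(u0, C, k, t, steps):
    """Explicit Euler integration of the same ODE, used only as a cross-check."""
    dt = t / steps
    u = u0
    for _ in range(steps):
        u += dt * C * u ** (2.0 - 1.0 / k)
    return u


if __name__ == "__main__":
    k = 1.0 / 3.0  # sample value in (0, 1/2)
    for t in (1.0, 10.0, 100.0):
        exact = log_w_closed_form(1.0, 1.0, k, t)
        # The ratio approaches the constant (p * C)**(k / (1 - k)) for large t.
        print(t, exact, exact / t ** growth_exponent(k))
```

For large t the solution behaves like a constant times $t^{k/(1-k)}$, which matches the form of the bound on $P_tW$ stated above.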

Acknowledgements. The author would like to thank Jean-Pierre Eckmann, Xue-Mei Li, Jonathan Mattingly, and Eric Vanden-Eijnden for stimulating discussions on this and closely related problems, as well as Charles Manson for discovering several mistakes in an earlier version. This work was supported by an EPSRC Advanced Research Fellowship (grant number EP/D071593/1).

References

[BCG08] Bakry, D., Cattiaux, P., Guillin, A.: Rate of convergence for ergodic continuous Markov processes: Lyapunov versus Poincaré. J. Funct. Anal. 254(3), 727–759 (2008)
[Bon69] Bony, J.-M.: Principe du maximum, inégalité de Harnack et unicité du problème de Cauchy pour les opérateurs elliptiques dégénérés. Ann. Inst. Fourier (Grenoble) 19, no. fasc. 1, 277–304 xii (1969)
[Car07] Carmona, P.: Existence and uniqueness of an invariant measure for a chain of oscillators in contact with two heat baths. Stoch. Process. Appl. 117(8), 1076–1092 (2007)
[CGGR08] Cattiaux, P., Gozlan, N., Guillin, A., Roberto, C.: Functional inequalities for heavy tails distributions and application to isoperimetry. http://arxiv.org/abs/0807.3112v1[math.PR], 2008
[CGWW07] Cattiaux, P., Guillin, A., Wang, F.-Y., Wu, L.: Lyapunov conditions for logarithmic Sobolev and super Poincaré inequality. http://arxiv.org/abs/0712.0235[math.PR], 2007
[DFG06] Douc, R., Fort, G., Guillin, A.: Subgeometric rates of convergence of f-ergodic strong Markov processes. http://arxiv.org/abs/math/0605791v1[math.ST], 2006
[DMP+07] DeVille, R.E.L., Milewski, P.A., Pignol, R.J., Tabak, E.G., Vanden-Eijnden, E.: Nonequilibrium statistics of a reduced model for energy transfer in waves. Comm. Pure Appl. Math. 60(3), 439–461 (2007)
[DPZ96] Da Prato, G., Zabczyk, J.: Ergodicity for Infinite-Dimensional Systems, Vol. 229 of London Mathematical Society Lecture Note Series. Cambridge: Cambridge University Press, 1996
[DV01] Desvillettes, L., Villani, C.: On the trend to global equilibrium in spatially inhomogeneous entropy-dissipating systems: the linear Fokker-Planck equation. Comm. Pure Appl. Math. 54(1), 1–42 (2001)
[EH00] Eckmann, J.-P., Hairer, M.: Non-equilibrium statistical mechanics of strongly anharmonic chains of oscillators. Commun. Math. Phys. 212(1), 105–164 (2000)
[EH03] Eckmann, J.-P., Hairer, M.: Spectral properties of hypoelliptic operators. Commun. Math. Phys. 235(2), 233–253 (2003)
[EPR99a] Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Entropy production in nonlinear, thermally driven Hamiltonian systems. J. Statist. Phys. 95(1-2), 305–331 (1999)
[EPR99b] Eckmann, J.-P., Pillet, C.-A., Rey-Bellet, L.: Non-equilibrium statistical mechanics of anharmonic chains coupled to two heat baths at different temperatures. Commun. Math. Phys. 201(3), 657–697 (1999)
[FR05] Fort, G., Roberts, G.O.: Subgeometric ergodicity of strong Markov processes. Ann. Appl. Probab. 15(2), 1565–1589 (2005)
[Hai05] Hairer, M.: A probabilistic argument for the controllability of conservative systems. http://arxiv.org/abs/math-ph/0506064v2, 2005
[HM08a] Hairer, M., Mattingly, J.: Slow energy dissipation in anharmonic oscillator chains. http://arxiv.org/abs/0712.3889v2[math-ph], 2009
[HM08b] Hairer, M., Mattingly, J.: Yet another look at Harris' ergodic theorem for Markov chains. http://arxiv.org/abs/0810.2777v1[math.PR], 2008
[HN04] Hérau, F., Nier, F.: Isotropic hypoellipticity and trend to equilibrium for the Fokker-Planck equation with a high-degree potential. Arch. Rat. Mech. Anal. 171(2), 151–218 (2004)
[HN05] Helffer, B., Nier, F.: Hypoelliptic Estimates and Spectral Theory for Fokker-Planck Operators and Witten Laplacians, Vol. 1862 of Lecture Notes in Mathematics. Berlin: Springer-Verlag, 2005
[Hör67] Hörmander, L.: Hypoelliptic second order differential equations. Acta Math. 119, 147–171 (1967)
[Hör85] Hörmander, L.: The Analysis of Linear Partial Differential Operators. III, Vol. 274 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Berlin: Springer-Verlag, 1985
[MA94] MacKay, R.S., Aubry, S.: Proof of existence of breathers for time-reversible or Hamiltonian networks of weakly coupled oscillators. Nonlinearity 7(6), 1623–1643 (1994)
[MT93] Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability. Communications and Control Engineering Series. London: Springer-Verlag London Ltd., 1993
[MTVE02] Milewski, P.A., Tabak, E.G., Vanden-Eijnden, E.: Resonant wave interaction with random forcing and dissipation. Stud. Appl. Math. 108(1), 123–144 (2002)
[RT00] Rey-Bellet, L., Thomas, L.E.: Asymptotic behavior of thermal nonequilibrium steady states for a driven chain of anharmonic oscillators. Commun. Math. Phys. 215(1), 1–24 (2000)
[RT02] Rey-Bellet, L., Thomas, L.E.: Exponential convergence to non-equilibrium stationary states in classical statistical mechanics. Commun. Math. Phys. 225(2), 305–329 (2002)
[RW01] Röckner, M., Wang, F.-Y.: Weak Poincaré inequalities and $L^2$-convergence rates of Markov semigroups. J. Funct. Anal. 185(2), 564–603 (2001)
[Ver00] Veretennikov, A.Y.: On polynomial mixing estimates for stochastic differential equations with a gradient drift. Teor. Veroyatnost. i Primenen. 45(1), 163–166 (2000)
[Ver06] Veretennikov, A.Y.: On lower bounds for mixing coefficients of Markov diffusions. In: From Stochastic Calculus to Mathematical Finance. Berlin: Springer, 2006, pp. 623–633
[Vil07] Villani, C.: Hypocoercive diffusion operators. Boll. Unione Mat. Ital. Sez. B Artic. Ric. Mat. (8) 10(2), 257–275 (2007)
[Vil08] Villani, C.: Hypocoercivity, 2008. To appear in Memoirs Amer. Math. Soc.
[VK04] Veretennikov, A.Y., Klokov, S.A.: On the subexponential rate of mixing for Markov processes. Teor. Veroyatn. Primen. 49(1), 21–35 (2004)
[Won66] Wonham, W.M.: Liapunov criteria for weak stochastic stability. J. Diff. Eqs. 2, 195–207 (1966)

Communicated by A. Kupiainen

Commun. Math. Phys. 292, 179–199 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0836-y



Crystal Melting and Toric Calabi-Yau Manifolds

Hirosi Ooguri1,2, Masahito Yamazaki1,2,3

1 California Institute of Technology, 452-48, Pasadena, CA 91125, USA
2 Institute for the Physics and Mathematics of the Universe, University of Tokyo, Kashiwa, Chiba 277-8586, Japan
3 Department of Physics, University of Tokyo, Hongo 7-3-1, Tokyo 113-0033, Japan. E-mail: [email protected]

Received: 9 December 2008 / Accepted: 12 February 2009
Published online: 19 May 2009 – © Springer-Verlag 2009

Abstract: We construct a statistical model of crystal melting to count BPS bound states of D0 and D2 branes on a single D6 brane wrapping an arbitrary toric Calabi-Yau threefold. The three-dimensional crystalline structure is determined by the quiver diagram and the brane tiling which characterize the low energy effective theory of D branes. The crystal is composed of atoms of different colors, each of which corresponds to a node of the quiver diagram, and the chemical bond is dictated by the arrows of the quiver diagram. BPS states are constructed by removing atoms from the crystal. This generalizes the earlier results on the BPS state counting to an arbitrary non-compact toric Calabi-Yau manifold. We point out that a proper understanding of the relation between the topological string theory and the crystal melting involves the wall crossing in the Donaldson-Thomas theory.

1. Introduction

In type IIA superstring theory, supersymmetric bound states of D branes wrapping holomorphic cycles on a Calabi-Yau manifold give rise to BPS particles in four dimensions. In the past few years, remarkable connections have been found between the counting of such bound states and the topological string theory:

(1) When the D brane charges are such that bound states become large black holes with smooth event horizons, the OSV conjecture [1] states that the generating function $Z_{BH}$ of a suitable index for black hole microstates is equal to the absolute value squared of the topological string partition function $Z_{top}$,
$$ Z_{BH} = |Z_{top}|^2, \tag{1.1} $$
to all orders in the string coupling expansion.


(2) When there is a single D6 brane with D0 and D2 branes bound on it, it has been proposed [2] that the bound states are counted by the Donaldson-Thomas invariants [3,4] of the moduli space of ideal sheaves on the D6 brane. For a non-compact toric Calabi-Yau manifold, the Donaldson-Thomas invariants are related to the topological string partition function [2,5,6] using the topological vertex construction [7]. Recently the connection between the topological string theory and the Donaldson-Thomas theory for toric Calabi-Yau manifolds was proven mathematically in [8]. Given the conjectural relation between the counting of D brane bound states and the Donaldson-Thomas theory, it is natural to expect the relation,
$$ Z_{BH} = Z_{top}. \tag{1.2} $$

The purpose of this paper is to understand the case (2) better. We start with the left-hand side of the relation, namely the counting of BPS states. Recently, the non-commutative version of the Donaldson-Thomas theory was formulated by Szendröi [9] for the conifold and by Mozgovoy and Reineke [10] for general toric Calabi-Yau manifolds.¹ In this paper, we will establish a direct connection between the non-commutative Donaldson-Thomas theory and the counting of BPS bound states of D0 and D2 branes on a single D6 brane. Using this correspondence, we will find a statistical model of crystal melting which counts the BPS states.

The crystal melting description has been found earlier in the topological string theory on the right-hand side of (1.2). It was shown in [2,5] that the topological string partition function on $C^3$, the simplest toric Calabi-Yau manifold, and the topological vertex can be expressed as sums of three-dimensional Young diagrams, which can be regarded as complements of molten crystals with the cubic lattice structure.² Since the topological vertex can be used to compute the topological string partition function for a general non-compact toric Calabi-Yau manifold, it is natural to expect that a crystal melting description exists for any such manifold. To our knowledge, however, this idea has not been made explicit. The crystal melting model defined in this paper appears to be different from the one suggested by the topological vertex construction.

The low energy effective theory of D0 and D2 branes bound on a single D6 brane is a one-dimensional supersymmetric gauge theory, which is a dimensional reduction of an N = 1 gauge theory in four dimensions. The field content of the gauge theory is encoded in a quiver diagram and the superpotential can be found by the brane tiling [25–28].³ From these gauge theory data, we define a crystalline structure in three dimensions.
The crystal is composed of atoms of different colors, each of which corresponds to a node of the quiver diagram and carries a particular combination of D0 and D2 charges. The chemical bond is dictated by the arrows of the quiver diagram. There is a special crystal configuration, whose exterior shape lines up with the toric diagram of the Calabi-Yau manifold. Such a crystal corresponds to a single D6 brane with no D0 and D2 charges. We define a rule to remove atoms from the crystal, which basically says that the crystal melts from its peak. By using the non-commutative Donaldson-Thomas theory [10], we show that there is a one-to-one correspondence between molten crystal configurations and BPS bound states carrying non-zero D0 and D2 charges. The statistical model of crystal melting computes the index of D brane bound states.

The number of BPS states depends on the choice of the stability condition, and the BPS countings for different stability conditions are related to each other by the wall crossing formulae. In this paper, we find that, under a certain stability condition, BPS bound states of D branes are counted by the non-commutative Donaldson-Thomas theory. We can use the wall crossing formulae recently derived in [13,14] to relate this result to the commutative Donaldson-Thomas theory. Since the topological string theory is equivalent to the commutative Donaldson-Thomas theory for a general toric Calabi-Yau manifold [8], the relation (1.2) is indeed true for some choice of the stability condition, as expected in [2]. In general, the topological string partition function and the partition function of the crystal melting model are not identical, but their relation involves the wall crossing,
$$ Z_{\text{crystal melting}} \sim Z_{top} \quad (\text{modulo wall crossings}). \tag{1.3} $$

¹ See [11–16] for further developments.
² See [17–24] for further developments.
³ See [29,30] for reviews of the quiver gauge theory and the brane tiling method.

This does not contradict the result in [2,5] since there is no wall crossing phenomenon for $C^3$. In general, however, a proper understanding of the relation between the topological string theory and the crystal melting requires that we take the wall crossing phenomena into account.

In Sect. 2, we will summarize the computation of D brane bound states from the gauge theory perspective. In Sect. 3, we will discuss how this is related to the recent mathematical results on the non-commutative Donaldson-Thomas invariants. In Sect. 4, we will formulate the statistical model of crystal melting for a general toric Calabi-Yau manifold. The final section is devoted to a summary of our results and a discussion of the wall crossing phenomena. The Appendix explains the equivalence of a configuration of molten crystal with a perfect matching of the bipartite graph.

2. Quiver Quantum Mechanics

In the classic paper by Douglas and Moore [31], it was shown that the low energy effective theories of D branes on some orbifolds are described by gauge theories associated to quiver diagrams. Subsequently, this result has been generalized to an arbitrary non-compact toric Calabi-Yau threefold. A toric Calabi-Yau threefold X is a fiber bundle of $T^2 \times R$ over $R^3$, where the fibers are special Lagrangian submanifolds. The toric diagram tells us where and how the fiber degenerates. For a given X and a set of D0 and D2 branes on X, the following procedure determines the field content and superpotential of the gauge theory on the branes. We will add a single D6 brane to the system later in this section.

2.1. Quiver diagram and field content. The low energy gauge theory is a one-dimensional theory given by dimensional reduction of an N = 1 supersymmetric gauge theory in four dimensions. The field content of the theory is encoded in a quiver diagram, which is determined from the toric data and the set of D branes, as described in the following.

A quiver diagram $Q = (Q_0, Q_1)$ consists of a set $Q_0$ of nodes, with a rank $N_i > 0$ associated to each node $i \in Q_0$, and a set $Q_1$ of arrows connecting the nodes. The corresponding gauge theory has a vector multiplet of gauge group $U(N_i)$ at each node i. There is also a chiral multiplet in the bifundamental representation associated to each arrow connecting a pair of nodes.

In the following, we will explain how to identify the quiver diagram. The reader may want to consult Fig. 1, which describes the procedure for the Suspended Pinched Point singularity, which is a Calabi-Yau manifold defined by the toric diagram in Fig. 1-(a) or equivalently by the equation,
$$ xy = zw^2, \tag{2.1} $$
in $C^4$.

Fig. 1. (a) The toric diagram for the Suspended Pinched Point singularity. (b) The configuration of D2 and NS5 branes after the T-duality on $T^2$. The green exterior lines are periodically identified. The red lines representing NS5 branes separate the fundamental domain into several domains. The T-dual of D0 branes wrap the entire fundamental domain, and fractional D2 branes are suspended between the red lines. The white domains contain D2 branes only. In each shaded domain, there is an additional NS5 brane. There are two types of shades depending on the NS5 brane orientation. The white domains are connected by arrows through the vertices, and the directions of the arrows are determined by the orientation of the NS5 branes. (c) The quiver diagram obtained by replacing the white domains of (b) by the nodes

To identify the quiver diagram, we take T-dual of the toric Calabi-Yau manifold along the $T^2$ fibers [26,32]. The fibers degenerate at loci specified by the toric diagram, and the T-duality replaces the singular fibers by NS5 branes [33]. Some of these NS5 branes divide $T^2$ into domains as shown in the red lines in Fig. 1-(b) [30,34–36]. The D0 branes become D2 branes wrapping the whole $T^2$. The original D2 branes are still D2 branes after the T-duality, but each of them is in a particular domain of $T^2$ suspended between NS5 branes. In addition, there are some domains that contain NS5 branes stretched two-dimensionally in parallel with D2 branes.⁴ Let us denote the domains without NS5 branes by $i \in Q_0$ and the domains with NS5 branes by $a \in I$. In Fig. 1-(b), the $Q_0$-type domains are shown in white, and the I-type domains are shown with shade. There are two types of shades, corresponding to two different orientations of NS5 branes. This distinction will become relevant when we discuss the superpotential.

The $Q_0$-type domains are identified with nodes of the quiver diagram since open strings ending on them can contain massless excitations. The rank $N_i$ of the node $i \in Q_0$ is the number of D2 branes in the corresponding domain. On the other hand, I-type domains give rise to the superpotential constraints as we shall see below. Though two domains $i, j \in Q_0$ never share an edge, they can touch each other at a vertex. In that case, open strings going between i and j contain massless modes. We draw an arrow from i → j or i ← j depending on the orientation of the massless open string modes, which is determined by the orientation of NS5 branes. Note that the quiver gauge theories we consider in this paper are in general chiral. This completes the specification of the quiver diagram.

⁴ The NS5 branes are also filling the four-dimensional spacetime $R^{1,3}$ while the D2 branes are localized along a timelike path in four dimensions.



As another example, the quiver diagram for the conifold geometry has two nodes connected by two sets of arrows in both directions. The ranks of the gauge groups are $n_0$ and $n_0 - n_2$, where $n_0$ and $n_2$ are the numbers of D0 and D2 branes. The gauge theory is a dimensional reduction of the Klebanov-Witten theory [37] when $n_2 = 0$ and the Klebanov-Strassler theory [38] when $n_2 > 0$.

2.2. Superpotential and brane tiling. Each domain $a \in I$ containing an NS5 brane is surrounded by domains $i_1, i_2, \ldots, i_n \in Q_0$ without NS5 branes, as in Fig. 1-(b). By studying the geometry T-dual to X in more detail, one finds that the domain is contractible. Since open strings can end on the domains $i_1, i_2, \ldots, i_n$, the domain a can give rise to worldsheet instanton corrections to the superpotential. This fact, combined with the requirement that the moduli space of the gauge theory agrees with the geometric expectation for D branes on X, determines the superpotential. Depending on the NS5 brane orientation, the I-type domains are further classified into two types, $I_+$ and $I_-$, and thus the regions of torus are divided into three types $Q_0$, $I_+$ and $I_-$. Such a brane configuration, or a classification of regions of $T^2$, is called the brane tiling.⁵ In Fig. 1-(b), the brane tiling is shown by the two different shades. The superpotential W is then given by
$$ W = \sum_{a \in I_+} \mathrm{Tr}\Biggl(\prod_{q=1}^{n_a^+} A_{i^{(a)}_{q,+},\, i^{(a)}_{q+1,+}}\Biggr) - \sum_{a \in I_-} \mathrm{Tr}\Biggl(\prod_{q=1}^{n_a^-} A_{i^{(a)}_{q,-},\, i^{(a)}_{q+1,-}}\Biggr), \tag{2.2} $$
where the domain $a \in I_\pm$ are surrounded by the arrows $i^{(a)}_{1,\pm} \to i^{(a)}_{2,\pm} \to \cdots \to i^{(a)}_{n_a^\pm+1,\pm} \to i^{(a)}_{1,\pm}$. For each arrow $i^{(a)}_{q,\pm} \to i^{(a)}_{q+1,\pm}$ ($1 \le q \le n_a^\pm$), the corresponding bifundamental field is denoted by $A_{i^{(a)}_{q,\pm},\, i^{(a)}_{q+1,\pm}}$. This formula has been tested in many examples. In particular, it has been shown that the formula reproduces the toric Calabi-Yau manifold X as the moduli space of the quiver gauge theory [40].

In the literature of brane tiling, bipartite graphs are often used in place of brane configurations as in Fig. 1-(b). A bipartite graph is a graph consisting of vertices colored either black or white and edges connecting black and white vertices. Since bipartite graphs will also play roles in the following sections, it would be useful to explain how it is related to our story so far. For a given brane configuration, we can draw a bipartite graph on $T^2$ as follows. In each domain in $I_+$ ($I_-$), place a white (black) vertex. Draw a line connecting a white vertex in a domain $i \in I_+$ and a black vertex in a neighboring domain $j \in I_-$. The resulting graph is bipartite. See Fig. 2 for the comparison of the brane configuration and the bipartite graph in the case of the Suspended Pinched Point singularity. We can turn this into a form that is more commonly found in the literature, for example in [26], by choosing a different fundamental region as in Fig. 3.
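The structure of (2.2), a signed sum of traces of ordered products of bifundamental matrices around each shaded domain, can be sketched numerically. The example below uses the conifold quiver mentioned at the start of this passage, with the commonly quoted Klebanov-Witten form $W = \mathrm{Tr}(A_1B_1A_2B_2) - \mathrm{Tr}(A_1B_2A_2B_1)$. The assignment of these two terms to one $I_+$ and one $I_-$ domain, the field names, the overall normalization and the function names are assumptions made for illustration; they are not read off from Fig. 1, which shows a different geometry.

```python
import numpy as np


def superpotential(fields, plus_faces, minus_faces):
    """Evaluate W = sum_{a in I+} Tr(prod A) - sum_{a in I-} Tr(prod A).

    fields      : dict mapping an arrow name to its matrix; the matrix of an
                  arrow i -> j has shape (N_i, N_j)
    plus_faces  : list of closed arrow sequences surrounding the I+ domains
    minus_faces : list of closed arrow sequences surrounding the I- domains
    """
    def face_trace(face):
        product = fields[face[0]]
        for name in face[1:]:
            product = product @ fields[name]
        return np.trace(product)

    return (sum(face_trace(f) for f in plus_faces)
            - sum(face_trace(f) for f in minus_faces))


# Hypothetical face data for the conifold: arrows A1, A2 : 0 -> 1 and B1, B2 : 1 -> 0.
CONIFOLD_PLUS = [("A1", "B1", "A2", "B2")]
CONIFOLD_MINUS = [("A1", "B2", "A2", "B1")]


def conifold_superpotential(fields):
    return superpotential(fields, CONIFOLD_PLUS, CONIFOLD_MINUS)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n0, n1 = 2, 3  # sample ranks of the two gauge groups
    fields = {"A1": rng.normal(size=(n0, n1)), "A2": rng.normal(size=(n0, n1)),
              "B1": rng.normal(size=(n1, n0)), "B2": rng.normal(size=(n1, n0))}
    print("W =", conifold_superpotential(fields))
```

Because each term is the trace of a closed loop of arrows, W is invariant under the complexified gauge transformations $A \to g_0 A g_1^{-1}$, $B \to g_1 B g_0^{-1}$, and it vanishes identically when both ranks equal one.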

Fig. 2. The correspondence between the brane configuration on $T^2$ and the bipartite graph. The white (black) vertex of the bipartite graph corresponds to the region $I_+$ ($I_-$) in light (dark) shade. The edge of the bipartite graph corresponds to an intersection of $I_-$ and $I_+$. From this construction, it automatically follows that the graph so obtained is bipartite

Fig. 3. By choosing a different fundamental region of $T^2$, we find a bipartite graph which is more commonly found in the literature

⁵ In the literature the word brane tiling refers to the bipartite graph explained below. Here the word brane tiling refers to a brane configuration as shown in Fig. 1-(b). Such a graph is called the fivebrane diagram in [39].

2.3. D-term constraints and the moduli space. The F-term constraints are given by derivatives of the superpotential, which can be determined as in the above. The moduli space of solutions to the D-term constraints is then described by a set of gauge invariant observables divided out by the complexified gauge group $G_C$ [41]. The theorem by King [42] states that an orbit of $G_C$ contains a solution to the D-term conditions if and only if we start with a point that satisfies the θ-stability, a condition defined in the next section. Thus, we can think of the moduli space as a set of solutions to the F-term constraints obeying the θ-stability condition, modulo the action of $G_C$.

2.4. Adding a single D6 brane. To make contact with the Donaldson-Thomas theory, we need to include one D6 brane. Since the D6 brane fills the entire Calabi-Yau manifold, which is non-compact, it behaves as a flavor brane. In the low energy limit, the open string between the D6 brane and another D brane gives rise to one chiral multiplet in the fundamental representation for the D brane on the other end. The D6 brane then enlarges the quiver diagram by one node and one arrow from the new node. To understand why we only get one arrow from the D6 brane, let us take T-duality along the $T^2$ fiber again. The D6 brane is mapped into a D4 brane which is a point in some region in $T^2$. This means that we only have one new arrow from the new node corresponding to the D6 brane to the node corresponding to the D2 branes in the region. See [23,43] for related discussion in the literature.

3. Non-commutative Donaldson-Thomas Theory

In the previous section, we discussed how to construct the moduli space of solutions to the F-term and D-term constraints in the quiver gauge theory corresponding to a toric



Calabi-Yau manifold X with a set of D0/D2 branes and a single D6 brane. In this section, we will review and interpret the mathematical formulation of the non-commutative Donaldson-Thomas invariant in [9,10] for X. We find that it is identical to the Euler number of the gauge theory moduli space.

3.1. Path algebra and its module. For the purpose of this paper, modules are the same as representations. Consider the set of all open paths on the quiver diagram $Q = (Q_0, Q_1)$. By introducing a product as an operation to join the head of a path to the tail of another (the product is supposed to vanish if the head and the tail do not match on the same node) and by allowing formal sums of paths, the set of open oriented paths can be made into an algebra $\mathbb{C}Q$ called the path algebra. We would like to point out that there is a one-to-one correspondence between a representation of the path algebra and a classical configuration of bifundamental fields of the quiver gauge theory. Suppose there is a representation M of the path algebra. For each node $i \in Q_0$, there is a trivial path $e_i$ of zero length that begins and ends at i. It is a projection, $(e_i)^2 = e_i$. Since every path starts at some node i and ends at some node j, the sum $\sum_i e_i$ acts as the identity on the path algebra. Therefore, $M = \oplus_{i \in Q_0} M_i$, where $M_i = e_i M$. Let us write $N_i = \dim M_i$. For each path from i to j, one can assign a map from $M_i$ to $M_j$. In particular, there is an $N_i \times N_j$ matrix for each arrow $i \to j \in Q_1$ of the quiver diagram. By identifying this matrix as the bifundamental field associated to the arrow $i \to j$, we obtain a classical configuration of bifundamental fields with the gauge group $U(N_i)$ at the node i. By reversing the process, we can construct a representation of the path algebra for each configuration of the bifundamental fields.

3.2. F-term constraints and factor algebra A. Let us turn to the F-term constraints.
Since the bifundamental fields of the quiver gauge theory form a representation of the path algebra, the F-term equations give relations among generators of the path algebra. It is natural to consider the ideal F generated by the F-term equations and define the factor algebra $A = \mathbb{C}Q/F$. The bifundamental fields obeying the F-term constraints then generate a representation of this factor algebra. Namely, classical configurations of the quiver gauge theory obeying the F-term constraints are in one-to-one correspondence with A-modules. As an example, the algebra A for the conifold geometry contains an idempotent ring $\mathbb{C}[e_1, e_2]$ generated by two elements and is given by the following four generators and relations:$^{6}$

$A = \mathbb{C}[e_1, e_2]\langle a_1, a_2, b_1, b_2 \rangle / (a_1 b_i a_2 = a_2 b_i a_1,\; b_1 a_i b_2 = b_2 a_i b_1)_{i=1,2}.$ (3.2)
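As a minimal illustrative sketch (not part of the constructions in [9,10]), the path product of Sect. 3.1 can be written out for the two-node conifold quiver. The arrow directions (each a_i from node 1 to node 2, each b_i from node 2 to node 1), the class `Path` and the helpers `trivial`, `arrow` and `compose` are assumptions introduced only for illustration, and the F-term relations in (3.2) are not imposed here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical labelling of the conifold quiver: a1, a2 run from node 1 to
# node 2, and b1, b2 run from node 2 back to node 1.
ARROWS = {"a1": (1, 2), "a2": (1, 2), "b1": (2, 1), "b2": (2, 1)}


@dataclass(frozen=True)
class Path:
    """An oriented path, stored as its end nodes and its sequence of arrows."""
    start: int
    end: int
    arrows: Tuple[str, ...] = ()


def trivial(node: int) -> Path:
    """The zero-length path e_i sitting at a node."""
    return Path(node, node)


def arrow(name: str) -> Path:
    """The length-one path given by a single arrow."""
    start, end = ARROWS[name]
    return Path(start, end, (name,))


def compose(p: Optional[Path], q: Optional[Path]) -> Optional[Path]:
    """Return 'p followed by q'. None stands for the zero element, which is
    the result whenever the end of p is not the start of q."""
    if p is None or q is None or p.end != q.start:
        return None
    return Path(p.start, q.end, p.arrows + q.arrows)


e1, e2 = trivial(1), trivial(2)
a1, a2, b1 = arrow("a1"), arrow("a2"), arrow("b1")
```

In this sketch the trivial paths behave as projections (`compose(e1, e1)` returns `e1`, while `compose(e1, e2)` is zero), and for any path exactly one `e_i` survives when composed on either side, which is the sense in which the sum of the `e_i` acts as the identity.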

Each A-module for this algebra corresponds to a choice of ranks of the gauge groups and a configuration of the bifundamental fields $a_i$, $b_i$ satisfying the F-term constraints.

6 The center Z(A) of this algebra A is generated by $x_{ij} = a_i b_j + b_j a_i$ (i, j = 1, 2), and is given by

$Z(A) = \mathbb{C}[x_{11}, x_{12}, x_{21}, x_{22}]/(x_{11} x_{22} - x_{12} x_{21}),$ (3.1)

which is the ring of functions of the conifold singularity.

F-term constraints have a nice geometric interpretation on the quiver diagram, which we will find useful in the next section. We observe that each bifundamental field appears exactly twice in the superpotential (2.2), with coefficients of opposite sign. By taking a derivative of the superpotential with respect to a bifundamental field corresponding to a given arrow, the resulting F-term constraint states that the product of bifundamental fields around a face of the quiver on $T^2$ on one side of the arrow is equal to that around the face on the other side. See Fig. 4 for an example. Therefore, when we have a product of bifundamental fields along a path, any loop on the path can be moved along the path and the resulting product is F-term equivalent to the original one. In [10], it is shown that for any pair of nodes $i, j \in Q_0$, we can find a shortest path $v_{i,j}$ from i to j such that any other path a from i to j is F-term equivalent to $v_{i,j}\,\omega^n$ with a non-negative integer n, where ω is a loop around one face of the quiver diagram. It does not matter where the loop ω is inserted along the path $v_{i,j}$ since different insertions are all F-term equivalent. This means that any path is characterized by the integer n and the shortest path $v_{i,j}$.

Fig. 4. Representation of F-term constraints on the quiver diagram on $T^2$. In this example, if we denote by $X_{AB}$ the bifundamental field corresponding to an arrow starting from vertex A and ending at B, etc., then the superpotential (2.2) contains the terms $W = -\mathrm{tr}(X_{AB} X_{BC} X_{CA}) + \mathrm{tr}(X_{AB} X_{BD} X_{DE} X_{EA})$, and the F-term condition for $X_{AB}$ (multiplied by $X_{AB}$) says that the product of bifundamental fields along the triangle ABC and that along the square ABDE are the same

In the next subsection, we will impose the D-term constraints on the space of finitely generated left A-modules, mod A. Before doing this, however, it is instructive to discuss topological aspects of mod A by considering its bounded derived category$^{7}$ $D^b(\mathrm{mod}\,A)$. In mathematics, the algebra A gives the so-called “non-commutative crepant resolution” [45]. For singular Calabi-Yau manifolds such as X, a crepant resolution means a resolution that preserves the Calabi-Yau condition.$^{8}$ For a crepant resolution Y of X, we have the following equivalence of categories$^{9}$:

$D^b(\mathrm{coh}(Y)) \cong D^b(\mathrm{mod}\,A),$ (3.3)

where $D^b(\mathrm{coh}(Y))$ is the bounded derived category of coherent sheaves on the crepant resolution Y, and $D^b(\mathrm{mod}\,A)$ is the bounded derived category of finitely generated left A-modules. Equation (3.3) is also interesting from the physics viewpoint. Since $D^b(\mathrm{coh}(Y))$ gives a topological classification of A branes on the resolved space Y, the equivalence means that $D^b(\mathrm{mod}\,A)$ also classifies D branes, which is consistent with our interpretation above that A-modules are in one-to-one correspondence with configurations of bifundamental fields obeying the F-term constraints.

7 See [44] for an introductory explanation of derived categories in the context of string theory.

8 Mathematically, we mean a resolution $f : Y \to X$ such that $\omega_Y = f^* \omega_X$, where $\omega_X$ and $\omega_Y$ are canonical bundles of X and Y, respectively. For the class of toric Calabi-Yau threefolds, the existence of a crepant resolution is known, and different crepant resolutions related by flops are equivalent in derived categories [46].

9 This is well-known in the case of the conifold (cf. [47]). For general toric Calabi-Yau threefolds, this is not yet mathematically proven as far as the authors are aware, although there are proofs in several examples [14,48,49].

We should note that the paper [10], which computes the non-commutative Donaldson-Thomas invariants for general toric Calabi-Yau manifolds, requires a set of conditions on brane tilings, namely on the superpotential. We find that most of their conditions (specified in Lemma 3.5 and Conditions 4.12) are automatically satisfied for any quiver gauge theory for D branes on a general toric Calabi-Yau manifold. We have not been able to prove that Condition 5.3 also holds in general, but it is satisfied in all the examples we know.

3.3. D-term constraints and θ-stability. We saw that the derived category $D^b(\mathrm{mod}\,A)$ of A-modules gives the topological classification of D branes in the toric Calabi-Yau manifold X. To understand the moduli space of D branes, however, we also need to understand implications of the D-term constraints. This is where the θ-stability comes in.$^{10}$ Let $\theta \in \mathbb{N}^{Q_0}$ be a vector whose components are real numbers. Consider an A-module M, and recall that this M is decomposed as $M = \oplus_{i \in Q_0} M_i$ with $M_i = e_i M$. The module M is called θ-stable if

$\sum_{i \in Q_0} \theta_i (\dim e_i M') > 0$ (3.4)

for every submodule $M'$ of M.$^{11}$ When > is replaced by ≥, the module M is called θ-semistable. In the language of gauge theory, the stability condition (3.4) is required by the D-term conditions. Some readers might wonder why the D-term conditions, which are equality relations, can be replaced by an inequality such as (3.4). In fact, a similar story holds for the Hermitian Yang-Mills equations. There, instead of solving the Donaldson-Uhlenbeck-Yau equations, we can consider holomorphic vector bundles with a suitable stability condition, the so-called µ-stability or Mumford-Takemoto stability [52,53]. As we mentioned at the end of Sect. 2, it is known that a configuration of bifundamental fields is mapped to a solution to the D-term equations by a complexified gauge transformation in $G_{\mathbb{C}}$ if and only if the configuration is θ-stable. Since each A-module M gives a representation of $G_{\mathbb{C}} = \prod_{i \in Q_0} GL(N_i, \mathbb{C})$, where $GL(N_i, \mathbb{C})$ is represented by $M_i = e_i M$ at each node, each A-module specifies a particular $G_{\mathbb{C}}$ orbit. Thus, finding a θ-stable module is the same as solving the D-term conditions. Up to this point we have not specified the value of θ. Physically, the θ's correspond to the FI parameters, which are needed to write down the D-term equations [50]. Although the Euler number of the space of θ-stable A-modules does not change under infinitesimal deformation of θ, it does change along the walls of marginal stability [13,14]. The non-commutative Donaldson-Thomas invariant defined by [9] is in a particular chamber in the space of θ's. Following [10], we hereafter take θ = (0, 0, . . . , 0). We will comment more about this issue in the final section.

3.4. D6 brane and compactification of moduli space. We have found that solutions to the F-term and D-term conditions in the quiver gauge theory are identified with θ-stable A-modules. We want to understand the moduli space of such modules and compute its Euler number.
10 The θ-stability is a special limit of Π-stability as discussed in [50,51].

11 In some literature, an additional condition $\sum_{i \in Q_0} \theta_i (\dim M_i) = 0$ is imposed for a choice of θ. This is trivially satisfied for the choice θ = (0, 0, . . . , 0) we choose below.

Since D brane charges correspond to the ranks of the gauge groups, we consider the moduli space of θ-stable modules with dimension $\dim M_i = N_i$ ($i \in Q_0$), which we denote by $\mathcal{M}^N(A)$. To compute its Euler number, we need to address the fact that the moduli space of stable A-modules is not always compact. In the mathematics literature, the necessary compactification is performed by enlarging the quiver diagram by adding one more node in the following way. Let us fix an arbitrary vertex $i_0$, and define a new quiver $\hat{Q} = (\hat{Q}_0, \hat{Q}_1)$ by

$\hat{Q}_0 = Q_0 \cup \{*\}, \qquad \hat{Q}_1 = Q_1 \cup \{a_* : * \to i_0\}.$ (3.5)
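The enlargement (3.5) is simple enough to sketch in a few lines. The representation of a quiver as a set of nodes plus a dictionary of named arrows, the function name `extend_quiver` and the labels `"*"` and `"a_*"` are assumptions made for illustration only.

```python
from typing import Dict, Hashable, Set, Tuple

Arrows = Dict[str, Tuple[Hashable, Hashable]]  # arrow name -> (tail, head)


def extend_quiver(nodes: Set[Hashable], arrows: Arrows, i0: Hashable,
                  star: Hashable = "*", new_arrow: str = "a_*"
                  ) -> Tuple[Set[Hashable], Arrows]:
    """Return (Q0 + {*}, Q1 + {a_*: * -> i0}) without modifying the inputs."""
    if i0 not in nodes:
        raise ValueError("the reference vertex i0 must be a node of the quiver")
    if star in nodes or new_arrow in arrows:
        raise ValueError("the labels for the new node and arrow are already in use")
    extended_nodes = set(nodes) | {star}
    extended_arrows = dict(arrows)
    extended_arrows[new_arrow] = (star, i0)  # the single arrow leaving the new node
    return extended_nodes, extended_arrows


# Example with a two-node quiver (arrow directions are a hypothetical choice).
nodes = {1, 2}
arrows = {"a1": (1, 2), "a2": (1, 2), "b1": (2, 1), "b2": (2, 1)}
ext_nodes, ext_arrows = extend_quiver(nodes, arrows, i0=1)
```

The extended quiver has exactly one more node and one more arrow than the original, matching the single new node and single new arrow associated with the D6 brane in Sect. 2.4.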

Namely, we have added one new vertex ∗ and one arrow $* \to i_0$ to obtain the extended quiver diagram $\hat{Q}$. As in the previous case for Q, we can define the path algebra $\mathbb{C}\hat{Q}$, the ideal $\hat{F}$ generated in $\mathbb{C}\hat{Q}$ by F, and the factor algebra $\hat{A} = \mathbb{C}\hat{Q}/\hat{F}$. Define $\hat{\theta} \in \mathbb{N}^{Q_0+1}$ by $\hat{\theta} = (\theta, 1)$ and define $\hat{\theta}$-stable and semistable $\hat{A}$-modules using the stability parameter $\hat{\theta}$. It is shown in Lemma 2.3 of [10] that $\hat{\theta}$-semistable $\hat{A}$-modules are always $\hat{\theta}$-stable, and the moduli space $\hat{\mathcal{M}}^N_{i_0}(A)$ of $\hat{\theta}$-stable modules with specified dimension vector $N \in \mathbb{N}^{Q_0+1}$ is compact.

Adding the extra node allows us to compactify the moduli space. In the language of D branes, this corresponds to adding a single D6 brane filling the entire Calabi-Yau manifold, which is necessary to interpret the whole system as a six-dimensional U(1) gauge theory related to the Donaldson-Thomas theory. As we mentioned in Sect. 2, the D6 brane serves as a flavor brane and adds an extra node exactly in the way described in the above paragraph. Note that, in the above paragraph, the ideal $\hat{F}$ is generated by the original ideal F. In the quiver gauge theory, this corresponds to the fact that the flavor brane does not introduce a new gauge invariant operator to modify the superpotential. In this way, we arrive at the definition of the non-commutative Donaldson-Thomas invariant as the Euler number $\chi(\hat{\mathcal{M}}^N_{i_0}(A))$ of cohomologies. With our identification of $\hat{\mathcal{M}}^N_{i_0}(A)$ with the moduli space of solutions to the F-term and D-term conditions, $\chi(\hat{\mathcal{M}}^N_{i_0}(A))$ computes the Witten index of bound states of D0 and D2 branes bound on a single D6 brane (ignoring the trivial degrees of freedom corresponding to the center of mass of D branes in $\mathbb{R}^{1,3}$).

We have chosen a specific vertex $i_0$ to define the non-commutative Donaldson-Thomas invariant. The $i_0$ dependence drops out in simple cases such as $\mathbb{C}^3$ and the conifold, but in general $\chi(\hat{\mathcal{M}}^N_{i_0}(A))$ depends on the choice of $i_0$. We note that the quiver gauge theory discussed in Sect. 2 also has an apparent dependence on $i_0$. There $i_0$ corresponds to the $Q_0$-type domain where the D6 brane is located after the T-duality. Since the fundamental group of the toric Calabi-Yau manifold X is trivial, there is no moduli intrinsic to the D6 brane before taking the T-duality. Thus, we expect that the apparent $i_0$ dependence on the gauge theory side should disappear with a proper treatment of the T-duality. It would be interesting to study this point further.$^{12}$

12 In the example discussed in [14], the result is dependent on the choice of $i_0$ and a particular vertex should be chosen in order to match with the category of perverse coherent sheaves. We thank Yukinobu Toda for discussions on this point.

4. Crystal Melting In this section, we define a statistical mechanical model of crystal melting and show that the model reproduces the counting of BPS bound states of D branes. Using the quiver diagram and the superpotential of the gauge theory, we define a natural crystalline structure in three dimensions. We specify a rule to remove atoms from the crystal and show



that each molten crystal corresponds to a particular BPS bound state of D branes. We use the result of [10] to show that all the relevant BPS states are counted in this way.

4.1. Crystalline structure. Mathematically, the three-dimensional crystal we define here is equivalent to a basis for $A e_{i_0}$, where A is the factor algebra $A = \mathbb{C}Q/F$ of the path algebra $\mathbb{C}Q$ divided by the ideal F generated by the F-term constraints and $e_{i_0}$ is the path of zero length at the reference node $i_0$, which is also the projection operator to the space of paths starting at $i_0$. Colloquially, the crystal is a set of paths starting at $i_0$ modulo the F-term relations. As we shall see later, it corresponds to a BPS state corresponding to a single D6 brane with no D0 and D2 charges. We interpret $A e_{i_0}$ in terms of a three-dimensional crystal as follows. The crystal is composed of atoms piled up on nodes in the universal covering $\tilde{Q}$ on $\mathbb{R}^2$. By using the projection $\pi : \tilde{Q} \to Q$, each atom is assigned a color corresponding to the node in the original quiver diagram Q. The arrows of the quiver diagram determine the chemical bonds between atoms. We start by putting one atom on top of the reference node $i_0$. Next, attach an atom at an adjacent node $j \in \tilde{Q}_0$ that is connected to $i_0$ by an arrow going from $i_0$ to j. The atoms at such nodes are placed lower than the atom at $i_0$. In the next step, start with the atoms we just placed and follow arrows emanating from them to attach more atoms at the heads of the arrows. As we repeat this procedure, we may return to a node where an atom is already placed. In such a case, we use the following rule. As we explained in Sect. 3.2, modulo F-term constraints, any oriented path a starting at $i_0$ and ending at j can be expressed as $v_{i_0,j}\,\omega^n$, where ω is the loop around a face in the quiver diagram and $v_{i_0,j}$ is one of the shortest paths from $i_0$ to j. This defines an integer h(a) = n for each path a.
The rule of placing atoms is that, if a path a takes $i_0$ to j and if h(a) = n, we place an atom at the n-th place under the first atom on the node j. If there is already an atom at the n-th place, we do not place a new atom. By repeating this procedure, we continue to attach atoms and construct a pyramid consisting of infinitely many atoms. Since atoms are placed following paths from $i_0$ modulo the F-term relations, it is clear that atoms in the crystal are in one-to-one correspondence with basis elements of $A e_{i_0}$. Note that by construction the crystal has a single peak at the reference node $i_0$. This defines a crystalline structure for an arbitrary toric Calabi-Yau manifold. In particular, it reproduces the crystal for $\mathbb{C}^3$ discussed in [2,5], and the one for the conifold in [9]. See Fig. 5 for the crystalline structure corresponding to the Suspended Pinched Point singularity. In this example, the ridge of the crystal (shown as blue lines in Fig. 5) coincides with the (p, q)-web of the toric geometry. As we will discuss later, this is a general property of our crystal.

4.2. BPS state and molten crystal. In the forthcoming discussions, the crystal defined above will be identified with a single D6 brane with no D0 and D2 charges. Bound states with non-zero D0 and D2 charges are obtained by removing atoms following the rule specified below. In [9,10], the Donaldson-Thomas invariants $\chi(\hat{\mathcal{M}}^N_{i_0}(A))$ are computed by using the $U(1)^{\otimes 2}$ symmetry of the moduli space $\hat{\mathcal{M}}^N_{i_0}$ corresponding to translational invariance of $T^2$. By the standard localization techniques, the Euler number can be evaluated at the fixed point set of the moduli space under the symmetry. Correspondingly, in the gauge



Fig. 5. Starting from the universal cover Q˜ of quiver Q shown on the left, we can construct a crystal on the right. Each atom carries a color corresponding to a node in Q, and they are connected by arrows in Q˜ 1 . The green arrows represent arrows on the surfaces of the crystal, whereas the red ones are not. In the case of the Suspended Pinched Point singularity, the atoms come with 3 colors (white, black and gray), corresponding to the 3 nodes of the original quiver diagram Q on T2 shown in Fig. 1

theory side, BPS states counted by the index are those that are invariant under the global $U(1)^{\otimes 2}$ symmetry acting on bifundamental fields preserving the F-term constraints since those do not have extra zero modes and do not contribute to the index. We are interested in counting such BPS states. In order for a molten crystal to correspond to $U(1)^{\otimes 2}$ invariant $\hat{\theta}$-stable A-modules, we need to impose the following rule to remove atoms from the crystal. Let Ω be a finite set of atoms to be removed from the crystal.

The Melting Rule. If aα is in Ω for some $a \in A$, then α should also be in Ω.

Since atoms of the crystal correspond to elements of $A e_{i_0}$, we used the natural action of A on $A e_{i_0}$ to define aα in the above. This means that crystal melting starts at the peak at $i_0$ and takes place following paths in $A e_{i_0}$. An example of a molten crystal satisfying this condition is shown in Fig. 6. The melting rule means that a complement I of the vector space spanned by Ω in $A e_{i_0}$ gives an ideal of A. To see this, we just need to take the contraposition of the melting rule. It states: for any $\beta \in I$ and for any $a \in A$, aβ is also in I. Generally speaking, an ideal of an algebra defines a module. To see this, consider a vector $|I\rangle$ which is annihilated by all elements of the ideal I. From $|I\rangle$, we can generate a finite dimensional representation of the algebra A by acting with elements of A on it. However, the converse is not always true. Fortunately, when modules are $\hat{\theta}$-stable and invariant under the $U(1)^{\otimes 2}$ symmetry, it was shown in [9,10] that there is a one-to-one correspondence between ideals and modules. It follows that our molten crystal configurations are also in one-to-one correspondence with A-modules and therefore with relevant BPS bound states of D branes. This proves that the statistical model of crystal melting computes the index of D brane bound states.
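To make the melting rule concrete, the following sketch specializes it to the simplest case, $\mathbb{C}^3$. The assumptions, which are ours and are made only for illustration, are that each atom is labelled by the exponents (i, j, k) ≥ 0 of a monomial $x^i y^j z^k$ and that the algebra acts by multiplication by x, y and z. The function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Atom = Tuple[int, int, int]  # exponents (i, j, k) of the monomial x^i y^j z^k

STEPS = ((1, 0, 0), (0, 1, 0), (0, 0, 1))  # multiplication by x, y, z


def satisfies_melting_rule(omega: Iterable[Atom]) -> bool:
    """Check that whenever a removed atom is a*alpha (a = x, y or z),
    the atom alpha has been removed as well."""
    removed = set(omega)
    for (i, j, k) in removed:
        for (di, dj, dk) in STEPS:
            pred = (i - di, j - dj, k - dk)
            if min(pred) >= 0 and pred not in removed:
                return False
    return True


def molten_crystals(n: int) -> Set[FrozenSet[Atom]]:
    """Enumerate all sets of n removed atoms obeying the melting rule,
    growing them one atom at a time starting from the unmolten crystal."""
    level: Set[FrozenSet[Atom]] = {frozenset()}
    for _ in range(n):
        next_level: Set[FrozenSet[Atom]] = set()
        for omega in level:
            # Only the peak or a neighbour of a removed atom can melt next.
            candidates = {(0, 0, 0)}
            for (i, j, k) in omega:
                for (di, dj, dk) in STEPS:
                    candidates.add((i + di, j + dj, k + dk))
            for atom in candidates - omega:
                grown = omega | {atom}
                if satisfies_melting_rule(grown):
                    next_level.add(grown)
        level = next_level
    return level


counts = [len(molten_crystals(n)) for n in range(5)]
```

Under these assumptions the allowed sets Ω are exactly the three-dimensional Young diagrams, so `counts` reproduces the numbers of plane partitions with 0 to 4 boxes, namely 1, 1, 3, 6, 13. The brute-force enumeration is only practical for small n.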



Fig. 6. Example of a molten crystal and its complement Ω. In this example Ω contains 12 atoms, one hidden behind an atom on the reference point represented by a blue point. It is easy to check that Ω satisfies the melting rule mentioned in the text

It would be instructive to understand explicitly how each molten crystal configuration corresponds to a BPS bound state. Starting from a molten crystal specified by Ω, prepare a one-dimensional vector space $V_\alpha$ with basis vector $e_\alpha$ for each atom $\alpha \in \Omega$. For each arrow a of $\tilde{Q}$, define the action of a on $V_\alpha$ by $a(e_\alpha) = e_\beta$ when the arrow a begins from α and ends at another atom $\beta \in \Omega$. Otherwise $a(e_\alpha)$ is defined to be zero. Since an arbitrary path is generated by concatenation of arrows, we have defined an action of $a \in A$ on each $V_\alpha$. By linearly extending the action of a onto the total space $M = \oplus_{\alpha \in \Omega} V_\alpha$, we obtain an A-module M. There are several special properties of this module M. First, the F-term relations are automatically satisfied. This is because when there exist two different paths $a, b \in A$ starting at α and ending at β, $a(e_\alpha)$ and $b(e_\alpha)$ are both defined to be $e_\beta$. Second, by construction M is generated by the action of the algebra A on a single element $e_{i_0} \in V_{i_0}$. In such a case M is called a cyclic A-module, and by Lemma 2.3 of [10] it is also $\hat{\theta}$-stable. Third, by the cyclicity of the module it follows that M is $U(1)^{\otimes 2}$ invariant up to gauge transformations. Therefore, M is a $U(1)^{\otimes 2}$ invariant $\hat{\theta}$-stable module. It follows from the result of [10] discussed at the beginning of this section that M indeed corresponds to a bound state of D branes contributing to the Witten index. At the beginning of this subsection, we stated without explanation that the original crystal corresponds to a single D6 brane with no D0 and D2 charges, and that removing atoms corresponds to adding the D brane charges. To understand this statement, let us recall that, in Sect. 2, we started with a configuration of D0 and D2 branes on the toric Calabi-Yau manifold and took a T-duality along the fiber to arrive at the brane configuration. Thus, the number of D2 branes at each node j of the quiver diagram Q is a combination of D0 and D2 charges before T-duality.
It is this number that is equal to the rank of the gauge group at j.



By using the projection $\pi : \tilde{Q}_0 \to Q_0$, the A-module M is decomposed as $M = \oplus_{j \in Q_0} M_j$ as we saw in Sect. 3, where

$M_j = \bigoplus_{\alpha \in \Omega,\ \pi(\alpha) = j} V_\alpha.$ (4.1)

In particular, the formula (4.1) means that the rank of the gauge group $N_j = \dim M_j$ at the node j is equal to the number of atoms with the color corresponding to the node j that have been removed from the crystal. Thus, removing an atom at the node j is equivalent to adding the D0 and D2 charges carried by the node j. It is interesting to note that each atom in the crystal does not correspond to a single D0 brane or a single D2 brane, but each of them carries a specific combination of D0 and D2 charges. In the crystal melting picture, fundamental constituents are not D0 and D2 branes but the atoms. This reminds us of the quark model of Gell-Mann and Zweig, where the fundamental constituents carry combinations of quantum numbers of hadrons, as opposed to the Sakata model, where existing elementary particles such as the proton, neutron and Λ particle are chosen as fundamental constituents.

4.3. Observations on the crystal melting model. We would like to make a few observations on the statistical model of crystal melting that counts the number of BPS bound states of D branes. We have studied several examples of toric Calabi-Yau manifolds and found that the crystal structure in each case matches the toric diagram. In particular, the ridges of the crystal, when projected onto the $\mathbb{R}^2$ plane, line up with the (p, q) web of the maximally degenerate toric diagram. This phenomenon is discussed in the Appendix. There, we also explain the correspondence between molten crystal configurations and perfect matchings of the bipartite graph introduced in Sect. 2.2. So far, we have considered molten crystals that are obtained by removing a few atoms. We may call them low temperature configurations. The high temperature behavior of the model, describing bound states with large D0 and D2 charges, is also interesting. For $\mathbb{C}^3$, it was shown in [2,5] that the high temperature limit of the crystal melting model reproduces the geometric shape of the mirror manifold.
Since the high temperature limit of a general statistical model of random perfect matchings is known to be described by a certain plane algebraic curve [54], it would be interesting to understand its relation to the mirror of a general toric Calabi-Yau manifold [55]. In the last subsection, we found it useful to describe BPS bound states using ideals of the algebra A. In the case when the toric Calabi-Yau manifold is $\mathbb{C}^3$, ideals are closely related to the quantization of the toric structure as discussed in [2]. The gauge theory for $\mathbb{C}^3$ is the dimensional reduction of the N = 4 supersymmetric Yang-Mills theory in four dimensions down to one dimension, and the bifundamental fields are three adjoint fields. The F-term and D-term conditions require that they all commute with each other. Thus the chiral ring is generated by three elements x, y, z which commute with each other without any further relation. In this case, any ideal $I_\pi$ is characterized by a three-dimensional Young diagram π. Locate each box in the 3d Young diagram π by the Cartesian coordinates (i, j, k) (i, j, k = 1, 2, 3, . . .) of the corner of the box most distant from the origin, and identify π with the set of Cartesian coordinates (i, j, k) of its boxes. We can then define the ideal $I_\pi$ of the chiral ring by

$I_\pi = \{ x^{i-1} y^{j-1} z^{k-1} \mid (i, j, k) \notin \pi \}.$ (4.2)
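As a small worked illustration of (4.2) (our own example, not taken from [2]), take the 3d Young diagram π consisting of the two boxes (1, 1, 1) and (2, 1, 1). The only monomials excluded from $I_\pi$ are then $x^0 y^0 z^0 = 1$ and $x^1 y^0 z^0 = x$, so that

```latex
I_\pi = \{\, x^{i-1} y^{j-1} z^{k-1} \mid (i,j,k) \notin \{(1,1,1),\,(2,1,1)\} \,\}
      = \{\, x^2,\; y,\; z,\; xy,\; xz,\; yz,\; \dots \,\},
\qquad
\mathbb{C}[x,y,z] \,/\, \mathrm{span}(I_\pi) \;\cong\; \mathrm{span}\{1,\, x\}.
```

The span of $I_\pi$ is the ideal generated by $x^2$, y and z, and the quotient has dimension 2, equal to the number of boxes in π.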



In [2], this description was obtained by quantizing the toric geometry by using its canonical Kähler form and by identifying $x^{i-1} y^{j-1} z^{k-1}$ as states in the Hilbert space. This can be generalized to an arbitrary toric Calabi-Yau manifold X as follows. One starts with the quiver diagram corresponding to X and uses the brane tiling to identify the F-term equations. This gives the chiral ring generated by bifundamental fields obeying the F-term and D-term relations. As we saw in Sect. 4.2, each BPS bound state is related to an ideal of the chiral algebra. We expect that such ideals arise from quantization of the toric structure. BPS bound states of D branes emerging from the quantization of background geometry are reminiscent of the bubbling AdS space of [56] and Mathur’s conjecture on black hole microstates [57].

5. Summary and Discussion In this paper, we established the connection between the counting of BPS bound states of D0 and D2 branes on a single D6 brane and the non-commutative Donaldson-Thomas theory. We studied the moduli space of solutions to the F-term and D-term constraints of the quiver gauge theory which arises as the low energy limit of the brane configuration. We found a direct correspondence between the gauge theory moduli space and the space of modules of the factor algebra of the path algebra for the quiver diagram quotiented by its ideal related to the F-term constraints, subject to a stability condition to enforce the D-term constraints. Using this correspondence, we found a new description of BPS bound states of the D branes in terms of the statistical model of crystal melting. The crystalline structure is determined by the quiver diagram and the brane tiling which characterize the low energy effective theory of D branes. The crystal is composed of atoms of different colors, each of which corresponds to a node of the quiver diagram, and the chemical bond is dictated by the arrows of the quiver diagram.
BPS states are constructed by removing atoms from the crystal. The relation between the commutative and non-commutative Donaldson-Thomas invariants has been extensively discussed in the recent literature. The degeneracy of D brane bound states changes when the value of θ, used to define the stability condition, crosses codimension-one subspaces called walls of marginal stability. The jump in the degeneracy can be computed by the wall crossing formula [58–60], and if we start from a particular chamber and apply the wall crossing formula, we can obtain the value of $\chi(\hat{\mathcal{M}}^N_{i_0}(A))$ in any chamber we want. In the example of the conifold [13], wall crossing relates non-commutative Donaldson-Thomas invariants to commutative Donaldson-Thomas invariants and to new invariants defined by Pandharipande and Thomas [61]. This story is further generalized by [14] when the toric diagram has no internal lattice point. When the toric diagram contains an internal lattice point, the non-commutative Donaldson-Thomas invariant includes D4 branes, since $H_4(Y) \neq 0$. Since (commutative) Donaldson-Thomas invariants do not include D4 brane charges, the above discussion of wall crossing should be modified. It has been proven recently that the topological string theory is equivalent to the commutative Donaldson-Thomas theory for a general toric Calabi-Yau manifold [2,8]. Since the commutative Donaldson-Thomas theory counts BPS states for some choice of stability condition,

$Z_{\mathrm{BH}} = Z_{\mathrm{top}},$ (5.1)

is indeed true in some chamber. On the other hand, our result shows that the relation,

$Z_{\mathrm{BH}} = Z_{\mathrm{crystal\ melting}},$ (5.2)



holds in another chamber, where Z BH is the BPS state counting for another choice of the stability condition. Combining these two results, we find that the topological string theory and the statistical model of crystal melting are related by the wall crossing, and we have

$Z_{\mathrm{crystal\ melting}} \sim Z_{\mathrm{top}}$ (modulo wall crossings). (5.3)

Since there is no wall crossing phenomenon for the Donaldson-Thomas theory on $\mathbb{C}^3$, this result does not contradict [2], where a direct identification of the topological string theory and the crystal melting is made for $\mathbb{C}^3$. In general, we expect that a proper understanding of the relation between the topological string theory and the crystal melting requires that we take the wall crossing phenomena into account. The OSV formula (1.1) suggests yet another relation between the black hole microstate counting and the topological string theory. According to [62], for a compact Calabi-Yau manifold, the D6/D2/D0 brane system gives rise to a large black hole in four dimensions since it is related to a spinning M theory black hole by the Kaluza-Klein reduction. In fact, one can compute the semi-classical Bekenstein-Hawking entropy for such a 4-dimensional black hole and find that it can be made arbitrarily large provided the D2 charges are sufficiently larger than the D0 charge. In this paper, we discussed the D6/D2/D0 system on a non-compact toric Calabi-Yau manifold with an infinite volume. Though it is not obvious that the gravity description in four dimensions is applicable in this case, the OSV formula has been successfully tested for a similar class of non-compact Calabi-Yau manifolds [63–66]. If it is applicable in our case, it would imply a relation between $Z_{\mathrm{crystal\ melting}}$ and $|Z_{\mathrm{top}}|^2$, modulo wall crossings. It would be interesting to find out if such a relation holds. It appears that the crystal melting picture is closely related to the quantization of the toric structure of the Calabi-Yau manifold. It would be interesting to understand the relation better. This could lead to a new insight into quantum geometry, along with the observations in [56] for the bubbling AdS geometry and [57] for black hole microstates.

Acknowledgements. We would like to thank Kentaro Nagao, Kazutoshi Ohta, Yukinobu Toda, Kazushi Ueda and Xi Yin for discussions.
This work is supported in part by DOE grant DE-FG03-92-ER40701 and by the World Premier International Research Center Initiative of MEXT of Japan. H. O. is also supported in part by a Grant-in-Aid for Scientific Research (C) 20540256 of JSPS and by the Kavli Foundation. M. Y. is also supported in part by the JSPS fellowships for Young Scientists and by the Global COE Program for Physical Sciences Frontier at the University of Tokyo funded by MEXT of Japan.

A. Perfect Matchings In this Appendix we explain the one-to-one correspondence between a molten crystal discussed in the main text and a perfect matching of the bipartite graph, where a perfect matching is a subset of edges of the bipartite graph such that each vertex is contained in exactly one edge. This means that the problem of counting BPS states can also be reformulated as a problem of counting perfect matchings of the bipartite graph. The content of this appendix is essentially a recapitulation of [10]. In Sect. 4.1 we considered a quiver $\tilde{Q} = (\tilde{Q}_0, \tilde{Q}_1)$, which is a universal cover of the quiver $Q$ on $T^2$. The dual graph of $\tilde{Q}$ can be made bipartite using the orientation of the arrows of $\tilde{Q}$, and is a universal cover of the bipartite graph on $T^2$ described in Sect. 2.2. In what follows we give an explicit correspondence between a perfect matching of this universal-cover bipartite graph and a configuration of a molten crystal.
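The definition of a perfect matching used here can be checked mechanically on a small example. The following is a minimal sketch on a finite toy graph (a 4-cycle), not on the infinite universal-cover graph of the text; the function name `is_perfect_matching` and the vertex labels are illustrative assumptions.

```python
from collections import Counter

def is_perfect_matching(vertices, edges, matching):
    """Return True if `matching` is a perfect matching of the graph (vertices, edges)."""
    edge_set = {frozenset(e) for e in edges}
    # Every chosen edge must actually be an edge of the graph.
    if any(frozenset(e) not in edge_set for e in matching):
        return False
    # Count how many chosen edges touch each vertex.
    cover = Counter(v for e in matching for v in e)
    # Perfect matching: each vertex is contained in exactly one chosen edge.
    return all(cover[v] == 1 for v in vertices)

# Hypothetical toy bipartite graph: a 4-cycle with black vertices b0, b1
# and white vertices w0, w1.
vertices = ["b0", "w0", "b1", "w1"]
edges = [("b0", "w0"), ("w0", "b1"), ("b1", "w1"), ("w1", "b0")]

print(is_perfect_matching(vertices, edges, [("b0", "w0"), ("b1", "w1")]))
print(is_perfect_matching(vertices, edges, [("b0", "w0"), ("w0", "b1")]))
```

The first candidate covers every vertex once; the second covers `w0` twice and misses `w1`, so it is rejected.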



Fig. 7. Given a configuration of a molten crystal, we can construct a perfect matching of the bipartite graph. Each arrow is colored green if it is along the surface of the crystal, and red otherwise. The set of edges dual to the arrows colored red gives a perfect matching of the bipartite graph

We first construct a perfect matching from a molten crystal. Given a molten crystal as shown in Fig. 5, choose all the arrows of $\tilde{Q}$ which are along the surface of the crystal. In the example of Fig. 5, such arrows are colored green, while the remaining arrows are colored red. Take the set of edges dual to the arrows colored red. It is proven in [10] that such a subset of edges of the bipartite graph is a perfect matching. This is the perfect matching we wanted to construct. In the case when no atoms are removed from the crystal, the perfect matching obtained by this method is called the canonical perfect matching, which we denote by $D_0$. Since only a finite number of atoms are removed from the crystal, the perfect matching obtained from a molten crystal by the above method coincides with $D_0$ sufficiently far from the reference point $i$.

Conversely, given a perfect matching $D$ which coincides with $D_0$ sufficiently far from the reference point $i$, we can reproduce a molten crystal. Let us superimpose $D$ with $D_0$, and we have a finite number of loops, as shown in Fig. 8 in the case of the Suspended Pinched Point. Define a height function $h_D$ such that (1) $h_D(j) = 0$ when $j$ is sufficiently far from $i$, and (2) $h_D$ increases by one whenever we cross a loop and go inside it. The example of $h_D$ for the case of the Suspended Pinched Point is shown in Fig. 8. By removing $h_D(j)$ atoms from each $j \in \tilde{Q}_0$, we can construct a molten crystal. It was proven in [10] that the set of atoms removed from the crystal so defined satisfies the melting rule of Sect. 4.2. This establishes the one-to-one correspondence between a molten crystal and a perfect matching of the bipartite graph, meaning that BPS states can also be counted by perfect matchings of the bipartite graph.

Finally, let us finish this Appendix by pointing out an interesting connection of the canonical perfect matching $D_0$ with toric geometry. The example of the canonical perfect matching $D_0$ for the Suspended Pinched Point is shown in Fig. 9.
In this example, the asymptotic form of the bipartite graph has four different patterns. Each of four patterns



Fig. 8. By superimposing a perfect matching of Fig. 7 with the canonical perfect matching shown later in Fig. 9, we have a set of loops, which defines a height function h D . From this function we can recover a molten crystal

Fig. 9. The canonical perfect matching of the bipartite graph for the Suspended Pinched Point singularity. Asymptotically, the perfect matching corresponds to one of the four perfect matchings of the bipartite graph corresponding to vertices of the toric diagram. The blue borders between different choices of perfect matchings represent the ( p, q)-web



is periodic and therefore can be thought of as a perfect matching of the bipartite graph on $T^2$. In the brane tiling literature, a perfect matching of the bipartite graph on $T^2$ is known to correspond to one of the lattice points of the toric diagram [25,40].^13 We recognize that the four perfect matchings are identified with the four corners of the toric diagram in Fig. 1-(a) and that the borders between different patterns are identified with the blue lines in Fig. 1-(b), which make up the $(p, q)$-web of the diagram.

In general, for an arbitrary toric Calabi-Yau manifold, we can use the same pattern to construct a perfect matching. Divide the universal cover of the bipartite graph into segments separated by the $(p, q)$-web of the toric diagram.^14 The perfect matching in each segment is periodic and is identified with one of the perfect matchings of the bipartite graph on $T^2$, which corresponds to one of the lattice points of the toric diagram; the lattice point in question is precisely the vertex surrounded by the two $(p, q)$-webs on $T^2$.^15 This determines a perfect matching. In particular, this means that the ridges of the crystal line up with the $(p, q)$-web of the toric diagram. We have examined several other examples as well, and this pattern holds in all cases. Thus, we conjecture that the perfect matching constructed in this way is canonical. We would like to stress again that this conjecture is not needed to construct the crystal melting model. Here we are simply pointing out that, in the examples we have studied, the crystalline structures fit beautifully with the corresponding toric geometries.

13 For this correspondence, we consider superimposition of perfect matchings and define a $\mathbb{Z}^2$-valued height function, which is similar to the height function $h_D$ defined previously.
14 We choose the diagram that corresponds to the most singular Calabi-Yau manifold.
15 In a consistent quiver gauge theory, it is believed that the multiplicity of perfect matchings at the vertices of the toric diagram is one [28].

References
1. Ooguri, H., Strominger, A., Vafa, C.: Black hole attractors and the topological string. Phys. Rev. D 70, 106007 (2004)
2. Iqbal, A., Nekrasov, N., Okounkov, A., Vafa, C.: Quantum foam and topological strings. JHEP 0804, 011 (2008)
3. Donaldson, S.K., Thomas, R.P.: Gauge theory in higher dimensions. In: The geometric universe: science, geometry and the work of Roger Penrose. Oxford: Oxford Univ. Press, 1998
4. Thomas, R.P.: A holomorphic Casson invariant for Calabi-Yau 3-folds, and bundles on K3 fibrations. J. Diff. Geom. 54, 367 (2000)
5. Okounkov, A., Reshetikhin, N., Vafa, C.: Quantum Calabi-Yau and classical crystals. http://arXiv.org/abs/hep-th/0309208v2, 2003
6. Maulik, D., Nekrasov, N., Okounkov, A., Pandharipande, R.: Gromov-Witten theory and Donaldson-Thomas theory. I. Compos. Math. 142, 1263 (2006)
7. Aganagic, M., Klemm, A., Marino, M., Vafa, C.: The topological vertex. Commun. Math. Phys. 254, 425 (2005)
8. Maulik, D., Oblomkov, A., Okounkov, A., Pandharipande, R.: Gromov-Witten/Donaldson-Thomas correspondence for toric 3-folds. http://arxiv.org/abs/0809.3976v1[math.AG], 2008
9. Szendröi, B.: Non-commutative Donaldson-Thomas theory and the conifold. Geom. Topol. 12, 1171 (2008)
10. Mozgovoy, S., Reineke, M.: On the noncommutative Donaldson-Thomas invariants arising from brane tilings. http://arXiv.org/abs/0809.0117v2[math.AG], 2008
11. Young, B.: Computing a pyramid partition generating function with dimer shuffling. http://arXiv.org/abs/0709.3079v2[math.CO], 2007
12. Young, B., with an appendix by Bryan, J.: Generating functions for colored 3D Young diagrams and the Donaldson-Thomas invariants of orbifolds. http://arXiv.org/abs/0802.3948v2[math.CO], 2008
13. Nagao, K., Nakajima, H.: Counting invariant of perverse coherent sheaves and its wall-crossing. http://arXiv.org/abs/0809.2992v3[math.AG], 2008
14. Nagao, K.: Derived categories of small toric Calabi-Yau 3-folds and counting invariants. http://arXiv.org/abs/0809.2994v3[math.AG], 2008



15. Jafferis, D.L., Moore, G.W.: Wall crossing in local Calabi Yau manifolds. http://arxiv.org/abs/0810.4909v1[hep-th], 2008
16. Chuang, W.Y., Jafferis, D.L.: Wall crossing of BPS states on the Conifold from Seiberg duality and pyramid partitions. http://arxiv.org/abs/0810.5072v2[hep-th], 2008
17. Saulina, N., Vafa, C.: D-branes as defects in the Calabi-Yau crystal. http://arxiv.org/abs/hep-th/0404246v1, 2004
18. Katz, S.H.: Gromov-Witten, Gopakumar-Vafa, and Donaldson-Thomas invariants of Calabi-Yau threefolds. http://arxiv.org/abs/math/0408266v2[math.AG], 2004
19. Okuda, T.: Derivation of Calabi-Yau crystals from Chern-Simons gauge theory. JHEP 0503, 047 (2005)
20. Dijkgraaf, R., Vafa, C., Verlinde, E.: M-theory and a topological string duality. http://arxiv.org/abs/hep-th/0602087v1, 2006
21. Sulkowski, P.: Crystal model for the closed topological vertex geometry. JHEP 0612, 030 (2006)
22. Dijkgraaf, R., Orlando, D., Reffert, S.: Dimer models, free fermions and super quantum mechanics. http://arxiv.org/abs/0705.1645v2[hep-th], 2007
23. Jafferis, D.L.: Topological quiver matrix models and quantum foam. http://arxiv.org/abs/0705.2250v1[hep-th], 2007
24. Heckman, J.J., Vafa, C.: Crystal melting and black holes. JHEP 0709, 011 (2007)
25. Hanany, A., Kennaway, K.D.: Dimer models and toric diagrams. http://arxiv.org/abs/hep-th/0503149v2, 2005
26. Franco, S., Hanany, A., Kennaway, K.D., Vegh, D., Wecht, B.: Brane dimers and quiver gauge theories. JHEP 0601, 096 (2006)
27. Franco, S., Hanany, A., Martelli, D., Sparks, J., Vegh, D., Wecht, B.: Gauge theories from toric geometry and brane tilings. JHEP 0601, 128 (2006)
28. Hanany, A., Vegh, D.: Quivers, tilings, branes and rhombi. JHEP 0710, 029 (2007)
29. Kennaway, K.D.: Brane tilings. Int. J. Mod. Phys. A 22, 2977 (2007)
30. Yamazaki, M.: Brane Tilings and Their Applications. Fortsch. Phys. 56, 555 (2008)
31. Douglas, M.R., Moore, G.W.: D-branes, quivers, and ALE instantons. http://arxiv.org/abs/hep-th/9603167v1, 1996
32. Feng, B., He, Y.H., Kennaway, K.D., Vafa, C.: Dimer models from mirror symmetry and quivering amoebae. Adv. Theor. Math. Phys. 12, 3 (2008)
33. Ooguri, H., Vafa, C.: Two-Dimensional black hole and singularities of CY manifolds. Nucl. Phys. B 463, 55 (1996)
34. Imamura, Y.: Anomaly cancellations in brane tilings. JHEP 0606, 011 (2006)
35. Imamura, Y.: Global symmetries and ’t Hooft anomalies in brane tilings. JHEP 0612, 041 (2006)
36. Imamura, Y., Isono, H., Kimura, K., Yamazaki, M.: Exactly marginal deformations of quiver gauge theories as seen from brane tilings. Prog. Theor. Phys. 117, 923 (2007)
37. Klebanov, I.R., Witten, E.: Superconformal field theory on threebranes at a Calabi-Yau singularity. Nucl. Phys. B 536, 199 (1998)
38. Klebanov, I.R., Strassler, M.J.: Supergravity and a confining gauge theory: Duality cascades and χSB-resolution of naked singularities. JHEP 0008, 052 (2000)
39. Imamura, Y., Kimura, K., Yamazaki, M.: Anomalies and O-plane charges in orientifolded brane tilings. JHEP 0803, 058 (2008)
40. Franco, S., Vegh, D.: Moduli spaces of gauge theories from dimer models: Proof of the correspondence. JHEP 0611, 054 (2006)
41. Luty, M.A., Taylor, W.: Varieties of vacua in classical supersymmetric gauge theories. Phys. Rev. D 53, 3399 (1996)
42. King, A.D.: Moduli of representations of finite dimensional algebras. Quart. J. Math. Oxford 45, 515 (1994)
43. Cirafici, M., Sinkovics, A., Szabo, R.J.: Cohomological gauge theory, quiver matrix models and Donaldson-Thomas theory. Nucl. Phys. B 809, 452–518 (2009)
44. Aspinwall, P.S.: D-branes on Calabi-Yau manifolds. http://arxiv.org/abs/hep-th/0403166v1, 2004
45. Van den Bergh, M.: Non-commutative crepant resolutions. In: The legacy of Niels Henrik Abel, Berlin: Springer, 749 (2004)
46. Bridgeland, T.: Flops and derived categories. Invent. Math. 147, 613 (2002)
47. Van den Bergh, M.: Three-dimensional flops and non-commutative rings. Duke Math. J. 122, 423 (2004)
48. Ueda, K., Yamazaki, M.: Brane tilings for parallelograms with application to homological mirror symmetry. http://arxiv.org/abs/math/0606548v2[math.AG], 2006
49. Ueda, K., Yamazaki, M.: Homological mirror symmetry for toric orbifolds of toric del Pezzo surfaces. http://arxiv.org/abs/math/0703267v1[math.AG], 2007
50. Douglas, M.R., Fiol, B., Romelsberger, C.: Stability and BPS branes. JHEP 0509, 006 (2005)
51. Bridgeland, T.: Stability conditions on triangulated categories. Ann. of Math. 166, 317 (2007)



52. Donaldson, S.K.: Anti self-dual Yang-Mills connections over complex algebraic surfaces and stable vector bundles. Proc. London Math. Soc. 50, 1 (1985)
53. Uhlenbeck, K., Yau, S.-T.: On the existence of Hermitian-Yang-Mills connections in stable vector bundles. Comm. Pure Appl. Math. 39, 257 (1986)
54. Kenyon, R., Okounkov, A., Sheffield, S.: Dimers and Amoebae. http://arxiv.org/abs/math-ph/0311005v1, 2003
55. Ooguri, H., Yamazaki, M.: Work in progress
56. Lin, H., Lunin, O., Maldacena, J.M.: Bubbling AdS space and 1/2 BPS geometries. JHEP 0410, 025 (2004)
57. Mathur, S.D.: The fuzzball proposal for black holes: An elementary review. Fortsch. Phys. 53, 793 (2005)
58. Denef, F., Moore, G.W.: Split states, entropy enigmas, holes and halos. http://arxiv.org/abs/hep-th/0702146v2, 2007
59. Kontsevich, M., Soibelman, Y.: Stability structures, motivic Donaldson-Thomas invariants and cluster transformations. http://arxiv.org/abs/0811.2435v1[math.AG], 2008
60. Gaiotto, D., Moore, G.W., Neitzke, A.: Four-dimensional wall-crossing via three-dimensional field theory. http://arxiv.org/abs/0807.4723v1[hep-th], 2008
61. Pandharipande, R., Thomas, R.P.: Curve counting via stable pairs in the derived category. http://arxiv.org/abs/0707.2348v3[math.AG], 2007
62. Gaiotto, D., Strominger, A., Yin, X.: New connections between 4D and 5D black holes. JHEP 0602, 024 (2006)
63. Vafa, C.: Two dimensional Yang-Mills, black holes and topological strings. http://arxiv.org/abs/hep-th/0406058v2, 2004
64. Aganagic, M., Ooguri, H., Saulina, N., Vafa, C.: Black holes, q-deformed 2d Yang-Mills, and non-perturbative topological strings. Nucl. Phys. B 715, 304 (2005)
65. Dijkgraaf, R., Gopakumar, R., Ooguri, H., Vafa, C.: Baby universes in string theory. Phys. Rev. D 73, 066002 (2006)
66. Aganagic, M., Ooguri, H., Okuda, T.: Quantum entanglement of baby universes. Nucl. Phys. B 778, 36 (2007)

Communicated by N. A. Nekrasov




Continuity of Quantum Channel Capacities
Debbie Leung¹, Graeme Smith²
¹ Institute for Quantum Computing, University of Waterloo, Waterloo, Ontario, N2L3G1, Canada. E-mail: [email protected]
² IBM TJ Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY 10598, USA. E-mail: [email protected]
Received: 11 December 2008 / Accepted: 24 February 2009
Published online: 26 May 2009 – © Springer-Verlag 2009

Abstract: We prove that a broad array of capacities of a quantum channel are continuous. That is, two channels that are close with respect to the diamond norm have correspondingly similar communication capabilities. We first show that the classical capacity, quantum capacity, and private classical capacity are continuous, with the variation on arguments $\epsilon$ apart bounded by a simple function of $\epsilon$ and the channel’s output dimension. Our main tool is an upper bound of the variation of output entropies of many copies of two nearby channels given the same initial state; the bound is linear in the number of copies. Our second proof is concerned with the quantum capacities in the presence of free backward or two-way public classical communication. These capacities are proved continuous on the interior of the set of non-zero capacity channels by considering mutual simulation between similar channels.

1. Introduction There are several notions of capacity for a noisy quantum communication channel. For example, we may be interested in a channel’s capacity for either classical [1,2], private classical [3], or quantum [3–5] communications. We may have access to auxiliary resources in addition to the channel, such as entanglement, one-way classical communication from the sender to receiver, from the receiver to the sender, or two-way classical communications. In all of these situations, there is a sensible notion of capacity that can be studied. Except when free auxiliary entanglement is available, where the problem is effectively solved [6], the various capacities of even very simple channels are unknown. One property that we would hope for in a capacity is continuity. From a practical point of view there will always be a certain amount of channel uncertainty in real systems. In this setting, if nearby channels had dramatically different capacities, the theory of quantum capacities would be of limited value. However, from a mathematical point of
view continuity is not at all obvious—very similar channels can become quite far apart given many copies, and the capacity is operationally defined in terms of an asymptotic number of channel uses. This is not a problem when a single-letter capacity formula is available, in which case we can reason about the formula directly, but when only a multiletter formula is available (or worse, none at all) the problem of continuity becomes a challenge. The continuity of channel capacities has been considered before. For example, in their study of the quantum erasure channel [7], Bennett, DiVincenzo, and Smolin implicitly assumed the continuity of the quantum channel capacity to upper bound the capacity of this channel. For the erasure channel, this assumption was rigorously justified later in [8]. Keyl and Werner explicitly considered continuity of the quantum channel capacity in [9], where it was shown that the capacity is lower semi-continuous. Continuity of the Holevo information (whose regularization gives a multi-letter formula for the classical capacity) was considered in [10], where it was shown to be continuous for finite dimensional outputs and lower semi-continuous in general. A related set of questions concerns the continuity of entropic quantities and entanglement measures, which are functions on quantum states. For example, Fannes [11] found a tight bound on the variation of von Neumann entropy of finite dimensional states. This was subsequently used by Nielsen to study the continuity of entanglement of formation [12]. As another example, Donald and Horodecki proved the continuity of the relative entropy of entanglement [13]. The continuity of asymptotic (i.e., regularized) entanglement measures was studied by Vidal in [14], which were shown to be continuous in any open set of distillable states. More recently, Alicki and Fannes generalized the continuity result in [15] to conditional entropy, and used it to prove the continuity of squashed entanglement [16]. 
In this work we show the continuity of various communication capacities of quantum channels with finite output dimensions. For the unassisted capacities for classical, private classical, and quantum communication, our tool is an inequality controlling the variation of output entropies of many copies of two nearby channels given the same initial state. By careful use of the Alicki-Fannes inequality [15], this bound is shown to be linear (not quadratic) in the number of copies. For the quantum capacity with two-way classical communication, and the quantum capacity with classical back communication, we also show continuity within an open set of nonzero quantum capacity channels. Our results in this setting build on [14], whose arguments are extended from the distillable entanglement of states to the capacity of channels. The rest of the paper is organized as follows. Section 2 contains various definitions, concepts, and prior results used in this paper. Our main tool, the inequality controlling the variation of output entropies of many copies of two nearby channels given the same initial state is proved in Sec. 3. This is used to show our main results, the continuity of the quantum, classical, and private classical capacity in Sec. 4. For simplicity throughout most of this paper, we focus on channels with finite dimensional inputs and outputs, although the results of Sec. 4 can easily be seen to apply to channels with infinite dimensional inputs and finite outputs. One exception to this focus is in Sec. 5, where we consider a family of pairs of infinite dimensional channels, parameterized by n. As n increases, each pair has decreasing distance, but their capacities differ by at least a constant, thereby showing that finite output dimension is needed for continuity. Continuity for the quantum capacities assisted by backward or two-way classical communication in the interior of the nonzero capacity region is proved in Sec. 6. We make a few concluding remarks in Sec. 7.


2. Preliminaries In this section, we introduce the concepts, notations, definitions, and background materials, focusing on finite dimensional quantum systems. Notations and discussion in the infinite dimensional case will be deferred to Sec. 5.

2.1. Quantum States and Channels. Let $\mathcal{H}$ be a complex Hilbert space, and $\mathcal{B}(\mathcal{H})$ be the set of bounded linear operators taking $\mathcal{H}$ to itself. A quantum state is represented by a positive semidefinite operator $\rho \in \mathcal{B}(\mathcal{H})$ with unit trace. Except in Sec. 5, we will be interested in finite-dimensional $\mathcal{H}$. A quantum channel $\mathcal{N}$ that takes states from $\mathcal{H}_{\rm in}$ to $\mathcal{H}_{\rm out}$ is a linear map from $\mathcal{B}(\mathcal{H}_{\rm in})$ to $\mathcal{B}(\mathcal{H}_{\rm out})$ that is trace-preserving and completely positive. In particular, when $\mathcal{H} = \mathcal{H}_{\rm in} = \mathcal{H}_{\rm out}$, we denote by $\mathcal{I}$ the identity map from $\mathcal{B}(\mathcal{H})$ to itself. Recall that $\mathcal{N}$ is completely positive if for any reference system with associated Hilbert space $\mathcal{H}_{\rm ref}$, $\mathcal{I} \otimes \mathcal{N}$ maps the positive-semidefinite cone in $\mathcal{B}(\mathcal{H}_{\rm ref} \otimes \mathcal{H}_{\rm in})$ to that in $\mathcal{B}(\mathcal{H}_{\rm ref} \otimes \mathcal{H}_{\rm out})$. We also call channels, which are trace-preserving and completely positive, “TCP maps.” They are exactly the physical operations on a state that are allowed by quantum mechanics.

A quantum system is associated with a Hilbert space and its set of bounded operators. We also use the system name loosely. For example, we may say that a channel takes system $A$ to system $B$, or write $\mathcal{N} : A \to B$. We denote the trace, which is a simple example of a TCP map, by $\mathrm{Tr}[\cdot]$. A partial trace on a composite system is simply the trace operation on one component. A pure state is a rank one projector, and is also represented by any vector it projects onto. For a quantum state $\rho \in \mathcal{B}(\mathcal{H})$, a purification is any pure state $|\psi\rangle\langle\psi| \in \mathcal{B}(\mathcal{H} \otimes \mathcal{H}')$ such that the partial trace over $\mathcal{H}'$ gives $\rho$, and purifications always exist. Any channel $\mathcal{N}$ can be represented as a conjugation by an isometry $U : \mathcal{H}_{\rm in} \to \mathcal{H}_{\rm out} \otimes \mathcal{H}_{\rm env}$, followed by a partial trace: $\mathcal{N}(\rho) = \mathrm{Tr}_{\rm env}\, U \rho U^\dagger$.

We sometimes add subscripts to the symbols for quantum states and channels to emphasize what systems they act on, but we may omit these to avoid cluttering. However, for multipartite states, the reduced state on a subset of systems is always subscripted by the subset. Throughout this paper, we use a distance measure between states given by the 1-norm of their difference: $\|\rho - \sigma\|_1 = \mathrm{Tr}\,|\rho - \sigma|$. Half of the above is called the trace distance, the quantum analogue of the total variation distance in the classical setting. We use a distance measure between channels (mapping from $\mathcal{B}(\mathcal{H}_{\rm in})$ to $\mathcal{B}(\mathcal{H}_{\rm out})$) induced by the diamond norm:
$$\|\mathcal{N}_1 - \mathcal{N}_2\|_\diamond = \max\{\|((\mathcal{N}_1 - \mathcal{N}_2) \otimes \mathcal{I})(X)\|_1 : X \in \mathcal{B}(\mathcal{H}_{\rm in} \otimes \mathcal{H}_{\rm ref}),\ \|X\|_1 = 1\}.$$
The maximum can always be attained with $X$ being a pure quantum state. Operationally, the diamond norm on the difference between the two channels characterizes the probability of distinguishing them, if one can prepare an optimal state and feed part of it into the channel. The distance measure also has the nice property that increasing the dimension of the reference system beyond $\dim(\mathcal{H}_{\rm in})$ does not increase the distinguishability. This gives us control over the trace distance of the output states of different channels given the same input, and subsequently other quantities of interest to be defined in the next subsection.
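As a small illustration of the state distance defined above, the following sketch computes $\|\rho - \sigma\|_1$ for two qubit states using the closed-form eigenvalues of a 2×2 Hermitian matrix. It is a minimal example under stated assumptions: the helper names `one_norm_2x2` and `subtract` are hypothetical, and the states $|0\rangle\langle 0|$ and $|+\rangle\langle +|$ are chosen only for convenience. The diamond norm itself requires an optimization and is not attempted here.

```python
import math

def one_norm_2x2(m):
    """1-norm (sum of |eigenvalues|) of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a, b, d = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    # The two eigenvalues are mean + radius and mean - radius.
    return abs(mean + radius) + abs(mean - radius)

def subtract(x, y):
    """Entrywise difference of two 2x2 matrices given as nested lists."""
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

rho = [[1 + 0j, 0j], [0j, 0j]]                          # |0><0|
sigma = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]    # |+><+|

norm = one_norm_2x2(subtract(rho, sigma))   # ||rho - sigma||_1
trace_distance = norm / 2                   # half of the 1-norm
print(norm, trace_distance)
```

For two pure states the trace distance equals $\sqrt{1 - |\langle\psi|\phi\rangle|^2}$, which here is $\sqrt{1/2}$, so the 1-norm is $\sqrt{2}$.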


The diamond norm of a channel is closely related to the family of completely bounded norms (cb-norms), and in fact is equal to the usual cb-norm of the adjoint channel as well as a generalized cb-norm of the original channel (for more on cb-norms and their relation to quantum information, see [17,18]).

2.2. Entropic Quantities. For a classical random variable $X$ with $\mathrm{Prob}(X = x) = p_x$, the Shannon entropy of $X$ is given by $H(X) = -\sum_x p_x \log p_x$ (or $H(\{p_x\})$). If $X$ is binary with probabilities $p, 1 - p$, $H(X)$ is written as $H(p)$. Here and throughout this paper, log is in base 2. For a quantum system $A$ prepared in state $\rho$, the von Neumann entropy is written as $S(A)_\rho$ or $S(\rho) = -\mathrm{Tr}\,\rho \log \rho = H(\{\lambda_k\})$, where $\lambda_k$ is the $k$th eigenvalue of $\rho$. Throughout the paper, subscripts showing states on which entropies and other information theoretic quantities are evaluated are omitted when there is little risk of confusion. For two systems $AB$ in state $\rho$, we mention a few measures of correlation between $A$ and $B$:
• The quantum mutual information is defined as $I(A; B)_\rho = S(A) + S(B) - S(AB)$, where entropies are evaluated on $\rho$ and its partial traces.
• The conditional entropy is given by $S(A|B) = S(AB) - S(B)$.
• The coherent information $I^{\rm coh}(A\rangle B)_\rho$ is given by $S(B) - S(AB) = -S(A|B)$.
The entropy and conditional entropy, viewed as functions of the underlying states, are both continuous. The following, particularly Theorem 2, will be helpful tools for our task of showing the continuity of capacities.

Theorem 1 (Fannes Inequality [11]). For any $\rho$ and $\sigma$ with $\|\rho - \sigma\|_1 \leq \epsilon$,
$$|S(\rho) - S(\sigma)| \leq \epsilon \log d + H(\epsilon).$$

Theorem 2 (Alicki-Fannes Inequality [15]). For any $\rho_{AB}$ and $\sigma_{AB}$ with $\|\rho - \sigma\|_1 \leq \epsilon$,
$$|S(A|B)_\rho - S(A|B)_\sigma| \leq 4\epsilon \log d_A + 2H(\epsilon).$$

2.3. Capacities of a quantum channel. Consider a quantum channel $\mathcal{N} : A \to B$. The channel $\mathcal{N}$ has several different capacities for communication.
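Returning briefly to Theorem 1, the bound can be sanity-checked numerically on a simple case. The sketch below is restricted, as an assumption made for simplicity, to commuting (diagonal) states, for which the von Neumann entropy reduces to the Shannon entropy of the eigenvalues and the 1-norm is the sum of absolute eigenvalue differences; the function names are illustrative only.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def binary_entropy(p):
    """H(p) for a binary variable with probabilities p and 1 - p."""
    return shannon_entropy([p, 1 - p])

# Two diagonal (commuting) qubit states, given by their eigenvalue lists.
rho = [0.5, 0.5]
sigma = [0.6, 0.4]
d = len(rho)

eps = sum(abs(p - q) for p, q in zip(rho, sigma))        # ||rho - sigma||_1
gap = abs(shannon_entropy(rho) - shannon_entropy(sigma)) # |S(rho) - S(sigma)|
bound = eps * math.log2(d) + binary_entropy(eps)         # eps log d + H(eps)
print(eps, gap, bound)
```

In this example the entropy gap is far below the bound, which is expected: the inequality is a worst-case statement over all pairs of states at distance at most $\epsilon$.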
The following quantities will play crucial roles in the various capacities:
• For an input ensemble $\{p_x, \phi_x\}$, let $\omega = \sum_x p_x |x\rangle\langle x|_X \otimes \mathcal{N}(\phi_x)$ and
$$\chi(\mathcal{N}) := \max_{\{p_x, \phi_x\}} I(X; B)_\omega$$
be the optimal Holevo information [19] of the output ensemble (after the channel acts on the input).
• For an input state $\rho_{A'A}$, where part of it will be fed into $\mathcal{N}$, let $I^{\rm coh}(\mathcal{N}, \rho_{A'A}) = I^{\rm coh}(A'\rangle B)_{(\mathcal{I} \otimes \mathcal{N})(\rho_{A'A})}$ be the coherent information generated. Maximizing over the input gives the coherent information of $\mathcal{N}$:
$$I^{\rm coh}(\mathcal{N}) = \max_{\rho_{A'A}} I^{\rm coh}(\mathcal{N}, \rho_{A'A}).$$
We remark that the maximizing state can be chosen to be pure.
• For an input ensemble $\{p_x, \phi_x\}$, let $\omega = \sum_x p_x |x\rangle\langle x|_X \otimes (U \phi_x U^\dagger)_{BE}$, where $U : \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_E$ is an isometric extension of $\mathcal{N}$. Then,
$$I^{\rm priv}(\mathcal{N}) = \max_{\{p_x, \phi_x\}} \left( I(X; B)_\omega - I(X; E)_\omega \right),$$
where the mutual information is evaluated on the reduced states.

To give the operational definitions of the different capacities of $\mathcal{N}$ for communication, we need to consider $n$ uses of the channel. We will use the shorthands $\mathcal{N}^n$, $A^n$, $B^n$, and $E^n$ to stand for $\mathcal{N}^{\otimes n}$, $A^{\otimes n}$, $B^{\otimes n}$, and $E^{\otimes n}$.

Definition 1. Classical Capacity. We say that a rate $R$ is $\epsilon$-classically-achievable if there is an $n_\epsilon$ such that for all $n \geq n_\epsilon$ there is a classical code $\{\rho_k \in A^n\}_{k=1}^{K_n}$ and a decoding operation $\mathcal{D}_n : B^n \to \{|k\rangle\langle k|\}_{k=1}^{K_n}$ such that $\forall k$, $\|\mathcal{D}_n(\mathcal{N}^n(\rho_k)) - |k\rangle\langle k|\|_1 \leq \epsilon$ with $\log K_n \geq nR$. A rate is classically-achievable if it is $\epsilon$-classically-achievable for all $\epsilon > 0$. The classical capacity of $\mathcal{N}$, $C(\mathcal{N})$, is the supremum over classically-achievable rates.

Theorem 3 (HSW Theorem [1,2]). The classical capacity satisfies
$$C(\mathcal{N}) = \lim_{n \to \infty} \frac{1}{n} \chi(\mathcal{N}^n).$$
Definition 2. Quantum Capacity. We say that a rate $R$ is $\epsilon$-achievable if there is an $n_\epsilon$ such that for all $n \geq n_\epsilon$ there is a quantum code $C_n \subset A^n$ and decoding operation $\mathcal{D}_n : B^n \to C_n$ such that for all $\psi \in \mathcal{B}(C_n)$, $\|\mathcal{D}_n(\mathcal{N}^n(\psi)) - \psi\|_1 \leq \epsilon$ and $\log \dim \mathcal{H}_{C_n} \geq nR$. A rate $R$ is achievable if it is $\epsilon$-achievable for all $\epsilon > 0$. The quantum capacity of $\mathcal{N}$, $Q(\mathcal{N})$, is the supremum over achievable rates.

Theorem 4 (LSD Theorem [3–5]). The quantum capacity satisfies
$$Q(\mathcal{N}) = \lim_{n \to \infty} \frac{1}{n} I^{\rm coh}(\mathcal{N}^n).$$

Definition 3. Private Capacity. The private capacity is the capacity of a channel for classical communication with the added requirement that an adversary with access to the environment of the channel is ignorant of the communication. More formally, we say that a rate $R$ is $\epsilon$-privately-achievable if there is an $n_\epsilon$ such that $\forall n \geq n_\epsilon$ there exists a classical code $\{\rho_k \in A^n\}_{k=1}^{K_n}$ with $\log K_n \geq nR$ and decoding operation $\mathcal{D}_n : B^n \to \{|k\rangle\langle k|\}_{k=1}^{K_n}$ such that for all $k$,
$$\|\mathcal{D}_n(\mathcal{N}^n(\rho_k)) - |k\rangle\langle k|\|_1 \leq \epsilon, \quad \text{and} \quad \|\rho^k_{E^n} - \sigma_{E^n}\|_1 \leq \epsilon.$$
Here $\rho^k_{E^n} = \widehat{\mathcal{N}}^n(\rho_k)$, where $\widehat{\mathcal{N}}(\rho) = \mathrm{Tr}_B\, U \rho U^\dagger$, with $U : \mathcal{H}_A \to \mathcal{H}_B \otimes \mathcal{H}_E$ an isometric extension of $\mathcal{N}$, and $\sigma_{E^n}$ is a fixed state on $E^n$. If $R$ is $\epsilon$-privately-achievable for all $\epsilon > 0$, it is called privately achievable, and the supremum of privately-achievable rates is called the private capacity.

Theorem 5 ([3]). The private capacity satisfies
$$C_p(\mathcal{N}) = \lim_{n \to \infty} \frac{1}{n} I^{\rm priv}(\mathcal{N}^n).$$


The three capacity definitions above are similar in structure, differing only in the type of information being sent. The corresponding theorems, which give what are called “regularized capacity formulas,” also seem to be parallel. In each case, the “regularization,” as the limit over $n$ is called, prevents us from evaluating the capacity of a given channel explicitly, or even numerically. In the case of the quantum capacity [20,21] and the private classical capacity [22], it has been known for some time that the regularization cannot be removed in general. More recently, the regularization in the classical capacity was reported to be generally necessary [23].

While very little is known about the capacities above, even less is known about the capacity of a channel for quantum communication assisted by two-way classical communication. To define this capacity, we introduce the notion of an $n$-use protocol $\mathcal{P}_n$, where $n$ denotes the number of times the channel $\mathcal{N}$ can be used. Just as in the definition of the unassisted quantum capacity, we consider a system $C_n$ which holds the quantum information to be sent. We use the same symbol to denote Bob’s quantum system which holds the quantum data in his possession at the end of the protocol. $\mathcal{P}_n$ is a composition of the following steps (in order of being performed):
$$\mathcal{A}_0, \mathcal{M}_{\to 0}, \mathcal{N}, \mathcal{B}_1, \mathcal{M}_{\leftarrow 1}, \mathcal{A}_1, \mathcal{M}_{\to 1}, \mathcal{N}, \mathcal{B}_2, \mathcal{M}_{\leftarrow 2}, \cdots, \mathcal{A}_{n-1}, \mathcal{M}_{\to (n-1)}, \mathcal{N}, \mathcal{B}_n, \mathcal{M}_{\leftarrow n}, \mathcal{A}_n.$$
Here, each $\mathcal{A}_i$ is performed by the sender Alice on $C_n$ and her auxiliary system after the $i$th channel use, and each produces an extra system $A$ as an input to the $(i+1)$th channel use. Each $\mathcal{M}_{\to i}$ transmits classical communication from Alice to the receiver Bob. Each $\mathcal{B}_i$ is performed by Bob on his auxiliary system and all $i$ systems accumulated from the channel uses. Each generates some classical outcome to be sent to Alice in the step $\mathcal{M}_{\leftarrow i}$. Using the notion of a protocol, we can now define quantum capacity with two-way classical assistance.

Definition 4. Quantum Capacity with two-way classical assistance. For any $\epsilon > 0$ we say that a rate $R$ is $\epsilon$-2-way-achievable if there is an $n_\epsilon$ such that for all $n \geq n_\epsilon$ there is an $n$-use protocol $\mathcal{P}_n$ such that for any auxiliary reference system $A'$ and $\psi \in C_n \otimes A'$, $\|(\mathcal{P}_n \otimes \mathcal{I})(\psi) - \psi\|_1 \leq \epsilon$ and $\log \dim \mathcal{H}_{C_n} \geq nR$. In other words, $\mathcal{P}_n$ and the identity map on the code space are $\epsilon$-close in the diamond norm. A rate is 2-way-achievable if it is $\epsilon$-2-way-achievable for all $\epsilon > 0$. The quantum capacity of $\mathcal{N}$ with two-way classical assistance, $Q_2(\mathcal{N})$, is the supremum over 2-way-achievable rates.

Definition 5. Quantum Capacity with back classical assistance $Q_B(\mathcal{N})$. An $n$-use protocol in this setting is similar to that with two-way assistance, except that the steps $\mathcal{M}_{\to i}$ are omitted. The rest of the capacity definition is similar to that of $Q_2(\mathcal{N})$.

Little is known about these assisted capacities. One proven fact [24,25] is that $Q_2(\mathcal{N})$ is equal to the entanglement capacity of $\mathcal{N}$ (informally, the maximum amount of near-perfect entanglement generated per use of $\mathcal{N}$, asymptotically). Clearly $Q(\mathcal{N}) \leq Q_B(\mathcal{N}) \leq Q_2(\mathcal{N})$, but beyond that, almost nothing is known about $Q_B(\mathcal{N})$. For instance, there is no known analogue of a connection to entanglement capacity.

3. Continuity of Output Entropy The following theorem is one of our main technical tools.

Theorem 6. Let $\mathcal{N} : A \to B$ and $\mathcal{M} : A \to B$ be quantum channels and $d_B$ be the finite dimension of $B$. Let $A'$ be an auxiliary reference system. If $\|\mathcal{N} - \mathcal{M}\|_\diamond \leq \epsilon$, then, for any state $\phi \in \mathcal{B}(A' A^n)$,
$$\left| S\big((\mathcal{I} \otimes \mathcal{N}^n)(\phi)\big) - S\big((\mathcal{I} \otimes \mathcal{M}^n)(\phi)\big) \right| \leq n\,(4\epsilon \log d_B + 2H(\epsilon)).$$



Proof. Let $\rho^k_{AB^n} = (I_A \otimes \mathcal{M}^{\otimes k} \otimes \mathcal{N}^{\otimes (n-k)})(\phi_{AA'^n})$. In the above, we have explicitly labeled the auxiliary, the input and the output systems on the states. We omit these subscripts from now on. Setting k = 0 and then k = n, we have in particular $\rho^0 = (I \otimes \mathcal{N}^{\otimes n})(\phi)$ and $\rho^n = (I \otimes \mathcal{M}^{\otimes n})(\phi)$. Since $\rho^{k-1}$ and $\rho^k$ differ only in the k-th output system,
\[ S(AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^{k-1}} = S(AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^k}. \tag{1} \]
The quantity we are interested in is $S(AB^n)_{\rho^0} - S(AB^n)_{\rho^n}$, which satisfies
\[ \left| S(AB^n)_{\rho^0} - S(AB^n)_{\rho^n} \right| = \left| \sum_{k=1}^{n} \left[ S(AB^n)_{\rho^{k-1}} - S(AB^n)_{\rho^k} \right] \right| \le \sum_{k=1}^{n} \left| S(AB^n)_{\rho^{k-1}} - S(AB^n)_{\rho^k} \right|. \]
Applying Eq. (1) to a single term in this sum, we have
\[ S(AB^n)_{\rho^{k-1}} - S(AB^n)_{\rho^k} = S(AB^n)_{\rho^{k-1}} - S(AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^{k-1}} - S(AB^n)_{\rho^k} + S(AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^k} \]
\[ = S(B_k | AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^{k-1}} - S(B_k | AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^k}. \]
Because $\|\mathcal{N} - \mathcal{M}\|_\diamond \le \epsilon$, we also have $\|\rho^k - \rho^{k-1}\|_1 \le \epsilon$, so by the Alicki-Fannes inequality,
\[ \left| S(B_k | AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^k} - S(B_k | AB_1 \ldots B_{k-1} B_{k+1} \ldots B_n)_{\rho^{k-1}} \right| \le 4\epsilon \log d_B + 2H(\epsilon). \]
As a result, we find $|S(AB^n)_{\rho^0} - S(AB^n)_{\rho^n}| \le n(4\epsilon \log d_B + 2H(\epsilon))$, which completes the proof.
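To get a feel for the size of the bound in Theorem 6, the right-hand side can be evaluated numerically. The sketch below is illustrative only: it assumes that H denotes the binary entropy and that logarithms are taken base 2, and the function names `binary_entropy` and `output_entropy_bound` are hypothetical helpers, not notation from the paper.

```python
import math


def binary_entropy(eps: float) -> float:
    """Binary entropy H(eps) in bits (assumed base 2); zero at the endpoints."""
    if eps <= 0.0 or eps >= 1.0:
        return 0.0
    return -eps * math.log2(eps) - (1.0 - eps) * math.log2(1.0 - eps)


def output_entropy_bound(n: int, eps: float, d_B: int) -> float:
    """Evaluate n * (4 * eps * log d_B + 2 * H(eps)), the right-hand side of Theorem 6."""
    return n * (4.0 * eps * math.log2(d_B) + 2.0 * binary_entropy(eps))


if __name__ == "__main__":
    # The bound per channel use shrinks as the channels get closer.
    for eps in (0.1, 0.01, 0.001):
        print(eps, output_entropy_bound(n=1, eps=eps, d_B=2))
```

The bound grows linearly in n, so after dividing by n it is a per-use quantity that vanishes as the distance between the channels goes to zero.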


D. Leung, G. Smith

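4. Continuity of Capacities for Channels with Finite Output Dimension

We now apply Theorem 6 to show the continuity of $C(\mathcal{N})$, $Q(\mathcal{N})$, and $C_p(\mathcal{N})$. Each of these capacities has the form $F(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n} \max_{P^{(n)}} f_n(\mathcal{N}^{\otimes n}, P^{(n)})$ for some appropriate family of functions $\{f_n\}$ and parameters $P^{(n)}$ to be optimized over. We make repeated use of the following lemma.

Lemma 1. If $F(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n} \sup_{P^{(n)}} f_n(\mathcal{N}^{\otimes n}, P^{(n)})$ and $\forall n, \forall P^{(n)}$, $|f_n(\mathcal{N}^{\otimes n}, P^{(n)}) - f_n(\mathcal{M}^{\otimes n}, P^{(n)})| \le nc$, then $|F(\mathcal{N}) - F(\mathcal{M})| \le c$.

Proof. Let $\epsilon > 0$ be arbitrary. Let $f_n(\mathcal{N}^{\otimes n}) = \sup_{P^{(n)}} f_n(\mathcal{N}^{\otimes n}, P^{(n)})$. Suppose $f_n(\mathcal{N}^{\otimes n})$ and $f_n(\mathcal{M}^{\otimes n})$ are $\epsilon$-close to optimal at $P_1^{(n)}$ and $P_2^{(n)}$. Then,
\[ f_n(\mathcal{N}^{\otimes n}) - \epsilon < f_n(\mathcal{N}^{\otimes n}, P_1^{(n)}) \le f_n(\mathcal{M}^{\otimes n}, P_1^{(n)}) + nc \le f_n(\mathcal{M}^{\otimes n}) + nc, \]
\[ f_n(\mathcal{M}^{\otimes n}) - \epsilon < f_n(\mathcal{M}^{\otimes n}, P_2^{(n)}) \le f_n(\mathcal{N}^{\otimes n}, P_2^{(n)}) + nc \le f_n(\mathcal{N}^{\otimes n}) + nc. \]
Thus, $\forall \epsilon > 0$, $\forall n$, $|f_n(\mathcal{N}^{\otimes n}) - f_n(\mathcal{M}^{\otimes n})| \le nc + \epsilon$. Taking limits $\epsilon \to 0$, $n \to \infty$, $|F(\mathcal{N}) - F(\mathcal{M})| \le c$.

Note that in particular, Lemma 1 holds with sup replaced by max, as needed in the following corollaries.

Corollary 1. The classical capacity of a quantum channel with finite-dimensional output is continuous. Quantitatively, if $\mathcal{N}, \mathcal{M} : A' \to B$, where the dimension of B is $d_B$ and $\|\mathcal{N} - \mathcal{M}\|_\diamond \le \epsilon$, then $|C(\mathcal{N}) - C(\mathcal{M})| \le 8\epsilon \log d_B + 4H(\epsilon)$.

Proof. From the HSW theorem
\[ C(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n} \chi(\mathcal{N}^{\otimes n}) = \lim_{n\to\infty} \frac{1}{n} \max_{p_x, \phi_x^{(n)}} I(X; B^n)_{\omega^{(n)}}, \]
where $\omega^{(n)} = \sum_x p_x |x\rangle\langle x|_X \otimes \mathcal{N}^{\otimes n}(\phi_x^{(n)})$. For any $\mathcal{N} : A' \to B$ and $\mathcal{M} : A' \to B$ with $\|\mathcal{N} - \mathcal{M}\|_\diamond \le \epsilon$, and for fixed n and $\{p_x, \phi_x^{(n)}\}$, letting $\omega = \sum_x p_x |x\rangle\langle x|_X \otimes \mathcal{N}^{\otimes n}(\phi_x^{(n)})$ and $\tilde\omega = \sum_x p_x |x\rangle\langle x|_X \otimes \mathcal{M}^{\otimes n}(\phi_x^{(n)})$, we have
\[ \left| I(X; B^n)_\omega - I(X; B^n)_{\tilde\omega} \right| = \left| S(B^n)_\omega - S(B^n X)_\omega - S(B^n)_{\tilde\omega} + S(B^n X)_{\tilde\omega} \right| \]
\[ \le \left| S(B^n)_\omega - S(B^n)_{\tilde\omega} \right| + \left| S(B^n X)_{\tilde\omega} - S(B^n X)_\omega \right| \le 2n\,(4\epsilon \log d_B + 2H(\epsilon)). \]
Applying Lemma 1 gives the desired result $|C(\mathcal{N}) - C(\mathcal{M})| \le 8\epsilon \log d_B + 4H(\epsilon)$.

Corollary 2. The quantum capacity of a quantum channel with finite-dimensional output is continuous. Quantitatively, if $\mathcal{N}, \mathcal{M} : A' \to B$, where the dimension of B is $d_B$ and $\|\mathcal{N} - \mathcal{M}\|_\diamond \le \epsilon$, then $|Q(\mathcal{N}) - Q(\mathcal{M})| \le 8\epsilon \log d_B + 4H(\epsilon)$.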



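Proof. From the LSD Theorem,
\[ Q(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n} I^{coh}(\mathcal{N}^{\otimes n}) = \lim_{n\to\infty} \frac{1}{n} \max_{\rho_{AA'^n}} I^{coh}(\mathcal{N}^{\otimes n}, \rho_{AA'^n}). \]
Let $\omega_{AB^n} = (I \otimes \mathcal{N}^{\otimes n})(\rho_{AA'^n})$ and $\tilde\omega_{AB^n} = (I \otimes \mathcal{M}^{\otimes n})(\rho_{AA'^n})$. Consider the difference of coherent informations
\[ \left| I^{coh}(\mathcal{N}^{\otimes n}, \rho_{AA'^n}) - I^{coh}(\mathcal{M}^{\otimes n}, \rho_{AA'^n}) \right| = \left| S(B^n)_\omega - S(AB^n)_\omega - S(B^n)_{\tilde\omega} + S(AB^n)_{\tilde\omega} \right| \]
\[ \le \left| S(B^n)_\omega - S(B^n)_{\tilde\omega} \right| + \left| S(AB^n)_\omega - S(AB^n)_{\tilde\omega} \right| \le 2n\,(4\epsilon \log d_B + 2H(\epsilon)). \]
Applying Lemma 1 gives the result.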

Corollary 3. The private classical capacity of a quantum channel with finite-dimensional output is continuous. Quantitatively, if $\mathcal{N}, \mathcal{M} : A' \to B$, where the dimension of B is $d_B$ and $\|\mathcal{N} - \mathcal{M}\|_\diamond \le \epsilon$, then $|C_p(\mathcal{N}) - C_p(\mathcal{M})| \le 16\epsilon \log d_B + 8H(\epsilon)$.

Proof. Let U and W be the isometric extensions for $\mathcal{N}$ and $\mathcal{M}$ respectively:
\[ C_p(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n} I^{priv}(\mathcal{N}^{\otimes n}) = \lim_{n\to\infty} \frac{1}{n} \max_{p_x, \phi_x} \left[ I(X; B^n)_\omega - I(X; E^n)_\omega \right], \]
where $\phi_x$ lives in $A'^n$, $\omega_{XB^nE^n} = \sum_x p_x |x\rangle\langle x| \otimes U \phi_x U^\dagger$ and $|\omega\rangle_{XB^nE^nG}$ purifies it. Then,
\[ I(X; B^n)_\omega - I(X; E^n)_\omega = [S(B^n) - S(B^n X)]_\omega - [S(E^n) - S(E^n X)]_\omega = [S(B^n) - S(B^n X)]_\omega - [S(X B^n G) - S(B^n G)]_{|\omega\rangle\langle\omega|}. \tag{2} \]
Similarly, define $\tilde\omega_{XB^nE^n} = \sum_x p_x |x\rangle\langle x| \otimes W \phi_x W^\dagger$ for $\mathcal{M}$. Switching from Eq. (2) to that defined by $\tilde\omega$, the difference can be bounded by applying Theorem 6 to each of the four terms followed by Lemma 1, giving the stated result.

5. Discontinuity of Capacities with Infinite Output Dimension

In this section we provide simple examples to show that the classical and quantum capacities of channels with infinite output dimensions are not generally continuous. An earlier demonstration of the discontinuity of the classical capacity for infinite-dimensional quantum channels was given by Shirokov [26]. For an infinite-dimensional complex Hilbert space H with bounded linear operators B(H), the space of all trace class operators (the subset of B(H) with finite trace) is denoted T(H), and its positive semidefinite subset is denoted $T_+(H)$. A quantum state is an element of $T_+(H)$ with unit trace. A quantum channel $\mathcal{N}$ from $H_{in}$ to $H_{out}$ is a linear map from $T(H_{in})$ to $T(H_{out})$ that is trace-preserving and completely positive.


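5.1. Classical Capacity.

Example 1. Let $H = \mathrm{Span}\{|i\rangle\}_{i=0}^{\infty}$ and $H_+ = \mathrm{Span}\{|i\rangle\}_{i=1}^{\infty}$. Consider the channels $\mathcal{N}$ and $\mathcal{M}_n : T(H_+) \to T(H)$ with
\[ \mathcal{N}(|i\rangle\langle j|) = \mathrm{Tr}(|i\rangle\langle j|)\, |0\rangle\langle 0| \quad \text{and} \quad \mathcal{M}_n = \left(1 - \frac{1}{\log n}\right) \mathcal{N} + \frac{1}{\log n}\, \mathrm{id}_n, \]
where $\mathrm{id}_n(|i\rangle\langle j|) = |i\rangle\langle j|$ for $1 \le i, j \le n$ and $\mathrm{id}_n(|i\rangle\langle j|) = \mathrm{Tr}(|i\rangle\langle j|)\, |0\rangle\langle 0|$ otherwise. First of all, we have $C(\mathcal{N}) = 0$, since $\mathcal{N}$ maps every state to $|0\rangle\langle 0|$. As for the capacity of $\mathcal{M}_n$, an easy lower bound can be obtained by using the codewords $|k\rangle\langle k|$ for k = 1, . . . , n, turning $\mathcal{M}_n$ into a classical erasure channel in n dimensions, with erasure probability $p_e = 1 - \frac{1}{\log n}$. The capacity of the latter is known [7] to be $(1 - p_e) \log n = 1$. Thus, $C(\mathcal{M}_n) \ge 1$. However,
\[ \|\mathcal{N} - \mathcal{M}_n\|_\diamond = \left\| \frac{1}{\log n} (\mathcal{N} - \mathrm{id}_n) \right\|_\diamond = \frac{1}{\log n} \|\mathcal{N} - \mathrm{id}_n\|_\diamond \le \frac{2}{\log n}. \]

5.2. Quantum Capacity.

Example 2. Now let $\mathcal{N} : T(H_+) \to T(H)$ be defined by
\[ \mathcal{N}(\rho) = \frac{1}{2} \mathrm{Tr}(\rho)\, |0\rangle\langle 0| + \frac{1}{2} \rho. \]
That is, $\mathcal{N}$ is a 50% erasure channel, so that $Q(\mathcal{N}) = 0$. Let
\[ \mathcal{M}_n = \left(1 - \frac{1}{\log n}\right) \mathcal{N} + \frac{1}{\log n}\, \mathrm{id}_n. \]
A lower bound on the quantum capacity can be obtained by restricting each input to the span of $\{|i\rangle\}_{i=1,\ldots,n}$, so that $\mathcal{M}_n$ is effectively a quantum erasure channel with n-dimensional inputs and with erasure probability $p_e = \frac{1}{2} - \frac{1}{2 \log n}$. This quantum erasure channel has capacity [7] $(1 - 2p_e) \log n = 1$. Therefore, $Q(\mathcal{M}_n) \ge 1$. As before, we have
\[ \|\mathcal{N} - \mathcal{M}_n\|_\diamond \le \frac{2}{\log n}, \]
so that Q is also discontinuous.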
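The arithmetic behind Examples 1 and 2 can be checked with a few lines of code. This is a rough numerical sketch, assuming base-2 logarithms and n ≥ 2; the function names are hypothetical and only the formulas for the erasure probabilities and the 2 / log n bound are taken from the text.

```python
import math


def distance_upper_bound(n: int) -> float:
    """Upper bound 2 / log n on the distance between N and M_n (Examples 1 and 2)."""
    return 2.0 / math.log2(n)


def classical_erasure_rate(n: int) -> float:
    """(1 - p_e) * log n with p_e = 1 - 1/log n, the lower bound on C(M_n)."""
    p_e = 1.0 - 1.0 / math.log2(n)
    return (1.0 - p_e) * math.log2(n)


def quantum_erasure_rate(n: int) -> float:
    """(1 - 2 p_e) * log n with p_e = 1/2 - 1/(2 log n), the lower bound on Q(M_n)."""
    p_e = 0.5 - 0.5 / math.log2(n)
    return (1.0 - 2.0 * p_e) * math.log2(n)


if __name__ == "__main__":
    # The distance bound decays (slowly) while both rate bounds stay at 1.
    for k in (4, 16, 64, 256):
        n = 2 ** k
        print(k, distance_upper_bound(n), classical_erasure_rate(n), quantum_erasure_rate(n))
```

The rate lower bounds stay at 1 for every n while the distance bound tends to zero, which is the discontinuity at the limiting channel with capacity 0.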



6. Two-Way Capacity and Capacity with Back Communication For a general channel, these capacities are not known to have a closed form expression. In this setting, an argument similar to that for continuity of asymptotic entanglement measures in [14] can be used for the interior of the nonzero capacity region. Q 2 and Q B differ in the definition of the n-use protocol, and we will see that this difference does not affect the argument, and we only talk about Q 2 for clarity. For any metric chosen for the space of channels, continuity of Q 2 at N can be stated as ∀ > 0, ∃δ > 0 such that ∀N ∈ B(N , δ), |Q 2 (N ) − Q 2 (N )| ≤ , where B(N , δ) is an open ball of radius δ centered at N (similarly for Q B ). We consider the set of channels taking din to dout dimensions. 6.1. Interior of {Q 2 (N ) > 0}. Let us denote the interior of {Q 2 (N ) > 0} by Q+2 . Suppose N ∈ Q+2 . Using the definition of continuity stated above, we will derive δ as a function of and other relevant parameters, so that ∀ > 0, ∃δ > 0 such that ∀N ∈ B(N , δ), |Q 2 (N ) − Q 2 (N )| ≤ . First, consider B(N , ), where is small enough to ensure B(N , ) ⊂ Q+2 (i.e., Q 2 > 0 on the entire B(N , )). Second, for every M on the boundary of B(N , ), we specify two other channels M1 and M2 so that: M = p1 M1 + (1 − p1 )N , N = p2 M2 + (1 − p2 )M,

(3) (4)

for some p1 , p2 ∈ [0, 1]. M1 , M2 need not be in Q+2 but have to be TCP maps. Such M1 , M2 always exist (for example, we can take them to be M and its antipodal point on B(N , ) respectively). We take M1 , M2 to be on the boundary of the set of channels, as far from N , M as possible to minimize p1 , p2 . The concepts involved in the proof are summarized in the following diagram:

We show how to simulate M by N , from which we derive an upper bound on Q 2 (M), Eq. (5), in terms of Q 2 (N ): (1) We start from the definition of Q 2 (N ). Consider any R1 < Q 2 (N ), with δ1 > 0 such that R1 = Q 2 (N ) − δ1 . For any > 0, ∃n such that ∀n 1 ≥ n , there is a protocol Pn 1 with n 1 uses of N and 2-way classical communication that simulates the identity map on an 2n 1 R1 -dimensional system -close in diamond norm.


(2) Any channel can be trivially (and inefficiently) simulated by either one of the two following methods: Alice sends the input noiselessly to Bob who then locally applies the channel, or Alice applies the channel on the input and sends the resulting state to Bob via the noiseless channel. Thus, log d noiseless qubit channels are sufficient for simulating any channel where d = min(din , dout ), in an exact and 1-shot manner. (3) Using the assisting classical communication (only one of the forward or backward direction suffices), Alice and Bob can agree on n biased coins (with probabilities of the two outcomes being p1 , 1− p1 ) and apply the channel N or M1 accordingly. Due to the Chernoff bound, ∀δCh , ∃n Ch such that the probability is less than that it requires more than n( p1 + δCh ) uses of M1 or more than n(1 − p1 + δCh ) uses of N . In this unlikely event, Alice and Bob just run an inaccurate simulation. We now put these 3 steps together. Let n 1 = n (Q 2 (N ) − δ1 )−1 ( p1 + δCh ) log d. We use an n 1 -use protocol of N to simulate n( p1 + δCh ) log d identity channels (it will be -close in diamond norm if n 1 ≥ n ) which in turns simulates n( p1 + δCh ) uses of M1 with the same precision. In addition to the above, we use the coin tosses and n(1 − p1 + δCh ) direct uses of N to simulate n uses of M. This simulation is -close unless an atypical outcome of the coin tosses occurs. If n is large enough, then n 1 ≥ n and n ≥ n Ch , the simulation is 2-close in diamond norm. This takes a total of n(1 − p1 + δCh ) + n (Q 2 (N ) − δ1 )−1 ( p1 + δCh ) log d uses of N . Now, ∀δ2 > 0, R2 = Q 2 (M) − δ2 , ∃m such that ∀n ≥ m , there is a protocol with n uses of M that simulates the identity map on 2n R2 dimensions -close in diamond norm. Substitute these n uses of M by the 2-close simulation above. We have an 3-close simulation of the 2n R2 -dim identity map with n(1 − p1 + δCh ) + n/[Q 2 (N ) − δ1 ] × ( p1 + δCh ) log d uses of N . 
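Letting $\epsilon$, $\delta_1$, $\delta_2$, and $\delta_{Ch} \to 0$, we have
\[ \left[ \frac{\log d}{Q_2(\mathcal{N})}\, p_1 + (1 - p_1) \right] Q_2(\mathcal{N}) \ge Q_2(\mathcal{M}). \tag{5} \]
Running the same argument with $\mathcal{N}$, $\mathcal{M}$ reversed and using Eq. (4) instead, we have
\[ \left[ \frac{\log d}{Q_2(\mathcal{M})}\, p_2 + (1 - p_2) \right] Q_2(\mathcal{M}) \ge Q_2(\mathcal{N}). \tag{6} \]
Together,
\[ |Q_2(\mathcal{N}) - Q_2(\mathcal{M})| \le \min[\, p_1 (\log d - Q_2(\mathcal{N})),\; p_2 (\log d - Q_2(\mathcal{M}))\, ]. \tag{7} \]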

N

which is colinear with N and M, and is on the boundary of We now consider B(N , δ). We can run the same argument with N in place of M but with the same M1 , M2 . Here, N = δ M + (1 − δ )N . Eliminating M from Eqs. (3) and (4), one can verify that the parameters change as δ ,

δ p2 → q 2 = p2 ·

p1 → q 1 = p1

1 δ

p2

+ (1 − p2 )

≤ p2

1 δ δ ≤ 2 p2 .

1 − p2

In the last inequality, we use the fact that p2 ≤ 1/2 by construction. Using Eq. (5) for

N and substituting p1 , p2 by q1 , q1 , and for δ ≤ 2 log d, |Q 2 (N ) − Q 2 (N )| ≤ min[q1 (log d − Q 2 (N )), q2 (log d − Q 2 (N ))] ≤ .



Note that δ depends on N ∈ B(N , δ) via the dependence of M and on N . The continuity bound is not as tight as those derived for the unassisted capacities, but it has the merit of being independent of the metric used for the channels. The same argument holds for continuity of Q B in the interior of Q B (N ) > 0 with the only modification in the definition of an n-use protocol. A less -δ-loaded, more concise, and slightly more heuristic derivation in terms of resource inequalities [27] is also possible.1

6.2. Q B of Erasure Channel. The erasure channel of erasure probability p acts on qubit states as follows: E p (ρ) = (1 − p)ρ + p|22|, where |2 can be viewed as an error symbol. Q 2 (E p ) = 1 − p but an expression for Q B (E p ) is unknown, though it is known to be positive for p < 1. Instead of the continuity of Q 2 or Q B at E p , we can ask if these capacities are continuous as a function of p. In other words, we are considering the restriction of these functions to the 1-parameter family of channels E p . 1 We describe the proof in the language of resource inequalities[27]. Each resource inequality (RI) S + 1 S2 · · · ≥ R1 + R2 · · · represents the fact that n units of LHS resources can be used to simulate n units of the RHS resources for asymptotically large n and with sufficient accuracy (say, in diamond norm of the operations involved in the RHS). If the simulation is sufficiently good, manipulating RIs as though they are usual algebraic inequalities can often be justified. In particular, the following can be rigorously proved in many situations. (1) Multiplication by a positive scalar on both sides is allowed. (2) Inequalities can be summed. (3) The inequalities are transitive. (4) Substitution that preserves the inequalities is allowed. Operationally, this requires that simulations are accurate enough to be composable, so that recursive/concatenated simulation is possible. In a related manner, cancellation (or subtraction) is frequently possible. Now, we give an argument for Eq. (11). By the definition of two-way assisted channel capacity,

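\[ \mathcal{N} + \infty\, CC_{\leftrightarrow} \ge Q_2(\mathcal{N})\, I, \tag{8} \]
where $\infty\, CC_{\leftrightarrow}$ denotes the assistance. Using log d qubit noiseless channels to simulate $\mathcal{M}_1$ (see main text),
\[ \log d \; I \ge \mathcal{M}_1. \tag{9} \]
Together
\[ \mathcal{N} + \infty\, CC_{\leftrightarrow} \ge Q_2(\mathcal{N})\, I \ge \frac{Q_2(\mathcal{N})}{\log d}\, \mathcal{M}_1. \tag{10} \]
It means that we can use n copies of $\mathcal{N}$ to transmit $n \frac{Q_2(\mathcal{N})}{\log d}$ inputs to the receiver, who then applies copies of $\mathcal{M}_1$ locally, thereby giving a protocol to simulate $\mathcal{M}_1$ using $\mathcal{N}$. Equation (3) means that
\[ p_1 \mathcal{M}_1 + (1 - p_1)\, \mathcal{N} + \infty\, CC_{\leftarrow} \ge \mathcal{M}. \tag{11} \]
Using free back communication to generate n biased coins (see main text),
\[ \left[ p_1 \frac{\log d}{Q_2(\mathcal{N})} + (1 - p_1) \right] \mathcal{N} + \infty\, CC_{\leftrightarrow} \ge \mathcal{M}. \tag{12} \]
This implies
\[ \left[ p_1 \frac{\log d}{Q_2(\mathcal{N})} + (1 - p_1) \right] Q_2(\mathcal{N}) \ge Q_2(\mathcal{M}) \tag{13} \]
as claimed.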


In this restricted domain, Q 2 (E p ) = 1 − p is clearly continuous. For Q B (E p ), the previous proof now holds on the restricted domain for p < 1. For the point p = 1, continuity still holds because Q B (E p ) ≤ Q 2 (E p ) = 1 − p which is vanishing (converging towards Q B (E1 )) as p → 1. 7. Discussion We have shown that many of the communication capacities of a quantum channel are continuous. For unassisted capacities, such as private, quantum, and classical capacities we proved continuity using Theorem 6. In these cases, we showed variations in capacity of the form A log d + B H () for channels that are distance apart, where the constants A and B depend on the particular capacity under consideration and are typically of order unity. For the more involved case of two-way capacity, we have shown continuity of Q 2 on the interior of {Q(N ) > 0}, and similarly for Q B by making use of an argument of Vidal[14]. In general, application of Theorem 6 will give continuity any time a regularized capacity formula is available. In particular, it can easily be used to show the continuity of the capacity region of multi-user channels such as the multiple access channel [28] and broadcast channels [29,30]. Acknowledgements. We are grateful to Aram Harrow for discussions about continuity of the two-way and back-assisted capacities, and John Smolin for suggesting the example of discontinuity for the classical capacity of infinite-dimensional channels. We thank Bill Rosgen for a careful reading and many helpful corrections on an earlier version of the manuscript. DL was supported by CRC, CFI, ORF, NSERC, CIFAR, MITACS, ARO, and QuantumWorks. GS was supported by the DARPA QUEST program under contract no. HR0011-09-C-0047.

References
1. Holevo, A.S.: The capacity of the quantum channel with general signal states. IEEE Trans. Inf. Theory 44(1), 269–273 (1998)
2. Schumacher, B., Westmoreland, M.D.: Sending classical information via noisy quantum channels. Phys. Rev. A 56(1), 131–138 (1997)
3. Devetak, I.: The private classical capacity and quantum capacity of a quantum channel. IEEE Trans. Inf. Theory 51, 44–55 (2005)
4. Lloyd, S.: Capacity of the noisy quantum channel. Phys. Rev. A 55, 1613–1622 (1997)
5. Shor, P.W.: The quantum channel capacity and coherent information. In: Lecture notes, MSRI Workshop on Quantum Computation, 2002. Available online at http://www.msri.org/publications/ln/msri/2002/quantumcrypto/shor/1/
6. Bennett, C.H., Shor, P.W., Smolin, J.A., Thapliyal, A.V.: Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem. IEEE Trans. Inf. Theory 48, 2637–2655 (2002)
7. Bennett, C.H., DiVincenzo, D.P., Smolin, J.A.: Capacities of quantum erasure channels. Phys. Rev. Lett. 78(16), 3217–3220 (1997)
8. Barnum, H., Smolin, J.A., Terhal, B.M.: Quantum capacity is properly defined without encodings. Phys. Rev. A 58, 3496–3501 (1998)
9. Keyl, M., Werner, R.F.: How to correct small quantum errors. In: Coherent Evolution in Noisy Environments, Lecture Notes in Physics, 611. Berlin-Heidelberg-New York: Springer Verlag, 2002, pp. 263–286
10. Shirokov, M.E.: The Holevo capacity of infinite dimensional channels and the additivity problem. Commun. Math. Phys. 262, 137–159 (2006)
11. Fannes, M.: A continuity property of the entropy density for spin lattice systems. Commun. Math. Phys. 31, 291–294 (1973)
12. Nielsen, M.A.: Continuity bounds for entanglement. Phys. Rev. A 61, 064301 (2000)
13. Donald, M., Horodecki, M.: Continuity of relative entropy of entanglement. Phys. Lett. A 264, 257–260 (1999)
14. Vidal, G.: On the continuity of asymptotic measures of entanglement. http://arxiv.org/abs/quant-ph/0203107v1, 2002
15. Alicki, R., Fannes, M.: Continuity of quantum conditional information. J. Phys. A: Math. Gen. 37, L55–L57 (2004)
16. Christandl, M., Winter, A.: “Squashed entanglement” - an additive entanglement measure. J. Math. Phys. 45, 829–840 (2004)
17. Paulsen, V.I.: Completely Bounded Maps and Dilations. John Wiley & Sons, Inc., New York, 1987
18. Devetak, I., Junge, M., King, C., Ruskai, M.B.: Multiplicativity of completely bounded p-norms implies a new additivity result. Commun. Math. Phys. 266, 37–63 (2006)
19. Holevo, A.S.: Statistical problems in quantum physics. In: G. Maruyama, J.V. Prokhorov, eds., Proceedings of the second Japan-USSR Symposium on Probability Theory, Volume 330 of Lecture Notes in Mathematics, Berlin: Springer-Verlag, 1973, pp. 104–119
20. Shor, P.W., Smolin, J.A.: Quantum error-correcting codes need not completely reveal the error syndrome. http://arxiv.org/abs/quant-ph/9604006v2, 1996
21. Smith, G., Smolin, J.A.: Degenerate quantum codes for Pauli channels. Phys. Rev. Lett. 98, 030501 (2007)
22. Smith, G., Renes, J., Smolin, J.A.: Structured codes improve the Bennett-Brassard-84 quantum key rate. Phys. Rev. Lett. 100, 170502 (2008)
23. Hastings, M.B.: A counterexample to additivity of minimum output entropy. http://arxiv.org/abs/0809.3972v3 [quant-ph], 2008
24. Bennett, C.H., DiVincenzo, D.P., Smolin, J.A., Wootters, W.K.: Mixed state entanglement and quantum error correction. Phys. Rev. A 54, 3824–3851 (1996)
25. Horodecki, M., Horodecki, P., Horodecki, R.: Unified approach to quantum capacities: towards quantum noisy coding theorem. http://arxiv.org/abs/quant-ph/0003040v1, 2000
26. Shirokov, M.: On channels with finite Holevo capacity. Theory of Probability and Its Applications 53(4), 732–750 (2008)
27. Devetak, I., Harrow, A., Winter, A.: A family of quantum protocols. Phys. Rev. Lett. 92, 187901 (2004)
28. Yard, J., Devetak, I., Hayden, P.: Capacity theorems for quantum multiple-access channels: Classical-quantum and quantum-quantum capacity regions. IEEE Trans. Inf. Theory 54, 3091–3113 (2008)
29. Yard, J., Hayden, P., Devetak, I.: Quantum broadcast channels. http://arxiv.org/abs/quant-ph/0603098v1, 2006
30. Dupuis, F., Hayden, P.: A father protocol for quantum broadcast channels. http://arxiv.org/abs/quant-ph/0612155v2, 2006

Communicated by M. B. Ruskai


Communications in


Invariance of the White Noise for KdV Tadahiro Oh Department of Mathematics, University of Toronto, 40 St. George St, Rm 6290, Toronto, ON M5S 2E4, Canada. E-mail: [email protected] Received: 11 December 2008 / Accepted: 9 March 2009 Published online: 24 June 2009 – © Springer-Verlag 2009

Abstract: We prove the invariance of the mean 0 white noise for the periodic KdV. First, we show that the Besov-type space $\hat{b}^s_{p,\infty}$, $sp < -1$, contains the support of the white noise. Then, we prove local well-posedness in $\hat{b}^s_{p,\infty}$ for $p = 2+$, $s = -\frac{1}{2}+$ such that $sp < -1$. In establishing the local well-posedness, we use a variant of the Bourgain spaces with a weight. This provides an analytical proof of the invariance of the white noise under the flow of KdV obtained in Quastel-Valko [21].

1. Introduction

In this paper, we consider the periodic Korteweg-de Vries (KdV) equation:
\[ u_t + u_{xxx} + u u_x = 0, \qquad u|_{t=0} = u_0, \tag{1} \]
where u is a real-valued function on $\mathbb{T} \times \mathbb{R}$ with $\mathbb{T} = [0, 2\pi)$ and the mean of $u_0$ is 0. By the conservation of the mean, it follows that the solution u(t) of (1) has the spatial mean 0 for all $t \in \mathbb{R}$ as long as it exists. Our main goal is to show that the mean 0 white noise
\[ d\mu = Z^{-1} \exp\left( -\tfrac{1}{2} \int u^2 \, dx \right) \prod_{x \in \mathbb{T}} du(x), \quad u \text{ mean } 0, \tag{2} \]
is invariant under the flow and that (1) is globally well-posed almost surely on the statistical ensemble (i.e. on the support of µ) without using the complete integrability of the equation.

First, we briefly review recent well-posedness results of the periodic KdV (1). In [2], Bourgain introduced a new weighted space-time Sobolev space $X^{s,b}$ whose norm is given by
\[ \|u\|_{X^{s,b}(\mathbb{T} \times \mathbb{R})} = \left\| \langle n \rangle^s \langle \tau - n^3 \rangle^b\, \hat{u}(n, \tau) \right\|_{L^2_{n,\tau}(\mathbb{Z} \times \mathbb{R})}, \tag{3} \]


where · = 1+|·|. He proved the local well-posedness of (1) in L 2 (T) via the fixed point argument, immediately yielding the global well-posedness in L 2 (T) thanks to the conservation of the L 2 norm. Kenig-Ponce-Vega [14] improved Bourgain’s result and estab1 lished the local well-posedness in H − 2 (T). Colliander-Keel-StaffilaniTakaoka-Tao [9] proved the corresponding global well-posedness result via the I -method. More recently, Kappeler-Topalov [13] proved the global well-posedness of the KdV in H −1 (T), using the complete integrability of the equation. There are also results on the necessary conditions on the regularity with respect to smoothness or uniform continuity of the solution map : u 0 ∈ H s (T) → u(t) ∈ H s (T). Bourgain [3] showed that if the solution map is C 3 , then s ≥ − 21 . Christ-Colliander-Tao [8] proved that if the solution map is uniformly continuous, then s ≥ − 21 . (Also, see Kenig-Ponce-Vega [15].) In [4], Bourgain proved the invariance of the Gibbs measures for the nonlinear Schrödinger equations (NLS). In dealing with the super-cubic nonlinearity, (where only the local well-posedness result was available), he used a probabilistic argument and the approximating finite dimensional ODEs (with the invariant finite dimensional Gibbs measures) to extend the local solutions to the global ones almost surely on the statistical ensemble and showed the invariance of the Gibbs measures. Note that it was crucial that the local well-posedness was obtained with a “good” estimate on the solutions (e.g. via the fixed point argument) for his argument to obtain the uniform convergence of the solutions of the finite dimensional ODEs to those of the full PDE. Also see Burq-Tzvetkov [6], Oh [19], and Tzvetkov [23,24]. In the present paper, we’d like to follow Bourgain’s argument [4]. Unfortunately, it is known (cf. Zhidkov [25]) that the white noise µ in (2) is supported on ∩s 2, where B sp ,∞ is p the usual Besov space with p = p−1 . In Sect. 
3, we use the theory of abstract Wiener spaces to show that bsp,∞ contains the full support of the white noise for sp < −1. Now, we’d like to establish the local well-posedness in bsp,∞ for sp < −1. Note 1

that this space is essentially less regular than H − 2 since it contains the support of the bsp,∞ (T). Let X s,b white noise. First, define a variant of the X s,b space adjusted to p be the completion of the Schwartz class S(T × R) under the norm u X s,b = ns τ − n 3 b u (n, τ )bsp,∞ L τp . p

(5)

Then, one of the crucial bilinear estimates that we need to prove is:
\[ \|\partial_x (uv)\|_{X_p^{s,-\frac{1}{2}}} \lesssim \|u\|_{X_p^{s,\frac{1}{2}}} \|v\|_{X_p^{s,\frac{1}{2}}}. \tag{6} \]

Invariance of White Noise for the Periodic KdV


As in [2] and [14], a key ingredient is the algebraic identity $n^3 - n_1^3 - n_2^3 = 3 n n_1 n_2$ for $n = n_1 + n_2$. However, this is not enough to prove (6) for $sp < -1$. In establishing the local well-posedness through the usual integral equation, we view the nonlinear problem (1) as a perturbation to the Airy equation $u_t + u_{xxx} = 0$. Noting that the Fourier transform of the solution to the Airy equation is a measure supported on $\{\tau = n^3\}$, we modify $X_p^{s,b}$ with a carefully chosen weight $w(n, \tau)$ in Sect. 4 to treat the resonant cases in (6). (cf. Bejenaru-Tao [1], Kishimoto [17] in the context of NLS.)

Theorem 1. Assume the mean 0 condition on $u_0$. Let $s = -\frac{1}{2}+$, $p = 2+$ such that $sp < -1$. Then, KdV (1) is locally well-posed in $\hat{b}^s_{p,\infty}$.

Once we prove Theorem 1, we can use the finite dimensional approximation to (1):
\[ u^N_t + u^N_{xxx} + P_N(u^N u^N_x) = 0, \qquad u^N|_{t=0} = u_0^N, \tag{7} \]
where $P_N$ is the projection onto the frequencies $|n| \le N$ and $u^N = P_N u$. Note that (7) is Hamiltonian, and that it preserves $\int (u^N)^2 \, dx$. Hence, by Liouville’s theorem, the finite dimensional white noise
\[ d\mu_N = Z_N^{-1} \exp\left( -\tfrac{1}{2} \int (u^N)^2 \, dx \right) \prod_{x \in \mathbb{T}} du^N(x) \tag{8} \]
is invariant under the flow of (7). The remaining argument follows just as in [4,6,19,23,24], and we obtain the a.s. GWP of (1) and the invariance of the white noise µ.

Theorem 2. Let $\{g_n(\omega)\}_{n=1}^{\infty}$ be a sequence of i.i.d. standard complex Gaussian random variables on a probability space $(\Omega, \mathcal{F}, P)$. Consider (1) with initial data $u_0 = \sum_{n \ne 0} g_n(\omega) e^{inx}$, where $g_{-n} = \overline{g_n}$. Then, (1) is globally well-posed almost surely in $\omega \in \Omega$. Moreover, the mean 0 white noise µ is invariant under the flow.

Remark 1.1. This provides an analytical proof of the invariance of the white noise µ. Recently, Quastel-Valko [21] proved the invariance of the white noise under the flow of KdV. Their argument combines the GWP in $H^{-1}(\mathbb{T})$ via the complete integrability (Kappeler-Topalov [13]), the correspondence between the white noise for KdV and the Gibbs measure (weighted Wiener measure) of mKdV under the (corrected) Miura transform (Cambronero-McKean [7]), and the invariance of the Gibbs measure of mKdV (Bourgain [4]). Their method is not applicable to the general non-integrable coupled KdV system considered in [19], whereas our argument is applicable in the non-integrable case as well.

Remark 1.2. Let $\mathcal{F}L^{s,p}$ be the space of functions on $\mathbb{T}$ defined via the norm $\|f\|_{\mathcal{F}L^{s,p}} = \|\langle n \rangle^s \hat{f}(n)\|_{L^p_n}$. Then, Theorems 1 and 2 can also be established in $\mathcal{F}L^{s,p}$ for some $s = -\frac{1}{2}+$, $p = 2+$ with $sp < -1$. See Remark 4.7.

This paper is organized as follows: In Sect. 2, we introduce some standard notations. In Sect. 3, we go over the basic theory of Gaussian Hilbert spaces and abstract Wiener spaces. Then, we give the precise mathematical meaning to the white noise µ and show that it is a (countably additive) probability measure on $\hat{b}^s_{p,\infty}$ for $sp < -1$. In Sect. 4, we introduce the function spaces and linear estimates. Then, we prove Theorem 1 by establishing the crucial bilinear estimate.
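The algebraic identity $n^3 - n_1^3 - n_2^3 = 3 n n_1 n_2$ for $n = n_1 + n_2$ quoted in the introduction is elementary, and a brute-force check over a box of integer frequencies is a quick sanity test. The helper names below are hypothetical and the check is only an illustration, not part of the argument.

```python
def resonance(n1: int, n2: int) -> int:
    """Return n^3 - n1^3 - n2^3 for the output frequency n = n1 + n2."""
    n = n1 + n2
    return n ** 3 - n1 ** 3 - n2 ** 3


def check_identity(limit: int) -> bool:
    """Verify n^3 - n1^3 - n2^3 == 3 * n * n1 * n2 for all |n1|, |n2| <= limit."""
    return all(
        resonance(n1, n2) == 3 * (n1 + n2) * n1 * n2
        for n1 in range(-limit, limit + 1)
        for n2 in range(-limit, limit + 1)
    )


if __name__ == "__main__":
    print(check_identity(50))
    print(resonance(1, 2))
```

In particular the expression vanishes only when one of n, n1, n2 is zero, and the zero frequency is excluded by the mean 0 assumption.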


2. Notation

In the periodic setting on $\mathbb{T}$, the spatial Fourier domain is $\mathbb{Z}$. Let dn be the normalized counting measure on $\mathbb{Z}$, and we say $f \in L^p(\mathbb{Z})$, $1 \le p < \infty$, if
\[ \|f\|_{L^p(\mathbb{Z})} = \left( \int_{\mathbb{Z}} |f(n)|^p \, dn \right)^{\frac{1}{p}} := \left( \frac{1}{2\pi} \sum_{n \in \mathbb{Z}} |f(n)|^p \right)^{\frac{1}{p}} < \infty. \]
If $p = \infty$, we have the obvious definition involving the essential supremum. We often drop $2\pi$ for simplicity. If a function depends on both x and t, we use $\wedge_x$ (and $\wedge_t$) to denote the spatial (and temporal) Fourier transform, respectively. However, when there is no confusion, we simply use $\wedge$ to denote the spatial Fourier transform, the temporal Fourier transform, and the space-time Fourier transform, depending on the context.

Given a space X of functions on $\mathbb{T} \times \mathbb{R}$, we define the local in time restriction $X(\mathbb{T} \times I)$ for any time interval $I = [t_1, t_2] \subset \mathbb{R}$ (or simply $X[t_1, t_2]$) by
\[ \|u\|_{X_I} = \|u\|_{X(\mathbb{T} \times I)} = \inf \left\{ \|\tilde{u}\|_{X(\mathbb{T} \times \mathbb{R})} : \tilde{u}|_I = u \right\}. \]
For a Banach space $X \subset \mathcal{S}'(\mathbb{T} \times \mathbb{R})$, we use $\hat{X}$ to denote the space of the Fourier transforms of the functions in X, which is a Banach space with the norm $\|f\|_{\hat{X}} = \|\mathcal{F}^{-1}_{n,\tau} f\|_X$, where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform (in n and τ). Also, for a space Y of functions on $\mathbb{Z}$, we use $\hat{Y}$ to denote the space of the inverse Fourier transforms of the functions in Y with the norm $\|f\|_{\hat{Y}} = \|\mathcal{F} f\|_Y$. Now, define $\hat{b}^s_{p,q}(\mathbb{T})$ by the norm
\[ \|f\|_{\hat{b}^s_{p,q}(\mathbb{T})} = \|\hat{f}\|_{b^s_{p,q}(\mathbb{Z})} := \left\| \left\| \langle n \rangle^s \hat{f}(n) \right\|_{L^p_{|n| \sim 2^j}} \right\|_{l^q_j} = \left( \sum_{j=0}^{\infty} \left( \sum_{|n| \sim 2^j} \langle n \rangle^{sp} |\hat{f}(n)|^p \right)^{\frac{q}{p}} \right)^{\frac{1}{q}} \tag{9} \]
for $q < \infty$ and by (4) when $q = \infty$. Lastly, let $\eta \in C_c^\infty(\mathbb{R})$ be a smooth cutoff function supported on [−1, 1] with $\eta \equiv 1$ on $[-\frac{1}{2}, \frac{1}{2}]$ and let $\eta_T(t) = \eta(T^{-1} t)$.

We use c, C to denote various constants, usually depending only on s and p. If a constant depends on other quantities, we will make it explicit. We use $A \lesssim B$ to denote an estimate of the form $A \le CB$. Similarly, we use $A \sim B$ to denote $A \lesssim B$ and $B \lesssim A$ and use $A \ll B$ when there is no general constant C such that $B \le CA$. We also use a+ (and a−) to denote a + ε (and a − ε), respectively, for arbitrarily small $\varepsilon \ll 1$.

3. Gaussian Measures in Hilbert Spaces and Abstract Wiener Spaces

In this section, we go over the basic theory of Gaussian measures in Hilbert spaces and abstract Wiener spaces to provide the precise meaning of the white noise “$d\mu = Z^{-1} \exp(-\frac{1}{2} \int u^2 \, dx) \prod_{x \in \mathbb{T}} du(x)$” appearing in Sect. 1. For details, see Zhidkov [25], Gross [12], and Kuo [18].
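As a concrete reading of the norm (9) from Sect. 2, the following sketch evaluates it for a finitely supported sequence of Fourier coefficients. Several conventions here are assumptions made for illustration: the dyadic block $|n| \sim 2^j$ is taken to mean $2^j \le |n| < 2^{j+1}$, the weight is $\langle n \rangle = 1 + |n|$, the factor $2\pi$ is dropped, the zero frequency is skipped (mean 0), and the case $q = \infty$ is read as a supremum over the blocks. The name `b_norm` is hypothetical.

```python
import math
from typing import Dict


def b_norm(coeffs: Dict[int, complex], s: float, p: float, q: float = math.inf) -> float:
    """Fourier-side Besov-type norm of a finitely supported coefficient sequence.

    coeffs maps a frequency n to its Fourier coefficient; n = 0 is ignored.
    """
    blocks: Dict[int, float] = {}
    for n, c in coeffs.items():
        if n == 0:
            continue
        # Index j of the assumed dyadic block 2^j <= |n| < 2^(j+1).
        j = abs(n).bit_length() - 1
        blocks[j] = blocks.get(j, 0.0) + (1 + abs(n)) ** (s * p) * abs(c) ** p
    # l^p norm of the weighted coefficients inside each block.
    block_norms = [v ** (1.0 / p) for v in blocks.values()]
    if not block_norms:
        return 0.0
    if math.isinf(q):
        return max(block_norms)
    return sum(b ** q for b in block_norms) ** (1.0 / q)


if __name__ == "__main__":
    # Coefficients of modulus 1 on 1 <= |n| < 2^J, a crude stand-in for white noise.
    for J in (6, 10, 14):
        flat = {n: 1.0 for m in range(1, 2 ** J) for n in (m, -m)}
        print(J, b_norm(flat, s=-0.6, p=2.0), b_norm(flat, s=0.0, p=2.0))
```

For coefficients of modulus 1, each block contributes roughly $2^{j(1+sp)}$ before the p-th root, so the supremum over blocks stays bounded when $sp < -1$ and grows with the cutoff when $sp > -1$; this is only a heuristic for the threshold, since the actual white noise coefficients are random.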



First, recall (centered) Gaussian measures in $\mathbb{R}^n$. Let $n \in \mathbb{N}$ and B be a symmetric positive n × n matrix with real entries. The Borel measure µ in $\mathbb{R}^n$ with the density
\[ d\mu(x) = \frac{1}{\sqrt{(2\pi)^n \det(B)}} \exp\left( -\tfrac{1}{2} \langle B^{-1} x, x \rangle_{\mathbb{R}^n} \right) \]
is called a (nondegenerate centered) Gaussian measure in $\mathbb{R}^n$. Note that $\mu(\mathbb{R}^n) = 1$.

Now, we consider the analogous definition of the infinite dimensional (centered) Gaussian measures. Let H be a real separable Hilbert space and B : H → H be a linear positive self-adjoint operator (generally not bounded) with eigenvalues $\{\lambda_n\}_{n \in \mathbb{N}}$ and the corresponding eigenvectors $\{e_n\}_{n \in \mathbb{N}}$ forming an orthonormal basis of H. We call a set M ⊂ H cylindrical if there exists an integer n ≥ 1 and a Borel set $F \subset \mathbb{R}^n$ such that
\[ M = \{ x \in H : (\langle x, e_1 \rangle_H, \cdots, \langle x, e_n \rangle_H) \in F \}. \tag{10} \]
For a fixed operator B as above, we denote by $\mathcal{A}$ the set of all cylindrical subsets of H. One can easily verify that $\mathcal{A}$ is a field. Then, the centered Gaussian measure in H with the correlation operator B is defined as the additive (but not countably additive in general) measure µ defined on the field $\mathcal{A}$ via
\[ \mu(M) = (2\pi)^{-\frac{n}{2}} \prod_{j=1}^{n} \lambda_j^{-\frac{1}{2}} \int_F e^{-\frac{1}{2} \sum_{j=1}^{n} \lambda_j^{-1} x_j^2} \, dx_1 \cdots dx_n, \quad \text{for } M \in \mathcal{A} \text{ as in (10)}. \tag{11} \]
The following theorem tells us when this Gaussian measure µ is countably additive.

Theorem 3.1. The Gaussian measure µ defined in (11) is countably additive on the field $\mathcal{A}$ if and only if B is an operator of trace class, i.e. $\sum_{n=1}^{\infty} \lambda_n < \infty$. If the latter holds, then the minimal σ-field $\mathcal{M}$ containing the field $\mathcal{A}$ of all cylindrical sets is the Borel σ-field on H.

Consider a sequence of the finite dimensional Gaussian measures $\{\mu_n\}_{n \in \mathbb{N}}$ as follows. For fixed $n \in \mathbb{N}$, let $\mathcal{M}_n$ be the set of all cylindrical sets in H of the form (10) with this fixed n and arbitrary Borel sets $F \subset \mathbb{R}^n$. Clearly, $\mathcal{M}_n$ is a σ-field, and setting
\[ \mu_n(M) = (2\pi)^{-\frac{n}{2}} \prod_{j=1}^{n} \lambda_j^{-\frac{1}{2}} \int_F e^{-\frac{1}{2} \sum_{j=1}^{n} \lambda_j^{-1} x_j^2} \, dx_1 \cdots dx_n \]
for $M \in \mathcal{M}_n$, we obtain a countably additive measure $\mu_n$ defined on $\mathcal{M}_n$. Then, one can show that each measure $\mu_n$ can be naturally extended onto the whole Borel σ-field $\mathcal{M}$ of H by $\mu_n(A) := \mu_n(A \cap \mathrm{span}\{e_1, \cdots, e_n\})$ for $A \in \mathcal{M}$. Then, we have

Proposition 3.2. Let µ in (11) be countably additive. Then, $\{\mu_n\}_{n \in \mathbb{N}}$ constructed above converges weakly to µ as n → ∞.

Now, we construct the mean 0 white noise. Let $\phi = \sum_n a_n e^{inx}$ be a real-valued function on $\mathbb{T}$ with mean 0, i.e. we have $a_0 = 0$ and $a_{-n} = \overline{a_n}$. First, define $\mu_N$ on $\mathbb{C}^N \cong \mathbb{R}^{2N}$ with the density
\[ d\mu_N = Z_N^{-1} e^{-\frac{1}{2} \sum_{n=1}^{N} |a_n|^2} \prod_{n=1}^{N} da_n, \tag{12} \]
where $Z_N = \int_{\mathbb{C}^N} e^{-\frac{1}{2} \sum_{n=1}^{N} |a_n|^2} \prod_{n=1}^{N} da_n$. Note that this measure is the induced probability measure on $\mathbb{C}^N$ under the map $\omega \mapsto \{g_n(\omega)\}_{n=1}^{N}$, where $g_n(\omega)$, n = 1, . . . , N,


T. Oh

are i.i.d. standard complex Gaussian random variables. Next, define the white noise $\mu$ by
\[ d\mu = Z^{-1} e^{-\frac{1}{2}\sum_{n\ge1} |a_n|^2} \prod_{n\ge1} da_n, \tag{13} \]
where $Z = \int e^{-\frac{1}{2}\sum_{n\ge1} |a_n|^2} \prod_{n\ge1} da_n$. Then, in the above correspondence, we have $\phi = \sum_{n\ne0} g_n e^{inx}$, where $\{g_n(\omega)\}_{n\ge1}$ are i.i.d. standard complex Gaussian random variables and $g_{-n} = \overline{g_n}$.

Let $\dot H^s_0$ be the homogeneous Sobolev space restricted to the real-valued mean 0 elements. Let $\langle\cdot,\cdot\rangle_{\dot H^s_0}$ denote the inner product in $\dot H^s_0$, i.e. $\langle \sum c_n e^{inx}, \sum d_n e^{inx}\rangle_{\dot H^s_0} = \sum_{n\ge1} |n|^{2s} c_n \overline{d_n}$. Let $B_s = \sqrt{-\Delta}^{\,2s}$. Then, the weighted exponentials $\{|n|^{-s} e^{inx}\}_{n\ne0}$ are the eigenvectors of $B_s$ with the eigenvalue $|n|^{2s}$, forming an orthonormal basis of $\dot H^s_0$. Note that
\[ -\frac{1}{2}\langle B_s^{-1}\phi, \phi\rangle_{\dot H^s_0} = -\frac{1}{2}\Big\langle \sum_{n\ne0} |n|^{-2s} a_n e^{inx}, \sum_{n\ne0} a_n e^{inx}\Big\rangle_{\dot H^s_0} = -\frac{1}{2}\sum_{n\ge1} |a_n|^2. \]

The right-hand side is exactly the expression appearing in the exponent in (13). By Theorem 3.1, $\mu$ is countably additive if and only if $B_s$ is of trace class, i.e. $\sum_{n\ne0} |n|^{2s} < \infty$. Hence, $s < -\frac{1}{2}$ [...] $\varepsilon > 0$, there exists $P_0 \in \mathcal{F}$ such that $\mu(|||Px||| > \varepsilon) < \varepsilon$ for $P \in \mathcal{F}$ orthogonal to $P_0$. Any measurable seminorm is weaker than the norm of $H$, and $H$ is not complete with respect to $|||\cdot|||$ unless $H$ is finite dimensional. Let $B$ be the completion of $H$ with respect to $|||\cdot|||$ and denote by $i$ the inclusion map of $H$ into $B$. The triple $(i, H, B)$ is called an abstract Wiener space.

Now, regarding $y \in B^*$ as an element of $H^* \equiv H$ by restriction, we embed $B^*$ in $H$. Define the extension of $\mu$ onto $B$ (which we still denote by $\mu$) as follows. For a Borel set $F \subset \mathbb{R}^n$, set
\[ \mu(\{x \in B : ((x, y_1), \cdots, (x, y_n)) \in F\}) := \mu(\{x \in H : (\langle x, y_1\rangle_H, \cdots, \langle x, y_n\rangle_H) \in F\}), \]



where $y_j$'s are in $B^*$ and $(\cdot,\cdot)$ denotes the natural pairing between $B$ and $B^*$. Let $\mathcal{R}_B$ denote the collection of cylinder sets $\{x \in B : ((x, y_1), \dots, (x, y_n)) \in F\}$ in $B$.

Theorem 3.3 (Gross [12]). $\mu$ is countably additive in the σ-field generated by $\mathcal{R}_B$.

In the present context, let $H = L^2(\mathbb{T})$ and $B = b^s_{p,\infty}(\mathbb{T})$ for $sp < -1$. Then, we have

Proposition 3.4. The seminorm $\|\cdot\|_{b^s_{p,\infty}}$ is measurable for $sp < -1$. Hence, $(i, H, B) = (i, L^2, b^s_{p,\infty})$ is an abstract Wiener space, and $\mu$ defined in (13) is countably additive in $b^s_{p,\infty}$.

We present the proof of Proposition 3.4 at the end of this section. It seems that the statement in Proposition 3.4 holds true for $sp = -1$ (cf. Roynette [22] for $p = 2$). However, we can choose $s$ and $p$ such that $sp < -1$ for our application, and thus we will not discuss the endpoint case. It follows from the proof that $(i, L^2, \mathcal{F}L^{s,p})$, where $\mathcal{F}L^{s,p} = b^s_{p,p}$, is also an abstract Wiener space for $sp < -1$ (we need a strict inequality in this case). Given an abstract Wiener space $(i, H, B)$, we have the following integrability result due to Fernique [10].

Theorem 3.5 (Theorem 3.1 in [18]). Let $(i, H, B)$ be an abstract Wiener space. Then, there exists $c > 0$ such that $\int_B e^{c\|x\|_B^2}\mu(dx) < \infty$. Hence, there exists $c > 0$ such that $\mu(\|x\|_B > K) \le e^{-cK^2}$.

In our context, if $sp < -1$, we have $\mu(\|\phi\|_{b^s_{p,\infty}(\mathbb{T})} \ge K, \ \phi \text{ mean } 0) \le e^{-cK^2}$ for some $c > 0$. With this estimate and Theorem 1, we can follow the argument in [4] to prove Theorem 2. We omit the details. Also, see [6,19,23,24] for the details.

Proof of Proposition 3.4. We present the proof only for $2 < p < \infty$, which is the relevant case for our application. We just point out that the proof for $p \le 2$ is similar but simpler (where one can use Hölder inequality in place of Lemma 3.6 below). For $p = \infty$, see [4,5,19]. It suffices to show that for given $\varepsilon > 0$, there exists large $M_0$ such that $\mu(\|P_{>M_0}\phi\|_{b^s_{p,\infty}} > \varepsilon) < \varepsilon$, where $P_{>M_0}$ is the projection onto the frequencies $|n| > M_0$.
In the following, write $\phi = \sum_{n\ne0} g_n e^{inx}$, where $\{g_n(\omega)\}_{n=1}^\infty$ is a sequence of i.i.d. standard complex-valued Gaussian random variables and $g_{-n} = \overline{g_n}$. First, recall the following lemma.

Lemma 3.6 (Lemma 4.7 in [20]). Let $\{g_n\}$ be a sequence of i.i.d. standard complex-valued Gaussian random variables. Then, for $M$ dyadic and $\delta > 0$, we have
\[ \lim_{M\to\infty} M^{1-\delta} \frac{\max_{|n|\sim M} |g_n|^2}{\sum_{|n|\sim M} |g_n|^2} = 0, \quad \text{a.s.} \]

Fix $K > 1$ and $\delta \in (0, \frac{1}{2})$ (to be chosen later). Then, by Lemma 3.6 and Egoroff's theorem, there exists a set $E$ such that $\mu(E^c) < \frac{1}{2}\varepsilon$ and the convergence in Lemma 3.6 is uniform on $E$, i.e. we can choose dyadic $M_0$ large enough such that
\[ \frac{\|\{g_n(\omega)\}_{|n|\sim M}\|_{L^\infty_n}}{\|\{g_n(\omega)\}_{|n|\sim M}\|_{L^2_n}} \le M^{-\delta}, \tag{14} \]
for all $\omega \in E$ and dyadic $M > M_0$. In the following, we will work only on $E$ and drop '$\cap E$' for notational simplicity. However, it should be understood that all the events are under the intersection with $E$ so that (14) holds.

Let $\{\sigma_j\}_{j\ge1}$ be a sequence of positive numbers such that $\sum \sigma_j = 1$, and let $M_j = M_0 2^j$ dyadic. Note that $\sigma_j = C2^{-\lambda j} = CM_0^\lambda M_j^{-\lambda}$ for some small $\lambda > 0$ (to be determined later). Then, we have
\[ \mu\big(\|P_{>M_0}\phi\|_{b^s_{p,\infty}} > \varepsilon\big) \le \mu\big(\|\{g_n\}_{|n|>M_0}\|_{b^s_{p,1}} > \varepsilon\big) \le \sum_{j=0}^\infty \mu\big(\|\{\langle n\rangle^s g_n\}_{|n|\sim M_j}\|_{L^p_n} > \sigma_j\varepsilon\big), \tag{15} \]

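where $b^s_{p,1}$ is defined in (9). By interpolation and (14), we have
\[
\begin{aligned}
\|\{\langle n\rangle^s g_n\}_{|n|\sim M_j}\|_{L^p_n}
&\sim M_j^s \|\{g_n\}_{|n|\sim M_j}\|_{L^p_n}
\le M_j^s \|\{g_n\}_{|n|\sim M_j}\|_{L^2_n}^{\frac{2}{p}} \|\{g_n\}_{|n|\sim M_j}\|_{L^\infty_n}^{\frac{p-2}{p}} \\
&\le M_j^s \|\{g_n\}_{|n|\sim M_j}\|_{L^2_n} \bigg(\frac{\|\{g_n\}_{|n|\sim M_j}\|_{L^\infty_n}}{\|\{g_n\}_{|n|\sim M_j}\|_{L^2_n}}\bigg)^{\frac{p-2}{p}}
\le M_j^{s-\delta\frac{p-2}{p}} \|\{g_n\}_{|n|\sim M_j}\|_{L^2_n}
\end{aligned}
\]
a.s. Thus, if we have $\|\{\langle n\rangle^s g_n\}_{|n|\sim M_j}\|_{L^p_n} > \sigma_j\varepsilon$, then we have $\|\{g_n\}_{|n|\sim M_j}\|_{L^2_n} \gtrsim R_j$, where $R_j := \sigma_j\varepsilon M_j^{-s+\delta\frac{p-2}{p}}$. With $p = 2 + 2\theta$, we have
\[ -s + \delta\frac{p-2}{p} = \frac{-sp + 2\delta\theta}{2 + 2\theta} > \frac{1}{2} \]
by taking $\delta$ sufficiently close to $\frac{1}{2}$ since $-sp > 1$. Then, by taking $\lambda > 0$ sufficiently small, we have
\[ R_j = \sigma_j\varepsilon M_j^{-s+\delta\frac{p-2}{p}} = C\varepsilon M_0^\lambda M_j^{-s+\delta\frac{p-2}{p}-\lambda} \gtrsim C\varepsilon M_0^\lambda M_j^{\frac{1}{2}+}. \]
By a direct computation in the polar coordinates, we have
\[ \mu\big(\|\{g_n\}_{|n|\sim M_j}\|_{L^2_n} \gtrsim R_j\big) \sim \int_{B^c(0,R_j)} e^{-\frac{1}{2}|g|^2} \prod_{|n|\sim M_j} dg_n \lesssim \int_{R_j}^\infty e^{-\frac{1}{2}r^2} r^{2\cdot\#\{|n|\sim M_j\}-1}\, dr. \]
Note that the implicit constant in the inequality is $\sigma(S^{2\cdot\#\{|n|\sim M_j\}-1})$, a surface measure of the $2\cdot\#\{|n|\sim M_j\}-1$ dimensional unit sphere. We drop it since $\sigma(S^n) = 2\pi^{\frac{n}{2}}/\Gamma(\frac{n}{2}) \lesssim 1$. By change of variable $t = M_j^{-\frac{1}{2}}r$, we have $r^{2\cdot\#\{|n|\sim M_j\}-2} \lesssim r^{4M_j} \sim M_j^{2M_j}t^{4M_j}$. Since $t > M_j^{-\frac{1}{2}}R_j = C\varepsilon M_0^\lambda M_j^{0+}$, we have
\[ M_j^{2M_j} = e^{2M_j\ln M_j} < e^{\frac{1}{8}M_jt^2} \quad \text{and} \quad t^{4M_j} < e^{\frac{1}{8}M_jt^2} \]
for $M_0$ sufficiently large. Thus, we have $r^{2\cdot\#\{|n|\sim M_j\}-2} < e^{\frac{1}{4}M_jt^2} = e^{\frac{1}{4}r^2}$ for $r > R_j$. Hence, we have
\[ \mu\big(\|\{g_n\}_{|n|\sim M_j}\|_{L^2_n} \gtrsim R_j\big) \le C\int_{R_j}^\infty e^{-\frac{1}{4}r^2} r\, dr \le e^{-cR_j^2} = e^{-cC^2M_0^{2\lambda}M_j^{1+}\varepsilon^2}. \tag{16} \]
From (15) and (16), we have
\[ \mu\big(\|P_{>M_0}\phi\|_{b^s_{p,\infty}} > \varepsilon\big) \le \sum_{j=1}^\infty e^{-cC^2M_0^{1+2\lambda+}2^{j+}\varepsilon^2} \le \frac{1}{2}\varepsilon \]
by choosing $M_0$ sufficiently large.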
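The concentration statement of Lemma 3.6, which drives the proof above, can be explored numerically. The following is only a Monte Carlo sketch under stated assumptions: the dyadic block $|n| \sim M$ is modelled by the indices $M \le n < 2M$, each $|g_n|^2$ is sampled as $x^2 + y^2$ with independent standard normal $x$, $y$, and the function name `scaled_ratio` is hypothetical. It illustrates the trend and proves nothing.

```python
import random


def scaled_ratio(M, delta, rng):
    """Return M**(1 - delta) * max|g_n|^2 / sum|g_n|^2 over one dyadic block.

    The block |n| ~ M is modelled by the M indices M <= n < 2M (an assumption).
    """
    # |g_n|^2 = x^2 + y^2 with x, y independent N(0, 1), matching a density
    # proportional to exp(-|g|^2 / 2); this normalization is an assumption.
    squares = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(M)]
    return M ** (1.0 - delta) * max(squares) / sum(squares)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that a run is reproducible
    for k in range(6, 17, 2):
        M = 2 ** k
        print(M, scaled_ratio(M, 0.4, rng))
```

Since the maximum of $M$ such variables grows only logarithmically while the sum grows linearly, the printed values are expected to decay slowly, roughly like $M^{-\delta}\ln M$.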



4. Local Well-Posedness in $b^s_{p,\infty}$

In this section, we prove Theorem 1 via the fixed point argument. In Subsect. 4.1, we go over the previous local well-posedness theory of KdV to motivate the definition of the Bourgain space $W^{s,b}_p$ with the weight, adjusted to $b^s_{p,\infty}$. Then, we establish the basic linear estimates in Subsect. 4.2. Finally, we prove the crucial bilinear estimate in Subsect. 4.3.

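4.1. Bourgain Space with a weight. In [14], Kenig-Ponce-Vega proved
\[ \|\partial_x(uv)\|_{X^{s,-\frac{1}{2}}} \lesssim \|u\|_{X^{s,\frac{1}{2}}} \|v\|_{X^{s,\frac{1}{2}}}, \tag{17} \]
for $s \ge -\frac{1}{2}$ under the mean 0 assumption on $u$ and $v$, where $X^{s,b}$ is defined in (3). Their proof is based on proving the equivalent statement:
\[ \|B_s(f, g)\|_{L^2_{n,\tau}} \lesssim \|f\|_{L^2_{n,\tau}} \|g\|_{L^2_{n,\tau}}, \tag{18} \]
where $B_s(\cdot,\cdot)$ is defined by
\[ B_s(f, g)(n, \tau) = \frac{1}{2\pi\langle\tau - n^3\rangle^{\frac{1}{2}}} \sum_{\substack{n_1+n_2=n \\ n_1\ne0,n}} \frac{|n|\langle n\rangle^s}{\langle n_1\rangle^s\langle n_2\rangle^s} \int_{\tau_1+\tau_2=\tau} \frac{f(n_1,\tau_1)g(n_2,\tau_2)\,d\tau_1}{\langle\tau_1 - n_1^3\rangle^{\frac{1}{2}}\langle\tau_2 - n_2^3\rangle^{\frac{1}{2}}}. \tag{19} \]
One of the main ingredients is the observation due to Bourgain [2]:
\[ n^3 - n_1^3 - n_2^3 = 3nn_1n_2, \quad \text{for } n = n_1 + n_2, \tag{20} \]
which in turn implies that
\[ \mathrm{MAX} := \max(\langle\tau - n^3\rangle, \langle\tau_1 - n_1^3\rangle, \langle\tau_2 - n_2^3\rangle) \gtrsim \langle nn_1n_2\rangle \tag{21} \]
for $n = n_1 + n_2$ and $\tau = \tau_1 + \tau_2$ with $n, n_1, n_2 \ne 0$. Recall that (21) implies that
\[ \frac{|n|\langle n\rangle^s}{\langle n_1\rangle^s\langle n_2\rangle^s} \frac{1}{\langle\tau - n^3\rangle^{\frac{1}{2}}\langle\tau_1 - n_1^3\rangle^{\frac{1}{2}}\langle\tau_2 - n_2^3\rangle^{\frac{1}{2}}} \lesssim \frac{|n|\langle n\rangle^s}{\langle n_1\rangle^s\langle n_2\rangle^s} \frac{1}{\mathrm{MAX}^{\frac{1}{2}}} \lesssim 1 \tag{22} \]
for $s \ge -\frac{1}{2}$. Note that (22) is optimal, for example, when $\langle\tau - n^3\rangle \sim \langle 3nn_1n_2\rangle$ and $\langle\tau_j - n_j^3\rangle \ll \langle 3nn_1n_2\rangle^{0+}$. To exploit this along with the fact the free solution concentrates on the curve $\{\tau = n^3\}$, we define the weight $w(n, \tau)$ in the following. For $k \in \mathbb{Z} \setminus \{0\}$, let
\[ A_k = \{(n, \tau) : |n| \ge C, \ \langle\tau - n^3 + 3n(n-k)k\rangle \ll \langle n\rangle^{\frac{1}{100}}\}, \]
for some $C > 0$. With $\delta = 0+$ (to be determined later), let
\[ w(n, \tau) = 1 + \sum_{k\ne0} \min(\langle k\rangle, \langle n-k\rangle)^\delta \chi_{A_k}(n, \tau). \tag{23} \]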
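The algebra behind (20) and (21) is elementary and can be checked with exact integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, $\tau_1$, $\tau_2$ are restricted to integers for exactness, and plain absolute values are used in place of the brackets $\langle\cdot\rangle$. The bound follows because $(\tau - n^3) - (\tau_1 - n_1^3) - (\tau_2 - n_2^3) = -3nn_1n_2$, so the largest of the three moduli is at least $|nn_1n_2|$.

```python
import random


def resonance(n1, n2):
    """Return n^3 - n1^3 - n2^3 for n = n1 + n2; identity (20) says this is 3*n*n1*n2."""
    n = n1 + n2
    return n ** 3 - n1 ** 3 - n2 ** 3


def max_modulation(tau1, tau2, n1, n2):
    """Largest of |tau - n^3|, |tau1 - n1^3|, |tau2 - n2^3| with tau = tau1 + tau2."""
    n, tau = n1 + n2, tau1 + tau2
    return max(abs(tau - n ** 3), abs(tau1 - n1 ** 3), abs(tau2 - n2 ** 3))


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        n1, n2 = rng.randint(-50, 50), rng.randint(-50, 50)
        tau1, tau2 = rng.randint(-10 ** 6, 10 ** 6), rng.randint(-10 ** 6, 10 ** 6)
        n = n1 + n2
        # Both printed booleans should be True for every sample.
        print(n1, n2,
              resonance(n1, n2) == 3 * n * n1 * n2,
              max_modulation(tau1, tau2, n1, n2) >= abs(n * n1 * n2))
```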


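Note that, for fixed $n$ and $\tau$, there are at most two values of $k$ such that $\big|(n-k)k + \frac{\tau - n^3}{3n}\big| \ll \langle n\rangle^{-1+\frac{1}{100}}$. It follows from the definition that $w(n, \tau) \lesssim \max\big(1, \frac{\langle\tau - n^3\rangle}{\langle n\rangle}\big)^{0+} \le \langle\tau - n^3\rangle^{0+}$. Now, define the Bourgain space $W^{s,b}_p$ with the weight $w$ via the norm
\[ \|u\|_{W^{s,b}_p} = \|\widehat u\|_{\widehat W^{s,b}_p} := \|w\widehat u\|_{\widehat X^{s,b}_p} + \|w\widehat u\|_{\widehat Y^{s,b-\frac{1}{2}}_p}, \tag{24} \]
where
\[
\begin{cases}
\|f\|_{\widehat X^{s,b}_p} := \|\langle n\rangle^s\langle\tau - n^3\rangle^b f(n,\tau)\|_{b^0_{p,\infty}L^p_\tau} = \sup_j \big\| \|\langle n\rangle^s\langle\tau - n^3\rangle^b f(n,\tau)\|_{L^p_\tau} \big\|_{L^p_{|n|\sim2^j}}, \\
\|f\|_{\widehat Y^{s,b}_p} := \|\langle n\rangle^s\langle\tau - n^3\rangle^b f(n,\tau)\|_{b^0_{p,\infty}L^1_\tau} = \sup_j \big\| \|\langle n\rangle^s\langle\tau - n^3\rangle^b f(n,\tau)\|_{L^1_\tau} \big\|_{L^p_{|n|\sim2^j}}.
\end{cases}
\]
For our application, we set $b = \frac{1}{2}$. Note that $Y^{s,0}_p$ is introduced so that we have $W^{s,\frac{1}{2}}_p(\mathbb{T}\times[-T,T]) \subset C([-T,T]; b^s_{p,\infty}(\mathbb{T}))$. In the following, we take $p > 2$.

4.2. Linear Estimates. Let $S(t) = e^{-t\partial_x^3}$ and $\eta(t)$ be a smooth cutoff such that $\eta(t) = 1$ on $[-\frac{1}{2}, \frac{1}{2}]$ and $= 0$ for $|t| \ge 1$.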

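Lemma 4.1. For any $s \in \mathbb{R}$, we have $\|\eta(t)S(t)u_0\|_{W^{s,\frac{1}{2}}_p} \lesssim \|u_0\|_{b^s_{p,\infty}}$.

Proof. Recall that $w(n, \tau) \lesssim \langle\tau - n^3\rangle^{0+}$. Noting that $(\eta(t)S(t)u_0)^\wedge(n, \tau) = \widehat\eta(\tau - n^3)\widehat{u_0}(n)$, we have
\[
\begin{aligned}
\|\eta(t)S(t)u_0\|_{W^{s,\frac{1}{2}}_p}
&\le \sup_j \big\| \langle n\rangle^s \|\langle\tau - n^3\rangle^{\frac{1}{2}+}\widehat\eta(\tau - n^3)\|_{L^p_\tau} |\widehat{u_0}(n)| \big\|_{L^p_{|n|\sim2^j}} \\
&\quad + \sup_j \big\| \langle n\rangle^s \|\widehat\eta(\tau - n^3)\|_{L^1_\tau} |\widehat{u_0}(n)| \big\|_{L^p_{|n|\sim2^j}}
\le C_\eta\|u_0\|_{b^s_{p,\infty}},
\end{aligned}
\]
where $C_\eta = \|\langle\tau\rangle^{\frac{1}{2}+}\widehat\eta(\tau)\|_{L^p_\tau} + \|\widehat\eta\|_{L^1_\tau} < \infty$.

Now, we estimate the Duhamel term. By the standard computation [2], we have
\[
\begin{aligned}
\int_0^t S(t-t')F(x,t')\,dt'
&= -i\sum_{k=1}^\infty \frac{i^kt^k}{k!} \sum_{n\ne0} e^{i(nx+n^3t)} \int \eta(\lambda - n^3)\widehat F(n,\lambda)\,d\lambda \\
&\quad + i\sum_{n\ne0} e^{inx} \int \frac{(1-\eta)(\lambda - n^3)}{\lambda - n^3} e^{i\lambda t}\widehat F(n,\lambda)\,d\lambda \\
&\quad + i\sum_{n\ne0} e^{i(nx+n^3t)} \int \frac{(1-\eta)(\lambda - n^3)}{\lambda - n^3}\widehat F(n,\lambda)\,d\lambda \\
&=: N_1(F)(x,t) + N_2(F)(x,t) + N_3(F)(x,t).
\end{aligned}
\tag{25}
\]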

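Lemma 4.2. For any $s \in \mathbb{R}$, we have
\[ \|\eta(t)N_1(F)\|_{W^{s,\frac{1}{2}}_p}, \ \|N_2(F)\|_{W^{s,\frac{1}{2}}_p}, \ \|\eta(t)N_3(F)\|_{W^{s,\frac{1}{2}}_p} \lesssim \|F\|_{W^{s,-\frac{1}{2}}_p}. \]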


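Proof. Recall that $w(n, \tau) \lesssim \langle\tau - n^3\rangle^{0+}$. Let $\eta_k(t) = t^k\eta(t)$. First, note that $|\eta_k(t)| \le |\eta(t)|$ since $\eta(t) = 0$ for $|t| \ge 1$. Moreover, by Hausdorff-Young and Hölder inequalities, we have $\|\langle\tau\rangle^{\frac{1}{2}+}\widehat{\eta_k}(\tau)\|_{L^p_\tau} \le \|\eta_k\|_{H^{\frac{1}{2}+}_t} \le \|\eta_k\|_{H^1_t} \lesssim 1 + k$. Then, by Minkowski integral inequality, we have
\[ \|\eta(t)N_1(F)\|_{X^{s,\frac{1}{2}}_p} \le C_\eta \sup_j \Big\| \langle n\rangle^s \int \eta(\lambda - n^3)|\widehat F(n,\lambda)|\,d\lambda \Big\|_{L^p_{|n|\sim2^j}} \lesssim C_\eta\|F\|_{Y^{s,-1}_p}, \]
where $C_\eta = \sup_n \sum_{k=1}^\infty \frac{1}{k!}\|\langle\tau - n^3\rangle^{\frac{1}{2}+}\widehat{\eta_k}(\tau - n^3)\|_{L^p_\tau} \lesssim \sum_{k=1}^\infty \frac{1+k}{k!} < \infty$. Similarly, we have
\[ \|\eta(t)N_1(F)\|_{Y^{s,0}_p} \le C_\eta \sup_j \Big\| \langle n\rangle^s \int \eta(\lambda - n^3)|\widehat F(n,\lambda)|\,d\lambda \Big\|_{L^p_{|n|\sim2^j}} \lesssim C_\eta\|F\|_{Y^{s,-1}_p}, \]
where $C_\eta = \sup_n \sum_{k=1}^\infty \frac{1}{k!}\|\widehat{\eta_k}(\tau - n^3)\|_{L^1_\tau}$. Now, note that
\[ \sup_n \|\widehat{\eta_k}(\tau - n^3)\|_{L^1_\tau} \le \sup_n \|\langle\tau - n^3\rangle^{-\frac{1}{p'}-}\|_{L^{p'}_\tau} \|\langle\tau - n^3\rangle^{\frac{1}{p'}+}\widehat{\eta_k}(\tau - n^3)\|_{L^p_\tau} \lesssim \|\langle\tau\rangle^{\frac{1}{2}+}\widehat{\eta_k}(\tau)\|_{L^p_\tau} \lesssim 1 + k, \]
since $\frac{1}{p'}+ = \frac{1}{2}+$. Hence, we have $C_\eta < \infty$ as before.

For $|\tau - n^3| \gtrsim 1$, we have $|\tau - n^3| \sim \langle\tau - n^3\rangle$. Thus, we have $|\widehat{N_2(F)}(n, \tau)| \lesssim \langle\tau - n^3\rangle^{-1}|\widehat F(n, \tau)|$. Then, by monotonicity (i.e. $\|f\|_{\widehat W^{s,\frac{1}{2}}_p} \le \|g\|_{\widehat W^{s,\frac{1}{2}}_p}$ for $|f| \le |g|$), we have $\|N_2(F)\|_{W^{s,\frac{1}{2}}_p} \lesssim \|F\|_{W^{s,-\frac{1}{2}}_p}$.

Lastly, by Minkowski integral inequality with $w(n, \tau) \lesssim \langle\tau - n^3\rangle^{0+}$, we have
\[ \|\eta(t)N_3(F)\|_{X^{s,\frac{1}{2}}_p} \lesssim \sup_j \Big\| \Big\| \langle n\rangle^s\langle\tau - n^3\rangle^{\frac{1}{2}+}\widehat\eta(\tau - n^3) \int \frac{(1-\eta)(\lambda - n^3)}{\lambda - n^3}|\widehat F(n,\lambda)|\,d\lambda \Big\|_{L^p_\tau} \Big\|_{L^p_{|n|\sim2^j}} \le C_\eta\|F\|_{Y^{s,-1}_p}, \]
where $C_\eta = \sup_n \|\langle\tau - n^3\rangle^{\frac{1}{2}+}\widehat\eta(\tau - n^3)\|_{L^p_\tau} = \|\langle\tau\rangle^{\frac{1}{2}+}\widehat\eta(\tau)\|_{L^p_\tau} < \infty$. Similarly, we have $\|\eta(t)N_3(F)\|_{Y^{s,0}_p} \lesssim C_\eta\|F\|_{Y^{s,-1}_p}$, where $C_\eta = \sup_n \|\widehat\eta(\tau - n^3)\|_{L^1_\tau} = \|\widehat\eta\|_{L^1_\tau} < \infty$.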


4.3. Bilinear estimate. By expressing (1) in the integral formulation, we see that $u$ is a solution to (1) for $|t| \le T \ll 1$ if and only if $u$ satisfies
\[ u(t) = \Gamma^t_{u_0}(u) := \eta(t)S(t)u_0 + \eta(t)N_1(\eta_{2T}F(u))(t) + N_2(\eta_{2T}F(u))(t) + \eta(t)N_3(\eta_{2T}F(u))(t), \]
where $F(u) = -uu_x$ and $\eta_{2T}(t) = \eta(t/2T)$, i.e. $\eta_{2T}(t) \equiv 1$ for $|t| \le T$. In this subsection, we prove the crucial bilinear estimate so that $\Gamma^t_{u_0}(\cdot)$ defined above is a contraction on a ball in $W^{s,\frac{1}{2}}_p(\mathbb{T}\times[-T,T]) \subset C([-T,T]; b^s_{p,\infty}(\mathbb{T}))$ for $T$ sufficiently small.

Proposition 4.3. Assume that $u$ and $v$ have the spatial means 0 for all $t \in \mathbb{R}$. Then, there exist $s = -\frac{1}{2}+$, $p = 2+$ with $sp < -1$, and $\theta > 0$ such that
\[ \|\eta_{2T}\partial_x(uv)\|_{W^{s,-\frac{1}{2}}_p} \lesssim T^\theta \|u\|_{W^{s,\frac{1}{2}}_p} \|v\|_{W^{s,\frac{1}{2}}_p}. \tag{26} \]

Before proving Proposition 4.3, we present some lemmata.

Lemma 4.4 (Ginibre-Tsutsumi-Velo [11], Lemma 4.2). Let $0 \le \alpha \le \beta$ and $\alpha + \beta > \frac{1}{2}$. Then, we have
\[ \int \langle\tau\rangle^{-2\alpha}\langle\tau - a\rangle^{-2\beta}\,d\tau \lesssim \langle a\rangle^{-\gamma}, \]
where $\gamma = 2\alpha - [1 - 2\beta]_+$ with $[x]_+ = x$ if $x > 0$, $= \varepsilon > 0$ if $x = 0$, and $= 0$ if $x < 0$.

Lemma 4.5. For $l_1 + 2l_2 > 1$ with $l_1, l_2 \ge 0$, there exists $c > 0$ such that for all $n \ne 0$ and $\lambda \in \mathbb{R}$, we have
\[ \sum_{n_1\ne0,n} \frac{1}{\langle n_1\rangle^{l_1}} \frac{1}{\langle\lambda + n_1(n - n_1)\rangle^{l_2}} < c. \tag{27} \]

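Proof. When $l_1 > 1$, (27) is clear. When $l_2 > \frac{1}{2}$, (27) follows from Lemma 5.3 in [16]. Thus, we assume $l_1 \in (0, 1]$ and $l_2 \in (0, \frac{1}{2}]$ in the following. Since $l_1 + 2l_2 > 1$, there exists $\varepsilon > 0$ such that $l_1 + 2l_2 - 3\varepsilon \ge 1$. If $P_{n,\lambda}(n_1) := \lambda + n_1(n - n_1)$ has two real roots, i.e. $P_{n,\lambda}(n_1) = -(n_1 - r_1)(n_1 - r_2)$, then there are at most 6 values of $n_1$ such that $|n_1 - r_j| \le 1$. For the remaining values of $n_1$, we have $\langle P_{n,\lambda}(n_1)\rangle > \frac{1}{4}\prod_{j=1}^2 \langle n_1 - r_j\rangle$. Then, by Hölder inequality with $p = (l_1 - \varepsilon)^{-1}$ and $q = (l_2 - \varepsilon)^{-1}$, we have
\[ \text{LHS of (27)} \lesssim \Big(\sum_{n_1} \langle n_1\rangle^{-pl_1}\Big)^{\frac{1}{p}} \prod_{j=1}^2 \Big(\sum_{n_1} \langle n_1 - r_j\rangle^{-ql_2}\Big)^{\frac{1}{q}} < c < \infty, \]
since $pl_1 > 1$ and $ql_2 > 1$. If $P_{n,\lambda}(n_1)$ has only one or no real root, then we have $|P_{n,\lambda}(n_1)| \ge (n_1 - \frac{1}{2}n)^2$ for all $n_1 \in \mathbb{Z}$. Then, by Hölder inequality with $p = (l_1 - \varepsilon)^{-1}$ and $q = (2l_2 - 2\varepsilon)^{-1}$, we have
\[ \text{LHS of (27)} \le \Big(\sum_{n_1} \langle n_1\rangle^{-pl_1}\Big)^{\frac{1}{p}} \Big(\sum_{n_1} \langle(n_1 - \tfrac{1}{2}n)^2\rangle^{-ql_2}\Big)^{\frac{1}{q}} < c < \infty, \]
since $pl_1 > 1$ and $2ql_2 = \frac{l_2}{l_2 - \varepsilon} > 1$.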
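The uniform bound (27) can be probed numerically by truncating the sum. The following is a rough sketch under assumptions, not a verification: the bracket is taken as $\langle x\rangle = (1 + x^2)^{1/2}$ (the paper fixes its own convention outside this section, and any comparable choice changes the values only by constants), the truncation level `N` is arbitrary, and the function names are hypothetical.

```python
def bracket(x):
    """Bracket <x> = (1 + x^2)^(1/2); the convention is an assumption."""
    return (1.0 + x * x) ** 0.5


def truncated_sum(n, lam, l1, l2, N):
    """Sum of <n1>^(-l1) * <lam + n1*(n - n1)>^(-l2) over 0 < |n1| <= N, n1 != n."""
    total = 0.0
    for n1 in range(-N, N + 1):
        if n1 == 0 or n1 == n:
            continue  # the sum in (27) excludes n1 = 0 and n1 = n
        total += bracket(n1) ** (-l1) * bracket(lam + n1 * (n - n1)) ** (-l2)
    return total


if __name__ == "__main__":
    l1, l2 = 0.6, 0.3  # satisfies l1 + 2*l2 > 1
    for n, lam in [(1, 0.0), (10, -24.5), (100, 2500.0), (1000, -1.0e5)]:
        print(n, lam, truncated_sum(n, lam, l1, l2, 20000))
```

The printed values should stay of comparable size as $n$ and $\lambda$ vary, which is the behavior the lemma asserts for the full sum.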

Lastly, recall the following lemma from [9, (7.50) and Lemma 7.4].


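Lemma 4.6. Let
\[ \Omega(n) = \{\eta \in \mathbb{R} : \eta = -3nn_1n_2 + o(\langle nn_1n_2\rangle^{\frac{1}{100}}) \text{ for some } n_1 \in \mathbb{Z} \text{ with } n = n_1 + n_2\}. \]
Then, we have
\[ \int \langle\tau - n^3\rangle^{-\frac{3}{4}}\chi_{\Omega(n)}(\tau - n^3)\,d\tau \lesssim 1. \tag{28} \]

Note that (28) is stated with $\langle\tau - n^3\rangle^{-1}$ in [9]. However, by examining the proof of Lemma 7.4 in [9], one immediately sees that (28) is valid with $\langle\tau - n^3\rangle^{-\alpha}$ for any $\alpha > \frac{2}{3} + \frac{1}{100}$.

Proof of Proposition 4.3. In the proof, we use $(n, \tau)$, $(n_1, \tau_1)$, and $(n_2, \tau_2)$ to denote the Fourier variables for $uv$, $u$, and $v$, respectively, i.e. we have $n = n_1 + n_2$ and $\tau = \tau_1 + \tau_2$. Moreover, by the mean 0 assumption on $u$ and $v$ and by the fact that we have $\partial_x(uv)$ on the left-hand side of (26), we assume $n, n_1, n_2 \ne 0$ in the following. First, we prove
\[ \|\partial_x(uv)\|_{W^{s,-\frac{1}{2}}_p} \lesssim \|u\|_{W^{s,\frac{1}{2}}_p} \|v\|_{W^{s,\frac{1}{2}}_p}, \tag{29} \]
i.e. we first prove (26) with no gain of $T^\theta$. Then, it suffices to show
\[ \|B(f, g)(n, \tau)\|_{\widehat W^{0,-\frac{1}{2}}_p} \lesssim \|f\|_{b^0_{p,\infty}L^p_\tau} \|g\|_{b^0_{p,\infty}L^p_\tau}, \tag{30} \]
where $B(\cdot,\cdot)$ is defined by
\[ B(f, g)(n, \tau) = \frac{1}{2\pi} \sum_{n_1+n_2=n} \frac{|n|\langle n\rangle^s}{\langle n_1\rangle^s\langle n_2\rangle^s} \int_{\tau_1+\tau_2=\tau} \frac{f(n_1,\tau_1)g(n_2,\tau_2)\,d\tau_1}{\prod_{j=1}^2 w(n_j,\tau_j)\langle\tau_j - n_j^3\rangle^{\frac{1}{2}}}. \]
Let $\mathrm{MAX} := \max(\langle\tau - n^3\rangle, \langle\tau_1 - n_1^3\rangle, \langle\tau_2 - n_2^3\rangle)$. Then, by (20), we have $\mathrm{MAX} \gtrsim \langle nn_1n_2\rangle$.

• Part 1. First, we consider the $\widehat X^{0,-\frac{1}{2}}_p$ part of the $\widehat W^{0,-\frac{1}{2}}_p$ norm on the left-hand side of (30).

• Case (1): $\mathrm{MAX} = \langle\tau - n^3\rangle$. Without loss of generality, assume $|n_1| \ge |n_2|$. For fixed $n \ne 0$ and $\tau$, let $\lambda = \frac{\tau - n^3}{3n}$ and define
\[ B_{n,\tau} = \{n_1 \in \mathbb{Z} : |n_1 - r_j| \ge 1, \ j = 1, 2, \ r_j \text{ is a real root of } P_{n,\lambda}(n_1) := \lambda + n_1(n - n_1) \text{ or } r_j = \tfrac{1}{2}n \text{ if no real root}\}. \]
On $B_{n,\tau}$, we have
\[ \langle\tau - n^3 + 3nn_1n_2\rangle \gtrsim \langle n\rangle\langle\lambda + n_1(n - n_1)\rangle. \tag{31} \]

◦ Subcase (1.a). On $B^c_{n,\tau}$. For $s > -\frac{1}{2}$, we have
\[ \frac{|n|\langle n\rangle^s}{\langle n_1\rangle^s\langle n_2\rangle^s} \frac{1}{\mathrm{MAX}^{\frac{1}{2}}} \lesssim \frac{1}{\langle n_2\rangle^{\frac{1}{2}+s}}. \tag{32} \]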


By Lemma 4.4, we have 1

1

τ1 − n 31 − 2 τ2 − n 32 − 2

p L τ1

−1+ p1

τ − n 3 + 3nn 1 n 2

.

c , i.e. the summation Note that for fixed n and τ there are at most four values of n 1 ∈ Bn,τ p over n 1 can be replaced by the L n 1 norm. Then, by Hölder inequality, we have w(n, τ ) f (n 1 , τ1 )g(n 2 , τ2 )dτ1 LHS of (30) sup 1 p +s 3 21 τ − n 3 21 L p 2 j Lτ n τ − n n=n 1 +n 2 2 1 2 1 2 |n|∼2 j τ =τ1 +τ2 1 sup n 2 − 2 −s+δ f (n 1 , ·) L τp g(n 2 , ·) L τp p p . L

j

|n|∼2 j

L n1

Note that w(n, τ ) n 2 δ since |n 1 | ≥ |n 2 |. If |n 1 | |n 2 | and |n| ∼ 2 j , then we have |n 1 | ∼ 2k , where |k − j| ≤ 5: ⎛ ∞ 1 LHS of (30) sup ⎝ n 2 (− 2 −s+δ) p j

|k− j|≤5 |n 1 |∼2k l=0 |n 2 |∼2l p

p

× f (n 1 , ·) L p g(n 2 , ·) L p

∞

τ

1

p

τ

1

2(− 2 −s+δ) p l sup f L p

|n|∼2k

k

l=0

f b0

p p,∞ L τ

gb0

p p,∞ L τ

p

Lτ

sup g L p l

|n|∼2l

p

Lτ

,

by taking δ > 0 sufficiently small such that − 21 − s + δ < 0. Similarly, if |n 1 | ∼ |n 2 | and |n 2 | ∼ 2l , then we have |n 1 | ∼ 2k where |k − l| ≤ 5. ⎛ ∞ 1 n 2 (− 2 −s+δ) p LHS of (30) ⎝ l=0 |k−l|≤5 |n 1 |∼2k |n 2 |∼2l

1

p

×

∞

p p f (n 1 , ·) L p g(n 2 , ·) L p τ τ 1

2(− 2 −s+δ) p l sup f L p

|n|∼2k

k

l=0

f b0

p p,∞ L τ

gb0

p p,∞ L τ

p

Lτ

sup g L p l

|n|∼2l

p

Lτ

.

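◦ Subcase (1.b). On $B_{n,\tau}$. In this case, we have (31). Also, recall that $w(n, \tau) \lesssim \langle\tau - n^3\rangle^{0+}$. Moreover, $\langle\tau - n^3\rangle^{0+} \lesssim \max(\langle n\rangle, \langle n_2\rangle, \langle\tau - n^3 + 3nn_1n_2\rangle)^{0+}$ since either $\langle\tau - n^3\rangle \gg |nn_1n_2|$ or $\langle\tau - n^3\rangle \lesssim |nn_1n_2| \lesssim \max(\langle n\rangle^3, \langle n_2\rangle^3)$. In particular, by (31), we have
\[ w(n, \tau) \lesssim (\langle n_2\rangle\langle\tau - n^3 + 3nn_1n_2\rangle)^{0+}. \tag{33} \]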



By applying Hölder inequality and proceeding as before, we have LHS of (30) M sup n 2 0− f (n 1 , ·) L τp g(n 2 , ·) L τp L p M f b0

p p,∞ L τ

where

M = sup n,τ

p

|n|∼2 j

j

gb0

p p,∞ L τ

L n1

,

w(n, τ ) 1− p1

1

n 2 2 +s− τ − n 3 + 3nn 1 n 2

p

.

L n1

Thus, it remains to show that M < ∞. By (33), (31), and Lemma 4.5, we have n,τ

1

M p sup

n

p −1−

n2

1 ( 21 +s−) p

n 2

λ + n 1 (n − n 1 ) p −1−

< ∞,

since ( 21 + s−) p + 2( p − 1)− > 1 for p = 2+ < 4 and sp = −1−. Now, assume MAX = τ2 − n 32 . By symmetry, this takes care of the case when MAX = τ1 − n 31 . Note that we have w(n, τ ) τ − n 3 0+ by a crude estimate. Thus, by duality, it suffices to show ∞ |n|ns 1 f (n 1 , τ1 )h(n, τ )dτ 1 1 1 p s n s p − L 3 3 n 3 1 2 2 2 2 L τ2 w(n 2 , τ2 )τ2 − n 2 τ1 − n 1 τ − n n l=0 |n |∼2l 2

sup f L p

∞ p

|n 1 |∼2k

k

L τ1

h

L

j=0

For fixed n 2 = 0 and τ2 , let λ =

p |n|∼2 j

p

Lτ

τ2 −n 32 3n 2

.

(34)

and define

Bn 2 ,τ2 = {n ∈ Z : |n − r j | ≥ 1, j = 1, 2 r j is a real root of Pn 2 ,λ (n) := λ + n(n 2 − n) 1 or r j = n 2 if no real root}. 2 On Bn 2 ,τ2 , we have τ2 − n 32 − 3nn 1 n 2 n 2 λ + n(n 2 − n).

(35)

Also, note that w(n 2 , τ2 ) min(nδ , n 1 δ ) on Bnc2 ,τ2 . 3 • Case (2): MAX = τ2 − n 2 and |n 1 | |n 2 |. In this case, we have |n|ns 1 1 . 1 1 s s n 1 n 2 MAX 2 n 2 2 +s

(36)

1 ◦ Subcase (2.a). On Bnc2 ,τ2 . First, suppose τ2 − n 32 − 3nn 1 n 2 n 2 100 . Thus, by Lemma 4.4, we have 1

1

1

−1 1

τ1 − n 31 − 2 +α τ − n 3 − 2 + L τp τ2 − n 32 − 3nn 1 n 2 − 2 +α+ n 2 100 ( 2 −α)+ (37)


for α > 0. Then, by Hölder inequality in τ followed by Young and Hölder inequalities, we have f (n , τ ) −1 1 f (n 1 , τ1 )h(n, τ )dτ 1 1 100 ( 2 −α)+ n h(n, τ ) p 2 1 1 p 3 α − 3 L τ2 ,τ τ1 − n 1 τ1 − n 1 2 τ − n 3 2 L τ2 −1 1

≤ n 2 100 ( 2 −α)+ τ1 − n 31 −α

p−2 p

for fixed n and n 1 . By choosing α >

p p−2

L τ1

f (n 1 , ·) L τp h(n, ·) 1

p

Lτ

= 0+, we have τ1 −n 31 −α

< C < ∞,

p p−2

L τ1

independently of n 1 . Note that if |n| ∼ 2 j and |n 2 | ∼ 2l , then we have |n 1 | ∼ 2k , where |k − j| ≤ 5 or |k − l| ≤ 5, since n = n 1 + n 2 and |n 1 | ≥ |n 2 |. As in Subcase (1.a), for fixed n 2 and τ2 there are at most four values of n ∈ Bnc2 ,τ2 , i.e. the summation over n can be replaced p

by the L n norm. By Hölder inequality in n 2 after switching the order of summations, LHS of (34)

∞ 1 1 ( 2 −α)+ n 2 − 12 −s− 100 f (n 1 , ·)

L τ1 h(n, ·) L τp L p p

l=0

⎛ ∞ ∞ 1 1 1 p (2l )0− sup ⎝ n 2 − 2 −s− 100 ( 2 −α)+ l

l=0

× f (n

p L τ1 |n 2 |∼2l

p p,∞ L τ

h

1 1 ( 2 −α)+ = n 2 − 12 −s− 100 where M

2− 1 1− 100 +

∼

200− 99

j=0 |n|∼2 j

p − n 2 , ·) L p

f 0 M b

for p 0 and J4 is any real number, u is a strictly increasing function of tanh(β J ) and has range (0, 1), as one can check by using the definition of s, see (A.9). On the other hand, det √ T (k) = 0 only if k = 0 and µ(k) = 0; hence, g(x) has a singularity at u = u c = 2 − 1, which is an allowed value; moreover, if β|J4 | 1 (as we shall suppose in the following), u = tanh(β J ) + O(β J4 ). Since we expect that the interaction will move this singularity, it is convenient to modify the interaction by adding a − + ψ finite counterterm iν1 L12 ω,k ωψ k,ω k,−ω , which is compensated by replacing, in the matrix T (k), µ(k) with √ u∗ , u ∗ = 2 − 1 − ν1 . µ1 (k) = (cos k0 + cos k1 − 2) + 2 1 − (2.20) u Let us call T1 (k) the new matrix and P1 (dψ) the corresponding measure; we get 1 (1) ¯ ¯ ¯ (2.21) P1 (dψ)Pχ (dχ ) eQ(ψ,χ )+V (ψ,χ )+B( A) , Z ( A) = N1 where V (1) (ψ, χ ) = iν1

1 + − k,ω ψk,−ω + V(ψ, χ ), ωψ L2

(2.22)

ω,k

and ν1 has to be determined so that the interacting propagator has an infrared singularity at u = u ∗ ; the critical temperature is uniquely determined by the value of u ∗ . Let us now remark that det T χ (k) is strictly positive for any k, as one can easily see by using the fact that u ∈ (0, 1). Hence, if we define + = ψ + QTχ−1 , ψ − = Tχ−1 Qψ − , ψ

(2.23)

+ , χ − → χ − + ψ − , allows us to rewrite (2.21) in the change of variables χ + → χ + + ψ the form 1 (1) ¯ ¯ ¯ (2.24) Z ( A) = PZ 1 ,µ1 (dψ)Pχ (dχ ) eV (ψ,χ −ψ )+ B( A) , N ¯ is the functional obtained from B( A) ¯ by replacing χ with χ − ψ and where B( A) PZ 1 ,µ1 (dψ) is the Gaussian measure with propagator g(x) =

1 −ikx (1) −1 e (T ) (k), L2 k∈D

(2.25)

Extended Scaling Relations


where T (1) (k) = T (k) − Q(k)Tχ−1 Q(k). It is also convenient to perform the trivial change of variables + k,ω + , ψ − → ψ − , k = (k0 , k1 ), ψ → −iωψ k = (k1 , k0 ). k,ω k,ω k,ω

(2.26)

Hence, by an explicit calculation of Q(k) and using the identity u ∗ /u = 1 − µ1 (0)/2, one can see that T (1) (k) is the matrix −µ1 − µ+,− (k) −i sin k0 + sin k1 + µ+,+ (k) (2.27) Z 1 C1 (k) −µ1 − µ+,− (k) −i sin k0 − sin k1 + µ−,− (k) with C1 (k) = 1, µ1 = 2µ1 (0)/(2 − µ1 (0)) and Z 1 = u ∗ ; moreover µ+,+ (k) = −µ−,− (k)∗ is an odd function of k of the form µ+,+ (k) = 2µ1 (0)(−i sin k0 +sin k1 )/(4− 2µ1 (0)) + O(|k|3 ), while µ+,− (k) is a real even function, of order |k|2 , which vanishes only at k = 0. Finally, det T (1) (k) ≥ C(2 − cos k0 − cos k1 ), so that PZ 1 ,µ1 (dψ) has the same type of infrared singularity as P1 (dψ). The fact that det Tχ (k) is strictly positive implies that gχ (x) is an exponential decaying function; hence, we can safely perform the integration over the field χ in (2.24). The result can be written in the following form (see Lemma 1 of [22]) 2 (1) ¯ (1) (1) ¯ ¯ S ( A) ¯ ¯ Z ( A) ≡ e = PZ 1 ,µ1 (dψ)e L N +V (Z 1 ψ)+B ( A) , (2.28) where N (1) is a constant and the effective potential V¯ (1) (ψ) can be represented as 2n V¯ (1) = Wω,α,ε,2n (x1 , . . . , x2n )∂ α1 ψxε11,ω1 . . . ∂ α2n ψxε2n ,ω2n . (2.29) n≥1 α,ω,ε x1 ,...,xn

The kernels Wω,α,ε,2n in the previous expansions are analytic functions of λ and ν1 near the origin; if we suppose that ν1 = O(λ), their Fourier transforms satisfy, for any n ≥ 1, the bounds, see [22], α,ω,ε,2n (k1 , . . . , k2n−1 )| ≤ L 2 C n |λ|n . |W

(2.30)

¯ A similar representation can be written for the functional of the external field B (1) ( A). As explained in detail in [22], the symmetries of the two models we are considering imply that, in the r.h.s. of (2.29), there are no local terms quadratic in the field, which are relevant or marginal, except those which are already present in the free measure.

2.3. Multiscale analysis. We briefly recall here the analysis in [22] (see also [7,8]). The integration in (2.28) can be done by iteratively integrating the fields with decreasing momentum scale and by moving to the free measure all the marginal terms quadratic in the field. We introduce a scaling parameter γ = 2, a decomposition of the unity 1 = f 1 + 0h=−∞ f h (k), with f h (k) a smooth function with support {γ h−1 π/4 ≤ |k| ≤ γ h+1 π/4}, and the corresponding decomposition of the field ψ = 1j=−∞ ψ ( j) . If the fields ψ (1) , . . . , ψ (h+1) are integrated, we get (h) ¯ (h) √ (≤h) (h) √ (≤h) ¯ ¯ (2.31) eS ( A) = e S ( A) PZ¯ h ,µh (dψ (≤h) )eV ( Z h ψ )+B ( Z h ψ , A) ,

578

G. Benfatto, P. Falco, V. Mastropietro

h ( j) and P where ψ (≤h) = j=−∞ ψ Z¯ h ,µh (dψ) is the Gaussian fermionic measure with the propagator obtained from (2.25) by replacing in (2.27) C1 (k) with C h (k) = h f h (k)]−1 , µ1 with µh , Z 1 with the function Z¯ h (k) (to be defined below) and [ k=−∞ the functions µσ,σ (k) with similar functions µ(h) σ,σ (k); finally, the constant Z h , which ¯ rescale the field, is given by Z h = Z h (0). The effective interaction V (h) (ψ) is a sum over monomials in the Grassmann variables and we define a localization operator (see e.g. §4 of [22]) as (h)

LV (h) (ψ) = (−m h + γ h n h )Fν(h) + lh Fλ − z h Fz(h) ,

(2.32)

where m h , n h , z h and lh are suitable real numbers, 1 (≤h)+ (≤h)− ψ (2.33) k,ω ψk,−ω , L2 ω k 1 (≤h)− , (≤h)+ ψ = 2 (−i sin k0 + ω sin k1 )ψ (2.34) k,ω k,−ω L ω k 1 (≤h)+ ψ (≤h)+ ψ (≤h)− ψ (≤h)− δ(k1 − k2 + k3 − k4 ). ψ = 8 k1 ,+ k3 ,− k2 ,+ k4 ,− L

Fν(h) = Fz(h) Fλ(≤h)

Moreover we define

k1 ,...,k4

(ε) ¯ = LB(h) ( Z h ψ (≤h) , A) Z h A¯ εx Ox(≤h)ε ,

(2.35)

ε,x

where (≤h)+ (≤h)− (≤h)+ (≤h)− Ox(≤h)+ = ψx,+ ψx,− + ψx,− ψx,+ , (≤h)+

Ox(≤h)− = i[ψx,+

(≤h)+

ψx,−

(≤h)−

+ ψx,+

(≤h)−

ψx,−

].

(2.36)

We now move to the fermionic measure the terms proportional to m h and z h in (2.32) and we rescale the fields so that (h) ¯ 2 ¯ eS ( A) = e S ( A)+L th PZ¯ h−1 ,µh−1 (dψ (≤h) ) ¯ (h) (√ Z h−1 ψ (≤h) )+B¯ (h) (√ Z h−1 ψ (≤h) , A) ¯

eV

,

(2.37)

with th a normalization constant and Z¯ h−1 (k) = Z h (1 + z h C h−1 (k)), µh−1 =

Zh [µh (k) + m h C h−1 (k)]. (2.38) ¯ Z h−1 (k)

The renormalized potential V¯ (h) (ψ) can be written as V¯ (h) (ψ) = γ h νh Fν(h) + λh Fλ(h) + R (h) (ψ),

(2.39)

with νh = n h (Z h /Z h−1 ) and λh = lh (Z h /Z h−1 )2 ; R (h) (ψ) is a√sum over monomi¯ = als similar to (2.29), with 2n + α1 + .. + α2n > 4. Finally, B¯ (h) ( Z h−1 ψ (≤h) , A) √ ¯ The field ψ (h) is integrated and the procedure can be iterated. The B (h) ( Z h ψ (≤h) , A). above integration procedure is done till the scale h ∗ defined as the maximal j such that


$\gamma^j \le |\mu_j|$, and the integration of the fields $\psi^{(\le h^*)}$ can be done in a single step. Roughly speaking, $h^*$ defines the momentum scale of the mass. Notice that the propagator of the field $\psi^{(\le h)}$ can be written, for $h \le 0$, as $g^{(\le h)}(x, y) = g_T^{(\le h)}(x, y) + r^{(\le h)}(x, y)$,

(2.40)

where 1 −ik(x−y) 1 −1 e T (k), L2 Zh h k∈D −ik0 + k1 −µh , Th (k) = C h (k) µh −ik0 − k1 (≤h)

gT

(x, y) =

(2.41) (2.42)

and, for any positive integer M, |r (≤h) (x, y)| ≤ C M

γ 2h . 1 + (γ h |x − y| M )

(2.43)

The propagator $g_T^{(h)}(x, y)$ verifies a similar bound with $\gamma^h$ replacing $\gamma^{2h}$. A similar decomposition can be done for $g^{(h)}(x, y)$. The definition of the localization operator $\mathcal{L}$ selects in $V^{(h)}$, $B^{(h)}$ the terms with positive or vanishing scaling dimension, which is given, for the monomials with $n$ ψ-fields and $m$ A-fields, by
\[ D = 2 - \frac{n}{2} - m. \tag{2.44} \]
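As a quick illustration of (2.44), the sketch below tabulates the scaling dimension for a few field contents. The function names are hypothetical, and the label "irrelevant" for $D < 0$ is the usual RG convention rather than a term used in this passage.

```python
def scaling_dimension(n_psi, m_a):
    """D = 2 - n/2 - m for a monomial with n psi-fields and m A-fields, as in (2.44)."""
    return 2 - n_psi / 2 - m_a


def classify(n_psi, m_a):
    """Label a monomial by the sign of its scaling dimension."""
    d = scaling_dimension(n_psi, m_a)
    if d > 0:
        return "relevant"
    if d == 0:
        return "marginal"
    return "irrelevant"  # standard RG terminology for D < 0 (an added convention)


if __name__ == "__main__":
    for n_psi, m_a in [(2, 0), (4, 0), (2, 1), (6, 0), (4, 1)]:
        print(n_psi, m_a, scaling_dimension(n_psi, m_a), classify(n_psi, m_a))
```

With this counting, only monomials quadratic or quartic in ψ with no external field, or quadratic in ψ with one external field, have $D \ge 0$, which matches the field content of the local terms kept in (2.32) and (2.35).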

In the RG language, the terms with positive or vanishing dimension are called relevant or marginal terms, respectively. Notice that a priori many other possible local marginal or relevant terms could be generated in the RG integration, with respect to the one listed in (2.32); however these terms are absent, thanks to the symmetries of the problem, as proved in [22], App. F (see also [14], §A2.2). The outcome of the above procedure is that the kernels in $\bar V^{(j)}$ and $\bar B^{(j)}$ are analytic functions of the running coupling constants $\lambda_k$, $\nu_k$, $k \ge j$, provided that $\sup_{k\ge j}(|\lambda_k| + |\nu_k|)$ is small enough, see [22] and §3 of [9]. The running couplings $\lambda_j$ (which, by construction, are the same in the massless $\mu = 0$ or in the massive $\mu \ne 0$ case, see [13]), satisfy a recursive equation of the form
\[ \lambda_{j-1} = \lambda_j + \beta_\lambda^{(j)}(\lambda_j, \dots, \lambda_0) + \bar\beta_\lambda^{(j)}(\lambda_j, \nu_j; \dots; \lambda_0, \nu_0), \tag{2.45} \]
where $\beta_\lambda^{(j)}$, $\bar\beta_\lambda^{(j)}$ are µ-independent and expressed by a convergent expansion in $\lambda_j, \nu_j, \dots, \lambda_0, \nu_0$; moreover $\bar\beta_\lambda^{(j)}$ vanishes if at least one of the $\nu_k$ is zero. The running coupling $\lambda_j$ stays close to $\lambda$ for any $j$ as a consequence of the following property, called vanishing of the Beta function, which was proved in Theorem 2 of [11]; for suitable positive constants $C$ and $\vartheta < 1$:
\[ |\beta_\lambda^{(j)}(\lambda_j, \dots, \lambda_j)| \le C|\lambda_j|^2\gamma^{\vartheta j}. \tag{2.46} \]
Indeed, it is possible to prove that, for a suitable choice of $\nu_1 = O(\lambda)$, $\nu_j = O(\gamma^{\vartheta j}\bar\lambda_j)$, if $\bar\lambda_j = \sup_{k\ge j}|\lambda_k|$, and this implies, by the short memory property (see for instance A4.6 of [13]), $\bar\beta_\lambda^{(j)} = O(\gamma^{\vartheta j}\bar\lambda_j^2)$ so that the sequence $\lambda_j$ converges, as $j \to -\infty$, to a smooth function $\lambda_{-\infty}(\lambda) = \lambda + O(\lambda^2)$, such that
\[ |\lambda_j - \lambda_{-\infty}| \le C\lambda^2\gamma^{\vartheta j}. \tag{2.47} \]
Moreover
\[ \frac{Z_{j-1}}{Z_j} = 1 + \beta_z^{(j)}(\lambda_j, \dots, \lambda_0) + \bar\beta_z^{(j)}(\lambda_j, \nu_j; \dots, \lambda_0, \nu_0), \tag{2.48} \]
with $\bar\beta_z^{(j)}$ vanishing if at least one of the $\nu_k$ is zero so that, by using the bound $\nu_j = O(\gamma^{\vartheta j}\bar\lambda_j)$ and the short memory property, $\bar\beta_z^{(j)} = O(\lambda_j\gamma^{\vartheta j})$. Finally
\[ \beta_z^{(j)}(\lambda_j, \dots, \lambda_0) = \beta_z(\lambda_{-\infty}) + O(\lambda\gamma^{\vartheta j}), \tag{2.49} \]
where the last identity follows from (2.47) and the short memory property. An important point is that the function $\beta_z(\lambda_{-\infty})$ is model independent. Similar equations hold for $Z_h^{(\pm)}$, $\mu_h$, with leading terms again model independent, that is
\[ \beta_\pm^{(j)}(\lambda_j, \dots, \lambda_0) = \beta_\pm(\lambda_{-\infty}) + O(\lambda\gamma^{\vartheta j}). \tag{2.50} \]
By an explicit computation and (2.49), (2.50), there exist $\eta_+(\lambda_{-\infty}) = c_1\lambda_{-\infty} + O(\lambda_{-\infty}^2)$, $\eta_-(\lambda_{-\infty}) = -c_1\lambda_{-\infty} + O(\lambda_{-\infty}^2)$, $\eta_\mu(\lambda_{-\infty}) = c_1\lambda_{-\infty} + O(\lambda_{-\infty}^2)$ and $\eta_z(\lambda_{-\infty}) = c_2\lambda_{-\infty}^2 + O(\lambda_{-\infty}^3)$, with $c_1$ and $c_2$ strictly positive, such that, for any $j \le 0$,
\[
\begin{aligned}
|\log_\gamma(Z_{j-1}/Z_j) - \eta_z(\lambda_{-\infty})| &\le C\lambda^2\gamma^{\vartheta j}, \\
|\log_\gamma(\mu_{j-1}/\mu_j) - \eta_\mu(\lambda_{-\infty})| &\le C|\lambda|\gamma^{\vartheta j}, \\
|\log_\gamma(Z^{(\pm)}_{j-1}/Z^{(\pm)}_j) - \eta_\pm(\lambda_{-\infty})| &\le C\lambda^2\gamma^{\vartheta j}.
\end{aligned}
\tag{2.51}
\]
The critical indices are functions of $\lambda_{-\infty}$ only, as it is clear from (2.49); moreover from (6.28) and (5.4) of [22], the indices $x_\pm$ appearing in (1.8) are such that
\[ x_\pm = 1 - \eta_\pm + \eta_z, \qquad \eta_\mu = \eta_+ - \eta_z = 1 - x_+. \tag{2.52} \]

When the limit $\mu \to 0$ is taken (after the limit $L \to \infty$, so that all the $Z_{\gamma,\gamma'}$ have the same limit), the multiscale integration procedure implies the power law decay of the correlations given by (1.8). If $\mu \ne 0$ (that is, if the temperature is not the critical one), the correlations decay faster than any power with rate proportional to $\mu_{h^*}$, where, if $[x]$ denotes the largest integer $\le x$, $h^*$ is given by
\[ h^* = \left[\frac{\log_\gamma|\mu|}{1 + \eta_\mu}\right], \tag{2.53} \]
which implies, together with (2.52), the identity (1.11) of Theorem 1.1.
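Formula (2.53) is simple to evaluate. The snippet below is a minimal sketch, assuming the scaling parameter $\gamma = 2$ introduced in §2.3 as the default; the function name `mass_scale` and the sample values of $\mu$ and $\eta_\mu$ are illustrative only.

```python
import math


def mass_scale(mu, eta_mu, gamma=2.0):
    """h* = floor(log_gamma|mu| / (1 + eta_mu)), following (2.53)."""
    if mu == 0:
        raise ValueError("mu must be nonzero: (2.53) concerns the off-critical case")
    return math.floor(math.log(abs(mu), gamma) / (1.0 + eta_mu))


if __name__ == "__main__":
    for mu in (1e-1, 1e-3, 1e-6):
        for eta_mu in (0.0, 0.1, 0.5):
            print(mu, eta_mu, mass_scale(mu, eta_mu))
```

A positive $\eta_\mu$ makes $h^*$ less negative for the same $|\mu| < 1$, so the mass scale $\gamma^{h^*}$ is reached after fewer integration steps.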



2.4. The anisotropic Ashkin-Teller model. In order to derive (1.12), we briefly recall the analysis of the anisotropic Ashkin-Teller model in [13,14] with J = J . We still obtain an expression similar to (2.21), the main difference being that (see (A.21) below) ε(≤h) ε(≤h) P1 (dψ) contains in the exponents also terms of the form ψx,ω ψx,−ω , and the same is true for Pχ (dχ ). The integration procedure is similar to the one in §2.3, but we have to substitute the Grassmann integration PZ h ,µh (dψ (≤h) ) in (2.31) with a new measure PZ h ,µh ,σh (dψ (≤h) ), where µh and σh are the constants multiplying, respectively, the quadratic mass terms 2

(≤h)−

(≤h)+ ψx,ω ψx,−ω

and

− 2i

ω=±

(≤h)ε

(≤h)ε

ψx,+ ψx,− .

(2.54)

ε=±

One can see that
\[ |\log_\gamma(\mu_{j-1}/\mu_j) - \eta_\mu(\lambda_{-\infty})| \le C\lambda^2\gamma^{\vartheta j}, \qquad |\log_\gamma(\sigma_{j-1}/\sigma_j) - \eta_\sigma(\lambda_{-\infty})| \le C\lambda^2\gamma^{\vartheta j}. \tag{2.55} \]
Hence, since the two mass terms are clearly proportional, respectively, to the operators $O^+$ and $O^-$, we find that
\[ \eta_\mu = \eta_+ - \eta_z , \qquad \eta_\sigma = \eta_- - \eta_z . \tag{2.56} \]
It turns out that the difference of the critical temperatures scales as $|J - J'|^{x_T}$, where $x_T$, see (5.26) of [13] (where the indices are defined with a different sign and the definitions of $\mu_h$ and $\sigma_h$ are exchanged), is given by
\[ x_T = \frac{1 + \eta_\mu}{1 + \eta_\sigma} , \tag{2.57} \]
which implies (1.12), since $\eta_\mu = 1 - x_+$ and $\eta_\sigma = 1 - x_-$.
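The last step can be made explicit by substituting the two relations just quoted into (2.57). This is only a restatement of the formulas above; no new assumption is used.

```latex
% Substitute \eta_\mu = 1 - x_+ and \eta_\sigma = 1 - x_- into (2.57).
\[
  x_T \;=\; \frac{1+\eta_\mu}{1+\eta_\sigma}
      \;=\; \frac{1+(1-x_+)}{1+(1-x_-)}
      \;=\; \frac{2-x_+}{2-x_-} .
\]
```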

3. Equivalence with a Continuum Model

In this section we show that the spin model (1.1) is equivalent, for the purpose of computing the long distance behavior of the correlations we are considering, to a fermionic theory defined as the formal scaling limit of the original one plus an ultraviolet regularization; more exactly, we prove that the critical indices $x_+$, $x_-$, $\nu$ and $x_T$ of the spin model (1.1) are equal to the indices of a fermionic theory provided that the bare coupling $\lambda_\infty$ of the new theory is properly chosen as a suitable function of the parameters of the 8V or AT models. The new fermionic theory has correlations expressed by Grassmann integrals which are identical to the ones appearing in certain Quantum Field Theory models; in particular it verifies extra Gauge symmetries with respect to the original spin Hamiltonian (1.1).



3.1. The model. The continuum (or QFT) model is defined as the limit $N \to \infty$, followed by the limit $-l \to \infty$, to be called the removed cutoff limit, of a model with an infrared $\gamma^l$ and an ultraviolet $\gamma^N$ momentum cut-off, $-l, N \gg 0$. This model is expressed in terms of the following Grassmann integral:
\[ e^{W_N(A,J,\eta)} = \int P(d\psi^{[l,N]}) \exp\Big\{ V^{(N)}(\psi^{[l,N]}) + \sum_{\varepsilon}\int dx\, A^\varepsilon_x O^\varepsilon_x + \sum_{\omega}\int dx\, \big[ J_{x,\omega}\psi^{[l,N]+}_{x,\omega}\psi^{[l,N]-}_{x,\omega} + \psi^{[l,N]+}_{x,\omega}\eta^-_{x,\omega} + \eta^+_{x,\omega}\psi^{[l,N]-}_{x,\omega} \big] \Big\} , \tag{3.1} \]
where $x \in \Lambda$, a square subset of $\mathbb{R}^2$ of size $|\Lambda| \le \gamma^{-2l}$, $O^+_x$ and $O^-_x$ are defined in (2.36) and $P(d\psi^{[l,N]})$ is a Gaussian measure with propagator $g^{[l,N]}_T(x,y)$ given by (2.41) with $\mu_h = 0$, $Z_h = 1$ and $C_h^{-1}(\mathbf{k})$ replaced by $C^{-1}_{l,N}(\mathbf{k}) = \sum_{k=l}^{N} f_k(\mathbf{k})$; moreover $\eta^\pm_x$ are external fermionic fields and $A^\varepsilon_x$, $J_{x,\omega}$ are external bosonic fields. The interaction is
\[ V^{(N)}(\psi) = \frac{\lambda_\infty}{2}\sum_\omega \int dx \int dy\, v_K(x-y)\, \psi^+_{x,\omega}\psi^-_{x,\omega}\psi^+_{y,-\omega}\psi^-_{y,-\omega} , \tag{3.2} \]
where $K < N$ and $v_K(x-y)$ is given by
\[ v_K(x-y) = \frac{1}{L^2}\sum_p \chi_0(\gamma^{-K}p)\, e^{ip(x-y)} , \tag{3.3} \]
$\chi_0(p)$ being a smooth function with support in $\{|p| \le 2\}$ and equal to 1 for $\{|p| \le 1\}$. The correlation functions are found by making suitable derivatives with respect to the external fields $A_x$, $J_x$, $\eta_x$ and setting them equal to zero. We consider $K$ fixed, for example $K = 0$, so that no ultraviolet regularization is needed, as we shall see, when we take the limit $N \to \infty$. We shall study the functional $W_N(A,J,\eta)$ by performing a multiscale integration of (3.1); we have to distinguish two different regimes: the first regime, called ultraviolet, contains the scales $h \in [K+1, N]$, while the second one contains the scales $h \le K$, and is called infrared.

3.2. The ultraviolet integration. We describe how to control the integration of the ultraviolet scales, in order to remove the ultraviolet cut-off $N \to \infty$. For simplicity, we shall only consider the case $A = \eta = 0$, but the result is valid for the full problem (see also the remark at the end of this section). Assume that the fields $\psi^{(N)}, \psi^{(N-1)}, \ldots, \psi^{(h+1)}$ are integrated so that
\[ e^{W_N(0,J,0)} = e^{S^{(h)}(J)} \int P(d\psi^{[l,h]}) \exp\Big\{ V^{(h)}(\psi^{[l,h]}) + B^{(h)}(\psi^{[l,h]}, J) \Big\} , \tag{3.4} \]
where $V^{(h)} + B^{(h)}$ is the sum of integrated monomials in $m$ $\psi^+_{x_i,\omega_i}$ variables, $i = 1, \ldots, m$, $m$ $\psi^-_{y_i,\omega_i}$ variables and $n$ $J_{z_j,\omega'_j}$ external fields, $j = 1, \ldots, n$, multiplied by suitable kernels $W^{(n;2m)(h)}_{\underline{\omega}';\underline{\omega}}(z; x, y)$. The scaling dimension is again (2.44), and, as in §2.3, we define a localization operator on the terms with positive or vanishing scaling dimensions which, as in the previous case, are the terms with $(2m, n) = (2, 0)$ or $(4, 0)$ or $(2, 1)$.
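The selection of the relevant and marginal terms can be checked by direct counting. The sketch below assumes that the scaling dimension (2.44), which is not reproduced in this section, is $D = 2 - n - m$ for a kernel with $n$ external $J$ fields and $2m$ fermionic fields; this is the exponent of $\gamma^k$ that appears in the dimensional bound (3.9), and the symbol $D$ is used here only for illustration.

```latex
% Assumed form of the scaling dimension: D(n,m) = 2 - n - m
% (the exponent of \gamma^{k} in (3.9)); D is an illustrative symbol.
\[
\begin{array}{c|c|c}
(2m,n) & D = 2-n-m & \text{type} \\ \hline
(2,0)  & 1  & \text{relevant} \\
(4,0)  & 0  & \text{marginal} \\
(2,1)  & 0  & \text{marginal} \\
(6,0)  & -1 & \text{irrelevant} \\
(4,1)  & -1 & \text{irrelevant} \\
(2,2)  & -1 & \text{irrelevant}
\end{array}
\]
```

With this counting, only the three cases listed in the text have $D \ge 0$ (terms with no fermionic fields are not considered here); every other monomial has negative dimension.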



Notice however that in this case the localization operation is defined as the identity on the relevant or marginal terms, that is $W^{(0;2)(h)}_{\omega}$, $W^{(1;2)(h)}_{\omega';\omega}$ and $W^{(0;4)(h)}_{\omega,\omega'}$, while it annihilates, as always, all the other contributions to the effective potential. These kernels $W^{(n;2m)(h)}_{\underline{\omega}';\underline{\omega}}(z; x, y)$ are represented as power expansions in the running coupling functions $W^{(0;2)(k)}_{\omega}$, $W^{(1;2)(k)}_{\omega';\omega}$ and $W^{(0;4)(k)}_{\omega,\omega'}$, $k \ge h$, whose size is estimated by the $L^1$ norm, as well as the kernels themselves. Of course, since the kernels may contain delta functions, we extend, as usual, the definition of the $L^1$ norm, by treating the delta as a positive function. Hence, we define
\[ \big\| W^{(n;2m)(k)}_{\underline{\omega}';\underline{\omega}} \big\| \overset{\mathrm{def}}{=} \frac{1}{|\Lambda|}\int dz\, dx\, dy\, \big| W^{(n;2m)(k)}_{\underline{\omega}';\underline{\omega}}(z; x, y) \big| \tag{3.5} \]
and we prove the following theorem.

Theorem 3.1. If $\lambda_\infty$ is small enough, there exist two constants $C_1 > 1$ and $C_2$, such that, if $K \le h \le N$, the relevant or marginal contributions to the effective potential satisfy the bounds:
\[ \big\| W^{(0;2)(h)}_{\omega} \big\| \le C_1 |\lambda_\infty| \gamma^{h} \gamma^{-2(h-K)} , \tag{3.6} \]
\[ \big\| W^{(1;2)(h)}_{\omega';\omega} - \delta_2 \delta_{\omega,\omega'} \big\| \le C_2 |\lambda_\infty| \gamma^{-(h-K)} , \tag{3.7} \]
\[ \big\| W^{(0;4)(h)}_{\omega,\omega'} - \lambda_\infty v\delta_4\, \delta_{\omega,-\omega'} \big\| \le C_2 |\lambda_\infty|^2 \gamma^{-(h-K)} , \tag{3.8} \]
where $\delta_2(z; x, y) \equiv \delta(z - x)\delta(z - y)$ and $v\delta_4(x_1, x_2, y_1, y_2) \equiv \delta(x_1 - y_1)\, v_K(x_1 - x_2)\, \delta(x_2 - y_2)$.

Before proving the theorem, notice that, as for the multiscale analysis in §2.3, the fact that the running coupling functions are small for $\lambda_\infty$ small enough (as it follows by the bounds (3.6), (3.7), (3.8)) implies the following standard "dimensional" bound for all other kernels with negative scaling dimension, for $\lambda_\infty$ small enough, see e.g. App. A of [23]:
\[ \big\| W^{(n;2m)(k)}_{\underline{\omega}';\underline{\omega}} \big\| \le C^{\,n+d_{n,m}} |C_1\lambda_\infty|^{d_{n,m}} \gamma^{k(2-n-m)} , \tag{3.9} \]
where $d_{n,m} = \max\{m-1, 0\}$, if $n > 0$, and $d_{n,m} = \max\{m-1, 1\}$, if $n = 0$, and $C$ is a suitable constant larger, at least, than $\gamma$. Indeed, in the tree expansion for $W^{(n;2m)(k)}_{\underline{\omega}';\underline{\omega}}$ defined in [23], all the vertices of the tree have negative scaling dimension and there are three types of endpoints (see [23]), associated to $W^{(0;2)(h)}_{\omega}$, $W^{(1;2)(h)}_{\omega';\omega}$, $W^{(0;4)(h)}_{\omega,\omega'}$, which contribute (up to dimensional factors and for $\lambda_\infty$ small enough) a factor $C_1|\lambda_\infty|$, $1 + C_2|\lambda_\infty| \le C$ and $|\lambda_\infty|[1 + C_2|\lambda_\infty|] \le C_1|\lambda_\infty|$, respectively. Notice that the condition $C > \gamma$ comes from the bound of the trivial tree (that with only one endpoint) contributing to the tree expansion of $W^{(0;2)(k)}_{\omega}$. The bounds (3.6), (3.7), (3.8) follow from a "power counting improvement", similar to the one used in [19] for the Yukawa model, in which the non-locality of the interaction plays an essential role.

Proof of Theorem 3.1. The proof is by induction: we assume that the bounds (3.6)–(3.8) hold for $h : k + 1 \le h \le N$ (for $h = N$ they are true with $C_1 = C_2 = 0$) and we prove them for $h = k$.



Fig. 2. Graphical representation of (3.10); the gray blobs represent the kernels $W^{(0;2)(k)}_{\omega}$ and $W^{(1;2)(k)}_{-\omega;\omega}$, the dotted lines the external fermionic lines, the paired line is the fermionic propagator $g^{[k+1,N]}_{\omega}$ and the wiggly line is the interaction $v_K$

The inductive assumption implies the validity of (3.9) and we need to improve such bound when $2 - n - m \ge 0$. We can write, by using the properties of the fermionic truncated expectations and the fact that, by the oddness of the free propagator, $W^{(1;0)(k)}_{\omega} = 0$,
\[ W^{(0;2)(k)}_{\omega}(x, y) = \lambda_\infty \int dw\, dw'\, v_K(x - w)\, g^{[k+1,N]}_{\omega}(x - w')\, W^{(1;2)(k)}_{-\omega;\omega}(w; w', y) , \tag{3.10} \]
which can be bounded, by using (3.9), as (Fig. 2)
\[ \big\| W^{(0;2)(k)}_{\omega} \big\| \le |\lambda_\infty|\, \|v_K\|_{L^\infty} \big\| W^{(1;2)(k)}_{-\omega;\omega} \big\| \sum_{j=k+1}^{N} \big\| g^{(j)}_{\omega} \big\|_{L^1} \le \frac{c_1}{1 - \gamma^{-1}}\, \gamma^{2K} C |\lambda_\infty| \gamma^{-k} \le C_1 |\lambda_\infty| \gamma^{k} \gamma^{-2(k-K)} , \tag{3.11} \]
where, for example, $C_1 = \max\{2, \frac{c_1}{1-\gamma^{-1}} C\}$; hence (3.6) is proved. Notice that the condition $C_1 \ge 2$ is introduced only because $C_1$ is the same constant appearing in (3.9).
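The chain of inequalities in (3.11) is a geometric sum over scales. The sketch below spells it out under two single-scale estimates that are assumed here, since they are not restated in this section: $\|g^{(j)}_\omega\|_{L^1} \le c_1\gamma^{-j}$ and $\|v_K\|_{L^\infty} \le \gamma^{2K}$ (with any numerical constant absorbed in $C$). The bound $\|W^{(1;2)(k)}_{-\omega;\omega}\| \le C$ is (3.9) with $n = m = 1$.

```latex
% Assumed inputs: \|g^{(j)}_\omega\|_{L^1} \le c_1 \gamma^{-j},
%                 \|v_K\|_{L^\infty} \le \gamma^{2K},
%                 \|W^{(1;2)(k)}_{-\omega;\omega}\| \le C   (from (3.9), n = m = 1).
\[
  \sum_{j=k+1}^{N} \|g^{(j)}_\omega\|_{L^1}
  \;\le\; c_1 \sum_{j=k+1}^{\infty} \gamma^{-j}
  \;=\; c_1\,\frac{\gamma^{-(k+1)}}{1-\gamma^{-1}}
  \;\le\; \frac{c_1}{1-\gamma^{-1}}\,\gamma^{-k} ,
\]
\[
  \|W^{(0;2)(k)}_\omega\|
  \;\le\; |\lambda_\infty|\,\gamma^{2K}\, C\, \frac{c_1}{1-\gamma^{-1}}\,\gamma^{-k}
  \;=\; \frac{c_1 C}{1-\gamma^{-1}}\,|\lambda_\infty|\,\gamma^{k}\,\gamma^{-2(k-K)} ,
\]
% since \gamma^{2K}\gamma^{-k} = \gamma^{k}\gamma^{-2(k-K)}.
```

The gain with respect to the dimensional bound is the factor $\gamma^{-2(k-K)}$, which comes from the non-locality of $v_K$: the interaction is bounded in $L^\infty$ at the fixed scale $K$ while the propagator is summed in $L^1$.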

Let us now consider $W^{(1;2)(k)}_{\omega';\omega}(z; x, y)$ and notice that it can be decomposed as the sum of the five terms in Fig. 3. The term denoted by (a) in Fig. 3 can be bounded as
\[ \big\| W^{(1;2)(k)}_{(a);\omega';\omega} \big\| \le |\lambda_\infty|\, \|v_K\|_{L^\infty} \big\| W^{(2;2)(k)}_{\omega',-\omega;\omega} \big\| \sum_{j=k+1}^{N} \big\| g^{(j)}_{\omega} \big\|_{L^1} \le C C_1 |\lambda_\infty| \gamma^{-2(k-K)} . \tag{3.12} \]
The bounds for the graphs (c) and (d) are an easy consequence of the bound for $W^{(0;2)(k)}_{\omega}$. In order to obtain an improved bound also for the graph (b) of Fig. 3, we need to further expand $W^{(2;0)(k)}_{\omega,\omega'}$ as done in Fig. 4, if we define the graph (b2) so that the vertex $u$ can be either on the fermion line joining $w$ with $w'$ (as in the figure) or on the other fermion line ending in $w$. The bound for the graph (b2) can be done by using the previous arguments. We can write
\[ W^{(1;2)(k)}_{(b2)\omega';\omega}(z; x, y) = \lambda_\infty^2\, \delta(x - y) \int dw\, du'\, dz'\, v_K(x - w)\, v_K(u - z') \cdot \int du\, dw'\, g^{[k+1,N]}_{\omega}(w - u)\, g^{[k+1,N]}_{\omega}(u - w')\, g^{[k+1,N]}_{\omega}(w - u') \cdot W^{(2;2)(k)}_{\omega',\omega;-\omega}(z, z'; w', u') . \tag{3.13} \]


Fig. 3. Graphical representation of $W^{(1;2)(k)}_{\omega';\omega}(z; x, y)$, with panels (a), (b), (c), (d); the external wiggly line represents the external field $J$, while the internal wiggly line is the interaction $v_K$, as before

Fig. 4. Graphical representation of the term (b) in Fig. 3, with panels (b), (b1), (b2), (b3)

In order to get the right bound, it is convenient to decompose the three propagators $g_\omega$ into scales and then bound by the $L^\infty$ norm the propagator of lowest scale, while the two others are used to control the integration over the inner space variables through their $L^1$ norm. Hence we get:
\[ \big\| W^{(1;2)(k)}_{(b2)\omega';\omega} \big\| \le |\lambda_\infty|^2\, \|v_K\|_{L^\infty} \|v_K\|_{L^1} \big\| W^{(2;2)(k)}_{\omega',-\omega;\omega} \big\| \tag{3.14} \]
\[ \cdot\; 3! \sum_{k+1 \le i' \le j \le i \le N} \big\| g^{(j)}_{\omega} \big\|_{L^1} \big\| g^{(i)}_{\omega} \big\|_{L^1} \big\| g^{(i')}_{\omega} \big\|_{L^\infty} \le C_3 |\lambda_\infty|^2 \gamma^{-2(k-K)} \tag{3.15} \]
for some constant $C_3$. The bound of (b1) and (b3) requires a new argument, based on a cancellation following from the particular form of the free propagator. Let us consider, for instance, (b1):
\[ W^{(1;2)(k)}_{(b1)\omega';\omega}(z; x, y) = \lambda_\infty\, \delta_{\omega',-\omega}\, \delta(x - y) \int dw\, v_K(x - w) \Big[ g^{[k+1,N]}_{-\omega}(w - z) \Big]^2 . \tag{3.16} \]



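Since the cutoff function $C_{k,N}(\mathbf{k})$ is symmetric in the exchange between $k_0$ and $k_1$, it is easy to see that $g^{[k,N]}_{\omega}(x_0, x_1) = -i\omega\, g^{[k,N]}_{\omega}(x_1, -x_0)$; hence
\[ \int du\, \Big[ g^{[k+1,N]}_{-\omega}(u) \Big]^2 = 0 . \tag{3.17} \]
It follows, by using (3.17) and the identity
\[ v_K(x - w) = v_K(x - z) + \sum_{j=0,1} (z_j - w_j) \int_0^1 ds\, \partial_j v_K(x - z + s(z - w)) , \tag{3.18} \]
that we can write
\[ W^{(1;2)(k)}_{(b1)\omega';\omega}(z; x, y) = \lambda_\infty\, \delta_{\omega',-\omega}\, \delta(x - y) \cdot \sum_{j=0,1} \int_0^1 ds \int dw\, \partial_j v_K(x - z + s(z - w))\, (z_j - w_j) \Big[ g^{[k+1,N]}_{-\omega}(w - z) \Big]^2 . \tag{3.19} \]
Hence,
\[ \big\| W^{(1;2)(k)}_{(b1)\omega';\omega} \big\| \le 4 |\lambda_\infty| \sum_{i=k}^{N} \sum_{j=k}^{i} \big\| g^{(j)}_{-\omega} \big\|_{L^\infty} \int dx\, |(\partial_j v_K)(x)| \tag{3.20} \]
\[ \cdot \int dw\, |w_j|\, |g^{(i)}_{-\omega}(w)| \le C_4 |\lambda_\infty| \gamma^{-(k-K)} . \tag{3.21} \]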
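The cancellation (3.17) used above follows from the rotation property of the propagator in one line. In the sketch below, $R$ denotes the rotation $R(x_0, x_1) = (x_1, -x_0)$, a symbol introduced here only for illustration; the argument uses $\omega^2 = 1$ and the fact that $R$ preserves the integration measure, and it assumes that the propagator is square integrable, which holds at finite cutoffs.

```latex
% R(x_0,x_1) = (x_1,-x_0) is an illustrative symbol; g stands for g^{[k+1,N]}_{\pm\omega}.
% Rotation property from the text:  g_\omega(x) = -i\omega\, g_\omega(Rx).
\[
  g_\omega(x)^2 \;=\; (-i\omega)^2\, g_\omega(Rx)^2 \;=\; -\,g_\omega(Rx)^2 ,
\]
\[
  \int dx\; g_\omega(x)^2 \;=\; -\int dx\; g_\omega(Rx)^2
  \;=\; -\int dx\; g_\omega(x)^2
  \quad\Longrightarrow\quad
  \int dx\; g_\omega(x)^2 \;=\; 0 .
\]
```

The same computation applies with $\omega$ replaced by $-\omega$, which is the case appearing in (3.17).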

By summing all the bounds, we see that there is a constant $C_2$ such that
\[ \big\| W^{(1;2)(k)}_{\omega';\omega} - \delta_{\omega,\omega'}\, \delta_2 \big\| \le C_2 |\lambda_\infty| \gamma^{-(k-K)} , \tag{3.22} \]
which proves (3.7). The bound (3.8) for $W^{(0;4)(k)}$ follows from similar arguments.

Remark. In the presence of the $A$ fields the above analysis can be repeated, with the only difference being that in the analogue of Fig. 4 the (b1) and (b3) terms are missing.

3.3. Equivalence of the spin and the QFT models. As a consequence of the integration of the ultraviolet scales discussed in the previous section, we can write the removed cutoffs limit of (3.1), with $\eta = 0$ and with the choice $K = 0$, as
\[ \lim_{l \to -\infty} \lim_{N \to \infty} \int P_{\mu_0, Z_0}(d\psi^{(\le 0)})\, e^{V^{(0)}(\psi^{(\le 0)}) + B^{(0)}(\psi^{(\le 0)}, A, J)} , \tag{3.23} \]
where the propagator of the integration measure in (3.23) coincides with $g^{(\le 0)}_T(x, y)$, defined in (2.41), $\mathcal{L}V^{(0)} = \lambda_0 F^{(0)}_\lambda$ and $\mathcal{L}B^{(0)}$, when $J = 0$, defined as in (2.37); from the analysis of the previous section it follows that $\lambda_0$ is a smooth function of $\lambda_\infty$, such that $\lambda_0 = \lambda_\infty + O(\lambda_\infty^2)$.



The multiscale integration for the negative scales can be done exactly as described in §2.3, with the only difference being that, by the oddness of the free propagator, $\nu_j = 0$ and
\[ \lambda_{j-1} = \lambda_j + \tilde\beta^{(j)}_\lambda(\lambda_j, \ldots, \lambda_0) , \tag{3.24} \]
where, by (2.40) and the short memory property,
\[ \tilde\beta^{(j)}_\lambda(\lambda_j, \ldots, \lambda_0) = \beta^{(j)}_\lambda(\lambda_j, \ldots, \lambda_0) + O(\bar\lambda_j^2 \gamma^{\vartheta j}) , \tag{3.25} \]
$\beta^{(j)}_\lambda(\lambda_j, \ldots, \lambda_j)$ being the function appearing in the bound (2.46), so that we can prove in the usual way that $\lambda_{-\infty} = \lambda_0 + O(\lambda_0^2)$; since $\lambda_0 = \lambda_\infty + O(\lambda_\infty^2)$, we have
\[ \lambda_{-\infty} = h(\lambda_\infty) = \lambda_\infty + O(\lambda_\infty^2) , \tag{3.26} \]
for some analytic function $h(\lambda_\infty)$, invertible for $\lambda_\infty$ small enough. Moreover, by using (2.40),
\[ \frac{Z^\pm_{j-1}}{Z^\pm_j} = 1 + \tilde\beta^{(j)}_\pm(\lambda_j, \ldots, \lambda_0) , \tag{3.27} \]
with
\[ \tilde\beta^{(j)}_\pm(\lambda_j, \ldots, \lambda_0) = \beta^{(j)}_\pm(\lambda_j, \ldots, \lambda_0) + O(\bar\lambda_j^2 \gamma^{\vartheta j}) , \tag{3.28} \]
$\beta^{(j)}_\pm$ being the functions appearing in (2.50). This implies that
\[ \eta_\pm = \log_\gamma [1 + \beta_\pm(\lambda_{-\infty})] , \tag{3.29} \]
that is the critical indices in the AT or 8V or in the model (3.1) are the same as functions of $\lambda_{-\infty}$. Of course $\lambda_{-\infty}$ is a rather complex function of all the details of the models. However, if we call $\lambda_j(\lambda)$ the effective couplings of the lattice model of the previous sections, the invertibility of $h(\lambda_\infty)$ implies that we can choose $\lambda_\infty$ so that
\[ h(\lambda_\infty) = \lambda_{-\infty}(\lambda) . \tag{3.30} \]
With this choice of $\lambda_\infty(\lambda)$, the critical indices are the same, as they depend only on $\lambda_{-\infty}$; the rest of this chapter is devoted to the proof that the critical indices have, as functions of $\lambda_\infty$, simple expressions, which imply the scaling relations in the main theorem.

4. Ward Identities and Schwinger-Dyson Equation

In this section we prove that the Gauge symmetry of the equivalent QFT model implies closed equations for the correlations, from which an explicit expression of the correlations and the indices as a function of $\lambda_\infty$ can be derived; such expressions are so simple that the validity of the extended scaling relation can be checked. Such a simplicity follows from the fact that the Ward Identities for the equivalent QFT model, from which the closed equations are derived, verify a property called anomaly non-renormalization.



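4.1. Schwinger-Dyson equations and Ward Identities. The Schwinger-Dyson equations for the model (3.1) are generated by the identity, see [6],
\[ D_\omega(k)\, \frac{\partial e^{W_N}}{\partial \hat\eta^+_{k,\omega}}(0, \eta) = \chi_{l,N}(k) \left[ \hat\eta^-_{k,\omega}\, e^{W_N(0,\eta)} - \lambda_\infty \int \frac{dp}{(2\pi)^2}\, \hat v_K(p)\, \frac{\partial^2 e^{W_N}}{\partial \hat J_{p,-\omega}\, \partial \hat\eta^+_{k+p,\omega}}(0, \eta) \right] , \tag{4.1} \]
where $D_\omega(k) = -ik_0 + \omega k$ and we have shortened the notation of $W_N(0, J, \eta)$ into $W_N(J, \eta)$. On the other hand, by the change of variables $\psi^\pm_{x,\omega} \to e^{\pm i\alpha_{x,\omega}} \psi^\pm_{x,\omega}$, we obtain another identity:
\[ D_\omega(p)\, \frac{\partial W_N}{\partial \hat J_{p,\omega}}(0, \eta) - \tau\, \hat v_K(p)\, D_{-\omega}(p)\, \frac{\partial W_N}{\partial \hat J_{p,-\omega}}(0, \eta) = \int \frac{dk}{(2\pi)^2} \left[ \hat\eta^+_{k+p,\omega}\, \frac{\partial W_N}{\partial \hat\eta^+_{k,\omega}}(0, \eta) - \frac{\partial W_N}{\partial \hat\eta^-_{k+p,\omega}}(0, \eta)\, \hat\eta^-_{k,\omega} \right] + \frac{\partial W_{\mathcal A}}{\partial \hat\alpha_{p,\omega}}(0, 0, \eta) , \tag{4.2} \]
where $\tau$ is a constant to be chosen later,
\[ e^{W_{\mathcal A}(\alpha, J, \eta)} = \int P(d\psi^{[l,N]})\, e^{V^{(N)}(\psi^{[l,N]}) + \sum_\omega \int dx\, J_{x,\omega} \psi^{[l,N]+}_{x,\omega} \psi^{[l,N]-}_{x,\omega}} \cdot e^{\sum_\omega \int dx\, [\psi^{[l,N]+}_{x,\omega} \eta^-_{x,\omega} + \eta^+_{x,\omega} \psi^{[l,N]-}_{x,\omega}]}\; e^{[\mathcal{A}_0 - \tau \mathcal{A}_-](\alpha, \psi^{[l,N]})} , \tag{4.3} \]
\[ \mathcal{A}_0(\alpha, \psi) \overset{\mathrm{def}}{=} \sum_{\omega=\pm} \int \frac{dq\, dp}{(2\pi)^4}\, C_\omega(q, p)\, \hat\alpha_{q-p,\omega}\, \hat\psi^+_{q,\omega} \hat\psi^-_{p,\omega} , \tag{4.4} \]
\[ \mathcal{A}_-(\alpha, \psi) \overset{\mathrm{def}}{=} \sum_{\omega=\pm} \int \frac{dq\, dp}{(2\pi)^4}\, D_{-\omega}(p - q)\, \hat v_K(p - q)\, \hat\alpha_{q-p,\omega}\, \hat\psi^+_{q,-\omega} \hat\psi^-_{p,-\omega} , \tag{4.5} \]
\[ C_\omega(q, p) = [\chi^{-1}_{l,N}(p) - 1] D_\omega(p) - [\chi^{-1}_{l,N}(q) - 1] D_\omega(q) , \tag{4.6} \]
and $\chi_{l,N}(\mathbf{k}) = \sum_{k=l}^{N} f_k(\mathbf{k})$.

An explicit derivation of (4.2) can be found in §2.2 of [10]; (4.2) is obtained by introducing a cut-off function $\chi^\varepsilon_{l,N}(k)$ never vanishing for all values of $k \neq 0$ and equivalent to $\chi_{l,N}(k)$ as far as the scaling properties of the theory are concerned; $\varepsilon$ is a small parameter and $\lim_{\varepsilon \to 0^+} \chi^\varepsilon_{l,N}(k) = \chi_{l,N}(k)$. This further regularization (to be removed before taking the removed cutoffs limit) ensures that the identity $[(\chi^\varepsilon_{l,N})^{-1}(k) - 1]\, \chi^\varepsilon_{l,N}(k) = 1 - \chi^\varepsilon_{l,N}(k)$ is satisfied for all $k \neq 0$. When this further regularization is removed, all the quantities we shall study have a well defined expression and, by the change of variables $\psi^\pm_{x,\omega} \to e^{\pm i\alpha_{x,\omega}} \psi^\pm_{x,\omega}$, we get the Ward identity (WI):
\[ D_\omega(p)\, \frac{\partial W_N}{\partial \hat J_{p,\omega}}(0, \eta) = \int \frac{dk}{(2\pi)^2} \left[ \hat\eta^+_{k+p,\omega}\, \frac{\partial W_N}{\partial \hat\eta^+_{k,\omega}}(0, \eta) - \frac{\partial W_N}{\partial \hat\eta^-_{k+p,\omega}}(0, \eta)\, \hat\eta^-_{k,\omega} \right] + \frac{\partial W_{\mathcal A}}{\partial \hat\alpha_{p,\omega}}(0, 0, \eta) \Big|_{\tau=0} . \tag{4.7} \]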



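Equation (4.2) is obtained from (4.7) by just adding and subtracting the second term in the first line of (4.2). The last term in (4.7) is a correction to the formal WI, due to the presence of the ultraviolet cut-off function $\chi_{l,N}$. The two equations obtained from (4.2) by putting $\omega = \pm 1$ can be solved w.r.t. $\partial e^{W_N}/\partial \hat J_{p,\omega}$ and, if we define
\[ a(p) = \frac{1}{1 - \tau \hat v_K(p)} , \qquad \bar a(p) = \frac{1}{1 + \tau \hat v_K(p)} , \qquad A_\varepsilon(p) = \frac{a(p) + \varepsilon\, \bar a(p)}{2} , \tag{4.8} \]
we obtain the identity
\[ \frac{\partial e^{W_N}}{\partial \hat J_{p,\omega}}(0, \eta) - \sum_{\omega'} \frac{A_{\omega\omega'}(p)}{D_\omega(p)}\, \frac{\partial e^{W_{\mathcal A}}}{\partial \hat\alpha_{p,\omega'}}(0, 0, \eta) = \sum_{\omega'} \frac{A_{\omega\omega'}(p)}{D_\omega(p)} \int \frac{dk}{(2\pi)^2} \left[ \hat\eta^+_{k+p,\omega'}\, \frac{\partial e^{W_N}}{\partial \hat\eta^+_{k,\omega'}}(0, \eta) - \frac{\partial e^{W_N}}{\partial \hat\eta^-_{k+p,\omega'}}(0, \eta)\, \hat\eta^-_{k,\omega'} \right] . \tag{4.9} \]
By using (4.9) and (4.1), we easily get:
\[ D_\omega(k)\, \frac{\partial e^{W_N}}{\partial \hat\eta^+_{k,\omega}}(0, \eta) = \chi_{l,N}(k) \Bigg\{ \hat\eta^-_{k,\omega}\, e^{W_N(0,\eta)} - \lambda_\infty \sum_{\omega'} \int \frac{dp}{(2\pi)^2}\, \hat v_K(p)\, \frac{A_{-\omega\omega'}(p)}{D_{-\omega}(p)} \cdot \int \frac{dq}{(2\pi)^2} \left[ \hat\eta^+_{q+p,\omega'}\, \frac{\partial^2 e^{W_N}}{\partial \hat\eta^+_{q,\omega'}\, \partial \hat\eta^+_{k+p,\omega}}(0, \eta) - \frac{\partial^2 e^{W_N}}{\partial \hat\eta^+_{k+p,\omega}\, \partial \hat\eta^-_{q+p,\omega'}}(0, \eta)\, \hat\eta^-_{q,\omega'} \right] - \lambda_\infty \sum_{\omega'} \int \frac{dp}{(2\pi)^2}\, \hat v_K(p)\, \frac{A_{-\omega\omega'}(p)}{D_{-\omega}(p)}\, \frac{\partial^2 e^{W_{\mathcal A}}}{\partial \hat\alpha_{p,\omega'}\, \partial \hat\eta^+_{k+p,\omega}}(0, 0, \eta) \Bigg\} , \tag{4.10} \]
where we have used that, by simple parity arguments,
\[ \lambda_\infty \int \frac{dp}{(2\pi)^2}\, \hat v_K(p)\, \frac{A_{-\omega\omega}(p)}{D_{-\omega}(p)}\, \frac{\partial e^{W_N}}{\partial \hat\eta^+_{k,\omega}}(0, \eta) = 0 . \tag{4.11} \]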

4.2. Closed equation. If we make an arbitrary number of functional derivatives with respect to the $\eta$ external fields in (4.10), then we set $\eta = 0$ and perform the Fourier transform, we obtain a set of differential equations. We will prove in the last section the following crucial result:

Theorem 4.1. If $\lambda_\infty$ is small enough and we put
\[ \tau = \frac{\lambda_\infty}{4\pi} , \tag{4.12} \]
then the Fourier transforms of the correlation functions generated by setting $\eta = 0$ after deriving w.r.t. $\eta$ the functional
\[ \sum_{\omega'} \int \frac{dp}{(2\pi)^2}\, \hat v_K(p)\, \frac{A_{-\omega\omega'}(p)}{D_{-\omega}(p)}\, \frac{\partial^2 e^{W_{\mathcal A}(0,0,\eta)}}{\partial \hat\alpha_{p,\omega'}\, \partial \hat\eta^+_{k+p,\omega}} , \tag{4.13} \]
vanish at distinct points in the removed cutoff limit (defined at the beginning of §3.1).

This theorem will be proved in §4.3. We want now to show how to use it to prove the identity (1.10), so completing the proof of Theorem 1.1. By using Theorem 4.1 and some regularity property of the Schwinger functions (for details, see §A.1 in [6]) we get, in the removed cutoff limit, a set of closed equations for the Schwinger functions. In particular, if we define
\[ \langle \psi^-_{x,\omega} \psi^+_{y,\omega} \rangle \overset{\mathrm{def}}{=} S_\omega(x - y) , \tag{4.14} \]
\[ \langle \psi^-_{x,\omega} \psi^-_{y,-\omega} \psi^+_{u,-\omega} \psi^+_{v,\omega} \rangle \overset{\mathrm{def}}{=} G_\omega(x, y, u, v) , \tag{4.15} \]
we get
\[ (\partial_\omega S_\omega)(x) - \lambda_\infty F_{K,-}(x)\, S_\omega(x) = \delta(x) , \tag{4.16} \]
where $\partial_\omega = \partial_{x_0} + i\omega\, \partial_{x_1}$ and $F_{K,-}(x) = \int dp/(2\pi)^2\, e^{-ipx}\, \hat F_{K,-}(-p)$, with
\[ \hat F_{K,\varepsilon}(p) \overset{\mathrm{def}}{=} \frac{\hat v_K(p)\, A_\varepsilon(p)}{D_{-\omega}(p)} . \tag{4.17} \]
The solution of (4.16) is:
\[ S_\omega(x) = e^{\lambda_\infty \Delta_-(x, 0)}\, g_\omega(x) , \tag{4.18} \]
having defined
\[ \Delta_\varepsilon(x, z) = \int \frac{dk}{(2\pi)^2}\, \frac{e^{-ikx} - e^{-ikz}}{D_\omega(k)}\, \hat F_{K,\varepsilon}(-k) . \tag{4.19} \]
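That (4.18) solves (4.16) can be checked directly. The sketch below writes $\Delta_\varepsilon(x,z)$ for the function defined in (4.19), and it makes three assumptions about notation that are not spelled out in this section: $k = (k_0, k_1)$ with $kx = k_0 x_0 + k_1 x_1$, the second term of $D_\omega(k)$ is read as $\omega k_1$, and in the removed cutoff limit the free propagator satisfies $\partial_\omega g_\omega = \delta$.

```latex
% Assumptions: kx = k_0 x_0 + k_1 x_1,  D_\omega(k) = -ik_0 + \omega k_1,
%              \partial_\omega g_\omega(x) = \delta(x) in the removed cutoff limit.
% Step 1: plane waves are eigenfunctions of \partial_\omega = \partial_{x_0} + i\omega\partial_{x_1}.
\[
  \partial_\omega\, e^{-ikx} \;=\; \big(-ik_0 + i\omega(-ik_1)\big)\, e^{-ikx}
  \;=\; D_\omega(k)\, e^{-ikx} .
\]
% Step 2: the factor D_\omega(k) cancels the denominator in (4.19).
\[
  \partial_\omega \Delta_\varepsilon(x,z)
  \;=\; \int \frac{dk}{(2\pi)^2}\; e^{-ikx}\, \hat F_{K,\varepsilon}(-k)
  \;=\; F_{K,\varepsilon}(x) .
\]
% Step 3: product rule, using \Delta_-(0,0) = 0 on the support of the delta function.
\[
  \partial_\omega \Big( e^{\lambda_\infty \Delta_-(x,0)}\, g_\omega(x) \Big)
  \;=\; \lambda_\infty F_{K,-}(x)\, e^{\lambda_\infty \Delta_-(x,0)} g_\omega(x)
        \;+\; e^{\lambda_\infty \Delta_-(x,0)}\, \delta(x)
  \;=\; \lambda_\infty F_{K,-}(x)\, S_\omega(x) + \delta(x) .
\]
```

The last line is (4.16) rearranged, so the ansatz (4.18) is consistent with the closed equation under the stated assumptions.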

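Notice that, for large $|x|$, thanks to (4.8),
\[ \Delta_\varepsilon(x, 0) \sim -\frac{A_\varepsilon(0)}{2\pi} \ln |x| = -\frac{a(0) + \varepsilon\, \bar a(0)}{4\pi} \ln |x| , \tag{4.20} \]
which implies, in particular, that the critical index $\eta_z$, defined in (2.51), is given by
\[ \eta_z = \frac{\lambda_\infty}{4\pi} [a(0) - \bar a(0)] = \frac{2\tau^2}{1 - \tau^2} . \tag{4.21} \]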
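The second equality in (4.21) is a short computation with (4.8) and (4.12). The sketch uses $\hat v_K(0) = 1$, which follows from (3.3) because $\chi_0(p) = 1$ for $|p| \le 1$.

```latex
% Inputs: a(0) = 1/(1-\tau), \bar a(0) = 1/(1+\tau)  (from (4.8) with \hat v_K(0) = 1),
%         \lambda_\infty/(4\pi) = \tau               (from (4.12)).
\[
  a(0) - \bar a(0) \;=\; \frac{1}{1-\tau} - \frac{1}{1+\tau}
  \;=\; \frac{2\tau}{1-\tau^2} ,
  \qquad
  \eta_z \;=\; \frac{\lambda_\infty}{4\pi}\,\big[a(0) - \bar a(0)\big]
  \;=\; \tau \cdot \frac{2\tau}{1-\tau^2}
  \;=\; \frac{2\tau^2}{1-\tau^2} .
\]
```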

Moreover, if we take in (4.10) three derivatives w.r.t. $\hat\eta^-_{q,-\omega}$, $\hat\eta^-_{k+q-s,\omega}$ and $\hat\eta^+_{s,-\omega}$, we find, after Fourier transforming and in the cutoffs limit,
\[ \partial^x_\omega G_\omega(x, y, u, v) = \delta(x - v)\, S_{-\omega}(y - u) + \lambda_\infty \big[ F_{K,+}(x - y) - F_{K,+}(x - u) - F_{K,-}(x - v) \big]\, G_\omega(x, y, u, v) , \tag{4.22} \]
which is solved, by using (4.18), by
\[ G_\omega(x, y, u, v) = e^{-\lambda_\infty [\Delta_+(x - y, v - y) - \Delta_+(x - u, v - u)]} \cdot S_\omega(x - v)\, S_{-\omega}(y - u) . \tag{4.23} \]

The r.h.s. of (4.23) is well defined for $x = u$ and $y = v$, if $x \neq y$, or $x = y$ and $u = v$, if $x \neq u$. This is a consequence of the fact that the operators $\psi^+_{x,\omega} \psi^-_{x,-\omega}$ and $\psi^+_{x,\omega} \psi^+_{x,-\omega}$ are well defined even in the limit $N \to \infty$, thanks to the non locality of the interaction. Hence, one expects that the second derivatives $\partial^2 W_N / \partial A^\varepsilon_x \partial A^\varepsilon_y$, $\varepsilon = \pm$, can be calculated by simply using Eqs. (4.23), (4.18). A rigorous proof of this statement could be done by a simple extension of Lemma 4.1 of [6] (with $\bar Z^{(1)}_N = c_1 = 1 + O(\lambda_\infty)$), where a similar (more difficult) problem is considered. If we put in (4.23) $x = u$ and $y = v$, we find, using also (4.15) and (4.20), that
\[ \langle \psi^-_{x,\omega} \psi^+_{x,-\omega} \psi^-_{y,-\omega} \psi^+_{y,\omega} \rangle = \langle \psi^-_{x,\omega} \psi^+_{x,-\omega} \psi^-_{y,-\omega} \psi^+_{y,\omega} \rangle_0\; e^{-2\lambda_\infty [\Delta_+(x - y, 0) - \Delta_-(x - y, 0)]} \underset{|x-y| \to \infty}{\sim} \frac{C}{|x - y|^{2[1 - \bar a(0)(\lambda_\infty / 2\pi)]}} . \tag{4.24} \]

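If we put instead $x = y$ and $u = v$, we get
\[ \langle \psi^-_{x,\omega} \psi^-_{x,-\omega} \psi^+_{u,-\omega} \psi^+_{u,\omega} \rangle = \langle \psi^-_{x,\omega} \psi^-_{x,-\omega} \psi^+_{u,-\omega} \psi^+_{u,\omega} \rangle_0\; e^{2\lambda_\infty [\Delta_+(x - u, 0) + \Delta_-(x - u, 0)]} \underset{|x-u| \to \infty}{\sim} \frac{C}{|x - u|^{2[1 + a(0)(\lambda_\infty / 2\pi)]}} . \tag{4.25} \]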
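The two decay exponents in (4.24) and (4.25) can be evaluated in closed form with (4.8) and (4.12). The sketch below assumes, as a reading of the definition (1.8) which is not reproduced here, that the correlation in (4.24) decays as $|x-y|^{-2x_+}$ and the one in (4.25) as $|x-u|^{-2x_-}$; it also uses $\hat v_K(0) = 1$.

```latex
% Inputs: \lambda_\infty/(2\pi) = 2\tau  (from (4.12)),
%         a(0) = 1/(1-\tau), \bar a(0) = 1/(1+\tau)  (from (4.8), \hat v_K(0) = 1).
% Assumed reading of (1.8): exponents 2x_+ in (4.24) and 2x_- in (4.25).
\[
  x_+ \;=\; 1 - \bar a(0)\,\frac{\lambda_\infty}{2\pi}
      \;=\; 1 - \frac{2\tau}{1+\tau} \;=\; \frac{1-\tau}{1+\tau} ,
  \qquad
  x_- \;=\; 1 + a(0)\,\frac{\lambda_\infty}{2\pi}
      \;=\; 1 + \frac{2\tau}{1-\tau} \;=\; \frac{1+\tau}{1-\tau} ,
\]
\[
  x_+\, x_- \;=\; \frac{1-\tau}{1+\tau}\cdot\frac{1+\tau}{1-\tau} \;=\; 1 .
\]
```

The product is independent of $\tau$, hence of $\lambda_\infty$, which is why the relation between $x_+$ and $x_-$ holds for every small coupling.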

Let us now choose $\lambda_\infty$, so that (3.30) is satisfied. Then, by using (2.36), (4.8), (4.12) and the definition (1.8) of $x_\pm$ we get the identities
\[ x_+ = \frac{1 - \tau}{1 + \tau} , \qquad x_- = \frac{1 + \tau}{1 - \tau} , \tag{4.26} \]
which imply the identity (1.10).

Remark. The proof of the relation $x_- x_+ = 1$ follows from two main ingredients, namely the linearity of $\tau$ as a function of $\lambda_\infty$, see (4.12), and the vanishing of the last term in (4.10). The validity of such properties is due to our choice of the equivalent continuum model; it is indeed known, as proved in [5], that in other QFT models, still equivalent to the spin model, such properties are not true, so that they do not allow one to derive the relation $x_- x_+ = 1$. The linearity of $\tau$ as a function of $\lambda_\infty$ corresponds to a property called in the physical literature non renormalization of the anomaly or Adler-Bardeen theorem, see [23–25].

4.3. Proof of Theorem 4.1. We start with the multiscale integration of the Grassmann integral $W_{\mathcal A}(\alpha, 0, \eta)$ (4.3) appearing on the WI (4.2). Notice that $W_{\mathcal A}(\alpha, 0, \eta)$ (4.3) is very similar to $W_N(J, \eta)$, see (3.1), the difference being that $\sum_\omega \int dx\, J_{x,\omega} \psi^+_{x,\omega} \psi^-_{x,\omega}$ is replaced by $\mathcal{A}_0 - \tau \mathcal{A}_-$. A crucial role in the analysis is played by the function $C_\omega(p, q)$ appearing in the definition of $\mathcal{A}_0$; this function is very singular, but it appears in the various equations relating the correlation functions only through the regular function
\[ \hat U^{(i,j)}_\omega(q + p, q) \overset{\mathrm{def}}{=} \chi_N(p)\, C_\omega(q + p, q)\, \hat g^{(i)}_\omega(q + p)\, \hat g^{(j)}_\omega(q) , \tag{4.27} \]



where $\chi_N(p)$ is a smooth function, with support in the set $\{|p| \le 3\gamma^{N+1}\}$ and equal to 1 in the set $\{|p| \le 2\gamma^{N+1}\}$; we can add freely this factor in the definition, since $\hat U^{(i,j)}_\omega(q + p, q)$ will only be used for values of $p$ such that $\chi_N(p) = 1$, thanks to the support properties of the propagator. It is easy to see that $\hat U^{(i,j)}_\omega$ vanishes if neither $j$ nor $i$ equals $N$ or $l$; this has the effect that at least one of the fields in $\mathcal{A}_0$ has to be integrated at the $N$ or $l$ scale. As a matter of fact, the terms in which at least one field is integrated at scale $l$ are much easier to analyze, see the considerations after (4.61) below. In order to study the others, it is convenient to introduce the function $\hat S^{(i,j)}_{\bar\omega,\omega}$ such that
\[ \hat U^{(i,j)}_\omega(q + p, q) = \sum_{\bar\omega} D_{\bar\omega}(p)\, \hat S^{(i,j)}_{\bar\omega,\omega}(q + p, q) . \tag{4.28} \]
One can show that, if we define
\[ S^{(i,j)}_{\bar\omega,\omega}(z; x, y) = \int \frac{dp\, dq}{(2\pi)^4}\, e^{-ip(x - z)}\, e^{iq(y - z)}\, \hat S^{(i,j)}_{\bar\omega,\omega}(p, q) , \tag{4.29} \]
then, given any positive integer $M$, there exists a constant $C_M$ such that, if $j > l$,
\[ |S^{(N,j)}_{\bar\omega,\omega}(z; x, y)| \le C_M\, \frac{\gamma^N}{1 + [\gamma^N |x - z|]^M}\, \frac{\gamma^j}{1 + [\gamma^j |y - z|]^M} , \tag{4.30} \]
a bound which is used to control the renormalization of the marginal terms containing a vertex of type $\mathcal{A}_0$. We choose $\tau$ as given by
\[ \tau = \lambda_\infty \sum_{i,j=l+1}^{N} \int \frac{dq}{(2\pi)^2}\, \hat S^{(i,j)}_{-\omega,\omega}(q, q) ; \tag{4.31} \]
by an explicit calculation one can see that, for any $l < 0$ and $N > 0$, $\tau$ satisfies (4.12). We remark that, to get this result, it is important to exclude from the sum in the r.h.s. of (4.31) the couples $(i, j)$ with one of the indices equal to $l$; without this restriction, $\tau$ would be equal to 0, for any $N > 0$.

We will proceed as in the analysis of $W_N(J, \eta)$, by integrating first the ultraviolet scales $N, N - 1, \ldots, h + 1$, $h \ge K$, and following a procedure very similar to the one described in §3.2; we have new marginal terms with one $\alpha$ field and two $\psi$ fields and we have to prove the analogue of (3.7) for them. The marginal terms such that only one of these two fields is contracted are proportional to $W^{(0;2)(k)}$, so that one can use (3.6) to bound them. Hence, we shall consider in detail only the terms such that both fields of $\mathcal{A}_0$ or $\mathcal{A}_1$ are contracted and we shall call $K^{(n;2m)(k)}_{\Delta;\omega;\underline{\omega}}$ the corresponding kernels of the monomials with $2m$ $\psi$-fields and $n$ $\alpha$-fields. In the case $n = 1$, we decompose them as follows:
\[ \hat K^{(1;2m)(k)}_{\Delta;\omega;\underline{\omega}}(p; \mathbf{k}) = \sum_\sigma D_{\sigma\omega}(p)\, \hat W^{(1;2m)(k)}_{\Delta;\sigma,\omega;\underline{\omega}}(p; \mathbf{k}) , \tag{4.32} \]

where $p$ is the momentum flowing along the external $\alpha$-field. As in §3.2, we have to improve the dimensional bound of $W^{(1;2)(k)}_{\Delta;\sigma,\omega;\omega'}$. We can write the following identity, which is represented by the first line of Fig. 5 in the case $\sigma = -1$:
\[ W^{(1;2)(k)}_{\Delta;\sigma,\omega;\omega'}(z; x, y) = \sum_{i,j=k}^{N} \int du\, dw\, S^{(i,j)}_{\sigma\omega,\omega}(z; u, w)\, W^{(0;4)(k)}_{\omega,\omega'}(u, w, x, y) - \tau\, \delta_{-1,\sigma} \int dw\, v_K(z - w)\, W^{(1;2)(k)}_{-\omega;\omega'}(w; x, y) . \tag{4.33} \]

Fig. 5. Graphical representation of $W^{(1;2)(k)}_{\Delta;-1,\omega;\omega'}$, with panels (a), (b), (c), (d), (e); the triangular vertex represents $C_\omega(q, p)$ given by (4.6)

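We can further decompose $W^{(1;2)(k)}_{\Delta;-1,\omega;\omega'}$ as in the last three lines of Fig. 5. The term (c) can be written as
\[ \lambda_\infty \sum_{i,j=k}^{N} \int du\, du'\, dw\, dw'\, S^{(i,j)}_{-\omega,\omega}(z; u, w)\, g^{[k,N]}_\omega(u - u')\, v_K(u - w') \cdot W^{(1;4)(k)}_{-\omega;\omega,\omega'}(w'; u', w, x, y) . \tag{4.34} \]
Hence, if we put $b_j(x) \overset{\mathrm{def}}{=} \gamma^j / (1 + [\gamma^j |x|]^3)$, we recall that $S^{(i,j)}_{-\omega,\omega}$ is different from 0 only if either $i$ or $j$ is equal to $N$, and we use the bound (4.30), we see that the norm of (c) is bounded by
\[ C_3 |\lambda_\infty|\, \|v_K\|_{L^\infty} \int dx\, du'\, dw\, dw'\, |W^{(1;4)(k)}_{-\omega;\omega,\omega'}(w'; u', w, x, y)| \cdot \sum^{*}_{i,j,m=k,\ldots,N} \int dz\, du\, b_i(z - w)\, b_j(z - u)\, |g^{(m)}_\omega(u - u')| , \tag{4.35} \]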



where $*$ reminds us that $\max\{i, j\} = N$. Since the $L^1$ and the $L^\infty$ norm of $b_j$ satisfy a bound similar to analogous bounds of $g^{(j)}_\omega$, we can proceed as in the previous section to bound $\int dz\, du\, b_i(z - w)\, b_j(z - u)\, |g^{(m)}_\omega(u - u')|$, by taking the $L^\infty$ norm for the factor with the smaller index and the $L^1$ norm for the other two. By also using (3.9), we get the bound
\[ C_\vartheta |\lambda_\infty|^2 \gamma^{-2(k-K)} \gamma^{-\vartheta(N-k)} , \tag{4.36} \]
for any $0 < \vartheta < 1$ ($C_\vartheta$ is divergent for $\vartheta \to 1$). With respect to the analogous bound in §3.2 ((b2) in Fig. 4), there is an improvement of a factor $\gamma^{-\vartheta(N-k)}$. The term (d) can be bounded by
\[ C |\lambda_\infty|\, \|v_K\|_{L^\infty} \sum^{*}_{i,j=k,\ldots,N} \|b_i\|_{L^1} \|b_j\|_{L^1} \le C |\lambda_\infty| \gamma^{-(k-K)} \gamma^{-(N-k)} ; \]
for the term (e) we get the bound $C |\lambda_\infty|^2 \gamma^{-3(k-K)} \gamma^{-(N-k)}$. By putting together all the previous bounds, we get
\[ \|(c) + (d) + (e)\| \le C_\vartheta |\lambda_\infty| \gamma^{-(k-K)} \gamma^{-\vartheta(N-k)} . \tag{4.37} \]

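We consider now the terms (a) and (b), whose sum can be written as
\[ \int du \left[ \lambda_\infty \sum_{i,j=k}^{N} S^{(i,j)}_{-\omega,\omega}(z; u, u) - \tau\, \delta(z - u) \right] \cdot \int dw\, v_K(u - w)\, W^{(1;2)(k)}_{-\omega;\omega'}(w; x, y) . \tag{4.38} \]
By using the identity (3.18), (4.38) can be written also as
\[ \left[ \lambda_\infty \sum_{i,j=k}^{N} \int du\, S^{(i,j)}_{-\omega,\omega}(z; u, u) - \tau \right] \int dw\, v_K(z - w)\, W^{(1;2)(k)}_{-\omega;\omega'}(w; x, y) + \lambda_\infty \sum_{p=0,1} \sum_{i,j=k}^{N} \int du\, S^{(i,j)}_{-\omega,\omega}(z; u, u)\, (u_p - z_p) \cdot \int_0^1 ds \int dw\, (\partial_p v_K)(z - w + s(u - z))\, W^{(1;2)(k)}_{-\omega;\omega'}(w; x, y) . \tag{4.39} \]
The latter term is again irrelevant and vanishing for $N - k \to +\infty$; in fact, its norm can be bounded by
\[ 2 |\lambda_\infty|\, \big\| W^{(1;2)(k)}_{-\omega;\omega'} \big\|\, \|\partial v_K\|_{L^1} \sum^{*}_{i,j=k,\ldots,N} \int dz\, b_i(z - u)\, b_j(z - u)\, |u_p - z_p| \le C |\lambda_\infty| \gamma^{-(k-K)} \gamma^{-(N-k)} . \tag{4.40} \]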

Contrary to what happened for the graph (b1) of Fig. 4, the contribution of the graph (a) to the first term in the r.h. side of (4.39) is not zero (that is, the fermionic bubble is not vanishing); however, in this case its value is compensated by the graph (b), thanks to the explicit choice we made for $\tau$. Indeed we have
\[ \lambda_\infty \sum_{i,j=k}^{N} \int du\, S^{(i,j)}_{-\omega,\omega}(z; u, u) - \tau = -2\lambda_\infty \sum_{j=l+1}^{k-1} \int du\, S^{(N,j)}_{-\omega,\omega}(z; u, u) , \tag{4.41} \]
that easily implies that the norm of the first term in the r.h. side of (4.39) is bounded by $C |\lambda_\infty| \gamma^{-(N-k)}$.

Let us finally consider $W^{(1;2)(k)}_{\Delta;+1,\omega;\omega'}$, for which we can use a graph expansion similar to that of Fig. 5, the only differences being that $\tau$ is replaced by 0 and the indices $-\omega$ are replaced by $\omega$. Hence a bound can be obtained with the same arguments used above, with only one important difference: the contribution that in the previous analysis was compensated by the graph (b) now is zero by symmetry reasons. Indeed, if we call $\mathbf{k}^*$ the vector $\mathbf{k}$ rotated by $\pi/2$, it is easy to see that $\hat S^{(i,j)}_{\bar\omega,\omega}(\mathbf{k}^*, p^*) = -\omega\bar\omega\, \hat S^{(i,j)}_{\bar\omega,\omega}(\mathbf{k}, p)$, which implies that
\[ \sum_{i,j=k}^{N} \int du\, S^{(i,j)}_{\omega,\omega}(z; u, u) = \sum_{i,j=k}^{N} \int \frac{d\mathbf{k}}{(2\pi)^2}\, \hat S^{(i,j)}_{\omega,\omega}(\mathbf{k}, -\mathbf{k}) = 0 . \tag{4.42} \]
We have then proved that
\[ \big\| W^{(1;2)(k)}_{\Delta;\sigma,\omega;\omega'} \big\| \le C |\lambda_\infty| \gamma^{-\vartheta(N-k)} , \tag{4.43} \]
which implies, as in the proof of Theorem 3.1, that, for $K \le k \le N$,
\[ \big\| W^{(1;2m)(k)}_{\Delta;\sigma,\omega;\underline{\omega}} \big\| \le (C |\lambda_\infty|)^m \gamma^{(1-m)k} \gamma^{-\vartheta(N-k)} . \tag{4.44} \]

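With respect to the bounds appearing in Theorem 3.1, there is an extra factor $\gamma^{-\vartheta(N-k)}$, implying that such kernels vanish at fixed $k$ in the $N \to \infty$ limit. However, this is not sufficient to prove Theorem 4.1, as in (4.13) the derivatives of $W_{\mathcal A}(\alpha, 0, \eta)$ with respect to the external fields are integrated over $p$. It is convenient to write (4.13) as
\[ \sum_{\omega'} \int \frac{dp}{(2\pi)^2}\, \hat v_K(p)\, \frac{A_{-\omega\omega'}(p)}{D_{-\omega}(p)}\, \frac{\partial^2 e^{W_{\mathcal A}(0,0,\eta)}}{\partial \hat\alpha_{p,\omega'}\, \partial \hat\eta^+_{k+p,\omega}} = \sum_{\varepsilon=\pm} \frac{\partial W_{T,\varepsilon}}{\partial \hat\beta_{k,\omega}}(0, \eta) , \tag{4.45} \]
where
\[ e^{W_{T,\varepsilon}(\beta, \eta)} = \int P(d\psi^{[l,N]})\, e^{V^{(N)}(\psi^{[l,N]}) + \sum_\omega \int dx\, [\psi^{[l,N]+}_{x,\omega} \eta^-_{x,\omega} + \eta^+_{x,\omega} \psi^{[l,N]-}_{x,\omega}]} \cdot e^{[T^{(\varepsilon)}_1 - \tau T^{(\varepsilon)}_-](\psi^{[l,N]}, \beta)} \tag{4.46} \]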



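and
\[ T^{(\varepsilon)}_1(\psi, \beta) = \sum_\omega \int \frac{dk\, dp\, dq}{(2\pi)^4}\, \hat v^{(\varepsilon)}_K(p)\, \frac{C_{-\varepsilon\omega}(q + p, q)}{D_{-\omega}(p)} \cdot \hat\beta_{k,\omega}\, \hat\psi^-_{k+p,\omega}\, \hat\psi^+_{q+p,-\varepsilon\omega}\, \hat\psi^-_{q,-\varepsilon\omega} , \tag{4.47} \]
\[ T^{(\varepsilon)}_-(\psi, \beta) = \sum_\omega \int \frac{dk\, dp\, dq}{(2\pi)^4}\, \hat u^{(\varepsilon)}_K(p)\, \hat\beta_{k,\omega}\, \hat\psi^-_{k+p,\omega}\, \hat\psi^+_{q+p,\varepsilon\omega}\, \hat\psi^-_{q,\varepsilon\omega} , \tag{4.48} \]
where
\[ \hat v^{(\varepsilon)}_K(p) \overset{\mathrm{def}}{=} \hat v_K(p)\, A_\varepsilon(p) , \qquad \hat u^{(\varepsilon)}_K(p) \overset{\mathrm{def}}{=} \hat v_K(p)\, \hat v^{(\varepsilon)}_K(p)\, \frac{D_{\varepsilon\omega}(p)}{D_{-\omega}(p)} . \tag{4.49} \]
Notice that $v^{(\pm)}_K(x)$ and $u^{(-)}_K(x)$ are smooth functions of fast decay, hence they are equivalent to $v_K(x)$ in the bounds. This is not true for $u^{(+)}_K(x)$, whose Fourier transform is bounded but discontinuous in $p = 0$. However, in the following we shall only need to know that $\|u^{(+)}_K\|_{L^\infty} \le C\gamma^{2K}$ and that $|\hat u^{(+)}_K(p)| \le |\hat v_K(p)\, \hat v^{(+)}_K(p)|$, which are easy to prove.

As in §4.1, we now perform a multiscale integration for the ultraviolet scales $N, N - 1, \ldots, k + 1$, $k \ge K$, very similar to the one described in §3.2, the main difference being that there appear in the effective potential new monomials in the external field $\beta$ and in $\psi$. Again, as in Theorem 3.1, one has to produce an improved bound only on the terms with positive or vanishing dimension, so that one has to analyze the kernels of the monomials with a $\beta$ field and one or three $\psi$ fields.

We consider first the terms contributing to $W_{T,\varepsilon}(\beta, \eta)$, in which at least one of the two $\psi$-fields in $T^{(\varepsilon)}_1(\psi, \beta)$ with momentum $q + p$ or $q$ is contracted at scale $N$. We shall call $W^{(1;2m-1)}_{T,\varepsilon;\omega;\underline{\omega}}$ the corresponding kernels of the monomials with $2m - 1$ $\psi$-fields and 1 $\alpha$-field. We can write
\[ \hat W^{(1;1)(k)}_{T,\varepsilon;\omega,\omega} = \hat W^{(1;1)(k)}_{(a)T,\varepsilon;\omega,\omega} + \hat W^{(1;1)(k)}_{(b)T,\varepsilon;\omega,\omega} , \tag{4.50} \]
where

a) $\hat W^{(1;1)(k)}_{(a)T,\varepsilon;\omega,\omega}$ is the sum over the terms such that the field $\hat\beta$ belongs only to a $T^{(\varepsilon)}_1$-vertex, whose $\psi$-field $\hat\psi^+_{q+p,-\varepsilon\omega}$ either is contracted with $\hat\psi^-_{k+p,\omega}$ (this can happen only for $\varepsilon = -1$) or is connected to it through a kernel $\hat W^{(0;2)(k)}_\omega(q + p)$.

b) $\hat W^{(1;1)(k)}_{(b)T,\varepsilon;\omega,\omega}$ is the sum over the remaining terms.

Let us consider the first term. Given $\mathbf{k}$, for $N$ large enough, $\chi^{-1}_{l,N}(\mathbf{k}) - 1 = 0$; hence we can write:
\[ \hat W^{(1;1)(k)}_{(a)T,\varepsilon;\omega,\omega}(\mathbf{k}) = \delta_{\varepsilon,-1} \int \frac{dp}{(2\pi)^2}\, \frac{\hat v^{(-1)}_K(p)}{D_{-\omega}(p)}\, [\chi_{-\infty,N}(p + \mathbf{k}) - 1] \cdot \Big[ 1 + \hat g^{[k+1,N]}_\omega(p + \mathbf{k})\, \hat W^{(0;2)(k)}_\omega(p + \mathbf{k}) \Big] \Big[ 1 + \hat g^{[k+1,N]}_\omega(\mathbf{k})\, \hat W^{(0;2)(k)}_\omega(\mathbf{k}) \Big] . \tag{4.51} \]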



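Fig. 6. Graphical representation of $W^{(1;1)(k)}_{(b)T,\varepsilon;\omega,\omega}$; the dotted line coming from $x$ represents the external field $\beta$

Moreover, since $\hat v^{(-1)}_K(p) = 0$ for $|p| \ge 2\gamma^K$, then $\chi_{-\infty,N}(p + \mathbf{k}) - 1 = 0$, if $\hat v^{(-1)}_K(p) \neq 0$ and $N$ is large enough. It follows that, given a fixed $\mathbf{k}$, for $N$ large enough,
\[ \hat W^{(1;1)(k)}_{(a)T,\varepsilon;\omega,\omega}(\mathbf{k}) = 0 . \tag{4.52} \]
Let us now consider $W^{(1;1)(k)}_{(b)T,\varepsilon;\omega,\omega}(x - y)$, which can be decomposed as in Fig. 6. By using (4.33), it can be written as
\[ \sum_\sigma \int dz\, dw\, u^{(\varepsilon)}_K(x - z)\, g^{[k,N]}_\omega(x - w)\, W^{(1;2)(k)}_{\Delta;\sigma,-\varepsilon\omega;\omega}(z; y, w) , \tag{4.53} \]
hence its norm, by using (4.43), can be bounded by
\[ \|u^{(\varepsilon)}_K\|_{L^\infty} \sum_{j=k}^{N} \|g^{(j)}_\omega\|_{L^1}\, \big\| W^{(1;2)(k)}_{\Delta;\sigma,-\varepsilon\omega;\omega} \big\| \le C |\lambda_\infty| \gamma^{k} \gamma^{-2(k-K)} \gamma^{-\vartheta(N-k)} , \tag{4.54} \]
so that
\[ \big\| W^{(1;1)(k)}_{T,\varepsilon;\omega,\omega} \big\| \le C |\lambda_\infty| \gamma^{k} \gamma^{-\vartheta(N-k)} \gamma^{-2(k-K)} . \tag{4.55} \]
Moreover
\[ W^{(1;3)(k)}_{T,\varepsilon;\omega;\underline{\omega}} = W^{(1;3)(k)}_{(a)T,\varepsilon;\omega;\underline{\omega}} + W^{(1;3)(k)}_{(b)T,\varepsilon;\omega;\underline{\omega}} , \tag{4.56} \]
where $W^{(1;3)(k)}_{(a)T,\varepsilon;\omega;\underline{\omega}}$ contains the terms in which the field $\hat\psi_{k+p,\omega}$ of $T_1$ and $T_-$ is not contracted or is linked to a kernel $\hat W^{(0;2)(k)}_\omega$, while the other terms are collected in $W^{(1;3)(k)}_{(b)T,\varepsilon;\omega;\underline{\omega}}$.

Let us consider first $W^{(1;3)(k)}_{(b)T,\varepsilon;\omega;\underline{\omega}}$, which can be represented as in Fig. 7. We can write
\[ W^{(1;3)(k)}_{(b)T,\varepsilon;\omega,\underline{\omega}}(x, y, u, v) = \int dz\, dw\, u^{(\varepsilon)}_K(x - z)\, g^{[k,N]}_\omega(x - w)\, W^{(1;4)(k)}_{\Delta,\varepsilon;\omega,\underline{\omega}}(z; w, y, u, v) , \tag{4.57} \]
so that, by the bounds (4.43), $\| W^{(1;4)(k)}_{\Delta,\varepsilon;\omega,\underline{\omega}} \| \le C |\lambda_\infty| \gamma^{-k} \gamma^{-\vartheta(N-k)}$ and $\|u^{(\varepsilon)}_K\|_{L^\infty} \le C\gamma^{2K}$, we get:
\[ \big\| W^{(1;3)(k)}_{(b)T,\varepsilon;\omega;\underline{\omega}} \big\| \le C |\lambda_\infty| \gamma^{-2(k-K)} \gamma^{-\vartheta(N-k)} . \tag{4.58} \]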



Fig. 7. Graphical representation of $W^{(1;3)(k)}_{(b)T,\varepsilon;\omega;\underline{\omega}}$, with panels (b1), (b2), (b3), (b4)

Let us now consider $W^{(1;3)(k)}_{(a)T,\varepsilon;\omega;\underline{\omega}}$; its Fourier transform, if we call $k^+$ and $k^-$ the momenta

of the two fields connected to the line u (ε) K , can be written as (notice that ω is of the form (ω, ω , ω )): (1;3)(k) (k; k+ , k− ) = 1 + ω(0;2)(k) (k + k+ − k− ) W gω[k+1,N ] (k + k+ − k− )W (a)T,ε;ω;ω (1;2)(k) (ε) − + − − W · u (k+ − k− ) (k + k − k , k ). K

σ

;σ,−εω,ω

(4.59) (−1)

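Then, if $\varepsilon = -1$, since $\|v^{(-1)}_K\|_{L^1} \le C$, by using the bounds (4.43) and (3.6), we find
\[ \big\| W^{(1;3)(k)}_{(a)T,-1;\omega;\underline{\omega}} \big\| \le C |\lambda_\infty| \gamma^{-\vartheta(N-k)} . \tag{4.60} \]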

Using the general bound (2.58) of [5], we get that the contributions to the derivatives of WT,ε with respect to η at distinct space points, coming from trees containing one (1;1)(k) (1;3)(k) (1;3)(k) endpoint associated with one of the kernels WT,−1;ω;ω , W(a)T,−1;ω;ω , W(a)T,−1;ω;ω , −N

are bounded by C k k!4 λk∞ δ −2k ( γ δ )ϑ , with 0 < ϑ < 1 and δ the minimal distance between the external points; hence they are vanishing in the removed cutoff limit. A similar conclusion is true for the contributions to the derivatives of WT,ε with respect to η at distinct points, coming from trees containing one endpoint associated (1;3)(k) (+) | ≤ γ −2l , it is easy to with W(a)T,+;ω;ω , even if u K (x) is not integrable. In fact, since | # (+1) −l show that dx|v K (x)| ≤ Cγ , so that the previous bound has to be multiplied by −l γ ; however, we take the limit −l → ∞ after the limit N → ∞, hence the conclusion is the same. Finally we have to consider the contributions to the correlation functions such that one of the ψ-fields in T1(ε) (ψ, β) with momentum q + p or q, see (4.47), is contracted at scale l. In such a case we can use the bound (i,l) U γ −l−i ω (q + p, q) , if |p| ≥ 2γ l+1 , (4.61) ≤ Cγ −(i−l) Dω (p) Z i−1



and the factor $\gamma^{-(i-l)}$ in the r.h.s. of this bound makes the scaling dimension of marginal terms with an external β line negative (there are no relevant terms); see §4.8 of [11] for details. Again by the bound (2.58) of [5] we get, for this kind of contribution to the derivatives of $W_{T,\varepsilon}$ with respect to η at distinct points, the bound $C^k (k!)^4 \lambda_\infty^k \delta^{-2k} (\gamma^{l}/\delta)^{\vartheta}$, with $0 < \vartheta < 1$, so they vanish as $l \to -\infty$. This completes the proof of Theorem 4.1.

Acknowledgements. P.F. is indebted to David Brydges for stimulating his interest in the topic with the request of a review seminar on the papers [29] and [22].

A. Fermionic Representation of the Partition Function A.1. Proof of (2.6). Since σx , σx = ±1, % $ = cosh(α) + σx σx+e j σy σy+e exp ασx σx+e j σy σy+e sinh(α), j j so that the partition function for the AT or 8V model with external fields is given by: Z (I, I ) = [cosh(β J4 )]2|| "

· 1 + tanh(β J4 ) j=0,1 x∈

∂2 j,x A ∂A j,x

! Z (I )Z (I ),

(A.1)

j,x = A j,x and A = A , while, in where I j,x = Aj,x + β J and, in the AT case, A j,x j,x 0,x = A0,x , A = A , A 1,x = A1,x+e0 , A = A the 8V case, A . 0,x 1,x 1,x 0,x+e1 Let us call t j,x , c j,x the expressions obtained from t j,x , c j,x by substituting A j,x with j,x ; in a similar way we define t j,x , cj,x . Let us now define: A t j,x f j,x = 1 + tanh(β J4 ) t j,x , g j,x = h j,x =

t j,x tanh(β J4 ) t j,x tanh(β J4 ) , g j,x = 2 , ( c j,x )2 f j,x ( c j,x ) f j,x 1 ( cj,x )2 ( c j,x )2

(A.2)

tanh(β J4 ) − g j,x g j,x . f j,x

we can write the partition function j,x and A By explicitly taking the derivatives w.r.t. A j,x (A.1) as ⎛ ⎞

Z (I, I ) = 4|| [cosh(β J4 )]2|| ⎝ f j,x c j,x cj,x ⎠ j,x

·

(−1) γ ,γ

δγ +δγ

4

Z γ ,γ (I, I ),

where Z γ ,γ (I, I ) is the Grassmannian functional integral Z γ ,γ (I, I ) = d H d V d H d V e S(t+g)+ S (t +g )+V (h) ,

(A.3)

(A.4)



with boundary conditions γ = (ε0 , ε1 ) and γ = (ε0 , ε1 ) on the variables H , V and H , V , respectively. Moreover S(t) and S (t) have a definition which depends on the model. S(t) is equal to S(t) in the AT model, while, in the 8V model, it is the function which is obtained from S(t), by substituting, in the first line of (2.4), V¯x Vx+e1 with V¯x+e0 Vx+e0 +e1 . S (t), in the AT case, is obtained from S(t), by simply replacing H, V with H , V , while, in the 8V case, we also have to substitute H¯ x Hx+e with V¯x Vx+e and V¯x Vx+e 0 1 1 with H¯ x+e1 Hx+e1 +e0 . V (h) is a quartic interaction that, in the AT case, is given by V AT (h) =

+ h 1,x V¯x Vx+e1 V¯x Vx+e h 0,x H¯ x Hx+e0 H¯ x Hx+e , 0 1

(A.5)

x∈

while, in the 8V case, is given by V8V (h) =

+ h 1,x V¯x+e0 Vx+e0 +e1 H¯ x+e H h 0,x H¯ x Hx+e0 V¯x Vx+e . (A.6) 1 1 x+e1 +e0

x∈

We remark that g j,x , g j,x , h j,x = O(β J4 ). The truncated correlations of the quadratic observables are obtained by taking two derivatives of ln Z (I, I ) w.r.t. the external sources in two different points, and putting such external sources to zero. The addends 2|| ln[2 cosh(β J4 )] and j,x (ln f j,x + ln c j,x + ln cj,x ) do not contribute when we take two derivatives in the A variables of two different points. If we define ∂ εj,x = ∂/∂ A j,x + ε∂/∂ Aj,x , we get: T Oxε ; Oyε =

i, j

⎤ δγ +δγ ε ε ⎣ ⎦ ∂i,x ∂ j,y ln (−1) Z γ ,γ (I, I ) γ ,γ ⎡

,

(A.7)

A≡0

so that we get, by some simple algebraic calculations: ∂ 2 Z¯ γ ,γ (A) 1 (−1)δγ +δγ Z γ ,γ ∂ A¯ εx ∂ A¯ εy A=0 ¯ ¯ 1 δγ +δγ ∂ Z γ ,γ (A) − (−1) 2 ε ( Z) ∂ A¯ x ¯

T Oxε ; Oyε =

γ ,γ

(−1)δγ +δγ

A=0 γ ,γ

∂ Z¯ γ ,γ (A) ∂ A¯ εy ¯

,

A=0

(A.8) with Z γ ,γ defined as in (2.7), with (γ , γ )-boundary conditions (instead of antiperiodic in all variables) and Z = γ ,γ (−1)δγ +δγ Z¯ γ ,γ (0, 0); the parameters s, s and λ are given by s = t j,x + g j,x A≡0 = tanh(β J ) + O(β J4 ), s = t j,x + g j,x = tanh(β J ) + O(β J4 ), A≡0 2λ = h j,x A≡0 = O(β J4 ),

(A.9)



and the parameters appearing in (2.10) and (2.11) are given by & ' ∂ ∂ +ε , qε = { t, g → t , g }, qε = ( ti,x + gi,x ) ∂ A j,x ∂ A j,x i A≡0 & ' ∂h i,x ∂h j,x +ε . (A.10) pε = ∂ A j,x ∂ A j,x i

A≡0

In order to prove (2.6) we note that, as proved in App. G of [22], if we put Z¯ γ = Z γ | A=0 , the quantities Z¯ γ ,γ (0)/ Z¯ γ Z¯ γ are exponentially insensitive to boundary conditions in the thermodynamic limit, away from the critical temperature; this implies that Z coincides, in the thermodynamic limit, with ( Z¯ γ¯ ,γ¯ (0)/ Z¯ γ¯ Z¯ γ¯ )( Z¯ )2 with γ¯ = (−, −) and Z¯ = γ (−1)δγ Z¯ γ . Notice that Z¯ is non vanishing; indeed, as proved in §4 of [26], away from the critical temperature | Z¯ γ | is exponentially insensitive to boundary conditions and below the critical temperature Z γ is positive for any γ while above is negative if γ = (+, +) and positive in all other cases. Moreover, as proved in App. G of [22], ∂ Z¯ γ ,γ (A) ∂ 2 Z¯ γ ,γ (A) 1 and Z¯ γ ,γ (0) ∂ A¯ εx A=0 Z¯ γ ,γ (0) ∂ 2 A¯ εx ∂ A¯ εy A=0 ¯ ¯ 1

(A.11)

are exponentially insensitive to boundary conditions, so that the r.h.s. of (A.8) coincides, in the thermodynamic limit, with

¯ ∂ 2 log Z¯ γ¯ ,γ¯ ( A) | A=0 ¯ . ∂ 2 A¯ εx ∂ A¯ εy

A.2. Proof of (2.12). In order to make the analogy of the above functional integral with the action of a fermionic (Euclidean) quantum field model more evident, it is convenient to make a change of variables in the Grassmann algebra. The new Grassmann variables will be denoted by $\psi_x$, $\bar\psi_x$, $\chi_x$ and $\bar\chi_x$ and are related to the old ones by the equations:

$$\bar H_x + iH_x = e^{i\pi/4}(\psi_x - \chi_x), \qquad \bar V_x + iV_x = \psi_x + \chi_x,$$
$$\bar H_x - iH_x = e^{-i\pi/4}(\bar\psi_x - \bar\chi_x), \qquad \bar V_x - iV_x = \bar\psi_x + \bar\chi_x. \qquad (A.12)$$

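A similar transformation is done for the primed variables. After a straightforward computation, we see that the action (2.4), calculated at $t_{j,x} = s$, $\forall j, x$, can be written in terms of the Majorana fields as

$$S(s) = A(\psi, m_s) + A(\chi, M_s) + Q(\psi,\chi), \qquad (A.13)$$

where $m_s = 1 - \sqrt{2} + s$, $M_s = 1 + \sqrt{2} + s$ and, if we define $\partial^i\psi_x = \psi_{x+e_i} - \psi_x$,

$$A(\psi,m) = \frac{s}{4}\sum_{x\in\Lambda}\Big[\psi_x\big(\partial^0 - i\partial^1\big)\psi_x + \text{c.c.}\Big] - im\sum_{x\in\Lambda}\bar\psi_x\psi_x + \frac{s}{4}\sum_{x\in\Lambda}\Big[\bar\psi_x\big(-i\partial^0 - i\partial^1\big)\psi_x + \text{c.c.}\Big]. \qquad (A.14)$$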



% s $ 0 ψx ∂ + i∂ 1 χx + {ψ ↔ χ } + c.c. 4 x∈ % s $ χ¯ x −i∂ 0 + i∂ 1 ψx + {ψ ↔ χ } + c.c. , − 4

Q(ψ, χ ) = −

(A.15)

x∈

where, in agreement with (A.12), we are calling complex conjugation (c.c.) the operation on the Grassmann algebra which amounts to exchange ψx with ψ¯ x , χx with χ¯ x and i with −i. The quartic interaction of the AT model becomes:

ψ¯ x ψx ψ¯ x ψx + ψ¯ x ψx χ¯ x χx + {ψ ↔ χ } V AT = −λ x∈

−λ

χ¯ x ψx χ¯ x ψx + χ¯ x ψx ψ¯ x χx + {ψ ↔ χ } + R V ,

(A.16)

x∈

where R V is the sum of quartic terms with at least one (discrete) derivative. In the case of the 8V model, the second square bracket has +λ in front, rather than −λ. If we set bε = (qε + εqε )/2 and dε = (qε − εqε )/2, the interaction with the external field is given by

¯ = −i bε A¯ εx ψ¯ x ψx + εψ¯ x ψx + χ¯ x χx + εχ¯ x χx B( A) x∈ ε=±

−i

dε A¯ εx ψ¯ x ψx − εψ¯ x ψx + χ¯ x χx − εχ¯ x χx + R B ,

x∈ ε=±

where R B is the sum of monomials quartic in the fields or quadratic with derivatives. We remark that, if J = J , then dε = 0, while bε = 1 − tanh(β J ) + O(β J4 ). We now make another change of variables, defined by the relations ε ψx,+ =

ψ¯ x − εi ψ¯ x ψx − εiψx ε = , ψx,− , ε = ±, √ √ 2 2

(A.17)

and the similar ones for the χ -variables. This change of variables is analogous in the euclidean theories of the transformation from Majorana fermions to Dirac fermions in real time QFT. If we put u = (s+s )/2, v = (s−s )/2 (s, s defined in (A.9)) and m ε = (m s +εm s )/2 (m s and m s defined after (A.13)), we get A(ψ, m s ) + A(ψ , m s ) $ % $ % u + − − + ψx,+ ∂ 0 − i∂ 1 ψx,+ ∂ 0 − i∂ 1 ψx,+ + ψx,+ + c.c. = 4 x∈ % $ % u − $ 0 − + + i∂ + i∂ 1 ψx,− i∂ 0 + i∂ 1 ψx,− + ψx,+ + c.c. + ψx,+ 4 $ % $ % v + + − − ∂ 0 − i∂ 1 ψx,+ ∂ 0 − i∂ 1 ψx,+ + ψx,+ + c.c. + ψx,+ 4 $ % $ % v − − + + i∂ 0 + i∂ 1 ψx,− i∂ 0 + i∂ 1 ψx,− + ψx,+ + c.c. + ψx,+ 4

( − − + − + − + + −im + ψx,− + im − ψx,+ , (A.18) ψx,+ − ψx,+ ψx,− ψx,− + ψx,+ ψx,−



ε with ψ −ε and i with −i. In where now the c.c. operation amounts to exchange ψx,ω x,−ω the new variables the interaction with the external source is given by − + + − ¯ =i B( A) (b+ A¯ +x + d− A¯ − x )[ψx,+ ψx,− − ψx,− ψx,+ x∈ − + + − + χx,+ χx,− − χx,− χx,+ ]+i

¯+ (b− A¯ − x + d+ Ax )

x∈ − + + − + + − − ¯ · [ψx,+ ψx,− + ψx,+ ψx,− + χx,+ χx,− + χx,+ χx,− ] + R B ( A),

(A.19)

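and $V(\psi,\chi)$ is given by (2.19).

Let D be the set of k's such that $k_0 = \frac{2\pi}{L}\left(n_0 + \frac{1}{2}\right)$ and $k_1 = \frac{2\pi}{L}\left(n_1 + \frac{1}{2}\right)$, for $n_0, n_1 = -\frac{L}{2}, \ldots, \frac{L}{2} - 1$, and L an even integer. Then the Fourier transform for the fermions with antiperiodic boundary conditions is defined by

$$\psi^{\varepsilon}_{x,\omega} \stackrel{\mathrm{def}}{=} \frac{1}{|\Lambda|}\sum_{k\in D} e^{i\varepsilon kx}\,\hat\psi^{\varepsilon}_{k,\omega}. \qquad (A.20)$$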
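The momentum set D above can be explored numerically. The following is a minimal sketch in one dimension only; the function names and the choice L = 8 are illustrative assumptions, not part of the paper. It builds the half-integer momenta and checks that the corresponding plane waves change sign under a shift by L, which is the antiperiodic boundary condition.

```python
import cmath
import math


def antiperiodic_momenta(L):
    """Momenta k = (2*pi/L) * (n + 1/2) for n = -L/2, ..., L/2 - 1 (L even)."""
    if L % 2 != 0:
        raise ValueError("L must be an even integer")
    return [2 * math.pi / L * (n + 0.5) for n in range(-L // 2, L // 2)]


def plane_wave(k, x):
    """The plane wave e^{ikx} entering the Fourier transform."""
    return cmath.exp(1j * k * x)


# Illustrative lattice size (an assumption made for this sketch).
L = 8
ks = antiperiodic_momenta(L)

# Antiperiodicity: e^{ik(x+L)} = -e^{ikx} for every k in the set.
max_defect = max(
    abs(plane_wave(k, x + L) + plane_wave(k, x)) for k in ks for x in range(L)
)

# The half-integer shift excludes k = 0; the smallest |k| is pi/L.
smallest = min(abs(k) for k in ks)
```

The shift by 1/2 in the allowed values of n is what removes the zero mode, so no momentum in D vanishes at finite L.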

Therefore (A.18) can be written as A(ψ, m s ) + A(ψ , m s ) =

u + k S(k)k , 2||

(A.21)

k∈D

where + + − , ψ − , ψ −k,+ −k,− k = (ψ ,ψ ), k,+ k,− − − + + + ,ψ ), k = (ψk,+ , ψk,− , ψ −k,+

(A.22)

−k,−

and , if we define µ(k) as in (2.15) and ω (k) = −i sin k0 + ω sin k1 , D v v σ (k) = (cos k0 + cos k1 − 2) + 2 , u u the matrix S(k) is given by ⎛ − (k) D ⎜ ⎜ ⎜ −iµ(k) ⎜ S(k) = ⎜ ⎜v ⎜ D ⎜ u − (k) ⎝ −iµ(k)

iµ(k)

v u

− (k) D

+ (k) D

−iσ (k)

+iµ(k)

− (k) D

+ (k) D

−iσ (k)

v u

iσ (k)

⎞

⎟ ⎟ D+ (k)⎟ ⎟ ⎟. ⎟ iσ (k) ⎟ ⎟ ⎠ D+ (k)

v u

(A.23)

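In the case $J = J'$ we have $v = 0$ and $\sigma(k) \equiv 0$, so that we get the much simpler equation

$$A(\psi, m_s) + A(\psi', m_{s'}) = -\frac{1}{|\Lambda|}\sum_{k\in D}\sum_{\omega,\omega'} \hat\psi^{+}_{k,\omega}\, T_{\omega,\omega'}(k)\, \hat\psi^{-}_{k,\omega'}, \qquad (A.24)$$

with $T_{\omega,\omega'}(k)$ given by (2.14) and

$$A(\chi, M_s) + A(\chi', M_{s'}) = -\frac{1}{|\Lambda|}\sum_{k\in D}\sum_{\omega,\omega'} \hat\chi^{+}_{k,\omega}\, T^{\chi}_{\omega,\omega'}(k)\, \hat\chi^{-}_{k,\omega'}, \qquad (A.25)$$

$T^{\chi}(k)$ being the matrix obtained from $T(k)$ by substituting $\mu(k)$ with (2.16). Moreover, $\varepsilon q'_\varepsilon = q_\varepsilon$, so that $d_\varepsilon = 0$ and $b_\varepsilon = q_\varepsilon$; it follows that $B(\bar A)$ can be written as in (2.17). This completes the proof of (2.12).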

References

1. Ashkin, J., Teller, E.: Statistics of two-dimensional lattices with four components. Phys. Rev. 64, 178–184 (1943)
2. Baxter, R.J.: Eight-vertex model in lattice statistics. Phys. Rev. Lett. 26, 832–833 (1971)
3. Baxter, R.J.: Exactly solved models in statistical mechanics. London: Academic Press, Inc., 1989
4. Barber, M., Baxter, R.J.: On the spontaneous order of the eight-vertex model. J. Phys. C 6, 2913–2921 (1973)
5. Benfatto, G., Falco, P., Mastropietro, V.: Functional integral construction of the massive Thirring model: verification of axioms and massless limit. Commun. Math. Phys. 273, 67–118 (2007)
6. Benfatto, G., Falco, P., Mastropietro, V.: Massless Sine-Gordon and massive Thirring models: proof of the Coleman's equivalence. Commun. Math. Phys. 285, 713–762 (2008)
7. Benfatto, G., Gallavotti, G.: Perturbation theory and the Fermi surface in a quantum liquid. A general quasi-particle formalism and one dimensional systems. J. Stat. Phys. 59, 541–664 (1990)
8. Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Beta functions and Schwinger functions for a many fermions system in one dimension. Commun. Math. Phys. 160, 93–171 (1994)
9. Benfatto, G., Mastropietro, V.: Renormalization group, hidden symmetries and approximate Ward identities in the XYZ model. Rev. Math. Phys. 13, 1323–1435 (2001)
10. Benfatto, G., Mastropietro, V.: On the density-density critical indices in interacting Fermi systems. Commun. Math. Phys. 231, 97–134 (2002)
11. Benfatto, G., Mastropietro, V.: Ward identities and chiral anomaly in the Luttinger liquid. Commun. Math. Phys. 258, 609–655 (2005)
12. Falco, P., Mastropietro, V.: Renormalization group and asymptotic spin–charge separation for chiral Luttinger liquid. J. Stat. Phys. 131, 79–116 (2008)
13. Giuliani, A., Mastropietro, V.: Anomalous critical exponents in the anisotropic Ashkin-Teller model. Phys. Rev. Lett. 93, 190603–07 (2004)
14. Giuliani, A., Mastropietro, V.: Anomalous universality in the anisotropic Ashkin–Teller model. Commun. Math. Phys. 256, 681–725 (2005)
15. Haldane, D.M.: General relation of correlation exponents and spectral properties of one dimensional Fermi systems: application to the anisotropic S = 1/2 Heisenberg chain. Phys. Rev. Lett. 45, 1358–1362 (1980)
16. Kadanoff, L.P.: Connections between the critical behavior of the planar model and that of the eight-vertex model. Phys. Rev. Lett. 39, 903–905 (1977)
17. Kadanoff, L.P., Brown, A.C.: Correlation functions on the critical lines of the Baxter and Ashkin-Teller models. Ann. Phys. 121, 318–345 (1979)
18. Kadanoff, L.P., Wegner, F.J.: Some critical properties of the eight-vertex model. Phys. Rev. B 4, 3989–3993 (1971)
19. Lesniewski, A.: Effective action for the Yukawa$_2$ quantum field theory. Commun. Math. Phys. 108, 437–467 (1987)
20. Luther, A., Peschel, I.: Calculations of critical exponents in two dimension from quantum field theory in one dimension. Phys. Rev. B 12, 3908–3917 (1975)
21. Mastropietro, V.: Non-universality in Ising models with four spin interaction. J. Stat. Phys. 111, 201–259 (2003)
22. Mastropietro, V.: Ising models with four spin interaction at criticality. Commun. Math. Phys. 244, 595–642 (2004)
23. Mastropietro, V.: Nonperturbative Adler-Bardeen theorem. J. Math. Phys. 48, 022302 (2007)
24. Mastropietro, V.: Non-perturbative aspects of chiral anomalies. J. Phys. A 40, 10349–10365 (2007)
25. Mastropietro, V.: Non-perturbative Renormalization. River Edge, NJ: World Scientific, 2008
26. McCoy, B., Wu, T.: The two dimensional Ising model. Cambridge, MA: Harvard Univ. Press, 1973
27. den Nijs, M.P.M.: Derivation of extended scaling relations between critical exponents in two dimensional models from the one dimensional Luttinger model. Phys. Rev. B 23, 6111–6125 (1981)
28. Pruisken, A.M.M., Brown, A.C.: Universality for the critical lines of the eight vertex, Ashkin-Teller and Gaussian models. Phys. Rev. B 23, 1459–1468 (1981)
29. Pinson, H., Spencer, T.: Unpublished
30. Samuel, S.: The use of anticommuting variable integrals in statistical mechanics. I. The computation of partition functions. J. Math. Phys. 21, 2806 (1980)
31. Smirnov, S.: Towards conformal invariance of 2D lattice models. Proceedings Madrid ICM, Zürich Europ. Math. Soc, 2006
32. Spencer, T.: A mathematical approach to universality in two dimensions. Physica A 279, 250–259 (2000)
33. Zamolodchikov, A.B., Zamolodchikov, Al.B.: Conformal field theory and 2D critical phenomena, part 1. Soviet Sci. Rev. A 10, 269 (1989)

Communicated by G. Gallavotti




Noncommutative Manifolds from Graph and k-Graph $C^*$-Algebras

David Pask (1), Adam Rennie (2,3), Aidan Sims (1)

1 School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, NSW, Australia. E-mail: [email protected]; [email protected]
2 Institute for Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark
3 Mathematical Sciences Institute, Australian National University, Canberra, ACT, Australia. E-mail: [email protected]

Received: 19 January 2007 / Accepted: 29 July 2009 Published online: 26 September 2009 – © Springer-Verlag 2009

Abstract: In [PRen] we constructed smooth (1, ∞)-summable semifinite spectral triples for graph algebras with a faithful trace, and in [PRS] we constructed (k, ∞)-summable semifinite spectral triples for k-graph algebras. In this paper we identify classes of graphs and k-graphs which satisfy a version of Connes’ conditions for noncommutative manifolds.

Contents

1. Introduction
2. Background on Graph $C^*$-Algebras and Spectral Triples
2.1 The $C^*$-algebras of graphs
2.2 Semifinite spectral triples
2.3 Summability
2.4 The gauge spectral triple for a graph $C^*$-algebra
3. Conditions for Locally Compact Semifinite Manifolds
3.1 The analytic conditions
3.2 The orientation and closedness conditions
3.3 The bimodule conditions
4. k-Graph Manifolds
References

1. Introduction The object of this paper is to address the general definition of noncommutative manifolds using new examples. The phrase ‘noncommutative manifold’ is one which is still open to some degree of interpretation. Broadly speaking, a noncommutative manifold is a spectral triple (A, H, D) satisfying some additional conditions, such as those originally


proposed by Connes, [C1]. However, the conditions presented in [C1] only make sense when A is unital (that is, a compact noncommutative space). Moreover, the proof of Connes’ spin manifold theorem in [C2] uses a version of Connes’ conditions which do not have an immediate generalisation to the noncommutative case, since the volume form is assumed antisymmetric. We suspect that supplementing the conditions in [C2] with the closedness condition discussed at length in [RV] will resolve this issue. In this paper we consider the set of conditions presented in [RV], and show that there is a natural generalisation of each to noncommutative, nonunital and semifinite spectral triples. In the process, we show that for certain graphs and k-graphs, the spectral triples constructed in [PRen,PRS] satisfy these conditions, making them reasonable candidates for the title of noncommutative manifolds. Conditions for noncompact noncommutative manifolds have previously been considered in [R1,R2,GGISV]. We have made an effort to generalise the conditions from [RV] in as minimal and stringent a way as possible. Nevertheless, our conditions must be regarded as provisional. Additional examples and applications are required to determine the ‘correct’ conditions characterising noncommutative manifolds. The vast majority of examples of noncommutative manifolds in this paper come from nonunital algebras (see [PRen,PRS]), so our conditions must address aspects of ‘noncompact noncommutative manifolds.’ Moreover, most of our examples are semifinite, in that the trace employed is not the operator trace on Hilbert space; it is a faithful normal semifinite trace on a different von Neumann algebra. This is not to say that the C ∗ -algebras arising in our examples do not admit type I spectral triples. By considering traces which reflect the geometry of the underlying graph (or k-graph), we are naturally led to semifinite spectral triples. For simplicity we discuss only graph algebras (i.e. 
algebras of 1-graphs) in detail; in a final section we summarise the k-graph situation, since it is largely similar. The main novel result for k-graph algebras is the construction of a Hochschild k-cycle representing the volume form. While the proof of the main result of [RV] is incorrect, several important results still hold. For instance, the conditions introduced in Sect. 3 do not employ Poincaré Duality in K-theory, but rather the Morita equivalence condition characterising spin$^c$ structures, [P]. The equivalence of these two conditions in the compact commutative case (in the presence of the other conditions) follows from [C2] and [RV, Sect. 7]. In addition, we do not consider the metric condition, since this has been shown to be redundant in the compact case [RV, App. B].

2. Background on Graph $C^*$-Algebras and Spectral Triples

2.1. The $C^*$-algebras of graphs. For a more detailed introduction to graph $C^*$-algebras we refer the reader to [BPRS,KPR,R] and the references therein. A directed graph $E = (E^0, E^1, r, s)$ consists of countable sets $E^0$ of vertices and $E^1$ of edges, and maps $r, s : E^1 \to E^0$ identifying the range and source of each edge. The graph is row-finite if each vertex emits at most finitely many edges and locally finite if it is row-finite and each vertex receives at most finitely many edges. We write $E^n$ for the set of paths $\mu = \mu_1\mu_2\cdots\mu_n$ of length $|\mu| := n$; that is, sequences of edges $\mu_i$ such that $r(\mu_i) = s(\mu_{i+1})$ for $1 \le i < n$. The maps r, s extend to $E^* := \bigcup_{n\ge 0} E^n$ in an obvious way. A loop in E is a path $L \in E^*$ with $s(L) = r(L)$; we say that a loop L has an exit if there is a vertex $v = s(L_i)$ for some i which emits more than one edge. If $V \subseteq E^0$ then we write


$V \ge w$ if there is a path $\mu \in E^*$ with $s(\mu) \in V$ and $r(\mu) = w$. A sink is a vertex $v \in E^0$ with $s^{-1}(v) = \emptyset$; a source is a vertex $w \in E^0$ with $r^{-1}(w) = \emptyset$. The infinite path space of E is denoted $E^\infty$; if E is row-finite then $E^\infty$ becomes a σ-compact topological space in the topology inherited from the product topology on the infinite product of copies of $E^1$. For a row-finite graph E, a Cuntz-Krieger E-family in a $C^*$-algebra B consists of mutually orthogonal projections $\{p_v : v \in E^0\}$ and partial isometries $\{S_e : e \in E^1\}$ satisfying the Cuntz-Krieger relations

$$S_e^* S_e = p_{r(e)} \ \text{ for } e \in E^1 \quad\text{and}\quad p_v = \sum_{\{e \,:\, s(e) = v\}} S_e S_e^* \ \text{ whenever } v \text{ is not a sink.}$$
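The graph-theoretic notions above (sinks, sources and paths of length n) are easy to compute for a finite example. The following is a minimal sketch under assumed conventions that are not from the paper: an edge is encoded as a (source, range) pair, a path is a tuple of edge indices, and the three-vertex graph is purely illustrative.

```python
# Hypothetical encoding (not from the paper): an edge is a (source, range)
# pair and a path is a tuple of indices into the edge list.


def sinks(vertices, edges):
    """Vertices v with s^{-1}(v) empty, i.e. vertices emitting no edge."""
    emitting = {s for (s, _r) in edges}
    return {v for v in vertices if v not in emitting}


def sources(vertices, edges):
    """Vertices w with r^{-1}(w) empty, i.e. vertices receiving no edge."""
    receiving = {r for (_s, r) in edges}
    return {v for v in vertices if v not in receiving}


def paths(edges, n):
    """All paths mu_1 ... mu_n (n >= 1) with r(mu_i) = s(mu_{i+1})."""
    current = [(i,) for i in range(len(edges))]
    for _ in range(n - 1):
        current = [
            mu + (j,)
            for mu in current
            for j, (s, _r) in enumerate(edges)
            if edges[mu[-1]][1] == s  # range of last edge equals source of next
        ]
    return current


# Illustrative graph: u -> v, v -> w, and a loop v -> v.
vertices = {"u", "v", "w"}
edges = [("u", "v"), ("v", "w"), ("v", "v")]

sink_set = sinks(vertices, edges)
source_set = sources(vertices, edges)
length_two = sorted(paths(edges, 2))
```

In this example the loop at v has an exit, because v also emits the edge to w; the Cuntz-Krieger relation at v then involves two range projections.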

Theorem 1.2 of [KPR] shows that there is a universal $C^*$-algebra $C^*(E)$ generated by a universal Cuntz-Krieger E-family $\{S_e, p_v\}$. A product $S_\mu := S_{\mu_1} S_{\mu_2} \cdots S_{\mu_n}$ is non-zero precisely when $\mu = \mu_1\mu_2\cdots\mu_n$ is a path in $E^n$. Since the Cuntz-Krieger relations imply that the projections $S_e S_e^*$ are also mutually orthogonal, we have $S_e^* S_f = 0$ unless $e = f$, and words in $\{S_e, S_f^*\}$ collapse to sums of products of the form $S_\mu S_\nu^*$ for $\mu, \nu \in E^*$ satisfying $r(\mu) = r(\nu)$ (cf. [KPR, Lemma 1.1]). Indeed, because the family $\{S_\mu S_\nu^*\}$ is closed under multiplication and involution, we have

$$C^*(E) = \overline{\operatorname{span}}\{S_\mu S_\nu^* : \mu, \nu \in E^* \text{ and } r(\mu) = r(\nu)\}. \qquad (1)$$

The algebraic relations and the density of $\operatorname{span}\{S_\mu S_\nu^*\}$ in $C^*(E)$ play a critical role. The universal property of $C^*(E)$ gives rise to a strongly continuous action γ of $S^1$ on $C^*(E)$, called the gauge action, satisfying $\gamma_z(S_e) = zS_e$ for all $e \in E^1$ and $\gamma_z(p_v) = p_v$ for all $v \in E^0$. Because $S^1$ is compact, averaging over γ with respect to normalised Haar measure gives an expectation Φ of $C^*(E)$ onto the fixed-point algebra $C^*(E)^\gamma$:

$$\Phi(a) := \frac{1}{2\pi}\int_{S^1} \gamma_z(a)\, d\theta \quad \text{for } a \in C^*(E),\ z = e^{i\theta}.$$

The map Φ is positive, has norm 1, and is faithful in the sense that $\Phi(a^*a) = 0$ implies $a = 0$. From Equation (1), it is easy to see that a graph $C^*$-algebra is unital if and only if the underlying graph is finite. When we consider infinite graphs, formulas which involve sums of projections may contain infinite sums. To interpret these, we use strict convergence in the multiplier algebra of $C^*(E)$. The following is proved in [PR].

Lemma 2.1. Let E be a row-finite graph, let A be a $C^*$-algebra generated by a Cuntz-Krieger E-family $\{T_e, q_v\}$, and let $\{p_n\}$ be a sequence of projections in A. If $p_n T_\mu T_\nu^*$ converges for every $\mu, \nu \in E^*$, then $\{p_n\}$ converges strictly to a projection $p \in M(A)$.

To build spectral triples we will require traces. The following two results from [PRen] characterise the ‘nice’ traces on graph algebras.

Lemma 2.2. Let E be a row-finite directed graph.
(i) If $C^*(E)$ has a faithful, semifinite trace then no loop can have an exit.
(ii) If $C^*(E)$ has a faithful, semifinite, lower semicontinuous, gauge-invariant trace τ then $\tau \circ \Phi = \tau$ and $\tau(S_\mu S_\nu^*) = \delta_{\mu,\nu}\,\tau(p_{r(\mu)})$.



This result was sharpened using the notion of a graph trace, which we also require in this paper.

Definition 2.3 (See [T]). If E is a row-finite directed graph, then a graph trace on E is a function $g : E^0 \to \mathbb{R}^+$ such that for any $v \in E^0$ which is not a sink we have

$$g(v) = \sum_{s(e) = v} g(r(e)). \qquad (2)$$
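Condition (2) can be checked mechanically on a finite graph. The sketch below is illustrative only: the encoding of edges as (source, range) pairs, the function names, and the reading of "faithful" as g(v) ≠ 0 for every vertex are assumptions made here, not statements taken from the paper.

```python
def is_graph_trace(g, vertices, edges):
    """Check g >= 0 and g(v) = sum of g(r(e)) over edges e with s(e) = v,
    for every vertex v that is not a sink."""
    for v in vertices:
        if g[v] < 0:
            return False
        targets = [r for (s, r) in edges if s == v]
        if targets and g[v] != sum(g[r] for r in targets):
            return False
    return True


def is_faithful(g, vertices):
    """Assumed meaning of faithful for this sketch: g vanishes at no vertex."""
    return all(g[v] != 0 for v in vertices)


# A vertex u emitting one edge to each of two sinks v and w.
tree_vertices = {"u", "v", "w"}
tree_edges = [("u", "v"), ("u", "w")]
tree_g = {"u": 3, "v": 1, "w": 2}

# A loop at v with an exit to the sink w: (2) reads g(v) = g(v) + g(w),
# which forces g(w) = 0.
loop_vertices = {"v", "w"}
loop_edges = [("v", "v"), ("v", "w")]
degenerate_g = {"v": 1, "w": 0}
positive_g = {"v": 1, "w": 1}
```

The second example is consistent with Lemma 2.2(i): when a loop has an exit, any function satisfying (2) must vanish somewhere downstream of the exit, so under the assumption above it cannot be faithful.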

Proposition 2.4. Let E be a row-finite directed graph. Then there is a one-to-one correspondence between faithful graph traces on E and faithful, semifinite, lower semicontinuous, gauge invariant traces on $C^*(E)$.

Another graph theoretic concept useful for the graphs we will be dealing with is the following.

Definition 2.5. Let E be a row-finite directed graph. An end will mean either a sink or the shift-tail equivalence class of an infinite path with no exits. For more information about shift-tail equivalence see [KPRR, §2].

Lemma 2.6. Let $A = C^*(E)$ be a graph $C^*$-algebra such that no loop in the locally finite graph E has an exit. Then

$$K_0(C^*(E)) = \mathbb{Z}^{\#\mathrm{ends}}, \qquad K_1(C^*(E)) = \mathbb{Z}^{\#\mathrm{loops}}.$$

In particular, $K_*(C^*(E))$ is finitely generated if there are finitely many ends in E.

2.2. Semifinite spectral triples. We begin with some semifinite versions of standard definitions and results. Let τ be a fixed faithful, normal, semifinite trace on the von Neumann algebra N. Let $K_N$ be the τ-compact operators in N (that is, the norm closed ideal generated by the projections $E \in N$ with $\tau(E) < \infty$).

Definition 2.7. A semifinite spectral triple (A, H, D) consists of a Hilbert space H, a ∗-algebra $A \subset N$, where N is a semifinite von Neumann algebra acting on H, and a densely defined unbounded self-adjoint operator D affiliated to N such that
1) [D, a] is densely defined and extends to a bounded operator in N for all $a \in A$, and
2) $a(\lambda - D)^{-1} \in K_N$ for all $\lambda \notin \mathbb{R}$ and all $a \in A$.
The triple is said to be even if there is $\Gamma \in N$ such that $\Gamma^* = \Gamma$, $\Gamma^2 = 1$, $a\Gamma = \Gamma a$ for all $a \in A$ and $D\Gamma + \Gamma D = 0$. Otherwise it is odd.

Definition 2.8. A semifinite spectral triple (A, H, D) is $QC^k$ for $k \ge 1$ (Q for quantum) if for all $a \in A$ the operators a and [D, a] are in the domain of $\delta^k$, where $\delta(T) = [|D|, T]$ is the partial derivation on N defined by |D|. We say that (A, H, D) is $QC^\infty$ if it is $QC^k$ for all $k \ge 1$.

Note. The notation is meant to be analogous to the classical case, but we introduce the Q so that there is no confusion between quantum differentiability of $a \in A$ and classical differentiability of functions.

Remarks concerning derivations and commutators. By partial derivation we mean that δ is defined on some subalgebra of N which need not be (weakly) dense in N. More


611

precisely, $\operatorname{dom}\delta = \{T \in N : \delta(T) \text{ is bounded}\}$. We also note that if $T \in N$, one can show that $[|D|, T]$ is bounded if and only if $[(1 + D^2)^{1/2}, T]$ is bounded, by using the functional calculus to show that $|D| - (1 + D^2)^{1/2}$ extends to a bounded operator in N. In fact, writing $|D|_1 = (1 + D^2)^{1/2}$ and $\delta_1(T) = [|D|_1, T]$ we have

$$\operatorname{dom}\delta^n = \operatorname{dom}\delta_1^n \quad \text{for all } n.$$

We also observe that if $T \in N$ and [D, T] is bounded, then $[D, T] \in N$. Similar comments apply to $[|D|, T]$, $[(1 + D^2)^{1/2}, T]$. The proofs of these statements can be found in [CPRS2].

Definition 2.9. A ∗-algebra $\mathcal{A}$ is smooth if it is Fréchet and ∗-isomorphic to a proper dense subalgebra $i(\mathcal{A})$ of a $C^*$-algebra A which is stable under the holomorphic functional calculus.

Thus saying that $\mathcal{A}$ is smooth means that $\mathcal{A}$ is Fréchet and a pre-$C^*$-algebra. Asking for $i(\mathcal{A})$ to be a proper dense subalgebra of A immediately implies that the Fréchet topology of $\mathcal{A}$ is finer than the $C^*$-topology of A (since Fréchet means locally convex, metrizable and complete). It has been shown that if $\mathcal{A}$ is smooth in A then $M_n(\mathcal{A})$ is smooth in $M_n(A)$, [GVF,S]. This ensures that the K-theories of $\mathcal{A}$ and A are isomorphic, the isomorphism being induced by the inclusion map. Stability under the holomorphic functional calculus extends to nonunital algebras, since the spectrum of an element in a nonunital algebra is defined to be the spectrum of this element in the ‘one-point’ unitization, though we must of course restrict to functions satisfying $f(0) = 0$. Likewise, the definition of a Fréchet algebra does not require a unit. The point of contact between smooth algebras and $QC^\infty$ spectral triples is the following lemma, proved in [R1].

Lemma 2.10. If (A, H, D) is a $QC^\infty$ spectral triple, then $(A_\delta, H, D)$ is also a $QC^\infty$ spectral triple, where $A_\delta$ is the completion of A in the locally convex topology determined by the seminorms

$$q_{n,i}(a) = \|\delta^n d^i(a)\|, \quad n \ge 0,\ i = 0, 1,$$

where $d(a) = [D, a]$. Moreover, $A_\delta$ is a smooth algebra.

We call the topology on A determined by the seminorms $q_{n,i}$ of Lemma 2.10 the δ-topology. Whilst smoothness does not depend on whether A is unital or not, many analytical problems arise because of the lack of a unit. As in [R1,R2,GGISV], we make two definitions to address these issues.

Definition 2.11. An algebra A has local units if for every finite subset of elements $\{a_i\}_{i=1}^n \subset A$, there exists $\phi \in A$ such that for each i, $\phi a_i = a_i \phi = a_i$.

Definition 2.12. Let A be a Fréchet algebra and $A_c \subseteq A$ be a dense subalgebra with local units. Then we call A a quasi-local algebra (when $A_c$ is understood). If $A_c$ is a dense ideal with local units, we call $A_c \subset A$ local.

Separable quasi-local algebras have an approximate unit $\{\phi_n\}_{n\ge 1} \subset A_c$ such that $\phi_{n+1}\phi_n = \phi_n$, [R1].



Example. For a graph $C^*$-algebra $A = C^*(E)$, Eq. (1) shows that

$$A_c = \operatorname{span}\{S_\mu S_\nu^* : \mu, \nu \in E^* \text{ and } r(\mu) = r(\nu)\}$$

is a dense subalgebra. It has local units because

$$p_v S_\mu S_\nu^* = \begin{cases} S_\mu S_\nu^* & v = s(\mu) \\ 0 & \text{otherwise.} \end{cases}$$

Similar comments apply to right multiplication by $p_{s(\nu)}$. By summing the source and range projections (without repetitions) of all $S_{\mu_i} S_{\nu_i}^*$ appearing in a finite sum

$$a = \sum_i c_{\mu_i,\nu_i} S_{\mu_i} S_{\nu_i}^*,$$

we obtain a local unit for $a \in A_c$. By repeating this process for any finite collection of such $a \in A_c$ we see that $A_c$ has local units.

We also require that when we have a spectral triple the operator D is compatible with the quasi-local structure of the algebra, in the following sense.

Definition 2.13. If (A, H, D) is a spectral triple, then we define $C_D(A)$ to be the algebra generated by A and [D, A].

Definition 2.14. A local spectral triple (A, H, D) is a spectral triple with A quasi-local such that there exists an approximate unit $\{\phi_n\} \subset A_c$ for A satisfying

$$C_D(A_c) = \bigcup_n C_D(A)_n,$$

where $C_D(A)_n = \{\omega \in C_D(A) : \phi_n\omega = \omega\phi_n = \omega\}$.

Remark. A local spectral triple has a local approximate unit $\{\phi_n\}_{n\ge 1} \subset A_c$ such that $\phi_{n+1}\phi_n = \phi_n\phi_{n+1} = \phi_n$ and $\phi_{n+1}[D, \phi_n] = [D, \phi_n]\phi_{n+1} = [D, \phi_n]$, [R2]. This is the crucial property we require to prove most of our summability results, to which we now turn.

2.3. Summability. In the following, let N be a semifinite von Neumann algebra with faithful normal trace τ. Recall from [FK] that if $S \in N$, the t-th generalized singular value of S for each real $t > 0$ is given by

$$\mu_t(S) = \inf\{\|SE\| : E \text{ is a projection in } N \text{ with } \tau(1 - E) \le t\}.$$

The ideal $L^1(N)$ consists of those operators $T \in N$ such that $\|T\|_1 := \tau(|T|) < \infty$, where $|T| = \sqrt{T^*T}$. In the Type I setting this is the usual trace class ideal. We will simply write $L^1$ for this ideal in order to simplify the notation, and denote the norm on $L^1$ by $\|\cdot\|_1$. An alternative definition in terms of singular values is that $T \in L^1$ if $\|T\|_1 := \int_0^\infty \mu_t(T)\,dt < \infty$.
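In the simplest Type I case the generalized singular value function is a step function, which makes the two descriptions of the trace norm easy to compare. The sketch below assumes, for illustration only, that N is the algebra of n×n matrices with the standard trace and that S is diagonal; in that setting the co-rank of a projection is an integer, so μ_t(S) is the (⌊t⌋+1)-th largest absolute value of a diagonal entry. The function names are hypothetical.

```python
def generalized_singular_value(diagonal, t):
    """mu_t(S) for a diagonal matrix S with the standard trace (assumed setting).

    A projection E with trace(1 - E) <= t can remove at most floor(t)
    directions, so the infimum of ||SE|| is the (floor(t) + 1)-th largest
    singular value, or 0 once every direction can be removed.
    """
    singular_values = sorted((abs(d) for d in diagonal), reverse=True)
    k = int(t)  # floor, since t > 0
    return singular_values[k] if k < len(singular_values) else 0.0


def trace_norm(diagonal):
    """||S||_1 = trace(|S|) for a diagonal matrix."""
    return sum(abs(d) for d in diagonal)


# Illustrative diagonal operator.
S = [3, -1, 2, 0]

# mu_t is constant on each interval [k, k + 1), so the integral of mu_t over
# (0, infinity) is the sum of its values at the midpoints k + 1/2.
integral = sum(generalized_singular_value(S, k + 0.5) for k in range(len(S)))
```

Here the integral of the step function reproduces the trace norm, which is the finite-dimensional instance of the alternative definition of $L^1$ given above.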



Note that in the case where $N \neq B(H)$, $L^1$ is not necessarily complete in this norm but it is complete in the norm $\|\cdot\|_1 + \|\cdot\|_\infty$ (where $\|\cdot\|_\infty$ is the uniform norm). Another important ideal for us is the domain of the Dixmier trace:

$$L^{(1,\infty)}(N) = \Big\{T \in N : \|T\|_{L^{(1,\infty)}} := \sup_{t>0} \frac{1}{\log(1+t)} \int_0^t \mu_s(T)\,ds < \infty\Big\}.$$

We will suppress the (N) in our notation for these ideals, as N will always be clear from context. The reader should note that $L^{(1,\infty)}$ is often taken to mean an ideal in the algebra $\widetilde{N}$ of τ-measurable operators affiliated to N. Our notation is however consistent with that of [C] in the special case $N = B(H)$. With this convention the ideal of τ-compact operators, K(N), consists of those $T \in N$ (as opposed to $\widetilde{N}$) such that

$$\mu_\infty(T) := \lim_{t\to\infty} \mu_t(T) = 0.$$

Definition 2.15. A semifinite local spectral triple is $(1, \infty)$-summable if $a(\mathcal{D} - \lambda)^{-1} \in \mathcal{L}^{(1,\infty)}$ for all $a \in \mathcal{A}_c$, $\lambda \in \mathbb{C} \setminus \mathbb{R}$.

Remark. If $\mathcal{A}$ is unital, $\ker \mathcal{D}$ is $\tau$-finite dimensional. Note that the summability requirements are only for $a \in \mathcal{A}_c$. We do not assume that elements of the algebra $\mathcal{A}$ are all integrable in the nonunital case. Strictly speaking, this definition describes local $(1, \infty)$-summability; however, we use the terminology $(1, \infty)$-summable to be consistent with the unital case.

We need to briefly discuss the Dixmier trace, but fortunately we will usually be applying it in reasonably simple situations. For more information on semifinite Dixmier traces, see [CPS2]. For $T \in \mathcal{L}^{(1,\infty)}$, $T \ge 0$, the function
$$F_T : t \mapsto \frac{1}{\log(1+t)} \int_0^t \mu_s(T)\,ds$$
is bounded. For certain generalised limits $\omega \in L^\infty(\mathbb{R}^*_+)^*$, we obtain a positive functional on $\mathcal{L}^{(1,\infty)}$ by setting $\tau_\omega(T) = \omega(F_T)$. This is the Dixmier trace associated to the semifinite normal trace $\tau$, denoted $\tau_\omega$, and we extend it to all of $\mathcal{L}^{(1,\infty)}$ by linearity, where of course it is a trace. The Dixmier trace $\tau_\omega$ is defined on the ideal $\mathcal{L}^{(1,\infty)}$, and vanishes on the ideal of trace class operators. Whenever the function $F_T$ has a limit at infinity, all Dixmier traces return the value of the limit. We denote the common value of all Dixmier traces on measurable operators by $\fint$. So if $T \in \mathcal{L}^{(1,\infty)}$ is measurable, for any allowed functional $\omega \in L^\infty(\mathbb{R}^*_+)^*$ we have
$$\tau_\omega(T) = \omega(F_T) = \fint T.$$

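Example. Let $\mathcal{D} = \frac{1}{i}\frac{d}{d\theta}$ act on $L^2(S^1)$. Then it is well known that the spectrum of $\mathcal{D}$ consists of eigenvalues $\{n \in \mathbb{Z}\}$, each with multiplicity one. So, using the standard operator trace, Trace, the function $F_{(1+\mathcal{D}^2)^{-1/2}}$ is
$$\frac{1}{\log 2N} \sum_{n=-N}^{N} (1 + n^2)^{-1/2},$$
and this is bounded. Hence $(1 + \mathcal{D}^2)^{-1/2} \in \mathcal{L}^{(1,\infty)}$ and
$$\mathrm{Trace}_\omega\big((1 + \mathcal{D}^2)^{-1/2}\big) = \fint (1 + \mathcal{D}^2)^{-1/2} = 2. \qquad (3)$$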
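The value 2 in Eq. (3) can be checked numerically. The following is only a minimal sketch: it evaluates the normalised partial sums from the example above for increasing cutoffs N. The function name `partial_dixmier_ratio` is a hypothetical helper introduced here for illustration, not notation from the paper, and a finite computation can only suggest the limit, not prove it.

```python
import math


def partial_dixmier_ratio(N: int) -> float:
    """Return (1 / log(2N)) * sum_{n=-N}^{N} (1 + n^2)^(-1/2).

    This is the normalised partial sum appearing in the circle example;
    the eigenvalues of D = (1/i) d/dtheta on L^2(S^1) are the integers n.
    """
    total = sum(1.0 / math.sqrt(1.0 + n * n) for n in range(-N, N + 1))
    return total / math.log(2 * N)


if __name__ == "__main__":
    # Print the ratio for growing cutoffs; the values should settle near 2.
    for N in (10, 100, 1_000, 10_000):
        print(N, partial_dixmier_ratio(N))
```

The sum grows like $2\log N$ plus a constant, which is why dividing by $\log 2N$ produces a bounded function whose limit is 2.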

In [R1,R2] we proved numerous properties of local algebras. The introduction of quasi-local algebras in [GGISV] led us to review the validity of many of these results for quasi-local algebras. Most of the summability results of [R2] are valid in the quasi-local setting. In addition, the summability results of [R2] are also valid for general semifinite spectral triples since they rely only on properties of the ideals $\mathcal{L}^{(p,\infty)}$, $p \ge 1$, [C,CPS2], and the trace property. We quote the version of the summability results from [R2] that we require below.

Proposition 2.16 ([R2]). Let $(\mathcal{A}, \mathcal{H}, \mathcal{D})$ be a $QC^\infty$, local $(1, \infty)$-summable semifinite spectral triple relative to $(\mathcal{N}, \tau)$. Let $T \in \mathcal{N}$ satisfy $T\phi = \phi T = T$ for some $\phi \in \mathcal{A}_c$. Then $T(1 + \mathcal{D}^2)^{-1/2} \in \mathcal{L}^{(1,\infty)}$. For $\mathrm{Re}(s) > 1$, $T(1 + \mathcal{D}^2)^{-s/2}$ is trace class. If the limit
$$\lim_{s \to 1/2^+} (s - 1/2)\,\tau\big(T(1 + \mathcal{D}^2)^{-s}\big)$$
exists, then it is equal to
$$\frac{1}{2} \fint T(1 + \mathcal{D}^2)^{-1/2}. \qquad (4)$$
In addition, for any Dixmier trace $\tau_\omega$, the function $a \mapsto \tau_\omega(a(1 + \mathcal{D}^2)^{-1/2})$ defines a trace on $\mathcal{A}_c \subset \mathcal{A}$.

2.4. The gauge spectral triple for a graph $C^*$-algebra. In this section we summarise the construction of a Kasparov module and a semifinite spectral triple for locally finite directed graphs with no sources. This material is based on [PRen]. We begin by constructing a Kasparov module. For $E$ a row finite directed graph, we set $A = C^*(E)$, $F = C^*(E)^\gamma$, the fixed point algebra for the $S^1$ gauge action. The algebras $A_c$, $F_c$ are defined as the finite linear span of the generators. Right multiplication makes $A$ into a right $F$-module, and similarly $A_c$ is a right module over $F_c$. We define an $F$-valued inner product $(\cdot|\cdot)_R$ on both these modules by
$$(a|b)_R := \Phi(a^* b),$$
where $\Phi$ is the expectation $A \to F$. Completing $A$ in the norm
$$\|x\|_X^2 := \|(x|x)_R\|_F = \|\Phi(x^* x)\|_F$$
gives us a right $C^*$-$F$-module denoted $X$. The algebra $A$ acting by multiplication on the left of $X$ provides a representation of $A$ as adjointable operators on $X$.


We let $X_c$ be the copy of $A_c \subset X$. For each $k \in \mathbb{Z}$, define an adjointable endomorphism $\Phi_k$ on $X$ by
$$\Phi_k(x) = \frac{1}{2\pi} \int_{S^1} z^{-k} \gamma_z(x)\,d\theta, \quad z = e^{i\theta},\ x \in X,$$
so
$$\Phi_k(S_\alpha S_\beta^*) = \begin{cases} S_\alpha S_\beta^* & |\alpha| - |\beta| = k \\ 0 & |\alpha| - |\beta| \neq k. \end{cases} \qquad (5)$$

Proposition 2.17 ([PRen]). Let $X$ be the right $C^*$-$F$-module defined above. Let
$$X_{\mathcal{D}} = \Big\{x \in X : \Big\|\sum_{k \in \mathbb{Z}} k^2 (x_k|x_k)_R\Big\| < \infty\Big\},$$
where $x_k = \Phi_k(x)$, and define $\mathcal{D} : X_{\mathcal{D}} \to X$ by
$$\mathcal{D} \sum_{k \in \mathbb{Z}} x_k = \sum_{k \in \mathbb{Z}} k x_k.$$
Then $\mathcal{D}$ is closed, self-adjoint and regular.

Theorem 2.18. Suppose that the graph $E$ is locally finite and has no sources. Let $V = \mathcal{D}(1 + \mathcal{D}^2)^{-1/2}$. Then $(X, V)$ is an odd Kasparov module for $A$-$F$ and so defines an element of $KK^1(A, F)$.

To construct a semifinite spectral triple, we suppose that our graph $C^*$-algebra also has a faithful gauge invariant trace $\tau : A \to \mathbb{C}$. Using $\tau$, we define a $\mathbb{C}$-valued inner product $\langle \cdot, \cdot \rangle$ on $X_c$ by
$$\langle x, y \rangle := \tau((x|y)_R) = \tau(\Phi(x^* y)) = \tau(x^* y),$$
the last equality following from the gauge invariance of $\tau$. Denote the Hilbert space completion of $X_c$ by $\mathcal{H} = L^2(X, \tau)$. The operator $\mathcal{D}$ extends to a self-adjoint operator on $\mathcal{H}$, [PRen, Lemma 5.5], and for all $a \in A_c$ the commutator $[\mathcal{D}, a]$ extends to a bounded operator on $\mathcal{H}$.

Lemma 2.19. The algebra $A_c$ and the linear space $[\mathcal{D}, A_c]$ are contained in the smooth domain of the derivation $\delta$, where for $T \in \mathcal{B}(\mathcal{H})$, $\delta(T) = [|\mathcal{D}|, T]$. So the completion of $A_c$ in the $\delta$-topology, which we denote by $\mathcal{A}$, is a Fréchet pre-$C^*$-algebra. Moreover $\mathcal{A}$ is a quasi-local algebra with dense subalgebra $A_c$, and $C_{\mathcal{D}}(A_c) \subset C_{\mathcal{D}}(\mathcal{A})$ is also quasi-local.

The last piece of information we require is the von Neumann algebra and trace which give us a semifinite spectral triple. Let $\mathrm{End}^{00}_F(X_c)$ be the finite rank endomorphisms of the pre-$C^*$-module $X_c$.

Proposition 2.20. Let $\mathcal{N} = (\mathrm{End}^{00}_F(X_c))''$. Then there exists a faithful, normal, semifinite trace $\tilde{\tau} : \mathcal{N} \to \mathbb{C}$ such that for all rank one endomorphisms $\Theta_{x,y}$ of $X_c$ we have
$$\tilde{\tau}(\Theta_{x,y}) = \tau((y|x)_R), \quad x, y \in X_c.$$
Moreover, for all $a \in \mathcal{A}$ and $\lambda \in \mathbb{C} \setminus \mathbb{R}$, the operator $a(\lambda - \mathcal{D})^{-1}$ lies in $\mathcal{K}_{\mathcal{N}}$.

Hence we obtain a semifinite spectral triple. However, more is true.



Theorem 2.21. Let E be a locally finite graph with no sources, and let τ be a faithful, semifinite, norm lower-semicontinuous, gauge invariant trace on C ∗ (E). Then (A, H, D) is a QC ∞ (1, ∞)-summable odd local semifinite spectral triple (relative to (N , τ˜ )). For all a ∈ A the operator a(1 + D2 )−1/2 is not trace class. For any v ∈ E 0 which does not connect to a sink we have τ˜ω ( pv (1 + D2 )−1/2 ) = 2τ ( pv ), where τ˜ω is any Dixmier trace associated to τ˜ . The main point is that for v ∈ E 0 such that v does not connect to a sink, and for k ∈ Z we have τ˜ ( pv k ) = τ ( pv ). This is the spectral triple we will be working with for the rest of the paper, and we refer to it as the gauge spectral triple of the directed graph E (or algebra C ∗ (E)). We remind the reader that the existence of this spectral triple depends only on the graph E being locally finite with no sources, and the existence of a faithful, semifinite, gauge invariant, norm lower-semicontinuous trace τ : A → C. The latter is a nontrivial condition. 3. Conditions for Locally Compact Semifinite Manifolds We now review in turn the conditions for noncommutative manifolds as presented in [RV]. We will consider natural generalisations of these conditions to the semifinite and nonunital setting and consider when the gauge spectral triple satisfies these generalisations. We will present each condition as stated for the type I and unital case, where (N , τ ) = (B(H), Trace) and (1 + D2 )−1/2 ∈ K(H), and then present the necessary modification of the condition, if it requires modification. When dealing with these generalisations we will suppose that (A, H, D) is a local semifinite spectral triple relative to N , τ˜ . We will not suppose that A is unital, but will suppose that Ac ⊂ A gives us a quasi-local algebra. When considering the conditions as applied to graph algebras, we will suppose that E is a locally finite directed graph with no sources and possessing a faithful graph trace g. 
We will let (A, H, D) be the gauge spectral triple of E described in the previous section. The conditions are somewhat interdependent, and we have found it is difficult to present them in a logical fashion. It seems that this difficulty is greatly eased if we assume at the outset that the Hilbert space H carries commuting representations π : A → B(H) and π op : Aop → B(H). The former representation actually has π(A) ⊂ N ⊂ B(H), but we do not assume this for the latter representation. We will explicitly state this bimodule requirement again when we look at the first order condition, but it will be apparent that several of our conditions require a bimodule structure for their statement. In all the following, we identify a ∈ A with π(a) ∈ N unless stated otherwise. 3.1. The analytic conditions. Old Condition 1 (Dimension). The type I unital spectral triple (A, H, D) is ( p, ∞)summable for a fixed positive integer p, for which Traceω ((1 + D 2 )− p/2 ) > 0 for all Dixmier limits ω.


To generalise this condition we evidently need to replace the operator trace, Trace, by the trace $\tilde{\tau} : \mathcal{N} \to \mathbb{C}$ which determines the compactness and summability requirements of our spectral triple. We also need to restate the requirement, since in general for a nonunital spectral triple, even type I, we will not have $(1 + \mathcal{D}^2)^{-p/2} \in \mathcal{L}^{(1,\infty)}$, [R2]. So we have a simultaneous generalisation to the nonunital and semifinite case.

New Condition 1 (Semifinite Nonunital Dimension). The local semifinite spectral triple $(\mathcal{A}, \mathcal{H}, \mathcal{D})$ is $(p, \infty)$-summable for a fixed positive integer $p$, for which $\tilde{\tau}_\omega(a(1 + \mathcal{D}^2)^{-p/2}) > 0$ for all $\omega$ and all $0 \neq a \in \mathcal{A}_c$ with $a \ge 0$.

Remark. In the type I setting, we also have the condition of Absolute Continuity which states: For all nonzero $a \in \mathcal{A}$ with $a \ge 0$, and for any $\omega$-limit, the following Dixmier trace is positive: $\mathrm{Trace}_\omega(a(1 + \mathcal{D}^2)^{-p/2}) > 0$. This is half of Connes' finiteness and absolute continuity condition, [C1,C2,GVF], the other half being finiteness discussed in Sect. 3.3 below; see also [RV]. Here we have demanded positivity only for positive elements of $\mathcal{A}_c$, but this extends to positive elements of $\mathcal{A}$, provided we allow the value $+\infty$. Of course our reformulation of the dimension condition already subsumes a semifinite version of absolute continuity, so the natural generalisation of the absolute continuity condition is already satisfied by our gauge spectral triples. This shows that even in the unital case it makes sense to combine the dimension and absolute continuity conditions, as mentioned in [RV]. Thus our formulation of the conditions has rendered the absolute continuity condition redundant.

This generalisation of the dimension condition is satisfied by the gauge spectral triple of a directed graph with $p = 1$. Provided the graph $E$ has no sinks this follows from Theorem 2.21 since the Dixmier trace of $a^* a$, $0 \neq a \in \mathcal{A}_c$, is given by
$$\tilde{\tau}_\omega(a^* a(1 + \mathcal{D}^2)^{-1/2}) = 2\tau(a^* a) > 0.$$
Even if the graph has sinks, the proof of Theorem 2.21 in [PRen] shows that we still have positivity. Old Condition 2 (Regularity). The spectral triple (A, H, D) is QC ∞ . Without loss of generality, we assume that A is complete in the δ-topology and so is a Fréchet pre-C ∗ -algebra. It follows from Lemma 2.19 that this condition is satisfied with no need to modify it at all. New Condition 2 (Regularity). The spectral triple (A, H, D) is QC ∞ . Without loss of generality, we assume that A is complete in the δ-topology and so is a Fréchet pre-C ∗ -algebra.

3.2. The orientation and closedness conditions. This section examines the orientation and closedness conditions. The orientability condition for spectral triples with unital algebra $\mathcal{A}$ is

Old Condition 3 (Orientability). Let $p$ be the metric dimension of $(\mathcal{A}, \mathcal{H}, \mathcal{D})$. We require that the spectral triple be even, with $\mathbb{Z}_2$-grading $\Gamma$, if and only if $p$ is even. For convenience, we take $\Gamma = \mathrm{Id}_{\mathcal{H}}$ when $p$ is odd. We say the spectral triple $(\mathcal{A}, \mathcal{H}, \mathcal{D})$ is orientable if there exists a Hochschild $p$-cycle
$$c = \sum_{\alpha=1}^{n} a^0_\alpha \otimes b^{op}_\alpha \otimes a^1_\alpha \otimes \cdots \otimes a^p_\alpha \in Z_p(\mathcal{A}, \mathcal{A} \otimes \mathcal{A}^{op}) \qquad (6a)$$
whose Hochschild class $[c] \in HH_p(\mathcal{A}, \mathcal{A} \otimes \mathcal{A}^{op})$ may be called the "orientation" of $(\mathcal{A}, \mathcal{H}, \mathcal{D})$, such that
$$\pi_{\mathcal{D}}(c) := \sum_\alpha a^0_\alpha b^{op}_\alpha [\mathcal{D}, a^1_\alpha] \cdots [\mathcal{D}, a^p_\alpha] = \Gamma. \qquad (6b)$$
Here $\mathcal{A} \otimes \mathcal{A}^{op}$ is a bimodule for $\mathcal{A}$ via
$$a \cdot (x \otimes y^{op}) = ax \otimes y^{op}, \quad (x \otimes y^{op}) \cdot a = xa \otimes y^{op}, \quad a, x, y \in \mathcal{A}.$$

Now, typically, we have a nonunital algebra, and require a different formulation. We adopt the attitude that we should have a locally finite but possibly infinite cycle, as would be the case for a volume form on a noncompact manifold.

New Condition 3 (Nonunital Orientability). Let $p$ be the metric dimension of $(\mathcal{A}, \mathcal{H}, \mathcal{D})$. We require that the spectral triple be even, with $\mathbb{Z}_2$-grading $\Gamma$, if and only if $p$ is even. For convenience, we take $\Gamma = \mathrm{Id}_{\mathcal{H}}$ when $p$ is odd. We say the spectral triple $(\mathcal{A}, \mathcal{H}, \mathcal{D})$ is orientable if there exists a Hochschild $p$-cycle
$$c = \sum_{\alpha=1}^{\infty} a^0_\alpha \otimes b^{op}_\alpha \otimes a^1_\alpha \otimes \cdots \otimes a^p_\alpha \qquad (7a)$$
whose Hochschild class $[c]$ may be called the "orientation" of $(\mathcal{A}, \mathcal{H}, \mathcal{D})$, such that
$$\pi_{\mathcal{D}}(c) := \sum_\alpha a^0_\alpha b^{op}_\alpha [\mathcal{D}, a^1_\alpha] \cdots [\mathcal{D}, a^p_\alpha] = \Gamma, \qquad (7b)$$
where the sum in (7b) converges strongly.

Remark. We have deliberately omitted any mention of the homology groups that $c$ should belong to, there being many possibilities and few examples to guide us. We offer one possible candidate, without examining the subject in detail.

Let $C_n(\mathcal{A}_c, \mathcal{A}_c \otimes \mathcal{A}_c^{op})$ be the linear space of algebraic Hochschild $n$-chains for $\mathcal{A}_c$. Suppose $\mathcal{A}$ is the completion of $\mathcal{A}_c$ in the topology determined by the seminorms $q_k$, let $\{q_k\}_{k \in \mathbb{N}^n}$ be a corresponding family of seminorms on $C_n(\mathcal{A}_c, \mathcal{A}_c \otimes \mathcal{A}_c^{op})$ and let $\{\phi_j\}$ be a local approximate unit for $\mathcal{A}$, [R1]. Define $C_n^\infty(\mathcal{A}, \mathcal{A} \otimes \mathcal{A}^{op})$ to be the completion of $C_n(\mathcal{A}_c, \mathcal{A}_c \otimes \mathcal{A}_c^{op})$ for the topology determined by the family of seminorms
$$q_{k,j}(a^0 \otimes b^{op} \otimes a^1 \otimes \cdots \otimes a^n) := q_k(\phi_j a^0 \otimes (\phi_j b)^{op} \otimes \phi_j a^1 \otimes \cdots \otimes \phi_j a^n).$$
This should be viewed as similar to uniform convergence of all derivatives on compacta, and so analogous to a $C^\infty$ topology. Ultimately more nonunital examples are required to clarify this issue; for more comments see [GGISV,R1,R2]. We leave these homological questions for future investigation.


For the case of graph algebras, we consider the sum over all edges in the graph
$$c = \sum_{e \in E^1} S_e^* \otimes S_e. \qquad (8)$$
Before worrying about the convergence of this sum (in the multiplier algebra), we apply the Hochschild boundary $b$ to find
$$b(c) = \sum_e (S_e^* S_e - S_e S_e^*) = \sum_e p_{r(e)} - \sum_{v \text{ not sink}} p_v,$$
where we have used the Cuntz-Krieger relation to obtain the second sum on the right-hand side. Thus if there are no sinks, the second sum on the right-hand side converges to the identity (in the multiplier algebra or the 'one-point' unitization). The first sum on the right-hand side contains each vertex projection $p_v$ with multiplicity equal to the number of edges entering it, which we denote by $|v|_1$. Thus
$$b(c) = \sum_{v \in E^0,\ v \text{ not sink}} (|v|_1 p_v - p_v) + \sum_{v \text{ a sink}} |v|_1 p_v.$$
In particular, if each vertex has precisely one edge entering it, and no vertex is a sink, $b(c) = 0$. We say that such a graph $E$ has no sinks, and satisfies the single entry condition. Observe that the single entry condition (together with the requirement that no loop has an exit) rules out loops except for the case where the (connected) graph comprises a single loop. The $C^*$-algebra of a graph consisting of a simple loop on $n$ vertices is isomorphic to $M_n(C(S^1))$. For a one-edge loop, the Hochschild cycle $c$ is $z^{-1} \otimes z$, the usual volume form for the circle. The single entry condition also rules out sources, so unless our (connected) graph comprises a single loop, it is an infinite directed tree with no sources or sinks, in which case the $C^*$-algebra is AF [KPR, Theorem 2.4].

If $E$ satisfies the single entry condition then we claim that $\sum_e S_e^* \otimes S_e$ converges to a partial isometry in the multiplier algebra of $C^*(E) \otimes C^*(E)$. Let $X_e = S_e^* \otimes S_e$, then it is clear that $X_e$ is a partial isometry in $C^*(E) \otimes C^*(E)$ with
$$X_e X_e^* = (S_e^* \otimes S_e)(S_e \otimes S_e^*) = p_{r(e)} \otimes S_e S_e^*, \quad X_e^* X_e = (S_e \otimes S_e^*)(S_e^* \otimes S_e) = S_e S_e^* \otimes p_{r(e)}.$$
By the relations in $C^*(E)$ the $S_e S_e^*$ are mutually orthogonal, and then by the single entry hypothesis the $p_{r(e)}$ are too. Hence the $X_e$ have mutually orthogonal ranges, and a standard argument (see [PR2, Lemma 1.1] or [BPRS, Lemma 1.1]) finishes off the claim. Using the single-entry condition, we see that the Hochschild cycle defined in (8) is represented by
$$\pi_{\mathcal{D}}(c) = \sum_e S_e^* [\mathcal{D}, S_e] = \sum_e S_e^* S_e = \sum_e p_{r(e)} = \mathrm{Id}_{\mathcal{H}}, \qquad (9)$$
showing that the new condition of orientation is satisfied for this cycle. The sums in (9) converge in the strict topology as an operator on the $C^*$-module $X$, and also converge strongly on $\mathcal{H}$. It may well be possible that there is a Hochschild cycle for a more general family of graphs, and we are not claiming that the above conditions are necessary for the orientability condition to hold, only sufficient.
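For a finite graph, the boundary computation for the chain (8) reduces to counting edges at each vertex: the coefficient of $p_v$ in $b(c)$ is the number of edges entering $v$, minus one if $v$ emits an edge. The following is a minimal sketch of that bookkeeping only. The function names `boundary_coefficients` and `is_hochschild_cycle` are hypothetical, edges are modelled as (source, range) pairs, and the infinite trees considered in the text (with their convergence questions) are outside its scope.

```python
from collections import Counter


def boundary_coefficients(vertices, edges):
    """Coefficient of p_v in b(c), where c = sum_e S_e^* (x) S_e.

    `edges` is a list of (source, range) pairs. The count uses
    S_e^* S_e = p_{r(e)} and, for each vertex v that is not a sink,
    the Cuntz-Krieger relation sum_{s(e)=v} S_e S_e^* = p_v.
    """
    in_degree = Counter(r for _, r in edges)   # |v|_1 in the text
    emitters = {s for s, _ in edges}           # vertices that are not sinks
    return {v: in_degree[v] - (1 if v in emitters else 0) for v in vertices}


def is_hochschild_cycle(vertices, edges):
    """True when every coefficient of b(c) vanishes."""
    return all(coeff == 0 for coeff in boundary_coefficients(vertices, edges).values())


if __name__ == "__main__":
    # A simple loop on three vertices: no sinks, one edge entering each vertex.
    loop = [(0, 1), (1, 2), (2, 0)]
    print(boundary_coefficients([0, 1, 2], loop))
    # A single edge from a source to a sink fails at both vertices.
    print(boundary_coefficients(["s", "t"], [("s", "t")]))
```

On the loop every coefficient is zero, matching the statement that no sinks together with the single entry condition gives $b(c) = 0$.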

From now on we suppose that $E$ has no sinks and satisfies the single entry condition. As noted above, it follows that the algebra $C^*(E)$ is then AF unless it is $M_n(C(S^1))$. In the AF case, $E$ is a directed tree. We record the following lemma describing the fixed-point algebra of the directed tree examples.

Lemma 3.1. Suppose that $E$ is a directed tree with no sinks satisfying the single entry condition and having finitely many ends. Then $F = C^*(E)^\gamma$ is an abelian algebra, isomorphic to the continuous functions on the infinite path space $E^\infty$ of $E$. Letting $N$ denote the number of ends, each $f \in F_c$ can be represented as
$$f = \sum_{v \in E^0} \sum_{n=1}^{N} c_{v,n} S_{v,n} S_{v,n}^*, \quad c_{v,n} \in \mathbb{C},$$
where $(v, n)$ denotes a path with source $v$ and range in the $n$-th tail. The $C^*$-norm of such an $f$ is $\|f\|_F^2 = \sup |c_{v,n}|^2$.

Proof. The assertion that $F \cong C_0(E^\infty)$ follows from an argument along the lines of [KPRR, Lemma 4.3]. To see that it is possible to write $f \in F_c$ in the above form, consider a path $\alpha$ with range $r(\alpha)$ a vertex emitting one edge $e$. Then
$$S_{\alpha e} S_{\alpha e}^* = S_\alpha S_e S_e^* S_\alpha^* = S_\alpha p_{s(e)} S_\alpha^* = S_\alpha p_{r(\alpha)} S_\alpha^* = S_\alpha S_\alpha^*.$$
So any $S_\alpha S_\alpha^*$ is equal to $S_\beta S_\beta^*$ where $\beta$ is an extension of $\alpha$ not passing through a vertex emitting more than one edge. If $\alpha$ is a path with range a vertex emitting, say, $k$ edges, $e_1, \ldots, e_k$, then
$$S_\alpha S_\alpha^* = S_\alpha p_{r(\alpha)} S_\alpha^* = \sum_{i=1}^{k} S_\alpha S_{e_i} S_{e_i}^* S_\alpha^*,$$
and this can be subsequently extended until the next vertices emitting more than one edge. This process terminates after finitely many steps because there are finitely many ends. The $S_{v,n} S_{v,n}^*$ are mutually orthogonal, so
$$f^* f = \sum_v \sum_n |c_{v,n}|^2 S_{v,n} S_{v,n}^*,$$
and $\|f\|_F^2 = \sup |c_{v,n}|^2$. $\square$

The next condition is closedness, which, in its original form, is basically Stokes' theorem for the Dixmier trace applied to elements of $\mathcal{A} \otimes \mathcal{A}^{op}$. The original formulation for $(p, \infty)$-summable triples using the operator trace Trace is

Old Condition 4 (Closedness). The $(p, \infty)$-summable spectral triple $(\mathcal{A}, \mathcal{H}, \mathcal{D})$ is closed if for any $a_1, \ldots, a_p \in \mathcal{A} \otimes \mathcal{A}^{op}$, the operator $[\mathcal{D}, a_1] \cdots [\mathcal{D}, a_p](1 + \mathcal{D}^2)^{-p/2}$ has vanishing Dixmier trace; thus, for any Dixmier trace $\mathrm{Trace}_\omega$,
$$\mathrm{Trace}_\omega\big([\mathcal{D}, a_1] \cdots [\mathcal{D}, a_p](1 + \mathcal{D}^2)^{-p/2}\big) = 0. \qquad (10)$$


Remark. By setting $\phi(a_0, \ldots, a_p) := \mathrm{Trace}_\omega\big(a_0 [\mathcal{D}, a_1] \cdots [\mathcal{D}, a_p](1 + \mathcal{D}^2)^{-p/2}\big)$, Eq. (10) may be rewritten [C, VI.2] as $B_0 \phi = 0$, where $B_0$ is defined on $(k + 1)$-linear functionals by
$$(B_0 \phi)(a_1, \ldots, a_k) := \phi(1, a_1, \ldots, a_k) + (-1)^k \phi(a_1, \ldots, a_k, 1).$$
To see the utility of this condition, we introduce some notation so that we can quote Lemma 3 of [C, VI.4.γ]. Let $\Omega^*(\mathcal{A})$ be the universal differential algebra of $\mathcal{A}$, [C, II.1.α]. Then $\pi_{\mathcal{D}} : \Omega^*(\mathcal{A}) \to C_{\mathcal{D}}(\mathcal{A})$ defined by
$$\pi_{\mathcal{D}}(a_0\, \delta a_1 \cdots \delta a_n) = a_0 [\mathcal{D}, a_1] \cdots [\mathcal{D}, a_n]$$
is a $*$-algebra representation. Denote by $\Omega^*_{\mathcal{D}}(\mathcal{A})$ the graded differential algebra we obtain by quotienting $C_{\mathcal{D}}(\mathcal{A})$ by the differential ideal $\pi_{\mathcal{D}}(\delta(\ker \pi_{\mathcal{D}}))$, where $\delta$ is the universal derivation on $\Omega^*(\mathcal{A})$. We denote by $d$ the derivation on $\Omega^*_{\mathcal{D}}(\mathcal{A})$. See [C, Chap VI] for more information. Finally, let $Z^k(\mathcal{A}, \mathcal{A}^*)$ denote the Hochschild cocycles.

Lemma 3.2. Let $(\mathcal{A}, \mathcal{H}, \mathcal{D})$ be $(p, \infty)$-summable and satisfy Old Condition 5 (first order). Then for each $k = 0, 1, \ldots, p$ and $\eta \in \Omega^k \mathcal{A}$, a Hochschild cocycle $C_\eta \in Z^{p-k}(\mathcal{A}^{op}, (\mathcal{A}^{op})^*)$ is defined by
$$C_\eta(a^0, \ldots, a^{p-k}) := \mathrm{Trace}_\omega\big(\pi_{\mathcal{D}}(\eta)\, a^0 [\mathcal{D}, a^1] \cdots [\mathcal{D}, a^{p-k}] (1 + \mathcal{D}^2)^{-p/2}\big), \quad a^0, \ldots, a^{p-k} \in \mathcal{A}^{op}.$$
Moreover, if Old Condition 4 (closedness) also holds, then $C_\eta$ depends only on the class of $\pi_{\mathcal{D}}(\eta)$ in $\Omega^k_{\mathcal{D}} \mathcal{A}$, and $B_0 C_\eta = (-1)^k C_{d\eta}$.

Thus the first order condition together with closedness give us tools to study the Hochschild and cyclic homology of the algebra $\mathcal{A}$. More information can be found in [C, VI.4.γ]. The difficulty we face is that we have a Dixmier trace defined on $\mathcal{N} \supset \mathcal{A}$ which we cannot apply to $\mathcal{A} \otimes \mathcal{A}^{op}$. As we discuss in the next section, we do not believe having a spectral triple for $\mathcal{A} \otimes \mathcal{A}^{op}$ is of central importance. Nevertheless, the utility of Lemma 3.2 is greatly reduced by our new formulation.

New Condition 4 (Semifinite Closedness). The $(p, \infty)$-summable local semifinite spectral triple $(\mathcal{A}, \mathcal{H}, \mathcal{D})$ is closed if for any Dixmier trace $\tilde{\tau}_\omega$ we have
$$\tilde{\tau}_\omega\big([\mathcal{D}, a_1] \cdots [\mathcal{D}, a_p](1 + \mathcal{D}^2)^{-p/2}\big) = 0 \qquad (11)$$
for all $a_1, \ldots, a_p \in \mathcal{A}$.

It would seem that this formulation does not give us tools to study the Hochschild and cyclic cohomology of $\mathcal{A}$ as in the type I case described above, [C, VI.4.γ]. More examples are required to understand the proper extension of this condition to the semifinite setting. For the gauge spectral triple of a graph algebra and generator $S_\mu S_\nu^* \in \mathcal{A}$, [PRen, Theorem 5.8],
$$\tilde{\tau}_\omega\big([\mathcal{D}, S_\mu S_\nu^*](1 + \mathcal{D}^2)^{-1/2}\big) = (|\mu| - |\nu|)\,\tilde{\tau}_\omega\big(S_\mu S_\nu^* (1 + \mathcal{D}^2)^{-1/2}\big) = 2(|\mu| - |\nu|)\,\tau(S_\mu S_\nu^*).$$



The gauge invariance of the trace says that τ (Sµ Sν∗ ) is non-zero only if |µ| = |ν|, whence the whole expression always vanishes. Hence the new closedness condition holds for the gauge spectral triple.
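The vanishing argument above can be spelled out on generators. The sketch below is a toy model only: paths are tuples of (source, range) edges, and the trace is assumed to have the diagonal form $\tau(S_\mu S_\nu^*) = \delta_{\mu,\nu}\,\tau(p_{r(\mu)})$, which is the form used for the dyadic tree later in the text. The names `model_trace`, `closedness_value`, and the vertex weight function are hypothetical and introduced purely for illustration.

```python
def model_trace(mu, nu, vertex_weight):
    """Assumed model: tau(S_mu S_nu^*) = delta_{mu,nu} * tau(p_{r(mu)}).

    `mu` and `nu` are non-empty tuples of (source, range) edges and
    `vertex_weight(v)` plays the role of tau(p_v).
    """
    if mu != nu:
        return 0.0
    return vertex_weight(mu[-1][1])  # weight of the range vertex of mu


def closedness_value(mu, nu, vertex_weight):
    """2 (|mu| - |nu|) tau(S_mu S_nu^*), the Dixmier trace quoted in the text."""
    return 2 * (len(mu) - len(nu)) * model_trace(mu, nu, vertex_weight)


if __name__ == "__main__":
    weight = lambda v: 1.0           # hypothetical constant graph trace
    e, f = (0, 1), (1, 2)
    # Equal paths: the length factor vanishes. Unequal paths: the trace vanishes.
    print(closedness_value((e, f), (e, f), weight))
    print(closedness_value((e, f), (e,), weight))
```

Either the factor $|\mu| - |\nu|$ or the trace is zero in every case, which is the content of the closedness check for the gauge spectral triple.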

3.3. The bimodule conditions. This section is concerned with the relation between the bimodule structure of the Hilbert space and the spectral triple. First we have the first order condition which specifies the bimodule structure. In the original type I setting we have

Old Condition 5 (First Order). There are commuting representations $\pi : \mathcal{A} \to \mathcal{B}(\mathcal{H})$ and $\pi^{op} : \mathcal{A}^{op} \to \mathcal{B}(\mathcal{H})$ of the opposite algebra $\mathcal{A}^{op}$ (or equivalently, an antirepresentation of $\mathcal{A}$). Writing $a$ for $\pi(a)$, and $b^{op}$ for $\pi^{op}(b)$, we ask that $[a, b^{op}] = 0$. In addition, the bounded operators in $[\mathcal{D}, \mathcal{A}]$ commute with $\mathcal{A}^{op}$; in other words,
$$[[\mathcal{D}, a], b^{op}] = 0 \quad \text{for all } a, b \in \mathcal{A}. \qquad (12)$$

In the type I setting the first order condition gives us a spectral triple for $\mathcal{A} \otimes \mathcal{A}^{op}$, but we believe this is not essential, and just an artefact of the type I setting. Rather we focus on the fact that in the type I setting the algebra $C_{\mathcal{D}}(\mathcal{A})$ is contained in the endomorphism algebra of the right $\mathcal{A}$ module $\mathcal{H}_\infty$. The finiteness condition (below) asks that $\mathcal{H}_\infty = \bigcap_{m \ge 1} \mathrm{dom}\, \mathcal{D}^m$ be a finite projective (right) $\mathcal{A}$ module. The first order condition then says that $C_{\mathcal{D}}(\mathcal{A}) \subseteq \mathrm{End}^R_{\mathcal{A}}(\mathcal{H}_\infty)$, where $R$ is for right. One would expect this finite projective condition to be symmetric in some sense, but this is an extra requirement. If $\mathcal{H}_\infty$ is also a finite projective left $\mathcal{A}$-module, then $C_{\mathcal{D}}(\mathcal{A}^{op}) \subseteq \mathrm{End}^L_{\mathcal{A}}(\mathcal{H}_\infty)$, $L$ for left. Typically however, these two algebras of endomorphisms, one left and one right, will not commute with each other. They do for the gauge spectral triple of a graph algebra, but this is a one-dimensional phenomenon (see also [GGISV]). A moment's thought shows that regarding the (sections of the) spinor bundle of a spin manifold $M$ as a $C^\infty(M)$ bimodule, the two collections of endomorphisms we obtain do not commute, since both algebras of endomorphisms are the same Clifford algebra.

These arguments show that the most important aspect of the first order condition is that the algebra $C_{\mathcal{D}}(\mathcal{A})$ acts as endomorphisms of a noncommutative bundle, and that the 'symbol' of $\mathcal{D}$ is such an endomorphism. Moreover, in the semifinite setting we begin with a representation $\pi : \mathcal{A} \to \mathcal{N} \subset \mathcal{B}(\mathcal{H})$. The von Neumann algebra $\mathcal{N}$ is thus required to contain $\mathcal{A}$ and the spectral projections of $\mathcal{D}$, and these are the only requirements. So typically, $\pi^{op}(\mathcal{A}^{op}) \not\subset \mathcal{N}$, and this is the case for the gauge spectral triple. In particular, $\mathcal{A}^{op}$ need not lie in the domain of the trace we employ, and even supposing we have a version of the first order condition, we will not obtain a spectral triple for $\mathcal{A} \otimes \mathcal{A}^{op}$. We therefore change the first-order condition only very slightly as follows:

New Condition 5 (Semifinite First Order). There are commuting representations $\pi : \mathcal{A} \to \mathcal{N}$ and $\pi^{op} : \mathcal{A}^{op} \to \mathcal{B}(\mathcal{H})$ of the opposite algebra $\mathcal{A}^{op}$. Writing $a$ for $\pi(a)$, and $b^{op}$ for $\pi^{op}(b)$, we ask that $[a, b^{op}] = 0$. In addition, the bounded operators in $[\mathcal{D}, \mathcal{A}]$ commute with $\mathcal{A}^{op}$; in other words,
$$[[\mathcal{D}, a], b^{op}] = 0 \quad \text{for all } a, b \in \mathcal{A}. \qquad (13)$$


For the gauge spectral triple of a directed graph, the Hilbert space naturally carries commuting representations of $\mathcal{A}$ and $\mathcal{A}^{op}$. The first order condition $[[\mathcal{D}, \mathcal{A}], \mathcal{A}^{op}] = 0$ follows since $[\mathcal{D}, \mathcal{A}] \subset \mathcal{A}$, and the left and right actions of $\mathcal{A}$ on the Hilbert space commute. The condition of finiteness in the unital case is

Old Condition 6 (Finiteness). The dense subspace of $\mathcal{H}$ which is the smooth domain of $\mathcal{D}$,
$$\mathcal{H}_\infty := \bigcap_{m \ge 1} \mathrm{dom}\, \mathcal{D}^m,$$
is a finitely generated projective right $\mathcal{A}$-module.

Thus $\mathcal{H}_\infty \cong q\mathcal{A}^m$, where $q \in M_m(\mathcal{A})$ is an idempotent. Without loss of generality, we may suppose $q = q^*$ also, so that, without further hypotheses, $\mathcal{H}_\infty$ carries an $\mathcal{A}$-valued Hermitian pairing, namely, that given by
$$(\xi, \eta) := \sum_{j,k} a_j^* q_{jk} b_k \quad \text{when } \xi = \Big(\sum_j q_{ij} a_j\Big)_{i=1}^m,\ \eta = \Big(\sum_k q_{ik} b_k\Big)_{i=1}^m.$$
In the nonunital case, this is necessarily more subtle as the elements of $\mathcal{H}_\infty = \bigcap_m \mathrm{dom}\, \mathcal{D}^m$ must also satisfy integrability conditions. In [R1], the notion of smooth module was introduced for nonunital algebras which are local. As we are dealing with quasi-local algebras, most of the results on smooth modules in [R1] are not applicable. We take the attitude that:

Point 1) $\mathcal{H}_\infty$ should be a continuous $\mathcal{A}$-module,
Point 2) $\mathcal{H}_\infty$ should embed continuously as a dense subspace in the $C^*$-$A$-module $X_A = \overline{\mathcal{H}_\infty}$,
Point 3) $X$ should be the completion of $q\mathcal{A}^N$ for some $N$ and some projection $q$ in $M_N(\mathcal{A}_b)$, where $\mathcal{A}_b$ is a unitization of $\mathcal{A}$,
Point 4) the Hermitian product $\mathcal{H}_\infty \ni x, y \mapsto (x|y)$ should have range in $\mathcal{A}$ (acting on the right).

Point 1) is implied by the condition of regularity.

Proof. For $x \in \mathcal{H}_\infty$ and $a \in \mathcal{A}$ we have
$$\|\mathcal{D}^n(xa)\|_{\mathcal{H}} = \||\mathcal{D}|^n(xa)\|_{\mathcal{H}} = \Big\|\sum_{j=0}^{n} \binom{n}{j} \delta^{n-j}(a^{op}) |\mathcal{D}|^j(x)\Big\|_{\mathcal{H}} \le \sum_{j=0}^{n} \binom{n}{j} \|\delta^{n-j}(a^{op})\|\, \||\mathcal{D}|^j(x)\|_{\mathcal{H}}. \qquad (14)$$
The continuity of the action of $\mathcal{A}$ on $\mathcal{H}_\infty$ now follows easily. $\square$

Point 2 above is included to ensure that we can recover the 'module of continuous sections vanishing at infinity' from $\mathcal{H}_\infty$, and it is a nontrivial condition as we shall see. Once we have a continuous embedding, the image will be dense for our graph algebra examples, since $\mathcal{A}_c \subset \mathcal{H}_\infty$.



Once we can recover the module X , we demand that it be ‘finitely generated and projective’ in the sense of 3): see also [R1, Theorem 8]. The examples arising from graph algebras have A dense in X , so taking N = 1 and q = id Ab in any unitization Ab of A shows that 3) is always satisfied for the gauge spectral triple of a graph algebra. All four points are satisfied in the unital case, so we will ignore the case of a single loop in the following, focussing attention instead on the directed trees. Roughly speaking, without points 2) and 4), H can contain many ‘functions’ on the graph which are unbounded, and so are not in the algebra A or the module X . Modules of unbounded ‘functions’ are not terrible per se, but we prefer to remain close to the C ∗ -theory. Example. Let E be the ‘dyadic directed tree’.

[Figure: the dyadic directed tree $E$.]
Define a faithful trace as follows. If $v$ is a vertex before the first split, let $\tau(p_v) = 1$. If $v$ occurs after $n$ splits and before $n + 1$ splits, define $\tau(p_v) = 2^{-n}$. Finally define $\tau(S_\mu S_\nu^*) = \delta_{\mu,\nu}\, \tau(p_{r(\mu)})$. Then the Hilbert space $\mathcal{H} = L^2(X, \tau)$ contains
$$a = \lim_{N \to \infty} \sum_{i=1}^{N} 2^{i/4} p_i, \qquad (15)$$
where $\tau(p_i) = 2^{-i}$, and the $p_i$ are mutually orthogonal. The element $a \in \mathcal{H}$ in Eq. (15) does not lie in the $C^*$-module $X$, as the limit does not exist in the norm $\|\cdot\|_X$.

New Condition 6 (Nonunital Finiteness). The dense subspace of $\mathcal{H}$ which is the smooth domain of $\mathcal{D}$,
$$\mathcal{H}_\infty := \bigcap_{m \ge 1} \mathrm{dom}\, \mathcal{D}^m,$$
has a right inner product $\mathcal{A}$-module structure. Moreover, $\mathcal{H}_\infty$ embeds as a dense subspace of a $C^*$-$A$-module which is finitely generated and projective over some unitization $\mathcal{A}_b$ of $\mathcal{A}$.
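Returning to the dyadic tree example of Eq. (15), the two norms of the partial sums $a_N = \sum_{i=1}^N 2^{i/4} p_i$ can be compared directly. Since the $p_i$ are mutually orthogonal projections with $\tau(p_i) = 2^{-i}$, one has $\|a_N\|_{\mathcal{H}}^2 = \sum_i 2^{i/2}\, 2^{-i}$, while the module norm is the supremum of the coefficients. The sketch below is only an illustration of this arithmetic, with hypothetical function names.

```python
def hilbert_norm_squared(N: int) -> float:
    """||a_N||_H^2 = sum_{i=1}^N |2^{i/4}|^2 * tau(p_i) with tau(p_i) = 2^{-i}."""
    return sum(2.0 ** (i / 2) * 2.0 ** (-i) for i in range(1, N + 1))


def module_norm(N: int) -> float:
    """||a_N||_X = sup_i 2^{i/4}, since the p_i are orthogonal projections."""
    return max(2.0 ** (i / 4) for i in range(1, N + 1))


if __name__ == "__main__":
    for N in (4, 16, 64):
        # The first column stays bounded; the second grows without bound.
        print(N, hilbert_norm_squared(N), module_norm(N))
```

The Hilbert space norms form a convergent geometric series with ratio $2^{-1/2}$, whereas $\|a_N\|_X = 2^{N/4}$ is unbounded, so the partial sums converge in $\mathcal{H}$ but not in $X$.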


Having identified a working generalisation of the finiteness condition, we identify the restrictions it places on a graph $C^*$-algebra. So to check that New Condition 6 holds, we must verify points 2) and 4).

Proposition 3.3. Suppose that the locally finite directed graph $E$ has no sinks, no loops and satisfies the single entry condition. The $\mathcal{A}$-module $\mathcal{H}_\infty$ satisfies 2) if and only if the $K$-theory of $A$ is finitely generated. In this case the Hilbert space $\mathcal{H}$ also satisfies point 2). If the $K$-theory of $A$ is finitely generated then point 4) holds.

Remark. Thus for the directed tree examples, the finiteness condition is satisfied if and only if the $K$-theory of $A$ is finitely generated.

Proof. We begin with condition 2) for our directed trees. First of all we must have $\ker \mathcal{D} \cap \mathcal{H}_\infty = L^2(F_c, \tau) \subset X$. Thus we require a $C > 0$ such that
$$\|f\|_X^2 = \|f^* f\|_F \le C\, \tau(f^* f)^{1/2} = C\, (\|f\|_{\mathcal{H}}^2)^{1/2},$$
for all $f \in L^2(F_c, \tau)$. In particular, we require for all $v \in E^0$ that
$$1 = \|p_v\|_F \le C\, \tau(p_v)^{1/2}.$$
Hence $\tau(p_v)$ must be bounded below, which implies, by the definition of a graph trace, the faithfulness of $\tau$, and since $E$ is connected and row-finite, that there exist at most finitely many ends, and so $K_0(A)$ is finitely generated. Thus the condition is necessary.

Conversely, suppose that $K_0(A)$ is finitely generated, and let $\mathrm{rank}(K_0(A)) = N < \infty$ be the number of ends. Observe that having finitely many ends implies that any faithful graph trace is bounded from below. Then if $f \in F_c$, Lemma 3.1 allows us to write
$$f = \sum_{v \in E^0} \sum_{n=1}^{N} c_{v,n} S_{v,n} S_{v,n}^*, \quad c_{v,n} \in \mathbb{C},$$
where $(v, n)$ denotes a path with source $v$ and range in the $n$-th end. We have $\|f\|_F^2 = \sup |c_{v,n}|^2$. Now suppose that $f \in \mathcal{H}$, so that
$$\|f\|_{\mathcal{H}}^2 = \tau(f^* f) = \sum_v \sum_n |c_{v,n}|^2 \tau(p_n) < \infty,$$
where $p_n$ is any projection in the $n$-th end. Then
$$\|f\|_{\mathcal{H}}^2 = \sum_v \sum_n |c_{v,n}|^2 \tau(p_n) \ge \min\{\tau(p_n)\} \sum_v \sum_n |c_{v,n}|^2 \ge \min\{\tau(p_n)\} \sup_{v,n} |c_{v,n}|^2 = \min_n\{\tau(p_n)\}\, \|f\|_F^2 = \min\{\tau(p_n)\}\, \|f\|_X^2.$$


Hence f ∈ X . Finally, suppose that x ∈ H, so x = As f k := xk∗ xk ∈ F and is positive, we have x 2H =

τ (xk∗ xk ) =

k

k∈Z x k

f k H ≥ (min{τ ( pn )})1/2

and

k

= (min{τ ( pn )})1/2

k

xk∗ xk X = (min{τ ( pn )})1/2

k

= (min{τ ( pn )})

1/2

∗ k∈Z τ (x k x k )

< ∞.

fk X

(xk∗ xk )2 F

1/2

k

xk∗ xk F

= (min{τ ( pn )})

1/2

k

sup xk∗ xk F k

= (min{τ ( pn )})1/2 x 2X . This proves that the finite generation of K 0 (A) is necessary and sufficient for the second point. For point 4), we assume that K 0 (A) is finitely generated. We observe that if x, y ∈ X c = Ac ⊂ H∞ we have x ∗ y ∈ Ac ⊂ A. In particular, x ∗ y is in the smooth domain of the derivation δ = [|D|, ·]. Thus for x, y ∈ X c we have, by Lemma 2.19,

\[ \|\delta^m(x^*y)\|^2 \le \sum_{k,l\in\mathbb{Z}} |k-l|^{2m}\,\|x_k^*y_l\|_A^2, \]

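where the sum over $k, l$ is finite and we have used $\|\delta^m((x^*y)^{*op})\| = \|\delta^m(x^*y)\|$ for $a \in A_c$ to avoid writing $op$ throughout the following calculation. Now
\[ \|x_k^*y_l\|_A^2 = \|y_l^*x_kx_k^*y_l\|_A \le \|y_l^*y_l\|_A\,\|x_kx_k^*\|_A \le C^2\,\tau(y_l^*y_l)\,\tau(x_k^*x_k), \]
the last inequality using the finite generation of $K_0(A)$. So we have the inequality
\[
\begin{aligned}
\|\delta^m(x^*y)\|^2
&\le C^2 \sum_{k,l\in\mathbb{Z}} |k-l|^{2m}\,\tau(y_l^*y_l)\,\tau(x_k^*x_k) \\
&\le C^2 \sum_{k,l\in\mathbb{Z}} (|k|+|l|)^{2m}\,\tau(y_l^*y_l)\,\tau(x_k^*x_k) \\
&= C^2 \sum_{k,l\in\mathbb{Z}}\sum_{j=0}^{2m}\binom{2m}{j}\,|k|^{2m-j}|l|^{j}\,\tau(y_l^*y_l)\,\tau(x_k^*x_k) \\
&= C^2 \sum_{k,l\in\mathbb{Z}}\sum_{j=0}^{2m}\binom{2m}{j}\,\tau\big((|\mathcal{D}|^{j/2}y_l)^*(|\mathcal{D}|^{j/2}y_l)\big)\,\tau\big((|\mathcal{D}|^{(2m-j)/2}x_k)^*(|\mathcal{D}|^{(2m-j)/2}x_k)\big) \\
&= C^2 \sum_{j=0}^{2m}\binom{2m}{j}\,\big\||\mathcal{D}|^{j/2}y\big\|_{\mathcal{H}}^2\,\big\||\mathcal{D}|^{(2m-j)/2}x\big\|_{\mathcal{H}}^2 .
\end{aligned}
\tag{16}
\]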

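So suppose that $\{x_i\} \subset X_c$ is a sequence converging to $x \in \mathcal{H}_\infty$ in the topology determined by the seminorms $x \mapsto \||\mathcal{D}|^m x\|_{\mathcal{H}}$, $m \ge 0$, and similarly $y_i \to y$. The estimate (16) shows that
\[
\begin{aligned}
\|\delta^m(x_j^*y_j - x_i^*y_i)\|_A^2
&= \|\delta^m(x_j^*y_j - x_j^*y_i + x_j^*y_i - x_i^*y_i)\|_A^2 \\
&\le C^2 \sum_{r=0}^{2m}\binom{2m}{r}\,\big\||\mathcal{D}|^{r/2}(y_j - y_i)\big\|_{\mathcal{H}}^2\,\big\||\mathcal{D}|^{(2m-r)/2}x_j\big\|_{\mathcal{H}}^2 \\
&\quad + C^2 \sum_{r=0}^{2m}\binom{2m}{r}\,\big\||\mathcal{D}|^{r/2}y_i\big\|_{\mathcal{H}}^2\,\big\||\mathcal{D}|^{(2m-r)/2}(x_j - x_i)\big\|_{\mathcal{H}}^2,
\end{aligned}
\]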

and this goes to zero. Hence the sequence $x_j^*y_j$ is Cauchy in $A$, and so for the limits $x, y \in \mathcal{H}_\infty$, the inner product $(x|y)_A = x^*y$ is in the completion of $A_c$ for the $\delta$-topology, and so $x^*y \in A$.

Remark. We also note that Connes stipulates that when we restrict the Hilbert space inner product to $\mathcal{H}_\infty$ we should have $\langle x, y\rangle = \fint (x|y)_A (1+\mathcal{D}^2)^{-1/2}$, where the Hermitian product is the $A$-valued one: $(x|y)_A = x^*y$. However, the trace satisfies $\tau = \tau\circ\Phi$, so $\tau((x|y)_A) = \tau(x^*y) = \tau(\Phi(x^*y)) = \tau((x|y)_R)$, and the inner product does indeed satisfy this formula, up to a factor of 2; see Eq. (3). The factor of 2 also occurs in the type I case, and is simply a matter of normalisation of the inner product, and does not affect the Hilbert space; see [R2, Sect. 5] for the constants in the commutative case.

The following condition describes a spin$^c$ structure for the noncommutative manifold, [P].

Old Condition 7 (Spin$^c$). The $C^*$-$A$-module completion of $\mathcal{H}_\infty$ is a Morita equivalence bimodule between $A$ and the norm completion of the algebra $C_{\mathcal{D}}(A)$ generated by $A$ and $[\mathcal{D}, A]$.

Since for graph algebras the $A$-bimodule $A$ is contained in $X$, we have a natural Morita equivalence bimodule between $A$ and $A$. As the norm closed algebra generated by $A$ and $[\mathcal{D}, A]$ is just $A$ in the case of the gauge spectral triple, the Morita equivalence follows. Thus there is no need to alter the spin$^c$ condition to deal with semifiniteness or lack of a unit (at least for graph algebras).

New Condition 7 (Spin$^c$). The $C^*$-$A$-module completion of $\mathcal{H}_\infty$ is a Morita equivalence bimodule between $A$ and the norm completion of the algebra $C_{\mathcal{D}}(A)$ generated by $A$ and $[\mathcal{D}, A]$.

In the case where $A = C^\infty(M)$, $M$ a manifold, the spin$^c$ condition (together with orientability) provides a spin$^c$ structure for $M$, [P]. Given a spin$^c$ manifold $M$, $M$ is spin if and only if at least one (oriented) Morita equivalence bimodule admits a bijective antilinear map satisfying the requirements of the reality condition, [GVF, Theorem 9.6]. Thus the reality condition below, in conjunction with the spin$^c$ condition, may be regarded as a noncommutative spin structure.



Old Condition 8 (Reality). There is an antiunitary operator $J : \mathcal{H} \to \mathcal{H}$ such that $Ja^*J^{-1} = a^{op}$ for all $a \in A$; and moreover, $J^2 = \pm 1$, $J\mathcal{D}J^{-1} = \pm\mathcal{D}$ and also $J\Gamma J^{-1} = \pm\Gamma$ in the even case, according to the following table of signs depending only on $p \bmod 8$:

p mod 8               0    2    4    6
J^2 = ±1              +    −    −    +
J D J^{-1} = ±D       +    +    +    +
J Γ J^{-1} = ±Γ       +    −    +    −

p mod 8               1    3    5    7
J^2 = ±1              +    −    −    +
J D J^{-1} = ±D       −    +    −    +

(17)

For the origin of this sign table in $KR$-homology, we refer to [GVF, §9.5]. For the gauge spectral triple, the operator $J : L^2(X,\tau) \to L^2(X,\tau)$, $J(x) = x^*$ satisfies the reality condition for $p = 1$, namely, $J^2 = 1$, $Ja^*J = a^{op}$ and $J\mathcal{D}J = -\mathcal{D}$, so the bimodule and spectral triple are real. This can be directly verified with ease. For this reason we retain the reality condition in its original form.

New Condition 8 (Reality). There is an antiunitary operator $J : \mathcal{H} \to \mathcal{H}$ such that $Ja^*J^{-1} = a^{op}$ for all $a \in A$; and moreover, $J^2 = \pm 1$, $J\mathcal{D}J^{-1} = \pm\mathcal{D}$ and also $J\Gamma J^{-1} = \pm\Gamma$ in the even case, according to the table (17) of signs.

For the type I case connectedness of the underlying noncommutative space is formulated in the following condition.

Old Condition 9 (Irreducibility). The spectral triple $(A, \mathcal{H}, \mathcal{D})$ is irreducible: that is, the only operators in $B(\mathcal{H})$ commuting with $\mathcal{D}$ and all $a \in A$ are the scalars.

In a von Neumann algebra context it is clear what we should replace this condition with.

New Condition 9 (Semifinite Irreducibility). The semifinite spectral triple $(A, \mathcal{H}, \mathcal{D})$ is irreducible: that is, the only operators in $\mathcal{N}$ commuting with $\mathcal{D}$ and all $a \in A$ are the scalars.

For our algebra $A$, only the fixed-point subalgebra $F$ commutes with $\mathcal{D}$. For graphs satisfying the single entry condition, $F$ is abelian. A graph-theoretic argument shows that if $E$ is connected, then no nontrivial element of $F$ can commute with all of $A$. We summarise our results for graph algebras.

Theorem 3.4. Let $E$ be a connected locally finite graph with no sinks, admitting a faithful graph trace, satisfying the single entry condition and having finitely generated $K$-theory. Then the gauge spectral triple $(A, \mathcal{H}, \mathcal{D})$ of $E$ satisfies the new (semifinite, nonunital) conditions 1 to 9. If $E$ is not a single loop, the gauge spectral triple is both nonunital (noncompact) and semifinite.

4. k-Graph Manifolds

In [PRS] we adapted the construction of [PRen] described earlier to construct a Kasparov module and semifinite spectral triple for suitable $k$-graph algebras. This was accomplished by ‘pushing forward’ the Dirac operator (of the simplest spin structure) on the $k$-torus, using the canonical $\mathbb{T}^k$ action on a $k$-graph algebra.



We will not go into the details of these constructions as we did for graph algebras, noting only that they are essentially analogous to the graph case. We also omit a general discussion of $k$-graph algebras, as this is lengthy. We will adopt the definitions, notations and conventions of [PRS], and refer the reader to this work for an introduction to $k$-graph algebras adapted to this context. We do require several notational reminders so that we can state our results here with the minimum of ambiguity. In particular:

Warning. In this section we reverse our conventions regarding range and source of edges. This means that sinks and sources play opposite roles, the single entry condition becomes the single exit condition, and so on. This is in keeping with the notation employed in [PRS].

Briefly, a $k$-graph is a set $\Lambda$ of paths with a degree map $d : \Lambda \to \mathbb{N}^k$. For $n \in \mathbb{N}^k$, we write $\Lambda^n$ for $d^{-1}(n)$, and regard $\Lambda^0$ as the set of vertices. Paths have the unique factorisation property: if $d(\lambda) = m + n$, then there are unique paths $\mu \in \Lambda^m$ and $\nu \in \Lambda^n$ such that $\lambda = \mu\nu$. In particular, if $m \le n \le l = d(\lambda)$, then there is a unique factorisation $\lambda = \lambda(0,m)\lambda(m,n)\lambda(n,l)$, where $d(\lambda(0,m)) = m$ and so forth. It also follows that each path $\lambda$ has a unique range $r(\lambda) \in \Lambda^0$ such that $r(\lambda)\lambda = \lambda$; likewise for sources. With this in mind, we write $v\Lambda^n$ for $r^{-1}(v) \cap \Lambda^n$ and $\Lambda^n v$ for $s^{-1}(v) \cap \Lambda^n$ for $n \in \mathbb{N}^k$ and $v \in \Lambda^0$. The $C^*$-algebra $C^*(\Lambda)$ of a $k$-graph $\Lambda$ is the universal $C^*$-algebra generated by a set $\{S_\lambda : \lambda \in \Lambda\}$ of partial isometries satisfying Cuntz-Krieger type relations [KP].

For the remainder of this section, ‘$k$-graph’ shall be an abbreviation for ‘locally convex, locally finite $k$-graph without sinks, which possesses a faithful $k$-graph trace’. All the conditions below refer to the general semifinite nonunital versions discussed for graph algebras (with appropriate changes to the dimensions involved where necessary).

The gauge spectral triples $(A, \mathcal{H}, \mathcal{D})$ for $k$-graph algebras satisfy the new dimension, regularity (smoothness) and absolute continuity conditions, with dimension $k$. All this is proved in [PRS]. The new first order condition is satisfied just as in the graph case, and the new irreducibility condition is also satisfied if the $k$-graph is connected. The remaining conditions which we need to verify are the new finiteness, orientability, closedness, Morita equivalence (spin$^c$) and reality conditions. In order to do this, we will need to assume that our $k$-graphs are row-finite with no sources ($0 < |v\Lambda^n| < \infty$ for $v \in \Lambda^0$ and $n \in \mathbb{N}^k$), and satisfy the single exit condition ($|\Lambda^n v| = 1$ for each $v \in \Lambda^0$ and $n \in \mathbb{N}^k$).

Finiteness and Morita Equivalence. The proof of our rather strict finiteness condition for $k$-graphs is almost identical to the proof for the 1-graph case. In fact, once we have the following result it is virtually identical.

Suppose that $\Lambda$ is a row-finite $k$-graph with no sources satisfying the single-exit condition. We claim there is an isomorphism of the fixed point algebra $C^*(\Lambda)^\gamma$ onto $C_0(\Lambda^\infty)$ (where the infinite path space $\Lambda^\infty$ is endowed with the topology generated by the cylinder sets $\lambda\Lambda^\infty$, $\lambda \in \Lambda$). The isomorphism takes $S_\lambda S_\lambda^*$ to the characteristic function $\chi_{\lambda\Lambda^\infty}$ for each $\lambda \in \Lambda$. To see this, first recall from [FPS] (see also [KP]) that for an arbitrary row-finite $k$-graph $\Lambda$ with no sources, there is an isomorphism of the diagonal $D(\Lambda) := \overline{\operatorname{span}}\{S_\lambda S_\lambda^* : \lambda \in \Lambda\}$ onto $C_0(\Lambda^\infty)$ which takes $S_\lambda S_\lambda^*$ to $\chi_{\lambda\Lambda^\infty}$. We also know that $C^*(\Lambda)^\gamma = \overline{\operatorname{span}}\{S_\mu S_\nu^* : d(\mu) = d(\nu)\}$, but the single-exit condition ensures that whenever $S_\mu S_\nu^* \ne 0$ and $d(\mu) = d(\nu)$, we have $\mu = \nu$. Hence $C^*(\Lambda)^\gamma = D(\Lambda)$ when $\Lambda$ satisfies the single-exit condition, and this establishes the claim.



In particular, it is not hard to deduce from this an exact analogue of Lemma 3.1: if $\Lambda$ is row-finite, satisfies the single exit condition, and has finitely many (say $N$) ends, then each element $a$ of $F_c$ can be expressed as
\[ a := \sum_{v\in\Lambda^0}\sum_{i=1}^{N} b_{(v,i)}\, S_{(v,i)}S_{(v,i)}^*, \tag{18} \]

where $(v,i)$ is a path from the $i$-th end to $v$. As in the graph case, there is almost nothing to prove when the algebra is unital. This follows since then the trace of the identity is finite, and we can compare the Hilbert space and $C^*$-module norms easily. For the nonunital case we have the following.

Proposition 4.1. Suppose that the locally finite, locally convex $k$-graph $(\Lambda, d)$ has no sources and satisfies the single exit condition. The $A$-module $\mathcal{H}_\infty$ embeds continuously in the $C^*$-$A$-module completion if and only if the $K$-theory of $A$ is finitely generated; in this case the Hilbert space $\mathcal{H}$ does also. If $K_*(A)$ is finitely generated then the $C^*$-inner product restricted to $\mathcal{H}_\infty$ takes values in $A$.

Apart from the above result describing the fixed-point algebra of $C^*(\Lambda)$, we also require the $K$-theory computations of [PRS] which show that in the situation of Proposition 4.1 the $K$-theory is finitely generated if and only if there are finitely many ends. With these results in hand, our corresponding proof for 1-graphs can be applied with minor modifications.

The Morita equivalence condition is now simple, since $C_{\mathcal{D}}(A) \cong \mathrm{Cliff}(\mathbb{R}^k) \otimes A$ (or $\mathrm{Cliff}^+(\mathbb{R}^k)$ in odd dimensions) and $\mathcal{H}_\infty = A^{2^{[k/2]}}$. So the gauge spectral triple of a $k$-graph is spin$^c$.

Orientation. As noted above, $\mathcal{H}_\infty = A^{2^{[k/2]}}$, and the operator $\mathcal{D}$ acts on generators $x \in X_c \subset \mathcal{H}$ with degree $d(x) = n \in \mathbb{N}^k$ by $\mathcal{D}x = \sum_{j=1}^{k} \gamma^j (i n_j)\, x$, where the $\gamma^j$ are constant matrices generating the complex Clifford algebra of $\mathbb{R}^k$.

In what follows we write $1_k$ for $(1, \ldots, 1) \in \mathbb{N}^k$. We let $\Sigma_k$ be the group of permutations of $\{1, \ldots, k\}$. Fix a $k$-graph $\Lambda$ and a path $\mu \in \Lambda^{1_k}$. Given a permutation $\sigma \in \Sigma_k$, the factorisation property guarantees that there is a unique factorisation $\mu = \mu_1^\sigma \mu_2^\sigma \cdots \mu_k^\sigma$ such that $\mu_i^\sigma \in \Lambda^{e_{\sigma(i)}}$ for $1 \le i \le k$. For example, let $k = 2$ and let $\mu = ef = ab$ be a commuting square, so that $d(e) = d(b) = e_1$ and $d(f) = d(a) = e_2$. There are two elements of $\Sigma_2$, namely the flip $(1,2)$ and the identity $\mathrm{id}$. We have $\mu_1^{(1,2)} = a$ and $\mu_2^{(1,2)} = b$ whilst $\mu_1^{\mathrm{id}} = e$ and $\mu_2^{\mathrm{id}} = f$. We use the notation $(-1)^\sigma$ for the canonical homomorphism $\sigma \mapsto (-1)^\sigma$ from $\Sigma_k$ to $\{-1, 1\}$ which takes the 2-cycles $(i,j)$ to $-1$.

Proposition 4.2. Let $\Lambda$ be a row-finite $k$-graph with no sources, and suppose that for every $v \in \Lambda^0$ and $1 \le i \le k$ we have $|\Lambda^{e_i} v| = 1$ (single exit). Define
\[ c_k := i^{[(k+1)/2]}\,\frac{1}{k!} \sum_{\mu\in\Lambda^{1_k}}\sum_{\sigma\in\Sigma_k} (-1)^\sigma\, S_\mu^* \otimes S_{\mu_1^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma}. \tag{19} \]
Then $b(c_k) = 0$, where $b$ is the Hochschild boundary operator, and $\pi_{\mathcal{D}}(c_k) = \Gamma$, where $\Gamma$ is the grading for $k$ even, and the identity for $k$ odd.



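Proof. We begin by establishing that $\pi_{\mathcal{D}}(c_k) = \Gamma$ because this is the easier of the two calculations. To see this, we just calculate (here, the $\gamma^j$ are the generators of $\mathrm{Cliff}(\mathbb{R}^k)$):
\[
\begin{aligned}
\pi_{\mathcal{D}}(c_k)
&= i^{[(k+1)/2]}\,\frac{1}{k!}\sum_{\mu\in\Lambda^{1_k}}\sum_{\sigma\in\Sigma_k} (-1)^\sigma\, S_\mu^*\,[\mathcal{D}, S_{\mu_1^\sigma}][\mathcal{D}, S_{\mu_2^\sigma}]\cdots[\mathcal{D}, S_{\mu_k^\sigma}] \\
&= i^{[(k+1)/2]}\,\frac{1}{k!}\sum_{\mu\in\Lambda^{1_k}}\sum_{\sigma\in\Sigma_k} (-1)^\sigma\, S_\mu^*\, S_{\mu_1^\sigma}\gamma^{\sigma(1)} S_{\mu_2^\sigma}\gamma^{\sigma(2)} \cdots S_{\mu_k^\sigma}\gamma^{\sigma(k)} \\
&= i^{[(k+1)/2]}\,\gamma^1\cdots\gamma^k\,\frac{1}{k!}\sum_{\mu\in\Lambda^{1_k}}\sum_{\sigma\in\Sigma_k} S_\mu^*\, S_{\mu_1^\sigma}S_{\mu_2^\sigma}\cdots S_{\mu_k^\sigma}
= \omega_{\mathbb{C}} \sum_{\mu\in\Lambda^{1_k}} p_{s(\mu)},
\end{aligned}
\]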

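where $\omega_{\mathbb{C}}$ is the complex volume form in $\mathrm{Cliff}(\mathbb{R}^k)$. The single exit assumption ensures that the sum of vertex projections in the last line has exactly one term for each vertex of $\Lambda$, and hence converges to the identity in the multiplier algebra of $C^*(\Lambda)$, establishing that $\pi_{\mathcal{D}}(c_k) = \Gamma$.

Now we need to establish that $b(c_k) = 0$. To begin with, fix $\mu \in \Lambda^{1_k}$. We claim that
\[
\begin{aligned}
b\Big(\sum_{\sigma\in\Sigma_k} (-1)^\sigma\, S_\mu^* \otimes S_{\mu_1^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma}\Big)
= \sum_{\sigma\in\Sigma_k} (-1)^\sigma \Big( & S^*_{\mu_2^\sigma\cdots\mu_k^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} \\
& + (-1)^k\, S_{\mu_k^\sigma} S^*_{\mu_k^\sigma} S^*_{\mu_1^\sigma\cdots\mu_{k-1}^\sigma} \otimes S_{\mu_1^\sigma} \otimes \cdots \otimes S_{\mu_{k-1}^\sigma} \Big).
\end{aligned}
\tag{20}
\]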

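To see this, we apply the definition of the Hochschild boundary $b$ to obtain
\[
\begin{aligned}
b\Big(\sum_{\sigma\in\Sigma_k} (-1)^\sigma\, S_\mu^* \otimes S_{\mu_1^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma}\Big)
= \sum_{\sigma\in\Sigma_k} (-1)^\sigma \Big( & S^*_{\mu_2^\sigma\cdots\mu_k^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} \\
& + \sum_{j=1}^{k-1} (-1)^j\, S_\mu^* \otimes S_{\mu_1^\sigma} \otimes \cdots \otimes S_{\mu_j^\sigma}S_{\mu_{j+1}^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} \\
& + (-1)^k\, S_{\mu_k^\sigma} S^*_{\mu_k^\sigma} S^*_{\mu_1^\sigma\cdots\mu_{k-1}^\sigma} \otimes S_{\mu_1^\sigma} \otimes \cdots \otimes S_{\mu_{k-1}^\sigma} \Big).
\end{aligned}
\]
To establish (20), it therefore suffices to show that for $1 \le j \le k-1$, we have
\[ \sum_{\sigma\in\Sigma_k} (-1)^\sigma\, S_\mu^* \otimes S_{\mu_1^\sigma} \otimes \cdots \otimes S_{\mu_j^\sigma}S_{\mu_{j+1}^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} = 0. \]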
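As a hedged illustration of the boundary operator applied above, the following Python sketch implements the standard Hochschild boundary on formal elementary tensors. It is not the authors' construction: the "algebra" here is an assumed toy model in which elements are words (strings) and multiplication is concatenation, and the names `hochschild_b`, `chain`, `once` and `twice` are hypothetical. It only demonstrates the face maps and the identity $b \circ b = 0$.

```python
from collections import defaultdict


def hochschild_b(chain):
    """Apply the Hochschild boundary b to a formal chain.

    A chain is a dict mapping tuples (a0, a1, ..., an) of words to integer
    coefficients. Words multiply by concatenation (toy free algebra).
    """
    out = defaultdict(int)
    for tensor, coeff in chain.items():
        n = len(tensor) - 1
        if n == 0:
            continue  # b vanishes on 0-chains
        # Interior faces: multiply neighbours a_j a_{j+1} with sign (-1)^j.
        for j in range(n):
            face = tensor[:j] + (tensor[j] + tensor[j + 1],) + tensor[j + 2:]
            out[face] += (-1) ** j * coeff
        # Cyclic face: (-1)^n a_n a_0 (x) a_1 (x) ... (x) a_{n-1}.
        face = (tensor[n] + tensor[0],) + tensor[1:n]
        out[face] += (-1) ** n * coeff
    # Drop terms whose coefficients cancelled.
    return {t: c for t, c in out.items() if c != 0}


chain = {("x", "y", "z", "w"): 1, ("y", "x", "x", "z"): -2}
once = hochschild_b(chain)
twice = hochschild_b(once)
print(len(once), twice)
```

The first and last faces are the two kinds of terms that survive in (20); the interior faces are the ones that the antisymmetrisation over permutations is meant to cancel.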

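To see this, we fix $1 \le j \le k-1$, and note that we may partition $\Sigma_k$ as $\Sigma_k = A_j \sqcup B_j$ where $A_j := \{\sigma \in \Sigma_k : \sigma(j) < \sigma(j+1)\}$ and $B_j := \{\sigma \in \Sigma_k : \sigma(j) > \sigma(j+1)\}$. Let $t_j \in \Sigma_k$ be the transposition $(j, j+1)$. Then $\sigma \mapsto \sigma\circ t_j$ is a bijection from $A_j$ to $B_j$. Hence
\[
\begin{aligned}
\sum_{\sigma\in\Sigma_k} (-1)^\sigma\, & S_\mu^* \otimes S_{\mu_1^\sigma} \otimes \cdots \otimes S_{\mu_j^\sigma}S_{\mu_{j+1}^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} \\
= \sum_{\sigma\in A_j} \Big( & (-1)^\sigma\, S_\mu^* \otimes S_{\mu_1^\sigma} \otimes \cdots \otimes S_{\mu_j^\sigma}S_{\mu_{j+1}^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} \\
& + (-1)^{\sigma\circ t_j}\, S_\mu^* \otimes S_{\mu_1^{\sigma\circ t_j}} \otimes \cdots \otimes S_{\mu_j^{\sigma\circ t_j}}S_{\mu_{j+1}^{\sigma\circ t_j}} \otimes \cdots \otimes S_{\mu_k^{\sigma\circ t_j}} \Big).
\end{aligned}
\]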
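The combinatorial facts used in this pairing argument can be checked by brute force. The sketch below is an illustration only, with hypothetical helper names (`sign`, `compose_with_transposition`, `pairing_holds`, `cycle_sign`) and 0-based positions instead of the 1-based indices of the text. It verifies that composing with the transposition of positions $j, j+1$ maps $A_j$ bijectively onto $B_j$ and flips the sign, and that a $k$-cycle has sign $(-1)^{k-1}$.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of images of 0..k-1."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def compose_with_transposition(perm, j):
    """Return sigma∘t_j, where t_j swaps positions j and j+1 (0-based)."""
    out = list(perm)
    out[j], out[j + 1] = out[j + 1], out[j]
    return tuple(out)


def pairing_holds(k):
    """Check the A_j -> B_j bijection and the sign flip for every j."""
    perms = list(permutations(range(k)))
    for j in range(k - 1):
        a_j = {p for p in perms if p[j] < p[j + 1]}
        b_j = {p for p in perms if p[j] > p[j + 1]}
        image = {compose_with_transposition(p, j) for p in a_j}
        if image != b_j:
            return False
        if any(sign(p) + sign(compose_with_transposition(p, j)) != 0 for p in a_j):
            return False
    return True


def cycle_sign(k):
    """Sign of the cycle i -> i+1 (last index -> first), 0-based."""
    return sign(tuple(range(1, k)) + (0,))


print(all(pairing_holds(k) for k in range(1, 6)))
```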

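The definition of $t_j$ guarantees that $(-1)^\sigma + (-1)^{\sigma\circ t_j} = 0$ for all $\sigma \in A_j$, and we will therefore have established (20) if we can show that for fixed $1 \le j \le k-1$ and fixed $\sigma \in A_j$, we have
\[
S_\mu^* \otimes S_{\mu_1^\sigma} \otimes \cdots \otimes S_{\mu_j^\sigma}S_{\mu_{j+1}^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma}
= S_\mu^* \otimes S_{\mu_1^{\sigma\circ t_j}} \otimes \cdots \otimes S_{\mu_j^{\sigma\circ t_j}}S_{\mu_{j+1}^{\sigma\circ t_j}} \otimes \cdots \otimes S_{\mu_k^{\sigma\circ t_j}}.
\tag{21}
\]
By definition of $t_j$ we have $\mu_i^\sigma = \mu_i^{\sigma\circ t_j}$ whenever $i \ne j, j+1$. If we set $m := \sum_{i=1}^{j-1} e_{\sigma(i)} \in \mathbb{N}^k$, then the factorisation property in $\Lambda$ ensures that
\[
\mu_j^\sigma \mu_{j+1}^\sigma = \mu(m, m + e_{\sigma(j)} + e_{\sigma(j+1)}) = \mu(m, m + e_{\sigma\circ t_j(j)} + e_{\sigma\circ t_j(j+1)}) = \mu_j^{\sigma\circ t_j}\mu_{j+1}^{\sigma\circ t_j}.
\]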

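It follows that corresponding terms in the elementary tensors on either side of (21) are identical. This establishes (20).

We must now show that if we sum the right-hand side of (20) over all $\mu \in \Lambda^{1_k}$, we obtain zero. Fix, for the time being, $\mu \in \Lambda^{1_k}$ and $\sigma \in \Sigma_k$. Consider the expression
\[ (-1)^\sigma\, S^*_{\mu_2^\sigma\cdots\mu_k^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} \tag{22} \]
appearing as a summand in the first term on the right-hand side of (20). Let $\lambda := \mu_2^\sigma\mu_3^\sigma\cdots\mu_k^\sigma$, so that $\mu = \mu_1^\sigma\lambda$. Let $\psi_k \in \Sigma_k$ be the permutation defined by $\psi_k(i) = i+1$ for $i \le k-1$ and $\psi_k(k) = 1$. Fix $\alpha \in s(\lambda)\Lambda^{e_{\sigma(1)}}$. Then $\lambda\alpha \in \Lambda^{1_k}$. Consider the expression
\[
x(\lambda, \sigma\circ\psi_k, \alpha) := (-1)^{\sigma\circ\psi_k}(-1)^k\, S_{(\lambda\alpha)_k^{\sigma\circ\psi_k}} S^*_{(\lambda\alpha)_k^{\sigma\circ\psi_k}} S^*_{(\lambda\alpha)_1^{\sigma\circ\psi_k}\cdots(\lambda\alpha)_{k-1}^{\sigma\circ\psi_k}} \otimes S_{(\lambda\alpha)_1^{\sigma\circ\psi_k}} \otimes \cdots \otimes S_{(\lambda\alpha)_{k-1}^{\sigma\circ\psi_k}}
\]
which appears in the second term on the right-hand side of (20) for $\lambda\alpha \in \Lambda^{1_k}$ and $\sigma\circ\psi_k \in \Sigma_k$. We have $(-1)^{\psi_k} = (-1)^{k-1}$, and hence $(-1)^{\sigma\circ\psi_k}(-1)^k = -(-1)^\sigma$. By definition of $\psi_k$, we have $(\lambda\alpha)_k^{\sigma\circ\psi_k} = \alpha$, and $(\lambda\alpha)_j^{\sigma\circ\psi_k} = \mu_{j+1}^\sigma$ for $1 \le j \le k-1$. Hence, we may rewrite
\[ x(\lambda, \sigma\circ\psi_k, \alpha) = -(-1)^\sigma\, S_\alpha S_\alpha^* S^*_{\mu_2^\sigma\cdots\mu_k^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma}. \]
By the Cuntz-Krieger relation, we have
\[ \sum_{\alpha\in s(\lambda)\Lambda^{e_{\sigma(1)}}} S_\alpha S_\alpha^* = p_{s(\lambda)} = p_{s(\mu_k^\sigma)}, \]
and hence
\[ (-1)^\sigma\, S^*_{\mu_2^\sigma\cdots\mu_k^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} + \sum_{\alpha\in s(\lambda)\Lambda^{e_{\sigma(1)}}} x(\mu_2^\sigma\cdots\mu_k^\sigma, \sigma\circ\psi_k, \alpha) = 0. \tag{23} \]
The single-exit condition and the unique factorisation property guarantee that each
\[ S^*_{\mu_2^\sigma\cdots\mu_k^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} \]
occurs exactly once in the first summand of the right-hand side of (20) as $\mu$ ranges over $\Lambda^{1_k}$ and $\sigma$ ranges over $\Sigma_k$. The factorisation property shows that for fixed $\mu$ and $\sigma$, a term $x(\lambda, \sigma', \alpha)$ is of the form $x \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma}$ for some $x \in C^*(\Lambda)$ only if $\sigma' = \sigma\circ\psi_k$, $\lambda = \mu_2^\sigma\cdots\mu_k^\sigma$ and $\alpha \in s(\lambda)\Lambda^{e_{\sigma(1)}}$. Hence we may formally rewrite
\[
b(c_k) = \sum_{\mu\in\Lambda^{1_k}}\sum_{\sigma\in\Sigma_k} \Big( (-1)^\sigma\, S^*_{\mu_2^\sigma\cdots\mu_k^\sigma} \otimes S_{\mu_2^\sigma} \otimes \cdots \otimes S_{\mu_k^\sigma} + \sum_{\alpha\in s(\mu_k^\sigma)\Lambda^{e_{\sigma(1)}}} x(\mu_2^\sigma\cdots\mu_k^\sigma, \sigma\circ\psi_k, \alpha) \Big),
\]
which formally collapses to zero by (23).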

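One can check relatively easily, using the approximate identity $\sum_{\mu\in\Lambda^{1_k}} S_\mu S_\mu^*$ for $C^*(\Lambda)$, that the infinite sums involved in the definition of $c_k$ and the formal calculations in this proof make sense in the multiplier algebra of the $(k+1)$-fold tensor power of $C^*(\Lambda)$.

Closedness. To show that for all $a_1, \ldots, a_k \in A$ we have
\[ \tilde\tau_\omega\big(\Gamma[\mathcal{D}, a_1]\cdots[\mathcal{D}, a_k](1+\mathcal{D}^2)^{-k/2}\big) = 0, \]
it suffices to prove the result for generators of the algebra. So let $T_{\mu_j,\nu_j} = S_{\mu_j}S_{\nu_j}^*$, $j = 1, \ldots, k$, be generators. Then
\[ [\mathcal{D}, T_{\mu_j,\nu_j}] = \gamma(i d_j)\, T_{\mu_j,\nu_j} = i\sum_{m=1}^{k} \gamma^m n_{m,j}\, T_{\mu_j,\nu_j}, \]
where $d_j = (n_{1,j}, \ldots, n_{k,j})$ is the degree of $T_{\mu_j,\nu_j}$. With this notation we have
\[
\begin{aligned}
& \tilde\tau_\omega\big(\Gamma[\mathcal{D}, T_{\mu_1,\nu_1}]\cdots[\mathcal{D}, T_{\mu_k,\nu_k}](1+\mathcal{D}^2)^{-k/2}\big) \\
&\qquad = i^p\, \tilde\tau_\omega\Big(\Gamma\Big(\sum_{j_1}\gamma^{j_1}n_{j_1,1}\Big)\cdots\Big(\sum_{j_k}\gamma^{j_k}n_{j_k,k}\Big) T_{\mu_1,\nu_1}\cdots T_{\mu_k,\nu_k}(1+\mathcal{D}^2)^{-k/2}\Big).
\end{aligned}
\tag{24}
\]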

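Now $\Gamma = \omega_{\mathbb{C}} \otimes 1$, where $\omega_{\mathbb{C}}$ is (the representation of) the complex volume form in the Clifford algebra. Since the only products of generators of the Clifford algebra with nonzero trace are multiples of the identity, the only surviving terms on the right-hand side of Eq. (24) when we expand the products are those with precisely one of each generator $\gamma^j$. Thus
\[
\begin{aligned}
& \tilde\tau_\omega\big(\Gamma[\mathcal{D}, T_{\mu_1,\nu_1}]\cdots[\mathcal{D}, T_{\mu_k,\nu_k}](1+\mathcal{D}^2)^{-k/2}\big) \\
&\qquad = i^p\, \tilde\tau_\omega\Big(\Gamma\sum_{\sigma\in S_k} \gamma^{\sigma(1)}n_{\sigma(1),1}\cdots\gamma^{\sigma(k)}n_{\sigma(k),k}\, T_{\mu_1,\nu_1}\cdots T_{\mu_k,\nu_k}(1+\mathcal{D}^2)^{-k/2}\Big) \\
&\qquad = i^{p-[(p+1)/2]}\, \tilde\tau_\omega\Big(\sum_{\sigma\in S_k} (-1)^\sigma n_{\sigma(1),1}\cdots n_{\sigma(k),k}\, T_{\mu_1,\nu_1}\cdots T_{\mu_k,\nu_k}(1+\mathcal{D}^2)^{-k/2}\Big) \\
&\qquad = i^{p-[(p+1)/2]}\, \det(n_{j,k})\, \tilde\tau_\omega\big(T_{\mu_1,\nu_1}\cdots T_{\mu_k,\nu_k}(1+\mathcal{D}^2)^{-k/2}\big).
\end{aligned}
\]
Now, the trace $\tilde\tau_\omega(T_{\mu_1,\nu_1}\cdots T_{\mu_k,\nu_k}(1+\mathcal{D}^2)^{-k/2}) = \tau(T_{\mu_1,\nu_1}\cdots T_{\mu_k,\nu_k})$ is zero unless $T_{\mu_1,\nu_1}\cdots T_{\mu_k,\nu_k} \in F$, since $\tau$ is gauge invariant. This is equivalent to
\[ \sum_{j=1}^{k} d_j = 0 \iff \forall l,\ \sum_{m=1}^{k} n_{l,m} = 0. \]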
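The determinant in the previous display arises from the permutation (Leibniz) expansion, and the vanishing-row-sum condition forces it to be zero. A minimal Python sketch of this step follows; the function name `leibniz_det` and the sample matrix `degrees` are assumptions made for illustration, not data from the text.

```python
from itertools import permutations


def leibniz_det(n):
    """Determinant via sum over sigma of (-1)^sigma * prod_i n[sigma(i)][i]."""
    k = len(n)
    total = 0
    for sigma in permutations(range(k)):
        inversions = sum(
            1 for a in range(k) for b in range(a + 1, k) if sigma[a] > sigma[b]
        )
        term = -1 if inversions % 2 else 1
        for col in range(k):
            term *= n[sigma[col]][col]
        total += term
    return total


# Hypothetical degree matrix: entry [l][m] is the l-th component of d_m,
# chosen so that every row sums to zero (the degrees d_1 + ... + d_k = 0).
degrees = [
    [2, -3, 1],
    [0, 5, -5],
    [-4, 1, 3],
]
print(leibniz_det(degrees))
```

Because each row sums to zero, the columns add up to the zero vector, so one column is a linear combination of the others and the expansion cancels.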

Hence the first, say, column of the matrix $(n_{j,k})$ is a linear combination of the other columns, and $\det(n_{j,k}) = 0$. Hence for any generators $T_{\mu_j,\nu_j} = S_{\mu_j}S_{\nu_j}^*$, we have
\[ \tilde\tau_\omega\big(\Gamma[\mathcal{D}, T_{\mu_1,\nu_1}]\cdots[\mathcal{D}, T_{\mu_k,\nu_k}](1+\mathcal{D}^2)^{-k/2}\big) = 0. \]

Reality. We take the complex Clifford algebra $\mathrm{Cliff}_k$ to be generated by $k$ elements $\gamma^j$, $j = 1, \ldots, k$ such that $(\gamma^j)^* = -\gamma^j$ and $\gamma^j\gamma^l + \gamma^l\gamma^j = -2\delta_{l,j}\,\mathrm{Id}$. We make some further specifications on the generators consistent with these conventions. Denote by $\mathbf{j}$ the antilinear operator on $X$ such that
\[ \mathbf{j}x = \mathbf{j}\begin{pmatrix} x_1 \\ \vdots \\ x_{2^{[k/2]}} \end{pmatrix} = \begin{pmatrix} x_1^* \\ \vdots \\ x_{2^{[k/2]}}^* \end{pmatrix}. \]
Let $s(k) = [\tfrac{k}{2}](k+1) - k$ and label the generators of the Clifford algebra so that
\[ \overline{\gamma^j} = \mathbf{j}\gamma^j\mathbf{j} = \begin{cases} (-1)^{s(k)}\gamma^j & j \text{ odd}, \\ (-1)^{s(k)+1}\gamma^j & j \text{ even}. \end{cases} \]
Observe that $s(k)$ is even only when $k = 4n$, so except for these dimensions the odd generators have complex entries and are invariant under transpose, while the even generators have real entries and are antisymmetric. In dimensions $4n$ the situation is of course reversed. Let $\chi = \gamma^2\gamma^4\cdots\gamma^{2[k/2]}$ be the product of the even generators (take $\chi = 1$ when $k = 1$). Since the entries of $\chi$ are real for all $k$ (if $k = 4n$ there are $2n$ factors in $\chi$ and so the entries of $\chi$ are real) we have $\bar\chi = \chi$.



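Using $(\gamma^j)^* = -\gamma^j$ we find $\chi^* = (-1)^{[k/2]([k/2]+1)/2}\chi$. We then define $J := \chi\circ\mathbf{j} = \mathbf{j}\circ\chi$.

Lemma 4.3. The operator $J$ satisfies $J^2 = \epsilon$, $J\mathcal{D} = \epsilon'\mathcal{D}J$ and for $k$ even $J\Gamma = \epsilon''\Gamma J$, where $\epsilon, \epsilon', \epsilon''$ are given in the table in Eq. (17).

Proof. To check the sign $\epsilon$, one needs only $J^*J = 1$ (which is straightforward) and
\[ J^* = \mathbf{j}^*\circ\chi^* = (-1)^{[k/2]([k/2]+1)/2}\,\mathbf{j}\circ\chi = (-1)^{[k/2]([k/2]+1)/2} J. \]
The sign $\epsilon$ can now be easily checked. The sign $\epsilon''$, in even dimensions, arises because $\mathbf{j}$ preserves the $\pm 1$ eigenspace decomposition of $\omega_{\mathbb{C}}$, and so commutes with $\omega_{\mathbb{C}}$, while $\omega_{\mathbb{C}}\chi = (-1)^{k/2}\chi\omega_{\mathbb{C}}$. For $\epsilon'$ this is more subtle. We require the straightforward identity $\mathbf{j}\Phi_n\mathbf{j} = \Phi_{-n}$ which may be checked on generators. Then we compute
\[
\begin{aligned}
J\mathcal{D}\Phi_n J^*
&= J\Big(\sum_j i\gamma^j n_j\Big)\Phi_n J^*
= \chi\Big(\sum_j -i\,\mathbf{j}\gamma^j\mathbf{j}\, n_j\Big)\mathbf{j}\Phi_n J^* \\
&= -i\chi\Big(\sum_{j\ \mathrm{odd}} (-1)^{s(k)}\gamma^j n_j + \sum_{j\ \mathrm{even}} (-1)^{s(k)+1}\gamma^j n_j\Big)\mathbf{j}\Phi_n J^* \\
&= -i(-1)^{[(k+1)/2](k+2)}\sum_j \gamma^j n_j\, J\Phi_n J^*
= (-1)^{[(k+1)/2](k+2)}\,\mathcal{D}\Phi_{-n}.
\end{aligned}
\tag{25}
\]
Using the orthogonality of the $\Phi_n$, for any $x \in \operatorname{Dom}\mathcal{D}$ we have
\[ J\mathcal{D}J^*x = \sum_{n\in\mathbb{Z}^k} J\mathcal{D}\Phi_n J^*x = (-1)^{[(k+1)/2](k+2)}\sum_{n\in\mathbb{Z}^k}\mathcal{D}\Phi_{-n}x = (-1)^{[(k+1)/2](k+2)}\,\mathcal{D}x. \]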
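The sign bookkeeping in this proof can be compared mechanically with table (17). The sketch below is only a sanity check under stated assumptions: the dictionary entries are transcribed from the table, the three closed formulas are read off from the proof (with $[\,\cdot\,]$ taken to be the integer part), and all Python names are hypothetical.

```python
# Signs from table (17), indexed by p mod 8 (+1 for "+", -1 for "-").
TABLE_J_SQUARED = {0: 1, 1: 1, 2: -1, 3: -1, 4: -1, 5: -1, 6: 1, 7: 1}
TABLE_J_D = {0: 1, 1: -1, 2: 1, 3: 1, 4: 1, 5: -1, 6: 1, 7: 1}
TABLE_J_GAMMA = {0: 1, 2: -1, 4: 1, 6: -1}  # even case only


def sign_j_squared(k):
    """(-1)^{[k/2]([k/2]+1)/2}, the sign relating J* and J."""
    h = k // 2
    return (-1) ** (h * (h + 1) // 2)


def sign_j_d(k):
    """(-1)^{[(k+1)/2](k+2)}, the sign obtained for J D J*."""
    return (-1) ** (((k + 1) // 2) * (k + 2))


def sign_j_gamma(k):
    """(-1)^{k/2}, from commuting the volume form past chi (k even)."""
    return (-1) ** (k // 2)


def formulas_match_table(max_k=16):
    for k in range(1, max_k + 1):
        if sign_j_squared(k) != TABLE_J_SQUARED[k % 8]:
            return False
        if sign_j_d(k) != TABLE_J_D[k % 8]:
            return False
        if k % 2 == 0 and sign_j_gamma(k) != TABLE_J_GAMMA[k % 8]:
            return False
    return True


print(formulas_match_table())
```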

n∈Zk

The reader will check that the sign appearing here agrees with the values of $\epsilon'$ in the table above.

Theorem 4.4. Let $(\Lambda, d)$ be a connected, locally convex, locally finite $k$-graph with no sources, a faithful $k$-graph trace, satisfying the single exit condition and having finitely generated $K$-theory. Then the gauge spectral triple $(A, \mathcal{H}, \mathcal{D})$ of $\Lambda$ satisfies the (semifinite, nonunital) Conditions 1 to 9.

Acknowledgements. We thank J. Varilly for useful discussions. This work was supported by the ARC and the Danish Research Council.

References

[BPRS] Bates, T., Pask, D., Raeburn, I., Szymański, W.: The C∗-algebras of row-finite graphs. New York J. Math. 6, 307–324 (2000)
[CPS2] Carey, A., Phillips, J., Sukochev, F.: Spectral flow and Dixmier traces. Adv. in Math. 173, 68–113 (2003)
[CPRS1] Carey, A., Phillips, J., Rennie, A., Sukochev, F.: The Hochschild class of the Chern character of semifinite spectral triples. J. Funct. Anal. 213, 111–153 (2004)
[CPRS2] Carey, A., Phillips, J., Rennie, A., Sukochev, F.: The local index theorem in semifinite von Neumann algebras I: spectral flow. Adv. in Math. 202, 451–516 (2006)
[C] Connes, A.: Noncommutative Geometry. London-New York: Academic Press, 1994
[C1] Connes, A.: Gravity coupled with matter and the foundation of noncommutative geometry. Commun. Math. Phys. 182, 155–176 (1996)
[C2] Connes, A.: On the spectral characterization of manifolds. http://arxiv.org/abs/0810.2088v1 [math.OA], 2008
[D] Dixmier, J.: von Neumann Algebras. Amsterdam: North-Holland, 1981
[FK] Fack, T., Kosaki, H.: Generalised s-numbers of τ-measurable operators. Pacific J. Math. 123, 269–300 (1986)
[FPS] Farthing, C., Pask, D., Sims, A.: Crossed products by Zl as higher rank graph C∗-algebras. Houston J. Math. (to appear)
[GGISV] Gayral, V., Gracia-Bondía, J.M., Iochum, B., Schücker, T., Varilly, J.C.: Moyal planes are spectral triples. Commun. Math. Phys. 246, 569–623 (2004)
[GVF] Gracia-Bondía, J.M., Varilly, J.C., Figueroa, H.: Elements of Noncommutative Geometry. Boston: Birkhauser, 2001
[KP] Kumjian, A., Pask, D.: Higher rank graph C∗-algebras. New York J. Math. 6, 1–20 (2000)
[KPR] Kumjian, A., Pask, D., Raeburn, I.: Cuntz-Krieger algebras of directed graphs. Pacific J. Math. 184, 161–174 (1998)
[KPRR] Kumjian, A., Pask, D., Raeburn, I., Renault, J.: Graphs, groupoids and Cuntz-Krieger algebras. J. Funct. Anal. 144, 505–541 (1997)
[L] Lance, E.C.: Hilbert C∗-Modules. Cambridge: Cambridge University Press, 1995
[M] Mallios, A.: Topological Algebras, Selected Topics. London: Elsevier Science Publishers B.V., 1986
[P] Plymen, R.J.: Strong Morita equivalence, spinors and symplectic spinors. J. Operator Th. 16, 305–324 (1986)
[PR] Pask, D., Raeburn, I.: On the K-theory of Cuntz-Krieger algebras. Publ. RIMS, Kyoto Univ. 32, 415–443 (1996)
[PRen] Pask, D., Rennie, A.: The noncommutative geometry of graph C∗-algebras I: the index theorem. J. Funct. Anal. 233, 92–134 (2006)
[PRS] Pask, D., Rennie, A., Sims, A.: The noncommutative geometry of k-graph C∗-algebras. J. K-Theory 1, 259–304 (2008)
[R] Raeburn, I.: Graph Algebras, CBMS Regional Conference Series in Mathematics, Vol. 103, Providence, RI: Amer. Math. Soc., 2005
[RW] Raeburn, I., Williams, D.P.: Morita Equivalence and Continuous-Trace C∗-Algebras, Math. Surveys & Monographs, vol. 60, Providence, RI: Amer. Math. Soc., 1998
[RS] Reed, M., Simon, B.: Volume I: Functional Analysis, Volume II: Fourier Analysis, Self-Adjointness. New York: Academic Press, 1980
[RSz] Raeburn, I., Szymanski, W.: Cuntz-Krieger algebras of infinite graphs and matrices. Trans. Amer. Math. Soc. 356, 39–59 (2004)
[R1] Rennie, A.: Smoothness and locality for nonunital spectral triples. K-Theory 28, 127–165 (2003)
[R2] Rennie, A.: Summability for nonunital spectral triples. K-Theory 31, 71–100 (2004)
[RV] Rennie, A., Varilly, J.: Reconstruction of manifolds in noncommutative geometry. http://arxiv.org/abs/math/0610418v4 [math.OA], 2008
[RLL] Rørdam, M., Larsen, F., Laustsen, N.J.: An Introduction to K-Theory and C∗-Algebras. LMS Student Texts, 49, Cambridge: Cambridge Univ. Press, 2000
[S] Schweitzer, L.B.: A short proof that Mn(A) is local if A is local and Fréchet. Int. J. Math. 3, 581–589
[T] Tomforde, M.: The ordered K0-group of a graph C∗-algebra. C.R. Math. Acad. Sci. Soc. R. Can 25, 19–25 (2003)

Communicated by A. Connes




Spectral Gap and Transience for Ruelle Operators on Countable Markov Shifts

Van Cyr, Omri Sarig∗

Mathematics Department, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected], [email protected]

Received: 1 December 2008 / Accepted: 30 March 2009
Published online: 25 August 2009 – © Springer-Verlag 2009

Abstract: We find a necessary and sufficient condition for the Ruelle operator of a weakly Hölder continuous potential on a topologically mixing countable Markov shift to act with spectral gap on some rich Banach space. We show that the set of potentials satisfying this condition is open and dense for a variety of topologies. We then analyze the complement of this set (in a finer topology) and show that among the three known obstructions to spectral gap (weak positive recurrence, null recurrence, transience), transience is open and dense, and null recurrence and weak positive recurrence have empty interior.

1. Introduction

1.1. Overview. Thermodynamic formalism is a branch of ergodic theory which studies, for a given dynamical system $T : X \to X$ and a given function $\phi : X \to \mathbb{R}$, the existence and properties of invariant probability measures $\mu_\phi$ which maximize the quantity $h_\mu(T) + \int\phi\,d\mu$ (“equilibrium measures”). The key tool is the Ruelle operator,
\[ (L_\phi f)(x) = \sum_{Ty=x} e^{\phi(y)} f(y). \tag{1.1} \]
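To make (1.1) concrete, here is a minimal Python sketch for the simplest possible setting, which is an assumption of this example and not the generality of the paper: a finite state set, and functions $\phi$ and $f$ that depend only on the first coordinate $x_0$. The preimages of $x$ under the shift are then the points $(b, x_0, x_1, \ldots)$ with $b \to x_0$ allowed, so $L_\phi f$ again depends only on $x_0$. The names `ruelle_operator` and `golden_mean` are hypothetical.

```python
import math


def ruelle_operator(transitions, phi, f):
    """Return L_phi f as a list indexed by the first coordinate a = x_0.

    transitions[b][a] == 1 means the step b -> a is allowed.
    phi and f are lists giving the values of functions of x_0 only.
    """
    states = range(len(transitions))
    return [
        sum(math.exp(phi[b]) * f[b] for b in states if transitions[b][a])
        for a in states
    ]


# Golden mean shift on two states: the step 1 -> 1 is forbidden.
golden_mean = [
    [1, 1],
    [1, 0],
]
image = ruelle_operator(golden_mean, phi=[0.0, 0.0], f=[1.0, 1.0])
print(image)
```

With $\phi = 0$ and $f = 1$ the operator simply counts preimages, which gives a quick way to sanity-check an implementation.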

Under fairly mild conditions, if $L_\phi$ acts with spectral gap on some sufficiently rich Banach space $\mathcal{L}$, then $\mu_\phi$ exists, and quite a lot can be said about its properties (see the books [B,HH,PP,R], or Theorem 1.1).

∗ O.S. was partially supported by an NSF grant DMS-0652966 and by an Alfred P. Sloan Research Fellowship.


Here we ask how large is the set of functions $\phi : X \to \mathbb{R}$ for which such a space $\mathcal{L}$ can be found. We study this question within the cadre of countable Markov shifts, and weakly Hölder continuous functions $\phi : X \to \mathbb{R}$ (see below). We
(a) identify a necessary and sufficient condition on $\phi$ for the existence of a Banach space on which $L_\phi$ acts with spectral gap;
(b) analyze the topological structure of the set of functions $\phi$ which satisfy this condition;
(c) compare the topological properties of the various obstructions to this condition, and figure out which obstruction is the most important.

1.2. Setting. Let $S$ be a countable set, and $A = (t_{ij})_{S\times S}$ be a matrix of zeroes and ones. The countable Markov shift (CMS) with set of states $S$ and transition matrix $A$ is the dynamical system $T : X \to X$, where
\[ X := \{(x_0, x_1, \ldots) \in S^{\mathbb{N}\cup\{0\}} : t_{x_i x_{i+1}} = 1 \text{ for all } i\}, \quad\text{and}\quad T(x)_i := x_{i+1}. \]
We think of $X$ as the collection of one-sided infinite admissible paths on a directed graph with vertices $v \in S$, and edges $v_1 \to v_2$ ($v_1, v_2 \in S$, $t_{v_1v_2} = 1$). We equip $X$ with the metric $d(x,y) = 2^{-t(x,y)}$, $t(x,y) := \inf\{k : x_k \ne y_k\}$ (where $\inf\emptyset := \infty$). The resulting topology is generated by the cylinder sets
\[ [a_0, \ldots, a_{n-1}] := \{x \in X : x_i = a_i,\ i = 0, \ldots, n-1\} \quad (a_0, \ldots, a_{n-1} \in S,\ n \ge 1). \]
A word $a \in S^n$ is called admissible if the cylinder it defines is non-empty. The length of an admissible word $a = (a_0, \ldots, a_{n-1})$ is $|a| := n$.

We assume throughout that $T : X \to X$ is topologically mixing. This is the case when for any two states $a, b$ there is an $N(a,b)$ such that for all $n \ge N(a,b)$ there is an admissible word of length $n$ which starts at $a$ and ends at $b$.

Next we consider real valued functions $\phi : X \to \mathbb{R}$. We define the variations of a function $\phi : X \to \mathbb{R}$ to be the numbers
\[ \mathrm{var}_n(\phi) := \sup\{|\phi(x) - \phi(y)| : x_0^{n-1} = y_0^{n-1}\}, \]
where here and throughout $z_m^n := (z_m, \ldots, z_n)$. We say that $\phi$ has summable variations, if $\sum_{n\ge 2}\mathrm{var}_n\phi < \infty$. We say that $\phi$ is $\theta$-weakly Hölder continuous for $0 < \theta < 1$, if there exists $A_\phi > 0$ such that $\mathrm{var}_n(\phi) \le A_\phi\theta^n$ for all $n \ge 2$. A weakly Hölder continuous function is Hölder (with respect to the metric defined above) iff it is bounded. A bounded $\theta$-weakly Hölder function is called $\theta$-Hölder.

The Birkhoff sums of a function $\phi$ are denoted by $\phi_n := \sum_{k=0}^{n-1}\phi\circ T^k$. Suppose $\phi$ has summable variations and $X$ is topologically mixing. The Gurevich pressure of $\phi$ is the limit
\[ P_G(\phi) = \lim_{n\to\infty}\frac{1}{n}\log Z_n(\phi,a), \quad\text{where } Z_n(\phi,a) = \sum_{T^nx=x} e^{\phi_n(x)}1_{[a]}(x), \text{ and } a \in S. \]
This limit is independent of $a$, and if $\sup\phi < \infty$, then it is equal to $\sup\{h_\mu(T) + \int\phi\,d\mu\}$, where the supremum ranges over all invariant probability measures such that the sum is not of the form $\infty - \infty$ [S1].
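For intuition, the partition function $Z_n(\phi,a)$ can be computed exactly in a toy case. The sketch below assumes a finite transition matrix and the zero potential $\phi = 0$ (both assumptions of this example only), in which case $Z_n(0,a)$ counts the points of period $n$ in $[a]$ and equals the diagonal entry $(A^n)_{aa}$. The helper names `mat_mul`, `partition_function` and `pressure_estimate` are hypothetical.

```python
import math


def mat_mul(p, q):
    """Multiply two square integer matrices given as lists of lists."""
    size = len(p)
    return [
        [sum(p[i][m] * q[m][j] for m in range(size)) for j in range(size)]
        for i in range(size)
    ]


def partition_function(transitions, n, a):
    """Z_n(0, a): number of x with T^n x = x and x_0 = a, i.e. (A^n)[a][a]."""
    power = transitions
    for _ in range(n - 1):
        power = mat_mul(power, transitions)
    return power[a][a]


def pressure_estimate(transitions, n, a):
    """Finite-n approximation (1/n) log Z_n(0, a) of the Gurevich pressure."""
    return math.log(partition_function(transitions, n, a)) / n


# Golden mean shift: the limit is the log of the golden ratio.
golden_mean = [[1, 1], [1, 0]]
estimate = pressure_estimate(golden_mean, 400, 0)
exact = math.log((1 + math.sqrt(5)) / 2)
print(estimate, exact)
```

On a finite mixing shift the limit is the logarithm of the Perron eigenvalue of the weighted matrix; the countable-state case treated in the paper is where this limit can behave in the more delicate ways discussed below.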

1.3. The spectral gap property. Recall that the Ruelle operator associated with $\phi$ is the operator $(L_\phi f)(x) := \sum_{Ty=x} e^{\phi(y)}f(y)$. This is well defined for functions $f$ such that the sum converges for all $x \in X$. Let $\mathrm{dom}(L_\phi)$ denote the collection of such functions.

Definition 1.1. Suppose $\phi$ is $\theta$-weakly Hölder continuous, and $P_G(\phi) < \infty$. We say that $\phi$ has the spectral gap property (SGP) if there is a Banach space of continuous functions $\mathcal{L}$ s.t.
(a) $\mathcal{L} \subset \mathrm{dom}(L_\phi)$ and $\mathcal{L} \supseteq \{1_{[a]} : a \in S^n, n \in \mathbb{N}\}$;
(b) $f \in \mathcal{L} \Rightarrow |f| \in \mathcal{L}$, $\||f|\|_{\mathcal{L}} \le \|f\|_{\mathcal{L}}$;
(c) $\mathcal{L}$-convergence implies uniform convergence on cylinders;
(d) $L_\phi(\mathcal{L}) \subseteq \mathcal{L}$, and $L_\phi : \mathcal{L} \to \mathcal{L}$ is bounded;
(e) $L_\phi = \lambda P + N$, where $\lambda = \exp P_G(\phi)$, and $PN = NP = 0$, $P^2 = P$, $\dim\operatorname{Im}P = 1$, and the spectral radius of $N$ is less than $\lambda$;
(f) If $g$ is $\theta$-Hölder, then $L_{\phi+zg} : \mathcal{L} \to \mathcal{L}$ is bounded, and $z \mapsto L_{\phi+zg}$ is analytic on some complex neighborhood of zero.

The motivation is the following (compare with [R,HH,Li,PP,BS,GH,AD]). Suppose X is a topologically mixing CMS, and φ : X → R is a weakly Hölder continuous potential with finite Gurevich pressure, finite supremum, and the SGP. Write L φ = λP + N as above, then Theorem 1.1. P takes the form P f = h f dν, where h ∈ L is positive, and ν is a measure which is finite and positive on all cylinders. The measure dm φ = hdν is a T –invariant probability measure with the following properties: (a) If m φ has finite entropy, then m φ is the unique equilibrium measure of φ. (b) There is a 0 < κ < 1 s.t. for all g ∈ L ∞ (m φ ) and f bounded Hölder continuous, ∃C( f, g) > 0 s.t. | Covm φ ( f, g ◦ T n )| ≤ C( f, g)κ n . (Cov = covariance.) (c) Suppose ψ is a bounded Hölder continuous function such that Em φ [ψ] = 0. If √ ψ = ϕ − ϕ ◦ T with ϕ continuous, then ∃σ > 0 s.t. ψn / n converges in distribution (w.r.t. m φ ) to the normal distribution with mean zero and standard deviation σ. (d) Suppose ψ is a bounded Hölder continuous function, then t → PG (φ + tψ) is real analytic on a neighborhood of zero. We remark that the assumption that m φ has finite entropy is satisfied trivially for all CMS with finite Gurevich entropy PG (0) < ∞. Versions of Theorem 1.1 were shown in a variety of contexts by many people [R,GH,HH,AD,Li,BS,G1] (this is a partial list). The proof in our context is given in Appendix A. 1.4. The problem. When does a potential φ satisfy the SGP? How common is this phenomenon? What are the most important obstructions? If |S| < ∞ then every (weakly) Hölder continuous function has SGP (Ruelle [R]), but this is not the case when |S| = ∞ because of the phenomena of null recurrence, transience [S2], and positive recurrence with sub-exponential decay of correlations [S4] or non–analytic pressure function [S5,Lo,PrS]. Doeblin and Fortet have given sufficient conditions for spectral gap for potentials φ associated to a class of countable Markov chains [DF]. 
Aaronson & Denker had constructed Banach spaces with spectral gap for potentials associated with Gibbs–Markov



measures [AD]. The underlying CMS must satisfy a certain combinatorial condition (the “big images” property). Young had constructed Banach spaces with spectral gap for certain functions $\phi$ on CMS satisfying a different combinatorial condition (“tower structure”), see [Y].

1.5. Notational convention. $a = c \pm \varepsilon$ means $c - \varepsilon < a < c + \varepsilon$, $a = B^{\pm1}c$ means $B^{-1} \le a/c \le B$, and $a_n \asymp c_n$ means that $\exists B$ s.t. $a_n = B^{\pm1}c_n$ for all large $n$.

2. Summary of Results

2.1. A necessary and sufficient condition for SGP. The condition is in terms of the discriminant, a notion which was introduced in [S3]. We recall the definition, and refer the reader to Appendix A for further information. If one induces a CMS on one of its states $a \in S$, then the result is a full shift. It is useful to fix the following notation:
(a) $\overline{S} := \{[\underline{a}] = [a, \xi_1, \ldots, \xi_{n-1}] : n \ge 1,\ \xi_i \ne a,\ [a, \underline{\xi}, a] \ne \emptyset\}$;
(b) $\overline{X} := \overline{S}^{\,\mathbb{N}\cup\{0\}}$, viewed as a countable Markov shift with set of states $\overline{S}$;
(c) $\overline{\pi} : \overline{X} \to [a]$; $\overline{\pi}([\underline{a}^0], [\underline{a}^1], \ldots) = (\underline{a}^0, \underline{a}^1, \ldots)$. This is a conjugacy between the left shift on $\overline{X}$ and the induced (= first return) map on $[a]$.

Every function $\phi : X \to \mathbb{R}$ has an “induced version” $\overline{\phi} : \overline{X} \to \mathbb{R}$ given by
$$\overline{\phi} := \Big(\sum_{k=0}^{\varphi_a - 1} \phi\circ T^k\Big)\circ\overline{\pi}, \quad\text{where } \varphi_a(x) := 1_{[a]}(x)\inf\{n \ge 1 : T^n(x) \in [a]\}.$$

It is easy to see that if $\phi$ is weakly Hölder continuous on $X$, then $\overline{\phi}$ is weakly Hölder continuous on $\overline{X}$ (moreover, $\mathrm{var}_1\overline{\phi} < \infty$ even when $\mathrm{var}_1\phi = \infty$). The $a$–discriminant of $\phi$ is the (possibly infinite) quantity

$$\Delta_a[\phi] := \sup\{P_G(\overline{\phi + p}) : p \in \mathbb{R} \text{ s.t. } P_G(\overline{\phi + p}) < \infty\}.$$
The sign of this number has meaning [S3], see Appendix A. A weakly Hölder continuous function $\phi$ on a topologically mixing countable Markov shift is called strongly positive recurrent (SPR) if it has finite Gurevich pressure and if there is a state $a$ s.t. $\Delta_a[\phi] > 0$. Strong positive recurrence is a generalization of the notion of stable positive recurrence for positive infinite matrices due to Gurevich and Savchenko [GS]. It has its roots in the classical work of Vere-Jones on the problem of geometric ergodicity for Markov chains [VJ].

Theorem 2.1. Suppose $X$ is a topologically mixing CMS, and $\phi : X \to \mathbb{R}$ is weakly Hölder continuous with finite Gurevich pressure, then $\phi$ has the spectral gap property iff $\phi$ is strongly positive recurrent.

That SGP implies SPR is fairly routine, given the results of [S3]. The main part of the theorem is the other direction. It is perhaps useful at this point to explain how to check strong positive recurrence. Define
$$Z_n^*(\phi, a) := \sum_{T^n x = x} e^{\phi_n(x)} 1_{[\varphi_a = n]}(x),$$
and let $R$ denote the radius of convergence of $r_\phi(x) := \sum_{n\ge1} x^n Z_n^*(\phi, a)$, then [S3] proves that either
$$\Big|\Delta_a[\phi] - \log r_\phi(R)\Big| \le \sum_{n=2}^\infty \mathrm{var}_n\phi \quad\text{or}\quad \Delta_a[\phi] = \log r_\phi(R) = \infty.$$
In particular, if $r_\phi(\cdot)$ diverges at its radius of convergence, then $\phi$ is SPR. It is easy to construct examples of $\phi$ with SGP on any topologically mixing CMS: Start with any weakly Hölder continuous $\phi : X \to \mathbb{R}$ with finite pressure, and fix some state $a$. One checks that $r_{\phi + t1_{[a]}}(x) = e^t r_\phi(x)$, thus $\phi + t1_{[a]}$ is strongly positive recurrent for any $t$ large enough.

2.2. SGP is open and dense. Let $\Phi$ denote the collection of weakly Hölder continuous functions $\phi : X \to \mathbb{R}$ with finite Gurevich pressure. There are many different useful topologies on $\Phi$. To list them concisely, fix an infinite sequence $\omega = (\omega_n)_{n\ge1}$, $0 \le \omega_n \le \infty$ and define for a function $f : X \to \mathbb{R}$,

$$\|f\|_\omega := \sup|f| + \sum_{n=1}^\infty \omega_n \mathrm{var}_n(f), \quad\text{where } 0\cdot\infty := 0,$$
$$V(\phi, \varepsilon) := \{\phi' \in \Phi : \|\phi' - \phi\|_\omega < \varepsilon\}.$$
The $\omega$–topology is the topology generated by $V(\phi, \varepsilon)$ ($\varepsilon > 0$, $\phi \in \Phi$). The choice $\omega = (0, 0, \ldots)$ is useful for the study of perturbations in the sup norm. Other important choices are $\omega = (0, \ldots, 0, \infty, \infty, \ldots)$ (finite memory), $\omega = (0, 1, 1, \ldots)$ (summable variations), and $\omega = (0, \theta^{-1}, \theta^{-2}, \ldots)$ (Hölder).

Theorem 2.2. The set of $\phi \in \Phi$ with the spectral gap property is open and dense in $\Phi$ with respect to the $\omega$–topology, for any $\omega = (\omega_n)_{n\ge1}$.

In particular, the spectral gap property is stable under perturbations in $\Phi$ with sufficiently small sup norm ($\omega = (0, 0, \ldots)$); and any $\phi \in \Phi$ can be perturbed to be strongly positive recurrent using a perturbation of arbitrarily small Hölder norm, or even finite memory of length one ($\omega = (\infty, \infty, \ldots)$). This means that there is an open and dense set of $\phi \in \Phi$ which satisfy the conclusion of Theorem 1.1. Loosely speaking these are potentials whose thermodynamic formalism is similar to the behavior of thermodynamic systems at equilibrium without a phase transition. The following works contain related results:
(1) Gurevich and Savchenko showed in [GS] that if $\phi \in \Phi$ is “stably positive recurrent” and $\phi$ is Markovian (i.e. $\mathrm{var}_2\phi = 0$), then there is an $\varepsilon > 0$ s.t. any Markovian $\phi' \in \Phi$ s.t. $\|\phi' - \phi\|_\infty < \varepsilon$ is positive recurrent (cf. Appendix A). For Markovian potentials, “stable positive recurrence” can be easily seen to be equivalent to strong positive recurrence.
(2) Gallavotti & Miracle–Sole considered in [GM] multi-dimensional lattice gas models, and showed that in a certain topology there is a dense $G_\delta$–set of interaction potentials whose pressure functional is differentiable.

Next we consider the larger set $\Phi_{SV}$ of all $\phi : X \to \mathbb{R}$ with summable variations and finite Gurevich pressure. Again, we can define the $\omega$–topology on $\Phi_{SV}$ as the topology generated by $\{\phi' \in \Phi_{SV} : \|\phi' - \phi\|_\omega < \varepsilon\}$ for all $\varepsilon > 0$, $\phi \in \Phi_{SV}$.

Theorem 2.2′. Let $\Phi_{SV}$ denote the collection of all $\phi : X \to \mathbb{R}$ with summable variations and finite Gurevich pressure, then $\{\phi \in \Phi_{SV} : \phi \text{ is strongly positive recurrent}\}$ is open and dense in $\Phi_{SV}$ for every $\omega$–topology.
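As a sanity check on the definition of the $\omega$–norm, the following sketch unpacks it for the four choices of $\omega$ listed in this subsection. It uses only the definition and the convention $0\cdot\infty := 0$; the memory length $N$ in the finite-memory case is a label introduced here for illustration and is not the authors' notation.

```latex
\begin{align*}
\omega=(0,0,\dots):\quad
 & \|f\|_\omega=\sup|f|;\\
\omega=(\underbrace{0,\dots,0}_{N},\infty,\infty,\dots):\quad
 & \|f\|_\omega=
   \begin{cases}
     \sup|f|, & \operatorname{var}_n f=0 \text{ for all } n>N,\\
     \infty,  & \text{otherwise};
   \end{cases}\\
\omega=(0,1,1,\dots):\quad
 & \|f\|_\omega=\sup|f|+\sum_{n\ge 2}\operatorname{var}_n f;\\
\omega=(0,\theta^{-1},\theta^{-2},\dots):\quad
 & \|f\|_\omega=\sup|f|+\sum_{n\ge 2}\theta^{-(n-1)}\operatorname{var}_n f .
\end{align*}
```

In the finite-memory case, a perturbation $f$ of $\phi$ with $\|f\|_\omega<\varepsilon$ must therefore satisfy $\operatorname{var}_n f=0$ for all $n>N$, in addition to $\sup|f|<\varepsilon$.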



Obstructions to the SGP. If a potential $\phi \in \Phi$ does not have the spectral gap property, then by Theorem 2.1 it is not strongly positive recurrent, and $\Delta_a[\phi] \le 0$. Potentials with strictly negative discriminant are called transient. Potentials with zero discriminant are divided into two groups: null recurrent, and weakly positive recurrent (see Appendix A for a summary of the definitions and properties of the various modes of recurrence; in particular see Theorem 7.3 to equate the above definition of transience with that in Definition 7.1). We ask whether one of these obstructions is more common, in some sense, than the others. The $\omega$–topologies are too weak to detect the difference between transience, null recurrence, and weak positive recurrence (they are all nowhere dense), so we need to use a stronger topology. The topologies of perturbations of finite support are sufficient for this purpose. To define these topologies, fix a (nonempty) finite collection of states $B = \bigcup_{i=1}^N [a_i]$. The uniform topology localized at $B$ (or just the “$B$–uniform topology”) is the topology generated by the basis
$$U(\phi; \varepsilon, B) := \{\phi' \in \Phi : \|\phi' - \phi\|_\infty < \varepsilon,\ \phi'|_{X\setminus B} = \phi|_{X\setminus B}\} \quad (\varepsilon > 0,\ \phi \in \Phi).$$
Denote the resulting topology by LU($B$).

Theorem 2.3. Let $\Phi(\mathrm{Tr}) := \{\phi \in \Phi : \phi \text{ is transient}\}$. With respect to LU($B$), $\Phi(\mathrm{Tr})$ is open in $\Phi$, and open and dense in $\{\phi \in \Phi : \phi \text{ does not have SGP}\}$.

As a corollary of this theorem and its proof we have the following topological description of the various modes of recurrence in each of the LU($B$)–topologies:
(a) strong positive recurrence: open;
(b) transience: open;
(c) weak positive recurrence and null recurrence: empty interior, contained in the boundaries of the first two sets.

In other words, transience is the most common obstruction to spectral gap.

3. Proof of Theorem 2.1

3.1. Strong Positive Recurrence implies Spectral Gap. Assume w.l.o.g. that $P_G(\phi) = 0$ (otherwise pass to $\phi - P_G(\phi)$, cf. §7.1). Fix some state $a \in S$ s.t. $\Delta_a[\phi] > 0$. By the discriminant theorem (Appendix A, Theorem 7.3), $P_G(\overline{\phi}) = 0$, where the over bar indicates induction on $[a]$. Therefore, by strong positive recurrence, there exists $\varepsilon_a$ such that $0 < P_G(\overline{\phi + 2\varepsilon_a}) < \infty$. This $\varepsilon_a$ must be positive, because $p(t) := P_G(\overline{\phi + t})$ is an increasing function. The function $\phi$ is by assumption weakly Hölder so there exist $0 < \theta < 1$ and $A_\phi > 0$ such that $\mathrm{var}_n\phi \le A_\phi\theta^n$ for all $n \ge 2$. Make $\varepsilon_a$ so small that
$$0 < \theta e^p < 1, \quad\text{where } p := P_G(\overline{\phi + \varepsilon_a}). \tag{3.1}$$
This is possible to do, because $p(t) := P_G(\overline{\phi + t})$ is continuous (being convex and finite) on $(-\infty, 2\varepsilon_a)$ (see (7.4) in Appendix A). Define $\psi := \phi + \varepsilon_a - p1_{[a]}$, then using the properties of $P_G(\cdot)$ listed in Appendix A §7.1 it readily follows that
(1) $P_G(\overline{\psi}) = 0$, because $P_G(\overline{\psi}) = P_G(\overline{\phi + \varepsilon_a} - p) = P_G(\overline{\phi + \varepsilon_a}) - p = 0$;



(2) $\psi$ is strongly positive recurrent, because $P_G(\overline{\psi + \varepsilon_a}) \le P_G(\overline{\phi + 2\varepsilon_a}) < \infty$, and $P_G(\overline{\psi + \varepsilon_a}) = P_G(\overline{\psi} + \varepsilon_a\varphi_a) \ge P_G(\overline{\psi}) + \varepsilon_a = \varepsilon_a > 0$, so $\Delta_a[\psi] > 0$;
(3) $P_G(\psi) = 0$, because $P_G(\overline{\psi}) = 0$ and $\psi$ is (strongly positive) recurrent, see Appendix A, Theorem 7.3, part (1).

Since $\psi$ is SPR, it is positive recurrent (Appendix A, Theorem 7.3). By the generalized Ruelle Perron Frobenius theorem (Appendix A, Theorem 7.2) and the assumption that $P_G(\psi) = 0$, there exists a Borel measure $\nu_0$, finite and positive on cylinders, and a positive continuous function $h_0 : X \to \mathbb{R}$ such that $L_\psi^*\nu_0 = \nu_0$, $L_\psi h_0 = h_0$, and $\int h_0\,d\nu_0 = 1$. Moreover, $\mathrm{var}_1[\log h_0] \le \sum_{\ell\ge2}\mathrm{var}_\ell\phi$. Setting $C_0 := \exp\sum_{\ell\ge2}\mathrm{var}_\ell\phi$, we see that for every $x$, $h_0(x) = C_0^{\pm1}h_0[x_0]$, where $h_0[x_0] := \sup_{[x_0]} h_0$. Define for $x, y \in X$,
$$t(x, y) := \min\{n : x_n \ne y_n\}, \text{ where } \min\emptyset = \infty, \qquad s_a(x, y) := \#\{0 \le i \le t(x, y) - 1 : x_i = y_i = a\}$$
(compare with the notion of “separation time” due to L.-S. Young [Y]). Let $\mathcal{L}$ denote the collection of continuous functions $f : X \to \mathbb{C}$ for which
$$\|f\|_{\mathcal{L}} := \sup_{b\in S}\frac{1}{h_0[b]}\Big[\sup_{x\in[b]}|f(x)| + \sup\big\{|f(x) - f(y)|/\theta^{s_a(x,y)} : x, y \in [b],\ x \ne y\big\}\Big] < \infty.$$
It is clear that $(\mathcal{L}, \|\cdot\|_{\mathcal{L}})$ is a Banach space. We show that $L_\phi(\mathcal{L}) \subseteq \mathcal{L}$, and that $L_\phi : \mathcal{L} \to \mathcal{L}$ is a bounded operator with spectral gap. The proof uses the strengthening of the Ionescu-Tulcea & Marinescu theorem due to Hennion ([HH], Theorem II.5). Suppose there exists a continuous semi-norm $\|\cdot\|_C$ on $\mathcal{L}$ with the following properties:
(A) There is a constant $M > 0$ s.t. $\|L_\phi f\|_C \le M\|f\|_C$ for all $f \in \mathcal{L}$;
(B) Let $\rho(L_\phi)$ denote the spectral radius of $L_\phi : \mathcal{L} \to \mathcal{L}$. There are constants $n_0 \in \mathbb{N}$, $0 < r < \rho(L_\phi)$, and $R > 0$ such that
$$\|L_\phi^{n_0} f\|_{\mathcal{L}} \le r^{n_0}\|f\|_{\mathcal{L}} + R\|f\|_C; \tag{3.2}$$
(C) Every sequence $\{f_n\}_{n\ge1}$ in $\mathcal{L}$ s.t. $\sup\|f_n\|_{\mathcal{L}} \le 1$ has a subsequence $\{f_{n_k}\}_{k\ge1}$ s.t. $\|L_\phi f_{n_k} - g\|_C \to 0$ as $k \to \infty$ for some $g \in \mathcal{L}$.

Hennion’s theorem then says that $\mathcal{L} = F \oplus \mathcal{N}$, where $F, \mathcal{N}$ are $L_\phi$–invariant subspaces such that $\dim(F) < \infty$, $\rho(L_\phi|_{\mathcal{N}}) < \rho(L_\phi)$, and such that every eigenvalue of $L_\phi|_F$ is of modulus $\rho(L_\phi)$. As we shall see below, the theory of equilibrium measures on topologically mixing CMS implies that $\rho(L_\phi) = 1$, that the only eigenvalue on the unit circle is one, and that this eigenvalue is simple. This gives the spectral gap property with $\lambda = 1$, $P$ the eigenprojection of one, and $N := L_\phi(I - P)$. We will apply Hennion’s theorem to $L_\phi : \mathcal{L} \to \mathcal{L}$. The semi–norm we use is
$$\|\cdot\|_C := \|\cdot\|_{L^1(\nu_0)}.$$



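Step 1. (A) holds: $\|\cdot\|_C$ is a continuous semi-norm on $\mathcal{L}$, and there is a constant $M$ such that $\|L_\phi f\|_C \le M\|f\|_C$ for all $f \in \mathcal{L}$.

Proof. To see that $\|\cdot\|_C$ is continuous, suppose that $\|f_n - f\|_{\mathcal{L}} \to 0$. Then $f_n \to f$ pointwise, and $|f_n(x) - f(x)| \le \|f_n - f\|_{\mathcal{L}}\,h_0[x_0] \le C_0\|f_n - f\|_{\mathcal{L}}\,h_0(x)$ at every point. Since $h_0 \in L^1(\nu_0)$, $\|f_n - f\|_C = \int|f_n - f|\,d\nu_0 \to 0$. Next fix $f \in \mathcal{L}$. Then $|f| \le C_0\|f\|_{\mathcal{L}}h_0$. The identity $\phi = \psi - \varepsilon_a + p1_{[a]} \le \psi + p - \varepsilon_a$ shows that $|L_\phi f| \le e^{p-\varepsilon_a}L_\psi(C_0\|f\|_{\mathcal{L}}h_0) = C_0e^{p-\varepsilon_a}\|f\|_{\mathcal{L}}h_0$. Integrating w.r.t. $\nu_0$, we get $\|L_\phi f\|_C \le C_0e^{p-\varepsilon_a}\|f\|_{\mathcal{L}}$ and the step follows with $M := C_0\exp(p - \varepsilon_a)$.

Step 2. Proof of (3.2).

Proof. We need some notation. For every $b \in S$, set $P^n(b) := \{\underline{p} = (p_0, \ldots, p_{n-1}) : (\underline{p}, b) \text{ is admissible}\}$. For every $\underline{p} = (p_0, \ldots, p_{n-1})$ admissible, let $n(\underline{p}) := \#\{0 \le i \le n - 1 : p_i = a\}$, and set $P_k^n(b) := \{\underline{p} \in P^n(b) : n(\underline{p}) \ge k + 1\}$. In what follows we fix $k$ (to be determined later), and estimate $\|L_\phi^n f\|_{\mathcal{L}}$ for arbitrary $f \in \mathcal{L}$ and $n \ge 1$.

Part 1. Analysis of $\sup_{x\in[b]}|(L_\phi^n f)(x)|$ ($b \in S$). Suppose $x \in [b]$. Since $\phi = \psi - \varepsilon_a + p1_{[a]}$ and $|f| \le C_0\|f\|_{\mathcal{L}}h_0$,
$$\begin{aligned}
|(L_\phi^n f)(x)| &\le \sum_{\underline{p}\in P^n(b)} e^{\phi_n(\underline{p}x)}|f(\underline{p}x)| = \sum_{\underline{p}\in P^n(b)} e^{\psi_n(\underline{p}x) - n\varepsilon_a + p\,n(\underline{p})}|f(\underline{p}x)|\\
&\le \sum_{\underline{p}\in P_k^n(b)} e^{\psi_n(\underline{p}x) - n\varepsilon_a + p\,n(\underline{p})}|f(\underline{p}x)| + C_0e^{kp - n\varepsilon_a}\|f\|_{\mathcal{L}}\sum_{\underline{p}\in P^n(b)\setminus P_k^n(b)} e^{\psi_n(\underline{p}x)}h_0(\underline{p}x)\\
&\le \sum_{\underline{p}\in P_k^n(b)} e^{\psi_n(\underline{p}x) - n\varepsilon_a + p\,n(\underline{p})}|f(\underline{p}x)| + C_0e^{kp - n\varepsilon_a}\|f\|_{\mathcal{L}}\,h_0[b],
\end{aligned}$$
because the last sum is bounded by $(L_\psi^n h_0)(x) = h_0(x) \le h_0[b]$. Every $\underline{p} \in P_k^n(b)$ admits a unique decomposition $\underline{p} = (\underline{\alpha}, \underline{\beta}, \underline{\gamma})$ with $\underline{\alpha} \in A_k$, $\underline{\beta} \in B$ and $\underline{\gamma} \in C$, where:
$A_k := \{\underline{\alpha} : n(\underline{\alpha}) = k, \text{ and } (\underline{\alpha}, a) \text{ is admissible}\}$,
$B := \{\underline{\beta} : \underline{\beta} \text{ starts at } a, \text{ and } (\underline{\beta}, a) \text{ is admissible}\} \cup \{\text{empty word}\}$,
$C := \{\underline{\gamma} : \underline{\gamma} \text{ contains exactly one } a, \text{ at its beginning, and } (\underline{\gamma}, b) \text{ is admissible}\}$.
Conversely, every triplet $(\underline{\alpha}, \underline{\beta}, \underline{\gamma}) \in A_k \times B \times C$ such that $|\underline{\alpha}| + |\underline{\beta}| + |\underline{\gamma}| = n$ (where $|\underline{w}| :=$ length of $\underline{w}$) gives rise to an element of $P_k^n(b)$. Thus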



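$$\begin{aligned}
|(L_\phi^n f)(x)| \le{}& C_0e^{kp - n\varepsilon_a}\|f\|_{\mathcal{L}}\,h_0[b]\\
&+ \sum_{\alpha+\beta+\gamma=n}\Bigg\{\sum_{\underline{\gamma}\in C,\,|\underline{\gamma}|=\gamma} e^{\psi_\gamma(\underline{\gamma}x) - \gamma\varepsilon_a} \times \sum_{\underline{\beta}\in B,\,|\underline{\beta}|=\beta} e^{\psi_\beta(\underline{\beta}\,\underline{\gamma}x) - \beta\varepsilon_a + p[n(\underline{\beta}\,\underline{\gamma}) + k]}\\
&\qquad\qquad\times \sum_{\underline{\alpha}\in A_k,\,|\underline{\alpha}|=\alpha} e^{\psi_\alpha(\underline{\alpha}\,\underline{\beta}\,\underline{\gamma}x) - \alpha\varepsilon_a}|f(\underline{\alpha}\,\underline{\beta}\,\underline{\gamma}x)|\Bigg\},
\end{aligned} \tag{3.3}$$
with the convention that $\psi_0 \equiv 0$. We estimate the innermost sum. Since $n(\underline{\alpha}) = k$,
$$|f(\underline{\alpha}\,\underline{\beta}\,\underline{\gamma}x)| \le \inf_{[\underline{\alpha},a]}|f| + \|f\|_{\mathcal{L}}\theta^k h_0[\alpha_0] \le \inf_{[\underline{\alpha},a]}|f| + C_0\|f\|_{\mathcal{L}}\theta^k h_0(\underline{\alpha}\,\underline{\beta}\,\underline{\gamma}x).$$
Since $\mathrm{var}_i\phi = \mathrm{var}_i\psi$ for all $i \ge 1$,
$$e^{\psi_\alpha(\underline{\alpha}\,\underline{\beta}\,\underline{\gamma}x)} \le C_0\inf_{[\underline{\alpha},a]} e^{\psi_\alpha}.$$
We can thus estimate the inner sum by
$$\begin{aligned}
& C_0\sum_{\underline{\alpha}\in A_k,\,|\underline{\alpha}|=\alpha}\inf_{[\underline{\alpha},a]} e^{\psi_\alpha - \alpha\varepsilon_a}\inf_{[\underline{\alpha},a]}|f| + C_0\|f\|_{\mathcal{L}}\theta^k e^{-\alpha\varepsilon_a}\sum_{\underline{\alpha}\in A_k,\,|\underline{\alpha}|=\alpha} e^{\psi_\alpha(\underline{\alpha}\,\underline{\beta}\,\underline{\gamma}x)}h_0(\underline{\alpha}\,\underline{\beta}\,\underline{\gamma}x)\\
&\le C_0\sum_{\underline{\alpha}\in A_k,\,|\underline{\alpha}|=\alpha}\inf_{[\underline{\alpha},a]}\big[e^{\psi_\alpha - \alpha\varepsilon_a}|f|\big] + C_0\|f\|_{\mathcal{L}}\theta^k e^{-\alpha\varepsilon_a}h_0[a] \quad (\because (\underline{\beta}\,\underline{\gamma})_0 = a)\\
&\le C_0e^{-\alpha\varepsilon_a}\sum_{\underline{\alpha}\in A_k,\,|\underline{\alpha}|=\alpha}\frac{1}{\nu_0[a]}\int_{[a]} e^{\psi_\alpha(\underline{\alpha}y)}|f(\underline{\alpha}y)|\,d\nu_0(y) + C_0\|f\|_{\mathcal{L}}\theta^k e^{-\alpha\varepsilon_a}h_0[a]\\
&\le C_0e^{-\alpha\varepsilon_a}\sum_{\underline{\alpha}\in A_k,\,|\underline{\alpha}|=\alpha}\frac{1}{\nu_0[a]}\int L_\psi^\alpha\big(1_{[\underline{\alpha},a]}|f|\big)\,d\nu_0 + C_0\|f\|_{\mathcal{L}}\theta^k e^{-\alpha\varepsilon_a}h_0[a]\\
&= \frac{C_0e^{-\alpha\varepsilon_a}}{\nu_0[a]}\sum_{\underline{\alpha}\in A_k,\,|\underline{\alpha}|=\alpha}\int 1_{[\underline{\alpha},a]}|f|\,d\nu_0 + C_0\|f\|_{\mathcal{L}}\theta^k e^{-\alpha\varepsilon_a}h_0[a] \quad (\because L_\psi^*\nu_0 = \nu_0)\\
&\le \frac{C_0e^{-\alpha\varepsilon_a}}{\nu_0[a]}\|f\|_C + C_0h_0[a]\theta^k e^{-\alpha\varepsilon_a}\|f\|_{\mathcal{L}}\\
&\le C_1e^{-\alpha\varepsilon_a}\big[\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big], \quad\text{where } C_1 := C_0\Big(\frac{1}{\nu_0[a]} + h_0[a]\Big).
\end{aligned}$$
Substituting this estimate in (3.3), we see that
$$\begin{aligned}
|(L_\phi^n f)(x)| \le{}& C_0e^{kp - n\varepsilon_a}\|f\|_{\mathcal{L}}\,h_0[b] + \sum_{\alpha+\beta+\gamma=n}\ \sum_{\underline{\gamma}\in C,\,|\underline{\gamma}|=\gamma} e^{\psi_\gamma(\underline{\gamma}x) - \gamma\varepsilon_a}\\
&\times\Bigg[\sum_{\underline{\beta}\in B,\,|\underline{\beta}|=\beta} e^{\psi_\beta(\underline{\beta}\,\underline{\gamma}x) - \beta\varepsilon_a + p[n(\underline{\beta}\,\underline{\gamma}) + k]}\,C_1e^{-\alpha\varepsilon_a}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big)\Bigg].
\end{aligned} \tag{3.4}$$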



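By construction $n(\underline{\beta}\,\underline{\gamma}) = n(\underline{\beta}) + 1$ and $\psi_\beta(\underline{\beta}\,\underline{\gamma}x) = \phi_\beta(\underline{\beta}\,\underline{\gamma}x) + \beta\varepsilon_a - p\,n(\underline{\beta})$, so the sum in the square brackets is
$$\begin{aligned}
&\sum_{\underline{\beta}\in B,\,|\underline{\beta}|=\beta} e^{\phi_\beta(\underline{\beta}\,\underline{\gamma}x) + p(k+1)}\,C_1e^{-\alpha\varepsilon_a}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big)\\
&= C_1e^{-\alpha\varepsilon_a}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big)\cdot e^{p(k+1)}\sum_{\underline{\beta}\in B,\,|\underline{\beta}|=\beta} e^{\phi_\beta(\underline{\beta}\,\underline{\gamma}x)}\\
&\le C_1e^{p(k+1) - \alpha\varepsilon_a}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big)\cdot C_0Z_\beta(\phi, a), \quad\text{where } Z_\beta(\phi, a) := \sum_{T^\beta z = z} e^{\phi_\beta(z)}1_{[a]}(z),
\end{aligned}$$
because $\underline{\beta}, \underline{\gamma}$ start with $a$. We claim that $\sup_\beta Z_\beta(\phi, a) \le 2C_0$: Had there been a $\beta$ with $Z_\beta(\phi, a) > 2C_0$, then we would have had $Z_{n\beta}(\phi, a) \ge [\frac{1}{C_0}Z_\beta(\phi, a)]^n \ge 2^n$, in contradiction to the assumption that $\frac{1}{n}\log Z_n(\phi, a) \to P_G(\phi) = 0$ as $n \to \infty$. Setting $C_2 := 2C_0$, we obtain that the sum in the square brackets in (3.4) is bounded by $C_0C_1C_2e^{(k+1)p - \alpha\varepsilon_a}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big)$. Substituting this in (3.4) gives
$$\begin{aligned}
|(L_\phi^n f)(x)| \le{}& \sum_{\alpha+\beta+\gamma=n}\ \sum_{\underline{\gamma}\in C,\,|\underline{\gamma}|=\gamma} e^{\psi_\gamma(\underline{\gamma}x) - \gamma\varepsilon_a}\,C_0C_1C_2e^{(k+1)p - \alpha\varepsilon_a}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big) + C_0e^{kp - n\varepsilon_a}\|f\|_{\mathcal{L}}\,h_0[b]\\
\le{}& C_0C_1C_2e^{(k+1)p}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big)\sum_{\alpha+\beta+\gamma=n}\frac{C_0e^{-(\alpha+\gamma)\varepsilon_a}}{h_0[a]}\sum_{\underline{\gamma}\in C,\,|\underline{\gamma}|=\gamma} e^{\psi_\gamma(\underline{\gamma}x)}h_0(\underline{\gamma}x) + C_0e^{kp - n\varepsilon_a}\|f\|_{\mathcal{L}}\,h_0[b]\\
\le{}& C_0^2C_1C_2e^{(k+1)p}\big(\|f\|_C + \theta^k\|f\|_{\mathcal{L}}\big)\sum_{\alpha+\beta+\gamma=n} e^{-(\alpha+\gamma)\varepsilon_a}\frac{h_0[b]}{h_0[a]} + C_0e^{kp - n\varepsilon_a}\|f\|_{\mathcal{L}}\,h_0[b]. \quad (\because L_\psi h_0 = h_0)
\end{aligned}$$
It is easy to check that $\sup_{n\in\mathbb{N}}\sum_{\alpha+\beta+\gamma=n} e^{-(\alpha+\gamma)\varepsilon_a} \le \big(\sum_{\ell\ge0} e^{-\ell\varepsilon_a}\big)^2$. Let $C_3 = 1 + \big(\sum_{\ell\ge0} e^{-\ell\varepsilon_a}\big)^2$, then for all $x \in [b]$,
$$|(L_\phi^n f)(x)| \le e^{(k+1)p}\,\frac{C_0^2C_1C_2C_3}{h_0[a]}\Big[\|f\|_C + (\theta^k + e^{-n\varepsilon_a})\|f\|_{\mathcal{L}}\Big]h_0[b]. \tag{3.5}$$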
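The bound on $\sup_n\sum_{\alpha+\beta+\gamma=n}e^{-(\alpha+\gamma)\varepsilon_a}$ that enters the definition of $C_3$ can be checked as follows. This is a sketch of the routine computation the text leaves to the reader: for fixed $n$, a triple of non-negative integers $(\alpha,\beta,\gamma)$ with $\alpha+\beta+\gamma=n$ is determined by the pair $(\alpha,\gamma)$, so the sum is dominated by a product of two geometric series. (The lengths that actually occur in the proof range over a smaller set, which only decreases the sum.)

```latex
\begin{align*}
\sum_{\alpha+\beta+\gamma=n} e^{-(\alpha+\gamma)\varepsilon_a}
 &= \sum_{\substack{\alpha,\gamma\ge 0\\ \alpha+\gamma\le n}}
    e^{-\alpha\varepsilon_a}\,e^{-\gamma\varepsilon_a}
 \le \Big(\sum_{\ell\ge 0}e^{-\ell\varepsilon_a}\Big)^{2}
 = \frac{1}{\bigl(1-e^{-\varepsilon_a}\bigr)^{2}} .
\end{align*}
```

The right-hand side does not depend on $n$, $k$ or $b$, and it is finite because $\varepsilon_a>0$.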

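Part 2. Analysis of the Lipschitz constant of $L_\phi^n f$ on $[b]$. Suppose $x, y \in [b]$. Then
$$\begin{aligned}
|(L_\phi^n f)(x) - (L_\phi^n f)(y)| &\le \sum_{\underline{p}\in P^n(b)}\Big[e^{\phi_n(\underline{p}x)}\big|1 - e^{\phi_n(\underline{p}y) - \phi_n(\underline{p}x)}\big|\,|f(\underline{p}x)| + e^{\phi_n(\underline{p}y)}|f(\underline{p}x) - f(\underline{p}y)|\Big]\\
&\le \sum_{\underline{p}\in P^n(b)} e^{\phi_n(\underline{p}x)}C_4\theta^{t(x,y)}|f(\underline{p}x)| + \sum_{\underline{p}\in P^n(b)} e^{\phi_n(\underline{p}y)}C_0h_0(\underline{p}y)\|f\|_{\mathcal{L}}\,\theta^{n(\underline{p}) + s_a(x,y)},
\end{aligned}$$
where
$$C_4 := \max\Bigg\{1,\ \frac{A_\phi}{1-\theta}\sup_{|\delta|\le\frac{A_\phi}{1-\theta}}\Big|\frac{1 - e^\delta}{\delta}\Big|\Bigg\}, \quad\text{and}\quad A_\phi := \sup\frac{|\phi(x) - \phi(y)|}{\theta^{t(x,y)}}.$$
Thus
$$|(L_\phi^n f)(x) - (L_\phi^n f)(y)| \le \theta^{s_a(x,y)}\Bigg[C_4\sup_{[b]} L_\phi^n|f| + C_0\|f\|_{\mathcal{L}}\sum_{\underline{p}\in P^n(b)} e^{\phi_n(\underline{p}y)}h_0(\underline{p}y)\theta^{n(\underline{p})}\Bigg] =: \theta^{s_a(x,y)}[\mathrm{I} + \mathrm{II}], \tag{3.6}$$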

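where
$$\mathrm{I} := C_4\sup_{[b]} L_\phi^n|f| \le e^{(k+1)p}\,\frac{C_0^2C_1C_2C_3C_4}{h_0[a]}\Big[\|f\|_C + (\theta^k + e^{-n\varepsilon_a})\|f\|_{\mathcal{L}}\Big]h_0[b]$$
by (3.5), and
$$\begin{aligned}
\mathrm{II} &:= C_0\|f\|_{\mathcal{L}}\sum_{\underline{p}\in P^n(b)} e^{\phi_n(\underline{p}y)}\theta^{n(\underline{p})}h_0(\underline{p}y)\\
&= C_0\|f\|_{\mathcal{L}}\sum_{\underline{p}\in P^n(b)} e^{\psi_n(\underline{p}y) - n\varepsilon_a + p\,n(\underline{p})}\theta^{n(\underline{p})}h_0(\underline{p}y)\\
&= C_0\|f\|_{\mathcal{L}}\,e^{-n\varepsilon_a}\sum_{\underline{p}\in P^n(b)} e^{\psi_n(\underline{p}y)}(e^p\theta)^{n(\underline{p})}h_0(\underline{p}y)\\
&\le C_0\|f\|_{\mathcal{L}}\,e^{-n\varepsilon_a}\sum_{\underline{p}\in P^n(b)} e^{\psi_n(\underline{p}y)}h_0(\underline{p}y), \quad\text{because } e^p\theta < 1 \text{ by (3.1)}\\
&\le C_0^2\|f\|_{\mathcal{L}}\,e^{-n\varepsilon_a}h_0[b], \quad\text{because } L_\psi h_0 = h_0 \text{ and } y \in [b]\\
&\le e^{(k+1)p}\,\frac{C_0^2C_1C_2C_3C_4}{h_0[a]}\,e^{-n\varepsilon_a}\|f\|_{\mathcal{L}}\,h_0[b], \quad\text{because } p > 0 \text{ and } C_1C_2C_3C_4 > h_0[a].
\end{aligned}$$
Substituting the estimates for I and II in (3.6), we see that for all $x, y \in [b]$,
$$\frac{|(L_\phi^n f)(x) - (L_\phi^n f)(y)|}{\theta^{s_a(x,y)}} \le e^{(k+1)p}\,\frac{C_0^2C_1C_2C_3C_4}{h_0[a]}\Big[\|f\|_C + (\theta^k + 2e^{-n\varepsilon_a})\|f\|_{\mathcal{L}}\Big]h_0[b]. \tag{3.7}$$

Part 3. Putting everything together to obtain (3.2). Equations (3.5) and (3.7), together with the fact that $C_4 \ge 1$, give
$$\begin{aligned}
\|L_\phi^n f\|_{\mathcal{L}} &\le 3e^{(k+1)p}\,\frac{C_0^2C_1C_2C_3C_4}{h_0[a]}\Big[\|f\|_C + (\theta^k + e^{-n\varepsilon_a})\|f\|_{\mathcal{L}}\Big]\\
&\le 3e^p\,\frac{C_0^2C_1C_2C_3C_4}{h_0[a]}\Big[e^{kp}\|f\|_C + \big((e^p\theta)^k + e^{kp - n\varepsilon_a}\big)\|f\|_{\mathcal{L}}\Big].
\end{aligned} \tag{3.8}$$
At this stage, it is probably useful to recall the definition of the constants $C_i$:
$$C_0 := \exp\sum_{\ell=2}^\infty\mathrm{var}_\ell\phi, \qquad C_1 := C_0\big(h_0[a] + 1/\nu_0[a]\big), \qquad C_2 := 2C_0,$$
$$C_3 := 1 + \Big(\sum_{\ell=0}^\infty e^{-\ell\varepsilon_a}\Big)^2, \qquad C_4 := \max\Bigg\{1,\ \frac{A_\phi}{1-\theta}\sup_{|\delta|\le\frac{A_\phi}{1-\theta}}\Big|\frac{1 - e^\delta}{\delta}\Big|\Bigg\}.$$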



These constants do not depend on $k$ or $n$. Using (3.1), it is no problem to choose first $k$ and then $n_0$ so that
$$\|L_\phi^n f\|_{\mathcal{L}} \le R\|f\|_C + \frac{1}{2}\|f\|_{\mathcal{L}} \text{ for all } n \ge n_0, \quad\text{where } R := \frac{3C_0e^{(k+1)p}}{h_0[a]}\prod_{i=0}^4 C_i. \tag{3.9}$$
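One way to make the choice of $k$ and $n_0$ explicit is sketched below. The shorthand $K := C_0^2C_1C_2C_3C_4/h_0[a]$ is introduced here for readability and does not appear in the original. By (3.8), it suffices to make the coefficient of the $\mathcal{L}$–norm of $f$ at most $1/2$, which can be done by bounding its two summands by $1/4$ each.

```latex
\begin{align*}
&\text{choose } k \text{ with } 3e^{p}K\,(e^{p}\theta)^{k}\le \tfrac14
   &&\text{(possible since } 0<e^{p}\theta<1 \text{ by (3.1))},\\
&\text{then choose } n_0 \text{ with } 3e^{p}K\,e^{kp-n_0\varepsilon_a}\le \tfrac14
   &&\text{(possible since } \varepsilon_a>0\text{)},\\
&\text{so that for all } n\ge n_0:\quad
 \|L_\phi^{n}f\|_{\mathcal L}
   \le 3e^{(k+1)p}K\,\|f\|_{C}+\tfrac12\,\|f\|_{\mathcal L}.
\end{align*}
```

The order matters: $k$ must be fixed first, because the second condition involves $e^{kp}$. The coefficient $3e^{(k+1)p}K$ coincides with the constant $R$ of (3.9).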

In the particular case $n = n_0$, we get (3.2) with $r := 2^{-1/n_0}$. In the next step we shall see that $r < \rho(L_\phi)$.

Step 3. $L_\phi$ is a bounded operator on $\mathcal{L}$ and $\rho(L_\phi) = 1$, thus (B) holds.

Proof. $\|\cdot\|_C \le C_0\|\cdot\|_{\mathcal{L}}$ on $\mathcal{L}$, because for every $f \in \mathcal{L}$, $|f| \le C_0\|f\|_{\mathcal{L}}h_0$, and $\int h_0\,d\nu_0 = 1$. Thus (3.8) implies that $\|L_\phi\| < \infty$, and (3.9) says that $\sup\|L_\phi^n\| < \infty$. It follows that $L_\phi$ is bounded, and that its spectral radius is not larger than one. We claim that the spectral radius is equal to one. Otherwise, there is some $\kappa < 1$ such that $\|L_\phi^n\| = O(\kappa^n)$, and then $|L_\phi^n 1_{[a]}| = O(\kappa^n)$ uniformly on $[a]$. Now $L_\phi^n 1_{[a]} \asymp Z_n(\phi, a)$ uniformly on $[a]$ (Appendix A, Remark 7.1), so this means that $0 = P_G(\phi) = \lim_{n\to\infty}\frac{1}{n}\log Z_n(\phi, a) \le \log\kappa < 0$, a contradiction.

Step 4. Every sequence $\{f_n\}_{n\ge1}$ in $\mathcal{L}$ such that $\sup\|f_n\|_{\mathcal{L}} < \infty$ has a subsequence which converges w.r.t. $\|\cdot\|_C$ to some element of $\mathcal{L}$. Since $\|L_\phi\| < \infty$, (C) holds.

Proof. Let $X_0$ denote the subset of $X$ consisting of all sequences which contain the symbol $a$ infinitely many times. This is a subset of $\nu_0$–full measure, because $\nu_0$ is an ergodic conservative measure which charges every partition set. The function $\delta(x, y) := \theta^{s_a(x,y)}$ is a metric on $X_0$, and $(X_0, \delta)$ is a complete separable metric space. The family $\{f_n\}_{n\ge1}$ is uniformly Lipschitz on partition sets with respect to this metric. By the Arzelà–Ascoli theorem, there is a subsequence $\{f_{n_k}\}_{k\ge1}$ which converges pointwise on $X_0$ to some function $g_0 : X_0 \to \mathbb{C}$. Since $|f_{n_k}(x)| \le C_0(\sup\|f_n\|_{\mathcal{L}})h_0(x)$, and $\int h_0\,d\nu_0 < \infty$, $\int_{X_0}|f_{n_k} - g_0|\,d\nu_0 \to 0$. We show that $\exists g \in \mathcal{L}$ such that $g|_{X_0} = g_0$. Choose points $y^b \in [b] \cap X_0$ ($b \in S$), and define a map $\vartheta : X \to X_0$ by
$$\vartheta(x) := \begin{cases} y^{x_0} & \nexists i \text{ s.t. } x_i = a,\\ (x_0, \ldots, x_k, y_1^a, y_2^a, \ldots) & \exists i \text{ s.t. } x_i = a,\ k := \max\{i : x_i = a\} < \infty,\\ x & \exists \text{ infinitely many } i \text{ s.t. } x_i = a. \end{cases}$$
We claim that for all $x, y \in X$, $s_a(\vartheta(x), y) \ge s_a(x, y)$. If $s_a(x, y) = 0$ or $\vartheta(x) = x$, then there is nothing to prove. Otherwise, $x$ has finitely many coordinates equal to $a$. Let $k := \max\{i : x_i = a,\ x_0^i = y_0^i\}$ and $k' := \max\{i : x_i = a,\ \vartheta(x)_0^i = y_0^i\}$, then $s_a(x, y) = \#\{0 \le j \le k : y_j = a\}$ and $s_a(\vartheta(x), y) = \#\{0 \le j \le k' : y_j = a\}$. By construction, $\vartheta(x)_0^k = x_0^k = y_0^k$, therefore $k' \ge k$ and $s_a(\vartheta(x), y) \ge s_a(x, y)$. Now set $g := g_0\circ\vartheta$. Since $\vartheta|_{X_0} = \mathrm{id}$, $g|_{X_0} = g_0$. If $x \in [b]$, then $\vartheta(x) \in [b]$, so $|g(x)| = |g_0(\vartheta(x))| \le \sup_n|f_n(\vartheta(x))| \le h_0[b]\sup_{n\ge1}\|f_n\|_{\mathcal{L}}$. If $x, y \in [b]$, then
$$|g(x) - g(y)| \le |g_0(\vartheta(x)) - g_0(\vartheta(y))| \le \sup_n|f_n(\vartheta(x)) - f_n(\vartheta(y))| \le \sup_n\|f_n\|_{\mathcal{L}}\cdot h_0[b]\theta^{s_a(\vartheta(x),\vartheta(y))} \le \sup_n\|f_n\|_{\mathcal{L}}\cdot h_0[b]\theta^{s_a(x,y)}.$$
We conclude that $g \in \mathcal{L}$, and that
$$\int_X|f_{n_k} - g|\,d\nu_0 = \int_{X_0}|f_{n_k} - g_0|\,d\nu_0 \to 0.$$



Step 5. $L_\phi : \mathcal{L} \to \mathcal{L}$ satisfies parts (a)–(e) of the spectral gap property.

Proof. It is clear that every element of $\mathcal{L}$ is continuous, and that $\mathcal{L}$ contains all indicators of cylinder sets. Parts (a) and (d) of the spectral gap property were shown in Step 3. Parts (b) and (c) are obvious from the definition of $\|\cdot\|_{\mathcal{L}}$. We prove part (e). The previous steps show that the conditions of Hennion’s theorem are satisfied and that $\rho(L_\phi) = 1$. It follows that $\mathcal{L} = F \oplus \mathcal{N}$, where $L_\phi(F) \subseteq F$, $L_\phi(\mathcal{N}) \subseteq \mathcal{N}$, $F$ is a finite dimensional space, the eigenvalues of $L_\phi|_F$ are all of modulus one, and the spectral radius of $L_\phi|_{\mathcal{N}}$ is strictly less than one. We show that $F = \mathrm{span}\{h\}$ for some function $h$ s.t. $L_\phi h = h$. Once this is done, we let $P : \mathcal{L} \to F$ denote the eigenprojection of the eigenvalue 1, and $N := L_\phi(I - P)$. It is clear that $L_\phi = P + N$, $P^2 = P$, $PN = NP = 0$, and $\dim\mathrm{Im}\,P = \dim F = 1$. To see that $\rho(N) < 1$, we use the facts $\rho(L_\phi|_{\mathcal{N}}) < 1$ and $L_\phi|_F = \mathrm{id}$ to see that $L_\phi^n = P + N^n \to P$, whence
$$\mathcal{N} = \{f \in \mathcal{L} : L_\phi^n f \to 0 \text{ as } n \to \infty\}.$$
It follows that $\mathcal{N} = \ker P$. Thus $N = L_\phi(I - P)$ is equal to zero on $F$ and equal to $L_\phi$ on $\mathcal{N}$. Since $F$ and $\mathcal{N}$ are $L_\phi$–invariant and $\rho(L_\phi|_{\mathcal{N}}) < 1$, $\rho(N) < 1$ and (e) is proved.

Step 5.1. 1 is an eigenvalue of $L_\phi|_F : F \to F$. We construct an eigenfunction. Recall that $\phi$ is (strongly) positive recurrent with pressure zero. By the generalized RPF theorem (Appendix A, Theorem 7.2) there is a positive continuous function $h$ and a Borel measure $\nu$ such that $L_\phi h = h$, $L_\phi^*\nu = \nu$, $\int h\,d\nu = 1$. The measure $d\mu = h\,d\nu$ is known to be an exact invariant probability measure, and for every cylinder $[a]$, $L_\phi^n 1_{[a]} \to h\nu[a]$ pointwise as $n \to \infty$ [S2]. We claim that $h \in \mathcal{L}$. By (3.8), $\sup_{n\ge1}\|L_\phi^n 1_{[a]}\|_{\mathcal{L}} < \infty$. By Step 4, $\exists n_k \to \infty$ such that $L_\phi^{n_k}1_{[a]} \to g \in \mathcal{L}$ in $L^1(\nu_0)$. The limit must agree with the pointwise limit of $L_\phi^{n_k}1_{[a]}$, whence with $h\nu[a]$. Thus $h = (1/\nu[a])g$ $\nu_0$–almost everywhere, whence, by continuity, everywhere. Thus $h \in \mathcal{L}$. We claim that $h \in F$. Write $h = h_1 + h_2$, where $h_1 \in F$ and $h_2 \in \mathcal{N}$. Since $L_\phi h = h$, $h = L_\phi^n h_1 + L_\phi^n h_2$. The first summand stays inside $F$, and the second summand tends to zero in norm, because $\rho(L_\phi|_{\mathcal{N}}) < 1$. It follows that $h \in \overline{F}$. But $\dim F < \infty$ so $\overline{F} = F$. Thus $h \in F$. Since $h \in F$ and $L_\phi h = h$, 1 is an eigenvalue of $L_\phi : F \to F$.

Step 5.2. 1 is the only eigenvalue of $L_\phi|_F : F \to F$. This eigenvalue is simple. By the definition of $F$, all the eigenvalues of $L_\phi|_F : F \to F$ have modulus one. Suppose $f \in F \setminus \{0\}$ and $L_\phi f = e^{i\theta}f$, we show that $e^{i\theta} = 1$ and $f = \mathrm{const}\cdot h$. We claim that $f \in L^1(\nu)$. Since $L_\phi$ is positive, $L_\phi|f| \ge |L_\phi f| = |f|$, whence
$$\sum_{n=1}^N L_\phi^n\big[L_\phi|f| - |f|\big] \le L_\phi^{N+1}|f| \le C_0\sup_n\|L_\phi^n\|\,\|f\|_{\mathcal{L}}\,h_0 \quad\text{for all } N.$$



But $\nu$ is conservative and ergodic and $L_\phi^*\nu = \nu$, so for every $F \ge 0$ such that $\int F\,d\nu \ne 0$, $\sum_n L_\phi^n F = \infty$ ([A], Proposition 1.3.2). Thus $L_\phi|f| = |f|$ $\nu$–almost everywhere. It follows that $|f|$ is an absolutely continuous invariant density of $\nu$. An ergodic conservative measure can have at most one invariant density, so $|f| = \mathrm{const}\cdot h$ $\nu$–a.e., whence $f \in L^1(\nu)$. We claim that $f$ is proportional to $h$ and $e^{i\theta} = 1$. Let $d\mu := h\,d\nu$. Since $f \in L^1(\nu)$, $f/h \in L^1(\mu)$. The transfer operator of $\mu$ is $\widehat{T}\psi = \frac{1}{h}L_\phi(h\psi)$.¹ Since $\mu$ is exact, $\|\widehat{T}^n(f/h) - \int(f/h)\,d\mu\|_{L^1(\mu)} \to 0$ ([A], Theorem 1.3.3). But $\widehat{T}^n(f/h) = e^{in\theta}(f/h)$, so this can only happen if $e^{i\theta} = 1$ and $f/h = \mathrm{const}$ almost everywhere. Since $f, h$ are continuous and $\nu$ has global support, $f/h = \mathrm{const}$.

Step 5.3. $\dim F = 1$. Since the spectrum of $L_\phi|_F$ consists of a single simple eigenvalue equal to one, and since (by construction) $\dim F < \infty$, $F$ has a basis with respect to which $L_\phi : F \to F$ is represented by a $\dim(F) \times \dim(F)$ Jordan block with ones on the diagonal. The iterates of such a matrix diverge when $\dim(F) > 1$ (the $(1, 2)$–entry escapes to infinity). This cannot be the case, because $\sup\|L_\phi^n\| < \infty$ by (3.8). The conclusion is that $\dim(F) = 1$. We conclude that $F = \mathrm{span}\{h\}$, where $L_\phi h = h$. By the discussion above, part (e) of SGP is proved.

Step 6. Proof of part (f) of SGP. Part (f) of SGP says that if $f \in F$ is $\theta$–Hölder, then $z \mapsto L_{\phi+zf}$ is analytic at zero. Write for every $\theta$–Hölder continuous function $g$,

$$\|g\|_\theta := \sup|g| + \sup\{|g(x) - g(y)|/\theta^{t(x,y)} : x, y \in X\}.$$
It is easy to verify that $\|gf\|_{\mathcal{L}} \le \|g\|_\theta\|f\|_{\mathcal{L}}$ for all $f \in \mathcal{L}$. It follows that the operator $M_n : f \mapsto L_\phi(g^n f)$ is bounded, and that $\|M_n\| \le \|L_\phi\|\,\|g\|_\theta^n$. Thus the series $\sum_{n=0}^\infty\frac{z^n}{n!}M_n$ converges absolutely in the operator norm for all $|z| < 1/\|g\|_\theta$. As a result, $L_{\phi+zg} \equiv \sum_{n=0}^\infty\frac{z^n}{n!}M_n$ is analytic on $\{z \in \mathbb{C} : |z| < 1/\|g\|_\theta\}$. This shows (f), and completes the proof of SGP.

¹ The transfer operator of a measure $\mu$ s.t. $\mu\circ T^{-1} \ll \mu$ is the operator $\widehat{T} : L^1(\mu) \to L^1(\mu)$ whose value on a function $f \in L^1(\mu)$ is determined by the condition $\int\varphi\,\widehat{T}f\,d\mu = \int\varphi\circ T\cdot f\,d\mu$ for all test functions $\varphi \in L^\infty(\mu)$.

3.2. Spectral Gap implies Strong Positive Recurrence. Suppose $\phi$ has the spectral gap property, and write $L_\phi = \lambda P + N$ with $\lambda = \exp P_G(\phi)$ and $P, N$ as above. Since $PN = NP = 0$ and $P^2 = P$, $L_\phi^n = \lambda^n P + N^n$. Since the spectral radius of $N$ is less than $\lambda$, $\|\lambda^{-n}N^n\| = O(\kappa^n)$ where $0 < \kappa < 1$. Thus for (any) fixed $x \in [a]$,
$$\lambda^{-n}Z_n(\phi, a) \asymp \lambda^{-n}(L_\phi^n 1_{[a]})(x) = P1_{[a]}(x) + O(\kappa^n)$$
(Appendix A, Remark (7.1)). It is impossible for $P1_{[a]}(x)$ to vanish, because this would imply that $Z_n(\phi, a) = O((\kappa\lambda)^n)$, whereas $\frac{1}{n}\log Z_n(\phi, a) \to \log\lambda$ and $\kappa < 1$. Thus $P1_{[a]}(x) \ne 0$. According to the theory of analytic perturbations of linear operators [K], there exists $\varepsilon > 0$ s.t. every $L : \mathcal{L} \to \mathcal{L}$ which satisfies $\|L - L_\phi\| < \varepsilon$ can be written in the form $L = \lambda(L)P(L) + N(L)$, where $P(L), N(L)$ are bounded linear operators s.t. $P(L)^2 = P(L)$, $\dim\mathrm{Im}\,P(L) = 1$, $N(L)P(L) = P(L)N(L) = 0$, and such that the spectral radius of $N(L)$ is smaller than $|\lambda(L)|$. Moreover, if $\varepsilon > 0$ is sufficiently small, then $L \mapsto \lambda(L), P(L), N(L)$ are analytic on $\{L : \|L - L_\phi\| < \varepsilon\}$. Since $g := 1_{[a]}$ is Hölder continuous, $t \mapsto L_{\phi+tg}$ is real analytic, whence continuous, at zero. So $\exists\delta > 0$ such that if $|t| < \delta$, then $\|L_{\phi+tg} - L_\phi\| < \varepsilon$ and $L_{\phi+tg} = \lambda_tP_t + N_t$, where $\lambda_t := \lambda(L_{\phi+tg})$, $P_t := P(L_{\phi+tg})$, $N_t := N(L_{\phi+tg})$. Since $\mathcal{L}$–convergence implies pointwise convergence, $P_t1_{[a]}(x) \to P1_{[a]}(x)$ as $t \to 0$. We

saw above that for any $x \in [a]$, $P1_{[a]}(x) \ne 0$. Choosing our $\delta$ sufficiently small, we can ensure that $(P_t1_{[a]})(x) \ne 0$ for all $|t| < \delta$ for some $x \in [a]$. We now repeat the argument above for $\phi + tg$ and see that for all $t$ real such that $|t| < \delta$,
$$|\lambda_t|^{-n}Z_n(\phi + tg, a) \asymp |\lambda_t|^{-n}(L_{\phi+tg}^n 1_{[a]})(x) = |(P_t1_{[a]})(x) + o(1)|,$$
whence $|\lambda_t|^{-n}Z_n(\phi + tg, a) \asymp 1$. This implies that for all $|t| < \delta$, $|\lambda_t| = \exp P_G(\phi + tg)$ and $\phi + tg$ is recurrent. By the discriminant theorem, $\Delta_a[\phi + tg] \ge 0$ for all $|t| < \delta$. But $\Delta_a[\phi + tg] = \Delta_a[\phi + t1_{[a]}] = \Delta_a[\phi] + t$ (Appendix A, Lemma 7.1). If this is non-negative for all $|t| < \delta$, then it must be the case that $\Delta_a[\phi] > 0$.

4. Strong Positive Recurrence is Open and Dense

The material in this section relies on the theory of modes of recurrence, which we summarized for the convenience of the reader in Appendix A.

Main Lemma. As we shall see below, it is fairly easy to approximate a recurrent potential by a strongly positive recurrent potential. Here we show that every potential can be approximated by a recurrent potential.

Lemma 4.1. If $\phi \in \Phi$ is a transient potential, $a \in S$, and $\psi$ is a non-positive bounded weakly Hölder function s.t. $\mathrm{supp}\,\psi \subset [a]$, then $\phi + \psi \in \Phi$, $\phi + \psi$ is transient, and $P_G(\phi + \psi) = P_G(\phi)$.

Proof. Since $\phi + \psi \le \phi$ we have $P_G(\phi + \psi) \le P_G(\phi)$. To see the other inequality, we note that since $\phi$ is transient,
$$\begin{aligned}
P_G(\phi) &= \limsup_{n\to\infty}\frac{1}{n}\log Z_n^*(\phi, a) && \text{(Appendix A, (7.6))}\\
&= \limsup_{n\to\infty}\frac{1}{n}\log Z_n^*(\phi + \psi, a) && (\because \mathrm{supp}\,\psi \subset [a] \text{ and } \sup|\psi| < \infty)\\
&\le P_G(\phi + \psi). && (\because (7.5))
\end{aligned}$$
This shows that $P_G(\phi) = P_G(\phi + \psi)$. Using the transience of $\phi$ and the non-positivity of $\psi$, we see that
$$\sum_{n=0}^\infty e^{-nP_G(\phi+\psi)}Z_n(\phi + \psi, a) = \sum_{n=0}^\infty e^{-nP_G(\phi)}Z_n(\phi + \psi, a) \le \sum_{n=0}^\infty e^{-nP_G(\phi)}Z_n(\phi, a) < \infty,$$
so $\phi + \psi$ is transient.
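The middle equality in the pressure computation of Lemma 4.1 rests on a comparison of first-return partition functions, which can be sketched as follows. A point $x$ counted by $Z_n^*(\cdot,a)$ satisfies $T^nx=x$, $x\in[a]$ and $\varphi_a(x)=n$, so the orbit segment $x,Tx,\dots,T^{n-1}x$ meets $[a]$ only at time $0$. Since $\psi$ is supported in $[a]$, only that one term of the Birkhoff sum changes.

```latex
\begin{align*}
(\phi+\psi)_n(x) &= \phi_n(x)+\psi(x),
   \qquad -\sup|\psi|\le \psi(x)\le 0,\\
e^{-\sup|\psi|}\,Z_n^*(\phi,a) &\le Z_n^*(\phi+\psi,a)\le Z_n^*(\phi,a),\\
\limsup_{n\to\infty}\tfrac1n\log Z_n^*(\phi+\psi,a)
   &= \limsup_{n\to\infty}\tfrac1n\log Z_n^*(\phi,a).
\end{align*}
```

The constant $e^{-\sup|\psi|}$ is independent of $n$, so it disappears after taking $\frac1n\log$; this is where the boundedness of $\psi$ is used.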



Lemma 4.2 (Main Lemma). Suppose $\phi \in \Phi$ is transient, then for any $\varepsilon > 0$ there exists a recurrent $\varphi \in \Phi$ so that $\|\varphi - \phi\|_\infty \le \varepsilon$ and $\mathrm{var}_1[\varphi - \phi] = 0$.

Proof. Recall that $S$ denotes the set of states. We write $a \xrightarrow{k} b$ for $a, b \in S$ if there is an admissible word with $k + 1$ symbols which starts with $a$ and ends with $b$. Fix $\varepsilon > 0$ and $b \in S$. We construct finite sets of states $\{c_1^k, \ldots, c_{r_k}^k\}$ ($k \ge 0$) by induction as follows. When $k = 0$, let $r_0 := 1$, and $c_1^0 := b$. Now suppose we have carried the construction for each $\ell < k$. Let $b_1^k, b_2^k, b_3^k, \ldots$ be the list of all different states $c$ for which $b \xrightarrow{\ell} c$ for $\ell \le k$. If this collection is finite, let $r_k$ be its size, and set $\{c_1^k, \ldots, c_{r_k}^k\} := \{b_1^k, \ldots, b_{r_k}^k\}$. If it is infinite, observe that
$$Z_n^*\Big(\phi + \varepsilon\sum_{i=1}^\infty 1_{[b_i^k]},\ b\Big) \ge e^{n\varepsilon}Z_n^*(\phi, b) \quad (1 \le n \le k),$$
since for any $x$ with $T^n x = x$ and $x_0 = b$ we have added an extra factor of $\varepsilon$ to the potential at states $x_0, x_1, \ldots, x_{n-1}$. Therefore we can find $s_k \in \mathbb{N}$ such that
$$Z_n^*\Big(\phi + \varepsilon\sum_{i=1}^{s_k} 1_{[b_i^k]},\ b\Big) \ge e^{n\cdot\frac{\varepsilon}{2}}Z_n^*(\phi, b) \quad (1 \le n \le k). \tag{4.1}$$
We let $\{c_1^k, \ldots, c_{r_k}^k\}$ be the set $\{c_1^{k-1}, \ldots, c_{r_{k-1}}^{k-1}\} \cup \{b_1^k, \ldots, b_{s_k}^k\}$ where, in this case, $r_k$ is the number of different states $c_i^k$ so defined. Set $\phi[0] := \phi$, and define for $k \ge 1$,
$$\phi[k] = \phi + \varepsilon\sum_{i=1}^{r_k} 1_{[c_i^k]}.$$

We interpolate these potentials. Observe that for all $k \ge 1$,
$$\phi[k] = \phi[k-1] + \varepsilon\sum_{i=1}^{m_k} 1_{[d_i^k]}, \quad\text{where } \{d_1^k, \ldots, d_{m_k}^k\} := \{c_1^k, \ldots, c_{r_k}^k\} \setminus \{c_1^{k-1}, \ldots, c_{r_{k-1}}^{k-1}\},$$
with $m_k$ defined by the above identity. Define for $k \ge 1$ and $0 \le i \le m_k$,
$$\phi[k, i] := \phi[k-1] + \varepsilon\sum_{j=1}^i 1_{[d_j^k]}.$$
Then $\phi[k, 0] = \phi[k-1]$, and $\phi[k, m_k] = \phi[k]$. We claim that there must be some $k, i$ such that $\phi[k, i]$ is recurrent. Assume by way of contradiction that this is not the case: $\phi[k, i]$ is transient for all $k, i$. In this case, the sequence
$$\phi[k] = \phi[k, m_k] \ge \phi[k, m_k - 1] \ge \cdots \ge \phi[k, 1] \ge \phi[k-1, m_{k-1}] \ge \cdots$$
is a decreasing sequence of transient potentials where each term is equal to its predecessor minus $\varepsilon$ times the indicator of some partition set. By Lemma 4.1, all terms in the sequence have the same Gurevich pressure. Since the sequence terminates after finitely many steps at $\phi[0] = \phi$,
$$P_G(\phi[k]) = P_G(\phi) \text{ for all } k. \tag{4.2}$$

Consider now the power series
$$t_k(x) := 1 + \sum_{i=1}^\infty Z_i(\phi[k], b)x^i, \qquad r_k(x) := \sum_{i=1}^\infty Z_i^*(\phi[k], b)x^i.$$
Both have radius of convergence $\exp[-P_G(\phi)]$: the first by the definition of the Gurevich pressure and (4.2), and the second because of the assumption that $\phi[k]$ is transient (Appendix A, (7.6)). They are related by the following inequality for all $0 < x < \exp[-P_G(\phi)]$ (Appendix A, (7.2)):
$$\frac{1}{B^2}[t_k(x) - 1] \le t_k(x)r_k(x) \le B^2[t_k(x) - 1], \quad\text{where } B := \exp\sum_{n=2}^\infty\mathrm{var}_n\phi. \tag{4.3}$$
By (4.3), $r_k(x) \le B^2$ for all $0 < x < \exp[-P_G(\phi)]$ and $k \ge 1$. But this cannot be the case, because for $\exp[-P_G(\phi) - \frac{\varepsilon}{2}] < x < \exp[-P_G(\phi)]$,
$$r_k(x) \ge \sum_{n=1}^k Z_n^*(\phi[k], b)x^n \ge \sum_{n=1}^k e^{n\cdot\frac{\varepsilon}{2}}Z_n^*(\phi, b)x^n \ \text{(by (4.1))}\ \xrightarrow[k\to\infty]{}\ \sum_{n=1}^\infty Z_n^*(\phi, b)(e^{\varepsilon/2}x)^n = \infty.$$

This contradiction shows that there must be some $k_0, i_0$ for which $\varphi := \phi[k_0, i_0]$ is recurrent. By construction $\varphi \in \Phi$, $\mathrm{var}_1[\varphi - \phi] = 0$, and $\|\varphi - \phi\|_\infty = \varepsilon$.

Proof of Theorem 2.2. The proof has two parts:
(a) If $\phi \in \Phi$, then for every $\varepsilon > 0$ there is a strongly positive recurrent potential $\varphi \in \Phi$ s.t. $\|\varphi - \phi\|_\infty < \varepsilon$ and $\mathrm{var}_1[\varphi - \phi] = 0$.
(b) The set of strongly positive recurrent potentials is open w.r.t. the sup norm on $\Phi$.
The first part shows that the set of strongly positive recurrent potentials is dense in the strongest possible $\omega$–topology; the second part shows that it is open in the weakest possible $\omega$–topology.

Part 1. Approximating general potentials by strongly positive recurrent potentials. Fix $\phi \in \Phi$ and $\varepsilon > 0$. By Lemma 4.2, there exists a recurrent $\psi \in \Phi$ such that $\|\phi - \psi\|_\infty < \varepsilon/2$ and $\mathrm{var}_1[\phi - \psi] = 0$. We now appeal to the discriminant theorem (Appendix A, Theorem 7.3): Fix some $a \in S$, then the recurrence of $\psi$ implies that $\Delta_a[\psi] \ge 0$. If $\varphi := \psi + \frac{\varepsilon}{2}\cdot 1_{[a]}$, then
$$\Delta_a[\varphi] = \Delta_a[\psi] + \frac{\varepsilon}{2} \quad\text{(Appendix A, Lemma 7.1),}$$

654

V. Cyr, O. Sarig

so $\varphi$ is strongly positive recurrent. It is obvious that $\|\phi - \varphi\|_\infty < \varepsilon$ and $\mathrm{var}_1[\varphi - \phi] = \mathrm{var}_1[\psi - \phi] = 0$.

Part 2. For every strongly positive recurrent $\phi \in \Phi$ there exists a $\delta > 0$ such that if $\varphi \in \Phi$ and $\|\varphi - \phi\|_\infty < \delta$, then $\varphi$ is strongly positive recurrent. We fix some $a \in S$ and work with the induced system on $[a]$, $\overline{X}$, as defined in §2.1. By the definition of the discriminant, if $\phi \in \Phi$ is strongly positive recurrent then there exists $p \in \mathbb{R}$ such that $0 < P_G(\overline{\phi + p}) < \infty$. W.l.o.g. $P_G(\overline{\phi + p'}) < \infty$ for some $p' > p$. The map $x \mapsto P_G(\overline{\phi + x})$ is convex and finite on $(-\infty, p')$, whence continuous on $(-\infty, p]$ (Appendix A (7.4)). It is also strictly increasing (because $\overline{\phi + x + h} \ge \overline{\phi + x} + h$ for all $h > 0$). Hence, there exist numbers $p_1 < p_2$ s.t. $0 < P_G(\overline{\phi + p_1}) < P_G(\overline{\phi + p_2}) < \infty$. Take $p_0 := (p_1 + p_2)/2$ and $\delta := (p_2 - p_1)/2$. If $\varphi \in \Phi$ and $\|\varphi - \phi\|_\infty < \delta$, then $\phi + p_1 \le \varphi + p_0 \le \phi + p_2$, so $0 < P_G(\overline{\phi + p_1}) < P_G(\overline{\varphi + p_0}) < P_G(\overline{\phi + p_2}) < \infty$, proving that $\Delta_a[\varphi] > 0$. This shows that the set of strongly positive recurrent potentials is $\|\cdot\|_\infty$-open.

Proof of Theorem 2.2′. The proof is identical to the proof of Theorem 2.2 with the words “weakly Hölder” replaced by “summable variations”.

5. Transience is Open and Dense in the Set of Non-strongly Positive Recurrent Potentials

The reader is referred to Appendix A for the definition and properties of transient, null recurrent, and weakly positive recurrent potentials.

Proof of Theorem 2.3. Lemma 7.1 in Appendix A says that for every $a \in S$ and $t \in \mathbb{R}$,

$$\Delta_a[\phi + t\cdot 1_{[a]}] = \Delta_a[\phi] + t.$$

Suppose $B = \bigcup_{i=1}^{r}[a_i]$, and $\phi \in \Phi$ is transient. Then $\Delta_{a_1}[\phi] < 0$. Find $\varepsilon_1 > 0$ s.t. $\phi^{(1)} := \phi + \varepsilon_1\cdot 1_{[a_1]}$ satisfies $\Delta_{a_1}[\phi^{(1)}] < 0$. Then $\phi^{(1)}$ is transient. The transience of $\phi^{(1)}$ means that $\Delta_{a_2}[\phi^{(1)}] < 0$, so we can find $\varepsilon_2 > 0$ s.t. $\phi^{(2)} := \phi^{(1)} + \varepsilon_2\cdot 1_{[a_2]}$ satisfies $\Delta_{a_2}[\phi^{(2)}] < 0$. So $\phi^{(2)}$ is also transient. Continuing in this manner, we obtain $\varepsilon_1, \ldots, \varepsilon_r > 0$ s.t.

$$\psi := \phi^{(r)} = \phi + \sum_{i=1}^{r} \varepsilon_i\cdot 1_{[a_i]} \quad \text{is transient.}$$

Take $\delta := \min\{\varepsilon_1, \ldots, \varepsilon_r\}$. We claim that every $\varphi \in \Phi$ such that $\|\varphi - \phi\|_\infty < \delta$ and $\phi|_{X\setminus B} = \varphi|_{X\setminus B}$ is transient. To see this, we observe that $\varphi$ can be obtained from $\psi$ by subtracting the $r$ non-negative functions $(\psi - \varphi)1_{[a_i]}$. By Lemma 4.1 each subtraction preserves transience, so the end result $\varphi$ is transient. This proves that the set of transient potentials is LU(B)–open.

We claim that it is dense in the complement of the strongly positive recurrent potentials. To see this, it is enough to show that every $\phi \in \Phi$ s.t. $\Delta_{a_1}[\phi] = 0$ can be approximated in LU(B) by a transient potential. Take $\phi + t\cdot 1_{[a_1]}$ with $t \to 0^-$.
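One explicit way to carry out the step "find $\varepsilon_1 > 0$" in this proof is sketched below. The particular choice $\varepsilon_1 := -\tfrac12\Delta_{a_1}[\phi]$ is an illustrative assumption, not taken from the text, and it presupposes that $\Delta_{a_1}[\phi]$ is a finite negative number.

```latex
% Sketch of one admissible choice of eps_1 (hypothetical, for illustration).
% Assumption: -infinity < Delta_{a_1}[phi] < 0.
\begin{align*}
\varepsilon_1 &:= -\tfrac12\,\Delta_{a_1}[\phi] > 0, \\
\Delta_{a_1}\bigl[\phi + \varepsilon_1\cdot 1_{[a_1]}\bigr]
  &= \Delta_{a_1}[\phi] + \varepsilon_1
  && \text{(Appendix A, Lemma 7.1)} \\
  &= \tfrac12\,\Delta_{a_1}[\phi] < 0 .
\end{align*}
% By the discriminant theorem (Appendix A, Theorem 7.3), a negative
% discriminant means that phi^{(1)} = phi + eps_1 * 1_{[a_1]} is transient.
```

The same choice can be repeated at each of the states $a_2, \ldots, a_r$, always with the current potential $\phi^{(i-1)}$ in place of $\phi$.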



6. More on Transience

The previous arguments suggest the following new characterization of transience:

Theorem 6.1. $\phi \in \Phi$ is transient if and only if there exists $\psi \in \Phi$ such that $\psi \ge \phi$, $\psi \not\equiv \phi$, and $P_G(\psi) = P_G(\phi)$.

Proof. If $\phi$ is transient, then for any $a \in S$, $\psi := \phi + t\cdot 1_{[a]}$ is transient for all $t > 0$ sufficiently small (Theorem 2.3). By Lemma 4.1, $P_G(\psi - s1_{[a]}) = P_G(\psi)$ for all $s > 0$. In the particular case $s = t$ we get $P_G(\psi) = P_G(\phi)$, and $\psi$ is as required.

We will show that if $\phi$ is recurrent then no such $\psi$ can exist. Suppose by way of contradiction that $\exists\psi \in \Phi$ such that $\psi \not\equiv \phi$, $\psi \ge \phi$, and $P_G(\psi) = P_G(\phi)$. Find some word $[a] := [a_1, \ldots, a_n]$ such that $\psi - \phi > \alpha$ on $[a]$ for some $\alpha > 0$. Since $\phi \le \phi + \alpha\cdot 1_{[a]} \le \psi$ and $P_G(\cdot)$ is increasing, $P_G(\phi + \alpha\cdot 1_{[a]}) = P_G(\phi)$. The potential $\varphi := \phi + \alpha\cdot 1_{[a]}$ must be recurrent, because

$$\sum_{n=1}^{\infty} Z_n(\varphi, a)\,e^{-nP_G(\varphi)} = \sum_{n=1}^{\infty} Z_n(\varphi, a)\,e^{-nP_G(\phi)} \ge \sum_{n=1}^{\infty} Z_n(\phi, a)\,e^{-nP_G(\phi)} = \infty,$$

by the recurrence of $\phi$. Therefore there exists a positive continuous function $h$ such that $L_\varphi h = e^{P_G(\varphi)}h = e^{P_G(\phi)}h$ (Appendix A, Theorem 7.2). This and $\phi \le \varphi$ imply that $L_\phi h \le e^{P_G(\phi)}h$, and it is easy to see that this inequality is strict on $T[a]$. Now consider the function $f := h - e^{-P_G(\phi)}L_\phi h$. This is a non-negative continuous function, not everywhere equal to zero, such that

$$\sum_{k=0}^{\infty} e^{-kP_G(\phi)}L_\phi^k f \le h < \infty \quad \text{everywhere.}$$
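The bound by $h$ comes from a telescoping sum. A short sketch, assuming only that $L_\phi$ is a positive operator and that $L_\phi h \le e^{P_G(\phi)}h$ (so all iterates $L_\phi^k h$ are finite); the abbreviation $P := P_G(\phi)$ is introduced here for readability:

```latex
% Sketch: partial sums telescope. Notation P := P_G(phi) is ours.
\begin{align*}
\sum_{k=0}^{n} e^{-kP} L_\phi^k f
  &= \sum_{k=0}^{n} \Bigl( e^{-kP} L_\phi^k h - e^{-(k+1)P} L_\phi^{k+1} h \Bigr)
  && \text{(since } f = h - e^{-P} L_\phi h\text{)} \\
  &= h - e^{-(n+1)P} L_\phi^{n+1} h \\
  &\le h
  && \text{(since } L_\phi^{n+1} h \ge 0\text{)} .
\end{align*}
% Letting n -> infinity gives the stated bound, because every summand
% e^{-kP} L_phi^k f is non-negative.
```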

In particular the series on the left (all of whose summands are non-negative) converges almost surely. But this is impossible: $\phi$ is recurrent, so $L_\phi$ has a conservative ergodic eigenmeasure $\nu$, $L_\phi^*\nu = e^{P_G(\phi)}\nu$. Since $L_\phi^*\nu = e^{P_G(\phi)}\nu$, $e^{-P_G(\phi)}L_\phi$ is the transfer operator of $\nu$, whence $\sum_{k} e^{-kP_G(\phi)}L_\phi^k f = \infty$ ν–almost everywhere (cf. [A], Proposition 1.3.2), whence at least at one point. This contradiction shows that $\psi$ cannot exist.

The result should be compared with the results of S. Ruette [Rt] on the transience of $\phi \equiv 0$.

Acknowledgements. The authors would like to thank the referees for their careful reading of the paper and for many valuable suggestions.

7. Appendix A: The Discriminant and the Three Modes of Recurrence The purpose of this section is to summarize the results of [S1,S2 and S3] concerning the thermodynamic formalism of countable Markov shifts (CMS). Throughout this section we assume that X is a topologically mixing CMS with set of states S and transition matrix A, which we think of as the set of one sided admissible paths on a directed graph G. We use the notation introduced in §1.2.


7.1. Gurevich pressure. Suppose $\phi$ has summable variations, and define as always $\phi_n := \phi + \phi\circ T + \cdots + \phi\circ T^{n-1}$. The Gurevich pressure of $\phi$ is

$$P_G(\phi) = \lim_{n\to\infty}\frac{1}{n}\log Z_n(\phi, a), \quad \text{where } Z_n(\phi, a) := \sum_{T^n x = x} e^{\phi_n(x)}1_{[a]}(x).$$

The limit exists, is independent of the choice of $a$, and satisfies [S1]:
(a) for every constant $c$, $P_G(\phi + c) = P_G(\phi) + c$;
(b) $\phi \le \psi \Rightarrow P_G(\phi) \le P_G(\psi)$;
(c) if $\phi, \psi$ have summable variations, then $P_G(t\phi + (1-t)\psi) \le tP_G(\phi) + (1-t)P_G(\psi)$ for all $t \in [0,1]$.

Theorem 7.1 (Variational Principle [S1]). If $\sup\phi < \infty$ and $\phi$ has summable variations, then $P_G(\phi) = \sup\{h_\mu(T) + \int\phi\,d\mu\}$, where the supremum ranges over all $T$–invariant Borel probability measures $\mu$ such that $(h_\mu(T), \int\phi\,d\mu) \ne (\infty, -\infty)$.

Remark 7.1. If $X$ is topologically mixing and $\phi$ has summable variations, then $L_\phi^n 1_{[a]} \asymp Z_n(\phi, a)$ uniformly on $[a]$.

7.2. Modes of recurrence. Recall that $\varphi_a(x) := 1_{[a]}(x)\inf\{n \ge 1 : T^n(x) \in [a]\}$, and set

$$Z_n^*(\phi, a) = \sum_{T^n x = x} e^{\phi_n(x)}1_{[\varphi_a = n]}(x).$$

$Z_n(\phi, a)$ and $Z_n^*(\phi, a)$ are related by the following “approximate renewal equation”: set $B := \exp\bigl(2\sum_{n=2}^{\infty}\mathrm{var}_n(\phi)\bigr)$, then

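$$Z_n(\phi, a) = B^{\pm 1}\bigl(Z_{n-1}(\phi, a)Z_1^*(\phi, a) + \cdots + Z_1(\phi, a)Z_{n-1}^*(\phi, a) + Z_n^*(\phi, a)\bigr). \tag{7.1}$$

Passing to the generating functions

$$t_\phi(x) = 1 + \sum_{n=1}^{\infty} Z_n(\phi, a)\,x^n \quad \text{and} \quad r_\phi(x) = \sum_{n=1}^{\infty} Z_n^*(\phi, a)\,x^n,$$

we obtain

$$\frac{1}{B^2}\,t_\phi(x)\,r_\phi(x) \le t_\phi(x) - 1 \le B^2\,t_\phi(x)\,r_\phi(x) \tag{7.2}$$

for every $x \in [0, R)$, where $R = e^{-P_G(\phi)}$ is the radius of convergence of $t_\phi(\cdot)$.

Definition 7.1. Set $\lambda = e^{P_G(\phi)}$. We call $\phi$
• transient, if $t_\phi(\lambda^{-1}) < \infty$;
• positive recurrent, if $t_\phi(\lambda^{-1}) = \infty$ but $r_\phi'(\lambda^{-1}) < \infty$;
• null recurrent, if $t_\phi(\lambda^{-1}) = \infty$ and $r_\phi'(\lambda^{-1}) = \infty$.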
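The passage from the approximate renewal equation (7.1) to the generating-function inequality (7.2) can be sketched as follows. The sketch reads $u = B^{\pm 1}v$ as shorthand for $B^{-1}v \le u \le Bv$ (an assumption about the notation), abbreviates $Z_n := Z_n(\phi, a)$ and $Z_n^* := Z_n^*(\phi, a)$, and uses that all series involved have non-negative terms and converge for $x \in [0, R)$.

```latex
% Sketch: multiply (7.1) by x^n and sum over n >= 1.
% Assumption: u = B^{\pm 1} v means B^{-1} v <= u <= B v.
\begin{align*}
t_\phi(x) - 1 = \sum_{n\ge 1} Z_n x^n
  &= B^{\pm 1}\Bigl[\,\sum_{n\ge 2}\sum_{k=1}^{n-1}
       \bigl(Z_{n-k}\,x^{n-k}\bigr)\bigl(Z_k^*\,x^{k}\bigr)
     + \sum_{n\ge 1} Z_n^* x^n \Bigr] \\
  &= B^{\pm 1}\bigl[(t_\phi(x) - 1)\,r_\phi(x) + r_\phi(x)\bigr]
  && \text{(Cauchy product)} \\
  &= B^{\pm 1}\,t_\phi(x)\,r_\phi(x).
\end{align*}
% Since B >= 1 and t r >= 0, the bounds
%   B^{-1} t r <= t - 1 <= B t r
% imply the weaker two-sided bound with B^2 stated in (7.2).
```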



We have the following [S2, Theorem 1]:

Theorem 7.2 (Generalized Ruelle-Perron-Frobenius Theorem [S2]). $\phi$ is recurrent iff there exist $\lambda > 0$, a conservative measure $\nu$, finite and positive on cylinders, and a positive continuous function $h$ such that $L_\phi^*\nu = \lambda\nu$ and $L_\phi h = \lambda h$. In this case $\lambda = e^{P_G(\phi)}$ and $\exists a_n \uparrow \infty$ such that for every cylinder $[a]$ and $x \in X$,

$$\frac{1}{a_n}\sum_{k=1}^{n}\lambda^{-k}(L_\phi^k 1_{[a]})(x) \xrightarrow[n\to\infty]{} h(x)\,\nu[a],$$

where $\{a_n\}_n$ satisfies $a_n \sim \bigl(\int_{[a]}h\,d\nu\bigr)^{-1}\sum_{k=1}^{n}\lambda^{-k}Z_k(\phi, a)$ for every $a \in S$. Furthermore,

(1) if $\phi$ is positive recurrent then $\nu(h) < \infty$, $a_n \sim n\cdot\mathrm{const}$, and for every $[a]$, $\lambda^{-n}L_\phi^n 1_{[a]} \xrightarrow[n\to\infty]{} h\,\nu[a]/\nu(h)$ uniformly on compacts;

(2) if $\phi$ is null recurrent then $\nu(h) = \infty$, $a_n = o(n)$, and for every cylinder $[a]$, $\lambda^{-n}L_\phi^n 1_{[a]} \xrightarrow[n\to\infty]{} 0$ uniformly on compacts.

It is not difficult to see, using the representation of $h$ as the limit above, that $\mathrm{var}_1[\log h] \le \sum_{n\ge 2}\mathrm{var}_n\phi$.

7.3. The discriminant. Fix a state $a \in S$, and recall the operation of passing from the pair $(X, \phi)$ to $(\overline{X}, \overline{\phi})$ as explained in §2.1. Define $p_a^*[\phi] := \sup\{p \mid P_G(\overline{\phi + p}) < \infty\}$ (the bar means that we induce on the state $a$). This number can be calculated by the formula [S3]

$$p_a^*[\phi] = -\limsup_{n\to\infty}\frac{1}{n}\log Z_n^*(\phi, a). \tag{7.3}$$

Moreover, the map

$$p(t) = P_G(\overline{\phi + t}) \tag{7.4}$$

is convex, strictly increasing and continuous on $\{t : t \le p_a^*[\phi]\}$ ([S3, Prop. 3]). The discriminant of $\phi$ at $a \in S$ is defined to be

$$\Delta_a[\phi] := \sup\{P_G(\overline{\phi + p}) \mid p < p_a^*[\phi]\}.$$

The following is frequently useful so we state it as a lemma.

Lemma 7.1. If $X$ is topologically mixing and $\phi$ has summable variations and finite pressure then $\Delta_a[\phi + t\cdot 1_{[a]}] = \Delta_a[\phi] + t$.

Proof. $P_G(\overline{\phi + t\cdot 1_{[a]} + p}) = P_G(\overline{\phi + p} + t) = P_G(\overline{\phi + p}) + t$, so $p_a^*[\phi + t\cdot 1_{[a]}] = p_a^*[\phi]$ and $\Delta_a[\phi + t\cdot 1_{[a]}] = \Delta_a[\phi] + t$.

The discriminant detects modes of recurrence:

Theorem 7.3 (Discriminant Theorem [S3]). Let $X$ be a topologically mixing countable Markov shift and let $\phi : X \to \mathbb{R}$ be some function with summable variations and finite Gurevich pressure. Let $a \in S$ be some arbitrary fixed state.


(1) The equation $P_G(\overline{\phi + p}) = 0$ has a unique solution $p(\phi)$ if $\Delta_a[\phi] \ge 0$ and no solution if $\Delta_a[\phi] < 0$. The Gurevich pressure of $\phi$ is given by

$$P_G(\phi) = \begin{cases} -p(\phi) & \text{if } \Delta_a[\phi] \ge 0; \\ -p_a^*[\phi] & \text{if } \Delta_a[\phi] < 0. \end{cases}$$

(2) $\phi$ is positive recurrent if $\Delta_a[\phi] > 0$ and transient if $\Delta_a[\phi] < 0$. In the case $\Delta_a[\phi] = 0$, $\phi$ is either positive recurrent or null recurrent.

In particular, strong positive recurrence implies positive recurrence.

Definition 7.2. We say that $\phi$ is weakly positive recurrent if it is positive recurrent but not strongly positive recurrent.

Corollary 7.1. Suppose $X$ is topologically mixing and $\phi$ has summable variations and finite Gurevich pressure. If $\phi$ is recurrent then

$$P_G(\phi) \ge \limsup_{n\to\infty}\frac{1}{n}\log Z_n^*(\phi, a) \tag{7.5}$$

and if $\phi$ is transient then

$$P_G(\phi) = \limsup_{n\to\infty}\frac{1}{n}\log Z_n^*(\phi, a). \tag{7.6}$$

The first equation is by definition of the pressure and $Z_n(\phi, a) \ge Z_n^*(\phi, a)$. The second equation is the discriminant theorem and (7.3).

8. Appendix B: Proof of Theorem 1.1

Throughout this section, assume that $T : X \to X$ is a topologically mixing countable Markov shift, and that $\phi \in \Phi$. We use the thermodynamic formalism for CMS as summarized in Appendix A.

8.1. Some technical implications of SGP.

Lemma 8.1. If $\phi$ has SGP, then the $P$ in Definition 1.1 has the form $Pg = h\int g\,d\nu$ for all $g \in \mathcal{L}$, where $h \in \mathcal{L}$ is positive and bounded away from zero on cylinders, $\nu$ is finite and positive on cylinders, and $L_\phi h = \lambda h$, $L_\phi^*\nu = \lambda\nu$, $\int h\,d\nu = 1$.

Proof. We show that $\phi$ is positive recurrent (Appendix A, Definition 7.1). The idea is to fix $a \in S$ and show that $\lambda^{-n}Z_n(\phi, a) \asymp 1$, where $Z_n(\phi, a) = \sum_{T^n x = x} e^{\phi_n(x)}1_{[a]}(x)$. This implies recurrence by definition, and rules out null recurrence because if $\phi$ were null recurrent, then $\lambda^{-n}Z_n(\phi, a) \asymp \lambda^{-n}L_\phi^n 1_{[a]} \xrightarrow[n\to\infty]{} 0$ on $[a]$ because of Theorem 7.2, part (2), which contradicts $\lambda^{-n}Z_n(\phi, a) \asymp 1$.

Write $L_\phi = \lambda P + N$ with $\lambda = \exp P_G(\phi)$ and $P, N$ as in Definition 1.1. Since $PN = NP = 0$ and $P^2 = P$, $L_\phi^n = \lambda^n P + N^n$. Since the spectral radius of $N$ is less than $\lambda$, $\|\lambda^{-n}N^n\| = O(\kappa^n)$, where $0 < \kappa < 1$. We have for (any) fixed $x \in [a]$, $\lambda^{-n}Z_n(\phi, a) \asymp \lambda^{-n}(L_\phi^n 1_{[a]})(x) = P1_{[a]}(x) + O(\kappa^n)$ (see (7.1) in Appendix A). It is impossible for $P1_{[a]}(x)$ to vanish, because this would imply that $Z_n(\phi, a) = O((\kappa\lambda)^n)$, whereas $\frac{1}{n}\log Z_n(\phi, a) \to \log\lambda$ and $\kappa < 1$. Thus $P1_{[a]}(x) \ne 0$. It follows that $\lambda^{-n}Z_n(\phi, a) \asymp 1$, whence the positive recurrence of $\phi$.

By the generalized RPF theorem (Appendix A, Theorem 7.2), $\exists h$ positive, continuous, and bounded away from zero on cylinders, and $\exists\nu$ positive and finite on cylinders such that $L_\phi h = \lambda h$, $L_\phi^*\nu = \lambda\nu$, $\int h\,d\nu = 1$. Moreover, $\lambda^{-n}L_\phi^n 1_{[a]} \xrightarrow[n\to\infty]{} \nu[a]h$ pointwise. But $\|\lambda^{-n}L_\phi^n 1_{[a]} - P1_{[a]}\|_{\mathcal{L}} \le \|\lambda^{-n}N^n 1_{[a]}\|_{\mathcal{L}} \to 0$, so $\lambda^{-n}L_\phi^n 1_{[a]} \to P1_{[a]}$ pointwise. We see that $P1_{[a]} = \nu[a]h$. Since $P(\mathcal{L}) \subseteq \mathcal{L}$, $h \in \mathcal{L}$.

Since $\dim\operatorname{Im}P = 1$, there exists $\varphi \in \mathcal{L}^*$ s.t. $Pg = \varphi(g)h$ for all $g \in \mathcal{L}$. We show that $\varphi(g) = \int g\,d\nu$ for all $g \in \mathcal{L}$. Let $dm_\phi := h\,d\nu$. The relations $L_\phi^*\nu = \lambda\nu$ and $L_\phi h = \lambda h$ can be used to see that $m_\phi$ is a $T$–invariant measure. The methods of [ADU] show that it is mixing (even exact). Suppose $g \in \mathcal{L}\cap L^1(\nu)$, then $gh^{-1} \in L^1(m_\phi)$, and the mixing of $m_\phi$ implies that $\int(gh^{-1})\,1_{[a]}\circ T^n\,dm_\phi \xrightarrow[n\to\infty]{} m_\phi[a]\int g\,d\nu$. On the other hand

$$\int(gh^{-1})\,1_{[a]}\circ T^n\,dm_\phi = \int g\,1_{[a]}\circ T^n\,d\nu = \int\lambda^{-n}L_\phi^n(g\,1_{[a]}\circ T^n)\,d\nu = \int_{[a]}\lambda^{-n}L_\phi^n g\,d\nu = \int_{[a]}\bigl[Pg + \lambda^{-n}N^n g\bigr]d\nu \xrightarrow[n\to\infty]{} \varphi(g)\,m_\phi[a],$$

because $\|\lambda^{-n}N^n g\|_{\mathcal{L}} \to 0$, whence $\lambda^{-n}N^n g \to 0$ uniformly on $[a]$. Comparing the limits we see that $\varphi(g) = \int g\,d\nu$ for all $g \in \mathcal{L}\cap L^1(\nu)$.

It remains to see that $\mathcal{L} \subset L^1(\nu)$. Otherwise there exists $f \in \mathcal{L}$ s.t. $\int|f|\,d\nu = \infty$. Since $f \in \mathcal{L}$, $g := |f| \in \mathcal{L}$, and $\int gh^{-1}\,dm_\phi = \infty$. The mixing of $m_\phi$ implies that $\int(gh^{-1})\,1_{[a]}\circ T^n\,dm_\phi \xrightarrow[n\to\infty]{} \infty$ (bound $gh^{-1}$ from below by a bounded function with large integral). But $g \in \mathcal{L}$, so we can write as before $\int(gh^{-1})\,1_{[a]}\circ T^n\,dm_\phi = \int_{[a]}\lambda^{-n}L_\phi^n g\,d\nu \xrightarrow[n\to\infty]{} \varphi(g)\,m_\phi[a]$. This limit is finite, so we arrive at a contradiction.

Lemma 8.2. Let $\nu$ be as in the previous lemma, then there exists some constant $C_0$ s.t.

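$\|\cdot\|_{L^1(\nu)} \le C_0\|\cdot\|_{\mathcal{L}}$.

Proof. Suppose $f \in \mathcal{L}$. By assumption, $\mathcal{L}$ has the lattice property: $f \in \mathcal{L} \Rightarrow |f| \in \mathcal{L}$. By the previous lemma, $P|f| = h\int|f|\,d\nu$, so

$$\|f\|_{L^1(\nu)} = \frac{\|P|f|\|_{\mathcal{L}}}{\|h\|_{\mathcal{L}}} \le \frac{\|P\|}{\|h\|_{\mathcal{L}}}\,\bigl\||f|\bigr\|_{\mathcal{L}} \le \frac{\|P\|}{\|h\|_{\mathcal{L}}}\,\|f\|_{\mathcal{L}}.$$

So take $C_0 := \|P\|/\|h\|_{\mathcal{L}}$.

Lemma 8.3. Suppose $\phi \in \Phi$ has the SGP. If $\psi$ is (bounded and) Hölder continuous, then $\psi f \in \operatorname{dom}(L_\phi)$ and $L_\phi(\psi f) \in \mathcal{L}$ for all $f \in \mathcal{L}$. The operator $f \mapsto L_\phi(\psi f)$ is a bounded linear operator on $\mathcal{L}$.

Proof. If $f \in \mathcal{L}$, then $|f| \in \mathcal{L}$. Since $\mathcal{L} \subseteq \operatorname{dom}(L_\phi)$, $|f| \in \operatorname{dom}(L_\phi)$. If $\psi$ is bounded, then $|\psi f| < C|f|$ for some $C$, so $|\psi f| \in \operatorname{dom}(L_\phi)$, whence $\psi f \in \operatorname{dom}(L_\phi)$. Next, by assumption, $t \mapsto L_{\phi+t\psi}$ is a real analytic $\operatorname{Hom}(\mathcal{L}, \mathcal{L})$–valued map on a neighborhood of zero. This means that for every $f \in \mathcal{L}$, $t \mapsto L_{\phi+t\psi}f$ is a real analytic $\mathcal{L}$–valued map on a neighborhood of zero. Differentiating at zero, we see that there exists $g \in \mathcal{L}$ s.t.

$$\frac{1}{t}\bigl[L_{\phi+t\psi}f - L_\phi f\bigr] \xrightarrow[t\to 0]{\;\mathcal{L}\;} g \in \mathcal{L}.$$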


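Since $\mathcal{L}$-convergence implies pointwise convergence,

$$g(x) = \lim_{t\to 0}\frac{1}{t}\bigl[L_{\phi+t\psi}f - L_\phi f\bigr](x) = L_\phi(\psi f)(x) + \lim_{t\to 0} L_\phi\Bigl[\Bigl(\frac{e^{t\psi} - 1}{t} - \psi\Bigr)f\Bigr](x)$$

for every $x \in X$. Since $\psi$ is bounded and $|f| \in \mathcal{L} \subset \operatorname{dom}(L_\phi)$,

$$\Bigl|L_\phi\Bigl[\Bigl(\frac{e^{t\psi} - 1}{t} - \psi\Bigr)f\Bigr](x)\Bigr| \le \sup_{|\tau|\le\|\psi\|_\infty}\Bigl|\frac{e^{\tau t} - 1}{t\tau} - 1\Bigr|\,\|\psi\|_\infty\,(L_\phi|f|)(x) \xrightarrow[t\to 0]{} 0.$$

Thus $g(x) = L_\phi(\psi f)(x)$ for all $x$, whence $L_\phi(\psi f) = g \in \mathcal{L}$.

We estimate $\|L_\phi(\psi f)\|_{\mathcal{L}}$. We just saw that $L_\phi(\psi f)$ is the derivative at zero of the $\mathcal{L}$–valued function $t \mapsto L_{\phi+t\psi}f$. By SGP, this function extends to a holomorphic function $z \mapsto L_{\phi+z\psi}f$ on some complex neighborhood $U$ of the origin. Let $C$ be a circle with center zero and radius $r$ so small that $C \subset U$, then for every $f \in \mathcal{L}$:

$$\|L_\phi(\psi f)\|_{\mathcal{L}} = \Bigl\|\frac{1}{2\pi i}\oint_C \frac{1}{z^2}\,L_{\phi+z\psi}f\,dz\Bigr\|_{\mathcal{L}} \le \frac{1}{r}\max_{z\in C}\|L_{\phi+z\psi}\|\cdot\|f\|_{\mathcal{L}}.$$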

It follows that f → L φ (ψ f ) is a bounded operator.
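The factor $1/r$ in this estimate is the usual Cauchy bound for a first derivative. A brief sketch, where the shorthand $F(z) := L_{\phi+z\psi}f$ is introduced here and is assumed to be a holomorphic $\mathcal{L}$-valued function on a neighborhood of the closed disc bounded by $C$:

```latex
% Sketch: Cauchy estimate for F'(0), with F(z) := L_{phi + z psi} f (our notation).
\begin{align*}
F'(0) &= \frac{1}{2\pi i}\oint_{C} \frac{F(z)}{z^{2}}\,dz
  && \text{(Cauchy integral formula for the derivative)} \\
\|F'(0)\|_{\mathcal{L}}
  &\le \frac{1}{2\pi}\cdot\underbrace{2\pi r}_{\text{length of } C}\cdot
       \max_{|z| = r}\frac{\|F(z)\|_{\mathcal{L}}}{r^{2}}
   = \frac{1}{r}\max_{z\in C}\|F(z)\|_{\mathcal{L}} \\
  &\le \frac{1}{r}\max_{z\in C}\|L_{\phi+z\psi}\|\cdot\|f\|_{\mathcal{L}} .
\end{align*}
% The maximum is finite because z -> L_{phi + z psi} is continuous on the
% compact circle C, so the bound is a constant multiple of ||f||_L.
```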

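8.2. Equilibrium measures. It was proved in [BS] that if a weakly Hölder continuous function $\phi$ with finite pressure and supremum has an equilibrium measure, then this measure is of the form $h\,d\nu$ with $h > 0$ continuous and $\nu$ s.t. $L_\phi h = \lambda h$, $L_\phi^*\nu = \lambda\nu$, $\int h\,d\nu = 1$. Here we show the converse: If $h, \nu$ are as above, and $dm = h\,d\nu$ has finite entropy, then it is an equilibrium measure (by [BS] the unique one). Let $\alpha := \{[a] : a \in S\}$ denote the natural generator.

Lemma 8.4 (Rokhlin). Let $\mu$ be a shift invariant measure on a CMS $X$, and let $\alpha$ be the natural generator. Then $h_\mu(T) \ge H_\mu(\alpha|\alpha_1^\infty)$, with equality when $H_\mu(\alpha) < \infty$.

Proof. The equality when $H_\mu(\alpha) < \infty$ is standard, so we focus on the case when $H_\mu(\alpha) = \infty$. We use the following notational conventions for partitions. Suppose $\gamma$ is a measurable partition of $X$, then $\sigma(\gamma) :=$ the sigma algebra generated by $\gamma$; $\gamma_m^n := \bigvee_{k=m}^{n}T^{-k}\gamma =$ the smallest partition s.t. $\sigma(\gamma_m^n) \supseteq \bigcup_{k=m}^{n}\sigma(T^{-k}\gamma)$; and $\gamma_1^\infty :=$ the smallest sigma–algebra which contains $\bigcup_{n\ge 1}\sigma(\gamma_1^n)$. Take an increasing sequence of finite partitions $\beta^{(n)}$ such that $\sigma(\beta^{(n)}) \uparrow \sigma(\alpha)$. For every fixed $n$, since $H_\mu(\beta^{(n)}) < \infty$,

$$h_\mu\bigl(T, \beta^{(n)}\bigr) = \lim_{k\to\infty}\frac{1}{k}H_\mu\bigl((\beta^{(n)})_0^{k-1}\bigr) = \lim_{k\to\infty}\frac{1}{k}\sum_{\ell=1}^{k-1}\Bigl[H_\mu\bigl((\beta^{(n)})_0^{\ell}\bigr) - H_\mu\bigl((\beta^{(n)})_0^{\ell-1}\bigr)\Bigr]$$

$$= \lim_{k\to\infty}\frac{1}{k}\sum_{\ell=1}^{k-1}H_\mu\bigl(\beta^{(n)}\big|(\beta^{(n)})_1^{\ell}\bigr) \ge \lim_{k\to\infty}\frac{1}{k}\sum_{\ell=1}^{k-1}H_\mu\bigl(\beta^{(n)}\big|\alpha_1^{\ell}\bigr), \tag{8.1}$$

because $\sigma(\alpha) \supset \sigma(\beta^{(n)})$. We claim that

$$H_\mu\bigl(\beta^{(n)}\big|\alpha_1^{\ell}\bigr) \xrightarrow[\ell\to\infty]{} H_\mu\bigl(\beta^{(n)}\big|\alpha_1^{\infty}\bigr). \tag{8.2}$$

This is because
(a) $I_\mu(\beta^{(n)}|\alpha_1^{\ell}) \xrightarrow[\ell\to\infty]{} I_\mu(\beta^{(n)}|\alpha_1^{\infty})$ µ–a.e. (Martingale Convergence Theorem);
(b) $\int\sup_{\ell\ge 1}I_\mu(\beta^{(n)}|\alpha_1^{\ell})\,d\mu < \infty$, by the Chung–Neveu Lemma ([P], Lemma 2.1);
(c) the Dominated Convergence Theorem.

By (8.1) and (8.2), for all $n$, $h_\mu(T, \beta^{(n)}) \ge H_\mu(\beta^{(n)}|\alpha_1^\infty) \equiv \int I_\mu(\beta^{(n)}|\alpha_1^\infty)\,d\mu$. Now $I_\mu(\beta^{(n)}|\alpha_1^\infty) \uparrow I_\mu(\alpha|\alpha_1^\infty)$, because $\beta^{(n)}$ increase to $\alpha$ (see e.g. [P], Theorem 2.2 (ii)). By the Monotone Convergence Theorem $H_\mu(\beta^{(n)}|\alpha_1^\infty) \uparrow H_\mu(\alpha|\alpha_1^\infty)$, and we conclude that $h_\mu(T, \beta^{(n)}) \ge H_\mu(\beta^{(n)}|\alpha_1^\infty) \xrightarrow[n\to\infty]{} H_\mu(\alpha|\alpha_1^\infty)$. Since $h_\mu(T) \ge h_\mu(T, \beta^{(n)})$, the proof is completed.

Proposition 8.1. Suppose $\phi$ has summable variations, has finite Gurevich pressure, and $\sup\phi < \infty$. Suppose further that $h > 0$ is positive continuous, $\nu$ is positive and finite on cylinders, $L_\phi h = \lambda h$, $L_\phi^*\nu = \lambda\nu$, and $\int h\,d\nu = 1$. If $d\mu = h\,d\nu$ has finite entropy, then it is an equilibrium measure of $\phi$.

Proof. One can show, as in [L], that $I_\mu(\alpha|\alpha_1^\infty) = -\ln\frac{d\mu}{d\mu\circ T} \equiv -[\phi + \ln h - \ln h\circ T - P_G(\phi)]$, so $\int(I_\mu(\alpha|\alpha_1^\infty) + \phi + \ln h - \ln h\circ T)\,d\mu = P_G(\phi)$. By Lemma 8.4, $\int I_\mu(\alpha|\alpha_1^\infty)\,d\mu = H_\mu(\alpha|\alpha_1^\infty) \le h_\mu(T) < \infty$, so $I_\mu$ is absolutely integrable (it is a non-negative function). Since $\phi + \ln h - \ln h\circ T$ is bounded from above (by $P_G(\phi)$), it is also absolutely integrable, and

$$h_\mu(T) + \int[\phi + \ln h - \ln h\circ T]\,d\mu \ge \int[I_\mu + \phi + \ln h - \ln h\circ T]\,d\mu = P_G(\phi). \tag{8.3}$$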

We claim that $\phi \in L^1(\mu)$, and $\int\phi\,d\mu = \int[\phi + \ln h - \ln h\circ T]\,d\mu$. The following holds for almost every $x \in X$:
(a) $\phi_n(x)/n \xrightarrow[n\to\infty]{} \int\phi\,d\mu$ (because $\sup\phi < \infty$ and $\mu$ is ergodic);
(b) $[\phi_n(x) + \ln h(x) - \ln h(T^n x)]/n \xrightarrow[n\to\infty]{} \int[\phi + \ln h - \ln h\circ T]\,d\mu$ (because $\phi + \ln h - \ln h\circ T \in L^1(\mu)$);
(c) $\exists n_k(x) \uparrow \infty$ s.t. $|\ln h(x) - \ln h(T^{n_k(x)}x)| \le 1$ (because of the Poincaré recurrence theorem, and the continuity of $h$).

Choose one such $x$, then

$$\int\phi\,d\mu = \lim_{k\to\infty}\frac{1}{n_k(x)}\phi_{n_k(x)}(x) = \lim_{k\to\infty}\frac{1}{n_k(x)}\Bigl[\phi_{n_k(x)}(x) + \ln h(x) - \ln h(T^{n_k(x)}x)\Bigr]$$

$$= \lim_{n\to\infty}\frac{1}{n}\bigl[\phi_n(x) + \ln h(x) - \ln h(T^n x)\bigr] = \int(\phi + \ln h - \ln h\circ T)\,d\mu.$$

By (8.3), $h_\mu(T) + \int\phi\,d\mu \ge P_G(\phi)$. The proposition now follows from the variational principle (Appendix A, Theorem 7.1).


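8.3. Proof of Theorem 1.1. Suppose that $X$ is topologically mixing, and $\phi \in \Phi$ has the SGP and satisfies $\sup\phi < \infty$. Let $\lambda, P, N$ be as in Definition 1.1.

Proof of (a). Lemma 8.1 says that $P$ is of the form $Pf = h\int f\,d\nu$, where $L_\phi h = \lambda h$, $L_\phi^*\nu = \lambda\nu$, and $\int h\,d\nu = 1$. Proposition 8.1 says that if $dm_\phi = h\,d\nu$ has finite entropy, then $m_\phi$ is an equilibrium measure for $\phi$. By [BS], there is at most one such measure, so $m_\phi$ is unique.

Proof of (b). Let $\rho(N)$ denote the spectral radius of $N : \mathcal{L} \to \mathcal{L}$. By the SGP, $\exists\kappa \in (\rho(N)/\lambda, 1)$. If $f$ is (bounded and) Hölder continuous, then $L_\phi(fh) \in \mathcal{L}$ (Lemma 8.3). If $g \in L^\infty(m_\phi)$, then the identities $dm_\phi = h\,d\nu$, $L_\phi^*\nu = \lambda\nu$, and $PL_\phi(fh) = \lambda h\int f\,dm_\phi$ imply

$$\Bigl|\int f\cdot g\circ T^n\,dm_\phi - \int f\,dm_\phi\int g\,dm_\phi\Bigr| = \Bigl|\int\lambda^{-n}\bigl[L_\phi^n(fh) - \lambda^{n-1}PL_\phi(fh)\bigr]g\,d\nu\Bigr|$$

$$\le \|g\|_\infty\bigl\|\lambda^{-n}N^{n-1}L_\phi(fh)\bigr\|_{L^1(\nu)} \le C_0\|g\|_\infty\bigl\|\lambda^{-n}N^{n-1}L_\phi(fh)\bigr\|_{\mathcal{L}}$$

$$\le C_0\lambda^{-1}\|g\|_\infty\,\lambda^{-(n-1)}\|N^{n-1}\|\,\|L_\phi(fh)\|_{\mathcal{L}} \le \mathrm{const}\,\|g\|_\infty\,\|L_\phi(fh)\|_{\mathcal{L}}\,\kappa^n.$$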

Part (c). We assume without loss of generality that λ = 1, Em φ [ψ] = 0. To arrange this, replace φ by φ − log λ and ψ by ψ − Em φ [ψ]. Part (e) of the SGP is stable under perturbation in Hom(L, L) [K]: There exists a neighborhood U of L φ in Hom(L, L) and analytic maps P, N : U → Hom(L, L), λ : U → C such that for all L ∈ U , L = λ(L)P(L) + N (L), P(L)N (L) = N (L)P(L) = 0, P(L)2 = P(L), dim Im P(L) = 1. If U is sufficiently small, then there is some ε0 > 0 s.t. for all L ∈ U , the spectral radius of N (L) is less than 1 − 2ε0 and the spectral radius of L (equal to |λ(L)|) is more than 1 − ε0 . By the SGP, t → L t := L φ+itψ is analytic on a neighborhood of zero. The maps λt = P(L t ), Pt = P(L t ), Nt = N (L t ) must also be analytic in t on a small neighborhood I of zero. Recall that there is a constant C0 s.t. · L 1 (ν) ≤ C0 · L . For t in I , Em φ eitψn = λ−n L nφ eitψn h dν = L nt hdν (∵ λ = 1) * ) n = λnt Pt h + λ−n t Nt h dν = λnt [1 ± C0 Pt − P + |λt |−n Ntn h L ]. The spectral radius of Nt is less than 1 − 2ε0 , and |λt | ≥ 1 − ε0 , so this gives Em φ [eitψn ] = λnt [1 + εn (t)] for all n, where εn (t) −−−−−−→ 0. t→0,n→∞

Later we will see that if $\psi$ is not cohomologous to a constant, then

$$\lambda_t = 1 - \frac{\sigma^2}{2}t^2 + o(t^2) \quad \text{as } t \to 0, \tag{8.4}$$

where $\sigma > 0$. It will then follow that $E_{m_\phi}[\exp(it\psi_n/\sqrt{n})] \xrightarrow[n\to\infty]{} \exp(-\sigma^2t^2/2)$, which
means that √1n ψn converges in distribution (w.r.t m φ ) to a normal law with mean zero and standard deviation σ . To prove (8.4), we expand λt as in [GH]. Define for this purpose h t := Pt 1/ Pt 1dν (the denominator approaches 1 as t → 0 so it is not zero for all |t| sufficiently small), and write L := L 0 = L φ . Then L t h t = λt h t and so λt = L t h t dν = (L t − L)(h t − h)dν + (L t − L)hdν + Lh t dν = (L t − L)(h t − h)dν + Eν [L((eitψ − 1)h)] + h t dν (∵ L ∗ ν = λν = ν) t2 (8.5) = (L t − L)(h t − h)dν − ψ 2 hdν + o(t 2 ) + 1, 2 2

where we have used the fact that ψ is bounded to expand eitψ = 1 + itψ − t2 ψ 2 + o(t 2 ), and the assumption that Em φ [ψ] = 0 to note that ψhdν = 0. (The assumption that ψ is bounded is an overkill.) The analyticity of t → L t , Pt and the estimate · L 1 (ν) ≤ C0 · L can be used to show that (L t − L)(h t − h)dν = o(t) as t → ∞. Thus λt = 1 + o(t). Next we study the difference h t − h, as in [G1]. In what follows, o(1) means an element of L whose L–norm is o(1): ht − h λt h t − h = + o(1) (because λt = 1 + o(t) and h t L is bounded near zero) t t ht − h L t h t − Lh ht + o(1) = (L t − L) + L + o(1). = t t t Subtracting the second summand from both sides, we obtain itψ ht − h = (L t − L) htt + o(1) = L e t −1 h t + o(1). (1 − L) t

(8.6)

The left side of (8.6) converges in L, whence in L 1 (ν), to (1 − L)a, where

d

ht . a := dt t=0 The right side of (8.6) converges in L 1 (ν) to i L(ψh). To see this, note the following: (a) ψ is bounded, so ∃M s.t. |(eitψ − 1)/t| ≤ M for all |t| < 1; itψ L 1 (ν) e −1 h −−−→ iψh, because of the dominated convergence theorem and the (b) t t→0

≤ C+0 h L < ∞; bound + itψ h L 1 (ν) + + e −1 (h t − h)+ 1 ≤ C0 M h t − h L −−→ 0. (c) + t t→0 L (ν) itψ Thus e t −1 h t −−→ iψh in L 1 (ν). t→0

Now L extends to a bounded operator on L 1 (ν) s.t. L ≤ 1 (the transfer operator itψ L 1 (ν) of ν), so L e t −1 h t −−−→ i L(ψh). t→0


Passing to the limit t → 0 in (8.6), we see that (1 − L)a = i L(ψh) ν–a.e. Since all elements of L are assumed to be continuous, and since ν is globally supported, (1 − L)a = i L(ψh). Apply L k to both sides: L k a − L k+1 a = i L k (ψh). The norm of the right hand side is summable:

L k (ψh) L = P(L(ψh)) + N k−1 L(ψh) L = N k

L(ψh) L (∵ P[L(ψh)] = h

L[ψh]dν = h

ψdm φ = 0),

k and N k < ∞. Summing over k ≥ 0, we obtain a = i ∞ k=1 L (ψh). Returning to the expansion (8.5) of λt , we see that t2 λt = (L t − L)(at + o(t))dν − ψ 2 hdν + o(t 2 ) + 1 2 itψ e −1 t2 (a + o(1))dν − = t2 ψ 2 hdν + o(t 2 ) + 1 t 2 ∞ t2 L k (ψh)dν + o(t 2 ), = 1− ψ 2 hdν − t 2 ψ 2 k=1

and we obtain (8.4) with

σ := 2

∞ 2 k ψ + ψ L (ψh) dm φ . h 2

k=1

But it is not yet clear that σ 2

is strictly positive. Tok see this we follow [G1] and rewrite the integrand in terms of the function u := ∞ k=0 L (ψh), noting that ψh = u − Lu: 1 1 2 2 2 (ψh) (u − Lu) σ = + 2ψh(u − ψh) dm = + 2(u − Lu)Lu dm φ φ h2 h2

2 1 2 1 2 2 L(h · u/ h) u dm = − (Lu) = (u/ h) − dm φ . φ h2 h ' : v → h −1 L(hv) preserves m φ : T '∗ m φ = m φ (it is the transfer The operator T operator of m φ = hdν). Thus we get '[(u/ h)2 ] − (T '(u/ h))2 dm φ . T σ2 = ' takes the form T ' f = T y=x g(y) f (y), where g = eφ It is not difficult to see that T '[(u/ h)2 ] ≥ (T '(u/ h))2 , h/ h ◦ T . We have T y=x g(y) ≡ 1. Since t → t 2 is convex, T '[(u/ h)2 ] = [T '(u/ h)]2 m φ –a.e. and we get that σ 2 ≥ 0, with equality iff T 2 By the strict convexity of t → t , u/ h must be constant on {y : T y = x} for a.e. x. Since m φ ∼ m φ ◦ T (∵ dm/dm ◦ T = eφ h/ h ◦ T > 0), this means that there is a function ϕ s.t. u/ h = ϕ ◦ T almost everywhere. Thus ψ=

1 1 (u − Lu) = ϕ ◦ T − L(hϕ ◦ T ) = ϕ ◦ T − ϕ a.e. h h



It follows that $\psi$ is an almost everywhere coboundary w.r.t. $m_\phi$. By the Livsic theorem of Gouëzel [G2], $\psi$ is a coboundary with a continuous transfer function. But part (c) assumes that $\psi$ is not like that.

Part (d). Suppose $\psi$ is a (bounded) Hölder continuous function, and let $L_t, \lambda_t, P_t, N_t$ be as above. We saw that $t \mapsto \lambda_t, P_t, N_t$ are analytic on some complex neighborhood of 0, and that for all $|t|$ sufficiently small, $\rho(N_t) < |\lambda_t|$. We claim that $\lambda_t = \exp P_G(\phi + t\psi)$ on some real neighborhood of $t = 0$. This is because of the estimates

$$Z_n(\phi + t\psi, a) \asymp L_t^n 1_{[a]}(x) = \lambda_t^n P_t 1_{[a]}(x) + N_t^n 1_{[a]}(x) \asymp \lambda_t^n,$$

which hold uniformly in $x$ on $[a]$ provided $t$ is small enough that $\rho(N_t) < |\lambda_t|$ (see Appendix A, Remark 7.1). In particular $\lambda_0 = \exp P_G(\phi) \ne 0$, and $P_G(\phi + t\psi) = \log\lambda_t$ is real analytic on a neighborhood of zero.

Note added in proof. The first author has recently found a combinatorial characterization of the topologically mixing CMS for which $\{\phi \in \Phi : \phi \text{ does not have SGP}\}$ is not empty. As it turns out, “most” infinite state CMS are like that, e.g. all CMS whose associated transition graph G contains an infinite ray. Complete statements and detailed proofs will appear elsewhere.

References

[ADU] Aaronson, J., Denker, M., Urbański, M.: Ergodic theory for Markov fibred systems and parabolic rational maps. Trans. Amer. Math. Soc. 337(2), 495–548 (1993)
[A] Aaronson, J.: An introduction to infinite ergodic theory. Math. Surv. and Monog. 50, Providence, RI: Amer. Math. Soc., 1997
[AD] Aaronson, J., Denker, M.: Local limit theorems for Gibbs–Markov maps. Stochastics Dyn. 1, 193–237 (2001)
[B] Baladi, V.: Positive Transfer Operators and Decay of Correlations. Advanced Series in Nonlinear Dynamics 16, Singapore: World Scientific, 2000
[BS] Buzzi, J., Sarig, O.: Uniqueness of equilibrium measures for countable Markov shifts and multidimensional piecewise expanding maps. Erg. Thy. Dynam. Sys. 23, 1383–1400 (2003)
[DF] Doeblin, W., Fortet, R.: Sur des chaînes à liaisons complètes. Bull. Soc. Math. France 65, 132–148 (1937)
[GM] Gallavotti, G., Miracle-Sole, S.: Statistical mechanics of lattice systems. Commun. Math. Phys. 5(5), 317–323 (1967)
[G1] Gouëzel, S.: Central limit theorem and stable laws for intermittent maps. Probab. Theory Rel. Fields 128(1), 82–122 (2004)
[G2] Gouëzel, S.: Regularity of coboundaries for nonuniformly expanding Markov maps. Proc. Amer. Math. Soc. 134(2), 391–401 (2006)
[GH] Guivarc’h, Y., Hardy, J.: Théorèmes limites pour une classe de chaînes de Markov et applications aux difféomorphismes d’Anosov. (In French, English summary) [Limit theorems for a class of Markov chains and applications to Anosov diffeomorphisms] Ann. Inst. H. Poincaré Probab. Statist. 24(1), 73–98 (1988)
[GS] Gurevich, B.M., Savchenko, S.V.: Thermodynamics formalism for countable Markov chains. Usp. Mat. Nauk 53(2), 3–106 (1998). Engl. transl. in Russ. Math. Surv. 53(2), 3–106 (1998)
[HH] Hennion, H., Hervé, L.: Limit theorems for Markov chains and stochastic properties of dynamical systems by quasi–compactness. LNM 1766, Berlin-Heidelberg-New York: Springer, 2001
[K] Kato, T.: Perturbation theory for linear operators. Reprint of the 1980 edition. Classics in Mathematics. Berlin: Springer-Verlag, 1995
[L] Ledrappier, F.: Principe variationnel et systemes dynamiques symboliques. Z. Wahrs. Verb. Geb. 30, 185–202 (1974)
[Li] Liverani, C.: Central limit theorem for deterministic systems. International Conference on Dynamical Systems (Montevideo, 1995), Pitman Res. Notes Math. Ser. 362, Harlow: Longman, 1996, pp. 56–75
[Lo] Lopes, A.: The zeta function, nondifferentiability of pressure, and the critical exponent of transition. Adv. Math. 101(2), 133–165 (1993)
[P] Parry, W.: Entropy and generators in ergodic theory. New York: W.A. Benjamin Inc., 1969
[PP] Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Astérisque 187–8 (1990)
[PrS] Prellberg, T., Slawny, J.: Maps of intervals with indifferent fixed points: thermodynamic formalism and phase transitions. J. Stat. Phys. 66(1–2), 503–514 (1992)
[R] Ruelle, D.: Thermodynamic formalism. The mathematical structures of equilibrium statistical mechanics. Second Edition, Cambridge: Cambridge Univ. Press, 2004
[Rt] Ruette, S.: On the Vere-Jones classification and existence of maximal measures for countable topological Markov chains. Pacific J. Math. 209(2), 365–380 (2003)
[S1] Sarig, O.: Thermodynamic formalism for countable Markov shifts. Erg. Th. Dynam. Syst. 19, 1565–1593 (1999)
[S2] Sarig, O.: Thermodynamic formalism for null recurrent potentials. Israel J. Math. 121, 285–311 (2001)
[S3] Sarig, O.: Phase transitions for countable Markov shifts. Commun. Math. Phys. 217, 555–577 (2001)
[S4] Sarig, O.: Subexponential decay of correlations. Invent. Math. 150, 629–653 (2002)
[S5] Sarig, O.: Critical exponents for dynamical systems. Commun. Math. Phys. 267, 631–667 (2006)
[VJ] Vere–Jones, D.: Geometric ergodicity in denumerable Markov chains. Quart. J. Math. Oxford 13(2), 7–28 (1962)
[Y] Young, L.-S.: Statistical properties of dynamical systems with some hyperbolicity. Ann. Math. 147(2), 585–650 (1998)

Communicated by G. Gallavotti




Lie Conformal Algebra Cohomology and the Variational Complex

Alberto De Sole (1), Victor G. Kac (2)

(1) Dipartimento di Matematica, Università di Roma “La Sapienza”, Città Universitaria, 00185 Rome, Italy. E-mail: [email protected]
(2) Department of Mathematics, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139, USA. E-mail: [email protected]

Received: 22 December 2008 / Accepted: 11 June 2009 Published online: 14 August 2009 – © Springer-Verlag 2009

Dedicated to Corrado De Concini on his 60th birthday

Abstract: We find an interpretation of the complex of variational calculus in terms of the Lie conformal algebra cohomology theory. This leads to a better understanding of both theories. In particular, we give an explicit construction of the Lie conformal algebra cohomology complex, and endow it with a structure of a g-complex. On the other hand, we give an explicit construction of the complex of variational calculus in terms of skew-symmetric poly-differential operators.

1. Introduction

Lie conformal algebras encode the properties of operator product expansions in conformal field theory, and, at the same time, of local Poisson brackets in the theory of integrable evolution equations. Recall [K] that a Lie conformal algebra over a field $\mathbb{F}$ is an $\mathbb{F}[\partial]$-module $A$, endowed with a λ-bracket, that is an $\mathbb{F}$-linear map $A \otimes A \to \mathbb{F}[\lambda] \otimes A$, denoted by $a \otimes b \mapsto [a_\lambda b]$, satisfying the two sesquilinearity properties

$$[\partial a_\lambda b] = -\lambda[a_\lambda b], \qquad [a_\lambda \partial b] = (\partial + \lambda)[a_\lambda b], \tag{1}$$

such that the skew-symmetry

$$[a_\lambda b] = -[b_{-\partial-\lambda}\,a] \tag{2}$$

and the Jacobi identity

$$[a_\lambda[b_\mu c]] - [b_\mu[a_\lambda c]] = [[a_\lambda b]_{\lambda+\mu}\,c] \tag{3}$$

hold for any $a, b, c \in A$. It is assumed in (2) that $\partial$ is moved to the left. A module over a Lie conformal algebra $A$ is an $\mathbb{F}[\partial]$-module $M$, endowed with a λ-action, that is an $\mathbb{F}$-linear map $A \otimes M \to \mathbb{F}[\lambda] \otimes M$, denoted by $a \otimes b \mapsto a_\lambda b$, such


that sesquilinearity (1) holds for a ∈ A, b ∈ M and the Jacobi identity (3) holds for a, b ∈ A, c ∈ M.

A cohomology theory for Lie conformal algebras was developed in [BKV]. Given a Lie conformal algebra A and an A-module M, one first defines the basic cohomology complex $\tilde\Gamma^\bullet(A,M)=\bigoplus_{k\in\mathbb Z_+}\tilde\Gamma^k$, where $\tilde\Gamma^k$ consists of F-linear maps $\tilde\gamma: A^{\otimes k}\to F[\lambda_1,\dots,\lambda_k]\otimes M$, satisfying certain sesquilinearity and skew-symmetry properties, and endows this complex with a differential $\delta:\tilde\Gamma^k\to\tilde\Gamma^{k+1}$, such that $\delta^2=0$. This complex is isomorphic to the Lie algebra cohomology complex for the annihilation Lie algebra $\mathfrak g_-$ of A with coefficients in the $\mathfrak g_-$-module M [BKV, Theorem 6.1]. Next, one endows $\tilde\Gamma^\bullet(A,M)$ with a structure of an F[∂]-module, such that ∂ commutes with δ, which allows one to define the reduced cohomology complex $\Gamma^\bullet(A,M)=\tilde\Gamma^\bullet(A,M)/\partial\tilde\Gamma^\bullet(A,M)$, and this is the Lie conformal algebra cohomology complex, introduced in [BKV].

Our first contribution to this theory is a more explicit construction of the reduced cohomology complex. Namely, we introduce a new cohomology complex $C^\bullet(A,M)=\bigoplus_{k\in\mathbb Z_+}C^k$, where $C^0=M/\partial M$, $C^1=\mathrm{Hom}_{F[\partial]}(A,M)$, and for k ≥ 2, $C^k$ consists of poly λ-brackets, namely of F-linear maps $c:A^{\otimes k}\to F[\lambda_1,\dots,\lambda_{k-1}]\otimes M$, satisfying certain sesquilinearity and skew-symmetry conditions, and we endow $C^\bullet(A,M)$ with a square zero differential d. We construct embeddings of complexes:

$$\Gamma^\bullet(A,M)\subset\bar C^\bullet(A,M)\subset C^\bullet(A,M), \qquad (4)$$

where $\bar C^\bullet(A,M)$ consists of cochains which vanish if one of the arguments is a torsion element of A. In fact, $\bar C^k=C^k$, unless k = 1. We show that $\Gamma^\bullet(A,M)=\bar C^\bullet(A,M)$, provided that, as an F[∂]-module, A is isomorphic to a direct sum of its torsion and a free F[∂]-module (which is always the case if A is a finitely generated F[∂]-module).

Our opinion is that the slightly larger complex $C^\bullet(A,M)$ is a more correct Lie conformal algebra cohomology complex than the complex $\Gamma^\bullet(A,M)$ of [BKV]. This is illustrated by our Theorem 3.1(c), which says that the F[∂]-split abelian extensions of A by M are parameterized by $H^2(A,M)$ for the complex $C^\bullet(A,M)$. This holds for the cohomology theory of [BKV] only if A is a free F[∂]-module.

Following [BKV], we also consider the superspace of basic chains $\tilde\Gamma_\bullet(A,M)$ and its subspace of reduced chains $\Gamma_\bullet(A,M)$ (they are not complexes in general). Corresponding to the embeddings of complexes (4), we introduce the vector superspaces of chains $C_\bullet(A,M)$ and $\bar C_\bullet(A,M)$, and the maps:

$$C_\bullet(A,M)\twoheadrightarrow\bar C_\bullet(A,M)\to\Gamma_\bullet(A,M). \qquad (5)$$

We develop the theory further in the important case for the calculus of variations, when the A-module M is endowed with a commutative associative product, such that ∂ and $a_\lambda$ for all a ∈ A are derivations of this product. In this case one can endow the superspace $\tilde\Gamma^\bullet(A,M)$ with a commutative associative product [BKV]. Furthermore, we introduce a Lie algebra bracket on the space $\mathfrak g:=\Pi\tilde\Gamma_1(A,M)$ (Π, as usual, stands for reversing of the parity). Let $\hat{\mathfrak g}=\eta\mathfrak g\oplus\mathfrak g\oplus F\partial_\eta$ be a Z-graded Lie superalgebra extension of $\mathfrak g$, where η is an odd indeterminate, $\eta^2=0$. We endow $\tilde\Gamma^\bullet(A,M)$ with a structure of a g-complex, which is a Z-grading preserving Lie superalgebra homomorphism $\varphi:\hat{\mathfrak g}\to\mathrm{End}_F\,\tilde\Gamma^\bullet(A,M)$, such that $\varphi(\partial_\eta)=\delta$. We also show that $\varphi(\hat{\mathfrak g})$ lies in the subalgebra of derivations of the superalgebra $\tilde\Gamma^\bullet(A,M)$. For each $X\in\mathfrak g$ we thus have the Lie derivative $L_X=\varphi(X)$ and the contraction operator $\iota_X=\varphi(\eta X)$, satisfying all the usual relations, in particular, the Cartan formula $L_X=\iota_X\delta+\delta\iota_X$.

Lie Conformal Algebra Cohomology and the Variational Complex


Denoting by $\mathfrak g^\partial$ the centralizer of ∂ in $\mathfrak g$, we obtain the induced structure of a $\mathfrak g^\partial$-complex for $\Gamma^\bullet(A,M)$, which we, furthermore, extend to the larger complex $C^\bullet(A,M)$. Namely, we introduce a canonical Lie algebra bracket on all spaces of 1-chains with reversed parity (see (5)), so that all the maps $\Pi C_1\twoheadrightarrow\Pi\bar C_1\to\Pi\Gamma_1\to\Pi\tilde\Gamma_1$ are Lie algebra homomorphisms, and the embeddings (4) are morphisms of complexes, endowed with a corresponding Lie algebra structure.

What does it all have to do with the calculus of variations? In order to explain this, introduce the notion of an algebra of differential functions (in ℓ variables). This is a differential algebra, i.e., a unital commutative associative algebra V with a derivation ∂, endowed with commuting derivations $\partial/\partial u_i^{(n)}$, $i\in I=\{1,\dots,\ell\}$, $n\in\mathbb Z_+$, such that only a finite number of $\partial f/\partial u_i^{(n)}$ are non-zero for each f ∈ V, and the following commutation rules with ∂ hold:

$$\Big[\frac{\partial}{\partial u_i^{(n)}},\,\partial\Big]=\frac{\partial}{\partial u_i^{(n-1)}}\quad\text{(the RHS is 0 if } n=0\text{)}. \qquad (6)$$

An important example is the algebra of differential polynomials $F[u_i^{(n)}\mid i\in I,\ n\in\mathbb Z_+]$ with $\partial(u_i^{(n)})=u_i^{(n+1)}$, $n\in\mathbb Z_+$, $i\in I$. Other examples include any localization by a multiplicative subset or any algebraic extension of this algebra.

The basic de Rham complex $\tilde\Omega^\bullet=\tilde\Omega^\bullet(V)$ over V is defined as an exterior superalgebra over the free V-module $\tilde\Omega^1=\bigoplus_{i\in I,\,n\in\mathbb Z_+}V\,\delta u_i^{(n)}$ on generators $\delta u_i^{(n)}$ with odd parity. We have: $\tilde\Omega^\bullet=\bigoplus_{k\in\mathbb Z_+}\tilde\Omega^k$, where $\tilde\Omega^0=V$, $\tilde\Omega^k=\Lambda^k_V\tilde\Omega^1$. This Z-graded superalgebra is endowed with an odd derivation δ of degree 1, such that $\delta f=\sum_{i\in I,\,n\in\mathbb Z_+}\frac{\partial f}{\partial u_i^{(n)}}\,\delta u_i^{(n)}$ for $f\in\tilde\Omega^0$ and $\delta(\delta u_i^{(n)})=0$. One easily checks that $\delta^2=0$, so that $\tilde\Omega^\bullet$ is a cohomology complex. Let $\mathfrak g$ be the Lie algebra of derivations of the algebra V of the form

$$X=\sum_{i\in I,\,n\in\mathbb Z_+}P_{i,n}\frac{\partial}{\partial u_i^{(n)}},\quad\text{where } P_{i,n}\in V. \qquad (7)$$

To any such derivation X we associate an even derivation $L_X$ (Lie derivative) and an odd derivation $\iota_X$ (contraction) of the superalgebra $\tilde\Omega^\bullet$ by letting $L_X|_V=X$, $L_X(\delta u_i^{(n)})=\delta P_{i,n}$, $\iota_X|_V=0$, $\iota_X(\delta u_i^{(n)})=P_{i,n}$. This provides $\tilde\Omega^\bullet$ with a structure of a g-complex, by letting $\varphi(X)=L_X$ and $\varphi(\eta X)=\iota_X$. Also, the derivation ∂ extends to an (even) derivation of $\tilde\Omega^\bullet$ by letting $\partial(\delta u_i^{(n)})=\delta u_i^{(n+1)}$.

It is easy to check, using (6), that ∂ and δ commute, hence we can consider the reduced complex

$$\Omega^\bullet(V)=\tilde\Omega^\bullet(V)/\partial\tilde\Omega^\bullet(V),$$

which is called the variational complex. This is, of course, a $\mathfrak g^\partial$-complex.

Our main observation is the interpretation of the variational complex $\Omega^\bullet(V)$ in terms of Lie conformal algebra cohomology, given by Theorem 1 below. Let $R=\bigoplus_{i\in I}F[\partial]u_i$ be a free F[∂]-module of rank ℓ, endowed with the trivial λ-bracket $[a_\lambda b]=0$ for all a, b ∈ R. Let V be an algebra of differential functions.
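As a quick sanity check of the commutation rule (6), one can test it on differential polynomials in a single variable u, where ∂ acts by $\partial=\sum_n u^{(n+1)}\partial/\partial u^{(n)}$. The sample element $f=u\,u^{(1)}$ below is chosen only for illustration and is not taken from the text.

```latex
% Check of (6) on f = u u^{(1)}, so that \partial f = (u^{(1)})^2 + u u^{(2)}.
\begin{align*}
n=0:&\quad \frac{\partial(\partial f)}{\partial u}-\partial\frac{\partial f}{\partial u}
      = u^{(2)}-\partial\big(u^{(1)}\big)=0,\\
n=1:&\quad \frac{\partial(\partial f)}{\partial u^{(1)}}-\partial\frac{\partial f}{\partial u^{(1)}}
      = 2u^{(1)}-\partial(u)=u^{(1)}=\frac{\partial f}{\partial u^{(0)}},\\
n=2:&\quad \frac{\partial(\partial f)}{\partial u^{(2)}}-\partial\frac{\partial f}{\partial u^{(2)}}
      = u-0=u=\frac{\partial f}{\partial u^{(1)}}.
\end{align*}
```

In each case the commutator of $\partial/\partial u^{(n)}$ with ∂ returns $\partial f/\partial u^{(n-1)}$, and it vanishes for n = 0, as (6) requires.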



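We endow V with the structure of an R-module by letting

$$u_{i\,\lambda}f=\sum_{n\in\mathbb Z_+}\lambda^n\frac{\partial f}{\partial u_i^{(n)}},\quad i\in I,$$

and extending to R by sesquilinearity. Let $\mathfrak g$ be the Lie algebra of derivations of V of the form (7), and let $\mathfrak g^\partial$ be the subalgebra of $\mathfrak g$, consisting of derivations commuting with ∂.

Theorem 1. The $\mathfrak g^\partial$-complexes $C^\bullet(R,V)$ and $\Omega^\bullet(V)$ are isomorphic.

As a result, we obtain the following interpretation of the complex $\Omega^\bullet(V)$, which explains the name “calculus of variations”. We have: $\Omega^0=V/\partial V$, $\Omega^1=\mathrm{Hom}_{F[\partial]}(R,V)=V^{\oplus\ell}$, where ℓ is the number of variables $u_i$. Elements of $\Omega^0$ are called local functionals and the image of f ∈ V in $\Omega^0$ is denoted by $\int f$. Elements of $\Omega^1$ are called local 1-forms. The differential $\delta:\Omega^0\to\Omega^1$ is identified with the variational derivative: $\delta\int f=\frac{\delta f}{\delta u}$, where $\frac{\delta f}{\delta u}=\big(\frac{\delta f}{\delta u_i}\big)_{i\in I}$ and

$$\frac{\delta f}{\delta u_i}=\sum_{n\in\mathbb Z_+}(-\partial)^n\frac{\partial f}{\partial u_i^{(n)}}. \qquad (8)$$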
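Formula (8) can be illustrated on two differential polynomials in one variable u. The elements $f=\tfrac12(u^{(1)})^2$ and $g=u\,u^{(1)}$ are illustrative choices, not examples from the text.

```latex
\begin{align*}
f=\tfrac12\big(u^{(1)}\big)^2:&\quad
  \frac{\delta f}{\delta u}=\frac{\partial f}{\partial u}-\partial\frac{\partial f}{\partial u^{(1)}}
  =0-\partial\big(u^{(1)}\big)=-u^{(2)},\\
g=u\,u^{(1)}=\partial\big(\tfrac12u^2\big):&\quad
  \frac{\delta g}{\delta u}=u^{(1)}-\partial(u)=0.
\end{align*}
```

The second computation is an instance of the general fact that total derivatives lie in the kernel of the variational derivative, which is why (8) is well defined on $V/\partial V$.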

Furthermore, the space of 2-cochains $C^2$ is identified with the space of skew-adjoint differential operators by associating to a λ-bracket $\{\cdot_\lambda\cdot\}:R^{\otimes2}\to F[\lambda]\otimes V$ the ℓ × ℓ matrix $S_{ij}(\partial)=\{u_{j\,\partial}u_i\}_\to$, where the arrow means that ∂ is moved to the right. The differential $\delta:\Omega^1\to\Omega^2$ is expressed in terms of the Frechet derivative

$$D_F(\partial)_{ij}=\sum_{n\in\mathbb Z_+}\frac{\partial F_i}{\partial u_j^{(n)}}\,\partial^n,\quad i,j\in I, \qquad (9)$$

which defines an F-linear map: $V^\ell\to V^{\oplus\ell}$. Namely: $\delta F=D_F(\partial)-D_F(\partial)^*$. The subspace of closed 2-cochains in $C^2$ is identified with the space of symplectic differential operators. A 2-cochain, which is a skew-adjoint differential operator $S_{ij}(\partial)$, can be identified with the corresponding F-linear map $(V^\ell)^2\to V/\partial V$, of “differential type”, given by

$$S(P,Q)=\sum_{i,j\in I}\int Q_i\,S_{ij}(\partial)P_j.$$

Skew-adjointness of S translates to the skew-symmetry condition S(P, Q) = −S(Q, P). More generally, the space of k-cochains $C^k$ for k ≥ 2 is identified with the space of all skew-symmetric F-linear maps $S:(V^\ell)^k\to V/\partial V$, of “differential type”:

$$S(P^1,\dots,P^k)=\sum_{\substack{i_1,\dots,i_k\in I\\ n_1,\dots,n_k\in\mathbb Z_+}}\int f^{\,n_1,\dots,n_k}_{i_1,\dots,i_k}\,(\partial^{n_1}P^1_{i_1})\cdots(\partial^{n_k}P^k_{i_k}),$$

where $f^{\,n_1,\dots,n_k}_{i_1,\dots,i_k}\in V$. The skew-symmetry condition is simply $S(P^1,\dots,P^k)=\mathrm{sign}(\sigma)\,S(P^{\sigma(1)},\dots,P^{\sigma(k)})$, for every $\sigma\in S_k$. The subspace of closed k-cochains for k ≥ 2 is the subspace of “symplectic” k-differential operators.
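The role of the Frechet derivative (9) can be seen in the simplest case of one variable u, using the standard adjoint $(\partial^n)^*=(-\partial)^n$. The two 1-forms $F=u^{(2)}$ and $G=u^{(1)}$ below are illustrative choices, not examples from the text.

```latex
\begin{align*}
F=u^{(2)}:&\quad D_F(\partial)=\partial^2,\quad D_F(\partial)^*=(-\partial)^2=\partial^2,
  \quad \delta F=D_F(\partial)-D_F(\partial)^*=0,\\
G=u^{(1)}:&\quad D_G(\partial)=\partial,\quad D_G(\partial)^*=-\partial,
  \quad \delta G=2\partial\neq0.
\end{align*}
```

So F is closed, in agreement with $F=\frac{\delta}{\delta u}\big(-\tfrac12(u^{(1)})^2\big)$, while G is not closed. The 2-cochain $\delta G=2\partial$ is skew-adjoint, and the associated map $S(P,Q)=\int 2Q\,\partial P$ is skew-symmetric because $\int(Q\,\partial P+P\,\partial Q)=\int\partial(PQ)=0$ in $V/\partial V$.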



We prove in [BDK] that the cohomology $H^j$ of the complex $\Omega^\bullet(V)$ is zero for j ≥ 1 and $H^0=C/(C\cap\partial V)$, where $C:=\{f\in V\mid\frac{\partial f}{\partial u_i^{(n)}}=0\ \forall i\in I,\ n\in\mathbb Z_+\}$, provided that V is normal, as defined in Sect. 5.6. (Any algebra of differential functions can be included in a normal one.) As a corollary, we obtain (cf. [D]) that $\mathrm{Ker}\,\frac{\delta}{\delta u}=\partial V+C$, and $F\in\mathrm{Im}\,\frac{\delta}{\delta u}$ iff $D_F(\partial)$ is a self-adjoint differential operator, provided that V is normal. The first result can be found in [D] (see also [Di] and [Vi], where it is proved under stronger conditions on V), but it is certainly much older. The second result, at least under stronger conditions on V, goes back to [H], [V]. We also obtain the classification of symplectic differential operators (cf. [D]) and of symplectic poly-differential operators for normal V, which seems to be a new result.

Thus, the interaction between the Lie conformal algebra cohomology and the variational calculus has led to progress in both theories. On the one hand, the variational calculus motivated some of our constructions in the Lie conformal algebra cohomology. On the other hand, the Lie conformal algebra cohomology interpretation of the variational complex has led to a better understanding of this complex and to a classification of symplectic differential operators.

The ground field is an arbitrary field F of characteristic 0.

We wish to thank Bojko Bakalov for very valuable comments, in particular, for the observation that our complex $C^\bullet(A,M)$ is isomorphic to the complex in [BDAK, Sect. 15.1], in the case when the Hopf algebra H is F[∂].

2. Lie Conformal Algebra Cohomology Complexes

2.1. The basic cohomology complex $\tilde\Gamma^\bullet$ and the reduced cohomology complex $\Gamma^\bullet$. Let us review, following [BKV], the definition of the basic and reduced cohomology complexes associated to a Lie conformal algebra A and an A-module M. A k-cochain of A with coefficients in M is an F-linear map

$$\tilde\gamma:A^{\otimes k}\to F[\lambda_1,\dots,\lambda_k]\otimes M,\qquad a_1\otimes\cdots\otimes a_k\mapsto\tilde\gamma_{\lambda_1,\dots,\lambda_k}(a_1,\dots,a_k),$$

satisfying the following two conditions:

A1. $\tilde\gamma_{\lambda_1,\dots,\lambda_k}(a_1,\dots,\partial a_i,\dots,a_k)=-\lambda_i\,\tilde\gamma_{\lambda_1,\dots,\lambda_k}(a_1,\dots,a_k)$ for all i,

A2. $\tilde\gamma$ is skew-symmetric w.r.t. simultaneous permutations of the $a_i$'s and the $\lambda_i$'s.

Remark 1. Note that Condition A1 implies that $\tilde\gamma_{\lambda_1,\dots,\lambda_k}(a_1,\dots,a_k)$ is zero if one of the elements $a_i$ is a torsion element of the F[∂]-module A.

We let $\tilde\Gamma^k=\tilde\Gamma^k(A,M)$ be the space of all k-cochains, and $\tilde\Gamma^\bullet=\tilde\Gamma^\bullet(A,M)=\bigoplus_{k\ge0}\tilde\Gamma^k$. The differential δ of a k-cochain $\tilde\gamma$ is defined by the following formula:

$$(\delta\tilde\gamma)_{\lambda_1,\dots,\lambda_{k+1}}(a_1,\dots,a_{k+1})=\sum_{i=1}^{k+1}(-1)^{i+1}a_{i\,\lambda_i}\,\tilde\gamma_{\lambda_1,\stackrel{i}{\check{\cdots}},\lambda_{k+1}}(a_1,\stackrel{i}{\check{\cdots}},a_{k+1})$$
$$\qquad+\sum_{\substack{i,j=1\\ i<j}}^{k+1}(-1)^{k+i+j+1}\,\tilde\gamma_{\lambda_1,\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}},\lambda_{k+1},\lambda_i+\lambda_j}(a_1,\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}},a_{k+1},[a_{i\,\lambda_i}a_j]). \qquad (10)$$

One checks that δ maps $\tilde\Gamma^k$ to $\tilde\Gamma^{k+1}$, and that $\delta^2=0$. The Z-graded space $\tilde\Gamma^\bullet(A,M)$ with the differential δ is called the basic cohomology complex associated to A and M.



Define the structure of an F[∂]-module on $\tilde\Gamma^\bullet$ by letting

$$(\partial\tilde\gamma)_{\lambda_1,\dots,\lambda_k}(a_1,\dots,a_k)=(\partial^M+\lambda_1+\cdots+\lambda_k)\,\tilde\gamma_{\lambda_1,\dots,\lambda_k}(a_1,\dots,a_k), \qquad (11)$$

where $\partial^M$ denotes the action of ∂ on M. One checks that δ and ∂ commute, and therefore $\partial\tilde\Gamma^\bullet\subset\tilde\Gamma^\bullet$ is a subcomplex. We can consider the reduced cohomology complex $\Gamma^\bullet(A,M)=\tilde\Gamma^\bullet(A,M)/\partial\tilde\Gamma^\bullet(A,M)=\bigoplus_{k\in\mathbb Z_+}\Gamma^k(A,M)$. For example, $\Gamma^0=M/\partial^MM$, and we denote, as in the calculus of variations, by $\int m$ the image of m ∈ M in $M/\partial^MM$. As before we let, for brevity, $\Gamma^\bullet=\Gamma^\bullet(A,M)$ and $\Gamma^k=\Gamma^k(A,M)$, $k\in\mathbb Z_+$. In the following sections we will find a simpler construction of the reduced cohomology complex $\Gamma^\bullet$, in terms of poly λ-brackets.
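The claim that δ and ∂ commute can be verified by hand in the lowest degree. The sketch below takes a basic 0-cochain, i.e. an element m ∈ M, uses (10) for k = 0 and (11) for k = 0, 1, and relies only on the sesquilinearity $a_\lambda(\partial^Mm)=(\partial^M+\lambda)(a_\lambda m)$ of the λ-action.

```latex
\begin{align*}
(\delta m)_{\lambda}(a) &= a_\lambda m
  &&\text{by (10) with } k=0,\\
\big(\partial(\delta m)\big)_{\lambda}(a) &= (\partial^M+\lambda)(a_\lambda m)
  &&\text{by (11) with } k=1,\\
\big(\delta(\partial m)\big)_{\lambda}(a) &= a_\lambda(\partial^M m)=(\partial^M+\lambda)(a_\lambda m)
  &&\text{by (11) with } k=0 \text{ and sesquilinearity}.
\end{align*}
```

Hence $\partial\delta m=\delta\partial m$, so δ maps $\partial M$ into $\partial\tilde\Gamma^1$ and descends to the quotient in degree 0.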

2.2. Poly λ-brackets. Let A and M be F[∂]-modules, and, as before, denote by $\partial^M$ the action of ∂ on M. For k ≥ 1, a k-λ-bracket on A with coefficients in M is, by definition, an F-linear map $c:A^{\otimes k}\to F[\lambda_1,\dots,\lambda_{k-1}]\otimes M$, denoted by

$$a_1\otimes\cdots\otimes a_k\mapsto\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}a_k\}_c,$$

satisfying the following conditions:

B1. $\{a_{1\,\lambda_1}\cdots(\partial a_i)_{\lambda_i}\cdots a_{k-1\,\lambda_{k-1}}a_k\}_c=-\lambda_i\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}a_k\}_c$, for 1 ≤ i ≤ k − 1;

B2. $\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}(\partial a_k)\}_c=(\lambda_1+\cdots+\lambda_{k-1}+\partial^M)\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}a_k\}_c$;

B3. c is skew-symmetric with respect to simultaneous permutations of the $a_i$'s and the $\lambda_i$'s in the sense that, for every permutation σ of the indices {1, . . . , k}, we have:

$$\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}a_k\}_c=\mathrm{sign}(\sigma)\,\{a_{\sigma(1)\,\lambda_{\sigma(1)}}\cdots a_{\sigma(k-1)\,\lambda_{\sigma(k-1)}}a_{\sigma(k)}\}_c\Big|_{\lambda_k\mapsto\lambda_k^\dagger}.$$

The notation in the RHS means that $\lambda_k$ is replaced by $\lambda_k^\dagger=-\sum_{j=1}^{k-1}\lambda_j-\partial^M$, if it occurs, and $\partial^M$ is moved to the left.

Remark 2. A structure of a Lie conformal algebra on A is a 2-λ-bracket on A with coefficients in A, satisfying the Jacobi identity (3).

We let $C^0=M/\partial^MM$ and, for k ≥ 1, we denote by $C^k=C^k(A,M)$ the space of all k-λ-brackets on A with coefficients in M. For example, $C^1$ is the space of all F[∂]-module homomorphisms c : A → M. We let $C^\bullet=\bigoplus_{k\in\mathbb Z_+}C^k$, the space of all poly λ-brackets. We also define $\bar C^\bullet=\bigoplus_{k\in\mathbb Z_+}\bar C^k$, where $\bar C^0=C^0=M/\partial^MM$, and $\bar C^k\subset C^k$ is the subspace of k-λ-brackets c with the following additional property: $\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}a_k\}_c$ is zero if one of the elements $a_i$ is a torsion element in A. Clearly, $\bar C^1$ need not be equal to $C^1$. On the other hand, it is easy to check, using the sesquilinearity Conditions B1 and B2, that $\bar C^k=C^k$ for k ≥ 2.
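The last claim, that a k-λ-bracket with k ≥ 2 automatically vanishes on torsion elements, can be spelled out as follows. This is only a sketch of one possible argument. It assumes $p(\partial)a_i=0$ for a nonzero polynomial p of degree d with leading coefficient $p_d$, and writes $Q=\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}a_k\}_c$ (the names p, d, $p_d$, Q, N, $q_N$ are introduced here for illustration).

```latex
\begin{align*}
i\le k-1:&\quad 0=\{\cdots(p(\partial)a_i)_{\lambda_i}\cdots\}_c = p(-\lambda_i)\,Q
  \ \Longrightarrow\ Q=0,\\
&\quad\text{since } F[\lambda_1,\dots,\lambda_{k-1}]\otimes M
  \text{ is torsion-free over } F[\lambda_1,\dots,\lambda_{k-1}];\\
i=k:&\quad 0=\{\cdots(p(\partial)a_k)\}_c = p(\lambda_1+\cdots+\lambda_{k-1}+\partial^M)\,Q .\\
&\quad\text{If } Q=\textstyle\sum_{j\le N}\lambda_1^{\,j}q_j \text{ with } q_N\ne0,
  \text{ the coefficient of } \lambda_1^{\,N+d} \text{ on the right is } p_d\,q_N\ne0,\\
&\quad\text{a contradiction, so } Q=0 .
\end{align*}
```

For k = 1 there is no variable $\lambda_1$, and Condition B2 only gives $p(\partial^M)c(a)=0$, which does not force c(a) = 0. This is why $\bar C^1$ can be strictly smaller than $C^1$.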



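2.3. The complex of poly λ-brackets. We next define a differential d on the space $C^\bullet$ of poly λ-brackets such that $d(C^k)\subset C^{k+1}$ and $d^2=0$, thus making $C^\bullet$ a cohomology complex. For $\int m\in C^0=M/\partial^MM$, we let $d\int m\in C^1$ be the following F[∂]-module homomorphism:

$$\big(d\textstyle\int m\big)(a)=\{a\}_{d\int m}:=a_{-\partial^M}m. \qquad (12)$$

This is well defined since, if $m\in\partial^MM$, the RHS is zero due to sesquilinearity. For $c\in C^k$, with k ≥ 1, we let $dc\in C^{k+1}$ be the following poly λ-bracket:

$$\{a_{1\,\lambda_1}\cdots a_{k\,\lambda_k}a_{k+1}\}_{dc}:=\sum_{i=1}^{k}(-1)^{i+1}a_{i\,\lambda_i}\big\{a_{1\,\lambda_1}\stackrel{i}{\check{\cdots}}a_{k\,\lambda_k}a_{k+1}\big\}_c$$
$$\qquad+\sum_{\substack{i,j=1\\ i<j}}^{k}(-1)^{k+i+j+1}\big\{a_{1\,\lambda_1}\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}}a_{k\,\lambda_k}\,a_{k+1\,\lambda^\dagger_{k+1}}[a_{i\,\lambda_i}a_j]\big\}_c$$
$$\qquad+(-1)^k\,a_{k+1\,\lambda^\dagger_{k+1}}\big\{a_{1\,\lambda_1}\cdots a_{k-1\,\lambda_{k-1}}a_k\big\}_c+\sum_{i=1}^{k}(-1)^i\big\{a_{1\,\lambda_1}\stackrel{i}{\check{\cdots}}a_{k\,\lambda_k}[a_{i\,\lambda_i}a_{k+1}]\big\}_c, \qquad (13)$$

where, as before, $\lambda^\dagger_{k+1}=-\sum_{j=1}^{k}\lambda_j-\partial^M$, and $\partial^M$ is moved to the left. For example, for an F[∂]-module homomorphism c : A → M, we have

$$\{a_\lambda b\}_{dc}=a_\lambda c(b)-b_{-\lambda-\partial^M}c(a)-c([a_\lambda b]). \qquad (14)$$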
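The identity $d^2=0$ can be checked directly on $C^0$ by combining (12) and (14). In the sketch below, $\int m$ denotes the class of m ∈ M in $M/\partial^MM$ and $c=d\int m$, so that $c(a)=a_{-\partial^M}m$; the auxiliary variable μ is introduced only for the computation, and every substitution is understood with $\partial^M$ moved to the left.

```latex
\begin{align*}
\{a_\lambda b\}_{d(d\int m)}
 &= a_\lambda\big(b_{-\partial^M}m\big)-b_{-\lambda-\partial^M}\big(a_{-\partial^M}m\big)
    -[a_\lambda b]_{-\partial^M}m\\
 &= \Big(a_\lambda(b_\mu m)-b_\mu(a_\lambda m)-[a_\lambda b]_{\lambda+\mu}m\Big)
    \Big|_{\mu=-\lambda-\partial^M}\\
 &= 0 .
\end{align*}
```

The second equality uses sesquilinearity, e.g. $a_\lambda(\partial^Mx)=(\partial^M+\lambda)(a_\lambda x)$ turns $-\partial^M$ inside $a_\lambda(\cdot)$ into $-\lambda-\partial^M$ outside it, and the last one is the Jacobi identity (3) for the A-module M.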

Proposition 1. (a) For $c\in C^k$, we have $d(c)\in C^{k+1}$ and $d^2(c)=0$. This makes $(C^\bullet,d)$ a cohomology complex. (b) $d(\bar C^k)\subset\bar C^{k+1}$ for all k ≥ 0. Hence $(\bar C^\bullet,d)$ is a cohomology subcomplex of $(C^\bullet,d)$.

Proof. We prove part (b) first. For k ≥ 1 there is nothing to prove. For k = 0 just notice that, if $\int m\in M/\partial^MM$ and a ∈ A is a torsion element, then, by (12), we have $(d\int m)(a)=0$, since torsion elements of A act trivially in any module [K]. Hence $d\int m\in\bar C^1$. In order to prove part (a) we have to check that, if $c\in C^k$, then dc, defined by (12) and (13), satisfies Conditions B1, B2, B3, and d(dc) = 0. To simplify the arguments, we rewrite (13) in a concise form:

$$\{a_{1\,\lambda_1}\cdots a_{k\,\lambda_k}a_{k+1}\}_{dc}:=\Bigg(\sum_{i=1}^{k+1}(-1)^{i+1}a_{i\,\lambda_i}\big\{a_{1\,\lambda_1}\stackrel{i}{\check{\cdots}}a_{k\,\lambda_k}a_{k+1}\big\}_c$$
$$\qquad+\sum_{\substack{i,j=1\\ i<j}}^{k+1}(-1)^{k+i+j+1}\big\{a_{1\,\lambda_1}\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}}a_{k+1\,\lambda_{k+1}}[a_{i\,\lambda_i}a_j]\big\}_c\Bigg)\Bigg|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}, \qquad (15)$$

where the RHS is evaluated at $\lambda_{k+1}=\lambda^\dagger_{k+1}=-\sum_{j=1}^{k}\lambda_j-\partial^M$, with $\partial^M$ acting from the left. The above equation should be interpreted by saying that, in the first term in



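the RHS, for i = k + 1, the last index $\lambda_k$ does not appear in the poly λ-bracket. Let us replace $a_h$ by $\partial a_h$ in Eq. (15). It is not hard to check, using Conditions B1 and B2 for c and the sesquilinearity of the λ-action of A on M, that, for 1 ≤ h ≤ k, each term in the RHS of (15) gets multiplied by $-\lambda_h$, while, for h = k + 1, each term gets multiplied by $-\lambda^\dagger_{k+1}=\sum_{j=1}^{k}\lambda_j+\partial^M$. Hence dc satisfies Conditions B1 and B2.

In order to prove B3, let σ be a permutation of the set {1, . . . , k + 1}. A basic observation is that, if we first replace $\lambda_{\sigma(k+1)}$ by $\lambda^\dagger_{\sigma(k+1)}=-\lambda_1-\stackrel{\sigma(k+1)}{\check{\cdots}}-\lambda_{k+1}-\partial^M$, and then $\lambda_{k+1}$ by $\lambda^\dagger_{k+1}=-\lambda_1-\cdots-\lambda_k-\partial^M$, as a result $\lambda_{\sigma(k+1)}$ stays unchanged. Notice, moreover, that, for 1 ≤ i ≤ k + 1, $\{\sigma(1),\stackrel{i}{\check{\cdots}},\sigma(k+1)\}$ is a permutation of $\{1,\stackrel{\sigma(i)}{\check{\cdots}},k+1\}$, and its sign is $(-1)^{i+\sigma(i)}\mathrm{sign}(\sigma)$. Hence, using the assumption B3 on c, we get

$$a_{\sigma(i)\,\lambda_{\sigma(i)}}\big\{a_{\sigma(1)\,\lambda_{\sigma(1)}}\stackrel{i}{\check{\cdots}}a_{\sigma(k)\,\lambda_{\sigma(k)}}a_{\sigma(k+1)}\big\}_c\Big|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}$$
$$\qquad=\mathrm{sign}(\sigma)(-1)^{i+\sigma(i)}\,a_{\sigma(i)\,\lambda_{\sigma(i)}}\big\{a_{1\,\lambda_1}\stackrel{\sigma(i)}{\check{\cdots}}a_{k\,\lambda_k}a_{k+1}\big\}_c\Big|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}. \qquad (16)$$

Similarly, for the second term in (15), we notice that $\{\sigma(1),\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}},\sigma(k+1)\}$ is a permutation of $\{1,\stackrel{\sigma(i)}{\check{\cdots}}\stackrel{\sigma(j)}{\check{\cdots}},k+1\}$, and its sign is $(-1)^{i+j+\sigma(i)+\sigma(j)}\mathrm{sign}(\sigma)$ if σ(i) < σ(j), and it is $(-1)^{i+j+\sigma(i)+\sigma(j)+1}\mathrm{sign}(\sigma)$ if σ(i) > σ(j). Hence, for σ(i) < σ(j) we have

$$\big\{a_{\sigma(1)\,\lambda_{\sigma(1)}}\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}}a_{\sigma(k+1)\,\lambda_{\sigma(k+1)}}[a_{\sigma(i)\,\lambda_{\sigma(i)}}a_{\sigma(j)}]\big\}_c\Big|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}$$
$$\qquad=\mathrm{sign}(\sigma)(-1)^{i+j+\sigma(i)+\sigma(j)}\big\{a_{1\,\lambda_1}\stackrel{\sigma(i)}{\check{\cdots}}\stackrel{\sigma(j)}{\check{\cdots}}a_{k+1\,\lambda_{k+1}}[a_{\sigma(i)\,\lambda_{\sigma(i)}}a_{\sigma(j)}]\big\}_c\Big|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}, \qquad (17)$$

while for σ(i) > σ(j) we have, by the skew-symmetry of the λ-bracket in A,

$$\big\{a_{\sigma(1)\,\lambda_{\sigma(1)}}\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}}a_{\sigma(k+1)\,\lambda_{\sigma(k+1)}}[a_{\sigma(i)\,\lambda_{\sigma(i)}}a_{\sigma(j)}]\big\}_c\Big|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}$$
$$\qquad=\mathrm{sign}(\sigma)(-1)^{i+j+\sigma(i)+\sigma(j)}\big\{a_{1\,\lambda_1}\stackrel{\sigma(j)}{\check{\cdots}}\stackrel{\sigma(i)}{\check{\cdots}}a_{k+1\,\lambda_{k+1}}[a_{\sigma(j)\,-\lambda_{\sigma(i)}-\partial}\,a_{\sigma(i)}]\big\}_c\Big|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}$$
$$\qquad=\mathrm{sign}(\sigma)(-1)^{i+j+\sigma(i)+\sigma(j)}\big\{a_{1\,\lambda_1}\stackrel{\sigma(j)}{\check{\cdots}}\stackrel{\sigma(i)}{\check{\cdots}}a_{k+1\,\lambda_{k+1}}[a_{\sigma(j)\,\lambda_{\sigma(j)}}a_{\sigma(i)}]\big\}_c\Big|_{\lambda_{k+1}=\lambda^\dagger_{k+1}}. \qquad (18)$$

In the last identity we used the assumption that c satisfies condition B2. Clearly, Eqs. (16), (17) and (18), together with the definition (15) of dc, imply that dc satisfies condition B3. We are left to prove that $d^2c=0$. We have, by (15),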
In the last identity we used the assumption that c satisfies condition B2. Clearly, Eqs. (16), (17) and (18), together with the definition (15) of dc, imply that dc satisfies condition B3. We are left to prove that d 2 c = 0. We have, by (15),


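$$\{a_{1\,\lambda_1}\cdots a_{k+1\,\lambda_{k+1}}a_{k+2}\}_{d^2c}=\Bigg(\sum_{i=1}^{k+2}(-1)^{i+1}a_{i\,\lambda_i}\big\{a_{1\,\lambda_1}\stackrel{i}{\check{\cdots}}a_{k+1\,\lambda_{k+1}}a_{k+2}\big\}_{dc}$$
$$\qquad+\sum_{\substack{i,j=1\\ i<j}}^{k+2}(-1)^{k+i+j}\big\{a_{1\,\lambda_1}\stackrel{i}{\check{\cdots}}\stackrel{j}{\check{\cdots}}a_{k+2\,\lambda_{k+2}}[a_{i\,\lambda_i}a_j]\big\}_{dc}\Bigg)\Bigg|_{\lambda_{k+2}=\lambda^\dagger_{k+2}}, \qquad (19)$$

where, in the RHS, we replace $\lambda_{k+2}$ by $\lambda^\dagger_{k+2}=-\sum_{j=1}^{k+1}\lambda_j-\partial^M$, and $\partial^M$ is moved to the left. Again by (15) and by sesquilinearity of the λ-action of A on M, the first term in the RHS of (19) is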

k+2 ⎜ ⎜ (−1)i+ j (i, j)a j λ j ⎝

i

j

a1λ1 · ˇ· ·· ˇ· · ak+1λk+1 ak+2

ai λi

i, j=1 i = j

c

k+2

+

(−1)k+i+ j+h (i, h)( j, h)

i, j,h=1 i< j i, j =h

j

i

⎞

h

⎟ ⎟ ⎠

× ah λh a1λ1 · ˇ· ·· ˇ· ·· ˇ· · ak+2λk+2 [ai λi a j ]

λk+2 =λ†k+2

c

,

(20)

where (i, j) is +1 if i < j and −1 if i > j. Similarly, by (13) the second term in the RHS of (19) is ⎛ ⎜ k+2 ⎜ h i j ⎜ (−1)k+i+ j+h+1 (h, i)(h, j)ah λh a1λ1 · ˇ· ·· ˇ· ·· ˇ· · ak+2λk+2 [ai λi a j ] ⎜ ⎜ c ⎝i, j,h=1 i< j i, j =h

+

k+2

(−1)

i+ j

[ai λi a j ]λi +λ j

i

j

a1λ1 · ˇ· ·· ˇ· · ak+1λk+1 ak+2

i, j=1 i< j

+

c

k+2

(−1)i+ j+ p+q ( p, i)( p, j)(q, i)(q, j)

i, j, p,q=1 i< j, p h + j, the proof is similar to that of Proposition 3. Thus we only have to consider the case k = h + j. Recalling (84) and (85), we have

ι y (ιx c) = ψ µ (χh φ)µ {a1λ1 · · · ah−1 λh−1 ah λh b1µ1 · · · b j−1 µ j−1 b j }c . Applying the skew-symmetry Condition B3 for c and using the definition (78) of χh , we get, after integration by parts, that the RHS is

(−1)h j (χ j ψ)µ φ µ {b1µ1 · · · b j−1 µ j−1 b j µ j a1λ1 · · · ah−1 λh−1 ah }c , which is the same as (−1)h j ιx (ι y c). For example, given an element m ∈ C0 = {m ∈ M | ∂m = 0}, we have {a1λ1 · · · ak−1 λk−1 ak }ιm c = m{a1λ1 · · · ak−1 λk−1 ak }c . Recall also that C1 = A ⊗ M/∂(A ⊗ M). The contraction operators associated to 1-chains are given by the following formulas: if c ∈ C 1 = HomF[∂] (A, M), then ιa⊗m c = mc(a), (90) while if c ∈ C k , with k ≥ 2, then {a2λ2 · · · ak−1 λk−1 ak }ιa1 ⊗m c = {a1 ∂ M a2λ2 · · · ak−1 λk−1 ak }c → m,

(91)

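where the arrow in the RHS means, as usual, that $\partial^M$ should be moved to the right. Also we have the following formulas for the Lie derivative $L_x=[d,\iota_x]$ by a 1-chain $x\in C_1$ acting on $C^0=M/\partial^MM$ and $C^1=\mathrm{Hom}_{F[\partial]}(A,M)$:

$$L_{a\otimes m}\textstyle\int n=\int\big(a_{\partial^M}n\big)_\to m,$$
$$(L_{a\otimes m}c)(b)=\big(a_{\partial^M}c(b)\big)_\to m+{}_\leftarrow\big(b_{-\partial^M}m\big)\,c(a)-c\big([a_{\partial^M}b]\big)_\to m, \qquad (92)$$

where the left arrow in the RHS means, as usual, that $\partial^M$ should be moved to the left.

The definitions of the contraction operators associated to elements of $\Gamma_\bullet$ and $C_\bullet$ are “compatible”. This is stated in the following:

Theorem 4. For $x\in C_h$ and $\gamma\in\Gamma^k$, with k ≥ h, we have $\iota_x(\psi^k(\gamma))=\psi^{k-h}(\iota_{\chi_h(x)}(\gamma))$, where $\psi^k:\Gamma^k\to C^k$ denotes the injective linear map defined in Theorem 2, and $\chi_h:C_h\to\Gamma_h$ denotes the linear map defined in Proposition 6. In other words, there is a commutative diagram of linear maps:

$$\begin{array}{ccc} C^k & \xrightarrow{\ \iota_x\ } & C^{k-h}\\[2pt] {\scriptstyle\psi^k}\big\uparrow & & \big\uparrow{\scriptstyle\psi^{k-h}}\\[2pt] \Gamma^k & \xrightarrow{\ \iota_\xi\ } & \Gamma^{k-h} \end{array} \qquad (93)$$

(with both vertical maps injective), provided that $\xi\in\Gamma_h$ and $x\in C_h$ are related by $\xi=\chi_h(x)$.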



Proof. Let γ ∈ Γk be a representative of γ ∈ Γ k , and let a1 ⊗ · · · ⊗ ah ⊗ φ ∈ ⊗h A ⊗ Hom(F[λ1 , . . . , λh−1 ], M) be a representative of x ∈ C h . Recalling the definition (22) of ψ k and the definition (84) of ιx , we have

µ γ = (χ φ) (a , . . . , a ) , (94) {ah+1λh+1 · · · ak−1 λk−1 ak }ιx ψ k ( † h 1 k γ) λ ,...,λ ,λ 1

where, in the RHS, λ†k stands for −

k−1

k

k−1

j=1 λ j

− ∂ M , with ∂ M acting on the argument of M (χh φ)µ . By Lemmas 3 and (6)(c), we can replace λ†k by − k−1 j=h+1 λ j − ∂ , where now M µ ∂ is moved to the left of (χh φ) . Hence, the RHS of (94) is the same as (χh φ)µ γλ1 ,λ2 ,...,λk (a1 , . . . , ak ) λ →λ† k k = {ah+1λh+1 · · · ak−1 λk−1 ak }ψ k−h (ιχ (x) ( γ )) , h

thus completing the proof of the theorem.

4.7. Lie conformal algebroids. A Lie conformal algebroid is an analogue of a Lie algebroid.

Definition 1. A Lie conformal algebroid is a pair (A, M), where A is a Lie conformal algebra, M is a commutative associative differential algebra with derivative $\partial^M$, such that A is a left M-module and M is a left A-module, satisfying the following compatibility conditions (a, b ∈ A, m, n ∈ M):

L1. $\partial(ma)=(\partial^Mm)a+m(\partial a)$,

L2. $a_\lambda(mn)=(a_\lambda m)n+m(a_\lambda n)$,

L3. $[a_\lambda\,mb]=(a_\lambda m)b+m[a_\lambda b]$.

It follows from Condition L3 and skew-symmetry (2) of the λ-bracket, that

L3'. $[ma_\lambda b]=\big(e^{\partial^M\partial_\lambda}m\big)[a_\lambda b]+(a_{\lambda+\partial}m)_\to b$,

where the first term in the RHS is $\sum_{i=0}^{\infty}\frac{1}{i!}\big((\lambda+\partial^M)^im\big)(a_{(i)}b)$, and in the second term the arrow means, as usual, that ∂ should be moved to the right, acting on b.

We next give two examples analogous to those in the Lie algebroid case. Let M be, as above, a commutative associative differential algebra. Recall from Sect. 3 that a conformal endomorphism on M is an F-linear map $\varphi(=\varphi_\lambda):M\to F[\lambda]\otimes M$ satisfying $\varphi_\lambda(\partial^Mm)=(\partial^M+\lambda)\varphi_\lambda(m)$. The space Cend(M) of conformal endomorphisms is then a Lie conformal algebra with the F[∂]-module structure given by $(\partial\varphi)_\lambda=-\lambda\varphi_\lambda$, and the λ-bracket given by $[\varphi_\lambda\psi]_\mu=\varphi_\lambda\circ\psi_{\mu-\lambda}-\psi_{\mu-\lambda}\circ\varphi_\lambda$.

Example 2. Let Cder(M) be the subalgebra of the Lie conformal algebra Cend(M) consisting of all conformal derivations on M, namely of the conformal endomorphisms satisfying the Leibniz rule: $\varphi_\lambda(mn)=\varphi_\lambda(m)n+m\varphi_\lambda(n)$. Then the pair (Cder(M), M) is a Lie conformal algebroid, where M carries the tautological Cder(M)-module structure, and Cder(M) carries the following M-module structure:

$$(m\varphi)_\lambda=\big(e^{\partial^M\partial_\lambda}m\big)\varphi_\lambda. \qquad (95)$$

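This is indeed an M-module, since $e^{x\partial^M}(mn)=(e^{x\partial^M}m)(e^{x\partial^M}n)$. Furthermore, Condition L1 holds thanks to the obvious identity $e^{\partial^M\partial_\lambda}\lambda=(\lambda+\partial^M)e^{\partial^M\partial_\lambda}$. Condition L2 holds by definition. Finally, for Condition L3 we have

$$[\varphi_\lambda\,m\psi]_\mu(n)=\varphi_\lambda\big((m\psi)_{\mu-\lambda}(n)\big)-(m\psi)_{\mu-\lambda}(\varphi_\lambda(n))$$
$$\qquad=\varphi_\lambda\Big(\big(e^{\partial^M\partial_\mu}m\big)\psi_{\mu-\lambda}(n)\Big)-\big(e^{\partial^M\partial_\mu}m\big)\psi_{\mu-\lambda}(\varphi_\lambda(n))$$
$$\qquad=\big(e^{(\lambda+\partial^M)\partial_\mu}\varphi_\lambda(m)\big)\psi_{\mu-\lambda}(n)+\big(e^{\partial^M\partial_\mu}m\big)\Big(\varphi_\lambda\psi_{\mu-\lambda}(n)-\psi_{\mu-\lambda}(\varphi_\lambda(n))\Big)$$
$$\qquad=\big(e^{\partial^M\partial_\mu}\varphi_\lambda(m)\big)\psi_\mu(n)+\big(e^{\partial^M\partial_\mu}m\big)[\varphi_\lambda\psi]_\mu(n)=\big(\varphi_\lambda(m)\psi+m[\varphi_\lambda\psi]\big)_\mu(n).$$

Example 3. Assume, as in Sect. 4.3, that A is a Lie conformal algebra and M is an A-module endowed with a commutative, associative product, such that $\partial^M:M\to M$, and $a_\lambda:M\to C[\lambda]\otimes M$, for a ∈ A, satisfy the Leibniz rule. The space M ⊗ A has a natural structure of F[∂]-module, where ∂ acts as

$$\partial(m\otimes a)=(\partial^Mm)\otimes a+m\otimes(\partial a). \qquad (96)$$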

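Clearly, M ⊗ A is a left M-module via multiplication on the first factor. We define a left λ-action of M ⊗ A on M by

$$(m\otimes a)_\lambda n=\big(e^{\partial^M\partial_\lambda}m\big)(a_\lambda n), \qquad (97)$$

and a λ-bracket on M ⊗ A by

$$[(m\otimes a)_\lambda(n\otimes b)]=\big(e^{\partial^M\partial_\lambda}m\big)n\otimes[a_\lambda b]+\big((m\otimes a)_\lambda n\big)\otimes b-e^{\partial\partial_\lambda}\Big(\big((n\otimes b)_{-\lambda}m\big)\otimes a\Big). \qquad (98)$$

We claim that (96) and (98) make M ⊗ A a Lie conformal algebra, (97) makes M an M ⊗ A-module, and the pair (M ⊗ A, M) is a Lie conformal algebroid. This will be proved in Proposition 9, using Lemmas 8 and 9.

Lemma 8. (a) The following λ-bracket defines a Lie conformal algebra structure on the C[∂]-module M ⊗ A:

$$[(m\otimes a)_\lambda(n\otimes b)]_0=\big(e^{\partial^M\partial_\lambda}m\big)n\otimes[a_\lambda b]. \qquad (99)$$

(b) For x, y ∈ M ⊗ A and m ∈ M, we have

$$[mx_\lambda y]_0=\big(e^{\partial^M\partial_\lambda}m\big)[x_\lambda y]_0,\qquad[x_\lambda\,my]_0=m[x_\lambda y]_0. \qquad (100)$$

Proof. For the first sesquilinearity condition, we have

$$\big[\partial(m\otimes a)_\lambda(n\otimes b)\big]_0=\big(e^{\partial^M\partial_\lambda}\partial^Mm\big)n\otimes[a_\lambda b]-\big(e^{\partial^M\partial_\lambda}m\big)n\otimes\lambda[a_\lambda b]=-\lambda[(m\otimes a)_\lambda(n\otimes b)]_0.$$

The second sesquilinearity condition and skew-symmetry can be proved in a similar way, and they are left to the reader. Let us check the Jacobi identity. We have

$$\big[(m\otimes a)_\lambda\big[(n\otimes b)_\mu(p\otimes c)\big]_0\big]_0=\big(e^{\partial^M\partial_\lambda}m\big)\big(e^{\partial^M\partial_\mu}n\big)p\otimes[a_\lambda[b_\mu c]].$$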



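Exchanging a ⊗ m with b ⊗ n and λ with µ, we get

$$\big[(n\otimes b)_\mu[(m\otimes a)_\lambda(p\otimes c)]_0\big]_0=\big(e^{\partial^M\partial_\lambda}m\big)\big(e^{\partial^M\partial_\mu}n\big)p\otimes[b_\mu[a_\lambda c]].$$

Furthermore, we have

$$\big[[m\otimes a_\lambda n\otimes b]_{0\,\nu}\,p\otimes c\big]_0=\Big(e^{\partial^M\partial_\nu}\big(\big(e^{\partial^M\partial_\lambda}m\big)n\big)\Big)p\otimes[[a_\lambda b]_\nu c].$$

Putting ν = λ + µ, the RHS becomes

$$\big(e^{\partial^M\partial_\lambda}m\big)\big(e^{\partial^M\partial_\mu}n\big)p\otimes[[a_\lambda b]_{\lambda+\mu}c].$$

Hence, the Jacobi identity for the λ-bracket (99) follows immediately from the Jacobi identity for the λ-bracket on A. This proves part (a). Part (b) is immediate.

We define another λ-product on M ⊗ A:

$$(m\otimes a)_\lambda(n\otimes b)=\big((m\otimes a)_\lambda n\big)\otimes b. \qquad (101)$$

Notice that the λ-bracket (98) can be nicely written in terms of the λ-bracket (99) and the λ-product (101):

$$[x_\lambda y]=[x_\lambda y]_0+x_\lambda y-y_{-\lambda-\partial}\,x. \qquad (102)$$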
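The way (102) reproduces (98) can be made explicit term by term. The sketch below takes $x=m\otimes a$ and $y=n\otimes b$ and assumes, by analogy with the convention stated for (2), that in $y_{-\lambda-\partial}x$ the operator ∂ is moved to the left; the coefficients $c_k$ and the expansion variable ν are notation introduced here for illustration.

```latex
\begin{align*}
[x_\lambda y]_0 &= \big(e^{\partial^M\partial_\lambda}m\big)\,n\otimes[a_\lambda b]
  &&\text{by (99)},\\
x_\lambda y &= \big((m\otimes a)_\lambda n\big)\otimes b
  &&\text{by (101)},\\
y_{-\lambda-\partial}\,x &= \sum_{k}(-\lambda-\partial)^k c_k
   = e^{\partial\partial_\lambda}\sum_k(-\lambda)^k c_k
   = e^{\partial\partial_\lambda}\Big(\big((n\otimes b)_{-\lambda}m\big)\otimes a\Big),
\end{align*}
% where y_\nu x = \sum_k \nu^k c_k, and e^{\partial\partial_\lambda} f(\lambda) = f(\lambda+\partial)
% is Taylor's formula with \partial acting from the left.
```

Adding the first two lines and subtracting the third gives exactly the three terms on the right-hand side of (98).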

Lemma 9. (a) The λ-product (101) satisfies both sesquilinearity conditions (for x, y ∈ M ⊗ A):

$$(\partial x)_\lambda y=-\lambda\,x_\lambda y,\qquad x_\lambda(\partial y)=(\lambda+\partial)(x_\lambda y). \qquad (103)$$

(b) For x ∈ M ⊗ A, m ∈ M and y either in M ⊗ A or in M, we have

$$(mx)_\lambda y=\big(e^{\partial^M\partial_\lambda}m\big)\,x_\lambda y,\qquad x_\lambda(my)=(x_\lambda m)y+m(x_\lambda y). \qquad (104)$$

(c) We have the following identity for x, y, z ∈ M ⊗ A:

$$x_\lambda[y_\mu z]_0=[(x_\lambda y)_{\lambda+\mu}z]_0+[y_\mu(x_\lambda z)]_0. \qquad (105)$$

(d) We have the following identity for x, y ∈ M ⊗ A and z either in M or in M ⊗ A:

$$x_\lambda(y_\mu z)-y_\mu(x_\lambda z)=[x_\lambda y]_{\lambda+\mu}z. \qquad (106)$$

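Proof. We have

$$\big(\partial(m\otimes a)\big)_\lambda(n\otimes b)=\Big(e^{\partial^M\partial_\lambda}(\partial^M-\lambda)m\Big)(a_\lambda n)\otimes b.$$

The first sesquilinearity condition follows from the obvious identity $e^{\partial^M\partial_\lambda}(\partial^M-\lambda)=-\lambda e^{\partial^M\partial_\lambda}$. The second sesquilinearity condition can be proved in a similar way. This proves part (a). Part (b) is immediate. For parts (c) and (d), let x = a ⊗ m, y = b ⊗ n, z = c ⊗ p ∈ A ⊗ M. We have

$$x_\lambda[y_\mu z]_0=\big(e^{\partial^M\partial_\lambda}m\big)\,a_\lambda\Big(\big(e^{\partial^M\partial_\mu}n\big)p\Big)\otimes[b_\mu c]. \qquad (107)$$

Similarly,

$$[(x_\lambda y)_\nu z]_0=\Big(e^{\partial^M\partial_\nu}\big(\big(e^{\partial^M\partial_\lambda}m\big)(a_\lambda n)\big)\Big)p\otimes[b_\nu c]. \qquad (108)$$

Hence, if we put ν = λ + µ, the RHS becomes

$$\big(e^{\partial^M\partial_\lambda}m\big)\Big(e^{\partial^M\partial_\mu}(a_\lambda n)\Big)p\otimes[b_{\lambda+\mu}c]=\big(e^{\partial^M\partial_\lambda}m\big)\Big(a_\lambda\,e^{\partial^M\partial_\mu}n\Big)p\otimes[b_\mu c], \qquad (109)$$

where we used the sesquilinearity of the λ-bracket on A. Furthermore, we have

$$[y_\mu(x_\lambda z)]_0=\big(e^{\partial^M\partial_\mu}n\big)\big(e^{\partial^M\partial_\lambda}m\big)(a_\lambda p)\otimes[b_\mu c]. \qquad (110)$$

Combining Eqs. (107), (109) and (110), we immediately get (105), thanks to the assumption that the λ-action of A on M is a derivation of the commutative associative product on M. We are left to prove part (d). We have

$$x_\lambda(y_\mu p)=\big(e^{\partial^M\partial_\lambda}m\big)\,a_\lambda\Big(\big(e^{\partial^M\partial_\mu}n\big)(b_\mu p)\Big)$$
$$\qquad=\big(e^{\partial^M\partial_\lambda}m\big)\big(e^{\partial^M\partial_\mu}n\big)\,a_\lambda(b_\mu p)+\big(e^{\partial^M\partial_\lambda}m\big)\Big(e^{\partial^M\partial_\mu}(a_\lambda n)\Big)(b_{\lambda+\mu}p). \qquad (111)$$

For the second equality, we used the Leibniz rule and the sesquilinearity condition for the λ-action of A on M. Exchanging x with y and λ with µ, we have

$$y_\mu(x_\lambda p)=\big(e^{\partial^M\partial_\lambda}m\big)\big(e^{\partial^M\partial_\mu}n\big)\,b_\mu(a_\lambda p)+\big(e^{\partial^M\partial_\mu}n\big)\Big(e^{\partial^M\partial_\lambda}(b_\mu m)\Big)(a_{\lambda+\mu}p). \qquad (112)$$

By similar computations, we get

$$(x_\lambda y)_{\lambda+\mu}z=\big(e^{\partial^M\partial_\lambda}m\big)\Big(e^{\partial^M\partial_\mu}(a_\lambda n)\Big)(b_{\lambda+\mu}p), \qquad (113)$$

and

$$(y_{-\lambda-\partial}\,x)_{\lambda+\mu}p=\big(e^{\partial^M\partial_\mu}n\big)\Big(e^{\partial^M\partial_\lambda}(b_\mu m)\Big)(a_{\lambda+\mu}p). \qquad (114)$$

Finally, it follows by a straightforward computation that

$$[x_\lambda y]_{0\,\lambda+\mu}\,z=\big(e^{\partial^M\partial_\lambda}m\big)\big(e^{\partial^M\partial_\mu}n\big)[a_\lambda b]_{\lambda+\mu}p. \qquad (115)$$

Equation (106) is obtained combining Eqs. (111), (112), (113), (114) and (115).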

Proposition 9. (a) The λ-bracket (98) defines a Lie conformal algebra structure on the F[∂]-module M ⊗ A. (b) The λ-action (97) defines a structure of an M ⊗ A-module on M. (c) The pair (M ⊗ A, M) is a Lie conformal algebroid. (d) We have a Lie conformal algebroid homomorphism (M ⊗ A, M) → (Cder(M), M), given by the identity map on M and the following Lie conformal algebra homomorphism from M ⊗ A to Cder(M):

$$m\otimes a\mapsto\big(e^{\partial^M\partial_\lambda}m\big)a_\lambda.$$



Proof. It immediately follows from Lemma 8 and Lemma 9(a) that the λ-bracket (102) satisfies sesquilinearity and skew-symmetry. Furthermore, the Jacobi identity for the λ-bracket (98) follows from Lemma 8 and Eqs. (105) and (106). This proves part (a). Part (b) is Lemma 9(d), in the case z ∈ M. For part (c) we need to check Conditions L1, L2 and L3. The first two conditions are immediate. The last one follows from Eqs. (100) and (104). Finally, part (d) is straightforward and is left to the reader.

4.8. The Lie algebra structure on $\Pi C_1$ and the $\Pi C_1$-structure on the complex $(C^\bullet,d)$. Recall that the space of 1-chains of the complex $(C^\bullet,d)$ is $C_1=(A\otimes M)/\partial(A\otimes M)$ with odd parity. We want to define a Lie algebra structure on $\Pi C_1$, where, as usual, Π denotes parity reversing, making $C^\bullet$ into a $\Pi C_1$-complex. By Proposition 9(a), we have a Lie conformal algebra structure on M ⊗ A. Hence, if we identify M ⊗ A with A ⊗ M by exchanging the two factors, we get a structure of a Lie algebra on the quotient space (A ⊗ M)/∂(A ⊗ M), induced by the λ-bracket at λ = 0 [K]. Explicitly, we get the following well-defined Lie algebra bracket on $\Pi C_1=(A\otimes M)/\partial(A\otimes M)$:

$$[a\otimes m,\,b\otimes n]=[a_{\partial_1^M}b]_\to\otimes mn+b\otimes\big(a_{\partial^M}n\big)_\to m-a\otimes\big(b_{\partial^M}m\big)_\to n, \qquad (116)$$

where in the RHS, as usual, the right arrow means that $\partial^M$ should be moved to the right, and in the first summand $\partial_1^M$ denotes $\partial^M$ acting only on the first factor m.

Recall from Sect. 4.4 that $\tilde\Gamma_1=(A\otimes M[[x]])\big/(\partial\otimes1+1\otimes\partial_x)(A\otimes M[[x]])$, and $\Gamma_1=\{\xi\in\tilde\Gamma_1\mid\partial\xi=0\}$, where the action of ∂ on $\tilde\Gamma_1$ is given by (59). Under this identification, the map $\chi_1:C_1\to\Gamma_1$ defined by (78) and (82) is given by

$$\chi_1(a\otimes m)=a\otimes e^{x\partial^M}m. \qquad (117)$$

Proposition 10. The map $\chi_1:C_1\to\Gamma_1$ is a Lie algebra homomorphism, which factors through a Lie algebra isomorphism $\chi_1:\bar C_1\to\Gamma_1$, provided that A decomposes as in (23).

Proof. We have, by (116) and (117) that

$$\chi_1([a\otimes m,\,b\otimes n])=[a_{\partial_1^M}b]_\to\otimes\big(e^{x\partial^M}m\big)\big(e^{x\partial^M}n\big)+b\otimes e^{x\partial^M}\Big(\big(a_{\partial^M}n\big)_\to m\Big)-a\otimes e^{x\partial^M}\Big(\big(b_{\partial^M}m\big)_\to n\Big). \qquad (118)$$

Recalling formula (65) for the Lie bracket on $\tilde\Gamma_1$, we have

$$[\chi_1(a\otimes m),\chi_1(b\otimes n)]=[a_{\partial_{x_1}}b]\otimes\big(e^{x_1\partial^M}m\big)\big(e^{x\partial^M}n\big)\Big|_{x_1=x}+b\otimes\big\langle m(x_1),a_{\lambda_1}n(x)\big\rangle-a\otimes\big\langle n(x_1),b_{\lambda_1}m(x)\big\rangle. \qquad (119)$$

Clearly, the first term in the RHS of (118) is the same as the first term in the RHS of (119). Recalling the definition (62) of the pairing $\langle\,,\,\rangle$, and using the sesquilinearity of the λ-action of A on M, we have that the second term in the RHS of (118) is the same as the second term in the RHS of (119), and similarly for the third terms. The last statement follows from Proposition 7.



Proposition 11. The complex $(C^\bullet,d)$ has a $\Pi C_1$-structure $\varphi:\widehat{\Pi C_1}\to\mathrm{End}\,C^\bullet$, given by $\varphi(\partial_\eta)=d$, $\varphi(\eta x)=\iota_x$, $\varphi(x)=L_x=[d,\iota_x]$. Moreover, $(\bar C^\bullet,d)$ is a $\Pi C_1$-subcomplex.

Proof. Due to Remark 5 and Proposition 8, we only need to check that, for $x,y\in\Pi C_1$, we have

$$[L_x,\iota_y]=\iota_{[x,y]}. \qquad (120)$$

This follows from a long but straightforward computation, using the explicit formulas (13) and (91) for the differential and the contraction operators. It is left to the reader. Notice though that, in the special case when A decomposes as in (23), Eq. (120) is a corollary of Proposition 5, Theorem 2 and Theorem 4 for h = 1. Indeed, due to these results, it suffices to check that both sides of (120) coincide when acting on $C^1=\mathrm{Hom}_{F[\partial]}(A,M)$. In the latter case, using Eqs. (12), (14), (90), (91), (92) and (116), we have, for $c\in C^1$,

$$L_{a\otimes m}(\iota_{b\otimes n}c)=\textstyle\int c(b)\big(a_{\partial^M}n\big)_\to m+\int n\big(a_{\partial^M}c(b)\big)_\to m,$$
$$\iota_{b\otimes n}(L_{a\otimes m}c)=\textstyle\int n\big(a_{\partial^M}c(b)\big)_\to m+\int c(a)\big(b_{\partial^M}m\big)_\to n-\int n\,c\big([a_{\partial^M}b]\big)_\to m,$$
$$\iota_{[a\otimes m,\,b\otimes n]}c=\textstyle\int n\,c\big([a_{\partial^M}b]\big)_\to m+\int c(b)\big(a_{\partial^M}n\big)_\to m-\int c(a)\big(b_{\partial^M}m\big)_\to n.$$

It follows that (120) holds when applied to elements of $C^1$.

The above results imply the following Theorem 5. The maps ψ • : Γ • → C¯ • ⊂ C • and χ1 : C1 → Γ1 define a homomorphism of g-complexes. Provided that A decomposes as in (23), we obtain an isomorphism ∼ of ΠC1 Π Γ1 -complexes ψ • : Γ • → C¯ • . Proof. It follows from Theorem 2, Proposition 7, Theorem 4 and Proposition 10.

4.9. Pairings between 1-chains and 1-cochains. Recall that Γ0 = M. Hence, the contraction operators of 1-chains, restricted to the space of 1-cochains, define a natural pairing Γ1 × Γ1 → M, which, to ξ ∈ Γ1 and γ ∈ Γ1 , associates γ = φ µ ( γλ (a)) ∈ M, ιξ

(121)

where a ⊗ φ ∈ A ⊗ Hom(F[λ], M) is a representative of ξ . When we consider the reduced spaces, we have Γ 0 = M/∂ M, and the above map induces a natural pairing Γ1 × Γ 1 → M/∂ M, which, to ξ ∈ Γ1 and γ ∈ Γ 1 , associates γλ (a)) ∈ M/∂ M, (122) ιξ γ = φ µ ( where again a ⊗ φ ∈ A ⊗ Hom(F[λ], M) is a representative of ξ , and γ ∈ Γ1 is a representative of γ . A similar pairing can be defined for 1-chains in C1 and 1-cochains in C 1 . Recall that C 0 = M/∂ M, C 1 is the space of F[∂]-module homomorphisms c : A → M, and C1 = A ⊗ M/∂(A ⊗ M). The corresponding pairing C1 × C 1 → M/∂ M, is obtained as follows. To x ∈ C1 and c ∈ C 1 , we associate, recalling (85), (123) ιx (c) = m · c(a) ∈ M/∂ M, where a ⊗ m ∈ A ⊗ M is a representative of x. Recalling Theorems 2 and 4, the above pairings (122) and (123) are compatible in the sense that ιx (c) = ιξ (γ ), provided that γ ∈ Γ 1 and c ∈ C 1 are related by c = ψ 1 (γ ), and ξ ∈ Γ1 and x ∈ C1 are related by ξ = χ1 (x).



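4.10. Contraction by a 1-chain as an odd derivation of $\tilde\Gamma^\bullet$. Recall that, if the A-module M has a commutative associative product, and $\partial^M$ and $a^M_\lambda$ are even derivations of it, then the basic cohomology complex $\tilde\Gamma^\bullet$ is a $\mathbb{Z}$-graded commutative associative superalgebra with respect to the exterior product (35), and the differential δ is an odd derivation of degree +1.

Proposition 12. The contraction operator $\iota_{\tilde\xi}$, associated to a 1-chain $\tilde\xi \in \tilde\Gamma_1$, is an odd derivation of the superalgebra $\tilde\Gamma^\bullet$ of degree −1.

Proof. Let $a_1 \otimes \phi$, with $a_1 \in A$ and $\phi \in \mathrm{Hom}(\mathbb{F}[\lambda_1], M)$, be a representative of $\tilde\xi \in \tilde\Gamma_1$. By the definition (35) of the exterior product, we have
\[
\big(\iota_{\tilde\xi}(\tilde\alpha\wedge\tilde\beta)\big)_{\lambda_2,\dots,\lambda_{h+k}}(a_2,\dots,a_{h+k})
= \phi_\mu\Big( \sum_{\sigma\in S_{h+k}} \frac{\mathrm{sign}(\sigma)}{h!\,k!}\,
\tilde\alpha_{\lambda_{\sigma(1)},\dots,\lambda_{\sigma(h)}}(a_{\sigma(1)},\dots,a_{\sigma(h)})\,
\tilde\beta_{\lambda_{\sigma(h+1)},\dots,\lambda_{\sigma(h+k)}}(a_{\sigma(h+1)},\dots,a_{\sigma(h+k)}) \Big). \tag{124}
\]
By the skew-symmetry condition A2 for $\tilde\alpha$ and $\tilde\beta$, we can rewrite the RHS of (124) as
\[
\begin{aligned}
&\sum_{i=1}^{h}\ \sum_{\sigma \,|\, \sigma(i)=1} \frac{\mathrm{sign}(\sigma)}{h!\,k!}\,(-1)^{i+1}\,
\phi_\mu\Big(\tilde\alpha_{\lambda_1,\lambda_{\sigma(1)},\overset{i}{\check{\cdots}},\lambda_{\sigma(h)}}(a_1, a_{\sigma(1)},\overset{i}{\check{\cdots}},a_{\sigma(h)})\Big)\,
\tilde\beta_{\lambda_{\sigma(h+1)},\dots,\lambda_{\sigma(h+k)}}(a_{\sigma(h+1)},\dots,a_{\sigma(h+k)}) \\
&\quad + \sum_{i=h+1}^{h+k}\ \sum_{\sigma \,|\, \sigma(i)=1} \frac{\mathrm{sign}(\sigma)}{h!\,k!}\,(-1)^{i-h+1}\,
\tilde\alpha_{\lambda_{\sigma(1)},\dots,\lambda_{\sigma(h)}}(a_{\sigma(1)},\dots,a_{\sigma(h)})\,
\phi_\mu\Big(\tilde\beta_{\lambda_1,\lambda_{\sigma(h+1)},\overset{i}{\check{\cdots}},\lambda_{\sigma(h+k)}}(a_1,a_{\sigma(h+1)},\overset{i}{\check{\cdots}},a_{\sigma(h+k)})\Big). 
\end{aligned} \tag{125}
\]
The set of all permutations $\sigma \in S_{h+k}$ such that $\sigma(i) = 1$ is naturally in bijection with the set of all permutations τ of $\{2,\dots,h+k\}$, and the correspondence between the signs is $\mathrm{sign}(\tau) = (-1)^{i+1}\,\mathrm{sign}(\sigma)$. Hence, (125) can be rewritten as
\[
\begin{aligned}
&\sum_{\tau} \frac{\mathrm{sign}(\tau)}{h!\,k!}\Big(
h\,(\iota_{\tilde\xi}\tilde\alpha)_{\lambda_{\tau(2)},\dots,\lambda_{\tau(h)}}(a_{\tau(2)},\dots,a_{\tau(h)})\,
\tilde\beta_{\lambda_{\tau(h+1)},\dots,\lambda_{\tau(h+k)}}(a_{\tau(h+1)},\dots,a_{\tau(h+k)}) \\
&\qquad\qquad + k\,(-1)^h\,\tilde\alpha_{\lambda_{\tau(2)},\dots,\lambda_{\tau(h+1)}}(a_{\tau(2)},\dots,a_{\tau(h+1)})\,
(\iota_{\tilde\xi}\tilde\beta)_{\lambda_{\tau(h+2)},\dots,\lambda_{\tau(h+k)}}(a_{\tau(h+2)},\dots,a_{\tau(h+k)}) \Big) \\
&= \big(\iota_{\tilde\xi}(\tilde\alpha)\wedge\tilde\beta\big)_{\lambda_2,\dots,\lambda_{h+k}}(a_2,\dots,a_{h+k})
+ (-1)^h\big(\tilde\alpha\wedge\iota_{\tilde\xi}(\tilde\beta)\big)_{\lambda_2,\dots,\lambda_{h+k}}(a_2,\dots,a_{h+k}).
\end{aligned}
\]

Remark 7. One can show that the g-structure of all our complexes $\tilde\Gamma^\bullet$, $\Gamma^\bullet$ and $C^\bullet$ can be extended to a structure of a calculus algebra, as defined in [DTT]. Namely, one can extend the Lie algebra bracket from the space of 1-chains to the whole space of chains (with reverse parity), and define there a commutative superalgebra structure, which extends our g-structure and satisfies all the properties of a calculus algebra.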



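5. The Complex of Variational Calculus as a Lie Conformal Algebra Cohomology Complex

5.1. Algebras of differential functions. An algebra of differential functions $\mathcal{V}$ in the variables $u_i$, indexed by a finite set $I = \{1,\dots,\ell\}$, is, by definition, a differential algebra (i.e. a unital commutative associative algebra with a derivation ∂), endowed with commuting derivations $\frac{\partial}{\partial u_i^{(n)}} : \mathcal{V} \to \mathcal{V}$, for all $i \in I$ and $n \in \mathbb{Z}_+$, such that, given $f \in \mathcal{V}$, $\frac{\partial f}{\partial u_i^{(n)}} = 0$ for all but finitely many $i \in I$ and $n \in \mathbb{Z}_+$, and the following commutation rules with ∂ hold:
\[
\Big[\frac{\partial}{\partial u_i^{(n)}},\, \partial\Big] = \frac{\partial}{\partial u_i^{(n-1)}}, \tag{126}
\]
where the RHS is considered to be zero if n = 0. As in the previous sections, we denote by $f \mapsto \int f$ the canonical quotient map $\mathcal{V} \to \mathcal{V}/\partial\mathcal{V}$. Denote by $\mathcal{C} \subset \mathcal{V}$ the subspace of constant functions, i.e.
\[
\mathcal{C} = \Big\{ f \in \mathcal{V} \ \Big|\ \frac{\partial f}{\partial u_i^{(n)}} = 0 \ \ \forall\, i \in I,\ n \in \mathbb{Z}_+ \Big\}. \tag{127}
\]
It follows from (126) by downward induction that
\[
\mathrm{Ker}(\partial) \subset \mathcal{C}. \tag{128}
\]
Also, clearly, $\partial\mathcal{C} \subset \mathcal{C}$. Typical examples of algebras of differential functions are: the ring of polynomials
\[
R_\ell = \mathbb{F}[u_i^{(n)} \mid i \in I,\ n \in \mathbb{Z}_+], \tag{129}
\]
where $\partial(u_i^{(n)}) = u_i^{(n+1)}$, any localization of it by some multiplicative subset $S \subset R_\ell$, such as the whole field of fractions $Q_\ell = \mathbb{F}(u_i^{(n)} \mid i \in I,\ n \in \mathbb{Z}_+)$, or any algebraic extension of the algebra $R_\ell$ or of the field $Q_\ell$ obtained by adding a solution of a certain polynomial equation. In all these examples the action of $\partial : \mathcal{V} \to \mathcal{V}$ is given by $\partial = \sum_{i\in I,\, n\in\mathbb{Z}_+} u_i^{(n+1)} \frac{\partial}{\partial u_i^{(n)}}$. Another example of an algebra of differential functions is the ring $R_\ell[x] = \mathbb{F}[x, u_i^{(n)} \mid i \in I,\ n \in \mathbb{Z}_+]$, where $\partial = \frac{\partial}{\partial x} + \sum_{i\in I,\, n\in\mathbb{Z}_+} u_i^{(n+1)} \frac{\partial}{\partial u_i^{(n)}}$.

The variational derivative $\frac{\delta}{\delta u} : \mathcal{V} \to \mathcal{V}^{\oplus\ell}$ is defined by
\[
\frac{\delta f}{\delta u_i} := \sum_{n\in\mathbb{Z}_+} (-\partial)^n \frac{\partial f}{\partial u_i^{(n)}}. \tag{130}
\]
It follows immediately from (126) that $\frac{\delta}{\delta u_i}(\partial f) = 0$ for every $i \in I$ and $f \in \mathcal{V}$, namely,
\[
\partial\mathcal{V} \subset \mathrm{Ker}\,\frac{\delta}{\delta u}. \tag{131}
\]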
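As a quick sanity check of (130) and (131), the following worked computation treats the simplest case $\ell = 1$, $\mathcal{V} = R_1$, writing $u' = u^{(1)}$ and $u'' = u^{(2)}$. The two sample functions are chosen here purely for illustration and are not taken from the paper.

```latex
% Example A: f = \tfrac12 (u')^2.
% Only \partial f / \partial u' = u' is nonzero, so (130) gives
\frac{\delta f}{\delta u} = (-\partial)\,u' = -u'' .

% Example B: a total derivative, f = \partial(u u') = (u')^2 + u u''.
% Here \partial f/\partial u = u'', \ \partial f/\partial u' = 2u', \ \partial f/\partial u'' = u, hence
\frac{\delta f}{\delta u}
  = u'' - \partial(2u') + \partial^2(u)
  = u'' - 2u'' + u'' = 0 ,
% in agreement with (131).
```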



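A vector field is, by definition, a derivation of $\mathcal{V}$ of the form
\[
X = \sum_{i\in I,\, n\in\mathbb{Z}_+} P_{i,n}\, \frac{\partial}{\partial u_i^{(n)}}, \qquad P_{i,n} \in \mathcal{V}. \tag{132}
\]
We let $\mathfrak{g}$ be the Lie algebra of all vector fields. The subalgebra of evolutionary vector fields is $\mathfrak{g}^\partial \subset \mathfrak{g}$, consisting of the vector fields commuting with ∂. By (126), a vector field X is evolutionary if and only if it has the form
\[
X_P = \sum_{i\in I,\, n\in\mathbb{Z}_+} (\partial^n P_i)\, \frac{\partial}{\partial u_i^{(n)}}, \qquad \text{where } P = (P_i)_{i\in I} \in \mathcal{V}^\ell. \tag{133}
\]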
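The step "By (126)" can be spelled out as follows. This is a sketch of the computation rather than the authors' own wording; the converse direction additionally assumes that the derivations $\partial/\partial u_i^{(n)}$ are linearly independent over $\mathcal{V}$, which holds for instance when $\mathcal{V}$ contains the variables $u_i^{(n)}$ themselves.

```latex
% For X as in (132), the Leibniz rule for \partial and the commutation rule (126) give
[X,\partial]
  = \sum_{i,n} P_{i,n}\Big[\frac{\partial}{\partial u_i^{(n)}},\,\partial\Big]
    - \sum_{i,n} (\partial P_{i,n})\,\frac{\partial}{\partial u_i^{(n)}}
  = \sum_{i\in I,\; n\in\mathbb{Z}_+} \big(P_{i,n+1} - \partial P_{i,n}\big)\,
      \frac{\partial}{\partial u_i^{(n)}} .
% Hence [X,\partial] = 0 as soon as P_{i,n+1} = \partial P_{i,n} for all i and n,
% i.e. P_{i,n} = \partial^n P_{i,0}, which is the form (133) with P_i = P_{i,0}.
```

For example, for $\ell = 1$ and $P = u'$ in $R_1$, the field $X_{u'} = \sum_n u^{(n+1)}\,\partial/\partial u^{(n)}$ is ∂ itself.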

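5.2. Normal algebras of differential functions. Let $\mathcal{V}$ be an algebra of differential functions in the variables $u_i$, $i \in I = \{1,\dots,\ell\}$. For $i \in I$ and $n \in \mathbb{Z}_+$ we let
\[
\mathcal{V}_{n,i} := \Big\{ f \in \mathcal{V} \ \Big|\ \frac{\partial f}{\partial u_j^{(m)}} = 0 \ \text{ if } (m,j) > (n,i) \text{ in lexicographic order} \Big\}. \tag{134}
\]
We also let $\mathcal{V}_{n,0} = \mathcal{V}_{n-1,\ell}$. A natural assumption on $\mathcal{V}$ is to contain elements $u_i^{(n)}$, for $i \in I$, $n \in \mathbb{Z}_+$, such that
\[
\frac{\partial u_i^{(n)}}{\partial u_j^{(m)}} = \delta_{ij}\,\delta_{mn}. \tag{135}
\]
Clearly, such elements are uniquely defined up to adding constant functions. Moreover, choosing these constants appropriately, we can assume that $\partial u_i^{(n)} = u_i^{(n+1)}$. Thus, under this assumption $\mathcal{V}$ is an algebra of differential functions extension of the algebra $R_\ell$ in (129).

Lemma 10. Let $\mathcal{V}$ be an algebra of differential functions extension of the algebra $R_\ell$. Then:
(a) We have $\partial = \partial_R + \partial'$, where
\[
\partial_R = \sum_{i\in I,\, n\in\mathbb{Z}_+} u_i^{(n+1)}\, \frac{\partial}{\partial u_i^{(n)}}, \tag{136}
\]
and $\partial'$ is a derivation of $\mathcal{V}$ which commutes with all $\frac{\partial}{\partial u_i^{(n)}}$ and which vanishes on $R_\ell \subset \mathcal{V}$. In particular, $\partial'\,\mathcal{V}_{n,i} \subset \mathcal{V}_{n,i}$.
(b) If $f \in \mathcal{V}_{n,i} \setminus \mathcal{V}_{n,i-1}$, then $\partial f \in \mathcal{V}_{n+1,i} \setminus \mathcal{V}_{n+1,i-1}$, and it has the form
\[
\partial f = \sum_{j\le i} h_j\, u_j^{(n+1)} + r, \tag{137}
\]
where $h_j \in \mathcal{V}_{n,i}$ for all $j \le i$, $r \in \mathcal{V}_{n,i}$, and $h_i \neq 0$.
(c) For $f \in \mathcal{V}$, $\int f g = 0$ for every $g \in \mathcal{V}$ if and only if f = 0.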


Proof. Part (a) is clear. By part (a), we have that ∂f is as in (137), where $h_j = \frac{\partial f}{\partial u_j^{(n)}} \in \mathcal{V}_{n,i}$, and $r = \sum_{j\in I,\, m\le n} u_j^{(m)} \frac{\partial f}{\partial u_j^{(m-1)}} + \partial' f \in \mathcal{V}_{n,i}$. We are left to prove part (c). Suppose $f \neq 0$ is such that $\int f g = 0$ for every $g \in \mathcal{V}$. By taking g = 1, we have that $f \in \partial\mathcal{V}$. Hence f has the form (137) for some $i \in I$ and $n \in \mathbb{Z}_+$. But then $u_i^{(n+1)} f$ does not have this form, so that $\int u_i^{(n+1)} f \neq 0$.

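Definition 2. The algebra of differential functions $\mathcal{V}$ is called normal if we have $\frac{\partial}{\partial u_i^{(n)}}\big(\mathcal{V}_{n,i}\big) = \mathcal{V}_{n,i}$ for all $i \in I$, $n \in \mathbb{Z}_+$. Given $f \in \mathcal{V}_{n,i}$, we denote by $\int du_i^{(n)} f \in \mathcal{V}_{n,i}$ a preimage of f under the map $\frac{\partial}{\partial u_i^{(n)}}$. This integral is defined up to adding elements from $\mathcal{V}_{n,i-1}$.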

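Proposition 13. Any normal algebra of differential functions $\mathcal{V}$ is an extension of $R_\ell$.

Proof. As pointed out above, we need to find elements $u_i^{(n)} \in \mathcal{V}$, for $i \in I$, $n \in \mathbb{Z}_+$, such that (135) holds. By the normality assumption, there exists $v_i^n \in \mathcal{V}_{n,i}$ such that $\frac{\partial v_i^n}{\partial u_i^{(n)}} = 1$. Note that $\frac{\partial}{\partial u_i^{(n)}} \frac{\partial v_i^n}{\partial u_{i-1}^{(n)}} = \frac{\partial 1}{\partial u_{i-1}^{(n)}} = 0$, hence $\frac{\partial v_i^n}{\partial u_{i-1}^{(n)}} \in \mathcal{V}_{n,i-1}$. If we then replace $v_i^n$ by $w_i^n = v_i^n - \int du_{i-1}^{(n)} \frac{\partial v_i^n}{\partial u_{i-1}^{(n)}}$, we have that $\frac{\partial w_i^n}{\partial u_i^{(n)}} = 1$ and $\frac{\partial w_i^n}{\partial u_{i-1}^{(n)}} = 0$. Proceeding by downward induction, we obtain the desired element $u_i^{(n)}$.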

Clearly, the algebra $R_\ell$ is normal. Moreover, any extension $\mathcal{V}$ of $R_\ell$ can be further extended to a normal algebra, by adding missing integrals. For example, the localization of $R_1 = \mathbb{F}[u^{(n)} \mid n \in \mathbb{Z}_+]$ by u is not a normal algebra, since it doesn't contain $\int \frac{du}{u}$. Note that any differential algebra (A, ∂) can be viewed as a trivial algebra of differential functions with $\frac{\partial}{\partial u_i^{(n)}} = 0$. Such an algebra does not contain $R_\ell$, hence it is not normal.

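5.3. The complex of variational calculus. Let $\mathcal{V}$ be an algebra of differential functions. The basic de Rham complex $\tilde\Omega^\bullet = \tilde\Omega^\bullet(\mathcal{V})$ is defined as the free commutative superalgebra over $\mathcal{V}$ with odd generators $\delta u_i^{(n)}$, $i \in I$, $n \in \mathbb{Z}_+$. In other words $\tilde\Omega^\bullet$ consists of finite sums of the form
\[
\tilde\omega = \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+} f^{m_1\cdots m_k}_{i_1\cdots i_k}\, \delta u_{i_1}^{(m_1)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)}, \qquad f^{m_1\cdots m_k}_{i_1\cdots i_k} \in \mathcal{V}, \tag{138}
\]
and it has a (super)commutative product given by the wedge product ∧. We have a natural $\mathbb{Z}_+$-grading $\tilde\Omega^\bullet = \bigoplus_{k\in\mathbb{Z}_+} \tilde\Omega^k$ defined by saying that elements in $\mathcal{V}$ have degree 0, while the generators $\delta u_i^{(n)}$ have degree 1. Hence $\tilde\Omega^k$ is a free module over $\mathcal{V}$ with basis given by the elements $\delta u_{i_1}^{(m_1)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)}$, with $(m_1, i_1) > \cdots > (m_k, i_k)$ (with respect to the lexicographic order). In particular $\tilde\Omega^0 = \mathcal{V}$ and $\tilde\Omega^1 = \bigoplus_{i\in I,\, n\in\mathbb{Z}_+} \mathcal{V}\,\delta u_i^{(n)}$. Notice that there is a natural $\mathcal{V}$-linear pairing $\tilde\Omega^1 \times \mathfrak{g} \to \mathcal{V}$ defined on generators by $\big(\delta u_i^{(m)}, \frac{\partial}{\partial u_j^{(n)}}\big) = \delta_{i,j}\,\delta_{m,n}$, and extended to $\tilde\Omega^1 \times \mathfrak{g}$ by $\mathcal{V}$-bilinearity.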



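We let δ be an odd derivation of degree 1 of the complex $\tilde\Omega^\bullet$, such that $\delta f = \sum_{i\in I,\, n\in\mathbb{Z}_+} \frac{\partial f}{\partial u_i^{(n)}}\, \delta u_i^{(n)}$ for $f \in \mathcal{V}$, and $\delta(\delta u_i^{(n)}) = 0$. It is immediate to check that $\delta^2 = 0$ and that, for $\tilde\omega \in \tilde\Omega^k$ as in (138), we have
\[
\delta(\tilde\omega) = \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+}\ \sum_{j\in I,\, n\in\mathbb{Z}_+} \frac{\partial f^{m_1\cdots m_k}_{i_1\cdots i_k}}{\partial u_j^{(n)}}\, \delta u_j^{(n)} \wedge \delta u_{i_1}^{(m_1)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)}. \tag{139}
\]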

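For $X \in \mathfrak{g}$ we define the contraction operator $\iota_X : \tilde\Omega^\bullet \to \tilde\Omega^\bullet$, as an odd derivation of $\tilde\Omega^\bullet$ of degree −1, such that $\iota_X(f) = 0$ for $f \in \mathcal{V}$, and $\iota_X(\delta u_i^{(n)}) = X(u_i^{(n)})$. If $X \in \mathfrak{g}$ is as in (132) and $\tilde\omega \in \tilde\Omega^k$ is as in (138), we have
\[
\iota_X(\tilde\omega) = \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+}\ \sum_{q=1}^{k} (-1)^{q+1} f^{m_1\cdots m_k}_{i_1\cdots i_k}\, P_{i_q,m_q}\, \delta u_{i_1}^{(m_1)} \wedge \overset{q}{\check{\cdots}} \wedge \delta u_{i_k}^{(m_k)}. \tag{140}
\]
In particular, for $f \in \mathcal{V}$ we have
\[
\iota_X(\delta f) = X(f). \tag{141}
\]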
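To see (139) and (141) at work, here is a small worked example for $\ell = 1$ in $R_1$, with $f = u u'$ and an evolutionary vector field $X_P$ as in (133). The choice of f is illustrative only.

```latex
% Differential of f = u u':
\delta f = u'\,\delta u + u\,\delta u' ,
\qquad
\delta^2 f = \delta u' \wedge \delta u + \delta u \wedge \delta u' = 0 .

% Contraction with X_P = \sum_n (\partial^n P)\,\partial/\partial u^{(n)},
% using \iota_{X_P}(\delta u) = P and \iota_{X_P}(\delta u') = \partial P:
\iota_{X_P}(\delta f) = u'\,P + u\,\partial P = X_P(f) .
```

Note that $u'P + u\,\partial P = \partial(uP)$, so the class of $X_P(f)$ in $\mathcal{V}/\partial\mathcal{V}$ vanishes, consistently with $f = \partial(u^2/2)$ being a total derivative.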

It is easy to check that the operators $\iota_X$, $X \in \mathfrak{g}$, form an abelian (purely odd) subalgebra of the Lie superalgebra $\mathrm{Der}\,\tilde\Omega^\bullet$, namely
\[
[\iota_X, \iota_Y] = \iota_X \circ \iota_Y + \iota_Y \circ \iota_X = 0. \tag{142}
\]
The Lie derivative $L_X$ along $X \in \mathfrak{g}$ is defined as a degree 0 derivation of the superalgebra $\tilde\Omega^\bullet$, commuting with δ, and such that
\[
L_X(f) = X(f) \quad \text{for } f \in \tilde\Omega^0. \tag{143}
\]
One can easily check (on generators) Cartan's formula (cf. (45)):
\[
L_X = [\delta, \iota_X] = \delta \circ \iota_X + \iota_X \circ \delta. \tag{144}
\]
We next prove the following:
\[
[\iota_X, L_Y] = \iota_X \circ L_Y - L_Y \circ \iota_X = \iota_{[X,Y]}. \tag{145}
\]
It is clear by degree considerations that both sides of (145) act as zero on $\tilde\Omega^0 = \mathcal{V}$. Moreover, it follows by (141) that $[\iota_X, L_Y](\delta f) = \iota_X \delta \iota_Y \delta f - \iota_Y \delta \iota_X \delta f = X(Y(f)) - Y(X(f)) = [X, Y](f) = \iota_{[X,Y]}(\delta f)$ for every $f \in \mathcal{V}$. Equation (145) then follows by the fact that both sides are even derivations of the wedge product in $\tilde\Omega$. Finally, as an immediate consequence of Eq. (145), we get that
\[
[L_X, L_Y] = L_X \circ L_Y - L_Y \circ L_X = L_{[X,Y]}. \tag{146}
\]
Thus, $\tilde\Omega^\bullet$ is a $\mathfrak{g}$-complex, $\mathfrak{g}$ acting on $\tilde\Omega^\bullet$ by derivations. Note that the action of ∂ on $\mathcal{V}$ extends to a degree 0 derivation of $\tilde\Omega^\bullet$, such that
\[
\partial(\delta u_i^{(n)}) = \delta u_i^{(n+1)}, \qquad i \in I,\ n \in \mathbb{Z}_+. \tag{147}
\]


This derivation commutes with δ, hence we can consider the corresponding reduced de Rham complex $\Omega^\bullet = \Omega^\bullet(\mathcal{V})$, usually called the complex of variational calculus:
\[
\Omega^\bullet = \bigoplus_{k\in\mathbb{Z}_+} \Omega^k, \qquad \Omega^k = \tilde\Omega^k/\partial\tilde\Omega^k,
\]
with the induced action of δ. With an abuse of notation, we denote by δ and, for $X \in \mathfrak{g}^\partial$, by $\iota_X$, $L_X$, the maps induced on the quotient space $\Omega^k$ by the corresponding maps on $\tilde\Omega^k$. Obviously, $\Omega^\bullet$ is a $\mathfrak{g}^\partial$-complex.

5.4. Isomorphism of the cohomology $\mathfrak{g}^\partial$-complexes $\Omega^\bullet$ and $\Gamma^\bullet$.

Proposition 14. Let $\mathcal{V}$ be an algebra of differential functions. Consider the Lie conformal algebra $A = \bigoplus_{i\in I} \mathbb{F}[\partial]u_i$ with the zero λ-bracket. Then $\mathcal{V}$ is a module over the Lie conformal algebra A, with the λ-action given by
\[
u_{i\,\lambda} f = \sum_{n\in\mathbb{Z}_+} \lambda^n \frac{\partial f}{\partial u_i^{(n)}}. \tag{148}
\]
Moreover, the λ-action of A on $\mathcal{V}$ is by derivations of the associative product in $\mathcal{V}$.

Proof. The fact that $\mathcal{V}$ is an A-module follows from the definition of an algebra of differential functions. The second statement is clear as well.

Let $\tilde\Gamma^\bullet = \tilde\Gamma^\bullet(A, \mathcal{V})$ and $\Gamma^\bullet = \Gamma^\bullet(A, \mathcal{V})$ be the basic and reduced Lie conformal algebra cohomology complexes for the A-module $\mathcal{V}$, defined in Proposition 14. Thus, to every algebra of differential functions $\mathcal{V}$ we can associate two apparently unrelated types of cohomology complexes: the basic and reduced de Rham cohomology complexes, $\tilde\Omega^\bullet(\mathcal{V})$ and $\Omega^\bullet(\mathcal{V})$, defined in Sect. 5.3, and the basic and reduced Lie conformal algebra cohomology complexes $\tilde\Gamma^\bullet(A, \mathcal{V})$ and $\Gamma^\bullet(A, \mathcal{V})$, defined in Sect. 2.1, for the Lie conformal algebra $A = \bigoplus_{i\in I} \mathbb{F}[\partial]u_i$, with the zero λ-bracket, acting on $\mathcal{V}$, with the λ-action given by (148). We are going to prove that, in fact, these complexes are isomorphic, and all the related structures (such as exterior products, contraction operators, Lie derivatives, ...) correspond via this isomorphism.

We denote, as in Sect. 4.2, by $\tilde\Gamma_\bullet = \tilde\Gamma_\bullet(A, \mathcal{V})$ (resp. $\Gamma_\bullet = \Gamma_\bullet(A, \mathcal{V})$) the basic (resp. reduced) space of chains of A with coefficients in $\mathcal{V}$. Recall from Sect. 4.4 that $\Pi\tilde\Gamma_1$ is identified with the space $(A \otimes \mathcal{V}[[x]]) \big/ (\partial \otimes 1 + 1 \otimes \partial_x)(A \otimes \mathcal{V}[[x]])$, and it carries a Lie algebra structure given by the Lie bracket (65), which in this case takes the form, for $i, j \in I$ and $P(x) = \sum_{m\in\mathbb{Z}_+} \frac{1}{m!} P_m x^m$, $Q(x) = \sum_{n\in\mathbb{Z}_+} \frac{1}{n!} Q_n x^n \in \mathcal{V}[[x]]$:
\[
[u_i \otimes P(x),\, u_j \otimes Q(x)] = -\,u_i \otimes \sum_{n\in\mathbb{Z}_+} Q_n \frac{\partial P(x)}{\partial u_j^{(n)}} + u_j \otimes \sum_{m\in\mathbb{Z}_+} P_m \frac{\partial Q(x)}{\partial u_i^{(m)}}. \tag{149}
\]
Moreover, ∂ acts on $\tilde\Gamma_1$ by (59). Its kernel $\Pi\Gamma_1$ consists of elements of the form
\[
\sum_{i\in I} u_i \otimes e^{x\partial} P_i, \qquad \text{where } P_i \in \mathcal{V}, \tag{150}
\]
and it is a Lie subalgebra of $\Pi\tilde\Gamma_1$. We also denote, as in Sect. 5.1, by $\mathfrak{g}$ the Lie algebra of all vector fields (132) acting on $\mathcal{V}$, and by $\mathfrak{g}^\partial \subset \mathfrak{g}$ the Lie subalgebra of evolutionary vector fields (133).



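Proposition 15. The map $\Phi_1 : \Pi\tilde\Gamma_1 \to \mathfrak{g}$, which maps
\[
\tilde\xi = \sum_{i\in I} u_i \otimes P_i(x) = \sum_{i\in I,\, n\in\mathbb{Z}_+} \frac{1}{n!}\, u_i \otimes P_{i,n}\, x^n \in \tilde\Gamma_1, \tag{151}
\]
to
\[
\Phi_1(\tilde\xi) = \sum_{i\in I,\, n\in\mathbb{Z}_+} P_{i,n}\, \frac{\partial}{\partial u_i^{(n)}}, \tag{152}
\]
is a Lie algebra isomorphism. Moreover, the image of the space of reduced 1-chains via $\Phi_1$ is the space of evolutionary vector fields. Hence we have the induced Lie algebra isomorphism $\Phi_1 : \Pi\Gamma_1 \xrightarrow{\sim} \mathfrak{g}^\partial$.

Proof. Clearly, $\Phi_1$ is a bijective map, and, by (150), $\Phi_1(\Gamma_1) = \mathfrak{g}^\partial$. Hence we only need to check that $\Phi_1$ is a Lie algebra homomorphism. This is immediate from Eq. (149).

Theorem 6. The map $\Phi^\bullet : \tilde\Gamma^\bullet \to \tilde\Omega^\bullet$, such that $\Phi^0 = \mathbb{1}|_{\mathcal{V}}$ and, for $k \ge 1$, $\Phi^k : \tilde\Gamma^k \to \tilde\Omega^k$ is given by
\[
\Phi^k(\tilde\gamma) = \frac{1}{k!} \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+} f^{m_1\cdots m_k}_{i_1\cdots i_k}\, \delta u_{i_1}^{(m_1)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)}, \tag{153}
\]
where $f^{m_1\cdots m_k}_{i_1\cdots i_k} \in \mathcal{V}$ is the coefficient of $\lambda_1^{m_1}\cdots\lambda_k^{m_k}$ in $\tilde\gamma_{\lambda_1,\dots,\lambda_k}(u_{i_1},\dots,u_{i_k})$, is an isomorphism of superalgebras, and an isomorphism of $\mathfrak{g}$-complexes (once we identify the Lie algebras $\mathfrak{g}$ and $\Pi\tilde\Gamma_1$ via $\Phi_1$, as in Proposition 15). Moreover, $\Phi^\bullet$ commutes with the action of ∂, hence it induces an isomorphism of the corresponding reduced $\mathfrak{g}^\partial$-complexes: $\Phi^\bullet : \Gamma^\bullet \xrightarrow{\sim} \Omega^\bullet$.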

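Proof. Since I is a finite index set, the RHS of (153) is a finite sum, so that $\Phi^k(\tilde\Gamma^k) \subset \tilde\Omega^k$. By the sesquilinearity and skew-symmetry Conditions A1 and A2 in Sect. 2.1, elements $\tilde\gamma \in \tilde\Gamma^k$ are uniquely determined by the collection of polynomials $\tilde\gamma_{\lambda_1,\dots,\lambda_k}(u_{i_1},\dots,u_{i_k}) = \sum_{m_r\in\mathbb{Z}_+} f^{m_1\cdots m_k}_{i_1\cdots i_k}\, \lambda_1^{m_1}\cdots\lambda_k^{m_k}$, which are skew-symmetric with respect to simultaneous permutation of the variables $\lambda_r$ and the indices $i_r$. We want to check that $\Phi^k$ is a bijective linear map from $\tilde\Gamma^k$ to $\tilde\Omega^k$. In fact, denote by $\Psi^k : \tilde\Omega^k \to \tilde\Gamma^k$ the linear map which to $\tilde\omega$ as in (138) associates the k-cochain $\Psi^k(\tilde\omega)$, such that
\[
\Psi^k(\tilde\omega)_{\lambda_1,\dots,\lambda_k}(u_{i_1},\dots,u_{i_k}) = \sum_{m_r\in\mathbb{Z}_+} \langle f\rangle^{m_1\cdots m_k}_{i_1\cdots i_k}\, \lambda_1^{m_1}\cdots\lambda_k^{m_k},
\]
where $\langle f\rangle$ denotes the skew-symmetrization of f:
\[
\langle f\rangle^{m_1\cdots m_k}_{i_1\cdots i_k} = \sum_{\sigma} \mathrm{sign}(\sigma)\, f^{m_{\sigma(1)}\cdots m_{\sigma(k)}}_{i_{\sigma(1)}\cdots i_{\sigma(k)}},
\]
and $\Psi^k(\tilde\omega)$ is extended to $A^{\otimes k}$ by the sesquilinearity Condition A1. It is straightforward to check that $\Psi^k(\tilde\omega)$ is indeed a k-cochain, and that the maps $\Phi^k$ and $\Psi^k$ are inverse to each other. This proves that $\Phi^\bullet$ is a bijective map.

Next, let us prove that $\Phi^\bullet$ is an associative superalgebra homomorphism. Let $\tilde\alpha \in \tilde\Gamma^h$, $\tilde\beta \in \tilde\Gamma^k$, let $\alpha^{m_1,\dots,m_h}_{i_1,\dots,i_h}$ be the coefficient of $\lambda_1^{m_1}\cdots\lambda_h^{m_h}$ in the polynomial $\tilde\alpha_{\lambda_1,\dots,\lambda_h}(u_{i_1},\dots,u_{i_h})$, and let $\beta^{n_1,\dots,n_k}_{j_1,\dots,j_k}$ be the coefficient of $\lambda_1^{n_1}\cdots\lambda_k^{n_k}$ in $\tilde\beta_{\lambda_1,\dots,\lambda_k}(u_{j_1},\dots,u_{j_k})$. By (35), the coefficient of $\lambda_1^{m_1}\cdots\lambda_{h+k}^{m_{h+k}}$ in $(\tilde\alpha\wedge\tilde\beta)_{\lambda_1,\dots,\lambda_{h+k}}(u_{i_1},\dots,u_{i_{h+k}})$ is
\[
\sum_{\sigma\in S_{h+k}} \frac{\mathrm{sign}(\sigma)}{h!\,k!}\, \alpha^{m_{\sigma(1)},\dots,m_{\sigma(h)}}_{i_{\sigma(1)},\dots,i_{\sigma(h)}}\, \beta^{m_{\sigma(h+1)},\dots,m_{\sigma(h+k)}}_{i_{\sigma(h+1)},\dots,i_{\sigma(h+k)}}.
\]
The identity $\Phi^{h+k}(\tilde\alpha\wedge\tilde\beta) = \Phi^h(\tilde\alpha) \wedge \Phi^k(\tilde\beta)$ follows by the definition (153) of $\Phi^k$.

Let $\tilde\gamma \in \tilde\Gamma^k$, and denote by $f^{m_1\cdots m_k}_{i_1\cdots i_k} \in \mathcal{V}$ the coefficient of $\lambda_1^{m_1}\cdots\lambda_k^{m_k}$ in $\tilde\gamma_{\lambda_1,\dots,\lambda_k}(u_{i_1},\dots,u_{i_k})$. We want to prove that $\Phi^{k+1}(\delta\tilde\gamma) = \delta\Phi^k(\tilde\gamma)$. By assumption, the λ-bracket on A is zero, and the λ-action of A on $\mathcal{V}$ is given by (148). Hence, recalling (10), the coefficient of $\lambda_1^{m_1}\cdots\lambda_{k+1}^{m_{k+1}}$ in the polynomial $(\delta\tilde\gamma)_{\lambda_1,\dots,\lambda_{k+1}}(u_{i_1},\dots,u_{i_{k+1}})$ is
\[
\sum_{r=1}^{k+1} (-1)^{r+1}\, \frac{\partial f^{m_1 \overset{r}{\check{\cdots}} m_{k+1}}_{i_1 \overset{r}{\check{\cdots}} i_{k+1}}}{\partial u_{i_r}^{(m_r)}}.
\]
It follows that
\[
\begin{aligned}
\Phi^{k+1}(\delta\tilde\gamma) &= \frac{1}{(k+1)!} \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+}\ \sum_{q=1}^{k+1} (-1)^{q+1}\, \frac{\partial f^{m_1 \overset{q}{\check{\cdots}} m_{k+1}}_{i_1 \overset{q}{\check{\cdots}} i_{k+1}}}{\partial u_{i_q}^{(m_q)}}\, \delta u_{i_1}^{(m_1)} \wedge\cdots\wedge \delta u_{i_{k+1}}^{(m_{k+1})} \\
&= \frac{1}{k!} \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+} \frac{\partial f^{m_1\cdots m_k}_{i_1\cdots i_k}}{\partial u_{i_0}^{(m_0)}}\, \delta u_{i_0}^{(m_0)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)} = \delta\Phi^k(\tilde\gamma),
\end{aligned}
\]
thus proving the claim.

Similarly, the coefficient of $\lambda_1^{m_1}\cdots\lambda_k^{m_k}$ in $(\partial\tilde\gamma)_{\lambda_1,\dots,\lambda_k}(u_{i_1},\dots,u_{i_k})$ is $\partial^M f^{m_1\cdots m_k}_{i_1\cdots i_k} + \sum_{r=1}^{k} f^{m_1\cdots m_r-1\cdots m_k}_{i_1\cdots i_k}$, so that
\[
\Phi^k(\partial\tilde\gamma) = \frac{1}{k!} \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+} \Big( \partial^M f^{m_1\cdots m_k}_{i_1\cdots i_k}\, \delta u_{i_1}^{(m_1)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)} + \sum_{q=1}^{k} f^{m_1\cdots m_k}_{i_1\cdots i_k}\, \delta u_{i_1}^{(m_1)} \wedge\cdots\wedge \delta u_{i_q}^{(m_q+1)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)} \Big) = \partial\Phi^k(\tilde\gamma).
\]
This proves that $\Phi^\bullet$ is compatible with the action of ∂.

Finally, we prove that $\Phi^\bullet$ is compatible with the contraction operators. Let $\tilde\gamma \in \tilde\Gamma^k$ be as in the statement of the theorem, and let $\tilde\xi \in \tilde\Gamma_1$ be as in (151). By Eq. (63), we have the following formula for the contraction operator $\iota_{\tilde\xi}$,
\[
(\iota_{\tilde\xi}\,\tilde\gamma)_{\lambda_2,\dots,\lambda_k}(u_{i_2},\dots,u_{i_k}) = \sum_{i_1\in I} \big\langle P_{i_1}(x_1),\, \tilde\gamma_{\lambda_1,\lambda_2,\dots,\lambda_k}(u_{i_1}, u_{i_2},\dots,u_{i_k}) \big\rangle,
\]
where $\langle\ ,\ \rangle$ denotes the contraction of $x_1$ with $\lambda_1$ defined in (62). Hence, the coefficient of $\lambda_2^{m_2}\cdots\lambda_k^{m_k}$ in $(\iota_{\tilde\xi}\,\tilde\gamma)_{\lambda_2,\dots,\lambda_k}(u_{i_2},\dots,u_{i_k})$ is
\[
\sum_{i_1\in I,\, m_1\in\mathbb{Z}_+} P_{i_1,m_1}\, f^{m_1 m_2\cdots m_k}_{i_1 i_2\cdots i_k}.
\]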



It follows that
\[
\Phi^{k-1}(\iota_{\tilde\xi}(\tilde\gamma)) = \frac{1}{(k-1)!} \sum_{i_r\in I,\, m_r\in\mathbb{Z}_+} P_{i_1,m_1}\, f^{m_1 m_2\cdots m_k}_{i_1 i_2\cdots i_k}\, \delta u_{i_2}^{(m_2)} \wedge\cdots\wedge \delta u_{i_k}^{(m_k)},
\]
which, recalling (140) and (152), is the same as $\iota_{\Phi_1(\tilde\xi)}(\Phi^k(\tilde\gamma))$. This completes the proof of the theorem.

5.5. An explicit construction of the $\mathfrak{g}^\partial$-complex of variational calculus. Let $\mathcal{V}$ be an algebra of differential functions in the variables $\{u_i\}_{i\in I}$, let $A = \bigoplus_{i\in I} \mathbb{F}[\partial]u_i$ be the free $\mathbb{F}[\partial]$-module of rank ℓ, considered as a Lie conformal algebra with the zero λ-bracket, and consider the A-module structure on $\mathcal{V}$, with the λ-action given by (148). By Theorem 6, the $\mathfrak{g}^\partial$-complex of variational calculus $\Omega^\bullet(\mathcal{V})$ is isomorphic to the $\Pi\Gamma_1$-complex $\Gamma^\bullet(A, \mathcal{V})$. Furthermore, due to Theorems 2 and 4, the $\Pi\Gamma_1$-complex $\Gamma^\bullet(A, \mathcal{V})$ is isomorphic to the $\Pi C_1$-complex $C^\bullet(A, \mathcal{V}) = \bigoplus_{k\in\mathbb{Z}_+} C^k$, which is explicitly described in Sects. 2.3 and 4.6. In this section we use this isomorphism to describe explicitly the $\Pi C_1 \simeq \mathfrak{g}^\partial$-complex of variational calculus $C^\bullet(A, \mathcal{V}) \simeq \Omega^\bullet(\mathcal{V})$, both in terms of "poly-symbols", and in terms of skew-symmetric "poly-differential operators". We shall identify these two complexes via this isomorphism.

We start by describing all vector spaces $\Omega^k$ and the maps $d : \Omega^k \to \Omega^{k+1}$, $k \in \mathbb{Z}_+$. First, we have
\[
\Omega^0 = \mathcal{V}/\partial\mathcal{V}. \tag{154}
\]

Next, $\Omega^1 = \mathrm{Hom}_{\mathbb{F}[\partial]}(A, \mathcal{V})$, hence we have a canonical identification
\[
\Omega^1 = \mathcal{V}^{\oplus\ell}. \tag{155}
\]
Comparing (12) and (148), we see that $d : \Omega^0 \to \Omega^1$ is given by the variational derivative:
\[
d\Big(\int f\Big) = \frac{\delta f}{\delta u}. \tag{156}
\]
For arbitrary $k \ge 1$, the space $\Omega^k$ can be identified with the space of k-symbols in $u_i$, $i \in I$. By definition, a k-symbol is a collection of expressions of the form
\[
\{u_{i_1\,\lambda_1} u_{i_2\,\lambda_2} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\} \in \mathbb{F}[\lambda_1,\dots,\lambda_{k-1}] \otimes \mathcal{V}, \tag{157}
\]
where $i_1,\dots,i_k \in I$, satisfying the following skew-symmetry property:
\[
\{u_{i_1\,\lambda_1} u_{i_2\,\lambda_2} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\} = \mathrm{sign}(\sigma)\, \{u_{i_{\sigma(1)}\,\lambda_{\sigma(1)}} \cdots u_{i_{\sigma(k-1)}\,\lambda_{\sigma(k-1)}} u_{i_{\sigma(k)}}\}, \tag{158}
\]
for every permutation $\sigma \in S_k$, where $\lambda_k$ is replaced, if it occurs in the RHS, by $\lambda_k^\dagger = -\sum_{j=1}^{k-1} \lambda_j - \partial$, with ∂ acting from the left. Clearly, by sesquilinearity, for $k \ge 1$, the space $\Omega^k = C^k$ of k-λ-brackets is in one-to-one correspondence with the space of k-symbols.



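For example, the space of 1-symbols is the same as $\mathcal{V}^{\oplus\ell}$. A 2-symbol is a collection of elements $\{u_{i\,\lambda} u_j\} \in \mathbb{F}[\lambda] \otimes \mathcal{V}$, for $i, j \in I$, such that $\{u_{i\,\lambda} u_j\} = -\{u_{j\,-\lambda-\partial}\, u_i\}$. A 3-symbol is a collection of elements $\{u_{i\,\lambda} u_{j\,\mu} u_k\} \in \mathbb{F}[\lambda, \mu] \otimes \mathcal{V}$, for $i, j, k \in I$, such that
\[
\{u_{i\,\lambda} u_{j\,\mu} u_k\} = -\{u_{j\,\mu} u_{i\,\lambda} u_k\} = -\{u_{i\,\lambda} u_{k\,-\lambda-\mu-\partial}\, u_j\},
\]
and similarly for k > 3. Comparing (13) and (148) we see that, if $F \in \mathcal{V}^{\oplus\ell}$, its differential dF corresponds to the following 2-symbol:
\[
\{u_{i\,\lambda} u_j\} = \sum_{n\in\mathbb{Z}_+} \Big( \lambda^n \frac{\partial F_j}{\partial u_i^{(n)}} - (-\lambda-\partial)^n \frac{\partial F_i}{\partial u_j^{(n)}} \Big) = (D_F)_{ji}(\lambda) - (D_F^*)_{ji}(\lambda), \tag{159}
\]
where $D_F$ is the Frechet derivative defined by (9). More generally, the differential of a k-symbol for $k \ge 1$ is given by the following formula:
\[
\begin{aligned}
d\,\{u_{i_1\,\lambda_1} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\}_{i_1,\dots,i_k\in I}
= \Bigg\{ &\sum_{s=1}^{k} (-1)^{s+1} \sum_{n\in\mathbb{Z}_+} \lambda_s^n\, \frac{\partial}{\partial u_{i_s}^{(n)}} \{u_{i_1\,\lambda_1} \overset{s}{\check{\cdots}}\, u_{i_k\,\lambda_k} u_{i_{k+1}}\} \\
&+ (-1)^k \sum_{n\in\mathbb{Z}_+} \Big(-\sum_{j=1}^{k} \lambda_j - \partial\Big)^{n} \frac{\partial}{\partial u_{i_{k+1}}^{(n)}} \{u_{i_1\,\lambda_1} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\} \Bigg\}_{i_1,\dots,i_{k+1}\in I}.
\end{aligned} \tag{160}
\]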
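As an illustration of (159), the following computes the 2-symbol of dF for two sample elements $F \in \mathcal{V} = R_1$ (so $\ell = 1$). The examples are not from the paper; they only assume that ∂ in $(-\lambda-\partial)^n$ acts on the coefficient to its right, as in the formula.

```latex
% Example A: F = u''. Only \partial F/\partial u'' = 1 is nonzero, and \partial 1 = 0, so
\{u_\lambda u\} = \lambda^2 - (-\lambda-\partial)^2\, 1 = \lambda^2 - \lambda^2 = 0 ,
% consistent with F = \delta f/\delta u for f = \tfrac12 u u'' and d^2 = 0.

% Example B: F = u u'. Here \partial F/\partial u = u' and \partial F/\partial u' = u, so
\{u_\lambda u\}
  = \big(u' + \lambda u\big) - \big(u' + (-\lambda-\partial)u\big)
  = u' + 2\lambda u .
% Skew-symmetry check: -\{u_{-\lambda-\partial} u\} = -\big(u' + 2(-\lambda-\partial)u\big)
%                    = -u' + 2\lambda u + 2u' = u' + 2\lambda u .
```

In Example B the 2-symbol is nonzero, so $u u'$ is not a variational derivative, even though it is a total derivative.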

Provided that $\mathcal{V}$ is an algebra of differential functions extension of $R_\ell$, an equivalent language is that of skew-symmetric poly-differential operators. By definition, a k-differential operator is an $\mathbb{F}$-linear map $S : (\mathcal{V}^\ell)^k \to \mathcal{V}/\partial\mathcal{V}$, of the form
\[
S(P^1,\dots,P^k) = \int \sum_{\substack{n_1,\dots,n_k\in\mathbb{Z}_+ \\ i_1,\dots,i_k\in I}} f^{n_1,\dots,n_k}_{i_1,\dots,i_k}\, (\partial^{n_1} P^1_{i_1}) \cdots (\partial^{n_k} P^k_{i_k}). \tag{161}
\]
The operator S is called skew-symmetric if $S(P^1,\dots,P^k) = \mathrm{sign}(\sigma)\, S(P^{\sigma(1)},\dots,P^{\sigma(k)})$, for every $P^1,\dots,P^k \in \mathcal{V}^\ell$ and every permutation $\sigma \in S_k$. Given a k-symbol
\[
\{u_{i_1\,\lambda_1} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\} = \sum_{n_1,\dots,n_{k-1}\in\mathbb{Z}_+} f^{n_1,\dots,n_{k-1}}_{i_1,\dots,i_{k-1},i_k}\, \lambda_1^{n_1} \cdots \lambda_{k-1}^{n_{k-1}}, \qquad i_1,\dots,i_k \in I, \tag{162}
\]
where $f^{n_1,\dots,n_{k-1}}_{i_1,\dots,i_{k-1},i_k} \in \mathcal{V}$, we associate to it the following poly-differential operator $S : (\mathcal{V}^\ell)^k \to \mathcal{V}/\partial\mathcal{V}$:
\[
S(P^1,\dots,P^k) = \int \sum_{\substack{n_1,\dots,n_{k-1}\in\mathbb{Z}_+ \\ i_1,\dots,i_k\in I}} f^{n_1,\dots,n_{k-1}}_{i_1,\dots,i_{k-1},i_k}\, (\partial^{n_1} P^1_{i_1}) \cdots (\partial^{n_{k-1}} P^{k-1}_{i_{k-1}})\, P^k_{i_k}. \tag{163}
\]



Clearly, the skew-symmetry property of the k-symbol is translated to the skew-symmetry of the poly-differential operator. Conversely, integrating by parts, any k-differential operator can be written in the form (163). Thus we have a surjective map Ξ from the space of k-symbols to the space of skew-symmetric k-differential operators. Provided that $\mathcal{V}$ is an algebra of differential functions extension of $R_\ell$, by Lemma 10(c), the k-differential operator S can be written uniquely in the form (163). Hence, the map Ξ is an isomorphism.

Note that the space of 1-differential operators $S : \mathcal{V}^\ell \to \mathcal{V}/\partial\mathcal{V}$ can be canonically identified with the space $\Omega^1 = \mathcal{V}^{\oplus\ell}$. Explicitly, to the 1-differential operator $S(P) = \int \sum_{i\in I,\, n\in\mathbb{Z}_+} f_i^n\, \partial^n P_i$, we associate:
\[
\Big( \sum_{n\in\mathbb{Z}_+} (-\partial)^n f_i^n \Big)_{i\in I} \in \mathcal{V}^{\oplus\ell}. \tag{164}
\]

We can write down the expression of the differential $d : \Omega^k \to \Omega^{k+1}$ in terms of poly-differential operators. First, if $F \in \Omega^1 = \mathcal{V}^{\oplus\ell}$, the 2-differential operator corresponding to $dF \in \Omega^2$ is obtained by looking at Eq. (159):
\[
(dF)(P, Q) = \int \sum_{i\in I} \big( Q_i\, X_P(F_i) - P_i\, X_Q(F_i) \big) = \int \sum_{i,j\in I} \big( Q_i\, D_F(\partial)_{ij} P_j - P_i\, D_F(\partial)_{ij} Q_j \big), \tag{165}
\]
where $X_P$ denotes the evolutionary vector field associated to $P \in \mathcal{V}^\ell$, defined in (133), and $D_F(\partial)$ is the Frechet derivative (9). Next, if $S : (\mathcal{V}^\ell)^k \to \mathcal{V}/\partial\mathcal{V}$ is a skew-symmetric k-differential operator, its differential dS, obtained by looking at (160), is the following (k+1)-differential operator:
\[
(dS)(P^1,\dots,P^{k+1}) = \sum_{s=1}^{k+1} (-1)^{s+1}\, (X_{P^s} S)(P^1, \overset{s}{\check{\cdots}}, P^{k+1}). \tag{166}
\]
In the above formula, if S is as in (161), $X_P S$ denotes the k-differential operator obtained from S by replacing the coefficients $f^{n_1,\dots,n_k}_{i_1,\dots,i_k}$ by $X_P(f^{n_1,\dots,n_k}_{i_1,\dots,i_k})$.

Remark 8. For $k \ge 2$, a k-differential operator can also be understood as a map $S : (\mathcal{V}^\ell)^{k-1} \to \mathcal{V}^{\oplus\ell}$ of the following form:
\[
S(P^1,\dots,P^{k-1})_{i_k} = \sum_{\substack{n_1,\dots,n_{k-1}\in\mathbb{Z}_+ \\ i_1,\dots,i_{k-1}\in I}} f^{n_1,\dots,n_{k-1}}_{i_1,\dots,i_{k-1},i_k}\, (\partial^{n_1} P^1_{i_1}) \cdots (\partial^{n_{k-1}} P^{k-1}_{i_{k-1}}). \tag{167}
\]
This corresponds to the k-symbol (162) in the obvious way. With this notation, the differential dS is the following map $(\mathcal{V}^\ell)^k \to \mathcal{V}^{\oplus\ell}$:
\[
(dS)(P^1,\dots,P^k)_i = \sum_{s=1}^{k} (-1)^{s+1}\, (X_{P^s} S)(P^1, \overset{s}{\check{\cdots}}, P^k)_i + (-1)^k \sum_{j\in I,\, n\in\mathbb{Z}_+} (-\partial)^n \Big( P^k_j\, \frac{\partial S}{\partial u_i^{(n)}}(P^1,\dots,P^{k-1})_j \Big). \tag{168}
\]



Recall that the Lie algebra $\mathfrak{g}^\partial \simeq \Pi C_1$ is identified with the space $\mathcal{V}^\ell$ via the map $P \mapsto X_P$, defined in (133). Given $P \in \mathcal{V}^\ell$, we want to describe explicitly the action of the corresponding contraction operator $\iota_P$ and the Lie derivative $L_P = [d, \iota_P]$. First, for $F \in \mathcal{V}^{\oplus\ell} = \Omega^1$, we have (cf. (90)):
\[
\iota_P(F) = \int \sum_{i\in I} P_i F_i \in \mathcal{V}/\partial\mathcal{V} = \Omega^0. \tag{169}
\]
Next, the contraction of a k-symbol for $k \ge 2$ is given by the following formula (cf. (91)):
\[
\iota_P \{u_{i_1\,\lambda_1} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\}_{i_1,\dots,i_k\in I} = \Big( \sum_{i_1\in I} \{u_{i_1\,\partial}\, u_{i_2\,\lambda_2} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\}_{\to} P_{i_1} \Big)_{i_2,\dots,i_k\in I}, \tag{170}
\]
where, as usual, the arrow in the RHS means that ∂ is moved to the right. For k = 2, the above formula becomes
\[
\iota_P \{u_{i\,\lambda} u_j\}_{i,j\in I} = \Big( \sum_{j\in I} \{u_{j\,\partial}\, u_i\}_{\to} P_j \Big)_{i\in I} \in \mathcal{V}^{\oplus\ell} = \Omega^1. \tag{171}
\]
We can write the above formulas in the language of poly-differential operators. For a k-differential operator S, we have
\[
(\iota_{P^1} S)(P^2,\dots,P^k) = S(P^1, P^2,\dots,P^k). \tag{172}
\]
For k = 2, $\iota_{P^1} S$ is a 1-differential operator which, by (164), is the same as an element of $\mathcal{V}^{\oplus\ell} = \Omega^1$.

Remark 9. In the interpretation (167) of a k-differential operator, the action of the contraction operator is given by $(\iota_{P^1} S)(P^2,\dots,P^{k-1})_{i_k} = S(P^1, P^2,\dots,P^{k-1})_{i_k}$.

Next, we write the formula for the Lie derivative $L_Q : \Omega^k \to \Omega^k$, associated to $Q \in \mathcal{V}^\ell \simeq \mathfrak{g}^\partial$, using Cartan's formula $L_Q = [\iota_Q, d]$. Recalling (156) and (169), after integration by parts we obtain, for $\int f \in \Omega^0 = \mathcal{V}/\partial\mathcal{V}$:
\[
L_Q \int f = \int X_Q(f), \tag{173}
\]
where $X_Q$ is the evolutionary vector field corresponding to Q (cf. (133)). Similarly, recalling (159) and (171), we obtain, for $F \in \Omega^1 = \mathcal{V}^{\oplus\ell}$:
\[
d\iota_Q(F) = D_F(\partial)^* Q + D_Q(\partial)^* F, \qquad \iota_Q d(F) = D_F(\partial) Q - D_F(\partial)^* Q,
\]
where $D_F(\partial)$ denotes the Frechet derivative (9), and $D_F(\partial)^*$ is the adjoint differential operator. Putting the above formulas together, we get:
\[
L_Q F = D_F(\partial) Q + D_Q(\partial)^* F. \tag{174}
\]
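Formula (174) can be checked in a simple case. The sketch below takes $\ell = 1$, $\mathcal{V} = R_1$ and $Q = u'$, and assumes the standard form of the Frechet derivative, $D_F(\partial) = \sum_n \frac{\partial F}{\partial u^{(n)}}\,\partial^n$, which is what (9) is taken to be here since (9) lies outside this section.

```latex
% For Q = u' we have D_Q(\partial) = \partial, hence D_Q(\partial)^* = -\partial, and
D_F(\partial)\,Q = \sum_{n\in\mathbb{Z}_+} \frac{\partial F}{\partial u^{(n)}}\, u^{(n+1)} = \partial F
\quad \text{(in } R_1\text{)} .

% Therefore, by (174),
L_{u'} F = D_F(\partial)\,u' + D_{u'}(\partial)^* F = \partial F - \partial F = 0 .
```

This is as expected: $X_{u'} = \partial$ on $R_1$, and the action of ∂ is trivial on the reduced complex.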



For $k \ge 2$, $L_Q$ acts on a k-symbol in $\Omega^k$ by the following formula, which can be derived from (160) and (170):
\[
\begin{aligned}
L_Q \{u_{i_1\,\lambda_1} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\} ={}& X_Q \{u_{i_1\,\lambda_1} \cdots u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\} \\
&+ \sum_{s=1}^{k-1} (-1)^{s+1} \sum_{j\in I} \{u_{j\,\lambda_s+\partial}\, u_{i_1\,\lambda_1} \overset{s}{\check{\cdots}}\, u_{i_{k-1}\,\lambda_{k-1}} u_{i_k}\}_{\to} D_Q(\lambda_s)_{j i_s} \\
&+ (-1)^{k+1} \sum_{j\in I} \{u_{j\,\lambda_k^\dagger+\partial}\, u_{i_1\,\lambda_1} \cdots u_{i_{k-2}\,\lambda_{k-2}} u_{i_{k-1}}\}_{\to} D_Q(\lambda_k^\dagger)_{j i_k}.
\end{aligned}
\]
In the RHS the evolutionary vector field $X_Q$ is applied to the coefficients of the k-symbol, in the last two terms the arrow means, as usual, that we move ∂ to the right, $D_Q(\lambda)$ denotes the Frechet derivative (9) considered as a polynomial in λ, and, in the last term, $\lambda_k^\dagger = -\lambda_1 - \cdots - \lambda_{k-1} - \partial$, where ∂ is moved to the left. This formula takes a much nicer form in the language of k-differential operators. Namely we have:
\[
(L_Q S)(P^1,\dots,P^k) = (X_Q S)(P^1,\dots,P^k) + \sum_{s=1}^{k} S(P^1,\dots, X_Q P^s,\dots,P^k). \tag{175}
\]
Here $X_Q S$ has the same meaning as in Eq. (166). This formula can be obtained from the previous one by integration by parts.

5.6. An application to the classification of symplectic differential operators. Recall that $\mathcal{C} \subset \mathcal{V}$ denotes the subspace (127) of constant functions. In [BDK] we prove the following:

Theorem 7. If $\mathcal{V}$ is normal, then $H^k(\Omega^\bullet, d) = \delta_{k,0}\, \mathcal{C}/(\mathcal{C} \cap \partial\mathcal{V})$.

Recall that a symplectic differential operator (cf. [D] and [BDK]) is a skew-adjoint differential operator $S(\partial) = \big(S_{i,j}(\partial)\big)_{i,j\in I} : \mathcal{V}^\ell \to \mathcal{V}^{\oplus\ell}$, which is closed, namely the following condition holds (cf. (168)):
\[
u_{i\,\lambda} S_{kj}(\mu) - u_{j\,\mu} S_{ki}(\lambda) - u_{k\,-\lambda-\mu-\partial}\, S_{ji}(\lambda) = 0, \tag{176}
\]
where the λ-action of $u_i$ on $\mathcal{V}$ is defined by (148). We have the following corollary of Theorem 7.

Corollary 1. If $\mathcal{V}$ is a normal algebra of differential functions, then any symplectic differential operator is of the form $S_F(\partial) = D_F(\partial) - D_F(\partial)^*$, for some $F \in \mathcal{V}^{\oplus\ell}$. Moreover, $S_F = S_G$ if and only if $F - G = \frac{\delta f}{\delta u}$ for some $f \in \mathcal{V}$.

A skew-symmetric k-differential operator $S : (\mathcal{V}^\ell)^k \to \mathcal{V}/\partial\mathcal{V}$ is called symplectic if it is closed, i.e.
\[
\sum_{s=1}^{k+1} (-1)^{s+1}\, (X_{P^s} S)(P^1, \overset{s}{\check{\cdots}}, P^{k+1}) = 0.
\]
The following corollary of Theorem 7 is a generalization of Corollary 1 and uses Proposition 13.



Corollary 2. If $\mathcal{V}$ is a normal algebra of differential functions, then any symplectic k-differential operator, for $k \ge 1$, is of the form:
\[
S(P^1,\dots,P^k) = \sum_{s=1}^{k} (-1)^{s+1}\, (X_{P^s} T)(P^1, \overset{s}{\check{\cdots}}, P^k),
\]
for some skew-symmetric (k−1)-differential operator T. Moreover, T is defined up to adding a symplectic (k−1)-differential operator.

Remark 10. It follows from the proof of Theorem 7 that Corollaries 1 and 2 hold in any algebra of differential functions $\mathcal{V}$, provided that we are allowed to take F and T respectively in an extension of $\mathcal{V}$, obtained by adding finitely many integrals of elements of $\mathcal{V}$ (an integral of an element $f \in \mathcal{V}_{n,i}$ is a preimage $\int du_i^{(n)} f$ of f under $\frac{\partial}{\partial u_i^{(n)}}$, independent of $u_j^{(m)}$ with $(m, j) > (n, i)$).
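To make Corollary 1 concrete, here are two sample potentials F for $\ell = 1$ in $R_1$, together with the operators $S_F(\partial) = D_F(\partial) - D_F(\partial)^*$ they produce. The examples are illustrative and assume the standard Frechet derivative $D_F(\partial) = \sum_n \frac{\partial F}{\partial u^{(n)}}\,\partial^n$ for (9).

```latex
% Example A: F = \tfrac12 u'.
D_F(\partial) = \tfrac12\,\partial, \qquad D_F(\partial)^* = -\tfrac12\,\partial, \qquad
S_F(\partial) = \partial .

% Example B: F = u u'.
D_F(\partial) = u' + u\,\partial, \qquad
D_F(\partial)^* = u' - \partial\circ u = -\,u\,\partial, \qquad
S_F(\partial) = u' + 2u\,\partial = u\,\partial + \partial\circ u .
```

Both operators are skew-adjoint, as they must be: $(u\,\partial + \partial\circ u)^* = -\partial\circ u - u\,\partial$.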

Remark 11. The map Ξ defined in Sect. 5.5 may have a non-zero kernel if $\mathcal{V}$ is not an extension of the algebra $R_\ell$, but, of course, for any $\mathcal{V}$ the image of Ξ is a $\mathfrak{g}^\partial$-complex. The 0th term of this complex is $\mathcal{V}/\partial\mathcal{V}$ and the kth term, for $k \ge 1$, is the space of skew-symmetric k-differential operators $S : (\mathcal{V}^\ell)^k \to \mathcal{V}/\partial\mathcal{V}$.

Remark 12. Throughout this section we assumed that the number ℓ of variables $u_i$ is finite, but this assumption is not essential, and our arguments go through with minor modifications. This is the reason for distinguishing $\mathcal{V}^\ell$ from $\mathcal{V}^{\oplus\ell}$, in order to accommodate the case ℓ = ∞.

References

[BKV] Bakalov, B., Kac, V.G., Voronov, A.A.: Cohomology of conformal algebras. Commun. Math. Phys. 200, 561–598 (1999)
[BDAK] Bakalov, B., D'Andrea, A., Kac, V.G.: Theory of finite pseudoalgebras. Adv. Math. 162(1), 1–140 (2001)
[BDK] Barakat, A., De Sole, A., Kac, V.G.: Poisson vertex algebras in the theory of Hamiltonian equations. http://arXiv.org/abs/0907.1275, 2009
[D] Dorfman, I.: Dirac structures and integrability of non-linear evolution equations. New York: John Wiley and Sons, 1993
[Di] Dickey, L.A.: Soliton equations and Hamiltonian systems. Advanced Ser. Math. Phys. 26, Second ed., Singapore: World Sci., 2003
[DTT] Dolgushev, V., Tamarkin, D., Tsygan, B.: Formality of the homotopy calculus algebra of Hochschild (co)chains. http://arXiv.org/abs/0807.5117v1[math.KT], 2008
[H] Helmholtz, H.: Über die physikalische Bedeutung des Princips der kleinsten Wirkung. J. Reine Angew. Math. 100, 137–166 (1887)
[K] Kac, V.G.: Vertex algebras for beginners. Univ. Lecture Ser., Vol 10, 1996, Second edition, Providence, RI: Amer. Math. Soc., 1998
[Vi] Vinogradov, A.M.: On the algebra-geometric foundations of lagrangian field theory. Sov. Math. Dokl. 18, 1200–1204 (1977)
[V] Volterra, V.: Leçons sur les Fonctions de Lignes. Paris: Gauthier-Villars, 1913

Communicated by Y. Kawahigashi




Abelian Sandpiles and the Harmonic Model

Klaus Schmidt (1,2), Evgeny Verbitskiy (3,4)

1 Mathematics Institute, University of Vienna, Nordbergstrasse 15, A-1090 Vienna, Austria. E-mail: [email protected]
2 Erwin Schrödinger Institute for Mathematical Physics, Boltzmanngasse 9, A-1090 Vienna, Austria
3 Philips Research, High Tech Campus 36 (M/S 2), 5656 AE, Eindhoven, The Netherlands.
4 Department of Mathematics, University of Groningen, PO Box 407, 9700 AK, Groningen, The Netherlands

Received: 15 January 2009 / Accepted: 14 April 2009 Published online: 15 August 2009 – © The Author(s) 2009. This article is published with open access at Springerlink.com

Abstract: We present a construction of an entropy-preserving equivariant surjective map from the d-dimensional critical sandpile model to a certain closed, shift-invariant subgroup of $\mathbb{T}^{\mathbb{Z}^d}$ (the 'harmonic model'). A similar map is constructed for the dissipative abelian sandpile model and is used to prove uniqueness and the Bernoulli property of the measure of maximal entropy for that model.

Contents

1. Introduction
   1.1 Four models
   1.2 Outline of the paper
2. A Potential Function and its $\ell^1$-Multipliers
3. The Harmonic Model
   3.1 Linearization
   3.2 Homoclinic points
   3.3 Symbolic covers of the harmonic model
   3.4 Kernels of covering maps
4. The Abelian Sandpile Model
5. The Critical Sandpile Model
   5.1 Surjectivity of the maps $\xi_g : R_\infty \to X_{f^{(d)}}$
   5.2 Properties of the maps $\xi_g$, $g \in \tilde{I}_d$
6. The Dissipative Sandpile Model
   6.1 The dissipative harmonic model
   6.2 The covering map $\xi^{(\gamma)} : R_\infty^{(\gamma)} \to X_{f^{(d,\gamma)}}$
7. Conclusions and Final Remarks
References


1. Introduction

For any integer $d \ge 2$ let

$$h_d = \int_0^1 \cdots \int_0^1 \log\Bigl(2d - 2\sum_{i=1}^{d} \cos(2\pi x_i)\Bigr)\, dx_1 \cdots dx_d, \tag{1.1}$$

$h_2 = 1.166$, $h_3 = 1.673$, etc. It turns out that for $d \ge 2$, $h_d$ is the topological entropy of three different $d$-dimensional models in mathematical physics, probability theory, and dynamical systems. For $d = 2$, there is even a fourth model with the same entropy $h_d$.

1.1. Four models. The $d$-dimensional abelian sandpile model was introduced by Bak, Tang and Wiesenfeld in [3,4] and attracted a lot of attention after the discovery of the Abelian property by Dhar in [8]. The set of infinite allowed configurations of the sandpile model is the shift-invariant subset $\mathcal{R}_\infty \subset \{0, \dots, 2d-1\}^{\mathbb{Z}^d}$ defined in (4.4) and discussed in Sect. 4.¹ In [10], Dhar showed that the topological entropy of the shift-action $\sigma_{\mathcal{R}_\infty}$ on $\mathcal{R}_\infty$ is also given by (3.4), which implies that every shift-invariant measure $\mu$ of maximal entropy on $\mathcal{R}_\infty$ has entropy (1.1). Shift-invariant measures on $\mathcal{R}_\infty$ were studied in some detail by Athreya and Jarai in [1,2] and by Jarai and Redig in [13]; however, the question of uniqueness of the measure of maximal entropy is still unresolved.

¹ In the physics literature it is more customary to view the sandpile model as a subset of $\{1, \dots, 2d\}^{\mathbb{Z}^d}$ by adding 1 to each coordinate.

Spanning trees of finite graphs are classical objects in combinatorics and graph theory. In 1991, Pemantle in his seminal paper [17] addressed the question of constructing uniform probability measures on the set $\mathcal{T}_d$ of infinite spanning trees on $\mathbb{Z}^d$, i.e., on the set of spanning subgraphs of $\mathbb{Z}^d$ without loops. This work was continued in 1993 by Burton and Pemantle [5], where the authors observed that the topological entropy of the set of all spanning trees in $\mathbb{Z}^d$ is also given by the formula (1.1). Another problem discussed in [5] is the uniqueness of the shift-invariant measure of maximal entropy on $\mathcal{T}_d$ (the proof in [5] is not complete, but Sheffield has recently completed the proof in [22]).

This coincidence of entropies raised the question about the relation between these models. A partial answer to this question was given in 1998 by R. Solomyak in [24]: she constructed injective mappings from the set of rooted spanning trees on finite regions of $\mathbb{Z}^d$ into $X_{f^{(d)}}$ such that the images are sufficiently separated. In particular, this provided a direct proof of coincidence of the topological entropies of $\alpha_{f^{(d)}}$ and $\sigma_{\mathcal{T}_d}$ without making use of formula (1.1).

In dimension 2, spanning trees are related not only to the sandpile models (cf. e.g., [19] for a detailed account) and, by [24], to the harmonic model, but also to a dimer model (more precisely, to the even shift-action on the two-dimensional dimer model) by [5].

However, the connections between the abelian sandpiles and spanning trees (as well as dimers in dimension 2) are non-local: they are obtained by restricting the models to finite regions in $\mathbb{Z}^d$ (or $\mathbb{Z}^2$) and constructing maps between these restrictions, but these maps are not consistent as the finite regions increase to $\mathbb{Z}^d$.

In this paper we study the relation between the infinite abelian sandpile models and the algebraic dynamical systems called the harmonic models. The purpose of this paper is to define a shift-equivariant, surjective local mapping between these models: from the
infinite critical sandpile model $\mathcal{R}_\infty$ to the harmonic model. Although we are not able to prove that this mapping is almost one-to-one, it has the property that it sends every shift-invariant measure of maximal entropy on $\mathcal{R}_\infty$ to Haar measure on $X_{f^{(d)}}$. Moreover, it sheds some light on the somewhat elusive group structure of $\mathcal{R}_\infty$. Firstly, the dual group of $X_{f^{(d)}}$ is the group

$$G_d = R_d / (f^{(d)}),$$

where $R_d = \mathbb{Z}[u_1^{\pm 1}, \dots, u_d^{\pm 1}]$ is the ring of Laurent polynomials with integer coefficients in the variables $u_1, \dots, u_d$, and $(f^{(d)})$ is the principal ideal in $R_d$ generated by $f^{(d)} = 2d - \sum_{i=1}^{d} (u_i + u_i^{-1})$. The group $G_d$ is the correct infinite analogue of the groups of addition operators defined on finite volumes, see [9,19] (cf. Sect. 7). Secondly, the map $\xi_{I_d}$ constructed in this paper gives rise to an equivalence relation $\sim$ on $\mathcal{R}_\infty$ with

$$x \sim y \iff x - y \in \ker(\xi_{I_d}),$$

such that $\mathcal{R}_\infty/\!\sim$ is a compact abelian group. Moreover, $\mathcal{R}_\infty/\!\sim$, viewed as a dynamical system under the natural shift-action of $\mathbb{Z}^d$, has the topological entropy (1.1). This extends the result of [16], obtained in the case of the dissipative sandpile model, to the critical sandpile model.

Finally, we also identify an algebraic dynamical system isomorphic to the dissipative sandpile model. This allows an easy extension of the results in [16]: namely, the uniqueness of the measure of maximal entropy on the set of infinite recurrent configurations in the dissipative case. Unfortunately, we are not yet able to establish the analogous uniqueness result in the critical case.

1.2. Outline of the paper. Sect. 2 investigates certain multipliers of the potential function (or Green's function) of the simple random walk on $\mathbb{Z}^d$. In Sect. 3 these results are used to describe the homoclinic points of the harmonic model. These points are then used to define shift-equivariant maps from the space $\ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ of all bounded $d$-parameter sequences of integers to $X_{f^{(d)}}$. In Sect. 4 we introduce the critical and dissipative sandpile models. In Sect. 5 we show that the maps found in Sect. 3 send the critical sandpile model $\mathcal{R}_\infty$ onto $X_{f^{(d)}}$, preserve topological entropy, and map every measure of maximal entropy on $\mathcal{R}_\infty$ to Haar measure on the harmonic model. After a brief discussion of further properties of these maps in Subsect. 5.2, we turn to dissipative sandpile models in Sect. 6 and define an analogous map to another closed, shift-invariant subgroup of $\mathbb{T}^{\mathbb{Z}^d}$. The main result in [16] shows that this map is almost one-to-one, which implies that the measure of maximal entropy on the dissipative sandpile model is unique and Bernoulli.

2. A Potential Function and its $\ell^1$-Multipliers

Let $d \ge 1$. For every $i = 1, \dots, d$ we write $e^{(i)} = (0, \dots, 0, 1, 0, \dots, 0)$ for the $i$-th unit vector in $\mathbb{Z}^d$, and we set $\mathbf{0} = (0, \dots, 0) \in \mathbb{Z}^d$.

We identify the cartesian product $W_d = \mathbb{R}^{\mathbb{Z}^d}$ with the set of formal real power series in the variables $u_1^{\pm 1}, \dots, u_d^{\pm 1}$ by viewing each $w = (w_n) \in W_d$ as the power series

$$\sum_{n \in \mathbb{Z}^d} w_n u^n \tag{2.1}$$



with $w_n \in \mathbb{R}$ and $u^n = u_1^{n_1} \cdots u_d^{n_d}$ for every $n = (n_1, \dots, n_d) \in \mathbb{Z}^d$. The involution $w \mapsto w^*$ on $W_d$ is defined by

$$w^*_n = w_{-n}, \quad n \in \mathbb{Z}^d. \tag{2.2}$$

For $E \subset \mathbb{Z}^d$ we denote by $\pi_E : W_d \to \mathbb{R}^E$ the projection onto the coordinates in $E$. For every $p \ge 1$ we regard $\ell^p(\mathbb{Z}^d)$ as the set of all $w \in W_d$ with

$$\|w\|_p = \Bigl(\sum_{n \in \mathbb{Z}^d} |w_n|^p\Bigr)^{1/p} < \infty.$$

Similarly we view $\ell^\infty(\mathbb{Z}^d)$ as the set of all bounded elements in $W_d$, equipped with the supremum norm $\|\cdot\|_\infty$. Finally we denote by $R_d = \mathbb{Z}[u_1^{\pm 1}, \dots, u_d^{\pm 1}] \subset \ell^1(\mathbb{Z}^d) \subset W_d$ the ring of Laurent polynomials with integer coefficients. Every $h$ in any of these spaces will be written as $h = (h_n) = \sum_{n \in \mathbb{Z}^d} h_n u^n$ with $h_n \in \mathbb{R}$ (resp. $h_n \in \mathbb{Z}$ for $h \in R_d$).

The map $(m, w) \mapsto u^m \cdot w$ with $(u^m \cdot w)_n = w_{n-m}$ is a $\mathbb{Z}^d$-action by automorphisms of the additive group $W_d$ which extends linearly to an $R_d$-action on $W_d$ given by

$$h \cdot w = \sum_{n \in \mathbb{Z}^d} h_n u^n \cdot w \tag{2.3}$$

for every $h \in R_d$ and $w \in W_d$. If $w$ also lies in $R_d$ this definition is consistent with the usual product in $R_d$.

For the following discussion we assume that $d \ge 2$ and consider the irreducible Laurent polynomial

$$f^{(d)} = 2d - \sum_{i=1}^{d} (u_i + u_i^{-1}) \in R_d. \tag{2.4}$$
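For finitely supported elements, the identification (2.1), the involution (2.2) and the action (2.3) reduce to elementary bookkeeping on coefficient tables. The following Python sketch is an illustration only and is not taken from the paper: it stores a Laurent polynomial as a dictionary from exponent vectors to coefficients, and the helper names `laplacian`, `multiply`, `involution` and `monomial` are hypothetical.

```python
from collections import defaultdict


def laplacian(d):
    """Coefficients of f^(d) = 2d - sum_i (u_i + u_i^{-1}), cf. (2.4)."""
    f = {(0,) * d: 2 * d}
    for i in range(d):
        for s in (1, -1):
            f[tuple(s if j == i else 0 for j in range(d))] = -1
    return f


def multiply(h, w):
    """Product h . w of finitely supported elements, cf. (2.3):
    (h . w)_n = sum_m h_m * w_{n-m}."""
    out = defaultdict(int)
    for m, hm in h.items():
        for k, wk in w.items():
            out[tuple(a + b for a, b in zip(m, k))] += hm * wk
    # Drop zero coefficients so that equal polynomials compare equal.
    return {n: c for n, c in out.items() if c != 0}


def involution(w):
    """The map w -> w^* with (w^*)_n = w_{-n}, cf. (2.2)."""
    return {tuple(-a for a in n): c for n, c in w.items()}


def monomial(m):
    """The monomial u^m as a one-term polynomial."""
    return {tuple(m): 1}


f2 = laplacian(2)
w = {(0, 0): 5, (1, 2): -3}
# Multiplying by u^m shifts the support: (u^m . w)_n = w_{n-m}.
shifted = multiply(monomial((2, -1)), w)
```

In this representation $f^{(d)}$ is symmetric under the involution, and multiplying by $u^m$ moves the coefficient at $n$ to $n + m$, in agreement with $(u^m \cdot w)_n = w_{n-m}$.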

The equation

$$f^{(d)} \cdot w = 1 \tag{2.5}$$

with $w \in W_d$ admits a multitude of solutions.² However, there is a distinguished (or fundamental) solution $w^{(d)}$ of (2.5) which has a deep probabilistic meaning: it is a certain multiple of the lattice Green's function of the symmetric nearest-neighbour random walk on $\mathbb{Z}^d$ (cf. [6,12,25,27]).

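Definition 2.1. For every $n = (n_1, \dots, n_d) \in \mathbb{Z}^d$ and $t = (t_1, \dots, t_d) \in \mathbb{T}^d$ we set $\langle n, t \rangle = \sum_{j=1}^{d} n_j t_j \in \mathbb{T}$. We denote by

$$F^{(d)}(t) = \sum_{n \in \mathbb{Z}^d} f^{(d)}_n e^{2\pi i \langle n, t \rangle} = 2d - 2 \cdot \sum_{j=1}^{d} \cos(2\pi t_j), \quad t = (t_1, \dots, t_d) \in \mathbb{T}^d, \tag{2.6}$$

the Fourier transform of $f^{(d)}$.

(1) For $d = 2$,

$$w^{(2)}_n := \int_{\mathbb{T}^2} \frac{e^{-2\pi i \langle n, t \rangle} - 1}{F^{(2)}(t)}\, dt \quad \text{for every } n \in \mathbb{Z}^2.$$

(2) For $d \ge 3$,

$$w^{(d)}_n := \int_{\mathbb{T}^d} \frac{e^{-2\pi i \langle n, t \rangle}}{F^{(d)}(t)}\, dt \quad \text{for every } n \in \mathbb{Z}^d.$$

² Under the obvious embedding of $R_d \hookrightarrow \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$, the constant polynomial $1 \in R_d$ corresponds to the element $\delta^{(0)} \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ given by

$$\delta^{(0)}_n = \begin{cases} 1 & \text{if } n = \mathbf{0}, \\ 0 & \text{otherwise.} \end{cases}$$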

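The difference in these definitions for $d = 2$ and $d > 2$ is a consequence of the fact that the simple random walk on $\mathbb{Z}^2$ is recurrent, while on higher dimensional lattices it is transient.

Theorem 2.2. ([6,12,25,27]) We write $\|\cdot\|$ for the Euclidean norm on $\mathbb{Z}^d$.

(i) For every $d \ge 2$, $w^{(d)}$ satisfies (2.5).

(ii) For $d = 2$,

$$w^{(2)}_n = \begin{cases} 0 & \text{if } n = \mathbf{0}, \\ -\dfrac{1}{8\pi} \log \|n\| - \kappa_2 - c_2\, \dfrac{1}{\|n\|^2} \Bigl( \dfrac{n_1^4 + n_2^4}{\|n\|^4} - \dfrac{3}{4} \Bigr) + O(\|n\|^{-4}) & \text{if } n \ne \mathbf{0}, \end{cases} \tag{2.7}$$

where $\kappa_2 > 0$ and $c_2 > 0$. In particular, $w^{(2)}_{\mathbf{0}} = 0$ and $w^{(2)}_n < 0$ for all $n \ne \mathbf{0}$. Moreover,

$$4 \cdot w^{(2)}_n = \sum_{k=1}^{\infty} \bigl( P(X_k = n \mid X_0 = \mathbf{0}) - P(X_k = \mathbf{0} \mid X_0 = \mathbf{0}) \bigr),$$

where $(X_k)$ is the symmetric nearest-neighbour random walk on $\mathbb{Z}^2$.

(iii) For $d \ge 3$,

$$\|n\|^{d-2}\, w^{(d)}_n = \kappa_d + c_d\, \frac{1}{\|n\|^2} \Bigl( \frac{\sum_{i=1}^{d} n_i^4}{\|n\|^4} - \frac{3}{d+2} \Bigr) + O(\|n\|^{-4}) \tag{2.8}$$

as $\|n\| \to \infty$, where $\kappa_d > 0$, $c_d > 0$. Moreover,

$$2d \cdot w^{(d)}_n = \sum_{k=0}^{\infty} P(X_k = n \mid X_0 = \mathbf{0}) > 0 \quad \text{for every } n \in \mathbb{Z}^d,$$

where $(X_k)$ is again the symmetric nearest-neighbour random walk on $\mathbb{Z}^d$.

Definition 2.3. Let $w^{(d)} \in W_d$ be the point appearing in Definition 2.1. We set

$$I_d = \{ g \in R_d : g \cdot w^{(d)} \in \ell^1(\mathbb{Z}^d) \} \supset (f^{(d)}), \tag{2.9}$$

where $(f^{(d)}) = f^{(d)} \cdot R_d$ is the principal ideal generated by $f^{(d)}$. Since $w^{(d)}_n = w^{(d)}_{-n}$ for every $n \in \mathbb{Z}^d$ it is clear that $I_d = I_d^* = \{ g^* : g \in I_d \}$.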
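The fundamental solution of Definition 2.1 can be approximated numerically, which gives a quick sanity check of Theorem 2.2 (i) in dimension two. The Python sketch below is an illustration under stated assumptions and not the authors' method: it replaces the integral over $\mathbb{T}^2$ by a midpoint rule on an $N \times N$ grid (so the zero of $F^{(2)}$ at $t = \mathbf{0}$ is never sampled), keeps only the real part of the integrand (the imaginary part cancels under $t \mapsto -t$), and uses the hypothetical names `green2` and `laplacian_of_green`.

```python
import math


def green2(n, N=200):
    """Midpoint-rule approximation of w_n^(2) from Definition 2.1 (1)."""
    n1, n2 = n
    total = 0.0
    for a in range(N):
        t1 = (a + 0.5) / N
        c1 = math.cos(2 * math.pi * t1)
        for b in range(N):
            t2 = (b + 0.5) / N
            # F^(2)(t) = 4 - 2cos(2 pi t1) - 2cos(2 pi t2) > 0 on this grid.
            F = 4.0 - 2.0 * c1 - 2.0 * math.cos(2 * math.pi * t2)
            total += (math.cos(2 * math.pi * (n1 * t1 + n2 * t2)) - 1.0) / F
    return total / (N * N)


def laplacian_of_green(n, N=200):
    """(f^(2) . w^(2))_n = 4 w_n minus the four neighbouring values."""
    n1, n2 = n
    nbrs = [(n1 + 1, n2), (n1 - 1, n2), (n1, n2 + 1), (n1, n2 - 1)]
    return 4 * green2(n, N) - sum(green2(m, N) for m in nbrs)


if __name__ == "__main__":
    for n in [(0, 0), (1, 0), (1, 1), (2, 0)]:
        print(n, round(green2(n), 4))
    # Expected to be close to 1 at the origin and close to 0 elsewhere.
    print(laplacian_of_green((0, 0)), laplacian_of_green((1, 0)))
```

On this grid the discrete analogue of (2.5) holds up to rounding, because the numerator of the combined integrand is $\cos(2\pi \langle n, t \rangle) \cdot F^{(2)}(t)$. The values at $(1,0)$ and $(1,1)$ can be compared with $-1/4$ and $-1/\pi$, which follow from the classical values of the potential kernel of the planar simple random walk.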

Theorem 2.4. The ideal $I_d$ is of the form

$$I_d = (f^{(d)}) + \mathfrak{I}_d^3, \tag{2.10}$$

where

$$\mathfrak{I}_d = \{ h \in R_d : h(\mathbf{1}) = 0 \} = (1 - u_1) \cdot R_d + \cdots + (1 - u_d) \cdot R_d \tag{2.11}$$

with $\mathbf{1} = (1, \dots, 1)$.

For the proof of Theorem 2.4 we need several lemmas. We set

$$J_d = (f^{(d)}) + \mathfrak{I}_d^3 \subset R_d. \tag{2.12}$$

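Lemma 2.5. Let $g = \sum_{k \in \mathbb{Z}^d} g_k u^k \in R_d$. Then $g \in J_d$ if and only if it satisfies the following conditions (2.13)–(2.16):

$$\sum_{k \in \mathbb{Z}^d} g_k = 0, \tag{2.13}$$

$$\sum_{k = (k_1, \dots, k_d) \in \mathbb{Z}^d} g_k k_i = 0 \quad \text{for } i = 1, \dots, d, \tag{2.14}$$

$$\sum_{k = (k_1, \dots, k_d) \in \mathbb{Z}^d} g_k k_i k_j = 0 \quad \text{for } 1 \le i \ne j \le d, \tag{2.15}$$

$$\sum_{k = (k_1, \dots, k_d) \in \mathbb{Z}^d} g_k (k_i^2 - k_j^2) = 0 \quad \text{for } 1 \le i \ne j \le d. \tag{2.16}$$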
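Conditions (2.13)–(2.16) are finite moment conditions on the coefficients of $g$, so they can be tested mechanically. The following Python sketch is illustrative only: polynomials are dictionaries from exponent vectors to integer coefficients, and the names `multiply`, `add` and `satisfies_moment_conditions` are hypothetical helpers, not notation from the paper.

```python
from collections import defaultdict
from itertools import combinations


def multiply(h, w):
    """Product of two Laurent polynomials given as {exponent: coeff}."""
    out = defaultdict(int)
    for m, hm in h.items():
        for k, wk in w.items():
            out[tuple(a + b for a, b in zip(m, k))] += hm * wk
    return {n: c for n, c in out.items() if c != 0}


def add(h, w):
    """Sum of two Laurent polynomials."""
    out = defaultdict(int)
    for p in (h, w):
        for n, c in p.items():
            out[n] += c
    return {n: c for n, c in out.items() if c != 0}


def satisfies_moment_conditions(g, d):
    """Check (2.13)-(2.16) for g = {k: g_k} with k a d-tuple of integers."""
    if sum(g.values()) != 0:                                   # (2.13)
        return False
    if any(sum(c * k[i] for k, c in g.items()) for i in range(d)):
        return False                                           # (2.14)
    for i, j in combinations(range(d), 2):
        if sum(c * k[i] * k[j] for k, c in g.items()) != 0:    # (2.15)
            return False
    squares = [sum(c * k[i] ** 2 for k, c in g.items()) for i in range(d)]
    return all(s == squares[0] for s in squares)               # (2.16)


# Examples in d = 2.
f2 = {(0, 0): 4, (1, 0): -1, (-1, 0): -1, (0, 1): -1, (0, -1): -1}
one_minus_u1 = {(0, 0): 1, (1, 0): -1}
one_minus_u2 = {(0, 0): 1, (0, 1): -1}
sq1 = multiply(one_minus_u1, one_minus_u1)
sq2 = multiply(one_minus_u2, one_minus_u2)
cube1 = multiply(sq1, one_minus_u1)
```

Here $f^{(2)}$, $(1-u_1)^3$ and $(1-u_1)^2 + (1-u_2)^2$ pass the test, whereas $(1-u_1)^2$ fails (2.16) and $(1-u_1)(1-u_2)$ fails (2.15).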

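Proof. Condition (2.13) is equivalent to saying that $g$ lies in the ideal $\mathfrak{I}_d$ of (2.11). In conjunction with (2.13), (2.14) is equivalent to saying that $g \in \mathfrak{I}_d^2$: indeed, if $g \in \mathfrak{I}_d$, then it is of the form

$$g = \sum_{i=1}^{d} (1 - u_i) \cdot a_i \tag{2.17}$$

with $a_i \in R_d$ for $i = 1, \dots, d$. Then

$$\frac{\partial g}{\partial u_j} = \sum_{k = (k_1, \dots, k_d) \in \mathbb{Z}^d} g_k k_j \cdot u_1^{k_1} \cdots u_j^{k_j - 1} \cdots u_d^{k_d} = -a_j + \sum_{i=1}^{d} (1 - u_i) \cdot \frac{\partial a_i}{\partial u_j},$$

and $\frac{\partial g}{\partial u_j}(\mathbf{1}) = 0$ if and only if $a_j \in \mathfrak{I}_d$.

If $g \in \mathfrak{I}_d$ is of the form (2.17) and satisfies (2.14) we set

$$a_j = \sum_{i=1}^{d} (1 - u_i) \cdot b_{i,j}$$

with $b_{i,j} \in R_d$. Condition (2.15) is satisfied if and only if

$$\frac{\partial^2 g}{\partial u_i \partial u_j}(\mathbf{1}) = -\frac{\partial a_j}{\partial u_i}(\mathbf{1}) - \frac{\partial a_i}{\partial u_j}(\mathbf{1}) = b_{i,j}(\mathbf{1}) + b_{j,i}(\mathbf{1}) = 0 \tag{2.18}$$

for $1 \le i \ne j \le d$.

Finally, if $g$ satisfies (2.13)–(2.14) and is of the form (2.17)–(2.18) with $b_{i,j} \in R_d$ for all $i, j$, then (2.16) is equivalent to the existence of a constant $c \in \mathbb{R}$ with

$$\sum_{k = (k_1, \dots, k_d) \in \mathbb{Z}^d} g_k k_i^2 = -2\, \frac{\partial a_i}{\partial u_i}(\mathbf{1}) = 2 b_{i,i}(\mathbf{1}) = c$$

for $i = 1, \dots, d$. The last equation shows that $b_{i,i} - b_{1,1} \in \mathfrak{I}_d$ for $i = 2, \dots, d$.

By combining all these observations we have proved that $g$ satisfies (2.13)–(2.16) if and only if it is of the form

$$g = h_1 \cdot \sum_{i=1}^{d} (1 - u_i)^2 + h_2 \tag{2.19}$$

with $c \in \mathbb{Z}$, $h_1 \in R_d$ and $h_2 \in \mathfrak{I}_d^3$. The set of all such $g \in R_d$ is an ideal which we denote by $\tilde{J}$. Clearly, $\mathfrak{I}_d^3 \subset \tilde{J}$ and $\sum_{i=1}^{d} (1 - u_i)^2 \in \tilde{J}$. Since $(1 - u_i)^2 \cdot (1 - u_i^{-1}) \in \mathfrak{I}_d^3$ for $i = 1, \dots, d$ as well, and since $(1 - u_i)^2 - (1 - u_i^{-1}) \cdot (1 - u_i)^2 = u_i + u_i^{-1} - 2$, we conclude that

$$f^{(d)} = -\Bigl( \sum_{i=1}^{d} (1 - u_i)^2 - \sum_{i=1}^{d} (1 - u_i^{-1}) \cdot (1 - u_i)^2 \Bigr) \in \tilde{J}. \tag{2.20}$$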

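This shows that $\tilde{J} \subset J_d$, and the reverse inclusion also follows from (2.20) and (2.19).

Lemma 2.6. $I_d \subset J_d$.

Proof. We assume that $g \in I_d$ and set $v = g \cdot w^{(d)}$. In order to verify (2.13) we argue by contradiction and assume that $\sum_k g_k \ne 0$. If $d = 2$ then

$$v_n = -\frac{\sum_k g_k}{2\pi} \log \|n\| + \text{l.o.t.}$$

for large $\|n\|$. If $d \ge 3$, then

$$v_n = \frac{\kappa_d \sum_k g_k}{\|n\|^{d-2}} + \text{l.o.t.}$$

for large $\|n\|$. In both cases it is evident that $v \notin \ell^1(\mathbb{Z}^d)$.

By taking (2.13) into account one gets that, for every $d \ge 2$,

$$v_n = (g \cdot w^{(d)})_n = \sum_k g_k w^{(d)}_{n-k} = \int_{\mathbb{T}^d} e^{-2\pi i \langle n, t \rangle}\, \frac{\sum_k g_k e^{2\pi i \langle k, t \rangle}}{2d - 2 \sum_{j=1}^{d} \cos(2\pi t_j)}\, dt.$$

Hence $v = (v_n)$ is the sequence of Fourier coefficients of the function

$$H(t) = \frac{\sum_k g_k e^{2\pi i \langle k, t \rangle}}{2d - 2 \sum_{j=1}^{d} \cos(2\pi t_j)}.$$

If $v \in \ell^1(\mathbb{Z}^d)$, then $H$ must be a continuous function on $\mathbb{T}^d$. Since $t = \mathbf{0}$ is the only zero of $F^{(d)}$ on $\mathbb{T}^d$ (cf. (2.6)), the numerator $G = \sum_k g_k e^{2\pi i \langle k, \cdot \rangle}$ must compensate for this singularity. Consider the Taylor series expansion of $G$ at $t = \mathbf{0}$:

$$G(t) = \sum_k g_k + 2\pi i \sum_{j=1}^{d} t_j \sum_k g_k k_j - 2\pi^2 \sum_{j=1}^{d} t_j^2 \sum_k g_k k_j^2 - 4\pi^2 \sum_{i \ne j} t_i t_j \sum_k g_k k_i k_j + \text{h.o.t.}$$

The Taylor series expansion of $F^{(d)}$ at $t = \mathbf{0}$ is given by

$$F^{(d)}(t) = 4\pi^2 \sum_{j=1}^{d} t_j^2 + \text{h.o.t.}$$

Suppose that

$$h(t) = \frac{a_0 + \sum_{j=1}^{d} b_j t_j + \sum_{j=1}^{d} c_j t_j^2 + \sum_{i \ne j} d_{i,j} t_i t_j + \text{h.o.t.}}{t_1^2 + \cdots + t_d^2 + \text{h.o.t.}}$$

is continuous at $t = \mathbf{0}$. Then $a_0 = 0$, $b_j = 0$ for all $j$, $d_{i,j} = 0$ for all $i \ne j$, and $c_j = c$ for all $j$ and some constant $c$. If any of these conditions is violated, then one easily produces examples of sequences $t^{(m)} \to \mathbf{0}$ as $m \to \infty$ with distinct limits $\lim_{m \to \infty} h(t^{(m)})$. By applying this to $H$ we obtain (2.13)–(2.16), so that $g \in J_d$ by Lemma 2.5.

To establish the inclusion $J_d \subseteq I_d$, we have to show that for any $g \in J_d$, $g \cdot \omega \in \ell^1(\mathbb{Z}^d)$, where $\omega \in W_d$ is of the form

$$\omega_n = \frac{\sum_{i=1}^{d} n_i^4}{\|n\|^{d+4}}, \quad \text{or} \quad \omega_n = \frac{1}{\|n\|^{\gamma}} \ \text{with } \gamma \ge d - 2.$$

For $d = 2$, we also have to treat the case $\omega_n = \log \|n\|$. These results are obtained in the following three lemmas.

Lemma 2.7. Suppose that $d \ge 2$ and that $\omega \in W_d$ is given by

$$\omega_n = \begin{cases} 0 & \text{if } n = \mathbf{0}, \\ \dfrac{\sum_{i=1}^{d} n_i^4}{\|n\|^{d+4}} & \text{if } n \ne \mathbf{0}. \end{cases}$$

If $g \in R_d$ satisfies (2.13), then $g \cdot \omega \in \ell^1(\mathbb{Z}^d)$.

Proof. Let $M = \max\{ \|k\| : g_k \ne 0 \}$, and suppose that $\|n\| > M$. Then

$$(g \cdot \omega)_n = \sum_k g_k \frac{\sum_{i=1}^{d} (n_i - k_i)^4}{\|n - k\|^{d+4}} = \sum_k g_k \frac{\sum_{i=1}^{d} n_i^4 + O(\|n\|^3)}{\|n\|^{d+4} (1 + O(\|n\|^{-1}))} = \Bigl( \sum_k g_k \Bigr) \frac{\sum_{i=1}^{d} n_i^4}{\|n\|^{d+4}} + O\Bigl( \frac{1}{\|n\|^{d+1}} \Bigr) = O\Bigl( \frac{1}{\|n\|^{d+1}} \Bigr).$$

Therefore, $\sum_n |(g \cdot \omega)_n| < \infty$.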


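For the reverse inclusion $J_d \subset I_d$ we need different arguments for $d = 2$ and for $d \ge 3$. We start with the case $d = 2$.

Lemma 2.8. Suppose that $g = \sum_{k \in \mathbb{Z}^2} g_k u^k \in R_2$ satisfies (2.13). We set $S_+ = \{ k : g_k > 0 \}$ and $S_- = \{ k : g_k < 0 \}$. Put

$$M_g = 2 \sum_{k \in S_+} g_k = 2 \sum_{k \in S_-} |g_k|$$

and define two polynomials in the variables $(n_1, n_2)$:

$$P_+(n_1, n_2) = \prod_{k \in S_+} \bigl( (n_1 - k_1)^2 + (n_2 - k_2)^2 \bigr)^{g_k} = \prod_{k \in S_+} \|n - k\|^{2 g_k},$$

$$P_-(n_1, n_2) = \prod_{k \in S_-} \bigl( (n_1 - k_1)^2 + (n_2 - k_2)^2 \bigr)^{|g_k|} = \prod_{k \in S_-} \|n - k\|^{2 |g_k|}. \tag{2.21}$$

Let $m_g$ be the degree of $P = P_+ - P_-$. If $M_g - m_g \ge 3$, then $g \cdot \omega \in \ell^1(\mathbb{Z}^2)$, where

$$\omega_n = \begin{cases} 0 & \text{if } n = (0, 0), \\ \log \|n\| & \text{if } n \ne (0, 0). \end{cases} \tag{2.22}$$

Proof. Since $\sum_{k \in \mathbb{Z}^2} g_k = 0$ by (2.13), $M_g = \deg P_+ = \deg P_-$ and

$$m_g = \deg P < \max(\deg P_+, \deg P_-) = M_g.$$

Let $v = g \cdot \omega$. Hence, for all $n$ with $\|n\| > \max\{ \|k\| : k \in S_+ \cup S_- \}$, one has

$$|(g \cdot \omega)_n| = \frac{1}{2} \Bigl| \log \frac{P_+(n_1, n_2)}{P_-(n_1, n_2)} \Bigr| = \frac{1}{2} \Bigl| \log \Bigl( 1 + \frac{P_+(n_1, n_2) - P_-(n_1, n_2)}{P_-(n_1, n_2)} \Bigr) \Bigr|.$$

There exist constants $C, N$ such that

$$\Bigl| \frac{P_+(n_1, n_2) - P_-(n_1, n_2)}{P_-(n_1, n_2)} \Bigr| \le C\, \frac{\|n\|^{m_g}}{\|n\|^{M_g}} = \frac{C}{\|n\|^{M_g - m_g}} < \frac{1}{2}$$

for $\|n\| \ge N$. Hence we can find another constant $\tilde{C}$ such that

$$|(g \cdot \omega)_n| \le \frac{\tilde{C}}{\|n\|^{M_g - m_g}}$$

for all sufficiently large $\|n\|$. Since $M_g - m_g \ge 3$, we finally conclude that $g \cdot \omega \in \ell^1(\mathbb{Z}^2)$.

Lemma 2.9. Suppose that $g \in J_d$ (cf. (2.13)–(2.16)), and that $\omega \in W_d$ is given by

$$\omega_n = \begin{cases} 0 & \text{if } n = \mathbf{0}, \\ \dfrac{1}{\|n\|^{\gamma}} & \text{if } n \ne \mathbf{0}, \end{cases}$$

for some integer $\gamma \ge d - 2$. Then $g \cdot \omega \in \ell^1(\mathbb{Z}^d)$.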

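Proof. Let $S_g = \{ k \in \mathbb{Z}^d : g_k \ne 0 \}$, $M = \max\{ \|k\| : k \in S_g \}$, and note that

$$S_g \subset B_d = \{ y \in \mathbb{R}^d : \|y\| \le M \}, \tag{2.23}$$

where $\|\cdot\|$ is the Euclidean norm on $\mathbb{Z}^d \subset \mathbb{R}^d$. We fix $n \in \mathbb{Z}^d$ with $\|n\| > M$ and set

$$h^{(n)}(k) = \|n - k\|^{-\gamma} = \Bigl( \sum_{i=1}^{d} (n_i - k_i)^2 \Bigr)^{-\gamma/2}. \tag{2.24}$$

In calculating the Taylor expansion of $h^{(n)}$ as a function of the variables $k_1, \dots, k_d$ we use the notation $I! = i_1! \cdots i_d!$, $|I| = i_1 + \cdots + i_d$ and

$$\frac{\partial^{|I|} h^{(n)}}{\partial k^I} = \frac{\partial^{i_1 + \cdots + i_d} h^{(n)}}{\partial k_1^{i_1} \cdots \partial k_d^{i_d}}, \tag{2.25}$$

for $I = (i_1, \dots, i_d) \in \mathbb{Z}_+^d$, $k = (k_1, \dots, k_d) \in \mathbb{Z}^d$, where $\mathbb{Z}_+ = \{ n \in \mathbb{Z} : n \ge 0 \}$. Then the Taylor expansion of $h^{(n)}$ for $\|k\| \le M$ is given by

$$h^{(n)}(k) = \sum_{|I| \le 2} \frac{1}{I!} \frac{\partial^{|I|} h^{(n)}}{\partial k^I}(\mathbf{0})\, k^I + \sum_{|I| = 3} R_I^{(n)} k^I,$$

where

$$|R_I^{(n)}| \le \sup_{y \in B_d} \Bigl| \frac{1}{I!} \frac{\partial^{|I|} h^{(n)}}{\partial k^I}(y) \Bigr|$$

(cf. (2.23)). The first and second order derivatives of $h^{(n)}$ have the following form:

$$\frac{\partial h^{(n)}}{\partial k_i}(k) = \gamma \cdot (n_i - k_i) \cdot \|n - k\|^{-\gamma - 2} \quad \text{for } i = 1, \dots, d,$$

$$\frac{\partial^2 h^{(n)}}{\partial k_i \partial k_j}(k) = \gamma \cdot (\gamma + 2) \cdot (n_i - k_i) \cdot (n_j - k_j) \cdot \|n - k\|^{-\gamma - 4} \quad \text{for } i, j = 1, \dots, d,\ i \ne j,$$

$$\frac{\partial^2 h^{(n)}}{\partial k_i^2}(k) = \gamma \cdot (\gamma + 2) \cdot (n_i - k_i)^2 \cdot \|n - k\|^{-\gamma - 4} - \gamma \cdot \|n - k\|^{-\gamma - 2} \quad \text{for } i = 1, \dots, d.$$

It follows that

$$h^{(n)}(\mathbf{0}) = \|n\|^{-\gamma}, \qquad \frac{\partial h^{(n)}}{\partial k_i}(\mathbf{0}) = \gamma \cdot n_i \cdot \|n\|^{-\gamma - 2},$$

$$\frac{\partial^2 h^{(n)}}{\partial k_i \partial k_j}(\mathbf{0}) = \gamma \cdot (\gamma + 2) \cdot n_i \cdot n_j \cdot \|n\|^{-\gamma - 4}, \quad i \ne j,$$

$$\frac{\partial^2 h^{(n)}}{\partial k_i^2}(\mathbf{0}) = \gamma \cdot (\gamma + 2) \cdot n_i^2 \cdot \|n\|^{-\gamma - 4} - \gamma \cdot \|n\|^{-\gamma - 2}.$$

For $I = (i_1, \dots, i_d) \in \mathbb{Z}_+^d$ and $y \in \mathbb{R}^d$,

$$\frac{\partial^{|I|} h^{(n)}}{\partial k^I}(y) = P_I(n_1, \dots, n_d) \cdot \|n - y\|^{-\gamma - 2|I|},$$

where $P_I$ is a polynomial of degree at most $|I|$ in the variables $n_1, \dots, n_d$. Therefore, for every $I \in \mathbb{Z}_+^d$ with $|I| = 3$,

$$|R_I^{(n)}| \le O(\|n\|^{-\gamma - 3}). \tag{2.26}$$

By using the Taylor series expansion of $h^{(n)}$ above we obtain that, for all $n$ with sufficiently large norm,

$$\begin{aligned} |(g \cdot \omega)_n| = \Bigl| \sum_{k \in S_g} g_k h^{(n)}(k) \Bigr| \le{} & \Bigl| h^{(n)}(\mathbf{0}) \Bigl( \sum_{k \in S_g} g_k \Bigr) \Bigr| + \Bigl| \sum_{i=1}^{d} \frac{\partial h^{(n)}(\mathbf{0})}{\partial k_i} \Bigl( \sum_{k \in S_g} g_k k_i \Bigr) \Bigr| \\ & + \Bigl| \sum_{i \ne j} \frac{\partial^2 h^{(n)}(\mathbf{0})}{\partial k_i \partial k_j} \Bigl( \sum_{k \in S_g} g_k k_i k_j \Bigr) \Bigr| \\ & + \Bigl| \frac{1}{2} \sum_{i=1}^{d} \frac{\partial^2 h^{(n)}(\mathbf{0})}{\partial k_i^2} \Bigl( \sum_{k \in S_g} g_k k_i^2 \Bigr) \Bigr| + O(\|n\|^{-(\gamma + 3)}). \end{aligned} \tag{2.27}$$

The first three terms on the right-hand side of the above inequality vanish because of (2.13), (2.14), and (2.15). The fourth term is estimated as follows: (2.16) implies that

$$\sum_{k \in S_g} g_k k_i^2 = \text{const} \quad \text{for all } i = 1, \dots, d,$$

and we denote by $C$ this common value. Then

$$\sum_{i=1}^{d} \frac{\partial^2 h^{(n)}(\mathbf{0})}{\partial k_i^2} \Bigl( \sum_{k \in S_g} g_k k_i^2 \Bigr) = \sum_{i=1}^{d} \bigl( \gamma (\gamma + 2) \cdot n_i^2 \cdot \|n\|^{-\gamma - 4} - \gamma \cdot \|n\|^{-\gamma - 2} \bigr) C = \bigl( \gamma (\gamma + 2) - \gamma d \bigr)\, C\, \|n\|^{-\gamma - 2}.$$

Therefore, if $\gamma = d - 2$, then the fourth term vanishes. If $\gamma > d - 2$, i.e., if $\gamma \ge d - 1$, then the fourth term is of the order $O(\|n\|^{-(d+1)})$, and is thus summable over $\mathbb{Z}^d$. The remainder term in (2.27) is always summable since $\gamma + 3 \ge d + 1$.

Proof of Theorem 2.4. We start with the case $d \ge 3$. Recall that for $n \ne \mathbf{0}$,

$$w^{(d)}_n = \frac{\kappa_d}{\|n\|^{d-2}} + c_d \frac{\sum_{i=1}^{d} n_i^4}{\|n\|^{d+4}} - \frac{3 c_d}{d + 2} \frac{1}{\|n\|^{d}} + O(\|n\|^{-(d+2)}) =: \omega^{(1)}_n + \omega^{(2)}_n + \omega^{(3)}_n + r_n.$$

Applying $g \in J_d$, we conclude that $g \cdot w^{(d)} \in \ell^1(\mathbb{Z}^d)$, because $g \cdot \omega^{(1)}, g \cdot \omega^{(3)} \in \ell^1(\mathbb{Z}^d)$ by Lemma 2.9 for $\gamma = d - 2$ and $\gamma = d$, respectively; $g \cdot \omega^{(2)} \in \ell^1(\mathbb{Z}^d)$ by Lemma 2.7; $(g \cdot r)_n = O(\|n\|^{-(d+2)})$, and hence $g \cdot r \in \ell^1(\mathbb{Z}^d)$ as well.

Now consider the case $d = 2$. Then

$$w^{(2)}_n = -\frac{1}{8\pi} \log \|n\| - \kappa_2 - c_2 \Bigl( \frac{n_1^4 + n_2^4}{\|n\|^{4+2}} - \frac{3}{4} \frac{1}{\|n\|^2} \Bigr) + O(\|n\|^{-4}) = \omega^{(1)}_n + \omega^{(2)}_n + \omega^{(3)}_n + r_n.$$

For any $g \in J_2$,

$$g \cdot \omega^{(2)},\ g \cdot \omega^{(3)},\ g \cdot r \in \ell^1(\mathbb{Z}^2) \tag{2.28}$$

by the results of Lemmas 2.7 and 2.9. The remaining term $g \cdot \omega^{(1)}$ has to be treated slightly differently. First of all, note that since

$$J_2 = (f^{(2)}) + (u_1 - 1)^3 \cdot R_2 + (u_1 - 1)^2 (u_2 - 1) \cdot R_2 + (u_1 - 1)(u_2 - 1)^2 \cdot R_2 + (u_2 - 1)^3 \cdot R_2,$$

it is sufficient to check that $g \cdot \omega^{(1)} \in \ell^1(\mathbb{Z}^2)$ only for the set of generators, i.e., for

$$g = f^{(2)},\ (u_1 - 1)^3,\ (u_1 - 1)^2 (u_2 - 1),\ (u_1 - 1)(u_2 - 1)^2,\ (u_2 - 1)^3.$$

For $g = f^{(2)}$, $f^{(2)} \cdot w^{(2)} = \delta^{(0)} \in \ell^1(\mathbb{Z}^2)$ (cf. (2.5) and Footnote 2), and hence, given (2.28), $f^{(2)} \cdot \omega^{(1)} \in \ell^1(\mathbb{Z}^2)$ as well.

For $g = (u_1 - 1)^3 \in R_2$ we apply Lemma 2.8. Note that $S_+ = \{ (1, 0), (3, 0) \}$, $S_- = \{ (0, 0), (2, 0) \}$,

$$P_+ = \bigl( (n_1 - 3)^2 + n_2^2 \bigr) \bigl( (n_1 - 1)^2 + n_2^2 \bigr)^3, \qquad P_- = \bigl( (n_1 - 2)^2 + n_2^2 \bigr)^3 (n_1^2 + n_2^2)$$

and

$$P_+ - P_- = 9 - 60 n_1 + 108 n_1^2 - 84 n_1^3 + 30 n_1^4 - 4 n_1^5 - 36 n_2^2 + 60 n_1 n_2^2 - 36 n_1^2 n_2^2 + 8 n_1^3 n_2^2 - 18 n_2^4 + 12 n_1 n_2^4.$$

Hence $M_g = \deg P_+ = \deg P_- = 8$, $m_g = \deg P = 5$, $M_g - m_g = 3$. Therefore, by Lemma 2.8, $|(g \cdot \omega^{(1)})_n| = O(\|n\|^{-3})$, and hence $g \cdot \omega^{(1)} \in \ell^1(\mathbb{Z}^2)$, which is equivalent to $g \in I_2$. The same calculation shows that $(u_2 - 1)^3 \in I_2$. Furthermore, since $f^{(2)} \in I_2$ and, by (2.4),

$$u_2^{-1} (u_1 - 1)(u_2 - 1)^2 = -u_1^{-1} (u_1 - 1)^3 - f^{(2)} \cdot (u_1 - 1),$$

we obtain that $(u_1 - 1)(u_2 - 1)^2 \in I_2$ and, by symmetry, that $(u_1 - 1)^2 (u_2 - 1) \in I_2$. This proves that $J_2 \subset I_2$, and Lemma 2.6 yields that $J_2 = I_2$.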
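The polynomial identity used above for $g = (u_1 - 1)^3$ can be checked by direct expansion. The Python sketch below is an illustration with hypothetical helper names (`pmul`, `ppow`, `sq_dist`, `lemma_2_8_data`): it represents a polynomial in $(n_1, n_2)$ as a dictionary from exponent pairs to integer coefficients, builds $P_+$ and $P_-$ as in (2.21), and returns $P_+ - P_-$ together with $M_g$ and $m_g$.

```python
from collections import defaultdict


def pmul(p, q):
    """Product of two polynomials in (n1, n2), stored as {(a, b): coeff}."""
    out = defaultdict(int)
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def ppow(p, e):
    """p raised to the non-negative integer power e."""
    out = {(0, 0): 1}
    for _ in range(e):
        out = pmul(out, p)
    return out


def sq_dist(k):
    """The polynomial (n1 - k1)^2 + (n2 - k2)^2 = ||n - k||^2."""
    k1, k2 = k
    p = {(2, 0): 1, (1, 0): -2 * k1, (0, 2): 1, (0, 1): -2 * k2,
         (0, 0): k1 * k1 + k2 * k2}
    return {e: c for e, c in p.items() if c != 0}


def lemma_2_8_data(g):
    """Return (P_+ - P_-, M_g, m_g) for g = {k: g_k} with sum of g_k = 0."""
    p_plus, p_minus = {(0, 0): 1}, {(0, 0): 1}
    for k, c in g.items():
        if c > 0:
            p_plus = pmul(p_plus, ppow(sq_dist(k), c))
        elif c < 0:
            p_minus = pmul(p_minus, ppow(sq_dist(k), -c))
    diff = defaultdict(int)
    for e, c in p_plus.items():
        diff[e] += c
    for e, c in p_minus.items():
        diff[e] -= c
    diff = {e: c for e, c in diff.items() if c != 0}
    M_g = 2 * sum(c for c in g.values() if c > 0)
    m_g = max(a + b for a, b in diff)
    return diff, M_g, m_g


# (u_1 - 1)^3 = u_1^3 - 3 u_1^2 + 3 u_1 - 1
g = {(3, 0): 1, (2, 0): -3, (1, 0): 3, (0, 0): -1}
diff, M_g, m_g = lemma_2_8_data(g)
```

For this $g$ the sketch reproduces the twelve coefficients of $P_+ - P_-$ displayed in the proof, with $M_g = 8$ and $m_g = 5$.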


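3. The Harmonic Model

Let $d > 1$. We define the shift-action $\alpha$ of $\mathbb{Z}^d$ on $\mathbb{T}^{\mathbb{Z}^d}$ by

$$(\alpha^m x)_n = x_{m+n} \tag{3.1}$$

for every $m, n \in \mathbb{Z}^d$ and $x = (x_n) \in \mathbb{T}^{\mathbb{Z}^d}$ and consider, for every $h \in R_d$, the group homomorphism

$$h(\alpha) = \sum_{m \in \mathbb{Z}^d} h_m \alpha^m : \mathbb{T}^{\mathbb{Z}^d} \to \mathbb{T}^{\mathbb{Z}^d}. \tag{3.2}$$

Since $R_d$ is an integral domain, Pontryagin duality implies that $h(\alpha)$ is surjective for every nonzero $h \in R_d$ (it is dual to the injective homomorphism from $R_d \cong \widehat{\mathbb{T}^{\mathbb{Z}^d}}$ to itself consisting of multiplication by $h$).

Let $f^{(d)} \in R_d$ be given by (2.4) and let $X_{f^{(d)}} \subset \mathbb{T}^{\mathbb{Z}^d}$ be the closed, connected, shift-invariant subgroup

$$X_{f^{(d)}} = \ker f^{(d)}(\alpha) = \Bigl\{ x = (x_n) \in \mathbb{T}^{\mathbb{Z}^d} : 2d\, x_n - \sum_{j=1}^{d} (x_{n + e^{(j)}} + x_{n - e^{(j)}}) = 0 \ \text{for every } n \in \mathbb{Z}^d \Bigr\}. \tag{3.3}$$

We denote by $\alpha_{f^{(d)}}$ the restriction of $\alpha$ to $X_{f^{(d)}}$. Since every $\alpha^m_{f^{(d)}}$, $m \in \mathbb{Z}^d$, is a continuous automorphism of $X_{f^{(d)}}$, the $\mathbb{Z}^d$-action $\alpha_{f^{(d)}}$ preserves the normalized Haar measure $\lambda_{X_{f^{(d)}}}$ of $X_{f^{(d)}}$.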

The Laurent polynomial $f^{(d)}$ can be viewed as a Laplacian on $\mathbb{Z}^d$ and every $x = (x_n) \in X_{f^{(d)}}$ is harmonic (mod 1) in the sense that, for every $n \in \mathbb{Z}^d$, $2d \cdot x_n$ is the sum of its $2d$ neighbouring coordinates (mod 1). This is the reason for calling $(X_{f^{(d)}}, \alpha_{f^{(d)}})$ the $d$-dimensional harmonic model.

According to [21, Theorem 18.1] and [21, Theorem 19.5], the metric entropy of $\alpha_{f^{(d)}}$ with respect to $\lambda_{X_{f^{(d)}}}$ coincides with the topological entropy of $\alpha_{f^{(d)}}$ and is given by

$$h_{\lambda_{X_{f^{(d)}}}}(\alpha_{f^{(d)}}) = h_{\mathrm{top}}(\alpha_{f^{(d)}}) = \int_0^1 \cdots \int_0^1 \log f^{(d)}(e^{2\pi i t_1}, \dots, e^{2\pi i t_d})\, dt_1 \cdots dt_d < \infty. \tag{3.4}$$
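Since $f^{(d)}(e^{2\pi i t_1}, \dots, e^{2\pi i t_d}) = 2d - 2\sum_{j} \cos(2\pi t_j)$ by (2.6), the integral (3.4) coincides with (1.1) and is easy to approximate numerically. The Python sketch below is an illustration only: the function name `entropy` and the grid sizes are arbitrary choices, and the midpoint rule is used so that the logarithmic singularity at $t = \mathbf{0}$ is never sampled. The result should only be expected to agree with the values $h_2 = 1.166$ and $h_3 = 1.673$ quoted in the Introduction to about two or three decimal places.

```python
import math
from itertools import product


def entropy(d, N):
    """Midpoint-rule approximation of the integral (3.4) on an N^d grid."""
    cosines = [math.cos(2 * math.pi * (a + 0.5) / N) for a in range(N)]
    total = 0.0
    for cs in product(cosines, repeat=d):
        # 2d - 2 * sum(cos) is strictly positive at every midpoint.
        total += math.log(2 * d - 2 * sum(cs))
    return total / N ** d


if __name__ == "__main__":
    print(entropy(2, 300))
    print(entropy(3, 40))
```

Finer grids improve the accuracy only slowly near the singularity, so a quadrature adapted to the logarithm would be preferable for high-precision values.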

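Furthermore, $\alpha_{f^{(d)}}$ is Bernoulli with respect to $\lambda_{X_{f^{(d)}}}$ (cf. [21]).

Since every constant element of $\mathbb{T}^{\mathbb{Z}^d}$ lies in $X_{f^{(d)}}$, $\alpha_{f^{(d)}}$ has uncountably many fixed points and is therefore nonexpansive: for every $\varepsilon > 0$ there exists a nonzero point $x = (x_n)$ in $X_{f^{(d)}}$ with $|\!|\!| x_n |\!|\!| < \varepsilon$ for every $n \in \mathbb{Z}^d$, where

$$|\!|\!|\, t \ (\mathrm{mod}\ 1)\, |\!|\!| = \min \{ |t - n| : n \in \mathbb{Z} \}, \quad t \in \mathbb{R}. \tag{3.5}$$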



3.1. Linearization. Consider the surjective map $\rho : W_d = \mathbb{R}^{\mathbb{Z}^d} \to \mathbb{T}^{\mathbb{Z}^d}$ given by

$$\rho(w)_n = w_n \ (\mathrm{mod}\ 1) \tag{3.6}$$

for every $n \in \mathbb{Z}^d$ and $w = (w_n) \in W_d$. We write $\sigma$ for the shift action

$$(\sigma^m w)_n = (u^{-m} \cdot w)_n = w_{m+n} \tag{3.7}$$

of $\mathbb{Z}^d$ on $W_d$ (cf. (2.3)). As in (3.2) we set, for every $g = \sum_{n \in \mathbb{Z}^d} g_n u^n \in R_d$, $h = \sum_{n \in \mathbb{Z}^d} h_n u^n \in \ell^1(\mathbb{Z}^d)$,

$$h(\sigma) = \sum_{n \in \mathbb{Z}^d} h_n \sigma^n : W_d \to W_d. \tag{3.8}$$

Then

$$h(\sigma)(w) = h^* \cdot w, \qquad g(\alpha)(\rho(w)) = \rho(g^* \cdot w) \tag{3.9}$$

for every $w \in W_d$ (cf. (2.2) and (2.3)).

We set $W_d(\mathbb{Z}) = \mathbb{Z}^{\mathbb{Z}^d} \subset W_d$. According to (3.3),

$$W_{f^{(d)}} := \rho^{-1}(X_{f^{(d)}}) = \{ w \in W_d : \rho(w) \in X_{f^{(d)}} \} = f^{(d)}(\sigma)^{-1}(W_d(\mathbb{Z})) = \{ w \in W_d : f^{(d)} \cdot w \in W_d(\mathbb{Z}) \}. \tag{3.10}$$

For later use we denote by

$$\tilde{\mathbb{R}} \subset W_d, \qquad \tilde{\mathbb{Z}} \subset W_d(\mathbb{Z}), \qquad \tilde{\mathbb{T}} \subset \mathbb{T}^{\mathbb{Z}^d} \tag{3.11}$$

the sets of constant elements. If $c$ is an element of $\mathbb{R}$, $\mathbb{Z}$ or $\mathbb{T}$ we denote by $\tilde{c}$ the corresponding constant element of $\tilde{\mathbb{R}}$, $\tilde{\mathbb{Z}}$ or $\tilde{\mathbb{T}}$.

Equation (3.10) allows us to view $W_{f^{(d)}}$ as the linearization of $X_{f^{(d)}}$.

3.2. Homoclinic points. Let $\beta$ be an algebraic $\mathbb{Z}^d$-action on a compact abelian group $Y$, i.e., a $\mathbb{Z}^d$-action by continuous group automorphisms of $Y$. An element $y \in Y$ is homoclinic for $\beta$ (or $\beta$-homoclinic to 0) if $\lim_{n \to \infty} \beta^n y = 0$. The set of all homoclinic points of $\beta$ is a subgroup of $Y$, denoted by $\Delta_\beta(Y)$.

If $\beta$ is an expansive algebraic $\mathbb{Z}^d$-action on a compact abelian group $Y$ then $\Delta_\beta(Y)$ is countable, and $\Delta_\beta(Y) \ne \{0\}$ if and only if $\beta$ has positive entropy with respect to the Haar measure $\lambda_Y$ (or, equivalently, positive topological entropy). Furthermore, $\Delta_\beta(Y)$ is dense in $Y$ if and only if $\beta$ has completely positive entropy w.r.t. $\lambda_Y$. Finally, if $\beta$ is expansive, then $\beta^n x \to 0$ exponentially fast (in an appropriate metric) as $n \to \infty$. All these results can be found in [14].

If $\beta$ is nonexpansive on $Y$, then there is no guarantee that $\Delta_\beta(Y) \ne \{0\}$ even if $\beta$ has completely positive entropy. Furthermore, $\beta$-homoclinic points $y$ may have the property that $\beta^n y \to 0$ very slowly as $n \to \infty$.

The $\mathbb{Z}^d$-action $\alpha_{f^{(d)}}$ on $X_{f^{(d)}}$ is nonexpansive and the investigation of its homoclinic points therefore requires a little more care. In particular we shall have to restrict our attention to $\alpha_{f^{(d)}}$-homoclinic points $x$ for which $\alpha^n_{f^{(d)}} x \to 0$ sufficiently fast as $n \to \infty$. For this reason we set


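$$\Delta^{(1)}_\alpha(X_{f^{(d)}}) = \Bigl\{ x \in \Delta_\alpha(X_{f^{(d)}}) : \sum_{n \in \mathbb{Z}^d} |\!|\!| x_n |\!|\!| < \infty \Bigr\}, \tag{3.12}$$

where $|\!|\!| \cdot |\!|\!|$ is defined in (3.5). In order to describe the homoclinic groups $\Delta_\alpha(X_{f^{(d)}})$ and $\Delta^{(1)}_\alpha(X_{f^{(d)}})$ we set

$$x^\Delta = \rho(w^{(d)}) \in X_{f^{(d)}}. \tag{3.13}$$

The fact that $x^\Delta \in X_{f^{(d)}}$ is a consequence of Theorem 2.2 (i) and (3.10).

Proposition 3.1. Let $\alpha_{f^{(d)}}$ be the algebraic $\mathbb{Z}^d$-action on the compact abelian group $X_{f^{(d)}}$ defined in (3.3). Then every homoclinic point $z \in \Delta_\alpha(X_{f^{(d)}})$ is of the form $z = \rho(h \cdot w^{(d)})$ for some $h \in R_d$. Furthermore,

$$\Delta^{(1)}_\alpha(X_{f^{(d)}}) = \rho\bigl( \{ h \cdot w^{(d)} : h \in I_d \} \bigr) \tag{3.14}$$

(cf. Theorem 2.2, (2.9) and (3.12)).

Proof. If $z \in \Delta_\alpha(X_{f^{(d)}})$, then we choose $w \in \ell^\infty(\mathbb{Z}^d)$ with $\lim_{n \to \infty} w_n = 0$ and $\rho(w) = z$. From (3.10) we know that $f^{(d)} \cdot w \in W_d(\mathbb{Z})$, and the smallness of (most of) the coordinates of $w$ guarantees that $h = f^{(d)} \cdot w \in R_d = \ell^1(\mathbb{Z}^d) \cap \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$, where $\ell^\infty(\mathbb{Z}^d, \mathbb{Z}) = \{ w = (w_n) \in \ell^\infty(\mathbb{Z}^d) : w_n \in \mathbb{Z} \text{ for every } n \in \mathbb{Z}^d \}$. If we multiply the last identity by $w^{(d)}$ we get that $w^{(d)} \cdot f^{(d)} \cdot w = w = w^{(d)} \cdot h = h \cdot w^{(d)}$ for some $h \in R_d$.

If $z \in \Delta^{(1)}_\alpha(X_{f^{(d)}})$ then $w \in \ell^1(\mathbb{Z}^d)$ and hence, by definition, $h \in I_d$. Conversely, if $h \in I_d$, then $z = \rho(h \cdot w^{(d)}) \in \Delta^{(1)}_\alpha(X_{f^{(d)}})$.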

Remark 3.2. A homoclinic point $z$ of an algebraic $\mathbb{Z}^d$-action $\beta$ on a compact abelian group $Y$ is fundamental if its homoclinic group $\Delta_\beta(Y)$ is the countable group generated by the orbit $\{ \beta^n z : n \in \mathbb{Z}^d \}$ (cf. [14]). Proposition 3.1 shows that $x^\Delta = \rho(w^{(d)})$ also has the property that its orbit under $\alpha_{f^{(d)}}$ generates the homoclinic groups $\Delta_\alpha(X_{f^{(d)}})$ and $\Delta^{(1)}_\alpha(X_{f^{(d)}})$, although $x^\Delta$ itself may not be homoclinic (e.g., when $d = 2$).

3.3. Symbolic covers of the harmonic model. We construct, for every homoclinic point $z \in \Delta^{(1)}_\alpha(X_{f^{(d)}})$, a shift-equivariant group homomorphism from $\ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ to $X_{f^{(d)}}$ which we subsequently use to find symbolic covers of $\alpha_{f^{(d)}}$.

According to Proposition 3.1, every homoclinic point $z \in \Delta^{(1)}_\alpha(X_{f^{(d)}})$ is of the form $z = g(\alpha)(x^\Delta) = \rho(g^* \cdot w^{(d)})$ for some $g \in I_d$. We define group homomorphisms $\bar{\xi}_g : \ell^\infty(\mathbb{Z}^d) \to \ell^\infty(\mathbb{Z}^d)$ and $\xi_g : \ell^\infty(\mathbb{Z}^d) \to \mathbb{T}^{\mathbb{Z}^d}$ by

$$\bar{\xi}_g(w) = (g \cdot w^{(d)})(\sigma)(w) = (g^* \cdot w^{(d)}) \cdot w \quad \text{and} \quad \xi_g(w) = (\rho \circ \bar{\xi}_g)(w). \tag{3.15}$$

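These maps are well-defined, since

$$\bar{\xi}_g(w)_n = \sum_{k \in \mathbb{Z}^d} w_{n-k} \cdot (g^* \cdot w^{(d)})_k$$

converges for every $n$, and equivariant in the sense that

$$\bar{\xi}_g \circ \sigma^n = \sigma^n \circ \bar{\xi}_g, \quad \xi_g \circ \sigma^n = \alpha^n \circ \xi_g, \quad \bar{\xi}_g \circ h(\sigma) = h(\sigma) \circ \bar{\xi}_g, \quad \xi_g \circ h(\sigma) = h(\alpha) \circ \xi_g, \tag{3.16}$$

for every $n \in \mathbb{Z}^d$, $g \in I_d$ and $h \in R_d$. We also note that

$$\xi_g(v) = \sum_{n \in \mathbb{Z}^d} v_n\, \alpha^{-n} g(\alpha)(x^\Delta)$$

for every $v = (v_n) \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$.

Proposition 3.3. For every $g \in I_d$,

$$\xi_g(\ell^\infty(\mathbb{Z}^d, \mathbb{Z})) = \begin{cases} \{0\} & \text{if } g \in (f^{(d)}), \\ X_{f^{(d)}} & \text{if } g \in \tilde{I}_d := I_d \smallsetminus (f^{(d)}), \end{cases} \tag{3.17}$$

(cf. (2.9) and (3.15)–(3.16)).

We begin the proof of Proposition 3.3 with two lemmas.

Lemma 3.4. For every $w \in \ell^\infty(\mathbb{Z}^d)$ and $g \in I_d$,

$$(f^{(d)}(\sigma) \circ \bar{\xi}_g)(w) = f^{(d)} \cdot (g^* \cdot w^{(d)}) \cdot w = g^* \cdot (f^{(d)} \cdot w^{(d)}) \cdot w = g^* \cdot w = g(\sigma)(w). \tag{3.18}$$

Furthermore, $\xi_g(\ell^\infty(\mathbb{Z}^d, \mathbb{Z})) \subset X_{f^{(d)}}$.

Proof. For every $h, v \in R_d$, Theorem 2.2 (i) implies that

$$f^{(d)} \cdot h^* \cdot w^{(d)} \cdot v = h^* \cdot f^{(d)} \cdot w^{(d)} \cdot v = h^* \cdot v. \tag{3.19}$$

Fix $g \in I_d$ and let $K \ge 1$ and $V_K = \{ -K + 1, \dots, K - 1 \}^{\mathbb{Z}^d} \subset \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$. Then $V_K$ is shift-invariant and compact in the topology of pointwise convergence, and the set $V_K' \subset V_K$ of points with only finitely many nonzero coordinates is dense in $V_K$. For $v \in V_K' \subset R_d$,

$$\bar{\xi}_g(v) = (g^* \cdot w^{(d)}) \cdot v \tag{3.20}$$

and

$$(f^{(d)}(\sigma) \circ \bar{\xi}_g)(v) = f^{(d)} \cdot g^* \cdot w^{(d)} \cdot v = g^* \cdot f^{(d)} \cdot w^{(d)} \cdot v = g^* \cdot v \tag{3.21}$$

by (3.15) and (3.19). Since both $\bar{\xi}_g$ and multiplication by $g^*$ are continuous on $V_K$, (3.21) holds for every $v \in V_K$. By letting $K \to \infty$ we obtain (3.21) for every $v \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$, hence for every $v \in \frac{1}{M} \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ with $M \ge 1$, and finally, again by coordinatewise convergence, for every $w \in \ell^\infty(\mathbb{Z}^d)$, as claimed in (3.18).

For the last assertion of the lemma we note that

$$\xi_g(v) = \rho((g^* \cdot w^{(d)}) \cdot v) = (g \cdot v^*)(\alpha)(x^\Delta) \in X_{f^{(d)}} \tag{3.22}$$

for every $v \in V_K'$ (cf. (3.13)). The continuity argument above yields that $\xi_g(v) \in X_{f^{(d)}}$ for every $v \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$.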

737

Lemma 3.5. If g ∈ I˜d then ξg (∞ (Zd , Z)) = X f (d) . In fact, ξg (2d ) = X f (d) , where m = {0, . . . , m − 1}Z ⊂ ∞ (Zd , Z) for every m ≥ 1. Furthermore, the restriction of ξg to 2d (or to any other closed, bounded, shift-invariant subset of ∞ (Zd , Z)) is continuous in the product topology on that space. d

Proof. We fix x ∈ X f (d) and define w ∈ W f (d) by demanding that ρ(w) = x and 0 ≤ wn < 1 for every n ∈ Zd . If v = f (d) (σ )(w) then −2d + 1 ≤ vn ≤ 2d − 1 for every n ∈ Zd . Since ξ¯g commutes with f (d) (σ ) by (3.16), (3.21) shows that ξg (v) = (ρ ◦ ξ¯g )(v) = g(α)(x).

(3.23)

X f (d) ⊃ ξg (∞ (Zd , Z)) ⊃ ξg (V2d ) ⊃ g(α)(X f (d) ),

(3.24)

Hence

where VK = {−K + 1, . . . , K − 1}Z ⊂ ∞ (Zd , Z). We claim that d

g(α)(X f (d) ) = X f (d) .

(3.25)

Indeed, consider the exact sequence g(α)

{0} −→ ker g(α) ∩ X f (d) −→ X f (d) −→ X f (d) −→ {0}, set Y = ker g(α)∩ X f (d) , Z = g(α)(X f (d) ) ⊂ X f (d) , write αY and α Z for the restrictions of α to Y and Z , and denote by α the Zd -action induced by α on X f (d) /Z . Yuzvinskii’s addition formula ([21, (14.1)]) implies that h top (α f (d) ) = h top (αY ) + h top (α Z ) = h top (α ) + h top (α Z ), where we are using the fact that the topological entropies of these actions coincide with their metric entropies with respect to Haar measure. Since the polynomials f (d) and g have no common factors, h top (αY ) = 0 by [21, Corollary 18.5], hence h top (α f (d) ) = h top (α Z ) is given by (3.4) and 0 < h top (α f (d) ) < ∞. Since the Haar measure λ X f (d) of X f (d) is the unique measure of maximal entropy for α f (d) we conclude that λ X f (d) (g(α) (X f (d) )) = 1 and g(α)(X f (d) ) = X f (d) , as claimed in (3.25). We have proved that ξg (V2d ) = X f (d) . If v ∈ ∞ (Zd , Z) satisfies that vn = 2d − 1 for every n ∈ Zd , then v + V2d = 4d−1 , and (3.24) implies that ξg (4d−1 ) = ξg (V2d ) + ξg (v ) = X f (d) + ξg (v ) = X f (d) . We still have to show that ξg (2d ) = X f (d) . Fix M ≥ 1 for the moment and put Q M = {−M, . . . , M}d ⊂ Zd . Let ∞ (Zd , Z+ ) = {v ∈ ∞ (Zd , Z) : vn ≥ 0 for every n ∈ Zd }.

(3.26)

738


For every v ∈ ∞ (Zd , Z+ ) and n ∈ Zd we set

u n · f (d) if vn ≥ 2d (v,n) = h 0 otherwise, and we put H (v,M) = h (v,n) , T (v) = v − H (v,M) . n∈Q M

If

D M (v) =

vn · n2max ,

(3.27)

n∈Q M

where · max is the maximum norm on Rd , then T (v) = v if and only if vn < 2d for every n ∈ Q M , and D M (T (v)) ≥ D M (v) + 2

(3.28)

otherwise. We define inductively T n (v) = T (T n−1 (v)), n ≥ 2, and conclude from (3.28) that there exists, for every v ∈ ∞ (Zd , Z+ ), an integer K M (v) ≥ 0 with v˜ (M) = T k (v) for every k ≥ K M (v). For v ∈ 4d−1 and any M ≥ 1, the corresponding

v˜ (M)

(3.29)

satisfies

0 ≤ vñ(M) ≤ 2d − 1 if n ∈ Q M , vñ(M) ≥ vn if nmax = M + 1, vñ(M) − vn ≤ (2d − 1) · (2M + 1)d ,

(3.30)

{n:nmax =M+1}

vñ(M) = vn if nmax > M + 1, where · max is the maximum norm on Rd . Let V˜ (M) = {v˜ (M) : v ∈ 4d−1 }. Since v − v˜ (M) ∈ ( f (d) ) it is clear that ξg (v) = ξg (v˜ (M) ) for every v ∈ 4d−1 and g ∈ I˜d . Since g ∈ I˜d , Theorem 2.4 implies that there exists a constant C > 0 with −d−1 |(g ∗ · w (d) )n | ≤ C · nmax for every nonzero n ∈ Zd .

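Hence

$$|\bar\xi_g(\tilde v^{(M)})_0 - \bar\xi_g(\bar v^{(M)})_0| < 4d \cdot (2M+1)^d \cdot C \cdot (M+1)^{-d-1} \to 0$$

as $M \to \infty$, where

$$\bar v_n^{(M)} = \begin{cases} \tilde v_n^{(M)} & \text{if } n \in Q_M, \\ v_n & \text{otherwise.} \end{cases}$$

It follows that

$$\lim_{M \to \infty} \bar\xi_g(v - \bar v^{(M)}) = 0$$

in the topology of coordinate-wise convergence. Since $\bar v^{(M)} \in \{v \in \Lambda_{4d-1} : 0 \le v_n < 2d \text{ for every } n \in Q_M\}$ for every $v \in \Lambda_{4d-1}$ and $M \ge 1$, we conclude that $\xi_g(\Lambda_{2d})$ is dense in $X_{f^{(d)}}$. As $\xi_g(\Lambda_{2d})$ is also closed, this implies that $\xi_g(\Lambda_{2d}) = X_{f^{(d)}}$, as claimed.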
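The toppling operator $T$ and the stabilization $\tilde v^{(M)}$ used in this argument can be tried out numerically. The following is a minimal sketch, not taken from the paper: configurations are stored as dictionaries, the function names `box`, `topple_once`, `stabilize` and `potential` are hypothetical, and the potential is summed over every stored site (so it includes the shell $\|n\|_{\max} = M+1$ that receives grains), which is an assumption made here so that the monotonicity is easy to observe.

```python
from itertools import product

def box(M, d):
    """All lattice points of Q_M = {-M, ..., M}^d."""
    return list(product(range(-M, M + 1), repeat=d))

def neighbours(n):
    """The 2d nearest neighbours n +/- e^(i) of a lattice point n."""
    for i in range(len(n)):
        for s in (1, -1):
            yield n[:i] + (n[i] + s,) + n[i + 1:]

def topple_once(v, M, d):
    """One application of T: subtract u^n * f^(d) for every n in Q_M with v_n >= 2d."""
    unstable = [n for n in box(M, d) if v.get(n, 0) >= 2 * d]
    w = dict(v)
    for n in unstable:
        w[n] -= 2 * d                      # the site loses 2d grains
        for m in neighbours(n):
            w[m] = w.get(m, 0) + 1         # each neighbour gains one grain
    return w, len(unstable)

def stabilize(v, M, d):
    """Iterate T until no site of Q_M holds 2d or more grains."""
    topplings = 0
    while True:
        v, k = topple_once(v, M, d)
        if k == 0:
            return v, topplings
        topplings += k

def potential(v):
    """Sum of v_n * ||n||_max^2 over all stored sites (assumed variant of D_M)."""
    return sum(x * max(abs(c) for c in n) ** 2 for n, x in v.items())

d, M = 2, 2
v = {n: 4 * d - 2 for n in box(M, d)}      # the largest element of Lambda_{4d-1} on Q_M
stable, topplings = stabilize(v, M, d)
```

In this sketch the stabilized configuration takes values in $\{0, \dots, 2d-1\}$ on $Q_M$, the total number of grains is conserved, and all grains that leave $Q_M$ sit on the shell $\|n\|_{\max} = M+1$, in line with (3.30).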



Remark 3.6. Although we have not yet introduced sandpiles and their stabilization (this will happen in Sect. 4), the second part of the proof of Lemma 3.5 is effectively a 'sandpile' argument, and $\tilde v^{(M)}$ is a stabilization of $v$ in $Q_M$.

Proof of Proposition 3.3. If $g$ lies in $\tilde I_d$, Lemma 3.5 shows that $\xi_g(\Lambda_{2d}) = \xi_g(\ell^\infty(\mathbb{Z}^d,\mathbb{Z})) = X_{f^{(d)}}$. On the other hand, if $g = h \cdot f^{(d)}$ for some $h \in R_d$, then $g^* \cdot w^{(d)} \in R_d$, and hence $\bar\xi_g(v)_n \in \mathbb{Z}$ for every $n \in \mathbb{Z}^d$ and $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$, implying that $\xi_g(v) = 0$.

3.4. Kernels of covering maps. Having found compact shift-invariant subsets $V \subset \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ such that the restrictions of $\xi_g$ to $V$ are surjective for every $g \in \tilde I_d$ (cf. Lemma 3.5), we turn to the problem of determining the kernels of the group homomorphisms $\xi_g : \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \longrightarrow X_{f^{(d)}}$, $g \in I_d$ (cf. (3.15)). We shall see below that $\ker(\xi_g)$ depends on $g$ and that $\ker \xi_{gh} \supsetneq \ker(\xi_g)$ for $g \in I_d$ and $0 \ne h \in R_d$. In view of this it is desirable to characterize the set

$$K_d = \bigcap_{g \in I_d} \ker(\xi_g) \tag{3.31}$$

of all $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ which are sent to $0$ by every $\xi_g$, $g \in I_d$. In the following discussion we set, for every ideal $J \subset R_d$,

$$X_J = \{x \in \mathbb{T}^{\mathbb{Z}^d} : g(\alpha)(x) = 0 \text{ for every } g \in J\} = \bigcap_{g \in J} \ker g(\alpha), \tag{3.32}$$

and put

$$\tilde X_{f^{(d)}} = \widehat{I_d/(f^{(d)})} = X_{f^{(d)}}/X_{I_d}. \tag{3.33}$$

In order to explain (3.33) we note that the dual group of $\tilde X_{f^{(d)}}$ is a subgroup of $\widehat{X_{f^{(d)}}} = R_d/(f^{(d)})$, hence $\tilde X_{f^{(d)}}$ is a quotient of $X_{f^{(d)}}$ by a closed, shift-invariant subgroup, which is the annihilator of $I_d/(f^{(d)})$ and hence equal to $X_{I_d}$. The $\mathbb{Z}^d$-action $\alpha_{f^{(d)}}$ on $X_{f^{(d)}}$ induces a $\mathbb{Z}^d$-action $\tilde\alpha_{f^{(d)}}$ on $\tilde X_{f^{(d)}}$. Note that $\tilde\alpha^n_{f^{(d)}}$ is dual to multiplication by $u^n$ on $I_d/(f^{(d)})$. With this notation we have the following result.

Theorem 3.7. There exists a surjective group homomorphism $\eta : \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \longrightarrow \tilde X_{f^{(d)}}$ with the following properties:
(1) The homomorphism $\eta$ is equivariant in the sense that $\eta \circ \sigma^n = \tilde\alpha^n_{f^{(d)}} \circ \eta$ for every $n \in \mathbb{Z}^d$;
(2) $\ker(\eta) = K_d$;
(3) The topological entropy of $\tilde\alpha_{f^{(d)}}$ coincides with that of $\alpha_{f^{(d)}}$ (cf. (3.4)).

For the proof of Theorem 3.7 we choose and fix a set of generators $G_d = \{g^{(1)}, \dots, g^{(m)}\}$ of $I_d$ (for $d = 2$ we may take, for example, $G_2 = \{g^{(1)}, g^{(2)}, g^{(3)}\}$ with $g^{(1)} = (1-u_1)^2 \cdot (1-u_2)$, $g^{(2)} = (1-u_1) \cdot (1-u_2)^2$ and $g^{(3)} = (1-u_1)^2 + (1-u_2)^2$; for $d \ge 3$ we can use the set of generators $G_d = \{f^{(d)}\} \cup \{(u_i - 1) \cdot (u_j - 1) \cdot (u_k - 1) : i, j, k = 1, \dots, d\}$). With such a choice of $G_d$ we define a map

$$\xi_{I_d} : \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \longrightarrow X^m_{f^{(d)}} \tag{3.34}$$



by setting

$$\xi_{I_d}(v) = (\xi_{g^{(1)}}(v), \dots, \xi_{g^{(m)}}(v)) \tag{3.35}$$

for every $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$.

Lemma 3.8. There exists a continuous shift-equivariant group isomorphism

$$\theta_d : \xi_{I_d}(\ell^\infty(\mathbb{Z}^d,\mathbb{Z})) \longrightarrow \tilde X_{f^{(d)}}. \tag{3.36}$$

Proof. We define a continuous group homomorphism $\theta : X_{f^{(d)}} \longrightarrow X^m_{f^{(d)}}$ by setting

$$\theta(x) = (g^{(1)}(\alpha)(x), \dots, g^{(m)}(\alpha)(x))$$

for every $x \in X_{f^{(d)}}$. According to (3.15) and (3.16),

ξg ◦ h(σ )(v) = ρ(g ∗ · w (d) · h ∗ · v) = g(α) ◦ ξh (v) for every g, h ∈ I˜d and v ∈ Rd , and hence, by continuity, for every g, h ∈ I˜d and v ∈ ∞ (Zd , Z). Since ξh (∞ (Zd , Z)) = X f (d) by Lemma 3.5 we conclude that ξ Id (∞ (Zd , Z)) ⊃ ξ Id ◦ h(σ )(∞ (Zd , Z)) = θ (X f (d) ). On the other hand, ξ Id (v) = (g (1) (α) ◦ v ∗ (α)(x ), . . . , g (m) (α) ◦ v ∗ (α)(x )) ∈ θ (X f (d) ) for every v ∈ Rd and hence, again by continuity, for every v ∈ ∞ (Zd , Z). We have proved that ξ Id (∞ (Zd , Z))) = θ (X f (d) ). The homomorphism θ has kernel X Id and induces a group isomorphism θ : X˜ f (d) −→ θ (X f (d) ). The proof is completed by setting θd = (θ )−1 . Proof of Theorem 3.7. We set η = θd ◦ ξ Id (cf. (3.34)–(3.36)). By definition, K d = ker(ξ Id ) = ker(η). The equivariance of η is obvious. Furthermore, h top (α) ˜ ≤ h top (α f (d) ), since X˜ f (d) is an equivariant quotient of X f (d) . On the other hand, X˜ f (d) ∼ = ξ Id (∞ (Zd , Z)), and the ∞ d first coordinate projection π1 : ξ Id ( (Z , Z) −→ X f (d) is surjective by Lemma 3.5. This implies that h top (β1 ) = h top (α) ˜ ≥ h top (α f (d) ), so that these entropies have to coincide. In order to characterize the kernel K d of η further we need a lemma and a definition. Lemma 3.9. For every y ∈ ∞ (Zd ) with ρ(y) ∈ X I3 there exists a unique c(y) ∈ [0, 1) d ˜ ∈ ∞ (Zd , Z), where c(y) ˜ denotes the element of R with c(y) ˜ n = c(y) with f (d) · y + c(y) for every n ∈ Zd .



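Proof. Let $x \in X_{I_d^3}$ and $y \in \ell^\infty(\mathbb{Z}^d)$ with $\rho(y) = x$. According to the definition of $X_{I_d^3}$ this means that $g(\alpha)(x) = \rho(g^* \cdot y) = 0$ for every $g \in I_d^3$. Since $g_j = (u_j - 1) \cdot f^{(d)} \in I_d^3$ for $j = 1, \dots, d$, $g_j(\alpha)(x) = \rho(g_j^* \cdot y) = 0$ for $j = 1, \dots, d$, which implies that $f^{(d)}(\alpha)(x)$ is a fixed point of the $\mathbb{Z}^d$-action $\alpha$ on $X_{I_d^3}$. Hence there exists a unique constant $c(y) \in [0,1)$ with $f^{(d)} \cdot y + \tilde c(y) \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$.

Definition 3.10. We call points $v \in \ell^\infty(\mathbb{Z}^d)$ and $x \in \mathbb{T}^{\mathbb{Z}^d}$ periodic if their orbits (under $\sigma$ and $\alpha$, respectively) are finite. If $\Gamma \subset \mathbb{Z}^d$ is a subgroup of finite index we denote by $\ell^\infty(\mathbb{Z}^d)^{(\Gamma)}$, $\ell^\infty(\mathbb{Z}^d,\mathbb{Z})^{(\Gamma)}$ and $K_d^{(\Gamma)}$ the sets of all $\Gamma$-invariant elements in the respective spaces.

Theorem 3.11. (1) For every $y \in \ell^\infty(\mathbb{Z}^d)$ with $\rho(y) \in X_{I_d^3}$,

$$v = f^{(d)} \cdot y + \tilde c(y) + \tilde m \in K_d \subset \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \tag{3.37}$$

for every $\tilde m \in \tilde{\mathbb{Z}}$ (cf. (2.10), (3.11), (3.32) and Lemma 3.9).
(2) Let $\Gamma \subset \mathbb{Z}^d$ be a subgroup of finite index. An element $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})^{(\Gamma)}$ lies in $K_d$ if and only if it is of the form (3.37) with $y \in \ell^\infty(\mathbb{Z}^d)^{(\Gamma)}$, $\rho(y) \in X_{I_d^3}$ and $\tilde m \in \tilde{\mathbb{Z}}$.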

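We start the proof of Theorem 3.11 with two lemmas.

Lemma 3.12. For every $g \in I_d$ and every constant element $\tilde m \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$, $\xi_g(\tilde m) = 0$. In other words, $\tilde{\mathbb{Z}} \subset K_d$.

Proof. We know that $g \in I_d$ if and only if it satisfies (2.13)–(2.16). We fix $g = \sum_{k \in \mathbb{Z}^d} g_k u^k \in I_d$, put $v = g^* \cdot w^{(d)} \in \ell^1(\mathbb{Z}^d)$, and set $c = \sum_{k \in \mathbb{Z}^d} g_k k_j^2 \in \mathbb{Z}$ (note that this value is independent of $j \in \{1, \dots, d\}$ by (2.16)). For every $n \in \mathbb{Z}^d$,

$$v_n = (g^* \cdot w^{(d)})_n = \sum_{k \in \mathbb{Z}^d} g_k w^{(d)}_{n+k} = \int_{\mathbb{T}^d} e^{-2\pi i \langle n, t \rangle} \, \frac{\sum_{k} g_k e^{-2\pi i \langle k, t \rangle}}{2d - 2\sum_{j=1}^d \cos(2\pi t_j)} \, dt.$$

Hence $v = (v_n)$ is the sequence of Fourier coefficients of the function

$$H_g(t) = \frac{\sum_{k} g_k e^{-2\pi i \langle k, t \rangle}}{2d - 2\sum_{j=1}^d \cos(2\pi t_j)}.$$

Since these Fourier coefficients are absolutely summable by assumption, we get that

$$\sum_{n \in \mathbb{Z}^d} v_n = H_g(0). \tag{3.38}$$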



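On the other hand, given the Taylor series expansion of $H_g$ at $t = 0$, we have

$$H_g(t) = \frac{-2\pi^2 \sum_{j=1}^d t_j^2 \sum_{k} g_k k_j^2 + \text{h.o.t.}}{4\pi^2 \sum_{j=1}^d t_j^2 + \text{h.o.t.}} = \frac{-2\pi^2 c \sum_{j=1}^d t_j^2 + \text{h.o.t.}}{4\pi^2 \sum_{j=1}^d t_j^2 + \text{h.o.t.}},$$

and hence $H_g(0) = -c/2$.

We are going to show that $H_g(0) \in \mathbb{Z}$. Indeed, since $\sum_k g_k k_j = 0$ for all $j$ by (2.14), we have that

$$H_g(0) = -\frac{1}{2} \sum_{k} g_k k_j^2 = -\frac{1}{2} \sum_{k} g_k k_j (k_j - 1) = -\sum_{k} g_k \frac{k_j (k_j - 1)}{2} \in \mathbb{Z}. \tag{3.39}$$

Finally, for any $g \in I_d$ and $\tilde m \in \tilde{\mathbb{Z}}$, we have

$$\bar\xi_g(\tilde m) = m \cdot \sum_{n \in \mathbb{Z}^d} v_n = m H_g(0) \in \mathbb{Z} \tag{3.40}$$

by (3.38), and hence $\xi_g(\tilde m) = 0 \in X_{f^{(d)}}$.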
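The identity $H_g(0) = -c/2$ can be checked numerically for small examples. The sketch below is illustrative only: Laurent polynomials in $d = 2$ variables are stored as dictionaries from exponent vectors to coefficients, and the helper names `second_moment` and `H` are hypothetical. The two test polynomials are $f^{(2)}$ and the polynomial listed in the text as $g^{(3)} = (1-u_1)^2 + (1-u_2)^2$.

```python
import cmath
import math

def second_moment(g, j):
    """c = sum_k g_k * k_j^2 for a Laurent polynomial g given as {exponent: coefficient}."""
    return sum(c * k[j] ** 2 for k, c in g.items())

def H(g, t):
    """H_g(t) = sum_k g_k exp(-2 pi i <k,t>) / (2d - 2 sum_j cos(2 pi t_j)), for t != 0."""
    d = len(t)
    numerator = sum(
        c * cmath.exp(-2j * math.pi * sum(ki * ti for ki, ti in zip(k, t)))
        for k, c in g.items()
    )
    denominator = 2 * d - 2 * sum(math.cos(2 * math.pi * tj) for tj in t)
    return numerator / denominator

# f^(2) = 4 - u1 - u1^{-1} - u2 - u2^{-1}
f2 = {(0, 0): 4, (1, 0): -1, (-1, 0): -1, (0, 1): -1, (0, -1): -1}
# g^(3) = (1 - u1)^2 + (1 - u2)^2 = 2 - 2 u1 + u1^2 - 2 u2 + u2^2
g3 = {(0, 0): 2, (1, 0): -2, (2, 0): 1, (0, 1): -2, (0, 2): 1}

for name, g in (("f2", f2), ("g3", g3)):
    c = second_moment(g, 0)
    print(name, "c =", c, " -c/2 =", -c / 2, " H near 0 =", H(g, (1e-3, 1e-3)))
```

For $f^{(2)}$ the numerator and denominator of $H_g$ coincide, so $H_g \equiv 1$ and $c = -2$; for $g^{(3)}$ one finds $c = 2$ in both coordinate directions and $H_g(t)$ close to $-1$ for small $t$.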

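Lemma 3.13. For every $g \in I_d^3$, $H_g(0) = 0$ (cf. (3.38)).

Proof. Every element of $I_d^3$ is of the form $h \cdot g$ with $h \in R_d$ and $g = (u_i - 1) \cdot (u_j - 1) \cdot (u_k - 1)$ for some $i, j, k \in \{1, \dots, d\}$. We set $v = g^* \cdot w^{(d)}$ and obtain from (3.39) that $H_g(0) = \sum_{n \in \mathbb{Z}^d} v_n = 0$. If $w = (hg)^* \cdot w^{(d)} = h^* \cdot v$, then $H_{hg}(0) = \sum_{n \in \mathbb{Z}^d} w_n = \sum_{k \in \mathbb{Z}^d} \sum_{n \in \mathbb{Z}^d} h_k v_{n-k} = 0$.

Proof of Theorem 3.11. Let $x \in X_{I_d^3}$, $y \in \ell^\infty(\mathbb{Z}^d)$ with $\rho(y) = x$, $\tilde m \in \tilde{\mathbb{Z}}$, and $v = f^{(d)} \cdot y + \tilde c(y) + \tilde m \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ (cf. Lemma 3.9). Then $g(\alpha)(x) = \rho(g^* \cdot y) = 0$ for every $g \in I_d^3$. We set $w = g^* \cdot w^{(d)}$ and obtain from (3.16), (3.18) and Lemma 3.12, that

$$\begin{aligned} \xi_g(v) &= \xi_g(f^{(d)} \cdot y + \tilde c(y) + \tilde m) = \xi_g(f^{(d)} \cdot y + \tilde c(y)) \\ &= \rho(g^* \cdot w^{(d)} \cdot f^{(d)} \cdot y + g^* \cdot w^{(d)} \cdot \tilde c(y)) \\ &= \rho(g^* \cdot y + w \cdot \tilde c(y)) = \rho(g^* \cdot y) = 0, \end{aligned}$$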

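since $\sum_{n \in \mathbb{Z}^d} w_n = 0$ by Lemma 3.13. This proves that every $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ of the form (3.37) lies in $K_d$.

For (2) we assume that $\Gamma \subset \mathbb{Z}^d$ is a subgroup of finite index. In view of (1) we only have to verify that every $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})^{(\Gamma)} \cap K_d$ has the form (3.37). Assume therefore that $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})^{(\Gamma)} \cap K_d$. We choose a set $C \subset \mathbb{Z}^d$ which intersects each coset of $\Gamma$ in $\mathbb{Z}^d$ in a single point and set $\ell_0^{(\Gamma)} = \{w \in \ell^\infty(\mathbb{Z}^d)^{(\Gamma)} : \sum_{n \in C} w_n = 0\}$. As $\ell_0^{(\Gamma)}$ is finite-dimensional and $\ker(f^{(d)}(\sigma)) = \tilde{\mathbb{R}}$ there exists, for every $y' \in \ell_0^{(\Gamma)}$, a unique $y \in \ell_0^{(\Gamma)}$ with $f^{(d)} \cdot y = y'$. Put $\tilde a = \sum_{n \in C} v_n / |\mathbb{Z}^d/\Gamma|$, regarded as an element of $\tilde{\mathbb{R}}$. If $v' = v - \tilde a$, then $v' \in \ell_0^{(\Gamma)}$ and $f^{(d)} \cdot y = v'$ for some $y \in \ell_0^{(\Gamma)}$.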



Since $v \in K_d$, $\xi_g(v) = 0$ for every $g \in I_d$. For $g \in I_d^3$, Lemma 3.13 shows that

$$\bar\xi_g(v) = g^* \cdot w^{(d)} \cdot v = g^* \cdot w^{(d)} \cdot v' + g^* \cdot w^{(d)} \cdot \tilde a = g^* \cdot y + g^* \cdot w^{(d)} \cdot \tilde a = g^* \cdot y \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z}).$$

Hence $\rho(g^* \cdot y) = g(\alpha)(\rho(y)) = 0$ for all $g \in I_d^3$, so that $\rho(y) \in X_{I_d^3}$. We obtain that $v = f^{(d)} \cdot y + \tilde a$ for some $y \in \ell^\infty(\mathbb{Z}^d)$ with $\rho(y) \in X_{I_d^3}$ and some $\tilde a \in \tilde{\mathbb{R}}$, which completes the proof of (2).

Theorem 3.11 implies that there exist nonconstant elements $v \in K_d \setminus f^{(d)}(\sigma)(\ell^\infty(\mathbb{Z}^d,\mathbb{Z}))$. However, if two elements $v, v' \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ differ in only finitely many coordinates, then they get identified under $\xi_{I_d}$ (i.e., their difference lies in $K_d$) if and only if they differ by an element in $(f^{(d)}) \subset \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$. This is a consequence of the following assertion:

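Proposition 3.14. For every $g \in \tilde I_d$, $\ker(\xi_g) \cap R_d = (f^{(d)}) = f^{(d)} \cdot R_d$.

Proof. Suppose that $h \in R_d \cap \ker(\xi_g)$. Then

$$v := \bar\xi_g(h) = g^* \cdot w^{(d)} \cdot h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z}). \tag{3.41}$$

Since $g \in I_d$, $g^* \cdot w^{(d)} \in \ell^1(\mathbb{Z}^d)$ and hence $v \in R_d = \ell^1(\mathbb{Z}^d) \cap \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$. If we multiply both sides of (3.41) by $f^{(d)}$ we get that $f^{(d)} \cdot v = g^* \cdot h$. As $R_d$ has unique factorization and $g^*$ is not divisible by $f^{(d)}$, this implies that $h \in f^{(d)} \cdot R_d$.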

Remarks 3.15. (1) One can show that the periodic points are dense in $K_d$, so that every $v \in K_d$ is a coordinate-wise limit of elements of the form (3.37) in Theorem 3.11.

(2) Theorem 3.11 (1) gives a 'lower bound' for the kernel $K_d$ of the maps $\xi_g$, $g \in I_d$. There is also a straightforward 'upper bound' for that kernel: an element $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ lies in $K_d$ if and only if $\bar\xi_g(v) = g^* \cdot w^{(d)} \cdot v =: w_g \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ for every $g \in I_d$. By multiplying this equation with $f^{(d)}$ we obtain that

$$K_d \subset \{v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) : g \cdot v \in f^{(d)} \cdot \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \text{ for every } g \in I_d\} =: \bar K_d. \tag{3.42}$$

It is not very difficult to see that the inclusion in (3.42) is strict. In fact, $\bar K_d / K_d$ turns out to be isomorphic to $\mathbb{T}^d$.

(3) In [18], the kernel $\bar K_d$ of $\xi_{I_d}$ was studied using methods of commutative algebra.



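4. The Abelian Sandpile Model

Let $d \ge 2$, $\gamma \ge 2d$, and let $E \subset \mathbb{Z}^d$ be a nonempty set. For every $n \in E$ we denote by $N_E(n)$ the number of neighbours of $n$ in $E$, i.e.,

$$N_E(n) = \bigl| E \cap \{n \pm e^{(i)} : i = 1, \dots, d\} \bigr|, \tag{4.1}$$

where $e^{(i)}$ is the $i$-th unit vector in $\mathbb{Z}^d$. We set

$$\Lambda_\gamma = \{0, \dots, \gamma - 1\}^{\mathbb{Z}^d} \tag{4.2}$$

(cf. Lemma 3.5) and put

$$\mathcal{P}_E^{(\gamma)} = \{v \in \{0, \dots, \gamma-1\}^E : v_n \ge N_E(n) \text{ for at least one } n \in E\}, \qquad \mathcal{R}_E^{(\gamma)} = \bigcap_{\emptyset \ne F \subset E} \mathcal{P}_F^{(\gamma)}. \tag{4.3}$$

0 […] $2d$, i.e., in the dissipative case.³

We denote by $\sigma = \sigma_{\mathcal{R}_\infty^{(\gamma)}}$ the shift-action of $\mathbb{Z}^d$ on $\mathcal{R}_\infty^{(\gamma)} \subset \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \subset W_d$ (cf. (3.7)). For the following discussion we introduce the Laurent polynomial

$$f^{(d,\gamma)} = \gamma - \sum_{i=1}^d (u_i + u_i^{-1}) \in R_d = \mathbb{Z}[u_1^{\pm}, \dots, u_d^{\pm}]. \tag{4.5}$$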

For $\gamma = 2d$, $f^{(d,\gamma)} = f^{(d)}$ (cf. (2.4)).

Proposition 4.1. Let $d \ge 2$ and $\gamma \ge 2d$. The following conditions are equivalent for every $v \in \Lambda_\gamma$:
(1) $v \in \mathcal{R}_\infty^{(\gamma)}$;
(2) For every nonzero $h \in R_d$ with $h_n \in \{0, 1\}$ for every $n \in \mathbb{Z}^d$, $(f^{(d,\gamma)} \cdot h)_n + v_n \ge \gamma$ for at least one $n \in \operatorname{supp}(h) = \{m \in \mathbb{Z}^d : h_m \ne 0\}$;
(3) For every $h \in R_d$ with $h_n > 0$ for some $n \in \mathbb{Z}^d$, $(f^{(d,\gamma)} \cdot h)_n + v_n \ge \gamma$ for at least one $n \in \{m \in \mathbb{Z}^d : h_m > 0\}$.

Furthermore, if $v, v' \in \mathcal{R}_\infty^{(\gamma)}$ and $0 \ne v - v' \in R_d$, then $v - v' \notin f^{(d,\gamma)} \cdot R_d$.

Proof. Fix an element $v \in \Lambda_\gamma$. If $h \in R_d$ with $h_n \in \{0, 1\}$ for every $n \in \mathbb{Z}^d$ and $E = \operatorname{supp}(h)$, then $(f^{(d,\gamma)} \cdot h)_n + v_n \in \{0, \dots, \gamma - 1\}$ for every $n \in E$ if and only if $v_n \le N_E(n) - 1$ for every $n \in E$, in which case $\pi_E(v) \notin \mathcal{P}_E^{(\gamma)}$ and $v \notin \mathcal{R}_\infty^{(\gamma)}$ (cf. (4.3)). This proves the equivalence of (1) and (2).

Now suppose that $h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ with $M_h = \max_{m \in \mathbb{Z}^d} h_m > 0$, and that $f^{(d,\gamma)} \cdot h + v \in \Lambda_\gamma$. We set

$$S_{\max}(h) = \{n \in \mathbb{Z}^d : h_n = M_h\} \tag{4.6}$$

and observe that $\gamma > v_n + (f^{(d,\gamma)} \cdot h)_n \ge v_n + M_h \cdot (\gamma - N_{S_{\max}(h)}(n))$ for every $n \in S_{\max}(h)$, so that

$$v_n \le N_{S_{\max}(h)}(n) - 1 \quad \text{for every } n \in S_{\max}(h). \tag{4.7}$$

If $h \in R_d$, then $S_{\max}(h)$ is finite and (4.7) yields a contradiction to the definition of $\mathcal{R}_\infty^{(\gamma)}$. This proves the implication (1) ⇒ (3), and the reverse implication (3) ⇒ (2) is obvious. The last assertion of this proposition is a consequence of (3).

The proof of Proposition 4.1 has the following corollary.

³ Even in the dissipative case stable configurations will, in general, only arise as coordinate-wise limits of infinite sequences of topplings of $v$.


Corollary 4.2. If $v \in \mathcal{R}_\infty^{(\gamma)}$, and if $h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ satisfies that $\max_{m \in \mathbb{Z}^d} h_m > 0$ and $v + f^{(d,\gamma)} \cdot h \in \mathcal{R}_\infty^{(\gamma)}$, then every connected⁴ component of $S_{\max}(h)$ is infinite (cf. (4.6)).

Proof. If $S_{\max}(h)$ has a finite connected component $C$ then (4.7) shows that

$$(f^{(d,\gamma)} \cdot h)_n + v_n \ge (f^{(d,\gamma)} \cdot \bar h)_n + v_n = \gamma - N_C(n)$$

for every $n \in C$, where

$$\bar h_n = \begin{cases} h_n & \text{if } n \in C, \\ 0 & \text{otherwise.} \end{cases}$$

As in (4.7) we obtain a contradiction to (4.3).
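For a small finite set $E$, the defining condition (4.3) can be tested by brute force over all nonempty subsets $F \subset E$. The sketch below is only an illustration under that assumption (it is exponential in $|E|$); the function names `neighbour_count`, `in_P`, `is_recurrent` and `count_recurrent` are hypothetical and do not come from the paper.

```python
from itertools import combinations, product

def neighbour_count(n, E):
    """N_E(n): number of lattice neighbours n +/- e^(i) of n that lie in E, as in (4.1)."""
    return sum(
        1
        for i in range(len(n))
        for s in (1, -1)
        if n[:i] + (n[i] + s,) + n[i + 1:] in E
    )

def in_P(v, F):
    """Membership of the restriction of v to F in P_F: v_n >= N_F(n) for at least one n in F."""
    return any(v[n] >= neighbour_count(n, F) for n in F)

def is_recurrent(v, E):
    """Check (4.3) directly: the restriction of v lies in P_F for every nonempty F in E."""
    sites = list(E)
    return all(
        in_P(v, frozenset(F))
        for r in range(1, len(sites) + 1)
        for F in combinations(sites, r)
    )

def count_recurrent(E, gamma):
    """Number of configurations in {0, ..., gamma-1}^E satisfying (4.3)."""
    sites = sorted(E)
    return sum(
        1
        for values in product(range(gamma), repeat=len(sites))
        if is_recurrent(dict(zip(sites, values)), E)
    )

pair = {(0, 0), (1, 0)}
square = {(0, 0), (1, 0), (0, 1), (1, 1)}
print(count_recurrent(pair, 4), count_recurrent(square, 4))
```

With $d = 2$ and $\gamma = 4$ this yields 15 configurations on two adjacent sites and 192 on a $2 \times 2$ block, which is consistent with the classical count of recurrent configurations by the determinant of the toppling matrix.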

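Remark 4.3. Proposition 4.1 implies that $(f^{(d,\gamma)}(\sigma)(h) + \mathcal{R}_\infty^{(\gamma)}) \cap \mathcal{R}_\infty^{(\gamma)} = \emptyset$ for every nonzero $h \in R_d$. However, if $h \in \{0, 1\}^{\mathbb{Z}^d}$ satisfies that the set $S(h) = \{n \in \mathbb{Z}^d : h_n = 1\}$ is infinite and connected, then one checks easily that there exists a $v \in \mathcal{R}_\infty^{(\gamma)}$ with $f^{(d)}(\sigma)(h) + v \in \mathcal{R}_\infty^{(\gamma)}$. In spite of this the following result holds.

Proposition 4.4. The set

$$V = \{v \in \mathcal{R}_\infty^{(\gamma)} : v + w \notin \mathcal{R}_\infty^{(\gamma)} \text{ for every nonzero } w \in f^{(d,\gamma)}(\sigma)(\ell^\infty(\mathbb{Z}^d,\mathbb{Z}))\} \tag{4.8}$$

is a countable intersection of dense open sets (i.e., a dense $G_\delta$-set) in $\mathcal{R}_\infty^{(\gamma)}$.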

Proof. Let $v \in \mathcal{R}_\infty^{(\gamma)}$ and $h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ be such that $\max_{n \in \mathbb{Z}^d} h_n \ge 0$, $f^{(d,\gamma)} \cdot h \ne 0$ and $v + f^{(d,\gamma)} \cdot h \in \mathcal{R}_\infty^{(\gamma)}$. We set $M_h = \max_{m \in \mathbb{Z}^d} h_m$, define $S_{\max}(h) \subset \mathbb{Z}^d$ as in (4.6), and put

$$\partial S_{\max}(h) = \{n \in S_{\max}(h) : \|m - n\|_{\max} = 1 \text{ for some } m \in \mathbb{Z}^d \setminus S_{\max}(h)\}.$$

As $(f^{(d,\gamma)} \cdot h)_n > 0$ for every $n \in \partial S_{\max}(h)$, the set $\partial S_{\max}(h)$ must have empty intersection with

$$F(v) = \{n \in \mathbb{Z}^d : v_n = \gamma - 1\}.$$

Now suppose that $v \in \mathcal{R}_\infty^{(\gamma)}$ has the following properties:
(a) The set $F(v)$ is connected.
(b) Every connected component of $\mathbb{Z}^d \setminus F(v)$ is finite.
(c) $\min_{n \in \mathbb{Z}^d} v_n = 0$.

According to Corollary 4.2, every connected component $C$ of $S_{\max}(h)$ is infinite. If $C \ne \mathbb{Z}^d$, then the hypotheses (a)–(b) above guarantee that the boundary $\partial C = C \cap \partial S_{\max}(h)$ of $C$ is a union of finite sets, each of which is contained in one of the connected components of $\mathbb{Z}^d \setminus F(v)$. Let $C$ and $D$ be connected components of $S_{\max}(h)$ and $\mathbb{Z}^d \setminus F(v)$, respectively, with $D \cap \partial C \ne \emptyset$. Since $C$ is infinite and connected and $F(v)$ is connected, we must have that $h_m = M_h = 0$ for every $m \in F(v)$.

⁴ A set $S \subset \mathbb{Z}^d$ is connected if we can find, for any two coordinates $m$ and $n$ in $S$, a 'path' $p(0) = m, p(1), \dots, p(k) = n$ in $S$ with $\|p(j) - p(j-1)\|_{\max} = 1$ for every $j = 1, \dots, k$.



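Define $\tilde h$ by

$$\tilde h_n = \begin{cases} h_n & \text{if } n \in D, \\ 0 & \text{otherwise.} \end{cases}$$

Then $(f^{(d,\gamma)} \cdot \tilde h)_n = (f^{(d,\gamma)} \cdot h)_n$ for every $n \in D$, and $0 \le (f^{(d,\gamma)} \cdot \tilde h)_n \le (f^{(d,\gamma)} \cdot h)_n$ for every $n \in F(v)$. For $n \in \mathbb{Z}^d \setminus (F(v) \cup D)$, $(f^{(d,\gamma)} \cdot \tilde h)_n = 0$. By combining these statements we see that $v + f^{(d,\gamma)} \cdot \tilde h \in \mathcal{R}_\infty^{(\gamma)}$. Since $0 \ne \tilde h \in R_d$ we obtain a contradiction to Proposition 4.1.

This shows that $v + f^{(d,\gamma)} \cdot h \notin \mathcal{R}_\infty^{(\gamma)}$ for every $v \in \mathcal{R}_\infty^{(\gamma)}$ satisfying conditions (a)–(b) above and every nonzero $h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ with $\max_{n \in \mathbb{Z}^d} h_n \ge 0$.

If $\gamma = 2d$ and $h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ satisfies that $f^{(d)} \cdot h \ne 0$, then we may add a constant to $h$, if necessary, to ensure that $\max_{n \in \mathbb{Z}^d} h_n \ge 0$. Since such an addition will not affect $f^{(d)} \cdot h$, we obtain that $v + f^{(d)} \cdot h \notin \mathcal{R}_\infty^{(\gamma)} = \mathcal{R}_\infty$ for every $v \in \mathcal{R}_\infty$ satisfying conditions (a)–(c) above and every nonconstant $h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$.

If $\gamma > 2d$ and $h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ satisfies that $\max_{n \in \mathbb{Z}^d} h_n < 0$, then $(f^{(d,\gamma)} \cdot h)_n < 0$ for every $n \in \mathbb{Z}^d$, and $v + f^{(d,\gamma)} \cdot h \notin \mathcal{R}_\infty^{(\gamma)}$ for every $v \in \mathcal{R}_\infty^{(\gamma)}$ satisfying condition (c) above.

Let $V' \subset \mathcal{R}_\infty^{(\gamma)}$ be the set of all points satisfying conditions (a)–(c) above. This set is clearly dense and

$$V' \subset V = \{v \in \mathcal{R}_\infty^{(\gamma)} : v + w \notin \mathcal{R}_\infty^{(\gamma)} \text{ for every nonzero } w \in f^{(d)}(\sigma)(\ell^\infty(\mathbb{Z}^d,\mathbb{Z}))\}. \tag{4.9}$$

The set $V$ is therefore dense, and it is obviously shift-invariant. In order to verify that $V$ is a $G_\delta$ we write its complement as an $F_\sigma$ of the form

$$\mathcal{R}_\infty^{(\gamma)} \setminus V = \bigcup_{M \ge 1} \bigcup_{N \ge 1} \bigcup_{0 \ne c \in \mathbb{Z}^{Q_M}} \tilde\pi\bigl(\{(v, h) \in \mathcal{R}_\infty^{(\gamma)} \times B_N(\ell^\infty(\mathbb{Z}^d,\mathbb{Z})) : v + f^{(d)} \cdot h \in \mathcal{R}_\infty^{(\gamma)} \text{ and } \pi_{Q_M}(f^{(d,\gamma)} \cdot h) = c\}\bigr),$$

where $B_N(\ell^\infty(\mathbb{Z}^d,\mathbb{Z})) = \{h \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) : \|h\|_\infty \le N\}$, $Q_M$ appears in (3.26) and $\tilde\pi : \mathcal{R}_\infty^{(\gamma)} \times \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \longrightarrow \mathcal{R}_\infty^{(\gamma)}$ is the first coordinate projection.

5. The Critical Sandpile Model

Throughout this section we assume that $d \ge 2$ and $\gamma = 2d$. We write $\mathcal{R}_\infty = \mathcal{R}_\infty^{(2d)}$ for the critical abelian sandpile model, define the harmonic model $X_{f^{(d)}} \subset \mathbb{T}^{\mathbb{Z}^d}$ by (2.4) and (3.3), and use the notation of Sect. 3.

5.1. Surjectivity of the maps $\xi_g : \mathcal{R}_\infty \longrightarrow X_{f^{(d)}}$. For every $g \in \tilde I_d$ (cf. (3.17)) we define the map $\xi_g : \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) \longrightarrow X_{f^{(d)}}$ by (2.9) and (3.15). We shall prove the following results.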



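Theorem 5.1. For every $g \in \tilde I_d$, $\xi_g(\mathcal{R}_\infty) = X_{f^{(d)}}$. Furthermore, the shift-action $\sigma_{\mathcal{R}_\infty}$ of $\mathbb{Z}^d$ on $\mathcal{R}_\infty$ has topological entropy

$$h_{\mathrm{top}}(\sigma_{\mathcal{R}_\infty}) = \lim_{N \to \infty} \frac{1}{|Q_N|} \log \bigl| \pi_{Q_N}(\mathcal{R}_\infty) \bigr| = \int_0^1 \cdots \int_0^1 \log f^{(d)}(e^{2\pi i t_1}, \dots, e^{2\pi i t_d}) \, dt_1 \cdots dt_d = h(\alpha_{f^{(d)}}). \tag{5.1}$$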
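The entropy integral in (5.1) is easy to approximate for $d = 2$, where $f^{(2)}(e^{2\pi i t_1}, e^{2\pi i t_2}) = 4 - 2\cos(2\pi t_1) - 2\cos(2\pi t_2)$. The following sketch uses a plain midpoint rule, which avoids the logarithmic singularity at $t = 0$; the function names and the grid size are choices made here for illustration. The comparison value $4G/\pi$, with $G$ Catalan's constant, is the known closed form of this integral and is not stated in the text.

```python
import math

def harmonic_entropy_2d(N=200):
    """Midpoint-rule approximation of the integral of log(4 - 2cos(2 pi t1) - 2cos(2 pi t2)) over [0,1]^2."""
    cosines = [math.cos(2 * math.pi * (i + 0.5) / N) for i in range(N)]
    total = 0.0
    for c1 in cosines:
        for c2 in cosines:
            total += math.log(4 - 2 * c1 - 2 * c2)
    return total / N ** 2

def catalan(terms=200000):
    """Catalan's constant G as the alternating series sum (-1)^k / (2k+1)^2."""
    return sum((-1) ** k / (2 * k + 1) ** 2 for k in range(terms))

approx = harmonic_entropy_2d()
closed_form = 4 * catalan() / math.pi
print(approx, closed_form)
```

Both numbers should be close to 1.166, the common value of $h_{\mathrm{top}}(\sigma_{\mathcal{R}_\infty})$ and $h(\alpha_{f^{(2)}})$ in dimension two.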

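For the proof of this result we need a bit of notation and several lemmas. For every $Q \subset \mathbb{Z}^d$ and $v \in W_d$ we set

$$S^{(Q)}(v) = \{v' \in W_d : \pi_{\mathbb{Z}^d \setminus Q}(v') = \pi_{\mathbb{Z}^d \setminus Q}(v)\}. \tag{5.2}$$

If $V \subset W_d$ is a subset we set $S_V^{(Q)}(v) = S^{(Q)}(v) \cap V$. We fix $g \in \tilde I_d$. Let $\varepsilon$ with $0 < \varepsilon < 1/4d$. Since $g^* \cdot w^{(d)} \in \ell^1(\mathbb{Z}^d)$ we can find $K \ge 1$ with

$$|\bar\xi_g(v)_0 - \bar\xi_g(v')_0| < \varepsilon \quad \text{for every } v, v' \in \Lambda_{2d} \text{ with } \pi_{Q_K}(v) = \pi_{Q_K}(v') \tag{5.3}$$

(cf. (3.26)).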

Lemma 5.2. Let $v \in \Lambda_{2d}$, $Q \subset \mathbb{Z}^d$ be a finite set and $v' \in S^{(Q)}_{\Lambda_{2d}}(v)$ (cf. (4.2) and (5.2)).
(1) $\xi_g(v') = \xi_g(v)$ if and only if $v' - v \in (f^{(d)})$.
(2) If $\xi_g(v') \ne \xi_g(v)$, then $|\xi_g(v')_n - \xi_g(v)_n| \ge 1/4d$ for some $n \in Q + Q_K = \{m + k : m \in Q, k \in Q_K\}$, where $K$ is defined in (5.3), $Q_K$ in (3.26) and $|\cdot|$ in (3.5).

Proof. We put $y = \bar\xi_g(v)$, $x = \rho(y) = \xi_g(v)$, $y' = \bar\xi_g(v')$ and $x' = \xi_g(v')$. Assume that

$$|x'_n - x_n| < 1/4d \tag{5.4}$$

for every $n \in Q + Q_K$. Since (5.4) holds automatically for $n \in \mathbb{Z}^d \setminus (Q + Q_K)$ by (5.3), it holds for every $n \in \mathbb{Z}^d$. We choose $z \in W_{f^{(d)}}$ with $\rho(z) = x' - x$ and $\|z\|_\infty < 1/4d$ (cf. (3.10)). Then $f^{(d)} \cdot z \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$, and the smallness of the coordinates of $z$ implies that $f^{(d)} \cdot z = 0$. Since $\rho(z) = \rho(y' - y)$ we obtain that $z - (y' - y) \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$. As the coordinates of $z$ are small and $\lim_{n \to \infty} |y'_n - y_n| = \lim_{n \to \infty} |\bar\xi_g(v' - v)_n| = 0$, due to the continuity of $\bar\xi_g$, we conclude that $h = z - (y' - y) \in R_d$. According to (3.18), $f^{(d)} \cdot (z - (y' - y)) = f^{(d)} \cdot h = g^* \cdot (v - v')$. As $R_d$ has unique factorization and $g^*$ is not divisible by $f^{(d)}$, $v' - v$ must lie in the ideal $(f^{(d)}) \subset R_d$. Theorem 2.2 (i) and (3.15) together imply that $\xi_g(v') = x' = x = \xi_g(v)$.



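If $\varepsilon' > 0$ and $Q \subset \mathbb{Z}^d$ we call a subset $Y \subset X_{f^{(d)}}$ $(Q, \varepsilon')$-separated if there exists, for every pair of distinct points $x, x' \in Y$, an $n \in Q$ with $|x_n - x'_n| \ge \varepsilon'$. The set $Y$ is $(Q, \varepsilon')$-spanning if there exists, for every $x \in X_{f^{(d)}}$, an $x' \in Y$ with $|x_n - x'_n| < \varepsilon'$ for every $n \in Q$.

Lemma 5.3. Let $Q \subset \mathbb{Z}^d$ be a finite set and $v \in \Lambda_{2d}$. Then the set $\xi_g(S^{(Q + Q_K)}_{\Lambda_{2d}}(v))$ is $(Q, \varepsilon)$-spanning.

Proof. According to Lemma 3.5, $\xi_g(\Lambda_{2d}) = X_{f^{(d)}}$. If we fix $w \in \Lambda_{2d}$ and set

$$w'_n = \begin{cases} w_n & \text{if } n \in Q + Q_K, \\ v_n & \text{otherwise,} \end{cases}$$

then $w' \in S^{(Q + Q_K)}_{\Lambda_{2d}}(v)$ and $|\xi_g(w)_n - \xi_g(w')_n| < \varepsilon$ for every $n \in Q$ by (5.3).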

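Lemma 5.4. For every finite set $Q \subset \mathbb{Z}^d$ and every $w \in \mathcal{R}_\infty$, the restriction of $\xi_g$ to $S^{(Q)}_{\mathcal{R}_\infty}(w)$ is injective and the set $\xi_g(S^{(Q)}_{\mathcal{R}_\infty}(w))$ is $(Q + Q_K, 1/4d)$-separated.

Proof. If $v, v'$ are distinct points in $S^{(Q)}_{\mathcal{R}_\infty}(w)$, then Proposition 4.1 and Lemma 5.2 show that $|\xi_g(v)_n - \xi_g(v')_n| \ge 1/4d$ for some $n \in Q + Q_K$.

We write every $h \in R_d$ as $h = \sum_{n \in \mathbb{Z}^d} h_n u^n$ and set $\operatorname{supp}(h) = \{n \in \mathbb{Z}^d : h_n \ne 0\}$. For $Q \subset \mathbb{Z}^d$ we put

$$\begin{aligned} R(Q) &= \{h \in R_d : \operatorname{supp}(h) \subset Q\}, \\ R^+(Q) &= \{h \in R(Q) : h_n \ge 0 \text{ for every } n \in \mathbb{Z}^d\}, \\ S^+(Q) &= \{h \in R(Q) : h_n \in \{0, 1\} \text{ for every } n \in \mathbb{Z}^d\}. \end{aligned} \tag{5.5}$$

For $L \ge 1$, $v \in \Lambda_{2d}$ and $q \ge 0$ we set

$$\begin{aligned} Y_v(q) &= \{w \in S^{(Q_{L+K+1})}(v) : \text{for every } n \in \mathbb{Z}^d,\ 0 \le w_n < 2d \text{ if } \|n\|_{\max} \ne L+K+1 \\ &\qquad \text{and } -q \le w_n < 2d \text{ if } \|n\|_{\max} = L+K+1\}, \\ Y'_v(q) &= \{w \in Y_v(q) : \pi_{Q_{L+K}}(w) \in \pi_{Q_{L+K}}(\mathcal{R}_\infty)\}. \end{aligned} \tag{5.6}$$

Lemma 5.5. Let $L \ge 1$, $q \ge 0$ and $v \in \Lambda_{2d}$. Then

$$Y'_v(q) = Y_v(q) \setminus \bigcup_{0 \ne h \in S^+(Q_{L+K})} \bigl(Y_v(q+1) - h \cdot f^{(d)}\bigr). \tag{5.7}$$

Proof. Suppose that $v' \in Y'_v(q)$. According to the proof of Proposition 4.1 there exists, for every nonzero $h \in S^+(Q_{L+K})$, an $n \in \operatorname{supp}(h) \subset Q_{L+K}$ with $(v' + h \cdot f^{(d)})_n > 2d - 1$. In particular, $v' + h \cdot f^{(d)} \notin Y_v(q+1)$ and $v' \notin Y_v(q+1) - h \cdot f^{(d)}$. This shows that

$$Y'_v(q) \subset Y_v(q) \setminus \bigcup_{0 \ne h \in S^+(Q_{L+K})} \bigl(Y_v(q+1) - h \cdot f^{(d)}\bigr).$$

Conversely, if $v' \in Y_v(q) \setminus \bigcup_{0 \ne h \in S^+(Q_{L+K})} (Y_v(q+1) - h \cdot f^{(d)})$, but $v' \notin Y'_v(q)$, then the proof of Proposition 4.1 allows us to find a nonzero $h \in S^+(Q_{L+K})$ with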



$(v' + h \cdot f^{(d)})_n < 2d$ for every $n \in \operatorname{supp}(h)$. If $(v' + h \cdot f^{(d)})_n < 0$ for some $n \in Q_{L+K}$, then $n \notin \operatorname{supp}(h)$ and $-2d \le (v' + h \cdot f^{(d)})_n < 0$. We replace $h$ by $h' = h + u^n \in S^+(Q_{L+K})$ and obtain that $0 \le (v' + h' \cdot f^{(d)})_n < 2d$ for every $n \in \operatorname{supp}(h')$. By repeating this process we can find $h'' \in S^+(Q_{L+K})$ with $\operatorname{supp}(h'') \supset \operatorname{supp}(h)$ such that $0 \le (v' + h'' \cdot f^{(d)})_n \le 2d - 1$ for every $n \in Q_{L+K}$. Since $0 \ge (h'' \cdot f^{(d)})_n \ge -1$ if $\|n\|_{\max} = L+K+1$ and $(h'' \cdot f^{(d)})_n = 0$ outside $Q_{L+K+1}$, we see that $v' + h'' \cdot f^{(d)} \in Y_v(q+1)$. This contradicts our choice of $v'$ and proves (5.7).

Lemma 5.6. For every $v \in \Lambda_{2d}$ and $L \ge 1$ there exists an $h \in R^+(Q_L)$ with $v' = v + h \cdot f^{(d)} \in Y'_v((2d-1) \cdot (2L+1)^d)$.

Proof. For every $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ we define $D_{Q_{L+1}}(v)$ by (3.27). Since $D_{Q_{L+1}}(v + u^n \cdot f^{(d)}) \le D_{Q_{L+1}}(v) - 2$ for every $n \in Q_L$, $D_{Q_{L+1}}(v + h \cdot f^{(d)}) \le D_{Q_{L+1}}(v) - 2\|h\|_1$ for every $h \in S^+(Q_L)$.

Suppose that $v \in \Lambda_{2d}$. If $v \notin Y'_v(0)$ then (5.7) shows that we can find a nonzero $h^{(1)} \in S^+(Q_L)$ with $v^{(1)} = v + h^{(1)} \cdot f^{(d)} \in Y_v(1)$, and the first paragraph of this proof shows that $D_{Q_{L+1}}(v^{(1)}) \le D_{Q_{L+1}}(v) - 2\|h^{(1)}\|_1$. If $v^{(1)} \notin Y'_v(1)$ we can repeat this argument and find a nonzero $h^{(2)} \in S^+(Q_L)$ with $v^{(2)} = v^{(1)} + h^{(2)} \cdot f^{(d)} \in Y_v(2)$ and $D_{Q_{L+1}}(v^{(2)}) \le D_{Q_{L+1}}(v) - 2\|h^{(1)}\|_1 - 2\|h^{(2)}\|_1$. Proceeding by induction, we choose nonzero elements $h^{(1)}, \dots, h^{(m)} \in S^+(Q_L)$ with $v^{(k)} = v + (h^{(1)} + \cdots + h^{(k)}) \cdot f^{(d)} \in Y_v(m)$ for every $k = 1, \dots, m$. We claim that $v^{(k)} \in Y_v((2d-1) \cdot (2L+1)^d)$ for every $k \ge 1$, and that this process has to stop, i.e., that

$$v' = v^{(m)} = v + (h^{(1)} + \cdots + h^{(m)}) \cdot f^{(d)} \in Y'_v((2d-1) \cdot (2L+1)^d) \tag{5.8}$$

for some $m \ge 1$. In order to verify this we assume that we have found $h^{(1)}, \dots, h^{(k)} \in S^+(Q_L)$ with $v^{(k)} = v + (h^{(1)} + \cdots + h^{(k)}) \cdot f^{(d)} \in Y_v(k)$. Since $\sum_{n \in Q_{L+1}} v_n^{(k)} = \sum_{n \in Q_{L+1}} v_n$, $0 \le v_n^{(k)} \le 2d - 1$ for $n \in Q_L$, $v_n^{(k)} \le v_n$ if $\|n\|_{\max} = L+1$ and $v_n^{(k)} = v_n$ for every $n \notin Q_{L+1}$, we know that

$$\begin{aligned} (2d-1) \cdot 2d \cdot (2L+1)^{d-1} &\ge \sum_{\{n : \|n\|_{\max} = L+1\}} v_n \ge \sum_{\{n : \|n\|_{\max} = L+1\}} v_n^{(k)} \\ &\ge \sum_{\{n : \|n\|_{\max} = L+1\}} v_n - \sum_{n \in Q_L} v_n^{(k)} \\ &\ge -(2d-1) \cdot (2L+1)^d, \end{aligned} \tag{5.9}$$

so that $v^{(k)} \in Y_v((2d-1) \cdot (2L+1)^d)$ for every $k \ge 1$. Furthermore,

$$D_{Q_{L+1}}(v^{(k)}) \le D_{Q_{L+1}}(v) - 2\sum_{j=1}^k \|h^{(j)}\|_1 \le D_{Q_{L+1}}(v) - 2k < (L+1)^2 \cdot (2d-1) \cdot (2L+3)^d - 2k$$

and $D_{Q_{L+1}}(v^{(k)}) \ge -(L+1)^2 \cdot (2d-1) \cdot (2L+1)^d \cdot |Q_{L+1} \setminus Q_L|$ for every $k$, so that the integer $k$ has to remain bounded. This shows that our inductive process has to terminate, which proves (5.8).



Before we complete the proof of Theorem 5.1 we state another consequence of Lemmas 5.5 and 5.6.

Proposition 5.7. Let $v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z})$ and $M \ge 1$. Then there exists a unique $h \in R_d$ with the following properties:
(1) $\operatorname{supp}(h) = \{m \in \mathbb{Z}^d : h_m \ne 0\} \subset Q_M$;
(2) If $v' = v + h \cdot f^{(d)}$, then $\pi_{Q_M}(v') \in \pi_{Q_M}(\mathcal{R}_\infty)$;
(3) $v'_m = v_m$ for every $m \in \mathbb{Z}^d$ with $\|m\|_{\max} > M+1$;
(4) $\sum_{\{n : \|n\|_{\max} = M+1\}} |v'_n| \le (2M+3)^d \cdot \|v\|_\infty$.

Proof. The proof of Lemma 5.5 allows us to find a polynomial $h^- \in R_d$ with nonnegative coefficients and $\operatorname{supp}(h^-) \subset Q_M$ such that $(v - h^- \cdot f^{(d)})_n < 2d$ for every $n \in Q_M$. Next we proceed as in the proof of Lemma 5.6 and choose a polynomial $h^+ \in R_d$ with nonnegative coefficients and $\operatorname{supp}(h^+) \subset Q_M$ such that $v' = v + (h^+ - h^-) \cdot f^{(d)}$ satisfies (2). Condition (3) holds obviously, and (4) follows from the fact that $\sum_{n \in Q_{M+1}} v'_n = \sum_{n \in Q_{M+1}} v_n$.

In order to verify the uniqueness of $h = h^+ - h^-$ we assume that $h' \in R_d$ is another polynomial with $\operatorname{supp}(h') \subset Q_M$ such that $v'' = v + h' \cdot f^{(d)}$ satisfies Condition (2) above. We assume without loss of generality that $h_m > h'_m$ for some $m \in Q_M$ and set $g = h - h'$ and

$$w_n = \begin{cases} v''_n & \text{if } n \in Q_M, \\ 2d & \text{otherwise.} \end{cases}$$

Then $w \in \mathcal{R}_\infty$ and $(w + g \cdot f^{(d)})_n = v'_n < 2d$ for every $n \in Q_M$. Since $\operatorname{supp}(g) \subset Q_M$ and $g_n > 0$ for some $n \in Q_M$ this contradicts Proposition 4.1.

Proof of Theorem 5.1. We fix $\varepsilon > 0$ and choose $K$ according to (5.3). Lemma 5.6 and (5.9) show that $X_{f^{(d)}} = \xi_g(\Lambda_{2d}) = \xi_g(\Lambda_{2d}(L+K+1, (2d-1) \cdot (2L+2K+1)^d))$, where

$$\begin{aligned} \Lambda_{2d}(M, q) = \Bigl\{ v \in \ell^\infty(\mathbb{Z}^d,\mathbb{Z}) :\ & v_n < 2d \text{ for every } n \in \mathbb{Z}^d,\ v_n \ge 0 \text{ for every } n \in \mathbb{Z}^d \text{ with } \|n\|_{\max} > M+1, \\ & \sum_{\{n \in \mathbb{Z}^d : \|n\|_{\max} = M+1\}} v_n \ge -q \text{ and } \pi_{Q_M}(v) \in \pi_{Q_M}(\mathcal{R}_\infty) \Bigr\}. \end{aligned} \tag{5.10}$$

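Exactly the same argument as in the proof of Lemma 3.5 shows that $\xi_g(\mathcal{R}_\infty) = X_{f^{(d)}}$. Since $\xi_g(\mathcal{R}_\infty) = X_{f^{(d)}}$ we know that

$$h_{\mathrm{top}}(\sigma_{\mathcal{R}_\infty}) \ge h_{\mathrm{top}}(\alpha_{f^{(d)}}) = \int_0^1 \cdots \int_0^1 \log f^{(d)}(e^{2\pi i s_1}, \dots, e^{2\pi i s_d}) \, ds_1 \cdots ds_d \tag{5.11}$$

(cf. [15] or [21, Theorem 18.1]).

In order to prove the reverse inequality we note that $\xi_g$ is injective on $S^{(Q_L)}_{\mathcal{R}_\infty}(v)$ for every $v \in \mathcal{R}_\infty$ and $L \ge 1$ and that $\xi_g(S^{(Q_L)}_{\mathcal{R}_\infty}(v))$ is a $(Q_{L+K}, 1/4d)$-separated subset of $X_{f^{(d)}}$, by Proposition 4.1 and Lemma 5.2. In particular, if $\bar v \in \mathcal{R}_\infty$ is given by

$$\bar v_n = 2d - 1 \quad \text{for every } n \in \mathbb{Z}^d, \tag{5.12}$$

then $\pi_{Q_L}(S^{(Q_L)}_{\mathcal{R}_\infty}(\bar v)) = \pi_{Q_L}(\mathcal{R}_\infty)$ for every $L \ge 1$.

For every $L \ge 0$ we denote by $n(L+K)$ the maximal size of a $(Q_{L+K}, 1/4d)$-separated set in $X_{f^{(d)}}$. From the definition of topological entropy we obtain that

$$\begin{aligned} h_{\mathrm{top}}(\sigma_{\mathcal{R}_\infty}) &= \lim_{L \to \infty} \frac{1}{|Q_L|} \log \bigl| \pi_{Q_L}(\mathcal{R}_\infty) \bigr| = \lim_{L \to \infty} \frac{1}{|Q_L|} \log \bigl| S^{(Q_L)}_{\mathcal{R}_\infty}(\bar v) \bigr| \\ &= \lim_{L \to \infty} \frac{1}{|Q_L|} \log \bigl| \xi_g(S^{(Q_L)}_{\mathcal{R}_\infty}(\bar v)) \bigr| \le \lim_{L \to \infty} \frac{1}{|Q_L|} \log n(L+K) \\ &= \lim_{L \to \infty} \frac{1}{|Q_{L+K}|} \log n(L+K) = h_{\mathrm{top}}(\alpha_{f^{(d)}}), \end{aligned} \tag{5.13}$$

which completes the proof of the theorem.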

Remark 5.8. The expression (3.4) for the topological entropy of $\sigma_{\mathcal{R}_\infty}$ can be found in [10, p. 56]. By using the fact that $\alpha_{f^{(d)}}$ and $\sigma_{\mathcal{R}_\infty}$ have the same topological entropy one can prove Theorem 5.1 a little more directly: Lemmas 5.3 and 5.4 imply that the restriction of $\alpha$ to the closed, shift-invariant subset $\xi_g(\mathcal{R}_\infty) \subset X_{f^{(d)}}$ has the same topological entropy as $\alpha_{f^{(d)}}$. Since the Haar measure $\lambda_{X_{f^{(d)}}}$ is the unique measure of maximal entropy for $\alpha_{f^{(d)}}$ by [15], $\xi_g(\mathcal{R}_\infty)$ has to coincide with $X_{f^{(d)}}$, as claimed in Theorem 5.1.

Theorem 5.9. For every $w \in \mathcal{R}_\infty$ and $L \ge 1$ we denote by $\nu_L^{(w)}$ the equidistributed probability measure on the set $S^{(Q_L)}_{\mathcal{R}_\infty}(w)$ in (5.2). Fix $w \in \mathcal{R}_\infty$ and let $\mu^{(w)}$ be any limit point of the sequence of probability measures

$$\mu_L^{(w)} = \frac{1}{|Q_L|} \sum_{k \in Q_L} \sigma_*^k \nu_L^{(w)}$$

as $L \to \infty$. Then $\mu^{(w)}$ is a measure of maximal entropy on $\mathcal{R}_\infty$ and $(\xi_g)_* \mu^{(w)} = \lambda_{X_{f^{(d)}}}$ for every $g \in \tilde I_d$. In fact, if $\mu$ is any shift-invariant probability measure of maximal entropy on $\mathcal{R}_\infty$, then $(\xi_g)_* \mu = \lambda_{X_{f^{(d)}}}$ for every $g \in \tilde I_d$.

Proof. We fix $w \in \mathcal{R}_\infty$. Let $L \ge 1$ and let $\tilde\nu_L^{(w)} = (\xi_g)_* \nu_L^{(w)}$ be the equidistributed probability measure on the $(Q_{L+K}, 1/4d)$-separated set $\xi_g(S^{(Q_L)}_{\mathcal{R}_\infty}(w))$ of cardinality $\ge |\pi_{Q_{L-1}}(\mathcal{R}_\infty)|$. We set $\tilde\mu_L^{(w)} = (\xi_g)_* \mu_L^{(w)} = \frac{1}{|Q_L|} \sum_{k \in Q_L} (\alpha^k_{f^{(d)}})_* \tilde\nu_L^{(w)}$. By choosing a suitable subsequence $(L_k, k \ge 1)$ of the natural numbers we may assume that $\lim_{k \to \infty} \mu^{(w)}_{L_k} = \mu^{(w)}$ and $\lim_{k \to \infty} \tilde\mu^{(w)}_{L_k} = \tilde\mu^{(w)} = (\xi_g)_* \mu^{(w)}$. We denote by $\mu' = (\pi_{\{0\}})_* \tilde\mu^{(w)}$ the projection of $\tilde\mu^{(w)}$ onto the zero coordinate in $X_{f^{(d)}}$ and choose a partition $\{I_1, \dots, I_{8d}\}$ of $\mathbb{T}$ into half-open intervals of length $1/8d$ such that the endpoints of these intervals all have $\mu'$-measure zero. For $i = 1, \dots, 8d$ we set $A_i = \{x \in X_{f^{(d)}} : x_0 \in I_i\}$ and observe that $\tilde\mu^{(w)}(\partial A_i) = 0$. We write $\zeta = \{A_1, \dots, A_{8d}\}$ for the resulting partition of $X_{f^{(d)}}$.


For every $L \ge 1$ we set $\zeta_L = \bigvee_{k \in Q_{L+K}} \alpha^{-k}_{f^{(d)}}(\zeta)$. Since each atom of $\zeta_L$ contains at most one atom of $\tilde\nu_L^{(w)}$ (by Lemma 5.4) and all these atoms have equal mass, $H_{\tilde\nu_L^{(w)}}(\zeta_L) = \log |S^{(Q_L)}_{\mathcal{R}_\infty}(w)|$. Exactly the same argument as in the proof of the inequality (∗) in [28, Theorem 8.6] shows that, for every $M, L \ge 1$ with $2M + 2K < L$,

$$\frac{|Q_M|}{|Q_L|} \log \bigl| S^{(Q_L)}_{\mathcal{R}_\infty}(w) \bigr| = H_{\tilde\nu_L^{(w)}}(\zeta_M) \le H_{\tilde\mu_L^{(w)}}(\zeta_M) + \frac{|Q_{M+K}| \cdot (|Q_{L+K}| - |Q_{L-M-K}|)}{|Q_L|} \cdot \log(8d).$$

By setting $L = L_k$ and letting $k \to \infty$ we obtain from (5.13) that

$$|Q_M| \cdot h_{\mathrm{top}}(\alpha_{f^{(d)}}) \le \lim_{k \to \infty} H_{\tilde\mu_{L_k}^{(w)}}(\zeta_M) = H_{\tilde\mu^{(w)}}(\zeta_M)$$

for every $M \ge 1$, and hence that

$$h_{\mathrm{top}}(\alpha_{f^{(d)}}) \le \lim_{M \to \infty} \frac{1}{|Q_{M+K}|} \cdot H_{\tilde\mu^{(w)}}(\zeta_M) = h_{\tilde\mu^{(w)}}(\alpha_{f^{(d)}}).$$

Since $\lambda_{X_{f^{(d)}}}$ is the unique measure of maximal entropy on $X_{f^{(d)}}$, $\tilde\mu^{(w)}$ coincides with $\lambda_{X_{f^{(d)}}}$, and $\mu^{(w)}$ is a measure of maximal entropy on $\mathcal{R}_\infty$.

In order to complete the proof of Theorem 5.9 we assume that $\mu$ is an arbitrary ergodic shift-invariant probability measure with maximal entropy on $\mathcal{R}_\infty^{(\gamma)}$. We let $M \ge 5$, put $F = \pi_{Q_M}(\mathcal{R}_\infty)$ and set, for every $z \in F$, $O_z = \{v \in \mathcal{R}_\infty : \pi_{Q_M}(v) = z\}$. Fix $z \in F$ with $c = \mu(O_z) > 0$. The ergodic theorem guarantees that

$$\lim_{N \to \infty} \frac{1}{|Q_N|} \sum_{m \in Q_N} 1_{O_z}(\sigma^{3Mm} v) = c \tag{5.14}$$

for $\mu$-a.e. $v \in \mathcal{R}_\infty$. Let $z' \in F$ be given by

$$z'_n = \begin{cases} 2d - 1 & \text{if } \|n\|_{\max} = M, \\ z_n & \text{if } n \in Q_{M-1}. \end{cases}$$

We claim that $\mu(O_{z'}) > 0$. In order to see this we assume that $\mu(O_{z'}) = 0$ (which implies, of course, that $z \ne z'$). If $v \in \mathcal{R}_\infty$ is fixed for the moment, and if $S_v = \{n \in \mathbb{Z}^d : \sigma^{3Mn} v \in O_z\}$, then we can replace the coordinates of $\sigma^{3Mm} v$ in $Q_M$ by those of $z'$ for every $m \in S_v$, and we can do so independently at every $m \in S_v$. The resulting points $v'$ will always lie in $\mathcal{R}_\infty$. An elementary entropy argument shows that we could increase the entropy of $\mu$ under the $\mathbb{Z}^d$-action $n \mapsto \sigma^{3Mn}$ by making all these points $v'$ equally likely, which would violate the maximality of the entropy of $\mu$ (a more formal argument should be given in terms of conditional measures).

Exactly the same kind of argument as in the preceding paragraph allows us to conclude that the cylinder sets $O_z$ with $z \in F$ and $z_n = 2d - 1$ for every $n \in Q_M$ with $\|n\|_{\max} = M$, all have equal measure. A slight modification of the proof of the first part of this theorem now shows that $h((\xi_g)_* \mu) = h(\lambda_{X_{f^{(d)}}})$, i.e., that $(\xi_g)_* \mu = \lambda_{X_{f^{(d)}}}$.



5.2. Properties of the maps ξg , g ∈ I˜d . 5.2.1. The ‘group structure’ of R∞ In (3.4) we saw that σR∞ and α f (d) have the same topological entropy. If µ is a shift-invariant measure of maximal entropy on R∞ , then the dynamical system (R∞ , µ, σR∞ ) has a Bernoulli factor of full entropy (cf. [23]). As (X f (d) , λ X f (d) , α f (d) ) is Bernoulli by [20], the full entropy Bernoulli factor of (R∞ , µ, σR∞ ) is measurably conjugate to (X f (d) , λ X f (d) , α f (d) ). In particular, there exists a µ-a.e. defined measurable map φ : R∞ −→ X f (d) with φ∗ µ = λ X f (d) and φ ◦ σR∞ = α f (d) ◦ φ µ-a.e. What distinguishes the maps ξg , g ∈ I˜d , from these abstract factor maps φ : R∞ −→ X f (d) is that the ξg are not only continuous and surjective, but that they also reflect the somewhat elusive group structure of R∞ in the following sense. It is well known that the set R E of recurrent sandpile configurations on a finite set E ⊂ Zd in (4.3) is a group (cf. [8–10]). However, the group operation does not extend in any immediate way to the infinite sandpile model R∞ . Fix g ∈ I˜d and suppose that v, v ∈ R∞ , and that w = v + v ∈ 4d−1 (with coordinate-wise addition). Proposition 5.7 shows that there exists, for every M ≥ 1, an element w (M) ∈ ∞ (Zd , Z) satisfying conditions (1)–(4) there. Since w − w(M) ∈ ( f (d) ) for every M ≥ 1, ξg (w (M) ) = ξg (w) for every M ≥ 1. Exactly as in the proof of the lemma we observe that any coordinate-wise limit w˜ ∈ R∞ of the sequence (w (M) , M ≥ 1) still satisfies that ξg (w) ˜ = ξg (w) = ξg (v) + ξg (v ). The ‘sum’ w˜ of v and v is, of course, not uniquely defined, but any two versions of this sum are identified under ξg . Moreover, if ∼ is the equivalence relation on R∞ defined by v ∼ v if and only if v − v ∈ ker(ξ Id ) = K d (cf. (3.31)), then R∞ /∼ is a compact abelian group isomorphic to X˜ f (d) = X f (d) / X Id (cf. 
Lemma 3.8): if $[v]$ is the equivalence class of $v \in R_\infty$, then the map $\theta_d \circ \xi_{I_d} : R_\infty \to \tilde X_{f^{(d)}}$ in (3.36) sends $[v]$ to $\theta_d \circ \xi_{I_d}(v)$ and maps the group operation $[v] \oplus [v'] := [v + v']$ on $R_\infty/\!\sim$ to that on $\tilde X_{f^{(d)}}$.

5.2.2. The problem of injectivity. In Subsect. 5.2.1 we saw that $R_\infty$ has a natural group structure modulo elements in the kernel of $\xi_g$. Another problem which depends on the intersection of $R_d$ with the cosets of $\ker \xi_{I_d}$ is the question of ‘pulling back’ to $R_\infty$ dynamical properties of $\alpha_{f^{(d)}}$, such as uniqueness or the Bernoulli property of the measure of maximal entropy of $R_\infty$. It is clear that the map $\xi_{I_d}$ (and hence all the maps $\xi_g$, $g \in \tilde I_d$) must be noninjective on $R_\infty$, since these maps are continuous, $R_d$ is zero-dimensional, and the groups $X_{f^{(d)}}$ and $\bar X_{f^{(d)}}$ are connected. The following lemma shows that some of the maps $\xi_g$, $g \in \tilde I_d$, are ‘more injective’ than others and is the reason for determining the ideal $I_d$ precisely in Sect. 2.

Lemma 5.10. Let $g \in \tilde I_d$ and $h \in R_d$. For every $v, w \in R_\infty$ with $\xi_g(w) \in \xi_g(v) + \ker h(\alpha)$, $\xi_{g\cdot h}(v) = \xi_{g\cdot h}(w)$. It follows that

$$|\{w \in R_\infty : \xi_{g\cdot h}(w) = \xi_{g\cdot h}(v)\}| = |\ker h(\alpha_{f^{(d)}})| \tag{5.15}$$

for every $v \in R_\infty$.

Proof. If $x = \xi_g(v)$, $y \in \ker h(\alpha_{f^{(d)}})$ and $w \in R_\infty$ satisfies $\xi_g(w) = x + y$ (cf. Theorem 5.1), then $\xi_{g\cdot h}(w) = h(\alpha)(\xi_g(w)) = h(\alpha)(x + y) = h(\alpha)(x) = \xi_{g\cdot h}(v)$.

6. The Dissipative Sandpile Model

In this section we fix $d \ge 2$ and $\gamma > 2d$, and consider the dissipative sandpile model $R_\infty^{(\gamma)} \subset \Lambda_\gamma$ described in Sect. 4 and investigated in [26,7,16].

6.1. The dissipative harmonic model. Consider the Laurent polynomial $f^{(d,\gamma)} \in R_d$ defined in (4.5) and the corresponding compact abelian group

$$X_{f^{(d,\gamma)}} = \ker f^{(d,\gamma)}(\alpha) = \Bigl\{ x = (x_n)_{n \in \mathbb{Z}^d} \in \mathbb{T}^{\mathbb{Z}^d} : \gamma x_n - \sum_{i=1}^{d} \bigl(x_{n+e^{(i)}} + x_{n-e^{(i)}}\bigr) = 0 \ \text{for every } n \in \mathbb{Z}^d \Bigr\}. \tag{6.1}$$

We write $\alpha_{f^{(d,\gamma)}}$ for the shift-action (3.1) of $\mathbb{Z}^d$ on $X_{f^{(d,\gamma)}} \subset \mathbb{T}^{\mathbb{Z}^d}$.

Lemma 6.1. The shift-action $\alpha_{f^{(d,\gamma)}}$ of $\mathbb{Z}^d$ on $X_{f^{(d,\gamma)}}$ is expansive, i.e., there exists an $\varepsilon > 0$ such that

$$\sup_{n \in \mathbb{Z}^d} |x_n - x'_n| > \varepsilon$$

for every $x, x' \in X_{f^{(d,\gamma)}}$ with $x \ne x'$. The entropy of $\alpha_{f^{(d,\gamma)}}$ is given by

$$h_{\mathrm{top}}(\alpha_{f^{(d,\gamma)}}) = h_{\lambda_{X_{f^{(d,\gamma)}}}}(\alpha_{f^{(d,\gamma)}}) = \int_0^1 \cdots \int_0^1 \log f^{(d,\gamma)}\bigl(e^{2\pi i t_1}, \dots, e^{2\pi i t_d}\bigr)\, dt_1 \cdots dt_d,$$

and the Haar measure $\lambda_{X_{f^{(d,\gamma)}}}$ is the unique shift-invariant measure of maximal entropy on $X_{f^{(d,\gamma)}}$.

Proof. Since $f^{(d,\gamma)}$ has no zeros in $\mathbb{S}^d = \{(z_1, \dots, z_d) \in \mathbb{C}^d : |z_i| = 1 \text{ for } i = 1, \dots, d\}$, $\alpha_{f^{(d,\gamma)}}$ is expansive by [21, Theorem 6.5]. The last two statements follow from [21, Theorems 19.5, 20.8 and 20.15].
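The entropy formula in Lemma 6.1 can be explored numerically. The following is a minimal sketch, assuming that on the torus the Laurent polynomial takes the values $\gamma - 2\sum_{i} \cos(2\pi t_i)$ (the same expression that appears in the denominator of the formula for $w^{(d,\gamma)}$ in Subsect. 6.2). The function name `entropy_integral` and the grid size `n` are illustrative choices, not notation from the paper.

```python
import itertools
import math


def entropy_integral(d, gamma, n=64):
    """Midpoint-rule estimate of the integral over [0, 1]^d of
    log(gamma - 2 * sum_i cos(2*pi*t_i))."""
    if gamma <= 2 * d:
        # For gamma <= 2d the symbol vanishes somewhere on the torus.
        raise ValueError("need gamma > 2d so that the symbol has no zeros on the torus")
    # Values of 2*cos(2*pi*t) at the n midpoints of a uniform grid on [0, 1].
    two_cos = [2.0 * math.cos(2.0 * math.pi * (k + 0.5) / n) for k in range(n)]
    total = 0.0
    for point in itertools.product(two_cos, repeat=d):
        total += math.log(gamma - sum(point))
    return total / n ** d
```

For instance, `entropy_integral(2, 5.0)` estimates the entropy for $d = 2$, $\gamma = 5$. Since the integrand is smooth and periodic exactly when $\gamma > 2d$, a uniform grid converges quickly; the cost grows like `n**d`, so this sketch is only practical for small $d$.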



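6.2. The covering map $\xi^{(\gamma)} : R_\infty^{(\gamma)} \to X_{f^{(d,\gamma)}}$. Since $\alpha_{f^{(d,\gamma)}}$ is expansive and has completely positive entropy, the equation

$$f^{(d,\gamma)} \cdot w = 1 \tag{6.2}$$

has a unique solution $w = w^{(d,\gamma)} \in \ell^1(\mathbb{Z}^d)$, given by

$$w_n^{(d,\gamma)} = \int_0^1 \cdots \int_0^1 \frac{e^{-2\pi i \langle n, t\rangle}}{\gamma - 2 \sum_{i=1}^{d} \cos(2\pi t_i)}\, dt_1 \cdots dt_d,$$

where $t = (t_1, \dots, t_d)$ (cf. (2.5), [14] and [6]). Since $w^{(d,\gamma)} \in \ell^1(\mathbb{Z}^d)$, we can proceed as in (3.15) and define a homomorphism $\xi^{(\gamma)} : R_\infty^{(\gamma)} \to X_{f^{(d,\gamma)}}$ by

$$\bar\xi^{(\gamma)}(v)_n = (w^{(d,\gamma)} \cdot v)_n = \sum_{k \in \mathbb{Z}^d} v_{n-k}\, w_k^{(d,\gamma)}$$

for every $v \in R_\infty^{(\gamma)}$, and by $\xi^{(\gamma)} = \rho \circ \bar\xi^{(\gamma)}$.

Proposition 6.2. The map $\xi^{(\gamma)}$ has the following properties:

(a) $\xi^{(\gamma)}(R_\infty^{(\gamma)}) = X_{f^{(d,\gamma)}}$;

(b) For $v, v' \in R_\infty^{(\gamma)}$, $\xi^{(\gamma)}(v) = \xi^{(\gamma)}(v')$ if and only if

$$v = v' + f^{(d,\gamma)} \cdot h \tag{6.3}$$

for some $h \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$;

(c) $\xi^{(\gamma)}(v) \ne \xi^{(\gamma)}(v')$ for all distinct $v, v' \in R_\infty^{(\gamma)}$ with $v - v' \in R_d$.

Furthermore, the topological entropies of the shift-actions $\alpha_{f^{(d,\gamma)}}$ on $X_{f^{(d,\gamma)}}$ and $\sigma_{R_\infty^{(\gamma)}}$ on $R_\infty^{(\gamma)}$ coincide.

Proof. The proofs are completely analogous to (but simpler than) those of the corresponding results in the critical case.

Corollary 6.3. For every $v \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ there exists an $h \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ such that $w = v + f^{(d,\gamma)} \cdot h \in R_\infty^{(\gamma)}$.

Proof. This follows from Proposition 6.2 (a)–(b).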
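The integral formula for the solution $w^{(d,\gamma)}$ of (6.2) can be illustrated on a finite grid. The sketch below replaces the integral over $[0,1]^d$ by a Riemann sum on a uniform grid, which amounts to solving the same equation on a periodic box rather than on $\mathbb{Z}^d$; this periodic stand-in, and the names `fundamental_solution`, `apply_f` and `grid`, are assumptions made for illustration and do not come from the paper.

```python
import itertools
import math


def fundamental_solution(n, gamma, grid=48):
    """Riemann-sum approximation of w_n on a uniform grid with `grid` points per axis.

    `n` is a tuple of d integers; the symbol is gamma - 2 * sum_i cos(2*pi*t_i).
    """
    d = len(n)
    if gamma <= 2 * d:
        raise ValueError("need gamma > 2d")
    total = 0.0
    for k in itertools.product(range(grid), repeat=d):
        t = [k_i / grid for k_i in k]
        symbol = gamma - 2.0 * sum(math.cos(2.0 * math.pi * t_i) for t_i in t)
        # The imaginary part of exp(-2*pi*i*<n, t>) cancels under t -> -t,
        # so only the cosine contributes.
        phase = 2.0 * math.pi * sum(n_i * t_i for n_i, t_i in zip(n, t))
        total += math.cos(phase) / symbol
    return total / grid ** d


def apply_f(n, gamma, grid=48):
    """Evaluate gamma * w_n - sum_i (w_{n+e_i} + w_{n-e_i}) for the approximation above."""
    value = gamma * fundamental_solution(n, gamma, grid)
    for i in range(len(n)):
        for step in (1, -1):
            neighbour = list(n)
            neighbour[i] += step
            value -= fundamental_solution(tuple(neighbour), gamma, grid)
    return value
```

With these definitions, `apply_f((0, 0), 5.0)` should be close to 1 and `apply_f(n, 5.0)` close to 0 at other sites inside the box, mirroring $f^{(d,\gamma)} \cdot w = 1$. Because $w^{(d,\gamma)}$ is summable, the periodic approximation is expected to differ from the true coefficients only slightly for moderate grid sizes.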

Remark 6.4. The element $w$ in Corollary 6.3 can be constructed explicitly by using the method described in the proofs of Lemma 3.5, Theorem 4.1 and Subsect. 5.2.1.

In [16], two elements $v, v' \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ are called equivalent (denoted by $v \sim v'$) if they satisfy (6.3) for some $h \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ (see footnote 5). We write $[v] \subset \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ for the equivalence class of $v$ in this relation. The following theorem summarizes the results of [16].

Footnote 5: Definition 3.2 in [16, p. 404] contains a misprint: the requirement that $h \in \ell^\infty(\mathbb{Z}^d, \mathbb{Z})$ is omitted, although it is used subsequently.


Theorem 6.5. The quotient $R_\infty^{(\gamma)}/\!\sim$ is a compact space. Moreover, $(R_\infty^{(\gamma)}/\!\sim, \oplus)$ is a compact abelian group, where $[y] \oplus [\tilde y] = [y + \tilde y]$. Furthermore, there exists a shift-invariant measure of maximal entropy on $R_\infty^{(\gamma)}$, denoted by $\mu$, such that

$$\mu\bigl(\{ y \in R_\infty^{(\gamma)} : [y] \cap R_\infty^{(\gamma)} \text{ is a singleton} \}\bigr) = 1. \tag{6.4}$$

Proof. The first two statements are the results of [16, Prop. 3.2 and Th. 3.1]. Furthermore, the main result of [16], Theorem 3.2, states that, if $\mu_V$ is the uniform measure on $\pi_V R^{(\gamma)}_{Q(N)}$, where $V \subset \mathbb{Z}^d$ is a rectangle, then the set of limit points of sequences $\mu_V$, $V \uparrow \mathbb{Z}^d$, is a singleton. Denote by $\mu$ this unique limit point. We claim that $\mu$ is a shift-invariant measure on $R_\infty^{(\gamma)}$, which moreover has maximal entropy.

The invariance follows immediately from the uniqueness of the weak limit point. Denote by $\sigma$ the $\mathbb{Z}^d$-shift action on $R_\infty^{(\gamma)}$. For every Borel set $A \subseteq R_\infty^{(\gamma)}$, every $n \in \mathbb{Z}^d$, and any sequence of rectangles $E_k \uparrow \mathbb{Z}^d$:

$$\mu(\sigma^n A) = \lim_{k \to \infty} \mu_{E_k}(\sigma^n A) = \lim_{k \to \infty} \mu_{E_k + n}(A) = \mu(A).$$

Using the methods of [28, Chap. 8] (see also the proof of Theorem 5.9 above), one can show that

$$h_\mu(\sigma_{R_\infty^{(\gamma)}}) = \lim_{E \to \mathbb{Z}^d} -\frac{1}{|E|} \sum_{y_E \in R_E^{(\gamma)}} \mu([y_E]) \log \mu([y_E]) = \lim_{E \to \mathbb{Z}^d} \frac{1}{|E|} \log |R_E^{(\gamma)}| = h_{\mathrm{top}}(\sigma_{R_\infty^{(\gamma)}}),$$

where $\sigma_{R_\infty^{(\gamma)}}$ is the restriction of $\sigma$ to $R_\infty^{(\gamma)}$. Finally, (6.4) is the result of [16, Prop. 3.3].

We are now able to extend the results of [16] further.

Theorem 6.6. Let $d \ge 2$, $\gamma > 2d$, and let $R_\infty^{(\gamma)}$ be the dissipative sandpile model (4.4).

(i) The set $C = \{ y \in R_\infty^{(\gamma)} : [y] \cap R_\infty^{(\gamma)} \text{ is a singleton} \}$ is a dense $G_\delta$-subset of $R_\infty^{(\gamma)}$;

(ii) The group $(R_\infty^{(\gamma)}/\!\sim, \oplus)$ is isomorphic to $X_{f^{(d,\gamma)}}$;

(iii) The subshift $R_\infty^{(\gamma)}$ admits a unique measure $\mu$ of maximal entropy;

(iv) The shift action of $\mathbb{Z}^d$ on $(R_\infty^{(\gamma)}, \mu)$ is Bernoulli.

Proof. The first statement is proved in Proposition 4.4. Using the properties of $\xi^{(\gamma)} : R_\infty^{(\gamma)} \to X_{f^{(d,\gamma)}}$ (Proposition 6.2), the second statement is immediate. The same proof as in Theorem 5.9 shows that $h_{\mathrm{top}}(\sigma_{R_\infty^{(\gamma)}}) = h_{\mathrm{top}}(X_{f^{(d,\gamma)}})$, and that $\xi^{(\gamma)}_* \nu = \lambda_{X_{f^{(d,\gamma)}}}$ for every shift-invariant probability measure $\nu$ of maximal entropy on $R_\infty^{(\gamma)}$.

Since the restriction of the continuous map $\xi^{(\gamma)} : R_\infty^{(\gamma)} \to X_{f^{(d,\gamma)}}$ to $C$ is injective, $\xi^{(\gamma)}(C)$ is a Borel subset of $X_{f^{(d,\gamma)}}$ with full Haar measure.



(γ )

If ν is a shift-invariant probability measure of maximal entropy on R∞ , then ξ∗ ν = λ X f (d,γ ) . Hence ν(C) = 1, and the injectiveness of ξ (γ ) on C implies that ν = µ, where µ is the measure appearing in Theorem 6.5. This proves (iii). (γ ) The Bernoulli property of the shift-action of Zd on (R∞ , µ) follows from the corresponding property of α f (d,γ ) on (X f (d,γ ) , λ X f (d,γ ) ) proved in [20], since the two systems are measurably conjugate. 7. Conclusions and Final Remarks (1) In [11], toppling invariants have been constructed for the abelian sandpile model in finite volume. These are functions which are linear in height variables and are invariant under the topplings. It is also obvious that the definition [11, Eq. (3.3)] cannot be extended to the infinite volume. The underlying problem (non-summability of the lattice potential function) is precisely the problem overcome by the introduction of 1 -homoclinic points {v = g · w (d) : g ∈ Id }. The inevitable drawback is a larger kernel ξg f (d) · ∞ (Zd , Z). Nevertheless, we are tempted to conjecture that for d ≥ 2, the set {v ∈ R∞ : there exists v˜ ∈ R∞ : v˜ = v and ξ Id (v) = ξ Id (v)} ˜ has measure 0 with respect to any measure of maximal entropy. As in the dissipative case, this would imply that R∞ carries a unique measure of maximal entropy. (2) In the present paper we did not address the properties of the infinite volume sandpile dynamics, see e.g. [13]. We note that the sandpile dynamics takes a particularly simple form in the image space, the harmonic model X f (d) or its factor group X˜ f (d) . Namely, given any initial configuration v, suppose one grain of sand is added at site n. For every g ∈ I˜d = Id ( f (d) ), ξg (v + δ (n) ) = ξg (v) + ξg (δ (n) ) = ξg (v) + ρ(α −n z (g) ), where δ (n) = σ −n δ (0) (cf. Footnote 2) and z (g) = ρ(g ∗ · w (d) ) ∈ (1) α (X f (d) ) is the homoclinic point appearing in (3.14). 
It might be interesting to understand whether any statistical properties of the harmonic model can be used to draw any conclusions on the distribution of avalanches and other dynamically relevant notions in R∞ . Finally, as already mentioned in the Introduction, the group Gd = Rd /( f (d) ) is the appropriate infinite analogue of the groups of addition operators in finite volumes: on the sandpile model, Gd can be viewed as the abelian group generated by the elementary addition operators {an : n ∈ Zd } satisfying the basic relations an2d = ak k:k−nmax =1

for all n ∈ Zd . These addition operators are well-defined on R E , E Zd , but for the infinite volume limit R∞ these operators are not defined everywhere. Under the maps ξg : R∞ −→ X f (d) , g ∈ Id , or ξ Id : R∞ −→ X˜ f (d) = X f (d) / X Id , the addition operator an is sent to addition of the homoclinic points ξg (δ (n) ) = ρ(g ∗ · w (d) ) = g(α)(x ) (on X f (d) ) and ξ Id (δ (n) ) (on X˜ f (d) ), respectively. These additions are defined everywhere on X f (d) and X˜ f (d) , and the isomorphism between X˜ f (d) and R∞ /∼ implies that the addition operators an , n ∈ Zd , are defined everywhere on R∞ /∼ (cf. Subsect. 5.2.1).



Acknowledgement. E.V. would like to acknowledge the hospitality of the Erwin Schrödinger Institute (Vienna), where part of this work was done. E.V. is also grateful to Frank Redig, Marius van der Put and Thomas Tsang for illuminating discussions. K.S. would like to thank EURANDOM (Eindhoven) and MSRI (Berkeley), for hospitality and support during part of this work. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

1. Athreya, S.R., Járai, A.A.: Infinite volume limit for the stationary distribution of abelian sandpile models. Commun. Math. Phys. 249(1), 197–213 (2004)
2. Athreya, S.R., Járai, A.A.: Erratum: Infinite volume limit for the stationary distribution of abelian sandpile models. Commun. Math. Phys. 264(3), 843 (2006)
3. Bak, P., Tang, C., Wiesenfeld, K.: Self-organized criticality: An explanation of the 1/f noise. Phys. Rev. Lett. 59, 381–384 (1987)
4. Bak, P., Tang, C., Wiesenfeld, K.: Self-organized criticality. Phys. Rev. A 38, 364–374 (1988)
5. Burton, R., Pemantle, R.: Local characteristics, entropy and limit theorems for spanning trees and domino tilings via transfer-impedances. Ann. Probab. 21, 1329–1371 (1993)
6. de Boor, C., Höllig, K., Riemenschneider, S.: Fundamental solutions for multivariate difference equations. Amer. J. Math. 111, 403–415 (1989)
7. Daerden, F., Vanderzande, C.: Dissipative abelian sandpiles and random walks. Phys. Rev. E 63, 30301–30304 (2001)
8. Dhar, D.: Self organized critical state of Sandpile Automaton models. Phys. Rev. Lett. 64, 1613–1616 (1990)
9. Dhar, D.: The abelian sandpiles and related models. Phys. A 263, 4–25 (1999)
10. Dhar, D.: Theoretical studies of self-organized criticality. Phys. A 369, 29–70 (2006)
11. Dhar, D., Ruelle, P., Sen, S., Verma, D.-N.: Algebraic aspects of abelian sandpile models. J. Phys. A 28, 805–831 (1995)
12. Fukai, Y., Uchiyama, K.: Potential kernel for the two-dimensional random walk. Ann. Probab. 24, 1979–1992 (1996)
13. Járai, A., Redig, F.: Infinite volume limit of the Abelian sandpile model in dimensions d ≥ 3. Probab. Theor. Relat. Fields 141, 181–212 (2008)
14. Lind, D., Schmidt, K.: Homoclinic points of algebraic Z^d-actions. J. Amer. Math. Soc. 12, 953–980 (1999)
15. Lind, D., Schmidt, K., Ward, T.: Mahler measure and entropy for commuting automorphisms of compact groups. Invent. Math. 101, 593–629 (1990)
16. Maes, C., Redig, F., Saada, E.: The infinite volume limit of dissipative abelian sandpiles. Commun. Math. Phys. 244, 395–417 (2004)
17. Pemantle, R.: Choosing a spanning tree for the integer lattice uniformly. Ann. Probab. 19, 1559–1574 (1991)
18. van der Put, M., Tsang, F.L.: Discrete Systems and Abelian Sandpiles. J. Alg. 322(1), 153–161 (2009)
19. Redig, F.: Mathematical aspects of the abelian sandpile model. Mathematical Statistical Physics, Volume Session LXXXIII: Lecture Notes of the Les Houches Summer School 2005 (Les Houches), Bovier, A., Dunlop, F., den Hollander, F., van Enter, A., Dalibard, J. (eds.), Amsterdam: Elsevier, (2006), pp. 657–728
20. Rudolph, D.J., Schmidt, K.: Almost block independence and Bernoullicity of Z^d-actions by automorphisms of compact groups. Invent. Math. 120, 455–488 (1995)
21. Schmidt, K.: Dynamical Systems of Algebraic Origin. Basel-Berlin-Boston: Birkhäuser Verlag, 1995
22. Sheffield, S.: Uniqueness of maximal entropy measure on essential spanning forests. Ann. Probab. 34, 857–864 (2006)
23. Sinai, Ya.G.: On a weak isomorphism of transformations with invariant measure. Mat. Sb. 63(105), 23–42 (1964)
24. Solomyak, R.: On coincidence of entropies for two classes of dynamical systems. Ergod. Th. & Dynam. Sys. 18, 731–738 (1998)
25. Spitzer, F.: Principles of random walks. New York: van Nostrand Reinhold, 1964
26. Tsuchiya, T., Tomori, M.: Proof of breaking of self-organized criticality in a nonconservative abelian sandpile model. Phys. Rev. E 61, 1183–1188 (2000)
27. Uchiyama, K.: Green’s function for random walks on Z^N. Proc. London Math. Soc. 77, 215–240 (1998)
28. Walters, P.: An introduction to ergodic theory. Graduate Texts in Mathematics, Vol. 79, Berlin-Heidelberg-New York: Springer Verlag, 1982

Communicated by G. Gallavotti




Quantum Inequalities from Operator Product Expansions

Henning Bostelmann (1,2), Christopher J. Fewster (2)

1 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy
2 Department of Mathematics, University of York, Heslington, York, YO10 5DD, United Kingdom. E-mail: [email protected]; [email protected]

Received: 29 January 2009 / Accepted: 3 April 2009
Published online: 9 July 2009 – © Springer-Verlag 2009

Dedicated to the memory of Bernd Kuckert Abstract: Quantum inequalities are lower bounds for local averages of quantum observables that have positive classical counterparts, such as the energy density or the Wick square. We establish such inequalities in general (possibly interacting) quantum field theories on Minkowski space, using nonperturbative techniques. Our main tool is a rigorous version of the operator product expansion. 1. Introduction The principal qualitative difference between classical and quantum physics lies in the fundamentally unsharp nature of the latter, quantitatively expressed by the uncertainty principle. This distinction becomes particularly acute when one seeks analogues in quantum theory of quantities that are classically positive. In quantum mechanics, for example, one replaces a probability distribution over classical phase space by the Wigner function, which is pointwise positive only for Gaussian states [1]. Consequently, Weyl quantization of classically positive observables does not generally yield positive operators. Similarly, a positive (local) quadratic form in a classical field and its derivatives, such as the energy density of a free minimally coupled scalar field, would not be expected to have a positive analogue in quantum field theory, owing to the subtractions necessary to renormalize products of fields at a point. Nonetheless, positivity is not completely destroyed in quantization. The sharp Gårding inequalities [2] show that classically positive symbols have Weyl quantizations that are positive modulo corrections of lower order; that is, operators corresponding to a lower rate of growth in momentum. The aim of this paper is to establish analogous results for quantum field theory in a model independent and nonperturbative setting. 
Footnote: Supported by the EU network “Noncommutative Geometry” (MRTN-CT-2006-0031962).

The key to our approach is a recently-developed microscopic phase space condition [3] that controls the degrees of freedom available to the theory at small scales and bounded energy, and
guarantees the existence of a rigorous operator product expansion (OPE) [4]. In any theory obeying this condition (along with other standard criteria set out in Sect. 3) we identify a class of ‘classically positive’ operator products and show how this classical positivity is reflected in estimates on suitable smearings of the composite fields appearing in the corresponding OPEs. If there is a distinguished normal product associated with the underlying classically positive expression the picture is closely analogous to that emerging from the Gårding inequalities: suitable smearings of the normal product are positive modulo corrections of a lower order. As we will describe, our results significantly generalize the quantum (energy) inequalities, developed over recent years, that provide lower bounds on smearings of quadratic normal ordered quantities in free field theories. In the following subsections, we will describe the background and motivation for our study. 1.1. Quantum inequalities. It has been known for many years that expectation values of quantities such as the Wick square or energy density of a free scalar field may assume negative values and are pointwise unbounded from below as the quantum state is varied. Indeed, no local observable (other than the zero operator) can be both positive and have a vanishing vacuum expectation value [5]. Thirty years ago, Ford made the key observation that, as unrestricted negative energy densities or fluxes could produce macroscopic violations of the second law of thermodynamics, it was to be expected that QFT itself places strict limits on such departures from positivity [6]. Subsequently, Ford and Roman were able to derive lower bounds, called quantum inequalities (QIs), on averaged energy densities for scalar fields in Minkowski space [7–9]; these results were generalized to static curved spacetimes by Pfenning and Ford [10]. 
In the results just mentioned, the averaging is performed along a timelike geodesic with respect to a Lorentzian weight. With Eveson, one of us (CJF) obtained similar results for general weight functions [11]. As an example, the renormalized energy density $:\!\rho\!:$ of the field of mass $m$ in four-dimensional Minkowski space obeys the inequality

$$\int dt\, \omega(:\!\rho\!:(t, 0))\, |g(t)|^2 \ \ge\ -Q[g] := -\frac{1}{16\pi^3} \int_m^\infty du\, u^4\, |\tilde g(u)|^2 \tag{1.1}$$

for any $g \in \mathcal{D}(\mathbb{R})$ and all Hadamard states $\omega$ (this is slightly weaker than the bound of [11]). Here $\tilde g$ denotes the Fourier transform. Similar bounds are obeyed by any classically positive field of form $\sum_i :\!(P_i \phi)^2\!:$, where the $P_i$ are partial differential operators with smooth real coefficients. We will understand the term ‘quantum inequality’ to apply to any bound of this type, and not just those relating to the energy density (for which the more specific term ‘quantum energy inequality’ (QEI) is also used). The basic technique of [11] generalizes straightforwardly to static spacetimes [12] and the electromagnetic field [13]. It also underlies the general and rigorous results of [14], which give QIs for averaging with arbitrary weights along arbitrary timelike curves in arbitrary globally hyperbolic spacetimes, valid for all Hadamard states. (The bound in [14] is expressed using a reference state; see [15] for analogous results with a purely local geometric bound.) Similar results hold for spin-1 fields [16]. We note that averaging in timelike directions is essential for establishing inequalities; while averaging over spacetime volumes also yields lower bounds (see, e.g., [15]), purely spatial [17] or lightlike [18] averaged energy densities do not generally obey quantum inequalities. An important feature of the lower bound in (1.1) is that it is independent of the state $\omega$, and can be rewritten as an operator inequality $:\!\rho\!:(|g|^2) \ge -Q(g)\mathbb{1}$. One cannot


expect bounds of this type for general interacting theories [19] (although they do hold for conformal field theories in two dimensions [20]; see also [21,22] for precursors). Indeed, the nonminimally coupled scalar field provides an example of a free field theory in which averaged energy densities are unbounded from below [23]. The best that can be expected, in general, is an inequality of the form :ρ: (|g|2 ) ≥ −Q(g), where Q(g) is now permitted to be an operator. As noted in [24], this would be a rather empty notion without some constraints on Q(g) (for example, Q(g) = − :ρ : (|g|2 ) gives a trivial inequality of this type). To qualify as a nontrivial inequality, Q should be of ‘lower order’ than :ρ: in a defined sense. For example, the nonminimally coupled scalar field obeys bounds of the form :ρ: (|g|2 ) ≥ −Q1 (g)1 + 2ξ :φ 2: (g˙ 2 )

(1.2)

in four-dimensional Minkowski space for coupling ξ ∈ [0, 1/4] [23]. Crucially, the right-hand side is bounded relative to (1 + H ) p for any p > 2, while the left-hand side is not bounded relative to any (1 + H )q with q < 3, where H is the Hamiltonian. In the present paper we will weaken the criterion of nontriviality slightly owing to the approximate nature of OPEs. As we explain in outline in Sect. 2 and in detail in Sect. 6, we permit bounds containing a remainder term that is of higher order in energetic terms than the field of interest, but which is vanishing in the small distance limit. All the results mentioned so far rely on positivity of an underlying classical expression, namely, a sum of squares of fields and their derivatives; this is also the focus of the present work. However, it is important to recall that the energy density of a Dirac field is not expressed in this way; accordingly different techniques are required to obtain quantum energy inequalities in this case (see [25–28] for spin-1/2 and [29,30] for spin-3/2).
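To get a feel for the size of the state-independent bound (1.1), one can evaluate $Q[g]$ numerically. The sketch below makes two assumptions that are not fixed by the text: the Fourier convention $\tilde g(u) = \int dt\, e^{-iut} g(t)$, and a Gaussian weight $g(t) = e^{-t^2/(2\tau^2)}$, which is not compactly supported and is used only because its transform is known in closed form. The names `gaussian_ft_squared` and `qi_bound` are illustrative.

```python
import math


def gaussian_ft_squared(u, tau):
    """|g~(u)|^2 for g(t) = exp(-t^2 / (2 tau^2)), assuming g~(u) = int exp(-i u t) g(t) dt."""
    return 2.0 * math.pi * tau ** 2 * math.exp(-(tau * u) ** 2)


def qi_bound(m, tau, steps=4000):
    """Simpson-rule estimate of Q[g] = (1 / (16 pi^3)) * int_m^infinity u^4 |g~(u)|^2 du."""
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    upper = m + 12.0 / tau  # the Gaussian factor is below exp(-144) beyond this point
    h = (upper - m) / steps
    total = 0.0
    for k in range(steps + 1):
        u = m + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * u ** 4 * gaussian_ft_squared(u, tau)
    return total * h / 3.0 / (16.0 * math.pi ** 3)
```

Under these assumptions the massless case can be integrated by hand, giving $Q[g] = 3/(64\,\pi^{3/2}\tau^3)$, so the bound grows like $\tau^{-3}$ as the sampling time shrinks, and a positive mass can only make it smaller.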

1.2. Perturbative versus nonperturbative approaches. While QIs were first studied for free fields on Minkowski space, it is now known, as mentioned above, that the concept is compatible at least with some simple types of interaction, specifically the coupling to an external gravitational field and those in conformal field theories. However, on the technical side, the existing results typically rely on the rather simple structure of linear quantum fields fulfilling c-number commutation relations. For dealing with general, possibly self-interacting quantum fields, this is far too restrictive. Instead, our aim here is to derive inequalities from general principles of quantum field theory that are not restricted to linear fields.

A fully rigorous construction of self-interacting quantum field theories remains a challenge, and has to date been completed in low-dimensional models only [31–33]. In physical spacetime, interacting theories have generally been established in a perturbative setting, usually without any control on the convergence of the perturbation series. It would therefore seem natural to investigate QIs in a perturbative context. However, severe conceptual difficulties arise here. In order to give any reasonable meaning to quantum inequalities in perturbation theory, we need to determine when a formal power series, say $P[g] = \sum_{k=0}^{\infty} c_k g^k$ with $c_k \in \mathbb{C}$, and with the formal variable $g$ being interpreted as a “coupling constant”, should be considered positive. Understanding the set of formal power series as a ∗-algebra, the natural notion of positivity is as follows [34]:

$$P \text{ is considered positive if and only if } P[g] = Q^*[g]\, Q[g] \text{ for some formal power series } Q. \tag{1.3}$$

It turns out that this condition is equivalent to the following one:

$$P[g] = g^{2n} \sum_{k=0}^{\infty} d_k g^k \quad \text{with } n \in \mathbb{N}_0,\ d_k \in \mathbb{R},\ d_0 > 0. \tag{1.4}$$
(Here (1.3) ⇒ (1.4) is immediate; the converse follows by inserting $x = d_0^{-1} g^{-2n} P[g] - 1$ into the power series of $\sqrt{1 + x}$ around $x = 0$.) Now Eq. (1.4) shows that this notion of positivity is not useful in our context, since it roughly says that positivity of $P$ is determined by its lowest-order coefficient. (See [35] for a slight variant.) The order-0 coefficient however is supposed to be the contribution from free field theory. So, with this definition, QIs would hold at finite coupling if and only if they hold at coupling $g = 0$; the effects of interaction on inequalities cannot be captured in this approach. Let us illustrate these difficulties in a simple example: should one consider the following formal power series positive?

$$P[g] = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, g^{2k}. \tag{1.5}$$
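The step from (1.4) back to (1.3) can be made concrete for the series (1.5). The sketch below computes a formal square root order by order, assuming real coefficients and a real formal variable, so that $Q^* = Q$ and (1.3) reduces to $P = Q^2$. The helper names (`formal_sqrt`, `multiply`, `cosine_series`, `evaluate`) are illustrative and not taken from the paper.

```python
import math


def formal_sqrt(d):
    """Coefficients q_k of the formal series Q with Q*Q = sum_k d_k g^k, given d_0 > 0."""
    if d[0] <= 0:
        raise ValueError("the lowest-order coefficient must be positive")
    q = [math.sqrt(d[0])]
    for k in range(1, len(d)):
        # Order k of Q*Q is 2*q_0*q_k plus products of lower-order coefficients.
        cross_terms = sum(q[j] * q[k - j] for j in range(1, k))
        q.append((d[k] - cross_terms) / (2.0 * q[0]))
    return q


def multiply(a, b):
    """Cauchy product of two truncated series, kept to the common truncation order."""
    order = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(order)]


def cosine_series(order):
    """Coefficients of (1.5): (-1)^k / (2k)! in front of g^(2k), zero for odd powers."""
    return [0.0 if k % 2 else (-1.0) ** (k // 2) / math.factorial(k) for k in range(order)]


def evaluate(coeffs, g):
    """Value of the truncated series at a numerical coupling g."""
    return sum(c * g ** k for k, c in enumerate(coeffs))
```

Here `formal_sqrt(cosine_series(20))` exists to every order, so (1.5) is positive in the formal sense, while `evaluate(cosine_series(20), 2.0)` is negative. The recursion only ever needs $d_0 > 0$, which is the content of the remark that formal positivity sees nothing beyond the lowest-order coefficient.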

Considering $P$ as a convergent series, it would be positive for small $g$, but not for all $g$. Forgetting all convergence properties, the only information that remains is positivity at $g = 0$, i.e., of the zero-order coefficient. The question of interest, however, would be whether the physical value of $g$ falls into the convergence radius of $\sqrt{P}$; this question is not accessible in formal perturbation theory. It is therefore necessary to conduct our investigation in a nonperturbative formulation of quantum field theory, such as the Wightman setting [36] or the C*-algebraic formulation of Haag and Kastler [37]. (We shall actually use a combination of both; the technical details will be recalled in Sect. 3.) This is, in a way, a very strong assumption to start with, since we assume that our QFT models have been fully constructed and are under complete topological control. Indeed, as mentioned above, the rigorous construction of interacting models in physical space-time remains an open challenge. The virtues of our axiomatic approach, however, are of a different nature: within the framework of algebraic quantum field theory, we can formulate physically motivated, qualitative properties of quantum field theories, which can explicitly be verified in simple models such as free field theory, but which appear general enough to be postulated for the interacting situation. We can then show how observable consequences, such as quantum inequalities, follow from these postulated properties.

1.3. Phase space conditions. The specific qualitative properties we will employ are known as phase space conditions. Semi-classical considerations (originating with Bohr and Sommerfeld) suggest that only finitely many independent states (or, dually, observables) are required to describe a quantum system which is restricted to a finite volume in phase space, e.g., by cut-offs in configuration space and energy.
In quantum field theory, this picture can certainly persist only qualitatively and in an approximate sense. However, it is possible to give a precise meaning to the aforementioned concepts, expressed as the compactness or nuclearity of certain maps; see e.g. [38–40]. These phase space conditions have physically interesting consequences: for example, they imply the existence of thermal equilibrium states [41] and are important for the particle interpretation of quantum field theories [42].


The role of phase space conditions for QIs has been partially investigated before. Even in the free field situation described above, one may see the need for some restrictions on the phase space behaviour of the theory [43]: instead of one field of mass $m$, consider an infinite number of fields with masses $m_j$ (for simplicity, in four-dimensional Minkowski space). The total energy density will obey a QI,

$$\int dt\, \omega(:\!\rho\!:(t, 0))\, |g(t)|^2 \ \ge\ -\frac{1}{16\pi^3} \int_0^\infty du\, u^4\, N(u)\, |\tilde g(u)|^2, \tag{1.6}$$

where

$$N(u) = \sum_j \vartheta(u - m_j) \tag{1.7}$$

counts the number of species with masses below energy u. If N grows no faster than polynomially with u, the lower bound is finite for all g ∈ C0∞ (R); the same condition is known to guarantee that this theory obeys nuclearity in the sense of Buchholz and Wichmann [39]. Other ideas concerning the relationship between QEIs and nuclearity conditions are discussed in [44], while connections with thermodynamic stability are described in [45,46]. However, the results presented here are the first in which QIs have been derived as a consequence of phase space criteria. For our purposes, we will use a microscopic phase space condition recently introduced by one of us (HB) in [3]; we shall recall its formulation and consequences in Sect. 3. Compared with other similar conditions, it is specifically sensitive in the short-distance regime, the realm which is of most interest for QIs. Indeed, one heuristically expects [47] that at short distances and finite energies, the theory may be well-approximated in terms of finitely many observables corresponding to pointlike quantum fields. This approximation of bounded observables by quantum fields can indeed be made precise [3] and plays a central role in our approach. Its use is twofold. First, it tells us how our primary objects—local algebras of bounded operators—relate to the quantum fields for which inequalities are formulated. Second, it serves to establish an additional structure for the quantum fields, namely a rigorous version of the operator product expansion [4]. We can understand this OPE, which describes the “structure constants” of the “improper algebra” of quantum fields, as containing all relevant information about the interaction, and in this sense as a replacement for the Lagrangian [48]. In fact, it is the OPE from which our inequalities will be computed. 
In particular, the OPE allows us to generalize the notion of normal ordering that has a key role for QIs of linear fields, replacing it with normal products in the sense of Zimmermann [49]. The remainder of the paper is organized as follows. We start with a non-technical account of our main methods and results in Sect. 2. Then, in Sect. 3, we introduce the framework of nonperturbative quantum field theory that we work in, including the phase space condition mentioned above. Section 4 presents some technical preliminaries from distribution theory. In Sect. 5, we establish the rigorous operator product expansion in the variant that we require. This expansion will be the base of our quantum inequalities, derived in Sect. 6. Dilation covariance as a special case is covered in Sect. 7. We end with a brief outlook in Sect. 8. 2. Overview We now give a non-technical overview of our main techniques and results, postponing rigorous arguments to later sections. Throughout, we work in Minkowski space of



dimension 2 + 1 or more (possible generalizations are discussed in Sect. 8). For simplicity, we shall always pick a fixed Lorentz frame, and hence a fixed time axis; all quantum fields $\phi(t) = \phi(t, \mathbf{x} = 0)$ will be restricted to this time axis, and smeared expressions $\phi(f) = \int dt\, f(t)\,\phi(t)$ will refer to one-dimensional integration only. This is sufficient for regularizing Wightman fields [50]; due to the symmetry properties of Minkowski space, it covers the essential features of the inequalities we wish to consider.

To illustrate our approach we begin by sketching the derivation of a QI for the Wick square of the free real scalar field, essentially following the argument of [14] but in a form which is amenable to our generalization. We will then indicate which changes are necessary to deal with the general situation. Accordingly, let $\phi$ denote the free field and let $\sigma$ be a normal state in the vacuum sector with sufficiently regular high-energy behaviour such that the expectation values in the following are finite. The distributional integration kernel

$$F(t,t') = \sigma(\phi(t)\phi(t')) \tag{2.1}$$

is positive-definite, in the sense that for any test function $g$, we have

$$\int dt\,dt'\, F(t,t')\,\overline{g(t)}\,g(t') = \sigma(\phi(g)^*\phi(g)) \ge 0. \tag{2.2}$$

Then $F(t,t')/i\pi(t-t'-i0)$ is also positive-definite; namely, by Fourier analysis,

$$\int dt\,dt'\, F(t,t')\,\frac{\overline{g(t)}\,g(t')}{i\pi(t-t'-i0)} = \int dt\,dt'\, F(t,t')\,\overline{g(t)}\,g(t') \int \frac{dp}{\pi}\,\vartheta(p)\,e^{-ip(t-t')} = \int_0^\infty \frac{dp}{\pi} \int dt\,dt'\, F(t,t')\,\overline{e^{ipt}g(t)}\;e^{ipt'}g(t') \ge 0. \tag{2.3}$$
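As a numerical sanity check of the positivity argument behind Eq. (2.3), the following sketch replaces the distributional kernel by its regularised form $1/i\pi(x - i\varepsilon)$ with a small $\varepsilon > 0$ and evaluates the quadratic form on random finite samples, with the constant kernel $F = 1$ standing in for a general positive-definite $F$. It is only an illustration under these assumptions: the function names, the sample sizes and the value of `eps` are hypothetical choices and do not appear in the text.

```python
import math
import random

def kernel(x, eps):
    """Regularised kernel 1/(i*pi*(x - i*eps)); eps > 0 stands in for the 'i0'."""
    return 1.0 / (1j * math.pi * (x - 1j * eps))

def quadratic_form(times, coeffs, eps):
    """sum_{j,k} conj(c_j) c_k K(t_j - t_k): a discrete analogue of Eq. (2.3) with F = 1."""
    total = 0j
    for tj, cj in zip(times, coeffs):
        for tk, ck in zip(times, coeffs):
            total += cj.conjugate() * ck * kernel(tj - tk, eps)
    return total

random.seed(0)
values = []
for _ in range(20):
    # random sample points on the time axis and random complex weights (assumed sizes)
    times = [random.uniform(-1.0, 1.0) for _ in range(8)]
    coeffs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(8)]
    values.append(quadratic_form(times, coeffs, eps=0.05))

# Each value should be real and non-negative up to rounding errors.
print(min(q.real for q in values), max(abs(q.imag) for q in values))
```

The check works because the regularised kernel equals $\int_0^\infty \frac{dp}{\pi}\, e^{-p\varepsilon} e^{-ipx}$, the Fourier transform of a positive measure supported on $[0,\infty)$, which is the mechanism used in Eq. (2.3).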

We now use Wick ordering and introduce new variables $s = (t+t')/2$, $s' = t - t'$ in order to rewrite the kernel $F$:

$$F(t,t') = \sigma(:\phi^2:(s)) + \Delta_+(s') + \sigma(R(s,s')), \tag{2.4}$$

where $\Delta_+(t-t') = \omega(\phi(t)\phi(t'))$ is the vacuum two-point function of $\phi$, and the remainder $R$ is given by

$$R(s,s') = \; :\phi(t)\phi(t'): - :\phi^2:\!\Big(\frac{t+t'}{2}\Big) = U(s)\, :\phi(s'/2)\phi(-s'/2) - \phi^2(0):\, U(s)^*, \tag{2.5}$$

with $U(s)$ being the time translation unitaries. Note that $R(s,s')$ is a smooth function when evaluated in $\sigma$. Inserting this into Eq. (2.3), we obtain

$$\sigma(:\phi^2:(f) + c_g 1) \ge -R_{\sigma,g}, \tag{2.6}$$



where

$$f(s) := \int ds'\, \frac{\overline{g(s+s'/2)}\,g(s-s'/2)}{i\pi(s'-i0)}, \tag{2.7}$$

$$c_g := \int ds\,ds'\, \Delta_+(s')\, \frac{\overline{g(s+s'/2)}\,g(s-s'/2)}{i\pi(s'-i0)}, \tag{2.8}$$

$$R_{\sigma,g} := \int ds\,ds'\, \sigma(R(s,s'))\, \frac{\overline{g(s+s'/2)}\,g(s-s'/2)}{i\pi(s'-i0)}. \tag{2.9}$$

It seems plausible that $R_{\sigma,g}$ becomes small as $\operatorname{supp} g$ shrinks to a point. We will give more quantitative estimates in that respect later. Here, let us consider the special case where $g$ is real-valued. Then both $g(s+s'/2)g(s-s'/2)$ and $R(s,s')$ are even functions in $s'$. Hence, in Eqs. (2.7) and (2.9), we can replace the factor $(s'-i0)^{-1}$ with its even part,

$$\frac{1}{2}\Big(\frac{1}{s'-i0} + \frac{-1}{s'+i0}\Big) = i\pi\,\delta(s'). \tag{2.10}$$

Since $R(s,0) = 0$, this results in $R_{\sigma,g} = 0$ and $f(s) = g(s)^2$. Thus Eq. (2.6) gives the more usual inequality for the Wick square,

$$:\phi^2:(g^2) \ge -c_g 1. \tag{2.11}$$
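The identity (2.10) for the even part of $(s'-i0)^{-1}$ can be checked numerically by replacing $i0$ with a small $i\varepsilon$ and smearing against a test function. The sketch below is only an illustration: the Gaussian test function, the value of `eps`, the integration window and the step size are assumptions made here, not choices from the text.

```python
import math

def even_part_kernel(s, eps):
    """(1/2) * (1/(s - i*eps) - 1/(s + i*eps)), which equals i*eps/(s^2 + eps^2)."""
    return 0.5 * (1.0 / (s - 1j * eps) - 1.0 / (s + 1j * eps))

def smear(h, eps, half_width=8.0, step=1e-3):
    """Midpoint Riemann sum of even_part_kernel(s, eps) * h(s) over [-half_width, half_width]."""
    n = int(round(2 * half_width / step))
    total = 0j
    for i in range(n):
        s = -half_width + (i + 0.5) * step
        total += even_part_kernel(s, eps) * h(s)
    return total * step

def h(s):
    """Illustrative smooth test function (a Gaussian)."""
    return math.exp(-s * s)

result = smear(h, eps=0.01)
# Compare with i*pi*h(0); the two agree up to a correction of order eps.
print(result, 1j * math.pi * h(0.0))
```

As `eps` is decreased (with the step size kept well below `eps`), the smeared value approaches $i\pi\,h(0)$, in line with the right-hand side of Eq. (2.10).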

We now aim at a generalization beyond free field theory. So let $\phi$ be a general, possibly self-interacting local quantum field. (The term "quantum field" is used here in a generic fashion, and may include derivatives of fields as well as composite fields or suitably defined powers of fields.) The main difficulty we face in applying the above construction is that no concept of normal ordering is available; we cannot use Wick ordering to split the product into higher-order and lower-order terms, as in Eq. (2.4). Instead, we shall use an operator product expansion for the product $\phi(t)^*\phi(t')$,

$$\phi^*(t)\phi(t') = \sum_{j=1}^{n} C_j(t-t')\,\phi_j((t+t')/2) + R_n(t,t'). \tag{2.12}$$

Here $R_n$ is a remainder term, which is "small" if $t$ and $t'$ are close, while the $\phi_j$ are composite fields. Smearing against $\overline{g(t)}\,g(t')$, where $g \in \mathcal{D}(\mathbb{R})$, the left-hand side is then a positive operator, and this remains true if we multiply (2.12) with any positive-type kernel $K(t-t')$, which takes the role of $K(t-t') = 1/i\pi(t-t'-i0)$ above. In other words, Eqs. (2.2) and (2.3) remain valid. We can then rearrange and obtain as analogue to Eq. (2.6),

$$\sum_{j=1}^{n} \phi_j(f_j) \ge -\int dt\,dt'\, \overline{g(t)}\,g(t')\,K(t-t')\,R_n(t,t'), \tag{2.13}$$

where the test functions $f_j$ are given in terms of $g$, $K$, and the OPE coefficients $C_j$ by

$$f_j(s) = \int ds'\, K(s')\,C_j(s')\,\overline{g(s+s'/2)}\,g(s-s'/2). \tag{2.14}$$



Note that there is no guarantee that these functions are necessarily pointwise positive (the issues here are related to Hudson's theorem [1] and the 'choice of basis' invoked in the OPE). We will return to this below.

In order to establish our results rigorously, the main task is to establish the OPE and to control the remainder term on the right-hand side. We will show in Theorem 6.1 that, given $\alpha \ge 0$, one may find $n$, $m$ and $\ell$ so that for all $d > 0$ and $g \in \mathcal{D}(-d,d)$,

$$\sum_{j=1}^{n} \phi_j(f_j) \ge -\epsilon(d)\,\|g\|_{d,m}^2\,(1+H)^{2\ell}, \tag{2.15}$$

where $H$ is the Hamiltonian, $\epsilon(d) = o(d^\alpha)$ as $d \to 0$, and $\|\cdot\|_{d,m}$ is equivalent to the Sobolev norm on $W_0^{m,1}(-d,d)$. (Of course, finite sums of field products can be, and are, accommodated by our result.) The relationship with the QIs described in Sect. 1.1 is most apparent in the case where one of the composite fields, say $\phi_1$, is of higher order than the others, in the sense that there exists $\ell'$ for which $(1+H)^{-\ell'}\phi_j(f)(1+H)^{-\ell'}$ is bounded for $j \ge 2$, while $(1+H)^{-\ell'}\phi_1(f)(1+H)^{-\ell'}$ is unbounded. Then we may rearrange to write

$$\phi_1(f_1) \ge -\sum_{j=2}^{n} \phi_j(f_j) - \epsilon(d)\,\|g\|_{d,m}^2\,(1+H)^{2\ell}. \tag{2.16}$$

In cases where the remainder term vanishes, (2.16) is then a nontrivial QI in the sense of [24]: namely, one cannot find constants $C$, $C'$ such that

$$|\sigma(\phi_1(f_1))| \le C \sum_{j=2}^{n} |\sigma(\phi_j(f_j))| + C' \tag{2.17}$$

for all (sufficiently regular) states $\sigma$, because $\phi_1$ is of higher energetic order than the fields on the right-hand side. Examples include the QI (2.11), where the only composite field on the right-hand side is the identity, and the QEI (1.2) on the nonminimally coupled field, where both the identity and Wick square appear on the right-hand side.

This simple situation does not persist in general, however. First, it does not seem guaranteed that a unique choice of a highest-order field $\phi_1$ exists. For interacting fields, one would expect $\phi_1$ to be the normal product of $\phi^*\phi$ in the sense of Zimmermann [49]; but there are indications from perturbation theory that in some cases, this normal product might not be unique [51]. Second, the remainder term cannot be expected to vanish in general; this reflects that the OPE is a controlled approximation, rather than an exact formula. Third, the remainder term is not of lower energetic order than the fields: in fact, $\ell$ is chosen so that each $(1+H)^{-\ell}\phi_j(f_j)(1+H)^{-\ell}$ is bounded. Although the inequalities in Eq. (2.15) remain valid, it is necessary to adapt the criterion of nontriviality to our setting. Our approach is to focus on the short-distance behaviour, in which the remainder term vanishes as $o(d^\alpha)$. By contrast, we will show in Sect. 6.2 that, for the bounds we obtain,

$$\sup_{g \in \mathcal{D}(-d,d)} \|g\|_{d,m}^{-2}\, \Big\| (1+H)^{-\ell} \sum_{j=1}^{n} \phi_j(f_j)\,(1+H)^{-\ell} \Big\| \tag{2.18}$$



is not $o(d^\alpha)$ as $d \to 0$. Thus the remainder term cannot dominate the contribution of the composite fields in the small. In this context, it turns out to be crucial to formulate the OPE, and correspondingly the inequalities, in a "basis-independent" fashion, that is, in a way that is independent of a possible arbitrariness in the choice of composite fields. For practical purposes, it is still important to understand the more specific question as to whether there is a normal product of strictly higher energetic order than the other fields in the OPE. At present it seems to us that this must be discussed in the light of particular examples.

Last but not least, one would like to gain more insight into the properties of the sampling functions $f_j$, given by Eq. (2.14), in particular for the function $f_1$ corresponding to a "highest-order" composite field, which generalizes $f_1(s) = g(s)^2$ from the free field case. In general, it is certainly not expected that $f_1$ depends on $g$ in a simple pointwise fashion. However, one may ask whether $\phi_1$ can be chosen so that $f_1$ retains other properties that are apparent in the free-field situation, for example whether $f_1 \ge 0$, either pointwise or in an averaged sense. This will depend crucially on the form of the OPE coefficients $C_j$, which are however unknown in general. We will give two approaches to this problem. The first, in Sect. 6.3, indicates conditions under which one may simultaneously tune the leading sampling function to a given positive form, while also reducing the remainder term. These conditions are broadly met under the assumption that the OPE coefficients have scaling limits in the sense of [52]. Our argument here is essentially to form a Riemann sum of QIs over small distance scales, in which the remainder term is suppressed. This approach is, however, tied to basis representations of the OPE; it would appear to be most useful in the context of particular models. Second, in Sect. 7, we discuss the particular case of dilation covariant theories. This is of interest since we can expect that our theory is approximated by a dilation covariant "scaling limit theory" in the ultraviolet. In this restricted situation, we will derive explicit criteria on $g$ that guarantee positivity of $f_1$. Since positivity of $f_1$ also fixes the sign of the composite field $\phi_1$, this gives us a means of distinguishing the positive "normal square" of $\phi$ from its negative.

3. Algebras of Observables and Pointlike Quantum Fields

As a mathematical basis of quantum field theory, we adopt the framework of local quantum physics [37]. Specifically, for describing pointlike quantum fields, we use the methods set forth in [3]. For the convenience of the reader, we will collect the relevant notions and results below, and introduce some notations that are useful in our context.

We set out from a local net of algebras, $\mathcal{O} \mapsto \mathcal{A}(\mathcal{O})$, in the vacuum sector. That is, for each bounded open region $\mathcal{O}$ of Minkowski space, we have an algebra $\mathcal{A}(\mathcal{O})$ of bounded operators; we take these to be von Neumann algebras acting on a common Hilbert space $\mathcal{H}$. Further, we have a strongly continuous unitary representation $(x,\Lambda) \mapsto U(x,\Lambda)$ of the proper orthochronous Poincaré group on $\mathcal{H}$, with a common invariant unit vector $\Omega \in \mathcal{H}$. We write the translation subgroup as $U(x,1) = \exp(i P_\mu x^\mu)$. Together these objects are supposed to fulfil the following axioms:

(i) Isotony: $\mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$ if $\mathcal{O}_1 \subset \mathcal{O}_2$.
(ii) Locality: $[A_1, A_2] = 0$ if $\mathcal{O}_1$, $\mathcal{O}_2$ are two spacelike separated regions, and $A_i \in \mathcal{A}(\mathcal{O}_i)$.
(iii) Covariance: $U(x,\Lambda)\,\mathcal{A}(\mathcal{O})\,U(x,\Lambda)^* = \mathcal{A}(\Lambda\mathcal{O} + x)$ for all Poincaré transformations $(x,\Lambda)$.
(iv) Positivity of energy: The joint spectrum of the $P_\mu$ falls into the closed forward light cone.
(v) Uniqueness of the vacuum: $\Omega$ is unique (up to a phase) as an invariant vector for all $U(x,1)$.

We are primarily interested in the algebras associated with standard double cones $\mathcal{O}_r$ of radius $r$ centred at the origin, and use $\mathcal{A}(r)$ as shorthand for $\mathcal{A}(\mathcal{O}_r)$. Also, for most parts we only use the time-translation subgroup of $U(x,\Lambda)$, which we denote as $t \mapsto U(t)$, with positive generator $H = P_0 \ge 0$. We write the spectral projectors of $H$ for the interval $[0,E]$ as $P(E)$.

Let $\Sigma$ be the set of ultraweakly continuous functionals on $\mathcal{B}(\mathcal{H})$. We consider for $\ell > 0$ the subspaces

$$C^\ell(\Sigma) = \big\{ \sigma \in \Sigma \;\big|\; \|\sigma\|^{(\ell)} := \|\sigma((1+H)^\ell \,\cdot\, (1+H)^\ell)\| < \infty \big\}, \tag{3.1}$$

which are Banach spaces in the norm $\|\cdot\|^{(\ell)}$. Their duals $C^\ell(\Sigma)^*$ consist of linear forms $\phi$ for which the dual norm $\|\phi\|^{(-\ell)} = \|(1+H)^{-\ell}\phi(1+H)^{-\ell}\|$ is finite. (More precisely, $\phi$ are quadratic forms on a dense subspace of $\mathcal{H} \times \mathcal{H}$, for which the form $(1+H)^{-\ell}\phi(1+H)^{-\ell}$, with the multiplication defined in the weak sense, is bounded.) We also introduce the space of smooth functionals, $C^\infty(\Sigma) = \cap_{\ell>0}\, C^\ell(\Sigma)$, and equip it with the Fréchet topology induced by all norms $\|\cdot\|^{(\ell)}$. The dual space $C^\infty(\Sigma)^*$ is then given by $\cup_{\ell>0}\, C^\ell(\Sigma)^*$, and will be considered with the weak* topology. Further, we define for $E > 0$ the set of energy-bounded functionals, $\Sigma(E) = \{\sigma(P(E)\,\cdot\,P(E)) \mid \sigma \in \Sigma\}$. Then $\cup_{E>0}\,\Sigma(E)$ is dense in $C^\infty(\Sigma)$ and weakly dense in $\Sigma$. Each space $C^\ell(\Sigma)$ is invariant under the natural action of hermitian conjugation, i.e. $\sigma^*(A) = \overline{\sigma(A^*)}$, and this structure transfers to the dual spaces; so we can speak of hermitian elements in $C^\infty(\Sigma)$ and $C^\infty(\Sigma)^*$.

With respect to pointlike fields, we assume that the theory fulfils a specific type of phase space condition [3], sensitive in the ultraviolet. To formulate this, consider the inclusion map $\Xi : C^\infty(\Sigma) \to \Sigma$. We assume that $\Xi$ can be approximated with finite-rank maps in the following sense.

Definition 3.1 (Microscopic phase space condition). A net $\mathcal{O} \mapsto \mathcal{A}(\mathcal{O})$ is said to satisfy the microscopic phase space condition if for every $\gamma \ge 0$, there exists a continuous linear map $\psi : C^\infty(\Sigma) \to \Sigma$ of finite rank such that for sufficiently large $\ell > 0$,

$$r^{-\gamma}\, \|(\Xi - \psi)\lceil \mathcal{A}(r)\|^{(\ell)} \to 0 \quad \text{as } r \to 0.$$

Here the restriction $\lceil \mathcal{A}(r)$ is applied to the image points of the maps, which are functionals in $\Sigma$.

This phase space condition is known to be fulfilled in free field theory in at least 3 + 1 space-time dimensions, for massive free fields also in 2 + 1 dimensions [53]. The consequences of this condition are as follows [3]. While the maps $\psi$ are not uniquely fixed by the property above, the image of their dual maps, $\operatorname{img} \psi^* =: \Phi_\gamma$, is actually unique at fixed $\gamma$, provided that the rank of $\psi$ is chosen minimal. These finite-dimensional spaces $\Phi_\gamma$ form an increasing sequence $\Phi_0 \subset \Phi_1 \subset \Phi_2 \ldots$, and their union $\cup_\gamma \Phi_\gamma = \Phi_{FH}$ is precisely the field content of the theory as defined by Fredenhagen and Hertel [54]. After smearing with test functions, the elements $\phi \in \Phi_\gamma$ are local Wightman fields. Actually it suffices for regularizing $\phi$ to smear it along the time axis; that is, for $f \in \mathcal{S}(\mathbb{R})$ and $\phi \in \Phi_{FH}$, the quadratic form $\phi(f) = \int dt\, f(t)\, U(t)\,\phi\,U(t)^*$ can be continued to an unbounded, but closable operator on the dense invariant domain $C^\infty(\mathcal{H}) = \cap_{\ell>0}\,(1+H)^{-\ell}\mathcal{H}$. Further, $\phi \in \Phi_\gamma$ can be approximated with bounded operators in a controlled way; cf. [3, Lemma 3.5] and the remark following it:



Theorem 3.2. Let $\phi \in \Phi_{FH}$. One can find constants $\ell > 0$, $k > 0$ and operators $A_r \in \mathcal{A}(r)$ for each $r > 0$ such that, as $r \to 0$,

$$\|\phi\|^{(-\ell)} < \infty, \qquad \|A_r - \phi\|^{(-\ell)} = O(r), \qquad \|A_r\|^{(-\ell)} = O(1), \qquad \|A_r\| = O(r^{-k}),$$

$$\forall n \in \mathbb{N}: \quad \Big\| \frac{d^n}{dt^n}\, U(t) A_r U(t)^* \Big\| = O(r^{-k-n}).$$

Moreover, the spaces $\Phi_\gamma$ are related to the approximation of bounded operators in the short distance limit; see [3, Eq. (4.4)]:

Theorem 3.3. Let $p_\gamma : C^\infty(\Sigma)^* \to \Phi_\gamma \subset C^\infty(\Sigma)^*$ be a continuous projection onto $\Phi_\gamma$. Then, for sufficiently large $\ell > 0$,

$$\|(p_{\gamma *} - \Xi)\lceil \mathcal{A}(r)\|^{(\ell)} = o(r^\gamma).$$

Here $p_{\gamma *} : C^\infty(\Sigma) \to C^\infty(\Sigma)$ is the pre-dual map to $p_\gamma$, which always exists due to its finite rank.

Of course, such projections $p_\gamma$ exist in abundance. Since the spaces $\Phi_\gamma$ are invariant under conjugation, it is possible to choose $p_\gamma$ hermitian, i.e., such that $p_\gamma(A^*) = p_\gamma(A)^*$. It was shown in [4] that due to the properties explained above, operator product expansions exist in a rigorous sense. In fact, [4] established the expansion of $\phi(x)\phi'(y)$ for spacelike separated points $x$ and $y$. A similar scheme can be applied for arbitrary $x$ and $y$, in the sense of distributions, as sketched in [4] and worked out in more detail in [53, Ch. 5.5]. (See also [55, Sect. 4].) For our purposes, we will need a specific variant of this product expansion, which will be established in Sect. 5.2.

4. Distributions as Boundary Values of Analytic Functions

If $\sigma \in \Sigma$ is energy-bounded and $\phi$ a Wightman field with sufficiently regular high-energy behaviour, then the distribution¹ $\sigma(\phi^*(t)\phi(t'))$ is the boundary value of an analytic function in the half plane $\operatorname{Im}(t - t') < 0$. If further $\sigma$ is positive, then the distribution is positive-definite. These types of distributions have certain well-known characterizations [36,56]. Since we will need specific quantitative estimates in our context, we will repeat some of those arguments in detail.

First of all, for each $d > 0$ and $m \in \mathbb{N}$ we define a norm on $\mathcal{D}(-d,d)$ by

$$\|f\|_{d,m} := \max_{0 \le n \le m} d^n\, \|f^{(n)}\|_1. \tag{4.1}$$

This norm is equivalent to the Sobolev norm defining the space $W_0^{m,1}(-d,d)$ [57], but it is convenient to use the above norms owing to their behaviour under scaling. Namely, if $f \in \mathcal{D}(-d,d)$ and $\lambda > 0$, and we set $f_\lambda(t) = \lambda^{-1} f(t/\lambda)$, then $f_\lambda \in \mathcal{D}(-\lambda d, \lambda d)$ and

$$\forall m \in \mathbb{N}: \quad \|f_\lambda\|_{\lambda d, m} = \|f\|_{d,m}. \tag{4.2}$$
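The scaling identity (4.2) can be illustrated with a small discretised computation. The sketch below approximates $\|f\|_{d,m}$ from Eq. (4.1) using finite differences and Riemann sums on a grid adapted to $(-d,d)$, and compares $f$ with its rescaled version $f_\lambda$. The bump function, grid size and parameter values are assumptions chosen here for illustration only.

```python
import math

def bump(t, d):
    """Smooth test function supported in (-d, d)."""
    x = t / d
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def scaled_norm(f, d, m, n_grid=4000):
    """Discretised version of ||f||_{d,m} = max_{0<=n<=m} d^n ||f^(n)||_1 on (-d, d)."""
    h = 2.0 * d / n_grid
    values = [f(-d + i * h) for i in range(n_grid + 1)]
    best = 0.0
    for n in range(m + 1):
        l1 = sum(abs(v) for v in values) * h      # Riemann sum for the L^1 norm
        best = max(best, d ** n * l1)
        # central finite difference approximating the next derivative
        values = [(values[i + 1] - values[i - 1]) / (2.0 * h)
                  for i in range(1, len(values) - 1)]
    return best

d, lam, m = 0.5, 3.0, 2

def f(t):
    return bump(t, d)

def f_lam(t):
    """Rescaled function f_lambda(t) = f(t/lambda)/lambda, supported in (-lam*d, lam*d)."""
    return f(t / lam) / lam

norm_f = scaled_norm(f, d, m)
norm_f_lam = scaled_norm(f_lam, lam * d, m)
print(norm_f, norm_f_lam)
```

Because the grid is rescaled together with the support, the two discretised norms agree up to rounding errors, mirroring the exact cancellation $(\lambda d)^n \lambda^{-n} = d^n$ behind Eq. (4.2).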

Let us now define the class of analytic functions that is of interest.

¹ Throughout the paper, we will write distributions in terms of their formal integration kernels, such as $\int K(x) f(x)\,dx$ for the evaluation of a distribution $K \in \mathcal{S}(\mathbb{R})'$ on a test function $f \in \mathcal{S}(\mathbb{R})$, even if $K$ does not arise from an integrable function or measure. This is merely a notational convention.



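Definition 4.1. We say that an analytic function $F : \mathbb{R} - i\mathbb{R}_+ \to \mathbb{C}$ is regular at the boundary if there exists $\ell > 0$ such that

$$\|F\|^{(-\ell)} := \sup_{-1 \le \operatorname{Im} z < 0} |\operatorname{Im} z|^{\ell}\, |F(z)| < \infty.$$

We denote the space of these functions by $\mathcal{K}^\ell$, and set $\mathcal{K} = \cup_{\ell>0}\,\mathcal{K}^\ell$.

As the name suggests, functions in $\mathcal{K}$ have distributional boundary values on the real line.

Proposition 4.2. Let $F \in \mathcal{K}^\ell$. Then the limit $\lim_{y \to 0+} F(x - iy)$ exists as a tempered distribution in $x$. The limit distribution $F(x - i0)$ satisfies the following estimate for $f \in \mathcal{D}(-d,d)$, $d > 0$:

$$\Big| \int f(x)\, F(x - i0)\, dx \Big| \le 4^{\ell+2} (\ell+3)(1 + d^{-\ell-2})\, \|F\|^{(-\ell)}\, \|f\|_{d,\ell+2}.$$

Proof. For fixed $f \in \mathcal{S}(\mathbb{R})$, consider the function

$$g(y) := \int f(x) F(x - iy)\, dx, \quad 0 < y \le 1. \tag{4.3}$$

Since $F$ is analytic in $z = x - iy$, we can obtain the derivatives of $g$ using integration by parts:

$$\forall j \in \mathbb{N}_0: \quad \frac{d^j g}{dy^j}(y) = \int f(x)\, (-i)^j\, \frac{d^j F}{dz^j}(x - iy)\, dx = i^j \int f^{(j)}(x)\, F(x - iy)\, dx. \tag{4.4}$$

Thus we have the estimate

$$\forall j \in \mathbb{N}_0: \quad \Big| \frac{d^j g}{dy^j}(y) \Big| \le y^{-\ell}\, \|F\|^{(-\ell)}\, \|f^{(j)}\|_1. \tag{4.5}$$

We now want to deduce the following improved estimate:

$$\Big| \frac{d^j g}{dy^j}(y) \Big| \le 4^{\ell-j+2} (1 + y^{3/2-j})\, \|F\|^{(-\ell)} \sum_{k=0}^{\ell+2} \|f^{(k)}\|_1 \quad \text{for } j \in \{0, \ldots, \ell+2\}. \tag{4.6}$$

In fact, for $j = \ell + 2$, this directly follows from Eq. (4.5). Now suppose that Eq. (4.6) holds for $j + 1$ in place of $j$. We compute:

$$\Big| \frac{d^j g}{dy^j}(y) \Big| \le \Big| \frac{d^j g}{dy^j}(1) \Big| + \int_y^1 dy'\, \Big| \frac{d^{j+1} g}{dy^{j+1}}(y') \Big| \le \|F\|^{(-\ell)} \|f^{(j)}\|_1 + 4^{\ell-j+1}\, \|F\|^{(-\ell)} \sum_{k=0}^{\ell+2} \|f^{(k)}\|_1 \int_y^1 dy'\, \big(1 + (y')^{1/2-j}\big) \le 4^{\ell-j+1}\, \|F\|^{(-\ell)} \sum_{k=0}^{\ell+2} \|f^{(k)}\|_1\, (4 + 2 y^{3/2-j}). \tag{4.7}$$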


This proves Eq. (4.6). In particular, the case $j = 1$ shows that $dg/dy$ is bounded as $y \to 0$; thus $g(y)$ converges in this limit. Setting $j = 0$ in Eq. (4.6) then shows that $g(0+) =: \int f(x) F(x - i0)\, dx$ defines a tempered distribution. Also, if $f \in \mathcal{D}(-d,d)$, we can combine the estimate

$$\sum_{k=0}^{m} \|f^{(k)}\|_1 \le (m+1) \max\{1, d^{-m}\}\, \|f\|_{d,m} \tag{4.8}$$

with Eq. (4.6), where $j = 0$ and $m = \ell + 2$, in order to show the proposed estimate for the limit distribution.

As is well known, the analytic function $F$ is uniquely determined by its boundary distribution $F(\,\cdot\, - i0)$. Further, for two functions which are regular at the boundary, it follows from Definition 4.1 that their product inherits this property. More explicitly, for $F \in \mathcal{K}^\ell$ and $G \in \mathcal{K}^m$, we have

$$\|F G\|^{(-\ell-m)} \le \|F\|^{(-\ell)}\, \|G\|^{(-m)}. \tag{4.9}$$

Thus the product of the boundary distributions is well-defined by multiplying the corresponding analytic functions. On the other hand, the Fourier transforms of the boundary distributions have support in $[0,\infty)$. This allows for an alternative definition of the distribution product by convolution in Fourier space. The two definitions are in fact equivalent [56, Ch. IX.10, Example 4].

Apart from our distributions being boundary values of analytic functions, we also need to consider questions of positivity. We remind the reader of the definitions (the terminology is not completely consistent in the literature). For $g_1, g_2 \in \mathcal{D}(\mathbb{R})$, we introduce the abbreviation $g_1 g_2(s,s') := g_1(s + s'/2)\, g_2(s - s'/2)$.

Definition 4.3. A distribution $K \in \mathcal{S}(\mathbb{R}^2)'$ is called positive-definite if for all $g \in \mathcal{S}(\mathbb{R})$, one has $\int ds\, ds'\, K(s,s')\, \bar{g} g(s,s') \ge 0$. If here $K$ depends on the second variable only, so $K \in \mathcal{S}(\mathbb{R})'$, it is called a distribution of positive type. With $\mathcal{K}_+ \subset \mathcal{K}$ we denote the subset of positive type distributions.

The Bochner-Schwartz Theorem asserts that distributions of positive type are precisely the Fourier transforms of positive, polynomially bounded measures. We now show that the product of distributions, as discussed further above, preserves positivity if both factors are positive, at least in a special situation that is of interest to us.

Proposition 4.4. Let $F \in \mathcal{K}_+$. Let $G : \mathbb{R} \times (\mathbb{R} - i\mathbb{R}_+) \to \mathbb{C}$ such that $G(s, \cdot\,) \in \mathcal{K}^\ell$ for some $\ell$ and every fixed $s$, where the map $\mathbb{R} \to \mathcal{K}^\ell$, $s \mapsto G(s, \cdot\,)$ is bounded and continuous in $\|\cdot\|^{(-\ell)}$. Suppose further that $G(s, s' - i0)$ is positive-definite. Then the product distribution $P(s,s') = F(s' - i0)\, G(s, s' - i0)$ is continuous in $s$ and positive-definite.

Proof. First, due to Prop. 4.2, the boundedness and continuity of $s \mapsto G(s, \cdot\,)$ implies that $\int P(s,s') f(s')\, ds'$ is continuous and bounded in $s$; in particular $P \in \mathcal{S}(\mathbb{R}^2)'$ is well-defined. Now let $\mu$ be the positive measure that arises by Fourier transform of $F(x - i0)$. Since $F(x - i0)$ is a boundary value, we know $\operatorname{supp} \mu \subset [0,\infty)$. Therefore we have for $g \in \mathcal{S}(\mathbb{R})$,

$$\int ds\, ds'\, P(s,s')\, \bar{g} g(s,s') = \lim_{\epsilon \to 0+} \int_0^\infty d\mu(p)\, \underbrace{\int ds\, ds'\, e^{-ip(s' - i\epsilon)}\, G(s, s' - i\epsilon)\, \bar{g} g(s,s')}_{=:\, I(\epsilon, p)}. \tag{4.10}$$



Supposing for a moment that the integrand $I(\epsilon, p)$ has an integrable bound in $p$, uniform in $\epsilon$, we can apply the dominated convergence theorem and obtain

$$\int ds\, ds'\, P(s,s')\, \bar{g} g(s,s') = \int_0^\infty d\mu(p) \int ds\, ds'\, G(s, s' - i0)\, \bar{g}_p g_p(s,s'), \tag{4.11}$$

where $g_p(t) = e^{ipt} g(t)$. This is clearly non-negative, since $G$ is positive-definite.

It remains to prove appropriate bounds for $I(\epsilon, p)$. To that end, choose $n \in \mathbb{N}$ so large that $\int d\mu(p)\, (1+p)^{-n} < \infty$. We use integration by parts in $s'$ to obtain

$$I(\epsilon, p) = (1+p)^{-n} \int ds\, ds'\, (1+p)^n\, e^{-i(s' - i\epsilon)p}\, G(s, s' - i\epsilon)\, \bar{g} g(s,s') = (1+p)^{-n} \int ds\, ds'\, e^{-i(s' - i\epsilon)p}\, \Big(1 - i\frac{\partial}{\partial s'}\Big)^{n} \big[ G(s, s' - i\epsilon)\, \bar{g} g(s,s') \big]. \tag{4.12}$$

Via the Leibniz rule, we can distribute the derivatives $\partial/\partial s'$ to $G(s,s')$ and to the test function. Now note that with $G$, also the derivatives $\partial^k G/\partial z^k$ fulfil polynomial bounds when $\operatorname{Im} z \to 0-$; namely, we can use the Cauchy integral formula for a circle of radius $|\operatorname{Im} z/2|$ around $z$ in order to obtain the estimate

$$\Big| \frac{\partial^k G(s,z)}{\partial z^k} \Big| \le 2^{\ell+k}\, k!\, \sup_x \|G(x, \cdot\,)\|^{(-\ell)}\, |\operatorname{Im} z|^{-\ell-k} \quad \text{for } -\tfrac{1}{2} \le \operatorname{Im} z < 0. \tag{4.13}$$

This implies that $e^{-ipz/2}\, \partial^k/\partial z^k\, G(s, z/2)$ belongs to $\mathcal{K}^{\ell+k}$ with norm uniform in $s$ and $p$. Applying Proposition 4.2, we can then obtain finite bounds on the integral in (4.12) as $\epsilon \to 0$, so

$$|I(\epsilon, p)| \le c\, (1+p)^{-n} \quad \text{for small } \epsilon, \tag{4.14}$$

with a constant $c$ depending on $G$ and $g$. This is a bound of the required form.

5. Products

Our next aim is to describe products of quantum fields that are of interest to us, and derive an operator product expansion for them. Specifically, we are interested in the products of two quantum fields $\phi$, $\phi'$, displaced to different points $t$, $t'$ on the time axis; this product then exists as a distribution in the difference variable $s' = t - t'$. In addition, we wish to multiply this distribution with a c-number distributional kernel in $t - t'$, and also consider sums of such expressions. The operator product expansion we use is derived by means of techniques described in [4]; however, we need to generalise the construction both to include the weighting factors and also to obtain more detailed estimates on OPEs at timelike-separated points.

We can formally describe the products of interest as elements of the algebraic tensor product space $\mathcal{P}_{\mathrm{prod}} := \mathcal{K} \otimes C^\infty(\Sigma)^* \otimes C^\infty(\Sigma)^*$. Any element $\Pi \in \mathcal{P}_{\mathrm{prod}}$ has the form of a finite sum,

$$\Pi = \sum_j K_j \otimes \phi_j \otimes \phi_j', \quad K_j \in \mathcal{K},\; \phi_j, \phi_j' \in C^\infty(\Sigma)^*. \tag{5.1}$$



For $\ell > 0$, we set $\mathcal{P}^\ell_{\mathrm{prod}} = \mathcal{K}^\ell \otimes C^\ell(\Sigma)^* \otimes C^\ell(\Sigma)^* \subset \mathcal{P}_{\mathrm{prod}}$; clearly, $\mathcal{P}_{\mathrm{prod}} = \cup_{\ell>0}\, \mathcal{P}^\ell_{\mathrm{prod}}$. Further we consider the subspace $\mathcal{P}_{\mathrm{prod,loc}} = \mathcal{K} \otimes \Phi_{FH} \otimes \Phi_{FH} \subset \mathcal{P}_{\mathrm{prod}}$, the space of products of pointlike fields. To each product $\Pi \in \mathcal{P}_{\mathrm{prod}}$, we can associate a distribution $T_\Pi$, heuristically given by

$$U\Big(\frac{t+t'}{2}\Big)\, T_\Pi(t - t')\, U\Big(\frac{t+t'}{2}\Big)^{*} = \sum_j K_j(t - t' - i0)\, \phi_j(t)\, \phi_j'(t'). \tag{5.2}$$

We shall first discuss in which sense these product distributions exist, before deriving an operator product expansion for them, in the case where $\phi_j$ and $\phi_j'$ are local fields. Then we will introduce certain convolutions of these distributions with test functions, generalize the OPE for them, and single out a minimal set of composite fields that will be of use to us.

5.1. Operator products. Before considering our operator products, let us first define the set of distributions of interest.

Definition 5.1. A $C^\infty(\Sigma)^*$-valued distribution is a linear map $T : \mathcal{D}(\mathbb{R}) \to C^\infty(\Sigma)^*$ such that there exist constants $\ell > 0$ and $m \in \mathbb{N}_0$, and, for each $d > 0$, a constant $c_d$, with the property that

$$\forall f \in \mathcal{D}(-d,d): \quad \|T(f)\|^{(-\ell)} \le c_d\, \|f\|_{d,m}.$$

Equivalently, we might say that $T \lceil \mathcal{D}(-d,d)$ extends to a map of $W_0^{m,1}(-d,d)$ to $C^\ell(\Sigma)^*$, with finite norm $\|T\|^{(-\ell)}_{d,m} \le c_d$. In more standard terms, $T$ might be called a distribution of finite order, but since we will not use other distributions in this context, we drop the extra qualifier. As before, we shall denote these distributions using their formal kernels: $T(f) = \int dx\, T(x) f(x)$. Their expectation values $\sigma(T(x))$, for fixed $\sigma \in C^\ell(\Sigma)$, are then distributions in $\mathcal{D}(\mathbb{R})'$ in the usual sense. We shall call a $C^\infty(\Sigma)^*$-valued distribution skew-hermitian if $T(x)^* = T(-x)$.

We shall now clarify in which precise sense the distributions $T_\Pi$ in Eq. (5.2) exist.

Proposition 5.2. Let $\ell > 0$. To each $\Pi = \sum_j K_j \otimes \phi_j \otimes \phi_j' \in \mathcal{P}^\ell_{\mathrm{prod}}$, there exists a unique $C^\infty(\Sigma)^*$-valued distribution $T_\Pi$ such that for any $\sigma \in \cup_{E>0}\, \Sigma(E)$,

$$\sigma(T_\Pi(f)) = \sum_j \int ds'\, f(s')\, K_j(s' - i0)\, \sigma\big( \phi_j(s'/2 - i0)\, \phi_j'(-s'/2 + i0) \big).$$

The map $\Pi \mapsto T_\Pi$ is linear. Further, there is a constant $c > 0$ such that for any $d \le 1$,

$$\|T_\Pi\|^{(-5\ell-1)}_{d,3\ell+2} \le c\, d^{-(3\ell+2)} \sum_j \|K_j\|^{(-\ell)}\, \|\phi_j\|^{(-\ell)}\, \|\phi_j'\|^{(-\ell)}.$$

Proof. Without loss of generality, we can assume that $\Pi$ is of the form $\Pi = K \otimes \phi \otimes \phi'$. Let $\sigma \in \Sigma(E)$, where $E > 0$ is fixed for the moment. Then, due to the spectrum condition, the distribution $\sigma(\phi(s'/2)\phi'(-s'/2))$ is indeed the boundary value of an analytic function, namely of

$$F(s' - is'') := \sigma\big( e^{(s'' + is')H/2}\, \phi\, e^{-(s'' + is')H}\, \phi'\, e^{(s'' + is')H/2} \big), \quad s'' > 0. \tag{5.3}$$



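This function fulfils the bounds

$$|F(s' - is'')| \le \|\sigma\|\, e^{E s''} (1+E)^{2\ell}\, \|\phi\|^{(-\ell)}\, \|\phi'\|^{(-\ell)} \sup_{\lambda > 0} e^{-\lambda s''} (1+\lambda)^{2\ell} \le \|\sigma\|\, \|\phi\|^{(-\ell)}\, \|\phi'\|^{(-\ell)}\, (1+E)^{2\ell}\, e^{(1+E)s''}\, (2\ell)^{2\ell}\, (s'')^{-2\ell}. \tag{5.4}$$

So $F$ is regular at the boundary in the sense of Definition 4.1. Rescaling its argument, we explicitly have

$$\Big\| F\Big(\frac{z}{1+E}\Big) \Big\|^{(-2\ell)} \le c\, \|\sigma\|\, \|\phi\|^{(-\ell)}\, \|\phi'\|^{(-\ell)}\, (1+E)^{4\ell}, \tag{5.5}$$

where the constant $c$ depends on $\ell$ only. The distributional product $K(s' - i0) F(s' - i0)$ therefore exists. Rescaling also $K$, and applying Proposition 4.2 and Eq. (4.9), we obtain for any $g \in \mathcal{D}(-d,d)$ and with another constant $c'$,

$$\Big| \int ds'\, g(s')\, K\Big(\frac{s' - i0}{1+E}\Big)\, F\Big(\frac{s' - i0}{1+E}\Big) \Big| \le c'\, \|\sigma\|\, \|K\|^{(-\ell)}\, \|\phi\|^{(-\ell)}\, \|\phi'\|^{(-\ell)}\, (1+E)^{5\ell}\, (1 + d^{-3\ell-2})\, \|g\|_{d,3\ell+2}. \tag{5.6}$$

Now let $f \in \mathcal{D}(-d,d)$, $d \le 1$. We set $g(s') = (1+E)^{-1} f(s'/(1+E)) \in \mathcal{D}(-(1+E)d, (1+E)d)$ and obtain, using Eq. (5.6) with $(1+E)d$ in place of $d$,

$$\Big| \int ds'\, f(s')\, K(s' - i0)\, F(s' - i0) \Big| \le c''\, \|\sigma\|\, \|K\|^{(-\ell)}\, \|\phi\|^{(-\ell)}\, \|\phi'\|^{(-\ell)}\, (1+E)^{5\ell}\, d^{-3\ell-2}\, \|f\|_{d,3\ell+2}. \tag{5.7}$$

This serves to define $T_\Pi(f)$ on $\Sigma(E)$ for any $E$. Using [55, Lemma 2.6], we can extend this linear form to $C^{5\ell+1}(\Sigma)$, and obtain another constant $c'''$ such that

$$\|T_\Pi\|^{(-5\ell-1)}_{d,3\ell+2} \le c'''\, \|K\|^{(-\ell)}\, \|\phi\|^{(-\ell)}\, \|\phi'\|^{(-\ell)}\, d^{-3\ell-2}. \tag{5.8}$$

The extension is unique by density. It is also clear by construction that $T_\Pi(f)$ is linear in $f$ and in $\Pi$, i.e. multilinear in $\phi$, $\phi'$, and $K$. Then, the estimate (5.8) shows that $T_\Pi$ is a $C^\infty(\Sigma)^*$-valued distribution in the sense of Def. 5.1.

5.2. Product expansions. We now prove the operator product expansion for a product of pointlike fields, $\Pi \in \mathcal{P}_{\mathrm{prod,loc}}$, in the following form.

Theorem 5.3. Let $\Pi \in \mathcal{P}_{\mathrm{prod,loc}}$, and let $\alpha \ge 0$. There exist $\ell > 0$, $m \in \mathbb{N}$, $\gamma \ge 0$, and a hermitian projector $p_\gamma : C^\infty(\Sigma)^* \to \Phi_\gamma$ onto $\Phi_\gamma$ such that

$$\|T_\Pi - p_\gamma T_\Pi\|^{(-\ell)}_{d,m} = o(d^\alpha) \quad \text{as } d \to 0.$$

This is a variant of [4, Theorem 3.2]. Note that the approximation takes a more familiar form of operator product expansion if $p_\gamma$ is written in a basis.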



Proof. Again, we can assume $\Pi = K \otimes \phi \otimes \phi' \in \mathcal{P}^\ell_{\mathrm{prod}}$ for some $\ell > 0$, where now $\phi, \phi' \in \Phi_{FH}$. Further, after possibly increasing $\ell$, we choose $k > 0$ and approximating sequences $A_r$, $A_r'$ for $\phi$, $\phi'$ as in Theorem 3.2. Set $B_r = K \otimes A_r \otimes A_r'$. We define $m := 3\ell + 2$ and $\gamma := (2k + \ell + 3)(\alpha + m + 1)$, and choose a hermitian projector $p_\gamma$ onto $\Phi_\gamma$. Now we estimate for an as yet unspecified $\hat\ell$,

$$\|T_\Pi - p_\gamma T_\Pi\|^{(-\hat\ell)}_{d,m} \le \|T_\Pi - T_{B_r}\|^{(-\hat\ell)}_{d,m} + \|T_{B_r} - p_\gamma T_{B_r}\|^{(-\hat\ell)}_{d,m} + \|p_\gamma\|^{(-\hat\ell,\hat\ell)}\, \|T_{\Pi - B_r}\|^{(-\hat\ell)}_{d,m}. \tag{5.9}$$

Here $\|p_\gamma\|^{(-\hat\ell,\hat\ell)}$ is a constant independent of $r$ and $d$, finite if $\hat\ell$ is large. We will show below that for large $\hat\ell$,

$$\|T_{\Pi - B_r}\|^{(-\hat\ell)}_{d,m} = O(r\, d^{-m}), \tag{5.10}$$

$$\|T_{B_r} - p_\gamma T_{B_r}\|^{(-\hat\ell)}_{d,m} = O\big( r^{-2k-\ell-2}\, d^{-\ell-2}\, (r+d)^\gamma \big). \tag{5.11}$$

Setting $r(d) = d^{\alpha+m+1}$, and using $\gamma = (2k + \ell + 3)(\alpha + m + 1)$, both terms above are of order $O(d^{\alpha+1})$, from which the theorem follows.

To show Eq. (5.10), we write

$$\Pi - B_r = K \otimes (\phi - A_r) \otimes \phi' + K \otimes A_r \otimes (\phi' - A_r'). \tag{5.12}$$

For the first summand, we estimate by Proposition 5.2:

$$\|T_{K \otimes (\phi - A_r) \otimes \phi'}\|^{(-5\ell-1)}_{d,m} \le O(d^{-m})\, \|K\|^{(-\ell)}\, \|\phi - A_r\|^{(-\ell)}\, \|\phi'\|^{(-\ell)} = O(r\, d^{-m}), \tag{5.13}$$

as proposed. The second summand of Eq. (5.12) has a similar estimate, which combined gives Eq. (5.10).

For Eq. (5.11), we use the short-distance approximation of Theorem 3.3 on the operator $A_r^P(s') := A_r(s'/2)\, A_r'(-s'/2) \in \mathcal{A}(\mathcal{O}_{r+d})$, where $|s'| \le d$, and on its derivatives in $s'$. Using the estimates on the derivatives of $A_r(t)$ and $A_r'(t)$ provided by Theorem 3.2, this entails that for large $\hat\ell$,

$$\Big\| \frac{d^n}{(ds')^n} \big( A_r^P(s') - p_\gamma A_r^P(s') \big) \Big\|^{(-\hat\ell)} = O((r+d)^\gamma)\, O(r^{-2k-n}). \tag{5.14}$$

Now we compute $T_{B_r} - p_\gamma T_{B_r}$, first on a fixed test function $f \in \mathcal{D}(-d,d)$, $d \le 1$, and on a fixed functional $\sigma \in C^\infty(\Sigma)$. By Prop. 5.2, we have

$$\sigma\big( T_{B_r}(f) - p_\gamma T_{B_r}(f) \big) = \int ds'\, h(s')\, K(s' - i0), \quad \text{where } h(s') = f(s')\, g(s'), \quad g(s') = \sigma\big( A_r^P(s') - p_\gamma A_r^P(s') \big). \tag{5.15}$$

(Note that here $h$ is smooth; the only divergent factor is $K$. Therefore, also, sharp energy bounds of $\sigma$ do not play a role.) Using Proposition 4.2, it follows that

$$\big| \sigma\big( T_{B_r}(f) - p_\gamma T_{B_r}(f) \big) \big| \le c\, \|K\|^{(-\ell)}\, d^{-\ell-2}\, \|h\|_{d,\ell+2} \tag{5.16}$$



with a numerical constant $c$. For the Sobolev norm of $h$, we can derive the following estimate by the Leibniz formula:

$$\|h\|_{d,\ell+2} = \|f g\|_{d,\ell+2} \le 2^{\ell+2}\, \|f\|_{d,\ell+2} \max_{0 \le n \le \ell+2} d^n \sup_{t \in [-d,d]} |g^{(n)}(t)|. \tag{5.17}$$

The derivatives of $g$ can be estimated by Eq. (5.14). For $t \in [-d,d]$ one has

$$|g^{(n)}(t)| \le \|\sigma\|^{(\hat\ell)}\, O((r+d)^\gamma)\, O(r^{-2k-n}), \tag{5.18}$$

where the $O(\ldots)$ estimates are uniform in $\sigma$. Combining Eqs. (5.16)–(5.18), we obtain

$$\|T_{B_r} - p_\gamma T_{B_r}\|^{(-\hat\ell)}_{d,\ell+2} \le O(d^{-\ell-2})\, O((r+d)^\gamma)\, O(r^{-2k-\ell-2}), \tag{5.19}$$

which gives Eq. (5.11).

The bounds established are certainly not sharp, in particular regarding the value of $\gamma$ (i.e., the number of approximation terms needed in the OPE). They might be improved at the price of extra computational effort, but this is not relevant for our purposes. Note however that the kernels $K$ introduce an extra divergence that might make more OPE terms necessary than in the "ordinary" OPE version with $K = 1$.

5.3. Convolutions. In order to establish the existence of quantum inequalities, we need to analyse distributions evaluated on certain convolutions of test functions, similar to Eqs. (2.7)–(2.9) in the free field case. Let us define them, and establish their well-definedness. We remind the reader of the abbreviation $g_1 g_2(s,s') = g_1(s + s'/2)\, g_2(s - s'/2)$, and of the notion of skew-hermitian $C^\infty(\Sigma)^*$-valued distributions, which fulfil $T(s')^* = T(-s')$.

Lemma 5.4. Let $T$ be a $C^\infty(\Sigma)^*$-valued distribution. Then the bilinear map

$$\kappa_0[T] : \mathcal{D}(\mathbb{R}) \times \mathcal{D}(\mathbb{R}) \to \mathcal{D}(\mathbb{R}, C^\infty(\Sigma)^*), \quad \kappa_0[T](g_1, g_2)(s) = \int ds'\, g_1 g_2(s,s')\, T(s')$$

is well-defined; indeed, if $g_1, g_2 \in \mathcal{D}(-d,d)$, then $\operatorname{supp} \kappa_0[T](g_1, g_2) \subset (-d,d)$. Further,

$$\kappa[T] : \mathcal{D}(\mathbb{R}) \times \mathcal{D}(\mathbb{R}) \to C^\infty(\Sigma)^*, \quad \kappa[T](g_1, g_2) = \int ds\, U(s)\, \big( \kappa_0[T](g_1, g_2)(s) \big)\, U(s)^*$$

is well-defined as a weak integral. Both $\kappa_0[T]$ and $\kappa[T]$ are linear in $T$. If $T$ is skew-hermitian, then $\kappa[T](\bar{g}, g)$ is hermitian for arbitrary $g \in \mathcal{D}(\mathbb{R})$. For any $m \in \mathbb{N}$ and $d > 0$, one has the estimate

$$\|\kappa[T]\|^{(-\ell)}_{d,m} \le 2^{m+1}\, \|T\|^{(-\ell)}_{2d,m}.$$

The Sobolev norms of the bilinear maps are understood here with respect to a product of identical Sobolev norms on the two arguments.



Proof. First, $\kappa_0[T](g_1, g_2)(s)$ is well-defined since $g_1 g_2(s, \cdot\,)$ lies in $\mathcal{D}(-2d, 2d)$ for each fixed $s$; and it is (weakly) smooth in $s$ since $s \mapsto g_1 g_2(s, \cdot\,)$ is smooth in the $\mathcal{D}(\mathbb{R})$ topology. The support properties are clear. Further, one sees that

$$\|\kappa_0[T](g_1, g_2)(s)\|^{(-\ell)} \le \|T\|^{(-\ell)}_{2d,m}\, \|g_1 g_2(s, \cdot\,)\|_{2d,m}, \tag{5.20}$$

which is locally bounded in $s$. Therefore, for each $\sigma \in C^\infty(\Sigma)$, the map

$$\mathbb{R} \to \mathbb{C}, \quad s \mapsto \sigma\big( U(s)\, \kappa_0[T](g_1, g_2)(s)\, U(s)^* \big) \tag{5.21}$$

is continuous. Hence $\kappa[T]$ is well-defined as a weak integral. Using the Leibniz rule and a change of variables, one finds

$$\int ds\, \|(g_1 g_2)(s, \cdot\,)\|_{2d,m} \le \sum_{n=0}^{m} \sum_{r=0}^{n} \binom{n}{r}\, d^r\, \|g_1^{(r)}\|_1\, d^{n-r}\, \|g_2^{(n-r)}\|_1 \le (2^{m+1} - 1)\, \|g_1\|_{d,m}\, \|g_2\|_{d,m}. \tag{5.22}$$

Together with Eq. (5.20), this yields the estimate

$$\|\kappa[T]\|^{(-\ell)}_{d,m} \le 2^{m+1}\, \|T\|^{(-\ell)}_{2d,m}, \tag{5.23}$$

as proposed. Also, it is clear in matrix elements that both $\kappa_0[T](g)$ and $\kappa[T](g)$ are linear in $T$. If $T$ is skew-hermitian, one uses the identity $\bar{g} g(s,s') = \overline{\bar{g} g(s,-s')}$ to conclude $\kappa_0[T](\bar{g}, g)(s)^* = \kappa_0[T](\bar{g}, g)(s)$ and, in consequence, $\kappa[T](\bar{g}, g)^* = \kappa[T](\bar{g}, g)$.

The estimates above show that our operator product expansion for $T_\Pi$, as established in Theorem 5.3, can be transferred to $\kappa[T_\Pi]$. This is in fact the form of OPE we shall use for establishing quantum inequalities.

Corollary 5.5. Let $\Pi \in \mathcal{P}_{\mathrm{prod,loc}}$, and let $\alpha \ge 0$. There exist $\ell > 0$, $m \in \mathbb{N}$, $\gamma \ge 0$, and a hermitian projector $p_\gamma : C^\infty(\Sigma)^* \to \Phi_\gamma$ onto $\Phi_\gamma$ such that

$$\|\kappa[T_\Pi - p_\gamma T_\Pi]\|^{(-\ell)}_{d,m} = o(d^\alpha) \quad \text{as } d \to 0.$$

5.4. Minimal approximating projectors. The operator product expansion allows us to approximate a given product with a finite number of composite fields. It is important for our applications to choose the minimal number of composite fields needed, so that none of the approximation terms can be considered "redundant". Let us introduce that notion of approximation by finitely many terms more abstractly. This is similar, but not identical, to the analysis of normal products in [4, Sect. IV].

Definition 5.6. Let $\Pi \in \mathcal{P}_{\mathrm{prod,loc}}$, and $\alpha \ge 0$. A hermitian projector² $p$ in $C^\infty(\Sigma)^*$ with finite-dimensional image in $\Phi_{FH}$ is called $\alpha$-approximating for $\Pi$ if there are constants $\ell > 0$ and $m \in \mathbb{N}$ such that

$$\|\kappa[T_\Pi - p\, T_\Pi]\|^{(-\ell)}_{d,m} = o(d^\alpha) \quad \text{as } d \to 0.$$

² Projectors in this space will always be assumed to be continuous.



The operator product expansion in Corollary 5.5 tells us that for any given α, we can choose γ large enough such that any hermitian projector p onto γ is α-approximating for . However, this is in a way an “upper estimate” to the OPE, since γ may contain elements that are not actually needed for approximating the given product. We will therefore minimize the approximating projector in a well-defined sense. This is done as follows. On the family of all α-approximating projectors for a given product , we introduce a partial order by p1 ≤ p2

:⇔ (img p1 ⊂ img p2) ∧ (ker p1 ⊃ ker p2). (5.24)

Minimal elements with respect to this partially ordered set will be called minimal α-approximating projectors. By dimensional arguments, any decreasing sequence in the set must eventually become constant; so minimal elements certainly exist, and can be constructed below each given α-approximating projector. However, there seems to be no reason why they should be unique. This is in contrast to the situation for normal product spaces [4, Sect. IV], where the approximation property depends on img p only, i.e., any other projector onto the same space would also be α-approximating. In that case, one finds a unique minimal approximating space of fields. In our situation, these stronger results do not seem to follow, the main difficulty being that the convolution κ[ · ] does not commute with projectors. This turns out not to be a problem however: each minimal α-approximating projector will give us a nontrivial quantum inequality. Let us summarize the main point of the above discussion: Proposition 5.7. Let α ≥ 0 and ∈ prod,loc . There exists at least one minimal α-approximating projector p for . 6. Quantum Inequalities We now establish quantum inequalities as a consequence of the operator product expansion above, and prove that they are nontrivial as discussed in Sect. 2. 6.1. Existence of inequalities. In order to establish inequalities, we define a set of products pos ⊂ prod,loc which are “classically positive”, namely a finite sum of absolute squares with positive-type coefficients:

pos := (6.1) K j ⊗ φ ∗j ⊗ φ j K j ∈ K+ , φ j ∈ FH . j

For any ∈ pos , the distribution T is then skew-hermitian. (One verifies this in matrix elements by the integral formula in Prop. 5.2, using the relation K j (z) = K j (−¯z ) for the positive-type kernels K j .) Products from pos now give rise to quantum inequalities. To formulate these, we use the abbreviation R := (1 + H )−1 . Theorem 6.1. Let ∈ pos and α ≥ 0. Let p be an α-approximating projector for . There exist > 0, m ∈ N, and a function : R+ → R+ of order (d) = o(d α ) such that the following inequality between bounded operators holds. ∀d > 0, g ∈ D(−d, d) :

R κ[ pT ](g, ¯ g)R ≥ − (d)(gd,m )2 1.



Proof. By Def. 5.6, there exist , m and (d) = o(d α ) such that κ[ pT − T ](g, ¯ g)(− ) ≤ (d) (gd,m )2 .

∀d > 0, g ∈ D(−d, d) :

(6.2)

¯ g) is guaranteed to be hermitian Note here that, since T is skew-hermitian, κ[T ](g, by Lemma 5.4. Since p is hermitian, the same is true for κ[ pT ](g, ¯ g). The expectation values of these expressions in positive functionals are therefore real. Thus, for any ρ ∈ ∪ E (E), ρ ≥ 0, we obtain from Eq. (6.2), ρ(κ[ pT ](g, ¯ g)) − ρ(κ[T ](g, ¯ g)) ≥ −ρ( ) (d) (gd,m )2 . (6.3) Now ∈ pos is of the form = j K j ⊗ φ ∗j ⊗ φ j . Due to energy-boundedness of ρ, we have by Prop. 5.2 and Lemma 5.4, ρ(κ[T ](g, ds ds g¯ g(s, s ) K j (s − ı0) ¯ g)) = j

× ρ( U(s) φ_j^*(s′/2) φ_j(−s′/2) U(s)^* ). (6.4)

Here

G_j(s, s′ − ı0) := ρ( U(s) φ_j^*(s′/2) φ_j(−s′/2) U(s)^* ) = ρ( φ_j^*(s + s′/2) φ_j(s − s′/2) ) (6.5)

are positive-definite distributions, as they give ρ(φ j (g)∗ φ j (g)) when integrated with g¯ g. Also, a similar estimate as in Eq. (5.4) shows that s → G j (s, · ) is uniformly bounded in · (− ) , and also continuous since the energy-bounded state ρ is analytic for U (s). Thus the products of G j with the positive-type kernels K j are positive-definite as well (Prop. 4.4). Therefore, the expression in Eq. (6.4) is non-negative. Setting ρˆ = ρ(R − · R − ), we can thus reduce Eq. (6.3) to ρ(R ˆ κ[ pT ](g, ¯ g) R ) ≥ − (d) (gd,m )2 ρ(1). ˆ

(6.6)

¯ g)R can be extended to a bounded operator by Eq. (6.2). Since ρˆ Here R κ[ pT ](g, can be chosen from a dense subset in the set of all positive functionals, the theorem now follows. The connection of the theorem with more usual forms of quantum inequalities becomes clear when we write the projector p in a basis: p=

n

σ j ( · )φ j , where σ j ∈ C ∞ (), φ j ∈ FH , σ j (φk ) = δ jk .

(6.7)

j=1

Here we choose φ j and σ j hermitian, which is possible since p is hermitian. Then, the inequality in the theorem can be rewritten as n j=1

R φ j ( f j )R ≥ − (d) (gd,m )2 1,

(6.8)



where the functions f 1 , . . . , f n are given by f j (s) = ds g¯ g(s, s ) σ j (T (s )).

(6.9)

These f j are actually of compact support, namely supp f j ⊂ (−d, d) if g ∈ D(−d, d), see Lemma 5.4. They are also smooth, since s → g¯ g(s, · ) is differentiable in the S-topology; so they are indeed proper test functions in D(R). Further, the f j are realvalued, which follows from hermiticity of σ j and skew-hermiticity of T . The inequality (6.8) is of an asymptotic nature, inasmuch as only the asymptotic behaviour of the remainder, (d) = o(d α ), is known. For the sake of concreteness, we may choose a fixed test function g ∈ D(−1, 1), and define a family of scaled functions gd (t) = d −1 g(t/d). For these, gd d,m is independent of d, so that the right-hand side of Eq. (6.8) simplifies; the inequality is then valid as the parameter d of the family goes to 0. While the functions f j are real-valued, they are not guaranteed to be pointwise positive, in contrast to the free field situation [11]. To see this, we note that (6.9) has a strong analogy to Weyl quantization. Letting C˜ j be the Fourier transform of C j (s ) = σ j (T (s )), one has dp ˜ dp ˜ f j (s) = C j ( p) ds eı ps g¯ g(s, s ) = C j ( p)Wg (s, p), (6.10) 2π 2π where Wg is the Wigner function associated with the “state” g, Wg (s, p) = ds eı ps g(s + s /2)g(s − s /2).

(6.11)
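As a numerical illustration of Eq. (6.11), the following sketch evaluates the Wigner function of a compactly supported real g and exhibits a negative value. It uses the indicator function of [−1/2, 1/2] purely for convenience: this g is not smooth, so it is not an admissible test function in the sense of the text, and it is chosen here only because the integral then has the closed form W_g(s, p) = 2 sin(pL)/p with L = 1 − 2|s|. The function names and grid parameters are illustrative assumptions.

```python
import math


def box(t: float) -> float:
    """Indicator of [-1/2, 1/2]; an illustrative, non-smooth stand-in for g."""
    return 1.0 if abs(t) <= 0.5 else 0.0


def wigner_box(s: float, p: float, n: int = 4000, half_width: float = 2.0) -> float:
    """Midpoint-rule approximation of the integral in Eq. (6.11) for real g.

    For real g the integrand is even in s', so only cos(p s') contributes.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        sp = -half_width + (k + 0.5) * h
        total += math.cos(p * sp) * box(s + sp / 2.0) * box(s - sp / 2.0)
    return total * h


p = 1.5 * math.pi
print(wigner_box(0.0, p), 2.0 * math.sin(p) / p)
```

Both printed numbers are negative and agree to the accuracy of the quadrature, so W_g fails to be pointwise positive already in this simple case.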

Now the Wigner function cannot be pointwise positive for compactly supported g [1], so positivity of f j can only be expected in special situations; see e.g., Prop. 7.3. Note that Eq. (6.8) is a far-reaching generalization of the usual inequalities for squares of fields in free field theory. In particular, the estimate will in general not be restricted to two fields, such as the Wick square and the identity in Eq. (2.11), but will involve a possibly large number of fields smeared with different sampling functions. One of the φ j will typically be the identity operator, and another φ j will typically be a normal product in the sense of Zimmermann [4,49]. This term will usually be distinguished as a highest-order field, relating e.g. to scaling dimensions. But there seems to be no guarantee that such a highest-order field exists uniquely, and even less that only two fields φ1 , φ2 appear in the inequality. Compared with the usual free-field situation, we also encounter a remainder term (d) which seems unavoidable in this context, but is of negligible order compared with the contributions of the field operators, as we shall see below. 6.2. Nontriviality. While Thm. 6.1 asserts that our construction yields a large variety of valid quantum inequalities, there remains the concern that they could be trivial in the sense that the lower bound could also serve as an upper bound, cf. [24]. In particular, an inequality for a bounded operator A of the form A ≥ −A1 would be considered trivial. Since the exponent in Eq. (6.8) is so large that all R φ j R are bounded, we might well encounter this situation: the left-hand side of Eq. (6.8) might be dominated in norm by the remainder (d). More generally, since Thm. 6.1 puts no further restrictions



on the projector p, it might also be possible that pT contains single “redundant terms” that are individually dominated by (d), and are thus essentially irrelevant. We shall show now that if the approximating projector is chosen minimal, these problems do not occur, and in this sense the inequality is nontrivial. Theorem 6.2. Let ∈ pos and α ≥ 0. Let p be a minimal α-approximating projector for . Let V := img p. For sufficiently large > 0 and m ∈ N, and for any hermitian projector q : V → V , q = 0, it holds that d −α

κ[qpT ](g, ¯ g)(− ) → 0 as d → 0. (gd,m )2 g∈D (−d,d) sup

Proof. Suppose that m, and a hermitian projector q : V → V are given such that d −α

sup

g∈D (−d,d)

(gd,m )−2 κ[qpT ](g, ¯ g)(− ) → 0 as d → 0.

(6.12)

We will show q = 0. First, we can use the polarization identity for the quadratic form κ[qpT ](g1 , g2 ) in order to show (− )

d −α κ[qpT ]d,m → 0.

(6.13)

The triangle inequality then yields (− ) d −α κ[(1 − q) pT − T ]d,m (− ) (− ) ≤ d −α κ[ pT − T ]d,m + d −α κ[qpT ]d,m → 0,

(6.14)

since p is α-approximating; we suppose here that m, are sufficiently large. Now (6.14) shows that (1 − q) p is also α-approximating for . It is clear that (1 − q) p ≤ p. Since however p is minimal, this implies (1 − q) p = p. Thus q = 0. Again, let us illustrate the content of the theorem by passing to a basis representation of p, as in Eq. (6.7). For the case q = 1V , the theorem precisely shows that the left-hand side of Eq. (6.8) does not vanish in norm as fast as (d) = o(d α ). Further, choose q specifically as q = σk φk with fixed k. Then one obtains κ[qpT ](g, ¯ g) = φk ( f k ), with f k as in Eq. (6.9). Thus, Thm. 6.2 provides us with a null sequence (di )i∈N , a constant c > 0, and a sequence of functions g (i) ∈ D(−di , di ) with g (i) di ,m = 1 such that (i)

R φk ( f k ) R ≥ c (di )α for all i ∈ N, where (i)

f k (s) =

ds g¯ (i) g (i) (s, s ) σk (T (s )).

(6.15)

(6.16)

So the field φk in the inequality (6.8) gives a contribution that is large compared to the remainder (d). Theorem 6.2, in full generality, shows that this conclusion is true independent of the choice of basis. We have argued in Prop. 5.7 that minimal α-approximating projectors p exist for any product , and any α ≥ 0. So we always obtain nontrivial quantum inequalities in the sense above. One might suspect here that the minimization of the approximating projector p could lead to p = 0, which might again be seen as trivial. While this is not the case even in a simple free field example, we shall give a general argument that shows that p = 0 cannot occur, under a mild extra assumption.



Theorem 6.3. Let α ≥ 0, and ∈ pos \{0}. Suppose that the vacuum vector is separating for the smeared fields φ( f ), with φ ∈ FH and f ∈ D(R). If p is an α-approximating projector for , then p = 0. We note that the condition of a separating vacuum vector is indeed a rather weak one. It would suffice, for example, that there exists a wedge region W such that is cyclic for A(W). Proof. Suppose that α and are given such that p = 0 is α-approximating for . We will show = 0. To that end, we choose and m sufficiently large, and pick a fixed positive test function g ∈ D(−1, 1). Then gd := d −1 g(d −1 · ) lies in D(−d, d), and gd d,m = g1,m . Employing Def. 5.6, we obtain κ[T ](g¯d , gd )(− ) → 0 as d → 0.

(6.17)
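The scale invariance of the norm used in this step can be verified directly. The sketch below assumes that the norm with indices d, m is built from the weighted L^1 norms d^r ‖g^(r)‖_1 with 0 ≤ r ≤ m, as the terms in the Leibniz estimate (5.22) suggest; the precise definition is not restated in this section, so this reading is an assumption.

```latex
% Assumed: \|g\|_{d,m} depends on g only through d^r \|g^{(r)}\|_1, 0 \le r \le m.
\begin{align*}
g_d(t) &= d^{-1}\, g(t/d), \qquad
  \operatorname{supp} g_d \subset (-d,d) \ \text{if} \ \operatorname{supp} g \subset (-1,1), \\
g_d^{(r)}(t) &= d^{-1-r}\, g^{(r)}(t/d), \\
d^{r}\, \|g_d^{(r)}\|_1 &= d^{r}\, d^{-1-r} \int |g^{(r)}(t/d)|\, dt = \|g^{(r)}\|_1 .
\end{align*}
```

Every weighted term for g_d at scale d therefore equals the corresponding term for g at scale 1, which gives the stated equality of norms.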

Evaluating the convolution integral in the vacuum state ω yields due to translation invariance, (6.18) ds ds g¯d gd (s, s ) ω(T (s )) → 0 as d → 0. As argued in the proof of Thm. 6.1, the distribution ω(T (s )) is of positive type. Hence it is the Fourier transform of a polynomially bounded positive measure µ. With this information, we can rewrite Eq. (6.18) as dµ( p) |g˜d ( p)|2 → 0 as d → 0. (6.19) 2 > 0 locally uniformly. Since µ However, as d → 0, we have |g˜d ( p)|2 → |g(0)| ˜ is positive, we can conclude here that µ is the zero measure. So ω(T (s )) = 0 as a distribution. Using ∈ pos , we have a representation

0 = ω(T (s )) =

n

K j (s − ı0) ω(φ ∗j e−ı(s −ı0)H φ j ) with K j ∈ K+ , φ j ∈ FH .

j=1

(6.20) Since all summands are of positive type, each of them must vanish individually; and clearly, also their analytic continuations must vanish. Thus, for any j, we have either K j = 0 or ω(φ ∗j U (−s )φ j ) = 0. But the latter implies φ j ( f ) = 0 for any f of compact support; thus φ j ( f ) = 0 by assumption, and ultimately φ j = 0 by passing to a delta sequence. In total, this means = 0. One might also be concerned that p might project only onto multiples of the identity. Again, this does not occur in the simple example of the Wick square of the free field, as discussed in Sect. 2. In general, we conjecture, but have not proved, that in this case all fields appearing in the product must be multiples of the identity. At the very least, one can show that the projector may be taken to be of the form p = ω( · )1, where ω is the vacuum state. If this p is indeed α-approximating for , the normal product of can be defined by point splitting, and vanishes identically. So this does not seem to be a case of great interest.



6.3. Mesoscopic bounds. The inequalities derived above involve a remainder term that vanishes in the small distance limit. Here, we discuss how the remainder can be reduced for test functions of fixed supports, essentially by forming a Riemann integral of the bounds at short distance. Let χ ∈ D(−1, 1) and f ∈ D(−d, d) be fixed nonnegative functions. We set χλ (s) = λ−1 χ (s/λ) for λ ∈ (0, 1]. As in Thm. 6.1, we suppose p to be an α-approximating projector for ∈ pos , with α ≥ 0. The basic inequality of Thm. 6.1, applied to χλ , entails R κ[ pT ](χλ , χλ )R ≥ − (λ)(χλ λ,m )2 1 = − (λ)(χ 1,m )2 1

(6.21)

for suitable > 0 and m ∈ N, where (λ) = o(λα ). Applying a time-translation through λk, multiplying by λ f (λk) and summing, we find λ f (λk)U (λk)κ[ pT ](χλ , χλ )U (λk)∗ ≥ − (λ)(χ 1,m )2 λ f (λk)R −2 k∈Z

k∈Z

≥ − (λ)(χ 1,m )2 ( f 1 + λ f 1 )R −2 . (6.22) Passing to a basis representation, we may rewrite this inequality in the form n

φ j (F j,λ ) ≥ −2 (λ)(χ 1,m )2 f d,1 R −2

(6.23)

j=1

for λ ≤ d where F j,λ (s) =

λ f (λk)

ds σ j (T (s ))χλ χλ (s − λk, s ).

(6.24)

k∈Z
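The second step in Eq. (6.22) rests on an elementary Riemann-sum bound: the sum of λ f(λk) over k exceeds the L^1 norm of f by at most λ times the L^1 norm of f′. The sketch below checks this numerically for one smooth bump function; the choice of f, the helper names, and the reading of the second norm in (6.22) as that of the derivative are assumptions made for illustration.

```python
import math


def bump(t: float, d: float = 1.0) -> float:
    """Smooth non-negative test function supported in (-d, d)."""
    x = t / d
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def l1_norm(d: float = 1.0, n: int = 200_000) -> float:
    """Midpoint-rule approximation of the L^1 norm of the bump."""
    h = 2.0 * d / n
    return h * sum(bump(-d + (k + 0.5) * h, d) for k in range(n))


def l1_norm_of_derivative(d: float = 1.0) -> float:
    """The bump rises to its peak and falls again, so ||f'||_1 = 2 f(0)."""
    return 2.0 * bump(0.0, d)


def riemann_sum(lam: float, d: float = 1.0) -> float:
    """Sum of lam * f(lam * k) over all integers k that can hit the support."""
    kmax = math.ceil(d / lam)
    return sum(lam * bump(lam * k, d) for k in range(-kmax, kmax + 1))


norm_f = l1_norm()
for lam in (1.0, 0.5, 0.1, 0.01):
    bound = norm_f + lam * l1_norm_of_derivative()
    print(lam, riemann_sum(lam), bound)
```

In each row the Riemann sum stays below the bound, and it approaches the L^1 norm as λ decreases.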

Owing to the support properties of χλ , at most two terms contribute to the sum on k for each fixed s; moreover, F j,λ ∈ D(−d, d). In any fixed state in C ∞ () the expectation value of the right-hand side of (6.23) can be made arbitrarily small by reducing λ, while the behaviour of the terms on the right-hand side is determined by the asymptotic behaviour of the F j,λ , regarded as compactly supported distributions. In the unlikely event that each F j,λ converged to a limit in the weak-∗ topology on E (R), we would have established a quantum inequality without remainder term. It may be useful to give two examples. If the OPE coefficient σ j (T (s )) is smooth, then convergence does occur, with F j,λ → σ j (T (0))(χ 1 )2 f

in E (R) as λ → 0.

(6.25)

(To see this, one integrates against u(s) and observes that the k th summand is subject to only an O(λ2 ) error if u(s)σ j (T (s )) is replaced by u(λk)σ j (T (0)); as there are at most O(λ−1 ) nonzero summands the result follows by a simple calculation.) On the other hand, if σ j (T (s )) = (ıπ )−1 /(s − ı0), we find λF j,λ → (χ 2 )2 f

in E (R) as λ → 0.

(6.26)

(Note that it is the L 2 -norm that appears here, in contrast to the first example.) In general, therefore, it cannot be expected that all of the F j,λ converge as λ → 0. Nonetheless, as in the second example, its leading order behaviour in λ can be identified as follows.



Proposition 6.4. Let q be the order of the germ of σ j (T (s )) at s = 0 and define (6.27) η j (λ) = ds σ j (T (s ))(χλ ∗ χˆ λ )(s ), where χ(s ˆ ) = χ (−s ). If λ−q η j (λ)−1 = o(1) as λ → 0 then in E (R) as λ → 0.

F j,λ /η j (λ) → f

(6.28)

In particular, this is satisfied if σ j (T (s )) has a scaling limit of degree β < 0 and q = −1 − β. Here, the order of the germ of σ (T (s )) q ∈ N0 for which at s = 0 is the minimal q there are λ0 > 0 and C > 0 such that | ds σ (T (s ))u(s )| ≤ C r =0 sup |u (r ) | for all u ∈ D(−λ0 , λ0 ). The notion of scaling limit is taken from [52]: namely, the scaling limit exists if there exists a monotone positive function N (λ) for which (6.29) N (λ) ds σ j (T (s ))u λ (s ) → S(u)

for all u ∈ D(R), with a nonzero limit for at least one u. Under these circumstances, S is a homogeneous distribution, i.e., S(u_λ) = λ^β S(u), with degree β ∈ R determined by

$$\lim_{\lambda' \to 0} \frac{N(\lambda')}{N(\lambda\lambda')} = \lambda^{\beta}. \tag{6.30}$$

(Our definition of the degree coincides with that of [58, Ch. I Sect. 1.6.] and differs from [52].) If β < 0, for example, the distribution (s − i0)β (log s − i0)γ has a scaling limit of degree β and (germ) order −1 − β, and therefore meets the criteria stated. Proof of Prop. 6.4. We choose λ0 ∈ (0, 1] sufficiently small that σ (T (s )) has order q on (−2λ0 , 2λ0 ), and assume henceforth that 0 < λ < λ0 . As in the second example above, we integrate F j,λ against u ∈ E(R) and approximate u(s) by u(λk) in the k th summand, to obtain λ f (λk) ds χλ χλ (s, s )u(s + λk) ds F j,λ (s)u(s) = ds σ j (T (s )) = η j (λ)

k∈Z

λ f (λk)u(λk) + R j,λ ,

(6.31)

k∈Z

where R j,λ =

ds σ j (T (s ))

λ f (λk)

ds χλ χλ (s, s )[u(s + λk) − u(λk)].

k∈Z

(6.32) Now R j,λ is, at worst, of order O(λ−q ) as λ → 0, as is easily seen using the estimate C sup ds σ j (T (s ))χλ χλ (s, s ) ≤ q+2 ; (6.33) λ s



and the facts that (i) the sum contains at most O(λ^{−1}) nonzero terms; (ii) the s-integral extends over the region [−λ, λ]. This establishes

$$\int ds\, F_{j,\lambda}(s)\, u(s) = \eta_j(\lambda) \Bigl( \int ds\, f(s)\, u(s) + O(\lambda) \Bigr) + O(\lambda^{-q}) \quad \text{as } \lambda \to 0, \tag{6.34}$$

from which (6.28) follows immediately. Now suppose that σ_j(T(s′)) has a scaling limit of degree β < 0. It is easy to see that (6.29) implies N(λ)η_j(λ) → S(χ ∗ χ̂). The spectrum condition entails that S = C(ı(· − ı0))^β, where the nonzero constant C is real owing to hermiticity (cf. the proof of Prop. 7.2 below). As β < 0, we may verify directly that S(χ ∗ χ̂) ≠ 0, and that N(λ) is necessarily monotone decreasing and vanishing as λ → 0. Thus η_j(λ) → ±∞ depending on the sign of C. Moreover, Eqs. (6.29) and (6.30) entail

$$\lim_{\lambda' \to 0} \frac{(\lambda')^{q}\, \eta_j(\lambda')}{(\lambda\lambda')^{q}\, \eta_j(\lambda\lambda')} = \lambda^{-\beta-q}. \tag{6.35}$$

By hypothesis, σ j (T (s )) has order q = −1 − β (as does S). Thus −β − q > 0 and we deduce that λ−q η j (λ)−1 → 0 as λ → 0. The significance of this result becomes clear in the situation where one of the composite fields, say φ1 , is identified as a field of particular interest, e.g., the normal product. By hermiticity of the projection p, η1 is real-valued; the hypothesis of Prop. 6.4 requires that |η1 (λ)| → ∞ as λ → 0. If, in fact, η1 (λ) → +∞, we may divide the quantum inequality (6.23) by this factor to obtain a bound 1 2 (λ) (χ 1,m )2 f d,1 R −2 φ j (F j,λ ) ≥ − η1 (λ) η1 (λ) n

φ1 (F1,λ /η1 (λ)) +

(6.36)

j=2

for λ < d. (If η1 (λ) → −∞ we simply reverse the sign of φ1 and hence σ1 (T (s )) and η1 to obtain the same result; the possibility that η1 oscillates in sign as λ → 0 can be excluded if the scaling limit exists.) In this form, it is clear that the remainder term may be diminished by reducing λ, at the possible cost of increasing the magnitude of the terms in composite fields with j ≥ 2 (if η j (λ) grows more rapidly than η1 (λ)). Moreover, the expectation value of the first term tends to that of φ1 ( f ) as λ → 0 for any state in C ∞ (). Further progress is only possible with more detailed information regarding the (germs of the) OPE coefficient distributions. Nonetheless, we expect that the results presented here will be of use in the context of particular models. 7. Scaling Limits and Dilation Covariance For a concrete interpretation of our quantum inequalities, it is of particular interest to investigate the detailed structure of the sampling functions with which the composite fields are smeared, e.g. the functions f j in Eq. (6.9). For example, one is interested whether they are pointwise positive, or at least “mostly positive” in a well-defined sense. Of course, these properties depend crucially on the structure of the OPE coefficients involved, about which little is known in the general case. The most reasonable approach therefore seems to investigate those properties under more restrictions on the theory.



In the preceding section, our approach was to approximate a given sampling function with a Riemann sum; this relied on some assumptions on the behavior of the OPE coefficients in the small, and was tied to a choice of basis in the field spaces. In the following, we want to take a different approach: we investigate the structure of sampling functions in a restricted class of quantum field theories, namely in the presence of dilation symmetries. While for a realistic description of microphysics, one would not consider dilation covariant quantum field theories, this case is still important as an idealization at short scales. Namely, in the short-distance regime, quantum field theories should be approximated by a scaling limit theory, which indeed possesses a dilation symmetry. Let us briefly sketch how the scaling limit of quantum field theories fits into our context. It has been shown by Buchholz and Verch [59] that scaling limits can be formulated very naturally on the level of local algebras. Every quantum field theory possesses a scaling limit in this sense, although it might not be unique. The limit theory is, under a suitable choice of limit states, covariant under a strongly continuous unitary representation of the dilation group [55]. However, the structure of these dilation unitaries may be very intricate, acting on a nonseparable Hilbert space. (See also [60].) In [55], it was shown that this picture is compatible with the usual notion of field renormalization: if the original algebraic theory fulfils a slightly sharpened version of Def. 3.1, then the limit theory fulfils Def. 3.1 too; and pointlike fields in the original theory converge, under a multiplicative renormalization scheme, to pointlike fields in the limit theory. In a certain sense, the projectors pγ onto γ converge to corresponding (0) projectors pγ in the limit theory. Also, this scheme is compatible with products of pointlike fields and operator product expansions. 
Thus one can expect that the structures exhibited in Sect. 6 properly converge in the scaling limit, and yield quantum inequalities in the limit theory. Our aim here is neither to describe this passage to the limit theory in detail, nor to treat all possible cases of dilation group representations that may appear in the limit. Rather, we take the above as a motivation to investigate quantum inequalities in dilation covariant theories, and to show in certain simple cases that stricter classification results on the form of quantum inequalities can be achieved. In the remainder of this section, we will therefore assume that our theory A has a dilation symmetry; i.e., that there exists a strongly continuous unitary representation λ → U (λ) of the dilation group on H, which is compatible with the Poincaré group representation, and acts on the local algebras in the usual geometric way. The adjoint action of U (λ) can then be extended to C ∞ ()∗ , where we write δλ φ = U (λ)φU (λ)∗ in the weak sense. The spaces γ are invariant under δλ [3, Sect. IV]. We shall now consider the action of δλ on the structures considered so far, and introduce some definitions for convenience. Definition 7.1. A quadratic form φ ∈ C ∞ ()∗ is called dilation covariant if, with some β ∈ R, δλ φ = λβ φ for all λ > 0. A product ∈ prod is called dilation covariant if, with some β ∈ R, δλ T (s) = λβ T (λs) for all λ > 0, in the sense of distributions. A projector p in C ∞ ()∗ is called dilation covariant if δ1/λ ◦ p ◦ δλ A(O1 ) = pA(O1 ) for all 0 < λ ≤ 1, where O1 is the standard double cone of radius 1.



Note that the restriction to A(O1 ) in the definition of dilation covariant projectors is unavoidable if we want p to be norm-bounded on B(H). Namely, suppose that δ1/λ ◦ p ◦ δλ (A) = p(A) for all A ∈ B(H) and 0 < λ ≤ 1, and hence for all λ by the group relation. Since δλ acts as a norm isomorphism on B(H), norm-boundedness of p would lead to δλ being uniformly bounded on the finite dimensional space img p, both for λ → 0 and for λ → ∞, which would exclude that img p contains fields with nonzero scaling dimension. Dilation covariant products can easily be constructed, e.g. by choosing dilation covar iant fields φ1 , φ2 , and setting = (ı z)−β ⊗ φ1 ⊗ φ2 with some β ≥ 0. If and p are both dilation covariant, Def. 7.1 implies that δλ p T (s ) = λβ p T (λs ) for 0 < λ ≤ 1 and for s ∈ [−1, 1];

(7.1)

that is, the equation holds when evaluated on test functions with support in [−1, 1]. This follows by approximating T with sequences of bounded local operators, as in the proof of Thm. 5.3. We will now consider the form of quantum inequalities in our case, that is, investigate the structure of minimal approximating projectors p and their subprojectors. We shall restrict here to the simplest case, where one deals with one-dimensional subrepresentations of δλ . In this case, we can find a full classification of our quantum inequality terms. Proposition 7.2. Let ∈ pos be dilation covariant. Let p be a one-dimensional dilation covariant projector in C ∞ ()∗ . Then, there exist a dilation covariant field φ ∈ FH and β ∈ R such that pT (s ) = (ı(s − ı0))β φ on the interval (−1, 1). Proof. We choose φ ∈ FH and σ ∈ C ∞ () such that p = σ ( · )φ. Since σ (φ) = 1, and since φ can be approximated by bounded operators as in Thm. 3.2, we can find A ∈ A(O1 ) such that σ (A) = 1. Using that p is dilation covariant, we obtain σ (δλ A) δ1/λ φ = σ (A)φ = φ for all 0 < λ ≤ 1,

(7.2)

and thus

δλ φ = σ(δλ A)φ = c(λ)φ for all 0 < λ ≤ 1. (7.3)

Here the C-valued function c(λ) is continuous in λ and fulfils c(1) = 1 and c(λ)c(λ′) = c(λλ′) for λ, λ′ ∈ (0, 1]. This suffices to conclude that there exists a β1 ∈ C such that

$$c(\lambda) = \lambda^{\beta_1} \quad \text{for all } 0 < \lambda \le 1. \tag{7.4}$$

Due to the group relation, we then obtain for all λ ∈ R+,

$$\delta_\lambda \phi = \lambda^{\beta_1} \phi. \tag{7.5}$$
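The step from the multiplicative relation for c(λ) to the power law (7.4) can be spelled out as follows. This is a standard argument, sketched here under the stated hypotheses only (continuity, c(1) = 1, and multiplicativity on (0, 1]); the auxiliary function a(t) is introduced for illustration.

```latex
% Sketch: continuous multiplicative functions on (0,1] are power laws.
\begin{align*}
&\text{(1) } c \text{ has no zeros: if } c(\lambda_0) = 0 \text{, then }
   c\bigl(\lambda_0^{2^{-n}}\bigr)^{2^{n}} = c(\lambda_0) = 0 \text{ for all } n, \\
&\qquad \text{and } \lambda_0^{2^{-n}} \to 1 \text{ would give } c(1) = 0
   \text{ by continuity, a contradiction.} \\
&\text{(2) } a(t) := c(e^{-t}),\ t \ge 0, \text{ is continuous with }
   a(0) = 1 \text{ and } a(t + t') = a(t)\, a(t'). \\
&\text{(3) Hence } a(t) = e^{-\beta_1 t} \text{ for some } \beta_1 \in \mathbb{C},
   \text{ that is, } c(\lambda) = \lambda^{\beta_1} \text{ for } 0 < \lambda \le 1 .
\end{align*}
```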

Splitting φ = φ R + ıφ I into real and imaginary parts, we note that δλ preserves this splitting, which means that β1 must be real. So φ is dilation covariant. Inserting into Eq. (7.1), we arrive at σ (T (s )) = λβ2 −β1 σ (T (λs )) in the sense of D(−1, 1) ,

(7.6)



where β2 ∈ R is the exponent relating to . Using the right-hand side as a definition for |s | > 1, we can construct a homogeneous distribution3 D ∈ D(R) of degree β := β1 − β2 such that D(s ) = σ (T (s )) in the sense of D(−1, 1) .

(7.7)

The homogeneous distributions of one variable are however fully classified (cf. [58, Ch. I Sect. 3.11.]): they are of the form

$$D(s') = c_+ (s' + i0)^{\beta} + c_- (s' - i0)^{\beta} \quad \text{with } c_\pm \in \mathbb{C}. \tag{7.8}$$

We can further restrict the possible form of D. Since σ can be approximated by energybounded functionals σ E , and σ E (T (s )) has an analytic continuation to the lower halfplane, the only singular direction (in the sense of wave front sets) of σ (T (s )) at 0 can be the positive half-line. Since the wave front set is determined locally, Eq. (7.7) entails that c+ = 0. Absorbing a factor ı −β c− into the field φ, we finally obtain β p T (s ) = ı(s − ı0) φ on the interval (−1, 1),

(7.9)

as proposed. Now in the above situation, we can easily describe the quantum inequality terms that arise. One finds for any g ∈ D(−1, 1), κ[ pT ](g, ¯ g) = φ( f ) with f (s) =

ds (ı(s − ı0))β g¯ g(s, s ).

(7.10)

This expression would not represent the entire quantum inequality, as approximating projectors will typically not be one-dimensional. Rather, (7.10) would represent one of the summands of the inequality in Eq. (6.8). In typical cases, one may expect that there exists a distinguished highest-order term in the operator product expansion, which corresponds to the “normal product” part of , and which is described by a one-dimensional dilation covariant projector as above. Note that Prop. 7.2 determines the field φ uniquely. In particular, for β ≤ 0, requiring the distributional factor to be of positive type fixes the phase factor of φ. While other conditions might be used to restrict this phase factor, such as demanding that φ be hermitian, the quantum inequalities give a stronger restriction that even fixes a ± sign in φ. In this sense, our quantum inequalities can be used to distinguish the normal square of a field from its negative; squares of fields retain certain aspects of positivity in the quantum case. Let us further investigate the structure of the smearing function f obtained in Eq. (7.10). We assume for a moment that g is real-valued, and thus g¯ g(s, s ) is symmetric in s . By a standard computation [58, Ch. I Sect. 3 Nr. 8], one obtains the 3 As mentioned in Sect. 6.3, alternative conditions that force a distribution in the scaling limit to be homogeneous are discussed in [52].



following simplified expressions in terms of convergent integrals:

$$f(s) = 2\cos\frac{\beta\pi}{2} \int_0^\infty ds'\, (s')^{\beta}\, g(s+s'/2)\, g(s-s'/2) \quad \text{for } \beta > -1, \tag{7.11}$$

$$f(s) = 2\cos\frac{\beta\pi}{2} \int_0^\infty ds'\, (s')^{\beta} \Bigl( g(s+s'/2)\, g(s-s'/2) - \sum_{k=0}^{[(-\beta-1)/2]} \frac{(s')^{2k}}{(2k)!}\, \frac{\partial^{2k}}{(\partial s')^{2k}} \bigl[ g(s+s'/2)\, g(s-s'/2) \bigr] \Big|_{s'=0} \Bigr) \quad \text{for } \beta < -1,\ |\beta| \notin 2\mathbb{N}+1, \tag{7.12}$$

$$f(s) = \frac{(-1)^k \pi}{(2k)!}\, \frac{\partial^{2k}}{(\partial s')^{2k}} \bigl[ g(s+s'/2)\, g(s-s'/2) \bigr] \Big|_{s'=0} \quad \text{for } \beta = -2k-1,\ k \in \mathbb{N}_0. \tag{7.13}$$

Using these explicit characterizations, we can directly investigate the positivity properties of the function f. For ease of interpretation, it would be convenient if f(s) were positive at each s. We can give some sufficient conditions to this end.

Proposition 7.3. Let g ∈ D(R), β ∈ R, and f be given as in Eq. (7.10). If any of the following conditions is fulfilled, it follows that f(s) ≥ 0 for all s ∈ R.
(i) −1 < β ≤ 1, and g(t) ≥ 0 for all t ∈ R.
(ii) β = −1, and g is real-valued.
(iii) −3 < β < −1, supp g is a connected interval I, and g is logarithmically concave within I.

Proof. The case (i) follows immediately from Eq. (7.11). In case (ii), we obtain f(s) = π g(s)^2 from Eq. (7.13), which yields the result. For (iii), observe that in this case Eq. (7.12) reads

$$f(s) = 2\,\Bigl|\cos\frac{\beta\pi}{2}\Bigr| \int_0^\infty ds'\, (s')^{\beta} \Bigl( g(s)^2 - g(s+s'/2)\, g(s-s'/2) \Bigr). \tag{7.14}$$

Now the concavity of t → log g(t) precisely implies that g(s)2 ≥ g(s + s /2)g(s − s /2) for any s and s . The case β = −1 corresponds to the leading order of the OPE in the Wick square of a massless free field theory, as discussed in Sect. 2. Our main interest is therefore in the case where β is near −1, which might be expected in asymptotically free theories. This realm is covered in the above proposition. In models, it might be possible to exploit the choice of positive-type kernels K j in the definition of in order to arrive at precisely the case β = −1, so that the function f = g 2 has a simple interpretation. We do however not investigate this possibility in detail here. In more generality, for any β ≤ 0, we can at least state the following more qualitative result: since (ı(s − ı0))β is of positive type, one finds ds f (s) ≥ 0, (7.15) so f has at least a non-negative average, regardless of the choice of g. A bit more generally, one can deduce Gårding inequalities for f , similar to those familiar from quantum mechanics [61]: for suitable test functions χ , one has (7.16) χ (s) f (s)ds ≥ −cχ (g2 )2 . Thus positivity of the test function f is preserved at least in a generalized sense.
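The midpoint inequality behind case (iii) of Prop. 7.3 can be checked numerically. The sketch below tests g(s)^2 ≥ g(s + s′/2) g(s − s′/2) on a grid for the standard bump function, which is logarithmically concave on its support, and contrasts it with a two-peaked function for which the inequality fails. Both functions, the grid sizes and the helper names are illustrative choices rather than objects from the text.

```python
import math


def bump(t: float) -> float:
    """Standard bump on (-1, 1); log g(t) = -1/(1 - t^2) is concave there."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0


def double_bump(t: float) -> float:
    """Two separate peaks on (-1, 0) and (0, 1); not logarithmically concave."""
    return bump(2.0 * t - 1.0) + bump(2.0 * t + 1.0)


def gap(g, s: float, sp: float) -> float:
    """g(s)^2 - g(s + s'/2) g(s - s'/2); non-negative for log-concave g."""
    return g(s) * g(s) - g(s + sp / 2.0) * g(s - sp / 2.0)


def min_gap(g, n: int = 81) -> float:
    """Smallest gap over a grid of s in [-1, 1] and s' in [0, 4]."""
    points_s = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    points_sp = [4.0 * j / (n - 1) for j in range(n)]
    return min(gap(g, s, sp) for s in points_s for sp in points_sp)


print(min_gap(bump))
print(gap(double_bump, 0.0, 1.0))
```

For the bump the minimum is zero up to rounding, while the two-peaked function gives a negative gap at s = 0, s′ = 1, which shows why the concavity hypothesis cannot simply be dropped in this argument.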



8. Conclusions and Outlook We have shown that quantum field theories obeying the microscopic phase space condition of [3] admit a large class of nontrivial quantum inequalities: to every classically positive expression, i.e., a sum of absolute squares, we find a combination of composite fields that is positive up to an error obeying defined estimates and vanishing in the short distance limit. The composite fields appearing in such QIs are smeared with test functions derived from OPE coefficients as well as a choice of test function g. In the free field case, these smearing functions bore a simple relationship to g, at least for the normal product; here, the relationship is less direct, although we have succeeded in classifying their structure under simplifying assumptions within dilation covariant theories. Our inequalities are primarily valid in the short-distance limit, when the support of the test functions shrinks to a point. However, we also discussed how to obtain inequalities for smearing functions with extended (mesoscopic) support, in which the remainder term can be reduced at the expense of increasing the contributions from other composite fields. To conclude we mention a number of open questions and avenues for further investigation. First, more progress can be made in understanding the sampling functions arising. For example, in the dilation covariant setting, one could also allow general finite-dimensional irreducible representations of the dilation group. Second, it would probably not be hard to generalise our bounds from smearing along a fixed timelike inertial curve to smearing along arbitrary smooth timelike curves in Minkowski space. The structure of inequalities is not expected to change significantly under this generalization. Third, one would also like to establish OPE-based quantum inequalities in curved spacetime. Here, the situation is complicated by the lack of a global Hamiltonian to specify scales of spaces of states and fields. 
A replacement for the topologies thus induced might be found in the detailed microlocal structure of n-point functions, for example, using wave-front sets modulo Sobolev regularity (see, e.g., [62]). An alternative approach would be to use the stress-energy tensor as the basis for estimates of high-energy behaviour. Hollands has recently established an OPE on curved spacetime for perturbatively constructed theories [63]; however, the generalization of the nonperturbative methods used here presently remains a challenging problem. Fourth, it would be desirable to obtain results that directly constrain the energy density of a quantum field theory, returning to the original motivation for quantum inequalities. One may heuristically expect from perturbation theory that the energy density in purely bosonic theories does arise from such a sum of squares (although a generalization would be needed to cater for theories with fermionic fields) and would therefore be amenable to our approach. However, more direct connections to the energy density are unknown at present; in fact, the very concept of energy density is not well established in a nonperturbative context in purely Minkowski space quantum field theory. More generally, no general nonperturbative version of the Noether theorem has been found to date. In the Wightman framework, only very few results about pointlike Noether currents are available [64,65]; in particular, an existence proof is missing. In the algebraic framework, partial results have been achieved [66] on the basis of the so-called split property of the local algebras [67,68]. In effect, it is possible to construct “local” energy operators $H_{O,\hat O}$, which are associated with the observable algebra $A(O)$ of a bounded region $O$ and act like the global Hamiltonian on $A(\hat O)$ for a slightly smaller region $\hat O \subset\subset O$. These operators fulfil $H_{O,\hat O} \ge 0$, which may be interpreted as a very weak form of energy inequality: starting from local integrals of the energy density, it seems always possible



to add appropriate “boundary terms”, associated with O ∩ Oˆ , such that the resulting operator HO,Oˆ is positive. However, there is no explicit control on these boundary terms, not even a means of separating them from a “main term”, so that this approach does not yet lead to a meaningful interpretation in terms of quantum energy inequalities. In curved spacetime, however, the situation is better. Brunetti, Fredenhagen and Verch have shown the existence of a stress-tensor in locally covariant quantum field theories obeying the time-slice axiom [69]. This stress-energy tensor is obtained by functional differentiation with respect to metric perturbations. This prevents an immediate identification of the energy density as a sum of absolute squares of basic fields. Nevertheless, this may serve as a starting point for future study. Acknowledgements. This work was initiated during the programme ‘Mathematical and Physical Aspects of Perturbative Approaches to Quantum Field Theory’ at the Erwin Schrödinger Institute, Vienna, and the authors thank the organisers of the programme and the ESI for financial support. CJF also thanks the Fakultät für Mathematik, University of Vienna for hospitality and financial support at various stages of the work. It is a pleasure to thank Stefan Hollands for valuable discussions in the early phases of this work. HB also thanks the II. Institut für Theoretische Physik, Hamburg, for hospitality and Klaus Fredenhagen for helpful remarks. The discussion of positivity of formal power series in Sect. 1.2 arose from conversations between HB and Bernd Kuckert, to whose memory this paper is dedicated.

References 1. Hudson, R.L.: When is the Wigner quasi-probability density non-negative? Rep. Math. Phys. 6, 249–252 (1974) 2. Fefferman, C., Phong, D.H.: The uncertainty principle and sharp Gårding inequalities. Comm. Pure Appl. Math. 34, 285 (1981) 3. Bostelmann, H.: Phase space properties and the short distance structure in quantum field theory. J. Math. Phys. 46, 052301 (2005) 4. Bostelmann, H.: Operator product expansions as a consequence of phase space properties. J. Math. Phys. 46, 082304 (2005) 5. Epstein, H., Glaser, V., Jaffe, A.: Nonpositivity of the energy density in quantized field theories. Nuovo Cimento 36, 1016–1022 (1965) 6. Ford, L.H.: Quantum coherence effects and the second law of thermodynamics. Proc. Roy. Soc. London A 364, 227–236 (1978) 7. Ford, L.H.: Constraints on negative-energy fluxes. Phys. Rev. D 43, 3972–3978 (1991) 8. Ford, L.H., Roman, T.A.: Averaged energy conditions and quantum inequalities. Phys. Rev. D 51, 4277–4286 (1995) 9. Ford, L.H., Roman, T.A.: Restrictions on negative energy density in flat spacetime. Phys. Rev. D 55, 2082–2089 (1997) 10. Pfenning, M.J., Ford, L.H.: Scalar field quantum inequalities in static spacetimes. Phys. Rev. D 57, 3489–3502 (1998) 11. Fewster, C.J., Eveson, S.P.: Bounds on negative energy densities in flat spacetime. Phys. Rev. D 58, 084010 (1998) 12. Fewster, C.J., Teo, E.: Bounds on negative energy densities in static space-times. Phys. Rev. D 59, 104016, 10 (1999) 13. Pfenning, M.J.: Quantum inequalities for the electromagnetic field. Phys. Rev. D 65, 024009, 13 (2002) 14. Fewster, C.J.: A general worldline quantum inequality. Class. Quant. Grav. 17, 1897–1911 (2000) 15. Fewster, C.J., Smith, C.J.: Absolute quantum energy inequalities in curved spacetime. Ann. Henri Poincaré 9, 425–455 (2008) 16. Fewster, C.J., Pfenning, M.J.: A quantum weak energy inequality for spin-one fields in curved space-time. J. Math. Phys. 44, 4480–4513 (2003) 17. 
Ford, L.H., Helfer, A.D., Roman, T.A.: Spatially averaged quantum inequalities do not exist in four-dimensional spacetime. Phys. Rev. D 66, 124012 (2002) 18. Fewster, C.J., Roman, T.A.: Null energy conditions in quantum field theory. Phys. Rev. D 67, 044003, 11 (2003) 19. Olum, K.D., Graham, N.: Static negative energies near a domain wall. Phys. Lett. B 554, 175–179 (2003)



20. Fewster, C.J., Hollands, S.: Quantum energy inequalities in two-dimensional conformal field theory. Rev. Math. Phys. 17, 577–612 (2005) 21. Flanagan, É.É.: Quantum inequalities in two-dimensional Minkowski spacetime. Phys. Rev. D 56, 4922– 4926 (1997) 22. Vollick, D.N.: Quantum inequalities in curved two-dimensional spacetimes. Phys. Rev. D 61, 084022, 5 (2000) 23. Fewster, C.J., Osterbrink, L.W.: Quantum energy inequalities for the non-minimally coupled scalar field. J. Phys. A 41, 025402 (2008) 24. Fewster, C.J.: Quantum energy inequalities and local covariance. II: Categorical formulation. Gen. Rel. Grav. 39, 1855–1890 (2007) 25. Fewster, C.J., Verch, R.: A quantum weak energy inequality for Dirac fields in curved spacetime. Commun. Math. Phys. 225, 331–359 (2002) 26. Fewster, C.J., Mistry, B.: Quantum weak energy inequalities for the Dirac field in flat spacetime. Phys. Rev. D 68, 105010, 6 (2003) 27. Dawson, S.P., Fewster, C.J.: An explicit quantum weak energy inequality for Dirac fields in curved spacetimes. Class. Quant. Grav. 23, 6659–6681 (2006) 28. Smith, C.J.: An absolute quantum energy inequality for the Dirac field in curved spacetime. Class. Quant. Grav. 24, 4733–4750 (2007) 29. Yu, H., Wu, P.: Quantum inequalities for the free Rarita-Schwinger fields in flat spacetime. Phys. Rev. D 69, 064008 (2004) 30. Hu, B., Ling, Y., Zhang, H.: Quantum inequalities for massless spin-3/2 field in Minkowski spacetime. Phys. Rev. D 73, 045015 (2006) 31. Glimm, J., Jaffe, A.: Quantum Physics–A Functional Integral Point of View. 2nd edition, New York: Springer, 1987 32. Rivasseau, V.: From Perturbative to Constructive Renormalization. Princeton, NJ: Princeton University Press, 1991 33. Lechner, G.: Construction of quantum field theories with factorizing S-matrices. Commun. Math. Phys. 277, 821–860 (2008) 34. Dütsch, M., Fredenhagen, K.: A local (perturbative) construction of observables in gauge theories: The example of QED. Commun. Math. Phys. 
203, 71–105 (1999) 35. Bordemann, M., Waldmann, S.: Formal GNS construction and states in deformation quantization. Commun. Math. Phys. 195, 549–583 (1998) 36. Streater, R.F., Wightman, A.S.: PCT, Spin and Statistics, and All That. New York: Benjamin, 1964 37. Haag, R.: Local Quantum Physics. 2nd edition. Berlin: Springer, 1996 38. Haag, R., Swieca, J.A.: When does a quantum field theory describe particles? Commun. Math. Phys. 1, 308–320 (1965) 39. Buchholz, D., Wichmann, E.H.: Causal independence and the energy-level density of states in local quantum field theory. Commun. Math. Phys. 106, 321–344 (1986) 40. Buchholz, D., Porrmann, M.: How small is the phase space in quantum field theory? Ann. Inst. H. Poincaré 52, 237–257 (1990) 41. Buchholz, D., Junglas, P.: On the existence of equilibrum states in local quantum field theory. Commun. Math. Phys. 121, 255–270 (1989) 42. Porrmann, M.: Particle weights and their disintegration II. Commun. Math. Phys. 248, 305–333 (2004) 43. Fewster, C.J.: Quantum energy inequalities and stability conditions in quantum field theory. In: A. Boutet de Monvel, D. Buchholz, D. Iagolnitzer, U. Moschella, eds., Rigorous Quantum Field Theory: A Festschrift for Jacques Bros, Volume 251 of Progress in Mathematics, Boston: Birkhäuser, 2006, pp. 95–111 44. Fewster, C.J., Ojima, I., Porrmann, M.: p-nuclearity in a new perspective. Lett. Math. Phys. 73, 1–15 (2005) 45. Fewster, C.J., Verch, R.: Stability of quantum systems at three scales: Passivity, quantum weak energy inequalities and the microlocal spectrum condition. Commun. Math. Phys. 240, 329–375 (2003) 46. Schlemmer, J., Verch, R.: Local thermal equilibrium states and quantum energy inequalities. Annales Henri Poincare 9, 945–978 (2008) 47. Haag, R., Ojima, I.: On the problem of defining a specific theory within the frame of local quantum physics. Ann. Inst. H. Poincaré 64, 385–393 (1996) 48. Wilson, K.G.: Non-Lagrangian models of current algebra. Phys. Rev. 179, 1499–1512 (1969) 49. 
Zimmermann, W.: Local operator products and renormalization in quantum field theory. In: S. Deser, M. Grisaru, H. Pendleton, eds., Lectures on Elementary Particles and Quantum Field Theory, Volume 1. Cambridge, MA: MIT Press, 1970 50. Borchers, H.J.: Field operators as C ∞ functions in spacelike directions. Nuovo Cimento (10) 33, 1600–1613 (1964)



51. Johnson, K.: Solution of the equations for the Green’s functions of a two dimensional relativistic field theory. Nuovo Cimento 20, 773–790 (1961) 52. Fredenhagen, K., Haag, R.: Generally covariant quantum field theory and scaling limits. Commun. Math. Phys. 108, 91–115 (1987) 53. Bostelmann, H.: Lokale Algebren und Operatorprodukte am Punkt. Thesis, Universität Göttingen, 2000. Available online at http://webdoc.sub.gwdg.de/diss/2000/bostelmann/ 54. Fredenhagen, K., Hertel, J.: Local algebras of observables and pointlike localized fields. Commun. Math. Phys. 80, 555–561 (1981) 55. Bostelmann, H., D’Antoni, C., Morsella, G.: Scaling algebras and pointlike fields. A nonperturbative approach to renormalization. Commun. Math. Phys. 285, 763–798 (2009) 56. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Volume II: Fourier Analysis, Self-Adjointness. San Diego, CA: Academic Press, 1975 57. Adams, R.A., Fournier, J.J.F.: Sobolev Spaces, Volume 140 of Pure and Applied Mathematics. 2nd edition, London-New York: Academic Press, 2003 58. Gelfand, I.M., Shilov, G.E.: Generalized Functions. Volume 1, New York: Academic Press, 1968 59. Buchholz, D., Verch, R.: Scaling algebras and renormalization group in algebraic quantum field theory. Rev. Math. Phys. 7, 1195–1239 (1995) 60. Bostelmann, H., D’Antoni, C., Morsella, G.: On dilation symmetries arising from scaling limits. http://arxiv.org/abs/0812.4762v1 [math.ph], 2008, to appear in Commun. Math. Phys. 61. Eveson, S.P., Fewster, C.J., Verch, R.: Quantum inequalities in quantum mechanics. Ann. Henri Poincaré 6, 1–30 (2005) 62. Junker, W., Schrohe, E.: Adiabatic vacuum states on general spacetime manifolds: definition, construction, and physical properties. Ann. Henri Poincaré 3, 1113–1181 (2002) 63. Hollands, S.: The operator product expansion for perturbative quantum field theory in curved spacetime. Commun. Math. Phys. 273, 1–36 (2007) 64. 
Orzalesi, C.A.: Charges and generators of symmetry transformations in quantum field theory. Rev. Mod. Phys. 42, 381–408 (1970) 65. Lopuszanski, J.: An Introduction to Symmetry and Supersymmetry in Quantum Field Theory. Singapore: World Scientific, 1991 66. Buchholz, D., Doplicher, S., Longo, R.: On Noether’s theorem in quantum field theory. Ann. Phys. (N.Y.) 170, 1 (1986) 67. Doplicher, S.: Local aspects of superselection rules. Commun. Math. Phys. 85, 73–86 (1982) 68. Doplicher, S., Longo, R.: Local aspects of superselection rules II. Commun. Math. Phys. 88, 399–409 (1983) 69. Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle - A new paradigm for local quantum field theory. Commun. Math. Phys. 237, 31–68 (2003) Communicated by Y. Kawahigashi

Commun. Math. Phys. 292, 797–810 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0851-z



Nonlinear Instability for the Critically Dissipative Quasi-Geostrophic Equation

Susan Friedlander¹, Nataša Pavlović², Vlad Vicol¹

¹ Department of Mathematics, University of Southern California, 3620 South Vermont Ave., KAP 108, Los Angeles, CA 90089, USA. E-mail: [email protected]; [email protected]
² Department of Mathematics, University of Texas at Austin, 1 University Station, C1200, Austin, TX 78712, USA. E-mail: [email protected]

Received: 31 January 2009 / Accepted: 27 April 2009 Published online: 28 June 2009 – © Springer-Verlag 2009

Abstract: We prove that linear instability implies non-linear instability in the energy norm for the critically dissipative quasi-geostrophic equation.

1. Introduction

A fundamental equation in oceanography and meteorology is the 3 dimensional Navier-Stokes equation in the context of a rapidly rotating, density stratified, viscous, incompressible fluid. Both the forces of rotation and stratification impose a tendency toward 2 dimensionality on the 3 dimensional fluid motion, and this leads to approximate and simpler mathematical models. Important non-dimensional parameters are the Ekman number (the strength of the viscous term relative to rotation) and the Rossby number (the strength of the nonlinearity relative to rotation). In many geophysical problems these parameters are very small. A set of approximations based on asymptotic expansions in powers of these small parameters yields an approximate equation for the 3 dimensional pressure known as the general quasi-geostrophic equation with appropriate boundary conditions. Further simplifying assumptions reduce the problem to the study of a 2 dimensional equation which describes the evolution of the temperature field on a surface that bounds the fluid. In the geophysical fluids literature this equation is known as the surface quasi-geostrophic equation. A derivation of this equation and a discussion of its physical relevance can be found, for example, in Pedlosky [Pe], Salmon [S], and Held et al. [HPGS]. The effects of viscosity are incorporated via a boundary layer analysis and a mechanism known as Ekman layer pumping which produces the dissipative term in the 2 dimensional quasi-geostrophic equation. In the mathematical literature this 2 dimensional equation is often called the dissipative quasi-geostrophic equation (QG equation) with the word surface being omitted since the equation is 2 dimensional. 
This equation, for an unknown active scalar $\Theta(x, t)$ representing the temperature on the boundary surface, is given by
$$\partial_t \Theta + U \cdot \nabla \Theta + (-\Delta)^{\beta} \Theta = f, \tag{1.1}$$


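where $U(x, t)$ is the velocity vector and $f(x)$ is a given external force. The velocity is coupled with the temperature via a stream function $\Psi(x, t)$:
$$\Theta = (-\Delta)^{1/2} \Psi = \Lambda \Psi, \tag{1.2}$$
and
$$U = \nabla^{\perp} \Psi = (\partial_{x_2} \Psi, -\partial_{x_1} \Psi) = (R_2 \Theta, -R_1 \Theta), \tag{1.3}$$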

where $R_i$ is the $i$th Riesz transform. Our analysis of (1.1)–(1.3) considers $x$ in the 2 dimensional torus $[0, 2\pi]^2 = \mathbb{T}^2$ and $t \in [0, \infty)$. Both the non-dissipative and the dissipative QG equations have received much attention following the seminal article of Constantin et al. [CMT]. They observed a number of similar features between the full 3 dimensional Euler and Navier-Stokes equations and the much simpler QG equations in terms of possible formation of singularities. Recent results concerning the dissipative QG equations include [CC,CCW,CV,CW,DD,DP,J1,J2,KN,KNV,M,W] and references therein. The appropriate power $\beta$ of the negative Laplacian in the derivation from the general 3D viscous quasi-geostrophic models and Ekman boundary layer analysis is $\beta = 1/2$. Dimensionally the 2D QG equation with $\beta = 1/2$ is the analogue of the 3D Navier-Stokes equation, and $\beta = 1/2$ is called the critical case. The first results concerning regularity of solutions to the dissipative QG equation were given in the simpler (but non-physical) subcritical case where $\beta > 1/2$: see, for example, Constantin and Wu [CW]. In the critical case, $\beta = 1/2$, Constantin, Cordoba and Wu [CCW] proved existence of a unique global solution evolving from any initial data that are small in $L^\infty$. Very recently, the smallness assumption was removed independently in breakthrough works of Caffarelli and Vasseur [CV] and Kiselev, Nazarov and Volberg [KNV]. In particular, Caffarelli and Vasseur [CV] used harmonic extension to establish regularity of the Leray-Hopf weak solution. On the other hand, Kiselev et al. [KNV] proved the global well-posedness of the critical dissipative QG equations with periodic $C^\infty$ data. Their argument is based on a certain non-local maximum principle for a suitably chosen modulus of continuity. In the present article we consider the question of nonlinear instability of a steady solution of the forced critical QG equations. 
We note that the above mentioned references concern the case $f = 0$, but in order to ensure the existence of a large class of steady states we must consider the nontrivially forced problem. In particular, we need to reprove certain results that are known to hold for the unforced equations but not in the forced context, namely the nonlocal maximum principle of Kiselev et al. [KNV]. The main result of this paper is that linear instability implies nonlinear Lyapunov instability for $\Theta$, and hence $U$, in the function space $L^2$. Such results connecting linear and nonlinear instability have been proven under certain restrictions for the 2D Euler equations, see Bardos et al. [BGS], Friedlander and Vishik [VF], and Lin [L]. There the methods utilize a bootstrap technique where closure relies on the special property of conservation of vorticity which is valid for 2D Euler but not for 3D Euler, where the equivalent instability result is still unproven. This property cannot be utilized for the QG equation because the relation between the temperature and the stream function is not equivalent to the relation between the vorticity and the stream function in the 2D Euler equations. In fact this is one reason why it is conjectured that the QG equations might mimic possible singularity development in the 3D fluid equations. The result that linear instability implies nonlinear instability in $L^2$ for the Navier-Stokes equations in any dimension was proved in Friedlander et al. [FPS] (see also the seminal text of Yudovich [Y]). In this case the special ingredient that permits


the bootstrap argument to close is the smoothing property of the Laplacian with respect to the nonlinear term. The arguments in [FPS] carry over directly to the subcritical dissipative QG equation (i.e. $\beta > 1/2$) because the dissipative term again smooths the nonlinear term in (1.1)–(1.3). However the case of the critical QG equation is more subtle because the critical dissipative term ($\beta = 1/2$) and the nonlinear term are now of the same order. Hence to prove that linear instability implies nonlinear instability in $L^2$ for the critical dissipative QG equations via the bootstrap argument requires a different special ingredient. The one we use in this article is the existence of a global bound on $\|\nabla \Theta(t)\|_{L^\infty}$. This result for the unforced critically dissipative QG was proved in [KNV] and a recent preprint of Kiselev and Nazarov [KN] shows that the result also holds for the equation augmented by a dispersion term. The existence of this global bound for the forced equations is proven in Sect. 5. We note that the fairly general abstract theorem of Friedlander et al. [FSV] can be applied to the critical QG equations. Since the spectrum of the linearized QG operator is discrete (see Sect. 3), the spectral gap condition of the abstract theorem is satisfied. It then follows from [FSV] that linear instability implies nonlinear instability in $H^s$, with $s > 2$. The novel result of this present paper is to prove instability in the “physically natural” energy space $L^2$.
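The coupling (1.2)–(1.3) between temperature and velocity is easiest to see mode by mode. The sketch below is an illustration only: it assumes the convention that a derivative in x_i has Fourier symbol i k_i and that Λ has symbol |k|, so that R_i = ∂_i Λ^{-1} has symbol i k_i / |k|. The function names and the sample frequency are hypothetical. It checks two standard consequences for a single Fourier mode: U is divergence free, and |Û(k)| = |Θ̂(k)|.

```python
import math


def riesz_symbol(k, i):
    """Symbol of R_i = d_i Lambda^{-1} at a nonzero frequency k: 1j * k[i] / |k|."""
    return 1j * k[i] / math.hypot(k[0], k[1])


def velocity_hat(k, theta_hat):
    """Fourier coefficient of U = (R_2 Theta, -R_1 Theta) at frequency k."""
    return (riesz_symbol(k, 1) * theta_hat, -riesz_symbol(k, 0) * theta_hat)


def divergence_hat(k, u_hat):
    """Fourier coefficient of div U at frequency k."""
    return 1j * (k[0] * u_hat[0] + k[1] * u_hat[1])


k = (3, -4)              # sample frequency in Z^2 \ {0} (arbitrary choice)
theta_hat = 2.0 - 1.0j   # sample Fourier coefficient of Theta (arbitrary choice)
u_hat = velocity_hat(k, theta_hat)
print(abs(divergence_hat(k, u_hat)))                     # zero up to rounding
print(math.hypot(abs(u_hat[0]), abs(u_hat[1])), abs(theta_hat))
```

The second property is the mode-wise form of the bound ‖R(v)‖ ≤ ‖v‖ in L² that is used repeatedly in Sect. 3.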

Organization of the paper. In Sect. 2 we formulate the stability problem in terms of the temperature $\Theta(x, t)$ perturbed about a steady state $\theta_0(x) \in C^\infty$. Also in the same section we define nonlinear stability/instability and we state the main instability result, Theorem 2.1. In Sect. 3 we study the linear operator $L$ for the dissipative QG equations in perturbation form. This operator is elliptic of order 1, with compact resolvent, and hence its spectrum is purely discrete for $x \in \mathbb{T}^2$. We prove certain properties of $L$ that will be used in the bootstrap argument. Then in Sect. 4 we use this argument to prove Theorem 2.1. In Sect. 5 we prove, in the spirit of [KNV], that the forced equation has a global $C^\infty$ solution and that $\sup_{t \ge 0} \|\nabla \Theta(t)\|_{L^\infty} < \infty$. This result is used in the bootstrap argument that proves the main theorem.

2. Notation and Formulation of the Result

Let $\theta_0$ be the temperature of a smooth steady 2D flow with velocity $q_0$, and smooth force $f$, that is we have
$$q_0 \cdot \nabla \theta_0 + \Lambda \theta_0 = f, \tag{2.1}$$
$$q_0 = (R_2 \theta_0, -R_1 \theta_0). \tag{2.2}$$
Here we consider $\theta_0, q_0, f \in C^\infty(\mathbb{T}^2)$. We linearize (1.1) about the steady state $(\theta_0, q_0)$ by writing $\Theta(x, t) = \theta_0(x) + \theta(x, t)$ and $U(x, t) = q_0(x) + q(x, t)$. In this way we obtain an equation that governs the perturbation $\theta$:
$$\partial_t \theta = L\theta + N(\theta), \tag{2.3}$$
where the linear operator $L$ is defined by
$$L\theta = -q_0 \cdot \nabla \theta - q \cdot \nabla \theta_0 - \Lambda \theta, \tag{2.4}$$



the velocity is coupled with the temperature via
$$q = (R_2 \theta, -R_1 \theta), \tag{2.5}$$
and
$$N(\theta) = -q \cdot \nabla \theta. \tag{2.6}$$

For simplicity of the presentation we let $\theta_0, f, \theta$ have zero mean on the torus, and in the following we shall denote $H^s = \{v \in H^s(\mathbb{T}^2) : \int_{\mathbb{T}^2} v\, dx = 0\}$, for all $s \ge 0$. We define a suitable version of stability (the same definition was used, e.g., in [FPS,VF]).

Definition. Let $(X, Z)$ be a pair of Banach spaces. A solution $\theta_0$ of (2.1)–(2.2) is called $(X, Z)$ nonlinearly stable if for any $\rho > 0$, there exists $\tilde\rho > 0$ so that if $\theta(0) \in X$ and $\|\theta(0)\|_Z < \tilde\rho$, then we have (i) there exists a global in time solution to (2.3) such that $\theta(t) \in C([0, \infty); X)$; (ii) $\|\theta(t)\|_Z < \rho$ for a.e. $t \in [0, \infty)$. An equilibrium $\theta_0$ that is not stable (in the above sense) is called Lyapunov unstable.

The Banach space $X$ is the space where a local existence theorem for the nonlinear equations is available, while $Z$ is the space where the spectrum of the linear operator is analyzed, and where the instability is measured. In the case of the critical dissipative QG we let $X$ be the critical Sobolev space $H^1$ (cf. [CC,CW,DD,J1,J2,M]), while the growth of the perturbation is considered in the energy space $Z = L^2$. Now we are ready to formulate the main result of the present paper.

Theorem 2.1. Suppose that $\theta_0$ is a smooth mean-free steady state solution of the critical dissipative QG, i.e., it solves (2.1)–(2.2). If the associated linear operator $L$, as defined in (2.4), has spectrum in the unstable region, then the steady state is $(H^1, L^2)$ Lyapunov nonlinearly unstable.

3. Linearized Dissipative QG

The linear operator $L$ defined in (2.4) via $L\theta = -q_0 \cdot \nabla \theta - q \cdot \nabla \theta_0 - \Lambda \theta$ is a nonlocal operator with principal symbol $a(x, k) = -|k| + i q_0(x) \cdot k$, which does not vanish on $\mathbb{T}^2 \times \mathbb{Z}^2 \setminus \{0\}$. Therefore $L$ is elliptic of order 1. Since $q_0, \nabla \theta_0 \in C^\infty$, for large enough $\alpha > 0$, we have that $(L - \alpha I)^{-1}$ is a bounded operator from $L^2$ into $H^1$. Moreover, the domain of $L$,
$$D(L) = \Big\{ v \in H^1(\mathbb{T}^2),\ \int_{\mathbb{T}^2} v\, dx = 0 \Big\} \subset L^2(\mathbb{T}^2), \tag{3.1}$$
is compactly embedded in $L^2$ by Rellich’s theorem, so that the resolvent $(L - \alpha I)^{-1}$ is a compact operator. Thus $L$ has discrete spectrum.



Let $\mu$ be the eigenvalue of $L$ with maximal positive real part over $L^2$. Let $\lambda = \operatorname{Re} \mu$ and $\phi \in L^2$ be the corresponding eigenfunction (see footnote 1). For a fixed $0 < \delta < C_\lambda$, where $C_\lambda > 0$ is a constant depending on $\lambda$ to be determined later, we denote by $L_\delta$ the shifted operator
$$L_\delta = L - (\lambda + \delta) I. \tag{3.2}$$
The shift ensures that $L_\delta$ generates a bounded $C_0$-semigroup over $L^2$ and that the resolvent set of $L_\delta$ contains the right half plane. The following lemma shows that $L_\delta$ generates an analytic semigroup over $L^2$.

Lemma 3.1. Over $L^2$ the operator $L_\delta$ generates an analytic semigroup.

The proof of the lemma modifies the proof of [P, Theorem 7.2.7], which shows the analyticity of a strongly elliptic operator of order $2m$ over $L^2$, to the case of the linearized QG operator, which is elliptic of order 1.

Proof. Define the operator $G$ via
$$Gv = \Lambda v + q_0 \cdot \nabla v + R(v) \cdot \nabla \theta_0 + 2\beta v = -Lv + 2\beta v, \tag{3.3}$$

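where we have denoted $R(v) = (R_2 v, -R_1 v)$ and $\beta = \|\nabla \theta_0\|_{L^\infty}$. Since $q_0$ is divergence-free we have that $G$ satisfies Gårding’s inequality
$$\operatorname{Re}(Gv, v) \ge \|\Lambda^{1/2} v\|_{L^2}^2 + \beta \|v\|_{L^2}^2. \tag{3.4}$$
In the above estimate we also used $\|R(v)\|_{L^2} \le \|v\|_{L^2}$. Similarly, for every $v \in D(G)$, we have
$$|\operatorname{Im}(Gv, v)| \le |(Gv, v)| \le \|\Lambda^{1/2} v\|_{L^2}^2 + 3\beta \|v\|_{L^2}^2. \tag{3.5}$$
Since $v$ is a scalar, it follows from (3.4) and (3.5) that the numerical range $S(G)$ (cf. [P, p. 12]) is contained in the set
$$S_{\vartheta_0} = \{\lambda \in \mathbb{C} : -\vartheta_0 < \arg \lambda < \vartheta_0\}, \tag{3.6}$$
where $\vartheta_0 = \arctan(3) < \pi/2$. Choosing $\vartheta_0 < \vartheta < \pi/2$ and defining $\Sigma_\vartheta = \{z \in \mathbb{C} : |\arg z| > \vartheta\}$, we have that there is a constant $C = C(\vartheta, \vartheta_0) > 0$ such that
$$\operatorname{dist}(z, S(G)) \ge C|z|, \quad \text{for all } z \in \Sigma_\vartheta. \tag{3.7}$$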
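The passage from (3.4) and (3.5) to the sector (3.6) is a one-line computation. The following sketch spells it out for nonzero $v$ with zero mean, using only the two inequalities above; the middle step simply enlarges the coefficient of the first term from 1 to 3.

```latex
\begin{align*}
|\operatorname{Im}(Gv,v)|
  &\le \|\Lambda^{1/2} v\|_{L^2}^2 + 3\beta \|v\|_{L^2}^2
     && \text{by (3.5)} \\
  &\le 3\left( \|\Lambda^{1/2} v\|_{L^2}^2 + \beta \|v\|_{L^2}^2 \right) \\
  &\le 3 \operatorname{Re}(Gv,v)
     && \text{by (3.4)},
\end{align*}
\[
\text{so that } \operatorname{Re}(Gv,v) > 0
\quad \text{and} \quad
|\arg (Gv,v)| \le \arctan(3) = \vartheta_0 .
\]
```

For nonzero mean-free $v$ one has $\|\Lambda^{1/2} v\|_{L^2} > 0$, so the middle inequality is strict, which is consistent with the open sector in (3.6).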

We now claim that all real $x < 0$ are in the resolvent set $\rho(G)$ of the operator $G$. Recall that $G = -L + 2\beta I$, and moreover that the spectrum of the operator $L$ is contained in the half plane $\{z \in \mathbb{C} : \operatorname{Re} z \le \lambda\}$, where $0 < \lambda = \operatorname{Re} \mu$, and $\mu$ is the eigenvalue of $L$ with largest real part with associated eigenfunction $\phi$. Since $q_0$ is divergence free we also have that
$$\mu \|\phi\|_{L^2}^2 = (L\phi, \phi) = -\|\Lambda^{1/2} \phi\|_{L^2}^2 - (R(\phi) \cdot \nabla \theta_0, \phi), \tag{3.8}$$
and by taking real parts this implies that $\lambda \le \|\nabla \theta_0\|_{L^\infty} = \beta$; hence the spectrum of $G$ is contained in the right half plane, proving the claim.

Footnote 1. The steady flow $q_0 = (\sin m x_2, 0)$ gives an example for which the operator $L$ has unstable eigenvalues over $L^2$. This follows from an extension of the analysis in Friedlander and Shvydkoy [FS] to the dissipative equations (see also Meshalkin and Sinai [MS]).



We have hence proven that $\Sigma_\vartheta$ is contained in the complement of $S(G)$ and has non-empty intersection with $\rho(G)$; by [P, Theorem 1.3.9] we have that $\Sigma_\vartheta \subset \rho(G)$ and for every $z \in \Sigma_\vartheta$ we have the resolvent estimate
$$\|R(z : G)\|_{L^2 \to L^2} \le \frac{1}{\operatorname{dist}(z, S(G))} \le \frac{1}{C|z|}. \tag{3.9}$$
Therefore $-G$ is the infinitesimal generator of an analytic semigroup (cf. [P, Theorem 2.5.2]) and so $L_\delta = -G + (2\beta - \lambda - \delta) I$ generates an analytic semigroup on $L^2$, since it is a bounded perturbation of $-G$.
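Analyticity matters later because it yields smoothing bounds of the form t^{-γ} for fractional powers applied to the semigroup. As a rough illustration only, consider the model operator −Λ instead of L_δ (an assumption made for simplicity, not the operator of the lemma): on the Fourier mode k, the operator Λ^γ e^{−Λt} acts as multiplication by |k|^γ e^{−|k|t}, and the supremum of μ^γ e^{−μt} over μ > 0 equals (γ/(e t))^γ. The sketch compares this closed form with a brute-force maximum; the function names and grid parameters are hypothetical.

```python
import math


def sup_on_grid(gamma: float, t: float, step: float = 0.01, n: int = 200_000) -> float:
    """Maximum of mu**gamma * exp(-mu * t) over the grid mu = step, 2*step, ..., n*step."""
    return max((i * step) ** gamma * math.exp(-i * step * t) for i in range(1, n + 1))


def closed_form(gamma: float, t: float) -> float:
    """Exact supremum over mu > 0, attained at mu = gamma / t."""
    return (gamma / (math.e * t)) ** gamma


for t in (0.5, 1.0, 2.0):
    # The two printed values should agree closely; both scale like t**(-gamma).
    print(t, sup_on_grid(0.5, t), closed_form(0.5, t))
```

The closed form shows the t^{-γ} scaling directly, since closed_form(γ, t) · t^γ = (γ/e)^γ does not depend on t.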

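Now we state and prove the lemma that will be used in the proof of our main result, Theorem 2.1.

Lemma 3.2. For $0 \le \gamma \le 1$ there exists a constant $C > 0$ such that
$$\|e^{L_\delta t} v\|_{L^2 \to L^2} \le \frac{C}{t^\gamma} \|v\|_{L^2}^{1-\gamma} \|\Lambda^{-1} v\|_{L^2}^{\gamma}, \tag{3.10}$$
for all smooth functions $v \in L^2$, where $C = C(\gamma, \delta, \alpha, \theta_0)$.

Proof. Since $q_0$ is divergence free, it is convenient to use the operator $A_\alpha$, defined via
$$A_\alpha v = -q_0 \cdot \nabla v - \Lambda v - \alpha v = L_\delta v + R(v) \cdot \nabla \theta_0 - (\alpha - \lambda - \delta) v, \tag{3.11}$$
where $\alpha > \max\{\lambda + \delta, C\|\theta_0\|_{H^{2+\epsilon}}^2\}$, $\epsilon > 0$, and $C$ is a sufficiently large dimensional constant. We treat $L_\delta$ as a bounded perturbation of $A_\alpha$. The operator $A_\alpha$ is also elliptic and has discrete spectrum, so by possibly choosing a different $\alpha$, we have that $A_\alpha^{-1} \in \mathcal{L}(L^2)$. First, we claim that
$$\|A_\alpha^{-1} v\|_{L^2} \le C \|\Lambda^{-1} v\|_{L^2}, \tag{3.12}$$
for all smooth $v \in L^2$ with zero mean. In order to prove this, denote $h = A_\alpha^{-1} v$, which also has zero mean, and observe that (3.12) is equivalent to
$$\|h\|_{L^2} \le C \|\Lambda^{-1} A_\alpha h\|_{L^2}. \tag{3.13}$$
The definition of $A_\alpha$ implies that $(\Lambda^{-1} A_\alpha h, h) = -(\Lambda^{-1}(q_0 \cdot \nabla h), h) - \|h\|_{L^2}^2 - \alpha \|\Lambda^{-1/2} h\|_{L^2}^2$, and therefore
$$\|h\|_{L^2}^2 + \alpha \|\Lambda^{-1/2} h\|_{L^2}^2 \le \|\Lambda^{-1} A_\alpha h\|_{L^2} \|h\|_{L^2} + |(q_0 \cdot \nabla h, \Lambda^{-1} h)|. \tag{3.14}$$
Note that $(q_0 \cdot \nabla \Lambda^{-1/2} h, \Lambda^{-1/2} h) = 0$ since $\operatorname{div} q_0 = 0$. Using Plancherel’s theorem, we write this inner product in terms of Fourier coefficients (cf. [KV] and references therein)
$$(q_0 \cdot \nabla h, \Lambda^{-1} h) = (q_0 \cdot \nabla h, \Lambda^{-1} h) - (q_0 \cdot \nabla \Lambda^{-1/2} h, \Lambda^{-1/2} h) = i (2\pi)^2 \sum_{j+k+l=0} \hat q_{0,j} \cdot k \left( |l|^{-1/2} - |k|^{-1/2} \right) \hat h_k\, |l|^{-1/2}\, \hat h_l. \tag{3.15}$$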



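In the above summation, the Fourier frequencies $j, k, l \in \mathbb{Z}^2 \setminus \{0\}$ because $q_0$ and $h$ are mean free, and $\hat h_k$ denotes the $k$th Fourier coefficient of $h$. Since $|l| = |j + k|$ the triangle inequality gives $\big| |l| - |k| \big| \le |j|$, and therefore
$$|k| \left| \frac{1}{|l|^{1/2}} - \frac{1}{|k|^{1/2}} \right| \le |k| \frac{\big| |l|^{1/2} - |k|^{1/2} \big|}{|l|^{1/2} |k|^{1/2}} \le \frac{|j| |k|}{|l|^{1/2} |k|^{1/2} (|l|^{1/2} + |k|^{1/2})} \le |j|.$$
Therefore, by (3.15) and the Cauchy-Schwarz inequality we have that
$$\begin{aligned}
|(q_0 \cdot \nabla h, \Lambda^{-1} h)| &\le C \sum_{j+k+l=0} |j| |\hat q_{0,j}| |\hat h_k| |l|^{-1/2} |\hat h_l| \\
&\le C \sum_{j \in \mathbb{Z}^2 \setminus \{0\}} |j| |\hat q_{0,j}| \sum_{l \in \mathbb{Z}^2 \setminus \{0, -j\}} |\hat h_{-j-l}| |l|^{-1/2} |\hat h_l| \\
&\le C \|h\|_{L^2} \|\Lambda^{-1/2} h\|_{L^2} \sum_{j \in \mathbb{Z}^2 \setminus \{0\}} |j|^{2+\epsilon} |\hat q_{0,j}| |j|^{-1-\epsilon} \\
&\le C \|h\|_{L^2} \|\Lambda^{-1/2} h\|_{L^2} \|\Lambda^{2+\epsilon} \theta_0\|_{L^2}.
\end{aligned} \tag{3.16}$$
We insert the above estimate into (3.14) and apply Young’s inequality $ab \le \frac{a^2}{4} + b^2$, thus obtaining
$$\frac{1}{2} \|h\|_{L^2}^2 + \left( \alpha - C \|\Lambda^{2+\epsilon} \theta_0\|_{L^2}^2 \right) \|\Lambda^{-1/2} h\|_{L^2}^2 \le \|\Lambda^{-1} A_\alpha h\|_{L^2}^2.$$
Since $\alpha > C \|\theta_0\|_{H^{2+\epsilon}}^2$, the above estimate proves (3.13). Now we prove that for smooth $v \in L^2$ we have
$$\|L_\delta^{-1} A_\alpha v\|_{L^2} \le C \|v\|_{L^2}, \tag{3.17}$$
for a sufficiently large constant $C > 0$. The inequality (3.17) follows by writing
$$L_\delta^{-1} A_\alpha v = v + L_\delta^{-1}(R(v) \cdot \nabla \theta_0) - (\alpha - \delta - \lambda) L_\delta^{-1} v, \tag{3.18}$$
and noting that the operator $L_\delta^{-1}$ is bounded on $L^2$ (cf. [P, Lemma 2.6.3]). Together with the boundedness of the Riesz transforms on $L^2$, (3.18) implies
$$\|L_\delta^{-1} A_\alpha v\|_{L^2} \le \|v\|_{L^2} \left( 1 + C(\|\nabla \theta_0\|_{L^\infty} + \alpha - \delta - \lambda) \right), \tag{3.19}$$
which proves (3.17) and therefore $L_\delta^{-1} A_\alpha \in \mathcal{L}(L^2)$. In order to conclude the proof of the lemma we use the fact that $L_\delta$ generates an analytic semigroup (cf. Lemma 3.1) and therefore (cf. [P, Theorem 2.6.13]) we have that
$$\|e^{L_\delta t} v\|_{L^2 \to L^2} = \|L_\delta^{\gamma} e^{L_\delta t} L_\delta^{-\gamma} v\|_{L^2 \to L^2} \le \frac{C}{t^\gamma} \|L_\delta^{-\gamma} v\|_{L^2}. \tag{3.20}$$
Now we bound $\|L_\delta^{-\gamma} v\|_{L^2}$ by interpolating (cf. [P, Theorem 2.6.10]) as follows:
$$\begin{aligned}
\|L_\delta^{-\gamma} v\|_{L^2} = \|L_\delta^{1-\gamma} (L_\delta^{-1} v)\|_{L^2} &\le C \|v\|_{L^2}^{1-\gamma} \|L_\delta^{-1} v\|_{L^2}^{\gamma} \\
&\le C \|v\|_{L^2}^{1-\gamma} \|(L_\delta^{-1} A_\alpha)(A_\alpha^{-1} \Lambda)(\Lambda^{-1} v)\|_{L^2}^{\gamma} \\
&\le C \|v\|_{L^2}^{1-\gamma} \|\Lambda^{-1} v\|_{L^2}^{\gamma},
\end{aligned} \tag{3.21}$$
where in order to obtain (3.21) we used (3.17) and (3.12). We conclude the proof of the lemma by combining (3.20) and (3.21).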
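The elementary frequency bound used in the proof of Lemma 3.2, namely |k| · | |l|^{-1/2} − |k|^{-1/2} | ≤ |j| whenever j + k + l = 0 with all three frequencies nonzero, can be sanity-checked by brute force. The sketch below is only a numerical illustration over a small box of lattice points; the function names and the box size are hypothetical, and a finite search is of course not a proof.

```python
import itertools
import math


def multiplier(k, l):
    """|k| * | |l|^(-1/2) - |k|^(-1/2) | for nonzero frequencies k, l in Z^2."""
    nk, nl = math.hypot(*k), math.hypot(*l)
    return nk * abs(nl ** -0.5 - nk ** -0.5)


def worst_ratio(n: int) -> float:
    """Largest value of multiplier(k, l) / |j| over |k_i|, |l_i| <= n with j + k + l = 0."""
    box = list(itertools.product(range(-n, n + 1), repeat=2))
    worst = 0.0
    for k in box:
        for l in box:
            j = (-k[0] - l[0], -k[1] - l[1])
            if (0, 0) in (j, k, l):
                continue  # all three frequencies are nonzero in (3.15)
            worst = max(worst, multiplier(k, l) / math.hypot(*j))
    return worst


print(worst_ratio(6) <= 1.0)
```

The last step of the displayed chain relies on |l| ≥ 1 for nonzero lattice points, which is why the search is restricted to integer frequencies.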



4. Proof of Theorem 2.1 Here we prove Theorem 2.1. In order to do this it is sufficient to show that the trivial solution θ = 0 of (2.3) is (H 1 , L 2 ) Lyapunov unstable. With this goal in mind, we consider a family of solutions θ ε to ∂t θ ε = Lθ ε + N (θ ε ), θ ε |t=0 = εφ,

(4.1) (4.2)

where $\phi$ is as above an eigenfunction of $L$ associated with the eigenvalue $\lambda$ with maximal positive real part. We will prove the following proposition that clearly implies the desired Lyapunov instability result.

Proposition 4.1. There exist positive constants $\bar C$ and $\bar\varepsilon \le 1$ such that for every $\varepsilon \in (0, \bar\varepsilon)$, there exists $T_\varepsilon > 0$ such that $\|\theta^\varepsilon(T_\varepsilon)\|_{L^2} \ge \bar C$.

We remark that if $\theta^\varepsilon(x,t)$ solves (4.1)–(4.2), then the function $\Theta^\varepsilon(x,t) = \theta^\varepsilon(x,t) + \theta_0(x)$ solves the forced QG equations (5.1)–(5.3), with initial data $\Theta^\varepsilon(x,0) = \theta_0(x) + \varepsilon\phi(x) \in C^\infty(\mathbb{T}^2)$. Moreover, in Lemma 5.1 of Sect. 5 we prove that the global smooth solution of the forced QG equations satisfies $\|\nabla\Theta^\varepsilon(t)\|_{L^\infty} \le C_0^\varepsilon$ for all $t \ge 0$, where the constant $C_0^\varepsilon$ depends solely on the $L^\infty$ and $W^{1,\infty}$ norms of the initial data and the force. For $\varepsilon \in (0,1]$, we have $\|\Theta^\varepsilon(0)\|_{L^\infty} \le \|\theta_0\|_{L^\infty} + \|\phi\|_{L^\infty}$, and similarly $\|\nabla\Theta^\varepsilon(0)\|_{L^\infty} \le \|\nabla\theta_0\|_{L^\infty} + \|\nabla\phi\|_{L^\infty}$, which are independent of $\varepsilon$, and therefore there exists a fixed $C_0 > 0$ such that $\|\nabla\Theta^\varepsilon(t)\|_{L^\infty} \le C_0$, for all $\varepsilon \in (0,1]$ and for all $t \ge 0$. We refer the reader to the proof of Lemma 5.1 for further details. The triangle inequality then implies that by possibly increasing $C_0$ we have

$$\sup_{t\ge 0}\|\nabla\theta^\varepsilon(t)\|_{L^\infty} \le C_0 \tag{4.3}$$

for all $\varepsilon \in (0,1]$. We will henceforth denote $\theta^\varepsilon$ simply as $\theta$ and will use the analogous notation for $q$. All constants in the following are $\varepsilon$-independent.

Proof of Proposition 4.1. For $R > C_\phi := \|\phi\|_{L^2}$ to be chosen later, let $T = T(R,\varepsilon)$ be the maximal time such that

$$\|\theta(t)\|_{L^2} \le \varepsilon R e^{\lambda t}, \quad \text{for } t \in [0,T]. \tag{4.4}$$

Clearly $T \in (0,\infty]$ due to the strong continuity in $L^2$ of $t \mapsto \theta(t)$ and the chosen initial condition. Using Duhamel's formula we write the solution of (4.1)–(4.2) as

$$\theta(t) = e^{Lt}\varepsilon\phi + B(t), \tag{4.5}$$

where

$$B(t) = \int_0^t e^{L(t-s)}N(\theta)(s)\,ds. \tag{4.6}$$

First, we shall prove that

$$\|B(t)\|_{L^2} \le C_1\left(\varepsilon R e^{\lambda t}\right)^{1+\gamma/2}, \tag{4.7}$$

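where $\gamma \in (0,1)$ and $C_1 = C(C_0,\lambda,\delta,\gamma) > 0$ are constants. To show (4.7), we rewrite the operator $B$ and then use Lemma 3.2 as follows:

$$\begin{aligned}
\|B(t)\|_{L^2} &= \left\|\int_0^t e^{(\lambda+\delta)(t-s)}e^{L_\delta(t-s)}N(\theta(s))\,ds\right\|_{L^2}\\
&\le \int_0^t e^{(\lambda+\delta)(t-s)}\left\|e^{L_\delta(t-s)}N(\theta(s))\right\|_{L^2}ds\\
&\le C\int_0^t e^{(\lambda+\delta)(t-s)}\frac{1}{(t-s)^{\gamma}}\|N(\theta(s))\|_{L^2}^{1-\gamma}\|\Lambda^{-1}N(\theta(s))\|_{L^2}^{\gamma}\,ds,
\end{aligned} \tag{4.8}$$

where $\gamma \in (0,1)$ is arbitrary, and $C > 0$. In order to bound the factor $\|\Lambda^{-1}N(\theta(s))\|_{L^2}$ we recall the explicit representation (cf. [CC]) of the nonlinear term

$$\Lambda^{-1}\big(R(\theta)\cdot\nabla\theta\big) = C_n\big(R_1(\theta R_2(\theta)) - R_2(\theta R_1(\theta))\big), \tag{4.9}$$

for some dimensional constant $C_n$, and the fact that the Riesz transforms are bounded on $L^2$ and $L^4$, to obtain that

$$\|\Lambda^{-1}N(\theta(s))\|_{L^2} \le C\|\theta R_i\theta\|_{L^2} \le C\|\theta\|_{L^4}^{2}. \tag{4.10}$$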

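By interpolating, we have

$$\|\theta\|_{L^4} \le C\|\theta\|_{L^2}^{1/3}\|\theta\|_{L^8}^{2/3}. \tag{4.11}$$

On the other hand by the Gagliardo-Nirenberg inequality and the Hölder inequality we have that (cf. [N])

$$\|\theta\|_{L^8} = \|\theta^{8/3}\|_{L^3}^{3/8} \le C\|\nabla(\theta^{8/3})\|_{L^{6/5}}^{3/8} \le C\|\theta^{5/3}\nabla\theta\|_{L^{6/5}}^{3/8} \le C\|\theta\|_{L^2}^{5/8}\|\nabla\theta\|_{L^\infty}^{3/8}. \tag{4.12}$$

By combining (4.10) with (4.11) and (4.12) we obtain

$$\|\Lambda^{-1}N(\theta(s))\|_{L^2}^{\gamma} \le C\|\theta\|_{L^2}^{3\gamma/2}\|\nabla\theta\|_{L^\infty}^{\gamma/2}. \tag{4.13}$$

On the other hand, by Hölder's inequality, and the boundedness of the Riesz transforms on $L^2$, we have

$$\|N(\theta)\|_{L^2}^{1-\gamma} \le \|\theta\|_{L^2}^{1-\gamma}\|\nabla\theta\|_{L^\infty}^{1-\gamma}. \tag{4.14}$$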

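Recall that by (4.3) we have $\|\nabla\theta(t)\|_{L^\infty} \le C_0$, for all $t \ge 0$. Using assumption (4.4) and the fact that $0 < \delta < C_\lambda = \lambda\gamma/2$, we substitute the bounds (4.13) and (4.14) into (4.8), to conclude

$$\|B(t)\|_{L^2} \le C_1\left(\varepsilon R e^{\lambda t}\right)^{1+\gamma/2}, \tag{4.15}$$

for some positive constant $C_1 = C(C_0,\lambda,\delta,q,\gamma)$, proving (4.7). The Duhamel formula (4.5) and the bound (4.7) imply

$$\|\theta(t)\|_{L^2} \le C_\phi\,\varepsilon e^{\lambda t} + C_1\left(\varepsilon R e^{\lambda t}\right)^{1+\gamma/2}. \tag{4.16}$$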
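The substitution of (4.13) and (4.14) into (4.8) is stated without the intermediate computation. The following is a sketch of that bookkeeping, assuming only (4.3), (4.4) and $0 < \delta < \lambda\gamma/2$; the generic constant $C$ and the closing bound through the Gamma function are our choice of presentation, not taken from the source.

```latex
\begin{align*}
\|N(\theta(s))\|_{L^2}^{1-\gamma}\,\|\Lambda^{-1}N(\theta(s))\|_{L^2}^{\gamma}
  &\le C\,\|\theta(s)\|_{L^2}^{1+\gamma/2}\,\|\nabla\theta(s)\|_{L^\infty}^{1-\gamma/2}
   \le C\,C_0^{1-\gamma/2}\,\bigl(\varepsilon R e^{\lambda s}\bigr)^{1+\gamma/2},\\
\|B(t)\|_{L^2}
  &\le C\,(\varepsilon R)^{1+\gamma/2}\int_0^t
     \frac{e^{(\lambda+\delta)(t-s)}\,e^{\lambda(1+\gamma/2)s}}{(t-s)^{\gamma}}\,ds\\
  &= C\,\bigl(\varepsilon R e^{\lambda t}\bigr)^{1+\gamma/2}\int_0^t
     \frac{e^{-(\lambda\gamma/2-\delta)\tau}}{\tau^{\gamma}}\,d\tau
     \qquad (\tau = t-s)\\
  &\le C\,\frac{\Gamma(1-\gamma)}{(\lambda\gamma/2-\delta)^{1-\gamma}}\,
     \bigl(\varepsilon R e^{\lambda t}\bigr)^{1+\gamma/2}.
\end{align*}
```

The last integral converges at $\tau = 0$ because $\gamma < 1$ and is bounded uniformly in $t$ because $\delta < \lambda\gamma/2$, which is where both restrictions on the parameters enter.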



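Observing that $R$ was chosen such that $R > C_\phi$, it follows that we have the following estimate on the maximal time $T$:

$$\varepsilon e^{\lambda T} \ge \left(\frac{R - C_\phi}{C_1R^{1+\gamma/2}}\right)^{2/\gamma} =: C_2 > 0, \tag{4.17}$$

which clearly holds if $T = \infty$. On the other hand, if $T$ is finite, (4.17) is obtained by combining the continuity of $t \mapsto \|\theta(t)\|_{L^2}$, (4.4) and (4.16) to obtain

$$\varepsilon R e^{\lambda T} \le C_\phi\,\varepsilon e^{\lambda T} + C_1R^{1+\gamma/2}\,\varepsilon e^{\lambda T}\left(\varepsilon e^{\lambda T}\right)^{\gamma/2},$$

which, in turn, implies (4.17). Therefore we have $T \ge T_\varepsilon$, where we defined

$$T_\varepsilon = \frac{1}{\lambda}\ln\frac{C_2}{\varepsilon}. \tag{4.18}$$

To conclude the proof we must find a lower bound on $\|\theta(T_\varepsilon)\|_{L^2}$. We use Duhamel's formula (4.5), the triangle inequality, and (4.15) to obtain

$$\|\theta(T_\varepsilon)\|_{L^2} \ge C_\phi\,\varepsilon e^{\lambda T_\varepsilon} - C_1\left(\varepsilon R e^{\lambda T_\varepsilon}\right)^{1+\gamma/2}. \tag{4.19}$$

Using (4.18), with $C_2$ given by (4.17), the lower bound (4.19) implies

$$\|\theta(T_\varepsilon)\|_{L^2} \ge C_2\left(C_\phi - C_1R^{1+\gamma/2}\,\frac{R - C_\phi}{C_1R^{1+\gamma/2}}\right) = C_2(2C_\phi - R) =: \bar C > 0,$$

by choosing $C_\phi < R < 2C_\phi$. This concludes the proof of the proposition which, in turn, implies Theorem 2.1.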
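The algebra behind (4.17)–(4.19) can be checked numerically. The sketch below is only an illustration: the function name `instability_constants` and the sample values of $C_\phi$, $C_1$, $R$, $\gamma$, $\lambda$, $\varepsilon$ are hypothetical and do not come from the paper. It computes $C_2$ from (4.17), $T_\varepsilon$ from (4.18), and the right-hand side of (4.19), which should agree with $C_2(2C_\phi - R)$.

```python
import math


def instability_constants(C_phi, C1, R, gamma, lam, eps):
    """Return (C2, T_eps, lower_bound) following (4.17), (4.18), (4.19).

    All arguments are illustrative positive numbers; the proof requires
    C_phi < R < 2*C_phi and 0 < gamma < 1.
    """
    if not (C_phi < R < 2 * C_phi):
        raise ValueError("need C_phi < R < 2*C_phi")
    if not (0 < gamma < 1):
        raise ValueError("need 0 < gamma < 1")
    # (4.17): definition of C2
    C2 = ((R - C_phi) / (C1 * R ** (1 + gamma / 2))) ** (2 / gamma)
    # (4.18): time at which eps * exp(lam * T) reaches C2
    T_eps = math.log(C2 / eps) / lam
    growth = eps * math.exp(lam * T_eps)  # equals C2 up to rounding
    # right-hand side of (4.19)
    lower_bound = C_phi * growth - C1 * (R * growth) ** (1 + gamma / 2)
    return C2, T_eps, lower_bound
```

Note that $T_\varepsilon > 0$ only when $\varepsilon < C_2$, which is one reason the proposition restricts to sufficiently small $\varepsilon$.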

5. Global Well-Posedness for the Forced QG Equation

In this section, by modifying the argument of Kiselev et al. [KNV], we prove that the forced QG equation has a unique global smooth solution. More precisely, we prove the following:

Lemma 5.1. Assume that $\Theta_0, f \in C^\infty$ are $\mathbb{T}^2$-periodic functions with zero mean. Then there exists a unique global in time smooth solution $\Theta$ of

$$\partial_t\Theta + U\cdot\nabla\Theta + \Lambda\Theta = f, \tag{5.1}$$

$$U = R(\Theta) = (R_2\Theta, -R_1\Theta), \tag{5.2}$$

$$\Theta(0) = \Theta_0. \tag{5.3}$$

Moreover for all $t \ge 0$ we have

$$\|\nabla\Theta(t)\|_{L^\infty} \le C_0, \tag{5.4}$$

where $C_0 = C_0(\|\Theta_0\|_{L^\infty}, \|\nabla\Theta_0\|_{L^\infty}, \|f\|_{L^\infty}, \|\nabla f\|_{L^\infty})$ is a positive constant.



The proof of the lemma is in the spirit of [KNV], but we additionally need to treat the force term, which a priori could cause growth of the solution. Since $\Theta(t)$ is mean free, it can be shown a priori that $\|\Theta(t)\|_{L^p}$, with $2 \le p \le \infty$, remains bounded for all time. However the same methods do not work for the subcritical quantity $\|\nabla\Theta(t)\|_{L^\infty}$, and therefore we need to prove the nonlocal maximum principle of [KNV] for the forced QG equation (5.1)–(5.3). This is achieved by suitably choosing a scaling parameter $B$ and making use of the fact that due to periodicity we do not need to consider arbitrarily large length scales. We note that the scaling parameter $B$ is used only in the modulus of continuity, whereas the solutions to (5.1)–(5.3) are not space-time rescaled.

Proof of Lemma 5.1. We recall that a continuous, increasing, unbounded, concave function $\omega : [0,\infty) \to [0,\infty)$, with $\omega(0) = 0$ is a modulus of continuity for a function $f$ if

$$|f(x) - f(y)| \le \omega(|x-y|), \tag{5.5}$$

for all $x, y \in \mathbb{R}^2$. The modulus is strict if the strict inequality holds in (5.5). We consider a modulus of continuity that also satisfies $\omega'(0) < \infty$, and $\lim_{\xi\to 0^+}\omega''(\xi) = -\infty$, namely, as in [KNV] we let

$$\omega(\xi) = \begin{cases} \xi - \xi^{3/2}, & 0 \le \xi \le \delta,\\ \delta - \delta^{3/2} + \gamma\log\left(1 + \tfrac14\log(\xi/\delta)\right), & \xi > \delta, \end{cases} \tag{5.6}$$

where $\delta > \gamma > 0$ are sufficiently small fixed constants. Since $\Theta_0 \in C^\infty$, there exists a sufficiently large $B > 0$ such that $\Theta_0$ has strict modulus of continuity $\omega_B(\xi) = \omega(B\xi)$. The scaling parameter $B$ may be chosen as

$$B = C\|\nabla\Theta_0\|_{L^\infty}\exp\left(\exp\left(C\|\Theta_0\|_{L^\infty}\right)\right), \tag{5.7}$$

where $C$ is a sufficiently large positive constant. Moreover, since $\omega$ is unbounded, by possibly increasing $B$ we may ensure that

$$AB^2 \ge \|\nabla f\|_{L^\infty}, \tag{5.8}$$

where the fixed dimensional constant $A$ is as in [KNV, Lemma], and also

$$\frac{\omega_B(d)}{d} \ge 4\pi\|f\|_{L^\infty}, \tag{5.9}$$

where $d = \operatorname{diam}(\mathbb{T}^2) = 2\pi\sqrt{2}$ will be fixed throughout this section. We fix a $B$ that satisfies (5.7)–(5.9) and note that the modulus of continuity is now given by

$$\omega_B(\xi) = \omega(B\xi) = \begin{cases} B\xi - (B\xi)^{3/2}, & 0 \le \xi \le \delta/B,\\ \delta - \delta^{3/2} + \gamma\log\left(1 + \tfrac14\log\left(\tfrac{B\xi}{\delta}\right)\right), & \xi > \delta/B. \end{cases} \tag{5.10}$$

Denote $\omega_B'(\xi) = B\omega'(B\xi)$ and $\omega_B''(\xi) = B^2\omega''(B\xi)$. We claim that $\omega_B(\xi)$ is preserved by the evolution (2.3), so that $\Theta$ is a global solution. We extend $\Theta$, $U$, $\Theta_0$, $f$ to $\mathbb{T}^2$-periodic functions on $\mathbb{R}^2$. The first step of the proof is to show that if $\Theta(t)$ has strict modulus of continuity $\omega_B$ for $t \in [0,T]$, then there exists $\tau > 0$ such that $\Theta(t)$ has strict modulus of continuity $\omega_B$ on $t \in [0, T+\tau)$. Since $\|\nabla\Theta(t)\|_{L^\infty} < \omega_B'(0)$, we have that $\Theta(t) \in C^\infty$ for all



$t \in [0,T]$, and by the local regularity theorem (cf. [CC,J1,J2,M]) for some time $\tau > 0$ beyond $T$. We must show that by possibly shrinking $\tau$, we have that

$$|\Theta(x,t) - \Theta(y,t)| < \omega_B(|x-y|) \tag{5.11}$$

for all $t \in (T, T+\tau)$ and $x \ne y \in \mathbb{R}^2$. Define the compact set $K = [-2\pi,2\pi]^2 \times [-2\pi,2\pi]^2 \subset \mathbb{R}^4$. Since $\Theta$ is $\mathbb{T}^2$-space periodic, we have that for any $(x,y) \in \mathbb{R}^4$, with $x \ne y$, there exist $(x',y') \in K$, with $x' \ne y'$, such that $|x'-y'| \le |x-y|$, $\Theta(x,t) = \Theta(x',t)$, and $\Theta(y,t) = \Theta(y',t)$. Because $\omega_B$ is increasing, if (5.11) holds for all $(x',y') \in K$ with $x' \ne y'$, then we have that for all $x \ne y \in \mathbb{R}^2$,

$$|\Theta(x,t) - \Theta(y,t)| = |\Theta(x',t) - \Theta(y',t)| < \omega_B(|x'-y'|) \le \omega_B(|x-y|).$$

Therefore it is sufficient to prove that there exists $\tau > 0$ such that (5.11) holds for $x \ne y$, with $(x,y) \in K$. By assumption, there exists $\epsilon > 0$ such that $\|\nabla\Theta(T)\|_{L^\infty} < \omega_B'(0) - 2\epsilon$, and by continuity, for small enough $\tau$ we have that $\|\nabla\Theta(t)\|_{L^\infty} < \omega_B'(0) - \epsilon$ for all $t \in [T, T+\tau)$. Therefore for $(x,y) \in K$, with $0 < |x-y| = \xi < \rho$, where $\rho \le \min(\delta/B, \epsilon^2/B^3)$, we have

$$|\Theta(x,t) - \Theta(y,t)| \le \xi\|\nabla\Theta(t)\|_{L^\infty} < \xi(B - \epsilon) \le B\xi - (B\xi)^{3/2} = \omega_B(\xi),$$

for all $t \in [T, T+\tau)$. On the other hand, due to the continuity in time of $|\Theta(x,t) - \Theta(y,t)|$, the compactness of the set $\{(x,y) \in K : |x-y| \ge \rho\}$, and the fact that (5.11) holds at $t = T$, we have that there is a sufficiently small $\tau > 0$ such that (5.11) holds for all $(x,y) \in K$, $x \ne y$, and $t \in [T, T+\tau)$.

The second part is to rule out the case in which there exists $T > 0$ and $x \ne y \in \mathbb{R}^2$ such that $\Theta(x,T) - \Theta(y,T) = \omega_B(|x-y|)$ (cf. [KNV]). Note that by the periodicity of $\Theta$, for such $x \ne y \in \mathbb{R}^2$ fixed, there exist $x', y' \in \mathbb{T}^2$ such that

$$\omega_B(|x-y|) = \Theta(x,T) - \Theta(y,T) = \Theta(x',T) - \Theta(y',T) \le \omega_B(|x'-y'|) \le \omega_B(d),$$

and since $\omega_B$ is increasing, we must have $0 < \xi = |x-y| \le d = \operatorname{diam}(\mathbb{T}^2)$. We conclude by showing that

$$\frac{d}{dt}\big(\Theta(x,t) - \Theta(y,t)\big)\Big|_{t=T} < 0, \tag{5.12}$$

which contradicts the fact that the strict modulus of continuity is lost at $t = T$. In the following we suppress the time dependence of $\Theta$ and $U$, since we work at $t = T$ fixed. Since $\Theta$ has modulus of continuity $\omega_B(\xi)$, we know (cf. [KNV, Lemma]) that $U$ has modulus of continuity $\Omega_B(\xi)$, where we defined

$$\Omega_B(\xi) = A\left(\int_0^\xi \frac{\omega_B(\eta)}{\eta}\,d\eta + \xi\int_\xi^\infty \frac{\omega_B(\eta)}{\eta^2}\,d\eta\right),$$

for some positive constant $A$. Then as in [KNV, Sect. 4] we have that

$$|(U\cdot\nabla\Theta)(x) - (U\cdot\nabla\Theta)(y)| \le \lim_{h\to 0^+}\frac{\omega_B(\xi + h|U(x) - U(y)|) - \omega_B(\xi)}{h} \le |U(x) - U(y)|\,\omega_B'(\xi) \le \Omega_B(\xi)\,\omega_B'(\xi). \tag{5.13}$$



The dissipative terms are estimated as in [KNV, Sect. 5], namely by the negative quantity

$$M_B(\xi) = \frac{1}{\pi}\int_0^{\xi/2}\frac{\omega_B(\xi+2\eta) + \omega_B(\xi-2\eta) - 2\omega_B(\xi)}{\eta^2}\,d\eta + \frac{1}{\pi}\int_{\xi/2}^{\infty}\frac{\omega_B(2\eta+\xi) - \omega_B(2\eta-\xi) - 2\omega_B(\xi)}{\eta^2}\,d\eta. \tag{5.14}$$

Lastly, the force term is estimated using the mean value theorem

$$|f(x) - f(y)| \le F_B(\xi) = \begin{cases} \xi\|\nabla f\|_{L^\infty}, & 0 \le \xi \le \delta/B,\\ 2\|f\|_{L^\infty}, & \xi > \delta/B. \end{cases} \tag{5.15}$$

Thus, in order to conclude the proof of (5.12), we must show that for all $0 < \xi \le d$, we have

$$\Omega_B(\xi)\,\omega_B'(\xi) + F_B(\xi) + M_B(\xi) < 0. \tag{5.16}$$

First we treat the case $0 < \xi \le \delta/B$. By keeping track of $B$, and using condition (5.8), similar arguments as in [KNV, Sect. 7] show that

$$\Omega_B(\xi)\,\omega_B'(\xi) + F_B(\xi) + M_B(\xi) \le AB^2\xi\left(3 + \log\frac{\delta}{B\xi}\right) + \xi\|\nabla f\|_{L^\infty} + \frac{\xi}{\pi}\,\omega_B''(\xi) \le B^2\xi\left(A\left(4 + \log\frac{\delta}{B\xi}\right) - \frac{3}{4\pi}(B\xi)^{-1/2}\right).$$

Since we have $0 < B\xi \le \delta$, the above quantity is strictly negative if $\delta$ is sufficiently small. Note that $\delta$ does not depend on $B$. For the case $\delta/B \le \xi \le d$, we follow the estimates in [KNV, Sect. 8] to conclude that if $\gamma$ and $\delta$ are sufficiently small, independent of $B$, then

$$\Omega_B(\xi)\,\omega_B'(\xi) + F_B(\xi) + M_B(\xi) \le A\gamma\,\frac{\omega_B(\xi)}{\xi} + 2\|f\|_{L^\infty} - \frac{1}{\pi}\frac{\omega_B(\xi)}{\xi}.$$

But $B$ was chosen so that (5.9) is satisfied, i.e. $2\|f\|_{L^\infty} \le \omega_B(d)/(2\pi d)$. Because on $[\delta/B, \infty)$ the function $\omega_B(\xi)/\xi$ is decreasing, for any $\xi \in [\delta/B, d]$ we have that $2\|f\|_{L^\infty} \le \omega_B(d)/(2\pi d) \le \omega_B(\xi)/(2\pi\xi)$. Thus

$$A\gamma\,\frac{\omega_B(\xi)}{\xi} + 2\|f\|_{L^\infty} - \frac{1}{\pi}\frac{\omega_B(\xi)}{\xi} \le \left(A\gamma + \frac{1}{2\pi} - \frac{1}{\pi}\right)\frac{\omega_B(\xi)}{\xi} < 0,$$

if $\gamma$ is sufficiently small, independent of $B$. Therefore (5.16) holds for all $0 < \xi \le d$, and so (5.12) is proven. Therefore the solution $\Theta(t)$ exists for all time and has strict modulus of continuity $\omega_B$, which implies that $\|\nabla\Theta(t)\|_{L^\infty} < \omega_B'(0) = B$ for all $t \ge 0$, concluding the proof of the lemma.
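The proof leans on three elementary properties of the rescaled modulus (5.10): continuity at the junction $\xi = \delta/B$, the slope $\omega_B'(0) = B$, and the monotonicity of $\omega_B(\xi)/\xi$ on $[\delta/B, d]$. A minimal numerical sketch of (5.6) and (5.10) follows; the function names and the sample values of $\delta$, $\gamma$, $B$ are illustrative assumptions, not values from the paper.

```python
import math


def omega(xi, delta, gamma):
    """Modulus of continuity (5.6); assumes delta > gamma > 0 and xi >= 0."""
    if xi < 0:
        raise ValueError("xi must be nonnegative")
    if xi <= delta:
        return xi - xi ** 1.5
    return delta - delta ** 1.5 + gamma * math.log(1 + 0.25 * math.log(xi / delta))


def omega_B(xi, B, delta, gamma):
    """Rescaled modulus (5.10): omega_B(xi) = omega(B * xi)."""
    return omega(B * xi, delta, gamma)


def slope_at_zero(B, delta, gamma, h=1e-10):
    """Difference quotient approximating omega_B'(0), which should be close to B."""
    return omega_B(h, B, delta, gamma) / h
```

Evaluating `omega_B(xi, B, delta, gamma) / xi` on a grid in $[\delta/B, 2\pi\sqrt{2}]$ gives a quick sanity check of the monotonicity used to pass from (5.9) at $\xi = d$ to every $\xi \in [\delta/B, d]$.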

Remark 5.2. We note that it is also possible to adapt the De Giorgi-type techniques used by Caffarelli and Vasseur [CV] to treat the forced QG equation. First one proves boundedness of the solution in $L^2$ using energy estimates, and then similarly to [CV, Sect. 2] one obtains boundedness (not decay) for all time of $\Theta(t)$ in $L^\infty$ and of $U(t)$ in $BMO$. The second step is to show that the solution is actually Hölder and that it remains bounded in this space for all $t \ge 0$. Adding a smooth force does not create additional difficulties. Since this is already subcritical regularity, in the third step it is standard to bootstrap to higher regularity and prove that the $W^{1,\infty}$ norm of $\Theta(t)$ is bounded in time.



Acknowledgements. We thank Hongjie Dong, Alexander Kiselev, Anna Mazzucato, Roman Shvydkoy and Alexis Vasseur for very helpful discussions. The work of S.F. is supported by NSF grant DMS 0803268. The work of N.P. is supported by NSF grant number DMS 0758247 and an Alfred P. Sloan Research Fellowship.

References

[BGS] Bardos, C., Guo, Y., Strauss, W.: Stable and unstable ideal plane flows. Dedicated to the Memory of Jacques-Louis Lions, Chinese Ann. Math. Ser. B 23(2), 149–164 (2002)
[CC] Córdoba, A., Córdoba, D.: A maximum principle applied to quasi-geostrophic equations. Commun. Math. Phys. 249(3), 511–528 (2004)
[CCW] Constantin, P., Cordoba, D., Wu, J.: On the critical dissipative quasi-geostrophic equation. Dedicated to Professors Ciprian Foias and Roger Temam (Bloomington, IN, 2000), Indiana Univ. Math. J. 50 (Special Issue), 97–107 (2001)
[CMT] Constantin, P., Majda, A.J., Tabak, E.: Formation of strong fronts in the 2-D quasi-geostrophic thermal active scalar. Nonlinearity 7(6), 1495–1533 (1994)
[CV] Caffarelli, L., Vasseur, A.: Drift diffusion equations with fractional diffusion and the quasi-geostrophic equation. To appear in Annals of Math, available at http://pjm.math.berkeley.edu/editorial/uploads/annals/accepted/090120-caffarelli/090120-caffarelli-v1.pdf
[CW] Constantin, P., Wu, J.: Behavior of solutions of 2D quasi-geostrophic equations. SIAM J. Math. Anal. 30(5), 937–948 (1999)
[DD] Dong, H., Du, D.: Global well-posedness and a decay estimate for the critical dissipative quasi-geostrophic equation in the whole space. Discrete Contin. Dyn. Syst. 21(4), 1095–1101 (2008)
[DP] Dong, H., Pavlović, N.: Regularity criteria for the dissipative quasi-geostrophic equations in Hölder spaces. Commun. Math. Phys. doi:10.1007/s00220-009-0756-x
[FPS] Friedlander, S., Pavlović, N., Shvydkoy, R.: Nonlinear instability for the Navier-Stokes equations. Commun. Math. Phys. 264, 335–347 (2006)
[FS] Friedlander, S., Shvydkoy, R.: The unstable spectrum of the surface quasi-geostrophic equation. J. Math. Fluid Mech. 7(suppl. 1), S81–S93 (2005)
[FSV] Friedlander, S., Strauss, W., Vishik, M.: Nonlinear instability in an ideal fluid. Ann. Inst. H. Poincaré Anal. Non Linéaire 14(2), 187–209 (1997)
[HPGS] Held, I.M., Pierrehumbert, R.T., Garner, S.T., Swanson, K.L.: Surface quasi-geostrophic dynamics. J. Fluid Mech. 282, 1–20 (1995)
[J1] Ju, N.: Existence and uniqueness of the solution to the dissipative 2D quasi-geostrophic equations in the Sobolev space. Commun. Math. Phys. 251(2), 365–376 (2004)
[J2] Ju, N.: Dissipative quasi-geostrophic equation: local well-posedness, global regularity and similarity solutions. Indiana Univ. Math. J. 56(1), 187–206 (2007)
[KN] Kiselev, A., Nazarov, F.: Global regularity for the critical dispersive dissipative surface quasi-geostrophic equation. Preprint
[KNV] Kiselev, A., Nazarov, F., Volberg, A.: Global well-posedness for the critical 2D dissipative quasi-geostrophic equation. Invent. Math. 167(3), 445–453 (2007)
[KV] Kukavica, I., Vicol, V.: On the radius of analyticity of solutions to the three-dimensional Euler equations. Proc. Amer. Math. Soc. 137, 669–677 (2009)
[L] Lin, Z.: Nonlinear instability of ideal plane flows. Int. Math. Res. Not. 41, 2147–2178 (2004)
[M] Miura, H.: Dissipative quasi-geostrophic equation for large initial data in the critical Sobolev space. Commun. Math. Phys. 267(1), 141–157 (2006)
[MS] Meshalkin, L., Sinai, Y.: Investigation of stability for a system of equations describing plane motion of a viscous incompressible fluid. Appl. Math. Mech. 25, 1140–1143 (1961)
[N] Nirenberg, L.: On elliptic partial differential equations. Annali Della Scuola Normale Superiore di Pisa - Classe di Scienze Sér. 3, 13(2), 115–162 (1959)
[P] Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Applied Mathematics Sciences v. 44, New York: Springer-Verlag, 1983
[Pe] Pedlosky, J.: Geophysical Fluid Dynamics. New York: Springer-Verlag, 1987
[S] Salmon, R.: Lectures on Geophysical Fluid Dynamics. New York: Oxford University Press, USA, 1998
[VF] Vishik, M., Friedlander, S.: Nonlinear instability in two dimensional ideal fluids: the case of a dominant eigenvalue. Commun. Math. Phys. 243(2), 261–273 (2003)
[W] Wu, J.: The 2d dissipative quasi-geostrophic equation. Appl. Math. Lett. 15(8), 925–930 (2002)
[Y] Yudovich, V.I.: The Linearization Method in Hydrodynamical Stability Theory. Transactions of Mathematical Monographs, Vol. 74, Providence, RI: Amer. Math. Soc. 1989

Communicated by P. Constantin

Commun. Math. Phys. 292, 811–827 (2009) Digital Object Identifier (DOI) 10.1007/s00220-009-0852-y



The Navier-Stokes Equations in the Critical Lebesgue Space

Hongjie Dong (1), Dapeng Du (2)

1 Division of Applied Mathematics, Brown University, 182 George Street, Box F, Providence, RI 02912, USA. E-mail: [email protected]
2 School of Mathematical Sciences, Fudan University, Shanghai 200433, P. R. China. E-mail: [email protected]

Received: 5 February 2009 / Accepted: 23 April 2009
Published online: 7 July 2009 – © Springer-Verlag 2009

Abstract: We study regularity criteria for the d-dimensional incompressible Navier-Stokes equations. We prove in this paper that if $u \in L^\infty_tL^d_x((0,T)\times\mathbb{R}^d)$ is a Leray-Hopf weak solution, then $u$ is smooth and unique in $(0,T)\times\mathbb{R}^d$. This generalizes a result by Escauriaza, Seregin and Šverák [5]. Additionally, we show that if $T = \infty$ then $u$ goes to zero as $t$ goes to infinity.

1. Introduction

In this paper we consider the incompressible Navier-Stokes equations in $d$ spatial dimensions with unit viscosity and zero external force:

$$\partial_tu + u\cdot\nabla u - \Delta u + \nabla p = 0, \quad \operatorname{div}u = 0 \tag{1.1}$$

for $x \in \mathbb{R}^d$ and $t \ge 0$ with the initial condition

$$u(0,x) = a(x), \quad x \in \mathbb{R}^d. \tag{1.2}$$

Here $u$ is the velocity and $p$ is the pressure. For sufficiently regular data $a$, the local strong solvability of such problems is well known (see, for example, [9,13,33 and 14]). The solution is unique and locally smooth in both spatial and time variables. On the other hand, the global in time strong solvability is an outstanding open problem for $d \ge 3$.

Hongjie Dong was partially supported by the National Science Foundation under agreement No. DMS-0111298 and DMS-0800129. Dapeng Du was partially supported by China Postdoctor Science Fund CPSF 20070410683.

Another important type of solutions are called Leray-Hopf weak solutions (see Sect. 2.1 for the notation and definition). In the pioneering works of Leray [20] and Hopf [12], it is shown that for any divergence-free vector field $a \in L^2$, there exists at least one Leray-Hopf weak solution of the Cauchy problem (1.1)-(1.2) on $(0,\infty)\times\mathbb{R}^d$. Although the problems of uniqueness and regularity of Leray-Hopf weak solutions are still open, since the seminal work of Leray there is an extensive literature on conditional results under various criteria. The most well-known condition is the so-called Ladyzhenskaya-Prodi-Serrin condition, that is for some $T > 0$,

$$u \in L^r_tL^q_x(\mathbb{R}^{d+1}_T), \tag{1.3}$$

where the pair $(r,q)$ satisfies

$$\frac{2}{r} + \frac{d}{q} \le 1, \quad q \in (d,\infty].$$

Under the condition (1.3), the uniqueness of Leray-Hopf weak solutions was proved by Prodi [22] and Serrin [28], and the smoothness was obtained by Ladyzhenskaya [15]. For further results, we refer the reader to [8,30,31] and the recent preprint [2], and references therein. The borderline case $(r,q) = (\infty,d)$ is much more subtle since the result cannot be proved by the usual methods using the local smallness of certain norms of $u$ which are invariant under the natural scaling

$$u(t,x) \to \lambda u(\lambda^2t, \lambda x), \quad p(t,x) \to \lambda^2p(\lambda^2t, \lambda x). \tag{1.4}$$
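To make the role of the borderline exponents concrete, the following short computation (a sketch, with $u_\lambda(t,x) := \lambda u(\lambda^2t,\lambda x)$ introduced here only as notation for the rescaled field in (1.4)) shows how the mixed norms behave under the scaling and why $(r,q) = (\infty,d)$ is exactly scale invariant.

```latex
\begin{align*}
\|u_\lambda(t,\cdot)\|_{L^q_x}
  &= \lambda\Bigl(\int_{\mathbb{R}^d}|u(\lambda^2 t,\lambda x)|^q\,dx\Bigr)^{1/q}
   = \lambda^{1-d/q}\,\|u(\lambda^2 t,\cdot)\|_{L^q_x}
   \qquad (y=\lambda x,\ dy=\lambda^d\,dx),\\
\|u_\lambda\|_{L^r_tL^q_x((0,T/\lambda^2)\times\mathbb{R}^d)}
  &= \lambda^{1-d/q}\Bigl(\int_0^{T/\lambda^2}\|u(\lambda^2 t,\cdot)\|_{L^q_x}^r\,dt\Bigr)^{1/r}
   = \lambda^{1-\frac{2}{r}-\frac{d}{q}}\,\|u\|_{L^r_tL^q_x((0,T)\times\mathbb{R}^d)}.
\end{align*}
```

The norm is therefore invariant precisely when $2/r + d/q = 1$. For $q > d$ this forces $r < \infty$, so the norm over a short time interval can be made small, whereas for $(r,q) = (\infty,d)$ the time supremum gives no such smallness.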

For $d = 3$, this case was studied recently by Escauriaza, Seregin and Šverák in a remarkable paper [5]. The main result of [5] is the following theorem.

Theorem 1.1 (Escauriaza, Seregin and Šverák). Let $d = 3$. Suppose that $u$ is a Leray-Hopf weak solution of the Cauchy problem (1.1)-(1.2) in $(0,T)\times\mathbb{R}^3$ and $u$ satisfies the condition (1.3) with $(r,q) = (\infty,3)$. Then $u \in L^5((0,T)\times\mathbb{R}^3)$, and hence it is smooth and unique in $(0,T)\times\mathbb{R}^3$.

Before we give a description of Theorem 1.1, we shall recall another important concept, the partial regularity of weak solutions. The study of partial regularity of the Navier-Stokes equations was originated by Scheffer in a series of papers [23–25]. In three space dimensions, he established various partial regularity results for weak solutions satisfying the so-called local energy inequality. For $d = 3$, the notion of suitable weak solutions was introduced in a celebrated paper [1] by Caffarelli, Kohn and Nirenberg. They called a pair $(u,p)$ a suitable weak solution if $u$ has finite energy norm, $p$ belongs to the Lebesgue space $L^{5/4}$, $u$ and $p$ are weak solutions to the Navier-Stokes equations and satisfy a local energy inequality. It is proved that, for any suitable weak solution $(u,p)$, there is an open subset in which the velocity field $u$ is Hölder continuous, and the complement of it has zero 1-D Hausdorff measure. In [21], with zero external force, Lin gave a more direct and sketched proof of Caffarelli, Kohn and Nirenberg's result. A detailed treatment was then later given by Ladyzhenskaya and Seregin in [18]. For other results in this direction, we refer the reader to [3,11,31] and references therein.

The proofs in [5] are highly nontrivial and rely on certain regularity criteria in the light of [1,21 and 18]. That is, roughly speaking, if some scaling invariant quantities are small then the solution is locally regular.
Another main ingredient of the proof is a backward uniqueness theorem of heat equations with bounded coefficients of lower order terms in the half space (see also [6]). Under an additional assumption on the pressure, there are some extensions of Theorem 1.1 to the half space case and the bounded domain case; we refer the reader to [26] and [19] for some results in this direction. Another interesting open problem is the extension to the higher dimensional Navier-Stokes equations. It seems to us that the argument in [5] breaks down in several places when $d \ge 4$. In particular, the regularity criterion, Theorem 2.2 [5], is unknown for the higher dimensional Navier-Stokes equations.

We now state the main results of the article.

Theorem 1.2. Let $d \ge 3$, $K > 0$ and $T \in (0,\infty)$. Suppose that $u$ is a Leray-Hopf weak solution of the Cauchy problem (1.1)-(1.2) in $(0,T)\times\mathbb{R}^d$ and $u$ satisfies the condition

$$u \in L^\infty_tL^d_x((0,T)\times\mathbb{R}^d), \quad \|u\|_{L^\infty_tL^d_x((0,T)\times\mathbb{R}^d)} \le K. \tag{1.5}$$

Then $u \in L^{d+2}((0,T)\times\mathbb{R}^d)$, and hence it is smooth and unique in $(0,T)\times\mathbb{R}^d$.

Theorem 1.3. Let $d \ge 3$ and $K > 0$. Suppose that $u$ is a Leray-Hopf weak solution of the Cauchy problem (1.1)-(1.2) in $(0,\infty)\times\mathbb{R}^d$ and $u$ satisfies the condition

$$u \in L^\infty_tL^d_x((0,\infty)\times\mathbb{R}^d), \quad \|u\|_{L^\infty_tL^d_x((0,\infty)\times\mathbb{R}^d)} \le K. \tag{1.6}$$

Then $u$ is smooth and unique in $(0,\infty)\times\mathbb{R}^d$. Moreover, we have

$$\lim_{t\to\infty}\|u(t,\cdot)\|_{L^\infty} = 0. \tag{1.7}$$

We give a brief description of our argument. As in [5] we prove by contradiction and blow up the solution near a singular point at the first blow-up time to obtain a sequence of solutions $\{u_k\}$. The limiting function $u_\infty$ of this sequence is a suitable weak solution of the Navier-Stokes equations. Note that the solutions $u_k$ are smooth before the first blow-up time. As we mentioned before, we are not able to establish a regularity criterion similar to Theorem 2.2 [5], which says if certain scaling invariant quantities are small then the solution is locally Hölder continuous. Instead we use a modified one. Roughly speaking, we show that if the solutions are smooth, the $L^\infty_tL^d_x$ norm is bounded and some scaling invariant quantities are small, then we have an a priori $L^\infty$ bound for the solutions on a much smaller ball. Here the point is that the a priori $L^\infty$ bound only depends on the $L^\infty_tL^d_x$ norm and the dimension. This regularity criterion together with the $L^p$-convergence of $u_k$ yields the local boundedness of $u_\infty$ outside a large cylinder. The local boundedness implies the local smoothness of $u_\infty$. Then we use the backward uniqueness proved in [5] to see that $u_\infty$ is equivalent to zero outside a large cylinder, which further implies that $u_\infty \equiv 0$ by using the spatial analyticity of strong solutions and the weak-strong uniqueness of the Navier-Stokes equations. This means the sequence $u_k$ converges to zero in $L^p$ on any compact set. Going back to the original solution $u$ we see that the modified regularity criterion applies, which gives a contradiction and proves Theorem 1.2.

To prove Theorem 1.3, we notice that $u$ is in $L^4((0,\infty)\times\mathbb{R}^d)$, which implies the smallness of its $L^4$ norm in $(T,\infty)\times\mathbb{R}^d$ for large $T$. Then we use the modified regularity criterion again and the scaling (1.4). We remark that a decay result similar to that of Theorem 1.3 was obtained in [7] by using a completely different method.

The remaining part of the article is organized as follows. We give a few definitions and prove several preliminary results in the next section. In Sect. 3, we prove a key estimate (Proposition 3.1) about the scaling invariant quantities and construct a sequence of solutions by blowing up the solution at a singular point. Section 4 is devoted to the proof of a local boundedness estimate (Theorem 4.1). We finish the proof of Theorems 1.2 and 1.3 in Sect. 5.


2. Preliminaries

We make a few preparations in this section. We use the notation in [18]. Let $\omega$ be a domain in some finite-dimensional space. Denote $L^p(\omega;\mathbb{R}^n)$ and $W^k_p(\omega;\mathbb{R}^n)$ to be the usual Lebesgue and Sobolev spaces of functions from $\omega$ into $\mathbb{R}^n$. Denote the norm of the spaces $L^p(\omega;\mathbb{R}^n)$ and $W^k_p(\omega;\mathbb{R}^n)$ by $\|\cdot\|_{L^p(\omega)}$ and $\|\cdot\|_{W^k_p(\omega)}$ respectively. As usual, for any measurable function $u = u(x,t)$ and any $p, q \in [1,+\infty]$, we define

$$\|u(x,t)\|_{L^p_tL^q_x} := \Big\|\,\|u(x,t)\|_{L^q_x}\Big\|_{L^p_t}.$$

For summable functions $p$, $u = (u_i)$ and $\tau = (\tau_{ij})$, we use the following differential operators:

$$\partial_tu = u_t = \frac{\partial u}{\partial t}, \quad u_{,i} = \frac{\partial u}{\partial x_i}, \quad \nabla p = (p_{,i}), \quad \nabla u = (u_{i,j}),$$

$$\operatorname{div}u = u_{i,i}, \quad \operatorname{div}\tau = (\tau_{ij,j}), \quad \Delta u = \operatorname{div}\nabla u,$$

which are understood in the sense of distributions. We use the notation of spheres, balls and parabolic cylinders,

$$S(x_0,r) = \{x \in \mathbb{R}^d \mid |x-x_0| = r\}, \quad S(r) = S(0,r), \quad S = S(1);$$

$$B(x_0,r) = \{x \in \mathbb{R}^d \mid |x-x_0| < r\}, \quad B(r) = B(0,r), \quad B = B(1);$$

$$Q(z_0,r) = B(x_0,r)\times(t_0-r^2,t_0), \quad Q(r) = Q(0,r), \quad Q = Q(1).$$

Also we denote mean values of summable functions as follows:

$$[u]_{x_0,r}(t) = \frac{1}{|B(r)|}\int_{B(x_0,r)}u(x,t)\,dx, \quad (u)_{z_0,r} = \frac{1}{|Q(r)|}\int_{Q(z_0,r)}u\,dz.$$

We recall the following well-known interpolation inequality.

Lemma 2.1. For any functions $u \in W^1_2(\mathbb{R}^d)$ and any $q \in [2, 2d/(d-2)]$ and $r > 0$,

$$\int_{B_r}|u|^q\,dx \le N(q)\left[\left(\int_{B_r}|\nabla u|^2\,dx\right)^{d(q/4-1/2)}\left(\int_{B_r}|u|^2\,dx\right)^{q/2-d(q/4-1/2)} + r^{-d(q-2)/2}\left(\int_{B_r}|u|^2\,dx\right)^{q/2}\right]. \tag{2.1}$$
2.1. Leray-Hopf weak solutions. We denote $\dot C_0^\infty$ the space of all divergence-free infinitely differentiable vector fields with compact support in $\mathbb{R}^d$. Let $\dot J$ and $\dot J^1_2$ be the closure of $\dot C_0^\infty$ in the spaces $L^2$ and $W^1_2$, respectively. For any $T \in (0,\infty]$, denote

$$\mathbb{R}^{d+1}_T = (0,T)\times\mathbb{R}^d.$$

By a Leray-Hopf weak solution of (1.1)-(1.2) in $\mathbb{R}^{d+1}_T$, we mean a vector field $u$ such that:

i) $u \in L^\infty(0,T;\dot J) \cap L^2(0,T;\dot J^1_2)$;
ii) The function $t \mapsto \int_{\mathbb{R}^d}u(t,x)\cdot w(x)\,dx$ is continuous on $[0,T]$ for any $w \in L^2$;
iii) Equation (1.1) holds weakly in the sense that for any $w \in \dot C_0^\infty(\mathbb{R}^{d+1}_T)$,

$$\int_{\mathbb{R}^{d+1}_T}(-u\cdot\partial_tw - u\otimes u:\nabla w + \nabla u:\nabla w)\,dx\,dt = 0; \tag{2.2}$$

iv) The energy inequality:

$$\frac12\int_{\mathbb{R}^d}|u(t,x)|^2\,dx + \int_{\mathbb{R}^{d+1}_t}|\nabla u|^2\,dx\,ds \le \frac12\int_{\mathbb{R}^d}|a(x)|^2\,dx$$

holds for any $t \in [0,T]$, and we have $\|u(t,\cdot) - a(\cdot)\|_{L^2} \to 0$ as $t \to 0$.

It is well known that for any $a \in \dot J$, there exists at least one Leray-Hopf weak solution of the Cauchy problem (1.1)-(1.2) on $(0,\infty)\times\mathbb{R}^d$ (see [20 and 12]).

2.2. Suitable weak solutions. The definition of suitable weak solutions was introduced in [1] (see also [21 and 18]). Let $\omega$ be an open set in $\mathbb{R}^d$. By a suitable weak solution of the Navier-Stokes equations on the set $(0,T)\times\omega$, we mean a pair $(u,p)$ such that

i) $u \in L^\infty(0,T;\dot J) \cap L^2(0,T;\dot J^1_2)$ and $p \in L^{d/2}((0,T)\times\omega)$;
ii) $u$ and $p$ satisfy Eq. (1.1) in the sense of distributions (2.2);
iii) For any $t \in (0,T)$ and for any nonnegative function $\psi \in C_0^\infty(\mathbb{R}^d)$ vanishing in a neighborhood of the parabolic boundary $\{t=0\}\times\omega \cup [0,T]\times\partial\omega$, we have the local energy inequality

$$\operatorname*{ess\,sup}_{0<s\le t}\int_\omega|u(s,x)|^2\psi(s,x)\,dx + 2\int_{(0,t)\times\omega}|\nabla u|^2\psi\,dx\,ds \le \int_{(0,t)\times\omega}\{|u|^2(\psi_t + \Delta\psi) + (|u|^2 + 2p)\,u\cdot\nabla\psi\}\,dx\,ds. \tag{2.3}$$

2.3. Scaling invariant quantities. The following notation will be used throughout the article:

$$A(r) = A(r,z_0) = \operatorname*{ess\,sup}_{t_0-r^2\le t\le t_0}\frac{1}{r^{d-2}}\int_{B(x_0,r)}|u(x,t)|^2\,dx,$$

$$E(r) = E(r,z_0) = \frac{1}{r^{d-2}}\int_{Q(z_0,r)}|\nabla u|^2\,dz,$$

$$C(r) = C(r,z_0) = \frac{1}{r^{d-4/(d+1)}}\int_{Q(z_0,r)}|u|^{2(d+3)/(d+1)}\,dz,$$

$$D(r) = D(r,z_0) = \frac{1}{r^{d-4/(d+1)}}\int_{Q(z_0,r)}|p|^{(d+3)/(d+1)}\,dz.$$

We notice that these quantities are all invariant under the natural scaling (1.4). We shall use the following two lemmas involving these quantities.
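The invariance claim can be verified by exponent counting. For a quantity of the form $F(r) = r^{-a}\int|f|^m$, where $f$ rescales as $f_\lambda(t,x) = \lambda^wf(\lambda^2t,\lambda x)$ (so $w = 1$ for $u$ and $w = 2$ for $\nabla u$ and $p$) and the integral runs over a ball (dimension $d$) or a parabolic cylinder (dimension $d+2$), a change of variables gives $F_\lambda(r) = \lambda^{wm - D + a}F(\lambda r)$ with $D$ the domain dimension. The sketch below, with hypothetical helper names, checks that this exponent vanishes for $A$, $E$, $C$ and $D$ using exact rational arithmetic.

```python
from fractions import Fraction


def scaling_defect(integrand_power, field_weight, radius_power, domain_dim):
    """Exponent e in F_lambda(r) = lambda**e * F(lambda * r).

    integrand_power: m in |f|^m
    field_weight:    w in f_lambda = lambda^w f(lambda^2 t, lambda x)
    radius_power:    a in the prefactor r^(-a)
    domain_dim:      d for a ball, d + 2 for a parabolic cylinder
    """
    return field_weight * integrand_power - domain_dim + radius_power


def quantity_defects(d):
    """Scaling exponents of A, E, C, D from Sect. 2.3 in dimension d."""
    m = Fraction(2 * (d + 3), d + 1)   # power of |u| in C(r)
    a = d - Fraction(4, d + 1)         # power of r in C(r) and D(r)
    return {
        "A": scaling_defect(2, 1, d - 2, d),
        "E": scaling_defect(2, 2, d - 2, d + 2),
        "C": scaling_defect(m, 1, a, d + 2),
        "D": scaling_defect(m / 2, 2, a, d + 2),
    }
```

A zero exponent for every entry of `quantity_defects(d)` is exactly the statement that the four quantities are invariant under (1.4).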


Lemma 2.2. Let $\rho > 0$, $\varepsilon > 0$ be constants and a pair $(u,p)$ be a suitable weak solution of (1.1). Suppose $Q(z_0,\rho) \subset \mathbb{R}^{d+1}_T$ and $C(\rho) + D(\rho) \le \varepsilon^{2(d+3)/(d+1)}$. Then under the condition (1.5), we have $A(\rho/2) + E(\rho/2) \le N\varepsilon^2$.

Proof. By a scaling argument, we may assume without loss of generality that $\rho = 1$. In the energy inequality (2.3), we put $t = t_0$ and choose a suitable smooth cut-off function $\psi$ such that

$$\psi \equiv 0 \text{ in } \mathbb{R}^{d+1}_{t_0}\setminus Q(z_0,1), \quad 0 \le \psi \le 1 \text{ in } \mathbb{R}^{d+1}_T,$$

$$\psi \equiv 1 \text{ in } Q(z_0,1/2), \quad |\nabla\psi| < N, \quad |\partial_t\psi| + |\nabla^2\psi| < N \text{ in } \mathbb{R}^{d+1}_{t_0}.$$

By using (2.3), we get

$$A(1/2) + 2E(1/2) \le N\int_{Q(z_0,1)}|u|^2\,dz + N\int_{Q(z_0,1)}(|u|^2 + 2|p|)|u|\,dz.$$

Due to Hölder's inequality, one can obtain

$$\int_{Q(z_0,1)}|u|^2\,dz \le N(C(1))^{(d+1)/(d+3)} \le N\varepsilon^2,$$

and

$$\begin{aligned}
\int_{Q(z_0,1)}(|u|^2 + 2|p|)|u|\,dz
&\le N\left(\int_{Q(z_0,1)}|u|^{\frac{d+3}{2}}\,dz\right)^{\frac{2}{d+3}}\left(\int_{Q(z_0,1)}|u|^{\frac{2(d+3)}{d+1}} + |p|^{\frac{d+3}{d+1}}\,dz\right)^{\frac{d+1}{d+3}}\\
&\le N\left(\int_{Q(z_0,1)}|u|^d\,dz\right)^{\frac1d}(C(1) + D(1))^{(d+1)/(d+3)}\\
&\le N\varepsilon^2,
\end{aligned}$$

where in the last inequality we used (1.5). The conclusion of Lemma 2.2 follows immediately.

Lemma 2.3. Suppose $\gamma \in (0,1/2]$, $\rho > 0$ are constants and $Q(z_0,\rho) \subset \mathbb{R}^{d+1}_T$. Then we have

$$D(\gamma\rho) \le N\left[\gamma^{-d+4/(d+1)}C(\rho) + \gamma^{4/(d+1)}D(\rho)\right]. \tag{2.4}$$

Proof. Let $\eta(x)$ be a smooth function on $\mathbb{R}^d$ supported in the unit ball $B(1)$, $0 \le \eta \le 1$ and $\eta \equiv 1$ on $\bar B(2/3)$. It is known that for a.e. $t \in (t_0-\rho^2,t_0)$, in the sense of distribution, one has

$$\Delta p = D_{ij}(u_iu_j). \tag{2.5}$$



For these $t$, we consider the decomposition $p = p_{x_0,\rho} + h_{x_0,\rho}$ in $B(x_0,\rho)$, where $p_{x_0,\rho}$ is the Newtonian potential of

$$D_{ij}\big(u_iu_j\,\eta((x-x_0)/\rho)\big).$$

Then $h_{x_0,\rho}$ is harmonic in $B(x_0,2\rho/3)$. Denote $r = \gamma\rho$. By using the Calderón-Zygmund estimate, one has

$$\int_{Q(z_0,r)}|p_{x_0,\rho}(x,t)|^{(d+3)/(d+1)}\,dz \le \int_{Q(z_0,\rho)}|p_{x_0,\rho}(x,t)|^{(d+3)/(d+1)}\,dz \le N\int_{Q(z_0,\rho)}|u|^{2(d+3)/(d+1)}\,dz. \tag{2.6}$$

Since $h_{x_0,\rho}$ is harmonic in $B(x_0,2\rho/3)$, any Sobolev norm of $h_{x_0,\rho}$ in a smaller ball can be estimated by any of its $L^p$ norm in $B(x_0,2\rho/3)$. Thus, one obtains

$$\int_{B(x_0,r)}|h_{x_0,\rho}|^{(d+3)/(d+1)}\,dx \le Nr^d\sup_{B(x_0,r)}|h_{x_0,\rho}|^{(d+3)/(d+1)} \le Nr^d\rho^{-d}\int_{B(x_0,\rho)}|h_{x_0,\rho}|^{(d+3)/(d+1)}\,dx. \tag{2.7}$$

Integrating (2.7) in $t \in (t_0-r^2,t_0)$, we obtain

$$\begin{aligned}
\int_{Q(z_0,r)}|h_{x_0,\rho}|^{(d+3)/(d+1)}\,dz
&\le Nr^d\rho^{-d}\int_{Q(z_0,\rho)}|h_{x_0,\rho}|^{(d+3)/(d+1)}\,dz\\
&\le Nr^d\rho^{-d}\int_{Q(z_0,\rho)}|p|^{(d+3)/(d+1)} + |p_{x_0,\rho}|^{(d+3)/(d+1)}\,dz\\
&\le Nr^d\rho^{-d}\int_{Q(z_0,\rho)}|p|^{(d+3)/(d+1)}\,dz + N\int_{Q(z_0,\rho)}|u|^{2(d+3)/(d+1)}\,dz,
\end{aligned} \tag{2.8}$$

where we used (2.6) in the last inequality. By combining (2.6) and (2.8) we reach (2.4). The lemma is proved.

2.4. Strong solutions and spatial analyticity. We recall the following local strong solvability of (1.1)-(1.2) (see, for example, [9,13,33 and 14]), and the spatial analyticity of strong solutions (see, for example, [10 and 4]).

Proposition 2.4. For any divergence-free initial data $a \in L^p(\mathbb{R}^d)$, $p \ge d$, the Cauchy problem (1.1)-(1.2) has a unique strong solution $u \in C([0,\delta);L^p(\mathbb{R}^d))$ for some $\delta > 0$. Moreover, $u$ is infinitely differentiable and spatial analytic for $t \in (0,\delta)$.


3. A Blowup Procedure

We begin this section by proving the following key estimate, which shows that if the quantities $C$ and $D$ are sufficiently small in a cylinder, then they must also be small in any sub-cylinder.

Proposition 3.1. Let $(u,p)$ be a suitable weak solution of (1.1). Suppose that $Q(z_0,\rho) \subset \mathbb{R}^{d+1}_T$ and the condition (1.5) holds. Then for any $\varepsilon_0 > 0$ there exists an $\varepsilon_* > 0$ depending only on $\varepsilon_0$ and $d$ such that if $C(\rho,z_0) + D(\rho,z_0) \le \varepsilon_*$, then we have $C(r,z_1) + D(r,z_1) \le \varepsilon_0$ for any $z_1 \in Q(z_0,\rho/2)$ and $r \in (0,\rho/2)$.

Proposition 3.1 follows immediately from the next lemma by using a covering argument and an iteration.

Lemma 3.2. Let $(u,p)$ be a suitable weak solution of (1.1). Suppose that $Q(z_0,\rho) \subset \mathbb{R}^{d+1}_T$ and condition (1.5) holds. Then there exist universal constants $\varepsilon_* > 0$ and $\gamma \in (0,1/4]$ such that for any $\varepsilon \in (0,\varepsilon_*]$, if $C(\rho,z_0) + D(\rho,z_0) \le \varepsilon$, then we have $C(\gamma\rho,z_0) + D(\gamma\rho,z_0) \le \varepsilon$.

Proof. As before, one may assume $\rho = 1$. We argue by contradiction. Let $\gamma \in (0,1/4]$ be a constant to be specified later. Suppose there exist a decreasing sequence $\{\varepsilon_k\}$ converging to 0, and a sequence of pairs of suitable weak solutions $(u_k,p_k)$ such that
\[ C(1,z_0,u_k,p_k) + D(1,z_0,u_k,p_k) \le \varepsilon_k^{2(d+3)/(d+1)}, \tag{3.1} \]
\[ C(\gamma,z_0,u_k,p_k) + D(\gamma,z_0,u_k,p_k) > \varepsilon_k^{2(d+3)/(d+1)}. \tag{3.2} \]

By Lemma 2.2, one also has
\[ A(1/2,z_0,u_k,p_k) + B(1/2,z_0,u_k,p_k) \le N\varepsilon_k^2, \tag{3.3} \]
where the constant $N$ is independent of $k$. We define $(v_k,q_k) = (u_k/\varepsilon_k,\, p_k/\varepsilon_k)$. Then $(v_k,q_k)$ is a suitable weak solution of
\[ \partial_t v_k + \varepsilon_k v_k\cdot\nabla v_k - \Delta v_k + \nabla q_k = 0, \quad \operatorname{div} v_k = 0. \tag{3.4} \]
From (3.1), (3.2) and (3.3), we get
\[ C(1,z_0,v_k,q_k) + D(1,z_0,v_k,q_k) \le 1, \tag{3.5} \]
\[ C(\gamma,z_0,v_k,q_k) + D(\gamma,z_0,v_k,q_k) > 1, \tag{3.6} \]
\[ A(1/2,z_0,v_k,q_k) + B(1/2,z_0,v_k,q_k) \le N. \tag{3.7} \]



By using (3.7), applying the interpolation inequality (2.1) with $q = 2(d+2)/d$ and integrating in $t$, we bound $\|v_k\|_{L_{2(d+2)/d}(Q(z_0,1/2))}$ by $N$. Thus by Hölder's inequality, $\|v_k\cdot\nabla v_k\|_{L_{(d+2)/(d+1)}(Q(z_0,1/2))} \le N$. Due to the coercive estimate for the Stokes system (see, for instance, [29]) with a suitable cut-off function, we reach
\[
\int_{Q(z_0,1/3)} \Big( |v_k|^{\frac{2(d+2)}{d}} + |\partial_t v_k|^{\frac{d+2}{d+1}} + |D^2 v_k|^{\frac{d+2}{d+1}} + |\nabla q_k|^{\frac{d+2}{d+1}} \Big)\,dz \le N,
\]

where the constant $N$ is independent of $k$. Thanks to the compact embedding theorem and (3.5), there exist $v \in L_{2(d+3)/(d+1)}(Q(z_0,1/3))$, $q \in L_{(d+3)/(d+1)}(Q(z_0,1/3))$, and a subsequence, which is still denoted by $(v_k,q_k)$, such that
\[ v_k \to v \ \text{in } L_{2(d+3)/(d+1)}(Q(z_0,1/3)), \quad q_k \rightharpoonup q \ \text{in } L_{(d+3)/(d+1)}(Q(z_0,1/3)). \tag{3.8} \]
This together with (3.4) implies
\[ \partial_t v - \Delta v + \nabla q = 0, \quad \operatorname{div} v = 0. \tag{3.9} \]
Moreover, $\|v\|_{L_{2(d+3)/(d+1)}(Q(z_0,1/3))} + \|q\|_{L_{(d+3)/(d+1)}(Q(z_0,1/3))} \le N$. By the classical estimate of the Stokes system, one has
\[ \sup_{Q(z_0,1/4)} |v| \le N, \]

which gives $C(\gamma,z_0,v,q) \le N\gamma^{4/(d+1)}$. This contradicts (3.6) and (3.8), if we choose $\gamma$ sufficiently small. The lemma is proved.

Lemma 3.3. Under the assumptions of Theorem 1.2, we have
\[ \|u(t,\cdot)\|_{L_d(\mathbb{R}^d)} \le N \tag{3.10} \]
for each $t \in [0,T]$, and
\[ u \in L_4(\mathbb{R}^{d+1}_T), \quad \partial_t u,\ D^2 u,\ \nabla p \in L_{4/3}((\delta,T)\times\mathbb{R}^d) \tag{3.11} \]
for any $\delta \in (0,T)$. Moreover, $(u,p)$ is a suitable weak solution of (1.1) in $\mathbb{R}^{d+1}_T$.


Proof. The first assertion is due to (1.5) and the weak continuity of Leray-Hopf weak solutions. By using Lemma 2.1 with $q = 2d/(d-2)$ and $r = \infty$, we have
\[ \|u\|_{L^t_2 L^x_{2d/(d-2)}(\mathbb{R}^{d+1}_T)} \le N, \]
which together with (3.10) and Hölder's inequality yields
\[ \|u\|_{L_4(\mathbb{R}^{d+1}_T)} \le N, \quad \|u\cdot\nabla u\|_{L_{4/3}(\mathbb{R}^{d+1}_T)} \le N. \]
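The exponents $4$ and $4/3$ in the last display can be checked by a short Hölder computation. The following is only a sketch, assuming $d \ge 3$ and the Leray-Hopf energy bound $\nabla u \in L_2(\mathbb{R}^{d+1}_T)$, which is not restated in this section:

```latex
% Interpolate with weight 1/2 between L^t_\infty L^x_d (from (3.10))
% and L^t_2 L^x_{2d/(d-2)} (from the previous display).
\[
\frac14 = \frac{1/2}{\infty} + \frac{1/2}{2},
\qquad
\frac14 = \frac{1/2}{d} + \frac{1/2}{2d/(d-2)} = \frac{1}{2d} + \frac{d-2}{4d},
\]
\[
\|u\|_{L_4(\mathbb{R}^{d+1}_T)}
\le \|u\|^{1/2}_{L^t_\infty L^x_d(\mathbb{R}^{d+1}_T)}\,
    \|u\|^{1/2}_{L^t_2 L^x_{2d/(d-2)}(\mathbb{R}^{d+1}_T)} .
\]
% For the nonlinear term, 3/4 = 1/4 + 1/2, so by H\"older's inequality
\[
\|u\cdot\nabla u\|_{L_{4/3}(\mathbb{R}^{d+1}_T)}
\le \|u\|_{L_4(\mathbb{R}^{d+1}_T)}\,\|\nabla u\|_{L_2(\mathbb{R}^{d+1}_T)} .
\]
```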

Thus, (3.11) follows from the coercive estimate for the Stokes system. Finally, due to the pressure equation (2.5) and the Calderón-Zygmund estimate, $p \in L^t_\infty L^x_{d/2}(\mathbb{R}^{d+1}_T)$. Therefore, it is clear that $(u,p)$ is a suitable weak solution. The lemma is proved.

Remark 3.4. From (3.11), one can infer that $u \in C((0,T]; L_{4/3}(B_R))$ for any $R > 0$. This, combined with (3.10) and Hölder's inequality, gives $u \in C((0,T]; L_p(B_R))$ for any $p \in [1,d)$.

Because of the local strong solvability and the weak-strong uniqueness (see, for instance, [32]), we know that $u$ is regular for $t \in (0,T_0)$ for some $T_0 \in (0,T]$. Suppose $T_0$ is the first blowup time of $u$, and $Z_0 = (T_0,X_0)$ is a singular point. Take a decreasing sequence $\{\lambda_k\}$ converging to 0. We rescale the pair $(u,p)$ at time $T_0$ and define
\[ u_k(t,x) = \lambda_k u(T_0 + \lambda_k^2 t,\, X_0 + \lambda_k x), \quad p_k(t,x) = \lambda_k^2 p(T_0 + \lambda_k^2 t,\, X_0 + \lambda_k x). \]
Then for each $k = 1, 2, \ldots$, $(u_k,p_k)$ is a suitable weak solution of (1.1) and $u_k$ is smooth for $t \in (-\lambda_k^{-2}T_0, 0)$. We finish this section by constructing a limiting solution.

Proposition 3.5. i) There is a subsequence of $\{(u_k,p_k)\}$, which is still denoted by $\{(u_k,p_k)\}$, such that
\[ u_k \to u_\infty \ \text{in } C([t_0-1,t_0]; L_{q_1}(B(x_0,1))), \tag{3.12} \]
\[ p_k \rightharpoonup p_\infty \ \text{in } L^t_{q_2} L^x_{d/2}(Q(z_0,1)) \tag{3.13} \]
for any $z_0 \in (-\infty,0]\times\mathbb{R}^d$, $q_1 \in [1,d)$ and $q_2 \in [1,\infty)$.

ii) Furthermore, $(u_\infty,p_\infty)$ is a suitable weak solution of (1.1) in $(-\infty,0)\times\mathbb{R}^d$, and
\[ u_\infty \in L^t_{q_2} L^x_d((-T_1,0)\times\mathbb{R}^d), \quad p_\infty \in L^t_{q_2} L^x_{d/2}((-T_1,0)\times\mathbb{R}^d) \]
for any $T_1 > 0$ and $q_2 \in [1,\infty)$.

Proof. First we fix a $z_0 \in (-\infty,0]\times\mathbb{R}^d$. Since $p_k$, $k = 1, 2, \ldots$, have a uniform bound of the $L^t_\infty L^x_{d/2}((t_0-1,t_0)\times\mathbb{R}^d)$ norm, and consequently a uniform bound of their $L^t_{q_2} L^x_{d/2}(Q(z_0,1))$ norms, there is a subsequence, which is still denoted by $\{p_k\}$, such that (3.13) holds. Similarly,
\[ \|u_k\|_{L^t_\infty L^x_d(Q(z_0,3))} \le \|u_k\|_{L^t_\infty L^x_d((t_0-9,t_0)\times\mathbb{R}^d)} \le N, \]
where $N$ is independent of $k$. By Lemma 2.2, we have
\[ A(2,z_0,u_k,p_k) + B(2,z_0,u_k,p_k) \le N. \tag{3.14} \]



Now following the proof of Lemma 3.3, we deduce $u_k \in L_4(Q(z_0,3/2))$, $\partial_t u_k, D^2 u_k, \nabla p_k \in L_{4/3}(Q(z_0,3/2))$ with uniform norms. Therefore, we can find a subsequence, still denoted by $\{u_k\}$, such that $u_k \to u_\infty$ in $C([t_0-1,t_0]; L_{4/3}(B(x_0,1)))$. This together with (3.14) gives (3.12) by using Hölder's inequality. To finish the proof of Part i), it suffices to use a Cauchy diagonal argument. Part ii) then follows from Part i) and (3.14).

4. Schoen's Trick

The objective of this section is to establish the following regularity criterion.

Theorem 4.1. Suppose $u$ is a regular solution of (1.1) in $Q(z_0,\rho_1)$. Then for any $K > 0$ there exists an $\varepsilon_1 = \varepsilon_1(d,K) > 0$ such that the following is true. If for any $z_1 \in Q(z_0,\rho_1/2)$ and $\rho \in (0,\rho_1/2)$ we have
\[ C(\rho,z_1) \le \varepsilon_1, \quad \|p\|_{L^t_\infty L^x_{d/2}(Q(z_1,\rho))} \le K, \tag{4.1} \]
then
\[ \sup_{Q(z_0,\rho_1/4)} |u(z)| < N(\rho_1,d). \]

Proof. We prove the theorem by using Schoen's trick. Let $\delta \in (0,\rho_1^2/4)$ be a number and denote
\[ d(z) = (t_0 + \rho_1^2/4 - t)^{1/2}, \quad M_\delta = \max_{\bar{Q}(z_0,\rho_1/2)\cap\{t \le t_0-\delta\}} d(z)|u(z)|. \]
If for all $\delta \in (0,\rho_1^2/4)$ we have $M_\delta \le 2$, then there is nothing to prove. Otherwise, suppose for some $\delta$ and $z_1 = (t_1,x_1) \in \bar{Q}(z_0,\rho_1/2)\cap\{t \le t_0-\delta\}$,
\[ M := M_\delta = |u(z_1)|\,d(z_1) > 2. \]
Let $r_1 = d(z_1)/M < d(z_1)/2$. We make the scaling as follows:
\[ \bar{u}(y,s) = r_1 u(r_1^2 s + t_1,\, r_1 y + x_1), \quad \bar{p}(y,s) = r_1^2 p(r_1^2 s + t_1,\, r_1 y + x_1). \]
The pair $(\bar{u},\bar{p})$ satisfies (1.1) in $Q(0,1)$ and $\bar{u}$ is smooth. Obviously,
\[ \sup_{Q(0,1)} |\bar{u}| \le 2, \quad |\bar{u}(0,0)| = 1. \tag{4.2} \]
By the scaling-invariant property of the quantity $C$, in what follows we view it as the object associated to $(\bar{u},\bar{p})$ at the origin. For any $\rho \in (0,1]$, from (4.1) we have
\[ C(\rho) \le \varepsilon_1, \tag{4.3} \]
\[ \|\bar{p}\|_{L^t_\infty L^x_{d/2}(Q(1))} \le K. \tag{4.4} \]


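We decompose $\bar{p}$ as in the proof of Lemma 2.3: $\bar{p} = \bar{p}_{0,1} + \bar{h}_{0,1}$. Because of (4.2), we have
\[ \int_{Q(0,1)} |\bar{p}_{0,1}|^{4(d+2)}\,dz \le N. \tag{4.5} \]
Since $\bar{h}_{0,1}(t,\cdot)$ is harmonic in $B(2/3)$ for a.e. $t \in (-1,0)$, it holds that
\[
\begin{aligned}
\int_{Q(0,1/2)} |\bar{h}_{0,1}|^{4(d+2)}\,dz
&\le N \int_{-1/4}^{0} \sup_{B(1/2)} |\bar{h}_{0,1}(t,\cdot)|^{4(d+2)}\,dt \\
&\le N \int_{-1/4}^{0} \Big( \int_{B(2/3)} |\bar{h}_{0,1}|^{d/2}\,dx \Big)^{8(d+2)/d}\,dt \\
&\le N \int_{Q(0,1)} |\bar{p}_{0,1}|^{4(d+2)}\,dz + N \sup_{t\in(-1,0)} \Big( \int_{B(0,1)} |\bar{p}(t,\cdot)|^{d/2}\,dx \Big)^{8(d+2)/d} \\
&\le N,
\end{aligned}
\tag{4.6}
\]
where in the last inequality we used (4.4) and (4.5). Thus, we deduce from (4.5) and (4.6) that
\[ \int_{Q(0,1/2)} |\bar{p}|^{4(d+2)}\,dz \le N. \tag{4.7} \]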

Now we note that $(\bar{u},\bar{p})$ satisfies the equation
\[ \partial_t \bar{u} - \Delta\bar{u} = -\operatorname{div}(\bar{u}\otimes\bar{u}) - \nabla\bar{p} \]
in $Q(0,1)$. Owing to (4.2), (4.7) and the classical Sobolev space theory of parabolic equations, we have
\[ \bar{u} \in W^{1,1/2}_{4(d+2)}(Q(0,1/3)), \quad \|\bar{u}\|_{W^{1,1/2}_{4(d+2)}(Q(0,1/3))} \le N. \]
By the Sobolev embedding theorem (see [16]), we obtain
\[ \bar{u} \in C^{1/4}(Q(0,1/4)), \quad \|\bar{u}\|_{C^{1/4}(Q(0,1/4))} \le N, \]
where $N$ is a universal constant depending only on $d$ and $K$. Therefore, we can find $\delta_1 < 1/5$ independent of $\varepsilon_1$ such that
\[ |\bar{u}(x,t)| \ge 1/2 \ \text{in } Q(0,\delta_1). \tag{4.8} \]
Now we choose $\varepsilon_1$ small enough, so that (4.8) contradicts (4.3). The theorem is proved.



5. Proof of Theorem 1.2 and 1.3

We finish the proof of Theorem 1.2 in this section. Let $u_k$, $p_k$, $u_\infty$ and $p_\infty$ be the functions constructed in Sect. 3. First we verify that the assumptions of Theorem 4.1 hold for $(u_k,p_k)$ when $k$ is sufficiently large and the parabolic cylinder is far away from the origin.

Lemma 5.1. For any $\varepsilon_2 > 0$ and $T_1 \ge 1$, we can find $R \ge 1$ such that, for any $z_0 \in (-T_1-1,0]\times(\mathbb{R}^d\setminus B_{R+1})$,
\[ \limsup_{k\to\infty} C(1,z_0,u_k,p_k) \le \varepsilon_2. \tag{5.1} \]

Proof. Due to Proposition 3.5 ii), we can find $R$ large such that
\[ \int_{(-T_1-2,0)\times(\mathbb{R}^d\setminus B_R)} |u_\infty|^d\,dz \]
is sufficiently small. This together with Proposition 3.5 i) proves the lemma.

Lemma 5.2. For any $\varepsilon_3 > 0$ and $T_1 \ge 1$, we can find $R \ge 1$ and $\rho_3 \in (0,1/2]$ such that, for any $\rho \in (0,\rho_3]$ and $z_0 \in (-T_1-1,0]\times(\mathbb{R}^d\setminus B_{R+2})$,
\[ \limsup_{k\to\infty} \big( C(\rho,z_0,u_k,p_k) + D(\rho,z_0,u_k,p_k) \big) \le \varepsilon_3. \tag{5.2} \]

Proof. The lemma is a consequence of Lemma 5.1, 2.3 and Proposition 3.1. Indeed, since $D(1,z_0,u_k,p_k)$ has a uniform bound, for any $\varepsilon > 0$, we can choose $\gamma$ small in (2.4), then $\varepsilon_2$ small in (5.1) and $R$ large such that
\[ \limsup_{k\to\infty} \big( C(\gamma,z_0,u_k,p_k) + D(\gamma,z_0,u_k,p_k) \big) \le \varepsilon \]
holds for any $z_0 \in (-T_1-1,0]\times(\mathbb{R}^d\setminus B_{R+1})$. Now it suffices to choose $\varepsilon$ small depending on $\varepsilon_3$ and apply Proposition 3.1. We finish the proof by setting $\rho_3 = \gamma/2$.

Next we show that $u_\infty$ is identically equal to zero.

Proposition 5.3. Under the assumptions of Theorem 1.2, let $(u_\infty,p_\infty)$ be the suitable weak solution constructed in Sect. 3. Then, $u_\infty(t,\cdot) \equiv 0$ for all $t \in (-\infty,0)$.

Proof. Let $\varepsilon_1$ be the constant in Theorem 4.1. Let $T_1 \ge 1$ be a number. Owing to Lemma 5.2, we can find $R \ge 1$ and $\rho_3 \in (0,1/2]$ such that, for any $\rho \in (0,\rho_3]$ and $z_0 \in (-T_1-1,0]\times(\mathbb{R}^d\setminus B_{R+2})$, estimate (5.2) holds with $\varepsilon_1/2$ in place of $\varepsilon_3$. Moreover, we recall that for each $k$,
\[ \|p_k\|_{L^t_\infty L^x_{d/2}((-\infty,0)\times\mathbb{R}^d)} \le N(d)K. \]
Thus Theorem 4.1 yields that
\[ \limsup_{k\to\infty} \sup_{Q(z_0,\rho_3/4)} |u_k(z)| \le N(d,\rho_3) \]


for any $z_0 \in [-T_1-1,0)\times(\mathbb{R}^d\setminus B_{R+2})$. Now by Proposition 3.5, we obtain $|u_\infty(z)| \le N(d,\rho_3)$ for a.e. $z \in [-T_1-1,0)\times(\mathbb{R}^d\setminus B_{R+2})$. Upon using the regularity results for linear Stokes systems, one can estimate higher derivatives
\[ |D^j u_\infty(z)| \le N(d,j,\rho_3) \tag{5.3} \]
for any $j \ge 1$ and a.e. $z \in [-T_1,0)\times(\mathbb{R}^d\setminus B_{R+3})$. We now claim that $u_\infty(0,\cdot) \equiv 0$ by adapting the argument in the proof of Theorem 1.4 of [5]. For any $x_0 \in \mathbb{R}^d$, by using (3.12),
\[
\begin{aligned}
\int_{B(x_0,1)} |u_\infty(x,0)|\,dx
&\le \int_{B(x_0,1)} |u_k(x,0) - u_\infty(x,0)|\,dx + \int_{B(x_0,1)} |u_k(x,0)|\,dx \\
&\le \|u_k - u_\infty\|_{C([-1,0];L_1(B(x_0,1)))} + N(d) \Big( \int_{B(x_0,1)} |u_k(x,0)|^d\,dx \Big)^{1/d} \\
&= \|u_k - u_\infty\|_{C([-1,0];L_1(B(x_0,1)))} + N(d) \Big( \int_{B(X_0+\lambda_k x_0,\lambda_k)} |u(y,T_0)|^d\,dy \Big)^{1/d}.
\end{aligned}
\]

The right-hand side of the above inequality goes to zero as $k \to \infty$, which proves the claim. Because of (5.3), the vorticity $\omega = \operatorname{curl} u_\infty$ satisfies the differential inequality
\[ |\partial_t\omega - \Delta\omega| \le N(|\omega| + |\nabla\omega|) \]
on $(-T_1,0]\times(\mathbb{R}^d\setminus B_{R+3})$. Thanks to the backward uniqueness theorem proved in [5] (see also [6]), we reach
\[ \omega(z) = 0 \ \text{on } (-T_1,0]\times(\mathbb{R}^d\setminus B_{R+3}). \tag{5.4} \]
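The differential inequality for $\omega$ can be traced back to the equation for $u_\infty$. The following is a sketch in index notation (summation over repeated $k$), assuming $u = u_\infty$ is smooth in the region considered and writing $\omega_{ij} = \partial_i u_j - \partial_j u_i$ for the curl in dimension $d$; this convention is an assumption, since the text does not fix one:

```latex
% u solves \partial_t u_j + u_k\partial_k u_j - \Delta u_j + \partial_j p = 0, div u = 0.
% Apply \partial_i to the j-th equation and subtract the same expression with
% i and j exchanged; the pressure terms \partial_i\partial_j p cancel.
\begin{align*}
\partial_t\omega_{ij} - \Delta\omega_{ij}
&= -u_k\,\partial_k\omega_{ij}
   - \big(\partial_i u_k\,\partial_k u_j - \partial_j u_k\,\partial_k u_i\big) \\
% substitute \partial_i u_k = \omega_{ik} + \partial_k u_i and
% \partial_j u_k = \omega_{jk} + \partial_k u_j; the symmetric products cancel
&= -u_k\,\partial_k\omega_{ij}
   - \big(\omega_{ik}\,\partial_k u_j - \omega_{jk}\,\partial_k u_i\big).
\end{align*}
% Hence, with the sup bound on u_\infty and (5.3) for j = 1,
\[
|\partial_t\omega - \Delta\omega|
\le |u|\,|\nabla\omega| + 2\,|\nabla u|\,|\omega|
\le N\big(|\omega| + |\nabla\omega|\big).
\]
```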

Now we fix a $t_0 \in (-T_1,0)$. Take an increasing sequence $\{t_k\} \subset (-T_1,0)$ converging to $t_0$. For each $k$, we consider Eq. (1.1) with initial data $u_\infty(t_k,\cdot)$. By Proposition 2.4, one can locally find a strong solution $v_k \in C([t_k,t_k+\delta_k); L_d(\mathbb{R}^d))$ for some small $\delta_k$, and $v_k(t,\cdot)$ is spatially analytic for $t \in (t_k,t_k+\delta_k)$. By the weak-strong uniqueness, $v_k \equiv u_\infty$ for $t \in [t_k,t_k+\delta_k)$. Therefore, $\omega(t,\cdot)$ is also spatially analytic for $t \in (t_k,t_k+\delta_k)$. Because of (5.4), we get $\omega(z) = 0$ on $(t_k,t_k+\delta_k)\times\mathbb{R}^d$, which implies that $u_\infty \equiv 0$ in the same region. In particular, there exists a sequence $\{s_k\}$ converging to $t_0$ such that $t_k < s_k \le t_0$ and $u_\infty(s_k,\cdot) \equiv 0$. This together with the weak continuity of $u_\infty$ yields that $u_\infty(t_0,\cdot) \equiv 0$. Since $t_0 \in (-T_1,0)$ is arbitrary and $T_1 \ge 1$ is also arbitrary, we complete the proof of the proposition.

We are ready to prove Theorem 1.2.



Proof of Theorem 1.2. We prove the theorem in three steps.

Step 1. First we show that $u$ is regular for $t \in (0,T]$. Thanks to Proposition 3.5 and 5.3, $u_k \to 0$ in $C([-3,0]; L_{2(d+3)/(d+1)}(B(3)))$. Also recall that $D(1,z_0,u_k,p_k)$ has a uniform bound. Following the proof of Lemma 5.2 we have: for any $\varepsilon_4 > 0$, there is a $\rho_4 \in (0,1/2]$ and a positive integer $k_0$ such that, for any $\rho \in (0,\rho_4]$ and $z_0 \in (-2,0]\times B(2)$,
\[ C(\rho,z_0,u_{k_0},p_{k_0}) + D(\rho,z_0,u_{k_0},p_{k_0}) \le \varepsilon_4. \]
We choose $\varepsilon_4$ sufficiently small and apply Theorem 4.1 to get
\[ \sup_{(-1,0)\times B(1)} |u_{k_0}| < \infty, \]
which implies that
\[ \sup_{Q(Z_0,\lambda_{k_0})} |u| < \infty. \]

This contradicts the assumption that $(T_0,X_0)$ is a blowup point. Therefore, $u$ is regular for $t \in (0,T]$.

Step 2. We bound the sup norm of $u$ in this step. Fix a $\delta \in (0,T)$. Since
\[ \|u\|_{L^t_\infty L^x_d((0,T)\times\mathbb{R}^d)} \le N, \quad \|p\|_{L^t_\infty L^x_{d/2}((0,T)\times\mathbb{R}^d)} \le N, \]
by the same reasoning as at the beginning of the proof of Proposition 5.3, we see that there exists a large $R \ge 1$ such that
\[ \sup_{[\delta,T)\times(\mathbb{R}^d\setminus B(R))} |u| \le N. \tag{5.5} \]
Next we estimate the sup norm of $u$ in $[\delta,T)\times\bar{B}(R)$. Fix a $z_0 = (t_0,x_0)$ in $[\delta,T]\times\bar{B}(R)$. In the construction of $u_k$, we replace $(T_0,X_0)$ by $(t_0,x_0)$. By the same reasoning as in the first step, for some $\varepsilon = \varepsilon(t_0,x_0) > 0$, we have
\[ \sup_{Q(z_0,\varepsilon)} |u| < \infty. \]
By the compactness of $[\delta,T]\times\bar{B}(R)$, it holds that
\[ \sup_{[\delta,T)\times\bar{B}(R)} |u| \le N. \]
This together with (5.5) yields
\[ \sup_{[\delta,T)\times\mathbb{R}^d} |u| \le N. \]

Step 3. Finally we prove the uniqueness. Owing to the local strong solvability of (1.1), we have $u \in L_{d+2}(\mathbb{R}^{d+1}_{T_1})$ for some $T_1 \in (0,T)$. On the other hand, for $t \in [T_1,T]$ the solution is uniformly bounded and belongs to $L^t_\infty L^x_d((T_1,T)\times\mathbb{R}^d)$, thus $u \in L_{d+2}(\mathbb{R}^{d+1}_T)$. The uniqueness then follows.


Now we give

Proof of Theorem 1.3. Thanks to Theorem 1.2, it remains to prove (1.7). Let $\lambda > 0$ be a constant to be specified later. We define
\[ u_\lambda(t,x) = \lambda u(\lambda^2 t,\lambda x), \quad p_\lambda(t,x) = \lambda^2 p(\lambda^2 t,\lambda x). \]
Then $(u_\lambda,p_\lambda)$ is also a Leray-Hopf weak solution of (1.1) in $(0,\infty)\times\mathbb{R}^d$, and $u_\lambda$ satisfies (1.6) with the same constant $K$ due to the scaling invariant property. By the proof of Lemma 3.3, we have $u_\lambda \in L_4((0,\infty)\times\mathbb{R}^d)$. Thus for any $\varepsilon > 0$, there is a $T > 0$ such that $\|u_\lambda\|_{L_4((T,\infty)\times\mathbb{R}^d)} \le \varepsilon$. Let $\varepsilon_1$ be the constant in Theorem 4.1. Upon using Lemma 2.3 and Proposition 3.1, we can find a large $T = T_\lambda$ such that
\[ C(\rho,z_0,u_\lambda,p_\lambda) + D(\rho,z_0,u_\lambda,p_\lambda) \le \varepsilon_1 \]
for any $\rho \in (0,1/2]$ and $z_0 \in [T,\infty)\times\mathbb{R}^d$. Owing to Theorem 4.1, we conclude
\[ \sup_{Q(z_0,1/4)} |u_\lambda(z)| < N, \]
where $N = N(d,K)$ is independent of $\lambda$. Therefore,
\[ \sup_{t \ge \lambda^2 T,\ x\in\mathbb{R}^d} |u(t,x)| < N/\lambda. \]

Sending λ → ∞ yields the desired result. The theorem is proved.
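The two scaling facts used in this proof can be written out explicitly. This is a sketch, assuming that (1.6) is the bound $\|u\|_{L^t_\infty L^x_d} \le K$ (the statement of (1.6) is not reproduced in this section):

```latex
% Scale invariance of the L_d norm: substitute y = \lambda x, dy = \lambda^d dx.
\[
\|u_\lambda(t,\cdot)\|_{L_d(\mathbb{R}^d)}^d
= \int_{\mathbb{R}^d} \lambda^d\,|u(\lambda^2 t,\lambda x)|^d\,dx
= \int_{\mathbb{R}^d} |u(\lambda^2 t,y)|^d\,dy
= \|u(\lambda^2 t,\cdot)\|_{L_d(\mathbb{R}^d)}^d .
\]
% Passing from the bound on u_\lambda back to u: set s = \lambda^2 t, y = \lambda x.
\[
|u(s,y)| = \frac{1}{\lambda}\,\big|u_\lambda(s/\lambda^2,\,y/\lambda)\big| < \frac{N}{\lambda}
\quad\text{whenever } s/\lambda^2 \ge T,\ \text{i.e. } s \ge \lambda^2 T .
\]
```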

Acknowledgement. The authors would like to express their sincere gratitude to Vladimir Šverák for very helpful comments and suggestions. The authors are also grateful to Gabriel Koch and the referee for useful comments on a previous version of the manuscript.

References

1. Caffarelli, L., Kohn, R., Nirenberg, L.: Partial regularity of suitable weak solutions of the Navier-Stokes equations. Comm. Pure Appl. Math. 35, 771–831 (1982)
2. Cheskidov, A., Shvydkoy, R.: On the regularity of weak solutions of the 3D Navier-Stokes equations in $B^{-1}_{\infty,\infty}$. http://arxiv.org/abs/0708.3067v2[math:AP], 2007
3. Dong, H., Du, D.: Partial regularity of solutions to the four-dimensional Navier-Stokes equations at the first blow-up time. Commun. Math. Phys. 273(3), 785–801 (2007)
4. Dong, H., Li, D.: Optimal local smoothing and analyticity rate estimates for the generalized Navier-Stokes equations. Commun. Math. Sci. 7(1), 67–80 (2009)
5. Escauriaza, L., Seregin, G., Šverák, V.: $L_{3,\infty}$-solutions of Navier-Stokes equations and backward uniqueness (In Russian). Usp. Mat. Nauk 58(2)(350), 3–44 (2003); translation in Russ. Math. Surv. 58(2), 211–250 (2003)
6. Escauriaza, L., Seregin, G., Šverák, V.: Backward uniqueness for parabolic equations. Arch. Rat. Mech. Anal. 169(2), 147–157 (2003)
7. Gallagher, I., Iftimie, D., Planchon, F.: Asymptotics and stability for global solutions to the Navier-Stokes equations. Ann. Inst. Fourier (Grenoble) 53(5), 1387–1424 (2003)
8. Giga, Y.: Solutions for semilinear parabolic equations in $L^p$ and regularity of weak solutions of the Navier-Stokes system. J. Differ. Eq. 62, 186–212 (1986)
9. Giga, Y., Miyakawa, T.: Solution in $L_r$ of the Navier-Stokes initial value problem. Arch. Rat. Mech. Anal. 89, 267–281 (1985)
10. Giga, Y., Sawada, O.: On regularizing-decay rate estimates for solutions to the Navier-Stokes initial value problem. In: Nonlinear Analysis and Applications: to V. Lakshmikantham on his 80th Birthday. 1,2, Dordrecht: Kluwer Acad. Publ., 2003, pp. 549–562
11. Gustafson, S., Kang, K., Tsai, T.: Interior regularity criteria for suitable weak solutions of the Navier-Stokes equations. Commun. Math. Phys. 273(1), 161–176 (2007)
12. Hopf, E.: Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen. Math. Nachr. 4, 213–231 (1951)
13. Kato, T.: Strong $L^p$-solutions of the Navier-Stokes equation in $\mathbb{R}^m$ with applications to weak solutions. Math. Z. 187, 471–480 (1984)
14. Koch, H., Tataru, D.: Well-posedness for the Navier-Stokes equations. Adv. Math. 157(1), 22–35 (2001)
15. Ladyzhenskaya, O.: On the uniqueness and smoothness of generalized solutions to the Navier-Stokes equations. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 5, 169–185 (1967); English transl.: Sem. Math. V. A. Steklov Math. Inst. Leningrad 5, 60–66 (1969)
16. Ladyzhenskaya, O., Solonnikov, V., Ural'tseva, N.: Linear and quasi-Linear equations of parabolic type. Moscow: Nauka, 1967 (in Russian); English translation: Providence, RI: Amer. Math. Soc., 1968
17. Ladyzhenskaya, O.: The Mathematical Theory of Viscous Incompressible Flows. 2nd edition, London: Gordon and Breach, 1969
18. Ladyzhenskaya, O., Seregin, G.A.: On partial regularity of suitable weak solutions to the three-dimensional Navier–Stokes equations. J. Math. Fluid Mech. 1, 356–387 (1999)
19. Mikhailov, A., Shilkin, T.: $L_{3,\infty}$-solutions to the 3D-Navier-Stokes system in the domain with a curved boundary (English, Russian summary). Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 336, 133–152 (2006); translation in J. Math. Sci. (N. Y.) 143(2), 2924–2935 (2007)
20. Leray, J.: Étude de diverses équations intégrales non linéaires et de quelques problèmes que pose l'hydrodynamique. J. Math. Pures Appl. 12, 1–82 (1933)
21. Lin, F.: A new proof of the Caffarelli-Kohn-Nirenberg theorem. Comm. Pure Appl. Math. 51, 241–257 (1998)
22. Prodi, G.: Un teorema di unicità per le equazioni di Navier-Stokes. Ann. Mat. Pura Appl. 48, 173–182 (1959)
23. Scheffer, V.: Partial regularity of solutions to the Navier-Stokes equations. Pacific J. Math. 66, 535–552 (1976)
24. Scheffer, V.: Hausdorff measure and the Navier-Stokes equations. Commun. Math. Phys. 55, 97–112 (1977)
25. Scheffer, V.: The Navier-Stokes equations on a bounded domain. Commun. Math. Phys. 73, 1–42 (1980)
26. Seregin, G.: On smoothness of $L_{3,\infty}$-solutions to the Navier-Stokes equations up to boundary. Math. Ann. 332(1), 219–238 (2005)
27. Seregin, G., Sverak, V.: On smoothness of suitable weak solutions to the Navier-Stokes equations. Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 306, 186–198 (2003); translation in J. Math. Sci. (N. Y.) 130(4), 4884–4892 (2005)
28. Serrin, J.: On the interior regularity of weak solutions of Navier-Stokes equations. Arch. Rat. Mech. Anal. 9, 187–195 (1962)
29. Maremonti, P., Solonnikov, V.: On estimates for the solutions of the nonstationary Stokes problem in S. L. Sobolev anisotropic spaces with a mixed norm. (Russian. English, Russian summary) Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 222, 124–150 (1995); translation in J. Math. Sci. (New York) 87(5), 3859–3877 (1997)
30. Serrin, J.: The initial value problem for the Navier-Stokes equations. In: Nonlinear Problems, R. Langer, ed., Madison, WI: Univ. of Wisconsin Press, 1963, 69–98
31. Struwe, M.: On partial regularity results for the Navier-Stokes equations. Comm. Pure Appl. Math. 41, 437–458 (1988)
32. von Wahl, W.: The Equations of Navier-Stokes and Abstract Parabolic Equations. Braunschweig: Vieweg, 1985
33. Taylor, M.: Analysis on Morrey spaces and applications to Navier-Stokes equation. Comm. Part. Differ. Eqs. 17, 1407–1456 (1992)

Communicated by P. Constantin




An Effective Mass Theorem for the Bidimensional Electron Gas in a Strong Magnetic Field

Fanny Delebecque-Fendt, Florian Méhats

IRMAR, Université Rennes 1, Campus de Beaulieu, 35042 Rennes Cedex, France. E-mail: [email protected]; [email protected]

Received: 6 February 2009 / Accepted: 28 April 2009
Published online: 24 July 2009 – © Springer-Verlag 2009

Abstract: We study the limiting behavior of a singularly perturbed Schrödinger-Poisson system describing a 3-dimensional electron gas strongly confined in the vicinity of a plane (x, y) and subject to a strong uniform magnetic field in the plane of the gas. The coupled effects of the confinement and of the magnetic field induce fast oscillations in time that need to be averaged out. We obtain at the limit a system of 2-dimensional Schrödinger equations in the plane (x, y), coupled through an effective selfconsistent electrical potential. In the direction perpendicular to the magnetic field, the electron mass is modified by the field, as the result of an averaging of the cyclotron motion. The main tools of the analysis are the adaptation of the second order long-time averaging theory of ODEs to our PDEs context, and the use of a Sobolev scale adapted to the confinement operator.

1. Introduction

1.1. The singularly perturbed problem. Many electronic devices are based on the quantum transport of a bidimensional electron gas (2DEG) artificially confined in heterostructures at nanometer scales, see e.g. [2,4,20,31]. In this article, we derive an asymptotic model for the quantum transport of a 2DEG subject to a strong uniform magnetic field which is parallel to the plane of the gas. The aim of this paper is to understand how the cyclotron motion competes with the effects of the potential confining the electrons and the nonlinear effects of the selfconsistent Poisson potential. Our tool is an asymptotic analysis from a singularly perturbed Schrödinger-Poisson system towards a reduced model of bidimensional quantum transport. In particular, we generalize in this context the notion of cyclotron effective mass, usually explicitly calculated in the simplified situation of a harmonic confinement potential [20,28].

Our starting model is thus the 3D Schrödinger-Poisson system, singularly perturbed by a confinement potential and the strong magnetic field. The three-dimensional space variables are denoted by $(x, y, z)$ and the associated canonical basis of $\mathbb{R}^3$ is denoted by


$(e_x, e_y, e_z)$. The particles are subject to three effects: a confinement potential depending on the $z$ variable, a uniform magnetic field applied to the gas along the $e_y$ axis, and the selfconsistent Poisson potential. Given a small parameter $\varepsilon > 0$, which is the typical extension of the 2DEG in the $z$ direction, our starting model is the following dimensionless Schrödinger-Poisson system:
\[ i\partial_t\Psi^\varepsilon = \frac{1}{\varepsilon^2}\big(-\partial_z^2 + B^2z^2 + V_c(z)\big)\Psi^\varepsilon - \frac{2iB}{\varepsilon}\,z\,\partial_x\Psi^\varepsilon - \Delta_{x,y}\Psi^\varepsilon + V^\varepsilon\Psi^\varepsilon, \tag{1.1} \]
\[ \Psi^\varepsilon(0,x,y,z) = \Psi_0(x,y,z), \tag{1.2} \]
\[ V^\varepsilon(t,x,y,z) = \frac{1}{4\pi r^\varepsilon} * |\Psi^\varepsilon|^2, \tag{1.3} \]
where we have denoted
\[ r^\varepsilon(x,y,z) = \sqrt{x^2 + y^2 + \varepsilon^2z^2}. \tag{1.4} \]

The scaling is discussed in the next subsection. This system describes the transport of electrons under the action of:

– The applied confinement potential $\frac{1}{\varepsilon^2}V_c(z)$, nonnegative, such that $V_c(z) \to +\infty$ as $|z| \to +\infty$. The precise assumptions on this potential are made below in Assumptions 1.1 and 1.2.
– The applied uniform magnetic field $\frac{B}{\varepsilon}e_y$ (with $B > 0$ fixed), which derives from the magnetic potential $\frac{1}{\varepsilon}Bz\,e_x$. We have chosen to work in the Landau gauge.
– The Poisson selfconsistent potential $V^\varepsilon$.

Note that (1.1) is equivalent to
\[ i\partial_t\Psi^\varepsilon = \frac{1}{\varepsilon^2}\big(-\partial_z^2 + V_c(z)\big)\Psi^\varepsilon + \Big(i\partial_x - \frac{Bz}{\varepsilon}\Big)^2\Psi^\varepsilon - \partial_y^2\Psi^\varepsilon + V^\varepsilon\Psi^\varepsilon. \tag{1.5} \]
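The equivalence between (1.1) and (1.5) amounts to expanding the magnetic square. A short check, writing $\Psi$ for the wavefunction (the symbol is an assumption of this sketch) and using that multiplication by $z$ commutes with $\partial_x$:

```latex
% Expand (i\partial_x - a)^2 with a = Bz/\varepsilon, which does not depend on x.
\[
\Big(i\partial_x - \frac{Bz}{\varepsilon}\Big)^2\Psi
= -\partial_x^2\Psi - \frac{2iB}{\varepsilon}\,z\,\partial_x\Psi + \frac{B^2z^2}{\varepsilon^2}\,\Psi .
\]
% Adding \varepsilon^{-2}(-\partial_z^2 + V_c)\Psi - \partial_y^2\Psi + V^\varepsilon\Psi
% regroups the terms as
\[
\frac{1}{\varepsilon^2}\big(-\partial_z^2 + B^2z^2 + V_c(z)\big)\Psi
- \frac{2iB}{\varepsilon}\,z\,\partial_x\Psi
- (\partial_x^2 + \partial_y^2)\Psi + V^\varepsilon\Psi ,
\]
% which is the right-hand side of (1.1) with \Delta_{x,y} = \partial_x^2 + \partial_y^2.
```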

The goal of this work is to exhibit an asymptotic system for (1.1), (1.2), (1.3) as $\varepsilon \to 0$. Let us end this subsection with short bibliographical notes. In a linear setting, quantum motion constrained to a manifold has been studied for a long time by several authors, see [15,18,21,30] and references therein. Nonlinear situations were studied more recently. The approximation of the Schrödinger-Poisson system with no magnetic field was studied when the electron gas is constrained in the vicinity of a plane in [7,25] and when the gas is constrained on a line in [5]. When the nonlinearity depends locally on the density, as for the Gross-Pitaevskii equation, asymptotic models for confined quantum systems were studied in [8,6,12]. In the classical setting, collisional models in situations of strong confinement have been studied in [17]. Finally, let us draw a parallel with the problem of homogenization of the Schrödinger equation in a large periodic potential, studied in [1 and 29]. At the limit $\varepsilon \to 0$, as noted above, we will obtain a homogenized system which takes the form of bidimensional Schrödinger equations with an effective mass in the $x$ direction. However, this phenomenon is due to an averaging of the cyclotron motion induced by a strong magnetic field, and is not exactly the same notion as the usual effective mass for the transport in a lattice or in a crystal. Nevertheless, it is interesting to observe that the scaling used in [1,29] in the case of a strong periodic potential is similar to the strong confinement scaling used in the present paper.


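1.2. The physical scaling. In order to clarify the physical assumptions underlying our singularly perturbed system, let us derive (1.1), (1.2), (1.3) from the Schrödinger-Poisson system written in physical variables (written here in bold, with typical scales carrying an overbar). This system reads as follows:
\[ i\hbar\,\partial_{\mathbf{t}}\boldsymbol{\Psi} = \frac{1}{2m}\Big(i\hbar\nabla - \frac{e\mathbf{B}}{c}\,\mathbf{z}\,e_x\Big)^2\boldsymbol{\Psi} + e\mathbf{V}_c\boldsymbol{\Psi} + e\mathbf{V}\boldsymbol{\Psi}, \tag{1.6} \]
\[ \mathbf{V} = \frac{e}{4\pi\sqrt{\mathbf{x}^2 + \mathbf{y}^2 + \mathbf{z}^2}} * |\boldsymbol{\Psi}|^2. \tag{1.7} \]
Each dimensionless quantity in (1.1), (1.2), (1.3) is the associated physical quantity normalized by a typical scale:
\[ x = \frac{\mathbf{x}}{\overline{x}}, \quad y = \frac{\mathbf{y}}{\overline{y}}, \quad z = \frac{\mathbf{z}}{\overline{z}}, \quad |\Psi^\varepsilon|^2 = \frac{|\boldsymbol{\Psi}|^2}{\overline{N}}, \quad V^\varepsilon = \frac{\mathbf{V}}{\overline{V}}, \quad V_c = \frac{\mathbf{V}_c}{\overline{V_c}}, \quad B = \frac{\mathbf{B}}{\overline{B}}. \tag{1.8} \]
Now we introduce two energy scales in this problem: a strong energy $E_{conf}$, which will be the energy of the confinement in $z$ and of the magnetic effects, and a transport energy $E_{transp}$, which will be the typical energy of the longitudinal transport in $(x,y)$ and also of the selfconsistent effects. We introduce the following small dimensionless parameter:
\[ \varepsilon = \Big(\frac{E_{transp}}{E_{conf}}\Big)^{1/2} \ll 1. \tag{1.9} \]
Then our scaling assumptions are the following. We set to the scale $E_{conf}$ the confinement potential, the magnetic energy and the kinetic energy along $z$:
\[ E_{conf} := e\overline{V_c} = \frac{1}{2}\,m\Big(\frac{e\overline{B}}{mc}\Big)^2\overline{z}^2 = \frac{\hbar^2}{2m\overline{z}^2}, \tag{1.10} \]
and we set to the scale $E_{transp}$ the selfconsistent potential energy, the kinetic energies along $x$ and $y$, and we finally choose a time scale adapted to this energy:
\[ E_{transp} := e\overline{V} = e^2\,\overline{N}\,\overline{x}\,\overline{z} = \frac{\hbar^2}{2m\overline{x}^2} = \frac{\hbar^2}{2m\overline{y}^2} = \frac{\hbar}{\overline{t}}. \tag{1.11} \]
By inserting (1.8) in (1.6), (1.7), then by using (1.9), (1.10) and (1.11), we obtain directly our singularly perturbed problem (1.1), (1.3). Note that (1.10) and (1.11) imply that $\varepsilon$ is also the ratio between the transversal and the longitudinal space scales:
\[ \varepsilon = \frac{\overline{z}}{\overline{x}} = \frac{\overline{z}}{\overline{y}}. \]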
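The last identity, and the $1/\varepsilon$ prefactor of the cross term in (1.1), follow from (1.9), (1.10), (1.11) by direct substitution. The computation below is a sketch: it assumes the kinetic scales are $\hbar^2/(2m\overline{z}^2)$ and $\hbar^2/(2m\overline{x}^2)$, where overbars denote typical scales, and it introduces the cyclotron frequency $\omega_c$ as an auxiliary symbol:

```latex
% Ratio of the space scales, from the kinetic-energy parts of (1.10) and (1.11):
\[
\varepsilon^2 = \frac{E_{transp}}{E_{conf}}
= \frac{\hbar^2/(2m\overline{x}^2)}{\hbar^2/(2m\overline{z}^2)}
= \frac{\overline{z}^2}{\overline{x}^2}.
\]
% Cross term of the magnetic square, with \omega_c := e\overline{B}/(mc).
% From (1.10): (1/2) m \omega_c^2 \overline{z}^2 = \hbar^2/(2m\overline{z}^2),
% so m\omega_c\overline{z}^2 = \hbar and \hbar\omega_c = 2E_{conf}. Hence
\[
\frac{\hbar\,e\overline{B}\,\overline{z}}{mc\,\overline{x}}
= \hbar\omega_c\,\frac{\overline{z}}{\overline{x}}
= 2E_{conf}\,\varepsilon
= \frac{2E_{transp}}{\varepsilon},
\]
% which, after division by E_{transp}, gives the coefficient 2/\varepsilon
% in front of -iBz\partial_x in (1.1).
```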

1.3. Heuristics in a simplified case. In this section, we analyze a very simplified situation where analytic calculations can be directly done. We assume here that Vc is a harmonic confinement potential and we neglect the Poisson potential V ε . We formally analyze the heuristics in this simplified case, that will be further compared to our result obtained in the general case.



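We thus consider here a new system, similar to (1.1), where we prescribe $V_c(z) = \alpha^2z^2$, $\alpha > 0$, and where the Poisson potential $V^\varepsilon$ is replaced by 0:
\[ i\partial_t\Psi^\varepsilon = \frac{1}{\varepsilon^2}\big(-\partial_z^2 + (\alpha^2+B^2)z^2\big)\Psi^\varepsilon - \frac{2iB}{\varepsilon}\,z\,\partial_x\Psi^\varepsilon - \Delta_{x,y}\Psi^\varepsilon, \tag{1.12} \]
\[ \Psi^\varepsilon(0,x,y,z) = \Psi_0(x,y,z). \tag{1.13} \]
In this situation, there is a trick which enables us to transform the equation. Indeed, by remarking that
\[ -\partial_z^2 + (\alpha^2+B^2)z^2 - 2iB\varepsilon z\partial_x - \varepsilon^2\partial_x^2 = -\partial_z^2 + (\alpha^2+B^2)\Big(z - \frac{B}{\alpha^2+B^2}\,i\varepsilon\partial_x\Big)^2 - \frac{\alpha^2}{\alpha^2+B^2}\,\varepsilon^2\partial_x^2, \]
we obtain that (1.12) is equivalent to
\[ i\partial_t\Psi^\varepsilon = \frac{1}{\varepsilon^2}\Big(-\partial_z^2 + (\alpha^2+B^2)\Big(z - \frac{B}{\alpha^2+B^2}\,i\varepsilon\partial_x\Big)^2\Big)\Psi^\varepsilon - \frac{\alpha^2}{\alpha^2+B^2}\,\partial_x^2\Psi^\varepsilon - \partial_y^2\Psi^\varepsilon. \tag{1.14} \]
Introduce now the following operator: for a function $u \in L^2(\mathbb{R}^3)$, we set
\[ (\Theta^\varepsilon u)(x,y,z) = \mathcal{F}_x^{-1}\Big( (\mathcal{F}_x u)\Big(\xi,\, y,\, z + \frac{B}{\alpha^2+B^2}\,\varepsilon\xi\Big) \Big), \]
where $\mathcal{F}_x$ denotes the Fourier transform in the $x$ variable. Note that this operator $\Theta^\varepsilon$ is unitary on $L^2(\mathbb{R}^3)$ and commutes with $\partial_x$ and $\partial_y$. Hence, we deduce from (1.14) and by direct calculations that the function $u^\varepsilon = \Theta^\varepsilon\Psi^\varepsilon$ satisfies the following system:
\[ i\partial_t u^\varepsilon = \frac{1}{\varepsilon^2}H_z u^\varepsilon - \frac{\alpha^2}{\alpha^2+B^2}\,\partial_x^2 u^\varepsilon - \partial_y^2 u^\varepsilon, \quad u^\varepsilon(t=0) = \Theta^\varepsilon\Psi_0, \]
where
\[ H_z = -\partial_z^2 + (\alpha^2+B^2)z^2. \]
Let us now filter out the oscillations by introducing the new unknown
\[ \Phi^\varepsilon = \exp(itH_z/\varepsilon^2)\,u^\varepsilon. \]
Again, the operator $\exp(itH_z/\varepsilon^2)$ commutes with $\partial_x$, $\partial_y$ and, finally, the following equation is equivalent to (1.12):
\[ i\partial_t\Phi^\varepsilon = -\frac{\alpha^2}{\alpha^2+B^2}\,\partial_x^2\Phi^\varepsilon - \partial_y^2\Phi^\varepsilon, \quad \Phi^\varepsilon(t=0) = \Theta^\varepsilon\Psi_0. \tag{1.15} \]
As $\varepsilon \to 0$, it is not difficult to see that, for sufficiently smooth initial data, we have $\Theta^\varepsilon\Psi_0 \to \Psi_0$. Therefore, one can show that, in adapted functional spaces, we have $\Phi^\varepsilon \to \Phi$ as $\varepsilon \to 0$, with $\Phi$ solution of the limit system:
\[ i\partial_t\Phi = -\frac{\alpha^2}{\alpha^2+B^2}\,\partial_x^2\Phi - \partial_y^2\Phi, \quad \Phi(t=0) = \Psi_0. \tag{1.16} \]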

833

This equation is a bidimensional Schrödinger equation with an anisotropic operator that can be interpreted as follows. Whereas, as expected, the dynamics in the $y$ direction is not perturbed by the magnetic field (since it is parallel to $y$), in the $x$ direction the electrons are transported as if their mass was augmented by a factor $\frac{\alpha^2+B^2}{\alpha^2} > 1$. This coefficient is called the (dimensionless) electron cyclotron mass [20,28].

In this article, the model that we want to treat is the nonlinear system (1.1), (1.3), with a general confinement potential $V_c$ instead of $\alpha^2z^2$ and the selfconsistent Poisson potential. Consequently, it is not possible to simplify the equation (1.1) by the above trick. Moreover, the potential $V^\varepsilon$ depends on the $z$ variable and on the function $\Psi^\varepsilon$ itself. Therefore, one has to be careful for instance when filtering out the fast oscillations by applying the operator $\exp(itH_z/\varepsilon^2)$, since in this nonlinear framework some interference effects between the elementary waves might appear.

In this article, we present a general strategy that enables us to overcome these difficulties. The strategy will be inspired by [6], where the nonlinear Schrödinger equation under strong partial confinement was analyzed. Two main differences appear here. First, the Poisson nonlinearity is nonlocal, which requires specific estimates. Observe that, at the limit $\varepsilon \to 0$, the nonlinearity in the present paper reads $\frac{1}{4\pi|x|} * \int |\psi|^2\,dz$ and does not depend on $z$. This makes an important difference with the case of [6]; in particular, no resonance effects due to the nonlinearity will appear. Second, the magnetic field induces in (1.1) a singular term at an intermediate scale $\frac{1}{\varepsilon}$ between the confinement operator (at the scale $\frac{1}{\varepsilon^2}$) and the nonlinearity (at the scale $\frac{1}{\varepsilon^0}$). Hence, compared to [6], the averaging techniques have to be pushed to order two and resonance effects will finally appear here due to this magnetic term.

1.4. Main result. Consider the system (1.1), (1.2), (1.3). We assume that the confinement potential $V_c$ satisfies two assumptions. The first one concerns the behavior of this function at infinity.

Assumption 1.1. The potential $V_c$ is a $C^\infty$ nonnegative even function such that
\[ a^2|z|^2 \le V_c(z) \le C|z|^M \quad \text{for } |z| \ge 1, \tag{1.17} \]
where $a > 0$, $M > 0$, and
\[ \frac{|\partial_z V_c(z)|}{V_c(z)} = O\big(|z|^{-M'}\big), \quad \frac{|\partial_z^k V_c(z)|}{V_c(z)} = O(1) \ \text{for all } k \in \mathbb{N}^*, \tag{1.18} \]
as $|z| \to +\infty$, where $M' > 0$.

Note that a smooth even potential of the form $V_c(z) = C|z|^s$ for $|z| \ge |z_0|$, with $C > 0$, $s \ge 2$, satisfies these assumptions. In particular the harmonic potential $V_c = a^2z^2$ fits these conditions.

Let us discuss the assumptions. The assumption that the function $V_c(z)$ is even is important in our analysis, see e.g. Step 4 in Subsect. 1.5. The left inequality in the first condition (1.17) implies that $V_c$ tends to $+\infty$ as $|z| \to +\infty$. The fact that $V_c(z) \ge a^2z^2$ is not essential in our analysis but simplifies it (see below: it allows us to give a simple characterization of the energy space related to our system). As is well known [26], the spectrum of the operator $H_z$ defined by
\[ H_z = -\partial_z^2 + B^2z^2 + V_c(z) \tag{1.19} \]

834


is discrete when $H_z$ is considered as a linear, unbounded operator on $L^2(\mathbb{R})$, with domain $D(H_z) = \{u \in L^2(\mathbb{R}),\ H_z u \in L^2(\mathbb{R})\}$. The complete sequence of eigenvalues of $H_z$ will be denoted by $(E_p)_{p\in\mathbb{N}}$, taken strictly increasing with $p$ (recall that in dimension 1 the eigenvalues are simple), and the associated Hilbert basis of real-valued eigenfunctions will be denoted by $(\chi_p(z))_{p\in\mathbb{N}}$. The right inequality in (1.17) and the second condition (1.18) are more technical and are here to simplify the use of a Sobolev scale based on the operator $H_z$, which is well adapted to our problem. More precisely, these assumptions are used in Lemma 2.3. The second assumption on $V_c$ concerns the spectrum of the confinement operator $H_z$.

Assumption 1.2. The eigenvalues of the operator $H_z$ defined by (1.19) satisfy the following property: there exist $C > 0$ and $n_0 \in \mathbb{N}$ such that
\[ \forall p \in \mathbb{N}, \quad E_{p+1} - E_p \ge C(1+p)^{-n_0}. \]

The simplest situation where Assumption 1.2 is satisfied is when there exists a uniform gap between the eigenvalues: for all $p \in \mathbb{N}^*$, $E_{p+1} - E_p \ge C_0 > 0$. Note that in this case we have $n_0 = 0$. This property is true in the following examples:
– If $V_c(z) = a^2z^2 + V_1(z)$, with $\|V_1\|_{L^\infty} < 2\sqrt{a^2+B^2}$. Indeed, in this case perturbation theory gives $|E_p - (2p+1)\sqrt{a^2+B^2}| < \|V_1\|_{L^\infty}$.
– If $V_c(z) \sim a|z|^s$ as $|z| \to +\infty$, with $s > 2$. Indeed, in this case the Weyl asymptotics [19] give $E_p \sim C p^{\frac{2s}{s+2}}$, so $E_{p+1} - E_p \to +\infty$ as $p \to +\infty$.

Let us now give a few indications on the Cauchy problem for (1.1), (1.2), (1.3). This system benefits from two conservation laws, the mass and energy conservations:
\[ \forall t \ge 0, \quad \|\Psi^\varepsilon(t)\|_{L^2}^2 = \|\Psi_0\|_{L^2}^2, \qquad E(\Psi^\varepsilon(t)) = E(\Psi_0), \tag{1.20} \]

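where the total energy of the wavefunction $\Psi^\varepsilon$ is defined by
\[ E(\Psi^\varepsilon) = \frac{1}{\varepsilon^2}\|\partial_z\Psi^\varepsilon\|_{L^2}^2 + \frac{1}{\varepsilon^2}\|\sqrt{V_c}\,\Psi^\varepsilon\|_{L^2}^2 + \frac{1}{\varepsilon^2}\|(\varepsilon\partial_x + iBz)\Psi^\varepsilon\|_{L^2}^2 + \|\partial_y\Psi^\varepsilon\|_{L^2}^2 + \frac{1}{2}\|\sqrt{V^\varepsilon}\,\Psi^\varepsilon\|_{L^2}^2. \tag{1.21} \]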

For fixed $\varepsilon > 0$, the Cauchy theory for the Schrödinger-Poisson system with a constant uniform magnetic field was solved in [14,16] in the energy space. It is not difficult to adapt these proofs (see also the reference book [13]) to our case, where an additional confinement potential is applied. The energy space in our situation is the set of functions $u$ such that $E(u)$ is finite:
\[ B^1 = \Big\{ u \in L^2(\mathbb{R}^3) : \ \partial_z u \in L^2(\mathbb{R}^3),\ \sqrt{V_c}\,u \in L^2(\mathbb{R}^3),\ \partial_y u \in L^2(\mathbb{R}^3) \ \text{and}\ \Big(\partial_x + \frac{iBz}{\varepsilon}\Big)u \in L^2(\mathbb{R}^3) \Big\}. \]
This space seems to depend on $\varepsilon$, which would not be convenient for our asymptotic analysis. In fact, it does not. Indeed, thanks to our assumption (1.17) on the confinement potential, one has
\[ \|zu\|_{L^2} \le \frac{1}{a}\|\sqrt{V_c}\,u\|_{L^2}, \]
so $u \in B^1$ implies that $zu \in L^2$ and thus $\partial_x u \in L^2$. Hence one has
\[ B^1 = \big\{ u \in H^1(\mathbb{R}^3) : \ \sqrt{V_c}\,u \in L^2(\mathbb{R}^3) \big\}, \]
and, on this space, we will use the following norm, independent of $\varepsilon$:
\[ \|u\|_{B^1}^2 = \|(I - \Delta_{x,y} + H_z)^{1/2}u\|_{L^2}^2 = \|u\|_{L^2}^2 + \|(-\Delta_{x,y})^{1/2}u\|_{L^2}^2 + \|H_z^{1/2}u\|_{L^2}^2 = \|u\|_{H^1}^2 + \|\sqrt{V_c}\,u\|_{L^2}^2 + B^2\|zu\|_{L^2}^2, \tag{1.22} \]

where we used the selfadjointness and the positivity of $-\Delta_{x,y}$ and of the operator $H_z$ defined by (1.19), and where $I$ denotes the identity operator. In this paper, we will assume that the initial datum $\Psi_0$ in (1.2) belongs to this space $B^1$. Then, for all $\varepsilon > 0$, the system (1.1), (1.2), (1.3) admits a unique global solution $\Psi^\varepsilon \in C^0([0,+\infty), B^1)$. Our aim is to analyze the asymptotic behavior of $\Psi^\varepsilon$ as $\varepsilon \to 0$. We are now in position to state our main results. Here and throughout this paper, we will use the notation
\[ \forall u \in L^1_z(\mathbb{R}), \quad \langle u \rangle = \int_{\mathbb{R}} u(z)\,dz. \tag{1.23} \]
Let us introduce the limit system. First define the following coefficients:
\[ \forall p \in \mathbb{N}, \quad \alpha_p = 1 - \sum_{q \ne p} \frac{\langle 2Bz\chi_p\chi_q\rangle^2}{E_q - E_p}, \tag{1.24} \]

where we recall that $(E_p, \chi_p)_{p\in\mathbb{N}}$ is the complete sequence of eigenvalues and eigenfunctions of the operator $H_z$ defined by (1.19). Then, we introduce the following infinite dimensional, nonlinear and coupled differential system on the functions $\phi_p(t,x,y)$:
\[ \forall p \in \mathbb{N}, \quad i\partial_t\phi_p = -\alpha_p\partial_x^2\phi_p - \partial_y^2\phi_p + W\phi_p, \qquad \phi_p(t=0) = \langle \Psi_0\chi_p\rangle, \tag{1.25} \]
\[ W = \frac{1}{4\pi\sqrt{x^2+y^2}} * \Big(\sum_{p\in\mathbb{N}} |\phi_p|^2\Big). \tag{1.26} \]
Note that the convolution in (1.26) acts on the variables $(x,y) \in \mathbb{R}^2$. Equation (1.26) is nothing but the Poisson equation for a measure-valued distribution of mass whose support is constrained to the plane $z = 0$:
\[ W(t,x,y) = \Big[ \frac{1}{4\pi\sqrt{x^2+y^2+z^2}} * \Big(\sum_{p\in\mathbb{N}} |\phi_p(t,x,y)|^2\,\delta_{z=0}\Big) \Big]_{z=0}. \]


In order to compare with $\Psi^\varepsilon$, we introduce the following functions:
\[ \Phi(t,x,y,z) = \sum_{p\in\mathbb{N}} \phi_p(t,x,y)\,\chi_p(z), \qquad \Psi^\varepsilon_{app}(t,x,y,z) = \sum_{p\in\mathbb{N}} e^{-itE_p/\varepsilon^2}\phi_p(t,x,y)\,\chi_p(z). \tag{1.27} \]
Remark that $\Psi^\varepsilon_{app}$ can be deduced from $\Phi$ through the application of the operator $e^{-itH_z/\varepsilon^2}$, which is unitary on $B^1$:
\[ \Psi^\varepsilon_{app} = e^{-itH_z/\varepsilon^2}\,\Phi. \]
This explicit relation is the only dependence on $\varepsilon$ of the limit system (1.25), (1.26), (1.27). Our main result is the following theorem.

Theorem 1.3. Assume that $V_c$ satisfies Assumptions 1.1 and 1.2 and let $\Psi_0 \in B^1$. For all $\varepsilon \in (0,1]$, denote by $\Psi^\varepsilon \in C^0([0,+\infty), B^1)$ the unique global solution of the initial system (1.1), (1.2), (1.3). Then the following holds true:
(i) The limit system (1.25), (1.26), (1.27) admits a unique maximal solution $\Psi^\varepsilon_{app} \in C^0([0,T_{max}), B^1)$, where $T_{max} \in (0,+\infty]$ is independent of $\varepsilon$. If $T_{max} < +\infty$, then $\|\Psi^\varepsilon_{app}(t,\cdot)\|_{B^1} \to +\infty$ as $t \to T_{max}$.
(ii) For all $T \in (0,T_{max})$, we have
\[ \lim_{\varepsilon\to 0} \|\Psi^\varepsilon - \Psi^\varepsilon_{app}\|_{C^0([0,T],B^1)} = 0. \]

Comments on Theorem 1.3.

1. The cyclotron effective mass. Theorem 1.3 thus states that, on all time intervals where the limit system (1.25), (1.26), (1.27) is well-posed, the solution $\Psi^\varepsilon$ of the singularly perturbed system (1.1), (1.2), (1.3) is close to $\Psi^\varepsilon_{app}$. As expected, the dynamics in the $y$ direction, i.e. parallel to the magnetic field, is not affected by the magnetic field, since the operator is still $-\partial_y^2$. On the other hand, the situation is different in the $x$ direction, and the averaging of the cyclotron motion results in a multiplication of the operator $-\partial_x^2$ by the factor $\alpha_p$, which only depends on $V_c$ and $B$. The coefficient $\frac{1}{\alpha_p}$ plays in (1.25) the role of an effective mass in the direction perpendicular to the magnetic field. We find that the effective mass in the Schrödinger equation for the mode $p$ depends on the index $p$ of this mode. We do not know whether these coefficients are positive for a general $V_c$. Notice that the effective mass could be predicted heuristically by the following argument. Denoting by $k_x$, $k_y$ the wavevectors of the 2DEG in the plane $(x,y)$, the electron dispersion relation $E_p(k_x,k_y)$ in the transversal subbands can be written from (1.1) by computing the eigenvalues of the operator
\[ \frac{1}{\varepsilon^2}\Big( -\frac{d^2}{dz^2} + B^2z^2 + V_c(z) + 2\varepsilon Bzk_x + \varepsilon^2k_x^2 + \varepsilon^2k_y^2 \Big). \]
Since $\varepsilon$ is small, an approximation of $E_p(k_x,k_y)$ can be computed thanks to perturbation theory, which gives the following parabolic band approximation:
\[ E_p(k_x,k_y) = \frac{E_p}{\varepsilon^2} - k_x^2 \sum_{q\ne p} \frac{\langle 2Bz\chi_p\chi_q\rangle^2}{E_q - E_p} + k_x^2 + k_y^2 + o(1). \]


We can read off from this formula that the effective mass is 1 in the $y$ direction and is $\alpha_p^{-1}$, according to (1.24), in the $x$ direction. Note that the specific case of the harmonic potential is treated below (see Comment 3).

2. Conservation of the energy for the limit system. Let us write the conservation of the energy for the limit system. The total energy for this system can be split into a confinement energy $E_{conf}(\Phi)$ and a transport energy $E_{tr}(\Phi)$ defined by
\[ E_{conf}(\Phi) = \sum_{p\in\mathbb{N}} E_p\,\|\phi_p\|_{L^2}^2, \tag{1.28} \]
\[ E_{tr}(\Phi) = \sum_{p\in\mathbb{N}} \alpha_p\|\partial_x\phi_p\|_{L^2}^2 + \sum_{p\in\mathbb{N}} \|\partial_y\phi_p\|_{L^2}^2 + \frac{1}{2}\sum_{p,q} \int_{\mathbb{R}^4} \frac{1}{4\pi\sqrt{|x-x'|^2+|y-y'|^2}}\,|\phi_p(x,y)|^2\,|\phi_q(x',y')|^2\,dx\,dy\,dx'\,dy'. \tag{1.29} \]
An interesting property is that these two quantities are separately conserved by the limit system. If $\Psi^\varepsilon_{app}$ solves (1.25), (1.26), (1.27), then, for all $t \in [0,T]$, we have
\[ E_{conf}(\Psi^\varepsilon_{app}(t)) = E_{conf}(\Psi^\varepsilon_{app}(0)) \quad \text{and} \quad E_{tr}(\Psi^\varepsilon_{app}(t)) = E_{tr}(\Psi^\varepsilon_{app}(0)). \tag{1.30} \]

In particular, by summing up the two equalities in (1.30), we obtain the following conservation property:
\[ E_{conf}(\Psi^\varepsilon_{app}(t)) + E_{tr}(\Psi^\varepsilon_{app}(t)) = E_{conf}(\Psi^\varepsilon_{app}(0)) + E_{tr}(\Psi^\varepsilon_{app}(0)). \tag{1.31} \]
Note that, in the general case, we do not know whether the energy defined by (1.29) is the sum of nonnegative terms. This point is related to the fact that the well-posedness for $t \in [0,+\infty)$ of the Cauchy problem for the nonlinear system (1.25), (1.26) is an open issue. Nevertheless, when the $\alpha_p$ are such that the energy is coercive on $B^1$, i.e. when we have
\[ \forall \Phi \in B^1, \quad C_0\|\Phi\|_{B^1}^2 \le E_{conf}(\Phi) + E_{tr}(\Phi) \le C_1\|\Phi\|_{B^1}^2 + C_2\|\Phi\|_{B^1}^4, \tag{1.32} \]
with a constant $C_0 > 0$ independent of $\varepsilon$, then the maximal solution of (1.25), (1.26) is globally defined: $T_{max} = +\infty$.

Corollary 1.4 (Global in time convergence). Under the assumptions of Theorem 1.3, assume moreover that there exist $0 < \underline{\alpha} < \overline{\alpha}$ such that the coefficients $\alpha_p$ defined by (1.24) satisfy the following condition:
\[ \forall p \in \mathbb{N}, \quad \underline{\alpha} \le \alpha_p \le \overline{\alpha}. \tag{1.33} \]
Then the system (1.25), (1.26), (1.27) admits a unique global solution $\Psi^\varepsilon_{app} \in C^0([0,+\infty), B^1)$ and, for all $T > 0$, we have
\[ \lim_{\varepsilon\to 0} \|\Psi^\varepsilon - \Psi^\varepsilon_{app}\|_{C^0([0,T],B^1)} = 0, \]
where $\Psi^\varepsilon \in C^0([0,+\infty), B^1)$ denotes the solution of (1.1), (1.2), (1.3).



The proof of this corollary is immediate and will not be detailed in this paper. Indeed, remarking that (1.33) implies (1.32), we obtain that the solution $\Psi^\varepsilon_{app}(t)$ of (1.25), (1.26) satisfies the following uniform bound:
\[ \|\Psi^\varepsilon_{app}(t)\|_{B^1}^2 \le C\big( E_{conf}(\Psi^\varepsilon_{app}(t)) + E_{tr}(\Psi^\varepsilon_{app}(t)) \big) = C\big( E_{conf}(\Psi_0) + E_{tr}(\Psi_0) \big), \]
where the quantity on the right-hand side is finite as soon as $\Psi_0 \in B^1$.

3. Case of harmonic confinement. In the special case of a harmonic confinement potential $V_c(z) = a^2z^2$, the eigenvalues and eigenfunctions of $H_z = -\partial_z^2 + (a^2+B^2)z^2$ can be computed explicitly and one has
\[ E_p = (2p+1)\sqrt{a^2+B^2}, \qquad \chi_p(z) = (a^2+B^2)^{1/8}\,u_p\big((a^2+B^2)^{1/4}z\big), \]
where $(u_p)_{p\in\mathbb{N}}$ are the normalized Hermite functions defined e.g. in [24, Appendix B-8] and satisfying $-u_p'' + z^2u_p = (2p+1)u_p$. The properties of the Hermite functions give
\[ 2z\chi_p = \frac{\sqrt{2(p+1)}}{(a^2+B^2)^{1/4}}\,\chi_{p+1} + \frac{\sqrt{2p}}{(a^2+B^2)^{1/4}}\,\chi_{p-1}, \]
and one can compute explicitly the coefficients
\[ \alpha_p = 1 - B^2\,\frac{\langle 2z\chi_p\chi_{p+1}\rangle^2}{E_{p+1}-E_p} + B^2\,\frac{\langle 2z\chi_p\chi_{p-1}\rangle^2}{E_p-E_{p-1}} = \frac{a^2}{a^2+B^2}. \]
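The closed form above can be checked numerically. The following is a minimal sketch, not part of the original paper: the function name `harmonic_alpha` is hypothetical, and the matrix elements $\langle z\chi_p\chi_{p\pm1}\rangle$ are taken from the Hermite recursion quoted in the text. It evaluates the sum (1.24) for the harmonic potential and compares it with $a^2/(a^2+B^2)$.

```python
import math


def harmonic_alpha(p, a, B):
    """Evaluate the coefficient (1.24) for V_c(z) = a^2 z^2.

    Assumption (from the Hermite recursion in the text): <z chi_p chi_q>
    vanishes unless q = p + 1 or q = p - 1.
    """
    omega = math.sqrt(a * a + B * B)  # H_z = -d^2/dz^2 + omega^2 z^2

    def energy(k):
        # E_k = (2k + 1) * sqrt(a^2 + B^2)
        return (2 * k + 1) * omega

    # Nonzero matrix elements <z chi_p chi_q>, indexed by q.
    couplings = {p + 1: math.sqrt((p + 1) / (2 * omega))}
    if p >= 1:
        couplings[p - 1] = math.sqrt(p / (2 * omega))

    total = 0.0
    for q, z_pq in couplings.items():
        total += (2 * B * z_pq) ** 2 / (energy(q) - energy(p))
    return 1.0 - total


if __name__ == "__main__":
    a, B = 1.3, 0.7
    for p in range(5):
        print(p, harmonic_alpha(p, a, B), a * a / (a * a + B * B))
```

The two contributions $q = p+1$ and $q = p-1$ enter with opposite signs because $E_{p-1} - E_p < 0$, which is why the result does not depend on $p$.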

We thus recover here the coefficient found in Subsect. 1.3 in the simplified situation. Note that, in this case, condition (1.33) is satisfied and the convergence result holds on an arbitrary time interval. It is reasonable to conjecture that this condition (1.33) still holds when $V_c(z) = a^2z^2 + V_1(z)$, where $V_1$ is a small perturbation.

4. Towards a more realistic model. Since we aim at describing the transport of electrons, which are fermions, our model should not be restricted to a pure quantum state. The following model describes the transport of an electron gas in a mixed quantum state and is more realistic:
\[ i\partial_t\Psi^\varepsilon_j = \frac{1}{\varepsilon^2}\big( -\partial_z^2 + B^2z^2 + V_c(z) \big)\Psi^\varepsilon_j - \frac{2iB}{\varepsilon}\,z\partial_x\Psi^\varepsilon_j - \Delta_{x,y}\Psi^\varepsilon_j + V^\varepsilon\Psi^\varepsilon_j, \quad \forall j, \tag{1.34} \]
\[ \Psi^\varepsilon_j(0,x,y,z) = \Psi_{j,0}(x,y,z), \quad \forall j, \tag{1.35} \]
\[ V^\varepsilon(t,x,z) = \frac{1}{4\pi r^\varepsilon} * \rho^\varepsilon, \qquad \rho^\varepsilon = \sum_j \lambda_j|\Psi^\varepsilon_j|^2, \tag{1.36} \]
where $\lambda_j$, the occupation factor of the state $\Psi^\varepsilon_j$, takes into account the statistics of the electron ensemble and is fixed once and for all at the initial time. Note that the Schrödinger equations (1.34) are only coupled through the selfconsistent Poisson potential. Therefore, we claim that our main Theorem 1.3, which has been given for the sake of simplicity in the case of a pure quantum state, can be extended to this system (1.34), (1.35), (1.36), with appropriate assumptions on the initial data $(\Psi_{j,0})$. Similarly, a given smooth external potential could be incorporated in the initial system. We also claim that our result can be easily adapted if we add to the right-hand side of (1.1) a term of the form $V_{ext}(t,x,y,\varepsilon z)\Psi^\varepsilon$ (which is consistent with our scaling), and the result does not change qualitatively.



1.5. Scheme of the proof. In this section, we sketch the main steps of the proof of the main theorem.

Step 1: A priori estimates. The first task is to obtain a priori estimates, uniform in $\varepsilon$, for the solution of (1.1), (1.2), (1.3), which are of course crucial in the subsequent nonlinear analysis. Due to the presence of the singular $\frac{1}{\varepsilon^2}$ and $\frac{1}{\varepsilon}$ terms in (1.1), this task is not obvious here. In Subsect. 2.1, we introduce a well adapted functional framework: a Sobolev scale based on the operators $-\Delta_{x,y}$ and $H_z$. More precisely, for all $m \in \mathbb{N}$, we introduce the Hilbert space
\[ B^m = \Big\{ u : \ \|u\|_{B^m}^2 = \|u\|_{L^2(\mathbb{R}^3)}^2 + \|(-\Delta_{x,y})^{m/2}u\|_{L^2(\mathbb{R}^3)}^2 + \|H_z^{m/2}u\|_{L^2(\mathbb{R}^3)}^2 < +\infty \Big\}. \tag{1.37} \]
In Subsect. 2.1, we give some equivalent norms which are easier to handle here. Then in Subsect. 2.2 we take advantage of this functional framework and derive some a priori estimates for (1.1), (1.2), (1.3).

Step 2: The filtered system. In [3,6], the asymptotics of NLS equations of the form
\[ i\partial_t u^\varepsilon = \frac{1}{\varepsilon^2}H_zu^\varepsilon - \Delta_{x,y}u^\varepsilon + F(|u^\varepsilon|^2)u^\varepsilon, \tag{1.38} \]
such as the Gross-Pitaevskii equation, was analyzed. In (1.38), $F : \mathbb{R}_+ \to \mathbb{R}$ is a given function and the nonlinearity depends locally on the density $|u^\varepsilon|^2$. It appeared in [6] that a fruitful strategy is to filter out the oscillations in time induced by the term $\frac{1}{\varepsilon^2}H_z$, without projecting on the eigenmodes of $H_z$. Indeed, projecting (1.38) on the Hilbert basis $\chi_p$ leads to difficult problems of series summations and of small denominators in oscillating phases. Introducing the new unknown $v^\varepsilon(t,x,z) = \exp(itH_z/\varepsilon^2)\,u^\varepsilon(t,x,z)$, the filtered system associated to (1.38) reads
\[ i\partial_t v^\varepsilon = -\Delta_{x,y}v^\varepsilon + e^{itH_z/\varepsilon^2}\Big[ F\big(|e^{-itH_z/\varepsilon^2}v^\varepsilon|^2\big)\,e^{-itH_z/\varepsilon^2}v^\varepsilon \Big], \tag{1.39} \]
where we used the fact that $H_z$, thus $e^{itH_z}$, commutes with $\partial_x$ and $\partial_y$. Then, the analysis of the limit $\varepsilon \to 0$ amounts to proving that it is possible to define an average of the nonlinearity in (1.39) with respect to the fast variable $t/\varepsilon^2$. Let us adapt this strategy to our problem. Introduce

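\[ \Phi^\varepsilon(t,x,z) = \exp\big(itH_z/\varepsilon^2\big)\,\Psi^\varepsilon(t,x,z). \]
One deduces from (1.1), (1.2), (1.3) the following equation for $\Phi^\varepsilon$:
\[ i\partial_t\Phi^\varepsilon = -\frac{2B}{\varepsilon}\,e^{itH_z/\varepsilon^2}\,z\,e^{-itH_z/\varepsilon^2}(i\partial_x\Phi^\varepsilon) - \Delta_{x,y}\Phi^\varepsilon + F\Big(\frac{t}{\varepsilon^2}, \Phi^\varepsilon(t)\Big), \tag{1.40} \]
where we introduced the nonlinear function
\[ (\tau,u) \mapsto F(\tau,u) = e^{i\tau H_z}\Big[ \Big( \frac{1}{4\pi r^\varepsilon} * |e^{-i\tau H_z}u|^2 \Big)\,e^{-i\tau H_z}u \Big], \tag{1.41} \]
and where $r^\varepsilon$ is still defined by (1.4).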
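For the reader's convenience, here is a sketch of the computation behind the filtered equation. It assumes that (1.1) has the same structure as (1.34) and (2.1), namely a $\frac{1}{\varepsilon^2}H_z$ term, the magnetic term $-\frac{2iB}{\varepsilon}z\partial_x$, the Laplacian $-\Delta_{x,y}$ and the Poisson potential $V^\varepsilon$; the only facts used are that $H_z$ is time-independent and commutes with $\partial_x$ and $\Delta_{x,y}$.

```latex
% Filtering: \Phi^\varepsilon = e^{itH_z/\varepsilon^2}\Psi^\varepsilon,
% hence \Psi^\varepsilon = e^{-itH_z/\varepsilon^2}\Phi^\varepsilon.
\begin{align*}
i\partial_t \Phi^\varepsilon
 &= -\frac{1}{\varepsilon^2} H_z \Phi^\varepsilon
    + e^{itH_z/\varepsilon^2}\,\big(i\partial_t \Psi^\varepsilon\big) \\
 &= -\frac{1}{\varepsilon^2} H_z \Phi^\varepsilon
    + e^{itH_z/\varepsilon^2}\Big[\frac{1}{\varepsilon^2} H_z \Psi^\varepsilon
    - \frac{2iB}{\varepsilon}\, z\,\partial_x \Psi^\varepsilon
    - \Delta_{x,y}\Psi^\varepsilon + V^\varepsilon \Psi^\varepsilon\Big] \\
 &= -\frac{2B}{\varepsilon}\, e^{itH_z/\varepsilon^2}\, z\,
    e^{-itH_z/\varepsilon^2}\big(i\partial_x \Phi^\varepsilon\big)
    - \Delta_{x,y}\Phi^\varepsilon
    + e^{itH_z/\varepsilon^2}\Big[ V^\varepsilon\,
      e^{-itH_z/\varepsilon^2}\Phi^\varepsilon \Big].
\end{align*}
% The two 1/\varepsilon^2 terms cancel because e^{itH_z/\varepsilon^2}
% commutes with H_z. Writing V^\varepsilon = \frac{1}{4\pi r^\varepsilon} *
% |e^{-itH_z/\varepsilon^2}\Phi^\varepsilon|^2 identifies the last term
% with F(t/\varepsilon^2, \Phi^\varepsilon(t)).
```

The singular $\frac{1}{\varepsilon^2}$ term is thus removed at the price of a conjugated, rapidly oscillating operator in front of the $\frac{1}{\varepsilon}$ term.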



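Step 3: Approximation by an intermediate system. Before performing the limit $\varepsilon \to 0$ in (1.40), we remark that (1.41) can be approximated in order to get rid of the fast time variable $t/\varepsilon^2$ in the nonlinear term of (1.40). By writing formally
\[ \frac{1}{\sqrt{x^2+y^2+\varepsilon^2z^2}} = \frac{1}{\sqrt{x^2+y^2}} + o(1), \tag{1.42} \]
we remark that
\[ \frac{1}{r^\varepsilon} * |e^{-i\tau H_z}u|^2 = \frac{1}{\sqrt{x^2+y^2}} * \big\langle |e^{-i\tau H_z}u|^2 \big\rangle + o(1) = \frac{1}{\sqrt{x^2+y^2}} * \big\langle |u|^2 \big\rangle + o(1), \]
where the symbol $*$ denotes here a convolution in the $(x,y)$ variables only, and where we used the fact that $e^{i\tau H_z}$ is unitary on $L^2_z(\mathbb{R})$. Hence, inserting this Ansatz in (1.41) yields
\[ F(\tau,u) = e^{i\tau H_z}\Big[ \Big( \frac{1}{4\pi\sqrt{x^2+y^2}} * \big\langle |u|^2 \big\rangle \Big)\,e^{-i\tau H_z}u \Big] + o(1) = \Big( \frac{1}{4\pi\sqrt{x^2+y^2}} * \big\langle |u|^2 \big\rangle \Big)\,u + o(1). \]
Denoting
\[ F_0(u) = \Big( \frac{1}{4\pi\sqrt{x^2+y^2}} * \big\langle |u|^2 \big\rangle \Big)\,u, \tag{1.43} \]
and introducing the solution $\tilde\Phi^\varepsilon$ of the following intermediate system:
\[ i\partial_t\tilde\Phi^\varepsilon = -\frac{2B}{\varepsilon}\,e^{itH_z/\varepsilon^2}\,z\,e^{-itH_z/\varepsilon^2}(i\partial_x\tilde\Phi^\varepsilon) - \Delta_{x,y}\tilde\Phi^\varepsilon + F_0\big(\tilde\Phi^\varepsilon(t)\big), \tag{1.44} \]
we expect that the solution $\Phi^\varepsilon$ of (1.40) satisfies
\[ \Phi^\varepsilon = \tilde\Phi^\varepsilon + o(1). \tag{1.45} \]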
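The formal expansion (1.42) can be illustrated pointwise. The sketch below is an illustration only, with a hypothetical function name `kernel_gap`: for a fixed in-plane distance $\rho = \sqrt{x^2+y^2} > 0$ and a fixed $z$, it evaluates the difference between the two kernels and compares it with the elementary bound $(\varepsilon z)^2/(2\rho^3)$, which follows from $1 - (1+s)^{-1/2} \le s/2$ for $s \ge 0$.

```python
import math


def kernel_gap(rho, z, eps):
    """Pointwise gap 1/rho - 1/sqrt(rho^2 + eps^2 z^2), for rho > 0."""
    return 1.0 / rho - 1.0 / math.sqrt(rho * rho + (eps * z) ** 2)


if __name__ == "__main__":
    rho, z = 0.5, 2.0
    for eps in (1.0, 0.1, 0.01, 0.001):
        bound = (eps * z) ** 2 / (2 * rho ** 3)
        print(eps, kernel_gap(rho, z, eps), bound)
```

The bound degenerates as $\rho \to 0$, so this pointwise picture says nothing about uniformity near the origin; the rigorous statement in the $B^1$ norm is the one given in Lemma 2.7.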

Subsection 2.3 is devoted to the rigorous proof of this heuristic. We give sense to the $o(1)$ in Lemma 2.7 and we prove that the solutions of the two nonlinear equations (1.40) and (1.44) are close to each other and that (1.45) holds true in the sense of the $B^1$ norm. This statement is given in Proposition 2.1.

Step 4: Second order averaging of oscillating systems. Thanks to Step 3, we can consider the simpler system (1.44) instead of (1.40). We are now left with the analysis of the asymptotics of this intermediate system as $\varepsilon \to 0$. Note that (1.44) is of the general form
\[ i\partial_t u = \frac{1}{\varepsilon}\,f\Big(\frac{t}{\varepsilon^2}\Big)u(t) + g(u(t)) \tag{1.46} \]
with
\[ f(\tau) = -2B\,e^{i\tau H_z}\,z\,e^{-i\tau H_z}\,i\partial_x \quad \text{and} \quad g(u) = -\Delta_{x,y}u + F_0(u). \]


At this point, a critical fact has to be noticed. Equations of the form
\[ i\partial_t u = f\Big(\frac{t}{\varepsilon^2}\Big)u(t) + g(u(t)) \tag{1.47} \]
can be averaged when, due to some ergodicity property, one can give a sense to the time average
\[ f^0 = \lim_{T\to+\infty} \frac{1}{T}\int_0^T f(\tau)\,d\tau. \tag{1.48} \]
Indeed, under rather general assumptions, the techniques of averaging of dynamical systems (see the reference book on the topic by Sanders and Verhulst [27]) enable us to show that (1.47) is well approximated by the averaged equation $i\partial_t u = f^0u(t) + g(u(t))$. Yet, the oscillating term in (1.46), compared to the same term in (1.47), is multiplied by $\frac{1}{\varepsilon}$. Therefore, a necessary condition in order to perform the averaging of (1.46) is that the average $f^0$ of $f$ is zero. In our case, the integral kernel of the operator $e^{i\tau H_z}ze^{-i\tau H_z}$, defined by
\[ \forall u, \quad e^{i\tau H_z}\,z\,e^{-i\tau H_z}u = \int_{\mathbb{R}} G(\tau,z,z')\,u(z')\,dz', \]
is given by
\[ G(\tau,z,z') = \sum_{p\in\mathbb{N}}\sum_{q\in\mathbb{N}} e^{i\tau(E_p-E_q)}\,\langle z\chi_p\chi_q\rangle\,\chi_p(z)\chi_q(z') = \sum_{p\in\mathbb{N}}\sum_{q\ne p} e^{i\tau(E_p-E_q)}\,\langle z\chi_p\chi_q\rangle\,\chi_p(z)\chi_q(z'). \]
In the last equality, we used the fact that, by Assumption 1.1, $V_c$ is even. Indeed, this property implies that, for all $p$, $(\chi_p)^2$ is also even, thus $\langle z(\chi_p)^2\rangle = 0$. Consequently, since $p \ne q$ implies $E_p \ne E_q$, the kernel $G(\tau,z,z')$ is a series of functions which all have a vanishing average in time. We thus expect that the operator-valued function $f(\tau)$ has the same property:
\[ f^0 = \lim_{T\to+\infty} \frac{1}{T}\int_0^T f(\tau)\,d\tau = 0. \]

In such a situation, the theory of averaging has to be pushed to the second order [27] in order to obtain the limit of (1.46) as ε → 0. Section 3 is devoted to this question of second order averaging, which leads to the limit system (1.25), (1.26). The main result of Sect. 3 is Proposition 3.2. In the short last Sect. 4, we prove our main Theorem 1.3 by just gathering the results proved in the previous sections.
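The vanishing of the time average can be made concrete for a single term of the kernel $G$. The following sketch is illustrative only (the function name `time_average_phase` is hypothetical): it evaluates in closed form the average over $[0,T]$ of one oscillating phase $e^{i\omega\tau}$, where $\omega$ stands for a difference $E_p - E_q$ with $p \ne q$, and shows the decay like $2/(|\omega| T)$.

```python
import cmath


def time_average_phase(omega, T):
    """Return (1/T) * integral over [0, T] of exp(i * omega * tau) d tau."""
    if omega == 0.0:
        # Resonant case: the integrand is identically 1.
        return 1.0 + 0.0j
    return (cmath.exp(1j * omega * T) - 1.0) / (1j * omega * T)


if __name__ == "__main__":
    omega = 2.0  # stands for some nonzero gap E_p - E_q
    for T in (10.0, 1e3, 1e5):
        print(T, abs(time_average_phase(omega, T)), 2.0 / (omega * T))
```

Only the resonant case $\omega = 0$ survives the averaging, and the evenness of $V_c$ removes exactly those diagonal terms $p = q$ from $G$. Summing infinitely many such terms is of course a separate question, which is where Assumption 1.2 on the gaps enters in the paper.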



2. The Nonlinear Analysis

In this section, we obtain some a priori estimates, uniform in $\varepsilon$, for the initial system (1.1), (1.2), (1.3) and we prove that it can be approximated by an intermediate system, where we regularize the initial data and where we replace the Poisson nonlinearity by its formal limit given in (1.43). This intermediate system takes the form
\[ i\partial_t\tilde\Psi^\varepsilon = \frac{1}{\varepsilon^2}H_z\tilde\Psi^\varepsilon - \frac{2iB}{\varepsilon}\,z\partial_x\tilde\Psi^\varepsilon - \Delta_{x,y}\tilde\Psi^\varepsilon + W^\varepsilon\tilde\Psi^\varepsilon, \tag{2.1} \]
\[ \tilde\Psi^\varepsilon(0,x,y,z) = \tilde\Psi_0(x,y,z), \tag{2.2} \]
\[ W^\varepsilon(t,x,z) = \frac{1}{4\pi\sqrt{x^2+y^2}} * \big\langle |\tilde\Psi^\varepsilon|^2 \big\rangle. \tag{2.3} \]
Notice that (2.3) is nothing but the Poisson equation (1.3) where we replace $r^\varepsilon = \sqrt{x^2+y^2+\varepsilon^2z^2}$ by $r^0 = \sqrt{x^2+y^2}$. Moreover, the initial datum $\tilde\Psi_0$ in (2.2) will be chosen as a regularization in $B^m$ of the initial datum $\Psi_0$. Recall the definition (1.37) of the space $B^m$. The main result of this section is the following proposition.

Proposition 2.1 (Approximation of the initial system). Assume that $V_c$ satisfies Assumptions 1.1, 1.2 and that $\Psi_0 \in B^1$. For all $\varepsilon \in (0,1]$, denote by $\Psi^\varepsilon \in C^0(\mathbb{R}_+, B^1)$ the unique global solution of the initial system (1.1), (1.2), (1.3). Then the following holds true:
(i) There exists a maximal positive time such that $\Psi^\varepsilon$ is bounded uniformly in $\varepsilon$: the quantity
\[ T_0 := \sup\Big\{ T \ge 0 : \ \sup_{\varepsilon\in(0,1]} \|\Psi^\varepsilon\|_{C^0([0,T],B^1)} < +\infty \Big\} \tag{2.4} \]
satisfies $T_0 \in (0,+\infty]$. If $T_0 < +\infty$, then $\limsup_{\varepsilon\to 0} \|\Psi^\varepsilon\|_{C^0([0,T_0],B^1)} = +\infty$.
(ii) For all $T \in (0,T_0)$, where $T_0$ is defined by (2.4), for all $\delta > 0$ and for all integers $m \ge 2$, there exist $\tilde\Psi_0 \in B^m$ and $\varepsilon_\delta$ such that the following holds true. For all $\varepsilon \in (0,\varepsilon_\delta]$, the intermediate system (2.1), (2.2), (2.3) admits a unique solution $\tilde\Psi^\varepsilon \in C^0([0,T], B^m)$ satisfying the following uniform estimates: for all $\varepsilon \le \varepsilon_\delta$,
\[ \|\Psi^\varepsilon - \tilde\Psi^\varepsilon\|_{C^0([0,T],B^1)} \le \delta, \tag{2.5} \]
\[ \|\tilde\Psi^\varepsilon\|_{C^0([0,T],B^m)} \le C\big(\|\Psi_0\|_{B^1}\big)\,\|\tilde\Psi_0\|_{B^m}. \tag{2.6} \]

Remark 2.2. It is a priori not excluded that T0 < +∞. Indeed, although we are in a repulsive case, the energy conservation does not enable us to obtain ε-independent a priori estimates in B 1 (see the proof of Lemma 2.6). This may be linked to the possible formation of caustics, as for the nonlinear Schrödinger equation in semiclassical regime, see e.g. [11].



2.1. Preliminaries. As we explained in Subsect. 1.5, our nonlinear analysis will rely heavily on the use of the functional spaces $B^m$ defined by (1.37) and adapted to the operators $H_z$ and $-\Delta_{x,y}$. The following result was proved in [6] by using an appropriate Weyl-Hörmander pseudodifferential calculus, inspired by [9,22]:

Lemma 2.3 ([6]). Under Assumption 1.1, consider the Hilbert space $B^m$ defined by (1.37) for $m \in \mathbb{N}$. Then the norm $\|\cdot\|_{B^m}$ in (1.37) is equivalent to the following norm:
\[ \|u\|_{H^m(\mathbb{R}^3)} + \|V_c(z)^{m/2}u\|_{L^2(\mathbb{R}^3)}. \tag{2.7} \]
Moreover, for all $u \in B^{m+1}$, we have
\[ \|H_z^{1/2}u\|_{B^m} + \|\partial_x u\|_{B^m} + \|\partial_y u\|_{B^m} + \|\partial_z u\|_{B^m} + \|\sqrt{V_c}\,u\|_{B^m} \lesssim \|u\|_{B^{m+1}}. \tag{2.8} \]

The operator $\Delta_{x,y}$ commutes with the rapidly oscillating operator $e^{\pm itH_z/\varepsilon^2}$ and with the operator $iz\partial_x$. This will enable us to obtain uniform bounds for the solution of (1.1) by simply applying $\Delta_{x,y}$ to this equation. Unfortunately, the operator $H_z$ does not satisfy this property. For this reason, we introduce the following operator:
\[ H_\varepsilon = H_z - 2i\varepsilon Bz\partial_x - \varepsilon^2\partial_x^2 = -\partial_z^2 + V_c(z) + (i\varepsilon\partial_x - Bz)^2. \tag{2.9} \]

This operator enables us to define another norm equivalent to the $B^m$ norm. The following lemma is proved in Appendix A.

Lemma 2.4. The operator $H_\varepsilon$ defined by (2.9) on $L^2(\mathbb{R}^3)$ with domain $B^2$ is self-adjoint and nonnegative. There exists a constant $C_1 > 0$ such that, for all $\varepsilon \in (0,1]$ and for all $u \in B^1$, we have
\[ \frac{1}{C_1}\|u\|_{B^1}^2 \le \|u\|_{L^2(\mathbb{R}^3)}^2 + \|(-\Delta_{x,y})^{1/2}u\|_{L^2(\mathbb{R}^3)}^2 + \|H_\varepsilon^{1/2}u\|_{L^2(\mathbb{R}^3)}^2 \le C_1\|u\|_{B^1}^2. \tag{2.10} \]
Moreover, for all integers $m \ge 2$, there exists $\varepsilon_m \in (0,1]$ such that, for all $\varepsilon \in (0,\varepsilon_m]$, for all $u \in B^m$, we have
\[ \frac{1}{2}\|u\|_{B^m}^2 \le \|u\|_{L^2(\mathbb{R}^3)}^2 + \|(-\Delta_{x,y})^{m/2}u\|_{L^2(\mathbb{R}^3)}^2 + \|H_\varepsilon^{m/2}u\|_{L^2(\mathbb{R}^3)}^2 \le 2\|u\|_{B^m}^2. \tag{2.11} \]

2.2. A priori estimates. In this subsection, we obtain an a priori estimate, uniform in $\varepsilon$, for the initial Schrödinger-Poisson model (1.1), (1.3) and the intermediate model (2.1), (2.2), (2.3). Remark first that these two models can be considered in a unified way. For all $u \in B^1$ and for $\alpha \in \{0,1\}$, denote
\[ F_\alpha(u) = \Big( \frac{1}{4\pi\sqrt{x^2+y^2+\alpha\varepsilon^2z^2}} * |u|^2 \Big)\,u, \tag{2.12} \]
where the convolution acts on the three variables $(x,y,z) \in \mathbb{R}^3$. Remark that for $\alpha = 0$, this definition coincides with the definition (1.43). We shall consider for $\varepsilon \in (0,1]$ and $\alpha \in \{0,1\}$ the nonlinear equation
\[ i\partial_t u^\varepsilon = \frac{1}{\varepsilon^2}H_\varepsilon u^\varepsilon - \partial_y^2u^\varepsilon + F_\alpha(u^\varepsilon), \tag{2.13} \]
\[ u^\varepsilon(0,x,y,z) = u_0(x,y,z), \tag{2.14} \]



where the operator $H_\varepsilon$ was defined by (2.9). Note that for $u_0 = \Psi_0$ and $\alpha = 1$, (2.13), (2.14) is the initial system (1.1), (1.2), (1.3), and that for $u_0 = \tilde\Psi_0$ and $\alpha = 0$, (2.13), (2.14) is the intermediate system (2.1), (2.2), (2.3). Let us first state a technical lemma concerning the nonlinearities $F_1$ and $F_0$, which is proved in Appendix B.

Lemma 2.5. There exists a constant $C > 0$ such that, for all $\varepsilon \in (0,1]$, for $\alpha = 0$ or $1$, we have
\[ \forall u, v \in B^1, \quad \|F_\alpha(u) - F_\alpha(v)\|_{B^1} \le C\big( \|u\|_{B^1}^2 + \|v\|_{B^1}^2 \big)\|u-v\|_{B^1}, \tag{2.15} \]
where $F_\alpha$ is defined by (2.12). Moreover, for all $m \in \mathbb{N}^*$, there exists $C_m > 0$ such that we have the tame estimate
\[ \forall \varepsilon \in (0,1],\ \forall \alpha \in \{0,1\},\ \forall u \in B^m, \quad \|F_\alpha(u)\|_{B^m} \le C_m\|u\|_{B^1}^2\|u\|_{B^m}. \tag{2.16} \]

Now we are able to derive uniform a priori estimates for the solution of (2.13), (2.14).

Lemma 2.6. Let $\varepsilon \in (0,1]$, $\alpha \in \{0,1\}$ and $u_0 \in B^1$. Then the solution $u^\varepsilon$ of Eq. (2.13), (2.14) exists and is unique in $C^0([0,+\infty), B^1)$, and the following estimates, uniform in $\varepsilon$, hold true:
(i) For all $M > 0$, there exists $T > 0$, only depending on $M$ and $\|u_0\|_{B^1}$, such that, for all $\varepsilon \in (0,1]$, we have
\[ \|u^\varepsilon\|_{C^0([0,T],B^1)} \le (1+M)\|u_0\|_{B^1}. \tag{2.17} \]
(ii) Let $m \ge 2$ be an integer and assume that $u_0 \in B^m$. Then, for all $T > 0$, we have the estimate
\[ \forall \varepsilon \in (0,\varepsilon_m], \quad \|u^\varepsilon\|_{C^0([0,T],B^m)} \le C\|u_0\|_{B^m}\exp\big( CT\,\|u^\varepsilon\|_{C^0([0,T],B^1)}^2 \big), \tag{2.18} \]
where $\varepsilon_m > 0$ is as in Lemma 2.4.

Proof. Step 1. The Cauchy problem and the conservation laws. For any given $\varepsilon > 0$, the existence and uniqueness of a maximal solution $u^\varepsilon \in C^0([0,T), B^1)$ can be obtained by standard techniques [13]. We leave this first part of the proof to the reader. This solution satisfies both the $L^2$ and energy conservation laws:
\[ \forall t \ge 0, \quad \|u^\varepsilon(t)\|_{L^2} = \|u_0\|_{L^2} \quad \text{and} \quad E_\alpha(u^\varepsilon(t)) = E_\alpha(u_0), \tag{2.19} \]
where the energy $E_\alpha$ is defined by
\[ E_\alpha(u) = \frac{1}{\varepsilon^2}(H_\varepsilon u, u)_{L^2} + \|\partial_y u\|_{L^2}^2 + \frac{1}{2}(F_\alpha(u), u)_{L^2} = \frac{1}{\varepsilon^2}\|\partial_z u\|_{L^2}^2 + \frac{1}{\varepsilon^2}\|\sqrt{V_c}\,u\|_{L^2}^2 + \frac{1}{\varepsilon^2}\|(\varepsilon\partial_x + iBz)u\|_{L^2}^2 + \|\partial_y u\|_{L^2}^2 + \frac{1}{2}(F_\alpha(u), u)_{L^2}. \]
We recall that the operator $H_\varepsilon$ is defined by (2.9). These conservation laws show that the solution $u^\varepsilon$ is global, i.e. that $T = +\infty$. Unfortunately, due to the $\frac{1}{\varepsilon^2}$ terms in this expression, one cannot use the energy conservation to get estimates that are uniform in $\varepsilon$. Instead, we will directly write the equations satisfied by $\partial_x u^\varepsilon$, $\partial_y u^\varepsilon$ or $(H_\varepsilon)^{1/2}u^\varepsilon$ and use the standard $L^2$-estimates for these equations and the fact that the self-adjoint operators $H_\varepsilon$, $\partial_x$ and $\partial_y$ commute.



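Step 2. $B^1$ estimate. This yields
\[ i\partial_t(\nabla_{x,y}u^\varepsilon)(t) = \frac{1}{\varepsilon^2}H_\varepsilon(\nabla_{x,y}u^\varepsilon) - \partial_y^2(\nabla_{x,y}u^\varepsilon) + \nabla_{x,y}\big(F_\alpha(u^\varepsilon)\big) \]
and
\[ i\partial_t\big(H_\varepsilon^{1/2}u^\varepsilon\big)(t) = \frac{1}{\varepsilon^2}H_\varepsilon\big(H_\varepsilon^{1/2}u^\varepsilon\big) - \partial_y^2\big(H_\varepsilon^{1/2}u^\varepsilon\big) + H_\varepsilon^{1/2}\big(F_\alpha(u^\varepsilon)\big). \]
Hence,
\[ \|u^\varepsilon(t)\|_{L^2} + \|\nabla_{x,y}u^\varepsilon(t)\|_{L^2} + \|H_\varepsilon^{1/2}u^\varepsilon(t)\|_{L^2} \le \|u_0\|_{L^2} + \|\nabla_{x,y}u_0\|_{L^2} + \|H_\varepsilon^{1/2}u_0\|_{L^2} + C\int_0^t \Big( \|\nabla_{x,y}F_\alpha(u^\varepsilon(s))\|_{L^2} + \|H_\varepsilon^{1/2}F_\alpha(u^\varepsilon(s))\|_{L^2} \Big)\,ds \]
and, for $\varepsilon \in (0,1]$, the equivalence of norms given in Lemma 2.4 yields
\[ \|u^\varepsilon(t)\|_{B^1} \le C\|u_0\|_{B^1} + C\int_0^t \|F_\alpha(u^\varepsilon(s))\|_{B^1}\,ds \le C\|u_0\|_{B^1} + C\int_0^t \|u^\varepsilon(s)\|_{B^1}^3\,ds, \tag{2.20} \]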

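where we used (2.15) with $v = 0$ to estimate $F_\alpha(u^\varepsilon(s))$. Hence, by applying the Gronwall lemma to the integral inequality (2.20), we prove Item (i) of the lemma.

Step 3. $B^m$ estimate. Let $T > 0$, $m \ge 2$, $u_0 \in B^m$ and let $\varepsilon \in (0,\varepsilon_m]$, where $0 < \varepsilon_m \le 1$ is as in Lemma 2.4. Since the operators $H_\varepsilon$ and $\Delta_{x,y}$ commute, $H_\varepsilon^{m/2}u^\varepsilon$ satisfies the following equation:
\[ i\partial_t\big(H_\varepsilon^{m/2}u^\varepsilon\big)(t) = \frac{1}{\varepsilon^2}H_\varepsilon\big(H_\varepsilon^{m/2}u^\varepsilon\big) - \partial_y^2\big(H_\varepsilon^{m/2}u^\varepsilon\big) + H_\varepsilon^{m/2}\big(F_\alpha(u^\varepsilon)\big), \]
thus, for all $t \in [0,T]$,
\[ \|H_\varepsilon^{m/2}u^\varepsilon(t)\|_{L^2} \le \|H_\varepsilon^{m/2}u_0\|_{L^2} + \int_0^t \|H_\varepsilon^{m/2}F_\alpha(u^\varepsilon(s))\|_{L^2}\,ds \le C\|u_0\|_{B^m} + C\int_0^t \|F_\alpha(u^\varepsilon(s))\|_{B^m}\,ds \le C\|u_0\|_{B^m} + C\|u^\varepsilon\|_{C^0([0,T],B^1)}^2\int_0^t \|u^\varepsilon(s)\|_{B^m}\,ds, \tag{2.21} \]
where we used Lemma 2.4 and the tame estimate (2.16). Similarly, $-\Delta_{x,y}u^\varepsilon$ satisfies the following equation:
\[ i\partial_t(-\Delta_{x,y}u^\varepsilon)(t) = \frac{1}{\varepsilon^2}H_\varepsilon(-\Delta_{x,y}u^\varepsilon) - \partial_y^2(-\Delta_{x,y}u^\varepsilon) - \Delta_{x,y}F_\alpha(u^\varepsilon), \]
and using the definition (1.37) of $B^m$ and (2.16) yields
\[ \|(-\Delta_{x,y})^{m/2}u^\varepsilon(t)\|_{L^2} \le \|(-\Delta_{x,y})^{m/2}u_0\|_{L^2} + \int_0^t \|(-\Delta_{x,y})^{m/2}F_\alpha(u^\varepsilon(s))\|_{L^2}\,ds \le C\|u_0\|_{B^m} + C\|u^\varepsilon\|_{C^0([0,T],B^1)}^2\int_0^t \|u^\varepsilon(s)\|_{B^m}\,ds. \tag{2.22} \]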



Therefore, by using again the equivalence of norms given by Lemma 2.4 and the $L^2$ conservation law in (2.19), we deduce from (2.21) and (2.22) that, for $t \le T$, we have
\[ \|u^\varepsilon(t)\|_{B^m} \le C\|u_0\|_{B^m} + C\|u^\varepsilon\|_{C^0([0,T],B^1)}^2\int_0^t \|u^\varepsilon(s)\|_{B^m}\,ds, \]
and the Gronwall lemma gives (2.18).
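For completeness, the form of the Gronwall lemma used in this last step can be sketched as follows. The shorthand $y$, $A$, $K$, $Y$ is introduced here for illustration only and does not appear in the original text.

```latex
% Linear Gronwall lemma, with the (assumed) shorthand
%   y(t) = \|u^\varepsilon(t)\|_{B^m},  A = C\|u_0\|_{B^m},
%   K = C\|u^\varepsilon\|^2_{C^0([0,T],B^1)}.
\begin{align*}
&\text{Hypothesis:} && y(t) \le A + K\int_0^t y(s)\,ds
   \quad \text{for all } t\in[0,T]. \\
&\text{Set} && Y(t) := A + K\int_0^t y(s)\,ds,
   \quad\text{so that } Y'(t) = K\,y(t) \le K\,Y(t). \\
&\text{Then} && \frac{d}{dt}\Big(e^{-Kt}\,Y(t)\Big) \le 0
   \;\Longrightarrow\; Y(t) \le A\,e^{Kt}. \\
&\text{Conclusion:} && y(t) \le Y(t) \le A\,e^{Kt} \le A\,e^{KT}
   \quad \text{for all } t\in[0,T].
\end{align*}
```

With this shorthand, the conclusion reads $\|u^\varepsilon(t)\|_{B^m} \le C\|u_0\|_{B^m}\exp\big(CT\|u^\varepsilon\|^2_{C^0([0,T],B^1)}\big)$, which has the form of (2.18).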

2.3. Proof of Proposition 2.1. In this subsection, we prove Proposition 2.1, i.e. we show that the solution of the initial system can be uniformly approximated by a regular solution of the intermediate system. We first state a technical lemma on the Poisson kernels, which is proved in Appendix 4.

Lemma 2.7. There exists a constant $C > 0$ such that, for all $\varepsilon \in (0,1]$, we have
\[ \forall u \in B^2, \quad \|F_1(u) - F_0(u)\|_{B^1} \le C\varepsilon^{1/3}\|u\|_{B^2}^3, \tag{2.23} \]
where $F_0$ and $F_1$ are defined by (2.12).

We are now ready to prove the main result of this section.

Proof of Proposition 2.1. Let $\Psi_0 \in B^1$, let an integer $m \ge 2$ be fixed, and define the regularized initial datum $\tilde\Psi_0$ by
\[ \tilde\Psi_0 = (I - \eta\Delta_{x,y})^{-m/2}(I + \eta H_z)^{-m/2}\,\Psi_0, \tag{2.24} \]
where $\eta > 0$ is a small parameter that will be fixed later and where $I$ denotes the identity operator. Denote by $\Psi^\varepsilon$ the solution of the initial system (1.1), (1.2), (1.3) and by $\tilde\Psi^\varepsilon$ the solution of the intermediate system (2.1), (2.2), (2.3) with the initial datum (2.24). We shall estimate the difference $\Psi^\varepsilon - \tilde\Psi^\varepsilon$.

Step 1. Uniform bounds for $\Psi^\varepsilon$. Let $0 < \varepsilon \le 1$. From Lemma 2.6 (i), we first deduce that there exists $T_1 > 0$ only depending on $\|\Psi_0\|_{B^1}$ such that, for all $\varepsilon \in (0,1]$, $\|\Psi^\varepsilon\|_{C^0([0,T_1],B^1)} \le 2\|\Psi_0\|_{B^1}$. This implies that $T_0$ defined by (2.4) satisfies $T_0 \ge T_1 > 0$. Clearly, if $T_0 < +\infty$, we have
\[ \limsup_{\varepsilon\to 0} \|\Psi^\varepsilon\|_{C^0([0,T_0],B^1)} = +\infty, \]
otherwise by reiterating the above procedure we could find a uniform bound on $[0,T_2]$ with $T_2 > T_0$. Now we fix $T \in (0,T_0)$ and $\delta > 0$ for the rest of this proof. Definition (2.4) of $T_0$ implies that
\[ \|\Psi^\varepsilon\|_{C^0([0,T],B^1)} \le C\,\|\Psi_0\|_{B^1}, \quad \text{independent of } \varepsilon \in (0,1]. \tag{2.25} \]

Step 2. Bounds for the initial datum $\tilde\Psi_0$. First, we deduce from (2.24) that
\[ (I - \Delta_{x,y} + H_z)^{1/2}\,\tilde\Psi_0 = (I - \eta\Delta_{x,y})^{-m/2}(I + \eta H_z)^{-m/2}(I - \Delta_{x,y} + H_z)^{1/2}\,\Psi_0, \]


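hence
\[ \|(I - \Delta_{x,y} + H_z)^{1/2}\,\tilde\Psi_0\|_{L^2} \le \|(I - \eta\Delta_{x,y})^{-m/2}(I + \eta H_z)^{-m/2}(I - \Delta_{x,y} + H_z)^{1/2}\,\Psi_0\|_{L^2} \le \|(I - \Delta_{x,y} + H_z)^{1/2}\,\Psi_0\|_{L^2}, \]
where we used the fact that the operators $(I - \eta\Delta_{x,y})^{-m/2}$ and $(I + \eta H_z)^{-m/2}$ are bounded on $L^2$, with bounds equal to 1. Therefore, using (1.22), we obtain
\[ \|\tilde\Psi_0\|_{B^1} \le \|\Psi_0\|_{B^1}, \tag{2.26} \]
where we recall that the right-hand side is independent of $\varepsilon$. Next, we get from (2.24) the two following identities: for all integers $\ell \le m$,
\[ (-\Delta_{x,y})^{\ell/2+1/2}\,\tilde\Psi_0 = (-\Delta_{x,y})^{\ell/2}(I - \eta\Delta_{x,y})^{-\ell/2} \times (I - \eta\Delta_{x,y})^{\ell/2-m/2}(I + \eta H_z)^{-m/2}(-\Delta_{x,y})^{1/2}\,\Psi_0, \]
and
\[ H_z^{\ell/2+1/2}\,\tilde\Psi_0 = H_z^{\ell/2}(I + \eta H_z)^{-\ell/2}\,(I + \eta H_z)^{\ell/2-m/2}(I - \eta\Delta_{x,y})^{-m/2}\,H_z^{1/2}\,\Psi_0. \]
Thus, from the bound
\[ \forall \lambda \in \mathbb{R}_+, \quad \lambda^{\ell/2}(1 + \eta\lambda)^{-\ell/2} \le C\eta^{-\ell/2}, \]
we deduce that both operators $(-\Delta_{x,y})^{\ell/2}(I - \eta\Delta_{x,y})^{-\ell/2}$ and $H_z^{\ell/2}(I + \eta H_z)^{-\ell/2}$ are bounded on $L^2$, with bounds equal to $C\eta^{-\ell/2}$, and thus
\[ \forall \ell \le m, \quad \|\tilde\Psi_0\|_{B^{\ell+1}} \le C\eta^{-\ell/2}\,\|\Psi_0\|_{B^1}, \tag{2.27} \]
where we recall the definition (1.37) of the $B^m$ norms. Finally, we also obtain from (2.24) that
\[ (I - \Delta_{x,y} + H_z)^{1/2}(\Psi_0 - \tilde\Psi_0) = \Big[ I - (I - \eta\Delta_{x,y})^{-m/2}(I + \eta H_z)^{-m/2} \Big](I - \Delta_{x,y} + H_z)^{1/2}\,\Psi_0. \]
Decompose $v = (I - \Delta_{x,y} + H_z)^{1/2}\Psi_0$ on the Hilbert basis $(\chi_p)_{p\in\mathbb{N}}$ of eigenmodes of $H_z$:
\[ v(x,y,z) = \sum_{p\in\mathbb{N}} v_p(x,y)\,\chi_p(z), \]
and denote by $\hat v_p(\xi)$, $\xi \in \mathbb{R}^2$, the Fourier transform of $v_p(x,y)$. By (1.22), we have
\[ \|\Psi_0 - \tilde\Psi_0\|_{B^1}^2 = \sum_{p\in\mathbb{N}} \int_{\mathbb{R}^2} \Big( 1 - (1 + \eta|\xi|^2)^{-m/2}(1 + \eta E_p)^{-m/2} \Big)^2\,|\hat v_p(\xi)|^2\,d\xi. \]
Hence, using that
\[ \sum_{p\in\mathbb{N}} \int_{\mathbb{R}^2} |\hat v_p(\xi)|^2\,d\xi = \|\Psi_0\|_{B^1}^2 < +\infty \tag{2.28} \]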



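and that
\[ \forall \xi \in \mathbb{R}^2,\ \forall p \in \mathbb{N}, \quad \lim_{\eta\to 0} \Big[ 1 - (1 + \eta|\xi|^2)^{-m/2}(1 + \eta E_p)^{-m/2} \Big] = 0, \]
we deduce from Lebesgue's dominated convergence theorem and from the convergence of the series in (2.28) that
\[ \lim_{\eta\to 0} \|\Psi_0 - \tilde\Psi_0\|_{B^1} = 0. \tag{2.29} \]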

$ε . Consider Step 3. Uniform a priori estimates for $ε C 0 ([0,T ],B 1 ) ≤ 2 ε C 0 ([0,T ],B 1 ) }. (2.30) Tη := sup{τ ∈ (0, T ] : ∀ε ∈ (0, 1], η Note that, from (2.26) and Lemma 2.6 (i), we know that Tη ∈ (0, T ] is well-defined. Then, from Lemma 2.6 (ii), we deduce the following estimate: $0 B +1 $ε C 0 ([0,T ],B +1 ) ≤ C $ε C 0 ([0,T ],B 1 ) ∀ε ∈ (0, εm ], ∀ ≤ m, η η ε $0 B +1 ≤ C C 0 ([0,T ],B 1 ) $0 B +1 , ≤ C 0 B 1 (2.31) where we used (2.30) and (2.25).

ε . Using the notations defined in (2.9) and Step 4. Estimate of the difference ε − $ε satisfy (2.13),(2.14) with α = 1, u 0 = 0 and α = 0, u 0 = $0 (2.12), ε and respectively. The Duhamel formulation of these equations read respectively t 2 2 ε (t) = e−it (Hε −∂y ) 0 + e−i(t−s)(Hε −∂y ) F1 ( ε (s)) ds, 0

$0 + $ε (t) = e−it (Hε −∂y ) 2

t

$ε (s)) ds. e−i(t−s)(Hε −∂y ) F0 ( 2

0

Hence, for all t ∈ [0, Tη ] and ε ∈ (0, εm ], t $ε (t) B 1 ≤ 0 − $0 B 1 + $ε (s)) B 1 ds ε (t) − F1 ( ε (s)) − F1 ( 0 t $ε (s)) − F0 ( $ε (s)) B 1 ds + F1 ( 0

$0 B 1 + C ≤ 0 −

t 0

$ε (s) B 1 ds + C ε1/3 η−3/2 , ε (s) −

where we used (2.15), (2.25), (2.30), (2.23) and (2.31) with = 1, coupled to (2.27). Here C denotes a generic constant depending only on T and 0 B 1 . Hence, by the Gronwall Lemma, we get, for all t ∈ [0, Tη ], $ε (t) B 1 ≤ 0 − $0 B 1 + C ε1/3 η−3/2 eC T . ε (t) − (2.32)



Now, according to (2.29), we fix η such that δ 1 ε CT $ , C 0 ([0,T ],B 1 ) 0 − 0 B 1 e ≤ min 2 3 and, in a second step, we fix εδ ∈ (0, εm ] such that δ 1 ε 1/3 −3/2 C T , C 0 ([0,T ],B 1 ) . e ≤ min C εδ η 2 3 From (2.32), we deduce that 2 $ε (t) B 1 ≤ min δ, ε C 0 ([0,T ],B 1 ) . ∀t ∈ [0, Tη ], ∀ε ∈ (0, εδ ], ε (t) − 3 (2.33) Therefore, we have $ε C 0 ([0,T ],B 1 ) ≤ ε C 0 ([0,T ],B 1 ) + ε − $ε C 0 ([0,T ],B 1 ) η η η ≤

5 ε C 0 ([0,T ],B 1 ) . 3

(2.34)

We claim that $T_\eta = T$. Indeed, if $T_\eta < T$, then applying again Lemma 2.6 at $T_\eta$ and using (2.34) enables us to find $\tau > 0$ such that, for all $\varepsilon \in (0,1)$,
\[ \|\tilde\Psi^\varepsilon\|_{C^0([T_\eta,T_\eta+\tau],B^1)} \le 2\,\|\Psi^\varepsilon\|_{C^0([0,T],B^1)}, \]
which, together with (2.34), contradicts the definition (2.30) of $T_\eta$. Finally, (2.33) gives (2.5) and (2.31) with $\ell = m-1$ gives (2.6). The proof of Proposition 2.1 is complete.

3. Second Order Averaging

In this section, we focus on the intermediate system (2.1), (2.2), (2.3) as $\varepsilon$ goes to zero. As we explained in Subsect. 1.5, it is interesting to consider the filtered version of this equation. Let $\tilde\Psi_0 \in B^m$ be a given initial datum, let $\tilde\Psi^\varepsilon$ be the corresponding solution of (2.1), (2.2), (2.3) and set
\[ \tilde\Phi^\varepsilon(t,\cdot) = \exp\big(itH_z/\varepsilon^2\big)\,\tilde\Psi^\varepsilon(t,\cdot). \tag{3.1} \]
This function satisfies the system
\[ i\partial_t\tilde\Phi^\varepsilon = -\frac{2B}{\varepsilon}\,e^{itH_z/\varepsilon^2}\,z\,e^{-itH_z/\varepsilon^2}(i\partial_x\tilde\Phi^\varepsilon) - \Delta_{x,y}\tilde\Phi^\varepsilon + F_0\big(\tilde\Phi^\varepsilon(t)\big), \qquad \tilde\Phi^\varepsilon(t=0) = \tilde\Psi_0, \tag{3.2} \]
where $F_0$ is defined by (1.43). The advantage of this intermediate system, compared to (1.40), is that the nonlinearity $F_0(\tilde\Phi^\varepsilon)$ has no dependence on the fast variable $\frac{t}{\varepsilon^2}$.

We will analyze the filtered system (3.2) in the framework of second order averaging of fast oscillating ODEs of the form (1.46) (see [27]), which we adapt here to our context of nonlinear PDEs. Recall that $(E_p)_{p\in\mathbb{N}}$, $(\chi_p)_{p\in\mathbb{N}}$ are the complete families

850


of eigenvalues and eigenfunctions of the operator Hz and denote by p the spectral projector on χ p : ∀ ∈ L 2 (R3 ),

p = χ p χ p .

Introduce now the following unbounded operator on L 2 (R3 ): A0 =

−∂x2

αp p

2Bzχ p χq 2 with α p = 1 − . Eq − E p

(3.3)

q= p

p≥0

With this notation, the limit system (1.25), (1.26), (1.27) can be rewritten in a more compact form as i∂t = A0 − ∂y2 + F0 ( ),

(t = 0) = 0 .

(3.4)

We state the main results of this section in the following two propositions.

Proposition 3.1. Assume that $V_c$ satisfies Assumptions 1.1 and 1.2. Then the following properties hold true.

(i) The unbounded operator $A_0$ defined by (3.3) on $L^2(\mathbb R^3)$ with the domain
\[
D(A_0)=\Big\{\Phi\in L^2(\mathbb R^3):\ \partial_x^2\sum_{p\ge0}\alpha_p\Pi_p\Phi\in L^2(\mathbb R^3)\Big\}
\]
is selfadjoint. Moreover, the operator $A_0$ satisfies
\[
\forall\ell\ge0,\ \forall u\in B^{2n_0+4+\ell},\qquad\|A_0u\|_{B^\ell}\le C\|u\|_{B^{2n_0+4+\ell}}, \tag{3.5}
\]
where $n_0$ is as in Assumption 1.2.

(ii) Let $\Psi_0\in B^1$. The limit system (3.4) admits a unique maximal solution $\Phi\in C^0([0,T_{max}),B^1)$. If $T_{max}<+\infty$ then $\|\Phi(t)\|_{B^1}\to+\infty$ as $t\to T_{max}$.

Proposition 3.2 (Averaging of the intermediate system). Assume that $V_c$ satisfies Assumptions 1.1 and 1.2. Then there exists an integer $m\ge2$ such that the following holds true. For $\widetilde\Psi_0\in B^m$, we consider the solution $\widetilde\Phi^\varepsilon\in C^0([0,+\infty),B^m)$ of (3.2) and the maximal solution $\widetilde\Phi\in C^0([0,T_{max}),B^1)$ of the limit system with $\widetilde\Psi_0$ as initial data:
\[
\widetilde\Phi(t)=e^{-it(A_0-\partial_y^2)}\widetilde\Psi_0-i\int_0^te^{-i(t-s)(A_0-\partial_y^2)}F_0(\widetilde\Phi(s))\,ds. \tag{3.6}
\]
We assume that there exist $T\in(0,T_{max})$, $\varepsilon_0>0$ such that
\[
M:=\sup_{\varepsilon\in(0,\varepsilon_0]}\|\widetilde\Phi^\varepsilon\|_{C^0([0,T],B^m)}<+\infty. \tag{3.7}
\]
Then we have
\[
\|\widetilde\Phi^\varepsilon-\widetilde\Phi\|_{C^0([0,T],B^1)}\le\varepsilon\,C_M, \tag{3.8}
\]
where $C_M$ is independent of $\varepsilon$.



3.1. Well-posedness of the limit system. In this section, we prove Proposition 3.1.

Step 1. Basic properties of the operator $A_0$. First, from $V_c(z)\ge a^2z^2$, we deduce that the $p$th eigenvalue of $H_z$ is larger than the $p$th eigenvalue of the harmonic oscillator $-\frac{d^2}{dz^2}+(a^2+B^2)z^2$:
\[
\forall p\in\mathbb N,\qquad E_p\ge\sqrt{a^2+B^2}\,(2p+1). \tag{3.9}
\]
From Assumption 1.2, we deduce that the coefficients $\alpha_p$ in (3.3) satisfy
\[
|\alpha_p|\le1+C(1+p)^{n_0}\sum_{q\ge0}\langle2Bz\chi_p\chi_q\rangle^2=1+C(1+p)^{n_0}\|2Bz\chi_p\|_{L^2}^2\le C\,E_p^{n_0+1},
\]
where we used (3.9) and that $\|Bz\chi_p\|_{L^2}\le E_p^{1/2}$. Now, consider a nonnegative integer $\ell$ and $u$ in $B^{2n_0+4+\ell}$. Let $n_0$ be defined as in Assumption 1.2, and decompose $u$ over the $\chi_p$ family, which is orthogonal in $L^2$:
\[
\|A_0u\|_{B^\ell}^2=\sum_{p\ge0}\alpha_p^2\|\partial_x^2\Pi_pu\|_{B^\ell}^2
\le C\sum_{p\ge0}E_p^{2n_0+2}\|\Pi_pu\|_{B^{\ell+2}}^2
\le C\sum_{p\ge0}\|H_z^{n_0+1}\Pi_pu\|_{B^{\ell+2}}^2
\le C\sum_{p\ge0}\|\Pi_pu\|_{B^{2n_0+4+\ell}}^2=C\|u\|_{B^{2n_0+4+\ell}}^2,
\]
where we used Lemma 2.3. This proves (3.5). Furthermore, by passing to the limit as $N\to+\infty$ in the identity
\[
\forall\Phi,\Psi\in D(A_0),\qquad\sum_{p=0}^N\alpha_p(\partial_x^2\Pi_p\Phi,\Pi_p\Psi)_{L^2}=\sum_{p=0}^N\alpha_p(\Pi_p\Phi,\partial_x^2\Pi_p\Psi)_{L^2},
\]
we obtain that the operator $A_0$ is symmetric. Moreover, the equation $A_0\Phi+i\Phi=f$ admits a solution $\Phi\in D(A_0)$ for all $f\in L^2(\mathbb R^3)$. Indeed, the projection of this equation on $\chi_p$ reads
\[
-\alpha_p\partial_x^2\phi_p+i\phi_p=f_p,
\]
and this elliptic equation can obviously be solved for all $f_p\in L^2(\mathbb R^2)$. Therefore, by the standard criterion for selfadjointness [26], the operator $A_0$ is selfadjoint. We have proved the first part of Proposition 3.1.

Step 2. Well-posedness and stability of the limit system. The operator $A_0$ being selfadjoint, the Stone theorem can be applied and the operator $-iA_0$ generates a unitary group of continuous operators $e^{-iA_0t}$ on $L^2$ and also on $B^1$. The Duhamel formulation of (3.4) reads
\[
\Phi(t)=e^{-it(A_0-\partial_y^2)}\Psi_0-i\int_0^te^{-i(t-s)(A_0-\partial_y^2)}F_0(\Phi(s))\,ds \tag{3.10}
\]
(recall that $A_0$ and $\partial_y^2$ commute). Since, by (2.15), the map $F_0$ is locally Lipschitz continuous on $B^1$, it is easy to prove by a standard fixed point technique that (3.10) admits a unique maximal solution $\Phi\in C^0([0,T_{max}),B^1)$. The details are left to the reader. Note that, if $T_{max}<+\infty$, then $\|\Phi(t)\|_{B^1}\to+\infty$ as $t\to T_{max}$. Item (ii) of Proposition 3.1 is proved.

Remark 3.3. In fact, this strategy of proof by a fixed point mapping leads to a stability result. For all $\eta>0$ and for all $T\in(0,T_{max})$, there exists $\delta_{\eta,T}>0$ such that the following holds true. For all $\widetilde\Psi_0$ satisfying
\[
\|\Psi_0-\widetilde\Psi_0\|_{B^1}\le\delta_{\eta,T},
\]
Eq. (3.6),
\[
\widetilde\Phi(t)=e^{-it(A_0-\partial_y^2)}\widetilde\Psi_0-i\int_0^te^{-i(t-s)(A_0-\partial_y^2)}F_0(\widetilde\Phi(s))\,ds,
\]
admits a unique solution $\widetilde\Phi\in C^0([0,T],B^1)$ and we have
\[
\sup_{t\in[0,T]}\|\Phi(t)-\widetilde\Phi(t)\|_{B^1}\le\eta. \tag{3.11}
\]

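3.2. Proof of Proposition 3.2. This subsection is devoted to the proof of Proposition 3.2, which relies on a reformulation of the Duhamel formula for (3.2).

Step 1. Reformulation of the Duhamel formula. Introduce the following family of unbounded self-adjoint operators on $L^2(\mathbb R^3)$:
\[
\forall\tau\in\mathbb R,\qquad a(\tau)=-2B\,e^{i\tau H_z}\,z\,e^{-i\tau H_z}\,i\partial_x \tag{3.12}
\]
with domain $B^2$. Note that, from (1.17) and Lemma 2.3, we deduce that
\[
\forall u\in B^2,\ \forall\tau\in\mathbb R,\qquad\|a(\tau)u\|_{L^2}\le C\|u\|_{B^2}. \tag{3.13}
\]
The Duhamel representation of (3.2) reads
\[
\widetilde\Phi^\varepsilon(t)=\widetilde\Psi_0-\frac i\varepsilon\int_0^ta\Big(\frac s{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(s)\,ds-i\int_0^t\Big(-\Delta_{x,y}\widetilde\Phi^\varepsilon(s)+F_0\big(\widetilde\Phi^\varepsilon(s)\big)\Big)ds. \tag{3.14}
\]
Introduce the primitive of $a$:
\[
\forall u\in B^2,\ \forall\tau\in\mathbb R,\qquad A(\tau)u=\int_0^\tau a(s)u\,ds, \tag{3.15}
\]
which is well-defined as a Riemann integral, thanks to (3.13), and is such that
\[
\forall u\in B^2,\ \forall\tau\in\mathbb R,\qquad\|A(\tau)u\|_{L^2}\le C\tau\|u\|_{B^2}. \tag{3.16}
\]
Now, we notice that if $\widetilde\Phi^\varepsilon\in C^0([0,T],B^4)$, then by (3.2) we have that $\partial_t\widetilde\Phi^\varepsilon\in C^0([0,T],B^2)$. Hence one can integrate by parts in the first integral of (3.14) and, if $m\ge4$, the following expression holds true for all $t\in[0,T]$, in the sense of functions in $C^0([0,T],L^2)$:
\[
\begin{aligned}
-\frac i\varepsilon\int_0^ta\Big(\frac s{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(s)\,ds
&=i\varepsilon\int_0^tA\Big(\frac s{\varepsilon^2}\Big)\partial_t\widetilde\Phi^\varepsilon(s)\,ds-i\varepsilon A\Big(\frac t{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(t)\\
&=\int_0^tA\Big(\frac s{\varepsilon^2}\Big)a\Big(\frac s{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(s)\,ds-i\varepsilon A\Big(\frac t{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(t)\\
&\quad+\varepsilon\int_0^tA\Big(\frac s{\varepsilon^2}\Big)\Big(-\Delta_{x,y}\widetilde\Phi^\varepsilon(s)+F_0\big(\widetilde\Phi^\varepsilon(s)\big)\Big)ds,
\end{aligned}
\]
where we used (3.2) to evaluate $i\partial_t\widetilde\Phi^\varepsilon$. Finally, the Duhamel formula (3.14) becomes
\[
\begin{aligned}
\widetilde\Phi^\varepsilon(t)&=\widetilde\Psi_0+\int_0^tA\Big(\frac s{\varepsilon^2}\Big)a\Big(\frac s{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(s)\,ds-i\varepsilon A\Big(\frac t{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(t)\\
&\quad+\varepsilon\int_0^tA\Big(\frac s{\varepsilon^2}\Big)\Big(-\Delta_{x,y}\widetilde\Phi^\varepsilon(s)+F_0\big(\widetilde\Phi^\varepsilon(s)\big)\Big)ds
-i\int_0^t\Big(-\Delta_{x,y}\widetilde\Phi^\varepsilon(s)+F_0\big(\widetilde\Phi^\varepsilon(s)\big)\Big)ds.
\end{aligned} \tag{3.17}
\]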
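The integration by parts leading to (3.17) rests on the fact that $A$ is the primitive of $a$ vanishing at $0$. The following lines sketch that computation for a generic smooth function; $\varphi$ is an illustrative name standing for the filtered solution, not notation from the paper.

```latex
% Sketch: integration by parts with the primitive A of a, see (3.15).
% \varphi stands for the filtered solution; it is an illustrative name.
Since $A(\tau)=\int_0^\tau a(s)\,ds$, the chain rule gives
\[
\frac{d}{ds}\Big[A\Big(\frac{s}{\varepsilon^2}\Big)\Big]=\frac{1}{\varepsilon^2}\,a\Big(\frac{s}{\varepsilon^2}\Big),
\qquad A(0)=0 .
\]
Hence, for $\varphi\in C^1([0,T],B^2)$,
\[
-\frac{i}{\varepsilon}\int_0^t a\Big(\frac{s}{\varepsilon^2}\Big)\varphi(s)\,ds
=-i\varepsilon\int_0^t\frac{d}{ds}\Big[A\Big(\frac{s}{\varepsilon^2}\Big)\Big]\varphi(s)\,ds
=-i\varepsilon\,A\Big(\frac{t}{\varepsilon^2}\Big)\varphi(t)
+i\varepsilon\int_0^tA\Big(\frac{s}{\varepsilon^2}\Big)\partial_t\varphi(s)\,ds .
\]
% The factor 1/\varepsilon in front of the oscillating term is thus traded
% for a factor \varepsilon, at the price of one time derivative on \varphi.
```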

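Step 2. Approximation of the Duhamel formula. Denote
\[
\widehat\Phi^\varepsilon(t)=\widetilde\Phi^\varepsilon(t)+i\varepsilon A\Big(\frac t{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(t)
\]
and rewrite (3.17) as follows:
\[
\begin{aligned}
\widehat\Phi^\varepsilon(t)&=\widetilde\Psi_0+\int_0^t\Big(A\Big(\frac s{\varepsilon^2}\Big)a\Big(\frac s{\varepsilon^2}\Big)+i\partial_x^2\Big)\widetilde\Phi^\varepsilon(s)\,ds
-i\int_0^t\Big(-\partial_y^2\widetilde\Phi^\varepsilon(s)+F_0\big(\widetilde\Phi^\varepsilon(s)\big)\Big)ds\\
&\quad+\varepsilon\int_0^tA\Big(\frac s{\varepsilon^2}\Big)\Big(-\Delta_{x,y}\widetilde\Phi^\varepsilon(s)+F_0\big(\widetilde\Phi^\varepsilon(s)\big)\Big)ds.
\end{aligned} \tag{3.18}
\]
In this step, we prove that
\[
\sup_{t\in[0,T]}\|\widetilde\Phi^\varepsilon(t)-\widehat\Phi^\varepsilon(t)\|_{B^1}\le\varepsilon\,C_M \tag{3.19}
\]
and that
\[
\widehat\Phi^\varepsilon(t)=\widetilde\Psi_0-i\int_0^t\Big(A_0\widehat\Phi^\varepsilon(s)-\partial_y^2\widehat\Phi^\varepsilon(s)+F_0\big(\widehat\Phi^\varepsilon(s)\big)+\varepsilon f^\varepsilon(s)\Big)ds, \tag{3.20}
\]
with
\[
\sup_{t\in[0,T]}\|f^\varepsilon\|_{B^1}\le C_M. \tag{3.21}
\]
In order to prove this claim, we state two technical lemmas, whose proofs are postponed to Appendix D to keep the argument readable.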


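Lemma 3.4. Let $V_c$ satisfy Assumptions 1.1 and 1.2. Then, for all integer $\ell$, the operator $A(\tau)$ defined by (3.15) satisfies
\[
\forall u\in C^0([0,T],B^{2n_0+\ell+8}),\qquad\sup_{t\in[0,T]}\Big\|A\Big(\frac t{\varepsilon^2}\Big)u(t)\Big\|_{B^\ell}\le C\|u\|_{C^0([0,T],B^{2n_0+\ell+8})}, \tag{3.22}
\]
where $n_0$ is as in Assumption 1.2 and $C$ is independent of $\varepsilon$.

Lemma 3.5. Let $V_c$ satisfy Assumptions 1.1 and 1.2. Let $T>0$ and $m=4n_0+17$. Let $u\in C^0([0,T],B^m)$ be such that $\partial_tu\in C^0([0,T],B^{m-2})$. Then we have, for all $\varepsilon\in(0,1]$,
\[
\sup_{t\in[0,T]}\Big\|\int_0^t\Big(A\Big(\frac s{\varepsilon^2}\Big)a\Big(\frac s{\varepsilon^2}\Big)+i\partial_x^2\Big)u(s)\,ds+i\int_0^tA_0u(s)\,ds\Big\|_{B^1}\le C\varepsilon^2\,|||u|||, \tag{3.23}
\]
where $A_0$, $a$ and $A$ are respectively defined by (3.3), (3.12) and (3.15) and where $|||u|||$ is shorthand for $\|u\|_{C^0([0,T],B^m)}+\|\partial_tu\|_{C^0([0,T],B^{m-2})}$.

In order to apply these lemmas, we need some bounds for $\widetilde\Phi^\varepsilon$ and $\partial_t\widetilde\Phi^\varepsilon$. Let us fix $m=4n_0+17$, where $n_0$ is as in Assumption 1.2, and assume that we have the uniform estimate (3.7). By (2.8), we deduce that
\[
\|\Delta_{x,y}\widetilde\Phi^\varepsilon\|_{C^0([0,T],B^{m-2})}+\big\|e^{itH_z/\varepsilon^2}\,z\,e^{-itH_z/\varepsilon^2}\partial_x\widetilde\Phi^\varepsilon\big\|_{C^0([0,T],B^{m-2})}\le C_M. \tag{3.24}
\]
Moreover, from (2.16), we deduce that
\[
\|F_0(\widetilde\Phi^\varepsilon)\|_{C^0([0,T],B^m)}\le C_M. \tag{3.25}
\]
Hence, from (3.2), (3.24) and (3.25), we get
\[
\|\partial_t\widetilde\Phi^\varepsilon\|_{C^0([0,T],B^{m-2})}\le\frac{C_M}\varepsilon. \tag{3.26}
\]
Therefore, applying Lemmas 3.4 and 3.5 and using (3.7), (3.24), (3.25) and (3.26) yields
\[
\sup_{t\in[0,T]}\Big\|A\Big(\frac t{\varepsilon^2}\Big)\widetilde\Phi^\varepsilon(t)\Big\|_{B^{2n_0+5}}\le C_M, \tag{3.27}
\]
\[
\sup_{t\in[0,T]}\Big\|\varepsilon A\Big(\frac t{\varepsilon^2}\Big)\Big(-\Delta_{x,y}\widetilde\Phi^\varepsilon(t)+F_0\big(\widetilde\Phi^\varepsilon(t)\big)\Big)\Big\|_{B^1}\le\varepsilon\,C_M, \tag{3.28}
\]
and
\[
\sup_{t\in[0,T]}\Big\|\int_0^t\Big(A\Big(\frac s{\varepsilon^2}\Big)a\Big(\frac s{\varepsilon^2}\Big)+i\partial_x^2\Big)\widetilde\Phi^\varepsilon(s)\,ds+i\int_0^tA_0\widetilde\Phi^\varepsilon(s)\,ds\Big\|_{B^1}\le\varepsilon\,C_M, \tag{3.29}
\]
where we used that $m\ge4n_0+17$, thus in particular $m\ge4n_0+13$ and $m\ge2n_0+11$. Hence, from (3.27), we deduce (3.19) and
\[
\|\partial_y^2(\widetilde\Phi^\varepsilon-\widehat\Phi^\varepsilon)\|_{C^0([0,T],B^1)}+\|F_0(\widetilde\Phi^\varepsilon)-F_0(\widehat\Phi^\varepsilon)\|_{C^0([0,T],B^1)}\le\varepsilon\,C_M, \tag{3.30}
\]
where we also used the estimate (2.15). Moreover, from (3.5) and (3.27), we get
\[
\|A_0(\widetilde\Phi^\varepsilon-\widehat\Phi^\varepsilon)\|_{C^0([0,T],B^1)}\le\varepsilon\,C_M. \tag{3.31}
\]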

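Finally, inserting (3.28), (3.29), (3.30), (3.31) in (3.18) yields (3.20) with the estimate (3.21).

Step 3. A stability result for the limit system. First notice that (3.20) implies that $\widehat\Phi^\varepsilon$ satisfies in the strong sense the equation
\[
i\partial_t\widehat\Phi^\varepsilon=A_0\widehat\Phi^\varepsilon-\partial_y^2\widehat\Phi^\varepsilon+F_0(\widehat\Phi^\varepsilon)+\varepsilon f^\varepsilon,\qquad\widehat\Phi^\varepsilon(t=0)=\widetilde\Psi_0,
\]
which has the following mild formulation:
\[
\widehat\Phi^\varepsilon(t)=e^{-it(A_0-\partial_y^2)}\widetilde\Psi_0-i\int_0^te^{-i(t-s)(A_0-\partial_y^2)}\Big(F_0(\widehat\Phi^\varepsilon(s))+\varepsilon f^\varepsilon(s)\Big)ds. \tag{3.32}
\]
Apply now Proposition 3.1 (ii) with $\widetilde\Psi_0$ as initial data: there exists a maximal solution $\widetilde\Phi\in C^0([0,T_{max}),B^1)$ to Eq. (3.6). Assume that $T$ is such that $0<T<T_{max}$. Subtracting (3.6) from (3.32) leads, for all $t\le T$, to
\[
\begin{aligned}
\|\widehat\Phi^\varepsilon(t)-\widetilde\Phi(t)\|_{B^1}&\le\int_0^t\|F_0(\widehat\Phi^\varepsilon(s))-F_0(\widetilde\Phi(s))\|_{B^1}\,ds+\varepsilon\|f^\varepsilon\|_{C^0([0,T],B^1)}\\
&\le C_M\Big(\varepsilon+\int_0^t\|\widehat\Phi^\varepsilon(s)-\widetilde\Phi(s)\|_{B^1}\,ds\Big),
\end{aligned}
\]
where we used (2.15), (3.21) and $\|\widehat\Phi^\varepsilon\|_{C^0([0,T],B^1)}\le C_M$. Therefore, the Gronwall Lemma gives the estimate (3.8) and the proof of Proposition 3.2 is complete.

4. Proof of the Main Theorem

This section is devoted to the proof of the main Theorem 1.3. Remark that the statement (i) is already proved in Proposition 3.1. Let us prove the statement (ii) of Theorem 1.3. Let $\Psi_0\in B^1$. Denote by $\Psi^\varepsilon\in C^0([0,+\infty),B^1)$ the solution of (1.1), (1.2), (1.3) and let $T_0\in(0,+\infty]$ be the maximal time given by Proposition 2.1 (i). We also introduce the maximal solution $\Phi\in C^0([0,T_{max}),B^1)$ of the limit system (3.4), given by Proposition 3.1. Pick $T$ such that $0<T<\min(T_0,T_{max})$ and let $\eta>0$. Since $T<T_{max}$, according to Remark 3.3, one can define $\delta_{\eta/3,T}>0$ such that the following holds true. For all $\widetilde\Psi_0$ satisfying
\[
\|\Psi_0-\widetilde\Psi_0\|_{B^1}\le\delta_{\eta/3,T},
\]
Eq. (3.6) admits a unique solution $\widetilde\Phi\in C^0([0,T],B^1)$ and we have (3.11):
\[
\sup_{t\in[0,T]}\|\Phi(t)-\widetilde\Phi(t)\|_{B^1}\le\eta/3.
\]
Next, we fix $m\ge2$ according to Proposition 3.2 and $\delta>0$ by
\[
\delta=\min\Big(\frac\eta3,\ \delta_{\eta/3,T}\Big). \tag{4.1}
\]
Since $T<T_0$, Proposition 2.1 (ii) enables us to choose $\widetilde\Psi_0\in B^m$ and $\varepsilon_\delta$ such that the corresponding solution $\widetilde\Psi^\varepsilon$ of the intermediate system (2.1), (2.2), (2.3) satisfies (2.5) and (2.6) for all $\varepsilon\le\varepsilon_\delta$:
\[
\|\Psi^\varepsilon-\widetilde\Psi^\varepsilon\|_{C^0([0,T],B^1)}\le\delta\le\frac\eta3, \tag{4.2}
\]
and $\widetilde\Psi^\varepsilon$ is bounded in $C^0([0,T],B^m)$ uniformly with respect to $\varepsilon$. Now, we remark that by (4.2) this initial data $\widetilde\Psi_0$ satisfies
\[
\|\Psi_0-\widetilde\Psi_0\|_{B^1}\le\delta\le\delta_{\eta/3,T}.
\]
Hence, Remark 3.3 gives that the solution $\widetilde\Phi$ of Eq. (3.6) satisfies
\[
\|\Phi-\widetilde\Phi\|_{C^0([0,T],B^1)}\le\frac\eta3,
\]
or, equivalently,
\[
\big\|e^{-itH_z/\varepsilon^2}\Phi-e^{-itH_z/\varepsilon^2}\widetilde\Phi\big\|_{C^0([0,T],B^1)}\le\frac\eta3. \tag{4.3}
\]
Moreover, the uniform bound of $\widetilde\Psi^\varepsilon$ in $C^0([0,T],B^m)$ enables us to apply Proposition 3.2, which gives that the function $\widetilde\Psi^\varepsilon$ satisfies
\[
\big\|\widetilde\Psi^\varepsilon-e^{-itH_z/\varepsilon^2}\widetilde\Phi\big\|_{C^0([0,T],B^1)}\le\delta\le\frac\eta3, \tag{4.4}
\]
for $\varepsilon\le\varepsilon_\delta$, where $\widetilde\Phi$ solves (3.6). Finally, (4.2), (4.3) and (4.4) yield the existence of $\varepsilon_0$ such that, for all $\varepsilon\in(0,\varepsilon_0]$, we have
\[
\big\|\Psi^\varepsilon-e^{-itH_z/\varepsilon^2}\Phi\big\|_{C^0([0,T],B^1)}\le\eta. \tag{4.5}
\]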
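The way (4.2), (4.3) and (4.4) combine into (4.5) is a three-term triangle inequality. The sketch below spells it out; the symbols $\Psi^\varepsilon$, $\widetilde\Psi^\varepsilon$ (solutions of the original and intermediate systems) and $\Phi$, $\widetilde\Phi$ (solutions of the limit system with data $\Psi_0$, $\widetilde\Psi_0$) are the notation assumed here, and $\|\cdot\|$ abbreviates the norm of $C^0([0,T],B^1)$.

```latex
% Sketch: (4.2) + (4.4) + (4.3) imply (4.5) by the triangle inequality.
% \|\cdot\| denotes the C^0([0,T],B^1) norm; symbol names are assumptions.
\[
\begin{aligned}
\big\|\Psi^\varepsilon-e^{-itH_z/\varepsilon^2}\Phi\big\|
&\le\underbrace{\big\|\Psi^\varepsilon-\widetilde\Psi^\varepsilon\big\|}_{\le\,\eta/3\ \text{by (4.2)}}
+\underbrace{\big\|\widetilde\Psi^\varepsilon-e^{-itH_z/\varepsilon^2}\widetilde\Phi\big\|}_{\le\,\eta/3\ \text{by (4.4)}}
+\underbrace{\big\|e^{-itH_z/\varepsilon^2}\widetilde\Phi-e^{-itH_z/\varepsilon^2}\Phi\big\|}_{\le\,\eta/3\ \text{by (4.3)}}\\
&\le\eta .
\end{aligned}
\]
```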

To conclude, it remains to remark that $T_0\ge T_{max}$. Indeed, if $T_0<T_{max}$, then we have, by Proposition 2.1 (i),
\[
\limsup_{\varepsilon\to0}\|\Psi^\varepsilon\|_{C^0([0,T_0],B^1)}=+\infty,
\]
which implies by (4.5) that
\[
\lim_{T\to T_0}\|\Phi(T)\|_{B^1}=+\infty.
\]
This contradicts $T_0<T_{max}$. The proof of Theorem 1.3 is complete.



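Appendix A. Proof of Lemma 2.4

First, by integrating by parts and applying Cauchy-Schwarz, we obtain
\[
\|Bz\partial_xu\|_{L^2}^2=\int_{\mathbb R^3}B^2z^2|\partial_xu|^2\,dx\,dy\,dz=\int_{\mathbb R^3}(B^2z^2u)(-\partial_x^2\overline u)\,dx\,dy\,dz\le\|u\|_{B^2}^2.
\]
Hence, the first properties stated in the lemma are obvious from the definition (2.9), and we shall only detail the proof of the equivalence of norms.

Step 1. The case $m=1$. From the definition (2.9) and the assumption (1.17) on $V_c$, we deduce that
\[
\begin{aligned}
\|H_\varepsilon^{1/2}u\|_{L^2}^2&=((-\partial_z^2+V_c)u,u)_{L^2}+\|(\varepsilon\partial_x+iBz)u\|_{L^2}^2\\
&=((-\partial_z^2+V_c)u,u)_{L^2}+B^2\|zu\|_{L^2}^2+\varepsilon^2\|\partial_xu\|_{L^2}^2-2\varepsilon B\,\mathrm{Im}\,(zu,\partial_xu)_{L^2}\\
&\ge\frac12((-\partial_z^2+V_c)u,u)_{L^2}+\Big(\frac{a^2}2+B^2\Big)\|zu\|_{L^2}^2+\varepsilon^2\|\partial_xu\|_{L^2}^2-2\varepsilon B\|zu\|_{L^2}\|\partial_xu\|_{L^2}\\
&\ge\frac12((-\partial_z^2+V_c)u,u)_{L^2}+\frac{a^2}4\|zu\|_{L^2}^2+\frac{a^2}{a^2+4B^2}\,\varepsilon^2\|\partial_xu\|_{L^2}^2\\
&\ge C\|H_z^{1/2}u\|_{L^2}^2+C\varepsilon^2\|\partial_xu\|_{L^2}^2.
\end{aligned}
\]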
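The passage from the third to the fourth line of the chain above is a weighted Young inequality. A short check, with $x=\|zu\|_{L^2}$ and $y=\varepsilon\|\partial_xu\|_{L^2}$ as illustrative shorthand (these two names are not in the original):

```latex
% Check of the absorption of the cross term 2Bxy (x, y are shorthand names).
For $x,y\ge0$ and $\kappa=\frac{a^2}{4}+B^2>0$, Young's inequality gives
\[
2Bxy\le\kappa\,x^2+\frac{B^2}{\kappa}\,y^2
=\Big(\frac{a^2}{4}+B^2\Big)x^2+\frac{4B^2}{a^2+4B^2}\,y^2 .
\]
Therefore
\[
\Big(\frac{a^2}{2}+B^2\Big)x^2+y^2-2Bxy
\ge\frac{a^2}{4}\,x^2+\Big(1-\frac{4B^2}{a^2+4B^2}\Big)y^2
=\frac{a^2}{4}\,x^2+\frac{a^2}{a^2+4B^2}\,y^2 ,
\]
which are exactly the coefficients appearing in the estimate of Step 1.
```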

Conversely, from (1.22) and (2.9), we estimate directly
\[
(H_\varepsilon u,u)_{L^2}\le C\|H_z^{1/2}u\|_{L^2}^2+C\varepsilon^2\|\partial_xu\|_{L^2}^2.
\]
For all $\varepsilon\in(0,1]$, this yields the equivalence of norms (2.10). For $m\ge2$, we will proceed by induction. For the clarity of the proof, let us introduce two notations. For $m\in\mathbb N$, we denote by $(P_m)$ the property

$(P_m)$: There exists $\varepsilon_m>0$ such that, for all $\varepsilon\in(0,\varepsilon_m]$ and for all $u\in B^m$, we have
\[
\frac12\|u\|_{B^m}^2\le\|u\|_{L^2(\mathbb R^3)}^2+\|\Delta_{x,y}^{m/2}u\|_{L^2(\mathbb R^3)}^2+\|H_\varepsilon^{m/2}u\|_{L^2(\mathbb R^3)}^2\le2\|u\|_{B^m}^2,
\]
and by $(Q_m)$ the property

$(Q_m)$: There exists $C_m>0$ such that, for all $u\in B^m$ and $\varepsilon\in(0,1]$, the operator $A_m=\frac1\varepsilon(H_\varepsilon^m-H_z^m)$ satisfies $|(A_mu,u)_{L^2}|\le C_m\|u\|_{B^m}^2$.

Note that the lemma will be proved if we show that $(P_m)$ holds true for all $m\ge0$. Note also that, up to a possible modification of the sequence $(\varepsilon_m)_{m\in\mathbb N}$, this sequence can be chosen nonincreasing.

Step 2. $(Q_m)$ implies $(P_m)$. Let $m\ge0$ be fixed. From $(Q_m)$, we deduce that
\[
\|H_\varepsilon^{m/2}u\|_{L^2}^2=(H_\varepsilon^mu,u)_{L^2}=(H_z^mu,u)_{L^2}+\varepsilon(A_mu,u)_{L^2}=\|H_z^{m/2}u\|_{L^2}^2+\varepsilon(A_mu,u)_{L^2},
\]
thus
\[
\|H_z^{m/2}u\|_{L^2}^2-\varepsilon C_m\|u\|_{B^m}^2\le\|H_\varepsilon^{m/2}u\|_{L^2}^2\le\|H_z^{m/2}u\|_{L^2}^2+\varepsilon C_m\|u\|_{B^m}^2. \tag{A.1}
\]
Setting $\varepsilon_m=\frac1{2C_m}$, we deduce directly from (1.37) and (A.1) that, for $\varepsilon\le\varepsilon_m$,
\[
\frac12\|u\|_{B^m}^2\le\|u\|_{L^2(\mathbb R^3)}^2+\|\Delta_{x,y}^{m/2}u\|_{L^2(\mathbb R^3)}^2+\|H_\varepsilon^{m/2}u\|_{L^2(\mathbb R^3)}^2\le2\|u\|_{B^m}^2.
\]
We have proved $(P_m)$.

Step 3. Proof of $(Q_m)$ for $m=0$ and $1$. For $m=0$, choose $A_0=0$ and $(Q_0)$ is obvious. Let us prove $(Q_1)$. From (2.9), we have
\[
H_\varepsilon=H_z+\varepsilon A_1,\qquad\text{with } A_1=-2iBz\partial_x-\varepsilon\partial_x^2. \tag{A.2}
\]
For all $u\in B^1$, we have
\[
|(A_1u,u)_{L^2}|=\big|-2iB(\partial_xu,zu)_{L^2}+\varepsilon\|\partial_xu\|_{L^2}^2\big|\le C(\|zu\|_{L^2}^2+\|\partial_xu\|_{L^2}^2)\le C_1\|u\|_{B^1}^2,
\]
where we applied Cauchy-Schwarz and Lemma 2.3. We have proved $(Q_1)$.

Step 4. Proof of $(Q_m)$ for $m\ge2$. We shall now proceed by induction. Let $m\ge2$ and assume that $(Q_{m-2})$ and $(Q_{m-1})$ hold true. Let us prove $(Q_m)$. We compute
\[
\begin{aligned}
H_\varepsilon^m&=(H_z+\varepsilon A_1)H_\varepsilon^{m-2}(H_z+\varepsilon A_1)=H_zH_\varepsilon^{m-2}H_z+\varepsilon A_1H_\varepsilon^{m-1}+\varepsilon H_\varepsilon^{m-1}A_1\\
&=H_z^m+\varepsilon H_zA_{m-2}H_z+\varepsilon A_1H_\varepsilon^{m-1}+\varepsilon H_\varepsilon^{m-1}A_1,
\end{aligned}
\]
where we have applied $(Q_{m-2})$. Hence, denoting
\[
A_m=H_zA_{m-2}H_z+A_1H_\varepsilon^{m-1}+H_\varepsilon^{m-1}A_1, \tag{A.3}
\]
we obtain $H_\varepsilon^m=H_z^m+\varepsilon A_m$ and, for all $u\in B^m$, we get from the definition (A.3) that
\[
|(A_mu,u)_{L^2}|\le|(H_zA_{m-2}H_zu,u)_{L^2}|+2|(H_\varepsilon^{m-1}u,A_1u)_{L^2}|,
\]
where we used that $H_\varepsilon^{m-1}$ and the operator $A_1$ defined by (A.2) are selfadjoint. It remains to estimate the two terms in the right-hand side of this inequality. The first one can be estimated as follows:
\[
|(H_zA_{m-2}H_zu,u)_{L^2}|=|(A_{m-2}H_zu,H_zu)_{L^2}|\le C_{m-2}\|H_zu\|_{B^{m-2}}^2\le C_{m-2}\|u\|_{B^m}^2,
\]
where we used $(Q_{m-2})$ and (2.8). The second one can be estimated as follows:
\[
\begin{aligned}
|(H_\varepsilon^{m-1}u,A_1u)_{L^2}|&=\big|\big(H_\varepsilon^{m-1}u,(i\partial_x)(-2Bzu+i\varepsilon\partial_xu)\big)_{L^2}\big|\\
&=\big|\big(H_\varepsilon^{\frac{m-1}2}(i\partial_xu),H_\varepsilon^{\frac{m-1}2}(-2Bzu+i\varepsilon\partial_xu)\big)_{L^2}\big|\\
&\le\big\|H_\varepsilon^{\frac{m-1}2}(i\partial_xu)\big\|_{L^2}\,\big\|H_\varepsilon^{\frac{m-1}2}(-2Bzu+i\varepsilon\partial_xu)\big\|_{L^2}\\
&\le C\|\partial_xu\|_{B^{m-1}}\|zu\|_{B^{m-1}}+C\|\partial_xu\|_{B^{m-1}}^2\\
&\le C\|u\|_{B^m}^2,
\end{aligned}
\]
where we used that $H_\varepsilon$ commutes with $\partial_x$, the Cauchy-Schwarz inequality, the property $(P_{m-1})$ and, at the last step, (2.8). Therefore, we have proved that $|(A_mu,u)_{L^2}|\le C_m\|u\|_{B^m}^2$, which proves $(Q_m)$. The proof of the lemma is complete.



Appendix B. Proof of Lemma 2.5

For readability, we introduce in this Appendix the following notation:
\[
\forall(x,y,z)\in\mathbb R^3,\ \forall\alpha\in\{0,1\},\ \forall\varepsilon\in(0,1),\qquad r_\alpha^\varepsilon(x,y,z)=\sqrt{x^2+y^2+\alpha\varepsilon^2z^2}.
\]
With this notation, for all $u\in B^1$ and $\alpha\in\{0,1\}$, the nonlinearity $F_\alpha(u)$ defined in (2.12) reads
\[
F_\alpha(u)=\Big(\frac1{4\pi r_\alpha^\varepsilon}*|u|^2\Big)u.
\]
In order to prove the estimates stated in Lemma 2.5, we prove the following technical lemma on the Poisson nonlinearity.

Lemma B.1. The following estimates hold:

(i) There exists a positive constant $C$ that does not depend on $\varepsilon\in(0,1]$ or $\alpha\in\{0,1\}$ such that
\[
\forall u,v\in H^1(\mathbb R^3),\qquad\Big\|\frac1{r_\alpha^\varepsilon}*(uv)\Big\|_{L^\infty}\le C\|u\|_{H^1}\|v\|_{H^1}. \tag{B.1}
\]
(ii) There exists a positive constant $C$ that does not depend on $\varepsilon\in(0,1]$ or $\alpha\in\{0,1\}$ such that, if $D$ denotes a derivative with respect to $x$, $y$ or $z$,
\[
\forall u,v\in H^1(\mathbb R^3),\qquad\Big\|D\Big(\frac1{r_\alpha^\varepsilon}*(uv)\Big)\Big\|_{L^3_{x,y}L^\infty_z}\le C\|u\|_{H^1}\|v\|_{H^1}. \tag{B.2}
\]
(iii) For any integer $k$, let $\beta=(\beta_x,\beta_y,\beta_z)\in\mathbb N^3$ be a multi-integer of length $|\beta|=\beta_x+\beta_y+\beta_z=k$ and let $D^\beta=\partial_x^{\beta_x}\partial_y^{\beta_y}\partial_z^{\beta_z}$ be the associated derivative. Then there exists a positive constant $C_k$ depending only on $k$ such that
\[
\forall u\in H^k,\qquad\Big\|D^\beta\Big(\frac1{r_\alpha^\varepsilon}*|u|^2\Big)\Big\|_{L^3_{x,y}L^\infty_z}\le C_k\|u\|_{H^1}\|u\|_{H^k}. \tag{B.3}
\]

Proof. Noting that, for all $(x,y)\in\mathbb R^2$,
\[
\Big\|\Big(\frac1{r_\alpha^\varepsilon}*(uv)\Big)(x,y,\cdot)\Big\|_{L^\infty(\mathbb R)}\le\int_{\mathbb R^2}\frac1{\sqrt{(x-x')^2+(y-y')^2}}\,\|uv(x',y',\cdot)\|_{L^1(\mathbb R)}\,dx'\,dy', \tag{B.4}
\]
we only need estimates for the convolution with $\frac1{\sqrt{x^2+y^2}}$ in $\mathbb R^2$. Here, we refer the reader to Lemma B.1 of [7] where it was shown that for any $f\in L^p(\mathbb R^2)\cap L^1(\mathbb R^2)$ with $2<p\le\infty$, the following bound holds:
\[
\Big\|\frac1{\sqrt{x^2+y^2}}*f\Big\|_{L^\infty(\mathbb R^2)}\le C_p\|f\|_{L^p(\mathbb R^2)}^\theta\|f\|_{L^1(\mathbb R^2)}^{1-\theta}, \tag{B.5}
\]
where $\theta=p/(2p-2)$. Moreover, from Cauchy-Schwarz and Sobolev embeddings, we deduce that for all $p\in[1,+\infty)$,
\[
\big\|\,\|uv(x,y,\cdot)\|_{L^1(\mathbb R)}\big\|_{L^p(\mathbb R^2)}\le\big\|\,\|u(x,y,\cdot)\|_{L^2(\mathbb R)}\|v(x,y,\cdot)\|_{L^2(\mathbb R)}\big\|_{L^p(\mathbb R^2)}\le\|u\|_{L^{2p}_{x,y}L^2_z}\|v\|_{L^{2p}_{x,y}L^2_z}\le C\|u\|_{H^1(\mathbb R^3)}\|v\|_{H^1(\mathbb R^3)}.
\]
Combined with (B.4) and (B.5), this proves Item (i).

In order to prove Item (ii), consider a first order derivative $D$ with respect to $x$, $y$ or $z$ and let $u,v\in H^1(\mathbb R^3)$. The usual properties of the convolution give
\[
D\Big(\frac1{r_\alpha^\varepsilon}*(uv)\Big)=\frac1{r_\alpha^\varepsilon}*D(uv)=\frac1{r_\alpha^\varepsilon}*\big(D(u)v+uD(v)\big).
\]
Using (B.4) combined with the generalized Young formula gives
\[
\Big\|\frac1{r_\alpha^\varepsilon}*\big(D(u)v+uD(v)\big)\Big\|_{L^3_{x,y}L^\infty_z}\le\Big\|\frac1{\sqrt{x^2+y^2}}*\|D(u)v+uD(v)\|_{L^1_z}\Big\|_{L^3_{x,y}}\le C\big\|\,\|D(u)v+uD(v)\|_{L^1_z}\big\|_{L^{6/5}_{x,y}}, \tag{B.6}
\]
since the function $(x,y)\mapsto\frac1{\sqrt{x^2+y^2}}$ belongs to $L^2_w(\mathbb R^2)$. We end the proof of Item (ii) noting that, thanks to Sobolev embeddings,
\[
\big\|\,\|D(u)v+uD(v)\|_{L^1_z}\big\|_{L^{6/5}_{x,y}}\le C\|D(u)\|_{L^2}\|v\|_{L^3_{x,y}L^2_z}+C\|D(v)\|_{L^2}\|u\|_{L^3_{x,y}L^2_z}\le C\|u\|_{H^1}\|v\|_{H^1}.
\]
In order to prove Item (iii), we follow the same lines with derivatives of higher orders. Consider the derivative $D^\beta$ where $\beta=(\beta_x,\beta_y,\beta_z)\in\mathbb N^3$ is a multi-integer of length $|\beta|=\beta_x+\beta_y+\beta_z=k$. The usual properties of the convolution give
\[
\Big\|D^\beta\Big(\frac1{r_\alpha^\varepsilon}*|u|^2\Big)\Big\|_{L^3_{x,y}L^\infty_z}=\Big\|\frac1{r_\alpha^\varepsilon}*D^\beta|u|^2\Big\|_{L^3_{x,y}L^\infty_z}.
\]
Again, using (B.4) combined with the generalized Young formula leads to
\[
\Big\|\frac1{r_\alpha^\varepsilon}*D^\beta|u|^2\Big\|_{L^3_{x,y}L^\infty_z}\le\Big\|\frac1{\sqrt{x^2+y^2}}*\big\|D^\beta|u|^2\big\|_{L^1_z(\mathbb R)}\Big\|_{L^3_{x,y}}\le C\big\|D^\beta|u|^2\big\|_{L^{6/5}_{x,y}L^1_z}.
\]
We now write
\[
D^\beta(u\overline u)=\sum_{\beta'\le\beta}C_{\beta'}\,D^{\beta'}(u)\,D^{\beta-\beta'}(\overline u), \tag{B.7}
\]


where the sum is over the set of multiintegers β = (βx , βy , βz ) such that βx ≤ βx , βy ≤ βy and βz ≤ βz . Thus, combining (B.7) with Sobolev embeddings gives as above β 1 2 D ∗ |u| ≤C D β (u) L 2 u L 3x,y L 2z ε rα L 3x,y L ∞ z |β |=k +C D β u L 3x,y L 2z D β−β ( u ) L 2 |β |=, 0≤ ε},

− = {X ∈ R2 , |X − X | < ε}.

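For all $\eta,\mu\in\mathbb R$ and $X\neq X'$, we have
\[
\frac1{\sqrt{|X-X'|^2+\varepsilon^2\eta^2}}-\frac1{\sqrt{|X-X'|^2+\varepsilon^2\mu^2}}=\int_{\varepsilon\mu}^{\varepsilon\eta}\frac{-\xi}{\big(|X-X'|^2+\xi^2\big)^{3/2}}\,d\xi \tag{C.2}
\]
and
\[
\Big|\frac1{\sqrt{|X-X'|^2+\varepsilon^2\eta^2}}-\frac1{\sqrt{|X-X'|^2+\varepsilon^2\mu^2}}\Big|\le\frac2{|X-X'|}. \tag{C.3}
\]
Besides, a simple study gives
\[
\forall X'\neq X,\ \forall\xi\in\mathbb R,\qquad\frac{|\xi|}{\big(|X-X'|^2+\xi^2\big)^{3/2}}\le\frac2{3\sqrt3}\,\frac1{|X-X'|^2}. \tag{C.4}
\]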
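The "simple study" behind (C.4) is a one-variable maximization. A sketch, writing $\rho=|X-X'|>0$ and $g$ for the function being maximized (both names are introduced here only for illustration):

```latex
% Sketch of the maximization giving the constant 2/(3\sqrt{3}) in (C.4).
% \rho = |X - X'| and g are illustrative names.
For $\xi\ge0$ set $g(\xi)=\xi\,(\rho^2+\xi^2)^{-3/2}$. Then
\[
g'(\xi)=(\rho^2+\xi^2)^{-5/2}\big(\rho^2+\xi^2-3\xi^2\big)=(\rho^2+\xi^2)^{-5/2}\big(\rho^2-2\xi^2\big),
\]
so $g$ is maximal at $\xi=\rho/\sqrt2$, where
\[
g\Big(\frac{\rho}{\sqrt2}\Big)=\frac{\rho}{\sqrt2}\Big(\frac{3\rho^2}{2}\Big)^{-3/2}
=\frac{\rho}{\sqrt2}\cdot\frac{2\sqrt2}{3\sqrt3\,\rho^3}=\frac{2}{3\sqrt3}\,\frac{1}{\rho^2}.
\]
By symmetry in $\xi$ this bounds $|\xi|\,(\rho^2+\xi^2)^{-3/2}$ for all $\xi\in\mathbb R$.
```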

Equation (C.2), combined with (C.3) and (C.4) allows us to claim that for all θ ∈ (0, 1), 1 1 1 − . (C.5) ≤ Cεθ |η − µ|θ 2 2 2 |X − X |2 + ε2 η2 |X − X |1+θ |X − X | + ε µ Now, applying (C.5) with η = z − z , µ = z and θ = 3/8 leads to 1 1 − u(X , z )v(X , z )dz d X 2 2 2 2 2 2 + R |X − X | + ε (z − z ) |X − X | + ε z 1 ≤ Cε3/8 |z|3/8 u(X , ·)v(X , ·) L 1 d X |11/8 + R |X − X 1 ≤ Cε3/8 |z|3/8 1/24 u L 6 L 2 v L 6 L 2 ≤ Cε1/3 |z|3/8 u B 1 v B 1 , X z X z ε



where we used the Hölder inequality and Sobolev embeddings. Similarly, applying (C.5) with η = z , µ = 0 and θ = 3/4 leads to 1 1 − , z )||v(X , z )|dz d X |u(X + R |X − X | |X − X |2 + ε2 z 2 1 ≤ Cε3/4 |z |3/4 u(X , ·)v(X , ·) L 1 d X R + |X − X |7/4 1 ≤ Cε3/4 5/12 z 3/8 u L 6 L 2 v L 6 L 2 X z X z ε ≤ Cε1/3 u B 2 v B 1 . We have proved that |δ + (u, v)(X, z)| ≤ Cε1/3 (1 + |z|3/8 )u B 2 v B 1 .

(C.6)

Consider now δ − . Using (C.2) again leads to |ξ | |u(X , z )||v(X , z )|dξ dz d X . |δ − (u, v)(X, z)| ≤ 2 2 3/2 − R R (|X − X | + ξ ) (C.7) Moreover, a simple computation gives |ξ | 2 . dξ = |2 + ξ 2 )3/2 (|X − X |X − X | R Hence, (C.7) gives

1 |u(X , z )||v(X , z )|dz d X − R |X − X | ≤ Cε1/3 u L 6 L 2 v L 6 L 2

|δ − (u, v)(X, z)| ≤ C

X

z

X

z

≤ Cε1/3 u B 1 v B 1 .

(C.8)
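Two elementary computations enter the estimate (C.8): the value of the $\xi$-integral and the origin of the factor $\varepsilon^{1/3}$. The sketch below records both; $\rho=|X-X'|$ is an illustrative shorthand, and the Hölder exponents $(3,3/2)$ in the second part are an assumption about how the inequality was presumably obtained, consistent with the $L^6_XL^2_z$ norms appearing in (C.8).

```latex
% (1) The integral in \xi, with \rho = |X - X'| > 0 (illustrative name):
\[
\int_{\mathbb R}\frac{|\xi|}{(\rho^2+\xi^2)^{3/2}}\,d\xi
=2\int_0^{\infty}\frac{\xi}{(\rho^2+\xi^2)^{3/2}}\,d\xi
=2\Big[-(\rho^2+\xi^2)^{-1/2}\Big]_{0}^{\infty}=\frac{2}{\rho}.
\]
% (2) Assumed Hölder step on the disc |X - X'| < \varepsilon of R^2:
\[
\Big\|\frac{1}{|Y|}\mathbf 1_{|Y|<\varepsilon}\Big\|_{L^{3/2}(\mathbb R^2)}
=\Big(2\pi\int_0^{\varepsilon}r^{-3/2}\,r\,dr\Big)^{2/3}
=\big(4\pi\,\varepsilon^{1/2}\big)^{2/3}=C\,\varepsilon^{1/3},
\]
while $\big\|\,\|uv\|_{L^1_z}\big\|_{L^3_X}\le\|u\|_{L^6_XL^2_z}\|v\|_{L^6_XL^2_z}$ by Cauchy-Schwarz in $z$ and Hölder in $X$.
```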

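Combining (C.6) and (C.8) allows to conclude that
\[
|\delta(u,v)(X,z)|\le C\varepsilon^{1/3}\big(1+\sqrt{V_c(z)}\big)\|u\|_{B^2}\|v\|_{B^1}, \tag{C.9}
\]
where we have used $|z|^{3/8}\le C\big(1+\sqrt{V_c(z)}\big)$, deduced from (1.17).

Step 2: Difference between the nonlinearities. In order to prove Lemma 2.7, we need to estimate the following quantity in $B^1$:
\[
F_1(u)-F_0(u)=\delta(u,u)\,u, \tag{C.10}
\]
where $u\in B^2$ is given. According to Lemma 2.3, we have
\[
\|F_1(u)-F_0(u)\|_{B^1}\le C\big\|\sqrt{V_c}\,(F_1(u)-F_0(u))\big\|_{L^2}+C\|F_1(u)-F_0(u)\|_{H^1}.
\]
First, we deduce from (C.9) that
\[
\big\|(1+\sqrt{V_c})\,\delta(u,u)\,u\big\|_{L^2}\le C\varepsilon^{1/3}\|(1+V_c)u\|_{L^2}\|u\|_{B^2}\|u\|_{B^1}\le C\varepsilon^{1/3}\|u\|_{B^2}^3, \tag{C.11}
\]
where we used Lemma 2.3. Let now $D$ denote a first order derivative with respect to $x$, $y$ or $z$. We clearly have
\[
\begin{aligned}
\|D(F_1(u)-F_0(u))\|_{L^2}&\le\Big\|\Big(\Big(\frac1{|X|}-\frac1{\sqrt{|X|^2+\varepsilon^2z^2}}\Big)*\big(D(u)\overline u+uD(\overline u)\big)\Big)u+\delta(u,u)D(u)\Big\|_{L^2}\\
&\le2\|\delta(u,D(u))\,u\|_{L^2}+\|\delta(u,u)D(u)\|_{L^2}.
\end{aligned} \tag{C.12}
\]
According to (C.9), we have
\[
\|\delta(u,D(u))\,u\|_{L^2}\le C\varepsilon^{1/3}\big\|(1+\sqrt{V_c})u\big\|_{L^2}\|u\|_{B^2}\|D(u)\|_{B^1}\le C\varepsilon^{1/3}\|u\|_{B^2}^3 \tag{C.13}
\]
and
\[
\|\delta(u,u)D(u)\|_{L^2}\le C\varepsilon^{1/3}\big\|(1+\sqrt{V_c})D(u)\big\|_{L^2}\|u\|_{B^2}\|u\|_{B^1}\le C\varepsilon^{1/3}\|u\|_{B^2}^3, \tag{C.14}
\]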

where we used again Lemma 2.3. Combining (C.10), (C.11), (C.12), (C.13) and (C.14) gives (2.23). The proof of Lemma 2.7 is complete.

Appendix D. Proof of the Technical Lemmas 3.4 and 3.5

Let us develop the operators $a$ and $A$ defined by (3.12) and (3.15) on the eigenbasis $\chi_p$. We have
\[
a(\tau)u=-\sum_{p\ge0}\sum_{q\ge0}e^{i\tau(E_p-E_q)}\,a_{pq}\,i\partial_xu_q\,\chi_p,
\]
where we have introduced the coefficients
\[
a_{pq}=\langle2Bz\chi_p\chi_q\rangle. \tag{D.1}
\]
Recall that, by Assumption 1.1, the potential $V_c$ is even, so for all $p$, the function $(\chi_p(z))^2$ is even. Therefore, we have
\[
\forall p\in\mathbb N,\qquad a_{pp}=\langle2Bz\chi_p^2\rangle=0,
\]
thus
\[
a(\tau)u=-\sum_{p\ge0}\sum_{q\neq p}e^{i\tau(E_p-E_q)}\,a_{pq}\,i\partial_xu_q\,\chi_p. \tag{D.2}
\]
Let us now integrate this formula in order to compute the operator $A$ defined by (3.15):
\[
A(\tau)u=i\sum_{p\ge0}\sum_{q\neq p}\frac{e^{i\tau(E_p-E_q)}-1}{E_p-E_q}\,a_{pq}\,i\partial_xu_q\,\chi_p. \tag{D.3}
\]
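The passage from (D.2) to (D.3) is a termwise integration of the oscillating exponentials. A short check of the prefactor, with $\omega$ as an illustrative name for the energy gap:

```latex
% Termwise integration of (D.2); \omega = E_p - E_q \neq 0 is an illustrative name.
For $\omega=E_p-E_q\neq0$,
\[
\int_0^\tau e^{is\omega}\,ds=\frac{e^{i\tau\omega}-1}{i\omega}=-\,i\,\frac{e^{i\tau\omega}-1}{\omega},
\]
so that the overall sign $-1$ in (D.2) becomes the prefactor $(-1)\cdot(-i)=i$ in (D.3):
\[
-\int_0^\tau e^{is(E_p-E_q)}\,ds=i\,\frac{e^{i\tau(E_p-E_q)}-1}{E_p-E_q}.
\]
% The diagonal terms q = p, for which this primitive would grow linearly in \tau,
% are absent because a_{pp} = 0 by parity.
```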

Before proving Lemmas 3.4 and 3.5, let us give a useful estimate on the coefficients $a_{pq}$. For all $p\in\mathbb N$, $q\in\mathbb N$, $k\in\mathbb N$ we have
\[
|a_{pq}|\le C\,\frac{E_q^{(k+1)/2}}{E_p^{k/2}}. \tag{D.4}
\]
Indeed, we have
\[
E_p^{k/2}a_{pq}=2B\big(H_z^{k/2}\chi_p,z\chi_q\big)_{L^2}=2B\big(\chi_p,H_z^{k/2}(z\chi_q)\big)_{L^2}\le2B\|H_z^{k/2}(z\chi_q)\|_{L^2}\le2B\|z\chi_q\|_{B^k}\le C\|\chi_q\|_{B^{k+1}}\le C\,E_q^{(k+1)/2},
\]
where we applied Lemma 2.3.

Proof of Lemma 3.4. Let $n_0$ be as in Assumption 1.2, let $\ell\in\mathbb N$ and $u\in C^0([0,T],B^{2n_0+8+\ell})$. Denoting
\[
u_p=\langle u\chi_p\rangle,\qquad\mu_p^2=\|u_p\chi_p\|_{C^0([0,T],B^{2n_0+8+\ell})}^2, \tag{D.5}
\]
we have
\[
\|u\|_{C^0([0,T],B^{2n_0+8+\ell})}^2=\sum_{p\ge0}\mu_p^2<+\infty. \tag{D.6}
\]
From (D.3), we obtain
\[
\Big\|A\Big(\frac t{\varepsilon^2}\Big)u(t)\Big\|_{C^0([0,T],B^\ell)}\le C\sum_{p\ge0}\sum_{q\neq p}(1+q)^{n_0}|a_{pq}|\,\|u_q\chi_p\|_{C^0([0,T],B^{\ell+1})},
\]
where we used Assumption 1.2. Besides, applying Lemma 2.3 gives
\[
\begin{aligned}
\|u_q\|_{C^0([0,T],H^s(\mathbb R^2))}&=\frac1{E_q^{n_0+4+(\ell-s)/2}}\big\|H_z^{n_0+4+(\ell-s)/2}\big(I+(-\Delta_{x,y})^{s/2}\big)(u_q\chi_q)\big\|_{C^0([0,T],L^2)}\\
&\le C\,\frac{E_q^{s/2}}{E_q^{n_0+4+\ell/2}}\,\mu_q,
\end{aligned} \tag{D.7}
\]
for all $s\le2n_0+8+\ell$. Hence, from the definition (1.37), we get
\[
\|u_q\chi_p\|_{C^0([0,T],B^{\ell+1})}\le C\,E_p^{(\ell+1)/2}\|u_q\|_{C^0([0,T],L^2(\mathbb R^2))}+C\|u_q\|_{C^0([0,T],H^{\ell+1}(\mathbb R^2))}\le C\,\frac{E_p^{(\ell+1)/2}+E_q^{(\ell+1)/2}}{E_q^{n_0+4+\ell/2}}\,\mu_q,
\]
and, by using (D.4) and (3.9),
\[
(1+q)^{n_0}|a_{pq}|\,\|u_q\chi_p\|_{C^0([0,T],B^{\ell+1})}\le C\,E_q^{n_0}|a_{pq}|\,\frac{E_p^{(\ell+5)/2}+E_q^{(\ell+1)/2}E_p^2}{E_q^{n_0+4+\ell/2}\,E_p^2}\,\mu_q\le C\,\frac1{E_p^2E_q}\,\mu_q.
\]
Therefore,
\[
\Big\|A\Big(\frac t{\varepsilon^2}\Big)u(t)\Big\|_{C^0([0,T],B^\ell)}\le C\Big(\sum_{p\ge0}\frac1{E_p^2}\Big)\Big(\sum_{q\ge0}\frac{\mu_q}{E_q}\Big)\le C\Big(\sum_{p\ge0}\frac1{E_p^2}\Big)^{3/2}\Big(\sum_{q\ge0}\mu_q^2\Big)^{1/2}
\]
by Cauchy-Schwarz. To conclude, it suffices to use (3.9) and (D.6): the series converge and we obtain the desired estimate (3.22).

Proof of Lemma 3.5. Let $m=4n_0+17$ and let $u\in C^0([0,T],B^m)$ be such that $\partial_tu\in C^0([0,T],B^{m-2})$. Denoting now
\[
u_p=\langle u\chi_p\rangle,\qquad\nu_p^2=\|u_p\chi_p\|_{C^0([0,T],B^m)}^2+\|\partial_tu_p\chi_p\|_{C^0([0,T],B^{m-2})}^2, \tag{D.8}
\]
we have
\[
\|u\|_{C^0([0,T],B^m)}^2+\|\partial_tu\|_{C^0([0,T],B^{m-2})}^2=\sum_{p\ge0}\nu_p^2<+\infty. \tag{D.9}
\]

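Applying Lemma 2.3 as above yields
\[
E_p^{(m-s)/2}\|u_p\|_{C^0([0,T],H^s(\mathbb R^2))}+E_p^{(m-2-s)/2}\|\partial_tu_p\|_{C^0([0,T],H^s(\mathbb R^2))}\le C\nu_p \tag{D.10}
\]
for all $s\le m$. By composing the expressions (D.3) and (D.2) for $A$ and $a$, we obtain
\[
\begin{aligned}
A(\tau)a(\tau)u&=i\sum_{p\ge0}\sum_{q\neq p}\sum_{n\neq q}\frac{e^{i\tau(E_p-E_q)}-1}{E_p-E_q}\,e^{i\tau(E_q-E_n)}\,a_{pq}a_{qn}\,\partial_x^2u_n\,\chi_p\\
&=i\sum_{p\ge0}\sum_{q\neq p}\frac{1-e^{i\tau(E_q-E_p)}}{E_p-E_q}\,(a_{pq})^2\,\partial_x^2u_p\,\chi_p
+i\sum_{p\ge0}\sum_{q\neq p}\sum_{\substack{n\neq q\\n\neq p}}\frac{e^{i\tau(E_p-E_n)}-e^{i\tau(E_q-E_n)}}{E_p-E_q}\,a_{pq}a_{qn}\,\partial_x^2u_n\,\chi_p.
\end{aligned}
\]
Now, remark that, by (1.24) and (D.1), we have for all $p\in\mathbb N$ the identity
\[
1+\sum_{q\neq p}\frac{(a_{pq})^2}{E_p-E_q}=\alpha_p.
\]
Therefore we get, using the definition (3.3),
\[
\begin{aligned}
\big(A(\tau)a(\tau)+i\partial_x^2\big)u=-iA_0u&-i\sum_{p\ge0}\sum_{q\neq p}\frac{(a_{pq})^2}{E_p-E_q}\,e^{i\tau(E_q-E_p)}\,\partial_x^2u_p\,\chi_p\\
&+i\sum_{p\ge0}\sum_{q\neq p}\sum_{\substack{n\neq q\\n\neq p}}\frac{a_{pq}a_{qn}}{E_p-E_q}\Big(e^{i\tau(E_p-E_n)}-e^{i\tau(E_q-E_n)}\Big)\partial_x^2u_n\,\chi_p,
\end{aligned}
\]
and, integrating,
\[
\begin{aligned}
&\int_0^t\Big(A\Big(\frac s{\varepsilon^2}\Big)a\Big(\frac s{\varepsilon^2}\Big)+i\partial_x^2\Big)u(s)\,ds+i\int_0^tA_0u(s)\,ds\\
&\qquad=-i\sum_{p\ge0}\sum_{q\neq p}\frac{(a_{pq})^2}{E_p-E_q}\,\chi_p\int_0^te^{is(E_q-E_p)/\varepsilon^2}\,\partial_x^2u_p(s)\,ds\\
&\qquad\quad+i\sum_{p\ge0}\sum_{q\neq p}\sum_{\substack{n\neq q\\n\neq p}}\frac{a_{pq}a_{qn}}{E_p-E_q}\,\chi_p\int_0^t\Big(e^{is(E_p-E_n)/\varepsilon^2}-e^{is(E_q-E_n)/\varepsilon^2}\Big)\partial_x^2u_n(s)\,ds.
\end{aligned} \tag{D.11}
\]
In order to estimate the right-hand side of this identity, we claim that, for all $p\in\mathbb N$, $q\in\mathbb N$ and $\lambda\neq0$, we have
\[
\Big\|\chi_p(z)\int_0^te^{i\lambda s/\varepsilon^2}\,\partial_x^2u_q(s,x,y)\,ds\Big\|_{C^0([0,T],B^1)}\le C_T\,\frac{\varepsilon^2}{|\lambda|}\,\frac{E_p^{1/2}+E_q^{1/2}}{E_q^{(m-4)/2}}\,\nu_q, \tag{D.12}
\]
where $C_T$ only depends on $T$ and $\nu_q$ is defined by (D.8). This claim is proved below. As a consequence, we can estimate (D.11) as follows:
\[
\begin{aligned}
&\Big\|\int_0^t\Big(A\Big(\frac s{\varepsilon^2}\Big)a\Big(\frac s{\varepsilon^2}\Big)+i\partial_x^2\Big)u(s)\,ds+i\int_0^tA_0u(s)\,ds\Big\|_{C^0([0,T],B^1)}\\
&\qquad\le C\varepsilon^2\sum_{p\ge0}\sum_{q\neq p}\frac{(a_{pq})^2}{|E_p-E_q|^2}\,\frac1{E_p^{(m-5)/2}}\,\nu_p\\
&\qquad\quad+C\varepsilon^2\sum_{p\ge0}\sum_{q\neq p}\sum_{\substack{n\neq q\\n\neq p}}\frac{|a_{pq}||a_{qn}|}{|E_p-E_q|}\Big(\frac1{|E_p-E_n|}+\frac1{|E_q-E_n|}\Big)\frac{E_p^{1/2}+E_n^{1/2}}{E_n^{(m-4)/2}}\,\nu_n\\
&\qquad\le C\varepsilon^2\sum_{p\ge0}\sum_{q\ge0}(1+p)^{2n_0}\,\frac{E_p^3}{E_q^2}\,\frac1{E_p^{(m-5)/2}}\,\nu_p
+C\varepsilon^2\sum_{p\ge0}\sum_{q\ge0}\sum_{n\ge0}(1+q)^{n_0}(1+n)^{n_0}\,\frac{E_n^{n_0+11/2}}{E_p^2\,E_q^{n_0+2}}\,\frac1{E_n^{(m-4)/2}}\,\nu_n\\
&\qquad\le C\varepsilon^2\sum_{p\ge0}\sum_{q\ge0}\frac1{1+q^2}\,\frac{\nu_p}{1+p^3}
+C\varepsilon^2\sum_{p\ge0}\sum_{q\ge0}\sum_{n\ge0}\frac1{1+p^2}\,\frac1{1+q^2}\,\frac{\nu_n}{1+n},
\end{aligned}
\]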
where we used Assumption 1.2, (D.4), (3.9) and recall that m = 4n 0 + 17. Hence we deduce (3.23) by using Cauchy-Schwarz and (D.9). It remains to prove the claim. Proof of the claim (D.12). Let

t

v(t, x, y, z) = χ p (z) 0

2

eiλs/ε ∂x2 u q (s, x, y) ds,

(D.13)


869

for p ∈ N, q ∈ N and λ = 0. An integration by parts in (D.13) yields t ε2 2 2 v(t, x, y, z) = i χ p eiλs/ε ∂x2 ∂t u q (s, x, y) ds + eiλt/ε ∂x2 u q (t, x, y) λ 0 −∂x2 u q (0, x, y) . Hence, by using (D.10), we obtain 1/2

vC 0 ([0,T ],B 1 ) ≤ C T

1/2

ε2 E p + Eq νq , |λ| E q(m−4)/2

where C T only depends on T . This concludes the proof of (D.12). The proof of Lemma 3.5 is complete. Acknowledgement. The authors were supported by the Agence Nationale de la Recherche, ANR project QUATRAIN. They wish to thank N. Ben Abdallah and F. Castella for fruitful discussions.

References

1. Allaire, G., Piatnitski, A.: Homogenization of the Schrödinger equation and effective mass theorems. Commun. Math. Phys. 258(1), 1–22 (2005)
2. Ando, T., Fowler, B., Stern, F.: Electronic properties of two-dimensional systems. Rev. Mod. Phys. 54, 437–672 (1982)
3. Bao, W., Markowich, P.A., Schmeiser, C., Weishäupl, R.: On the Gross-Pitaevskii equation with strongly anisotropic confinement: formal asymptotics and numerical experiments. Math. Models Meth. Appl. Sci. 15(5), 767–782 (2005)
4. Bastard, G.: Wave Mechanics Applied to Semi-conductor Heterostructures. Les Éditions de Physique, Les Ulis: EDP Sciences, 1992
5. Ben Abdallah, N., Castella, F., Delebecque-Fendt, F., Méhats, F.: The strongly confined Schrödinger-Poisson system for the transport of electrons in a nanowire. SIAM J. Appl. Math. 69(4), 1162–1173 (2009)
6. Ben Abdallah, N., Castella, F., Méhats, F.: Time averaging for the strongly confined nonlinear Schrödinger equation, using almost periodicity. J. Diff. Eq. 245(1), 154–200 (2008)
7. Ben Abdallah, N., Méhats, F., Pinaud, O.: Adiabatic approximation of the Schrödinger-Poisson system with a partial confinement. SIAM J. Math. Anal. 36, 986–1013 (2005)
8. Ben Abdallah, N., Méhats, F., Schmeiser, C., Weishäupl, R.M.: The nonlinear Schrödinger equation with strong anisotropic harmonic potential. SIAM J. Math. Anal. 37(1), 189–199 (2005)
9. Bony, J.-M., Chemin, J.-Y.: Espaces fonctionnels associés au calcul de Weyl-Hörmander. Bull. Soc. Math. France 122(1), 77–118 (1994)
10. Brezzi, F., Markowich, P.A.: The three dimensional Wigner-Poisson problem: existence, uniqueness and approximation. Math. Meth. Appl. Sci. 14(1), 35–61 (1991)
11. Carles, R.: Linear vs. nonlinear effects for nonlinear Schrödinger equations with potential. Commun. Contemp. Math. 7(4), 483–508 (2005)
12. Carles, R., Markowich, P.A., Sparber, C.: On the Gross-Pitaevskii equation for trapped dipolar quantum gases. Nonlinearity 21(11), 2569–2590 (2008)
13. Cazenave, T.: Semilinear Schrödinger Equations. Courant Lecture Notes 10, Providence, RI: Amer. Math. Soc., 2003
14. Cazenave, T., Esteban, M.J.: On the stability of stationary states for nonlinear Schrödinger equations with an external magnetic field. Mat. Apl. Comput. 7, 155–168 (1988)
15. da Costa, R.C.T.: Quantum mechanics for a constraint particle. Phys. Rev. A 23(4), 1982–1987 (1981)
16. de Bouard, A.: Nonlinear Schrödinger equations with magnetic fields. Diff. Int. Eqs. 4(1), 73–88 (1991)
17. Degond, P., Parzani, C., Vignal, M.-H.: A Boltzmann model for trapped particles in a surface potential. Multiscale Modeling & Simulation, SIAM 5(2), 364–392 (2006)
18. Duclos, P., Exner, P.: Curvature-induced bound states in quantum waveguides in two and three dimensions. Rev. Math. Phys. 7(1), 73–102 (1995)
19. Egorov, Yu. V., Shubin, M.A.: Partial Differential Equations. I. Encyclopaedia Math. Sci., 30, Berlin: Springer, 1992
20. Ferry, D.K., Goodnick, S.M.: Transport in Nanostructures. Cambridge: Cambridge University Press, 1997
21. Froese, R., Herbst, I.: Realizing holonomic constraints in classical and quantum mechanics. Commun. Math. Phys. 220(3), 489–535 (2001)
22. Helffer, B., Nier, F.: Hypoelliptic Estimates and Spectral Theory for Fokker-Planck Operators and Witten Laplacians. Berlin-Heidelberg-New York: Springer, 2005
23. Illner, R., Zweifel, P.F., Lange, H.: Global Existence, Uniqueness and Asymptotic Behaviour of Solutions of the Wigner-Poisson and Schrödinger-Poisson Systems. Math. Meth. Appl. Sci. 17(5), 349–376 (1994)
24. Messiah, A.: Mécanique Quantique, Tome 1. Paris: Dunod, 2003
25. Pinaud, O.: Adiabatic approximation of the Schrödinger-Poisson system with a partial confinement: the stationary case. J. Math. Phys. 45(5), 2029–2050 (2004)
26. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. Vol. 1–4, New York, San Francisco-London: Academic Press, 1972–1979
27. Sanders, J.A., Verhulst, F.: Averaging Methods in Nonlinear Dynamical Systems. Appl. Math. Sci. vol. 59, New York-Heidelberg-Tokyo: Springer-Verlag, 1985
28. Smrčka, L., Jungwirth, T.: In-plane magnetic-field-induced anisotropy of 2D Fermi contours and the field-dependent cyclotron mass. J. Phys. Condens. Matter 6, 55–64 (1994)
29. Sparber, C.: Effective mass theorems for nonlinear Schrödinger equations. SIAM J. Appl. Math. 66(3), 820–842 (2006)
30. Teufel, S.: Adiabatic Perturbation Theory in Quantum Dynamics. Lecture Notes in Mathematics 1821, Berlin-Heidelberg-New York: Springer-Verlag, 2003
31. Vinter, B., Weisbuch, C.: Quantum Semiconductor Structures: Fundamentals & Applications. London-New York: Academic Press, 1991

Communicated by P. Constantin




Cardy Algebras and Sewing Constraints, I

Liang Kong1,2, Ingo Runkel3

1 Max-Planck-Institut für Mathematik, Vivatsgasse 7, 53111 Bonn, Germany.
2 Hausdorff Research Institute for Mathematics, Poppelsdorfer Allee 45, 53115 Bonn, Germany
3 Department of Mathematics, King’s College London, Strand, London WC2R 2LS, United Kingdom.

E-mail: [email protected] Received: 28 March 2009 / Accepted: 7 June 2009 Published online: 13 August 2009 – © Springer-Verlag 2009

Abstract: This is part one of a two-part work that relates two different approaches to two-dimensional open-closed rational conformal field theory. In part one we review the definition of a Cardy algebra, which captures the necessary consistency conditions of the theory at genus 0 and 1. We investigate the properties of these algebras and prove uniqueness and existence theorems. One implication is that under certain natural assumptions, every rational closed CFT is extendable to an open-closed CFT. The relation of Cardy algebras to the solutions of the sewing constraints is the topic of part two.

Contents
1. Introduction and Summary
2. Preliminaries on Tensor Categories
   2.1 Tensor categories and (co)lax tensor functors
   2.2 Algebras in tensor categories
   2.3 Modular tensor categories
   2.4 The functors T and R
3. Cardy Algebras
   3.1 Modular invariance
   3.2 Two definitions
   3.3 Uniqueness and existence theorems
A. Appendix
   A.1 Proof of Lemma 2.7
   A.2 Proof of Lemma 3.20

1. Introduction and Summary

This is part I of a two-part work which relates two different approaches to two-dimensional open-closed rational conformal field theory (CFT).


The first approach uses a three-dimensional topological field theory to express correlators of the open-closed CFT [Fe,FRS,Fj]. Here one starts from a modular tensor category, which defines a three-dimensional topological field theory [RT,T], and from a special symmetric Frobenius algebra in this modular tensor category. To each open-closed world sheet $X$ one assigns a 3-bordism $M_X$ with an embedded ribbon graph constructed from this Frobenius algebra. To the boundary of $M_X$ the topological field theory assigns a vector space $B(X)$, and to $M_X$ itself a vector $C_X \in B(X)$. One proves that this collection of vectors $C_X$ provides a so-called solution to the sewing constraints [Fj]. If the modular tensor category is the category of representations of a suitable vertex operator algebra, the spaces $B(X)$ are spaces of conformal blocks, and the $C_X$ are the correlators of an open-closed CFT. In this approach one thus makes an ansatz for the correlators on all world sheets simultaneously and then proves that they obey the necessary consistency conditions. The relation to CFT rests on convergence and factorisation properties of higher genus conformal blocks, and the precise list of conditions the vertex operator algebra has to fulfill for these properties to hold is not known. However, from a physical perspective one expects that interesting classes of models [W,FK] will have all the necessary properties.

The second approach uses the theory of vertex operator algebras to construct directly the correlators of the genus 0 and genus 1 open-closed CFT [HK1,HK2,K3]. More precisely, in this approach one uses a notion of CFT defined in [K3, Sect. 1] (and called partial CFT there, see footnote 1), where one glues Riemann surfaces around punctures with local coordinates as in [V,H1] instead of gluing around parametrised circles as in [Se].
This approach is based on the precise relation between genus-0 CFT and vertex operator algebras [H1], and on the fact that the category of modules over a rational vertex operator algebra is a modular tensor category [HL,H2]. Let us call a vertex operator algebra rational if it satisfies the conditions in [H2, Sect. 1]. If one analyses the consistency conditions of a genus-0,1 open-closed CFT, one arrives at a structure called a Cardy $\mathcal{C}_V|\mathcal{C}_{V \otimes V}$-algebra in [K3]. It is formulated in purely categorical terms in the categories $\mathcal{C}_V$ and $\mathcal{C}_{V \otimes V}$ of modules over the rational vertex operator algebras $V$ and $V \otimes V$, respectively. Cardy algebras (Definition 3.7) are the central objects in part I of this work, and we will describe their relation to CFT in slightly more detail below. The data in a Cardy algebra amounts to an open-closed CFT on a generating set of world sheets, from which the entire CFT can be obtained by repeated gluing. The conditions on this data are necessary for this procedure to give a consistent genus-0,1 open-closed CFT.

The two approaches just outlined start at opposite ends of the same problem. In both cases the difficulty in obtaining a complete answer lies in the lack of control over the properties of higher genus conformal blocks. Nonetheless, both approaches give rise to notions formulated in entirely categorical terms, and we can compare the structures one finds. In part II we will come to the satisfying conclusion that giving a solution to the sewing constraints is essentially equivalent, in a sense made precise in part II, to giving a Cardy algebra.

To motivate the notion of a Cardy algebra and our interest in it, we would like to outline how it emerges when formulating closed CFT and open-closed CFT at genus-0,1 in the language of vertex operator algebras. The next one and a half pages, together with a few

1 The qualifier ‘partial’ refers to the fact that the gluing of punctures is only defined if the coordinates $\zeta_1$, $\zeta_2$ around two punctures can be analytically extended to a large enough region containing no other punctures, so that the identification $\zeta_1 \sim 1/\zeta_2$ is well-defined. That is, if $\zeta_1$ can be extended to a disc of radius $r$, then $\zeta_2$ must be defined on a disc of radius greater than $1/r$. Both discs must not contain further punctures.


remarks in the main text, are the only places where we make reference to vertex operator algebras. The reader who is not familiar with this structure is invited to skip ahead.

All types of field algebras occurring below are called self-dual if they are endowed with non-degenerate invariant bilinear forms. A genus-0 closed CFT is equivalent to an algebra over a partial dioperad consisting of spheres with arbitrary in-coming and out-going punctures. The dioperad structure allows one to compose one in-going and one out-going puncture of distinct spheres, so that the result is again a sphere. Such an algebra with additional natural properties is canonically equivalent to a so-called self-dual conformal full field algebra [HK2,K1]. A conformal full field algebra contains chiral and anti-chiral parts; the easiest nontrivial example is given by $V \otimes V$, where $V$ is a vertex operator algebra. A conformal full field algebra containing $V \otimes V$ as a subalgebra is called a conformal full field algebra over $V \otimes V$. When $V$ is rational, the category of self-dual conformal full field algebras over $V \otimes V$ is isomorphic to the category of commutative symmetric Frobenius algebras in $\mathcal{C}_{V \otimes V}$ [K1, Thm. 4.15].

Similarly, a genus-0 open CFT is an algebra over a partial dioperad consisting of disks with an arbitrary number of in-coming and out-going boundary punctures. Such an algebra with additional natural properties is canonically equivalent to a self-dual open-string vertex operator algebra as defined in [HK1]. A vertex operator algebra $V$ is naturally an open-string vertex operator algebra. An open-string vertex operator algebra containing $V$ as a subalgebra in its meromorphic centre is called an open-string vertex operator algebra over $V$. When $V$ is rational, the category of self-dual open-string vertex operator algebras over $V$ is isomorphic to the category of symmetric Frobenius algebras in $\mathcal{C}_V$, see [HK1, Thm. 4.3] and [K3, Thm. 6.10].

Finally, a genus-0 open-closed CFT is an algebra over the Swiss-cheese partial dioperad, which consists of disks with both interior punctures and boundary punctures, and is equipped with an action of the partial spheres dioperad. Such an algebra can be constructed from a so-called self-dual open-closed field algebra [K2]. It consists of a self-dual conformal full field algebra $A_{\mathrm{cl}}$, a self-dual open-string vertex operator algebra $A_{\mathrm{op}}$, and interactions between $A_{\mathrm{cl}}$ and $A_{\mathrm{op}}$ satisfying certain compatibility conditions. Namely, if $A_{\mathrm{cl}}$ is defined over $V \otimes V$ and $A_{\mathrm{op}}$ over $V$, one requires that the boundary condition on a disc is $V$-invariant in the sense that both the chiral copy $V \otimes 1$ and the anti-chiral copy $1 \otimes V$ of $V$ in $A_{\mathrm{cl}}$ give the copy of $V$ in $A_{\mathrm{op}}$ in the limit of the insertion point approaching a point on the boundary of the disc [K2, Def. 1.25]. An open-closed field algebra with $V$-invariant boundary condition is called an open-closed field algebra over $V$. When $V$ is rational, the category of self-dual open-closed field algebras over $V$ is isomorphic to the category of triples $(A_{\mathrm{op}}|A_{\mathrm{cl}}, \tilde\iota_{\text{cl-op}})$, where $A_{\mathrm{cl}}$ is a commutative symmetric Frobenius $\mathcal{C}_{V \otimes V}$-algebra, $A_{\mathrm{op}}$ a symmetric Frobenius $\mathcal{C}_V$-algebra, and $\tilde\iota_{\text{cl-op}}$ an algebra homomorphism $T(A_{\mathrm{cl}}) \to A_{\mathrm{op}}$ satisfying a centre condition (given in (3.20) below), see [K2, Thm. 3.14] and [K3, Sect. 6.2]. Here $T : \mathcal{C}_{V \otimes V} \to \mathcal{C}_V$ is the Huang-Lepowsky tensor product functor [HL].

The genus-1 theory does not provide new data as it is determined by taking traces of genus-0 correlators, but it does provide two additional consistency conditions: the modular invariance condition for one-point correlators on the torus [So], and the Cardy condition for boundary-two-point correlators on the annulus [C2,Lw]. Their categorical formulations have been worked out in [HK3,K3]. Adding them to the axioms of a self-dual open-closed field algebra over $V$ finally results in the notion of a Cardy $\mathcal{C}_V|\mathcal{C}_{V \otimes V}$-algebra.

One can prove that the category of self-dual open-closed field algebras over a rational vertex operator algebra $V$ satisfying the two genus-1 consistency conditions is isomorphic to the category of Cardy $\mathcal{C}_V|\mathcal{C}_{V \otimes V}$-algebras [K3, Thm. 6.15].


If $V$ is rational, then so is $V \otimes V$ [DMZ,HK2]. Thus both $\mathcal{C}_V$ and $\mathcal{C}_{V \otimes V}$ are modular tensor categories. In fact, $\mathcal{C}_{V \otimes V} \cong \mathcal{C}_V \boxtimes (\mathcal{C}_V)_-$ (see [FHL, Thm. 4.7.4] and [DMZ, Thm 2.7]), where the minus sign relates to the particular braiding used for $\mathcal{C}_{V \otimes V}$. Namely, for a given modular tensor category $\mathcal{D}$, we denote by $\mathcal{D}_-$ the modular tensor category obtained from $\mathcal{D}$ by inverting braiding and twist. We will also sometimes write $\mathcal{D}_+$ for $\mathcal{D}$. The product $\boxtimes$ amounts to taking direct sums of pairs of objects and tensor products of morphism spaces. The definition of a Cardy algebra can be stated in a way that no longer makes reference to the vertex operator algebra $V$, and therefore makes sense in an arbitrary modular tensor category $\mathcal{C}$. Abbreviating $\mathcal{C}^2_\pm \equiv \mathcal{C}_+ \boxtimes \mathcal{C}_-$, this leads to the definition of a Cardy $\mathcal{C}|\mathcal{C}^2_\pm$-algebra.

The relation to genus-0,1 open-closed CFT outlined above is the main motivation for our interest in Cardy $\mathcal{C}|\mathcal{C}^2_\pm$-algebras. In part I of this work we investigate how much one can learn about Cardy algebras in the categorical setting, and without the assumption that the modular tensor category $\mathcal{C}$ is given by $\mathcal{C}_V$ for some $V$. We briefly summarise our approach and results below.

In Sect. 2.1–2.3, we recall some basic notions we will need, such as (co)lax tensor functors, Frobenius functors, and modular tensor categories. In Sect. 2.4, we study the functor $T : \mathcal{C}^2_\pm \to \mathcal{C}$, which is defined by the tensor product on $\mathcal{C}$ via $T(\oplus_i A_i \times B_i) = \oplus_i A_i \otimes B_i$ for $A_i, B_i \in \mathcal{C}$. Using the braiding of $\mathcal{C}$ one can turn $T$ into a tensor functor. A tensor functor $F$ is automatically also a Frobenius functor, and so takes a Frobenius algebra $A$ in its domain category to a Frobenius algebra $F(A)$ in its target category. An important object in this work is the functor $R : \mathcal{C} \to \mathcal{C}^2_\pm$, also defined in Sect. 2.4. We show that $R$ is left and right adjoint to $T$. As a consequence, $R$ is automatically a lax and colax tensor functor, but it is in general not a tensor functor.
However, we will show that it is still a Frobenius functor, and so takes Frobenius algebras in $\mathcal{C}$ to Frobenius algebras in $\mathcal{C}^2_\pm$. In fact, it also preserves the properties simple, special and symmetric of a Frobenius algebra. In the case $\mathcal{C} = \mathcal{C}_V$ the functor $R : \mathcal{C}_V \to \mathcal{C}_{V \otimes V}$ was first constructed in [Li1,Li2] using techniques from vertex operator algebras. This motivated the present construction and notation. The functor $R$ was also considered in a slightly different context in [ENO2].

The above results imply that $R$ and $T$ form an ambidextrous adjunction, and we will use this adjunction to transport algebraic structures between $\mathcal{C}$ and $\mathcal{C}^2_\pm$. For example, the algebra homomorphism $\tilde\iota_{\text{cl-op}} : T(A_{\mathrm{cl}}) \to A_{\mathrm{op}}$ in $\mathcal{C}$ is transported to an algebra homomorphism $\iota_{\text{cl-op}} : A_{\mathrm{cl}} \to R(A_{\mathrm{op}})$ in $\mathcal{C}^2_\pm$. This gives rise to an alternative definition of a Cardy $\mathcal{C}|\mathcal{C}^2_\pm$-algebra as a triple $(A_{\mathrm{op}}|A_{\mathrm{cl}}, \iota_{\text{cl-op}})$.

To prepare the definition of a Cardy algebra, in Sect. 3.1 we discuss the so-called modular invariance condition for algebras in $\mathcal{C}^2_\pm$ (Definition 3.1 below). We show that when $A_{\mathrm{cl}}$ is simple, the modular invariance condition can be replaced by an easier condition on the quantum dimension of $A_{\mathrm{cl}}$ (namely, the dimension of $A_{\mathrm{cl}}$ has to be that of the modular tensor category $\mathcal{C}$), see Theorem 3.4. In Sect. 3.2 we give the two definitions of a Cardy algebra and prove their equivalence.

Section 3.3 contains our main results. We first show that for each special symmetric Frobenius algebra $A$ in $\mathcal{C}$ (see Sect. 2.2 for the definition of special) one obtains a Cardy algebra $(A|Z(A), e)$, where $Z(A)$ is the full centre of $A$ (Theorem 3.18). The full centre [Fj, Def. 4.9] is a subobject of $R(A)$ and $e : Z(A) \to R(A)$ is the canonical embedding. Next we prove a uniqueness theorem (Theorem 3.21), which states that if $(A_{\mathrm{op}}|A_{\mathrm{cl}}, \iota_{\text{cl-op}})$ is a Cardy algebra such that $\dim A_{\mathrm{op}} \neq 0$ and $A_{\mathrm{cl}}$ is simple, then $A_{\mathrm{op}}$ is special and $(A_{\mathrm{op}}|A_{\mathrm{cl}}, \iota_{\text{cl-op}})$ is isomorphic to $(A_{\mathrm{op}}|Z(A_{\mathrm{op}}), e)$.
When combined with part II of this work, this result amounts to [Fj, Thm. 4.26] and provides an alternative (and shorter) proof. Finally we show that for every simple modular invariant commutative symmetric Frobenius algebra $A_{\mathrm{cl}}$ in $\mathcal{C}^2_\pm$ there exists a simple special symmetric Frobenius algebra $A_{\mathrm{op}}$ and an algebra homomorphism $\iota_{\text{cl-op}} : A_{\mathrm{cl}} \to R(A_{\mathrm{op}})$ such that $(A_{\mathrm{op}}|A_{\mathrm{cl}}, \iota_{\text{cl-op}})$ is a Cardy algebra (Theorem 3.22). This theorem is closely related to a result announced in [Mü2] and provides an independent proof in the framework of Cardy algebras.

In physical terms these two theorems mean that a rational open-closed CFT with a unique closed vacuum state can be uniquely reconstructed from its correlators involving only discs with boundary punctures, and that every closed CFT with unique vacuum and left/right rational chiral algebra $V \otimes V$ occurs as part of such an open-closed CFT.

2. Preliminaries on Tensor Categories

In this section, we review some basic facts of tensor categories and fix our conventions and notations along the way.

2.1. Tensor categories and (co)lax tensor functors. In a tensor (or monoidal) category $\mathcal{C}$ with tensor product bifunctor $\otimes$ and unit object $\mathbf{1}$, for $U, V, W \in \mathcal{C}$, we denote the associator $U \otimes (V \otimes W) \xrightarrow{\cong} (U \otimes V) \otimes W$ by $\alpha_{U,V,W}$, the left unit isomorphism $\mathbf{1} \otimes U \xrightarrow{\cong} U$ by $l_U$, and the right unit isomorphism $U \otimes \mathbf{1} \xrightarrow{\cong} U$ by $r_U$. If $\mathcal{C}$ is braided, for $U, V \in \mathcal{C}$ we write the braiding isomorphism as $c_{U,V} : U \otimes V \to V \otimes U$.

Let $\mathcal{C}_1$ and $\mathcal{C}_2$ be two tensor categories with units $\mathbf{1}_1$ and $\mathbf{1}_2$ respectively. For simplicity, we will often write $\otimes, \alpha, l, r$ for the data of both $\mathcal{C}_1$ and $\mathcal{C}_2$. Lax and colax tensor functors are defined as follows, see e.g. [Y, Ch. I.3] or [Ln, Ch. I.1.2].

Definition 2.1. A lax tensor functor $G : \mathcal{C}_1 \to \mathcal{C}_2$ is a functor equipped with a morphism $\phi^G_0 : \mathbf{1}_2 \to G(\mathbf{1}_1)$ in $\mathcal{C}_2$ and a natural transformation $\phi^G_2 : \otimes \circ (G \times G) \to G \circ \otimes$ such that the following three diagrams commute:

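\[
\begin{tikzcd}
G(A) \otimes (G(B) \otimes G(C)) \arrow[r, "\alpha"] \arrow[d, "\mathrm{id}_{G(A)} \otimes \phi^G_2"'] & (G(A) \otimes G(B)) \otimes G(C) \arrow[d, "\phi^G_2 \otimes \mathrm{id}_{G(C)}"] \\
G(A) \otimes G(B \otimes C) \arrow[d, "\phi^G_2"'] & G(A \otimes B) \otimes G(C) \arrow[d, "\phi^G_2"] \\
G(A \otimes (B \otimes C)) \arrow[r, "G(\alpha)"] & G((A \otimes B) \otimes C)
\end{tikzcd}
\tag{2.1}
\]

\[
\begin{tikzcd}
\mathbf{1}_2 \otimes G(A) \arrow[r, "l_{G(A)}"] \arrow[d, "\phi^G_0 \otimes \mathrm{id}_{G(A)}"'] & G(A) \arrow[d, "G(l_A^{-1})"] \\
G(\mathbf{1}_1) \otimes G(A) \arrow[r, "\phi^G_2"] & G(\mathbf{1}_1 \otimes A)
\end{tikzcd}
\qquad
\begin{tikzcd}
G(A) \otimes \mathbf{1}_2 \arrow[r, "r_{G(A)}"] \arrow[d, "\mathrm{id}_{G(A)} \otimes \phi^G_0"'] & G(A) \arrow[d, "G(r_A^{-1})"] \\
G(A) \otimes G(\mathbf{1}_1) \arrow[r, "\phi^G_2"] & G(A \otimes \mathbf{1}_1)
\end{tikzcd}
\tag{2.2}
\]

Definition 2.2. A colax tensor functor is a functor $F : \mathcal{C}_1 \to \mathcal{C}_2$ equipped with a morphism $\psi^F_0 : F(\mathbf{1}_1) \to \mathbf{1}_2$ in $\mathcal{C}_2$, and a natural transformation $\psi^F_2 : F \circ \otimes \to \otimes \circ (F \times F)$ such that the following three diagrams commute: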

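\[
\begin{tikzcd}
F(A) \otimes (F(B) \otimes F(C)) \arrow[r, "\alpha"] & (F(A) \otimes F(B)) \otimes F(C) \\
F(A) \otimes F(B \otimes C) \arrow[u, "\mathrm{id}_{F(A)} \otimes \psi^F_2"] & F(A \otimes B) \otimes F(C) \arrow[u, "\psi^F_2 \otimes \mathrm{id}_{F(C)}"'] \\
F(A \otimes (B \otimes C)) \arrow[u, "\psi^F_2"] \arrow[r, "F(\alpha)"] & F((A \otimes B) \otimes C) \arrow[u, "\psi^F_2"']
\end{tikzcd}
\tag{2.3}
\]

\[
\begin{tikzcd}
\mathbf{1}_2 \otimes F(A) & F(A) \arrow[l, "l^{-1}_{F(A)}"'] \\
F(\mathbf{1}_1) \otimes F(A) \arrow[u, "\psi^F_0 \otimes \mathrm{id}_{F(A)}"] & F(\mathbf{1}_1 \otimes A) \arrow[l, "\psi^F_2"] \arrow[u, "F(l_A)"']
\end{tikzcd}
\qquad
\begin{tikzcd}
F(A) \otimes \mathbf{1}_2 & F(A) \arrow[l, "r^{-1}_{F(A)}"'] \\
F(A) \otimes F(\mathbf{1}_1) \arrow[u, "\mathrm{id}_{F(A)} \otimes \psi^F_0"] & F(A \otimes \mathbf{1}_1) \arrow[l, "\psi^F_2"] \arrow[u, "F(r_A)"']
\end{tikzcd}
\tag{2.4}
\]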

We denote a lax tensor functor by $(G, \phi^G_2, \phi^G_0)$ or just $G$, and a colax tensor functor by $(F, \psi^F_2, \psi^F_0)$ or $F$.

Definition 2.3. A tensor functor $T : \mathcal{C}_1 \to \mathcal{C}_2$ is a lax tensor functor $(T, \phi^T_2, \phi^T_0)$ such that $\phi^T_0$, $\phi^T_2$ are both isomorphisms.

A tensor functor $(T, \phi^T_2, \phi^T_0)$ is automatically a colax tensor functor $(T, \psi^T_2, \psi^T_0)$ with $\psi^T_0 = (\phi^T_0)^{-1}$ and $\psi^T_2 = (\phi^T_2)^{-1}$.

In the next section we will discuss algebras in tensor categories. The defining properties (2.1) and (2.2) of a lax tensor functor are analogues of the associativity, the left-unit, and the right-unit properties of an algebra. Indeed, a lax tensor functor $G : \mathcal{C}_1 \to \mathcal{C}_2$ maps a $\mathcal{C}_1$-algebra to a $\mathcal{C}_2$-algebra. Similarly, (2.3) and (2.4) are analogues of the coassociativity, the left-counit and the right-counit properties of a coalgebra, and a colax tensor functor $F : \mathcal{C}_1 \to \mathcal{C}_2$ maps a $\mathcal{C}_1$-coalgebra to a $\mathcal{C}_2$-coalgebra.

We will later make use of functors that take Frobenius algebras to Frobenius algebras. This requires a stronger condition than being lax and colax and leads to the notion of a ‘functor with Frobenius structure’ or ‘Frobenius monoidal functor’ [Sz,DP,P], which we will simply refer to as Frobenius functor.

Definition 2.4. A Frobenius functor $F : \mathcal{C}_1 \to \mathcal{C}_2$ is a tuple $F \equiv (F, \phi^F_2, \phi^F_0, \psi^F_2, \psi^F_0)$ such that $(F, \phi^F_2, \phi^F_0)$ is a lax tensor functor, $(F, \psi^F_2, \psi^F_0)$ is a colax tensor functor, and such that the following two diagrams commute:

\[
\begin{tikzcd}
F(A) \otimes (F(B) \otimes F(C)) \arrow[r, "\alpha"] & (F(A) \otimes F(B)) \otimes F(C) \arrow[d, "\phi^F_2 \otimes \mathrm{id}_{F(C)}"] \\
F(A) \otimes F(B \otimes C) \arrow[u, "\mathrm{id}_{F(A)} \otimes \psi^F_2"] \arrow[d, "\phi^F_2"'] & F(A \otimes B) \otimes F(C) \\
F(A \otimes (B \otimes C)) \arrow[r, "F(\alpha)"] & F((A \otimes B) \otimes C) \arrow[u, "\psi^F_2"']
\end{tikzcd}
\tag{2.5}
\]

\[
\begin{tikzcd}
F(A) \otimes (F(B) \otimes F(C)) \arrow[d, "\mathrm{id}_{F(A)} \otimes \phi^F_2"'] & (F(A) \otimes F(B)) \otimes F(C) \arrow[l, "\alpha^{-1}"'] \\
F(A) \otimes F(B \otimes C) & F(A \otimes B) \otimes F(C) \arrow[u, "\psi^F_2 \otimes \mathrm{id}_{F(C)}"'] \arrow[d, "\phi^F_2"] \\
F(A \otimes (B \otimes C)) \arrow[u, "\psi^F_2"] & F((A \otimes B) \otimes C) \arrow[l, "F(\alpha^{-1})"]
\end{tikzcd}
\tag{2.6}
\]

Proposition 2.5. If $(F, \phi^F_2, \phi^F_0)$ is a tensor functor, then $F$ is a Frobenius functor with $\psi^F_0 = (\phi^F_0)^{-1}$ and $\psi^F_2 = (\phi^F_2)^{-1}$.

Proof. Since $F$ is a tensor functor, it is lax and colax. If we replace $\psi^F_2$ by $(\phi^F_2)^{-1}$ in (2.5) and (2.6), both commuting diagrams are equivalent to (2.1), which holds because $F$ is lax. Thus $F$ is a Frobenius functor.
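The reduction used in this proof can be spelled out as follows. This is only a sketch of the manipulation, with the object labels on $\phi^F_2$ suppressed: the second line is obtained from the first by composing with $\phi^F_2$ from the left and with $\mathrm{id}_{F(A)} \otimes \phi^F_2$ from the right.

```latex
% Sketch: (2.5) with \psi_2^F = (\phi_2^F)^{-1} is equivalent to (2.1) for G = F.
\begin{align*}
(\phi^F_2 \otimes \mathrm{id}_{F(C)}) \circ \alpha \circ (\mathrm{id}_{F(A)} \otimes (\phi^F_2)^{-1})
  &= (\phi^F_2)^{-1} \circ F(\alpha) \circ \phi^F_2 \\
\Longleftrightarrow \quad
\phi^F_2 \circ (\phi^F_2 \otimes \mathrm{id}_{F(C)}) \circ \alpha
  &= F(\alpha) \circ \phi^F_2 \circ (\mathrm{id}_{F(A)} \otimes \phi^F_2)
\end{align*}
```

The same clearing of inverses applied to (2.6) gives $\phi^F_2 \circ (\mathrm{id}_{F(A)} \otimes \phi^F_2) \circ \alpha^{-1} = F(\alpha^{-1}) \circ \phi^F_2 \circ (\phi^F_2 \otimes \mathrm{id}_{F(C)})$, which is (2.1) with both sides composed with the inverse associators.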

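The converse statement does not hold. For example, the functor $R$ which we define in Sect. 2.4 is Frobenius but not tensor.

Let us recall the notion of adjunctions and adjoint functors [Ma, Ch. IV.1].

Definition 2.6. An adjunction from $\mathcal{C}_1$ to $\mathcal{C}_2$ is a triple $\langle F, G, \chi \rangle$, where $F$ and $G$ are functors $F : \mathcal{C}_1 \to \mathcal{C}_2$, $G : \mathcal{C}_2 \to \mathcal{C}_1$, and $\chi$ is a natural isomorphism which assigns to each pair of objects $A_1 \in \mathcal{C}_1$, $A_2 \in \mathcal{C}_2$ a bijective map
\[
\chi_{A_1,A_2} : \mathrm{Hom}_{\mathcal{C}_2}(F(A_1), A_2) \xrightarrow{\ \cong\ } \mathrm{Hom}_{\mathcal{C}_1}(A_1, G(A_2)),
\]
which is natural in both $A_1$ and $A_2$. $F$ is called a left-adjoint of $G$ and $G$ is called a right-adjoint of $F$.

For simplicity, we will often abbreviate $\chi_{A_1,A_2}$ as $\chi$. Associated to each adjunction $\langle F, G, \chi \rangle$, there are two natural transformations $\mathrm{id}_{\mathcal{C}_1} \xrightarrow{\delta} GF$ and $FG \xrightarrow{\rho} \mathrm{id}_{\mathcal{C}_2}$, where $\mathrm{id}_{\mathcal{C}_1}$ and $\mathrm{id}_{\mathcal{C}_2}$ are identity functors, given by
\[
\delta_{A_1} = \chi(\mathrm{id}_{F(A_1)}), \qquad \rho_{A_2} = \chi^{-1}(\mathrm{id}_{G(A_2)})
\tag{2.7}
\]
for $A_i \in \mathcal{C}_i$, $i = 1, 2$. They satisfy the following two identities:
\[
\big( G \xrightarrow{\delta G} GFG \xrightarrow{G\rho} G \big) = \big( G \xrightarrow{\mathrm{id}_G} G \big), \qquad
\big( F \xrightarrow{F\delta} FGF \xrightarrow{\rho F} F \big) = \big( F \xrightarrow{\mathrm{id}_F} F \big).
\tag{2.8}
\]
We have, for $g : F(A_1) \to A_2$ and $f : A_1 \to G(A_2)$,
\[
\chi(g) = G(g) \circ \delta_{A_1}, \qquad \chi^{-1}(f) = \rho_{A_2} \circ F(f).
\tag{2.9}
\]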
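As a consistency check, the two formulas in (2.9) are mutually inverse once (2.8) is known. A short sketch of one direction, written out under the conventions above (the other direction is analogous, using the first identity in (2.8)):

```latex
% Sketch: \chi^{-1}(\chi(g)) = g for g : F(A_1) -> A_2.
\begin{align*}
\chi^{-1}(\chi(g))
  &= \rho_{A_2} \circ F\big( G(g) \circ \delta_{A_1} \big)
  && \text{by (2.9)} \\
  &= \rho_{A_2} \circ FG(g) \circ F(\delta_{A_1})
  && \text{functoriality of } F \\
  &= g \circ \rho_{F(A_1)} \circ F(\delta_{A_1})
  && \text{naturality of } \rho \\
  &= g
  && \text{second identity in (2.8)}
\end{align*}
```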

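For simplicity, $\delta_{A_1}$ and $\rho_{A_2}$ are often abbreviated as $\delta$ and $\rho$, respectively. Let $\langle F, G, \chi \rangle$ be an adjunction from a tensor category $\mathcal{C}_1$ to a tensor category $\mathcal{C}_2$ and $(F, \psi^F_2, \psi^F_0)$ a colax tensor functor from $\mathcal{C}_1$ to $\mathcal{C}_2$. We can define a morphism $\phi^G_0 : \mathbf{1}_1 \to G(\mathbf{1}_2)$ and a natural transformation $\phi^G_2 : \otimes \circ (G \times G) \to G \circ \otimes$ by, for $A, B \in \mathcal{C}_2$,
\begin{align*}
\phi^G_0 &= \chi(\psi^F_0) = \Big( \mathbf{1}_1 \xrightarrow{\delta_{\mathbf{1}_1}} GF(\mathbf{1}_1) \xrightarrow{G(\psi^F_0)} G(\mathbf{1}_2) \Big), \\
\phi^G_2 &= \chi\big( (\rho_A \otimes \rho_B) \circ \psi^F_2 \big) = \Big( G(A) \otimes G(B) \xrightarrow{\delta} GF\big(G(A) \otimes G(B)\big) \\
&\qquad \xrightarrow{G(\psi^F_2)} G\big(FG(A) \otimes FG(B)\big) \xrightarrow{G(\rho_A \otimes \rho_B)} G(A \otimes B) \Big),
\tag{2.10}
\end{align*}
where we have used the first identity in (2.9). Notice that $\phi^G_2$ is natural because it is a composition of natural transformations. One can easily show that $\psi^F_0$ and $\psi^F_2$ can be re-obtained from $\phi^G_0$ and $\phi^G_2$ as follows:
\begin{align*}
\psi^F_0 &= \chi^{-1}(\phi^G_0) = \Big( F(\mathbf{1}_1) \xrightarrow{F\phi^G_0} FG(\mathbf{1}_2) \xrightarrow{\rho} \mathbf{1}_2 \Big), \\
\psi^F_2 &= \chi^{-1}\big( \phi^G_2 \circ (\delta \otimes \delta) \big) = \Big( F(U \otimes V) \xrightarrow{F(\delta \otimes \delta)} F\big(GF(U) \otimes GF(V)\big) \\
&\qquad \xrightarrow{F\phi^G_2} FG\big(F(U) \otimes F(V)\big) \xrightarrow{\rho} F(U) \otimes F(V) \Big)
\tag{2.11}
\end{align*}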

for $U, V \in \mathcal{C}_1$. The following result is standard; for the sake of completeness, we give a proof in Appendix A.1.

Lemma 2.7. $(F, \psi^F_2, \psi^F_0)$ is a colax tensor functor iff $(G, \phi^G_2, \phi^G_0)$ is a lax tensor functor.

2.2. Algebras in tensor categories. An algebra in a tensor category $\mathcal{C}$, or a $\mathcal{C}$-algebra, is a triple $A = (A, m, \eta)$, where $A$ is an object of $\mathcal{C}$, $m$ (the multiplication) is a morphism $A \otimes A \to A$ such that $m \circ (m \otimes \mathrm{id}_A) \circ \alpha_{A,A,A} = m \circ (\mathrm{id}_A \otimes m)$, and $\eta$ (the unit) is a morphism $\mathbf{1} \to A$ such that $m \circ (\mathrm{id}_A \otimes \eta) = \mathrm{id}_A \circ r_A$ and $m \circ (\eta \otimes \mathrm{id}_A) = \mathrm{id}_A \circ l_A$. If $\mathcal{C}$ is braided and $m \circ c_{A,A} = m$, then $A$ is called commutative.

A left $A$-module is a pair $(M, m_M)$, where $M \in \mathcal{C}$ and $m_M$ is a morphism $A \otimes M \to M$ such that $m_M \circ (\mathrm{id}_A \otimes m_M) = m_M \circ (m_A \otimes \mathrm{id}_M) \circ \alpha_{A,A,M}$ and $m_M \circ (\eta_A \otimes \mathrm{id}_M) = \mathrm{id}_M \circ l_M$. Right $A$-modules and $A$-bimodules are defined similarly.

Definition 2.8. Let $\mathcal{C}$ be a tensor category and let $A$ be an algebra in $\mathcal{C}$.
(i) $A$ is called simple iff it is simple as a bimodule over itself.
Let $\mathcal{C}$ be in addition $k$-linear, for $k$ a field.
(ii) $A$ is called absolutely simple iff the space of $A$-bimodule maps from $A$ to itself is one-dimensional, $\dim_k \mathrm{Hom}_{A|A}(A, A) = 1$.
(iii) $A$ is called haploid iff $\dim_k \mathrm{Hom}(\mathbf{1}, A) = 1$ [FS, Def. 4.3].

In the following we will assume that all tensor categories are strict to avoid spelling out associators and unit constraints.

A $\mathcal{C}$-coalgebra $A = (A, \Delta, \varepsilon)$ is defined analogously to a $\mathcal{C}$-algebra, i.e. $\Delta : A \to A \otimes A$ and $\varepsilon : A \to \mathbf{1}$ obey coassociativity and counit conditions.

If $\mathcal{C}$ is braided and if $A$ and $B$ are $\mathcal{C}$-algebras, there are two in general non-isomorphic algebra structures on $A \otimes B$. We choose $A \otimes B$ to be the $\mathcal{C}$-algebra with multiplication $m_{A \otimes B} = (m_A \otimes m_B) \circ (\mathrm{id}_A \otimes c^{-1}_{A,B} \otimes \mathrm{id}_B)$ and unit $\eta_{A \otimes B} = \eta_A \otimes \eta_B$. Similarly, if $A$ and $B$ are $\mathcal{C}$-coalgebras, then $A \otimes B$ becomes a $\mathcal{C}$-coalgebra if we choose the comultiplication $\Delta_{A \otimes B} = (\mathrm{id}_A \otimes c_{A,B} \otimes \mathrm{id}_B) \circ (\Delta_A \otimes \Delta_B)$ and the counit $\varepsilon_{A \otimes B} = \varepsilon_A \otimes \varepsilon_B$.



Definition 2.9. A Frobenius algebra $A = (A, m, \eta, \Delta, \varepsilon)$ is an algebra and a coalgebra such that the coproduct is an intertwiner of $A$-bimodules,
\[
(\mathrm{id}_A \otimes m) \circ (\Delta \otimes \mathrm{id}_A) = \Delta \circ m = (m \otimes \mathrm{id}_A) \circ (\mathrm{id}_A \otimes \Delta).
\]
We will use the following graphical representation for the morphisms of a Frobenius algebra:

[Figure: string diagrams for $m$, $\eta$, $\Delta$ and $\varepsilon$] (2.12)

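A Frobenius algebra $A$ in a $k$-linear tensor category, for $k$ a field, is called special iff $m \circ \Delta = \zeta\, \mathrm{id}_A$ and $\varepsilon \circ \eta = \xi\, \mathrm{id}_{\mathbf{1}}$ for nonzero constants $\zeta, \xi \in k$. If $\zeta = 1$ we call $A$ normalised-special. A Frobenius algebra homomorphism between two Frobenius algebras is both an algebra homomorphism and a coalgebra homomorphism.

A (strictly) sovereign tensor category is a tensor category equipped with a left and a right duality which agree on objects and morphisms (see e.g. [Bi,FS] for more details). We will write the dualities as
\[
\tilde d_U : U \otimes U^\vee \to \mathbf{1}, \qquad d_U : U^\vee \otimes U \to \mathbf{1}, \qquad
\tilde b_U : \mathbf{1} \to U^\vee \otimes U, \qquad b_U : \mathbf{1} \to U \otimes U^\vee.
\tag{2.13}
\]
In terms of these we define the left and right dimension of an object $U$ as
\[
\dim_l U = d_U \circ \tilde b_U, \qquad \dim_r U = \tilde d_U \circ b_U,
\tag{2.14}
\]
both of which are elements of $\mathrm{Hom}(\mathbf{1}, \mathbf{1})$. Let now $\mathcal{C}$ be a sovereign tensor category. For a Frobenius algebra $A$ in $\mathcal{C}$, we define two morphisms $\Phi_A, \Phi'_A : A \to A^\vee$ [Figure: string diagrams for $\Phi_A$ and $\Phi'_A$], which in formulas read
\[
\Phi_A = \big( (\varepsilon \circ m) \otimes \mathrm{id}_{A^\vee} \big) \circ (\mathrm{id}_A \otimes b_A), \qquad
\Phi'_A = \big( \mathrm{id}_{A^\vee} \otimes (\varepsilon \circ m) \big) \circ (\tilde b_A \otimes \mathrm{id}_A).
\tag{2.15}
\]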

Definition 2.10. A Frobenius algebra $A$ is symmetric iff $\Phi_A = \Phi'_A$.

The following lemma shows that under certain conditions we do not need to distinguish the various notions of simplicity in Definition 2.8.

Lemma 2.11. Let $A$ be a commutative symmetric Frobenius algebra in a $\mathbb{C}$-linear semi-simple sovereign braided tensor category $\mathcal{C}$ and suppose that $\dim_l A \neq 0$. Then the following are equivalent.
(i) $A$ is simple.
(ii) $A$ is absolutely simple.
(iii) $A$ is haploid.

Proof. (ii)⇔(iii): $A$ is haploid iff it is absolutely simple as a left module over itself [FS, Eq. (4.17)]. Furthermore, for a commutative algebra we have $\mathrm{Hom}_A(A, A) = \mathrm{Hom}_{A|A}(A, A)$, and so $A$ is haploid iff it is absolutely simple.

(i)⇒(ii): If $A$ is simple, then every nonzero element of $\mathrm{Hom}_{A|A}(A, A)$ is invertible. Hence this space forms a division algebra over $\mathbb{C}$, and is therefore isomorphic to $\mathbb{C}$.

(iii)⇒(i): Since $\mathcal{C}$ is semi-simple and $A$ is haploid, also $\mathrm{Hom}(A, \mathbf{1})$ is one-dimensional. The counit $\varepsilon$ is a nonzero element in this space, and so gives a basis. This implies firstly, that $\varepsilon \circ \eta \neq 0$, and secondly, that there is a constant $\beta \in \mathbb{C}$ such that
\[
\beta \cdot \varepsilon = d_A \circ (\mathrm{id}_{A^\vee} \otimes m) \circ (\tilde b_A \otimes \mathrm{id}_A).
\tag{2.16}
\]
Composing with $\eta$ from the right yields $\beta\, \varepsilon \circ \eta = \dim_l A$. The right-hand side is nonzero, and so $\beta \neq 0$. By [FRS, Lem. 3.11], $A$ is special. We have already proved (ii)⇔(iii), and so $A$ is absolutely simple. A special Frobenius algebra in a semi-simple category has a semi-simple category of bimodules (apply [FS, Prop. 5.24] to the algebra tensored with its opposite algebra). For semi-simple $\mathbb{C}$-linear categories, simple and absolutely simple are equivalent (see footnote 2). Thus $A$ is simple.
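A small example may help to see the three conditions of Lemma 2.11 fail together. The following is an illustration that is not taken from the text: it assumes the category of finite-dimensional complex vector spaces with its standard dualities and symmetric braiding, and takes $A$ to be the algebra of functions on a two-point set with basis of idempotents $e_1, e_2$.

```latex
% Illustrative example (an assumption, not from the text):
% A = \mathbb{C} e_1 \oplus \mathbb{C} e_2 in finite-dimensional complex vector spaces.
\begin{gather*}
m(e_i \otimes e_j) = \delta_{ij}\, e_i, \qquad
\eta(1) = e_1 + e_2, \qquad
\Delta(e_i) = e_i \otimes e_i, \qquad
\varepsilon(e_i) = 1, \\
m \circ \Delta = \mathrm{id}_A, \qquad
\varepsilon \circ \eta = 2\, \mathrm{id}_{\mathbf{1}}, \qquad
\dim_l A = 2 \neq 0, \\
\dim_{\mathbb{C}} \mathrm{Hom}(\mathbf{1}, A) = 2, \qquad
\dim_{\mathbb{C}} \mathrm{Hom}_{A|A}(A, A) = 2, \qquad
A = \mathbb{C} e_1 \oplus \mathbb{C} e_2 \ \text{as $A$-bimodules}.
\end{gather*}
```

This $A$ is a commutative, symmetric, normalised-special Frobenius algebra with nonzero dimension, and it is neither haploid, nor absolutely simple, nor simple, in agreement with the lemma. The one-dimensional algebra $A = \mathbb{C}$ satisfies all three conditions.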

Remark 2.12. For a Frobenius algebra $A$ the morphisms (2.15) are invertible, and hence $A \cong A^\vee$. In this case one has $\dim_l A = \dim_r A$ [FS, Rem. 3.6.3] and so we could have stated the above lemma equivalently with the condition $\dim_r A \neq 0$.

Let $F : \mathcal{C}_1 \to \mathcal{C}_2$ be a lax tensor functor between two tensor categories $\mathcal{C}_1$, $\mathcal{C}_2$ and let $(A, m_A, \eta_A)$ be an algebra in $\mathcal{C}_1$. Define morphisms $m_{F(A)} : F(A) \otimes F(A) \to F(A)$ and $\eta_{F(A)} : \mathbf{1}_2 \to F(A)$ as
\[
m_{F(A)} = F(m_A) \circ \phi^F_2, \qquad \eta_{F(A)} = F(\eta_A) \circ \phi^F_0.
\tag{2.17}
\]
Then $(F(A), m_{F(A)}, \eta_{F(A)})$ is an algebra in $\mathcal{C}_2$ [JS, Prop. 5.5]. If $f : A \to B$ is an algebra homomorphism between two algebras $A, B \in \mathcal{C}_1$, then $F(f) : F(A) \to F(B)$ is also an algebra homomorphism. If $(M, m_M)$ is a left (or right) $A$-module in $\mathcal{C}_1$, then $(F(M), F(m_M) \circ \phi^F_2)$ is a left (or right) $F(A)$-module; if $M$ has an $A$-bimodule structure, then $F(M)$ naturally has an $F(A)$-bimodule structure.

Similarly, if $(A, \Delta_A, \varepsilon_A)$ is a coalgebra in $\mathcal{C}_1$ and $F : \mathcal{C}_1 \to \mathcal{C}_2$ is a colax tensor functor, then $F(A)$ with coproduct $\Delta_{F(A)} : F(A) \to F(A) \otimes F(A)$ and counit $\varepsilon_{F(A)} : F(A) \to \mathbf{1}_2$ given by
\[
\Delta_{F(A)} = \psi^F_2 \circ F(\Delta_A), \qquad \varepsilon_{F(A)} = \psi^F_0 \circ F(\varepsilon_A),
\tag{2.18}
\]
is a coalgebra in $\mathcal{C}_2$. If $f : A \to B$ is a coalgebra homomorphism between two coalgebras $A, B \in \mathcal{C}_1$, then $F(f) : F(A) \to F(B)$ is also a coalgebra homomorphism.

2 To see this note that if $U$ is simple, then the $\mathbb{C}$-vector space $\mathrm{Hom}(U, U)$ is a division algebra, and hence $\mathrm{Hom}(U, U) = \mathbb{C}\, \mathrm{id}_U$. Conversely, if $U$ is not simple, then $U = U_1 \oplus U_2$ and $\mathrm{Hom}(U, U)$ contains at least two linearly independent elements, namely $\mathrm{id}_{U_1}$ and $\mathrm{id}_{U_2}$.
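The statement that a lax tensor functor maps algebras to algebras can be checked directly. The following is a sketch of the associativity check only, written for strict tensor categories (so that $\alpha$ and $F(\alpha)$ are identities) and with object labels on $\phi^F_2$ suppressed; the unit properties follow from (2.2) in the same way.

```latex
% Sketch: associativity of m_{F(A)} = F(m_A) \circ \phi_2^F in the strict case.
\begin{align*}
m_{F(A)} \circ (m_{F(A)} \otimes \mathrm{id}_{F(A)})
  &= F(m_A) \circ \phi^F_2 \circ (F(m_A) \otimes \mathrm{id}_{F(A)}) \circ (\phi^F_2 \otimes \mathrm{id}_{F(A)})
  && \text{by (2.17)} \\
  &= F\big( m_A \circ (m_A \otimes \mathrm{id}_A) \big) \circ \phi^F_2 \circ (\phi^F_2 \otimes \mathrm{id}_{F(A)})
  && \text{naturality of } \phi^F_2 \\
  &= F\big( m_A \circ (\mathrm{id}_A \otimes m_A) \big) \circ \phi^F_2 \circ (\mathrm{id}_{F(A)} \otimes \phi^F_2)
  && \text{associativity of } m_A \text{ and (2.1)} \\
  &= F(m_A) \circ \phi^F_2 \circ (\mathrm{id}_{F(A)} \otimes F(m_A)) \circ (\mathrm{id}_{F(A)} \otimes \phi^F_2)
  && \text{naturality of } \phi^F_2 \\
  &= m_{F(A)} \circ (\mathrm{id}_{F(A)} \otimes m_{F(A)}).
\end{align*}
```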



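Proposition 2.13 (see footnote 3). If $F : \mathcal{C}_1 \to \mathcal{C}_2$ is a Frobenius functor and $(A, m_A, \eta_A, \Delta_A, \varepsilon_A)$ a Frobenius algebra in $\mathcal{C}_1$, then $(F(A), m_{F(A)}, \eta_{F(A)}, \Delta_{F(A)}, \varepsilon_{F(A)})$ is a Frobenius algebra in $\mathcal{C}_2$.

Proof. One Frobenius property, $(m_{F(A)} \otimes \mathrm{id}_{F(A)}) \circ (\mathrm{id}_{F(A)} \otimes \Delta_{F(A)}) = \Delta_{F(A)} \circ m_{F(A)}$, follows from the commutativity of the following diagram (we spell out the associativity isomorphisms):

\[
\begin{tikzcd}[column sep=large]
F(A) \otimes F(A) \arrow[r, "\mathrm{id}_{F(A)} \otimes F(\Delta_A)"] \arrow[d, "\phi^F_2"'] & F(A) \otimes F(A \otimes A) \arrow[r, "\mathrm{id}_{F(A)} \otimes \psi^F_2"] \arrow[d, "\phi^F_2"] & F(A) \otimes (F(A) \otimes F(A)) \arrow[r, "{\alpha_{F(A),F(A),F(A)}}"] & (F(A) \otimes F(A)) \otimes F(A) \arrow[d, "\phi^F_2 \otimes \mathrm{id}_{F(A)}"] \\
F(A \otimes A) \arrow[r, "F(\mathrm{id}_A \otimes \Delta_A)"] \arrow[d, "F(m_A)"'] & F(A \otimes (A \otimes A)) \arrow[r, "{F(\alpha_{A,A,A})}"] & F((A \otimes A) \otimes A) \arrow[r, "\psi^F_2"] \arrow[d, "F(m_A \otimes \mathrm{id}_A)"] & F(A \otimes A) \otimes F(A) \arrow[d, "F(m_A) \otimes \mathrm{id}_{F(A)}"] \\
F(A) \arrow[rr, "F(\Delta_A)"] & & F(A \otimes A) \arrow[r, "\psi^F_2"] & F(A) \otimes F(A)
\end{tikzcd}
\tag{2.19}
\]

The commutativity of the upper-left subdiagram follows from the naturalness of $\phi^F_2$, that of the upper-right subdiagram follows from (2.5), that of the lower-left subdiagram follows from the Frobenius properties of $A$, and that of the lower-right subdiagram follows from the naturalness of $\psi^F_2$. The proof of the other Frobenius property is similar.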
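For readers who prefer equations to diagram chases, the argument can be traced as a chain of equalities. This is a sketch in the strict setting (associators suppressed, so that (2.5) reads $(\phi^F_2 \otimes \mathrm{id}) \circ (\mathrm{id} \otimes \psi^F_2) = \psi^F_2 \circ \phi^F_2$), with object labels on $\phi^F_2$ and $\psi^F_2$ omitted.

```latex
% Sketch: one Frobenius property of F(A), strict case, following diagram (2.19).
\begin{align*}
(m_{F(A)} \otimes \mathrm{id}_{F(A)}) \circ (\mathrm{id}_{F(A)} \otimes \Delta_{F(A)})
  &= (F(m_A) \otimes \mathrm{id}_{F(A)}) \circ (\phi^F_2 \otimes \mathrm{id}_{F(A)})
     \circ (\mathrm{id}_{F(A)} \otimes \psi^F_2) \circ (\mathrm{id}_{F(A)} \otimes F(\Delta_A))
  && \text{by (2.17), (2.18)} \\
  &= (F(m_A) \otimes \mathrm{id}_{F(A)}) \circ \psi^F_2 \circ \phi^F_2 \circ (\mathrm{id}_{F(A)} \otimes F(\Delta_A))
  && \text{by (2.5)} \\
  &= \psi^F_2 \circ F(m_A \otimes \mathrm{id}_A) \circ F(\mathrm{id}_A \otimes \Delta_A) \circ \phi^F_2
  && \text{naturality of } \psi^F_2, \phi^F_2 \\
  &= \psi^F_2 \circ F(\Delta_A \circ m_A) \circ \phi^F_2
  && \text{Frobenius property of } A \\
  &= \Delta_{F(A)} \circ m_{F(A)}.
\end{align*}
```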

Proposition 2.14. If $F : \mathcal{C}_1 \to \mathcal{C}_2$ is a tensor functor and $A$ a Frobenius algebra in $\mathcal{C}_1$, then:
(i) $F(A)$ has a natural structure of Frobenius algebra as given in Proposition 2.13;
(ii) If $A$ is (normalised-)special, so is $F(A)$.

Proof. Part (i) follows from Propositions 2.5 and 2.13. Part (ii) is a straightforward verification of the definition, using $\psi^F_2 = (\phi^F_2)^{-1}$ and $\psi^F_0 = (\phi^F_0)^{-1}$.

Let $\mathcal{C}_1$, $\mathcal{C}_2$ be sovereign tensor categories and $F : \mathcal{C}_1 \to \mathcal{C}_2$ a Frobenius functor. We define two morphisms $I_{F(A^\vee)}, I'_{F(A^\vee)} : F(A^\vee) \to F(A)^\vee$, for a Frobenius algebra $A$ in $\mathcal{C}_1$, as follows:
\begin{align*}
I_{F(A^\vee)} &= \big( (\psi^F_0 \circ F(d_A) \circ \phi^F_2) \otimes \mathrm{id}_{F(A)^\vee} \big) \circ (\mathrm{id}_{F(A^\vee)} \otimes b_{F(A)}), \\
I'_{F(A^\vee)} &= \big( \mathrm{id}_{F(A)^\vee} \otimes (\psi^F_0 \circ F(\tilde d_A) \circ \phi^F_2) \big) \circ (\tilde b_{F(A)} \otimes \mathrm{id}_{F(A^\vee)}).
\tag{2.20}
\end{align*}
It is easy to see that these are isomorphisms.

Lemma 2.15. If $F : \mathcal{C}_1 \to \mathcal{C}_2$ is a Frobenius functor and $A$ a Frobenius algebra in $\mathcal{C}_1$, then
\[
\Phi_{F(A)} = I_{F(A^\vee)} \circ F(\Phi_A), \qquad \Phi'_{F(A)} = I'_{F(A^\vee)} \circ F(\Phi'_A).
\tag{2.21}
\]

3 After the preprint of the present paper appeared we noticed that this proposition is also proved in [DP, Cor. 5].


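Proof. We only prove the first equality; the second one can be seen in the same way. By definition, we have
\begin{align*}
I_{F(A^\vee)} \circ F(\Phi_A)
  &= \big( (\psi^F_0 \circ F(d_A) \circ \phi^F_2) \otimes \mathrm{id}_{F(A)^\vee} \big) \circ (\mathrm{id}_{F(A^\vee)} \otimes b_{F(A)}) \circ F(\Phi_A) \\
  &= \Big[ \big( \psi^F_0 \circ F(d_A) \circ \phi^F_2 \circ (F(\Phi_A) \otimes \mathrm{id}_{F(A)}) \big) \otimes \mathrm{id}_{F(A)^\vee} \Big] \circ (\mathrm{id}_{F(A)} \otimes b_{F(A)}).
\end{align*}
For the term inside the square brackets we find
\begin{align*}
\psi^F_0 \circ F(d_A) \circ \phi^F_2 \circ (F(\Phi_A) \otimes \mathrm{id}_{F(A)})
  &= \psi^F_0 \circ F(d_A) \circ F(\Phi_A \otimes \mathrm{id}_A) \circ \phi^F_2 \\
  &= \psi^F_0 \circ F\big( d_A \circ (\Phi_A \otimes \mathrm{id}_A) \big) \circ \phi^F_2 \\
  &= \psi^F_0 \circ F(\varepsilon_A \circ m_A) \circ \phi^F_2.
\tag{2.22}
\end{align*}
On the other hand, by definition, $\Phi_{F(A)} = \big[ (\psi^F_0 \circ F(\varepsilon_A) \circ F(m_A) \circ \phi^F_2) \otimes \mathrm{id}_{F(A)^\vee} \big] \circ (\mathrm{id}_{F(A)} \otimes b_{F(A)})$. This demonstrates the first equality in (2.21).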

Proposition 2.16. Let F : C1 → C2 be a tensor functor, G : C2 → C1 a functor, F, G, χ an adjunction, A a C1 -algebra, and B a C2 -algebra. Then f : A → G(B) is an algebra homomorphism if and only if f˜ = χ −1 ( f ) : F(A) → B is an algebra homomorphism. Proof. We need to show that m G(B) ◦ ( f ⊗ f ) = f ◦ m A and f ◦ η A = ηG(B) ,

(2.23)

m B ◦ ( f˜ ⊗ f˜) = f˜ ◦ m F(A) and f˜ ◦ η F(A) = η B s.

(2.24)

is equivalent to

We first prove that the first identity in (2.23) is equivalent to the first identity in (2.24). For the left-hand side of the first identity in (2.23) we have the following equalities: (1)

m G(B) ◦ ( f ⊗ f ) = G(m B ) ◦ φ2G ◦ ( f ⊗ f ) (2)

= G(m B ) ◦ G(ρ ⊗ ρ) ◦ G(ψ2F ) ◦ δ ◦ ( f ⊗ f )

(3)

= G(m B ) ◦ G(ρ ⊗ ρ) ◦ G(ψ2F ) ◦ G F( f ⊗ f ) ◦ δ

(4)

= G(m B ) ◦ G(ρ ⊗ ρ) ◦ G(F( f ) ⊗ F( f )) ◦ G(ψ2F ) ◦ δ (5) = G m B ◦ (ρ ⊗ ρ) ◦ (F( f ) ⊗ F( f )) ◦ ψ2F ◦ δ (6) (2.25) = χ m B ◦ (ρ ⊗ ρ) ◦ (F( f ) ⊗ F( f )) ◦ ψ2F ,

where (1) is the definition of m_{G(B)} in (2.17), (2) is the second identity in (2.10), (3) and (4) are naturality of δ and ψ_2^F, respectively, step (5) is functoriality of G and finally step (6) is (2.9). For the right-hand side of the first identity in (2.23) we get

f ◦ m_A
  (1)= Gρ ◦ δG ◦ (f ◦ m_A)
  (2)= Gρ ◦ GF(f ◦ m_A) ◦ δ
  (3)= G(ρ ◦ F(f ◦ m_A)) ◦ δ
  (4)= χ(ρ ◦ F(f) ◦ F(m_A)),   (2.26)



where (1) is the adjunction property (2.8), (2) is naturality of δ, (3) functoriality of G, and (4) amounts to (2.9) and functoriality of F. On the other hand, we see that the first equality in (2.24) is equivalent to

m_B ◦ (ρ ⊗ ρ) ◦ (F(f) ⊗ F(f)) = ρ ◦ F(f) ◦ F(m_A) ◦ φ_2^F.   (2.27)

Using that φ_2^F is invertible with inverse (φ_2^F)^{-1} = ψ_2^F and that χ is an isomorphism, it follows that the statement that (2.25) is equal to (2.26) is equivalent to the identity (2.27).

Now we prove that the second identity in (2.23) is equivalent to the second identity in (2.24). Using (2.17) and (2.10) we can write η_{G(B)} = G(η_B) ◦ φ_0^G = G(η_B) ◦ G(ψ_0^F) ◦ δ_1. Together with (2.9) this shows that the second identity in (2.23) is equivalent to

f ◦ η_A = χ(η_B ◦ ψ_0^F).   (2.28)

On the other hand, the second identity in (2.24) is equivalent to

ρ ◦ F(f) ◦ F(η_A) ◦ φ_0^F = η_B,   (2.29)

which, by φ_0^F = (ψ_0^F)^{-1} and (2.9), is further equivalent to (2.28).

Definition 2.17. Let (A, m_A, η_A, Δ_A, ε_A) and (B, m_B, η_B, Δ_B, ε_B) be two Frobenius algebras in a tensor category C. For f : A → B, we define f^* : B → A by

f^* = ((ε_B ◦ m_B) ⊗ id_A) ◦ (id_B ⊗ f ⊗ id_A) ◦ (id_B ⊗ (Δ_A ◦ η_A)).   (2.30)

The following lemma is immediate from the definition of (·)^* and the properties of Frobenius algebras. We omit the proof.

Lemma 2.18. Let C be a tensor category, let A, B, C be Frobenius algebras in C, and let f : A → B and g : B → C be morphisms.
(i) (g ◦ f)^* = f^* ◦ g^*.
(ii) f is a monomorphism iff f^* is an epimorphism.
(iii) f is an algebra map iff f^* is a coalgebra map.
(iv) If f is a homomorphism of Frobenius algebras, then f^* ◦ f = id_A and f ◦ f^* = id_B.
(v) If C is sovereign and if A and B are symmetric, then f^{**} = f.
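As a sanity check of Definition 2.17 and Lemma 2.18 in the simplest setting, the following sketch evaluates (2.30) in the category of finite-dimensional vector spaces, where it reduces to f^*(b) = Σ_i ε_B(b · f(e_i)) e^i with Δ_A(η_A) = Σ_i e_i ⊗ e^i. The choice A = B = C[x]/x^2 with counit ε(ax + b) = a, the maps f and g, and all function names are assumptions made for illustration only; they are not taken from the text above.

```python
from fractions import Fraction

# Elements b + a*x of C[x]/(x^2) are stored as pairs (b, a).
ONE = (Fraction(1), Fraction(0))
X = (Fraction(0), Fraction(1))

def mul(u, v):
    """Product in C[x]/(x^2), using x*x = 0."""
    return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])

def eps(u):
    """Counit eps(a*x + b) = a (assumed choice of Frobenius form)."""
    return u[1]

# Delta(eta) = 1 (x) x + x (x) 1 for this counit, stored as pairs (e_i, e^i).
COPAIRING = [(ONE, X), (X, ONE)]

def linear(image_of_one, image_of_x):
    """Linear map fixed by its values on the basis (1, x)."""
    def f(u):
        return (u[0] * image_of_one[0] + u[1] * image_of_x[0],
                u[0] * image_of_one[1] + u[1] * image_of_x[1])
    return f

def star(f):
    """Vector-space form of (2.30): f*(b) = sum_i eps(b * f(e_i)) e^i."""
    def f_star(b):
        out = (Fraction(0), Fraction(0))
        for e_i, e_dual in COPAIRING:
            c = eps(mul(b, f(e_i)))
            out = (out[0] + c * e_dual[0], out[1] + c * e_dual[1])
        return out
    return f_star

def compose(g, f):
    return lambda u: g(f(u))

def same(f, g):
    """Two linear maps agree iff they agree on the basis (1, x)."""
    return f(ONE) == g(ONE) and f(X) == g(X)

# Hypothetical test maps: f is the algebra map x -> 3x, g is a generic linear map.
f = linear(ONE, (Fraction(0), Fraction(3)))
g = linear((Fraction(2), Fraction(1)), (Fraction(5), Fraction(-1)))

anti_multiplicative = same(star(compose(g, f)), compose(star(f), star(g)))  # (i)
involutive = same(star(star(g)), g)                                         # (v)
f_star_f = [compose(star(f), f)(u) for u in (ONE, X)]                       # 3 * id
```

Here f is an algebra map but does not preserve the counit (ε ◦ f = 3ε), so it is not a homomorphism of Frobenius algebras; consistently with part (iv), f^* ◦ f comes out as 3 · id rather than id.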

Let C and D be tensor categories and let F : C → D be a Frobenius functor. Given Frobenius algebras A, B in C and a morphism f : A → B, the next lemma shows how (·)^* behaves under F.

Lemma 2.19. F(f^*) = F(f)^*.

Proof. The definition of the structure morphisms of the Frobenius algebra F(A) is given in (2.17) and (2.18). Substituting these definitions gives

F(f)^* = ((ψ_0^F ◦ F(ε_B) ◦ F(m_B) ◦ φ_2^F) ⊗ id_{F(A)}) ◦ (id_{F(B)} ⊗ F(f) ⊗ id_{F(A)}) ◦ (id_{F(B)} ⊗ (ψ_2^F ◦ F(Δ_A) ◦ F(η_A) ◦ φ_0^F))
  = (ψ_0^F ⊗ id_{F(A)}) ◦ (F(ε_B ◦ m_B ◦ (id_B ⊗ f)) ⊗ id_{F(A)})
    ◦ (φ_2^F ⊗ id_{F(A)}) ◦ (id_{F(B)} ⊗ ψ_2^F)
    ◦ (id_{F(B)} ⊗ F(Δ_A ◦ η_A)) ◦ (id_{F(B)} ⊗ φ_0^F).   (2.31)


In the middle line of the last expression we can use the defining property (2.5) of F, namely we substitute (φ_2^F ⊗ id_{F(A)}) ◦ (id_{F(B)} ⊗ ψ_2^F) = ψ_2^F ◦ φ_2^F. Then ψ_2^F can be moved to the left, and φ_2^F to the right, until they can be omitted against ψ_0^F and φ_0^F, respectively, using (2.2) and (2.4). This results in

F(f)^* = F((ε_B ◦ m_B ◦ (id_B ⊗ f)) ⊗ id_A) ◦ F(id_B ⊗ (Δ_A ◦ η_A)),   (2.32)

which is nothing but F(f^*).

2.3. Modular tensor categories. Let C be a modular tensor category [T,BK], i.e. an abelian semi-simple finite C-linear ribbon category with simple tensor unit 1 and a nondegeneracy condition on the braiding (to be stated in a moment). We denote the set of equivalence classes of simple objects in C by I, elements in I by i, j, k ∈ I and their representatives by U_i, U_j, U_k. We also set U_0 = 1 and for an index k ∈ I we define k̄ by U_k̄ ≅ U_k^∨. Since the tensor unit is simple, we shall for modular tensor categories identify Hom(1, 1) ≅ C (cf. footnote 2). Define numbers s_{i,j} ∈ C by⁴

s_{i,j} = [diagram of ribbons labelled U_i and U_j; not reproduced].   (2.33)

They obey s_{i,j} = s_{j,i} and s_{0,i} = dim U_i, see e.g. [BK, Sect. 3.1]. (In a ribbon category the left and right dimension (2.14) of U_i coincide and are denoted by dim U_i.) The non-degeneracy condition on the braiding of a modular tensor category is that the |I|×|I|-matrix s should be invertible. In fact [BK, Thm. 3.1.7],

Σ_{k∈I} s_{ik} s_{kj} = Dim C δ_{i,j̄},   (2.34)

where Dim C = Σ_{i∈I} (dim U_i)^2. One can show (even in the weaker context of fusion categories over C) that Dim C ≥ 1 [ENO1, Thm. 2.3]. In particular, Dim C ≠ 0. We fix once and for all a square root √(Dim C) of Dim C.
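The identities s_{i,j} = s_{j,i}, s_{0,i} = dim U_i and (2.34) can be checked numerically in a small example. The sketch below uses the Ising category with simple objects ordered (1, σ, ψ); this example, its s-matrix entries and all variable names are assumptions brought in for illustration and are not part of the text above. Since every simple object of this example is self-dual, δ_{i,j̄} is the identity matrix.

```python
import math

# Assumed example data: unnormalised s-matrix of the Ising category,
# simple objects ordered (1, sigma, psi); every object is self-dual.
R2 = math.sqrt(2.0)
s = [[1.0, R2, 1.0],
     [R2, 0.0, -R2],
     [1.0, -R2, 1.0]]

dims = s[0]                      # s_{0,i} = dim U_i
Dim = sum(d * d for d in dims)   # Dim C = sum_i (dim U_i)^2

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small floating-point tolerance."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

symmetric = close(s, [list(col) for col in zip(*s)])

# All objects are self-dual, so delta_{i, j-bar} is the identity matrix.
target = [[Dim if i == j else 0.0 for j in range(3)] for i in range(3)]
satisfies_2_34 = close(matmul(s, s), target)
```

In this example Dim C = 1 + 2 + 1 = 4, so the fixed square root √(Dim C) can be taken to be 2.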

Let us fix a basis {λ^α_{(i,j)k}}_{α=1}^{N_{ij}^k} in Hom_C(U_i ⊗ U_j, U_k) and the dual basis {ϒ_α^{(i,j)k}}_{α=1}^{N_{ij}^k} in Hom_C(U_k, U_i ⊗ U_j). The duality of the bases means that λ^α_{(i,j)k} ◦ ϒ_β^{(i,j)k} = δ_{α,β} id_{U_k}. We also fix λ_{(0,i)i} = λ_{(i,0)i} = id_{U_i}. We denote the basis vectors graphically as follows:

[diagrams for λ^α_{(i,j)k} : U_i ⊗ U_j → U_k and ϒ_α^{(i,j)k} : U_k → U_i ⊗ U_j, each a trivalent vertex labelled α; not reproduced].   (2.35)

4 In the graphical notation used below, we have given an orientation to the ribbons indicated by the arrows. For example, it is understood that this orientation determines which of the duality morphisms in (2.13) to use.


For V ∈ C we also choose a basis {b_V^{(i;α)}} of Hom_C(V, U_i) and the dual basis {b^V_{(i;β)}} of Hom_C(U_i, V) for i ∈ I such that b_V^{(i;α)} ◦ b^V_{(i;β)} = δ_{αβ} id_{U_i}. We use the graphical notation

[diagrams for b_V^{(i;α)} : V → U_i and b^V_{(i;α)} : U_i → V, each a vertex labelled α; not reproduced].   (2.36)

Given two modular tensor categories C and D, by C ⊠ D we mean the tensor product of additive categories over C [BK, Def. 1.1.15], i.e. the category whose objects are direct sums of pairs V × W of objects V ∈ C and W ∈ D and whose morphism spaces are

Hom_{C⊠D}(V × W, V′ × W′) = Hom_C(V, V′) ⊗_C Hom_D(W, W′)   (2.37)

for pairs, and direct sums of these if the objects are direct sums of pairs. If we replace the braiding and the twist in C by the antibraiding c^{-1} and the antitwist θ^{-1} respectively, we obtain another ribbon category structure on C. In order to distinguish these two distinct structures, we denote (C, c, θ) and (C, c^{-1}, θ^{-1}) by C_+ and C_- respectively. As in the Introduction, we will abbreviate

C^2_± = C_+ ⊠ C_-.   (2.38)

Note that a set of representatives of the simple objects in C^2_± is given by U_i × U_j for i, j ∈ I. For the remainder of Sect. 2 we fix a modular tensor category C.

2.4. The functors T and R. The tensor product bifunctor ⊗ can be naturally extended to a functor T : C^2_± → C. Namely, T(⊕_{i=1}^N V_i × W_i) = ⊕_{i=1}^N V_i ⊗ W_i for all V_i, W_i ∈ C and N ∈ N. The functor T becomes a tensor functor as follows. For φ_0^T : 1 → T(1 × 1) take φ_0^T = id_1 (or l_1^{-1} in the non-strict case). Next notice that, for U, V, W, X ∈ C,

T(U × V) ⊗ T(W × X) = (U ⊗ V) ⊗ (W ⊗ X),
T((U × V) ⊗ (W × X)) = (U ⊗ W) ⊗ (V ⊗ X).   (2.39)

We define φ_2^T : T(U × V) ⊗ T(W × X) → T((U × V) ⊗ (W × X)) by

φ_2^T = id_U ⊗ c^{-1}_{WV} ⊗ id_X.   (2.40)

(In the non-strict case the appropriate associators have to be added.) The above definition of φ_2^T can be naturally extended to a morphism φ_2^T : T(M_1) ⊗ T(M_2) → T(M_1 ⊗ M_2) for any pair of objects M_1, M_2 in C^2_±. The following result can be checked by direct calculation [JS, Prop. 5.2].

Lemma 2.20. The triple (T, φ_2^T, φ_0^T) gives a tensor functor.


In particular, (T, φ_2^T, φ_0^T, ψ_2^T, ψ_0^T), where ψ_2^T = (φ_2^T)^{-1} and ψ_0^T = (φ_0^T)^{-1}, gives a Frobenius functor.

Define the functor R : C → C^2_± as follows: for A ∈ C and f ∈ Hom_C(A, B),

R(A) = ⊕_{i∈I} (A ⊗ U_i^∨) × U_i,   R(f) = ⊕_{i∈I} (f ⊗ id_{U_i^∨}) × id_{U_i}.   (2.41)

This functor was also considered in a slightly different context in [ENO2, Prop. 2.3]. The family of isomorphisms γ_A^R = ⊕_{i∈I} (Dim C / dim U_i) id_{(A⊗U_i^∨)×U_i} ∈ Aut(R(A)) defines a natural isomorphism γ^R : R → R.

Our next aim is to show that R is left and right adjoint to T, in other words R and T form an ambidextrous adjunction (see e.g. [Ld] for a discussion of ambidextrous adjunctions). To this end we introduce two linear isomorphisms, for A ∈ C and M ∈ C^2_±,

χ̂ : Hom_C(T(M), A) → Hom_{C^2_±}(M, R(A)),
χ̌ : Hom_C(A, T(M)) → Hom_{C^2_±}(R(A), M).   (2.42)

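If we decompose M as M = ⊕_{n=1}^N M_n^l × M_n^r, then χ̂ and χ̌ are given by

χ̂ : ⊕_{n=1}^N f_n ↦ ⊕_{n=1}^N ⊕_{i∈I} Σ_α [diagram built from f_n and basis morphisms labelled α, with strands M_n^l, M_n^r, U_i^∨, U_i and A; not reproduced]   (2.43)

and

χ̌ : ⊕_{n=1}^N g_n ↦ ⊕_{n=1}^N ⊕_{i∈I} Σ_α (Dim C / dim U_i) [diagram built from g_n and basis morphisms labelled α, with strands M_n^l, M_n^r, U_i^∨, U_i and A; not reproduced].   (2.44)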

Notice that χ̂ and χ̌ are independent of the choice of basis.

Theorem 2.21. ⟨T, R, χ̂⟩ and ⟨R, T, χ̌^{-1}⟩ are adjunctions, i.e. R is both left and right adjoint of T.

Proof. Write M as M = ⊕_{n=1}^N M_n^l × M_n^r. The isomorphism χ̂ amounts to the following composition of natural isomorphisms,

Hom_C(T(M), A) = ⊕_n Hom_C(M_n^l ⊗ M_n^r, A)
  ≅ ⊕_{n,i} Hom_C(M_n^l ⊗ U_i, A) ⊗ Hom_C(M_n^r, U_i)
  ≅ ⊕_{n,i} Hom_C(M_n^l, A ⊗ U_i^∨) ⊗ Hom_C(M_n^r, U_i)
  = Hom_{C^2_±}(M, R(A)).   (2.45)



Thus χ̂ is natural. Let (γ_A^R)^* : Hom_{C^2_±}(R(A), M) → Hom_{C^2_±}(R(A), M) denote the pull-back of γ_A^R. The isomorphism χ̌ is equal to the composition of (γ_A^R)^* and the following sequence of natural isomorphisms:

Hom_C(A, T(M)) = ⊕_n Hom_C(A, M_n^l ⊗ M_n^r)
  ≅ ⊕_{n,i} Hom_C(A, M_n^l ⊗ U_i) ⊗ Hom_C(U_i, M_n^r)
  ≅ ⊕_{n,i} Hom_C(A ⊗ U_i^∨, M_n^l) ⊗ Hom_C(U_i, M_n^r)
  = Hom_{C^2_±}(R(A), M).   (2.46)

We have proved that both χ̂ and χ̌ are natural isomorphisms.

There are four natural transformations associated to χ̂ and χ̌, namely

id_{C^2_±} --δ̂--> RT --ρ̌--> id_{C^2_±}   and   id_C --δ̌--> TR --ρ̂--> id_C,   (2.47)

defined by, for A ∈ C, M ∈ C^2_±,

δ̂_M = χ̂(id_{T(M)}),   ρ̂_A = χ̂^{-1}(id_{R(A)}),   ρ̌_M = χ̌(id_{T(M)}),   δ̌_A = χ̌^{-1}(id_{R(A)}).   (2.48)

They can be expressed graphically as follows, with M = ⊕_{n=1}^N M_n^l × M_n^r,

[diagrams for δ̂_M (a sum over n, i and α), ρ̂_A (a sum over i ∈ I), ρ̌_M (a sum over n, i and α with prefactor Dim C / dim U_i) and δ̌_A (a sum over i ∈ I with prefactor dim U_i / Dim C); not reproduced].   (2.49)

Note that

ρ̌_M ◦ δ̂_M = Dim C · id_M   and   ρ̂_A ◦ δ̌_A = id_A.   (2.50)

Lemma 2.22. The functors T and R as maps on the sets of morphisms have left inverses, and thus are injective.

Proof. Let f : A → B be a morphism in C. We define a map Q_R : Hom_{C^2_±}(R(A), R(B)) → Hom_C(A, B) by f′ ↦ ρ̂_B ◦ T(f′) ◦ δ̌_A. Then we have

Q_R ◦ R(f) = ρ̂_B ◦ TR(f) ◦ δ̌_A = ρ̂_B ◦ δ̌_B ◦ f = f,   (2.51)

where we used naturality of δ̌ and (2.50) in the second and third equalities, respectively. So Q_R is a left inverse of R on morphisms. Thus R is injective on morphisms. Similarly, let g : M → N be a morphism in C^2_±. We define a map Q_T : Hom_C(T(M), T(N)) → Hom_{C^2_±}(M, N) by g′ ↦ (Dim C)^{-1} · ρ̌_N ◦ R(g′) ◦ δ̂_M. Then we have

Q_T ◦ T(g) = (Dim C)^{-1} · ρ̌_N ◦ RT(g) ◦ δ̂_M = (Dim C)^{-1} · ρ̌_N ◦ δ̂_N ◦ g = g.   (2.52)

So Q_T is a left inverse of T on morphisms. Thus T is injective on morphisms.

Using (2.9) and (2.50), one can express the two inverse maps χ̂^{-1}, χ̌^{-1} as follows, for f ∈ Hom_{C^2_±}(M, R(A)) and g ∈ Hom_{C^2_±}(R(A), M),

χ̂^{-1}(f) = ρ̂ ◦ T(f),   χ̌^{-1}(g) = T(g) ◦ δ̌.   (2.53)

By Proposition 2.5 and Lemma 2.7, R is both a lax and colax tensor functor. In particular, φ_0^R : 1 × 1 → R(1) is given by

φ_0^R = χ̂(ψ_0^T) = R(ψ_0^T) ◦ δ̂_{1×1} = id_{1×1}   (2.54)

and φ_2^R : R(A) ⊗ R(B) → R(A ⊗ B) by φ_2^R = R(ρ̂_A ⊗ ρ̂_B) ◦ R(ψ_2^T) ◦ δ̂, which can be expressed graphically as

φ_2^R = ⊕_{i,j,k∈I} Σ_α [diagram with strands A, B, U_i^∨, U_j^∨, U_k^∨ in the first factor and U_i, U_j, U_k in the second; not reproduced].   (2.55)

Similarly, ψ_0^R : R(1) → 1 × 1 is given by

ψ_0^R = ρ̌_{1×1} ◦ R(φ_0^T) = Dim C id_{1×1}   (2.56)

and ψ_2^R : R(A ⊗ B) → R(A) ⊗ R(B) by ψ_2^R = ρ̌ ◦ R(φ_2^T) ◦ R(δ̌_A ⊗ δ̌_B), which in graphical notation reads

ψ_2^R = ⊕_{i,j,k∈I} Σ_α (dim U_i dim U_j / (dim U_k Dim C)) [diagram with strands A, B, U_i^∨, U_j^∨, U_k^∨ in the first factor and U_i, U_j, U_k in the second; not reproduced].   (2.57)

If C has more than one simple object, then R does not take the tensor unit of C to the tensor unit of C^2_± and so is clearly not a tensor functor. However, we will show that R is still a Frobenius functor. This will imply that if A is a Frobenius algebra in C, then

R(A) = (R(A), m_{R(A)}, η_{R(A)}, Δ_{R(A)}, ε_{R(A)})   (2.58)

is a Frobenius algebra in C^2_±, where the structure morphisms were given in (2.17) and (2.18). In the case A = 1 it was proved in [Mü1, Prop. 4.1] (see also [Fr, Lem. 6.19] and [K1, Thm. 5.2]) that (2.58) is a commutative simple symmetric normalised-special Frobenius algebra in C^2_±. In fact, given a Frobenius algebra A in C, it is straightforward to verify that the structure morphisms in (2.58) are precisely those of (A × 1) ⊗ R(1), cf. Sect. 2.2.

Proposition 2.23. (R, φ_2^R, φ_0^R, ψ_2^R, ψ_0^R) is a Frobenius functor.

Proof. Using the explicit graphical expression of φ_2^R, φ_0^R, ψ_2^R, ψ_0^R, it is easy to see that the commutativity of the diagrams (2.5) and (2.6) is equivalent to the statement that R(1) with structure morphisms as in (2.58) is a Frobenius algebra in C^2_±. The latter statement is true by [Mü1, Prop. 4.1].

From Lemma 2.20 and Proposition 2.23 we see that T and R take Frobenius algebras to Frobenius algebras. The following two propositions show how the properties of Frobenius algebras are transported.

Proposition 2.24. Let A be a Frobenius algebra in C^2_±. Then T(A) is a Frobenius algebra in C and
(i) A is symmetric iff T(A) is symmetric.
(ii) A is (normalised-)special iff T(A) is (normalised-)special.

Proof. For part (i) write A as a direct sum ⊕_{n=1}^N A_n^l × A_n^r. Then the maps I_{T(A)}, I′_{T(A)} : T(A^∨) → T(A)^∨ defined in (2.20) are given by:

I_{T(A)} = I′_{T(A)} = ⊕_{n=1}^N c_{(A_n^l)^∨,(A_n^r)^∨}.   (2.59)

Therefore, by (2.21), Φ_{T(A)} = Φ′_{T(A)} is equivalent to T(Φ_A) = T(Φ′_A). Since by Lemma 2.22, T is injective on morphisms, this proves part (i). Part (ii) can be checked in the same way, for example the condition m_{T(A)} ◦ Δ_{T(A)} = ζ id_{T(A)} is easily checked to be equivalent to T(m_A ◦ Δ_A) = ζ T(id_A).

Proposition 2.25. Let A be a Frobenius algebra in C. Then R(A) is a Frobenius algebra in C^2_± and
(i) A is symmetric iff R(A) is symmetric.
(ii) A is (normalised-)special iff R(A) is (normalised-)special.

Proof. Recall that the structure morphisms of the Frobenius algebra R(A) are equal to those of (A × 1) ⊗ R(1). Using this equality, parts (i) and (ii) follow because R(1) is symmetric and normalised-special. For example,

m_{R(A)} ◦ Δ_{R(A)} = [(m_A ◦ Δ_A) × id_1] ⊗ id_{R(1)} = R(m_A ◦ Δ_A),   (2.60)

so that m_{R(A)} ◦ Δ_{R(A)} = ζ id_{R(A)} is equivalent to R(m_A ◦ Δ_A) = ζ R(id_A), which by Lemma 2.22 is equivalent to m_A ◦ Δ_A = ζ id_A.

The functor R has one additional property not shared by T , namely R takes absolutely simple algebras to absolutely simple algebras. We will see explicitly in Sect. 3.3 that this is not true for T .


Lemma 2.26. For a C-algebra A, the map

R : Hom_{A|A}(A, A) → Hom_{R(A)|R(A)}(R(A), R(A))   (2.61)

given by f ↦ R(f) is well-defined and an isomorphism.

Proof. Since R is a lax tensor functor, R(A) is naturally a R(A)-bimodule. It is easy to see that R in (2.61) is a well-defined map. R(A) is also naturally a R(1)-bimodule, which can be identified with the induced R(1)-bimodule structure on (A × 1) ⊗ R(1), where the left R(1) action on (A × 1) ⊗ R(1) is given by (id_{A×1} ⊗ m_{R(1)}) ◦ (c^{-1}_{A×1,R(1)} ⊗ id_{R(1)}). We have the following natural isomorphisms:

Hom_{R(1)|R(1)}(R(A), R(A)) --≅--> Hom_{C^2_±}(A × 1, R(A)) --χ̂^{-1}--> Hom_C(A, A),   (2.62)

which, by (2.53), are given by, for f ∈ Hom_{R(1)|R(1)}(R(A), R(A)),

f ↦ f′ = f ◦ (id_{A×1} ⊗ η_{R(1)}) ↦ f″ = ρ̂ ◦ T(f′),   (2.63)

and its inverse is given by, for g ∈ Hom_C(A, A),

g ↦ g′ = R(g) ◦ δ̂ ↦ g″ = (id_{A×1} ⊗ m_{R(1)}) ◦ (g′ ⊗ id_{R(1)}),   (2.64)

where g″ is indeed a R(1)-bimodule map due to the commutativity of R(1). It is easy to check that g″ = R(g) in (2.64). Therefore R gives an isomorphism from Hom_C(A, A) to Hom_{R(1)|R(1)}(R(A), R(A)). Moreover, one verifies that R(g) is an R(A)-bimodule map iff g is an A-bimodule map. In other words, R : g ↦ R(g) gives an isomorphism

Hom_{A|A}(A, A) --≅--> Hom_{R(A)|R(A)}(R(A), R(A)).

Corollary 2.27. Let A be a C-algebra. (i) A is absolutely simple iff R(A) is absolutely simple. (ii) Let A be in addition Frobenius. Then A is simple and special iff R(A) is simple and special. Proof. Part (i) immediately follows from Lemma 2.26. The statement of part (ii) without the qualifier ‘simple’ is proved in Proposition 2.25. But, as in the proof of (iii)⇒(i) in Lemma 2.11, a special Frobenius algebra in a semi-simple category has a semi-simple category of bimodules, and for a semi-simple C-linear category, simple and absolutely simple are equivalent. Part (ii) then follows from part (i).

The following lemma will be needed in Sect. 3.2 below to discuss the properties of Cardy algebras.

Lemma 2.28. Let A be a Frobenius algebra in C. Then (δ̌_A)^* = ρ̂_A.

Proof. Recall from (2.50) that δ̌_A is a morphism A → TR(A). Since T and R are both Frobenius functors, TR(A) is a Frobenius algebra in C. Substituting the definitions, after a short calculation one finds

ε_{TR(A)} ◦ m_{TR(A)} = Dim(C) Σ_{i∈I} [diagram with incoming strands A, U_i^∨, U_i, A, U_ī^∨, U_ī; not reproduced].   (2.65)

Substituting this in the definition of (δ̌_A)^* gives, again after a short calculation, the morphism ρ̂_A. At an intermediate step one uses that the part of the morphism (δ̌_A)^*, which is made up of U_i and U_ī ribbons, their duals, and the basis morphisms λ_{(i,ī)0} and ϒ^{(i,ī)0}, can be replaced by (1 / dim U_i) · d_{U_i}.

3. Cardy Algebras

In this section we start by investigating the properties of Frobenius algebras which satisfy the so-called modular invariance condition. We then give two definitions of a Cardy algebra and prove their equivalence. Finally, in Sect. 3.3, we study the properties of these algebras and state our main results. We fix a modular tensor category C. Recall that C^2_± is an abbreviation for C_+ ⊠ C_-.

3.1. Modular invariance. In C^2_±, we define the object K and the morphism ω : K → K as

K = ⊕_{i,j∈I} U_i × U_j,   ω = ⊕_{i,j∈I} (dim U_i dim U_j / Dim C) id_{U_i×U_j}.   (3.1)

They have the property (see e.g. [BK, Cor. 3.1.11])

[diagram: a K-loop carrying ω, encircling the strands (U_k × U_l)^∨ and U_i × U_j; not reproduced] = δ_{i,k} δ_{j,l} (Dim C / (dim U_i dim U_j)) b̃_{U_i×U_j} ◦ d_{U_i×U_j}.   (3.2)


Definition 3.1.
(i) Let A, B be objects of C^2_±. A morphism f : A ⊗ B → B is called S-invariant iff

[diagram identity between two morphisms built from f, with strands A, B and W, the right-hand side containing a K-loop carrying ω; not reproduced]   (3.3)

holds for all W ∈ C^2_±.
(ii) A C^2_±-algebra (A_cl, m_cl, η_cl) is called modular invariant iff θ_{A_cl} = id_{A_cl} and m_cl is S-invariant.

Lemma 3.2. The morphism f : A ⊗ B → B is S-invariant if and only if

[diagram built from f with strands A, B and U_i × U_j] = (Dim C / (dim U_i dim U_j)) Σ_α [diagram built from f and the basis morphisms of B labelled α; not reproduced]   (3.4)

holds for all i, j ∈ I.

Proof. Condition (3.3) holds for all W iff it holds for all W = U_i × U_j, i, j ∈ I, so it is enough to show that the right-hand side of (3.3) with W = U_i × U_j is equal to the right-hand side of (3.4). Recall the notation for basis morphisms in (2.36). Starting from (3.3), write

id_{B^∨} ⊗ id_{U_i×U_j} = ( Σ_{k,l,α} (b_B^{(k×l;α)})^∨ ◦ (b^B_{(k×l;α)})^∨ ) ⊗ id_{U_i×U_j},   (3.5)

and then apply (3.2). The graphical representation of the resulting morphism can be deformed to give (3.4).

Remark 3.3. As shown in [K3, Sect. 6.1], the modular invariance condition of a C^2_±-algebra exactly coincides with the modular invariance condition for torus 1-point correlation functions of a genus-0,1 closed CFT. In particular, the condition θ_{A_cl} = id_{A_cl} is equivalent to invariance under the modular transformation T : τ ↦ τ + 1, and the condition (3.4) with f = m_cl is equivalent to invariance under S : τ ↦ −1/τ. Combining the modular invariance condition with the genus zero properties of a genus-0,1 closed CFT results in a modular invariant commutative symmetric Frobenius algebra in C^2_±.

Let A_cl be a modular invariant C^2_±-algebra. Evaluating (3.4) for f = m_cl, composing it with η_cl ⊗ id_{U_i×U_j} and taking the trace implies the following identity:

Z_{ij} = (1 / Dim C) [diagram: linked loops labelled U_i × U_j and A_cl; not reproduced],   where Z_{ij} = dim_C Hom_{C^2_±}(U_i × U_j, A_cl).   (3.6)

Decomposing A_cl into simple objects, this gives

Z_{ij} = Σ_{k,l∈I} S_{ik} Z_{kl} S^{-1}_{lj},   where S_{ij} = s_{i,j} / √(Dim C),   (3.7)

which in CFT terms is of course nothing but the invariance of the torus partition function under the modular S-transformation. The following theorem gives a simple criterion for modular invariance.

Theorem 3.4. Let A_cl be a haploid commutative symmetric Frobenius algebra in C^2_±.
(i) If A_cl is modular invariant, then dim A_cl = Dim C.
(ii) If dim A_cl = Dim C, then A_cl is special and modular invariant.

Proof. Part (i): Since A_cl is haploid, for i = j = 0, Equation (3.6) reduces to 1 = dim A_cl / Dim C.

Part (ii): By the same reasoning as in the proof of (iii)⇒(i) in Lemma 2.11 one shows that A_cl is special. Thus m_cl ◦ Δ_cl = ζ_cl id_{A_cl} for some ζ_cl ≠ 0. By [KO, Thm. 4.5], the category (C^2_±)^loc_{A_cl} of local A_cl-modules is again a modular tensor category and Dim (C^2_±)^loc_{A_cl} = (Dim C / dim A_cl)^2 (see [Fr, Prop. 3.21 & Rem. 3.23] for the same statement in the notation used here). Thus by assumption we have Dim (C^2_±)^loc_{A_cl} = 1. It then follows from [ENO1, Thm. 2.3] that up to isomorphism, (C^2_±)^loc_{A_cl} has a unique simple object (namely the tensor unit). In other words, every simple local A_cl-module is isomorphic to A_cl (seen as a left-module over itself). We have the following isomorphisms between morphism spaces [FS, Prop. 4.7 & 4.11],

Hom_{A_cl}(A_cl ⊗ (U_i × U_j), A_cl) ≅ Hom_{C^2_±}(U_i × U_j, A_cl),
Hom_{A_cl}(A_cl, A_cl ⊗ (U_i × U_j)) ≅ Hom_{C^2_±}(A_cl, U_i × U_j).

Using these to transport the bases (2.36) from the right to the left, we obtain bases {b^α_{(ij)}}_α of Hom_{A_cl}(A_cl ⊗ (U_i × U_j), A_cl) and {b_β^{(ij)}}_β of Hom_{A_cl}(A_cl, A_cl ⊗ (U_i × U_j)). These can be expressed graphically as

[diagrams defining b^α_{(ij)} : A_cl ⊗ (U_i × U_j) → A_cl and b_β^{(ij)} : A_cl → A_cl ⊗ (U_i × U_j), the latter carrying a nonzero prefactor built from Dim C, dim U_i dim U_j and ζ_cl; not reproduced],   (3.8)

where the nonzero factor in b_β^{(ij)} is included for convenience. Notice that b^α_{(ij)} ◦ b_β^{(ij)} is a left A_cl-module map. Since A_cl is simple as a left module over itself, we have

b^α_{(ij)} ◦ b_β^{(ij)} = λ_{αβ} id_{A_cl}   (3.9)

for some λ_{αβ} ∈ C. By computing tr(b^α_{(ij)} ◦ b_β^{(ij)}), it is easy to verify that λ_{αβ} = δ_{αβ}. We will now prove the following identity:

(1 / ζ_cl) [diagram with strands A_cl and U_i × U_j] = Σ_α (Dim C / (dim U_i dim U_j)) (1 / ζ_cl) [diagram built from the basis morphisms labelled α, with strands A_cl and U_i × U_j; not reproduced].   (3.10)

One checks that the left-hand side of this equation is an idempotent, which we denote by ζ_cl^{-1} P^l_{A_cl}(U_i × U_j), cf. [Fr, Sect. 3.1]. By [Fr, Prop. 4.1] the image Im(ζ_cl^{-1} P^l_{A_cl}(U_i × U_j)) is a local A_cl-module, and hence isomorphic to A_cl^{⊕N} for some N ∈ Z_{≥0}. All left-module morphisms from A_cl to A_cl ⊗ (U_i × U_j) are linear combinations of the b_β^{(ij)}. Furthermore one verifies that ζ_cl^{-1} P^l_{A_cl}(U_i × U_j) ◦ b_β^{(ij)} = b_β^{(ij)}. Therefore, the b_β^{(ij)} describe precisely the image of the idempotent, i.e. ζ_cl^{-1} P^l_{A_cl}(U_i × U_j) = Σ_α b_α^{(ij)} ◦ b^α_{(ij)}, which is nothing but (3.10). Composing (3.10) with ζ_cl · ε_cl ⊗ id_{U_i×U_j} from the left (i.e. from the top) produces (3.4). In addition, since A_cl is commutative symmetric Frobenius it satisfies θ_{A_cl} = id_{A_cl} [Fr, Prop. 2.25]. Altogether, this shows that A_cl is modular invariant.
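The condition (3.7) and the dimension criterion of Theorem 3.4 can be illustrated numerically. The sketch below again assumes the Ising category (objects ordered (1, σ, ψ), all self-dual) as example data, and compares two haploid commutative symmetric Frobenius algebras: R(1), whose multiplicity matrix is Z_{ij} = δ_{i,j̄}, and the tensor unit 1 × 1, for which only Z_00 is nonzero. The helper names are hypothetical, and Z = S Z S^{-1} is tested in the equivalent form Z S = S Z to avoid a matrix inversion.

```python
import math

# Assumed example data: Ising category, objects ordered (1, sigma, psi).
R2 = math.sqrt(2.0)
s = [[1.0, R2, 1.0],
     [R2, 0.0, -R2],
     [1.0, -R2, 1.0]]
dims = s[0]                                       # dim U_i = s_{0,i}
Dim = sum(d * d for d in dims)                    # Dim C
S = [[entry / math.sqrt(Dim) for entry in row] for row in s]   # S_ij as in (3.7)

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_S_invariant(Z, tol=1e-12):
    """Check (3.7) in the form Z S = S Z, equivalent to Z = S Z S^{-1}."""
    ZS, SZ = matmul(Z, S), matmul(S, Z)
    return all(abs(ZS[i][j] - SZ[i][j]) < tol
               for i in range(3) for j in range(3))

def dimension(Z):
    """dim of the object with multiplicities Z_ij of U_i x U_j."""
    return sum(Z[i][j] * dims[i] * dims[j] for i in range(3) for j in range(3))

# Multiplicity matrices: R(1) (diagonal, since all objects are self-dual)
# and the tensor unit 1 x 1.
Z_diag = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
Z_unit = [[1 if i == j == 0 else 0 for j in range(3)] for i in range(3)]
```

In this example R(1) has dimension Dim C = 4 and satisfies (3.7), while 1 × 1 has dimension 1 and fails (3.7), in line with both parts of Theorem 3.4.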

Remark 3.5. As we were writing this paper, we heard that the results in Theorem 3.4 were obtained independently by Kitaev and Müger [Ki].

Remark 3.6. Setting i = j = 0 in (3.6) gives the identity dim A_cl = Z_00 Dim C [KR, Prop. 2.3]. Combining this with Theorem 3.4 (ii) one may wonder if a general modular invariant commutative symmetric Frobenius algebra A_cl in C^2_± is isomorphic to a direct sum of simple such algebras. However, this is not so. For example, one can take the commutative symmetric Frobenius algebra A_cl = C[x]/x^2 in the category of vector spaces equipped with the non-degenerate trace ε(ax + b) = a. In this case the modular invariance condition holds automatically, but A_cl is clearly not a direct sum of two algebras.
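The counter-example C[x]/x^2 is small enough to inspect by hand or by machine. The following sketch, with hypothetical helper names and integer coefficients chosen only for illustration, checks that the pairing ε(uv) is non-degenerate, computes m ◦ Δ using Δ(u) = u·1 ⊗ x + u·x ⊗ 1 (the coproduct determined by this counit), and searches a small grid of coefficients for idempotents. The grid search is only an illustration; the general statement follows from (b + ax)^2 = b^2 + 2abx.

```python
# Elements b + a*x of C[x]/(x^2) are stored as integer pairs (b, a).
ONE, X = (1, 0), (0, 1)
BASIS = [ONE, X]

def mul(u, v):
    """Product in C[x]/(x^2), using x*x = 0."""
    return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])

def eps(u):
    """The trace eps(a*x + b) = a from Remark 3.6."""
    return u[1]

# Gram matrix of the pairing (u, v) -> eps(u*v) in the basis (1, x).
gram = [[eps(mul(u, v)) for v in BASIS] for u in BASIS]
gram_det = gram[0][0] * gram[1][1] - gram[0][1] * gram[1][0]

def m_after_delta(u):
    """m o Delta, with Delta(u) = u*1 (x) x + u*x (x) 1, i.e. u -> 2*x*u."""
    first = mul(mul(u, ONE), X)
    second = mul(mul(u, X), ONE)
    return (first[0] + second[0], first[1] + second[1])

# Idempotents e*e = e among elements with small integer coefficients.
idempotents = [(b, a) for b in range(-3, 4) for a in range(-3, 4)
               if mul((b, a), (b, a)) == (b, a)]
```

The pairing has determinant -1, so it is non-degenerate; m ◦ Δ sends 1 to 2x and x to 0, so the algebra is not special; and the only idempotents found are 0 and 1, which is why the algebra cannot split as a direct sum of two algebras.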


For a general modular tensor category C, the algebra C[x]/x^2 ⊠ R(1), understood as an algebra in C^2_± via the braided monoidal isomorphism Vect_f(C) ⊠ C^2_± → C^2_±, provides another counter-example.

3.2. Two definitions. Define a morphism P^l_A : A → A for a Frobenius algebra A in C or C^2_± as follows [Fr, Sect. 2.4],

P^l_A = [diagram built from the product m of A, with incoming and outgoing strand A; not reproduced].   (3.11)

If A is also commutative and obeys m_A ◦ Δ_A = ζ_A id_A, we have P^l_A = ζ_A id_A. In particular, this holds if A is commutative and special. Using the fact that the Frobenius algebra R(1) is commutative and normalised-special, one can check that P^l_{R(A)} : R(A) → R(A) takes the following form:

P^l_{R(A)} = ⊕_{i∈I} [diagram built from the product m of A, with strands A and U_i^∨ in the first factor] × [strand U_i in the second factor; not reproduced].   (3.12)

With these ingredients, we can now give the first definition of a Cardy C|C^2_±-algebra, which was introduced in [K3, Def. 5.14], cf. Remark 3.15 below.

Definition 3.7 (Cardy C|C^2_±-algebra I). A Cardy C|C^2_±-algebra is a triple (A_op|A_cl, ι_cl-op), where (A_cl, m_cl, η_cl, Δ_cl, ε_cl) is a modular invariant commutative symmetric Frobenius C^2_±-algebra, (A_op, m_op, η_op, Δ_op, ε_op) is a symmetric Frobenius C-algebra, and ι_cl-op : A_cl → R(A_op) an algebra homomorphism, such that the following conditions are satisfied:

(i) Centre condition:

[diagram identity between two morphisms A_cl ⊗ R(A_op) → R(A_op), each built from ι_cl-op and m_{R(A_op)}; not reproduced].   (3.13)

(ii) Cardy condition:

ι_cl-op ◦ ι^*_cl-op = [diagram with incoming and outgoing strand R(A_op); not reproduced].   (3.14)
Remark 3.8.
(i) The name “Cardy C|C^2_±-algebra” in Definition 3.7 was chosen because many of the important ingredients were first studied by Cardy: the modular invariance of the closed theory [C1], the consistency of the annulus amplitude [C2], and the bulk-boundary OPE [CL]. On the other hand, the boundary-boundary OPE and the OPE analogue of the centre condition were first considered in [Lw].
(ii) One can easily see that in the special case that C is the category Vect_f(C) of finite-dimensional C-vector spaces, a Cardy C|C^2_±-algebra gives exactly the algebraic formulation of two-dimensional open-closed topological field theory over C (cf. Remark 6.14 in [K3]), see [Lz, Sect. 4.8], [Mo, Thm. 1.1], [AN, Thm. 4.5], [LP, Cor. 4.3], [MS, Sect. 2.2]. When passing to a general modular tensor category C there are two important differences to the two-dimensional topological field theory. Firstly, the algebras A_cl and A_op now live in different categories, which in particular affects the formulation of the centre condition and the Cardy condition. Secondly, the modular invariance condition has to be imposed on A_cl. In the case C = Vect_f(C), modular invariance holds automatically.

Definition 3.9. A homomorphism of Cardy C|C^2_±-algebras (A_op^{(1)}|A_cl^{(1)}, ι_cl-op^{(1)}) → (A_op^{(2)}|A_cl^{(2)}, ι_cl-op^{(2)}) is a pair (f_op, f_cl) of Frobenius algebra homomorphisms f_op : A_op^{(1)} → A_op^{(2)} and f_cl : A_cl^{(1)} → A_cl^{(2)} such that the diagram

A_cl^{(1)} ------ f_cl ------> A_cl^{(2)}
    |                              |
ι_cl-op^{(1)}                 ι_cl-op^{(2)}
    v                              v
R(A_op^{(1)}) -- R(f_op) --> R(A_op^{(2)})   (3.15)

commutes.

Remark 3.10. Since a homomorphism of Frobenius algebras is invertible (cf. Lemma 2.18 (iv)), a homomorphism of Cardy algebras is always an isomorphism.

For a homomorphism (f_op, f_cl) of Cardy C|C^2_±-algebras, using the commutativity of (3.15) and the fact that f_cl and f_op are both algebra and coalgebra homomorphisms, it is easy to show that (3.15) commutes iff

R(A_op^{(1)}) -- R(f_op) --> R(A_op^{(2)})
    |                              |
(ι_cl-op^{(1)})^*             (ι_cl-op^{(2)})^*
    v                              v
A_cl^{(1)} ------ f_cl ------> A_cl^{(2)}   (3.16)

commutes.

Let (A_op|A_cl, ι_cl-op) be a Cardy C|C^2_±-algebra. Define the morphism

ι̃_cl-op = χ̂^{-1}(ι_cl-op) : T(A_cl) → A_op.   (3.17)

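Decompose A_cl as A_cl = ⊕_{n=1}^N C_n^l × C_n^r such that C_1^l × C_1^r = η_cl(1 × 1). We use ι^{(n)}_cl-op to denote the restriction of ι_cl-op to C_n^l × C_n^r and ι̃^{(n)}_cl-op to denote the restriction of ι̃_cl-op to C_n^l ⊗ C_n^r. We introduce the following graphical notation:

ι̃^{(n)}_cl-op = [diagram: a vertex with incoming strands C_n^l, C_n^r and outgoing strand A_op; not reproduced].   (3.18)

By (2.43), ι_cl-op can be expressed in terms of ι̃_cl-op as follows:

ι_cl-op = ⊕_{n=1}^N ⊕_{i∈I} Σ_α [diagram built from ι̃^{(n)}_cl-op and basis morphisms labelled α, with strands A_op, U_i^∨, U_i, C_n^l and C_n^r; not reproduced].   (3.19)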

Lemma 3.11. The centre condition (3.13) is equivalent to the following condition in C:

[diagram identity between two morphisms with incoming strands C_n^l, C_n^r, A_op and outgoing strand A_op; not reproduced]   for n = 1, . . . , N.   (3.20)

Proof. First, insert (3.19) and the definition (2.17) of m R(Aop ) into (3.13). Then apply the commutativity of R(1) to the left hand side of (3.13). The equivalence between (3.13) and (3.20) follows immediately.

Remark 3.12. The centre condition (3.20) is very natural from the open-closed conformal field theory point of view. Correlators on the upper half plane are expressed in terms of conformal blocks on the full complex plane. The objects Cnl and Cnr are associated to the field insertion at a point z in the upper half plane and at the complex conjugate point z¯ in the lower half plane, respectively. The object Aop corresponds to a field inserted at a point r on the real axis. The centre condition (3.20) simply says that the correlation functions in the disjoint domains |z| > r > 0 and r > |z| > 0 are analytic continuations of each other, see [K2, Prop. 1.18].


Recall that we define ι̃^*_cl-op : A_op → T(A_cl) as in (2.30). We introduce the graphical notation

ι̃^*_cl-op = [two equal diagrams, each a sum over n = 1, . . . , N with incoming strand A_op and outgoing strands C_n^l, C_n^r; not reproduced],   (3.21)

where the second equality follows from (2.15), (2.21) and (2.59).

Lemma 3.13. The Cardy condition (3.14) is equivalent to the following identity in C:

Σ_{n=1}^N Σ_α (Dim C / dim U_i) [diagram with strands A_op, U_i^∨, C_n^l and C_n^r] = [diagram built from the coproduct and the product m_op of A_op, with strands A_op and U_i^∨; not reproduced]   for all i ∈ I.   (3.22)

Proof. By (2.44), (3.12) and (3.19), it is easy to see that (3.22) is equivalent to the following identity:

ι_cl-op ◦ χ̌(ι̃^*_cl-op) = P^l_{R(A_op)}.   (3.23)

Therefore, it is enough to show that

χ̌(ι̃^*_cl-op) = ι^*_cl-op.   (3.24)

We have

χ̌^{-1}(ι^*_cl-op) (1)= T(ι^*_cl-op) ◦ δ̌ (2)= T(ι_cl-op)^* ◦ δ̌^{**} (3)= (ρ̂ ◦ T(ι_cl-op))^* (4)= ι̃^*_cl-op.   (3.25)

In step (1) we use the expression (2.53) for χ̌^{-1}, step (2) follows from Lemma 2.18 (v) and Lemma 2.19. Step (3) is Lemma 2.18 (i) and Lemma 2.28, and finally step (4) amounts to substituting (2.53) and (3.17). Acting with χ̌ on both sides of the above equality produces (3.24).

Combining Lemmas 3.11 and 3.13, and Proposition 2.16, we obtain the following equivalent definition of Cardy C|C^2_±-algebra (recall the graphical notation (3.18) for ι̃_cl-op and (3.21) for ι̃^*_cl-op).

Definition 3.14 (Cardy C|C^2_±-algebra II). A Cardy C|C^2_±-algebra is a triple (A_op|A_cl, ι̃_cl-op), where A_cl is a commutative symmetric Frobenius C^2_±-algebra satisfying property (3.4) with f = m_cl, A_op is a symmetric Frobenius C-algebra, and ι̃_cl-op : T(A_cl) → A_op is an algebra homomorphism satisfying the conditions (3.20) and (3.22).

Remark 3.15. Up to a choice of normalisation, Definition 3.14 is the same as the original one in [K3, Def. 6.13]. The difference between the two definitions is the factor Dim C / dim U_i on the left hand side of (3.22), which in [K3, Def. 6.13] is given by √(Dim C) / dim U_i. The two definitions are related by rescaling the coproduct Δ_cl and counit ε_cl of A_cl by 1/√(Dim C) and √(Dim C), respectively. We chose the convention in (3.22) to remove all dimension factors from the expression (3.14) for the Cardy condition.

3.3. Uniqueness and existence theorems. In this subsection we investigate the structure of Cardy algebras. We start with the following proposition, which, when combined with the results of part II, provides an alternative proof of [Fj, Prop. 4.22]. 2 -algebra. If A is simple and Proposition 3.16. Let (Aop |Acl , ιcl-op ) be a Cardy C|C± cl dim Aop = 0, then Aop is simple and special.

Proof. By Remark 3.6, we have $\dim A_{cl} = Z_{00}\,\mathrm{Dim}\,\mathcal{C} \neq 0$, and by Lemma 2.11, $A_{cl}$ is therefore haploid. Restricting the Cardy condition (3.22) to the case $U_i = \mathbf{1}$ and composing both sides with $\varepsilon_{op}$ from the left, we see that $\varepsilon_{op}$ kills all terms associated to $U_j \times \mathbf{1} \in A_{cl}$ in the sum except for a single $\mathbf{1} \times \mathbf{1}$ term. Thus we obtain the following identity:
\[ \beta\,\varepsilon_{op} = \tilde d_{A_{op}} \circ (m_{op} \otimes \mathrm{id}_{A_{op}^\vee}) \circ (\mathrm{id}_{A_{op}} \otimes b_{A_{op}}), \tag{3.26} \]
where $\beta \in \mathbb{C}$. Composing with $\eta_{op}$ from the right in turn implies that $\beta\,\varepsilon_{op} \circ \eta_{op} = \dim A_{op}$, which is nonzero by assumption. Thus also $\beta \neq 0$ and $\varepsilon_{op}$ is a nonzero multiple of the morphism on the right hand side of (3.26). By [FRS, Lem. 3.11], $A_{op}$ is special.

Since $A_{op}$ is a special Frobenius algebra, $A_{op}$ is semi-simple as an $A_{op}$-bimodule (apply [FS, Prop. 5.24] to $A_{op}$ tensored with its opposite algebra). Suppose $A_{op}$ is not simple, so that we can write $A_{op} = A_{op}^{(1)} \oplus A_{op}^{(2)}$ for nonzero $A_{op}$-bimodules $A_{op}^{(1)}$ and $A_{op}^{(2)}$. We denote the canonical embeddings and projections associated to this decomposition as $\iota_{1,2}$ and $\pi_{1,2}$. We have the identities
\[ m_{op} \circ (\iota_1 \otimes \iota_2) = 0, \qquad \varepsilon_{op} \circ \eta_{op} = \sum_{i=1}^{2} \varepsilon_{op} \circ \iota_i \circ \pi_i \circ \eta_{op}. \tag{3.27} \]

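The first identity follows since $\pi_1 \circ m_{op} \circ (\iota_1 \otimes \iota_2) = 0$ (as $m_{op}$ gives the left action of $A_{op}$ on $A_{op}$ and hence it preserves $A_{op}^{(2)}$), and similarly $\pi_2 \circ m_{op} \circ (\iota_1 \otimes \iota_2) = 0$. The second identity is just the completeness of $\iota_{1,2}$, $\pi_{1,2}$. Since $\varepsilon_{op} \circ \eta_{op} \neq 0$, without losing generality we can assume $\varepsilon_{op} \circ \iota_1 \circ \pi_1 \circ \eta_{op} \neq 0$. Using that $\pi_2$ is a bimodule map we compute
\[ \pi_2 \circ [\text{LHS of (3.22)}]_{U_i = \mathbf{1}} \circ \iota_1 \circ \pi_1 \circ \eta_{op} = \pi_2 \circ [\text{RHS of (3.22)}]_{U_i = \mathbf{1}} \circ \iota_1 \circ \pi_1 \circ \eta_{op} = P^l_{A_{op}^{(2)}} \circ \pi_2 \circ \iota_1 \circ \pi_1 \circ \eta_{op} = 0. \tag{3.28} \]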


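On the other hand, using that $A_{cl}$ is haploid, that $\tilde\iota_{cl\text{-}op}$ is an algebra map, and that $\tilde\iota^{\,*}_{cl\text{-}op}$ is a coalgebra map, one can check that the left-hand side of (3.28) is equal to $\lambda\,(\varepsilon_{op} \circ \iota_1 \circ \pi_1 \circ \eta_{op})\;\pi_2 \circ \eta_{op}$ for some $\lambda \neq 0$. This implies that $\pi_2 \circ \eta_{op} = 0$. Thus $\eta_{op} = \sum_i \iota_i \circ \pi_i \circ \eta_{op} = \iota_1 \circ \pi_1 \circ \eta_{op}$. Hence, we have
\[ 0 \neq \pi_2 \circ \iota_2 = \pi_2 \circ m_{op} \circ (\eta_{op} \otimes \iota_2) = \pi_2 \circ m_{op} \circ ((\iota_1 \circ \pi_1 \circ \eta_{op}) \otimes \iota_2). \tag{3.29} \]
However, the right-hand side is zero by (3.27). This is a contradiction and hence $A_{op}$ must be simple.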

To formulate the next theorem we need the notion of the full centre of an algebra [Fj, Def. 4.9]. Recall that an algebra $A$ in a braided tensor category has a left centre and a right centre [VZ,O], both of which are sub-algebras of $A$. Of these two, we will only need the left centre. The following definition is [Fr, Def. 2.31], which in our setting is equivalent to that of [VZ,O].

Definition 3.17. Let $A$ be a symmetric special Frobenius algebra such that $m_A \circ \Delta_A = \zeta_A\,\mathrm{id}_A$.
(i) The left centre $C_l(A)$ of $A$ is the image of the idempotent $\zeta_A^{-1} P^l_A$.
(ii) The full centre $Z(A)$ is $C_l(R(A))$.

That $\zeta_A^{-1} P^l_A$ is an idempotent follows from [FRS, Lem. 5.2] when keeping track of the factors $\zeta_A$ ([FRS] assumes normalised-special, i.e. $\zeta_A = 1$). Note that $C_l(A)$ is again an object of $\mathcal{C}$, while $Z(A)$ is an object of $\mathcal{C}^2_\pm$. Let $e^l_A : C_l(A) \to A$ be the embedding of $C_l(A)$ into $A$. The left centre is in fact the maximal subobject of $A$ such that
\[ m_A \circ c_{A,A} \circ (e^l_A \otimes \mathrm{id}_A) = m_A \circ (e^l_A \otimes \mathrm{id}_A), \tag{3.30} \]

see [Fr, Lem. 2.32]. This observation explains the name left centre and also makes the connection to [O, Def. 15].

The full centre is by definition the image of the idempotent $\zeta_A^{-1} P^l_{R(A)} : R(A) \to R(A)$. Since $\mathcal{C}^2_\pm$ is abelian, the idempotent splits and we obtain the embedding and restriction morphisms
\[ e : Z(A) \hookrightarrow R(A) \quad\text{and}\quad r : R(A) \twoheadrightarrow Z(A) \tag{3.31} \]
which obey $r \circ e = \mathrm{id}_{Z(A)}$ and $e \circ r = \zeta_A^{-1} P^l_{R(A)}$. It follows from Proposition 2.25 and [Fr, Prop. 2.37] that $Z(A)$ is a commutative symmetric Frobenius algebra in $\mathcal{C}^2_\pm$ with structure morphisms$^5$
\[ m_{Z(A)} = r \circ m_{R(A)} \circ (e \otimes e), \quad \eta_{Z(A)} = r \circ \eta_{R(A)}, \quad \Delta_{Z(A)} = \zeta_A \cdot (r \otimes r) \circ \Delta_{R(A)} \circ e, \quad \varepsilon_{Z(A)} = \zeta_A^{-1} \cdot \varepsilon_{R(A)} \circ e. \tag{3.32} \]

Moreover, if $A$ is simple then $Z(A)$ is simple, and if $A$ is simple and $\dim A \neq 0$, then $Z(A)$ is simple and special. The normalisation of the counit is such that
\[ \varepsilon_{Z(A)} \circ \eta_{Z(A)} = \zeta_A^{-2}\, \dim A \; \mathrm{Dim}\,\mathcal{C}. \tag{3.33} \]
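As a consistency check, (3.33) can be reduced to a statement about $R(A)$ alone by unwinding (3.32). The sketch below uses only (3.31), (3.32) and the unit computation (3.35); the last line is not derived here, and the normalisation $\varepsilon_{R(A)} \circ \eta_{R(A)} = \zeta_A^{-1} \dim A \; \mathrm{Dim}\,\mathcal{C}$ is an assumption about the conventions for $R(A)$ fixed earlier in the paper, namely the one that makes the two sides agree.

```latex
\begin{align*}
\varepsilon_{Z(A)} \circ \eta_{Z(A)}
 &= \zeta_A^{-1}\, \varepsilon_{R(A)} \circ e \circ r \circ \eta_{R(A)}
 && \text{by (3.32)} \\
 &= \zeta_A^{-1}\, \varepsilon_{R(A)} \circ \big( \zeta_A^{-1} P^l_{R(A)} \big) \circ \eta_{R(A)}
 && \text{since } e \circ r = \zeta_A^{-1} P^l_{R(A)} \\
 &= \zeta_A^{-1}\, \varepsilon_{R(A)} \circ \eta_{R(A)}
 && \text{by (3.35)} \\
 &= \zeta_A^{-2}\, \dim A \; \mathrm{Dim}\,\mathcal{C}
 && \text{(assumed normalisation of } R(A) \text{)} .
\end{align*}
```

In other words, (3.33) holds exactly when $\varepsilon_{R(A)} \circ \eta_{R(A)} = \zeta_A^{-1} \dim A \; \mathrm{Dim}\,\mathcal{C}$.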

5 The normalisation of product and unit is the standard one. The factors in the coproduct and counit have to be included in order for (A|Z (A), e) to be a Cardy algebra, see Theorem 3.18 below. The normalisation of the counit enters the Cardy condition (3.14) through the definition of ( · )∗ .



Theorem 3.18. Let $A$ be a special symmetric Frobenius $\mathcal{C}$-algebra. Then $(A|Z(A), e)$ is a Cardy $\mathcal{C}|\mathcal{C}^2_\pm$-algebra.

The proof of this theorem makes use of the following two lemmas.

Lemma 3.19. $e : Z(A) \to R(A)$ is an algebra map, and $e^* = \zeta_A \cdot r$.

Proof. It follows from [Fr, Lem. 2.29] (or by direct calculation, using in particular $m_{R(A)} \circ \Delta_{R(A)} = \zeta_A\,\mathrm{id}_{R(A)}$) that
\[ m_{R(A)} \circ (e \otimes e) = \zeta_A^{-1} \cdot P^l_{R(A)} \circ m_{R(A)} \circ (e \otimes e). \tag{3.34} \]
Substituting $e \circ r = \zeta_A^{-1} P^l_{R(A)}$ shows that $e$ is compatible with multiplication. For the unit one finds
\[ e \circ \eta_{Z(A)} = e \circ r \circ \eta_{R(A)} = \zeta_A^{-1} P^l_{R(A)} \circ \eta_{R(A)} = \eta_{R(A)}. \tag{3.35} \]

Thus $e$ is an algebra map. For the second statement one computes

[Eq. (3.36): a chain of four string-diagram equalities, labelled (1) to (4), leading from $e^*$ to $\zeta_A \cdot r$; the diagrams are not recoverable from the text.]

where in (1) the definitions (2.30) and (3.32) have been substituted, step (2) is $e \circ r = \zeta_A^{-1} P^l_{R(A)}$, step (3) uses that $R(A)$ is symmetric Frobenius, and step (4) is again $e \circ r = \zeta_A^{-1} P^l_{R(A)}$.

Lemma 3.20. Let $A$ be a symmetric Frobenius algebra in $\mathcal{C}$. The morphism
\[ P^l_{R(A)} \circ m_{R(A)} \circ \big( P^l_{R(A)} \otimes P^l_{R(A)} \big) : R(A) \otimes R(A) \longrightarrow R(A) \tag{3.37} \]
is S-invariant.

The proof of this lemma is a slightly lengthy explicit calculation and has been deferred to Appendix A.2.


Proof of Theorem 3.18. That $e$ is an algebra map was proved in Lemma 3.19. The centre condition (3.13) holds by property (3.30) of the left centre. The Cardy condition (3.14) is also an immediate consequence of Lemma 3.19,
\[ \iota_{cl\text{-}op} \circ \iota^*_{cl\text{-}op} = e \circ (\zeta_A\, r) = P^l_{R(A)}. \tag{3.38} \]
The full centre $Z(A)$ is a commutative symmetric Frobenius algebra. It remains to prove modular invariance. That $\theta_{Z(A)} = \mathrm{id}_{Z(A)}$ is implied by commutativity and symmetry of $Z(A)$ [Fr, Prop. 2.25]. The S-invariance condition (3.3) follows from Lemma 3.20: In (3.37) substitute $P^l_{R(A)} = \zeta_A\, e \circ r$ and then put the resulting morphism into (3.3). Compose the resulting equation with $e \otimes \mathrm{id}_W$ from the right (i.e. from the bottom) and substitute the definition (3.32) of $m_{Z(A)}$. This results in the statement that $m_{Z(A)}$ is S-invariant.

The following theorem is analogous to [LR, Prop. 2.9] and [Fj, Thm. 4.26], which, roughly speaking, answer the question under which circumstances the restriction of a two-dimensional conformal field theory to the boundary already determines the entire conformal field theory. The first work is set in Minkowski space and uses operator algebras and subfactors, while the second work is set in Euclidean space and uses modular tensor categories.

Theorem 3.21. Let $(A|A_{cl}, \iota_{cl\text{-}op})$ be a Cardy $\mathcal{C}|\mathcal{C}^2_\pm$-algebra such that $\dim A \neq 0$ and $A_{cl}$ is simple. Then $A$ is special and $(A|A_{cl}, \iota_{cl\text{-}op}) \cong (A|Z(A), e)$ as Cardy algebras.

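Proof. By Proposition 3.16, $A$ is simple and special. Since $A_{cl}$ is simple, the algebra map $\iota_{cl\text{-}op} : A_{cl} \to R(A)$ is either zero or a monomorphism. But $\iota_{cl\text{-}op} \circ \eta_{cl} = \eta_{R(A)}$, and so $\iota_{cl\text{-}op} \neq 0$. Thus $\iota_{cl\text{-}op}$ is monic. By Lemma 2.18 (ii), $\zeta_A^{-1} \iota^*_{cl\text{-}op}$ is epi. The Cardy condition (3.14) implies
\[ \iota_{cl\text{-}op} \circ \zeta_A^{-1} \iota^*_{cl\text{-}op} = \zeta_A^{-1} P^l_{R(A)} = e \circ r. \tag{3.39} \]
Composing this with $e \circ r$ from the left yields $e \circ r \circ \iota_{cl\text{-}op} \circ \zeta_A^{-1} \iota^*_{cl\text{-}op} = e \circ r = \iota_{cl\text{-}op} \circ \zeta_A^{-1} \iota^*_{cl\text{-}op}$. Since $\zeta_A^{-1} \iota^*_{cl\text{-}op}$ is epi, we have
\[ e \circ r \circ \iota_{cl\text{-}op} = \iota_{cl\text{-}op}. \tag{3.40} \]
Actually, (3.40) also follows from (3.13) and specialness of $R(A)$. We will prove that
\[ (f_{op}, f_{cl}) : (A|A_{cl}, \iota_{cl\text{-}op}) \longrightarrow (A|Z(A), e) \quad\text{where}\quad f_{op} = \mathrm{id}_A, \ f_{cl} = r \circ \iota_{cl\text{-}op}, \tag{3.41} \]
is an isomorphism of Cardy algebras.

$f_{cl}$ is an algebra map: Compatibility with the units follows since $\iota_{cl\text{-}op}$ is an algebra map,
\[ f_{cl} \circ \eta_{cl} = r \circ \iota_{cl\text{-}op} \circ \eta_{cl} = r \circ \eta_{R(A)} = \eta_{Z(A)}. \tag{3.42} \]
Compatibility with the multiplication also follows since $\iota_{cl\text{-}op}$ is an algebra map,
\begin{align*}
m_{Z(A)} \circ (f_{cl} \otimes f_{cl})
 &= r \circ m_{R(A)} \circ (e \otimes e) \circ (r \otimes r) \circ (\iota_{cl\text{-}op} \otimes \iota_{cl\text{-}op}) \\
 &= r \circ m_{R(A)} \circ (\iota_{cl\text{-}op} \otimes \iota_{cl\text{-}op}) \\
 &= r \circ \iota_{cl\text{-}op} \circ m_{cl} = f_{cl} \circ m_{cl}, \tag{3.43}
\end{align*}
where in the second step we used (3.40).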



$f_{cl}$ is an isomorphism: As above, since $f_{cl}$ is an algebra map and since $A_{cl}$ is simple, $f_{cl}$ has to be monic. By Lemma 3.19, $r^* = \zeta_A^{-1} e$. Thus $f_{cl}^* = \iota^*_{cl\text{-}op} \circ r^* = \zeta_A^{-1} \iota^*_{cl\text{-}op} \circ e$ and
\[ f_{cl} \circ f_{cl}^* = r \circ \iota_{cl\text{-}op} \circ \zeta_A^{-1} \iota^*_{cl\text{-}op} \circ e = r \circ e \circ r \circ e = \mathrm{id}_{Z(A)}, \tag{3.44} \]
and so $f_{cl}$ is also epi, and hence iso.

$f_{cl}$ is a coalgebra map: Since $f_{cl}$ is an algebra map, so is $f_{cl}^{-1}$. By (3.44), $f_{cl}^{-1} = f_{cl}^*$ and by Lemma 2.18 (iii) this implies that $f_{cl}$ is also a coalgebra map.

The diagram (3.15) commutes: Commutativity of (3.15) is equivalent to $e \circ f_{cl} = \iota_{cl\text{-}op}$, which holds by (3.40).

Let $A$ be a special symmetric Frobenius algebra. So far we have seen that $(A|Z(A), e)$ is a Cardy algebra, and that all Cardy algebras with $A_{op} = A$ and simple $A_{cl}$ are of this form. It is now natural to ask if every simple $A_{cl}$ does occur as part of a Cardy algebra. The following theorem provides an affirmative answer. Recall that for an $A$-left module $M$, the object $M^\vee \otimes_A M$ is an algebra (see e.g. [KR, Lem. 4.2]).

Theorem 3.22. If $A_{cl}$ is a simple modular invariant commutative symmetric Frobenius $\mathcal{C}^2_\pm$-algebra, then there exist a simple special symmetric Frobenius $\mathcal{C}$-algebra $A$ and a morphism $\iota_{cl\text{-}op} : A_{cl} \to R(A)$ such that
(i) $A_{cl} \cong Z(A)$ as Frobenius algebras;
(ii) $(A|A_{cl}, \iota_{cl\text{-}op})$ is a Cardy $\mathcal{C}|\mathcal{C}^2_\pm$-algebra;
(iii) $T(A_{cl}) \cong \oplus_{\kappa \in J}\, M_\kappa^\vee \otimes_A M_\kappa$ as algebras, where $\{M_\kappa\}_{\kappa \in J}$ is a set of representatives of the isomorphism classes of simple $A$-left modules.

Proof. By Remark 3.6, we have $\dim A_{cl} = Z_{00}\,\mathrm{Dim}\,\mathcal{C} \neq 0$, and by Lemma 2.11, $A_{cl}$ is haploid. It then follows from Theorem 3.4 that $A_{cl}$ is special. By Proposition 2.24, $T(A_{cl})$ is a special symmetric Frobenius algebra in $\mathcal{C}$. Thus $T(A_{cl}) = \oplus_i A_i$, where the $A_i$ are simple symmetric Frobenius algebras. We will show that at least one of the $A_i$ is special. Since $T(A_{cl})$ is special, we have $m_{T(A_{cl})} \circ \Delta_{T(A_{cl})} = \zeta\,\mathrm{id}_{T(A_{cl})}$ for some $\zeta \in \mathbb{C}^\times$. Restricting this to the summand $A_i$ shows $m_i \circ \Delta_i = \zeta\,\mathrm{id}_{A_i}$. Furthermore, $\varepsilon_{T(A_{cl})} \circ \eta_{T(A_{cl})} = \xi\,\mathrm{id}_{\mathbf{1}}$ for some $\xi \in \mathbb{C}^\times$. But $\varepsilon_{T(A_{cl})} \circ \eta_{T(A_{cl})} = \sum_i \varepsilon_i \circ \eta_i$, and so at least one of the $\varepsilon_i \circ \eta_i$ has to be nonzero. Therefore, at least one of the $A_i$ is special; let $A \equiv A_i$ be this summand. We denote the embedding $A \hookrightarrow T(A_{cl})$ by $e_0$ and the restriction $T(A_{cl}) \twoheadrightarrow A$ by $r_0$. Notice that $r_0$ is an algebra homomorphism. Define
\[ \iota_{cl\text{-}op} = \hat\chi(r_0) : A_{cl} \longrightarrow R(A). \tag{3.45} \]

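By Proposition 2.16, $\iota_{cl\text{-}op}$ is an algebra homomorphism. Next we verify the centre condition (3.13), or rather its equivalent form (3.20). By substituting the definitions, one can convince oneself that the commutativity $m_{cl} \circ c_{A_{cl},A_{cl}} = m_{cl}$ of $A_{cl}$ in $\mathcal{C}^2_\pm$ implies the condition $m_{T(A_{cl})} \circ \Gamma = m_{T(A_{cl})}$ in $\mathcal{C}$, see [K2, Prop. 3.6]. Here, $\Gamma : T(A_{cl}) \otimes T(A_{cl}) \to T(A_{cl}) \otimes T(A_{cl})$ is given by
\[ \Gamma = \bigoplus_{m,n} \big( \mathrm{id}_{C^l_n} \otimes c_{C^l_m, C^r_n} \otimes \mathrm{id}_{C^r_m} \big) \circ \big( c_{C^l_m, C^l_n} \otimes c^{-1}_{C^r_n, C^r_m} \big) \circ \big( \mathrm{id}_{C^l_m} \otimes c^{-1}_{C^l_n, C^r_m} \otimes \mathrm{id}_{C^r_n} \big), \tag{3.46} \]
and we decomposed $A_{cl}$ as $A_{cl} = \oplus_n\, C^l_n \times C^r_n$. As a consequence we obtain the identity
\[ r_0 \circ m_{T(A_{cl})} \circ \Gamma \circ (\mathrm{id}_{A_{cl}} \otimes e_0) = r_0 \circ m_{T(A_{cl})} \circ (\mathrm{id}_{A_{cl}} \otimes e_0). \tag{3.47} \]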


Using that $r_0$ is an algebra map, and that by definition $\tilde\iota_{cl\text{-}op} = r_0$, we obtain (3.20). In order to show that $(A|A_{cl}, \iota_{cl\text{-}op})$ is a Cardy algebra, it remains to show that the Cardy condition (3.14) is satisfied. We will demonstrate this via a detour by first proving that $A_{cl} \cong Z(A)$ as Frobenius algebras.

Recall the notations $e$ and $r$ given in (3.31). Using the centre condition (3.20) one can check that $P^l_{R(A)} \circ \iota_{cl\text{-}op} = m_{R(A)} \circ \Delta_{R(A)} \circ \iota_{cl\text{-}op}$. By specialness of $A$ we have $m_A \circ \Delta_A = \zeta_A\,\mathrm{id}_A$ and so together with $e \circ r = \zeta_A^{-1} P^l_{R(A)}$ we get
\[ e \circ r \circ \iota_{cl\text{-}op} = \iota_{cl\text{-}op}. \tag{3.48} \]
Next, consider the morphism
\[ f_{cl} = r \circ \iota_{cl\text{-}op} : A_{cl} \longrightarrow Z(A). \tag{3.49} \]
By the same derivation as in (3.42) and (3.43) one sees that $f_{cl}$ is an algebra map. In particular, $f_{cl} \circ \eta_{cl} = \eta_{Z(A)} \neq 0$ and so $f_{cl} \neq 0$. Since $A_{cl}$ is simple, $f_{cl}$ has to be a monomorphism. By the same argument as used in the proof of Theorem 3.4 (ii), up to isomorphism $A_{cl}$ is the unique simple local $A_{cl}$-(left-)module. The algebra monomorphism $f_{cl}$ turns $Z(A)$ into an $A_{cl}$-module. Since $Z(A)$ is commutative, it is local as an $A_{cl}$-module, and so $Z(A) \cong A_{cl}^{\oplus N}$ for some $N \geq 1$. By construction, $A$ is a simple special symmetric Frobenius algebra. Proposition 2.25 and Corollary 2.27 show that $R(A)$ inherits all these properties, and thus $Z(A)$ is simple (see the comment below Eq. (3.32)). By Theorem 3.18, $Z(A)$ is modular invariant, and then by Theorem 3.4 (i), $\dim Z(A) = \mathrm{Dim}\,\mathcal{C}$. This implies that $N = 1$ in $Z(A) \cong A_{cl}^{\oplus N}$, and so $f_{cl}$ is in fact an isomorphism.

Since $A_{cl}$ and $Z(A)$ are both haploid, we have $\varepsilon_{Z(A)} \circ f_{cl} = \xi\,\varepsilon_{cl}$ for some $\xi \in \mathbb{C}^\times$. The counit uniquely determines the Frobenius structure on $A_{cl}$ and $Z(A)$ (see e.g. [FRS, Lemma 3.7]), so that $f_{cl}$ is a coalgebra isomorphism iff $\xi = 1$. To compute $\xi$ we compose the above identity with $\eta_{cl}$ from the right. Defining $\zeta_{cl}$ via $\varepsilon_{cl} \circ \eta_{cl} = \zeta_{cl}^{-1}\,\mathrm{Dim}\,\mathcal{C} \cdot \mathrm{id}_{\mathbf{1}}$ and using (3.33) gives $\xi = \dim A\; \zeta_{cl}/\zeta_A^2$. By rescaling the comultiplication and the counit of $A$, and consequently changing $\zeta_A$, we can always achieve $\xi = 1$. This proves part (i) of the theorem.

Equation (3.48) implies that $\iota_{cl\text{-}op} = e \circ f_{cl}$. Since $f_{cl}$ is an isomorphism of Frobenius algebras, by Lemmas 2.18 and 3.19 we have
\[ \iota_{cl\text{-}op} \circ \iota^*_{cl\text{-}op} = e \circ f_{cl} \circ f_{cl}^* \circ e^* = \zeta_A\, e \circ r = P^l_{R(A)}. \tag{3.50} \]

Thus $(A|A_{cl}, \iota_{cl\text{-}op})$ is a Cardy algebra. This proves part (ii) of the theorem. Part (iii) can be seen as follows. By [KR, Prop. 4.3], $T(Z(A)) \cong \oplus_{\kappa \in J}\, M_\kappa^\vee \otimes_A M_\kappa$ as algebras. Together with the observation that $T(f_{cl}) : T(A_{cl}) \to T(Z(A))$ is an isomorphism of algebras, this proves part (iii).
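The value of $\xi$ quoted in the proof of part (i) can be recovered in one line. The worked step below uses only identities stated in the proof: $f_{cl} \circ \eta_{cl} = \eta_{Z(A)}$, the relation $\varepsilon_{Z(A)} \circ f_{cl} = \xi\,\varepsilon_{cl}$, the definition of $\zeta_{cl}$, and (3.33). Scalars are identified with multiples of $\mathrm{id}_{\mathbf{1}}$, and the rescaling parameter $t$ in the closing remark is a name introduced here for illustration only.

```latex
\begin{align*}
\xi\, \zeta_{cl}^{-1}\, \mathrm{Dim}\,\mathcal{C}
 &= \xi\, \varepsilon_{cl} \circ \eta_{cl}
  = \varepsilon_{Z(A)} \circ f_{cl} \circ \eta_{cl}
  = \varepsilon_{Z(A)} \circ \eta_{Z(A)}
  = \zeta_A^{-2}\, \dim A \; \mathrm{Dim}\,\mathcal{C} , \\
\text{hence} \quad
\xi &= \frac{\dim A \; \zeta_{cl}}{\zeta_A^{2}} .
\end{align*}
```

If the coproduct of $A$ is rescaled by $t \in \mathbb{C}^\times$ and the counit by $t^{-1}$, then $m_A \circ \Delta_A = t\,\zeta_A\,\mathrm{id}_A$, so $\zeta_A$ becomes $t\,\zeta_A$ while $\dim A$ and $\zeta_{cl}$ are unchanged, and $\xi$ becomes $\xi/t^2$. Choosing $t$ with $t^2 = \xi$ therefore gives $\xi = 1$, as claimed.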

Remark 3.23. Part (i) of Theorem 3.22 was announced by Müger [Mü2]. We provide an independent proof in the setting of Cardy algebras.

The above theorem, together with Lemma 2.11 and Theorem 3.4, shows that a simple commutative symmetric Frobenius $\mathcal{C}^2_\pm$-algebra $A_{cl}$ with $\dim A_{cl} = \mathrm{Dim}\,\mathcal{C}$ is always part of a Cardy algebra $(A_{op}|A_{cl}, \iota_{cl\text{-}op})$ for some simple special symmetric Frobenius algebra $A_{op}$ in $\mathcal{C}$. However, the above proof also illustrates that $A_{op}$ is not unique. This raises the question of how two Cardy algebras with a given $A_{cl}$ can differ. This question is answered by [KR, Thm. 1.1], which in the present framework can be restated as follows.

Theorem 3.24. If $(A_{op}^{(i)}|A_{cl}^{(i)}, \iota_{cl\text{-}op}^{(i)})$, $i = 1, 2$ are two Cardy $\mathcal{C}|\mathcal{C}^2_\pm$-algebras such that $A_{cl}^{(i)}$ is simple and $\dim A_{op}^{(i)} \neq 0$ for $i = 1, 2$, then $A_{cl}^{(1)} \cong A_{cl}^{(2)}$ as algebras if and only if $A_{op}^{(1)}$ and $A_{op}^{(2)}$ are Morita equivalent.

Proof. Theorem 1.1 in [KR] is stated for $A_{op}^{(i)}$ being non-degenerate algebras and $A_{cl}^{(i)} = Z(A_{op}^{(i)})$ for $i = 1, 2$. By Proposition 3.16, $A_{op}^{(i)}$ are simple and special for $i = 1, 2$. Then by [KR, Lem. 2.1], $A_{op}^{(i)}$ are non-degenerate algebras. By Theorem 3.21, we have $A_{cl}^{(i)} \cong Z(A_{op}^{(i)})$ as Frobenius algebras. Finally, by [KR, Thm. 1.1], $Z(A_{op}^{(1)}) \cong Z(A_{op}^{(2)})$ as algebras iff $A_{op}^{(1)}$ and $A_{op}^{(2)}$ are Morita equivalent.

Let $C_{max}(\mathcal{C}^2_\pm)$ be the set of equivalence classes $[B]$ of simple modular invariant commutative symmetric Frobenius algebras $B$ in $\mathcal{C}^2_\pm$. Two such algebras $B$ and $B'$ are equivalent if $B$ and $B'$ are isomorphic as algebras (but not necessarily as Frobenius algebras). Let $M_{simp}(\mathcal{C})$ be the set of Morita classes of simple special symmetric Frobenius algebras in $\mathcal{C}$. Define the map $z : M_{simp}(\mathcal{C}) \to C_{max}(\mathcal{C}^2_\pm)$ by $z : \{A\} \mapsto [Z(A)]$, where $\{A\}$ denotes the Morita class of $A$. From Theorem 3.22 (i) and [KR, Thm. 1.1] we learn:

Corollary 3.25. The map $z : M_{simp}(\mathcal{C}) \to C_{max}(\mathcal{C}^2_\pm)$ is a bijection.

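A. Appendix

A.1. Proof of Lemma 2.7. We will show that if $(F, \psi_2^F, \psi_0^F)$ is a colax tensor functor from $\mathcal{C}_1$ to $\mathcal{C}_2$, then $(G, \varphi_2^G, \varphi_0^G)$ is a lax tensor functor from $\mathcal{C}_2$ to $\mathcal{C}_1$. Applying this result to the opposed categories then gives the converse statement. We need to show that $\varphi_0^G$ and $\varphi_2^G$ make the diagrams (2.1) and (2.2) commute. We first prove the commutativity of (2.1). Consider the following diagram:
\[
\begin{array}{ccc}
G(A) \otimes (G(B) \otimes G(C)) & \xrightarrow{\ \mathrm{id}_{G(A)} \otimes \varphi_2^G\ } & G(A) \otimes G(B \otimes C) \\
\big\downarrow {\scriptstyle G\psi_2^F \circ \delta} & & \big\downarrow {\scriptstyle G\psi_2^F \circ \delta} \\
G(FG(A) \otimes F(G(B) \otimes G(C))) & \xrightarrow{\ G(F(\mathrm{id}_{G(A)}) \otimes F(\varphi_2^G))\ } & G(FG(A) \otimes FG(B \otimes C)) \\
\big\downarrow {\scriptstyle G(\mathrm{id}_{FG(A)} \otimes \psi_2^F)} & & \big\downarrow {\scriptstyle G(\rho_A \otimes \rho_{B \otimes C})} \\
G(FG(A) \otimes (FG(B) \otimes FG(C))) & \xrightarrow{\ G(\rho_A \otimes (\rho_B \otimes \rho_C))\ } & G(A \otimes (B \otimes C))
\end{array}
\tag{A.1}
\]
The top subdiagram is commutative because of the naturality of $G\psi_2^F \circ \delta$. The commutativity of the bottom subdiagram follows from the following identities:
\begin{align*}
(\rho_B \otimes \rho_C) \circ \psi_2^F
 &= (\rho_B \otimes \rho_C) \circ \psi_2^F \circ \rho_F \circ F\delta \\
 &= \rho_{B \otimes C} \circ FG(\rho_B \otimes \rho_C) \circ FG(\psi_2^F) \circ F\delta \\
 &= \rho_{B \otimes C} \circ F(\varphi_2^G) \tag{A.2}
\end{align*}
as a map $F(G(B) \otimes G(C)) \to B \otimes C$. The commutativity of (A.1) implies that the composition of maps in the left column in (2.1) can be replaced by
\[ G(\rho_A \otimes (\rho_B \otimes \rho_C)) \circ G\big( (\mathrm{id}_{FG(A)} \otimes \psi_2^F) \circ \psi_2^F \big) \circ \delta. \tag{A.3} \]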
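The three equalities in (A.2) can be unpacked step by step. The following is a sketch of the bookkeeping, assuming the standard triangle identity $\rho_{F(X)} \circ F(\delta_X) = \mathrm{id}_{F(X)}$ for the adjunction with unit $\delta$ and counit $\rho$, and using the formula $\varphi_2^G = G(\rho_B \otimes \rho_C) \circ G\psi_2^F \circ \delta$ that is substituted in step (1) of (A.5). The abbreviation $X$ is introduced here only for readability.

```latex
% X := G(B) \otimes G(C) is shorthand used in this sketch only.
\begin{align*}
(\rho_B \otimes \rho_C) \circ \psi_2^F
 &= (\rho_B \otimes \rho_C) \circ \psi_2^F \circ \rho_{F(X)} \circ F(\delta_X)
 && \text{triangle identity} \\
 &= \rho_{B \otimes C} \circ FG\big( (\rho_B \otimes \rho_C) \circ \psi_2^F \big) \circ F(\delta_X)
 && \text{naturality of } \rho \\
 &= \rho_{B \otimes C} \circ F\big( G(\rho_B \otimes \rho_C) \circ G\psi_2^F \circ \delta_X \big)
 && \text{functoriality of } F \\
 &= \rho_{B \otimes C} \circ F(\varphi_2^G)
 && \text{definition of } \varphi_2^G .
\end{align*}
```

The naturality step uses that for any morphism $h : F(X) \to B \otimes C$ one has $h \circ \rho_{F(X)} = \rho_{B \otimes C} \circ FG(h)$, applied to $h = (\rho_B \otimes \rho_C) \circ \psi_2^F$.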


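Similarly, we can show that the composition of maps in the right column in (2.1) can be replaced by
\[ G((\rho_A \otimes \rho_B) \otimes \rho_C) \circ G\big( (\psi_2^F \otimes \mathrm{id}_{FG(C)}) \circ \psi_2^F \big) \circ \delta. \tag{A.4} \]
Using the commutativity of (2.3), it is easy to see that (2.1) with the left and right columns of (2.1) replaced by (A.3) and (A.4) respectively is commutative. Hence (2.1) is commutative.

Now we prove the commutativity of the first diagram in (2.2).
\begin{align*}
\varphi_2^G \circ (\varphi_0^G \otimes \mathrm{id}_{G(A)})
 &\overset{(1)}{=} G(\rho_{\mathbf{1}_2} \otimes \rho_A) \circ G\psi_2^F \circ \delta \circ \big( (G\psi_0^F \circ \delta_{\mathbf{1}_1}) \otimes \mathrm{id}_{G(A)} \big) \\
 &\overset{(2)}{=} G(\rho_{\mathbf{1}_2} \otimes \rho_A) \circ G\psi_2^F \circ GF(G\psi_0^F \otimes \mathrm{id}_{G(A)}) \circ GF(\delta_{\mathbf{1}_1} \otimes \mathrm{id}_{G(A)}) \circ \delta \\
 &\overset{(3)}{=} G(\rho_{\mathbf{1}_2} \otimes \rho_A) \circ G(FG(\psi_0^F) \otimes \mathrm{id}_{FG(A)}) \circ G\psi_2^F \circ GF(\delta_{\mathbf{1}_1} \otimes \mathrm{id}_{G(A)}) \circ \delta \\
 &\overset{(4)}{=} G(\mathrm{id}_{\mathbf{1}_2} \otimes \rho_A) \circ G\big( [\psi_0^F \circ \rho_{F(\mathbf{1}_1)} \circ (F\delta)_{\mathbf{1}_1}] \otimes \mathrm{id}_{FG(A)} \big) \circ G\psi_2^F \circ \delta \\
 &\overset{(5)}{=} G(\mathrm{id}_{\mathbf{1}_2} \otimes \rho_A) \circ G(\psi_0^F \otimes \mathrm{id}_{FG(A)}) \circ G\psi_2^F \circ \delta \\
 &\overset{(6)}{=} G(\mathrm{id}_{\mathbf{1}_2} \otimes \rho_A) \circ G(l^{-1}_{FG(A)}) \circ GF(l_{G(A)}) \circ \delta \\
 &\overset{(7)}{=} G(l^{-1}_A) \circ G\rho_A \circ \delta G \circ l_{G(A)} \\
 &\overset{(8)}{=} G(l^{-1}_A) \circ l_{G(A)}, \tag{A.5}
\end{align*}
where in step (1) we substituted the definition of $\varphi_0^G$, $\varphi_2^G$ given in (2.10); in step (2) we used the naturality of $\delta$; in step (3) we used the naturality of $G\psi_2^F$; in step (4) we switched the position between $G\rho_{\mathbf{1}_2}$ and $GFG(\psi_0^F)$ and the position between $G\psi_2^F$ and $GF(\delta_{\mathbf{1}_1} \otimes \mathrm{id}_{G(A)})$ using the naturality of $\rho$ and $G\psi_2^F$ respectively; in step (5) we applied the second identity in (2.8); in step (6) we used (2.4); in step (7) we used the naturality of $l^{-1}$ and $\delta$; in step (8) we used the first identity in (2.8). The proof of the commutativity of the second diagram in (2.2) is similar. Thus we have shown that $G$ is a lax tensor functor.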

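A.2. Proof of Lemma 3.20. To prepare the proof, recall that for a given object $B \in \mathcal{C}$, the modular group $PSL(2, \mathbb{Z})$ acts on the space $\oplus_i \mathrm{Hom}_{\mathcal{C}}(B \otimes U_i, U_i)$, see e.g. [BK, Sect 3.1] and [K3, Eq. (4.55)]. We will only need the action of $S$ and $S^{-1}$. Let $f \in \oplus_i \mathrm{Hom}_{\mathcal{C}}(B \otimes U_i, U_i)$. Then

[Eqs. (A.6) and (A.7): graphical definitions of the actions of $S$ and $S^{-1}$ on $f$. Each image carries the prefactor $\dim U_j / \sqrt{\mathrm{Dim}\,\mathcal{C}}$, with indices $i \in I$ and $j \in I$, multiplying a string diagram built from $f$, $B$, $U_i$ and $U_j$; the diagrams are not recoverable from the text.]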

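By Lemma 3.2, to establish that (3.37) is S-invariant, it is enough to prove the identity (3.4) when $f$ is given by (3.37). Using (A.6) and (A.7), we can see that Eq. (3.4) simply says that $\oplus_{i,j}$ [RHS of (3.4)] is invariant under the action of $S \times S$. Consider the element $g$ of $\oplus_{j,k \in I} \mathrm{Hom}_{\mathcal{C}^2_\pm}(R(A) \otimes (U_j^\vee \times U_k), U_j^\vee \times U_k)$ given by

[Eq. (A.8): string diagram defining $g$ as a sum over $j, k \in I$, built from $R(A)$, $U_j^\vee \times U_k$ and vertices labelled $\alpha$; the diagram is not recoverable from the text.]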

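By the above arguments, proving S-invariance of (3.37) is equivalent to proving invariance of $g$ under the action of $S \times S$. For $i \in I$, we denote by $g_i$ the component of $g$ in $\big( \oplus_{j \in I} \mathrm{Hom}_{\mathcal{C}}(A \otimes U_i^\vee \otimes U_j^\vee, U_j^\vee) \big) \otimes \big( \oplus_{k \in I} \mathrm{Hom}_{\mathcal{C}}(U_i \otimes U_k, U_k) \big)$. We view the second Hom-space in the above tensor product as a Hom-space in $\mathcal{C}_+$ instead of $\mathcal{C}_-$. It is enough to show that $g_i$ is invariant under the action of $S \times S^{-1}$. Note that the action of $S^{-1}$ in $\mathcal{C}_+$ is equivalent to that of $S$ in $\mathcal{C}_-$. The morphism $g_i$ can be canonically identified with a bilinear pairing
\[ (\,\cdot\,,\,\cdot\,)_i : \Big( \bigoplus_{j \in I} \mathrm{Hom}_{\mathcal{C}}(U_j^\vee, A \otimes U_i^\vee \otimes U_j^\vee) \Big) \times \Big( \bigoplus_{k \in I} \mathrm{Hom}_{\mathcal{C}}(U_k, U_i \otimes U_k) \Big) \longrightarrow \mathbb{C} \tag{A.9} \]
as follows. For $h_1 \in \mathrm{Hom}_{\mathcal{C}}(U_j^\vee, A \otimes U_i^\vee \otimes U_j^\vee)$ and $h_2 \in \mathrm{Hom}_{\mathcal{C}}(U_k, U_i \otimes U_k)$ we set
\[ (h_1, h_2)_i = (\dim U_j \, \dim U_k)^{-1}\; \mathrm{tr}_{U_j^\vee \times U_k} \big[ g_i \circ (h_1 \times h_2) \big]. \tag{A.10} \]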


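When substituting the explicit form of the product $m_{R(A)}$ of $R(A) = (A \times \mathbf{1}) \otimes R(\mathbf{1})$, after a short calculation one finds

[Eq. (A.11): string diagram expressing $(h_1, h_2)_i$, with prefactor $1/\dim U_j$, built from $h_1$, $h_2$, strands labelled $A$, $U_i$, $U_j$, $U_j^\vee$, $U_k$ and vertices labelled $\alpha$; the diagram is not recoverable from the text.]

Here the top morphism $P^l_{R(A)}$ has been simplified with the help of the identity
\[ P^l_{R(A)} \circ m_{R(A)} \circ (P^l_{R(A)} \otimes P^l_{R(A)}) = \Big[ \bigoplus_{i \in I} \big( (m_A \circ \Delta_A) \otimes \mathrm{id}_{U_i^\vee} \big) \times \mathrm{id}_{U_i} \Big] \circ m_{R(A)} \circ (P^l_{R(A)} \otimes P^l_{R(A)}), \tag{A.12} \]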

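which can be checked by direct calculation along the same lines as in the proof of [Fr, Lem 3.10]. The action of the modular transformation $S$ on $\oplus_{i \in I} \mathrm{Hom}_{\mathcal{C}}(B \otimes U_i, U_i)$ for $B \in \mathcal{C}$ naturally induces an action on $\oplus_{i \in I} \mathrm{Hom}_{\mathcal{C}}(U_i, B \otimes U_i)$ [K3, Prop. 5.14], which we denote by $S^*$. In the present case we get an action of $S^*$ on $\oplus_{j \in I} \mathrm{Hom}_{\mathcal{C}}(U_j^\vee, A \otimes U_i^\vee \otimes U_j^\vee)$ and $\oplus_{k \in I} \mathrm{Hom}_{\mathcal{C}}(U_k, U_i \otimes U_k)$. Then to show $g_i$ is invariant under the action of $S \times S^{-1}$ amounts to showing that
\[ (h_1, h_2)_i = ((S^{-1})^* h_1, S^* h_2)_i, \tag{A.13} \]
for all $h_1 \in \mathrm{Hom}_{\mathcal{C}}(U_j^\vee, A \otimes U_i^\vee \otimes U_j^\vee)$ and $h_2 \in \mathrm{Hom}_{\mathcal{C}}(U_k, U_i \otimes U_k)$. We have

[Eq. (A.14): string diagram expressing $((S^{-1})^* h_1, S^* h_2)_i$ as a sum over $m$, $n$, $\alpha$ with prefactor $\dim U_n / \mathrm{Dim}\,\mathcal{C}$, built from $h_1$, $h_2$, strands labelled $A$, $U_i$, $U_j$, $U_k$, $U_m$, $U_n$ and two vertices labelled $\alpha$; the diagram is not recoverable from the text.]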

Now drag the upper vertex indexed by α in the above graph along its Um∨ -leg until it meets the lower vertex also indexed by α, then sum over α and m. This gives

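[Eq. (A.15): string diagram expressing $((S^{-1})^* h_1, S^* h_2)_i$ as a sum over $n$ with prefactor $\dim U_n / \mathrm{Dim}\,\mathcal{C}$, containing a $U_n$-loop and built from $h_1$, $h_2$ and strands labelled $A$, $U_i$, $U_j$, $U_k$; the diagram is not recoverable from the text.]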


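If we just look at the neighbourhood of the $U_n$-loop in the above graph, we see the following subgraphs:

[Eq. (A.16): graphical identity. The left-hand side is the sum over $n$ of $\dim U_n / \mathrm{Dim}\,\mathcal{C}$ times the subgraph containing the $U_n$-loop, with strands labelled $A$, $U_j^\vee$ and $U_k^\vee$; the right-hand side carries the prefactor $1/\dim U_j$ and vertices labelled $\alpha$. The diagrams are not recoverable from the text.]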

where we have applied [BK, Cor. 3.1.11]. Substituting this subgraph back to the original graph in (A.15), we obtain

[Eq. (A.17): string diagram expressing $((S^{-1})^* h_1, S^* h_2)_i$, with prefactor $1/\dim U_j$, built from $h_1$, $h_2$, an $m_A$ vertex, strands labelled $A$, $U_i$, $U_j$, $U_j^\vee$, $U_k$ and vertices labelled $\alpha$; the diagram is not recoverable from the text.]

The graph in (A.17) is equal to that in (A.11). In order to see this, we first drag the “bubble” $(m_A \circ \Delta_A)$ along $A$ lines and through the $m_A$ vertex (because $m_A \circ \Delta_A$ is a bimodule map) until it reaches the lower-left leg of the upper vertex indexed by $\alpha$. Then drag the $m_A$ vertex along the (red) dotted line in the above graph. Finally, we apply the associativity of $A$, (A.12), and [Fr, Lem 3.11]. Then we see that the graph in (A.17) exactly matches the one in (A.11).

Acknowledgements. We would like to thank the organisers of the Oberwolfach Arbeitsgemeinschaft “Algebraic structures in conformal field theories” (April 2007), where this work was started, for an inspiring meeting. We would further like to thank the Hausdorff Institute for Mathematics in Bonn and the organisers of the stimulating meeting “Geometry and Physics” (May 2008). We are indebted to Alexei Davydov, Jens Fjelstad, Jürgen Fuchs, Yi-Zhi Huang, Alexei Kitaev, Urs Schreiber, Christoph Schweigert, Stephan Stolz and Peter Teichner for helpful discussions and/or comments on a draft of this paper. The research of IR was partially supported by the EPSRC First Grant EP/E005047/1, the PPARC rolling grant PP/C507145/1 and the Marie Curie network ‘Superstring Theory’ (MRTN-CT-2004-512194).



References

[AN] Alexeevski, A., Natanzon, S.M.: Noncommutative two-dimensional topological field theories and Hurwitz numbers for real algebraic curves. Sel. Math., New Ser. 12, 307–377 (2006)
[Bi] Bichon, J.: Cosovereign Hopf algebras. J. Pure Appl. Alg. 157, 121–133 (2001)
[BK] Bakalov, B., Kirillov, A.A.: Lectures on Tensor Categories and Modular Functors. Providence, RI: Amer. Math. Soc., 2001
[C1] Cardy, J.L.: Operator content of two-dimensional conformal invariant theories. Nucl. Phys. B 270, 186–204 (1986)
[C2] Cardy, J.L.: Boundary conditions, fusion rules and the Verlinde formula. Nucl. Phys. B 324, 581–596 (1989)
[CL] Cardy, J.L., Lewellen, D.C.: Bulk and boundary operators in conformal field theory. Phys. Lett. B 259, 274–278 (1991)
[DMZ] Dong, C.-Y., Mason, G., Zhu, Y.-C.: Discrete series of the Virasoro algebra and the moonshine module. In: Algebraic Groups and Their Generalizations: Quantum and infinite-dimensional Methods, Proc. Symp. Pure Math. 56, Part 2, Providence, RI: Amer. Math. Soc., 1994, pp. 295–316
[DP] Day, B., Pastro, C.: Note on Frobenius monoidal functors. New York J. Math. 14, 733–742 (2008)
[ENO1] Etingof, P.I., Nikshych, D., Ostrik, V.: On fusion categories. Ann. Math. 162, 581–642 (2005)
[ENO2] Etingof, P.I., Nikshych, D., Ostrik, V.: An analogue of Radford’s $S^4$ formula for finite tensor categories. Int. Math. Research Notices 54, 2915–2933 (2004)
[Fe] Felder, G., Fröhlich, J., Fuchs, J., Schweigert, C.: Correlation functions and boundary conditions in rational conformal field theory and three-dimensional topology. Comp. Math. 131, 189–237 (2002)
[FHL] Frenkel, I.B., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Amer. Math. Soc. 104 (1993)
[Fj] Fjelstad, J., Fuchs, J., Runkel, I., Schweigert, C.: Uniqueness of open/closed rational CFT with given algebra of open states. Adv. Theor. Math. Phys. 12, 1283–1375 (2008)
[FK] Fröhlich, J., King, C.: The Chern-Simons theory and knot polynomials. Commun. Math. Phys. 126, 167–199 (1989)
[Fr] Fröhlich, J., Fuchs, J., Runkel, I., Schweigert, C.: Correspondences of ribbon categories. Adv. Math. 199, 192–329 (2006)
[FRS] Fuchs, J., Runkel, I., Schweigert, C.: TFT construction of RCFT correlators. I: partition functions. Nucl. Phys. B 646, 353–497 (2002)
[FS] Fuchs, J., Schweigert, C.: Category theory for conformal boundary conditions. Fields Inst. Commun. 39, 25–71 (2003)
[H1] Huang, Y.-Z.: Two-dimensional conformal geometry and vertex operator algebras. Progress in Mathematics, Vol. 148, Boston: Birkhäuser, 1997
[H2] Huang, Y.-Z.: Rigidity and modularity of vertex tensor categories. Comm. Contemp. Math. 10, 871–911 (2008)
[HK1] Huang, Y.-Z., Kong, L.: Open-string vertex algebra, category and operads. Commun. Math. Phys. 250, 433–471 (2004)
[HK2] Huang, Y.-Z., Kong, L.: Full field algebras. Commun. Math. Phys. 272, 345–396 (2007)
[HK3] Huang, Y.-Z., Kong, L.: Modular invariance for conformal full field algebras. http://arxiv.org/abs/math/0609570v2 [math.QA], 2006
[HL] Huang, Y.-Z., Lepowsky, J.: Tensor products of modules for a vertex operator algebra and vertex tensor categories. In: Lie Theory and Geometry, in honor of Bertram Kostant, ed. R. Brylinski, J.-L. Brylinski, V. Guillemin, V. Kac, Boston: Birkhäuser, 1994, pp. 349–383
[JS] Joyal, A., Street, R.: Braided tensor categories. Adv. Math. 102, 20–78 (1993)
[K1] Kong, L.: Full field algebras, operads and tensor categories. Adv. Math. 213, 271–340 (2007)
[K2] Kong, L.: Open-closed field algebras. Commun. Math. Phys. 280, 207–261 (2008)
[K3] Kong, L.: Cardy condition for open-closed field algebras. Commun. Math. Phys. 283, 25–92 (2008)
[Ki] Kitaev, A.: Private communication
[KO] Kirillov, A.A., Ostrik, V.: On q-analog of McKay correspondence and ADE classification of sl(2) conformal field theories. Adv. Math. 171, 183–227 (2002)
[KR] Kong, L., Runkel, I.: Morita classes of algebras in modular tensor categories. Adv. Math. 219, 1548–1576 (2008)
[Ld] Lauda, A.D.: Frobenius algebras and ambidextrous adjunctions. Theo. Appl. Cat. 16, 84–122 (2006)
[Li1] Li, H.-S.: Regular representations of vertex operator algebras. Commun. Contemp. Math. 4, 639–683 (2002)
[Li2] Li, H.-S.: Regular representations and Huang-Lepowsky tensor functors for vertex operator algebras. J. Alge. 255, 423–462 (2002)
[Ln] Leinster, T.: Higher operads, higher categories. London Mathematical Society Lecture Note Series 298. Cambridge: Cambridge University Press, 2004
[LP] Lauda, A., Pfeiffer, H.: Open-closed strings: two-dimensional extended TQFTs and Frobenius algebras. Topology Appl. 155, 623–666 (2008)
[LR] Longo, R., Rehren, K.H.: Local fields in boundary conformal QFT. Rev. Math. Phys. 16, 909 (2004)
[Lw] Lewellen, D.C.: Sewing constraints for conformal field theories on surfaces with boundaries. Nucl. Phys. B 372, 654–682 (1992)
[Lz] Lazaroiu, C.I.: On the structure of open-closed topological field theory in two dimensions. Nucl. Phys. B 603, 497–530 (2001)
[Ma] Mac Lane, S.: Categories for the working mathematician. Berlin-Heidelberg-New York: Springer, 1998
[Mo] Moore, G.: Some comments on branes, G-flux, and K-theory. Int. J. Mod. Phys. A16, 936–944 (2001)
[MS] Moore, G., Segal, G.: D-branes and K-theory in 2D topological field theory. http://arxiv.org/abs/hep-th/0609042v1, 2006
[Mü1] Müger, M.: From subfactors to categories and topology II. The Quantum Double of Tensor Categories and Subfactors. J. Pure Appl. Alg. 180, 159–219 (2003)
[Mü2] Müger, M.: Talk at workshop ‘Quantum Structures’ (Leipzig, 28. June 2007), Preprint in preparation
[O] Ostrik, V.: Module categories, weak Hopf algebras and modular invariants. Transform. Groups 8, 177–206 (2003)
[P] Pfeiffer, H.: Finitely semisimple spherical categories and modular categories are self-dual. Adv. Math. 221, 1608–1652 (2009)
[RT] Reshetikhin, N., Turaev, V.G.: Invariants of 3-manifolds via link polynomials and quantum groups. Inv. Math. 103, 547–597 (1991)
[Se] Segal, G.: The definition of conformal field theory. Preprint 1988; also in: U. Tillmann (ed.), Topology, geometry and quantum field theory, London Math. Soc. Lect. Note Ser. 308, Cambridge: Cambridge Univ. Press, 2004, pp. 421–577
[So] Sonoda, H.: Sewing conformal field theories II. Nucl. Phys. B 311, 417–432 (1988)
[Sz] Szlachányi, K.: Adjointable monoidal functors and quantum groupoids. In: Hopf algebras in noncommutative geometry and physics, Caenepeel, S., Oystaeyen, F.V. (eds.) Lecture Notes in Pure and Applied Mathematics 239, Boca Raton, FL: CRC Press, 2004, pp. 297–307
[T] Turaev, V.G.: Quantum Invariants of Knots and 3-Manifolds. New York: de Gruyter, 1994
[V] Vafa, C.: Conformal theories and punctured surfaces. Phys. Lett. B 199, 195–202 (1987)
[VZ] Van Oystaeyen, F., Zhang, Y.H.: The Brauer group of a braided monoidal category. J. Algebra 202, 96–128 (1998)
[W] Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351–399 (1989)
[Y] Yetter, D.N.: Functorial knot theory. Categories of tangles, coherence, categorical deformations, and topological invariants. Series on Knots and Everything 26, River Edge, NJ: World Scientific, 2001

Communicated by Y. Kawahigashi
